Notes on Signals and Systems A.E. Frazho

Notes on Signals and Systems (Purdue Engineering). Chapter 1, Complex numbers: this chapter presents some elementary facts concerning complex numbers, inner product spaces and orthogonal systems.



Contents

1 Complex numbers 7

1.1 Complex numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.1.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

1.2 The polar decomposition of a complex number . . . . . . . . . . . . . . . . . 9

1.2.1 The roots of unity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

1.2.2 Classical sinusoid formulas . . . . . . . . . . . . . . . . . . . . . . . . 18

1.2.3 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

1.3 Inner products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

1.3.1 The L2(0, τ) space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

1.3.2 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

1.4 Orthogonal basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

1.4.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

2 Fourier series 29

2.1 Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

2.1.1 Some convergence results for Fourier series . . . . . . . . . . . . . . . 32

2.1.2 Fourier series in L2(−μ, μ) . . . . . . . . . . . . . . . . . . . . . . . . 35

2.2 A square wave . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

2.2.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

2.3 The Fourier series consisting of cosines and sines . . . . . . . . . . . . . . . 46

2.3.1 A sinusoid example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

2.4 Harmonic Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

2.4.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

2.5 The Fourier series for even and odd functions . . . . . . . . . . . . . . . . . 60

2.5.1 An even square wave . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

2.5.2 The Fourier series in terms of sine or cosine functions . . . . . . . . . 65

2.5.3 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

2.6 The integral of a Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . 70

2.6.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

2.7 The Cesaro mean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

2.7.1 The convergence of Cesaro means . . . . . . . . . . . . . . . . . . . . 81

2.7.2 The Dirac delta function . . . . . . . . . . . . . . . . . . . . . . . . . 92

2.7.3 Dirichlet and Fejer kernels . . . . . . . . . . . . . . . . . . . . . . . . 95

2.7.4 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104


3 The discrete Fourier transform 113

3.1 The discrete Fourier transform . . . . . . . . . . . . . . . . . . . . . . . . . . 113

3.1.1 Nyquist sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

3.1.2 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

3.2 The discrete Fourier transform and Fourier series . . . . . . . . . . . . . . . 127

3.2.1 A Fourier series example . . . . . . . . . . . . . . . . . . . . . . . . . 128

3.2.2 Bessel functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

3.2.3 Computing the inner product in L2(0, τ) . . . . . . . . . . . . . . . . 138

3.2.4 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

3.3 Properties of the discrete Fourier transform . . . . . . . . . . . . . . . . . . 144

3.3.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149

3.4 Sinusoid estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150

3.4.1 Sunspots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

3.4.2 A least squares optimization problem . . . . . . . . . . . . . . . . . . 152

3.4.3 A sinusoid estimation problem . . . . . . . . . . . . . . . . . . . . . . 158

3.4.4 An example of sinusoid estimation . . . . . . . . . . . . . . . . . . . . 161

3.4.5 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

3.5 The shift and the discrete Fourier transform . . . . . . . . . . . . . . . . . . 170

3.5.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172

3.6 Convolution and the discrete Fourier transform . . . . . . . . . . . . . . . . 173

4 Laplace transforms and transfer functions 181

4.1 The Laplace transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181

4.1.1 Linearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182

4.1.2 Multiplication by eat . . . . . . . . . . . . . . . . . . . . . . . . . . . 184

4.1.3 The Dirac delta function . . . . . . . . . . . . . . . . . . . . . . . . . 186

4.1.4 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187

4.2 Properties of the Laplace transform . . . . . . . . . . . . . . . . . . . . . . . 188

4.2.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

4.3 The inverse Laplace transform . . . . . . . . . . . . . . . . . . . . . . . . . . 192

4.3.1 Complex poles and the residue command in Matlab . . . . . . . . . . 198

4.3.2 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202

4.4 Transfer functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

4.5 An elementary RCL circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

4.5.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211

4.6 The Final Value Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212

4.6.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215

4.7 A cascaded circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217

4.7.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218

4.8 Transfer functions and impedance . . . . . . . . . . . . . . . . . . . . . . . . 220

4.8.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226


5 State space 229

5.1 The exponential matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229

5.1.1 A spectral method to compute e^{At} . . . . . . . . . . . . . . . . . . 232

5.1.2 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234

5.2 The rotation matrix around a specified axis . . . . . . . . . . . . . . . . . . 235

5.2.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247

5.3 State space input output maps . . . . . . . . . . . . . . . . . . . . . . . . . 250

5.3.1 Transfer functions for state space systems . . . . . . . . . . . . . . . 253

5.3.2 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255

5.4 State space realizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257

5.4.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269

5.5 Stable state space systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271

5.5.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276

5.6 State space realizations and operational amplifiers . . . . . . . . . . . . . . 277

5.6.1 Circuits for state space systems . . . . . . . . . . . . . . . . . . . . . 283

5.6.2 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285

5.7 A simple pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286

5.7.1 A Simulink model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291

5.7.2 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298

6 Mass spring damper systems 301

6.1 A general mass spring damper equation . . . . . . . . . . . . . . . . . . . . 301

6.2 Positive matrices and stability . . . . . . . . . . . . . . . . . . . . . . . . . . 304

6.2.1 The Gershgorin circle theorem . . . . . . . . . . . . . . . . . . . . . . 307

6.2.2 The case when M > 0 and K > 0 and Φ ≥ 0 . . . . . . . . . . . . . . 310

6.3 The mass spring damper and state space . . . . . . . . . . . . . . . . . . . . 311

6.4 A mass spring damper example . . . . . . . . . . . . . . . . . . . . . . . . . 317

6.4.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326

6.5 The mass spring system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327

6.5.1 The conservative system Mq̈ + Kq = 0 . . . . . . . . . . . . . . . . . 330

6.5.2 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332

6.5.3 Exercise: sine and cosine matrices . . . . . . . . . . . . . . . . . . . . 333

6.6 A mass spring approximation of the wave equation . . . . . . . . . . . . . . 336

6.6.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341

7 An introduction to filtering theory 343

7.1 Sinusoid response . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343

7.1.1 Steady state response . . . . . . . . . . . . . . . . . . . . . . . . . . . 344

7.1.2 A vibration suppression example . . . . . . . . . . . . . . . . . . . . 354

7.1.3 A mass spring damper identification example . . . . . . . . . . . . . 358

7.1.4 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359

7.2 The steady state response and G(iω) . . . . . . . . . . . . . . . . . . . . . . 364

7.2.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371

7.3 Ideal filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372


7.3.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377

7.4 Bode plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378

7.4.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380

7.5 Natural frequencies and damping ratios . . . . . . . . . . . . . . . . . . . . 382

7.5.1 A band pass filter example . . . . . . . . . . . . . . . . . . . . . . . . 385

7.5.2 The resonance frequency . . . . . . . . . . . . . . . . . . . . . . . . . 389

7.5.3 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394

7.6 A bus suspension problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402

7.6.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405

7.7 All pass filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407

7.7.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414

8 Butterworth filters 417

8.1 Low pass Butterworth filters . . . . . . . . . . . . . . . . . . . . . . . . . . . 417

8.1.1 A low pass Butterworth filtering example . . . . . . . . . . . . . . . . 419

8.1.2 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425

8.2 High pass Butterworth filters . . . . . . . . . . . . . . . . . . . . . . . . . . 428

8.2.1 Constructing high pass filters from low pass filters . . . . . . . . . . . 428

8.2.2 State space realizations for low and high pass filters . . . . . . . . . . 429

8.2.3 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431

8.3 Band pass Butterworth filters . . . . . . . . . . . . . . . . . . . . . . . . . . 431

8.3.1 A bandpass Butterworth filtering example . . . . . . . . . . . . . . . 433

8.3.2 State space realizations for band pass filters . . . . . . . . . . . . . . 437

8.3.3 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439

8.4 Band stop Butterworth filters . . . . . . . . . . . . . . . . . . . . . . . . . . 440

8.4.1 State space realizations for band stop filters . . . . . . . . . . . . . . 442

8.4.2 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444

9 The Fourier transform 445

9.1 The Fourier transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445

9.1.1 The Fourier transform of exponential functions . . . . . . . . . . . . 447

9.1.2 Connections to the Laplace transform . . . . . . . . . . . . . . . . . . 450

9.1.3 The Dirac delta function . . . . . . . . . . . . . . . . . . . . . . . . . 452

9.1.4 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453

9.2 Properties of the Fourier transform . . . . . . . . . . . . . . . . . . . . . . . 453

9.2.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458

9.3 The inverse Fourier transform of rational functions . . . . . . . . . . . . . . 459

9.3.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462

9.4 Transfer functions and sinusoid response . . . . . . . . . . . . . . . . . . . . 462

9.4.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465

9.5 Ideal filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466

9.5.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469

9.6 The Nyquist sampling rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470

9.6.1 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473


Chapter 1

Complex numbers

This chapter presents some elementary facts concerning complex numbers, inner product spaces and orthogonal systems.

1.1 Complex numbers

In this section we will review some elementary properties of complex numbers. Throughout, i is the square root of −1, that is, i = √−1. By definition i² = −1. Moreover, i³ = −i and i⁴ = 1. A complex number is a number of the form z = x + iy, where x and y are real numbers. Moreover, x is called the real part of z and is denoted by x = Re z. Furthermore, y is the imaginary part of z and is denoted by y = Im z. For example, if z = 2 − 3i, then 2 = Re z and −3 = Im z.

If z1 = x1 + iy1 and z2 = x2 + iy2 are two complex numbers, then the sum of z1 and z2 is given by

z1 + z2 = (x1 + x2) + (y1 + y2)i .

For example, 2 + 3i + (−3 + i) = −1 + 4i. Notice that multiplying x1 + iy1 by x2 + iy2 yields

(x1 + iy1)(x2 + iy2) = x1x2 − y1y2 + (x1y2 + y1x2)i .

To see this simply observe that

(x1 + iy1)(x2 + iy2) = x1x2 + iy1iy2 + iy1x2 + x1iy2 = x1x2 − y1y2 + i(x1y2 + y1x2) .

For example, (2 + 3i)(4 + 2i) = 2 + 16i. This follows from

(2 + 3i)(4 + 2i) = 8 + 3i × 2i + i(12 + 4) = 8 − 6 + 16i = 2 + 16i.

For another example, notice that (2 − 3i)² = −5 − 12i. This follows from

(2 − 3i)² = (2 − 3i)(2 − 3i) = 4 + 3i × 3i − 6i − 6i = 4 − 9 − 12i = −5 − 12i.
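These hand computations are easy to check numerically. A small Python sketch (Python's built-in complex type writes the imaginary unit as j, so 2 + 3i becomes 2 + 3j):

```python
# Verify the worked examples above with Python's built-in complex type.
z1 = 2 + 3j
z2 = 4 + 2j
print(z1 + (-3 + 1j))  # sum: (-1+4j)
print(z1 * z2)         # product: (2+16j)
print((2 - 3j) ** 2)   # square: (-5-12j)
```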

The graph of a complex number z = x + iy is represented by the point (x, y) in the plane, where the horizontal axis corresponds to the real part x = Re z of the complex number z, and

7

Page 8: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

8 CHAPTER 1. COMPLEX NUMBERS

the vertical axis corresponds to the imaginary part y = Im z of the complex number z. For example, the eight complex numbers

{1, i, −1, −i, 2 + 3i, −3 + 2i, −3 − 3i and 3 − i} (1.1)

are plotted in Figure 1.1. The graphs of these numbers are respectively the same as plotting the points

{(1, 0), (0, 1), (−1, 0), (0,−1), (2, 3), (−3, 2), (−3,−3) and (3,−1)} (1.2)

in the plane. However, the interpretation is different. The entries in (1.1) are complex numbers and the entries in (1.2) are points in R².


Figure 1.1: The graph of {1, i, −1, −i, 2 + 3i, −3 + 2i, −3 − 3i and 3 − i}

The complex conjugate of z = x + iy is defined by z̄ = x − iy. The complex conjugate simply

replaces i by −i. The graph of the complex conjugate z̄ is the reflection of z about the real axis. Notice that if z is any complex number, then

Re z = (z + z̄)/2 and Im z = (z − z̄)/(2i). (1.3)

The magnitude or absolute value of the complex number z is defined by

|z| = √(x² + y²) (z = x + iy). (1.4)

It is easy to verify that zz̄ = x² + y². Thus |z|² = zz̄. Clearly, z = 0 if and only if |z| = 0. For example, |3 − 4i| = √(9 + 16) = 5 and |1 + i| = √2. Finally, it is noted that the magnitude of i is one, that is, |i| = 1.

Let us compute the real and imaginary part of z = (2 + 3i)/(1 + 2i). Using the complex

conjugate, we obtain

z = (2 + 3i)/(1 + 2i) = (2 + 3i)(1 − 2i)/((1 + 2i)(1 − 2i)) = (2 + 6 + i(3 − 4))/(1² + 2²) = (8 − i)/5.


So Re z = 8/5 and Im z = −1/5. Finally, |z| = √65/5.

The Matlab command to compute the real part of a complex number z = x + iy is real(z), and the Matlab command for the imaginary part is imag(z). Finally, the magnitude of z in Matlab is abs(z), and conj(z) is the complex conjugate of z.
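The Matlab commands real, imag, abs and conj have direct Python counterparts, which can be used to confirm the division example above (a sketch, not part of the original notes):

```python
# The division example z = (2 + 3i)/(1 + 2i) = (8 - i)/5 checked numerically.
z = (2 + 3j) / (1 + 2j)
print(z.real)         # Re z = 8/5 = 1.6
print(z.imag)         # Im z = -1/5 = -0.2
print(abs(z))         # |z| = sqrt(65)/5, about 1.6125
print(z.conjugate())  # the complex conjugate of z
```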

1.1.1 Exercise

Problem 1. Plot in the complex plane the following complex numbers:

2 + i, 2 − i, −3 − 2i and −2 + 3i.

Problem 2. Find the magnitude, the real part and the imaginary part of the following complex numbers:

i(2 − 3i)(4 − 2i), (−2 + 4i)/(2 − 3i) and (1 + i)²|3 − 4i|.

Problem 3. Find the real and imaginary part of the following complex numbers:

(1 + 2i)(2 + i)/(2 − 3i) and i(1 − 3i)(2 + 3i)/((1 − 2i)(2 − i)) and −2i(1 + 2i)(1 + 2i)/|1 − i|².

Problem 4. Find the magnitude for

(1 + 4i)(2 + i)/(2 − 3i) and −i(1 − i)/((1 + i)(2 − 3i)).

Problem 5. Find i¹³, (1 + i)⁴, (1 − i)⁴ and (1 + i)⁻⁴.

1.2 The polar decomposition of a complex number

This section is devoted to the polar decomposition of a complex number. Let θ be any real number. Then Euler's formula states that

e^{iθ} = cos(θ) + i sin(θ). (2.1)

Notice that cos(θ) is the real part of e^{iθ} and sin(θ) is the imaginary part of e^{iθ}. Since cos(θ)² + sin(θ)² = 1, it follows that |e^{iθ}| = 1 for all real θ. Recall that the cosine is an even function, that is, cos(−θ) = cos(θ), while the sine is an odd function, that is, sin(−θ) = − sin(θ). Using this fact in Euler's identity, we arrive at e^{−iθ} = cos(θ) − i sin(θ). Adding this to e^{iθ} = cos(θ) + i sin(θ) and dividing by two yields the following expressions for the cosine and sine:

cos(θ) = (e^{iθ} + e^{−iθ})/2 and sin(θ) = (e^{iθ} − e^{−iθ})/(2i). (2.2)


The second equation in (2.2) is obtained by subtracting e^{−iθ} = cos(θ) − i sin(θ) from e^{iθ} = cos(θ) + i sin(θ) and dividing by 2i.

We claim that the complex conjugate of e^{iθ} equals e^{−iθ}. This follows from taking the conjugate of Euler's formula: the conjugate of cos(θ) + i sin(θ) is cos(θ) − i sin(θ) = e^{−iθ}. Since cos(θ) is the real part of e^{iθ} and sin(θ) is the imaginary part of e^{iθ}, equation (2.2) also follows from (1.3).
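Euler's formula (2.1) and the identities (2.2) are easy to sanity-check numerically; a short Python sketch using the standard cmath module:

```python
import cmath
import math

theta = 0.7  # an arbitrary real angle
e = cmath.exp(1j * theta)
# Euler's formula: the real and imaginary parts are cos and sin.
print(abs(e.real - math.cos(theta)))  # ~0
print(abs(e.imag - math.sin(theta)))  # ~0
print(abs(e))                         # |e^{i*theta}| = 1
# The identities (2.2) for cosine and sine.
cos_theta = (cmath.exp(1j * theta) + cmath.exp(-1j * theta)) / 2
sin_theta = (cmath.exp(1j * theta) - cmath.exp(-1j * theta)) / (2j)
print(abs(cos_theta - math.cos(theta)))  # ~0
print(abs(sin_theta - math.sin(theta)))  # ~0
```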

Any complex number z = x + iy admits a polar decomposition of the form

z = re^{iθ} (2.3)

where r ≥ 0 and θ is a real number. The representation z = re^{iθ} is also called the complex exponential form of a complex number. Moreover, r = |z| = √(x² + y²) is the magnitude of z, and θ is called the angle of z. The angle θ is uniquely determined by z modulo 2π.

To obtain the polar decomposition x + iy = re^{iθ}, let x = r cos(θ) and y = r sin(θ) be the polar representation of the vector v with coordinates (x, y) in the plane R². Recall that the norm or length of v is given by ‖v‖ = √(x² + y²) = r, and θ is the angle from the positive horizontal axis to the vector v. Hence r = √(x² + y²) = |z|. Using Euler's identity (2.1), we obtain

re^{iθ} = r cos(θ) + ir sin(θ) = x + iy.

Therefore re^{iθ} = x + iy. So any complex number z = x + iy admits a unique polar decomposition of the form z = re^{iθ}, where r = |z| is the magnitude of z and θ is the angle of z. We denote the angle of z by either angle(z) or arg(z). In other words, z = |z|e^{i arg(z)}. The Matlab command for |z| is abs(z), and angle(z) is the Matlab command for the angle of z.

It is noted that for any complex number x + iy, we have

angle(x + iy) = −angle(x − iy) (when x and y are real). (2.4)

To see this, observe that x + iy = re^{iθ}, where r = √(x² + y²) and θ = angle(x + iy), is the polar decomposition of x + iy. Taking the complex conjugate, x − iy = re^{−iθ} is the polar decomposition of x − iy. Hence angle(x − iy) = −angle(x + iy).

The polar decomposition is unique, that is, if r1 e^{iθ1} = r2 e^{iθ2} where r1 and r2 are positive and θ1 and θ2 are real, then r1 = r2 and θ1 = θ2 modulo 2π.

Clearly, 1 = e^{i0}. In other words, the magnitude of 1 is one and the angle of 1 is zero. Notice that −1 = e^{iπ}. So the magnitude of −1 is one and the angle of −1 is π. Moreover, i = e^{iπ/2}. In particular, |i| = 1 and the angle of i is π/2. Furthermore, −i = e^{−iπ/2}. Hence |−i| = 1 and the angle of −i is −π/2. We claim that 1 + i = √2 e^{iπ/4}. This follows from |1 + i| = √2 and arctan(1/1) = arctan(1) = π/4.

It is noted that e2πki = 1 for all integers k. Finally,

e^{ikπ} = (−1)^k and e^{ikπ/2} = i^k for all integers k. (2.5)
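The identities e^{2πki} = 1 and (2.5) can likewise be verified numerically for a few integers k (a quick Python check):

```python
import cmath
import math

# e^{2*pi*k*i} = 1, e^{i*k*pi} = (-1)^k and e^{i*k*pi/2} = i^k.
for k in range(-3, 4):
    assert abs(cmath.exp(2j * math.pi * k) - 1) < 1e-12
    assert abs(cmath.exp(1j * math.pi * k) - (-1) ** k) < 1e-12
    assert abs(cmath.exp(1j * math.pi * k / 2) - 1j ** k) < 1e-12
print("identities (2.5) hold for k = -3, ..., 3")
```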


One must be careful when using the arctan function. Recall that arctan is an odd function, and that the range of arctan(b) lies between −π/2 and π/2, that is, −π/2 < arctan(b) < π/2 where b is a real number. So when computing the angle of a complex number x + iy one uses the following rule:

angle(x + iy) = arctan(y/x) if x > 0
angle(x + iy) = π + arctan(y/x) if x < 0 and y ≥ 0
angle(x + iy) = −π + arctan(y/x) if x < 0 and y < 0 (2.6)
angle(x + iy) = π/2 if x = 0 and y > 0
angle(x + iy) = −π/2 if x = 0 and y < 0.

This places the angle of the complex number x + iy in (−π, π]. Of course, the angle is unique modulo 2π. So there is nothing magical about placing the angle in (−π, π], or [0, 2π), or even [100π, 102π). However, Matlab places the angle of a complex number in (−π, π]. If y = 0, then angle(x) = 0 if x > 0, and angle(x) = π when x < 0. Of course, one can simply use the angle command in Matlab to compute angle(x + iy) and avoid using the arctan function. Finally, it is noted that angle(x + iy) = atan2(y, x), where atan2(y, x) is the four quadrant inverse tangent function; the Matlab command is atan2(y, x).
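The piecewise rule (2.6) is exactly what a four-quadrant inverse tangent computes. Here is a small illustrative Python implementation (the function name angle_piecewise is ours, not from the notes), checked against math.atan2, the analogue of Matlab's atan2, and cmath.phase:

```python
import cmath
import math

def angle_piecewise(x, y):
    """Angle of x + iy placed in (-pi, pi] by the rule (2.6)."""
    if x > 0:
        return math.atan(y / x)
    if x < 0 and y >= 0:
        return math.pi + math.atan(y / x)
    if x < 0 and y < 0:
        return -math.pi + math.atan(y / x)
    if x == 0 and y > 0:
        return math.pi / 2
    if x == 0 and y < 0:
        return -math.pi / 2
    return 0.0  # x = y = 0: take the angle of zero to be zero

for z in [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 2 - 3j, -2 + 3j, 3j, -3j]:
    assert abs(angle_piecewise(z.real, z.imag) - math.atan2(z.imag, z.real)) < 1e-12
    assert abs(angle_piecewise(z.real, z.imag) - cmath.phase(z)) < 1e-12
print("rule (2.6) agrees with atan2")
```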

To see how ±π naturally occurs in computing the angle in (2.6), consider the four complex numbers 1 + i, 1 − i, −1 + i and −1 − i. The angles of these four complex numbers are respectively given by π/4, −π/4, 3π/4 and −3π/4. Notice that arctan(1) = π/4 and arctan(−1) = −π/4. So if one only uses arctan(y/x) to compute the angles of −1 + i and −1 − i, then one cannot distinguish between the angles of −1 + i and 1 − i, or the angles of −1 − i and 1 + i. To correct this problem observe that

−1 + i = −(1 − i) = e^{iπ}(1 − i) = e^{iπ} √2 e^{−iπ/4} = √2 e^{i(π − π/4)} = √2 e^{i3π/4}
−1 − i = −(1 + i) = e^{−iπ}(1 + i) = e^{−iπ} √2 e^{iπ/4} = √2 e^{i(π/4 − π)} = √2 e^{−i3π/4}.

(Here we used the fact that −1 = e^{±iπ}.) Hence the angle of −1 + i equals π plus the angle of 1 − i, that is,

angle(−1 + i) = π + angle(1 − i) = π + arctan(−1/1) = 3π/4
angle(−1 − i) = −π + angle(1 + i) = −π + arctan(1/1) = −3π/4.

In the previous calculation, we implicitly used the fact that the angle of the product of two complex numbers is the sum of their respective angles, that is,

angle(z1z2) = angle(z1) + angle(z2), (2.7)

or equivalently, in arg notation, arg(z1z2) = arg(z1) + arg(z2). To see this, simply observe that z1z2 = r1e^{iθ1} r2e^{iθ2} = r1r2 e^{i(θ1 + θ2)}, where z1 = r1e^{iθ1} and z2 = r2e^{iθ2} are the polar decompositions of z1 and z2, respectively. Therefore arg(z1z2) = θ1 + θ2 = arg(z1) + arg(z2).
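Since Matlab's angle command (and Python's cmath.phase) folds results back into (−π, π], the additivity (2.7) holds up to a multiple of 2π; a quick numerical check:

```python
import cmath
import math

# arg(z1*z2) = arg(z1) + arg(z2), modulo 2*pi.
z1 = -1 + 1j
z2 = -2 - 3j
lhs = cmath.phase(z1 * z2)
rhs = cmath.phase(z1) + cmath.phase(z2)
k = round((lhs - rhs) / (2 * math.pi))
print(abs(lhs - rhs - 2 * math.pi * k))  # ~0: the sides differ by 2*pi*k
```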


For another example, observe that

angle(2 − 3i) = arctan(−3/2) ≈ −0.9828
angle(−2 + 3i) = π − arctan(3/2) ≈ 2.1588, because −2 + 3i = e^{iπ}(2 − 3i)
angle(−2 − 3i) = −π + arctan(3/2) ≈ −2.1588, because −2 − 3i = e^{−iπ}(2 + 3i).

If x > 0, then angle(x + iy) = arctan(y/x). However, if x < 0, then x + iy = e^{±iπ}(−x − iy), and thus angle(x + iy) = ±π + arctan(y/x). This is how ±π naturally arises in (2.6).

It is noted that if one does not care about placing the angle of a complex number in (−π, π], then one can simply use

angle(x + iy) = arctan(y/x) if x > 0
angle(x + iy) = π + arctan(y/x) if x < 0 (2.8)
angle(x + iy) = π/2 if x = 0 and y > 0
angle(x + iy) = −π/2 if x = 0 and y < 0.

This places the angle in [−π/2, 3π/2).


Figure 1.2: The graph of {1, i, −1, −i, 2 + 3i, −3 + 2i, −3 − 3i and 3 − i}

Figure 1.2 presents a graph of the complex numbers

{1, i, −1, −i, 2 + 3i, −3 + 2i, −3− 3i and 3− i} (2.9)

in the complex plane. Here we connected a line from the origin 0 + i0 to each of these points. Recall that x + iy = re^{iθ}. The length r of the line to the point x + iy is the magnitude of x + iy, and θ is the angle from the positive horizontal axis to the line corresponding to the complex number x + iy. As noted earlier, 1 = e^{i0}, −1 = e^{iπ}, i = e^{iπ/2} and −i = e^{−iπ/2}. Using


x + iy = |x + iy|e^{iθ}, we see that the last four complex numbers admit polar decompositions of the form

2 + 3i ≈ √13 e^{0.98i} and −3 + 2i ≈ √13 e^{2.55i}
−3 − 3i ≈ √18 e^{−2.36i} and 3 − i ≈ √10 e^{−0.32i}. (2.10)

The angles were computed in Matlab using the angle command. Finally, −3 − 3i = √18 e^{−i3π/4}.

Notice that re^{iθ} = re^{i(θ + 2πk)} where k is an integer. So the angle of re^{iθ} with r > 0 equals θ + 2πk where k is an integer. In other words, the angle of a complex number is unique modulo 2π. It is emphasized that Matlab always places the angle of a complex number in (−π, π]. For example, the angle of −1 − i equals 5π/4 + 2πk where k is an integer. So Matlab would express the angle of −1 − i as −3π/4.

As before, let re^{iθ} be the polar decomposition of a complex number z. If r is fixed and θ varies from zero to 2π, then re^{iθ} moves counterclockwise once around the circle of radius r, starting at r + i0 and ending at r + i0. For example, consider the complex numbers e^{2πik/100} where k = 0, 1, 2, · · · , 100. Notice that 2πk/100 is the angle of e^{2πik/100}. So as the integer k varies from 0 to 100, the complex numbers e^{2πik/100} move counterclockwise around the unit circle, starting at 1 + i0 and ending at e^{200πi/100} = 1 + i0. In other words, the graph of e^{2πik/100} for k = 0, 1, 2, · · · , 100 places 101 points around the unit circle with angles 2πk/100 for k = 0, 1, . . . , 100, and places a point at 1 + i0 twice. Now consider the complex numbers 2e^{2πik/100}. Clearly, 2πk/100 is the angle of 2e^{2πik/100}. In this case, as the integer k varies from 0 to 100, the complex numbers 2e^{2πik/100} move counterclockwise around the circle of radius two, starting at 2 + i0 and ending at 2e^{200πi/100} = 2 + i0. In other words, the graph of 2e^{2πik/100} for k = 0, 1, 2, · · · , 100 places 101 points around the circle of radius two with angles 2πk/100, and places two points at 2 + i0. The graph of e^{2πik/100} and 2e^{2πik/100} for k = 0, 1, 2, · · · , 100 is given in Figure 1.3.

Figure 1.3: The graph of e^{2πik/100} and 2e^{2πik/100} for k = 0, 1, 2, · · · , 100
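The points in Figure 1.3 are straightforward to generate; a Python sketch (the plotting step is omitted, where a package such as matplotlib would play the role of Matlab's plot command):

```python
import cmath
import math

# The 101 points e^{2*pi*i*k/100}, k = 0, 1, ..., 100, on the unit circle.
points = [cmath.exp(2j * math.pi * k / 100) for k in range(101)]
print(all(abs(abs(p) - 1) < 1e-12 for p in points))  # True: all on the circle
print(abs(points[0] - points[100]))  # ~0: the path starts and ends at 1 + i0
# Scaling by 2 places the same angles on the circle of radius two.
outer = [2 * p for p in points]
print(abs(outer[25] - 2j))  # ~0: k = 25 gives angle pi/2, i.e. the point 2i
```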

Let re^{iθ} be the polar decomposition of a complex number z. If r is fixed and θ varies from zero to 4π, then re^{iθ} moves counterclockwise around the circle of radius r twice, starting at r + i0 and ending at r + i0. If θ varies from zero to 10π, then re^{iθ} moves counterclockwise around the circle of radius r five times, starting at r + i0 and ending at r + i0. If θ varies from zero to 5π, then re^{−iθ} moves clockwise around the circle of radius r two and one half times, starting at r + i0 and ending at −r + i0. If θ varies from π/2 to 3π/2, then re^{iθ} moves one half a rotation counterclockwise around the circle of radius r, starting at 0 + ir and ending at 0 − ir.

As before, let re^{iθ} be the polar decomposition of a complex number z. If θ is fixed and r varies from zero to R > 0, then re^{iθ} moves from the origin 0 + i0 in a straight line to the point Re^{iθ} = R cos(θ) + iR sin(θ). In other words, as r varies from zero to R > 0, re^{iθ} traces a line of length R with angle θ starting at the origin. For example, consider the complex numbers re^{2πik/20} where r varies between zero and two. Clearly, the angle of re^{2πik/20} is 2πk/20. In this case, the graph of re^{2πik/20} forms a line of length two with angle 2πk/20 starting at the origin. So if we let k = 0, 1, 2, · · · , 19, then we obtain twenty lines of length two with angles 2πk/20 for k = 0, 1, . . . , 19 starting at the origin. The graph of these twenty lines is given in Figure 1.4. The graph looks like a wheel with twenty spokes at angles 2πk/20 and no tire. Finally, to make the plot look like a wheel with a tire, we also plotted the circle of radius two by graphing 2e^{iθ} for 0 ≤ θ ≤ 2π.


Figure 1.4: A wheel of radius two with twenty spokes.

The polar decomposition plays a fundamental role in computing the roots and powers of complex numbers. For example, the polar decomposition can be used to compute zⁿ where n is a positive integer. As before, let z = x + iy. Then using the polar decomposition z = re^{iθ}, where r = |z| and θ = arg(x + iy), we have De Moivre's formula

(x + iy)ⁿ = rⁿe^{inθ} = rⁿ cos(nθ) + irⁿ sin(nθ). (2.11)


For a concrete example, let z = 1 + i. Then z = √2 e^{iπ/4}. Hence

(1 + i)³ = (√2 e^{iπ/4})³ = 2^{3/2} e^{i3π/4} = 2^{3/2}(cos(3π/4) + i sin(3π/4)) = −2 + 2i
(1 + i)¹² = (√2 e^{iπ/4})¹² = 2⁶ e^{i12π/4} = 2⁶ e^{i3π} = −2⁶ = −64.

Thus (1 + i)³ = −2 + 2i and (1 + i)¹² = −64.

Now just for fun let us compute iⁱ. Using i = e^{iπ/2}, we see that

iⁱ = (e^{iπ/2})ⁱ = e^{i·i·π/2} = e^{−π/2}. (2.12)

Hence i^i = e^{−π/2}. Notice that i = e^{i(π/2 + 2πk)} where k is any integer. Using this we obtain

i^i = (e^{i(π/2 + 2kπ)})^i = e^{i·i·(π/2 + 2πk)} = e^{−(π/2 + 2πk)}.

So i^i = e^{−(π/2 + 2πk)} where k is any integer. In other words, i^i = e^{−π/2} e^{2πj} where j is any integer.
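Python's complex power can be used to check the k = 0 value, since it evaluates z^w through the principal branch of the logarithm; the other values come from the other branches. A minimal sketch:

```python
import cmath, math

# Python's complex power uses the principal branch of the logarithm,
# so i**i returns the k = 0 value e^{-pi/2} from the text.
principal = (1j) ** (1j)
assert abs(principal - math.exp(-math.pi / 2)) < 1e-12
assert abs(principal.imag) < 1e-12

# The other values e^{-(pi/2 + 2*pi*k)} come from the other branches of log(i):
k = 1
other = cmath.exp(1j * (1j * (math.pi / 2 + 2 * math.pi * k)))
assert abs(other - math.exp(-(math.pi / 2 + 2 * math.pi * k))) < 1e-12
```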

Now let z_k = x_k + i y_k = r_k e^{iθ_k} for k = 1, 2, · · · , 5 be the polar decompositions for a sequence of nonzero complex numbers. Consider the complex number z defined by

z = z1 z2 / (z3 z4 z5) = (x1 + iy1)(x2 + iy2) / ((x3 + iy3)(x4 + iy4)(x5 + iy5))
  = r1 e^{iθ1} r2 e^{iθ2} / (r3 e^{iθ3} r4 e^{iθ4} r5 e^{iθ5})
  = (r1 r2 / (r3 r4 r5)) e^{i(θ1 + θ2 − θ3 − θ4 − θ5)}.

Let |z| e^{iθ} be the polar decomposition of z. Then the magnitude |z| of z and the angle θ of z are respectively given by

|z| = r1 r2 / (r3 r4 r5) = √( (x1² + y1²)(x2² + y2²) / ((x3² + y3²)(x4² + y4²)(x5² + y5²)) )
θ = θ1 + θ2 − θ3 − θ4 − θ5.

In other words, the magnitude |z| of z is the product of the magnitudes of the complex numbers in the numerator, divided by the product of the magnitudes of the complex numbers in the denominator. The angle θ of z is the sum of the angles of the complex numbers in the numerator, minus the sum of the angles of the complex numbers in the denominator.

For a concrete example, consider the complex number

z = −(2 + 3i)(4 − 2i) / ((1 − 2i)(3 + 2i)).

Then the magnitude and angle of z are given by

|z| = |−1| |2 + 3i| |4 − 2i| / (|1 − 2i| |3 + 2i|) = √( (4 + 9)(16 + 4) / ((1 + 4)(9 + 4)) ) = 2
arg(z) = π + arctan(3/2) − arctan(2/4) + arctan(2/1) − arctan(2/3) = 4.1799.

(If one wants to place the angle in (−π, π], then arg(z) = 4.1799 − 2π = −2.1033.) Notice that a minus sign appears before the arctan(2/4) and a plus sign appears before the arctan(2/1) because the arctan is an odd function.
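The magnitude/angle bookkeeping for this example can be confirmed numerically; a Python sketch (stand-in for the Matlab used elsewhere in the notes):

```python
import cmath, math

# Numerically confirm the magnitude and angle of
# z = -(2+3i)(4-2i) / ((1-2i)(3+2i)) worked out above.
z = -(2 + 3j) * (4 - 2j) / ((1 - 2j) * (3 + 2j))

assert abs(abs(z) - 2) < 1e-12                       # |z| = 2
# Sum/difference of the factor angles, as in the text:
theta = (math.pi + math.atan2(3, 2) + math.atan2(-2, 4)
         - math.atan2(-2, 1) - math.atan2(2, 3))
assert abs(theta - 4.1799) < 1e-4
# cmath.phase reports the angle folded into (-pi, pi]:
assert abs(cmath.phase(z) - (theta - 2 * math.pi)) < 1e-9
```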


Let z_j = |z_j| e^{i arg(z_j)} be a set of nonzero complex numbers with magnitude |z_j| and angle arg(z_j) for j = 1, 2, · · · , n. Let z be a complex number of the form

z = (∏_{j=1}^{m} z_j) / (∏_{j=m+1}^{n} z_j).

(The product of a set of complex numbers {a_j} is denoted by ∏ a_j.) The polar decomposition of z, its magnitude |z| and angle arg(z) are given by

z = (∏_{j=1}^{m} z_j) / (∏_{j=m+1}^{n} z_j) = ( (∏_{j=1}^{m} |z_j|) / (∏_{j=m+1}^{n} |z_j|) ) e^{i(∑_{j=1}^{m} arg(z_j) − ∑_{j=m+1}^{n} arg(z_j))}

|z| = (∏_{j=1}^{m} |z_j|) / (∏_{j=m+1}^{n} |z_j|) (2.13)

arg(z) = ∑_{j=1}^{m} arg(z_j) − ∑_{j=m+1}^{n} arg(z_j).

It is emphasized that the angle arg(z) = ∑_{j=1}^{m} arg(z_j) − ∑_{j=m+1}^{n} arg(z_j) is not necessarily in (−π, π]. Finally, as noted earlier, the magnitude |z| of z is the product of the magnitudes of the complex numbers in the numerator, divided by the product of the magnitudes of the complex numbers in the denominator. The angle arg(z) of z is the sum of the angles of the complex numbers in the numerator, minus the sum of the angles of the complex numbers in the denominator.

A proof of Euler's identity

Let us derive Euler's identity e^{iθ} = cos(θ) + i sin(θ). To this end, recall that the Taylor series expansions for e^x, cos(θ) and sin(θ) are given by

e^x = 1 + x/1! + x²/2! + x³/3! + x⁴/4! + x⁵/5! + · · ·
cos(θ) = 1 − θ²/2! + θ⁴/4! − θ⁶/6! + θ⁸/8! − θ¹⁰/10! + · · ·
sin(θ) = θ − θ³/3! + θ⁵/5! − θ⁷/7! + θ⁹/9! − θ¹¹/11! + · · · .

Using the fact that i² = −1, i³ = −i, i⁴ = 1, and i⁵ = i etc., along with x = iθ, we obtain

e^{iθ} = 1 + iθ/1! + i²θ²/2! + i³θ³/3! + i⁴θ⁴/4! + i⁵θ⁵/5! + i⁶θ⁶/6! + i⁷θ⁷/7! + · · ·
       = (1 − θ²/2! + θ⁴/4! − θ⁶/6! + · · ·) + i(θ − θ³/3! + θ⁵/5! − θ⁷/7! + · · ·)
       = cos(θ) + i sin(θ).

Therefore e^{iθ} = cos(θ) + i sin(θ), which proves Euler's identity.
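The series manipulation above can be checked numerically by truncating the Taylor series; a Python sketch (the truncation length 30 is an illustrative choice):

```python
import math

def exp_i(theta, terms=30):
    """Partial sum of the Taylor series for e^{i*theta} used in the proof."""
    return sum((1j * theta) ** n / math.factorial(n) for n in range(terms))

theta = 2.0
approx = exp_i(theta)
assert abs(approx.real - math.cos(theta)) < 1e-12   # real part -> cos(theta)
assert abs(approx.imag - math.sin(theta)) < 1e-12   # imaginary part -> sin(theta)
```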


1.2.1 The roots of unity

Figure 1.5: The roots of λ^75 − 1 = 0

This section is devoted to the roots of unity. The roots of unity play a fundamental role in the discrete Fourier transform. First let us use the polar decomposition to find the roots of a complex number z. Consider the equation λ^n − z = 0 where n is a strictly positive integer. Clearly, this equation has n roots, and these roots are given by λ = z^{1/n}. To solve this equation, let z = r e^{iθ} be the polar decomposition of z. Notice that z = r e^{i(θ + 2πk)} where k is any integer. In particular, z^{1/n} = r^{1/n} e^{i(θ + 2πk)/n}. So the n roots {λ_k}_{k=0}^{n−1} of the equation λ^n − z = 0 are given by

λ_k = r^{1/n} e^{i(θ + 2πk)/n}    (for k = 0, 1, 2, · · · , n − 1).

Notice that it is sufficient to let k = 0, 1, 2, · · · , n − 1. If m is any other integer, then e^{i(θ + 2πm)/n} = e^{i(θ + 2πk)/n} for some integer k between 0 and n − 1. Finally, the n roots {λ_k}_{k=0}^{n−1} of λ^n = z live on the circle of radius r^{1/n} with corresponding angles at {(θ + 2πk)/n}_{k=0}^{n−1}.

For a concrete example, let z = 1 − i and let us find the five roots of λ^5 = z. Notice that 1 − i = √2 e^{−iπ/4}. Hence z = √2 e^{i(−π/4 + 2πk)} = √2 e^{iπ(8k−1)/4} for any integer k. Therefore the five roots {λ_k}_{k=0}^{4} of λ^5 = 1 − i are given by λ_k = 2^{1/10} e^{iπ(8k−1)/20} for k = 0, 1, 2, 3, 4, that is, using Euler's identity

λ0 = 2^{1/10} e^{−iπ/20} = 1.0586 − 0.1677i    and    λ1 = 2^{1/10} e^{7πi/20} = 0.4866 + 0.9550i
λ2 = 2^{1/10} e^{15πi/20} = −0.7579 + 0.7579i    and    λ3 = 2^{1/10} e^{23πi/20} = −0.9550 − 0.4866i
λ4 = 2^{1/10} e^{31πi/20} = 0.1677 − 1.0586i.

It is noted that λ2 = 2^{1/10} e^{15πi/20} = 2^{−2/5} 2^{1/2} e^{3πi/4} = −2^{−2/5} + 2^{−2/5} i. The five roots {λ_k}_{k=0}^{4} of λ^5 = 1 − i live on the circle of radius 2^{1/10} with corresponding angles at {−π/20, 7π/20, 15π/20, 23π/20, 31π/20}.

Now let us compute the n roots of unity, that is, let us find all the roots of the equation λ^n − 1 = 0 where n is a strictly positive integer. Clearly, λ = 1^{1/n}. Notice that 1 = e^{2πki} for all integers k. Therefore the n roots {λ_k}_{k=0}^{n−1} of the equation λ^n = 1 are given by

λ_k = e^{2πki/n}    (for k = 0, 1, 2, · · · , n − 1). (2.14)

The roots of unity {e^{2πki/n}}_{k=0}^{n−1} live on the unit circle with corresponding angles at {2πk/n}_{k=0}^{n−1}. Moreover, the roots satisfy λ_k = λ_1^k for k = 0, 1, 2, · · · , n − 1. Finally, it is noted that the roots of unity will play a fundamental role in the discrete Fourier transform. In the discrete Fourier transform the roots of unity will be expressed as e^{−2πki/n} for k = 0, 1, 2, · · · , n − 1.

For an example, consider the roots of λ^75 = 1. The seventy five roots of λ^75 = 1 are given by {e^{2πki/75}}_{k=0}^{74}. The graph of these roots is presented in Figure 1.5. As expected, the roots of λ^75 = 1 live on the unit circle with corresponding angles at {2πk/75}_{k=0}^{74}.
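Equation (2.14) and the relation λ_k = λ_1^k are easy to check in code; a Python sketch using the λ^75 = 1 example (the helper name `roots_of_unity` is ours):

```python
import cmath, math

def roots_of_unity(n):
    """The n roots of lambda^n = 1, as in equation (2.14)."""
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]

w = roots_of_unity(75)
assert all(abs(lam ** 75 - 1) < 1e-10 for lam in w)           # each is a 75th root of 1
assert all(abs(abs(lam) - 1) < 1e-12 for lam in w)            # all on the unit circle
assert all(abs(w[k] - w[1] ** k) < 1e-10 for k in range(75))  # lambda_k = lambda_1^k
```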

1.2.2 Classical sinusoid formulas

Euler's equation e^{iϕ} = cos(ϕ) + i sin(ϕ) can be used to derive many classical formulas for sinusoids. For example,

cos(θ) cos(φ) = (cos(θ + φ) + cos(θ − φ))/2. (2.15)

To verify this result recall that cos(ϕ) = (e^{iϕ} + e^{−iϕ})/2. Using this, we obtain

cos(θ) cos(φ) = ((e^{iθ} + e^{−iθ})/2) × ((e^{iφ} + e^{−iφ})/2)
             = (1/4)(e^{iθ}e^{iφ} + e^{iθ}e^{−iφ} + e^{−iθ}e^{iφ} + e^{−iθ}e^{−iφ})
             = (1/4)(e^{i(θ+φ)} + e^{−i(θ+φ)} + e^{i(θ−φ)} + e^{−i(θ−φ)})
             = (cos(θ + φ) + cos(θ − φ))/2.

This yields (2.15). In particular, choosing θ = φ in (2.15), we arrive at

cos(θ)² = (1 + cos(2θ))/2. (2.16)
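Identities like (2.15) and (2.16) are easy to spot-check at random angles; a small Python sketch (the number of trials and angle range are arbitrary choices):

```python
import math, random

# Spot-check the product formula (2.15) and its special case (2.16)
# at randomly chosen angles.
random.seed(0)
for _ in range(100):
    t, p = random.uniform(-10, 10), random.uniform(-10, 10)
    lhs = math.cos(t) * math.cos(p)
    rhs = (math.cos(t + p) + math.cos(t - p)) / 2
    assert abs(lhs - rhs) < 1e-12                                  # (2.15)
    assert abs(math.cos(t) ** 2 - (1 + math.cos(2 * t)) / 2) < 1e-12  # (2.16)
```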

For another example recall that

cos(θ + φ) = cos(θ) cos(φ) − sin(θ) sin(φ). (2.17)

To see this notice that

cos(θ + φ) = (e^{i(θ+φ)} + e^{−i(θ+φ)})/2 = (e^{iθ}e^{iφ} + e^{−iθ}e^{−iφ})/2
           = (1/2)((cos(θ) + i sin(θ))(cos(φ) + i sin(φ))) + (1/2)((cos(θ) − i sin(θ))(cos(φ) − i sin(φ)))
           = (1/2)(cos(θ) cos(φ) − sin(θ) sin(φ)) + (i/2)(sin(θ) cos(φ) + cos(θ) sin(φ))
             + (1/2)(cos(θ) cos(φ) − sin(θ) sin(φ)) − (i/2)(sin(θ) cos(φ) + cos(θ) sin(φ))
           = cos(θ) cos(φ) − sin(θ) sin(φ).


Therefore (2.17) holds. Finally, it is noted that one did not have to compute the imaginary part in the previous calculation. Because cos(θ + φ) is a real number, the imaginary parts must cancel out.

In certain problems one is concerned with summing two sinusoids of the same frequency. This leads to the following classical formula which arises in certain engineering problems:

α cos(ωt + θ) + β sin(ωt + φ) = r cos(ωt + ϕ)
r = √( (α cos(θ) + β sin(φ))² + (α sin(θ) − β cos(φ))² ) (2.18)
ϕ = angle(α cos(θ) + β sin(φ) + i(α sin(θ) − β cos(φ))).

Here α and β are real numbers, ω is the angular frequency, t is time, while θ and φ are phase shifts. In particular, r and ϕ are the magnitude and angle of the complex number

α cos(θ) + β sin(φ) + i(α sin(θ) − β cos(φ)) = r e^{iϕ}. (2.19)

To derive the classical formula in (2.18), recall that Re(z) = (z + \overline{z})/2 where z is any complex number. Now observe that

α cos(ωt + θ) + β sin(ωt + φ) = (α e^{iωt}e^{iθ} + α e^{−iωt}e^{−iθ})/2 + (β e^{iωt}e^{iφ} − β e^{−iωt}e^{−iφ})/(2i)
  = (1/2)(α e^{iθ} − iβ e^{iφ}) e^{iωt} + (1/2)(α e^{−iθ} + iβ e^{−iφ}) e^{−iωt}
  = (1/2)(α e^{iθ} − iβ e^{iφ}) e^{iωt} + (1/2)\overline{(α e^{iθ} − iβ e^{iφ}) e^{iωt}}
  = Re((α e^{iθ} − iβ e^{iφ}) e^{iωt})
  = Re((α cos(θ) + β sin(φ) + i(α sin(θ) − β cos(φ))) e^{iωt})
  = Re(r e^{iϕ} e^{iωt}) = Re(r e^{i(ωt+ϕ)}) = r cos(ωt + ϕ).
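The superposition formula (2.18) can be checked numerically at sample times; a Python sketch with arbitrarily chosen illustrative values of α, β, θ, φ and ω:

```python
import cmath, math

# Verify the superposition formula (2.18) at sample times, with the
# illustrative values alpha=2, beta=3, theta=0.4, phi=-1.1, omega=5.
alpha, beta, theta, phi, omega = 2.0, 3.0, 0.4, -1.1, 5.0

c = (alpha * math.cos(theta) + beta * math.sin(phi)
     + 1j * (alpha * math.sin(theta) - beta * math.cos(phi)))
r, psi = abs(c), cmath.phase(c)          # r e^{i*psi} as in (2.19)

for n in range(200):
    t = 0.01 * n
    lhs = alpha * math.cos(omega * t + theta) + beta * math.sin(omega * t + phi)
    rhs = r * math.cos(omega * t + psi)
    assert abs(lhs - rhs) < 1e-12
```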

If θ and φ are both zero, then the superposition formula in (2.18) reduces to

α cos(ωt) + β sin(ωt) = √(α² + β²) cos(ωt − ψ)    where ψ = angle(α + iβ). (2.20)

Here we used the fact that ϕ = angle(α − iβ) = −angle(α + iβ) = −ψ; see (2.4).

Let us complete this section with the following result.

LEMMA 1.2.1 If z is a complex number, then

|z| cos(θ + arg(z)) = Re(z) cos(θ) − Im(z) sin(θ)
|z| sin(θ + arg(z)) = Im(z) cos(θ) + Re(z) sin(θ). (2.21)

Proof. Let z = a + ib where a = Re(z) is the real part of z and b = Im(z) is the imaginary part of z. Using Euler's formula we have

|z| cos(θ + arg(z)) + i|z| sin(θ + arg(z)) = |z| e^{i(θ + arg(z))} = |z| e^{i arg(z)} e^{iθ}
  = (a + ib) e^{iθ} = (a + ib)(cos(θ) + i sin(θ))
  = (a cos(θ) − b sin(θ)) + i(b cos(θ) + a sin(θ)).

By matching the real and imaginary parts, we arrive at the formulas in (2.21). This completes the proof.

1.2.3 Exercise

Problem 1. Find the magnitude and angle for

1 + i, 2 + 3i, −1 + i, −1 + 3i, 1 − i, 4 − 3i, −1 − i and −3 − 4i.

Problem 2. Find the magnitude and angle for

−i(1 + i)(−2 + i)/(3 − 2i),    (1 + 2i)(2 + i)/(2 − 3i)    and    −i(1 − i)/((1 + i)(2 − 3i)).

Problem 3. Compute (1 + 2i)^3 and (1 + i)^12.

Problem 4. Compute 2^i and (1 + i)^i. Hint: a = e^{ln a}.

Problem 5. Compute |4i| and |−5e^{−i2727}|.

Problem 6. Suppose that z is a complex number with angle π/4. Then find z/|z|.

Problem 7. Find the set of all the roots for λ^3 + 2 + i = 0.

Problem 8. Find the set of all roots for λ^3 + 8 = 0.

Problem 9. Find the set of all roots for λ^n + i = 0 where n is a positive integer.

Problem 10. Find cos(i) and sin(i).

Problem 11. Graph by hand all the roots of the equation λ^8 = 1.

Problem 12. Graph in Matlab all the roots of the equation λ^100 = 1.

Problem 13. In Matlab set θ = linspace(0, 2π, 1000). Plot e^{iθ}, describe what happens and explain why. Plot −3e^{4iθ}, describe what happens and explain why. Plot ie^{−iθ/2}, describe what happens and explain why. Plot Re(3e^{8iθ}) and Im(3e^{8iθ}), describe what happens and explain why.

Problem 14. Using r e^{iθ}, plot in Matlab a half wheel in the right half plane {z : Re z ≥ 0} with twenty spokes and radius two.

Problem 15. Find all complex numbers z such that z = ln(i).

Problem 16. Find all the roots of the equation λ^4 + 4 = 0.


Problem 17. Using Euler's formula e^{iϕ} = cos(ϕ) + i sin(ϕ), prove the following classical results:

sin(θ) sin(φ) = (cos(θ − φ) − cos(θ + φ))/2
sin(θ)² = (1 − cos(2θ))/2
sin(θ) cos(φ) = (sin(θ − φ) + sin(θ + φ))/2. (2.22)

Problem 18. Using Euler's formula e^{iϕ} = cos(ϕ) + i sin(ϕ), prove the following classical result:

sin(θ + φ) = sin(θ) cos(φ) + cos(θ) sin(φ). (2.23)

Problem 19. Find the roots of the equation

λ² − (2 + i)λ + 1 + i = 0.

Hint: Quadratic formula. You can check your answer by using the roots command in Matlab.

Problem 20. Find the roots of the equation

λ² − (4 + 6i)λ − 5 + 10i = 0.

Problem 21. Determine the integral

∫₀^{2π} cos(100θ) dθ / (sin(θ) + i cos(θ))^{100}.

Problem 22. In Matlab set θ = linspace(−pi/2, 5*pi, 1000) and z = exp(2∗i∗θ). Clearly, 2θ is the angle for z, and the angle 2θ for z varies from −π to 10π. Now plot(θ, angle(z)); grid. Notice that Matlab places the angles of z in the interval (−π, π]. To unravel this one can use the unwrap command, that is, plot(θ, unwrap(angle(z))); grid. Explain what the unwrap command in Matlab is doing.

1.3 Inner products

In this section we will introduce the inner product and inner product spaces. The set of all complex numbers is denoted by C. We say that H is a linear space if, whenever f and g are vectors in H, then αf + βg is also a vector in H where α and β are complex numbers. For example, let C^ν denote the set of all ν-tuples of the form:

f = ⎡ f1 ⎤
    ⎢ f2 ⎥
    ⎢ ⋮  ⎥
    ⎣ fν ⎦    (3.1)


where f_k is a complex number for all k = 1, 2, · · · , ν. Then C^ν is a linear space. For instance, if ν = 3, then C^3 denotes the set of all vectors of the form

f = ⎡ f1 ⎤
    ⎢ f2 ⎥
    ⎣ f3 ⎦

where f1, f2 and f3 are complex numbers.

Let H be a linear space. Then we say that (·, ·) is an inner product (or dot product) on H if (·, ·) is a complex valued function mapping H×H into C with the following properties:

(i) (f, h) = \overline{(h, f)};

(ii) (αf + βg, h) = α(f, h) + β(g, h);

(iii) (f, f) ≥ 0 (for all f ∈ H);

(iv) (f, f) = 0 if and only if f = 0.

Here f, g and h are vectors in H while α and β are complex numbers. Property (ii) says that the inner product (·, ·) is linear in the first variable. Moreover, using part (i), it follows that the inner product is conjugate linear in the second variable, that is,

(h, αf + βg) = \overline{α}(h, f) + \overline{β}(h, g).

An inner product space is simply a linear space with an inner product. If H is an inner product space, then the norm of a vector f in H is defined by ‖f‖ = +√(f, f). The distance between two vectors f and g in H is given by ‖f − g‖. The Cauchy-Schwartz inequality shows that

|(f, g)| ≤ ‖f‖ ‖g‖    (for all f, g ∈ H). (3.2)

Furthermore, we have equality |(f, g)| = ‖f‖ ‖g‖ if and only if f and g are linearly dependent.

For an example of an inner product space consider H = C^ν. One can define many different inner products on C^ν. The standard inner product on C^ν is defined by

(f, g) = ∑_{k=1}^{ν} f_k \overline{g_k}    (f, g ∈ C^ν), (3.3)

where f = [f1, f2, · · · , fν]^{tr} and g = [g1, g2, · · · , gν]^{tr} are vectors in C^ν. Here tr denotes the transpose. Notice that (f, g) = g∗f where ∗ denotes the complex conjugate transpose. The norm of a vector f in C^ν is given by

‖f‖ = (∑_{k=1}^{ν} |f_k|²)^{1/2}.


The Cauchy-Schwartz inequality shows that |(f, g)| ≤ ‖f‖‖g‖. In other words,

|∑_{k=1}^{ν} f_k \overline{g_k}| ≤ (∑_{k=1}^{ν} |f_k|²)^{1/2} (∑_{k=1}^{ν} |g_k|²)^{1/2}    (f, g ∈ C^ν).

Notice that the Cauchy-Schwartz inequality is a generalization of the fact that (f, g) = ‖f‖‖g‖ cos(θ) when f and g are two real vectors in C^3, and θ is the angle between f and g. Finally, the distance between two vectors f = [f1, f2, · · · , fν]^{tr} and g = [g1, g2, · · · , gν]^{tr} in C^ν is given by

‖f − g‖ = (∑_{k=1}^{ν} |f_k − g_k|²)^{1/2}.

For a concrete example, let f and g be the vectors in C^3 given by

f = ⎡ 2     ⎤      and      g = ⎡ 2 − i ⎤
    ⎢ i     ⎥                   ⎢ 1 + i ⎥
    ⎣ 3 + i ⎦                   ⎣ 2     ⎦ .

For these vectors we obtain

(f, g) = 2(2 + i) + i(1 − i) + (3 + i)2 = 4 + 2i + 1 + i + 6 + 2i = 11 + 5i
‖f‖² = |2|² + |i|² + |3 + i|² = 4 + 1 + 9 + 1 = 15
‖g‖² = |2 − i|² + |1 + i|² + |2|² = 4 + 1 + 1 + 1 + 4 = 11.

So ‖f‖ = √15 and ‖g‖ = √11. Moreover,

‖f − g‖² = ‖[2, i, 3 + i]^{tr} − [2 − i, 1 + i, 2]^{tr}‖² = ‖[i, −1, 1 + i]^{tr}‖² = 1 + 1 + 1 + 1 = 4.

So the distance between f and g is given by ‖f − g‖ = 2.

Let H be an inner product space. Then the following identity is useful:

‖f + g‖² = ‖f‖² + ‖g‖² + 2 Re(f, g)    (f, g ∈ H). (3.4)

To verify this simply observe that

‖f + g‖² = (f + g, f + g) = (f, f) + (f, g) + (g, f) + (g, g)
         = ‖f‖² + ‖g‖² + (f, g) + (g, f) = ‖f‖² + ‖g‖² + (f, g) + \overline{(f, g)}
         = ‖f‖² + ‖g‖² + 2 Re(f, g).

Hence (3.4) holds.

If H is an inner product space, then the following triangle inequality holds:

‖f + g‖ ≤ ‖f‖ + ‖g‖    (f, g ∈ H). (3.5)

To verify that the triangle inequality holds, notice that (3.4) and the Cauchy-Schwartz inequality yield

‖f + g‖² = ‖f‖² + ‖g‖² + 2 Re(f, g) ≤ ‖f‖² + ‖g‖² + 2|(f, g)|
         ≤ ‖f‖² + ‖g‖² + 2‖f‖‖g‖ = (‖f‖ + ‖g‖)².

Hence ‖f + g‖² ≤ (‖f‖ + ‖g‖)². By taking the square root, we obtain the triangle inequality.
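The C^3 example and the two inequalities can be verified numerically; a NumPy sketch (assuming NumPy is available; note that `np.vdot` conjugates its first argument, so `np.vdot(g, f)` matches the convention (f, g) = ∑ f_k times the conjugate of g_k used here):

```python
import numpy as np

# Verify the C^3 example and the Cauchy-Schwartz/triangle inequalities.
f = np.array([2, 1j, 3 + 1j])
g = np.array([2 - 1j, 1 + 1j, 2])

ip = np.vdot(g, f)               # vdot conjugates its FIRST argument, so this is (f, g)
assert abs(ip - (11 + 5j)) < 1e-12
assert abs(np.linalg.norm(f) - np.sqrt(15)) < 1e-12
assert abs(np.linalg.norm(g) - np.sqrt(11)) < 1e-12
assert abs(np.linalg.norm(f - g) - 2) < 1e-12

assert abs(ip) <= np.linalg.norm(f) * np.linalg.norm(g)               # Cauchy-Schwartz (3.2)
assert np.linalg.norm(f + g) <= np.linalg.norm(f) + np.linalg.norm(g) # triangle (3.5)
```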


1.3.1 The L2(0, τ) space

In this section we will present the Lebesgue space L2(0, τ) where τ is a finite positive real number. Throughout, L2(0, τ) is the linear space consisting of the set of all functions f(t) defined on the interval [0, τ] satisfying

∫₀^τ |f(t)|² dt < ∞.

It is easy to verify that L2(0, τ) is a linear space. Moreover,

(f, g) = (1/τ) ∫₀^τ f(t) \overline{g(t)} dt    (f, g ∈ L2(0, τ)) (3.6)

defines an inner product on L2(0, τ). In particular, under this inner product the norm of a function f in L2(0, τ) is defined by

‖f‖ = ( (1/τ) ∫₀^τ |f(t)|² dt )^{1/2}. (3.7)

The distance between two vectors f and g in L2(0, τ) is given by

‖f − g‖ = ( (1/τ) ∫₀^τ |f(t) − g(t)|² dt )^{1/2}.

In this setting the Cauchy-Schwartz inequality |(f, g)| ≤ ‖f‖‖g‖ becomes

|∫₀^τ f(t) \overline{g(t)} dt| ≤ (∫₀^τ |f(t)|² dt)^{1/2} (∫₀^τ |g(t)|² dt)^{1/2}.

Notice that the 1/τ cancels out in the Cauchy-Schwartz inequality.

For some examples on computing the inner product, consider the inner product space L2(0, 2). Then

(t, t²) = (1/2) ∫₀² t·t² dt = (1/2) ∫₀² t³ dt = t⁴/8 |₀² = 2
(t, 1 − it) = (1/2) ∫₀² t \overline{(1 − it)} dt = (1/2) ∫₀² (t + it²) dt = (t²/4 + it³/6) |₀² = 1 + 4i/3
(1, t⁴) = (1/2) ∫₀² 1·t⁴ dt = (1/2) ∫₀² t⁴ dt = t⁵/10 |₀² = 3.2
‖t²‖ = +√(t², t²) = ( (1/2) ∫₀² t⁴ dt )^{1/2} = √3.2.

Hence (t, t²) = 2, (t, 1 − it) = 1 + 4i/3, (1, t⁴) = 3.2 and ‖t²‖ = √3.2. Notice that

‖1 − t‖² = (1/2) ∫₀² |1 − t|² dt = (1/2) ∫₀² (1 − 2t + t²) dt = (t/2 − t²/2 + t³/6) |₀² = 1/3.


So the distance between 1 and t in the L2(0, 2) norm is given by ‖1 − t‖ = 1/√3. Finally, let us compute the L2(0, 2) norm of 2 − 3ti. In this case, the definition of the L2(0, 2) norm in (3.7) implies that

‖2 − 3ti‖² = (1/2) ∫₀² |2 − 3ti|² dt = (1/2) ∫₀² (4 + 9t²) dt = (2t + 3t³/2) |₀² = 16.

Therefore the L2(0, 2) norm of 2 − 3ti is 4.
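The L2(0, 2) computations above can be checked by numerical integration; a NumPy sketch (the helper `ip` and the grid size are our choices, using a simple composite trapezoid rule):

```python
import numpy as np

# Check the L^2(0, 2) computations above by numerical integration.
# Inner product: (f, g) = (1/tau) * integral of f(t) * conj(g(t)) dt.
tau = 2.0
t = np.linspace(0, tau, 200001)
h = t[1] - t[0]

def ip(fv, gv):
    vals = fv * np.conj(gv)
    return (h * (vals.sum() - 0.5 * (vals[0] + vals[-1]))) / tau  # trapezoid rule

assert abs(ip(t, t**2) - 2) < 1e-6                       # (t, t^2) = 2
assert abs(ip(t, 1 - 1j * t) - (1 + 4j / 3)) < 1e-6      # (t, 1 - it) = 1 + 4i/3
assert abs(ip(np.ones_like(t), t**4) - 3.2) < 1e-6       # (1, t^4) = 3.2
assert abs(np.sqrt(ip(1 - t, 1 - t).real) - 1 / np.sqrt(3)) < 1e-6   # ||1 - t||
assert abs(np.sqrt(ip(2 - 3j * t, 2 - 3j * t).real) - 4) < 1e-6      # ||2 - 3ti||
```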

1.3.2 Exercise

Problem 1. Consider the vectors f and g in C^3 given by

f = ⎡ 2      ⎤      and      g = ⎡ i      ⎤
    ⎢ 1 + i  ⎥                   ⎢ 2 − i  ⎥
    ⎣ 3 + 2i ⎦                   ⎣ 2 − 3i ⎦ .

Then compute the following: ‖f‖, ‖g‖, (f, g), (f, 2ig) and the distance between f and g, that is, ‖f − g‖.

Problem 2. Let H be an inner product space. Show that the following parallelogram law holds:

‖f + g‖² + ‖f − g‖² = 2(‖f‖² + ‖g‖²)    (f, g ∈ H).

Problem 3. Consider the inner product space L2(0, 1). Then compute (1, t), (t², t³), (1 − it, t²), (t², 1 − it), ‖t²‖, ‖1 − it‖ and ‖1 − t²‖.

Problem 4. Consider the inner product space L2(0, 2π). Let k and m be integers. Then compute ‖e^{ikt}‖, (t, e^{ikt}) and ‖cos(kt)‖. If m is any integer not equal to k, then compute (e^{ikt}, e^{imt}) and (cos(kt), sin(mt)).

1.4 Orthogonal basis

In this section we introduce the concept of an orthogonal basis. To begin, let us establish some notation. Let H be an inner product space. Then two vectors f and g in H are orthogonal, denoted by f ⊥ g, if (f, g) = 0. We say that f is orthogonal to a set M, denoted by f ⊥ M, if (f, g) = 0 for all g in M.

Let K be a set of integers. A set of vectors {ψk}_{k∈K} is orthogonal if ψk is orthogonal to ψm for all integers k ≠ m, that is, (ψk, ψm) = δkm‖ψk‖² where δkm is the Kronecker delta. By definition δkm = 0 if k ≠ m, and δkm = 1 if k = m. We say that {ψk}_{k∈K} is a set of nonzero orthogonal vectors if {ψk}_{k∈K} is a set of orthogonal vectors and ψk ≠ 0 for all k in K. If {ψk}_{k∈K} is a set of nonzero orthogonal vectors, then {ψk}_{k∈K} is linearly independent. To see this, assume that ∑ αkψk = 0 where {αk} is a set of scalars. Let m be any integer in K. Because the inner product of any vector with the zero vector is zero, we obtain

0 = (∑_k αkψk, ψm) = ∑_k αk(ψk, ψm) = αm‖ψm‖².


Hence 0 = αm‖ψm‖². Since ψm is nonzero, αm = 0. In other words, αm = 0 for all m in K. Therefore {ψk}_{k∈K} is linearly independent.

Recall that the dimension of a linear space H is the number of linearly independent vectors needed to span H. In particular, the dimension of C^ν is ν. Furthermore, if {ξk}_{1}^{ν} is a linearly independent set of vectors in C^ν, then {ξk}_{1}^{ν} is a basis for C^ν. Therefore if {ψk}_{1}^{ν} is any nonzero orthogonal set of vectors, then {ψk}_{1}^{ν} is a basis for C^ν.

We say that {ψk}_{k∈K} is an orthogonal basis for an inner product space H if {ψk}_{k∈K} is a nonzero orthogonal set of vectors and the (closed) linear span of {ψk}_{k∈K} equals H. In other words, {ψk}_{k∈K} is an orthogonal basis for H if {ψk}_{k∈K} is a nonzero orthogonal set of vectors and, given any vector f in H, there exists a set of scalars {αk}_{k∈K} such that f = ∑ αkψk. For example,

⎡ 1 ⎤   ⎡ 0 ⎤   ⎡  2 ⎤   ⎡  0 ⎤
⎢ 0 ⎥   ⎢ 1 ⎥   ⎢  0 ⎥   ⎢  3 ⎥
⎢ 0 ⎥ , ⎢ 1 ⎥ , ⎢  0 ⎥ , ⎢ −3 ⎥    (4.1)
⎣ 1 ⎦   ⎣ 0 ⎦   ⎣ −2 ⎦   ⎣  0 ⎦

is a nonzero orthogonal set of vectors in C^4. Hence the set of vectors in (4.1) forms an orthogonal basis for C^4. Finally, it is noted that if {ψk}_{k∈K} is an orthogonal basis for H, then the dimension of H equals the cardinality of K. The following result forms the fundamental basis for the Fourier series representation of a function.

THEOREM 1.4.1 Let {ψk}_{k∈K} be an orthogonal basis for an inner product space H. Then any vector f in H admits a unique representation of the form f = ∑_{k∈K} akψk where {ak}_{k∈K} are scalars. In fact,

f = ∑_{k∈K} akψk    where    ak = (f, ψk)/‖ψk‖²    (k ∈ K). (4.2)

Moreover, we have Parseval's equality, that is,

‖f‖² = ∑_{k∈K} |ak|²‖ψk‖². (4.3)

Finally, if g = ∑_{k∈K} bkψk is another vector in H where {bk}_{k∈K} are scalars, then

(f, g) = ∑_{k∈K} ak\overline{bk}‖ψk‖². (4.4)

Proof. Since {ψk}_{k∈K} is an orthogonal basis for H, it follows that any vector f in H admits a decomposition of the form f = ∑_{k∈K} akψk. For any m in K, the identity (ψk, ψm) = δkm‖ψk‖² yields

(f, ψm) = (∑_{k∈K} akψk, ψm) = ∑_{k∈K} ak(ψk, ψm) = ∑_{k∈K} akδkm‖ψk‖² = am‖ψm‖².


Thus am is uniquely given by am = (f, ψm)/‖ψm‖² for all integers m in K. In other words, Equation (4.2) holds.

If g is any other vector in H, then g = ∑_{m∈K} bmψm where {bk}_{k∈K} are scalars. Using this we obtain

(f, g) = (∑_{k∈K} akψk, ∑_{m∈K} bmψm) = ∑_{k∈K} ∑_{m∈K} (akψk, bmψm)
       = ∑_{k∈K} ∑_{m∈K} ak\overline{bm}(ψk, ψm) = ∑_{k∈K} ∑_{m∈K} ak\overline{bm}δkm‖ψk‖² = ∑_{k∈K} ak\overline{bk}‖ψk‖².

Hence (4.4) holds. Setting g = f in (4.4) yields Parseval's formula (4.3). This completes the proof.
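Theorem 1.4.1 can be tried out on the orthogonal basis (4.1) for C^4; a NumPy sketch (the test vector f is an arbitrary illustrative choice):

```python
import numpy as np

# Expand a vector in the orthogonal basis (4.1) for C^4 using the coefficient
# formula a_k = (f, psi_k) / ||psi_k||^2 from Theorem 1.4.1, then check
# reconstruction and Parseval's equality (4.3).
basis = [np.array(v, dtype=complex) for v in
         ([1, 0, 0, 1], [0, 1, 1, 0], [2, 0, 0, -2], [0, 3, -3, 0])]
f = np.array([1 + 2j, -3, 0.5j, 4], dtype=complex)

a = [np.vdot(psi, f) / np.vdot(psi, psi) for psi in basis]   # (f, psi_k)/||psi_k||^2
recon = sum(ak * psi for ak, psi in zip(a, basis))
assert np.allclose(recon, f)                                  # f = sum a_k psi_k

parseval = sum(abs(ak) ** 2 * np.vdot(psi, psi).real for ak, psi in zip(a, basis))
assert abs(parseval - np.vdot(f, f).real) < 1e-12             # equation (4.3)
```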

The following result can be used to find an approximation of f by using a finite number of orthogonal vectors.

COROLLARY 1.4.2 Let {ψk}_{k∈K} be an orthogonal basis for an inner product space H. Let K1 and K2 be two disjoint subsets of K satisfying K1 ∪ K2 = K. Finally, let f = ∑_{k∈K} akψk be the series representation for a vector f in H. Then the distance between f and the vector p = ∑_{k∈K1} akψk is given by

‖f − ∑_{k∈K1} akψk‖ = (∑_{k∈K2} |ak|²‖ψk‖²)^{1/2} = (‖f‖² − ∑_{k∈K1} |ak|²‖ψk‖²)^{1/2}. (4.5)

Proof. Using f = ∑_{k∈K} akψk we have

‖f − ∑_{k∈K1} akψk‖² = ‖∑_{k∈K} akψk − ∑_{k∈K1} akψk‖² = ‖∑_{k∈K2} akψk‖²
  = (∑_{k∈K2} akψk, ∑_{m∈K2} amψm) = ∑_{k∈K2} ∑_{m∈K2} ak\overline{am}(ψk, ψm)
  = ∑_{k∈K2} ∑_{m∈K2} ak\overline{am}δkm‖ψk‖² = ∑_{k∈K2} |ak|²‖ψk‖².

By taking the square root, we obtain (4.5). This completes the proof.

We say that a set of vectors {ϕk}_{k∈K} is orthonormal if {ϕk}_{k∈K} is a set of unit vectors and ϕk is orthogonal to ϕm for all integers k ≠ m, that is, (ϕk, ϕm) = δkm where δkm is the Kronecker delta. Moreover, {ϕk}_{k∈K} is an orthonormal basis for an inner product space H if {ϕk}_{k∈K} is an orthonormal set of vectors and the (closed) linear span of {ϕk}_{k∈K} equals H. Finally, it is noted that if {ϕk}_{k∈K} is an orthonormal basis for H, then the dimension of H equals the cardinality of K.

For an example of an orthonormal basis for C^3, consider the set {ϕ1, ϕ2, ϕ3} where

ϕ1 = (1/√2) ⎡ 1 ⎤ ,   ϕ2 = (1/√2) ⎡  1 ⎤    and    ϕ3 = ⎡ 0 ⎤
            ⎢ 1 ⎥                 ⎢ −1 ⎥                ⎢ 0 ⎥
            ⎣ 0 ⎦                 ⎣  0 ⎦                ⎣ 1 ⎦ .


It is easy to verify that (ϕj, ϕk) = δjk. Since the dimension of C^3 is three, and {ϕ1, ϕ2, ϕ3} is an orthonormal set with three vectors, it follows that {ϕ1, ϕ2, ϕ3} is an orthonormal basis for C^3. By setting ϕk = ψk in Theorem 1.4.1 we readily obtain the following result.

THEOREM 1.4.3 Let {ϕk}_{k∈K} be an orthonormal basis for an inner product space H. Then any vector f in H admits a decomposition of the form

f = ∑_{k∈K} (f, ϕk)ϕk. (4.6)

Moreover, this decomposition is unique, that is, if the vector f = ∑_{k∈K} akϕk where {ak}_{k∈K} are scalars, then ak = (f, ϕk) for all k in K. In this setting Parseval's formula becomes

‖f‖² = ∑_{k∈K} |(f, ϕk)|². (4.7)

Finally, if g is any vector in H, then we have

(f, g) = ∑_{k∈K} (f, ϕk)\overline{(g, ϕk)}. (4.8)

The following approximation result is an immediate consequence of Corollary 1.4.2.

COROLLARY 1.4.4 Let {ϕk}_{k∈K} be an orthonormal basis for an inner product space H. Let K1 and K2 be two disjoint subsets of K satisfying K1 ∪ K2 = K. Let f = ∑_{k∈K} akϕk be the series representation for a function f in H. Then the distance between f and the vector p = ∑_{k∈K1} akϕk is given by

‖f − ∑_{k∈K1} akϕk‖ = (∑_{k∈K2} |ak|²)^{1/2} = (‖f‖² − ∑_{k∈K1} |ak|²)^{1/2}. (4.9)

1.4.1 Exercise

Problem 1. Let ϕk(t) = e^{−ikt} where k is any integer and t ∈ [0, 2π]. Then show that {ϕk}_{−∞}^{∞} is an orthonormal set of vectors in L2(0, 2π).

Problem 2. Let ψk(t) = cos(kt) where k ≥ 0 is a nonnegative integer. Then show that {ψk}_{0}^{∞} is an orthogonal set of vectors in L2(0, 2π). Compute ‖ψk‖ for all k.

Problem 3. Let φm(t) = sin(mt) where m > 0 is a strictly positive integer. Then show that {φm}_{1}^{∞} is an orthogonal set of vectors in L2(0, 2π). Compute ‖φm‖ for all m.

Problem 4. Let ψk(t) = cos(kt) where k ≥ 0 is a nonnegative integer, and let φm(t) = sin(mt) where m > 0 is a strictly positive integer. Then show that {ψk, φm}_{k=0, m=1}^{∞,∞} is an orthogonal set of vectors in L2(0, 2π).

Problem 5. Find a constant α such that 1 is orthogonal to 1 + αt in the L2(0, 1) space.

Problem 6. Show that cos(t) is orthogonal to sin(t) in the L2(0, 2π) space, and cos(t) is not orthogonal to sin(t) in the L2(0, 1) space.


Chapter 2

Fourier series

In this chapter we will study Fourier series. Then we will use Fourier series to solve the wave and beam equations.

2.1 Fourier series

In this section we will use Theorem 1.4.3 to develop Fourier series. Recall that L2(0, τ) is the set of all (Lebesgue measurable) functions f(t) defined on the interval [0, τ] satisfying

‖f‖² = (1/τ) ∫₀^τ |f(t)|² dt < ∞.

Throughout we assume that τ is finite. It is noted that the norm ‖f‖ is also called the root mean square of f. In other words, the root mean square of the function f on the interval [0, τ] is defined by

‖f‖ = ( (1/τ) ∫₀^τ |f(t)|² dt )^{1/2}.

Recall that our inner product on L2(0, τ) is given by

(f, g) = (1/τ) ∫₀^τ f(t) \overline{g(t)} dt    (f, g ∈ L2(0, τ)). (1.1)

Set ϕk(t) = e^{−2πikt/τ} for k = 0, ±1, ±2, ±3, · · · . Then {ϕk}_{−∞}^{∞} is an orthonormal set of vectors in L2(0, τ). To see this observe that for any integer m such that k ≠ m, we have

(ϕk, ϕm) = (1/τ) ∫₀^τ e^{−2πikt/τ} \overline{e^{−2πimt/τ}} dt = (1/τ) ∫₀^τ e^{2πi(m−k)t/τ} dt
         = e^{2πi(m−k)t/τ} / (2πi(m − k)) |_{t=0}^{τ} = (e^{2πi(m−k)} − 1) / (2πi(m − k)) = 0.

Hence (ϕk, ϕm) = 0 when k ≠ m. On the other hand, for any integer k, we have

‖ϕk‖² = (1/τ) ∫₀^τ |e^{−2πikt/τ}|² dt = (1/τ) ∫₀^τ 1 dt = 1.
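This orthonormality computation can be confirmed by numerical integration; a NumPy sketch (the value τ = 3 and the index range −3, · · · , 3 are illustrative choices, and the helper `ip` implements the inner product (1.1) by the trapezoid rule):

```python
import numpy as np

# Numerically verify that phi_k(t) = exp(-2*pi*i*k*t/tau) are orthonormal
# in L^2(0, tau) with inner product (f, g) = (1/tau) * integral f * conj(g) dt.
tau = 3.0
t = np.linspace(0, tau, 100001)
h = t[1] - t[0]

def ip(fv, gv):
    vals = fv * np.conj(gv)
    return (h * (vals.sum() - 0.5 * (vals[0] + vals[-1]))) / tau  # trapezoid rule

phi = {k: np.exp(-2j * np.pi * k * t / tau) for k in range(-3, 4)}
for k in phi:
    for m in phi:
        expected = 1.0 if k == m else 0.0
        assert abs(ip(phi[k], phi[m]) - expected) < 1e-6   # (phi_k, phi_m) = delta_km
```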


Thus (ϕk, ϕm) = δkm for all integers k and m. In other words, {ϕk}_{−∞}^{∞} is an orthonormal set of vectors in L2(0, τ).

Notice that 1/τ is the frequency of the sinusoid e^{−2πit/τ} = cos(2πt/τ) − i sin(2πt/τ), and τ is its period. For convenience we set

ω0 = 2π/τ. (1.2)

Then ω0 is the angular frequency for e^{−iω0t}. Obviously, 2π = ω0τ. Moreover, using ω0 the orthonormal functions {ϕk}_{−∞}^{∞} become

ϕk(t) = e^{−ikω0t}    (for all integers k). (1.3)

Finally, ω0 is called the fundamental angular frequency for the functions {e^{−ikω0t}}.

Using the Weierstrass approximation theorem along with some measure theoretic results one can show that {ϕk}_{−∞}^{∞} is an orthonormal basis for L2(0, τ). To be precise, any function f in L2(0, τ) admits a unique representation of the form f = ∑_{−∞}^{∞} akϕk. In fact, according to Theorem 1.4.3, we have

f = ∑_{k=−∞}^{∞} akϕk = ∑_{k=−∞}^{∞} ak e^{−ikω0t}

where the {ak}_{−∞}^{∞} are computed by

ak = (f, e^{−ikω0t}) = (1/τ) ∫₀^τ e^{ikω0t} f(t) dt.

The scalars {ak}_{−∞}^{∞} are the Fourier coefficients for f, and f = ∑_{−∞}^{∞} ak e^{−iω0kt} is called the Fourier series for f. In this setting Parseval's formula is given by

(1/τ) ∫₀^τ |f(t)|² dt = ‖f‖² = ∑_{k=−∞}^{∞} |ak|².

This also shows that f is a function in L2(0, τ) if and only if f = ∑_{−∞}^{∞} ak e^{−ikω0t} where ∑_{−∞}^{∞} |ak|² is finite. Summing up this analysis with Theorem 1.4.3 yields the following result.

THEOREM 2.1.1 Let f be any function in L2(0, τ) and set ω0 = 2π/τ. Then f admits a Fourier series representation of the form

f(t) = ∑_{k=−∞}^{∞} ak e^{−ikω0t}    where    ak = (1/τ) ∫₀^τ e^{ikω0t} f(t) dt. (1.4)

Moreover, in this setting Parseval's formula becomes

(1/τ) ∫₀^τ |f(t)|² dt = ∑_{k=−∞}^{∞} |ak|². (1.5)


Finally, if g = ∑_{k=−∞}^{∞} b_k e^{−ikω₀t} is the Fourier series representation for any function g in L²(0, τ), then

(1/τ) ∫₀^τ f(t) \overline{g(t)} dt = ∑_{k=−∞}^{∞} a_k \overline{b_k} . (1.6)

A function f(t) is periodic with period τ if f is a function on the real line satisfying f(t) = f(t + τ) for all −∞ < t < ∞. Recall that ω₀ = 2π/τ. So for any integer k the functions cos(kω₀t) and sin(kω₀t) are periodic functions with period τ. Furthermore, k/τ is the frequency and kω₀ is the angular frequency of both cos(kω₀t) and sin(kω₀t). Since e^{−ikω₀t} = cos(kω₀t) − i sin(kω₀t), it follows that e^{−ikω₀t} is a periodic function with period τ, frequency k/τ and angular frequency kω₀. Because a linear combination of periodic functions with period τ is periodic, ∑_{k=−∞}^{∞} a_k e^{−ikω₀t} is a periodic function with period τ. Moreover, if f is periodic with period τ and f is in L²(0, τ), then the Fourier series expansion for f in (1.4) holds for (almost) all t on the real line. If τ is not a period for f, then f(t) = ∑_{k=−∞}^{∞} a_k e^{−ikω₀t} still holds for (almost) all t in (0, τ), but there is no guarantee that this equality holds when t is not in (0, τ).

Theorem 2.1.1 shows that any function f in L²(0, τ) can be decomposed into an infinite sum f = ∑_{−∞}^{∞} a_k e^{−ikω₀t} consisting of sinusoids with angular frequencies {kω₀}. The complex number a_k is referred to as the amplitude at the frequency k/τ, or the amplitude of the angular frequency kω₀. Notice that all the frequencies k/τ are integer multiples of 1/τ, and all the angular frequencies kω₀ are integer multiples of ω₀. For this reason 1/τ is called the fundamental frequency, and ω₀ is called the fundamental angular frequency. In other words, Theorem 2.1.1 shows that any function f can be decomposed into an infinite linear combination of sinusoids whose frequencies are integer multiples of the fundamental frequency 1/τ.

As before, let f = ∑_{−∞}^{∞} a_k e^{−ikω₀t} be the Fourier series expansion for a function f in L²(0, τ). The graph of {|a_k|²}_{−∞}^{∞} vs {k}_{−∞}^{∞} is called the power spectrum for f. The graph of {|a_k|²}_{−∞}^{∞} vs the frequency {k/τ}_{−∞}^{∞}, or the graph of {|a_k|²}_{−∞}^{∞} vs the angular frequency {kω₀}_{−∞}^{∞}, is also referred to as the power spectrum for f. Parseval's theorem shows that ‖f‖² equals the "area under the curve" of the power spectrum, that is, ‖f‖² = ∑_{−∞}^{∞} |a_k|². The graph of {|a_k|}_{−∞}^{∞} vs {k}_{−∞}^{∞} is called the magnitude of the spectrum for f. The graph of {|a_k|}_{−∞}^{∞} vs the frequency {k/τ}_{−∞}^{∞}, or vs the angular frequency {kω₀}_{−∞}^{∞}, is also referred to as the magnitude of the spectrum for f. Finally, it is noted that one also graphs the spectrum by plotting {log |a_k|}_{−∞}^{∞} vs the frequency.

It is noted that Fourier analysis is the study of decomposing a signal into its corresponding sinusoids and amplitudes. On the other hand, Fourier synthesis is the process of reconstructing a signal from its sinusoids and amplitudes. An application of Corollary 1.4.4 readily yields the following result.

COROLLARY 2.1.2 Let K₁ and K₂ be two disjoint subsets of integers such that K₁ ∪ K₂ equals all the integers. Let f = ∑_{−∞}^{∞} a_k e^{−ikω₀t} be the Fourier series expansion for a function f in L²(0, τ) and let p be the function defined by p = ∑_{k∈K₁} a_k e^{−ikω₀t} where ω₀ = 2π/τ. Then the


distance between f and p is given by

‖f − p‖ = ( (1/τ) ∫₀^τ |f(t) − p(t)|² dt )^{1/2} = ( ∑_{k∈K₂} |a_k|² )^{1/2} (1.7)

‖f − p‖ = ( ‖f‖² − ‖p‖² )^{1/2} = ( (1/τ) ∫₀^τ |f(t)|² dt − ∑_{k∈K₁} |a_k|² )^{1/2} . (1.8)
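Equation (1.7) can be verified numerically for a trigonometric polynomial, where both sides are computable in closed form. The Python sketch below (our own example; the particular f and index set K₁ are illustrative assumptions, not from the notes) compares the L²(0, 2π) distance with the tail sum of the dropped coefficients.

```python
import cmath, math

tau = 2 * math.pi
N = 4096
dt = tau / N
ts = [j * dt for j in range(N)]

coeffs = {0: 5.0, 1: 2.0, 3: 1 + 1j}   # a_k, so f = sum a_k e^{-ik w0 t}, w0 = 1
K1 = {0, 1}                             # terms kept in the approximation p
f = [sum(c * cmath.exp(-1j * k * t) for k, c in coeffs.items()) for t in ts]
p = [sum(coeffs[k] * cmath.exp(-1j * k * t) for k in K1) for t in ts]

# left side of (1.7): the squared L^2(0, tau) distance
dist2 = sum(abs(x - y) ** 2 for x, y in zip(f, p)) * dt / tau
# right side of (1.7): the coefficients not included in p
tail2 = sum(abs(c) ** 2 for k, c in coeffs.items() if k not in K1)

assert abs(dist2 - tail2) < 1e-9   # both equal |1 + i|^2 = 2
print(dist2, tail2)
```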

We say that f is a real valued function if f(t) is real for all t.

COROLLARY 2.1.3 Let f(t) = ∑_{−∞}^{∞} a_k e^{−ikω₀t} be the Fourier series for a function f in L²(0, τ) where the fundamental frequency ω₀ = 2π/τ. Then f(t) is a real valued function if and only if a₀ is real and \overline{a_k} = a_{−k} for all integers k ≠ 0.

Proof. Notice that

\overline{f(t)} = \overline{( ∑_{k=−∞}^{∞} a_k e^{−ikω₀t} )} = ∑_{k=−∞}^{∞} \overline{a_k} e^{ikω₀t} = ∑_{k=−∞}^{∞} \overline{a_{−k}} e^{−ikω₀t}.

(The bar denotes the complex conjugate.) Clearly, f(t) is a real valued function if and only if f(t) = \overline{f(t)} for all t. Using f(t) = ∑_{−∞}^{∞} a_k e^{−ikω₀t}, we observe that f(t) is a real valued function if and only if

∑_{k=−∞}^{∞} a_k e^{−ikω₀t} = f(t) = \overline{f(t)} = ∑_{k=−∞}^{∞} \overline{a_{−k}} e^{−ikω₀t}.

By matching the Fourier coefficients of e^{−ikω₀t}, we see that f(t) is a real valued function if and only if a_k = \overline{a_{−k}} for all integers k. In particular, a₀ = \overline{a₀}, and thus a₀ is real. This completes the proof.
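The conjugate symmetry in Corollary 2.1.3 is easy to observe numerically. A Python sketch (the real valued test function below is our own choice) computes a few coefficients via (1.4) and checks a₀ ∈ ℝ and a_{−k} = \overline{a_k}.

```python
import cmath, math

tau = 2 * math.pi
N = 4096
dt = tau / N
ts = [j * dt for j in range(N)]
f = [3 + 2 * math.cos(t) + 4 * math.sin(2 * t) for t in ts]   # real valued

def a(k):
    # a_k = (1/tau) * int_0^tau e^{i k t} f(t) dt, equation (1.4) with w0 = 1
    return sum(cmath.exp(1j * k * t) * ft for t, ft in zip(ts, f)) * dt / tau

assert abs(a(0).imag) < 1e-9                      # a_0 is real (here a_0 = 3)
for k in range(1, 5):
    assert abs(a(-k) - a(k).conjugate()) < 1e-9   # a_{-k} = conj(a_k)
print("ok")
```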

2.1.1 Some convergence results for Fourier series

In this section, we will present several different convergence results for Fourier series. As before, let ∑_{−∞}^{∞} a_k e^{−ikω₀t} be the Fourier series representation for a function f(t) in L²(0, τ) where ω₀ = 2π/τ. Let p_n(t) be the partial Fourier series for f(t) defined by

p_n(t) = ∑_{k=−n}^{n} a_k e^{−ikω₀t}  where  a_k = (1/τ) ∫₀^τ f(t) e^{ikω₀t} dt. (1.9)

By definition the infinite sum ∑_{−∞}^{∞} a_k e^{−ikω₀t} = lim_{n→∞} p_n(t). According to Corollary 2.1.2, we have

‖f − p_n‖ = ( (1/τ) ∫₀^τ |f(t) − p_n(t)|² dt )^{1/2} = ( ∑_{|k|>n} |a_k|² )^{1/2} . (1.10)


Because f is in L²(0, τ), Parseval's equality shows that ‖f‖² = ∑_{−∞}^{∞} |a_k|² is finite. Hence

0 = lim_{n→∞} ∑_{|k|>n} |a_k|² = lim_{n→∞} ‖f − p_n‖² = lim_{n→∞} (1/τ) ∫₀^τ |f(t) − p_n(t)|² dt.

In other words, the sequence p_n(t) converges to f(t) in the L²(0, τ) norm, that is,

0 = lim_{n→∞} ‖f − p_n‖ = lim_{n→∞} ( (1/τ) ∫₀^τ |f(t) − p_n(t)|² dt )^{1/2} . (1.11)

In particular, this implies that a subsequence of p_n(t) converges to f(t) almost everywhere with respect to the Lebesgue measure.

To introduce another convergence result, recall that a function f(t) on [a, b] is of boundedvariation if the total distance a point travels along the y axis is finite as t moves from a to b.(Throughout it is assumed that b− a is finite.) To be precise, a function f(t) is of boundedvariation on [a, b] if there exists a bound γ <∞ such that

|f(t1)− f(a)|+ |f(t2)− f(t1)|+ · · ·+ |f(tn)− f(tn−1)|+ |f(b)− f(tn)| ≤ γ

for all a < t1 < t2 < · · · < tn < b and all n. For example, any polynomial p(t) defined on[a, b] is of bounded variation. A continuous function is not necessarily of bounded variation.The function t sin(1

t) is a classical example of a continuous function which is not of bounded

variation on [0, 1].The following version of Dirichlet’s Fourier convergence Theorem is sufficient for our

purposes, and is taken from Problem 4, Page 25 of Hoffman [20].

THEOREM 2.1.4 (Dirichlet convergence) Let f be a function in L2(0, τ) of boundedvariation, and pn(t) =

∑n−n ake

−ikω0t the partial Fourier series for f(t). Then

limn→∞

pn(t) =1

2limε→0+

(f(t+ ε) + f(t− ε)) . (1.12)

Finally, the partial Fourier series pn(t) converges to f(t) at the continuous points of f(t).

In particular, the Dirichlet convergence theorem states that if f in L2(0, τ) is a piecewisecontinuous function of bounded variation over the interval [0, τ ], then its Fourier series

∞∑k=−∞

ake−ikω0t = f(t) if f(t) is continuous at t

=1

2(f(t◦+) + f(t◦−)) if f(t) is discontinuous at t◦. (1.13)

DEFINITION 2.1.5 The Wiener algebra Wτ with period τ is the set of all functions f(t)which admit a Fourier series expansion of the form

f(t) =

∞∑k=−∞

ake−ikω0t and

∞∑k=−∞

|ak| <∞ (1.14)

where ω0 =2πτ

is the fundamental frequency.

Page 34: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

34 CHAPTER 2. FOURIER SERIES

If f is in the Wiener algebra, then∑∞

−∞ |ake−ikω0t| =∑∞

−∞ |ak| is finite. Thereforethe Fourier series

∑∞−∞ ake

−ikω0t converges to f(t) pointwise. Because the Fourier series isperiodic, we see that f(t) is also a τ periodic function satisfying f(0) = f(τ). (If f is onlydefined on the interval [0, τ ], then we simply let f also denote the unique τ periodic extensionof f .) In a moment we will obtain a stronger convergence result.

If∑∞

−∞ |ak| is finite, then the sum∑∞

−∞ |ak|2 also is finite. So if f is in the Wieneralgebra, then f is also in L2(0, τ). Moreover, if f is a function in the Wiener algebra, thenf(t) is continuous and f(0) = f(τ); see Theorem 2.1.6. Therefore the Wiener algebra Wτ

is strictly contained in L2(0, τ). Finally, it is emphasized that not all continuous functionsg(t) satisfying g(0) = g(τ) are in the Wiener algebra Wτ .

Let f(t) =∑∞

−∞ ake−ikω0t be the Fourier series expansion for a function f in L2(0, τ). If

ak is of order O(

1kr

)where r > 1, then

∑∞−∞ |ak| is finite and f is in the Wiener algebra

Wτ . In particular, if ak is of order O(

1k2

), then f is in the Wiener algebra; see for example

Problem 2 in Section 2.2.1.Recall that f = df

dtis the time derivative of f . A function f(t) is continuously differentiable

if f(t) exists and f(t) is a continuous function. If f and f are continuously differentiablefunctions in some neighborhood of [0, τ ] and f(0) = f(τ), then f is in the Wiener algebraWτ . For all k = 0 integration by parts yields

ak =1

τ

∫ τ

0

f(t)eikω0tdt =f(t)eikω0t

ikω0τ

∣∣∣∣τ0

−∫ τ

0

f(t)eikω0t

ikω0τdt

=f(t)eikω0t

k2ω20τ

∣∣∣∣∣τ

0

−∫ τ

0

f(t)eikω0t

k2ω20τ

dt

=f(τ)− f(0)

k2ω20τ

− 1

k2ω20τ

∫ τ

0

f(t)eikω0tdt.

So ak is of order O(

1k2

), and f is in the Wiener algebra Wτ ; see also [20]. Finally, for an

example, f(t) = t− t2 is in the Wiener algebra W1.We say that a sequence of continuous functions {fn}∞0 uniformly converge to f over the

interval [a, b] if0 = lim

n→∞max{|f(t)− fn(t)| : a ≤ t ≤ b}. (1.15)

If fn uniformly converges to f , then obviously fn(t) converges pointwise to f(t). Finally, itis emphasized that if a sequence of continuous functions {fn}∞0 converges uniformly to f ,then f is also a continuous function. Let us conclude our discussion on the Wiener algebrawith the following classical result; see [5, 20].

THEOREM 2.1.6 Let pn(t) be the n-th partial Fourier series for a function f(t) in theWiener algebra Wτ . Then pn converges uniformly to f . In particular, f(t) is a τ periodiccontinuous function satisfying f(0) = f(τ).

Sketch of proof. Let f(t) =∑∞

−∞ ake−ikω0t be the Fourier series expansion for f ∈ Wτ .

Thenmax{|f(t)− pn(t)|} = max{|

∑n<|k|

ake−ikω0t|} ≤

∑n<|k|

|ak| → 0.

Page 35: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

2.1. FOURIER SERIES 35

Therefore pn converges uniformly to f . Because the partial Fourier series are continuousτ periodic functions which converge to f uniformly, f must also be τ periodic continuousfunction satisfying f(0) = f(τ). This completes the proof.

Wiener algebras plays a fundamental role in solving interpolation problems. For anexcellent presentation see Gohberg-Goldberg-Kaashoek [14].

2.1.2 Fourier series in L2(−μ, μ)In many applications f(t) is a function in L2(−μ, μ), that is, f(t) is a Lebesgue measurablefunction on the interval [−μ, μ] where μ > 0 and

1

∫ μ

−μ|f(t)|2 dt <∞ .

In this case, the length 2μ of the interval [−μ, μ] equals the period, that is, τ = 2μ. Moreover,the fundamental frequency ω0 =

πμ. In this setting, f(t) admits a Fourier series representation

of the form

f(t) =

∞∑k=−∞

ake−ikω0t where ak =

1

∫ μ

−μeikω0tf(t)dt. (1.16)

Furthermore, Parseval’s formula becomes

1

∫ μ

−μ|f(t)|2dt =

∞∑k=−∞

|ak|2 . (1.17)

Finally, if g(t) =∑∞

−∞ bke−ikω0t is the Fourier series representation for any function g in

L2(−μ, μ), then1

∫ μ

−μf(t)g(t)dt =

∞∑k=−∞

akbk . (1.18)

One can directly prove the previous results. This also follows by extending f and g toperiodic functions with period τ = 2μ. Then using the fact that the integral of a periodicfunction over any period is the same, the previous results now follow from Theorem 2.1.1.

It is emphasized that the Dirichlet convergence Theorem 2.1.4 and corresponding Wienerconvergence Theorem 2.1.6 hold for functions in L2(−μ, μ).

If f(t) is a 2μ periodic function, then the k-th Fourier coefficient

ak =1

∫ μ

−μeikω0tf(t)dt =

1

∫ 2μ

0

eikω0tf(t)dt (when f has period 2μ). (1.19)

This also follows from the fact that the integral of a periodic function over any period is thesame. In particular, if f(t) is 2μ periodic, then

∑∞−∞ ake

−ikω0t is the Fourier series expansionfor f(t) in L2(−μ, μ) and f(t) in L2(0, 2μ). Finally, one has to be careful when f(t) is notperiodic. For example, f(t) = t2 is not periodic. So the Fourier series for f(t) in L2(−μ, μ)is different from its Fourier series expansion in L2(0, 2μ).

Page 36: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

36 CHAPTER 2. FOURIER SERIES

As before, let f(t) be a function in L2(−μ, μ). Then f(t− μ) is a function in L2(0, 2μ).Hence f(t− μ) has a Fourier series expansion of the form

∑∞−∞ dke

−ikω0t. By changing thelimits of integration, we obtain

dk =1

∫ 2μ

0

eikω0tf(t− μ)dt =(−1)k

∫ μ

−μeikω0tf(t)dt = (−1)kak. (1.20)

(Here we used eikω0μ = eikπ = (−1)k.) Thus dk = (−1)kak, or equivalently, ak = (−1)kdk forall integers k, where {dk} are the coefficients in the Fourier series expansion

∑∞−∞ dke

−ikω0t

for the shifted function f(t−μ) in L2(0, 2μ). For another perspective, observe that replacingt by t− μ in the Fourier series expansion

∑∞−∞ ake

−ikω0t for f(t) in L2(−μ, μ), we have

∞∑k=−∞

dke−ikω0t = f(t− μ) =

∞∑k=−∞

ake−ikω0(t−μ)

=∞∑

k=−∞ake

ikω0μe−ikω0t =∞∑

k=−∞(−1)kake

−ikω0t.

By matching like coefficients of e−ikω0t in the corresponding Fourier series, we arrive atdk = (−1)kak, or equivalently, ak = (−1)kdk. Because |ak| = |dk| for all integers k, thefunctions f(t) in L2(−μ, μ) and f(t− μ) in L2(0, 2μ) have the same power spectrum.

2.2 A square wave

Let us compute the Fourier series for the following square wave function f in L2(0, 2π)defined by

f(t) = 1 if 0 ≤ t < π

f(t) = −1 if π ≤ t < 2π. (2.1)

(Setting f(0) = 1 and f(π) = −1 makes the function f(t) right continuous. Because {0, π} isa set of Lebesgue measure zero, this is not necessary and we could have just as easily let f(t)be unspecified at these points.) Here we assume that τ = 2π. In this case, the fundamentalangular frequency ω0 = 1. Thus ϕk = e−ikt for all integers k. Notice that for k = 0, we have

a0 =1

∫ 2π

0

f(t)dt =1

∫ π

0

dt− 1

∫ 2π

π

dt = 0 .

Thus a0 = 0. If k = 0, then (1.4) gives

ak =1

∫ 2π

0

eiktf(t)dt =1

∫ π

0

eiktdt− 1

∫ 2π

π

eiktdt =eikt

2πik

∣∣∣∣π0

− eikt

2πik

∣∣∣∣2ππ

=2eikπ − 1− eik2π

2πik=eikπ − 1

iπk=

cos(kπ)− 1 + i sin(kπ)

iπk.

Page 37: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

2.2. A SQUARE WAVE 37

This implies that ak = 0 if k is even and ak = −2/πik when k is odd. Hence

f =∑odd k

−2e−ikt

πik=

∑odd k>0

−2e−ikt + 2eikt

πik=

∑odd k>0

4 sin(kt)

πk.

This readily implies that

f(t) =4

π

∞∑k=0

sin((2k + 1)t)

(2k + 1)(2.2)

In this setting Parseval’s equality shows that

1 =1

∫ 2π

0

|f(t)|2dt =∞∑−∞

|ak|2 =∑odd k

4

π2k2=

8

π2

∞∑k=0

1

(2k + 1)2.

This yields the following formula to compute π2, that is,

π2

8=

∞∑k=0

1

(2k + 1)2(2.3)

In fact, one can use Matlab to check this formula.Following some of the ideas on page 153 of [26], we have

π2

8=

∞∑k=0

1

(2k + 1)2=

∞∑k=1

1

k2−

∑even k≥2

1

k2=

∞∑k=1

1

k2−

∞∑k=1

1

(2k)2

=∞∑k=1

1

k2− 1

4

∞∑k=1

1

k2=

3

4

∞∑k=1

1

k2.

By rearranging terms, we arrive at the following famous formula due to Euler:

π2

6=

∞∑k=1

1

k2(2.4)

Finally, it is noted that Euler proved this result without using Fourier series; see page 153in [26] and the Basel problem in Wikipedia.

Clearly, f(t) is of bounded variation over the interval [0, 2π]. The Dirichlet ConvergenceTheorem 1.12 shows that the Fourier series equals f(t) at the points where f(t) is continuous.In particular, the Fourier series converges at t = π

2, that is,

1 = f(π2

)=

[4

π

∞∑k=0

sin((2k + 1)t)

(2k + 1)

]t=π

2

=4

π

[1− 1

3+

1

5− 1

7+ · · ·

].

Page 38: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

38 CHAPTER 2. FOURIER SERIES

This yields the following classical formula to compute π, that is,

π

4=

∞∑k=0

(−1)k

2k + 1(2.5)

One can use Matlab to check this formula. Finally, because the square wave function f(t) isdiscontinuous, f is not in the Wiener algebra W2π.

Notice that the function f(t) is discontinuous at π. According to equation (1.13) in theDirichlet Convergence Theorem 1.12, we have

0 =1

2(f(π+) + f(π−)) =

1

2(−1 + 1) =

[4

π

∞∑k=0

sin((2k + 1)t)

(2k + 1)

]t=π

= 0.

In other words, at the discontinuity point π, the Fourier series converges to the average ofthe function f(t) in a small neighborhood around π; see (1.12). It is emphasized that theFourier series has period 2π.

Let f(t) also denote the unique 2π periodic extension of f(t), that is, we extend f(t)outside the interval [0, 2π] such that f(t) = f(t+2π) for all t, and also denote this extendedfunction by f(t); see the upper graph in Figure 2.1. In fact,

f(t) = 1 if jπ ≤ t < (j + 1)π and j is even

= −1 if jπ ≤ t < (j + 1)π and j is odd. (2.6)

Because the Fourier series∑∞

0sin((2k+1)t)

(2k+1)for f(t) has period 2π, this Fourier series will

converge to the continuous points of the unique 2π periodic extension f(t). Notice that f(t)is discontinuous at πj where j is an integer. By the Dirichlet Convergence Theorem 1.12,the Fourier series converges to 0 at πj which equals the average 1

2(f(πj+) + f(πj−)).

To provide a demonstration of how our Fourier series converges, consider the Fourierseries approximation

pn(t) =4

π

n∑odd k≥1

sin(t)

k(2.7)

of f(t). A graph of the square wave f(t) is given in the upper plot of Figure 2.1. A graphof p25(t) is presented in middle plot of Figure 2.1. The rapid oscillation at the discontinuouspoints of f(t) is known as the Gibbs phenomenon. The bottom plot in Figure 2.1 is a graphof pn(t) for n = 1, 2, 3, · · · , 25. (Notice that pn(t) = pn+1(t).) This plot shows how thepartial Fourier series pn(t) converge to the square wave f(t).

The fact that the approximation pn(t) for f(t) is jumping around or oscillating at thediscontinuities of f(t) is known as the Gibbs phenomenon. The Gibbs effect occurs becauseone is trying to approximate a discontinuous function f(t) by a continuous periodic functionpn(t). If f(t) is a piecewise continuous function in L2(0, τ) and f(0) = f(τ), then the τperiodic extension of f(t) is not continuous and the Gibbs phenomenon will occur at 0, τand the discontinuous points of f(t); see Section 4.4 in [26] for further details.

Page 39: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

2.2. A SQUARE WAVE 39

0 5 10 15 20

t

-2

0

2The 2π periodic square wave f(t)

0 5 10 15 20

t

-2

0

2The graph for f(t) and p

25(t)

0 5 10 15 20

t

-2

0

2The graph for f(t) and p

n(t) for n=1,2, ... , 25

Figure 2.1: The square wave and pn(t) for n = 1, 2, · · · , 25

The Matlab commands we used to plot Figure 2.1 are given by

t=linspace(0,6.5π,120000); f=square(t);

subplot(3,1,1); plot(t,f); grid; xlabel(’t’); axis([0,22,-2,2]);

title(’The 2π periodic square wave f(t)’);

p=zeros(size(t)); for k=1:2:25; p = p+ 4 ∗ sin(k ∗ t)/(π ∗ k); end;subplot(3,1,2); plot(t,f); grid; hold on; plot(t,p,’r’);

axis([0,22,−2,2]); xlabel(’t’);

title(’The graph for f(t) and p25(t)’);

subplot(3,1,3); plot(t,f); grid for j=1:2:25; p=zeros(size(t));

for k=1:2:j; p = p+ 4 ∗ sin(k ∗ t)/(π ∗ k); end;hold on; plot(t,p); end; xlabel(’t’);

title(’The graph for f(t) and pn(t) for n = 1, 2, ..., 25’);

Page 40: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

40 CHAPTER 2. FOURIER SERIES

0 5 10 15 20

t

-2

-1

0

1

2The 2π periodic square wave f(t)

0 5 10 15 20

t

-2

-1

0

1

2The graph for f(t) and p

2001(t)

Figure 2.2: The square wave f(t) and p2001(t)

Now let us see what happens when n is large. The upper graph in Figure 2.2 plots thesquare wave f(t). The bottom graph in Figure 2.2 plots the Fourier series approximation

p2001(t) =4

π

2001∑odd k≥1

sin(t)

k

of f(t). This is a ”large” n = 2001 Fourier series approximation for the square wave f(t).Notice that the Fourier series approximation pn(t) matches f(t) everywhere except at thediscontinuities πj for f(t) where j is an integer. It is emphasized that the overshoot in thepartial Fourier series pn(t) occurring at the discontinuities {πj}∞−∞ for f(t), is due to theGibbs phenomenon. The Gibbs effect occurs because the partial Fourier series pn(t) is acontinuous periodic function which is trying to approximate a discontinuous function f(t).The Gibbs phenomenon is present even when n = 2001 is ”large”; see the overshoot at thediscontinuities for f(t) in the lower graph of Figure 2.2. In general, the overshoot for aFourier series is around 8.9% of the total distance of the corresponding discontinuity; seeSection 4.4 in [26] for further details. Finally, the Gibbs phenomenon does not occur whenf(t) is a continuous periodic function. The Matlab commands we used to generate the plots

Page 41: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

2.2. A SQUARE WAVE 41

in Figure 2.2 are given by

t = linspace(0, 6.5π, 120000); f = square(t);

subplot(2, 1, 1); plot(t, f); grid;

axis([0, 22,−2, 2]); xlabel(′t′);

title(’The 2π periodic square wave f(t)’); p = zeros(size(t));

for k = 1 : 2 : 2001; p = p+ 4 ∗ sin(k ∗ t)/(π ∗ k); end;

subplot(2,1,2); plot(t,f); grid; hold on; plot(t,p,’r’)

axis([0,22,−2,2]); xlabel(’t’);

title(’The graph for f(t) and p2001(t)’);

The power spectrum for f is given in Figure 2.3. The Matlab commands used to generatethis power spectrum are given by

a=zeros(1,71); for k = 1 : 2 : 70; a(k + 1) = 2i/(π ∗ k); end

% Matlab does not have a zero index. So a(1) = a0, a(2) = a1 etc.

bar((−70:70), abs([a(71 : −1 : 2), a(1 : 71)]). ∧ 2); grid

xlabel(’Frequency k’); title(’The power spectrum of f’)

Because f(t) is a real valued function, the power spectrum of f is symmetric about the yaxis. Moreover, a significant part of the spectrum for f is contained in the Fourier coefficients{ak}51−51. In other words, f(t) is a ”low frequency signal”. It is noted that ‖f‖ = 1. Moreover,an application of Corollary 2.1.2 shows that

‖f − p25‖ =

(8

π2

∑odd k≥27

1

k2

) 12

≈ 0.1248;

‖f − p101‖ =

(8

π2

∑odd k≥103

1

k2

) 12

≈ 0.0630;

‖f − p2001‖ =

(8

π2

∑odd k≥2003

1

k2

) 12

≈ 0.0142.

Finally, it is noted that one could also use (1.8) to calculate the error.

Page 42: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

42 CHAPTER 2. FOURIER SERIES

-80 -60 -40 -20 0 20 40 60 80

Frequency k

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45The power spectrum for f

Figure 2.3: The power spectrum for f(t)

2.2.1 Exercise

Problem 1. In Matlab plot the approximation p100(t) in (2.7) for the function f(t) definedin (2.1). Plot the spectrum of |ak| vs k.Problem 2. Compute the Fourier series for the function f in L2(0, 2π) given by

f(t) = t if 0 ≤ t ≤ π

= 2π − t if π ≤ t ≤ 2π .

• Choose a partial Fourier series approximation pn(t) for f(t). Then plot pn(t) and f(t)on the same graph. Compute the error

‖f − pn‖ =

√1

∫ 2π

0

|f(t)− pn(t)|2dt

in Matlab.

• Plot the power spectrum for f . Compute the root mean square of f , that is, compute

the norm ‖f‖ =√

12π

∫ 2π

0|f(t)|2dt.

• By evaluating f(t) and its corresponding Fourier series at π show that

π2

8=

∞∑k=0

1

(2k + 1)2.

Page 43: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

2.2. A SQUARE WAVE 43

• By using Parseval’s equality show that

π4

96=

∞∑k=0

1

(2k + 1)4.

• Show that f(t) is in the Wiener algebra W2π with period 2π. Hence, by Theorem 2.1.6,the Fourier series converges uniformly to f(t).

Answer: The Fourier series

f(t) =π

2− 4

π

∞∑k=0

cos((2k + 1)t)

(2k + 1)2(for 0 ≤ t ≤ 2π).

Problem 3. Compute the Fourier series for the function f in L2(0, 2π) given by

f(t) = sin(t) if 0 ≤ t ≤ π

= 0 if π ≤ t ≤ 2π .

Show that f(t) is in the Wiener algebra W2π. In particular, the Fourier series convergesuniformly to f(t); see Theorem 2.1.6. Choose a partial Fourier series approximation pn(t)for f(t). Then plot pn(t) and f(t) on the same graph. Compute the error ‖f − pn‖. Plotthe power spectrum for f . Compute the root mean square ‖f‖ of f .

Problem 4. Compute the Fourier series for the function f in L2(0, 2π) given by

f(t) = t if 0 ≤ t < π

= t− π if π ≤ t < 2π .

Choose a partial Fourier series approximation pn(t) for f(t). Then plot pn(t) and f(t) on thesame graph. Compute the error ‖f −pn‖. Plot the power spectrum for f . Compute the rootmean square ‖f‖ of f . Does this Fourier series converge for t = jπ where j is an integer,and if so what does it converge to; see the Dirichlet convergence Theorem 2.1.4. Finally, isf(t) in the Wiener algebra W2π?

Problem 5. Compute the Fourier series for the function f in L2(0, 1) given by

f(t) = 10 cos(8πt)− 2 sin(2πt) .

Plot the power spectrum for f . Compute the root mean square of f , that is, compute

‖f‖L2(0,1) =√∫ 1

0|f(t)|2dt.

Problem 6. Suppose that g(t) admits a Fourier series expansion of the form

g(t) = 2 + 3ei2t − ie−i4t

Page 44: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

44 CHAPTER 2. FOURIER SERIES

Then find the real part of g(t) and compute

1

∫ 2π

0

|g(t)|2 dt .

Problem 7. Suppose that g(t) admits a Fourier series expansion of the form

g(t) = 10 + 8ei2t + 8e−i2t + (1 + i)ei7t + (1− i)e−i7t .

Assume that a Fourier series approximation p for g is given by

p(t) = 10 + 8ei2t + 8e−i2t .

Then compute the square error in this approximation, that is, find

‖g − p‖2 = 1

∫ 2π

0

|g(t)− p(t)|2 dt .

Problem 8. Suppose that g(t) is a real valued function and g admits a Fourier seriesexpansion of the form

g(t) =∞∑

k=−∞ake

−ikt

If a10 = 3 + 4i, then find a−10.

Problem 9. Find the Fourier series for

f(t) = 2 + 8 sin(t)3

over the interval [0, 2π] of the form:

f(t) =∞∑

k=−∞ake

−ikt.

Plot the power spectrum for f .

Problem 10. Find the Fourier series for

f(t) = 4 cos(100t) cos(2t)

over the interval [0, 2π] of the form:

f(t) =∞∑

k=−∞ake

−ikt.

Plot the power spectrum for f .

Problem 11 Consider the function f(t) = πeit2 in L2(0, 2π). It is noted that πe

it2 for

0 ≤ t ≤ 2π forms a semi-circle of radius π in the upper half of the complex plane.

Page 45: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

2.2. A SQUARE WAVE 45

(i) Find the Fourier series for f(t) of the form

f(t) =∞∑

k=−∞ake

−ikt.

(ii) The Dirichlet convergence Theorem 2.1.4 shows that this Fourier series converges att = π. By evaluating f(t) and its corresponding Fourier series at π show that

π

2=

∞∑k=0

(−1)k

k + 12

.

(iii) Use Parseval’s theorem to show that

π2

2=

∞∑k=0

1

(k + 12)2.

(iv) The function f(t) = πeit2 is continuous on the open interval (0, 2π). However, πe

it2

does not have period 2π. Therefore its corresponding Fourier series has the Gibbsphenomenon at t = 2πj where j is an integer. Use equation (1.12) in the Dirichletconvergence Theorem 2.1.4 to verify that

0 =1

2(f(2π) + f(0)) = lim

n→∞pn(t) = lim

n→∞

n∑k=−n

ak.

(v) Is f(t) in the Wiener algebra W2π?

Problem 12. Consider the function f(t) = πeit2 in L2(−π, π); see Section 2.1.2. It is noted

that πeit2 for −π ≤ t ≤ π forms a semi-circle of radius π in the right half of the complex

plane.

(i) Find the Fourier series for f(t) of the form

f(t) =

∞∑k=−∞

ake−ikt.

Is this the same Fourier series one obtains in Problem 11? Explain why or why not.

(ii) The Dirichlet convergence Theorem 2.1.4 shows that this Fourier series converges att = 0. By evaluating f(t) and its corresponding Fourier series at 0 show that

π

2=

∞∑k=0

(−1)k

k + 12

.

Page 46: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

46 CHAPTER 2. FOURIER SERIES

(iii) Use Parseval’s theorem to show that

π2

2=

∞∑k=0

1

(k + 12)2.

(iv) The function f(t) = πeit2 is continuous on the open interval (−π, π). However, πe

it2

does not have period 2π. Therefore its corresponding Fourier series has the Gibbsphenomenon at t = πj where j is an integer. Use equation (1.12) in the Dirichletconvergence Theorem 2.1.4 to verify that

0 =1

2(f(π) + f(−π)) = lim

n→∞pn(t) = lim

n→∞

n∑k=−n

e−ikπak.

Problem 13. Consider the function f in L2(0, 2π) given by

f(t) = 8 cos(t)2 sin(3t)− 4 cos(t)2.

(i) Express f(t) as a Fourier series of the form f(t) =∑∞

−∞ ake−ikt.

(ii) Plot the power spectrum for f .

(iii) Find the total power, that is,1

∫ 2π

0

|f(t)|2dt.

Problem 14. Let f(t) be a τ periodic function. Then show that the integral of f(t) overany interval of length τ is the same, that is, show that

∫ x+τx

f(t)dt = γ the same constant γ

for all x. Hint: Take the derivative of∫ x+τx

f(t)dt.

2.3 The Fourier series consisting of cosine and sines

The following result develops a Fourier series by only using sine and cosine functions.

THEOREM 2.3.1 Let f be a function in L2(0, τ) and set ω0 = 2π/τ . Then f admits aFourier series representation of the form

f(t) = a0 +

∞∑k=1

(αk cos(kω0t) + βk sin(kω0t)) . (3.1)

The coefficients a0, αk and βk for k = 1, 2, 3, · · · are computed by

a0 =1

τ

∫ τ

0

f(t)dt

αk =2

τ

∫ τ

0

f(t) cos(kω0t)dt (3.2)

βk =2

τ

∫ τ

0

f(t) sin(kω0t)dt .

Page 47: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

2.3. THE FOURIER SERIES CONSISTING OF COSINE AND SINES 47

Finally, in this setting Parseval’s equality is given by

1

τ

∫ τ

0

|f(t)|2dt = |a0|2 + 1

2

∞∑k=1

(|αk|2 + |βk|2). (3.3)

Proof. According to Theorem 2.1.1, the function f admits a Fourier series expansion ofthe form f =

∑∞−∞ ake

−ikω0t. We claim that

ake−ikω0t + a−keikω0t = αk cos(kω0t) + βk sin(kω0t) (3.4)

where the coefficients αk and βk are given by

αk = ak + a−k and βk = i(a−k − ak) (k = 1, 2, 3, · · · ) . (3.5)

This follows from

ake−ikω0t + a−keikω0t = ak cos(kω0t)− iak sin(kω0t) + a−k cos(kω0t) + ia−k sin(kω0t)

= (ak + a−k) cos(kω0t) + i(a−k − ak) sin(kω0t)

= αk cos(kω0t) + βk sin(kω0t) .

Therefore (3.4) holds.By consulting (3.4) we arrive at

f(t) =∞∑

k=−∞ake

−ikω0t = a0 +∞∑k=1

ake−ikω0t +

∞∑k=1

a−keikω0t

= a0 +

∞∑k=1

(ake

−ikω0t + a−keikω0t)

(3.6)

= a0 +

∞∑k=1

(αk cos(kω0t) + βk sin(kω0t)) .

Thus f admits a Fourier series expansion of the form (3.1).

Equation (3.5) can also be used to derive the expressions for αk and βk in (3.2). To see this, notice that (1.4) yields

αk = ak + a−k = (1/τ) ∫_0^τ f(t) (e^{ikω0t} + e^{−ikω0t}) dt = (2/τ) ∫_0^τ f(t) cos(kω0t) dt

βk = (ak − a−k)/i = (1/(iτ)) ∫_0^τ f(t) (e^{ikω0t} − e^{−ikω0t}) dt = (2/τ) ∫_0^τ f(t) sin(kω0t) dt .

Therefore (3.2) holds.

To complete the proof it remains to establish Parseval’s equality in (3.3). Rewriting equation (3.5) in matrix form yields

[αk; βk] = [1, 1; −i, i] [ak; a−k] (k = 1, 2, 3, · · · ) . (3.7)


Recall that if A and B are two matrices of the appropriate size, then (AB)* = B*A*. Using this fact in (3.7), we obtain

|αk|^2 + |βk|^2 = [αk; βk]* [αk; βk] = ([1, 1; −i, i] [ak; a−k])* [1, 1; −i, i] [ak; a−k]
= [conj(ak), conj(a−k)] [1, i; 1, −i] [1, 1; −i, i] [ak; a−k]
= [conj(ak), conj(a−k)] [2, 0; 0, 2] [ak; a−k] = 2 (|ak|^2 + |a−k|^2) .

This readily implies that

|ak|^2 + |a−k|^2 = (|αk|^2 + |βk|^2)/2 (k = 1, 2, 3, · · · ) . (3.8)

An application of Parseval’s equality in (1.5) shows that

(1/τ) ∫_0^τ |f(t)|^2 dt = ∑_{k=−∞}^∞ |ak|^2 = |a0|^2 + ∑_{k=1}^∞ (|ak|^2 + |a−k|^2) = |a0|^2 + (1/2) ∑_{k=1}^∞ (|αk|^2 + |βk|^2) .

This yields (3.3) and completes the proof.

REMARK 2.3.2 As before, let f = ∑_{−∞}^∞ ak e^{−ikω0t} be a function in L2(0, τ). Assume that f is also given by the Fourier series expansion in (3.1). Notice that αk and βk are given by (3.5). By taking the inverse of the 2 × 2 matrix in (3.7), we arrive at

[ak; a−k] = (1/2) [1, i; 1, −i] [αk; βk] (k = 1, 2, 3, · · · ) . (3.9)

In other words, this and (3.5) imply that

ak = (αk + iβk)/2 and a−k = (αk − iβk)/2 (k = 1, 2, 3, · · · )
αk = ak + a−k and βk = i(a−k − ak) (k = 1, 2, 3, · · · ) . (3.10)

Therefore the pairs {αk, βk} and {ak, a−k} uniquely determine each other for all integers k ≥ 1.

Recall that if f is a real valued function, then a−k = conj(ak) for all integers k. In this case, αk = 2 Re ak and βk = 2 Im ak for all integers k ≥ 1. To see this simply notice that (3.5) yields

αk = ak + a−k = ak + conj(ak) = 2 Re ak
βk = i(a−k − ak) = (ak − a−k)/i = (ak − conj(ak))/i = 2 Im ak .

Therefore for k ≥ 1, we have

αk = 2 Re ak and βk = 2 Im ak when f is a real valued function. (3.11)


For an example consider the function f in L2(0, 2) given by

f = 4 + 2 cos(2πt) + 4 sin(2πt) − 4 cos(4πt) + 2 sin(4πt) + 6 cos(5πt) − 8 cos(6πt) . (3.12)

Recall that the fundamental angular frequency ω0 = 2π/τ. So if τ = 2, then ω0 = π. In this case, f admits a Fourier series expansion of the form

f(t) = a0 + ∑_{k=1}^∞ αk cos(πkt) + βk sin(πkt) .

For the function f in (3.12), we see that a0 = 4,

α2 = 2, β2 = 4, α4 = −4, β4 = 2, α5 = 6, and α6 = −8 , (3.13)

while αk and βk are zero for all the other integers k. In particular, equation (3.3) with τ = 2 shows that

(1/2) ∫_0^2 |f(t)|^2 dt = 16 + (4 + 16 + 16 + 4 + 36 + 64)/2 = 86 .

In other words, the L2(0, 2) norm of the function f in (3.12) is given by ‖f‖ = √86.

Let us express the function f in (3.12) as a Fourier series of the form f = ∑_{−∞}^∞ ak e^{−ikω0t} where ω0 = π. By consulting equation (3.10) in Remark 2.3.2, we see that ak = (αk + iβk)/2 for all integers k ≥ 1. Using this along with (3.13) yields a0 = 4,

a2 = 1 + 2i, a4 = −2 + i, a5 = 3, a6 = −4
a−2 = 1 − 2i, a−4 = −2 − i, a−5 = 3, a−6 = −4
ak = 0 otherwise .

Here we used the fact that a−k = conj(ak) for all integers k when the function f is real valued. Therefore the power spectrum of the function f in (3.12) has the values

|a0|^2 = 16, |a±2|^2 = 5, |a±4|^2 = 5, |a±5|^2 = 9, |a±6|^2 = 16

for k respectively equal to 0, ±2, ±4, ±5, ±6 and zero otherwise. For a simple exercise plot the power spectrum for f.
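The norm ‖f‖ = √86 and the coefficients ak above can be confirmed numerically; a Python sketch (the grid size is an arbitrary choice):

```python
import numpy as np

tau = 2.0
N = 200000
t = np.arange(N) * (tau / N)
f = (4 + 2 * np.cos(2 * np.pi * t) + 4 * np.sin(2 * np.pi * t)
     - 4 * np.cos(4 * np.pi * t) + 2 * np.sin(4 * np.pi * t)
     + 6 * np.cos(5 * np.pi * t) - 8 * np.cos(6 * np.pi * t))

# squared L2(0,2) norm: (1/tau) * integral of |f|^2
norm_sq = (f ** 2).sum() * (tau / N) / tau

# a_k = (1/tau) * integral of f(t) e^{i k w0 t} with w0 = pi
def a(k):
    return (f * np.exp(1j * k * np.pi * t)).sum() * (tau / N) / tau

print(norm_sq, a(2), a(4), a(6))
```

The printed values agree with ‖f‖^2 = 86 and the coefficient table above.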

COROLLARY 2.3.3 Let M1 and M2 be two disjoint subsets of the strictly positive integers such that M1 ∪ M2 equals the set of all strictly positive integers. Assume that (3.1) is the Fourier series expansion for a function f in L2(0, τ), set ω0 = 2π/τ, and let p be the function defined by

p(t) = a0 + ∑_{k∈M1} (αk cos(kω0t) + βk sin(kω0t)) . (3.14)

Then the distance from f to p in the L2(0, τ) norm is given by

‖f − p‖ = ((1/τ) ∫_0^τ |f(t) − p(t)|^2 dt)^{1/2} = (1/√2) (∑_{k∈M2} (|αk|^2 + |βk|^2))^{1/2} (3.15)

‖f − p‖ = ((1/τ) ∫_0^τ |f(t)|^2 dt − |a0|^2 − (1/2) ∑_{k∈M1} (|αk|^2 + |βk|^2))^{1/2} . (3.16)


Proof. According to (3.4) the function p in (3.14) is also given by

p(t) = a0 + ∑_{k∈M1} (ak e^{−ikω0t} + a−k e^{ikω0t}) . (3.17)

By setting K2 = −M2 ∪ M2 in equation (1.7) in Corollary 2.1.2 along with (3.8), we obtain

‖f − p‖^2 = ∑_{k∈M2} (|ak|^2 + |a−k|^2) = (1/2) ∑_{k∈M2} (|αk|^2 + |βk|^2) .

This yields (3.15) and completes the proof.

A proof of Theorem 2.3.1 based on orthogonal functions. Now let us present another proof of Theorem 2.3.1 based on Theorem 1.4.1 in Chapter 1. The reader who is interested in an example can proceed to Section 2.3.1. Consider the set of functions {ψk, ξk}_1^∞ defined by

ψk = cos(kω0t) and ξk = sin(kω0t) (k = 1, 2, 3, · · · ) . (3.18)

The angular frequency ω0 = 2π/τ. We claim that {1, ψk, ξm}_{k=1,m=1}^{∞,∞} is an orthogonal set of functions in L2(0, τ). To see this simply observe that using 2π = τω0 yields

(1, ψk) = (1/τ) ∫_0^τ cos(kω0t) dt = 0 and (1, ξk) = (1/τ) ∫_0^τ sin(kω0t) dt = 0

for all integers k ≥ 1. So 1 is orthogonal to both ψk and ξm for all integers k ≥ 1 and m ≥ 1. If k and m are two different strictly positive integers, then

(ψk, ψm) = (1/τ) ∫_0^τ cos(kω0t) cos(mω0t) dt = 0 (k ≠ m)

(ξk, ξm) = (1/τ) ∫_0^τ sin(kω0t) sin(mω0t) dt = 0 (k ≠ m) .

Notice that for any integers k and m, we have

(ψk, ξm) = (1/τ) ∫_0^τ cos(kω0t) sin(mω0t) dt = 0 .
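These orthogonality relations, and the norms computed below, can be verified numerically; a Python sketch with an arbitrary period τ = 3:

```python
import numpy as np

tau = 3.0
w0 = 2 * np.pi / tau
N = 100000
t = np.arange(N) * (tau / N)

def inner(g, h):
    # (g, h) = (1/tau) * integral over [0, tau]
    return (g * h).sum() * (tau / N) / tau

psi = lambda k: np.cos(k * w0 * t)
xi = lambda k: np.sin(k * w0 * t)

checks = [inner(np.ones(N), psi(3)),   # (1, psi_3)        ~ 0
          inner(np.ones(N), xi(2)),    # (1, xi_2)         ~ 0
          inner(psi(2), psi(5)),       # (psi_k, psi_m), k != m  ~ 0
          inner(xi(1), xi(4)),         # (xi_k, xi_m), k != m    ~ 0
          inner(psi(4), xi(4))]        # (psi_k, xi_m), any k, m ~ 0
norms = [inner(psi(7), psi(7)), inner(xi(7), xi(7))]   # both ~ 1/2
print(checks, norms)
```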

Thus ψk is orthogonal to ξm for all integers k ≥ 1 and m ≥ 1. Therefore {1, ψk, ξm}_{k=1,m=1}^{∞,∞} is an orthogonal set of functions in L2(0, τ).

A simple calculation shows that ‖1‖ = 1 in the L2(0, τ) norm. Furthermore,

‖ψk‖^2 = (1/τ) ∫_0^τ |cos(kω0t)|^2 dt = 1/2 and ‖ξk‖^2 = (1/τ) ∫_0^τ |sin(kω0t)|^2 dt = 1/2 .

In other words, ‖ψk‖^2 = 1/2 and ‖ξk‖^2 = 1/2 for all integers k ≥ 1. By employing the Weierstrass approximation theorem along with some measure theoretic results one can show


that {1, ψk, ξm}_{k=1,m=1}^{∞,∞} is an orthogonal basis for L2(0, τ). Using this orthogonal basis in Theorem 1.4.1 in Chapter 1 yields another proof of Theorem 2.3.1.

It is noted that an application of Corollary 1.4.2 in Chapter 1 readily yields another proof of Corollary 2.3.3.

To complete this section, let us emphasize that the Dirichlet convergence theorem also holds when the Fourier series is expressed in its sine and cosine form. This follows from the fact that its partial Fourier series pn(t) can be expressed in either its complex exponential form or its sine and cosine form. To be precise, let us restate the Dirichlet convergence theorem in its sine and cosine form.

THEOREM 2.3.4 (Dirichlet convergence) Let f be a function in L2(0, τ) of bounded variation, and let

pn(t) = ∑_{k=−n}^n ak e^{−ikω0t} = a0 + ∑_{k=1}^n αk cos(kω0t) + βk sin(kω0t)

be the partial Fourier series expansion for f(t). Then

lim_{n→∞} pn(t) = (1/2) lim_{ε→0+} (f(t + ε) + f(t − ε)) . (3.19)

Finally, the partial Fourier series pn(t) converges to f(t) at the continuous points of f(t).

In particular, the Dirichlet convergence theorem states that if f in L2(0, τ) is a piecewise continuous function of bounded variation over the interval [0, τ], then its Fourier series satisfies

a0 + ∑_{k=1}^∞ αk cos(kω0t) + βk sin(kω0t) = f(t) if f(t) is continuous at t (3.20)
= (f(t◦+) + f(t◦−))/2 if f(t) is discontinuous at t◦.

2.3.1 A sinusoid example

For an example, let us compute the Fourier series expansion for the function f in L2(0, 2π) defined by f(t) = cos(bt), where b is a positive constant. Throughout this example we also assume that b is not an integer. So τ = 2π is not a period for f(t). (Hence f(t) is not in the Wiener algebra W2π.) Since τ = 2π, the fundamental angular frequency ω0 = 1. According to Theorem 2.1.1, the function f = ∑_{−∞}^∞ ak e^{−ikt} where

ak = (1/2π) ∫_0^{2π} cos(bt) e^{ikt} dt = (1/2π) ∫_0^{2π} ((e^{ibt} + e^{−ibt})/2) e^{ikt} dt
= ∫_0^{2π} (e^{i(k+b)t} + e^{i(k−b)t})/(4π) dt = [e^{i(k+b)t}/(4π(k + b)i)]_0^{2π} + [e^{i(k−b)t}/(4π(k − b)i)]_0^{2π}
= (e^{2πib} − 1)/(4π(k + b)i) + (e^{−2πib} − 1)/(4π(k − b)i) = ((e^{2πib} − 1)(k − b) + (e^{−2πib} − 1)(k + b))/(4π(k + b)(k − b)i)
= ((e^{2πib} + e^{−2πib})k − (e^{2πib} − e^{−2πib})b − 2k)/(4π(k^2 − b^2)i) = (−b sin(2πb) + ik(1 − cos(2πb)))/(2π(k^2 − b^2)) .


Figure 2.4: The graph of f(t) and p25(t) over the interval [0, 6π]: the Fourier series for the 2π periodic extension of cos(1.3t).

In other words,

ak = (−b sin(2πb) + ik(1 − cos(2πb)))/(2π(k^2 − b^2)) . (3.21)

Recall f admits a Fourier series expansion of the form (3.1). Moreover, because f is real, αk = 2 Re ak and βk = 2 Im ak for integers k ≥ 1. By consulting (3.21), we see that

a0 = sin(2πb)/(2πb), αk = −b sin(2πb)/(π(k^2 − b^2)), and βk = k(1 − cos(2πb))/(π(k^2 − b^2)) (k ≥ 1) . (3.22)

Therefore f admits a Fourier series expansion of the form

cos(bt) = sin(2πb)/(2πb) + ∑_{k=1}^∞ ((−b sin(2πb)/(π(k^2 − b^2))) cos(kt) + (k(1 − cos(2πb))/(π(k^2 − b^2))) sin(kt)) (0 < t < 2π). (3.23)

The Dirichlet convergence Theorem 2.3.4 shows that this Fourier series converges on the interval (0, 2π). This example also shows that the sinusoid cos(bt) cannot be represented as a finite linear combination of sinusoids of the form {cos(kt), sin(kt)}. This happens because the period of cos(bt) is not equal to 2π when b is not an integer.
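A quick numerical check of (3.23) in Python (the evaluation point, b = 1.3, and the truncation length are arbitrary choices; the sine part of the series converges slowly, like 1/k, so many terms are needed):

```python
import math

b = 1.3
t = 1.0                          # any point in the open interval (0, 2*pi)
s = math.sin(2 * math.pi * b)
c = math.cos(2 * math.pi * b)

total = s / (2 * math.pi * b)    # the a0 term
for k in range(1, 200001):
    d = math.pi * (k * k - b * b)
    total += (-b * s / d) * math.cos(k * t) + (k * (1 - c) / d) * math.sin(k * t)
print(total, math.cos(b * t))    # the two values agree closely
```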

Now let us follow some of the ideas in Section 4.3 of [26] to find a series formula for π. Evaluating the Fourier series for cos(bt) in (3.23) at t = π, we have

cos(bπ) = sin(2πb)/(2πb) + ∑_{k=1}^∞ (−1)^k b sin(2πb)/(π(b^2 − k^2))
= cos(πb) sin(πb)/(πb) + ∑_{k=1}^∞ (−1)^k 2b cos(πb) sin(πb)/(π(b^2 − k^2)) .


Figure 2.4: The magnitude |ak| plotted against the angular frequency k.

The last equality follows from the fact that sin(2θ) = 2 cos(θ) sin(θ). Hence

1 = sin(πb)/(πb) + ∑_{k=1}^∞ (−1)^k 2b sin(πb)/(π(b^2 − k^2)) .

Multiplying both sides of the previous equation by π/sin(πb), we obtain

π/sin(bπ) = 1/b + 2b ∑_{k=1}^∞ (−1)^k/(b^2 − k^2) . (3.24)

Choosing b = 1/2 yields the following formula for π on page 155 of [26] that we have been looking for:

π = 2 + 4 ∑_{k=1}^∞ (−1)^k/(1 − 4k^2) . (3.25)
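A numerical check of (3.25) in Python (the truncation length is an arbitrary choice; the series is alternating, so the error is bounded by the first omitted term):

```python
import math

# partial sum of the series (3.25) for pi
approx = 2 + 4 * sum((-1) ** k / (1 - 4 * k * k) for k in range(1, 200001))
print(approx, math.pi)
```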

Finally, one can use Matlab to check that this is indeed a series for π.

The Dirichlet convergence Theorem 2.3.4 shows that at the discontinuity points 2πj, the Fourier series for cos(bt) converges to

(cos(2πb) + 1)/2 = sin(2πb)/(2πb) + (b sin(2πb)/π) ∑_{k=1}^∞ 1/(b^2 − k^2) ; (3.26)


see the second equation in (3.20) with 1 = cos(b · 0). By multiplying the previous equation by 2π/sin(2πb) and using (3.24) with 2b replacing b, we obtain

π cos(2πb)/sin(2πb) = −π/sin(2πb) + 1/b + 2b ∑_{k=1}^∞ 1/(b^2 − k^2)
= −1/(2b) − 4b ∑_{k=1}^∞ (−1)^k/(4b^2 − k^2) + 1/b + 2b ∑_{k=1}^∞ 1/(b^2 − k^2)
= 1/(2b) + ∑_{k=1}^∞ 8b/(4b^2 − (2k)^2) + ∑_{k=1}^∞ (−1)^{k+1} 4b/(4b^2 − k^2)
= 1/(2b) + ∑_{even k≥2} 8b/(4b^2 − k^2) + ∑_{k=1}^∞ (−1)^{k+1} 4b/(4b^2 − k^2)
= 1/(2b) + 4b ∑_{k=1}^∞ 1/(4b^2 − k^2) .

Replacing 2b with b, we obtain the following formula on page 156 in [26] for the cotangent:

π cot(πb) = 1/b + 2b ∑_{k=1}^∞ 1/(b^2 − k^2) . (3.27)
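The cotangent formula (3.27) can also be spot-checked numerically (a Python sketch; b = 1/4 and the truncation length are arbitrary choices, and since the tail of the series decays like 1/k^2 the truncation error is of order 1/n):

```python
import math

def pi_cot(b, n=200000):
    # right-hand side of (3.27), truncated after n terms
    return 1.0 / b + 2.0 * b * sum(1.0 / (b * b - k * k) for k in range(1, n + 1))

b = 0.25
val = pi_cot(b)
exact = math.pi / math.tan(math.pi * b)   # pi*cot(pi/4) = pi
print(val, exact)
```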

Finally, it is noted that Section 4.3 in [26] uses cos(αt) in L2(−π, π) to derive the formulas for π in (3.25) and π cot(πb) in (3.27). See Section 2.1.2 for Fourier series in L2(−π, π).

Now let p25(t) be the partial Fourier series defined by

p25(t) = sin(2πb)/(2πb) + ∑_{k=1}^{25} ((−b sin(2πb)/(π(k^2 − b^2))) cos(kt) + (k(1 − cos(2πb))/(π(k^2 − b^2))) sin(kt)) .

Figure 2.4 presents a graph of both p25(t) and f(t) = cos(bt) for b = 1.3 over the interval [0, 6π]. Here we periodically extended the function cos(bt) with a period of 2π to the interval [0, 6π]. Because f(0) ≠ f(2π), its 2π periodic extension has discontinuities at 2πj where j is an integer. Hence its Fourier series approximation is jumping around at 2πj, which is the Gibbs phenomenon. Finally, it is noted that at the endpoints this Fourier series converges to (cos(2πb) + 1)/2; see (3.26).

Figure 2.4 also presents the graph of the spectrum for f where we plotted |ak| vs the angular frequency k. This clearly shows that the amplitudes |ak| become small when |k| ≥ 25. The error between f and p25 is given by

‖f − p25‖ = (1/(π√2)) (∑_{k=26}^∞ (1.3^2 sin(2.6π)^2 + k^2 (1 − cos(2.6π))^2)/(k^2 − 1.3^2)^2)^{1/2} ≈ 0.058 .

As expected, because p25 is periodic with period 2π and f(0) ≠ f(2π), the approximation p25 does not match the endpoints f(0) and f(2π) of f. Finally, it is noted that f(t) and p25(t) behave very differently when t is not contained in (0, 2π). This is a consequence of the fact that 2π is not a period for cos(bt). Therefore the Fourier series expansion in (3.23) is only valid for t in (0, 2π), that is, equation (3.23) does not necessarily hold when t is not contained in (0, 2π).
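The tail sum defining ‖f − p25‖ can be evaluated directly (a Python sketch; the truncation of the infinite tail is an arbitrary choice, and the result only reports what this particular tail sum evaluates to):

```python
import math

b = 1.3
s2 = math.sin(2 * math.pi * b)
c2 = math.cos(2 * math.pi * b)

# tail of (3.15): pi^2 * sum over k >= 26 of (|alpha_k|^2 + |beta_k|^2)
tail = sum((b * b * s2 * s2 + k * k * (1 - c2) ** 2) / (k * k - b * b) ** 2
           for k in range(26, 500001))
err = math.sqrt(tail / 2) / math.pi
print(err)
```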

2.4 Harmonic Fourier series

In this section we will show that any real valued function in L2(0, τ) admits a Fourier series consisting of cosine functions and a phase shift.

THEOREM 2.4.1 Let f be a real valued function in L2(0, τ) and set ω0 = 2π/τ. Then f admits a Fourier series representation of the form

f(t) = a0 + ∑_{k=1}^∞ γk cos(kω0t − ψk) (4.1)

where γk = √(αk^2 + βk^2) and ψk equals the angle of the complex number αk + iβk. The scalars a0, αk and βk for k = 1, 2, 3, · · · are computed by

a0 = (1/τ) ∫_0^τ f(t) dt

αk = (2/τ) ∫_0^τ f(t) cos(kω0t) dt (4.2)

βk = (2/τ) ∫_0^τ f(t) sin(kω0t) dt .

Finally, in this setting Parseval’s equality is given by

(1/τ) ∫_0^τ |f(t)|^2 dt = |a0|^2 + (1/2) ∑_{k=1}^∞ γk^2 . (4.3)

The Fourier series representation in (4.1) is called the harmonic Fourier series for f.

Proof of Theorem 2.4.1. Let α and β be real numbers. We claim that

α cos(x) + β sin(x) = √(α^2 + β^2) cos(x − ψ) (4.4)

where ψ is the angle of α + iβ. The formula in (4.4) is a special case of (2.18) in Chapter 1; see also (2.20) in Chapter 1. Let us present a direct proof. Using α + iβ = |α + iβ| e^{iψ}, we obtain

α cos(x) + β sin(x) = α (e^{ix} + e^{−ix})/2 + β (e^{ix} − e^{−ix})/(2i)
= (α − iβ) e^{ix}/2 + (α + iβ) e^{−ix}/2
= |α − iβ| e^{−iψ} e^{ix}/2 + |α + iβ| e^{iψ} e^{−ix}/2
= |α + iβ| e^{i(x−ψ)}/2 + |α + iβ| e^{−i(x−ψ)}/2
= |α + iβ| (e^{i(x−ψ)} + e^{−i(x−ψ)})/2
= √(α^2 + β^2) cos(x − ψ) .


Hence (4.4) holds. Using the trigonometric identity (4.4) with γk = √(αk^2 + βk^2) in equation (3.1) of Theorem 2.3.1, we arrive at the Fourier series representation in (4.1). The form of Parseval’s equality in (4.3) follows by using γk^2 = αk^2 + βk^2 in (3.3). This completes the proof.

Recall that any function f in L2(0, τ) also admits a Fourier series representation of the form f = ∑_{−∞}^∞ ak e^{−iω0kt}. Moreover, according to Remark 2.3.2 if the function f is real valued, then αk = 2 Re ak and βk = 2 Im ak. In other words, ak = (αk + iβk)/2 for all integers k ≥ 1. This readily implies that

√(αk^2 + βk^2) = 2|ak| .

Moreover, αk + iβk and ak have the same angle ψk. So by consulting Theorem 2.4.1, we see that any real valued function f in L2(0, τ) admits a Fourier series representation of the form

f(t) = a0 + ∑_{k=1}^∞ 2|ak| cos(kω0t − ψk) (4.5)

where

ak = (1/τ) ∫_0^τ f(t) e^{iω0kt} dt (k = 0, 1, 2, · · · )

and ψk is the angle for ak.

To complete this section, it is emphasized that the Dirichlet convergence theorem also holds for harmonic Fourier series, when f is a real function of bounded variation in L2(0, τ). One simply uses the harmonic form pn(t) = a0 + ∑_1^n γk cos(kω0t − ψk) for the partial Fourier series in the Dirichlet convergence theorem.
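Converting a sine–cosine pair to harmonic form is a two-line computation; a Python sketch with an arbitrary pair (α, β) = (3, 4):

```python
import cmath
import math

alpha, beta = 3.0, 4.0
gamma = math.hypot(alpha, beta)              # sqrt(alpha^2 + beta^2) = 5
psi = cmath.phase(complex(alpha, beta))      # angle of alpha + i*beta

# spot-check identity (4.4): alpha*cos(x) + beta*sin(x) = gamma*cos(x - psi)
ok = all(abs(alpha * math.cos(x) + beta * math.sin(x)
             - gamma * math.cos(x - psi)) < 1e-12
         for x in (0.0, 0.7, 2.1, 5.0))
print(gamma, psi, ok)
```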

2.4.1 Exercise

Problem 1. Consider the function f in L2(0, 2π) given by

f(t) = 2 if 0 ≤ t < π
= −1 if π ≤ t < 2π .

Find the sine and cosine Fourier series expansion (3.1) for f, that is, find the Fourier series for f of the form:

f(t) = a0 + ∑_{k=1}^∞ αk cos(kt) + βk sin(kt).

Choose a partial Fourier series approximation pn(t) for f(t). Then plot pn(t) and f(t) on the same graph. Compute the error

‖f − pn‖ = √((1/2π) ∫_0^{2π} |f(t) − pn(t)|^2 dt)

in Matlab. Does this Fourier series converge for t = jπ where j is an integer, and if so what does it converge to; see the Dirichlet convergence Theorem 2.3.4. Is f(t) in the Wiener algebra W2π?


Problem 2. Consider the function f in L2(0, 2π) given by

f(t) = sin(1.5t) (when 0 ≤ t < 2π).

Find the sine and cosine Fourier series expansion (3.1) for f. Choose a partial Fourier series approximation pn(t) for f(t). Then plot pn(t) and f(t) on the same graph. Compute the error ‖f − pn‖. Does this Fourier series converge for t = 2πj where j is an integer, and if so what does it converge to; see the Dirichlet convergence Theorem 2.3.4. Is f(t) in the Wiener algebra W2π? Can f(t) be expressed as a finite linear combination of sinusoids of the form {e^{−ikt}}_{−∞}^∞?

Problem 3. Consider the function f(t) = e^{−t} in L2(0, 2π). Find the sine and cosine Fourier series expansion (3.1) for f. Choose a partial Fourier series approximation pn(t) for f(t). Then plot pn(t) and f(t) on the same graph. Compute the error ‖f − pn‖. Does this Fourier series converge for t = 2πj where j is an integer, and if so what does it converge to; see the Dirichlet convergence Theorem 2.3.4. Is f(t) in the Wiener algebra W2π? Finally, using Parseval’s equality show that

π (1 + e^{−2π})/(1 − e^{−2π}) = ∑_{k=−∞}^∞ 1/(1 + k^2) ;

see page 158 in [26].

Problem 4. Consider the function f in L2(0, 2π) given by

f(t) = 4 + 2 cos(2t)− 4 sin(2t) + 4 cos(3t) + 8 sin(3t) + 6 cos(5t)− 4 sin(8t).

Compute the norm ‖f‖_{L2(0, 2π)} for this function. Express f as a Fourier series of the form f = ∑_{−∞}^∞ ak e^{−ikt}. Plot the power spectrum for f.

Problem 5. Consider the function f in L2(0, 1) given by

f(t) = −4 + 4 cos(2πt)− 2 sin(2πt) + 6 cos(6πt)− 10 sin(6πt) + 8 cos(10πt)− 6 sin(14πt) .

Compute the norm ‖f‖_{L2(0,1)} for this function. Express f as a Fourier series of the form f = ∑_{−∞}^∞ ak e^{−2πikt}. Plot the power spectrum for f.

Problem 6. Express the function f ∈ L2(0, 2π) in Problem 4 as a harmonic Fourier seriesof the form (4.1).

Problem 7. Consider the function f in L2(0, 1) defined by

f(t) = 2 sin(πt).

Find the sine and cosine Fourier series expansion (3.1) for f, that is, the Fourier series expansion of the form:

f(t) = a0 + ∑_{k=1}^∞ αk cos(2πkt) + βk sin(2πkt).


Choose a partial Fourier series approximation pn(t) for f(t). Then plot pn(t) and f(t) on the same graph. Compute the error ‖f − pn‖ = √(∫_0^1 |f(t) − pn(t)|^2 dt). Show that f(t) is in the Wiener algebra W1. In particular, the Fourier series converges uniformly to f(t); see Theorem 2.1.6. Finally, plot the power spectrum for f.

Problem 8. Consider the function f in L2(0, 2π) defined by

f(t) = t sin(t) (for 0 ≤ t ≤ 2π).

Find the sine and cosine Fourier series expansion (3.1) for f, that is,

f(t) = a0 + ∑_{k=1}^∞ αk cos(kt) + βk sin(kt).

Choose a partial Fourier series approximation pn(t) for f(t). Then plot pn(t) and f(t) on the same graph. Compute the error ‖f − pn‖. Show that f(t) is in the Wiener algebra W2π. In particular, the Fourier series converges uniformly to f(t); see Theorem 2.1.6. Finally, plot the power spectrum for f.

Problem 9. Consider the function f in L2(0, 1) defined by

f(t) = t− 3t2 + 2t3 (for 0 ≤ t ≤ 1).

Notice that f(0) = f(1) and f′(0) = f′(1). Find the sine and cosine Fourier series expansion (3.1) for f. Choose a partial Fourier series approximation pn(t) for f(t). Then plot pn(t) and f(t) on the same graph. Compute the error ‖f − pn‖. Show that f(t) is in the Wiener algebra W1. In particular, the Fourier series converges uniformly to f(t); see Theorem 2.1.6. Finally, plot the power spectrum for f. Why is this Fourier series converging so fast?

Problem 10. Consider the function f in L2(0, 2π) defined by

f(t) = π/4 if 0 ≤ t < π
= −π/4 if π ≤ t < 2π.

Find the sine and cosine Fourier series expansion (3.1) for f. Does this Fourier series converge for t = jπ where j is an integer, and if so what does it converge to; see the Dirichlet convergence Theorem 2.3.4. Plot the power spectrum for f. Use this Fourier series to show that

π^2/8 = ∑_{odd k=1}^∞ 1/k^2 = 1 + 1/3^2 + 1/5^2 + 1/7^2 + · · ·

Is f(t) in the Wiener algebra W2π?

Problem 11. Consider the function f(t) = t in L2(0, 1).

(i) Show that the Fourier series expansion for the function f(t) = t in L2(0, 1) is given by

t = 1/2 − ∑_{k=1}^∞ sin(2πkt)/(kπ) (for 0 < t < 1).

Is f(t) in the Wiener algebra W1?

(ii) Use this Fourier series with the Dirichlet convergence Theorem 2.3.4 and Parseval’s equality to show that

π/4 = ∑_{k=0}^∞ (−1)^k/(2k + 1) and π^2/6 = ∑_{k=1}^∞ 1/k^2.

(iii) Clearly t^2 = 2 ∫_0^t s ds. Using this find the Fourier series for t^2 in L2(0, 1).

Problem 12. Find the sine and cosine Fourier series expansion (3.1) for the function f inL2(0, 2π) given by

f(t) = 2πt− t2 (for 0 ≤ t ≤ 2π).

Show that f(t) is in the Wiener algebra W2π. In particular, the Fourier series converges uniformly to f(t); see Theorem 2.1.6.

Answer:

f(t) = 2π^2/3 − ∑_{k=1}^∞ 4 cos(kt)/k^2 .

Problem 13. Find the Fourier series for f(t) = π sin(t/2) in L2(0, 2π). Show that f(t) is in the Wiener algebra W2π. In particular, the Fourier series converges uniformly to f(t); see Theorem 2.1.6. Using this show that π and 1 are respectively given by

π = 2 + 4 ∑_{k=1}^∞ (−1)^{k+1}/(4k^2 − 1) and 1 = 2 ∑_{k=1}^∞ 1/(4k^2 − 1).

Answer:

π sin(t/2) = 2 − 4 ∑_{k=1}^∞ cos(kt)/(4k^2 − 1) (for 0 ≤ t ≤ 2π).

Problem 14. The command soundsc in Matlab plays a vector on the speakers with a default sampling rate of 2^13 = 8192 hertz. In Matlab set t = linspace(0, 3, 3 ∗ 8192). Then on low volume (the high frequencies can be unpleasant) listen to

• soundsc(cos(2 ∗ π ∗ 200 ∗ t))
• soundsc(cos(2 ∗ π ∗ 400 ∗ t))
• soundsc(cos(2 ∗ π ∗ 800 ∗ t))
• soundsc(cos(2 ∗ π ∗ 1200 ∗ t))
• soundsc(cos(2 ∗ π ∗ 200 ∗ t) + cos(2 ∗ π ∗ 800 ∗ t)) % the sum of two sinusoids
• soundsc(cos(2 ∗ π ∗ 300 ∗ t). ∗ (t < 1.5) + cos(2 ∗ π ∗ 1000 ∗ t). ∗ (t > 1.5))

In Matlab the speakers for the command soundsc(cos(2 ∗ π ∗ f ∗ t)) are playing a frequency of f hertz for around 3 seconds. The human hearing range is around 64 to 23000 hertz. However, your speakers may not replicate the Matlab commands very well at the high and low end. Describe the sounds that you heard in each case. The Matlab command randn(1, b) yields a row vector of length b of independent Gaussian random numbers with mean zero and variance one. In fact, randn(1, b) can be viewed as Gaussian white noise. Now listen to soundsc(randn(size(t))) and describe what you heard.

2.5 The Fourier series for even and odd functions

Recall that f(t) is a periodic function with period τ if f(t) is a function on the real line satisfying f(t) = f(t + τ) for all −∞ < t < ∞. Moreover, the corresponding fundamental frequency ω0 = 2π/τ. Notice that for any integer k the functions e^{−ikω0t}, cos(kω0t) and sin(kω0t) are all periodic with period τ. Finally, if f(t) is τ periodic and f is in L2(0, τ), then the Fourier series expansions for f(t) in (1.4) and (3.1) hold for all t (almost everywhere) on the real line.

Recall that a function f(t) defined on the real line is even if f(t) = f(−t) for all t. For example, cos(kω0t) is an even function for all integers k. Recall also that f(t) is an odd function if f(−t) = −f(t) for all t. For example, sin(kω0t) is an odd function. Finally, it is noted that the product of two even functions is an even function, the product of two odd functions is an even function, while the product of an even function with an odd function is an odd function.

If f(t) is an odd function with period τ and f is in L2(0, τ), then f admits a Fourier series expansion of the form

f(t) = ∑_{k=1}^∞ βk sin(kω0t) where βk = (2/τ) ∫_0^τ f(t) sin(kω0t) dt . (5.1)

In other words, if f(t) is an odd τ periodic function, then f(t) is given by (3.1) and (3.2) where a0 = 0 and αk = 0 for all integers k ≥ 1; see Theorem 2.3.1. Since f(t) is odd and cos(kω0t) is an even function, f(t) cos(kω0t) is an odd function. Using the fact that f(t) cos(kω0t) is also τ periodic, we obtain

αk = (2/τ) ∫_0^τ f(t) cos(kω0t) dt = (2/τ) ∫_{−τ/2}^{τ/2} f(t) cos(kω0t) dt = 0.

The second equality follows from the fact that the integral over any interval [c, c + τ] of a τ periodic function is the same. The last integral equals zero because f(t) cos(kω0t) is an odd


function, and the integral over the interval [−τ/2, τ/2] of an odd function is zero. Finally, using the fact that f(t) is odd with period τ, we have

a0 = (1/τ) ∫_0^τ f(t) dt = (1/τ) ∫_{−τ/2}^{τ/2} f(t) dt = 0.

Therefore (5.1) holds.

If f(t) is an even function with period τ and f is in L2(0, τ), then f(t) admits a Fourier series expansion of the form

f(t) = a0 + ∑_{k=1}^∞ αk cos(kω0t)

a0 = (1/τ) ∫_0^τ f(t) dt and αk = (2/τ) ∫_0^τ f(t) cos(kω0t) dt. (5.2)

In other words, if f(t) is an even function, then f(t) is given by (3.1) and (3.2) where βk = 0 for all integers k ≥ 1; see Theorem 2.3.1. Since f(t) is even and sin(kω0t) is an odd function, f(t) sin(kω0t) is an odd function. Using the fact that f(t) sin(kω0t) is also τ periodic, we obtain

βk = (2/τ) ∫_0^τ f(t) sin(kω0t) dt = (2/τ) ∫_{−τ/2}^{τ/2} f(t) sin(kω0t) dt = 0 .

The second equality follows from the fact that the integral over any interval of length τ of a τ periodic function is the same. The last integral equals zero because f(t) sin(kω0t) is an odd function. Therefore (5.2) holds.
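The vanishing of the cosine coefficients for an odd periodic function, and of the sine coefficients for an even one, is easy to confirm numerically; a Python sketch with two arbitrary 2π periodic test functions:

```python
import numpy as np

tau = 2 * np.pi            # so w0 = 1
N = 100000
t = np.arange(N) * (tau / N)

def alpha_beta(f, k):
    # the coefficient formulas (3.2) by a rectangle rule over one period
    a = (2 / tau) * (f * np.cos(k * t)).sum() * (tau / N)
    b = (2 / tau) * (f * np.sin(k * t)).sum() * (tau / N)
    return a, b

odd_f = np.sin(t) ** 3              # odd about t = 0
even_f = np.abs(np.cos(t))          # even about t = 0

alphas_odd = [alpha_beta(odd_f, k)[0] for k in range(1, 6)]    # all ~ 0
betas_even = [alpha_beta(even_f, k)[1] for k in range(1, 6)]   # all ~ 0
print(alphas_odd, betas_even)
```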

2.5.1 An even square wave

Figure 2.5: The graph of p100(t) = r/π + (2/π) ∑_{k=1}^{100} (sin(kr)/k) cos(kt) with r = π/3.

Figure 2.6: The power spectrum (k, |ak|^2) in the upper plot and the magnitude (k, |ak|) in the lower plot.

Let us compute the Fourier series expansion of the form (3.1) for the square wave f in L2(0, 2π) defined by

f(t) = 1 if 0 ≤ t < r
= 0 if r ≤ t < 2π − r
= 1 if 2π − r ≤ t < 2π .

Here r is a positive scalar satisfying 0 < r < π. Notice that f(t) can be extended to an even periodic function with period 2π defined on the whole line. Since τ = 2π, the fundamental angular frequency ω0 = 1. So f admits a Fourier series expansion of the form (5.2) where ω0 = 1. Hence a0 is computed by

a0 = (1/2π) ∫_0^{2π} f(t) dt = (1/2π) ∫_0^r dt + (1/2π) ∫_{2π−r}^{2π} dt = r/π.

Thus a0 = r/π. For any integer k ≥ 1, we obtain

αk = (1/π) ∫_0^{2π} f(t) cos(kt) dt = (1/π) ∫_0^r cos(kt) dt + (1/π) ∫_{2π−r}^{2π} cos(kt) dt
= [sin(kt)/(kπ)]_0^r + [sin(kt)/(kπ)]_{2π−r}^{2π} = 2 sin(kr)/(kπ).

Thus αk = 2 sin(kr)/(kπ). (One could verify that βk = (1/π) ∫_0^{2π} f(t) sin(kt) dt = 0; see Theorem 2.3.1. However, this is not necessary. Because the 2π periodic extension of f(t) is an even function, βk must be zero.) By consulting (5.2), we see that f admits a Fourier series expansion of the form

f(t) = r/π + (2/π) ∑_{k=1}^∞ (sin(kr)/k) cos(kt) (5.3)


Notice that f(t) = ∑_{−∞}^∞ ak e^{−ikt} where a0 = r/π and ak = sin(kr)/(πk) for k ≠ 0. In other words, the complex Fourier series representation for f(t) is given by

f(t) = r/π + (1/π) ∑_{k≠0} (sin(kr)/k) e^{−ikt}. (5.4)

Figure 2.5 plots the partial Fourier series approximation

p100(t) = r/π + (2/π) ∑_{k=1}^{100} (sin(kr)/k) cos(kt)

for the square wave f(t) with r = π/3. Because f(t) is not continuous, we have the Gibbs phenomenon at the discontinuous points of f(t). According to (3.15), the error in this approximation is given by

‖f − p100‖ = (√2/π) (∑_{k=101}^∞ sin(kr)^2/k^2)^{1/2} ≈ 0.0318.

One can also use (3.16) to compute the error, that is, using ‖f‖^2 = r/π (see equation (5.5) below), we have

‖f − p100‖ = (‖f‖^2 − ‖p100‖^2)^{1/2} = (r/π − r^2/π^2 − (2/π^2) ∑_{k=1}^{100} sin(kr)^2/k^2)^{1/2} ≈ 0.0318.

The error ‖f − p100‖ = 0.0318 might seem small. However, this error is not that small considering that ‖f‖ = √(r/π) = 1/√3 ≈ 0.5774 when r = π/3.
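These error computations can be reproduced in Python (a sketch; the truncation of the infinite tail in (3.15) is an arbitrary choice):

```python
import math

r = math.pi / 3
# tail form of the error, from (3.15):
# ||f - p100|| = (sqrt(2)/pi) * sqrt( sum_{k > 100} sin(kr)^2 / k^2 )
tail = sum(math.sin(k * r) ** 2 / (k * k) for k in range(101, 400001))
err = math.sqrt(2) / math.pi * math.sqrt(tail)

norm_f = math.sqrt(r / math.pi)     # ||f|| = sqrt(r/pi) = 1/sqrt(3)
print(err, norm_f)
```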

The top graph in Figure 2.6 plots the power spectrum for f with |ak|^2 on the y axis and the angular frequency k on the x axis. The bottom graph in Figure 2.6 plots the magnitude |ak| on the y axis with the angular frequency k on the x axis. Finally, it is noted that |ak|^2 converges to zero on the order of 1/k^2, while |ak| converges to zero on the order of 1/|k|.

Now let us follow some of the ideas in Chapter 4.3 of [26], and use the Fourier series for the square wave to derive some formulas for π. To this end, notice that

‖f‖^2 = (1/2π) ∫_0^{2π} |f(t)|^2 dt = (1/2π) ∫_0^r dt + (1/2π) ∫_{2π−r}^{2π} dt = r/π. (5.5)

Applying Parseval’s equality (3.3) in Theorem 2.3.1 to the Fourier series in (5.3), we have

r/π = r^2/π^2 + (2/π^2) ∑_{k=1}^∞ sin(kr)^2/k^2 (when 0 ≤ r ≤ π).

Multiplying this equation by π^2, we see that

rπ = r^2 + 2 ∑_{k=1}^∞ sin(kr)^2/k^2 (when 0 ≤ r ≤ π) (5.6)


By choosing r = π/2, we obtain

π^2/8 = ∑_{k=0}^∞ 1/(2k + 1)^2. (5.7)

Recall that π^2/6 = ∑_1^∞ 1/k^2; see (2.4). Using this with 2 sin(θ)^2 = 1 − cos(2θ) in (5.6), we arrive at

r^2 − rπ + π^2/6 = ∑_{k=1}^∞ cos(2kr)/k^2 (when 0 ≤ r ≤ π) (5.8)

In particular, choosing r = π/2 yields

    π²/12 = ∑_{k=1}^{∞} (−1)^{k+1}/k².   (5.9)

Because f(t) is discontinuous, f(t) is not in the Wiener algebra W2π. The Dirichlet convergence Theorem 2.3.4 shows that the Fourier series (5.3) converges to f(t) pointwise at the continuous points of f(t). Moreover, the Fourier series (5.3) converges to 1/2 = (1 + 0)/2 at the discontinuous points of f(t). In particular, by evaluating the Fourier series at t = r and using sin(2θ) = 2 sin(θ) cos(θ), we have

    1/2 = r/π + (1/π) ∑_{k=1}^{∞} sin(2kr)/k   (when 0 < r < π).   (5.10)

Multiplying by 2π yields

    π = 2r + 2 ∑_{k=1}^{∞} sin(2kr)/k   (when 0 < r < π).   (5.11)

In particular, by choosing r = π/4, we obtain the classical formula for π, that is,

    π/4 = ∑_{k=1}^{∞} sin(πk/2)/k = ∑_{k=0}^{∞} (−1)^k/(2k + 1).   (5.12)

By replacing r by t in (5.8), we have

    t² − tπ + π²/6 = ∑_{k=1}^{∞} cos(2kt)/k²   (when 0 ≤ t ≤ π).   (5.13)

This is precisely the Fourier series expansion for the function g(t) = t² − tπ + π²/6 in L2(0, π). Moreover, g(t) is in the Wiener algebra Wπ. Therefore this Fourier series converges uniformly to g(t) over the interval [0, π].

By replacing r by t in (5.11), we obtain

    π − 2t = 2 ∑_{k=1}^{∞} sin(2kt)/k   (when 0 < t < π).   (5.14)


2.5. THE FOURIER SERIES FOR EVEN AND ODD FUNCTIONS 65

This is precisely the Fourier series expansion for the function h(t) = π − 2t in L2(0, π). Now observe that π = h(0) ≠ h(π) = −π. Therefore the function h(t) is not in the Wiener algebra Wπ. The Dirichlet convergence Theorem 2.3.4 shows that this Fourier series converges pointwise to π − 2t on the open interval (0, π) and to zero for t = πj where j is an integer.

2.5.2 The Fourier series in terms of sine or cosine functions

The following result shows that any function in L2(0, μ) can be expressed as an infinite series of sine functions.

THEOREM 2.5.1 Let f be any function in L2(0, μ). Then f admits a Fourier series representation of the form

    f(t) = ∑_{k=1}^{∞} β_k sin(kω_1 t)   (for almost all 0 ≤ t ≤ μ).   (5.15)

The fundamental frequency ω_1 = π/μ, and the Fourier coefficients {β_k}_1^∞ are computed by

    β_k = (2/μ) ∫_0^{μ} f(t) sin(kω_1 t) dt   (k ≥ 1).   (5.16)

In this setting, Parseval’s equality becomes

    (1/μ) ∫_0^{μ} |f(t)|² dt = (1/2) ∑_{k=1}^{∞} |β_k|².   (5.17)

Finally, if f(t) is an odd function with period 2μ, then (5.15) holds (almost everywhere) on the real line.

Proof. Let us extend f(t) to the interval [−μ, μ] by setting f(t) = −f(−t) for −μ ≤ t ≤ 0. This extension is also denoted by f(t), and f(t) is an odd function over [−μ, μ]. Moreover, f(t) is an odd function in L2(−μ, μ). By consulting Section 2.1.2, we see that f(t) admits a Fourier series expansion of the form

    f(t) = ∑_{k=−∞}^{∞} a_k e^{−ikω_1 t}   where   a_k = (1/2μ) ∫_{−μ}^{μ} e^{ikω_1 t} f(t) dt.

Notice that the integral of an odd function over the interval [−μ, μ] is zero. Since f(t) is odd, a_0 = (1/2μ) ∫_{−μ}^{μ} f(t) dt = 0. Because f(t) is an odd function, f(t) cos(kω_1 t) is odd and f(t) sin(kω_1 t) is even. So for k ≠ 0, we have

    a_k = (1/2μ) ∫_{−μ}^{μ} e^{ikω_1 t} f(t) dt
        = (1/2μ) ∫_{−μ}^{μ} f(t) cos(kω_1 t) dt + (i/2μ) ∫_{−μ}^{μ} f(t) sin(kω_1 t) dt
        = (i/2μ) ∫_{−μ}^{μ} f(t) sin(kω_1 t) dt = (i/μ) ∫_0^{μ} f(t) sin(kω_1 t) dt.



The last equality follows from the fact that f(t) sin(kω_1 t) is an even function. Hence a_k = iβ_k/2 when k ≥ 1 and a_k = −iβ_k/2 when k ≤ −1; see the definition of β_k in (5.16). This readily implies that

    f(t) = ∑_{k=−∞}^{∞} a_k e^{−ikω_1 t} = ∑_{k=1}^{∞} β_k ( i e^{−ikω_1 t} − i e^{ikω_1 t} )/2 = ∑_{k=1}^{∞} β_k sin(kω_1 t).

This yields the Fourier series in (5.15). Applying Parseval’s formula along with the fact that |f(t)|² is even, we have

    (1/μ) ∫_0^{μ} |f(t)|² dt = (1/2μ) ∫_{−μ}^{μ} |f(t)|² dt = ∑_{k=−∞}^{∞} |a_k|² = ∑_{k=1}^{∞} ( |β_k|²/4 + |β_k|²/4 ) = (1/2) ∑_{k=1}^{∞} |β_k|².

Therefore (5.17) holds. This completes the proof.

The following result shows that any function in L2(0, μ) can be expressed as an infinite series of cosine functions.

THEOREM 2.5.2 Let f be any function in L2(0, μ). Then f admits a Fourier series representation of the form

    f(t) = a_0 + ∑_{k=1}^{∞} α_k cos(kω_1 t)   (for almost all 0 ≤ t ≤ μ).   (5.18)

The fundamental frequency ω_1 = π/μ, and the Fourier coefficients {α_k}_1^∞ are computed by

    a_0 = (1/μ) ∫_0^{μ} f(t) dt   and   α_k = (2/μ) ∫_0^{μ} f(t) cos(kω_1 t) dt   (k ≥ 1).   (5.19)

In this setting, Parseval’s equality becomes

    (1/μ) ∫_0^{μ} |f(t)|² dt = |a_0|² + (1/2) ∑_{k=1}^{∞} |α_k|².   (5.20)

Finally, if f(t) is an even function with period 2μ, then (5.18) holds (almost everywhere) on the real line.

Proof. Let us extend f(t) to the interval [−μ, μ] by setting f(t) = f(−t) for −μ ≤ t ≤ 0. This extension is also denoted by f(t), and f(t) is an even function over [−μ, μ]. Moreover, f(t) is an even function in L2(−μ, μ). By consulting Section 2.1.2, we see that f(t) admits a Fourier series expansion of the form

    f(t) = ∑_{k=−∞}^{∞} a_k e^{−ikω_1 t}   where   a_k = (1/2μ) ∫_{−μ}^{μ} e^{ikω_1 t} f(t) dt.

Using the fact that f(t) is even, we see that

    a_0 = (1/2μ) ∫_{−μ}^{μ} f(t) dt = (1/μ) ∫_0^{μ} f(t) dt.



Because f(t) is an even function, f(t) cos(kω_1 t) is even and f(t) sin(kω_1 t) is odd. Recall that the integral of an odd function over [−μ, μ] is zero. For k ≠ 0, we have

    a_k = (1/2μ) ∫_{−μ}^{μ} e^{ikω_1 t} f(t) dt
        = (1/2μ) ∫_{−μ}^{μ} f(t) cos(kω_1 t) dt + (i/2μ) ∫_{−μ}^{μ} f(t) sin(kω_1 t) dt
        = (1/2μ) ∫_{−μ}^{μ} f(t) cos(kω_1 t) dt = (1/μ) ∫_0^{μ} f(t) cos(kω_1 t) dt.

The last equality follows from the fact that f(t) cos(kω_1 t) is an even function. Hence a_k = α_k/2 when k ≠ 0; see the definition of α_k in (5.19). This readily implies that

    f(t) = ∑_{k=−∞}^{∞} a_k e^{−ikω_1 t} = a_0 + ∑_{k=1}^{∞} α_k ( e^{−ikω_1 t} + e^{ikω_1 t} )/2 = a_0 + ∑_{k=1}^{∞} α_k cos(kω_1 t).

This yields the Fourier series in (5.18). Applying Parseval’s formula with a_k = α_k/2 for k ≠ 0 and the fact that |f(t)|² is even, we have

    (1/μ) ∫_0^{μ} |f(t)|² dt = (1/2μ) ∫_{−μ}^{μ} |f(t)|² dt = |a_0|² + ∑_{k≠0} |a_k|² = |a_0|² + (1/2) ∑_{k=1}^{∞} |α_k|².

In other words, the equality in (5.20) holds. This completes the proof.
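As a sanity check on Theorem 2.5.1, the following Python sketch computes the sine-series coefficients (5.16) for the test function f(t) = t on (0, π) by midpoint-rule quadrature, and then verifies Parseval’s equality (5.17). The quadrature resolution M and the truncation points are assumptions of the sketch, not from the text:

```python
import math

mu = math.pi           # the interval (0, mu); omega1 = pi/mu = 1
M = 20000              # midpoint-rule quadrature points (an assumed resolution)
h = mu / M
ts = [(j + 0.5) * h for j in range(M)]

def beta(k):
    # beta_k = (2/mu) * int_0^mu f(t) sin(k * omega1 * t) dt, with f(t) = t
    return (2 / mu) * h * sum(t * math.sin(k * t) for t in ts)

# For f(t) = t, integration by parts gives beta_k = 2*(-1)**(k+1)/k.
for k in (1, 2, 3):
    assert abs(beta(k) - 2 * (-1) ** (k + 1) / k) < 1e-4

# Parseval (5.17): (1/mu) * int_0^mu t^2 dt = pi^2/3 = (1/2) * sum |beta_k|^2.
lhs = math.pi ** 2 / 3
rhs = 0.5 * sum((2 / k) ** 2 for k in range(1, 200001))
print(abs(lhs - rhs))  # small; the neglected tail of the series is O(1/K)
```

The same pattern, with cos in place of sin and the a_0 term added, checks Theorem 2.5.2 and (5.20).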

An even Fourier series converging uniformly to π sin(t) over [0, π]. Consider the function f(t) = π sin(t) in L2(0, π). Then an application of Theorem 2.5.2 shows that

    π sin(t) = 2 − 4 ∑_{even k≥2} cos(kt)/(k² − 1)   (for 0 ≤ t ≤ π).   (5.21)

In fact, the Fourier series in (5.21) is in the Wiener algebra W2π. So the Fourier series in (5.21) converges uniformly to π sin(t) over the interval [0, π]; see Theorem 2.1.6. In fact, this Fourier series converges uniformly to π sin(t) over the interval [nπ, (n + 1)π] for even integers n, and does not converge to π sin(t) outside these intervals; see Figure 2.7. Because the Fourier series in (5.21) is an even function in the Wiener algebra W2π, it converges uniformly to an even function, and thus one cannot expect this Fourier series to converge to the odd function π sin(t) everywhere. By evaluating the previous Fourier series at t = 0 and t = π/2, respectively, we obtain

    1/2 = ∑_{even k≥2} 1/(k² − 1)   and   π = 2 − 4 ∑_{even k≥2} (−1)^{k/2}/(k² − 1).   (5.22)

Finally, an application of Parseval’s equality in (5.20) yields

    π²/2 = π ∫_0^{π} sin(t)² dt = 4 + 8 ∑_{even k≥2} 1/(k² − 1)².   (5.23)



Figure 2.7: The graph of π sin(t) and 2 − 4 ∑_{even k≥2} cos(kt)/(k² − 1).
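A quick numerical check of (5.21) and the first identity in (5.22); the truncation points in this Python sketch are assumptions, not from the text:

```python
import math

def cos_series(t, K=20000):
    # 2 - 4 * sum over even k >= 2 of cos(k t)/(k^2 - 1), truncated at an assumed K
    s = sum(math.cos(k * t) / (k * k - 1) for k in range(2, K + 1, 2))
    return 2 - 4 * s

# The series should reproduce pi*sin(t) on [0, pi], per (5.21).
for t in (0.0, 1.0, math.pi / 2, 3.0):
    assert abs(cos_series(t) - math.pi * math.sin(t)) < 1e-3

# First identity in (5.22): the sum over even k >= 2 of 1/(k^2 - 1) telescopes,
# since 1/(k^2 - 1) = (1/2)(1/(k-1) - 1/(k+1)), so it equals 1/2.
half = sum(1 / (k * k - 1) for k in range(2, 200001, 2))
print(half)  # ≈ 0.5
```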

Extending a function to avoid the Gibbs effect. In some applications it may be advantageous to extend a function f in L2(0, μ) to a function in L2(0, 2μ) (or to a function in L2(−μ, μ)) to avoid the Gibbs phenomenon at the endpoints 0 and μ. For example, consider the function f(t) = 1 − t² in L2(0, π). Since f(0) ≠ f(π), the corresponding Fourier series with period π displays the Gibbs phenomenon at the end points. To avoid this one can extend f(t) to a continuous even function f_e(t) in L2(−π, π) such that f_e(−π) = f_e(π). Then the corresponding Fourier series for f_e(t) in L2(−π, π) with period 2π does not have the Gibbs phenomenon and matches f(t) = 1 − t² over the interval [0, π].

To be more specific, let f_e(t) be the function defined by f_e(t) = f(t) = 1 − t² when 0 ≤ t ≤ π, and f_e(t) = f(−t) = 1 − t² for −π ≤ t ≤ 0. Notice that f_e(t) is an even function in L2(−π, π) satisfying f_e(−π) = f_e(π). So the corresponding Fourier series with period 2π avoids the Gibbs phenomenon. Now let f_e(t) also denote the 2π periodic extension of f_e(t) to the entire real line. In particular, we have

    f_e(t) = 1 − t²            if 0 ≤ t ≤ π
           = 1 − (t − 2π)²     if π ≤ t ≤ 2π.

Notice that f_e(0) = f_e(2π). Furthermore, f(t) = f_e(t) for 0 ≤ t ≤ π and f_e(t) is a 2π periodic continuous even function. Hence f_e(t) has a Fourier series representation of the form a_0 + ∑_{k=1}^{∞} α_k cos(kt). In fact,

    f_e(t) = 1 − π²/3 − 4 ∑_{k=1}^{∞} (−1)^k cos(kt)/k².   (5.24)

This Fourier series was obtained by using Theorem 2.5.2. (Deriving this Fourier series is left to the reader as an exercise.) Since f_e(t) is continuous, its Fourier series avoids the Gibbs phenomenon. The Dirichlet convergence Theorem 2.3.4 guarantees that the Fourier series in (5.24) converges to f_e(t) for all t. Furthermore, f_e(t) is in the Wiener algebra W2π. Therefore the Fourier series converges uniformly to f_e(t). In particular, this Fourier series converges uniformly to 1 − t² over the interval [−π, π]. It is also noted that evaluating the



Fourier series in (5.24) at t = 0 shows that

    π²/12 = ∑_{k=1}^{∞} (−1)^{k+1}/k².

Finally, there are many ways to extend a function to avoid the Gibbs phenomenon at the endpoints; see also [26].
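The uniform convergence of the extended series (5.24) can also be observed numerically; in the Python sketch below the truncation point K and the sample points are assumptions:

```python
import math

def fe_series(t, K=20000):
    # Right-hand side of (5.24), truncated at an assumed K terms
    s = sum((-1) ** k * math.cos(k * t) / k ** 2 for k in range(1, K + 1))
    return 1 - math.pi ** 2 / 3 - 4 * s

# Because the even extension fe is continuous, the series matches 1 - t^2
# on all of [-pi, pi], endpoints included -- no Gibbs overshoot.
for t in (-3.0, -1.0, 0.0, 0.5, 2.0, math.pi):
    assert abs(fe_series(t) - (1 - t * t)) < 1e-3
print("(5.24) matches 1 - t^2 on [-pi, pi]")
```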

2.5.3 Exercise

Problem 1. Compute the even Fourier series in (5.2) for the function f in L2(0, 2π) given by

    f(t) = t          if 0 ≤ t ≤ π
         = t − 2π     if π < t ≤ 2π.

In Matlab plot your approximation for the function f and compute the error. Plot the power spectrum for f.

Problem 2. Compute the odd Fourier series in (5.1) for the function f in L2(0, 2π) given by

    f(t) = 2      if 0 ≤ t < π
         = −2     if π ≤ t < 2π.

In Matlab plot your approximation for the function f and compute the error. Plot the power spectrum for f.

Problem 3. Compute the odd Fourier series in (5.15) of Theorem 2.5.1 for the function f in L2(0, π) given by f(t) = t for 0 ≤ t ≤ π. In Matlab plot your approximation for the function f and compute the error. Does this Fourier series avoid the Gibbs phenomenon? Explain why or why not.

Problem 4. Compute the even Fourier series in (5.18) of Theorem 2.5.2 for the function f in L2(0, π) given by f(t) = t for 0 ≤ t ≤ π. In Matlab plot your approximation for the function f and compute the error. Does this Fourier series avoid the Gibbs phenomenon? Explain why or why not.

Problem 5. Compute the odd Fourier series in (5.15) of Theorem 2.5.1 for the function f in L2(0, π) given by f(t) = 1 − t² for 0 ≤ t ≤ π. In Matlab plot your approximation for the function f and compute the error. Does this Fourier series avoid the Gibbs phenomenon? Explain why or why not.

Problem 6. Compute the even Fourier series in (5.18) of Theorem 2.5.2 for the function f in L2(0, π) given by f(t) = 1 − t for 0 ≤ t ≤ π. In Matlab plot your approximation for the



function f and compute the error. Does this Fourier series avoid the Gibbs phenomenon? Explain why or why not.

Problem 7. Compute the odd Fourier series in (5.15) of Theorem 2.5.1 for the function f in L2(0, π) given by f(t) = 1 for 0 ≤ t ≤ π. Notice that this Fourier series introduces a Gibbs phenomenon. Explain why.

2.6 The integral of a Fourier series

Let us begin with a simple example taken from the book [26], entitled Dr. Euler’s Fabulous Formula: Cures Many Mathematical Ills. Consider the function f(t) = t in L2(0, 2π), that is, f(t) = t for 0 ≤ t < 2π. In this setting, τ = 2π and ω_0 = 1. Let us find the complex Fourier series ∑_{k=−∞}^{∞} a_k e^{−ikt} for t. To this end, observe that

    a_0 = (1/2π) ∫_0^{2π} f(t) dt = (1/2π) ∫_0^{2π} t dt = [ t²/4π ]_0^{2π} = π.

Moreover, for any integer k ≠ 0, we have

    a_k = (1/2π) ∫_0^{2π} f(t) e^{ikt} dt = (1/2π) ∫_0^{2π} t e^{ikt} dt = (1/2π) [ t e^{ikt}/(ik) + e^{ikt}/k² ]_0^{2π}
        = (1/2π) [ 2π e^{i2πk}/(ik) + (e^{i2πk} − 1)/k² ] = 1/(ik) = −i/k.

Therefore

    a_0 = π   and   a_k = 1/(ik)   for all integers k ≠ 0.   (6.1)

By consulting (3.1) and (3.11), we see that f(t) = t on L2(0, 2π) admits a Fourier series expansion of the form:

    t = π + ∑_{k≠0} e^{−ikt}/(ik) = π − 2 ∑_{k=1}^{∞} sin(kt)/k   (for 0 < t < 2π).   (6.2)

In this case, α_k = 2ℜ(a_k) = 0 and β_k = 2ℑ(a_k) = −2/k for all integers k ≥ 1.

According to Parseval’s equality, we have

    4π²/3 = (1/2π) ∫_0^{2π} t² dt = (1/2π) ∫_0^{2π} |f(t)|² dt = ∑_{k=−∞}^{∞} |a_k|² = π² + 2 ∑_{k=1}^{∞} 1/k².   (6.3)

By rearranging terms, we arrive at the following famous formula due to Euler:

    π²/6 = ∑_{k=1}^{∞} 1/k².   (6.4)


2.6. THE INTEGRAL OF A FOURIER SERIES 71

Finally, it is noted that Euler proved this result without using Fourier series; see page 153 in [26] and the Basel problem in Wikipedia.
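Equation (6.3), and hence Euler’s formula (6.4), is easy to test numerically; the truncation point K in this Python sketch is an assumption:

```python
import math

# Parseval for f(t) = t on (0, 2*pi), equation (6.3):
# 4*pi^2/3 = pi^2 + 2 * sum_{k>=1} 1/k^2, equivalently sum 1/k^2 = pi^2/6.
K = 10 ** 6
s = sum(1.0 / (k * k) for k in range(1, K + 1))
lhs = 4 * math.pi ** 2 / 3
rhs = math.pi ** 2 + 2 * s
print(abs(lhs - rhs))  # about 2/K, the neglected tail of the series
```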

Clearly, f(t) = t in L2(0, 2π) is not a periodic function with period τ = 2π. Moreover, at the endpoints, t|_{t=0} = 0 ≠ 2π = t|_{t=2π}. (Thus t is not in the Wiener algebra W2π.) Let f(t) also denote the 2π periodic extension of t such that f(t) = f(t + 2π) for all t and f(2πj) = 0 for all integers j. To be precise,

    f(t) = t − 2πj   if 2jπ ≤ t < 2(j + 1)π, where j is an integer;   (6.5)

see the upper graph in Figure 2.8 or Figure 2.10. (Since {2πj}_{−∞}^{∞} is a set of Lebesgue measure zero, fixing f(2πj) = 0 is not necessary. However, it makes f(t) right continuous.) Sometimes this periodic extension f(t) is referred to as a sawtooth function or wave. Because t evaluated at zero does not equal t evaluated at 2π, the 2π periodic extension f(t) of t is discontinuous at 2πj. However, its Fourier series approximation π − 2 ∑_{k=1}^{n} sin(kt)/k is continuous and periodic with period 2π. Therefore its Fourier series approximation displays the Gibbs phenomenon at 2πj; see the lower plot in Figure 2.8 or Figure 2.10. In particular, the Fourier series does not converge to t at the endpoints, that is,

    0 = t|_{t=0} ≠ [ π − 2 ∑_{k=1}^{∞} sin(kt)/k ]_{t=0} = π

    2π = t|_{t=2π} ≠ [ π − 2 ∑_{k=1}^{∞} sin(kt)/k ]_{t=2π} = π.

Finally, π is the average of 0 = t|_{t=0} and 2π = t|_{t=2π}, the endpoints of t on the interval [0, 2π]. This also follows from (3.19) or (3.20) in the Dirichlet convergence Theorem 2.3.4.

Because t is of bounded variation on [0, 2π] and continuous on the open interval (0, 2π), the Fourier series π − 2 ∑_{k=1}^{∞} sin(kt)/k equals t on the open interval (0, 2π); see the Dirichlet convergence Theorem 2.3.4. In other words, the Fourier series converges at the continuous points of the sawtooth function f(t). In fact, by choosing t = π/2, we obtain

    π/2 = t|_{t=π/2} = [ π − 2 ∑_{k=1}^{∞} sin(kt)/k ]_{t=π/2} = π − 2/1 + 2/3 − 2/5 + 2/7 − 2/9 + · · ·

By rearranging terms, we see that π/4 is given by the following classical series expansion:

    π/4 = 1 − 1/3 + 1/5 − 1/7 + 1/9 − 1/11 + · · · = ∑_{k=0}^{∞} (−1)^k/(2k + 1).   (6.6)

The bottom graph of Figure 2.8 plots the sawtooth function f(t) with the following partial Fourier series

    p20(t) = π − 2 ∑_{k=1}^{20} sin(kt)/k   and   p2000(t) = π − 2 ∑_{k=1}^{2000} sin(kt)/k



on the same graph. It is emphasized that p20(t) and p2000(t) keep the first 20 and 2000 terms in the Fourier series expansion for f(t) = t, respectively. The errors in these approximations are given by

    ‖t − p20‖ = ( (1/2π) ∫_0^{2π} |t − p20(t)|² dt )^{1/2} = ( ∑_{|k|≥21} |a_k|² )^{1/2} = ( 2 ∑_{k=21}^{∞} 1/k² )^{1/2} ≈ 0.3123

    ‖t − p2000‖ = ( (1/2π) ∫_0^{2π} |t − p2000(t)|² dt )^{1/2} = ( ∑_{|k|≥2001} |a_k|² )^{1/2} = ( 2 ∑_{k=2001}^{∞} 1/k² )^{1/2} ≈ 0.0316.

These errors should be compared to ‖f‖ = 2π/√3 ≈ 3.6276; see (6.3).

Figure 2.8: The sawtooth function f(t) (upper graph) with the two Fourier series approximations p20(t) and p2000(t) (lower graph).

The power spectrum of f(t) = t in L2(0, 2π) is presented in Figure 2.9. Here we only plotted k on the x axis and |a_k|² on the y axis for k ≥ 0. It is understood that the power spectrum for real functions is symmetric about the y axis.

The Matlab commands we used to plot Figure 2.8, Figure 2.9 and compute the errors ‖t − p20‖ and ‖t − p2000‖ are given by

t=linspace(0,6.5*pi,2^18); subplot(2,1,1); f=pi*sawtooth(t)+pi;
plot(t,f); grid; axis([0,22,-2,8]);
xlabel('t'); title('The sawtooth function f(t)');



subplot(2,1,2); plot(t,f); hold on;
p=pi; for k=1:20; p=p-2*sin(k*t)/k; end; plot(t,p,'b');
p=pi; for k=1:2000; p=p-2*sin(k*t)/k; end; plot(t,p,'r');
grid; axis([0,22,-2,8]);
xlabel('t'); title('Two Fourier series approximations for f(t)');

a = [pi; (-i./(1:1000000-1)).'];   % column vector of a_0, a_1, a_2, ...
sqrt(2)*norm(a(22:1000000))        % = 0.3123, the error ||t - p_20||
sqrt(2)*norm(a(2002:1000000))      % = 0.0316, the error ||t - p_2000||

% To plot the power spectrum
hold off; bar((0:20),abs(a(1:21)).^2); grid
xlabel('The frequency k'); ylabel('|a_k|^2')
title('The power spectrum of f(t) = t over [0, 2\pi]')

Figure 2.9: The power spectrum of f(t) = t in L2(0, 2π), with the frequency k on the x axis and |a_k|² on the y axis.

The function t is not periodic. However, the Fourier series π − 2 ∑_{k=1}^{∞} sin(kt)/k for t is periodic, which leads to the Gibbs phenomenon at 2πj where j is an integer. As before, let f(t) denote the 2π periodic extension of t. Because f(t) is discontinuous at 2πj, as expected, the Gibbs phenomenon was present even when we plotted the partial Fourier series p_n(t) for large n = 12000; see the lower graph in Figure 2.10 and the overshoot at the discontinuities 2πj for f(t). In general, the overshoot for a Fourier series is around 8.9% of the total distance of the corresponding discontinuity; see Section 4.4 in [26] for further details. In this case, ‖f − p_n‖ ≈ 0.0128. In Section 2.7 below, we will use the Cesaro mean to help smooth out the Gibbs phenomenon and corresponding overshoot.



Figure 2.10: The graph of the sawtooth wave f(t) (upper graph) and the Fourier series approximation p_12000(t) = π − 2 ∑_{k=1}^{12000} sin(kt)/k (lower graph).

REMARK 2.6.1 Now let f(t) = t be viewed as a function in L2(0, τ) and set ω_0 = 2π/τ. Then f(t) = t admits a Fourier series expansion of the form

    t = τ/2 + ∑_{k≠0} e^{−ikω_0 t}/(ikω_0) = τ/2 − 2 ∑_{k=1}^{∞} sin(kω_0 t)/(kω_0)   (for 0 < t < τ).   (6.7)

The proof that this is indeed the Fourier series expansion for f(t) = t in L2(0, τ) is left to the reader as an exercise. Finally, it is noted that one can directly obtain (6.7) by replacing t with ω_0 t in the Fourier series t = π − 2 ∑_{k=1}^{∞} sin(kt)/k, valid when 0 < t < 2π.
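A numerical check of (6.7) for one choice of τ; the value τ = 4 and the truncation point in this Python sketch are assumptions:

```python
import math

tau = 4.0                     # an assumed period for the check
w0 = 2 * math.pi / tau

def t_series(t, K=100000):
    # Right-hand side of (6.7), truncated at an assumed K terms
    return tau / 2 - 2 * sum(math.sin(k * w0 * t) / (k * w0) for k in range(1, K + 1))

for t in (0.5, 1.0, 2.5, 3.5):
    assert abs(t_series(t) - t) < 1e-2
print("(6.7) reproduces t on (0, tau)")
```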

REMARK 2.6.2 It is noted that one can extend f(t) = t in L2(0, 2π) to a continuous periodic function f_e(t) with period 4π. To see this, simply set

    f_e(t) = t        if 0 ≤ t ≤ 2π
           = 4π − t   if 2π ≤ t ≤ 4π.

In this case, f_e(t) is a continuous function with period 4π and the corresponding frequency ω_0 = 1/2. So f_e(t) admits a Fourier series expansion of the form f_e(t) = ∑_{k=−∞}^{∞} a_k e^{−ikt/2}. In this extension, the Fourier series converges everywhere, and one avoids the Gibbs phenomenon; see also Section 2.5.2.

The integral of a Fourier series

In general one can integrate a Fourier series term by term. To be specific, let f(t) be a function in L2(0, τ). Because f(t) is in L2(0, τ), its integral ∫_0^t f(x) dx is also in L2(0, τ). To



see this, recall the following Cauchy-Schwarz inequality for two square integrable functions g and h:

    | ∫_0^{τ} g(x) h(x) dx |² ≤ ∫_0^{τ} |g(x)|² dx · ∫_0^{τ} |h(x)|² dx.

Using this Cauchy-Schwarz inequality with g = |f| and h = 1 and 0 ≤ t ≤ τ, we have

    (1/τ) ∫_0^{τ} | ∫_0^{t} f(x) dx |² dt ≤ (1/τ) ∫_0^{τ} ( ∫_0^{t} |f(x)| dx )² dt ≤ (1/τ) ∫_0^{τ} ( ∫_0^{τ} |f(x)| dx )² dt
        = ( ∫_0^{τ} |f(x)| × 1 dx )² ≤ ∫_0^{τ} |f(x)|² dx · ∫_0^{τ} 1² dx = τ ∫_0^{τ} |f(x)|² dx < ∞.

Hence ∫_0^t f(x) dx is in L2(0, τ). Therefore ∫_0^t f(x) dx admits a Fourier series expansion. Here we will show that the Fourier series expansion for ∫_0^t f(x) dx can be obtained by integrating term by term the Fourier series expansion for f(t).

Recall that f is in L2(0, τ) and ω_0 = 2π/τ. Hence f(t) admits a Fourier series expansion of the form

    f(t) = ∑_{k=−∞}^{∞} a_k e^{−ikω_0 t} = a_0 + ∑_{k=1}^{∞} ( α_k cos(kω_0 t) + β_k sin(kω_0 t) )

    a_k = (1/τ) ∫_0^{τ} f(t) e^{ikω_0 t} dt   (for all integers k)

    (1/τ) ∫_0^{τ} |f(t)|² dt = ∑_{k=−∞}^{∞} |a_k|².   (6.8)

By formally integrating the Fourier series for f(t), that is, by interchanging the integral and summation, we obtain

    ∫_0^{t} f(x) dx = ∑_{k=−∞}^{∞} ∫_0^{t} a_k e^{−ikω_0 x} dx = a_0 t + ∑_{k≠0} [ a_k e^{−ikω_0 x}/(−ikω_0) ]_0^{t}
                    = a_0 t + ∑_{k≠0} a_k/(ikω_0) + ∑_{k≠0} ( i a_k/(kω_0) ) e^{−ikω_0 t}.

Notice that a_0 t is not in Fourier series form. However, a_0 t is in L2(0, τ), and thus has a Fourier series expansion. In fact,

    t = τ/2 + ∑_{k≠0} e^{−ikω_0 t}/(ikω_0)   (for 0 < t < τ)

is the Fourier series expansion for t in L2(0, τ); see (6.7). Furthermore, the Cauchy-Schwarz inequality for sequences ( |∑ b_k d_k|² ≤ ∑ |b_k|² ∑ |d_k|² ) guarantees that {i a_k/(kω_0)}_{k≠0} is indeed



summable. Therefore we formally see that

    ∫_0^{t} f(x) dx = τa_0/2 + ∑_{k≠0} a_k/(ikω_0) + ∑_{k≠0} ( (a_k − a_0) i/(kω_0) ) e^{−ikω_0 t}   (6.9)

is the Fourier series expansion for ∫_0^t f(x) dx in L2(0, τ). A similar result holds by integrating the sin and cos version of the Fourier series term by term; see the second Fourier series in equation (6.8).

Because ∫_0^t f(x) dx is in L2(0, τ), the integral ∫_0^t f(x) dx admits a Fourier series expansion of the form

    ∫_0^{t} f(x) dx = ∑_{k=−∞}^{∞} c_k e^{−ikω_0 t}   where   c_k = (1/τ) ∫_0^{τ} ( ∫_0^{t} f(x) dx ) e^{ikω_0 t} dt.

By employing integration by parts for k ≠ 0, we have

    c_k = (1/τ) ∫_0^{τ} ( ∫_0^{t} f(x) dx ) e^{ikω_0 t} dt = [ e^{ikω_0 t} ∫_0^{t} f(x) dx / (ikω_0 τ) ]_0^{τ} − (1/(ikω_0 τ)) ∫_0^{τ} f(t) e^{ikω_0 t} dt
        = (1/(ikω_0 τ)) ∫_0^{τ} f(x) dx − a_k/(ikω_0) = (i a_k − i a_0)/(kω_0).

Moreover, using f(t) = ∑_{k=−∞}^{∞} a_k e^{−ikω_0 t} and t = τ/2 + ∑_{k≠0} e^{−ikω_0 t}/(ikω_0) with Parseval’s formula (1.6) for (1/τ) ∫_0^{τ} f(t) t̄ dt (clearly t̄ = t), we obtain

    c_0 = (1/τ) ∫_0^{τ} ∫_0^{t} f(x) dx dt = (1/τ) ∫_0^{τ} ( ∫_x^{τ} dt ) f(x) dx = (1/τ) ∫_0^{τ} (τ − x) f(x) dx
        = τa_0 − ( a_0 τ/2 − ∑_{k≠0} a_k/(ikω_0) ) = τa_0/2 + ∑_{k≠0} a_k/(ikω_0).

In particular,

    c_0 = (1/τ) ∫_0^{τ} (τ − t) f(t) dt = τa_0/2 + ∑_{k≠0} a_k/(ikω_0).

So the Fourier series expansion for ∫_0^t f(x) dx is

    ∫_0^{t} f(x) dx = ∑_{k=−∞}^{∞} c_k e^{−ikω_0 t} = τa_0/2 + ∑_{k≠0} a_k/(ikω_0) + ∑_{k≠0} ( (a_k − a_0) i/(kω_0) ) e^{−ikω_0 t}.

This is precisely the Fourier series obtained in (6.9) by integrating the Fourier series for f(t) term by term. Therefore integrating the Fourier series term by term for a function f(t) in L2(0, τ) is well defined.

It is noted that ∫_{t◦}^{t} f(x) dx = ∫_0^{t} f(x) dx − ∫_0^{t◦} f(x) dx for 0 ≤ t◦ < τ. In other words, the integral ∫_{t◦}^{t} f(x) dx equals ∫_0^{t} f(x) dx minus a constant. So the Fourier series for ∫_{t◦}^{t} f(x) dx equals the Fourier series for ∫_0^{t} f(x) dx minus the constant ∫_0^{t◦} f(x) dx.
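For f(t) = t on (0, 2π) the term-by-term formula (6.9) can be checked against the coefficients of t²/2 given in (6.12) below; the truncation point in this Python sketch is an assumption:

```python
import math

# f(t) = t on (0, 2*pi): a0 = pi, a_k = 1/(i*k), tau = 2*pi, w0 = 1.
tau, w0 = 2 * math.pi, 1.0
a0 = math.pi

def a(k):
    return 1 / (1j * k)

# Constant term of (6.9): tau*a0/2 + sum_{k != 0} a_k/(i*k*w0).
# Each pair (k, -k) contributes 2 * Re(a_k/(i*k)) = -2/k^2.
K = 10 ** 5
c0 = tau * a0 / 2 + sum(2 * (a(k) / (1j * k * w0)).real for k in range(1, K + 1))
assert abs(c0 - 2 * math.pi ** 2 / 3) < 1e-3   # matches c_0 = 2*pi^2/3

# Oscillatory coefficients of (6.9): (a_k - a0) * i/(k*w0) = 1/k^2 - i*pi/k.
for k in (1, 2, 5):
    ck = (a(k) - a0) * 1j / (k * w0)
    assert abs(ck - (1 / k ** 2 - 1j * math.pi / k)) < 1e-12
print("(6.9) reproduces the coefficients of t^2/2")
```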



The Fourier series for t²/2 = ∫_0^t x dx in L2(0, 2π)

For an example on integrating Fourier series, let us return to the function f(t) = t in L2(0, 2π). Recall that t admits a Fourier series expansion of the form t = π − 2 ∑_{k=1}^{∞} sin(kt)/k. Let g(t) = ∫_0^t x dx = t²/2. Then using π²/6 = ∑_{k=1}^{∞} 1/k², we have

    t²/2 = ∫_0^{t} x dx = ∫_0^{t} ( π − 2 ∑_{k=1}^{∞} sin(kx)/k ) dx
         = πt − 2 ∑_{k=1}^{∞} ∫_0^{t} (sin(kx)/k) dx = πt + 2 ∑_{k=1}^{∞} [ cos(kx)/k² ]_0^{t}
         = πt + 2 ∑_{k=1}^{∞} cos(kt)/k² − 2 ∑_{k=1}^{∞} 1/k²
         = πt + 2 ∑_{k=1}^{∞} cos(kt)/k² − π²/3
         = π ( π − 2 ∑_{k=1}^{∞} sin(kt)/k ) + 2 ∑_{k=1}^{∞} cos(kt)/k² − π²/3
         = 2π²/3 + ∑_{k=1}^{∞} ( 2 cos(kt)/k² − 2π sin(kt)/k ).

In other words, the Fourier series for t²/2 in L2(0, 2π) is given by

    g(t) = t²/2 = 2π²/3 + 2 ∑_{k=1}^{∞} ( cos(kt)/k² − π sin(kt)/k )   (for 0 < t < 2π).   (6.10)

Because the function t²/2 is of bounded variation over the interval [0, 2π], and is continuous on (0, 2π), the Fourier series converges to t²/2 on the interval (0, 2π); see Dirichlet’s convergence Theorem 2.3.4. By evaluating both sides of (6.10) at t = π, we obtain

    π²/2 = t²/2 |_{t=π} = 2π²/3 + [ ∑_{k=1}^{∞} ( 2 cos(kt)/k² − 2π sin(kt)/k ) ]_{t=π} = 2π²/3 + 2 ∑_{k=1}^{∞} (−1)^k/k².

By rearranging terms, we see that

    π²/12 = ∑_{k=1}^{∞} (−1)^{k+1}/k².   (6.11)

Recall that π²/6 = ∑_{k=1}^{∞} 1/k². So equation (6.11) tells us that

    π²/12 = ∑_{k=1}^{∞} (−1)^{k+1}/k² = (1/2) ∑_{k=1}^{∞} 1/k².



It is left as an exercise to directly show that these two sums converge to the same number.

Let t²/2 = ∑_{k=−∞}^{∞} c_k e^{−ikt} be the complex Fourier series expansion for the function t²/2 in L2(0, 2π). Recall that c_k = (α_k + iβ_k)/2 for all integers k ≥ 1; see (3.11) in Remark 2.3.2. Hence c_k = 1/k² − πi/k for all integers k ≥ 1. Moreover, c_0 = 2π²/3. This with c_{−k} = c̄_k implies that t²/2 in L2(0, 2π) also admits a Fourier series expansion of the form:

    t²/2 = 2π²/3 + ∑_{k≠0} ( 1/k² − πi/k ) e^{−ikt}.   (6.12)

Using Parseval’s equality with π²/6 = ∑_{k=1}^{∞} 1/k², we have

    4π⁴/5 = (1/2π) ∫_0^{2π} |t²/2|² dt = 4π⁴/9 + 2 ∑_{k=1}^{∞} ( 1/k⁴ + π²/k² ) = 4π⁴/9 + π⁴/3 + 2 ∑_{k=1}^{∞} 1/k⁴.

(This equality also follows from (6.10) and (3.3).) In other words,

    4π⁴/5 = 7π⁴/9 + 2 ∑_{k=1}^{∞} 1/k⁴.

This readily implies that

    π⁴/90 = ∑_{k=1}^{∞} 1/k⁴.   (6.13)
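A direct numerical check of (6.13); the truncation point K is an assumption of this sketch, and the neglected tail is of order 1/(3K³):

```python
import math

K = 100000
s = sum(1.0 / k ** 4 for k in range(1, K + 1))
print(s)  # ≈ pi^4/90 ≈ 1.0823232337
```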

The Fourier series for t on L2(0, 2π) in (6.2) yields (π − t)/2 = ∑_{k=1}^{∞} sin(kt)/k. Combining this with the Fourier series for t²/2 in (6.10), we have

    ∑_{k=1}^{∞} cos(kt)/k² = t²/4 − π²/3 + ∑_{k=1}^{∞} π sin(kt)/k = t²/4 − π²/3 + π(π − t)/2
                           = t²/4 − πt/2 + π²/6 = (3t² − 6πt + 2π²)/12.

In other words, we obtain the elegant Fourier series in L2(0, 2π) taken from page 148 in [26]:

    ∑_{k=1}^{∞} cos(kt)/k² = (3t² − 6πt + 2π²)/12   (for 0 ≤ t ≤ 2π).   (6.14)

Let g(t) = (3t² − 6πt + 2π²)/12 be the function defined by the right hand side of (6.14). Notice that g(0) = g(2π) = π²/6. Since g(t) = ∑_{k≠0} e^{−ikt}/(2k²), we see that g is in the Wiener algebra W2π. So the Fourier series in (6.14) converges uniformly to g(t) over [0, 2π]; see Theorem 2.1.6. In particular, evaluating both sides of equation (6.14) at t = 0 yields π²/6 = ∑_{k=1}^{∞} 1/k².



2.6.1 Exercise

Problem 1. Let us revisit Problem 2 in Section 2.2.1. Consider the square wave function f(t) in L2(0, 2π) given by

    f(t) = 1     if 0 ≤ t < π
    f(t) = −1    if π ≤ t < 2π;   (6.15)

see Section 2.2. Recall that the Fourier series expansion for this square wave function is given by

    f(t) = (4/π) ∑_{k=0}^{∞} sin((2k + 1)t)/(2k + 1);   (6.16)

see equation (2.2). Let g(t) = ∫_0^t f(x) dx be the integral of f(t) for 0 ≤ t ≤ 2π.

(i) Show that

    g(t) = t         if 0 ≤ t ≤ π
    g(t) = 2π − t    if π < t ≤ 2π.

(ii) By integrating the Fourier series (6.16) for f(t), with π²/8 = ∑_{k=0}^{∞} 1/(2k + 1)² (see (2.3)), show that the Fourier series for g(t) is given by

    g(t) = π/2 − (4/π) ∑_{k=0}^{∞} cos((2k + 1)t)/(2k + 1)².   (6.17)

(iii) By using Parseval’s equality show that

    π⁴/96 = ∑_{k=0}^{∞} 1/(2k + 1)⁴.

(iv) Does taking the derivative d/dt of the Fourier series for g(t) in (6.17) yield the Fourier series for f(t) in (6.16)? Explain why or why not.

Problem 2. Consider the Fourier series for g(t) = t²/2 in L2(0, 2π) presented in (6.10), that is,

    g(t) = t²/2 = 2π²/3 + 2 ∑_{k=1}^{∞} ( cos(kt)/k² − π sin(kt)/k ).

(i) Find the Fourier series for q(t) = t³ = 6 ∫_0^t g(x) dx in L2(0, 2π) of the form:

    q(t) = a_0 + ∑_{k=1}^{∞} ( α_k cos(kt) + β_k sin(kt) ).



(ii) Where does this Fourier series converge? What does this Fourier series converge to when t = 0 or t = 2π? Is q in the Wiener algebra W2π?

(iii) Plot t³ and p_n(t) = a_0 + ∑_{k=1}^{n} ( α_k cos(kt) + β_k sin(kt) ) on the same graph for several different values of n over the interval [0, 2π].

(iv) Plot the power spectrum for t³ in L2(0, 2π).

Problem 3. Consider the Fourier series for the function g(t) in L2(0, 2π) defined by

    g(t) = (3t² − 6πt + 2π²)/12 = ∑_{k=1}^{∞} cos(kt)/k²   (for 0 ≤ t ≤ 2π);

see equation (6.14). Let q(t) be the function in L2(0, 2π) obtained by taking the integral of g(t), that is,

    q(t) = ∫_0^{t} g(x) dx = (t³ − 3πt² + 2π²t)/12.

(i) Find the Fourier series for q(t) in L2(0, 2π) of the form:

    q(t) = a_0 + ∑_{k=1}^{∞} ( α_k cos(kt) + β_k sin(kt) ).

(ii) Where does this Fourier series converge? What does this Fourier series converge to when t = 0 or t = 2π? Is q in the Wiener algebra W2π?

(iii) Plot q(t) and its partial Fourier series p_n(t) = a_0 + ∑_{k=1}^{n} ( α_k cos(kt) + β_k sin(kt) ) on the same graph for several different values of n over the interval [0, 2π].

(iv) Plot the power spectrum for q(t).

(v) Does taking the derivative of g(t) equal the derivative of its Fourier series, that is, does

    (d/dt) ∑_{k=1}^{∞} cos(kt)/k² = dg/dt = (t − π)/2 ?

Explain why or why not.

Problem 4. Consider the function f in L2(0, 2π) given by

    f(t) = 2 / ( 2 − cos(t) + i sin(t) ).

(i) Find a Fourier series for f(t) of the form ∑_{k=−∞}^{∞} a_k e^{−ikt}. Is f in the Wiener algebra W2π? Hint: if z is a complex number satisfying |z| < 1, then 1/(1 − z) = ∑_{k=0}^{∞} z^k.

(ii) Find the Fourier series for f′ = df/dt of the form ∑_{k=−∞}^{∞} b_k e^{−ikt}. Is f′ in the Wiener algebra W2π?

(iii) Find

    (1/2π) ∫_0^{2π} |f(t)|² dt   and   (1/2π) ∫_0^{2π} |f′(t)|² dt.


2.7. THE CESARO MEAN 81

2.7 The Cesaro mean

The results in this section are not used in the remaining part of the notes, and the material is a bit beyond the scope of the notes, so the uninterested reader can skip this section. The Cesaro mean plays a fundamental role in studying Fourier series on Lp spaces. For an excellent presentation of the Cesaro mean see Hoffman [20]. Here we will use the Cesaro mean to smooth out some of the oscillation in the Gibbs phenomenon, and to gain some further insight into taking the derivative of a Fourier series. Finally, it is noted that one can develop Fourier series on Lp spaces. However, for our purposes we will only consider L2(0, τ) spaces.

Let f(t) = Σ_{k=−∞}^∞ a_k e^{−ikω_0 t} be the Fourier series for a function f(t) in L^2(0, τ) where ω_0 = 2π/τ is its fundamental frequency. Let p_n(t) be the n-th finite partial sum and σ_n(t) the corresponding arithmetic mean defined by

p_n(t) = Σ_{k=−n}^n a_k e^{−ikω_0 t} and σ_n(t) = (1/n) Σ_{ν=0}^{n−1} p_ν(t). (7.1)

Then σ_n(t) is called the n-th Cesaro mean of f(t). It is noted that σ_1(t) = p_0(t) = a_0. The Cesaro mean σ_n(t) is the arithmetic mean of the trigonometric polynomials {p_ν(t)}_{ν=0}^{n−1}. Finally, it is emphasized that Cesaro means have many nice convergence properties; see Hoffman [20].

By using σ_n(t) = (1/n) Σ_{ν=0}^{n−1} p_ν(t) and combining like terms, we see that the n-th Cesaro mean is given by

σ_n(t) = Σ_{|k|<n} ((n − |k|)/n) a_k e^{−ikω_0 t} where a_k = (1/τ) ∫_0^τ e^{ikω_0 t} f(t) dt. (7.2)
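The averaging in (7.1) and the triangular weights in (7.2) describe the same trigonometric polynomial. The following Python sketch (our own check, using arbitrary random coefficients that are not from the notes) compares the two forms of σ_n numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
tau = 2.0
w0 = 2 * np.pi / tau
t = np.linspace(0, tau, 400, endpoint=False)
n = 6
ks = np.arange(-n, n + 1)
# arbitrary complex Fourier coefficients a_{-n}, ..., a_n
a = rng.standard_normal(2 * n + 1) + 1j * rng.standard_normal(2 * n + 1)

def p(m):
    # m-th partial sum p_m(t) = sum_{|k| <= m} a_k e^{-ik w0 t}
    sel = np.abs(ks) <= m
    return (a[sel, None] * np.exp(-1j * np.outer(ks[sel], w0 * t))).sum(axis=0)

# sigma_n as the arithmetic mean of p_0, ..., p_{n-1}   -- definition (7.1)
sigma_avg = sum(p(v) for v in range(n)) / n
# sigma_n via the (n - |k|)/n triangular weights        -- formula (7.2)
sel = np.abs(ks) < n
w = (n - np.abs(ks[sel])) / n
sigma_wt = (w[:, None] * a[sel, None] * np.exp(-1j * np.outer(ks[sel], w0 * t))).sum(axis=0)

print(np.allclose(sigma_avg, sigma_wt))   # True: both formulas agree
```

The agreement simply counts, for each |k| < n, how many of the partial sums p_0, …, p_{n−1} contain the k-th term: exactly n − |k| of them.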

As before, the fundamental frequency ω0 =2πτ. Recall that f(t) is a real function if and only

if its Fourier coefficients a−k = ak for all integers k. So if f is a real function, then {pn(t)}∞0are also real functions. In this case, the Cesaro means {σn(t)}∞1 are real functions and theformula for σn(t) in (7.2) simplifies to

σn(t) = a0 +n−1∑k=1

n− k

n2�(ake−ikω0t

)(when f is real). (7.3)

In particular, if f(t) is a real function, then

σn(t) = a0 +

n−1∑k=1

n− k

n

(2�(ak) cos(kω0t) + 2�(ak) sin(kω0t)

)(when f is real) (7.4)

2.7.1 The convergence of Cesaro means

Let {z_n}_0^∞ be a sequence of complex numbers, and m_n = (1/n) Σ_{ν=0}^{n−1} z_ν the arithmetic mean for {z_ν}_0^{n−1}. The following classical result shows that if z_n converges to z, then m_n also


converges to z. However, the converse is not necessarily true. For example, if z_n = (−1)^n for all integers n ≥ 0, then z_n does not converge, while m_n = (1/n) Σ_{ν=0}^{n−1} z_ν converges to zero.
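A quick numerical illustration of this example (our own sketch, not from the notes):

```python
import numpy as np

z = np.array([(-1) ** n for n in range(10000)], dtype=float)
# running arithmetic means m_n = (z_0 + ... + z_{n-1}) / n
m = np.cumsum(z) / np.arange(1, len(z) + 1)

print(z[-1], m[-1])   # z_n keeps jumping between -1 and 1, while m_n tends to 0
```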

LEMMA 2.7.1 Let {z_n}_0^∞ be a sequence of complex numbers.

If z = lim_{n→∞} z_n, then z = lim_{n→∞} (1/n) Σ_{ν=0}^{n−1} z_ν. (7.5)

Sketch of proof. First notice that for any complex number z, we have

|m_n − z| = |(1/n) Σ_{ν=0}^{n−1} (z_ν − z)| ≤ (1/n) Σ_{ν=0}^{n−1} |z_ν − z|.

Assume that z_n converges to z. Then given any ε > 0 there exists an integer κ such that |z_n − z| ≤ ε/2 for all n ≥ κ. Hence

|m_n − z| ≤ (1/n) Σ_{ν=0}^{n−1} |z_ν − z| = (1/n) Σ_{ν=0}^{κ−1} |z_ν − z| + (1/n) Σ_{ν=κ}^{n−1} |z_ν − z|
≤ (1/n) Σ_{ν=0}^{κ−1} |z_ν − z| + (1/n) Σ_{ν=κ}^{n−1} (ε/2)
≤ (κ/n) max{|z_ν − z| : ν < κ} + ε/2.

For fixed κ, we can choose n sufficiently large such that (κ/n) max{|z_ν − z| : ν < κ} ≤ ε/2. So for n sufficiently large, |m_n − z| ≤ ε. Therefore m_n converges to z. This completes the proof.

Now let us return to the Cesaro means for a function f in L^2(0, τ). Equation (7.5) implies that the Cesaro means converge whenever the corresponding Fourier series converges. Moreover, the Cesaro means may converge even when the corresponding Fourier series diverges. The Cesaro means pay a price for this stronger convergence property. In this section, we will see that the Cesaro means converge slower than the Fourier series in the L^2(0, τ) norm. However, the Cesaro means have the tendency to smooth out the wiggles or oscillations in the Fourier series for discontinuous functions.

To be specific, let σ_n(t) = (1/n) Σ_{ν=0}^{n−1} p_ν(t) be the n-th Cesaro mean for some function f in L^2(0, τ). By consulting equation (7.5), we see that the Cesaro means {σ_n(t)}_1^∞ converge whenever the corresponding Fourier series Σ_{k=−∞}^∞ a_k e^{−ikω_0 t} converges, and in this case they converge to the same value. In particular, the Dirichlet convergence Theorem 2.1.4 also holds when the Fourier series limit lim_{n→∞} p_n(t) is replaced by the limit of the Cesaro means lim_{n→∞} σ_n(t). In other words, if f is a piecewise continuous function of bounded variation, then

Σ_{k=−∞}^∞ a_k e^{−ikω_0 t} = lim_{n→∞} σ_n(t) = f(t) if f is continuous at t
= (1/2)(f(t◦+) + f(t◦−)) if f is discontinuous at t◦. (7.6)


The Cesaro means have many nice convergence properties for functions in L^p spaces. For example, if f is in L^p for 1 ≤ p < ∞, then the corresponding Cesaro means σ_n converge to f in the L^p norm; see Hoffman [20]. For our purposes, let us present the following convergence result taken from page 17 of Hoffman [20].

THEOREM 2.7.2 Let {σ_n}_1^∞ be the Cesaro means for a function f in L^2(0, τ). Then σ_n converges to f in the L^2(0, τ) norm, that is,

0 = lim_{n→∞} ‖σ_n − f‖^2 = lim_{n→∞} (1/τ) ∫_0^τ |σ_n(t) − f(t)|^2 dt. (7.7)

If f(t) is continuous and f(0) = f(τ), then σ_n(t) converges to f(t) uniformly, that is,

0 = lim_{n→∞} max{|σ_n(t) − f(t)| : 0 ≤ t ≤ τ}. (7.8)

For example, consider the continuous function f(t) = t sin(2π/t) in L^2(0, 1), with f(0) = 0. It is noted that 0 = f(0) = f(1) and f(t) is not of bounded variation. Therefore f(t) does not satisfy the hypothesis of the Dirichlet convergence Theorem 2.1.4. However, Theorem 2.7.2 guarantees that the Cesaro means σ_n(t) converge uniformly to f(t) over the interval [0, 1].

The famous Weierstrass approximation Theorem for trigonometric polynomials states that if f(t) is a continuous τ periodic function, then there exists a sequence of τ periodic trigonometric polynomials which converges uniformly to f(t). The second part of Theorem 2.7.2 yields a constructive proof of this Weierstrass approximation Theorem. In particular, the Cesaro means σ_n(t) converge uniformly to f(t).

To obtain some further insight into the convergence of the Cesaro means, recall that the inner product between two functions f and g in L^2(0, τ) is given by

(f, g) = (1/τ) ∫_0^τ f(t) \overline{g(t)} dt. (7.9)

Moreover, ‖f‖ = √(f, f) is the norm of f. Let

f(t) = Σ_{k=−∞}^∞ a_k e^{−ikω_0 t} and g(t) = Σ_{k=−∞}^∞ b_k e^{−ikω_0 t} (7.10)

be the Fourier series for f and g, respectively. Then Parseval's equality states that

(f, g) = (1/τ) ∫_0^τ f(t) \overline{g(t)} dt = Σ_{k=−∞}^∞ a_k \overline{b_k}. (7.11)

If g and h are vectors in L^2 (or any Hilbert space), then ‖g + h‖^2 = ‖g‖^2 + 2ℜ(g, h) + ‖h‖^2. To see this simply observe that

‖g + h‖^2 = (g + h, g + h) = (g, g) + (g, h) + (h, g) + (h, h)
= ‖g‖^2 + (g, h) + \overline{(g, h)} + ‖h‖^2 = ‖g‖^2 + 2ℜ(g, h) + ‖h‖^2.


Hence ‖g + h‖^2 = ‖g‖^2 + 2ℜ(g, h) + ‖h‖^2.

Let p_n be the n-th partial Fourier series for a function f in L^2(0, τ). We claim that

‖f − p_n‖^2 = ‖f‖^2 − ‖p_n‖^2 = Σ_{n<|k|} |a_k|^2. (7.12)

As expected, Σ_{k=−∞}^∞ a_k e^{−ikω_0 t} is the Fourier series for f. It is noted that ‖f − p_n‖ is the error in the approximation of f by p_n, or in Hilbert space terminology, ‖f − p_n‖ is the distance from p_n to f in the L^2(0, τ) norm. Using Parseval's equality with ‖p_n‖^2 = Σ_{k=−n}^n |a_k|^2, we obtain

‖f − p_n‖^2 = ‖Σ_{n<|k|} a_k e^{−ikω_0 t}‖^2 = Σ_{n<|k|} |a_k|^2 = ‖f‖^2 − ‖p_n‖^2.

This yields (7.12).

The following result shows that the n-th partial Fourier series p_n for f has the smallest possible error over the set of all trigonometric polynomials of degree at most n. In particular, we will see that the Cesaro means σ_n converge slower than p_n to f in the L^2(0, τ) norm.

LEMMA 2.7.3 Let Σ_{k=−∞}^∞ a_k e^{−ikω_0 t} be the Fourier series for a function f in L^2(0, τ) and p_n the n-th partial Fourier series for f. Let q(t) = Σ_{k=−n}^n q_k e^{−ikω_0 t} be a trigonometric polynomial of degree at most n. Then

‖f − q‖^2 = ‖f − p_n‖^2 + ‖p_n − q‖^2. (7.13)

Moreover, expressed in terms of its Fourier coefficients,

‖f − q‖^2 = ‖f − p_n‖^2 + Σ_{k=−n}^n |a_k − q_k|^2 = Σ_{n<|k|} |a_k|^2 + Σ_{k=−n}^n |a_k − q_k|^2. (7.14)

In particular, ‖f − p_n‖ ≤ ‖f − q‖ for all trigonometric polynomials q of degree at most n. Moreover, ‖f − p_n‖ = ‖f − q‖ if and only if q = p_n.

The previous lemma states that p_n is the unique trigonometric polynomial of degree at most n which has the smallest error over the set of all trigonometric polynomials of degree at most n. In other words, p_n is the unique solution to the following optimization problem:

‖f − p_n‖ = min{ ‖f − q‖ : q = Σ_{k=−n}^n q_k e^{−ikω_0 t} where q_k ∈ C }. (7.15)

Because the Cesaro mean σ_{n+1}(t) is a trigonometric polynomial of degree at most n, the error ‖f − p_n‖ ≤ ‖f − σ_{n+1}‖; see Lemma 2.7.3. Therefore the Cesaro means σ_n converge slower than p_n to f in the L^2(0, τ) norm. Finally, it is emphasized that Lemma 2.7.3 and the fact that p_n is the unique solution to the minimization problem in (7.15) are simple


consequences of the projection theorem on Hilbert space. For an excellent introduction to Hilbert space and operator theory see [15].

Proof of Lemma 2.7.3. First observe that f − p_n is orthogonal to p_n − q, that is, (f − p_n, p_n − q) = 0. Using Parseval's equality we have

(f − p_n, p_n − q) = ( Σ_{n<|k|} a_k e^{−ikω_0 t}, Σ_{|k|≤n} (a_k − q_k) e^{−ikω_0 t} ) = 0.

The last equality follows by applying Parseval's equality along with the observation that the coefficient sequences {a_k}_{n<|k|} and {a_k − q_k}_{|k|≤n} are supported on disjoint sets of indices, and thus have no common terms. Therefore f − p_n is orthogonal to p_n − q. Using ‖g + h‖^2 = ‖g‖^2 + 2ℜ(g, h) + ‖h‖^2 with g = f − p_n and h = p_n − q, we obtain

‖f − q‖^2 = ‖f − p_n + p_n − q‖^2
= ‖f − p_n‖^2 + 2ℜ(f − p_n, p_n − q) + ‖p_n − q‖^2
= ‖f − p_n‖^2 + ‖p_n − q‖^2.

Hence (7.13) holds. In particular, ‖f − p_n‖ ≤ ‖f − q‖. If ‖f − p_n‖ = ‖f − q‖, then (7.13) implies that ‖p_n − q‖ = 0, or equivalently, p_n = q.

By applying Parseval's equality once again, we have

‖f − q‖^2 = ‖f − p_n‖^2 + ‖p_n − q‖^2 = ‖f − p_n‖^2 + ‖Σ_{|k|≤n} (a_k − q_k) e^{−ikω_0 t}‖^2
= ‖f − p_n‖^2 + Σ_{|k|≤n} |a_k − q_k|^2 = Σ_{n<|k|} |a_k|^2 + Σ_{|k|≤n} |a_k − q_k|^2.

This yields (7.14) and completes the proof.

Specializing Lemma 2.7.3 to the Cesaro mean setting leads to the following result, which also proves the first part of Theorem 2.7.2. Moreover, we obtain a specific formula showing that the Cesaro means converge slower than the Fourier series in the L^2(0, τ) norm.

THEOREM 2.7.4 Let f(t) = Σ_{k=−∞}^∞ a_k e^{−ikω_0 t} be the Fourier series for a function f in L^2(0, τ). Let p_n be the n-th partial Fourier series and σ_n the n-th Cesaro mean for f.

(i) The error ‖f − p_n‖ ≤ ‖f − σ_{n+1}‖. To be precise,

‖f − σ_{n+1}‖^2 = ‖f − p_n‖^2 + Σ_{|k|≤n} k^2|a_k|^2/(n + 1)^2 (7.16)
= Σ_{n<|k|} |a_k|^2 + Σ_{|k|≤n} k^2|a_k|^2/(n + 1)^2. (7.17)

In particular, ‖f − p_n‖ = ‖f − σ_{n+1}‖ if and only if a_k = 0 for all k = ±1, ±2, ..., ±n. Hence ‖f − p_n‖ = ‖f − σ_{n+1}‖ for all positive integers n if and only if f(t) = a_0 is a constant function.


(ii) Moreover,

0 = lim_{n→∞} Σ_{|k|≤n} k^2|a_k|^2/(n + 1)^2. (7.18)

Therefore the Cesaro means σ_n converge to f in the L^2(0, τ) norm.

(iii) The sequence ‖σ_n‖ converges to ‖f‖, that is,

Σ_{k=−∞}^∞ |a_k|^2 = (1/τ) ∫_0^τ |f(t)|^2 dt = lim_{n→∞} (1/τ) ∫_0^τ |σ_n(t)|^2 dt = lim_{n→∞} Σ_{|k|<n} (n − |k|)^2 |a_k|^2 / n^2. (7.19)

(iv) Given any positive integer m ≤ n, we have the upper bound

‖f − σ_{n+1}‖^2 ≤ 2‖f − p_m‖^2 + m^2‖f‖^2/(n + 1)^2. (7.20)

Proof. By rewriting the formula for σ_n(t) in (7.2) for n + 1, we have

σ_{n+1}(t) = Σ_{|k|≤n} ((n + 1 − |k|)/(n + 1)) a_k e^{−ikω_0 t}. (7.21)

Using this in equation (7.14) of Lemma 2.7.3 with q_k = (n + 1 − |k|)a_k/(n + 1), we obtain

‖f − σ_{n+1}‖^2 = ‖f − p_n‖^2 + Σ_{|k|≤n} |a_k − (n + 1 − |k|)a_k/(n + 1)|^2
= ‖f − p_n‖^2 + Σ_{|k|≤n} k^2|a_k|^2/(n + 1)^2.

This yields the equality in (7.16), and proves Part (i).

Now let us prove that x_n = Σ_{|k|≤n} k^2|a_k|^2/(n + 1)^2 converges to zero as n tends to infinity. Let m be any positive integer such that m < n. Then

Σ_{|k|≤n} k^2|a_k|^2/(n + 1)^2 = Σ_{|k|≤m} k^2|a_k|^2/(n + 1)^2 + Σ_{m<|k|≤n} k^2|a_k|^2/(n + 1)^2
≤ Σ_{|k|≤m} m^2|a_k|^2/(n + 1)^2 + Σ_{m<|k|≤n} |a_k|^2
≤ (m^2/(n + 1)^2) Σ_{k=−∞}^∞ |a_k|^2 + Σ_{m<|k|} |a_k|^2
= m^2‖f‖^2/(n + 1)^2 + ‖f − p_m‖^2.


The last equality follows from Parseval's equality. So we have

x_n = Σ_{|k|≤n} k^2|a_k|^2/(n + 1)^2 ≤ m^2‖f‖^2/(n + 1)^2 + ‖f − p_m‖^2 (whenever 0 ≤ m ≤ n). (7.22)

Given any ε > 0 there exists an integer m such that ‖f − p_m‖^2 ≤ ε/2. For this fixed m, we also have m^2‖f‖^2/(n + 1)^2 ≤ ε/2 for all n sufficiently large. Therefore x_n ≤ ε/2 + ε/2 = ε for all n sufficiently large. In other words, x_n converges to zero and the limit in equation (7.18) holds.

Because p_n converges to f in the L^2(0, τ) norm, equations (7.16) and (7.18) imply that ‖f − σ_{n+1}‖ converges to zero. In other words, σ_n converges to f in the L^2(0, τ) norm. (This also proves the first part of Theorem 2.7.2.) Hence Part (ii) holds.

To obtain the bound in (7.20), notice that for 0 ≤ m ≤ n, we have

‖f − σ_{n+1}‖^2 = ‖f − p_n‖^2 + x_n ≤ ‖f − p_m‖^2 + m^2‖f‖^2/(n + 1)^2 + ‖f − p_m‖^2.

Therefore Part (iv) holds.

Because σ_n converges to f, the norms ‖σ_n‖ converge to ‖f‖. By applying Parseval's equality to the formula for σ_n in (7.2), we obtain

‖σ_n‖^2 = (1/τ) ∫_0^τ |σ_n(t)|^2 dt = Σ_{|k|<n} (n − |k|)^2 |a_k|^2 / n^2. (7.23)

Combining this with ‖f‖^2 = Σ_{k=−∞}^∞ |a_k|^2 yields the formula in (7.19). This completes the proof.

For a simple example demonstrating that the Cesaro means σ_n(t) converge slower than the partial Fourier series p_n(t), consider the function f(t) = sin(3t) in L^2(0, 2π). Then p_n(t) = 0 for n = 0, 1, 2 and p_n(t) = sin(3t) for all integers n ≥ 3. Obviously, p_n(t) converges to sin(3t) in virtually any norm. Recall that σ_n(t) = (1/n) Σ_{ν=0}^{n−1} p_ν(t). Hence σ_n(t) = 0 for n = 1, 2, 3 and σ_n(t) = ((n − 3)/n) sin(3t) for all integers n ≥ 4. Clearly, σ_n(t) converges to sin(3t). However, p_n(t) converges to sin(3t) in four steps, while the Cesaro means σ_n(t) converge to sin(3t) on the order of 1/n.
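A short numerical check of this example (our own sketch; the closed form ((n − 3)/n) sin(3t) is derived above):

```python
import numpy as np

t = np.linspace(0, 2 * np.pi, 500, endpoint=False)

def p(nu):
    # partial Fourier sums of sin(3t): zero until the k = 3 term enters
    return np.sin(3 * t) if nu >= 3 else np.zeros_like(t)

for n in (4, 10, 50):
    sigma = sum(p(v) for v in range(n)) / n          # n-th Cesaro mean
    assert np.allclose(sigma, (n - 3) / n * np.sin(3 * t))
print("checked n = 4, 10, 50")
```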

REMARK 2.7.5 Recall that {e^{−ikω_0 t}}_{k=−∞}^∞ is an orthonormal basis for the Hilbert space L^2(0, τ). It is noted that the results in Theorem 2.7.4 hold in any separable Hilbert space H. To be specific, let {ϕ_k}_{k∈Z} be an orthonormal basis for H where Z is the set of integers. (One could also use Z_+, the set of nonnegative integers.) Let f be a vector in H. Then f = Σ_{k∈Z} a_k ϕ_k where a_k = (f, ϕ_k). Let

p_n = Σ_{|k|≤n} a_k ϕ_k and σ_n = (1/n) Σ_{ν=0}^{n−1} p_ν = Σ_{|k|<n} ((n − |k|)/n) a_k ϕ_k.

Then p_n converges to f and σ_n converges to f. In this setting, ‖f − p_n‖^2 = Σ_{n<|k|} |a_k|^2. Moreover, ‖f − p_n‖ ≤ ‖f − σ_{n+1}‖. In other words,

‖f − σ_{n+1}‖^2 = ‖f − p_n‖^2 + Σ_{|k|≤n} k^2|a_k|^2/(n + 1)^2 = Σ_{n<|k|} |a_k|^2 + Σ_{|k|≤n} k^2|a_k|^2/(n + 1)^2.


In particular, ‖f − p_n‖ = ‖f − σ_{n+1}‖ if and only if a_k = 0 for all k = ±1, ±2, ..., ±n. Furthermore, Σ_{|k|≤n} k^2|a_k|^2/(n + 1)^2 converges to zero as n tends to infinity. Because ‖σ_n‖ converges to ‖f‖, we have

Σ_{k=−∞}^∞ |a_k|^2 = ‖f‖^2 = lim_{n→∞} ‖σ_n‖^2 = lim_{n→∞} Σ_{|k|<n} (n − |k|)^2 |a_k|^2 / n^2.

Finally, the bound in (7.20) also holds.


Figure 2.11: The graph of the sawtooth wave f(t) with p10(t) and σ11(t).

The Cesaro means for f(t) = t in L2(0, 2π)

As before, consider the function f(t) = t in L^2(0, 2π). Let f(t) also denote the 2π periodic extension of f(t) = t. In other words, f(t) = f(t + 2π) and f(2πj) = 0 for all integers j. To be precise, f(t) = t − 2πj when t ∈ [2πj, 2π(j + 1)). Recall that f(t) is also referred to as a sawtooth function. (Since any point has Lebesgue measure zero, defining f(2πj) = 0 is not necessary.) Observe that this 2π periodic extension f(t) has discontinuities at 2πj. Recall that t admits a Fourier series expansion of the form

t = π + Σ_{k≠0} e^{−ikt}/(ik) = π − 2 Σ_{k=1}^∞ sin(kt)/k (when 0 < t < 2π);

see (6.2). In this case, the Fourier coefficients a_k for e^{−ikt} in the Fourier series are given by a_0 = π and a_k = 1/(ik) for all integers k ≠ 0. By consulting (7.4), we see that the n-th Cesaro mean for f(t) is determined by

σ_n(t) = π − 2 Σ_{k=1}^{n−1} (n − k) sin(kt)/(nk) (the n-th Cesaro mean for t in L^2(0, 2π)). (7.24)


The Cesaro means σ_n(t) converge to the sawtooth function f(t) for all t ≠ 2πj. Since π is the average of t evaluated at the endpoints of the interval [0, 2π], the Cesaro means σ_n(2πj) converge to π for all integers j; see equation (7.6).

Figure 2.11 plots the sawtooth function f(t), the partial Fourier series p_10(t) and the Cesaro mean σ_11(t) on the same graph. (We compared p_10 with σ_11 because they both use the same complex exponential functions {e^{−ikt}}_{k=−10}^{10}.) It is noted that the Cesaro mean σ_11(t) eliminates the wiggles or oscillations in the approximation p_10(t) of f(t) = t. Next we applied (7.12) and (7.17) to calculate the errors ‖f − p_10‖ and ‖f − σ_11‖:

‖f‖ = √( (1/2π) ∫_0^{2π} t^2 dt ) = 2π/√3,

‖f − p_10‖ = √( (1/2π) ∫_0^{2π} |t − p_10(t)|^2 dt ) ≈ 0.4362,

‖f − σ_11‖ = √( (1/2π) ∫_0^{2π} |t − σ_11(t)|^2 dt ) ≈ 0.5963. (7.25)

The Cesaro means converge slower than the Fourier series in the L^2(0, 2π) norm. So as expected, the error for p_10(t) is smaller than the error for the Cesaro mean σ_11(t), that is, ‖f − p_10‖ < ‖f − σ_11‖. However, both errors are rather large considering ‖f‖ = 2π/√3 ≈ 3.6276.

To find out how far the Cesaro mean error lags the error for p_10, we used Matlab to find the smallest integer n such that the error for σ_n is smaller than the error for p_10, that is, the smallest integer n satisfying ‖f − σ_n‖ ≤ ‖f − p_10‖. For n = 20, 21 and 22, the norm ‖f − σ_n‖ was respectively given by 0.4444, 0.4338 and 0.4240. Therefore σ_21(t) is the first Cesaro mean whose error ‖f − σ_21‖ is smaller than the error ‖f − p_10‖ for p_10(t).
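These numbers follow directly from (7.12) and (7.17), since |a_k|^2 = 1/k^2 gives k^2 |a_k|^2 = 1. Here is a Python version of this computation (our own sketch, using the same 50000-term truncation as the Matlab code below):

```python
import numpy as np

K = 50000                       # truncation, matching the Matlab code
k = np.arange(1, K)
ak2 = 1.0 / k**2                # |a_k|^2 = 1/k^2 for k != 0 (and a_0 = pi)

def err_p(n):
    # ||f - p_n||^2 = 2 * sum_{k > n} |a_k|^2, see (7.12)
    return 2 * ak2[n:].sum()

def err_sigma(n):
    # ||f - sigma_{n+1}||^2 = ||f - p_n||^2 + 2 * sum_{k=1}^{n} k^2 |a_k|^2 / (n+1)^2,
    # see (7.17); here k^2 |a_k|^2 = 1, so the second sum is 2n/(n+1)^2
    return err_p(n) + 2 * n / (n + 1) ** 2

print(round(float(np.sqrt(err_p(10))), 4))      # 0.4362, the error for p_10
print(round(float(np.sqrt(err_sigma(10))), 4))  # 0.5963, the error for sigma_11
# smallest n with ||f - sigma_n|| <= ||f - p_10||
n = next(m for m in range(1, 500) if err_sigma(m - 1) <= err_p(10))
print(n)                                        # 21
```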

By consulting equation (7.19), we see that

4π^2/3 = (1/2π) ∫_0^{2π} t^2 dt = π^2 + 2 lim_{n→∞} Σ_{k=1}^{n−1} (n − k)^2/(k^2 n^2).

In other words,

π^2/6 = lim_{n→∞} Σ_{k=1}^{n−1} (n − k)^2/(k^2 n^2). (7.26)

As noted earlier, the Cesaro means converge slower than the corresponding Fourier series in the L^2 norm. So it is not surprising that the convergence to π^2/6 in (7.26) is rather slow.
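To see just how slow, the partial sums in (7.26) can be evaluated directly (our own sketch; the gap to π^2/6 appears to decay only like (2 ln n)/n):

```python
import math

def s(n):
    # partial sum in (7.26): sum_{k=1}^{n-1} (n - k)^2 / (k^2 n^2)
    return sum(((n - k) / (n * k)) ** 2 for k in range(1, n))

target = math.pi ** 2 / 6
for n in (10, 100, 1000):
    print(n, round(target - s(n), 5))   # the gap shrinks, but only slowly
```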

The Matlab commands we used to generate Figure 2.11 and the errors computed in (7.25) are given by (the trailing values in the comments are the displayed outputs):

t=linspace(0,6.5*pi,2^17); f=pi*sawtooth(t)+pi;
a=[pi, -i./(1:50000-1)];
p=a(1); for k=1:10; p=p-2*sin(k*t)/k; end
n=11; u=a(1); for k=1:n; u=u+(n-k)*2*real(a(k+1)*exp(-i*k*t))/n; end;
plot(t,f); hold on; plot(t,p); plot(t,u,'g'); grid
xlabel('t')
title('The graph for f(t) and p_{10}(t) and the 11-th Cesaro mean')
print('Cerslowt','-depsc')
ep=sqrt(2)*norm(a(12:50000))                       % 0.4362
eu=sqrt(ep^2+2*norm((0:10).*a(1:11))^2/(11^2))     % 0.5963
for k=1:500;
  e(k+1)=2*norm(a(k+2:50000))^2+2*norm((0:k).*a(1:k+1))^2/((k+1)^2);
  e(k+1)=sqrt(e(k+1));
end
e(11)       % 0.5963
e(20:22)    % 0.4444  0.4338  0.4240

Figure 2.12 plots the sawtooth function f(t) with the Cesaro means σ_21(t) and σ_2001(t) on the same graph. Recall that Figure 2.8 graphs the partial Fourier series p_20(t) and p_2000(t) for f(t). By comparing Figure 2.12 with Figure 2.8, we see that the Cesaro means eliminate the oscillations and overshoot present in the Fourier series approximations p_20(t) and p_2000(t) of f(t). Figure 2.10 plots p_12000(t), and it still overshoots at 2πj. Finally, it is noted that the Cesaro means never really exhibit any overshoot; see Figure 2.12. This is partially due to the maximal property of the Cesaro means discussed later.


Figure 2.12: The graph of f(t) with the Cesaro means σ21(t) and σ2001(t).

A continuous periodic function example

For another example demonstrating that the Cesaro means converge slower than the Fourier series in the L^2 norm, consider the function f in L^2(0, 2π) defined by

f(t) = sin(t) if 0 ≤ t ≤ π
= 0 if π ≤ t ≤ 2π.


In this case, the Fourier series for f(t) is given by

f(t) = 1/π + i e^{−it}/4 − i e^{it}/4 − (1/π) Σ_{k≠0} e^{−2ikt}/(4k^2 − 1)
= 1/π + sin(t)/2 − (2/π) Σ_{k=1}^∞ cos(2kt)/(4k^2 − 1) (for 0 ≤ t ≤ 2π). (7.27)
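The coefficients in (7.27) can be checked by numerical quadrature. The following sketch (our own, using the notes' convention a_k = (1/2π) ∫_0^{2π} e^{ikt} f(t) dt for the series Σ a_k e^{−ikt}) confirms a few of them:

```python
import numpy as np

N = 400000
t = np.linspace(0, 2 * np.pi, N, endpoint=False)
dt = 2 * np.pi / N
f = np.where(t <= np.pi, np.sin(t), 0.0)   # the half-wave rectified sine above

def a(k):
    # a_k = (1/2pi) * integral of e^{ikt} f(t) over [0, 2pi], Riemann sum on a periodic grid
    return (f * np.exp(1j * k * t)).sum() * dt / (2 * np.pi)

assert np.isclose(a(0), 1 / np.pi)                 # constant term
assert np.isclose(a(1), 1j / 4)                    # coefficient of e^{-it}
assert np.isclose(a(-1), -1j / 4)                  # coefficient of e^{it}
for k in (1, 2, 3):                                # even harmonics
    assert np.isclose(a(2 * k), -1 / (np.pi * (4 * k**2 - 1)))
assert np.isclose(a(3), 0, atol=1e-6)              # odd harmonics beyond k = 1 vanish
print("coefficients of (7.27) verified")
```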

Notice that f(t) is a continuous function satisfying f(0) = f(2π). So the Gibbs phenomenon does not play a role in this example. Moreover, f(t) is in the Wiener algebra W_{2π}. Thus both p_n(t) and σ_n(t) converge uniformly to f(t) over the interval [0, 2π]; see Theorem 2.1.6. Therefore our function f(t) is well behaved and there are no convergence issues surrounding f(t). The graph of f(t), the partial Fourier series p_10(t) and the Cesaro mean σ_11(t) are presented in Figure 2.13. It is noted that p_10(t) is approximately equal to f(t) while σ_11(t) is still "far" from f(t). It is left to the reader as an exercise to derive the Fourier series for f(t) and plot p_10(t) and σ_11(t).


Figure 2.13: The graph of f(t), p10(t) and σ11(t) for 0 ≤ t ≤ 2π.

We also computed the errors ‖f − p_10‖ and ‖f − σ_11‖ using (7.12) and (7.17), respectively. The norm of our function f(t) is given by

‖f‖ = √( (1/2π) ∫_0^{2π} |f(t)|^2 dt ) = √( (1/2π) ∫_0^π sin(t)^2 dt ) = 1/2.

By employing Matlab with the Fourier coefficients in (7.27), we discovered that

‖f − p_10‖ = √( (1/2π) ∫_0^{2π} |f(t) − p_10(t)|^2 dt ) ≈ 0.0050,

‖f − σ_11‖ = √( (1/2π) ∫_0^{2π} |f(t) − σ_11(t)|^2 dt ) ≈ 0.0449.


Because the Cesaro means converge slower than the Fourier series in the L^2(0, 2π) norm, it is not surprising that ‖f − p_10‖ < ‖f − σ_11‖. However, in this case, the error ‖f − p_10‖ for p_10(t) is significantly smaller than the error ‖f − σ_11‖ for σ_11(t). In other words, the partial Fourier series p_10(t) is a much better approximation of f(t) than σ_11(t). Since f(t) is a continuous function, there is no Gibbs phenomenon, and thus, there are no oscillations for the Cesaro means to help smooth out.

To complete this example, we wanted to see how far the Cesaro means lag p_10(t). So using Matlab with (7.17), we discovered that the smallest integer n satisfying ‖f − σ_n‖ ≤ ‖f − p_10‖ is n = 100. In other words, ‖f − σ_100‖ ≈ 0.0050. So in this example, one has to implement the 100-th Cesaro mean σ_100(t) to obtain roughly the same approximation of f(t) as the 10-th partial Fourier series p_10(t). We also computed ‖p_10 − σ_100‖ ≈ 0.0065. Finally, it is noted that one can use fast Fourier transform methods, discussed in Chapter 3, to simplify many of these computations. In fact, relying on fast Fourier transform techniques, Problem 7 in Section 3.2.4 of Chapter 3 presents another example demonstrating that the Cesaro means converge slower than the Fourier series in the L^2 norm.

Recall that the Cesaro means converge slower than the Fourier series in the L^2 norm. However, the Cesaro means may converge while the Fourier series diverges on certain sets, even though they both converge in L^2. To gain some further insight as to why the Cesaro means tend to converge slower than the Fourier series, it is instructive to look at the scalar case. As before, let {z_n}_0^∞ be a sequence of complex numbers and m_n = (1/n) Σ_{ν=0}^{n−1} z_ν the arithmetic mean of {z_ν}_0^{n−1}, where m_1 = z_0. For example, if z_n = 100 for n = 0, 1, 2, ..., 9 and zero otherwise, then clearly z_n converges to zero. In fact, z_n equals zero for all n ≥ 10. However, the mean m_n converges to zero and never attains zero. Clearly, the means m_n converge slower than the sequence z_n. The main reason in this case is that the means m_n always include the first part of the sequence {z_n}_0^∞, which slows down the convergence of m_n.

For another more realistic example, consider the case when z_n = λ^n where λ is a complex number in the open unit disc, that is, |λ| < 1. Clearly, z_n = λ^n converges to zero with exponential order. By consulting Lemma 3.3.4 in Chapter 3, we see that

m_n = (1/n) Σ_{ν=0}^{n−1} λ^ν = (1 − λ^n)/(n(1 − λ)).

Hence z_n = λ^n converges to zero with exponential order, while m_n converges to zero on the order of 1/n, which is slower. This follows from the fact that nλ^n converges to zero. Furthermore, if z_n = Σ_j α_j λ_j^n where {λ_j} is a finite set of complex numbers in the open unit disc, then z_n converges exponentially to zero, while m_n converges to zero on the order of 1/n.
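A numerical sketch of this calculation (our own; λ is an arbitrary point in the open unit disc):

```python
import numpy as np

lam = 0.9 * np.exp(0.5j)                 # arbitrary |lam| < 1
N = 500
z = lam ** np.arange(N)                  # z_n = lam^n decays exponentially
m = np.cumsum(z) / np.arange(1, N + 1)   # arithmetic means m_n

# closed form m_n = (1 - lam^n) / (n (1 - lam)) from the geometric series
n = np.arange(1, N + 1)
assert np.allclose(m, (1 - lam ** n) / (n * (1 - lam)))
# the means decay only like 1/n: n * m_n approaches 1/(1 - lam)
print(abs(N * m[-1] - 1 / (1 - lam)))    # essentially zero
```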

Therefore, roughly speaking, the Cesaro means tend to converge slower than the Fourier series, simply because it is an inherent property that m_n converges slower than z_n in many instances.

2.7.2 The Dirac delta function

Before we proceed to the Dirichlet and Fejer kernels, let us present a short introduction to Dirac delta functions. A Dirac delta function δ(t) is a generalized function with area one,



Figure 2.14: The Dirac delta function δ(t) and Dirac comb.

infinity at the origin, and zero everywhere except at t = 0. There are many different ways to generate a Dirac delta function δ(t). For example, let δ_ε(t) for ε > 0 be the function defined by

δ_ε(t) = 1/ε if |t| ≤ ε/2
= 0 otherwise. (7.28)

Notice that δ_ε(t) has area one and is zero outside of [−ε/2, ε/2]. Formally, δ(t) = lim_{ε→0} δ_ε(t). In probability theory δ_ε is the uniform density function over the interval [−ε/2, ε/2]. Another way to generate a Dirac delta function is to use the Gaussian density, that is, δ(t) = lim_{ε→0} (1/(√(2π) ε)) e^{−t^2/(2ε^2)}.

The Dirac delta function defined by (7.28) or the Gaussian density is symmetric about the y axis. (In many applications a "one sided" Dirac delta function is used. For instance, set δ_ε(t) = 1/ε for 0 ≤ t ≤ ε and zero otherwise. Then δ(t) = lim_{ε→0} δ_ε(t), formally.)

The graph of the Dirac delta function δ(t) is represented by an arrow at the origin of height 1; see the first plot in Figure 2.14. The graph of aδ(t − t◦) is represented by an arrow at t◦ with a height of a, pointing north if a > 0 and south if a < 0. (The stem command in Matlab was used to plot the Dirac delta function.)

If h is a continuous function, then

h(t) = ∫_{−∞}^∞ δ(t − x) h(x) dx = ∫_{−∞}^∞ δ(x) h(t − x) dx. (7.29)

The fact that δ(t − x) picks out the value h(t) is called the sifting or sampling property of the Dirac delta function. The previous integral is defined by taking the limit:

∫_{−∞}^∞ δ(t − x) h(x) dx = lim_{ε→0} ∫_{−∞}^∞ δ_ε(t − x) h(x) dx.


To verify that (7.29) holds for continuous functions h(t), let g(t) = ∫ h(x) dx be an antiderivative of h(t). Then using the definition of the derivative, we have

∫_{−∞}^∞ δ(t − x) h(x) dx = lim_{ε→0} ∫_{−∞}^∞ δ_ε(t − x) h(x) dx = lim_{ε→0} (1/ε) ∫_{t−ε/2}^{t+ε/2} h(x) dx
= lim_{ε→0} (g(t + ε/2) − g(t − ε/2))/ε = dg/dt = h(t).

This yields (7.29). It is noted that (1/ε) ∫_{t−ε/2}^{t+ε/2} h(x) dx is simply the average of h over the interval [t − ε/2, t + ε/2], and thus, (1/ε) ∫_{t−ε/2}^{t+ε/2} h(x) dx converges to h(t); see also the mean value theorem.
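The limiting local average above is easy to watch numerically. The sketch below (our own, with h(t) = cos(t) and the box function δ_ε from (7.28)) shows the sifting error shrinking with ε:

```python
import numpy as np

def sift(h, t, eps, num=100001):
    # (1/eps) * integral of h over [t - eps/2, t + eps/2] by the midpoint rule;
    # this approximates the integral of delta_eps(t - x) h(x) for the box delta_eps
    dx = eps / num
    x = t - eps / 2 + dx * (np.arange(num) + 0.5)
    return h(x).sum() * dx / eps

errs = [abs(sift(np.cos, 0.7, eps) - np.cos(0.7)) for eps in (1.0, 0.1, 0.01)]
print(errs)   # decreases toward zero: the local average converges to h(t)
```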

Let g(t) be a piecewise smooth function. The formal or generalized derivative of g contains Dirac delta functions at the points of discontinuity {t_j} of g(t). The delta contribution to the derivative of g(t) at t_j is cδ(t − t_j) where c = g(t_j+) − g(t_j−). In particular, |c| = |g(t_j+) − g(t_j−)| is the distance between g(t_j+) and g(t_j−).

For example, let I_{a,b} be the function which is 1 for a < t < b and zero otherwise. (Whether the endpoints are included does not matter; one can just as easily use strict or non-strict inequalities.) Consider the function

g(t) = 2 I_{−1,1} + (3t^2 + 4t − 1) I_{1,2} + 2(t − 3)^2 I_{3,4}.

The generalized derivative ġ of g from the perspective of Dirac delta functions is given by

ġ(t) = 2δ(t + 1) + 4δ(t − 1) + (6t + 4) I_{1,2} − 19δ(t − 2) + 4(t − 3) I_{3,4} − 2δ(t − 4).

To formally see why the derivative of a function g(t) at its discontinuous points {t_j} leads to a Dirac delta function, observe that the definition of the derivative (formally) yields

ġ(t_j) = lim_{ε→0} (g(t_j + ε/2) − g(t_j − ε/2))/ε = lim_{ε→0} c δ_ε(t − t_j) = c δ(t − t_j).

Hence the generalized derivative at t_j is cδ(t − t_j) where c = g(t_j+) − g(t_j−).
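In other words, each delta weight is simply the jump of g. A small sketch (our own) recovers the weights in the example above numerically:

```python
def g(t):
    # the piecewise example from the text (values at the break points are immaterial)
    if -1 < t < 1:
        return 2.0
    if 1 < t < 2:
        return 3 * t**2 + 4 * t - 1
    if 3 < t < 4:
        return 2 * (t - 3) ** 2
    return 0.0

def jump(t0, h=1e-9):
    # c = g(t0+) - g(t0-), the weight of the Dirac delta at t0
    return g(t0 + h) - g(t0 - h)

for t0 in (-1, 1, 2, 3, 4):
    print(t0, round(jump(t0), 6))   # 2.0, 4.0, -19.0, 0.0 (no delta at 3), -2.0
```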

To complete this section, let us define the Dirac comb generalized function. The Dirac comb with period τ, denoted by Δ(t), is a τ periodic Dirac delta function, that is,

Δ(t) = Σ_{j=−∞}^∞ δ(t − jτ). (7.30)

The graph of the Dirac comb with period 1 is presented in the lower plot of Figure 2.14. Finally, it is noted that the theory of Dirac delta functions is based on a shaky mathematical foundation and should be used with caution. However, Dirac delta functions provide one with some valuable insight. For some further results see the Dirac delta and Dirac comb Wikipedia webpages.


2.7.3 Dirichlet and Fejer kernels

In this section, we will introduce the Dirichlet and Fejer kernels. This will allow us to express the n-th partial Fourier series p_n(t) and the n-th Cesaro mean σ_n(t) in terms of a convolution integral. For further results on Dirichlet and Fejer kernels see [5, 20, 40].

As before, assume that p_n(t) = Σ_{k=−n}^n a_k e^{−ikω_0 t} is the n-th partial Fourier series for a function f in L^2(0, τ) where the fundamental frequency ω_0 = 2π/τ. First, let us concentrate on computing p_n(t). Using a_k = (1/τ) ∫_0^τ f(t) e^{ikω_0 t} dt, we obtain

p_n(t) = Σ_{k=−n}^n a_k e^{−iω_0 kt} = Σ_{k=−n}^n (e^{−iω_0 kt}/τ) ∫_0^τ f(x) e^{iω_0 kx} dx.

By interchanging the summation and the integral, we arrive at

p_n(t) = (1/τ) ∫_0^τ f(x) ( Σ_{k=−n}^n e^{−iω_0 k(t−x)} ) dx. (7.31)

The Dirichlet kernel D_n(t) with period τ is defined by

D_n(t) = Σ_{k=−n}^n e^{−ikω_0 t} = 1 + 2 Σ_{k=1}^n cos(kω_0 t). (7.32)

It is noted that D_0(t) = 1. Using the Dirichlet kernel D_n(t) in (7.31), we see that the n-th partial Fourier series p_n(t) is given by the following convolution formula:

p_n(t) = (1/τ) ∫_0^τ f(x) D_n(t − x) dx. (7.33)

To derive a formula for the Dirichlet kernel recall that

Σ_{k=0}^n z^k = (1 − z^{n+1})/(1 − z) (for z ∈ C and z ≠ 1); (7.34)

see Lemma 3.3.4 in Chapter 3. Let us set z = e^{iω_0 t}. Because z is on the unit circle, z\overline{z} = 1 and \overline{z} = 1/z. Assume that z ≠ 1, or equivalently, t ≠ τj for all integers j. Then the Dirichlet


kernel

D_n(t) = Σ_{k=−n}^n e^{−iω_0 kt} = Σ_{k=−n}^n z^k = z^{−n} Σ_{k=0}^{2n} z^k
= z^{−n}(1 − z^{2n+1})/(1 − z)
= (z^{−n} − z^{n+1})(1 − \overline{z}) / ((1 − z)(1 − \overline{z}))
= (\overline{z}^n + z^n − z^{n+1} − \overline{z}^{n+1}) / |1 − z|^2
= (e^{inω_0 t} + e^{−inω_0 t} − e^{i(n+1)ω_0 t} − e^{−i(n+1)ω_0 t}) / ((1 − cos(ω_0 t))^2 + sin(ω_0 t)^2)
= (2 cos(nω_0 t) − 2 cos((n + 1)ω_0 t)) / (2 − 2 cos(ω_0 t)). (7.35)

In other words, the Dirichlet kernel

D_n(t) = Σ_{k=−n}^n e^{−iω_0 kt} = (cos(nω_0 t) − cos((n + 1)ω_0 t)) / (1 − cos(ω_0 t)) = sin((n + 1/2)ω_0 t) / sin(ω_0 t/2). (7.36)

To obtain the last equality, observe that the fourth equality in the previous calculation (7.35) yields

Dn(t) = z^{−n}(1 − z^{2n+1})/(1 − z) = (z^{n+1} − z̄^n)/(z − 1) = z̄^{1/2}(z^{n+1} − z̄^n) / (z̄^{1/2}(z − 1))
      = (z^{n+1/2} − z̄^{n+1/2}) / (z^{1/2} − z̄^{1/2}) = sin((n + 1/2)ω0t) / sin(ω0t/2).

The Dirichlet kernel Dn(t) is a trigonometric polynomial of degree n with period τ. The Dirichlet kernels are symmetric about the y axis. In other words, Dn(t) = Dn(−t) for all integers n ≥ 0. Moreover,

(1/τ) ∫_0^τ Dn(t)dt = 1    and    Dn(τj) = 1 + 2n    for all integers j.    (7.37)
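The closed form (7.36) and the properties (7.37) are easy to check numerically. The text's plots use Matlab; the following is an analogous, hypothetical NumPy sketch (the function names are ours, not from the text):

```python
import numpy as np

def dirichlet_direct(n, t, w0=1.0):
    """D_n(t) = sum over k = -n..n of e^{-i k w0 t}; the sum is real valued."""
    k = np.arange(-n, n + 1)
    return np.exp(-1j * w0 * np.outer(np.atleast_1d(t), k)).sum(axis=1).real

def dirichlet_closed(n, t, w0=1.0):
    """Closed form D_n(t) = sin((n + 1/2) w0 t) / sin(w0 t / 2), t != tau*j."""
    t = np.atleast_1d(t)
    return np.sin((n + 0.5) * w0 * t) / np.sin(w0 * t / 2)

n = 10
t = np.linspace(0.05, 2 * np.pi - 0.05, 400)     # stay away from t = tau*j
closed_matches_sum = bool(np.allclose(dirichlet_direct(n, t),
                                      dirichlet_closed(n, t)))

# Properties (7.37): the mean over one period is 1, and D_n(0) = 2n + 1.
grid = np.linspace(0.0, 2 * np.pi, 4096, endpoint=False)
mean_over_period = dirichlet_direct(n, grid).mean()
value_at_zero = dirichlet_direct(n, 0.0)[0]
```

Because Dn is a trigonometric polynomial, the uniform-grid mean reproduces (1/τ)∫Dn exactly once the grid has more than 2n points.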

It is emphasized that the Dirichlet kernels are not well behaved. First, they oscillate wildly. For large n the Dirichlet kernel Dn(t) behaves like a high frequency sine wave sin((n + 1/2)ω0t) with an amplitude of 1/sin(ω0t/2) in a neighborhood of t. Since sin(ω0t/2) attains its maximum value at t = τ/2 in the interval [0, τ], for large n the Dirichlet kernels oscillate rapidly with a minimum amplitude of 1. Furthermore, Dn(t) has 2n evenly spaced zeros located at t = τk/(2n + 1) for k = 1, 2, · · · , 2n. Finally, it is noted that the Dirichlet kernels oscillate so fast that the area under |Dn(t)| diverges, that is,

∞ = lim_{n→∞} ∫_0^τ |Dn(t)|dt.

To fix this behavior we will use the Cesaro mean to introduce the Fejer kernels.


2.7. THE CESARO MEAN 97

Dirichlet kernels and Dirac delta functions. In this paragraph, we assume that the reader is familiar with some of the properties of the Dirac delta function δ(t) discussed in Section 2.7.2. The theory of Dirac delta functions is based on a shaky mathematical foundation, and thus, we will use the Dirac delta function with some degree of caution.

The Dirichlet kernels Dn(t) formally converge to τδ(t) in some neighborhood of the origin, where δ(t) is the Dirac delta function. Because Dn(t) has period τ, the Dirichlet kernels formally converge to the Dirac comb τΔ(t) with period τ, that is,

τΔ(t) = D∞(t) = 1 + 2 ∑_{k=1}^{∞} cos(kω0t)    (formally).    (7.38)

(Formally means that we will write out the series even though it may not converge.) To see that D∞(t) = τΔ(t), recall that ∫ δ(t)h(t)dt = h(0); see the sampling property of δ(t) in (7.29). The Fourier coefficients of e^{−ikω0t} for the Dirac delta function δ(t) are given by

1/τ = (1/τ) ∫_0^τ δ(t)e^{ikω0t}dt    (for all integers k).    (7.39)

(Here we assume that the integral includes the Dirac delta function δ(t).) Therefore Dn(t)/τ is the n-th partial Fourier series for δ(t); see (7.32). Since Dn(t) has period τ, formally D∞(t) = τ ∑_{j=−∞}^{∞} δ(t − τj) = τΔ(t), where Δ(t) is the Dirac comb with period τ.

It is noted that the Dirac delta function δ(t) is not a function in L2(0, τ). Hence functions containing Dirac delta functions are not in L2(0, τ), and thus, their Fourier coefficients are not square summable. Finally, δ(t) is formally in L1.

The Fejer kernels.

Now let us return to the Cesaro mean. Recall that the n-th Cesaro mean σn(t) for a function f in L2(0, τ) is the average of its partial Fourier series {pν(t)}_{ν=0}^{n−1}. By consulting the convolution formula for pν(t) in (7.33), we obtain

σn(t) = (1/n) ∑_{ν=0}^{n−1} pν(t) = (1/n) ∑_{ν=0}^{n−1} (1/τ) ∫_0^τ f(x)Dν(t−x)dx = (1/τ) ∫_0^τ f(x) ( (1/n) ∑_{ν=0}^{n−1} Dν(t−x) ) dx.

Now let Kn(t) = (1/n) ∑_{ν=0}^{n−1} Dν(t) be the n-th Cesaro mean of ∑_{k=−∞}^{∞} e^{−iω0kt}, that is,

Kn(t) = (1/n) ∑_{ν=0}^{n−1} Dν(t) = (1/n) ∑_{ν=0}^{n−1} ∑_{k=−ν}^{ν} e^{−iω0kt}
      = ∑_{|k|<n} ((n − |k|)/n) e^{−ikω0t} = 1 + 2 ∑_{k=1}^{n−1} ((n − k)/n) cos(kω0t).    (7.40)

Notice that K1(t) = D0(t) = 1. The functions {Kn(t)}_{n=1}^{∞} are called the Fejer kernels with period τ. Using Kn(t) with (7.33), the n-th Cesaro mean σn(t) for f(t) is given by the following convolution formula:

σn(t) = (1/τ) ∫_0^τ f(x)Kn(t−x)dx.    (7.41)


Let us derive a formula for the Fejer kernels {Kn(t)}_{n=1}^{∞}. By consulting (7.36), we see that the n-th Fejer kernel

Kn(t) = (1/n) ∑_{ν=0}^{n−1} Dν(t) = (1/n) ∑_{ν=0}^{n−1} (cos(νω0t) − cos((ν + 1)ω0t)) / (1 − cos(ω0t)).

The sum is a telescoping series, that is, a sum whose intermediate terms cancel. Hence the n-th Fejer kernel is given by

Kn(t) = (1 − cos(nω0t)) / (n(1 − cos(ω0t))) = sin(nω0t/2)^2 / (n sin(ω0t/2)^2).    (7.42)
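Equation (7.42) can be sanity-checked against the defining average (7.40) of Dirichlet kernels. A hypothetical NumPy sketch (with ω0 = 1, so τ = 2π; the helper names are ours):

```python
import numpy as np

def dirichlet(n, t):
    # D_n(t) = sum over k = -n..n of e^{-ikt}; real valued.
    k = np.arange(-n, n + 1)
    return np.exp(-1j * np.outer(np.atleast_1d(t), k)).sum(axis=1).real

def fejer_closed(n, t):
    """Closed form (7.42): K_n(t) = (1 - cos(nt)) / (n (1 - cos t))."""
    t = np.atleast_1d(t)
    return (1 - np.cos(n * t)) / (n * (1 - np.cos(t)))

n = 8
t = np.linspace(0.05, 2 * np.pi - 0.05, 300)   # avoid t = 2*pi*j
cesaro_average = sum(dirichlet(nu, t) for nu in range(n)) / n
matches = bool(np.allclose(cesaro_average, fejer_closed(n, t)))
nonnegative = bool((fejer_closed(n, t) >= 0).all())
```

The nonnegativity check reflects the key advantage of Kn over Dn noted below: the Fejer kernel never goes negative.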

In particular, the Fejer kernels {Kn(t)}_{n=1}^{∞} are symmetric about the y axis (Kn(t) = Kn(−t)) and positive, that is, Kn(t) ≥ 0 for all integers n ≥ 1. Summing up, we see that the n-th Cesaro mean σn(t) for f(t) in L2(0, τ) is given by the convolution formula

σn(t) = (1/τ) ∫_0^τ f(x)Kn(t−x)dx,    (7.43)

where ω0 = 2π/τ and Kn(t) is defined in (7.42). It is emphasized that the Fejer kernels {Kn}_{n=1}^{∞} have three important properties:

1 = (1/τ) ∫_0^τ Kn(t)dt    for all integers n ≥ 1    (7.44)
n = Kn(τj)    for all integers j    (7.45)
0 = lim_{n→∞} Kn(t)    for all t ≠ τj where j is an integer.    (7.46)

The first equation (7.44) follows by using the formula for the Fejer kernel in (7.40), that is,

(1/τ) ∫_0^τ Kn(t)dt = (1/n) ∑_{ν=0}^{n−1} (1/τ) ∫_0^τ Dν(t)dt = (1/n) ∑_{ν=0}^{n−1} 1 = 1.

To obtain the second equation apply L'Hospital's rule twice to the first equation in (7.42). The last equation is a simple consequence of the fact that when t ≠ τj the denominator n(1 − cos(ω0t)) of the n-th Fejer kernel is nonzero and grows without bound as n → ∞, while the numerator 1 − cos(nω0t) is bounded by 2.
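Properties (7.45) and (7.46) can be illustrated numerically from the closed form (7.42): near t = 0 the kernel value approaches n, while for a fixed t ≠ τj it decays like 1/n. A small Python sketch (our own helper, with ω0 = 1):

```python
import numpy as np

def fejer(n, t):
    # Closed form (7.42) with w0 = 1: K_n(t) = (1 - cos(nt)) / (n (1 - cos t)).
    return (1 - np.cos(n * t)) / (n * (1 - np.cos(t)))

near_zero = fejer(9, 1e-4)     # (7.45): K_n(t) -> n as t -> 0 (here n = 9)
t0 = 1.3                       # a fixed point with t0 != 2*pi*j
ns = (10, 100, 1000)
decay = [fejer(n, t0) for n in ns]       # (7.46): tends to 0 as n grows
# Since 1 - cos(n t0) <= 2, we always have K_n(t0) <= 2 / (n (1 - cos t0)).
bound_ok = all(d <= 2 / (n * (1 - np.cos(t0))) + 1e-12
               for d, n in zip(decay, ns))
```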

Let us emphasize that when f(t) is in L2(0, 2π), then ω0 = 1 and the formulas for the Cesaro means reduce to

σn(t) = (1/2π) ∫_0^{2π} f(x)Kn(t−x)dx    (when f ∈ L2(0, 2π))
Kn(t) = (1 − cos(nt)) / (n(1 − cos(t))) = sin(nt/2)^2 / (n sin(t/2)^2)    (7.47)
Kn(t) = (1/n) ∑_{ν=0}^{n−1} ∑_{k=−ν}^{ν} e^{−ikt} = ∑_{|k|<n} ((n − |k|)/n) e^{−ikt}    (the n-th Cesaro mean of ∑_{k=−∞}^{∞} e^{−ikt}).
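As a numerical cross-check of the convolution form, one can compare (7.47) against a coefficient form of the Cesaro mean for a concrete function. Here we take f(t) = t, whose Cesaro mean π − 2∑_{k=1}^{n−1}(n − k) sin(kt)/(nk) is quoted later in this section from (7.24); the Riemann-sum convolution below is our own approximation, not from the text:

```python
import numpy as np

def fejer(n, t):
    # Fejer kernel for tau = 2*pi, written as the coefficient sum (7.40).
    t = np.atleast_1d(t)
    k = np.arange(1, n)
    return 1 + 2 * np.sum((n - k) / n * np.cos(np.outer(t, k)), axis=1)

n = 12
x = np.linspace(0.0, 2 * np.pi, 4000, endpoint=False)
dx = x[1] - x[0]
f = x.copy()                               # f(t) = t on (0, 2*pi)

t_eval = np.array([1.0, 2.5, 4.0])
# Convolution form (7.43)/(7.47), approximated by a Riemann sum.
sigma_conv = np.array([(f * fejer(n, t - x)).sum() * dx / (2 * np.pi)
                       for t in t_eval])
# Coefficient form of the Cesaro mean for f(t) = t; see (7.24).
k = np.arange(1, n)
sigma_coef = np.pi - 2 * np.sum((n - k) / (n * k) * np.sin(np.outer(t_eval, k)),
                                axis=1)
max_gap = float(np.abs(sigma_conv - sigma_coef).max())
```

With 4000 sample points the two forms agree to well under 10^-3, which is all a Riemann sum promises here.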

For further results on the Fejer kernel see [5, 20, 40] and the Fejer kernel webpage in Wikipedia.


A maximal property: ‖σn‖∞ ≤ ‖f‖∞. The results in this paragraph are motivated by Problem 2, page 43, in Hoffman [20]. Let f be a (Lebesgue measurable) function over the interval [0, τ]. Then the L∞(0, τ) norm ‖f‖∞ of f is the supremum or least upper bound of |f(t)| ignoring sets of measure zero. To be precise,

‖f‖∞ = ess sup{|f(t)| : 0 ≤ t ≤ τ}.

If f(t) is a continuous function over [0, τ], then ‖f‖∞ equals the maximum value of |f(t)| over the interval [0, τ], that is, ‖f‖∞ = max{|f(t)| : 0 ≤ t ≤ τ}. Because the n-th Cesaro mean σn(t) is a τ periodic continuous function, ‖σn‖∞ = max{|σn(t)| : 0 ≤ t ≤ τ}. For several examples of the L∞ norm: ‖1 − 2t^3‖_{L∞(0,1)} = 1, ‖1 − 2t^3‖_{L∞(0,2)} = 15, and ‖3te^{it}‖_{L∞(0,4)} = 12.

As before, let σn(t) be the n-th Cesaro mean for a function f in L2(0, τ). Then σn(t) has the following maximal property:

‖σn‖∞ ≤ ‖f‖∞.    (7.48)

Moreover, ‖σn‖∞ = ‖f‖∞ if and only if f(t) is a constant function. (If f(t) is a constant function, then f(t) must equal a0 = (1/τ)∫_0^τ f(t)dt.) The maximal property ‖σn‖∞ ≤ ‖f‖∞ states that the magnitude |σn(t)| ≤ ‖f‖∞ for all t. Moreover, |σn(t)| = ‖f‖∞ for some t if and only if f(t) is a constant function. If ‖f‖∞ = ∞, then the inequality ‖σn‖∞ ≤ ‖f‖∞ is meaningless.

The maximal property ‖σn‖∞ ≤ ‖f‖∞ can be seen in the graph in Figure 2.11. (We have equality ‖σn‖∞ = ‖f‖∞ if and only if f(t) is a constant function.) The corresponding function f = t is in L2(0, 2π). Because 2π equals the maximum of |t| for t ∈ [0, 2π], we see that ‖f‖∞ = 2π. Using Matlab, we discovered that ‖σ11‖∞ = 5.5134. Clearly, ‖σ11‖∞ < ‖f‖∞. (The Matlab command norm(u,inf) was used to compute the L∞(0, 2π) norm ‖σ11‖∞.) Finally, the maximal property ‖σn‖∞ < ‖f‖∞ = 2π also explains why the Cesaro means eliminated the overshoot in the Gibbs phenomenon: the Cesaro means σn(t) can never reach 2π.

The graph presented in Figure 2.13 also shows that the maximal property ‖σn‖∞ ≤ ‖f‖∞ holds. The corresponding function f in L2(0, 2π) is given by f(t) = sin(t) for 0 ≤ t ≤ π and zero otherwise. Because 1 equals the maximum of |f(t)| over the interval [0, 2π], we see that ‖f‖∞ = 1. In this case, Matlab gives ‖σ11‖∞ = 0.9256. Clearly, ‖σ11‖∞ < ‖f‖∞.
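The Matlab experiment for f(t) = t can be reproduced in any environment. A hypothetical NumPy version, sampling σ11 on a fine grid rather than calling Matlab's norm(u,inf):

```python
import numpy as np

n = 11
t = np.linspace(0.0, 2 * np.pi, 20001)
k = np.arange(1, n)
# Cesaro mean sigma_11 for f(t) = t in L2(0, 2*pi); see (7.24).
sigma = np.pi - 2 * np.sum((n - k) / (n * k) * np.sin(np.outer(t, k)), axis=1)
sigma_inf = float(np.abs(sigma).max())   # grid approximation of ||sigma_11||_inf
f_inf = 2 * np.pi                        # ||f||_inf = 2*pi for f(t) = t
```

The computed sigma_inf stays strictly below 2π, in line with (7.48) and the absence of Gibbs overshoot.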

The proof of the maximal property ‖σn‖∞ ≤ ‖f‖∞ follows from the Cauchy-Schwarz inequality with the weight Kn(t − x), that is,

|σn(t)|^2 = |(1/τ) ∫_0^τ f(x)Kn(t−x)dx|^2 ≤ (1/τ) ∫_0^τ |f(x)|^2 Kn(t−x)dx · (1/τ) ∫_0^τ 1^2 Kn(t−x)dx
          = (1/τ) ∫_0^τ |f(x)|^2 Kn(t−x)dx.

This readily implies that

|σn(t)|^2 ≤ (1/τ) ∫_0^τ |f(x)|^2 Kn(t−x)dx.    (7.49)


Moreover, we have equality if and only if the function f(x) and 1 are linearly dependent. In other words, we have equality if and only if f(t) = a0 is a constant function. Finally, the inequality |σn(t)|^2 ≤ (1/τ)∫_0^τ |f(x)|^2 Kn(t−x)dx shows that |σn(t)|^2 is less than or equal to the n-th Cesaro mean for |f(t)|^2, with equality if and only if f(t) = a0.

To prove that |σn(t)| ≤ ‖f‖∞ simply observe that

|σn(t)|^2 ≤ (1/τ) ∫_0^τ |f(x)|^2 Kn(t−x)dx ≤ (1/τ) ∫_0^τ ‖f‖∞^2 Kn(t−x)dx = (‖f‖∞^2/τ) ∫_0^τ Kn(t−x)dx = ‖f‖∞^2.

Hence |σn(t)| ≤ ‖f‖∞ for all t, or equivalently, ‖σn‖∞ ≤ ‖f‖∞. If |σn(t)| = ‖f‖∞ for some t (or equivalently, ‖σn‖∞ = ‖f‖∞), then we have equality in (7.49), and thus, f(t) = a0 is a constant function. On the other hand, if f(t) = a0, then σn(t) = a0 = f(t) and ‖σn‖∞ = ‖f‖∞. Finally, it is noted that a similar maximal property holds for the arithmetic means for scalars; see Problem 5 in Section 2.7.4.

Fejer kernels and the Dirac delta function. The three properties in (7.44), (7.45) and (7.46) imply that the Fejer kernels Kn(t) with period τ formally converge to τδ(t) in some neighborhood of the origin, that is,

τΔ(t) = K∞(t) = 1 + lim_{n→∞} 2 ∑_{k=1}^{n−1} ((n − k)/n) cos(kω0t)    (formally).    (7.50)

Here Δ(t) = ∑_j δ(t − τj) is the Dirac comb with period τ. As noted earlier, the Fourier coefficients of e^{−ikω0t} for the Dirac delta function δ(t) are given by 1/τ = (1/τ)∫_0^τ δ(t)e^{ikω0t}dt. By consulting (7.2) with ak = 1/τ and (7.40), we see that Kn(t)/τ is the n-th Cesaro mean for δ(t). Because Kn(t) has period τ, formally K∞(t) = τΔ(t).

The derivative of f(t) = t in L2(0, 2π)

It is noted that one has to be careful when taking the derivative of a Fourier series. Let f(t) = ∑_{k=−∞}^{∞} ak e^{−ikω0t} be the Fourier series expansion for a function f in L2(0, τ). Then formally

df/dt = ∑_{k=−∞}^{∞} (d/dt) ak e^{−ikω0t} = ∑_{k=−∞}^{∞} −ikω0 ak e^{−ikω0t}.

However, the corresponding Fourier coefficients {−ikω0ak}_{−∞}^{∞} may not be square summable, that is, it is possible that ∑_{k=−∞}^{∞} ω0^2 |kak|^2 = ∞. In this case, the corresponding Fourier series ∑_{k=−∞}^{∞} −ikω0 ak e^{−ikω0t} for the derivative f′ = df/dt does not define a function in L2(0, τ) and may not converge. If f is continuous and has period τ, then the derivative of the Fourier series is well defined; see [5].

To see what can go wrong, let f(t) = t in L2(0, 2π). Recall that the Fourier series expansion for t is π − 2 ∑_{k=1}^{∞} sin(kt)/k. Clearly, t is not periodic. However, the Fourier series π − 2 ∑_{k=1}^{∞} sin(kt)/k for t is periodic; see Figures 2.8 and 2.10 for a plot of π − 2 ∑_{k=1}^{12000} sin(kt)/k and the corresponding Gibbs phenomenon at t = 2πj where j is an integer. As before, let f(t) also denote the 2π periodic extension of t satisfying f(t) = f(t + 2π) for all t and f(2πj) = 0 for all integers j. To be precise, f(t) = t − 2πj when t ∈ [2πj, 2π(j + 1)). Because f(t) is discontinuous at 2πj, its derivative f′ = df/dt does not exist at t = 2πj for any integer j.

Figure 2.15: The graph of −2 ∑_{k=1}^{20} cos(kt) for −π ≤ t ≤ 5π.

(Formally, this introduces Dirac delta functions at t = 2πj; see Section 2.7.2.) As before, let pn(t) = π − 2 ∑_{k=1}^{n} sin(kt)/k be the n-th partial Fourier series for t. By taking the derivative of f(t) and its n-th partial Fourier series, we arrive at

f′(t) = 1    if t ≠ 2πj where j is an integer
p′n(t) = (d/dt)(π − 2 ∑_{k=1}^{n} sin(kt)/k) = −2 ∑_{k=1}^{n} cos(kt) = −∑_{k=1}^{n} (e^{ikt} + e^{−ikt}).

The function t is not periodic, and its 2π periodic extension f(t) has discontinuities at 2πj where j is an integer; see Figure 2.10. So its derivative should incorporate these discontinuities. The formal Fourier series p′∞ = −2 ∑_{k=1}^{∞} cos(kt) = −∑_{k≠0} e^{−ikt} tries to accommodate this. However, one has immediate problems. First, the Fourier coefficients of e^{−ikt} are ak = −1 for all integers k ≠ 0 and a0 = 0. So {ak}_{−∞}^{∞} is not square summable, that is, ∑_{k=−∞}^{∞} |ak|^2 = ∞. Hence the Fourier series p′∞ = −2 ∑_{k=1}^{∞} cos(kt) is not a function in L2(0, 2π). So we just view p′∞ as formally given by p′∞ = −2 ∑_{k=1}^{∞} cos(kt). By consulting the definition of the Dirichlet kernel Dn(t) in (7.32) with fundamental frequency ω0 = 1, we see that

p′n = 1 − Dn(t)    and    p′∞ = −2 ∑_{k=1}^{∞} cos(kt) = 1 − lim_{n→∞} Dn(t)    (formally).

Recall that the Dirichlet kernels Dn(t) are not well behaved and oscillate rapidly. Hence the approximations p′n(t) = −2 ∑_{k=1}^{n} cos(kt) for the derivative also oscillate rapidly for large n around 1 when t ≠ 2πj. For n = 20, Figure 2.15 graphs p′n(t) = 1 − Dn(t) and 1 on the same graph over the interval [−π, 5π]. It is emphasized that p′n(t) is oscillating around 1 = f′ when t ≠ 2πj, and diverging to −∞ at t = 2πj.

For a Dirac delta function perspective, recall that formally

2πΔ(t) = 1 + 2 ∑_{k=1}^{∞} cos(kt) = D∞(t),

where Δ(t) is the Dirac comb with period 2π; see (7.38). Hence the derivative f′ of f formally equals

f′ = p′∞ = 1 − D∞(t) = 1 − 2πΔ(t).

This is precisely the generalized derivative f′ of f one obtains by applying the Dirac delta analysis in Section 2.7.2 to our periodic function f(t). The minus sign in −2πδ(t − 2πj) corresponds to the fact that −2π = f(2πj+) − f(2πj−) for all integers j.

Now let us use the Cesaro mean to try to make sense of the derivative of the Fourier series for t = π − 2 ∑_{k=1}^{∞} sin(kt)/k. Recall that the n-th Cesaro mean for f(t) = t in L2(0, 2π) is given by

σn(t) = π − 2 ∑_{k=1}^{n−1} (n − k) sin(kt)/(nk);

see (7.24). By taking the derivative and consulting the formula for the Fejer kernel Kn(t) in (7.40) with ω0 = 1, we obtain

σ′n(t) = −2 ∑_{k=1}^{n−1} (n − k) cos(kt)/n = 1 − Kn(t).

In other words, we have

σ′n(t) = 1 − Kn(t) = 1 − (1 − cos(nt)) / (n(1 − cos(t))).    (7.51)
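The identity σ′n(t) = 1 − Kn(t) in (7.51) is a finite trigonometric identity, so it can be verified to machine precision. A small NumPy check (our own sketch, not from the text):

```python
import numpy as np

n = 21
t = np.linspace(0.05, 2 * np.pi - 0.05, 500)   # avoid t = 2*pi*j
k = np.arange(1, n)
# Term-by-term derivative of the Cesaro mean of f(t) = t; see (7.24).
sigma_prime = -2 * np.sum((n - k) / n * np.cos(np.outer(t, k)), axis=1)
# 1 - K_n(t) with the closed form (7.42) of the Fejer kernel (w0 = 1).
one_minus_K = 1 - (1 - np.cos(n * t)) / (n * (1 - np.cos(t)))
gap = float(np.abs(sigma_prime - one_minus_K).max())
```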

Recall that Kn(2πj) = n for all integers j; see (7.45) with ω0 = 1. Hence σ′n(2πj) = 1 − n for all j. By letting n approach infinity, we have

lim_{n→∞} σ′n(t) = 1    if t ≠ 2πj where j is an integer
                 = −∞    otherwise.

This also yields the fact that 1 equals the derivative of the function f(t) = t in (0, 2π). In other words, f′(t) = 1 when t ≠ 2πj. The −∞ corresponds to the fact that the 2π periodic extension f(t) of t is discontinuous at 2πj, and thus, its corresponding derivative f′(t) does not exist at t = 2πj. It is noted that σ′n(2πj) diverges to minus infinity, and the 2π periodic function moves downward from f(2πj−) = 2π to f(2πj+) = 0. Figure 2.16 plots 1 − K21(t) and 1 on the same graph over the interval [−π, 5π]. This is an improvement over the approximation of f′ presented in Figure 2.15 using p′20 = −2 ∑_{k=1}^{20} cos(kt) = 1 − D20(t). Finally, we used 21 for 1 − K21(t) and 20 for 1 − D20(t) because they both employ the same complex exponential functions {e^{−ikt}}_{k=−20}^{20}.

Recall that the derivative of the Fourier series had serious convergence problems. The Cesaro mean approach eliminated this convergence problem and actually converges to the derivative f′(t) of f(t) except at its discontinuities. Finally, it is noted that the Cesaro means also formally yield f′(t) = 1 − K∞(t) = 1 − 2πΔ(t), the generalized derivative f′ for f expressed in terms of delta functions; see Section 2.7.2.

Figure 2.16: The graph of 1 − K21(t) for −π ≤ t ≤ 5π.

A formal Dirac delta function method to compute σ′n(t) from f′(t). To compute the Cesaro mean σ′n(t) for f′(t) = df/dt, let us take the generalized derivative of the 2π periodic extension f(t) of t in L2(0, 2π). Since this approach is formal and on shaky mathematical grounds, we will only sketch the result. First, recall that f(t) is the 2π periodic extension of t such that f(t) = f(t + 2π) for all t. In the notation of Section 2.7.2,

f(t) = ∑_{j=−∞}^{∞} (t − 2πj) I_{[2πj, 2π(j+1))}    and    f′(t) = 1 − ∑_{j=−∞}^{∞} 2πδ(t − 2πj).

Notice that we have −2πδ(t − 2πj) because −2π = f(2πj+) − f(2πj−). It is emphasized that our formal derivative f′(t) should only have one delta function in each 2π period. So formally using the convolution formula in (7.43) and including δ(x) in this integral, we obtain

σ′n(t) = (1/2π) ∫_0^{2π} f′(x)Kn(t−x)dx = (1/2π) ∫_0^{2π} (1 − 2πδ(x))Kn(t−x)dx = 1 − Kn(t).

Here we used the sampling property ∫ δ(x)h(t−x)dx = h(t) of the Dirac delta function; see (7.29). Therefore σ′n(t) = 1 − Kn(t) is the n-th Cesaro mean for f′(t). In a similar fashion, using the Dirichlet convolution formula (7.33), we see that

p′n(t) = (1/2π) ∫_0^{2π} f′(x)Dn(t−x)dx = (1/2π) ∫_0^{2π} (1 − 2πδ(x))Dn(t−x)dx = 1 − Dn(t).

Therefore p′n(t) = 1 − Dn(t) is the n-th partial Fourier series for f′(t).

The case when f is in L2(−μ, μ). Assume that f(t) is a τ periodic function in L2(−μ, μ) where 2μ = τ. Let f(t) = ∑_{k=−∞}^{∞} ak e^{−iω0kt} be the Fourier series expansion for f(t) where ω0 = π/μ is the fundamental frequency; see Section 2.1.2. Since f(t) is a τ periodic function, f(t) is also a function in L2(0, τ). Let Dn(t) be the Dirichlet kernel with period τ, and Kn(t) the Fejer kernel with period τ. Because the integral of a τ periodic function over any interval of length τ is the same, the n-th partial Fourier series pn(t) and Cesaro mean σn(t) are respectively given by

pn(t) = (1/τ) ∫_0^τ f(x)Dn(t−x)dx = (1/2μ) ∫_{−μ}^{μ} f(x)Dn(t−x)dx
σn(t) = (1/τ) ∫_0^τ f(x)Kn(t−x)dx = (1/2μ) ∫_{−μ}^{μ} f(x)Kn(t−x)dx.    (7.52)

If f(t) in L2(−μ, μ) is not τ periodic, then f(t − μ) is in L2(0, τ). Moreover, a change of variable shows that

pn(t) = (1/2μ) ∫_{−μ}^{μ} f(x)Dn(t−x)dx = (1/τ) ∫_0^τ f(x − μ)Dn(t + μ − x)dx
σn(t) = (1/2μ) ∫_{−μ}^{μ} f(x)Kn(t−x)dx = (1/τ) ∫_0^τ f(x − μ)Kn(t + μ − x)dx.

2.7.4 Exercise

Problem 1. Let {zn}_{n=0}^{∞} be a sequence of complex numbers, and mn = (1/n) ∑_{ν=0}^{n−1} zν the mean of {zν}_{ν=0}^{n−1} with m1 = z0. Finally, recall that ∑_{ν=0}^{n} λ^ν = (1 − λ^{n+1})/(1 − λ) when λ ≠ 1.

(i) Let zn = λ^n where λ is a complex number on the unit circle and λ ≠ 1. Then show that the sequence zn does not converge and mn converges to zero. What happens when λ = 1?

(ii) Let zn = ∑_{ν=0}^{n} λ^ν where λ is a complex number such that |λ| < 1. Then show that both zn and mn converge to 1/(1 − λ).

(iii) Let zn = ∑_{ν=0}^{n} λ^ν where λ is a complex number such that |λ| = 1 and λ ≠ 1. Then show that zn does not converge and mn converges to 1/(1 − λ). In particular, if λ = −1, then mn converges to 1/2; this is called the Grandi series, see [5].

(iv) In Part (iii) assume that λ = 1, that is, zn = n + 1 = ∑_{ν=0}^{n} 1. Then show that both zn and mn diverge. Hint: ∑_{k=1}^{n} k = n(n + 1)/2.
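Parts (i) and (iii) of Problem 1 can be explored numerically before proving them. This hypothetical Python sketch takes λ = −1, which produces the Grandi series in part (iii):

```python
import numpy as np

def running_means(z):
    """m_n = (1/n) * sum of z_0, ..., z_{n-1}, for n = 1, ..., len(z)."""
    z = np.asarray(z, dtype=complex)
    return np.cumsum(z) / np.arange(1, len(z) + 1)

N = 100000                        # an even number of terms
lam = -1.0                        # on the unit circle, lam != 1
z = lam ** np.arange(N)           # part (i): 1, -1, 1, -1, ...
m_last = running_means(z)[-1]     # should be near 0

partial_sums = np.cumsum(z)       # part (iii): 1, 0, 1, 0, ...
grandi = running_means(partial_sums)[-1]   # Cesaro sum of the Grandi series
```

The sequences z and partial_sums both fail to converge, yet their running means settle at 0 and 1/2 = 1/(1 − λ) respectively.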


Problem 2. Let Dn(t) be the Dirichlet kernel with period τ and Kn(t) be the Fejer kernel with period τ. Then show that

(1/τ) ∫_0^τ |Dn(t)|^2 dt = 1 + 2n    and    (1/τ) ∫_0^τ |Kn(t)|^2 dt = 2n/3 + 1/(3n).    (7.53)

Hint: ∑_{k=1}^{n} k^2 = n(n + 1)(2n + 1)/6.
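By Parseval's equality, the two integrals in (7.53) equal the sums of the squared Fourier coefficients of Dn and Kn, so (7.53) can be confirmed exactly with rational arithmetic. A Python sketch of that check (a sanity test, not a proof):

```python
from fractions import Fraction

def dirichlet_energy(n):
    # Parseval: (1/tau) * int |D_n|^2 dt = sum of squared coefficients (all 1).
    return sum(1 for k in range(-n, n + 1))

def fejer_energy(n):
    # The coefficients of K_n are (n - |k|)/n for |k| < n; see (7.40).
    return sum(Fraction(n - abs(k), n) ** 2 for k in range(-(n - 1), n))

n = 7
d_energy = dirichlet_energy(n)                    # expect 1 + 2n
k_energy = fejer_energy(n)                        # expect 2n/3 + 1/(3n)
expected = Fraction(2 * n, 3) + Fraction(1, 3 * n)
```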

Problem 3. Use Euler’s formula eiθ = cos(θ) + i sin(θ) to prove the following results.

(i) Show that 1− cos(2θ) = 2 sin(θ)2.

(ii) Show that

cos(θ)− cos(φ) = −2 sin

(θ + φ

2

)sin

(θ − φ

2

).

(iii) Use Parts (i) and (ii) to directly verify the second equality in the following formula forthe Dirichlet kernel (see also (7.36))

Dn(t) =cos(nω0t)− cos((n+ 1)ω0t)

1− cos(ω0t)=

sin((n + 1/2)ω0t)

sin(ω0t/2).

Problem 4.

(i) Plot the Dirichlet kernel Dn(t) in Matlab with a fundamental frequency of ω0 = 1 for n = 10, 20, 30 over the interval [−π, 5π]; see (7.36).

(ii) Plot the Fejer kernel Kn(t) in Matlab with a fundamental frequency of ω0 = 1 for n = 10, 20, 30 over the interval [−π, 5π]; see (7.42).

(iii) What is the difference between the plots for the Dirichlet and Fejer kernels?

Problem 5.

(i) Let {zν}_{ν=0}^{n−1} be a sequence of complex numbers and mn = (1/n) ∑_{ν=0}^{n−1} zν the arithmetic mean of {zν}_{ν=0}^{n−1}. Use the Cauchy-Schwarz inequality to show that

|mn| ≤ √( (1/n) ∑_{ν=0}^{n−1} |zν|^2 ) ≤ max{|zν| : 0 ≤ ν < n}.    (7.54)

In particular, |mn|^2 is less than or equal to the arithmetic mean of {|zν|^2}_{ν=0}^{n−1}.

(ii) Prove that |mn|^2 = (1/n) ∑_{ν=0}^{n−1} |zν|^2 if and only if {zν}_{ν=0}^{n−1} are all equal, that is, zν = γ for ν = 0, 1, · · · , n − 1. Then show that we have equality in (7.54) if and only if {zν}_{ν=0}^{n−1} are all equal.


(iii) Let σn(t) be the n-th Cesaro mean, and pν(t) the ν-th partial Fourier series for a function f in L2(0, τ). Recall that σn(t) = (1/n) ∑_{ν=0}^{n−1} pν(t). By consulting (7.54), we see that |σn(t)|^2 ≤ (1/n) ∑_{ν=0}^{n−1} |pν(t)|^2. By integrating both sides of this inequality, we obtain

‖σn‖ ≤ √( (1/n) ∑_{ν=0}^{n−1} ‖pν‖^2 ).    (7.55)

In other words, ‖σn‖^2 is less than or equal to the arithmetic mean of {‖pν‖^2}_{ν=0}^{n−1}. Using Parseval's equality, directly prove that (7.55) holds. Moreover, show that we have equality in (7.55) if and only if pn−1(t) = a0 is a constant. (As expected, a0 = (1/τ)∫_0^τ f(t)dt.)

One can also solve Part (iii) by using elementary Hilbert space techniques; see [15] for an excellent introduction to Hilbert space. Let H = ⊕_{ν=0}^{n−1} L2(0, τ) be the Hilbert space determined by n orthogonal copies of L2(0, τ). In other words, H consists of all vectors of the form ⊕_{ν=0}^{n−1} fν where fν ∈ L2(0, τ) for ν = 0, 1, · · · , n − 1. The corresponding inner product and norm are given by

(⊕_{ν=0}^{n−1} fν, ⊕_{ν=0}^{n−1} gν) = (1/n) ∑_{ν=0}^{n−1} (fν, gν) = (1/n) ∑_{ν=0}^{n−1} (1/τ) ∫_0^τ fν(t) ḡν(t) dt
‖⊕_{ν=0}^{n−1} fν‖^2 = (1/n) ∑_{ν=0}^{n−1} ‖fν‖^2.

Notice that σn(t) = (⊕_{ν=0}^{n−1} pν, ⊕_{ν=0}^{n−1} 1). Here 1 is the constant function 1 in L2(0, τ). So using the Cauchy-Schwarz inequality, we obtain

|σn(t)|^2 = |(⊕_{ν=0}^{n−1} pν, ⊕_{ν=0}^{n−1} 1)|^2 ≤ ‖⊕_{ν=0}^{n−1} pν‖^2 ‖⊕_{ν=0}^{n−1} 1‖^2 = (1/n^2) ∑_{ν=0}^{n−1} ‖pν‖^2 ∑_{ν=0}^{n−1} 1 = (1/n) ∑_{ν=0}^{n−1} ‖pν‖^2.

This yields the inequality in equation (7.55). Moreover, we have equality if and only if the vectors ⊕_{ν=0}^{n−1} pν and ⊕_{ν=0}^{n−1} 1 are linearly dependent, or equivalently, pν(t) = γ is the same constant for all ν = 0, 1, 2, · · · , n − 1. Because pν(t) = ∑_{|k|≤ν} ak e^{−ikω0t}, we have equality in (7.55) if and only if pn−1(t) = a0.

Problem 6. Let pn be the n-th partial Fourier series and σn+1 the (n + 1)-st Cesaro mean for a function f in L2(0, τ). Then show that

‖σn+1‖ ≤ ‖pn‖ ≤ ‖f‖.    (7.56)

Moreover, ‖σn+1‖ = ‖pn‖ if and only if pn = a0 is a constant. Hint: Parseval's equality.

Problem 7. To demonstrate that the Cesaro means converge slower than the Fourier series for functions in L2, consider the function f(t) = t sin(t) in L2(0, 2π). The Fourier series for f(t) is given by

t sin(t) = ∑_{k=−∞}^{∞} ak e^{−ikt}    where a0 = −1, a±1 = −1/4 ± πi/2, and ak = 1/(k^2 − 1) otherwise.

Recall that pn(t) = ∑_{k=−n}^{n} ak e^{−ikt}.

(i) Prove that f is in the Wiener algebra W2π. Therefore the partial Fourier series pn(t) converge uniformly to f(t), that is,

0 = lim_{n→∞} max{|pn(t) − f(t)| : 0 ≤ t ≤ 2π}.

(ii) Find the n-th Cesaro mean for f(t) of the form

σn(t) = a0 + ∑_{k=1}^{n−1} (ck cos(kt) + dk sin(kt)).

(iii) True or False, and give a reason for your answer: the Cesaro means σn(t) converge uniformly to f(t) over the interval [0, 2π], that is,

0 = lim_{n→∞} max{|σn(t) − f(t)| : 0 ≤ t ≤ 2π}.

(iv) In Matlab plot f(t) = t sin(t), p10(t) and σ11(t) on the same graph over the interval [0, 2π]. Use Theorem 2.7.4 and Matlab to compute the errors

‖f − p10‖    and    ‖f − σ11‖.

Compute ‖f‖∞ and ‖σ11‖∞. Then verify that the maximal property ‖σ11‖∞ < ‖f‖∞ holds. (We have equality if and only if f(t) is a constant function.) The Matlab command norm(f,inf) may be helpful.

(v) Find the smallest integer n such that ‖f − σn‖ ≤ ‖f − p10‖.
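The coefficient formulas stated in Problem 7 are easy to spot-check with numerical quadrature before working the problem. A hypothetical NumPy sketch computing ak = (1/2π)∫_0^{2π} t sin(t) e^{ikt} dt by a uniform Riemann sum:

```python
import numpy as np

def coeff(k, N=20000):
    """a_k for f(t) = t*sin(t) in L2(0, 2*pi), by a uniform Riemann sum."""
    t = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
    return np.mean(t * np.sin(t) * np.exp(1j * k * t))

a0 = coeff(0)          # expect -1
a1 = coeff(1)          # expect -1/4 + (pi/2)*i
a3 = coeff(3)          # expect 1/(3^2 - 1) = 1/8
```

Because the integrand vanishes at both endpoints, the Riemann sum here is accurate far beyond the tolerances tested below.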

Problem 8. As before, consider the square wave function f(t) in L2(0, 2π) given by (6.15), and its Fourier series presented in (6.16), that is,

f(t) = 1    if 0 ≤ t < π
f(t) = −1    if π ≤ t < 2π

f(t) = (4/π) ∑_{k=0}^{∞} sin((2k + 1)t)/(2k + 1) = (4/π) ∑_{odd k>0} sin(kt)/k.    (7.57)


Let f(t) also denote the unique 2π periodic extension of f(t) such that f(t) = f(t + 2π) for all t. To be precise,

f(t) = 1    if t ∈ [jπ, (j + 1)π) for all even integers j
     = −1    if t ∈ [jπ, (j + 1)π) for all odd integers j.

(Since any point has Lebesgue measure zero, defining f(jπ) = (−1)^j is not necessary.) Notice that f(t) is discontinuous at jπ where j is an integer. Hence its derivative f′(t) equals zero everywhere except at jπ, and its derivative f′(t) is not defined at t = jπ (unless one uses Dirac delta functions; see Section 2.7.2). In this case, the n-th partial Fourier series is

pn(t) = (4/π) ∑_{k=0}^{(n−1)/2} sin((2k + 1)t)/(2k + 1) = (2/π) ∑_{odd k, 0<k≤n} (e^{ikt} − e^{−ikt})/(ik).

The derivative p′n of pn is given by

p′n(t) = (4/π) ∑_{k=0}^{(n−1)/2} cos((2k + 1)t) = ∑_{|k|≤n, odd k} (2/π) e^{−ikt}.    (7.58)

The Fourier coefficients {2/π}_{k odd} for e^{−ikt} are not square summable, that is, ∑_{k odd} 4/π^2 = ∞. Therefore the formal Fourier series p′∞ = (4/π) ∑_{k=0}^{∞} cos((2k + 1)t) does not define a function p′∞ in L2(0, 2π). In other words, the derivative p′∞ of the Fourier series is not well defined.

(i) Show that

p′n(t) = (Dn(t) − Dn(t − π)) / π    where Dn(t) = sin((n + 1/2)t) / sin(t/2).    (7.59)

Here Dn(t) is the Dirichlet kernel with fundamental frequency ω0 = 1.

(ii) Plot the partial Fourier series p′49 approximating the derivative f′ over the interval [−π, 5π] in Matlab. Use the magnifying glass in Matlab to explain how p′49 is trying to approximate f′.

(iii) Find the n-th Cesaro mean σn(t) for the square wave function f(t) in (7.57).

(iv) Show that

σ′n(t) = (Kn(t) − Kn(t − π)) / π    where Kn(t) = (1 − cos(nt)) / (n(1 − cos(t))).    (7.60)

Here Kn(t) is the Fejer kernel with fundamental frequency ω0 = 1. In particular, this implies that

0 = lim_{n→∞} σ′n(t)    if t ≠ jπ where j is an integer
∞ = lim_{n→∞} σ′n(t)    if t = jπ for even j
−∞ = lim_{n→∞} σ′n(t)    if t = jπ for odd j.

In other words, the Cesaro means σ′n(t) converge to f′(t) = 0 when t ≠ jπ, and σ′n(t) diverges to (−1)^j ∞ at the discontinuity points jπ of the 2π periodic extension f(t).


(v) Plot σ′50(t) in Matlab over the interval [−π, 5π]. Then compare the graph for σ′50(t) to the previous graph of p′49(t). (In this case, σ49 = σ50.) Which plot behaves better?

Hint: f(t) = (2/π) ∑_{k=1}^{∞} (1 − (−1)^k) sin(kt)/k and (−1)^k cos(kt) = cos(k(t − π)).

and (−1)k cos(kt) = cos(k(t− π)).

Remark. Recall that formally K∞(t) = 2πΔ(t) where Δ(t) is the Dirac comb with period 2π. This with the formula for σ′n in (7.60) and p′n in (7.59) formally yields the derivative f′ of f in terms of Dirac delta functions, that is,

f′(t) = lim_{n→∞} σ′n = ∑_{j=−∞}^{∞} 2(−1)^j δ(t − πj)    and    f′(t) = lim_{n→∞} p′n = ∑_{j=−∞}^{∞} 2(−1)^j δ(t − πj).

The converse is also true, that is, we can formally compute σ′n by employing Dirac delta functions. Using the notation in Section 2.7.2, we have

f(t) = ∑_{j=−∞}^{∞} (−1)^j I_{[jπ, (j+1)π)}    and    f′(t) = ∑_{j=−∞}^{∞} 2(−1)^j δ(t − πj)    (formally).

In this case, each 2π period of f′(t) should contain only two Dirac delta functions. So formally, the n-th Cesaro mean σ′n(t) for f′(t) is given by the convolution formula

σ′n(t) = (1/2π) ∫_0^{2π} f′(x)Kn(t−x)dx = (1/2π) ∫_0^{2π} 2(δ(x) − δ(x − π))Kn(t−x)dx = (Kn(t) − Kn(t − π)) / π.

Likewise p′n(t) is given by the corresponding convolution formula

p′n(t) = (1/2π) ∫_0^{2π} f′(x)Dn(t−x)dx = (1/2π) ∫_0^{2π} 2(δ(x) − δ(x − π))Dn(t−x)dx = (Dn(t) − Dn(t − π)) / π.
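The two kernel identities (7.59) and (7.60) are finite trigonometric identities, so they can be verified to machine precision. A NumPy sketch of that check (our own; n is taken odd so the odd harmonics run up to n):

```python
import numpy as np

def D(n, t):
    return np.sin((n + 0.5) * t) / np.sin(t / 2)        # Dirichlet, w0 = 1

def K(n, t):
    return (1 - np.cos(n * t)) / (n * (1 - np.cos(t)))  # Fejer, w0 = 1

n = 15
t = np.linspace(0.1, np.pi - 0.1, 300)   # keep t and t - pi away from 2*pi*j
odd = np.arange(1, n + 1, 2)             # odd k = 1, 3, ..., n
# (7.59): derivative of the square-wave partial sum vs. the kernel difference.
p_prime = (4 / np.pi) * np.cos(np.outer(t, odd)).sum(axis=1)
gap59 = float(np.abs(p_prime - (D(n, t) - D(n, t - np.pi)) / np.pi).max())

# (7.60): derivative of the Cesaro mean (odd harmonics weighted by (n-k)/n).
k = np.arange(1, n)
sigma_prime = (2 / np.pi) * np.sum((n - k) / n * (1 - (-1.0) ** k)
                                   * np.cos(np.outer(t, k)), axis=1)
gap60 = float(np.abs(sigma_prime - (K(n, t) - K(n, t - np.pi)) / np.pi).max())
```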

Problem 9. Consider the function g(t) = t^2/2 in L2(0, 2π) given by (6.10), that is,

g(t) = t^2/2 = 2π^2/3 + 2 ∑_{k=1}^{∞} (cos(kt)/k^2 − π sin(kt)/k)    (for 0 < t < 2π).    (7.61)

Let g(t) also denote the unique 2π periodic extension of g(t) such that g(t) = g(t + 2π) for all t and g(2πj) = 0 for all integers j. (Since any point has Lebesgue measure zero, defining g(2πj) = 0 is not necessary.) Notice that g(t) is discontinuous at 2πj. Let pn(t) be the n-th partial Fourier series for g(t). The derivative p′n(t) of pn(t) is given by

p′n(t) = −2 ∑_{k=1}^{n} (π cos(kt) + sin(kt)/k) = π − 2 ∑_{k=1}^{n} sin(kt)/k − πDn(t)
p′∞(t) = −2 ∑_{k=1}^{∞} (π cos(kt) + sin(kt)/k) = π − 2 ∑_{k=1}^{∞} sin(kt)/k − πD∞(t)    (formally)
(d/dt)(t^2/2) = t = π − 2 ∑_{k=1}^{∞} sin(kt)/k    (for 0 < t < 2π).    (7.62)

Here Dn(t) is the n-th Dirichlet kernel with fundamental frequency ω0 = 1. Notice that p′n contains the n-th partial Fourier series π − 2 ∑_{k=1}^{n} sin(kt)/k for t. The Fourier series π − 2 ∑_{k=1}^{∞} sin(kt)/k for t does not equal the Fourier series for p′∞, the formal derivative of the Fourier series for g(t). The Fourier series for p′∞ does not define a function in L2(0, 2π). This follows from Parseval's equality, that is,

‖p′∞‖^2 = (1/2) ∑_{k=1}^{∞} (4π^2 + 4/k^2) = ∞.

Therefore the formal Fourier series p′∞ for the derivative is not well defined.

(i) Plot the partial Fourier series p′200 = −2 ∑_{k=1}^{200} (π cos(kt) + sin(kt)/k) for g′ and f(t) = t on the same graph over the interval [0, 2π] in Matlab. Use the magnifying glass in Matlab to hone in on f(t) = t and see how well p′200 approximates t.

(ii) Find the n-th Cesaro mean σn(t) for the function g(t) = t^2/2 in L2(0, 2π).

(iii) Show that

σ′n(t) = π − 2 ∑_{k=1}^{n−1} (n − k) sin(kt)/(nk) − πKn(t)    where Kn(t) = (1 − cos(nt)) / (n(1 − cos(t))).    (7.63)

Here Kn(t) is the n-th Fejer kernel with fundamental frequency ω0 = 1. Notice that σ′n(t) contains the n-th Cesaro mean for the function t in L2(0, 2π); see (7.24). As before, let f(t) be the 2π periodic extension of t, that is, f(t) = f(t + 2π) and f(2πj) = 0 for all integers j. Then this shows that

f(t) = lim_{n→∞} σ′n(t)    if t ≠ 2πj where j is an integer
−∞ = lim_{n→∞} σ′n(t)    if t = 2πj.

In other words, the Cesaro means σ′n(t) converge to t over the interval (0, 2π), and σ′n(t) diverges to −∞ at the discontinuity points 2πj of the 2π periodic extension g(t).

(iv) Plot σ′201(t) and the function f(t) = t over the interval [0, 2π] in Matlab. Use the magnifying glass in Matlab to hone in on f(t) = t and see how well σ′201(t) approximates t. Compare this graph to the previous plot in Part (i). Which plot behaves better?

Page 111: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

2.7. THE CESARO MEAN 111

Remark. Finally, it is noted that formally

\[
\lim_{n\to\infty}\sigma_n(t) = f(t) - \pi\lim_{n\to\infty}K_n(t) = f(t) - 2\pi^2\Delta(t),
\]
\[
\lim_{n\to\infty}p_n(t) = f(t) - \pi\lim_{n\to\infty}D_n(t) = f(t) - 2\pi^2\Delta(t).
\]

Here Δ(t) is the Dirac comb with period 2π. In other words, both σ_n and p_n formally converge to g′(t) = f(t) − 2π²Δ(t), the generalized derivative g′ of g using Dirac delta functions; see Section 2.7.2.


Chapter 3

The discrete Fourier transform

In this chapter we will introduce and study the discrete Fourier transform. Some connections between the discrete Fourier transform and Fourier series will be given. Finally, we will show how the discrete Fourier transform can be used to solve a sinusoid estimation problem.

3.1 The discrete Fourier transform

In this section, we will introduce the discrete Fourier transform. To this end, recall that a trigonometric polynomial is simply a Fourier series Σ a_k e^{−ikω0t} with a finite number of terms. (As expected, ω0 = 2π/τ.) To introduce the discrete Fourier transform, consider the trigonometric polynomial p(t) with period τ defined by

\[
p(t) = \sum_{k=-m}^{n} a_k e^{-2\pi i k t/\tau}
= a_0 + \sum_{k=1}^{n} a_k e^{-2\pi i k t/\tau} + \sum_{k=1}^{m} a_{-k} e^{2\pi i k t/\tau}, \qquad (1.1)
\]

where m and n are specified positive integers. In many applications m = n or n = m + 1. However, m can be zero, or n can be zero. It is emphasized that τ > 0 is the period for p(t), that is, p(t) = p(t + τ) for all t. The trigonometric polynomial p(t) is already in Fourier series form. Moreover, p(t) is a τ periodic continuous function in L2(0, τ). So the corresponding Fourier coefficients {a_k}_{−m}^{n} are given by

\[
a_k = \frac{1}{\tau}\int_0^{\tau} p(t)\, e^{2\pi i k t/\tau}\, dt \qquad (\text{for } k \in [-m, n]). \qquad (1.2)
\]

Parseval's equality implies that

\[
\frac{1}{\tau}\int_0^{\tau} |p(t)|^2\, dt = \sum_{k=-m}^{n} |a_k|^2. \qquad (1.3)
\]

Clearly, p is in the Wiener algebra W_τ. Finally, p(t) is a real valued trigonometric polynomial if and only if a_0 is real, a_{−k} = \overline{a_k} for k = 1, 2, ..., min{m, n}, and the remaining coefficients are zero; see Corollary 2.1.3 in Chapter 2.


As before, consider the τ periodic trigonometric polynomial p(t) = Σ_{k=−m}^{n} a_k e^{−2πikt/τ}, where m and n are specified positive integers; see (1.1). Let us divide the interval [0, τ] into ν equally spaced points {t_j}_0^{ν−1} starting with t_0 = 0, that is, let t_j = jτ/ν for j = 0, 1, 2, ..., ν − 1, or equivalently,

\[
t_0 = 0,\quad t_1 = \frac{\tau}{\nu},\quad t_2 = \frac{2\tau}{\nu},\quad t_3 = \frac{3\tau}{\nu},\quad \cdots,\quad t_{\nu-1} = \frac{(\nu-1)\tau}{\nu}. \qquad (1.4)
\]

Notice that p(t_j) is the value or sample of the trigonometric polynomial p(t) at the point t_j. Substituting t_j = jτ/ν into (1.1) yields

\[
p(t_j) = a_0 + \sum_{k=1}^{n} a_k \bigl(e^{-2\pi i j/\nu}\bigr)^k + \sum_{k=1}^{m} a_{-k}\bigl(e^{2\pi i j/\nu}\bigr)^k \qquad (\text{for } j = 0, 1, 2, \cdots, \nu-1). \qquad (1.5)
\]

Recall that the ν roots of unity are the roots of the polynomial λ^ν − 1 = 0; see Section 1.2.1. Moreover, the ν roots of unity {λ_j}_0^{ν−1} are given by

\[
\lambda_j = e^{-2\pi i j/\nu} = \bigl(e^{-2\pi i/\nu}\bigr)^j \qquad (\text{for } j = 0, 1, 2, \cdots, \nu-1). \qquad (1.6)
\]

It is emphasized that λ_j = λ_1^j for j = 0, 1, 2, ..., ν − 1. Using λ_j = e^{−2πij/ν} in the expression for p(t_j) in (1.5) yields

\[
p(t_j) = a_0 + \sum_{k=1}^{n} a_k \lambda_j^{k} + \sum_{k=1}^{m} a_{-k} \lambda_j^{-k} \qquad (\text{for } j = 0, 1, 2, \cdots, \nu-1). \qquad (1.7)
\]

By rewriting this equation in matrix form, we obtain

\[
\begin{bmatrix} p(t_0)\\ p(t_1)\\ p(t_2)\\ \vdots\\ p(t_{\nu-2})\\ p(t_{\nu-1}) \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 & 1\\
1 & \lambda_1 & \lambda_1^2 & \cdots & \lambda_1^{-2} & \lambda_1^{-1}\\
1 & \lambda_2 & \lambda_2^2 & \cdots & \lambda_2^{-2} & \lambda_2^{-1}\\
\vdots & \vdots & \vdots & \cdots & \vdots & \vdots\\
1 & \lambda_{\nu-2} & \lambda_{\nu-2}^2 & \cdots & \lambda_{\nu-2}^{-2} & \lambda_{\nu-2}^{-1}\\
1 & \lambda_{\nu-1} & \lambda_{\nu-1}^2 & \cdots & \lambda_{\nu-1}^{-2} & \lambda_{\nu-1}^{-1}
\end{bmatrix}
\begin{bmatrix} a_0\\ a_1\\ a_2\\ \vdots\\ a_{-2}\\ a_{-1} \end{bmatrix}. \qquad (1.8)
\]

So one can evaluate p(t_j) for j = 0, 1, 2, ..., ν − 1 by matrix multiplication. Finally, it is noted that the matrix in (1.8) has ν rows and n + m + 1 columns.

Now let F_ν be the square ν × ν matrix in (1.8) when ν = n + m + 1, that is, let

\[
F_\nu =
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 & 1\\
1 & \lambda_1 & \lambda_1^2 & \cdots & \lambda_1^{-2} & \lambda_1^{-1}\\
1 & \lambda_2 & \lambda_2^2 & \cdots & \lambda_2^{-2} & \lambda_2^{-1}\\
\vdots & \vdots & \vdots & \cdots & \vdots & \vdots\\
1 & \lambda_{\nu-2} & \lambda_{\nu-2}^2 & \cdots & \lambda_{\nu-2}^{-2} & \lambda_{\nu-2}^{-1}\\
1 & \lambda_{\nu-1} & \lambda_{\nu-1}^2 & \cdots & \lambda_{\nu-1}^{-2} & \lambda_{\nu-1}^{-1}
\end{bmatrix}. \qquad (1.9)
\]


The positive integers m and n play a basic role in describing the form of the τ periodic trigonometric polynomial p(t) = Σ_{k=−m}^{n} a_k e^{−2πikt/τ} one is analyzing. However, the matrix F_ν is defined independently of the choice of m and n. To see this, assume that λ is one of the ν roots of unity, that is, λ^ν = 1. Then for any integer j, we have λ^{−j} = λ^{−j}λ^ν = λ^{ν−j}. In particular, λ^{−1} = λ^{ν−1}, λ^{−2} = λ^{ν−2}, λ^{−3} = λ^{ν−3}, etc. Using this fact with λ = λ_1 and λ_j = λ_1^j, we see that the matrix F_ν on C^ν can be written as

\[
F_\nu =
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1\\
1 & \lambda_1 & \lambda_1^2 & \cdots & \lambda_1^{\nu-1}\\
1 & \lambda_1^2 & \lambda_1^4 & \cdots & \lambda_1^{2(\nu-1)}\\
\vdots & \vdots & \vdots & \cdots & \vdots\\
1 & \lambda_1^{\nu-2} & \lambda_1^{2(\nu-2)} & \cdots & \lambda_1^{(\nu-1)(\nu-2)}\\
1 & \lambda_1^{\nu-1} & \lambda_1^{2(\nu-1)} & \cdots & \lambda_1^{(\nu-1)^2}
\end{bmatrix}
\qquad \bigl(\text{where } \lambda_1 = e^{-2\pi i/\nu}\bigr). \qquad (1.10)
\]

Clearly, F_ν is independent of the choice of m and n. This matrix F_ν is called the ν × ν discrete Fourier transform matrix. If u is a vector in C^ν and y = F_ν u, then y is called the discrete Fourier transform of u. Moreover, u is called the inverse discrete Fourier transform of y. It is emphasized that the form of F_ν in (1.10) shows that F_ν = F_ν^{tr}. (The transpose of a matrix is denoted by tr.) In a moment we will show that F_ν is an invertible matrix. In fact,

\[
F_\nu^{-1} = \frac{1}{\nu} F_\nu^{*} = \frac{1}{\nu}\overline{F}_\nu, \qquad (1.11)
\]

where * denotes the complex conjugate transpose. Furthermore, F_ν^{−1} = (1/ν)\overline{F}_ν is obtained by taking the complex conjugate of each entry of F_ν and dividing by ν; see Theorem 3.3.2 in Section 3.3 below. So u is the inverse discrete Fourier transform of y if and only if u = F_ν^{−1} y. In particular, u and y uniquely determine each other.
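The two structural facts just stated can be verified directly. A small NumPy sketch (the choice ν = 8 is arbitrary; the text itself works in Matlab) builds F_ν from (1.10) and checks that F_ν = F_ν^{tr} and F_ν^{−1} = (1/ν) conj(F_ν):

```python
import numpy as np

nu = 8
idx = np.arange(nu)
lam1 = np.exp(-2 * np.pi * 1j / nu)      # lambda_1 = e^{-2*pi*i/nu}
F = lam1 ** np.outer(idx, idx)           # F[j, k] = lambda_1^{j*k}, as in (1.10)

assert np.allclose(F, F.T)               # F_nu is symmetric: F_nu = F_nu^tr

F_inv = np.conj(F) / nu                  # (1.11): F_nu^{-1} = (1/nu) * conj(F_nu)
assert np.allclose(F @ F_inv, np.eye(nu))
```

Because F_ν is symmetric, its conjugate transpose F_ν^{*} is simply its entrywise conjugate, which is why the one-line inverse above works.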

REMARK 3.1.1 Consider the τ periodic trigonometric polynomial p(t) = Σ_{k=−m}^{n} a_k e^{−2πikt/τ}, and assume that ν = n + m + 1. Let F_ν on C^ν be the discrete Fourier transform matrix defined in (1.9), or equivalently, (1.10). Then (1.8) and F_ν^{−1} = (1/ν)F_ν^{*} imply that

\[
\begin{bmatrix} p(t_0)\\ p(t_1)\\ \vdots\\ p(t_{\nu-2})\\ p(t_{\nu-1}) \end{bmatrix}
= F_\nu
\begin{bmatrix} a_0\\ a_1\\ \vdots\\ a_{-2}\\ a_{-1} \end{bmatrix}
\qquad\text{and}\qquad
\begin{bmatrix} a_0\\ a_1\\ \vdots\\ a_{-2}\\ a_{-1} \end{bmatrix}
= \frac{1}{\nu} F_\nu^{*}
\begin{bmatrix} p(t_0)\\ p(t_1)\\ \vdots\\ p(t_{\nu-2})\\ p(t_{\nu-1}) \end{bmatrix}, \qquad (1.12)
\]

where t_j = jτ/ν for j = 0, 1, 2, ..., ν − 1 are the sample times specified in (1.4). In this setting,


the vector

\[
\begin{bmatrix} p(t_0)\\ p(t_1)\\ \vdots\\ p(t_{\nu-2})\\ p(t_{\nu-1}) \end{bmatrix}
\text{ is called the discrete Fourier transform of }
\begin{bmatrix} a_0\\ a_1\\ \vdots\\ a_{-2}\\ a_{-1} \end{bmatrix}.
\]

On the other hand, the vector

\[
\begin{bmatrix} a_0\\ a_1\\ \vdots\\ a_{-2}\\ a_{-1} \end{bmatrix}
\text{ is the inverse discrete Fourier transform of }
\begin{bmatrix} p(t_0)\\ p(t_1)\\ \vdots\\ p(t_{\nu-2})\\ p(t_{\nu-1}) \end{bmatrix}.
\]

Clearly, {p(t_k)}_0^{ν−1} and {a_k}_{−m}^{n} are related by matrix multiplication. Finally, equation (1.12) shows that {p(t_k)}_0^{ν−1} and {a_k}_{−m}^{n} uniquely determine each other.
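Equation (1.12) can be exercised numerically. In the sketch below (NumPy; the coefficient values and the choice m = n = 2, ν = 5 are made-up illustrative data), np.fft.ifft plays the role of (1/ν)F_ν^{*} and returns the coefficients in exactly the order a_0, a_1, ..., a_{−2}, a_{−1} described above:

```python
import numpy as np

tau, m, n = 1.0, 2, 2
nu = n + m + 1                                   # 5 sample points
a = {0: 1.0, 1: 2.0, 2: 0.5, -1: 3.0j, -2: -1.0}

tj = np.arange(nu) * tau / nu                    # t_j = j*tau/nu, as in (1.4)
p = sum(a[k] * np.exp(-2j * np.pi * k * tj / tau) for k in a)

# Inverse DFT (1/nu)*F_nu^* recovers the coefficients in the order
# [a_0, a_1, a_2, a_{-2}, a_{-1}]
coeffs = np.fft.ifft(p)
print(coeffs)     # [1, 2, 0.5, -1, 3i] up to rounding
```

The wrap-around ordering (negative indices stored at the end of the vector) is the same convention the text uses for the Matlab fft/ifft output.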

Let A be a ν × ν matrix and u be a vector in C^ν. In general it takes approximately ν² operations to compute the vector y = Au. So if F_ν on C^ν is the matrix given in (1.9) and u is a vector in C^ν, then standard matrix calculations require on the order of ν² operations to compute the discrete Fourier transform y = F_ν u of u. However, if ν = 2^n where n is a positive integer, then using the special structure of F_ν one can construct an algorithm called the fast Fourier transform to compute y = F_ν u in ν log ν operations. Since F_ν^{−1} = (1/ν)F_ν^{*} = (1/ν)\overline{F}_ν, this algorithm can also be used to compute the inverse discrete Fourier transform F_ν^{−1} y in ν log ν operations. This is a significant savings when ν is large. So in practice, if possible, one chooses ν = 2^n for some integer n when computing the discrete Fourier transform F_ν u and its inverse discrete Fourier transform F_ν^{−1} y. A common choice of n is 12. In this case, ν = 2^{12} = 4096. Finally, it is noted that the literature on the fast Fourier transform is massive. For some further results see [10, 44], the fast Fourier transform Wolfram MathWorld webpage and their references.
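Assuming NumPy, one can confirm that a library FFT computes exactly the matrix-vector products described here: np.fft.fft(u) agrees with F_ν u, and np.fft.ifft(u) with (1/ν) conj(F_ν) u, while running in ν log ν rather than ν² operations (the choice ν = 16 and the random test vector are illustrative):

```python
import numpy as np

nu = 16
idx = np.arange(nu)
F = np.exp(-2 * np.pi * 1j / nu) ** np.outer(idx, idx)   # the DFT matrix (1.10)

rng = np.random.default_rng(0)
u = rng.standard_normal(nu) + 1j * rng.standard_normal(nu)

# The FFT computes the same products as the dense matrix, only faster
assert np.allclose(np.fft.fft(u), F @ u)
assert np.allclose(np.fft.ifft(u), np.conj(F) @ u / nu)  # inverse DFT, (1.11)
```

Both NumPy and Matlab use the e^{−2πi/ν} sign convention, so these identities hold with no sign adjustments.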

The following result is a sampling lemma. It shows that a trigonometric polynomial p is uniquely determined by its samples {p(t_j)} at a finite number of points. This fact provides the motivation and algorithm for using the discrete Fourier transform to construct a trigonometric polynomial p which approximates a function f.

LEMMA 3.1.2 Let p and q be two τ periodic trigonometric polynomials of the form

\[
p(t) = \sum_{k=-m}^{n} a_k e^{-2\pi i k t/\tau}
\qquad\text{and}\qquad
q(t) = \sum_{k=-m}^{n} b_k e^{-2\pi i k t/\tau}, \qquad (1.13)
\]

where {a_k}_{−m}^{n} and {b_k}_{−m}^{n} are constants, and ν = n + m + 1. Then the following statements are equivalent.

(i) The trigonometric polynomials p(t) = q(t) for all t.


(ii) The samples p(t_j) = q(t_j) for all t_j = jτ/ν where j = 0, 1, 2, ..., ν − 1.

(iii) The coefficients a_k = b_k for all integers k in [−m, n].

In particular, one can uniquely recover a τ periodic trigonometric polynomial p(t) of the form (1.13) from its samples {p(t_j)}_0^{ν−1} at {t_j}_0^{ν−1} whenever n + m + 1 ≤ ν; that is, let {a_k}_{−m}^{n} be the coefficients computed by equation (1.12) with the appropriate a_k set equal to zero. Then p(t) = Σ_{k=−m}^{n} a_k e^{−2πikt/τ}.

Proof. If p(t) = q(t) for all t, then obviously, p(t_j) = q(t_j) for all t_j. Hence Part (i) implies Part (ii). Remark 3.1.1 shows that [a_0 a_1 ⋯ a_{−1}]^{tr} is the inverse discrete Fourier transform of [p(t_0) ⋯ p(t_{ν−1})]^{tr}, and [b_0 b_1 ⋯ b_{−1}]^{tr} is the inverse discrete Fourier transform of [q(t_0) ⋯ q(t_{ν−1})]^{tr}. (The transpose of a matrix is denoted by tr.) Thus if Part (ii) holds, then we arrive at

\[
\begin{bmatrix} a_0\\ a_1\\ \vdots\\ a_{-2}\\ a_{-1} \end{bmatrix}
= \frac{1}{\nu}F_\nu^{*}
\begin{bmatrix} p(t_0)\\ p(t_1)\\ \vdots\\ p(t_{\nu-2})\\ p(t_{\nu-1}) \end{bmatrix}
= \frac{1}{\nu}F_\nu^{*}
\begin{bmatrix} q(t_0)\\ q(t_1)\\ \vdots\\ q(t_{\nu-2})\\ q(t_{\nu-1}) \end{bmatrix}
= \begin{bmatrix} b_0\\ b_1\\ \vdots\\ b_{-2}\\ b_{-1} \end{bmatrix}.
\]

Therefore a_k = b_k for all integers k in [−m, n]. In other words, Part (iii) holds. Finally, if Part (iii) holds, then clearly, Part (i) holds. This completes the proof.

As before, let p(t) = Σ_{k=−m}^{n} a_k e^{−2πikt/τ} be a τ periodic trigonometric polynomial where ν = n + m + 1. Moreover, let us take n = m = (ν−1)/2 if ν is odd, and m = ν/2 − 1 and n = m + 1 = ν/2 if ν is even. Without loss of generality, we can choose ν large enough such that a_n = 0 when ν is even. By consulting Remark 2.3.2 in Chapter 2, we see that p also admits a representation of the form

\[
p(t) = \sum_{k=-m}^{n} a_k e^{-ik\omega_0 t}
= a_0 + \sum_{k=1}^{m}\bigl(\alpha_k\cos(k\omega_0 t) + \beta_k\sin(k\omega_0 t)\bigr), \qquad (1.14)
\]

where the fundamental frequency ω0 = 2π/τ, and the coefficients {α_k}_1^{m} and {β_k}_1^{m} are related to {a_k}_{−m}^{n} by

\[
\alpha_k = a_k + a_{-k} \quad\text{and}\quad \beta_k = i(a_{-k} - a_k),
\qquad
a_k = \frac{\alpha_k + i\beta_k}{2} \quad\text{and}\quad a_{-k} = \frac{\alpha_k - i\beta_k}{2}. \qquad (1.15)
\]

Here k is an integer in [1, m]. Moreover, if p is a real valued function, then a_{−k} = \overline{a_k} for all integers k in [1, m]. In this case, the first two equations in (1.15) reduce to α_k = 2 Re a_k and β_k = 2 Im a_k, where k is an integer in [1, m].
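A small check of (1.15) (a NumPy sketch; the real polynomial 3 + 2 cos(t) + 4 sin(2t) with ω0 = 1 is an illustrative choice): building a_{±k} from α_k and β_k reproduces the real form of p at every test point.

```python
import numpy as np

# Real coefficients of p(t) = 3 + 2*cos(t) + 4*sin(2t), so m = 2, omega0 = 1
alpha = {1: 2.0, 2: 0.0}
beta = {1: 0.0, 2: 4.0}

a = {0: 3.0}
for k in (1, 2):
    a[k] = (alpha[k] + 1j * beta[k]) / 2       # (1.15)
    a[-k] = (alpha[k] - 1j * beta[k]) / 2      # note a_{-k} = conj(a_k), p is real

t = np.linspace(0.0, 2 * np.pi, 7)
p_complex = sum(a[k] * np.exp(-1j * k * t) for k in a)
p_real = 3 + 2 * np.cos(t) + 4 * np.sin(2 * t)
assert np.allclose(p_complex, p_real)
```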

REMARK 3.1.3 The inverse discrete Fourier transform F_ν^{−1} can be used to compute the Fourier coefficients for certain functions. For example, let f be a Riemann integrable function


in L2(0, τ). Moreover, let f(t) = Σ_{k=−∞}^{∞} a_k e^{−ikω0t} be the Fourier series expansion for f(t), and ω0 = 2π/τ its fundamental frequency. Let t_j = jτ/ν for j = 0, 1, 2, ..., ν − 1. (For numerical efficiency, one would choose ν = 2^ℓ.) Let us set n = m = (ν−1)/2 if ν is odd, and m = ν/2 − 1 with n = m + 1 = ν/2 when ν is even. Now evaluate the function f at the ν points {t_j}_0^{ν−1}. Then apply the inverse discrete Fourier transform to the vector formed by {f(t_j)}_0^{ν−1}, that is,

\[
\begin{bmatrix} c_0\\ c_1\\ \vdots\\ c_{-2}\\ c_{-1} \end{bmatrix}
= \frac{1}{\nu}F_\nu^{*}
\begin{bmatrix} f(t_0)\\ f(t_1)\\ \vdots\\ f(t_{\nu-2})\\ f(t_{\nu-1}) \end{bmatrix}
= F_\nu^{-1}
\begin{bmatrix} f(t_0)\\ f(t_1)\\ \vdots\\ f(t_{\nu-2})\\ f(t_{\nu-1}) \end{bmatrix}. \qquad (1.16)
\]

This yields the {c_k}_{−m}^{n} that we have been looking for. It is emphasized that for large ν and |k| < ν/2, the coefficient c_k is approximately equal to the following Riemann integral:

\[
c_k = \frac{1}{\nu}\sum_{j=0}^{\nu-1} f(t_j)\,\lambda_j^{-k}
\approx \int_0^1 f(\tau s)\, e^{2\pi i k s}\, ds
= \frac{1}{\tau}\int_0^{\tau} f(t)\, e^{ik\omega_0 t}\, dt = a_k. \qquad (1.17)
\]

In other words, if ν is sufficiently large, then

\[
c_k \approx \frac{1}{\tau}\int_0^{\tau} f(t)\, e^{ik\omega_0 t}\, dt = a_k \qquad \Bigl(\text{when } |k| < \frac{\nu}{2}\Bigr). \qquad (1.18)
\]

However, one has to be careful. It is not enough to just take |k| < ν/2. The sample size ν must be large enough to guarantee that the sum in (1.17) is an accurate approximation of the corresponding Riemann integral for a_k. If both of these conditions are satisfied, then we have c_k ≈ a_k.

Without getting tied down in technical details, in many applications the function f is "well behaved". In this setting, one simply chooses a sufficiently large ν = 2^ℓ for f to guarantee that c_k ≈ a_k for all k ∈ K. Here K is a finite set of integers such that |k| ≪ ν/2 for all k ∈ K, c_k is approximately equal to the Riemann integral in (1.17) for k ∈ K, and Σ_{k∉K} |a_k|² ≈ 0. In this case, roughly speaking,

\[
\Bigl\| f - \sum_{k\in K} c_k e^{-ik\omega_0 t} \Bigr\|
\approx \Bigl\| f - \sum_{k\in K} a_k e^{-ik\omega_0 t} \Bigr\|
= \sqrt{\sum_{k\notin K} |a_k|^2} \approx 0.
\]

In other words, if the function f is "well behaved", then one can use the inverse fast Fourier transform to compute an approximation Σ_{k∈K} c_k e^{−ikω0t} for f; see Section 3.2 for an example. It is noted, however, that one can provide examples where fast Fourier transform techniques will not yield a good approximation for f.

Finally, it is noted that in many problems there is no formula to compute the integral (1/τ)∫_0^τ f(t)e^{ikω0t} dt = a_k, and thus one has to compute this integral numerically. In this case, one may as well use fast Fourier transform techniques to efficiently compute a_k.
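To illustrate Remark 3.1.3 on a "well behaved" function, the sketch below (NumPy) samples f(t) = 1/(2 + cos t) on [0, 2π). Its exact Fourier coefficients are a_k = (√3 − 2)^{|k|}/√3, a standard closed form that is not from the text (it follows from a residue or geometric-series computation). Because these coefficients decay geometrically, already ν = 64 makes the c_k from the inverse DFT match a_k essentially to machine precision:

```python
import numpy as np

tau = 2 * np.pi
nu = 64                                  # nu = 2^6 samples
tj = np.arange(nu) * tau / nu
f = 1.0 / (2.0 + np.cos(tj))             # a smooth 2*pi periodic function

c = np.fft.ifft(f)                       # c_k ~ a_k for |k| << nu/2, as in (1.17)

r = np.sqrt(3) - 2                       # exact coefficients: a_k = r^{|k|}/sqrt(3)
assert abs(c[0] - 1 / np.sqrt(3)) < 1e-12
assert abs(c[1] - r / np.sqrt(3)) < 1e-12
assert abs(c[-1] - r / np.sqrt(3)) < 1e-12   # f is even, so a_{-1} = a_1
```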


REMARK 3.1.4 It is emphasized that one has to be careful when applying the inverse discrete Fourier transform to functions in L2(−μ, μ); see Section 2.1.2 in Chapter 2. Let f(t) be a Riemann integrable function in L2(−μ, μ). Moreover, let

\[
f(t) = \sum_{k=-\infty}^{\infty} a_k e^{-ik\omega_0 t}
\qquad\text{where}\qquad
a_k = \frac{1}{2\mu}\int_{-\mu}^{\mu} e^{ik\omega_0 t} f(t)\, dt \qquad (1.19)
\]

be its Fourier series representation. Recall that τ = 2μ is the period, and ω0 = π/μ is the corresponding fundamental frequency. Let {r_j}_0^{ν−1} be ν evenly spaced points over the interval [−μ, μ) starting with r_0 = −μ and ending with r_{ν−1} = (ν−2)μ/ν. Let {t_j}_0^{ν−1} be ν evenly spaced points over the interval [0, 2μ) starting with t_0 = 0 and ending with t_{ν−1} = (ν−1)2μ/ν; see (1.4). To be precise,

\[
r_j = \frac{(2j-\nu)\mu}{\nu} = \frac{2j\mu}{\nu} - \mu
\qquad\text{and}\qquad
t_j = \frac{2j\mu}{\nu} \qquad (\text{for } j = 0, 1, 2, \cdots, \nu-1). \qquad (1.20)
\]

Notice that r_j = t_j − μ and f(r_j) = f(t_j − μ). (For numerical efficiency, one would choose ν = 2^ℓ.) Now evaluate or sample the function f(t) at {r_j}_0^{ν−1}. Then apply the inverse discrete Fourier transform to the vector formed by the samples {f(r_j)}_0^{ν−1}, that is,

\[
\begin{bmatrix} c_0\\ c_1\\ \vdots\\ c_{-2}\\ c_{-1} \end{bmatrix}
= \frac{1}{\nu}F_\nu^{*}
\begin{bmatrix} f(r_0)\\ f(r_1)\\ \vdots\\ f(r_{\nu-2})\\ f(r_{\nu-1}) \end{bmatrix}
= \frac{1}{\nu}F_\nu^{*}
\begin{bmatrix} f(t_0-\mu)\\ f(t_1-\mu)\\ \vdots\\ f(t_{\nu-2}-\mu)\\ f(t_{\nu-1}-\mu) \end{bmatrix}. \qquad (1.21)
\]

The {c_k}_{−m}^{n} computed by the inverse discrete Fourier transform (1/ν)F_ν^{*} correspond to sampling a trigonometric polynomial starting at time t = 0; see (1.4), (1.5) and Remark 3.1.1. So the {c_k}_{−m}^{n} in (1.21) is the inverse discrete Fourier transform corresponding to sampling f(t − μ) at {t_j}_0^{ν−1}, which starts at time zero. Recall that f(t − μ) is in L2(0, 2μ) and

\[
f(t-\mu) = \sum_{k=-\infty}^{\infty} d_k e^{-ik\omega_0 t}
\qquad\text{where}\qquad
d_k = \frac{1}{2\mu}\int_0^{2\mu} e^{ik\omega_0 t} f(t-\mu)\, dt \qquad (1.22)
\]

is the Fourier series for f(t − μ). Hence for ν sufficiently large, the inverse discrete Fourier transform in (1.21) actually approximates the Fourier coefficients {d_k} corresponding to the shifted function f(t − μ) in L2(0, 2μ). In other words, if |k| < ν/2 and ν is sufficiently large, then c_k approximates the following Riemann integral:

\[
c_k \approx d_k = \frac{1}{2\mu}\int_0^{2\mu} e^{ik\omega_0 t} f(t-\mu)\, dt
= \frac{(-1)^k}{2\mu}\int_{-\mu}^{\mu} e^{ik\omega_0 t} f(t)\, dt = (-1)^k a_k; \qquad (1.23)
\]

see equation (1.20) in Section 2.1.2 of Chapter 2 and Remark 3.1.3.

In many applications, the function f in L2(−μ, μ) is "well behaved". In this setting, one simply chooses a sufficiently large ν = 2^ℓ for f to guarantee that

\[
a_k = (-1)^k d_k \approx (-1)^k c_k \qquad (\text{when } k \in K). \qquad (1.24)
\]


Here K is a finite set of integers such that |k| ≪ ν/2 for all k ∈ K, c_k is approximately equal to the Riemann integral in (1.23) for k ∈ K, and Σ_{k∉K} |a_k|² ≈ 0. In this case, roughly speaking,

\[
\Bigl\| f - \sum_{k\in K} (-1)^k c_k e^{-ik\omega_0 t} \Bigr\|
\approx \Bigl\| f - \sum_{k\in K} a_k e^{-ik\omega_0 t} \Bigr\|
= \sqrt{\sum_{k\notin K} |a_k|^2} \approx 0.
\]

In other words, if the function f(t) is "well behaved", then one can use the inverse fast Fourier transform to compute an approximation Σ_{k∈K} (−1)^k c_k e^{−ikω0t} for f.

REMARK 3.1.5 It is noted that in Matlab one can take the discrete Fourier transform of either a row or a column vector. The Matlab command for the discrete Fourier transform is fft, and ifft is the command for the inverse discrete Fourier transform. By taking the fft or ifft of a row vector (respectively a column vector) one obtains a row vector (respectively a column vector). Moreover, if a row vector and a column vector are transposes of each other, then Matlab gives the same answer up to the transpose. For example, if a is a row vector and p = fft(a), then p is a row vector and p^{tr} = fft(a^{tr}). Here tr denotes the transpose and not the complex conjugate transpose. Likewise, a = ifft(p) and a^{tr} = ifft(p^{tr}).

3.1.1 Nyquist sampling

One has to be careful when trying to estimate the frequency of a periodic function f(t) by sampling. For example, assume that one is trying to estimate the frequency of the function f(t) = cos(100t). Now suppose that one samples cos(100t) at t_j = 2πj/100 for j = 0, 1, 2, ..., 99. Then we have cos(100t_j) = cos(2πj) = 1 for all j. There is no way one can determine the frequency 100 by evaluating cos(100t) at these times {t_j}. The sample rate {t_j} is simply not fine enough. This leads us to the Nyquist sampling problem: how finely does one have to sample a periodic signal to extract the frequencies of the corresponding sinusoids?

To see how the discrete Fourier transform F_ν plays a role in sampling, consider a function f(t) of the form f(t) = Σ_k a_k e^{−ikω0t} with period τ and fundamental frequency ω0 = 2π/τ. Let {t_j}_0^{ν−1} be ν evenly spaced points over the interval [0, τ) starting with t_0 = 0, that is,

\[
t_j = \frac{j\tau}{\nu} \quad\text{for } j = 0, 1, 2, \cdots, \nu-1
\qquad\text{and}\qquad
\omega_0 = \frac{2\pi}{\tau}. \qquad (1.25)
\]

Now assume that one samples f(t) at {t_j}_0^{ν−1} and computes {a_k} by using the inverse discrete Fourier transform, that is,

\[
F_\nu^{-1}
\begin{bmatrix} f(t_0)\\ f(t_1)\\ \vdots\\ f(t_{\nu-2})\\ f(t_{\nu-1}) \end{bmatrix}
=
\begin{bmatrix} a_0\\ a_1\\ \vdots\\ a_{-2}\\ a_{-1} \end{bmatrix}. \qquad (1.26)
\]

In general, one cannot recover f(t) = Σ_k a_k e^{−ikω0t} from the amplitudes {a_k} computed in (1.26). This follows because the sampling rate may not be fine enough to capture the high


frequencies. To reconstruct f(t) from {f(t_j)}_0^{ν−1}, the frequencies {kω0} must be contained in the Nyquist frequency range

\[
\Bigl\{\omega : |\omega| < \frac{\nu\omega_0}{2}\Bigr\}. \qquad (1.27)
\]

This sets the stage for the following periodic formulation of the Nyquist-Shannon sampling theorem.

THEOREM 3.1.6 (Nyquist-Shannon sampling) Let f(t) = Σ_k a_k e^{−ikω0t} be a τ periodic trigonometric polynomial with fundamental frequency ω0 = 2π/τ. Let {a_k} be the coefficients for {f(t_j)}_0^{ν−1} computed by using the inverse discrete Fourier transform F_ν^{−1} in (1.26). If all the frequencies {kω0} of f(t) are contained in the Nyquist frequency range

\[
\Bigl\{\omega : |\omega| < \frac{\nu\omega_0}{2}\Bigr\}, \qquad (1.28)
\]

then f(t) = Σ_k a_k e^{−ikω0t}. In other words, if all the frequencies {kω0} are contained in the open interval (−νω0/2, νω0/2), then one can use fast Fourier transform techniques to recover f(t) = Σ_k a_k e^{−ikω0t}.

Proof. Theorem 3.1.6 is a consequence of Lemma 3.1.2. If all the frequencies {kω0} for f(t) are contained in the open interval (−νω0/2, νω0/2), then f(t) is a trigonometric polynomial of the form

\[
f(t) = \sum_{|k| < \nu/2} a_k e^{-ik\omega_0 t} = \sum_{k=-m}^{n} a_k e^{-ik\omega_0 t}.
\]

In this case, m < ν/2 and n < ν/2. Therefore m + n + 1 ≤ ν, and Theorem 3.1.6 follows from Lemma 3.1.2. This completes the proof.

The Nyquist frequency range (−νω0/2, νω0/2) contains the maximum number of frequencies that one can successfully recover by using ν evenly spaced sample times {t_j}_0^{ν−1} over the interval [0, τ). The corresponding frequency νω0/2 is called the Nyquist frequency. The length of the Nyquist frequency range (−νω0/2, νω0/2) is at least two times the maximum frequency one can capture by using the sample times {t_j}_0^{ν−1}.

The Nyquist-Shannon sampling Theorem 3.1.6 shows that one has to sample the signal f(t) finely enough to include all the relevant frequencies in order to recover f(t) = Σ_k a_k e^{−ikω0t}. As expected, using more sample times {t_j}_0^{ν−1} leads to better estimates of f(t), especially in a noisy environment.

To explain how one arrives at the Nyquist frequency range, set ϕ_k(t) = e^{−ikω0t}, and let {e_k}_0^{ν−1} be the standard orthonormal basis for C^ν. (A trigonometric polynomial with period τ is a finite linear combination of {ϕ_k(t)}.) By sampling ϕ_k(t) = e^{−ikω0t} at {t_j}_0^{ν−1}, we have

\[
\begin{bmatrix} \varphi_k(t_0)\\ \varphi_k(t_1)\\ \varphi_k(t_2)\\ \vdots\\ \varphi_k(t_{\nu-1}) \end{bmatrix}
=
\begin{bmatrix} 1\\ e^{-ik\omega_0 t_1}\\ e^{-ik\omega_0 t_2}\\ \vdots\\ e^{-ik\omega_0 t_{\nu-1}} \end{bmatrix}
= F_\nu e_k \qquad (\text{for } k = 0, 1, 2, \cdots, \nu-1). \qquad (1.29)
\]


Let ℓ be any integer. Since t_j = jτ/ν and ω0 = 2π/τ, we have

\[
\varphi_k(t_j) = e^{-ik\omega_0 t_j} = e^{-i(k+\ell\nu)\omega_0 t_j} = \varphi_{k+\ell\nu}(t_j) \qquad (\text{for } j = 0, 1, 2, \cdots, \nu-1). \qquad (1.30)
\]

So when sampling the functions ϕ_{k+ℓν}(t) with frequencies (k + ℓν)ω0 at {t_j}_0^{ν−1}, one obtains the same result as sampling ϕ_k(t). In other words, one cannot distinguish between ϕ_k(t) and ϕ_{k+ℓν}(t) by sampling with {t_j}_0^{ν−1} over the interval [0, τ). Therefore the inverse discrete Fourier transform F_ν^{−1} cannot pick up the frequencies higher than νω0. However, this is not the end of the story. One must also be able to determine what values of k in [0, ν − 1] correspond to what frequency. If 0 ≤ k < ν/2, then the frequency for ϕ_k(t) is kω0. Moreover, (k − ν)ω0 (and not kω0) is the frequency for ϕ_k(t) when ν/2 < k < ν. So to guarantee that we pick out the correct frequencies, we assume that all the frequencies are contained in the Nyquist frequency range {ω : |ω| < νω0/2}.

Figure 3.1: The graph of 2 sin(4t) with its five samples 2 sin(4t_j) and −2 sin(t) over [0, 2π].

Consider the function f(t) = 2 sin(4t) over the interval [0, 2π]. Here ω0 = 1 and ϕ_k(t) = e^{−ikt} for all integers k. Because the frequencies for 2 sin(4t) are ±4, the Nyquist-Shannon sampling theorem states that we need at least 9 samples of f(t) to successfully recover 2 sin(4t). Let us see what happens when we take five samples {t_j}_0^{4} of 2 sin(4t), where t_j = 2πj/5 for j = 0, 1, 2, 3, 4. First notice that

\[
2\sin(4t) = i e^{-4it} - i e^{4it} = i\varphi_4(t) - i\varphi_{-4}(t).
\]

Since ν = 5, we see that sampling 2 sin(4t) with {t_j}_0^{4} yields the same result as sampling iϕ_{4−5}(t) − iϕ_{−4+5}(t). In other words, sampling 2 sin(4t) with {t_j}_0^{4} is the same as sampling

\[
i\varphi_{-1}(t) - i\varphi_{1}(t) = i e^{it} - i e^{-it} = -2\sin(t).
\]


Clearly, −2 sin(t) ≠ 2 sin(4t), and we do not recover 2 sin(4t) by sampling 2 sin(4t) at {t_j}_0^{4}. In this case, Matlab yields

\[
\begin{bmatrix} a_0\\ a_1\\ a_2\\ a_{-2}\\ a_{-1} \end{bmatrix}
= F_5^{-1}
\begin{bmatrix} 2\sin(0)\\ 2\sin(8\pi/5)\\ 2\sin(16\pi/5)\\ 2\sin(24\pi/5)\\ 2\sin(32\pi/5) \end{bmatrix}
= F_5^{-1}
\begin{bmatrix} 0\\ -1.9021\\ -1.1756\\ 1.1756\\ 1.9021 \end{bmatrix}
=
\begin{bmatrix} 0\\ -i\\ 0\\ 0\\ i \end{bmatrix}.
\]

(The Matlab command for the inverse discrete Fourier transform is ifft.) So the trigonometric polynomial p(t) formed by sampling 2 sin(4t) at {t_j}_0^{4} is given by

\[
p(t) = \sum_{k=-2}^{2} a_k e^{-ikt} = -i e^{-it} + i e^{it} = -2\sin(t).
\]

As expected, sampling 2 sin(4t) five times yields −2 sin(t), and we do not recover 2 sin(4t). The graph of 2 sin(4t), with its five samples {2 sin(8πj/5)}_0^{4} and −2 sin(t), is presented in Figure 3.1.

The Nyquist-Shannon sampling theorem says that we must sample 2 sin(4t) at least 9 times to recover 2 sin(4t). By sampling 2 sin(4t) at t_j = 2πj/9 for j = 0, 1, 2, ..., 8 and taking the inverse discrete Fourier transform, we obtain

\[
\begin{bmatrix} 0 & 0 & 0 & 0 & i & -i & 0 & 0 & 0 \end{bmatrix}^{tr}
= F_9^{-1}
\begin{bmatrix} 2\sin(0) & 2\sin(8\pi/9) & 2\sin(16\pi/9) & \cdots & 2\sin(64\pi/9) \end{bmatrix}^{tr}.
\]

(Recall that tr denotes the transpose.) The trigonometric polynomial corresponding to taking nine samples is given by

\[
q(t) = i e^{-4it} - i e^{4it} = 2\sin(4t).
\]

As expected, sampling nine times successfully recovers 2 sin(4t).

One might think that sampling 2 sin(4t) eight times at t_j = 2πj/8 for j = 0, 1, 2, ..., 7 might work. However, it does not. Notice that 2 sin(4t_j) = 2 sin(πj) = 0 for all {t_j}_0^{7}. In other words, sampling 2 sin(4t) at {t_j}_0^{7} yields zero, and one cannot recover 2 sin(4t) from zero.

The Matlab commands we used to generate Figure 3.1 are given by

    t = (0:99999)*2*pi/100000;
    t5 = (0:4)'*2*pi/5; plot(t, 2*sin(4*t));
    hold on; plot(t5, 2*sin(4*t5), 'rs');
    plot(t5, 2*sin(4*t5), 'r*'); plot(t, -2*sin(t)); grid
    xlabel('t'); title('Five samples of 2sin(4t)')

Moreover,

\[
a = \mathrm{ifft}(2\sin(4\,t_5)) =
\begin{bmatrix} 0.0000 + 0.0000i\\ -0.0000 - 1.0000i\\ 0.0000 + 0.0000i\\ 0.0000 - 0.0000i\\ -0.0000 + 1.0000i \end{bmatrix},
\qquad
2\sin(4\,t_5) =
\begin{bmatrix} 0\\ -1.9021\\ -1.1756\\ 1.1756\\ 1.9021 \end{bmatrix}.
\]
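The five-, nine-, and eight-sample computations above can be replayed in NumPy (np.fft.ifft corresponds to Matlab's ifft, with the same wrap-around coefficient ordering):

```python
import numpy as np

t5 = np.arange(5) * 2 * np.pi / 5
a = np.fft.ifft(2 * np.sin(4 * t5))
# Ordering [a0, a1, a2, a_{-2}, a_{-1}]: five samples alias to -2*sin(t)
assert np.allclose(a, [0, -1j, 0, 0, 1j])

t9 = np.arange(9) * 2 * np.pi / 9
b = np.fft.ifft(2 * np.sin(4 * t9))
# Ordering [a0, ..., a4, a_{-4}, ..., a_{-1}]: nine samples recover 2*sin(4t)
assert np.allclose(b, [0, 0, 0, 0, 1j, -1j, 0, 0, 0])

t8 = np.arange(8) * 2 * np.pi / 8
assert np.allclose(np.fft.ifft(2 * np.sin(4 * t8)), 0)  # eight samples give zero
```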


A 128-point fast Fourier transform example

For another example, consider the functions f(t) = 2 sin(40t) and g(t) = 2 sin(88t), both over the interval [0, 2π) with period τ = 2π. Now set t_j = 2πj/128 for j = 0, 1, 2, ..., 127. By computing the 128-point inverse fast Fourier transform F_128^{−1} (ifft in Matlab) of {f(t_j)}_0^{127} and {g(t_j)}_0^{127}, we obtain the same nonzero positions with different amplitudes; see (1.26). In Matlab, we set a = ifft(f) and b = ifft(g). Then a and b in Matlab are vectors of length 128. Moreover, a_0 = a(1), a_1 = a(2), a_2 = a(3), ..., while a_{−1} = a(128) and a_{−2} = a(127), etcetera. A similar pattern holds for b. In this case,

\[
\begin{aligned}
a_{40} &= a(41) = i &&\text{which corresponds to } i e^{-40it},\\
a_{-40} &= a(89) = -i &&\text{which corresponds to } -i e^{40it},\\
b_{-40} &= b(89) = i &&\text{which corresponds to } i e^{-88it},\\
b_{40} &= b(41) = -i &&\text{which corresponds to } -i e^{88it},
\end{aligned} \qquad (1.31)
\]

while all the other components of a and b are zero. This follows because

\[
2\sin(40t) = i e^{-40it} - i e^{40it}
\qquad\text{and}\qquad
2\sin(88t) = i e^{-88it} - i e^{88it}. \qquad (1.32)
\]

Since −88 + 128 = 40, we see that ie^{−88it} and ie^{40it} both place i in the b(89) = b_{−40} component of b; see (1.29). Because 88 − 128 = −40, we have that −ie^{88it} and −ie^{−40it} both place −i in the b_{40} = b(41) component of b; see (1.29). This yields (1.31). By taking F_128^{−1}, the inverse discrete Fourier transform, we obtain the same nonzero positions in a and b with different values; see (1.31). So clearly, we cannot distinguish between the frequencies ±40 and ±88. However, if we know that all frequencies are contained in the Nyquist frequency range {ω : |ω| < ν/2 = 64}, then the frequencies must be ±40 and f(t) = 2 sin(40t).

For another example, consider the function h(t) = 2 sin(40t) + 2 sin(88t), and let {h(t_j)}_0^{127} be the corresponding 128 samples of h(t). Let c be the 128-point inverse fast Fourier transform F_128^{−1} of {h(t_j)}_0^{127} (in Matlab c = ifft(h)). Because h(t) = f(t) + g(t), we have c = a + b. Equations (1.32) and (1.31) show that c = a + b = 0. Clearly, one cannot recover h(t) from {h(t_j)}_0^{127}. The frequencies ±88 are not in the Nyquist frequency range (−64, 64).

It is noted that the Nyquist frequency range {ω : |ω| < 64} does not include the frequency 64. At the midway point one cannot distinguish between ±64 or even obtain 64. For example, let q(t) = 2 sin(64t) and compute d, the inverse fast Fourier transform F_128^{−1} of {q(t_j)}_0^{127} (in Matlab d = ifft(q)). Then d = 0, as expected. Therefore one cannot recover 2 sin(64t) by sampling {q(t_j)}_0^{127}.

The Matlab commands used to compute a, b, c and d are given by

    t = (0:127)*2*pi/128;
    f = 2*sin(40*t); g = 2*sin(88*t);
    h = f + g; q = 2*sin(64*t);
    a = ifft(f); b = ifft(g);
    c = ifft(h); d = ifft(q);
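The 128-point example can likewise be replayed in NumPy (zero-based indexing, so Matlab's a(41) and a(89) become a[40] and a[88]):

```python
import numpy as np

t = np.arange(128) * 2 * np.pi / 128
a = np.fft.ifft(2 * np.sin(40 * t))
b = np.fft.ifft(2 * np.sin(88 * t))

# Same nonzero positions, different values: +-40 and +-88 alias under 128 samples
assert np.isclose(a[40], 1j) and np.isclose(a[88], -1j)
assert np.isclose(b[40], -1j) and np.isclose(b[88], 1j)

c = np.fft.ifft(2 * np.sin(40 * t) + 2 * np.sin(88 * t))
assert np.allclose(c, 0)                 # the two sinusoids cancel in the samples

d = np.fft.ifft(2 * np.sin(64 * t))
assert np.allclose(d, 0)                 # the Nyquist-frequency sinusoid samples to zero
```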

It is noted that two different signals which produce the same coefficients {a_k} by sampling are said to alias each other; for example, ϕ_k(t) and ϕ_{k+ℓν}(t), where ℓ is an integer. The distortion that occurs in reconstructing a signal through sampling is also referred to as aliasing; see Aliasing in Wikipedia.

In many applications the function f(t) = Σ_k a_k e^{−ikω0t} is not a trigonometric polynomial. However, all the significant frequencies of f(t) are contained in the Nyquist frequency range (−νω0/2, νω0/2). Roughly speaking, by significant frequencies we mean all the frequencies of f(t) whose amplitude a_k is not close to zero. In this case, the inverse discrete Fourier transform can be used to approximately recover the original function f(t) = Σ_k a_k e^{−ikω0t} by sampling f(t) at {t_j}_0^{ν−1}.

3.1.2 Exercise

Problem 1. Let F_ν be the ν × ν discrete Fourier transform matrix in (1.9). If a is any vector in C^ν, then the discrete Fourier transform y = F_ν a in Matlab is computed by y = F_ν a = fft(a). Moreover, a = F_ν^{−1} y = F_ν^{*} y/ν = ifft(y). In particular, the matrix F_ν itself is computed by T = fft(eye(ν)). In Matlab compute fft(eye(5)) and compare this to F_5 in (1.9) where ν = 5. Just for fun, plot fft(eye(256)) and describe your answer. You do not have to print the plot of fft(eye(256)).
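A NumPy analogue of the identity T = fft(eye(ν)) used in Problem 1 (transforming the identity's columns assembles F_ν, since fft of the basis vector e_k returns the k-th column of the DFT matrix, as in (1.29); ν = 5 here for comparison with F_5):

```python
import numpy as np

nu = 5
idx = np.arange(nu)
F5 = np.exp(-2 * np.pi * 1j / nu) ** np.outer(idx, idx)   # F_nu from (1.10)

# Column-wise DFT of the identity matrix assembles F_nu,
# mirroring T = fft(eye(nu)) in Matlab
T = np.fft.fft(np.eye(nu), axis=0)
assert np.allclose(T, F5)
```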

Problem 2. Let k be any positive integer. Notice that as t varies from 0 to τ the function e^{−2πikt/τ} = (e^{−2πit/τ})^k rotates clockwise around the unit circle in the complex plane k times, starting and ending at 1 + 0i. If k = 0, then obviously, e^{−2πikt/τ} = 1. If k is a negative integer, then as t varies from 0 to τ the function e^{−2πikt/τ} rotates counterclockwise around the unit circle in the complex plane |k| times, starting and ending at 1 + 0i. Clearly, cos(2πkt/τ) is the real part of e^{−2πikt/τ}, and −sin(2πkt/τ) is the imaginary part of e^{−2πikt/τ}. For τ = 5 plot e^{−2πikt/τ} along with its real and imaginary parts for k = 2, 4 and −3 when 0 ≤ t ≤ τ.

Problem 3. Let $\{\varphi_0, \varphi_1, \varphi_2, \cdots, \varphi_{-2}, \varphi_{-1}\}$ be the standard orthonormal basis in $\mathbb{C}^\nu$ formed by the columns of the identity matrix $I$ on $\mathbb{C}^\nu$, that is,
\[
[\varphi_0\ \ \varphi_1\ \ \varphi_2\ \ \cdots\ \ \varphi_{-2}\ \ \varphi_{-1}] = I.
\]
Notice that $\varphi_0$ has one in the first position and all the other components are zero, $\varphi_1$ has one in the second position and all the other components are zero, and so on, while $\varphi_{-1}$ has one in the last position and all the other components are zero, $\varphi_{-2}$ has one in the second from the last position and all the other components are zero, etc. Now let $p(t) = e^{-2\pi i k t/\tau}$. Clearly, $p(t) = \sum_j a_j e^{-2\pi i j t/\tau}$ where $a_j = \delta_{jk}$ and $\delta_{jk}$ is the Kronecker delta. Using this in (1.12) we arrive at
\[
F_\nu \varphi_k =
\begin{bmatrix} p(t_0)\\ p(t_1)\\ \vdots\\ p(t_{\nu-2})\\ p(t_{\nu-1}) \end{bmatrix}
= \begin{bmatrix} e^{-2\pi i k t_0/\tau}\\ e^{-2\pi i k t_1/\tau}\\ \vdots\\ e^{-2\pi i k t_{\nu-2}/\tau}\\ e^{-2\pi i k t_{\nu-1}/\tau} \end{bmatrix}
= \begin{bmatrix} \cos(2\pi k t_0/\tau)\\ \cos(2\pi k t_1/\tau)\\ \vdots\\ \cos(2\pi k t_{\nu-2}/\tau)\\ \cos(2\pi k t_{\nu-1}/\tau) \end{bmatrix}
- i\begin{bmatrix} \sin(2\pi k t_0/\tau)\\ \sin(2\pi k t_1/\tau)\\ \vdots\\ \sin(2\pi k t_{\nu-2}/\tau)\\ \sin(2\pi k t_{\nu-1}/\tau) \end{bmatrix}.
\]


Recall that $t_j = \frac{j\tau}{\nu}$ for j = 0, 1, 2, · · · , ν − 1 are equally spaced points over the interval [0, τ]. Since $p(t) = e^{-2\pi i k t/\tau}$, for positive integers k the components in the column vector $F_\nu\varphi_k$ rotate clockwise around the unit circle in the complex plane (almost) k times, starting at 1 + 0i and ending at $e^{-2\pi i k(\nu-1)/\nu}$, when ν is large and $k \ll \frac{\nu}{2}$. (If ν is not large, then the points are sparse.) If k is a negative integer, then the components in the column vector $F_\nu\varphi_k$ rotate counterclockwise around the unit circle (almost) |k| times, starting at 1 + 0i and ending at $e^{-2\pi i k(\nu-1)/\nu}$, when ν is large and $|k| \ll \frac{\nu}{2}$. Moreover, the real part of $F_\nu\varphi_k$ contains the samples of the cosine function with frequency $\frac{2\pi k}{\tau}$ evaluated at $\{t_j\}_{j=0}^{\nu-1}$, while the imaginary part of $F_\nu\varphi_k$ contains the samples of minus the sine function with frequency $\frac{2\pi k}{\tau}$ evaluated at $\{t_j\}_{j=0}^{\nu-1}$.

If b is any vector in $\mathbb{C}^\ell$ where ℓ ≤ ν, then the Matlab command fft(b, ν) computes the vector $F_\nu\,[b;\ 0;\ 0;\ \cdots\ 0]^{tr}$, that is, the transform of b padded with zeros to length ν. In Matlab plot the following and describe what happens. You do not have to hand in your plots.

(i) fft([0; 5], 4096);

(ii) real(fft([0; 5], 4096));

(iii) imag(fft([0; 5], 4096));

(iv) fft([0; 0; 0;−2i], 4096);

(v) conj(fft([0; 0; 0;−2], 4096));

(vi) comet(real(fft([0; 0; 0; 0;−3], 4096)));

(vii) comet(imag(fft([0; 0; 0; 0;−3], 4096)))

(viii) comet(real(fft([0; 0; 0; 0;−3], 4096)), imag(fft([0; 0; 0; 0;−3], 4096)));

(ix) comet(real(fft([0; 0; 0;−2i], 4096)), imag(fft([0; 0; 0;−2i], 4096)));

(x) comet(real(fft([0; 0; 3i], 4096)), imag(fft([0; 0;−3i], 4096)));

In Parts (i), (iv), (v), (viii), (ix) and (x), how many times does the plot go around the circle, in what direction, where does it start, where does it end, and what is the radius?

Problem 4. Using a 4096 fast Fourier transform in Matlab plot 3+2 cos(2t)−4 sin(6t) overthe interval [0, 2π].

Problem 5. Use a 4096 fast Fourier transform to plot the function
\[
p(t) = 2 + 7\cos(6\pi t/10) - 5\cos(8\pi t/10) + 6\sin(10\pi t/10)
\]
over the interval [0, 10]. Hint:
\[
p(t) = 2 + \Re\Big(7 e^{\frac{2\pi i 3 t}{10}} - 5 e^{\frac{2\pi i 4 t}{10}}\Big) + \Im\, 6 e^{\frac{2\pi i 5 t}{10}}.
\]


Then consider the Matlab command

    p = real(fft([2; 0; 0; 7; -5], 4096)) + imag(fft([0; 0; 0; 0; 0; -6], 4096));

and plot p over the interval [0, 10].

Problem 6. Consider the function
\[
f(t) = 3 + 4\cos(3t) + 22\cos(28t) - 2\cos(64t) - 22\cos(100t) + 12\sin(120t) + 2\sin(125t).
\]
Let $t_j = \frac{2\pi j}{128}$ for j = 0, 1, 2, · · · , 127 be 128 evenly spaced sample times over the interval [0, 2π) starting at time t0 = 0. Let a be the inverse fast Fourier transform $F_{128}^{-1}$ applied to $\{f(t_j)\}_{j=0}^{127}$; see equation (1.26).

(i) Find a, and verify your result with Matlab.

(ii) Is it possible to reconstruct f(t) from a? Explain why or why not.

(iii) Find the smallest number of samples $\{t_j\}_{j=0}^{\nu-1}$ needed to perfectly reconstruct f(t). Using these time samples, let b be the inverse fast Fourier transform $F_\nu^{-1}$ applied to $\{f(t_j)\}_{j=0}^{\nu-1}$ computed in Matlab. Then reconstruct f(t) from b.

3.2 The discrete Fourier transform and Fourier series

In this section, we will expand on Remark 3.1.3, and show how the discrete Fourier transform can be used to compute a Fourier series approximation for a function. To this end, let f be a function in L2(0, τ). According to Theorem 2.1.1 in Chapter 2, the function f(t) admits a Fourier series expansion of the form
\[
f(t) = \sum_{k=-\infty}^{\infty} a_k e^{-ik\omega_0 t} \quad\text{where}\quad a_k = \frac{1}{\tau}\int_0^\tau f(t) e^{ik\omega_0 t}\,dt \tag{2.1}
\]
and $\omega_0 = \frac{2\pi}{\tau}$ is the fundamental frequency. By Corollary 2.1.2 in Chapter 2, one can choose a finite set of integers K such that $p(t) = \sum_{k\in K} a_k e^{-ik\omega_0 t}$ is a good approximation of f. To be precise, given any ε > 0 one can choose a finite set of integers K such that the distance from f to p in the L2(0, τ) norm is less than ε, that is, ‖f − p‖ < ε.

Let $f(t) = \sum_{-\infty}^{\infty} a_k e^{-ik\omega_0 t}$ be the Fourier series for a function f in L2(0, τ). In many instances one can use fast Fourier transform methods to compute a trigonometric polynomial approximation $p_K(t) = \sum_{k\in K} c_k e^{-ik\omega_0 t}$ for f(t). As in Remark 3.1.3, let us avoid getting involved in technical issues and assume that f is a "well behaved" function. To compute the Fourier coefficients, choose $\nu = 2^\ell$ such that ν is sufficiently large. Then compute $\{f(t_j)\}_{j=0}^{\nu-1}$ where $t_j = \frac{j\tau}{\nu}$ for j = 0, 1, 2, · · · , ν − 1. Now use the fast inverse Fourier transform to compute the coefficients $\{c_k\}_{k=-m}^{n}$ corresponding to $\{f(t_j)\}_{j=0}^{\nu-1}$, that is,
\[
\begin{bmatrix} c_0\\ c_1\\ \vdots\\ c_{-2}\\ c_{-1} \end{bmatrix}
= \nu^{-1} F_\nu^* \begin{bmatrix} f(t_0)\\ f(t_1)\\ \vdots\\ f(t_{\nu-2})\\ f(t_{\nu-1}) \end{bmatrix}. \tag{2.2}
\]
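The matrix-vector product in (2.2) looks like $\nu^2$ work, but with ν a power of two it can be organized into on the order of ν log ν operations. As an illustration of the idea (ours, in Python rather than Matlab; production code would use an optimized library such as FFTW or Matlab's own ifft), here is a minimal radix-2 Cooley–Tukey inverse FFT matching the normalization of Matlab's ifft:

```python
import cmath, math

def ifft(x):
    """Radix-2 Cooley-Tukey inverse FFT; len(x) must be a power of two.
    Returns (1/nu) * F_nu^* x, the same normalization as Matlab's ifft."""
    def rec(x):
        n = len(x)
        if n == 1:
            return list(x)
        even, odd = rec(x[0::2]), rec(x[1::2])
        out = [0j] * n
        for k in range(n // 2):
            # +i in the twiddle factor because this is the inverse transform.
            tw = cmath.exp(2j * cmath.pi * k / n) * odd[k]
            out[k] = even[k] + tw
            out[k + n // 2] = even[k] - tw
        return out
    nu = len(x)
    return [v / nu for v in rec(x)]

# Sample f(t) = 1 + 2*cos(t) on [0, 2*pi): coefficients c_0 = c_1 = c_{-1} = 1.
nu = 8
samples = [1 + 2 * math.cos(2 * math.pi * j / nu) for j in range(nu)]
c = ifft(samples)
# c[0] and c[1] are 1, and c[nu-1] (which encodes c_{-1}) is 1; the rest vanish.
```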


Notice that this calculation can be done in ν log ν operations.

As in Remark 3.1.3, let us assume that we have chosen a sufficiently large $\nu = 2^\ell$ for f to guarantee that $c_k \approx a_k$ for all k ∈ K. (For numerical efficiency, one would choose $\nu = 2^\ell$, if possible.) Here K is a finite set of integers such that $|k| \ll \frac{\nu}{2}$ for all k ∈ K, and $c_k$ is approximately equal to the Riemann integral $\frac{1}{\tau}\int_0^\tau e^{ik\omega_0 t} f(t)\,dt$ for k ∈ K, and $\sum_{k\notin K} |a_k|^2 \approx 0$. Moreover, the set K is chosen by analyzing the power spectrum formed by $\{c_k\}$. In this case, roughly speaking, for large enough ν, we have

\[
\Big\| f - \sum_{k\in K} c_k e^{-ik\omega_0 t} \Big\| \approx \Big\| f - \sum_{k\in K} a_k e^{-ik\omega_0 t} \Big\| = \sqrt{\sum_{k\notin K} |a_k|^2} \approx 0
\]
\[
\|f\| = \sqrt{\frac{1}{\tau}\int_0^\tau |f(t)|^2\,dt} \approx \sqrt{\sum_{k=-m}^{n} |c_k|^2}. \tag{2.3}
\]

In other words, if the function f is "well behaved", then one can use the inverse fast Fourier transform to compute a trigonometric polynomial approximation $p_K(t) = \sum_{k\in K} c_k e^{-ik\omega_0 t}$ for f(t), and to approximate the norm ‖f‖ of f.

If f(t) is a real valued function, then $a_{-k} = \overline{a_k}$ for all integers k. In this case, the function f in L2(0, τ) also admits a Fourier series expansion of the form
\[
f(t) = \sum_{k=-\infty}^{\infty} a_k e^{-ik\omega_0 t} = a_0 + \sum_{k=1}^{\infty}\big(a_k e^{-ik\omega_0 t} + \overline{a_k}\, e^{ik\omega_0 t}\big) = a_0 + \sum_{k=1}^{\infty} 2\,\Re\big(a_k e^{-ik\omega_0 t}\big)
\]
\[
= a_0 + \sum_{k=1}^{\infty} 2\big(\Re(a_k)\cos(k\omega_0 t) + \Im(a_k)\sin(k\omega_0 t)\big); \tag{2.4}
\]

see Remark 2.3.2. If f(t) is a real valued function, then $c_{-k} = \overline{c_k}$ for all $|k| < \frac{\nu}{2}$; see Proposition 3.3.1 below. As before, let us assume that ν is sufficiently large and we have chosen a set K such that (2.3) holds. Because f(t) is a real valued function, we also assume that if k ∈ K, then −k ∈ K. Then our approximation $p_K(t)$ for f(t) is given by
\[
p_K(t) = \sum_{k\in K} c_k e^{-ik\omega_0 t} = c_0 + \sum_{1\le k\in K}\big(c_k e^{-ik\omega_0 t} + \overline{c_k}\, e^{ik\omega_0 t}\big) = c_0 + \sum_{1\le k\in K} 2\,\Re\big(c_k e^{-ik\omega_0 t}\big)
\]
\[
= c_0 + \sum_{1\le k\in K} 2\big(\Re(c_k)\cos(k\omega_0 t) + \Im(c_k)\sin(k\omega_0 t)\big). \tag{2.5}
\]
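As a numerical sanity check of this reconstruction (a pure-Python sketch of ours, not from the notes), the following samples a real trigonometric polynomial, computes a few coefficients $c_k$ with a direct inverse DFT, and rebuilds the signal from $c_0$ and the terms $2\,\Re(c_k e^{-ik\omega_0 t})$ with $\omega_0 = 1$:

```python
import cmath, math

nu, tau = 64, 2 * math.pi
ts = [j * tau / nu for j in range(nu)]
f = lambda t: 1 + math.cos(2 * t) + 0.5 * math.sin(3 * t)

# c_k = (1/nu) * sum_j f(t_j) e^{2*pi*i*j*k/nu}, entry k of Matlab's ifft.
c = [sum(f(ts[j]) * cmath.exp(2j * cmath.pi * j * k / nu)
         for j in range(nu)) / nu
     for k in range(5)]

# Rebuild via the real form: p_K(t) = c_0 + sum_k 2 Re(c_k e^{-ikt}).
def p_K(t):
    return (c[0] + sum(2 * (c[k] * cmath.exp(-1j * k * t)).real
                       for k in range(1, 5))).real

err = max(abs(p_K(t) - f(t)) for t in ts)
# err is tiny: the trigonometric polynomial is recovered exactly up to rounding,
# with c[2] = 1/2 (from cos(2t)) and c[3] = i/4 (from 0.5*sin(3t)).
```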

3.2.1 A Fourier series example

Consider the function f in L2(0, 2π) defined by
\[
f(t) = \cos(2t) + e^{\sin(6t)}. \tag{2.6}
\]
Since f(t) is a continuous 2π periodic function, the Gibbs phenomenon does not play a role in this example. Moreover, the derivatives $\frac{df}{dt}$ and $\frac{d^2 f}{dt^2}$ are also continuous 2π periodic functions. Therefore f(t) is in the Wiener algebra $W_{2\pi}$; see Section 2.1.1. In particular, if


$\sum_{-\infty}^{\infty} a_k e^{-ikt}$ is the Fourier series for f(t), then $\sum_{-n}^{n} a_k e^{-ikt}$ converges uniformly to f(t); see Theorem 2.1.6. (The fundamental frequency is ω0 = 1.)

Let us use a fast Fourier transform of size $\nu = 2^{12} = 4096$ to compute a Fourier series approximation for f. Because τ = 2π, the 4096 sample points are $t_j = \frac{2\pi j}{4096}$ for j = 0, 1, 2, · · · , 4095. (One can use the linspace command in Matlab to generate sample points; however, linspace(0, 2*pi, 4096) includes the endpoint t = 2π rather than stopping at $t_{\nu-1}$, and this can introduce numerical errors.) Let c be the vector in $\mathbb{C}^{4096}$ obtained by taking the inverse fast Fourier transform (ifft in Matlab) of the vector $[f(t_0)\ f(t_1)\ \cdots\ f(t_{4095})]^{tr}$ in $\mathbb{C}^{4096}$, that is,

\[
c = F_\nu^{-1}\,[f(t_0)\ f(t_1)\ f(t_2)\ \cdots\ f(t_{4094})\ f(t_{4095})]^{tr}
= [c_0\ c_1\ c_2\ \cdots\ c_{2048}\ c_{-2047}\ \cdots\ c_{-3}\ c_{-2}\ c_{-1}]^{tr}
= [c(1)\ c(2)\ c(3)\ \cdots\ c(4094)\ c(4095)\ c(4096)]^{tr}. \tag{2.7}
\]

(If one takes the ifft in Matlab of the row vector $[f(t_0)\ f(t_1)\ \cdots\ f(t_{4095})]$, then Matlab returns the row vector $[c_0\ c_1\ \cdots\ c_{-2}\ c_{-1}]$; see Remark 3.1.5.) Because Matlab has no zero index, the last vector in (2.7) is how Matlab encodes the vector c, that is, in Matlab
\[
c(1) = c_0,\quad c(2) = c_1,\quad c(3) = c_2,\ \cdots,\ c(4095) = c_{-2} \quad\text{and}\quad c(4096) = c_{-1}.
\]
Since we used a large fast Fourier transform, $a_k \approx c_k$ for all $|k| < 2048 = 2^{11}$ and $\sum_{|k|\ge 2048} |a_k|^2 \approx 0$. (In fact, $\sum_{|k|>60} |a_k|^2$ is on the order of $10^{-22}$; see equation (2.9) below.) Because f(t) is a real valued function, we have $a_{-k} = \overline{a_k}$ for all integers k, and $c_{-k} = \overline{c_k}$; see Corollary 2.1.3 in Chapter 2 and Proposition 3.3.1 below.

Figure 3.2: The power spectrum $|c_k|^2$ in the top plot and $|c_k|^{0.1}$ in the bottom plot, for 0 ≤ k ≤ 60.

To find a trigonometric polynomial $p_K(t)$ to approximate f(t), we will use the power spectrum. The top graph in Figure 3.2 plots the power spectrum $|c_k|^2$ on the y axis for


k ∈ [0, 60]. Since $a_k \approx c_k$, this is essentially the power spectrum for f. Because f(t) is a real valued function, the power spectrum is symmetric about the y axis, so we only plotted the power spectrum for nonnegative k. The power spectrum in Figure 3.2 shows that all the significant frequencies for f are contained in K = {0, ±2, ±6, ±12, ±18}. The frequencies ±18 are there, but with very small amplitudes. (Since the period τ = 2π, the integers in K are actual frequencies for f. If τ ≠ 2π and k ∈ K, then the corresponding frequency for f is kω0.) Finally, to show that the higher frequencies are indeed present, we plotted $|c_k|^{1/10}$ for 0 ≤ k ≤ 60 in the bottom graph, which highlights the smaller amplitudes. By examining this graph, it appears that f has infinitely many frequencies at
\[
K_\infty = \{0, \pm 2, \pm 6, \pm 12, \pm 18, \pm 24, \cdots\}.
\]

Inspired by our spectral analysis, when searching for a trigonometric polynomial approximation $p_K(t)$ for f(t), it makes sense to choose frequencies from the set $K_\infty$. Based on this observation with the power spectrum for f, we have chosen $p_K(t) = \sum_{k\in K} c_k e^{-ikt}$ as our approximation for f(t). For comparison, with a smaller number of frequencies, consider the trigonometric polynomial $p_S(t) = \sum_{k\in S} c_k e^{-ikt}$ generated by the set S = {0, ±2, ±6}. Because f(t) is a real valued function, $c_{-k} = \overline{c_k}$, and thus, the trigonometric polynomials $p_S(t)$ and $p_K(t)$ are given by
\[
p_S(t) = \sum_{k\in S} c_k e^{-ikt} = c_0 + \sum_{k=2,6} 2\big(\Re(c_k)\cos(kt) + \Im(c_k)\sin(kt)\big)
\]
\[
p_K(t) = \sum_{k\in K} c_k e^{-ikt} = c_0 + \sum_{k=2,6,12,18} 2\big(\Re(c_k)\cos(kt) + \Im(c_k)\sin(kt)\big). \tag{2.8}
\]

Using Matlab we computed:
\[
\|f\| = \sqrt{\frac{1}{2\pi}\int_0^{2\pi} |f(t)|^2\,dt} \approx \sqrt{\sum_k |c_k|^2} = 1.6672
\]
\[
\|f - p_S\| = \sqrt{\frac{1}{2\pi}\int_0^{2\pi} |f(t) - p_S(t)|^2\,dt} \approx \sqrt{\sum_{k\notin S} |c_k|^2} = 0.1946
\]
\[
\|f - p_K\| = \sqrt{\frac{1}{2\pi}\int_0^{2\pi} |f(t) - p_K(t)|^2\,dt} \approx \sqrt{\sum_{k\notin K} |c_k|^2} = 0.0039
\]
\[
\sum_{|k|>60} |a_k|^2 \approx 2\sum_{k=62}^{2048} |c_k|^2 = 3.1253\times 10^{-22}
\]
\[
\big| c(3) - \tfrac{1}{2}\big| = 1.0345\times 10^{-16} \quad\text{and}\quad \big| c(4095) - \tfrac{1}{2}\big| = 1.2759\times 10^{-16}. \tag{2.9}
\]
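The key numbers above can be reproduced outside Matlab as well. The following pure-Python sketch (ours) uses a direct O(ν²) inverse DFT with a modest ν = 256 in place of the length-4096 ifft; that is enough here because the significant frequencies of f are all far below ν/2:

```python
import cmath, math

nu = 256
ts = [2 * math.pi * j / nu for j in range(nu)]
fs = [math.cos(2 * t) + math.exp(math.sin(6 * t)) for t in ts]

# Entry k of Matlab's ifft: c_k = (1/nu) * sum_j f(t_j) e^{2*pi*i*j*k/nu}.
def coeff(k):
    return sum(fs[j] * cmath.exp(2j * cmath.pi * j * k / nu)
               for j in range(nu)) / nu

c2 = coeff(2)   # coefficient of e^{-2it}: 1/2, coming from the cos(2t) term
norm_f = math.sqrt(sum(abs(coeff(k)) ** 2 for k in range(nu)))
# c2 is approximately 0.5 and norm_f approximately 1.6672, matching (2.9).
```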

Figure 3.3 presents a graph of pS(t), pK(t) and f(t) over the interval [0, 2π]. Because theerror ‖f − pK‖ ≈ 0.0039 is small relative to ‖f‖ ≈ 1.6672, the graphs of pK(t) and f(t) lookalmost identical. The error ‖f − pS‖ = 0.1946 is not close to zero, and thus, the graph of


Figure 3.3: The graph of $\cos(2t) + e^{\sin(6t)}$ with $p_S(t)$ and $p_K(t)$, over 0 ≤ t ≤ 2π.

$p_S(t)$ deviates from f(t); see Figure 3.3. In the next section, we will use Bessel functions to gain some further insight into the spectrum of f, and present a formula for the Fourier series of $e^{\sin(6t)}$; see (2.18). Finally, all the significant frequencies of f(t) are contained in the Nyquist frequency range (−2048, 2048). Therefore there are no issues in reconstructing f(t) by using a 4096 (ifft) fast Fourier transform; see Section 3.1.1.

The Matlab commands we used to compute $p_S(t)$ and $p_K(t)$ are given by

    t = (0:4095)' * 2*pi/4096;
    f = cos(2*t) + exp(sin(6*t));
    c = ifft(f);                          % norm(c) = 1.6672, the norm of f
    p = c(1);
    p = p + 2*real(c(3)*exp(-i*2*t));
    p = p + 2*real(c(7)*exp(-i*6*t));     ps = p;
    p = p + 2*real(c(13)*exp(-i*12*t));
    p = p + 2*real(c(19)*exp(-i*18*t));   pk = p;

The Matlab commands we used to compute the different norms in (2.9) are given by

    norm(ifft(f))         % = 1.6672, the norm of f
    norm(ifft(f - ps))    % = 0.1946, the norm of f - pS
    norm(ifft(f - pk))    % = 0.0039, the norm of f - pK
    2*norm(c(62:2048))^2  % = 3.1253e-22, approximates sum over |k|>60 of |a_k|^2
    abs(c(3) - 1/2)       % = 1.0345e-16
    abs(c(4095) - 1/2)    % = 1.2759e-16


The Matlab commands we used to plot Figure 3.2 and Figure 3.3 are given by

subplot(2,1,1); bar((0:60), abs(c(1:61)).∧2); grid;xlabel(’k’); title(’|ck|2 for 0 ≤ k ≤ 60’);

subplot(2,1,2); bar((0:60), abs(c(1:61)).∧.1); grid;xlabel(’k’); title(’|ck|0.1 for 0 ≤ k ≤ 60’); hold off;

plot(t,f); hold on; plot(t,ps,’c’); plot(t,pk,’r’); grid;

xlabel(’t’); title(’The graph of f(t) and pS(t) and pK(t)’);

To complete this section, let us prove that the frequencies for f are indeed contained in the set $K_\infty = \{0, \pm 2, \pm 6k\}_{k=1}^{\infty}$, that is, $f(t) = \sum_{k\in K_\infty} a_k e^{-ikt}$. To see this, first notice that the frequencies ±2 are due to the cos(2t) term in
\[
f(t) = \cos(2t) + e^{\sin(6t)} = \frac{e^{-2it}}{2} + \frac{e^{2it}}{2} + e^{\sin(6t)} = \sum_{k=-\infty}^{\infty} a_k e^{-ikt}.
\]
This observation is also confirmed by Matlab, because $a_2 \approx c(3) = \frac{1}{2}$ and $a_{-2} \approx c(4095) = \frac{1}{2}$. (In Matlab, c(3) and c(4095) equal $\frac{1}{2}$ up to errors on the order of $10^{-16}$; see (2.9).) We claim that the

remaining frequencies $\{0, \pm 6k\}_{k=1}^{\infty}$ in $K_\infty$ are due to the $e^{\sin(6t)}$ term in f(t). To verify this, recall that $e^z$ admits a power series expansion of the form

\[
e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!} \quad\text{and}\quad (x+y)^n = \sum_{j=0}^{n} \binom{n}{j} x^{n-j} y^j \tag{2.10}
\]
where $\binom{n}{j} = \frac{n!}{j!(n-j)!}$ is the binomial coefficient. Using this we obtain

\[
e^{\sin(6t)} = \sum_{n=0}^{\infty} \frac{\sin(6t)^n}{n!} = \sum_{n=0}^{\infty} \frac{(e^{i6t} - e^{-i6t})^n}{2^n i^n n!} = \sum_{n=0}^{\infty} \frac{(-i)^n}{2^n n!}\left(\sum_{j=0}^{n} \binom{n}{j} (-1)^j e^{-i6(2j-n)t}\right). \tag{2.11}
\]

By rearranging terms, this shows that $e^{\sin(6t)}$ admits a Fourier series expansion of the form $e^{\sin(6t)} = \sum_{-\infty}^{\infty} d_k e^{-6ikt}$. Thus $e^{\sin(6t)}$ only contains frequencies in the set $\{0, \pm 6k\}_{k=1}^{\infty}$, and does not have any frequency component at ±2. The frequencies at ±2 are present because of the cos(2t) term in f(t). Hence all the frequencies of $f(t) = \cos(2t) + e^{\sin(6t)}$ are contained in $K_\infty$. In other words, the Fourier series for f(t) is of the form $f(t) = \sum_{-\infty}^{\infty} a_k e^{-ikt}$ where $a_k = 0$ when $k \notin K_\infty$. Therefore our conjecture that the frequencies for f are contained in $K_\infty$ is indeed correct.

3.2.2 Bessel functions

In this section, we will present a short introduction to integer Bessel functions of the first kind. Then we will use Bessel functions to provide some further insight into the Fourier series expansion for $f(t) = \cos(2t) + e^{\sin(6t)}$. Bessel functions are solutions to certain differential equations. However, we are only interested in the Fourier series associated with


Bessel functions, and will not pursue Bessel differential equations. Finally, it is noted thatthe results in this section will not be used in the rest of the notes and can be skipped by theuninterested reader. For further results on Bessel functions and other special functions, see[41] and the Wikipedia webpage on Bessel functions.

The Bessel function of the first kind $J_k(z)$, when k is an integer, can be represented by
\[
J_k(z) = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{ikt - iz\sin(t)}\,dt = \frac{1}{2\pi}\int_0^{2\pi} e^{ikt} e^{-iz\sin(t)}\,dt \qquad (z\in\mathbb{C}). \tag{2.12}
\]

It is emphasized that k is an integer and z is a complex number. Because $e^{ikt - iz\sin(t)}$ is a 2π periodic function of t, the integral over any interval of length 2π is the same. For each z ∈ C, let g(t, z) be the function defined by $g(t, z) = e^{-iz\sin(t)}$. For each fixed z, the function g(t, z) is a continuous 2π periodic function. In particular, g(t, z) is in L2(0, 2π). By consulting equation (2.12), we see that the Bessel function $J_k(z)$ is the Fourier coefficient of $e^{-ikt}$ in the Fourier series expansion for $e^{-iz\sin(t)}$. In other words, the Fourier series expansion for $e^{-iz\sin(t)}$ is given by
\[
e^{-iz\sin(t)} = \sum_{k=-\infty}^{\infty} J_k(z) e^{-ikt} \quad\text{where}\quad J_k(z) = \frac{1}{2\pi}\int_0^{2\pi} e^{ikt} e^{-iz\sin(t)}\,dt \qquad (z\in\mathbb{C}). \tag{2.13}
\]

Notice that the derivatives $\frac{dg}{dt}$ and $\frac{d^2 g}{dt^2}$ are also continuous 2π periodic functions. Therefore g(t, z) is in the Wiener algebra $W_{2\pi}$; see Section 2.1.1. In particular, for fixed z, the partial Fourier series $\sum_{-n}^{n} J_k(z) e^{-ikt}$ converges uniformly to $e^{-iz\sin(t)}$; see Theorem 2.1.6.

The Bessel functions $J_k(z)$ have the following properties.

☼ The Bessel function satisfies
\[
J_{-k}(z) = J_k(z)(-1)^k \qquad \text{(for all integers } k \text{ and } z\in\mathbb{C}\text{)}. \tag{2.14}
\]

☼ If z is a real number, then $J_k(z)$ is a real number.

☼ If z is a purely imaginary number, then $J_k(z)$ is real for even k and purely imaginary for odd k.

To show that $J_{-k}(z) = J_k(z)(-1)^k$, recall that $-\sin(\theta) = \sin(\theta - \pi)$. Using this we have
\[
J_{-k}(z) = \frac{1}{2\pi}\int_0^{2\pi} e^{-ikt} e^{-iz\sin(t)}\,dt = \frac{1}{2\pi}\int_0^{2\pi} e^{-ikt} e^{iz\sin(t-\pi)}\,dt
= \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-ik(t+\pi)} e^{iz\sin(t)}\,dt
\]
\[
= \frac{e^{-ik\pi}}{2\pi}\int_{-\pi}^{\pi} e^{-ikt} e^{iz\sin(t)}\,dt
= -\frac{(-1)^k}{2\pi}\int_{\pi}^{-\pi} e^{ikt} e^{iz\sin(-t)}\,dt
= \frac{(-1)^k}{2\pi}\int_{-\pi}^{\pi} e^{ikt} e^{-iz\sin(t)}\,dt = J_k(z)(-1)^k.
\]
Therefore $J_{-k}(z) = J_k(z)(-1)^k$ for all integers k.


Now assume that z = x is a real number. Then
\[
\overline{J_k(x)} = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-ikt} e^{ix\sin(t)}\,dt = -\frac{1}{2\pi}\int_{\pi}^{-\pi} e^{ikt} e^{ix\sin(-t)}\,dt = J_k(x).
\]
Therefore $\overline{J_k(x)} = J_k(x)$, and thus, $J_k(z)$ is a real number when z is a real number.

Now assume that z = ia is a purely imaginary number, that is, a is a real number. Then $e^{-iz\sin(t)} = e^{a\sin(t)}$ is a real valued function. Using $J_{-k}(z) = J_k(z)(-1)^k$, we have
\[
\overline{J_k(ia)} = \frac{1}{2\pi}\int_0^{2\pi} e^{-ikt} e^{a\sin(t)}\,dt = J_{-k}(ia) = J_k(ia)(-1)^k.
\]
Hence $\overline{J_k(ia)} = J_k(ia)(-1)^k$. So $J_k(ia)$ is real when k is even, and purely imaginary when k is odd.

Using $J_{-k}(z) = J_k(z)(-1)^k$ in the Fourier series for $e^{-iz\sin(t)}$ in (2.13), we have
\[
e^{-iz\sin(t)} = \sum_{k=-\infty}^{\infty} J_k(z) e^{-ikt} = J_0(z) + \sum_{k=1}^{\infty}\big(J_k(z) e^{-ikt} + J_{-k}(z) e^{ikt}\big)
\]
\[
= J_0(z) + \sum_{\text{even } k\ge 2} J_k(z)\big(e^{-ikt} + e^{ikt}\big) + \sum_{\text{odd } k\ge 1} J_k(z)\big(e^{-ikt} - e^{ikt}\big)
= J_0(z) + 2\sum_{\text{even } k\ge 2} J_k(z)\cos(kt) - 2\sum_{\text{odd } k\ge 1} J_k(z)\, i\sin(kt).
\]

In other words, $e^{-iz\sin(t)}$ admits a Fourier series expansion of the form
\[
e^{-iz\sin(t)} = J_0(z) + 2\sum_{\text{even } k\ge 2} J_k(z)\cos(kt) - 2\sum_{\text{odd } k\ge 1} J_k(z)\, i\sin(kt). \tag{2.15}
\]
Because $e^{-iz\sin(t)}$ is in the Wiener algebra $W_{2\pi}$, this Fourier series converges uniformly to $e^{-iz\sin(t)}$ for each z in C.

A fast Fourier transform method to compute the Bessel function $J_k(z)$. The Matlab command to compute the Bessel function $J_k(z)$ is besselj(k, z). One can also use fast Fourier transform techniques to compute $J_k(z)$. To see this, let $\nu = 2^n$ where n is a positive integer. Let $t_j = \frac{2\pi j}{\nu}$ for j = 0, 1, 2, · · · , ν − 1 be ν sample points over the interval [0, 2π) starting at t0 = 0. Let $c = [c(1)\ c(2)\ \cdots\ c(\nu)]$ be the inverse fast Fourier transform of $\{e^{-iz\sin(t_j)}\}_{j=0}^{\nu-1}$ in Matlab. Then for sufficiently large ν, we have
\[
J_0(z) \approx c(1),\quad J_1(z) \approx c(2),\quad J_2(z) \approx c(3),\ \cdots,\ J_{-2}(z) \approx c(\nu-1),\quad J_{-1}(z) \approx c(\nu).
\]
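The same computation is easy to reproduce outside Matlab. The following pure-Python sketch (ours, with no special-function library) approximates $J_k(z)$ by one entry of the inverse DFT of the samples $e^{-iz\sin t_j}$, exactly as above, and checks it against the known value $J_0(1) \approx 0.7651976866$:

```python
import cmath

def bessel_j(k, z, nu=256):
    """Approximate J_k(z) from (2.12) by a Riemann sum over nu equally
    spaced points on [0, 2*pi), i.e. entry k of the inverse DFT of the
    samples e^{-iz sin(t_j)}.  Converges spectrally fast in nu."""
    total = 0j
    for j in range(nu):
        t = 2 * cmath.pi * j / nu
        total += cmath.exp(1j * k * t - 1j * z * cmath.sin(t))
    return total / nu

j0 = bessel_j(0, 1.0)   # J_0(1) = 0.7651976866...
j1 = bessel_j(1, 1.0)   # J_1(1) = 0.4400505857...
# The symmetry (2.14), J_{-k}(z) = (-1)^k J_k(z), also checks out numerically.
```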

For example, we choose $\nu = 2^{14}$ with z = 2 + i, and let c be the inverse fast Fourier transform (ifft in Matlab) of $\{e^{-iz\sin(t_j)}\}_{j=0}^{\nu-1}$. Then using Matlab, we obtained
\[
c(1) = 0.1879 - 0.6462i \quad\text{and}\quad |J_0(2+i) - c(1)| = 5.5511\times 10^{-17};
\]
\[
c(5) = -0.0044 + 0.0560i \quad\text{and}\quad |J_4(2+i) - c(5)| = 4.0460\times 10^{-17};
\]
\[
c(2^{14} - 2) = -0.0824 - 0.1754i \quad\text{and}\quad |J_{-3}(2+i) - c(2^{14}-2)| = 1.4752\times 10^{-16}.
\]


The Matlab commands we used to compute c and the corresponding errors are given by

    t = (0:2^14 - 1) * 2*pi/(2^14);
    z = 2 + i;  g = exp(-i*z*sin(t));  c = ifft(g);
    c(1)         % = 0.1879 - 0.6462i;  abs(besselj(0,z) - c(1)) = 5.5511e-17
    c(5)         % = -0.0044 + 0.0560i; abs(besselj(4,z) - c(5)) = 4.0460e-17
    c(2^14 - 2)  % = -0.0824 - 0.1754i; abs(besselj(-3,z) - c(2^14-2)) = 1.4752e-16

The function $f(t) = \cos(2t) + e^{\sin(6t)}$. Let us return to our function $f(t) = \cos(2t) + e^{\sin(6t)}$, studied in Section 3.2.1 using fast Fourier transform techniques; see (2.6). Set z = ia where a is a complex number. (In our application a is a real number.) Observe that $e^{a\sin(t)} = g(t, ia) = e^{-i(ia)\sin(t)}$. By consulting the Fourier series in (2.15) with z = ia, we see that $e^{a\sin(t)}$ admits a Fourier series expansion of the form
\[
e^{a\sin(t)} = J_0(ia) + 2\sum_{\text{even } k\ge 2} J_k(ia)\cos(kt) - 2\sum_{\text{odd } k\ge 1} J_k(ia)\, i\sin(kt)
\]
\[
= J_0(ia) + 2\sum_{\text{even } k\ge 2} J_k(ia)\cos(kt) + 2\sum_{\text{odd } k\ge 1} \Im(J_k(ia))\sin(kt) \qquad \text{(if } a \text{ is real)}. \tag{2.16}
\]

Recall that if z is purely imaginary, then $J_k(z)$ is real for even k and purely imaginary for odd k. So when a is a real number, we have $-J_k(ia)\, i = \Im(J_k(ia))$ when k is odd.

By replacing t by 6t and choosing a = 1 in (2.16), we arrive at the Fourier series expansion for $e^{\sin(6t)}$ that we have been looking for, that is,
\[
e^{\sin(6t)} = J_0(i) + 2\sum_{\text{even } k\ge 2} J_k(i)\cos(6kt) + 2\sum_{\text{odd } k\ge 1} \Im(J_k(i))\sin(6kt). \tag{2.17}
\]

Because cos(2t) is already in Fourier series form, we see that the Fourier series expansion for $f(t) = \cos(2t) + e^{\sin(6t)}$ is given by
\[
f(t) = J_0(i) + \cos(2t) + 2\sum_{\text{even } k\ge 2} J_k(i)\cos(6kt) + 2\sum_{\text{odd } k\ge 1} \Im(J_k(i))\sin(6kt). \tag{2.18}
\]

In particular, this shows that the frequencies for f(t) are contained in
\[
K_\infty = \{0, \pm 2, \pm 6, \pm 12, \pm 18, \pm 24, \cdots\}.
\]
This is precisely the set of frequencies that we conjectured were present, in our analysis of the power spectrum for f, using fast Fourier transform techniques.

Recall that we chose to keep the frequencies K = {0, ±2, ±6, ±12, ±18} for our approximation $p_K(t)$ of the function $f(t) = \cos(2t) + e^{\sin(6t)}$. In terms of Bessel functions,
\[
p_K(t) = J_0(i) + \cos(2t) + 2J_2(i)\cos(12t) + 2\sum_{k=1,3} \Im(J_k(i))\sin(6kt)
\]
\[
\|f - p_K\| = \sqrt{\sum_{4\le |k|} |J_k(i)|^2} = \sqrt{2}\,\sqrt{\sum_{k=4}^{\infty} |J_k(i)|^2} = 0.0039. \tag{2.19}
\]


Finally, it is noted that ‖f − pK‖ = 0.0039 is the same error we obtained using fast Fourier transform methods; see equation (2.9). The Matlab command we used to compute the error is given by

    sqrt(2)*norm(besselj((4:100000), i))   % = 0.0039

Some special Fourier series involving Bessel functions. To complete this section, let us present several more examples of how Bessel functions can be used to construct a Fourier series. For our first example, consider the function $e^{a\cos(t)}$ in L2(0, 2π) where a is a complex number. In particular, a could be a real number. Clearly $e^{a\cos(t)}$ and all of its derivatives are 2π periodic continuous functions. Therefore $e^{a\cos(t)}$ is in the Wiener algebra $W_{2\pi}$. We claim that the Fourier series expansion for the function $e^{a\cos(t)}$ in L2(0, 2π) is given by
\[
e^{a\cos(t)} = J_0(ia) + 2\sum_{k=1}^{\infty} J_k(ia)(-i)^k \cos(kt). \tag{2.20}
\]

Because $e^{a\cos(t)}$ is in the Wiener algebra, this Fourier series converges uniformly to $e^{a\cos(t)}$. Since $e^{a\cos(t)}$ is an even function, the sine terms in the Fourier series expansion are not present. Finally, it is noted that when a is real, $J_k(ia)(-i)^k$ is also real for all integers k. (If z = ia is purely imaginary, then $J_k(z)$ is real for even k and purely imaginary for odd k.)

To derive the Fourier series in (2.20), recall that $\cos(\theta) = \sin(\frac{\pi}{2} - \theta)$. Replacing t by $\frac{\pi}{2} - t$ and setting z = ia in (2.15), we have
\[
e^{a\cos(t)} = e^{-i(ia)\sin(\frac{\pi}{2}-t)}
= J_0(ia) + 2\sum_{\text{even } k\ge 2} J_k(ia)\cos(kt - \tfrac{k\pi}{2}) + 2\sum_{\text{odd } k\ge 1} J_k(ia)\, i\sin(kt - \tfrac{k\pi}{2})
\]
\[
= J_0(ia) + 2\sum_{\text{even } k\ge 2} J_k(ia)\cos(kt)\cos(\tfrac{k\pi}{2}) - 2\sum_{\text{odd } k\ge 1} J_k(ia)\, i\cos(kt)\sin(\tfrac{k\pi}{2})
= J_0(ia) + 2\sum_{k=1}^{\infty} J_k(ia)(-i)^k \cos(kt).
\]

This yields the Fourier series in (2.20).

Notice that cos(a sin(t)) is an even 2π periodic continuous function in L2(0, 2π), while sin(a sin(t)) is an odd 2π periodic continuous function in L2(0, 2π). Moreover, both of these functions have infinitely many derivatives which are also 2π periodic functions. Therefore cos(a sin(t)) and sin(a sin(t)) are also in the Wiener algebra $W_{2\pi}$. The corresponding Fourier series are given by
\[
\cos(a\sin(t)) = J_0(a) + 2\sum_{\text{even } k\ge 2} J_k(a)\cos(kt)
\]
\[
\sin(a\sin(t)) = 2\sum_{\text{odd } k\ge 1} J_k(a)\sin(kt). \tag{2.21}
\]
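These two expansions are easy to check numerically. The pure-Python sketch below (ours) computes $J_k(a)$ from the integral representation (2.12) and compares truncated versions of both series against the left-hand sides at a test point; since $J_k(a)$ decays faster than $1/k!$, twenty terms are far more than enough here:

```python
import cmath, math

def bessel_j(k, a, nu=256):
    # J_k(a) via the integral representation (2.12), as a Riemann sum.
    s = sum(cmath.exp(1j * k * t - 1j * a * math.sin(t))
            for t in (2 * math.pi * j / nu for j in range(nu)))
    return (s / nu).real  # real for real a

a, t0 = 1.3, 0.7
lhs_cos = math.cos(a * math.sin(t0))
rhs_cos = bessel_j(0, a) + 2 * sum(bessel_j(k, a) * math.cos(k * t0)
                                   for k in range(2, 21, 2))   # even k
lhs_sin = math.sin(a * math.sin(t0))
rhs_sin = 2 * sum(bessel_j(k, a) * math.sin(k * t0)
                  for k in range(1, 21, 2))                    # odd k
# Both truncated series match the closed forms to near machine precision.
```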


Because cos(a sin(t)) and sin(a sin(t)) are in the Wiener algebra W2π, the correspondingFourier series in (2.21) converge uniformly to cos(a sin(t)) and sin(a sin(t)), respectively.Finally, it is noted that if a is real, then Jk(a) is real for all integers k.

To derive the previous Fourier series observe that
\[
\cos(z\sin(t)) - i\sin(z\sin(t)) = e^{-iz\sin(t)} = J_0(z) + 2\sum_{\text{even } k\ge 2} J_k(z)\cos(kt) - 2\sum_{\text{odd } k\ge 1} J_k(z)\, i\sin(kt);
\]

see (2.15). Notice that cos(z sin(t)) is an even function of t, while sin(z sin(t)) is an odd function of t. Because the cosine terms in the Fourier series are even and the sine terms are odd, we can evaluate the Fourier series at t and −t to obtain
\[
\cos(z\sin(t)) = \frac{e^{-iz\sin(t)} + e^{-iz\sin(-t)}}{2} = J_0(z) + 2\sum_{\text{even } k\ge 2} J_k(z)\cos(kt)
\]
\[
\sin(z\sin(t)) = \frac{e^{-iz\sin(-t)} - e^{-iz\sin(t)}}{2i} = 2\sum_{\text{odd } k\ge 1} J_k(z)\sin(kt).
\]

This yields the Fourier series in (2.21).

Notice that cos(a cos(t)) and sin(a cos(t)) are both 2π periodic continuous even functions in L2(0, 2π). Moreover, both of these functions have infinitely many derivatives which are also 2π periodic functions. Therefore cos(a cos(t)) and sin(a cos(t)) are also in the Wiener algebra $W_{2\pi}$. The corresponding Fourier series are given by
\[
\cos(a\cos(t)) = \cos(a\sin(t - \tfrac{\pi}{2})) = J_0(a) + 2\sum_{\text{even } k\ge 2} J_k(a)(-1)^{k/2}\cos(kt)
\]
\[
\sin(a\cos(t)) = -\sin(a\sin(t - \tfrac{\pi}{2})) = 2\sum_{\text{odd } k\ge 1} J_k(a)\sin(\tfrac{k\pi}{2})\cos(kt). \tag{2.22}
\]

Because cos(a cos(t)) and sin(a cos(t)) are in the Wiener algebra $W_{2\pi}$, the corresponding Fourier series in (2.22) converge uniformly to cos(a cos(t)) and sin(a cos(t)), respectively. To derive these Fourier series, notice that the trigonometric identity $\cos(\theta) = -\sin(\theta - \frac{\pi}{2})$ yields the first set of equations in (2.22). The second set of equations follows by replacing t by $t - \frac{\pi}{2}$ in the Fourier series (2.21). The details are left to the reader as an exercise.

To complete this section, let us observe that one can also use Bessel functions to find the Fourier series for functions of the form $e^{a\cos(mt)}$, cos(a sin(mt)), sin(a sin(mt)), etc., where m is a nonzero integer. To accomplish this, simply replace t by mt in the corresponding Fourier


series, that is, by replacing t by mt in (2.20) and (2.21), we have
\[
e^{a\cos(mt)} = J_0(ia) + 2\sum_{k=1}^{\infty} J_k(ia)(-i)^k \cos(mkt)
\]
\[
\cos(a\sin(mt)) = J_0(a) + 2\sum_{\text{even } k\ge 2} J_k(a)\cos(mkt)
\]
\[
\sin(a\sin(mt)) = 2\sum_{\text{odd } k\ge 1} J_k(a)\sin(mkt).
\]

The functions $e^{a\cos(mt)}$, cos(a sin(mt)) and sin(a sin(mt)) are in the Wiener algebra $W_{2\pi}$, and thus, the corresponding Fourier series converge uniformly. One can also find the Fourier series for functions of the form sin(a sin(mt) + ϕ) where ϕ is a constant. To this end, simply use the trigonometric identity sin(θ + φ) = sin(θ) cos(φ) + cos(θ) sin(φ) along with our previous Fourier series formulas involving Bessel functions. The details are left to the reader.

One can also use Bessel functions to find the Fourier series for functions of the form $e^{a\cos(\omega_0 t)}$, cos(a sin(ω0t)), sin(a sin(ω0t)), etc., viewed as functions in L2(0, τ). In this case, the fundamental frequency is $\omega_0 = \frac{2\pi}{\tau}$. To find the corresponding Fourier series simply replace t by ω0t in (2.20) and (2.21), that is,
\[
e^{a\cos(\omega_0 t)} = J_0(ia) + 2\sum_{k=1}^{\infty} J_k(ia)(-i)^k \cos(k\omega_0 t)
\]
\[
\cos(a\sin(\omega_0 t)) = J_0(a) + 2\sum_{\text{even } k\ge 2} J_k(a)\cos(k\omega_0 t)
\]
\[
\sin(a\sin(\omega_0 t)) = 2\sum_{\text{odd } k\ge 1} J_k(a)\sin(k\omega_0 t).
\]

It is noted that $e^{a\cos(\omega_0 t)}$, cos(a sin(ω0t)) and sin(a sin(ω0t)) are all in the Wiener algebra $W_\tau$, and thus, the corresponding Fourier series converge uniformly.

3.2.3 Computing the inner product in L2(0, τ)

The fast Fourier transform can also be used to compute inner products and norms of functions in L2(0, τ). To see this, let $f = \sum_{-\infty}^{\infty} a_k e^{-\frac{2\pi i k t}{\tau}}$ and $g = \sum_{-\infty}^{\infty} b_k e^{-\frac{2\pi i k t}{\tau}}$ be, respectively, the Fourier series expansions for the functions f and g in L2(0, τ). According to Theorem 2.1.1 in Chapter 2, we have

\[
a_0 = \frac{1}{\tau}\int_0^\tau f(t)\,dt
\]
\[
\|f\|^2 = \frac{1}{\tau}\int_0^\tau |f(t)|^2\,dt = \sum_{k=-\infty}^{\infty} |a_k|^2 \tag{2.23}
\]
\[
(f, g) = \frac{1}{\tau}\int_0^\tau f(t)\overline{g(t)}\,dt = \sum_{k=-\infty}^{\infty} a_k \overline{b_k}\,.
\]


So to compute the integral of f, the norm of f, and the inner product between f and g, all we need is the Fourier coefficients $\{a_k\}$ and $\{b_k\}$. The fast Fourier transform can be used to obtain an approximation of $\{a_k\}$ and $\{b_k\}$, and thus, to compute ‖f‖ and (f, g).

For example, let f and g be the functions in L2(0, 1) given by f(t) = t and $g(t) = t^2$. Clearly, $\int_0^1 f(t)\,dt = \frac{1}{2}$. Moreover,
\[
\|f\|^2 = \int_0^1 t^2\,dt = \frac{1}{3} \quad\text{and}\quad (f, g) = \int_0^1 t\,t^2\,dt = \frac{1}{4}.
\]

Hence ‖f‖ = 1√3and (f, g) = 1

4. To compute these results by implementing the fast Fourier

transform in Matlab, we used the following commands:

t = (0: 218 − 1)/218;

f = t and g = t. ∧ 2

a = ifft(f) and b = ifft(g) .

First we observed that∫ 1

0f(t)dt = a0 = a(1) = 1

2. (Notice that a0 is a(1) in Matlab.)

Furthermore,

‖f‖ ≈ norm(a) = 0.5773 ≈ 1√3

and (f, g) = a ∗ b′ = 0.25.

This matches the theoretical results.
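The same computation can be reproduced outside Matlab; the following NumPy sketch (an illustration, not from the text) mirrors the commands above with a smaller transform size.

```python
import numpy as np

N = 2**12                      # transform size (2^18 in the text; smaller is enough here)
t = np.arange(N) / N           # N evenly spaced samples of [0, 1)
f, g = t, t**2

a = np.fft.ifft(f)             # approximate Fourier coefficients of f
b = np.fft.ifft(g)             # approximate Fourier coefficients of g

integral = a[0].real                 # a_0 = integral of f over [0, 1], about 1/2
norm_sq = np.sum(np.abs(a)**2)       # ||f||^2, about 1/3
inner = np.sum(a * np.conj(b)).real  # (f, g), about 1/4
print(integral, norm_sq, inner)
```

Note that `ifft` here plays the role of Matlab's `ifft`: its k-th output approximates the k-th Fourier coefficient, and `a * b'` in Matlab corresponds to the sum of a_k times the conjugate of b_k.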

3.2.4 Exercise

Problem 1. Consider the function f in L²(0, 2π) defined by

f(t) = t               if 0 ≤ t ≤ π/2
     = 3π/2 − 2t       if π/2 < t ≤ π
     = −5π/2 + 2t      if π < t ≤ 3π/2
     = 2π − t          if 3π/2 < t ≤ 2π.

Use fast Fourier transform methods of size ν = 2^14, with the sample points t_j = 2πj/ν for j = 0, 1, 2, · · · , ν − 1, to solve the following problems.

(i) Plot the power spectrum for f.

(ii) Find an approximation p(t) = Σ_{k∈K} c_k e^{−ikt} for f(t), where K is a set consisting of no more than seven integers or frequencies. (Here p(t) = p_K(t) and we dropped the subscript K. Since τ = 2π, the integers in K are actual frequencies for f.) Express p(t) as a Fourier series consisting of sinusoids of the form

p(t) = c_0 + Σ_{1≤k∈K} (α_k cos(kt) + β_k sin(kt)) .


(iii) Compute ‖f‖. Compute the distance between f and p, that is, find ‖f − p‖.

(iv) Plot p(t) and f(t) on the same graph over the interval [0, 2π].

Hint: consider the following commands in Matlab to construct f(t):

t = (0 : 2^14 - 1)*2*pi/2^14;
f = t.*(t <= pi/2) + (3*pi/2 - 2*t).*(pi/2 < t) - (3*pi/2 - 2*t).*(pi < t);
f = f + (-5*pi/2 + 2*t).*(pi < t) - (-5*pi/2 + 2*t).*(3*pi/2 < t);
f = f + (2*pi - t).*(3*pi/2 < t); plot(t, f)

Then use c = ifft(f) to compute the Fourier coefficients.
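For readers working outside Matlab, the same piecewise construction can be sketched with NumPy boolean masks (an illustration, not from the text; the interval boundaries follow the problem statement):

```python
import numpy as np

nu = 2**14
t = np.arange(nu) * 2 * np.pi / nu

# Build f piece by piece, following the four cases in the problem statement
f = (t * (t <= np.pi/2)
     + (3*np.pi/2 - 2*t) * ((np.pi/2 < t) & (t <= np.pi))
     + (-5*np.pi/2 + 2*t) * ((np.pi < t) & (t <= 3*np.pi/2))
     + (2*np.pi - t) * (3*np.pi/2 < t))

c = np.fft.ifft(f)      # approximate Fourier coefficients c_k
print(abs(c[0]))        # c_0, the mean value of f over [0, 2*pi)
```

From the coefficient vector `c` one can plot the power spectrum and keep the largest few coefficients to build the approximation p(t).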

Problem 2. Consider the function f in L²(0, 2π) defined by

f(t) = sin(3t) + cos(2 sin(8t) + π/4).

Use fast Fourier transform methods of size ν = 2^14, with the sample points t_j = 2πj/ν for j = 0, 1, 2, · · · , ν − 1, to solve the following problems.

(i) Let c be the inverse fast Fourier transform for {f(t_j)}_0^{ν−1}. Plot the power spectrum for f. By plotting the powers |c_k|^q for some q, make an educated guess to find a set of frequencies K_∞ such that f(t) = Σ_{k∈K_∞} a_k e^{−ikt} is the Fourier series for f(t).

(ii) Find two approximations p_S(t) = Σ_{k∈S} c_k e^{−ikt} and p_K(t) = Σ_{k∈K} c_k e^{−ikt} for f(t), where S is a set consisting of no more than seven integers or frequencies, and K is a set consisting of no more than eleven frequencies. What sets S and K did you use? (Since the period τ = 2π, the integers in K are actual frequencies for f. If τ ≠ 2π and k ∈ K, then the corresponding frequency for f is kω₀.)

(iii) Plot pS(t), pK(t) and f(t) on the same graph over the interval [0, 2π].

(iv) Compute the norm ‖f‖. Find the errors ‖f − pS‖ and ‖f − pK‖.

• For extra credit: explain why f is in the Wiener algebra W_{2π}. Use Bessel functions to find the Fourier series expansion for f(t).

Problem 3. Consider the function f in L²(0, 2π) defined by

f(t) = sin(30t) e^{2 cos(5t)}.

Use fast Fourier transform methods of size ν = 2^14, with the sample points t_j = 2πj/ν for j = 0, 1, 2, · · · , ν − 1, to solve the following problems.

(i) Let c be the inverse fast Fourier transform for {f(t_j)}_0^{ν−1}. Plot the power spectrum for f. By plotting the powers |c_k|^q for some q, make an educated guess to find a set of frequencies K_∞ such that f(t) = Σ_{k∈K_∞} a_k e^{−ikt} is the Fourier series for f(t).


(ii) Find two approximations p_S(t) = Σ_{k∈S} c_k e^{−ikt} and p_K(t) = Σ_{k∈K} c_k e^{−ikt} for f(t), where S is a set consisting of no more than six integers or frequencies, and K is a set consisting of no more than fourteen frequencies. What sets S and K did you use? (Since the period τ = 2π, the integers in K are actual frequencies for f. If τ ≠ 2π and k ∈ K, then the corresponding frequency for f is kω₀.)

(iii) Plot pS(t), pK(t) and f(t) on the same graph over the interval [0, 2π].

(iv) Compute the norm ‖f‖. Find the errors ‖f − pS‖ and ‖f − pK‖.

• For extra credit: explain why f is in the Wiener algebra W_{2π}. Use Bessel functions to find the Fourier series expansion for f(t).

Problem 4. Consider the functions f(t) = e^t and g(t) = t in L²(0, 1). Use the fast Fourier transform to compute the following integrals

∫₀¹ f(t) dt ,   ‖f‖² = ∫₀¹ |f(t)|² dt ,   and   (f, g) = ∫₀¹ f(t) \overline{g(t)} dt .

Compute these integrals analytically and compare your answers. Do you need a 2^13 fast Fourier transform for more accuracy?
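A NumPy sketch of this computation (an illustration, not from the text). The analytic values are ∫₀¹ e^t dt = e − 1, ‖f‖² = (e² − 1)/2, and (f, g) = ∫₀¹ t e^t dt = 1:

```python
import numpy as np

N = 2**13
t = np.arange(N) / N
f, g = np.exp(t), t

a, b = np.fft.ifft(f), np.fft.ifft(g)

integral = a[0].real                 # approximates e - 1
norm_sq = np.sum(np.abs(a)**2)       # approximates (e^2 - 1)/2
inner = np.sum(a * np.conj(b)).real  # approximates 1
print(integral, norm_sq, inner)
```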

Problem 5. Consider the function f in L²(0, 2π) given by

f(t) = t sin(4π²/t).

Notice that f(t) is a continuous function and f(0) = f(2π) = 0. However, f(t) is not of bounded variation. Hence the hypothesis in the Dirichlet convergence Theorem 2.1.4 does not hold. In this case, there is no guarantee that the Fourier series will converge. Moreover, due to the high frequencies contained in f(t), sampling f(t) with fast Fourier transform techniques will certainly lead to reconstruction problems; see Section 3.1.1. Clearly, f(t) is not "well behaved". In light of this, let's continue and see how fast Fourier transform techniques handle this function. In Matlab set t = (0 : 2^20 − 1) ∗ 2π/2^20 and evaluate the 2^20 row vector F corresponding to f(t). In Matlab F(1), corresponding to f(0) = 0, is undefined. So you must set F(1) = 0 in Matlab. (Matlab does not have a zero index. Hence F(1) in Matlab corresponds to the function f(t) evaluated at t = 0.)

(i) Use the ifft command in Matlab to find the Fourier coefficients {a_k} for F. Because the function f(t) is oscillating so fast near the origin, one is not guaranteed to obtain a good approximation of the Fourier coefficients {a_k} in the Fourier series expansion Σ_{−∞}^{∞} a_k e^{−ikt} for f(t). However, to demonstrate how the Fourier series may not converge, let us continue.

(ii) Plot the power spectrum for f in Matlab.

(iii) Find an approximation p_n(t) = Σ_{−n}^{n} a_k e^{−ikt} for f(t).


(iv) Plot f(t) and pn(t) on the same graph and find a region where the Fourier series is notconverging. Plot this region out in Matlab. You may have to use the magnifying glassin Matlab.

Problem 6. This problem is a demonstration of some of the results in Section 2.1.2 of Chapter 2 and Remark 3.1.4. Consider the function f(t) = t²/2 in L²(−π, π).

(i) Show that the coefficients {a_k} in the Fourier series expansion Σ_{−∞}^{∞} a_k e^{−ikt} for the function f(t) = t²/2 in L²(−π, π) are given by

a_0 = (1/2π) ∫_{−π}^{π} (t²/2) dt = π²/6

a_k = (1/2π) ∫_{−π}^{π} (t²/2) e^{ikt} dt = (−1)^k / k²   (for all integers k ≠ 0).   (2.24)

So the Fourier series for the function f(t) = t²/2 in L²(−π, π) is given by

t²/2 = π²/6 + Σ_{k≠0} ((−1)^k / k²) e^{−ikt} = π²/6 + 2 Σ_{k=1}^{∞} (−1)^k cos(kt)/k²   (for −π ≤ t ≤ π).   (2.25)

Plot f(t) = t²/2 and its Fourier series (2.25) on the same graph over the interval [−π, π]. Plot the power spectrum for f.

(ii) Show that the coefficients {d_k} in the Fourier series expansion Σ_{−∞}^{∞} d_k e^{−ikt} for the function f(t − π) = (t − π)²/2 in L²(0, 2π) are given by

d_0 = (1/2π) ∫₀^{2π} ((t − π)²/2) dt = π²/6

d_k = (1/2π) ∫₀^{2π} ((t − π)²/2) e^{ikt} dt = 1/k²   (for all integers k ≠ 0).   (2.26)

In other words, the Fourier series for the function f(t − π) = (t − π)²/2 in L²(0, 2π) is given by

(t − π)²/2 = π²/6 + Σ_{k≠0} (1/k²) e^{−ikt} = π²/6 + 2 Σ_{k=1}^{∞} cos(kt)/k²   (for 0 ≤ t ≤ 2π).   (2.27)

As expected, the Fourier coefficients a_k = (−1)^k d_k for all integers k; see Section 2.1.2 in Chapter 2. Plot f(t − π) = (t − π)²/2 and its Fourier series (2.27) on the same graph over the interval [0, 2π]. Plot the power spectrum for f(t − π). Finally, it is noted that (t − π)²/2 is in the Wiener algebra W_{2π}.


(iii) Set r = (0 : 2^17 − 1) ∗ 2π/2^17 − π in Matlab. Notice that r = [r_0 r_1 r_2 · · · r_{ν−1}] is a row vector of length ν = 2^17 consisting of 2^17 evenly spaced points in [−π, π), starting with −π and ending with (ν − 2)π/ν. In Matlab compute

[c_0 c_1 c_2 · · · c_{−2} c_{−1}] = ifft(r.^2/2) = ifft([r_0²/2 r_1²/2 r_2²/2 · · · r_{ν−1}²/2]).

Notice that {c_k} forms a 2^17 row vector corresponding to sampling f(t) = t²/2 in L²(−π, π) at {r_j}_0^{ν−1}.

(iv) Compare {c_k} computed in Matlab with the Fourier coefficients {d_k} for the Fourier series of (t − π)²/2 in L²(0, 2π) in (2.26). They should be approximately equal, that is, d_k ≈ c_k.

(v) Recall that the Fourier coefficients a_k = (−1)^k d_k. Hence the Fourier coefficients {a_k} for f = t²/2 in L²(−π, π) can be computed in Matlab, that is, a_k ≈ (−1)^k c_k. So check to see if indeed: a_0 ≈ c_0, a_1 ≈ −c_1, a_2 ≈ c_2, a_3 ≈ −c_3, · · · , a_{−2} ≈ c_{−2} and a_{−1} ≈ −c_{−1}.
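Parts (iii) and (iv) can be sketched in NumPy (an illustration, not from the text; a smaller transform size is used). Sampling t²/2 on [−π, π) and applying the inverse FFT recovers the coefficients d_k = 1/k² of the shifted function (t − π)²/2:

```python
import numpy as np

nu = 2**12
r = np.arange(nu) * 2 * np.pi / nu - np.pi   # evenly spaced points of [-pi, pi)

c = np.fft.ifft(r**2 / 2)                    # c_k ~ d_k = 1/k^2 (k != 0), c_0 ~ pi^2/6

print(c[0].real, c[1].real, c[2].real)       # close to pi^2/6, 1, 1/4
```

The key point is that the samples of t²/2 at the shifted grid r_j coincide with the samples of (t − π)²/2 at the standard grid t_j = 2πj/ν, so the inverse FFT returns the {d_k}.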

Problem 7. In this problem, we will use fast Fourier transform techniques to demonstrate that the Cesaro means converge slower than the partial Fourier series, even on well behaved functions; see also Theorem 2.7.4 in Chapter 2. To this end, let f be a function in L²(0, 2π) with Fourier series of the form f(t) = Σ_{−∞}^{∞} a_k e^{−ikt}. Recall that p_n(t) = Σ_{−n}^{n} a_k e^{−ikt} is the n-th partial Fourier series for f(t) and σ_n(t) = (1/n) Σ_{ν=0}^{n−1} p_ν(t) is its n-th Cesaro mean; see Section 2.7 of Chapter 2. Now consider the function f in L²(0, 2π) given by

f(t) = t(2π − t) cos²(2t)

and f(t) = Σ_{−∞}^{∞} a_k e^{−ikt} its corresponding Fourier series. Because f(t) is a continuous function of finite variation and f(0) = f(2π), the Fourier series converges to f(t) for all values of t ∈ [0, 2π]; see the Dirichlet convergence Theorem 2.1.4. Since f(0) = f(2π) and f(t) is continuous, the Gibbs phenomenon does not play a role in this example. The Cesaro means σ_n(t) converge uniformly to f(t) over the interval [0, 2π]; see Theorem 2.7.2. Therefore our function f(t) is well behaved and there are no convergence issues surrounding f(t). In Matlab set t = 2π(0 : 2^17 − 1)/2^17, and let a be the 2^17 row vector computed by a = ifft(f). Recall that a_k ≈ a(k + 1) and a_{−k−1} ≈ a(2^17 − k) in Matlab for k = 0, 1, 2, · · · , 2^16.

(i) Plot the power spectrum for f.

(ii) Parseval's equality states that

π⁴/5 − 195/4096 = (1/2π) ∫₀^{2π} t²(2π − t)² cos⁴(2t) dt = Σ_{k=−∞}^{∞} |a_k|².

We used the syms command in Matlab to obtain the first equality. However, with some work one can also compute this integral by hand. Use fast Fourier transform techniques to verify that Parseval's equality holds for f, that is, in Matlab verify that

π⁴/5 − 195/4096 ≈ norm(a)².
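The Parseval check is easy to reproduce with NumPy (an illustration, not from the text): the value π⁴/5 − 195/4096 ≈ 19.4342 should match the sum of the squared coefficient magnitudes.

```python
import numpy as np

N = 2**12
t = 2 * np.pi * np.arange(N) / N
f = t * (2 * np.pi - t) * np.cos(2 * t)**2

a = np.fft.ifft(f)
lhs = np.pi**4 / 5 - 195 / 4096        # Parseval value from the text
rhs = np.sum(np.abs(a)**2)             # norm(a)^2
print(lhs, rhs)
```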


(iii) Plot the n-th partial Fourier series p_n(t) and f(t) on the same graph over [0, 2π] for two values of n. For each value of n use fast Fourier transform techniques to compute the error

e_n = ‖f − p_n‖ = sqrt( (1/2π) ∫₀^{2π} |f(t) − p_n(t)|² dt ) ≈ norm(ifft(f − p_n)).

See also equation (7.12) in Chapter 2.

(iv) Recall that the n-th Cesaro mean is given by

σ_n(t) = a_0 + Σ_{k=1}^{n−1} ((n − k)/n) ( 2 Re(a_k) cos(kω₀t) + 2 Im(a_k) sin(kω₀t) );

see equation (7.4) in Chapter 2. Plot the Cesaro mean σ_n(t) and f(t) over [0, 2π] on the same graph, for the same values of n you chose in the previous part. For each value of n use fast Fourier transform techniques to compute the error

ε_n = ‖f − σ_n‖ = sqrt( (1/2π) ∫₀^{2π} |f(t) − σ_n(t)|² dt ) ≈ norm(ifft(f − σ_n)).

For some further insight see Theorem 2.7.4 in Chapter 2.

(v) Plot f(t), p_n(t) and σ_n(t) on the same graph for some value of n. One should see that the Cesaro means σ_n(t) converge slower than p_n(t), and that the errors in the Cesaro mean are larger, that is, e_n ≤ ε_n.

(vi) How large does n have to be such that p_10 and σ_n(t) approximate f(t) equally well, that is, find n such that e_10 ≈ ε_n? In fact, find the smallest n such that ε_n ≤ e_10.

(vii) For additional insight, use fast Fourier transform techniques to plot the bar graph of the error ‖f − p_n‖ for 1 ≤ n ≤ 50. Then plot the bar graph of the error ‖f − σ_n‖ for 1 ≤ n ≤ 500. Discuss the difference between these two bar graphs.
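A NumPy sketch of the comparison between the partial sums and the Cesaro means (an illustration, not from the text): truncating or triangularly weighting the coefficient vector and transforming back gives p_n and σ_n at the sample points, and the partial sum should win, e_n ≤ ε_n.

```python
import numpy as np

N = 2**10
t = 2 * np.pi * np.arange(N) / N
f = t * (2 * np.pi - t) * np.cos(2 * t)**2
a = np.fft.ifft(f)

n = 10
k = np.fft.fftfreq(N, d=1.0/N)          # integer frequencies 0, 1, ..., -1 in fft order

# Partial sum p_n: keep the coefficients with |k| <= n
pn = np.fft.fft(np.where(np.abs(k) <= n, a, 0)).real
# Cesaro mean sigma_n: triangular weights (n - |k|)/n for |k| <= n - 1
sigman = np.fft.fft(a * np.clip(1 - np.abs(k)/n, 0, None)).real

e_n = np.linalg.norm(np.fft.ifft(f - pn))        # ||f - p_n||
eps_n = np.linalg.norm(np.fft.ifft(f - sigman))  # ||f - sigma_n||
print(e_n, eps_n, e_n <= eps_n)
```

The inequality e_n ≤ ε_n holds because p_n is the orthogonal projection of f onto the trigonometric polynomials of degree n, and σ_n lies in that same subspace.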

3.3 Properties of the discrete Fourier transform

In this section, we will present several properties of the discrete Fourier transform. As before, let {λ_j}_0^{ν−1} be the ν roots of unity. To be specific, λ_j = e^{−2πij/ν} for j = 0, 1, 2, · · · , ν − 1. In particular, λ_1 = e^{−2πi/ν} and λ_j = λ_1^j. Let F_ν be the ν × ν discrete Fourier transform matrix given in (1.9). Theorem 3.3.2 below shows that F_ν is invertible and F_ν^{−1} = (1/ν) F_ν^*. Recall that y is the discrete Fourier transform of u if y = F_ν u where u and y are vectors in C^ν. In this case, u = F_ν^{−1} y is the inverse discrete Fourier transform of y. For convenience let us write the components of u and y as

u = [a_0 a_1 · · · a_n a_{−m} · · · a_{−2} a_{−1}]^{tr}
y = [y_0 y_1 y_2 y_3 · · · y_{ν−3} y_{ν−2} y_{ν−1}]^{tr}   (3.1)


where n and m are positive integers satisfying n + m + 1 = ν. Recall that tr denotes the transpose. Using the form of u and y in (3.1), we see that y = F_ν u if and only if

[y_0 y_1 · · · y_{ν−2} y_{ν−1}]^{tr} = F_ν [a_0 a_1 · · · a_{−2} a_{−1}]^{tr}, or equivalently, [a_0 a_1 · · · a_{−2} a_{−1}]^{tr} = (1/ν) F_ν^* [y_0 y_1 · · · y_{ν−2} y_{ν−1}]^{tr}.   (3.2)

By consulting the form of F_ν in (1.9), we observe that y = F_ν u if and only if

y_j = a_0 + Σ_{k=1}^{n} a_k λ_j^k + Σ_{k=1}^{m} a_{−k} λ_j^{−k}   (for j = 0, 1, 2, · · · , ν − 1).   (3.3)

Finally, we say that y is a real vector in C^ν if all the components of y are real numbers. The following result is a discrete Fourier transform version of Corollary 2.1.3 in Chapter 2.

PROPOSITION 3.3.1 Let y = F_ν u be the discrete Fourier transform for a vector u in C^ν. Without loss of generality, let us assume that u and y are written in the form presented in (3.1) where n + m + 1 = ν. Moreover, if ν is odd, then we set n = m = (ν − 1)/2, and for even ν, we set m = ν/2 − 1 and n = ν/2 = m + 1. Then y is a real vector in C^ν if and only if

a_0 is real, a_{−k} = \overline{a_k} for k = 1, 2, · · · , m, and a_{m+1} is real when ν is even.   (3.4)

Proof. First let us observe that when ν is even, we have n = ν/2, and thus,

λ_j^{m+1} = λ_j^n = e^{−2πijn/ν} = e^{−πij} = (−1)^j   (when ν is even, n = m + 1 = ν/2).   (3.5)

Now assume that (3.4) holds. Because λ_j is on the unit circle, \overline{λ_j} = λ_j^{−1}. Without loss of generality, we can set a_{m+1} = 0 when ν is odd. By consulting (3.3), we see that the y_j component of y is given by

y_j = a_0 + Σ_{k=1}^{m} a_k λ_j^k + Σ_{k=1}^{m} a_{−k} λ_j^{−k} + a_{m+1} λ_j^{m+1}
    = a_0 + a_{m+1}(−1)^j + Σ_{k=1}^{m} ( a_k λ_j^k + \overline{a_k λ_j^k} )
    = a_0 + a_{m+1}(−1)^j + Σ_{k=1}^{m} 2 Re(a_k λ_j^k).

Because a_0 and a_{m+1} are both real, y_j is real for all j = 0, 1, 2, · · · , ν − 1. Therefore y is a real vector.


Now assume that y is a real vector. Let {y_j}_0^{ν−1} be the components of y; see (3.1). By using (3.3), we have

0 = y_j − \overline{y_j} = Σ_{k=−m}^{n} a_k λ_j^k − Σ_{k=−m}^{n} \overline{a_k} \overline{λ_j}^k = Σ_{k=−m}^{n} a_k λ_j^k − Σ_{k=−m}^{n} \overline{a_k} λ_j^{−k}

= a_0 − \overline{a_0} + Σ_{k=1}^{m} (a_k − \overline{a_{−k}}) λ_j^k + a_{m+1} λ_j^{m+1} − \overline{a_{m+1}} (\overline{λ_j})^{m+1} + Σ_{k=1}^{m} (a_{−k} − \overline{a_k}) λ_j^{−k}

= a_0 − \overline{a_0} + Σ_{k=1}^{m} (a_k − \overline{a_{−k}}) λ_j^k + (a_{m+1} − \overline{a_{m+1}}) λ_j^{m+1} + Σ_{k=1}^{m} (a_{−k} − \overline{a_k}) λ_j^{−k}.

Recall that a_{m+1} is set equal to zero when ν is odd. The last equality follows because for ν even, λ_j^n = (−1)^j = (\overline{λ_j})^n when m + 1 = n = ν/2. By consulting (3.2) and (3.3) with y_j replaced by y_j − \overline{y_j} = 0, we see that

0 = F_ν [a_0 − \overline{a_0}   a_1 − \overline{a_{−1}}   · · ·   a_m − \overline{a_{−m}}   a_n − \overline{a_n}   a_{−m} − \overline{a_m}   · · ·   a_{−1} − \overline{a_1}]^{tr}.

(The a_n − \overline{a_n} term does not exist when ν is odd.) Since the discrete Fourier transform F_ν is invertible, the vector on the right of F_ν must be zero. Therefore (3.4) holds. This completes the proof.
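Proposition 3.3.1 is easy to check numerically. In the sketch below (an illustration, not from the text), a conjugate-symmetric coefficient vector is built in NumPy and its discrete Fourier transform comes out real. Note that numpy's fft computes y_j = Σ_k u_k e^{−2πijk/ν}, which matches (3.3) when u is stored in the order (3.1).

```python
import numpy as np

rng = np.random.default_rng(0)
nu, m = 16, 7                       # even nu, so n = m + 1 = nu/2 = 8
u = np.zeros(nu, dtype=complex)

u[0] = 2.5                          # a_0 real
u[m + 1] = -1.0                     # a_{m+1} = a_{nu/2} real (nu even)
a = rng.normal(size=m) + 1j * rng.normal(size=m)
u[1:m + 1] = a                      # a_1, ..., a_m
u[-m:] = np.conj(a[::-1])           # a_{-k} = conj(a_k), stored at positions nu - k

y = np.fft.fft(u)                   # y = F_nu u
print(np.max(np.abs(y.imag)))       # essentially zero: y is a real vector
```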

It is convenient to present several different expressions for the ν × ν discrete Fourier transform matrix F_ν in (1.9). Notice that F_ν is also given by

F_ν = [ 1   1         1           · · ·  1
        1   λ_1       λ_1²        · · ·  λ_1^{ν−1}
        1   λ_2       λ_2²        · · ·  λ_2^{ν−1}
        ⋮   ⋮         ⋮                  ⋮
        1   λ_{ν−1}   λ_{ν−1}²    · · ·  λ_{ν−1}^{ν−1} ]   (where λ_j = e^{−2πij/ν}).   (3.6)

See (1.10) where λ_1 = e^{−2πi/ν} and λ_j = λ_1^j for j = 0, 1, 2, · · · , ν − 1. The following result shows that F_ν is an invertible matrix.

THEOREM 3.3.2 Let F_ν be the ν × ν discrete Fourier transform matrix given in equation (1.9), or equivalently, (3.6) where λ_j = e^{−2πij/ν} for j = 0, 1, 2, · · · , ν − 1. Then F_ν is invertible, and its inverse is given by

F_ν^{−1} = (1/ν) F_ν^* = (1/ν) \overline{F_ν}.   (3.7)

Here F_ν^{−1} = (1/ν) \overline{F_ν} means taking the complex conjugate of each entry of F_ν and dividing by ν. Finally, F_ν = F_ν^{tr}.

REMARK 3.3.3 Let a = F_ν^{−1} y be the inverse discrete Fourier transform of a vector y in C^ν. Then the first component of a equals the arithmetic mean of y. To be precise, if

[a_0 a_1 a_2 · · · a_{−2} a_{−1}]^{tr} = F_ν^{−1} [y_0 y_1 y_2 · · · y_{ν−1}]^{tr},   (3.8)

then

a_0 = (1/ν) Σ_{j=0}^{ν−1} y_j.   (3.9)

To prove this simply use F_ν^{−1} = (1/ν) \overline{F_ν} along with the fact that the first column of F_ν is all ones; see (3.6).
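Remark 3.3.3 in one line of NumPy (an illustration, not from the text):

```python
import numpy as np

y = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0])
a = np.fft.ifft(y)                 # inverse discrete Fourier transform of y

print(a[0].real, y.mean())         # the first component equals the arithmetic mean
```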

The proof of Theorem 3.3.2 uses the following classical result.

LEMMA 3.3.4 Let z be a complex number. Then

Σ_{k=0}^{ν−1} z^k = (1 − z^ν)/(1 − z)   if z ≠ 1
                  = ν                   if z = 1.   (3.10)

Finally, if |z| < 1, then

Σ_{k=0}^{∞} z^k = 1/(1 − z)   (|z| < 1).   (3.11)

Proof. To verify that (3.10) holds, let s be the sum s = Σ_{k=0}^{ν−1} z^k. Then

s  = 1 + z + z² + z³ + · · · + z^{ν−2} + z^{ν−1}
zs =     z + z² + z³ + · · · + z^{ν−2} + z^{ν−1} + z^ν.

This readily implies that (1 − z)s = s − zs = 1 − z^ν. If z ≠ 1, then we can divide by 1 − z to obtain s = (1 − z^ν)/(1 − z). This yields the first equation in (3.10). If z = 1, then Σ_{k=0}^{ν−1} 1^k = ν. Therefore (3.10) holds. Finally, it is noted that the second equation in (3.10) also follows by applying L'Hospital's rule to the first equation in (3.10).

To verify that (3.11) holds, notice that when |z| < 1 equation (3.10) implies that

Σ_{k=0}^{∞} z^k = lim_{ν→∞} Σ_{k=0}^{ν−1} z^k = lim_{ν→∞} (1 − z^ν)/(1 − z) = 1/(1 − z).

This yields (3.11) and completes the proof.

Proof of Theorem 3.3.2. Let R_j be the j + 1 row of F_ν in (3.6), that is,

R_j = [1 λ_j λ_j² · · · λ_j^{ν−1}]   (for j = 0, 1, 2, · · · , ν − 1).   (3.12)

Recall that ∗ denotes the complex conjugate transpose. We claim that R_j R_k^* = ν δ_{jk} where δ_{jk} is the Kronecker delta, that is, δ_{jk} = 1 when j = k and zero otherwise. To see this assume that j ≠ k. Then the first equality in (3.10) implies that

R_j R_k^* = Σ_{ℓ=0}^{ν−1} λ_j^ℓ \overline{λ_k}^ℓ = Σ_{ℓ=0}^{ν−1} (λ_j \overline{λ_k})^ℓ = Σ_{ℓ=0}^{ν−1} (λ_1^{j−k})^ℓ = (1 − (λ_1^{j−k})^ν)/(1 − λ_1^{j−k}) = 0.   (3.13)


The last term equals zero, because λ_1 is a ν-th root of unity, that is, (λ_1^{j−k})^ν = (λ_1^ν)^{j−k} = 1. Equation (3.13) shows that R_j R_k^* = 0 when j ≠ k. Using the fact that λ_j is on the unit circle, we have

R_j R_j^* = Σ_{ℓ=0}^{ν−1} |λ_j^ℓ|² = Σ_{ℓ=0}^{ν−1} 1 = ν   (for j = 0, 1, 2, · · · , ν − 1).

Hence R_j R_j^* = ν. In other words, R_j R_k^* = ν δ_{jk}. In particular, all the rows {R_j}_0^{ν−1} of F_ν are nonzero and orthogonal. So all the rows of F_ν are linearly independent. Therefore F_ν is invertible.

We claim that F_ν^{−1} = (1/ν) F_ν^*. To verify this notice that by construction F_ν is the matrix with rows R_0, R_1, · · · , R_{ν−1} stacked on top of each other, that is,

F_ν = [ R_0 ; R_1 ; ⋮ ; R_{ν−1} ].   (3.14)

Using R_j R_k^* = ν δ_{jk}, we obtain

F_ν F_ν^* = [ R_0 ; R_1 ; ⋮ ; R_{ν−1} ] [ R_0^*  R_1^*  · · ·  R_{ν−1}^* ]
          = [ R_0 R_0^*      R_0 R_1^*      · · ·  R_0 R_{ν−1}^*
              R_1 R_0^*      R_1 R_1^*      · · ·  R_1 R_{ν−1}^*
              ⋮              ⋮                     ⋮
              R_{ν−1} R_0^*  R_{ν−1} R_1^*  · · ·  R_{ν−1} R_{ν−1}^* ] = νI.

Dividing by ν shows that F_ν (F_ν^*/ν) = I. Recall that if A and B are matrices on C^ν, then B is the inverse of A if and only if AB = I. Therefore (1/ν) F_ν^* is the inverse of F_ν.

We have previously noted that F_ν = F_ν^{tr}; see the form of F_ν in equation (1.10) or (3.15) below. Hence F_ν^{−1} = (1/ν) F_ν^* = (1/ν) \overline{F_ν}. This completes the proof.
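Theorem 3.3.2 can be confirmed numerically. In NumPy (an illustration, not from the text), the DFT matrix F_ν can be formed as the fft of the identity, after which F_ν F_ν^* = νI and F_ν = F_ν^{tr} are direct checks:

```python
import numpy as np

nu = 8
F = np.fft.fft(np.eye(nu))            # the nu x nu DFT matrix F_nu

# F_nu * F_nu^* = nu * I, so F_nu^{-1} = (1/nu) F_nu^*
print(np.allclose(F @ F.conj().T, nu * np.eye(nu)))   # True
print(np.allclose(F, F.T))                            # F_nu is symmetric: True
```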

Recall that the discrete Fourier transform F_ν on C^ν is given by

F_ν = [ 1   1           1             · · ·  1
        1   λ_1         λ_1²          · · ·  λ_1^{ν−1}
        1   λ_1²        λ_1⁴          · · ·  λ_1^{2(ν−1)}
        ⋮   ⋮           ⋮                    ⋮
        1   λ_1^{ν−1}   λ_1^{2(ν−1)}  · · ·  λ_1^{(ν−1)²} ]   (where λ_1 = e^{−2πi/ν});   (3.15)

see (1.10). Let {φ_k}_0^{ν−1} be the standard orthonormal basis for C^ν, that is,

φ_k = [0 0 · · · 0 1 0 · · · 0]^{tr}   (for k = 0, 1, 2, · · · , ν − 1)

is the unit vector in C^ν obtained by placing a one in the k + 1 position and zeros elsewhere. By using the form of F_ν in (3.15), we see that

F_ν φ_k = [1 λ_1^k λ_1^{2k} λ_1^{3k} · · · λ_1^{(ν−1)k}]^{tr}   (for k = 0, 1, 2, · · · , ν − 1).


The first column F_ν φ_0 of F_ν consists of all ones. The second column F_ν φ_1 of F_ν consists of the points {λ_1^k}_0^{ν−1}, which rotate clockwise almost once around the unit circle starting at 1 + 0i and ending at \overline{λ_1} = λ_1^{ν−1}. (We almost complete the circle when ν is "large". If ν is not large, then the points are sparse.) The third column F_ν φ_2 of F_ν consists of the points {(λ_1²)^k}_0^{ν−1}, which rotate clockwise almost twice around the unit circle starting at 1 + 0i and ending with \overline{λ_1²}. If j ≪ ν/2, then the j + 1 column F_ν φ_j of F_ν consists of the points {(λ_1^j)^k}_0^{ν−1}, which rotate clockwise almost j times around the unit circle starting at 1 + 0i and ending with \overline{λ_1^j}. In other words, plotting fft(φ_j) in Matlab with ν large will produce a graph which will appear to move around the unit circle almost j times clockwise for j ≪ ν/2. On the other hand, the last column F_ν φ_{ν−1} consists of the points {\overline{λ_1}^k}_0^{ν−1}, which rotate counterclockwise almost once around the unit circle starting at 1 + 0i and ending with λ_1. The second to last column F_ν φ_{ν−2} consists of the points {(\overline{λ_1}²)^k}_0^{ν−1}, which rotate counterclockwise almost twice around the unit circle starting at 1 + 0i and ending with λ_1², etc. Finally, it is noted that if ν is even, then (λ_1^{ν/2})^k = (−1)^k. In this case, the points {(−1)^k}_0^{ν−1} in the ν/2 + 1 column F_ν φ_{ν/2} of F_ν will simply oscillate between ±1.

For a demonstration in Matlab, let T = fft(eye(1024)) be the discrete Fourier transform matrix F_{1024}. Notice that 2^10 = 1024. Then use the comet command to plot the following:

• comet(real(T(:,5)),imag(T(:,5))) % The fifth column of T should rotate clockwise around the unit circle almost four times. Notice that Matlab places a line connecting the adjacent points.

• comet(real(T(:,1022)),imag(T(:,1022))) % The third column from the last of T should rotate around the unit circle almost three times counterclockwise.

• comet(real(T(:,513)),imag(T(:,513))) % The 2^9 + 1 = 513 column of T should oscillate between ±1.

• comet(real(T(:,510)),imag(T(:,510))) % This appears to move counterclockwise. Is this an illusion? Use the unwrap and angle commands to help explain what Matlab is doing. Remember that Matlab places a line between the adjacent points.

• comet(real(T(:,515)),imag(T(:,515))) % This appears to move clockwise. Is this an illusion? Use the unwrap and angle commands to help explain what Matlab is doing.

• comet(real(T(:,j)),imag(T(:,j))) for j = 2^5, 2^5 + 1, 2^6, 2^6 + 1, 2^7, 2^7 + 1, 2^8, 2^8 + 1, 2^9 + 2^8, 2^9 + 2^8 + 1.

3.3.1 Exercise

Problem 1. Let F_ν be the discrete Fourier transform matrix F_ν in (1.9). Then using the fact that F_ν^{−1} = (1/ν) F_ν^*, show that F_ν^* F_ν = νI.

Problem 2. Let {ξ_k}_1^ν be the columns of the discrete Fourier transform matrix F_ν in (1.9), that is,

F_ν = [ξ_1 ξ_2 ξ_3 · · · ξ_ν]   (3.16)

where ξ_k is a vector in C^ν for all integers k in [1, ν]. Then show that (ξ_j, ξ_k) = ν δ_{jk} where δ_{jk} is the Kronecker delta and (f, g) = g^*f is the inner product on C^ν. In particular, this shows that {ξ_k}_1^ν is an orthogonal basis for C^ν. Hint: use F_ν^* F_ν = νI and the form of F_ν in (3.16).

Problem 3. A Vandermonde matrix is an m × ν matrix of the form

V = [ 1   z_1   z_1²   · · ·  z_1^{ν−1}
      1   z_2   z_2²   · · ·  z_2^{ν−1}
      1   z_3   z_3²   · · ·  z_3^{ν−1}
      ⋮   ⋮     ⋮             ⋮
      1   z_m   z_m²   · · ·  z_m^{ν−1} ] : C^ν → C^m   (3.17)

where {z_k}_1^m is a set of complex numbers. Let p be any polynomial of the form

p(z) = α_0 + α_1 z + α_2 z² + · · · + α_{ν−1} z^{ν−1}.

Then notice that

[p(z_1) p(z_2) p(z_3) · · · p(z_m)]^{tr} = V [α_0 α_1 α_2 · · · α_{ν−1}]^{tr}.   (3.18)

So evaluating the polynomial p(z) at the points {z_k}_1^m can be done by matrix multiplication. Recall also that a matrix A is one to one if the only solution to the homogeneous system Af = 0 is f = 0.

Now assume that {z_k}_1^m are distinct points in the complex plane and the Vandermonde matrix V has at least as many rows as columns, that is, m ≥ ν. Then show that the Vandermonde matrix V is one to one. Hint: use the fact that a nonzero polynomial of degree ν − 1 has at most ν − 1 zeros.

Recall that a square matrix is one to one if and only if it is invertible. So if {z_k}_1^m are distinct and V is a square Vandermonde matrix, then V is invertible. By taking z_j = λ_{j−1} for j = 1, 2, · · · , ν with m = ν, we see that the discrete Fourier transform matrix F_ν in (3.6) is a special case of a Vandermonde matrix. In particular, this also shows that the discrete Fourier transform matrix F_ν in (3.6) is invertible.
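Vandermonde evaluation as a matrix product is easy to demonstrate (a NumPy illustration, not from the text); `np.vander` with `increasing=True` builds exactly the matrix in (3.17):

```python
import numpy as np

z = np.array([0.5, -1.0, 2.0, 3.0])          # m = 4 distinct points
alpha = np.array([1.0, 2.0, 0.0, -1.0])      # p(z) = 1 + 2z - z^3, nu = 4 coefficients

V = np.vander(z, N=len(alpha), increasing=True)   # rows [1, z_k, z_k^2, z_k^3]
vals = V @ alpha                                  # p evaluated at each z_k

print(vals)                                       # matches np.polyval at each point
print(np.linalg.matrix_rank(V))                   # 4: distinct z_k make V one to one
```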

3.4 Sinusoid estimation

In this section, we will use the discrete Fourier transform to solve some sinusoid estimationproblems.


[Figure 3.4 here: plot of the monthly sunspot number (y axis, 0 to 400) versus date (x axis, 1700 to 2050), titled "Sunspot data".]

Figure 3.4: Sunspot data from NASA: Solar Physics Marshall Space Flight Center.

3.4.1 Sunspots

Let us use the discrete Fourier transform to show that the sunspot activity is cyclical, with a period of around eleven years. The Wolfer number or sunspot number is determined by the number of sunspot groups and the number of sunspots; see the NASA: Solar Physics Marshall Space Flight Center webpage for more information. We downloaded the sunspot data from this website. This data contains the monthly sunspot numbers from 1749 through the end of 2015, for a total of 3204 sunspot numbers. We collected the sunspot data as a vector g in C^{3204}. The sunspot numbers are presented in Figure 3.4.

To find the period for this data g, we computed [b_0 b_1 · · · b_{−2} b_{−1}]^{tr} = F_{3204}^{−1} g, the inverse discrete Fourier transform of our data g. Then we plotted the power spectrum with |b_k|² on the y axis and 2πk/267 on the x axis for k = 0, 1, 2, · · · , 40. This power spectrum is presented in the top graph of Figure 3.5. Because the sunspot data is real valued, we only plotted the power spectrum for positive k. Since the total length of the data is τ = 267 years, the x axis represents the frequency in radians per year. Notice that there is an immediate problem. The power spectrum has a large peak b_0 at frequency zero, which makes the rest of the graph difficult to read. However, we do not care about the zero frequency and its corresponding amplitude b_0. The b_0 term in the inverse discrete Fourier transform is simply the arithmetic mean of the data g; see Remark 3.3.3. We are looking for a period and not the mean of the data.

To address this problem, we subtracted the mean from the data, that is, we set f equal to g minus the mean of g. (In Matlab f = g − mean(g).) Then we computed

[a_0 a_1 a_2 · · · a_{−2} a_{−1}]^{tr} = F_{3204}^{−1} f ,

the inverse discrete Fourier transform for f, and plotted the corresponding power spectrum in the bottom graph of Figure 3.5. As before, we plotted the power spectrum with |a_k|² on the y axis and 2πk/267 on the x axis for k = 0, 1, 2, · · · , 40. By consulting this power spectrum, we concluded that the peak occurs at a frequency in the vicinity of 0.57 radians per year.


Therefore the corresponding period is 2π/0.57 = 11.0231. In other words, the period for the sunspot activity is around eleven years.
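The NASA data set itself is not reproduced here, but the same procedure can be rehearsed on synthetic data (an illustration with made-up numbers, not the actual sunspot record): generate 3204 monthly samples of an eleven-year cycle over τ = 267 years, subtract the mean, and locate the peak of the power spectrum.

```python
import numpy as np

tau, nu = 267.0, 3204                       # 267 years of monthly samples
t = tau * np.arange(nu) / nu
g = 60.0 + 45.0 * np.sin(2 * np.pi * t / 11.0)   # synthetic cycle with an 11-year period

f = g - g.mean()                            # remove the large zero-frequency peak
a = np.fft.ifft(f)

kpeak = 1 + np.argmax(np.abs(a[1:41])**2)   # peak of the power spectrum over k = 1, ..., 40
period = tau / kpeak                        # = 2*pi / (2*pi*kpeak/tau), in years
print(kpeak, period)                        # peak near k = 24, period near 11 years
```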

[Figure 3.5 here: two panels plotting power versus frequency (−0.2 to 1 radians per year). The top panel, "The power spectrum", is dominated by a large peak (near 8000) at frequency zero; the bottom panel, "The power spectrum with mean zero data", peaks (near 600) at roughly 0.57 radians per year.]

Figure 3.5: The power spectrum for the sunspot data.

3.4.2 A least squares optimization problem

The results in the remaining part of this section are not used in the rest of the notes (except in Section 7.2 of Chapter 7) and can be skipped by the uninterested reader. Before tackling a sinusoid estimation problem, let us present some results concerning a classical least squares optimization problem. Let T be a matrix mapping C^η into C^ν. For a specified y in C^ν, the matrix equation y = Tx may or may not have a solution. If y = Tx does not have a solution, then one would search for a vector x_opt to make the distance between y and Tx as small as possible. In other words, one would look for a vector x_opt ∈ C^η such that

‖y − T x_opt‖ ≤ ‖y − Tx‖   (for all x ∈ C^η).   (4.1)

This leads to the following classical least squares optimization problem:

d = min{ ‖y − Tx‖ : x ∈ C^η }.   (4.2)

Here d = ‖y − T x_opt‖ is the smallest possible distance between y and Tx. In particular, d = 0 if and only if the matrix equation y = Tx has a solution, or equivalently, y is in the range of T.


Recall that a matrix T is one to one if the null space of T equals zero, or equivalently, all the columns of T are linearly independent. If T mapping C^η into C^ν is one to one, then we must have η ≤ ν. Furthermore, if η = ν, then T is one to one if and only if T is invertible.

Recall that M^* denotes the complex conjugate transpose of a matrix M. A matrix T is one to one if and only if T^*T is a strictly positive matrix. In particular, T is one to one if and only if T^*T is invertible. If y = Tx has a solution, then T^*y = T^*Tx. If in addition T is one to one, then T^*T is invertible, and thus the unique solution is x = (T^*T)^{−1}T^*y. This sets the stage for the following classical result.

THEOREM 3.4.1 Let T be a one to one matrix mapping C^η into C^ν and y a vector in C^ν. Then there exists a unique solution x_opt ∈ C^η to the optimization problem in (4.2). This unique solution is given by

x_opt = (T^*T)^{−1}T^*y   and   d = ‖y − T(T^*T)^{−1}T^*y‖.   (4.3)

In particular, d = 0 if and only if the matrix equation y = Tx has a solution, and in this case, x_opt = (T^*T)^{−1}T^*y is the only solution to y = Tx.

It is emphasized that (T*T)^{−1}T* is called the Moore-Penrose pseudo inverse of T. The Matlab command is pinv. Furthermore, Matlab will provide a solution x_opt = pinv(T)y to the optimization problem in (4.2), even when T is not one to one. (This x_opt is orthogonal to the null space of T.) In our applications, T maps C^η into C^ν where η ≤ ν. However, if η = ν and T is one to one, then T is invertible and (T*T)^{−1}T* = T^{−1} is simply the inverse of T. For further results on the optimization problem in (4.2), a proof of Theorem 3.4.1, and its applications in linear control systems, see [3] and the related references therein.
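As a quick numerical illustration of Theorem 3.4.1, the normal-equations formula and the pseudoinverse can be compared directly. The sketch below uses Python/NumPy rather than the Matlab of these notes; the matrix T and the vector y are arbitrary random stand-ins, not data from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tall matrix T mapping C^3 into C^8; a random T of this shape is one to
# one with probability one.  T and y are arbitrary stand-ins.
T = rng.standard_normal((8, 3)) + 1j * rng.standard_normal((8, 3))
y = rng.standard_normal(8) + 1j * rng.standard_normal(8)

# Theorem 3.4.1: x_opt = (T*T)^{-1} T* y, where T* is the conjugate transpose.
x_opt = np.linalg.solve(T.conj().T @ T, T.conj().T @ y)

# The same vector via the Moore-Penrose pseudoinverse, Matlab's pinv(T)*y.
x_pinv = np.linalg.pinv(T) @ y

d = np.linalg.norm(y - T @ x_opt)   # the minimal distance d of (4.2)
```

Any other x can only increase the distance ‖y − Tx‖, which is what the optimality statement (4.1) asserts.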

For an application of our optimization problem in (4.2) involving sinusoids, let U be the set of all functions of the form

u(t) = Σ_{k=1}^{η} u_k e^{−iω_k t}   (4.4)

where {ω_k}_1^η is a fixed finite set of distinct frequencies and {u_k}_1^η is the set of corresponding amplitudes. It is noted that U is a finite dimensional linear space of dimension η. Consider the interval [0, τ] with ν evenly spaced points t_j = jτ/ν for j = 0, 1, 2, · · · , ν − 1. Moreover, let us assume that {ω_k}_1^η are contained in the Nyquist sampling range

|ω_k| < νω_0/2 = πν/τ   (for k = 1, 2, · · · , η and η ≤ ν).   (4.5)

Recall that the fundamental frequency ω_0 = 2π/τ. Finally, let us assume that η ≤ ν. Consider the matrix T mapping C^η into C^ν defined by

T = [ 1                  1                  · · ·  1
      e^{−iω_1 t_1}      e^{−iω_2 t_1}      · · ·  e^{−iω_η t_1}
      e^{−iω_1 t_2}      e^{−iω_2 t_2}      · · ·  e^{−iω_η t_2}
      ...                ...                       ...
      e^{−iω_1 t_{ν−1}}  e^{−iω_2 t_{ν−1}}  · · ·  e^{−iω_η t_{ν−1}} ] : C^η → C^ν.   (4.6)


Using t_1 = τ/ν with |ω_k| < πν/τ, we have |ω_k t_1| < π. Hence λ_k = e^{−iω_k t_1} for k = 1, 2, · · · , η are η distinct complex numbers on the unit circle. Notice that t_j = j × t_1 for j = 0, 1, 2, · · · , ν − 1. Since λ_k^j = (e^{−iω_k t_1})^j = e^{−iω_k t_j}, we see that T is a Vandermonde matrix, whose kth column is formed by {λ_k^j}_{j=0}^{ν−1}. (The transpose of a Vandermonde matrix is also called a Vandermonde matrix; see (3.17).) Because η ≤ ν, it follows that T is one to one, or equivalently, the null space of T equals zero. In particular, T*T is invertible.
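The Vandermonde structure of T is easy to check numerically. The following Python/NumPy sketch (the sample count and the frequencies are arbitrary illustrative choices, not values from the notes) builds T entrywise as in (4.6) and confirms both the Vandermonde form of its columns and that T has full column rank.

```python
import numpy as np

nu, tau = 16, 2 * np.pi
t = np.arange(nu) * tau / nu              # t_j = j*tau/nu
omega = np.array([-2.0, 0.5, 3.5])        # distinct, inside |w| < pi*nu/tau = 8
eta = len(omega)

# T[j, k] = exp(-i * omega_k * t_j), the matrix of (4.6).
T = np.exp(-1j * np.outer(t, omega))

# Column k is the geometric sequence {lambda_k^j} with lambda_k = e^{-i w_k t_1},
# i.e. a Vandermonde column built from eta distinct points on the unit circle.
lam = np.exp(-1j * omega * t[1])
vandermonde = lam[None, :] ** np.arange(nu)[:, None]
```

Since the λ_k are distinct, the rank of T equals η, which is the "one to one" claim in the text.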

Let u(t) be a function in U. Then u(t) is a function of the form

u(t) = Σ_{k=1}^{η} u_k e^{−iω_k t}   if and only if   [u(t_0) u(t_1) · · · u(t_{ν−1})]^tr = T [u_1 u_2 · · · u_η]^tr.   (4.7)

If u(t) = Σ_1^η u_k e^{−iω_k t}, then clearly, the matrix equation in (4.7) holds.

On the other hand, assume that the second equation in (4.7) holds. Recall that T is one to one, and thus, T*T is invertible. Solving for [u_1 u_2 · · · u_η]^tr in (4.7), we obtain

u(t) = Σ_{k=1}^{η} u_k e^{−iω_k t}   where   [u_1 u_2 · · · u_η]^tr = (T*T)^{−1}T* [u(t_0) u(t_1) · · · u(t_{ν−1})]^tr;   (4.8)

see also Theorem 3.4.1. Therefore (4.7) holds. In particular, u ∈ U is uniquely determined by sampling {u(t_j)}_0^{ν−1}. It is noted that if η = ν, then T is invertible and we can replace (T*T)^{−1}T* by T^{−1}.

It is emphasized that equations (4.7) and (4.8) show that if u(t) is a function in U, then one can uniquely recover u(t) by sampling {u(t_j)}_0^{ν−1}.

Now let g(t) be any continuous function over the interval [0, τ]. Then we are looking for a function u(t) = Σ_1^η u_k e^{−iω_k t} in U which comes as close as possible to g(t) over the interval [0, τ]. One could use the Gram matrix with Hilbert space techniques to solve this problem; see [3, 15] and the references therein. However, this method is beyond the scope of these notes. So let us settle on finding a function u(t) in U that comes as close as possible to g(t) at the points {t_j}_0^{ν−1}. This approach is also more closely related to discrete Fourier transform techniques. By sampling g(t) at {g(t_j)}_0^{ν−1}, observe that

u(t_j) = g(t_j) for all j = 0, 1, 2, · · · , ν − 1   if and only if   [g(t_0) g(t_1) · · · g(t_{ν−1})]^tr = T [u_1 u_2 · · · u_η]^tr.   (4.9)

It is emphasized that we could have u(t_j) = g(t_j) for all j and yet u(t) ≠ g(t) over [0, τ]. If u(t_j) = g(t_j) for all j, then this simply means that we found a function u(t) in U which interpolates g(t) at the points {t_j}_0^{ν−1}, and there is no guarantee that u and g will agree when t ≠ t_j. (However, if g ∈ U, then u(t) = g(t) for all t.) Moreover, whenever η < ν, which is our case of interest, the matrix T is not invertible. In this setting, the matrix equation y = Tx may not have a solution for a specified y. In fact, in most instances y = Tx will not have a solution. So given g(t), in most cases there is simply no solution to the matrix equation in (4.9). In other words, given an arbitrary continuous function g(t), there may be no u(t) in U such that u(t_j) = g(t_j) for all j = 0, 1, 2, · · · , ν − 1.

To address this issue, let us search for a function u(t) in U which will force u(t_j) to be as close as possible to g(t_j) for all j. To be precise, we would like to solve the following optimization problem:

d² = min{ Σ_{j=0}^{ν−1} |g(t_j) − u(t_j)|² : u ∈ U }
   = min{ Σ_{j=0}^{ν−1} |g(t_j) − u(t_j)|² : u(t) = Σ_{k=1}^{η} u_k e^{−iω_k t} }.   (4.10)

Here d is simply a measure of the distance between {g(t_j)}_0^{ν−1} and the subspace U. For example, d = 0 if and only if there exists a u(t) in U such that u(t_j) = g(t_j) for all j = 0, 1, 2, · · · , ν − 1. To solve this optimization problem, let

y = [g(t_0) g(t_1) · · · g(t_{ν−1})]^tr   and   x = [u_1 u_2 · · · u_η]^tr.   (4.11)

(Recall that tr denotes the transpose.) Then (4.7) implies that

Σ_{j=0}^{ν−1} |g(t_j) − u(t_j)|² = ‖y − Tx‖².   (4.12)

(Here ‖ξ‖ = √(Σ_j |ξ_j|²) is the standard norm in C^ν.) Hence the optimization problem in (4.10) is equivalent to the following classical least squares optimization problem:

d = min{‖y − Tx‖ : x ∈ C^η}.   (4.13)

Theorem 3.4.1 shows that the solution to this minimization problem is unique and given by

x_opt = (T*T)^{−1}T*y   and   d = ‖y − T(T*T)^{−1}T*y‖.   (4.14)

In particular, d = 0 if and only if y = Tx has a solution, or equivalently, y is in the range of T. Therefore solving for x_opt yields the amplitudes {u_k}_1^η, and thus, the function u_opt(t) in U that we have been looking for. In other words,

u_opt(t) = Σ_{k=1}^{η} u_k e^{−iω_k t}   where   [u_1 u_2 · · · u_η]^tr = (T*T)^{−1}T* [g(t_0) g(t_1) · · · g(t_{ν−1})]^tr   (4.15)

is our approximation for g(t), which uniquely solves the least squares optimization problem in equation (4.10).
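To illustrate (4.15) in the simplest case, one can sample a function that already lies in U and recover its amplitudes exactly. A Python/NumPy sketch (the frequencies and amplitudes below are arbitrary test values, not data from the notes):

```python
import numpy as np

nu, tau = 64, 2 * np.pi
t = np.arange(nu) * tau / nu
omega = np.array([1.0, 2.5])              # distinct frequencies; 2.5 is off the DFT grid
T = np.exp(-1j * np.outer(t, omega))      # the matrix T of (4.6)

u_true = np.array([2.0, 1.0 - 1.0j])      # amplitudes of a function u in U
g = T @ u_true                            # samples g(t_j) = u(t_j)

# (4.15): amplitudes via the least squares solution (T*T)^{-1} T* g.
u_opt, *_ = np.linalg.lstsq(T, g, rcond=None)
```

Since g lies in the range of T here, the minimal distance d is zero and the amplitudes come back exactly; for a general g, the same call returns the best approximation from U at the sample points.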

Let us summarize our previous analysis in the following result, which is essentially another version of the Nyquist-Shannon sampling Theorem 3.1.6.


THEOREM 3.4.2 Consider the interval [0, τ] with ν evenly spaced points t_j = jτ/ν for j = 0, 1, 2, · · · , ν − 1. Let {ω_k}_1^η be η distinct frequencies satisfying |ω_k| < πν/τ for k = 1, 2, · · · , η where η ≤ ν.

• The matrix T mapping C^η into C^ν defined in (4.6) is one to one. In particular, T*T is invertible.

• Let u(t) and v(t) be two functions in U of the form

u(t) = Σ_{k=1}^{η} u_k e^{−iω_k t}   and   v(t) = Σ_{k=1}^{η} v_k e^{−iω_k t}.   (4.16)

Then the following statements are equivalent.

(i) The functions u(t) = v(t) for all t.

(ii) The samples u(t_j) = v(t_j) for all j = 0, 1, 2, · · · , ν − 1.

(iii) The coefficients u_k = v_k for all k = 1, 2, · · · , η.

In particular, one can uniquely recover any function u(t) in U by sampling {u(t_j)}_0^{ν−1}, that is,

u(t) = Σ_{k=1}^{η} u_k e^{−iω_k t}   where   [u_1 u_2 · · · u_η]^tr = (T*T)^{−1}T* [u(t_0) u(t_1) · · · u(t_{ν−1})]^tr.   (4.17)

(If η = ν, then (T*T)^{−1}T* is simply T^{−1}.)

• Let g(t) be any continuous function. Then there is a unique function u_opt(t) in U which solves the optimization problem

min{ Σ_{j=0}^{ν−1} |g(t_j) − u(t_j)|² : u(t) = Σ_{k=1}^{η} u_k e^{−iω_k t} }.   (4.18)

Moreover, this optimal function u_opt(t) is given by (4.15). In particular, if η = ν, then T is invertible and g(t_j) = u_opt(t_j) for j = 0, 1, 2, · · · , ν − 1.

• Assume that the frequencies {ω_k}_1^η are given by {kω_0}_{k∈K} where ω_0 = 2π/τ and K is a set of η integers, that is, ω_k = m_k ω_0 where m_k is an integer in K for k = 1, 2, · · · , η. Let

[a_0 a_1 a_2 · · · a_{−2} a_{−1}]^tr = F_ν^{−1} [g(t_0) g(t_1) g(t_2) · · · g(t_{ν−1})]^tr

be the vector in C^ν formed by applying the inverse discrete Fourier transform F_ν^{−1} on C^ν to the samples {g(t_j)}_0^{ν−1}. Then the optimal solution u_opt(t) in U to the optimization problem in (4.18) is given by

u_opt(t) = Σ_{k∈K} a_k e^{−ikω_0 t}.   (4.19)


Sketch of proof. We have already proven the first and third parts and equation (4.17) concerning the uniqueness of u(t) in U. Let us verify the second part, that is, that Parts (i) to (iii) are equivalent. Clearly Part (i) implies Part (ii). Notice that

T [u_1 u_2 · · · u_η]^tr = [u(t_0) u(t_1) · · · u(t_{ν−1})]^tr
T [v_1 v_2 · · · v_η]^tr = [v(t_0) v(t_1) · · · v(t_{ν−1})]^tr.

So u(t_j) = v(t_j) for j = 0, 1, 2, · · · , ν − 1 if and only if

0 = T( [u_1 u_2 · · · u_η]^tr − [v_1 v_2 · · · v_η]^tr ).

Since T is one to one, this implies that u(t_j) = v(t_j) for all j if and only if u_k = v_k for k = 1, 2, · · · , η. Hence Parts (ii) and (iii) are equivalent. Part (iii) clearly implies Part (i). Therefore Parts (i) to (iii) are equivalent. This completes the proof of the second part.

To verify that the last part holds, notice that when {ω_k} can be expressed as integers times the fundamental frequency ω_0, then T is simply the matrix formed by the columns in the discrete Fourier transform matrix F_ν corresponding to the integers in K. Recall that the columns of F_ν are orthogonal with norm √ν. Hence T*T = νI, and thus, (T*T)^{−1}T*y = (1/ν)T*y, which is just the rows corresponding to K of (1/ν)F_ν^* y = F_ν^{−1} y. Here y is the vector in C^ν determined by {g(t_j)}_0^{ν−1}; see (4.11). This completes the proof.

In many applications, the frequencies are of the form {0, ±ω_k}_1^n where {ω_k}_1^n is a set of distinct strictly positive frequencies and η = 2n + 1. In this setting, U is the set consisting of all functions of the form

u_0 + Σ_{k=1}^{n} (u_k e^{−iω_k t} + u_{−k} e^{iω_k t})

where {u_k}_{−n}^{n} are complex numbers. Due to Euler's identity e^{iθ} = cos(θ) + i sin(θ), it follows that U can also be described by the set of all functions of the form

u(t) = u_0 + Σ_{k=1}^{n} α_k cos(ω_k t) + β_k sin(ω_k t)   (4.20)

where {u_0, α_k, β_k}_1^n are complex numbers. It is noted that U is a linear space of dimension 2n + 1. The zero frequency may not be present. In this case, the frequencies are {±ω_k}_1^n, u_0 = 0, η = 2n, and U is a linear space of dimension 2n. As before, we assume that all the frequencies are in the Nyquist sampling range, that is,

|ω_k| < πν/τ   (for k = 1, 2, · · · , n and 2n < ν).   (4.21)

(In fact, n ≪ ν/2 is even better.) In this setting, we replace the matrix T in (4.6) with the matrix T_c given by

T_c = [ 1  1                 0                 · · ·  1                 0
        1  cos(ω_1 t_1)      sin(ω_1 t_1)      · · ·  cos(ω_n t_1)      sin(ω_n t_1)
        1  cos(ω_1 t_2)      sin(ω_1 t_2)      · · ·  cos(ω_n t_2)      sin(ω_n t_2)
        ...
        1  cos(ω_1 t_{ν−1})  sin(ω_1 t_{ν−1})  · · ·  cos(ω_n t_{ν−1})  sin(ω_n t_{ν−1}) ] : C^{2n+1} → C^ν.   (4.22)


If there is no zero frequency, then the first column consisting of ones is not present, T_c maps C^{2n} into C^ν, and u_0 = 0. Due to Euler's identity, one can show that the columns of T_c and the columns of T formed by {0, ±ω_k}_1^n are linear combinations of each other. In other words, T_c and T have the same rank. Therefore T_c is also one to one, and T_c^* T_c is invertible. In particular, if u(t) is a function in U, then u(t) is a function of the form

u(t) = u_0 + Σ_{k=1}^{n} α_k cos(ω_k t) + β_k sin(ω_k t)   if and only if
[u(t_0) u(t_1) · · · u(t_{ν−1})]^tr = T_c [u_0 α_1 β_1 · · · α_n β_n]^tr.   (4.23)

If g(t) is a continuous function over the interval [0, τ], then the optimization problem corresponding to (4.10) is given by

min{ Σ_{j=0}^{ν−1} |g(t_j) − u(t_j)|² : u ∈ U }
   = min{ Σ_{j=0}^{ν−1} |g(t_j) − u(t_j)|² : u(t) = u_0 + Σ_{k=1}^{n} α_k cos(ω_k t) + β_k sin(ω_k t) }.   (4.24)

By applying Theorem 3.4.1, we see that the unique solution u_opt(t) in U to this optimization problem is determined by

u_opt(t) = u_0 + Σ_{k=1}^{n} α_k cos(ω_k t) + β_k sin(ω_k t)
[u_0 α_1 β_1 · · · α_n β_n]^tr = (T_c^* T_c)^{−1} T_c^* [g(t_0) g(t_1) · · · g(t_{ν−1})]^tr.   (4.25)
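The real cos/sin version works the same way as the complex exponential one. Below is a Python/NumPy sketch of (4.22) and (4.25); the frequencies and amplitudes are arbitrarily chosen test values, and the function fitted already lies in U so the recovery is exact.

```python
import numpy as np

nu, tau = 128, 2 * np.pi
t = np.arange(nu) * tau / nu
omega = [1.3, 4.0]                       # arbitrary positive frequencies

# T_c as in (4.22): a column of ones for the zero frequency, then cos/sin pairs.
Tc = np.column_stack([np.ones(nu)] +
                     [f(w * t) for w in omega for f in (np.cos, np.sin)])

# A real function already in U, so the least squares fit should be exact.
g = 0.5 + 3 * np.cos(1.3 * t) - 2 * np.sin(4.0 * t)
coef, *_ = np.linalg.lstsq(Tc, g, rcond=None)   # [u0, a1, b1, a2, b2] of (4.25)
```

For a g outside U the same call returns the coefficients of the best approximation at the sample points.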

3.4.3 A sinusoid estimation problem

Consider a function g(t) of the form

g(t) = Σ_{k=1}^{m} b_k e^{−iϖ_k t}   (4.26)

where {ϖ_k}_1^m is a distinct set of unknown frequencies with unknown amplitudes {b_k}_1^m. Then a sinusoid estimation problem is to estimate the frequencies {ϖ_k}_1^m and amplitudes {b_k}_1^m by sampling the function g(t) at ν points over the interval [0, τ]. In general this problem is beyond the scope of these notes. In this section, we will use the discrete Fourier transform to present a heuristic approach for estimating the frequencies and amplitudes. For a statistical approach to sinusoid estimation, see Capon's method in [11] and the corresponding references therein.

To expand further on the discrete Fourier transform approach to sinusoid estimation, let us divide the interval [0, τ] into ν equally spaced points {t_j}_0^{ν−1} starting with t_0 = 0, that is, let t_j = jτ/ν for j = 0, 1, 2, · · · , ν − 1. Here we assume that we have chosen ν large enough to guarantee that all the frequency components of g are contained in the Nyquist sampling range, that is, |ϖ_k| < νω_0/2 = νπ/τ for all k = 1, 2, · · · , m where ω_0 = 2π/τ. In fact, it is preferable if |ϖ_k| ≪ νπ/τ for all k. If g is a τ periodic trigonometric polynomial, then one can perfectly reconstruct g(t) by sampling {g(t_j)}_0^{ν−1}; see Section 3.1.1. However, in general g(t) is not a τ periodic function, let alone a τ periodic trigonometric polynomial. Therefore even when all the frequencies {ϖ_k}_1^m are in the Nyquist sampling range, one may not be able to determine the frequencies {ϖ_k}_1^m for g by sampling {g(t_j)}_0^{ν−1}.

In light of this, how does one go about estimating the frequencies {ϖ_k}_1^m for g? First apply the inverse discrete Fourier transform to {g(t_j)}_0^{ν−1} to compute {a_k}, that is,

[a_0 a_1 · · · a_{−2} a_{−1}]^tr = ν^{−1} F_ν^* [g(t_0) g(t_1) · · · g(t_{ν−2}) g(t_{ν−1})]^tr   (4.27)

where F_ν on C^ν is the discrete Fourier transform matrix in (1.9). Next plot the power spectrum and select a set of integers K which correspond to the largest values of |a_k|². Then hopefully, this yields a good approximation for the actual frequencies {ϖ_k}_1^m of g, that is, ϖ_j ≈ kω_0 for some j ∈ [1, m] and k ∈ K. Finally,

p_K(t) = Σ_{k∈K} a_k e^{−ikω_0 t}   (where ω_0 = 2π/τ)   (4.28)

is hopefully a reasonable approximation for g(t), that is, ‖g − p_K‖ is small. However, p_K(t) is a τ periodic trigonometric polynomial and g is not necessarily a τ periodic function. Therefore ‖g − p_K‖ could be small and one still may not be able to accurately recover the frequencies {ϖ_k}_1^m; see the example in Section 3.4.4 below where ‖g − p_30‖ ≈ 0. In fact, we have seen this phenomenon before in Section 2.3.1 in Chapter 2, where we viewed cos(bt) as a function in L²(0, 2π). If b is not an integer, then the Fourier series expansion for cos(bt) has infinitely many frequencies, while cos(bt) has only two frequencies. So how does one obtain b from these infinitely many frequencies?
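The recipe just described — invert the DFT, pick the integers with dominant power |a_k|², and form p_K — can be sketched in Python/NumPy as follows. The test function g and the 10% peak threshold are illustrative choices, not prescriptions from the notes.

```python
import numpy as np

nu, tau = 1024, 2 * np.pi
t = np.arange(nu) * tau / nu
g = 4 * np.cos(3.5 * t) + 2 * np.cos(6 * t)   # one off-grid, one on-grid frequency

# (4.27): the coefficients a_k via the inverse DFT of the samples.
a = np.fft.ifft(g)
power = np.abs(a) ** 2

# Select the positive integers k whose power dominates the spectrum
# (the 10% cutoff is an arbitrary illustrative threshold).
K = [k for k in range(1, nu // 2) if power[k] > 0.1 * power.max()]

# p_K(t) as in (4.28) with omega_0 = 2*pi/tau = 1; since g is real,
# a_{-k} = conj(a_k), so each +/-k pair contributes 2*Re(a_k e^{-ikt}).
omega0 = 2 * np.pi / tau
p_K = np.real(a[0]) + sum(2 * np.real(a[k] * np.exp(-1j * k * omega0 * t)) for k in K)
```

The on-grid frequency 6 shows up as a clean peak, while the off-grid 3.5 leaks into the neighboring integers 3 and 4 — exactly the ambiguity discussed above.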

To help alleviate this problem, one can use the power spectrum to find a finite set of frequencies {ω_k}_1^η which correspond to the peaks in the spectrum, but are not necessarily located at {kω_0}_{−∞}^{∞} where ω_0 is the fundamental frequency. Moreover, we assume that all of our chosen frequencies {ω_k}_1^η are distinct and in the Nyquist sampling range and η ≤ ν. (In fact, η ≪ ν would even be better.) Recall that U is the set of all functions of the form

u(t) = Σ_{k=1}^{η} u_k e^{−iω_k t}

where {u_k}_1^η are complex numbers; see Section 3.4.2. So we are looking for a function u_opt(t) in U which comes as close as possible to g(t) at the sampling points {t_j}_0^{ν−1}. In other words, we are looking for the unique solution u_opt(t) in U to the optimization problem in (4.18).


The unique solution to this optimization problem is given by

u_opt(t) = Σ_{k=1}^{η} u_k e^{−iω_k t}   where   [u_1 u_2 · · · u_η]^tr = (T*T)^{−1}T* [g(t_0) g(t_1) · · · g(t_{ν−1})]^tr.   (4.29)

Here T is the one to one matrix defined by

T = [ 1                  1                  · · ·  1
      e^{−iω_1 t_1}      e^{−iω_2 t_1}      · · ·  e^{−iω_η t_1}
      e^{−iω_1 t_2}      e^{−iω_2 t_2}      · · ·  e^{−iω_η t_2}
      ...                ...                       ...
      e^{−iω_1 t_{ν−1}}  e^{−iω_2 t_{ν−1}}  · · ·  e^{−iω_η t_{ν−1}} ] : C^η → C^ν.   (4.30)

In this case, u_opt(t) is our approximation of g(t), and hopefully ϖ_k ≈ ω_k and b_k ≈ u_k for all k, and m = η.

In many applications, g(t) is a real valued function of the form

g(t) = g_0 + Σ_{k=1}^{m} c_k cos(ϖ_k t) + d_k sin(ϖ_k t)   (4.31)

where {ϖ_k}_1^m are m strictly positive unknown frequencies contained in the Nyquist sampling range |ϖ_k| < νπ/τ for all k = 1, 2, · · · , m. Moreover, the amplitudes {g_0, c_k, d_k}_1^m are also unknown. In this setting, one would use the power spectrum to select positive frequencies {0, ω_k}_1^n corresponding to the peaks in the spectrum, but not necessarily located at {kω_0}_1^∞. We assume that the selected frequencies are also in the Nyquist sampling range and 2n < ν. The zero frequency may or may not be there. In this case, the optimal solution u_opt(t) in U is the unique solution to the optimization problem in (4.24). The unique solution to this optimization problem is given by

u_opt(t) = u_0 + Σ_{k=1}^{n} α_k cos(ω_k t) + β_k sin(ω_k t)
[u_0 α_1 β_1 · · · α_n β_n]^tr = (T_c^* T_c)^{−1} T_c^* [g(t_0) g(t_1) · · · g(t_{ν−1})]^tr.   (4.32)

Here T_c is the one to one matrix defined by

T_c = [ 1  1                 0                 · · ·  1                 0
        1  cos(ω_1 t_1)      sin(ω_1 t_1)      · · ·  cos(ω_n t_1)      sin(ω_n t_1)
        1  cos(ω_1 t_2)      sin(ω_1 t_2)      · · ·  cos(ω_n t_2)      sin(ω_n t_2)
        ...
        1  cos(ω_1 t_{ν−1})  sin(ω_1 t_{ν−1})  · · ·  cos(ω_n t_{ν−1})  sin(ω_n t_{ν−1}) ] : C^{2n+1} → C^ν.   (4.33)

(If the zero frequency is not present, then the first column of ones is not there, and T_c maps C^{2n} into C^ν.) Then u_opt(t) is our approximation for g(t), and hopefully, ϖ_k ≈ ω_k, g_0 ≈ u_0, α_k ≈ c_k and β_k ≈ d_k for all k, and n = m.


3.4.4 An example of sinusoid estimation

Figure 3.6: The power spectrum |a_k|² (top) and the magnitude |a_k| (bottom), plotted against the frequency index k

For an example of how discrete Fourier transforms play a role in sinusoid estimation, consider the function

g(t) = 4 cos(3.5t) + 2 cos(6t) = 2e^{−3.5it} + 2e^{3.5it} + e^{−6it} + e^{6it}.

Now assume that τ = 2π, or we are viewing g as a function in L²(0, 2π). Clearly, g(t) is not a 2π periodic function. The component 4 cos(3.5t) of g is not 2π periodic. However, 2 cos(6t) is a 2π periodic function. So let us apply fast Fourier transform methods of size ν = 2^14 to g(t), over the interval [0, 2π], to estimate the frequencies {±3.5, ±6} for g. Clearly, all the frequencies are in the Nyquist sampling range. Moreover, in this case, t_j = 2πj/ν for j = 0, 1, 2, · · · , ν − 1. Let {a_k} be the inverse fast Fourier transform of {g(t_j)}. Since ν is large, g(t) ≈ p(t) = Σ_{|k|<2^13} a_k e^{−ikt} in the L²(0, 2π) norm. We plotted the power spectrum |a_k|² in the top graph of Figure 3.6. Because the function is real, the power spectrum is symmetric around the y axis, and thus, we only plotted the power spectrum for positive k. Since g(t) ≈ p(t) in the L²(0, 2π) norm, this is essentially the power spectrum for g, viewed as a function in L²(0, 2π). (The power spectrum changes depending upon which L²(0, τ) space is used.) The peaks in the spectrum occur around 3, 4 and 6. However, the result is not conclusive. Do we have frequencies at 3, 4, 5 and 6, or do we have infinitely many frequencies? Do the peaks at 3 and 4 indicate that there is a single frequency somewhere in the interval (3, 4), with the two peaks at 3 and 4 simply sidebands of this single frequency? The truth is that it is difficult to determine exactly where the frequencies are by just examining the power spectrum. However, we can say that g has frequency components somewhere in the region around [3, 4] and 6. Finally, recall that in this case we know that we have frequencies at ±3.5 and ±6.

Let us also note that the amplitudes corresponding to the peaks in the power spectrum are different from the amplitudes in g. To see this we plotted the magnitude |a_k| of a_k in


the bottom graph of Figure 3.6. At frequency 3.5 we should have a magnitude of 2, and the magnitude in Matlab is less than 1.5 at both 3 and 4. At frequency 6 we should have a magnitude of 1, and the magnitude in Matlab is larger than 1. In fact, in Matlab

a_3 = a(4) = 0.0002 − 1.1753i   and   a_4 = a(5) = 0.0002 + 1.3581i
a_5 = a(6) = 0.0002 + 0.4993i   and   a_6 = a(7) = 1.0002 + 0.3217i.   (4.34)

(Remember in Matlab a(k+1) = a_k corresponds to the Fourier coefficient of e^{−ikt}.) In other words, Matlab thinks that the amplitudes {a_k}_3^6 all contain nonzero imaginary parts, when all the amplitudes in g(t) are real. One may surmise that the power in the frequencies 3 and 4 is close to the power 2² = 4 corresponding to the 2e^{−3.5it} component in g. However, 3.225 = |a_3|² + |a_4|² is not that close to 4. Therefore estimating the amplitudes using discrete Fourier transform methods can be misleading. Finally, it is noted that a_0 = a(1) = 2.4414 × 10^{−4}. So the amplitude corresponding to the zero frequency is close to zero, and thus, we did not include a_0 in our analysis.
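The values in (4.34) can be reproduced outside Matlab, since numpy.fft.ifft uses the same 1/ν normalization as Matlab's ifft. A Python/NumPy re-creation of the computation above:

```python
import numpy as np

nu = 2 ** 14
t = np.arange(nu) * 2 * np.pi / nu
g = 4 * np.cos(3.5 * t) + 2 * np.cos(6 * t)

a = np.fft.ifft(g)   # same 1/nu scaling convention as Matlab's ifft

# a[3], a[4], a[5], a[6] should match the leakage values reported in (4.34);
# note a[k] here corresponds to Matlab's a(k+1).
leak = a[3:7]
```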

Figure 3.7: The graph of g(t) with p_K(t) (top), p_30(t) (middle) and u_opt(t) (bottom)

For the moment assume that one concludes that K = {±3, ±4, ±6} is the set of frequencies which generated g(t), even though g(t) is actually determined by four frequencies. The 2π periodic trigonometric polynomial determined by K is given by

p_K(t) = Σ_{k∈K} a_k e^{−ikt} = Σ_{k=3,4,6} (2ℜ(a_k) cos(kt) + 2ℑ(a_k) sin(kt)).


(The coefficients a_3, a_4 and a_6 are given in (4.34).) To see how well p_K(t) approximates g(t), we plotted p_K(t) and g(t) over the interval [0, 2π] in the top graph of Figure 3.7. As expected, p_K(t) deviates from g(t), but tries to follow g(t). Next, we considered the 2π periodic trigonometric polynomial

p_30(t) = Σ_{k=−30}^{30} a_k e^{−ikt} = a_0 + Σ_{k=1}^{30} (2ℜ(a_k) cos(kt) + 2ℑ(a_k) sin(kt))

which contains the first sixty one frequencies {0, ±k}_1^{30} corresponding to {a_k}_{−30}^{30}. The plot of p_30(t) and g(t) over the interval [0, 2π] is presented in the middle graph of Figure 3.7. Notice that p_30(t) essentially follows g(t) except at the endpoints. Because g(0) ≠ g(2π), the Gibbs phenomenon takes over at the endpoints 0 and 2π, and thus, one would not expect p_30(t) to approximate g(t) at 0 and 2π. However, we know that g(t) has frequencies at {±3.5, ±6} and p_30(t) contains sixty one frequencies, and even though p_30(t) ≈ g(t), the trigonometric polynomial p_30(t) does not help us discover the frequencies {±3.5, ±6} for g(t). Finally, since g(t) is not 2π periodic, p_30(t) will follow g(t) over the interval (0, 2π), and then they will go off on their separate ways when t > 2π.

For the moment, assume that by studying the power spectrum in Figure 3.6, one concludes that g contains four frequencies at {±3.6, ±5.8}. Since the actual frequencies are at {±3.5, ±6}, this is not a bad guess. Set ω_1 = 3.6 and ω_2 = 5.8. Let U be the four dimensional linear space consisting of all functions of the form

u(t) = α_1 cos(ω_1 t) + β_1 sin(ω_1 t) + α_2 cos(ω_2 t) + β_2 sin(ω_2 t)

where {α_k, β_k}_1^2 are complex numbers. In this case, there is no zero frequency. Let u_opt(t) be the unique solution in U which solves the optimization problem in (4.24). By consulting (4.25) with u_0 = 0, we see that the optimal solution is given by

u_opt(t) = 3.7160 cos(ω_1 t) + 1.1106 sin(ω_1 t) + 1.5834 cos(ω_2 t) − 1.0251 sin(ω_2 t)
[3.7160 1.1106 1.5834 −1.0251]^tr = (T_c^* T_c)^{−1} T_c^* [g(t_0) g(t_1) · · · g(t_{ν−1})]^tr.   (4.35)

Moreover, in this setting, the matrix T_c is given by

T_c = [ cos(ω_1 t_0)      sin(ω_1 t_0)      cos(ω_2 t_0)      sin(ω_2 t_0)
        cos(ω_1 t_1)      sin(ω_1 t_1)      cos(ω_2 t_1)      sin(ω_2 t_1)
        cos(ω_1 t_2)      sin(ω_1 t_2)      cos(ω_2 t_2)      sin(ω_2 t_2)
        ...
        cos(ω_1 t_{ν−1})  sin(ω_1 t_{ν−1})  cos(ω_2 t_{ν−1})  sin(ω_2 t_{ν−1}) ] : C^4 → C^ν.   (4.36)

Since we did not include the zero frequency in {±3.6, ±5.8}, we eliminated the first column of ones in T_c; see (4.22). The plot of u_opt(t) and g(t) over the interval [0, 2π] is presented in the bottom graph of Figure 3.7. As expected, u_opt(t) follows g(t) fairly well. Finally, it is noted that if one is not satisfied with the choice of ω_1 = 3.6 and ω_2 = 5.8, then one can always keep searching for two other frequencies and repeating the above analysis until ‖y − Tx‖_{C^ν} or ‖g − u_opt‖_{L²} is as small as possible.
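The fit in (4.35) is straightforward to reproduce in Python/NumPy with the same samples and guessed frequencies, mirroring the Matlab call x = pinv(T)*g:

```python
import numpy as np

nu = 2 ** 14
t = np.arange(nu) * 2 * np.pi / nu
g = 4 * np.cos(3.5 * t) + 2 * np.cos(6 * t)

w1, w2 = 3.6, 5.8   # the guessed frequencies from the power spectrum
Tc = np.column_stack([np.cos(w1 * t), np.sin(w1 * t),
                      np.cos(w2 * t), np.sin(w2 * t)])   # (4.36): no column of ones

x = np.linalg.pinv(Tc) @ g   # Matlab: x = pinv(T)*g
u_opt = Tc @ x               # u_opt(t_j), the least squares fit to g(t_j)
```

The coefficient vector x should agree with the four values displayed in (4.35).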


To present a more precise measure of how close p_K(t), p_30(t) and u_opt(t) are to g(t), we computed the L²(0, 2π) norm ‖g‖ along with the errors ‖g − p_K‖, ‖g − p_30‖ and ‖g − u_opt‖ in Matlab:

‖g‖ = √( (1/2π) ∫_0^{2π} |g(t)|² dt ) ≈ √( Σ_k |a_k|² ) = 3.1624

‖g − p_K‖ = √( (1/2π) ∫_0^{2π} |g(t) − p_K(t)|² dt ) ≈ √( Σ_{k∉K} |a_k|² ) = 1.1582

‖g − p_30‖ = √( (1/2π) ∫_0^{2π} |g(t) − p_30(t)|² dt ) ≈ √( Σ_{|k|>30} |a_k|² ) = 0.3275

‖g − u_opt‖ = √( (1/2π) ∫_0^{2π} |g(t) − u_opt(t)|² dt ) ≈ 0.7041.

It is emphasized that all of the estimation errors of the frequencies and amplitudes could have been eliminated if we had chosen the period τ = 4π. Notice that g(t) is a 4π periodic trigonometric polynomial. In this setting, the results in Section 3.1.1 show that we can perfectly reconstruct g(t) by using 2^14 samples {g(t_j)} of g(t). In fact, 2^14 samples is certainly far more than is needed. Moreover, for τ = 4π the power spectrum yields four frequencies at {±3.5, ±6} for g with the correct amplitudes. However, in many problems, one does not know where the frequencies {ϖ_k}_1^m for g(t) live a priori, and one may have little control over the length τ of the data collected.

The Matlab commands used to compute the power spectrum are given by

t = (0:2^14-1)'*2*pi/(2^14);
g = 4*cos(3.5*t) + 2*cos(6*t);
subplot(2,1,1);
a = ifft(g); bar((0:30), abs(a(1:31)).^2); grid
xlabel('frequency k'); title('The power spectrum |a_k|^2')
subplot(2,1,2);
bar((0:30), abs(a(1:31))); grid
xlabel('frequency k'); title('The magnitude |a_k|')
a(4:7) = [0.0002 - 1.1753i   0.0002 + 1.3581i   0.0002 + 0.4993i   1.0002 + 0.3217i]


The Matlab commands used to plot p_K(t), p_30(t), u_opt(t) and g(t) are given by

p=0; for k=3:4; p=p+2*real(a(k+1)*exp(-i*k*t)); end
k=6; p=p+2*real(a(k+1)*exp(-i*k*t)); pk=p;
p=a(1); for k=1:30; p=p+2*real(a(k+1)*exp(-i*k*t)); end; p30=p;
subplot(3,1,1); plot(t,g); hold on; plot(t,pk,'r'); grid
xlabel('t'); title('The plot of g(t) and p_K(t)')
subplot(3,1,2); plot(t,g); hold on; plot(t,p30,'r'); grid
xlabel('t'); title('The plot of g(t) and p_30(t)'); w1=3.6; w2=5.8;
T = [cos(w1*t) sin(w1*t) cos(w2*t) sin(w2*t)];
x = pinv(T)*g = [3.7160 1.1106 1.5834 -1.0251]^tr
uo = T*x; subplot(3,1,3);
plot(t,g); hold on; plot(t,uo,'r'); grid
xlabel('t'); title('The plot of g(t) and u_opt(t)')
[norm(a), norm(ifft(g-pk)), norm(ifft(g-p30)), norm(ifft(g-uo))]
   = [3.1624 1.1582 0.3275 0.7041]

3.4.5 Exercise

Problem 1. Consider the real function

g(t) = A sin(ω_1 t) sin(ω_2 t)

over the interval [0, 2] where A > 0. Let {a_k}_{−∞}^{∞} be the Fourier coefficients for g. The power spectrum for g is given in Figure 3.8, that is, the graph of |a_k|² vs k with k on the x axis. Find the function g(t).

Problem 2. Matlab has the sunspot numbers in the file sunspot.dat. Load this file andanalyze this data to find the period of the sunspot activity.

Problem 3. In Matlab let t = (0 : 2^11 − 1)' * 5/2^11; and

g = 3 + 6*cos(200*t) − 8*sin(900*t) + 10*randn(size(t)).   (4.37)

The command randn(ν, 1) generates a vector of length ν in C^ν whose components are independent Gaussian random variables with mean zero and variance one. The function g in (4.37) is the sum of two functions, g(t) = f(t) + n(t), where f(t) = 3 + 6*cos(200*t) − 8*sin(900*t) is a sinusoid signal and n(t) is additive Gaussian noise. The idea is to try to find the sinusoid f(t) from the data g in (4.37). Now plot f in Matlab. Then use the inverse fast Fourier transform to plot the power spectrum for g and find the function f, that is, try to estimate the angular frequencies 0, 200 and 900 for f along with the absolute values of its amplitudes, 3, 6 and 8. Finally, it is noted that determining the signs of the amplitudes 3, 6 and −8 is not required in many applications.


Figure 3.8: The power spectrum |a_k|² of g

Problem 4. In Matlab set

t = (0 : 4*2^13 − 1) * 4/(4*2^13); x = cos(2*pi*300*t);

and listen to soundsc(x). You should hear a pure sinusoid with frequency 300 hertz over 4 seconds. Consider the function

y = cos(2*pi*300*t) + 4*randn(size(t)).

Notice that y is the original signal cos(2π × 300t) corrupted by additive Gaussian white noise. Listen to soundsc(y). Can you hear the original signal cos(600πt)? Now use the ifft to recover the signal cos(600πt) from y. Set a = ifft(y) and use the power spectrum to find an amplitude a_k which corresponds to the original signal cos(600πt). Then using τ = 4, consider the function

f = 2ℜ(a_k) cos(2πkt/τ) + 2ℑ(a_k) sin(2πkt/τ),

which is your estimate of cos(600πt). Now listen to soundsc(f). Does it sound like the original signal soundsc(cos(600πt))? Finally, state which k you used and compute 2πk/τ.

Problem 5. Gaussian white noise has a power spectrum which is constant over all frequencies. To see this, use the ifft to plot the power spectrum for g = randn(1, 2^12) in Matlab. So in the previous problem we added noise to the original signal cos(600πt) over the entire frequency spectrum.

Problem 6. One can use the DSP systems tool box in Simulink in Matlab to record soundand plot its spectrum; see Figure 3.9. The microphone is from the sources. The time scope,


signal to workspace and spectrum scope are in sinks. The gain is in the commonly used blocks section in the general Simulink toolbox. We set the gain to 1000 due to our particular microphone in the computer. Your gain might have to be different. In the spectrum box under scope properties we set spectrum units to Watts/Hertz, we set the buffer size and fft length to 4096 although this is not necessary, Bartlett window and number of spectral averages to 200. Under axis properties, we set the minimum frequency to 50 hertz and the maximum to 2000 hertz, the minimum Y-limit to zero and maximum Y-limit to 20 (your numbers can be different) and the Y-axis label to Watts/Hertz. In display properties, we clicked show grid, persistence, frame number, open scope at the start of simulation. In the signal to workspace box, we saved the format as array. Finally, we ran the Simulink simulation for 40 seconds.

To see how the Simulink spectrum scope works in Matlab we set:

t = (0 : 60 ∗ 2^13 − 1) ∗ 60/(60 ∗ 2^13); y = cos(2 ∗ π ∗ 200 ∗ t) + cos(2 ∗ π ∗ 600 ∗ t); soundsc(y)

% This sound is the sum of two sinusoids at 200 and 600 hertz for 60 seconds.

% Click on start simulation in the Simulink file in under 20 seconds to capture all the sound.

% Hopefully, your microphone picks up the sound and the spectrum scope opens up.

% If all goes well you should see two peaks at 200 and 600 hertz.

% The time data has been sent to Matlab in yout.


t = linspace(0,40,length(yout)); plot(t,yout(:,2)); grid

% This plots the data the microphone is recording.

% One can analyze the spectrum recorded from the microphone

% by taking the ifft of yout(:,2) over various intervals of time.

Now whistle or have someone whistle into the microphone. A good whistle should approximate a pure sinusoid. Find the frequency of the whistle. Moreover, go into Matlab and use the yout data to find the spectrum of the whistle. Since the whistle will probably not last 40 seconds, you have to find the spectrum of yout over a certain interval of time. Finally, just for fun, you can try to find the spectrum of a guitar string by playing a string into the microphone. A guitar string is a sum of infinitely many sinusoids; see Chapter one in [40].


[Simulink blocks: From Audio Device → Gain (1000) → Time Scope, Spectrum Scope (B-FFT), Signal To Workspace (yout).]

Figure 3.9: A Simulink model for recording and spectrum.

Problem 7. Consider the differential equation mẍ + Kx = 0 corresponding to a mass spring system where m is the mass, K is the spring constant and x(t) is the position of the mass at time t. The solution with ẋ(0) = 0 is given by

x(t) = x(0) cos(√(K/m) t).

Assume that the mass m = 2 and the initial condition x(0) > 0. The power spectrum of x(t) over the interval [0, 10π] is presented in Figure 3.10 where x(t) = Σ_{k=−∞}^{∞} ak e^{−iω0kt}. Find the initial condition x(0) and the spring constant K.


Figure 3.10: The power spectrum |ak|² of x vs k.

Problem 8. Consider a function f in L2(0, π) of the form

f(t) = a0 + Σ_{k=1}^{∞} αk cos(ωk t)   (for 0 ≤ t ≤ π)

where 0 ≤ a0 and 0 ≤ αk for all k. The frequencies {ωk}, a0 and {αk} are all unknown. In Matlab we set t = (0 : 2^17 − 1)π/2^17 and a = ifft(f). Then we plotted |ak|² = |a(k + 1)|² vs k = 0, 1, 2, · · · , 10 in Figure 3.11. Find the function f(t) and its total power

(1/π) ∫₀^π |f(t)|² dt.


Figure 3.11: The power spectrum |ak|² of f vs k.

3.5 The shift and the discrete Fourier transform

In this section we will show that the columns of the discrete Fourier transform matrix Fν in (3.6) are the eigenvectors for a certain shift matrix. To begin, a matrix U on Cν is unitary if U∗U = I where ∗ denotes the complex conjugate transpose. Recall that if A and B are square matrices, then A is the inverse of B if and only if AB = I. So U is a unitary matrix if and only if U is invertible and U∗ is the inverse of U, that is, U∗ = U^{−1}. Finally, it is noted that if U is unitary, then (Uf, Ug) = (f, g) for all vectors f and g in Cν. Here (f, g) is the standard inner product on Cν, that is, (f, g) = g∗f. To see this simply observe that

(Uf, Ug) = (Ug)∗Uf = g∗U∗Uf = g∗f = (f, g) .

Thus (Uf, Ug) = (f, g). In particular, by choosing f = g, we arrive at ‖Uf‖ = ‖f‖ for all f in Cν.

Let A be a matrix on Cν. Recall that λ is an eigenvalue with eigenvector ψ for A if λ is a complex number and ψ is a nonzero vector in Cν satisfying Aψ = λψ. The characteristic polynomial for A is the polynomial of degree ν determined by det[λI − A]. (The determinant of a square matrix is denoted by det.) Moreover, the roots of the characteristic polynomial are the eigenvalues for A. To be precise, the eigenvalues {λk}_{k=0}^{ν−1} for A are given by

det[λI − A] = ∏_{k=0}^{ν−1} (λ − λk).   (5.1)

Here we started the index on the eigenvalues {λk}_{k=0}^{ν−1} at zero to match up with some of our later results.

If U is a unitary matrix on Cν, then all the eigenvalues of U are on the unit circle. In other words, the magnitude of all the eigenvalues of U is one. To see this assume that λ is


an eigenvalue for U with eigenvector ψ, that is, Uψ = λψ. Since ‖Uf‖ = ‖f‖ for any f in Cν, we have |λ|‖ψ‖ = ‖λψ‖ = ‖Uψ‖ = ‖ψ‖. Hence |λ|‖ψ‖ = ‖ψ‖. Because an eigenvector by definition is nonzero, |λ| = 1. Therefore all the eigenvalues of a unitary matrix are on the unit circle.

As before, assume that U is a unitary matrix on Cν. Then the eigenvectors for U corresponding to distinct eigenvalues are orthogonal. To be precise, assume λ1 and λ2 are two distinct eigenvalues for U with corresponding eigenvectors ψ1 and ψ2, that is, Uψ1 = λ1ψ1 and Uψ2 = λ2ψ2 where λ1 ≠ λ2. Then ψ1 is orthogonal to ψ2. To see this observe that the identity (Uf, Ug) = (f, g) yields

λ1 λ̄2 (ψ1, ψ2) = (λ1ψ1, λ2ψ2) = (Uψ1, Uψ2) = (ψ1, ψ2).

Hence (1 − λ1λ̄2)(ψ1, ψ2) = 0. Recall that the eigenvalues for any unitary matrix are on the unit circle. So we must have |λ2|² = 1. Multiplying (1 − λ1λ̄2)(ψ1, ψ2) = 0 by λ2, we obtain

(λ2 − λ1)(ψ1, ψ2) = 0.

Since λ1 ≠ λ2, we must have (ψ1, ψ2) = 0. In other words, ψ1 is orthogonal to ψ2. Therefore the eigenvectors for a unitary matrix corresponding to distinct eigenvalues are orthogonal.

Let S be the shift matrix on Cν defined by

S =
[ 0 1 0 0 · · · 0 0
  0 0 1 0 · · · 0 0
  0 0 0 1 · · · 0 0
  ...
  0 0 0 0 · · · 1 0
  0 0 0 0 · · · 0 1
  1 0 0 0 · · · 0 0 ].   (5.2)

Notice that a one appears immediately above the main diagonal and in the southwest corner, while all the other entries are zero. If y = [y1, y2, · · · , yν]^{tr} is a vector in Cν, then

S [y1 y2 y3 · · · yν−2 yν−1 yν]^{tr} = [y2 y3 y4 · · · yν−1 yν y1]^{tr}.   (5.3)

The matrix S shifts the entries {yk}_{k=1}^{ν} of a vector y up one position and places the first entry y1 in the last position. A simple calculation shows that S∗S = I. Therefore S is a unitary matrix. In particular, S is invertible, S^{−1} = S∗ and all the eigenvalues for S are on the unit circle. Finally, it is noted that S^ν = I.
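As a quick numerical sanity check of these properties (a sketch only, with ν = 6 chosen arbitrarily and Python/NumPy standing in for Matlab), one can build S and verify (5.3), S∗S = I and S^ν = I:

```python
import numpy as np

nu = 6   # an arbitrary size for the illustration

# Shift matrix S of (5.2): ones just above the main diagonal and a one
# in the southwest corner, zeros elsewhere.
S = np.zeros((nu, nu))
S[np.arange(nu - 1), np.arange(1, nu)] = 1.0
S[nu - 1, 0] = 1.0

y = np.arange(1.0, nu + 1.0)              # the vector [1, 2, ..., nu]
shifted = S @ y                           # entries shifted up one position, (5.3)

unitary_ok = bool(np.allclose(S.T @ S, np.eye(nu)))   # S*S = I (S is real here)
power_ok = bool(np.allclose(np.linalg.matrix_power(S, nu), np.eye(nu)))  # S^nu = I
```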

Now let λk = e^{−2πik/ν} for k = 0, 1, 2, · · · , ν − 1 be the ν roots of unity. Let {ψk}_{k=0}^{ν−1} be the column vectors for the ν × ν discrete Fourier transform matrix Fν in (3.6), that is,

Fν = [ψ0 ψ1 ψ2 · · · ψν−1]   and   ψk = [1 λk λk² · · · λk^{ν−1}]^{tr}   (k = 0, 1, · · · , ν − 1).   (5.4)

We claim that λk is an eigenvalue for S with eigenvector ψk, that is, Sψk = λkψk. To see this simply observe that (5.3) and λk^ν = 1 imply that

Sψk = S [1 λk λk² · · · λk^{ν−3} λk^{ν−2} λk^{ν−1}]^{tr}
    = [λk λk² λk³ · · · λk^{ν−2} λk^{ν−1} 1]^{tr}
    = λk [1 λk λk² · · · λk^{ν−3} λk^{ν−2} λk^{ν−1}]^{tr} = λkψk.


Thus Sψk = λkψk. In other words, {λk}_{k=0}^{ν−1} are the eigenvalues and {ψk}_{k=0}^{ν−1} are the corresponding eigenvectors for S. Because S is unitary and its eigenvalues {λk}_{k=0}^{ν−1} are distinct, it follows that {ψk}_{k=0}^{ν−1} is an orthogonal set of vectors. Since {ψk}_{k=0}^{ν−1} contains ν nonzero vectors, {ψk}_{k=0}^{ν−1} is an orthogonal basis for Cν. Finally, by consulting (5.1), we see that the characteristic polynomial for S is given by

det[λI − S] = ∏_{k=0}^{ν−1} (λ − λk) = ∏_{k=0}^{ν−1} (λ − e^{−2πik/ν}).   (5.5)

To complete this section, let D be the diagonal matrix on Cν given by

D =
[ λ0 0  0  · · · 0
  0  λ1 0  · · · 0
  0  0  λ2 · · · 0
  ...
  0  0  0  · · · λν−1 ].   (5.6)

Notice that D is the diagonal matrix with the roots of unity {λk}_{k=0}^{ν−1} on the main diagonal and zeros elsewhere. We claim that SFν = FνD. Since Fν is invertible, Fν^{−1}SFν = D, or equivalently, ν^{−1}Fν∗SFν = D. The last equality follows from Fν^{−1} = Fν∗/ν. To show that SFν = FνD, observe that Sψk = λkψk yields

SFν = S [ψ0 ψ1 ψ2 · · · ψν−1] = [λ0ψ0 λ1ψ1 λ2ψ2 · · · λν−1ψν−1] = FνD.

Therefore SFν = FνD.
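The identity SFν = FνD, together with Fν∗Fν = νI, can be checked numerically. The following NumPy sketch (ν = 8 is an arbitrary choice, and Python stands in for Matlab) builds Fν from the roots of unity and verifies both facts:

```python
import numpy as np

nu = 8   # an arbitrary size for the illustration
m = np.arange(nu)
lam = np.exp(-2j * np.pi * m / nu)        # the nu roots of unity lambda_k

# DFT matrix F_nu: entry (m, k) is lambda_k^m, so column k is psi_k of (5.4)
F = lam[np.newaxis, :] ** m[:, np.newaxis]

# Shift matrix S as in (5.2)
S = np.zeros((nu, nu))
S[np.arange(nu - 1), np.arange(1, nu)] = 1.0
S[nu - 1, 0] = 1.0

D = np.diag(lam)                          # diagonal matrix of eigenvalues (5.6)

shift_identity = bool(np.allclose(S @ F, F @ D))                  # S F = F D
columns_orthogonal = bool(np.allclose(F.conj().T @ F, nu * np.eye(nu)))  # F*F = nu I
```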

3.5.1 Exercise

Problem 1. Let S be the matrix given by

S =
[ 0 1 0 0 0
  0 0 1 0 0
  0 0 0 1 0
  0 0 0 0 1
  1 0 0 0 0 ].   (5.7)

Find all the eigenvalues, a corresponding set of eigenvectors and the characteristic polynomial for S.

Problem 2. Let U be the matrix given by

U =
[ 0 0 0 0 1
  1 0 0 0 0
  0 1 0 0 0
  0 0 1 0 0
  0 0 0 1 0 ].

Find all the eigenvalues, a corresponding set of eigenvectors and the characteristic polynomial for U. Hint: U = S∗.


3.6 Convolution and the discrete Fourier transform

Let us review some elementary facts concerning the discrete Fourier transform. Let {λj}_{j=0}^{ν−1} be the ν roots of unity defined by λj = e^{−i2πj/ν} for j = 0, 1, 2, · · · , ν − 1. It is noted that λ0 = 1 and λ1^j = λj for j = 0, 1, 2, · · · , ν − 1. Recall that the discrete Fourier transform Fν is the Vandermonde matrix on Cν given by

Fν =
[ 1 1    1     · · · 1        1
  1 λ1   λ1²   · · · λ1^{ν−2}   λ1^{ν−1}
  1 λ2   λ2²   · · · λ2^{ν−2}   λ2^{ν−1}
  1 λ3   λ3²   · · · λ3^{ν−2}   λ3^{ν−1}
  ...
  1 λν−1 λν−1² · · · λν−1^{ν−2} λν−1^{ν−1} ].   (6.1)

It is emphasized that Fν∗ = F̄ν, the complex conjugate of each entry of Fν. Moreover, Fν∗Fν = νI and Fν^{−1} = (1/ν)Fν∗. In particular, ν^{−1/2}Fν is a unitary operator on Cν. If x is a vector in Cν, then x(λ) denotes the polynomial formed by x, that is,

x(λ) = Σ_{k=0}^{ν−1} xk λ^k   where   x = [x0 x1 x2 · · · xν−1]^{tr}.

The discrete Fourier transform Fν x evaluates the polynomial x(λ) at the ν points {λj}_{j=0}^{ν−1} on the unit circle. To be precise,

[x(1) x(λ1) x(λ2) · · · x(λν−1)]^{tr} = Fν x.   (6.2)
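In practice (6.2) is exactly what a fast Fourier transform routine computes: with the convention λj = e^{−2πij/ν}, np.fft.fft(x) returns the values x(λj). A small NumPy sketch (the coefficients are arbitrary choices, and Python stands in for Matlab):

```python
import numpy as np

x = np.array([3.0, 1.0, -2.0, 5.0])       # coefficients x_0, ..., x_3 of x(lambda)
nu = len(x)
lam = np.exp(-2j * np.pi * np.arange(nu) / nu)   # the points lambda_j

# Evaluate the polynomial x(lambda) directly at each root of unity
# (np.polyval expects the highest-degree coefficient first, hence x[::-1])
direct = np.array([np.polyval(x[::-1], z) for z in lam])

# np.fft.fft(x)[j] = sum_k x_k lambda_j^k, i.e. the vector F_nu x of (6.2)
via_fft = np.fft.fft(x)

agree = bool(np.allclose(direct, via_fft))
```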

It is noted that if ζ is a ν-th root of unity (ζ^ν = 1), then ζ^{ν−k} = ζ̄^k ζ^ν = ζ̄^k where k is a positive integer. In particular, ζ^{ν−1} = ζ̄, ζ^{ν−2} = ζ̄², ζ^{ν−3} = ζ̄³, et cetera. Using this the discrete Fourier transform can also be written as

Fν =
[ 1 1    1     · · · 1     1
  1 λ1   λ1²   · · · λ̄1²   λ̄1
  1 λ2   λ2²   · · · λ̄2²   λ̄2
  1 λ3   λ3²   · · · λ̄3²   λ̄3
  ...
  1 λν−1 λν−1² · · · λ̄ν−1² λ̄ν−1 ].   (6.3)


The powers of λj^k and λ̄j^m meet in the middle and will be off by one when ν is even. Using this it follows that x(λj) has two different interpretations, that is,

[x(1) x(λ1) x(λ2) · · · x(λν−1)]^{tr} = Fν [x0 x1 x2 · · · xν−1]^{tr}

x(λj) = Σ_{k=0}^{ν−1} xk λj^k   (6.4)

x(λj) = x0 + x1λj + x2λj² + x3λj³ + · · · + xν−3 λ̄j³ + xν−2 λ̄j² + xν−1 λ̄j.

The transpose of a vector is denoted by the superscript tr. It is emphasized that the transpose does not take the complex conjugate. The second expression in (6.4) shows that x(λj) can be viewed as a polynomial of degree at most ν − 1 evaluated at the ν points {λj}_{j=0}^{ν−1} on the unit circle. The last expression shows that x(λj) can be viewed as a trigonometric polynomial evaluated at the same ν points {λj}_{j=0}^{ν−1} on the unit circle. (The powers of λj and λ̄j meet in the middle and one of the powers can be one degree larger when ν is even.)

Let S be the shift operator on Cν defined by

S =
[ 0 1 0 0 · · · 0 0
  0 0 1 0 · · · 0 0
  0 0 0 1 · · · 0 0
  ...
  0 0 0 0 · · · 1 0
  0 0 0 0 · · · 0 1
  1 0 0 0 · · · 0 0 ]   (6.5)

where ones appear immediately above the main diagonal and in the lower left hand corner (Sν,1 = 1), with zeros elsewhere. It is emphasized that S is a unitary operator on Cν and {λj}_{j=0}^{ν−1} are the eigenvalues for S. Moreover, the corresponding eigenvectors {ψj}_{j=0}^{ν−1} are given by the columns of Fν, that is, Sψj = λjψj where

Fν = [ψ0 ψ1 ψ2 · · · ψν−2 ψν−1]   and   ψj = [1 λj λj² · · · λj^{ν−1}]^{tr}   (6.6)

for j = 0, 1, 2, · · · , ν − 1. In particular, SFν = FνD where D is the unitary diagonal matrix formed by the eigenvalues {λj}_{j=0}^{ν−1}, that is,

D =
[ 1 0  0  · · · 0
  0 λ1 0  · · · 0
  0 0  λ2 · · · 0
  ...
  0 0  0  · · · λν−1 ].


Recall that Fν∗ = F̄ν. Now observe that SFν∗ = SF̄ν = F̄νD̄ = Fν∗D∗. In other words, SFν∗ = Fν∗D∗. By taking the adjoint we arrive at

FνS∗ = DFν.

As before, let q(λ) = Σ_{j=0}^{ν−1} qj λ^j be a polynomial of degree at most ν − 1 determined by the vector q = [q0 q1 · · · qν−1]^{tr}. Recall that a Toeplitz matrix is a matrix which is constant along each of its diagonals. To be precise, T is a Toeplitz matrix if its entries are of the form Tj,k = aj−k for all indices j and k, that is,

T =
[ a0   a−1  a−2  · · · a−(ν−2) a−(ν−1)
  a1   a0   a−1  · · · a−(ν−3) a−(ν−2)
  a2   a1   a0   · · · a−(ν−4) a−(ν−3)
  ...
  aν−2 aν−3 aν−4 · · · a0      a−1
  aν−1 aν−2 aν−3 · · · a1      a0 ].   (6.7)

It is emphasized that each diagonal of T is constant. Notice that S and S∗ are both Toeplitz matrices. Hence q(S∗) is also a Toeplitz matrix. Let Cq be the Toeplitz matrix defined by

Cq = q(S∗) = Σ_{j=0}^{ν−1} qj S^{∗j}.   (6.8)

We claim that Cq admits a matrix representation of the form:

Cq = q(S∗) =
[ q0   qν−1 qν−2 qν−3 · · · q2   q1
  q1   q0   qν−1 qν−2 · · · q3   q2
  q2   q1   q0   qν−1 · · · q4   q3
  ...
  qν−3 qν−4 qν−5 qν−6 · · · qν−1 qν−2
  qν−2 qν−3 qν−4 qν−5 · · · q0   qν−1
  qν−1 qν−2 qν−3 qν−4 · · · q1   q0 ].   (6.9)

To obtain the previous matrix representation for Cq = Σ_{j=0}^{ν−1} qj S^{∗j}, simply observe that S∗ cyclically shifts the entries of a vector down one position, and hence for any x in Cν we have

q(S∗) [x0 x1 x2 · · · xν−1]^{tr} = q0 [x0 x1 x2 · · · xν−1]^{tr} + q1 [xν−1 x0 x1 · · · xν−2]^{tr}
  + q2 [xν−2 xν−1 x0 · · · xν−3]^{tr} + · · · + qν−1 [x1 x2 x3 · · · x0]^{tr}.

Collecting the coefficient of each entry on the right hand side produces exactly the matrix in (6.9) applied to [x0 x1 x2 · · · xν−1]^{tr}.

Therefore q(S∗) is the Toeplitz matrix given by equation (6.9). Finally, using FνS∗ = DFν, we obtain Fν q(S∗) = q(D)Fν. In other words,

Fν Cq = q(D) Fν =
[ q(1) 0     0     · · · 0
  0    q(λ1) 0     · · · 0
  0    0     q(λ2) · · · 0
  ...
  0    0     0     · · · q(λν−1) ] Fν.   (6.10)

Because Fν q(S∗) = q(D)Fν, the eigenvalues for Cq = q(S∗) are given by {q(λj)}_{j=0}^{ν−1}. Recall that q(λj) = (Fνq)j, the j + 1 component of the vector Fνq. Multiplying Fν q(S∗) = q(D)Fν by Fν∗ on both sides and using Fν∗Fν = νI, we see that q(S∗)Fν∗ = Fν∗ q(D), or equivalently, q(S∗)F̄ν = F̄ν q(D). So the j + 1 column ψ̄j of F̄ν is an eigenvector for q(S∗) corresponding to the eigenvalue q(λj), that is, q(S∗)ψ̄j = q(λj)ψ̄j for j = 0, 1, 2, · · · , ν − 1; see also (6.6).

A matrix of the form Cq in (6.9) is called a circulant matrix. It is noted that a circulant matrix is a Toeplitz matrix where each column is a cyclic rotation of the first column. It is easy to show that C is a circulant matrix if and only if C = q(S∗), where q is a polynomial of degree at most ν − 1. In this case, C = Cq = q(S∗) where q = [q0 q1 · · · qν−1]^{tr} is the first column of Cq. Moreover, {q(λj)}_{j=0}^{ν−1} are the eigenvalues for Cq. If x is a vector in Cν, then Cq x is called the circular convolution of q and x.

Let x and y be two vectors in Cν with components {xj}_{j=0}^{ν−1} and {yj}_{j=0}^{ν−1}, respectively. Then x ⊙ y is the vector in Cν obtained by multiplying their components, that is, the vector with components {xj yj}_{j=0}^{ν−1}. By employing Fν Cq = q(D)Fν with q(λj) = (Fνq)j, we arrive at the following circular convolution


result in discrete Fourier transform theory:

Fν Cq x = q(D) Fν x = (Fν q) ⊙ (Fν x) = [q(1)x(1) q(λ1)x(λ1) q(λ2)x(λ2) · · · q(λν−1)x(λν−1)]^{tr}.   (6.11)
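The convolution identity (6.11), and the eigenvector claim for q(S∗), can both be tested numerically. A NumPy sketch (random q and x, ν = 8, seed and size our own choices, with Python standing in for Matlab):

```python
import numpy as np

nu = 8
rng = np.random.default_rng(1)            # arbitrary seed for reproducibility
q = rng.standard_normal(nu)
x = rng.standard_normal(nu)

# Circulant matrix C_q of (6.9): column j is the first column q rotated down j
Cq = np.column_stack([np.roll(q, j) for j in range(nu)])

qhat = np.fft.fft(q)                      # the values q(lambda_j) = (F_nu q)_j

# Convolution identity (6.11): F_nu C_q x = (F_nu q) entrywise-times (F_nu x)
conv_identity = bool(np.allclose(np.fft.fft(Cq @ x), qhat * np.fft.fft(x)))

# Eigenvector check for j = 3: C_q conj(psi_3) = q(lambda_3) conj(psi_3)
psi3_bar = np.exp(2j * np.pi * 3 * np.arange(nu) / nu)
eig_ok = bool(np.allclose(Cq @ psi3_bar, qhat[3] * psi3_bar))
```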

To see where the circulant matrix comes from assume that

q(λ) = Σ_{k=0}^{ν−1} qk λ^k   and   x(λ) = Σ_{k=0}^{ν−1} xk λ^k

are two polynomials of degree at most ν − 1. Then q(λ)x(λ) is a polynomial of degree at most 2ν − 2. Assume that ζ is one of the ν roots of unity, that is, ζ^ν = 1. In this setting, we can find a polynomial y(λ) = Σ_{k=0}^{ν−1} yk λ^k of degree at most ν − 1 such that y(ζ) = q(ζ)x(ζ). In fact, one such y is the polynomial determined by the circular convolution

[y0 y1 · · · yν−1]^{tr} = Cq [x0 x1 · · · xν−1]^{tr}.   (6.12)

To see this without loss of generality, let us assume that ν = 4. (The general case follows by a similar argument.) Then ζ⁴ = 1, ζ⁵ = ζ and ζ⁶ = ζ². Moreover, q(ζ)x(ζ) has an expansion of the form:

q(ζ)x(ζ) = (Σ_{k=0}^{3} qk ζ^k)(Σ_{j=0}^{3} xj ζ^j)
  = q0x0 + ζ(q1x0 + q0x1) + ζ²(q2x0 + q1x1 + q0x2) + ζ³(q3x0 + q2x1 + q1x2 + q0x3)
    + ζ⁴(q3x1 + q2x2 + q1x3) + ζ⁵(q3x2 + q2x3) + ζ⁶ q3x3
  = q0x0 + ζ(q1x0 + q0x1) + ζ²(q2x0 + q1x1 + q0x2) + ζ³(q3x0 + q2x1 + q1x2 + q0x3)
    + (q3x1 + q2x2 + q1x3) + ζ(q3x2 + q2x3) + ζ² q3x3
  = q0x0 + q3x1 + q2x2 + q1x3 + ζ(q1x0 + q0x1 + q3x2 + q2x3)
    + ζ²(q2x0 + q1x1 + q0x2 + q3x3) + ζ³(q3x0 + q2x1 + q1x2 + q0x3)

  = [1 ζ ζ² ζ³]
    [ q0 q3 q2 q1
      q1 q0 q3 q2
      q2 q1 q0 q3
      q3 q2 q1 q0 ] [x0 x1 x2 x3]^{tr} = Σ_{k=0}^{3} yk ζ^k

where [y0 y1 y2 y3]^{tr} = Cq [x0 x1 x2 x3]^{tr}.
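The ν = 4 computation above is easy to confirm numerically: evaluating the circular convolution y = Cq x at any 4th root of unity ζ reproduces q(ζ)x(ζ). A NumPy sketch with arbitrarily chosen coefficients (Python standing in for Matlab):

```python
import numpy as np

# Arbitrary coefficient choices for the nu = 4 illustration
q = np.array([1.0, 2.0, 0.5, -1.0])
x = np.array([3.0, -2.0, 1.0, 4.0])

# Circular convolution y = C_q x, with C_q the 4 x 4 circulant of (6.12)
Cq = np.column_stack([np.roll(q, j) for j in range(4)])
y = Cq @ x

# At every 4th root of unity zeta, y(zeta) should equal q(zeta) x(zeta)
# (np.polyval expects the highest-degree coefficient first, hence [::-1])
roots = np.exp(-2j * np.pi * np.arange(4) / 4)
products = np.array([np.polyval(q[::-1], z) * np.polyval(x[::-1], z) for z in roots])
values = np.array([np.polyval(y[::-1], z) for z in roots])

match = bool(np.allclose(products, values))
```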


As before, let x = [x0 x1 · · · xν−1]^{tr} be a vector in Cν. We claim that

[x(1)∗ x(λ1)∗ x(λ2)∗ · · · x(λν−1)∗]^{tr} = Fν [x̄0 x̄ν−1 x̄ν−2 · · · x̄2 x̄1]^{tr}   where x(λ) = Σ_{k=0}^{ν−1} xk λ^k.   (6.13)

Using this in (6.11), we obtain

Fν Cq [x̄0 x̄ν−1 x̄ν−2 · · · x̄3 x̄2 x̄1]^{tr} = (Fν q) ⊙ (Fν [x̄0 x̄ν−1 x̄ν−2 · · · x̄3 x̄2 x̄1]^{tr})
  = [q(1)x(1)∗ q(λ1)x(λ1)∗ q(λ2)x(λ2)∗ · · · q(λν−1)x(λν−1)∗]^{tr},   (6.14)

where Cq is the circulant matrix in (6.9). This is a cyclic version of multiplying a vector by a Hankel matrix.

To verify that (6.13) holds, let Ej be the row vector mapping Cν into C with a one in the j + 1 column and zeros elsewhere, for j = 0, 1, · · · , ν − 1. (Notice that we are starting the index at zero to be consistent with our other notation.) Equation (6.13) follows from the fact that λj is a ν-th root of unity (λj^ν = 1), and thus,

x(λj)∗ = (Σ_{k=0}^{ν−1} xk λj^k)∗ = Σ_{k=0}^{ν−1} x̄k λ̄j^k = Σ_{k=0}^{ν−1} x̄k λj^{ν−k}
  = [1 λj λj² · · · λj^{ν−2} λj^{ν−1}] [x̄0 x̄ν−1 x̄ν−2 · · · x̄2 x̄1]^{tr} = Ej Fν [x̄0 x̄ν−1 x̄ν−2 · · · x̄2 x̄1]^{tr},

where Ej Fν = [1 λj λj² · · · λj^{ν−2} λj^{ν−1}] picks out the j + 1 row of the discrete Fourier transform Fν, for j = 0, 1, · · · , ν − 1; see (6.1). This yields (6.13).


To complete this section, it is noted that if we relabel the indices of q by setting q−k = qν−k for k = 1, 2, · · · , ν − 1, we have

Cq =
[ q0  q−1 q−2 q−3 · · · q2  q1
  q1  q0  q−1 q−2 · · · q3  q2
  q2  q1  q0  q−1 · · · q4  q3
  ...
  q−3 q−4 q−5 q−6 · · · q−1 q−2
  q−2 q−3 q−4 q−5 · · · q0  q−1
  q−1 q−2 q−3 q−4 · · · q1  q0 ]
  q = [q0 q1 q2 · · · q−3 q−2 q−1]^{tr}.   (6.15)

By relabeling these indices, we see that the circulant matrix Cq looks like part of a Laurent matrix.


Chapter 4

Laplace transforms and transfer functions

This chapter presents a review of the Laplace transform and some of its properties. The Laplace transform will also be used to solve some simple differential equations including state space systems. The Laplace transform will be used to study some simple circuits.

4.1 The Laplace transform

In this section we introduce the Laplace transform. Recall that a function g(t) is of exponential order if g is a function defined on [0, ∞) and there exist positive constants m and γ such that

|g(t)| ≤ m e^{γt}   (for all t ≥ 0).   (1.1)

For example, the functions 1, e^{at}, cos(ωt) and sin(ωt) are all functions of exponential order. If g is a function of exponential order, then the Laplace transform of g is the function G(s) defined by

G(s) = (Lg)(s) = ∫₀^∞ e^{−st} g(t) dt.   (1.2)

Notice that the Laplace transform G(s) is a function of the complex variable s = σ + iω. The Laplace transform maps a function g(t) in time to a function G(s) of a complex variable. Here we use capital G to represent the Laplace transform of g. The notation G = Lg simply means that G is the Laplace transform of g.

It is emphasized that if g is a function of exponential order, then its Laplace transform is well defined. In other words, if g is not of exponential order, then its Laplace transform G may not be well defined and may not make any sense. If g is a function of exponential order satisfying (1.1), then its Laplace transform G(s) is well defined in {s : ℜs > γ}. To see this let s = σ + iω and simply observe that

|G(s)| = |∫₀^∞ e^{−st} g(t) dt| ≤ ∫₀^∞ |e^{−(σ+iω)t} g(t)| dt ≤ ∫₀^∞ e^{−σt} m e^{γt} dt
       = ∫₀^∞ m e^{(γ−σ)t} dt = m e^{(γ−σ)t}/(γ − σ) |₀^∞ = m/(σ − γ)   (if σ > γ).


So |G(s)| is finite for all σ > γ. Therefore G(s) is a well defined function of s for all s in the region {s : ℜs > γ}.

Now let us compute the Laplace transform of the function g defined by

g(t) = 1 if 0 ≤ t ≤ t0
     = 0 otherwise.   (1.3)

The Laplace transform of g is given by

G(s) = (1 − e^{−s t0})/s.   (1.4)

To verify this simply observe that

G(s) = ∫₀^∞ e^{−st} g(t) dt = ∫₀^{t0} e^{−st} dt = e^{−st}/(−s) |₀^{t0} = (1 − e^{−s t0})/s.

Hence the Laplace transform of g in (1.3) is given by (1.4).
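Formula (1.4) can be checked by direct numerical quadrature. The following sketch uses a simple trapezoidal rule; t0 and s are arbitrary choices, and Python/NumPy stands in for Matlab:

```python
import numpy as np

# Check (1.4) by quadrature; t0 and s are arbitrary choices for the sketch.
t0 = 2.0
s = 1.5

t = np.linspace(0.0, t0, 200001)
integrand = np.exp(-s * t)                # e^{-st} g(t) with g = 1 on [0, t0]
h = t[1] - t[0]
numeric = float(h * (np.sum(integrand) - 0.5 * (integrand[0] + integrand[-1])))

closed_form = (1.0 - np.exp(-s * t0)) / s
```

Since g vanishes beyond t0, integrating over [0, t0] captures the whole Laplace integral.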

4.1.1 Linearity

The Laplace transform is a linear operator. To be precise, let g and h be two functions of exponential order. Then

αG(s) + βH(s) = αLg + βLh = L(αg + βh),   (1.5)

where α and β are scalars. The proof is simple and follows from the fact that the integral is a linear operator,

L(αg + βh) = ∫₀^∞ e^{−st}(αg(t) + βh(t)) dt = α ∫₀^∞ e^{−st} g(t) dt + β ∫₀^∞ e^{−st} h(t) dt = αLg + βLh = αG(s) + βH(s).

Therefore (1.5) holds and the Laplace transform L is a linear operator.

The Laplace transform of e^{at}. If a is any complex number, then the Laplace transform of e^{at} is given by

(L e^{at})(s) = 1/(s − a).   (1.6)

To verify this simply observe that for ℜs > ℜa, we have

(L e^{at})(s) = ∫₀^∞ e^{−st} e^{at} dt = ∫₀^∞ e^{(a−s)t} dt = e^{(a−s)t}/(a − s) |₀^∞ = 1/(s − a).

The last equality follows from the fact that ℜs > ℜa. Hence (1.6) holds. Finally, it is noted that G(s) is well defined for all s ≠ a.


For an example, assume that

g(t) = 3e^{−2t} − 4e^{−5t} + e^{−6t}   (t ≥ 0).   (1.7)

Using linearity along with (1.6), we see that the Laplace transform of this g is given by

G(s) = 3/(s + 2) − 4/(s + 5) + 1/(s + 6).   (1.8)

The unit step function 1+(t) is defined by

1+(t) = 1 if t ≥ 0
      = 0 if t < 0.   (1.9)

Clearly, 1+(t) = e^{0t} for all t ≥ 0. By substituting a = 0 in (1.6), we see that the Laplace transform of the unit step function 1+(t) is given by (L1+)(s) = 1/s. It is noted that 1+(t) is also called the Heaviside step function. For convenience, we set the value 1+(0) = 1. This makes the unit step function right continuous. Depending upon the application, sometimes 1+(0) is defined to be 1/2 or 0. Because any point has Lebesgue measure zero, the definition of 1+(t) at t = 0 does not affect our applications of the unit step function. Finally, see also the Heaviside step function in Wikipedia.

The Laplace transform of sinusoids. In this section we will compute the Laplace transforms of the cosine and sine functions. We claim that

(L cos(ωt))(s) = s/(s² + ω²)   and   (L sin(ωt))(s) = ω/(s² + ω²).   (1.10)

To obtain the first equality notice that (L e^{iωt})(s) = 1/(s − iω); see (1.6). Using this along with the linearity of the Laplace transform, we have

(L cos(ωt))(s) = L[(e^{iωt} + e^{−iωt})/2] = 1/(2(s − iω)) + 1/(2(s + iω))
              = (s + iω + s − iω)/(2(s − iω)(s + iω)) = s/(s² + ω²).

Hence the first equality in (1.10) holds. On the other hand,

(L sin(ωt))(s) = L[(e^{iωt} − e^{−iωt})/(2i)] = 1/(2i(s − iω)) − 1/(2i(s + iω))
              = (s + iω − (s − iω))/(2i(s − iω)(s + iω)) = ω/(s² + ω²).

Hence the second equality in (1.10) holds.

For another example, consider the function g defined by

g(t) = 6 − 3e^{−2t} + 2 cos(4t) − 3 sin(6t).


Then using linearity, we see that the Laplace transform of this g is given by

(Lg)(s) = 6/s − 3/(s + 2) + 2s/(s² + 16) − 18/(s² + 36).

Finally, let us show that

(L cos(ωt + φ))(s) = s cos(φ)/(s² + ω²) − ω sin(φ)/(s² + ω²).   (1.11)

As expected, the angular frequency ω and the phase φ are constants. To verify this recall that

cos(a + b) = cos(a) cos(b) − sin(a) sin(b).

Using this along with the fact that the Laplace transform is a linear operator, we obtain

(L cos(ωt + φ))(s) = L[cos(ωt)] cos(φ) − L[sin(ωt)] sin(φ) = s cos(φ)/(s² + ω²) − ω sin(φ)/(s² + ω²).

Thus (1.11) holds.

The Laplace transform of tⁿ. The Laplace transform of tⁿ for any integer n ≥ 0 is given by

(Ltⁿ)(s) = n!/s^{n+1}   (n = 0, 1, 2, · · · ).   (1.12)

In particular, (Lt)(s) = 1/s² and (Lt²)(s) = 2/s³. To verify that (1.12) holds, recall that (L1)(s) = 1/s. Now let us use induction and assume that (Ltⁿ)(s) = n!/s^{n+1} holds for some integer n. Then using integration by parts with ℜs > 0, we obtain

L t^{n+1} = ∫₀^∞ t^{n+1} e^{−st} dt = −t^{n+1} e^{−st}/s |₀^∞ + ((n + 1)/s) ∫₀^∞ e^{−st} tⁿ dt = (n + 1) n!/(s · s^{n+1}) = (n + 1)!/s^{n+2}.

Thus (L t^{n+1})(s) = (n + 1)!/s^{n+2} which completes the induction argument. Therefore (1.12) holds.

4.1.2 Multiplication by e^{at}

Recall that G(s) is the Laplace transform of g(t). The Laplace transform of e^{at} g(t) is given by

(L e^{at} g(t))(s) = G(s − a).   (1.13)

In other words, the Laplace transform of e^{at} g(t) equals the Laplace transform of g(t) where s is replaced by s − a. To see this simply observe that

(L e^{at} g(t))(s) = ∫₀^∞ e^{−st} e^{at} g(t) dt = ∫₀^∞ e^{−(s−a)t} g(t) dt = (Lg(t))(s − a) = G(s − a).

Hence (1.13) holds.
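The shift rule (1.13) is easy to test numerically. The sketch below takes g(t) = t², so G(s) = 2/s³ by (1.12), and compares a truncated quadrature of ∫₀^∞ e^{−st} e^{at} t² dt against G(s − a); the parameter values and the truncation point are arbitrary choices, and Python/NumPy stands in for Matlab:

```python
import numpy as np

# Check the shift rule (1.13) with g(t) = t^2, so G(s) = 2/s^3 by (1.12).
# The values of a, s and the truncation T are arbitrary choices.
a = -1.0
s = 2.0
T = 60.0                                  # e^{(a-s)t} has decayed to ~0 by t = T

t = np.linspace(0.0, T, 600001)
integrand = np.exp(-s * t) * np.exp(a * t) * t**2
h = t[1] - t[0]
numeric = float(h * (np.sum(integrand) - 0.5 * (integrand[0] + integrand[-1])))

shifted = 2.0 / (s - a) ** 3              # G(s - a)
```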


For example, let us use (1.13) to show that the Laplace transform of e^{at} equals 1/(s − a). Recall that (L1)(s) = 1/s. So using (1.13), we see that (L e^{at})(s) = (L e^{at} × 1)(s) = 1/(s − a).

Now let us show that

(L e^{at} tⁿ)(s) = n!/(s − a)^{n+1}   (n = 0, 1, 2, · · · ).   (1.14)

This follows from (1.13) and the fact that (Ltⁿ)(s) = n!/s^{n+1}, that is,

(L e^{at} tⁿ)(s) = (Ltⁿ)(s − a) = n!/s^{n+1} |_{s−a} = n!/(s − a)^{n+1}.

Hence (1.14) holds.

We claim that

(L e^{at} cos(ωt))(s) = (s − a)/((s − a)² + ω²)   (1.15)
(L e^{at} sin(ωt))(s) = ω/((s − a)² + ω²).

To verify the first equality notice that the property (L e^{at} g(t))(s) = G(s − a) and (1.10) yield

(L e^{at} cos(ωt))(s) = (L cos(ωt))(s − a) = s/(s² + ω²) |_{s−a} = (s − a)/((s − a)² + ω²).

Hence the first equality in (1.15) holds. To obtain the second equality observe that

(L e^{at} sin(ωt))(s) = (L sin(ωt))(s − a) = ω/(s² + ω²) |_{s−a} = ω/((s − a)² + ω²).

Thus (1.15) holds.

For example, let g be the function defined by

g(t) = 2e^{−3t} + 6e^{3t} cos(2t) − 3e^{−2t} sin(4t).

Then using the Laplace transform pairs in (1.15), we see that the Laplace transform of this g is given by

G(s) = 2/(s + 3) + 6(s − 3)/((s − 3)² + 4) − 12/((s + 2)² + 16).

To complete this section let us observe that by replacing s by s − a in (1.11), we arrive at

(L e^{at} cos(ωt + φ))(s) = (s − a) cos(φ)/((s − a)² + ω²) − ω sin(φ)/((s − a)² + ω²).   (1.16)

As expected, the angular frequency ω and the phase φ are constants.


4.1.3 The Dirac delta function

Let δ(t) be the Dirac delta function. Recall that the Dirac delta function δ(t) is positive infinity at the origin, is zero everywhere else and has area one. To obtain a model for the Dirac delta function, let δn(t) be the function defined by

δn(t) = n if 0 ≤ t ≤ 1/n
      = 0 otherwise.   (1.17)

Clearly, δn(t) is positive and has area one. Then the Dirac delta function is formally given by

δ(t) = lim_{n→∞} δn(t).   (1.18)

By construction δ(t) is positive infinity at the origin, is zero everywhere else and has area one. (Because the Laplace transform is only defined for functions g(t) where t ≥ 0, our definition of the Dirac delta function used here is slightly different from the symmetric Dirac delta function obtained in equation (7.28) of Section 2.7.2 in Chapter 2.)

Recall that if h is any continuous function, then

h(0) =

∫ ∞

−∞h(t)δ(t)dt and h(t) =

∫ ∞

−∞h(t− τ)δ(τ)dτ. (1.19)

The fact that δ(t − τ) picks out the value of h(t) is called the sifting or sampling propertyof the Dirac delta function. To see this let g(t) be the integral of h, or equivalently, assumethat g = h. Then formally∫ ∞

−∞h(t)δ(t) dt = lim

n→∞

∫ ∞

−∞h(t)δn(t) dt = lim

n→∞

∫ 1/n

0

nh(t) dt

= limn→∞

g(1/n)− g(0)

1/n= lim

Δ→0

g(Δ)− g(0)

Δ= g(0) = h(0) .

Hence equation (1.19) holds.The Laplace transform of the Dirac delta function is given by

1 = (Lδ(t))(s) . (1.20)

To verify this simply notice that (1.19) gives

(Lδ(t))(s) =∫ ∞

0

e−stδ(t) dt = e−s0 = 1 .

For some further results concerning the Dirac delta function see Section 2.7.2 in Chapter 2.
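The model $\delta_n$ above can also be examined numerically; this short sketch (my own, not from the text) evaluates the exact Laplace transform of $\delta_n$ and watches it approach $1$, in agreement with (1.20).

```python
# The Laplace transform of delta_n is integral_0^{1/n} n e^{-st} dt
# = n(1 - e^{-s/n})/s, which should tend to 1 as n -> infinity.
import math

def laplace_delta_n(s, n):
    # exact value of the integral of n*e^{-st} over [0, 1/n]
    return n*(1 - math.exp(-s/n))/s

s = 2.0
vals = [laplace_delta_n(s, n) for n in (1, 10, 100, 10_000)]
print(vals)  # increases toward 1
gap = abs(vals[-1] - 1.0)
```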


The following table presents a summary of the Laplace transform pairs presented in this section.

Laplace transform pairs
\[
\begin{array}{ll}
g(t) & G(s) = \int_0^\infty e^{-st}g(t)\,dt \\[4pt]
\delta(t) & 1 \\[2pt]
1 & \dfrac{1}{s} \\[6pt]
t^n & \dfrac{n!}{s^{n+1}} \\[6pt]
e^{at} & \dfrac{1}{s-a} \\[6pt]
\cos(\omega t) & \dfrac{s}{s^2+\omega^2} \\[6pt]
\sin(\omega t) & \dfrac{\omega}{s^2+\omega^2} \\[6pt]
e^{at}t^n & \dfrac{n!}{(s-a)^{n+1}} \\[6pt]
e^{at}\cos(\omega t) & \dfrac{s-a}{(s-a)^2+\omega^2} \\[6pt]
e^{at}\sin(\omega t) & \dfrac{\omega}{(s-a)^2+\omega^2} \\[6pt]
2\Re(\gamma)e^{at}\cos(\omega t) - 2\Im(\gamma)e^{at}\sin(\omega t) & \dfrac{\gamma}{s-a-\omega i} + \dfrac{\bar\gamma}{s-a+\omega i}
\end{array}
\]
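A few rows of the table can be spot-checked numerically. The sketch below (my own construction, not part of the text) compares a midpoint Riemann sum for the Laplace integral with the tabulated closed forms at one value of $s$.

```python
# Spot-check several table rows at s = 3 with a = -1 and omega = 2.
import math

def laplace_numeric(f, s, T=40.0, n=400_000):
    dt = T/n
    return sum(f((k + 0.5)*dt)*math.exp(-s*(k + 0.5)*dt) for k in range(n))*dt

s, a, w = 3.0, -1.0, 2.0
pairs = [
    (lambda t: 1.0,                          1/s),                          # 1 <-> 1/s
    (lambda t: t**2,                         2/s**3),                       # t^n with n = 2, n! = 2
    (lambda t: math.exp(a*t),                1/(s - a)),
    (lambda t: math.exp(a*t)*math.cos(w*t),  (s - a)/((s - a)**2 + w**2)),
    (lambda t: math.exp(a*t)*math.sin(w*t),  w/((s - a)**2 + w**2)),
]
errs = [abs(laplace_numeric(g, s) - G) for g, G in pairs]
print(errs)  # all tiny
```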

4.1.4 Exercise

Problem 1. Find the Laplace transform for the function g defined by

\[
g(t) = \begin{cases} 1 & \text{if } 1 \le t \le 2 \\ -1 & \text{if } 2 < t \le 4 \\ 0 & \text{otherwise}. \end{cases}
\]

Problem 2. Find the Laplace transform for the function g defined by
\[
g(t) = \begin{cases} t & \text{if } 0 \le t \le 1 \\ 0 & \text{otherwise}. \end{cases}
\]


Problem 3. Find the Laplace transform for the function g defined by

g(t) = 2δ(t) + 2 sin(3t)− 4e−3t (t ≥ 0) .

Problem 4. Find the Laplace transform for the function g defined by

g(t) = 2e−2t sin(3t)− 4e−3t cos(6t) (t ≥ 0) .

Problem 5. Find the Laplace transform for the function g defined by

g(t) = sin(ωt+ φ)

where ω and φ are constants.

Problem 6. Find the Laplace transform for the function g defined by

g(t) = eat sin(ωt+ φ)

where ω and φ are constants.

Problem 7. Assume that $\omega_n$ and $\zeta$ are real numbers satisfying $\omega_n > 0$ and $|\zeta| < 1$. Then find the function $g$ whose Laplace transform $G$ is given by
\[
G(s) = \frac{\omega_n\sqrt{1-\zeta^2}}{s^2 + 2\zeta\omega_n s + \omega_n^2}.
\]

4.2 Properties of the Laplace transform

In this section we will present several useful properties of the Laplace transform.

Differentiation. Recall that $\dot g$ denotes the derivative of a function $g(t)$ with respect to time. The Laplace transform of the derivative of a function $g(t)$ is given by
\[
(\mathcal{L}\dot g)(s) = sG(s) - g(0). \tag{2.1}
\]
To verify this, notice that integration by parts yields
\[
(\mathcal{L}\dot g)(s) = \int_0^\infty e^{-st}\dot g(t)\,dt = e^{-st}g(t)\Big|_0^\infty + s\int_0^\infty e^{-st}g(t)\,dt = sG(s) - g(0).
\]
Hence (2.1) holds.

The Laplace transform of the second derivative of a function $g(t)$ is given by
\[
(\mathcal{L}\ddot g)(s) = s^2G(s) - sg(0) - \dot g(0). \tag{2.2}
\]
To verify this, notice that (2.1) yields
\[
(\mathcal{L}\ddot g)(s) = s(\mathcal{L}\dot g)(s) - \dot g(0) = s^2G(s) - sg(0) - \dot g(0).
\]


Thus (2.2) holds.

Let $g^{(n)}$ denote the $n$-th derivative of $g$ where $n$ is a positive integer. The Laplace transform of the third derivative of a function $g(t)$ is given by
\[
(\mathcal{L}g^{(3)})(s) = s^3G(s) - s^2g(0) - s\dot g(0) - \ddot g(0). \tag{2.3}
\]
To verify this, notice that (2.1) and (2.2) yield
\[
(\mathcal{L}g^{(3)})(s) = s(\mathcal{L}\ddot g)(s) - \ddot g(0) = s\left(s^2G(s) - sg(0) - \dot g(0)\right) - \ddot g(0).
\]
Thus (2.3) holds. Finally, an induction argument shows that the Laplace transform of the $n$-th derivative of a function $g(t)$ is given by
\[
(\mathcal{L}g^{(n)})(s) = s^nG(s) - s^{n-1}g(0) - s^{n-2}\dot g(0) - \cdots - sg^{(n-2)}(0) - g^{(n-1)}(0). \tag{2.4}
\]
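The differentiation property can be tested numerically. The sketch below (my own, with an arbitrarily chosen $g$) compares the transform of $\dot g$, computed by direct integration, with $sG(s) - g(0)$ from (2.1).

```python
# Check (2.1) for the (arbitrarily chosen) function g(t) = e^{-2t}cos(t).
import math

def lap(f, s, T=40.0, n=400_000):
    # midpoint Riemann sum for the Laplace integral on [0, T]
    dt = T/n
    return sum(f((k + 0.5)*dt)*math.exp(-s*(k + 0.5)*dt) for k in range(n))*dt

g     = lambda t: math.exp(-2*t)*math.cos(t)
g_dot = lambda t: math.exp(-2*t)*(-2*math.cos(t) - math.sin(t))  # derivative of g

s = 1.5
err = abs(lap(g_dot, s) - (s*lap(g, s) - g(0.0)))
print(err)  # tiny discretization error
```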

Integration. As before, let $g$ be a function of exponential order. The following presents the Laplace transform for the integral of a function $g$, that is,
\[
\left(\mathcal{L}\int_0^t g(x)\,dx\right)(s) = \frac{G(s)}{s}. \tag{2.5}
\]
To verify that (2.5) holds, let $h(t) = \int_0^t g(x)\,dx$. Then using the fact that $\dot h = g$, we obtain
\[
G(s) = (\mathcal{L}\dot h)(s) = sH(s) - h(0) = sH(s) - \int_0^0 g(x)\,dx.
\]
Notice that $\int_0^0 g(x)\,dx$ is zero. Thus $G(s) = sH(s)$. This implies that
\[
\left(\mathcal{L}\int_0^t g(x)\,dx\right)(s) = H(s) = \frac{G(s)}{s}.
\]
Therefore (2.5) holds. Finally, it is noted that differentiation corresponds to multiplication by $s$, while integration corresponds to division by $s$.

Notice that for any differentiable function $g$ of exponential order, we have
\[
g(t) = g(0) + \int_0^t \dot g(x)\,dx.
\]
By taking the Laplace transform of both sides, equation (2.5) shows that
\[
G(s) = \mathcal{L}\left(g(0) + \int_0^t \dot g(x)\,dx\right) = \frac{g(0)}{s} + \frac{(\mathcal{L}\dot g)(s)}{s}.
\]
Multiplying by $s$ and rearranging terms yields the classical formula for the Laplace transform of the derivative $\dot g$ of $g$, that is,
\[
(\mathcal{L}\dot g)(s) = sG(s) - g(0);
\]


see equation (2.1).

Let us directly verify that (2.5) holds for $g(t) = \sin(\omega t)$. In this case,
\[
\int_0^t \sin(\omega x)\,dx = \left.-\frac{\cos(\omega x)}{\omega}\right|_0^t = \frac{1}{\omega} - \frac{\cos(\omega t)}{\omega}.
\]
Moreover, observe that
\[
\left(\mathcal{L}\int_0^t \sin(\omega x)\,dx\right)(s) = \mathcal{L}\left(\frac{1}{\omega} - \frac{\cos(\omega t)}{\omega}\right) = \frac{1}{s\omega} - \frac{s}{\omega(s^2+\omega^2)}
= \frac{s^2+\omega^2-s^2}{\omega s(s^2+\omega^2)} = \frac{\omega}{s(s^2+\omega^2)} = \frac{\mathcal{L}\sin(\omega t)}{s}.
\]
As expected,
\[
\left(\mathcal{L}\int_0^t \sin(\omega x)\,dx\right)(s) = \frac{\mathcal{L}(\sin(\omega t))}{s}.
\]
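The same $\sin(\omega t)$ verification can be repeated numerically; the sketch below (my own) transforms the closed-form integral $h(t) = (1-\cos(\omega t))/\omega$ and compares it with $\omega/(s(s^2+\omega^2))$.

```python
# Numeric check of (2.5) for g(t) = sin(omega t).
import math

def lap(f, s, T=60.0, n=600_000):
    # midpoint Riemann sum for the Laplace integral on [0, T]
    dt = T/n
    return sum(f((k + 0.5)*dt)*math.exp(-s*(k + 0.5)*dt) for k in range(n))*dt

w, s = 2.0, 1.0
h = lambda t: (1 - math.cos(w*t))/w   # integral of sin(omega x) from 0 to t
err = abs(lap(h, s) - w/(s*(s**2 + w**2)))
print(err)  # tiny discretization error
```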

Convolution. The convolution between two functions $g$ and $u$ is defined by
\[
(g \otimes u)(t) = \int_0^t g(t-\sigma)u(\sigma)\,d\sigma. \tag{2.6}
\]
Notice that by changing the variables of integration, we obtain
\[
(u \otimes g)(t) = \int_0^t g(\sigma)u(t-\sigma)\,d\sigma. \tag{2.7}
\]
In particular, $g \otimes u = u \otimes g$.

Notice that $g \otimes \delta = g$ where $\delta(t)$ is the Dirac delta function. To see this, simply observe that the sampling property of the Dirac delta function in (1.19) yields
\[
(g \otimes \delta)(t) = \int_0^t g(t-\sigma)\delta(\sigma)\,d\sigma = g(t-0) = g(t).
\]
Therefore $g \otimes \delta = g$.

If $g$ and $u$ are two functions, then
\[
(\mathcal{L}(g \otimes u))(s) = G(s)U(s). \tag{2.8}
\]
In other words, the Laplace transform of the convolution of two functions $g$ and $u$ in the time domain corresponds to multiplication $GU$ in the $s$ or Laplace transform domain.

To show that $(\mathcal{L}(g \otimes u))(s) = G(s)U(s)$, notice that (2.7) yields
\[
\begin{aligned}
(\mathcal{L}(g \otimes u))(s) &= \int_0^\infty \int_0^t e^{-st}g(\sigma)u(t-\sigma)\,d\sigma\,dt
= \int_0^\infty g(\sigma)\int_\sigma^\infty e^{-st}u(t-\sigma)\,dt\,d\sigma \\
&= \int_0^\infty e^{-s\sigma}g(\sigma)\int_\sigma^\infty e^{-s(t-\sigma)}u(t-\sigma)\,dt\,d\sigma \\
&= \int_0^\infty e^{-s\sigma}g(\sigma)\int_0^\infty e^{-s\tau}u(\tau)\,d\tau\,d\sigma \\
&= \int_0^\infty e^{-s\sigma}g(\sigma)U(s)\,d\sigma = G(s)U(s).
\end{aligned}
\]


Hence (2.8) holds.

For an example, let us convolve $g(t) = e^{-t}$ with $u(t) = e^{-2t}$, that is,
\[
(e^{-t} \otimes e^{-2t})(t) = \int_0^t e^{-(t-\sigma)}e^{-2\sigma}\,d\sigma = e^{-t}\int_0^t e^{-\sigma}\,d\sigma = -e^{-t}\,e^{-\sigma}\Big|_{\sigma=0}^t = e^{-t} - e^{-2t}.
\]
Hence $(e^{-t} \otimes e^{-2t})(t) = e^{-t} - e^{-2t}$. Clearly,
\[
(\mathcal{L}e^{-t} \otimes e^{-2t})(s) = \mathcal{L}(e^{-t} - e^{-2t}) = \frac{1}{s+1} - \frac{1}{s+2} = \frac{1}{(s+1)(s+2)}.
\]
So as expected, $(\mathcal{L}e^{-t} \otimes e^{-2t})(s)$ equals the product $(\mathcal{L}e^{-t})(s)\times(\mathcal{L}e^{-2t})(s)$.
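The convolution example can also be checked on a grid. In the sketch below (my own discretization choices), the integrand factors as $e^{-t}e^{-\sigma}$, so the convolution can be accumulated with a running Riemann sum and compared with the closed form $e^{-t} - e^{-2t}$.

```python
# Discretized (e^{-t} (x) e^{-2t})(t) versus the closed form e^{-t} - e^{-2t}.
import math

T, n = 20.0, 2000
dt = T/n
acc, worst = 0.0, 0.0
for k in range(n):
    t = (k + 1)*dt
    acc += math.exp(-k*dt)*dt      # left Riemann sum of integral_0^t e^{-sigma} dsigma
    conv = math.exp(-t)*acc        # since e^{-(t-sigma)}e^{-2 sigma} = e^{-t} e^{-sigma}
    worst = max(worst, abs(conv - (math.exp(-t) - math.exp(-2*t))))
print(worst)  # O(dt) discretization error
```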

The shifting property. Recall that $1_+(t) = 1$ for all $t > 0$ and $1_+(t) = 0$ for $t < 0$. Let $t_0 > 0$ be a specified point on the real line. Then the Laplace transform of $g(t-t_0)1_+(t-t_0)$ is given by
\[
(\mathcal{L}g(t-t_0)1_+(t-t_0))(s) = e^{-st_0}G(s) \qquad (t_0 > 0). \tag{2.9}
\]
To verify this, simply observe that $g(t-t_0)1_+(t-t_0) = 0$ for $0 \le t < t_0$. Thus
\[
(\mathcal{L}g(t-t_0)1_+(t-t_0))(s) = \int_{t_0}^\infty e^{-st}g(t-t_0)\,dt = \int_0^\infty e^{-s(\sigma+t_0)}g(\sigma)\,d\sigma
= e^{-st_0}\int_0^\infty e^{-st}g(t)\,dt = e^{-st_0}(\mathcal{L}g)(s) = e^{-st_0}G(s).
\]
Therefore (2.9) holds.

For an application of the shifting property, consider the Dirac delta function $\delta(t)$. Let us show that $(\mathcal{L}\delta(t-t_0))(s) = e^{-st_0}$ when $t_0 > 0$. First observe that $\delta(t-t_0) = \delta(t-t_0)1_+(t-t_0)$. Combining this with $\mathcal{L}\delta = 1$ and (2.9), we have
\[
(\mathcal{L}\delta(t-t_0))(s) = (\mathcal{L}\delta(t-t_0)1_+(t-t_0))(s) = e^{-st_0}\mathcal{L}\delta = e^{-st_0}.
\]
Thus $(\mathcal{L}\delta(t-t_0))(s) = e^{-st_0}$.

It is emphasized that one can only apply (2.9) to functions of the form $g(t-t_0)1_+(t-t_0)$ when $t_0 > 0$. Notice that this function is zero on the interval $[0, t_0)$. To see what can go wrong, consider the function $g(t) = e^{-t}$. Then
\[
(\mathcal{L}e^{-(t-t_0)})(s) = e^{t_0}(\mathcal{L}e^{-t})(s) = \frac{e^{t_0}}{s+1} \quad\text{and}\quad (\mathcal{L}e^{-(t-t_0)}1_+(t-t_0))(s) = \frac{e^{-st_0}}{s+1}.
\]
Therefore $(\mathcal{L}e^{-(t-t_0)})(s) = e^{t_0}(s+1)^{-1} \ne (\mathcal{L}e^{-(t-t_0)}1_+(t-t_0))(s)$. We cannot apply (2.9) to $e^{-(t-t_0)}$ because $e^{-(t-t_0)}$ does not equal $e^{-(t-t_0)}1_+(t-t_0)$, which is zero on the interval $[0, t_0)$.
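The shifting property is easy to confirm numerically; in the sketch below (with $g(t) = e^{-t}$ and $t_0 = 1$, values chosen for illustration), the transform of the truncated shift $e^{-(t-1)}1_+(t-1)$ should equal $e^{-s}/(s+1)$.

```python
# Numeric check of (2.9) with g(t) = e^{-t} and t0 = 1.
import math

def lap(f, s, T=40.0, n=400_000):
    # midpoint Riemann sum for the Laplace integral on [0, T]
    dt = T/n
    return sum(f((k + 0.5)*dt)*math.exp(-s*(k + 0.5)*dt) for k in range(n))*dt

t0, s = 1.0, 1.5
shifted = lambda t: math.exp(-(t - t0)) if t >= t0 else 0.0  # g(t-t0) 1_+(t-t0)
err = abs(lap(shifted, s) - math.exp(-s*t0)/(s + 1))
print(err)  # tiny discretization error
```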

4.2.1 Exercise

Problem 1. Consider the function g defined by

\[
g(t) = \begin{cases} 1 & \text{if } 1 \le t < 2 \\ 0 & \text{otherwise}. \end{cases}
\]


Notice that $g(t) = 1_+(t-1) - 1_+(t-2)$. Using this form of $g$, find $G(s)$.

Problem 2. Assume that the Laplace transform of the function h is given by

\[
H(s) = \frac{1}{(s-1)(s+2)}.
\]
Using $\mathcal{L}e^t = 1/(s-1)$ and $\mathcal{L}e^{-2t} = 1/(s+2)$ along with (2.8), find $h(t)$.

Problem 3. Consider the differential equation

\[
\dot x = ax + bu
\]
where $a$ and $b$ are constants while $u(t)$ is a (forcing) function. Then verify that
\[
x(t) = e^{at}x(0) + \int_0^t e^{a(t-\sigma)}bu(\sigma)\,d\sigma
\]
is a solution to this differential equation.

Problem 4. Recall that $(\mathcal{L}\dot g)(s) = sG(s) - g(0)$. Using this fact, show that the derivative of $1_+(t)$ formally equals the Dirac delta function $\delta(t)$.

4.3 The inverse Laplace transform

We say that $g$ is the inverse Laplace transform of a function $G(s)$ if $G(s) = (\mathcal{L}g)(s)$. In this case, $g$ is also denoted by $g(t) = (\mathcal{L}^{-1}G)(t)$. Using some simple results from analytic function theory, it follows that $g$ and its Laplace transform $G$ uniquely determine each other. So the inverse Laplace transform is well defined. In this section we will use partial fraction expansions to compute the inverse Laplace transform for proper rational functions. To this end, recall that $G(s)$ is a rational function if $G(s) = n(s)/d(s)$ where $n(s)$ and $d(s)$ are polynomials in $s$. Now assume that $G(s) = n(s)/d(s)$ is a rational function. Then we say that $z$ is a zero of $G(s)$ if $z$ is a complex number satisfying $G(z) = 0$. Throughout we assume that $n$ and $d$ have no common roots. So $z$ is a zero of $G(s)$ if and only if $z$ is a root of $n(s)$, that is, $n(z) = 0$. We say that $\lambda$ is a pole of $G(s)$ if $\lambda$ is a complex number satisfying $|G(\lambda)| = \infty$. Notice that $\lambda$ is a pole of $G(s)$ if and only if $\lambda$ is a root of $d(s)$, that is, $d(\lambda) = 0$.

For example, consider the function

\[
G(s) = \frac{(s-6)(s^2-2s+2)}{(s+2)(s+4)(s^2+2s+5)} = \frac{(s-6)(s-1+i)(s-1-i)}{(s+2)(s+4)(s+1+2i)(s+1-2i)}.
\]
This function $G$ has three zeros $\{6, 1 \pm i\}$ and four poles $\{-2, -4, -1 \pm 2i\}$.

We say that $G(s)$ is a proper rational function if $G(s)$ is a rational function of the form $G(s) = n(s)/d(s)$ and $\deg n \le \deg d$. As expected, the degree of a polynomial is denoted by $\deg$. The function $G(s)$ is a strictly proper rational function if $G(s)$ is a rational function of the form $G(s) = n(s)/d(s)$ and $\deg n < \deg d$.


Consider the function $g$ given by
\[
g(t) = \sum_{k=1}^{\nu} a_k e^{\lambda_k t} \qquad (t \ge 0). \tag{3.1}
\]
Without loss of generality we assume that $\{\lambda_k\}_1^\nu$ are distinct complex numbers. The Laplace transform of $g$ is given by
\[
G(s) = \sum_{k=1}^{\nu} \frac{a_k}{s-\lambda_k}. \tag{3.2}
\]
The form of $G$ in (3.2) is called the partial fraction expansion for $G$. By placing all the terms over a common denominator we see that
\[
G(s) = \frac{n(s)}{\prod_1^\nu (s-\lambda_k)} = \frac{n(s)}{d(s)} \tag{3.3}
\]
where $d(s) = \prod_1^\nu (s-\lambda_k)$ is a monic polynomial of degree $\nu$ and $n(s)$ is a polynomial of degree at most $\nu - 1$. A monic polynomial is a polynomial $p(s)$ such that the coefficient of $s^\nu$ equals one where $\nu = \deg p$. Notice that $G$ is a strictly proper rational function. Moreover, all the roots of $d$ are distinct. In other words, if $g$ is a function of the form (3.1), then the Laplace transform $G$ of $g$ is a strictly proper rational function with distinct poles.

Now let us show that the converse statement also holds. If $G(s)$ is a strictly proper rational function with distinct poles, then $g(t) = (\mathcal{L}^{-1}G)(t)$ exists and $g$ is a function of the form (3.1). To this end, assume that $G(s)$ is a strictly proper rational function of the form $G(s) = n(s)/d(s)$ where $n$ and $d$ are polynomials with no common roots satisfying $\deg n < \deg d$. Without loss of generality we also assume that $d$ is a monic polynomial of degree $\nu$. Now let $\{\lambda_k\}_1^\nu$ be the $\nu$ distinct poles of $G$, or equivalently, the $\nu$ distinct roots of $d$. Then we claim that
\[
G(s) = \sum_{k=1}^{\nu} \frac{a_k}{s-\lambda_k} \quad\text{where}\quad a_j = [G(s)(s-\lambda_j)]_{s=\lambda_j} \qquad (j = 1, 2, \cdots, \nu). \tag{3.4}
\]
In this case, the inverse Laplace transform $g(t) = (\mathcal{L}^{-1}G)(t)$ is given by
\[
g(t) = \sum_{k=1}^{\nu} a_k e^{\lambda_k t} \quad\text{where}\quad a_j = [G(s)(s-\lambda_j)]_{s=\lambda_j} \qquad (j = 1, 2, \cdots, \nu). \tag{3.5}
\]

Clearly, the Laplace transform of $g$ in (3.5) is given by $G$ in (3.4). To verify that (3.4) holds, for the moment assume that $G$ admits a partial fraction expansion of the form
\[
G(s) = \sum_{k=1}^{\nu} \frac{a_k}{s-\lambda_k}. \tag{3.6}
\]
Then multiplying this equation by $s-\lambda_j$ yields
\[
[G(s)(s-\lambda_j)]_{s=\lambda_j} = \sum_{k=1}^{\nu} \left[\frac{a_k(s-\lambda_j)}{s-\lambda_k}\right]_{s=\lambda_j} = \sum_{k=1}^{\nu} a_k\delta_{kj} = a_j.
\]


Recall that $\delta_{kj}$ is the Kronecker delta. Hence $G(s)$ is given by (3.4). Finally, it is noted that one can use the residue command in Matlab to compute the partial fraction expansion for $G$.

To show that $G$ admits a partial fraction expansion of the form (3.6), observe that $d(s) = \prod_1^\nu(s-\lambda_k)$. Let $q$ be the polynomial defined by
\[
q(s) = d(s)\sum_{k=1}^{\nu} \frac{a_k}{s-\lambda_k} = \sum_{k=1}^{\nu} \frac{a_k d(s)}{s-\lambda_k}. \tag{3.7}
\]
Notice that $q$ is a polynomial of degree at most $\nu - 1$. We claim that $q(s) = n(s)$. Since $G = n/d$ and $a_j = [G(s)(s-\lambda_j)]_{s=\lambda_j}$, we arrive at
\[
q(\lambda_j) = \sum_{k=1}^{\nu} \left.\frac{a_k d(s)}{s-\lambda_k}\right|_{s=\lambda_j} = a_j \prod_{k\ne j}(\lambda_j - \lambda_k)
= [G(s)(s-\lambda_j)]_{s=\lambda_j} \prod_{k\ne j}(\lambda_j - \lambda_k) = G(\lambda_j)d(\lambda_j) = n(\lambda_j).
\]
So $q(\lambda_j) = n(\lambda_j)$ for all $j = 1, 2, \cdots, \nu$. Obviously, $q(\lambda_j) - n(\lambda_j) = 0$ for all $j = 1, 2, \cdots, \nu$. In other words, $p(s) = q(s) - n(s)$ is a polynomial of degree at most $\nu - 1$ with $\nu$ roots $\{\lambda_j\}_1^\nu$. Thus $p(s) = 0$. Therefore $q(s) = n(s)$. By consulting the form of $q$ in (3.7) along with $G(s) = n(s)/d(s)$, we see that $G$ is given by the partial fraction expansion of the form expressed in equation (3.4).
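For distinct poles, formula (3.4) translates directly into a few lines of code. The sketch below (my own helper, not the text's) computes each residue as $n(\lambda_j)/\prod_{k\ne j}(\lambda_j-\lambda_k)$, which is exactly $[G(s)(s-\lambda_j)]_{s=\lambda_j}$ when $d$ is monic with distinct roots.

```python
def residues(n_coeffs, poles):
    """Residues of n(s)/prod(s - poles) at distinct poles, via (3.4).
    n_coeffs lists the numerator coefficients, highest power first."""
    def n(s):
        v = 0
        for c in n_coeffs:
            v = v*s + c            # Horner evaluation of n(s)
        return v
    out = []
    for j, lj in enumerate(poles):
        dprime = 1.0               # prod over k != j of (lambda_j - lambda_k)
        for k, lk in enumerate(poles):
            if k != j:
                dprime *= (lj - lk)
        out.append(n(lj)/dprime)
    return out

# Applied to G(s) = 6(s+3)/((s+1)(s+2)(s+4)), i.e. numerator 6s + 18:
a = residues([6, 18], [-1.0, -2.0, -4.0])
print(a)  # → [4.0, -3.0, -1.0]
```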

Example, real poles. Let G be the function defined by

\[
G(s) = \frac{6(s+3)}{(s+1)(s+2)(s+4)}.
\]
The poles of $G$ are $-1$, $-2$ and $-4$. Notice that this $G$ admits a partial fraction expansion of the form
\[
G(s) = \frac{a_1}{s+1} + \frac{a_2}{s+2} + \frac{a_3}{s+4}.
\]
The coefficients are computed by
\[
\begin{aligned}
a_1 &= [G(s)(s+1)]_{s=-1} = \left[\frac{6(s+3)}{(s+2)(s+4)}\right]_{s=-1} = \frac{6(3-1)}{(2-1)(4-1)} = 4 \\
a_2 &= [G(s)(s+2)]_{s=-2} = \left[\frac{6(s+3)}{(s+1)(s+4)}\right]_{s=-2} = \frac{6(3-2)}{(1-2)(4-2)} = -3 \\
a_3 &= [G(s)(s+4)]_{s=-4} = \left[\frac{6(s+3)}{(s+1)(s+2)}\right]_{s=-4} = \frac{6(3-4)}{(1-4)(2-4)} = -1.
\end{aligned}
\]
This readily implies that
\[
G(s) = \frac{4}{s+1} - \frac{3}{s+2} - \frac{1}{s+4}.
\]
Therefore the inverse Laplace transform of $G$ is given by
\[
g(t) = 4e^{-t} - 3e^{-2t} - e^{-4t}.
\]


Example, a differential equation. Now let us use the Laplace transform to solve the differential equation
\[
\ddot y + 3\dot y + 2y = 2 \tag{3.8}
\]
subject to the initial conditions $y(0) = 2$ and $\dot y(0) = -10$. Recall that the Laplace transform of $\dot g$ is $sG(s) - g(0)$, while the Laplace transform of $\ddot g$ is $s^2G(s) - sg(0) - \dot g(0)$. By taking the Laplace transform of both sides of the differential equation in (3.8), we arrive at
\[
s^2Y - sy(0) - \dot y(0) + 3sY - 3y(0) + 2Y = s^2Y + 3sY + 2Y - 2s + 4 = 2/s.
\]
This implies that $(s^2+3s+2)Y = 2s - 4 + 2/s = (2s^2-4s+2)/s$. Hence
\[
Y(s) = \frac{2s^2-4s+2}{s(s^2+3s+2)} = \frac{2s^2-4s+2}{s(s+1)(s+2)}. \tag{3.9}
\]
So to find the solution to the differential equation in (3.8), one simply computes the inverse Laplace transform $y(t)$ of $Y(s)$. The poles of $Y$ are $0$, $-1$ and $-2$. Notice that this $Y$ admits a partial fraction expansion of the form
\[
Y(s) = \frac{a_1}{s} + \frac{a_2}{s+1} + \frac{a_3}{s+2}.
\]
The coefficients are computed by
\[
\begin{aligned}
a_1 &= [Y(s)s]_{s=0} = \left[\frac{2s^2-4s+2}{(s+1)(s+2)}\right]_{s=0} = \frac{2}{(1)(2)} = 1 \\
a_2 &= [Y(s)(s+1)]_{s=-1} = \left[\frac{2s^2-4s+2}{s(s+2)}\right]_{s=-1} = \frac{2+4+2}{(-1)(2-1)} = -8 \\
a_3 &= [Y(s)(s+2)]_{s=-2} = \left[\frac{2s^2-4s+2}{s(s+1)}\right]_{s=-2} = \frac{8+8+2}{(-2)(1-2)} = 9.
\end{aligned}
\]
This readily implies that
\[
Y(s) = \frac{1}{s} - \frac{8}{s+1} + \frac{9}{s+2}.
\]
Therefore the inverse Laplace transform of $Y$ is given by
\[
y(t) = 1 - 8e^{-t} + 9e^{-2t}.
\]
In other words, this $y(t)$ is the solution to the differential equation in (3.8) subject to the initial conditions $y(0) = 2$ and $\dot y(0) = -10$.
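The solution above can be confirmed by direct substitution; the sketch below (my own check) evaluates the residual of the differential equation and the initial conditions.

```python
# Substitute y(t) = 1 - 8e^{-t} + 9e^{-2t} back into (3.8): the residual of
# y'' + 3y' + 2y - 2 should vanish, with y(0) = 2 and y'(0) = -10.
import math

y   = lambda t: 1 - 8*math.exp(-t) + 9*math.exp(-2*t)
yd  = lambda t: 8*math.exp(-t) - 18*math.exp(-2*t)      # y'(t)
ydd = lambda t: -8*math.exp(-t) + 36*math.exp(-2*t)     # y''(t)

residual = max(abs(ydd(t) + 3*yd(t) + 2*y(t) - 2) for t in (0.0, 0.5, 1.0, 2.0, 5.0))
print(residual, y(0.0), yd(0.0))  # residual is numerically zero; y(0) = 2.0, y'(0) = -10.0
```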

Example, complex poles. Let $G$ be the function defined by
\[
G(s) = \frac{20}{s(s^2+2s+5)} = \frac{20}{s(s+1-2i)(s+1+2i)}.
\]


The poles of $G$ are $0$ and $-1 \pm 2i$. Notice that this $G$ admits a partial fraction expansion of the form
\[
G(s) = \frac{a_1}{s} + \frac{a_2}{s+1-2i} + \frac{a_3}{s+1+2i}.
\]
The coefficients are computed by
\[
\begin{aligned}
a_1 &= [G(s)s]_{s=0} = \left[\frac{20}{s^2+2s+5}\right]_{s=0} = \frac{20}{5} = 4 \\
a_2 &= [G(s)(s+1-2i)]_{s=-1+2i} = \left[\frac{20}{s(s+1+2i)}\right]_{s=-1+2i} = \frac{20}{(-1+2i)4i} = -2+i \\
a_3 &= [G(s)(s+1+2i)]_{s=-1-2i} = \left[\frac{20}{s(s+1-2i)}\right]_{s=-1-2i} = \frac{20}{(-1-2i)(-4i)} = -2-i.
\end{aligned}
\]
Notice that $a_3$ is the complex conjugate of $a_2$. This happens because the pole $-1-2i$ corresponding to $a_3$ is the complex conjugate of the pole $-1+2i$ corresponding to $a_2$. The previous calculation readily implies that
\[
G(s) = \frac{4}{s} + \frac{-2+i}{s+1-2i} + \frac{-2-i}{s+1+2i}.
\]
The inverse Laplace transform $g$ of $G$ is computed in the following calculation:
\[
\begin{aligned}
g(t) &= 4 + (-2+i)e^{-(1-2i)t} + (-2-i)e^{-(1+2i)t} = 4 + e^{-t}\left((-2+i)e^{2it} + (-2-i)e^{-2it}\right) \\
&= 4 + e^{-t}\left((-2+i)e^{2it} + \overline{(-2+i)e^{2it}}\right) = 4 + 2e^{-t}\,\Re\!\left((-2+i)e^{2it}\right) \\
&= 4 + 2e^{-t}\,\Re\!\left((-2+i)(\cos(2t)+i\sin(2t))\right) \\
&= 4 - 4e^{-t}\cos(2t) - 2e^{-t}\sin(2t).
\end{aligned}
\]
Therefore the inverse Laplace transform of $G$ is
\[
g(t) = 4 - 4e^{-t}\cos(2t) - 2e^{-t}\sin(2t).
\]

Example, numerator and denominator of the same degree. In this example we will compute the inverse Laplace transform for a rational function whose numerator and denominator have the same degree. Consider the function
\[
G(s) = \frac{2s^2+3s+4}{s^2+3s+2} = \frac{2s^2+3s+4}{(s+1)(s+2)}.
\]
Because the numerator and denominator have the same degree, it follows that $G$ admits a partial fraction expansion of the form
\[
G(s) = a + \frac{b}{s+1} + \frac{c}{s+2}
\]
where $a$, $b$ and $c$ are constants. To calculate $a$, simply evaluate both sides at $s = \infty$, that is,
\[
a = [G(s)]_{s=\infty} = \left.\frac{2s^2+3s+4}{s^2+3s+2}\right|_{s=\infty} = 2.
\]


The coefficient $b$ is computed by
\[
b = [G(s)(s+1)]_{s=-1} = \left.\frac{2s^2+3s+4}{s+2}\right|_{s=-1} = 3.
\]
The coefficient $c$ is given by
\[
c = [G(s)(s+2)]_{s=-2} = \left.\frac{2s^2+3s+4}{s+1}\right|_{s=-2} = -6.
\]
So the partial fraction expansion for $G$ is given by
\[
G(s) = 2 + \frac{3}{s+1} - \frac{6}{s+2}.
\]
Therefore $g(t)$ is given by
\[
g(t) = 2\delta(t) + 3e^{-t} - 6e^{-2t}.
\]

Example, multiple poles. Multiple poles rarely happen in practice, so we will not spend a lot of time investigating them. However, for completeness let us compute the inverse Laplace transform for a function with a repeated pole. Consider the function
\[
G(s) = \frac{1}{s(s+1)^3}.
\]

This $G$ admits a partial fraction expansion of the form
\[
G(s) = \frac{a_1}{s} + \frac{a_2}{(s+1)^3} + \frac{a_3}{(s+1)^2} + \frac{a_4}{s+1}. \tag{3.10}
\]
The poles of $G$ are $0$ and $-1$. The pole at $-1$ occurs three times. The coefficient $a_1$ is computed by
\[
a_1 = [G(s)s]_{s=0} = \left[\frac{1}{(s+1)^3}\right]_{s=0} = 1.
\]
Notice that $a_2$ is the coefficient for $(s+1)^{-3}$. So let us multiply $G$ by $(s+1)^3$ and evaluate at $s = -1$. This yields
\[
a_2 = \left[G(s)(s+1)^3\right]_{s=-1} = \left[\frac{1}{s}\right]_{s=-1} = -1.
\]
Using $a_1 = 1$ and $a_2 = -1$ in (3.10), we see that $G$ is given by
\[
G(s) = \frac{1}{s} - \frac{1}{(s+1)^3} + \frac{a_3}{(s+1)^2} + \frac{a_4}{s+1}. \tag{3.11}
\]
To compute $a_3$, let us move $1/(s+1)^3$ to the other side, that is, let
\[
F(s) = G(s) + \frac{1}{(s+1)^3} = \frac{1}{s(s+1)^3} + \frac{1}{(s+1)^3} = \frac{s+1}{s(s+1)^3} = \frac{1}{s(s+1)^2}.
\]


Hence $F(s) = 1/(s(s+1)^2)$. This and (3.11) imply that
\[
F(s) = \frac{1}{s(s+1)^2} = \frac{1}{s} + \frac{a_3}{(s+1)^2} + \frac{a_4}{s+1}. \tag{3.12}
\]
Notice that $a_3$ is the coefficient for $(s+1)^{-2}$. So let us multiply $F$ by $(s+1)^2$ and evaluate at $s = -1$. This yields
\[
a_3 = \left[F(s)(s+1)^2\right]_{s=-1} = \left[\frac{1}{s}\right]_{s=-1} = -1.
\]
By consulting (3.12), we see that
\[
F(s) = \frac{1}{s} - \frac{1}{(s+1)^2} + \frac{a_4}{s+1}. \tag{3.13}
\]
To obtain $a_4$, let us move $1/(s+1)^2$ to the other side, that is, let
\[
Q(s) = F(s) + \frac{1}{(s+1)^2} = \frac{1}{s(s+1)^2} + \frac{1}{(s+1)^2} = \frac{s+1}{s(s+1)^2} = \frac{1}{s(s+1)}.
\]
Hence $Q(s) = 1/(s(s+1))$. This and (3.13) imply that
\[
Q(s) = \frac{1}{s(s+1)} = \frac{1}{s} + \frac{a_4}{s+1}. \tag{3.14}
\]
Notice that $a_4$ is the coefficient for $(s+1)^{-1}$. So let us multiply $Q$ by $(s+1)$ and evaluate at $s = -1$. This yields
\[
a_4 = [Q(s)(s+1)]_{s=-1} = \left[\frac{1}{s}\right]_{s=-1} = -1.
\]
By consulting (3.10), we see that
\[
G(s) = \frac{1}{s} - \frac{1}{(s+1)^3} - \frac{1}{(s+1)^2} - \frac{1}{s+1}.
\]
This and $(\mathcal{L}t^ne^{-at})(s) = n!/(s+a)^{n+1}$ readily imply that
\[
g(t) = 1 - \frac{t^2e^{-t}}{2} - te^{-t} - e^{-t}.
\]

4.3.1 Complex poles and the residue command in Matlab

In this section, we will show how one can compute the inverse Laplace transform for $G(s)$ when the roots come in complex conjugate pairs. Let us begin with the following result.

LEMMA 4.3.1 Assume that $F(s)$ is of the form
\[
F(s) = \frac{\gamma}{s-a-i\omega} + \frac{\bar\gamma}{s-a+i\omega} \tag{3.15}
\]
where $a$ and $\omega$ are real numbers, and $\gamma$ is a complex number. Then the inverse Laplace transform $f(t)$ of $F(s)$ is given by
\[
f(t) = 2\Re(\gamma)e^{at}\cos(\omega t) - 2\Im(\gamma)e^{at}\sin(\omega t). \tag{3.16}
\]


Proof. By taking the inverse Laplace transform of $F(s)$, we obtain
\[
f(t) = \gamma e^{at}e^{i\omega t} + \bar\gamma e^{at}e^{-i\omega t}.
\]
Notice that the second term is the complex conjugate of the first term. If $z$ is a complex number, then $2\Re(z) = z + \bar z$. Using this with $\gamma = \alpha + i\beta$ where $\alpha = \Re\gamma$ and $\beta = \Im\gamma$, we see that
\[
f(t) = 2\Re\!\left(\gamma e^{at}e^{i\omega t}\right) = 2e^{at}\,\Re\!\left((\alpha+i\beta)(\cos(\omega t)+i\sin(\omega t))\right)
= 2e^{at}\left(\alpha\cos(\omega t) - \beta\sin(\omega t)\right) = 2\Re(\gamma)e^{at}\cos(\omega t) - 2\Im(\gamma)e^{at}\sin(\omega t).
\]
This yields (3.16) and completes the proof.
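The lemma is easy to check pointwise; the sketch below (my own, using $\gamma = -2+i$, $a = -1$, $\omega = 2$, the values from the complex-poles example above) compares formula (3.16) with the term-by-term inverse transform of (3.15).

```python
# Pointwise check of Lemma 4.3.1: (3.16) versus
# gamma e^{(a + i omega)t} + conj(gamma) e^{(a - i omega)t}.
import cmath, math

gamma, a, w = complex(-2, 1), -1.0, 2.0

def f_formula(t):   # right-hand side of (3.16)
    return 2*gamma.real*math.exp(a*t)*math.cos(w*t) - 2*gamma.imag*math.exp(a*t)*math.sin(w*t)

def f_direct(t):    # inverse transform of (3.15), term by term
    return gamma*cmath.exp((a + 1j*w)*t) + gamma.conjugate()*cmath.exp((a - 1j*w)*t)

worst = max(abs(f_direct(t) - f_formula(t)) for t in (0.0, 0.3, 1.0, 2.5))
print(worst)  # numerically zero; the imaginary parts cancel
```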

Figure 4.1: The graph of g(t) (amplitude g(t) versus time in seconds, plotted for t from 0 to 6).

An example. Let us find $g(t)$, the inverse Laplace transform of the function $G(s)$ given by
\[
G(s) = \frac{12(s+4)}{(s^2+2s+5)(s^2+2s+2)} = \frac{12s+48}{s^4+4s^3+11s^2+14s+10}.
\]
A simple calculation shows that
\[
(s^2+2s+5)(s^2+2s+2) = s^4+4s^3+11s^2+14s+10.
\]
One can compute the previous fourth order polynomial by using the conv command in Matlab, that is,
\[
\mathrm{conv}([1, 2, 5], [1, 2, 2]) = \begin{bmatrix} 1 & 4 & 11 & 14 & 10 \end{bmatrix}.
\]
The conv command multiplies two polynomials. In Matlab notation, the polynomial $s^2+2s+5$ is represented by the row vector $[1, 2, 5]$, and $s^2+2s+2$ is represented by $[1, 2, 2]$.


Matlab represents a polynomial of degree $n$ by a row vector of length $n+1$ containing the coefficients of the polynomial in decreasing order. In our case, the vector $[1\ 4\ 11\ 14\ 10]$ in Matlab corresponds to the fourth order polynomial $s^4+4s^3+11s^2+14s+10$.

To compute $g(t)$, first observe that
\[
\begin{aligned}
s^2+2s+5 &= (s+1-2i)(s+1+2i) \\
s^2+2s+2 &= (s+1-i)(s+1+i) \\
G(s) &= \frac{12(s+4)}{(s+1-2i)(s+1+2i)(s+1-i)(s+1+i)}.
\end{aligned}
\]
Hence $G(s)$ admits a partial fraction expansion of the form
\[
\frac{12(s+4)}{(s^2+2s+5)(s^2+2s+2)} = \frac{r_1}{s+1-2i} + \frac{r_2}{s+1+2i} + \frac{r_3}{s+1-i} + \frac{r_4}{s+1+i}. \tag{3.17}
\]
To find $r_j$, simply observe that
\[
\begin{aligned}
r_1 &= \left.\frac{12(s+4)}{(s+1+2i)(s^2+2s+2)}\right|_{s=-1+2i} = -2+3i \\
r_2 &= \bar r_1 = -2-3i \\
r_3 &= \left.\frac{12(s+4)}{(s^2+2s+5)(s+1+i)}\right|_{s=-1+i} = 2-6i \\
r_4 &= \bar r_3 = 2+6i.
\end{aligned}
\]
(The $r_j$ corresponding to complex conjugate pairs of roots are the complex conjugates of each other.) According to Lemma 4.3.1, we see that
\[
g(t) = 2e^{-t}\Re(-2+3i)\cos(2t) - 2e^{-t}\Im(-2+3i)\sin(2t) + 2e^{-t}\Re(2-6i)\cos(t) - 2e^{-t}\Im(2-6i)\sin(t).
\]
By rearranging terms, this readily implies that
\[
g(t) = 4e^{-t}\cos(t) + 12e^{-t}\sin(t) - 4e^{-t}\cos(2t) - 6e^{-t}\sin(2t). \tag{3.18}
\]
Finally, it is noted that the real part $-1$ of the complex roots determines how fast the function $g(t)$ decays, while the imaginary parts $\{\pm 1, \pm 2\}$ correspond to the frequencies of oscillation for $g(t)$.

One can use the residue command in Matlab to compute $\{r_j\}_1^4$. In fact, let num be the numerator of $G$ and den the denominator of $G$ in Matlab notation, that is,

num = [12, 48]; den = conv([1, 2, 5], [1, 2, 2]);
[r, p] = residue(num, den);

Here
\[
G = \mathrm{tf}(\mathrm{num}, \mathrm{den}) = \frac{12s+48}{s^4+4s^3+11s^2+14s+10}.
\]


The polynomial $12s+48$ is represented in Matlab by $[12, 48]$. The result of the Matlab residue command is
\[
r = \begin{bmatrix} -2+3i \\ -2-3i \\ 2-6i \\ 2+6i \end{bmatrix} \quad\text{and}\quad p = \begin{bmatrix} -1+2i \\ -1-2i \\ -1+i \\ -1-i \end{bmatrix}.
\]
The vector p contains the roots of the denominator. The roots of the denominator are also called the poles of $G(s)$. The vector r respectively contains the corresponding residues; see equation (3.17). Using this we see that
\[
g(t) = \sum_{k=1}^{4} r_k e^{p_k t} = (-2+3i)e^{(-1+2i)t} + (-2-3i)e^{(-1-2i)t} + (2-6i)e^{(-1+i)t} + (2+6i)e^{(-1-i)t}
= 4e^{-t}\cos(t) + 12e^{-t}\sin(t) - 4e^{-t}\cos(2t) - 6e^{-t}\sin(2t).
\]

The graph of $g(t)$ is given in Figure 4.1. To plot the graph of $g$ we used the impulse command in Matlab. The impulse command will plot the inverse Laplace transform of any function $G(s)$. This command automatically chooses the time scale. We also plotted $g(t)$ in (3.18) on the same graph. As expected, the two graphs are identical. The Matlab commands we used are given by

impulse(num, den); grid
t = linspace(0, 6, 2^14);
g = 4*exp(-t).*cos(t) + 12*exp(-t).*sin(t) - 4*exp(-t).*cos(2*t) - 6*exp(-t).*sin(2*t);
hold on; plot(t, g, 'r');
title('The plot of g(t)'); ylabel('g(t)');

One can also use the residue command to compute $g(t)$ in Matlab, that is,

num = [12, 48]; den = conv([1, 2, 5], [1, 2, 2]);
[r, p] = residue(num, den);
gr = r(1)*exp(p(1)*t);
for k = 2:4; gr = gr + r(k)*exp(p(k)*t); end
norm(g - gr)      % ans = 7.2981e-13
norm(imag(gr))    % ans = 1.8018e-14

The previous calculations show that the gr computed from the residue command in Matlab is the same as $g(t)$ up to $7.2981\times 10^{-13}$, which is numerically zero in Matlab. Moreover, as expected, the imaginary part of gr is numerically zero: $1.8018\times 10^{-14}$.
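The same residues can be cross-checked with a short stdlib sketch (this is Python of my own, not a Matlab run): with the poles known, each residue equals $\mathrm{num}(p)/d'(p)$, where $d'(p)$ is the product over the other poles of $(p - q)$.

```python
# Residues of (12s + 48)/prod(s - p_k) at the four known poles.
num = [12, 48]                       # 12s + 48, highest power first
poles = [-1+2j, -1-2j, -1+1j, -1-1j]

def polyval(c, s):
    v = 0
    for a in c:
        v = v*s + a                  # Horner evaluation
    return v

res = []
for p in poles:
    dprime = 1
    for q in poles:
        if q != p:
            dprime *= (p - q)        # d'(p) for a monic d with simple roots
    res.append(polyval(num, p)/dprime)

print([complex(round(r.real, 6), round(r.imag, 6)) for r in res])
# → [(-2+3j), (-2-3j), (2-6j), (2+6j)]
```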


The residue command for a proper rational function. Let us find the inverse Laplace transform $g(t)$ for the function
\[
G(s) = \frac{4s^2-8s+8}{s^2+4s+20} = \frac{4s^2-8s+8}{(s+2-4i)(s+2+4i)}.
\]
Because the numerator and denominator of $G(s)$ are of the same degree, $G(s)$ admits a partial fraction expansion of the form
\[
G(s) = k + \frac{r_1}{s+2-4i} + \frac{r_2}{s+2+4i}.
\]

Moreover, $k$, $r_1$ and $r_2$ are given by
\[
\begin{aligned}
k &= \lim_{s\to\infty} \frac{4s^2-8s+8}{s^2+4s+20} = 4 \\
r_1 &= \left.\frac{4s^2-8s+8}{s+2+4i}\right|_{s=-2+4i} = -12+3i \\
r_2 &= \bar r_1 = -12-3i.
\end{aligned}
\]
In other words,
\[
G(s) = k + \frac{-12+3i}{s+2-4i} + \frac{-12-3i}{s+2+4i}. \tag{3.19}
\]
By consulting Lemma 4.3.1, we see that
\[
g(t) = 4\delta(t) - 24e^{-2t}\cos(4t) - 6e^{-2t}\sin(4t).
\]

In this setting, the residue command in Matlab can also be used to compute $k$, $r_1$ and $r_2$, that is, in Matlab

num = [4, -8, 8]; den = [1, 4, 20];
[r, p, k] = residue(num, den);

The result is
\[
r = \begin{bmatrix} -12+3i \\ -12-3i \end{bmatrix}, \quad p = \begin{bmatrix} -2+4i \\ -2-4i \end{bmatrix} \quad\text{and}\quad k = 4.
\]
This yields the form of the partial fraction expansion for $G(s)$ in (3.19). Finally, it is noted that the real part $-2$ of the complex roots determines how fast $g(t)$ decays, while the imaginary parts $\pm 4$ correspond to the frequency of oscillation for $g(t)$.

4.3.2 Exercise

Problem 1. Find the inverse Laplace transform for the function G given by

\[
G(s) = \frac{2s+6}{(s+1)(s^2+4s+5)}.
\]

Problem 2. Find the inverse Laplace transform for the function G given by

\[
G(s) = \frac{s-4}{(s+1)(s^2+2s+10)}.
\]


Problem 3. Find the inverse Laplace transform for the function G given by

\[
G(s) = \frac{s^2+2s+3}{(s^2-1)(s+2)}.
\]

Problem 4. Find the inverse Laplace transform for the function G given by

\[
G(s) = \frac{s^2+2s+3}{s^2+s-2}.
\]
Hint: notice that $s^2+s-2 = (s-1)(s+2)$. Decompose $G$ into a function of the form
\[
G(s) = \gamma + \frac{a}{s-1} + \frac{b}{s+2}
\]
where $a$, $b$ and $\gamma$ are constants. Moreover,
\[
\gamma = \left.\frac{s^2+2s+3}{s^2+s-2}\right|_{s=\infty}, \qquad a = \left.\frac{s^2+2s+3}{s+2}\right|_{s=1} \qquad\text{and}\qquad b = \left.\frac{s^2+2s+3}{s-1}\right|_{s=-2}.
\]

Problem 5. Use the Laplace transform to solve the following differential equation

\[
\ddot y + 4\dot y + 5y = 4e^{-3t}
\]

where all the initial conditions are equal to zero.

Problem 6. Solve the following differential equation

\[
\ddot y + 4y = 16
\]

where all the initial conditions are equal to zero.

Problem 7. Use the Laplace transform to solve the following differential equation

\[
\ddot y + 2\zeta\omega_n\dot y + \omega_n^2 y = 0.
\]
Here the initial conditions $y(0)$ and $\dot y(0)$ are specified. Moreover, the damping ratio $\zeta$ and natural frequency $\omega_n$ are real numbers satisfying $|\zeta| < 1$ and $\omega_n > 0$.

Problem 8. Use the residue command in Matlab to find the inverse Laplace transform g(t)of the function

\[
G(s) = \frac{17s^4+176s^3+1236s^2+3356s+5239}{s^5+19s^4+188s^3+836s^2+1771s+1105}.
\]

Use the impulse command in Matlab to graph g(t). Plot g(t) = L−1(G) on the same graph.

Problem 9. Find the inverse Laplace transform for the function

\[
G(s) = \frac{20}{s(s^2+2s+5)}.
\]

Problem 10. Solve the following differential equation where the initial conditions are all zero:
\[
\ddot y + y = 10\int_0^t \sin(2t-2\tau)\delta(\tau)\,d\tau.
\]


4.4 Transfer functions

A system is nothing more than a map from an input signal or forcing function u to anoutput signal y. The input u(t) and output y(t) are functions in time t. If the system islinear and time invariant, then the transfer function G is defined by G(s) = Y (s)/U(s) whereY is the Laplace transform of the output y and U is the Laplace transform of the input u.Clearly, Y (s) = G(s)U(s). By taking the inverse Laplace transform of the output, we obtainy(t) = (g ⊗ u)(t). In other words, the corresponding system with input u and output y isgiven by

y(t) = (g ⊗ u)(t) = ∫_0^t g(t − σ)u(σ) dσ = ∫_0^t g(σ)u(t − σ) dσ . (4.1)

Notice that if u = δ, the Dirac delta function, then the output y = g = g ⊗ δ. For this reason g is called the impulse response for the transfer function G. In other words, the impulse response g is simply the inverse Laplace transform of G. Finally, it is noted that if the transfer function G(s) = 1, or equivalently, g(t) = δ(t), then the output y = u.
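The convolution in (4.1) is easy to approximate numerically. The following sketch (in Python rather than the Matlab used elsewhere in these notes, purely for illustration) uses a left Riemann sum; the hypothetical choice g(t) = e^{−t}, u(t) = 1 has the exact convolution 1 − e^{−t}.

```python
import math

def convolve(g, u, t, n=2000):
    """Left Riemann sum approximation of (g * u)(t) = int_0^t g(t - s)u(s) ds."""
    h = t / n
    return sum(g(t - k * h) * u(k * h) for k in range(n)) * h

# Hypothetical example: g(t) = e^{-t} and u(t) = 1, so (g * u)(t) = 1 - e^{-t}.
y2 = convolve(lambda t: math.exp(-t), lambda t: 1.0, 2.0)
print(abs(y2 - (1 - math.exp(-2))) < 1e-2)  # → True
```

Refining the step size h drives the sum toward the exact convolution integral.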

For an example of a transfer function consider the differential equation

y^{(3)} + 8y'' − 3y' + 2y = 4u .

The n-th derivative of y is denoted by y^{(n)}. Here u is the input or forcing function and y is the output. To find the transfer function, set all the initial conditions equal to zero and take the Laplace transform. This yields

s^3 Y + 8s^2 Y − 3sY + 2Y = 4U .

Hence (s^3 + 8s^2 − 3s + 2)Y = 4U. By dividing by the appropriate terms, we see that the transfer function for this differential equation is given by

G(s) = Y(s)/U(s) = 4/(s^3 + 8s^2 − 3s + 2).
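Numerically, this transfer function can be sampled at any complex s by evaluating its denominator with Horner's rule (an illustrative Python sketch, not part of the original notes; the coefficient list is read off the differential equation above).

```python
def polyval(coeffs, s):
    """Evaluate a polynomial (coefficients listed highest power first) at s."""
    acc = 0j
    for c in coeffs:
        acc = acc * s + c
    return acc

def G(s):
    # G(s) = 4 / (s^3 + 8s^2 - 3s + 2)
    return 4.0 / polyval((1.0, 8.0, -3.0, 2.0), s)

print(G(0))  # → (2+0j), since G(0) = 4/2
```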

4.5 An elementary RCL circuit

For another example of a transfer function, let us consider a simple electrical circuit. To this end, recall that the voltage v across a resistor R with current i◦ is given by

v = Ri◦. (5.1)

The circuit for a resistor R with current i◦ is presented in Figure 4.2. The unit for resistance is ohms, denoted by Ω. Because we do not want to confuse the current with i = √−1, we will use i◦ for the current in this section. (Electrical engineers use i for current and j = √−1.)

Figure 4.2: Resistor v = Ri◦


The voltage v across a capacitor is given by

v = (1/C) ∫ i◦(t) dt, or equivalently, i◦ = C dv/dt. (5.2)

The circuit for a capacitor C with current i◦ is presented in Figure 4.3. The unit for the capacitor is the farad, denoted by F.

Figure 4.3: Capacitor v = (1/C) ∫ i◦(t) dt

The voltage across an inductor is given by

v = L di◦/dt, or equivalently, i◦ = (1/L) ∫ v(t) dt . (5.3)

The circuit for an inductor L with current i◦ is presented in Figure 4.4. The unit for an inductor is the henry, denoted by H.

Figure 4.4: Inductor v = L di◦/dt

Recall that Kirchhoff’s voltage law states that the sum of the voltages around any loop of a circuit is zero, that is,

∑_{k=1}^n v_k = 0

where {v_k}_{k=1}^n are the voltages in the loop of a circuit. Kirchhoff’s current law states that the sum of the currents flowing in and out of a node of a circuit is zero, that is,

∑_{k=1}^n i_k = 0

where {i_k}_{k=1}^n are the currents at the node of a circuit.


Figure 4.5: A resistor, inductor and capacitor

For an example of a transfer function, consider a circuit consisting of a single loop containing a resistor with R ohms, an inductor with L henrys and a capacitor with C farads. The input u is a voltage source. The output y is the voltage across the capacitor; see Figure 4.5. Finally, the current is i◦. By applying Kirchhoff’s voltage law to this circuit, we obtain the following equations

Ri◦ + L di◦/dt + (1/C) ∫_0^t i◦(t) dt = u and y(t) = (1/C) ∫_0^t i◦(t) dt .

By taking the Laplace transform with all the initial conditions set equal to zero, we have

(R + Ls + 1/(Cs)) I◦(s) = U(s) and Y(s) = I◦(s)/(Cs) .

This readily implies that

I◦(s) = U(s)/(R + Ls + 1/(Cs)) = Cs U(s)/(LCs^2 + RCs + 1). (5.4)

Hence Y(s) = (Ly)(s) is given by

Y(s) = I◦(s)/(Cs) = U(s)/(LCs^2 + RCs + 1).

Therefore the transfer function G for the circuit in Figure 4.5 is given by

G(s) = Y(s)/U(s) = 1/(LCs^2 + RCs + 1). (5.5)
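Since the denominator of (5.5) is quadratic, the poles of G(s) follow directly from the quadratic formula. A small Python check (illustrative; the values R = 0.2, L = C = 1 match the worked example later in this section):

```python
import cmath

def rcl_poles(R, L, C):
    """Roots of L*C*s^2 + R*C*s + 1 = 0 via the quadratic formula."""
    a, b, c = L * C, R * C, 1.0
    d = cmath.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

p1, p2 = rcl_poles(0.2, 1.0, 1.0)
print(p1, p2)  # approximately -0.1 + 0.995i and -0.1 - 0.995i
```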

An example with u(t) = 12. Now assume that the input u(t) = 12 volts. (A car battery has 12 volts.) Let us compute the voltage across the capacitor as t tends to infinity, that is, y(∞). Since u(t) = 12, we obtain

Y(s) = G(s)U(s) = G(s) (12/s) = 12/((LCs^2 + RCs + 1)s).

Let {λ1, λ2} be the roots of s^2 + (R/L)s + 1/(LC), that is,

s^2 + (R/L)s + 1/(LC) = (s − λ1)(s − λ2).


Then Y(s) admits a partial fraction expansion of the form

Y(s) = 12/s + a/(s − λ1) + b/(s − λ2)   if λ1 ≠ λ2
     = 12/s + a/(s − λ1) + b/(s − λ1)^2   if λ1 = λ2

where a and b are constants. Therefore

y(t) = 12 + a e^{λ1 t} + b e^{λ2 t}   if λ1 ≠ λ2
     = 12 + a e^{λ1 t} + b t e^{λ1 t}   if λ1 = λ2.

Because LC > 0 and RC > 0, Problem 1 in Section 4.6.1 shows that the roots {λ1, λ2} of LCs^2 + RCs + 1 have negative real part, that is, ℜ(λ1) < 0 and ℜ(λ2) < 0. (In other words, λ1 and λ2 are stable.) Since ℜ(λ1) < 0 and ℜ(λ2) < 0, we see that

12 = lim_{t→∞} y(t).

In other words, the voltage across the capacitor will eventually converge to 12 volts. If the roots of s^2 + (R/L)s + 1/(LC) contain nonzero complex numbers, then y(t) will oscillate to 12 volts with a frequency of |ℑ(λ1)| and will converge to 12 on the order of e^{ℜ(λk)t}.

To see what happens in a specific example, assume that L = 1, C = 1 and R = 1/5. If u(t) = 12, then the output Y(s) with all the initial conditions set equal to zero in the s domain is given by

Y(s) = 12/((s^2 + 0.2s + 1)s) = 12/s + (−6 + 0.6030i)/(s + 0.1 − 0.995i) + (−6 − 0.6030i)/(s + 0.1 + 0.995i). (5.6)

Here we used the residue command in Matlab to compute the poles {0, −0.1 ± 0.995i} for G(s)(12/s), or equivalently, the roots of (s^2 + 0.2s + 1)s, that is,

[r, p] = residue([12], [1, .2, 1, 0]);

r = [−6.0000 + 0.6030i; −6.0000 − 0.6030i; 12.0000] and p = [−0.1000 + 0.9950i; −0.1000 − 0.9950i; 0].

Using the quadratic formula, the roots of s^2 + 0.2s + 1 are given by

−0.1 ± √0.99 i ≈ −0.1 ± 0.995i.

As expected, the answers are the same. This yields the partial fraction expansion in (5.6). Hence the voltage y(t) across the capacitor is determined by

y(t) = 12 + (−6 + 0.603i)e^{(−0.1+0.995i)t} + (−6 − 0.603i)e^{(−0.1−0.995i)t}
     = 12 − 12e^{−0.1t} cos(√0.99 t) − 1.206e^{−0.1t} sin(√0.99 t); (5.7)
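The residues in (5.6) can be cross-checked by the cover-up rule: the residue of 12/((s − λ1)(s − λ2)s) at a simple pole is 12 divided by the product of the differences to the other poles. A quick Python verification (illustrative; it reproduces the Matlab residue output quoted in this section):

```python
import math

# Poles of Y(s) = 12 / ((s^2 + 0.2 s + 1) s) are 0 and -0.1 ± i*sqrt(0.99).
lam1 = complex(-0.1, math.sqrt(0.99))
lam2 = lam1.conjugate()

r1 = 12 / (lam1 * (lam1 - lam2))  # residue at lam1
r0 = 12 / (lam1 * lam2)           # residue at s = 0; note lam1*lam2 = 1
print(r1, r0)  # r1 ≈ -6 + 0.603i and r0 ≈ 12, matching (5.6)
```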


see Lemma 4.3.1. Using the step command in Matlab, we plotted y(t) when u(t) = 12. The commands we used are given by

num = 1; den = [1, 0.2, 1]; G = tf(num, den)

G = 1/(s^2 + 0.2s + 1)

step(12*G); grid
% The step command plots the output y(t) for the input u(t) = 1.
t = linspace(0, 60, 2^14);
y = 12 - 12*exp(-0.1*t).*cos(0.995*t) - 1.2060*exp(-0.1*t).*sin(0.995*t);
hold on; plot(t, y, 'r')
% The actual solution y(t) from (5.7) is plotted and yields the same graph.
title('The output y(t) for u(t) = 12')

The output y(t) for u(t) = 12 with all the initial conditions set equal to zero is presented in Figure 4.6. As expected, y(t) converges to 12 as t tends to infinity. Notice that the transfer function G(s) has two poles −0.1 ± √0.99 i. The output y(t) oscillates to 12 at the frequency of √0.99, and y(t) converges to 12 on the order of e^{−0.1t}.

Figure 4.6: The output y(t) for u(t) = 12.

The impulse response. Now let us assume that the circuit in Figure 4.5 is hit by lightning, that is, u(t) = δ(t) and all the initial conditions are zero. Then the output y(t), the voltage across the capacitor, in the s domain is given by

Y(s) = G(s)U(s) = G(s) × 1 = 1/(LCs^2 + RCs + 1).


As before, let {λ1, λ2} be the roots of s^2 + (R/L)s + 1/(LC). Then Y(s) admits a partial fraction expansion of the form

Y(s) = a/(s − λ1) + b/(s − λ2)   if λ1 ≠ λ2
     = a/(s − λ1) + b/(s − λ1)^2   if λ1 = λ2

where a and b are constants. Therefore

y(t) = a e^{λ1 t} + b e^{λ2 t}   if λ1 ≠ λ2
     = a e^{λ1 t} + b t e^{λ1 t}   if λ1 = λ2.

It is emphasized that there is no Dirac delta function δ(t) contained in y(t). Certainly, there is voltage across the capacitor due to the lightning strike. However, there is no δ(t) voltage across the capacitor. For a concrete example, consider the previous case when L = 1, C = 1 and R = 0.2. In this case,

Y(s) = 1/(s^2 + 0.2s + 1) = 1/((s + 0.1)^2 + 0.99).

By taking the inverse Laplace transform, we see that

y(t) = (1/√0.99) e^{−t/10} sin(√0.99 t).

The plot of y(t) with u(t) = δ(t) is given in Figure 4.7. The Matlab commands we used to generate this graph are given by

num = 1; den = [1, .2, 1]; G = tf(num, den)

G = 1/(s^2 + 0.2s + 1)

impulse(G); grid
% The impulse command plots y(t) for u(t) = δ(t).
t = linspace(0, 60, 2^14);
y = exp(-t/10).*sin(sqrt(0.99)*t)/sqrt(0.99);
hold on; plot(t, y, 'r')
% y(t) was plotted to show the impulse command yields the same answer.
max(abs(y)), ans = 0.8583

Matlab shows that the maximum voltage across the capacitor is 0.8583; see also Figure 4.7. Even though lightning hits this circuit, the voltage across the capacitor is less than one volt. So where did the voltage from the lightning go? It turns out that the voltage from the lightning hit the inductor.


Figure 4.7: The impulse response for y(t).

The transfer function for the voltage across the inductor. As before, consider the circuit presented in Figure 4.5. Now assume that the output y_L(t) is the voltage across the inductor. By consulting (5.4), we see that

Y_L(s) = Ls I◦(s) = (LCs^2/(LCs^2 + RCs + 1)) U(s).

Hence the transfer function from u to y_L, the voltage across the inductor, is given by

Y_L(s)/U(s) = G_L(s) = LCs^2/(LCs^2 + RCs + 1). (5.8)

Now assume that the input u(t) = 12 volts. Let us find the voltage across the inductor as t tends to infinity, that is, y_L(∞). Since u(t) = 12, we see that

Y_L(s) = G_L(s)U(s) = G_L(s) (12/s) = 12LCs/(LCs^2 + RCs + 1).

Let {λ1, λ2} be the roots of s^2 + (R/L)s + 1/(LC). Then Y_L(s) admits a partial fraction expansion of the form

Y_L(s) = a/(s − λ1) + b/(s − λ2)   if λ1 ≠ λ2
       = a/(s − λ1) + b/(s − λ1)^2   if λ1 = λ2

where a and b are constants. Therefore

y_L(t) = a e^{λ1 t} + b e^{λ2 t}   if λ1 ≠ λ2
       = a e^{λ1 t} + b t e^{λ1 t}   if λ1 = λ2.


Because LC > 0 and RC > 0, Problem 1 in Section 4.6.1 shows that the roots {λ1, λ2} of LCs^2 + RCs + 1 have negative real part, that is, ℜ(λ1) < 0 and ℜ(λ2) < 0. Since ℜ(λ1) < 0 and ℜ(λ2) < 0, we see that

0 = lim_{t→∞} y_L(t).

In other words, the voltage across the inductor will eventually converge to 0 volts. This makes sense because the voltage across the capacitor converges to 12. So at time infinity Kirchhoff’s voltage law shows that the voltage across the resistor and the inductor must both be zero. Finally, it is noted that if the roots of s^2 + (R/L)s + 1/(LC) contain nonzero complex numbers, then y_L(t) will oscillate with a frequency of |ℑ(λ1)| and will converge to zero on the order of e^{ℜ(λ1)t}.

Now assume that the circuit gets hit by lightning, that is, u(t) = δ(t). Then the voltage across the inductor in the s domain is given by

Y_L(s) = G_L(s)U(s) = G_L(s) × 1 = LCs^2/(LCs^2 + RCs + 1).

As before, let {λ1, λ2} be the roots of s^2 + (R/L)s + 1/(LC). Because the numerator and denominator are both of the same degree and Y_L(∞) = 1, we see that Y_L(s) admits a partial fraction expansion of the form

Y_L(s) = 1 + a/(s − λ1) + b/(s − λ2)   if λ1 ≠ λ2
       = 1 + a/(s − λ1) + b/(s − λ1)^2   if λ1 = λ2

where a and b are constants. Therefore

y_L(t) = δ(t) + a e^{λ1 t} + b e^{λ2 t}   if λ1 ≠ λ2
       = δ(t) + a e^{λ1 t} + b t e^{λ1 t}   if λ1 = λ2.

It is emphasized that there is a Dirac delta function δ(t) contained in y_L(t). Therefore the full impact of the lightning hits the inductor. As expected, there is no Dirac delta function in the voltage across the resistor.
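The constant 1 in the expansion of Y_L(s) is just the quotient from dividing the numerator by the denominator, since both have degree two. A one-line numerical check of the division identity LCs^2/(LCs^2 + RCs + 1) = 1 − (RCs + 1)/(LCs^2 + RCs + 1), in Python with the sample values L = C = 1, R = 0.2 used earlier:

```python
L, R, C = 1.0, 0.2, 1.0
s0 = complex(0.3, 0.7)  # arbitrary test point in the s plane
den = L * C * s0**2 + R * C * s0 + 1
num = L * C * s0**2
# Long-division identity behind the "1 + ..." partial fraction form:
assert abs(num / den - (1 - (R * C * s0 + 1) / den)) < 1e-12
print("identity holds at", s0)
```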

4.5.1 Exercise

Problem 1. Find the transfer function for the circuit given in Figure 4.5 where the output y is the voltage measured across the resistor R.

• If u(t) = 12, then find lim_{t→∞} y(t).

• Assume that lightning hits the circuit, that is, u(t) = δ(t). Does the Dirac delta function δ(t) appear in the output y(t)? Explain why or why not.

• Assume L = 1, C = 1 and R = 1/5. Find the solution y(t) when u(t) = 12. Plot your solution in Matlab.


Problem 2. Construct a circuit consisting of a resistor and a capacitor such that the transfer function for this circuit is given by

G(s) = 1/(αs + 1)

where α > 0 is a constant.

Problem 3. Construct a circuit consisting of a resistor, an inductor and a capacitor such that the transfer function for this circuit is given by

G(s) = α/(s^2 + βs + α)

where α > 0 and β > 0 are constants.

Problem 4. Construct a circuit such that the transfer function for this circuit is given by

G(s) = s^2/(s^2 + βs + α)

where α > 0 and β > 0 are constants.

4.6 The Final Value Theorem

Motivated by our analysis of u(t) = 12 volts acting on a circuit, let us present a general result to compute y(∞), known as the final value theorem. Recall that the poles of a transfer function are the roots of its denominator. For example, the poles of the transfer function

G(s) = (s − 4)/((s^2 + 5s + 6)(s + 7)(s^2 + 2s + 2)) (6.1)

are {−2, −3, −7, −1 ± i}, which are the roots of (s^2 + 5s + 6)(s + 7)(s^2 + 2s + 2).

Definition. A transfer function G(s) is stable if all of its poles are contained in the open left half plane {λ ∈ C : ℜ(λ) < 0}.

For example, the transfer function G(s) in (6.1) is stable. Consider the transfer function

F(s) = (s + 4)/((s^2 + 5s + 6)(s + 7)(s^2 − 2s + 2)).

The poles of F(s) are given by {−2, −3, −7, 1 ± i}. Due to the poles 1 ± i, the transfer function F(s) is unstable. The following is a fundamental result in systems theory.

THEOREM 4.6.1 (Final value theorem) Let G(s) be a stable transfer function with input u and output y, that is, Y(s) = G(s)U(s). If the input u(t) = u0, a constant for all t, then

G(0)u0 = lim_{t→∞} y(t). (6.2)


Proof. Notice that Y(s) = G(s)U(s) admits a partial fraction expansion of the form

Y(s) = G(s)U(s) = G(s) u0/s = G(0)u0/s + Ψ(s)

where Ψ(s) is a stable rational function whose poles are contained in the poles of G. In the time domain y(t) = G(0)u0 + ψ(t), where ψ is the inverse Laplace transform of Ψ. Moreover, ψ(t) = ∑ t^{k_j} e^{λ_j t} γ_{j,k} where {λ_j} are the poles of G. Since G is stable, ψ(t) → 0 as t tends to infinity. Therefore

lim_{t→∞} y(t) = lim_{t→∞} (G(0)u0 + ψ(t)) = G(0)u0.

This yields equation (6.2), and completes the proof.

For example, consider the transfer function

G(s) = Y(s)/U(s) = 1/(LCs^2 + RCs + 1)

determined by the resistor, capacitor and inductor circuit in Figure 4.5. Because LC > 0 and RC > 0, Problem 1 in Section 4.6.1 shows that the roots of the polynomial LCs^2 + RCs + 1 are contained in the open left half plane {λ ∈ C : ℜ(λ) < 0}. Therefore G(s) is stable. Notice that

G(0) = 1/(LCs^2 + RCs + 1) |_{s=0} = 1.

So if u(t) = 12, then the final value theorem yields

y(∞) = lim_{t→∞} y(t) = G(0) × 12 = 12.

Finally, it is emphasized that the poles of G(s) tell us how fast y(t) converges to 12 and at what frequency.
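The theorem can be sanity-checked numerically with the closed-form output (5.7) for the RCL example, where G(0) = 1 and u0 = 12 (an illustrative Python check):

```python
import math

def y(t):
    # Closed-form step response from (5.7) for L = C = 1, R = 0.2, u = 12.
    w = math.sqrt(0.99)
    return (12 - 12 * math.exp(-0.1 * t) * math.cos(w * t)
               - 1.206 * math.exp(-0.1 * t) * math.sin(w * t))

# The transient decays like e^{-0.1 t}, so y(200) equals 12 to many digits.
print(round(y(200.0), 6))  # → 12.0, in agreement with G(0) * 12 = 12
```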

If the transfer function G is not stable and u(t) = u0, then y(∞) is undefined. For a specific example, consider the unstable transfer function

Y(s)/U(s) = G(s) = 2/((s − 1)(s + 1)).

The poles of G are ±1. Due to the pole at 1, the transfer function G is unstable. If u(t) = 1, the unit step function, then

Y(s) = G(s)U(s) = 2/((s − 1)(s + 1)s) = 1/(s − 1) + 1/(s + 1) − 2/s.

By taking the inverse Laplace transform, we obtain

y(t) = e^t + e^{−t} − 2.

Clearly, y(t) goes to infinity and y(∞) is undefined. It is emphasized that the stable pole at −1 is not enough to guarantee that y(∞) is well defined. All the poles of G must be stable for lim_{t→∞} y(t) to exist.


For another example, consider the transfer function

Y(s)/U(s) = G(s) = 1/(s^2 + 1).

The poles of G(s) are ±i, and thus G is unstable. If u(t) = 1, the unit step function, then the output Y(s) in the s domain is given by

Y(s) = G(s)U(s) = 1/((s^2 + 1)s) = 1/s − s/(s^2 + 1).

By taking the inverse Laplace transform, the output y(t) is determined by

y(t) = 1 − cos(t).

Notice that y(t) does not converge as t tends to infinity, and thus y(∞) is undefined. However, y(t) is bounded for all t. In fact, |y(t)| ≤ 2 for all t.

A mass, spring, damper system

Figure 4.8: A mass spring damper system

Consider a mass spring damper system consisting of a mass m at position y, connected to a spring with constant k and a damper with coefficient c; see Figure 4.8. According to Hooke’s law, the force required to move a spring a distance y is given by −ky, where k is the spring constant (in units of newtons per meter). Furthermore, the force required to move a damper or dashpot is given by −cy', where c is the damping coefficient (in units of newton-seconds per meter). Moreover, assume that there is a force of u on the mass. According to Newton’s second law, the equation of motion is given by

m y'' = u − ky − cy'.

By rearranging the terms, we see that the equation of motion is determined by

m y'' + c y' + k y = u. (6.3)

If y is viewed as the output, then the transfer function G(s) from u to y is given by

G(s) = Y(s)/U(s) = 1/(ms^2 + cs + k). (6.4)


Now assume that one puts a force of u = 1 newton on the mass. Because m > 0, c > 0 and k > 0, it follows that G(s) is stable; see Problem 1 in Section 4.6.1. Since G(0) = 1/k, the final value theorem yields

1/k = G(0) × 1 = lim_{t→∞} y(t).

Therefore the mass ends up at the position 1/k. The roots {λ1, λ2} of ms^2 + cs + k determine how fast y(t) converges to 1/k. The ℜ(λ_j) determines how fast y(t) converges to 1/k. If ℑ(λ1) ≠ 0, then |ℑ(λ1)| is the frequency at which y(t) converges to 1/k.

Now let v = y' be the velocity of the mass. Notice that V(s) = sY(s) when y(0) = 0. Hence

V(s)/U(s) = sY(s)/U(s) = sG(s) = s/(ms^2 + cs + k).

So the transfer function F(s) from u to the velocity v is given by

F(s) = V(s)/U(s) = s/(ms^2 + cs + k).

Now observe that F(0) = 0. So if u(t) = u0, a constant for all t, then the final value theorem shows that v(∞) = 0. As expected, the velocity of the mass converges to zero as t tends to infinity.
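For a hypothetical numerical case (m = 1, c = 2, k = 5, values chosen here only for illustration), the final position G(0) = 1/k and the poles of ms^2 + cs + k can be read off directly:

```python
import cmath

m, c, k = 1.0, 2.0, 5.0      # hypothetical mass, damping, stiffness
final = 1.0 / k              # G(0): final position for a unit step force
d = cmath.sqrt(c * c - 4 * m * k)
poles = ((-c + d) / (2 * m), (-c - d) / (2 * m))
print(final, poles)  # 0.2 and poles -1 ± 2i: decay like e^{-t}, frequency 2
```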

4.6.1 Exercise

Problem 1. Consider the polynomial p(s) = s^2 + bs + c where b and c are real numbers. Show that the roots of p(s) are both contained in the open left half plane {s : ℜ(s) < 0} if and only if b > 0 and c > 0. Hint: use the quadratic formula.

Problem 2. Consider the transfer function

Y(s)/U(s) = G(s) = (12s^3 + s − 32)/((s^2 + 3s + 2)(s^2 + 0.1s + 8)).

Assume that the input u(t) = 5. Find y(∞) = lim_{t→∞} y(t). Plot y(t) in Matlab. Does your solution oscillate to y(∞), and if so, at what frequency? How fast does the solution converge to y(∞)?

Problem 3. Consider the transfer function

Y(s)/U(s) = G(s) = (5321s^4 + 782s^2 − 298s + 16)/((489s^2 + 78942s + 4)(3678s^2 + 39787s + 2)).

Assume that the input u(t) = 4. Find, if possible, y(∞) = limt→∞ y(t).

Problem 4. Consider the transfer function

Y(s)/U(s) = G(s) = (s^4 + 2s^2 + 16)/((s^2 − s + 4)(s^2 + 14s + 2)).


Assume that the input u(t) = 4. Find, if possible, y(∞) = limt→∞ y(t).

Problem 5. Find the transfer function G(s) for the differential equation given by

y^{(3)} + 5y'' + 24y' + 20y = 10u' + 4u

where u is the input and y is the output. Is G(s) stable? If u(t) = 4, then compute y(∞) = lim_{t→∞} y(t). Use the step command to plot y(t). Use the residue command to find y(t). Plot this y(t) on the same graph.

Problem 6. Consider the mass spring damper system

m y'' + c y' + k y = u

where y is the position of the mass m, the spring constant is k and c is the damping coefficient. Assume that all the initial conditions are zero, the input u(t) = 4 and the position

y(t) = 1 − 2e^{−t} + e^{−2t}.

Find m, c and k.


4.7 A cascaded circuit

Figure 4.9: A cascaded resistor and capacitor circuit

Now consider the circuit in Figure 4.9 consisting of two resistors with R1 and R2 ohms and two capacitors with C1 and C2 farads. The input to this circuit is the voltage u and the output is the voltage across the second capacitor, denoted by y. The current in the first loop is i1, while i2 denotes the current in the second loop. By applying Kirchhoff’s voltage law to this circuit, we obtain the following equations

u = R1 i1 + (1/C1) ∫_0^t (i1 − i2) dt

0 = R2 i2 + (1/C1) ∫_0^t (i2 − i1) dt + (1/C2) ∫_0^t i2 dt

y = (1/C2) ∫_0^t i2 dt .

By taking the Laplace transform with all the initial conditions equal to zero, we arrive at

[U; 0] = [R1 + 1/(C1 s), −1/(C1 s); −1/(C1 s), R2 + 1/(C1 s) + 1/(C2 s)] [I1; I2]

(matrix rows separated by semicolons).

Moreover, Y(s) = I2(s)/(C2 s). Notice that the previous matrix equation can be simplified to

[U; 0] = (1/(C1 C2 s)) [R1 C1 C2 s + C2, −C2; −C2, R2 C1 C2 s + C1 + C2] [I1; I2]. (7.1)

Now consider the 2 × 2 matrix

T = [a, b; c, d]. (7.2)

Recall that if the determinant det T = ad − bc is not equal to zero, then the inverse of T is given by

T^{−1} = (1/(ad − bc)) [d, −b; −c, a]. (7.3)

Notice that the determinant of the 2 × 2 matrix in (7.1) is given by

Δ = C2(R1 C1 s + 1)(R2 C1 C2 s + C1 + C2) − C2^2
  = C2 [R1 R2 C1^2 C2 s^2 + (R2 C1 C2 + R1 C1^2 + R1 C1 C2)s + C1]
  = C1 C2 [R1 R2 C1 C2 s^2 + (R2 C2 + R1 C1 + R1 C2)s + 1].


Using this along with the inversion formula in (7.3), we obtain

[I1(s); I2(s)] = (C1 C2 s/Δ) [R2 C1 C2 s + C1 + C2, C2; C2, R1 C1 C2 s + C2] [U(s); 0].

In particular, I2(s) = C1 C2^2 s U(s)/Δ. Using Y = I2/(C2 s), we see that Y = C1 C2 U(s)/Δ. In other words, the transfer function G from u into y for the circuit in Figure 4.9 is given by

G(s) = Y(s)/U(s) = 1/(R1 R2 C1 C2 s^2 + (R2 C2 + R1 C1 + R1 C2)s + 1). (7.4)

It is noted that if u(t) = 12, then the voltage across the capacitor converges to 12, that is,

12 = lim_{t→∞} y(t) (when u(t) = 12).

Because G(s) is a strictly proper transfer function, if u(t) = δ(t), then the Dirac delta function δ(t) does not appear in the output y(t).

4.7.1 Exercise

Problem 1. Construct a circuit consisting of only resistors and capacitors such that the transfer function for this circuit is given by

G(s) = α/(s^2 + βs + α)

where α and β are constants.

Problem 2. Find the transfer function for the circuit given in Figure 4.9 where the output y is the voltage measured across the resistor R1.

• If u(t) = 12, then find lim_{t→∞} y(t).

• Assume that lightning hits the circuit, that is, u(t) = δ(t). Does the Dirac delta function δ(t) appear in the output y(t)? Explain why or why not.

• Assume R1 = 1, R2 = 1/2, C1 = 1 and C2 = 1/2. Find the solution y(t) when u(t) = 12. Plot your solution in Matlab using the step command.

Problem 3. Find the transfer function for the circuit given in Figure 4.10.

• If u(t) = 12, then find lim_{t→∞} y(t). Use the step command in Matlab to plot y(t). Use the residue command in Matlab to find y(t) and plot y(t) on the same graph.

• Assume that lightning hits the circuit, that is, u(t) = δ(t). Does the Dirac delta function δ(t) appear in the output y(t)? Explain why or why not.


Figure 4.10: Problems 3 and 4

Problem 4. Consider the circuit given in Figure 4.10. Find the transfer function from the input u to the voltage y1 across the inductor with 1 henry in the first loop.

• If u(t) = 12, then find lim_{t→∞} y1(t). Use the step command in Matlab to plot y1(t). Use the residue command in Matlab to find y1(t) and plot y1(t) on the same graph.

• Assume that lightning hits the circuit, that is, u(t) = δ(t). Does the Dirac delta function δ(t) appear in the output y1(t)? Explain why or why not.

Problem 5. Consider a circuit consisting of a voltage source u and two resistors R1 and R2, all in parallel. The first resistor R1 is between the voltage source u and the second resistor R2. Assume that the voltage u = 10 volts, the current i1 leaving the voltage source is 1/2 amp and the current i2 across the second resistor R2 is 1/10 amp. What is the voltage across the second resistor R2? Find R1 and R2. Hint:

u = R1(i1 − i2) and 0 = R1(i2 − i1) + R2 i2 .

Problem 6. Consider the circuit presented in Figure 4.11 where all the components are equal to one, that is, R1 = R2 = 1, L = 1 and C1 = C2 = 1.

• Find the transfer function G(s) from the input voltage u to the voltage y across the capacitor C2.

• Assume that u(t) = 12. Then find y(∞) = lim_{t→∞} y(t).

Figure 4.11: A cascade circuit


4.8 Transfer functions and impedance

In this section we will recall some elementary facts concerning the impedance of a linear circuit. The impedance of a network with no internal voltage or current source is the transfer function from the input current i◦ to the output voltage v, that is, the impedance of a network is defined by

Z(s) = V(s)/I◦(s). (8.1)

The admittance is one over the impedance, that is, 1/Z. In other words, the admittance is the transfer function from the voltage to the current.

For example, recall that v = Ri◦ for a resistor. Clearly, V(s) = RI◦(s). Thus the impedance of a resistor with R ohms is given by Z = R. Recall that v = L di◦/dt for an inductor with L henrys. In this case, V(s) = sLI◦(s). Hence the impedance of an inductor with L henrys is Z = Ls. Finally, recall that Cv = ∫ i◦ dt for a capacitor. In the s-domain V(s) = I◦(s)/(Cs). So the impedance of a capacitor with C farads is given by Z = 1/(Cs).

Let {Z_k}_1^n be the impedance functions for a set of networks. If the networks {Z_k}_1^n are all connected in series, then the total impedance Z is given by the sum of all the impedances, that is,

Z(s) = ∑_{k=1}^n Z_k(s)   ({Z_k}_1^n in a series connection). (8.2)

For example, if a resistor with resistance R, an inductor with L henrys and a capacitor with C farads are connected in series, then the total impedance is given by Z(s) = R + Ls + 1/(Cs).

As before, let {Z_k}_1^n be the impedance functions for a set of networks. If the networks {Z_k}_1^n are all connected in parallel, then the total impedance Z is given by

Z(s) = (∑_{k=1}^n 1/Z_k(s))^{−1}   ({Z_k}_1^n in a parallel connection). (8.3)

In particular, if Z1 and Z2 are two impedances in parallel, then the total impedance Z is given by

Z(s) = 1/(1/Z1 + 1/Z2) = Z1 Z2/(Z1 + Z2).

For example, if two resistors with resistance R1 and R2 ohms are in a parallel connection, then the total impedance is Z = R1 R2/(R1 + R2). For another example, consider a resistor with resistance R, an inductor with L henrys and a capacitor with C farads connected in parallel. Then the total impedance is given by

Z(s) = 1/(1/R + 1/(Ls) + Cs) = RLs/(RLCs^2 + Ls + R).
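The series and parallel rules (8.2) and (8.3) translate directly into two small helper functions (an illustrative Python sketch; the impedance arguments may be real or complex):

```python
def series(*Z):
    """Total impedance of impedances connected in series, rule (8.2)."""
    return sum(Z)

def parallel(*Z):
    """Total impedance of impedances connected in parallel, rule (8.3)."""
    return 1.0 / sum(1.0 / z for z in Z)

# Two resistors in parallel: R1*R2/(R1 + R2) = 2*3/5 = 1.2.
print(round(parallel(2.0, 3.0), 10))  # → 1.2
```

Passing complex values such as R + L*s or 1/(C*s) evaluates a network's impedance at a particular point s.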


Figure 4.12: A single loop network

A simple network. Consider the network consisting of a single loop with impedances Z1 and Z2. The input u is a voltage source and the output y is the voltage across the impedance Z2; see Figure 4.12. The current in this network is denoted by i◦. Notice that by choosing Z1(s) = R + Ls and Z2(s) = 1/(Cs), the circuit in Figure 4.5 is a special case of the network in Figure 4.12. By applying Kirchhoff’s voltage law to the circuit in Figure 4.12, we obtain the following equations in the s-domain

U(s) = (Z1 + Z2)I◦(s) and Y(s) = Z2 I◦(s) .

This readily implies that I◦(s) = U(s)/(Z1 + Z2). Hence Y(s) = Z2 I◦(s) is given by

Y(s) = Z2 U(s)/(Z1 + Z2) .

Therefore the transfer function G for the network in Figure 4.12 is given by

G(s) = Y(s)/U(s) = Z2(s)/(Z1(s) + Z2(s)). (8.4)
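With Z1 = R + Ls and Z2 = 1/(Cs), the voltage divider (8.4) must reproduce the RCL transfer function (5.5); here is a numerical spot check in Python (arbitrary sample point, values chosen for illustration):

```python
R, L, C = 0.2, 1.0, 1.0
s0 = complex(0.5, 2.0)            # arbitrary test point
Z1 = R + L * s0                   # resistor and inductor in series
Z2 = 1.0 / (C * s0)               # capacitor
G_div = Z2 / (Z1 + Z2)            # voltage divider, formula (8.4)
G_rcl = 1.0 / (L * C * s0 ** 2 + R * C * s0 + 1)  # formula (5.5)
assert abs(G_div - G_rcl) < 1e-12
print("(8.4) matches (5.5)")
```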

An example. Consider the network in Figure 4.12 where the impedance Z1 is a network consisting of a resistor R1 and an inductor L in parallel, while Z2 is a network consisting of a resistor R2 and a capacitor C in parallel. Then find the transfer function from u to the voltage y across the capacitor C. To this end, notice that the impedance Z1 is given by

Z1 = 1/(1/R1 + 1/(Ls)) = R1 Ls/(Ls + R1).

The impedance Z2 is given by

Z2 = 1/(1/R2 + Cs) = R2/(R2 Cs + 1).


So according to (8.4) the transfer function from u to the voltage across the capacitor C is computed by

G(s) = Y(s)/U(s) = Z2(s)/(Z1(s) + Z2(s))
     = (R2/(R2 Cs + 1))/(R1 Ls/(Ls + R1) + R2/(R2 Cs + 1))
     = R2/(R1 Ls(R2 Cs + 1)/(Ls + R1) + R2)
     = R2(Ls + R1)/(R1 Ls(R2 Cs + 1) + R2(Ls + R1))
     = (R2 Ls + R1 R2)/(R1 R2 LCs^2 + (R1 + R2)Ls + R1 R2).

In other words, the transfer function from u to the voltage across the capacitor C is given by

G(s) = Y(s)/U(s) = (R2 Ls + R1 R2)/(R1 R2 LCs^2 + (R1 + R2)Ls + R1 R2). (8.5)

Finally, it is noted that because the voltage across the capacitor C and the resistor R2 is the same, the transfer function from u to the voltage across the resistor R2 is also given by equation (8.5).

Figure 4.13: A cascaded network

A cascaded network. Consider the network in Figure 4.13 consisting of four impedances Z1, Z2, Z3 and Z4. The input to this network is the voltage u and the output is the voltage y across the impedance Z4. The current in the first loop is i1, while i2 denotes the current in the second loop. Notice that by choosing Z1(s) = R1 and Z2(s) = 1/(C1 s) with Z3(s) = R2 and Z4(s) = 1/(C2 s), the circuit in Figure 4.9 is a special case of the network in Figure 4.13. By applying Kirchhoff’s voltage law to the network in Figure 4.13, we obtain the following equations

U = Z1I1 + Z2(I1 − I2)

0 = Z3I2 + Z2(I2 − I1) + Z4I2

Y = Z4I2 .

By rewriting these equations in matrix form, we arrive at[U0

]=

[Z1 + Z2 −Z2

−Z2 Z2 + Z3 + Z4

] [I1I2

]. (8.6)


4.8. TRANSFER FUNCTIONS AND IMPEDANCE 223

Moreover, Y = Z4I2. The determinant of the 2 × 2 matrix in (8.6) is given by

Δ = (Z1 + Z2)(Z2 + Z3 + Z4) − Z2^2 = Z1(Z2 + Z3 + Z4) + Z2(Z3 + Z4).

Using this along with the inversion formula for a 2 × 2 matrix in (7.2) and (7.3), we obtain

[I1(s)]           [Z2 + Z3 + Z4      Z2   ] [U(s)]
[I2(s)] = (1/Δ)   [     Z2        Z1 + Z2 ] [ 0  ].

In particular, I2(s) = Z2U(s)/Δ. Using Y = Z4I2, we see that Y = Z2Z4U(s)/Δ. In other words, the transfer function G from u into y for the network in Figure 4.13 is given by

G(s) = Y(s)/U(s) = Z2Z4 / [Z1(Z2 + Z3 + Z4) + Z2(Z3 + Z4)].    (8.7)

(Plot of the output y(t), amplitude versus time in seconds, 0 ≤ t ≤ 35.)

Figure 4.14: The output for u(t) = 12.

A cascaded example. Consider the cascaded system in Figure 4.13 where the impedance Z1 is a resistor R1 = 2 and a capacitor C1 = 3 in parallel, that is,

Z1 = 1/(1/R1 + C1s) = 1/(1/2 + 3s) = 2/(6s + 1).

The impedance Z2 consists of a resistor R2 = 4 and a capacitor C2 = 2 in parallel, that is,

Z2 = 1/(1/R2 + C2s) = 1/(1/4 + 2s) = 4/(8s + 1).


The impedance Z3 consists of a resistor R3 = 2 and an inductor L3 = 4 in parallel. In other words,

Z3 = 1/(1/R3 + 1/(L3s)) = 1/(1/2 + 1/(4s)) = 4s/(2s + 1).

Finally, let Z4 be a resistor R4 = 2, that is, Z4 = 2. Recall that the transfer function G(s) from u to y, the voltage across the impedance Z4, is determined by (8.7). Using Matlab we see that G is given by

G(s) = Y(s)/U(s) = Z2Z4 / [Z1(Z2 + Z3 + Z4) + Z2(Z3 + Z4)] = (0.3s^2 + 0.2s + 0.025) / (s^2 + 0.45s + 0.0625).    (8.8)

Now assume that the input u(t) = 12 with all the initial conditions equal to zero. Because 0.45 > 0 and 0.0625 > 0, the transfer function G is stable; see Problem 1 in Section 4.6.1. In fact, the poles of G(s) are −0.2250 ± 0.1090i. As expected, G is stable. According to the final value Theorem 4.6.1, we see that

y(∞) = lim_{t→∞} y(t) = G(0) × 12 = (12 × 0.025)/0.0625 = 4.8.

Therefore the voltage y(t) across the resistor Z4 = 2 converges to 4.8 as t tends to infinity, that is, y(∞) = 4.8. It is noted that not all of the 12 volts from the input ends up across the resistor Z4 at time t = ∞. The output y(t) for u(t) = 12 is presented in Figure 4.14. Because the poles for G are −0.2250 ± 0.1090i, we see that y(t) converges to 4.8 on the order of e^{−0.225t} at a frequency of 0.109. Since the frequency 0.109 is small compared to the decay rate 0.225, the output y(t) converges to 4.8 before the signal y(t) has much of a chance to oscillate.
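The final value computation can be reproduced numerically. A Python/NumPy sketch (the notes use Matlab; numpy's polyval and roots stand in for the corresponding Matlab calls):

```python
import numpy as np

# DC gain check for (8.8): y(infinity) = G(0) * 12 by the final value theorem.
num = np.array([0.3, 0.2, 0.025])
den = np.array([1.0, 0.45, 0.0625])
G0 = np.polyval(num, 0.0) / np.polyval(den, 0.0)   # G(0) = 0.025/0.0625 = 0.4
y_inf = 12 * G0                                    # approximately 4.8
poles = np.roots(den)                              # about -0.225 +/- 0.109j
assert np.all(poles.real < 0)                      # stable, so the theorem applies
assert abs(y_inf - 4.8) < 1e-9
```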

The output Y(s) in the s domain is given by

Y(s) = G(s)U(s) = G(s)(12/s) = 12(0.3s^2 + 0.2s + 0.025) / [(s^2 + 0.45s + 0.0625)s]
     = 4.8/s + (−0.6000 − 2.3400i)/(s + 0.2250 − 0.1090i) + (−0.6000 + 2.3400i)/(s + 0.2250 + 0.1090i).

By taking the inverse Laplace transform, we obtain

y(t) = 4.8 − 1.2e^{−0.225t} cos(0.109t) + 4.68e^{−0.225t} sin(0.109t);


see Lemma 4.3.1. The Matlab commands we used to compute G(s) and y(t) are given by

z1 = tf(2, [6, 1]);        % z1 = 2/(6s + 1)
z2 = tf(4, [8, 1]);        % z2 = 4/(8s + 1)
z3 = tf([4, 0], [2, 1]);   % z3 = 4s/(2s + 1)
z4 = 2;
G = z2*z4/(z1*(z2 + z3 + z4) + z2*(z3 + z4));
G = minreal(G)             % G = (0.3s^2 + 0.2s + 0.025)/(s^2 + 0.45s + 0.0625)
% The minreal command eliminates all the common poles and zeros in a transfer function.
step(12*G); grid
% To compute a formula for y(t) with u(t) = 12 we used
num = [0.3, 0.2, 0.025]; den = [1, 0.45, 0.0625];
[r, p] = residue(12*num, [den, 0])
% Notice that [den, 0] corresponds to s^3 + 0.45s^2 + 0.0625s
% r = [-0.6000 - 2.3400i; -0.6000 + 2.3400i; 4.8000]
% p = [-0.2250 + 0.1090i; -0.2250 - 0.1090i; 0]
t = linspace(0, 35, 2^14);
y = 4.8 - 1.2*exp(-0.225*t).*cos(0.109*t) + 4.68*exp(-0.225*t).*sin(0.109*t);
hold on; plot(t, y, 'r')

It is noted that the minreal command in Matlab eliminates all the common poles and zeros in a transfer function. (The zeros for a transfer function are the roots of its numerator.) For example, if

F(s) = (s + 1)/(s^2 + 3s + 2) = (s + 1)/[(s + 1)(s + 2)],

then the minreal command yields F(s) = 1/(s + 2). The corresponding Matlab commands are

F = tf([1, 1], [1, 3, 2]);   % F = (s + 1)/(s^2 + 3s + 2)
F = minreal(F);              % F = 1/(s + 2)

Now assume that the circuit gets hit by lightning, that is, u(t) = δ(t). Then the output


Y(s) in the s domain is given by

Y(s) = G(s)U(s) = G(s) × 1 = (0.3s^2 + 0.2s + 0.025)/(s^2 + 0.45s + 0.0625)
     = 0.3 + (0.0325 + 0.0384i)/(s + 0.2250 − 0.1090i) + (0.0325 − 0.0384i)/(s + 0.2250 + 0.1090i).

(We used the residue command in Matlab to compute this partial fraction expansion.) By taking the inverse Laplace transform, we have

y(t) = 0.3δ(t) + 0.0650e^{−0.225t} cos(0.109t) − 0.0769e^{−0.225t} sin(0.109t);

see Lemma 4.3.1. Therefore part of the lightning strike, 0.3δ(t), finds its way to the resistor Z4 = 2.

4.8.1 Exercise

Problem 1. Find the impedance for a network consisting of Z1 and Z2 in series, where Z1 is formed by a resistor R1 and capacitor C in parallel, and Z2 is formed by a resistor R2 and inductor L in parallel.

Problem 2. Find the impedance for a network consisting of Z1 and Z2 in parallel, where Z1 is formed by a resistor R1 and capacitor C in series, and Z2 is formed by a resistor R2 and inductor L in series.

Problem 3. Find the impedance for a network consisting of Z1 and Z2 in parallel, where Z1 is formed by a resistor R1 and capacitor C1 in series, and Z2 is formed by a resistor R2 and capacitor C2 in parallel.

Problem 4. Consider the network in Figure 4.12 where Z1 is formed by a resistor R1 and capacitor C in parallel, and Z2 is formed by a resistor R2 and inductor L in parallel. Then find the transfer function from the voltage u to the output voltage y across the inductor L.

Problem 5. Consider the network in Figure 4.12 where Z1 is formed by a resistor R1 and capacitor C1 in parallel and Z2 is formed by a resistor R2 and capacitor C2 in series. Then find the transfer function from the voltage u to the output voltage y across the capacitor C2.

Problem 6. Consider the cascaded system in Figure 4.13 where the impedance Z1 is a resistor R1 = 4 and a capacitor C1 = 5 in parallel, the impedance Z2 is a resistor R2 = 3 and a capacitor C2 = 2 in parallel, the impedance Z3 is a resistor R3 = 6, a capacitor C3 = 8 and an inductor L3 = 6 in parallel, and the impedance Z4 is a resistor R4 = 3 and a capacitor C4 = 7 in parallel. Let the output y(t) be the voltage across Z4.

• Find the transfer function Y(s)/U(s) = G(s).

• If u(t) = 12, then find y(∞). Use the step command to plot y(t). Use the residue command to find y(t). Plot y(t) on the same graph.

• Assume the circuit gets hit by lightning u(t) = δ(t). Does any part of the lightning δ(t) end up as the voltage across Z4? Explain why or why not.


Problem 7. Find the transfer function for the circuit given in Figure 4.15 where the output y is the voltage across the 1/4 ohm resistor. In this case, assume that u(t) = 12 and then compute y(∞) = lim_{t→∞} y(t).

(Circuit diagram: voltage source u, loop current i1; a 1/2 Ω resistor, a 1 F capacitor, a 1 Ω resistor, a 1 H inductor, a 1/2 F capacitor and 1/4 Ω resistors; output voltage y.)

Figure 4.15: Problems 7 and 8

Problem 8. Find the transfer function for the circuit given in Figure 4.15 where the output y1 is the voltage across the 1/2 farad capacitor. In this case, assume that u(t) = 12 and then compute y1(∞) = lim_{t→∞} y1(t).


Problem 9. Consider the circuit presented in Figure 4.16 where the impedance Z1 is a capacitor C1 = 1 and resistor R1 = 1 in parallel, while the impedance Z2 is a capacitor C2 = 1 and an unknown resistor R in parallel. Assume that the input u(t) = 12. An experiment shows that the output voltage

8 = y(∞) = lim_{t→∞} y(t).

Find the unknown resistance R.

(Circuit diagram: a single loop with voltage source u, loop current i◦, impedances Z1 and Z2; output voltage y across Z2.)

Figure 4.16: Problem 9: A single loop network


Chapter 5

State space

This chapter presents an introduction to state space systems and operational amplifiers. Operational amplifiers with state space analysis will be used to build a circuit to implement any proper rational transfer function.

5.1 The exponential matrix

In this section, we will introduce the exponential matrix e^{At}. Recall that if a is any scalar, then the Taylor series expansion for e^{at} is given by e^{at} = Σ_{k=0}^∞ a^k t^k / k!. Motivated by this fact, let A be any matrix on C^n. Then the exponential matrix e^{At} is defined by

e^{At} = Σ_{k=0}^∞ A^k t^k / k!.    (1.1)

Notice that e^{At} is an n × n matrix. (In our applications t is time.) Moreover, one can prove that the series in (1.1) converges uniformly for all t in a compact set. Therefore e^{At} is well defined for all t. Notice that e^{A0} = I, the identity matrix on C^n. Furthermore, using the power series expansion for e^{At}, one can show that

e^{A(t1+t2)} = e^{At1} e^{At2}   (for all t1 and t2).    (1.2)

By choosing t1 = t and t2 = −t, we have e^{At} e^{−At} = e^{A0} = I. In particular, e^{At} is invertible for all t and

[e^{At}]^{−1} = e^{−At}   (for all t).    (1.3)

By setting t = 1, we see that the exponential matrix e^{A} = Σ_{k=0}^∞ A^k / k!. Finally, the Matlab command to compute the exponential matrix is expm.
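Properties (1.2) and (1.3) are easy to verify numerically. A Python sketch, with scipy.linalg.expm standing in for Matlab's expm (the random matrix A is an arbitrary test case):

```python
import numpy as np
from scipy.linalg import expm   # scipy's analogue of Matlab's expm

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
t1, t2 = 0.7, 1.3
# Property (1.2): e^{A(t1+t2)} = e^{A t1} e^{A t2}
assert np.allclose(expm(A * (t1 + t2)), expm(A * t1) @ expm(A * t2))
# Property (1.3): [e^{At}]^{-1} = e^{-At}
assert np.allclose(np.linalg.inv(expm(A * t1)), expm(-A * t1))
# e^{A 0} = I, the identity matrix
assert np.allclose(expm(0.0 * A), np.eye(3))
```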

We claim that

d e^{At}/dt = A e^{At}.    (1.4)


To see this observe that

d e^{At}/dt = d/dt Σ_{k=0}^∞ A^k t^k/k! = Σ_{k=0}^∞ d/dt (A^k t^k/k!) = Σ_{k=1}^∞ A^k t^{k−1}/(k − 1)!
            = A Σ_{k=1}^∞ A^{k−1} t^{k−1}/(k − 1)! = A Σ_{k=0}^∞ A^k t^k/k! = A e^{At}.

Therefore (1.4) holds.

Recall that the Laplace transform of e^{at} equals 1/(s − a). We claim that for any square matrix A, an analogous result holds, that is,

(L e^{At})(s) = (sI − A)^{−1}.    (1.5)

To verify this let Φ(s) = (L e^{At})(s). Then using (L g′)(s) = sG(s) − g(0), where G = Lg, we have

A Φ(s) = A (L e^{At})(s) = (L A e^{At})(s) = (L d e^{At}/dt)(s) = s Φ(s) − e^{A0} = s Φ(s) − I.

Hence AΦ = sΦ − I, or equivalently, (sI − A)Φ = I. By inverting sI − A, we arrive at Φ = (sI − A)^{−1}. Therefore (1.5) holds.

Now let us show how one can use the Laplace transform to compute e^{At}. For example, consider the matrix

A = [ 0    1]
    [−2   −3].    (1.6)

In general the exponential matrix e^{At} is the inverse Laplace transform of (sI − A)^{−1}. To compute (sI − A)^{−1} observe that

sI − A = [s  0] − [ 0    1] = [s     −1  ]
         [0  s]   [−2   −3]   [2    s + 3].

Recall that the inverse of a 2 × 2 matrix is given by

[a  b]^{−1}                  [ d  −b]
[c  d]      = (1/(ad − cb))  [−c   a]    (1.7)

when its determinant det[A] = ad − cb is nonzero; see equations (7.2) and (7.3) in Chapter 4. (The inverse of a square matrix exists if and only if its determinant is nonzero.) Using this we obtain

(sI − A)^{−1} = (1/Δ) [s + 3   1]
                      [ −2     s]

where the determinant Δ = s^2 + 3s + 2 = (s + 1)(s + 2). This readily implies that

(sI − A)^{−1} = [ (s + 3)/((s + 1)(s + 2))      1/((s + 1)(s + 2)) ]
                [   −2/((s + 1)(s + 2))         s/((s + 1)(s + 2)) ].


The exponential matrix e^{At} is obtained by taking the inverse Laplace transform of each component of (sI − A)^{−1}. To this end, observe that the components in (sI − A)^{−1} admit a partial fraction expansion of the form

(s + 3)/((s + 1)(s + 2)) =  2/(s + 1) − 1/(s + 2)
      1/((s + 1)(s + 2)) =  1/(s + 1) − 1/(s + 2)
     −2/((s + 1)(s + 2)) = −2/(s + 1) + 2/(s + 2)
      s/((s + 1)(s + 2)) = −1/(s + 1) + 2/(s + 2).

By taking the inverse Laplace transform, we have

e^{At} = [ 2e^{−t} − e^{−2t}         e^{−t} − e^{−2t}  ]
         [−2e^{−t} + 2e^{−2t}      −e^{−t} + 2e^{−2t}  ].

As expected, e^{A0} = I and the derivative of e^{At} equals A e^{At}.

The exponential matrix can be used to solve a matrix differential equation of the form

ẋ = Ax    (1.8)

where A is an n × n matrix, x is a vector in C^n and the initial condition x(0) is a specified vector in C^n. To be precise,

[ẋ1]   [a11  a12  · · ·  a1n] [x1]
[ẋ2]   [a21  a22  · · ·  a2n] [x2]
[ ⋮ ] = [ ⋮    ⋮   · · ·   ⋮ ] [ ⋮ ].    (1.9)
[ẋn]   [an1  an2  · · ·  ann] [xn]

Here A is the n × n matrix with entries ajk and x = [x1, x2, · · · , xn]^{tr}. The initial condition is x(0) = [x1(0), x2(0), · · · , xn(0)]^{tr}. Recall that tr denotes the transpose. The solution to this differential equation is given by x(t) = e^{At}x(0). To show that this is indeed a solution simply plug it into ẋ = Ax and show that it works. Indeed, assume that x(t) = e^{At}x(0). Then

ẋ = (d e^{At}/dt) x(0) = A e^{At} x(0) = Ax.

Notice that e^{A0}x(0) = x(0) matches the initial condition x(0). Therefore x(t) = e^{At}x(0) is a solution to the differential equation in (1.8).

For an example, consider the differential equation

[ẋ1]   [0  −1] [x1]
[ẋ2] = [1   0] [x2]    (1.10)

where the initial conditions are x1(0) = α and x2(0) = β. To find the solution, let us compute e^{At}. In this case,

A = [0  −1]    and   sI − A = [ s   1]
    [1   0]                   [−1   s].    (1.11)


By using the 2 × 2 matrix inversion formula in (1.7), we obtain

(sI − A)^{−1} = (1/(s^2 + 1)) [s  −1]
                              [1   s].

This readily implies that

(sI − A)^{−1} = [ s/(s^2 + 1)    −1/(s^2 + 1) ]
                [ 1/(s^2 + 1)     s/(s^2 + 1) ].

By taking the inverse Laplace transform, we have

e^{At} = [cos(t)  −sin(t)]
         [sin(t)   cos(t)].

As expected, e^{A0} = I and the derivative of e^{At} equals A e^{At}. Recall that the initial conditions are x1(0) = α and x2(0) = β. Hence the solution x(t) = e^{At}[α, β]^{tr} is given by

[x1(t)]   [α cos(t) − β sin(t)]
[x2(t)] = [α sin(t) + β cos(t)].

The Matlab command to compute the exponential matrix of A is expm(A). One can also use the syms command to compute e^{At}. For example, consider the 2 × 2 matrix A in (1.11). Then to compute e^{At} in Matlab we used the following commands:

syms t; A = [0, -1; 1, 0];
eAt = expm(A*t)
% eAt = [exp(-t*i)/2 + exp(t*i)/2,          -(exp(-t*i)*i)/2 + (exp(t*i)*i)/2;
%        (exp(-t*i)*i)/2 - (exp(t*i)*i)/2,   exp(-t*i)/2 + exp(t*i)/2]
% Matlab gave the answer in terms of complex exponentials.
% To convert to sines and cosines we used the simplify command.
simplify(eAt)
% ans = [cos(t), -sin(t); sin(t), cos(t)]

5.1.1 A spectral method to compute e^{At}

One can use Jordan forms to compute the exponential matrix. However, computing the Jordan form for matrices with repeated eigenvalues may be numerically sensitive. To avoid these numerical issues and the general theory of Jordan forms, let us simply assume that A on C^n has n distinct eigenvalues {λj}_1^n. Recall that the eigenvalues for A are the roots of its characteristic polynomial det[sI − A]. Let {ξj}_1^n be the eigenvectors corresponding to the


eigenvalues {λj}_1^n, that is, Aξj = λjξj for j = 1, 2, · · · , n. Let T be the matrix on C^n formed by the eigenvectors {ξj}_1^n, that is,

T = [ξ1  ξ2  ξ3  · · ·  ξn]   on C^n.    (1.12)

Because {λj}_1^n are distinct, the corresponding eigenvectors {ξj}_1^n are linearly independent, and thus, the matrix T is invertible.

Let Λ be the diagonal matrix formed by the eigenvalues {λj}_1^n for A, that is,

Λ = [λ1   0    0   · · ·   0 ]
    [0    λ2   0   · · ·   0 ]
    [0    0    λ3  · · ·   0 ]
    [⋮    ⋮    ⋮    ⋱      ⋮ ]
    [0    0   · · ·  0    λn ].    (1.13)

Using Aξj = λjξj for j = 1, 2, · · · , n, it follows that

AT = TΛ   and   A = TΛT^{−1}.    (1.14)

In particular, A^k = TΛ^kT^{−1} for all integers k ≥ 0. Hence

e^{At} = Σ_{k=0}^∞ A^k t^k/k! = Σ_{k=0}^∞ TΛ^kT^{−1} t^k/k! = T (Σ_{k=0}^∞ Λ^k t^k/k!) T^{−1} = T e^{Λt} T^{−1}.

So when the eigenvalues for A are distinct, we obtain

e^{At} = T e^{Λt} T^{−1}.    (1.15)

Because Λ is a diagonal matrix, e^{Λt} is the diagonal matrix formed by {e^{λjt}}_1^n. Therefore

e^{At} = T [e^{λ1t}    0      0    · · ·    0   ]
           [0      e^{λ2t}    0    · · ·    0   ]
           [0          0   e^{λ3t} · · ·    0   ] T^{−1}.    (1.16)
           [⋮          ⋮      ⋮     ⋱       ⋮   ]
           [0          0    · · ·   0   e^{λnt} ]

In summary, to compute e^{At} find the eigenvalues {λj}_1^n for A. If the eigenvalues {λj}_1^n are distinct, then e^{At} is given by (1.16) where T is the invertible matrix formed by the eigenvectors {ξj}_1^n corresponding to the eigenvalues {λj}_1^n. The Matlab command eig computes the eigenvalues and eigenvectors for a square matrix. If the eigenvalues for A are repeated, then one can use the Jordan form for A to compute e^{At}; see [2] for further details.
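The spectral formula (1.15) is easy to test numerically on the earlier example (1.6). A Python sketch, with numpy's eig playing the role of Matlab's eig:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # distinct eigenvalues -1 and -2
lam, T = np.linalg.eig(A)                  # eigenvalues and eigenvectors
t = 0.9
# (1.15): e^{At} = T e^{Lambda t} T^{-1}, with e^{Lambda t} diagonal
eAt_spectral = T @ np.diag(np.exp(lam * t)) @ np.linalg.inv(T)
assert np.allclose(eAt_spectral, expm(A * t))
```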

For example, consider the matrix A on C^2 given by

A = [0  −ω]
    [ω   0]


where ω > 0. In this case, the characteristic polynomial for A is given by

det[sI − A] = det [ s   ω] = s^2 + ω^2 = (s − iω)(s + iω).
                  [−ω   s]

Since the eigenvalues for A are the roots of the characteristic polynomial, the eigenvalues for A are {iω, −iω}. The corresponding eigenvectors are given by

ξ1 = [i]    and   ξ2 = [−i]
     [1]               [ 1].

To be precise, Aξ1 = iωξ1 and Aξ2 = −iωξ2. (Because A is a real matrix, ξ2 is the complex conjugate of ξ1.) In this case, T = [ξ1  ξ2] and Λ is the diagonal matrix formed by {iω, −iω}, that is,

T = [i  −i]    and   T^{−1} = (1/2i) [ 1  i]    and   Λ = [iω    0 ]
    [1   1]                          [−1  i]              [0   −iω].

Using e^{At} = T e^{Λt} T^{−1}, we have

e^{At} = T e^{Λt} T^{−1} = (1/2i) T [e^{iωt}      0     ] [ 1  i]
                                    [  0       e^{−iωt} ] [−1  i]

       = (1/2i) [i  −i] [ e^{iωt}     i e^{iωt} ]
                [1   1] [−e^{−iωt}    i e^{−iωt}]

       = (1/2i) [i e^{iωt} + i e^{−iωt}      e^{−iωt} − e^{iωt}     ]
                [e^{iωt} − e^{−iωt}          i e^{iωt} + i e^{−iωt} ]

       = [cos(ωt)  −sin(ωt)]
         [sin(ωt)   cos(ωt)].

Therefore

e^{At} = [cos(ωt)  −sin(ωt)]
         [sin(ωt)   cos(ωt)].

The reader should check this by taking the inverse Laplace transform of (sI − A)^{−1} to compute e^{At}.

5.1.2 Exercise

Problem 1. Consider the matrix

A = [0   1]
    [2  −1].

Find e^{At} by taking the inverse Laplace transform of (sI − A)^{−1}. Then find e^{At} by using the eigenvalue method in Section 5.1.1.

Problem 2. Consider the matrix

A = [0  −5]
    [1  −2].


Find e^{At} by taking the inverse Laplace transform of (sI − A)^{−1}. Then find e^{At} by using the eigenvalue method in Section 5.1.1.

Problem 3. Consider the matrix

A = [0   1   0]
    [0   0   1]
    [0  −4   0].

Find e^{At} by taking the inverse Laplace transform of (sI − A)^{−1}. Then find e^{At} by using the eigenvalue method in Section 5.1.1.

Problem 4. Find the solution to the differential equation

[ẋ1]   [0   1   0] [x1]
[ẋ2] = [0   0   1] [x2]
[ẋ3]   [0  −4   0] [x3]

subject to the initial conditions x1(0) = 2, x2(0) = −1 and x3(0) = 0.

Problem 5. Consider the matrix

A = [ 5   8]
    [−5  −7].

Find e^{At} by taking the inverse Laplace transform of (sI − A)^{−1}. Then find e^{At} by using the eigenvalue method in Section 5.1.1.

Problem 6. Consider the skew symmetric matrix

A = [ 0   −3    2 ]
    [ 3    0  −√3 ]
    [−2   √3    0 ].

Find e^{At} by taking the inverse Laplace transform of (sI − A)^{−1}. Then find e^{At} by using the eigenvalue method in Section 5.1.1.

5.2 The rotation matrix around a specified axis

This section uses exponential matrices to present an introduction to rotation matrices. The results in this section are not used in the rest of the notes and can be skipped by the uninterested reader. Consider the skew symmetric matrix

Aα = [  0   −α3    α2 ]                 [α1]
     [ α3    0    −α1 ]    where α =    [α2]    (2.1)
     [−α2   α1     0  ]                 [α3]


is a real vector in R^3. In rotational dynamics, it turns out that e^{Aαθ} is the rotation matrix in the counterclockwise direction around the vector α. Moreover, ω = αθ is the angular velocity for the rotation matrix e^{Aαθ}. Finally, a square matrix A is skew symmetric if A + A^∗ = 0. (The complex conjugate transpose is denoted by ∗.)

We claim that 1 is an eigenvalue for e^{Aαθ} with corresponding eigenvector α when α ≠ 0. (Recall that an eigenvector must be nonzero.) This means that α is invariant for e^{Aαθ}, and thus, e^{Aαθ} rotates around α. To see that 1 is an eigenvalue for e^{Aαθ}, simply observe that Aαα = 0. Hence

e^{Aαθ}α = Σ_{n=0}^∞ (θ^n/n!) Aα^n α = Iα = α.

Therefore e^{Aαθ}α = α. If α = 0, then Aα = 0 (the 3 by 3 zero matrix), and e^{A0θ} = I.

Recall that U is a unitary matrix if U^∗U = I, the identity matrix. We claim that e^{Aαθ} is a unitary matrix. To see this notice that Aα^∗ = −Aα. Hence

(e^{Aαθ})^∗ e^{Aαθ} = e^{Aα^∗θ} e^{Aαθ} = e^{−Aαθ} e^{Aαθ} = (e^{Aαθ})^{−1} e^{Aαθ} = I.

Therefore e^{Aαθ} is a unitary matrix.

Recall that the trace of a square matrix A, denoted by trace(A), is the sum of its diagonal entries. It is well known that det[e^A] = e^{trace(A)} where det is the determinant. Because the trace of Aα equals zero, it follows that

det[e^{Aαθ}] = 1   (for all θ).    (2.2)

It is emphasized that e^{Aαθ} is a unitary matrix on R^3 whose determinant equals 1. On the other hand, if R is a unitary matrix on R^3 whose determinant equals 1, then R = e^{Aα} where Aα is a real skew symmetric matrix; see Problem 11. Therefore the set of all unitary matrices R on R^3 whose determinant equals 1 can be viewed as rotation matrices of the form e^{Aα}. (Because the logarithm of a matrix is not unique, the skew symmetric matrix Aα is not uniquely determined by R.) For further results on rotation matrices see [12, 17] and the rotation group SO(3) in Wikipedia.
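Both claims — that e^{Aαθ} is unitary with determinant 1 and that it leaves α fixed — can be confirmed numerically. A Python sketch (the vector α below is an arbitrary test value; scipy assumed):

```python
import numpy as np
from scipy.linalg import expm

alpha = np.array([1.0, -2.0, 0.5])
a1, a2, a3 = alpha
A = np.array([[0.0, -a3, a2],
              [a3, 0.0, -a1],
              [-a2, a1, 0.0]])              # skew symmetric: A + A.T = 0
R = expm(A * 0.8)                           # theta = 0.8
assert np.allclose(R.T @ R, np.eye(3))      # unitary (orthogonal) as claimed
assert np.isclose(np.linalg.det(R), 1.0)    # det e^{A theta} = e^{trace} = 1
assert np.allclose(R @ alpha, alpha)        # the axis alpha is left fixed
```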

It is also noted that e^{Aα(θ1+θ2)} = e^{Aαθ1} e^{Aαθ2}. Moreover, it is easy to verify that

Aα b = α × b    (2.3)

where × denotes the cross product and b is a vector in R^3. Recall that e^{Aαθ}b rotates the vector b. To obtain the initial direction of the rotation e^{Aαθ}b, notice that

(d/dθ) e^{Aαθ}b |_{θ=0} = Aα e^{Aαθ}b |_{θ=0} = Aα b = α × b.

So using the right hand rule of the cross product, we see that the rotation matrix initially moves from the vector b in the direction α × b, which is orthogonal to both α and b and points counterclockwise around α. In other words, the rotation matrix e^{Aαθ} rotates counterclockwise around α.


If α is a unit vector, that is, ‖α‖ = 1, then the rotation matrix e^{Aαθ} around the vector α is determined by

e^{Aαθ} = cos(θ)I + Aα sin(θ) + αα^∗(1 − cos(θ))   (if ‖α‖ = 1).    (2.4)

If α is a unit vector, then equation (2.4) shows that the period of e^{Aαθ} is 2π, or equivalently, as θ moves from 0 to 2π, e^{Aαθ} rotates once around α. Applying e^{Aαθ} to a vector b, we obtain the classical Rodrigues formula for rotating a vector b around a unit vector α through an angle of θ according to the right hand rule, that is,

e^{Aαθ}b = cos(θ)b + sin(θ)α × b + (α^∗b)(1 − cos(θ))α   (when ‖α‖ = 1).    (2.5)

In fact, the Rodrigues rotation formula can be used to derive e^{Aαθ}; see also the Rodrigues rotation formula in Wikipedia.
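Formulas (2.4) and (2.5) can be checked against a numerical matrix exponential. A Python sketch (the axis α and the vector b are arbitrary test values; scipy assumed):

```python
import numpy as np
from scipy.linalg import expm

alpha = np.array([1.0, -1.0, 1.0]) / np.sqrt(3.0)   # unit vector
a1, a2, a3 = alpha
A = np.array([[0.0, -a3, a2], [a3, 0.0, -a1], [-a2, a1, 0.0]])
theta, I = 1.2, np.eye(3)
# Formula (2.4): the rotation matrix around alpha
R = (np.cos(theta) * I + np.sin(theta) * A
     + (1.0 - np.cos(theta)) * np.outer(alpha, alpha))
assert np.allclose(R, expm(A * theta))
# Formula (2.5): Rodrigues rotation of a test vector b
b = np.array([0.3, 2.0, -1.0])
rotated = (np.cos(theta) * b + np.sin(theta) * np.cross(alpha, b)
           + (1.0 - np.cos(theta)) * np.dot(alpha, b) * alpha)
assert np.allclose(rotated, R @ b)
```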

The form for e^{Aαθ} in (2.4) readily yields the three classical elementary rotation matrices about the x, y and z axes. In particular, we have

e^{Aαθ} = [cos(θ)  −sin(θ)  0]                [0]
          [sin(θ)   cos(θ)  0]    when α =    [0]    (2.6)
          [  0        0     1]                [1]

which is the classical rotation matrix about the z axis. Moreover,

e^{Aαθ} = [ cos(θ)  0  sin(θ)]                [0]
          [   0     1    0   ]    when α =    [1]    (2.7)
          [−sin(θ)  0  cos(θ)]                [0]

is the classical rotation matrix about the y axis. Finally,

e^{Aαθ} = [1    0        0    ]               [1]
          [0  cos(θ)  −sin(θ) ]   when α =    [0]    (2.8)
          [0  sin(θ)   cos(θ) ]               [0]

is the classical rotation matrix about the x axis.


(Three dimensional plot; axes x, y and z.)

Figure 5.1: An ellipse around the [1 −1 1]^{tr} axis

An ellipse. For an example, let us use Matlab to draw a three dimensional ellipse around the axis formed by the vector [1 −1 1]^{tr}. Our ellipse has a maximum length of 2 along the axis and a maximum height of 6 in the orthogonal plane; see Figure 5.1. (The transpose of a matrix M is denoted by M^{tr}.) To this end, consider the rotation matrix e^{Aαθ} with the unit vector α = (1/√3)[1 −1 1]^{tr}, which rotates counterclockwise around the axis formed by α or [1 −1 1]^{tr}. Let v be any unit vector which is orthogonal to α. We choose v = (1/√6)[−1 1 2]^{tr}. To generate this ellipse, let 0 ≤ t ≤ π and compute the vector b(t) = 2 cos(t)w + 6 sin(t)v, where w = α. Then b(t) will form a semi-ellipse around the axis [1 −1 1]^{tr}. Now use the rotation matrix e^{Aαθ} to compute e^{Aαθ}b(t) for 0 ≤ θ ≤ 2π to wrap the semi-ellipse around the axis formed by [1 −1 1]^{tr}. (Because α is a unit vector, e^{Aαθ}b(t) will rotate around the axis formed by α once when θ moves from 0 to 2π.) This yields an ellipse which has a maximum length of 2 along the axis and a maximum height of 6 in the orthogonal plane. To save computational time, we computed e^{Aαθ} only once and then used the fact that e^{Aα kθ} = (e^{Aαθ})^k to compute e^{Aα kθ}, rather than computing e^{Aα kθ} from scratch for each k in the loop.

If b(t) = l cos(t)w + h sin(t)v with l > 0 and h > 0, then our algorithm produces a three dimensional ellipse which has a maximum length of l along the axis and a maximum height of h in the orthogonal plane. Finally, applying the previous procedure to b(t) = cos(t)w + sin(t)v


forms a unit sphere.

The Matlab commands we used to generate the ellipse in Figure 5.1 are given by

(1) w = [1; -1; 1]; w = w/norm(w); v = [-1; 1; 2]/sqrt(6);
(2) a = [0, -w(3), w(2); w(3), 0, -w(1); -w(2), w(1), 0];
(3) p = linspace(-4, 4, 10000);
(4) n = 30; t = linspace(0, pi, n);
(5) T = expm(a*2*pi/200); for j = 1: n;
(6) b = 2*cos(t(j))*w + 6*sin(t(j))*v;
(7) x(1) = b(1); y(1) = b(2); z(1) = b(3); for k = 2: 200;
(8) b = T*b; x(k) = b(1); y(k) = b(2); z(k) = b(3); end
(9) plot3(x, y, z, '-'); hold on; end; plot3(p, -p, p, 'r.'); grid;
(10) xlabel('x'); ylabel('y'); zlabel('z');
(11) title('An ellipse around the [1;-1;1] axis')
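The time-saving trick used in line (5) — computing e^{Aαθ} once for a small step and then multiplying repeatedly — rests on the identity e^{Aα kθ} = (e^{Aαθ})^k, which is easy to confirm numerically. A Python sketch (scipy assumed):

```python
import numpy as np
from scipy.linalg import expm

# Repeated multiplication by T = e^{A dtheta} realizes e^{A k dtheta} = T^k.
w = np.array([1.0, -1.0, 1.0])
w = w / np.linalg.norm(w)
A = np.array([[0.0, -w[2], w[1]],
              [w[2], 0.0, -w[0]],
              [-w[1], w[0], 0.0]])
dtheta = 2.0 * np.pi / 200.0
T = expm(A * dtheta)
k = 37
assert np.allclose(np.linalg.matrix_power(T, k), expm(A * (k * dtheta)))
```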


(Three dimensional plot; axes x, y and z.)

Figure 5.2: Cones around the [1 −1 1]^{tr} axis

A cone. For another example, let us generate a pair of cones around the axis formed by [1 −1 1]^{tr}. To generate a cone with slope 1/2 above the axis, we let −2 ≤ t ≤ 2 and now set b(t) = tw + (t/2)v, which is a line of slope 1/2 with respect to the [1 −1 1]^{tr} axis. Then we used the rotation matrix e^{Aαθ} to compute e^{Aαθ}b(t) for 0 ≤ θ ≤ 2π to wrap the line formed by b(t) around the axis [1 −1 1]^{tr}, and thus, draw the cones presented in Figure 5.2. The Matlab commands we used to generate the two cones in Figure 5.2 are the same as our previous code, except lines (3), (4), (6) and (11) are replaced by

(3c) p = linspace(-2, 2, 10000);
(4c) n = 60; t = linspace(-2, 2, n);
(6c) b = t(j)*w + 0.5*t(j)*v;
(11c) title('Cones around the [1;-1;1] axis')

Let us use the Laplace transform technique to compute the exponential matrix e^{Aαt} when ‖α‖ = 1. Then replacing t by θ yields e^{Aαθ} and its formula in (2.4), that is,

e^{Aαθ} = cos(θ)I + Aα sin(θ) + αα^∗(1 − cos(θ))   (if ‖α‖ = 1).    (2.9)


To this end, recall that

L e^{Aαt} = (sI − Aα)^{−1}

where L denotes the Laplace transform. Notice that

sI − Aα = [  s    α3   −α2 ]
          [−α3    s     α1 ]
          [ α2   −α1    s  ].

det[sI − Aα] = det

⎡⎣ s α3 −α2

−α3 s α1

α2 −α1 s

⎤⎦= s(s2 + α2

1) + α3(sα3 + α1α2)− α2(α1α3 − sα2)

= s3 + s(α21 + α2

2 + α23).

Because ‖α‖ = 1, the characteristic polynomial for Aα is given by

det[sI − Aα] = s(s2 + 1) (when ‖α‖ = 1). (2.10)

To compute (sI − Aα)^{−1} observe that

[  s    α3   −α2 ]^{tr}    [  s   −α3    α2 ]
[−α3    s     α1 ]       = [ α3    s    −α1 ]
[ α2   −α1    s  ]         [−α2    α1    s  ]

where tr denotes the transpose. So the algebraic adjoint of sI − Aα is given by

adj(sI − Aα) = [ s^2 + α1^2     α1α2 − sα3    sα2 + α1α3 ]
               [ sα3 + α1α2     s^2 + α2^2    α2α3 − sα1 ]
               [ α1α3 − sα2     sα1 + α2α3    s^2 + α3^2 ].

Recall that det[sI − Aα] = s(s^2 + 1). This readily implies that

(sI − Aα)^{−1} = adj(sI − Aα)/det[sI − Aα]

               = (1/(s(s^2 + 1))) [ s^2 + α1^2     α1α2 − sα3    sα2 + α1α3 ]
                                  [ sα3 + α1α2     s^2 + α2^2    α2α3 − sα1 ]
                                  [ α1α3 − sα2     sα1 + α2α3    s^2 + α3^2 ].    (2.11)

By rewriting the 3 by 3 matrix, we obtain

(sI − Aα)^{−1} = (s^2/(s(s^2 + 1))) [1  0  0]
                                    [0  1  0]
                                    [0  0  1]

               + (s/(s(s^2 + 1))) [  0   −α3    α2 ]
                                  [ α3    0    −α1 ]
                                  [−α2   α1     0  ]

               + (1/(s(s^2 + 1))) [ α1^2    α1α2   α1α3 ]
                                  [ α2α1    α2^2   α2α3 ]
                                  [ α3α1    α3α2   α3^2 ].


Notice that

αα^∗ = [α1]                  [ α1^2    α1α2   α1α3 ]
       [α2] [α1  α2  α3]  =  [ α2α1    α2^2   α2α3 ]
       [α3]                  [ α3α1    α3α2   α3^2 ].

Using this with the definition of Aα in (2.1) and the identity matrix I on C^3, we have

(sI − Aα)^{−1} = (s/(s^2 + 1)) I + (1/(s^2 + 1)) Aα + (1/(s(s^2 + 1))) αα^∗.    (2.12)

To compute the inverse Laplace transform, observe that

1/(s(s^2 + 1)) = 1/s − s/(s^2 + 1).

Recall that

L(1) = 1/s   and   L(cos(t)) = s/(s^2 + 1)   and   L(sin(t)) = 1/(s^2 + 1).

In particular, the inverse Laplace transform

L^{−1}(1/(s(s^2 + 1))) = 1 − cos(t).

By taking the inverse Laplace transform of (sI − Aα)^{−1} in (2.12), we see that

e^{Aαt} = cos(t)I + Aα sin(t) + αα^∗(1 − cos(t))   (when ‖α‖ = 1).

Replacing t by θ, we obtain the expression for e^{Aαθ} that we have been looking for, that is, the rotation matrix e^{Aαθ} around the α axis is determined by

e^{Aαθ} = cos(θ)I + Aα sin(θ) + αα^∗(1 − cos(θ))   (when ‖α‖ = 1).    (2.13)

This is precisely the formula for e^{Aαθ} in (2.4).

Here we used some insight to compute e^{Aαt}. One can also compute e^{Aαt} directly from the formula for (sI − Aα)^{−1} in (2.11). By taking the inverse Laplace transform of each component of (sI − Aα)^{−1} in (2.11), we see that

e^{Aαt} =
[ cos(t) + α1^2(1 − cos(t))        α1α2(1 − cos(t)) − α3 sin(t)    α1α3(1 − cos(t)) + α2 sin(t) ]
[ α1α2(1 − cos(t)) + α3 sin(t)     cos(t) + α2^2(1 − cos(t))       α2α3(1 − cos(t)) − α1 sin(t) ]
[ α1α3(1 − cos(t)) − α2 sin(t)     α2α3(1 − cos(t)) + α1 sin(t)    cos(t) + α3^2(1 − cos(t))    ].

Replacing t by θ we obtain the expression for e^{Aαθ} that we have been looking for:

e^{Aαθ} =
[ cos(θ) + α1^2(1 − cos(θ))        α1α2(1 − cos(θ)) − α3 sin(θ)    α1α3(1 − cos(θ)) + α2 sin(θ) ]
[ α1α2(1 − cos(θ)) + α3 sin(θ)     cos(θ) + α2^2(1 − cos(θ))       α2α3(1 − cos(θ)) − α1 sin(θ) ]
[ α1α3(1 − cos(θ)) − α2 sin(θ)     α2α3(1 − cos(θ)) + α1 sin(θ)    cos(θ) + α3^2(1 − cos(θ))    ].


5.2. THE ROTATION MATRIX AROUND A SPECIFIED AXIS 243

By rearranging terms, we have
\[
e^{A_\alpha\theta} = \cos(\theta)I + A_\alpha\sin(\theta) + \begin{bmatrix}\alpha_1^2 & \alpha_1\alpha_2 & \alpha_1\alpha_3\\ \alpha_2\alpha_1 & \alpha_2^2 & \alpha_2\alpha_3\\ \alpha_3\alpha_1 & \alpha_3\alpha_2 & \alpha_3^2\end{bmatrix}(1-\cos(\theta))
= \cos(\theta)I + A_\alpha\sin(\theta) + \alpha\alpha^*(1-\cos(\theta)).
\]
This yields the formula for the rotation matrix e^{A_\alpha\theta} around the \alpha axis presented in (2.4).

Computing e^{A_\alpha\theta} by power series

The results in this section are not used in the remaining part of the notes and can be skipped by the uninterested reader. In this section, we will compute e^{A_\alpha\theta} by using the power series e^{A_\alpha\theta} = \sum_{n=0}^{\infty} A_\alpha^n\theta^n/n!. Let \alpha be a vector in R^3. Recall that A_\alpha is the matrix defined by
\[
A_\alpha = \begin{bmatrix}0 & -\alpha_3 & \alpha_2\\ \alpha_3 & 0 & -\alpha_1\\ -\alpha_2 & \alpha_1 & 0\end{bmatrix} \qquad \text{where } \alpha = \begin{bmatrix}\alpha_1\\ \alpha_2\\ \alpha_3\end{bmatrix}. \tag{2.14}
\]
It is noted that A_\alpha is skew symmetric, that is, A_\alpha^* = -A_\alpha. (Recall that T^* is the complex conjugate transpose of a matrix T.) Moreover, it is easy to verify that
\[
A_\alpha b = \alpha\times b \tag{2.15}
\]
where \times denotes the cross product and b is a vector in R^3. In particular, we have A_\alpha\alpha = \alpha\times\alpha = 0.

First let us compute e^{A_\alpha\theta} when \alpha is a unit vector, that is, \|\alpha\| = 1. To this end, set P = A_\alpha^* A_\alpha. We claim that
\[
P = A_\alpha^* A_\alpha = I - \alpha\alpha^* \qquad (\text{when } \|\alpha\|=1). \tag{2.16}
\]
Moreover, P = A_\alpha^* A_\alpha is the orthogonal projection onto the null space of \alpha^*, that is, P = P^2 = P^* and the range of P equals K, the null space of \alpha^*. To be precise, the range of P equals K = \{x \in C^3 : \alpha^* x = 0\}. Finally, the null space of P equals the null space of A_\alpha, which equals the span of \alpha. (The notion of an orthogonal projection is a bit beyond the scope of these notes.)

Using 1 = \alpha_1^2+\alpha_2^2+\alpha_3^2, we obtain
\[
A_\alpha^* A_\alpha = \begin{bmatrix}0 & \alpha_3 & -\alpha_2\\ -\alpha_3 & 0 & \alpha_1\\ \alpha_2 & -\alpha_1 & 0\end{bmatrix}\begin{bmatrix}0 & -\alpha_3 & \alpha_2\\ \alpha_3 & 0 & -\alpha_1\\ -\alpha_2 & \alpha_1 & 0\end{bmatrix}
= \begin{bmatrix}\alpha_2^2+\alpha_3^2 & -\alpha_1\alpha_2 & -\alpha_1\alpha_3\\ -\alpha_1\alpha_2 & \alpha_1^2+\alpha_3^2 & -\alpha_2\alpha_3\\ -\alpha_1\alpha_3 & -\alpha_2\alpha_3 & \alpha_1^2+\alpha_2^2\end{bmatrix}
\]
\[
= \begin{bmatrix}1-\alpha_1^2 & -\alpha_1\alpha_2 & -\alpha_1\alpha_3\\ -\alpha_1\alpha_2 & 1-\alpha_2^2 & -\alpha_2\alpha_3\\ -\alpha_1\alpha_3 & -\alpha_2\alpha_3 & 1-\alpha_3^2\end{bmatrix}
= \begin{bmatrix}1&0&0\\ 0&1&0\\ 0&0&1\end{bmatrix} - \begin{bmatrix}\alpha_1\\ \alpha_2\\ \alpha_3\end{bmatrix}\begin{bmatrix}\alpha_1 & \alpha_2 & \alpha_3\end{bmatrix}
= I - \alpha\alpha^*.
\]



Hence A_\alpha^* A_\alpha = I - \alpha\alpha^* and (2.16) holds. Now observe that
\[
P^* = (A_\alpha^* A_\alpha)^* = A_\alpha^* A_\alpha = P.
\]
In other words, P = P^*. (Here we used the fact that (AB)^* = B^* A^* and A^{**} = A where A and B are square matrices.) Because \alpha is a unit vector, \alpha^*\alpha = 1. Using this we obtain
\[
P^2 = (I-\alpha\alpha^*)(I-\alpha\alpha^*) = I - 2\alpha\alpha^* + \alpha\alpha^*\alpha\alpha^* = I - \alpha\alpha^* = P.
\]
Therefore P^2 = P. In particular, this implies that P^n = P for any integer n \ge 1. Because P = P^2 = P^*, it follows that P is an orthogonal projection. Finally, it is noted that the range of P equals the null space of \alpha^*. So P is the orthogonal projection onto the null space of \alpha^*.

Notice that A_\alpha^* = -A_\alpha. Hence P = A_\alpha^* A_\alpha = -A_\alpha^2, or equivalently, A_\alpha^2 = -P. Since A_\alpha\alpha = 0 and P = I - \alpha\alpha^*, we also have A_\alpha P = A_\alpha. (One can directly verify that A_\alpha\alpha = 0, or simply observe that A_\alpha\alpha = \alpha\times\alpha = 0.) This readily implies that for any integer n \ge 1, we have
\[
A_\alpha^{2n} = (-P)^n = (-1)^n P = (-1)^n(I-\alpha\alpha^*), \qquad
A_\alpha^{2n+1} = A_\alpha A_\alpha^{2n} = (-1)^n A_\alpha P = (-1)^n A_\alpha.
\]
In other words,
\[
A_\alpha^{2n} = (-1)^n(I-\alpha\alpha^*) \quad\text{and}\quad A_\alpha^{2n+1} = (-1)^n A_\alpha \qquad (\text{when } n \ge 1 \text{ and } \|\alpha\|=1). \tag{2.17}
\]

It is noted that one can also use the Cayley-Hamilton Theorem to derive (2.17); see the axis-angle representation in Wikipedia and Problem 4. Since the Cayley-Hamilton Theorem is a bit beyond the scope of the notes, we derived (2.17) by only using the properties of A_\alpha. Finally, we also wanted to introduce the orthogonal projection P, which provides another perspective in the analysis of e^{A_\alpha\theta}.

Recall that the \cos(\theta) and \sin(\theta) functions admit Taylor series expansions of the form:
\[
\cos(\theta) = \sum_{n=0}^{\infty} \frac{(-1)^n\theta^{2n}}{(2n)!} = 1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \frac{\theta^6}{6!} + \cdots
\]
\[
\sin(\theta) = \sum_{n=0}^{\infty} \frac{(-1)^n\theta^{2n+1}}{(2n+1)!} = \frac{\theta}{1!} - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \frac{\theta^7}{7!} + \cdots \tag{2.18}
\]

By decomposing the Taylor series expansion for e^{A_\alpha\theta} into its even and odd components with (2.17), we obtain
\[
e^{A_\alpha\theta} = \sum_{n=0}^{\infty} \frac{A_\alpha^n\theta^n}{n!} = I + \sum_{n=1}^{\infty} \frac{A_\alpha^{2n}\theta^{2n}}{(2n)!} + \sum_{n=0}^{\infty} \frac{A_\alpha^{2n+1}\theta^{2n+1}}{(2n+1)!}
\]
\[
= I + (I-\alpha\alpha^*)\sum_{n=1}^{\infty} \frac{(-1)^n\theta^{2n}}{(2n)!} + A_\alpha\sum_{n=0}^{\infty} \frac{(-1)^n\theta^{2n+1}}{(2n+1)!}
\]
\[
= \alpha\alpha^* + (I-\alpha\alpha^*)\sum_{n=0}^{\infty} \frac{(-1)^n\theta^{2n}}{(2n)!} + A_\alpha\sin(\theta)
= \alpha\alpha^* + (I-\alpha\alpha^*)\cos(\theta) + A_\alpha\sin(\theta).
\]



Therefore the rotation matrix around \alpha is determined by
\[
e^{A_\alpha\theta} = \cos(\theta)I + A_\alpha\sin(\theta) + \alpha\alpha^*(1-\cos(\theta)) \qquad (\text{when } \|\alpha\|=1). \tag{2.19}
\]
Notice that A_{c\alpha} = cA_\alpha where c is a constant. If \alpha \neq 0, then \alpha/\|\alpha\| is a unit vector. Now observe that A_\alpha = \|\alpha\| A_{\alpha/\|\alpha\|}. Hence
\[
e^{A_\alpha\theta} = e^{A_{\alpha/\|\alpha\|}\,\|\alpha\|\theta}.
\]
So replacing \alpha by \alpha/\|\alpha\| and \theta by \|\alpha\|\theta in (2.19), we obtain the following general form for a rotation matrix about \alpha:
\[
e^{A_\alpha\theta} = \cos(\|\alpha\|\theta)I + \frac{A_\alpha}{\|\alpha\|}\sin(\|\alpha\|\theta) + \frac{\alpha\alpha^*}{\|\alpha\|^2}(1-\cos(\|\alpha\|\theta)) \qquad (\text{if } \alpha\neq 0). \tag{2.20}
\]
Equation (2.20) shows that the period of e^{A_\alpha\theta} is 2\pi/\|\alpha\|, or equivalently, as \theta moves from 0 to 2\pi/\|\alpha\|, e^{A_\alpha\theta} rotates once around \alpha. If \alpha = 0, then A_\alpha = 0 and e^{A_0\theta} = I. Finally, because A_\alpha is skew symmetric, e^{A_\alpha\theta} is a unitary matrix for all real \theta, that is, \big(e^{A_\alpha\theta}\big)^* e^{A_\alpha\theta} = I, the identity matrix.

Rotational dynamics and e^{A_\alpha\theta}. This section is motivated by Chapter 2 in Tong [42]. In this section, we will use rotation matrices to provide some insight into classical rotational dynamics for a particle moving in a rotating frame. Here we only consider the case when the rotation is described by the rotation matrix e^{A_\alpha\theta} and not the general setting.

As before, let A_\alpha be the skew symmetric matrix corresponding to \alpha; see (2.1). Let e^{A_\alpha\theta}: B \to I be the rotation matrix mapping the body frame or rotating frame B into the inertial frame or fixed reference frame I. (For example, the rotation of the earth or a merry-go-round.) Here \theta = \theta(t) is a differentiable function of time. The angular velocity is defined by \omega = \alpha\dot{\theta}. If \theta = t, then \omega = \alpha. Using \frac{d}{dt}e^{A_\alpha\theta} = A_\alpha\dot{\theta}e^{A_\alpha\theta}, we have
\[
\frac{d}{dt}e^{A_\alpha\theta} = A_\omega e^{A_\alpha\theta} = e^{A_\alpha\theta}A_\omega \qquad (\omega = \alpha\dot{\theta}). \tag{2.21}
\]
The last equality follows from the fact that A_\alpha commutes with e^{A_\alpha\theta}.

Let b \in B be a vector for the position of a particle in the body frame and r = e^{A_\alpha\theta}b its corresponding position in the inertial frame I. Here b = b(t) is a differentiable vector function of time. Recall that e^{-A_\alpha\theta} is the inverse of e^{A_\alpha\theta}. The velocities \dot{r} in I and \dot{b} in B are given by

(i) r = e^{A_\alpha\theta}b and b = e^{-A_\alpha\theta}r

(ii) \dot{r} = A_\omega e^{A_\alpha\theta}b + e^{A_\alpha\theta}\dot{b} = \omega\times r + e^{A_\alpha\theta}\dot{b}

(iii) \dot{r} = e^{A_\alpha\theta}\big(A_\omega b + \dot{b}\big) = e^{A_\alpha\theta}\big(\omega\times b + \dot{b}\big)

(iv) \dot{b} = e^{-A_\alpha\theta}\big(\dot{r} - A_\omega r\big) = e^{-A_\alpha\theta}\big(\dot{r} - \omega\times r\big).



Part (i) states that r = e^{A_\alpha\theta}b in the I frame is represented by, or identified with, b = e^{-A_\alpha\theta}r in the B frame. Part (ii) shows that the velocity \dot{r} in I equals \omega\times r + e^{A_\alpha\theta}\dot{b} where \dot{b} is the velocity of the particle in the B frame. Part (iii) shows that the velocity \dot{r} in the I frame is represented by \omega\times b + \dot{b} in the B frame. Part (iv) states the velocity \dot{b} in B is represented by \dot{r} - \omega\times r in the I frame.

By using Part (iii) with (2.21), we see that the acceleration \ddot{r} is given by
\[
\ddot{r} = e^{A_\alpha\theta}\big(A_\omega^2 b + 2A_\omega\dot{b} + A_{\dot\omega}b + \ddot{b}\big) = e^{A_\alpha\theta}\big(\omega\times(\omega\times b) + 2\omega\times\dot{b} + \dot\omega\times b + \ddot{b}\big). \tag{2.22}
\]
Hence the acceleration \ddot{r} in the inertial frame I is represented by A_\omega^2 b + 2A_\omega\dot{b} + A_{\dot\omega}b + \ddot{b} in the B frame. Solving for \ddot{b} yields the following formula for the acceleration \ddot{b} in the B frame:
\[
\ddot{b} = e^{-A_\alpha\theta}\ddot{r} - A_\omega^2 b - 2A_\omega\dot{b} - A_{\dot\omega}b = e^{-A_\alpha\theta}\ddot{r} - \omega\times(\omega\times b) - 2\omega\times\dot{b} - \dot\omega\times b. \tag{2.23}
\]
Moreover, -A_\omega^2 b = -\omega\times(\omega\times b) is called the centrifugal acceleration, while -2A_\omega\dot{b} = -2\omega\times\dot{b} is the Coriolis acceleration, and -A_{\dot\omega}b = -\dot\omega\times b is the Euler acceleration.

Recall that Newton's second law states that F = m\ddot{r} where F is the force on a particle of mass m at position r. Multiplying (2.22) by m yields Newton's equation of motion in the I frame:
\[
F = m\ddot{r} = e^{A_\alpha\theta}\big(mA_\omega^2 b + 2mA_\omega\dot{b} + mA_{\dot\omega}b + m\ddot{b}\big). \tag{2.24}
\]
By rearranging terms or multiplying (2.23) by m with F = m\ddot{r}, we obtain Newton's equations of motion in the B frame:
\[
m\ddot{b} = e^{-A_\alpha\theta}F - mA_\omega^2 b - 2mA_\omega\dot{b} - mA_{\dot\omega}b = e^{-A_\alpha\theta}F - m\omega\times(\omega\times b) - 2m\omega\times\dot{b} - m\dot\omega\times b. \tag{2.25}
\]
Finally, -mA_\omega^2 b = -m\omega\times(\omega\times b) is called the centrifugal force, while -2mA_\omega\dot{b} = -2m\omega\times\dot{b} is the Coriolis force, and -mA_{\dot\omega}b = -m\dot\omega\times b is the Euler force; see also the Rotating reference frame in Wikipedia.

The rotation matrix e^{A_\alpha\theta} is a special case of a general rotation matrix used in dynamics. A rotation matrix U = U(t) is a differentiable unitary matrix on R^3 whose determinant equals one. In this case, \dot{U} = A_\omega U = UA_\varphi where \omega = \omega(t) and \varphi = \varphi(t) are two vectors in R^3 with corresponding skew symmetric matrices A_\omega and A_\varphi. The \omega and \varphi correspond to angular velocities in different frames. If U = e^{A_\alpha\theta}, then \alpha\dot{\theta} = \omega = \varphi. The main difference between using U and e^{A_\alpha\theta} is that one has to keep track of the angular velocities \omega and \varphi when using U. For example, assume that U maps B into I and r = Ub where b is in B. Then the velocity \dot{r} = \omega\times r + U\dot{b}. The acceleration \ddot{r} of r in the I frame is given by
\[
\ddot{r} = U\big(A_\varphi^2 b + 2A_\varphi\dot{b} + A_{\dot\varphi}b + \ddot{b}\big) = U\big(\varphi\times(\varphi\times b) + 2\varphi\times\dot{b} + \dot\varphi\times b + \ddot{b}\big).
\]
In this setting, Newton's equations of motion F = m\ddot{r} in the B frame are given by
\[
m\ddot{b} = U^*F - mA_\varphi^2 b - 2mA_\varphi\dot{b} - mA_{\dot\varphi}b = U^*F - m\varphi\times(\varphi\times b) - 2m\varphi\times\dot{b} - m\dot\varphi\times b.
\]



Finally, -mA_\varphi^2 b = -m\varphi\times(\varphi\times b) is the centrifugal force, while -2mA_\varphi\dot{b} = -2m\varphi\times\dot{b} is the Coriolis force and -mA_{\dot\varphi}b = -m\dot\varphi\times b is the Euler force.

5.2.1 Exercise

Problem 1. Consider the skew symmetric matrix
\[
A = \begin{bmatrix}0 & -3 & 0\\ 3 & 0 & -4\\ 0 & 4 & 0\end{bmatrix}.
\]
Find the exponential matrix e^{A\theta}.

Problem 2. Let \alpha be a nonzero vector in R^3. Recall that the trace of a square matrix A, denoted by trace(A), is the sum of the diagonal entries of A.

(i) Use equation (2.20) to show that
\[
\cos(\|\alpha\|\theta) = \frac{\mathrm{trace}\big(e^{A_\alpha\theta}\big) - 1}{2}.
\]
(ii) Show that
\[
\big(e^{A_\alpha\theta}\big)^* = e^{-A_\alpha\theta} = \cos(\|\alpha\|\theta)I - \frac{A_\alpha}{\|\alpha\|}\sin(\|\alpha\|\theta) + \frac{\alpha\alpha^*}{\|\alpha\|^2}(1-\cos(\|\alpha\|\theta)) \qquad (\text{if } \alpha\neq 0).
\]
(iii) If \theta \neq \frac{\pi j}{\|\alpha\|} where j is an integer, then show that
\[
\frac{A_\alpha}{\|\alpha\|} = \frac{e^{A_\alpha\theta} - \big(e^{A_\alpha\theta}\big)^*}{2\sin(\|\alpha\|\theta)}.
\]
For similar results see the axis-angle representation in Wikipedia.

Problem 3. Let \alpha be a unit vector in R^3. Then use the eigenvalue method in Section 5.1.1 to show that
\[
e^{A_\alpha\theta} = \cos(\theta)I + A_\alpha\sin(\theta) + \alpha\alpha^*(1-\cos(\theta)) \qquad (\text{if } \|\alpha\|=1).
\]

Problem 4. Let A be a square matrix. The Cayley-Hamilton Theorem states that p(A) = 0 where p(s) = \det[sI - A] is the characteristic polynomial for A. One can use the Cayley-Hamilton Theorem to compute e^{At}; see [2]. Let \alpha be a vector in R^3. Then the characteristic polynomial for A_\alpha is given by p(s) = s(s^2 + \|\alpha\|^2). So the eigenvalues for A_\alpha are \lambda_1 = 0, \lambda_2 = i\|\alpha\| and \lambda_3 = -i\|\alpha\|. For the reader familiar with the Cayley-Hamilton method of computing e^{At}, first observe that
\[
e^{A_\alpha\theta} = f(\theta)I + g(\theta)A_\alpha + h(\theta)A_\alpha^2 \tag{2.26}
\]
where f(\theta), g(\theta) and h(\theta) are three functions of \theta. Moreover, these three functions must satisfy the following three equations:
\[
e^{\lambda_j\theta} = f(\theta) + g(\theta)\lambda_j + h(\theta)\lambda_j^2 \qquad (\text{for } j = 1, 2, 3).
\]
Since \lambda_1 = 0, the function f(\theta) = 1. Solve for g(\theta) and h(\theta). Then use (2.26) to show that when \alpha \neq 0
\[
e^{A_\alpha\theta} = \cos(\|\alpha\|\theta)I + \frac{A_\alpha}{\|\alpha\|}\sin(\|\alpha\|\theta) + \frac{\alpha\alpha^*}{\|\alpha\|^2}(1-\cos(\|\alpha\|\theta)) \qquad (\text{if } \alpha\neq 0).
\]
For similar results see the axis-angle representation in Wikipedia.

Problem 5. Consider the skew symmetric matrix
\[
A_\alpha = \begin{bmatrix}0 & -\alpha_3 & \alpha_2\\ \alpha_3 & 0 & -\alpha_1\\ -\alpha_2 & \alpha_1 & 0\end{bmatrix} \qquad\text{where } \alpha = \begin{bmatrix}\alpha_1\\ \alpha_2\\ \alpha_3\end{bmatrix} \tag{2.27}
\]
is a real vector. Show that A_\alpha b = \alpha\times b where \times denotes the cross product and b is a vector in R^3. For any vectors b and c in R^3 show that
\[
e^{A_\alpha\theta}(b\times c) = \big(e^{A_\alpha\theta}b\big)\times\big(e^{A_\alpha\theta}c\big).
\]

Part b. By taking the inverse Laplace transform of (sI - A_\alpha)^{-1} with \alpha \neq 0, directly show (without using (2.4)) that in general the rotation matrix
\[
e^{A_\alpha\theta} = \cos(\|\alpha\|\theta)I + \frac{A_\alpha}{\|\alpha\|}\sin(\|\alpha\|\theta) + \frac{\alpha\alpha^*}{\|\alpha\|^2}(1-\cos(\|\alpha\|\theta)) \qquad (\text{when } \alpha\neq 0). \tag{2.28}
\]
Part c. Use the formula for e^{A_\alpha\theta} in (2.20) or (2.28) to directly verify that
\[
\frac{d}{d\theta}e^{A_\alpha\theta} = A_\alpha e^{A_\alpha\theta}.
\]
Part d. Let r = e^{A_\alpha\theta}b where b = b(t) is a differentiable vector function. Using b = e^{-A_\alpha\theta}r, show that the acceleration \ddot{b} for b is given by
\[
\ddot{b} = e^{-A_\alpha\theta}\big(A_\omega^2 r - 2A_\omega\dot{r} - A_{\dot\omega}r + \ddot{r}\big) = e^{-A_\alpha\theta}\big(\omega\times(\omega\times r) - 2\omega\times\dot{r} - \dot\omega\times r + \ddot{r}\big).
\]

Problem 6. Use the plot3 command in Matlab to draw a three dimensional ellipse around the axis formed by [1 \ {-2} \ 1]^{tr} with a maximum length of 8 along this axis and a maximum height of 2 in the orthogonal plane. Include the [1 \ {-2} \ 1]^{tr} axis in your plot, label the x, y and z axes, and place a grid on the plot.

Problem 7. Use the plot3 command in Matlab to draw a cylinder around the axis formed by [1 \ {-2} \ 1]^{tr} with a radius of 2 around the axis and a length of 4 along this axis. Include the [1 \ {-2} \ 1]^{tr} axis in your plot, label the x, y and z axes, and place a grid on the plot.

Problem 8. Use the comet3 command in Matlab with hold on to draw a circle of radius two orbiting the axis formed by [1 \ {-1} \ 1]^{tr} three times counterclockwise. Moreover, make [0 \ 0 \ 0]^{tr} the center of the circle. Plot both the axis formed by [1 \ {-1} \ 1]^{tr} and the circle of radius two on the same comet3 plot. Finally, label the x, y and z axes in the comet3 graph, and place a grid on the plot. Hint: set \alpha = \frac{1}{\sqrt{3}}[1 \ {-1} \ 1]^{tr} and consider e^{A_\alpha\theta}v for 0 \le \theta \le 6\pi where v \perp \alpha and \|v\| = 2.

Problem 9. The rotation group SO(3) is the set of all unitary matrices R on R^3 whose determinant equals 1. In particular, if R_1 and R_2 are both in SO(3), then their product R_1R_2 is also in SO(3). The rotation group
\[
SO(3) = \{e^{A_\alpha} : A_\alpha \text{ is a real skew symmetric matrix}\}.
\]
In other words, a matrix R is in SO(3) if and only if R = e^{A_\alpha} where A_\alpha is a skew symmetric matrix on R^3. In particular, R is a rotation matrix. Because e^{A_\alpha\theta} is periodic, the skew symmetric matrix A_\alpha is not uniquely determined by R. (This also follows from the fact that the logarithm of a matrix is not unique.) We have shown that e^{A_\alpha} is in SO(3), that is, e^{A_\alpha} is a real unitary matrix with determinant 1. Now assume that R is in SO(3). Then show that R = e^{A_\alpha} where A_\alpha is a real skew symmetric matrix. For further details on the rotation group SO(3), see [12, 17] and the rotation group SO(3) in Wikipedia. Finally, also see the logarithm of a matrix in Wikipedia.

Hint: Because R is a real unitary matrix with determinant 1, the eigenvalues for R are given by \{1, e^{\pm i\varphi}\} where \varphi is an angle in [0, \pi]. If \varphi = 0, then R = I and A_\alpha = 0. Let \{\xi_1, \xi_2, \xi_3\} be the unit eigenvectors for the eigenvalues \{1, e^{i\varphi}, e^{-i\varphi}\} respectively, that is, R\xi_1 = \xi_1, R\xi_2 = e^{i\varphi}\xi_2 and R\xi_3 = e^{-i\varphi}\xi_3, and \|\xi_j\| = 1 for j = 1, 2, 3. Recall that the spectral decomposition for R is given by
\[
R = \xi_1\xi_1^* + e^{i\varphi}\xi_2\xi_2^* + e^{-i\varphi}\xi_3\xi_3^* = \begin{bmatrix}\xi_1 & \xi_2 & \xi_3\end{bmatrix}\begin{bmatrix}1 & 0 & 0\\ 0 & e^{i\varphi} & 0\\ 0 & 0 & e^{-i\varphi}\end{bmatrix}\begin{bmatrix}\xi_1^*\\ \xi_2^*\\ \xi_3^*\end{bmatrix}.
\]
Then show that R = e^{A_\alpha} where
\[
A_\alpha = i\varphi\,\xi_2\xi_2^* - i\varphi\,\xi_3\xi_3^* \quad\text{if } 0 \le \varphi < \pi,
\qquad
A_\alpha = \begin{bmatrix}\xi_2 & \xi_3\end{bmatrix}\begin{bmatrix}0 & -\pi\\ \pi & 0\end{bmatrix}\begin{bmatrix}\xi_2^*\\ \xi_3^*\end{bmatrix} \quad\text{if } \varphi = \pi.
\]
If 0 < \varphi < \pi, then the eigenvector \xi_3 = \overline{\xi_2}, the complex conjugate of the eigenvector \xi_2, and A_\alpha = 2\,\mathrm{Re}(i\varphi\,\xi_2\xi_2^*). Since A_\alpha + A_\alpha^* = 0, we see that A_\alpha is a real skew symmetric matrix. If \varphi = \pi, then the eigenvalues for R are \{1, -1, -1\}, the corresponding eigenvectors are real, and A_\alpha is a real skew symmetric matrix. Finally, it is noted that we choose A_\alpha such that the eigenvalues for A_\alpha are \{0, \pm i\varphi\}.

5.3 State space input output maps

The general state space model for a differential equation is given by
\[
\dot{x} = Ax + Bu \quad\text{and}\quad y = Cx + Du. \tag{3.1}
\]
Here A is an n\times n matrix, B is a column vector in C^n, while C is a 1\times n row vector and D is a scalar. The input or forcing function is u(t), the state x(t) is a vector in C^n and y(t) is the output. To be more specific,
\[
\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\\ \vdots\\ \dot{x}_n\end{bmatrix} = \begin{bmatrix}a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn}\end{bmatrix}\begin{bmatrix}x_1\\ x_2\\ \vdots\\ x_n\end{bmatrix} + \begin{bmatrix}b_1\\ b_2\\ \vdots\\ b_n\end{bmatrix}u \tag{3.2}
\]
\[
y = \begin{bmatrix}c_1 & c_2 & \cdots & c_n\end{bmatrix}x + Du.
\]
Here A is the n\times n matrix with entries a_{jk} and the state variable x = [x_1, x_2, \cdots, x_n]^{tr}. The column vector B = [b_1, b_2, \cdots, b_n]^{tr} and the row vector C = [c_1, c_2, \cdots, c_n]. Finally, the initial condition is x(0) = [x_1(0), x_2(0), \cdots, x_n(0)]^{tr}.

The solution to the state space system in (3.1) is given by
\[
x(t) = e^{At}x(0) + \int_0^t e^{A(t-\sigma)}Bu(\sigma)\,d\sigma \tag{3.3}
\]
\[
y(t) = Ce^{At}x(0) + \int_0^t Ce^{A(t-\sigma)}Bu(\sigma)\,d\sigma + Du(t). \tag{3.4}
\]
To derive this solution let us take the Laplace transform of the state space system in (3.1), that is,
\[
sX(s) - x(0) = AX(s) + BU(s).
\]
Thus (sI - A)X = x(0) + BU, or equivalently,
\[
X(s) = (sI - A)^{-1}x(0) + (sI - A)^{-1}BU(s). \tag{3.5}
\]
Recall that e^{At} is the inverse Laplace transform of (sI - A)^{-1}, and multiplication in the s-domain corresponds to convolution in the time domain. So by taking the inverse Laplace transform in (3.5), we have
\[
x(t) = e^{At}x(0) + \big((e^{At}B) \otimes u\big)(t) = e^{At}x(0) + \int_0^t e^{A(t-\sigma)}Bu(\sigma)\,d\sigma.
\]
This yields the formula for x(t) in (3.3). The formula for y(t) in (3.4) follows by substituting the expression for x in (3.3) into y(t) = Cx(t) + Du(t). Therefore (3.4) holds.
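The variation-of-parameters formula (3.3) can be checked numerically. For a constant input u \equiv 1 and an invertible A, the convolution integral has the closed form A^{-1}(e^{At} - I)B, which the sketch below compares against a direct Riemann-sum evaluation of the integral. This is a Python/NumPy sketch (the notes use Matlab); the sample matrices A, B and the helper `expm_series` are our own choices:

```python
import numpy as np

def expm_series(M, terms=40):
    """Matrix exponential via the truncated power series sum of M^n / n!."""
    out, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for n in range(1, terms):
        term = term @ M / n
        out = out + term
    return out

# sample system: x' = A x + B u, with u(t) = 1 and x(0) = x0
A = np.array([[0.0, -5.0], [1.0, -2.0]])   # det A = 5, so A is invertible
B = np.array([2.0, -1.0])
x0 = np.array([1.0, 0.0])
t = 1.5

# closed form: x(t) = e^{At} x0 + A^{-1} (e^{At} - I) B
x_closed = expm_series(A * t) @ x0 + np.linalg.solve(A, (expm_series(A * t) - np.eye(2)) @ B)

# midpoint Riemann sum of e^{At} x0 + int_0^t e^{A(t - s)} B ds
N = 4000
ds = t / N
x_num = expm_series(A * t) @ x0
for k in range(N):
    s = (k + 0.5) * ds
    x_num = x_num + expm_series(A * (t - s)) @ B * ds

assert np.allclose(x_closed, x_num, atol=1e-4)
```

The agreement illustrates that the convolution term in (3.3) really is the forced part of the response, layered on top of the free response e^{At}x(0).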



Figure 5.3: A mass spring damper system (mass m, spring constant k, damping constant c, displacement y, forcing function u)

A simple mass spring damper system revisited. For an example, consider the mass, spring and damper system presented in Figure 5.3. Here m is the mass, c is the damping constant, k is the spring constant and u is the forcing function. In this setting, y is the position of the mass and \dot{y} is the velocity of the mass. By applying Newton's second law, the equation of motion is given by the following second order differential equation:
\[
m\ddot{y} + c\dot{y} + ky = u. \tag{3.6}
\]
To convert this differential equation to a state space system, consider the state space variables defined by
\[
x_1 = y \quad\text{and}\quad x_2 = \dot{x}_1 = \dot{y}. \tag{3.7}
\]

Notice that
\[
\dot{x}_2 = \ddot{y} = -\frac{c}{m}\dot{y} - \frac{k}{m}y + \frac{1}{m}u = -\frac{c}{m}x_2 - \frac{k}{m}x_1 + \frac{1}{m}u.
\]
Combining this with (3.7), we arrive at the following state space model for the mass, spring damper system in (3.6):
\[
\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\end{bmatrix} = \begin{bmatrix}0 & 1\\ -\frac{k}{m} & -\frac{c}{m}\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} + \begin{bmatrix}0\\ \frac{1}{m}\end{bmatrix}u
\qquad
y = \begin{bmatrix}1 & 0\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix}. \tag{3.8}
\]
The initial conditions are given by x_1(0) = y(0) and x_2(0) = \dot{y}(0). Notice that in this example the state space consists of two states: x_1 = y, which is the distance of the mass from the origin, and x_2 = \dot{y}, the velocity of the mass. The input to this state space system is the forcing function u and the output y is the position of the mass.

One can also construct a state space system for the mass spring damper system in (3.6) where the output v is the velocity of the mass, that is, v = \dot{y} = x_2. In this case, the corresponding state space realization is given by
\[
\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\end{bmatrix} = \begin{bmatrix}0 & 1\\ -\frac{k}{m} & -\frac{c}{m}\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} + \begin{bmatrix}0\\ \frac{1}{m}\end{bmatrix}u
\qquad
v = \begin{bmatrix}0 & 1\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix}. \tag{3.9}
\]
The initial conditions are determined by x_1(0) = y(0) and x_2(0) = \dot{y}(0).

Suppose that one wants the output z to be three times the distance minus two times the velocity of the mass, that is, z = 3y - 2\dot{y} = 3x_1 - 2x_2. Then the corresponding realization is determined by
\[
\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\end{bmatrix} = \begin{bmatrix}0 & 1\\ -\frac{k}{m} & -\frac{c}{m}\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} + \begin{bmatrix}0\\ \frac{1}{m}\end{bmatrix}u
\qquad
z = \begin{bmatrix}3 & -2\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix}. \tag{3.10}
\]
The initial conditions are given by x_1(0) = y(0) and x_2(0) = \dot{y}(0).
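The three realizations share the same A and B; only the output row C changes. A Python/NumPy sketch (the notes use Matlab; the numeric values of m, k, c and the sample state are arbitrary choices of ours):

```python
import numpy as np

# m y'' + c y' + k y = u, with state x = [y, y']^T as in (3.7)
m, k, c = 2.0, 8.0, 1.0                      # sample physical parameters
A = np.array([[0.0, 1.0], [-k / m, -c / m]])
B = np.array([0.0, 1.0 / m])

C_pos = np.array([1.0, 0.0])                 # output y, as in (3.8)
C_vel = np.array([0.0, 1.0])                 # output y', as in (3.9)
C_mix = np.array([3.0, -2.0])                # output 3y - 2y', as in (3.10)

x = np.array([0.5, -0.1])                    # sample state: position 0.5, velocity -0.1
assert np.isclose(C_pos @ x, 0.5)
assert np.isclose(C_vel @ x, -0.1)
assert np.isclose(C_mix @ x, 3 * 0.5 - 2 * (-0.1))

# with positive damping c > 0 the free motion decays:
assert np.all(np.linalg.eigvals(A).real < 0)
```

Swapping C without touching A or B is the point of the state space formalism: the internal dynamics are fixed, while the measured output is just a linear readout of the state.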

Converting a linear differential equation to state space form. To complete this section, let us demonstrate that any linear differential equation can be expressed in state space form. To see this consider the fourth order differential equation
\[
y^{(4)} + a_3y^{(3)} + a_2\ddot{y} + a_1\dot{y} + a_0y = bu \tag{3.11}
\]
where \{a_k\}_0^3 and b are constants. The initial conditions are \{y^{(k)}(0)\}_0^3. The input or forcing function is u and y is the output. Now let \{x_k\}_1^4 be the state variables defined by
\[
x_1 = y, \quad x_2 = \dot{x}_1 = \dot{y}, \quad x_3 = \dot{x}_2 = \ddot{y} \quad\text{and}\quad x_4 = \dot{x}_3 = y^{(3)}. \tag{3.12}
\]
By combining (3.11) and (3.12), we have
\[
\dot{x}_4 = y^{(4)} = -a_0y - a_1\dot{y} - a_2\ddot{y} - a_3y^{(3)} + bu = -a_0x_1 - a_1x_2 - a_2x_3 - a_3x_4 + bu. \tag{3.13}
\]
Rewriting (3.12) and (3.13) in matrix form yields the following state space model for the differential equation in (3.11):
\[
\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\\ \dot{x}_3\\ \dot{x}_4\end{bmatrix} = \begin{bmatrix}0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ -a_0 & -a_1 & -a_2 & -a_3\end{bmatrix}\begin{bmatrix}x_1\\ x_2\\ x_3\\ x_4\end{bmatrix} + \begin{bmatrix}0\\ 0\\ 0\\ b\end{bmatrix}u \tag{3.14}
\]
\[
y = \begin{bmatrix}1 & 0 & 0 & 0\end{bmatrix}x.
\]
As expected, the state x = [x_1, x_2, x_3, x_4]^{tr}. The initial conditions are specified by
\[
x(0) = \begin{bmatrix}x_1(0)\\ x_2(0)\\ x_3(0)\\ x_4(0)\end{bmatrix} = \begin{bmatrix}y(0)\\ \dot{y}(0)\\ \ddot{y}(0)\\ y^{(3)}(0)\end{bmatrix}.
\]

By following the technique in the previous example, one can convert any linear differential equation to a state space system of the form (3.1). To see this, consider any input-output system, with input u(t) and output y(t), described by an n-th order differential equation of the form
\[
y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1\dot{y} + a_0y = u \tag{3.15}
\]
where y^{(j)} denotes the j-th derivative of y and a_j is a scalar for j = 0, 1, \cdots, n-1. To convert this system to state space form, let x_1 = y, x_2 = \dot{y}, \cdots, x_n = y^{(n-1)}. Then using \dot{x}_j = x_{j+1} for j = 1, 2, \cdots, n-1 along with \dot{x}_n = y^{(n)}, we obtain
\[
\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\\ \vdots\\ \dot{x}_{n-1}\\ \dot{x}_n\end{bmatrix} = \begin{bmatrix}0 & 1 & 0 & \cdots & 0 & 0\\ 0 & 0 & 1 & \cdots & 0 & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\ 0 & 0 & 0 & \cdots & 0 & 1\\ -a_0 & -a_1 & -a_2 & \cdots & -a_{n-2} & -a_{n-1}\end{bmatrix}\begin{bmatrix}x_1\\ x_2\\ \vdots\\ x_{n-1}\\ x_n\end{bmatrix} + \begin{bmatrix}0\\ 0\\ \vdots\\ 0\\ 1\end{bmatrix}u \tag{3.16}
\]
\[
y = \begin{bmatrix}1 & 0 & 0 & \cdots & 0 & 0\end{bmatrix}x.
\]
The initial condition is given by x(0) = [y(0), y^{(1)}(0), \cdots, y^{(n-1)}(0)]^{tr} where tr denotes the transpose. Therefore, one can readily obtain a state space representation for any system described by (3.15).
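The companion-form construction in (3.16) is mechanical, so it is easy to automate and to check: the characteristic polynomial of the resulting A must reproduce d(s) = s^n + a_{n-1}s^{n-1} + \cdots + a_0. A Python/NumPy sketch (the notes use Matlab; the function name `companion_realization` and the sample coefficients are ours):

```python
import numpy as np

def companion_realization(a):
    """State matrices for y^(n) + a[n-1] y^(n-1) + ... + a[0] y = u, per (3.16).

    a is the coefficient list [a_0, a_1, ..., a_{n-1}].
    """
    n = len(a)
    A = np.zeros((n, n))
    A[:-1, 1:] = np.eye(n - 1)        # ones on the superdiagonal
    A[-1, :] = -np.asarray(a)         # last row: -a_0, ..., -a_{n-1}
    B = np.zeros(n); B[-1] = 1.0      # input enters the last state equation
    C = np.zeros(n); C[0] = 1.0       # output is x_1 = y
    return A, B, C

a = [5.0, -2.0, 3.0, 1.0]             # sample coefficients a_0..a_3
A, B, C = companion_realization(a)

# det(sI - A) = s^4 + 1 s^3 + 3 s^2 - 2 s + 5 (highest power first)
assert np.allclose(np.poly(A), [1.0, 1.0, 3.0, -2.0, 5.0])
```

Since the characteristic polynomial of A equals the denominator of the differential equation, this is also a preview of the realization theory in Section 5.4.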

5.3.1 Transfer functions for state space systems

In this section we present the transfer function for a state space system. As before, consider the state space system given by
\[
\dot{x} = Ax + Bu \quad\text{and}\quad y = Cx + Du. \tag{3.17}
\]
The input or forcing function is u and y is the output. The state space system in (3.17) is denoted by \{A, B, C, D\}. Recall that \mathcal{L}(\dot{f})(s) = sF(s) - f(0) where f(t) is a function for t \ge 0. By taking the Laplace transform of the state space system in (3.17), we obtain
\[
sX(s) - x(0) = AX(s) + BU(s)
\]
\[
Y(s) = CX(s) + DU(s)
\]
where X, Y and U are the Laplace transforms of x, y and u. Solving these equations for Y in terms of x(0) and U yields
\[
Y(s) = C(sI - A)^{-1}x(0) + \big[C(sI - A)^{-1}B + D\big]U(s). \tag{3.18}
\]
Recall that the transfer function for a system with input u and output y is given by G(s) = Y(s)/U(s) where all the initial conditions are set equal to zero. By setting x(0) = 0 in (3.18), we see that the transfer function G(s) = Y(s)/U(s) for the state space system \{A, B, C, D\} is given by
\[
G(s) = \frac{Y(s)}{U(s)} = C(sI - A)^{-1}B + D. \tag{3.19}
\]



Recall that the impulse response g for a system whose transfer function is G is given by g(t) = (\mathcal{L}^{-1}G)(t). By taking the inverse Laplace transform of G in (3.19), we see that the impulse response for \{A, B, C, D\} is given by
\[
g(t) = Ce^{At}B + D\delta(t).
\]
Notice that when x(0) = 0, the relationship between the Laplace transform of the input U and the Laplace transform of the output Y is given by Y(s) = G(s)U(s). In other words, when all the initial conditions are zero, the transfer function is the multiplication operator which maps U(s) into Y(s). Recall that convolution in the time domain corresponds to multiplication in the s-domain. Hence,
\[
y(t) = \int_0^t g(t-\sigma)u(\sigma)\,d\sigma.
\]
Obviously, G and g uniquely determine each other. So when the initial condition is zero, the input-output response of the system \{A, B, C, D\} is completely determined by its transfer function.

For an example, let us compute the transfer function G = Y/U for the state space system given by
\[
\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\end{bmatrix} = \begin{bmatrix}0 & -5\\ 1 & -2\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} + \begin{bmatrix}2\\ -1\end{bmatrix}u
\qquad
y = \begin{bmatrix}1 & -2\end{bmatrix}x. \tag{3.20}
\]
In this case, the matrices A, B, C, and D are given by
\[
A = \begin{bmatrix}0 & -5\\ 1 & -2\end{bmatrix}, \quad B = \begin{bmatrix}2\\ -1\end{bmatrix}, \quad C = \begin{bmatrix}1 & -2\end{bmatrix} \quad\text{and}\quad D = 0.
\]
Notice that
\[
sI - A = \begin{bmatrix}s & 5\\ -1 & s+2\end{bmatrix}.
\]
The characteristic polynomial is \Delta(s) = \det[sI - A] = s^2 + 2s + 5. By using the 2\times 2 matrix inversion formula in (7.2) and (7.3) in Chapter 4, we obtain
\[
(sI - A)^{-1} = \frac{1}{\Delta(s)}\begin{bmatrix}s+2 & -5\\ 1 & s\end{bmatrix}.
\]
So the transfer function G is computed by
\[
G(s) = C(sI - A)^{-1}B = \begin{bmatrix}1 & -2\end{bmatrix}\frac{1}{\Delta(s)}\begin{bmatrix}s+2 & -5\\ 1 & s\end{bmatrix}\begin{bmatrix}2\\ -1\end{bmatrix}
= \frac{1}{\Delta(s)}\begin{bmatrix}1 & -2\end{bmatrix}\begin{bmatrix}2s+9\\ 2-s\end{bmatrix} = \frac{4s+5}{\Delta(s)}.
\]



Therefore the transfer function for the state space system in (3.20) is given by
\[
G(s) = \frac{4s+5}{s^2+2s+5}. \tag{3.21}
\]
Recall that the impulse response g is the inverse Laplace transform of G. To compute the impulse response for the state space system in (3.20), notice that its transfer function G given by (3.21) admits a decomposition of the form
\[
G(s) = \frac{4s+5}{s^2+2s+5} = \frac{4(s+1)+1}{(s+1)^2+4} = \frac{4(s+1)}{(s+1)^2+4} + \frac{2}{2\left((s+1)^2+4\right)}.
\]
Therefore the impulse response g = \mathcal{L}^{-1}G for the state space system in (3.20) is given by
\[
g(t) = 4e^{-t}\cos(2t) + e^{-t}\sin(2t)/2. \tag{3.22}
\]
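Both the transfer function (3.21) and the impulse response (3.22) for the example can be verified numerically: G(s) = C(sI - A)^{-1}B pointwise in s, and g(t) = Ce^{At}B pointwise in t. A Python/NumPy sketch (the notes use Matlab; the helper names `G_state`, `G_rational` and `expm_series` are ours):

```python
import numpy as np

A = np.array([[0.0, -5.0], [1.0, -2.0]])
B = np.array([2.0, -1.0])
C = np.array([1.0, -2.0])

def G_state(s):
    """C (sI - A)^{-1} B evaluated at a point s (real or complex)."""
    return C @ np.linalg.solve(s * np.eye(2) - A, B)

def G_rational(s):
    """The closed form (4s + 5) / (s^2 + 2s + 5) from (3.21)."""
    return (4 * s + 5) / (s**2 + 2 * s + 5)

for s in [0.0, 1.0, 2.5, 1.0 + 1.0j]:
    assert np.isclose(G_state(s), G_rational(s))

def expm_series(M, terms=40):
    out, term = np.eye(2), np.eye(2)
    for n in range(1, terms):
        term = term @ M / n
        out = out + term
    return out

# impulse response: C e^{At} B should match 4 e^{-t} cos(2t) + e^{-t} sin(2t)/2 from (3.22)
for t in [0.0, 0.5, 1.0]:
    g_state = C @ expm_series(A * t) @ B
    g_formula = 4 * np.exp(-t) * np.cos(2 * t) + np.exp(-t) * np.sin(2 * t) / 2
    assert np.isclose(g_state, g_formula)
```

Note the consistency check at t = 0: CB = 1(2) + (-2)(-1) = 4, which matches the coefficient of the cosine term in (3.22).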

5.3.2 Exercise

Problem 1. Find a state space representation for the differential equation
\[
2y^{(3)} + 6\ddot{y} - 4\dot{y} + 8y = 16u.
\]

Problem 2. Consider the state space system given by
\[
\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\end{bmatrix} = \begin{bmatrix}0 & -5\\ 1 & -2\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} + \begin{bmatrix}2\\ 1\end{bmatrix}u
\qquad
y = \begin{bmatrix}1 & -2\end{bmatrix}x.
\]
Find the solution to this system when all the initial conditions are zero and u(t) = 1.

Problem 3. Find the transfer function and impulse response for the state space system given by
\[
\dot{x} = \begin{bmatrix}0 & -2\\ 1 & -3\end{bmatrix}x + \begin{bmatrix}3\\ 4\end{bmatrix}u
\qquad
y = \begin{bmatrix}1 & 2\end{bmatrix}x + 2u.
\]

Problem 4. Find the transfer function and impulse response for the state space system given by
\[
\dot{x} = \begin{bmatrix}0 & 1\\ -2 & -3\end{bmatrix}x + \begin{bmatrix}1\\ 1\end{bmatrix}u
\qquad
y = \begin{bmatrix}2 & 1\end{bmatrix}x.
\]

Problem 5. Consider the state space system
\[
\dot{x} = Ax + Bu \quad\text{and}\quad y = Cx.
\]
Assume that u(t) = \delta(t), the Dirac delta function. Then show that the solution to this system is given by
\[
y(t) = Ce^{At}x(0) + Ce^{At}B.
\]

Problem 6. Consider the state space system
\[
\dot{x} = Ax + Bu \quad\text{and}\quad y = Cx.
\]
Assume that u(t) = 1 for all t \ge 0. Then show that the solution to this system is given by
\[
y(t) = Ce^{At}x(0) + \int_0^t Ce^{A\sigma}B\,d\sigma.
\]

Problem 7. Consider the following state space system:
\[
\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\end{bmatrix} = \begin{bmatrix}-4 & 1\\ -1 & -2\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} + \begin{bmatrix}1\\ 1\end{bmatrix}u + \begin{bmatrix}-1\\ 1\end{bmatrix}\dot{u}
\qquad
y = \begin{bmatrix}1 & -1\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix}.
\]
Assume that all the initial conditions are zero.

• Find the transfer function G(s) = \frac{Y(s)}{U(s)}.

• If the input u(t) = \delta(t), then find y(t).

• If u(t) = 2, then find \lim_{t\to\infty} y(t).

Problem 8. Consider the state space system
\[
\dot{x} = \begin{bmatrix}-1 & 0 & 0\\ 0 & -2 & 0\\ 0 & 0 & -1\end{bmatrix}x + \begin{bmatrix}2\\ 0\\ 1\end{bmatrix}u
\qquad
y = \begin{bmatrix}3 & 5 & 7\end{bmatrix}x - 2u.
\]
Assume that all the initial conditions are zero.

• Find the transfer function G(s) = \frac{Y(s)}{U(s)}.

• If u(t) = \delta(t), then find the output y(t).

• If u(t) = -4, then find \lim_{t\to\infty} y(t).

Problem 9. Consider the following state space system:
\[
\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\end{bmatrix} = \begin{bmatrix}2 & -3\\ 4 & -5\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} + \begin{bmatrix}1\\ 1\end{bmatrix}u
\qquad
y = \begin{bmatrix}2 & 1\end{bmatrix}x - u(t).
\]
Assume that all the initial conditions are zero.

• Find the transfer function G(s) = \frac{Y(s)}{U(s)}.

• If the input u(t) = \delta(t), then find y(t).

• If u(t) = 2, then find \lim_{t\to\infty} y(t).

5.4 State space realizations

Recall that a function G of a complex variable is a proper rational function if G(s) = p(s)/d(s) where p and d are polynomials satisfying \deg p \le \deg d. The degree of a polynomial is denoted by \deg. The function G is a strictly proper rational function if G(s) = p(s)/d(s) and \deg p < \deg d. Notice that G is a proper rational function if and only if G admits a decomposition of the form G(s) = c(s)/d(s) + D where c and d are polynomials satisfying \deg c < \deg d, and D is a constant. In other words, G is a proper rational function if and only if G admits a decomposition of the form G(s) = G_o(s) + D where G_o is a strictly proper rational function and D is a constant.

We claim that the transfer function G for the state space system \{A, B, C, D\} is a proper rational function. To see this, we can assume, without loss of generality, that A is an n\times n matrix. Let M be the algebraic adjoint of sI - A and \Delta(s) the determinant of sI - A. Notice that \Delta(s) = \det[sI - A] is the characteristic polynomial for A. In particular, the roots of \Delta(s) are the eigenvalues for A, including their multiplicity. Then
\[
(sI - A)^{-1} = M(s)/\Delta(s)
\]
where M is an n\times n matrix consisting of polynomials of degree at most n-1. Since the degree of \Delta(s) is n, we see that (sI - A)^{-1} is a strictly proper rational function. Hence, C(sI - A)^{-1}B is also a strictly proper rational function. In fact, C(sI - A)^{-1}B = c(s)/\Delta(s) where c(s) is a polynomial of degree at most n-1. Substituting this into (3.19) shows that G(s) = C(sI - A)^{-1}B + D is a proper rational function. In particular, G(s) admits a decomposition of the form
\[
C(sI - A)^{-1}B + D = \frac{c(s)}{\det[sI - A]} + D. \tag{4.1}
\]
Finally, it is noted that the poles of G are contained in the eigenvalues of A.

We say that \{A \text{ on } C^n, B, C, D\} is a state space realization of G if G is the transfer function for \{A, B, C, D\}, that is, (3.19) holds and A is an n\times n matrix. (Recall that C^k is the set of all complex vectors of length k.) Here n is the dimension of the state. The above analysis shows that the transfer function for a state space system is a proper rational function.

Let G(s) = \frac{p(s)}{d(s)} be a transfer function where p(s) and d(s) are two polynomials. By canceling out the common roots of p(s) and d(s), without loss of generality, we can assume that p(s) and d(s) have no common roots. We say that the McMillan degree of a transfer function G(s) = \frac{p(s)}{d(s)} equals the degree of d(s) when p(s) and d(s) have no common roots. Since the poles of G(s) are the roots of d(s), the McMillan degree of G(s) also equals the



number of poles of G(s) including their multiplicity. If {A on C^n, B, C, D} is a state space realization for a transfer function G(s) = p(s)/d(s), then equation (4.1) shows that

p(s)/d(s) = G(s) = q(s)/Δ(s)

where q(s) is a polynomial of degree at most deg[Δ(s)]. Moreover, there can be common roots between q(s) and Δ(s). Because p(s) and d(s) have no common roots, it follows that Δ(s) = α(s)d(s) where α(s) is a polynomial. In particular, the poles of G(s) are contained in the eigenvalues for A. Moreover, α(s) = α_0 is a constant if and only if deg[Δ(s)] = deg[d(s)], or equivalently, n equals the McMillan degree of G(s). In this case, Δ(s) = α_0 d(s), and the poles of G(s) equal the eigenvalues for A including their multiplicity.

We say that {A on C^n, B, C, D} is a minimal realization for G if {A, B, C, D} is a state space realization of G with the lowest possible state dimension, that is, if {A_1 on C^m, B_1, C_1, D_1} is another realization for G, then n ≤ m. It is noted that {A on C^n, B, C, D} is a minimal realization for G if and only if n equals the McMillan degree for G, or equivalently, n equals the number of poles of G including their multiplicity. This proves part of the following basic result in system theory.

THEOREM 5.4.1 Let G(s) be a rational function of a complex variable s. Then G admits a state space realization {A on C^n, B, C, D} if and only if G is a proper rational function. In this case, the poles of G(s) are contained in the eigenvalues for A. Moreover, the poles of G(s) equal the eigenvalues for A including their multiplicity if and only if n equals the McMillan degree of G(s), or equivalently, {A, B, C, D} is a minimal realization for G.

Proof. The proof of this fact is motivated by the form of the matrix A in equation (3.16). Let G be a proper rational function. Then G admits a decomposition of the form

G(s) = c(s)/d(s) + D    (4.2)

where c and d are polynomials of the form

c(s) = Σ_{k=0}^{n−1} c_k s^k   and   d(s) = s^n + Σ_{k=0}^{n−1} a_k s^k .    (4.3)

Here D, {c_k}_0^{n−1} and {a_k}_0^{n−1} are scalars. Let A be the n × n matrix, B the column vector in C^n, and C the 1 × n row vector defined by

A =
[   0      1      0     ⋯     0         0      ]
[   0      0      1     ⋯     0         0      ]
[   ⋮      ⋮      ⋮     ⋱     ⋮         ⋮      ]
[   0      0      0     ⋯     0         1      ]
[ −a_0   −a_1   −a_2    ⋯   −a_{n−2}  −a_{n−1} ]

B = [0, 0, …, 0, 1]^tr ,    (4.4)

C = [c_0  c_1  c_2  ⋯  c_{n−2}  c_{n−1}] .


5.4. STATE SPACE REALIZATIONS 259

Let v(s) be the column vector in C^n defined by v(s) = [1, s, s^2, ⋯, s^{n−1}]^tr. Then we have

(sI − A)v(s) =
[   s     −1      0     ⋯     0          0        ] [ 1       ]   [ 0    ]
[   0      s     −1     ⋯     0          0        ] [ s       ]   [ 0    ]
[   ⋮      ⋮      ⋮     ⋱     ⋮          ⋮        ] [ ⋮       ] = [ ⋮    ] = B d(s) .
[   0      0      0     ⋯     s         −1        ] [ s^{n−2} ]   [ 0    ]
[  a_0    a_1    a_2    ⋯   a_{n−2}  (s + a_{n−1}) ] [ s^{n−1} ]   [ d(s) ]

Hence (sI − A)v(s) = B d(s). So whenever s is not an eigenvalue of A, we have

(sI − A)^{−1}B = v(s)/d(s) .

This, along with the definition of c, implies that

C(sI − A)^{−1}B = [c_0  c_1  ⋯  c_{n−1}] v(s)/d(s) = c(s)/d(s) .

By consulting (4.2), we have

G(s) = C(sI − A)^{−1}B + D .

Therefore, {A, B, C, D} is a realization of G. This completes the proof.

As before, let G be a proper rational function of the form (4.2) where c and d are polynomials as defined in (4.3). Moreover, let A, B, C be the matrices defined in (4.4). Then the proof of the previous theorem shows that {A, B, C, D} is a realization of G. In particular, any proper rational function of the form (4.2), (4.3) admits a state space realization whose state dimension is at most n.

To be more specific, let G be any scalar valued proper rational function. Then G admits a decomposition of the form

G(s) = (c_0 + c_1 s + ⋯ + c_{n−1}s^{n−1}) / (a_0 + a_1 s + ⋯ + a_{n−1}s^{n−1} + s^n) + D .    (4.5)

Let A, B and C be the matrices defined by

A =
[   0      1      0      0    ⋯     0      ]
[   0      0      1      0    ⋯     0      ]
[   ⋮      ⋮      ⋮      ⋮    ⋯     ⋮      ]
[   0      0      0      0    ⋯     1      ]
[ −a_0   −a_1   −a_2   −a_3   ⋯  −a_{n−1}  ]

B = [0, 0, …, 0, 1]^tr    (4.6)

C = [c_0  c_1  c_2  c_3  ⋯  c_{n−1}] .

Then our previous analysis shows that {A, B, C, D} is a realization of G.
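The companion-form recipe in (4.4)–(4.6) is easy to check numerically. The notes use Matlab; the sketch below is a Python/NumPy translation with arbitrarily chosen sample coefficients (not taken from the text). It builds the companion matrices and verifies that C(sI − A)^{−1}B + D reproduces c(s)/d(s) + D at a sample point.

```python
import numpy as np

def companion_realization(c, a, D=0.0):
    # Controllable canonical form (4.6): superdiagonal of ones,
    # last row -a_0, ..., -a_{n-1}; B = e_n; C = [c_0, ..., c_{n-1}].
    n = len(a)
    A = np.zeros((n, n))
    A[:-1, 1:] = np.eye(n - 1)
    A[-1, :] = -np.asarray(a, dtype=float)
    B = np.zeros((n, 1)); B[-1, 0] = 1.0
    C = np.asarray(c, dtype=float).reshape(1, n)
    return A, B, C, D

def transfer(A, B, C, D, s):
    # G(s) = C (sI - A)^{-1} B + D
    n = A.shape[0]
    return (C @ np.linalg.solve(s * np.eye(n) - A, B))[0, 0] + D

# arbitrary sample coefficients, for illustration only
c = [2.0, 0.0, 5.0]        # c(s) = 2 + 5 s^2
a = [10.0, 2.0, 1.0]       # d(s) = 10 + 2 s + s^2 + s^3
A, B, C, D = companion_realization(c, a, D=3.0)

s = 1.7
direct = np.polyval(c[::-1], s) / np.polyval([1.0] + a[::-1], s) + 3.0
assert abs(transfer(A, B, C, D, s) - direct) < 1e-10
```

The same helper works for any proper rational function written in the form (4.5).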


For example, consider the rational function G given by

G(s) = (c_0 + c_1 s + c_2 s^2 + c_3 s^3) / (a_0 + a_1 s + a_2 s^2 + a_3 s^3 + s^4) + D .

Then the state space realization for this system is given by

ẋ =
[   0      1      0      0   ]
[   0      0      1      0   ]  x  +  [0, 0, 0, 1]^tr u
[   0      0      0      1   ]
[ −a_0   −a_1   −a_2   −a_3  ]

y = [c_0  c_1  c_2  c_3] x + Du ,

where x = [x_1, x_2, x_3, x_4]^tr.

REMARK 5.4.2 Let {A, B, C, D} be a realization for a rational function G(s) = n(s)/d(s) where n(s) and d(s) are polynomials. Then G(s) is a strictly proper rational function, that is, deg[n(s)] < deg[d(s)], if and only if D = 0. In other words, deg[n(s)] = deg[d(s)] if and only if D ≠ 0.

Recall that the characteristic polynomial for a square matrix A is defined by det[sI − A] where det denotes the determinant. If A is an n × n matrix, then its characteristic polynomial is a monic polynomial of degree n. Finally, the eigenvalues for A are precisely the roots of the characteristic polynomial for A.

LEMMA 5.4.3 Let A be the n × n matrix given by

A =
[   0      1      0      0    ⋯     0      ]
[   0      0      1      0    ⋯     0      ]
[   ⋮      ⋮      ⋮      ⋮    ⋯     ⋮      ]
[   0      0      0      0    ⋯     1      ]
[ −a_0   −a_1   −a_2   −a_3   ⋯  −a_{n−1}  ] .    (4.7)

Let d(s) be the polynomial defined by

d(s) = a_0 + a_1 s + a_2 s^2 + ⋯ + a_{n−1}s^{n−1} + s^n .    (4.8)

Then d is the characteristic polynomial for A, that is, d(s) = det[sI − A]. Furthermore, all eigenvectors v corresponding to a specified eigenvalue λ are given by

v = α [1, λ, λ^2, ⋯, λ^{n−1}]^tr    (4.9)

where α is a nonzero scalar. In particular, λ is an eigenvalue for the n × n matrix A in (4.7) if and only if λ is a zero of the polynomial d(s). In this case, the corresponding eigenvector for A is given by [1, λ, λ^2, ⋯, λ^{n−1}]^tr.


Proof. Recall that λ is an eigenvalue of A if and only if Av = λv where v is a nonzero vector in C^n. Clearly, any such v admits a decomposition of the form

v = [v_0, v_1, v_2, ⋯, v_{n−1}]^tr

where {v_k}_0^{n−1} are all scalars. By using the structure of A, it follows that Av = λv if and only if

v_1 = λv_0 , v_2 = λv_1 , ⋯ , v_{n−1} = λv_{n−2}   and   −a_0 v_0 − a_1 v_1 − ⋯ − a_{n−1}v_{n−1} = λv_{n−1} .

These relationships are equivalent to

v_1 = λv_0 , v_2 = λ^2 v_0 , ⋯ , v_{n−1} = λ^{n−1}v_0   and   (λ^n + λ^{n−1}a_{n−1} + ⋯ + λa_1 + a_0)v_0 = 0 .    (4.10)

By setting α = v_0, the first n − 1 equations show that v admits a representation of the form (4.9). In particular, v = 0 is equivalent to α = 0. Furthermore, the last expression in (4.10) shows that λ is an eigenvalue of A if and only if d(λ)v_0 = 0. Therefore, λ is an eigenvalue of A if and only if d(λ) = 0. From the preceding analysis it should be clear that all the eigenvectors corresponding to a specified eigenvalue λ are given by v in (4.9) where d(λ) = 0.

To complete the proof, let B be the column vector in C^n and C the 1 × n row vector defined by

B = [0  0  ⋯  0  1]^tr   and   C = [1  0  0  ⋯  0] .

By consulting (4.5) and (4.6) we see that C(sI − A)^{−1}B = 1/d(s). However, equation (4.1) shows that C(sI − A)^{−1}B = c(s)/Δ(s) where c is a polynomial of degree at most n − 1 and Δ is the characteristic polynomial for A. Thus 1/d(s) = c(s)/Δ(s). Because d and Δ are both monic polynomials of degree n, we must have d = Δ. This completes the proof.
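Lemma 5.4.3 can be spot-checked numerically. The Python/NumPy sketch below (sample coefficients chosen for illustration so that d(s) = (s + 1)(s + 2)(s + 3)) confirms that the characteristic polynomial of the companion matrix is d(s) and that each eigenvalue λ has eigenvector [1, λ, λ^2]^tr.

```python
import numpy as np

a = np.array([6.0, 11.0, 6.0])   # d(s) = 6 + 11 s + 6 s^2 + s^3 = (s+1)(s+2)(s+3)
n = len(a)
A = np.zeros((n, n))
A[:-1, 1:] = np.eye(n - 1)       # superdiagonal of ones
A[-1, :] = -a                    # last row: -a_0, -a_1, -a_2

# characteristic polynomial (leading coefficient first) equals d(s)
assert np.allclose(np.poly(A), np.concatenate(([1.0], a[::-1])))

# every eigenvalue lam has the Vandermonde eigenvector [1, lam, lam^2]^tr
for lam in np.linalg.eigvals(A):
    v = lam ** np.arange(n)
    assert np.allclose(A @ v, lam * v)
```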

A classical result on minimal realizations. We say that two state space realizations {A on C^n, B, C, D} and {A_1 on C^n, B_1, C_1, D_1} are similar if there exists an invertible matrix T such that

A_1 = TAT^{−1} ,  B_1 = TB ,  C_1 = CT^{−1}  and  D_1 = D .    (4.11)

Similar realizations have the same transfer function. Recall that (MN)^{−1} = N^{−1}M^{−1} when M and N are invertible matrices of the same size. Using this we obtain

C_1(sI − A_1)^{−1}B_1 + D_1 = CT^{−1}(sI − TAT^{−1})^{−1}TB + D
                            = CT^{−1}(T(sI − A)T^{−1})^{−1}TB + D
                            = CT^{−1}T(sI − A)^{−1}T^{−1}TB + D
                            = C(sI − A)^{−1}B + D .


Therefore similar realizations determine the same transfer function.

Consider the realization

ẋ = Ax + Bu  and  y = Cx + Du    (4.12)

where A is an n × n matrix. Let T be any invertible matrix on C^n and consider the new state variable ξ = Tx. Hence ξ̇ = Tẋ. Multiplying the first equation in (4.12) by T on the left yields

Tẋ = TAx + TBu = TAT^{−1}Tx + TBu
y = CT^{−1}Tx + Du .

Using (4.11) with ξ = Tx, we obtain a new state space realization with state ξ, that is,

ξ̇ = A_1ξ + B_1u  and  y = C_1ξ + D_1u .    (4.13)

The transformation Tx = ξ can be viewed as a change of basis from the state x to the new state ξ. Because the realizations in (4.12) and (4.13) are similar, they have the same transfer function. Finally, it is also noted that since the realizations in (4.12) and (4.13) have the same input u and output y, they must have the same transfer function G(s) = Y(s)/U(s). Let us conclude with the following classical result, which is beyond the scope of these notes. For a proof and further results on state space systems, see [3, 8, 23, 33].

THEOREM 5.4.4 All minimal realizations of the same transfer function G(s) are similar. Moreover, {A on C^n, B, C, D} is a minimal realization for G if and only if n equals the McMillan degree of G. In this case, the eigenvalues for A equal the poles for G including their multiplicity.

An example of a pole zero cancelation

Consider the state space system

ẋ =
[ −4  −9  −27 ]
[  1  −1    0 ]  x  +  [1, 0, 0]^tr u
[  0   1   −1 ]

y = [1  4  3] x .    (4.14)

In this state space system ẋ = Ax + Bu and y = Cx, where

A =
[ −4  −9  −27 ]
[  1  −1    0 ]   and   B = [1, 0, 0]^tr   and   C = [1  4  3] .
[  0   1   −1 ]

The eigenvalues for A are {−1 ± 3i, −4}. So the matrix A is a stable matrix. Using Matlab, we compute the transfer function

G(s) = C(sI − A)^{−1}B = (s^2 + 6s + 8)/(s^3 + 6s^2 + 18s + 40) = ((s + 2)(s + 4))/((s^2 + 2s + 10)(s + 4)) .    (4.15)
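The computation in (4.15) can be reproduced without Matlab. The Python/NumPy sketch below evaluates C(sI − A)^{−1}B at sample points, compares it with the stated rational function, and checks the eigenvalues of A.

```python
import numpy as np

# The matrices of the example (4.14).
A = np.array([[-4.0, -9.0, -27.0],
              [ 1.0, -1.0,   0.0],
              [ 0.0,  1.0,  -1.0]])
B = np.array([[1.0], [0.0], [0.0]])
C = np.array([[1.0, 4.0, 3.0]])

def G(s):
    # C (sI - A)^{-1} B
    return (C @ np.linalg.solve(s * np.eye(3) - A, B))[0, 0]

# compare with (s^2 + 6s + 8)/(s^3 + 6s^2 + 18s + 40) at a sample point
s = 2.0 + 1.0j
assert abs(G(s) - (s**2 + 6*s + 8) / (s**3 + 6*s**2 + 18*s + 40)) < 1e-10

# G(0) = 8/40 = 1/5, and the eigenvalues of A are {-1 ± 3i, -4}
assert abs(G(0.0) - 0.2) < 1e-12
assert np.allclose(sorted(np.linalg.eigvals(A).real), [-4.0, -1.0, -1.0])
```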


Figure 5.4: The output y(t) for u(t) = 30

Notice that the roots of the numerator s^2 + 6s + 8 are {−2, −4}, while the roots of the denominator s^3 + 6s^2 + 18s + 40 are {−1 ± 3i, −4}. (Applying the residue command to G in (4.15) will show that the residue for the pole of −4 is zero.) So we have a pole zero cancelation in G(s). Canceling out the common root of −4, we obtain

G(s) = (s + 2)/(s^2 + 2s + 10) .    (4.16)

Therefore the McMillan degree of G is two. The poles of G are {−1 ± 3i}, which are contained in the set of eigenvalues for A. Clearly, the transfer function G(s) is stable. A stable second order state space realization for G(s) is given by

ẋ =
[   0    1 ]
[ −10   −2 ]  x  +  [0, 1]^tr u   and   y = [2  1] x .    (4.17)

In this state space system ẋ = A_1 x + B_1 u and y = C_1 x, where

A_1 =
[   0    1 ]
[ −10   −2 ]   and   B_1 = [0, 1]^tr   and   C_1 = [2  1] .    (4.18)

A simple calculation, or consulting (4.5) and (4.6), shows that {A_1, B_1, C_1, 0} is indeed a state space realization for G(s), that is,

G(s) = C_1(sI − A_1)^{−1}B_1 = (s + 2)/(s^2 + 2s + 10) .

Because A_1 is a 2 × 2 matrix and the McMillan degree of G(s) equals two, the poles of G are precisely the eigenvalues of A_1, that is, {−1 ± 3i} are the poles of G and the eigenvalues of A_1.

Now assume that u(t) = 30 for all t. Because G(s) is stable and G(0) = 1/5, the final value theorem states that

6 = 30 G(0) = lim_{t→∞} y(t) .


The plot of y(t) using the step command in Matlab is presented in Figure 5.4. Because the poles of G are {−1 ± 3i}, the output converges to y(∞) = 6 on the order of e^{−t} with an angular frequency of 3. Moreover, in this case U(s) = 30/s. Hence

Y(s) = G(s)U(s) = 30(s + 2)/((s^2 + 2s + 10)s) = 6/s + (−3 − 4i)/(s + 1 − 3i) + (−3 + 4i)/(s + 1 + 3i) .

By taking the inverse Laplace transform, we arrive at

y(t) = 6 − 6e^{−t} cos(3t) + 8e^{−t} sin(3t) .
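The partial fraction expansion above is easy to verify numerically. The Python sketch below compares Y(s) with its expansion at a few sample points off the poles, and confirms that y(t) tends to 6.

```python
import numpy as np

# Y(s) = 30(s + 2) / ((s^2 + 2s + 10) s) and its partial fraction expansion.
def Y(s):
    return 30 * (s + 2) / ((s**2 + 2*s + 10) * s)

def Y_pf(s):
    return 6/s + (-3 - 4j)/(s + 1 - 3j) + (-3 + 4j)/(s + 1 + 3j)

for s in [1.0 + 0j, 2.5 - 1j, -0.5 + 4j]:
    assert abs(Y(s) - Y_pf(s)) < 1e-12

# the inverse Laplace transform y(t) = 6 - 6 e^{-t} cos(3t) + 8 e^{-t} sin(3t)
# indeed tends to 6
t = 50.0
y = 6 - 6*np.exp(-t)*np.cos(3*t) + 8*np.exp(-t)*np.sin(3*t)
assert abs(y - 6) < 1e-12
```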

Clearly, 6 = lim_{t→∞} y(t).

The Matlab commands we used to compute G(s) in (4.15) and its reduced form in (4.16) are given by

A = [-4, -9, -27; 1, -1, 0; 0, 1, -1]; B = [1; 0; 0]; C = [1, 4, 3];
[num, den] = ss2tf(A, B, C, 0); G = tf(num, den)

G = (s^2 + 6s + 8)/(s^3 + 6s^2 + 18s + 40)
% This is the transfer function G(s) in (4.15).

G = minreal(G)

G = (s + 2)/(s^2 + 2s + 10)
% The minreal command cancels out the common poles and zeros of G.

pole(G)
= -1 + 3i, -1 - 3i

eig(A)
= -1 + 3i, -1 - 3i, -4
% As expected, pole(G) ⊂ eig(A).

step(30*G); grid; title('The output y(t) with u(t) = 30')

[r, p] = residue(30*[1, 2], [1, 2, 10, 0])

r = [-3 - 4i; -3 + 4i; 6]   and   p = [-1 + 3i; -1 - 3i; 0]

t = linspace(0, 6, 2^14);
y = 6 - 6*exp(-t).*cos(3*t) + 8*exp(-t).*sin(3*t);
hold on; plot(t, y, 'r')

Simulink models. We also used Simulink to compute the output y(t). The Simulinkmodel we designed in Matlab for computing the output y(t) with u(t) = 30 is presented inFigure 5.5. Here we set the simulation parameters to fixed step and 0.001 for the step size.


We set the simout to workspace block to array. This sends the data to Matlab under the variable simout. Setting

t = linspace(0, 6, 6001); plot(t, simout); grid

in Matlab yields the same plot as Figure 5.4. By clicking on the history tab in the scope, unchecking limit data points to last 5000, and then clicking array, the data from the scope is sent to Matlab in ScopeData. Clicking on the scope also gives the graph of y(t) in Figure 5.4. Setting

t = ScopeData(:, 1);
y = ScopeData(:, 2);
plot(t, y); grid

in Matlab yields the data for y(t) and plots the same graph as Figure 5.4. The constant block is given in sources in Simulink, while the scope and simout to workspace blocks are in sinks, and the transfer function block is in continuous. Finally, it is emphasized that the commands in Matlab or Simulink may change depending upon which version of Matlab one is running.

Figure 5.5: A Simulink model for computing y(t) with u(t) = 30.

One can also design a Simulink model to implement the minimal realization in (4.17) on the level of integrators and gains; see Figure 5.6. Clearly, this takes more work to build. However, in certain applications, one may need to design Simulink models involving integrators, gains, and summers. Both Simulink models yield the same output as Figure 5.4. If the state space matrix A is in the form of (4.6), then one can line the integrators up in a series, starting with the integrator corresponding to the highest state on the left. Finally, the sum and gain blocks are found in the commonly used blocks in Simulink.

Now consider the invertible transformation

T =
[  2  −1 ]
[ −1   1 ]   and   T^{−1} =
                   [ 1  1 ]
                   [ 1  2 ] .

The state space system {A_1, B_1, C_1, 0} in (4.17) and (4.18) is similar to the state space system {A_2, B_2, C_2, 0} where

A_2 = TA_1T^{−1} =
[  2  −1 ] [   0    1 ] [ 1  1 ]     [  14   18 ]
[ −1   1 ] [ −10   −2 ] [ 1  2 ]  =  [ −13  −16 ]

B_2 = TB_1 =
[  2  −1 ] [ 0 ]     [ −1 ]
[ −1   1 ] [ 1 ]  =  [  1 ]

C_2 = C_1T^{−1} =
[ 2  1 ] [ 1  1 ]
         [ 1  2 ]  =  [ 3  4 ] .

Because these two realizations are similar, they have the same transfer function G(s). In particular,

ẋ =
[  14   18 ]
[ −13  −16 ]  x  +  [−1, 1]^tr u   and   y = [3  4] x    (4.19)

is a minimal realization for G(s) = (s + 2)/(s^2 + 2s + 10). A Simulink model based on this minimal state space realization is presented in Figure 5.7. All three Simulink models yield the same output y(t) given in Figure 5.4.
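Similarity invariance of the transfer function can be confirmed numerically for these concrete matrices. The Python/NumPy sketch below recomputes A_2, B_2, C_2 from T and compares the two transfer functions at sample points.

```python
import numpy as np

# The realization (4.18) and the similarity transformation T.
A1 = np.array([[0.0, 1.0], [-10.0, -2.0]])
B1 = np.array([[0.0], [1.0]])
C1 = np.array([[2.0, 1.0]])
T = np.array([[2.0, -1.0], [-1.0, 1.0]])

A2 = T @ A1 @ np.linalg.inv(T)
B2 = T @ B1
C2 = C1 @ np.linalg.inv(T)

# matches the matrices in (4.19)
assert np.allclose(A2, [[14, 18], [-13, -16]])
assert np.allclose(B2, [[-1], [1]])
assert np.allclose(C2, [[3, 4]])

def G(A, B, C, s):
    return (C @ np.linalg.solve(s * np.eye(2) - A, B))[0, 0]

for s in [0.0, 1.0 + 2.0j, -3.0]:
    assert abs(G(A1, B1, C1, s) - (s + 2)/(s**2 + 2*s + 10)) < 1e-10
    assert abs(G(A2, B2, C2, s) - G(A1, B1, C1, s)) < 1e-10
```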


Figure 5.6: A Simulink model for computing y(t) with u(t) = 30 using integrators with the state space realization in (4.17).


Figure 5.7: A Simulink model for computing y(t) with u(t) = 30 using integrators with the state space realization in (4.19).


REMARK 5.4.5 It is emphasized that canceling out unstable poles and zeros is rarely a good idea. In most problems there are numerical errors in calculating the roots of a polynomial. So this means that if the original system is unstable, there is in all likelihood no pole zero cancelation, and the system is probably unstable. For example, plotting the step response (u(t) = 1) for the unstable transfer function

(s − 2.0001)/((s + 1)(s − 2))

shows that the output y(t) is unbounded. Moreover, one cannot cancel out the pole of 2 with the zero of 2.0001. If u(t) = 1, then the output of the stable transfer function 1/(s + 1), its supposed approximation, converges to 1.

It is emphasized that pole zero cancelation of stable poles makes sense. For example, the step responses (u(t) = 1) for the stable transfer functions

(s + 2.01)/((s + 1)(s + 2)) = 1.01/(s + 1) − 1/(100(s + 2))   and   1/(s + 1)

are almost identical.
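The stable decomposition above (note the minus sign on the residue at −2) can be verified numerically; the Python sketch below also confirms that the near-canceled transfer function stays close to 1/(s + 1).

```python
# Check (s + 2.01)/((s + 1)(s + 2)) = 1.01/(s + 1) - 1/(100 (s + 2)),
# and that it is within 0.02 of 1/(s + 1) at a few sample points.
def G_full(s):
    return (s + 2.01) / ((s + 1) * (s + 2))

def G_pf(s):
    return 1.01/(s + 1) - 1/(100*(s + 2))

for s in [0.0, 1.0j, 3.0 - 2.0j]:
    assert abs(G_full(s) - G_pf(s)) < 1e-12
    assert abs(G_full(s) - 1/(s + 1)) < 0.02   # near pole zero cancelation
```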

5.4.1 Exercise

Problem 1. Find a state space realization for the transfer function

G(s) = (22 + 15s^2)/(10 − 14s + 52s^3 + s^4) + 6 .

Problem 2. Find a state space realization for the transfer function

G(s) = (2 + 6s^2 + 4s^3)/(8 + 2s − 4s^2 + 2s^3) .

Problem 3. Find the transfer function for the following state space system

ẋ =
[   0     1    0    0 ]
[   0     0    1    0 ]  x  +  [0, 0, 0, 1]^tr u
[   0     0    0    1 ]
[  10   −15   22   −4 ]

y = [30  4  2  5] x − 4u .

Problem 4. Find the characteristic polynomial, eigenvalues and corresponding eigenvectors for the following matrix

A =
[   0    1    0 ]
[   0    0    1 ]
[ −10   −9   −4 ] .

Problem 5. Canceling out an unstable pole by a zero can lead to problems. For example,

G(s) = (s − 2.0001)/((s + 1)(s − 2))

is unstable, and one cannot cancel out the pole of 2 with the zero of 2.0001. Plot the step response (u(t) = 1) for G(s) in Matlab and notice that y(t) is unbounded. So this behavior can never be approximated by its supposed stable approximation 1/(s + 1). However, one can cancel stable zeros. For example, plot the step responses of

(s + 2.01)/((s + 1)(s + 2)) = 1.01/(s + 1) − 1/(100(s + 2))   and   1/(s + 1)

on the same graph.

Problem 6. Design a Simulink model using integrators, summers and gains for the state space system

ẋ =
[  23   13 ]
[ −45  −25 ]  x  +  [−1, 2]^tr u   and   y = [5  3] x ;    (4.20)

see Figure 5.7.

• If u(t) = 60, then compute y(∞) = lim_{t→∞} y(t).

• Graph the output y(t) for u(t) = 60.

• Find the transfer function G(s) = Y(s)/U(s).

• Find the McMillan degree of G.

• Is the state space realization for G in (4.20) a minimal realization?

Problem 7. Consider the state space system

ẋ =
[ −3  −6  −10   10 ]
[  1  −1    0    0 ]  x  +  [1, 0, 0, 0]^tr u
[  0   1    1   −2 ]
[  0   0    1   −2 ]

y = [1  −4  −2  10] x .

• Find the transfer function Y(s)/U(s) = G(s).

• Find the McMillan degree for G(s).

• Express G as G(s) = n(s)/d(s) where deg[d(s)] equals the McMillan degree for G.

• Find a minimal realization {A, B, C, D} for G(s).

• If the input u(t) = 30, then find y(∞) = lim_{t→∞} y(t).

• If the input u(t) = 30, then find y(t).

• If the input u(t) = 30, then plot y(t) in Matlab.

• Design a Simulink model of the form in Figure 5.5 when u(t) = 30. Print the output of the scope or simout to workspace block, that is, plot y(t).

• Design a Simulink model consisting of integrators, gains and summers of the form in Figure 5.6 when u(t) = 30. Here you need three integrators. Print the output of the scope or simout to workspace block, that is, plot y(t).

5.5 Stable state space systems

In this section we present some elementary results concerning stable state space systems. Recall that a state space system is a linear model of the form

ẋ = Ax + Bu  and  y = Cx + Du .    (5.1)

Here A is a matrix on C^ν and B is a column vector of length ν, while C is a row vector of length ν and D is a scalar. The input to the state space system is u(t), the state is x(t), and the output is y(t). The solution to this state space system is given by

x(t) = e^{At}x(0) + ∫_0^t e^{A(t−τ)}Bu(τ) dτ

y(t) = Ce^{At}x(0) + ∫_0^t Ce^{A(t−τ)}Bu(τ) dτ + Du(t) .    (5.2)

The transfer function from u to y is determined by

G(s) = Y(s)/U(s) = C(sI − A)^{−1}B + D .    (5.3)

Moreover, G(s) is a proper rational function of the form G(s) = n(s)/d(s) where n(s) and d(s) are polynomials satisfying deg n ≤ deg d. In fact, one can choose d(s) = det[sI − A]. Recall that det[sI − A] is the characteristic polynomial for A. (The determinant is denoted by det.) In particular, if all the eigenvalues of A are in the open left half plane {λ ∈ C : ℜ(λ) < 0}, then the transfer function G is stable. The Matlab command for computing G from the state space realization {A, B, C, D} in (5.1) is given by

[num, den] = ss2tf(A, B, C, D) and G = tf(num, den)

Let us begin by recalling the definition of a stable matrix.

DEFINITION 5.5.1 A matrix A on C^ν or the state space system ẋ = Ax is stable if

lim_{t→∞} e^{At} = 0 .

THEOREM 5.5.2 The system ẋ = Ax is stable if and only if all the eigenvalues for A are contained in the open left half plane {λ ∈ C : ℜ(λ) < 0}.


The proof of this theorem is given below. In particular, if {A, B, C, D} is a state space realization for a transfer function G(s), that is, G(s) = D + C(sI − A)^{−1}B, and A is stable, then G(s) is stable. Moreover, if {A, B, C, D} is a minimal realization for G(s), then G(s) is stable if and only if A is stable. On the other hand, if the realization is not minimal, then it is possible that G(s) is stable and A is unstable. The following is a state space version of the final value theorem.

COROLLARY 5.5.3 Consider the state space system given by

ẋ = Ax + Bu  and  y = Cx + Du    (5.4)

where A is stable. If u(t) = u_0, a constant for all t, then for all initial conditions x(0), we have

x(∞) = lim_{t→∞} x(t) = −A^{−1}Bu_0

y(∞) = lim_{t→∞} y(t) = Du_0 − CA^{−1}Bu_0 = G(0)u_0 .    (5.5)

Here G(s) = D + C(sI − A)^{−1}B is the transfer function for {A, B, C, D}.

Proof. Recall that the solution x to the state space system ẋ = Ax + Bu is given by

x(t) = e^{At}x(0) + ∫_0^t e^{A(t−τ)}Bu(τ) dτ
     = e^{At}x(0) + e^{At} ∫_0^t e^{−Aτ}Bu_0 dτ
     = e^{At}x(0) − e^{At} [e^{−Aτ}]_0^t A^{−1}Bu_0
     = e^{At}x(0) + e^{At}(I − e^{−At})A^{−1}Bu_0
     = e^{At}x(0) + e^{At}A^{−1}Bu_0 − A^{−1}Bu_0 .

Here we used the fact that ∫ e^{−Aτ} dτ = −A^{−1}e^{−Aτ} = −e^{−Aτ}A^{−1}. The inverse A^{−1} is well defined because A is stable, and thus zero is not an eigenvalue for A. So when u(t) = u_0, a constant for all t, then

x(t) = e^{At}(x(0) + A^{−1}Bu_0) − A^{−1}Bu_0 .    (5.6)

Because A is stable, e^{At} → 0 as t tends to infinity. Hence

lim_{t→∞} x(t) = lim_{t→∞} ( e^{At}(x(0) + A^{−1}Bu_0) − A^{−1}Bu_0 ) = −A^{−1}Bu_0 .

This yields the first equation in (5.5). The second equation follows from the formula for x in (5.6) and the following calculation:

lim_{t→∞} y(t) = lim_{t→∞} ( Cx(t) + Du(t) )
              = lim_{t→∞} ( Ce^{At}(x(0) + A^{−1}Bu_0) − CA^{−1}Bu_0 + Du_0 )
              = −CA^{−1}Bu_0 + Du_0 .    (5.7)

Therefore the second equation in (5.5) holds. This completes the proof.
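Corollary 5.5.3 can be illustrated numerically. The Python/NumPy sketch below reuses the stable pair (4.18) from the earlier example (with D = 0) and forms e^{At} by eigendecomposition, which is valid here since the sample matrix is diagonalizable; it checks formula (5.6) and the final value y(∞) = G(0)u_0.

```python
import numpy as np

def expm_eig(A, t):
    # e^{At} via eigendecomposition (A is assumed diagonalizable)
    w, V = np.linalg.eig(A)
    return V @ np.diag(np.exp(w * t)) @ np.linalg.inv(V)

A = np.array([[0.0, 1.0], [-10.0, -2.0]])   # stable, eigenvalues -1 ± 3i
B = np.array([[0.0], [1.0]])
C = np.array([[2.0, 1.0]])
u0 = 30.0
x0 = np.array([[1.0], [-1.0]])              # arbitrary initial condition

AinvB = np.linalg.solve(A, B)
t = 40.0
x_t = expm_eig(A, t) @ (x0 + AinvB * u0) - AinvB * u0   # formula (5.6)
x_inf = -AinvB * u0

assert np.allclose(x_t, x_inf, atol=1e-8)               # x(t) -> -A^{-1} B u0
G0 = (C @ np.linalg.solve(-A, B))[0, 0]                 # G(0) with D = 0
assert abs((C @ x_inf)[0, 0] - G0 * u0) < 1e-10         # y(inf) = G(0) u0
assert abs(G0 * u0 - 6.0) < 1e-10                       # matches 30 G(0) = 6
```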

A partial fraction proof of Corollary 5.5.3. One can also use Laplace transforms to obtain the formula for x in (5.6), and thus prove Corollary 5.5.3. To this end, notice that the Laplace transform of ẋ = Ax + Bu is given by

sX(s) − x(0) = AX(s) + BU(s)  which implies  (sI − A)X(s) = x(0) + BU(s) .

By employing u(t) = u_0 and taking the inverse of sI − A, we have

X(s) = (sI − A)^{−1}x(0) + (sI − A)^{−1}B u_0/s .    (5.8)

Now observe that (1/s)(sI − A)^{−1} admits a partial fraction expansion of the form

(1/s)(sI − A)^{−1} = −A^{−1}/s + Ω(s)    (5.9)

where Ω(s) is a rational function. By rearranging this equation, we obtain

Ω(s) = (1/s)(sI − A)^{−1} + (1/s)A^{−1} = (1/s)(sI − A)^{−1}(I + (sI − A)A^{−1}) = (sI − A)^{−1}A^{−1} .

Substituting this Ω(s) into (5.9) yields

(1/s)(sI − A)^{−1} = (sI − A)^{−1}A^{−1} − (1/s)A^{−1} .    (5.10)

Using this in (5.8), we see that

X(s) = (sI − A)^{−1}(x(0) + A^{−1}Bu_0) − (1/s)A^{−1}Bu_0 .    (5.11)

By taking the inverse Laplace transform, we arrive at

x(t) = e^{At}(x(0) + A^{−1}Bu_0) − A^{−1}Bu_0 .    (5.12)

This is precisely the formula for x(t) in (5.6). Hence x(t) converges to −A^{−1}Bu_0, and the first equation in (5.5) holds. The second equation follows from (5.7). This completes a partial fraction proof of Corollary 5.5.3.
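The resolvent identity (5.10) is easy to test numerically; the Python/NumPy sketch below checks it at a few sample points for an invertible sample matrix.

```python
import numpy as np

# Check (1/s)(sI - A)^{-1} = (sI - A)^{-1} A^{-1} - (1/s) A^{-1}
# at sample points s that are not eigenvalues of A.
A = np.array([[0.0, 1.0], [-10.0, -2.0]])   # invertible, eigenvalues -1 ± 3i
I = np.eye(2)
Ainv = np.linalg.inv(A)

for s in [1.0, 2.0 + 1.0j, -0.5 + 4.0j]:
    R = np.linalg.inv(s * I - A)            # resolvent (sI - A)^{-1}
    lhs = R / s
    rhs = R @ Ainv - Ainv / s
    assert np.allclose(lhs, rhs)
```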

REMARK 5.5.4 As before, consider the state space system

ẋ = Ax + Bu  and  y = Cx + Du    (5.13)

where A is stable and u(t) = u_0, a constant for all t. By consulting (5.11), we see that

X(s) = (adj[sI − A]/det[sI − A]) (x(0) + A^{−1}Bu_0) − (1/s)A^{−1}Bu_0 .    (5.14)

(The algebraic adjoint of a matrix Z is denoted by adj[Z].) Recall that det[sI − A] is the characteristic polynomial for A. In particular, the roots of det[sI − A] are precisely the eigenvalues for A. So by taking the inverse Laplace transform of X(s) in (5.14), we see that x(t) is of the form

x(t) = Σ_{j,k} t^k e^{λ_j t} γ_{j,k} − A^{−1}Bu_0    (5.15)

where γ_{j,k} are vectors and {λ_j} are the eigenvalues for A. Because A is stable, x(t) converges to −A^{−1}Bu_0. Moreover, the eigenvalues {λ_j} for A indicate how x(t) converges to −A^{−1}Bu_0. The imaginary parts of the eigenvalues give the frequencies at which x(t) oscillates as it converges to −A^{−1}Bu_0. For example, if the eigenvalues contain large imaginary parts, then in general x(t) will oscillate at a high frequency as it converges to −A^{−1}Bu_0. (The appropriate γ_{j,k} will have to be nonzero, which is almost always the case.) The real parts of the eigenvalues {λ_j} for A indicate how fast x(t) converges to −A^{−1}Bu_0. For example, if the real parts of all the eigenvalues are far in the left half plane, then x(t) will converge quickly to −A^{−1}Bu_0. On the other hand, if A has eigenvalues close to the imaginary axis, then x(t) will converge slowly to −A^{−1}Bu_0.

Sketch of the proof of Theorem 5.5.2. Assume that A is stable. Let λ be an eigenvalue for A with eigenvector ξ, that is, Aξ = λξ where ξ is a nonzero vector. We claim that e^{At}ξ = e^{λt}ξ. To see this, first observe that A^nξ = λ^nξ for all integers n ≥ 0. Using the fact that e^α = Σ_{n=0}^∞ α^n/n!, we obtain

e^{At}ξ = Σ_{n=0}^∞ (t^n A^n/n!) ξ = Σ_{n=0}^∞ (λ^n t^n/n!) ξ = e^{λt}ξ .

Hence e^{At}ξ = e^{λt}ξ. Because A is stable, we have

0 = lim_{t→∞} e^{At}ξ = lim_{t→∞} e^{λt}ξ .

Since ξ is nonzero, we see that e^{λt} → 0 as t tends to infinity. Therefore λ is contained in the open left half plane, that is, ℜ(λ) < 0.

Recall that the eigenvalues {λ_j} are the roots of the characteristic polynomial for A, that is, det[sI − A] = Π(s − λ_j). Moreover,

L(e^{At}) = (sI − A)^{−1} = adj[sI − A]/det[sI − A]

where adj[sI − A] is the algebraic adjoint for sI − A. Using det[sI − A] = Π(s − λ_j), we see that L(e^{At}) admits a partial fraction expansion of the form

L(e^{At}) = Σ_{j,k} M_{j,k}/(s − λ_j)^k

where M_{j,k} is a matrix. In the time domain,

e^{At} = Σ_{j,k} M_{j,k} t^{k−1} e^{λ_j t}/(k − 1)! .

Now assume that all the eigenvalues for A are in the open left half plane, that is, ℜ(λ_j) < 0 for all j. Then the previous formula for e^{At} shows that e^{At} → 0 as t tends to infinity. Therefore A is stable. This completes the proof.
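Theorem 5.5.2 can be illustrated numerically; the Python/NumPy sketch below contrasts a stable and an unstable sample matrix (both chosen for illustration and both diagonalizable, so e^{At} can be formed by eigendecomposition).

```python
import numpy as np

def expm_eig(A, t):
    # e^{At} via eigendecomposition (A is assumed diagonalizable)
    w, V = np.linalg.eig(A)
    return V @ np.diag(np.exp(w * t)) @ np.linalg.inv(V)

A_stable = np.array([[0.0, 1.0], [-10.0, -2.0]])    # eigenvalues -1 ± 3i
A_unstable = np.array([[0.0, 1.0], [10.0, -2.0]])   # eigenvalues -1 ± sqrt(11)

# stable: all eigenvalues in the open left half plane, e^{At} decays
assert np.all(np.linalg.eigvals(A_stable).real < 0)
assert np.linalg.norm(expm_eig(A_stable, 20.0)) < 1e-6

# unstable: an eigenvalue with positive real part, e^{At} blows up
assert np.linalg.eigvals(A_unstable).real.max() > 0
assert np.linalg.norm(expm_eig(A_unstable, 20.0)) > 1e6
```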

General state space systems

The general state space system has a vector input u(t) and vector output y(t). To be precise, in general a state space system is a linear model of the form

ẋ = Ax + Bu  and  y = Cx + Du .    (5.16)

Here A is a matrix on C^ν and B is a ν × j matrix mapping C^j into C^ν, while C is a k × ν matrix mapping C^ν into C^k and D is a k × j matrix mapping C^j into C^k. The input to the state space system u(t) is a vector in C^j, the state x(t) is a vector in C^ν, and the output y(t) is a vector in C^k. The solution to this state space system is given by

x(t) = e^{At}x(0) + ∫_0^t e^{A(t−τ)}Bu(τ) dτ

y(t) = Ce^{At}x(0) + ∫_0^t Ce^{A(t−τ)}Bu(τ) dτ + Du(t) .    (5.17)

The transfer function from u to y is determined by

G(s) = C(sI − A)^{−1}B + D .    (5.18)

It is emphasized that G(s) is a proper rational matrix function mapping C^j into C^k. In particular, Y(s) = G(s)U(s). (Since U(s) is a vector, Y(s)/U(s) does not make any sense.) To be specific, G(s) is a proper rational function of the form

G(s) = N(s)/d(s)    (5.19)

where N(s) is a k × j matrix polynomial and d(s) is a scalar polynomial. Moreover, the degree of the entries of N(s) is less than or equal to the degree of d(s). In fact, one can choose d(s) = det[sI − A]. In particular, if A is stable, then the matrix valued transfer function G is stable. Finally, it is noted that if A is stable, then we also have the final value theorem: if u(t) = u_0, a constant vector for all t, then for all initial conditions x(0), we have

x(∞) = lim_{t→∞} x(t) = −A^{−1}Bu_0

y(∞) = lim_{t→∞} y(t) = Du_0 − CA^{−1}Bu_0 .    (5.20)

In these notes we only presented a brief introduction to state space systems. For further results on state space systems see [2, 3, 7, 23, 33].


5.5.1 Exercise

Problem 1. Consider the state space system[x1x2

]=

[7 12−6 −10

] [x1x2

]+

[1−1

]u and y = 2x1 + 3x2 + 2u

(i) Is the 2× 2 state matrix stable?

(ii) Find the transfer function G(s) and its McMillan degree.

(iii) If the input u(t) = 2, then find

x1(∞) = limt→∞ x1(t);

x2(∞) = limt→∞ x2(t);

y(∞) = limt→∞ y(t).

Problem 2. Consider the state space system x = Ax+Bu and y = Cx where

A =

[−4 −2k 1

]and B =

[11

]and C =

[2 2

].

Moreover, k is an unknown constant.

(i) Find all real values of k such that A is stable.

(ii) Assume that u(t) = 2 for all time and lim_{t→∞} y(t) = 8. Find k.

Problem 3. Consider the stable transfer function G(s) = 1/(s + 1). Find an unstable state space realization {A, B, C, D} for G(s) such that A is an unstable matrix on C2 and the eigenvalues of A are {−1, 2}. Is this realization unique? If so explain why, or give a counter example.

Problem 4. Consider the state space system

ẋ = Ax + Bu and y = Cx + Du

where A is stable. Let G(s) = D + C(sI − A)^{−1}B be the transfer function for this state space system. Assume that the input u(t) = γe^{iωt} is a complex sinusoid with fixed frequency ω and amplitude γ. Then show that

x(t) = e^{At}(x(0) − (iωI − A)^{−1}Bγ) + (iωI − A)^{−1}Bγ e^{iωt}

y(t) = C e^{At}(x(0) − (iωI − A)^{−1}Bγ) + G(iω)γ e^{iωt}.

Because A is stable all the eigenvalues of A are contained in the open left half plane, and thus, iωI − A is invertible. In particular, for all large t, we have

x(t) ≈ (iωI − A)^{−1}Bγ e^{iωt} and y(t) ≈ G(iω)γ e^{iωt} = |G(iω)|γ e^{i(ωt+φ(iω))} (for all large t).

Here G(iω) = |G(iω)|e^{iφ(iω)} is the polar representation of the complex number G(iω), that is, |G(iω)| is the magnitude of G(iω), while φ(iω) is the angle of G(iω).
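These formulas can be checked numerically. The following Python sketch (Python rather than the Matlab used elsewhere in these notes) uses a hypothetical scalar example A = −1, B = C = 1, D = 0, so that G(s) = 1/(s + 1); all numerical values are illustrative assumptions, not taken from the text.

```python
import cmath

# Hypothetical scalar system: A = -1, B = 1, C = 1, D = 0, so G(s) = 1/(s + 1).
A, B, C, D = -1.0, 1.0, 1.0, 0.0
omega, gamma, x0 = 2.0, 1.5, 0.7        # frequency, amplitude, initial state

def G(s):
    return D + C * B / (s - A)          # C(sI - A)^{-1}B + D in the scalar case

def x(t):
    """State response to u(t) = gamma*e^{i*omega*t}, from the formula above."""
    r = B * gamma / (1j * omega - A)    # (i*omega*I - A)^{-1} B gamma
    return cmath.exp(A * t) * (x0 - r) + r * cmath.exp(1j * omega * t)

def y(t):
    r = B * gamma / (1j * omega - A)
    return C * cmath.exp(A * t) * (x0 - r) + G(1j * omega) * gamma * cmath.exp(1j * omega * t)

# Check that x(t) satisfies dx/dt = A x + B u via a centered difference.
t, h = 1.0, 1e-6
u = gamma * cmath.exp(1j * omega * t)
dxdt = (x(t + h) - x(t - h)) / (2 * h)
assert abs(dxdt - (A * x(t) + B * u)) < 1e-4

# For large t the transient e^{At} has died out: y(t) ~ G(i*omega)*gamma*e^{i*omega*t}.
t = 30.0
assert abs(y(t) - G(1j * omega) * gamma * cmath.exp(1j * omega * t)) < 1e-9
```

The second assertion is exactly the steady-state approximation claimed in Problem 4.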


5.6 State space realizations and operational amplifiers

The results in this section are not used in the rest of the notes and can be skipped by the uninterested reader. In this section we will show that any proper rational transfer function or state space system can be simulated by using an electrical circuit consisting of operational amplifiers, resistors and capacitors. For a nice elementary presentation of operational amplifiers, see the operational amplifier material on the electrical engineering pages of the Khan Academy website.

Figure 5.8: Operational amplifier: v = v1 − v2 and y = Kv with K ≫ 0

An operational amplifier is essentially a high gain amplifier where there is virtually no current flowing between the input terminals of the amplifier. Operational amplifiers are fundamental components in many integrated circuits. Operational amplifiers can be used to implement many basic mathematical operations such as integration, sign changes, gain and summation. In this section we will show how one can use operational amplifiers to simulate any proper rational transfer function.

The schematic of a basic ideal operational amplifier is presented in Figure 5.8. The operational amplifier has two input ports or terminals. The positive or non-inverting port is denoted by + and the negative or inverting port is denoted by −. The voltage from ground to the positive terminal is v1, while the voltage from ground to the negative terminal is v2. The differential voltage defined by v = v1 − v2 is the difference between the voltages at the positive and negative terminals. The minus sign placed at the top of the operational amplifier marks the inverting input, which amplifies the voltage −v2. The plus sign at the bottom of the operational amplifier marks the non-inverting input, which amplifies the voltage v1. The output y of our operational amplifier is the voltage from ground to the output terminal of the operational amplifier. The vp and −vp are the voltages from the power supply to the operational amplifier. A voltage supply is needed to amplify our signals. The operational amplifier also has a connection to ground which we did not display. In general the output voltage y = Kv where K ranges between 10⁴ and 10⁷. Since we will only consider ideal operational amplifiers, we will assume that the gain y/v is approximately infinite. The ideal operational amplifier has the following three basic properties:

(i) The open loop voltage gain is approximately infinite, that is, y/v ≈ ∞.

(ii) The input resistance between the input terminals to the operational amplifier is infinite.In other words, the current flowing into the operational amplifier is negligible.

Page 278: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

278 CHAPTER 5. STATE SPACE

(iii) The output resistance is approximately zero. This allows one to cascade operationalamplifiers in applications.

Furthermore, there is also saturation in the operational amplifier, and we will assume that we are always operating in the linear region. Finally, throughout the rest of this section we will not include the external power supply terminals vp and −vp in the schematics of the operational amplifier; it is always understood that there is an external power supply. For a more detailed explanation of how an operational amplifier works, the reader should consult some well-established texts on circuits and an electronics manual.

Figure 5.9: Operational amplifier: transfer function Y(s)/U(s) = −Zo(s)/Zi(s)

Now consider the circuit in Figure 5.9 where the impedance at the input of the operational amplifier is denoted by Zi, while the impedance connecting the negative input terminal to the output is denoted by Zo. Moreover, v2 is the voltage from ground to the negative or inverting terminal. Because the positive port is connected to ground, the voltage at the positive terminal is v1 = 0. Hence the differential voltage v = −v2. The input voltage is u and y is the output voltage. We claim that the transfer function from u to y is given by

Y(s)/U(s) = −Zo(s)/Zi(s). (6.1)

To obtain this result, recall that there is no current flowing into the input terminals of an operational amplifier. In particular, i1 = 0. Hence Kirchhoff's laws imply that iin = io, where iin and io are, respectively, the currents flowing into Zi and Zo. By taking the Laplace transform, we obtain

Iin(s) = (U(s) − V2(s))/Zi(s) and Io(s) = (V2(s) − Y(s))/Zo(s).

Recall that y = Kv = −Kv2 where the gain K ≫ 0 of the operational amplifier is large. Using V2(s) = −Y(s)/K in the previous equations with Iin = Io, we have

(U(s) + Y(s)/K)/Zi(s) = −(Y(s)/K + Y(s))/Zo(s).


By letting K approach infinity, we see that U(s)/Zi(s) = −Y(s)/Zo(s). In other words, the transfer function from u to y for the circuit in Figure 5.9 is given by Y(s)/U(s) = −Zo(s)/Zi(s).
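As a quick numerical illustration of G(s) = −Zo(s)/Zi(s), here is a short Python sketch (not part of the original notes; the component values are arbitrary) that evaluates the gain at s = iω for simple impedance choices:

```python
def Z_R(R):
    """Impedance of a resistor: Z(s) = R."""
    return lambda s: R

def Z_C(C):
    """Impedance of a capacitor: Z(s) = 1/(Cs)."""
    return lambda s: 1.0 / (C * s)

def inverting_gain(Zi, Zo):
    """Transfer function of the Figure 5.9 circuit: G(s) = -Zo(s)/Zi(s)."""
    return lambda s: -Zo(s) / Zi(s)

# Resistor in, resistor out: a pure negative gain of -Ro/Ri at every frequency.
G_gain = inverting_gain(Z_R(1e3), Z_R(2e3))
assert abs(G_gain(1j * 100.0) - (-2.0)) < 1e-12

# Resistor in, capacitor out: G(s) = -1/(RCs), an integrator.
R, C = 1e3, 1e-6
G_int = inverting_gain(Z_R(R), Z_C(C))
s = 1j * 50.0
assert abs(G_int(s) - (-1.0 / (R * C * s))) < 1e-12
```

The two special cases tested here are exactly the gain and integrator circuits treated in the remainder of this section.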

Operational amplifiers and gain

Figure 5.10: Operational amplifier: gain y(t) = −(Ro/Ri)u(t)

Consider the circuit presented in Figure 5.10 consisting of two resistors with Ri and Ro ohms, respectively. Notice that this is precisely a special case of the circuit in Figure 5.9 where Zi = Ri and Zo = Ro. Therefore the transfer function from the input voltage u to the output voltage y is given by

Y(s)/U(s) = −Ro/Ri. (6.2)

In this case, y(t) = −(Ro/Ri)u(t). In other words, the output y(t) is simply −Ro/Ri times the input u(t). If Ri = Ro, then y(t) = −u(t), and this circuit simply changes the sign of the input. By connecting these circuits in series if necessary, one can build a circuit such that y(t) = γu(t) where γ is any real constant. If 0 ≤ γ ≤ 1 one can also use a potentiometer to implement y(t) = γu(t); see also Problems 5 and 6 in Section 5.6.2.


A summing circuit

Figure 5.11: Operational amplifier: summer y(t) = −∑_{j=1}^n (Ro/Rj) uj(t)

The circuit presented in Figure 5.11 can be used to sum a set of voltages. In this case, {uj}_1^n is a set of input voltages, that is, the input voltage from the ground to the resistor Rj is denoted by uj for j = 1, 2, . . . , n. The n input resistors have {Rj}_1^n ohms, respectively, and the resistor connected to the output voltage y has Ro ohms. Furthermore, v2 is the voltage from the ground to the negative or inverting terminal of the operational amplifier. Because the positive or non-inverting terminal is connected to ground, the differential voltage v = −v2. We claim that the output voltage is given by

y(t) = −∑_{j=1}^n (Ro/Rj) uj(t). (6.3)

By using additional operational amplifiers in series to change the sign, this shows that one can use operational amplifiers to simulate the summation y(t) = ∑_{j=1}^n γj uj(t) where {γj}_1^n is a set of real numbers; see the paragraph on operational amplifiers in series below. To derive y(t) = −∑_{j=1}^n (Ro/Rj) uj(t), notice that the current iin = ∑_{j=1}^n ij where ij is the current leaving the resistor Rj. Because no current flows into the operational amplifier, iin = io. Moreover, applying Ohm's law we have

∑_{j=1}^n ij = ∑_{j=1}^n (uj − v2)/Rj = iin = io = (v2 − y)/Ro.

The output y = Kv = −Kv2 where v = 0 − v2 is the differential voltage. So using v2 = −y/K, we have

∑_{j=1}^n (uj + y/K)/Rj = iin = io = −(y/K + y)/Ro.

Recall that K ≫ 0 is large. Hence letting K approach infinity, we obtain ∑_{j=1}^n uj/Rj = −y/Ro. In other words, y(t) = −∑_{j=1}^n (Ro/Rj) uj(t). This yields the summation formula in (6.3).
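A minimal Python sketch of the summation formula (6.3), with arbitrary resistor and input values chosen for illustration:

```python
def summer(Ro, Rs, us):
    """Ideal summing amplifier of Figure 5.11: y = -sum_j (Ro/Rj) * uj."""
    return -sum(Ro / Rj * uj for Rj, uj in zip(Rs, us))

Ro = 10e3                      # output resistor, ohms
Rs = [10e3, 5e3, 20e3]         # input resistors R1, R2, R3
us = [1.0, -2.0, 4.0]          # input voltages u1, u2, u3

# y = -(1*1 + 2*(-2) + 0.5*4) = -(1 - 4 + 2) = 1
y = summer(Ro, Rs, us)
assert abs(y - 1.0) < 1e-12
```

Choosing the ratios Ro/Rj sets the (negative) weights on each input, as in the derivation above.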


Integration

Figure 5.12: Operational amplifier: integrator y = −(1/RC) ∫ u dt

Consider the circuit presented in Figure 5.12 consisting of a resistor with R ohms and a capacitor with C farads. Notice that this is a special case of the circuit in Figure 5.9 where Zi = R is a resistor and Zo = 1/(Cs) is a capacitor. Hence the transfer function from the input voltage u to the output voltage y is given by

Y(s)/U(s) = −1/(RCs). (6.4)

In this case,

y(t) = −(1/RC) ∫_0^t u(σ) dσ.

Here we have assumed that the initial voltage or charge across the capacitor is zero. In this case, the output y(t) is simply −1/RC times the integral of the input u(t). By changing the sign and setting RC = 1, we can build a circuit consisting of operational amplifiers to integrate the input, that is, y(t) = ∫_0^t u(σ) dσ, or equivalently, Y(s) = U(s)/s.
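The integrator relation can be checked with a simple discrete-time simulation. The Python sketch below (trapezoid rule; R, C and the input are arbitrary choices, not from the text) integrates u(t) = cos(t) with RC = 1, for which the exact output is y(t) = −sin(t):

```python
import math

R, C = 1e3, 1e-3          # RC = 1 second
dt = 1e-4
N = int(1.0 / dt)         # simulate one second

y, u_prev = 0.0, math.cos(0.0)
for k in range(1, N + 1):
    u = math.cos(k * dt)
    y += -(1.0 / (R * C)) * 0.5 * (u + u_prev) * dt   # trapezoid step of -(1/RC) * int u
    u_prev = u

# With u = cos(t) and RC = 1, the exact output at t = 1 is -sin(1).
assert abs(y - (-math.sin(1.0))) < 1e-6
```

The trapezoid rule keeps the discretization error at O(dt²), so the assertion passes comfortably at this step size.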


Operational amplifiers in series

Figure 5.13: Operational amplifiers in series: Y(s)/U(s) = Z4(s)Z2(s)/(Z3(s)Z1(s))

Consider the circuit in Figure 5.13 consisting of two operational amplifiers in series with four impedances Z1, Z2, Z3 and Z4. Here u is the input voltage, y is the output voltage and w is the voltage from ground to the output terminal of the operational amplifier on the left. By consulting the operational amplifier circuit in Figure 5.9, we see that the transfer function from u to w is given by W(s)/U(s) = −Z2(s)/Z1(s). Moreover, the transfer function from w to y is given by Y(s)/W(s) = −Z4(s)/Z3(s). Therefore the transfer function from u to y is determined by

Y(s)/U(s) = (Y(s)/W(s))(W(s)/U(s)) = Z4(s)Z2(s)/(Z3(s)Z1(s)).

In other words,

Y(s)/U(s) = Z4(s)Z2(s)/(Z3(s)Z1(s)). (6.5)

For example, if one chooses Z1 = 1, Z2 = 1 and Z3 = 1, all 1 ohm resistors, with Z4 = 1/s a 1 farad capacitor, then the transfer function Y(s)/U(s) = 1/s is an integrator, that is, y = ∫ u dt. If one chooses Z1 = 1 and Z3 = 1, both 1 ohm resistors, with Z2 = 1/s and Z4 = 1/s both 1 farad capacitors, then the transfer function Y(s)/U(s) = 1/s² is a double integrator, that is, y = ∫∫ u. If one chooses Z1 = 1, Z2 = 1 and Z3 = 1, all 1 ohm resistors, with Z4 = 2 a 2 ohm resistor, then the transfer function Y(s)/U(s) = 2 is a gain of 2, that is, y(t) = 2u(t). The circuit in Figure 5.17 of Problem 6 in Section 5.6.2 shows how one can implement the gain y(t) = 2u(t) by using only one operational amplifier.


5.6.1 Circuits for state space systems

The preceding section shows how one can use operational amplifiers along with resistors and capacitors to build a circuit which performs the following three basic operations:

(i) multiplies a signal by a constant;

(ii) sums a set of signals;

(iii) integrates a signal.

By connecting these components together, one can build a circuit to implement any state space system. For example, consider the second order state space system given by

ẋ = [a11 a12; a21 a22] x + [b1; b2] u

y = [c1 c2] x + Du

where x = [x1; x2].

The circuit for this system is given in Figure 5.14; this circuit was drawn using Simulink. Here the block representing integration is denoted by 1/s. The block corresponding to multiplication by a constant is denoted by a triangle. The summing blocks are labeled sum. Notice that all of these blocks can be built by using operational amplifiers with the appropriate resistors and capacitors. The circuit in Figure 5.14 is obtained by cascading these components.

Next, let us observe that one can construct a circuit to implement any proper rational transfer function. For example, consider the rational function G given by

G(s) = (c0 + c1 s + c2 s²)/(a0 + a1 s + a2 s² + s³) + D.

Recall that a state space realization for this system is given by

ẋ = [0 1 0; 0 0 1; −a0 −a1 −a2] x + [0; 0; 1] u

y = [c0 c1 c2] x + Du

where x = [x1; x2; x3]. Notice that ẋ1 = x2 and ẋ2 = x3. The Simulink representation for this system is given in Figure 5.15. Finally, it is noted that all of these blocks can be built by using operational amplifiers with the appropriate resistors and capacitors.
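The claim that this companion-form realization reproduces G(s) can be spot-checked numerically. The Python sketch below (coefficient values are arbitrary test choices) evaluates C(sI − A)^{−1}B + D at a sample point using a small hand-rolled Gaussian elimination and compares with the rational formula:

```python
a0, a1, a2 = 6.0, 11.0, 6.0          # d(s) = (s+1)(s+2)(s+3) = s^3 + 6s^2 + 11s + 6
c = [1.0, 2.0, 3.0]                  # c0, c1, c2
D = 0.5

A = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [-a0, -a1, -a2]]
B = [0.0, 0.0, 1.0]

def solve(M, b):
    """Solve M x = b for a small complex matrix by Gaussian elimination."""
    n = len(b)
    M = [row[:] for row in M]; b = b[:]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))   # partial pivoting
        M[i], M[p] = M[p], M[i]; b[i], b[p] = b[p], b[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for k in range(i, n):
                M[r][k] -= f * M[i][k]
            b[r] -= f * b[i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

s = 1.0 + 2.0j
sIA = [[s * (i == k) - A[i][k] for k in range(3)] for i in range(3)]
xv = solve(sIA, B)                   # (sI - A)^{-1} B
G_state = sum(c[i] * xv[i] for i in range(3)) + D
G_rational = (c[0] + c[1] * s + c[2] * s**2) / (a0 + a1 * s + a2 * s**2 + s**3) + D
assert abs(G_state - G_rational) < 1e-10
```

For the companion form with B = [0; 0; 1], one can also verify by hand that (sI − A)^{−1}B = [1; s; s²]/d(s), which is exactly why the two evaluations agree.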

To complete this section, let us observe that one can use operational amplifiers to construct an oscillator. To see this consider the transfer function

G(s) = Y(s)/U(s) = s/(s² + ω²).

Clearly, one can use operational amplifiers along with the appropriate resistors and capacitors to implement this transfer function. Now consider the constant input u(t) = ω. Then Y(s) = G(s)U(s) = ω/(s² + ω²). Hence the output y(t) = sin(ωt). Finally, it is noted that by using Fourier series one can build a circuit to implement any periodic function.

Figure 5.14: A second order realization

Figure 5.15: A third order canonical realization


5.6.2 Exercise

Problem 1. Find the transfer function from u to y in Figure 5.9, where Zi is the impedanceformed by a resistor Ri and capacitor Ci in parallel, and Zo is the impedance formed by aresistor Ro and capacitor Co in parallel.

Problem 2. Consider the transfer function

G(s) = Y(s)/U(s) = 1/(s² + ω²).

Discuss how one can use this transfer function with the input u(t) = γ where γ is a constantto build an oscillator.

Problem 3. Build a circuit of the form given by Figure 5.14 to implement the following state space system

ẋ = [0 1 2; 1 −2 0; 3 0 5] x + [1; 0; −2] u

y = [1 2 0] x

where x = [x1; x2; x3].

Problem 4. Build a circuit of the form given by Figure 5.15 to implement the following transfer function

(s³ + 3s + 2)/(s⁴ + 2s² + 3s + 22).

Problem 5. Consider the circuit presented in Figure 5.16 consisting of two resistors R1 and R2. The input u is the voltage source and the output y is the voltage across the second resistor R2. Show that

y(t) = (R2/(R1 + R2)) u(t).

Because R1 > 0 and R2 > 0, one can choose R1 and R2 such that y(t) = γu(t) when 0 < γ < 1. In fact, by choosing R2 = γR1/(1 − γ) we have y(t) = γu(t).

Figure 5.16: A circuit containing two resistors

Problem 6. Consider the circuit presented in Figure 5.17 consisting of a non-inverting operational amplifier with two resistors R1 and R2. The input u is the voltage source and the output y is the voltage from the ground to the output terminal of the operational amplifier. Here v = u − v2 where v2 is the voltage from the ground to the negative terminal of the operational amplifier, or equivalently, v2 is the voltage across the second resistor R2. It is emphasized that the positive terminal is positioned at the top of the operational amplifier and the negative or inverting terminal is at the bottom. In this case, we say that the operational amplifier is non-inverting. Using the properties of an ideal operational amplifier show that

y(t) = ((R1 + R2)/R2) u(t).

Because R1 > 0 and R2 > 0, one can choose R1 and R2 such that y(t) = γu(t) when γ > 1. In fact, by choosing R2 = R1/(γ − 1) we have y(t) = γu(t).

Figure 5.17: A non-inverting operational amplifier: v = u − v2

Problem 7. Using operational amplifiers design a circuit whose transfer function is given by Y(s)/U(s) = 3/s, that is, y = 3∫ u dt.

Problem 8. Using operational amplifiers design a circuit whose transfer function is given by Y(s)/U(s) = 3s, that is, y = 3u̇.

5.7 A simple pendulum

The results in this section are not used in the rest of the notes and can be skipped by theuninterested reader. Consider a particle of mass m connected to the end of a pendulum oflength l swinging in the x-y plane; see Figure 5.18. (This figure was drawn with GeoGebra.)Assume that the mass at the end of the pendulum is attached by a string or massless rod oflength l to the origin in the x-y plane. Let θ be the angle between the y axis and the stringsupporting the pendulum. A positive θ corresponds to the pendulum being in the right halfplane. The position of the pendulum is given by

r = [l sin(θ); −l cos(θ)] and ṙ = [l cos(θ); l sin(θ)] θ̇. (7.1)


Figure 5.18: The pendulum

Figure 5.19: The Simulink model for the pendulum.

Since the kinetic energy T = m‖ṙ‖²/2, we see that T = ml²θ̇²/2. The potential energy is given by V = −mgl cos(θ) where g is the gravitational constant. Notice that the potential energy is mg times the y coordinate of the position r. Hence the Lagrangian L = T − V is given by

L = ml²θ̇²/2 + mgl cos(θ). (7.2)

Here the generalized coordinate is q = θ. Using Lagrange's equation, (d/dt)(∂L/∂q̇) − ∂L/∂q = 0, we arrive at

0 = (d/dt)(∂L/∂θ̇) − ∂L/∂θ = (d/dt)(ml²θ̇) + mgl sin(θ) = ml²θ̈ + mgl sin(θ).

Therefore the equation of motion describing the pendulum is given by

θ̈ + (g/l) sin(θ) = 0. (7.3)

Recall that for small angles sin(θ) ≈ θ. Hence for small angles the equations of motion are approximately determined by

ϕ̈ + (g/l) ϕ = 0. (7.4)

When the angles are small the actual angle of the pendulum θ(t) ≈ ϕ(t). For small angles the solution of (7.4) is given by

θ(t) ≈ ϕ(t) = a cos(ωt) + b sin(ωt) where ω = √(g/l) (7.5)

and a and b are constants determined by the initial conditions ϕ(0) = θ(0) and ϕ̇(0) = θ̇(0). The solution to the linear differential equation ϕ̈ + (g/l)ϕ = 0 with ϕ(0) = θ(0) and ϕ̇(0) = θ̇(0) = 0 is given by

ϕ(t) = θ(0) cos(√(g/l) t). (7.6)

Finally, θ(t) ≈ ϕ(t) when ϕ(0) = θ(0) is small and ϕ̇(0) = θ̇(0) = 0. The frequency in hertz of the solution ϕ(t) in (7.6) is determined by

(1/2π) √(g/l).

Because the period is one over the frequency (in hertz), the period of the solution ϕ(t) in (7.6) is given by

τlin = 2π √(l/g). (7.7)

The energy approach. The total energy of the system is given by the kinetic energy plus the potential energy. So the total energy of the pendulum is determined by

H = T + V = ml²θ̇²/2 − mgl cos(θ).

Because the system is conservative the total energy H equals a constant. Hence

0 = Ḣ = ml²θ̇θ̈ + mgl sin(θ)θ̇.


Dividing by mlθ̇ yields the differential equation lθ̈ + g sin(θ) = 0, and we arrive at the same equation of motion as in (7.3).

The energy approach can also be used to find the period of the pendulum when θ(0) = θ0 is in [0, π) and θ̇(0) = 0. (Theoretically the pendulum sits at the unstable equilibrium when θ(0) = π and θ̇(0) = 0.) To obtain the period notice that the conservation of energy principle shows that

ml²θ̇²/2 − mgl cos(θ) = −mgl cos(θ0).

Because the kinetic energy is positive, cos(θ) − cos(θ0) ≥ 0. Hence

dθ/dt = √(2g/l) √(cos(θ) − cos(θ0)) and dt/dθ = √(l/2g) · 1/√(cos(θ) − cos(θ0)).

By taking the integral, this implies that

t0 = √(l/2g) ∫_0^{θ0} dθ/√(cos(θ) − cos(θ0))

is the time it takes for the pendulum to move from θ0 to θ = 0. The period is obtained bymoving from θ0 to 0 and then from 0 to −θ0 and then back from −θ0 to 0 and finally from0 to θ0. (The conservation of energy principle guarantees that the maximum height of thependulum is achieved at θ0 and −θ0.) Moreover, the time it takes the pendulum to travelthese four paths is the same. So the period is given by

τ = 4 √(l/2g) ∫_0^{θ0} dθ/√(cos(θ) − cos(θ0)). (7.8)
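The integral (7.8) can be evaluated without series expansions: substituting sin(θ/2) = sin(θ0/2) sin(u) turns it into τ = 4√(l/g) K(sin(θ0/2)), where K is the complete elliptic integral of the first kind. This reduction and the arithmetic-geometric mean (AGM) evaluation of K used below are standard identities rather than results from these notes; the Python sketch assumes l = 3 and g = 9.8 as in the Simulink model later in this section.

```python
import math

def K(k):
    """Complete elliptic integral of the first kind via the arithmetic-geometric mean:
    K(k) = pi / (2 * agm(1, sqrt(1 - k^2)))."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    while abs(a - b) > 1e-15:
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return math.pi / (2.0 * a)

def period(theta0, l=3.0, g=9.8):
    """Pendulum period tau = 4*sqrt(l/g)*K(sin(theta0/2))."""
    return 4.0 * math.sqrt(l / g) * K(math.sin(theta0 / 2.0))

tau_lin = 2.0 * math.pi * math.sqrt(3.0 / 9.8)
# These ratios match the values the text reports for theta0 = pi/3 and pi/12.
assert abs(period(math.pi / 3) / tau_lin - 1.0732) < 1e-3
assert abs(period(math.pi / 12) / tau_lin - 1.0043) < 1e-3
```

The AGM iteration converges quadratically, so a handful of iterations suffices for full double precision.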

It is emphasized that the period τ = τ(θ0) depends upon the initial condition θ0. By consulting the Wikipedia page Pendulum (mathematics),

http://en.wikipedia.org/wiki/Pendulum_%28mathematics%29

one can use elliptic integrals to show that the period is

τ = 2π √(l/g) (1 + (1/2)² sin²(θ0/2) + ((1·3)/(2·4))² sin⁴(θ0/2) + ((1·3·5)/(2·4·6))² sin⁶(θ0/2) + · · ·). (7.9)

In particular, the linearized model for the pendulum has a smaller period τlin ≤ τ than the actual pendulum; see (7.7). Therefore the frequency of the pendulum is lower than the corresponding frequency of its linearized model. The power series in (7.9) also shows that τlin ≈ τ for small angles.

Using the power series in (7.9), with Matlab we plotted the graph of

τ/τlin = 1 + (1/2)² sin²(θ0/2) + ((1·3)/(2·4))² sin⁴(θ0/2) + ((1·3·5)/(2·4·6))² sin⁶(θ0/2) + · · ·

in Figure 5.20. For example,


• if θ0 = π/12, then τ = 1.0043 τlin;

• if θ0 = π/3, then τ = 1.0732 τlin;

• if θ0 = π/2, then τ = 1.1803 τlin;

• if θ0 = 3π/4, then τ = 1.5268 τlin.

The Matlab commands we used to generate Figure 5.20 are given by

• t = linspace(0, pi, 2^14);

• p = 1 + sin(t/2).^2/4;

• for k = (4:2:50); p = p + prod(1:2:k-1)^2*sin(t/2).^k/prod(2:2:k)^2; end

• plot(t, p); grid

• title('The plot of τ/τlin')

• xlabel('θ'); ylabel('τ/τlin'); print pend
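For readers without Matlab, the same partial sums of the series (7.9) can be evaluated in Python (the number of terms below is an arbitrary choice that is more than enough for these angles):

```python
import math

def ratio(theta0, terms=25):
    """Partial sum of the series (7.9) for tau/tau_lin."""
    s2 = math.sin(theta0 / 2.0) ** 2
    total, coef = 1.0, 1.0
    for n in range(1, terms + 1):
        coef *= (2.0 * n - 1.0) / (2.0 * n)   # 1/2, (1*3)/(2*4), (1*3*5)/(2*4*6), ...
        total += coef ** 2 * s2 ** n
    return total

# Reproduces the ratios listed above.
assert abs(ratio(math.pi / 2) - 1.1803) < 1e-3
assert abs(ratio(math.pi / 12) - 1.0043) < 1e-3
```

Each term of the series is bounded by sin²ⁿ(θ0/2), so the partial sums converge geometrically for θ0 < π.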

Figure 5.20: The plot of τ/τlin.


5.7.1 A Simulink model.

Let us present a Simulink model for the pendulum. Assume that l = 3. The Simulink model we used is presented in Figure 5.19. We set the configuration parameters in Simulink to: fixed step, fixed step size 0.001, and then ran the model for 40 seconds. In the simout to workspace block we chose the array option. The simout variable in Matlab contains a vector of length 40001 which corresponds to the solution θ of the equations of motion in (7.3), while simout1 in Matlab is a vector of length 40001 which corresponds to the solution ϕ of the linear approximation of the equations of motion in (7.4). Notice that the gain block was set to −9.8/l ≈ −g/l and the amplitude in the sine function block was also set to −9.8/l ≈ −g/l. To run the Simulink model, we set the initial condition for integrator 1 (that is, θ(0)) and integrator 3 (that is, ϕ(0)) both equal to π/12. Running Simulink generated the graph in Figure 5.21. (The initial conditions for integrator and integrator 2, which correspond to θ̇(0) and ϕ̇(0), were set equal to zero. Zero is the default setting.) Simulink solves the differential equation in (7.3) and its linear approximation in (7.4) with the initial conditions θ(0) = ϕ(0) = π/12 and θ̇(0) = ϕ̇(0) = 0. Since the initial angle is small, Figure 5.21 shows that the plots of θ and its linear approximation ϕ are close. To plot this graph we used the Matlab commands:

t = linspace(0, 40, 40001);
plot(t, simout); grid; hold on; plot(t, simout1, 'r')

According to (7.6), when the initial condition is small,

θ(t) ≈ ϕ(t) = θ(0) cos(√(g/l) t) = (π/12) cos(1.8074 t).

The corresponding frequency in radians is √(9.8/3) = 1.8074. So the period is τlin = 2π/1.8074 = 3.4764. In other words, the pendulum swings back and forth, or cycles, 40/3.4764 = 11.5062 times in 40 seconds; see Figure 5.21. Finally, recall that for θ0 = π/12 the actual period of the pendulum is τ = 1.0043 τlin = 3.4914. As expected, the period and frequency of the pendulum and its linear approximation are close when θ0 = π/12.
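These numbers can also be reproduced without Simulink. The following Python sketch is an assumption-laden stand-in for the Simulink run (classical RK4 with the same fixed step 0.001, l = 3, g = 9.8, θ(0) = π/12, θ̇(0) = 0); it estimates the period from the first zero crossing of θ, which occurs at a quarter period:

```python
import math

g, l = 9.8, 3.0
dt = 0.001

def deriv(th, w):
    """theta'' = -(g/l) sin(theta), written as a first order system."""
    return w, -(g / l) * math.sin(th)

def rk4_step(th, w):
    k1 = deriv(th, w)
    k2 = deriv(th + 0.5 * dt * k1[0], w + 0.5 * dt * k1[1])
    k3 = deriv(th + 0.5 * dt * k2[0], w + 0.5 * dt * k2[1])
    k4 = deriv(th + dt * k3[0], w + dt * k3[1])
    return (th + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            w + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

th, w, t = math.pi / 12, 0.0, 0.0
while th > 0.0:                      # first zero crossing = quarter period
    th, w = rk4_step(th, w)
    t += dt
tau = 4.0 * t
assert abs(tau - 3.4914) < 0.01      # the period quoted for theta0 = pi/12
```

The zero-crossing estimate is accurate to roughly four times the step size, well within the stated tolerance.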

If one wants to see how the pendulum moves in the x-y plane, then in Matlab set

l = 3; x = l*sin(simout); y = -l*cos(simout);
comet(x, y)

Since simout is the angle θ of the pendulum, Matlab plots the x-y position of the pendulum.


Figure 5.21: The plot of θ and its approximation ϕ with θ(0) = π/12.

Now let us use Simulink to numerically solve the differential equations in (7.3) and (7.4) with the initial conditions θ(0) = ϕ(0) = π/3 and θ̇(0) = ϕ̇(0) = 0. To accomplish this we set the initial conditions in integrator 1 and integrator 3 both equal to π/3. The initial conditions in the other two integrators were set equal to zero; see Figure 5.19. Then we ran Simulink. The graph of θ and its linear approximation ϕ is given in Figure 5.22. Because the initial condition is not small, the two graphs for θ and its linear approximation ϕ diverge. The solution ϕ of the linear approximation, that is, of the differential equation ϕ̈ + (g/l)ϕ = 0, is given by

ϕ = θ(0) cos(√(g/l) t) = (π/3) cos(1.8074 t),

which runs through 40/3.4764 = 11.5062 cycles in 40 seconds. The power spectrum for θ, the actual angle of the pendulum, is given in Figure 5.23, where the x axis is labeled in hertz. The power spectrum was obtained by plotting the inverse fast Fourier transform of θ over the interval [0, 40]. The Matlab commands we used are given by

a = ifft(simout);
bar((0:30), abs(a(1:31)).^2); grid
title('The power spectrum of θ'); ylabel('|a|^2'); xlabel('The frequency in hertz');

We see that the main frequency occurs at 0.275 hertz or 2π × 0.275 = 1.7279 radians. (The frequency of 0.275 was obtained by magnifying the graph in Matlab.) The corresponding period determined experimentally is 1/0.275 = 3.6364. The frequency of the linear approximation ϕ equals 1.8074 radians. Since the fundamental frequency for the nonlinear differential equation (determined experimentally) is 1.7279, we see that the actual solution θ has a smaller frequency than its corresponding linear approximation ϕ.


Recall that for θ0 = π/3 the actual period of the pendulum is τ = 1.0732 τlin = 3.7308 and τlin = 3.4764. Hence the experimental estimate 3.6364 of the period is slightly off the actual period 3.7308 of the pendulum. For many engineering applications this is close enough. Finally, it is noted that there are statistical techniques which will give more accurate results; see Capon's method in [11] for further details.

Figure 5.22: The plot of θ and its approximation ϕ with θ(0) = π/3.

Figure 5.23: The power spectrum of θ with θ(0) = π/3.


REMARK 5.7.1 It is emphasized that the data for our pendulum is "relatively clean" and does not contain any noise. In this case, we can compute the period of θ by simply measuring the distance between adjacent peaks of θ(t). So using Matlab we simply zoomed in on Figure 5.22 and measured the difference between the peaks of θ(t) with the initial condition θ(0) = π/3, and discovered that the corresponding period is 3.73, as expected. (This is the θ computed from θ̈ + (g/l) sin(θ) = 0 in Simulink.) Recall that using the power spectrum for θ with θ(0) = π/3 in Figure 5.23, we estimated the period at 3.6364. It is emphasized that the power spectrum in Figure 5.23 contains one large peak with several sidebands. This is because we computed the inverse fast Fourier transform over 40 seconds and the pendulum did not complete its cycle at the end of 40 seconds. In fact, θ(t) completed 40/3.73 = 10.72 cycles, and thus, by the Gibbs phenomenon, we have sidebands in the power spectrum. To eliminate these sidebands we computed the inverse fast Fourier transform of θ(t) over the interval [0, 37.3], or 10 cycles. Then we plotted the power spectrum of θ with θ(0) = π/3 in Figure 5.24. This eliminated the Gibbs phenomenon. The new power spectrum has only one peak at the desired frequency of 1.6839 radians, or 0.268 hertz, or equivalently, a period of 3.73.

In Simulink we computed θ(t) with the initial condition θ(0) = π/3 using 37301 points over the interval t ∈ [0, 37.3]. Then we computed a = ifft(θ) by taking the inverse fast Fourier transform of θ in Matlab. The a(k) corresponding to the peak in the power spectrum in Matlab was a(11) = 0.5267 + 0.0031i. (The Fourier coefficient is a(11) because θ(t) completed 10 cycles.) Then we set

y = 2ℜ(a(11)) cos(1.6839 t) + 2ℑ(a(11)) sin(1.6839 t) = 1.0534 cos(1.6839 t) + 0.0062 sin(1.6839 t).

We used norm(x, inf) in Matlab to compare θ(t) to its approximation y(t) from its power spectral density over the interval [0, 37.3] to obtain

max{|θ(t) − y(t)| : 0 ≤ t ≤ 37.3} = 0.0207.

The plot of θ(t) and y(t) was virtually identical. It is noted that 1.0534 ≈ π/3 and 0.0062 ≈ 0. So our analysis of the power spectrum also obtained a good approximation of the periodic motion (π/3) cos(1.6841 t) formed by the initial condition θ(0) = π/3, θ̇(0) = 0 and the actual fundamental frequency 1.6841 of the pendulum. Finally, the theoretical frequency 1.6841 ≈ 1.6839, which is our experimental estimate. For all practical purposes the two frequencies are the same.

It is emphasized that in many applications one does not know the period of the signala priori, or there may be noise in the signal, and the signal could have many differentfrequencies in which case setting the length of the data may not be an option. Finally, isnoted that one can use Capon’s methods to find all the frequencies without worrying aboutthe length of the data; see [11] further details.
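The ifft-based amplitude recovery described above can be reproduced outside Matlab. The following Python/NumPy sketch (NumPy's ifft uses the same normalization as Matlab's ifft) builds a cosine with exactly 10 cycles in the window, so the power spectrum has a single clean peak at k = 10 and the formula y = 2ℜ(a(k)) cos(ωt) + 2ℑ(a(k)) sin(ωt) recovers the signal. The sample count and amplitude here are illustrative stand-ins, not the values from the Simulink run.

```python
import numpy as np

# Illustrative stand-in for the pendulum data: amplitude pi/3, frequency 1.6839
A, omega = np.pi / 3, 1.6839
period = 2 * np.pi / omega
N = 4096
t = np.linspace(0.0, 10 * period, N, endpoint=False)  # exactly 10 cycles
x = A * np.cos(omega * t)

a = np.fft.ifft(x)                # same convention as Matlab's ifft
k = 1 + int(np.argmax(np.abs(a[1:N // 2])))
# Ten complete cycles put the peak at index k = 10 (Matlab's a(11)).
y = 2 * a[k].real * np.cos(omega * t) + 2 * a[k].imag * np.sin(omega * t)
err = np.max(np.abs(x - y))       # reconstruction error from the single peak
```

Because the window holds an integer number of cycles, a(10) = A/2 up to rounding and there are no Gibbs sidebands; changing the window length by a fraction of a period brings the sidebands back.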

5.7. A SIMPLE PENDULUM 295

Figure 5.24: The power spectrum of θ(t) with θ(0) = π/3 and 0 ≤ t ≤ 37.3.

To complete this section, let us use the conservation of energy principle to gain some further insight into the motion of the pendulum. The conservation of energy principle states that the total energy

H = ml²θ̇²/2 − mgl cos(θ)        (7.10)

for the pendulum equals a constant for all time. To apply this principle, assume that the pendulum hangs in the down position θ(0) = 0, and that there is an initial angular velocity θ̇(0) on the pendulum. Physically, this is like "hitting" the pendulum while it sits motionless in the down position. If θ̇(0) is large enough, then the pendulum will make a complete circle around the origin and rotate infinitely many times around the origin. On the other hand, if θ̇(0) is not large enough, then the pendulum will never reach the top of the circle, that is, |θ(t)| will reach a maximum angle θmax strictly less than π and then fall back down. In other words, the pendulum will swing back and forth between ±θmax. So there is an initial angular velocity θ̇(0) = θ̇◦ with θ(0) = 0 such that the pendulum moves to the top of the circle and stays there. In other words, for these initial conditions

π = lim_{t→∞} θ(t)   and   0 = lim_{t→∞} θ̇(t).        (7.11)

To find this initial angular velocity θ̇◦, let us use the conservation of energy principle, that is, the total energy equals a constant for all time. To be precise,

ml²θ̇(0)²/2 − mgl cos(θ(0)) = ml²θ̇(t)²/2 − mgl cos(θ(t))        (7.12)

for all time t. So choosing θ̇(0) = θ̇◦ and θ(0) = 0 with the constraint in (7.11), we obtain

ml²θ̇◦²/2 − mgl = ml²θ̇(t)²/2 − mgl cos(θ(t)) → 0 − mgl cos(π) = mgl.

This readily implies that l²θ̇◦²/2 = 2gl. Therefore the initial angular velocity θ̇◦ which moves the pendulum to the top of the circle and keeps it there is given by

θ̇◦ = ±2√(g/l).        (7.13)

It is noted that θ̇◦ equals ±2 times the natural frequency √(g/l) of the linear approximation for the pendulum; see equations (7.4) and (7.5). Summing up, we have the following cases.

(i) If |θ̇(0)| < 2√(g/l) and θ(0) = 0, then the conservation of energy principle (7.12) shows that the pendulum will swing back and forth in the region

|θ(t)| ≤ arccos(1 − lθ̇(0)²/(2g)) = θmax.        (7.14)

The pendulum will keep rising until its angular velocity θ̇(t₁) = 0 at some time t₁. Then the pendulum will begin to fall and its angular speed will increase. The maximum angle θmax = max{|θ(t)| : t ≥ 0} the pendulum attains occurs when the angular velocity θ̇(t₁) = 0 at some time t₁. Using this in (7.12) with θ(0) = 0 and θ(t₁) = ±θmax, we obtain

ml²θ̇(0)²/2 − mgl = −mgl cos(θ(t₁)) = −mgl cos(θmax).

Hence cos(θmax) = 1 − lθ̇(0)²/(2g). Taking the arccos of both sides yields (7.14).

(ii) If |θ̇(0)| = 2√(g/l) and θ(0) = 0, then

π = lim_{t→∞} θ(t)   and   0 = lim_{t→∞} θ̇(t).

(iii) If |θ̇(0)| > 2√(g/l) and θ(0) = 0, then the pendulum will rotate around the origin infinitely many times. To see this, notice that the conservation of energy principle in (7.12) yields

ml²θ̇(t)²/2 = ml²θ̇(0)²/2 + mgl(cos(θ(t)) − cos(θ(0))) ≥ ml²θ̇(0)²/2 − 2mgl ≥ δ > 0

for some positive number δ. The last inequality follows from the fact that lθ̇(0)² > 4g. Therefore |θ̇(t)| ≥ ε > 0 for all time, where ε is some positive number. In other words, the angular velocity never crosses zero and is bounded away from zero. So the pendulum keeps rotating around the origin infinitely many times.
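The bound (7.14) in case (i) is easy to check numerically. The following Python sketch, a hand-rolled RK4 integrator standing in for the Simulink model (step size and duration are our choices), integrates θ̈ + (g/l) sin(θ) = 0 from θ(0) = 0, θ̇(0) = 1.98√(g/l) and compares max |θ(t)| with arccos(1 − lθ̇(0)²/(2g)).

```python
import numpy as np

g, l = 9.8, 3.0
w0 = 1.98 * np.sqrt(g / l)      # case (i): |theta_dot(0)| < 2*sqrt(g/l)

def f(s):
    # state s = (theta, theta_dot) for theta'' + (g/l) sin(theta) = 0
    return np.array([s[1], -(g / l) * np.sin(s[0])])

s = np.array([0.0, w0])
dt, theta_max = 1e-3, 0.0
for _ in range(40000):          # integrate 40 seconds with RK4
    k1 = f(s)
    k2 = f(s + dt / 2 * k1)
    k3 = f(s + dt / 2 * k2)
    k4 = f(s + dt * k3)
    s = s + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    theta_max = max(theta_max, abs(s[0]))

predicted = np.arccos(1 - l * w0**2 / (2 * g))   # equation (7.14)
```

With these values predicted ≈ 2.8585, matching the Matlab computation later in this section.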

To simulate Part (ii), we set the initial conditions in the integrator and integrator2 blocks to 2√(g/l), with the initial conditions in the integrator1 and integrator3 blocks set to 0, in our Simulink model in Figure 5.19. We ran Simulink for 40 seconds with l = 3. The graph of θ(t) with its linear approximation ϕ(t) is presented in Figure 5.25. Due to numerical error it is unlikely that the angle θ(t) for the pendulum will satisfy (7.11) exactly. In our simulation, the pendulum would approach the top angle π, stay there for about 8 seconds, and then fall back down.

(One can see this using the comet command in Matlab.) Clearly, the linear approximation ϕ does not even come close to the "actual angle" θ of the pendulum; see (7.11).

One can also use the scope block in Simulink to send the scope data to Matlab. To accomplish this, open the scope and, under the history tab, uncheck "limit data to 5000 points". Then check "save data to workspace" and "array". The data is saved in ScopeData (or any variable name you give it). The Matlab commands we used to generate Figure 5.25 are given by

t=ScopeData(:,1); plot(t,ScopeData(:,2:3)); grid; xlabel('time');

It is noted that in our example ScopeData for the scope block is a 40001 × 3 array. The first column corresponds to time t, while the second column and third column correspond to θ and ϕ, respectively. Moreover, ScopeData1 for the scope1 block is a 40001 × 2 array, where the first column corresponds to time t, while the second column corresponds to θ. As expected, ScopeData2 for the scope2 block is a 40001 × 2 array, where the first column corresponds to time t, while the second column corresponds to θ̇.

To complete this section, let us simulate Part (i) with θ(0) = 0 and θ̇(0) = 1.98√(g/l) when l = 3. Using Simulink with Matlab, we computed the [x(t), y(t)] position of the pendulum; see (7.1). Then we plotted the phase plot of x(t) vs y(t) in Figure 5.26. To run Simulink we set the initial conditions in the integrator and integrator2 blocks to 1.98√(g/l), with the initial conditions in the integrator1 and integrator3 blocks set to 0, in our Simulink model in Figure 5.19. (Since we did not use ϕ in this part, one can ignore integrator2 and integrator3.) In this case, θmax = arccos(1 − 1.98²/2) ≈ 2.8585. The following Matlab commands were used to generate Figure 5.26.

l = 3; g = 9.8; x = l*sin(ScopeData1(:,2));
y = -l*cos(ScopeData1(:,2)); comet(x,y)
% x = l sin(θ) and y = −l cos(θ); see (7.1)
% The comet command plots x(t) vs y(t) as the pendulum moves.
plot(x,y); grid; xlabel('x'); ylabel('y')
thetamax = acos(1 - 1.98^2/2)        % = 2.8585
max(abs(ScopeData1(:,2)))            % = 2.8585
% This computes max(|θ(t)|) = 2.8585 as expected.
% To show that the total energy H is indeed a constant:
t = ScopeData1(:,1);
H = l^2*ScopeData2(:,2).^2/2 - g*l*cos(ScopeData1(:,2));
% Since the mass only adds a scale factor, without loss of generality we set m = 1.
l^2*1.98^2*g/(2*l) - g*l             % = 28.2299
plot(t,H); grid
% The graph should be the constant l²θ̇(0)²/2 − gl cos(θ(0)) = 28.2299.

Figure 5.25: The graph of θ and ϕ with θ(0) = ϕ(0) = 0 and θ̇(0) = ϕ̇(0) = 2√(g/l).

Figure 5.26: The phase plot of x(t) vs y(t) with θ(0) = 0 and θ̇(0) = 1.98√(g/3).

5.7.2 Exercise

Exercise 1. Derive the solution to the differential equation

ϕ̈ + ω²ϕ = 0

subject to the initial conditions ϕ(0) = α and ϕ̇(0) = β. In the linear approximation for our pendulum, ω² = g/l.

Exercise 2. Consider the linear approximation for the pendulum given by

ϕ̈ + (g/l)ϕ = 0

where l = 3 and ϕ(0) = π/3. For this example the fundamental frequency is ω = √(9.8/3) ≈ 1.8074. Run the Simulink model in Figure 5.19 to solve this linear differential equation for 40 seconds. Use the fast Fourier transform to try to estimate the frequency 1.8074.

Exercise 3. Consider the equation of motion for the pendulum given by

θ̈ + (g/l) sin(θ) = 0.

For l = 3 and g = 9.8, design a Simulink model to compute θ over the interval [0, 40] subject to the initial conditions θ(0) = 0 and θ̇(0) = 1.99√(g/l). Plot θ(t) and θ̇(t). Graph the phase plot of the pendulum, that is, plot x vs y, or plot(x,y) in Matlab. (Try the comet command.) Use the fast Fourier transform to estimate the frequencies of θ(t). Plot the power spectrum of θ(t).

Exercise 4. Now assume that there is damping on the pendulum. In this case, the equation of motion becomes

ml²θ̈ + mgl sin(θ) = −cθ̇        (7.15)

where c is the damping coefficient. (This follows from Lagrange's equations of motion with damping: d/dt(∂L/∂q̇) − ∂L/∂q = −cq̇.) For small angles its linear approximation θ ≈ ϕ is given by

ml²ϕ̈ + mglϕ = −cϕ̇.        (7.16)

Assume that m = 2, l = 4 and c = 3, with the initial conditions θ(0) = ϕ(0) = π/2 and θ̇(0) = ϕ̇(0) = 0.

• Modify the Simulink model in Figure 5.19 to numerically solve the differential equation of motion in (7.15) and its linear approximation in (7.16). Plot the solutions θ and ϕ for these differential equations over the interval [0, 40] on the same graph.

• If one increases the mass, does the solution to (7.15) die out faster or slower? Explain why.

• If one increases the length l of the pendulum, does the solution to (7.15) die out faster or slower? Explain why.

• As before, consider the equation of motion for the pendulum in (7.15) with m = 2, l = 4 and c = 3. Now assume that the initial angle θ(0) = 0 in (7.15) and that the pendulum is given an initial angular velocity θ̇(0). Physically, this is like "hitting" the pendulum while it sits motionless in the down position. If θ̇(0) is large enough, then the pendulum will make a complete circle. (In fact, it could rotate around the origin many times.) Use your Simulink model to find the smallest θ̇(0) > 0 such that the pendulum makes one complete circle. (Due to damping and Part (ii) above, this θ̇(0) > 2√(g/l).) The comet command in Matlab may be helpful in determining when and how many times the pendulum rotates around the origin.

Exercise 5. Consider the linearized equation for the pendulum:

ϕ̈ + (g/l)ϕ = 0.

Part (i) Find the solution ϕ(t) where ϕ(0) = ϕ₀ and ϕ̇(0) = 0.

Part (ii) Let a = ifft(ϕ) be the 2¹⁴-point inverse fast Fourier transform of the solution ϕ(t) over the interval 0 ≤ t ≤ 10π. The power spectrum, that is, the plot of k versus |aₖ|², is presented in Figure 5.27. Find the length l of the pendulum. Assuming that ϕ̇(0) = 0 and ϕ(0) > 0, find the initial condition ϕ(0).

Figure 5.27: The power spectrum of ϕ(t), that is, the plot of k versus |aₖ|².

Chapter 6

Mass spring damper systems

In this chapter we will present an introduction to mass spring damper systems and their state space models.

6.1 A general mass spring damper equation

The equations of motion for a general mass spring damper system are of the form:

Mq̈ + Φq̇ + Kq = bu.        (1.1)

Here the input or forcing function is u. Moreover, M, Φ and K are matrices on Cν, while b is a vector in Cν, and q(t) is a vector in Cν. (The set of all vectors of length ν with complex entries is denoted by Cν.) The matrix M is the mass matrix, while Φ is the damping matrix, and K is the spring matrix. To eliminate some technical issues we will always assume that the mass matrix M is invertible. In fact, in all of our applications M is a strictly positive matrix. Finally, systems of the form Mq̈ + Φq̇ + Kq = bu are also referred to as second order matrix differential equations.

By taking the Laplace transform of Mq̈ + Φq̇ + Kq = bu, we obtain

(s²M + sΦ + K)Q(s) − sMq(0) − Mq̇(0) − Φq(0) = bU(s)
(s²M + sΦ + K)Q(s) = (sM + Φ)q(0) + Mq̇(0) + bU(s).

By taking the inverse of s²M + sΦ + K, we arrive at

Q(s) = (s²M + sΦ + K)⁻¹((sM + Φ)q(0) + Mq̇(0)) + (s²M + sΦ + K)⁻¹bU(s).        (1.2)

It is noted that Q(s) = (Lq)(s) is a vector in Cν. Moreover,

(s²M + sΦ + K)⁻¹ = adj[s²M + sΦ + K] / det[s²M + sΦ + K]        (1.3)

where adj[Z] denotes the algebraic adjoint of a square matrix Z and det[Z] is the determinant of Z. The poles of (s²M + sΦ + K)⁻¹ are the values of s such that s²M + sΦ + K is not invertible, or equivalently, the values of s where det[s²M + sΦ + K] = 0. Therefore the roots of det[s²M + sΦ + K] are the poles of (s²M + sΦ + K)⁻¹. Finally, it is noted that (s²M + sΦ + K)⁻¹ converges to zero as s tends to infinity. Hence (s²M + sΦ + K)⁻¹ is a strictly proper rational function, that is, the degree of the numerator is strictly less than the degree of the denominator.

By taking the inverse Laplace transform of Q(s) in (1.2), one can theoretically compute the solution q(t) = (L⁻¹Q)(t) for Mq̈ + Φq̇ + Kq = bu. This requires computing the inverse (s²M + sΦ + K)⁻¹. This inverse can be computed for low order systems. However, in general computing the inverse (s²M + sΦ + K)⁻¹ can be numerically challenging. To get around this problem we will convert the mass spring damper system in (1.1) to the state space setting of Section 5.5. The state space setting allows us to compute q(t) for large systems.

By setting all the initial conditions in (1.2) equal to zero, we see that the transfer function from u to q is given by

Q(s)/U(s) = (s²M + sΦ + K)⁻¹b.        (1.4)

Since differentiation in the time domain corresponds to multiplication by s in the Laplace domain, the transfer functions from u to q̇ and from u to q̈ are respectively given by

(Lq̇)(s)/U(s) = s(s²M + sΦ + K)⁻¹b   and   (Lq̈)(s)/U(s) = s²(s²M + sΦ + K)⁻¹b.        (1.5)

Finally, it is noted that s(s²M + sΦ + K)⁻¹ is a strictly proper rational matrix function, while s²(s²M + sΦ + K)⁻¹ is a proper rational matrix function when M ≠ 0, that is, its numerator and denominator are of the same degree. To see this simply let s tend to infinity.

DEFINITION 6.1.1 The mass spring damper system

Mq̈ + Φq̇ + Kq = 0        (1.6)

is stable if q(t) converges to zero for all initial conditions q(0) and q̇(0).

REMARK 6.1.2 By consulting equation (1.2) with u = 0, the mass spring damper system Mq̈ + Φq̇ + Kq = 0 is stable if and only if all the poles of (s²M + sΦ + K)⁻¹ are contained in the open left half plane {λ ∈ C : ℜ(λ) < 0}. Because the roots of det[s²M + sΦ + K] are the poles of (s²M + sΦ + K)⁻¹, the mass spring damper system is stable if and only if all the zeros of det[s²M + sΦ + K] are contained in the open left half plane. In particular, if the mass spring damper system is stable, then the matrix K is invertible.

We will say that the mass spring damper system Mq̈ + Φq̇ + Kq = bu is stable if the corresponding system Mq̈ + Φq̇ + Kq = 0 with u = 0 is stable. Therefore the general mass spring damper system is stable if and only if all the roots of det[s²M + sΦ + K] are contained in the open left half plane.

To see that stability forces K to be invertible, observe that if Mq̈ + Φq̇ + Kq = 0 is stable, then det[s²M + sΦ + K] has all its roots in the open left half plane. In particular, 0 is not a root of det[s²M + sΦ + K]. Hence det[K] ≠ 0, and K is invertible.

PROPOSITION 6.1.3 Assume that the mass spring damper system Mq̈ + Φq̇ + Kq = bu is stable. If u(t) = u₀ equals a constant for all t, then

K⁻¹bu₀ = lim_{t→∞} q(t),   0 = lim_{t→∞} q̇(t),   0 = lim_{t→∞} q̈(t).        (1.7)

To see why this is true from a dynamics point of view: because the input is a constant and the system is stable, the velocity q̇ and acceleration q̈ must settle down as t tends to infinity. In other words, q̇(∞) = 0 and q̈(∞) = 0. So for large t, the system Mq̈ + Φq̇ + Kq = bu₀ reduces approximately to Kq ≈ bu₀. The stability of our mass spring damper system guarantees that K is invertible. Therefore q(∞) = K⁻¹bu₀.

Proof of Proposition 6.1.3. By rewriting (1.2), we obtain

Q(s) = (s²M + sΦ + K)⁻¹((sM + Φ)q(0) + Mq̇(0)) + (s²M + sΦ + K)⁻¹bU(s).        (1.8)

Recall that the poles of

(s²M + sΦ + K)⁻¹ = adj[s²M + sΦ + K] / det[s²M + sΦ + K]

are precisely the zeros of det[s²M + sΦ + K]. Because Mq̈ + Φq̇ + Kq = 0 is stable, all the poles of (s²M + sΦ + K)⁻¹ are contained in the open left half plane {λ ∈ C : ℜ(λ) < 0}. By consulting (1.8) with U(s) = u₀/s, a partial fraction expansion of Q(s) yields

Q(s) = (s²M + sΦ + K)⁻¹((sM + Φ)q(0) + Mq̇(0)) + (s²M + sΦ + K)⁻¹(bu₀/s)
     = Ψ(s) + [(s²M + sΦ + K)⁻¹]|_{s=0} (bu₀/s)
     = Ψ(s) + K⁻¹bu₀/s        (1.9)

where Ψ(s) is a rational function with all its poles in the open left half plane. In fact, the poles of Ψ are contained in the zeros of det[s²M + sΦ + K]. In other words, q(t) = ψ(t) + K⁻¹bu₀ where Ψ(s) = (Lψ)(s). Since Ψ(s) is a rational function with all its poles in the open left half plane, ψ(t) = Σ t^k e^{λ_j t} γ_{j,k} consists of a linear combination of stable exponential functions, where all the {λ_j} are contained in the open left half plane {λ ∈ C : ℜ(λ) < 0}. In fact, {λ_j} are the roots of det[s²M + sΦ + K]. In particular, ψ(t) converges to zero as t tends to infinity. Therefore

q(t) = ψ(t) + K⁻¹bu₀ → K⁻¹bu₀

as t tends to infinity. This proves the first equation in (1.7).

Since q(t) = ψ(t) + K⁻¹bu₀, we obtain q̇(t) = ψ̇(t). Because ψ̇(t) converges to zero as t tends to infinity, q̇(t) → 0 as t tends to infinity. Moreover, q̈(t) = ψ̈(t). Since ψ̈(t) converges to zero as t tends to infinity, q̈(t) → 0 as t tends to infinity. This completes the proof.

Recall that a transfer function G(s) is stable if all of its poles are contained in the open left half plane {λ ∈ C : ℜ(λ) < 0}. For convenience let us recall the final value theorem.

THEOREM 6.1.4 (Final value theorem) Let G(s) be a stable transfer function with input u and output y, that is, Y(s) = G(s)U(s). If the input u(t) = u₀ is a constant for all t, then

G(0)u₀ = lim_{t→∞} y(t).        (1.10)

It is noted that the function ψ(t) = Σ t^k e^{λ_j t} γ_{j,k}, and thus the poles {λ_j} of G, determine how y(t) approaches G(0)u₀. If the poles of G contain nonzero imaginary parts, then in general y(t) will oscillate toward G(0)u₀ at those frequencies. For example, if the poles contain large imaginary parts, then in general y(t) will oscillate at a high frequency as it converges to G(0)u₀. If all the real parts of the poles of G are far into the left half plane, then y(t) will converge rapidly to G(0)u₀. If the real parts of some of the poles of G are close to the imaginary axis, then y(t) will converge slowly to G(0)u₀.

Consider the stable mass spring damper system Mq̈ + Φq̇ + Kq = bu. By consulting (1.4), we see that Q(s) = (s²M + sΦ + K)⁻¹bU(s). Because this mass spring damper system is stable, det[s²M + sΦ + K] has all its roots in the open left half plane, or equivalently, (s²M + sΦ + K)⁻¹ is stable; see (1.3). In particular, the transfer function

G(s) = Q(s)/U(s) = (s²M + sΦ + K)⁻¹b

from u to q is stable. (It is noted that G is a vector valued transfer function.) Assume that the input u(t) = u₀ is a constant for all t. According to the final value Theorem 6.1.4, we have

lim_{t→∞} q(t) = [(s²M + sΦ + K)⁻¹b]|_{s=0} u₀ = K⁻¹bu₀.

(Because our system is stable, K is invertible.) This agrees with the first equation in (1.7) of Proposition 6.1.3. (Here we have ignored the initial conditions. Including the initial conditions q(0) and q̇(0) does not affect the limit of q(t) as t tends to infinity; see (1.9).)

By consulting (1.5), we see that (Lq̇)(s) = s(s²M + sΦ + K)⁻¹bU(s). Because our mass spring damper system is stable, the transfer function (Lq̇)(s)/U(s) = s(s²M + sΦ + K)⁻¹b is stable. Assume that the input u(t) = u₀ is a constant for all t. According to the final value theorem, we have

lim_{t→∞} q̇(t) = [s(s²M + sΦ + K)⁻¹b]|_{s=0} u₀ = 0.

This agrees with the second equation in (1.7) of Proposition 6.1.3. A similar argument yields the third equation in (1.7) of Proposition 6.1.3.
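Proposition 6.1.3 and the final value theorem can be checked numerically. The following Python sketch, with illustrative strictly positive matrices M, Φ and K that are our own choices (not values from the text), integrates Mq̈ + Φq̇ + Kq = bu₀ by RK4 and compares the long-time solution with K⁻¹bu₀:

```python
import numpy as np

# Illustrative 2-DOF system with M, Phi, K strictly positive
M   = np.array([[2.0, 0.0], [0.0, 1.0]])
Phi = np.array([[3.0, -1.0], [-1.0, 2.0]])
K   = np.array([[4.0, -1.0], [-1.0, 3.0]])
b   = np.array([1.0, 0.0])
u0  = 5.0

Minv = np.linalg.inv(M)

def f(q, v):
    # first order form of M q'' + Phi q' + K q = b u0
    return v, Minv @ (b * u0 - Phi @ v - K @ q)

q = np.zeros(2); v = np.zeros(2)
dt = 1e-3
for _ in range(40000):                      # integrate 40 seconds with RK4
    k1q, k1v = f(q, v)
    k2q, k2v = f(q + dt/2*k1q, v + dt/2*k1v)
    k3q, k3v = f(q + dt/2*k2q, v + dt/2*k2v)
    k4q, k4v = f(q + dt*k3q, v + dt*k3v)
    q = q + dt/6*(k1q + 2*k2q + 2*k3q + k4q)
    v = v + dt/6*(k1v + 2*k2v + 2*k3v + k4v)

steady = np.linalg.solve(K, b * u0)         # K^{-1} b u0 from (1.7)
```

After 40 seconds the computed q has settled onto K⁻¹bu₀ and the velocity has decayed to zero, as (1.7) predicts.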

6.2 Positive matrices and stability

In many practical mass spring damper systems, the matrices M, Φ and K are positive. To introduce the concept of a positive matrix, consider the mass spring damper system Mq̈ + Φq̇ + Kq = 0. The corresponding kinetic energy is given by

T = (1/2) q̇*Mq̇.

(Recall that L* denotes the conjugate transpose of a matrix or vector L.) It is emphasized that the kinetic energy (1/2)q̇(t)*Mq̇(t) is positive for all t, independent of the initial conditions q(0) and q̇(0). In particular, (1/2)ξ*Mξ must be positive for all possible initial conditions ξ = q̇(0) in Cν. The mass matrix M must have a special property to guarantee that the kinetic energy (1/2)q̇(t)*Mq̇(t) is positive for all t. This motivates the definition of a positive matrix.

DEFINITION 6.2.1 Let R be a matrix on Cν.

• R is positive if ξ*Rξ ≥ 0 for all ξ in Cν.

• R is strictly positive or positive definite if ξ*Rξ > 0 for all nonzero ξ in Cν.

A positive matrix R is denoted by R ≥ 0, while a strictly positive matrix R is denoted by R > 0. Recall that a matrix R on Cν is self adjoint if R = R*. We say that a scalar α is positive if α ≥ 0, and strictly positive if α > 0. The following is a classical result in linear algebra; see for example Horn and Johnson [21].

THEOREM 6.2.2 Let R be a matrix on Cν. Then the following holds.

• R is positive if and only if R is self adjoint and all the eigenvalues for R are positive.

• R is strictly positive if and only if R is self adjoint and all the eigenvalues for R are strictly positive.

• R is strictly positive if and only if R is positive and invertible.

In particular, a strictly positive matrix is invertible.

The matrix

R = [ 1  3
      0  2 ]

is not positive. The eigenvalues {1, 2} for R are positive. However, R is not self adjoint.

The matrix

P = [ 1  4
      4  1 ]

is not positive. The matrix P is self adjoint. However, the eigenvalues {−3, 5} for P are not both positive. Even though all the entries of P are strictly positive, the matrix P is not positive. This example shows that the entries of a matrix being strictly positive is not sufficient to guarantee that the matrix is positive.

The matrix

W = [  1  −1
      −1   1 ]

is positive. The matrix W is self adjoint. The eigenvalues {0, 2} for W are both positive. Because 0 is an eigenvalue for W, the matrix W is not strictly positive. Finally, even though two of the entries −1 of W are negative, W is still positive. So a positive matrix can have negative entries.

The matrix

Z = [  2  −1
      −1   2 ]

is strictly positive. The matrix Z is self adjoint. The eigenvalues {1, 3} for Z are both strictly positive. Clearly, Z is invertible. Finally, even though two of the entries −1 of Z are negative, Z is still strictly positive. So a strictly positive matrix can have negative entries.
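Theorem 6.2.2 turns these checks into a short computation. The following Python/NumPy sketch (the helper names are ours) tests self adjointness and the sign of the eigenvalues for the four example matrices R, P, W and Z above:

```python
import numpy as np

def is_positive(R, tol=1e-12):
    # R >= 0 iff R is self adjoint and all its eigenvalues are positive
    if not np.allclose(R, np.conj(R).T):
        return False
    return bool(np.all(np.linalg.eigvalsh(R) >= -tol))

def is_strictly_positive(R, tol=1e-12):
    # R > 0 iff R is self adjoint and all its eigenvalues are strictly positive
    if not np.allclose(R, np.conj(R).T):
        return False
    return bool(np.all(np.linalg.eigvalsh(R) > tol))

R = np.array([[1.0, 3.0], [0.0, 2.0]])    # not self adjoint
P = np.array([[1.0, 4.0], [4.0, 1.0]])    # eigenvalues {-3, 5}
W = np.array([[1.0, -1.0], [-1.0, 1.0]])  # eigenvalues {0, 2}
Z = np.array([[2.0, -1.0], [-1.0, 2.0]])  # eigenvalues {1, 3}
```

Running the checks reproduces the conclusions in the text: R and P are not positive, W is positive but not strictly positive, and Z is strictly positive.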

REMARK 6.2.3 (Sylvester's criterion) One can use the determinants of the leading principal minors, known as Sylvester's criterion, to determine when a matrix is strictly positive; see Horn and Johnson [21], or on the internet see Sylvester's criterion in Wikipedia or Wolfram MathWorld. Sylvester's criterion states that a square matrix R is strictly positive if and only if R is self adjoint and the determinants of all the submatrices in the upper left hand corner of R are strictly positive. For example, a 2 × 2 matrix

R = [ r11  r12
      r21  r22 ]

is strictly positive if and only if R is self adjoint, r11 > 0 and det[R] > 0. The matrix

P = [ p11  p12  p13
      p21  p22  p23
      p31  p32  p33 ]

is strictly positive if and only if P is self adjoint and

p11 > 0,   det [ p11  p12
                 p21  p22 ] > 0,   and det[P] > 0.

Sylvester's criterion for higher order matrices continues in the same fashion.
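Sylvester's criterion is equally easy to automate. A Python sketch (the function name is ours) checks self adjointness and the leading principal minors:

```python
import numpy as np

def sylvester_strictly_positive(R):
    # Self adjoint with every leading principal minor strictly positive
    R = np.asarray(R, dtype=complex)
    if not np.allclose(R, np.conj(R).T):
        return False
    n = R.shape[0]
    return all(np.linalg.det(R[:k, :k]).real > 0 for k in range(1, n + 1))
```

For the matrix Z above the leading minors are 2 and det[Z] = 3, so the test succeeds; for P it fails since det[P] = −15.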

THEOREM 6.2.4 Consider the mass spring damper system Mq̈ + Φq̇ + Kq = 0 where M, Φ and K are all strictly positive matrices. Then Mq̈ + Φq̇ + Kq = 0 is stable. In particular, all the roots of det[s²M + sΦ + K] are contained in the open left half plane.

Proof. According to Remark 6.1.2, the mass spring damper system Mq̈ + Φq̇ + Kq = 0 is stable if and only if all the roots of det[s²M + sΦ + K] are contained in the open left half plane {z ∈ C : ℜ(z) < 0}. Now assume that λ is a root of det[s²M + sΦ + K], that is, det[λ²M + λΦ + K] = 0. In other words, the matrix λ²M + λΦ + K is singular. Hence there exists a nonzero vector ξ such that (λ²M + λΦ + K)ξ = 0. Multiplying by ξ* on the left yields

0 = ξ*(λ²M + λΦ + K)ξ = ξ*Mξ λ² + ξ*Φξ λ + ξ*Kξ.

This readily implies that λ is a root of the quadratic polynomial ξ*Mξ s² + ξ*Φξ s + ξ*Kξ. Because M, Φ and K are all strictly positive, the coefficients ξ*Mξ > 0, ξ*Φξ > 0 and ξ*Kξ > 0 of this quadratic polynomial are all strictly positive. By consulting Problem 1 in Section 4.6 of Chapter 4, we see that ℜ(λ) < 0. Therefore the mass spring damper system Mq̈ + Φq̇ + Kq = 0 is stable. This completes the proof.
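Theorem 6.2.4 can be verified numerically by computing the roots of det[s²M + sΦ + K] through the standard companion (state space) linearization of the quadratic matrix polynomial. The matrices below are illustrative strictly positive choices of ours, not values from the text:

```python
import numpy as np

def quadratic_roots(M, Phi, K):
    # Roots of det[s^2 M + s Phi + K] as eigenvalues of the linearization
    # x' = [ 0, I ; -M^{-1}K, -M^{-1}Phi ] x  for the state x = [q; q']
    n = M.shape[0]
    A = np.block([[np.zeros((n, n)), np.eye(n)], [-K, -Phi]])
    B = np.block([[np.eye(n), np.zeros((n, n))],
                  [np.zeros((n, n)), M]])
    return np.linalg.eigvals(np.linalg.solve(B, A))

# Illustrative strictly positive M, Phi, K
M   = np.array([[2.0, 0.0], [0.0, 1.0]])
Phi = np.array([[3.0, -1.0], [-1.0, 2.0]])
K   = np.array([[4.0, -1.0], [-1.0, 3.0]])
roots = quadratic_roots(M, Phi, K)
```

All four roots have strictly negative real part, as the theorem guarantees.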

For further results on mechanical systems see [3]. Theorem 6.2.4 combined with Proposition 6.1.3 readily yields the following result.

COROLLARY 6.2.5 Consider the mass spring damper system Mq̈ + Φq̇ + Kq = bu where M, Φ and K are all strictly positive matrices and u is the input or forcing function. If u(t) = u₀ equals a constant for all t, then

K⁻¹bu₀ = lim_{t→∞} q(t),   0 = lim_{t→∞} q̇(t),   0 = lim_{t→∞} q̈(t).        (2.1)

6.2.1 The Gershgorin circle theorem

In this section, we will review the Gershgorin circle theorem. This theorem can be used to show that the matrices occurring in our mass spring damper systems are positive. If A is a matrix on Cn, then eig(A) is the set of all eigenvalues for A. Let {ajk} for j, k = 1, 2, ..., n be the entries of a matrix A on Cn, that is,

A = [ a11  a12  ...  a1n
      a21  a22  ...  a2n
       .    .   ...   .
      an1  an2  ...  ann ]   on Cn.        (2.2)

Let rj(A) = rj be the ℓ¹ norm of the j-th row of A not including ajj, that is,

rj = Σ_{k≠j} |ajk|        (for j = 1, 2, ..., n).        (2.3)

Finally, if a is a complex number and r ≥ 0, then D(a, r) is the closed disc in the complex plane with center a and radius r defined by

D(a, r) = {z ∈ C : |z − a| ≤ r}.        (2.4)

This sets the stage for the following classical result.

This sets the stage for the following classical result.

THEOREM 6.2.6 (Gershgorin circle theorem) Consider a matrix A on Cn with en-

tries {ajk}nn11 . Then every eigenvalue λ for A is contained in some closed disc D(ajj, rj) withcenter ajj and radius rj, and some index 1 ≤ j ≤ n. In particular,

eig(A) ⊆n⋃j=1

D(ajj, rj). (2.5)

It is noted that the bounds on the eigenvalues for A given by the Gershgorin circletheorem are in general not tight, and sometimes provide an estimate of the eigenvalueswhich is too large for certain problems. However, the Gershgorin circle theorem has manyuseful applications.
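The theorem is straightforward to check in code. A Python/NumPy sketch (with an arbitrary illustrative matrix of ours) computes the row discs from (2.3) and verifies that every eigenvalue lands in at least one of them:

```python
import numpy as np

def gershgorin_discs(A):
    # (center a_jj, radius r_j) for each row, per equation (2.3)
    n = A.shape[0]
    return [(A[j, j], sum(abs(A[j, k]) for k in range(n) if k != j))
            for j in range(n)]

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 2.0, 5.0]])      # illustrative, not from the text
discs = gershgorin_discs(A)
covered = all(any(abs(lam - c) <= r + 1e-9 for c, r in discs)
              for lam in np.linalg.eigvals(A))
```

The check works for complex eigenvalues as well, since abs measures the distance in the complex plane.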

Recall that T* denotes the complex conjugate transpose of a matrix T. Recall also that λ is an eigenvalue for A if and only if λ̄ is an eigenvalue for A*. The Gershgorin circle theorem applied to A* also yields the following result.

COROLLARY 6.2.7 Let A be a matrix on Cn with entries {ajk}. Set ρj = Σ_{k≠j} |akj|. Then every eigenvalue λ for A is contained in some closed disc D(ajj, ρj) with center ajj and radius ρj, for some index 1 ≤ j ≤ n. In particular,

eig(A) ⊆ ∪_{j=1}^{n} D(ajj, ρj).        (2.6)

DEFINITION 6.2.8 A matrix A on Cn is called diagonally dominant if rj ≤ |ajj| for all j = 1, 2, ..., n. Moreover, A on Cn is strictly diagonally dominant if rj < |ajj| for all j = 1, 2, ..., n.

If A is a strictly diagonally dominant matrix, then 0 ∉ D(ajj, rj) for all j. In this case, the Gershgorin circle theorem shows that 0 is not an eigenvalue for A, or equivalently, A is invertible. Therefore if A is a strictly diagonally dominant matrix, then A is invertible.

The null space of a matrix P mapping Cm into Cn is defined by {x ∈ Cm : Px = 0}. If A is a square matrix, then the null space of A equals zero if and only if 0 is not an eigenvalue for A, or equivalently, A is invertible. In other words, the null space of A on Cn equals zero if and only if the rank of A equals n, or equivalently, the columns of A are linearly independent, or equivalently, the rows of A are linearly independent. By definition A is a self adjoint matrix if A = A*. Recall that the eigenvalues for a self adjoint matrix are real; see Horn and Johnson [21].

COROLLARY 6.2.9 Let A be a self adjoint matrix on Cn whose diagonal entries are all positive. If A is a diagonally dominant matrix, then A is a positive matrix. In this case, eig(A) ⊆ [0, 2γ] where

γ = max{ajj : 1 ≤ j ≤ n}.

In particular, if A is a diagonally dominant matrix, then A is strictly positive if and only if the null space of A equals zero, or equivalently, A is invertible. Finally, if A is a strictly diagonally dominant matrix, then A is a strictly positive matrix. In this case, we have eig(A) ⊆ (0, 2γ).

Proof. If A is diagonally dominant, then the condition rj ≤ ajj implies that the closed disc

D(ajj, rj) ⊆ {z ∈ C : |z − ajj| ≤ ajj} = D(ajj, ajj).

The real part of the closed disc D(ajj, ajj) equals [0, 2ajj]. Because the eigenvalues for A are real, the Gershgorin circle theorem tells us that

eig(A) ⊆ ∪_{j=1}^{n} [0, 2ajj] = [0, 2γ].

In other words, all the eigenvalues for A are positive. Since A is self adjoint, A is a positive matrix.


6.2. POSITIVE MATRICES AND STABILITY 309

On the other hand, if A is strictly diagonally dominant, then the condition rj < ajj implies that the closed disc

D(ajj, rj) ⊆ {z ∈ C : |z − ajj| < ajj}.

Because the eigenvalues for A are real, the Gershgorin circle theorem tells us that

eig(A) ⊆ ⋃_{j=1}^{n} (0, 2ajj) = (0, 2γ).

In other words, all the eigenvalues for A are strictly positive. Since A is self adjoint, A is a strictly positive matrix. This completes the proof.

To complete this section let us present the following result, which is a consequence of Brauer's ovals theorem and is a generalization of part of Corollary 6.2.9.

PROPOSITION 6.2.10 Let A be a self adjoint diagonally dominant matrix on Cn whose diagonal entries are all strictly positive. If rj < ajj for all j except perhaps one j, then A is strictly positive.

Let us complete this section with some examples. The matrix

A = [1 −1; −1 1]

(written row by row in Matlab notation) is diagonally dominant. According to Corollary 6.2.9, this matrix is positive. The eigenvalues for A are {0, 2}. Therefore A is positive and singular. Finally, it is noted that in this case, γ = 1 and the bound [0, 2] on the eigenvalues for A presented in Corollary 6.2.9 is tight. However, in general these bounds are not tight.

The matrix

A = [25 −15 0; −15 40 −21; 0 −21 21]

is diagonally dominant. According to Corollary 6.2.9, this matrix is positive. In this case,

r1 = 15 < 25 = a11,  r2 = 36 < 40 = a22  and  r3 = 21 = a33.

So the hypothesis of Proposition 6.2.10 is satisfied. Thus A is a strictly positive matrix. According to Matlab, the eigenvalues for A are given by {3.7824, 23.7332, 58.4844}. Clearly, the bound [0, 80] on the eigenvalues for A in Corollary 6.2.9 is not tight. Finally, it is noted that A is not strictly diagonally dominant.
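These numbers are easy to reproduce. A quick numerical sketch (Python with numpy, assumed available, as an alternative to the Matlab computation used in these notes) that checks the Gershgorin bound [0, 2γ] of Corollary 6.2.9 against the computed eigenvalues:

```python
import numpy as np

# the diagonally dominant example matrix from the text
A = np.array([[ 25., -15.,   0.],
              [-15.,  40., -21.],
              [  0., -21.,  21.]])

gamma = A.diagonal().max()        # gamma = max a_jj = 40
eigs = np.linalg.eigvalsh(A)      # real eigenvalues (A is self adjoint)

# Corollary 6.2.9: eig(A) lies in [0, 2*gamma] = [0, 80]
assert eigs.min() > 0 and eigs.max() <= 2 * gamma
print(np.round(eigs, 4))          # roughly [ 3.7824 23.7332 58.4844]
```

The smallest eigenvalue is about 3.78 and the largest about 58.48, so the interval [0, 80] contains the spectrum but, as the text notes, the bound is far from tight.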

The self adjoint matrix

A = [13 −5 −6; −5 13 −8; −6 −8 14]

is diagonally dominant. According to Corollary 6.2.9, this matrix is positive. Moreover, the rank of A equals three. Therefore this matrix is strictly positive. The eigenvalues for A


are {0.6149, 17.6888, 21.6963}, which clearly shows that A is strictly positive. Finally, it is noted that A is not strictly diagonally dominant. The hypothesis of Proposition 6.2.10 is not satisfied. So one cannot use Proposition 6.2.10 to determine if A is strictly positive.

The matrix

A = [13 −5 −6.2; −5 13 −8; −6.2 −8 14]

is not diagonally dominant. In particular, the hypothesis of Corollary 6.2.9 is not satisfied. However, A is strictly positive. In fact, {0.4874, 17.7539, 21.7587} are the eigenvalues for A.

Assume that A is an irreducible self adjoint diagonally dominant matrix on Cn whose diagonal entries are all strictly positive. Then the Levy–Desplanques theorem tells us that A is strictly positive. However, the concept of irreducibility takes us beyond the scope of these notes. Finally, see the article on diagonally dominant matrices in Wikipedia.

6.2.2 The case when M > 0 and K > 0 and Φ ≥ 0

Consider a matrix P mapping Cm into Cn, that is, a matrix P with m columns and n rows. Recall that P is one to one if Px = 0 implies that x = 0. The matrix P is one to one if and only if the null space of P equals zero. In other words, P is one to one if and only if the rank of P equals m. Finally, P is one to one if and only if the range of P∗ equals Cm. For further results on linear algebra see Horn and Johnson [21].

The mass spring damper system Mq̈ + Φq̇ + Kq = 0 can be stable even when the hypotheses of Theorem 6.2.4 do not hold. The following result, taken from Theorem 2.2.3 in [3], is useful in applications.

THEOREM 6.2.11 Consider the mass spring damper system Mq̈ + Φq̇ + Kq = 0 where M and K are strictly positive matrices, while Φ is a positive matrix. Then Mq̈ + Φq̇ + Kq = 0 is stable if and only if the block matrix

[K − σM; Φ]  is one to one for all real numbers σ > 0.   (2.7)

In this case, all the roots of det[s²M + sΦ + K] are contained in the open left half plane.

If Φ is strictly positive, then the matrix in (2.7) is one to one. Hence the previous Theorem 6.2.11 is a mild generalization of Theorem 6.2.4.

For an application of Theorem 6.2.11, consider the mass spring damper system given by

[1 0; 0 1][q̈1; q̈2] + [2.5 0; 0 0][q̇1; q̇2] + [2 −1; −1 1][q1; q2] = [0; 0].   (2.8)

Here Mq̈ + Φq̇ + Kq = 0 where M = I, the identity matrix on C2,

Φ = [2.5 0; 0 0]  and  K = [2 −1; −1 1]  and  q = [q1; q2].


Clearly, M = I is strictly positive. Moreover, Φ and K are both self adjoint matrices. The eigenvalues for Φ are {0, 2.5}. So Φ is just positive and not strictly positive. The characteristic polynomial for K is

det[sI − K] = det[s−2 1; 1 s−1] = s² − 3s + 1.

By using the quadratic formula, the roots of s² − 3s + 1 are 3/2 ± √5/2. Hence the eigenvalues for K are 3/2 ± √5/2. (Recall that the roots of the characteristic polynomial are the eigenvalues of the corresponding matrix.) Because the eigenvalues for K are strictly positive, K is strictly positive. Since K11 = 2 > 0 and det[K] = 1, Sylvester's criterion also shows that K is strictly positive; see Remark 6.2.3. Hence the hypotheses of Theorem 6.2.11 are satisfied. Now observe that

[K − σM; Φ] = [2−σ −1; −1 1−σ; 2.5 0; 0 0]

is one to one for all σ > 0. In fact, the rank of the previous matrix equals 2 for all σ. By consulting Theorem 6.2.11, we see that the mass spring damper system in (2.8) is stable.

To directly verify that the mass spring damper system in (2.8) is stable, notice that

det[s²M + sΦ + K] = det[s² + 2.5s + 2, −1; −1, s² + 1]
                  = s⁴ + 2.5s³ + 3s² + 2.5s + 1
                  = (s² + 2s + 1)(s² + 0.5s + 1) = (s + 1)²(s² + 0.5s + 1).

In other words,

det[s²M + sΦ + K] = (s + 1)²(s² + 0.5s + 1).

Therefore all the roots of det[s²M + sΦ + K] are contained in the open left half plane. In fact, using the quadratic formula, the roots of det[s²M + sΦ + K] are given by

{ −1, −1, −1/4 ± (√15/4)i }.   (2.9)

Therefore the mass spring damper system in (2.8) is stable.

6.3 The mass spring damper and state space

In this section, we will present a state space model for our mass spring damper system. Consider the mass spring damper system

Mq̈ + Φq̇ + Kq = bu   (3.1)

where q(t) is a vector in Cν. Throughout we assume that the mass matrix M is invertible. By multiplying the matrix second order differential equation (3.1) by M⁻¹, we see that the equations of motion can also be written as

q̈ = −M⁻¹Kq − M⁻¹Φq̇ + M⁻¹bu.   (3.2)


To convert our mass spring damper system in (3.1) to state space, let x1 and x2 be the state variables defined by

x1 = q  and  x2 = ẋ1 = q̇.   (3.3)

Notice that x1 and x2 are both column vectors in Cν. By employing x2 = q̇ in equation (3.2), we obtain

ẋ2 = −M⁻¹Kq − M⁻¹Φq̇ + M⁻¹bu = −M⁻¹Kx1 − M⁻¹Φx2 + M⁻¹bu.

Since ẋ1 = Ix2 where I is the identity matrix on Cν, the equations of motion in (3.1) can be written in the following state space form:

[ẋ1; ẋ2] = [0 I; −M⁻¹K −M⁻¹Φ][x1; x2] + [0; M⁻¹b]u.   (3.4)

In other words, the mass spring damper system in (3.1) admits a state space realization of the form:

ẋ = Ax + Bu  where  x = [x1; x2] = [q; q̇],

A = [0 I; −M⁻¹K −M⁻¹Φ]  and  B = [0; M⁻¹b].   (3.5)

It is emphasized that x is a vector in C2ν while A is a 2ν × 2ν matrix and B is a column vector of length 2ν. Moreover, A is a block matrix, that is, A is a matrix whose entries consist of matrices. The 0 in the upper left hand corner of A is the ν × ν zero matrix, the identity matrix I in the upper right hand corner of A is the ν × ν identity matrix, while both M⁻¹K and M⁻¹Φ are ν × ν matrices. The 0 in B is the zero column vector of length ν while M⁻¹b is a column vector of length ν. Finally, the solution to the state space system in (3.5) is given by

x(t) = e^{At}x(0) + ∫₀ᵗ e^{A(t−τ)}Bu(τ)dτ  where  x(0) = [q(0); q̇(0)],

[q(t); q̇(t)] = e^{At}[q(0); q̇(0)] + ∫₀ᵗ e^{A(t−τ)}Bu(τ)dτ.   (3.6)

A state space realization for the mass spring damper system Mq̈ + Φq̇ + Kq = bu with input u and output y = q is given by

[ẋ1; ẋ2] = [0 I; −M⁻¹K −M⁻¹Φ][x1; x2] + [0; M⁻¹b]u  where  [x1(t); x2(t)] = [q(t); q̇(t)],

q = [I 0][x1; x2].   (3.7)

The matrix [I 0] mapping C2ν into Cν is a block row matrix where I is the identity on Cν and 0 is the ν × ν zero matrix on Cν. Furthermore, the output q(t) is a vector in Cν. By consulting (1.4), we see that the transfer function from u to q is given by

Q(s)/U(s) = (s²M + sΦ + K)⁻¹b = [I 0](sI − A)⁻¹B   (3.8)


where A and B are defined in (3.5). Here we used the fact that the transfer function G from u to y for any state space system

ẋ = Ax + Bu  and  y = Cx + Du   (3.9)

is given by

G(s) = Y(s)/U(s) = C(sI − A)⁻¹B + D.   (3.10)

The state space realization for the mass spring damper system Mq̈ + Φq̇ + Kq = bu with input u and output y = q̇ is given by

[ẋ1; ẋ2] = [0 I; −M⁻¹K −M⁻¹Φ][x1; x2] + [0; M⁻¹b]u  where  [x1(t); x2(t)] = [q(t); q̇(t)],

q̇ = [0 I][x1; x2].   (3.11)

The matrix [0 I] mapping C2ν into Cν is a block row matrix where I is the identity on Cν and 0 is the ν × ν zero matrix on Cν. Furthermore, the output q̇(t) is a vector in Cν. By consulting (1.5), we see that the transfer function from u to q̇ is given by

(Lq̇)(s)/U(s) = s(s²M + sΦ + K)⁻¹b = [0 I](sI − A)⁻¹B   (3.12)

where A and B are defined in (3.5); see also (3.9) and (3.10).

To obtain a state space realization with input u and output y = q̈, notice that the equations in (3.11) yield

q̈ = [0 I][ẋ1; ẋ2] = [0 I]([0 I; −M⁻¹K −M⁻¹Φ][x1; x2] + [0; M⁻¹b]u)
  = [−M⁻¹K −M⁻¹Φ][x1; x2] + M⁻¹bu.

Therefore the state space realization for the mass spring damper system Mq̈ + Φq̇ + Kq = bu with input u and output y = q̈ is given by

[ẋ1; ẋ2] = [0 I; −M⁻¹K −M⁻¹Φ][x1; x2] + [0; M⁻¹b]u  where  [x1(t); x2(t)] = [q(t); q̇(t)],

q̈ = −[M⁻¹K M⁻¹Φ][x1; x2] + M⁻¹bu.   (3.13)

By consulting (1.5), we see that the transfer function from u to q̈ is given by

(Lq̈)(s)/U(s) = s²(s²M + sΦ + K)⁻¹b = −[M⁻¹K M⁻¹Φ](sI − A)⁻¹B + M⁻¹b   (3.14)

where A and B are defined in (3.5); see also (3.9) and (3.10).


The force corresponding to the mass matrix is Mq̈. So a state space model for the mass spring damper system Mq̈ + Φq̇ + Kq = bu with input u and output y = Mq̈ is given by

[ẋ1; ẋ2] = [0 I; −M⁻¹K −M⁻¹Φ][x1; x2] + [0; M⁻¹b]u  where  [x1(t); x2(t)] = [q(t); q̇(t)],

Mq̈ = −[K Φ][x1; x2] + bu.   (3.15)

The transfer function from the force u to the force Mq̈ on the mass matrix M is given by

(LMq̈)(s)/U(s) = s²M(s²M + sΦ + K)⁻¹b = −[K Φ](sI − A)⁻¹B + b.   (3.16)

Let us set C2 = −[K Φ]. If u(t) = δ(t), the Dirac delta function, then the force corresponding to Mq̈ is given by

Mq̈(t) = C2 e^{At}x(0) + C2 ∫₀ᵗ e^{A(t−τ)}Bδ(τ)dτ + bδ(t) = C2 e^{At}(x(0) + B) + bδ(t)

where C2 = −[K Φ].   (3.17)

(The second equality follows from the fact that h(t) = ∫₀ᵗ h(t − τ)δ(τ)dτ. To see this simply take the Laplace transform.) Equation (3.17) shows that if one hits the mass spring damper system with a hammer or Dirac delta function, then the force of δ(t) is felt by the components in the b vector.

REMARK 6.3.1 Assume that M is invertible and A is the block matrix in (3.5). Then the characteristic polynomial for A is given by

det[sI − A] = det[s²I + sM⁻¹Φ + M⁻¹K] = det[s²M + sΦ + K]/det[M].   (3.18)

Hence det[sI − A] and det[s²M + sΦ + K] have the same roots including their multiplicity. Moreover, the eigenvalues for A are the roots of det[s²M + sΦ + K]. So the eigenvalues for A are precisely the poles of (s²M + sΦ + K)⁻¹; see Remark 6.1.2. Therefore A is a stable matrix if and only if the mass spring damper system Mq̈ + Φq̇ + Kq = 0 is stable. In particular, if M, Φ and K are all strictly positive matrices, then ẋ = Ax is stable; see Theorem 6.2.4.

To prove that (3.18) holds, recall that if α, β, γ and ρ are square matrices with ρ invertible, then the Schur complement shows that

det[α β; γ ρ] = det[ρ] det[α − βρ⁻¹γ].   (3.19)


Recall that det[PR] = det[P] det[R] where P and R are square matrices of the same size. Using this with (3.19), we obtain

det[sI − A] = det[sI −I; M⁻¹K sI + M⁻¹Φ]
            = det[sI + M⁻¹Φ] det[sI + (sI + M⁻¹Φ)⁻¹M⁻¹K]
            = det[s(sI + M⁻¹Φ) + M⁻¹K]
            = det[s²I + sM⁻¹Φ + M⁻¹K]
            = det[M⁻¹] det[s²M + sΦ + K]
            = det[s²M + sΦ + K]/det[M].

Therefore (3.18) holds.

Assume that

A = [0 I; −M⁻¹K −M⁻¹Φ]

where M, Φ and K are strictly positive matrices on Cν. Then Remark 6.3.1 shows that A is stable. Let us directly prove this result. Assume that λ is an eigenvalue for A with eigenvector ξ, that is,

Aξ = [0 I; −M⁻¹K −M⁻¹Φ][ξ1; ξ2] = λ[ξ1; ξ2]  where  ξ = [ξ1; ξ2].   (3.20)

Here ξ1 and ξ2 are both vectors in Cν. This readily implies that

ξ2 = λξ1  and  −M⁻¹Kξ1 − M⁻¹Φξ2 = λξ2.

Using ξ2 = λξ1 in the second equation and multiplying by M yields

λ²Mξ1 + λΦξ1 + Kξ1 = 0.   (3.21)

It is noted that ξ1 ≠ 0. If ξ1 = 0, then ξ2 = λξ1 implies that ξ2 = 0. So the eigenvector ξ = 0. This contradicts the fact that an eigenvector is nonzero. Therefore ξ1 ≠ 0. Multiplying (3.21) by ξ1∗ on the left implies that λ is a root of the following quadratic equation

λ²ξ1∗Mξ1 + λξ1∗Φξ1 + ξ1∗Kξ1 = 0.   (3.22)

Because M, Φ and K are all strictly positive and ξ1 is nonzero, ξ1∗Mξ1 > 0, ξ1∗Φξ1 > 0 and ξ1∗Kξ1 > 0. Recall that a quadratic polynomial aλ² + bλ + c = 0 where a > 0 and b and c are real has all its roots in the open left half plane if and only if b > 0 and c > 0. (This follows from the quadratic formula.) By consulting (3.22) we see that ℜ(λ) < 0. Therefore the matrix A is stable.

Assume that Mq̈ + Φq̇ + Kq = 0 is a stable mass spring damper system. In this case, (s²M + sΦ + K)⁻¹ is a stable transfer function, the matrix K is invertible, and the block


matrix A in (3.5) is stable; see Remarks 6.1.2 and 6.3.1. Because A is stable, 0 is not an eigenvalue for A, and thus, A is invertible. Moreover, the inverse of A is given by

A⁻¹ = [−K⁻¹Φ −K⁻¹M; I 0].   (3.23)

To verify that this is indeed the inverse of A, simply show that A⁻¹A = I. Recall that the mass spring damper system Mq̈ + Φq̇ + Kq = bu admits a state space realization of the form:

ẋ = Ax + Bu  where  x = [x1; x2] = [q; q̇],

A = [0 I; −M⁻¹K −M⁻¹Φ]  and  B = [0; M⁻¹b].   (3.24)

Assume that the input u(t) = u0, a constant, for all t. By consulting (5.12) in Chapter 5, we see that

[q(t); q̇(t)] = x(t) = e^{At}(x(0) + A⁻¹Bu0) − A⁻¹Bu0
             = e^{At}([q(0); q̇(0)] − [K⁻¹bu0; 0]) + [K⁻¹bu0; 0].   (3.25)

Because A is stable, by matching components we obtain

K⁻¹bu0 = lim_{t→∞} q(t)  and  0 = lim_{t→∞} q̇(t).

This yields the first two equations in (1.7) of Proposition 6.1.3. To obtain the third equation in (1.7), notice that the equation for q̇ in (3.25) shows that

q̇ = [0 I]e^{At}([q(0); q̇(0)] − [K⁻¹bu0; 0])  and  q̈ = [0 I]Ae^{At}([q(0); q̇(0)] − [K⁻¹bu0; 0]) → 0

as t tends to infinity. Therefore equation (1.7) holds.


6.4 A mass spring damper example

[Figure: three masses m1, m2, m3 with positions q1, q2, q3, springs k1, k2, k3, k4, dampers c1, c2, c3, c4, and the input force u applied to m2]

Figure 6.1: A mass spring damper system

Consider the mass spring damper system in Figure 6.1, with the forcing or input function u applied to the second mass m2. Let qj be the position of the mass mj for j = 1, 2, 3. Since there are three masses, the equations of motion are described by three coupled second order linear differential equations. The equations of motion are given by

m1q̈1 + c1q̇1 + c2(q̇1 − q̇2) + c4(q̇1 − q̇3) + k1q1 + k2(q1 − q2) = 0
m2q̈2 + c2(q̇2 − q̇1) + c3(q̇2 − q̇3) + k4q2 + k2(q2 − q1) + k3(q2 − q3) = u   (4.1)
m3q̈3 + c4(q̇3 − q̇1) + c3(q̇3 − q̇2) + k3(q3 − q2) = 0.

In other words, this system admits a second order matrix differential equation of the form Mq̈ + Φq̇ + Kq = bu where

M = [m1 0 0; 0 m2 0; 0 0 m3]  and  Φ = [c1+c2+c4 −c2 −c4; −c2 c2+c3 −c3; −c4 −c3 c3+c4],

K = [k1+k2 −k2 0; −k2 k2+k3+k4 −k3; 0 −k3 k3]  and  q = [q1; q2; q3]  and  b = [0; 1; 0].   (4.2)

Throughout we assume that mj > 0 for j = 1, 2, 3, while cn > 0 and kn > 0 for n = 1, 2, 3, 4. Clearly, the mass matrix M is strictly positive. Moreover, Φ and K are both


diagonally dominant self adjoint matrices, with positive entries on the diagonal. According to Corollary 6.2.9 of the Gershgorin circle theorem, the matrices Φ and K are both positive matrices. We claim that both Φ and K are strictly positive.

Because M, Φ and K are all strictly positive matrices, our mass spring damper system Mq̈ + Φq̇ + Kq = bu determined by (4.2) is stable. In particular, the roots of det[Ms² + Φs + K] are all contained in the open left half plane; see Theorem 6.2.4.

Because Φ is a positive matrix, to prove that Φ is strictly positive, it is sufficient to show that Φ is invertible. By using Gaussian elimination or row reduction, we have

Φ = [c1+c2+c4 −c2 −c4; −c2 c2+c3 −c3; −c4 −c3 c3+c4]
  → [c1+c4 c3 −c3−c4; −c2 c2+c3 −c3; −c4 −c3 c3+c4]
  → [c1 0 0; −c2 c2+c3 −c3; −c4 −c3 c3+c4]
  → [c1 0 0; 0 c2+c3 −c3; 0 −c3 c3+c4].

Here we added the second row to the first row, and then added the last row to the first row. The last matrix is obtained by using the first row of the third matrix to eliminate the first component of the second and third rows. The 2 × 2 matrix in the lower right hand corner of the last matrix is invertible; its determinant equals (c2+c3)(c3+c4) − c3² = c2c3 + c2c4 + c3c4 > 0. Because c1 is strictly positive, the last matrix is invertible, and thus, Φ is invertible. Therefore Φ is strictly positive.

Because K is a positive matrix, to prove that K is strictly positive, it is sufficient to show that K is invertible. By using Gaussian elimination or row reduction, we have

K = [k1+k2 −k2 0; −k2 k2+k3+k4 −k3; 0 −k3 k3]
  → [k1+k2 −k2 0; −k2 k2+k4 0; 0 −k3 k3]
  → [k1+k2 −k2 0; k1 k4 0; 0 −k3 k3]
  → [k1+k2+k1k2/k4 0 0; k1 k4 0; 0 −k3 k3].

Here we added the last row to the second row, and then added the first row to the second row. Then we multiplied the second row by k2/k4 and added it to the first row. Because the last matrix is lower triangular with nonzero entries on the diagonal, this matrix is nonsingular. Hence K is invertible, and thus, K is a strictly positive matrix. Finally, it is noted that Proposition 6.2.10 also shows that K is strictly positive.

Now let us assume that

m1 = 7, m2 = 3, m3 = 2,
k1 = 10, k2 = 15, k3 = 21, k4 = 4,   (4.3)
c1 = 2, c2 = 5, c3 = 8, c4 = 6.


In this case, the equations of motion are given by

7q̈1 + 2q̇1 + 5(q̇1 − q̇2) + 6(q̇1 − q̇3) + 10q1 + 15(q1 − q2) = 0
3q̈2 + 5(q̇2 − q̇1) + 8(q̇2 − q̇3) + 4q2 + 15(q2 − q1) + 21(q2 − q3) = u   (4.4)
2q̈3 + 6(q̇3 − q̇1) + 8(q̇3 − q̇2) + 21(q3 − q2) = 0.

The equations of motion in (4.4) can be rewritten as a second order matrix differential equation, that is,

[7 0 0; 0 3 0; 0 0 2][q̈1; q̈2; q̈3] + [13 −5 −6; −5 13 −8; −6 −8 14][q̇1; q̇2; q̇3] + [25 −15 0; −15 40 −21; 0 −21 21][q1; q2; q3] = [0; 1; 0]u.   (4.5)

So the mass spring damper system in Figure 6.1 admits a second order matrix differential equation of the form:

Mq̈ + Φq̇ + Kq = bu.   (4.6)

Here q is a column vector of length three, while M, Φ, and K are the 3 × 3 matrices, and b is the column vector of length three given by

M = [7 0 0; 0 3 0; 0 0 2],  Φ = [13 −5 −6; −5 13 −8; −6 −8 14]  and  K = [25 −15 0; −15 40 −21; 0 −21 21],

q = [q1; q2; q3]  and  b = [0; 1; 0].   (4.7)

Recall that our M, Φ and K are all strictly positive matrices. Moreover, a matrix R is strictly positive if and only if R is self adjoint and all of its eigenvalues are strictly positive; see Theorem 6.2.2. Clearly, M, Φ and K are all self adjoint matrices. Because M is a diagonal matrix, the eigenvalues for M are {7, 3, 2}. Since M is self adjoint, M is strictly positive. Using Matlab to compute the corresponding eigenvalues for Φ and K we obtained:

• The eigenvalues for Φ are {0.6149, 17.6888, 21.6963}. Since Φ is self adjoint, Φ is strictly positive.

• The eigenvalues for K are {3.7824, 23.7332, 58.4844}. Since K is self adjoint, K is strictly positive.

As expected, M , Φ and K are all strictly positive matrices. According to Theorem 6.2.4,the mass spring damper system in (4.5) is stable; see also Remark 6.3.1.
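These eigenvalue computations are easy to reproduce. A sketch in Python with numpy (assumed available), as an alternative to the Matlab computation used in the notes:

```python
import numpy as np

# Phi and K from (4.7)
Phi = np.array([[13., -5., -6.], [-5., 13., -8.], [-6., -8., 14.]])
K   = np.array([[25., -15., 0.], [-15., 40., -21.], [0., -21., 21.]])

eig_Phi = np.linalg.eigvalsh(Phi)   # sorted ascending; Phi is self adjoint
eig_K   = np.linalg.eigvalsh(K)

# both matrices are strictly positive: all eigenvalues strictly positive
assert eig_Phi.min() > 0 and eig_K.min() > 0
print(np.round(eig_Phi, 4))   # roughly [ 0.6149 17.6888 21.6963]
print(np.round(eig_K, 4))     # roughly [ 3.7824 23.7332 58.4844]
```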

Now let us convert the equations of motion in (4.5) and (4.6) to state space. In this setting, the state space model in (3.4) with (4.7) yields the state space equation

ẋ = Ax + Bu   (4.8)


where the state x is a vector in C6 while A is the 6 × 6 matrix and B is the column vector of length 6 defined by

A = [0 I; −M⁻¹K −M⁻¹Φ] =

[     0      0      0      1      0      0
      0      0      0      0      1      0
      0      0      0      0      0      1
  −25/7   15/7      0  −13/7    5/7    6/7
      5  −40/3      7    5/3  −13/3    8/3
      0   21/2  −21/2      3      4     −7 ]

B = [0; M⁻¹b] = [0; 0; 0; 0; 1/3; 0]  and  x = [x1; x2] = [q; q̇] = [q1; q2; q3; q̇1; q̇2; q̇3].   (4.9)

The Matlab commands we used to compute A and B are given by

A = [zeros(3,3), eye(3); -inv(M)*K, -inv(M)*Phi]
B = [0; 0; 0; inv(M)*b]

where Phi is the Matlab variable holding Φ.

Because all the matrices M, Φ and K are strictly positive, A is a stable matrix; see Remark 6.3.1. In fact, using Matlab the eigenvalues for A are given by

eig(A) = {−1.4177, −6.6653, −2.4656 ± 2.3530i, −0.0881 ± 1.0635i}.   (4.10)

As expected, the state space matrix A in (4.9) is stable, that is, all the eigenvalues for A live in the open left half plane {λ ∈ C : ℜ(λ) < 0}. Remark 6.3.1 also shows that the eigenvalues for A are the poles of (s²M + sΦ + K)⁻¹. Finally, since some of the eigenvalues have nonzero imaginary part, we expect the step and impulse responses to oscillate at the two frequencies 2.3530 and 1.0635.

Let us assume that the input u(t) = 2. According to (1.7) in Proposition 6.1.3 or (2.1) in Corollary 6.2.5, we have

lim_{t→∞} q(t) = 2K⁻¹b = [3/25; 1/5; 1/5].   (4.11)

Moreover, both q̇(t) → 0 and q̈(t) → 0 as t tends to infinity when u(t) = 2. Finally, the first


equation in (5.5) of Corollary 5.5.3 also shows that

lim_{t→∞} x(t) = lim_{t→∞} [q(t); q̇(t)] = −2A⁻¹B = [3/25; 1/5; 1/5; 0; 0; 0].   (4.12)

This also yields the expression for q(∞) in (4.11) and q̇(∞) = 0.

Figure 6.2 displays the graph of q(t) with the input force u(t) = 2. The graph at the top plots q1(t), while the middle plots q2(t) and the bottom graphs q3(t). As expected, q1(∞) = 3/25 while q2(∞) = 1/5 and q3(∞) = 1/5. Because the eigenvalues of A contain nonzero complex entries −0.0881 ± 1.0635i and −2.4656 ± 2.3530i, the step response q(t) oscillates with frequencies 1.0635 and 2.3530 to its final value q(∞). Finally, since the largest real part of the eigenvalues {λj}₁⁶ for A equals −0.0881, the solution for q(t) will contain terms of the form:

αe^{−0.0881t}e^{1.0635it} + βe^{−0.0881t}e^{−1.0635it}

where α and β are constant vectors. This means that q(t) will be approximately equal to q(∞) when e^{−0.0881t} becomes small, around 50 seconds; see Remark 5.5.4 in Chapter 5. The Matlab commands we used to plot the step response in Figure 6.2 are given by

• step(A,2*B,[eye(3),zeros(3,3)],zeros(3,1));grid

• title(’The step response for q(t) with u=2’)

The 2 is in front of B to generate the input u(t) = 2. The term [eye(3), zeros(3,3)] picks outq(t) from x(t).

Figure 6.3 displays the graph of q̇(t) with the input force u(t) = 2. The graph at the top plots q̇1(t), while the middle plots q̇2(t) and the bottom graphs q̇3(t). As expected, q̇(∞) = 0. Because the eigenvalues of A contain nonzero complex entries −0.0881 ± 1.0635i and −2.4656 ± 2.3530i, the step response q̇(t) oscillates with frequencies 1.0635 and 2.3530 to its final value q̇(∞) = 0. Finally, since the largest real part of the eigenvalues {λj}₁⁶ for A equals −0.0881, the solution for q̇(t) will contain terms of the form

αe^{−0.0881t}e^{1.0635it} + βe^{−0.0881t}e^{−1.0635it}

which means that q̇(t) will be approximately equal to q̇(∞) = 0 when e^{−0.0881t} becomes small, around 50 seconds; see Remark 5.5.4 in Chapter 5. The Matlab commands we used to plot the step response in Figure 6.3 are given by

• step(A,2*B,[zeros(3,3),eye(3)],zeros(3,1));grid


[Figure: three stacked step-response plots of q1, q2, q3 versus time (seconds); panel title: "The step response for q(t) with u=2"]

Figure 6.2: The graph of q with u(t) = 2

• title(’The step response for the dot q(t) with u=2’)

The 2 is in front of B to generate the input u(t) = 2. The term [zeros(3,3), eye(3)] picks out q̇(t) from x(t).

One can also compute the transfer function Gj(s) from u to qj(t) for j = 1, 2, 3, that is,

G1(s) = Q1/U = (0.2381s³ + 3.524s² + 10.5s + 7.5)/(s⁶ + 13.19s⁵ + 64.36s⁴ + 166s³ + 203.9s² + 179.3s + 125)

G2(s) = Q2/U = (0.3333s⁴ + 2.952s³ + 8.167s² + 14.83s + 12.5)/(s⁶ + 13.19s⁵ + 64.36s⁴ + 166s³ + 203.9s² + 179.3s + 125)   (4.13)

G3(s) = Q3/U = (1.333s³ + 6.69s² + 13.4s + 12.5)/(s⁶ + 13.19s⁵ + 64.36s⁴ + 166s³ + 203.9s² + 179.3s + 125).

It is emphasized that the denominators of these three transfer functions are all equal to det[sI − A], which is the characteristic polynomial for A. In other words, the poles of Gj(s) are precisely the eigenvalues for A. The state space formula for the transfer function Q/U


[Figure: three stacked step-response plots of q̇1, q̇2, q̇3 versus time (seconds); panel title: "The step response for the dot q(t) with u=2"]

Figure 6.3: The graph of q̇ with u(t) = 2

is given in (3.8). Motivated by this state space formula, we used the following Matlab commands to compute the transfer functions Gj(s) in (4.13):

• [num, den] = ss2tf(A,B, [eye(3), zeros(3, 3)], zeros(3, 1))

• G1 = minreal(tf(num(1, :), den))

• G2 = minreal(tf(num(2, :), den))

• G3 = minreal(tf(num(3, :), den))

Finally, it is noted that num is a 3 × 7 matrix whose j-th row contains the numerator of Gj(s) for j = 1, 2, 3.

Recall that a transfer function G(s) is called stable if all the poles of G are contained inthe open left half plane.

Now let us use the final value Theorem 6.1.4 to obtain the value of q(∞) in (4.11) when u(t) = 2. The transfer functions Gj(s) from u to qj(t) for j = 1, 2, 3 are given in (4.13).


According to the final value theorem with u(t) = 2, we have

lim_{t→∞} q1(t) = 2G1(0) = 3/25,  lim_{t→∞} q2(t) = 2G2(0) = 1/5  and  lim_{t→∞} q3(t) = 2G3(0) = 1/5.

This agrees with the limit q(∞) = lim_{t→∞} q(t) we obtained in equation (4.11).

Since differentiation in the time domain corresponds to multiplication by s in the Laplace domain, we see that the transfer functions from u to q̇j for j = 1, 2, 3 are given by

(Lq̇1)(s)/U = sG1(s) = (0.2381s⁴ + 3.524s³ + 10.5s² + 7.5s)/(s⁶ + 13.19s⁵ + 64.36s⁴ + 166s³ + 203.9s² + 179.3s + 125)

(Lq̇2)(s)/U = sG2(s) = (0.3333s⁵ + 2.952s⁴ + 8.167s³ + 14.83s² + 12.5s)/(s⁶ + 13.19s⁵ + 64.36s⁴ + 166s³ + 203.9s² + 179.3s + 125)   (4.14)

(Lq̇3)(s)/U = sG3(s) = (1.333s⁴ + 6.69s³ + 13.4s² + 12.5s)/(s⁶ + 13.19s⁵ + 64.36s⁴ + 166s³ + 203.9s² + 179.3s + 125).

Since ẋ = Ax + Bu and q̇ = [0 I]x, we can also obtain these transfer functions by employing the state space formula for (Lq̇)/U in (3.12) with the following Matlab commands:

• [num, den] = ss2tf(A, B, [zeros(3,3), eye(3)], zeros(3,1))

• F1 = minreal(tf(num(1,:), den))

• F2 = minreal(tf(num(2,:), den))

• F3 = minreal(tf(num(3,:), den))

Finally, since sGj(s)|s=0 = 0, the final value Theorem 6.1.4 tells us that q̇(∞) = 0 for all constant input functions u(t) = u0. This confirms the results in (4.12) and the second equation in (1.7) of Proposition 6.1.3 or (2.1) of Corollary 6.2.5.

Since differentiation in the time domain corresponds to multiplication by s in the Laplace domain, we see that the transfer functions from u to q̈j for j = 1, 2, 3 are given by

(Lq̈1)(s)/U = s²G1(s) = (0.2381s⁵ + 3.524s⁴ + 10.5s³ + 7.5s²)/(s⁶ + 13.19s⁵ + 64.36s⁴ + 166s³ + 203.9s² + 179.3s + 125)

(Lq̈2)(s)/U = s²G2(s) = (0.3333s⁶ + 2.952s⁵ + 8.167s⁴ + 14.83s³ + 12.5s²)/(s⁶ + 13.19s⁵ + 64.36s⁴ + 166s³ + 203.9s² + 179.3s + 125)   (4.15)

(Lq̈3)(s)/U = s²G3(s) = (1.333s⁵ + 6.69s⁴ + 13.4s³ + 12.5s²)/(s⁶ + 13.19s⁵ + 64.36s⁴ + 166s³ + 203.9s² + 179.3s + 125).

According to (3.13), we have

ẋ = Ax + Bu  and  q̈ = −[M⁻¹K M⁻¹Φ][x1; x2] + M⁻¹bu.   (4.16)

So we can also obtain these transfer functions using the following Matlab commands:

• [num, den] = ss2tf(A, B, -[inv(M)*K, inv(M)*Phi], inv(M)*b)


• H1 = minreal(tf(num(1, :), den))

• H2 = minreal(tf(num(2, :), den))

• H3 = minreal(tf(num(3, :), den))

Since s²Gj(s)|s=0 = 0, the final value Theorem 6.1.4 tells us that q̈(∞) = 0 for all constant input functions u(t) = u0. This confirms the third equation in (1.7) of Proposition 6.1.3 or (2.1) of Corollary 6.2.5.

Let us set C2 = −[K Φ]. If the input u(t) = δ(t), the Dirac delta function, then (3.17) shows that

Mq̈(t) = C2 e^{At}x(0) + C2 ∫₀ᵗ e^{A(t−τ)}Bδ(τ)dτ + bδ(t) = C2 e^{At}(x(0) + B) + [0; 1; 0]δ(t).

Notice that 1 appears in the second component of b = [0 1 0]∗ and all the other components of b equal zero. Therefore the force of the Dirac delta function δ(t) is felt on the second mass m2 = 3. In other words, if one hits the second mass with a hammer, then the full force of the impact of the hammer is felt on the second mass m2, while the first mass m1 and the third mass m3 do not feel the impact of the hammer. The masses m1 and m3 move; however, there is no δ(t) force on m1 and m3.

Figure 6.4: A mass spring damper system (three masses m1, m2, m3 connected by springs k1, k2, k3, k4 and dampers c1, c2, with displacements q1, q2, q3 and an input force u applied to the second mass).


326 CHAPTER 6. MASS SPRING DAMPER SYSTEMS

6.4.1 Exercise

Problem 1. Consider the mass spring damper system in Figure 6.4. Assume that the masses m1, m2, m3, the damping constants c1, c2, and the spring constants k1, k2, k3, k4 are all strictly positive.

(1) Derive the equations of motion for the mass spring damper system in Figure 6.4. Write these equations of motion in matrix second order form, that is,

Mq̈ + Φq̇ + Kq = bu where q = [q1 ; q2 ; q3].

(2) Show that M and K are strictly positive, while Φ is just positive. So the hypothesis of Theorem 6.2.4 does not hold. Use Theorem 6.2.11 to show that the mass spring damper system in Figure 6.4 is stable.

(3) For the remaining part of this problem assume that

m1 = 1, m2 = 2, m3 = 3, c1 = 5, c2 = 1

k1 = 3, k2 = 1, k3 = 2, k4 = 2. (4.17)

Using the state variables x1 = q and x2 = ẋ1 = q̇, that is,

x = [x1 ; x2] = [q ; q̇],

find a state space realization

ẋ = Ax + Bu

for this mass spring damper system. Part (2) guarantees that A is stable, that is, all the eigenvalues of A are contained in the open left half plane, and this mass spring damper system is stable; see Remark 6.3.1.

(4) Use Matlab to find the characteristic polynomial det[sI − A] for A. Compute the eigenvalues of A. Check to see if A is stable, and thus whether the corresponding mass spring damper system is stable.

(5) Assume that u(t) = 3. Then compute q(∞) = lim_{t→∞} q(t). Does q(t) oscillate to q(∞)? Explain why or why not. Explain the speed at which q(t) converges to q(∞). Plot the step response with u(t) = 3 for qj for j = 1, 2, 3 on the same graph.

(6) Find the transfer functions Gj(s) from u to qj for j = 1, 2, 3.

(7) Find the transfer function from u to the force Mq̈. Assume that u(t) = δ(t); then Mq̈(t) is of the form Mq̈(t) = αδ(t) + ψ(t), where α is a vector in C3 and ψ is a linear combination of stable exponential functions. Find α. What does this say concerning the effect of the Dirac delta function on each mass?


Problem 2. Draw a mass spring damper system whose equations of motion are given by

[2 0 ; 0 3][q̈1 ; q̈2] + [6 −4 ; −4 5][q̇1 ; q̇2] + [7 −5 ; −5 5][q1 ; q2] = [1 ; 0]u.

Is this mass spring damper system stable? Explain why or why not.

Problem 3. Draw a mass spring damper system whose equations of motion are given by the stable mass spring damper system in (2.8), that is,

[1 0 ; 0 1][q̈1 ; q̈2] + [2.5 0 ; 0 0][q̇1 ; q̇2] + [2 −1 ; −1 1][q1 ; q2] = [0 ; 0].

Is this mass spring damper system stable? Explain why or why not.

6.5 The mass spring system

The material in the remaining part of this chapter is not used in the rest of the notes and may be skipped by the uninterested reader. In this section, we will study mass spring systems and their underlying unitary operator e^{At}.

Figure 6.5: A simple mass spring system (a single mass m attached to a spring k, with displacement q).

For some insight let us begin with the scalar case. Consider the simple mass spring system presented in Figure 6.5 with mass m and spring constant k. The position of the mass is q. The equation of motion is given by

mq̈ + kq = 0. (5.1)

By taking the Laplace transform of both sides, we obtain

ms²Q(s) + kQ(s) = msq(0) + mq̇(0).

In other words,

Q(s) = (msq(0) + mq̇(0))/(ms² + k) = (sq(0) + q̇(0))/(s² + k/m) = (sq(0) + q̇(0))/(s² + ω²).


Here ω = √(k/m). By taking the inverse Laplace transform, we see that

q(t) = q(0) cos(ωt) + (q̇(0)/ω) sin(ωt)
q̇(t) = −q(0)ω sin(ωt) + q̇(0) cos(ωt).

The second equation follows by taking the derivative of q. By rewriting this in matrix form, we obtain

[q(t) ; q̇(t)] = [cos(ωt) (1/ω) sin(ωt) ; −ω sin(ωt) cos(ωt)] [q(0) ; q̇(0)]. (5.2)
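The transition matrix in (5.2) is exactly the matrix exponential of the companion matrix [0 1 ; −k/m 0]. A minimal Python/SciPy sketch (the values of m and k are assumed for illustration):

```python
import numpy as np
from scipy.linalg import expm

# Assumed sample values: m = 2, k = 8, so omega = sqrt(k/m) = 2.
m, k = 2.0, 8.0
omega = np.sqrt(k / m)
t = 0.7

# mq'' + kq = 0 in first order form: d/dt [q; q'] = [[0, 1], [-k/m, 0]] [q; q'].
A = np.array([[0.0, 1.0], [-k / m, 0.0]])

# Closed form transition matrix from (5.2).
Phi = np.array([[np.cos(omega * t), np.sin(omega * t) / omega],
                [-omega * np.sin(omega * t), np.cos(omega * t)]])

print(np.allclose(expm(A * t), Phi))  # True
```

The check passes for any t, confirming that (5.2) is the solution operator of the scalar mass spring system.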

Because q(t) is a function of sinusoids with frequency √(k/m), we call ω = √(k/m) the natural frequency for the mass spring system mq̈ + kq = 0.

The kinetic energy is given by T = (1/2)mq̇², while the potential energy of the spring is V = (1/2)kq². The total energy H = T + V is the sum of the kinetic energy plus the potential energy, that is,

H(q, q̇) = (1/2)mq̇² + (1/2)kq². (5.3)

The mass spring system mq̈ + kq = 0 is a conservative system, that is, H equals a constant, or equivalently, Ḣ = 0. To see this simply observe that

Ḣ = mq̇q̈ + kqq̇ = (mq̈ + kq)q̇ = 0.

Therefore H equals a constant and mq̈ + kq = 0 is a conservative system. Now let us convert mq̈ + kq = 0 to a state space system. Consider the state variables defined by

x1 = √k q and x2 = √m q̇. (5.4)

In matrix form the state variables x1 and x2 are given by

[x1 ; x2] = [√k 0 ; 0 √m][q ; q̇], equivalently [x1 ; x2] = P[q ; q̇] where P = [√k 0 ; 0 √m]. (5.5)

By taking the inverse of P, we have

[q ; q̇] = P⁻¹[x1 ; x2] where P⁻¹ = [1/√k 0 ; 0 1/√m]. (5.6)

Recall that the natural frequency of the system mq̈ + kq = 0 is ω = √(k/m). Using mq̈ = −kq, we obtain

ẋ1 = √k q̇ = √(k/m) √m q̇ = ωx2
ẋ2 = √m q̈ = −(k/√m) q = −√(k/m) √k q = −ωx1.


Hence ẋ1 = ωx2 and ẋ2 = −ωx1. This leads to the following state space system:

[ẋ1 ; ẋ2] = [0 ω ; −ω 0][x1 ; x2] where [x1 ; x2] = [√k q ; √m q̇]. (5.7)

In state space notation

ẋ = Ax where A = [0 ω ; −ω 0] and x = [x1 ; x2]. (5.8)

It is noted that A is skew symmetric, that is, A + A* = 0. Moreover,

e^{At} = [cos(ωt) sin(ωt) ; −sin(ωt) cos(ωt)] when A = [0 ω ; −ω 0]. (5.9)

Using x(t) = e^{At}x(0) with (5.5), we see that

[q(t) ; q̇(t)] = P⁻¹x(t) = P⁻¹e^{At}PP⁻¹x(0) = P⁻¹e^{At}P[q(0) ; q̇(0)].

A simple calculation shows that

P⁻¹e^{At}P = [cos(ωt) (1/ω) sin(ωt) ; −ω sin(ωt) cos(ωt)]. (5.10)

It is noted that P⁻¹e^{At}P = e^{P⁻¹APt}. Equation (5.10) readily implies that

[q(t) ; q̇(t)] = [cos(ωt) (1/ω) sin(ωt) ; −ω sin(ωt) cos(ωt)][q(0) ; q̇(0)]. (5.11)

This is precisely the equation for q and q̇ that we obtained in (5.2) using Laplace transforms. The state space analysis might seem like a roundabout way to derive (5.2) or (5.11); however, the state space approach gives us some useful insight into the matrix case, which we will study in the next section.

Recall that U on Cν is a unitary matrix if U*U = I, the identity matrix. In other words, U is a unitary matrix if its inverse U⁻¹ = U*. It is emphasized that e^{At} in (5.9) is a unitary matrix for all t. In fact, a simple calculation shows that (e^{At})* e^{At} = I. Since x(t) = e^{At}x(0), it follows that x(t)*x(t) = x(0)*x(0). Using x1 = √k q and x2 = √m q̇, we obtain

H = (1/2)mq̇² + (1/2)kq² = (1/2)x2² + (1/2)x1² = (1/2)x2(0)² + (1/2)x1(0)² = (1/2)mq̇(0)² + (1/2)kq(0)².

Therefore the total energy H equals a constant, and mq̈ + kq = 0 is a conservative system.

Recall that A is a skew symmetric matrix on Cν if A + A* = 0. For example, the matrix A in (5.9) is skew symmetric. Let us conclude this section with the following classical result from linear algebra.

REMARK 6.5.1 Let A be a skew symmetric matrix on Cν. Then the following holds.


(i) All the eigenvalues for A are on the imaginary axis {λ ∈ C : ℜ(λ) = 0}.

(ii) The exponential matrix e^{At} is a unitary matrix for all t, that is, (e^{At})* e^{At} = I.

To verify that Part (i) holds, let λ be an eigenvalue for A with eigenvector ϕ, that is, Aϕ = λϕ where ϕ is nonzero. Without loss of generality, we can assume that ϕ is a unit vector (ϕ*ϕ = 1). Using A = −A*, we have

λ = ϕ*Aϕ = −ϕ*A*ϕ = −(Aϕ)*ϕ = −(λϕ)*ϕ = −λ̄ϕ*ϕ = −λ̄.

Hence 0 = λ + λ̄ = 2ℜ(λ). Therefore all the eigenvalues for A are on the imaginary axis {λ ∈ C : ℜ(λ) = 0}.

To prove that Part (ii) holds, simply observe that A* = −A yields

(e^{At})* e^{At} = e^{A*t} e^{At} = e^{−At} e^{At} = (e^{At})⁻¹ e^{At} = I.

Therefore (e^{At})* e^{At} = I and e^{At} is a unitary matrix.

Because the matrix A in (5.9) is skew symmetric, Remark 6.5.1 also shows that eAt is aunitary matrix.

6.5.1 The conservative system Mq̈ + Kq = 0

Motivated by our analysis of the scalar system mq̈ + kq = 0, consider the mass spring system determined by

Mq̈ + Kq = 0 (5.12)

where M and K are positive matrices on Cν. The kinetic energy is given by T = (1/2)q̇*Mq̇, while the potential energy of the spring system is V = (1/2)q*Kq. The total energy H = T + V is the sum of the kinetic energy plus the potential energy, that is,

H(q, q̇) = (1/2)q̇*Mq̇ + (1/2)q*Kq. (5.13)

As expected, Mq̈ + Kq = 0 is a conservative system, that is, H equals a constant, or equivalently, Ḣ = 0. Using Mq̈ + Kq = 0, we obtain

Ḣ = (1/2)q̈*Mq̇ + (1/2)q̇*Mq̈ + (1/2)q̇*Kq + (1/2)q*Kq̇
= (1/2)q̇*(Mq̈ + Kq) + (1/2)(q̈*M + q*K)q̇
= (1/2)q̇*(Mq̈ + Kq) + (1/2)(Mq̈ + Kq)*q̇ = 0.

(Here we used the fact that M and K are self adjoint matrices.) Therefore Ḣ = 0 and Mq̈ + Kq = 0 is a conservative system.

Now consider the mass spring system determined by Mq̈ + Kq = 0 where M and K are strictly positive matrices on Cν. It is well known in linear algebra that if Λ is a positive matrix, then Λ admits a unique positive square root Λ^{1/2}, that is, Λ^{1/2} is a positive matrix and Λ = (Λ^{1/2})². (The Matlab command to compute the unique positive square root is sqrtm.) Now let Ω be the strictly positive matrix on Cν defined by

Ω = (M^{−1/2} K M^{−1/2})^{1/2}. (5.14)

(If M = m and K = k are scalars, then Ω = √(k/m) = ω, the natural frequency of mq̈ + kq = 0.) Because M is strictly positive, M^{1/2} is invertible. Since K is strictly positive, Ω is also invertible. Consider the state variables x1 and x2 in Cν defined by

x1 = ΩM^{1/2}q and x2 = M^{1/2}q̇. (5.15)

In matrix form the state variables x1 and x2 are given by

[x1 ; x2] = [ΩM^{1/2} 0 ; 0 M^{1/2}][q ; q̇] or equivalently [q ; q̇] = [M^{−1/2}Ω⁻¹ 0 ; 0 M^{−1/2}][x1 ; x2]. (5.16)

In other words,

[x1 ; x2] = P[q ; q̇] where P = [ΩM^{1/2} 0 ; 0 M^{1/2}]
[q ; q̇] = P⁻¹[x1 ; x2] where P⁻¹ = [M^{−1/2}Ω⁻¹ 0 ; 0 M^{−1/2}]. (5.17)

It is noted that P is a block matrix on C2ν. Finally, a simple calculation shows that

P*P = [K 0 ; 0 M]. (5.18)

By employing x1 = ΩM^{1/2}q and x2 = M^{1/2}q̇, with q̈ = −M⁻¹Kq, we obtain

ẋ1 = ΩM^{1/2}q̇ = ΩM^{1/2}M^{−1/2}x2 = Ωx2
ẋ2 = M^{1/2}q̈ = −M^{−1/2}Kq = −M^{−1/2}KM^{−1/2}M^{1/2}q = −Ω²M^{1/2}q = −Ωx1.

In other words, ẋ1 = Ωx2 and ẋ2 = −Ωx1. Hence x1 and x2 admit a state space representation of the form

[ẋ1 ; ẋ2] = [0 Ω ; −Ω 0][x1 ; x2]. (5.19)

In state space notation

ẋ = Ax where A = [0 Ω ; −Ω 0] and x = [x1 ; x2] = [ΩM^{1/2}q ; M^{1/2}q̇]. (5.20)

Motivated by this analysis, we call the strictly positive matrix Ω = (M^{−1/2}KM^{−1/2})^{1/2} the natural frequency matrix for the system Mq̈ + Kq = 0.
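The natural frequency matrix is easy to compute numerically via the matrix square root. A Python/SciPy sketch (the particular M and K below are assumed for illustration), which also checks that the eigenvalues of the block matrix in (5.20) are ±iωj with ωj the eigenvalues of Ω:

```python
import numpy as np
from scipy.linalg import sqrtm

# Sample strictly positive M and K (assumed for illustration only).
M = np.array([[2.0, 0.0], [0.0, 1.0]])
K = np.array([[2.0, -1.0], [-1.0, 2.0]])

Minvh = np.linalg.inv(np.real(sqrtm(M)))   # M^{-1/2}
Omega = np.real(sqrtm(Minvh @ K @ Minvh))  # natural frequency matrix (5.14)

# Block skew symmetric A from (5.20); its eigenvalues are +/- i*omega_j.
Z = np.zeros((2, 2))
A = np.block([[Z, Omega], [-Omega, Z]])

omegas = np.linalg.eigvalsh(Omega)
eigs = np.linalg.eigvals(A)
print(np.allclose(eigs.real, 0, atol=1e-10))  # True: all on the imaginary axis
print(np.allclose(np.sort(np.abs(eigs.imag)), np.sort(np.tile(omegas, 2))))
# True: the natural frequencies appear in +/- pairs
```

This mirrors the Matlab sqrtm construction used in Problem 4 below, but with sample matrices rather than the exercise data.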


It is emphasized that A is a block skew symmetric matrix on C2ν, that is, A + A* = 0. In particular, all the eigenvalues for A are on the imaginary axis. Moreover, e^{At} is a unitary matrix for all t, that is, (e^{At})* e^{At} = I; see Remark 6.5.1. Using x(t) = e^{At}x(0), we obtain the following conservation of energy type equality:

x(t)*x(t) = x(0)*x(0) (for all t).

Using this with (5.17) and (5.18), we see that the total energy H is preserved, that is,

H = (1/2)q(t)*Kq(t) + (1/2)q̇(t)*Mq̇(t) = (1/2)[q(t)* q̇(t)*] P*P [q(t) ; q̇(t)] = (1/2)x(t)*x(t) = (1/2)x(0)*x(0).

Therefore H is a constant for all t, and Mq̈ + Kq = 0 is a conservative system. To complete this section, recall that x(t) = e^{At}x(0). By consulting (5.17), we have

[q(t) ; q̇(t)] = P⁻¹e^{At}P[q(0) ; q̇(0)]. (5.21)

In other words, q and q̇ are determined by P⁻¹e^{At}P = e^{P⁻¹APt}, which is similar to the unitary matrix e^{At}. Finally, it is noted that

P⁻¹AP = [0 I ; −M⁻¹K 0] and [q̇(t) ; q̈(t)] = [0 I ; −M⁻¹K 0][q(t) ; q̇(t)];

as expected.

REMARK 6.5.2 It is beyond the scope of these notes. However, consider the Hilbert space generated by the inner product

‖[ξ1 ; ξ2]‖² = ξ1*Kξ1 + ξ2*Mξ2

where M and K are both strictly positive. Then the matrix

Λ = [0 I ; −M⁻¹K 0]

is skew symmetric with respect to this inner product. Therefore e^{Λt} is unitary with respect to the inner product generated by K and M.

6.5.2 Exercise

Problem 1. If A is any matrix on Cν and P is an invertible matrix on Cν , then show that

P⁻¹e^{At}P = e^{P⁻¹APt}.


Problem 2. Verify that (5.9) holds, that is, show that

e^{At} = [cos(ωt) sin(ωt) ; −sin(ωt) cos(ωt)] when A = [0 ω ; −ω 0]. (5.22)

(Here we assume that ω is a real number.)

Problem 3. Consider the block matrix A on C2ν given by

A = [0 Ω ; −Ω 0]

where Ω is a matrix on Cν. Then using the Schur determinant in (3.19), show that the characteristic polynomial for A is given by

det[sI − A] = det[s²I + Ω²].

Problem 4. Consider the system

[2 1 ; 1 2][q̈1 ; q̈2] + [4 −2 ; −2 2][q1 ; q2] = [0 ; 0]. (5.23)

Use the sqrtm command in Matlab to convert this system to a state space realization of the form

ẋ = Ax where A = [0 Ω ; −Ω 0] and x = [x1 ; x2].

Here A is a skew symmetric matrix on C4 and Ω is the natural frequency matrix for the mass spring system in (5.23).

• In particular, find Ω.

• Use the eig command in Matlab to find the eigenvalues for A.

• Use the poly command in Matlab to find the characteristic polynomial for A, that is, det[sI − A].

• Use Matlab to compute det[s²I + Ω²]. Notice that this is the characteristic polynomial for −Ω² evaluated at s², that is,

det[s²I + Ω²] = det[λI − (−Ω²)]|_{λ=s²}.

6.5.3 Exercise: sine and cosine matrices

This section is an exercise concerning sine and cosine matrices. Let Ω be a matrix on Cν. Let cos(Ω) and sin(Ω) be the matrices on Cν defined by

cos(Ω) = (1/2)(e^{iΩ} + e^{−iΩ}) and sin(Ω) = (1/2i)(e^{iΩ} − e^{−iΩ}). (5.24)


Part (i) Show that the following holds:

• e^{iΩ} = cos(Ω) + i sin(Ω)

• cos(Ω)² + sin(Ω)² = I

• cos(Ω) = I − Ω²/2! + Ω⁴/4! − · · · = Σ_{k=0}^∞ (−1)^k Ω^{2k}/(2k)!

• sin(Ω) = Ω − Ω³/3! + Ω⁵/5! − · · · = Σ_{k=0}^∞ (−1)^k Ω^{2k+1}/(2k + 1)!

• (d/dt) cos(Ωt) = −Ω sin(Ωt)

• (d/dt) sin(Ωt) = Ω cos(Ωt)

• L(cos(Ωt)) = s(s²I + Ω²)⁻¹

• L(sin(Ωt)) = Ω(s²I + Ω²)⁻¹. (5.25)

Therefore the sine and cosine matrices cos(Ω) and sin(Ω) have many of the same properties as the functions cos(θ) and sin(θ).

Part (ii) Recall that a matrix T on Cν is self adjoint if T = T ∗. If Ω is self adjoint, then

show cos(Ω) and sin(Ω) are also self adjoint matrices.

Part (iii) Let A be the block matrix on C2ν defined by

A = [0 Ω ; −Ω 0]. (5.26)

Then show that

e^{At} = [cos(Ωt) sin(Ωt) ; −sin(Ωt) cos(Ωt)]. (5.27)

This is a generalization of (5.22). In particular, if Ω is a self adjoint matrix, then one can directly show that e^{At} is a unitary operator for all t. If Ω is a self adjoint matrix, then cos(Ωt) and sin(Ωt) are also self adjoint matrices. Hence

(e^{At})* e^{At} = [cos(Ωt) −sin(Ωt) ; sin(Ωt) cos(Ωt)][cos(Ωt) sin(Ωt) ; −sin(Ωt) cos(Ωt)] = [I 0 ; 0 I].

In other words, (e^{At})* e^{At} = I and e^{At} is a unitary matrix. (This also follows from the fact that A is a skew symmetric matrix when Ω is a self adjoint matrix; see Remark 6.5.1.)
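Both (5.27) and the identity cos(Ωt)² + sin(Ωt)² = I can be checked numerically; SciPy provides matrix cosine and sine functions directly. A sketch with a sample self adjoint Ω (assumed for illustration):

```python
import numpy as np
from scipy.linalg import cosm, sinm, expm

# Sample self adjoint Omega (assumed for illustration).
Omega = np.array([[2.0, -1.0], [-1.0, 2.0]])
t = 0.9
Z = np.zeros((2, 2))
A = np.block([[Z, Omega], [-Omega, Z]])

C, S = cosm(Omega * t), sinm(Omega * t)

# (5.27): e^{At} is the block matrix built from cos(Omega t) and sin(Omega t).
print(np.allclose(expm(A * t), np.block([[C, S], [-S, C]])))  # True

# cos(Omega t)^2 + sin(Omega t)^2 = I, so e^{At} is unitary.
print(np.allclose(C @ C + S @ S, np.eye(2)))                  # True
```

Note that (5.27) holds for any square Ω, not just self adjoint ones, since the two blocks of A commute.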

Hint: To obtain (5.27) verify that

(sI − A)⁻¹ = [sI −Ω ; Ω sI]⁻¹ = [s(s²I + Ω²)⁻¹ Ω(s²I + Ω²)⁻¹ ; −Ω(s²I + Ω²)⁻¹ s(s²I + Ω²)⁻¹]. (5.28)


(The Schur complement can also be used to compute (sI − A)⁻¹.) Then take the inverse Laplace transform of (sI − A)⁻¹ to find e^{At}. One can also show that (5.27) holds by directly verifying that

(d/dt)[cos(Ωt) sin(Ωt) ; −sin(Ωt) cos(Ωt)] = A [cos(Ωt) sin(Ωt) ; −sin(Ωt) cos(Ωt)].

Part (iv) Assume that Mq̈ + Kq = 0 where M and K are strictly positive matrices. Then using Ω = (M^{−1/2}KM^{−1/2})^{1/2} with (5.17) and (5.21), show that

[q(t) ; q̇(t)] = P⁻¹e^{At}P[q(0) ; q̇(0)] = [M^{−1/2} cos(Ωt) M^{1/2}  M^{−1/2} Ω⁻¹ sin(Ωt) M^{1/2} ; −M^{−1/2} Ω sin(Ωt) M^{1/2}  M^{−1/2} cos(Ωt) M^{1/2}][q(0) ; q̇(0)]. (5.29)

(Notice that Ω, cos(Ωt) and sin(Ωt) commute with each other.) The formula in (5.29) is a generalization of the scalar formula in (5.11).

Part (v) Assume that

Ω = [2 −1 ; −1 2] and A = [0 Ω ; −Ω 0] = [0 0 2 −1 ; 0 0 −1 2 ; −2 1 0 0 ; 1 −2 0 0].

It is noted that Ω is a strictly positive matrix on C2, while A is a skew symmetric matrix on C4. Compute cos(Ωt), sin(Ωt) and the unitary matrix e^{At} in (5.27).

Hint: One can find e^{iΩt} and e^{−iΩt} by first computing e^{Ωt}. To see this simply observe that e^{iΩt} (respectively e^{−iΩt}) is obtained by replacing t by it (respectively t by −it) in e^{Ωt}. To be precise,

e^{iΩt} = e^{Ωλ}|_{λ=it} and e^{−iΩt} = e^{Ωλ}|_{λ=−it}. (5.30)

Then using e^{iΩt} and e^{−iΩt} one can readily compute

cos(Ωt) = (1/2)(e^{iΩt} + e^{−iΩt}) and sin(Ωt) = (1/2i)(e^{iΩt} − e^{−iΩt}). (5.31)

For real matrices,

cos(Ωt) = ℜ(e^{iΩt}) and sin(Ωt) = ℑ(e^{iΩt}). (5.32)

The formula in (5.32) does not hold for complex matrices or even complex numbers. For example, cos(i) ≠ ℜ(e^{i×i}), that is,

cos(i) = (1/2)(e^{−1} + e¹) ≠ e^{−1} = ℜ(e^{i×i}).

Finally, one can also compute cos(Ωt) and sin(Ωt) by taking the inverse Laplace transform of certain matrix functions, that is,

cos(Ωt) = L⁻¹(s(s²I + Ω²)⁻¹) and sin(Ωt) = L⁻¹(Ω(s²I + Ω²)⁻¹). (5.33)


Part (vi) Assume that Ω is the skew symmetric matrix given by

Ω = [0 a ; −a 0]

where a is a real number. Compute e^{iΩt}, cos(Ωt) and sin(Ωt).

6.6 A mass spring approximation of the wave equation

In this section, we will study a classical mass spring system where the spring constant matrix is a positive band Toeplitz matrix. This mass spring system will be used to approximate the solution to the wave equation. This section is motivated by some of the results in Chapter 6 of [43] concerning the wave equation, and the wave equation article in Wikipedia:

http://en.wikipedia.org/wiki/Wave_equation

Recall that a Toeplitz matrix is a matrix where all the entries along each diagonal are the same, that is, a matrix T is Toeplitz if its entries Tj,k = a_{j−k} depend only on the difference j − k. Let us begin with the following mathematical result taken from Chapter 6 of [43].

LEMMA 6.6.1 Consider the Toeplitz matrix Tν on Cν determined by

Tν = [ 2 −1 0 0 · · · 0 0 0 ;
−1 2 −1 0 · · · 0 0 0 ;
0 −1 2 −1 · · · 0 0 0 ;
0 0 −1 2 · · · 0 0 0 ;
· · · ;
0 0 0 0 · · · 2 −1 0 ;
0 0 0 0 · · · −1 2 −1 ;
0 0 0 0 · · · 0 −1 2 ] on Cν (6.1)

with 2 on the main diagonal, −1 immediately above and below the main diagonal, and zeros elsewhere. Then Tν is a strictly positive matrix with eigenvalues {λj} and corresponding orthonormal eigenvectors {ϕj} given by

λj = 2 − 2 cos(θj) and ϕj = (√2/√(ν + 1)) [ sin(θj) ; sin(2θj) ; sin(3θj) ; … ; sin(νθj) ] (for j = 1, 2, · · · , ν) (6.2)

where the angles {θj} are defined by

θj = jπ/(ν + 1) (for j = 1, 2, · · · , ν). (6.3)


In other words,

Tνϕj = (2 − 2 cos(θj)) ϕj (for j = 1, 2, · · · , ν). (6.4)

In particular, let Λ be the diagonal matrix on Cν formed by the eigenvalues {λj} and U the unitary matrix on Cν given by

Λ = diag{λ1, λ2, λ3, · · · , λν} on Cν and U = [ϕ1 ϕ2 ϕ3 · · · ϕν] on Cν. (6.5)

Then Tν is unitarily equivalent to Λ, that is,

TνU = UΛ. (6.6)
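The eigenvalue and eigenvector formulas (6.2)-(6.3) are easy to verify numerically before reading the proof. A Python/NumPy sketch (the size ν = 6 is an arbitrary choice):

```python
import numpy as np

nu = 6  # size of T_nu (any positive integer works)

# Build the tridiagonal Toeplitz matrix (6.1): 2 on the diagonal, -1 off it.
T = 2 * np.eye(nu) - np.eye(nu, k=1) - np.eye(nu, k=-1)

# Eigenvalues predicted by (6.2)-(6.3).
theta = np.arange(1, nu + 1) * np.pi / (nu + 1)
predicted = 2 - 2 * np.cos(theta)
print(np.allclose(np.sort(np.linalg.eigvalsh(T)), np.sort(predicted)))  # True

# The corresponding unit eigenvector from (6.2), for j = 1:
j = 1
phi = np.sqrt(2 / (nu + 1)) * np.sin(np.arange(1, nu + 1) * theta[j - 1])
print(np.allclose(T @ phi, predicted[j - 1] * phi))  # True
print(np.isclose(np.linalg.norm(phi), 1.0))          # True
```

The same check passes for every j from 1 to ν.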

Proof. (The first paragraph of this proof provides some additional insight for the reader who is familiar with Toeplitz operator theory. However, this paragraph is beyond the scope of the notes.) Consider the Toeplitz matrix T∞ on ℓ²₊ determined by

T∞ = [ 2 −1 0 0 · · · ; −1 2 −1 0 · · · ; 0 −1 2 −1 · · · ; 0 0 −1 2 · · · ; · · · ] on ℓ²₊.

As expected, 2 is on the main diagonal, −1 immediately above and below the main diagonal, and zeros elsewhere. The symbol for this Toeplitz matrix T∞ is given by

f(z) = 2 − z − z̄ = 2 − e^{iθ} − e^{−iθ} = 2 − 2 cos(θ)

where z = e^{iθ} is on the unit circle. Because f(z) ≥ 0, we see that T∞ is a positive Toeplitz matrix. Moreover, ‖T∞‖ = ‖f‖∞ = 4 where ‖f‖∞ is the L∞ norm of f. Notice that the Toeplitz matrix Tν is contained in the upper left ν × ν corner of T∞. Therefore Tν is a positive matrix whose eigenvalues {λj} are contained in the interval [0, 4].

Let λ be an eigenvalue for Tν with corresponding eigenvector ϕ = [x1 x2 · · · xν]^{tr} where tr denotes the transpose. Using Tνϕ = λϕ, we obtain the following difference equation:

−x_{n+2} + 2x_{n+1} − x_n = λx_{n+1}

subject to the initial and final conditions x0 = 0 and xν+1 = 0. In other words,

x_{n+2} + (λ − 2)x_{n+1} + x_n = 0 (6.7)

subject to the initial and final conditions x0 = 0 and xν+1 = 0, where λ is a real number in [0, 4]. The solutions of this difference equation are of the form xn = α z1ⁿ + β z2ⁿ where z1


and z2 are complex numbers, while α and β are constants satisfying the initial and final conditions. Substituting xn = zⁿ into x_{n+2} + (λ − 2)x_{n+1} + x_n = 0 shows that we are looking for the roots of the polynomial

z² + (λ − 2)z + 1 = (z + (λ − 2)/2)² + 1 − (λ − 2)²/4 = 0

where 0 ≤ λ ≤ 4. In other words,

z = −(λ − 2)/2 ± √((λ − 2)²/4 − 1).

Motivated by the symbol f = 2 − 2 cos(θ) for T∞, let us set λ = 2 − 2 cos(θ). Then we have

z = cos(θ) ± √(cos(θ)² − 1) = cos(θ) ± i sin(θ) = e^{±iθ}.

Using this we see that all solutions xn to the difference equation in (6.7) are given by

xn = α e^{inθ} + β e^{−inθ}

subject to the initial and final conditions x0 = 0 and xν+1 = 0. Thus

[ 1 1 ; e^{i(ν+1)θ} e^{−i(ν+1)θ} ] [α ; β] = [x0 ; xν+1] = [0 ; 0]. (6.8)

The only way the previous equation has a nonzero solution is when the previous 2 × 2 matrix is singular, or equivalently, its determinant equals zero, that is, when

0 = e^{−i(ν+1)θ} − e^{i(ν+1)θ} = −2i sin((ν + 1)θ).

Hence this 2 × 2 matrix is singular if and only if sin((ν + 1)θ) = 0, or equivalently,

θ = θj = jπ/(ν + 1) for some j = 1, 2, · · · , ν.

In this case, all solutions to (6.8) are determined by α = −β. Choosing α = 1/(2i) = −β, we see that xn = sin(nθj). Therefore

λ = 2 − 2 cos(θj) and ϕ = [ sin(θj) ; sin(2θj) ; sin(3θj) ; … ; sin(νθj) ]

is an eigenvalue eigenvector pair for Tν. Moreover, {2 − 2 cos(θj)} for j = 1, 2, · · · , ν is the set of all eigenvalues for Tν. Because Tν is self adjoint and all its eigenvalues are strictly positive, Tν is a strictly positive Toeplitz matrix.


To normalize the eigenvector ϕ, recall that 2 sin(ϑ)² = 1 − cos(2ϑ). Moreover, for r ≠ 1 we have the geometric series Σ_{n=0}^ν rⁿ = (1 − r^{ν+1})/(1 − r). Hence

‖ϕ‖² = Σ_{n=1}^ν sin(nθj)² = (1/2) Σ_{n=1}^ν (1 − cos(2nθj)) = ν/2 + 1/2 − (1/2) Σ_{n=0}^ν cos(2nθj)
= (ν + 1)/2 − (1/4) Σ_{n=0}^ν (e^{i2nθj} + e^{−i2nθj})
= (ν + 1)/2 − (1/4) ( (1 − e^{i2(ν+1)θj})/(1 − e^{i2θj}) + (1 − e^{−i2(ν+1)θj})/(1 − e^{−i2θj}) )
= (ν + 1)/2.

Therefore ‖ϕ‖² = (ν + 1)/2. So ϕj = (√2/√(ν + 1)) ϕ is a unit eigenvector corresponding to the eigenvalue λj = 2 − 2 cos(θj). Because all the eigenvalues of Tν are distinct and Tν is self adjoint (in fact, Tν is strictly positive), the eigenvectors {ϕj} form an orthonormal basis for Cν. Hence U = [ϕ1 ϕ2 · · · ϕν] is a unitary operator on Cν. Finally, using Tνϕj = λjϕj, we obtain TνU = UΛ. This completes the proof.

REMARK 6.6.2 By letting ν approach infinity, we see that the spectrum {2 − 2 cos(θj)} of Tν forms a dense set in the spectrum [0, 4] = {2 − 2 cos(θ) : 0 ≤ θ ≤ π} of T∞.

Figure 6.6: A mass spring system (ν masses m in a row, connected by ν + 1 springs k, with displacements q1, q2, . . . , qν).

Consider the mass spring system presented in Figure 6.6. Here there are ν masses with mass m and ν + 1 springs with spring constant k; the position of the j-th mass from its equilibrium point is qj. Let q be the vector in Cν defined by q = [q1 q2 · · · qν]^{tr} where tr denotes the transpose. (The standard inner product on Cν is denoted by (·, ·).) For example, if ν = 4, then the equations of motion are given by

m [1 0 0 0 ; 0 1 0 0 ; 0 0 1 0 ; 0 0 0 1][q̈1 ; q̈2 ; q̈3 ; q̈4] + k [2 −1 0 0 ; −1 2 −1 0 ; 0 −1 2 −1 ; 0 0 −1 2][q1 ; q2 ; q3 ; q4] = [0 ; 0 ; 0 ; 0].


In general, for any positive integer ν > 0, the equations of motion for the mass spring system in Figure 6.6 are given by

mq̈ + kTνq = 0. (6.9)

Lemma 6.6.1 guarantees that (k/m)Tν is a strictly positive Toeplitz matrix on Cν. Moreover,

ωj² = (k/m)(2 − 2 cos(θj)) (for j = 1, 2, · · · , ν)

are the eigenvalues for (k/m)Tν with corresponding orthonormal eigenvectors {ϕj} presented in (6.2), that is,

(k/m)Tνϕj = ωj²ϕj (for j = 1, 2, · · · , ν).

In other words, (k/m)TνU = UΩ where Ω = diag{ωj²} = (k/m)Λ and U is the unitary operator on Cν given in (6.5). Using (k/m)Tνϕj = ωj²ϕj, it follows that e^{iωjt}ϕj is a solution to mq̈ + kTνq = 0.

Because {ϕj} forms an orthonormal basis for Cν, the solution to mq̈ + kTνq = 0 is given by

q(t) = Σ_{j=1}^ν (αj cos(ωjt) + βj sin(ωjt)) ϕj (6.10)

where {αj} and {βj} are constants. Since {ϕj} is an orthonormal basis for Cν,

q(t) = Σ_{j=1}^ν (q(t), ϕj) ϕj.

In particular, evaluating q(t) and q̇(t) at t = 0 yields

Σ_{j=1}^ν (q(0), ϕj) ϕj = q(0) = Σ_{j=1}^ν αj ϕj
Σ_{j=1}^ν (q̇(0), ϕj) ϕj = q̇(0) = Σ_{j=1}^ν βj ωj ϕj.

By matching the coefficients of ϕj, we have

αj = (q(0), ϕj) and βj ωj = (q̇(0), ϕj) (for j = 1, 2, · · · , ν).

Using this in (6.10), we see that the solution q(t) to mq̈ + kTνq = 0 is given by

q(t) = Σ_{j=1}^ν ( (q(0), ϕj) cos(ωjt) + ((q̇(0), ϕj)/ωj) sin(ωjt) ) ϕj. (6.11)
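The modal solution (6.11) can be checked against a direct computation of the state transition matrix. A Python/NumPy/SciPy sketch (the values of ν, m, k, t and the random initial conditions are assumptions made for illustration):

```python
import numpy as np
from scipy.linalg import expm

nu, m, k, t = 5, 1.0, 2.0, 1.7
T = 2 * np.eye(nu) - np.eye(nu, k=1) - np.eye(nu, k=-1)

theta = np.arange(1, nu + 1) * np.pi / (nu + 1)
omega = np.sqrt((k / m) * (2 - 2 * np.cos(theta)))
# Columns of Phi are the orthonormal eigenvectors phi_j from (6.2).
Phi = np.sqrt(2 / (nu + 1)) * np.sin(np.outer(np.arange(1, nu + 1), theta))

rng = np.random.default_rng(1)
q0, qd0 = rng.standard_normal(nu), rng.standard_normal(nu)

# Modal solution (6.11): coefficients are the inner products (q(0), phi_j).
a = Phi.T @ q0
b = (Phi.T @ qd0) / omega
q_modal = Phi @ (a * np.cos(omega * t) + b * np.sin(omega * t))

# Reference: propagate x' = [[0, I], [-(k/m)T, 0]] x exactly via expm.
A = np.block([[np.zeros((nu, nu)), np.eye(nu)],
              [-(k / m) * T, np.zeros((nu, nu))]])
q_ref = (expm(A * t) @ np.concatenate([q0, qd0]))[:nu]
print(np.allclose(q_modal, q_ref))  # True
```

The agreement holds for any t and any initial data, since (6.11) is simply the eigenvector expansion of the exact solution.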

Recall that 2 cos(a) sin(b) = sin(b − a) + sin(b + a). So if q̇(0) = 0, we obtain

q(t) = Σ_{j=1}^ν ( (q(0), ϕj)/√(2(ν + 1)) ) [ sin(θj − ωjt) + sin(θj + ωjt) ; sin(2θj − ωjt) + sin(2θj + ωjt) ; sin(3θj − ωjt) + sin(3θj + ωjt) ; … ; sin(νθj − ωjt) + sin(νθj + ωjt) ] (when q̇(0) = 0). (6.12)


Finally, it is noted that this is a discrete version of the wave equation corresponding to a string pinned at two points; see Chapter 4 Section 1 in [26] or any standard reference on the wave equation.

Connections to the wave equation. Consider the equations of motion for a string pinned at the endpoints 0 and l determined by

ϕ̈ = c² ∂²ϕ/∂x² (where c = √(τ/ρ)) (6.13)

where ϕ(0, t) = ϕ(l, t) = 0; see Chapter 4 Section 1 in [26] or any standard reference on the wave equation. Moreover, c² = τ/ρ where τ is the constant tension in the string and ρ is the mass per unit length. Finally, c is the speed of the wave. Let us break up the interval [0, l] into ν interior points spaced a distance h apart. To be specific, set

qj(t) = ϕ(jh, t) (for j = 1, 2, · · · , ν)

where h = l/(ν + 1). Now observe that for a differentiable function ψ(x) we have

∂ψ/∂x = lim_{h→0} (ψ(x + h) − ψ(x))/h
∂²ψ/∂x² = lim_{h→0} (ψ(x + h) − 2ψ(x) + ψ(x − h))/h². (6.14)

So for ν sufficiently large,

∂ϕ/∂x |_{jh,t} ≈ (q_{j+1} − q_j)/h and ∂²ϕ/∂x² |_{jh,t} ≈ (q_{j+1} − 2q_j + q_{j−1})/h². (6.15)

(Here we set q0 = qν+1 = 0.) Using this in the wave equation ϕ̈ = c² ∂²ϕ/∂x², we obtain

q̈ + (c²/h²) Tν q = 0.

This yields the approximation for the wave equation that we have been looking for. In other words, one can approximate the wave equation as a large number of masses (or beads) sliding back and forth as in Figure 6.6.
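The two ingredients of this discretization are easy to check numerically: the centered difference in (6.14) really approximates the second derivative, and on a grid with pinned endpoints the stencil is exactly −Tνq. A Python/NumPy sketch (test function, grid size, and sample values are assumptions for illustration):

```python
import numpy as np

# Centered difference from (6.14) applied to psi(x) = sin(x),
# whose exact second derivative is -sin(x).
x, h = 1.2, 1e-4
second_diff = (np.sin(x + h) - 2 * np.sin(x) + np.sin(x - h)) / h**2
print(abs(second_diff + np.sin(x)) < 1e-6)  # True: close to -sin(x)

# On the grid with q_0 = q_{nu+1} = 0, the stencil q_{j+1} - 2q_j + q_{j-1}
# equals -(T_nu q)_j, which is why the wave equation becomes
# q'' + (c^2/h^2) T_nu q = 0.
nu = 4
T = 2 * np.eye(nu) - np.eye(nu, k=1) - np.eye(nu, k=-1)
q = np.array([1.0, 2.0, 2.0, 1.0])
qpad = np.concatenate([[0.0], q, [0.0]])
stencil = qpad[2:] - 2 * qpad[1:-1] + qpad[:-2]
print(np.allclose(stencil, -(T @ q)))  # True
```

The second check holds for any vector q, since it is just the matrix identity behind (6.15).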

6.6.1 Exercise

Problem 1. Consider the mass spring damper system presented in Figure 6.6 where the springs are replaced by a damper c and spring k in parallel. In this case, show that the equations of motion are given by

mq̈ + cTνq̇ + kTνq = 0.


Recall that λj = 2 − 2 cos(θj) and ϕj are the eigenvalues and corresponding eigenvectors for Tν for j = 1, 2, · · · , ν; see Lemma 6.6.1. Assume that ςj1 and ςj2 (for j = 1, 2, · · · , ν) are the respective roots of

mλ² + cλjλ + kλj = 0.

For simplicity assume that ςj1 ≠ ςj2 for all j = 1, 2, · · · , ν. Then show that the unique solution to mq̈ + cTνq̇ + kTνq = 0 is given by

q(t) = Σ_{j=1}^ν (αj e^{ςj1 t} + βj e^{ςj2 t}) ϕj

where {αj} and {βj} are the constants determined by solving the linear equation

[1 1 ; ςj1 ςj2][αj ; βj] = [(q(0), ϕj) ; (q̇(0), ϕj)] (for j = 1, 2, · · · , ν).

In other words,

[αj ; βj] = (1/(ςj2 − ςj1)) [ςj2 −1 ; −ςj1 1][(q(0), ϕj) ; (q̇(0), ϕj)] (for j = 1, 2, · · · , ν).

Find the solution when ςj1 = ςj2 for some possible integers j.


Chapter 7

An introduction to filtering theory

In this chapter we will present an introduction to low pass, band pass and high pass filters. The Butterworth filters will be studied in detail.

7.1 Sinusoid response

As before, let G be the transfer function for a linear system with input u and output y. Then we say that G is bounded input bounded output stable if, given any input u(t) satisfying |u(t)| ≤ γ for all t ≥ 0 where γ is a finite bound depending upon u, there exists a finite number γ1 such that |y(t)| ≤ γ1 for all t ≥ 0. It turns out that G is bounded input bounded output stable if and only if

∫₀^∞ |g(t)| dt < ∞ (1.1)

where g is the inverse Laplace transform of G. If (1.1) holds, then it is easy to show that G is bounded input bounded output stable. To see this, assume that |u(t)| ≤ γ for all t ≥ 0. Then (1.1) implies that

|y(t)| = |(g ⊗ u)(t)| =∣∣∣∣∫ t

0

g(σ)u(t− σ) dσ

∣∣∣∣ ≤ ∫ t

0

|g(σ)u(t− σ)| dσ

≤∫ t

0

|g(σ)|γ dσ ≤ γ

∫ ∞

0

|g(t)| dt = γ1 <∞ .

So |y(t)| ≤ γ1 < ∞ for all t. In other words, if (1.1) holds, then that G is bounded inputbounded output stable. On the other hand, if G is bounded input bounded output stable,then (1.1) holds. The proof of this fact is beyond the scope of these notes.

¿From now on we will drop the bounded input bounded output when referring to stability,that is, we say that a transfer function G is stable, if G is bounded input bounded outputstable. Now assume that G is a proper rational transfer function. Then G is stable if andonly if all the poles of G are contained in the open left half plane {s : �s < 0}. To see thisnotice that G admits a decomposition of the form

G(s) =p∏ν

j=1(s− λj)

343

where p is a polynomial of degree at most ν and {λj} are the poles of G. By using partial fraction expansion, it follows that the inverse Laplace transform g of G is given by

\[ g(t) = \alpha \delta(t) + \sum_{k,m} \alpha_{k,m} t^k e^{\lambda_m t} \]

where α and αk,m are constants. The constant α equals zero if and only if G is a strictly proper rational function. Notice that (1.1) holds if and only if g(t) approaches zero as t tends to infinity, or equivalently, {λj} are contained in the open left half plane {s : ℜs < 0}. In other words, G is stable if and only if all the poles of G are contained in the open left half plane.

For example, consider the transfer function

\[ G(s) = \frac{s-1}{(s+6)(s^2+2s+5)} = \frac{s-1}{(s+6)(s+1+2i)(s+1-2i)}. \]

The poles of G are −6 and −1 ± 2i. Since all of these poles are in the open left half plane, G is a stable transfer function. Notice that the zeros of G do not play a role in determining the stability of G. In fact, 1 is a zero of G and G is still stable. Now consider the transfer function

\[ H(s) = \frac{s+1}{(s^2+9)(s^2+2s+5)} = \frac{s+1}{(s+3i)(s-3i)(s+1+2i)(s+1-2i)}. \]

The poles of H are ±3i and −1 ± 2i. Since ±3i are not contained in the open left half plane, H is unstable. Finally, consider the transfer function

\[ Q(s) = \frac{s+1}{s(s-1)(s+2)(s+3)}. \]

The poles of Q are 0, 1, −2 and −3. Since 0 and 1 are not contained in the open left half plane, Q is unstable.
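The pole test above is easy to automate. The following Python/NumPy sketch (the notes themselves use Matlab; this is a cross-check, not part of the original text) examines the three denominators; a small negative tolerance guards against roundoff placing a pole such as ±3i slightly off the imaginary axis:

```python
import numpy as np

def is_stable(den):
    """A proper rational transfer function is stable iff every root of its
    denominator polynomial (coefficients, highest power first) has Re < 0."""
    poles = np.roots(den)
    return bool(np.all(poles.real < -1e-9))  # tolerance for numerical roundoff

# G: denominator (s+6)(s^2+2s+5), poles -6, -1±2i  -> stable
den_G = np.polymul([1, 6], [1, 2, 5])
# H: denominator (s^2+9)(s^2+2s+5), poles ±3i, -1±2i  -> unstable
den_H = np.polymul([1, 0, 9], [1, 2, 5])
# Q: denominator s(s-1)(s+2)(s+3), poles 0, 1, -2, -3  -> unstable
den_Q = np.polymul(np.polymul([1, 0], [1, -1]), np.polymul([1, 2], [1, 3]))

print(is_stable(den_G), is_stable(den_H), is_stable(den_Q))  # True False False
```

Note that a zero in the right half plane (such as s = 1 for G) never enters the test; only the denominator roots matter.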

7.1.1 Steady state response

Let z be a complex number. Recall that z admits a polar representation of the form z = re^{iθ} where r = |z| is the magnitude of z and θ = arg(z) is the angle of z; see Chapter 1 for further details. (The angle of z is denoted by θ = arg(z), that is, z = |z|e^{i arg(z)}.) Let zj = |zj|e^{i arg(zj)} be a set of complex numbers for j = 1, 2, . . . , n. Let z be a complex number of the form

\[ z = \frac{\prod_{j=1}^{m} z_j}{\prod_{j=m+1}^{n} z_j}. \]

(The product of a set of complex numbers {aj} is denoted by ∏ aj.) The polar representation of z, its magnitude |z| and angle arg(z) are given by

\[ z = \frac{\prod_{j=1}^{m} z_j}{\prod_{j=m+1}^{n} z_j} = \frac{\prod_{j=1}^{m} |z_j|}{\prod_{j=m+1}^{n} |z_j|}\; e^{i\left(\sum_{j=1}^{m} \arg(z_j) - \sum_{j=m+1}^{n} \arg(z_j)\right)} \]

\[ |z| = \frac{\prod_{j=1}^{m} |z_j|}{\prod_{j=m+1}^{n} |z_j|} \quad \text{and} \quad \arg(z) = \sum_{j=1}^{m} \arg(z_j) - \sum_{j=m+1}^{n} \arg(z_j). \tag{1.2} \]

In other words, the magnitude |z| of z is the product of the magnitudes of the complex numbers in the numerator, divided by the product of the magnitudes of the complex numbers in the denominator. The angle arg(z) of z is the sum of the angles of the complex numbers in the numerator, minus the sum of the angles of the complex numbers in the denominator.

For example, let z = z1z2/(z3z4z5) where zj = |zj|e^{iθj} for j = 1, 2, 3, 4, 5. Then z admits a polar representation |z|e^{iθ} of the form

\[ z = \frac{z_1 z_2}{z_3 z_4 z_5} = \frac{|z_1||z_2|}{|z_3||z_4||z_5|}\, e^{i(\theta_1 + \theta_2 - \theta_3 - \theta_4 - \theta_5)} \]

\[ |z| = \frac{|z_1||z_2|}{|z_3||z_4||z_5|} \quad \text{and} \quad \theta = \theta_1 + \theta_2 - \theta_3 - \theta_4 - \theta_5. \]

As before, let G(s) be a transfer function. For each ω in the real line, G(iω) is simply a complex number. So G(iω) admits a polar decomposition of the form

\[ G(i\omega) = |G(i\omega)|\, e^{i\phi(i\omega)} = |G(i\omega)|\, e^{i \arg(G(i\omega))} \tag{1.3} \]

where |G(iω)| is the magnitude of G(iω) and φ(iω) = arg(G(iω)) is the angle of G(iω). For example, consider the transfer function

\[ G(s) = \frac{(s-4)(s+6)}{(s+2)(s+3)}. \tag{1.4} \]

In this case,

\[ |G(i\omega)| = \sqrt{\frac{(\omega^2+16)(\omega^2+36)}{(\omega^2+4)(\omega^2+9)}} \]

\[ \phi(i\omega) = \pi - \arctan(\omega/4) + \arctan(\omega/6) - \arctan(\omega/2) - \arctan(\omega/3). \tag{1.5} \]

Rather than directly use (1.2), let us derive this result from first principles, that is,

\[ G(i\omega) = \frac{-(4 - i\omega)(i\omega + 6)}{(i\omega + 2)(i\omega + 3)} = \frac{e^{i\pi} (\omega^2+16)^{1/2} e^{-i\arctan(\omega/4)} (\omega^2+36)^{1/2} e^{i\arctan(\omega/6)}}{(\omega^2+4)^{1/2} e^{i\arctan(\omega/2)} (\omega^2+9)^{1/2} e^{i\arctan(\omega/3)}} \]
\[ = \sqrt{\frac{(\omega^2+16)(\omega^2+36)}{(\omega^2+4)(\omega^2+9)}}\; e^{i(\pi - \arctan(\omega/4) + \arctan(\omega/6) - \arctan(\omega/2) - \arctan(\omega/3))}. \]

Hence G(iω) = |G(iω)| e^{iφ(iω)} where the magnitude |G(iω)| and angle φ(iω) are given by (1.5). Recall that arctan is a function mapping (−∞, ∞) into (−π/2, π/2). Clearly, iω − 4 = −(4 − iω). Hence π − arctan(ω/4) is the angle for iω − 4. Finally, it is noted that one could also use −1 = e^{−iπ}; see Section 1.2 in Chapter 1.
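As a quick sanity check on (1.5), one can evaluate G(iω) directly and compare with the closed-form magnitude and angle. This Python/NumPy sketch (an illustration, not part of the notes) compares the angles through e^{iφ} to sidestep branch-cut issues:

```python
import numpy as np

def G(s):
    # G(s) = (s-4)(s+6)/((s+2)(s+3)), equation (1.4)
    return (s - 4)*(s + 6)/((s + 2)*(s + 3))

w = np.linspace(0.1, 50.0, 500)
Giw = G(1j*w)

# closed-form magnitude and angle from equation (1.5)
mag = np.sqrt((w**2 + 16)*(w**2 + 36)/((w**2 + 4)*(w**2 + 9)))
phi = np.pi - np.arctan(w/4) + np.arctan(w/6) - np.arctan(w/2) - np.arctan(w/3)

assert np.allclose(np.abs(Giw), mag)
assert np.allclose(Giw/np.abs(Giw), np.exp(1j*phi))  # angles agree mod 2π
print("magnitude and phase formulas (1.5) confirmed numerically")
```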

For another example, consider the unstable transfer function

\[ G(s) = \frac{400 s (2 - s)}{(s-4)(s+10)}. \]

In computing the magnitude and phase for G it is convenient to convert G to time constant form, that is, to express the numerator and denominator factors of G in the form (1 + s/a) where a is a scalar. The time constant form for this G is given by

\[ G(s) = \frac{400 s \cdot 2\,(1 - \tfrac{s}{2})}{(-4)(1 - \tfrac{s}{4}) \cdot 10\,(1 + \tfrac{s}{10})} = \frac{-20 s (1 - \tfrac{s}{2})}{(1 - \tfrac{s}{4})(1 + \tfrac{s}{10})}. \]

In this case,

\[ G(i\omega) = \frac{-20 i\omega (1 - \tfrac{i\omega}{2})}{(1 - \tfrac{i\omega}{4})(1 + \tfrac{i\omega}{10})}. \]

For ω > 0, we have

\[ |G(i\omega)| = 20|\omega| \sqrt{\frac{1 + \tfrac{\omega^2}{4}}{(1 + \tfrac{\omega^2}{16})(1 + \tfrac{\omega^2}{100})}} \tag{1.6} \]

\[ \phi(i\omega) = -\frac{\pi}{2} - \arctan(\omega/2) + \arctan(\omega/4) - \arctan(\omega/10) \qquad (\text{if } \omega > 0). \]

Let us derive this result from first principles, that is,

\[ G(i\omega) = \frac{20\omega\, e^{-i\pi/2} (1 + \tfrac{\omega^2}{4})^{1/2} e^{-i\arctan(\omega/2)}}{(1 + \tfrac{\omega^2}{16})^{1/2} e^{-i\arctan(\omega/4)} (1 + \tfrac{\omega^2}{100})^{1/2} e^{i\arctan(\omega/10)}} \]
\[ = 20\omega \sqrt{\frac{1 + \tfrac{\omega^2}{4}}{(1 + \tfrac{\omega^2}{16})(1 + \tfrac{\omega^2}{100})}}\; e^{i(-\frac{\pi}{2} - \arctan(\omega/2) + \arctan(\omega/4) - \arctan(\omega/10))}. \]

Finally, it is noted that if ω < 0, then −iω = i|ω| = |ω|e^{iπ/2}. So if ω < 0, then |G(iω)| is given by (1.6), and the angle φ(iω) is given by

\[ \phi(i\omega) = \frac{\pi}{2} - \arctan(\omega/2) + \arctan(\omega/4) - \arctan(\omega/10) \qquad (\text{if } \omega < 0). \]

Let G be a stable transfer function. Recall that the output y corresponding to the input u is given by y(t) = (g ⊗ u)(t), or equivalently, y(t) = (L⁻¹GU)(t). We say that yss is the steady state response to an input u if yss(t) is approximately equal to (g ⊗ u)(t) when t is large, or equivalently, yss(t) ≈ y(t) where y(t) = (L⁻¹GU)(t) for all large t. The steady state response is the output of the transfer function G determined by the input u after the transient response has died out. The following result shows that the steady state response for a stable transfer function G driven by a sinusoid is also a sinusoid with the same frequency, whose amplitude is |G(iω)| and whose phase shift is φ(iω).

PROPOSITION 7.1.1 Let G be a stable proper rational transfer function from u into y. If the input u(t) = e^{iωt} where the angular frequency ω is constant, then the steady state output for G is given by

\[ y_{ss}(t) = |G(i\omega)| e^{i(\omega t + \phi(i\omega))} \qquad (\text{if } u(t) = e^{i\omega t}). \tag{1.7} \]

In particular, if the input u(t) = Σ_{k=1}^{m} a_k e^{iω_k t} where the amplitudes {a_k} and corresponding angular frequencies {ω_k} are constants, then the steady state output for G is given by

\[ y_{ss}(t) = \sum_{k=1}^{m} a_k G(i\omega_k) e^{i\omega_k t} = \sum_{k=1}^{m} a_k |G(i\omega_k)| e^{i(\omega_k t + \phi(i\omega_k))} \qquad \left(\text{if } u(t) = \sum_{k=1}^{m} a_k e^{i\omega_k t}\right). \tag{1.8} \]

Derivation. Assume that the input u(t) = e^{iωt}. Then the Laplace transform of u is given by U(s) = 1/(s − iω). Clearly, the output Y(s) = G(s)U(s) = G(s)/(s − iω). By taking the partial fraction expansion, it follows that Y admits a representation of the form

\[ Y(s) = \frac{G(s)}{s - i\omega} = \frac{a}{s - i\omega} + R(s) \]

where R(s) is a rational function and a is the constant computed by a = [G(s)]_{s=iω} = G(iω). This readily implies that

\[ Y(s) = \frac{G(i\omega)}{s - i\omega} + R(s). \tag{1.9} \]

We claim that R is a strictly proper stable rational function. To see this, assume that G(s) = n(s)/d(s) where n and d are polynomials with no common factors. Since G is stable, all the roots of d are contained in the open left half plane {s : ℜs < 0}. Recall that Y(s) = G(s)/(s − iω). By consulting (1.9), we obtain

\[ R(s) = Y(s) - \frac{G(i\omega)}{s - i\omega} = \frac{G(s) - G(i\omega)}{s - i\omega} = \frac{n(s)/d(s) - n(i\omega)/d(i\omega)}{s - i\omega} = \frac{n(s) d(i\omega) - n(i\omega) d(s)}{d(s)\, d(i\omega)\, (s - i\omega)}. \]

Notice that iω is not a pole of R(s). The factor s − iω is common to both the numerator and denominator of R(s), and thus can be removed from the denominator of R(s). To see this, simply observe that the numerator n(s)d(iω) − n(iω)d(s) of R(s) evaluated at s = iω is zero. In other words,

\[ n(s) d(i\omega) - n(i\omega) d(s) = p(s)(s - i\omega) \]

where p is a polynomial of degree at most deg d − 1. Hence R(s) = p(s)/(d(s) d(iω)) is a strictly proper rational function. Because all the roots of d are contained in the open left half plane, R is stable. In particular, r(t) approaches zero as t tends to infinity.

Recall that G(iω) = |G(iω)| e^{iφ(iω)}. By taking the inverse Laplace transform in (1.9), we arrive at

\[ y(t) = G(i\omega) e^{i\omega t} + r(t) = |G(i\omega)|\, e^{i(\omega t + \phi(i\omega))} + r(t). \]

Since r(t) approaches zero as t tends to infinity, it follows that yss(t) = |G(iω)| e^{i(ωt + φ(iω))}. Therefore (1.7) holds.

Equation (1.8) follows from the fact that the transfer function is a linear operator from the input to the output. If u(t) = Σ_{k=1}^{m} a_k e^{iω_k t}, then the output y(t) is given by

\[ y(t) = (g \otimes u)(t) = \sum_{k=1}^{m} a_k (g \otimes e^{i\omega_k t})(t). \]

This readily implies that the steady state response is given by yss(t) = Σ_{k=1}^{m} a_k yss,k(t) where yss,k(t) is the steady state response to the input e^{iω_k t}. By consulting (1.7), we see that yss,k(t) = |G(iω_k)| e^{i(ω_k t + φ(iω_k))}. This yields (1.8) and completes the proof.
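The convolution picture behind this proof can also be checked numerically. For the simple kernel g(t) = e^{−t} (so G(s) = 1/(s + 1), an example chosen here purely for illustration), the convolution (g ⊗ e^{iωt})(t) at a large t should agree with G(iω)e^{iωt} up to an e^{−t} transient. A Python/NumPy sketch:

```python
import numpy as np

omega, t_end = 2.0, 20.0             # after ~20 time constants, e^{-t} is negligible
sigma = np.linspace(0.0, t_end, 200001)
vals = np.exp(-sigma)*np.exp(1j*omega*(t_end - sigma))   # g(σ) u(t-σ)

# trapezoidal rule for y(t) = ∫_0^t g(σ) u(t-σ) dσ at t = t_end
dt = sigma[1] - sigma[0]
y = (vals.sum() - 0.5*(vals[0] + vals[-1]))*dt

y_ss = (1.0/(1j*omega + 1.0))*np.exp(1j*omega*t_end)     # G(iω) e^{iωt}, Proposition 7.1.1
print(abs(y - y_ss) < 1e-6)
```

The residual combines the decayed transient e^{−20} with the quadrature error; both are far below the tolerance.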

A state space derivation of equation (1.7) in Proposition 7.1.1. Let {A, B, C, D} be a stable state space realization for G(s), that is, G(s) = C(sI − A)⁻¹B + D and A is stable. Assume that the input u(t) = γe^{iωt} where γ is a constant and ω is the frequency. The solution x(t) to the state space system ẋ = Ax + Bu with u(t) = γe^{iωt} is given by

\[ \begin{aligned}
x(t) &= e^{At} x(0) + \int_0^t e^{A(t-\tau)} B u(\tau)\, d\tau \\
&= e^{At} x(0) + e^{At} \int_0^t e^{-A\tau} B \gamma e^{i\omega\tau}\, d\tau \\
&= e^{At} x(0) + e^{At} \int_0^t e^{(i\omega I - A)\tau} B \gamma\, d\tau \\
&= e^{At} x(0) + e^{At} \left[ e^{(i\omega I - A)\tau} \right]_0^t (i\omega I - A)^{-1} B \gamma \\
&= e^{At} x(0) + e^{At} \left( e^{(i\omega I - A)t} - I \right) (i\omega I - A)^{-1} B \gamma \\
&= e^{At} \left( x(0) - (i\omega I - A)^{-1} B \gamma \right) + e^{At} e^{(i\omega I - A)t} (i\omega I - A)^{-1} B \gamma \\
&= e^{At} \left( x(0) - (i\omega I - A)^{-1} B \gamma \right) + e^{At} e^{-At} e^{i\omega t} (i\omega I - A)^{-1} B \gamma \\
&= e^{At} \left( x(0) - (i\omega I - A)^{-1} B \gamma \right) + (i\omega I - A)^{-1} B \gamma e^{i\omega t}.
\end{aligned} \]

Here we used the fact that ∫ e^{(iωI−A)τ} dτ = (iωI − A)⁻¹ e^{(iωI−A)τ} = e^{(iωI−A)τ} (iωI − A)⁻¹. The inverse (iωI − A)⁻¹ is well defined because A is stable, and thus iω is not an eigenvalue of A. So when u(t) = γe^{iωt}, we have

\[ x(t) = e^{At} \left( x(0) - (i\omega I - A)^{-1} B \gamma \right) + (i\omega I - A)^{-1} B \gamma e^{i\omega t} \qquad (\text{if } u(t) = \gamma e^{i\omega t}). \tag{1.10} \]

Because A is stable, e^{At} → 0 as t tends to infinity. Hence in steady state

\[ x_{ss}(t) = (i\omega I - A)^{-1} B \gamma e^{i\omega t} \qquad (\text{if } u(t) = \gamma e^{i\omega t}). \tag{1.11} \]

Using y = Cx + Du we have

\[ \begin{aligned}
y(t) &= C e^{At} \left( x(0) - (i\omega I - A)^{-1} B \gamma \right) + C (i\omega I - A)^{-1} B \gamma e^{i\omega t} + D \gamma e^{i\omega t} \\
&= C e^{At} \left( x(0) - (i\omega I - A)^{-1} B \gamma \right) + G(i\omega) \gamma e^{i\omega t} \\
&= C e^{At} \left( x(0) - (i\omega I - A)^{-1} B \gamma \right) + |G(i\omega)| \gamma e^{i(\omega t + \phi(i\omega))}. \tag{1.12}
\end{aligned} \]

Recall that G(iω) = |G(iω)| e^{iφ(iω)}. Since A is stable, e^{At} → 0 as t tends to infinity, and thus

\[ y_{ss}(t) = G(i\omega) \gamma e^{i\omega t} = |G(i\omega)| \gamma e^{i(\omega t + \phi(i\omega))} \qquad (\text{if } u(t) = \gamma e^{i\omega t}). \tag{1.13} \]

This yields (1.7). It is emphasized that if ω = 0, then γe^{i0t} = γ. So if u(t) = u0, a constant for all t, then we obtain the following state space version of the final value theorem:

\[ x(t) = e^{At} \left( x(0) + A^{-1} B u_0 \right) - A^{-1} B u_0 \quad \text{and} \quad x_{ss}(t) = -A^{-1} B u_0 \qquad (\text{when } u(t) = u_0) \]
\[ y(t) = C e^{At} \left( x(0) + A^{-1} B u_0 \right) + G(0) u_0 \quad \text{and} \quad y_{ss}(t) = G(0) u_0 \qquad (\text{when } u(t) = u_0). \]
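The steady state formulas (1.11) and (1.13) can be verified algebraically for any concrete realization: substituting x_ss(t) = (iωI − A)⁻¹Bγ e^{iωt} into ẋ = Ax + Bu must produce an identity. A Python/NumPy sketch, where the particular A, B, C, D values are assumed example data (A has eigenvalues −1 ± 2i, hence stable):

```python
import numpy as np

# an assumed stable 2x2 realization (eigenvalues of A are -1 ± 2i)
A = np.array([[0.0, 1.0], [-5.0, -2.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = 0.0

omega, gamma = 3.0, 2.0
I = np.eye(2)
x0 = np.linalg.solve(1j*omega*I - A, B*gamma)   # x_ss(t) = x0 e^{iωt}, equation (1.11)

# x_ss solves the state equation: d/dt x_ss = iω x0 e^{iωt} must equal (A x0 + Bγ) e^{iωt}
assert np.allclose(1j*omega*x0, A @ x0 + B*gamma)

# and the steady state output matches G(iω)γ, equation (1.13)
G_iw = (C @ np.linalg.solve(1j*omega*I - A, B))[0, 0] + D
assert np.isclose((C @ x0)[0, 0] + D*gamma, G_iw*gamma)
print("state space steady-state formulas verified")
```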

REMARK 7.1.2 Assume that ẋ = Ax + Bu and y = Cx + Du is a stable state space system. Moreover, the input u(t) = Σ_{k=1}^{m} a_k e^{iω_k t}. Then we have

\[ x(t) = \sum_{k=1}^{m} \left( e^{At} \left( x(0) - (i\omega_k I - A)^{-1} B a_k \right) + (i\omega_k I - A)^{-1} B a_k e^{i\omega_k t} \right) \]
\[ x_{ss}(t) = \sum_{k=1}^{m} (i\omega_k I - A)^{-1} B a_k e^{i\omega_k t} \]
\[ y(t) = \sum_{k=1}^{m} \left( C e^{At} \left( x(0) - (i\omega_k I - A)^{-1} B a_k \right) + a_k |G(i\omega_k)| e^{i(\omega_k t + \phi(i\omega_k))} \right) \]
\[ y_{ss}(t) = \sum_{k=1}^{m} a_k G(i\omega_k) e^{i\omega_k t} = \sum_{k=1}^{m} a_k |G(i\omega_k)| e^{i(\omega_k t + \phi(i\omega_k))}. \]

As before, assume that G is a stable proper rational transfer function. If u(t) = 1, then the steady state response yss(t) = G(0). To see this, simply notice that 1 = 1·e^{i0t}. So using ω = 0 in (1.7), we see that yss(t) = G(0). If u(t) = γ where γ is a constant, then the steady state response yss(t) = γG(0). For example, consider the transfer function

\[ G(s) = \frac{20(s-1)}{(s+2)(s^2+2s+5)}. \tag{1.14} \]

Then the steady state response to the input u(t) = 1 is yss(t) = G(0) = −20/(2 × 5) = −2, that is, yss(t) = −2. For another example, notice that the steady state response to u(t) = −3 is yss(t) = −3G(0) = −3 × (−20/(2 × 5)) = 6, that is, yss(t) = 6.

The case when g is real valued. In almost all practical problems the transfer function G(s) is the Laplace transform of a real valued function g. So without loss of generality let us assume that g is a real valued function. In this case, \(\overline{G(i\omega)} = G(-i\omega)\). To see this, simply observe that

\[ \overline{G(i\omega)} = \overline{(\mathcal{L}g)(i\omega)} = \overline{\int_0^\infty e^{-i\omega t} g(t)\, dt} = \int_0^\infty e^{i\omega t} g(t)\, dt = (\mathcal{L}g)\big|_{s=-i\omega} = G(-i\omega). \]

Hence \(\overline{G(i\omega)} = G(-i\omega)\) when g is a real valued function.

As before, assume that g(t) is a real valued function. Since G(−iω) is the complex conjugate of G(iω), it follows that G(iω) and G(−iω) have the same magnitude, that is, |G(iω)| = |G(−iω)|. In particular, |G(iω)| is an even function of ω, that is, symmetric about the y axis. Moreover, φ(iω) = −φ(−iω), and thus φ(iω) = arg(G(iω)) is an odd function. To verify this, recall that if z = a + ib is a complex number, then arg(z̄) = −arg(z); see (2.4) in Chapter 1. Hence φ(iω) = −φ(−iω).
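This conjugate symmetry is easy to see numerically for any rational G with real coefficients. A Python/NumPy check (illustrative only; the G below is the stable example used later in this section):

```python
import numpy as np

# any rational function with real coefficients, e.g. G(s) = (20s+4)/(s^3+4s^2+9s+10)
def G(s):
    return (20*s + 4)/(s**3 + 4*s**2 + 9*s + 10)

w = np.linspace(-10.0, 10.0, 401)
assert np.allclose(G(1j*w), np.conj(G(-1j*w)))         # G(iω) = conj(G(−iω))
assert np.allclose(np.abs(G(1j*w)), np.abs(G(-1j*w)))  # |G(iω)| is even in ω
print("conjugate symmetry of G(iω) holds")
```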

Consider the input u(t) = γ cos(ωt + ϕ) where ϕ is a constant phase. Then the steady state response is given by

\[ y_{ss}(t) = \gamma |G(i\omega)| \cos(\omega t + \varphi + \phi(i\omega)) \qquad (\text{if } u(t) = \gamma \cos(\omega t + \varphi)). \tag{1.15} \]

To verify this, observe that

\[ \cos(\omega t + \varphi) = \frac{e^{i\varphi} e^{i\omega t}}{2} + \frac{e^{-i\varphi} e^{-i\omega t}}{2}. \]

By employing (1.8), |G(−iω)| = |G(iω)| and φ(−iω) = −φ(iω), we obtain

\[ \begin{aligned}
y_{ss}(t) &= \frac{\gamma e^{i\varphi} |G(i\omega)|\, e^{i(\omega t + \phi(i\omega))}}{2} + \frac{\gamma e^{-i\varphi} |G(-i\omega)|\, e^{-i(\omega t - \phi(-i\omega))}}{2} \\
&= \frac{\gamma |G(i\omega)|\, e^{i(\omega t + \varphi + \phi(i\omega))}}{2} + \frac{\gamma |G(i\omega)|\, e^{-i(\omega t + \varphi + \phi(i\omega))}}{2} \\
&= \gamma |G(i\omega)| \cos(\omega t + \varphi + \phi(i\omega)).
\end{aligned} \]

Hence (1.15) holds. For example, consider the stable transfer function G given by

\[ G(s) = \frac{-10 (1 - \tfrac{s}{4})}{(1 + \tfrac{s}{2})(1 + \tfrac{s}{3})}. \]

Notice that we have expressed this transfer function in time constant form. In this case,

\[ |G(i\omega)| = 10 \sqrt{\frac{1 + \tfrac{\omega^2}{16}}{(1 + \tfrac{\omega^2}{4})(1 + \tfrac{\omega^2}{9})}} \]

\[ \phi(i\omega) = \pi - \arctan(\omega/4) - \arctan(\omega/2) - \arctan(\omega/3). \]

If u(t) = a + b cos(ωt) where a and b are constants while ω is the frequency, then the steady state output yss(t) is given by

\[ y_{ss}(t) = a G(0) + 10 b \sqrt{\frac{1 + \tfrac{\omega^2}{16}}{(1 + \tfrac{\omega^2}{4})(1 + \tfrac{\omega^2}{9})}} \cos(\omega t + \phi(i\omega)). \]

Clearly, G(0) = −10. In particular, if u(t) = −2 + 6 cos(5t), then the previous expression for yss reduces to

\[ y_{ss}(t) = 20 + 60 \sqrt{\frac{1 + \tfrac{25}{16}}{(1 + \tfrac{25}{4})(1 + \tfrac{25}{9})}} \cos(5t + \phi(5i)) = 20 + 18.35 \cos(5t + 0.025). \]
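The two numbers 18.35 and 0.025 in the last display are easy to reproduce by evaluating G(5i) directly. A Python/NumPy cross-check (illustration only; the notes do this computation by hand):

```python
import numpy as np

def G(s):
    # time constant form: G(s) = -10(1 - s/4)/((1 + s/2)(1 + s/3))
    return -10*(1 - s/4)/((1 + s/2)*(1 + s/3))

# u(t) = -2 + 6 cos(5t)  =>  yss(t) = -2 G(0) + 6|G(5i)| cos(5t + φ(5i))
assert abs(-2*G(0) - 20) < 1e-12           # constant term: 20
assert abs(6*abs(G(5j)) - 18.35) < 0.01    # amplitude: 18.35
assert abs(np.angle(G(5j)) - 0.025) < 1e-3 # phase: 0.025 rad
print("yss(t) = 20 + 18.35 cos(5t + 0.025) confirmed")
```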

Since sin(θ) = cos(θ − π/2), equation (1.15) shows that the steady state response for the transfer function G driven by the input γ sin(ωt + ϕ) is given by

\[ y_{ss}(t) = \gamma |G(i\omega)| \sin(\omega t + \varphi + \phi(i\omega)) \qquad (\text{if } u(t) = \gamma \sin(\omega t + \varphi)). \tag{1.16} \]

It is noted that if the input is a sum of sinusoids, that is,

\[ u(t) = a + \sum_{k=1}^{m} \left( \alpha_k \cos(\omega_k t + \varphi_k) + \beta_k \sin(\omega_k t + \varphi_k) \right), \tag{1.17} \]

then the corresponding steady state response is given by the following sum of sinusoids:

\[ y_{ss}(t) = G(0) a + \sum_{k=1}^{m} \left( \alpha_k |G(i\omega_k)| \cos(\omega_k t + \varphi_k + \phi(i\omega_k)) + \beta_k |G(i\omega_k)| \sin(\omega_k t + \varphi_k + \phi(i\omega_k)) \right). \tag{1.18} \]

This follows from (1.15) and (1.16) along with the fact that the transfer function is a linear operator from the input to the output.

REMARK 7.1.3 Assume that {A, B, C, D} is a stable state space realization for a transfer function G(s), that is,

\[ G(s) = C(sI - A)^{-1}B + D. \tag{1.19} \]

Then G(0) = −CA⁻¹B + D. Moreover, for frequency ω we have

\[ G(i\omega) = C(i\omega I - A)^{-1}B + D. \tag{1.20} \]

In particular, the corresponding Matlab commands are given by

G(0) = −C ∗ inv(A) ∗ B + D  and  G(iω) = C ∗ inv(iωI − A) ∗ B + D.

Therefore one can compute G(0) and G(iω) in Matlab directly from the state space realization {A, B, C, D} for G(s).

[Figure: linear simulation results over 0 to 12 seconds; axes Time (seconds) vs. Amplitude.]

Figure 7.1: The actual and steady state response for g


A Matlab example. Consider the transfer function

\[ G(s) = \frac{20s + 4}{s^3 + 4s^2 + 9s + 10}. \]

Because the poles of G(s) are −2 and −1 ± 2i, this transfer function is stable. Consider the input u(t) = 3 + 4 cos(2t). Notice that

\[ G(0) = \frac{4}{10} = \frac{2}{5} \]
\[ G(2i) = \frac{20s + 4}{s^3 + 4s^2 + 9s + 10} \bigg|_{s=2i} = 2.7647 - 2.0588i \]
\[ |G(2i)| = |2.7647 - 2.0588i| = 3.4471 \]
\[ 4|G(2i)| = 13.7883 \]
\[ \mathrm{angle}(G(2i)) = -0.6401. \]

Recall that the steady state response is determined by

\[ y_{ss}(t) = 3G(0) + 4|G(2i)| \cos(2t + \mathrm{angle}(G(2i))). \]

Therefore the steady state response is given by

\[ y_{ss}(t) = \frac{6}{5} + 13.7883 \cos(2t - 0.6401). \]

The actual response and the steady state response are presented in Figure 7.1, which shows that y(t) converges to yss(t) for large t. The Matlab commands we used to generate this plot are given by

t = linspace(0, 12, 2^14); u = 3 + 4*cos(2*t);
num = [20, 4]; den = [1, 4, 9, 10]; G = tf(num, den);
lsim(G, u, t); grid
g = (20*2i + 4)/((2i)^3 + 4*(2i)^2 + 9*2i + 10)   % = 2.7647 - 2.0588i
yss = 6/5 + 4*abs(g)*cos(2*t + angle(g));
hold on; plot(t, yss, 'r')

Finally, it is noted that the lsim command in Matlab plots both the input u and output y on the same graph.
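The same numbers can be reproduced outside Matlab. Here is a Python/NumPy translation of the G(2i) computation (offered as a cross-check, not part of the original notes):

```python
import numpy as np

num, den = [20, 4], [1, 4, 9, 10]    # G(s) = (20s+4)/(s^3+4s^2+9s+10)

g = np.polyval(num, 2j)/np.polyval(den, 2j)   # G(2i)
G0 = np.polyval(num, 0)/np.polyval(den, 0)    # G(0) = 2/5

assert np.isclose(g, 2.7647 - 2.0588j, atol=1e-4)
assert np.isclose(abs(g), 3.4471, atol=1e-4)
assert np.isclose(np.angle(g), -0.6401, atol=1e-3)
assert np.isclose(3*G0, 6/5)
print("yss(t) = 6/5 + 13.7883 cos(2t - 0.6401) reproduced")
```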

A square wave example. Let Y(s)/U(s) = G(s) be a stable transfer function with input u(t) and output y(t). Consider the square wave input u(t) with period 2π defined by

\[ u(t) = \begin{cases} 1 & \text{if } j\pi \le t < (j+1)\pi \text{ and } j \text{ is even} \\ -1 & \text{if } j\pi \le t < (j+1)\pi \text{ and } j \text{ is odd.} \end{cases} \]

The Fourier series for u(t) is given by

\[ u(t) = \frac{4}{\pi} \sum_{\text{odd } k \ge 1} \frac{\sin(kt)}{k}; \]

see equation (2.2) in Section 2.2 of Chapter 2. The steady state response for this square wave input u(t) is determined by

\[ y_{ss}(t) = \frac{4}{\pi} \sum_{\text{odd } k \ge 1} \frac{|G(ik)| \sin(kt + \phi(ik))}{k} \]

where G(iω) = |G(iω)|e^{iφ(iω)}. For example, consider the transfer function

where G(iω) = |G(iω)|eiφ(iω).For example, consider the transfer function

Y (s)

U(s)= G(s) =

s− 1

s3 + 2s2 + 2s+ 1.

The poles of G(s) are −1 and −12±

√3i2. Therefore G(s) is stable. Then we used Matlab to

plot the square wave input u(t), the output y(t) and the steady state response yss(t) over theinterval [0, 8π]; see Figure 7.2. To compute G(iω) one could also use a state space realization{A,B,C,D} for G(s). In this case, D = 0. Then G(iω) = C(iωI −A)−1B.

[Figure: linear simulation results over 0 to 30 seconds; axes Time (seconds) vs. Amplitude.]

Figure 7.2: The steady state response to a square wave


The Matlab commands we used to plot Figure 7.2 are given by

num = [1, -1]; den = [1, 2, 2, 1]; g = tf(num, den)
t = linspace(0, 8*pi, 2^17); u = square(t);
lsim(g, u, t); grid
p = zeros(size(t));
for k = 1:2:1001
    gk = polyval(num, i*k)/polyval(den, i*k);
    p = p + abs(gk)*sin(k*t + angle(gk))/k;
end
p = 4*p/pi; hold on; plot(t, p, 'r');
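The same truncated Fourier sum can be computed in Python/NumPy (shown here as an illustration mirroring the Matlab loop). Since |G(ik)| decays like 1/k², each term decays like 1/k³ and the partial sum converges quickly:

```python
import numpy as np

num, den = [1, -1], [1, 2, 2, 1]     # G(s) = (s-1)/(s^3+2s^2+2s+1)

t = np.linspace(0, 8*np.pi, 4096)
p = np.zeros_like(t)
for k in range(1, 1002, 2):          # odd harmonics of the square wave
    gk = np.polyval(num, 1j*k)/np.polyval(den, 1j*k)
    p += np.abs(gk)*np.sin(k*t + np.angle(gk))/k
p *= 4/np.pi                         # truncated yss(t)

assert np.all(np.isfinite(p))
assert np.max(np.abs(p)) < 2.0       # bounded, as a stable response must be
print("steady state square wave response computed:", p.shape)
```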

7.1.2 A vibration suppression example

[Figure: mass m1 attached to ground through a spring and damper c, with a second mass m2 attached to m1 through spring k2; the force u acts on m1, and q1, q2 are the positions of m1, m2.]

Figure 7.3: A mass spring damper system to suppress a vibration

In this section, we will use our steady state analysis to eliminate or suppress a vibration in a certain mass spring damper system. To this end, consider the mass spring damper system in Figure 7.3. The equations of motion are given by

\[ \begin{bmatrix} m_1 & 0 \\ 0 & m_2 \end{bmatrix} \begin{bmatrix} \ddot{q}_1 \\ \ddot{q}_2 \end{bmatrix} + \begin{bmatrix} c & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \dot{q}_1 \\ \dot{q}_2 \end{bmatrix} + \begin{bmatrix} k_1 + k_2 & -k_2 \\ -k_2 & k_2 \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} = \begin{bmatrix} u \\ 0 \end{bmatrix}. \tag{1.21} \]

Here q1 is the position of the mass m1, while q2 is the position of the mass m2. Moreover, the input u(t) is a force on the first block m1. (By choosing m1 = 1, m2 = 1, k1 = 1, k2 = 1, c = 2.5 and u = 0, this system reduces to the stable mass spring damper system presented in equation (2.8) of Section 6.2.2 in Chapter 6.) Finally, the output y(t) = q1(t).

By taking the Laplace transform of (1.21) with all the initial conditions set equal to zero, we arrive at

\[ \begin{bmatrix} m_1 s^2 + cs + k_1 + k_2 & -k_2 \\ -k_2 & m_2 s^2 + k_2 \end{bmatrix} \begin{bmatrix} Q_1(s) \\ Q_2(s) \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} U(s). \tag{1.22} \]

The determinant of the previous 2 × 2 matrix is given by

\[ d(s) = m_1 m_2 s^4 + m_2 c s^3 + (m_1 k_2 + m_2 k_2 + m_2 k_1) s^2 + c k_2 s + k_1 k_2. \tag{1.23} \]

By taking the inverse of the 2 × 2 matrix in (1.22), we have

\[ \begin{bmatrix} Q_1(s) \\ Q_2(s) \end{bmatrix} = \frac{1}{d(s)} \begin{bmatrix} m_2 s^2 + k_2 & k_2 \\ k_2 & m_1 s^2 + cs + k_1 + k_2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} U(s). \tag{1.24} \]

Therefore the transfer function G(s) from u to y = q1 is given by

\[ G(s) = \frac{Y(s)}{U(s)} = \frac{m_2 s^2 + k_2}{m_1 m_2 s^4 + m_2 c s^3 + (m_1 k_2 + m_2 k_2 + m_2 k_1) s^2 + c k_2 s + k_1 k_2}. \tag{1.25} \]

The equations of motion in (1.21) can be rewritten as Mq̈ + Φq̇ + Kq = bu where

\[ M = \begin{bmatrix} m_1 & 0 \\ 0 & m_2 \end{bmatrix}, \quad \Phi = \begin{bmatrix} c & 0 \\ 0 & 0 \end{bmatrix}, \quad K = \begin{bmatrix} k_1 + k_2 & -k_2 \\ -k_2 & k_2 \end{bmatrix}, \quad b = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \tag{1.26} \]

We assume that m1, m2, c, k1 and k2 are all strictly positive. Clearly, M is a strictly positive matrix, and Φ is a positive matrix. Moreover, K is a strictly positive matrix. To see this, simply notice that k1 + k2 > 0 and det[K] = k1k2 > 0. By Sylvester's criterion, the spring matrix K is strictly positive; see Remark 6.2.3 in Chapter 6. Now observe that

\[ \begin{bmatrix} K - \sigma M \\ \Phi \end{bmatrix} = \begin{bmatrix} k_1 + k_2 - \sigma m_1 & -k_2 \\ -k_2 & k_2 - \sigma m_2 \\ c & 0 \\ 0 & 0 \end{bmatrix} \]

is one to one for all real numbers σ > 0. So according to Theorem 6.2.11 in Chapter 6, our mass spring damper system in (1.21) is stable. Therefore

\[ \det[M s^2 + \Phi s + K] = d(s) = m_1 m_2 s^4 + m_2 c s^3 + (m_1 k_2 + m_2 k_2 + m_2 k_1) s^2 + c k_2 s + k_1 k_2 \]

is a stable polynomial, that is, all the roots of d(s) are contained in the open left half plane. In particular, our transfer function G(s) in (1.25) is stable.

Suppose that the mass m1 is subject to a disturbance of the form u(t) = γ sin(ωt). For example, m1 can represent the mass of a building, and u(t) can come from a piece of machinery operating at an angular frequency of ω. The idea is to find a mass m2 and a spring constant k2 to eliminate the vibration γ sin(ωt) on the mass m1, that is, to find m2 and k2 such that the steady state response yss(t) = 0. In other words, find m2 and k2 to eliminate the vibration on the mass m1 in steady state.

The steady state response for the disturbance u(t) = γ sin(ωt) is given by

\[ y_{ss}(t) = |G(i\omega)| \gamma \sin(\omega t + \phi(i\omega)) \]

where G(iω) = |G(iω)|e^{iφ(iω)}. The steady state response yss(t) = 0 if and only if |G(iω)| = 0, or equivalently, G(iω) = 0. By consulting the formula for G(s) in (1.25), we see that

G(iω) = 0 if and only if its numerator m2(iω)² + k2 = 0. Therefore the steady state response yss(t) = 0 if and only if k2 = m2ω², or equivalently, ω = √(k2/m2). It is noted that this choice of m2 and k2 corresponds to the natural frequency ω = √(k2/m2) of the mass spring system m2ẍ + k2x = 0. (Clearly, we always assume that m2 > 0 and k2 > 0.)

So if we choose m2 and k2 such that k2 = m2ω², then the steady state response yss(t) = 0, and the first mass m1 does not feel the vibration in steady state. Substituting k2 = m2ω² into G(s) yields the following form for our transfer function G(s), which corresponds to a zero steady state output for the input u(t) = γ sin(ωt):

\[ \begin{aligned}
G(s) &= \frac{m_2 s^2 + k_2}{m_1 m_2 s^4 + m_2 c s^3 + (m_1 k_2 + m_2 k_2 + m_2 k_1) s^2 + c k_2 s + k_1 k_2} \bigg|_{k_2 = m_2 \omega^2} \\
&= \frac{m_2 s^2 + m_2 \omega^2}{m_1 m_2 s^4 + m_2 c s^3 + (m_1 m_2 \omega^2 + m_2^2 \omega^2 + m_2 k_1) s^2 + c m_2 \omega^2 s + k_1 m_2 \omega^2} \\
&= \frac{s^2 + \omega^2}{m_1 s^4 + c s^3 + (m_1 \omega^2 + m_2 \omega^2 + k_1) s^2 + c \omega^2 s + k_1 \omega^2}.
\end{aligned} \]

In other words, by choosing m2 and k2 such that k2 = m2ω², the transfer function G(s) which guarantees a zero steady state output is given by

\[ G(s) = \frac{s^2 + \omega^2}{d_1(s)} = \frac{s^2 + \omega^2}{m_1 s^4 + c s^3 + (m_1 \omega^2 + m_2 \omega^2 + k_1) s^2 + c \omega^2 s + k_1 \omega^2}. \tag{1.27} \]

The poles of G(s), that is, the roots of the denominator

\[ d_1(s) = m_1 s^4 + c s^3 + (m_1 \omega^2 + m_2 \omega^2 + k_1) s^2 + c \omega^2 s + k_1 \omega^2 \]

of G(s), determine how fast and how much q1(t) oscillates before arriving at zero steady state. The engineer can determine the poles of G(s) by letting m2 vary from 0 to ∞. Then one would choose the m2 from this set of poles which yields the "best" transient response to achieve a zero steady state output. Roughly speaking, the further the poles of G(s) are in the left half plane, the faster the first mass m1 moves to zero. The imaginary parts of the poles of G(s) determine how fast the transient response oscillates before achieving zero steady state. Of course, there is a bound on the mass m2 one can use in practice. So choosing a large m2, or likewise a large k2, may not make any sense.

Let F(s) be a transfer function. Recall that the root locus plots the zeros of 1 + kF(s) as k varies from zero to infinity; see Ogata [27] for a discussion of the root locus method. It is noted that the poles of G(s), or equivalently, the roots of d1(s), are the zeros of

\[ 0 = 1 + \frac{m_2 \omega^2 s^2}{m_1 s^4 + c s^3 + (m_1 \omega^2 + k_1) s^2 + c \omega^2 s + k_1 \omega^2}. \]

So one can plot the root locus of

\[ F(s) = \frac{\omega^2 s^2}{m_1 s^4 + c s^3 + (m_1 \omega^2 + k_1) s^2 + c \omega^2 s + k_1 \omega^2} \tag{1.28} \]

to find the poles of G(s) corresponding to a specified m2, or vice versa, to find the m2 corresponding to a specified pole of G(s) on the root locus. The root locus of the transfer function F(s) in (1.28) will plot the poles of G(s) in (1.27) as m2 varies from zero to infinity. From the root locus one can choose a reasonable value of m2 and set k2 = m2ω² to achieve a zero steady state output. (The Matlab command for the root locus is rlocus, which is applied to the transfer function F(s) in (1.28).) However, we did not cover the root locus method in these notes. Moreover, the root locus is not necessary to determine the roots of d1(s) as m2 varies in an acceptable range. One can simply let m2 vary and plot the corresponding roots of d1(s) on the same graph. Then using this graph one can choose an appropriate m2 and corresponding spring constant k2 = m2ω².

[Figure: upper panel, root locus plot (Real Axis vs. Imaginary Axis, in seconds⁻¹); lower panel, linear simulation results (Time (seconds) vs. Amplitude).]

Figure 7.4: The roots of d1(s) as m2 varies. The graph of u(t) = sin(2t) and y(t).

For a numerical example, we choose m1 = 2, c = 3 and k1 = 8 with the disturbance u(t) = sin(2t). Then we plotted the root locus of the transfer function F(s) in (1.28), which gives the roots of d1(s), or equivalently, the poles of G(s) in (1.27), as m2 varies from 0 to infinity; see the upper graph in Figure 7.4. The × on the root locus corresponds to m2 = 0. As m2 moves from 0 to infinity, the roots move toward each other and split off. Then the imaginary parts of two of the roots head toward infinity, while the other two roots move to zero. Moreover, as m2 leaves 0, two of the roots of d1(s) move toward the imaginary axis, while the other two roots move away from the imaginary axis. So as a compromise, we choose m2 = 0.28 and k2 = m2ω² = 1.12. In this case, the roots of d1(s), or equivalently, the poles of G(s), are given by {−0.4 ± 1.96i, −0.35 ± 1.97i}. The bottom graph in Figure 7.4 plots the disturbance u(t) = sin(2t) and the output y(t) on the same graph. (The lsim command in Matlab was used to generate this plot.) As expected, the output y(t) converges to zero, which is the zero steady state response.
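These pole locations are easy to verify. The following Python/NumPy sketch (a cross-check, not part of the notes) builds d1(s) for this numerical example and confirms both the absorber condition and the quoted roots:

```python
import numpy as np

m1, c, k1, w = 2.0, 3.0, 8.0, 2.0    # the numerical example
m2 = 0.28
k2 = m2*w**2                         # absorber tuning: k2 = m2 ω² = 1.12

# d1(s) = m1 s^4 + c s^3 + (m1 ω² + m2 ω² + k1) s² + c ω² s + k1 ω²
d1 = [m1, c, m1*w**2 + m2*w**2 + k1, c*w**2, k1*w**2]
roots = np.roots(d1)

assert np.isclose(k2, 1.12)
assert abs((1j*w)**2 + w**2) < 1e-12   # numerator of G vanishes at s = iω
assert np.all(roots.real < 0)          # d1 is a stable polynomial
# real parts of the quoted poles {-0.4 ± 1.96i, -0.35 ± 1.97i}
assert np.allclose(np.sort(roots.real), [-0.4, -0.4, -0.35, -0.35], atol=0.02)
print(np.round(np.sort_complex(roots), 2))
```

In fact, d1(s)/2 factors exactly as (s² + 0.8s + 4)(s² + 0.7s + 4) here, which is why the real parts come out to exactly −0.4 and −0.35.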

Finally, it is noted that Figure 7.3 and some of our other mechanical drawings were made by implementing \usepackage{tikz} and following an open internet code due to A. Stacey and others.

7.1.3 A mass spring damper identification example

[Figure: a single mass m attached to ground through a spring k and damper c; u is the applied force and q the position of the mass.]

Figure 7.5: A mass spring damper system

Consider the scalar mass spring damper system

\[ m\ddot{q} + c\dot{q} + kq = u \tag{1.29} \]

where m is the mass, c is the damping constant and k is the spring constant. As expected, q(t) is the position of the mass, and u(t) is the input force; see Figure 7.5. Moreover, we assume that mq̈ + cq̇ + kq = u is stable, or equivalently, m > 0, c > 0 and k > 0. The system identification area tries to find the system parameters by performing an experiment on the system and collecting the corresponding data. In our case, we are looking to experimentally determine m, c, and k. To find the constants m, c, and k, assume that one applies two different inputs to mq̈ + cq̇ + kq = u and measures the steady state position of q(t):

• If u(t) = 1, then lim_{t→∞} q(t) = 1/3.

• If u(t) = cos(t), then qss(t) = (1/√2) cos(t − π/4).

Then using this data find m, c, and k. We claim that

\[ m = 2 \quad \text{and} \quad c = 1 \quad \text{and} \quad k = 3. \tag{1.30} \]

To see this, first notice that we have three unknowns m, c, and k, while the steady state conditions determine three constants: 1/3, 1/√2 and −π/4. In other words, we are looking for three equations in the three unknowns m, c, and k. Then hopefully we can find m, c, and k. The transfer function G(s) from u to q is given by

\[ G(s) = \frac{Q(s)}{U(s)} = \frac{1}{ms^2 + cs + k}. \]

Because the transfer function G(s) is stable, the final value theorem states that when the input u(t) = 1, the steady state output

\[ \frac{1}{k} = G(0) = \lim_{t\to\infty} q(t) = \frac{1}{3}. \]

Therefore the spring constant k = 3.

If the input u(t) = cos(t), then the steady state response

\[ \frac{1}{\sqrt{2}} \cos\left(t - \frac{\pi}{4}\right) = q_{ss}(t) = |G(i)| \cos(t + \mathrm{angle}(G(i))). \]

Thus |G(i)| = 1/√2 and angle(G(i)) = −π/4. Hence G(i) = (1/√2)e^{−iπ/4} = 1/2 − i/2. Observe that

\[ \frac{1 - i}{2} = G(i) = \frac{1}{ms^2 + cs + 3} \bigg|_{s=i} = \frac{1}{3 - m + ic}. \]

By moving 3 − m + ic to the other side, we have

\[ 3 - m + ic = \frac{2}{1 - i} = 1 + i. \]

By matching the real and imaginary parts, we see that m = 2 and c = 1.
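The identified parameters can be checked against both measurements directly. A Python/NumPy sketch (for illustration):

```python
import numpy as np

m, c, k = 2.0, 1.0, 3.0              # the identified parameters (1.30)

def G(s):
    return 1.0/(m*s**2 + c*s + k)    # G(s) = 1/(ms² + cs + k)

assert np.isclose(G(0), 1/3)                  # u = 1     =>  q(∞) = G(0) = 1/3
assert np.isclose(abs(G(1j)), 1/np.sqrt(2))   # u = cos t =>  amplitude 1/√2
assert np.isclose(np.angle(G(1j)), -np.pi/4)  # phase −π/4
print("m = 2, c = 1, k = 3 reproduce both steady-state measurements")
```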

REMARK 7.1.4 One can also consider the input u(t) = 1 + cos(t) to our previous mass spring damper system mq̈ + cq̇ + kq = u. Then the corresponding steady state response is given by

\[ q_{ss}(t) = \frac{1}{3} + \frac{1}{\sqrt{2}} \cos\left(t - \frac{\pi}{4}\right). \]

Due to linearity, the input u(t) = 1 yields q(∞) = 1/3, while the input u(t) = cos(t) gives a steady state response of qss(t) = (1/√2) cos(t − π/4). In this setting, one simply repeats the above analysis and arrives at m = 2, c = 1 and k = 3; see also Problem 10 below. Finally, it is noted that this approach requires one to find the constant 1/3 and the sinusoid (1/√2) cos(t − π/4) from the steady state response qss(t) = 1/3 + (1/√2) cos(t − π/4).

7.1.4 Exercise

Problem 1. Consider the transfer function

\[ G(s) = \frac{18s(s-3)}{(s+1)(s+2)}. \tag{1.31} \]

Is this transfer function stable? Find the magnitude $|G(i\omega)|$ and angle $\phi(i\omega)$ for $G(i\omega)$. Plot $|G(i\omega)|$ in Matlab. Find the steady state response $y_{ss}$ for the input $u(t) = -3 + 2\sin(4t) - 6\cos(7t)$. Use the lsim command in Matlab to plot your results.

Problem 2. Consider the transfer function

\[ G(s) = \frac{20(s-4)}{(3+s)(s^2 + 2s + 5)}. \]

Is this transfer function stable? Find the magnitude $|G(i\omega)|$ and angle $\phi(i\omega)$ for $G(i\omega)$. Find the steady state response $y_{ss}$ for the input $u(t) = 4 + 2\sin(5t)$. Use the lsim command in Matlab to plot your results.


360 CHAPTER 7. AN INTRODUCTION TO FILTERING THEORY

Problem 3. Find the steady state solution to the following differential equation:

\[ y^{(3)} + 3\ddot{y} + 12\dot{y} + 10y = 100 - 150\sin(2t) + 120\cos(3t). \]

Use the lsim command in Matlab to plot your results.

Problem 4. Consider the polynomial $p(s) = s^2 + bs + c$ where b and c are real numbers. Show that all the roots of p are in the open left half plane $\{s : \Re s < 0\}$ if and only if b > 0 and c > 0. Hint: the quadratic formula.

Problem 5. Consider the transfer function

\[ G(s) = \frac{s^2 + z}{(s+1)(s+2)}. \]

Find z such that the steady state response y equals zero for the input $u(t) = \gamma\cos(\omega_0 t + \varphi)$. Plot $|G(i\omega)|$ in Matlab for $\omega_0 = 3$.

Problem 6. Consider the scalar stable mass spring damper system

\[ m\ddot{q} + c\dot{q} + kq = u \tag{1.32} \]

where m is the mass, c is the damping constant and k is the spring constant. Moreover, q is the position of the mass; see Figure 7.5. (By stable we mean that m > 0, c > 0 and k > 0.) To find the constants m, c, and k, assume that one applies two different inputs to $m\ddot{q} + c\dot{q} + kq = u$ and measures the position q(t):

• If $u(t) = u_0$, then $\lim_{t\to\infty} q(t) = r$.

• If $u(t) = \gamma\cos(\omega t)$, then $q_{ss}(t) = a\cos(\omega t + \varphi)$.

Then given the results r, a and $\varphi$ of these experiments, discuss how to find m, c, and k.

Problem 7. Consider the scalar stable mass spring damper system

\[ m\ddot{q} + c\dot{q} + kq = u \tag{1.33} \]

where m is the mass, c is the damping constant and k is the spring constant. Finally, q is the position of the mass; see Figure 7.5. To find the constants m, c, and k, assume that one applies the input $u(t) = \cos(t) + \cos(2t)$ to $m\ddot{q} + c\dot{q} + kq = u$ and measures the position

\[ q_{ss}(t) = \frac{1}{2\sqrt{2}}\cos\left(t - \frac{3\pi}{4}\right) + 0.0854\cos(2t - 2.7928). \]

Find m, c, and k.

Problem 8. Consider the scalar stable mass spring damper system

\[ m\ddot{q} + c\dot{q} + kq = u \tag{1.34} \]

where m is the mass, c is the damping constant and k is the spring constant. Finally, q is the position of the mass; see Figure 7.5. To find the constants m, c, and k, assume that one applies the input

\[ u(t) = \gamma_1\cos(\omega_1 t) + \gamma_2\cos(\omega_2 t) \]

to $m\ddot{q} + c\dot{q} + kq = u$ and measures the position

\[ q_{ss}(t) = a_1\cos(\omega_1 t + \varphi_1) + a_2\cos(\omega_2 t + \varphi_2). \]

Assume that the frequencies $\omega_1 \neq \omega_2$. Then given the results $a_1$, $a_2$, $\varphi_1$ and $\varphi_2$ of this experiment, discuss how to find m, c, and k.

Problem 9. Consider the mass spring damper system

\[ \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} \ddot{q}_1 \\ \ddot{q}_2 \end{bmatrix} + \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} \dot{q}_1 \\ \dot{q}_2 \end{bmatrix} + \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} = \begin{bmatrix} 7 \\ 3 \end{bmatrix} u. \]

(i) Is this system stable? Explain why or why not.

(ii) Find the transfer functions $G_1(s) = \frac{Q_1(s)}{U(s)}$ and $G_2(s) = \frac{Q_2(s)}{U(s)}$.

(iii) Convert this mass spring damper system to a state space system of the form $\dot{x} = Ax + Bu$ where the state

\[ x = \begin{bmatrix} q_1 \\ q_2 \\ \dot{q}_1 \\ \dot{q}_2 \end{bmatrix}. \]

(iv) If $u(t) = 1 - 2\cos(t)$, then find the steady state response $x_{ss}(t)$. Notice that $x_{ss}(t)$ is a vector of length 4.

Problem 10. Consider the mass spring damper system

\[ \begin{bmatrix} m & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \ddot{q}_1 \\ \ddot{q}_2 \end{bmatrix} + \begin{bmatrix} c & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} \dot{q}_1 \\ \dot{q}_2 \end{bmatrix} + \begin{bmatrix} k & -1 \\ -1 & k_2 \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} u \tag{1.35} \]

where the mass m, the damping c and the spring constants k and $k_2$ are unknown. Finally, $q_1(t)$ and $q_2(t)$ are the respective positions of the two masses, and u(t) is the input. To find the constants m, c, k and $k_2$, assume that one applies the input

\[ u(t) = 3 + 2\cos(t) \]

to (1.35) and measures the corresponding steady state response

\[ q_{1,ss}(t) = 2 + \sin(t), \qquad q_{2,ss}(t) = 1 + \sin(t). \]

Find m, c, k and $k_2$.



Problem 11. Consider the resistor, capacitor and inductor circuit in Figure 7.6, where the input u is the voltage source and the output y is the voltage across the capacitor. Assume that the input $u(t) = 12\cos(t)$ and the steady state output is

\[ y_{ss}(t) = \sin(t) - \cos(t). \]

Find the transfer function $\frac{Y(s)}{U(s)} = G(s)$. Does this transfer function uniquely determine R, C and L? Explain why or why not.

[Circuit diagram: a voltage source u in series with a resistor R and an inductor L, with the capacitor C across the output voltage y.]

Figure 7.6: A resistor, inductor and capacitor circuit

Problem 12. Consider the periodic $2\pi$ sawtooth wave given by

\[ u(t) = t - 2\pi j \quad\text{if } 2j\pi \le t < 2(j+1)\pi, \text{ where } j \text{ is an integer}. \]

It is noted that u(t) is the $2\pi$ periodic extension of the function t in $L^2(0, 2\pi)$. Show that the Fourier series for u(t) is given by

\[ u(t) = \pi - 2\sum_{k=1}^{\infty} \frac{\sin(kt)}{k}. \]

Consider the transfer function

\[ \frac{Y(s)}{U(s)} = G(s) = \frac{8(s-1)}{s^3 + 4s^2 + 8s + 8}. \]

Find the poles for G(s). Is G(s) stable? The steady state response is given by

\[ y_{ss}(t) = G(0)\pi - 2\sum_{k=1}^{\infty} \frac{|G(ik)| \sin(kt + \phi(ik))}{k} \]

where $G(i\omega) = |G(i\omega)| e^{i\phi(i\omega)}$. In Matlab plot u(t), the output y(t) and $y_{ss}(t)$ all on the same graph over the interval $[0, 8\pi]$.

Hint: consider the sawtooth command in Matlab (u = pi*sawtooth(t) + pi).

Problem 13. Consider the state space system $\dot{x} = Ax + Bu$ where A is a stable matrix. Assume that the input $u(t) = \gamma\cos(\omega t + \varphi)$. Then show that

\[ x(t) = e^{At}\Big( x(0) + \left(\omega^2 I + A^2\right)^{-1}\big(AB\gamma\cos(\varphi) - \omega B\gamma\sin(\varphi)\big) \Big) + \left(\omega^2 I + A^2\right)^{-1}\big(\omega B\gamma\sin(\omega t + \varphi) - AB\gamma\cos(\omega t + \varphi)\big). \tag{1.36} \]



In particular, the steady state response is given by

\[ x_{ss}(t) = \left(\omega^2 I + A^2\right)^{-1}\big(\omega B\gamma\sin(\omega t + \varphi) - AB\gamma\cos(\omega t + \varphi)\big). \tag{1.37} \]

Problem 14. Consider the input $u(t) = \cos(\omega t)$ where the frequency $\omega > 0$. Find a stable transfer function G(s) such that $y_{ss}(t) = \sin(\omega t)$. Hint: $G(s) = \frac{a-s}{a+s}$.

[Diagram: three masses $m_1$, $m_2$, $m_3$ with positions $q_1$, $q_2$, $q_3$; springs $k_1$, $k_2$, $k_3$ and dampers $c_1$, $c_2$ couple the masses, and the input force u acts on $m_1$.]

Figure 7.7: A mass spring damper system

Problem 15. Consider the mass spring damper system presented in Figure 7.7. The position of the mass $m_j$ is denoted by $q_j$ for j = 1, 2, 3. Moreover, the input force u(t) is on the first mass $m_1$. The equations of motion for this mass spring damper system are given by the following second order matrix differential equation:

\[ \begin{bmatrix} m_1 & 0 & 0 \\ 0 & m_2 & 0 \\ 0 & 0 & m_3 \end{bmatrix} \begin{bmatrix} \ddot{q}_1 \\ \ddot{q}_2 \\ \ddot{q}_3 \end{bmatrix} + \begin{bmatrix} c_1 + c_2 & -c_2 & 0 \\ -c_2 & c_2 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} \dot{q}_1 \\ \dot{q}_2 \\ \dot{q}_3 \end{bmatrix} + \begin{bmatrix} k_1 + k_2 & -k_2 & 0 \\ -k_2 & k_2 + k_3 & -k_3 \\ 0 & -k_3 & k_3 \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} u. \]

Assume that the masses $\{m_1, m_2, m_3\}$, the damping coefficients $\{c_1, c_2\}$ and the spring constants $\{k_1, k_2, k_3\}$ are all strictly positive.

(i) Use Theorem 6.2.11 in Chapter 6 to show that this mass spring damper system is stable.

(ii) Show that the transfer function from the input force u to the output $y = q_2$ is of the form

\[ \frac{Q_2(s)}{U(s)} = G(s) = \frac{(m_3 s^2 + k_3)(c_2 s + k_2)}{\det[Ms^2 + \Phi s + K]}. \]

Here M, $\Phi$ and K are the corresponding mass matrix, damping matrix and spring matrix. Notice that $G(i\omega) = 0$ when $k_3 = m_3\omega^2$. In particular, if $u(t) = \gamma\sin(\omega t)$, then $y_{ss}(t) = 0$ when $k_3 = m_3\omega^2$. (In this case, $\omega = \sqrt{k_3/m_3}$ is the natural frequency for the differential equation $m_3\ddot{x} + k_3 x = 0$.)



(iii) Assume that the masses $\{m_1, m_2\}$, the damping coefficients $\{c_1, c_2\}$ and the spring constants $\{k_1, k_2\}$ are all fixed. Then our problem is to choose a mass $m_3$ and spring constant $k_3$ to eliminate the vibration on the second mass $m_2$, that is, $y_{ss}(t) = 0$, for the disturbance $u(t) = \gamma\sin(\omega t)$. To achieve a zero steady state output, one chooses $k_3 = m_3\omega^2$. However, the choice of $m_3$ affects the poles of G(s), and thus the transient response. So one would like to choose $m_3$ and $k_3 = m_3\omega^2$ to have the "best" transient response possible.

Now consider the case when $m_1 = 1$, $m_2 = 2$, $c_1 = 2$, $c_2 = 3$, $k_1 = 6$, $k_2 = 8$ with $\omega = 2$ and the disturbance $u(t) = \sin(2t)$. Choosing $k_3 = m_3\omega^2$ guarantees that the steady state response $y_{ss}(t) = 0$ for $u(t) = \sin(2t)$. By choosing $k_3 = m_3\omega^2$ with $\omega = 2$, plot the poles of G(s) as $m_3$ varies from 0.01 to 100 on the same graph. Notice that for each $m_3$ there are 6 poles of G(s). Place an x on the graph for the poles of G(s) corresponding to the case when $m_3 = 0.01$. Finally, Remark 6.3.1 in Chapter 6 may be useful. (The root locus technique is not necessary.)

(iv) By searching through the poles of G(s) when $k_3 = m_3\omega^2$ and $m_3$ varies from 0.01 to 100, choose a mass $m_3$ with $k_3 = m_3\omega^2$ such that y(t) achieves steady state in 20 seconds or less. (Hint: search for a set of poles for G(s) whose real part is as far in the left half plane as possible, without a "large" imaginary part, and then find the corresponding mass $m_3$.) Compute the poles of G(s) for your choice of $m_3$ and $k_3 = m_3\omega^2$. For this $m_3$ and $k_3$ use the lsim command in Matlab to plot u(t) and y(t) on the same graph.

7.2 The steady state response and G(iω)

The results in this section are not used in the rest of the notes and can be skipped by the uninterested reader. Let $G(s) = \frac{Y(s)}{U(s)}$ be a stable transfer function with input u and output y. In this section, we will present a method to recover $\{G(i\omega_k)\}_1^\eta$ from the steady state response. To be specific, consider the input $u(t) = \sum_1^\eta a_k e^{i\omega_k t}$. Without loss of generality, we always assume that the frequencies $\{\omega_k\}_1^\eta$ are distinct and $a_k \neq 0$ for $k = 1, 2, \cdots, \eta$. By consulting (1.7) in Proposition 7.1.1, the corresponding steady state response is given by

\[ y_{ss}(t) = \sum_{k=1}^{\eta} a_k G(i\omega_k) e^{i\omega_k t} \quad\text{when}\quad u(t) = \sum_{k=1}^{\eta} a_k e^{i\omega_k t}. \tag{2.1} \]

Assume that both the input u(t) and the steady state response $y_{ss}(t)$ are known, while the transfer function G(s) is unknown. Motivated by the results in Section 3.4.2, let us present a sampling method to compute $\{G(i\omega_k)\}_1^\eta$. First assume that we have collected the steady state data $y_{ss}(t)$ for t over an interval $[\tau_1, \tau_2]$ of length $\tau = \tau_2 - \tau_1$. Let $t_j = \frac{j\tau}{\nu}$ for $j = 0, 1, 2, \cdots, \nu - 1$ be $\nu$ equally spaced points over the interval $[0, \tau]$. Throughout we assume that $\eta \le \nu$. (In fact, $\eta \ll \nu$ is even better.) Moreover, we always assume that all the frequencies $\{\omega_k\}_1^\eta$ are contained in the Nyquist sampling range, that is,

\[ |\omega_k| < \frac{\pi\nu}{\tau} \qquad (\text{for } k = 1, 2, \cdots, \eta). \tag{2.2} \]


7.2. THE STEADY STATE RESPONSE AND G(iω) 365

As in Section 3.4.2 in Chapter 3, consider the matrix

\[ T = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ e^{i\omega_1 t_1} & e^{i\omega_2 t_1} & \cdots & e^{i\omega_\eta t_1} \\ e^{i\omega_1 t_2} & e^{i\omega_2 t_2} & \cdots & e^{i\omega_\eta t_2} \\ \vdots & \vdots & & \vdots \\ e^{i\omega_1 t_{\nu-1}} & e^{i\omega_2 t_{\nu-1}} & \cdots & e^{i\omega_\eta t_{\nu-1}} \end{bmatrix} : \mathbb{C}^\eta \to \mathbb{C}^\nu. \tag{2.3} \]

Because $\eta \le \nu$ (and the frequencies are distinct and in the Nyquist sampling range), the matrix T is one to one. Let $\Lambda$ on $\mathbb{C}^\eta$ be the diagonal matrix determined by $\{a_k e^{i\omega_k \tau_1}\}_1^\eta$, that is,

\[ \Lambda = \begin{bmatrix} a_1 e^{i\omega_1\tau_1} & 0 & \cdots & 0 & 0 \\ 0 & a_2 e^{i\omega_2\tau_1} & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & a_{\eta-1} e^{i\omega_{\eta-1}\tau_1} & 0 \\ 0 & 0 & \cdots & 0 & a_\eta e^{i\omega_\eta\tau_1} \end{bmatrix} : \mathbb{C}^\eta \to \mathbb{C}^\eta. \tag{2.4} \]

Since $a_k \neq 0$ for all k, the diagonal matrix $\Lambda$ is invertible. Let $\Gamma = T\Lambda$, that is,

\[ \Gamma = T\Lambda = \begin{bmatrix} a_1 e^{i\omega_1(\tau_1+t_0)} & a_2 e^{i\omega_2(\tau_1+t_0)} & \cdots & a_\eta e^{i\omega_\eta(\tau_1+t_0)} \\ a_1 e^{i\omega_1(\tau_1+t_1)} & a_2 e^{i\omega_2(\tau_1+t_1)} & \cdots & a_\eta e^{i\omega_\eta(\tau_1+t_1)} \\ a_1 e^{i\omega_1(\tau_1+t_2)} & a_2 e^{i\omega_2(\tau_1+t_2)} & \cdots & a_\eta e^{i\omega_\eta(\tau_1+t_2)} \\ \vdots & \vdots & & \vdots \\ a_1 e^{i\omega_1(\tau_1+t_{\nu-1})} & a_2 e^{i\omega_2(\tau_1+t_{\nu-1})} & \cdots & a_\eta e^{i\omega_\eta(\tau_1+t_{\nu-1})} \end{bmatrix} : \mathbb{C}^\eta \to \mathbb{C}^\nu. \tag{2.5} \]

Since T is one to one and $\Lambda$ is invertible, $\Gamma$ is also one to one. By consulting (2.1), we see that

\[ y_{ss}(\tau_1 + t_j) = \sum_{k=1}^{\eta} a_k e^{i\omega_k(\tau_1 + t_j)} G(i\omega_k) \qquad (\text{for } j = 0, 1, 2, \cdots, \nu - 1). \tag{2.6} \]

By rewriting this equation in matrix form, we have

\[ \begin{bmatrix} y_{ss}(\tau_1 + t_0) \\ y_{ss}(\tau_1 + t_1) \\ \vdots \\ y_{ss}(\tau_1 + t_{\nu-1}) \end{bmatrix} = \Gamma \begin{bmatrix} G(i\omega_1) \\ G(i\omega_2) \\ \vdots \\ G(i\omega_\eta) \end{bmatrix}. \tag{2.7} \]

If the matrix equation $b = Mx$ has a solution, then $M^* b = M^* M x$. If in addition M is one to one, then $M^* M$ is invertible, and thus $x = (M^* M)^{-1} M^* b$. Because $\Gamma$ is one to one, we obtain the solution for $\{G(i\omega_k)\}_1^\eta$ that we have been looking for:

\[ \begin{bmatrix} G(i\omega_1) \\ G(i\omega_2) \\ \vdots \\ G(i\omega_\eta) \end{bmatrix} = (\Gamma^*\Gamma)^{-1}\Gamma^* \begin{bmatrix} y_{ss}(\tau_1 + t_0) \\ y_{ss}(\tau_1 + t_1) \\ \vdots \\ y_{ss}(\tau_1 + t_{\nu-1}) \end{bmatrix}; \tag{2.8} \]
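The recipe (2.3)–(2.8) fits in a few lines. In the Python/NumPy sketch below (standing in for Matlab), the transfer function, frequencies, coefficients, and sampling window are all illustrative assumptions; the steady state samples are synthesized directly from (2.7) and then inverted by the pseudo inverse.

```python
import numpy as np

# Toy setup (assumptions, not from the text): G(s) = 1/(s + 1),
# frequencies 0, 1, -1 with nonzero coefficients a_k.
omegas = np.array([0.0, 1.0, -1.0])
a = np.array([1.0, 0.5 + 0.5j, 0.5 - 0.5j])
G_true = 1.0 / (1j * omegas + 1.0)

tau1, tau = 20.0, 10.0                 # data window [tau1, tau1 + tau]
nu = 64
tj = np.arange(nu) * tau / nu          # nu equally spaced sample times

# Gamma from (2.5): Gamma[j, k] = a_k * exp(i w_k (tau1 + t_j))
Gamma = a[None, :] * np.exp(1j * np.outer(tau1 + tj, omegas))
yp = Gamma @ G_true                    # steady state samples, per (2.7)

g_hat = np.linalg.pinv(Gamma) @ yp     # (2.8): (Gamma* Gamma)^{-1} Gamma* yp
err = np.linalg.norm(g_hat - G_true)
```

With noiseless samples the recovery is exact up to round-off, and the frequencies here sit well inside the Nyquist range $|\omega_k| < \pi\nu/\tau$.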



see also Theorem 3.4.1 in Chapter 3. Finally, it is noted that $(\Gamma^*\Gamma)^{-1}\Gamma^*$ is the Moore-Penrose pseudo inverse of $\Gamma$. (The Matlab command is pinv.)

If the measurement for the output y(t) contains noise, then there may be no solution to the matrix equation in (2.7). To be more specific, consider the vectors

\[ y_p = \begin{bmatrix} y_{ss}(\tau_1 + t_0) \\ y_{ss}(\tau_1 + t_1) \\ \vdots \\ y_{ss}(\tau_1 + t_{\nu-1}) \end{bmatrix} \quad\text{and}\quad v = \begin{bmatrix} v_0 \\ v_1 \\ \vdots \\ v_{\nu-1} \end{bmatrix} \quad\text{and}\quad g = \begin{bmatrix} G(i\omega_1) \\ G(i\omega_2) \\ \vdots \\ G(i\omega_\eta) \end{bmatrix}. \tag{2.9} \]

Here $y_p$ is a vector in $\mathbb{C}^\nu$ obtained by sampling $\{y_{ss}(\tau_1 + t_j)\}_0^{\nu-1}$, and g is the vector in $\mathbb{C}^\eta$ determined by $\{G(i\omega_k)\}_1^\eta$, while v is a vector in $\mathbb{C}^\nu$ representing the measurement error in sampling $\{y_{ss}(\tau_1 + t_j)\}_0^{\nu-1}$. It is noted that $y_p = \Gamma g$, and thus $g = (\Gamma^*\Gamma)^{-1}\Gamma^* y_p$. Now assume that $y_{dat} = y_p + v$ is the vector obtained by sampling $\{y_{ss}(\tau_1 + t_j)\}_0^{\nu-1}$ in the presence of additive noise $\{v_j\}_0^{\nu-1}$. Then $\widehat{g} = (\Gamma^*\Gamma)^{-1}\Gamma^* y_{dat}$ is an estimate for g. If there is no noise ($v = 0$), or perfect data collection, then $y_{dat} = y_p$, and thus $\widehat{g} = g$ yields $\{G(i\omega_k)\}_1^\eta$. However, in general $v \neq 0$. To see how accurate the estimate $\widehat{g}$ for g is, let us compute the distance between $\widehat{g}$ and g, or the error in estimation, that is,

\[ \|\widehat{g} - g\| = \|(\Gamma^*\Gamma)^{-1}\Gamma^*(y_p + v) - (\Gamma^*\Gamma)^{-1}\Gamma^* y_p\| = \|(\Gamma^*\Gamma)^{-1}\Gamma^* v\|. \]

Here λmin is the smallest eigenvalue for Γ∗Γ. Furthermore,√λmin is also the smallest singular

value for Γ, and thus, the matrix norm ‖(Γ∗Γ)−1Γ∗‖ = 1√λmin

. Because Γ is one to one, Γ∗Γis a strictly positive matrix, and hence, all of the eigenvalues for Γ∗Γ are strictly positive.

Equation (2.10) shows that ‖v‖√λmin

is an upper bound for the error ‖g− g‖. In particular,

if ‖v‖√λmin

≈ 0, then g ≈ g. It is emphasized that ‖v‖√λmin

is an upper bound, and ‖v‖√λmin

could

be large, while g ≈ g. For example, if v is in the null space of Γ∗, then g = g independent of‖v‖√λmin

. Ruffly speaking, λmin is an indicator of how large the noise v can be and still yield a

reasonable estimate g for g. For example, if the eigenvalue λmin is large, then we can toleratemore noise and still obtain a good estimate g for g. On the other hand, when λmin is small,estimating g from g can be problematic. Finally, it is noted that the upper bound ‖v‖√

λminin

(2.10) is achieved by choosing v = Γξ, where ξ is the eigenvector for Γ∗Γ corresponding tothe eigenvalue λmin. This is a consequence of the singular value decomposition of a matrix.For further results on singular values see [3, 16] or any standard text on the singular valuedecomposition of a matrix.

In summary, to recover $\{G(i\omega_k)\}_1^\eta$ from the input $u(t) = \sum_1^\eta a_k e^{i\omega_k t}$ and output y(t), simply wait until y(t) reaches steady state at some time $\tau_1$. Then $y_{ss}(t) \approx y(t)$ for all $t \ge \tau_1$. Now let $y_{dat}$ be the vector obtained by sampling $\{y(\tau_1 + t_j)\}_0^{\nu-1}$ over the interval $[\tau_1, \tau_2]$ of length $\tau$. If all the frequencies $\{\omega_k\}_1^\eta$ are in the Nyquist sampling range and $\eta \le \nu$, then



$\widehat{g} = (\Gamma^*\Gamma)^{-1}\Gamma^* y_{dat}$ yields an estimate for $\{G(i\omega_k)\}_1^\eta$. Finally, it is noted that the data has to be relatively clean for this method to work. Moreover, on real physical systems one could also have nonlinearities like Coulomb friction, which can lead to further problems in data collection and analysis.

If $\omega_k = m_k \omega_0$ for all $k = 1, 2, \cdots, \eta$ where $m_k$ is an integer and $\omega_0 = \frac{2\pi}{\tau}$, then all the columns of T are orthogonal with norm $\sqrt{\nu}$. In fact, the columns of T are the corresponding columns of the discrete Fourier transform matrix $F_\nu$ on $\mathbb{C}^\nu$. In this case, the columns of $\Gamma$ are also orthogonal. Hence $\Gamma^*\Gamma$ is a diagonal matrix, and thus the inverse $(\Gamma^*\Gamma)^{-1}$ is trivial to compute. Finally, $\lambda_{min}$ is simply the smallest number on the diagonal of $\Gamma^*\Gamma$.
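The orthogonality claim can be checked directly. In the Python/NumPy sketch below, the window length and integer multiples $m_k$ are illustrative assumptions; the Gram matrix $T^*T$ should come out as $\nu I$.

```python
import numpy as np

# When w_k = m_k * w0 with w0 = 2*pi/tau, the columns of T from (2.3) are
# orthogonal with norm sqrt(nu).
tau, nu = 10.0, 32
w0 = 2 * np.pi / tau
mk = np.array([0, 1, 3, -2])            # integer multiples (an assumption)
tj = np.arange(nu) * tau / nu

T = np.exp(1j * np.outer(tj, mk * w0))  # T[j, k] = exp(i w_k t_j)
gram = T.conj().T @ T                   # should equal nu * I
off_diag = np.max(np.abs(gram - nu * np.eye(len(mk))))
```

Each column is a sampled complex exponential at an integer multiple of $2\pi/\tau$, i.e. a column of the $\nu$-point DFT matrix, so the Gram matrix is exactly diagonal.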

One could also use the Gram matrix with Hilbert space techniques to compute $\{G(i\omega_k)\}_1^\eta$; see [3, 15] and the references therein. However, this approach is beyond the scope of these notes.

An example

[Figure: top panel, the lsim linear simulation results for u(t) and y(t) over 0 to 50 seconds; bottom panel, $y_p$ and $y_{dat}$ plotted against $25 + t_j$ for $j = 0, 1, 2, \ldots, 999$.]

Figure 7.8: The graph of u(t) and y(t) with $y_p$ and $y_{dat}$

For an example, consider the stable transfer function

\[ G(s) = \frac{30s + 24}{s^3 + 7s^2 + 20s + 24}. \]

The poles of G(s) are $\{-3, -2 \pm 2i\}$, and thus G(s) is stable. Consider the input

\[ u(t) = 2 + 2\cos(t) - 4\sin(t) + 6\cos(2.5t) - 8\sin(3t) + 4\cos(4t + \pi/4). \]

Notice that u(t) can also be written in the form $\sum_1^\eta a_k e^{i\omega_k t}$ where $\eta = 9$. In fact, using Euler's identity $e^{i\theta} = \cos(\theta) + i\sin(\theta)$, we have

\[ u(t) = 2e^{i0t} + (1+2i)e^{it} + 3e^{2.5it} + 4ie^{3it} + 2e^{i\pi/4}e^{4it} + (1-2i)e^{-it} + 3e^{-2.5it} - 4ie^{-3it} + 2e^{-i\pi/4}e^{-4it}. \]



This is precisely the form of u(t) we need in our algorithm. In this setting,

\[ \omega_1 = 0,\ \omega_2 = 1,\ \omega_3 = 2.5,\ \omega_4 = 3,\ \omega_5 = 4,\qquad \omega_6 = -1,\ \omega_7 = -2.5,\ \omega_8 = -3,\ \omega_9 = -4. \]

Let us compute $\{G(i\omega_k)\}_1^9$. In Matlab, we used the lsim command with 2000 points for t over the interval [0, 50] to compute y(t). The graph of the input u(t) and the output y(t) is presented in the top plot of Figure 7.8. By consulting the graph for y(t) in Figure 7.8, we see that y(t) certainly achieves steady state in twenty five seconds. So we set $y_{ss}(t) = y(t)$ for $25 \le t \le 50$. Here we choose $\nu = 1000$. Clearly, all of our frequencies are in the Nyquist sampling range. Using Matlab, we computed

\[ \begin{bmatrix} G(0) \\ G(i) \\ G(2.5i) \\ G(3i) \\ G(4i) \end{bmatrix} = \begin{bmatrix} 1.0000 + 0.0000i \\ 1.5046 + 0.0831i \\ 1.3388 - 1.4674i \\ 0.7793 - 1.6483i \\ -0.0240 - 1.3680i \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1.0000 + 0.0000i \\ 1.5045 + 0.0831i \\ 1.3383 - 1.4669i \\ 0.7789 - 1.6475i \\ -0.0240 - 1.3669i \end{bmatrix} = \Pi_5(\Gamma^*\Gamma)^{-1}\Gamma^* y_p. \]

The notation $\Pi_5(\Gamma^*\Gamma)^{-1}\Gamma^* y_p$ picks out the first five components of the vector $(\Gamma^*\Gamma)^{-1}\Gamma^* y_p$ in $\mathbb{C}^9$ corresponding to $\{\omega_k\}_1^5$. Since $G(-i\omega) = \overline{G(i\omega)}$, we did not display $\{G(i\omega_k)\}_6^9$. Our estimate $(\Gamma^*\Gamma)^{-1}\Gamma^* y_p$ is essentially identical to $\{G(i\omega_k)\}_1^9$. In fact, let g be the vector in $\mathbb{C}^9$ determined by $\{G(i\omega_k)\}_1^9$; see (2.9). Then we have

\[ \|g - (\Gamma^*\Gamma)^{-1}\Gamma^* y_p\|_{\mathbb{C}^9} = 0.0022 \quad\text{and}\quad \|g\|_{\mathbb{C}^9} = 4.8810. \]

Therefore $g \approx (\Gamma^*\Gamma)^{-1}\Gamma^* y_p$, as expected.

Next we set $y_{dat} = y_p + v$ where v is a vector in $\mathbb{C}^{1000}$ taken from Matlab's mean zero Gaussian random number generator with standard deviation 3. (In Matlab we set the seed in the random number generator to rng(879321).) The graph of $y_p$ and $y_{dat}$ is presented in the bottom plot of Figure 7.8. It is emphasized that $y_p$ and $y_{dat}$ are vectors in $\mathbb{C}^{1000}$. In the bottom graph of Figure 7.8, we scaled the x axis over the interval [25, 50] to represent the time where the sampling occurred. Using Matlab we computed

\[ \begin{bmatrix} G(0) \\ G(i) \\ G(2.5i) \\ G(3i) \\ G(4i) \end{bmatrix} = \begin{bmatrix} 1.0000 + 0.0000i \\ 1.5046 + 0.0831i \\ 1.3388 - 1.4674i \\ 0.7793 - 1.6483i \\ -0.0240 - 1.3680i \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 0.9850 + 0.0000i \\ 1.4597 + 0.0743i \\ 1.3138 - 1.4348i \\ 0.7799 - 1.6398i \\ -0.0163 - 1.3445i \end{bmatrix} = \Pi_5(\Gamma^*\Gamma)^{-1}\Gamma^* y_{dat}. \]

Our estimate $(\Gamma^*\Gamma)^{-1}\Gamma^* y_{dat}$ is fairly close to $\{G(i\omega_k)\}_1^9$. Moreover, $\sqrt{\lambda_{min}} = 62.872$ and $\|v\| = 93.8176$. Finally, it is noted that the error

\[ 0.0957 = \|\widehat{g} - g\|_{\mathbb{C}^9} \le \frac{\|v\|}{\sqrt{\lambda_{min}}} = 1.4922. \]

In this example, the upper bound $\frac{\|v\|}{\sqrt{\lambda_{min}}} = 1.4922$ is far from the actual error $\|\widehat{g} - g\| = 0.0957$.



The Matlab commands we used to plot Figure 7.8 and compute $(\Gamma^*\Gamma)^{-1}\Gamma^* y_p$ are given by

    t = (0:1999)'*50/2000;
    num = [30, 24]; den = [1, 7, 20, 24]; G = tf(num, den);
    u = 2 + 2*cos(t) - 4*sin(t) + 6*cos(2.5*t) - 8*sin(3*t) + 4*cos(4*t + pi/4);
    subplot(2,1,1); lsim(G,u,t); grid; y = lsim(G,u,t);   % note t(1001) = 25
    yss = y(1001:2000); ts = t(1001:2000);
    T = [(1+2i)*exp(i*ts), 3*exp(2.5*i*ts), 4i*exp(3*i*ts)];
    T = [T, 2*exp(i*pi/4)*exp(4*i*ts)];
    T = [T, conj(T)]; T = [2*ones(size(ts)), T]; x = pinv(T)*yss;

The commands in Matlab we used to compute $\{G(i\omega_k)\}$ and compare this to our estimate $(\Gamma^*\Gamma)^{-1}\Gamma^* y_p$ are given by

    w = [0; 1; 2.5; 3; 4]; w = [w; -w(2:5)];
    g = polyval(num, i*w)./polyval(den, i*w);

\[ \begin{bmatrix} g(1{:}5) & x(1{:}5) \end{bmatrix} = \begin{bmatrix} 1.0000 + 0.0000i & 1.0000 + 0.0000i \\ 1.5046 + 0.0831i & 1.5045 + 0.0831i \\ 1.3388 - 1.4674i & 1.3383 - 1.4669i \\ 0.7793 - 1.6483i & 0.7789 - 1.6475i \\ -0.0240 - 1.3680i & -0.0240 - 1.3669i \end{bmatrix} \]

    norm(g - x) = 0.0022; norm(g) = 4.8810.

The Matlab commands we used to finish the plot in Figure 7.8 and compute $(\Gamma^*\Gamma)^{-1}\Gamma^* y_{dat}$ are given by

    rng(879321)   % This sets the seed for the random number generator.
    v = 3*randn(size(ts)); yd = yss + v; subplot(2,1,2);
    plot(ts, yd); grid; hold on; plot(ts, yss, 'r');
    xlabel('25 + t_j for j = 0, 1, 2, ..., 999'); ylabel('Amplitude');
    title('The graph of y_p and y_{dat}'); xd = pinv(T)*yd;

\[ \begin{bmatrix} g(1{:}5) & xd(1{:}5) \end{bmatrix} = \begin{bmatrix} 1.0000 + 0.0000i & 0.9850 + 0.0000i \\ 1.5046 + 0.0831i & 1.4597 + 0.0743i \\ 1.3388 - 1.4674i & 1.3138 - 1.4348i \\ 0.7793 - 1.6483i & 0.7799 - 1.6398i \\ -0.0240 - 1.3680i & -0.0163 - 1.3445i \end{bmatrix} \]

    norm(g - xd) = 0.0957; norm(v)/min(svd(T)) = 1.4922; norm(v) = 93.8176;
    [sqrt(min(eig(T'*T))), min(svd(T))] = [62.8720  62.8720].



The mass spring damper revisited

As in Section 7.1.3, consider the mass spring damper system

\[ m\ddot{q} + c\dot{q} + kq = u \tag{2.11} \]

where q is the position of the mass. Let $G(s) = \frac{Q(s)}{U(s)}$ be the transfer function from u to q. Throughout we assume that m > 0, c > 0 and k > 0. In particular, the transfer function G(s) is stable. Assume that m, c and k are unknown, and we want to determine the mass, spring and damper experimentally. To this end, consider a sinusoidal input u(t) of the form

\[ u(t) = u_0 + \sum_{\ell=1}^{n} \big(\alpha_\ell \cos(\omega_\ell t) + \beta_\ell \sin(\omega_\ell t)\big) = u_0 + \sum_{\ell=1}^{n} \left( \frac{\alpha_\ell - i\beta_\ell}{2}\, e^{i\omega_\ell t} + \frac{\alpha_\ell + i\beta_\ell}{2}\, e^{-i\omega_\ell t} \right). \]

(Since the spring constant is k, we used $\ell$ for the subscript.) In this case, the steady state response is given by

\[ q_{ss}(t) = G(0)u_0 + \sum_{\ell=1}^{n} \alpha_\ell |G(i\omega_\ell)| \cos\big(\omega_\ell t + \arg(G(i\omega_\ell))\big) + \sum_{\ell=1}^{n} \beta_\ell |G(i\omega_\ell)| \sin\big(\omega_\ell t + \arg(G(i\omega_\ell))\big) \]
\[ = G(0)u_0 + \sum_{\ell=1}^{n} \left( \frac{(\alpha_\ell - i\beta_\ell)\, G(i\omega_\ell)}{2}\, e^{i\omega_\ell t} + \frac{(\alpha_\ell + i\beta_\ell)\, \overline{G(i\omega_\ell)}}{2}\, e^{-i\omega_\ell t} \right). \]
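The equivalence between the real sinusoid form and the complex exponential form of u(t) above is a plain identity. The Python/NumPy check below uses arbitrary assumed coefficients and frequencies to confirm the two expressions agree pointwise.

```python
import numpy as np

# Assumed coefficients and frequencies (illustrative only).
alphas = np.array([1.0, -2.0])
betas = np.array([0.5, 3.0])
ws = np.array([1.0, 2.5])
u0 = 4.0
t = np.linspace(0.0, 10.0, 200)

u_real = u0 + sum(a * np.cos(w * t) + b * np.sin(w * t)
                  for a, b, w in zip(alphas, betas, ws))
u_complex = u0 + sum((a - 1j * b) / 2 * np.exp(1j * w * t)
                     + (a + 1j * b) / 2 * np.exp(-1j * w * t)
                     for a, b, w in zip(alphas, betas, ws))
max_diff = np.max(np.abs(u_real - u_complex))
```

The complex form pairs each frequency with its conjugate term, so the imaginary parts cancel and the two signals coincide to machine precision.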

Throughout we assume that the frequencies $\{\omega_\ell\}_1^n$ are all positive and distinct. Moreover, $\alpha_\ell + i\beta_\ell$ is nonzero for all $\ell = 1, 2, \cdots, n$. Here $\eta = 2n + 1$ (or $\eta = 2n$ if $u_0 = 0$). One can use our previous algorithm to compute $\{G(0), G(i\omega_\ell)\}_1^n$ (or $\{G(i\omega_\ell)\}_1^n$ if $u_0 = 0$). Now let $W_1$ be the matrix mapping $\mathbb{C}^2$ into $\mathbb{C}^{n+1}$ and $g_1$ the vector in $\mathbb{C}^{n+1}$ defined by

\[ W_1 = \begin{bmatrix} u_0 & 0 \\ 1 & -\omega_1^2 \\ 1 & -\omega_2^2 \\ \vdots & \vdots \\ 1 & -\omega_n^2 \end{bmatrix} : \mathbb{C}^2 \to \mathbb{C}^{n+1} \quad\text{and}\quad g_1 = \begin{bmatrix} u_0/G(0) \\ \Re\left(\frac{1}{G(i\omega_1)}\right) \\ \Re\left(\frac{1}{G(i\omega_2)}\right) \\ \vdots \\ \Re\left(\frac{1}{G(i\omega_n)}\right) \end{bmatrix}. \]

(If $u_0 = 0$, then the first row of $W_1$ and the first entry of $g_1$ are zero and can be eliminated. In this case, $W_1$ maps $\mathbb{C}^2$ into $\mathbb{C}^n$ and $g_1$ is a vector in $\mathbb{C}^n$.) Let $W_2$ and $g_2$ be the vectors in $\mathbb{C}^n$ defined by

\[ W_2 = \begin{bmatrix} \omega_1 \\ \omega_2 \\ \vdots \\ \omega_n \end{bmatrix} \quad\text{and}\quad g_2 = \begin{bmatrix} \Im\left(\frac{1}{G(i\omega_1)}\right) \\ \Im\left(\frac{1}{G(i\omega_2)}\right) \\ \vdots \\ \Im\left(\frac{1}{G(i\omega_n)}\right) \end{bmatrix}. \]



The matrix $W_1$ is one to one whenever $n \ge 2$ (or $u_0 \neq 0$ and $n \ge 1$). In this setting, the mass, spring and damper constants are given by

\[ \begin{bmatrix} k \\ m \end{bmatrix} = (W_1^* W_1)^{-1} W_1^* g_1 \quad\text{and}\quad c = \frac{W_2^* g_2}{\|W_2\|^2}. \tag{2.12} \]

To see this, notice that for any frequency $\omega$, we have

\[ G(i\omega) = \left.\frac{1}{ms^2 + cs + k}\right|_{s=i\omega} = \frac{1}{k - m\omega^2 + ic\omega}. \]

Because m, c and k are strictly positive, $k - m\omega^2 + ic\omega$ is nonzero for all $\omega$. Hence $G(i\omega)$ is nonzero. Thus $k - m\omega^2 + ic\omega = \frac{1}{G(i\omega)}$. By matching the real and imaginary parts, we have

\[ \begin{bmatrix} 1 & -\omega^2 \end{bmatrix} \begin{bmatrix} k \\ m \end{bmatrix} = \Re\left(\frac{1}{G(i\omega)}\right) \quad\text{and}\quad \omega c = \Im\left(\frac{1}{G(i\omega)}\right). \]

By writing this out for each $\omega_\ell$, with $G(0) = \frac{1}{k}$, we obtain the equations

\[ W_1 \begin{bmatrix} k \\ m \end{bmatrix} = g_1 \quad\text{and}\quad W_2 c = g_2. \tag{2.13} \]

Because $W_1$ and $W_2$ are both one to one, we arrive at our solution for m, c and k in (2.12); see also Theorem 3.4.1. Finally, it is noted that if the computed m, c or k is not strictly positive, then obviously something has gone wrong.

7.2.1 Exercise

Problem 1. Consider the transfer function $G(s) = \frac{Y(s)}{U(s)}$ given by

\[ G(s) = \frac{s^3 - 6s^2 + 11s - 6}{s^5 + 8s^4 + 28s^3 + 58s^2 + 67s + 30}. \]

Assume that the input is

\[ u(t) = 2 - \sum_{k=1}^{8} 4\sin\left(\frac{kt}{2} + k\right). \]

Here $\omega_k = \frac{k-1}{2}$ for $k = 1, 2, \cdots, 9$ and $\omega_{10} = -\omega_2$, $\omega_{11} = -\omega_3$, $\cdots$, $\omega_{17} = -\omega_9$. In Matlab set t = (0:1999)'*40/2000.

(i) Use the lsim command in Matlab to plot u(t) and y(t) on the same graph.

(ii) Use the steady state response $y_{ss}(t)$ for $20 \le t \le 40$ with $\nu = 1000$ to estimate $\{G(i\omega_k)\}_1^{17}$. In other words, compute $(\Gamma^*\Gamma)^{-1}\Gamma^* y_p$ and compare this to the actual value of g, that is, compute $\|g - (\Gamma^*\Gamma)^{-1}\Gamma^* y_p\|$ and $\|g\|$; see (2.8) and (2.9). Print out $\{G(i\omega_k)\}_1^9$ and $\Pi_9(\Gamma^*\Gamma)^{-1}\Gamma^* y_p$.



(iii) Set the random number seed in Matlab equal to rng(949318) and v = randn(1000, 1)/3. Set $y_{dat} = y_p + v$. Plot $y_p$ and $y_{dat}$ on the same graph with the x axis scaled to [20, 40].

(iv) Compute $\widehat{g} = (\Gamma^*\Gamma)^{-1}\Gamma^* y_{dat}$ and compare this calculation to the actual value of g, that is, compute $\|\widehat{g} - g\|$. Compute $\|v\|$ and $\sqrt{\lambda_{min}}$ and $\frac{\|v\|}{\sqrt{\lambda_{min}}}$. Print out $\{G(i\omega_k)\}_1^9$ and $\Pi_9(\Gamma^*\Gamma)^{-1}\Gamma^* y_{dat}$.

Problem 2. Let M be a one to one matrix mapping $\mathbb{C}^\eta$ into $\mathbb{C}^\nu$. Let $\lambda$ be an eigenvalue for $M^* M$ with corresponding eigenvector $\xi$. Set $v = M\xi$. Then show that

\[ \|(M^* M)^{-1} M^* v\| = \frac{\|v\|}{\sqrt{\lambda}}. \]

In particular, this holds for $\lambda_{min}$, the smallest eigenvalue for $M^* M$.

Problem 3. Consider the mass spring damper system

\[ m\ddot{q} + c\dot{q} + kq = u \]

where m = 4, c = 2 and k = 1. Let $G(s) = \frac{Q(s)}{U(s)} = \frac{1}{ms^2 + cs + k}$ be the transfer function from the input force u(t) to the position q(t) of the mass. Consider the input

\[ u(t) = 1 + 4\cos(t) - 2\sin(t) + 2\cos(2t) - 4\sin(2t) + 2\cos\left(3t + \frac{\pi}{4}\right) - 4\sin(3t). \]

Use the lsim command in Matlab with $0 \le t \le 40$ and the corresponding steady state response to numerically compute $G(i\omega_k)$ for $\omega_k = 0, 1, 2, 3$; see Section 7.2. How many samples $\nu$ of $y_{ss}(t)$ did you use? Then using $\{G(i\omega_k)\}$ with (2.12), find m, c and k.

7.3 Ideal filters

In this section we will present a brief introduction to filtering theory. The ideal low pass, high pass, band pass and band stop filters will be given. A filter is a transfer function which is designed to pass certain frequencies and reject others. A low pass filter is a transfer function which passes signals with low frequencies and rejects signals with high frequencies. For example, consider the ideal low pass filter G defined by

\[ G(i\omega) = \begin{cases} 1 & \text{if } -\lambda_0 \le \omega \le \lambda_0 \\ 0 & \text{otherwise.} \end{cases} \tag{3.1} \]

Here $\lambda_0 > 0$ is the cutoff angular frequency. Then the steady state response $y_{ss}$ to the input $u(t) = \gamma e^{i\omega t}$ is given by

\[ y_{ss}(t) = \begin{cases} \gamma e^{i\omega t} & \text{if } -\lambda_0 \le \omega \le \lambda_0 \\ 0 & \text{otherwise.} \end{cases} \tag{3.2} \]


7.3. IDEAL FILTERS 373

Notice that if $u(t) = \sum_k a_k e^{i\omega_k t}$, then the corresponding steady state response is given by

\[ y_{ss}(t) = \sum_{|\omega_k| \le \lambda_0} a_k e^{i\omega_k t}. \tag{3.3} \]

In other words, the ideal low pass filter in (3.1) passes all sinusoids whose angular frequency lies in $[-\lambda_0, \lambda_0]$ and rejects all other sinusoids. Clearly, $\alpha\cos(\omega t) + \beta\sin(\omega t)$ is a linear combination of sinusoids of the form $e^{i\omega t}$ and $e^{-i\omega t}$. Hence the steady state response of G to

\[ u(t) = \sum_k \big(\alpha_k\cos(\omega_k t) + \beta_k\sin(\omega_k t)\big) \]

is given by

\[ y_{ss}(t) = \sum_{|\omega_k| \le \lambda_0} \big(\alpha_k\cos(\omega_k t) + \beta_k\sin(\omega_k t)\big). \tag{3.4} \]

The filter in (3.1) is called an ideal filter because the inverse Laplace transform g of G does not exist. Recall that the Laplace transform of a function of exponential order is analytic in some region of the complex plane. Notice that if G(s) is an analytic function such that $G(i\omega)$ is given by (3.1), then G(s) is zero on a line, and thus G = 0. This contradicts the fact that G is nonzero. Therefore the inverse Laplace transform g of G does not exist.

We say that G is a stable rational filter if G(s) is a stable proper rational function. According to the results in Section 5.6 in Chapter 5, one can build a circuit consisting of resistors, capacitors and operational amplifiers to implement any proper rational function. So we can build a circuit to realize any stable rational filter. However, the ideal low pass filter in (3.1) is not a stable proper rational function. In other words, there is no stable proper rational transfer function G such that $G(i\omega)$ is given by (3.1). Hence one cannot build the ideal low pass filter in (3.1). So the idea behind designing a low pass filter is to find a stable proper rational function G which approximates the ideal low pass filter. Then we can construct a circuit to implement G.
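The action of the ideal low pass filter in (3.3) amounts to a selection rule on the frequency components. The Python/NumPy sketch below (cutoff, frequencies, and amplitudes are all illustrative assumptions) applies that rule to a finite sum of complex exponentials.

```python
import numpy as np

# Ideal low pass: keep every component with |w| <= lambda_0, drop the rest.
lam0 = 2.0                                  # cutoff angular frequency (assumed)
omegas = np.array([0.5, 1.5, 3.0, -4.0])    # component frequencies (assumed)
a = np.array([1.0, 2.0, 3.0, 4.0])          # component amplitudes a_k (assumed)

keep = np.abs(omegas) <= lam0               # pass band mask

def u(t):
    return np.sum(a * np.exp(1j * omegas * t))

def yss(t):
    # Steady state output per (3.3): only in-band components survive.
    return np.sum(a[keep] * np.exp(1j * omegas[keep] * t))

kept_count = int(np.sum(keep))
```

Here only the components at 0.5 and 1.5 rad/s pass, so $y_{ss}(0) = 1 + 2 = 3$ while $u(0) = 10$.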

[Circuit diagram: a voltage source u in series with a resistor R, with the capacitor C across the output voltage y.]

Figure 7.9: A resistor and capacitor low pass circuit

For an example of a simple stable low pass filter, consider the transfer function

\[ G(s) = \frac{1}{1 + s/\omega_c} \tag{3.5} \]

where $\omega_c > 0$ is a specified constant. Since $-\omega_c$ is the only pole of G(s) and $-\omega_c < 0$, the filter G is stable. Clearly, the impulse response $g(t) = \omega_c e^{-\omega_c t}$ for all $t \ge 0$. Moreover, one can implement G by using a resistor and capacitor in an electrical circuit. To see this, consider the circuit in Figure 7.9 where the resistor has R ohms and the capacitor has C farads. The input is the voltage source u and the output y is the voltage across the capacitor. The transfer function for this circuit is given by

\[ G(s) = \frac{1}{1 + CRs}. \]

By setting $\omega_c = 1/CR$, we arrive at the transfer function in (3.5). In other words, the transfer function in (3.5) can be implemented by using a resistor and a capacitor.

Now let us show that the transfer function G in (3.5) can be used as a low pass filterwith cutoff angular frequency ωc. To this end, observe that

|G(iω)| = 1√1 + ω2/ω2

c

and φ(iω) = − arctan(ω/ωc) . (3.6)

Obviously, |G(iω)| is symmetric about the y axis. This also follows from the fact that g(t) is a real valued function. If the input u(t) = γe^{iωt} is fed into this filter, then the steady state output is given by

yss(t) = γe^{i(ωt − arctan(ω/ωc))}/√(1 + ω²/ωc²) .

Notice that yss(t) ≈ γe^{iωt} when ω is small, and yss(t) ≈ 0 when ω is large. So G acts like a low pass filter. The parameter ωc = 1/CR determines the range of low frequencies that the filter passes: the larger ωc is, the wider this range becomes.

The maximum of |G(iω)| occurs at ω = 0, that is, 1 = |G(0)| ≥ |G(iω)|. Moreover, |G(iω)| decreases monotonically to zero as |ω| tends to infinity. To verify that |G(iω)| is decreasing, observe that the derivative of |G(iω)| is negative for all ω > 0, that is,

d|G(iω)|/dω = (−ω/ωc²)/(1 + ω²/ωc²)^{3/2} < 0   (ω > 0) .

Notice that |G(iωc)| = 1/√2 ≈ 0.7071. Hence 1/√2 ≤ |G(iω)| ≤ 1 when |ω| ≤ ωc. So the filter keeps at least 70% of the amplitude of the input signal γe^{iωt} when the angular frequency |ω| ≤ ωc. Observe that |G(iω)| is less than 10% of its maximum 1 = G(0) when the frequencies are greater than 10ωc, that is, |G(iω)| < 0.1 ≈ |G(i10ωc)| when |ω| > 10ωc. So this filter strongly attenuates all high frequencies. Finally, it is noted that Figure 7.10 presents a graph of |G(iω)| for ωc = 10.
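These magnitude claims are easy to verify numerically. Here is a small Python sketch (an illustration; the text itself uses Matlab) checking the 1/√2 value at the cutoff, the 10% bound beyond 10ωc, and the symmetry of |G(iω)|.

```python
import math

wc = 10.0  # cutoff angular frequency (the value used for Figure 7.10)

def mag(w):
    """|G(iw)| for G(s) = 1/(1 + s/wc), from equation (3.6)."""
    return 1.0 / math.sqrt(1.0 + (w / wc) ** 2)

# At the cutoff the gain is 1/sqrt(2) ~ 0.7071 ...
assert abs(mag(wc) - 1.0 / math.sqrt(2.0)) < 1e-12
# ... and beyond 10*wc the gain has dropped below 10% of its peak value 1.
assert mag(10.0 * wc) < 0.1
# |G(iw)| is even in w, as expected for a real impulse response.
assert abs(mag(-25.0) - mag(25.0)) < 1e-12
```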

An ideal high pass filter. A high pass filter is a transfer function which passes through high frequency signals and rejects low frequency signals. For example, consider the ideal high pass filter defined by

H(iω) = 1 if |ω| > λ0
      = 0 otherwise. (3.7)


7.3. IDEAL FILTERS 375

Figure 7.10: The graph of |G(iω)| = 10/(ω² + 100)^{1/2} plotted against the angular frequency ω

Here 0 < λ0 is a cutoff angular frequency. Consider any sinusoid input u(t) = γe^{iωt}. Then the corresponding steady state output is given by

yss(t) = γe^{iωt} if |ω| > λ0
       = 0 otherwise. (3.8)

Notice that if u(t) = ∑_k ak e^{iωk t}, then the corresponding steady state output is given by

yss(t) = ∑_{|ωk| > λ0} ak e^{iωk t} . (3.9)

In other words, the ideal high pass filter in (3.7) passes through all sinusoids whose angular frequency has absolute value strictly greater than λ0 and rejects all the other sinusoids. In particular,

yss(t) = ∑_{|ωk| > λ0} (αk cos(ωk t) + βk sin(ωk t))   (if u(t) = ∑_k (αk cos(ωk t) + βk sin(ωk t))) .

Finally, it is noted that the ideal high pass filter H(iω) = 1 − G(iω) where G is the ideallow pass filter in (3.1).


The inverse Laplace transform h of the ideal high pass filter H does not exist. In other words, one cannot build the ideal high pass filter in (3.7). Later we will see how one can approximate the ideal high pass filter by a stable proper rational function.

An ideal band pass filter. A band pass filter is a transfer function which passes through signals in a certain frequency range or band and rejects all signals outside of this frequency band. For example, consider the ideal band pass filter defined by

B(iω) = 1 if λ1 ≤ |ω| ≤ λ2
      = 0 otherwise. (3.10)

Here 0 ≤ λ1 < λ2 and {ω : |ω| ∈ [λ1, λ2]} is the band or the range of angular frequencies that the filter accepts. Consider any sinusoid input u(t) = γe^{iωt}. Then the steady state output is given by

yss(t) = γe^{iωt} if λ1 ≤ |ω| ≤ λ2
       = 0 otherwise. (3.11)

Notice that if u(t) = ∑_k ak e^{iωk t}, then the corresponding steady state response is given by

yss(t) = ∑_{λ1 ≤ |ωk| ≤ λ2} ak e^{iωk t} . (3.12)

In other words, the ideal band pass filter in (3.10) passes through all sinusoids whose angular frequency is in the band {ω : |ω| ∈ [λ1, λ2]} and rejects all other sinusoids. In particular,

yss(t) = ∑_{λ1 ≤ |ωk| ≤ λ2} (αk cos(ωk t) + βk sin(ωk t))   (if u(t) = ∑_k (αk cos(ωk t) + βk sin(ωk t))) .

The inverse Laplace transform b of the ideal band pass filter B does not exist. In particular, the ideal band pass filter cannot be realized by a stable rational function. In other words, one cannot build the ideal band pass filter in (3.10). Later we will see how one can approximate the ideal band pass filter by a stable proper rational function.

An ideal band stop filter. A band stop filter is a transfer function which rejects signals in a certain frequency range or band and passes through signals in all the other frequency ranges. For example, consider the ideal band stop filter defined by

Q(iω) = 0 if λ1 ≤ |ω| ≤ λ2
      = 1 otherwise. (3.13)

Here 0 ≤ λ1 < λ2 and {ω : |ω| ∈ [λ1, λ2]} is the band or the range of angular frequencies that the filter rejects. Consider any sinusoid input u(t) = γe^{iωt}. Then the steady state output is given by

yss(t) = 0 if λ1 ≤ |ω| ≤ λ2
       = γe^{iωt} otherwise. (3.14)


Notice that if u(t) = ∑_k ak e^{iωk t}, then the corresponding steady state response is given by

yss(t) = ∑_{|ωk| < λ1, λ2 < |ωk|} ak e^{iωk t} . (3.15)

In other words, the ideal band stop filter in (3.13) rejects all sinusoids whose angular frequency is in the band {ω : |ω| ∈ [λ1, λ2]} and accepts all other sinusoids. In particular, if the input

u(t) = ∑_k (αk cos(ωk t) + βk sin(ωk t)) ,

then the corresponding steady state output is

yss(t) = ∑_{|ωk| < λ1, λ2 < |ωk|} (αk cos(ωk t) + βk sin(ωk t)) .

Finally, it is noted that the ideal band stop filter Q(iω) = 1 − B(iω), where B is the ideal band pass filter in equation (3.10).

The inverse Laplace transform q of the ideal band stop filter Q does not exist. Hence, one cannot build the ideal band stop filter in (3.13). Later we will see how one can approximate the ideal band stop filter by a stable proper rational function.
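All four ideal filters act on a sum of sinusoids in the same way: each component e^{iωk t} is either kept or dropped according to a 0/1 mask that depends only on |ωk|. The following Python sketch (the cutoff values are illustrative assumptions) encodes the four masks and checks the complement relations H = 1 − G and Q = 1 − B noted above.

```python
# Each ideal filter multiplies the coefficient of e^{i*w*t} by a 0/1 mask
# that depends only on |w|.  The cutoffs lam0, lam1, lam2 are illustrative.
lam0, lam1, lam2 = 2.0, 3.5, 5.5

low  = lambda w: 1 if abs(w) <= lam0 else 0            # ideal low pass (3.1)
high = lambda w: 1 if abs(w) > lam0 else 0             # ideal high pass (3.7)
band = lambda w: 1 if lam1 <= abs(w) <= lam2 else 0    # ideal band pass (3.10)
stop = lambda w: 0 if lam1 <= abs(w) <= lam2 else 1    # ideal band stop (3.13)

# High pass is the complement of low pass; band stop of band pass.
for w in (-6.0, -4.0, -1.0, 0.0, 1.0, 4.0, 6.0):
    assert high(w) == 1 - low(w)
    assert stop(w) == 1 - band(w)

# Components of u(t) = sum_k a_k e^{i w_k t} that survive the band pass:
freqs = [0.5, 2.0, 4.0, 6.0]
kept = [w for w in freqs if band(w)]
assert kept == [4.0]
```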

7.3.1 Exercise

Problem 1. Consider the periodic square wave u with period 2π given by

u(t) = 1 if 0 ≤ t ≤ π
     = 0 if π < t < 2π , (3.16)

and u(t) = u(t + 2π) for all t. Let G be the ideal low pass filter defined in (3.1) where λ0 = 2.5. Then find the corresponding steady state output yss. Plot yss(t) in Matlab. Hint: compute the Fourier series for u.

Problem 2. Consider the stable rational low pass filter given by G(s) = 1/(1 + s/ωc) where ωc > 0. Choose the parameter ωc to approximate the ideal low pass filter with λ0 = 2.5. Let u be the periodic square wave with period 2π given by (3.16). Then plot the corresponding steady state output yss in Matlab. How does this compare to the output of the ideal low pass filter in Problem 1? Hint: notice that G(s) is the transfer function for the system ẏ = −ωc y + ωc u. So yss is the steady state response for the differential equation ẏ = −ωc y + ωc u driven by the square wave in (3.16).

Problem 3. Consider the periodic square wave u with period 2π given by (3.16). Let B be the ideal band pass filter defined in (3.10) where λ1 = 3.5 and λ2 = 5.5. Then find the corresponding steady state output yss. Plot yss(t) in Matlab.

Problem 4. Consider the stable rational filter given by

G(s) = s/(1 + s/ωc)   (where ωc > 0) . (3.17)

Discuss why G can be used as a high pass filter with cutoff frequency ωc. Design a circuit consisting of a resistor and a capacitor such that G is the transfer function for this circuit.


7.4 Bode plots

In this section we will introduce the Bode plot. Then we will use the Bode plot to design some simple stable rational filters. If G(s) is a transfer function, then the magnitude Bode plot of G is the plot of 20 log |G(iω)| vs log ω. This plot is done on a log-log graph. The angle Bode plot of G is the plot of angle(G(iω)) vs log ω. This plot is done on a semi-log graph. Finally, it is noted that G(s) is usually the Laplace transform of a real valued function g. Hence |G(iω)| is symmetric about the y axis, that is, |G(iω)| = |G(−iω)|. So the magnitude Bode plot is symmetric about the y axis. For this reason we only plot the Bode plot for angular frequencies ω > 0.

To introduce the magnitude Bode plot, consider the function G(s) = 1 + s/z where z is a real number. Notice that −z is a zero of G. In this case,

G(iω) = 1 + iω/z . (4.1)

Now assume that ω ≥ 0. Then computing 20 log |G(iω)| we arrive at

20 log |1 + iω/z| ≈ 0 if 0 ≤ ω << |z|
                  = 10 log(2) if ω = |z| (4.2)
                  ≈ 20 log |ω| − 20 log |z| if ω >> |z| .

Now let f(ω) = 20 log |G(iω)| and x = log |ω|. Substituting this into (4.2) yields

f ≈ 0 if 0 ≤ ω << |z|
  = 10 log(2) if ω = |z| (4.3)
  ≈ 20x − 20 log |z| if ω >> |z| .

Notice that the graph of f vs x is zero until ω = |z| and is a line with slope 20 for ω >> |z|. The graph of f vs x is equivalent to a log-log plot of 20 log |G(iω)| vs log ω, which is the magnitude Bode plot for G. This log-log plot is zero until ω = |z| and is a line with slope 20 for ω >> |z|.

Now consider the function H(s) = 1/(1 + s/p) where p is a real number. In this case, −p is a pole of H(s). In most applications H(s) is a stable transfer function. In this case, p > 0. As before, assume that ω ≥ 0. Then 20 log |H(iω)| is given by

20 log |(1 + iω/p)^{−1}| = −20 log |1 + iω/p| ≈ 0 if ω << |p|
                                              = −10 log(2) if ω = |p| (4.4)
                                              ≈ −20 log |ω| + 20 log |p| if ω >> |p| .

Now let f(ω) = 20 log |H(iω)| and x = log |ω|. Substituting this into (4.4) yields

f ≈ 0 if 0 ≤ ω << |p|
  = −10 log(2) if ω = |p|
  ≈ −20x + 20 log |p| if ω >> |p| .


7.4. BODE PLOTS 379

Notice that the graph of f vs x is zero until ω = |p| and is a line with slope −20 for ω >> |p|. The graph of f vs x is equivalent to a log-log plot of 20 log |H(iω)| vs log ω, which is the magnitude Bode plot for H. This log-log plot is zero until ω = |p| and is a line with slope −20 for ω >> |p|. Finally, it is noted that at the cutoff frequency ω = |p|, the magnitude |H(ip)| = 1/√2. This fact plays a role in designing stable rational filters.
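The corner values and asymptotic slopes quoted in (4.2) and (4.4) can be checked numerically. A Python sketch (z = 10 is an illustrative choice):

```python
import math

def dB(x):
    return 20.0 * math.log10(x)

z = 10.0  # location parameter of the factor 1 + s/z (an illustrative value)

def factor_dB(w):
    """20 log10 |1 + iw/z|, the contribution of a numerator factor."""
    return dB(math.hypot(1.0, w / z))

# At w = z the factor contributes 10 log10(2) ~ 3.01 dB ...
assert abs(factor_dB(z) - 10.0 * math.log10(2.0)) < 1e-9
# ... far below z it is ~ 0 dB, and far above z it climbs 20 dB per decade.
assert abs(factor_dB(z / 100.0)) < 0.001
assert abs((factor_dB(1000.0 * z) - factor_dB(100.0 * z)) - 20.0) < 0.01
# The matching pole factor 1/(1 + s/p) contributes the negative of this.
assert abs(-factor_dB(50.0) - dB(1.0 / math.hypot(1.0, 50.0 / z))) < 1e-9
```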

Figure 7.11: The Bode plot (magnitude in dB and phase in degrees vs frequency in rad/sec) for G(s) = (s + 1)/((1 + s/10)(1 + s/100))

One can combine the previous analysis to sketch the magnitude Bode plot for any rational function with real poles and zeros. For example, consider the stable rational filter

G(s) = (s + 1)/((1 + s/10)(1 + s/100)) . (4.5)

Notice that we have expressed the transfer function G in time constant form, that is, when sketching the Bode plot it is convenient to write the factors in the numerator and denominator in the form 1 + s/α. The three important angular frequencies in this transfer function are 1, 10 and 100. Clearly, G(iω) is given by

G(iω) = (1 + iω)/((1 + iω/10)(1 + iω/100)) .


Notice that

log |G(iω)| = log |1 + iω| − log |1 + iω/10| − log |1 + iω/100| .

As before, assume that ω ≥ 0. By consulting (4.2) and (4.4), we arrive at

20 log |G(iω)| ≈ 0 if ω << 1
               = γ1 if ω = 1
               ≈ 20 log |ω| + γ2 if 1 << ω << 10
               = γ3 if ω = 10
               ≈ 20 log |ω| − 20 log |ω| + γ4 = γ4 if 10 << ω << 100
               = γ5 if ω = 100
               ≈ −20 log |ω| + γ6 if ω >> 100 .

Here γ1, . . . , γ6 are constants. The Bode plot of G in (4.5) is given in Figure 7.11. The above analysis shows that the magnitude Bode plot of G is zero until ω approaches the first cutoff angular frequency 1. Because 1 + s is in the numerator, the log-log plot has slope 20 in the region 1 << ω << 10. The next cutoff angular frequency is 10. Since 1 + s/10 is in the denominator, this adds a slope of −20 for ω >> 10. Hence the magnitude Bode plot has slope zero (20 − 20), or equivalently, is constant in the region 10 << ω << 100. The final cutoff angular frequency is 100. Since 1 + s/100 is in the denominator, this adds a slope of −20 for ω >> 100. Thus the magnitude Bode plot has slope −20 for ω >> 100. In other words, a numerator term of the form 1 + s/α adds a slope of 20 once ω is greater than |α|, while a denominator term of the form 1 + s/β adds a slope of −20 once ω is greater than |β|.
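As a numerical sanity check of this piecewise description, the following Python sketch evaluates |G(iω)| for the filter (4.5) in each of the three regions:

```python
import math

def mag(w):
    """|G(iw)| for G(s) = (s + 1)/((1 + s/10)(1 + s/100)), from (4.5)."""
    num = math.hypot(1.0, w)
    den = math.hypot(1.0, w / 10.0) * math.hypot(1.0, w / 100.0)
    return num / den

dB = lambda x: 20.0 * math.log10(x)

# Below the first corner frequency 1 the plot sits near 0 dB.
assert abs(dB(mag(0.01))) < 0.01
# Between 10 and 100 the plot is roughly flat (the +20 and -20 slopes cancel).
assert abs(dB(mag(30.0)) - dB(mag(50.0))) < 1.0
# Above 100 the plot falls at about -20 dB per decade.
assert abs((dB(mag(1.0e5)) - dB(mag(1.0e4))) + 20.0) < 0.1
```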

Now let us use our previous analysis to design a stable rational band pass filter with cutoff frequencies 1 and 10. To this end, consider the transfer function given by

G(s) = s²/((1 + s)²(1 + s/10)²) . (4.6)

Since all the poles of G(s) are in the open left half plane, G is a stable filter. The Bode plot for G in (4.6) is given in Figure 7.12. Notice that 0 is a double zero of G(s). So the magnitude plot has slope 40 = 2 × 20 starting at ω = 0. The double pole at −1 adds a slope of −40 = 2 × (−20) to the magnitude plot for ω >> 1. Hence the magnitude plot has slope 40 − 40 = 0, or equivalently, is constant in the region 1 << ω << 10. The double pole at −10 adds a slope of −40 once ω >> 10. Since 0 − 40 = −40, the magnitude plot has slope −40 for ω >> 10. Finally, it is noted that this is not really a very good band pass filter. One can obtain better band pass filters by using poles and zeros with nonzero imaginary parts.

7.4.1 Exercise

Problem 1. Consider the stable filter given by

G(s) = −s(1 − s/10)/((1 + s)(1 + s/100)) .


Figure 7.12: The Bode plot (magnitude in dB and phase in degrees vs frequency in rad/sec) for G(s) = s²/((1 + s)²(1 + s/10)²)

Plot the Bode plot for G in Matlab and explain what happens in each region.

Problem 2. Consider the stable filter given by

G(s) = (s − 1)/((s + 1)(1 + s/10)) .

Plot the Bode plot for G in Matlab and explain what happens in each region. Compare the Bode plot of G to the Bode plot of H(s) = 1/(1 + s/10). What is the difference between G and H?

Problem 3. Consider the stable filter given by

G(s) = (s − 1)(s − 10)/((s + 1)(s + 10)) .

Plot the Bode plot for G in Matlab and explain what happens in each region.

Problem 4. Consider the transfer function given by

G(s) = 6(s + 1)/((1 − s/10)(1 + s/100)) .


Plot the Bode plot for G in Matlab. Then find a stable filter H with the same magnitude plot as G. In other words, find a stable filter H such that |G(iω)| = |H(iω)| for all ω.

7.5 Natural frequencies and damping ratios

In this section we will study the magnitude Bode plot for stable filters which have poles with a nontrivial imaginary part. To this end, consider the transfer function given by

G(s) = ωn²/(s² + 2ζωn s + ωn²) = 1/(s²/ωn² + 2ζs/ωn + 1) . (5.1)

Here ζ is called the damping ratio and ωn is the natural frequency. Throughout we assume that ζ is a real number, |ζ| < 1, and ωn > 0. This guarantees that the poles of G(s) have a nontrivial imaginary part. Notice that the poles of G are the roots of the quadratic polynomial λ² + 2ζωn λ + ωn² = 0. Using the quadratic formula along with |ζ| < 1, we obtain

λ = (−2ζωn ± √(4ζ²ωn² − 4ωn²))/2 = −ζωn ± iωn √(1 − ζ²) . (5.2)
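A quick numerical check of (5.2) and of the claim |λ| = ωn (with illustrative values of ζ and ωn):

```python
import cmath
import math

zeta, wn = 0.3, 10.0   # illustrative damping ratio (|zeta| < 1) and natural frequency

# Roots of lambda^2 + 2*zeta*wn*lambda + wn^2 = 0 via the quadratic formula.
disc = cmath.sqrt((2.0 * zeta * wn) ** 2 - 4.0 * wn ** 2)
lam1 = (-2.0 * zeta * wn + disc) / 2.0
lam2 = (-2.0 * zeta * wn - disc) / 2.0

# They match -zeta*wn +/- i*wn*sqrt(1 - zeta^2) ...
expected = complex(-zeta * wn, wn * math.sqrt(1.0 - zeta ** 2))
assert abs(lam1 - expected) < 1e-9 or abs(lam2 - expected) < 1e-9
# ... and both poles lie on the circle |lambda| = wn.
assert abs(abs(lam1) - wn) < 1e-9 and abs(abs(lam2) - wn) < 1e-9
```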

In particular, |λ| = ωn. Finally, it is noted that G is stable if and only if ζ > 0 and ωn > 0. To obtain the magnitude Bode plot of G, first observe that

G(iω) = 1/((iω)²/ωn² + 2ζωi/ωn + 1)   and   |G(iω)| = 1/|1 − ω²/ωn² + 2ζωi/ωn| .

As before, we assume that ω ≥ 0. A simple calculation shows that

20 log |(1 − ω²/ωn² + 2ζωi/ωn)^{−1}| ≈ 0 if ω << ωn
                                     = −20 log |2ζ| if ω = ωn (5.3)
                                     ≈ −40 log |ω| + 40 log |ωn| if ω >> ωn .

Now let f(ω) = 20 log |G(iω)| and x = log |ω|. Substituting this into (5.3) yields

f ≈ 0 if ω << ωn
  = −20 log |2ζ| if ω = ωn (5.4)
  ≈ −40x + 40 log |ωn| if ω >> ωn .

Notice that the graph of f vs x is zero until ω = ωn and is a line with slope −40 for ω >> ωn. The graph of f vs x is equivalent to a log-log plot of 20 log |G(iω)| vs log ω, which is the magnitude Bode plot for G. This log-log plot is zero until ω = ωn and is a line with slope −40 for ω >> ωn. Notice that at the point ω = ωn the magnitude |G(iωn)| = 1/(2|ζ|). Hence

20 log |G(iωn)| = −20 log |2ζ| > 0 if |ζ| < 1/2
                = 0 if |ζ| = 1/2
                < 0 if |ζ| > 1/2 .
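The sign of the peak value −20 log |2ζ| as a function of the damping ratio can be verified directly; the last assertion checks the choice ζ = 1/√2, which gives about −3 dB. A Python sketch:

```python
import math

def peak_dB(zeta):
    """20 log10 |G(i*wn)| for G(s) = wn^2/(s^2 + 2*zeta*wn*s + wn^2).

    At w = wn the term 1 - w^2/wn^2 vanishes, so G(i*wn) = 1/(2*zeta*i).
    """
    g = 1.0 / abs(complex(0.0, 2.0 * zeta))
    return 20.0 * math.log10(g)

# |G(i*wn)| = 1/(2|zeta|): above 0 dB for |zeta| < 1/2, 0 dB at 1/2, below for > 1/2.
assert peak_dB(0.1) > 0.0
assert abs(peak_dB(0.5)) < 1e-12
assert peak_dB(0.9) < 0.0
# The choice zeta = 1/sqrt(2) sits about 3 dB below 0 dB.
assert abs(peak_dB(1.0 / math.sqrt(2.0)) + 10.0 * math.log10(2.0)) < 1e-9
```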


7.5. NATURAL FREQUENCIES AND DAMPING RATIOS 383

Figure 7.13: The Bode plot (magnitude in dB and phase in degrees vs frequency in rad/sec) for G(s) = ωn²/(s² + 2ζωn s + ωn²)

As ζ approaches zero, 20 log |G(iωn)| = −20 log |2ζ| tends to infinity. The Bode plot for G with ωn = 10 and ζ = 0.01, 0.1, 0.2, 0.5, 0.71, 0.9 is given in Figure 7.13. As expected, ζ = 0.01 produces the highest point in the magnitude plot at ω = ωn = 10. In this case, ωn is also called the resonance frequency.

Let G be the transfer function given by (5.1) where 0 < ζ < 1 and ωn > 0. Then G is a stable transfer function. Moreover, this G can be used as a stable low pass filter with cutoff angular frequency ωn. The choice of the damping ratio ζ determines the shape of this low pass filter at the cutoff angular frequency ωn. Moreover, this filter can be implemented by building the circuit in Figure 4.1 in Chapter 4, consisting of a resistor with R ohms, an inductor with L henrys, and a capacitor with C farads in series in a single loop with a voltage source u; the output y is the voltage measured across the capacitor. To see this, simply recall that the transfer function for this circuit is given by equation (5.5) in Chapter 4. Combining this with (5.1) yields

G(s) = 1/(s²/ωn² + 2ζs/ωn + 1) = 1/(LCs² + RCs + 1) .

By choosing LC = 1/ωn² and RC = 2ζ/ωn, we can build a circuit to implement the stable low pass filter G = ωn²/(s² + 2ζωn s + ωn²). Recall that any proper rational transfer function can be implemented by using resistors, capacitors and operational amplifiers; see Section 5.6 in Chapter 5. In particular, this filter can also be built by using resistors, capacitors and operational amplifiers. In fact, this is probably a better method to implement this filter.
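The matching conditions LC = 1/ωn² and RC = 2ζ/ωn leave one degree of freedom; in practice one picks a convenient capacitor value and solves for R and L. A hedged Python sketch (the function name, the default capacitor value, and the 1 kHz target are illustrative assumptions):

```python
import math

def rlc_for(wn, zeta, C=1.0e-6):
    """Pick R and L for the series RLC low pass filter, given a capacitor C.

    From LC = 1/wn^2 and RC = 2*zeta/wn (the matching in the text), the
    remaining two component values are forced once C is chosen.
    """
    L = 1.0 / (wn ** 2 * C)
    R = 2.0 * zeta / (wn * C)
    return R, L

wn, zeta = 2.0 * math.pi * 1000.0, 1.0 / math.sqrt(2.0)  # 1 kHz cutoff, zeta = 1/sqrt(2)
R, L = rlc_for(wn, zeta)

# The chosen values reproduce the requested wn and zeta.
assert abs(1.0 / math.sqrt(L * 1.0e-6) - wn) < 1e-6
assert abs(R * 1.0e-6 * wn / 2.0 - zeta) < 1e-12
```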

If ζ = 1/√2 ≈ 0.7071, then the transfer function G in (5.1) is often used as a stable low pass filter with cutoff angular frequency ωn. If ζ = 1/√2, then 20 log |G(iωn)| = −20 log √2 ≈ −3.01. So at the cutoff frequency ωn the magnitude Bode plot is approximately 3 decibels off its peak of 0 decibels. This criterion of being off 3 decibels is used in designing Butterworth filters.

Figure 7.14: The Bode plot (magnitude in dB and phase in degrees vs frequency in rad/sec) for G(s) = s²/ωn² + 2ζs/ωn + 1

The numerator s²/ωn² + 2ζs/ωn + 1. Now consider the function

G(s) = s²/ωn² + 2ζs/ωn + 1 .


Many times functions of this form appear in the numerator of a transfer function. To obtain the magnitude Bode plot of G, first observe that

|G(iω)| = |1 − ω²/ωn² + 2ζωi/ωn| .

As before, assume that ω ≥ 0. A simple calculation shows that

20 log |1 − ω²/ωn² + 2ζωi/ωn| ≈ 0 if ω << ωn
                              = 20 log |2ζ| if ω = ωn (5.5)
                              ≈ 40 log |ω| − 40 log |ωn| if ω >> ωn .

Now let f(ω) = 20 log |G(iω)| and x = log |ω|. Substituting this into (5.5) yields

f ≈ 0 if ω << ωn
  = 20 log |2ζ| if ω = ωn (5.6)
  ≈ 40x − 40 log |ωn| if ω >> ωn .

Notice that the graph of f vs x is zero until ω = ωn and is a line with slope 40 for ω >> ωn. The graph of f vs x is equivalent to a log-log plot of 20 log |G(iω)| vs log ω, which is the magnitude Bode plot for G. This log-log plot is zero until ω = ωn and is a line with slope 40 for ω >> ωn. Notice that at the point ω = ωn the magnitude |G(iωn)| = 2|ζ|. Hence

20 log |G(iωn)| = 20 log |2ζ| < 0 if |ζ| < 1/2
                = 0 if |ζ| = 1/2
                > 0 if |ζ| > 1/2 .

As ζ approaches zero, 20 log |G(iωn)| = 20 log |2ζ| tends to minus infinity. The Bode plot for G with ωn = 10 and ζ = 0.01, 0.1, 0.2, 0.5, 0.7, 0.9 is given in Figure 7.14. Hence the choice of ζ = 0.01 produces the lowest point in the magnitude plot at ω = ωn = 10.
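A quick check of the value |G(iωn)| = 2|ζ| for the numerator form (illustrative Python):

```python
import math

def notch_dB(zeta):
    """20 log10 |G(i*wn)| for the numerator form G(s) = s^2/wn^2 + 2*zeta*s/wn + 1.

    At w = wn the term 1 - w^2/wn^2 vanishes, leaving G(i*wn) = 2*zeta*i.
    """
    return 20.0 * math.log10(abs(complex(0.0, 2.0 * zeta)))

# |G(i*wn)| = 2|zeta|: a small damping ratio carves a deep dip at wn.
assert abs(notch_dB(0.5)) < 1e-12
assert notch_dB(0.01) < -30.0
assert notch_dB(0.9) > 0.0
```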

7.5.1 A band pass filter example

Now let us try to find a stable rational band pass filter to accept angular frequencies in the region [1, 10] and reject all the other angular frequencies. To this end, consider the filter given by

B(s) = s²/((s² + 1.42s + 1)(s²/100 + 1.42s/10 + 1)) . (5.7)

Notice that in this case the damping ratio in each denominator factor is ζ = 0.71 ≈ 1/√2. Because all the poles of B(s) are in the open left half plane, B is a stable filter. The Bode plot of B is given in Figure 7.15. Since the s² term appears in the numerator of B(s), the magnitude plot has slope 40 starting at ω = 0. The denominator term s² + 1.42s + 1 has natural frequency 1 and damping ratio 0.71. This adds a slope of −40 to the magnitude plot for ω >> 1. Hence the magnitude plot has slope 40 − 40 = 0, or equivalently, is constant in the region 1 << ω << 10. The denominator term s²/100 + 1.42s/10 + 1 has natural frequency 10 and damping ratio 0.71. This adds a slope of −40 once ω >> 10. Because 0 − 40 = −40, the magnitude plot has slope −40 for ω >> 10. Finally, it is noted that this B is a reasonable band pass filter. Furthermore, the stable filter B in (5.7) is a much better band pass filter than the stable filter G in (4.6), whose Bode plot is given in Figure 7.12.

Figure 7.15: The Bode plot (magnitude in dB and phase in degrees vs frequency in rad/sec) for B(s) = s²/((s² + 1.42s + 1)(s²/100 + 1.42s/10 + 1))

To see how well our filter performs, consider the input

u(t) = sin(t/10) + sin(3t) + sin(100t) .

Using Matlab we see that

B(i/10) = 0.01e^{2.9849i}, B(3i) = 0.9879e^{0.0515i} and B(100i) = 0.01e^{−2.9849i} .

Therefore the steady state response is given by

yss(t) = 0.01 sin(t/10 + 2.9849) + 0.9879 sin(3t + 0.0515) + 0.01 sin(100t − 2.9849) ≈ sin(3t) .

Hence yss(t) ≈ sin(3t). To simulate this filter we used the Simulink model presented in Figure 7.16. (One can also use the lsim command. However, we would like to introduce another method.) In the simulation parameters we used a fixed step of 0.001. We also checked the array option in the simout to workspace box. Under the history box in the scope we checked save data to workspace and array. The graph of yss(t) with sin(3t) is presented in Figure 7.17 from the scope data. As expected, the two plots converge. The Matlab commands we used to generate Figure 7.17 are given by

• plot(ScopeData(:,1), ScopeData(:,2)); grid

• hold on; plot(ScopeData(:,1), ScopeData(:,3), 'r')

• title('The plot of yss(t) and sin(3t)');

• xlabel('time')
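As a cross-check on the steady-state gains quoted above, one can evaluate B(s) from (5.7) directly on the imaginary axis; a Python sketch (the text itself uses Matlab):

```python
def B(s):
    """B(s) from (5.7): s^2 / ((s^2 + 1.42s + 1)(s^2/100 + 1.42s/10 + 1))."""
    return s ** 2 / ((s ** 2 + 1.42 * s + 1.0) * (s ** 2 / 100.0 + 1.42 * s / 10.0 + 1.0))

# The pass band gain at w = 3 is close to the quoted value 0.9879 ...
assert abs(abs(B(3j)) - 0.9879) < 0.01
assert abs(B(3j)) > 0.98
# ... while the components at w = 1/10 and w = 100 are attenuated to about 1%.
assert abs(B(0.1j)) < 0.02
assert abs(B(100j)) < 0.02
```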


Figure 7.16: The Simulink model: the three sources sin(t/10), sin(3t) and sin(100t) are summed and fed through a Transfer Fcn block implementing s²/(0.0100s⁴ + 0.1562s³ + 1.2116s² + 1.5620s + 1.0000), with the output sent to a Scope and to the workspace (simout)

Figure 7.17: yss(t) plotted together with sin(3t) over the time interval 0 to 10


7.5.2 The resonance frequency

Figure 7.18: yss(t) with u(t) = sin(10t) and G(s) = 100/(s² + 0.2s + 100) (linear simulation results; amplitude vs time in seconds)

The resonance frequency ωr is a frequency where |G(iω)| is maximum, that is,

|G(iω)| ≤ |G(iωr)| (for all real ω).

The resonance frequency is not necessarily unique. The resonance frequency is a frequency ωr which produces the largest steady state output amplitude for an input of sin(ωr t) (or cos(ωr t)). For example, consider the transfer function

G(s) = 100/(s² + 0.2s + 100) (5.8)

where the natural frequency ωn = 10 and the damping ratio ζ = 1/100. The Bode plot of this G(s) is given in the top plot in Figure 7.13. The resonance frequency is ωn = 10. It is noted that G(10i) = −50i = 50e^{−πi/2}. So if the input u(t) = sin(10t), then the steady state output is given by

yss(t) = 50 sin(10t − π/2) .

In particular, the amplitude of yss(t) is 50 times larger than the amplitude of the input. The graph of yss(t) with u(t) = sin(10t) is given in Figure 7.18. As expected, the amplitude of yss(t) converges to 50. By magnifying the plot, one can see that the phase of yss(t) is indeed off by −π/2 from sin(10t). The Matlab commands we used to generate this graph are given by

• t = linspace(0, 50, 2^14); u = sin(10*t);

• G = tf(100, [1, 0.2, 100]);

• lsim(G, u, t); grid

The poles of G(s) = 100/(s² + 0.2s + 100) are given by (−1 ± √9999 i)/10. So the output is of the form

y(t) = 50 sin(10t − π/2) + αe^{−t/10}e^{√9999 it/10} + βe^{−t/10}e^{−√9999 it/10}

where α and β are constants. The transient response is the part of the output which converges to zero. In our example, the transient response is

αe^{−t/10}e^{√9999 it/10} + βe^{−t/10}e^{−√9999 it/10} .

Hence our transient response dies out on the order of e^{−t/10}, and this is why it takes around 50 seconds for the output to reach its equilibrium. Finally, it is noted that our transient response oscillates at a frequency of √9999/10, which is close to the resonance frequency of 10. If G(s) were the transfer function for something like a beam or wing and one shook the beam with an input of ε sin(10t) for some small ε, then the steady state response would have an amplitude of 50ε. So this vibration could cause the beam to crack or break. It is noted that if one moves off the resonance frequency, then the amplitude of the steady state response may drop off relatively fast. For example, consider G(s) in (5.8). Then |G(5i)| ≈ 1.3332, which is significantly smaller than |G(10i)| = 50. Finally, in applications, one can experimentally determine the resonance frequency by putting in an input of a sin(ωt) and varying the frequency ω. Then the resonance frequency is a frequency ωr with the largest steady state amplitude.
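The numbers in this example are easy to reproduce. The following Python sketch evaluates G(s) = 100/(s² + 0.2s + 100) on the imaginary axis, confirms the gain 50 at resonance, locates the peak near ω = 10 by a coarse grid search, and shows the sharp drop at ω = 5:

```python
def G(s):
    """G(s) = 100/(s^2 + 0.2s + 100), with wn = 10 and zeta = 1/100."""
    return 100.0 / (s ** 2 + 0.2 * s + 100.0)

# At the resonance frequency the gain is 50 with phase -pi/2 ...
assert abs(G(10j) - (-50j)) < 1e-9
# ... and a coarse grid search over w locates the peak near w = 10.
ws = [k / 100.0 for k in range(1, 5001)]   # w from 0.01 to 50
w_peak = max(ws, key=lambda w: abs(G(1j * w)))
assert abs(w_peak - 10.0) < 0.1
# Off resonance the gain drops quickly: |G(5i)| ~ 1.3332, far below 50.
assert abs(abs(G(5j)) - 1.3332) < 1e-3
```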

To be mathematically precise, a resonance frequency ωr is a frequency which solves the optimization problem

|G(iωr)| = max{|G(iω)| : −∞ < ω < ∞}. (5.9)

In functional analysis, |G(iωr)| = ‖G‖∞ is the H∞ norm of G, where H∞ is the Hardy space of all analytic functions in the open right half plane which are uniformly bounded; see Hoffman [20] for further details.

A mass spring damper example

Consider the mass spring damper system in Figure 7.19; see Chapter 6 for a discussion of mass spring damper systems. Here q1, q2 and q3 are respectively the positions of the masses m1, m2 and m3. The input u is the force on the second mass m2. The equations of motion are given by

m1q̈1 + c1q̇1 + c2(q̇1 − q̇3) + k1q1 + k2(q1 − q2) = 0
m2q̈2 + k4q2 + k2(q2 − q1) + k3(q2 − q3) = u (5.10)
m3q̈3 + c2(q̇3 − q̇1) + k3(q3 − q2) = 0.


Figure 7.19: A mass spring damper system (mass m1 is attached to ground through spring k1 and damper c1, mass m2 through spring k4; spring k2 couples m1 and m2, spring k3 couples m2 and m3, damper c2 couples m1 and m3; the force u acts on m2)

In other words, this system admits a second order matrix differential equation of the form Mq̈ + Φq̇ + Kq = bu where

M = [m1 0 0; 0 m2 0; 0 0 m3] and Φ = [c1+c2 0 −c2; 0 0 0; −c2 0 c2]

K = [k1+k2 −k2 0; −k2 k2+k3+k4 −k3; 0 −k3 k3] and q = [q1; q2; q3] and b = [0; 1; 0]. (5.11)

Now assume that m1, m2, m3 and c1, c2 and k1, . . . , k4 are all strictly positive. Clearly, the mass matrix M is strictly positive. Moreover, Φ and K are both diagonally dominant self adjoint matrices with positive entries on the diagonal. According to Corollary 6.2.9 of the Gershgorin circle theorem, the matrices Φ and K are both positive matrices; see Chapter 6. Because the second column of Φ is zero, Φ is singular. Proposition 6.2.10 in Chapter 6 shows that K is strictly positive. We claim that

[K − σM; Φ] =
[ k1+k2−σm1   −k2              0
  −k2          k2+k3+k4−σm2   −k3
  0           −k3              k3−σm3
  c1+c2        0              −c2
  0            0               0
  −c2          0               c2 ] (5.12)

is one to one for all real numbers σ > 0. To see this simply observe that the matrix in (5.12)

Page 392: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

392 CHAPTER 7. AN INTRODUCTION TO FILTERING THEORY

is of the form

[ �       �    0           [ �     �    0
  �       �    �             �     �    �
  0      −k3   �      →      0    −k3   �
  c1+c2   0   −c2            c1    0    0
  0       0    0             0     0    0
 −c2      0    c2 ]         −c2    0    c2 ]

where � denotes an unspecified entry. By Gaussian elimination, the last matrix is obtained by adding the last row to the fourth row. Because c1, c2 and k3 are all nonzero, the columns of the last matrix are linearly independent. Therefore the matrix in (5.12) is one to one for all real numbers σ > 0. According to Theorem 6.2.11 in Chapter 6, our mass spring damper system formed by (5.11) is stable. In particular, all the roots of det[Ms² + Φs + K] are contained in the open left half plane.

For example, let us assume that

m1 = 1, m2 = 2, m3 = 3, c1 = 5, c2 = 1

k1 = 3, k2 = 1, k3 = 2, k4 = 2. (5.13)

The output y = q̇2 is the velocity of the second mass m2. In this example, we will use Matlab to compute the following:

(i) The transfer function G(s) = Y(s)/U(s).

(ii) The steady state response yss(t) to the input u(t) = 3 + 2 sin(1.5t).

(iii) The resonance frequency.

To this end, first observe that the equations of motion are given by

q̈1 + 5q̇1 + q̇1 − q̇3 + 3q1 + q1 − q2 = 0
2q̈2 + 2q2 + q2 − q1 + 2(q2 − q3) = u
3q̈3 + q̇3 − q̇1 + 2(q3 − q2) = 0.

By setting q = [q1 q2 q3]^tr, where tr denotes the transpose, we obtain

[1 0 0; 0 2 0; 0 0 3] q̈ + [6 0 −1; 0 0 0; −1 0 1] q̇ + [4 −1 0; −1 5 −2; 0 −2 2] q = [0; 1; 0] u.

Notice that this is a second order system of the form

Mq̈ + Φq̇ + Kq = bu

where M, Φ and K are 3 × 3 matrices and b = [0 1 0]^tr is a column vector of length 3. To be specific,

M = [1 0 0; 0 2 0; 0 0 3], Φ = [6 0 −1; 0 0 0; −1 0 1] and K = [4 −1 0; −1 5 −2; 0 −2 2] .


7.5. NATURAL FREQUENCIES AND DAMPING RATIOS 393

By setting x₁ = q and x₂ = ẋ₁ = q̇, we arrive at the following state space system

\[
\begin{bmatrix} \dot x_1 \\ \dot x_2 \end{bmatrix}
= \begin{bmatrix} 0 & I \\ -M^{-1}K & -M^{-1}\Phi \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
+ \begin{bmatrix} 0 \\ M^{-1}b \end{bmatrix} u
\]
\[
y = \begin{bmatrix} [0\; 0\; 0] & [0\; 1\; 0] \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}.
\]

The matrix corresponding to the output y has a 1 in the fifth position and zeros elsewhere. This follows from the fact that y = q̇₂ and x₂ = q̇. So setting

\[
A = \begin{bmatrix} 0 & I \\ -M^{-1}K & -M^{-1}\Phi \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 0 \\ M^{-1}b \end{bmatrix}
\]
\[
C = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix}
\quad\text{and}\quad D = 0
\]

we have a state space system of the form:

\[ \dot x = Ax + Bu \quad\text{and}\quad y = Cx + Du. \]

Here A is a 6 × 6 matrix and B is a column vector of length 6, while C is a row vector of length 6. By using ss2tf in Matlab we arrive at

\[
G(s) = \frac{0.5s^5 + 3.167s^4 + 3.167s^3 + 2.667s^2 + 1.333s}{s^6 + 6.333s^5 + 8.833s^4 + 21.17s^3 + 17.33s^2 + 8.5s + 3.667}.
\]

The Matlab commands we used to compute G are

• m = diag([1, 2, 3]); om = [6, 0, −1; 0, 0, 0; −1, 0, 1];

• k = [4, −1, 0; −1, 5, −2; 0, −2, 2];

• b = [0; 1; 0]; c = [0, 0, 0, 0, 1, 0];

• a = [zeros(3, 3), eye(3); −[inv(m) ∗ k, inv(m) ∗ om]]; B = [0; 0; 0; inv(m) ∗ b];

• [num, den] = ss2tf(a, B, c, 0); g = tf(num, den)

For another method to compute G we used the Matlab commands

• syms s

• s ∗ [0, 1, 0] ∗ inv(m ∗ s^2 + om ∗ s + k) ∗ b

This yields the same transfer function in a different form:

\[
G(s) = \frac{3s^5 + 19s^4 + 19s^3 + 16s^2 + 8s}{6s^6 + 38s^5 + 53s^4 + 127s^3 + 104s^2 + 51s + 22}.
\]
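As a cross-check on these numbers, the same computation can be sketched in Python/NumPy (standing in here for the Matlab commands above):

```python
import numpy as np

# The example data (5.13) assembled into M, Phi, K, b
M = np.diag([1.0, 2.0, 3.0])
Phi = np.array([[6.0, 0, -1], [0, 0, 0], [-1, 0, 1]])
K = np.array([[4.0, -1, 0], [-1, 5, -2], [0, -2, 2]])
b = np.array([[0.0], [1.0], [0.0]])

Minv = np.linalg.inv(M)
A = np.block([[np.zeros((3, 3)), np.eye(3)], [-Minv @ K, -Minv @ Phi]])
B = np.vstack([np.zeros((3, 1)), Minv @ b])
C = np.array([[0.0, 0, 0, 0, 1, 0]])

def G(s):
    """Evaluate G(s) = C (sI - A)^{-1} B."""
    return (C @ np.linalg.solve(s * np.eye(6) - A, B))[0, 0]

print(G(0))     # 0 (the numerator has a factor s)
print(G(1.5j))  # approximately 0.2727 + 0.9985i
```

Both values agree with the Matlab results quoted in the text.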

Part (ii). The steady state response to u = 3 + 2 sin(1.5t) is given by

\[
y_{ss}(t) = 3G(0) + 2|G(1.5i)|\,\sin\bigl(1.5t + \mathrm{angle}(G(1.5i))\bigr).
\]


Using Matlab we see that

\[ G(0) = 0 \quad\text{and}\quad G(1.5i) \approx 0.2727 + 0.9985i = 1.0351\,e^{1.3042i}. \]

Using Remark 7.1.3, one can also use the state space realization {A, B, C, D} to compute G(0) = −CA⁻¹B = 0 and G(1.5i) = C(1.5iI − A)⁻¹B ≈ 0.2727 + 0.9985i. So the steady state response is given by

\[ y_{ss} \approx 2.07\,\sin(1.5t + 1.3042). \]

Part (iii). The Bode plot for G is given in Figure 7.20. The frequency which attains the maximum is around ω_r = 1.69 with corresponding amplitude of 16.2 db, or a gain of 10^{16.2/20} = 6.4565. In particular, the resonance frequency is approximately 1.69. Using Matlab or reading the Bode plot we see that the angle for G(1.69i) equals −0.158. So if u(t) = sin(1.69t), then the steady state output is given by

\[ y_{ss}(t) = 6.4565\,\sin(1.69t - 0.158). \]

It is noted that |G(1.5i)| = 1.0351 while at the resonance frequency |G(1.69i)| = 6.4565. Even though 1.5 is close to the resonance frequency 1.69, the corresponding amplitude at the resonance frequency is over 6 times higher. Using the lsim command in Matlab we plotted y_{ss}(t) = 6.4565 sin(1.69t − 0.158) in Figure 7.21. The steady state response converges slowly because the poles of G(s) are given by

\[ \{-5.3150,\;\; -0.0335 \pm 1.6869i,\;\; -0.7123,\;\; -0.1195 \pm 0.5709i\}. \]

In particular, the poles −0.0335 ± 1.6869i with real part −0.0335 mean that the steady state response will converge on the order of e^{−0.0335t}.
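The poles themselves are quick to recompute; a Python/NumPy sketch using the denominator of G(s):

```python
import numpy as np

# Denominator of G(s) = (3s^5 + 19s^4 + 19s^3 + 16s^2 + 8s) / d(s)
den = [6, 38, 53, 127, 104, 51, 22]
poles = np.roots(den)
dominant = max(poles.real)

print(np.round(np.sort_complex(poles), 4))
print(dominant)  # approximately -0.0335: the slow decay rate e^{-0.0335 t}
```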

7.5.3 Exercise

Problem 1. Consider the stable filter given by

\[ G(s) = \frac{(s-1)(1+s/10)}{(s^2 + 0.01s + 1)(1 + s/100)}. \]

Plot the Bode plot for G in Matlab and explain what happens in each region.

Problem 2. Consider the stable filter given by

\[ G(s) = \frac{100s^2 + 4s + 100}{s^2 + 0.2s + 100}. \]

(i) Plot the Bode plot for G in Matlab and explain what happens in each region.

(ii) Find the steady state response for u(t) = cos(t).

(iii) Find the steady state response for u(t) = cos(10t).

(iv) Find the resonance frequency.


[Bode plot omitted; marked data point — System: g, Frequency (rad/sec): 1.69, Magnitude (dB): 16.2.]

Figure 7.20: The Bode plot for G(s) = (3s⁵ + 19s⁴ + 19s³ + 16s² + 8s)/(6s⁶ + 38s⁵ + 53s⁴ + 127s³ + 104s² + 51s + 22)

[Linear Simulation Results plot omitted; Time (seconds) 0–200, Amplitude −8 to 8.]

Figure 7.21: The steady state response at the resonance frequency

Problem 3. Consider the stable filter given by

\[ G(s) = \frac{s^2 - 2s + 10}{s^2 + 2s + 10}. \]


(i) Plot the Bode plot for G in Matlab and explain what happens in each region.

(ii) Find the steady state response for u(t) = cos(t).

(iii) Find the steady state response for u(t) = cos(10t).

(iv) Find the resonance frequency.

Problem 4. Consider the stable filter given by

\[ G(s) = \frac{s^2 - 2s + 2}{s^3 + 3s^2 + 4s + 2}. \]

(i) Plot the Bode plot for G in Matlab and explain what happens in each region.

(ii) Is this G(s) a low pass or high pass filter or neither?

(iii) Find the steady state response for u(t) = cos(t).

(iv) Find the steady state response for u(t) = cos(10t).

(v) Find the resonance frequency.

Problem 5. Consider the transfer function given by

\[ G(s) = \frac{s+1}{(s-4)(s^2 - 10s + 100)}. \]

Plot the Bode plot for G in Matlab. Then find a stable filter H with the same magnitude plot as G. In other words, find a stable filter H such that |G(iω)| = |H(iω)| for all ω.

Problem 6. Explain how

\[ G(s) = \frac{s^2}{s^2/\omega_n^2 + 2\zeta s/\omega_n + 1} \]

can be used as a stable high pass filter. Use this G to design a high pass filter with cutoff angular frequency 10. Plot the Bode plot of your design in Matlab. Design a circuit consisting of a resistor, capacitor and inductor such that G is the transfer function for this circuit.

Problem 7. Recall that the McMillan degree of a transfer function G(s) = n(s)/d(s) equals the degree of the denominator d(s) when n(s) and d(s) have no common roots. Consider the input signal

\[ u(t) = \cos(t) + \cos(3t) - \sin(10t). \]

Design a band pass filter G(s) = p(s)/q(s) of McMillan degree at most four to pick out the signal cos(3t) from the input u(t). The output for the filter is Y(s) = G(s)U(s). The idea is to design a filter G such that the steady state output y_{ss}(t) ≈ cos(3t) with almost no phase shift.


One could use the band pass Butterworth filter, which is discussed in Section 8.3 of Chapter 8. However, this is not necessary. The Matlab command for the Butterworth filter is

[p, q] = butter(m, [a, b], 's')

Here 2m is the McMillan degree of the filter and [a, b] are the cutoff frequencies for the filter.

(i) Find a band pass filter G(s) = n(s)/d(s) of McMillan degree at most four to pick out cos(3t) from u(t). Use Matlab to find the Bode plot of G.

(ii) Use the lsim command over 40 seconds in Matlab with u(t) to see how well your filter G(s) = n(s)/d(s) picks out cos(3t) from u(t). Plot u(t) and y(t) on separate graphs.

(iii) Compare the output y(t) of lsim to cos(3t), that is, plot cos(3t) and y(t) on the same graph.

(iv) Construct a Simulink model for your filter G(s). Use Simulink to plot y(t) and cos(3t) on the same graph.
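One possible starting point for Part (i) is sketched below in Python with scipy.signal (standing in for the Matlab butter command; the band edges 2.5 and 3.5 rad/s are an assumed choice bracketing 3 rad/s, not part of the problem statement):

```python
import numpy as np
from scipy import signal

# Analog Butterworth band pass of order 2 -> McMillan degree 4.
num, den = signal.butter(2, [2.5, 3.5], btype='bandpass', analog=True)
G = signal.TransferFunction(num, den)

# Gain at the three frequencies present in u(t) = cos(t) + cos(3t) - sin(10t)
w, H = signal.freqresp(G, w=[1.0, 3.0, 10.0])
print(np.abs(H))  # near 1 at 3 rad/s, small at 1 and 10 rad/s
```

Since the gain is essentially 1 at 3 rad/s and small at 1 and 10 rad/s, the steady state output is approximately cos(3t) up to a phase shift.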

Problem 8. For an elementary band pass filter, consider the transfer function given by

\[ \frac{Y(s)}{U(s)} = G(s) = \frac{2\zeta\omega_n s}{s^2 + 2\zeta\omega_n s + \omega_n^2} \tag{5.14} \]

where u is the input and y is the output. The damping ratio ζ > 0 and ω_n is the natural frequency. The transfer function G(s) acts as a band pass filter centered about the angular frequency ω_n. Moreover, G(iω_n) = 1, while G(0) = 0 and G(i∞) = 0. So sinusoids with frequency ω_n pass through the filter unchanged in steady state. If ζ < 1/2, then the filter becomes sharp around the center. For example, consider the transfer function

\[ G(s) = \frac{20\zeta s}{s^2 + 20\zeta s + 100} \qquad (\zeta > 0 \text{ and } \omega_n = 10). \]

The Bode plot of this filter is presented in Figure 7.22 for ζ = 1/50, 1/20, 1/4, 1/2, 1, 5, 10. The plot on the bottom is the graph for ζ = 1/50, while the plot on the top is for ζ = 10. The smaller the damping ratio ζ the sharper the filter becomes. However, as ζ tends to zero, the poles (−ζω_n ± iω_n√(1 − ζ²) when ζ < 1) of G(s) move towards the imaginary axis and instability.

(i) Find the poles for G(s) = 2ζω_n s/(s² + 2ζω_n s + ω_n²) for any ζ > 0.

(ii) Design a circuit consisting of a resistor, capacitor and inductor for the transfer function G(s) in (5.14); see Section 4.5 in Chapter 4.

(iii) Consider the input u(t) = cos(t) + cos(3t) + cos(5t). Design a band pass filter G(s) of McMillan degree two which picks out cos(3t) from u(t), that is, y_{ss}(t) ≈ cos(3t). Use Matlab to graph the Bode plot for G(s).


(iii-a) Use the lsim command in Matlab with u(t) to see how well your filter G(s) picks out cos(3t) from u(t). Plot u(t) and y(t) on the same graph.

(iii-b) Compare the output y of lsim to cos(3t), that is, plot cos(3t) and y(t) on the same graph.

(iii-c) Construct a Simulink model for your filter G(s). Use Simulink to plot y(t) and cos(3t) on the same graph.

(iii-d) Repeat Parts (iii-a) and (iii-b) with your filter replaced by G(s)². Is there an improvement?
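The claim G(iω_n) = 1 in (5.14) is easy to verify numerically; a minimal Python sketch with ω_n = 10:

```python
def G(s, zeta, wn=10.0):
    # Band pass filter of (5.14): G(s) = 2*zeta*wn*s / (s^2 + 2*zeta*wn*s + wn^2)
    return 2 * zeta * wn * s / (s**2 + 2 * zeta * wn * s + wn**2)

for zeta in [0.02, 0.25, 1.0, 10.0]:
    print(zeta, G(10j, zeta))  # G(i*wn) = 1 exactly, for every zeta > 0
```

At s = iω_n the terms s² + ω_n² cancel, leaving numerator equal to denominator, which is why the center gain is exactly 1 regardless of ζ.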

[Bode plot omitted.]

Figure 7.22: The Bode plot for G(s) = 20ζs/(s² + 20ζs + 100) for ζ = 1/50, 1/20, 1/4, 1/2, 1, 5, 10

Problem 9. For an elementary band stop filter, consider the transfer function

\[ \frac{Y(s)}{U(s)} = G(s) = \frac{s^2 + \omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2} \tag{5.15} \]

where u is the input, y is the output and ω_n is the natural frequency. Notice that G(iω_n) = 0, while G(0) = 1 and G(i∞) = 1. To see why this acts like a band stop filter, consider the transfer function

\[ G(s) = \frac{s^2 + 100}{s^2 + 20\zeta s + 100} \]

where ω_n = 10. The Bode plot for this transfer function with ζ = 1/20, 1/4, 1/2, 1, 4, 1000 is presented in Figure 7.23. (The poles of G(s) are the repeated pair {−ω_n, −ω_n} when ζ = 1, and real for ζ > 1.) The top plot is the graph corresponding to ζ = 1/20, while the bottom graph is for ζ = 1000. Notice that as ζ moves from zero to infinity, the width of the interval around ω_n becomes larger. On the other hand, as ζ tends to zero the poles (−ζω_n ± iω_n√(1 − ζ²) when ζ < 1) of G(s) move towards the imaginary axis and closer to instability.


(i) Design a circuit consisting of a resistor, capacitor and inductor for the transfer function G(s) in (5.15); see Section 4.5 in Chapter 4.

(ii) Design a band stop filter Y(s)/U(s) = G(s) of the form (5.15) to reject cos(3t) from the input signal u(t) = cos(t) + cos(3t) + cos(5t).

(iii) Use the lsim command in Matlab to plot the input u(t) and output y(t) on the same graph.

(iv) Use the lsim command in Matlab to plot the output y(t) and cos(t) + cos(5t) on the same graph. How well does your band stop filter work?

(v) Replace your band stop filter with G(s)² or G(s)³. Does this improve your results?

Finally, it is noted that sometimes sharp band stop filters are called notch filters.
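For Part (ii), the notch is placed at the frequency to be rejected. A minimal Python sketch with ω_n = 3 and an arbitrarily chosen sharp damping ratio ζ = 1/20:

```python
def G(s, zeta=0.05, wn=3.0):
    # Band stop filter of (5.15) with the notch at wn = 3 rad/s;
    # zeta = 1/20 is an arbitrary illustrative choice.
    return (s**2 + wn**2) / (s**2 + 2 * zeta * wn * s + wn**2)

for w in [1.0, 3.0, 5.0]:
    print(w, abs(G(1j * w)))  # gain 0 at 3 rad/s; close to 1 at 1 and 5 rad/s
```

The numerator vanishes exactly at s = 3i, so cos(3t) is removed in steady state while cos(t) and cos(5t) pass through nearly unchanged.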

[Bode plot omitted.]

Figure 7.23: The Bode plot for G(s) = (s² + 100)/(s² + 20ζs + 100) with ζ = 1/20, 1/4, 1/2, 1, 4, 1000

Problem 10. Consider the transfer function G of the form:

\[ G(s) = \frac{\gamma s^m}{s^2 + 2\zeta\omega_n s + \omega_n^2} \]

where m is a positive integer and γ is a scalar. The Bode plot for G is given in Figure 7.24. Using the Bode plot in Figure 7.24:

(i) Find the steady state response for u(t) = cos(3t).

(ii) Find the steady state response for u(t) = cos(10t).

(iii) Find the steady state response for u(t) = cos(60t).

(iv) Find the resonance frequency.

(v) Find the values for the gain γ, the integer m, the damping ratio ζ and natural frequency ω_n, that is, find G(s).


[Bode plot omitted.]

Figure 7.24: The Bode plot for G(s) = γs^m/(s² + 2ζω_n s + ω_n²)

Problem 11. Consider the square wave u(t) with period 2π generated by

\[
u(t) = \frac{\pi}{4} \quad\text{if } 0 \le t < \pi, \qquad
u(t) = -\frac{\pi}{4} \quad\text{if } \pi \le t < 2\pi.
\]

Then u(t) is extended periodically for all t ≥ 2π. Consider the transfer function

\[ G(s) = \frac{s}{5s^2 + s + 5}. \]

(i) Find the Fourier series expansion for the input u(t) of the form

\[ u(t) = \sin(t) + \sum_{k=2}^{\infty}\bigl(\alpha_k \cos(kt) + \beta_k \sin(kt)\bigr). \]

This problem shows how G(s) can be used as a filter to extract sin(t) from u(t).

(ii) Find the steady state output yss(t) for the square wave input u(t) in terms of an infinitesum of sinusoids.


(iii) Use lsim in Matlab to plot u(t), the output y(t) and y_{ss}(t) on the same graph over 0 ≤ t ≤ 16π. The square command in Matlab may be helpful.

(iv) Plot the steady state response y_{ss}(t) and sin(t) on the same graph over the interval 0 ≤ t ≤ 16π.

(v) Show that y_{ss}(t) ≈ sin(t), that is, compute

\[ e = \frac{1}{2\pi}\int_0^{2\pi} |y_{ss}(t) - \sin(t)|^2\, dt. \]

You will need Matlab to compute this.

(vi) Plot the Bode plot of G(s). Use the magnitude and phase of this Bode plot to explain in a couple of sentences why G(s) acts like a filter to extract sin(t) from the square wave input u(t).

(vii) Consider the input u + 2w where u is the previous square wave, and w is a mean zero, variance one Gaussian white noise process. In this case, the input is corrupted by Gaussian white noise 2w. The idea is to find a filter to extract sin(t) from u + 2w in steady state. The Matlab command for w is randn. Use the lsim command in Matlab with the input u + 2w over the interval 0 ≤ t ≤ 16π to plot the input u + 2w, the output y(t) and sin(t) all on the same graph. Does the filter G(s) pick out sin(t) from the signal plus noise u + 2w?
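The simulation in Parts (iii)–(v) can be sketched in Python with scipy.signal (a stand-in for the Matlab lsim and square commands; the time grid and the cutoff t > 8π for "steady state" are assumptions of this sketch):

```python
import numpy as np
from scipy import signal

# G(s) = s / (5s^2 + s + 5); note G(i) = 1, so sin(t) passes through unchanged
G = signal.TransferFunction([1, 0], [5, 1, 5])

t = np.linspace(0, 16 * np.pi, 4000)
u = (np.pi / 4) * signal.square(t)  # square wave of period 2*pi, amplitude pi/4

_, y, _ = signal.lsim(G, U=u, T=t)

# Once the transient has died out, y(t) should be close to sin(t)
tail = t > 8 * np.pi
err = np.sqrt(np.mean((y[tail] - np.sin(t[tail])) ** 2))
print(err)  # a small fraction of the unit amplitude of sin(t)
```

The fundamental passes exactly (G(i) = 1), while the higher harmonics of the square wave are strongly attenuated, so the residual error comes only from the attenuated harmonics and the decaying transient.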


7.6 A bus suspension problem

In this section we will present the bus suspension problem discussed in the web page [4]:

Control Tutorials for Matlab web page www.engin.umich.edu/group/ctm/

Consider the model for a bus suspension in Figure 7.25. Here m1 is the mass of the bus, while m2 is the mass of the suspension and tire. The spring constants are denoted by k1 and k2. The damping coefficients due to the shock absorbers and tires are denoted by b1 and b2. The output is y = q1 − q2, which is the displacement between the bus and the tire. The control force is u and the disturbance due to the road is denoted by w. The bus suspension problem is to design a feedback controller u to make the bus ride smooth when the tire hits a bump in the road.

Figure 7.25: The model for the bus suspension.

The equations of motion for the bus suspension are given by

\[
m_1\ddot q_1 + b_1(\dot q_1 - \dot q_2) + k_1(q_1 - q_2) = u
\]
\[
m_2\ddot q_2 + b_1(\dot q_2 - \dot q_1) + k_1(q_2 - q_1) + b_2(\dot q_2 - \dot w) + k_2(q_2 - w) = -u. \tag{6.1}
\]

Here u is the force between the tire and the bus, while w is the function describing the road. The road is viewed as a disturbance. Finally, the output is y = q₁ − q₂, which is the displacement between the bus and the tire. Notice that the equations of motion in (6.1) can be rewritten as a second order matrix differential equation of the form:

\[
\begin{bmatrix} m_1 & 0 \\ 0 & m_2 \end{bmatrix}
\begin{bmatrix} \ddot q_1 \\ \ddot q_2 \end{bmatrix}
+ \begin{bmatrix} b_1 & -b_1 \\ -b_1 & b_1 + b_2 \end{bmatrix}
\begin{bmatrix} \dot q_1 \\ \dot q_2 \end{bmatrix}
+ \begin{bmatrix} k_1 & -k_1 \\ -k_1 & k_1 + k_2 \end{bmatrix}
\begin{bmatrix} q_1 \\ q_2 \end{bmatrix}
= \begin{bmatrix} 1 \\ -1 \end{bmatrix} u
+ \begin{bmatrix} 0 \\ k_2 \end{bmatrix} w
+ \begin{bmatrix} 0 \\ b_2 \end{bmatrix} \dot w. \tag{6.2}
\]

So the bus suspension admits a second order matrix differential equation of the form

\[ M\ddot q + \Phi\dot q + Kq = \beta u + \beta_1 w + \beta_2 \dot w. \tag{6.3} \]


Here q is a column vector of length two, while M, Φ, and K are the 2 × 2 matrices, and β, β₁ and β₂ are column vectors of length two defined by

\[
M = \begin{bmatrix} m_1 & 0 \\ 0 & m_2 \end{bmatrix}, \quad
\Phi = \begin{bmatrix} b_1 & -b_1 \\ -b_1 & b_1 + b_2 \end{bmatrix}, \quad
K = \begin{bmatrix} k_1 & -k_1 \\ -k_1 & k_1 + k_2 \end{bmatrix} \tag{6.4}
\]
\[
q = \begin{bmatrix} q_1 \\ q_2 \end{bmatrix}, \quad
\beta = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \quad
\beta_1 = \begin{bmatrix} 0 \\ k_2 \end{bmatrix}
\quad\text{and}\quad
\beta_2 = \begin{bmatrix} 0 \\ b_2 \end{bmatrix}.
\]

A state space model. Let us convert the equations of motion in (6.3) to state space. To this end, notice that (6.3) can be written as

\[ \ddot q = -M^{-1}\Phi\dot q - M^{-1}Kq + M^{-1}\beta u + M^{-1}\beta_1 w + M^{-1}\beta_2 \dot w. \tag{6.5} \]

Now let x₁ and x₂ be the state variables defined by

\[
x_1 = q = \begin{bmatrix} q_1 \\ q_2 \end{bmatrix}
\quad\text{and}\quad
x_2 = \dot x_1 = \dot q = \begin{bmatrix} \dot q_1 \\ \dot q_2 \end{bmatrix}. \tag{6.6}
\]

Notice that x₁ and x₂ are both column vectors of length two. Using x₂ = q̇ in equation (6.5) with ẋ₁ = Ix₂, we see that the equations of motion in (6.1) admit a state space representation of the form:

\[
\begin{bmatrix} \dot x_1 \\ \dot x_2 \end{bmatrix}
= \begin{bmatrix} 0 & I \\ -M^{-1}K & -M^{-1}\Phi \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
+ \begin{bmatrix} 0 \\ M^{-1}\beta \end{bmatrix} u
+ \begin{bmatrix} 0 \\ M^{-1}\beta_1 \end{bmatrix} w
+ \begin{bmatrix} 0 \\ M^{-1}\beta_2 \end{bmatrix} \dot w. \tag{6.7}
\]

In other words, the bus suspension admits a state space model of the form:

\[ \dot x = Ax + Bu + \Gamma_1 w + \Gamma_2 \dot w \quad\text{and}\quad y = Cx. \tag{6.8} \]

Here the state x is a vector of length four, while A is a 4 × 4 matrix, B, Γ₁ and Γ₂ are all column vectors of length four, and C is a row vector of length four. To be precise,

\[
A = \begin{bmatrix} 0 & I \\ -M^{-1}K & -M^{-1}\Phi \end{bmatrix}
= \begin{bmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
-\frac{k_1}{m_1} & \frac{k_1}{m_1} & -\frac{b_1}{m_1} & \frac{b_1}{m_1} \\
\frac{k_1}{m_2} & -\frac{k_1+k_2}{m_2} & \frac{b_1}{m_2} & -\frac{b_1+b_2}{m_2}
\end{bmatrix}
\quad\text{and}\quad
x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= \begin{bmatrix} q_1 \\ q_2 \\ \dot q_1 \\ \dot q_2 \end{bmatrix}.
\]

(The identity matrix I in the upper right hand corner of A is the 2 × 2 identity matrix, while the zero in the upper left hand corner of A is the 2 × 2 zero matrix.) The row matrix C is given by

\[ C = \begin{bmatrix} 1 & -1 & 0 & 0 \end{bmatrix}. \]

The column matrices B, Γ₁ and Γ₂ are given by

\[
B = \begin{bmatrix} 0 \\ M^{-1}\beta \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ \frac{1}{m_1} \\ -\frac{1}{m_2} \end{bmatrix}, \quad
\Gamma_1 = \begin{bmatrix} 0 \\ M^{-1}\beta_1 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \\ \frac{k_2}{m_2} \end{bmatrix}, \quad
\Gamma_2 = \begin{bmatrix} 0 \\ M^{-1}\beta_2 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \\ \frac{b_2}{m_2} \end{bmatrix}.
\]


The Matlab commands to form {A, B, Γ₁, Γ₂, C} are given by

A = [zeros(2, 2), eye(2); −inv(M) ∗ K, −inv(M) ∗ Φ]
B = [0; 0; inv(M) ∗ β]
Γ₁ = [0; 0; inv(M) ∗ β₁]
Γ₂ = [0; 0; inv(M) ∗ β₂]
C = [1, −1, 0, 0]

If the disturbance w = 0, then the transfer function G₁(s) from u to y is given by

\[ G_1(s) = \frac{Y(s)}{U(s)} = C(sI - A)^{-1}B. \tag{6.9} \]

To see this, notice that when w = 0, the state space equation in (6.8) reduces to

\[ \dot x = Ax + Bu \quad\text{and}\quad y = Cx. \tag{6.10} \]

By taking the Laplace transform with x(0) = 0, we obtain sX = AX + BU. This readily implies that (sI − A)X = BU. Multiplying both sides by (sI − A)⁻¹ yields

\[ X(s) = (sI - A)^{-1}BU(s). \]

The Laplace transform of the output y = Cx is given by Y = CX. By substituting X = (sI − A)⁻¹BU into Y = CX, we obtain

\[ Y = C(sI - A)^{-1}BU. \]

Dividing by U, we see that the transfer function from u to y is given by

\[ \frac{Y(s)}{U(s)} = C(sI - A)^{-1}B, \]

that is, equation (6.9) holds.

If the input u = 0, then the transfer function G₂(s) from the disturbance w to y is given by

\[ G_2(s) = \frac{Y(s)}{W(s)} = C(sI - A)^{-1}\Gamma_1 + sC(sI - A)^{-1}\Gamma_2. \tag{6.11} \]

To see this, notice that when u = 0, the state space equation in (6.8) reduces to

\[ \dot x = Ax + \Gamma_1 w + \Gamma_2 \dot w \quad\text{and}\quad y = Cx. \tag{6.12} \]

By taking the Laplace transform with x(0) = 0 and w(0) = 0, we obtain

\[ sX = AX + \Gamma_1 W + s\Gamma_2 W. \]

This readily implies that (sI − A)X = Γ₁W + sΓ₂W. Multiplying both sides by (sI − A)⁻¹ yields

\[ X(s) = (sI - A)^{-1}(\Gamma_1 W + s\Gamma_2 W). \tag{6.13} \]


The Laplace transform of y = Cx is given by Y = CX. By substituting X(s) in (6.13) into Y = CX, we obtain

\[ Y = \bigl(C(sI - A)^{-1}\Gamma_1 + sC(sI - A)^{-1}\Gamma_2\bigr)W. \]

Dividing by W, we see that the transfer function from w to y is given by

\[ \frac{Y(s)}{W(s)} = C(sI - A)^{-1}\Gamma_1 + sC(sI - A)^{-1}\Gamma_2, \]

that is, equation (6.11) holds.

7.6.1 Exercise

Following [4] assume that

• The body mass m1 = 2500 kg;

• The suspension mass m2 = 320 kg;

• The spring constant of suspension system k1 = 80, 000 N/m;

• The spring constant of wheel and tire k2 = 500, 000 N/m;

• The damping constant of suspension system b1 = 350 Ns/m;

• The damping constant of wheel and tire b2 = 15, 020 Ns/m;

• The control force is u and the disturbance is w.
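Assembling these matrices numerically is straightforward; a Python/NumPy sketch with the parameter values above (a stand-in for the Matlab commands in the text):

```python
import numpy as np

# Bus suspension data from [4]
m1, m2 = 2500.0, 320.0        # masses (kg)
k1, k2 = 80000.0, 500000.0    # spring constants (N/m)
b1, b2 = 350.0, 15020.0       # damping constants (N s/m)

M = np.diag([m1, m2])
Phi = np.array([[b1, -b1], [-b1, b1 + b2]])
K = np.array([[k1, -k1], [-k1, k1 + k2]])
beta = np.array([[1.0], [-1.0]])

Minv = np.linalg.inv(M)
A = np.block([[np.zeros((2, 2)), np.eye(2)], [-Minv @ K, -Minv @ Phi]])
B = np.vstack([np.zeros((2, 1)), Minv @ beta])
C = np.array([[1.0, -1.0, 0.0, 0.0]])

# The open loop suspension is stable: every eigenvalue of A
# lies in the open left half plane.
print(np.linalg.eigvals(A))
```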

Exercise 1. Assume that the disturbance w = 0. Then use the Matlab command ss2tf to find the transfer function G₁(s) from the input u to the output y, that is, find

\[ G_1(s) = \frac{Y(s)}{U(s)}. \]

Exercise 2. Assume that the input u = 0 (and w(0) = 0). Then use the Matlab command ss2tf to find the transfer function G₂(s) from the disturbance w to the output y, that is, find

\[ G_2(s) = \frac{Y(s)}{W(s)}. \]

Depending upon your approach, the minreal command in Matlab may be helpful in adding two transfer functions by eliminating the common poles and zeros.

Exercise 3. Assume that the input u = 0 and the disturbance w = 1/10. Then use the step command in Matlab on G₂ to plot the ride the passengers experience when the bus hits a bump of w(t) = 1/10 in the road with w(0) = 0.


Exercise 4. Graph the Bode plot of G₂(s) in Matlab, and find the corresponding resonance frequency ω_r.

Exercise 5. Assume that the input u = 0 and the disturbance w = (1/10) sin(ω_r t), that is, the bus is riding over a disturbance at the resonance frequency ω_r with amplitude 1/10.

(5a) Use the lsim command in Matlab on G₂ to plot the ride y(t) the passengers experience for 50 seconds when w(t) = (1/10) sin(ω_r t).

(5b) Find G₂(iω_r) and compute the steady state response y_{ss}(t) for w = (1/10) sin(ω_r t). What is the amplitude of y_{ss}(t)? In other words, how high are the passengers bouncing around?

(5c) Plot y_{ss}(t) on the same graph as the lsim plot in Part (5a).

Exercise 6. Notice that the bus oscillates for 50 seconds when it hits a bump of w = 1/10. To help alleviate this problem, let us apply a feedback force u = −γy = −γ(q₁ − q₂) on the two masses to reduce the oscillation time when w(t) = 1/10. The gain or constant γ applies a force u = −γy which is proportional to the displacement between the wheel and bus. In this case, substituting u = −γy = −γCx in (6.8), we have

\[ \dot x = (A - B\gamma C)x + \Gamma_1 w + \Gamma_2 \dot w \quad\text{and}\quad y = Cx. \tag{6.14} \]

Notice that the eigenvalues of A − BγC control the oscillation of the bus and how fast these oscillations die out. In this problem, we are looking for a gain γ to dramatically improve the ride of the bus.
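The brute force gain sweep suggested below in (6b) can be sketched as follows (Python/NumPy; the four trial gains are arbitrary sample points in the suggested range 10⁵ to 5 × 10⁶):

```python
import numpy as np

# Rebuild A, B, C for the bus suspension (parameters from [4])
m1, m2, k1, k2, b1, b2 = 2500.0, 320.0, 80000.0, 500000.0, 350.0, 15020.0
M = np.diag([m1, m2])
Phi = np.array([[b1, -b1], [-b1, b1 + b2]])
K = np.array([[k1, -k1], [-k1, k1 + k2]])
Minv = np.linalg.inv(M)
A = np.block([[np.zeros((2, 2)), np.eye(2)], [-Minv @ K, -Minv @ Phi]])
B = np.vstack([np.zeros((2, 1)), Minv @ np.array([[1.0], [-1.0]])])
C = np.array([[1.0, -1.0, 0.0, 0.0]])

# Track the slowest closed loop eigenvalue of A - gamma*B*C as gamma varies.
decay = {}
for gamma in [1e5, 5e5, 1e6, 5e6]:
    eigs = np.linalg.eigvals(A - gamma * (B @ C))
    decay[gamma] = max(eigs.real)
    print(gamma, decay[gamma])
```

Physically, u = −γ(q₁ − q₂) acts like an extra spring between the bus and the wheel, so the closed loop stays stable for any γ > 0; the sweep is only about how fast the oscillations die out.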

(6a) Prove that the transfer function from w to y for the state space system in (6.14) is given by

\[ G_f(s) = C\bigl(sI - (A - B\gamma C)\bigr)^{-1}\Gamma_1 + sC\bigl(sI - (A - B\gamma C)\bigr)^{-1}\Gamma_2. \tag{6.15} \]

Recall that x(0) = 0 and w(0) = 0 and u(t) = 0.

(6b) Find a constant γ and the corresponding eigenvalues for A − BγC to help damp out the bus's oscillations when w(t) = 1/10. There is no unique answer. You can plot the eigenvalues of A − BγC for 1000 different γ between 10⁵ and 5 × 10⁶ on the same graph. One could also plot γ vs max(real(eig(A − B ∗ γ ∗ C))) in Matlab to help sort through these eigenvalues. One needs a lot of force to move the bus. Hence γ is large. (In a control course one may use the root locus to solve this problem. The graph of the eigenvalues eig(A − BγC) as γ varies is actually the root locus; see Ogata [27] for a discussion of the root locus. Here we will use brute force to tune the gain γ.) Use Matlab to sort through these gains γ and choose one gain γ to make the oscillations die out in three seconds or less. Try to choose a γ such that the eigenvalues {λ_j}₁⁴ for A − BγC are all in the left half plane ℜλ_j < 0 and e^{λ_j t} dies quickly for all j = 1, 2, 3, 4.

(6c) Using your gain γ find G_f and plot the step response for G_f with w = 1/10. Did you reduce the oscillation time?


(6d) Using your gain γ with lsim, plot the output y(t) for G_f with w = (1/10) sin(ω_r t). Did you reduce the amplitude of the oscillation and the ride of the passengers at the resonance frequency ω_r for G₂? Find the steady state response y_{ss}(t) for G_f with w(t) = (1/10) sin(ω_r t).

(6e) Graph the Bode plot of G_f(s) and find the corresponding resonance frequency ω₀ for G_f(s). What is |G_f(iω₀)|? Does the new resonance frequency introduce another "rough" ride when the bus drives over the disturbance w(t) = (1/10) sin(ω₀ t)?

Finally, to obtain some better control methods to improve the ride of the bus see the excellent web page [4].

7.7 All pass filters

The results in this section are not used in the remaining part of the notes and can be skipped by the uninterested reader. In this section, we will present a brief introduction to all pass filters or Blaschke products. We say that a transfer function G(s) is an all pass filter if G(s) is a proper stable function and |G(iω)| = 1 for all frequencies ω.

We say that a function B(s) is a Blaschke product if

\[ B(s) = \gamma \prod_{j=1}^{n} \frac{s + \bar\lambda_j}{s - \lambda_j} \tag{7.1} \]

where ℜ(λ_j) < 0 for all j = 1, 2, …, n and γ = ±1. Moreover, n is the order of the Blaschke product. Throughout we always assume that the poles {λ_j}₁ⁿ for B(s) come in complex conjugate pairs. (The poles of B(s) can be repeated.) This guarantees that B(s) = n(s)/d(s) where the numerator n(s) and denominator d(s) are polynomials with real coefficients, which happens in almost all engineering problems. Clearly, B(s) is a stable transfer function. It is emphasized that because the poles {λ_j}₁ⁿ come in complex conjugate pairs,

\[ B(s) = \pm\frac{d(-s)}{d(s)} \quad\text{where}\quad d(s) = \prod_{j=1}^{n}(s - \lambda_j) \tag{7.2} \]

is a stable polynomial. Finally, a polynomial is stable if all of its roots are contained in the open left half plane {z ∈ ℂ : ℜ(z) < 0}.

For example,

\[
B(s) = \frac{(s-1)(s-2)(s-3-i)(s-3+i)}{(s+1)(s+2)(s+3+i)(s+3-i)}
= \frac{s^4 - 9s^3 + 30s^2 - 42s + 20}{s^4 + 9s^3 + 30s^2 + 42s + 20} \tag{7.3}
\]

is a Blaschke product of order four. We used the Matlab command

tf(poly([1, 2, 3 − i, 3 + i]), poly([−1, −2, −3 − i, −3 + i]))

to compute the second equality. Finally, as expected, the Blaschke product B(s) in (7.3) is of the form B(s) = d(−s)/d(s); see (7.2).


We claim that a Blaschke product is an all pass filter. In other words, B(s) in (7.1) is an all pass filter. Clearly, B(s) is stable. Using the fact that |z̄| = |z| for any complex number z, we have

\[
|B(i\omega)|^2
= \left|\prod_{j=1}^{n}\frac{i\omega + \bar\lambda_j}{i\omega - \lambda_j}\right|^2
= \prod_{j=1}^{n}\frac{|i\omega + \bar\lambda_j|^2}{|i\omega - \lambda_j|^2}
= \prod_{j=1}^{n}\frac{|-i\omega + \lambda_j|^2}{|i\omega - \lambda_j|^2} = 1.
\]

Therefore |B(iω)| = 1, and thus a Blaschke product B(s) is an all pass filter. This proves part of the following result.

THEOREM 7.7.1 Let G(s) be a stable transfer function. Then the following statements are equivalent.

• G(s) is an all pass filter.

• G(s) is a Blaschke product.

• G(s) = ±d(−s)/d(s) where d(s) is a stable polynomial.
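A numerical spot check of the all pass property for the order four example (7.3) (Python sketch):

```python
import numpy as np

# Coefficients of B(s) in (7.3): numerator d(-s), denominator d(s)
num = [1, -9, 30, -42, 20]
den = [1, 9, 30, 42, 20]

vals = []
for w in [0.1, 1.0, 7.3, 100.0]:
    Bval = np.polyval(num, 1j * w) / np.polyval(den, 1j * w)
    vals.append(abs(Bval))
print(vals)  # each value equals 1: B is all pass
```

On the imaginary axis the numerator d(−iω) is the complex conjugate of the denominator d(iω), which is exactly why the magnitude is identically 1.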

Let B(s) = (s − a)/(s + a) be a Blaschke product of order one where a > 0. The pole for B(s) is λ = −a. The angles of B(iω) and of −B(iω) are, respectively, given by

\[
\mathrm{angle}\left(\frac{i\omega - a}{i\omega + a}\right) = \pi - 2\arctan\left(\frac{\omega}{a}\right)
\]
\[
\mathrm{angle}\left(\frac{a - i\omega}{i\omega + a}\right) = -2\arctan\left(\frac{\omega}{a}\right) \qquad(\text{when } a > 0). \tag{7.4}
\]

To obtain the second equality, simply observe that

\[
\frac{a - i\omega}{i\omega + a}
= \frac{\sqrt{a^2+\omega^2}\; e^{-i\arctan(\omega/a)}}{\sqrt{a^2+\omega^2}\; e^{i\arctan(\omega/a)}}
= e^{-2i\arctan(\omega/a)} \qquad(\text{when } a > 0). \tag{7.5}
\]

The first equality in (7.4) follows from the previous calculation and

\[
\frac{i\omega - a}{i\omega + a}
= e^{i\pi}\,\frac{a - i\omega}{i\omega + a}
= e^{i\pi}\, e^{-2i\arctan(\omega/a)}
= e^{i(\pi - 2\arctan(\omega/a))} \qquad(\text{when } a > 0). \tag{7.6}
\]

Equation (7.4) shows that the angle of (iω − a)/(iω + a) varies from π to 0 as ω moves from 0 to infinity. With ω fixed, the angle of (iω − a)/(iω + a) varies from 0 to π as a moves from 0 to infinity. On the other hand, the angle of (a − iω)/(iω + a) varies from 0 to −π as ω moves from 0 to infinity. With ω fixed, the angle of (a − iω)/(iω + a) varies from −π to 0 as a moves from 0 to infinity.
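Formula (7.4) is easy to sanity check numerically (Python sketch):

```python
import numpy as np

# Check angle((i*w - a)/(i*w + a)) = pi - 2*arctan(w/a) for a > 0
for a in [0.5, 1.0, 3.0]:
    for w in [0.1, 1.0, 10.0]:
        lhs = np.angle((1j * w - a) / (1j * w + a))
        rhs = np.pi - 2 * np.arctan(w / a)
        assert abs(lhs - rhs) < 1e-12
print("formula (7.4) checks out")
```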


An example of an all pass steady state response. Consider the all pass filter B(s) of order two given by

\[
B(s) = \frac{Y(s)}{U(s)} = \frac{(s-2)(s-2\sqrt3)}{(s+2)(s+2\sqrt3)}.
\]

Clearly, B(s) is a second order Blaschke product with poles λ₁ = −2 and λ₂ = −2√3. In particular, |B(iω)| = 1. Assume that the input

\[ u(t) = 4 - 4\cos(2t), \]

and y(t) is the output, that is, Y(s) = B(s)U(s). Then we claim that the steady state output

\[
y_{ss}(t) = 4 - 4\cos\left(2t + \frac{7\pi}{6}\right) = 4 + 4\cos\left(2t + \frac{\pi}{6}\right).
\]

Finally, it is noted that the amplitudes of u(t) and y_{ss}(t) are the same. However, the phase has shifted.

Since B(s) is stable, the steady state response is given by

\[
y_{ss}(t) = 4B(0) - 4|B(2i)|\cos\bigl(2t + \mathrm{angle}(B(2i))\bigr).
\]

Because B(0) = 1, we see that 4B(0) = 4. Moreover,

\[
B(2i) = \frac{2i-2}{2i+2}\times\frac{2i-2\sqrt3}{2i+2\sqrt3}
= \frac{i-1}{i+1}\times\frac{i-\sqrt3}{i+\sqrt3}
= \frac{\sqrt2\, e^{3\pi i/4}}{\sqrt2\, e^{i\pi/4}}\times\frac{2e^{5\pi i/6}}{2e^{i\pi/6}}
= e^{\frac{\pi i}{2} + \frac{4\pi i}{6}} = e^{\frac{7\pi i}{6}}.
\]

Therefore B(2i) = e^{7πi/6}. (This also follows from (7.6).) In particular, |B(2i)| = 1 (as expected) and angle(B(2i)) = 7π/6. So the steady state response is determined by

\[
y_{ss}(t) = 4 - 4\cos\left(2t + \frac{7\pi}{6}\right).
\]

Finally, for another form of the steady state response, recall that −cos(θ) = cos(θ + π). Using this we also have

\[
y_{ss}(t) = 4 - 4\cos\left(2t + \frac{7\pi}{6}\right) = 4 + 4\cos\left(2t + \frac{\pi}{6}\right).
\]
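A quick numerical check of B(2i) (Python sketch):

```python
import numpy as np

s = 2j
Bval = (s - 2) * (s - 2 * np.sqrt(3)) / ((s + 2) * (s + 2 * np.sqrt(3)))
print(Bval)
print(np.exp(7j * np.pi / 6))  # the same point on the unit circle
```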

Another all pass example. The McMillan degree of a transfer function G(s) = n(s)/d(s) is the degree of the denominator polynomial when n(s) and d(s) have no common roots. In particular, the order of a Blaschke product equals its McMillan degree. All pass filters are used to change the phase of the steady state response without changing its amplitude. For example, consider the input

\[ u(t) = \cos(2t). \]


Our problem is to find transfer function G(s) with McMillan degree one such that the steadystate output is determined by

yss(t) = 3 cos(2t)− 4 sin(2t).

According to equation (2.20) in Chapter 1, or (4.4) in Chapter 2, we have

α cos(x) + β sin(x) =√α2 + β2 cos(x− ψ)

where ψ equals the angle of α + iβ. (As expected, α and β are real number.) Hence

yss(t) = 3 cos(2t)− 4 sin(2t) = 5 cos(2t+ arctan(4/3))

= |G(2i)| cos(2t+ angle(G(2i)).

In other words, we are looking for a stable transfer function with McMillan degree one suchthat G(2i) = 5ei arctan(4/3). There are many solutions to this problem. In our approach wewill choose G(s) = 5B(s) where B(s) is a Blaschke product of order one, that is,

\[
G(s) = \pm\frac{5(s-a)}{s+a}
\]
where a > 0 and B(s) = ±(s − a)/(s + a). (The pole for B(s) is λ = −a.) For the moment we are not sure of the sign ±. It is emphasized that |G(iω)| = 5 for all frequencies ω, that is, |G(iω)| = 5|B(iω)| = 5. Let us first try
\[
G(s) = 5\,\frac{s-a}{s+a}.
\]

In this case, G(2i) = |G(2i)|e^{iϕ} = 5e^{iϕ} where ϕ is the angle of G(2i). Moreover, we want ϕ = arctan(4/3). Notice that
\[
\mathrm{angle}(G(2i)) = \mathrm{angle}\Bigl(\frac{2i-a}{2i+a}\Bigr)
= \mathrm{angle}\Bigl(e^{i\pi}\,\frac{a-2i}{2i+a}\Bigr)
= \pi + \mathrm{angle}\Bigl(\frac{a-2i}{2i+a}\Bigr)
= \pi - \arctan(2/a) - \arctan(2/a) = \pi - 2\arctan(2/a).
\]
(This calculation is a special case of (7.4).) Recall that we want G(2i) = 5e^{i arctan(4/3)}. Since |G(2i)| = 5, we see that
\[
\arctan(4/3) = \mathrm{angle}(G(2i)) = \pi - 2\arctan(2/a).
\]
In other words, 2 arctan(2/a) = π − arctan(4/3), or equivalently,
\[
\frac{2}{a} = \tan\Bigl(\frac{\pi - \arctan(4/3)}{2}\Bigr) = 2.
\]
Hence a = 1. Therefore the filter we are looking for is given by
\[
G(s) = \frac{5(s-1)}{s+1}.
\]


Finally, it is noted that
\[
G(2i) = \frac{5(2i-1)}{2i+1} = 3 + 4i = 5e^{i\arctan(4/3)}.
\]
The Bode plot for this transfer function G(s) is presented in Figure 7.26. As expected, the magnitude |G(iω)| = 5, or 20 log(5) = 13.9794 db, across all frequencies. The angle of G(2i) is angle(3 + 4i) = arctan(4/3) ≈ 0.9273 radians, about 53.1 degrees; the data cursor on the Bode plot, placed at the nearby frequency 1.99 rad/s, reads 53.6 degrees. Finally, the Matlab commands we used to graph the Bode plot of G(s) are given by

• g = tf([5,−5], [1, 1])

• bode(g); grid
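As a quick numeric sanity check (a Python/numpy aside; the text itself works in Matlab), one can confirm both the value G(2i) = 3 + 4i for the all pass based filter G(s) = 5(s − 1)/(s + 1) found above and the identity tan((π − arctan(4/3))/2) = 2 that produced a = 1:

```python
import numpy as np

G = lambda s: 5 * (s - 1) / (s + 1)   # the all pass based filter found above

val = G(2j)
print(val)                            # 3 + 4i (up to roundoff)
print(abs(val), np.angle(val))        # 5.0 and arctan(4/3), about 0.9273

# The identity that produced a = 1:
print(np.tan((np.pi - np.arctan(4 / 3)) / 2))   # 2 (up to roundoff)
```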

Figure 7.26: The Bode plot for G(s) = 5(s − 1)/(s + 1). (The data cursor reads a phase of 53.6 degrees at frequency 1.99 rad/s.)

Let us use the lsim command in Matlab to plot the output y(t) of our filter G for the input u(t) = cos(2t) over a 20 second interval, and compare y(t) to the steady state response
\[
y_{ss}(t) = 3\cos(2t) - 4\sin(2t) = 5\cos(2t + \arctan(4/3)).
\]
The simulation shows that 3 cos(2t) − 4 sin(2t) is indeed the steady state response of our filter. The graph is presented in Figure 7.27. The Matlab commands we used to generate this graph are given by

• g = tf([5, −5], [1, 1]);

• t = linspace(0, 20, 2 ∧ 14);

• u = cos(2 ∗ t); yss = 3 ∗ cos(2 ∗ t) − 4 ∗ sin(2 ∗ t);

• lsim(g, u, t); hold on; plot(t, yss, ’r’); grid


Figure 7.27: Simulation for y(t)

An example of changing the phase of a transfer function. All pass filters can be used to change the phase of the steady state response. For example, let G(s) be a stable transfer function. Assume that the input
\[
u(t) = \sum_{j=1}^{m} A_j \cos(\omega_j t)
\]
where {A_j}_1^m and {ω_j}_1^m are the corresponding amplitudes and frequencies. Then the steady state response y_{ss}(t) is given by
\[
y_{ss}(t) = \sum_{j=1}^{m} |G(i\omega_j)|\,A_j \cos\bigl(\omega_j t + \varphi(\omega_j)\bigr)
\quad\mbox{where } G(i\omega) = |G(i\omega)|e^{i\varphi(\omega)}. \tag{7.7}
\]
Let B(s) be an all pass filter or Blaschke product. Let F(s) be the stable transfer function defined by F(s) = B(s)G(s). Then
\[
F(i\omega) = B(i\omega)G(i\omega) = |B(i\omega)|e^{i\psi(\omega)}|G(i\omega)|e^{i\varphi(\omega)} = |G(i\omega)|e^{i(\varphi(\omega)+\psi(\omega))}.
\]
As expected, B(iω) = e^{iψ(ω)}. (Because B(s) is an all pass filter, |B(iω)| = 1.) Let v(t) be the output for the filter F(s) with input u(t), that is, V(s) = F(s)U(s) in the s domain. Since Y(s) = G(s)U(s), we also have V(s) = B(s)Y(s). So the steady state response v_{ss}(t) to the filter F(s) = B(s)G(s) is given by
\[
v_{ss}(t) = \sum_{j=1}^{m} |G(i\omega_j)|\,A_j \cos\bigl(\omega_j t + \varphi(\omega_j) + \psi(\omega_j)\bigr). \tag{7.8}
\]

Because V(s) = B(s)Y(s), the steady state response v_{ss}(t) in (7.8) is also the steady state response for the all pass filter B(s) corresponding to the input y_{ss}(t). By designing the appropriate all pass filter B(s), one can choose the phases {ψ(ω_j)}_1^m, and thus change the phase of cos(ω_j t + ϕ(ω_j) + ψ(ω_j)) in the steady state response v_{ss}(t). In general this involves solving nonlinear equations, which takes us beyond the scope of these notes. Instead of pursuing the general case, we will present a simple example.

Consider the transfer function
\[
G(s) = \frac{s^2 - 2s}{s^3 + 9s^2 + 24s + 16} = \frac{s(s-2)}{(s+1)(s+4)^2}.
\]
The Bode plot for G(s) is presented in the bottom graph of Figure 7.28. The magnitude plot of G looks like a band pass filter about ω = 3. (This is not a very good band pass filter; we are just using this G as an example.) Assume that the input u(t) = cos(3t). Then using G(3i) = 0.0504 + 0.1272i = 0.1368e^{1.1935i}, we see that the steady state response is given by
\[
y_{ss}(t) = 0.1368\cos(3t + 1.1935).
\]

A good band pass filter centered about ω = 3 would have a steady state response of cos(3t). Now let us try to find a constant γ with an all pass filter B(s) such that the stable transfer function F(s) = γB(s)G(s) has a steady state response of v_{ss}(t) = cos(3t). The idea is to develop a band pass filter from G(s) such that cos(3t) passes through the filter F(s) unaltered. To accomplish this we must have F(3i) = 1. In other words,
\[
1 = F(3i) = \gamma B(3i)G(3i) = \gamma e^{i\psi(3i)}|G(3i)|e^{i\varphi(3i)} = \gamma|G(3i)|e^{i(\psi(3i)+\varphi(3i))}.
\]
Using G(3i) = 0.1368e^{1.1935i}, we obtain
\[
1 = 0.1368\,\gamma\, e^{i(\psi(3i)+1.1935)}.
\]
The choice of γ is simple, that is, γ = 1/0.1368 = 7.3088. So we are looking for B(s) such that angle(B(3i)) = −1.1935. Motivated by the second equation in (7.4), consider a first order Blaschke product B(s) = (a − s)/(s + a) for some a > 0. Then (7.4) yields
\[
-1.1935 = \mathrm{angle}(B(3i)) = \mathrm{angle}\Bigl(\frac{a-3i}{3i+a}\Bigr) = -2\arctan\Bigl(\frac{3}{a}\Bigr).
\]
In other words,
\[
\frac{3}{a} = \tan\Bigl(\frac{1.1935}{2}\Bigr) = 0.6794.
\]
Thus a = 3/0.6794 = 4.4158 and
\[
B(s) = \frac{4.4158 - s}{s + 4.4158}.
\]
The transfer function F that we are looking for is given by
\[
F(s) = \gamma B(s)G(s) = \frac{7.3088\,(4.4158 - s)(s^2 - 2s)}{(s + 4.4158)(s^3 + 9s^2 + 24s + 16)}.
\]
The Bode plot of F(s) is given in the top graph of Figure 7.28. Notice that |F(3i)| = 1, or zero db, and angle(F(3i)) = 0. Hence F(3i) = 1 (up to numerical error). Therefore if u(t) = cos(3t), then the steady state response is v_{ss}(t) = cos(3t).
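The design can be checked numerically. The following Python/numpy sketch is an aside (the rounded values γ = 7.3088 and a = 4.4158 are taken from the text); it evaluates G and F = γBG at s = 3i:

```python
import numpy as np

gamma, a = 7.3088, 4.4158                      # rounded values computed in the text

G = lambda s: s * (s - 2) / ((s + 1) * (s + 4) ** 2)
B = lambda s: (a - s) / (s + a)                # first order Blaschke product

print(G(3j))                   # about 0.0504 + 0.1272i
print(gamma * B(3j) * G(3j))   # about 1, so cos(3t) passes through F unaltered
```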


Figure 7.28: The Bode plots for G(s) = s(s − 2)/((s + 1)(s + 4)^2) (bottom) and F(s) = 7.3088B(s)G(s) (top)

7.7.1 Exercise

Problem 1. Prove Theorem 7.7.1.

Problem 2. Consider the input

u(t) = 4− 3 sin(2t).

(i) Find an all pass filter B(s) such that its steady state response is given by

yss(t) = 4 + 3 cos(2t).

(ii) Compute B(0) and B(2i).

(iii) Use the lsim command in Matlab to plot the input u(t) = 4 − 3 sin(2t) and the output y(t) on the same graph for 16 seconds.

(iv) Use hold on in Matlab and plot yss(t) on the same graph.

Hint: express sin in terms of cos with a phase shift.

Problem 3. Consider the input

u(t) = 4 + 3 sin(2t).

(i) Find an all pass filter B(s) such that its steady state response is determined by

yss(t) = 4 + 3 cos(2t).

(ii) Compute B(0) and B(2i).

Page 415: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

7.7. ALL PASS FILTERS 415

(iii) Use the lsim command in Matlab to plot the input u(t) = 4 + 3 sin(2t) and the output y(t) on the same graph for 20 seconds.

(iv) Use hold on in Matlab and plot yss(t) on the same graph.

Hint: Consider a Blaschke product of the form B(s) = ((s − a)/(s + a))^2.

Problem 4. Recall that the McMillan degree of a transfer function G(s) = n(s)/d(s) equals the degree of the denominator d(s) when n(s) and d(s) have no common roots. Consider the input
\[
u(t) = \cos(t) + \cos(3t) - \sin(5t).
\]
(i) Design a band pass filter F(s) of McMillan degree three such that the steady state output
\[
y_{ss}(t) \approx \cos\Bigl(3t + \frac{\pi}{3}\Bigr).
\]
Use Matlab to find the Bode plot for F(s).

(ii) Use the lsim command in Matlab with u(t) to see how well your filter F(s) approximates cos(3t + π/3). Plot cos(3t + π/3) and y(t) on the same graph.

(iii) Construct a Simulink model for your filter F(s). Use Simulink to plot cos(3t + π/3) and y(t) on the same graph.

Hint: Consider F(s) = B(s)G(s) where B(s) is a Blaschke product of order one, and G(s) is a band pass filter of McMillan degree two; see also Problem 9 in Section 7.5.3.

Problem 5. This involves solving two nonlinear equations with two unknowns. Consider the input signal
\[
u(t) = 2\cos(2t - 2.8929) - 3\sin(4t - 1.7770).
\]
Find an all pass filter B(s) of McMillan degree two such that the steady state output is given by
\[
y_{ss}(t) = 2\cos(2t) - 3\sin(4t).
\]
Use the lsim command in Matlab to plot both 2 cos(2t) − 3 sin(4t) and the output y(t) on the same graph.


Chapter 8

Butterworth filters

In this chapter we will present an introduction to Butterworth filters. For further results on Butterworth filters and other filtering techniques see [6, 19, 30, 31, 34].

8.1 Low pass Butterworth filters

This section is devoted to low pass Butterworth filters. The low pass Butterworth filter of order n with cutoff angular frequency ωc is the stable rational filter defined by
\[
G(s) = \frac{\omega_c^n}{d(s)} \quad\mbox{where}\quad d(s) = \prod_{k=0}^{n-1}\bigl(s - i\omega_c\, e^{i(\pi+2\pi k)/2n}\bigr), \tag{1.1}
\]
and n is a strictly positive integer. As expected, the cutoff angular frequency ωc > 0. Notice that d is a monic polynomial of degree n whose roots are given by
\[
\lambda_k = i\omega_c\, e^{i(\pi+2\pi k)/2n} \qquad (k = 0, 1, 2, \cdots, n-1). \tag{1.2}
\]

The roots {λ_k}_0^{n-1} live on the circle of radius ωc in the open left half plane, that is,
\[
\{\lambda_k\}_0^{n-1} \subseteq \{s : |s| = \omega_c\} \cap \{s : \Re s < 0\}. \tag{1.3}
\]
The roots {λ_k}_0^{n-1} start with λ_0 on the circle with radius ωc and angle π/2 + π/2n, and move counterclockwise around the circle of radius ωc to λ_{n-1} with radius ωc and angle 3π/2 − π/2n. Since all the roots of d are contained in the open left half plane, the Butterworth filter G = ω_c^n/d is stable. Finally, it is noted that G(s) = G_1(s/ωc) where G_1(s) is the low pass Butterworth filter with cutoff angular frequency one.
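The pole pattern in (1.2)–(1.3) is easy to generate numerically. The sketch below is a Python/numpy aside (the choice n = 4, ωc = 2 is purely illustrative); it confirms the poles sit on the circle of radius ωc in the open left half plane:

```python
import numpy as np

n, wc = 4, 2.0                                   # illustrative order and cutoff
k = np.arange(n)
poles = 1j * wc * np.exp(1j * (np.pi + 2 * np.pi * k) / (2 * n))   # equation (1.2)

print(np.abs(poles))   # all equal to wc: the poles lie on the circle of radius wc
print(poles.real)      # all strictly negative: the open left half plane
```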

By construction the Butterworth filter is a stable proper rational transfer function. Therefore one can use the techniques in Section 5.6 in Chapter 5 to build a circuit consisting of resistors, capacitors and operational amplifiers to implement any n-th order Butterworth filter.

To motivate where the Butterworth filter comes from, consider the problem of trying to find a stable rational filter G(s) satisfying
\[
|G(i\omega)|^2 = \frac{1}{1 + \omega^{2n}/\omega_c^{2n}} = \frac{\omega_c^{2n}}{\omega_c^{2n} + \omega^{2n}}. \tag{1.4}
\]

417


As before, ωc is the cutoff angular frequency and n is a strictly positive integer. In a moment we will see that the filter G in (1.1) satisfies the magnitude constraint in (1.4). If G is a stable rational filter satisfying (1.4), then graphing |G(iω)| shows that G is a nice low pass stable filter. Clearly, |G(iω)| ≈ 1 when |ω| << ωc and |G(iω)| ≈ 0 when |ω| >> ωc. At the cutoff angular frequency |G(iωc)| = 1/√2. So the Butterworth filter G maintains approximately 71% of its maximum amplitude at the cutoff angular frequency. On the magnitude Bode plot the Butterworth filter G drops approximately 3 decibels at the cutoff angular frequency. Moreover, |G(iω)| is decreasing as ω moves from zero to infinity. The higher the integer n, the faster the magnitude |G(iω)| drops off when ω > ωc. Moreover, the Butterworth filter has the following properties. First |G(0)| = 1 and
\[
\left.\frac{d^k|G(i\omega)|^2}{d\omega^k}\right|_{\omega=0} = 0 \qquad (\mbox{for } k = 1, 2, \cdots, 2n-1). \tag{1.5}
\]
In other words, the first 2n − 1 derivatives of |G(iω)|² evaluated at the zero frequency are zero. This along with |G(0)| = 1 means that the low pass Butterworth filter of order n is one at the zero frequency, and the "slope" of |G(iω)|² at ω = 0, up to its first 2n − 1 derivatives, is also zero.

To find a stable rational filter satisfying (1.4), consider a rational function of the form
\[
G(s) = \frac{\gamma}{a_0 + a_1 s + a_2 s^2 + \cdots + a_{n-1}s^{n-1} + s^n} = \frac{\gamma}{d(s)} \tag{1.6}
\]
where γ is a real constant, d(s) = a_0 + a_1 s + ⋯ + a_{n-1}s^{n-1} + s^n is a monic polynomial with real coefficients of order n, and G(0) = 1. It turns out that there is only one stable rational filter G(s) of the form (1.6) satisfying G(0) = 1 and (1.4). Furthermore, this filter is the n-th order Butterworth filter defined in (1.1).

Now let us construct a stable rational filter of the form G(s) = γ/d(s) satisfying (1.4). Because γ and the coefficients {a_k}_0^{n-1} are all real numbers,
\[
\overline{\gamma} = \gamma \quad\mbox{and}\quad \overline{d(i\omega)} = d(-i\omega).
\]
Since G(iω) = γ/d(iω), we have \overline{G(i\omega)} = G(-i\omega). By combining this with (1.4) and (1.6), we see that G must satisfy
\[
G(i\omega)G(-i\omega) = G(i\omega)\overline{G(i\omega)} = |G(i\omega)|^2 = \frac{\omega_c^{2n}}{\omega_c^{2n} + \omega^{2n}}. \tag{1.7}
\]
Using s = iω and G(s) = γ/d(s), we obtain
\[
\frac{\gamma^2}{d(s)d(-s)} = G(s)G(-s) = \frac{\omega_c^{2n}}{\omega_c^{2n} + (s/i)^{2n}}. \tag{1.8}
\]
Hence γ = ω_c^n and d(s)d(−s) = ω_c^{2n} + (s/i)^{2n}. Clearly, d(s)d(−s) is a polynomial of degree 2n. To complete our derivation it remains to show that there exists a monic polynomial d of degree n whose roots are all contained in the open left half plane and d(s)d(−s) = ω_c^{2n} + (s/i)^{2n}.


Now let us compute the roots of ω_c^{2n} + (s/i)^{2n} = 0. Notice that ω_c^{2n} + (s/i)^{2n} = 0 if and only if (s/iω_c)^{2n} = −1, or equivalently, s = iω_c(−1)^{1/2n}. In other words, the zeros of the polynomial ω_c^{2n} + (s/i)^{2n} are iω_c times the 2n roots of −1. Recall that −1 = e^{i(π+2πk)} where k is any integer. Hence the 2n roots of ω_c^{2n} + (s/i)^{2n} = 0 are given by
\[
\lambda_k = i\omega_c\, e^{i(\pi+2\pi k)/2n} \qquad (k = 0, 1, 2, \cdots, 2n-1). \tag{1.9}
\]
The roots {λ_k}_0^{2n-1} start with λ_0 on the circle with radius ωc and angle π/2 + π/2n, and then move counterclockwise around the circle of radius ωc, ending with λ_{2n-1} whose angle is π/2 − π/2n. The roots {λ_k}_0^{n-1} are all contained in the open left half plane {s : ℜs < 0}, while {λ_k}_n^{2n-1} are all contained in the open right half plane {s : ℜs > 0}. In fact, for any integer k in [0, n − 1], we have −λ_k = λ_{n+k}. To see this simply observe that
\[
-\lambda_k = -i\omega_c\, e^{i(\pi+2\pi k)/2n} = i\omega_c\, e^{i\pi}\, e^{i(\pi+2\pi k)/2n} = i\omega_c\, e^{i(\pi+2\pi n+2\pi k)/2n} = i\omega_c\, e^{i(\pi+2\pi(n+k))/2n} = \lambda_{n+k}.
\]
Thus −λ_k = λ_{n+k} for all k = 0, 1, ⋯, n − 1.

Thus −λk = λn+k for all k = 0, 1, · · · , n− 1.Now let d be the monic polynomial of degree n whose roots are {λk}n−1

0 , that is,

d(s) =

n−1∏k=0

(s− iωc ei(π+2πk)/2n) . (1.10)

Because all the roots of {λk}n−10 are contained in the open left half plane, the filter G(s) =

γ/d(s) is stable. Clearly, λ is a root of d(s) if and only if −λ is a root of d(−s). Hence{λk}2n−1

n are the roots of d(−s). So {λk}2n−10 are the roots of the polynomial d(s)d(−s).

By construction d(s)d(−s) and ω2nc + (s/i)2n have the same roots {λk}2n−1

0 . We claim thatd(s)d(−s) = ω2n

c + (s/i)2n. To see this observe that ω2nc + (s/i)2n = ω2n

c + (−1)ns2n. Since(−1)n is also the coefficient of s2n for d(s)d(−s), we must have d(s)d(−s) = ω2n

c +(s/i)2n. Inparticular, by setting s = iω, we arrive at d(iω)d(−iω) = ω2n

c + ω2n. Since d(−iω) = d(iω),we obtain

|d(iω)|2 = ω2nc + ω2n . (1.11)

Notice that G(s) = ω_c^n/d(s) is precisely the n-th order Butterworth filter defined in (1.1). By employing (1.11), we arrive at
\[
|G(i\omega)|^2 = \frac{\omega_c^{2n}}{|d(i\omega)|^2} = \frac{\omega_c^{2n}}{\omega_c^{2n} + \omega^{2n}}.
\]
Therefore (1.4) holds. In other words, the n-th order low pass Butterworth filter defined in (1.1) satisfies the constraint in (1.4).

8.1.1 A low pass Butterworth filtering example

Consider the following input signal with period 2π defined by
\[
u(t) = \begin{cases} t/\pi & \mbox{if } 0 \le t \le \pi \\ (2\pi - t)/\pi & \mbox{if } \pi \le t \le 2\pi \end{cases}
\]
over the interval [0, 2π]. For t > 2π, the function u is extended periodically by choosing u(t) = u(t + 2π) for all t. Notice that u(t) is a triangular wave.

Let us compute the Fourier series for u(t) of the form
\[
u(t) = a_0 + \sum_{k=1}^{\infty} \alpha_k \cos(kt) + \beta_k \sin(kt).
\]
The constant term a_0 in the Fourier series is given by
\[
a_0 = \frac{1}{2\pi}\int_0^{2\pi} u(t)\,dt = \frac{1}{2\pi^2}\int_0^{\pi} t\,dt + \frac{1}{2\pi^2}\int_{\pi}^{2\pi} (2\pi - t)\,dt
= \frac{1}{2\pi^2}\left[\frac{t^2}{2}\right]_0^{\pi} - \frac{1}{2\pi^2}\left[\frac{(2\pi - t)^2}{2}\right]_{\pi}^{2\pi}
= \frac{1}{4} + \frac{1}{4} = \frac{1}{2}.
\]

Hence a_0 = 1/2. Recall that u admits a Fourier series expansion of the form u(t) = \sum_{-\infty}^{\infty} a_k e^{-ikt}; see Chapter 2. Moreover, for any integer k ≥ 1, we have
\[
a_k = \frac{1}{2\pi^2}\int_0^{\pi} t e^{ikt}\,dt + \frac{1}{2\pi^2}\int_{\pi}^{2\pi} (2\pi - t)e^{ikt}\,dt
= \frac{1}{2\pi^2}\left[\frac{t e^{ikt}}{ik} + \frac{e^{ikt}}{k^2}\right]_0^{\pi} + \frac{1}{2\pi^2}\left[\frac{(2\pi - t)e^{ikt}}{ik} - \frac{e^{ikt}}{k^2}\right]_{\pi}^{2\pi}
= \frac{1}{2\pi^2}\left(\frac{e^{ik\pi} - 1}{k^2} + \frac{e^{ik\pi} - e^{2ik\pi}}{k^2}\right) = \frac{e^{ik\pi} - 1}{\pi^2 k^2}.
\]
This readily implies that
\[
a_k = -\frac{2}{\pi^2 k^2} \ \mbox{if } k \mbox{ is odd}, \qquad a_k = 0 \ \mbox{if } k \mbox{ is even}.
\]

Recall that α_k = 2ℜ(a_k) and β_k = 2ℑ(a_k); see (3.11) in Chapter 2. Using this we see that the Fourier series expansion for u(t) is given by
\[
u(t) = \frac{1}{2} - \frac{4}{\pi^2}\sum_{k=1,\,k\ \mathrm{odd}}^{\infty} \frac{1}{k^2}\cos(kt)
= \frac{1}{2} - \frac{4}{\pi^2}\sum_{k=0}^{\infty} \frac{1}{(2k+1)^2}\cos((2k+1)t). \tag{1.12}
\]
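A quick numerical check of (1.12) (a Python/numpy aside mirroring the Matlab partial sum used to plot u): truncating the series at k = 499 reproduces the triangular wave to within the truncation error, which is roughly 4/(1000π²):

```python
import numpy as np

# Partial sum of the Fourier series (1.12) versus the triangular wave on [0, 2*pi].
t = np.linspace(0.0, 2 * np.pi, 201)
u_exact = np.where(t <= np.pi, t / np.pi, (2 * np.pi - t) / np.pi)

u_series = 0.5 * np.ones_like(t)
for k in range(1, 500, 2):                       # odd harmonics only
    u_series -= 4 * np.cos(k * t) / (np.pi ** 2 * k ** 2)

print(np.max(np.abs(u_series - u_exact)))        # small truncation error
```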

The plot of u(t) in Matlab using the Fourier series is given in Figure 8.1. The Matlab commands we used to plot u are

• t = linspace(0, 25, 2 ∧ 14);

• u = 1/2; for k = 1 : 2 : 500; u = u − 4 ∗ cos(k ∗ t)/(pi ∧ 2 ∗ k ∧ 2); end

• plot(t, u); grid

• title(’The plot of u(t) using Fourier series’)


Figure 8.1: The Fourier series plot of u(t)

It is emphasized that the triangular wave u(t) consists of infinitely many sinusoids; see the Fourier series expansion in (1.12). Now let us design a low pass Butterworth filter G which picks out
\[
\frac{1}{2} - \frac{4}{\pi^2}\cos(t)
\]
from u(t). In other words, if Y(s) = G(s)U(s), then the steady state output
\[
y_{ss}(t) \approx a_0 + \alpha_1\cos(t + \theta) + \beta_1\sin(t + \theta)
\]
where θ is a phase shift. (In Matlab use butter(m, c, ’s’) where m is the order of the Butterworth filter and c is the cutoff frequency.) There are many filters which will solve this problem. Here we choose a sixth order low pass Butterworth filter G with cutoff frequency 1.9. Using the Matlab command butter(6, 1.9, ’s’), we obtained
\[
G(s) = \frac{47.05}{s^6 + 7.341s^5 + 26.95s^4 + 62.7s^3 + 97.27s^2 + 95.67s + 47.05}. \tag{1.13}
\]
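The coefficients in (1.13) can be reproduced without Matlab's butter command by building the denominator directly from the pole formula (1.2). A Python/numpy sketch:

```python
import numpy as np

n, wc = 6, 1.9                                   # order and cutoff used in the text
k = np.arange(n)
poles = 1j * wc * np.exp(1j * (np.pi + 2 * np.pi * k) / (2 * n))
d = np.poly(poles).real                          # denominator of (1.13)

print(np.round(d, 3))   # ~ [1, 7.341, 26.95, 62.7, 97.27, 95.67, 47.05]
print(wc ** n)          # numerator 1.9^6 ~ 47.05
```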

The Bode plot for this G is given in Figure 8.2. We used the lsim command in Matlab to see how well our low pass filter G picks out 1/2 − (4/π²) cos(t) from the triangular input u(t). In other words, we used the lsim command in Matlab to find the output y(t) for the triangular input u(t). In the Laplace domain Y(s) = G(s)U(s). The Matlab commands we used are given by

• t = linspace(0, 25, 2 ∧ 14);

• u = 1/2; for k = 1 : 2 : 500; u = u − 4 ∗ cos(k ∗ t)/(pi ∧ 2 ∗ k ∧ 2); end

• [n, d] = butter(6, 1.9, ’s’); bode(n, d); grid

• g = tf(n, d)


Figure 8.2: The Bode plot for G(s) = 47.05/(s^6 + 7.341s^5 + 26.95s^4 + 62.7s^3 + 97.27s^2 + 95.67s + 47.05)

• u1 = 1/2 − 4 ∗ cos(t)/pi ∧ 2; yss = 1/2 − 4 ∗ cos(t − 2.113)/pi ∧ 2;

• lsim(g, u, t); hold on; plot(t, u1, ’m’); plot(t, yss, ’r’); grid

The plots of the filter output y(t), the desired signal u1 = 1/2 − (4/π²) cos(t), and yss(t) are given in Figure 8.3. The simulation shows that in steady state yss(t) is a phase shift of the signal u1 = 1/2 − (4/π²) cos(t) by −2.113 radians, that is,
\[
y_{ss}(t) \approx \frac{1}{2} - \frac{4}{\pi^2}\cos(t - 2.113).
\]

Figure 8.3: The simulation of y(t)

To find the phase shift at frequency 1, we used Matlab to compute

\[
G(s)\big|_{s=i} = G(i) = -0.5159 - 0.8564i = e^{-2.113i}.
\]


(The phase shift at frequency 1, or any frequency, can also be directly read off the Bode plot for G.) Since our low pass Butterworth filter essentially eliminates all frequencies above 2 in the triangular signal u, the steady state response shifts the desired signal u1 = 1/2 − (4/π²) cos(t) by −2.113 radians, that is,
\[
y_{ss}(t) \approx \frac{1}{2}G(0) - \frac{4}{\pi^2}|G(i)|\cos\bigl(t + \mathrm{angle}(G(i))\bigr) = \frac{1}{2} - \frac{4}{\pi^2}\cos(t - 2.113).
\]

The ≈ sign is due to the fact that we have ignored all the higher order terms our low pass filter essentially eliminated.

An all pass filter to eliminate the phase shift. Now let us use an all pass filter to eliminate the phase shift in the steady state response at frequency 1. To accomplish this, consider an all pass filter of the form
\[
B(s) = \Bigl(\frac{s-a}{s+a}\Bigr)^2
\]
where a > 0. Now consider the new filter F(s) = B(s)G(s) where G is the sixth order low pass Butterworth filter with cutoff frequency 1.9 given in (1.13). The McMillan degree of F is eight. Because B(s) and G(s) are both stable, F(s) is also stable. By construction |F(iω)| = |B(iω)G(iω)| = |G(iω)| for all frequencies ω. So F(iω) and G(iω) are two stable filters with the same magnitude over all frequencies. In particular, F(s) is a low pass filter with cutoff frequency 1.9. However, F(iω) and G(iω) have different phases. To be precise,
\[
F(0) = B(0)G(0) = 1
\]
\[
\mathrm{angle}(F(i)) = \mathrm{angle}(B(i)G(i)) = \mathrm{angle}(B(i)) + \mathrm{angle}(G(i)).
\]

To eliminate the phase shift of angle(G(i)) = −2.113 radians at frequency 1, we want to choose a > 0 such that angle(F(i)) = 0. In other words, choose a > 0 such that
\[
-\mathrm{angle}(G(i)) = \mathrm{angle}(B(i)) = \mathrm{angle}\Bigl(\frac{i-a}{i+a}\Bigr)^2 = 2\pi - 4\arctan(1/a);
\]
see (7.4) in Chapter 7. (The 2π is used to arrive at a > 0.) Hence the a we are looking for is given by
\[
a = \frac{1}{\tan\Bigl(\frac{2\pi + \mathrm{angle}(G(i))}{4}\Bigr)} = \frac{1}{\tan\Bigl(\frac{2\pi - 2.113}{4}\Bigr)} = 0.5836.
\]
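A quick numeric check (a Python/numpy aside) that this choice of a cancels the phase at ω = 1:

```python
import numpy as np

a = 1 / np.tan((2 * np.pi - 2.113) / 4)
print(a)                                    # about 0.5836

# Phase of B(i) = ((i - a)/(i + a))^2, reduced mod 2*pi:
phase_B = (2 * np.angle((1j - a) / (1j + a))) % (2 * np.pi)
print(phase_B)                              # about 2.113, i.e. -angle(G(i))
```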

For this a = 0.5836, our new low pass filter F(s) = B(s)G(s) with cutoff frequency 1.9 is given by the following stable transfer function of McMillan degree 8:
\[
F(s) = \frac{47.05\,(s - 0.5836)^2}{(s + 0.5836)^2\,(s^6 + 7.341s^5 + 26.95s^4 + 62.7s^3 + 97.27s^2 + 95.67s + 47.05)}.
\]
By construction F(0) = F(i) = 1. (We needed a second order all pass filter to guarantee that both of these conditions are satisfied, that is, that F(0) = 1 and the phase shift is eliminated; see also Problem 3 in Section 7.7.1 in Chapter 7.) Because our low pass filter F(s) essentially eliminates all frequencies above 2 in the triangular signal u(t), the steady state response is given by
\[
v_{ss}(t) \approx \frac{1}{2}F(0) - \frac{4}{\pi^2}|F(i)|\cos\bigl(t + \mathrm{angle}(F(i))\bigr) = \frac{1}{2} - \frac{4}{\pi^2}\cos(t).
\]

Here v(t) is the output for the new low pass filter F(s) = B(s)G(s) corresponding to the triangular wave input u(t). In the s domain V(s) = F(s)U(s). The ≈ sign is due to the fact that we have ignored all the higher order terms our low pass filter essentially eliminated. The simulation below verifies this. The Bode plot for this F is given in Figure 8.4. The magnitude Bode plots for F and G are the same. However, F has zero phase shift at the frequencies 0 and 1. The Matlab commands we used to graph the Bode plot are

• a = 1/ tan((2 ∗ pi − 2.113)/4)

• b = tf([1, −a], [1, a]); f = b ∗ b ∗ g; bode(f); grid

Figure 8.4: The Bode plot for F(s)

The plots of the filter output v(t) and u1 = 1/2 − (4/π²) cos(t) are given in Figure 8.5. This simulation shows that in steady state vss(t) is approximately equal to the desired signal u1 = 1/2 − (4/π²) cos(t), that is,
\[
v_{ss}(t) \approx \frac{1}{2} - \frac{4}{\pi^2}\cos(t).
\]
The Matlab commands we used are given by

• t = linspace(0, 25, 2 ∧ 14);

• u = 1/2; for k = 1 : 2 : 500; u = u − 4 ∗ cos(k ∗ t)/(pi ∧ 2 ∗ k ∧ 2); end

• [n, d] = butter(6, 1.9, ’s’); g = tf(n, d); b = tf([1, −a], [1, a]); f = b ∗ b ∗ g;

• u1 = 1/2 − 4 ∗ cos(t)/pi ∧ 2;

• lsim(f, u, t); hold on; plot(t, u1, ’r’); grid


Figure 8.5: The zero phase simulation of v(t)

8.1.2 Exercise

Problem 1. Find the third order low pass Butterworth filter G with cutoff angular frequency 10. Find a state space realization for G. Plot the Bode plot of this G in Matlab. Finally, give a Simulink block diagram for this Butterworth filter; see Figure 5.15 in Section 5.6 of Chapter 5. The Matlab command for the Butterworth filter is butter, so you can verify your answer by using Matlab.

Problem 2. Consider the periodic signal u(t) with period 1 given by
\[
u(t) = \begin{cases} \sin(2\pi t) & \mbox{if } 0 \le t \le \frac{1}{2} \\ 0 & \mbox{if } \frac{1}{2} < t \le 1. \end{cases} \tag{1.14}
\]
Now extend u(t) periodically such that u(t) = u(t + 1) for all t.

(i) Find the Fourier series by hand for u(t), that is,
\[
u(t) = a_0 + \sum_{k=1}^{\infty} \alpha_k \cos(2\pi kt) + \beta_k \sin(2\pi kt).
\]

(ii) Plot u(t) in Matlab for 0 ≤ t ≤ 10.

(iii) Design a low pass Butterworth filter G which picks out
\[
a_0 + \alpha_1\cos(2\pi t) + \beta_1\sin(2\pi t)
\]
from u(t) up to a phase. In other words, if Y(s) = G(s)U(s), then the steady state output
\[
y_{ss}(t) \approx a_0 + \alpha_1\cos(2\pi t + \theta) + \beta_1\sin(2\pi t + \theta)
\]
where θ is a phase shift. (In Matlab use butter(m, c, ’s’) where m is the order of the Butterworth filter and c is the cutoff frequency.) Plot the Bode plot of G(s).

Page 426: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

426 CHAPTER 8. BUTTERWORTH FILTERS

(iv) Use the lsim command in Matlab to see how well your filter G picks out
\[
a_0 + \alpha_1\cos(2\pi t) + \beta_1\sin(2\pi t)
\]
from u, that is, plot y(t) and a_0 + α_1 cos(2πt) + β_1 sin(2πt) on the same graph.

(v) Find a low pass filter F(s) = B(s)G(s) where B(s) is an all pass filter chosen to eliminate the phase shift θ in yss(t). Plot v(t) and a_0 + α_1 cos(2πt) + β_1 sin(2πt) on the same graph, where v(t) is now the output of B(s)G(s), that is, V(s) = B(s)G(s)U(s) in the s domain.

Problem 3. Consider the periodic signal u(t) with period 1 given by

u(t) = t sin(2πt) if 0 ≤ t ≤ 1. (1.15)

Now extend u(t) periodically such that u(t) = u(t+ 1) for all t.

(i) Find the Fourier series for u(t), that is,
\[
u(t) = a_0 + \sum_{k=1}^{\infty} \alpha_k \cos(2\pi kt) + \beta_k \sin(2\pi kt).
\]

(ii) Plot u(t) in Matlab for 0 ≤ t ≤ 10.

(iii) Design a low pass Butterworth filter G which picks out
\[
a_0 + \alpha_1\cos(2\pi t) + \beta_1\sin(2\pi t)
\]
from u(t) up to a phase. In other words, if Y(s) = G(s)U(s), then the steady state output
\[
y_{ss}(t) \approx a_0 + \alpha_1\cos(2\pi t + \theta) + \beta_1\sin(2\pi t + \theta)
\]
where θ is a phase shift. (In Matlab use butter(m, c, ’s’) where m is the order of the Butterworth filter and c is the cutoff frequency.) Plot the Bode plot of G(s).

(iv) Use the lsim command in Matlab to see how well your filter G picks out
\[
a_0 + \alpha_1\cos(2\pi t) + \beta_1\sin(2\pi t)
\]
from u, that is, plot y(t) and a_0 + α_1 cos(2πt) + β_1 sin(2πt) on the same graph.

(v) Find a low pass filter F(s) = B(s)G(s) where B(s) is an all pass filter chosen to eliminate the phase shift θ in yss(t). Plot v(t) and a_0 + α_1 cos(2πt) + β_1 sin(2πt) on the same graph, where v(t) is now the output of B(s)G(s), that is, V(s) = B(s)G(s)U(s) in the s domain.


Problem 4. Let G be the n-th order Butterworth filter with cutoff angular frequency ωc, and G1 be the n-th order Butterworth filter with cutoff angular frequency one. Then show that G(s) = G1(s/ωc).

Problem 5. This is a Matlab demonstration on digital filters. (We have not studied digital filters.) Consider the following Matlab commands:

t = linspace(0, 2, 2 ∗ 2 ∧ 13);
% The sample rate for the sound command in Matlab is 2^13 = 8192 hertz.
x1 = cos(1000 ∗ t); x2 = cos(4000 ∗ t); y = x1 + x2;
soundsc(x1);
% The signal you hear has frequency 1000/(2π) ≈ 159 hertz.
soundsc(x2);
% The signal you hear has frequency 4000/(2π) ≈ 637 hertz.
soundsc(y);
y = y + randn(1, 2 ∧ 14); soundsc(y);
% The signal you hear is the sum of two sinusoids with frequencies 159 and 637 hertz
% plus additive white noise with mean zero and variance one.
a = ifft(y); w = (0 : 2400) ∗ pi;
% Here we are running the signal for 2 seconds. In other words, τ = 2.
plot(w, abs(a(1 : 2401)). ∧ 2);
% The power spectrum of y vs the angular frequency ω.
% The peaks occur at 1000 and 4000.
[n, d] = butter(5, [.15, .16]);
% Notice that 4000/(8192π) = 0.1554, which is what the band is around.
g = fft(n, 2 ∧ 14)./fft(d, 2 ∧ 14);
% g is our digital filter which picks out cos(4000t) from y.
plot(w, abs(a(1 : 2401)). ∧ 2);
hold on; plot(w, abs(g(1 : 2401)), ’r’);
soundsc(real(fft(g. ∗ a)));
% This should sound like cos(4000t).
hold off; q = g. ∗ a; plot(w, abs(q(1 : 2401)). ∧ 2);
% This is the power spectrum of the filtered signal.

In this program, y(t) = cos(1000t) + cos(4000t) + v(t) is the sum of two sinusoids plus mean zero variance one Gaussian white noise v(t). The idea is to construct a filter to pick out cos(4000t) from the signal y(t). The program plots the power spectrum of the signal y along with the band pass filter. Now design a low pass filter to pick out cos(1000t) from y. The problem you will have is that the low pass filter will not eliminate the white noise. So try to find a band pass filter to pick out cos(1000t).


8.2 High pass Butterworth filters

The n-th order high pass Butterworth filter with cutoff angular frequency ωc is defined by
\[
H(s) = \frac{s^n}{d(s)} \quad\mbox{where}\quad d(s) = \prod_{k=0}^{n-1}\bigl(s - i\omega_c\, e^{i(\pi+2\pi k)/2n}\bigr). \tag{2.1}
\]
Since all the roots of d(s) are contained in the open left half plane, the high pass Butterworth filter H(s) is stable.

Now let us compute |H(iω)|^2. Equation (1.11) implies that

|H(iω)|^2 = ω^{2n}/|d(iω)|^2 = ω^{2n}/(ωc^{2n} + ω^{2n}) = (ω/ωc)^{2n}/(1 + (ω/ωc)^{2n}) . (2.2)

Clearly, |H(iω)| ≈ 1 when |ω| >> ωc and |H(iω)| ≈ 0 when |ω| << ωc. At the cutoff angular frequency |H(iωc)| = 1/√2. So the Butterworth filter H maintains approximately 71% of its maximum amplitude at the cutoff angular frequency. On the magnitude Bode plot the high pass Butterworth filter H drops approximately 3 decibels at the cutoff angular frequency. The higher the integer n, the faster |H(iω)| drops off for |ω| < ωc.
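The magnitude formula (2.2) is easy to sanity check numerically. The following sketch (in Python rather than the Matlab used elsewhere in these notes; the function name is ours) evaluates |H(iω)| and confirms the 3 dB point and the effect of the order n:

```python
import math

def highpass_butter_mag(w, wc, n):
    # |H(iw)| from (2.2): |H(iw)|^2 = (w/wc)^(2n) / (1 + (w/wc)^(2n)).
    r = (w / wc) ** (2 * n)
    return math.sqrt(r / (1.0 + r))

wc = 10.0
for n in (1, 3, 5):
    # At the cutoff the magnitude is 1/sqrt(2), i.e. about a 3 decibel drop.
    assert abs(highpass_butter_mag(wc, wc, n) - 1 / math.sqrt(2)) < 1e-12
    # A larger order n gives a sharper roll-off below the cutoff.
    assert highpass_butter_mag(wc / 2, wc, n + 1) < highpass_butter_mag(wc / 2, wc, n)
```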

8.2.1 Constructing high pass filters from low pass filters

In this section we will show that low pass and high pass filters can be obtained from a low pass filter with cutoff angular frequency one. In other words, when an engineer designs a stable low pass filter with unity cutoff angular frequency, then this filter can be used to design a stable low pass or high pass filter with cutoff angular frequency ωc. For example, if G is a stable low pass filter with unity cutoff angular frequency, then F(s) = G(s/ωc) is a stable low pass filter with cutoff angular frequency ωc. Notice that F(iωc) = G(i). In particular, the magnitude |F(iω)| at the cutoff angular frequency ωc equals the magnitude |G(iω)| at the cutoff angular frequency one. The magnitude Bode plot of F(s) is simply the magnitude Bode plot of G(s) with the frequency ω scaled by the factor 1/ωc.

As before, assume that G(s) is a stable low pass filter with unity cutoff angular frequency. Then H(s) = G(ωc/s) is a stable high pass filter with cutoff angular frequency ωc. Notice that H(iωc) = G(−i). The magnitude |H(iω)| at the cutoff angular frequency ωc equals the magnitude |G(iω)| at the cutoff angular frequency one. The transformation ωc/s flips the magnitude Bode plot of the low pass filter G around the cutoff angular frequency ωc. In particular, |H(i0)| = |G(i∞)| and |H(i∞)| = |G(i0)|.
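These two substitutions can be checked numerically with the second order Butterworth filter G(s) = 1/(s^2 + √2 s + 1). The sketch below (Python in place of Matlab; the cutoff ωc = 25 is an arbitrary choice of ours) verifies F(iωc) = G(i), H(iωc) = G(−i), and the 3 dB point at the new cutoff:

```python
import math

def G(s):
    # Second order low pass Butterworth filter with unity cutoff.
    return 1.0 / (s * s + math.sqrt(2) * s + 1.0)

wc = 25.0                        # arbitrary cutoff for the transformed filters
F = lambda s: G(s / wc)          # low pass with cutoff wc
H = lambda s: G(wc / s)          # high pass with cutoff wc

assert abs(F(1j * wc) - G(1j)) < 1e-15    # F(i wc) = G(i)
assert abs(H(1j * wc) - G(-1j)) < 1e-15   # H(i wc) = G(-i)
# Both keep the 3 dB point at the new cutoff frequency.
assert abs(abs(F(1j * wc)) - 1 / math.sqrt(2)) < 1e-12
assert abs(abs(H(1j * wc)) - 1 / math.sqrt(2)) < 1e-12
```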

For an example, let H be the n-th order high pass Butterworth filter with cutoff angular frequency ωc, and G be the n-th order low pass Butterworth filter with unity cutoff angular frequency. Then H(s) = G(ωc/s). According to (1.1) and (1.2), the n-th order Butterworth filter with cutoff angular frequency one is given by G(s) = 1/d(s) where d is the monic polynomial of degree n defined by

d(s) = ∏_{k=0}^{n−1} (s − λk) and λk = i e^{i(π+2πk)/2n} (k = 0, 1, 2, · · · , n − 1) . (2.3)


The roots {λk}_{0}^{n−1} of the polynomial d are on the unit circle. Furthermore, the roots {λk}_{0}^{n−1} of d come in complex conjugate pairs. Moreover, −1 is a root of d if and only if n is odd. In particular, this implies that

1 = (−1)^n ∏_{k=0}^{n−1} λk . (2.4)

By employing these facts, we obtain

G(ωc/s) = 1/d(ωc/s) = 1/∏_{k=0}^{n−1}(ωc/s − λk)
= s^n/∏_{k=0}^{n−1}(ωc − sλk)
= s^n/[(−1)^n ∏_{k=0}^{n−1}(sλk − ωc)]
= s^n/[(−1)^n ∏_{k=0}^{n−1} λk(s − ωc/λk)]
= s^n/∏_{k=0}^{n−1}(s − ωc λ̄k)      (using (2.4) and 1/λk = λ̄k)
= s^n/∏_{k=0}^{n−1}(s − ωcλk) = H(s) .

The second from the last equality follows from the fact that the roots {λk}_{0}^{n−1} of d come in complex conjugate pairs. Therefore H(s) = G(ωc/s).
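This identity can also be confirmed numerically. The sketch below (Python in place of Matlab; the helper d and the sample points are ours) builds d(s) from its roots in (2.3) and checks H(s) = G(ωc/s) at a few complex points, using ∏_{k}(s − ωcλk) = ωc^n d(s/ωc):

```python
import cmath

def d(s, n):
    # Monic Butterworth polynomial of (2.3) with roots lambda_k = i e^{i(pi+2 pi k)/2n}.
    p = 1.0 + 0j
    for k in range(n):
        lam = 1j * cmath.exp(1j * cmath.pi * (1 + 2 * k) / (2 * n))
        p *= s - lam
    return p

n, wc = 3, 7.0
for s in (1 + 2j, -0.5 + 9j, 3 - 4j):
    G_of = 1.0 / d(wc / s, n)                  # G(wc/s) with G(s) = 1/d(s)
    H_of = s ** n / (wc ** n * d(s / wc, n))   # H(s) of (2.1): prod(s - wc*lam_k) = wc^n d(s/wc)
    assert abs(G_of - H_of) < 1e-9
```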

8.2.2 State space realizations for low and high pass filters

Recall that {A,B,C,D} is a state space realization for a transfer function G(s) if

G(s) = D + C(sI − A)^{−1}B . (2.5)

Here A is an n × n matrix, B is a column vector in C^n, while C is a 1 × n row vector and D is a scalar; see Section 5.3 in Chapter 5. We say that a square matrix A is stable if all the eigenvalues of A are contained in the open left half plane {s : ℜs < 0}. The realization {A,B,C,D} is stable if A is stable.

If {A,B,C,D} is a stable realization for G, then G is a stable transfer function. To see this recall that

G(s) = D + C(sI − A)^{−1}B = c(s)/det[sI − A] + D

where c is a polynomial of degree at most n − 1; see (4.1) in Chapter 5. Since det[sI − A] is the characteristic polynomial for A, we have det[sI − A] = ∏_{k=1}^{n}(s − λk) where {λk}_{1}^{n} are the eigenvalues for A. Because A is stable, all the roots of det[sI − A] are contained in the open left half plane. Therefore G is stable.

The following result provides a method to compute a state space realization for a stable low pass filter with cutoff angular frequency ωc, directly from a state space realization for a stable low pass filter with unity cutoff angular frequency.

PROPOSITION 8.2.1 Let {A,B,C,D} be a stable realization for a transfer function G. Let F(s) be the transfer function defined by F(s) = G(s/ωc) where the cutoff angular frequency ωc > 0. Then {ωcA, B, ωcC, D} is a stable realization for F. In particular, if {A,B,C,D} is a stable realization for the n-th order low pass Butterworth filter G(s) with unity cutoff angular frequency, then {ωcA, B, ωcC, D} is a stable realization for the n-th order low pass Butterworth filter F(s) with angular cutoff frequency ωc.


Proof. By employing the state space realization for G in (2.5), we obtain

F(s) = G(s/ωc) = D + C(sωc^{−1}I − A)^{−1}B = D + Cωc(sI − ωcA)^{−1}B = D + ωcC(sI − ωcA)^{−1}B .

Therefore {ωcA, B, ωcC, D} is a realization for F. To complete the proof it remains to show that ωcA is stable. To this end, assume that μ is an eigenvalue for ωcA. We claim that λ = μ/ωc is an eigenvalue for A. Clearly, μf = ωcAf where f is an eigenvector for ωcA corresponding to μ. By dividing by ωc, we obtain Af = (μ/ωc)f. Hence λ = μ/ωc is an eigenvalue for A. Since A is stable, ℜλ < 0. Because ωc > 0, we have ℜμ = ωc ℜλ < 0. Therefore ωcA is stable. This completes the proof.

Finally, it is noted that if G is any stable low pass filter with cutoff angular frequency ω1, then the transformation F(s) = G(ω1 s/ω2) produces a stable low pass filter with cutoff angular frequency ω2. So by setting ωc = ω2/ω1, one can use Proposition 8.2.1 to transform G from a stable low pass filter with cutoff angular frequency ω1 into a stable low pass filter F(s) = G(s/ωc) with cutoff angular frequency ω2.

The following result provides a method to compute a state space realization for a stable high pass filter with cutoff angular frequency ωc, directly from a state space realization for a stable low pass filter with unity cutoff angular frequency.

PROPOSITION 8.2.2 Let {A,B,C,D} be a stable realization for a transfer function G. Let H(s) be the transfer function defined by H(s) = G(ωc/s) where the cutoff angular frequency ωc > 0. Finally, let {Ac, Bc, Cc, Dc} be the system defined by

Ac = ωcA^{−1}, Bc = A^{−1}B, Cc = −ωcCA^{−1} and Dc = D − CA^{−1}B . (2.6)

Then {Ac, Bc, Cc, Dc} is a stable realization for H. In particular, if {A,B,C,D} is a stable realization for the n-th order low pass Butterworth filter G(s) with unity cutoff angular frequency, then {Ac, Bc, Cc, Dc} is a stable realization for the n-th order high pass Butterworth filter H(s) with angular cutoff frequency ωc.

Proof. Let T be any n × n matrix such that one is not an eigenvalue for T. Then

(I − T)^{−1} = I + T(I − T)^{−1} . (2.7)

To see this notice that

(I − T)^{−1} = (I − T + T)(I − T)^{−1} = I + T(I − T)^{−1} .

Hence (2.7) holds. By combining (2.5) with T = s^{−1}ωcA^{−1} in (2.7), we obtain

H(s) = G(ωc/s) = D + C(s^{−1}ωcI − A)^{−1}B = D − C(I − s^{−1}ωcA^{−1})^{−1}A^{−1}B
= D − CA^{−1}B − Cs^{−1}ωcA^{−1}(I − s^{−1}ωcA^{−1})^{−1}A^{−1}B
= D − CA^{−1}B − ωcCA^{−1}(sI − ωcA^{−1})^{−1}A^{−1}B
= Dc + Cc(sI − Ac)^{−1}Bc .


Therefore {Ac, Bc, Cc, Dc} is a realization for H. To complete the proof it remains to show that Ac is stable. To this end, assume that μ is an eigenvalue for Ac. Because Ac is invertible, μ ≠ 0. We claim that λ = ωc/μ is an eigenvalue for A. Clearly, μf = Acf = ωcA^{−1}f where f is an eigenvector corresponding to μ. By inverting A, we obtain Af = (ωc/μ)f. Hence λ = ωc/μ is an eigenvalue for A. Now let λ = α + iβ. Since A is stable, α < 0. Notice that

μ = ωc/λ = ωc/(α + iβ) = ωc(α − iβ)/((α + iβ)(α − iβ)) = ωc(α − iβ)/(α^2 + β^2) .

Thus ℜμ = ωcα/(α^2 + β^2) < 0. Therefore Ac is stable. This completes the proof.
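A scalar example makes Proposition 8.2.2 concrete. Take {A,B,C,D} = {−1, 1, 1, 0}, which realizes the first order low pass Butterworth filter G(s) = 1/(s + 1); then (2.6) should realize the first order high pass filter G(ωc/s) = s/(s + ωc). A quick Python check (our example, not from the text):

```python
wc = 5.0
A, B, C, D = -1.0, 1.0, 1.0, 0.0    # realizes G(s) = 1/(s + 1)
# The transformed system (2.6):
Ac = wc / A                          # wc A^{-1}
Bc = B / A                           # A^{-1} B
Cc = -wc * C / A                     # -wc C A^{-1}
Dc = D - C * B / A                   # D - C A^{-1} B
for s in (1j * wc, 2 + 3j, 0.5 - 1j):
    H1 = Dc + Cc * Bc / (s - Ac)     # realization {Ac, Bc, Cc, Dc}
    H2 = 1.0 / (wc / s + 1.0)        # G(wc/s) = s/(s + wc)
    assert abs(H1 - H2) < 1e-12
```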

8.2.3 Exercise

Problem 1. Find the third order high pass Butterworth filter G with cutoff angular frequency 10. Find a state space realization for G. Plot the Bode plot of this G in Matlab. Finally, give a Simulink block diagram for this Butterworth filter; see Figure 5.15 in Section 5.6 of Chapter 5.

Problem 2. Find a state space realization {A,B,C,D} for the second order low pass Butterworth filter with unity cutoff angular frequency. Then use Proposition 8.2.1 to find a state space realization for the second order low pass Butterworth filter with cutoff angular frequency 10. The Matlab command corresponding to Proposition 8.2.1 is given by lp2lp. So you can verify your answer by using Matlab.

Problem 3. Find a state space realization {A,B,C,D} for the second order low pass Butterworth filter with cutoff angular frequency one. Then use Proposition 8.2.2 to find a state space realization for the second order high pass Butterworth filter with cutoff angular frequency 10. The Matlab command corresponding to Proposition 8.2.2 is given by lp2hp. So you can verify your answer by using Matlab.

Problem 4. Let T be an invertible matrix. Then show that λ is an eigenvalue for T if and only if 1/λ is an eigenvalue for T^{−1}.

8.3 Band pass Butterworth filters

In this section we will present the band pass Butterworth filter by transforming the low pass Butterworth filter into a band pass filter. This transformation is done by using a special conformal mapping. To this end, let 0 < a < b be two positive real numbers and set

m(s) = (s^2 + ab)/(s(b − a)) . (3.1)

The n-th order band pass Butterworth filter P(s) with angular frequency band [a, b] is defined by

P(s) = G(m(s)) = G((s^2 + ab)/(s(b − a))) (3.2)


where G(s) is the n-th order low pass Butterworth filter with unity cutoff angular frequency. The n-th order band pass Butterworth filter has McMillan degree 2n. As expected, the band pass Butterworth filter is a stable proper rational filter. Moreover, at the cutoff angular frequencies a and b the magnitude |P(ia)| = |P(ib)| = 1/√2. So at the cutoff angular frequencies a and b, the magnitude Bode plot drops off approximately three decibels from its peak of one.

The second order band pass Butterworth filter with cutoff angular frequency band [a, b] is given by

P(s) = s^2(b − a)^2 / (s^4 + √2(b − a)s^3 + (a^2 + b^2)s^2 + √2 ab(b − a)s + a^2b^2) . (3.3)

To verify this recall that the second order low pass Butterworth filter G(s) with unity cutoff angular frequency is given by

G(s) = 1/(s^2 + √2 s + 1) .

Hence the second order band pass Butterworth filter is computed by

P(s) = G(m(s)) = 1/[((s^2 + ab)/(s(b − a)))^2 + √2((s^2 + ab)/(s(b − a))) + 1]
= (b − a)^2 s^2 / [(s^2 + ab)^2 + √2(s^2 + ab)s(b − a) + s^2(b − a)^2]
= (b − a)^2 s^2 / [s^4 + 2abs^2 + a^2b^2 + √2(b − a)s^3 + √2 ab(b − a)s + s^2(b − a)^2]
= s^2(b − a)^2 / [s^4 + √2(b − a)s^3 + (a^2 + b^2)s^2 + √2 ab(b − a)s + a^2b^2] .

Therefore the second order band pass Butterworth filter P(s) is given by (3.3).
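One can double check (3.3) against the definition P(s) = G(m(s)) at a handful of points. The Python sketch below is our own numerical check, with an arbitrary band [2.5, 3.6]; it also confirms the 3 dB drop at a band edge:

```python
import math

a, b = 2.5, 3.6                    # arbitrary band edges with 0 < a < b
t = b - a

def G(s):                          # second order low pass Butterworth, unity cutoff
    return 1.0 / (s * s + math.sqrt(2) * s + 1.0)

def m(s):                          # conformal mapping (3.1)
    return (s * s + a * b) / (s * t)

def P(s):                          # closed form (3.3)
    num = s ** 2 * t ** 2
    den = (s ** 4 + math.sqrt(2) * t * s ** 3 + (a * a + b * b) * s ** 2
           + math.sqrt(2) * a * b * t * s + a * a * b * b)
    return num / den

for s in (1j * a, 1j * b, 1 + 2j, -0.3 + 4j):
    assert abs(P(s) - G(m(s))) < 1e-9
assert abs(abs(P(1j * a)) - 1 / math.sqrt(2)) < 1e-9   # 3 dB at the band edge
```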

If G is any low pass stable rational filter with cutoff angular frequency one, then P(s) = G(m(s)) is a stable band pass filter with angular frequency band [a, b]. In particular, if G is the n-th order low pass Butterworth filter with unity cutoff angular frequency, then P(s) = G(m(s)) is a stable band pass filter with angular frequency band [a, b].

Now assume that G is any low pass stable rational filter with unity cutoff angular frequency. We claim that P(s) = G(m(s)) determines a stable band pass filter with angular frequency band [a, b]. Recall that any stable proper rational transfer function admits a stable state space realization {A,B,C,D}. Proposition 8.3.1 below shows that P is a stable proper rational transfer function.

Now let us investigate some properties of the conformal mapping m. Notice that

m(iω) = i(ω^2 − ab)/(ω(b − a)) .


This readily implies that m(i√ab) = 0,

m(ia) = −i and m(ib) = i (3.4)

lim_{|ω|→0} |m(iω)| = ∞ and lim_{|ω|→∞} |m(iω)| = ∞ . (3.5)

Notice that √ab is the geometric mean of a and b. In other words, (log(a) + log(b))/2 = log√ab. So √ab is the average of the interval [a, b] on the horizontal axis of the Bode plot.
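The properties (3.4) and (3.5) are quick to verify numerically. In this Python sketch (our check, with the arbitrary band [2, 8]) the geometric mean √ab maps to 0 and the band edges to ∓i:

```python
import math

a, b = 2.0, 8.0                     # arbitrary band with 0 < a < b

def m(s):
    # The conformal mapping (3.1): m(s) = (s^2 + ab)/(s(b - a)).
    return (s * s + a * b) / (s * (b - a))

c = math.sqrt(a * b)                # geometric mean: the band center on a log axis
assert abs(m(1j * c)) < 1e-12               # m(i sqrt(ab)) = 0
assert abs(m(1j * a) - (-1j)) < 1e-12       # m(ia) = -i
assert abs(m(1j * b) - 1j) < 1e-12          # m(ib) = i
assert abs(m(1e-9j)) > 1e6 and abs(m(1e9j)) > 1e6   # |m(iw)| blows up at 0 and infinity
```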

Using m(i√ab) = 0, we see that |P(i√ab)| = |G(m(i√ab))| = |G(0)|. So at the center point ω = √ab, the magnitude of the filter P(iω) equals the magnitude |G(0)| of the low pass filter G(iω) evaluated at the origin.

At the cutoff frequencies a and b, the magnitude |P(iω)| = |G(i)|. This follows from

|P(ia)| = |G(m(ia))| = |G(−i)| and |P(ib)| = |G(m(ib))| = |G(i)| ,

together with the fact that |G(−i)| = |G(i)| since G has real coefficients. In other words, at the cutoff angular frequencies a and b, the magnitude of the filter P(iω) equals the magnitude |G(i)| of the low pass filter G(iω) evaluated at the cutoff frequency one.

Without loss of generality, let us assume that |G(i∞)| = 0, that is, the magnitude of the low pass filter G evaluated at ω = ∞ equals zero. Equation (3.5) implies that

lim_{|ω|→0} |P(iω)| = lim_{|ω|→0} |G(m(iω))| = lim_{|ω|→∞} |G(iω)| = 0
lim_{|ω|→∞} |P(iω)| = lim_{|ω|→∞} |G(m(iω))| = lim_{|ω|→∞} |G(iω)| = 0 .

In other words, as |ω| approaches zero or infinity, the magnitude of P(iω) tends to zero, and thus, P(s) is a stable band pass filter.

Now assume that G is the n-th order low pass Butterworth filter with unity cutoff angular frequency, and P(s) = G(m(s)) is the corresponding n-th order band pass Butterworth filter with cutoff angular band [a, b]. The above analysis shows that |P(i√ab)| = |G(0)| = 1. So at the center point ω = √ab, the magnitude of the Butterworth band pass filter P(iω) equals one. Recall that for the low pass Butterworth filter with unity cutoff angular frequency |G(i)| = 1/√2. This readily implies that for the cutoff angular frequencies a and b of P, we have |P(ia)| = |P(ib)| = 1/√2. In other words, the Bode plot of the magnitude of P(iω) drops off approximately three decibels at the cutoff angular frequencies a and b. Finally, as |ω| approaches zero or infinity, the magnitude of P(iω) tends to zero.

8.3.1 A bandpass Butterworth filtering example

In this section, we will design a fourth order Butterworth filter to pick out a sinusoid from a signal with additive Gaussian white noise. Recall that the McMillan degree of a transfer function G(s) = p(s)/q(s) is the degree of the denominator q(s) when p(s) and q(s) have no common roots.

Consider the input signal

u(t) = cos(t) + cos(3t) + 2w(t) (3.6)


[Figure 8.6: The Bode plot for our bandpass G(s); magnitude (dB) and phase (deg) versus frequency (rad/sec).]

[Figure 8.7: The simulation of u(t).]

where w(t) is a mean zero Gaussian white noise process with standard deviation 1. Let us design a filter G(s) = p(s)/q(s) of McMillan degree 4 to pick out the signal cos(3t) from the signal u(t). The output for the filter is Y(s) = G(s)U(s). The idea is to design a filter G such that the steady state output yss(t) ≈ cos(3t) with almost no phase shift. The Matlab command for the bandpass Butterworth filter is

[p, q] = butter(m, [a, b], 's') .

In this command 2m is the McMillan degree of the filter and [a, b] is the cutoff frequency range for the filter. The Matlab command to simulate the Gaussian white noise w is randn. For example, randn(1, 5000) simulates 5000 points in the Gaussian white noise process.

Clearly, there are infinitely many filters that will work. Here we will use a bandpass


[Figure 8.8: The simulation of y(t) and cos(3t).]

[Figure 8.9: A Simulink model for G(s).]

Butterworth filter with m = 2, which is of McMillan degree 4, with the band pass frequency band [3/1.2, 3 × 1.2] = [2.5, 3.6].


We used 2.5 = 3/1.2 and 3.6 = 3 × 1.2 as the cutoff frequencies because this places 3 in the middle of the log scale between the two cutoff frequencies 2.5 and 3.6, at a distance of log(1.2) from log(3), that is,

log(3) − log(3/1.2) = log(1.2) and log(3 × 1.2) − log(3) = log(1.2).

Since log(3) is in the middle of the band [2.5, 3.6] with respect to the log scale, the phase shift for the band pass Butterworth filter at ω = 3 will be approximately zero. The Matlab commands we used are given by

• [n, d] = butter(2, [2.5, 3.6], 's'); G = tf(n, d)

• bode(G); grid.

The filter Matlab gave us is

G(s) = 1.21s^2 / (s^4 + 1.556s^3 + 19.21s^2 + 14s + 81) . (3.7)

In Matlab we computed G(3i) and found G(3i) ≈ 1. So there is virtually no phase shift at the angular frequency 3. The Bode plot for this band pass Butterworth filter G(s) is given in Figure 8.6.
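The claim G(3i) ≈ 1 can be reproduced by direct evaluation of (3.7); a small Python check (our own arithmetic, independent of Matlab):

```python
s = 3j
num = 1.21 * s ** 2
den = s ** 4 + 1.556 * s ** 3 + 19.21 * s ** 2 + 14 * s + 81
G3i = num / den                      # G(3i) for the filter in (3.7)
assert abs(abs(G3i) - 1.0) < 1e-2    # the gain at omega = 3 is essentially one ...
assert abs(G3i.imag) < 1e-2          # ... and the phase shift there is negligible
```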

We used the lsim command in Matlab with the transfer function G(s) in (3.7) to find the output y(t) for the input

u(t) = cos(t) + cos(3t) + 2w(t).

In other words, Y(s) = G(s)U(s) in the s domain. The Matlab plot for the input u(t) is given in Figure 8.7. The plot of the output y(t) and cos(3t) on the same graph is presented in Figure 8.8. The Matlab commands we used to generate these plots are given by

• t = linspace(0, 40, 2^14);

• u = cos(t) + cos(3*t) + 2*randn(1, 2^14); plot(t, u);

• y = lsim(G, u, t); grid; plot(t, y);

• u3 = cos(3*t); hold on; plot(t, u3, 'r')

This simulation shows that in steady state yss(t) ≈ cos(3t). So our bandpass filter picked out cos(3t) from the input u(t), and did not introduce any phase shift at frequency 3. It is emphasized that because there is white noise in the system, a filter cannot eliminate all the noise in any frequency range. Therefore y(t) only approximates cos(3t) in steady state. Finally, due to the white noise, every time one runs the simulation the output will be different (unless one runs the white noise with the same random seed).

A Simulink model to compute y(t) is given in Figure 8.9. The time was set to 40 seconds. The data was sent to Matlab via the simout to workspace block. We clicked array in this block. Then one can graph the output by the plot(tout, simout) command in Matlab. Also one can click on the scope to view the output. (The output from the scope can also be sent to Matlab.) The block from the transfer function and cos(3t) to the scope and simout is a mux.


This plots cos(3t) and y(t) on the same graph. This graph is similar to the plot in Figure 8.8. The plots are not exactly the same because the Gaussian noise is different for each run. (Since the plots are similar, we did not bother to plot the graph obtained from our Simulink model.)

Exercise

Use Matlab to design a band pass Butterworth filter of McMillan degree at most 6 to solve the following filtering problem.

(i) Find a band pass filter G(s) = p(s)/q(s) of McMillan degree at most 6 to pick out cos(4t) from the signal

u(t) = sin(t) + sin(4t) + sin(10t) + 2w(t)

where w(t) is mean zero Gaussian white noise with variance 1. Plot the Bode plot of G, that is, plot bode(p, q).

(ii) Use the lsim command over 40 seconds in Matlab with u(t) to see how well your filter G(s) = p(s)/q(s) picks out cos(4t) from u(t). Set y = lsim(G, u, t). Notice that the output of lsim is y(t) where Y(s) = G(s)U(s). Plot u(t) and y(t) on separate graphs.

(iii) Compare the output y(t) of lsim to cos(4t), that is, hold on and plot cos(4t) and y(t) on the same graph.

(iv) Construct a Simulink model for your transfer function. Here you will need three sine wave generators and a summer. Use the mux to connect cos(4t) and the output of your transfer function to the scope, that is, cos(4t) and y(t) go into the mux, and the scope is connected to the output of the mux. The mux block allows you to plot cos(4t) and the output y(t) of the transfer function on the same graph. Run the Simulink model for 30 seconds and hand in the plot obtained from the scope, that is, the plot of cos(4t) and y(t) on the same graph.

8.3.2 State space realizations for band pass filters

Recall that {A,B,C,D} is a state space realization for a transfer function G if

G(s) = C(sI − A)^{−1}B + D .

Moreover, {A,B,C,D} is a stable realization if all the eigenvalues of A are contained in the open left half plane. The following result provides a method to compute a state space realization for a stable band pass filter with cutoff band [a, b], directly from a state space realization for a stable low pass filter with unity cutoff angular frequency. In particular, this proposition can be used to construct a state space realization for the n-th order band pass Butterworth filter.


PROPOSITION 8.3.1 Let {A on C^n, B, C, D} be a stable state space realization for a transfer function G. Let P(s) be the transfer function defined by P(s) = G(m(s)) where m(s) = (s^2 + ab)/sτ, the inequality 0 < a < b holds and τ = b − a. Finally, let

A1 = [ τA   √ab I ; −√ab I   0 ] on C^n ⊕ C^n, B1 = [ τB ; 0 ] and C1 = [ C   0 ] . (3.8)

Then {A1, B1, C1, D} is a stable realization for P(s). In particular, if {A,B,C,D} is a stable realization for the n-th order low pass Butterworth filter G(s) with unity cutoff angular frequency, then {A1, B1, C1, D} is a stable realization for the n-th order band pass Butterworth filter P(s) with angular frequency band [a, b].

Proof. The proof uses the following Schur formula to invert a 2 × 2 block matrix

[ T  X ; Y  Z ]^{−1} = [ Δ^{−1}   −Δ^{−1}XZ^{−1} ; −Z^{−1}YΔ^{−1}   Z^{−1} + Z^{−1}YΔ^{−1}XZ^{−1} ] (3.9)

where Δ = T − XZ^{−1}Y is the Schur complement; see Problem 3. Using this inversion formula, we obtain

C1(sI − A1)^{−1}B1 = [ C  0 ] [ sI − τA   −√ab I ; √ab I   sI ]^{−1} [ τB ; 0 ]
= CΔ^{−1}τB = C(sI − τA + (ab/s)I)^{−1}τB
= C((s/τ)I + (ab/τs)I − A)^{−1}B = C(m(s)I − A)^{−1}B .

In other words,

C1(sI − A1)^{−1}B1 = C(m(s)I − A)^{−1}B .

This readily implies that

P(s) = G(m(s)) = C(m(s)I − A)^{−1}B + D = C1(sI − A1)^{−1}B1 + D .

Therefore {A1, B1, C1, D} is a state space realization for P(s). To complete the proof it remains to show that A1 is stable. To this end, assume that λ is an eigenvalue for A1, and [f, g]^tr in C^{2n} is a corresponding eigenvector. (Recall that tr denotes the transpose.) In other words,

[ τA   √ab I ; −√ab I   0 ] [ f ; g ] = λ [ f ; g ] (3.10)

where f and g are vectors in C^n and f or g is nonzero. Notice that f or g is nonzero, because by definition an eigenvector must be nonzero. We claim that λ is nonzero. If λ = 0, then the second equation in (3.10) implies that −√ab f = 0. Hence f = 0. Substituting f = 0 into the first equation in (3.10) with λ = 0 yields g = 0. So both f and g equal zero. This contradicts the fact that the eigenvector [f, g]^tr is nonzero by definition. Therefore λ is nonzero.

Now let us find an expression for f and g. The second equation in (3.10) implies that −√ab f = λg. Notice that f is nonzero. If f equals zero, then g must also be zero, and this contradicts the fact that [f, g]^tr is an eigenvector. Substituting g = −(√ab/λ)f into the first equation in (3.10) yields τAf − (ab/λ)f = λf. By rearranging this equation, we arrive at

Af = (λ/τ + ab/λτ) f = ((λ^2 + ab)/λτ) f = m(λ)f .

In other words, m(λ) is an eigenvalue for A with corresponding eigenvector f. Because A is stable, ℜm(λ) < 0. Using λ = x + iy, we obtain

ℜm(λ) = ℜ(λ/τ + ab/λτ) = ℜ((x + iy)/τ + ab/((x + iy)τ)) = x/τ + ℜ(ab(x − iy)/((x^2 + y^2)τ))
= x/τ + abx/((x^2 + y^2)τ) = x(x^2 + y^2 + ab)/((x^2 + y^2)τ) = x(|λ|^2 + ab)/(|λ|^2 τ) .

Hence ℜm(λ) = x(|λ|^2 + ab)/(|λ|^2 τ). Since ℜm(λ) < 0 and (|λ|^2 + ab)/(|λ|^2 τ) is strictly positive, we see that ℜλ = x < 0. Therefore A1 is stable. This completes the proof.
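To see Proposition 8.3.1 in action, take the first order low pass Butterworth filter G(s) = 1/(s + 1), realized by {A,B,C,D} = {−1, 1, 1, 0}; the 2 × 2 system (3.8) should then realize P(s) = G(m(s)). A Python sketch of this check (our example; the 2 × 2 resolvent is inverted in closed form):

```python
a, b = 1.0, 10.0
t = b - a                            # tau
r = (a * b) ** 0.5                   # sqrt(ab)

# {A, B, C, D} = {-1, 1, 1, 0} realizes G(s) = 1/(s + 1).
def P_state_space(s):
    # C1 (sI - A1)^{-1} B1 + D with A1 = [[-t, r], [-r, 0]], B1 = [t, 0]^tr, C1 = [1, 0].
    det = s * (s + t) + r * r        # det(sI - A1)
    return t * s / det               # C1 picks the first entry of (sI - A1)^{-1} B1

def P_direct(s):
    m = (s * s + a * b) / (s * t)    # conformal mapping (3.1)
    return 1.0 / (m + 1.0)           # G(m(s))

for s in (1j * r, 2 + 5j, -0.4 + 1j):
    assert abs(P_state_space(s) - P_direct(s)) < 1e-12
```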

8.3.3 Exercise

Problem 1. Find the second order band pass Butterworth filter P with cutoff band [1, 10]. Find a state space realization for P. Plot the Bode plot of this P in Matlab.

Problem 2. Find a state space realization {A,B,C,D} for the second order low pass Butterworth filter with cutoff angular frequency one. Then use Proposition 8.3.1 to find a state space realization for the second order band pass Butterworth filter with cutoff band [1, 10]. You can use Matlab to compute {A1, B1, C1, D}. The Matlab command corresponding to Proposition 8.3.1 is given by lp2bp. So you can verify your answer by using Matlab.

Problem 3. Let M be the block matrix defined by

M = [ T  X ; Y  Z ] on C^n ⊕ C^n .

Here T, X, Y and Z are all matrices. The Schur complement of M is the matrix defined by Δ = T − XZ^{−1}Y. Assume that Δ and Z are both invertible matrices. Then show that the inverse of M is given by

[ T  X ; Y  Z ]^{−1} = [ Δ^{−1}   −Δ^{−1}XZ^{−1} ; −Z^{−1}YΔ^{−1}   Z^{−1} + Z^{−1}YΔ^{−1}XZ^{−1} ] . (3.11)


Hint: the following identity may be useful

M = [ I  XZ^{−1} ; 0  I ] [ T − XZ^{−1}Y  0 ; 0  Z ] [ I  0 ; Z^{−1}Y  I ] .

Finally, it is noted that the Schur inversion formula in (3.11) is a generalization of the classical inversion formula for an invertible 2 × 2 matrix.
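For a quick numerical sanity check of (3.11), take all four blocks to be 1 × 1, so T, X, Y, Z are scalars; the Python sketch below (our example values) verifies that the formula really produces the inverse:

```python
# Scalar blocks make the 2x2 Schur formula (3.11) easy to verify by hand.
T, X, Y, Z = 4.0, 2.0, 1.0, 3.0
delta = T - X * Y / Z                   # Schur complement: 4 - 2/3 = 10/3

# Inverse predicted by (3.11):
inv11 = 1 / delta
inv12 = -(1 / delta) * X * (1 / Z)
inv21 = -(1 / Z) * Y * (1 / delta)
inv22 = 1 / Z + (1 / Z) * Y * (1 / delta) * X * (1 / Z)

# Check M * M^{-1} = I entrywise.
assert abs(T * inv11 + X * inv21 - 1) < 1e-12
assert abs(T * inv12 + X * inv22) < 1e-12
assert abs(Y * inv11 + Z * inv21) < 1e-12
assert abs(Y * inv12 + Z * inv22 - 1) < 1e-12
```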

8.4 Band stop Butterworth filters

In this section we will present the band stop Butterworth filter by transforming the low pass Butterworth filter into a band stop filter. To this end, let 0 < a < b be two positive real numbers and set

r(s) = s(b − a)/(s^2 + ab) . (4.1)

Notice that r(s) = 1/m(s) where m(s) is the conformal mapping defined in (3.1). The n-th order band stop Butterworth filter Q(s) with angular frequency band [a, b] is defined by

Q(s) = G(r(s)) = G(s(b − a)/(s^2 + ab)) (4.2)

where G(s) is the n-th order low pass Butterworth filter with unity cutoff angular frequency. As expected, the band stop Butterworth filter is a stable proper rational filter. Moreover, we will see that at the cutoff angular frequencies a and b the magnitude |Q(ia)| = |Q(ib)| = 1/√2. So at the cutoff angular frequencies a and b, the magnitude Bode plot drops off approximately three decibels from its peak of one.

The second order band stop Butterworth filter with cutoff angular frequency band [a, b] is given by

Q(s) = (s^4 + 2abs^2 + a^2b^2) / (s^4 + √2(b − a)s^3 + (a^2 + b^2)s^2 + √2 ab(b − a)s + a^2b^2) . (4.3)

To verify this recall that the second order low pass Butterworth filter G(s) with unity cutoff angular frequency is given by

G(s) = 1/(1 + √2 s + s^2) .

Hence the second order band stop Butterworth filter is computed by

Q(s) = G(r(s)) = 1/[1 + √2(s(b − a)/(s^2 + ab)) + (s(b − a)/(s^2 + ab))^2]
= (s^2 + ab)^2 / [(s^2 + ab)^2 + √2(s^2 + ab)s(b − a) + s^2(b − a)^2]
= (s^4 + 2abs^2 + a^2b^2) / [s^4 + 2abs^2 + a^2b^2 + √2(b − a)s^3 + √2 ab(b − a)s + s^2(b − a)^2]
= (s^4 + 2abs^2 + a^2b^2) / [s^4 + √2(b − a)s^3 + (a^2 + b^2)s^2 + √2 ab(b − a)s + a^2b^2] .


Therefore the second order band stop Butterworth filter Q(s) is given by (4.3).

If G is any low pass stable rational filter with unity cutoff angular frequency, then Q(s) = G(r(s)) is a stable band stop filter with angular frequency band [a, b]. In particular, if G is the n-th order low pass Butterworth filter with unity cutoff angular frequency, then Q(s) = G(r(s)) is a stable band stop filter with stop band [a, b].

Now assume that G is any low pass stable rational filter with unity cutoff angular frequency. We claim that Q(s) = G(r(s)) determines a stable band stop filter with angular frequency band [a, b]. Recall that any stable proper rational transfer function admits a stable state space realization {A,B,C,D}. Proposition 8.4.1 below shows that Q is a stable proper rational transfer function.
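The band stop behavior of (4.3) can be confirmed numerically. This Python sketch (our check, reusing the band [2.5, 3.6] from the band pass example) shows total rejection at the center √ab, a 3 dB drop at each band edge, and unit gain far outside the band:

```python
import math

a, b = 2.5, 3.6                      # stop band edges

def Q(s):
    # Second order band stop Butterworth filter, closed form (4.3).
    num = s ** 4 + 2 * a * b * s ** 2 + (a * b) ** 2
    den = (s ** 4 + math.sqrt(2) * (b - a) * s ** 3 + (a * a + b * b) * s ** 2
           + math.sqrt(2) * a * b * (b - a) * s + (a * b) ** 2)
    return num / den

c = math.sqrt(a * b)
assert abs(Q(1j * c)) < 1e-9                             # Q vanishes at the band center
assert abs(abs(Q(1j * a)) - 1 / math.sqrt(2)) < 1e-9     # 3 dB at each band edge
assert abs(abs(Q(1j * b)) - 1 / math.sqrt(2)) < 1e-9
assert abs(Q(1e-6j) - 1) < 1e-3 and abs(Q(1e6j) - 1) < 1e-3   # unit gain far from the band
```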

Now let us investigate some properties of r. Notice that

r(iω) = iω(b − a)/(ab − ω^2) .

This readily implies that |r(i√ab)| is infinite. Moreover,

r(ia) = i and r(ib) = −i (4.4)

lim_{|ω|→0} |r(iω)| = 0 and lim_{|ω|→∞} |r(iω)| = 0 . (4.5)

Recall that √ab is the geometric mean of a and b. So √ab is the average of the interval [a, b] on the horizontal axis of the Bode plot. Since |r(i√ab)| is infinite, we see that |Q(i√ab)| = |G(r(i√ab))| = |G(i∞)|. So at the center point ω = √ab, the magnitude of the filter Q(iω) equals the magnitude of the low pass filter G(iω) evaluated at ω = ∞. In general |G(i∞)| = 0 for a low pass filter. In this case, |Q(i√ab)| = 0.

At the cutoff frequencies a and b, the magnitude |Q(iω)| = |G(i)|. This follows from

|Q(ia)| = |G(r(ia))| = |G(i)| and |Q(ib)| = |G(r(ib))| = |G(−i)| .

In other words, at the cutoff angular frequencies a and b, the magnitude of the filter Q(iω) equals the magnitude |G(i)| of the low pass filter G(iω) evaluated at the cutoff frequency one.

Equation (4.5) implies that

lim_{|ω|→0} |Q(iω)| = lim_{|ω|→0} |G(r(iω))| = |G(0)|
lim_{|ω|→∞} |Q(iω)| = lim_{|ω|→∞} |G(r(iω))| = |G(0)| .

In other words, as |ω| approaches zero or infinity, the magnitude of Q(iω) tends to |G(0)|, and thus, Q(s) is a stable band stop filter.

Now assume that G is the n-th order low pass Butterworth filter with unity cutoff angular frequency, and Q(s) = G(r(s)) is the corresponding n-th order band stop Butterworth filter with cutoff angular band [a, b]. The above analysis shows that |Q(i√ab)| = |G(i∞)| = 0. So at the center point ω = √ab, the magnitude of the Butterworth band stop filter Q(iω) equals zero. Moreover, |Q(0)| = |Q(i∞)| = |G(0)| = 1, that is, the magnitude of the band stop filter Q(iω) evaluated at ω = 0 or ω = ∞ is one. Recall that for the low pass Butterworth filter with unity cutoff angular frequency |G(i)| = 1/√2. This readily implies that for the cutoff angular frequencies a and b of Q, we have |Q(ia)| = |Q(ib)| = 1/√2. In other words, the Bode plot of the magnitude of Q(iω) drops off approximately three decibels at the cutoff angular frequencies a and b.

8.4.1 State space realizations for band stop filters

Recall that {A,B,C,D} is a state space realization for a transfer function G if

G(s) = C(sI − A)−1B +D .

Moreover, {A,B,C,D} is a stable realization if all the eigenvalues of A are contained in the open left half plane. The following result provides a method to compute a state space realization for a stable band stop filter with cutoff angular frequency band [a, b], directly from a state space realization for a stable low pass filter with unity cutoff angular frequency. In particular, this proposition can be used to construct a state space realization for the n-th order band stop Butterworth filter.

PROPOSITION 8.4.1 Let {A on Cⁿ, B, C, D} be a stable state space realization for a transfer function G. Let Q(s) be the transfer function defined by Q(s) = G(r(s)) where r(s) = sτ/(s² + ab), the inequality 0 < a < b holds and τ = b − a. Finally, let

    A2 = [  τA⁻¹    √ab I ]          B2 = [ −τA⁻¹B ]
         [ −√ab I     0   ]               [    0   ]        (4.6)

    C2 = [ CA⁻¹   0 ]   and   D2 = D − CA⁻¹B .

Here A2 acts on Cⁿ ⊕ Cⁿ, B2 maps Cⁿ into Cⁿ ⊕ Cⁿ, and C2 maps Cⁿ ⊕ Cⁿ into Cⁿ. Then {A2, B2, C2, D2} is a stable realization for Q(s). In particular, if {A,B,C,D} is a stable realization for the n-th order low pass Butterworth filter G(s) with unity cutoff angular frequency, then {A2, B2, C2, D2} is a stable realization for the n-th order band stop Butterworth filter Q(s) with angular frequency band [a, b].

Proof. Recall that (I − T)⁻¹ = I + T(I − T)⁻¹ when T is a square matrix and one is not an eigenvalue for T; see (2.7). Because A is stable, zero is not an eigenvalue for A. In particular, A is invertible. Using m(s) = 1/r(s), we obtain

    G(r(s)) = C(r(s)I − A)⁻¹B + D = −C(I − r(s)A⁻¹)⁻¹A⁻¹B + D
            = −CA⁻¹B − CA⁻¹r(s)(I − r(s)A⁻¹)⁻¹A⁻¹B + D
            = −CA⁻¹(r(s)⁻¹I − A⁻¹)⁻¹A⁻¹B + D − CA⁻¹B
            = −CA⁻¹(m(s)I − A⁻¹)⁻¹A⁻¹B + D − CA⁻¹B .

This readily implies that

    G(r(s)) = −CA⁻¹(m(s)I − A⁻¹)⁻¹A⁻¹B + D − CA⁻¹B .        (4.7)


Now let R(s) be the transfer function defined by

    R(s) = −CA⁻¹(sI − A⁻¹)⁻¹A⁻¹B + D − CA⁻¹B .        (4.8)

Equation (4.7) shows that G(r(s)) = R(m(s)). By construction

    {A⁻¹, −A⁻¹B, CA⁻¹, D − CA⁻¹B}        (4.9)

is a state space realization for R(s).

We claim that A⁻¹ is stable. To see this assume that λ is an eigenvalue for A⁻¹, and f is a corresponding eigenvector, that is, A⁻¹f = λf where f is a nonzero vector. Since A⁻¹ is invertible, λ is nonzero. Multiplying A⁻¹f = λf by A on the left yields Af = (1/λ)f. In other words, 1/λ is an eigenvalue for A. Because A is stable, ℜ(1/λ) < 0. Using λ = α + iβ, we obtain

    1/λ = 1/(α + iβ) = (α − iβ)/((α + iβ)(α − iβ)) = (α − iβ)/(α² + β²) = (α − iβ)/|λ|² .

Hence 0 > ℜ(1/λ) = α/|λ|². So α = ℜλ < 0. Therefore A⁻¹ is stable.

According to (4.8) and (4.9), the system

    {A⁻¹, −A⁻¹B, CA⁻¹, D − CA⁻¹B}

is a stable realization for the transfer function R(s). So we can implement Proposition 8.3.1. By respectively replacing A by A⁻¹, B by −A⁻¹B, C by CA⁻¹ and D by D − CA⁻¹B in Proposition 8.3.1, we see that {A2, B2, C2, D2} is a stable state space realization for R(m(s)). Since Q(s) = G(r(s)) = R(m(s)), the system {A2, B2, C2, D2} is also a stable state space realization for Q(s). This completes the proof.
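To make Proposition 8.4.1 concrete, here is a small numerical sketch in Python with NumPy (the helper names bandstop_realization and tf are ours, not from the text; Matlab would work equally well). It builds {A2, B2, C2, D2} from a first order low pass Butterworth realization and checks that the resulting transfer function agrees with G(r(s)):

```python
import numpy as np

def bandstop_realization(A, B, C, D, a, b):
    # Proposition 8.4.1: map a stable low pass realization {A,B,C,D} with
    # unity cutoff to a band stop realization {A2,B2,C2,D2} with band [a, b].
    n = A.shape[0]
    tau = b - a
    w0 = np.sqrt(a * b)
    Ainv = np.linalg.inv(A)
    A2 = np.block([[tau * Ainv, w0 * np.eye(n)],
                   [-w0 * np.eye(n), np.zeros((n, n))]])
    B2 = np.vstack([-tau * Ainv @ B, np.zeros((n, 1))])
    C2 = np.hstack([C @ Ainv, np.zeros((1, n))])
    D2 = D - C @ Ainv @ B
    return A2, B2, C2, D2

def tf(A, B, C, D, s):
    # Evaluate C (sI - A)^{-1} B + D at a complex point s.
    n = A.shape[0]
    return (C @ np.linalg.solve(s * np.eye(n) - A, B) + D).item()

# First order low pass Butterworth G(s) = 1/(s + 1); stop band [a, b] = [1, 2].
A = np.array([[-1.0]]); B = np.array([[1.0]])
C = np.array([[1.0]]);  D = np.array([[0.0]])
A2, B2, C2, D2 = bandstop_realization(A, B, C, D, 1.0, 2.0)

s = 0.3 + 1.7j
r = s * (2.0 - 1.0) / (s**2 + 1.0 * 2.0)               # r(s) = s tau / (s^2 + ab)
print(abs(tf(A2, B2, C2, D2, s) - tf(A, B, C, D, r)))  # ~ 0
```

At the center frequency ω = √ab the band stop magnitude should vanish; here tf(A2, B2, C2, D2, 1j*np.sqrt(2.0)) is indeed zero to machine precision.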

As before, let G be a stable low pass proper rational filter with unity cutoff angular frequency. The following table summarizes the transformation performed on G to convert G to a low pass, high pass, band pass or band stop filter.

The transformation of a low pass filter with unity cutoff

    Type of filter   cutoff   new filter                 state space
    low pass         ωc       G(s/ωc)                    Proposition 8.2.1
    high pass        ωc       G(ωc/s)                    Proposition 8.2.2
    band pass        [a, b]   G((s² + ab)/(s(b − a)))    Proposition 8.3.1
    band stop        [a, b]   G(s(b − a)/(s² + ab))      Proposition 8.4.1
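The band stop row of this table is what Matlab's lp2bs computes; SciPy provides an analogous routine. As a hedged sketch (assuming SciPy is available; scipy.signal.lp2bs takes the low pass numerator and denominator together with the center frequency wo = √(ab) and bandwidth bw = b − a):

```python
import numpy as np
from scipy.signal import lp2bs

# Low pass prototype G(s) = 1/(s + 1); desired stop band [a, b] = [1, 2].
num, den = lp2bs([1.0], [1.0, 1.0], wo=np.sqrt(2.0), bw=1.0)

# The table's substitution s -> s(b - a)/(s^2 + ab) gives, by hand,
# Q(s) = (s^2 + 2)/(s^2 + s + 2); compare both at a test point.
s = 0.5 + 2.0j
print(np.polyval(num, s) / np.polyval(den, s))
print((s**2 + 2.0) / (s**2 + s + 2.0))
```

The two printed values agree, so the library transformation matches the hand substitution for this example.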


8.4.2 Exercise

Problem 1. Find the second order band stop Butterworth filter Q with cutoff band [1, 10]. Find a state space realization for Q. Plot the Bode plot for this Q in Matlab.

Problem 2. Find a state space realization {A,B,C,D} for the second order low pass Butterworth filter with unity cutoff angular frequency. Then use Proposition 8.4.1 to find a state space realization for the second order band stop Butterworth filter with cutoff band [1, 10]. You can use Matlab to compute {A2, B2, C2, D2}. The Matlab command corresponding to Proposition 8.4.1 is given by lp2bs. So you can verify your answer by using Matlab.

Problem 3. The low pass Butterworth filter of order 2 with cutoff frequency 1 is given by

    G(s) = 1/(s² + √2 s + 1) .        (4.10)

Show that the low pass Butterworth filter of order 2 with cutoff frequency ωc is given by

    F(s) = ωc² / (s² + √2 ωc s + ωc²) .        (4.11)

Design a circuit consisting of a resistor, capacitor and inductor to implement the filter F(s); see Section 4.5 in Chapter 4.

Problem 4. Show that the high pass Butterworth filter of order 2 with cutoff frequency ωc is given by

    H(s) = s² / (s² + √2 ωc s + ωc²) .        (4.12)

Design a circuit consisting of a resistor, capacitor and inductor to implement the filter H(s).

Problem 5. The low pass Butterworth filter of order 1 with cutoff frequency 1 is given by

    G(s) = 1/(s + 1) .        (4.13)

Show that the band pass Butterworth filter of McMillan degree 2 with cutoff frequency range [a, b] is given by

    P(s) = s(b − a) / (s² + s(b − a) + ab) .        (4.14)

Design a circuit consisting of a resistor, capacitor and inductor to implement the filter P(s).

Problem 6. Show that the band stop Butterworth filter of McMillan degree 2 with frequency stop range [a, b] is given by

    Q(s) = (s² + ab) / (s² + s(b − a) + ab) .        (4.15)

Design a circuit consisting of a resistor, capacitor and inductor to implement the filter Q(s).


Chapter 9

The Fourier transform

This chapter presents the Fourier transform and some of its properties. The connection between the Fourier transform and the Laplace transform is also studied. The Fourier transform is used to give some additional insight into filtering theory.

9.1 The Fourier transform

In this section we introduce the Fourier transform. To this end, recall that L1(−∞,∞) is the set of all (Lebesgue measurable) functions g satisfying

    ∫_{−∞}^{∞} |g(t)| dt < ∞ .        (1.1)

If g is a function in L1(−∞,∞), then the Fourier transform of g is the function G(f) defined by

    G(f) = (Fg)(f) = ∫_{−∞}^{∞} e^{−2πift} g(t) dt .        (1.2)

Notice that the Fourier transform G(f) is a function of the real variable f. Here we use G to represent the Fourier transform of g. The notation G = Fg simply means that G is the Fourier transform of g. In many applications f is the frequency. By letting ω = 2πf, the Fourier transform can also be expressed as a function of the angular frequency ω.

It is emphasized that g must be a function in L1(−∞,∞) for its Fourier transform to be well defined. In other words, if g is not in L1(−∞,∞), then its Fourier transform G may not be well defined and may not make any sense. If g is in L1(−∞,∞), then G(f) is well defined for all f. To see this simply observe that

    |G(f)| = |∫_{−∞}^{∞} e^{−2πift} g(t) dt| ≤ ∫_{−∞}^{∞} |e^{−2πift} g(t)| dt = ∫_{−∞}^{∞} |g(t)| dt < ∞ .

So |G(f)| is finite for all f. Therefore G(f) is a well defined function of f when g is in L1(−∞,∞).


Now assume that g is a real valued function in L1(−∞,∞) and G = Fg is its Fourier transform. Then G(f)* = G(−f) for all f on the real line, where * denotes the complex conjugate. To see this simply observe that in this case

    G(f)* = ( ∫_{−∞}^{∞} e^{−2πift} g(t) dt )* = ∫_{−∞}^{∞} e^{2πift} g(t)* dt = ∫_{−∞}^{∞} e^{2πift} g(t) dt = G(−f) .

(The last step before G(−f) uses the fact that g(t)* = g(t) for a real valued g.) Hence G(f)* = G(−f). By taking the complex conjugate of both sides, we see that G(f) = G(−f)* when g is a real valued function.
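This conjugate symmetry has a discrete analogue: the FFT of a real sequence X satisfies X[k] = X[(N − k) mod N]*. A quick numerical check (NumPy assumed; the signal is just random test data):

```python
import numpy as np

x = np.random.default_rng(1).standard_normal(128)   # a real valued "signal"
X = np.fft.fft(x)
# Conjugate symmetry: X[k] equals the conjugate of X[(N - k) mod N],
# the discrete counterpart of G(f)* = G(-f) for real g.
N = len(x)
err = np.max(np.abs(X - np.conj(X[(-np.arange(N)) % N])))
print(err)  # ~ 0 (machine precision)
```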

To complete this section it is noted that the Laplace transform is different from the Fourier transform. The Laplace transform is only defined for functions g satisfying the constraint g(t) = 0 for all t < 0. The Fourier transform does not require g(t) = 0 for all t < 0. However, the function g cannot be an arbitrary function. The Fourier transform G of g is only well defined when g is in L1(−∞,∞), or g satisfies a similar constraint. For example, consider the function g(t) = e^t for t ≥ 0 and g(t) = 0 for t < 0. The Laplace transform of g exists. In fact, Lg = 1/(s − 1). Notice that this g is not in L1(−∞,∞), and thus, the Fourier transform of this g does not exist.

The inverse Fourier transform. As before, let G(f) = (Fg)(f) be the Fourier transform of a function g in L1(−∞,∞). Then g is referred to as the inverse Fourier transform of G. Moreover, one can show that g is given by the formula

    g(t) = (F⁻¹G)(t) = ∫_{−∞}^{∞} e^{2πift} G(f) df .        (1.3)

Equations (1.2) and (1.3) show that g and its Fourier transform G uniquely determine each other. Furthermore, since the integral is the limit of a sum, equation (1.3) shows that g(t) is essentially a continuous sum of sinusoids e^{2πift}G(f) with frequency f and amplitude G(f). The plot of |G(f)| versus f can be viewed as the spectrum for g. Finally, it is noted that proving the fact that the inverse Fourier transform F⁻¹ is given by (1.3) is beyond the scope of these notes.

Linearity. The Fourier transform is a linear operator. To be precise, let g and h be two functions in L1(−∞,∞). Then

    αG(f) + βH(f) = αFg + βFh = F(αg + βh) ,        (1.4)

where α and β are scalars. The proof is simple and follows from the fact that the integral is a linear operator, that is,

    F(αg + βh) = ∫_{−∞}^{∞} e^{−2πift}(αg(t) + βh(t)) dt
               = α ∫_{−∞}^{∞} e^{−2πift} g(t) dt + β ∫_{−∞}^{∞} e^{−2πift} h(t) dt
               = αFg + βFh = αG(f) + βH(f) .

Therefore (1.4) holds and the Fourier transform F is a linear operator.


Example, a pulse. Let g be the function defined by

    g(t) = 1 if −μ ≤ t ≤ μ
         = 0 otherwise .        (1.5)

Then the Fourier transform of g is given by

    G(f) = sin(2πfμ)/(πf) .        (1.6)

To verify this simply observe that

    G(f) = ∫_{−∞}^{∞} e^{−2πift} g(t) dt = ∫_{−μ}^{μ} e^{−2πift} dt = e^{−2πift}/(−2πif) |_{−μ}^{μ}
         = (e^{2πifμ} − e^{−2πifμ})/(2πif) = sin(2πfμ)/(πf) .

Hence the Fourier transform of g in (1.5) is given by (1.6). Finally, it is noted that according to L'Hospital's rule G(0) = 2μ. So G(0) is well defined.

Recall that the sinc function is defined by

    sinc(x) = sin(πx)/(πx) .        (1.7)

According to L'Hospital's rule sinc(0) = 1. Equation (1.6) shows that the Fourier transform of the function g defined in (1.5) is given by

    G(f) = 2μ sinc(2fμ) .        (1.8)
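A quick numerical sanity check of the pulse transform, comparing a trapezoid-rule approximation of the defining integral (1.2) against the closed form (1.6) (NumPy assumed; the values of μ and f are arbitrary test choices):

```python
import numpy as np

mu, f = 0.75, 1.3                        # pulse half width and a test frequency (arbitrary)
t = np.linspace(-mu, mu, 200001)
y = np.exp(-2j * np.pi * f * t)          # integrand of (1.2) on the support of g
G_num = np.sum((y[:-1] + y[1:]) / 2) * (t[1] - t[0])   # trapezoid rule
G_formula = np.sin(2 * np.pi * f * mu) / (np.pi * f)   # closed form (1.6)
print(abs(G_num - G_formula))                          # ~ 0
print(abs(G_formula - 2 * mu * np.sinc(2 * f * mu)))   # np.sinc(x) = sin(pi x)/(pi x), i.e. (1.8)
```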

9.1.1 The Fourier transform of exponential functions.

Throughout 1+(t) is the unit step function defined by

    1+(t) = 1 if 0 ≤ t
          = 0 if t < 0 .        (1.9)

Now consider the function g defined by g(t) = e^{at}1+(t) where a is a constant in the open left half plane, that is, ℜa < 0. Notice that

    g(t) = e^{at} if 0 ≤ t
         = 0 if t < 0 .        (1.10)

The constraint ℜa < 0 guarantees that g(t) is a function in L1(−∞,∞). The Fourier transform of e^{at}1+(t) is given by

    (F e^{at}1+)(f) = 1/(2πif − a)    (ℜa < 0) .        (1.11)


To verify this observe that

    (F e^{at}1+)(f) = ∫_{−∞}^{∞} e^{−2πift} e^{at} 1+(t) dt = ∫_{0}^{∞} e^{(a−2πif)t} dt = e^{(a−2πif)t}/(a − 2πif) |_{0}^{∞} = 1/(2πif − a) .

Hence (1.11) holds. In particular, assume that

    g(t) = 3e^{−2t} − 4e^{−5t} + e^{−6t} if 0 ≤ t
         = 0 if t < 0 .        (1.12)

Notice that g(t) = 3e^{−2t}1+(t) − 4e^{−5t}1+(t) + e^{−6t}1+(t). Using linearity along with (1.11), we see that the Fourier transform of this g is given by

    G(f) = 3/(2πif + 2) − 4/(2πif + 5) + 1/(2πif + 6) .        (1.13)
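Formula (1.11) is easy to confirm numerically for a single exponential by truncating the defining integral to a long finite interval (NumPy assumed; the particular a and f below are arbitrary test choices):

```python
import numpy as np

a, f = -2.0 + 0.5j, 0.7                 # any a with Re a < 0 and a test frequency
t = np.linspace(0.0, 40.0, 400001)      # e^{at} is negligible beyond t ~ 40
y = np.exp((a - 2j * np.pi * f) * t)    # integrand e^{-2 pi i f t} e^{a t}
G_num = np.sum((y[:-1] + y[1:]) / 2) * (t[1] - t[0])   # trapezoid rule
print(abs(G_num - 1.0 / (2j * np.pi * f - a)))         # ~ 0
```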

Throughout 1−(t) is the unit step function defined by

    1−(t) = 0 if t ≥ 0
          = 1 if t < 0 .        (1.14)

Now consider the function g defined by g(t) = e^{ct}1−(t) where c is a constant in the open right half plane, that is, ℜc > 0. Notice that g is the function defined by

    g(t) = 0 if t ≥ 0
         = e^{ct} if t < 0 .        (1.15)

The constraint ℜc > 0 guarantees that g(t) is a function in L1(−∞,∞). The Fourier transform of e^{ct}1−(t) is given by

    (F e^{ct}1−)(f) = 1/(c − 2πif)    (ℜc > 0) .        (1.16)

To verify this simply observe that

    (F e^{ct}1−)(f) = ∫_{−∞}^{∞} e^{−2πift} e^{ct} 1−(t) dt = ∫_{−∞}^{0} e^{(c−2πif)t} dt = e^{(c−2πif)t}/(c − 2πif) |_{−∞}^{0} = 1/(c − 2πif) .

Hence (1.16) holds. For an example consider the function g defined by

    g(t) = 3e^{−2t} − 4e^{−5t} if 0 ≤ t
         = 2e^{3t} − 3e^{4t} if t < 0 .        (1.17)

Clearly, g is a function in L1(−∞,∞). Notice that

    g(t) = 3e^{−2t}1+(t) − 4e^{−5t}1+(t) + 2e^{3t}1−(t) − 3e^{4t}1−(t) .


Using (1.11) and (1.16) along with the fact that the Fourier transform is a linear operator, we obtain

    G(f) = 3/(2πif + 2) − 4/(2πif + 5) + 2/(3 − 2πif) − 3/(4 − 2πif) .        (1.18)

Let g be a function of the form

    g(t) = Σ_{k=1}^{n} α_k e^{a_k t} if t ≥ 0
         = Σ_{j=1}^{m} β_j e^{c_j t} if t < 0        (1.19)

where {a_k}₁ⁿ are distinct complex numbers satisfying ℜa_k < 0 for all k = 1, 2, · · · , n, and {c_j}₁ᵐ are distinct complex numbers satisfying ℜc_j > 0 for all j = 1, 2, · · · , m. This constraint implies that the function g is in L1(−∞,∞). By consulting (1.11) and (1.16), we see that the Fourier transform of g in (1.19) is given by the following strictly proper rational function

    G(f) = Σ_{k=1}^{n} α_k/(2πif − a_k) + Σ_{j=1}^{m} β_j/(c_j − 2πif) .        (1.20)

A two sided exponential function. Consider the function defined by g(t) = e^{−b|t|} where b > 0 is a strictly positive constant. The constraint b > 0 guarantees that g(t) is a function in L1(−∞,∞). The Fourier transform of e^{−b|t|} is given by

    (F e^{−b|t|})(f) = 2b/(4π²f² + b²)    (b > 0) .        (1.21)

To verify this fact, observe that

    e^{−b|t|} = e^{−bt} if 0 ≤ t
              = e^{bt} if t < 0 .        (1.22)

In other words, e^{−b|t|} = e^{−bt}1+(t) + e^{bt}1−(t). Using linearity along with (1.11) and (1.16) we see that the Fourier transform of e^{−b|t|} is given by

    (F e^{−b|t|})(f) = 1/(2πif + b) + 1/(b − 2πif) = ((b − 2πif) + (2πif + b))/((2πif + b)(b − 2πif)) = 2b/(4π²f² + b²) .

Hence (1.21) holds. For a related example consider the function g defined by

    g(t) = 2e^{−|t|} − 3e^{−2|t|} + 4e^{−6|t|} .

Then using (1.21) along with the fact that the Fourier transform is a linear operator, we obtain

    G(f) = 4/(4π²f² + 1) − 6/(4π²f² + 4) + 8/(4π²f² + 36) .


9.1.2 Connections to the Laplace transform.

In this section we will develop some connections between the Laplace transform and the Fourier transform. Let L1(0,∞) be the subset of L1(−∞,∞) consisting of all functions g such that g(t) = 0 for all t < 0 and

    ∫_{0}^{∞} |g(t)| dt < ∞ .

Notice that g is in L1(0,∞) if and only if g is in L1(−∞,∞) and g(t) = g(t)1+(t) for all t. If g is in L1(0,∞), then the Laplace transform G of g is well defined and given by

    G(s) = (Lg)(s) = ∫_{0}^{∞} e^{−st} g(t) dt    (ℜs > 0) .        (1.23)

Recall that the Laplace transform G(s) is a function of the complex variable s. As before, we will use capital G to represent the Laplace transform of g. The notation G = Lg simply means that G is the Laplace transform of g. By consulting the definition of the Fourier transform of g in (1.2) we see that

    G(f) = G(2πif)    (g ∈ L1(0,∞)) .        (1.24)

In other words, if g is a function in L1(0,∞), then the Fourier transform G(f) of g can be computed by evaluating the Laplace transform G(s) at s = 2πif. For example, let g(t) = e^{at}1+(t) where ℜa < 0. The Laplace transform of g is G(s) = 1/(s − a). Hence G(f) = G(2πif) = 1/(2πif − a).

For another example, assume that g(t) = 3e^{−2t}1+(t) − 4e^{−5t}1+(t) + e^{−6t}1+(t). Clearly, g is in L1(0,∞). Moreover, the Laplace transform of this g is given by

    G(s) = 3/(s + 2) − 4/(s + 5) + 1/(s + 6) .

Therefore the Fourier transform G(f) for g is given by

    G(f) = G(2πif) = 3/(2πif + 2) − 4/(2πif + 5) + 1/(2πif + 6) .

Let G(s) be a proper rational function and g its inverse Laplace transform. Then the partial fraction expansion of G shows that g is in L1(0,∞) if and only if all the poles of G(s) are contained in the open left half plane {s : ℜs < 0}. In other words, g is in L1(0,∞) if and only if G is a stable proper rational function. In this case, the Fourier transform G of g exists and is given by G(f) = G(2πif). Moreover, if G is a stable strictly proper rational transfer function, then the formula for the inverse Fourier transform in (1.3) implies that

    g(t) = (L⁻¹G)(t) = ∫_{−∞}^{∞} e^{2πift} G(2πif) df .        (1.25)


Equation (1.25) provides us with a formula for computing the inverse Laplace transform for a stable strictly proper rational function G.
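As a sketch of how (1.25) can be used numerically, the inverse Laplace transform of G(s) = 1/(s + 1), namely g(t) = e^{−t} for t ≥ 0, can be recovered by truncating the frequency integral to a long finite interval (NumPy assumed; the truncation length and grid are arbitrary choices, and the tails contribute a small error):

```python
import numpy as np

t = 1.0                                   # recover g(t) = e^{-t} at t = 1
f = np.linspace(-200.0, 200.0, 400001)    # truncate the frequency axis
y = np.exp(2j * np.pi * f * t) / (2j * np.pi * f + 1.0)        # e^{2 pi i f t} G(2 pi i f)
g_num = np.real(np.sum((y[:-1] + y[1:]) / 2) * (f[1] - f[0]))  # trapezoid rule
print(abs(g_num - np.exp(-1.0)))  # small; the error comes from truncating the tails
```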

It is emphasized that one can only use the identity G(f) = G(2πif) when g is in L1(0,∞). For example, consider the function g(t) = e^t 1+(t). Then the Laplace transform G(s) = 1/(s − 1) is well defined for all s ≠ 1. However, this g is not in L1(−∞,∞), and thus, g does not have a Fourier transform. To demonstrate why one needs g to be a function in L1(−∞,∞), consider the function h(t) = −e^t 1−(t). Notice that H(f) = 1/(2πif − 1). Hence H(f) = G(2πif). However, h ≠ g. So if one uses G(2πif) as the Fourier transform of g, then the inverse Fourier transform yields two different functions g and h. To avoid this kind of a problem one only defines the Fourier transform for functions g in L1(−∞,∞).

The Fourier transform for functions in L1(−∞, 0). In this section we will develop some further connections between the Laplace and Fourier transforms. Let L1(−∞, 0) be the subset of L1(−∞,∞) consisting of all functions g such that g(t) = 0 for all t ≥ 0 and

    ∫_{−∞}^{0} |g(t)| dt < ∞ .

Notice that g is in L1(−∞, 0) if and only if g is in L1(−∞,∞) and g(t) = g(t)1−(t) for all t. Moreover, g is in L1(−∞, 0) if and only if g(−t) is in L1(0,∞). So if g is in L1(−∞, 0), then the Laplace transform of g(−t) is well defined. If g is in L1(−∞, 0), then we claim that

    G(f) = (Lg(−t))(−2πif) = (Lg(−t))(s)|_{s=−2πif}    (g ∈ L1(−∞, 0)) .        (1.26)

In other words, if g is in L1(−∞, 0), then the Fourier transform G(f) equals the Laplace transform of g(−t) evaluated at s = −2πif. To see this observe that

    G(f) = ∫_{−∞}^{0} e^{−2πift} g(t) dt = −∫_{∞}^{0} e^{2πifσ} g(−σ) dσ = ∫_{0}^{∞} e^{2πift} g(−t) dt
         = ∫_{0}^{∞} e^{−st} g(−t) dt |_{s=−2πif} = (Lg(−t))(s)|_{s=−2πif} = (Lg(−t))(−2πif) .

Hence (1.26) holds. For example, let g(t) = e^{ct}1−(t) where ℜc > 0. Notice that g(−t) = e^{−ct} for all t > 0 and zero otherwise. The Laplace transform of g(−t) is given by 1/(s + c). So by replacing s with −2πif, we see that G(f) = 1/(c − 2πif).

For another example, assume that g(t) = 3e^{2t}1−(t) − 4e^{5t}1−(t) + e^{6t}1−(t). Clearly, g is in L1(−∞, 0). Moreover, g(−t) is given by

    g(−t) = 3e^{−2t} − 4e^{−5t} + e^{−6t} if t > 0
          = 0 otherwise .

The Laplace transform for g(−t) is given by

    (Lg(−t))(s) = 3/(s + 2) − 4/(s + 5) + 1/(s + 6) .


By setting s = −2πif, we see that the Fourier transform G(f) for g is given by

    G(f) = (Lg(−t))(−2πif) = 3/(2 − 2πif) − 4/(5 − 2πif) + 1/(6 − 2πif) .

To complete this section, let us observe that any function g in L1(−∞,∞) admits a decomposition of the form g(t) = g+(t) + g−(t), where g+(t) = g(t) for all t ≥ 0 and g+(t) = 0 for all t < 0, while g−(t) = g(t) for all t < 0 and g−(t) = 0 for all t ≥ 0. Clearly, g+ is a function in L1(0,∞), and g− is a function in L1(−∞, 0). By combining (1.24) and (1.26), we see that the Fourier transform G of g is given by

    G(f) = (Lg+)(2πif) + (Lg−(−t))(−2πif) .        (1.27)

In other words, the Fourier transform G(f) of g equals the Laplace transform of g+(t) evaluated at s = 2πif plus the Laplace transform of g−(−t) evaluated at s = −2πif.

9.1.3 The Dirac delta function

Let δ(t) be the Dirac delta function. Recall that the Dirac delta function δ(t) is positive infinity at the origin, is zero everywhere else and has area one. Moreover, δ(t) is an even function. If h is any continuous function and t0 is a fixed point on the real line, then

    h(t0) = ∫_{−∞}^{∞} h(t) δ(t − t0) dt .        (1.28)

The fact that δ(t − t0) picks out the value of h(t) is called the sifting or sampling property of the Dirac delta function; see Section 2.7.2 in Chapter 2. In particular, if t0 = 0, then we have

    h(0) = ∫_{−∞}^{∞} h(t) δ(t) dt .        (1.29)

The Fourier transform of the Dirac delta function is given by

    1 = F(δ(t)) and e^{−2πift0} = F(δ(t − t0)) .        (1.30)

To verify this simply notice that (1.28) gives

    F(δ(t − t0)) = ∫_{−∞}^{∞} e^{−2πift} δ(t − t0) dt = e^{−2πift0} .

Hence e^{−2πift0} = F(δ(t − t0)). If t0 = 0, then 1 = Fδ. Therefore (1.30) holds.

Clearly, the function g(t) = 1 is not in L1(−∞,∞). So technically speaking the Fourier transform of 1 does not exist. However, we can formally define the Fourier transform of 1, that is,

    δ(f) = (F1)(f) and 1 = (F⁻¹δ(f))(t) .        (1.31)

To show why this definition of the Fourier transform of 1 makes sense, apply the formula for the inverse Fourier transform in (1.3) to δ(f). Thus

    (F⁻¹δ(f))(t) = ∫_{−∞}^{∞} e^{2πift} δ(f) df = e^{2πi·0·t} = 1 .

Hence 1 = (F⁻¹δ(f))(t), and thus, the formal Fourier transform (F1)(f) = δ(f) makes sense.


9.1.4 Exercise

Problem 1. Find the Fourier transform for the function g defined by

g(t) = 1 if 0 ≤ t ≤ 2

= 0 otherwise .

Problem 2. Find the Fourier transform for the function g defined by

g(t) = 2e^{−t} − 4e^{−3t} if 0 ≤ t
     = 2e^{t} − e^{2t} if t < 0 .

Problem 3. Find the Fourier transform for the function g defined by

g(t) = δ(t) + δ(t− 3) + δ(t+ 3) .

Problem 4. Find the inverse Fourier transform g for the function G(f) defined by

G(f) = 1 if − 2 ≤ f ≤ 2

= 0 otherwise .

Plot the function g(t) in Matlab.

Problem 5. Find the inverse Fourier transform g for the function G(f) defined by

G(f) = 2e^{−3|f|} .

Problem 6. Find the inverse Fourier transform g for the function G(f) defined by

G(f) = aδ(f − f0) + aδ(f + f0) .

Here a and f0 are real constants.

Problem 7. Use the inversion formula in equation (1.25) to numerically compute the inverseFourier transform of G(s) = 1/(s+ 1).

9.2 Properties of the Fourier transform

In this section we will present several useful properties of the Fourier transform.


The time shifting property. Let G be the Fourier transform for a function g and t0 a specified point on the real line. Then the Fourier transform of g(t − t0) is given by

    (Fg(t − t0))(f) = e^{−2πift0} G(f) .        (2.1)

To verify this simply observe that

    (Fg(t − t0))(f) = ∫_{−∞}^{∞} e^{−2πift} g(t − t0) dt = ∫_{−∞}^{∞} e^{−2πif(σ+t0)} g(σ) dσ
                    = e^{−2πift0} ∫_{−∞}^{∞} e^{−2πifσ} g(σ) dσ = e^{−2πift0}(Fg)(f) = e^{−2πift0} G(f) .

Therefore (2.1) holds.

For an application of the shifting property let us show that (Fδ(t − t0))(f) = e^{−2πift0}. To see this notice that Fδ = 1 and (2.1) yields

    (Fδ(t − t0))(f) = e^{−2πift0} Fδ = e^{−2πift0} .

In particular, the Fourier transform of δ(t − t0) + δ(t + t0) equals 2 cos(2πft0).

The frequency shifting property. Let G be the Fourier transform for a function g and f0 a specified point on the real line. Then the Fourier transform of e^{2πif0t}g(t) is given by

    (F e^{2πif0t} g(t))(f) = G(f − f0) .        (2.2)

To verify this observe that

    (F⁻¹G(f − f0))(t) = ∫_{−∞}^{∞} e^{2πift} G(f − f0) df = ∫_{−∞}^{∞} e^{2πi(σ+f0)t} G(σ) dσ
                      = e^{2πif0t} ∫_{−∞}^{∞} e^{2πiσt} G(σ) dσ = e^{2πif0t}(F⁻¹G)(t) = e^{2πif0t} g(t) .

Therefore (2.2) holds.

Using (F1)(f) = δ(f), the shifting property in (2.2) shows that

    (F e^{2πif0t})(f) = δ(f − f0) .        (2.3)

For an application of the shifting property notice that

    (F cos(2πf0t))(f) = δ(f − f0)/2 + δ(f + f0)/2 .        (2.4)

To see this notice that (F1) = δ(f) and (2.2) yields

    (F cos(2πf0t))(f) = F(e^{2πif0t}·1 + e^{−2πif0t}·1)/2 = δ(f − f0)/2 + δ(f + f0)/2 .

Hence (2.4) holds. A similar calculation shows that

    (F sin(2πf0t))(f) = δ(f − f0)/2i − δ(f + f0)/2i .        (2.5)


Duality. As before, let G be the Fourier transform of a function g. Then

    G(f) = (Fg)(f) if and only if g(−f) = (FG)(f) .        (2.6)

To verify this notice that G(f) = (Fg)(f) if and only if

    g(τ) = (F⁻¹G)(τ) = ∫_{−∞}^{∞} e^{2πiστ} G(σ) dσ = (FG)(−τ) .

By setting −τ = f, we see that G(f) = (Fg)(f) if and only if g(−f) = (FG)(f). In other words, (2.6) holds.

Differentiation. Let g′ denote the derivative of a function g with respect to time t. Assume that g and g′ are both functions in L1(−∞,∞). Then

    (Fg′)(f) = 2πif G(f) .        (2.7)

To verify this notice that integration by parts yields

    (Fg′)(f) = ∫_{−∞}^{∞} e^{−2πift} g′(t) dt = e^{−2πift} g(t) |_{−∞}^{∞} + 2πif ∫_{−∞}^{∞} e^{−2πift} g(t) dt = 2πif G(f) .

Thus (Fg′)(f) = 2πif G(f). (The boundary term vanishes because such a g tends to zero as |t| → ∞.)

Integration. Assume that g and ∫_{−∞}^{t} g(σ) dσ are both functions in L1(−∞,∞). Then

    (F ∫_{−∞}^{t} g(σ) dσ)(f) = G(f)/(2πif) .        (2.8)

To verify this let h(t) = ∫_{−∞}^{t} g(σ) dσ. Using h′ = g, we obtain

    2πif H(f) = (Fh′)(f) = (Fg)(f) = G(f) .

Dividing by 2πif yields H(f) = G(f)/(2πif). Therefore (2.8) holds.

Convolution. Let g and u be two functions in L1(−∞,∞). Then the convolution between g and u is defined by

    (g ⊗ u)(t) = ∫_{−∞}^{∞} g(t − σ) u(σ) dσ .        (2.9)

By changing the variables of integration, we obtain

    (g ⊗ u)(t) = ∫_{−∞}^{∞} g(t − σ) u(σ) dσ = ∫_{−∞}^{∞} g(σ) u(t − σ) dσ .        (2.10)

In particular, g ⊗ u = u ⊗ g.


Notice that g ⊗ δ = g where δ(t) is the Dirac delta function. To see this observe that (1.29) yields

    (g ⊗ δ)(t) = ∫_{−∞}^{∞} g(t − σ) δ(σ) dσ = g(t − 0) = g(t) .

Therefore g ⊗ δ = g.

If g and u are two functions in L1(−∞,∞), then

    (Fg ⊗ u)(f) = G(f)U(f) .        (2.11)

In other words, the Fourier transform of the convolution of two functions is the product of their Fourier transforms: convolution in the time domain corresponds to multiplication in the frequency domain.

To show that (Fg ⊗ u)(f) = G(f)U(f) notice that (2.10) yields

    (Fg ⊗ u)(f) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{−2πift} g(σ) u(t − σ) dσ dt = ∫_{−∞}^{∞} g(σ) ∫_{−∞}^{∞} e^{−2πift} u(t − σ) dt dσ
                = ∫_{−∞}^{∞} e^{−2πifσ} g(σ) ∫_{−∞}^{∞} e^{−2πif(t−σ)} u(t − σ) dt dσ
                = ∫_{−∞}^{∞} e^{−2πifσ} g(σ) ∫_{−∞}^{∞} e^{−2πifτ} u(τ) dτ dσ
                = ∫_{−∞}^{∞} e^{−2πifσ} g(σ) U(f) dσ = G(f)U(f) .

Therefore (2.11) holds.
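The discrete counterpart of (2.11) is easy to check numerically: the DFT of a (zero padded) linear convolution of two finite sequences equals the product of their DFTs. A sketch with NumPy (random test sequences; this is the discrete analogue, not the integral identity itself):

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.standard_normal(64), rng.standard_normal(64)
n = len(x) + len(y) - 1                      # length of the linear convolution
lhs = np.fft.fft(np.convolve(x, y))          # transform of the convolution
rhs = np.fft.fft(x, n) * np.fft.fft(y, n)    # product of the (zero padded) transforms
print(np.max(np.abs(lhs - rhs)))             # ~ 0 (machine precision)
```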

Finally, it is noted that if g(t) and u(t) are two functions satisfying g(t) = 0 and u(t) = 0 for all t < 0, then the convolution between g and u reduces to the following formula

    (g ⊗ u)(t) = ∫_{0}^{t} g(t − σ) u(σ) dσ .

This is the convolution formula used in Chapter 4 concerning Laplace transforms.

To complete this section, let us observe that multiplication in the time domain corresponds to convolution in the frequency domain. To be precise, let g and u be two functions in L1(−∞,∞). Then we have

    (Fgu)(f) = (G ⊗ U)(f) = ∫_{−∞}^{∞} G(f − σ) U(σ) dσ .        (2.12)


To verify this notice that

    (F⁻¹G ⊗ U)(t) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{2πift} G(σ) U(f − σ) dσ df = ∫_{−∞}^{∞} G(σ) ∫_{−∞}^{∞} e^{2πift} U(f − σ) df dσ
                  = ∫_{−∞}^{∞} e^{2πiσt} G(σ) ∫_{−∞}^{∞} e^{2πi(f−σ)t} U(f − σ) df dσ
                  = ∫_{−∞}^{∞} e^{2πiσt} G(σ) ∫_{−∞}^{∞} e^{2πiτt} U(τ) dτ dσ
                  = ∫_{−∞}^{∞} e^{2πiσt} G(σ) u(t) dσ = g(t) u(t) .

Therefore (2.12) holds.

Parseval’s formula. Let g and h be two functions satisfying∫ ∞

−∞|g(t)|2 dt <∞ and

∫ ∞

−∞|h(t)|2 dt <∞ .

Then it turns out the Fourier transforms G(f) = (Fg)(f) and H(f) = (Fh)(f) are welldefined. Moreover, the following result known as Parseval’s equality holds∫ ∞

−∞g(t)h(t) dt =

∫ ∞

−∞G(f)H(f) df . (2.13)

In particular, by choosing h = g we have∫ ∞

−∞|g(t)|2 dt =

∫ ∞

−∞|G(f)|2 df . (2.14)

For example, let g be the function defined by

    g(t) = e^{−γt} if 0 ≤ t
         = 0 if t < 0        (2.15)

where γ > 0 is a strictly positive constant. Recall that the Fourier transform for g is given by G(f) = 1/(2πif + γ); see (1.11). Using (2.14), we obtain

    1/(2γ) = ∫_{0}^{∞} e^{−2γt} dt = ∫_{−∞}^{∞} |g(t)|² dt = ∫_{−∞}^{∞} |G(f)|² df = ∫_{−∞}^{∞} df/(4π²f² + γ²) .

In particular, this shows that

    1 = ∫_{−∞}^{∞} 2γ/(4π²f² + γ²) df .
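The Parseval computation above is easy to confirm numerically by truncating the frequency integral of |G(f)|² to a long finite interval (NumPy assumed; γ and the truncation range are arbitrary test choices, and the neglected tails carry O(1/F) mass):

```python
import numpy as np

gamma = 1.5
f = np.linspace(-500.0, 500.0, 1000001)
G2 = 1.0 / (4 * np.pi**2 * f**2 + gamma**2)              # |G(f)|^2 for G(f) = 1/(2 pi i f + gamma)
energy = np.sum((G2[:-1] + G2[1:]) / 2) * (f[1] - f[0])  # trapezoid rule
print(abs(energy - 1.0 / (2 * gamma)))  # small; matches the time domain energy 1/(2 gamma)
```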


Fourier transform pairs. The following presents a table of some Fourier transform pairs. In this table 1+(t) is the unit step function defined in (1.9), while 1−(t) is the unit step function defined in (1.14).

    Fourier transform pairs

    g(t)                                     G(f)
    δ(t)                                     1
    δ(t − t0)                                e^{−2πift0}
    1                                        δ(f)
    e^{2πif0t}                               δ(f − f0)
    cos(2πf0t)                               (δ(f − f0) + δ(f + f0))/2
    sin(2πf0t)                               (δ(f − f0) − δ(f + f0))/2i
    e^{at}1+(t) where ℜa < 0                 (2πif − a)⁻¹
    e^{at}1−(t) where ℜa > 0                 (a − 2πif)⁻¹
    e^{−b|t|} where b > 0                    2b(4π²f² + b²)⁻¹
    g(t) = 1 if |t| ≤ μ and 0 otherwise      sin(2πfμ)/(πf)
    e^{−t²/2σ²}/(√(2π)σ) where σ > 0         e^{−2σ²π²f²}

9.2.1 Exercise

Problem 1. Find the Fourier transform of g(t) = cos(2πf0t + φ) where f0 and φ are constants.

Problem 2. Find the inverse Fourier transform of G(f) = e−|f |.

Problem 3. Let G be the Fourier transform of a function g. Assume that g and tg(t) are both functions in L1(−∞,∞). Then show that the Fourier transform of tg(t) equals i/2π times the derivative of G(f), that is,

    (F tg(t))(f) = (i/2π) dG(f)/df .


Problem 4. Let G(f) be the Fourier transform for a function g(t) in L1(−∞,∞). Then

show that the Fourier transform of g(−t) equals G(−f).

Problem 5. Use Parseval’s equality to compute the following integrals∫ ∞

−∞

1

(2πif + 1)(2πif + 2)df and

∫ ∞

−∞

1

(2πif + 1)(2− 2πif)df .

Problem 6. Recall that the Gaussian density with mean zero and standard deviation σ is defined by

    g(t) = e^{−t²/2σ²}/(√(2π)σ) .

Show that the Fourier transform for this g is given by e^{−2σ²π²f²}. Hint: use the fact that

    1 = (1/(√(2π)σ)) ∫_{−∞}^{∞} e^{−(t−μ)²/2σ²} dt ,

where μ is any constant.

REMARK 9.2.1 One can show that the Dirac delta function δ(t) is also given by the limit of a sequence of Gaussian density functions. To be precise,

    δ(t) = lim_{σ→0} e^{−t²/2σ²}/(√(2π)σ) and δ(t − t0) = lim_{σ→0} e^{−(t−t0)²/2σ²}/(√(2π)σ) .

9.3 The inverse Fourier transform of rational functions

In this section we will use partial fraction expansions to compute the inverse Fourier transform for a proper rational function. Let g be the function defined by

    g(t) = Σ_{k=1}^{n} α_k e^{a_k t} if t ≥ 0
         = Σ_{j=1}^{m} β_j e^{c_j t} if t < 0 .        (3.1)

We assume that {a_k}₁ⁿ are distinct complex numbers satisfying ℜa_k < 0 for all k = 1, 2, · · · , n, while {c_j}₁ᵐ are distinct complex numbers satisfying ℜc_j > 0 for all j = 1, 2, · · · , m. This constraint implies that the function g is in L1(−∞,∞). Let g+ and g− be the functions defined by

    g+(t) = 1+(t) Σ_{k=1}^{n} α_k e^{a_k t} and g−(t) = 1−(t) Σ_{j=1}^{m} β_j e^{c_j t} .        (3.2)

Page 460: Notes onSignalsandSystems - Purdue EngineeringChapter 1 Complexnumbers This chapter presents some elementary facts concerning complex numbers, inner product spacesandorthogonalsystems

460 CHAPTER 9. THE FOURIER TRANSFORM

By construction g(t) = g+(t) + g−(t). Moreover, by consulting (1.11) and (1.16), we see that the Fourier transform of g is given by

G(f) = Σ_{k=1}^{n} αk/(2πif − ak) + Σ_{j=1}^{m} βj/(cj − 2πif) .    (3.3)

Finally, it is noted that G(f) is a strictly proper rational function in f.

Now consider the function Ĝ(s) of the complex variable s obtained by setting s = 2πif, or equivalently, f = s/2πi. To be precise, Ĝ(s) = G(s/2πi). Clearly, Ĝ(2πif) = G(f). By setting s = 2πif in (3.3), we obtain

Ĝ(s) = Σ_{k=1}^{n} αk/(s − ak) + Σ_{j=1}^{m} βj/(cj − s) .    (3.4)

In particular, Ĝ(s) is a strictly proper rational function. Notice Ĝ(s) = Ĝ+(s) + Ĝ−(s) where

Ĝ+(s) = Σ_{k=1}^{n} αk/(s − ak)   and   Ĝ−(s) = Σ_{j=1}^{m} βj/(cj − s) .    (3.5)

In other words, Ĝ(s) admits a unique decomposition of the form Ĝ(s) = Ĝ+(s) + Ĝ−(s) where all the poles of Ĝ+(s) are in the open left half plane {s : ℜs < 0}, and all the poles of Ĝ−(s) are in the open right half plane {s : ℜs > 0}. In particular, Ĝ(s) has no poles on the imaginary axis. Notice that the inverse Fourier transform of G(f) in (3.3) is given by (3.1). In particular, g+ is the inverse Laplace transform of Ĝ+(s).

This observation along with Ĝ(s) = G(s/2πi) can be used to compute the inverse Fourier transform for a proper rational function with distinct roots. To see this assume that

G(f) = γ + (p0 + p1 f + p2 f² + · · · + pμ f^μ)/(q0 + q1 f + q2 f² + · · · + q_{ν−1} f^{ν−1} + qν f^ν)    (3.6)

where μ < ν, all the roots of the denominator are distinct and γ is a constant. By setting f = s/2πi = −is/2π and using partial fractions we see that

Ĝ(s) = γ + (p0 + p1(s/2πi) + p2(s/2πi)² + · · · + pμ(s/2πi)^μ)/(q0 + q1(s/2πi) + q2(s/2πi)² + · · · + q_{ν−1}(s/2πi)^{ν−1} + qν(s/2πi)^ν)
     = Σ_{k=1}^{n} αk/(s − ak) + Σ_{j=1}^{m} βj/(cj − s) .    (3.7)

Here {ak}_1^n are distinct complex numbers satisfying ℜak < 0 for all k = 1, 2, · · · , n, while {cj}_1^m are distinct complex numbers satisfying ℜcj > 0 for all j = 1, 2, · · · , m. Then the inverse Fourier transform of G is given by

g(t) = γδ(t) + 1+(t) Σ_{k=1}^{n} αk e^{ak t} + 1−(t) Σ_{j=1}^{m} βj e^{cj t} .    (3.8)


For example consider the function G given by

G(f) = (2πif + 1)/(−4π²f² + 2πif − 2) .

Setting f = s/2πi yields

Ĝ(s) = G(s/2πi) = (s + 1)/(s² + s − 2) = (s + 1)/((s + 2)(s − 1)) = 1/(3(s + 2)) − 2/(3(1 − s)) .    (3.9)

Notice that Ĝ(s) has two poles −2 and 1, where −2 is in the left half plane and 1 is in the right half plane. Moreover, the denominator for the pole 1 is expressed as 1 − s in the partial fraction expansion. The decomposition in (3.9) implies that

g(t) = (1/3) e^{−2t}   if t ≥ 0
     = −(2/3) e^{t}    if t < 0 .
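This inverse transform is easy to check numerically. The sketch below (Python standing in for the Matlab used elsewhere in these notes; the integration interval and grid size are ad hoc choices) integrates g(t)e^{−2πift} with a trapezoidal rule and compares the result with G(f):

```python
import cmath

# Numerical check (a sketch, not part of the derivation): the g(t) read off
# from the partial fraction expansion (3.9) should satisfy
# G(f) = integral of g(t) e^{-2 pi i f t} dt.

def g(t):
    # inverse Fourier transform obtained from (3.9)
    return cmath.exp(-2 * t) / 3 if t >= 0 else -2 * cmath.exp(t) / 3

def G(f):
    s = 2j * cmath.pi * f          # s = 2 pi i f
    return (s + 1) / ((s + 2) * (s - 1))

def fourier_transform(f, T=30.0, n=60000):
    # trapezoidal approximation of the integral over [-T, T]
    h = 2 * T / n
    total = 0j
    for k in range(n + 1):
        t = -T + k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * g(t) * cmath.exp(-2j * cmath.pi * f * t)
    return total * h

for f in (0.0, 0.3, 1.0):
    assert abs(fourier_transform(f) - G(f)) < 1e-2
print("inverse transform consistent with G(f)")
```

The tolerance is loose because g has a jump at t = 0, which limits the accuracy of the trapezoidal rule there.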

For another example consider the function G defined by

G(f) = (2πif + 2)/(8iπ³f³ − 8π²f² + 2πif − 2) .    (3.10)

Setting f = s/2πi yields

Ĝ(s) = G(s/2πi) = (s + 2)/(−s³ + 2s² + s − 2) = (s + 2)/((1 − s²)(s − 2)) = (s + 2)/((s + 1)(1 − s)(s − 2)) .    (3.11)

So Ĝ(s) has three poles ±1 and 2. Notice that −1 is in the open left half plane, while 1 and 2 are in the open right half plane. The partial fraction expansion for this Ĝ is given by

Ĝ(s) = −1/(6(s + 1)) − 3/(2(1 − s)) + 4/(3(2 − s)) .

The denominator for the poles 1 and 2 in the partial fraction expansion for Ĝ is respectively expressed as 1 − s and 2 − s. (This must be compensated for when applying the Matlab command residue to compute the poles.) The inverse Fourier transform g = F^{−1}G is given by

g(t) = −(1/6) e^{−t}                 if t ≥ 0
     = (4/3) e^{2t} − (3/2) e^{t}    if t < 0 .    (3.12)

Finally, it is noted that one does not have to convert G to Ĝ to compute the inverse Fourier transform. We made this conversion to develop some connections with the Laplace transform. One can compute the inverse Fourier transform directly from the partial fraction expansion for G. For example, the partial fraction expansion for the function G in (3.10) is given by

G(f) = i/(12π(f − i/2π)) − 3i/(4π(f + i/2π)) + 2i/(3π(f + i/π)) .    (3.13)


Notice that G(f) has three poles ±i/2π and −i/π. The pole i/2π lies in the upper half plane {f : ℑf > 0}, while the other two poles −i/2π and −i/π are contained in the lower half plane {f : ℑf < 0}. By rearranging terms in (3.13), we obtain

G(f) = −1/(6(2πif + 1)) − 3/(2(1 − 2πif)) + 4/(3(2 − 2πif)) .

Using this it follows that the inverse Fourier transform g(t) = (F^{−1}G)(t) is given by (3.12).

9.3.1 Exercise

Problem 1. Find the inverse Fourier transform for the function G given by

G(f) = (2πif + 3)/(8iπ³f³ − 4π²f² + 8πif − 4) .

Problem 2. Find the inverse Fourier transform for the function G given by

G(f) = 1/((−4π²f² + 4πif + 5)(2πif − 1)) .

Problem 3. Find the inverse Fourier transform for the function G given by

G(f) = (2πif + 1)/(4 − 2πif) .

Problem 4. Find the inverse Fourier transform for the function G given by

G(f) = (2πif + 1)/(2(1 − 2πif)(1 + πif)) .

Problem 5. Find the inverse Fourier transform for the function G given by

G(f) = (2πif − 1)/(−4π²f² + 6πif + 2) .

9.4 Transfer functions and sinusoid response

In this section we present the Fourier transform version of the steady state analysis given in Section 7.1 of Chapter 7. To introduce the Fourier version of a transfer function, recall that a system is a mapping from an input signal or forcing function u to an output signal y. The input u(t) and output y(t) are functions of time t. If the system is linear and time invariant, then the transfer function G is defined by G(f) = Y(f)/U(f) where Y is the Fourier transform of the output y and U is the Fourier transform of the input u. Clearly, Y(f) = G(f)U(f). By taking the inverse Fourier transform of the output, we obtain y(t) = (g ⊗ u)(t). Motivated by this we say that G is the transfer function for a system if G


is the Fourier transform of a function g in L1(−∞,∞). Moreover, the corresponding system with input u and output y is given by

y(t) = (g ⊗ u)(t) = ∫_{−∞}^{∞} g(t − σ) u(σ) dσ = ∫_{−∞}^{∞} g(σ) u(t − σ) dσ .    (4.1)

Notice that if u = δ the Dirac delta function, then the output y = g = g ⊗ δ. For this reason g is called the impulse response. Finally, it is noted that if the transfer function G(f) = 1, or equivalently, g(t) = δ(t), then the output y = u.

The transfer function G is causal if g(t) = 0 for all t < 0, or equivalently, g(t) = g(t)1+(t). Notice that if G is causal, then the convolution y = g ⊗ u is given by

y(t) = (g ⊗ u)(t) = ∫_{−∞}^{t} g(t − σ) u(σ) dσ = ∫_{0}^{∞} g(σ) u(t − σ) dσ   (if G is causal) .    (4.2)

The second integral follows by a simple change of variable. A system is causal if the output y(t) at time t only depends upon the input u(σ) from time −∞ up to the present time t. In other words, the system is causal if the output y(t) is uniquely determined by the past input {u(σ)}_{−∞}^{t}. Finally, it is noted that if a system is causal, then one does not need the future to determine the present.

By consulting the results in Section 9.3 we readily obtain the following result.

REMARK 9.4.1 Assume that G(f) is a proper rational function. Then G is a causal transfer function if and only if all the poles of Ĝ(s) = G(s/2πi) are contained in the open left half plane {s : ℜs < 0}. In other words, G is causal if and only if Ĝ(s) is a stable transfer function in the Laplace transform setting. In this case, g is simply the inverse Laplace transform of Ĝ(s).

For example, consider the transfer function

G(f) = (2πif − 1)/((2πif + 1)(2πif + 2)) ,   or equivalently,   Ĝ(s) = (s − 1)/((s + 1)(s + 2)) .

Obviously, −1 and −2 are the poles of Ĝ(s). Since all the poles of Ĝ(s) are in the open left half plane, G is causal. Consider the transfer function

H(f) = (2πif + 1)/((2πif − 1)(2πif + 2)) ,   or equivalently,   Ĥ(s) = (s + 1)/((s − 1)(s + 2)) .

Clearly, 1 and −2 are the poles of Ĥ(s). Because 1 is not in the open left half plane, H is not causal.

As before, let G be the Fourier transform for a function g in L1(−∞,∞). For each f in the real line G(f) is simply a complex number. So G(f) admits a polar decomposition of the form

G(f) = |G(f)| e^{iϕ(f)}    (4.3)


where ϕ(f) is the angle for the complex number G(f). Now assume that the input is a sinusoid of the form u(t) = e^{2πift} where the frequency f is a constant. Then the output

y(t) = (g ⊗ u)(t) = G(f) e^{2πift} = |G(f)| e^{i(2πft + ϕ(f))}   (u(t) = e^{2πift}) .    (4.4)

In other words, if one feeds a sinusoid with frequency f into a transfer function G, then the output is also a sinusoid with frequency f, magnitude |G(f)| and phase shift ϕ(f). To verify this simply observe that

(g ⊗ u)(t) = ∫_{−∞}^{∞} g(σ) u(t − σ) dσ = ∫_{−∞}^{∞} g(σ) e^{2πif(t−σ)} dσ = e^{2πift} ∫_{−∞}^{∞} g(σ) e^{−2πifσ} dσ
           = G(f) e^{2πift} = |G(f)| e^{iϕ(f)} e^{2πift} = |G(f)| e^{i(2πft + ϕ(f))} .

Hence (4.4) holds.

By combining linearity with (4.4) we see that feeding a sum of sinusoids into a transfer function G also yields a sum of sinusoids at the output, that is,

y(t) = g ⊗ Σ_k γk e^{2πifk t} = Σ_k γk |G(fk)| e^{i(2πfk t + ϕ(fk))} .    (4.5)

Now let us assume that g is a real valued function. In this case, recall that G(−f) is the complex conjugate of G(f). It follows that G(f) and G(−f) have the same magnitude, that is, |G(f)| = |G(−f)|. In particular, |G(f)| is even, that is, symmetric about the y axis. Moreover, ϕ(f) = −ϕ(−f). To verify this notice that

|G(−f)| e^{iϕ(−f)} = G(−f) = conjugate of (|G(f)| e^{iϕ(f)}) = |G(f)| e^{−iϕ(f)} = |G(−f)| e^{−iϕ(f)} .

This implies that e^{iϕ(−f)} = e^{−iϕ(f)}. Thus ϕ(−f) = −ϕ(f) modulo 2π. So without loss of generality we can assume that ϕ(−f) = −ϕ(f).

Consider the input u(t) = γ cos(2πft + ψ) where ψ is a constant phase. Then the output

y(t) = (g ⊗ u)(t) = γ |G(f)| cos(2πft + ψ + ϕ(f))   (u(t) = γ cos(2πft + ψ)) .    (4.6)

To verify this observe that (4.4) and ϕ(−f) = −ϕ(f) yields

y(t) = (g ⊗ γ cos(2πft + ψ))(t) = γ g ⊗ e^{i(2πft+ψ)}/2 + γ g ⊗ e^{−i(2πft+ψ)}/2
     = γ e^{iψ} (g ⊗ e^{i2πft})/2 + γ e^{−iψ} (g ⊗ e^{−i2πft})/2
     = γ e^{iψ} |G(f)| e^{i(2πft+ϕ(f))}/2 + γ e^{−iψ} |G(−f)| e^{−i(2πft−ϕ(−f))}/2
     = γ |G(f)| e^{i(2πft+ψ+ϕ(f))}/2 + γ |G(−f)| e^{−i(2πft+ψ+ϕ(f))}/2
     = γ |G(f)| cos(2πft + ψ + ϕ(f)) .


Hence (4.6) holds.

By combining linearity with (4.6) we see that feeding a sum of sinusoids into a transfer function G also yields a sum of sinusoids at the output, that is,

y(t) = g ⊗ Σ_k γk cos(2πfk t + ψk) = Σ_k γk |G(fk)| cos(2πfk t + ψk + ϕ(fk)) .    (4.7)
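The formula (4.6) can be checked numerically against the convolution (4.2). The sketch below uses the causal impulse response g(t) = e^{−t}1+(t) as a test case (an assumption chosen for illustration; its transfer function 1/(2πif + 1) follows from the one-sided exponential transforms cited earlier):

```python
import cmath
import math

# Sketch: for the assumed causal impulse response g(t) = e^{-t} 1_+(t),
# with transfer function G(f) = 1/(2 pi i f + 1), the convolution (4.2)
# applied to u(t) = cos(2 pi f0 t) should reproduce the prediction (4.6):
# y(t) = |G(f0)| cos(2 pi f0 t + phi(f0)).

def convolve_causal(g, u, t, h=1e-3, T=25.0):
    # left Riemann sum of int_0^T g(sigma) u(t - sigma) d sigma
    return h * sum(g(k * h) * u(t - k * h) for k in range(int(T / h)))

f0 = 1.0
g = lambda s: math.exp(-s)
u = lambda t: math.cos(2 * math.pi * f0 * t)

Gf0 = 1 / (2j * math.pi * f0 + 1)
t = 1.0
predicted = abs(Gf0) * math.cos(2 * math.pi * f0 * t + cmath.phase(Gf0))
assert abs(convolve_causal(g, u, t) - predicted) < 2e-3
print("convolution output matches (4.6)")
```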

For an example consider the transfer function G given by

G(f) = −(4 − if)/((2 + if)(3 − if)) .

In this case,

|G(f)| = √((f² + 16)/((f² + 4)(f² + 9)))
ϕ(f) = π − arctan(f/4) − arctan(f/2) + arctan(f/3) .

So if u(t) = γ cos(2πft), then the output

y(t) = γ √((f² + 16)/((f² + 4)(f² + 9))) cos(2πft + ϕ(f)) .

In particular, if u(t) = 6 cos(4πt) − 4 sin(6πt), then we obtain

y(t) = 2.63 cos(4πt + 2.48) − 1.31 sin(6πt + 2.30) .
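The gains in this example can be reproduced directly from the transfer function by evaluating G at the input frequencies (f = 2 for cos(4πt) and f = 3 for sin(6πt)); a minimal sketch in Python:

```python
import cmath

# Evaluate the example transfer function G(f) = -(4 - i f)/((2 + i f)(3 - i f))
# at the input frequencies: each output amplitude is gamma * |G(f)|, and the
# phase shift is arg G(f).

def G(f):
    return -(4 - 1j * f) / ((2 + 1j * f) * (3 - 1j * f))

amp_cos = 6 * abs(G(2.0))        # input 6 cos(4 pi t), i.e. f = 2
amp_sin = 4 * abs(G(3.0))        # input 4 sin(6 pi t), i.e. f = 3
phase_cos = cmath.phase(G(2.0))  # phase shift at f = 2
print(round(amp_cos, 2), round(amp_sin, 2))   # 2.63 1.31
```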

9.4.1 Exercise

Problem 1. Consider the transfer function

G(f) = −if(3 − if)/((1 + if)(2 − if)) .    (4.8)

Find the output y = g ⊗ u for u(t) = 4 cos(2πt) − 2 sin(6πt). Plot the Bode plot of G(f).

Problem 2. Consider the transfer function

H(f) = if(3 − if)/((1 + if)(2 + if)) .

Find the output y = g ⊗ u for u(t) = 4 cos(2πt) − 2 sin(6πt). Does |H(f)| = |G(f)| where G is defined in (4.8)?

Problem 3. Consider the transfer function

G(f) = (z − 4π²f²)/((2πif + 1)(2πif + 2)) .

Find a zero z such that the output y = g ⊗ u equals zero for the input u(t) = γ e^{2πif0 t}. Plot |G(f)| in Matlab for f0 = 3.


9.5 Ideal filters

In this section we will present a Fourier transform version of the ideal low pass, high pass, band pass and band stop filters discussed in Section 7.3 in Chapter 7. Recall that a filter is a transfer function which is designed to pass through signals in a certain frequency range and reject signals outside of this frequency range. In filtering theory one wants to design a causal filter, that is, a causal transfer function. The Butterworth filters discussed in Sections 8.1 to 8.4 in Chapter 8 are all causal filters. To be specific, if Ĝ(s) is any n-th order Butterworth filter, then Ĝ is a stable transfer function. The corresponding Butterworth filter in the Fourier transform setting is the causal filter defined by G(f) = Ĝ(2πif).

Recall that a low pass filter is a transfer function which passes through signals with low frequencies and rejects signals with high frequencies. In the Fourier transform setting, the ideal low pass filter is defined by

G(f) = 1   if −μ ≤ f ≤ μ
     = 0   otherwise.    (5.1)

Here 0 < μ is the cutoff frequency. Consider any sinusoid input u(t) = e^{2πift}. Then according to (4.4), the output y to the system y = g ⊗ u is given by

y(t) = g ⊗ e^{2πift} = e^{2πift}   if −μ ≤ f ≤ μ
                     = 0           otherwise.    (5.2)

Notice that if u(t) = Σ_k ak e^{2πifk t}, then

y(t) = g ⊗ Σ_k ak e^{2πifk t} = Σ_{|fk| ≤ μ} ak e^{2πifk t} .    (5.3)

In other words, the ideal low pass filter in (5.1) passes through all sinusoids whose frequency is in [−μ, μ] and rejects all other sinusoids. Clearly, α cos(2πft) + β sin(2πft) is a linear combination of sinusoids of the form e^{2πift} and e^{−2πift}. Hence

y(t) = g ⊗ Σ_k (αk cos(2πfk t) + βk sin(2πfk t)) = Σ_{|fk| ≤ μ} (αk cos(2πfk t) + βk sin(2πfk t)) .

The inverse Fourier transform g of the ideal low pass filter G in (5.1) is given by

g(t) = sin(2πμt)/(πt) .    (5.4)

Clearly, g(t) ≠ 0 for some t < 0. Therefore the ideal low pass filter in (5.1) is not causal. So one cannot implement this filter in practice. To verify that g(t) = sin(2πμt)/(πt) simply observe that

g(t) = (F^{−1}G)(t) = ∫_{−∞}^{∞} e^{2πift} G(f) df = ∫_{−μ}^{μ} e^{2πift} df = e^{2πift}/(2πit) |_{−μ}^{μ} = sin(2πμt)/(πt) .

Hence (5.4) holds.
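A small sketch of the impulse response (5.4) makes the non-causality concrete: the response is plainly nonzero at negative times (the cutoff and test point below are arbitrary choices):

```python
import math

# Impulse response of the ideal low pass filter (5.1): g(t) = sin(2 pi mu t)/(pi t).

def ideal_lowpass_impulse(t, mu=1.0):
    if t == 0:
        return 2 * mu      # limiting value of sin(2 pi mu t)/(pi t) at t = 0
    return math.sin(2 * math.pi * mu * t) / (math.pi * t)

# g is nonzero for t < 0, so the ideal low pass filter cannot be causal
assert abs(ideal_lowpass_impulse(-0.25)) > 0.1
print(ideal_lowpass_impulse(-0.25))   # equals 4/pi, about 1.27
```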


An ideal high pass filter. A high pass filter is a transfer function which passes through high frequency signals and rejects low frequency signals. In the Fourier transform setting, the ideal high pass filter is given by

H(f) = 1   if |f| > μ
     = 0   otherwise.    (5.5)

Here 0 < μ is a cutoff frequency. Consider any sinusoid input u(t) = e^{2πift}. Then according to (4.4), the output y to the system y = h ⊗ u is given by

y(t) = h ⊗ e^{2πift} = e^{2πift}   if |f| > μ
                     = 0           otherwise.    (5.6)

Notice that if u(t) = Σ_k ak e^{2πifk t}, then

y(t) = h ⊗ Σ_k ak e^{2πifk t} = Σ_{|fk| > μ} ak e^{2πifk t} .    (5.7)

In other words, the ideal high pass filter in (5.5) passes through all sinusoids whose frequency has absolute value strictly greater than μ and rejects all the other sinusoids. In particular,

y(t) = h ⊗ Σ_k (αk cos(2πfk t) + βk sin(2πfk t)) = Σ_{|fk| > μ} (αk cos(2πfk t) + βk sin(2πfk t)) .

The inverse Fourier transform h of the ideal high pass filter H in (5.5) is given by

h(t) = δ(t) − sin(2πμt)/(πt) .    (5.8)

Clearly, h(t) ≠ 0 for some t < 0. Therefore the ideal high pass filter in (5.5) is not causal. So one cannot implement this filter in practice. To verify that h is given by (5.8) simply observe that H(f) = 1 − G(f) where G is the ideal low pass filter given in (5.1). By consulting (5.4) we see that

h(t) = (F^{−1}H)(t) = (F^{−1}1)(t) − (F^{−1}G)(t) = δ(t) − sin(2πμt)/(πt) .

Hence (5.8) holds.

An ideal band pass filter. A band pass filter is a transfer function which passes through signals in a certain frequency range or band and rejects all other signals outside of this frequency range. In the Fourier transform setting, the ideal band pass filter is defined by

B(f) = 1   if μ1 ≤ |f| ≤ μ2
     = 0   otherwise.    (5.9)


Here 0 ≤ μ1 < μ2 and {f : |f| ∈ [μ1, μ2]} is the band or the range of frequencies that the filter accepts. Consider any sinusoid input u(t) = e^{2πift}. Then according to (4.4), the output y to the system y = b ⊗ u is given by

y(t) = b ⊗ e^{2πift} = e^{2πift}   if μ1 ≤ |f| ≤ μ2
                     = 0           otherwise.    (5.10)

Notice that if u(t) = Σ_k ak e^{2πifk t}, then

y(t) = b ⊗ Σ_k ak e^{2πifk t} = Σ_{μ1 ≤ |fk| ≤ μ2} ak e^{2πifk t} .    (5.11)

In other words, the ideal band pass filter in (5.9) passes through all sinusoids whose frequency is in the band {f : |f| ∈ [μ1, μ2]} and rejects all other sinusoids. In particular,

y(t) = b ⊗ Σ_k (αk cos(2πfk t) + βk sin(2πfk t)) = Σ_{μ1 ≤ |fk| ≤ μ2} (αk cos(2πfk t) + βk sin(2πfk t)) .

The inverse Fourier transform b of the ideal band pass filter B in (5.9) is given by

b(t) = (sin(2πμ2 t) − sin(2πμ1 t))/(πt) .    (5.12)

Clearly, b(t) ≠ 0 for some t < 0. Therefore the ideal band pass filter in (5.9) is not causal. So one cannot implement this filter in practice. To verify that b is given by (5.12) observe that

b(t) = (F^{−1}B)(t) = ∫_{−∞}^{∞} e^{2πift} B(f) df = ∫_{−μ2}^{−μ1} e^{2πift} df + ∫_{μ1}^{μ2} e^{2πift} df
     = e^{2πift}/(2πit) |_{−μ2}^{−μ1} + e^{2πift}/(2πit) |_{μ1}^{μ2}
     = (e^{2πiμ2 t} − e^{2πiμ1 t} + e^{−2πiμ1 t} − e^{−2πiμ2 t})/(2πit)
     = (e^{2πiμ2 t} − e^{−2πiμ2 t} − (e^{2πiμ1 t} − e^{−2πiμ1 t}))/(2πit)
     = (sin(2πμ2 t) − sin(2πμ1 t))/(πt) .

Hence (5.12) holds. Finally, it is noted that if μ1 = 0, then the ideal band pass filter in (5.9) reduces to the ideal low pass filter in (5.1) with μ = μ2. In this case, the inverse Fourier transforms g in (5.4) and b in (5.12) coincide.
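The derivation of (5.12) can also be checked numerically. Since B(f) is even, the inverse transform integral collapses to 2∫_{μ1}^{μ2} cos(2πft) df; the sketch below compares a trapezoidal evaluation of that integral with the closed form (the test point and grid are arbitrary choices):

```python
import math

# Sketch: check the band pass impulse response (5.12) against a numerical
# evaluation of b(t) = 2 * int_{mu1}^{mu2} cos(2 pi f t) df.

def b_closed(t, mu1, mu2):
    return (math.sin(2 * math.pi * mu2 * t)
            - math.sin(2 * math.pi * mu1 * t)) / (math.pi * t)

def b_numeric(t, mu1, mu2, n=20000):
    # trapezoidal rule on [mu1, mu2]
    h = (mu2 - mu1) / n
    total = 0.5 * (math.cos(2 * math.pi * mu1 * t)
                   + math.cos(2 * math.pi * mu2 * t))
    for k in range(1, n):
        total += math.cos(2 * math.pi * (mu1 + k * h) * t)
    return 2 * h * total

assert abs(b_numeric(0.37, 1.0, 2.0) - b_closed(0.37, 1.0, 2.0)) < 1e-6
print("band pass impulse response verified")
```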

An ideal band stop filter. A band stop filter is a transfer function which rejects signals in a certain frequency range or band and passes through all other signals outside of this frequency range. For example, consider the ideal band stop filter defined by

Q(f) = 0   if μ1 ≤ |f| ≤ μ2
     = 1   otherwise.    (5.13)


Here 0 ≤ μ1 < μ2 and {f : |f| ∈ [μ1, μ2]} is the band stop or the range of frequencies that the filter rejects. Consider any sinusoid input u(t) = e^{2πift}. Then according to (4.4), the output y to the system y = q ⊗ u is given by

y(t) = q ⊗ e^{2πift} = 0           if μ1 ≤ |f| ≤ μ2
                     = e^{2πift}   otherwise.    (5.14)

Notice that if u(t) = Σ_k ak e^{2πifk t}, then

y(t) = q ⊗ Σ_k ak e^{2πifk t} = Σ_{|fk| < μ1, μ2 < |fk|} ak e^{2πifk t} .    (5.15)

In other words, the ideal band stop filter in (5.13) rejects all sinusoids whose frequency is in the band {f : |f| ∈ [μ1, μ2]} and accepts all other sinusoids. In particular,

y(t) = q ⊗ Σ_k (αk cos(2πfk t) + βk sin(2πfk t)) = Σ_{|fk| < μ1, μ2 < |fk|} (αk cos(2πfk t) + βk sin(2πfk t)) .

The inverse Fourier transform q of the ideal band stop filter Q in (5.13) is given by

q(t) = δ(t) − (sin(2πμ2 t) − sin(2πμ1 t))/(πt) .    (5.16)

Clearly, q(t) ≠ 0 for some t < 0. Therefore the ideal band stop filter in (5.13) is not causal. So one cannot implement this filter in practice. To verify that q is given by (5.16) observe that Q(f) = 1 − B(f) where B is the ideal band pass filter defined in (5.9). So using (5.12) we obtain

q(t) = (F^{−1}Q)(t) = (F^{−1}(1 − B))(t) = δ(t) − (F^{−1}B)(t) = δ(t) − b(t) .

Hence (5.16) holds. Finally, it is noted that if μ1 = 0, then the ideal band stop filter in (5.13) reduces to the ideal high pass filter in (5.5) with μ = μ2. In this case, the inverse Fourier transforms h in (5.8) and q in (5.16) coincide.

9.5.1 Exercise

Problem 1. Consider the filter defined by

G(f) = 1   if |f| ≤ 1
     = 0   if 1 < |f| ≤ 2
     = 1   if 2 < |f| ≤ 3
     = 0   otherwise.

Then find g. Is this filter causal?


Problem 2. Consider the filter defined by

G(f) = 1 − |f|   if |f| ≤ 1
     = 0         otherwise.

Then find g. Is this filter causal?

Problem 3. Consider the function g(t) = sin(2πt)/(πt). Find the filter G(f) corresponding to g. Plot G(f). Find the filter B(f) corresponding to 2 cos(10πt)g. Plot B(f).

9.6 The Nyquist sampling rate

In this section we will present the uniform sampling theorem. We say that a function G has support in {f : |f| ≤ μ} if G(f) = 0 for all |f| > μ. The following result is known as the uniform Nyquist sampling theorem. It shows that one can reconstruct g(t) by sampling g(t) at t = k/2μ for k = 0, ±1, ±2, · · · where μ is greater than or equal to the maximum frequency contained in G(f). The smallest value of μ such that G has support in {f : |f| ≤ μ} is called the Nyquist frequency or Nyquist sampling rate.

THEOREM 9.6.1 Let G be the Fourier transform for a function g in L1(−∞,∞). Assume that G has support in {f : |f| ≤ μ}. Then G admits a Fourier series expansion of the form

G(f) = Δ Σ_{k=−∞}^{∞} g(kΔ) e^{−2πifkΔ}   (|f| ≤ μ)    (6.1)

where Δ = 1/2μ. Moreover, the function g(t) can be recovered from the sampled data {g(kΔ)}_{−∞}^{∞}, that is,

g(t) = Σ_{k=−∞}^{∞} g(kΔ) sin(2μπ(t − kΔ))/(2μπ(t − kΔ)) = Σ_{k=−∞}^{∞} g(kΔ) sinc(2μ(t − kΔ)) .    (6.2)

In particular, one can reconstruct g(t) from the sampled data {g(k/2μ)}_{−∞}^{∞} when μ is greater than or equal to the Nyquist frequency for G(f).

Proof. Since G has support in {f : |f| ≤ μ}, it follows that G(f) is a function in L2(−μ, μ). Hence G(f) admits a Fourier series expansion of the form

G(f) = Σ_{k=−∞}^{∞} ak e^{−2πifk/τ}   (|f| ≤ μ) ,    (6.3)

where τ = 2μ; see Section 2.1.2 in Chapter 2. Moreover, the Fourier coefficients {ak}_{−∞}^{∞} are computed by

ak = (1/τ) ∫_{−μ}^{μ} e^{2πifk/τ} G(f) df = Δ ∫_{−∞}^{∞} e^{2πifkΔ} G(f) df = Δ g(kΔ) .


The third equality follows by using the formula for the inverse Fourier transform in (1.3). Hence ak = Δ g(kΔ) for all integers k. Therefore (6.1) holds.

By employing the Fourier series for G in (6.1) along with the formula for the inverse Fourier transform in (1.3), we see that

g(t) = ∫_{−μ}^{μ} e^{2πift} G(f) df = Δ Σ_{k=−∞}^{∞} g(kΔ) ∫_{−μ}^{μ} e^{2πift} e^{−2πifkΔ} df .    (6.4)

To compute the last integral observe that

∫_{−μ}^{μ} e^{2πift} e^{−2πifkΔ} df = ∫_{−μ}^{μ} e^{2πif(t−kΔ)} df = e^{2πif(t−kΔ)}/(2πi(t − kΔ)) |_{−μ}^{μ} = sin(2πμ(t − kΔ))/(π(t − kΔ)) .

Substituting this into (6.4) and using Δ = 1/2μ yields (6.2). This completes the proof.
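The reconstruction formula (6.2) is easy to exercise numerically. The sketch below uses the test signal g(t) = (sin(πt)/πt)², whose transform with this sign convention is the triangle of (6.6) supported in |f| ≤ 1, so μ = 1 and Δ = 1/2; the truncation K of the series is an arbitrary choice:

```python
import math

# Sketch of (6.2): reconstruct a bandlimited signal from its samples g(k Delta).
# Test signal: g(t) = (sin(pi t)/(pi t))^2, bandlimited to |f| <= 1 (its
# transform is the triangle 1 - |f|), so mu = 1 and Delta = 1/(2 mu) = 1/2.

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def g(t):
    return sinc(t) ** 2

def reconstruct(t, mu=1.0, K=1000):
    delta = 1.0 / (2 * mu)
    # truncated version of the sampling series (6.2)
    return sum(g(k * delta) * sinc(2 * mu * (t - k * delta))
               for k in range(-K, K + 1))

assert abs(reconstruct(0.3) - g(0.3)) < 1e-4
print("samples at rate 2 mu reconstruct g")
```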

As before, assume that G has support in {f : |f| ≤ μ}. In other words, μ is greater than or equal to the Nyquist frequency for G. Then there exists a unique periodic function P̂(f) with period 2μ satisfying G(f) = P̂(f) for all |f| ≤ μ. According to Theorem 9.6.1, the function G admits a Fourier series expansion of the form (6.1). Hence P̂(f) is given by

P̂(f) = Δ Σ_{k=−∞}^{∞} g(kΔ) e^{−2πifkΔ}   (−∞ < f < ∞) .    (6.5)

In many applications the sampled data {g(kΔ)}_{−∞}^{∞} is used to obtain an estimate of the Fourier transform G of g. To accomplish this one first constructs the function P̂(f) from the Fourier series expansion in (6.5). This can be done by using the fast Fourier transform. Then the function G is given by G(f) = P̂(f) for |f| ≤ μ and G(f) = 0 otherwise. This works well as long as the support of G(f) is contained in {f : |f| ≤ μ}, that is, μ is greater than or equal to the Nyquist sampling frequency.

If the support of G(f) is not contained in {f : |f| ≤ μ}, then the function P̂(f) defined by the Fourier series in (6.5) is still a periodic function with period 2μ, but it need no longer agree with G(f) for |f| ≤ μ, and one can say nothing about G(f) for |f| > μ. In this case, the difference between P̂ and G is known as aliasing. The effect due to aliasing occurs because one is simply not sampling the signal g(t) fast enough.

For an example consider the signal g(t) whose Fourier transform G is given by

G(f) = 1 − |f|   if |f| ≤ 1
     = 0         otherwise.    (6.6)

In this case, the Nyquist frequency is one. Now let μ = 1.5. Obviously, 1.5 is greater than the Nyquist frequency. So there is no aliasing with μ = 1.5. The periodic function P̂(f) with period 3 = 2μ corresponding to G(f) is given in Figure 9.1. Clearly, one can reconstruct G(f) by setting G(f) = P̂(f) for |f| ≤ μ and G(f) = 0 for |f| > μ.


[Figure 9.1: the periodic function P̂(f), period 3, plotted against frequency f.]

As before, consider the signal g(t) whose Fourier transform G is given by (6.6). Recall that the Nyquist frequency is one. Now let μ = 1/2. Obviously, 1/2 is less than the Nyquist frequency. So there is aliasing with μ = 1/2. The periodic function P̂(f) with period 1 = 2μ corresponding to G(f) is presented in Figure 9.2. Because of the aliasing, P̂(f) need not agree with G(f) even on the interval [−1/2, 1/2], and the knowledge of P̂(f) does not allow us to obtain G(f) for |f| > 1/2.
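Both cases can be reproduced numerically from (6.5). In the sketch below (an illustration using the triangle transform (6.6), whose inverse transform is g(t) = (sin(πt)/πt)²), the series is evaluated with μ = 1.5 (no aliasing) and μ = 1/2 (aliasing); the truncation K is arbitrary:

```python
import math

# Sketch of (6.5) for the triangle transform (6.6): the inverse transform is
# g(t) = (sin(pi t)/(pi t))^2, and hat-P is built from the samples g(k Delta).

def g(t):
    return 1.0 if t == 0 else (math.sin(math.pi * t) / (math.pi * t)) ** 2

def P_hat(f, mu, K=2000):
    delta = 1.0 / (2 * mu)
    # truncated version of the Fourier series (6.5); g is even and real,
    # so the complex exponentials collapse to cosines
    return delta * sum(g(k * delta) * math.cos(2 * math.pi * f * k * delta)
                       for k in range(-K, K + 1))

# mu = 1.5 exceeds the Nyquist frequency: hat-P matches G(f) = 1 - |f| in band
assert abs(P_hat(0.5, 1.5) - 0.5) < 1e-3
# mu = 1/2 undersamples: every sample g(k) with k != 0 vanishes, so the series
# gives hat-P(f) = 1 rather than the triangle value 0.5
assert abs(P_hat(0.5, 0.5) - 1.0) < 1e-9
print("aliasing demonstrated")
```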


[Figure 9.2: the periodic function P̂(f), period 1, plotted against frequency f.]

9.6.1 Exercise

Problem 1. Consider the function g(t) = e^{−|t|}. Recall that in this case

G(f) = 2/(4π²f² + 1) .    (6.7)

Notice that G(f) ≈ 0 for all |f| ≥ 10. For μ = 10, plot g(t) in Matlab by using the series expansion for g in equation (6.2) of Theorem 9.6.1. Compare this to the plot of e^{−|t|}.

Problem 2. As in Problem 1, consider the function g(t) = e^{−|t|}. For μ = 10, plot P̂(f) in Matlab by using the series expansion for P̂(f) in equation (6.5). Compare this to the plot of G(f).

Problem 3. As in Problem 1, consider the function g(t) = e^{−|t|}. For μ = 1/2, plot P̂(f) in Matlab by using the series expansion for P̂(f) in equation (6.5). Compare this to the plot of G(f) and comment on aliasing.


Bibliography

[1] J.J. Cathey and S.A. Nasar, Basic Electrical Engineering, Schaum’s Outline Series,McGraw-Hill, New York, 1984.

[2] Chi-Tsong Chen, Linear System Theory and Design, The Oxford Series in Electrical and Computer Engineering, Oxford University Press, New York, 1999.

[3] M. J. Corless and A. E. Frazho, Linear Systems and Control; An Operator Perspective, Marcel Dekker, New York, 2003.

[4] Control Tutorials for Matlab web page www.engin.umich.edu/group/ctm/

[5] K. S. S. Cuddy, Convergence of Fourier series.

[6] G. Daryanani, Principles of Active Network Synthesis and Design, Wiley, 1976.

[7] R.A. DeCarlo, Linear Systems, Prentice Hall, Englewood Cliffs, New Jersey, 1989.

[8] R.A. DeCarlo and P. Lin, Linear Circuit Analysis, Oxford University Press, New York,2001.

[9] J.F. Doyle, Wave Propagation in Structures, Springer-Verlag, New York, 1989.

[10] P. Duhamel and M. Vetterli, Fast Fourier Transforms: A Tutorial Review, Signal Pro-cessing 19, pp. 259-299, 1990

[11] A. Frazho and W. Bhosri, An operator perspective on signals and systems, Operatortheory advances and applications 204, Birkhauser Verlag, Basel, 2010.

[12] I. M. Gelfand, R. A. Minlos and Z.Ya. Shapiro, Representations of the Rotation andLorentz Groups and their Applications, New York: Pergamon Press, 1963

[13] I. Gohberg and S. Goldberg, Basic Operator Theory, Birkhauser, Basel, 1981.

[14] I. Gohberg, S. Goldberg and M.A. Kaashoek, Classes of Linear Operators, Vol. II,Operator Theory: Advances and Applications, 63, Birkhauser-Verlag, Basel, 1993.

[15] I. Gohberg, S. Goldberg and M.A. Kaashoek, Basic Classes of Linear Operators, Springer Basel, 2003.



[16] G.H. Golub and C.F. Van Loan, Matrix Computations, The Johns Hopkins UniversityPress, Baltimore, 1996.

[17] C. B. Hall, Lie Groups, Lie Algebras, and Representations: An Elementary Introduction,Graduate Texts in Mathematics, 222 (2nd ed.), Springer, 2015

[18] P.R. Halmos, A Hilbert Space Problem Book, Springer-Verlag, New York, 1982.

[19] M.H. Hayes, Digital Signal Processing, Schaum’s Outline Series, McGraw-Hill, NewYork, 1999.

[20] K. Hoffman, Banach Spaces of Analytic Functions, Prentice Hall, New Jersey, 1962.

[21] R. Horn and C. R. Johnson, Matrix Analysis (Second Edition), Cambridge UP, Cam-bridge, 2012.

[22] H.P. Hsu, Signals and Systems, Schaum’s Outline Series, McGraw-Hill, New York, 1995.

[23] T. Kailath, Linear Systems, Prentice Hall, New Jersey, 1980.

[24] S.L. Marple, Digital Spectral Analysis with Applications, Prentice-Hall, New Jersey,1987.

[25] L. Meirovitch, Principles and Techniques of Vibrations, Prentice Hall, New Jersey, 1997.

[26] P.J. Nahin, Dr. Euler's Fabulous Formula: Cures Many Mathematical Ills, Princeton University Press, Princeton and Oxford, 2006.

[27] K. Ogata, Modern Control Engineering, Prentice-Hall, New Jersey, 1970.

[28] P. J. Olver, Topics in Fourier Analysis: DFT & FFT, Wavelets, Laplace Transform,Notes online, University of Minnesota

[29] J. O’Malley, Basic Circuit Analysis, Schaum’s Outline Series, McGraw-Hill, New York,1992.

[30] A.V. Oppenheim and R.W. Schafer, Digital Signal Processing, Prentice-Hall, New Jer-sey, 1975.

[31] A.V. Oppenheim, R.W. Schafer and J.R. Buck, Discrete-Time Signal Processing,Prentice-Hall, New Jersey, 1999.

[32] A.V. Oppenheim, A.S. Willsky and S.H. Nawab, Signals and Systems, Prentice-Hall,New Jersey, 1997.

[33] W. J. Rugh, Linear System Theory, Prentice-Hall, New Jersey, 1993.

[34] R. Schaumann and M. E. Van Valkenburg, Design of Analog Filters, Oxford University Press, 2nd edition, 2009.


[35] W.W. Seto, Mechanical Vibrations, Schaum’s Outline Series, McGraw-Hill, New York,1964.

[36] I.S. Sokolnikoff and R.M. Redheffer, Mathematics of Physics and Modern Engineering,McGraw-Hill, New York, 1958.

[37] M.R. Spiegel, Laplace Transforms, Schaum’s Outline Series, McGraw-Hill, New York,1965.

[38] M.R. Spiegel, Fourier Analysis, Schaum’s Outline Series, McGraw-Hill, New York, 1995.

[39] R.T. Stefani, B. Shahian, C.J. Savant and G.H. Hostetter, Design of Feedback Control Systems, Oxford University Press, New York, 2002.

[40] E. M. Stein and R. Shakarchi, Fourier Analysis: An Introduction, Princeton University Press, 2003.

[41] Nico M. Temme, Special Functions: An Introduction to the Classical Functions of Mathematical Physics, Wiley, New York, 1996.

[42] D. Tong, Lectures on Dynamics and Relativity, lecture notes online, http://www.damtp.cam.ac.uk/user/tong/dynamics.html, University of Cambridge.

[43] C. A. Tracy, Lectures on Differential Equations, Department of Mathematics, University of California, Davis, CA, December 2014.

[44] J. S. Walker, Fast Fourier Transform, Second Edition (Studies in Advanced Mathematics), CRC Press, 1996.

Index

L1(−∞, 0), 451
L1(−∞,∞), 445
L1(0,∞), 450
L2(0, τ), 24
SO(3), 249
ℜ, 7
ℑ, 7
arg, 10, 344
deg, 192
det, 170
Cν, 21
F, 445
F−1, 446
L, 181
L−1, 192
sinc, 447

absolute value, 8
aliasing, 124
all pass filter, 407
angle, 10
angular velocity, 236, 245
atan2, 11

Bessel function, 133
Blaschke product, 407
Bode plot, 378
bounded variation, 33
Butterworth filter
    band pass, 431
    band stop, 440
    high pass, 428
    low pass, 417

capacitor, 205

Cauchy-Schwartz inequality, 22
causal
    filter, 466
    system, 463
    transfer function, 463

centrifugal acceleration, 246
centrifugal force, 246
Cesaro mean, 81
characteristic polynomial, 170
complex conjugate, 8
complex exponential form, 10
continuously differentiable, 34
convolution, 190, 455
Coriolis acceleration, 246
Coriolis force, 246
cosine matrix, 333

damping coefficient, 214
damping ratio, 382
De Moivre’s formula, 14
diagonally dominant, 308
differential voltage, 277
Dirac comb, 94
Dirac delta, 92
Dirac delta function, 186
Dirichlet convergence, 33, 51
Dirichlet kernel, 95
discrete Fourier transform, 115
    inverse, 115
distance, 22
    L2(0, τ), 24
    Cν, 23

eigenvalue, 170
eigenvector, 170

Euler acceleration, 246
Euler force, 246
Euler’s formula, 9
even function, 60
exponential matrix, 229
exponential order, 181

farad, 205
fast Fourier transform, 116
Fejer kernels, 97
filter, 372
    band pass, 376
    band stop, 376
    high pass, 374
    low pass, 372
    stable rational, 373

Final value Theorem, 212
Fourier coefficients, 30
Fourier series, 30
    harmonic form, 55
    sine and cosine form, 46

Fourier transform, 445
    inverse, 446
    table, 458

frequency, 31
    angular, 31

fundamental angular frequency, 31
fundamental frequency, 31

Gershgorin circle, 307
Gibbs phenomenon, 38

Heaviside step function, 183
henry, 205
Hooke’s law, 214

identification, 358
imaginary part, 7
impedance, 220
impedance, parallel, 220
impedance, series, 220
impulse response, 204
inductor, 205
inner product, 22
    L2(0, τ), 24
    Cν, 22
    space, 22

Kirchhoff’s current law, 205
Kirchhoff’s voltage law, 205
Kronecker delta, 25

Laplace transform, 181
    inverse, 192

magnitude, 8
maximal property, 99
McMillan degree, 257
monic polynomial, 193
Moore-Penrose inverse, 153, 366

natural frequency, 382
norm, 22
    L2(0, τ), 24
    Cν, 22

notch filter, 399
null space, 308
Nyquist frequency, 121, 470
Nyquist frequency range, 121
Nyquist-Shannon sampling Theorem, 121

odd function, 60
ohms, 204
one to one matrix, 153
operational amplifier, 277
orthogonal basis, 26
orthogonal vectors, 25
orthonormal basis, 27
orthonormal vectors, 27

Parseval’s formula, 26, 28, 30, 47, 55, 457
partial fraction, 193
period, 31
polar decomposition, 10
pole, 192
positive matrix, 305
power spectrum, 31
proper rational, 192

real part, 7
real vector, 145
resistor, 204
Rodrigues formula, 237
roots of unity, 17

rotation matrix, 236

sampling property, 93, 186
sawtooth function, 71
self adjoint matrix, 308
shift matrix, 171
sifting property, 93, 186
sine matrix, 333
sinusoid estimation, 158
skew symmetric, 236
spring constant, 214
stable, 343
stable matrix, 271
stable transfer function, 212
start, 1
state space, 250
state space realization, 257

steady state, 346
strictly diagonally dominant, 308
strictly proper rational, 192
sunspots, 151

time constant form, 346
transfer function, 204
trigonometric polynomial, 113

uniform convergence, 34
unit step function, 183
unitary matrix, 170, 236

Vandermonde matrix, 150

Wiener algebra, 33
Wolfer number, 151

zero, 192