
Math Notes for ECE 278

G. C. Papen

September 6, 2017


© 2017 by George C. Papen

All rights reserved. No part of this manuscript is to be reproduced without written consent of the author.


Contents

1 Background
  1.1 Linear Systems
    1.1.1 Bandwidth and Timewidth
    1.1.2 Passband and Complex-Baseband Signals
    1.1.3 Signal Space
  1.2 Random Signals
    1.2.1 Probability Distribution Functions
    1.2.2 Random Processes
  1.3 Electromagnetics
    1.3.1 Material Properties
    1.3.2 The Wave Equation
    1.3.3 Random Electromagnetic Fields
  1.4 References
  1.5 Problems

2 Examples
  2.1 Filter Estimation
  2.2 Constant-Modulus Objective Function
  2.3 Adaptive Estimation

Bibliography


1 Background

The study of communication systems is rich and rewarding, bringing together a broad range of topics in engineering and physics. Our development of this subject draws on an understanding of basic material in the subjects of linear systems, random signals, and electromagnetics. The emphasis in this chapter is on the concepts that are relevant to the understanding of modern digital communication systems.

This background chapter also introduces and reinforces the various and alternative sets of notation that are used throughout the book. Much of the understanding of the various topics in this book depends on the choice of clear and appropriate notation and terminology.

1.1 Linear Systems

A communication system conveys information by embedding that information into temporal and perhaps spatial variations of a propagating signal. We begin with a discussion of the properties of signals and systems.

A signal is a real-valued or complex-valued function of a continuous or discrete variable called time. A system responds to a signal s(t) at its input, producing one or more signals r(t) at its output. The most amenable systems are linear systems because a linear mathematical model can support powerful methods of analysis and design. We are interested in both discrete systems and continuous systems expressed in a variety of mathematical forms, such as continuous integral equations, continuous differential equations, or discrete difference equations.

Figure 1.1: A block diagram of a linear system characterized by an impulse response function h(t). Using the properties of homogeneity and additivity, an input ax1(t) + bx2(t) produces an output ay1(t) + by2(t).

A communication signal may be a real function of time or a complex function of time. The rectangular form of a complex function is a(t) = aR(t) + i aI(t), where aR(t) is the real part, aI(t) is the imaginary part, and i² = −1. The polar form is a(t) = A(t)e^{iφ(t)}, where

    A(t) = √(aR²(t) + aI²(t))

is the amplitude and φ(t) = tan⁻¹(aI(t)/aR(t)) is the phase.

Systems can be classified by the properties that relate the input s(t) to the output r(t).

Linearity

A system, either real or complex, is linear if it is homogeneous and additive:


1. Homogeneous systems: If input s(t) has output r(t), then for every scalar a, real or complex, input as(t) has output ar(t).

2. Additive systems: If input x1(t) has output y1(t) and input x2(t) has output y2(t), then input x1(t) + x2(t) has output y1(t) + y2(t).

The output r(t) of a linear continuous-time system can be written as a superposition integral of the input s(t) and a function h(t, τ),

    r(t) = ∫_{-∞}^{∞} h(t, τ) s(τ) dτ,    (1.1.1)

where h(t, τ), called the time-varying impulse response, is defined as the output of the system at time t in response to a Dirac impulse δ(t − τ).

The Dirac impulse δ(t) is not a proper function.[a] It is defined by the formal integral relationship

    s(t) = ∫_{-∞}^{∞} δ(t − τ) s(τ) dτ,    (1.1.2)

for any function s(t). This integral is referred to as the sifting property of a Dirac impulse.

For the treatment of discrete-time signals, a Kronecker impulse δmn is useful, defined by δmn equal to one if m is equal to n, and δmn equal to zero otherwise.

Shift Invariance

Under appropriate conditions, a system described by a superposition integral can be reduced to a simpler form known as a shift-invariant system, or when appropriate, as a time-invariant or a space-invariant system. If input s(t) has output r(t), then for every τ, input s(t − τ) has output r(t − τ). In this case, the impulse response of a linear and shift-invariant system depends only on the time difference, so that h(t, τ) = h(t − τ, 0) and (1.1.1) reduces to

    r(t) = ∫_{-∞}^{∞} h(τ) s(t − τ) dτ.    (1.1.3)

The output is then a convolution of the input s(t) and the shift-invariant impulse response h(t), and is denoted by r(t) = s(t) ~ h(t). The shift-invariant impulse response is also called, simply, the impulse response. Every linear shift-invariant system can be described as a linear shift-invariant filter.

Convolution has the following properties:

1. Commutative property: h(t) ~ s(t) = s(t) ~ h(t).

2. Distributive property: h(t) ~ (x1(t) + x2(t)) = h(t) ~ x1(t) + h(t) ~ x2(t).

[a] A Dirac impulse is an example of a generalized function or a generalized signal. For the formal theory see Strichartz (2003).


3. Associative property: h1(t) ~ (h2(t) ~ s(t)) = (h1(t) ~ h2(t)) ~ s(t).

Using the distributive property of convolution, we can write for complex functions

    a(t) = b(t) ~ c(t)
    aR(t) + i aI(t) = (bR(t) + i bI(t)) ~ (cR(t) + i cI(t))
                    = (bR(t) ~ cR(t) − bI(t) ~ cI(t)) + i (bR(t) ~ cI(t) + bI(t) ~ cR(t)).    (1.1.4)
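As an aside, the decomposition in (1.1.4) is easy to check numerically for discrete sequences. The following sketch is illustrative only (it is not part of the original notes) and uses numpy.convolve to compare a direct complex convolution with the four real convolutions of (1.1.4).

```python
# Numerical check of (1.1.4) for discrete sequences (illustrative sketch only).
import numpy as np

rng = np.random.default_rng(0)
b = rng.standard_normal(8) + 1j * rng.standard_normal(8)   # b = bR + i bI
c = rng.standard_normal(5) + 1j * rng.standard_normal(5)   # c = cR + i cI

# Direct complex convolution.
a_direct = np.convolve(b, c)

# Four real convolutions combined as in (1.1.4).
a_real = np.convolve(b.real, c.real) - np.convolve(b.imag, c.imag)
a_imag = np.convolve(b.real, c.imag) + np.convolve(b.imag, c.real)
a_split = a_real + 1j * a_imag

print(np.allclose(a_direct, a_split))   # True
```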

The class of shift-invariant systems includes all those described by constant-coefficient, linear differential equations. An example of a spatially-invariant system is free space because it has no boundaries and thus the choice of the spatial origin is arbitrary. Systems with spatial boundaries are spatially-varying in at least one direction, but may be spatially-invariant in the other directions. However, many spatial systems with boundaries can be approximated as spatially-invariant over a limited range of spatial inputs.

Causality

A causal filter h(t) is a linear filter whose impulse response has a value equal to zero for all times t less than zero. A causal impulse response cannot have an output before it has an input. A right-sided signal s(t) has a value equal to zero for all times less than zero. A linear time-invariant system is causal if and only if it has a right-sided impulse response. A causal h(t) can be defined using the unit-step function, which is

    u(t) ≐  1    for t > 0
            1/2  for t = 0
            0    for t < 0.    (1.1.5)

A linear shift-invariant system is causal if its impulse response h(t) satisfies h(t) = h(t)u(t) except at t = 0. For this case, the lower limit of the integral for the output signal given in (1.1.3) is equal to zero.

A function related to the unit-step function is the signum function, defined as

    sgn(t) ≐ 2u(t) − 1 =  1   for t > 0
                          0   for t = 0
                         −1   for t < 0.    (1.1.6)

A system for which the output r(t) depends on only the current value of s(t) is called memoryless. The corresponding property in space is called local.


The Fourier Transform

The Fourier transform[b] (or spectrum) S(f) of the temporal signal s(t) is defined, provided the integral exists, as

    S(f) = ∫_{-∞}^{∞} s(t) e^{−i2πft} dt.    (1.1.7)

The Fourier transform formally exists for any signal whose energy[c] E, given by

    E = ∫_{-∞}^{∞} |s(t)|² dt,    (1.1.8)

is finite. Such signals are called finite-energy or square-integrable signals.

The Fourier transform can be extended to include a large number of signals and generalized signals with infinite energy, but finite power, such as cos(2πfct) and e^{i2πfct}, by means of a limiting process that often can be expressed using the Dirac impulse δ(t).

The signal s(t) can be recovered as an inverse Fourier transform

    s(t) = ∫_{-∞}^{∞} S(f) e^{i2πft} df,    (1.1.9)

with s(t) ←→ S(f) denoting the transform pair. To this purpose, two signals whose difference has zero energy are regarded as the same signal. Another way to say this is that the two signals are equal almost everywhere.

A Fourier transform can also be defined for spatial signals. For a one-dimensional spatial signal f(x), we have[d]

    F(k) = ∫_{-∞}^{∞} f(x) e^{ikx} dx,    (1.1.10)

where k is the spatial frequency, which is the spatial equivalent of the temporal frequency ω = 2πf.

Properties of the Fourier Transform

Several properties of the Fourier transform used to analyze communication systems are listed below.

1. Scaling

    s(at) ←→ (1/|a|) S(f/a)    (1.1.11)

for any nonzero real value a. This scaling property states that the width of a function in one domain scales inversely with the width of the function in the other domain.

[b] Angular frequency ω = 2πf (with f in Hz) is also used to define a Fourier transform pair, where S(ω) = ∫_{-∞}^{∞} s(t) e^{−iωt} dt and s(t) = (1/2π) ∫_{-∞}^{∞} S(ω) e^{iωt} dω. We will use this alternative notation for electromagnetics, where it is conventional.
[c] The term energy here refers to a mathematical concept and does not necessarily correspond to physical energy.
[d] The usual sign convention for the spatial Fourier transform is the opposite of the sign convention for the temporal Fourier transform, but this is a matter of preference.


2. Differentiation

    (d/dt) s(t) ←→ i2πf S(f).    (1.1.12)

The dual property is

    t s(t) ←→ −(1/i2π) (d/df) S(f).    (1.1.13)

3. Convolution

    s(t) ~ h(t) ←→ S(f) H(f).    (1.1.14)

Convolution in the time domain is equivalent to multiplication in the frequency domain. The dual property is

    s(t) h(t) ←→ S(f) ~ H(f).    (1.1.15)

4. Modulation
A special case of the convolution property occurs when h(t) = e^{i2πfct} and gives

    s(t) e^{i2πfct} ←→ S(f − fc)    (1.1.16)

for any real value fc. Multiplication in the time domain by e^{i2πfct} translates the frequency origin of the baseband signal S(f) to the carrier frequency fc, which can be written as S(f) ~ δ(f − fc) = S(f − fc). The modulation process is linear with respect to s(t) but does contain frequency components that are not present in the original baseband signal. The dual property is

    s(t − t0) ←→ e^{−i2πft0} S(f)    (1.1.17)

for any real value t0.

5. Parseval's relationship
Two signals s(t) and h(t) with finite energy satisfy

    ∫_{-∞}^{∞} s(t) h*(t) dt = ∫_{-∞}^{∞} S(f) H*(f) df.    (1.1.18)

When h(t) = s(t), the two integrals express the energy in s(t) computed both in the time domain and in the frequency domain. These integrals are equal and finite.
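A discrete analogue of (1.1.18) can be checked with the FFT. The sketch below is illustrative only; with numpy's unnormalized FFT convention, the frequency-domain sum must be divided by the number of samples N.

```python
# Discrete Parseval check: sum |s[n]|^2 == (1/N) * sum |S[k]|^2 (numpy FFT convention).
import numpy as np

rng = np.random.default_rng(1)
N = 256
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)
S = np.fft.fft(s)

energy_time = np.sum(np.abs(s) ** 2)
energy_freq = np.sum(np.abs(S) ** 2) / N

print(np.isclose(energy_time, energy_freq))   # True
```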

If s(t) is real, then the following relationships hold for S(f) = SR(f) + iSI(f), where SR(f) ≐ Re[S(f)] is the real part and SI(f) ≐ Im[S(f)] is the imaginary part of the Fourier transform:

    S(f) = S*(−f)
    SR(f) = SR(−f)
    SI(f) = −SI(−f)
    |S(f)| = |S(−f)|
    φ(f) = −φ(−f),    (1.1.19)


where |S(f)| = √(SR(f)² + SI(f)²) is the magnitude[e] of the Fourier transform, and

    φ(f) ≐ arg S(f) = tan⁻¹(SI(f)/SR(f))

is the phase of the Fourier transform.

A consequence of these properties is that the Fourier transform of a real signal s(t) is conjugate symmetric, meaning that the negative-frequency part of the Fourier transform contains the same information as the positive-frequency part. This observation allows us to construct an equivalent representation of the real signal that consists only of the nonnegative frequency components. To do so, define

    Z(f) ≐  2S(f)  for f > 0
            S(0)   for f = 0
            0      for f < 0.    (1.1.20)

The function Z(f) is equal to twice the positive part of S(f) for positive frequencies, has a value equal to S(0) at the zero-frequency component[f], and contains no negative frequency components. The inverse Fourier transform of Z(f) is called the analytic signal z(t) corresponding to s(t). The analytic signal z(t) is complex. Similarly, the real signal s(t) is related to z(t) by

[e] The word modulus is sometimes used for the magnitude of a complex number.
[f] The value at zero frequency is often called the DC value (direct current) even if the signal does not represent current.

    s(t) = ½ (z(t) + z*(t))    (1.1.21a)
         = Re[z(t)],    (1.1.21b)

where S(f) = S*(−f) has been used because s(t) is real. The analytic signal z(t) is directly related to the real signal s(t) by

    z(t) = s(t) + i ŝ(t)

with z(t) ←→ Z(f), where ŝ(t) is the Hilbert transform of s(t), defined as

    ŝ(t) = (1/π) ∫_{-∞}^{∞} s(τ)/(t − τ) dτ.    (1.1.22)

The Hilbert transform is formally the convolution of s(t) and (πt)⁻¹. For example, if s(t) = cos(2πft), then the analytic signal is z(t) = cos(2πft) + i sin(2πft) = e^{i2πft}.

The Hilbert transform relates a real function of time to a complex function of time with a one-sided function of frequency Z(f). A counterpart of the Hilbert transform, called the Kramers-Kronig transform (or the Kramers-Kronig relationship), relates a function of frequency to a real-valued one-sided (causal) function of time. The inverse of this transform relates a real-valued one-sided function of time to a function of frequency. Let s(t) be a real-valued causal function with Fourier transform S(ω) = SR(ω) + iSI(ω), conventionally stated using the angular frequency ω in place of f. The functions SR(ω) and SI(ω) are related by

    SI(ω) = (1/π) ∫_{-∞}^{∞} SR(Ω)/(ω − Ω) dΩ.    (1.1.23)

An alternative form of the Kramers-Kronig transform expresses instead SR(ω) in terms of SI(ω).
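For sampled data, the analytic signal can be formed with scipy.signal.hilbert, which returns z(t) = s(t) + i ŝ(t) directly. The sketch below is an illustration, not part of the notes; the function name and behavior are SciPy's.

```python
# Analytic signal of cos(2*pi*f0*t): expect approximately exp(i*2*pi*f0*t).
import numpy as np
from scipy.signal import hilbert

fs = 1000.0                       # sampling rate (Hz)
f0 = 50.0                         # tone frequency (Hz), integer number of cycles in the window
t = np.arange(0, 1.0, 1.0 / fs)
s = np.cos(2 * np.pi * f0 * t)

z = hilbert(s)                    # returns s(t) + i*shat(t), the analytic signal
print(np.allclose(z, np.exp(1j * 2 * np.pi * f0 * t), atol=1e-6))  # True
```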


Modes of a Linear Time-invariant System

Let the input to a linear time-invariant system characterized by an impulse response h(t) consist of a single complex frequency component given by s(t) = e^{i2πft}. Using the commutative property of the convolution operation, the output is

    r(t) = ∫_{-∞}^{∞} h(τ) e^{i2πf(t−τ)} dτ
         = e^{i2πft} ∫_{-∞}^{∞} h(τ) e^{−i2πfτ} dτ
         = H(f) e^{i2πft},    (1.1.24)

where H(f), called the transfer function, is the Fourier transform of h(t). For any frequency f0, the output H(f0)e^{i2πf0t} is a scaled version of the input e^{i2πf0t}. Therefore, the function e^{i2πf0t} is an eigenfunction[g], eigenmode, or simply a mode of a linear, shift-invariant system, with the value H(f0) being the eigenvalue. For a linear transformation described by a matrix, the corresponding vector is called an eigenvector. Given that a linear, shift-invariant system can only scale the function e^{i2πf0t} by a complex number H(f0), a linear shift-invariant system cannot create new frequency components.

Any s(t) at the input to h(t) will have an output described by the convolution r(t) = s(t) ~ h(t).

[g] In general, an eigenfunction of a linear transformation is any function that is unchanged by that transformation except for a scalar multiplier called an eigenvalue.

Using the convolution property of the Fourier transform given in (1.1.14), the output signal R(f) in the frequency domain is given by R(f) = S(f)H(f), where S(f) is the Fourier transform of s(t). The inverse Fourier transform (cf. (1.1.9)) of R(f) yields the output signal r(t)

    r(t) = ∫_{-∞}^{∞} S(f) H(f) e^{i2πft} df.    (1.1.25)

The relationship between the input and the output of a linear shift-invariant system in both the time domain and the frequency domain is shown in Figure 1.2, where the two-way arrows represent the Fourier transform relationship.

Figure 1.2: Time and frequency representation of a time-invariant linear system: r(t) = s(t) ~ h(t) in the time domain and R(f) = S(f)H(f) in the frequency domain.
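The eigenfunction property (1.1.24) can also be illustrated numerically: filtering a sampled complex exponential with a discrete-time LTI filter returns the same exponential scaled by the transfer function at that frequency. The sketch below is an assumption-laden discrete-time analogue (FIR filter, steady-state samples only), not the notes' own example.

```python
# A complex exponential is an eigenfunction of an LTI (convolutional) system.
import numpy as np

h = np.array([0.5, 0.3, 0.2])             # an arbitrary FIR impulse response
nu = 0.1                                   # normalized frequency (cycles/sample)
n = np.arange(200)
x = np.exp(1j * 2 * np.pi * nu * n)        # input mode e^{i 2 pi nu n}

y = np.convolve(x, h)[: len(n)]            # filter output (same-length segment)
H = np.sum(h * np.exp(-1j * 2 * np.pi * nu * np.arange(len(h))))  # H(nu)

# After the initial transient, y[n] = H(nu) * x[n].
print(np.allclose(y[len(h):], H * x[len(h):]))   # True
```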

1.1.1 Bandwidth and Timewidth

Signals used in communication systems are constructed from finite-energy pulses. A signal s(t) of finite energy must have most of its energy in some finite region of the time axis, and its Fourier transform S(f) must have most of its energy in some finite region of the frequency axis. The timewidth is a measure of the width of the signal s(t). The bandwidth is a measure of the width of the spectrum S(f).

The root-mean-squared bandwidth Wrms of a signal s(t) with nonzero energy is defined by

    Wrms² ≐ (1/E) ∫_{-∞}^{∞} (f − f̄)² |S(f)|² df,    (1.1.26)

where S(f) is the Fourier transform of s(t), where

    f̄ ≐ (1/E) ∫_{-∞}^{∞} f |S(f)|² df    (1.1.27)

is the centroid or the mean of the term |S(f)|²/E, and where E = ∫_{-∞}^{∞} |S(f)|² df is the energy in the pulse s(t). Expanding the square in (1.1.26) and simplifying yields the alternative form

    Wrms² = (1/E) ∫_{-∞}^{∞} f² |S(f)|² df − f̄²
          = \overline{f²} − f̄²,    (1.1.28)

where

    \overline{f²} ≐ (1/E) ∫_{-∞}^{∞} f² |S(f)|² df    (1.1.29)

is defined as the mean-squared frequency.

The root-mean-squared timewidth Trms is defined in an analogous fashion to (1.1.28) as

    Trms² = \overline{t²} − t̄²,    (1.1.30)

where t̄ = (1/E) ∫_{-∞}^{∞} t |s(t)|² dt and \overline{t²} = (1/E) ∫_{-∞}^{∞} t² |s(t)|² dt. The value Trms² defines the mean-squared timewidth of the pulse s(t).

The relationship between Trms and Wrms for a baseband pulse is shown in Figure 1.3a. If the same pulse is modulated onto a carrier to produce a passband pulse, as shown in Figure 1.3b, then the definition of Wrms is not as useful because the spectrum S(f) is not contiguous. In this case, the passband bandwidth B is twice the baseband bandwidth W because it occupies twice the frequency range.

Other measures of the bandwidth and timewidth are common. The ideal bandwidth is the smallest value of W such that S(f) = S(f) rect(f/W). The three-decibel bandwidth or half-power bandwidth of a signal s(t) whose power density |S(f)|² is unimodal is denoted by Wh. It is defined as the frequency at which |S(f)|² is half (or −3 dB) of the power density at the maximum of |S(f)|².

The effective timewidth Tamp of a nonnegative real pulse s(t) is defined as

    Tamp ≐ (1/E) ( ∫_{-∞}^{∞} s(t) dt )² = ( ∫_{-∞}^{∞} s(t) dt )² / ∫_{-∞}^{∞} s²(t) dt.    (1.1.31)


Figure 1.3: (a) A baseband pulse s(t), with timewidth Trms, and its spectrum S(f), with bandwidth Wrms. (b) A passband pulse and its spectrum, centered at ±fc.

Instead, the effective timewidth Tpower of a complex pulse s(t) is defined differently in terms of the instantaneous power P(t) = |s(t)|² as

    Tpower ≐ ( ∫_{-∞}^{∞} P(t) dt )² / ∫_{-∞}^{∞} P²(t) dt = ( ∫_{-∞}^{∞} |s(t)|² dt )² / ∫_{-∞}^{∞} |s(t)|⁴ dt.    (1.1.32)

These two definitions are similar, but are not the same.

Timewidth-Bandwidth Product

The Schwarz inequality, discussed in Section 1.1.3 (cf. (1.1.72)), can be used to determine a lower bound on the timewidth-bandwidth product[h] of the root-mean-squared timewidth Trms of the signal s(t) and the root-mean-squared bandwidth Wrms of the corresponding spectrum S(f). A pulse s(t) with a mean time t̄ and a mean frequency f̄ has the same timewidth and bandwidth as the pulse s(t − t̄)e^{i2πf̄t}. Therefore it is enough to consider s(t) with both means, t̄ and f̄, equal to zero. Normalize the energy so that ∫_{-∞}^{∞} |s(t)|² dt = ∫_{-∞}^{∞} |S(f)|² df = 1 (cf. (1.1.18)). The derivation then follows from the expression

    (d/dt) ( t |s(t)|² ) = |s(t)|² + t s(t) (ds*(t)/dt) + t s*(t) (ds(t)/dt)
                         = |s(t)|² + 2 Re[ t s(t) (ds*(t)/dt) ].    (1.1.33)

[h] This is also called the time-bandwidth product.


Integrate both sides from −∞ to ∞:

    t |s(t)|² |_{−∞}^{∞} = ∫_{-∞}^{∞} |s(t)|² dt + 2 Re[ ∫_{-∞}^{∞} t s(t) (ds*(t)/dt) dt ].

The left side is zero because the power |s(t)|² in a finite-energy pulse must go to zero faster than 1/|t| as |t| goes to infinity. Therefore, the squared magnitudes of the two terms on the right are equal, so that

    | ∫_{-∞}^{∞} |s(t)|² dt |² = 4 | Re[ ∫_{-∞}^{∞} t s(t) (ds*(t)/dt) dt ] |².    (1.1.34)

Setting the left side to E² and applying the Schwarz inequality given in (1.1.72) to the right gives

    E² ≤ 4 ∫_{-∞}^{∞} |t s(t)|² dt ∫_{-∞}^{∞} |ds*(t)/dt|² dt.    (1.1.35)

The first integral on the right equals E Trms² (cf. (1.1.30)). Using the differentiation property of the temporal Fourier transform (cf. (1.1.12)) and Parseval's relationship (cf. (1.1.18)), the second integral can be written as (cf. (1.1.26))

    ∫_{-∞}^{∞} |ds*(t)/dt|² dt = ∫_{-∞}^{∞} |i2πf S(f)|² df = 4π² E Wrms².

With these expressions, (1.1.35) now leads to the following inequality for the timewidth-bandwidth product[i]

    Trms Wrms ≥ 1/(4π).    (1.1.36)

As an example, the Fourier transform S(f) of a gaussian pulse s(t) = e^{−πt²} in time is a gaussian pulse in frequency, with the transform pair given by

    e^{−πt²} ←→ e^{−πf²}.    (1.1.37)

Expressing these pulses in the standard form e^{−t²/2σ²}, each pulse is characterized by σ² = 1/2π. For a gaussian pulse, because Trms² is defined using |s(t)|² and Wrms² is defined using |S(f)|², Trms² = Wrms² = 1/4π so that Trms Wrms = 1/4π, which satisfies (1.1.36) with equality. This means that a gaussian pulse, perhaps time-shifted or frequency-shifted, produces the minimum value of the timewidth-bandwidth product.
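The bound (1.1.36) can be checked numerically for the gaussian pulse. The sketch below (illustrative only, using a simple Riemann-sum quadrature and a direct numerical Fourier transform) computes Trms and Wrms for s(t) = e^{−πt²} and compares their product with 1/4π.

```python
# Numerical check that Trms * Wrms ~= 1/(4*pi) for the gaussian pulse exp(-pi t^2).
import numpy as np

dt = 0.01
t = np.arange(-8, 8, dt)
s = np.exp(-np.pi * t ** 2)

E = np.sum(s ** 2) * dt
t_mean = np.sum(t * s ** 2) * dt / E
T2 = np.sum((t - t_mean) ** 2 * s ** 2) * dt / E            # Trms^2

df = 0.01
f = np.arange(-4, 4, df)
S = np.exp(-1j * 2 * np.pi * np.outer(f, t)) @ s * dt        # numerical Fourier transform
Ef = np.sum(np.abs(S) ** 2) * df
f_mean = np.sum(f * np.abs(S) ** 2) * df / Ef
W2 = np.sum((f - f_mean) ** 2 * np.abs(S) ** 2) * df / Ef    # Wrms^2

print(np.sqrt(T2 * W2), 1 / (4 * np.pi))                     # both approximately 0.0796
```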

Communication Pulses

Several basic pulses are commonly used in the study of communication systems. A rectangular pulse of unit height and unit width centered at the origin is the rect pulse, defined as

    rect(t) ≐  1  for |t| ≤ 1/2
               0  for |t| > 1/2.    (1.1.38)

[i] This is called the Heisenberg uncertainty relationship in other contexts.


The Fourier transform of this rectangular pulse is

    S(f) = ∫_{-∞}^{∞} s(t) e^{−i2πft} dt = sin(πf)/(πf) = sinc(f),    (1.1.39)

where the sinc pulse is defined as

    sinc(t) ≐ sin(πt)/(πt).    (1.1.40)

The sinc pulse has its zeros on the nonzero integers and is equal to one for t equal to zero. The Fourier transform pairs

    rect(t) ←→ sinc(f)    (1.1.41a)
    sinc(t) ←→ rect(f)    (1.1.41b)

are duals.

The scaling property of the Fourier transform gives the pair

    (1/T) rect(t/T) ←→ sinc(fT).    (1.1.42)

In the limit as T goes to zero, the left side approaches a Dirac impulse and the right side approaches the constant one. In the sense of this limit, the Fourier transform pair

    δ(t) ←→ 1,    (1.1.43)

and its dual

    1 ←→ δ(f),    (1.1.44)

can be defined.

Another useful Fourier transform pair that is defined using a limiting process is the Fourier transform of an infinite series of Dirac impulses, which is given by[j]

    ∑_{j=−∞}^{∞} δ(t − j) ←→ ∑_{j=−∞}^{∞} δ(f − j).    (1.1.45)

This Fourier transform pair is abbreviated as comb(t) ←→ comb(f). The transform pair

    ∑_{j=−∞}^{∞} δ(t − jTs) ←→ (1/Ts) ∑_{j=−∞}^{∞} δ(f − j/Ts)    (1.1.46)

[j] This pair is whimsically called the "picket fence miracle". A companion statement that avoids the use of impulses is the Poisson summation formula.


then follows from the scaling property of the Fourier transform.

Another useful pulse is the gaussian pulse

    s(t) = e^{−t²/2σ²}.    (1.1.47)

The transform pair e^{−πt²} ←→ e^{−πf²} given in (1.1.37) becomes

    e^{−t²/2σ²} ←→ √(2π) σ e^{−2π²σ²f²} = √(2π) σ e^{−σ²ω²/2},    (1.1.48)

by using the scaling property of the Fourier transform given in (1.1.11).

Inserting the mathematical symbol i into the argument of a gaussian pulse gives another pulse called a quadratic phase pulse, a chirp pulse, or an imaginary gaussian pulse. Because e^{−iπt²} has infinite energy, it does not conform to the requirements of the formal definition of a Fourier transform pair. Therefore, the Fourier transform must be defined by a limiting process and is given by

    e^{−iπt²} ←→ e^{−iπ/4} e^{iπf²} = e^{−iπ/4} e^{iω²/4π}.    (1.1.49a)

The duality property of the Fourier transform gives

    e^{−iπ/4} e^{iπt²} ←→ e^{−iπf²}.    (1.1.50)

Another transform pair is

    π e^{−2π|t|} ←→ 1/(1 + f²),    (1.1.51)

with the pulse waveform in the frequency domain called a lorentzian pulse. For this pulse Trms exists, but Wrms does not (or is infinite).

A list of Fourier transform pairs is provided for reference in Table 1.1.

1.1.2 Passband and Complex-Baseband Signals

A passband signal is a signal of the form

    s(t) = A(t) cos(2πfct + φ(t)),    (1.1.52)

where A(t) is the amplitude, φ(t) is the phase, and both A(t) and φ(t) vary slowly compared to the carrier or reference frequency fc. Radio-frequency signals of the form of (1.1.52) are passband signals because of the high frequency of the carrier as compared to the baseband modulation bandwidth.

An equivalent representation for a passband signal can be derived using the trigonometric identity cos(A + B) = cos A cos B − sin A sin B to yield

    s(t) = sI(t) cos(2πfct) − sQ(t) sin(2πfct),    (1.1.53)

where sI(t) = A(t) cos φ(t) is the in-phase component, and sQ(t) = A(t) sin φ(t) is the quadrature component. A passband signal can also be written as the real part of a complex signal

    s(t) = Re[ (A(t) e^{iφ(t)}) e^{i2πfct} ]    (1.1.54a)
         = Re[ (sI(t) + i sQ(t)) e^{i2πfct} ]    (1.1.54b)
         = Re[ s(t) e^{i2πfct} ],    (1.1.54c)


    s(t)                          S(f)
    1                             δ(f)
    δ(t)                          1
    e^{i2πfct}                    δ(f − fc)
    rect(t)                       sinc(f)
    sinc(t)                       rect(f)
    e^{−πt²}                      e^{−πf²}
    e^{−t²/2σ²}                   √(2π) σ e^{−2π²σ²f²}
    e^{−iπt²}                     e^{−iπ/4} e^{iπf²}
    π e^{−2π|t|}                  1/(1 + f²)
    ∑_{j=−∞}^{∞} δ(t − j)         ∑_{j=−∞}^{∞} δ(f − j)
    comb(t)                       comb(f)
    ∑_{j=−K}^{K} δ(t − j)         sin[(2K + 1)πf] / sin(πf)

Table 1.1: Table of Fourier transform pairs.

where s(t) = sI(t) + i sQ(t) is the complex-baseband signal that represents the passband signal s(t).

The amplitude of a passband signal can be written in terms of the root-mean-squared amplitude Arms(t), defined by

    Arms(t) ≐ √( (1/T) ∫_T s²(t) dt ) = √( (1/T) ∫_T (A(t) cos(2πfct + φ(t)))² dt ) ≈ √( A²(t)/2 ),    (1.1.55)

where (1/T) ∫_0^T cos²(2πfct) dt = 1/2 has been used, and the integration time T is large compared to 1/fc and small compared to any temporal variation of A(t). If the complex-baseband signal is a function of both time and space, then the signal is called the complex signal envelope a(z, t), and is often expressed using a root-mean-squared amplitude.

The Fourier transform of the passband signal s(t) is

    S(f) = ∫_{-∞}^{∞} s(t) e^{−i2πft} dt = ∫_{-∞}^{∞} Re[ s(t) e^{i2πfct} ] e^{−i2πft} dt.

Using the identity Re[z] = ½(z + z*) gives

    S(f) = ½ ∫_{-∞}^{∞} ( s(t) e^{i2πfct} + s*(t) e^{−i2πfct} ) e^{−i2πft} dt.

Applying the modulation property of the Fourier transform yields

    S(f) = ½ ( S(f − fc) + S*(−f − fc) ),    (1.1.56)


where S(f) is the Fourier transform of the complex-baseband signal s(t). The notion of a passband signal implies that S(f − fc) and S(f + fc) have essentially no overlap.

A passband impulse response h(t) has a passband transfer function of the form

    H(f) = H(f − fc) + H*(−f − fc),    (1.1.57)

which has the same functional form as (1.1.56), but without the factor of 1/2. Provided the terms H(f − fc) and H(f + fc) do not overlap,

    |H(f)|² = |H(f − fc)|² + |H(−f − fc)|².    (1.1.58)

To define the baseband equivalent of the passband system, the complex-baseband transfer function H(f) is defined as

    H(f) =  H(f + fc)  for f < fc
            0          for f > fc.

Using the modulation property of the Fourier transform given in (1.1.16) and (1.1.57), the real passband impulse response can be written as

    h(t) = h(t) e^{i2πfct} + h*(t) e^{−i2πfct}
         = 2 Re[ h(t) e^{i2πfct} ],    (1.1.59)

where h(t) is the complex-baseband impulse response, which is the inverse Fourier transform of the complex-baseband transfer function H(f).

The output passband signal r(t) has a Fourier transform given by

    R(f) = S(f) H(f)
         = ½ ( R(f − fc) + R*(−f − fc) ),    (1.1.60)

which can be verified using the definitions of S(f) and H(f), noting that the two cross terms S(f − fc)H*(−f − fc) and S*(−f − fc)H(f − fc) are zero under the same set of constraints used to derive (1.1.58).

In summary, the output of a passband linear system h(t) to a passband signal s(t) at the input can be determined in either the time domain or the frequency domain using

    r(t) = s(t) ~ h(t)
    R(f) = S(f) H(f).

With these translated to complex baseband using the same frequency reference, the output of the corresponding complex-baseband linear system h(t) with an input complex-baseband signal s(t) is

    r(t) = s(t) ~ h(t)
    R(f) = S(f) H(f).
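This passband/complex-baseband equivalence is convenient to verify with sampled waveforms. The sketch below is illustrative and uses assumed discrete-time approximations (Riemann-sum convolution, arbitrarily chosen pulses and carrier); it filters a passband signal directly, filters its complex-baseband representation, and compares the two outputs through (1.1.54c) and (1.1.59).

```python
# Passband filtering vs. complex-baseband filtering (discrete-time approximation).
import numpy as np

dt = 1e-3
t = np.arange(0, 2.0, dt)
fc = 100.0                                         # carrier frequency (Hz)

s_b = np.exp(-((t - 0.5) / 0.05) ** 2) * np.exp(1j * 2 * np.pi * 3 * t)   # baseband signal
h_b = np.exp(-((t - 0.3) / 0.04) ** 2) * (1 + 0.5j)                        # baseband impulse response

s = np.real(s_b * np.exp(1j * 2 * np.pi * fc * t))       # passband signal, (1.1.54c)
h = 2 * np.real(h_b * np.exp(1j * 2 * np.pi * fc * t))   # passband impulse response, (1.1.59)

t_out = np.arange(2 * len(t) - 1) * dt
r_pass = np.convolve(s, h) * dt                          # direct passband convolution
r_base = np.convolve(s_b, h_b) * dt                      # complex-baseband convolution
r_recon = np.real(r_base * np.exp(1j * 2 * np.pi * fc * t_out))

print(np.max(np.abs(r_pass - r_recon)) / np.max(np.abs(r_pass)))   # small relative error
```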


1.1.3 Signal Space

The set of all complex signals of finite energy on an interval [0, T] defines the signal space over that interval. Two elements within the signal space are deemed to be equivalent, or equal, if the energy of the difference of the two signals is zero. A countable set of signals {ψn(t)} of a signal space spans that signal space if every element of the signal space can be expressed as a linear combination of the ψn(t). This means that every s(t) can be written as

    s(t) = ∑_n sn ψn(t),    (1.1.61)

in the sense that the difference between the left side and the right side has zero energy. The set {ψn(t)} is called a basis if the elements of the set are linearly independent and span the signal space. Every basis for a signal space is countably infinite. An orthonormal basis {ψn(t)} satisfies the additional requirement that

    ∫_0^T ψm(t) ψn*(t) dt ≐ δmn,    (1.1.62)

for all m and n, where δmn is the Kronecker impulse. This means that each basis function satisfies

    ∫_0^T |ψn(t)|² dt = 1    (1.1.63a)

and

    ∫_0^T ψm(t) ψn*(t) dt = 0   (m ≠ n).    (1.1.63b)

For an orthonormal basis, the coefficient sn of the expansion in (1.1.61) is given by

    sn ≐ ∫_0^T s(t) ψn*(t) dt.    (1.1.64)

A set of basis functions must span the entire signal space, which implies that the number of functions in any basis for the signal space is infinite. A basis must be infinite, but not every infinite orthonormal set is a basis for the set of square-integrable functions.

A linear transformation on a signal space is a mapping from the space onto itself that satisfies the linearity properties. With respect to a fixed basis, a linear transformation can be described by a matrix, called the transformation matrix.

Inner Product   For any orthonormal basis {ψm(t)}, a signal s(t) in the signal space over [0, T] is completely determined by an infinite sequence of complex components sn, which are the coefficients of the expansion given in (1.1.61). These coefficients may be regarded as forming an infinitely long vector s called a signal vector. Given a signal vector s with complex components, define the conjugate transpose vector as the vector s† whose components are the complex conjugates of the corresponding components of the vector s. If s is a column vector, then s† is a row vector.


Using a(t) = ∑_m am ψm(t) (cf. (1.1.61)) and b(t) = ∑_n bn ψn(t), define the inner product as

    a · b ≐ ∫_0^T a(t) b*(t) dt
          = ∑_m ∑_n am bn* ∫_0^T ψm(t) ψn*(t) dt
          = ∑_m ∑_n am bn* δmn
          = ∑_m am bm*,    (1.1.65)

where (1.1.62) is used in going from the second line to the third line. Setting a(t) = b(t) in (1.1.65) immediately gives the energy statement

    ∫_0^T |a(t)|² dt = ∑_m |am|².    (1.1.66)

For a finite-energy signal a(t), this implies that |am|² goes to zero as m goes to infinity. For some integer N, an arbitrarily small amount of energy is discarded by including only N terms.

The term am is a component of the (column) signal vector a. The term bm* is a component of the (row) signal vector b†. Therefore,

    a · b = b†a,    (1.1.67)

where b†a is the matrix product of a one-by-N matrix and an N-by-one matrix.

Using (1.1.66), the energy in the signal s(t) is

    E = ∫_0^T |s(t)|² dt = ∑_n |sn|² = |s|².    (1.1.68)

Similarly, the component sn in (1.1.64) is determined using a(t) = s(t) and b(t) = ψn(t):

    sn = ∫_0^T s(t) ψn*(t) dt ≐ s · ψn.    (1.1.69)

This expression defines the projection of s(t) onto ψn(t). The vector ψn has the nth component equal to one and all other components equal to zero. It is a basis vector that corresponds to the basis function ψn(t) defined in (1.1.61). The set of all linear combinations of basis vectors along with an inner product is an instance of a Hilbert space.
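The expansion (1.1.61) and the projection (1.1.69) have a direct discrete analogue: with samples on [0, T], the integrals become sums. The sketch below is illustrative and assumes a simple discrete Fourier-type orthonormal basis on a uniform grid; it recovers the coefficients of a signal by projection, reconstructs the signal from them, and checks the energy relation (1.1.66).

```python
# Expansion of a sampled signal in a discrete orthonormal basis and recovery by projection.
import numpy as np

N = 64                                             # number of samples on [0, T)
n = np.arange(N)
# Orthonormal complex-exponential basis vectors psi_k[n] = exp(i 2 pi k n / N)/sqrt(N).
psi = np.exp(1j * 2 * np.pi * np.outer(np.arange(N), n) / N) / np.sqrt(N)

rng = np.random.default_rng(3)
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # an arbitrary signal vector

coeffs = psi.conj() @ s                  # s_k = projection of s onto psi_k, as in (1.1.69)
s_hat = psi.T @ coeffs                   # s = sum_k s_k psi_k, as in (1.1.61)

print(np.allclose(s, s_hat))                             # True
print(np.isclose(np.sum(np.abs(s) ** 2),                 # energy relation (1.1.66)
                 np.sum(np.abs(coeffs) ** 2)))           # True
```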

Outer Product   In contrast to the inner product operation a · b ≐ b†a defined in (1.1.65), which produces a scalar, the outer product operation of two vectors produces a transformation. Just as the inner product operation is invariant with respect to a change in basis, the outer product as an operation is invariant with respect to a change in basis. However, the matrix representation M of an outer product, with elements mij = ai bj*, does depend on the basis.


The outer product of two column vectors is defined as

    a ⊗ b ≐ a b†.    (1.1.70)

The outer product is also called the dyadic product or the tensor product of the two vectors. The outer product distributes over both addition and scalar multiplication so that

    (αa + βb) ⊗ (γc + δd) = αγ (a ⊗ c) + αδ (a ⊗ d) + βγ (b ⊗ c) + βδ (b ⊗ d).

Schwarz Inequality   Any two signal vectors satisfy

    |s1 · s2|² ≤ |s1|² |s2|²,

which is known as the Schwarz inequality. For the signal space of complex functions with finite energy on the interval [0, T], the Schwarz inequality can be written using (1.1.65) as

    | ∫_0^T s1(t) s2*(t) dt |² ≤ ∫_0^T |s1(t)|² dt ∫_0^T |s2(t)|² dt,    (1.1.71)

whereas for the set of square-integrable functions on the infinite line, it is

    | ∫_{-∞}^{∞} s1(t) s2*(t) dt |² ≤ ∫_{-∞}^{∞} |s1(t)|² dt ∫_{-∞}^{∞} |s2(t)|² dt.    (1.1.72)

In either case, the equality holds if and only if s1(t) = k s2(t) for some constant k, possibly complex.

Distance in a Signal Space   Using (1.1.65) and (1.1.68), the squared euclidean distance d12² between two signals s1(t) and s2(t), or, equivalently, two signal vectors s1 and s2, is defined as

    d12² ≐ ∫_0^T |s1(t) − s2(t)|² dt
         = ∫_0^T ( |s1(t)|² + |s2(t)|² − s1(t)s2*(t) − s1*(t)s2(t) ) dt
         = E1 + E2 − 2 Re[s1 · s2].    (1.1.73)

This expression states that the squared euclidean distance between two signals depends on the energy of each signal as well as their inner product.

Now consider an infinite-duration passband signal

    s(t) = A(t) cos(2πfct + φ(t)) = Re[ s(t) e^{i2πfct} ].

Using Parseval's relationship and the modulation property of the Fourier transform, the energy in this passband signal is

    Es = ∫_{-∞}^{∞} s²(t) dt = ∫_{-∞}^{∞} |S(f)|² df
       = ∫_{-∞}^{∞} | ½ S(f − fc) + ½ S(f + fc) |² df
       = ½ ∫_{-∞}^{∞} |S(f)|² df = ½ Es,    (1.1.74)


provided that S(f − fc)S(f + fc) = 0. Equation (1.1.74) states that, under this condition, the energy in a complex-baseband signal s(t) is twice the energy in the passband signal s(t). Similarly, under this condition, the squared euclidean distance between two complex-baseband signals is twice the squared distance between the equivalent passband signals, so that

    dij²(complex-baseband signal) = 2 dij²(passband signal).    (1.1.75)

Using the same line of reasoning, the cosine and sine components are orthogonal with

    ∫_{-∞}^{∞} sI(t) cos(2πfct) sQ(t) sin(2πfct) dt = 0.    (1.1.76)

For a narrowband signal, both A(t) and φ(t) vary slowly compared to the carrier frequency fc. Therefore, over a finite time interval T much greater than 1/fc, the energy in the signal is well-approximated by (1.1.74), with the cosine and sine components being nearly orthogonal.

Fourier Series

A function in signal space on the interval [0, T] can be expanded in a Fourier series. The sinusoids are the basis functions and the Fourier coefficients are the expansion coefficients.

Nyquist-Shannon Series   A deterministic baseband waveform s(t) whose spectrum S(f) is zero for |f| larger than W is called a bandlimited waveform. Because (−W, W) defines an interval on the frequency axis, the set of functions on this interval is a signal space and is spanned by a countable set of basis functions. One such set of basis functions is given by the Nyquist-Shannon sampling theorem.

The sampling theorem can be described by setting W = 1/2 and multiplying s(t) by comb(t) (cf. (1.1.45)). This produces a sampled waveform with the samples s(j) spaced by the sampling interval Ts = 1. Two sampled waveforms are shown in Figure 1.4 for two different sampling rates. Using comb(t) ←→ comb(f) and its dual (cf. Table 1.1), and the convolution property of the Fourier transform (cf. (1.1.15)), gives

    s(t) comb(t) ←→ S(f) ~ comb(f)    (1.1.77a)
    (s(t) comb(t)) ~ sinc(t) ←→ (S(f) ~ comb(f)) rect(f).    (1.1.77b)

The left side of (1.1.77a) is an infinite sequence of impulses with the area of the kth impulse equal to the kth sample value. The right side is the spectrum of the sampled waveform S(f) ~ comb(f) and is shown at the bottom of Figure 1.4 for two different sampling rates. For the left set of curves in Figure 1.4a, multiplying the right side by rect(f), which is shown as a dashed line, recovers S(f) because the support of S(f) is [−1/2, 1/2] and thus the images of the original spectrum S(f) do not overlap in S(f) ~ comb(f). Multiplication by rect(f) in frequency corresponds to a convolution in time with sinc(t). Because [S(f) ~ comb(f)] rect(f) = S(f), the convolution of sinc(t) with the left side of (1.1.77b) recovers s(t), so that

    s(t) = sinc(t) ~ [s(t) comb(t)] = ∑_{j=−∞}^{∞} s(j) sinc(t − j).    (1.1.78)


Figure 1.4: (a) Sampling at greater than the Nyquist rate. (b) Sampling at less than the Nyquist rate, showing the effect of aliasing. (Panels: baseband signals, combs in time, sampled signals, and spectra of the sampled signals with the rect reconstruction function.)

This expression states that a waveform s(t) bandlimited to [−1/2, 1/2] can be expanded using an orthogonal set of basis functions {sinc(t − j)}, with the coefficients simply being samples s(j) of the bandlimited waveform s(t) for a sampling interval Ts = 1. The expression for an arbitrary sampling interval Ts can be determined by applying the scaling property of the Fourier transform and gives

    s(t) = ∑_{j=−∞}^{∞} s(jTs) sinc(2Wt − j),    (1.1.79)

where Ts = 1/2W. In this way, the sequence of sinc functions is seen as the sequence of interpolating functions for a bandlimited signal.

The images of the original signal spectrum S(f) shown in Figure 1.4 are offset by the sampling rate Rs = 1/Ts. When Rs ≥ 2W, the images do not overlap. The minimum sampling rate Rs = 2W that generates nonoverlapping images of the signal spectrum is called the Nyquist rate. When the sampling rate is greater than or equal to the Nyquist rate, the images do not overlap and the original signal s(t) can be reconstructed as given by (1.1.79). The set of curves in Figure 1.4a shows a spectrum for a signal that is sampled at greater than the Nyquist rate. When the sampling rate is less than the Nyquist rate, the images of the original signal spectrum S(f) overlap. This effect is called aliasing and is shown in the set of curves in Figure 1.4b. In this case, any filter that reconstructs all of the frequency components of the original signal must also pass some frequency components from one or more images. Aliasing is a form of signal distortion that replicates frequency components in the original signal at other frequencies in the reconstructed signal.
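A minimal numerical illustration of (1.1.79): sample a bandlimited signal at the Nyquist rate and rebuild it by sinc interpolation. This sketch is illustrative only; the infinite series is truncated, so a small truncation error remains even well inside the window.

```python
# Sinc-interpolation reconstruction of a bandlimited signal sampled at the Nyquist rate.
import numpy as np

W = 4.0                       # signal bandlimit (Hz): components at 1 Hz and 3 Hz < W
Ts = 1.0 / (2.0 * W)          # Nyquist-rate sampling interval

def s(t):
    return np.cos(2 * np.pi * 1.0 * t) + 0.5 * np.sin(2 * np.pi * 3.0 * t)

j = np.arange(-200, 201)              # sample indices (finite truncation of the series)
samples = s(j * Ts)

t = np.linspace(-2, 2, 1001)          # dense grid well inside the sampled window
s_rec = samples @ np.sinc(2 * W * t[None, :] - j[:, None])   # eq. (1.1.79); np.sinc = sin(pi x)/(pi x)

print(np.max(np.abs(s_rec - s(t))))   # small truncation error
```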


Matrices

A matrix A is a doubly-indexed set aij, i = 1, . . . , n, j = 1, . . . , m, of real or complex numbers conventionally interpreted as a two-dimensional array A = [aij]. The conjugate of A is the matrix A* = [aij*]. The transpose of A is the matrix A^T = [aji]. The conjugate transpose of A is the matrix A† = [aji*]. A matrix with n = m is a square matrix.

Trace of a Matrix   The trace of a square matrix A is defined as the sum ∑_n Ann of the diagonal elements of A. The trace is an inherent property of the transformation represented by the matrix and is independent of the choice of basis. Accordingly, the trace is the sum of the eigenvalues of the matrix. The trace operation has the following properties:

    trace(cA) = c trace A    (1.1.80a)
    trace(A + B) = trace A + trace B    (1.1.80b)
    trace(AB) = trace(BA).    (1.1.80c)

The trace of a square matrix that can be expressed as an outer product xy† of two vectors is equal to the inner product y†x of the same two vectors. This is given by

    trace(xy†) = y†x.    (1.1.81)

The proof of this statement is asked as an end-of-chapter exercise.
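A quick numerical confirmation of (1.1.81), offered as an illustrative sketch rather than the requested proof:

```python
# Check trace(x y†) = y† x for random complex vectors.
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
y = rng.standard_normal(5) + 1j * rng.standard_normal(5)

outer = np.outer(x, y.conj())          # the matrix x y†
inner = np.vdot(y, x)                  # y† x (np.vdot conjugates its first argument)

print(np.isclose(np.trace(outer), inner))   # True
```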

Determinant of a Matrix   The determinant of a square matrix A is a real or complex number defined in the usual way. The determinant is an inherent property of the transformation represented by a matrix and is independent of the choice of the basis. The determinant, denoted det(·), is defined by the Laplace recursion formula. Let Aij be the (n−1) by (n−1) matrix obtained from the n by n matrix A by striking out the ith row and the jth column. Then, for any fixed i,

    det A = ∑_{j=1}^{n} (−1)^{i+j} aij det Aij,    (1.1.82)

where aij is the element of A indexed by i and j. A matrix whose determinant is nonzero is a full-rank matrix. For this matrix, all the eigenvalues are nonzero, and if they are distinct, then the eigenvectors are orthogonal, with the number equal to the size of the square matrix. The rank of a matrix is defined as the maximum number of linearly-independent row vectors in the matrix. Equivalently, the rank of a matrix is the maximum number of linearly-independent column vectors in the matrix. The rank of a matrix is equal to the size of the largest square submatrix with a nonzero determinant.

The determinant has the following properties for an n by n matrix:

    det AB = det A det B    (1.1.83a)
    det(cA) = cⁿ det A    (1.1.83b)
    det A = 1 / det(A⁻¹),    (1.1.83c)


provided that det A is nonzero, where A⁻¹ is the matrix inverse of A. The trace and the determinant are the two important invariant scalar metrics describing the characteristics of a square matrix.

Hermitian Matrices   A matrix that is invariant under the conjugate transpose satisfies A† = A. The transformation it represents is called a self-adjoint transformation, and the matrix is called a hermitian matrix. Every hermitian matrix can be diagonalized by a change of basis and has only real eigenvalues. However, not every matrix with real eigenvalues is hermitian. The eigenvectors of a hermitian matrix are orthogonal. By normalizing each eigenvector, the inner products satisfy ej†ej = 1 for every j, and the set of normalized eigenvectors forms an orthonormal basis.

A transformation matrix A that satisfies AA† = A†A = I is a unitary matrix representing a unitary transformation, where I is the identity matrix. The inverse of a unitary matrix satisfies A⁻¹ = A†. Multiplication of a vector by a unitary matrix preserves length and can be regarded as a generalized rotation in the space spanned by the eigenvectors of A.

Discrete Linear Transformations

A discrete linear transformation, represented by the transformation matrix A, can be represented as a vector-matrix product

    r = As.

When the length of the output vector r is equal to the length of the input vector s, the matrix A is a square matrix.

A discrete linear transformation corresponding to a matrix A is characterized by a finite set of eigenvectors {ej} and a corresponding finite set of eigenvalues {λj}, possibly complex, such that

    A ej = λj ej.

The eigenvectors are always orthogonal when the eigenvalues are distinct, and can be constrained to be orthogonal even when the eigenvalues are not distinct. The eigenvectors then form a basis. The eigenvalues are the zeros of det(A − λI) = 0, which is a polynomial in λ of degree n.

Projections   The set of orthonormal eigenvectors {ej} of a hermitian matrix representation of a self-adjoint transformation can be used as a set of orthonormal basis vectors. The inner product of an orthonormal basis vector with itself is equal to one. The outer product of an orthonormal basis vector with itself is a matrix Pj = ej ej† with only a single nonzero element equal to one, which is on the diagonal of the matrix. This matrix is referred to as a projection matrix Pj, where the subscript denotes that the projection is onto the jth orthonormal basis vector ej.

For a given basis, the sum of all projection matrices Pj equals the identity matrix I in the signal space spanned by that basis. Thus

    I = ∑_j Pj = ∑_j ej ej†.    (1.1.84)


The transformation A expressed by a hermitian matrix can be re-expressed in terms of its eigenvalues λj and its projection matrices Pj. This is a diagonal matrix D given by

    D = ∑_j λj Pj = diag(λ1, . . . , λn) = ∑_j ej λj ej†.    (1.1.85)

The same transformation A can also be re-expressed using yet another basis {xm}. Each eigenvector ej in the new basis is given as ej = ∑_n an xn. The matrix X in the new basis is

    X = ∑_j λj ej ej†
      = ∑_{m,n} ∑_j λj an am* xn xm†
      = ∑_{m,n} Xmn xn xm†,    (1.1.86)

where the matrix X has elements Xmn = ∑_j λj an am*. This equation states that any hermitian matrix can be expressed as a linear combination of the outer products of the basis vectors that are used to represent A. If the basis consists of the set of eigenvectors of A, then (1.1.86) reduces to (1.1.85).
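The decomposition (1.1.85) can be verified numerically for a random hermitian matrix; the sketch below (illustrative, using numpy.linalg.eigh) rebuilds the matrix from its eigenvalues and projection matrices and also checks that the trace equals the sum of the eigenvalues.

```python
# Rebuild a hermitian matrix from its eigenvalues and projection matrices, as in (1.1.85).
import numpy as np

rng = np.random.default_rng(5)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (B + B.conj().T) / 2                      # a random hermitian matrix

eigvals, eigvecs = np.linalg.eigh(A)          # columns of eigvecs are orthonormal eigenvectors
A_rebuilt = sum(lam * np.outer(v, v.conj())   # sum_j lambda_j e_j e_j†
                for lam, v in zip(eigvals, eigvecs.T))

print(np.allclose(A, A_rebuilt))                   # True
print(np.isclose(np.trace(A), np.sum(eigvals)))    # trace equals the sum of eigenvalues
```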

Commuting Transformations   Two square matrices A and B of the same size commute if AB = BA. A matrix A that commutes with A† is called a normal matrix. Then AA† = A†A. Two matrices with a common set of eigenvectors commute. For this case, the order in which the transformations are applied, AB or BA, does not affect the outcome. Two transformations that are represented by matrices comprise a relationship known as a commutator, defined as

    [A, B] ≐ AB − BA.    (1.1.87)

Two matrices that do not commute do not have a common set of eigenvectors, the commutator [A, B] is nonzero, and the order in which the transformations are applied does affect the outcome. Two square matrices A and B that do not commute can always be embedded in two larger square matrices that do commute by appending additional rows and columns. The proof of this statement is asked as an exercise at the end of the chapter.

Singular Value Decomposition   A transformation described by a matrix need not have real eigenvalues, and the eigenvectors need not be orthogonal. A useful decomposition of the matrix A, called the singular-value decomposition, is

    A = U M V†.    (1.1.88)

The matrices U and V are each unitary. The columns of U are the eigenvectors of AA†, whereas the columns of V are the eigenvectors of A†A. All of the nonzero elements of the matrix M are among the diagonal elements mk. These are called the singular values of A. These values are the nonnegative square roots √ξk of the eigenvalues ξk of the hermitian matrix AA†, so that ξk = |mk|². For a hermitian matrix, because A† is equal to A, U is equal to V and the matrix is diagonalized with the orthogonal eigenvectors of A.
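A short numerical illustration of (1.1.88), offered as a sketch using numpy.linalg.svd (note that numpy returns V† directly as its third output):

```python
# Singular-value decomposition A = U M V† and the relation between singular values
# and the eigenvalues of A A†.
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

U, m, Vh = np.linalg.svd(A)            # m holds the singular values; Vh is V†
M = np.diag(m)

print(np.allclose(A, U @ M @ Vh))      # A = U M V†

xi = np.linalg.eigvalsh(A @ A.conj().T)              # eigenvalues of A A† (real, nonnegative)
print(np.allclose(np.sort(m ** 2), np.sort(xi)))     # squared singular values equal xi_k
```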


1.2 Random Signals

Probability and statistics are important topics for the subject of communications because the information that is conveyed is always random and a communications channel is always noisy. To introduce a quantitative discussion of a random variable, consider the random thermally-generated voltage across a resistor as a function of time. Because the voltage is random, the specific measured waveform is only one of an infinite number of waveforms that could have been measured. Each possible waveform is called a sample function, or realization. The collection of all possible realizations defines a random process.[k] Four sample functions of the voltage waveform are shown in Figure 1.5. A random process is described in terms of its amplitude structure and its time structure.

Figure 1.5: (a) Four realizations of a random voltage waveform. The slice across all possible realizations at a time t1 defines the random variable v(t1) with a corresponding probability density function fv(v) shown in (b).

In Section 1.2.1, the amplitude structure of a random process at one time instant is described using a probability distribution function. In Section 1.2.2, the temporal (or spatial) structure of a random process is described using correlation functions and power density spectra.

1.2.1 Probability Distribution Functions

Consider a slice at a fixed time t1 through all possible realizations of the random voltage shown in Figure 1.5a. The voltage at time t1 is a random variable, and as such is denoted by the underlined symbol v̲. A specific instance, or realization, of the random variable v̲ is denoted by v. If the sample function v(t) represents a complex random function of time, such as a complex-baseband signal, then the random variable v̲(t1) defined at time t1 is a complex random variable v̲ consisting of a real part and an imaginary part.

Associated with any discrete random variable is a probability distribution function called a probability mass function, denoted by px(x) or p(x).[l] Associated with any real continuous random variable x̲ is a cumulative probability distribution function (cdf), Fx(x), and a probability density function, fx(x). The cumulative probability distribution function is defined as

    Fx(x) ≐ Pr{x̲ ≤ x},

[k] This is also called a stochastic process.
[l] The underlined subscript on p denoting the random variable is sometimes omitted for brevity.


where Pr{x̲ ≤ x} is the probability that the random variable x̲ is less than or equal to x. Every cumulative probability distribution function is nonnegative, monotonically increasing, and goes to one as x goes to infinity.

The probability density function fx(x) is related to the cumulative probability distribution function Fx(x) by

    fx(x) ≐ (d/dx) Fx(x),    (1.2.1)

provided the derivative exists.

The probability density function fx(x) is a nonnegative real function that integrates to one. The definite integral

    Pr{x1 < x̲ < x2} = ∫_{x1}^{x2} fx(x) dx

is the probability that the random variable x̲ lies in the interval between x1 and x2.

The statistical expectation, or the expected value, of the random variable x̲ is defined as

    ⟨x̲⟩ ≐ ∫_{-∞}^{∞} x fx(x) dx.    (1.2.2)

The expectation of x̲ is also called the mean or the first moment of the random variable x̲. In a similar way, the expected value of a function g(x̲) of the random variable x̲ is

    ⟨g(x̲)⟩ = ∫_{-∞}^{∞} g(x) fx(x) dx,    (1.2.3)

provided the integral exists. The nth moment of x̲ is defined as

    ⟨x̲ⁿ⟩ ≐ ∫_{-∞}^{∞} xⁿ fx(x) dx,    (1.2.4)

provided the integral exists. In order for the nth moment to exist, the probability density function must decrease faster than x⁻ⁿ as x goes to infinity.

The variance σx² of the random variable x̲ is defined as (cf. (1.1.30))

    σx² ≐ ⟨(x̲ − ⟨x̲⟩)²⟩
        = ⟨x̲²⟩ − 2⟨x̲⟩⟨x̲⟩ + ⟨x̲⟩²
        = ⟨x̲²⟩ − ⟨x̲⟩².    (1.2.5)

The variance measures the spread of the random variable x̲ about the mean ⟨x̲⟩. The square root σx of the variance of the random variable x̲ is called the standard deviation of x̲. This is the root-mean-squared (rms) value of x̲ if x̲ has zero mean.

As an example, consider the gaussian probability density function (cf. (1.2.18)), which will be discussed in detail later in this section. An important property of a gaussian random variable is that all moments of the probability density function of an order larger than two can be expressed


in terms of the first-order and second-order moments.[m] Therefore, the gaussian distribution is completely characterized by its mean and variance.

As another example, consider the probability density function

    fx(x) =  λ x^{−(λ+1)}  for x ≥ 1
             0             for x < 1,    (1.2.6)

which is called the Pareto probability density function with a Pareto index λ that is a positive number. It is asked as a problem at the end of the chapter to show that the mean is λ/(λ − 1) for λ > 1 and otherwise is infinite, and to show that the variance is λ/[(λ − 1)²(λ − 2)] for λ > 2 and otherwise is infinite.

Joint Probability Distributions   A probability density function can be defined for more than one random variable. The probability density function for two random variables fx,y(x, y) is called a bivariate probability density function or a joint probability density function. The probability that the joint event {x1 < x̲ < x2} ∩ {y1 < y̲ < y2} occurs is equal to the volume under fx,y(x, y) over the rectangle supported by the two corners (x1, y1) and (x2, y2).

Several other probability density functions can be defined in terms of the joint probability density function fx,y(x, y). Every joint probability density function fx,y(x, y) is associated with marginal probability density functions and conditional probability density functions. The marginal probability density functions fx(x) and fy(y) are determined by integrating fx,y(x, y) over the range of the other variable, a process called marginalization. Thus,

    fy(y) = ∫_{-∞}^{∞} fx,y(x, y) dx    (1.2.7a)
    fx(x) = ∫_{-∞}^{∞} fx,y(x, y) dy.    (1.2.7b)

Substituting the marginal (1.2.7a) into (1.2.2) gives

    ⟨y̲⟩ = ∫_{-∞}^{∞} ∫_{-∞}^{∞} y fx,y(x, y) dx dy

as the mean for y̲.

The probability density function of y̲ given that the event x̲ = x has occurred is called the conditional probability density function. It is denoted by fy|x(y|x) and is given by

    fy|x(y|x) = fx,y(x, y) / fx(x),

where the notation y|x indicates that the probability density function of y̲ depends on, or is conditioned by, the event x̲ = x. Likewise, the conditional probability density function denoted by fx|y(x|y) is the probability density function of x̲ given that the event y̲ = y has occurred.

[m] This statement is called Isserlis' theorem. See Reed (1962).


The joint probability density function of the events x̲ = x and y̲ = y is equal to the probability density function of the event x̲ = x multiplied by the conditional probability density function of the event y̲ = y | x̲ = x, so that

    fx,y(x, y) = fx(x) fy|x(y|x).    (1.2.8)

Similarly, fx,y(x, y) = fy(y) fx|y(x|y). Equating these two expressions gives

    fx|y(x|y) = fx(x) fy|x(y|x) / fy(y),    (1.2.9)

which is a form of Bayes rule.[n] When x represents a transmitted signal value and y represents the value of the received signal, the term on the left is called the posterior probability density function. It is the probability density that x was transmitted given that y is received.

The marginal probability density function fy(y) can be expressed in terms of the conditional density function fy|x(y|x) and the other marginal probability density function fx(x) by integrating both sides of (1.2.8) with respect to x and using (1.2.7) to give

    fy(y) = ∫_{-∞}^{∞} fx(x) fy|x(y|x) dx.    (1.2.10)

Correlation and Independence   The random variables x̲ and y̲ are independent, and fx,y(x, y) is called a product distribution, if the joint probability density function fx,y(x, y) can be written as fx,y(x, y) = fx(x) fy(y). If two random variables are independent, then knowing the realization of one random variable does not affect the probability density function of the other random variable. This means that fy|x(y|x) = fy(y) for independent random variables x̲ and y̲.

If x̲ and y̲ are independent, then the probability density function of the sum z̲ = x̲ + y̲ of the random variables is

    fz(z) = ∫_{-∞}^{∞} fx(z − y) fy(y) dy = fx(z) ~ fy(z),    (1.2.11)

where ~ is the convolution operator defined in (1.1.3). The derivation of this equation is discussed in an exercise at the end of the chapter.

The correlation of two real random variables x̲ and y̲ is the expected value of their product. Then

    ⟨x̲ y̲⟩ = ∫_{-∞}^{∞} ∫_{-∞}^{∞} x y fx,y(x, y) dx dy.    (1.2.12)

The covariance cov is defined as

    cov ≐ ⟨x̲ y̲⟩ − ⟨x̲⟩⟨y̲⟩.    (1.2.13)

If at least one of the two random variables has zero mean, then cov = ⟨x̲ y̲⟩. Two random variables x̲ and y̲ are uncorrelated if ⟨x̲ y̲⟩ = ⟨x̲⟩⟨y̲⟩ in (1.2.12), or if cov = 0 in (1.2.13). If random variables x̲ and y̲ are uncorrelated and at least one has zero mean, then ⟨x̲ y̲⟩ = cov = 0.

[n] Bayes rule is simply an immediate consequence of the definition of marginal and conditional distributions.


Characteristic Functions   The expected value of the function e^{iωx̲} of the random variable x̲ that has a probability density function fx(x) is called the characteristic function Cx(ω):

    Cx(ω) = ⟨e^{iωx̲}⟩ = ∫_{-∞}^{∞} fx(x) e^{iωx} dx
          = ∑_{n=0}^{∞} ((iω)ⁿ/n!) ⟨x̲ⁿ⟩,    (1.2.14)

provided the moments exist, where the power-series expansion for e^x has been used in the last expression.

For a discrete random variable with a probability mass function p(k) on the integers, the characteristic function is

    Ck(ω) = ∑_{k=−∞}^{∞} e^{iωk} p(k).    (1.2.15)

Using the convolution property of the Fourier transform given in (1.1.14), the characteristic function Cz(ω) of the probability density function fz(z) for the sum of two independent random variables x̲ and y̲ (cf. (1.2.11)) is equal to the product Cx(ω)Cy(ω) of the characteristic functions of the two probability density functions for the two random variables. The inverse Fourier transform of this product yields the desired probability density function

    fz(z) = (1/2π) ∫_{-∞}^{∞} Cx(ω) Cy(ω) e^{−iωz} dω.    (1.2.16)

This expression is readily extended to multiple independent random variables.

The nth moment of a probability density function, if it exists, can be determined by differentiation of the characteristic function:

    ⟨x̲ⁿ⟩ = (1/iⁿ) (dⁿ/dωⁿ) Cx(ω) |_{ω=0}.    (1.2.17)

The derivation of this expression is assigned as a problem at the end of the chapter.

Probability Density Functions used in Communication Theory

Probability density functions and probability mass functions are regularly encountered in the anal-ysis of communication systems. The most relevant probability density functions for the analysis ofdigital communication systems are reviewed in this section.

Gaussian Probability Density Function A gaussian random variable has a gaussian probabilitydensity function defined by

fx(x).=

1√2πσ

e−(x−m)2/2σ2. (1.2.18)

27

Page 32: Math Notes for ECE 278 - web.eng.ucsd.edu

It is easy to verify that the mean of x is m and the variance is σ2. A unique property of gaus-sian random variables is that any weighted sum of multiple gaussian random variables, whetherindependent or dependent, is also a gaussian random variable.The probability that a unit-variance gaussian random variable exceeds a value z, Prx > z is

expressed in terms of the complementary error function, which is denoted by erfc and defined aso

1√2π

∫ ∞z

e−x2/2dx =

1

2erfc

(z√2

), (1.2.19)

where erfc(z) = 1−erf(z) with erf(z) being the error function, defined as

erf(z).=

2√π

∫ z

0e−s

2ds.

For large arguments, the complementary error function can be approximated by

erfc(x) ≈ 1

x√πe−x

2. (1.2.20)

A multivariate gaussian probability density function is a joint probability density function for ablock x of real random variables with components xi given by

fx(x) =1√

(2π)N detCe−

12 (x−〈x〉)

TC−1(x−〈x〉), (1.2.21)

where C is the real covariance matrix defined for any multivariate probability density function as

C =⟨(x− 〈x〉) (x− 〈x〉)T

⟩. (1.2.22)

The square matrix C has a determinant detC. The diagonal matrix element Cii is the varianceof the random variable xi. The off-diagonal matrix element Cij is the covariance (cf. (1.2.13)) ofthe two random variables xi and xj . These two elements are uncorrelated if Cij equals zero. Ingeneral, this need not be a strong statement, but for gaussian random variables, it means that theyare independent.It is possible to have a joint probability density function such that each marginal density function

is a gaussian probability density function, yet the joint probability density function is not jointlygaussian, and so is not given by (1.2.21). This means that knowing that each marginal probabilitydensity function is gaussian is not sufficient to infer that the joint probability density function isjointly gaussian. This is discussed in an end-of-chapter exercise.A zero-mean bivariate gaussian random variable consists of two random, zero-mean gaussian

components x and y, which may be correlated. The covariance matrix given in (1.2.22) is

C =

[σ2x ρxyσxσy

ρxyσxσy σ2y

], (1.2.23)

where

ρxy.=

⟨xy⟩/σxσy, (1.2.24)

28

Page 33: Math Notes for ECE 278 - web.eng.ucsd.edu

(a) (b)

x

x’

y’

y

Figure 1.6: Contours of the joint gaussian probability density function fx,y(x, y) as a function of thecorrelation coefficient ρxy: (a) ρxy = 0, (b) ρxy = 0.5.

is defined as the correlation coefficient . An example of a two-dimensional joint gaussian probabilitydensity function is shown in plan view in Figure 1.6.If σx = σy = σ, then (1.2.21) reduces to

fx,y(x, y) =1

2πσ2√

1− ρ2xyexp

(−x

2 − 2ρxyxy + y2

2σ2(1− ρ2xy)

). (1.2.25)

Moreover, if ρxy = 0, then fx,y(x, y) is a product distribution in the chosen coordinate system. Forthis case, the bivariate gaussian density function is called a circularly-symmetric density functionwith the bivariate gaussian random variable called a circularly-symmetric gaussian random variable.The joint gaussian probability density function fx,y(x, y), now also including a nonzero mean foreach component, can then be written as

fx,y(x, y) =

[1√2πσ

e−(x−〈x〉)2/2σ2

] [1√2πσ

e−(y−〈y〉)2/2σ2

], (1.2.26)

where the probability density function of each component is written in the form of (1.2.18). There-fore, uncorrelated gaussian random variables are independent.For a set of N independent real gaussian random variables with C = σ2IN , the joint probability

density function is

fx(x) =1√

(2πσ2)Ne−(x−〈x〉)

2/2σ2, (1.2.27)

which factors as a product of the N single variable gaussian densities. A complex version of (1.2.27)is given in (1.2.33).A covariance matrix is a hermitian matrix and can be diagonalized by a change of basis. Therefore,

any multivariate gaussian probability density function has a basis for which the probability densityfunction expressed in this basis is a product distribution. The resulting marginal gaussian randomvariables in this basis are independent, but need not have the same mean and variance.

oThis probability is also expressed using the equivalent function Q(x).= 1

2erfc(z/

√2).

29

Page 34: Math Notes for ECE 278 - web.eng.ucsd.edu

For example, consider the two-dimensional gaussian probability density function given in (1.2.25)with diagonal elements σ2x = σ2y = σ2 and off-diagonal elements σ2ρxy. Define a new basis (x′, y′)that is a rotation of the original basis (x, y). The components in the new basis for this example canbe expressed by a unitary transformation R of components in the original basis as given by[

x′

y′

]= R

[xy

].

The matrix R is generated from the normalized eigenvectors of the covariance matrix C given in(1.2.23) and satisfies the matrix equation

RTCR = D,

where D is a diagonal matrix with diagonal elements given by the eigenvalues of C. Using theseeigenvalues, the variances of the uncorrelated gaussian random variables in this new basis are σ2x′ =σ2(1 + ρxy) and σ2y′ = σ2(1 − ρxy), which can be equal only if ρxy = 0. Using the normalizedeigenvectors of C, the components of the new basis are x′ = 1√

2(x + y) and y′ = 1√

2(x − y). The

joint gaussian probability density function in the new basis is a product distribution given by

f(x′, y′) =

[1√

2πσ2(1 + ρxy)e−x

′2/2σ2(1+ρxy)

][1√

2πσ2(1− ρxy)e−y

′2/2σ2(1−ρxy)

],

which is written to show that each marginal probability density function in the new basis is anindependent gaussian probability density function of the form of (1.2.18) with each distributionhaving a different variance.

Complex Gaussian Random Variables and Vectors A complex gaussian random variable z = x+iyhas components x and y described by a real bivariate gaussian random variable (x, y). A complexgaussian random vector denoted as z = x+iy has vector components zk described by a real bivariategaussian random variable (xk, yk).The multivariate complex gaussian probability density function describing a complex gaussian

random vector z is

fz(z) =1

πN detWe−(z−〈z〉)

†W−1(z−〈z〉), (1.2.28)

where

W .= 〈(z− 〈z〉) (z− 〈z〉)†〉, (1.2.29)

is the complex covariance matrix with † denoting the complex conjugate transpose. Using theproperties of determinants, the leading term (πN detW)−1 in (1.2.28) can be written as det(πW)−1.

Circularly-Symmetric Complex Gaussian Random Variables and Vectors A complex gaussianrandom variable z = x + iy with independent,zero-mean components x and y of equal variance isa complex circularly-symmetric gaussian random variable. The corresponding probability density

30

Page 35: Math Notes for ECE 278 - web.eng.ucsd.edu

function is called a complex circularly-symmetric gaussian probability density function. A circularly-symmetric complex gaussian random variable has the property that eiθz has the same probabilitydensity function for all θ.Generalizing, a complex, jointly gaussian random vector z = x+ iy is circularly symmetric when

each vector component eiθzk has the same probability density function for all θ. The multivariatecomplex gaussian probability density function for a circularly-symmetric gaussian random vector zis determined by setting 〈z〉 equal to zero in (1.2.28) and gives

fz(z) =1

πN detWe−z

†W−1z, (1.2.30)

with

W .= 〈zz†〉 (1.2.31)

being the autocovariance matrix W.To relate the covariance matrix W for a complex random vector to the covariance matrix C for

a real random vector, define z as a vector of N complex circularly-symmetric gaussian randomvariables with a complex covariance matrix W given in (1.2.29). Define x as a real vector oflength 2N that consists of the real part Re[z] and the imaginary part Im[z] in the order x =Re[z1], ...,Re[zN ], Im[z1], ..., Im[zN ]. The real 2N × 2N covariance matrix C given in (1.2.22) isexpressed in block form as

C =1

2

[ReW −ImWImW ReW

], (1.2.32)

in terms of the N ×N complex covariance matrix W.As an example, the covariance matrix of a set of N uncorrelated circularly-symmetric complex

gaussian random variables of equal variance has the form W = 2σ2IN where IN is the N by Nidentity matrix. Using det(2σ2IN) = (2σ2)N , the joint probability density function given in (1.2.28)reduces to

fz(z) =1

(2πσ2)Ne−|z−〈z〉|

2/2σ2, (1.2.33)

which separates into a product of N terms.

Rayleigh and Ricean Probability Density Functions Several useful random variables and theirassociated probability density functions are generated from the square root of the sum of the squaresr =

√x2 + y2 of independent gaussian random variables x and y, usually with equal variances and

possibly with nonzero means. The random amplitude r has a geometrical interpretation as thelength of a random vector with components x and y that are gaussian-distributed. The probabilitydensity function of the amplitude can be determined by transforming the joint probability densityfunction fxy(x, y) from rectangular coordinates, which is given in (1.2.26), into the correspondingjoint probability density function frφ(r, φ) in polar coordinates. The marginal probability densityfunction fr(r) is generated by integrating out the functional dependence of φ.

31

Page 36: Math Notes for ECE 278 - web.eng.ucsd.edu

(a)

(b)

fr(r) fr(r) fr(r)

fφ(φ) fφ(φ) fφ(φ)

rrr

−π π2

− 0 π2

π −π π2

− 0 π2

π −π π2

− 0 π2

π

Figure 1.7: (a) The marginal probability density function of the amplitude fr(r) given in (1.2.35).The expected value A increases from left to right, (b) the marginal probability densityfunctions for the phase fφ(φ) given in (1.2.37). The leftmost set of plots is for A = 0.

Let A =√〈x〉2 + 〈y〉2 be the expected amplitude where 〈x〉2 and 〈y〉2 are the squares of the

means of the independent gaussian random variables x and y, both with variance σ2. Now makethe change of variables, 〈x〉 = A cos θ, 〈y〉 = A sin θ, dxdy = dr rdθ, r =

√x2 + y2, x = r cosφ, and

y = r sinφ. Then redefine φ with respect to the phase θ of the constant amplitude signal so thatθ − φ is replaced by φ. Using these substitutions and standard trigonometric identities, (1.2.26)becomes

frφ(r, φ) =r

2πσ2e−(r

2−2Ar cosφ+A2)/2σ2. (1.2.34)

The marginal probability density function fr(r) is

fr(r) =r

2πσ2e−(r

2+A2)/2σ2

∫ 2π

0eAr cosφ/σ

2dφ.

The integral can be expressed in terms of the modified Bessel function of the first kind of orderzero.p Using the change of variable x = Ar/σ2, the probability density function of the amplitude ris

fr(r) =r

σ2e−(r

2+A2)/2σ2I0

(A r

σ2

)r ≥ 0. (1.2.35)

This probability density function is known as the ricean probability density function and character-izes a ricean random variable. As the ratio of A/σ becomes large, the ricean probability density

pThe modified Bessel function of the first kind of order ν is defined as Iν(x).= 1

∫ π−π e

x cos θ cos(νθ)dθ. The orderν can be an integer or a half integer.

32

Page 37: Math Notes for ECE 278 - web.eng.ucsd.edu

function begins to resemble a gaussian probability density function with the same mean and vari-ance. For A = 0, the probability density function reduces to

fr(r) =r

σ2e−r

2/2σ2r ≥ 0, (1.2.36)

which is known as a rayleigh probability density function that characterizes a rayleigh random vari-able. This probability density function has mean σ

√π/2 and variance σ2(2− π/2).

Plots of the ricean and the rayleigh probability density functions are shown in Figure 1.7a.The marginal probability density function of the phase φ can be obtained by integrating (1.2.34)

over r with the variables changed as r′ = r/√

2σ, F = A2/2σ2, and B =√F cosφ. Completing the

square in the exponent, and factoring produces

fφ(φ) =1

πe(B

2−F )

∫ ∞0

r′e−(r′−B)2dr′.

To evaluate the integral, use the second change of variables R = r′ −B to yield∫ ∞0

r′e−(r′−B)2dr′ =

∫ ∞−B

(R+B)e−R2dR

=1

2

(e−B

2+B√π(1 + erf(B)

)).

Substituting this expression back into the expression for fφ(φ) gives

fφ(φ) =1

(e−F +

√πF cosφe−F sin2 φ

(1 + erf

(√F cosφ

))). (1.2.37)

The function fφ(φ) is a zero-mean, even, periodic function with a period T = 2π that integratesto one over one period for any value of F . If F = 0, then fφ(φ) = 1/2π and the marginal probabilitydensity function of the phase is a uniform probability density function over [−π, π). If F 6= 0, thenfφ(φ) becomes “peaked” about φ = 0 with the width of the probability density function inverselyrelated to F . These effects are shown in Figure 1.7b.For F 1 and φ ≈ 0, the approximation erf(

√F cosφ) ≈ 1 holds. Using this expression, setting

cosφ ≈ 1, sin2 φ ≈ φ2, and neglecting the first term in (1.2.37) as compared to the second termgives

fφ(φ) ≈√F

πe−Fφ

2. (1.2.38)

This is a zero-mean gaussian probability density function with variance 1/2F = σ2/A2. Thisform is evident in Figure 1.7c. While this approximation is defined only over −π ≤ φ < π, itis sometimes a mathematically expedient approximation when the variance is small to extend therange to −∞ ≤ φ <∞ because then the value of fφ(φ) is negligible outside the interval −π ≤ φ < π.

33

Page 38: Math Notes for ECE 278 - web.eng.ucsd.edu

Exponential and Gamma Probability Density Functions The central chi-square random variablewith two degrees of freedom is known as an exponential random variable with an associated expo-nential probability density function. The general form of an exponential probability density functionis

f(z) = µe−µz, (1.2.39)

with mean µ−1 and variance µ−2. The sum of k independent, identically-distributed exponentialrandom variables, each with a mean µ−1, is a gamma random variable with a probability densityfunction given by

f(z) = µΓ(k)−1 (µz)k−1 e−µz, (1.2.40)

where k > 0 and Γ(k) is the gamma function. The mean of the gamma probability density functionis kµ−1 and the variance is kµ−2. Using the substitutions k = N/2 and µ = 1/2σ2, the gammaprobability density function is equal to a central chi-square probability density function with Ndegrees of freedom. Plots of the gamma probability density function of several pairs (µ, k) areshown in Figure 1.8.

x

p(x)

(a)

0 1 2 3 4 50.0

0.2

0.4

0.6

0.8

1.0(b)

(μ=1, k=1)

(μ=2, k=2)

0 10 20 30 40 50 60 700.00

0.02

0.04

0.06

0.08

0.10

0.12

0.14

x

p(x)

(μ=1, k=5)

(μ=1, k=30)(μ=1, k=50)

Figure 1.8: (a) Plot of an exponential probability density function with µ = k = 1 and a gammaprobability density function with µ = k = 2. (b) For a fixed value of µ, as k increases, thegamma probability density function approaches a gaussian probability density function.

The Central Limit Theorem A random variable that describes an event may arise as the sumof many independent random variables xi for repeated instances of some underlying constituentevent. If the variances of the random variables xi are finite and equal, then the probability densityfunction px(x) for the normalized sum x = 1√

N

∑Ni (xi−〈x〉) as N goes to infinity will usually tend

towards a gaussian probability density function with mean 〈x〉 irrespective of the functional formof the probability density functions pxi(xi) of the individual constituent events. When the densityfunctions are all the same, the formal statement is called the central limit theorem. The centrallimit theorem explains why the gaussian probability density function and its variants are ubiquitousin statistical analysis.As an example of the application of the central limit theorem, consider the probability density

function of the normalized sum of N independent and identically distributed (IID) complex ran-dom variables Aie

iφi added to a constant Aeiθ, where Ai and φi are zero-mean, independent, and

34

Page 39: Math Notes for ECE 278 - web.eng.ucsd.edu

identically-distributed random variables, and the probability density function of φiis uniform over

[0, 2π). The resulting normalized sum is a complex random variable written as

S =1√N

N∑i=1

(Aie

iφi −Aeiθ

)This can be written as S = x+iy where x = 1√

N

∑Ni=1(Ai cosφ

i−A cos θ) and y = 1√

N

∑Nk=i(Ai sinφ

i−

A sin θ). In the limit as N goes to infinity, asserting the central limit theorem yields a joint prob-ability density function fx,y(x, y) for S that is a circularly-symmetric gaussian probability densityfunction centered on the constant Aeiθ. This is shown schematically in Figure 1.9. Although the

Many independentcontributions

Sum of independentrandom vectors

Joint gaussian pdf

Real Axis (in-phase)Real Axis (in-phase)

Imag

inar

y A

xis

(qua

drat

ure)

Imag

inar

y A

xis

(qua

drat

ure)

Figure 1.9: The limit of the sum of many independent random vectors superimposed on a constantsignal is a circularly-symmetric gaussian probability density function centered on theconstant Aeiθ.

central limit is quite powerful, the convergence to a gaussian distribution is not complete for anyfinite number of summand random variables. This means that calculations of small probabilities ofevents by using the central limit theorem to validate the use of a gaussian distribution may not bevalid.

1.2.2 Random Processes

Returning to the random voltage measurement shown in Figure 1.5, the probability density functionof the random voltage defined at a single time instant is a first-order probability density function.To study the time structure of the random process at two time instants, consider a joint probabilitydensity function f(v1, v2; t1, t2) that describes the joint probability that the voltage v1 is measuredat time t1 and the voltage v2 is measured at time t2. This probability density function is called asecond-order probability density function because it relates two, possibly complex, random variablesat two different times.The correlation of two continuous random variables defined from the same random process at two

different times t1 and t2 defines the autocorrelation function

R(t1, t2).= 〈v1v∗2〉 =

∫ ∞−∞

∫ ∞−∞

v1v∗2f(v1, v2; t1, t2)dv1dv2. (1.2.41)

35

Page 40: Math Notes for ECE 278 - web.eng.ucsd.edu

The covariance function is defined as C(t1, t2) = R(t1, t2)− 〈v1〉〈v2〉 (cf. (1.2.13)). Similar expres-sions in time and space are defined for electromagnetic fields in Section 1.3.3. There, the temporalproperties of a random electromagnetic field are characterized by a temporal coherence functionthat is an extension of the autocorrelation function to electromagnetic fields.Higher-order joint probability density functions can be defined in a similar way using n different

times. The collection, or ensemble, of all sample functions that could be measured, which areshown schematically in Figure 1.5, along with all of the nth-order probability density functions forall values of n, completely specifies a general random process.A gaussian random process is a random process for which every nth-order probability density

function is jointly gaussian. If a gaussian random process is transformed or filtered by a linearsystem, then the output at any time t is a weighted superposition of gaussian random variablesand so is also a gaussian random variable. Accordingly, filtering a gaussian random process pro-duces another gaussian random process, with the filtering affecting the mean, the variance, and thecorrelation properties, but not the gaussian form of the random process.

Stationarity and Ergodicity

The analysis of a random process is simplified whenever some or all of the probability densityfunctions are shift-invariant. When the first-order and the second-order probability density functionsare time-invariant, then the mean is independent of the time t1, and the autocorrelation functiondepends only on the time difference τ = t2 − t1 so that R(t1, t2) = R(t2 − t1, 0) = R(τ, 0). Whenthe autocorrelation function is time-invariant, it is written as R(τ).Random processes for which the first-order and second-order probability density functions are

time-invariant are called stationary in the wide sense. In this case, the subscripts are droppedbecause the mean and the mean-squared value defined by (1.2.2) and (1.2.5) are now independentof time. If all probability density functions that can be defined for a random process are time-invariant, then the process is called strict-sense stationary .Often, every sample function of a stationary random process contains the same statistical informa-

tion as every other sample function. If the statistical moments of the ensemble can be constructedfrom the temporal moments of a single sample function, then the random process is called ergodic.In particular, the expectation can be replaced with a time average over a single sample function

v(t)

〈v〉 = v.= lim

T→∞

1

2T

∫ T

−Tv(t)dt

for an ergodic random process.

Power Density Spectrum

The power density spectrum S(f) can be regarded as the density of the power per infinitesimalfrequency interval at the frequency f . For a wide-sense stationary random process s(t), the powerdensity spectrum can be derived by defining a temporal function sT (t) with finite support given bysT (t)

.= s(t)rect(t/2T ) with a Fourier transform UT (f). Convolving sT (t) and s∗T (t) and using the

36

Page 41: Math Notes for ECE 278 - web.eng.ucsd.edu

convolution property of the Fourier transform gives∫ ∞−∞

sT (τ)s∗T (t− τ)dτ ←→ |UT (f)|2 .

Take the expectation of each side and divide by T . Then because the expectation is linear, we canwrite

1

T

∫ ∞−∞

⟨sT (τ)s∗T (t− τ)

⟩dτ ←→ 1

T

⟨|UT (f)|2

⟩.

Now define RT (τ).= 〈sT (τ)s∗T (t− τ)〉 as the autocorrelation function of the finite duration random

process sT (t). In the limit as T goes to infinity, this becomes

limT→∞

1

T

∫ ∞−∞

RT (τ)dτ ←→ limT→∞

1

T|UT (f)|2 . (1.2.42)

The Wiener-Khintchine theorem states that the left side is the autocorrelation function R(τ) of s(t)with the right side defined as the two-sided power density spectrum. This means that autocorrelationfunction R(τ) and the power density spectrum S(f) are a Fourier transform pair given by

S(f) =

∫ ∞−∞

R(τ)e−i2πfτdτ (1.2.43a)

R(τ) =

∫ ∞−∞S(f)ei2πfτdf. (1.2.43b)

If the autocorrelation function R(τ) is a real and even function, then from the properties of theFourier transform (cf. (1.1.19)), the power density spectrum is real and even as well. For this case,a one-sided power density spectrum can be defined for positive f with a value that is twice that ofS(f). The relationship between the signal power and the two-sided power density spectrum is givenby

P =

∫ ∞−∞S(f)df. (1.2.44)

Coherence Time

The coherence time τc defined as

τc.=

1

|R(0)|2∫ ∞−∞|R(τ)|2 dτ (1.2.45)

is a measure of the width of the magnitude of the autocorrelation function of a stationary randomprocess. For a random electromagnetic field (cf. (1.3.3)), the coherence time is the width of thetemporal coherence function, which is discussed in Section 1.3.3.Within a coherence interval , defined as any interval of duration τc, the values of the random

process are highly correlated. This means that over a coherence interval, the random process canbe approximated as unchanging and described by a single random variable. A random process can

37

Page 42: Math Notes for ECE 278 - web.eng.ucsd.edu

then be approximated as a sequence of random variables with each random variable defined over acoherence interval.The reciprocal quantity Bc = 1/τc of the coherence time is the effective bandwidth, and also

quantifies the number of coherence intervals per unit time. The effective bandwidth can be writtenas

Bc =|R(0)|2∫∞

−∞ |R(τ)|2 dτ=

∣∣∣∫∞−∞ S(f)df∣∣∣2∫∞

−∞ |S(f)|2 df, (1.2.46)

as a consequence of Parseval’s relationship.The concepts of autocorrelation function and power density spectrum are illustrated by determin-

ing the autocorrelation function and power density spectrum for an example of a random processcalled the random telegraph signal. A sequence of random and independent nonoverlapping pulses,each transmitted within a time interval T and of duration T . For each pulse interval, the probabilityof transmitting amplitude 0 or A is equiprobable. This random process is written as

s(t) =

∞∑n=−∞

Anrect

(t− nT − j

T

).

where j is an offset time described by uniformly-distributed random variable over [0, T ]. Threerealizations with different random offset times are shown in Figure 1.10.

j1

j2

j3

Figure 1.10: Three possible realizations of a random binary waveform consisting of a random se-quence of marks of amplitude A and spaces of amplitude zero offset by a randomvariable j.

For any realization s(t) and for τ > T , the pulses are independent because they are generatedfrom independent bits. Therefore, the autocorrelation function R(τ) in this region for equally likelypulses with p = 1/2 is

R(τ) = 〈s(t)s(t+ τ)〉= 〈s(t)〉 〈s(t+ τ)〉

=

(A

2

)(A

2

)=

A2

4for τ > T.

38

Page 43: Math Notes for ECE 278 - web.eng.ucsd.edu

This is the expected power in the random signal with mean amplitude of A/2.For τ < T , there are two possibilities depending on the value of the offset j for each sample

function. If j > T − τ , then the random variables s(t) and s(t + τ) are defined in different timeintervals. Thus 〈s(t)s(t+ τ)〉 = A2/4 because each term has amplitude A with a probability ofone-half and otherwise has amplitude zero. If j < T − τ , then s(t) and s(t + τ) are defined in thesame time interval. If a mark is transmitted, then s(t)s(t + τ) = A2. If a space is transmitted,then s(t)s(t+ τ) = 0. Because marks and spaces are equally probable, 〈s(t)s(t+ τ)〉 = A2/2. Theresults for j < T − τ and j > T − τ are combined by recalling that the offset j is described by auniform probability density function, f(j) = 1/T . The resulting autocorrelation is

〈s(t)s(t+ τ)〉 =

∫ T−τ

0

A2

2Tdj +

∫ T

T−τ

A2

4Tdj

=A2

4

(2− τ

T

)(0 < τ < T ).

The same analysis holds for negative values of τ . Therefore, the autocorrelation function is

R(τ) =

A2

4A2

4

(2− |τ |T

) |τ | > T|τ | ≤ T. (1.2.47)

A plot of the autocorrelation function is shown in Figure 1.11a. The corresponding power density

Normalized Frequency (fT)

Nor

mal

ized

Pow

er (d

B)

−4 −2 0 2 4

0

−10

−20

−30

(a) (b) Zero-frequencycomponentModulated

signal component

τ

A2

4

A2

2

T−T

R(τ)

Figure 1.11: (a) Autocorrelation function of a binary sequences of marks and spaces. (b) Powerdensity spectrum.

spectrum S(f) is the Fourier transform of R(τ). The Fourier transform of the constant meanpower A2/4 is a Dirac impulse (A2/4)δ(t). The Fourier transform of the triangular function is asinc-squared function. The total power density spectrum is

S(f) =A2

4δ(f) +

A2T

4sinc2(fT ), (1.2.48)

and is shown in Figure 1.11b. For this waveform, half of the signal power is carried in the zero-frequency component and conveys no information.As a second example, consider the random process

x(t) =1√M

M∑m=1

Ame−i2πfmt,

39

Page 44: Math Notes for ECE 278 - web.eng.ucsd.edu

where Am is again a set of identically-distributed random variables indexed by m, and fm isa known frequency for each m. We want to determine whether this random process is widesensestationary and whether this random process is ergodic.To be widesense stationary, the expected value must be independent of time and the autocorre-

lation function can depend only on the time difference. The expected value is

〈x(t)〉 =1√M

M∑m=1

〈Am〉 e−i2πfmt.

For the process to be stationary, the expected value cannot depend on t. Examining 〈x(t)〉, thiscondition is satisfied only if 〈Am〉 = 0. Therefore, the probability density function of the amplitudemust have zero mean. The autocorrelation function is

R(t, t+ τ) =M∑m=1

M∑`=1

〈AmA∗` 〉 e−i2πfmt ei2πf`(t+τ)

=M∑m=1

M∑`=1

〈AmA∗` 〉 e−i2π(fm−f`)t ei2πf`τ .

In order for the autocorrelation function to depend only on τ and not t, the expected value 〈AmA∗` 〉must vanish when m 6= `. If m = `, then 〈AmA∗` 〉 = 〈|Am|2〉. Thus the requirement for the processto be stationary is that 〈AmA∗` 〉 = 〈|Am|2〉δm` where δm` is the Kronecker impulse. This conditionimplies that the random variables are uncorrelated. Combining the two observations, we concludethat in order for the process to be stationary, the random variables must have mean zero and beuncorrelated.To test for ergodicity, the temporal moments of a single sample function must be equal to the

statistical moments of the ensemble. However, each term in the summation for a sample function isof the form Ane

i2πfnt, and is sinusoidal. Therefore, each sample function is a deterministic functionwith individual temporal sections correlated over any time interval. It follows that the randomprocess is not ergodic.

Noise Processes

Many common noise processes, both electrical and optical, can be accurately modeled as stationarygaussian random processes with a constant power density spectrum over a limited frequency range.Let Sn(f) = N0/2 be a constant (two-sided) power density spectrum of a stationary, zero-mean

noise process n(t), possibly gaussian. This noise process has an equal contribution to the powerdensity spectrum from every frequency component and is called a white noise process.q BecauseSn(f) = N0/2 is a constant, the autocorrelation function Rn(τ) of a white noise process is a scaledDirac impulse.When a stationary random process with a power density spectrum S ′(f) is the input to a linear

time-invariant system with a causal impulse response and a baseband transfer function H(f), theqA photodetected electrical signal has units of current so that N0 is sometimes expressed using an equivalent powerdensity spectrum per unit resistance with units of A2/Hz. A discussion of units is given in the book section titled“Notation”.

40

Page 45: Math Notes for ECE 278 - web.eng.ucsd.edu

power density spectrum S(f) of the random process at the output is related to the power densityspectrum S ′(f) at the input by the expression

S(f) = S ′(f)|H(f)|2. (1.2.49)

For white noise, S ′(f) = N0/2, and

S(f) = 12N0|H(f)|2. (1.2.50)

Using (1.2.44), and the fact that the stationary noise process has zero mean, the output noisepower Pn at time t is equal to the variance σ2 given by (cf. (1.2.5)) so that

Pn = σ2 =N0

2

∫ ∞−∞|H(f)|2df (1.2.51a)

=N0

2

∫ ∞−∞|h(t)|2dt, (1.2.51b)

where the second line follows from Parseval’s relationship (1.1.18).In general, the power in the filtered noise process can be determined using (1.2.49) and (1.2.51b)

along with the convolution property of the Fourier transform given by (1.1.14)

Pn =

∫ ∞−∞Sn(f)|H(f)|2df

=

∫ ∞−∞Sn(f)|H(f)|2ei2πfτdf

∣∣∣∣τ=0

= Rn(τ) ~ h(τ) ~ h∗(−τ)∣∣τ=0

(1.2.52)

where Rn(τ) is the noise autocorrelation function defined in (1.2.43b) corresponding to Sn(f). Anoise source whose power density spectrum varies with frequency is called a colored noise source.The noise power σ2 can be written as

σ2 = R(0) = GN0BN , (1.2.53)

where BN is defined as the noise equivalent bandwidth

BN

.=

1

2G

∫ ∞−∞|H(f)|2df (1.2.54)

=1

2G

∫ ∞0|h(t)|2 dt, (1.2.55)

with the normalization constant G = max|H(f)|2 equal to the maximum power gain of the ban-dlimited system that defines the noise power. The concept of noise equivalent bandwidth regardsthe total noise power as equivalent to a power density spectrum that is flat to a frequency BN andzero thereafter.For example, the noise equivalent bandwidth of filtered white noise using a baseband system

described by

H(f) =1

1 + if/Wh

41

Page 46: Math Notes for ECE 278 - web.eng.ucsd.edu

is

BN =1

|H(0)|2∫ ∞0

1

1 + (f/Wh)2df =

π

2Wh,

where max |H(f)|2 = |H(0)|2 = G = 1, andWh is the half-power or 3-dB bandwidth of the basebandspectrum. The expression |H(f)|2 and the corresponding noise equivalent bandwidth are plotted inFigure 1.12a.The passband noise equivalent bandwidth B measures the width of a passband spectrum (cf.

Figure 1.3). It is twice the bandwidth W of the baseband signal because it occupies twice thefrequency range. The corresponding noise equivalent bandwidth of a passband system is plotted inFigure 1.12b.

(a) (b)

Frequency0 fc

Figure 1.12: (a) Noise equivalent bandwidth of a baseband transfer function. (b) Noise equivalentbandwidth of a passband transfer function that is symmetric about the carrier frequencyfc.

To determine the noise equivalent bandwidth of a system described by the causal impulse responseh(t) = Arect((t/T ) − 1/2), use (1.2.55) with the upper limit set to T . The Fourier transformof a rectangular pulse is a sinc function, which has its maximum value at f = 0. ThereforeG = max |H(f)|2 = |H(0)|2 =

∣∣∫∞0 h(t)dt

∣∣2 = Arect((t− T/2)/T ). Then

BN =1

2G

∫ T

0A2dt =

A2T

2G=

1

2T. (1.2.56)

If A = 1, then the filter integrates the noise over a time T , and the noise power is

σ2 = GN0BN =T 2N0

2T=

N0T

2, (1.2.57)

as given by (1.2.53).The effect of the noise is quantified by the signal-to-noise ratio given by

SNR =expected electrical signal powerexpected electrical noise power

. (1.2.58)

42

Page 47: Math Notes for ECE 278 - web.eng.ucsd.edu

Passband Noise

When the passband bandwidth B is much smaller than the passband center frequency fc, thebandlimited noise process is called a passband noise process. A passband noise process n(t) can bewritten in several equivalent forms as

n(t) = nI(t) cos(2πfct)− nQ(t) sin(2πfct)

= A(t) cos (2πfct+ φ(t))

= Re[A(t)ei(2πfct+φ(t))

]= Re

[n(t)ei2πfct

], (1.2.59)

where n(t) = nI(t) + inQ(t) = A(t)eiφ(t) is the complex-baseband noise process. The term nI(t) =A(t) cosφ(t) is the in-phase component of the noise, nQ(t) = A(t) sinφ(t) is the quadrature compo-nent of the noise. The autocorrelation function of n(t) is

Rn(t)(τ) = 〈n(t)n(t+ τ)〉=

⟨Re[n(t)ei2πfct

]Re[n(t+ τ)ei(2πfc(t+τ))

]⟩.

Using the identity Re[z1]Re[z2] = 12Re[z1z2]+

12Re[z

∗1z2], and the fact that θ is an independent phase

gives

Rn(τ) = 12Re

[〈n(t)n(t+ τ)〉 ei[2πfc(2t+τ)]

]+ 1

2Re[〈n(t)∗n(t+ τ)〉 ei2πfcτ

]= 1

2Re[〈n(t)∗n(t+ τ)〉 ei2πfcτ

]= 1

2Re[Rn(τ)ei2πfcτ

], (1.2.60)

where the rapidly oscillating first term is neglected compared to the second term. The term Rn(τ) =〈n(t)n(t+ τ)〉 is the autocorrelation function of the complex-baseband noise process.The passband power density spectrum is obtained by taking the Fourier transform of (1.2.60)

Sn(f) = 14

(Sn(f − fc) + S∗n(−f − fc)

), (1.2.61)

where the modulation property of the Fourier transform given in (1.1.16) has been used. Thepassband power density spectrum for the noise is shown in Figure 1.13a.The power density spectrum of the complex-baseband noise process is obtained by substitut-

ing (1.1.58) into (1.2.61) and equating terms

Sn(f) = 2N0 |H(f)|2

= 4Sn(f + fc) f > 0, (1.2.62)

where only the positive frequency part of the passband noise power density spectrum Sn(f) is used todetermine the complex-baseband power density spectrum Sn(f), and H(f) is the complex-basebandtransfer function defined in (1.1.58). This power density spectrum is shown in Figure 1.13b.

43

Page 48: Math Notes for ECE 278 - web.eng.ucsd.edu

fc

N0/2

2N0

(a)

(b)

N0

BN

(c)PassbandNoise Bandwidth

fc

fc

fc

fc

fc

Figure 1.13: (a) Power density spectrum of the real passband noise process Sn(f), (b) The complex-baseband power density spectrum Sn(f), (c) Power density spectrum of the complex-baseband noise components, SnI (f) and SnQ(f), along with the passband noise equiv-alent bandwidth.

The power density spectrum for each complex-baseband noise component of a passband noiseprocess with a passband bandwidth B is determined using the same steps that were used to derive(1.2.61) with

SnI (f) = SnQ(f)

= Sn(f − fc) + Sn(f + fc).

This spectrum is shown in Figure 1.13c.Using (1.2.60), the noise power σ2 in the passband noise process n(t) is

σ2 =N0

2

∫ ∞−∞

∣∣H(f)∣∣2df

= Rn(0) = 12Re [〈n(t)n∗(t)〉]

= 12

⟨|n(t)|2

⟩. (1.2.63)

A complex-baseband signal has twice the signal energy as the corresponding passband signal (cf.(1.1.74)). Accordingly, the complex-baseband noise equivalent bandwidth BN is defined to be twiceas large as the baseband noise equivalent bandwidth given in (1.2.54) so that

BN

.=

1

G

∫ ∞−∞|H(f)|2 df. (1.2.64)

44

Page 49: Math Notes for ECE 278 - web.eng.ucsd.edu

The difference in the noise equivalent bandwidth for a real-baseband signal and a complex-basebandsignal is shown in Figure 1.12.For the idealized case in which a constant noise power density spectrum N0 is filtered by an ideal

lowpass filter, the noise equivalent bandwidth BN is equal to the effective bandwidth Bc definedin (1.2.46). To show this, let H(f) = rect(f/B) so that the filtered noise process is Sn(f) =N0rect(f/B). The effective bandwidth of this noise process is

Bc =

(∫∞−∞ Sn(f)df

)2∫∞−∞ S2n(f)df

=

(∫∞−∞N0rect(f/B)df

)2∫∞−∞N

20 rect2(f/B)df

= B. (1.2.65)

The complex-baseband noise equivalent bandwidth is given by (1.2.64)

BN =1

G

∫ ∞−∞|H(f)|2 df

=1

N20

∫ ∞−∞

N20 rect2(f/B)df = B.

Therefore, BN = Bc = B. For this case, the coherence time defined in (1.2.45) is the exact reciprocalof the noise equivalent bandwidth BN . For other filters, this equality need not hold, and the numberof coherence intervals per unit time Bc does not equal the noise equivalent bandwidth BN .When the in-phase nI(t) and quadrature nQ(t) noise components of a passband noise process

n(t) are independent, and have the same statistics, the autocorrelation for each noise component isthe same with RnI (τ) equal to RnQ(τ). The corresponding autocorrelation function Rn(τ) of thecomplex-baseband process is

Rn(τ) = 2RnI (τ) = 2RnQ(τ). (1.2.66)

The power in each quadrature component can be determined using (1.2.63) and (1.2.66)

σ2 =⟨n2(t)

⟩=

⟨n2I (t)

⟩=

⟨n2Q(t)

⟩= 1

2

⟨|n(t)|2

⟩. (1.2.67)

When the complex process is a gaussian random process, then the two-dimensional gaussian prob-ability density function of the in-phase and quadrature components at any time t describes acircularly-symmetric gaussian random variable with the process being a circularly-symmetric gaus-sian random process.

Noise Figure The amount of noise added to a signal in a linear system can be quantified by theconcept of noise figure FN , which is defined in terms of the signal power and the noise power overa bandwidth B. A frequency-dependent noise figure FN(f) can be defined by constraining thebandwidth B, centered at f , to be sufficiently narrow so that the signal power and the noise powerare independent of frequency over bandwidth B.

45

Page 50: Math Notes for ECE 278 - web.eng.ucsd.edu

The frequency-dependent noise figure FN(f) is defined as the ratio of the total noise power Pin(f)in the system to the noise power Pin(f) at the input due solely to thermal noise

FN(f) =Pin(f) + Pa(f)

Pin(f)= 1 +

Pa(f)

Pin(f), (1.2.68)

where Pa(f) is an additional uncorrelated noise contribution, typically from an amplifier, that isadded to the input noise Pin(f). The noise powers are referenced to the output of the system with atransfer function H(f) defined over the bandwidth B given that the noise and signal are frequencyindependent. The input power Sin(f) for the signal is filtered by the system to produce the outputsignal power Sout(f) = Sin(f)|H(f)|2. The frequency-dependent noise figure can be expressed interms of the signal-to-noise ratio by multiplying the numerator and denominator of (1.2.68) bySout(f) = Sin(f)|H(f)|2 to obtain

FN(f) =Sin(f) |H(f)|2 (Pin(f) + Pa(f))

Sin(f) |H(f)|2 Pin(f). (1.2.69)

Then

FN(f) =Sin(f)

Sout(f)

Pout(f)

Pin(f)(1.2.70)

where Pout(f) = |H(f)|2 (Pa(f) + Pin) is the output noise power.The frequency-independent form of the noise figure FN is

FN =SinPout

SoutPin=

SNRin

SNRout, (1.2.71)

where the signal power and the noise power need not be frequency independent over the bandwidthB.

1.3 Electromagnetics

The physical quantity used for wireless communications is a time-varying electromagnetic field de-scribed by Maxwell’s equations. These equations comprise a system of partial differential equationshaving a rich set of solutions that depend both on the geometry and on the medium. This sectionprovides a summary review of Maxwell’s equations as used for analyzing electromagnetic signalpropagation.The electromagnetic field in free space can be described by two vector functions of space r and

time t. These functions are the electric field vector E(r, t) and the magnetic field vector H(r, t).When an electromagnetic field interacts with a material, two additional material-dependent vectorquantitiesD(r, t) and B(r, t) are needed to describe the interaction of the electromagnetic field withthe material. In the most basic case, these quantities are scaled forms of E(r, t) and H(r, t) withD(r, t) = εE(r, t) and B(r, t) = µH(r, t) where ε and µ are material-dependent constants. Theseconstants convert the field quantities E(r, t) and H(r, t) into the material-dependent quantitiesD(r, t) and B(r, t).

46

Page 51: Math Notes for ECE 278 - web.eng.ucsd.edu

Suppressing the arguments of the vector functions, Maxwell’s set of partial differential equationsfor a material with no free charge is given by

∇× E = −∂B∂t

(1.3.1a)

∇×H =∂D∂t

(1.3.1b)

∇ ·B = 0 (1.3.1c)∇ ·D = 0. (1.3.1d)

The operator ∇· is the scalar divergence differential operator, and the operator ∇× is the vectorcurl differential operator. Integrating the electric field vector E along a line produces the potentialdifference between the end points of the line. Integrating the magnetic field vectorH around a closedcurve yields the “displacement current” that passes through the surface enclosed by the closed curve.Each material-dependent quantity, D or B, has units of field per unit area and is called a flux

density . The term D, called the electric flux density, accounts for the bound charge within thedielectric material, and so can generate an electric field. The term B, called the magnetic fluxdensity, accounts for any closed current paths that are induced within the material, and so generatea magnetic field.The application of Maxwell’s equations to electromagnetic wave propagation at the single fre-

quency ω can be described as a monochromatic electric field vector E(r, t) = Re[E(r, ω)eiωt]. Usinga set of unit vectors x, y, z defined for cartesian coordinates, the expression

E(r, ω) = Ex(r, ω)x + Ey(r, ω)y + Ez(r, ω)z, (1.3.2)

is the complex-baseband vector fieldr at a position r and frequency ω of the monochromatic com-ponent given by eiωt. The general electric field is then a superposition over all appropriate ω. Aswill be explained in Section 1.3.2, the function E(r, ω) satisfies the vector Helmholtz equation (see(1.3.15)) given by

∇2E(r, ω) + k20n2(r, ω)E(r, ω) = 0,

where k0 = ω/c0 is the free-space wavenumber , and n(r, ω) is the index of refraction as a functionof the position r and the frequency ω. Given the boundary conditions determined by the geometry,the solution to this equation describes the propagation of a monochromatic electromagnetic field,which is a field with that has a single temporal frequency.

1.3.1 Material Properties

The two quantities D and B account for the effect of the material on the field. In free space, therelationship between the electric field vector E and the electric flux density D is given by D = ε0Ewhere ε0 is a constant known as the permittivity of free space. Similarly, in free space, the magneticflux density B is related to the magnetic field vector by B = µ0H, where µ0 is a constant known asthe permeability of free space.

rThe term “complex-baseband vector field” is not generally used in electromagnetics. Its use here is to emphasize therelationship between electromagnetic fields and the communication signals derived from electromagnetic fields.

47

Page 52: Math Notes for ECE 278 - web.eng.ucsd.edu

The materials used for the fabrication of optical communication fibers are glass or plastic. Thesehave no free charge. Materials with no free charge are called dielectric materials. To guide light, thematerials are spatially modified to create two or more regions with different dielectric properties. Thefields in each region are constrained by the boundary conditions at the interface between the regions.The boundary conditions can be derived from the integral formulation of Maxwell’s equations whereeither a line integral or a surface integral spans the discontinuous interface. Applying a limitingoperation results in the expressions for the boundary conditions for the differential form of Maxwell’sequations. These conditions state that for a dielectric material, the tangential components of thefields E andH and the normal components of the flux densities D and B must be continuous acrossthe interface.When an electric field E is applied to a dielectric material, the bound charge separates slightly to

create dipoles. These separated charges produce an additional field, called the material polarizations

P , that is added to the original field to produce the electric flux density D. Accordingly, in adielectric material, the flux densities are related to the fields by

D = ε0E +P (1.3.3a)B = µ0H, (1.3.3b)

which are known as the constitutive relationships.The relationship between the applied electric field E and the resulting material polarization P is

the material susceptibility , denoted X. When the susceptibility is linear and does not depend ontime or space, it appears as the simple expression

P(t) = ε0XE(t). (1.3.4)

In general the susceptibility may be a scalar function of space X(r), a scalar function of time X(t),or both X(r, t), or even a tensor. These are each described in turn in the following paragraphs.

Homogeneous and inhomogeneous media A material whose parameters do not depend on theposition r is homogeneous. If some material parameters do depend on r, then the material is inho-mogeneous. A common form of an inhomogeneous material is a material for which the susceptibilitydepends on r, and so the index of refraction, denoted n(r), depends on r. The relationship betweenthe index of refraction and the susceptibility is derived later in this section.

Isotropic and anisotropic materials A material whose properties do not depend on the orientationof the electric field within the material are called isotropic materials. Materials for which thematerial properties depend on the orientation of the electric field are called anisotropic materials.The response of an anisotropic material varies depending on the orientation of the electric fieldvector E with respect to a set of preferred directions, called principal axes, which are a consequenceof the internal structure of the material.An optically-anisotropic material may exhibit birefringence, which describes a material that has

an angularly-dependent index of refraction with an index that is different for two orthogonal po-larization states. For a birefringent material, the material polarization component Pi in a direction

sTwo distinctly different properties are referred to using the word “polarization”—the field polarization and thematerial polarization P within a material.

48

Page 53: Math Notes for ECE 278 - web.eng.ucsd.edu

i for i = 1, 2, 3 representing directions x, y, z, depends on the electric field component Ei in thatdirection as well as the components Ej in directions for which j is not equal to i. For a materialthat is both linear and memoryless, this dependence can be written as

Pi = ε0

3∑j=1

XijEj , (1.3.5)

where Xij is an element of a three-by-three matrix X called the susceptibility tensor . The suscepti-bility tensor represents the anisotropic nature of the material. Birefringent materials are discussedlater in this section.

Linear causal media Many materials exhibit a causal, linear response in time, and a local responsein space for which the material polarizationP at each position r depends on the incident electric fieldE only at that position and not at other positions. In this case, suppressing the spatial dependenceon r from the notation, the polarization P(t) for an isotropic material can be written, in general,as a temporal convolution

P(t) = ε0

∫ ∞0

X(t− τ)E(τ)dτ, (1.3.6)

where X(t) is called the temporal susceptibility of the material. The temporal susceptibility de-scribes the real, causal impulse response of the material to the electric field at each position r.Accordingly, the temporal susceptibility represents the temporal memory of the material to theapplied E field, as expressed by the temporal convolution in (1.3.6).

Linear Material Dispersion

The susceptibility X(t) may have a narrow timewidth compared to the relevant time scale. Then thesusceptibility is modeled as instantaneous with X(t) = Xδ(t) where X is a real constant. Materialsthat have no memory on the relevant time scale are called nondispersive materials. Using (1.1.2)in (1.3.6), the material polarization at a single point in space for a nondispersive material is givenby (1.3.4). Substituting that expression into (1.3.3a) gives

D(t) = ε0 (1 +X)E(t)

= ε0εrE(t), (1.3.7)

where εr.= 1 +X is the relative permittivity of the nondispersive material.

Materials that do exhibit memory on the relevant time scale are called dispersive materials. For adispersive material, the material polarization P at a single point of the dielectric material often hasthe linear restoring force of a simple harmonic oscillator, in which case the material is linear. Thedielectric material consists of a volume of bound charge with a density N in units of coulombs percubic centimeter, possibly depending on r. Each bound charge with a mass m and a total charge qhas a linear restoring force that models the “stiffness” of the material response. The restoring forcecan also be nonlinear leading to a nonlinear relationship between E and P .For a linear isotropic dielectric dispersive material, the material polarization P(t) at each point r

is in the same direction as the applied electric field E(t) with the response of a single component of

49

Page 54: Math Notes for ECE 278 - web.eng.ucsd.edu

P(t) to a single component of an applied electric field E(t) described by the second-order differentialequation

d2P(t)

dt2+ σω

dP(t)

dt+ ω2

0P(t) =Nq2E(t)

m, (1.3.8)

where ω20 = K/m is the resonant frequency of the material response, and σω is the width of the

resonance.For this linear material, the response to a monochromatic electric field E(t) = Re[E(ω)eiωt] is

another monochromatic field P(t) = Re[P (ω)eiωt]. Substituting these forms into the differentialequation (1.3.8), the complex spectral susceptibility , denoted as χ(ω), is

χ(ω) =1

ε0

P (ω)

E(ω)= χ0

ω20

ω20 − ω2 + iσωω

. (1.3.9)

Because the spectral susceptibility χ(ω) is the temporal Fourier transform of temporal susceptibilityX(t) denoted as X(t) ←→ χ(ω), the temporal susceptibility X(t) follows from (1.3.9). As ω goesto zero, χ(ω) goes to χ0 where χ0 = Nq2/mε0ω

20 is the low-frequency susceptibility. As ω goes to

infinity, χ(ω) goes to zero.Because the temporal susceptibilityX(t) is a real, causal, linear function of time, there is an inher-

ent relationship between the real part χR(ω) and the imaginary part χI(ω) of the complex spectralsusceptibility χ(ω) given by the Kramers-Kronig transform (cf. (1.1.23)). A similar relationshipexists between the magnitude of χ(ω) and the phase of χ(ω).

1.3.2 The Wave Equation

Modulated waveforms within the wave-optics signal model are electromagnetic waves. The waveequation for electromagnetics can be derived by applying the vector curl operation to both sidesof (1.3.1a). Substituting (1.3.1b) and the constitutive relationships, (1.3.3a) and (1.3.3b) for anondispersive material into the resulting equation yields

∇×∇× E(r, t) +1

c20

∂2E(r, t)

∂t2= −µ0

∂2P(r, t)

∂t2, (1.3.10)

where c0 = 1/√ε0µ0 is the speed of light in vacuum. Substituting the vector identity

∇×∇× E(r, t) = ∇(∇ · E(r, t)

)−∇2E(r, t)

and (1.3.4) into (1.3.10) gives

∇2E(r, t)−∇(∇ · E(r, t)

)− n2

c20

∂2E(r, t)

∂t2= 0, (1.3.11)

where

n2.= 1 +X = εr (1.3.12)

defines the index of refraction n for a nondispersive homogeneous material, commonly referred tosimply as the index. More generally, for an inhomogeneous material, the index depends on r. For

50

Page 55: Math Notes for ECE 278 - web.eng.ucsd.edu

a dispersive material, the index depends on time (or frequency). The index n = c0/c relates thespeed of light c in the material to the speed of light c0 in vacuum with c = c0/n, as will be shownlater in this section. For silica glass, the index is approximately 1.5.For a nondispersive inhomogeneous material for which the index n(r) varies as a function of r

slowly in comparison to the variations of E(r, t) as a function of r, the term ∇(∇·E(r, t)) in (1.3.11)is zero or can be neglected, and the wave equation is written as

∇2E(r, t)− n2(r)

c20

∂2E(r, t)

∂t2= 0. (1.3.13)

The specific conditions that need to be satisfied for (1.3.13) to be valid are discussed in an end-of-chapter problem.

Complex Representations

The analysis of Maxwell’s equations in a dispersive medium can often be simplified by using acomplex representation based on Fourier analysis. For a time-invariant system, the temporal de-pendence of the electric field vector E(r, t) can be written as an inverse temporal Fourier transform(cf. (1.1.9))

E(r, t) =1

∫ ∞−∞

E(r, ω)eiωtdω, (1.3.14)

showing the temporal dependence of E(r, t) expressed as a superposition of complex frequencycomponents E(r, ω)eiωt. Substituting this form into (1.3.13), and using ∂2(E(r, ω)eiωt)/∂t2 =−ω2E(r, ω)eiωt, the complex representation of the wave equation for a dispersive material is de-scribed by the vector Helmholtz equation

∇2E(r, ω) + k20n2(r, ω)E(r, ω) = 0, (1.3.15)

where E(r, ω) is the complex electric vector field at a position r and frequency ω (cf. (1.3.2), andn(r, ω) is the index at that position and frequency.t A similar expression governs the complex vectormagnetic field H(r, ω).The square n2(r, ω) of the frequency-dependent index appearing in (1.3.15) is equal to 1+χ(r, ω)

(cf. (1.3.12)). The complex spectral susceptibility χ(r, ω) (cf. (1.3.9)) is the temporal Fouriertransform of the real temporal susceptibility X(r, t) (cf. (1.3.6)) and characterizes the dispersiveproperties of the material. For a dielectric material, a basic expression for χ(r, ω) is given by (1.3.9).In a homogeneous, nondispersive dielectric material n(r, ω) reduces to a constant n, and the

vector Helmholtz equation reduces to

∇2E(r) + k2E(r) = 0, (1.3.16)

where k .= nk0 is defined as the wavenumber .

tThe expression for the frequency-dependent index n(r, ω) can be formally derived by introducing (1.3.6) into(1.3.11)

51

Page 56: Math Notes for ECE 278 - web.eng.ucsd.edu

When all vector components of the E field and the H field have the same functional form, thenwave propagation can be analyzed using a scalar form of the Helmholtz equation given by

∇2U(r, ω) + k20n2(r, ω)U(r, ω) = 0, (1.3.17)

where U(r, ω) is a scalar function representing one component of either the electric field E(r, ω) orthe magnetic field H(r, ω) normalized so that |U(r, ω)|2 represents a spatial power density. Thisnormalization is discussed later (cf. (1.3.29)). For a constant index, the scalar Helmholtz equationsimplifies to (cf. (1.3.16))

∇2U(r) + k2U(r) = 0, (1.3.18)

with k being the wavenumber. This equation is used to develop geometrical optics later in thissection.

Plane waves in unbounded media

To solve the vector Helmholtz equation for a specified geometry and medium requires applyingthe specified boundary conditions in an appropriate coordinate system. The most basic geometryconsisting of an unbounded, lossless, linear, isotropic, homogeneous medium, not necessarily freespace, allows the simplest solution of the Helmholtz equation. To this point, the fields E and H ofthe form

E(r) = E0e−iβ·r e (1.3.19a)

H(r) = H0e−iβ·r h, (1.3.19b)

satisfy (1.3.15) for such a medium, where ê and ĥ are orthogonal unit vectors. This is a plane wave. The cross product ê × ĥ defines the real propagation vector β = β_x x̂ + β_y ŷ + β_z ẑ, with the unit vector β̂ ≜ β/β pointing in the direction of propagation of the plane wave. The magnitude β = |β| is called the propagation constant. In general, a wave for which both the electric field and the magnetic field are transverse to the direction of propagation is called a transverse electromagnetic wave or a TEM wave.

The complex amplitudes E₀ and H₀ for the plane-wave field given in (1.3.19) are related by H₀ = E₀/η, where the material-dependent quantity η = √(µ₀/ε) is the impedance, which may depend on the frequency. The spatial phase term e^{−iβ·r} depends on the inner product of the propagation vector β and the position vector r = x x̂ + y ŷ + z ẑ. Using E₀ = |E₀|e^{iφ}, the real electric field for a plane wave is given by

E(r, t) = Re[E(r)e^{iωt}] = |E₀| cos(ωt − β·r + φ) ê.   (1.3.20)

The cosine function repeats in time whenever ωt = m2π, where m is an integer. This condition defines the temporal period T = 2π/ω = 1/f. The cosine function repeats in space whenever β·r = m2π. This corresponds to a spatial period or wavelength λ = c/f that is defined as the distance along the β̂ direction between two consecutive phase fronts in the medium.

For a plane wave, the propagation vector β is also called the wavevector k, and is given by

β = k ≜ k₀ n(ω) β̂,   (1.3.21)


where the magnitude |k| ≜ k = n(ω)k₀ is the wavenumber. Setting the propagation constant β equal to the wavenumber k gives kλ = 2π, or

λ = 2π/k.

The phase velocity c is defined by

c ≜ λ/T = λ(ω) ω/2π = ω/β(ω),   (1.3.22)

where the notation shows the dependence of the propagation constant β, and thereby the dependence of the wavelength λ, on the frequency ω.

Dispersion Relationship Substituting the plane waves of (1.3.19) into the vector Helmholtz equation in (1.3.15) and noting that the spatial operator ∇² reduces to a multiplication by −β² for a plane-wave field of the form of (1.3.19), the Helmholtz equation has a solution when β(ω) satisfies

β(ω) = k₀ n(ω) = ω n(ω)/c₀ = 2π n(ω)/λ₀,   (1.3.23)

where k₀ = 2π/λ₀ is the free-space wavenumber, with λ₀ being the free-space wavelength. Using c₀ = ω/k₀ = λ₀f, the index of refraction n(ω) = c₀/c(ω) is the ratio of the phase velocity in free space to the phase velocity in the medium.

The functional dependence β(ω) of the propagation constant β on the frequency ω is called the dispersion relationship. Values of ω and β that satisfy (1.3.23) produce plane-wave solutions to the Helmholtz equation. As the field propagates a distance L in a lossless medium in the direction of β, the complex amplitude of the field is multiplied by e^{−iβ(ω)L}, which is a distance-dependent phase shift. This phase shift does not change the functional form of the solution. Each plane-wave solution is called an eigenfunction, or eigenmode, of the unbounded, lossless, linear, homogeneous medium, with the phase shift e^{−iβ(ω)L} being the eigenvalue defined at a distance L.

For other geometries, such as waveguiding structures, the dispersion relationship is different from (1.3.23) because of the presence of boundaries. This dispersion relationship must be derived for a given geometry starting with the vector Helmholtz equation given in (1.3.15), and applying the geometry-dependent boundary conditions. This means that different waveguiding geometries have different dispersion relationships.
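As a simple numerical illustration of the dispersion relationship (1.3.23), the sketch below evaluates the propagation constant, phase velocity, and wavelength in the medium for a hypothetical frequency-dependent index; the functional form of n(ω) and the use of Python with numpy are assumptions made only for this illustration, not part of the text.

    import numpy as np

    c0 = 2.99792458e8                       # free-space speed of light (m/s)

    def n_of_omega(omega):
        # Hypothetical, weakly dispersive index of refraction (assumption for illustration only).
        return 1.45 + 1.0e-35 * omega**2

    omega = 2 * np.pi * np.linspace(180e12, 200e12, 5)   # angular frequencies (rad/s)
    beta = omega * n_of_omega(omega) / c0                # dispersion relationship (1.3.23)
    c_phase = omega / beta                               # phase velocity c = omega/beta (1.3.22)
    lam = 2 * np.pi / beta                               # wavelength in the medium, lambda = 2*pi/beta

    for w, b, c, l in zip(omega, beta, c_phase, lam):
        print(f"omega = {w:.3e} rad/s   beta = {b:.3e} rad/m   c = {c:.3e} m/s   lambda = {l:.3e} m")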

Poynting Vector

The cross product of the time-varying vector fields E(r, t) and H(r, t), which are not necessarily plane waves, nor orthogonal, is defined as the Poynting vector S(r, t)

S(r, t) ≜ E(r, t) × H(r, t).

The Poynting vector is a directional spatial power density with units of power per unit area. It may have components that are not in the direction of propagation.


Narrowband fields can be written as E(r, t) = Re[E(r, t)e^{iω_c t}] and H(r, t) = Re[H(r, t)e^{iω_c t}], where the complex-baseband electromagnetic fields E(r, t) and H(r, t) are slowly varying compared to the carrier frequency ω_c. (The same symbol E is used for the complex-baseband electric field E(r, t) in the time domain and the complex electric field E(r, ω) in the frequency domain; the context will resolve the ambiguity.) The time-average power density S_ave(r, t) for the complex-baseband field is expressed as

S_ave(r, t) = Re[½ E(r, t) × H*(r, t)]   (1.3.24a)
           = Re[𝒮(r, t)],   (1.3.24b)

where 𝒮(r, t) = ½ E(r, t) × H*(r, t) is the complex Poynting vector.

The intensity of the field I(r, t) is defined as the magnitude of S_ave(r, t)

I(r, t) ≜ |S_ave(r, t)|.   (1.3.25)

When both field vectors are transverse to the direction of propagation, the wave is a TEM wave with the Poynting vector along the direction of propagation. When, instead, one of the two fields has an axial vector component along the direction of propagation, as in a dielectric waveguide, the Poynting vector has a component along the direction of propagation and a component transverse to the direction of propagation.

Power

The power P(z, t) flowing along the z axis through a cross-sectional region A transverse to the z axis at a distance z is given by

P(z, t) = ∫_A S_ave(r, t) · ẑ dA,   (1.3.26)

where ẑ is a unit vector in the z direction and dA is the differential of the area. For a TEM wave propagating in the z direction, the Poynting vector S_ave(z, t) is in the z direction, and

P(z, t) = ∫_A I(r, t) dA.   (1.3.27)

If the intensity is constant over the transverse region A, then the power P(z, t) and the intensity I(z, t) differ only by a constant scaling factor equal to the area of the region A.

As an example, the complex Poynting vector given in (1.3.24a) for the plane wave described by

(1.3.19) is

𝒮(r, t) = ½ E(r, t) × H*(r, t)
        = ½ (E₀ e^{−iβ·r} ê) × (H₀ e^{iβ·r} ĥ)
        = (|E₀|²/2η) β̂,   (1.3.28)

where ê × ĥ = β̂ and H₀ = E₀/η (cf. (1.3.19)). The corresponding intensity is I = |𝒮| = |E₀|²/2η.
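As a numerical check of (1.3.27), (1.3.28), and the impedance relation of Problem 21, this short sketch computes the intensity of a plane wave and the power it carries through a transverse area; the field amplitude, index, and area are hypothetical values chosen only for illustration, and the use of numpy is assumed.

    import numpy as np

    eta0 = 376.73                       # impedance of free space (ohms)
    n = 1.5                             # hypothetical refractive index (assumption)
    eta = eta0 / n                      # material impedance eta = eta0/n (cf. Problem 21)
    E0 = 100.0                          # hypothetical plane-wave field amplitude (V/m)

    I = np.abs(E0)**2 / (2 * eta)       # intensity I = |E0|^2 / (2*eta)  (cf. (1.3.28))
    A = 1.0e-4                          # hypothetical transverse area (m^2)
    P = I * A                           # power through A for a uniform intensity (cf. (1.3.27))

    print(f"intensity = {I:.2f} W/m^2,  power = {1e3 * P:.2f} mW")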


Equation (1.3.28) states that the intensity of a TEM wave is proportional to the squared magnitude |E₀|² of the electric field, and therefore proportional to the squared magnitude of the magnetic field. For this case, it is convenient to define a complex field envelope U(r, t) representing the field, with the squared magnitude of U(r, t) normalized so that

I(r, t) = |U(r, t)|²,   (1.3.29)

where U(r, t) represents either E(r, t) or H(r, t).

1.3.3 Random Electromagnetic Fields

A noncoherent wireless carrier may have random fluctuations in both time and space. The temporal properties of a stationary random signal were characterized in Section 1.2.2 by an autocorrelation function. This section extends that analysis to the random properties of an electromagnetic field by defining a joint coherence function for time and space together. The temporal randomness of an electromagnetic field is quantified by the temporal coherence function. For a single point in space, this function is an extension of the autocorrelation function defined in (1.2.41) that includes the vector nature of the field. The spatial randomness in an electromagnetic field is quantified by the spatial coherence function.

In general, the term autocorrelation function is commonly used when describing the statistical properties of electrical signals, whereas the term coherence function is commonly used when describing the statistical properties of electromagnetic fields. The meaning is much the same.

For a random electromagnetic field, the intensity I(r, t) at a single time instant and single point

in space is defined as the expectation of the squared magnitude of the random complex field envelope U(r, t) (cf. (1.3.29))

I(r, t) = ⟨U*(r, t) · U(r, t)⟩ = ⟨|U(r, t)|²⟩.   (1.3.30)

Extending this definition to two points r₁ and r₂ in space, one of which may be delayed in time by τ, the first-order mutual coherence function ϕ(r₁, r₂, τ) is defined as

ϕ(r₁, r₂, τ) ≜ ⟨U(r₁, t) · U*(r₂, t + τ)⟩,

where the expectation is over both time and space. The mutual coherence function may be viewed as the ability of an electromagnetic field to interfere at two points r₁ and r₂ in space at two times separated by τ.

For r₁ = r₂ = r, the temporal coherence function describes the ability of an electromagnetic field to interfere with a time-delayed version of itself at the single point r. The width of the temporal coherence function, which can be defined in several ways, is called the coherence time τ_c (cf. (1.2.45)). Likewise, for τ = 0, the spatial coherence function describes the ability of an electromagnetic field to interfere in space at a single time instant t. A coherence region A_coh is the spatial equivalent of the coherence time τ_c and is a region for which the electromagnetic field is highly correlated in space.

For a single point in space, an ergodic electromagnetic field is one that satisfies

ϕ(r, r, τ) = lim_{T→∞} (1/2T) ∫_{−T}^{T} U(r, t) · U*(r, t + τ) dt.


Loosely, this expression states that a single realization of the random, ergodic time-varying electromagnetic field is enough to compute a property defined for the ensemble. In general, the electromagnetic field from an unmodulated source is ergodic. Likewise, the associated cross-coherence function for two complex electromagnetic fields is

ϕᵢⱼ(r₁, r₂, τ) = ⟨Uᵢ(r₁, t) · Uⱼ*(r₂, t + τ)⟩.   (1.3.31)

The intensity autocoherence function is defined as

ϕ_I(r, r, τ) ≜ ⟨I(r, t)I(r, t + τ)⟩ = ⟨U(r, t)U*(r, t)U(r, t + τ)U*(r, t + τ)⟩.

This is a fourth-order function of the complex field envelope at a single point in space.
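The ergodic time average above suggests a direct numerical estimate of the temporal coherence function at a single point in space from one sampled realization. The sketch below is a minimal example under an assumed field model (low-pass-filtered circularly-symmetric gaussian noise); the model and the use of numpy are assumptions, not part of the text.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical stationary complex field U(t) at a single point, generated by low-pass
    # filtering circularly-symmetric gaussian noise; the filter length sets the coherence time.
    N = 200_000
    w = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
    U = np.convolve(w, np.ones(50) / 50.0, mode="same")

    def coherence(U, tau):
        # Time-average estimate of phi(r, r, tau) = <U(t) U*(t + tau)> for an ergodic field.
        return np.mean(U[: len(U) - tau] * np.conj(U[tau:]))

    phi0 = coherence(U, 0).real
    for tau in range(0, 120, 20):
        print(f"tau = {tau:3d} samples   |phi(tau)|/phi(0) = {abs(coherence(U, tau)) / phi0:.3f}")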

1.4 References

Basic material on linear systems can be found in Kudeki and Munson (2009). Background material on random processes is discussed in Helstrom (1991) as well as Stark and Woods (1994). The subject of generalized functions was developed by Schwartz (1950). Probability distributions involving gaussian random variables are discussed in Simon (2007). Electromagnetic theory is covered in Harrington (1961), Kong (1990), and Chew (1990).

1.5 Problems

1. Linear systems
Show that for any constants a and b, the definition of a linear system can be replaced by the single statement:

a x1(t) + b x2(t)→ a y1(t) + b y2(t),

whenever x1(t)→ y1(t), and x2(t)→ y2(t).

2. Properties of the Fourier transform

a) Starting with the definition of the Fourier transform and its inverse, derive the primary properties of the Fourier transform listed in Section 1.1.

b) Using the modulation property of the Fourier transform and the transform pair 1 ←→ δ(f), show that ∫_{−∞}^{∞} e^{i2πf₁t} e^{−i2πf₂t} dt = δ(f₂ − f₁), thereby demonstrating that the set {e^{−i2πf_j t}} of time-harmonic functions is orthogonal.

c) The statement of the Fourier transform and the statement of its inverse are identical with the exception of the signs in the exponent. Using this observation and the differentiation property, derive the expression

tⁿ f(t) ←→ iⁿ dⁿF(ω)/dωⁿ.


3. Gram-Schmidt procedure
The Gram-Schmidt procedure is a constructive method to create an orthonormal basis for the space spanned by a set of N signal vectors that are not necessarily linearly independent. Let {x_n(t)} be a set of signal vectors. The procedure is as follows (a numerical sketch of the procedure is given after this problem):

a) Set ψ1(t) = x1(t)/√E1 where E1 is the signal energy.

b) Determine the component of x2(t) that is linearly independent of ψ1(t) by finding the projection of x2(t) along ψ1(t). This component is given by [x2(t) · ψ1(t)]ψ1(t) where the inner product is defined in (1.1.65).

c) Subtract this component from x2(t).

d) Normalize the difference. The resulting basis vector can be written as

ψ2(t) = (x2(t) − [x2(t) · ψ1(t)]ψ1(t)) / ‖x2(t) − [x2(t) · ψ1(t)]ψ1(t)‖.

e) Repeat for each subsequent vector in the set, forming the normalized difference between the vector and the projection of the vector onto each of the basis vectors already determined. If the difference is zero, then the vector is linearly dependent on the previous vectors and does not constitute a new basis vector.

f) Continue until all vectors have been used.

Using this procedure, determine:

a) An orthonormal basis for the space over the interval [0, 1] spanned by the functions x1(t) = 1, x2(t) = sin(2πt), and x3(t) = cos²(2πt).

b) An orthonormal basis for the space over the interval [0, 1] spanned by the functions x1(t) = e^t, x2(t) = e^{−t}, and x3(t) = 1.
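A minimal numerical sketch of the Gram-Schmidt procedure described in this problem, applied to sampled versions of the functions in part (a); the sampling grid, the Riemann-sum inner product, the dependence tolerance, and the use of numpy are assumptions made only for this illustration.

    import numpy as np

    t = np.linspace(0.0, 1.0, 10_001)       # samples of the interval [0, 1]
    dt = t[1] - t[0]

    def inner(u, v):
        # Inner product <u, v> approximated by a Riemann sum over [0, 1].
        return np.sum(u * np.conj(v)) * dt

    def gram_schmidt(signals, tol=1e-9):
        basis = []
        for x in signals:
            r = x.astype(complex)
            for psi in basis:
                r = r - inner(x, psi) * psi      # subtract the projection onto each basis vector
            norm = np.sqrt(inner(r, r).real)
            if norm > tol:                       # a zero difference means x is linearly dependent
                basis.append(r / norm)
        return basis

    # Functions from part (a).
    x1 = np.ones_like(t)
    x2 = np.sin(2 * np.pi * t)
    x3 = np.cos(2 * np.pi * t) ** 2

    basis = gram_schmidt([x1, x2, x3])
    print("number of basis functions:", len(basis))
    print("Gram matrix:", [[round(inner(a, b).real, 6) for b in basis] for a in basis])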

4. Gaussian pulse

a) Using the Fourier transform pair e^{−πt²} ←→ e^{−πf²} and the scaling property of the Fourier transform, show that

e^{−t²/2σ²} ←→ √(2π) σ e^{−2π²σ²f²} = √(2π) σ e^{−σ²ω²/2},

and thereby prove that the timewidth-bandwidth product T_rms W_rms for a gaussian pulse is equal to (2π)⁻¹, with the bandwidth σ_f measured in hertz.

b) Repeat part (a) and show that when the root-mean-squared timewidth is defined using the squared magnitude of the pulse and the root-mean-squared bandwidth is defined using the squared magnitude of the Fourier transform of the pulse, T_rms W_rms = 1/2.

c) Derive the relationship between the root-mean-squared bandwidth W_rms for the signal power and the −3 dB or half-power bandwidth W_h for a pulse whose power P(t) is given by e^{−t²/2σ_P²}.

d) An electromagnetic pulse s(t) modeled as a gaussian pulse with a root-mean-squared timewidth T_rms is incident on a square-law photodetector with the electrical pulse p(t) generated by direct photodetection given by |s(t)|²/2. Determine the following:


i. The root-mean-squared timewidth of p(t) in terms of T_rms.

ii. The root-mean-squared timewidth of the electrical power per unit resistance P_e(t) = p(t)² in terms of T_rms.

e) Finally, rank-order the root-mean-squared timewidth of the electromagnetic pulse s(t), the electrical pulse p(t) generated by direct photodetection, and the electrical power pulse P_e(t). Is this ordering valid for any pulse shape?

5. Pulse formats
Derive relationships between the root-mean-squared width, the −3 dB width, and the full-width-half-maximum width in both the time domain and the frequency domain for:

a) A rectangular pulse defined as p(t) = 1 for −W/2 ≤ t ≤W/2, and zero otherwise.

b) A triangular pulse defined as p(t) = 1− |t| /W for |t| ≤W , and zero otherwise.

c) A lorentzian pulse defined as p(t) = 2α/(t² + α²),

where α is a constant.

6. Pulse characterization
The rectangular pulse p(t) defined in Problem 2.5 is used as the input to a time-invariant linear system defined by h(t) = p(t) so that the impulse response is equal to the input pulse.

a) Derive the full-width-half-maximum timewidth and the root-mean-squared timewidth of the output y(t) = p(t) ~ p(t) and show explicitly that 2σ²_p = σ²_y.

b) Let the full-width-half-maximum width be denoted by F. Determine if the relationship 2F²_p = F²_y holds for each of the signals defined in Problem 2.5.

7. Passband, baseband, analytic signals, and the Hilbert transform

a) Using

s(t) = A(t) cos(2πf_c t + φ(t)) = Re[(s_I(t) + i s_Q(t)) e^{i2πf_c t}] = Re[z(t)],

determine expressions for A(t) and φ(t) in terms of s_I(t) and s_Q(t).

b) Verify the following relationships:

i. s_I(t) = Re[z(t)e^{−i2πf_c t}]
ii. s_Q(t) = Im[z(t)e^{−i2πf_c t}]
iii. A(t) = |z(t)|
iv. φ(t) = arg[z(t)e^{−i2πf_c t}]


c) Derive a relationship for the Hilbert transform ŝ(t) of s(t) in terms of the complex-baseband signal s_I(t) + i s_Q(t) and the carrier frequency f_c.

d) Given that s(t) is a real causal function with the Fourier transform pair s(t) ←→ S_R(ω) + iS_I(ω), use the conjugate symmetry properties of S_R(ω) and S_I(ω) to show that the Kramers-Kronig transform can be written as

S_I(ω) = (2/π) ∫₀^∞ [ω/(ω² − Ω²)] S_R(Ω) dΩ
S_R(ω) = (2/π) ∫₀^∞ [Ω/(Ω² − ω²)] S_I(Ω) dΩ.

8. Preservation of the commutator property
Prove that the commutator property [A,B] = 0 is preserved under a change of basis.

9. Trace of an outer product
Using (1.1.86), a square matrix T can be written as a weighted sum of outer products

T = Σ_{m,n} T_{mn} x_n x_m†.

Using this expression and the properties of the trace operation given in (1.1.80b) and (1.1.80c), show that the trace of a square matrix expressed as an outer product of two vectors is equal to the inner product of the same two vectors as given in (1.1.81).

10. Probability density functions

a) Verify that the mean of the rayleigh probability density function

f(r) = (r/σ²) e^{−r²/2σ²},   r ≥ 0,

is σ√(π/2) and that the variance is σ²(2 − π/2).

b) Show that as A becomes large, a ricean probability density function can be approximated by a gaussian probability density function. Why should this be expected?

11. Transformation of a function of a random variable
A new probability density function f(y) is generated when a random variable x with probability density function f(x) is transformed by the functional relationship y = T(x), where T is invertible over the region where f(x) is defined.

a) Using the fact that the transformation must preserve probabilities on intervals so that f_y(y)dy = f_x(x)dx, show that

f_y(y) = f_x(T⁻¹(y)) |dx/dy|.

b) Using f_y(y) = f_x(T⁻¹(y)) |dx/dy|, show that for y = x²/2 and f_x(x) = (x/σ²) exp(−x²/2σ²), which is a rayleigh probability density function, f_y(y) is an exponential probability density function with an expected value σ².


c) Let x = G(w, z) and y = F(w, z) be the inverse transformations that express the variables (x, y) in terms of the variables (w, z). For a joint probability density function f_xy(x, y), the expression for f_wz(w, z) is

f_wz(w, z) = f_xy(G(w, z), F(w, z)) |J|,

where |J| is the determinant of the jacobian matrix of this transformation, which is given by

[ ∂F/∂w   ∂F/∂z ]
[ ∂G/∂w   ∂G/∂z ].

Using this expression and the transformation x = G(w, z) = z − w and y = F(w, z) = w, show that

fwz(w, z) = fxy(z − w,w).

d) Using the result of part (c), show that

f_z(z) = ∫_{−∞}^{∞} f_xy(z − y, y) dy.

e) Show that if the two random variables x and y are independent, then f_xy(z − y, y) is a product distribution, and the probability density function f_z(z) for z is given by

f_z(z) = ∫_{−∞}^{∞} f_x(z − y) f_y(y) dy = f_x(z) ~ f_y(z),

which is (1.2.11).

12. Marginalization
The bivariate gaussian probability density function has the form

p_{x,y}(x, y) = A e^{−(ax² + 2bxy + cy²)}.

a) Express A in terms of a, b and c.

b) Find the marginals, px(x) and py(y), and the conditionals px|y(x|y) and py|x(y|x).

c) Find the means ⟨x⟩, ⟨y⟩, the variances σ²_x, σ²_y, and the correlation ⟨xy⟩. Hint: ax² + 2bxy + cy² = (a − b²/c)x² + c(y + bx/c)².

13. Number of required terms for the Fourier series expansion of the phase function (requires numerics; a numerical sketch is given after this problem)

Let f1(φ) be the exact form for the marginal probability density function of the phase given


in (1.2.37) and let f2(N, φ) be the series approximation using N terms of the Fourier series as given by

f_φ(φ) = 1/2π + Σ_{n=1}^{∞} A_n cos(nφ),

where F = A²/2σ² and where the zero-frequency component is 1/2π. The coefficients of the cosine Fourier series are given by (see Prabhu (1969))

A_n = ½ √(F/π) e^{−F/2} [I_{(n−1)/2}(F/2) + I_{(n+1)/2}(F/2)],

where I_n is the modified Bessel function of order n.

Define the root-mean-squared error as follows:

δ(N) = √( 2 ∫₀^π (f1(φ) − f2(N, φ))² dφ ).

a) How many terms are required so that the error is less than 10⁻⁶ if F = 1?

b) How many terms are required if F=10?

c) Discuss the results with respect to the number of terms required for a specified accuracy as a function of F.
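A minimal numerical sketch for evaluating the truncated series f2(N, φ) from the coefficients A_n given above; computing the error δ(N) additionally requires the exact density f1(φ) of (1.2.37), which is not reproduced here. The use of scipy for the modified Bessel function is an assumption of this sketch.

    import numpy as np
    from scipy.special import iv                       # modified Bessel function I_nu(x)

    def A_n(n, F):
        # A_n = (1/2) sqrt(F/pi) e^{-F/2} [ I_{(n-1)/2}(F/2) + I_{(n+1)/2}(F/2) ]
        return 0.5 * np.sqrt(F / np.pi) * np.exp(-F / 2) * (iv((n - 1) / 2, F / 2) + iv((n + 1) / 2, F / 2))

    def f2(N, phi, F):
        # Truncated Fourier-series approximation of the phase density using N cosine terms.
        series = np.full_like(phi, 1.0 / (2.0 * np.pi))
        for n in range(1, N + 1):
            series += A_n(n, F) * np.cos(n * phi)
        return series

    phi = np.linspace(-np.pi, np.pi, 9)
    print(np.round(f2(10, phi, F=1.0), 6))             # N = 10 terms evaluated at F = 1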

14. Joint and marginal gaussian probability density functions
The joint probability density function p(x, y) is given as

p(x, y) = (1/(2πσ_x σ_y)) exp[−½ (x²/2σ_x² + y²/2σ_y²)]   if xy > 0,
p(x, y) = 0   if xy < 0.

a) Show that this function is a valid probability density function.

b) Sketch p(x, y) in plan view and in three dimensions. Is this joint probability density function jointly gaussian?

c) Find the marginal probability density functions p_x(x) and p_y(y) and comment on this result.

15. Coherence function and the power density spectrum

a) Let R(τ) = e^{−|τ|} e^{i2πf_c τ}. Determine the one-sided power density spectrum S(f).

b) A wireless carrier has a power density spectrum S(f) given by

S(f) = π/((f − f_c)² + π).

Determine the total signal power P.


c) Determine the full-width-half-maximum width of the spectrum in part (b).

d) Estimate the coherence time τc for the spectrum in part (b).

16. Autocorrelation and the power density spectrum of a random signal using sinusoidal pulses
A binary waveform consists of a random and independent sequence of copies of the pulse (1 + cos(2πt/T)) rect(t/T) with random amplitude A_n for the nth term of the sequence. The start time j of the pulse sequence is a uniformly-distributed random variable over [0, T]. The symbols transmitted in each nonoverlapping interval of length T are independent. The probability of transmitting a mark with an amplitude A is 1/2. The probability of transmitting a space with an amplitude 0 is 1/2.

a) Determine the autocorrelation function of the signal.

b) Determine the power density spectrum of the signal.

17. Covariance matrices
Define z as a vector of N circularly-symmetric gaussian random variables with a complex covariance matrix W given in (1.2.29). Define x as a vector of length 2N that consists of the real part Re[z] and the imaginary part Im[z] in the order x = [Re[z1], ..., Re[zN], Im[z1], ..., Im[zN]]. Show that the real 2N × 2N covariance matrix C given by (cf. (1.2.22))

C = ⟨(x − ⟨x⟩)(x − ⟨x⟩)^T⟩,

where x is a column vector formed by pairwise terms, can be expressed in block form in terms of the N × N complex covariance matrix W as

C = ½ [ Re W   −Im W ]
      [ Im W    Re W ].

18. Pareto probability density function
The Pareto probability density function (cf. (1.2.6)) is

f_x(x) = λ x^{−(λ+1)}   for x ≥ 1, and 0 for x < 1.

a) Show that the mean is λ/(λ − 1) for λ > 1 and otherwise is infinite.

b) Show that the variance is λ/[(λ − 1)²(λ − 2)] for λ > 2 and otherwise is infinite.

19. Diagonalizing a covariance matrix
A real covariance matrix C of a bivariate gaussian random variable is given by

C = [ 1  1 ]
    [ 1  4 ].

a) Determine a new coordinate system (x′, y′) such that the joint probability density function in that coordinate system is a product distribution and express the joint probability density function in that coordinate system as the product of two one-dimensional gaussian probability density functions.


b) Plot this probability density function using a contour plot showing the original coordinates (x, y) and the transformed coordinates (x′, y′).

c) Determine the angle θ of rotation defined as the angle between the x axis and the x′ axis.
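A minimal numerical sketch of parts (a) and (c) using an eigendecomposition of the given covariance matrix; the use of numpy is assumed, and the contour plot of part (b) is omitted.

    import numpy as np

    C = np.array([[1.0, 1.0],
                  [1.0, 4.0]])                        # covariance matrix from the problem statement

    # Eigendecomposition C = V diag(lam) V^T; the columns of V define the rotated axes (x', y'),
    # along which the joint density factors into a product of one-dimensional gaussian densities.
    lam, V = np.linalg.eigh(C)
    print("variances along the rotated axes:", np.round(lam, 4))
    theta = np.degrees(np.arctan2(V[1, 0], V[0, 0]))  # angle between the x axis and one rotated axis
    print("rotation angle (degrees):", round(theta, 2))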

20. Sums of gaussian random variables
A random variable G is formed from the sum of two independent gaussian random variables of equal variance σ² and expected values E1 and E1(1 + δ), as shown in the figure below.

a) Determine the probability that the value of G is less than E1 − xσ in terms of x, σ, and δ, where x is a scaling parameter.

b) Show that if x is large, then for a value of finite δ, the probability determined in part (a) is dominated by the random variable centered at E1 and that the contribution to the probability from the random variable centered at E1 + δ is negligible.

21. Material impedance

a) By substituting the complex plane-wave fields at a frequency ω

E = E₀ e^{−iβ·r} ê
H = H₀ e^{−iβ·r} ĥ

into the two curl equations

∇ × E = −∂B/∂t,   ∇ × H = ∂D/∂t,

and using the constitutive relationships along with ê × ĥ = β̂, show that H₀ = E₀/η, where η = √(µ₀/ε) is the material impedance.

b) Starting with

n²(r) ≜ ε(r)/ε₀ = ε_r(r) = 1 + X(r),

show that the material impedance η(ω) as a function of ω can be written as

η(ω) = η₀/n(ω),

where η₀ = √(µ₀/ε₀) is the impedance of free space. (The impedance of free space has a value of 377 Ω.)


22. Wave equation
This problem discusses the derivation of the wave equation.

a) Using (1.3.7), solve for the divergence term ∇ · E(r) in (1.3.11) in terms of D(r). The result consists of two terms, one of which is E(r) · ∇ log_e n²(r).

b) State the conditions under which the divergence term can be neglected.


2 Examples

2.1 Filter Estimation

In communication systems, a discrete-time filter {y_k} based on minimum mean-squared error at the filter output can be estimated directly from the received signal without first estimating the channel impulse response. A detection filter estimated using this direct approach computes the form of the detection filter from the received sequence based on statistical knowledge about the data. This method results in a different filter from an equalization filter. The design of that equalization filter, described as a matched filter followed by a transversal filter, requires knowledge about the received pulse p(t) obtained from an estimate of the channel impulse response.

When cast as an estimation problem, the sampled output component f_k of the finite-length detection filter can be written as

f_k = Σ_{m=0}^{L−1} y_m r_{k−m},   (2.1.1)

where the sequence {r_k} consists of the received samples in additive white gaussian noise and the sequence {y_m} specifies the detection filter to be estimated. The summation in (2.1.1) can be written as an inner product of the complex conjugate y* of the desired filter of length L and a vector r_k of noisy samples with (cf. (1.1.65))

f_k = y^T r_k = r_k^T y,   (2.1.2)

where y is an L-by-one column vector whose components are the filter coefficients y_k, and r_k is an L-by-one column vector of noisy samples from the sequence {r_k} defined for each k as r_k = [r_{k−L+1}, . . . , r_k]^T. The error in the kth component is e_k = s_k − f_k. The filter coefficients y_k are chosen to minimize the mean-squared-error objective function

J(k, y) = ⟨e_k e_k*⟩.   (2.1.3)

The filter y that minimizes this objective function is determined by taking the gradient of (2.1.3) and equating the result to zero, where the gradient consists of the vector of all relevant partial derivatives.

Because e_k is a complex function, a complex gradient is used in preference to the combination of the scalar gradient on each complex component. The complex gradient (see Brandwood (1983)) is defined either as

∇_y = [∂/∂y₁, ∂/∂y₂, . . . , ∂/∂y_L]^T   (2.1.4a)



or as

∇_y* = [∂/∂y₁*, ∂/∂y₂*, . . . , ∂/∂y_L*]^T,   (2.1.4b)

where, for y_j = u_j + iv_j,

∂/∂y_j ≜ ½(∂/∂u_j − i ∂/∂v_j),   (2.1.5a)
∂/∂y_j* ≜ ½(∂/∂u_j + i ∂/∂v_j).   (2.1.5b)

These two forms of the complex gradient are essentially equivalent. Either form can be used. The complex gradient has the useful property that

∇_y* [ Σ_{k=1}^{M} a_k y_k ] = 0   (2.1.6)

for any complex vector (a₁, . . . , a_M). The proof of this statement is asked in an end-of-chapter exercise.

It is convenient here to choose ∇_y* as the complex gradient used to derive the desired filter.

Differentiating by parts, this complex gradient becomes

∇_y* J(k, y) = ⟨(∇_y* e_k)e_k* + e_k ∇_y* e_k*⟩.

Because ∂y_j/∂y_j* = 0 (cf. (2.1.6)), and e_k is linear in y_j, the term ∇_y* e_k is equal to zero. The second term is evaluated as follows. Using e_k* = s_k* − y†r_k* (cf. (2.1.2)), the term ∇_y* e_k* is written as

∇_y* e_k* = ∇_y*(s_k* − y†r_k*) = −r_k*   (2.1.7)

because ∇_y* y†r_k* = r_k*. Therefore,

∇_y* J(k, y) = −⟨e_k r_k*⟩.   (2.1.8)

This means that the condition

⟨e_k r_k*⟩ = 0   (2.1.9)

for each k, leads to the minimum mean-squared error, where 0 is the zero vector of length L. Equivalently, ⟨e_k r_{k−ℓ}*⟩ = 0 for ℓ = 0, . . . , L − 1.

The optimal filter y_opt is the filter for which f_opt = y_opt^T r has the minimum error e_min for each k. Multiplying each side of (2.1.9) by y_opt^T gives ⟨e_k y_opt^T r_k*⟩ = y_opt^T 0, or

⟨e_min f_opt⟩ = 0,   (2.1.10)

where f_opt = y_opt^T r* for each k. This condition can be described geometrically on the complex plane as an orthogonality condition between the minimum error e_min and the optimal estimated data value f_opt for each k, as is shown in Figure 2.1.



Figure 2.1: (a) The error e_k for the kth component is the vector difference on the complex plane between the value s_k and the estimated value f_k. (b) If the error is minimized, then the error is orthogonal to the estimated value f_k and is orthogonal to the kth observed value r_k for all k.

Now write the orthogonality condition in terms of the components of the sequences

⟨e_k r_{k−ℓ}*⟩ = ⟨(s_k − Σ_{m=0}^{L−1} y_m r_{k−m}) r_{k−ℓ}*⟩ = 0   for ℓ = 0, 1, ..., L − 1.

Separating and equating terms gives

Σ_{m=0}^{L−1} y_m R_rr(m − ℓ) = R_rs(−ℓ)   for ℓ = 0, 1, ..., L − 1,   (2.1.11)

where R_rr(m − ℓ) = ⟨r_{k−m} r_{k−ℓ}*⟩ is the autocorrelation of the received sequence (cf. (??)), and R_rs(−ℓ) = ⟨r_{k−ℓ}* s_k⟩ is the cross-correlation of the received sequence with the desired data sequence. Equation (2.1.11) is valid for each value of ℓ. The resulting set of equations expressed in vector-matrix form is (cf. (??))

R y = w_rs,   (2.1.12)

where R is the matrix form of the statistical autocorrelation function (cf. (??)) with R_{mℓ} = R_rr(m − ℓ),

w_rs = ⟨r* s_k⟩ = [R_rs(0), R_rs(−1), . . . , R_rs(1 − L)]^T   (2.1.13)

is the vector form of the cross-correlation function, and

y = [y₀, y₁, . . . , y_{L−1}]^T   (2.1.14)

is a vector of length L of the filter coefficients. The solution to (2.1.12) is given by

y_opt = R⁻¹ w_rs,   (2.1.15)

where the inverse R⁻¹ exists when R is a positive-definite matrix, which is almost always the case for an autocorrelation function. The causal filter described by y_opt is called a finite-impulse-response Wiener filter, with the set of equations specified by (2.1.12) called the Wiener-Hopf equations.
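A minimal numerical sketch of estimating the finite-impulse-response Wiener filter of (2.1.15) from sample statistics; the data model (symbols, channel, and noise level) and the regressor ordering are assumptions made only to produce test data, and numpy is assumed.

    import numpy as np

    rng = np.random.default_rng(1)
    L, N = 5, 50_000

    # Hypothetical test data (assumption): QPSK-like symbols through a short channel plus noise.
    s = (rng.choice([-1.0, 1.0], N) + 1j * rng.choice([-1.0, 1.0], N)) / np.sqrt(2)
    r = np.convolve(s, [0.9, 0.3, -0.1])[:N]
    r = r + 0.1 * (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

    # Sample estimates of the autocorrelation matrix R and the cross-correlation vector w_rs,
    # with the regressor u_k = [r_k, r_{k-1}, ..., r_{k-L+1}] so that f_k = y^T u_k (cf. (2.1.1)).
    R = np.zeros((L, L), dtype=complex)
    w = np.zeros(L, dtype=complex)
    for k in range(L - 1, N):
        u = r[k - L + 1 : k + 1][::-1]
        R += np.outer(np.conj(u), u)
        w += s[k] * np.conj(u)
    R /= N - L + 1
    w /= N - L + 1

    y_opt = np.linalg.solve(R, w)          # Wiener-Hopf solution y_opt = R^{-1} w_rs  (2.1.15)
    mse = np.mean([abs(s[k] - np.dot(r[k - L + 1 : k + 1][::-1], y_opt)) ** 2 for k in range(L - 1, N)])
    print("estimated Wiener filter:", np.round(y_opt, 3))
    print("mean-squared error:", round(mse, 4))

The sample averages here play the role of the statistical autocorrelation and cross-correlation appearing in (2.1.12).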


The error for each estimated component can be determined using (2.1.2) to rewrite the objective function given in (2.1.3) in terms of matrices. Then

J(k, y) = σ²_s − y†w_rs − w_rs†y + y†Ry,   (2.1.16)

where (2.1.13) is used to write w_rs† = ⟨s_k* r^T⟩. Provided that R⁻¹ exists, J(k, y) can be written as the sum of a “perfect square” term and a residual error term given by

J(k, y) = (y − R⁻¹w_rs)† R (y − R⁻¹w_rs) + σ²_s − w_rs†R⁻¹w_rs,   (2.1.17)

showing that the objective function is an L-dimensional quadratic surface with a unique minimum value given by y_opt = R⁻¹w_rs. The error at this minimum is

σ²_err = σ²_s − w_rs†R⁻¹w_rs
       = σ²_s − w_rs†y_opt,   (2.1.18)

where the second equation follows from the first equation using y_opt = R⁻¹w_rs.

2.2 Constant-Modulus Objective Function

An alternative linear technique for sequence detection starts with the form of the estimate given in (2.1.2) but with a different objective function called the constant-modulus objective function. Recall that {s_k} is the sequence of data values, {f_k} is the sequence of estimated data values, and {r_k} is a sequence of noisy samples from which the estimate is formed. Whereas the criterion of minimizing the mean-squared error ⟨|s_k − f_k|²⟩ (cf. (2.1.3)) was used in Section 2.1, instead, the constant-modulus method for estimation of the filter coefficients uses the alternative criterion that the squared magnitude |f_k|² is equal to a constant for all k. This criterion is suitable for a constant-magnitude signal constellation. The advantage of this criterion is that a cross-correlation function R_rs(ℓ) based on the data s_k is not required. An objective function that enforces a constant-modulus constraint is

J(k, y) = ⟨(|f_k|² − 1)²⟩,   (2.2.1)

where the squared magnitude of each symbol is normalized to one. (Other forms of the objective function replace the value one by a ratio of the moments of the statistical distribution of the datastream.) The objective function J(k, y) is minimized when ∇_y* J(k, y) = 0.

Using (2.1.2) to write f_k* f_k as y†r_k* r_k^T y, the complex gradient with respect to the filter y* (cf. (2.1.4b)) has the same form as (2.1.8) with

∇_y* J(k, y) = 2⟨(|f_k|² − 1) ∇_y*(y†r_k* r_k^T y)⟩
             = 2⟨(|f_k|² − 1)(r_k* r_k^T y)⟩
             = 2⟨(|f_k|² − 1) r_k* f_k⟩
             = 2⟨e_k r_k*⟩,   (2.2.2)



where e_k = (|f_k|² − 1)f_k is the error in the estimate and where the property ∇_y*(y†By) = By of the complex gradient has been used with B = r_k* r_k^T. The error is minimized when e_k is orthogonal to r_k*.

The constant-modulus objective function does not define an L-dimensional quadratic surface as was the case for the minimum mean-squared error. Therefore, there may be local minima at which the gradient is zero. Nevertheless, for small errors, the optimal filter defined by the constant-modulus objective function is a scaled form of the Wiener filter defined by the minimum mean-squared-error objective function (cf. (2.1.12)); see Treichler and Agee (1983).

Setting (2.2.2) to the zero vector gives a set of equations written as

R y = w_rf,   (2.2.3)

where R is the autocorrelation matrix of the received noisy signal, and w_rf = ⟨(|f_k|² − 1)f_k r_k*⟩ is the cross-correlation between (|f_k|² − 1)f_k and r_k (cf. (2.1.13)).

2.3 Adaptive Estimation

Examining (2.1.15), the calculation of the Wiener filter requires the cross-correlation w_rs, which depends on the joint statistics of the received sequence and the data sequence. An initial estimate can be obtained by using a training sequence. The maintenance of the minimum of the objective function can be cast into an adaptive form. To do so, write the vector of filter coefficients y(k + 1) for the (k + 1)th step in terms of y(k) for the kth step and the gradient of the objective evaluated at k so that

y(k + 1) = y(k) − µ ∇_y* J(k, y),   (2.3.1)

where µ is a gain parameter that balances the rate of convergence with the accuracy. Using (2.1.8) and replacing the statistical expectation with the instantaneous value gives

y(k + 1) = y(k) + µ e_k r_k*.   (2.3.2)

This replacement leads to a gradient that is random. This estimation method is called the stochastic gradient descent method. Accordingly, the algorithm will execute a random walk about the optimal solution y_opt with an excursion that depends on the noise.

The constant-modulus algorithm can also be cast into an adaptive form following the same steps that were used to develop an adaptive form of the Wiener filter. Start with (2.2.2) and replace the expectation by its instantaneous value. Incorporating the factor of two into the definition of µ leads to

y(k + 1) = y(k) − µ(|f_k|² − 1) f_k r_k*
         = y(k) − µ e_k r_k*,   (2.3.3)

where e_k ≜ (|f_k|² − 1) f_k is the update error.
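A minimal numerical sketch of the two stochastic-gradient updates (2.3.2) and (2.3.3); the data model, step size, and filter initialization are assumptions chosen only for illustration, and numpy is assumed.

    import numpy as np

    rng = np.random.default_rng(2)
    L, N, mu = 5, 20_000, 0.005

    # Hypothetical constant-modulus (QPSK) data through an assumed channel, plus noise.
    s = np.exp(1j * (np.pi / 4 + (np.pi / 2) * rng.integers(0, 4, N)))
    r = np.convolve(s, [0.9, 0.3, -0.1])[:N]
    r = r + 0.05 * (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

    y_lms = np.zeros(L, dtype=complex)                # adaptive Wiener (stochastic gradient) filter
    y_cma = np.zeros(L, dtype=complex)
    y_cma[0] = 1.0                                    # CMA needs a nonzero start to avoid the y = 0 stationary point

    for k in range(L - 1, N):
        u = r[k - L + 1 : k + 1][::-1]                # regressor so that f_k = y^T u (cf. (2.1.1))
        f = np.dot(y_lms, u)                          # update (2.3.2): e_k = s_k - f_k (training data assumed known)
        y_lms = y_lms + mu * (s[k] - f) * np.conj(u)
        f = np.dot(y_cma, u)                          # update (2.3.3): e_k = (|f_k|^2 - 1) f_k (no training data)
        y_cma = y_cma - mu * (np.abs(f) ** 2 - 1.0) * f * np.conj(u)

    print("adaptive Wiener (LMS) filter:", np.round(y_lms, 3))
    print("constant-modulus (CMA) filter:", np.round(y_cma, 3))

Because the constant-modulus objective is insensitive to a common phase rotation of the filter, the constant-modulus solution generally agrees with the Wiener filter only up to a complex scale factor, consistent with the scaled relationship noted in Section 2.2.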



Bibliography

D. H. Brandwood. A complex gradient operator and its application in adaptive array theory. Microwaves, Optics and Antennas, IEE Proceedings H, 130(1):11–16, 1983.

W. C. Chew. Waves and Fields in Inhomogeneous Media. IEEE Press Series on Electromagnetic Waves. Van Nostrand Reinhold, 1990.

R. F. Harrington. Time-Harmonic Electromagnetic Fields. McGraw-Hill, New York, NY, 1961.

C. W. Helstrom. Probability and Stochastic Processes for Engineers. Macmillan; Collier-Macmillan Canada; Maxwell Macmillan International, New York: Toronto, 1991.

J. A. Kong. Electromagnetic Wave Theory. Wiley, New York, NY, 1990.

E. Kudeki and D. C. Munson. Analog Signals and Systems. Illinois ECE Series. Pearson Prentice Hall, 2009.

V. K. Prabhu. Error-rate considerations for digital phase-modulation systems. IEEE Transactions on Communication Technology, 17(1):33–42, 1969.

I. Reed. On a moment theorem for complex gaussian processes. IRE Transactions on Information Theory, 8(3):194–195, 1962.

L. Schwartz. Théorie des Distributions. Hermann and Cie, Paris, 1950.

M. K. Simon. Probability Distributions Involving Gaussian Random Variables: A Handbook for Engineers and Scientists. International Series in Engineering and Computer Science. Springer US, 2007.

H. Stark and J. W. Woods. Probability, Random Processes, and Estimation Theory for Engineers. Industrial and Systems Engineering. Prentice Hall, 1994.

R. S. Strichartz. A Guide to Distribution Theory and Fourier Transforms. Studies in Advanced Mathematics. World Scientific, 2003.

J. R. Treichler and B. G. Agee. A new approach to multipath correction of constant modulus signals. IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-31:459–72, Apr. 1983.

