Theory of Linear Systems Lecture Notes — Luigi Palopoli and Daniele Fontanelli, Dipartimento di Ingegneria e Scienza dell'Informazione, Università di Trento

palopoli/courses/TS/TScomplete.pdf · 2016-01-07



Theory of Linear Systems

Lecture Notes

Luigi Palopoli and Daniele Fontanelli

Dipartimento di Ingegneria e Scienza dell’Informazione

Universita di Trento


Contents

List of symbols and notations

Chapter 1. Introduction

Chapter 2. Definition of system
2.1. Signals
2.2. Systems
2.3. I/O Representation
2.4. State Space Representation
2.5. Linear Systems
2.6. Time Invariance
2.7. Hybrid Systems

Chapter 3. Impulse response and convolutions for linear and time-invariant I/O representations
3.1. Forced Evolution of Discrete-Time Systems
3.2. Forced Evolution of Continuous-Time Systems
3.3. Properties of the Impulse Response

Chapter 4. Laplace and z-Transform
4.1. Complex Exponentials
4.2. Complex Exponential Functions
4.3. Definition of the Laplace Transform
4.4. Existence and Uniqueness of the Laplace Transform
4.5. Properties of the Laplace Transform
4.6. Inversion of the Laplace Transform
4.7. BIBO Stability for CT Systems
4.8. Use of the Laplace Transform for Control Design
4.9. The z-Transform
4.10. Existence and Uniqueness of the z-Transform
4.11. Properties of the z-Transform
4.12. Inverse z-Transform
4.13. BIBO Stability of DT Systems
4.14. z-Transform of Sampled Data Signals

Chapter 5. State Space Analysis
5.1. Control Canonical Form
5.2. Coordinates Transformation
5.3. Continuous Time Linear System Solution
5.4. The Role of the Eigenvalues for Continuous Time Systems
5.5. Discrete Time Linear System Solution
5.6. The Role of the Eigenvalues for Discrete Time Systems
5.7. Modes Analysis

Chapter 6. Structural Properties of Systems
6.1. Stability
6.2. Stability of Linear Systems

Appendix A. Some mathematical definitions of interest

Bibliography


List of symbols and notations

• R: set of real numbers
• N: set of natural numbers
• C: set of complex numbers
• Z: set of integer numbers
• T: time space
• U: class of input signals
• Y: class of output signals
• S: system
• 1(·): step function, defined as

1(t) = 0 if t < 0
1(t) = 1 if t ≥ 0

• L(·): Laplace transform
• L−1(·): inverse Laplace transform
• Real(z): real part of a complex number
• Imag(z): imaginary part of a complex number
• |z|: modulus of a complex number
• ∠z: phase of a complex number
• z̄: complex conjugate


CHAPTER 1

Introduction

These lecture notes are related to the course of "Theory of Linear Systems" taught in the undergraduate class of Telecommunications and Computing Engineering at the University of Trento. The presentation of Linear Systems proposed here is necessarily synthetic and introductory. The reader interested in additional details is referred to one of the books that exist on the topic. Two excellent examples are the book by João Hespanha on Linear Systems [Hes09] and the book by Phillips et al. [PPR95] on linear systems.



CHAPTER 2

Definition of system

This course revolves around the notion of timed systems. For our purposes, a system is a physical or artificial entity (e.g., a computer program) where a number of meaningful quantities evolve in a way that can be described using a mathematical formalism. The evolution of our quantities of interest is described by a mathematical function that associates a value of a variable called "time" with a value taken by the quantity, which is defined in an appropriate set (e.g., voltage, position, velocity, etc.). Such functions will be called "signals".

In some cases it will be convenient to define a system as a relation between input signals and output signals. In other cases, a system is best thought of as a relation between the evolutions of some quantities. Examples are offered by an economic system or by a human body. In such cases, we may or may not find it convenient to identify input and output quantities (as an example, the system could evolve autonomously and have no input).

Our illustration of this notion is best given using examples.

Example 2.1. Consider the RC circuit shown in Figure 1. In this system, we can easily identify several quantities of interest, for which it is possible to take measurements. The most evident ones are the currents and voltages across the resistor and the capacitor. The laws of physics allow us to describe the evolution of the different quantities in time.

In this example, by time we mean exactly the "physical time", which flows with continuity and allows us to observe the evolution of physical phenomena.

Let us move to a completely different example.

Figure 1. A simple RC network


Figure 2. State machine implementing a modulo 4 counter (states 0–3; transitions labelled input/output: 0/00, 1/01, 0/01, 1/10, 0/10, 1/11, 0/11, 1/00)

Example 2.2. Consider a digital circuit that counts the number of times that a certain event occurs (e.g., input = 1) modulo 4 and that prints this number as an output on the occurrence of any event. Recalling basic notions of digital circuits, it is easy to see that a circuit like this can be modelled as the state machine shown in Figure 2. What we have just shown in the figure is commonly referred to as a "state transition diagram": a graph where circles represent states and arcs are associated with transitions. A transition is labelled by an input/output pair, meaning that the transition is taken for the specified input and results in the emission of the specified output. In our example, if the system receives a 1 as an input when in state 0, it moves into state 1 producing 01 as output. The label associated with the transition is therefore "1/01", where the first number represents the input that triggers the transition and the second number represents the output generated when the transition is taken. In this example, we are implicitly using a binary representation for numbers.

The system in the example could be implemented by a sequential electronic circuit or by a software programme, and it differs from the analogue circuit discussed in Example 2.1 in many respects. The most important one is that we are interested in describing the input/output relation (i.e., what each input sequence produces) without being necessarily interested in the point in time where each "reaction" of the machine takes place. The number of events that the system can respond to is clearly infinite, but given the amount of time that the system needs to produce a reaction, we cannot generate events continuously in time. One possible way to formalise this is to require that the number of events that intervene between any two events is always finite [LSV98].
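The modulo 4 counter lends itself to a direct implementation; the following is a minimal Python sketch of the state transition diagram of Figure 2 (the function names are illustrative, not part of the original notes):

```python
# Minimal model of the modulo 4 counter of Example 2.2.
# The state is the current count; on input 1 the count advances,
# on input 0 it stays. The output, emitted on every event, is the
# new count printed in binary on two digits.

def step(state, event):
    """One transition of the state machine: returns (next_state, output)."""
    next_state = (state + event) % 4
    output = format(next_state, "02b")  # 2-bit binary representation
    return next_state, output

def run(events, state=0):
    """Feed a sequence of events (0/1) and collect the output sequence."""
    outputs = []
    for e in events:
        state, out = step(state, e)
        outputs.append(out)
    return outputs

# The input sequence 0 1 0 1 1 produces the outputs 00 01 01 10 11,
# matching Example 2.5 below.
```

Note that the model says nothing about when the events occur: only their order matters, which is exactly the point made above.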


As known from elementary courses on computer systems, it is possible to reduce our freedom and require that the system reacts only at precise points in time. Indeed, most sequential networks are implemented in a "synchronous" way, with reactions to events forced to take place upon the rising or falling edge of a clock signal¹. In this case, the minimum time interval between two different reactions is bound to the clock period. However legitimate and reasonable, this is just one of the possible implementations of a state machine. In a different, asynchronous implementation, the machine reacts to an event whenever it occurs. We could say that in Figure 2 we have offered an abstract specification of the state machine, which is open to different implementation options for the designer.

There are cases where the system is inherently synchronous. Considerthe following example.

Example 2.3. Consider a banking account where losses or gains are capitalised every T time units (where T is a sub-multiple of one year). The customer is allowed to deposit or withdraw money every time an interval expires. If at the beginning of an interval the capital is positive, at the end the credit grows with an annual interest rate I+. If it is negative, at the end the debt grows with an annual interest rate I−.

Let C(kT) be the capital at the beginning of the k-th observation interval and let C((k + 1)T) be the capital at the end. Finally, let S(kT) be the amount of money (positive or negative) that the customer deposits or withdraws. The evolution of the capital is given by:

(2.1) C((k + 1)T) = (1 + I+/12)^T (C(kT) + S(kT)) if C(kT) + S(kT) ≥ 0
      C((k + 1)T) = (1 + I−/12)^T (C(kT) + S(kT)) if C(kT) + S(kT) < 0

At first sight, this system looks very similar to the state machine described earlier. Once again, it is important to know the exact sequence of deposits and withdrawals. Therefore, it is required that the time space T be an ordered set. However, this is not enough. This time, we need to assume that the intervals between events are of the same duration and we need to know how large this interval is (otherwise the computation of the interest is not possible). Contrary to the modulo counter in Example 2.2, the banking account example is not only synchronous in one of its possible implementations, but it is inherently so, since its evolution is tied to the evolution of physical time. However, contrary to the RC network in Example 2.1, it is not required (nor possible) to have arbitrarily close events.
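The recurrence (2.1) can be iterated directly. The sketch below assumes, purely for illustration, that T is expressed in months (so that (1 + I/12)^T is the per-interval growth factor) and picks arbitrary interest rates:

```python
# Hypothetical simulation of the banking account of Example 2.3.
# Equation (2.1): the balance plus the deposit/withdrawal grows by
# (1 + I_plus/12)**T if non-negative, by (1 + I_minus/12)**T otherwise.
# T in months and the two rates are illustrative assumptions.

def next_capital(c, s, T=1, i_plus=0.03, i_minus=0.10):
    """One capitalisation step of Equation (2.1)."""
    base = c + s
    rate = i_plus if base >= 0 else i_minus
    return (1 + rate / 12) ** T * base

def evolve(c0, deposits, **kwargs):
    """Capital history for an initial capital c0 and a deposit sequence."""
    c = c0
    history = [c]
    for s in deposits:
        c = next_capital(c, s, **kwargs)
        history.append(c)
    return history

# Example: start with 100 and deposit 10 at every interval for a year.
balances = evolve(100.0, [10.0] * 12)
```

The simulation makes the synchrony requirement tangible: each step corresponds to exactly one interval of duration T, and the interest cannot be computed without knowing T.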

2.1. Signals

A common feature of all the systems mentioned above are "signals". A signal is simply defined as a function from a time space T to a set U. At this point, we need not make any particular assumption on the set U (we will have to later on).

¹When no event occurs on a clock edge, the network does not react.

We will frequently use calligraphic letters to denote classes of signals. For example,

U = {u(·) : T → U}.

2.1.1. The notion of time. A first problem to face is to define the exact meaning of the notion of time, a set that we will henceforth denote by T.

2.1.1.1. Continuous Time. Continuous time (CT) signals are based on a time space that has to be:

(1) totally ordered,
(2) metric,
(3) a continuum (in the mathematical sense).

The exact meaning of these concepts is recalled in Appendix A. At this point, it is more useful to elaborate on the underlying intuition and on why we need these properties to describe CT signals (and systems).

Let us start with an example.

Example 2.4. Consider the RC circuit in Example 2.1. VR(t) can be used to denote the voltage drop across the resistor R as a function of time and is a CT signal. The same applies to the voltage VC(t) measured across the capacitor C, to Vin(t) and to I(t).

CT signals are used to represent the evolution of physical quantities. In this setting, if a quantity takes on two different values at different times, it is relevant to know which one comes first and which one follows, and how far apart the two events are. By the adjective "relevant" we mean that the evolution of the system will be quite different if the order of two events is inverted or if their distance is changed. To understand this, consider an electrical RC circuit as in Example 2.1. Suppose that a 5V power source is connected to the terminals at time t1 and switched off at time t2. The evolution of the system is understandably different if t2 − t1 = 10^−3 s or t2 − t1 = 10^5 s. Moreover, the value of a quantity like VR can be measured at any time t and is potentially affected by the evolution of a different quantity at all possible times t′ ∈ T. These considerations explain why we need a set that is a continuum to model CT signals. Therefore, the most obvious choice for the time space is the real set: T = R.

2.1.1.2. Discrete Events. The opposite case to CT signals is when the connection between the time space T and the physical time is very shallow. Consider the system shown in Example 2.2. If we are only interested in the evolution of the output sequence given the input sequence, all we need to know on the timing of the events is condensed in their order. For instance, the input sequence 011 will certainly produce a different sequence of states and of outputs than the sequence 101, which only differs from the first one in the order of the first two symbols.


An important point is that an input sequence (e.g., 0 · 1 · 1) will produce the same output sequence whatever the spacing between any two events (be it one millisecond or one century). So if our goal is to specify the output sequence given an input sequence, we do not need continuity or a metric structure for our time space T.

Example 2.5. For the modulo counter in Example 2.2, a sequence of inputs such as 0 · 1 · 0 · 1 · 1 · . . . is an example of a DE signal. So is the sequence of outputs 00 · 01 · 01 · 10 · . . ..

Discrete event (DE) signals are defined over time spaces that are totally ordered but are not required to be a continuum nor to be metric. Some authors [LSV98] suggest that, in order to define a DE signal, the time space has to satisfy the additional property that between any two events there can only be a finite number of intervening events. This requirement is formally captured by imposing that events are order-isomorphic to the natural numbers. This subtle but important point is beyond the scope of these notes.

For our purposes, we can simply say that a DE signal consists of a sequence of totally ordered events. Each of them can be associated with an increasing natural number that reflects its position in the sequence. So, in the sequence 011, the 0 will be associated with event 0, the first 1 with event 1 and the second 1 with event 2.

2.1.1.3. Discrete Time. As mentioned above, DE signals are ordered sequences of events, each one associated with a time instant. For an important subclass of these signals, events have to be synchronous, meaning that they occur/are registered at specified time instants. This means that the time variable the signal is defined over can take on only specific values. A typical (but not mandatory) choice is that such instants be multiples of a specified period T.

Generally speaking, we define a discrete-time (DT) signal as a DE signal where events are constrained to be synchronous.

Example 2.6. In the banking account described in Example 2.3, the evolution of the capital C(kT) and the sum S(kT) deposited or withdrawn are examples of DT signals.

As in the case of DE signals, we can order the different time instants using the natural numbers. But for DT signals the algebraic structure of the time space (i.e., its being an abelian group) is used in full for sums and differences. Hence, we will use the set Z to represent DT systems.

2.1.2. Sampled Data. Finally, it is useful to mention the presence of a different type of systems, compounded of a collection of heterogeneous subsystems, each one associated with a different time space. A classic example is offered by sampled data systems, which are DT systems obtained from CT systems by restricting the points in time where certain quantities can be measured (sampled) or certain input variables can be changed.


Figure 3. Example of a sampled data system

An example of this type is shown in Figure 3. A given physical quantity that evolves in continuous time is sampled at some points in time that are spaced by a fixed amount of time (the sampling interval). The outcome is a sequence of numbers, i.e., a DT signal. We can have this signal processed by a DT system that generates a new DT signal (a sequence of samples), which can in its turn be converted back into a CT signal through a digital to analogue converter. This is typically done by taking the signal value at time kT and holding it constant throughout the interval [kT, (k + 1)T] (Zero Order Hold).
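The sample/process/hold chain can be sketched in a few lines; in the following hypothetical Python fragment the sampling interval, the source signal and the DT processing stage (a plain gain) are all illustrative assumptions:

```python
import math

# Sketch of a sampled data chain: a CT signal is sampled every T
# seconds, processed by a DT system (here, a simple gain of 2), and
# reconstructed with a Zero Order Hold.

T = 0.5  # sampling interval (illustrative)

def sample(ct_signal, n_samples, T):
    """Sample a CT signal at times 0, T, 2T, ... (A/D conversion)."""
    return [ct_signal(k * T) for k in range(n_samples)]

def zero_order_hold(samples, T):
    """Reconstructed CT signal: constant on each interval [kT, (k+1)T)."""
    def held(t):
        k = min(int(t // T), len(samples) - 1)
        return samples[k]
    return held

u = lambda t: math.sin(t)          # CT source signal
samples = sample(u, 10, T)          # DT signal (sequence of samples)
processed = [2 * s for s in samples]  # DT processing stage
y = zero_order_hold(processed, T)   # back to a CT signal (D/A with ZOH)
```

Each stage lives in its own time space: `u` and `y` are functions of a real time variable, while `samples` and `processed` are indexed by the integers, which is exactly the heterogeneity described above.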

2.2. Systems

In this course, we will concentrate on systems defined on CT signals and DT signals. We start by giving an abstract definition general enough to encompass most systems of interest. Let U represent a class of input signals taking values in the set U and Y represent a class of output signals taking values in the set Y. We can define a system as a binary relation between U and Y: S ⊆ U × Y.

We recall that a binary relation is a set of pairs. In this case, one element of the pair is the input signal and the other one is the output signal. Importantly, the relation can associate more than one output signal with the same input.

We remark that this definition of system is "oriented", i.e., it assumes the distinction between an input (cause) and an output (effect). In a more general setting, we could simply define it as a relation between tuples of signals, no matter the role each one plays. This is a "speculative" viewpoint which is of interest for such areas as theoretical physics or biology. In this course, on the contrary, we will take an "engineering perspective", where a distinction is made between quantities that can be acted upon and other quantities that evolve as a result of these actions.

Example 2.7. Consider the system in Example 2.1 and suppose that the input is given by Vin(t) and the output is given by VC(t). Suppose we apply a step input Vf 1(t), where the step 1(t) is defined as

1(t) = 0 if t < 0
1(t) = 1 if t ≥ 0.


It can be seen (and we will see how) that VC(t) = VC(0)e^(−t/RC) + (1 − e^(−t/RC))Vf. Therefore, depending on the initial value VC(0) we obtain a different output for the same input.
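The role of the initial value is easy to see numerically. The following sketch evaluates the step response above for two different initial conditions (the values of R, C and Vf are illustrative):

```python
import math

# Step response of the RC circuit of Example 2.7:
#   V_C(t) = V_C(0) * exp(-t/(R*C)) + (1 - exp(-t/(R*C))) * V_f
# The component values and the step amplitude are illustrative.

def vc(t, vc0, vf=5.0, R=1e3, C=1e-6):
    """Capacitor voltage at time t for initial value vc0 and step amplitude vf."""
    tau = R * C
    decay = math.exp(-t / tau)
    return vc0 * decay + (1 - decay) * vf

# Same input, two different initial conditions -> two different outputs.
y_from_zero = vc(1e-3, vc0=0.0)
y_from_two = vc(1e-3, vc0=2.0)
```

Whatever the initial condition, for t much larger than the time constant RC the output converges to Vf: the initial value only disambiguates the transient.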

Example 2.8. Consider the system in Example 2.3. Suppose that at time t = 0 we start with a capital C0 and that we deposit a constant amount of money S at the beginning of every capitalisation period. It is possible to see that the capital at time kT is given by

C(kT) = (1 + I+)^k C0 + S ((1 + I+)^(k+1) − 1) / I+.

Depending on the initial capital, the evolution can be much different.

As we have just seen, different outputs can be associated with the same input, both for CT systems and for DT systems.

In the examples above, we also underscored the choice of an initial point in time to start the observation of the evolution of the system. Generally speaking, we can choose an instant t0. Denote by T(t0) the subset of the time space containing the instants

T(t0) = {t ∈ T : t ≥ t0}.

We can denote by W^T(t0) the set of functions defined over T(t0) that take values in W:

W^T(t0) = {w0(·) : ∀t ≥ t0, t → w0(t) ∈ W}.

By w0|T(t1) we will denote the truncation of the function, which is evaluated from t1 > t0 onward.

Now we can formally define a system as follows:

Definition 2.9. An abstract dynamic system is a 3-tuple (T, U × Y, Σ) where

• T is the time space
• U is the set of input functions
• Y is the set of output functions

and

Σ = {Σ(t0) ⊂ U^T(t0) × Y^T(t0) : t0 ∈ T and CRT is satisfied},

where CRT stands for closure with respect to truncation: i.e., ∀t1 ≥ t0

(u0, y0) ∈ Σ(t0) =⇒ (u0|T(t1), y0|T(t1)) ∈ Σ(t1).

In plain words, the CRT property means that if a pair of functions belongs to the system from t0 on, so will their truncations from t1 on.


2.2.1. Parametric representation. The binary relation R ⊆ U × Y is made of ordered pairs (u, y). We denote by D(R) the domain of the relation (i.e., U) and by R(R) its range (i.e., Y).

A general result applicable to binary relations is the following [RI79]:

Lemma 2.10. Given a binary relation R, it is possible to define a set P and a function π : P × D(R) → R(R) such that

(2.2) (a, b) ∈ R =⇒ ∃p : b = π(p, a)
(2.3) p ∈ P, a ∈ D(R) =⇒ (a, π(p, a)) ∈ R.

π is called a parametric representation and (P, π) a parametrisation of the relation.

This essentially means that it is possible to represent a relation as the union of a set of function graphs parametrised by a suitable choice of parameters. The main interest of this result lies in the following:

Theorem 2.11. Consider a system defined as in Definition 2.9. It is possible to identify a parametrisation (Xt0, π) such that

(2.4) π = {πt0 : Xt0 × D(Σ(t0)) → R(Σ(t0)) : t0 ∈ T}

satisfying the following properties:

(2.5) (u0, y0) ∈ Σ(t0) =⇒ ∃x0 : y0 = πt0(x0, u0)
(2.6) x0 ∈ Xt0, u0 ∈ D(Σ(t0)) =⇒ (u0, πt0(x0, u0)) ∈ Σ(t0).

This result is easily understood looking at Examples 2.7 and 2.8, where VC(0) and C0 are used as parameters.

2.2.2. Causality. We are now in a position to state the notion of causality.

Definition 2.12. Let u|[t0,t] be the restriction of the function u to the closed interval [t0, t]. A system is causal if it has a representation (Xt0, π) such that

(2.7) ∀t0 ∈ T, ∀x0 ∈ Xt0, ∀t ∈ T
(2.8) u|[t0,t] = u′|[t0,t] =⇒ [πt0(x0, u)](t) = [πt0(x0, u′)](t).

A parametric representation of this type is said to be causal. If instead of the closed interval [t0, t] we use the semi-open interval [t0, t), the parametric representation and the system are said to be strictly causal.

Observe that π is a functional, i.e., a function defined over a space of functions. So πt0(x0, u) is the output function associated with the parameter x0 and the function u, and [πt0(x0, u)](t) is the value it takes at time t. In plain words, the definition just introduced means that the values of u beyond t do not affect the value of the output at time t. A simple way to put it is that a causal system does not foresee the future. If the system is strictly causal, the output at time t is only affected by the input at times strictly smaller than t.


2.2.3. Number of input and output signals. In the discussion above, we have not made specific assumptions on the range U of the input functions U and on the range Y of the output functions Y. In our course, we will have cases where these quantities are scalars, and other cases in which they are vectors.

The first case is when both input and output functions are scalar. We call this type of systems Single Input Single Output (SISO) systems.

If the input is not a scalar function we will refer to the system as Multiple Input (MI); likewise, if the output is not scalar we will use the definition Multiple Output (MO). Clearly, we can have all the different combinations: Single Input Single Output (SISO), Single Input Multiple Output (SIMO), Multiple Input Single Output (MISO), Multiple Input Multiple Output (MIMO).

As an example, the RC network in Example 2.1 is a SISO system because its input, the voltage Vin, and its output, VC, are both scalar functions. However, if our output functions are both VC and I, the system should be considered as SIMO. As for the banking account described in Example 2.3, the amount of money S(kT) deposited or withdrawn is the scalar input and the capital C(kT) is the scalar output. So the system is SISO. However, if the interest rates I+ and I− are allowed to change in time, the system will be MISO.

2.3. I/O Representation

In the previous sections, we have seen that a system is essentially a relation between spaces of functions defined over a common time space. In view of a general result, any relation can be described as the union of the graphs of a set of functions parametrised by a suitable set. This led us to the parametric representation of systems, which applies to any type of system. We are now left with the problem of understanding how such a parametric description can be found.

We are now going to delve more deeply into the subclasses of continuous time (T = R) and of discrete time systems (T = Z).

One possible way to describe a system is by expressing a mathematical property that relates input to output. We will now turn our attention to systems that can be described by differential equations (CT systems) or difference equations (DT systems) of an appropriate order n:

(2.9) F(y(t), Dy(t), D^2 y(t), . . . , D^n y(t), u(t), Du(t), . . . , D^p u(t), t) = 0,

where the operator D, when applied to a generic function f, is defined as:

(2.10) D^k f(t) = d^k f / dt^k for CT systems
       D^k f(t) = f(t + k) for DT systems.

For well-posed systems, it is possible to write the equation in a normal form:

(2.11) D^n y(t) = F(y(t), Dy(t), D^2 y(t), . . . , D^(n−1) y(t), u(t), Du(t), . . . , D^p u(t), t).


Example 2.13. Consider the differential equation:

(2.12) v̇(t) = −γv(t) − g,

which could describe a rocket moving in the atmosphere under the action of gravity. It is possible to solve the differential equation as follows:

dv/dt = −γv − g
dv/(γv + g) = −dt
∫ dv/(γv + g) = −∫ dt
(1/γ) log |γv + g| = −t + c
γv + g = e^(−γt + cγ)
v(t) = −g/γ + A e^(−γt).

The last equation tells us that the solution is determined modulo a constant (A), which can be found by imposing the initial condition. As we can see, we have found a parametrisation of the system (although applicable only to constant input functions).

Figure 4. Example of mass-spring-damper system

Example 2.14. Consider the mass-spring-damper system in Figure 4. Assume that the x-axis is perpendicular to the ground and points upward. The Newton equation leads us to the following differential equation:

(2.13) mẍ = F(t) − mg − K(x − xrest) − Rẋ,

where the term −K(x − xrest) accounts for the elastic reaction of the spring and −Rẋ accounts for viscous friction. This is an instance of the more general expression in Equation (2.11). For this example, the solution of the equation is not so straightforward with standard means (although perfectly feasible). We will see generic ways to find a parametrisation of this system for any input function F(t) (as long as it is analytically "nice" enough).


Example 2.15. Let us go back to our differential equation

(2.14) v̇(t) = −γv(t) + F(t),

which could describe a rocket moving in the atmosphere under the action of its thrusters (F(t)). Suppose we want to integrate the equation numerically using a fixed integration interval. We can integrate both sides of the equation:

∫ from kT to (k+1)T of v̇ dt = ∫ from kT to (k+1)T of (−γv(t) + F(t)) dt.

If the integration interval is small enough, we can assume that both v(t) and F(t) remain constant throughout the interval. We can also use the Euler approximation for the derivative:

v̇(t) ≈ (v((k + 1)T) − v(kT)) / T.

Therefore the integral above can be written as:

v((k + 1)T) − v(kT) ≈ −γ v(kT) T + F(kT) T.

As we can see, we have obtained a DT system in the normal form (2.11). Given an initial velocity v0, the equation above allows us to iteratively construct the parametrisation of the system.

In the examples above we have seen that CT systems are described by differential equations and DT systems by difference equations. We have also seen that, for the latter, the difference equation is in fact an algorithmic way to obtain a parametrisation of the system. In this parametric representation, we need to initialise the input u and the output y for a number of steps equal to the maximum forward step with which they appear in the equation minus 1. For instance, for the equation

y(k + 2) = y(k + 1) + y(k) + u(k + 1) + u(k),

in order to start the algorithm we need y(0), y(1) and u(0). We will also see how to derive explicit forms for this representation.
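The algorithmic nature of the difference equation is easy to make explicit; the following hypothetical Python snippet iterates y(k + 2) = y(k + 1) + y(k) + u(k + 1) + u(k) given the initial data and an input sequence:

```python
# Iterative parametrisation of the DT system
#   y(k+2) = y(k+1) + y(k) + u(k+1) + u(k).
# The initial data y(0), y(1) play the role of the parameter x0.

def simulate(y0, y1, u):
    """Return [y(0), y(1), ..., y(len(u))] for the input sequence u."""
    y = [y0, y1]
    for k in range(len(u) - 1):
        y.append(y[k + 1] + y[k] + u[k + 1] + u[k])
    return y

# With zero input and y(0)=0, y(1)=1 the recursion reduces to the
# Fibonacci recurrence y(k+2) = y(k+1) + y(k).
```

Different initial pairs (y0, y1) give different output sequences for the same input, which is exactly the disambiguating role of the parameter discussed in Section 2.2.1.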

For CT systems the same result is obtained by solving a differential equation (a task that is not always easy). A couple of very well known results (the Peano and Lipschitz theorems) specify conditions (e.g., Lipschitz continuity) under which a differential equation has a unique solution. This allows us to construct a parametrisation, where we have to consider the initial conditions for the input and output functions and for their derivatives of order lower than the maximum order with which they appear in the equation. For instance, for the equation

ÿ = −ẏ + u̇ + 3y − u

we will need y(0), ẏ(0), u(0) and u̇(0). The complex discussion on existence and uniqueness of solutions will be left out of these notes, and we will assume that the differential equation is well behaved and that it admits a unique solution for the input functions of our interest.


Figure 5. Inverted Pendulum

2.3.1. Causality. In order for the system to be causal, we must have n ≥ p. Indeed, if we consider a simple DT example like

y(k + 1) = y(k) − 3u(k) + 4u(k + 2),

in which n = 1 and p = 2, we easily see that the output at a given time depends on future values of the input.

Consider now an even simpler and apparently inoffensive example of CTsystem like

y(t) = u.

In order to define a derivative, we have to compute the limit of the left and of the right increments and see that they are equal. But knowing the right increment is tantamount to peering into the future (although by an infinitesimal amount).

2.3.2. MIMO systems. The definition in Equation (2.11) refers to SISO systems. It is easy to generalise to MIMO systems as follows:

(2.15) D^n y_j(t) = F(y_1(t), . . . , D^n y_1(t), . . . , y_m(t), . . . , D^n y_m(t), u_1(t), Du_1(t), . . . , D^p u_1(t), . . . , u_h(t), Du_h(t), . . . , D^p u_h(t), t),  j = 1, . . . , m

for a system with m output functions and h input functions.

Example 2.16. Consider the inverted pendulum in Figure 5. Let M be the mass of the cart, m the mass of the pendulum, l the length of the pendulum, y1 the position x of the centre of mass of the cart, y2 the angle θ of the pendulum with the vertical line, and u the force F applied to the cart. The equations of motion can be found as follows:

D^2 y1 = (ml/(M + m)) (D^2 y2 cos y2 − (Dy2)^2 sin y2) + (1/(M + m)) u

D^2 y2 = (1/l) (g sin y2 + D^2 y1 cos y2)


2.4. State Space representation

In the sections above we have seen that: 1. for a system, the same input function can be associated with multiple output functions; 2. for each t0, we can introduce a parametric representation such that, given an input function, the associated output function can be “disambiguated” using a parameter x0; 3. for a class of systems (causal systems), this parametric representation is causal, meaning that, for a given value of the parameter x0, the output at any given time t′ > t0 only depends on the values taken by the input function in the interval [t0, t′] (or [t0, t′) if the causality is strict).

Given that t0 can be chosen freely, this means that: 1. a causal system cannot “foresee” the future; 2. the parameter x0 summarises the whole history of the system previous to t0. These are qualitative arguments. In order to turn them into a consistent and mathematically sound definition of state we need a few more steps.

In particular, by changing the initial instant t0 we will find a different value for the parameter x0. Our first assumption is that, whatever the choice of t0, such parameters live in the same set X: ∀t0, x0 ∈ X. This is obviously the case if we adopt the IO representation of the system: as discussed in Section 2.3, if a CT system is described by means of a differential equation, and if this is mathematically well behaved (such as to admit a unique solution), the parameters are conveniently chosen as the set of all initial conditions for the derivatives of the input and output functions up to a certain order. The same reasoning can be applied to DT systems, where the parameters are the initial values. Therefore, whatever the initial time t0, the parameters x0 live in the space given by the initial conditions.

In order to have a complete identification between x0 at time t0 and the past history of the system (up to t0) we need some form of consistency between the parameter x0 at time t0 and the parameter x1 at any other time t1. We will assume the existence of a function φ such that

x1 = φ(t1, t0, x0, u),

where u is restricted to its values in the set [t0, t1). Roughly speaking, φ tells us how to transform the history up until t0 into the history up until t1, given the knowledge of the input between t0 and t1. Let (T × T)∗ be defined as:

(T × T)∗ = {(t, t0) : t, t0 ∈ T and t ≥ t0}.

The function φ is defined as

(2.16) φ : (T × T)∗ × X × U → X,  x(t) = φ(t, t0, x0, u).

We will require that it respects the following three properties:

Consistency:

∀t ∈ T, ∀x ∈ X, ∀u ∈ U , φ(t, t, x, u) = x,


Causality:

∀t, t0 ∈ T, ∀x0 ∈ X: u|[t0,t) = u′|[t0,t) =⇒ φ(t, t0, x0, u) = φ(t, t0, x0, u′)

Separation:

(2.17) ∀(t, t0) ∈ (T × T)∗, ∀x0 ∈ X, ∀u ∈ U :
(2.18) t > t1 > t0 =⇒ φ(t, t0, x0, u) = φ(t, t1, φ(t1, t0, x0, u), u).

The first two properties are obvious; the third one formalises the idea that the state captures the history of the system.

The notion of state we are introducing is rooted in that of the parametric representation introduced earlier. The same notion leads to the link between input, state and output. Indeed, assuming a causal system, the function in Equation (2.4) leads us to:

∀t0, y0(t) = [πt0(x0, u0)](t), ∀t ≥ t0,

where u0 is restricted to [t0, t]. Taking t0 = t, we can introduce the definition of a function η as follows:

y(t) = [πt(x, u)] (t) = η(t, x(t), u(t)).

The function η is described as

(2.19) η : T × X × U → Y,  y(t) = η(t, x(t), u(t)).

2.4.1. Input signals, output signals and states. The variable x(t) is called the state of the system, and the set X it lives in is called the state space. While systems with finite state space exist and are relevant (e.g., see Example 2.2), we will focus on systems with dense state space.

Generally speaking, a system has a vector of input signals, part of which are directly controllable and can be used as command variables to drive the system toward a desired behaviour. Other input signals are not under the direct control of the user and act as external disturbances or noise.

As an example, if we model a car, the throttle and the steering wheel are command variables, whereas the inclination of the road can be seen as an external disturbance.

Output signals are measurable quantities and can be used to formulate design specifications. For instance, in the model of a car, one can formulate a specification like: if the driver opens the throttle completely, the speed reaches 100 km/h in 11 seconds. Here the speed plays the role of our controlled variable.

In other cases, an output signal conveys useful elements of information that allow reconstructing the system state when the latter is not directly measurable. For instance, the speed of a vehicle is typically a state variable that cannot be measured directly. It is possible to mount encoders on the wheels that measure the number of rotations n and from this derive an estimate of the velocity. The most obvious way is to compute a numeric


“derivative” using the left increments. This strategy typically produces noisy estimates. We will see in the course that better results can be obtained in systematic ways using the so-called “observer”, a system that builds on the knowledge of the system model and of the history of the output signal to reconstruct an estimate of the state.

2.4.2. Existence of a state. The functions φ and η that we have qualitatively introduced above allow us to generate a system:

(2.20) Σ(t0) = { (u0, y0) ∈ U^{T(t0)} × Y^{T(t0)} : u0 = u|T(t0), y0(t) = η(t, φ(t, t0, x0, u), u(t)), with u0 ∈ U and x0 ∈ X }.

An important point is: when is the converse true? In other words, when is it possible to construct a state representation as outlined above for a generic abstract system? It turns out that this is indeed possible under mild conditions. Essentially, what is required is that the input functions have the same range and that they form a set closed under concatenation. The existence and uniqueness of a state representation for abstract systems is beyond the scope of this course and we refer the interested reader to specialised textbooks [RI79] or to the scientific literature. In this course, we will always assume that a representation of this kind exists for our systems of interest.

2.4.3. Implicit Representation. The η and φ functions that we have introduced in Equations (2.16) and (2.19) compound the so-called explicit representation of a system. By “explicit” we mean that, given an input function and an initial state, the two functions enable us to compute immediately the evolution of the state and of the output.

In contrast, an implicit representation is one in which the computation of the output is not immediate but requires the solution of a system of difference or differential equations.

For DT systems, an implicit representation can be found by simply evaluating φ across a single step:

x(t+ 1) = φ(t+ 1, t, x(t), u(t)) = f(t, x(t), u(t))(2.21)

The function f(·) is called the generator function. The representation (X, f, η) is called the implicit representation. Clearly, the function φ can be recovered from f through a simple iteration.

Example 2.17. Consider a DT system with scalar state (i.e., state with dimension 1) and suppose that its generator function is given by the linear function:

x(t+ 1) = a(t)x(t) + b(t)u(t).


If we want to compute the evolution from a state x0 at time t0 to a state x1 at time t1, we can do it by unrolling the following iteration:

x(t0 + 1) = a(t0)x0 + b(t0)u(t0)
x(t0 + 2) = a(t0 + 1)x(t0 + 1) + b(t0 + 1)u(t0 + 1)
          = a(t0 + 1)a(t0)x0 + a(t0 + 1)b(t0)u(t0) + b(t0 + 1)u(t0 + 1)
x(t0 + 3) = a(t0 + 2)x(t0 + 2) + b(t0 + 2)u(t0 + 2)
          = a(t0 + 2)a(t0 + 1)a(t0)x0 + a(t0 + 2)a(t0 + 1)b(t0)u(t0) + a(t0 + 2)b(t0 + 1)u(t0 + 1) + b(t0 + 2)u(t0 + 2)
. . .

We can prove by induction that, for a general H:

x(t0 + H) = φ(t0 + H, t0, x0, u) = ψ(t0 + H, t0)x0 + ∑_{t=t0}^{t0+H−1} G(t0 + H − 1, t)u(t)

where

ψ(t, τ) = ∏_{t′=τ}^{t−1} a(t′) if t > τ, and ψ(t, τ) = 1 if t ≤ τ;  G(t, τ) = ψ(t + 1, τ + 1)b(τ).
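As a numerical sanity check (a sketch, not part of the notes; the coefficient and input sequences below are arbitrary illustrative values, with t0 = 0), the closed form with ψ and G can be compared against the step-by-step recursion:

```python
import math

def iterate(a, b, u, x0, H):
    # Direct recursion x(t+1) = a(t) x(t) + b(t) u(t), starting from x(0) = x0.
    x = x0
    for t in range(H):
        x = a[t] * x + b[t] * u[t]
    return x

def psi(a, t, tau):
    # psi(t, tau) = product of a(t') for t' = tau, ..., t-1 (and 1 if t <= tau).
    return math.prod(a[tp] for tp in range(tau, t)) if t > tau else 1

def closed_form(a, b, u, x0, H):
    # x(H) = psi(H, 0) x0 + sum_t psi(H, t+1) b(t) u(t), since G(H-1, t) = psi(H, t+1) b(t).
    return psi(a, H, 0) * x0 + sum(psi(a, H, t + 1) * b[t] * u[t] for t in range(H))

a = [2.0, 3.0, 0.5, -1.0]
b = [1.0, 0.5, 2.0, 1.0]
u = [1.0, -1.0, 2.0, 0.0]
for H in range(5):
    assert abs(iterate(a, b, u, 1.0, H) - closed_form(a, b, u, 1.0, H)) < 1e-12
print("closed form matches the recursion")
```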

When we move to the CT domain, things are not so immediate. It is reasonable to think of the evolution of the state in real time as described by means of a differential equation. This requires a sufficient regularity (at the very least differentiability) of the function φ. If we admit that φ is the solution to the equation:

∂tφ = f(t, φ, u(t)),

then our implicit representation is given by:

Dx(t) = f(t, x(t), u(t)).

Generally speaking, the state is a vector and f is a vector-valued function. This is easy to understand looking at the example below.

Example 2.18. Consider the mass-spring system in Figure 4. Let x represent the vertical position of the mass. If we choose the state as

x(t) = [x1(t), x2(t)]^T = [x − xrest, d(x − xrest)/dt]^T,

the use of the standard laws of mechanics leads us to

Dx(t) = [Dx1(t), Dx2(t)]^T = [x2, F(t) − Kx1 − Rx2]^T.
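This implicit CT representation can be simulated numerically, e.g. with a forward-Euler scheme. The sketch below is not part of the notes; it assumes unit mass and the illustrative values K = 4, R = 1:

```python
def mass_spring_euler(F, x1_0, x2_0, K, R, dt, steps):
    # Forward-Euler integration of Dx1 = x2, Dx2 = F(t) - K*x1 - R*x2
    # (unit mass assumed). Returns the sampled trajectory of x1.
    x1, x2 = x1_0, x2_0
    traj = [x1]
    for k in range(steps):
        t = k * dt
        x1, x2 = x1 + dt * x2, x2 + dt * (F(t) - K * x1 - R * x2)
        traj.append(x1)
    return traj

# Free evolution from a displaced position: a damped oscillation toward rest.
traj = mass_spring_euler(lambda t: 0.0, 1.0, 0.0, K=4.0, R=1.0, dt=0.01, steps=2000)
print(traj[0], traj[-1])
```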


2.4.4. Link between state and I/O representation. Looking at Example 2.14 and at Example 2.18, the reader can infer an easy way to derive a state representation from an I/O representation. Suppose our system is described by a differential (or by a difference) equation as in Equation (2.11). Consider the simplified case in which p = 0. Suppose we set xi = D^{i−1}y, with the obvious implication that Dxi = D(D^{i−1}y) = D^i y = x_{i+1}. We can write Equation (2.11) as:

(2.22) D[x1, x2, . . . , xn]^T = [x2, x3, . . . , F(x1, x2, . . . , xn, u, t)]^T
(2.23) y = x1.

This way, we have found the expressions for the vector-valued function f(x, u, t) and for the function h(x, u, t):

f(x, u, t) = [x2, x3, . . . , F(x1, x2, . . . , xn, u, t)]^T,  h(x, u, t) = x1,

where x is defined as x = [x1, x2, . . . , xn]^T. If n ≥ p ≥ 1, the construction can be generalised by setting:

(2.24) xi = D^{i−1}y for i ∈ [1, n],  x_{i+n} = D^{i−1}u for i ∈ [1, p].

This strategy leads to our result but is often inefficient, i.e., it introduces more states than strictly needed. Consider the following example:

Example 2.19. Consider a system whose I/O representation is given by

Dy = y + 5Du − u.

If we want to find a state space representation and approach the problem as discussed before, we would have to introduce two states x1 = y and x2 = u. Suppose, instead, that we set x1 = y − 5u. We have

Dx1 = Dy − 5Du
    = y − u
    = (y − 5u) + 4u
    = x1 + 4u
y = x1 + 5u.


The example above reveals that: 1) there is more than one state space representation for the same I/O description; 2) some systems (that we will learn to qualify as linear) allow for a much more compact state space than the one using the canonical set of variables in Equation (2.24).

2.5. Linear Systems

As we discussed above, an abstract system Σ(t0) is defined as a set of pairs (u0(t), y0(t)) with u0(t) ∈ U^{T(t0)} and y0(t) ∈ Y^{T(t0)}. Suppose that the sets U and Y are both vector spaces defined over the real set R. Clearly, the range sets U and Y where the functions take values have to possess the same algebraic structure. In plain words, this means that if u1(t) ∈ U and u2(t) ∈ U , then αu1(t) + βu2(t) ∈ U for any α ∈ R and β ∈ R (and the same for any pair y1(t) and y2(t) belonging to Y ).

This being said, we can introduce the following definition

Definition 2.20. The system Σ(t0) is linear if, given any two pairs (u1(t), y1(t)) and (u2(t), y2(t)) that both belong to the system Σ(t0), so does their linear combination: α(u1(t), y1(t)) + β(u2(t), y2(t)) ∈ Σ(t0), where α, β ∈ R.

Consider now a parametric description of the system with a parameter space Xt0 and a function π. For a given choice of the parameter x0, the system responds to an input u1 with the output y1 = πt0(x0, u1) and to the input u2 with the output y2 = πt0(x0, u2). The intuitive meaning of linearity is captured by the following two properties:

• Superposition: the response to the combination u1(t) + u2(t) is the superposition of the effects of u1(t) and u2(t): y1(t) + y2(t);
• Scaling: the response to αu1(t) is αy1(t).

Example 2.21. Consider the mass-spring system in Example 2.14. Suppose that, starting from the same initial condition, x1(t) is the evolution of the system for the force F1(t) and x2(t) is the solution corresponding to the force F2(t). We can write

mD^2 x1 = F1(t) − mg − K(x1 − xrest) − RDx1
mD^2 x2 = F2(t) − mg − K(x2 − xrest) − RDx2

If we multiply the first equation by α, the second equation by β and sum up the results, we will see that αx1(t) + βx2(t) is a solution corresponding to the force αF1(t) + βF2(t).

The considerations in the example above apply because the differential equation describing the system is linear and the derivative is a linear operator: D(αf(t) + βg(t)) = αD(f(t)) + βD(g(t)).

If we repeat the same line of reasoning with the system in Example 2.16, we will fail because the equation is not linear. More simply, if we consider the mass-spring example but impose a maximum value on the force F(t), the scaling principle will not apply and linearity will be lost.
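The superposition and scaling properties can be checked numerically on the forced mass-spring evolution. The sketch below (not part of the notes) uses a forward-Euler discretisation with zero initial state and illustrative parameters (unit mass, K = 4, R = 1, no gravity or rest-length offset):

```python
import math

def forced_response(F, steps=500, dt=0.01, K=4.0, R=1.0):
    # Forward-Euler for Dx1 = x2, Dx2 = F(t) - K*x1 - R*x2, zero initial state.
    x1, x2, out = 0.0, 0.0, []
    for k in range(steps):
        t = k * dt
        x1, x2 = x1 + dt * x2, x2 + dt * (F(t) - K * x1 - R * x2)
        out.append(x1)
    return out

F1 = lambda t: 1.0
F2 = lambda t: math.sin(t)
alpha, beta = 2.0, -3.0
y1, y2 = forced_response(F1), forced_response(F2)
y12 = forced_response(lambda t: alpha * F1(t) + beta * F2(t))
# The response to alpha*F1 + beta*F2 equals alpha*y1 + beta*y2 (up to rounding).
assert all(abs(alpha * p + beta * q - r) < 1e-9 for p, q, r in zip(y1, y2, y12))
print("superposition and scaling hold")
```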


2.5.1. Linear I/O representations. The same argument of Example 2.21 can be repeated for any system described by a linear difference or differential equation of the form:

(2.25) D^n y(t) = ∑_{i=0}^{n−1} αi(t) D^i y(t) + ∑_{j=0}^{p} βj(t) D^j u(t).

Generally speaking, when this equation has a unique solution (which is generally true for “well behaved” functions αi(t) and βj(t)) and the system is causal (p ≤ n), such a solution is a function of u and of the initial conditions:

y(t) = φ(t, t0, y(0), Dy(0), . . . , D^{n−1}y(0), u(0), . . . , D^p u(0), u|[t0,t]).

There are two known facts stated in the following:

Lemma 2.22. Given Equation (2.25), define its associated homogeneous equation as follows:

(2.26) D^n y(t) = ∑_{i=0}^{n−1} αi(t) D^i y(t).

The solutions to this equation form a vector space, and the dimension of this space is n. This applies to both differential and difference equations.

Lemma 2.23. Suppose that ȳ is a particular solution of Equation (2.25) related to the input u. If we take a solution y of Equation (2.26) and we sum it to ȳ, we still obtain a solution of Equation (2.25).

By choosing any basis y1, . . . , yn of the vector space of the solutions of Equation (2.26), we can write a general solution of Equation (2.25) as

y(t) = ȳ(t) + H1y1(t) + . . . + Hnyn(t),

where the coefficients Hi can be found by imposing n conditions derived from the initial conditions. In essence, the linearity of the system leads us to a breakdown of the solution into two pieces. The first one, ȳ, is called the forced evolution because it depends on the input term u. The second one is H1y1(t) + . . . + Hnyn(t) and is called the free evolution because it depends on the initial conditions.

An obvious but fundamental fact is that a linear representation as described above generates a linear system as in Definition 2.20.

2.5.2. Linear State Space Representations. A state space explicit representation is given by a couple of functions φ and η, defined as:

φ : (T × T)∗ × X × U → X,  x(t) = φ(t, t0, x0, u)
η : T × X × U → Y,  y(t) = η(t, x, u)


A representation of this type is linear if both functions are linear with respect to their arguments X × U . In plain words, if we consider φ we have:

φ(t, t0, x0, u1|[t0,t)) = x1 and φ(t, t0, x0, u2|[t0,t)) = x2 =⇒ φ(t, t0, x0, αu1|[t0,t) + βu2|[t0,t)) = αx1 + βx2,

φ(t, t0, x0,1, u|[t0,t)) = x1 and φ(t, t0, x0,2, u|[t0,t)) = x2 =⇒ φ(t, t0, αx0,1 + βx0,2, u|[t0,t)) = αx1 + βx2.

We can write similar equations for η:

η(t, x0, u1) = y1 and η(t, x0, u2) = y2 =⇒ η(t, x0, αu1 + βu2) = αy1 + βy2,

η(t, x0,1, u) = y1 and η(t, x0,2, u) = y2 =⇒ η(t, αx0,1 + βx0,2, u) = αy1 + βy2.

The implicit representation can be obtained from the explicit one by differentiation (assuming that φ is differentiable) or by considering one step of evolution.

f : T × X × U → X,  Dx(t) = f(t, x, u)
η : T × X × U → Y,  y(t) = η(t, x, u)

In this case f has to be linear with respect to X × U. Observe the difference: for φ we require linearity with respect to X × U (where U is a space of signals), while for f and for η we require linearity with respect to X × U (where U is the range of the input signals in U ).

The linearity of f and η can be expressed in matrix notation as:

Dx(t) = A(t)x(t) +B(t)u(t)

y(t) = C(t)x(t) +D(t)u(t).(2.27)

2.6. Time Invariance

We can introduce a general definition of time invariance for an abstract system Σ(t0), defined as a set of pairs (u0(t), y0(t)).

Definition 2.24. A system Σ(t0) is time invariant iff

(u0(t), y0(t)) ∈ Σ(t0) =⇒ (u0(t− t0), y0(t− t0)) ∈ Σ(t0 − t0).

In plain words, what we require is that if a behaviour (u0(t), y0(t)) is generated by the system, so is the same behaviour with u0 and y0 shifted forward or backward in time by the same amount.

The definition of time invariance applies also to the representations of the system.


2.6.1. IO representation. An IO representation of a system is time invariant if the differential or the difference equation does not explicitly depend on t.

In order to understand this definition, consider Example 2.3. If we allow the interest rates I+, I− to change in time, the system is obviously not time invariant. If they are kept constant, the system is time invariant. Likewise, for the mass-spring system in Example 2.14, we can say that the system is time invariant. But if one of the parameters (e.g., the Hooke constant of the spring) changes with time, then the system is time-varying.

Generally speaking, if we consider the linear IO representation in Equation (2.25), we can say that the system is time invariant if the parameters αi(t) and βj(t) are constant over time. In this case, the equation becomes

(2.28) D^n y(t) = ∑_{i=0}^{n−1} αi D^i y(t) + ∑_{j=0}^{p} βj D^j u(t).

2.6.2. State Representation. If we consider a state representation in explicit form, it is time invariant if and only if φ does not depend on t and t0 separately but only on their difference, and if η does not depend on t:

φ : T × T × X × U → X,  x(t) = φ(t, t0, x0, u) = φ(t − t0, x0, u)
η : T × X × U → Y,  y(t) = η(t, x, u) = η(x, u)

If we derive the implicit form, it is easy to see that this implies that f and η do not depend on t:

f : T × X × U → X,  Dx(t) = f(t, x, u) = f(x, u)
η : T × X × U → Y,  y(t) = η(t, x, u) = η(x, u)

In the case of linear systems, time invariance translates into a set of matrices A, B, C, D that do not depend on time:

Dx(t) = Ax(t) +Bu(t)

y(t) = Cx(t) +Du(t).(2.29)
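As a sketch (not part of the notes), the DT version of (2.29) can be simulated with a few lines of code; the matrices and input below are arbitrary illustrative values:

```python
import numpy as np

def simulate_lti(A, B, C, D, x0, inputs):
    # Iterate x(t+1) = A x(t) + B u(t), collecting y(t) = C x(t) + D u(t).
    x, ys = np.asarray(x0, dtype=float), []
    for u in inputs:
        ys.append(float((C @ x + D @ u)[0]))
        x = A @ x + B @ u
    return ys

A = np.array([[0.0, 1.0], [-0.5, 1.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
ys = simulate_lti(A, B, C, D, [0.0, 0.0], [np.array([1.0])] * 5)
print(ys)  # [0.0, 0.0, 1.0, 2.0, 2.5]
```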

It is possible to see that linear IO and state space representations are associated with linear systems.

2.7. Hybrid Systems

Hybrid systems are obtained by combining a discrete event system (DES) that determines changes in the “physical laws” governing the evolution of a CT system. Consider the example in Figure 6. A tank is filled with a fluid, which flows out with a rate Q0 from an outlet in the bottom of the tank.


Figure 6. Example of an Hybrid System

When the level of the fluid reaches a threshold, a sensor detects the event and opens a valve which produces an incoming flow Q1 > Q0 until the tank is full. In this example, we have a state machine that determines switches between two different evolutions of the system, each of them governed by CT dynamics. This type of system is beyond the scope of this course.


CHAPTER 3

Impulse response and convolutions for linear andtime invariant IO Representations

In this chapter, we restrict our focus to IO representations of SISO linear and time invariant systems. Generally speaking, the evolution of the system is described by the following equation:

(3.1) D^n y(t) = ∑_{i=0}^{n−1} αi D^i y(t) + ∑_{j=0}^{p} βj D^j u(t),

where time invariance requires that the αi and βj be constant. The system is causal if p ≤ n and strictly causal if p < n. We have seen that the evolution depends on the initial conditions

y(0), Dy(0), . . . , D^{n−1}y(0), u(0), Du(0), . . . , D^p u(0)

and on the input u|[t0,t]. We have finally seen that the linearity of the system allows us to split the evolution depending on the initial conditions and the evolution depending on the input into two separate terms:

y(t) = yfree(t) + yforced(t)

with

yfree(t) = Ffree(y(0), Dy(0), . . . , D^{n−1}y(0), u(0), Du(0), . . . , D^p u(0))
yforced(t) = Fforced(u|[t0,t]).

Let us start with the computation of the forced evolution.

3.1. Forced Evolution of discrete–time systems

It is useful to introduce a special function, which we will call the impulse function.

3.1.1. Kronecker δ. For discrete time systems the impulse function is also known as the Kronecker delta and is defined as follows:

(3.2) δ(t) = 1 if t = 0, and δ(t) = 0 if t ≠ 0.


If we consider any discrete-time function f we have the following property:

(3.3) f(t)δ(t − t0) = f(t0) if t = t0, and 0 if t ≠ t0.

In plain words, multiplying a Kronecker δ shifted to t0 by a function generates a new function that is zero at all times except t0 (where it is f(t0)). This is equivalent to saying that multiplying by δ has the effect of extracting a sample (“sampling”) of the function at the point where the δ is centred. Another property is:

(3.4) ∑_{τ=−∞}^{∞} f(τ)δ(t − τ) = f(t).

In other words, we can express any function as a sum of infinitely many shifted impulses.

3.1.2. Impulse Response. Consider the DT system in Equation (3.1). Our goal is to compute the forced response to a generic signal u(t).

Let us define h(t) as the forced response to the signal δ(t):

h(t) = Fforced(δ).

As we discussed above we can write u(t) as

u(t) = ∑_{τ=−∞}^{∞} u(τ)δ(t − τ).

Thanks to the system’s linearity we can write:

Fforced(u) = Fforced(∑_{τ=−∞}^{∞} u(τ)δ(t − τ)) = ∑_{τ=−∞}^{∞} u(τ)Fforced(δ(t − τ)),

and using time invariance we get

Fforced(u) = ∑_{τ=−∞}^{∞} u(τ)h(t − τ).

The summation above is called the discrete-time convolution (also denoted as u(t) ∗ h(t)). What we have just said is that for a DT system it is sufficient to know the impulse response h(t) to be able to reconstruct the response to any signal.

The meaning of the convolution sum is quite simple. Suppose we want to find y(t) = u(t) ∗ h(t). What we need to do is the following:

(1) Compute the sequence h(−τ), which is the reflection of h(τ) through t = 0;
(2) Shift the obtained sequence by t and compute the product with u(τ) element-wise;
(3) Sum up the products; this produces y(t).
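The three steps can be sketched in code as follows (not part of the notes; the sequences are assumed finitely supported and given as lists starting at t = 0):

```python
def dt_convolution(u, h):
    # y(t) = sum_tau u(tau) h(t - tau): reflect h, shift it by t,
    # multiply element-wise with u, and sum the products.
    y = [0] * (len(u) + len(h) - 1)
    for t in range(len(y)):
        y[t] = sum(u[tau] * h[t - tau]
                   for tau in range(len(u)) if 0 <= t - tau < len(h))
    return y

# Data from Example 3.1 below: h = (1, 2, -2), u = (5, -3).
print(dt_convolution([5, -3], [1, 2, -2]))  # [5, 7, -16, 6]
```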


This is shown in the two examples below.

Example 3.1. Suppose that

h(t) = 1 for t = 0, 2 for t = 1, −2 for t = 2, and 0 otherwise,

and

u(t) = 5 for t = 0, −3 for t = 1, and 0 otherwise,

and compute the system’s response. We can see that

y(t) = ∑_{τ=−∞}^{∞} u(τ)h(t − τ) = 5h(t) − 3h(t − 1),

that is: y(0) = 5, y(1) = 10 − 3 = 7, y(2) = −10 − 6 = −16, y(3) = 6, and y(t) = 0 otherwise.

From this example we can clearly see that if the support of h(t) is [0, M] and the support of u(t) is [0, N], the support of h(t) ∗ u(t) will be [0, N + M].

Example 3.2. Suppose we know that the impulse response of a system is given by 1(t)a^t; let us compute the forced response to the input function u(t) = 1(t)b^t, where

1(t) = 1 for t > 0, and 0 for t ≤ 0.

We have:

Fforced(u) = ∑_{τ=−∞}^{∞} 1(τ)b^τ 1(t − τ)a^{t−τ}
= ∑_{τ=1}^{t−1} b^τ a^{t−τ}
= a^t ∑_{τ=1}^{t−1} (b/a)^τ
= a^t ((b/a) − (b/a)^t) / (1 − (b/a))
= (b/(a − b)) a^t − (a/(a − b)) b^t.
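As a check (a sketch, not part of the notes; a and b below are arbitrary values with a ≠ b), the closed form of Example 3.2 can be compared against the convolution sum computed directly:

```python
def closed_form(a, b, t):
    # b/(a-b) a^t - a/(a-b) b^t, as derived in Example 3.2.
    return b / (a - b) * a**t - a / (a - b) * b**t

def convolution_sum(a, b, t):
    # Direct sum over tau = 1 .. t-1 of b^tau a^(t-tau), using 1(t) = 1 for t > 0.
    return sum(b**tau * a**(t - tau) for tau in range(1, t))

a, b = 0.5, 0.8
for t in range(2, 12):
    assert abs(closed_form(a, b, t) - convolution_sum(a, b, t)) < 1e-12
print("closed form verified")
```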


Figure 1. Example block scheme: the system with impulse response h(t) maps the input u(t) to the output y(t) = h(t) ∗ u(t).

Our previous discussion can be summarised in the block scheme in Figure 1. The box represents the system, and the arrows the input and the output signals.

3.1.3. Properties of the convolution sum. Three important properties of the convolution sum are stated in the following.

Theorem 3.3. The convolution sum enjoys the following properties:

(1) Commutative property: h(t) ∗ u(t) = u(t) ∗ h(t).
(2) Distributive property: (h1(t) + h2(t)) ∗ u(t) = h1(t) ∗ u(t) + h2(t) ∗ u(t).
(3) Associative property: h1(t) ∗ (h2(t) ∗ u(t)) = (h1(t) ∗ h2(t)) ∗ u(t).

Proof. (1) Commutative property: consider that h(t) ∗ u(t) = ∑_{τ=−∞}^{+∞} h(τ)u(t − τ). By changing the variable as t − τ = τ1 we can write:

∑_{τ=−∞}^{+∞} h(τ)u(t − τ) = ∑_{τ1=−∞}^{+∞} h(t − τ1)u(τ1) = u(t) ∗ h(t).

(2) Distributive property: this is a straightforward implication of the linearity of the convolution operator:

(h1(t) + h2(t)) ∗ u(t) = ∑_{τ=−∞}^{+∞} (h1(τ) + h2(τ))u(t − τ)
= ∑_{τ=−∞}^{+∞} h1(τ)u(t − τ) + ∑_{τ=−∞}^{+∞} h2(τ)u(t − τ)
= h1(t) ∗ u(t) + h2(t) ∗ u(t).


Figure 2. Parallel composition: two systems h1 and h2 fed by the same input u, with their outputs summed, are equivalent to a single system h1 + h2.

Figure 3. Series composition: the cascade of h1 and h2 is equivalent to a single system h1 ∗ h2.

(3) Associative property:

(h1(t) ∗ h2(t)) ∗ u(t) = ∑_{τ2=−∞}^{∞} u(t − τ2) ∑_{τ1=−∞}^{∞} h1(τ1)h2(τ2 − τ1)
= ∑_{τ1=−∞}^{∞} h1(τ1) ∑_{τ2=−∞}^{∞} u(t − τ2)h2(τ2 − τ1)
= ∑_{τ1=−∞}^{∞} h1(τ1) ∑_{τ2′=−∞}^{∞} u(t − τ1 − τ2′)h2(τ2′)  (with the change of variable τ2′ = τ2 − τ1)
= ∑_{τ1=−∞}^{∞} h1(τ1) (u(t) ∗ h2(t))|_{t−τ1}
= h1(t) ∗ (u(t) ∗ h2(t)) = h1(t) ∗ (h2(t) ∗ u(t)).

The first property has a merely operational value: if we want to process u(t) through a system h(t) we can compute the more convenient of h(t) ∗ u(t) and u(t) ∗ h(t).

The second and the third properties have a more substantial meaning related to the composition of two systems. The distributive property reveals that the parallel composition of two systems produces the same result as one system whose impulse response is given by the sum of the two impulse responses (see Figure 2).

The associative property is related to the series composition of two systems, which is equivalent to one system having an impulse response given by the convolution of the two impulse responses (see Figure 3).


3.1.4. Eigenfunctions. There is a particular type of function that is treated in a special way by LTI systems. Consider the function u(t) = z^t and compute the forced evolution of an LTI system. The response to this input is given by:

∑_{τ=−∞}^{+∞} h(τ)u(t − τ) = ∑_{τ=−∞}^{∞} u(t − τ)h(τ) = ∑_{τ=−∞}^{∞} z^{t−τ}h(τ) = z^t ∑_{τ=−∞}^{∞} z^{−τ}h(τ) = z^t H(z),

where H(z) = ∑_{τ=−∞}^{∞} z^{−τ}h(τ).

The property just shown can be stated by saying that whenever a signal z^t (i.e., an exponential in t) gets processed by a DT LTI system, the outcome is the same signal scaled by a constant H(z), which is the limit of the series ∑_{τ=−∞}^{∞} z^{−τ}h(τ), as long as the series converges. For this reason z^t is called an eigenfunction. An eigenfunction is essentially similar to an eigenvector of a standard linear application, with H(z) being its eigenvalue. This applies to both real and complex z.
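A numerical illustration (a sketch, not from the notes; the finitely supported impulse response h and the value of z below are arbitrary choices):

```python
def H(z, h):
    # H(z) = sum_tau z^(-tau) h(tau), for h supported on 0 .. len(h)-1.
    return sum(z**(-tau) * h_tau for tau, h_tau in enumerate(h))

def forced(h, u, t):
    # Convolution sum y(t) = sum_tau h(tau) u(t - tau).
    return sum(h[tau] * u(t - tau) for tau in range(len(h)))

h = [1.0, 2.0, -2.0]      # illustrative finite impulse response
z = 1.1
u = lambda t: z**t        # the candidate eigenfunction
for t in range(5):
    # The output is the input scaled by the constant H(z).
    assert abs(forced(h, u, t) - z**t * H(z, h)) < 1e-9
print("z^t is an eigenfunction with eigenvalue H(z) =", H(z, h))
```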

3.1.4.1. A key property of eigenvalues. We should recall that eigenvectors enjoy an extremely useful property. Consider a linear application from R^n to R^n and suppose that it is associated with a matrix A:

y = Ax.

Suppose that we can identify n independent eigenvectors u1, u2, . . . , un related to the eigenvalues λ1, . . . , λn. Let M = [u1 u2 . . . un] be the matrix composed using these vectors and assume that these vectors form a basis. Let x̃ be the coordinates in this basis of a generic vector x expressed in the canonical basis. We have:

x̃ = M^{−1}x.

Similarly, the transformed version of y is given by:

ỹ = M^{−1}y.

By combining the two conditions we find:

(3.5) ỹ = M^{−1}y = M^{−1}Ax = M^{−1}AM x̃.


It can easily be seen that M^{−1}AM is a diagonal matrix. In simple words, this means that if we express a vector using a basis of eigenvectors, the system operates on each component in a decoupled way.
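This diagonalisation can be verified numerically (a sketch, not from the notes; the matrix A is an arbitrary symmetric example, chosen so that its eigenvectors are independent):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
eigvals, M = np.linalg.eig(A)          # columns of M are eigenvectors of A
D = np.linalg.inv(M) @ A @ M           # change of basis to the eigenvector basis
assert np.allclose(D, np.diag(eigvals))
print("M^-1 A M is diagonal, with the eigenvalues on the diagonal")
```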

The same property holds if we express any signal as a linear combination of eigenfunctions. The convenience of this choice will become apparent in the next few pages.

3.2. Forced Evolution of continuous–time systems

The line of argument used for DT systems can be applied to CT systems as well. The key point is the definition of an appropriate impulse function.

3.2.1. Dirac δ. The notion of impulse is not as obvious for CT systems as it is for DT systems. Indeed, the type of functions we normally deal with are at the very least continuous and often differentiable. The impulse function does not fall within this category, but it is useful to think of it as the limit case of a relatively well behaved signal. The simplest choice that can be made is to define:

(3.6)    δ(t) = lim_{Δ→0} δ_Δ(t)

(3.7)    δ_Δ(t) = { 1/Δ   t ∈ [−Δ/2, Δ/2]
                    0     t ∉ [−Δ/2, Δ/2]

A few properties that are easy to show are the following:

Property 1: If we compute the integral of the Dirac δ on any interval enclosing the origin, we get 1:

∀a > 0, b > 0,    ∫_{−a}^{b} δ(τ) dτ = 1.

Property 2: Multiplying δ(t − τ) by any function has the effect of “sampling” the value of the function at τ, meaning that we obtain an impulse function multiplied by f(τ):

f(t)δ(t − τ) = f(τ)δ(t − τ).

Property 3: Any function can be expressed as an integral of impulse functions:

∀ε > 0,    f(t) = ∫_{t−ε}^{t+ε} f(τ)δ(t − τ) dτ.

Indeed,

∫_{t−ε}^{t+ε} f(τ)δ(t − τ) dτ = ∫_{t−ε}^{t+ε} f(t)δ(t − τ) dτ
                              = f(t) ∫_{t−ε}^{t+ε} δ(t − τ) dτ
                              = f(t),

where the first step follows from Property 2.


Property 4: The integral from any negative number to a generic instant t produces a step function:

∀ε > 0,    ∫_{−ε}^{t} δ(τ) dτ = 1(t),

where 1(t) is defined as

1(t) = { 1   t > 0
         0   t ≤ 0.

Property 5: We could define

δ(t) = (d/dt) 1(t).

This is an obvious abuse of notation because we know that the step function is not differentiable in t = 0.

3.2.2. Impulse Response. We can repeat the same line of argument followed in Section 3.1.2 to compute the forced response of system (3.1) starting from the impulse response (i.e., the forced response to δ(t)):

h(t) = F_forced(δ).

Indeed, we can write

u(t) = ∫_{−∞}^{+∞} u(τ)δ(t − τ) dτ.

Using linearity and time–invariance:

y(t) = F_forced(u)
     = F_forced( ∫_{−∞}^{+∞} u(τ)δ(t − τ) dτ )
     = ∫_{−∞}^{+∞} u(τ) F_forced(δ(t − τ)) dτ
     = ∫_{−∞}^{+∞} u(τ) h(t − τ) dτ.

This allows us to conclude that once we know h(t), we can compute the response to any signal. This operation is called the “convolution integral” and is denoted by the same symbol as the convolution sum: u(t) ∗ h(t). The computation of y(t) requires the following steps:

(1) Compute the “reflection” of h(τ) through τ = 0.
(2) Translate the result to the right by t (to the left if t is negative).
(3) Compute the product by u(τ) and then the integral of the function thus obtained.

These steps can be appreciated through an example.


Example 3.4. Suppose that h(t) = 1(t)e^{−3t}. Compute the response of the system to u(t) = 1(t). The result can be found as follows:

y(t) = F_forced(u)
     = ∫_{−∞}^{+∞} u(τ)h(t − τ) dτ
     = ∫_{−∞}^{+∞} 1(τ)1(t − τ)e^{−3(t−τ)} dτ
     = ∫_{0}^{t} e^{−3t+3τ} dτ
     = e^{−3t} (1/3) e^{3τ} |_{τ=0}^{t}
     = (1 − e^{−3t})/3.
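The closed-form result of the example can be cross-checked numerically. The sketch below (plain Python, not part of the notes) approximates the convolution integral with a Riemann sum and compares it with (1 − e^{−3t})/3 at a few sample instants.

```python
import math

def h(t):
    # impulse response of the example: h(t) = 1(t) e^{-3t}
    return math.exp(-3.0 * t) if t >= 0 else 0.0

def u(t):
    # unit step input u(t) = 1(t)
    return 1.0 if t >= 0 else 0.0

def convolve(t, dt=1e-4, span=5.0):
    # Riemann-sum approximation of y(t) = ∫ u(τ) h(t − τ) dτ
    n = int(span / dt)
    return sum(u(k * dt) * h(t - k * dt) for k in range(n)) * dt

for t in (0.5, 1.0, 2.0):
    exact = (1.0 - math.exp(-3.0 * t)) / 3.0
    print(t, convolve(t), exact)  # the two columns agree to ~1e-4
```

The step size dt controls the discretisation error; shrinking it makes the sum converge to the exact convolution integral.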

3.2.3. Properties of the convolution integral. The convolution integral has the same properties as the convolution sum. Three important ones are stated in the following.

Theorem 3.5. The convolution integral enjoys the following properties:

(1) Commutative Property: h(t) ∗ u(t) = u(t) ∗ h(t).
(2) Distributive Property: (h₁(t) + h₂(t)) ∗ u(t) = h₁(t) ∗ u(t) + h₂(t) ∗ u(t).
(3) Associative Property: h₁(t) ∗ (h₂(t) ∗ u(t)) = (h₁(t) ∗ h₂(t)) ∗ u(t).

The proof is very similar to the one we have given for the convolution sum.

3.2.4. Eigenfunctions. Suppose that we use as input an exponential function u(t) = e^{st}. Then

y(t) = ∫_{−∞}^{+∞} h(τ)u(t − τ) dτ
     = ∫_{−∞}^{+∞} h(τ)e^{s(t−τ)} dτ
     = e^{st} ∫_{−∞}^{+∞} h(τ)e^{−sτ} dτ.

Let H(s) be the integral ∫_{−∞}^{+∞} h(τ)e^{−sτ} dτ, assuming that it converges. We can summarise the computation above by saying that e^{st} is an eigenfunction related to the eigenvalue H(s).

The above result applies to both real and complex exponentials. This leads us to an interesting fact if we analyse the output to a harmonic function: u(t) = cos ωt. Using Euler’s formula, we have

cos ωt = (e^{jωt} + e^{−jωt}) / 2.


Applying linearity we find:

y(t) = ∫_{−∞}^{+∞} h(τ)u(t − τ) dτ
     = ∫_{−∞}^{+∞} h(τ) (e^{jω(t−τ)} + e^{−jω(t−τ)})/2 dτ
     = (1/2) ∫_{−∞}^{+∞} h(τ)e^{jω(t−τ)} dτ + (1/2) ∫_{−∞}^{+∞} h(τ)e^{−jω(t−τ)} dτ
     = (1/2) e^{jωt} H(jω) + (1/2) e^{−jωt} H(−jω),

where H(jω) = ∫_{−∞}^{+∞} h(τ)e^{−jωτ} dτ. Let z̄ represent the complex conjugate of a complex number z (for longer expressions we write conj(·)). If z₁ and z₂ are two complex numbers and α is a real, we can easily show that:

(1) conj(z₁ + z₂) = z̄₁ + z̄₂
(2) conj(z₁z₂) = z̄₁z̄₂
(3) conj(αz₁) = α z̄₁.

Applying these properties, and recalling that h(τ) is real, we can see:

conj(H(jω)) = conj( ∫_{−∞}^{+∞} h(τ)e^{−jωτ} dτ )
            = ∫_{−∞}^{+∞} h(τ) conj(e^{−jωτ}) dτ
            = ∫_{−∞}^{+∞} h(τ)e^{jωτ} dτ
            = H(−jω).

Therefore,

y(t) = (1/2) e^{jωt} H(jω) + (1/2) e^{−jωt} H(−jω)
     = (1/2) e^{jωt} H(jω) + (1/2) e^{−jωt} conj(H(jω)).

If we use for H(jω) its modulus/phase representation:

H(jω) = |H(jω)| ej∠H(jω),


we find:

y(t) = (1/2) e^{jωt} H(jω) + (1/2) e^{−jωt} conj(H(jω))
     = (1/2) |H(jω)| e^{j∠H(jω)} e^{jωt} + (1/2) |H(jω)| e^{−j∠H(jω)} e^{−jωt}
     = (|H(jω)|/2) ( e^{j(ωt+∠H(jω))} + e^{−j(ωt+∠H(jω))} )
     = |H(jω)| cos(ωt + ∠H(jω)).

The result above can be summarised in the following:

Theorem 3.6. Consider a CT LTI system. If ∫_{−∞}^{+∞} h(τ)e^{−jωτ} dτ converges to a value H(jω), then the system responds to a harmonic input function cos ωt with a harmonic output function having the same frequency.
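The quantity H(jω) can be estimated numerically. The sketch below (plain Python, not part of the notes) picks the sample impulse response h(t) = 1(t)e^{−2t}, an assumption made only for illustration, and compares a Riemann-sum estimate of H(jω) against the closed-form transform 1/(2 + jω).

```python
import cmath

def H(omega, dt=1e-3, span=10.0):
    # Riemann-sum approximation of H(jω) = ∫₀^∞ h(τ) e^{-jωτ} dτ
    # for the (assumed) impulse response h(τ) = e^{-2τ}, τ ≥ 0
    n = int(span / dt)
    return sum(cmath.exp(-2.0 * k * dt) * cmath.exp(-1j * omega * k * dt)
               for k in range(n)) * dt

omega = 3.0
exact = 1.0 / (2.0 + 1j * omega)  # closed-form transform of e^{-2t} 1(t)
print(abs(H(omega) - exact))      # small discretisation error
print(abs(exact), cmath.phase(exact))  # gain and phase of the cos(3t) response
```

Per Theorem 3.6, |H(jω)| and ∠H(jω) are exactly the gain and phase shift applied to a cos ωt input.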

3.3. Properties of the impulse response

Many important properties of the system can be inferred by looking at the analytical features of h(t), both for DT and CT systems.

3.3.1. Causality. The first important property that can be read from h(t) is causality, as specified in the following Theorem.

Theorem 3.7. Let h(t) be the impulse response of a system Σ. The system is causal if and only if h(t) = 0 for t < 0.

Proof. Let us focus for simplicity on CT systems. Necessity derives from the observation that if we choose u(t) = δ(t), a causal system cannot respond before the impulse is applied; hence h(t) = 0 for t < 0.

Sufficiency is shown considering that if h(t) = 0 for t < 0, then h(t − τ) = 0 for τ > t, and therefore

∫_{−∞}^{+∞} h(t − τ)u(τ) dτ = ∫_{−∞}^{t} h(t − τ)u(τ) dτ.

As shown in the formula above, the computation of y at time t is only affected by u(·) up until t. The proof for DT systems is entirely similar.

3.3.2. BIBO Stability. Key to this course is the idea of stability. Roughly speaking, a stable system is one for which small perturbations of various kinds produce small effects on the evolution of the output signal. We will see several possible ways in which this intuitive notion can be formalised in mathematical terms. The first one applies to IO representations and is the so-called Bounded-Input-Bounded-Output (BIBO) stability formalised below.

Definition 3.8 (BIBO stability). A system is BIBO stable iff for all ε > 0 there exists a positive real δ > 0 such that

|u(t)| ≤ ε =⇒ |y(t)| < δ.


Figure 4. Scheme of a unicycle mobile robot.

The idea is that “small” input signals produce “small” output signals.

Example 3.9. Consider a unicycle robot that moves along a path (Figure 4). Suppose we are able to control the vehicle forward speed v(t) and the angular speed ω(t). We can easily see that the differential equations describing the kinematics of the system are

ẏ = v sin θ
θ̇ = ω.

Suppose that our input is ω and our output is y. We can see that if we apply the following signal

ω(t) = { ε   t ∈ [0, 0.1 s]
         0   t ∉ [0, 0.1 s],

the system slightly changes its orientation, but from that moment onward y starts to grow unbounded even if ε is very small. This is the perfect example of a BIBO unstable system.
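The unbounded growth can be seen in simulation. The sketch below (plain Python, not part of the notes) integrates the kinematics with the Euler method; the constant forward speed v = 1 is an assumption, since the example does not fix v.

```python
import math

def simulate(eps, T=50.0, dt=1e-3, v=1.0):
    # Euler integration of ẏ = v sin θ, θ̇ = ω with the pulse input
    # ω(t) = eps on [0, 0.1 s], 0 afterwards (v = 1 assumed)
    y, theta, t = 0.0, 0.0, 0.0
    while t < T:
        omega = eps if t <= 0.1 else 0.0
        y += v * math.sin(theta) * dt
        theta += omega * dt
        t += dt
    return y

# the tiny pulse leaves θ at a small nonzero constant,
# after which y drifts linearly without bound
print(simulate(0.01, T=50.0))
print(simulate(0.01, T=100.0))  # roughly doubles: y keeps growing
```

However small ε is, θ settles to a nonzero constant and y grows linearly in time, so no output bound δ can work: the system is not BIBO stable.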

The BIBO stability property is particularly easy to study for LTI systems, for which we can prove the following.

Theorem 3.10. Consider a LTI system Σ with impulse response h(t). If the system is DT, then it is BIBO stable if and only if there exists a constant S such that Σ_{τ=−∞}^{∞} |h(τ)| = S < ∞. If the system is CT, then it is BIBO stable if and only if there exists a constant S such that ∫_{−∞}^{∞} |h(τ)| dτ = S < ∞.

Proof. We give the proof for the case of DT systems. The output of the system is given by

y(t) = Σ_{τ=−∞}^{∞} h(τ)u(t − τ).


To prove sufficiency, assume that for some ε > 0 we have |u(t)| ≤ ε, ∀t. We have:

|y(t)| = | Σ_{τ=−∞}^{∞} h(τ)u(t − τ) |
       ≤ Σ_{τ=−∞}^{∞} |h(τ)u(t − τ)|
       = Σ_{τ=−∞}^{∞} |h(τ)| |u(t − τ)|
       ≤ Σ_{τ=−∞}^{∞} |h(τ)| ε
       = ε Σ_{τ=−∞}^{∞} |h(τ)|
       = εS.

Hence,

|u(t)| ≤ ε =⇒ |y(t)| ≤ δ = Sε,

which proves sufficiency.

To prove necessity, it is sufficient to consider the input signal u(t) = ε sign(h(−t)), which is bounded by ε. If we compute y(0), we have:

y(0) = Σ_{τ=−∞}^{∞} h(τ)u(−τ)
     = Σ_{τ=−∞}^{∞} ε h(τ) sign(h(τ))
     = ε Σ_{τ=−∞}^{∞} |h(τ)|.

Therefore, if Σ_{τ=−∞}^{∞} |h(τ)| diverges, so will y(0) in the face of a bounded signal u(t).
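The absolute-summability condition of Theorem 3.10 is easy to probe numerically for a DT system. The sketch below (plain Python, not part of the notes) takes the sample impulse response h(t) = aᵗ1(t), an assumption for illustration, and truncates the sum Σ|h(τ)|.

```python
def abs_sum(a, terms=200):
    # truncated absolute sum Σ_{τ=0}^{terms-1} |h(τ)| for h(τ) = a^τ 1(τ)
    return sum(abs(a) ** t for t in range(terms))

print(abs_sum(0.5))  # converges to 1/(1 - 0.5) = 2 → BIBO stable
print(abs_sum(1.1))  # keeps growing with the truncation length → not BIBO stable
```

For |a| < 1 the geometric series converges to 1/(1 − |a|); for |a| ≥ 1 the truncated sum grows without bound, matching the theorem.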


CHAPTER 4

Laplace and z-Transform

In the past chapter we have seen that LTI systems respond to exponential signals (e^{st} for CT systems and z^{t} for DT systems) in a special way. Such functions are “eigenfunctions”, meaning that the corresponding output is the same function multiplied by an eigenvalue, which is given by:

H(s) = ∫_{0}^{∞} h(τ)e^{−sτ} dτ    for CT systems,

H(z) = Σ_{τ=0}^{∞} h(τ)z^{−τ}    for DT systems.

This is the basis for a solution strategy based on the so-called Laplace and z-Transforms. Before delving into this, it is useful to recap some basic facts about complex exponentials.

4.1. Complex exponentials

The exponential function has some fundamental properties:

e^{a} e^{b} = e^{a+b}
e^{a} / e^{b} = e^{a−b}
(e^{a})^{n} = e^{na}.

The very same properties are also enjoyed by a strange complex-valued function:

e^{jθ} = cos θ + j sin θ,

which we call the Euler exponential. We can see this for the first property:

e^{jθ} e^{jψ} = (cos θ + j sin θ)(cos ψ + j sin ψ)
             = (cos θ cos ψ − sin θ sin ψ) + j(sin θ cos ψ + sin ψ cos θ)
             = cos(θ + ψ) + j sin(θ + ψ)
             = e^{j(θ+ψ)}.

It is possible to define exponentials with both a real and an imaginary part:

e^{σ+jθ} = e^{σ} e^{jθ} = e^{σ}(cos θ + j sin θ).

Another important property has to do with the computation of a power of a complex number z. If we express z = ρe^{jθ}, we have z^{n} = ρ^{n} e^{jnθ}.


4.2. Complex Exponential functions

Based on the definition of the complex exponential, we can introduce complex exponential functions.

In the CT domain, given any complex number s = σ + jω, with σ ∈ R, ω ∈ R, we define the CT complex exponential as the function from R to C:

e^{st} = e^{(σ+jω)t} = e^{σt}(cos ωt + j sin ωt).

Both the real and the imaginary part of this function are sinusoidal oscillations whose amplitude is modulated by an exponential function (increasing if σ > 0 and decreasing if σ < 0).

Likewise, if we consider any complex variable z, we can express z = ρe^{jθ}. In this setting z^{t} is given by:

z^{t} = ρ^{t} e^{jtθ}.

4.3. Definition of the Laplace Transform

The Laplace transform is an integral operator defined as

L(f(t)) = F(s) = ∫_{0}^{∞} f(τ)e^{−sτ} dτ.

The idea is to associate each signal f (which is a function of t) with a new function F(s), which is a function of the complex variable s.

Example 4.1. Let us compute the Laplace transform of 1(t).

L(1(t)) = ∫_{0}^{∞} 1(τ)e^{−sτ} dτ
        = ∫_{0}^{∞} e^{−sτ} dτ
        = (1/s)(1 − lim_{t→∞} e^{−st}).

We can easily see that

lim_{t→∞} e^{−st} = { 0                   Real(s) > 0
                      does not converge    Real(s) ≤ 0.

Therefore,

L(1(t)) = { 1/s        if Real(s) > 0
            undefined   otherwise.
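Inside the ROC the defining integral really does converge to 1/s, which can be checked numerically for real s. The sketch below (plain Python, not part of the notes) truncates the integral at a large horizon and compares it with 1/s.

```python
import math

def laplace_step(s, dt=1e-3, span=30.0):
    # truncated Riemann sum for L(1(t)) = ∫₀^∞ e^{-st} dt, real s > 0
    n = int(span / dt)
    return sum(math.exp(-s * k * dt) for k in range(n)) * dt

for s in (0.5, 1.0, 2.0):
    print(s, laplace_step(s), 1.0 / s)  # the two columns agree
```

For Real(s) ≤ 0 the integrand does not decay, the truncated sum keeps growing with the horizon, and no finite value is approached, mirroring the "undefined" branch above.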


Example 4.2. Let us compute the Laplace transform of 1(t)e^{at}, with a ∈ R.

L(1(t)e^{at}) = ∫_{0}^{∞} e^{aτ}e^{−sτ} dτ
             = ∫_{0}^{∞} e^{−(s−a)τ} dτ
             = (1/(s−a))(1 − lim_{t→∞} e^{−(s−a)t}).

We can easily see that

lim_{t→∞} e^{−(s−a)t} = { 0                   Real(s) = σ > a
                          does not converge    Real(s) = σ ≤ a.

Hence, the Laplace transform is given by:

L(1(t)e^{at}) = { 1/(s−a)    if Real(s) > a
                  undefined   otherwise.

The same line of reasoning also applies to exponential functions with complex exponent: e^{(σ+jω)t} with σ ∈ R and ω ∈ R. In this case the result is 1/(s − σ − jω) and the convergence condition is Real(s) > σ.

Example 4.3. Let us compute the Laplace transform of the Dirac δ(t).

L(δ(t)) = ∫_{0}^{∞} δ(τ)e^{−sτ} dτ
        = ∫_{0}^{∞} δ(τ)e^{−s·0} dτ
        = ∫_{0}^{∞} δ(τ) dτ
        = 1.

Contrary to the examples before, this transform is defined for all possible s.

Looking at the examples above, we can derive the following conclusions:

• The Laplace transform of a CT signal is a function of a complex variable s.
• The transform is not defined for all possible values of s. In general, we have a region of convergence (ROC) where the transform makes sense. For the step function 1(t) the ROC is Real(s) > 0. For an exponential signal 1(t)e^{at} the ROC is Real(s) > a.

There are several points that remain to be addressed. The first one is whether the relation between a function and its Laplace transform is a bijection. In the affirmative case, another open point is how to invert the Laplace


transform. The third point is the meaning and the practical use of this function.

4.4. Existence and Uniqueness of the Laplace transform

For the discussion on existence and uniqueness of the Laplace transform, we need the following definitions.

Definition 4.4. A function f(t) is said to be of exponential order γ if there exists a constant A > 0 such that

|f(t)| ≤ Ae^{γt}.

It is easy to show the following.

Theorem 4.5. Consider a function f(t) and assume that: 1) f(t) is continuous, 2) f(t) is of exponential order γ. Then the Laplace transform

L(f(t)) = ∫_{0}^{∞} f(τ)e^{−sτ} dτ

exists and the ROC contains the half-space Real(s) > γ.

This theorem gives us conditions for the existence of the Laplace transform for some values of s.

The condition on invertibility requires one more definition.

Definition 4.6. Two functions f(t) and g(t) are said to be almost equal, f(t) ≈ g(t), if f(t) and g(t) are equal for all t except for a set of points of null measure.

A simple way to think of two functions that are almost equal is given in the following example.

Example 4.7. Consider the two functions f(t) = e^{3t} and

g(t) = { 0        if t = 2, 4, 6, 8, . . .
         e^{3t}    otherwise.

The two functions are almost equal because the set of isolated points given by t = 2k, with k ∈ N, has null measure.

We are now able to state the following:

Theorem 4.8 (Lerch’s Theorem). Suppose f(t) and g(t) are continuous except for a countable number of isolated points, and that they are of exponential order γ. Then if L(f(t)) = L(g(t)) for all s > γ, the two functions are almost equal: f(t) ≈ g(t).

From an engineering viewpoint, saying that two functions are almost equal is tantamount to saying that they are the same; therefore, with a slight abuse of notation, we will write f(t) = g(t).


4.4.1. Inverse Laplace Transform. The importance of Lerch’s theorem lies in the fact that under reasonable and mild conditions the Laplace transform is actually invertible. The inverse is given by:

f(t) = L⁻¹(F(s)) = (1/(2πj)) ∫_{σ−j∞}^{σ+j∞} F(s)e^{st} ds,

where Real(s) = σ belongs to the ROC. This complex integral is generally difficult to compute and we will find more manageable means to compute the inverse transform. However, the expression is quite insightful, because it reveals that the Laplace transform is actually a decomposition of a vector f(t) using a basis given by the family of exponential functions, with F(s) being in some sense the coordinate related to the basis element e^{st}.

4.4.2. A first application of the Laplace transform. As we discussed above, the Laplace transform H(s) of the impulse response has actually a very precise meaning: it is the eigenvalue of the eigenfunction e^{st}. The function H(s) is commonly called the transfer function. This allows us to immediately answer the query in the following example.

Example 4.9. Suppose that a system has impulse response 1(t)e^{t}. Find the forced response to e^{αt}. For the reasons that we have seen, H(s) = 1/(s − 1) with ROC Real(s) > 1. The eigenvalue is given by H(α) and is defined only if α > 1. In this case the forced response is given by y(t) = e^{αt}/(α − 1).

With a moderate effort, we can also answer the query in the following example.

Example 4.10. Suppose that a system has impulse response 1(t)e^{−3t}. Find the forced response to cos 4t. As discussed in Chapter 3 (Theorem 3.6), the response is well defined if H(j4) converges, and in this case it is given by:

y(t) = |H(j4)| cos(4t + ∠H(j4)).

H(s) is given by 1/(s + 3) and its ROC is Real(s) > −3. Since j4 is on the imaginary axis, H(j4) converges and the response above is well defined. We can find:

|H(j4)| = |1/(j4 + 3)|
        = |1| / |3 + j4|
        = 1/√(9 + 16)
        = 1/5,


and

∠H(j4) = ∠(1/(j4 + 3))
       = ∠1 − ∠(3 + 4j)
       = − arctan 4/3.
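The modulus and phase computed by hand in the example can be checked directly with complex arithmetic. The sketch below (plain Python, not part of the notes) evaluates H(j4) = 1/(3 + 4j) and reads off its polar form.

```python
import cmath
import math

H4 = 1.0 / (3.0 + 4.0j)      # H(j4) for H(s) = 1/(s + 3)
print(abs(H4))               # modulus: 1/5 = 0.2
print(cmath.phase(H4))       # phase: -arctan(4/3) ≈ -0.927 rad
```

So the forced response is y(t) = (1/5) cos(4t − arctan 4/3), matching the hand computation.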

4.5. Properties of the Laplace transform

We have seen expressions for both the direct and the inverse Laplace transform. The easiest way to accomplish these tasks is not the application of these formulas, but the use of a number of properties that we will discuss below and the knowledge of the expressions of a few “elementary” transforms. In this section we will quickly review some of the properties.

4.5.1. Linearity. Given the linearity of the integral operator, it is straightforward to prove the following.

Theorem 4.11. Let F1(s) = L (f1(t)) and F2(s) = L (f2(t)). Then,

L (h1f1(t) + h2f2(t)) = h1F1(s) + h2F2(s).

We immediately see an application of this result.

Example 4.12. Let us compute the Laplace transform of cos ωt.

L(cos ωt) = L( (e^{jωt} + e^{−jωt})/2 )
          = (1/2) ( L(e^{jωt}) + L(e^{−jωt}) )
          = (1/2) ( 1/(s − jω) + 1/(s + jω) )
          = (1/2) ( 2s / (s² − (jω)²) )
          = s / (s² + ω²).

4.5.2. Time Shifting. This property applies when we delay or anticipate a causal function.

Theorem 4.13. Let F(s) = L(1(t)f(t)). Then F(s)e^{−st₀} = L(1(t − t₀)f(t − t₀)).


Proof. We can prove this by the simple change of variable t′ = t − t₀:

L(1(t − t₀)f(t − t₀)) = ∫_{0}^{∞} 1(t − t₀)f(t − t₀)e^{−st} dt
                      = e^{−st₀} ∫_{0}^{∞} 1(t − t₀)f(t − t₀)e^{−s(t−t₀)} dt
                      = e^{−st₀} ∫_{−t₀}^{∞} 1(t′)f(t′)e^{−st′} dt′
                      = e^{−st₀} ∫_{0}^{∞} 1(t′)f(t′)e^{−st′} dt′
                      = e^{−st₀} F(s).

This property is shown in the following example.

Example 4.14. Find the Laplace transform of f(t) given by the following expression:

f(t) = { 1   if t ∈ [0, 1]
         0   otherwise.

We can easily see that f(t) = 1(t) − 1(t − 1). Hence,

L(f(t)) = L(1(t)) − L(1(t − 1))
        = 1/s − e^{−s}/s
        = (1 − e^{−s})/s.
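The result can be verified by computing the defining integral of the pulse directly, since f vanishes outside [0, 1]. The sketch below (plain Python, not part of the notes) compares a Riemann sum of ∫₀¹ e^{−st} dt with (1 − e^{−s})/s for real s.

```python
import math

def laplace_pulse(s, dt=1e-4):
    # Riemann sum of L(f) = ∫₀¹ e^{-st} dt for the unit pulse on [0, 1]
    n = int(1.0 / dt)
    return sum(math.exp(-s * k * dt) for k in range(n)) * dt

for s in (0.5, 2.0):
    print(s, laplace_pulse(s), (1.0 - math.exp(-s)) / s)  # columns agree
```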

4.5.3. Shifting in the Laplace domain. The dual property of time shifting applies if we apply a shift in the Laplace complex variable, as shown in the following.

Theorem 4.15. Let F(s) = L(f(t)). Then F(s − s₀) = L(f(t)e^{s₀t}).

Proof. Using the formula for the Laplace transform, we have:

L(f(t)e^{s₀t}) = ∫_{0}^{∞} f(τ)e^{s₀τ}e^{−sτ} dτ
              = ∫_{0}^{∞} f(τ)e^{−(s−s₀)τ} dτ
              = F(s − s₀),

where the last step derives from the application of the definition of the Laplace transform.


Example 4.16. Suppose we know that L(1(t)) = 1/s; then it is immediate to deduce that L(1(t)e^{at}) = 1/(s − a). Likewise, if we know that L(cos ωt) = s/(s² + ω²), it is immediate to deduce that

L(e^{at} cos ωt) = (s − a) / ((s − a)² + ω²).

4.5.4. Time Scaling. It is interesting to see what happens if we expand or shrink the time scale.

Theorem 4.17. Let F(s) = L(f(t)) and let a ∈ R⁺. Then L(f(at)) = (1/a)F(s/a).

Proof. We can apply the definition and the change of variable t′ = at:

L(f(at)) = ∫_{0}^{∞} f(at)e^{−st} dt
         = ∫_{0}^{∞} f(t′)e^{−st′/a} dt′/a
         = (1/a)F(s/a).

Example 4.18. Suppose we know that L(cos t) = s/(s² + 1). Then

L(cos ωt) = (1/ω) · (s/ω)/(s²/ω² + 1) = s/(s² + ω²).

4.5.5. Convolution. Now we come to one of the most important properties of the Laplace transform for its implications on Linear Systems.

Theorem 4.19. Let f(t) and h(t) be two causal functions and let L(f(t)) = F(s) and L(h(t)) = H(s). Then,

L(f(t) ∗ h(t)) = F(s)H(s).

Proof. Let g(t) = f(t) ∗ h(t). We have

L(g(t)) = ∫_{0}^{∞} e^{−st} f(t) ∗ h(t) dt
        = ∫_{0}^{∞} e^{−st} ( ∫_{0}^{t} h(τ)f(t − τ) dτ ) dt
        = ∫_{0}^{∞} ∫_{0}^{t} e^{−st} h(τ)f(t − τ) dτ dt.

The integration is carried out in the triangle 0 ≤ τ ≤ t. We can change the order of integration:

L(g(t)) = ∫_{τ=0}^{∞} ∫_{t=τ}^{∞} e^{−st} h(τ)f(t − τ) dt dτ.


Let us make the change of variable t′ = t − τ:

L(g(t)) = ∫_{τ=0}^{∞} ∫_{t′=0}^{∞} e^{−s(t′+τ)} h(τ)f(t′) dt′ dτ
        = ( ∫_{τ=0}^{∞} e^{−sτ} h(τ) dτ ) ( ∫_{t′=0}^{∞} e^{−st′} f(t′) dt′ )
        = F(s)H(s).

We have just discovered a key fact: convolution in the time domain becomes algebraic product in the Laplace domain. This suggests the following procedure to find the forced response to any signal u(t):

(1) compute the transfer function H(s) = L(h(t)),
(2) compute U(s),
(3) compute the inverse transform of H(s)U(s).

The practical execution of these steps requires: 1. a method for the computation of H(s) given a system described by a differential equation, 2. an effective strategy to compute the inverse transform. Such problems will be addressed below.
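The convolution theorem itself can be illustrated numerically. The sketch below (plain Python, not part of the notes) picks the sample causal functions f(t) = e^{−t}1(t) and h(t) = e^{−2t}1(t), an assumption for illustration, and checks that the transform of their convolution equals the product of the transforms at a real test point s.

```python
import math

def f(t): return math.exp(-t) if t >= 0 else 0.0
def h(t): return math.exp(-2.0 * t) if t >= 0 else 0.0

def laplace(g, s, dt=1e-2, span=10.0):
    # truncated Riemann sum for L(g)(s) = ∫₀^∞ g(t) e^{-st} dt
    n = int(span / dt)
    return sum(g(k * dt) * math.exp(-s * k * dt) for k in range(n)) * dt

def conv(t, dt=1e-2):
    # Riemann sum for (f * h)(t) = ∫₀^t f(τ) h(t − τ) dτ
    n = int(t / dt)
    return sum(f(k * dt) * h(t - k * dt) for k in range(n)) * dt

s = 1.5
lhs = laplace(conv, s)                 # L(f * h)(s)
rhs = laplace(f, s) * laplace(h, s)    # F(s) H(s) = 1/((s+1)(s+2))
print(lhs, rhs)                        # agree up to discretisation error
```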

4.5.6. Differentiation rule. The computation of the Laplace transform of the derivative is key to the practical solution of differential equations.

Theorem 4.20. Let F(s) = L(f(t)). Then,

L(df(t)/dt) = sF(s) − f(0).

Proof. We can apply integration by parts to the definition of the Laplace transform:

L(df(t)/dt) = ∫_{0}^{∞} (df(t)/dt) e^{−st} dt
            = e^{−st}f(t) |_{0}^{∞} − ∫_{0}^{∞} f(t)(−se^{−st}) dt
            = lim_{t→∞} e^{−st}f(t) − f(0) + s ∫_{0}^{∞} f(t)e^{−st} dt
            = sF(s) − f(0),

where the last step is justified because in the ROC we must have lim_{t→∞} e^{−st}f(t) = 0.

We now have a clear avenue to the solution of any differential equation. Consider the following example.

Example 4.21. Consider the RC network discussed in Example 2.1. Set u(t) = V_in(t)1(t) and y(t) = V_C(t). As we know, its evolution is described


by the differential equation:

ẏ = −y/(RC) + u(t)/(RC).

Let τ = RC. We can apply the properties of the Laplace transform as follows:

L(ẏ) = L(−y/τ + u(t)/τ)
sY(s) − y(0) = −Y(s)/τ + U(s)/τ
Y(s) = U(s)/(τ(s + 1/τ)) + y(0)/(s + 1/τ).

Since u(t) = 1(t), we have U(s) = 1/s and therefore

Y(s) = 1/(τs(s + 1/τ)) + y(0)/(s + 1/τ).

We can observe a few facts of interest.

• The decomposition between the forced evolution (1/(τs(s + 1/τ))) and the free evolution (y(0)/(s + 1/τ)) comes for free.
• Given that the forced response is given by U(s)/(τ(s + 1/τ)) and that we know, from the convolution rule, that Y(s) = H(s)U(s), where H(s) is the transfer function, we can deduce that H(s) = 1/(τ(s + 1/τ)).
• We can further write

1/(τs(s + 1/τ)) = 1/s − 1/(s + 1/τ).

Under our assumptions, the Laplace transform is invertible. Hence, we can write:

y(t) = 1(t)(1 − e^{−t/τ}) + y(0)e^{−t/τ}.
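The closed-form solution obtained via the Laplace transform can be cross-checked by integrating the original differential equation directly. The sketch below (plain Python, not part of the notes) uses the Euler method with arbitrarily chosen values τ = 0.5 and y(0) = 2, assumptions made only for the check.

```python
import math

def simulate_rc(tau=0.5, y0=2.0, T=3.0, dt=1e-4):
    # Euler integration of ẏ = -y/τ + u/τ with step input u(t) = 1(t)
    y, t = y0, 0.0
    while t < T:
        y += (-y / tau + 1.0 / tau) * dt
        t += dt
    return y

tau, y0, T = 0.5, 2.0, 3.0
closed_form = (1.0 - math.exp(-T / tau)) + y0 * math.exp(-T / tau)
print(simulate_rc(tau, y0, T), closed_form)  # the two values agree
```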

We can apply the differentiation rule recursively. For the case of the second derivative:

L(D²f(t)) = sL(Df(t)) − Df(0) = s²F(s) − sf(0) − Df(0).

In the general case:

L(Dⁿf(t)) = sⁿF(s) − s^{n−1}f(0) − s^{n−2}Df(0) − . . . − D^{n−1}f(0).


4.5.7. Integration Rule. The Laplace transform of the integral of a function is easily derived, as shown in the following:

Theorem 4.22. Let F(s) = L(f(t)). Then

L( ∫_{0}^{t} f(τ) dτ ) = F(s)/s.

Proof. We can simply observe that ∫_{0}^{t} f(τ) dτ = f(t) ∗ 1(t) and apply the convolution rule, taking into account that L(1(t)) = 1/s.

The convolution, the integration and the differentiation rules simply tell us that integral and differential operations in the time domain become simple algebraic operations in the Laplace domain.

4.5.8. Final Value and Initial Value. There is much we can read from the expression of the Laplace transform without computing the inverse transform. This is shown in the following:

Theorem 4.23. Let F(s) be the known expression of L(f(t)). Then: I) if lim_{t→0} f(t) exists, then lim_{t→0} f(t) = lim_{s→∞} sF(s); II) if lim_{t→∞} f(t) exists and is finite, then lim_{t→∞} f(t) = lim_{s→0} sF(s).

Proof. Let us first prove the first claim. From the differentiation rule, we know:

(4.1)    L(Df(t)) = ∫_{0}^{∞} (d/dt)f(t) e^{−st} dt = sF(s) − f(0).

If we compute ∫_{0}^{∞} (d/dt)f(t) e^{−st} dt for s → ∞, the exponential factor kills the integrand and we find:

lim_{s→∞} ∫_{0}^{∞} (d/dt)f(t) e^{−st} dt = 0.

Therefore, from (4.1), f(0) = lim_{s→∞} sF(s). Now, let us move to the second claim. If we evaluate ∫_{0}^{∞} (d/dt)f(t) e^{−st} dt for s → 0, we have:

lim_{s→0} ∫_{0}^{∞} (d/dt)f(t) e^{−st} dt = ∫_{0}^{∞} lim_{s→0} (d/dt)f(t) e^{−st} dt
                                         = f(t) |_{0}^{∞}
                                         = f(∞) − f(0).

Considering once more Equation (4.1), we find for s → 0:

lim_{s→0} sF(s) − f(0) = f(∞) − f(0),

which leads us straight to the claim.

Example 4.24. Suppose that the Laplace transform of a function is given by F(s) = s/(s(s + 2)). Then we have f(0) = lim_{s→∞} sF(s) = 1, and f(∞) = lim_{s→0} sF(s) = 0.
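The two limits of the example can be probed numerically by evaluating sF(s) at a very large and a very small real s. The sketch below (plain Python, not part of the notes) does exactly that; note that f(t) = e^{−2t} here, so f(0) = 1 and f(∞) = 0 are the expected answers.

```python
def sF(s):
    # s · F(s) for F(s) = s/(s(s + 2)) = 1/(s + 2)
    return s * (1.0 / (s + 2.0))

print(sF(1e6))   # ≈ 1 = f(0)   (initial value theorem, s → ∞)
print(sF(1e-6))  # ≈ 0 = f(∞)   (final value theorem,   s → 0)
```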


4.5.9. Differentiation in the Laplace domain. The dual property of the differentiation in the time domain is shown in the following.

Theorem 4.25. Let L (f(t)) = F (s). Then L (−tf(t)) = dF (s)/ds.

Proof. We can write:

dF(s)/ds = (d/ds) ∫_{0}^{∞} f(t)e^{−st} dt
         = ∫_{0}^{∞} (d/ds) ( f(t)e^{−st} ) dt
         = ∫_{0}^{∞} f(t) (d/ds) e^{−st} dt
         = ∫_{0}^{∞} (−t) f(t)e^{−st} dt
         = L(−tf(t)).

We can generalise the result in the following corollary:

Corollary 4.26. Let L(f(t)) = F(s). Then L((−1)ⁿ tⁿ f(t)) = dⁿF(s)/dsⁿ.

Proof. It follows from an iterative application of Theorem 4.25.

4.6. Inversion of the Laplace Transform

The application of the properties described in the previous section leads to the following consideration.

Fact 4.27. Consider a CT LTI system described by the differential equation

(4.2)    Dⁿy(t) = Σ_{i=0}^{n−1} αᵢ Dⁱy(t) + Σ_{j=0}^{p} βⱼ Dʲu(t),

with p ≤ n and initial conditions:

y(0), Dy(0), . . . , D^{n−1}y(0), u(0), Du(0), . . . , D^{p−1}u(0).

Then the application of the differentiation rule leads to

Y(s) = Y_forced(s) + Y_free(s), with

Y_forced(s) = ( Σ_{j=0}^{p} βⱼ sʲ ) / ( sⁿ − Σ_{i=0}^{n−1} αᵢ sⁱ ) · U(s)

Y_free(s) = N₀(s) / ( sⁿ − Σ_{i=0}^{n−1} αᵢ sⁱ ),

where N₀(s) is a polynomial of degree n − 1 whose coefficients are functions of the initial conditions.


Looking at the form of Y_forced(s), we also conclude that the expression of the transfer function H(s) (i.e., the transform of the impulse response) is given by:

H(s) = L(h(t)) = ( Σ_{j=0}^{p} βⱼ sʲ ) / ( sⁿ − Σ_{i=0}^{n−1} αᵢ sⁱ ).

Observing that: 1. for standard functions u(t) the Laplace transform is itself given by a fraction of polynomials, 2. we operate in conditions where the uniqueness of the inverse Laplace transform is guaranteed (up to a number of isolated points), we conclude that the solution of a differential equation of a CT LTI system requires the inversion of two Laplace functions given by fractions of polynomials in the s variable.

4.6.1. Inversion of fractions of polynomials. The general expression of a fraction of polynomials can be written as:

A (sᵖ + a_{p−1}s^{p−1} + a_{p−2}s^{p−2} + . . . + a₀) / (sⁿ + b_{n−1}s^{n−1} + . . . + b₀),

where p ≤ n for the causality of the system. A fundamental result of algebra is the following:

Theorem 4.28. A polynomial of degree n with complex coefficients has exactly n complex roots.

This allows us to write the fraction of polynomials as

A (s − z₁)(s − z₂) . . . (s − z_p) / ( (s − p₁)(s − p₂) . . . (s − p_n) ).

The roots zᵢ of the numerator are called zeros, while the roots pᵢ of the denominator are called poles.

We now consider different cases.

4.6.1.1. The case of real and distinct poles. The first case we consider is when the poles are real and distinct. In this case it is possible to see that our function can be expressed as

F(s) = A (s − z₁)(s − z₂) . . . (s − z_p) / ( (s − p₁)(s − p₂) . . . (s − p_n) )
     = A₁/(s − p₁) + A₂/(s − p₂) + . . . + A_n/(s − p_n).

Each term Aᵢ/(s − pᵢ) is said to be a partial fraction and this is called a partial fraction expansion. We can show the following:

Proposition 4.29. Consider a function

F(s) = A (s − z₁)(s − z₂) . . . (s − z_p) / ( (s − p₁)(s − p₂) . . . (s − p_n) )

with distinct and real poles. Then the coefficients of its partial fraction expansion

F(s) = A₁/(s − p₁) + A₂/(s − p₂) + . . . + A_n/(s − p_n)

are given by:

Aᵢ = F(s)(s − pᵢ) |_{s=pᵢ}.

Proof. Without loss of generality we show the proposition for A₁. We can write

F(s) = A₁/(s − p₁) + A₂/(s − p₂) + . . . + A_n/(s − p_n)

and multiply both sides by (s − p₁):

F(s)(s − p₁) = A₁ + A₂(s − p₁)/(s − p₂) + . . . + A_n(s − p₁)/(s − p_n).

If we evaluate the result at s = p₁, each of the terms Aⱼ(s − p₁)/(s − pⱼ) is null (the poles being distinct). Therefore, the right hand side produces A₁. The left hand side is finite because the factor s − p₁ cancels the only s − p₁ term at the denominator of F(s).

Example 4.30. Consider the system:

ÿ = −5ẏ − 4y − 4u̇ + u.

We want to study the system’s evolution for u(t) = 1(t), y(0) = 1, ẏ(0) = 0, u(0) = 2. Computing the Laplace transform we get:

s²Y(s) − sy(0) − ẏ(0) = −5sY(s) + 5y(0) − 4Y(s) − 4sU(s) + 4u(0) + U(s)
(s² + 5s + 4)Y(s) = (1 − 4s)U(s) + sy(0) + ẏ(0) + 5y(0) + 4u(0)
Y(s) = (1 − 4s)/(s² + 5s + 4) U(s) + (sy(0) + ẏ(0) + 5y(0) + 4u(0))/(s² + 5s + 4)
Y(s) = (1 − 4s)/(s² + 5s + 4) · 1/s + (s + 5 + 8)/(s² + 5s + 4).

Observe that s² + 5s + 4 = (s + 4)(s + 1) and

Y(s) = (1 − 4s)/(s(s + 4)(s + 1)) + (s + 13)/((s + 4)(s + 1)).

If we compute the partial fraction expansion, we find:

Y(s) = A₁/s + A₂/(s + 1) + A₃/(s + 4) + B₁/(s + 1) + B₂/(s + 4),

where

A₁ = (1 − 4s)/((s + 4)(s + 1)) |_{s=0} = 1/4
A₂ = (1 − 4s)/(s(s + 4)) |_{s=−1} = −5/3
A₃ = (1 − 4s)/(s(s + 1)) |_{s=−4} = 17/12
B₁ = (s + 13)/(s + 4) |_{s=−1} = 4
B₂ = (s + 13)/(s + 1) |_{s=−4} = −3,

and

y(t) = 1(t) (1/4 − (5/3)e^{−t} + (17/12)e^{−4t}) + 1(t) (4e^{−t} − 3e^{−4t}).

As a sanity check, y(0) = 1/4 − 5/3 + 17/12 + 4 − 3 = 1, which matches the given initial condition.

4.6.1.2. The case of complex conjugate poles. Before considering the case of complex poles it is useful to state the following results:

Lemma 4.31. Consider a polynomial P(s) in a complex variable s with real coefficients. Then (P(s))* = P(s*).

Proof. We can write:

(P(p))* = (p^n + a_{n−1}p^{n−1} + a_{n−2}p^{n−2} + . . . + a0)*
        = (p^n)* + (a_{n−1}p^{n−1})* + . . . + a0*
        = (p*)^n + a_{n−1}(p*)^{n−1} + a_{n−2}(p*)^{n−2} + . . . + a0 = P(p*),

where the first step is applicable because the conjugate of a sum is the sum of the conjugates, and the second because the coefficients ai are real and are not affected by the conjugation.

Theorem 4.32. Consider a polynomial in a complex variable s with real coefficients. If s = p is a root of the polynomial, then so is its conjugate s = p*.

Proof. Consider the polynomial

P(s) = s^n + a_{n−1}s^{n−1} + a_{n−2}s^{n−2} + . . . + a0,

with ai ∈ R. If p is a root, then

P(p) = p^n + a_{n−1}p^{n−1} + a_{n−2}p^{n−2} + . . . + a0 = 0.

In view of Lemma 4.31 we have (P(p))* = P(p*). If we now observe that P(p) = 0 implies (P(p))* = 0, our claim easily follows.

With this result in mind, let us consider the case of fractions of polynomials with real coefficients. This situation always occurs when dealing with ordinary differential equations. Without loss of generality we will consider a single pair of complex conjugate roots. We can show the following:

Proposition 4.33. Consider the following function, given by a ratio of polynomials with real coefficients, and assume that p1 and p1* are a pair of complex conjugate poles:

F(s) = n(s)/d(s) = n(s)/((s − p1)(s − p1*) . . .).

Let

F(s) = A1/(s − p1) + A1′/(s − p1*) + . . .

be its partial fraction expansion. Then A1′ = A1*. In other words, the coefficient of the partial fraction related to the complex conjugate of a pole is the complex conjugate of the coefficient related to the pole.

Proof. Proposition 4.29 holds independently of whether the poles are real or complex. Therefore, we can write:

A1′ = lim_{s→p1*} n(s)/d(s) · (s − p1*).

The denominator d(s) can be written as d(s) = (s − p1)(s − p1*)d1(s). It is easy to see that d1(s) is a polynomial with real coefficients. By applying Lemma 4.31, we find

A1′ = n(s)/d(s) · (s − p1*)|_{s=p1*}
    = n(p1*)/(d1(p1*)(p1* − p1))
    = n(p1*)/(d1(p1*)(−2j Imag(p1)))
    = (n(p1)/(d1(p1)(2j Imag(p1))))*
    = A1*.


An important consequence of the above can be seen when we compute the inverse transform of the two partial fractions related to a complex conjugate pair. If

F(s) = A1/(s − p1) + A1′/(s − p1*) + F1(s),

with p1 = σ1 + jω1, then

L⁻¹(F(s)) = L⁻¹(A1/(s − p1) + A1′/(s − p1*) + F1(s))
          = L⁻¹(A1/(s − p1)) + L⁻¹(A1*/(s − p1*)) + L⁻¹(F1(s))
          = A1 e^{p1 t} + A1* e^{p1* t} + f1(t)
          = 2 Real(A1 e^{p1 t}) + f1(t)
          = 2 · 1(t) |A1| e^{σ1 t} cos(ω1 t + ∠A1) + f1(t).
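This identity can be checked numerically (a small Python sketch, not part of the original notes; the residue A and pole p below are made-up sample values):

```python
import cmath
import math

A = 0.3 - 0.7j          # hypothetical residue A1
p = -0.5 + 2.0j         # hypothetical pole p1 = sigma1 + j*omega1
for t in (0.0, 0.4, 1.3, 2.7):
    # Sum of the conjugate pair of exponentials...
    pair = A * cmath.exp(p * t) + A.conjugate() * cmath.exp(p.conjugate() * t)
    # ...equals the real damped sinusoid 2|A| e^{sigma t} cos(omega t + angle(A)).
    real_form = 2 * abs(A) * math.exp(p.real * t) * math.cos(p.imag * t + cmath.phase(A))
    assert abs(pair.imag) < 1e-12          # the pair sums to a real signal
    assert abs(pair.real - real_form) < 1e-12
```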

Example 4.34. Consider the differential equation:

ÿ = ẏ − y + u(t).

Compute the system's response to 1(t) for null initial conditions. Neglecting the initial conditions, the Laplace transform of the differential equation gives:

Y(s)(s² − s + 1) = U(s)

Y(s) = 1/((s² − s + 1)s)
     = 1/((s − (1 + j√3)/2)(s − (1 − j√3)/2)s)
     = A1/s + A2/(s − (1 + j√3)/2) + A2*/(s − (1 − j√3)/2)


with

A1 = 1/(s² − s + 1)|_{s=0} = 1

A2 = 1/((s − (1 − j√3)/2)s)|_{s=(1+j√3)/2}
   = 1/(j√3 · (1 + j√3)/2)
   = 1/(−3/2 + j√3/2)
   = (−3/2 − j√3/2)/(9/4 + 3/4)
   = −1/2 − j√3/6.

Therefore we can compute the inverse as:

y(t) = 1(t)(1 + 2|A2| e^{t/2} cos(√3 t/2 + ∠A2))

|A2| = √(1/4 + 3/36) = √3/3

∠A2 = atan2(−√3/6, −1/2) = −5π/6 ≈ −2.618.
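The value of A2 can be verified numerically (a Python sketch added for illustration, not part of the original notes):

```python
# Residue of 1/((s - p1)(s - p1*)s) at the simple pole s = p1,
# with p1 = (1 + j*sqrt(3))/2 as in the example.
p1 = (1 + 1j * 3**0.5) / 2
A2 = 1 / ((p1 - p1.conjugate()) * p1)

assert abs(A2 - (-0.5 - 1j * 3**0.5 / 6)) < 1e-12   # A2 = -1/2 - j*sqrt(3)/6
assert abs(abs(A2) - 3**0.5 / 3) < 1e-12            # |A2| = sqrt(3)/3
```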

4.6.1.3. The case of multiple roots. There are cases where a pole occurs as a multiple root of the denominator of the transfer function. For instance, we could have d(s) factored as (s + 3)²(s² − s + 1). For the time being, let us focus on the case of just one multiple and real pole. Without loss of generality, let us assume that the multiple pole is p1 and that its multiplicity is h:

(4.3)  F(s) = A n(s)/((s − p1)^h d1(s))

where (s − p1) does not divide d1(s).

Proposition 4.35. Consider the function in Equation (4.3), where the pole p1 is real and with multiplicity h. It is possible to come up with the following partial fraction expansion:

(4.4)  F(s) = A n(s)/((s − p1)^h d1(s))
            = A1,1/(s − p1) + A1,2/(s − p1)² + . . . + A1,h/(s − p1)^h + F1(s)

where F1(s) is found using the same rules that apply to simple poles, as discussed above, and A1,i is given by:

A1,h−r = (1/r!) d^r/ds^r [F(s)(s − p1)^h] |_{s=p1}.

Proof. Multiplying both sides of Equation (4.4) by (s − p1)^h we get:

A n(s)/d1(s) = A1,1(s − p1)^{h−1} + . . . + A1,h−1(s − p1) + A1,h + F1(s)(s − p1)^h.

Since (s − p1) divides neither d1(s) nor any denominator in the partial fraction expansion of F1(s), when we evaluate at s = p1 we will have:

A n(p1)/d1(p1) = A1,h.

Similarly, if we differentiate r times (with r < h), every term A1,k(s − p1)^{h−k} with h − k > r retains a factor (s − p1) and vanishes at s = p1; every term with h − k < r is annihilated by the differentiation; the term A1,h−r(s − p1)^r produces r! A1,h−r; and the terms involving F1(s) retain a factor (s − p1)^{h−r} and also vanish at s = p1. If we evaluate at s = p1 we therefore obtain r! A1,h−r = d^r/ds^r [A n(s)/d1(s)]|_{s=p1}, which is our claim.
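The derivative rule for repeated poles can be checked with sympy (a sketch added for illustration; the function F below is a made-up example with a double pole):

```python
import sympy as sp

s = sp.symbols('s')
# Hypothetical F(s) with a double pole at p1 = -2 (h = 2) and a simple pole at -5.
F = 1 / ((s + 2)**2 * (s + 5))
g = sp.cancel(F * (s + 2)**2)            # g(s) = F(s)(s - p1)^h

A12 = g.subs(s, -2)                      # A_{1,2} = g(p1)          (r = 0)
A11 = sp.diff(g, s).subs(s, -2)          # A_{1,1} = (1/1!) g'(p1)  (r = 1)

# Cross-check against the full expansion: the residue at -5 is 1/9.
expansion = A11/(s + 2) + A12/(s + 2)**2 + sp.Rational(1, 9)/(s + 5)
assert sp.simplify(F - expansion) == 0
```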

Another important fact is the following:

Proposition 4.36. Let F(s) = 1/(s − p)^h. Then L⁻¹(F(s)) = t^{h−1}/(h − 1)! e^{pt}.

Proof. It descends from the application of Corollary 4.26 to L(1(t)e^{at}) = 1/(s − a).

This leads us to:

Theorem 4.37. Consider the function in Equation (4.3), where the pole p1 is real and with multiplicity h. Let

F(s) = A1,1/(s − p1) + A1,2/(s − p1)² + . . . + A1,h/(s − p1)^h + F1(s);

then

L⁻¹(F(s)) = 1(t)(A1,1 e^{p1 t} + A1,2 t e^{p1 t} + A1,3 (t²/2) e^{p1 t} + . . . + A1,h (t^{h−1}/(h − 1)!) e^{p1 t}) + L⁻¹(F1(s)).

Example 4.38. Consider the following differential equation

ÿ = 2ẏ − y + u(t).

Let us compute the free evolution for y(0) = −4, ẏ(0) = 2. Neglecting the input function u(t), the Laplace transform becomes:

s²Y(s) − sy(0) − ẏ(0) − 2sY(s) + 2y(0) + Y(s) = 0,

which is easily rewritten as

Y(s) = (sy(0) + ẏ(0) − 2y(0))/(s² − 2s + 1)
     = (−4s + 10)/(s − 1)²
     = A1,1/(s − 1) + A1,2/(s − 1)²

A1,2 = (−4s + 10)|_{s=1} = 6
A1,1 = d/ds[(−4s + 10)]|_{s=1} = −4

As a consequence,

y(t) = 1(t)e^t(−4 + 6t).
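The result can be verified by substituting y(t) back into the differential equation (a sympy sketch added for illustration):

```python
import sympy as sp

t = sp.symbols('t')
y = sp.exp(t) * (-4 + 6 * t)

# y(t) must satisfy the free evolution y'' = 2y' - y ...
assert sp.simplify(sp.diff(y, t, 2) - 2 * sp.diff(y, t) + y) == 0
# ... and match the initial conditions y(0) = -4, y'(0) = 2.
assert y.subs(t, 0) == -4
assert sp.diff(y, t).subs(t, 0) == 2
```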

The case of complex conjugate poles with multiplicity greater than 1 is not significantly different. We can treat each pole of the complex conjugate pair as in Section 4.6.1.2: 1. finding the coefficients of the partial fraction expansion; 2. observing that the coefficients related to complex conjugate poles are complex conjugates of each other; 3. recombining the pairs of partial fractions with the same power and reducing them to real functions. This is shown in the following example:

Example 4.39. Let us compute the inverse of

F(s) = 1/(s(s² − s + 1)²).


Let p1 = (1 + j√3)/2 be one of the roots of (s² − s + 1) (the other one being its conjugate p1*). We can proceed as follows:

F(s) = 1/(s(s − p1)²(s − p1*)²)
     = A1,1/(s − p1) + A1,1*/(s − p1*) + A1,2/(s − p1)² + A1,2*/(s − p1*)² + A2/s

A2 = F(s)s|_{s=0} = 1

A1,2 = F(s)(s − p1)²|_{s=p1} = 1/(p1(p1 − p1*)²)

A1,1 = d/ds[F(s)(s − p1)²]|_{s=p1}
     = −((s − p1*)² + 2s(s − p1*))/(s²(s − p1*)⁴)|_{s=p1}
     = −(3s − p1*)/(s²(s − p1*)³)|_{s=p1}
     = (−3p1 + p1*)/(p1²(p1 − p1*)³)

Observing that p1 = e^{jπ/3} and p1 − p1* = 2j(√3/2) = j√3, we find

A1,2 = 1/(e^{jπ/3}(j√3)²) = −(1/3)e^{−jπ/3} = (1/3)e^{j2π/3}

and

A1,1 = (e^{−jπ/3} − 3e^{jπ/3})/(e^{j2π/3}(−j3√3))
     = (e^{−jπ/3} − 3e^{jπ/3})/(e^{j2π/3}e^{−jπ/2}3√3)
     = (e^{−jπ/3} − 3e^{jπ/3})/(e^{jπ/6}3√3)
     = (e^{−jπ/2} − 3e^{jπ/6})/(3√3)
     = (−j − 3√3/2 − (3/2)j)/(3√3)
     = (−3√3/2 − (5/2)j)/(3√3)
     = −1/2 − 5/(6√3) j
     = (1/3)√(13/3) e^{j atan2(−5/(6√3), −1/2)}
     = (1/3)√(13/3) e^{−j2.3754}.

We can combine the results as follows:

f(t) = A2 · 1(t) + 1(t)(A1,1 e^{p1 t} + A1,1* e^{p1* t}) + 1(t)(A1,2 t e^{p1 t} + A1,2* t e^{p1* t})
     = 1(t)(1 + 2 Real(A1,1 e^{p1 t}) + 2 Real(A1,2 t e^{p1 t}))
     = 1(t)(1 + (2/3)√(13/3) e^{t/2} cos(√3 t/2 − 2.3754) + (2/3) t e^{t/2} cos(√3 t/2 + 2π/3)).

4.7. BIBO Stability for CT systems

The discussion above has led us to the conclusion that a pole of a CT system gives rise to a complex exponential, possibly multiplied by a power of t (if the root is multiple). In view of Theorem 3.10, this can be translated into the following.

Theorem 4.40. Consider a CT LTI system with transfer function:

H(s) = n(s)/d(s).

Assume that no cancellation between poles and zeros takes place: there is no p such that n(p) = d(p) = 0. The system is BIBO stable if and only if all of its poles have negative real part.

Proof. In view of Theorem 3.10, the system is BIBO stable if and only if ∫₀^∞ |h(t)| dt = L < +∞. We start by proving sufficiency. If H(s) = L(h(t)) has all poles with negative real part, then h(t) = Σ_{h=1}^{H} Fh hh(t), where each function hh(t) can be one of the following:

hh(t) = e^{ph t}                                  (real single root)
hh(t) = e^{Real(ph) t} cos(Imag(ph) t + φh)       (complex single root)
hh(t) = t^{nh} e^{ph t}                           (real multiple root)
hh(t) = t^{nh} e^{Real(ph) t} cos(Imag(ph) t + φh)  (complex multiple root).

In all four cases |hh(t)| ≤ Hh(t), where Hh(t) is obtained by dropping the cosine factor:

Hh(t) = e^{ph t}, e^{Real(ph) t}, t^{nh} e^{ph t}, or t^{nh} e^{Real(ph) t}, respectively.

If Real(ph) < 0 for all poles, each of these bounds has a finite integral:

∫₀^∞ e^{Real(ph) t} dt = 1/|Real(ph)|,   ∫₀^∞ t^{nh} e^{Real(ph) t} dt = nh!/|Real(ph)|^{nh+1}.

In other words, we can find a constant M that upper bounds ∫₀^∞ Hh(t) dt for all h, and therefore

∫₀^∞ |h(t)| dt ≤ Σ_{h=1}^{H} |Fh| ∫₀^∞ Hh(t) dt ≤ M Σ_{h=1}^{H} |Fh| < +∞.

To prove necessity, consider a single pole pi with non negative real part and assume for simplicity that its multiplicity is 1. Then in the partial fraction expansion of H(s) we will have a term Ai/(s − pi) with non null Ai. Indeed,

Ai = n(pi)/((pi − p1) . . . (pi − pi−1)(pi − pi+1) . . . (pi − pn)),

and by our assumption n(pi) ≠ 0. This fraction gives rise to an exponential term in h(t) whose integral grows in an unbounded way.

Example 4.41. Let us consider the following transfer function:

H(s) = (s − 1)/(s² + 3s + 2).

Its poles are given by

p1,2 = (−3 ± √(9 − 8))/2 = −2, −1.

Both poles are negative. Hence, the system is BIBO stable.
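The same conclusion can be reached numerically (a short sketch with numpy, added for illustration):

```python
import numpy as np

# Poles of H(s) = (s - 1)/(s^2 + 3s + 2), from the denominator coefficients.
poles = np.roots([1, 3, 2])
assert np.allclose(sorted(poles.real), [-2, -1])

# BIBO stability: every pole must have negative real part.
assert all(p.real < 0 for p in poles)
```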

This example is easy to analyse because the roots of a second order equation can be found using a closed-form expression. This is possible also for orders 3 and 4, although the expressions are far less easy to deal with.

Definition 4.42. The polynomial d(s) at the denominator of the transfer function is called the characteristic polynomial, and the equation d(s) = 0 is called the characteristic equation.

4.7.1. The Routh-Hurwitz Criterion. If we are merely interested in the system's BIBO stability, the explicit computation of the roots is not required, as long as we are able to determine the sign of their real parts. This can be done using the Routh-Hurwitz criterion. Suppose the denominator d(s) of the transfer function is:

an s^n + an−1 s^{n−1} + . . . + a0.

Suppose that n is even. The first step is to form the Routh table as follows:

s^n     | an    an−2  an−4  . . .  a2   a0
s^{n−1} | an−1  an−3  an−5  . . .  a3   a1
s^{n−2} | b1    b2    b3    . . .  b_{n/2}  0
s^{n−3} | c1    c2    c3    . . .  0    0
. . .
s^0     | q


where the computation of each row is made using the two rows immediately above. For instance:

b1 = (an−1 an−2 − an an−3)/an−1
b2 = (an−1 an−4 − an an−5)/an−1
b3 = (an−1 an−6 − an an−7)/an−1
. . .
c1 = (b1 an−3 − b2 an−1)/b1
c2 = (b1 an−5 − b3 an−1)/b1
c3 = (b1 an−7 − b4 an−1)/b1
. . .

After building this table, we can state the following.

Theorem 4.43. Suppose that the Routh table can be built as shown above, and assume that its first column (an, an−1, b1, c1, . . . , q) does not contain zero elements. Then the number of sign changes in the first column of the Routh table equals the number of poles in the right half of the complex plane. Moreover, a necessary and sufficient condition for all roots to be in the left half-plane is that all coefficients ai and all the elements of the first column be positive.
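The table construction and the sign-change count can be sketched in Python (a minimal illustration, assuming no zero pivots or null rows appear; those special cases are treated below in the text):

```python
def routh_table(coeffs):
    """Routh table for coeffs = [a_n, a_{n-1}, ..., a_0]; no zero pivots handled."""
    width = (len(coeffs) + 1) // 2
    rows = [coeffs[0::2], coeffs[1::2]]          # the first two rows
    for r in rows:
        r.extend([0.0] * (width - len(r)))
    while len(rows) < len(coeffs):               # one row per power of s
        up2, up1 = rows[-2], rows[-1]
        rows.append([(up1[0] * up2[j + 1] - up2[0] * up1[j + 1]) / up1[0]
                     for j in range(width - 1)] + [0.0])
    return rows

def n_unstable(coeffs):
    # Sign changes in the first column = roots in the right half-plane.
    col = [row[0] for row in routh_table(coeffs)]
    return sum(1 for a, b in zip(col, col[1:]) if a * b < 0)
```

For example, `n_unstable([1, 3, 2])` (the polynomial s² + 3s + 2) gives 0, while `n_unstable([1, 4, 1, -6])` (which factors as (s − 1)(s + 2)(s + 3)) gives 1.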

Example 4.44. Consider the classic second order polynomial:

a2 s² + a1 s + a0.

The Routh table is given by

s²  | a2  a0
s   | a1  0
s^0 | a0.

Hence the requirement that all poles be in the left half-plane reduces to the fact that all coefficients have to be positive. This can be seen by considering that if the roots are negative, we can write the polynomial as

(s + r1)(s + r2) = s² + (r1 + r2)s + r1 r2


so r1 and r2 are positive (meaning that the roots are negative) if and only if the coefficients (r1 + r2) and r1 r2 are positive in their turn.

Example 4.45. We can now consider the case of a polynomial of degree three:

a3 s³ + a2 s² + a1 s + a0 = 0.

The Routh table is given by:

s³  | a3  a1  0
s²  | a2  a0  0
s   | (a2 a1 − a3 a0)/a2  0  0
s^0 | a0  0  0

Hence, along with ai > 0, BIBO stability will also require a2 a1 − a3 a0 > 0.

To simplify the construction of the table, we can use the following result:

Proposition 4.46. If we multiply an entire row of the Routh table by a positive constant, Theorem 4.43 still holds.

There are two important special cases:

(1) The first element of a row is 0, which prevents the derivation of the following row (it would require a division by that element);
(2) An entire row is 0.

Both cases reveal the presence of roots on the imaginary axis or in the right half-plane, and therefore the loss of BIBO stability. Still, we can use the Routh criterion to identify the number of stable and unstable roots.

The first problem is solved by perturbing the 0 element (setting it to ε) and evaluating the number of sign changes for both positive and negative ε.

Example 4.47. Consider the polynomial

s³ + s − 1.

Its associated table is

s³  | 1  1  0
s²  | 0  −1  0
s   | ?
s^0 | ?

In order to compute the missing elements of the Routh table, we can introduce ε:

s³  | 1  1  0
s²  | ε  −1  0
s   | (ε + 1)/ε  0  0
s^0 | −1  0

If we consider ε → 0+, we observe one sign change in the first column. The same holds if we consider negative values, ε → 0−. We infer the presence of one root in the right half-plane and two roots in the left half-plane.
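The prediction of the ε method can be checked directly on the roots (a numpy sketch added for illustration):

```python
import numpy as np

# s^3 + s - 1: the epsilon argument predicts one right half-plane root.
roots = np.roots([1, 0, 1, -1])
assert sum(1 for r in roots if r.real > 0) == 1
assert sum(1 for r in roots if r.real < 0) == 2
```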


When an entire row is null, two rows are proportional, and the characteristic polynomial d(s) can be divided by the polynomial associated with the row immediately above the null row. This divisor polynomial is called the "auxiliary polynomial"; call it p(s). Having a null row also indicates that the auxiliary polynomial has roots that are symmetric with respect to the imaginary axis. The number of sign changes above the null row accounts for the stability of the roots of the "remainder" polynomial. One possibility to complete the construction of the Routh table is to replace the null row with the coefficients of dp(s)/ds and then continue. Looking at the first column of the modified Routh table, the number of sign changes is still equal to the number of roots with positive real part. From the null row down, each sign change indicates a root with positive real part; since the roots of p(s) are symmetric, there is a corresponding root in the left half-plane. All roots not accounted for in this way (i.e., no sign changes) are on the imaginary axis.

Example 4.48. Consider the polynomial

d(s) = (s² + 1)(s + 1)(s + 3) = s⁴ + 4s³ + 4s² + 4s + 3.

In this case we know up-front that the polynomial has two stable roots and a pair of imaginary roots. Let us see how the modified Routh criterion allows us to find this information. The first two rows of the Routh table are:

s⁴ | 1  4  3
s³ | 4  4  0
. . .

We can divide the second row by 4 and compute the third row as

s⁴ | 1  4  3
s³ | 1  1  0
s² | 3  3  0
. . .

We divide the third row by 3 and find:

s⁴ | 1  4  3
s³ | 1  1  0
s² | 1  1  0
s  | 0  0  0

The three positive elements of the first column above the null row reveal two roots with negative real part. We can now complete the construction. The auxiliary polynomial is given by s² + 1 and its derivative is 2s. Therefore we can complete the



Figure 1. (A) series connection, (B) feedback connection

construction as:

s⁴  | 1  4  3
s³  | 1  1  0
s²  | 1  1  0
s   | 2  0  0
s^0 | 1  0  0

The two new elements are positive. This means no sign changes below the null row, and hence two roots on the imaginary axis.
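Again, the conclusion can be cross-checked on the roots themselves (a numpy sketch added for illustration):

```python
import numpy as np

# d(s) = (s^2 + 1)(s + 1)(s + 3) = s^4 + 4s^3 + 4s^2 + 4s + 3.
roots = np.roots([1, 4, 4, 4, 3])
assert sum(1 for r in roots if r.real < -1e-8) == 2      # the two stable roots
assert sum(1 for r in roots if abs(r.real) <= 1e-8) == 2  # the imaginary pair
```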

4.8. Use of Laplace Transform for Control Design

The Laplace transform is a very useful tool for designing systems that control other systems. This is done by taking advantage of some properties of the interconnections of systems. The two simplest ways of connecting two systems are shown in Figure 1. For the series connection (Figure 1.(A)), we already know that, given the two impulse responses c(t) and p(t), the output is y(t) = p(t) ∗ c(t) ∗ r(t). In the Laplace domain this property can be translated as:

Y(s) = P(s)C(s)R(s),

because convolutions correspond to products in the Laplace domain. This connection can be used in different ways, as in the following examples.

Example 4.49. Consider a system

P(s) = 1/(s + 5).


Suppose we want it to track constant reference signals with zero error. If our reference signal is r(t) = A·1(t), this requirement can be stated as:

lim_{t→∞} y(t) = A.

If we make a series connection of P(s) and C(s), we have:

Y(s) = P(s)C(s) A/s.

Applying the final value theorem,

lim_{t→∞} y(t) = lim_{s→0} s P(s)C(s) A/s = A P(0)C(0).

Hence, the specification requires C(0) = 1/P(0) = 5. The simplest way of achieving this is obviously choosing C(s) = 5. Clearly, if the pole is known with some error ∆, the steady state value will be affected by an error of comparable size.

Suppose, now, that we also require a faster response than the one associated with e^{−5t}, setting it, for instance, to e^{−10t}. One possibility is to choose the controller

C(s) = 10(s + 5)/(s + 10).

Notice that the gain has been chosen equal to 10 in order to have P(0)C(0) = 1.

As shown in the previous example, using the series connection we can adjust the steady state response (up to a certain precision) and even change the dynamics of the system, cancelling some of its poles with zeros of the controller. This is not possible if some of the poles are unstable, as shown next.

Example 4.50. Consider the system

P(s) = 1/(s − 5).

Suppose we want it to track constant reference signals with zero error, with reference r(t) = A·1(t). The evident problem here is that the system is unstable, so our first task is to "stabilise" it. We could think of a series connection with a controller like C(s) = p(s − 5)/(s + p) with p > 0.

The problem becomes evident if the pole is known only with finite precision. Suppose that instead we have

P(s) = 1/(s − 5 + ∆),


with ∆ ≪ 1. The overall response to R(s) = 1/s is given by:

P(s)C(s)R(s) = p(s − 5)/(s(s + p)(s − 5 + ∆))
             = 5/(5 − ∆) · 1/s − (p + 5)/(p + 5 − ∆) · 1/(s + p) − p∆/((5 − ∆)(5 − ∆ + p)) · 1/(s − 5 + ∆).

The term

H1/(s − 5 + ∆), with H1 = −p∆/((5 − ∆)(5 − ∆ + p)),

shows that, however small the perturbation ∆ on the pole may be, it gives rise to an exponential term e^{(5−∆)t}, which diverges, compromising the system's stability.

From the example above, we have seen that controlling a system only through a cascade connection with a controller successfully solves only part of the range of problems that a designer might encounter. In particular, compensating the error of the system by a "gain" (i.e., multiplication by a constant) can be a viable solution, but it is adversely affected by the limited knowledge of the system parameters. More importantly, an unstable system cannot be made stable by simply "cancelling" the unstable poles with zeros in the controller. A more useful solution for these cases is the feedback connection shown in Figure 1.(B). For this connection we can write:

Y(s) = P(s)U(s)
U(s) = C(s)E(s)
E(s) = R(s) − Y(s).

By combining the three equations we easily find:

(4.5)  Y(s) = P(s)C(s)/(1 + P(s)C(s)) · R(s).
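The elimination of U(s) and E(s) can be reproduced symbolically (a sympy sketch added for illustration):

```python
import sympy as sp

s = sp.symbols('s')
P, C, R, Y, U, E = sp.symbols('P C R Y U E')

# Solve the three feedback relations Y = P U, U = C E, E = R - Y for Y.
sol = sp.solve([sp.Eq(Y, P * U), sp.Eq(U, C * E), sp.Eq(E, R - Y)],
               [Y, U, E], dict=True)[0]

# The result is the closed-loop equation Y = P C R / (1 + P C).
assert sp.simplify(sol[Y] - P * C * R / (1 + P * C)) == 0
```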

The equation above is called the "closed–loop equation" and the term P(s)C(s)/(1 + P(s)C(s)) is the "closed–loop transfer function".

If we want to enforce a zero steady state error for the step response (R(s) = A/s), we can once again apply the final value theorem:

lim_{t→∞} y(t) = lim_{s→0} sY(s) = lim_{s→0} s · P(s)C(s)/(1 + P(s)C(s)) · A/s.

Now, if lim_{s→0} P(s)C(s) = ∞, we have that

lim_{t→∞} y(t) = A,


no matter how imprecise our knowledge of P(0) is. This means that either P(s) = (1/s)P1(s) or C(s) = (1/s)C1(s). To put it in other words: "if we want to use a feedback connection to follow a step function with zero steady state error, an integrator 1/s has to be found in P(s) or in C(s)". This is a particular example of a principle known as the "internal model principle": if we want to track a signal, a generator of that signal has to be contained in the system to be controlled, or we have to insert it into the loop through the controller. The fact that C(s) appears in the denominator of the closed–loop transfer function allows us to shift the positions of the poles. Unlike the simple pole-zero cancellation used before, this technique is much more robust to parametric uncertainty. This is shown in the following.

Example 4.51. Consider the system

P(s) = (10 + ∆0)/(s − 5 − ∆1)

where ∆0 and ∆1 are bounded but unknown. Suppose we choose the feedback controller:

C(s) = 10(s + 5)/s.

Let us see whether or not it fulfils the specified requirements.

In order to fulfil the requirement of zero steady state error, we need to have an integrator in the loop, which is verified in our case because C(s) has a pole in s = 0, as long as the closed–loop system is stable. The closed–loop transfer function is given by

P(s)C(s)/(1 + P(s)C(s)) = [(10 + ∆0)/(s − 5 − ∆1) · 10(s + 5)/s] / [1 + (10 + ∆0)/(s − 5 − ∆1) · 10(s + 5)/s]
                        = (10 + ∆0) · 10(s + 5) / [(s − 5 − ∆1)s + 10(s + 5)(10 + ∆0)]
                        = (10 + ∆0) · 10(s + 5) / [s² + (10∆0 − ∆1 + 95)s + 500 + 50∆0]

In order to verify when the system is stable, we can apply the Routh criterion and require that all coefficients of the characteristic equation be positive, which is verified if:

∆0 > −10
∆1 < 10∆0 + 95.

As long as these conditions are met, the system is both BIBO stable and responds with null error to a step input. We emphasise that with an "open loop" (series connection) scheme, an arbitrarily small ∆1 is sufficient to destroy stability, and any ∆0 ≠ 0 determines errors in the steady state behaviour.
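The closed-loop transfer function above can be verified symbolically (a sympy sketch added for illustration):

```python
import sympy as sp

s, d0, d1 = sp.symbols('s Delta0 Delta1')
P = (10 + d0) / (s - 5 - d1)       # the uncertain plant
C = 10 * (s + 5) / s               # the feedback controller
T = sp.cancel(P * C / (1 + P * C))  # closed-loop transfer function

# Expected form from the example's algebra.
T_expected = 10 * (10 + d0) * (s + 5) / (s**2 + (95 + 10*d0 - d1)*s + 500 + 50*d0)
assert sp.simplify(T - T_expected) == 0
```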


4.9. The z-Transform

The z-Transform is the discrete time counterpart of the Laplace transform. In the same way as the Laplace transform, the z-Transform provides analytical methods for the solution of difference equations; it offers direct insight into the transient and steady state behaviour of a DT signal, and can be used to evaluate the stability of a system.

The definition of the z-Transform is the following:

Z(f(t)) = Σ_{t=0}^{∞} f(t) z^{−t}.

This time we associate a DT signal with a function of the complex variable z. We will find it convenient to use the polar representation z = ρe^{jφ}.

Example 4.52. Let us compute the z-Transform of 1(t):

Z(1(t)) = Σ_{t=0}^{∞} 1(t) z^{−t} = Σ_{t=0}^{∞} z^{−t} = lim_{H→∞} Σ_{t=0}^{H} z^{−t} = lim_{H→∞} (1 − z^{−H})/(1 − z^{−1}).

Setting z = ρe^{jθ}, we have z^{−H} = ρ^{−H}e^{−jHθ}. We have three cases:

lim_{H→∞} z^{−H} = 0 if ρ = |z| > 1; ∞ if ρ = |z| < 1; e^{−jHθ} (no limit) if ρ = |z| = 1.

Therefore,

Z(1(t)) = lim_{H→∞} (1 − z^{−H})/(1 − z^{−1}) = 1/(1 − z^{−1}) if |z| > 1, and is not defined otherwise.


Example 4.53. Let us compute the z-Transform of 1(t)a^t:

Z(1(t)a^t) = Σ_{t=0}^{∞} 1(t) a^t z^{−t} = Σ_{t=0}^{∞} (a/z)^t = lim_{H→∞} Σ_{t=0}^{H} (a/z)^t = lim_{H→∞} (1 − (a/z)^H)/(1 − a/z).

Suppose a > 0. Setting z = ρe^{jθ}, we have (a/z)^H = (a/ρ)^H e^{−jHθ}. If a < 0, we have (a/z)^H = (|a|/ρ)^H e^{jHπ−jHθ}. This allows us to conclude:

Z(1(t)a^t) = z/(z − a) if |z| > |a|, and is not defined otherwise.
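The convergence inside the ROC can be observed numerically by truncating the series (a Python sketch added for illustration; a and z are sample values with |z| > |a|):

```python
a, z = 0.8, 1.5 + 0.5j                        # |z| ~ 1.58 > |a| = 0.8
partial = sum((a / z)**t for t in range(200))  # truncated sum of a^t z^(-t)
closed_form = z / (z - a)
assert abs(partial - closed_form) < 1e-9
```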

From this discussion we can conclude the following facts.

• The z-Transform of a DT signal is a function of a complex variable z (in the same way as the Laplace transform is a function of the complex variable s).
• As for the Laplace transform, it is typically not possible to define the z-Transform for all values of z, but only on a subset that we call the Region of Convergence (ROC).
• Whereas for the Laplace transform the ROC is typically a half-plane (Real(s) > α), for the z-Transform it is the exterior of a circle: |z| > ρ.

4.10. Existence and Uniqueness of z-Transform

On the existence of the z-Transform, we can show the following result:

Theorem 4.54. Consider a function f(t) and assume that one of the following limits exists:

Rf = lim_{t→∞} |f(t)|^{1/t}
Rf = lim_{t→∞} |f(t + 1)/f(t)|.

Then:

(1) the z-Transform Z(f(t)) exists and converges for |z| > Rf;
(2) the z-Transform is analytic, i.e., continuous and infinitely differentiable w.r.t. z, for |z| > Rf.


Showing this result is complex and beyond the scope of this course. We only remark that the existence of the limit Rf = lim_{t→∞} |f(t)|^{1/t} can be rephrased as the existence of an exponential upper bound for the function,

|f(t)| ≤ A Rf^t,

for some A. We can also show the following inverse result:

Theorem 4.55. If F(z) and G(z) are the z-Transforms of two functions f(t) and g(t), and if F(z) = G(z) for all |z| > R, for some R > 0, then f(t) = g(t) for t = 0, 1, 2, . . ..

Proof. If G(z) = F(z) for all |z| > R, then

Σ_{t=0}^{∞} f(t)z^{−t} = Σ_{t=0}^{∞} g(t)z^{−t}  ⟺  Σ_{t=0}^{∞} (f(t) − g(t))z^{−t} = 0  ⟺  Σ_{t=0}^{∞} at w^t = 0,

where we have set w = 1/z and at = f(t) − g(t). It is known that if a power series Σ_{t=0}^{∞} at w^t vanishes for all |w| ≤ W, with W > 0, then at = 0 for all t. Using this result, we conclude f(t) = g(t).

4.10.1. Inverse z-Transform. It is possible to show that the inverse of the z-Transform is given by the following line integral:

f(t) = Z⁻¹(F(z)) = (1/(2πj)) ∮_{|z|=R} F(z) z^{t−1} dz,

where the line integral is computed along a circle included in the ROC. As shown below, this formula is impractical, and the inverse z-Transform is best computed using a different procedure. As for the Laplace transform, the formula is insightful in that it reveals that the z-Transform is in fact a decomposition of a function over a basis of functions of type z^n.

4.11. Properties of the z-Transform

Not surprisingly, the z-Transform enjoys a number of properties that bear a close resemblance to those we have studied for the Laplace transform. This is shown in the following.

Theorem 4.56. The z-Transform has the following properties (we use upper-case letters for the z-Transform and lower-case letters for the corresponding function). Let X(z) have ROC R, X1(z) have ROC R1 and X2(z) have ROC R2.

Linearity: Z(α1 x1(t) + α2 x2(t)) = α1 X1(z) + α2 X2(z) (ROC R′ ⊇ R1 ∩ R2)
Time-shifting: if x(t) is a causal signal and k > 0, then Z(x(t − k)) = z^{−k}X(z) (ROC R′ ⊇ R ∩ {0 < |z| < ∞}), and Z(x(t + k)) = z^k X(z) − z^k x(0) − z^{k−1} x(1) − . . . − z x(k − 1) (ROC R′ = R)
Multiplication by exponential: Z(z0^t x(t)) = X(z/z0) (ROC R′ = |z0|R)
Multiplication by t: Z(t x(t)) = −z dX(z)/dz (ROC R′ = R)
Time Scaling: Z(x(t/t0)) = X(z^{t0}) for t0 a positive integer
Convolution: Z(x1(t) ∗ x2(t)) = X1(z)X2(z) (ROC R′ ⊇ R1 ∩ R2)
Accumulation: Z(Σ_{τ=0}^{t} x(τ)) = z/(z − 1) X(z) (ROC R′ ⊇ R ∩ {|z| > 1})
Initial value: x(0) = lim_{z→∞} X(z)
Final value: x(∞) = lim_{z→1} (z − 1)X(z), if x(∞) exists.

Proof. The proof is quite similar to the proofs we have given for the analogous properties of the Laplace transform. Let us take two examples.

1. Time shifting (with positive shift):
\[
\begin{aligned}
\mathcal{Z}\left(x(t+k)\right) &= \sum_{t=0}^{\infty} x(t+k) z^{-t} = z^k \sum_{t=0}^{\infty} x(t+k) z^{-(t+k)} = z^k \sum_{t'=k}^{\infty} x(t') z^{-t'} \\
&= z^k \left( \sum_{t'=0}^{\infty} x(t') z^{-t'} \right) - x(0) z^k - x(1) z^{k-1} - \ldots - z\, x(k-1) \\
&= z^k X(z) - x(0) z^k - x(1) z^{k-1} - \ldots - z\, x(k-1).
\end{aligned}
\]

2. Convolution (case of causal signals):
\[
\begin{aligned}
\mathcal{Z}\left(x_1(t) * x_2(t)\right) &= \sum_{t=0}^{\infty} \left(x_1(t) * x_2(t)\right) z^{-t} = \sum_{t=0}^{\infty} \sum_{\tau=0}^{\infty} x_1(\tau) x_2(t - \tau) z^{-t} \\
&= \sum_{\tau=0}^{\infty} x_1(\tau) \sum_{t=0}^{\infty} x_2(t - \tau) z^{-(t-\tau)} z^{-\tau} = \sum_{\tau=0}^{\infty} x_1(\tau) z^{-\tau} \sum_{t'=-\tau}^{\infty} x_2(t') z^{-t'} \\
&= X_1(z) X_2(z),
\end{aligned}
\]
where the last equality uses the causality of $x_2$, so that the terms with $t' < 0$ vanish. $\square$

Example 4.57. We can use the properties to build the transform of more complex signals. For instance, consider the signal $s(t) = 1(t) \cos \Omega t$. We can use the properties as follows:
\[
\begin{aligned}
\mathcal{Z}\left(1(t) \cos \Omega t\right) &= \frac{1}{2} \mathcal{Z}\left(1(t) e^{j\Omega t}\right) + \frac{1}{2} \mathcal{Z}\left(1(t) e^{-j\Omega t}\right) = \frac{1}{2} \frac{z}{z - e^{j\Omega}} + \frac{1}{2} \frac{z}{z - e^{-j\Omega}} \\
&= \frac{1}{2} \frac{z\left(z - e^{-j\Omega} + z - e^{j\Omega}\right)}{z^2 - z\left(e^{j\Omega} + e^{-j\Omega}\right) + 1} = \frac{z(z - \cos \Omega)}{z^2 - 2z \cos \Omega + 1}.
\end{aligned}
\]
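The closed form just obtained can be checked numerically by truncating the defining series of the z-Transform. A minimal sketch (the values of $\Omega$ and the test point $z$ below are arbitrary choices, not from the notes):

```python
import cmath
import math

# Sanity check: for |z| > 1 the defining series of Z(1(t) cos(Omega t))
# should approach the closed form z(z - cos Omega)/(z^2 - 2 z cos Omega + 1).
Omega = 0.7
z = 1.5 * cmath.exp(0.3j)            # an arbitrary point with |z| > 1
series = sum(math.cos(Omega * t) * z ** (-t) for t in range(2000))
closed = z * (z - math.cos(Omega)) / (z * z - 2 * z * math.cos(Omega) + 1)
err = abs(series - closed)
```

Since the terms decay like $|z|^{-t}$, a few thousand terms are far more than enough for machine precision.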


By using the properties, we can find the following results:

z-Transform of known Signals

Signal | z-Transform | ROC
$\delta(t)$ | $1$ | $z \in \mathbb{C}$
$\delta(t - t_0)$ | $z^{-t_0}$ | $z \in \mathbb{C} \setminus \{0\}$
$1(t)$ | $\frac{z}{z-1}$ | $|z| > 1$
$t$ | $\frac{z}{(z-1)^2}$ | $|z| > 1$
$t^2$ | $\frac{z(z+1)}{(z-1)^3}$ | $|z| > 1$
$a^t$ | $\frac{z}{z-a}$ | $|z| > |a|$
$t a^t$ | $\frac{az}{(z-a)^2}$ | $|z| > |a|$
$t^2 a^t$ | $\frac{az(z+a)}{(z-a)^3}$ | $|z| > |a|$
$\cos \Omega t$ | $\frac{z(z - \cos \Omega)}{z^2 - 2z\cos \Omega + 1}$ | $|z| > 1$
$\sin \Omega t$ | $\frac{z \sin \Omega}{z^2 - 2z\cos \Omega + 1}$ | $|z| > 1$
$a^t \cos \Omega t$ | $\frac{z(z - a\cos \Omega)}{z^2 - 2az\cos \Omega + a^2}$ | $|z| > |a|$
$a^t \sin \Omega t$ | $\frac{az \sin \Omega}{z^2 - 2az\cos \Omega + a^2}$ | $|z| > |a|$

A more interesting application of the properties introduced above is the following.

Example 4.58. Consider the following difference equation:
\[
y(t+2) = 3y(t+1) - 2y(t) + u(t+1) - 3u(t).
\]
Let us find $Y(z)$ for $u(t) = 1(t)$, $y(1) = 1$, $y(0) = -1$, $u(0) = 0$. We can find the z-Transform as follows:
\[
\mathcal{Z}\left(y(t+2)\right) = \mathcal{Z}\left(3y(t+1) - 2y(t) + u(t+1) - 3u(t)\right).
\]
The application of the time-shifting rule produces
\[
z^2 Y(z) - z^2 y(0) - z y(1) = 3z Y(z) - 3z y(0) - 2Y(z) + z U(z) - z u(0) - 3U(z),
\]
which can be written as
\[
Y(z) = \frac{U(z)(z-3)}{z^2 - 3z + 2} + \frac{z^2 y(0) + z\left(y(1) - 3y(0) - u(0)\right)}{z^2 - 3z + 2}
= \frac{z(z-3)}{(z-1)(z^2 - 3z + 2)} + \frac{z^2 y(0) + z\left(y(1) - 3y(0) - u(0)\right)}{z^2 - 3z + 2}.
\]

4.12. Inverse z-Transform

As shown by Example 4.58, also for LTI DT systems the evolution of the system is the sum of a free evolution and a forced evolution. In both cases we have to deal with a ratio of polynomials whose numerator is typically proportional to $z$. One possible way to deal with this case is to use the same technique (partial fraction expansion) that we have used for the Laplace transform. This is shown in the following sequence of examples.

Example 4.59. Let us compute the free evolution of Example 4.58 for $y(1) = 1$, $y(0) = -1$, $u(0) = 0$:
\[
Y(z) = \frac{z^2 y(0) + z\left(y(1) - 3y(0) - u(0)\right)}{z^2 - 3z + 2} = \frac{-z^2 + 4z}{(z-2)(z-1)}.
\]
It is convenient to divide by $z$ and then proceed with partial fraction expansion:
\[
\frac{Y(z)}{z} = \frac{-z + 4}{(z-2)(z-1)} = \frac{2}{z-2} - \frac{3}{z-1},
\]
and
\[
Y(z) = \frac{2z}{z-2} - \frac{3z}{z-1}
\quad \Rightarrow \quad
y(t) = 1(t)\left(2 \cdot 2^t - 3\right).
\]
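As a quick sanity check (a sketch, not part of the notes), the free evolution just found must satisfy the homogeneous recurrence and the given initial conditions:

```python
# The free evolution y(t) = 2*2^t - 3 must satisfy y(t+2) = 3 y(t+1) - 2 y(t)
# (the difference equation of Example 4.58 with u = 0) and y(0) = -1, y(1) = 1.
def y(t):
    return 2 * 2 ** t - 3

ics_ok = (y(0) == -1) and (y(1) == 1)
recurrence_ok = all(y(t + 2) == 3 * y(t + 1) - 2 * y(t) for t in range(30))
```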

Example 4.60. Let us compute the forced evolution of Example 4.58 for $u(t) = 1(t)$:
\[
Y(z) = \frac{z(z-3)}{(z-1)^2 (z-2)}.
\]
It is convenient to divide by $z$ and then proceed with partial fraction expansion:
\[
\frac{Y(z)}{z} = \frac{z-3}{(z-1)^2 (z-2)} = \frac{A_{1,1}}{(z-1)^2} + \frac{A_{1,2}}{z-1} + \frac{A_2}{z-2}
= \frac{2}{(z-1)^2} + \frac{1}{z-1} - \frac{1}{z-2},
\]
which leads to
\[
Y(z) = \frac{2z}{(z-1)^2} + \frac{z}{z-1} - \frac{z}{z-2}
\quad \Rightarrow \quad
y(t) = 1(t)\left(2t + 1 - 2^t\right).
\]
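The partial fraction inversion of $Y(z) = z(z-3)/((z-1)^2(z-2))$ can be verified numerically (a sketch): expanding the defining series with the candidate sequence $y(t) = 2t + 1 - 2^t$ must reproduce $Y(z)$ for any test point with $|z| > 2$.

```python
# Check: Y(z) = sum_t y(t) z^{-t} with y(t) = 2t + 1 - 2^t, for |z| > 2.
z = 3.1                              # arbitrary test point with |z| > 2
Y = z * (z - 3) / ((z - 1) ** 2 * (z - 2))
series = sum((2 * t + 1 - 2 ** t) * z ** (-t) for t in range(500))
err = abs(Y - series)
```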


Example 4.61. Let us compute the forced step response of the following:
\[
y(k+3) + 0.1\, y(k+2) - 0.12\, y(k+1) + 0.04\, y(k) = u(k).
\]
The z-Transform produces:
\[
z^3 Y(z) + 0.1 z^2 Y(z) - 0.12 z Y(z) + 0.04 Y(z) = U(z),
\]
\[
Y(z) = \frac{1}{z^3 + 0.1 z^2 - 0.12 z + 0.04}\, U(z) = \frac{1}{z^3 + 0.1 z^2 - 0.12 z + 0.04} \cdot \frac{z}{z-1}.
\]
The partial fraction expansion is as follows:
\[
\frac{Y(z)}{z} = \frac{1}{(z+0.5)(z-1)(z - 0.2 - 0.2j)(z - 0.2 + 0.2j)}
= \frac{A_1}{z-1} + \frac{A_2}{z+0.5} + \frac{A_3}{z - 0.2 - 0.2j} + \frac{\bar{A}_3}{z - 0.2 + 0.2j},
\]
with
\[
A_1 = \frac{1}{(1+0.5)(1 - 0.2 - 0.2j)(1 - 0.2 + 0.2j)} = 0.9804,
\]
\[
A_2 = \frac{1}{(-0.5-1)(-0.5 - 0.2 - 0.2j)(-0.5 - 0.2 + 0.2j)} = -1.2579,
\]
\[
A_3 = \frac{1}{(0.2 + 0.2j + 0.5)(0.2 + 0.2j - 1)\left((0.2 + 0.2j) - (0.2 - 0.2j)\right)}
= \frac{1}{0.008 - 0.24j} = \frac{0.008 + 0.24j}{0.0577} = 0.1386 + 4.1594j.
\]
Therefore,
\[
y(t) = 1(t)\left(A_1 + A_2(-0.5)^t + A_3(0.2 + 0.2j)^t + \bar{A}_3(0.2 - 0.2j)^t\right).
\]
Since
\[
0.2 + 0.2j = 0.2828\, e^{j\frac{\pi}{4}}, \qquad 0.2 - 0.2j = 0.2828\, e^{-j\frac{\pi}{4}},
\]
we have
\[
\begin{aligned}
y(t) &= 1(t)\left(A_1 + A_2(-0.5)^t + A_3\, 0.2828^t e^{j\frac{\pi}{4}t} + \bar{A}_3\, 0.2828^t e^{-j\frac{\pi}{4}t}\right) \\
&= 1(t)\left(A_1 + A_2(-0.5)^t + |A_3|\, 0.2828^t \left(e^{j\left(\frac{\pi}{4}t + \angle A_3\right)} + e^{-j\left(\frac{\pi}{4}t + \angle A_3\right)}\right)\right) \\
&= 1(t)\left(A_1 + A_2(-0.5)^t + 2|A_3|\, 0.2828^t \cos\left(\frac{\pi}{4}t + \angle A_3\right)\right).
\end{aligned}
\]
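The residue computations above can be mechanised (a sketch, not from the notes): for simple poles, the residue of $Y(z)/z = 1/\prod_i(z - p_i)$ at $p_i$ is $1/\prod_{k \ne i}(p_i - p_k)$, and the reconstructed sequence $y(t) = \sum_i A_i p_i^t$ must be real and satisfy the linear recurrence whose characteristic polynomial has exactly these roots.

```python
# Residues at the simple poles of Y(z)/z = 1/prod_i (z - p_i), using the
# poles of Example 4.61, plus a consistency check on y(t) = sum_i A_i p_i^t.
poles = [1.0, -0.5, 0.2 + 0.2j, 0.2 - 0.2j]

def residue(p):
    r = 1.0
    for q in poles:
        if q != p:
            r /= (p - q)
    return r

A = [residue(p) for p in poles]

def y(t):
    return sum(a * p ** t for a, p in zip(A, poles))

# coefficients of prod_i (z - p_i), highest power first
coeffs = [1.0 + 0j]
for p in poles:
    coeffs = coeffs + [0j]
    for i in range(len(coeffs) - 1, 0, -1):
        coeffs[i] -= p * coeffs[i - 1]

n = len(poles)
residual = max(abs(sum(coeffs[k] * y(t + n - k) for k in range(n + 1)))
               for t in range(10))
max_imag = max(abs(y(t).imag) for t in range(10))
```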


4.12.1. Natural modes. For CT systems we have seen that the poles determine the evolution of the system. Each pole determines an evolution of the system (a natural mode) described by an exponential function. The same can be said for DT systems, as shown in the following table.

Natural modes associated with the different poles

Pole | CT modes | DT modes
Single real pole $p$ | $e^{pt}$ | $p^t$
Multiple real pole $p$ (multipl. $m$) | $e^{pt}, te^{pt}, \ldots, t^{m-1}e^{pt}$ | $p^t, tp^t, \ldots, t^{m-1}p^t$
Single complex pair $p, \bar{p}$ | $e^{\mathrm{Real}(p)t} \cos\left(\mathrm{Imag}(p)\, t + \phi\right)$ | $|p|^t \cos\left(\angle p\, t + \phi\right)$

It is worth observing that for DT systems a real pole $p$, when negative, gives rise to an exponential mode $p^t$ that oscillates. For CT systems, on the contrary, oscillating behaviours are only possible for complex conjugate pairs.
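The observation about negative real DT poles can be illustrated in two lines (a sketch):

```python
# A negative real DT pole p yields a mode p^t whose sign alternates at every
# step, unlike any CT mode produced by a single real pole.
p = -0.8
mode = [p ** t for t in range(6)]
signs_alternate = all(mode[t] * mode[t + 1] < 0 for t in range(5))
```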

4.13. BIBO stability of DT systems

The discussion above can be translated into the following results.

Theorem 4.62. Consider a DT LTI system with transfer function:

\[
H(z) = \frac{n(z)}{d(z)},
\]
and assume that no zero-pole cancellation takes place. Then the system is BIBO stable if and only if all poles have modulus smaller than 1: $\forall p$ s.t. $d(p) = 0$, we have $|p| < 1$.

The proof of this result is the same as the proof of the similar results that we have given for CT systems. The stability could be checked with a criterion very similar to the Routh-Hurwitz criterion (called the Jury criterion), but this is out of the scope of this course.

Example 4.63. Consider the system:
\[
y(k+2) - 3y(k+1) + 2y(k) = 3u(k+1) - u(k).
\]
Let us verify whether the system is BIBO stable. The transfer function of the system follows from
\[
Y(z)\left(z^2 - 3z + 2\right) = 3z U(z) - U(z)
\quad \Rightarrow \quad
H(z) = \frac{Y(z)}{U(z)} = \frac{3z - 1}{z^2 - 3z + 2}.
\]
The roots of the denominator are given by:
\[
p = \frac{3 \pm \sqrt{9 - 8}}{2} = 2, 1.
\]
Neither pole lies inside the unit circle ($p = 2$ is outside it and $p = 1$ is on it). Hence, the system is not BIBO stable.
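The pole check can be done numerically (a sketch): compute the denominator roots and test their magnitude against 1.

```python
import cmath

# BIBO check of H(z) = (3z - 1)/(z^2 - 3z + 2) via the denominator roots.
b, c = -3.0, 2.0                     # z^2 + b z + c
disc = cmath.sqrt(b * b - 4 * c)
poles = [(-b + disc) / 2, (-b - disc) / 2]
moduli = sorted(abs(p) for p in poles)
bibo_stable = all(m < 1 for m in moduli)
```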


4.14. z-Transform of sampled data signals

It is interesting to see how the z-Transform and the Laplace transform are in fact close relatives. Consider a CT signal $f(t)$; its Laplace transform is given by $F(s) = \int_0^{\infty} f(\tau) e^{-s\tau} d\tau$. Suppose we transform the signal into a DT sequence by taking a sample every $T$ time units. This can be done by multiplying the signal by a train of Dirac $\delta$, each one extracting a sample: $f_D(t) = f(t) \sum_{k=0}^{\infty} \delta(t - kT)$. If we compute the Laplace transform of this signal, we find:
\[
\begin{aligned}
\mathcal{L}\left(f_D(t)\right) &= \int_0^{\infty} f_D(\tau) e^{-s\tau} d\tau = \int_0^{\infty} f(\tau) \sum_{k=0}^{\infty} \delta(\tau - kT)\, e^{-s\tau} d\tau \\
&= \sum_{k=0}^{\infty} \int_0^{\infty} f(kT)\, \delta(\tau - kT)\, e^{-skT} d\tau = \sum_{k=0}^{\infty} f(kT)\, e^{-skT} \int_0^{\infty} \delta(\tau - kT)\, d\tau \\
&= \sum_{k=0}^{\infty} f(kT)\, e^{-skT} = \sum_{k=0}^{\infty} f(kT) \left(e^{-sT}\right)^k.
\end{aligned}
\]
If we set $e^{sT} = z$, we find
\[
F(z) = \sum_{k=0}^{\infty} f(kT)\, z^{-k},
\]
which is exactly the definition of the z-Transform.
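For a concrete instance (a sketch; the values of $a$, $T$ and the test point $z$ are arbitrary choices), sampling $f(t) = e^{-at}$ makes the sum geometric, so $F(z) = z/(z - e^{-aT})$:

```python
import math

# z-Transform of the sampled exponential f(kT) = e^{-a k T}:
# F(z) = sum_k f(kT) z^{-k} = z/(z - e^{-aT})  for |z| > e^{-aT}.
a, T = 0.5, 0.1
z = 1.2
series = sum(math.exp(-a * k * T) * z ** (-k) for k in range(4000))
closed = z / (z - math.exp(-a * T))
err = abs(series - closed)
```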


CHAPTER 5

State space Analysis

In this chapter, we will focus our attention on the analytic tools used to compute the time response of time-invariant linear systems in state space form, as presented in Section 2.4 and here reported:
\[
(5.1) \qquad
\begin{aligned}
Dx(t) &= Ax(t) + Bu(t), \\
y(t) &= Cx(t) + Du(t),
\end{aligned}
\]
where the operator $D$ has been defined in (2.10) and represents the differential (CT systems) or difference (DT systems) operator. For a linear system in state space form, $x \in \mathbb{R}^n$ is a vector representing the $n$ states of the system, $y \in \mathbb{R}^p$ is a vector representing the $p$ outputs of the system, while $u \in \mathbb{R}^m$ is a vector representing the $m$ inputs of the system. One can immediately recognise that the state space form is a richer description of the system dynamics than the one given in terms of the transfer functions presented in Chapter 4. In fact, it allows the description of MIMO systems and, more importantly, we will see that it is also able to capture system dynamics that are not visible in the I/O relation.

5.1. Control Canonical Form

As previously presented, given a SISO linear and time invariant system described in terms of the IO relation, the evolution can be generally given by (3.1), here reported for clarity:
\[
(5.2) \qquad D^n y(t) = \sum_{i=0}^{n-1} \alpha_i D^i y(t) + \sum_{j=0}^{p} \beta_j D^j u(t),
\]
where time invariance requires that $\alpha_i$ and $\beta_j$ be constant. The system is causal if $p \le n$ and strictly causal if $p < n$. We have seen that the evolution is uniquely determined by the initial conditions
\[
y(0), Dy(0), \ldots, D^{n-1}y(0), \qquad u(0), Du(0), \ldots, D^{p}u(0)
\]
and by the input $u|_{[t_0, t]}$. In this section we will clarify how to obtain the state space description (5.1) from (5.2).

Let us start from the case $p = 0$, i.e.,
\[
D^n y(t) = \sum_{i=0}^{n-1} \alpha_i D^i y(t) + \beta_0 u(t).
\]
The immediate solution to this problem is to consider as state space variables $x_1 = y$, $x_2 = Dy$, etc., which yields
\[
A = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
\alpha_0 & \alpha_1 & \alpha_2 & \cdots & \alpha_{n-1}
\end{bmatrix},
\quad
B = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ \beta_0 \end{bmatrix},
\quad
C = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \end{bmatrix},
\quad
D = 0,
\]
with initial conditions
\[
x_i(0) = D^{i-1} y(0).
\]
The matrix $A$ is in lower horizontal companion form, since it is the companion of the characteristic polynomial $P(A)$ of the matrix $A$ (see below), due to the presence of the coefficients (with opposite sign) of the polynomial in the last row.

In the general case, i.e., $p \ge 0$, it is still possible to compute a similar description using the superposition principle and the fact that the output due to the $k$-th time derivative of the input signal is the $k$-th time derivative of the output. Indeed, let us call $x(t)$ the solution obtained considering the input $u(t)$ alone, i.e.,
\[
D^n x(t) = \sum_{i=0}^{n-1} \alpha_i D^i x(t) + u(t), \qquad \text{with} \qquad y(t) = x(t).
\]
It then follows that the output corresponding to an input involving the time derivatives of $u(t)$, such as
\[
\omega(t) = \sum_{j=0}^{p} \beta_j D^j u(t),
\]
is given by
\[
y(t) = \sum_{j=0}^{p} \beta_j D^j x(t).
\]

As a consequence, the matrix $A$ does not change, and for strictly proper systems ($p < n$) we get
\[
A = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
\alpha_0 & \alpha_1 & \alpha_2 & \cdots & \alpha_{n-1}
\end{bmatrix},
\quad
B = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix},
\quad
C = \begin{bmatrix} \beta_0 & \beta_1 & \beta_2 & \cdots & \beta_p & 0 & \cdots & 0 \end{bmatrix},
\quad
D = 0.
\]


If instead $p = n$ (non-strictly proper systems), the matrices $A$ and $B$ are as above, while
\[
C = \begin{bmatrix} \beta_0 + \beta_n \alpha_0 & \beta_1 + \beta_n \alpha_1 & \beta_2 + \beta_n \alpha_2 & \cdots & \beta_{n-1} + \beta_n \alpha_{n-1} \end{bmatrix},
\quad
D = \beta_n.
\]
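The construction above can be sketched in code (the numeric coefficients are hypothetical, chosen only for illustration). For the companion $A$ and $B = [0 \;\; 1]^T$, the vector $x = [1, s, \ldots, s^{n-1}]^T / d(s)$, with $d(s) = s^n - \sum_i \alpha_i s^i$, solves $(sI - A)x = B$, so $C(sI-A)^{-1}B$ reproduces the IO transfer function:

```python
# Control canonical form for D^2 y = a0 y + a1 Dy + (b0 + b1 D) u, and a check
# that C (sI - A)^{-1} B = (b0 + b1 s)/(s^2 - a1 s - a0) at a complex test point.
a0, a1 = -2.0, -3.0                  # hypothetical alpha coefficients
b0, b1 = 1.0, 2.0                    # hypothetical beta coefficients
A = [[0.0, 1.0], [a0, a1]]
B = [0.0, 1.0]
C = [b0, b1]

s = 0.7 + 1.3j
d = s * s - a1 * s - a0
x = [1 / d, s / d]                   # claimed solution of (sI - A) x = B
resid = max(abs(s * x[r] - sum(A[r][k] * x[k] for k in range(2)) - B[r])
            for r in range(2))
H = C[0] * x[0] + C[1] * x[1]        # C (sI - A)^{-1} B
H_expected = (b0 + b1 * s) / d
```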

Example 5.1. Consider the equation
\[
a \ddot{y}(t) - b y(t) = c \ddot{u}(t) - d \dot{u}(t) - e u(t),
\]
where $a, b, c, d, e$ are constant parameters. The canonical control form is given by
\[
A = \begin{bmatrix} 0 & 1 \\ \frac{b}{a} & 0 \end{bmatrix},
\quad
B = \begin{bmatrix} 0 \\ 1 \end{bmatrix},
\quad
C = \begin{bmatrix} \frac{cb - ea}{a^2} & -\frac{d}{a} \end{bmatrix},
\quad
D = \frac{c}{a}.
\]

Example 5.2. Consider the circuit with a resistor $R$ and an inductor $L$ reported in Figure 1. The dynamic equations are given by (Ohm and Henry laws)

Figure 1. RL example.

\[
L \frac{di(t)}{dt} + R\, i(t) = v(t),
\]
for which we assume the output $y(t) = i(t)$ and the input $u(t) = v(t)$. Therefore:
\[
\dot{y}(t) = -\frac{R}{L} y(t) + \frac{u(t)}{L},
\]

and, hence, in state space form
\[
\begin{aligned}
\dot{x}_1(t) &= -\frac{R}{L} x_1(t) + \frac{u(t)}{L}, \\
y(t) &= x_1(t).
\end{aligned}
\]

Example 5.3. Consider a mass $m$ in position $p(t)$, subjected to an external force $f(t)$ and constrained to a wall by a spring with elasticity $K$ and a damper $B$, as depicted in Figure 2. The dynamic equations are given by (Newton, Hooke and Rayleigh laws)

Figure 2. Mass-Spring-Damper example.

\[
m \frac{d^2 p(t)}{dt^2} + K\left(p(t) - \bar{p}\right) + B \frac{dp(t)}{dt} = f(t),
\]
hence, choosing $y(t) = p(t)$ and $u(t) = f(t)$ yields
\[
\ddot{y}(t) = -\frac{K}{m}\left(y(t) - \bar{y}\right) - \frac{B}{m} \dot{y}(t) + \frac{u(t)}{m}.
\]
Without loss of generality, we assume the spring relaxed position to be $\bar{y} = 0$, which yields the following canonical control form:
\[
\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ -\frac{K}{m} & -\frac{B}{m} \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t),
\qquad
y(t) = \begin{bmatrix} \frac{1}{m} & 0 \end{bmatrix} x(t).
\]

Example 5.4. The following discrete time equation
\[
a\, y(k-2) - b\, y(k-1) + c\, y(k) = -d\, u(k) + e\, u(k-1),
\]
where $a, b, c, d, e$ are constant parameters, has the following canonical control form:
\[
A = \begin{bmatrix} 0 & 1 \\ -\frac{a}{c} & \frac{b}{c} \end{bmatrix},
\quad
B = \begin{bmatrix} 0 \\ 1 \end{bmatrix},
\quad
C = \begin{bmatrix} \frac{da}{c^2} & \frac{ec - db}{c^2} \end{bmatrix},
\quad
D = -\frac{d}{c}.
\]


5.2. Coordinates Transformation

The state space description is no more than a dynamic relation between quantities, i.e., the states $x$, the inputs $u$ and the outputs $y$, expressed in different vector spaces. As reported in Section 2.4.4, the inputs and the outputs can be defined with the components pertaining to the homogeneous representation of the differential or difference equations. Moreover, the states can be identified with the outputs and their derivatives. However, this is not always possible and, more importantly, this approach may not always be the best one.

Even though the inputs $u$ and the outputs $y$ are, in some respect, constrained by the homogeneous representation of the equations, the states $x$ are to some extent arbitrary. In general, it is always possible to define a coordinates transformation that binds a state variable $x$ to another state variable, say $z$, using a coordinates mapping:
\[
z = \Phi(x).
\]
A valid mapping $\Phi(\cdot)$ able to correctly return an alternative description of the states $x$ is not arbitrary. Indeed, it has to be bijective, so that for any $x$ there exists exactly one $z$. Therefore, there also exists the inverse mapping $x = \Phi^{-1}(z)$.

Of particular interest are the linear mappings, satisfying the following linearity property:
\[
\Phi(\alpha x_1 + \beta x_2) = \alpha \Phi(x_1) + \beta \Phi(x_2).
\]
By recalling that any linear transformation from a vector space of dimension $a$ to a vector space of dimension $b$ is represented by a matrix in $\mathbb{R}^{b \times a}$, it follows that a (time invariant) linear mapping between two state space representations is given by
\[
z = Tx, \qquad T \in \mathbb{R}^{n \times n},
\]
satisfying $\det(T) \neq 0$ in order to be bijective.

In the case of a linear time invariant system, it is possible to write the dynamics in $z$ given (5.1). Indeed, by substituting $x = T^{-1}z$ in (5.1) and recalling that $T$ is time invariant, one obtains
\[
\begin{aligned}
Dz(t) &= A_z z(t) + B_z u(t), \\
y(t) &= C_z z(t) + D_z u(t),
\end{aligned}
\]
where $A_z = TAT^{-1}$, $B_z = TB$, $C_z = CT^{-1}$ and $D_z = D$. All the possible state coordinates are similar to each other and the system dynamics obtained are equivalent. As a consequence, there is no representation that is "better" than another one (which is true also for nonlinear systems), and the choice depends on the particular problem to solve.

Moreover, we also say that $A_z$ is similar to $A$.


5.3. Continuous Time Linear System Solution

For linear systems, the solution of the differential equation involved in (5.1) can be made explicit.

Let us first recall the scalar differential equation
\[
\dot{x}(t) = a x(t) + b u(t)
\]
and then extend the results to the multidimensional case. The homogeneous solution of the scalar differential equation, i.e., assuming $u(t) = 0$, is simply given by
\[
x_u(t) = e^{at} x(0),
\]
where $x(0)$ is the initial condition of the system. Using the Taylor expansion around $t_0 = 0$, the exponential function can be rewritten as
\[
x_u(t) = \sum_{i=0}^{+\infty} \frac{a^i t^i}{i!}\, x(0).
\]
Using a similar approach, we can guess that the solution for the multidimensional case is given by
\[
x_u(t) = \left(I + At + \frac{A^2 t^2}{2} + \ldots\right) x(0) = \sum_{i=0}^{+\infty} \frac{A^i t^i}{i!}\, x(0),
\]

where $I$ is the identity matrix of dimension $n$. Now, we have to show that this is in fact a solution of the differential equation $\dot{x}(t) = A x(t)$. It is easy to see that
\[
\dot{x}_u(t) = \left(A + A^2 t + \frac{A^3 t^2}{2} + \ldots\right) x(0) = A \left(I + At + \frac{A^2 t^2}{2} + \ldots\right) x(0) = A x_u(t),
\]
as desired. Similarly, we can state that the matrix exponential is given by
\[
(5.3) \qquad e^{At} = \sum_{i=0}^{+\infty} \frac{A^i t^i}{i!},
\]
and then that
\[
(5.4) \qquad x_u(t) = e^{At} x(0) = \Phi(t) x(0),
\]
where $\Phi(t)$ is referred to as the state transition matrix, which maps the initial state $x(0)$ into the state $x(t)$ with a linear transformation.

Example 5.5. Let us compute the homogeneous solution of
\[
\begin{aligned}
\dot{x}_1 &= x_1 \\
\dot{x}_2 &= -2x_1 - x_2 + u
\end{aligned}
\]
when the initial conditions are $x_1(0) = 1$ and $x_2(0) = 2$.

Here $A = \begin{bmatrix} 1 & 0 \\ -2 & -1 \end{bmatrix}$, for which $A^2 = I$. The direct application of the summation in (5.3) then yields
\[
e^{At} = I + At + I\frac{t^2}{2!} + A\frac{t^3}{3!} + I\frac{t^4}{4!} + A\frac{t^5}{5!} + \ldots
\]
and hence
\[
e^{At} = \begin{bmatrix}
1 + t + \frac{t^2}{2} + \frac{t^3}{3!} + \frac{t^4}{4!} + \frac{t^5}{5!} + \ldots & 0 \\[4pt]
-2\left(t + \frac{t^3}{3!} + \frac{t^5}{5!} + \ldots\right) & 1 - t + \frac{t^2}{2} - \frac{t^3}{3!} + \frac{t^4}{4!} - \frac{t^5}{5!} + \ldots
\end{bmatrix}.
\]
It then follows that
\[
1 + t + \frac{t^2}{2} + \frac{t^3}{3!} + \frac{t^4}{4!} + \frac{t^5}{5!} + \ldots = e^t,
\]
\[
1 - t + \frac{t^2}{2} - \frac{t^3}{3!} + \frac{t^4}{4!} - \frac{t^5}{5!} + \ldots = e^{-t},
\]
\[
t + \frac{t^3}{3!} + \frac{t^5}{5!} + \ldots = \frac{e^t - e^{-t}}{2},
\]
and finally
\[
e^{At} = \begin{bmatrix} e^t & 0 \\ e^{-t} - e^t & e^{-t} \end{bmatrix}.
\]
Therefore
\[
x(t) = \begin{bmatrix} e^t & 0 \\ e^{-t} - e^t & e^{-t} \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} e^t \\ 3e^{-t} - e^t \end{bmatrix}.
\]
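The closed-form exponential of this example can be double-checked against a truncated version of the series (5.3) (a sketch, with an arbitrary value of $t$):

```python
import math

# Truncated series (5.3) for A = [[1, 0], [-2, -1]] versus the closed form
# e^{At} = [[e^t, 0], [e^{-t} - e^t, e^{-t}]] found in Example 5.5.
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_series(A, t, terms=60):
    E = [[1.0, 0.0], [0.0, 1.0]]     # partial sum, starts at I
    P = [[1.0, 0.0], [0.0, 1.0]]     # running term A^i t^i / i!
    for i in range(1, terms):
        P = mat_mul(P, [[a * t / i for a in row] for row in A])
        E = [[E[r][c] + P[r][c] for c in range(2)] for r in range(2)]
    return E

t = 0.8
E = expm_series([[1.0, 0.0], [-2.0, -1.0]], t)
closed = [[math.exp(t), 0.0],
          [math.exp(-t) - math.exp(t), math.exp(-t)]]
err = max(abs(E[r][c] - closed[r][c]) for r in range(2) for c in range(2))
```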

It is now necessary to compute the particular solution of the differential equation to derive the complete output of the system. Indeed, the overall response will be given by the superposition principle, i.e., the sum of the homogeneous and the particular responses.

There are several ways to compute this. One possibility is to compute it directly by considering again the scalar system
\[
\dot{x}(t) = a x(t) + b u(t) \;\Rightarrow\; \dot{x}(t) - a x(t) = b u(t).
\]
Since
\[
\frac{d\, e^{-at} x(t)}{dt} = e^{-at} \left(\dot{x}(t) - a x(t)\right),
\]
it follows that
\[
\frac{d\, e^{-at} x(t)}{dt} = e^{-at} b u(t).
\]
Integrating both sides,
\[
\int_0^t \frac{d\, e^{-a\tau} x(\tau)}{d\tau}\, d\tau = e^{-at} x(t) - x(0) = \int_0^t e^{-a\tau} b u(\tau)\, d\tau,
\]
which finally yields
\[
x(t) = e^{at} x(0) + \int_0^t e^{a(t-\tau)} b u(\tau)\, d\tau.
\]

As a consequence, the particular solution is given by
\[
x_f(t) = \int_0^t e^{a(t-\tau)} b u(\tau)\, d\tau.
\]
In order to derive the solution for the multidimensional case, we first prove some properties of the matrix exponential in (5.3):

• $e^{A_1} e^{A_2} = e^{A_2} e^{A_1} = e^{A_1 + A_2}$ if and only if $A_1 A_2 = A_2 A_1$. The proof follows from the definition in (5.3);

• From the previous property it follows that $e^{A} e^{-A} = e^{A - A} = e^{0} = I$, where the last relation again follows directly from (5.3). Notice that the inverse of a matrix exponential $e^A$ is $e^{-A}$, which always exists;

• Finally, $\frac{d\, e^{At}}{dt} = A e^{At} = e^{At} A$, which again follows directly from (5.3).

We are now in a position to follow the same steps as in the scalar case. Indeed, let us start with
\[
\dot{x}(t) = A x(t) + B u(t) \;\Rightarrow\; \dot{x}(t) - A x(t) = B u(t).
\]
Since
\[
\frac{d\, e^{-At} x(t)}{dt} = -A e^{-At} x(t) + e^{-At} \dot{x}(t) = -e^{-At} A x(t) + e^{-At} \dot{x}(t) = e^{-At} \left(\dot{x}(t) - A x(t)\right),
\]
it follows that
\[
\frac{d\, e^{-At} x(t)}{dt} = e^{-At} B u(t).
\]
Integrating both sides,
\[
\int_0^t \frac{d\, e^{-A\tau} x(\tau)}{d\tau}\, d\tau = e^{-At} x(t) - x(0) = \int_0^t e^{-A\tau} B u(\tau)\, d\tau,
\]
which finally yields
\[
(5.5) \qquad x(t) = e^{At} x(0) + \int_0^t e^{A(t-\tau)} B u(\tau)\, d\tau.
\]
To double check this result, it is possible to compute
\[
\begin{aligned}
\frac{dx(t)}{dt} &= A e^{At} x(0) + \frac{d}{dt}\left( e^{At} \int_0^t e^{-A\tau} B u(\tau)\, d\tau \right) \\
&= A e^{At} x(0) + A e^{At} \int_0^t e^{-A\tau} B u(\tau)\, d\tau + e^{At} e^{-At} B u(t) \\
&= A x(t) + B u(t).
\end{aligned}
\]

Example 5.6. Compute the response of the system
\[
\begin{aligned}
\dot{x}_1 &= x_1 \\
\dot{x}_2 &= -2x_1 - x_2 + u
\end{aligned}
\]
when the initial conditions are $x_1(0) = 1$ and $x_2(0) = 2$, and the input is $u(t) = 10$, $\forall t \ge 0$.

From Example 5.5, we know that
\[
e^{At} = \begin{bmatrix} e^t & 0 \\ e^{-t} - e^t & e^{-t} \end{bmatrix},
\]
and hence the unforced response is
\[
x_u(t) = \begin{bmatrix} e^t & 0 \\ e^{-t} - e^t & e^{-t} \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} e^t \\ 3e^{-t} - e^t \end{bmatrix}.
\]
For the forced response, we have to compute
\[
\int_0^t e^{A(t-\tau)} B u(\tau)\, d\tau = \int_0^t e^{A(t-\tau)}\, d\tau\; B u(t),
\]
since the input is a step function. Therefore
\[
\int_0^t e^{A(t-\tau)}\, d\tau\; B u(t) = \begin{bmatrix}
\int_0^t e^{t-\tau}\, d\tau & 0 \\[2pt]
\int_0^t \left(e^{\tau - t} - e^{t-\tau}\right) d\tau & \int_0^t e^{\tau - t}\, d\tau
\end{bmatrix} B u(t)
= \begin{bmatrix}
e^t - 1 & 0 \\
2 - e^t - e^{-t} & 1 - e^{-t}
\end{bmatrix} B u(t),
\]
and, by noticing that $B = [0 \;\; 1]^T$, we have
\[
x(t) = x_u(t) + x_f(t) = \begin{bmatrix} e^t \\ 3e^{-t} - e^t \end{bmatrix} + \begin{bmatrix} 0 \\ 1 - e^{-t} \end{bmatrix} u(t)
= \begin{bmatrix} e^t \\ 3e^{-t} - e^t + 10\left(1 - e^{-t}\right) \end{bmatrix}.
\]
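For a constant input and an invertible $A$, the forced term of (5.5) can also be integrated in closed form as $A^{-1}(e^{At} - I) B u_0$, a standard shortcut that must agree with the result above. A sketch of the check (arbitrary time instant, $A^{-1}$ computed by the $2 \times 2$ cofactor formula):

```python
import math

# Step-input shortcut: x_f(t) = A^{-1}(e^{At} - I) B u0 (valid for invertible
# A), checked against the closed-form response of Example 5.6.
t, u0 = 0.9, 10.0
A = [[1.0, 0.0], [-2.0, -1.0]]
eAt = [[math.exp(t), 0.0],
       [math.exp(-t) - math.exp(t), math.exp(-t)]]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
Ainv = [[A[1][1] / det, -A[0][1] / det],
        [-A[1][0] / det, A[0][0] / det]]
B = [0.0, 1.0]
x0 = [1.0, 2.0]
eAtmI = [[eAt[r][c] - (1.0 if r == c else 0.0) for c in range(2)] for r in range(2)]
v = [sum(eAtmI[r][k] * B[k] for k in range(2)) for r in range(2)]
xf = [u0 * sum(Ainv[r][k] * v[k] for k in range(2)) for r in range(2)]
xu = [sum(eAt[r][k] * x0[k] for k in range(2)) for r in range(2)]
x = [xu[r] + xf[r] for r in range(2)]
expected = [math.exp(t),
            3 * math.exp(-t) - math.exp(t) + u0 * (1 - math.exp(-t))]
err = max(abs(x[r] - expected[r]) for r in range(2))
```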

Remark 5.7. The state space evolution (5.5), solution of the linear time invariant system (5.1), comprises two terms. The first depends only on the initial condition $x(0)$ of the system, and hence it is dubbed the unforced response. The second term instead depends on the input $u(t)$ but not on the initial condition, and therefore it is termed the forced response.

Remark 5.8. Since $e^{At}$ is a constant once $t$ is fixed, it turns out that the states are combined through a time invariant linear transformation. Moreover, since the integral is a linear operator, the combination of forced and unforced responses is an application of the superposition principle between the states and the inputs.

Remark 5.9. There are other definitions of the matrix exponential that may sometimes be useful, for example:
\[
e^{At} = \lim_{k \to +\infty} \left(I + \frac{At}{k}\right)^k.
\]

5.3.1. Connection between the State Space Representation and the Laplace Transform. To explain the connection between the Laplace Transform and the state space description, let us first consider the dynamics of an autonomous system:
\[
\dot{x}(t) = A x(t),
\]
whose Laplace Transforms are given by
\[
\mathcal{L}\left(\dot{x}(t)\right) = s X(s) - x(0), \qquad \mathcal{L}\left(A x(t)\right) = A X(s),
\]
obtained by applying the differentiation rule. Hence,
\[
(5.6) \qquad s X(s) - x(0) = A X(s) \;\Rightarrow\; (sI - A) X(s) = x(0) \;\Rightarrow\; X(s) = (sI - A)^{-1} x(0).
\]
The matrix $(sI - A)^{-1}$ is called the resolvent of $A$. The resolvent is defined for any $s \in \mathbb{C}$ except for the eigenvalues of $A$, which are the points at which $\det(sI - A) = 0$.

By applying the inverse Laplace Transform to (5.6), we have
\[
x(t) = \mathcal{L}^{-1}\left((sI - A)^{-1}\right) x(0),
\]
which, recalling (5.4), implies that the inverse Laplace Transform of the resolvent of $A$ is the matrix exponential $e^{At}$ or, equivalently, the transition matrix $\Phi(t)$.

Remark 5.10. An alternative way to show that $\mathcal{L}^{-1}\left((sI - A)^{-1}\right) = e^{At}$ is by considering the power series expansion of the inverse of a generic non-singular matrix $I - M$, i.e.,
\[
(I - M)^{-1} = I + M + M^2 + M^3 + \ldots,
\]
which, in the case of the resolvent, becomes
\[
(sI - A)^{-1} = \left[s \left(I - \frac{A}{s}\right)\right]^{-1} = \frac{1}{s} \left(I - \frac{A}{s}\right)^{-1} = \frac{I}{s} + \frac{A}{s^2} + \frac{A^2}{s^3} + \frac{A^3}{s^4} + \ldots
\]
It is now possible to apply the inverse Laplace Transform and the superposition principle to obtain
\[
\mathcal{L}^{-1}\left((sI - A)^{-1}\right) = I + At + \frac{A^2 t^2}{2} + \frac{A^3 t^3}{3!} + \ldots = e^{At}.
\]

Example 5.11. Consider the following harmonic oscillator:
\[
\dot{x} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} x = A x.
\]
Since
\[
sI - A = \begin{bmatrix} s & -1 \\ 1 & s \end{bmatrix},
\]
the eigenvalues are $\pm j$ and hence
\[
(sI - A)^{-1} = \frac{1}{s^2 + 1} \begin{bmatrix} s & 1 \\ -1 & s \end{bmatrix}.
\]
So, by the inverse Laplace Transform, we have
\[
x(t) = \begin{bmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{bmatrix} x(0),
\]
which is a rotation matrix. Notice that, by directly computing the matrix exponential, we have
\[
e^{At} = \sum_{i=0}^{+\infty} \frac{A^i t^i}{i!} = \begin{bmatrix}
1 - \frac{t^2}{2} + \frac{t^4}{4!} + \ldots & t - \frac{t^3}{3!} + \frac{t^5}{5!} + \ldots \\[4pt]
-t + \frac{t^3}{3!} - \frac{t^5}{5!} + \ldots & 1 - \frac{t^2}{2} + \frac{t^4}{4!} + \ldots
\end{bmatrix},
\]
whose entries are the Taylor expansions of $\cos t$ and $\sin t$ around $t_0 = 0$.

Example 5.12. Consider the following double integrator:
\[
\dot{x} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} x = A x.
\]
Since
\[
sI - A = \begin{bmatrix} s & -1 \\ 0 & s \end{bmatrix},
\]
the eigenvalues are $0$ and $0$, and hence
\[
(sI - A)^{-1} = \frac{1}{s^2} \begin{bmatrix} s & 1 \\ 0 & s \end{bmatrix}.
\]
So, by the inverse Laplace Transform, we have
\[
x(t) = \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} x(0).
\]
Notice that, by directly computing the matrix exponential, we have
\[
e^{At} = \sum_{i=0}^{+\infty} \frac{A^i t^i}{i!} = I + At,
\]
as expected. Indeed, the matrix $A$ is nilpotent, i.e., there exists a number $\bar{q}$ such that $A^q \neq 0$ for $q < \bar{q}$ and $A^q = 0$ for all $q \ge \bar{q}$ (here $\bar{q} = 2$).

We are now able to note that the time evolution of the output is obtained by substituting the solution for $x(t)$ given in (5.5) into the output equation of (5.1), i.e.,
\[
(5.7) \qquad y(t) = C \left( e^{At} x(0) + \int_0^t e^{A(t-\tau)} B u(\tau)\, d\tau \right) + D u(t).
\]
Notice that the forced response involves the extension to matrices of the convolution integral between the states and the inputs. In particular, for a SISO system and assuming that $u(t) = 0$ $\forall t < 0$, we can write
\[
\int_0^t e^{A(t-\tau)} B u(\tau)\, d\tau = e^{At} * B u(t),
\]
and, as a consequence,
\[
\mathcal{L}\left(e^{At} * B u(t)\right) = (sI - A)^{-1} B U(s).
\]
Therefore, computing the Laplace Transform of (5.7) and applying the superposition principle, we get
\[
\mathcal{L}\left(y(t)\right) = C(sI - A)^{-1} x(0) + C(sI - A)^{-1} B U(s) + D U(s).
\]

Of course, if the system starts at rest, as is usually assumed in the frequency domain approach, we have $x(0) = 0$ and hence
\[
\frac{Y(s)}{U(s)} = C(sI - A)^{-1} B + D.
\]

Remark 5.13. A similar result can also be obtained by computing the Laplace transform of the linear time invariant dynamics in (5.1):
\[
\begin{aligned}
s X(s) - x(0) &= A X(s) + B U(s), \\
Y(s) &= C X(s) + D U(s),
\end{aligned}
\]
which, by solving the first equation for $X(s)$ and then substituting into the second, yields
\[
Y(s) = C(sI - A)^{-1} x(0) + \left[C(sI - A)^{-1} B + D\right] U(s),
\]
as desired.

5.3.2. The Role of the Eigenvalues in the Transfer Function. Let us consider a SISO system. As highlighted previously, there is a clear connection between the Laplace Transform and the state space representation. Indeed, we know that for a transfer function
\[
\frac{Y(s)}{U(s)} = G(s),
\]
a prominent role is played by the poles, which are the roots of the denominator. For example, the poles determine whether a system is BIBO stable or not, by applying the inverse Laplace Transform to the output given the input.

As stated previously, assuming $x(0) = 0$ as in the transfer function analysis case, we have
\[
Y(s) = C(sI - A)^{-1} B U(s) + D U(s),
\]
where the direct coupling between the input and the output given by $D$ is now neglected, since it does not involve the system characteristics. In such a case, we have $G(s) = C(sI - A)^{-1} B$. Let us now notice that the $(i,j)$ element of the resolvent $(sI - A)^{-1}$ can be computed using Cramer's rule as
\[
\frac{(-1)^{i+j} \det \mathrm{Adj}_{j,i}}{P(A)},
\]
where $\mathrm{Adj}_{j,i}$ is the matrix $sI - A$ with the $j$-th row and the $i$-th column deleted. $P(A)$ is instead the characteristic polynomial of $A$, defined as
\[
P(A) = \det(sI - A),
\]
and satisfying the following properties:

• $P(A)$ is a polynomial of degree $n$ in $s$, with leading (i.e., $s^n$) coefficient one;

• The roots of $P(A)$ are the eigenvalues of $A$;


• $P(A)$ has real coefficients, so the eigenvalues are either real or occur in complex conjugate pairs;

• There are $n$ eigenvalues (counting multiplicity as roots of $P(A)$).

It then follows that $\det \mathrm{Adj}_{j,i}$ has degree less than $n$. Moreover, we also notice that for SISO systems all the roots of the denominator of $G(s)$ are also roots of $P(A)$ and, hence, are eigenvalues of $A$. The converse is not true, since there may be some cancellation between the roots of $P(A)$ and the roots of $\det \mathrm{Adj}_{j,i}$. This is a problem that will have a direct impact on the analysis of the structural properties of a system.

5.4. The Role of the Eigenvalues for Continuous Time Systems

Let us add some additional properties of the matrix exponential and its eigenvalues that will turn out to be useful in the following development.

• If $v \in \mathbb{R}^n$ is an eigenvector of a matrix $A$ associated with the eigenvalue $\lambda$, i.e., $Av = \lambda v$, then it immediately follows that $v$ is also an eigenvector of the matrix $A^k$ associated with the eigenvalue $\lambda^k$, i.e., $A^k v = \lambda^k v$. Hence, by recalling the definition of the matrix exponential as an infinite summation given in (5.3), we have that $v$ is also an eigenvector of $e^{At}$ associated with the eigenvalue $e^{\lambda t}$, i.e., $e^{At} v = e^{\lambda t} v$;

• For any given $A$ and for any invertible transformation matrix $T$, we have that $(T^{-1} A T)^k = T^{-1} A^k T$. Hence, it follows again from (5.3) that
\[
e^{T^{-1} A T\, t} = T^{-1} e^{At}\, T.
\]

Bearing these properties in mind, we can now relate the matrix exponential to its eigenvalues.

Bearing these properties in mind, we can now relate the exponentialmatrix to its eigenvalues.

5.4.1. Exponential Matrix for Diagonalisable Matrices. Let us first recall the definition of a diagonalisable matrix.

Definition 5.14. A square matrix $A$ is called diagonalisable if it is similar to a diagonal matrix, i.e., if there exists an invertible matrix $T$ such that $T^{-1} A T$ is a diagonal matrix.

More precisely, a matrix $A \in \mathbb{R}^{n \times n}$ is diagonalisable if and only if there exists a basis of $\mathbb{R}^n$ consisting of eigenvectors of $A$, i.e., the sum of the dimensions of its eigenspaces is equal to $n$. If such a basis has been found, the matrix $T$ such that $T^{-1} A T = \Lambda$ is diagonal has these basis vectors as columns. The diagonal entries of $\Lambda$ are the eigenvalues of $A$.

If a matrix is diagonalisable, the matrix exponential can be easily obtained. Indeed, let
\[
T^{-1} A T = \Lambda = \begin{bmatrix}
\lambda_1 & 0 & \ldots & 0 \\
0 & \lambda_2 & \ldots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \ldots & \lambda_n
\end{bmatrix},
\]

we have, by the definition (5.3),
\[
e^{\Lambda t} = \sum_{i=0}^{+\infty} \frac{\Lambda^i t^i}{i!} = \begin{bmatrix}
\sum_{i=0}^{+\infty} \frac{\lambda_1^i t^i}{i!} & 0 & \ldots & 0 \\
0 & \sum_{i=0}^{+\infty} \frac{\lambda_2^i t^i}{i!} & \ldots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \ldots & \sum_{i=0}^{+\infty} \frac{\lambda_n^i t^i}{i!}
\end{bmatrix},
\]
or, equivalently,
\[
e^{\Lambda t} = \begin{bmatrix}
e^{\lambda_1 t} & 0 & \ldots & 0 \\
0 & e^{\lambda_2 t} & \ldots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \ldots & e^{\lambda_n t}
\end{bmatrix},
\]
and finally
\[
e^{At} = T e^{\Lambda t} T^{-1}.
\]
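The identity $e^{At} = T e^{\Lambda t} T^{-1}$ can be exercised numerically (a sketch; the matrix is the one of Example 5.5, which has distinct eigenvalues and is therefore diagonalisable):

```python
import numpy as np

# e^{At} via eigendecomposition, compared with the truncated series (5.3).
A = np.array([[1.0, 0.0], [-2.0, -1.0]])
t = 0.5
lam, T = np.linalg.eig(A)            # columns of T are eigenvectors
E_eig = T @ np.diag(np.exp(lam * t)) @ np.linalg.inv(T)

E_series = np.zeros((2, 2))
P = np.eye(2)                        # running term (A t)^i / i!
for i in range(1, 40):
    E_series = E_series + P
    P = P @ (A * t) / i
err = np.max(np.abs(E_eig - E_series))
```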

5.4.2. Exponential Matrix for Diagonal Matrices with Com-plex Eigenvalues. If the matrix A has complex eigenvalues, the associatedeigenvector will be complex as well, thus the columns of the matrix T . How-ever, since the matrix A is real by definition, complex eigenvalues comes inpairs and, hence, the matrix exponential will be real. Therefore, it is possi-ble in this case to resort to a coordinates transformation that transforms adiagonalisable matrix on the complex set into a block diagonal matrix, whoseblock dimension is at most 2. The diagonal blocks are associated to eachcomplex and conjugated pairs.

Let us consider a matrix A having a pair of eigenvalues $\lambda_{1,2} = \sigma \pm j\omega$, i.e.,
$$A[v_r + jv_c] = (\sigma + j\omega)[v_r + jv_c], \qquad A[v_r - jv_c] = (\sigma - j\omega)[v_r - jv_c],$$
where $v_r$ and $v_c$ are the real and imaginary parts of the eigenvector, respectively. In matrix form, we have
$$A[v_r + jv_c \,|\, v_r - jv_c] = [v_r + jv_c \,|\, v_r - jv_c]\begin{bmatrix} \sigma + j\omega & 0 \\ 0 & \sigma - j\omega \end{bmatrix}.$$
It then follows immediately that
$$e^{At}[v_r + jv_c \,|\, v_r - jv_c] = [v_r + jv_c \,|\, v_r - jv_c]\, e^{\sigma t}\begin{bmatrix} e^{j\omega t} & 0 \\ 0 & e^{-j\omega t} \end{bmatrix}.$$

By the Euler formula, we have that
$$e^{\sigma + j\omega} = e^{\sigma}[\cos(\omega) + j\sin(\omega)], \qquad e^{\sigma - j\omega} = e^{\sigma}[\cos(\omega) - j\sin(\omega)].$$

Moreover, we notice that, using the invertible matrix
$$H = \frac{1}{2}\begin{bmatrix} 1 & -j \\ 1 & j \end{bmatrix}, \qquad H^{-1} = -j\begin{bmatrix} j & j \\ -1 & 1 \end{bmatrix},$$


we have
$$[v_r + jv_c \,|\, v_r - jv_c]H = [v_r|v_c],$$
and
$$H^{-1}\begin{bmatrix} \sigma + j\omega & 0 \\ 0 & \sigma - j\omega \end{bmatrix}H = \begin{bmatrix} \sigma & \omega \\ -\omega & \sigma \end{bmatrix},$$
and, finally,
$$H^{-1}\begin{bmatrix} e^{j\omega t} & 0 \\ 0 & e^{-j\omega t} \end{bmatrix}H = \begin{bmatrix} \cos(\omega t) & \sin(\omega t) \\ -\sin(\omega t) & \cos(\omega t) \end{bmatrix}.$$
Therefore, noticing that if $AV = V\Lambda$ we have $AVH = V\Lambda H = VH\,H^{-1}\Lambda H$, we obtain
$$A[v_r + jv_c \,|\, v_r - jv_c]H = A[v_r|v_c] = [v_r|v_c]\begin{bmatrix} \sigma & \omega \\ -\omega & \sigma \end{bmatrix},$$
and
$$e^{At}[v_r + jv_c \,|\, v_r - jv_c]H = e^{At}[v_r|v_c] = [v_r|v_c]\, e^{\sigma t}\begin{bmatrix} \cos(\omega t) & \sin(\omega t) \\ -\sin(\omega t) & \cos(\omega t) \end{bmatrix}.$$
For instance, for $A \in \mathbb{R}^{4\times 4}$ with $\lambda_1$ and $\lambda_4$ real and $\lambda_{2,3} = \sigma \pm j\omega$, we have
$$e^{At} = Te^{\Lambda t}T^{-1} = T\begin{bmatrix} e^{\lambda_1 t} & 0 & 0 & 0 \\ 0 & e^{\sigma t}\cos(\omega t) & e^{\sigma t}\sin(\omega t) & 0 \\ 0 & -e^{\sigma t}\sin(\omega t) & e^{\sigma t}\cos(\omega t) & 0 \\ 0 & 0 & 0 & e^{\lambda_4 t} \end{bmatrix}T^{-1}.$$
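The real 2×2 block form above can be verified numerically. A minimal sketch under illustrative values $\sigma = -0.5$, $\omega = 2$ (chosen arbitrarily): for $A = \begin{bmatrix}\sigma & \omega \\ -\omega & \sigma\end{bmatrix}$, the exponential $e^{At}$ must equal $e^{\sigma t}$ times a rotation matrix, and must agree with the complex diagonalisation route.

```python
import numpy as np

sigma, omega, t = -0.5, 2.0, 1.3   # illustrative values
A = np.array([[sigma, omega],
              [-omega, sigma]])

# Closed form: e^{At} = e^{sigma t} [[cos(wt), sin(wt)], [-sin(wt), cos(wt)]]
c, s = np.cos(omega * t), np.sin(omega * t)
expAt_closed = np.exp(sigma * t) * np.array([[c, s], [-s, c]])

# Same matrix via complex diagonalisation A = T diag(s+jw, s-jw) T^{-1}
lam, T = np.linalg.eig(A)                 # eigenvalues sigma +/- j*omega
expAt_diag = (T @ np.diag(np.exp(lam * t)) @ np.linalg.inv(T)).real

ok = np.allclose(expAt_closed, expAt_diag)
```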

5.4.3. The Jordan Canonical Form for Defective Matrices. The definition of a defective matrix is given below.

Definition 5.15. A square matrix A is called defective if it is not diagonalisable.

Let us make this point clearer. We have noticed that a matrix $A \in \mathbb{R}^{n\times n}$ is diagonalisable if and only if there exists a basis of $\mathbb{R}^n$ consisting of eigenvectors of A. This is certainly true if the matrix has n distinct eigenvalues, i.e.,
$$P(A) = (\lambda - \lambda_1)(\lambda - \lambda_2)\dots(\lambda - \lambda_n),$$
where P(A) is the characteristic polynomial of the matrix A. However, this is only a sufficient condition, since it may happen that a matrix has only $h \le n$ distinct eigenvalues $\lambda_i$, $i = 1, \dots, h$, so that
$$P(A) = (\lambda - \lambda_1)^{r_1}(\lambda - \lambda_2)^{r_2}\dots(\lambda - \lambda_h)^{r_h},$$
where $r_i$ is the algebraic multiplicity of the eigenvalue $\lambda_i$. In such a case, it is important to recall the definition of the geometric multiplicity of the eigenvalue $\lambda_i$.

Definition 5.16. The geometric multiplicity of the eigenvalue $\lambda_i$ is the number of linearly independent eigenvectors $v_{i,k}$ associated with $\lambda_i$.


If the geometric multiplicity, i.e., the number of linearly independent solutions of
$$Av_{i,k} = \lambda_i v_{i,k} \;\Rightarrow\; (\lambda_i I - A)v_{i,k} = 0,$$
is equal to the algebraic multiplicity $r_i$, i.e., there exist
$$(\lambda_i I - A)v_{i,k} = 0, \quad k = 1, \dots, r_i,$$
linearly independent solutions, the matrix is still diagonalisable. In such a case, we have
$$T = [v_{1,1}|v_{1,2}|\dots|v_{1,r_1}|v_{2,1}|v_{2,2}|\dots|v_{2,r_2}|\dots|v_{h,1}|v_{h,2}|\dots|v_{h,r_h}],$$
and
$$\Lambda = T^{-1}AT = \begin{bmatrix} \lambda_1 I_{r_1} & 0 & \dots & 0 \\ 0 & \lambda_2 I_{r_2} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_h I_{r_h} \end{bmatrix}.$$
If this is not the case, there exists at least one eigenvalue $\lambda_i$ such that
$$(5.8)\qquad (\lambda_i I - A)v_{i,k} = 0, \quad \text{with } k = 1, \dots, q_i < r_i,$$
or, equivalently, for which the number of linearly independent eigenvectors is less than $r_i$. In other words, the algebraic multiplicity is greater than the geometric multiplicity. In such a case, a basis for the coordinate transformation can still be defined using the generalised eigenvectors.

A basis of generalised eigenvectors is constructed by linear transformations. In practice, we are searching for a generalised eigenvector $v_{i,k}^{(2)}$ linearly independent from $v_{i,k}^{(1)} = v_{i,k}$ such that
$$(5.9)\qquad (\lambda_i I - A)v_{i,k}^{(2)} \neq 0 \quad \text{and} \quad (\lambda_i I - A)^2 v_{i,k}^{(2)} = 0.$$
Such a vector is simply given by the solution of
$$(\lambda_i I - A)v_{i,k}^{(2)} = v_{i,k}^{(1)} = v_{i,k}.$$
Indeed, by premultiplying the previous equation by $\lambda_i I - A$, we see that (5.9) is satisfied by means of (5.8).

The construction is then iterated until step $m_{i,k}$, i.e.,
$$(\lambda_i I - A)v_{i,k}^{(2)} = v_{i,k}^{(1)} = v_{i,k},$$
$$(\lambda_i I - A)v_{i,k}^{(3)} = v_{i,k}^{(2)},$$
$$(\lambda_i I - A)v_{i,k}^{(4)} = v_{i,k}^{(3)},$$
$$\vdots$$
$$(\lambda_i I - A)v_{i,k}^{(m_{i,k}+1)} = v_{i,k}^{(m_{i,k})},$$


since $v_{i,k}^{(m_{i,k}+1)}$ is linearly dependent on $v_{i,k}^{(1)}, v_{i,k}^{(2)}, \dots, v_{i,k}^{(m_{i,k})}$. It is possible to show that there exists a set of $r_i$ generalised eigenvectors that are linearly independent, i.e.,
$$r_i = \sum_{k=1}^{q_i} m_{i,k}.$$

By using the following transformation matrix
$$T = [v_{1,1}^{(1)}|v_{1,1}^{(2)}|\dots|v_{1,1}^{(m_{1,1})}|v_{2,1}^{(1)}|v_{2,1}^{(2)}|\dots|v_{2,1}^{(m_{2,1})}|\dots|v_{h,q_h}^{(m_{h,q_h})}],$$
the Jordan Canonical Form of the matrix A is obtained, which is given by the following block-diagonal matrix
$$J = T^{-1}AT = \begin{bmatrix} J_1 & 0 & \dots & 0 \\ 0 & J_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & J_h \end{bmatrix},$$
where $J_i$ is the Jordan block associated with $\lambda_i$, having dimension $r_i$. Every Jordan block is itself a block-diagonal matrix of the form
$$J_i = \begin{bmatrix} J_{i,1} & 0 & \dots & 0 \\ 0 & J_{i,2} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & J_{i,q_i} \end{bmatrix},$$
where $J_{i,k}$, $k = 1, \dots, q_i$, is a Jordan miniblock. The number of miniblocks is equal to the geometric multiplicity of the eigenvalue $\lambda_i$. Each miniblock is of the form
$$J_{i,k} = \begin{bmatrix} \lambda_i & 1 & 0 & \dots & 0 & 0 \\ 0 & \lambda_i & 1 & \dots & 0 & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & \lambda_i & 1 \\ 0 & 0 & 0 & \dots & 0 & \lambda_i \end{bmatrix},$$
whose dimension is given by the number $m_{i,k}$ of linearly independent generalised eigenvectors of the k-th chain.

Remark 5.17. From the previous analysis it follows that each of the following conditions is necessary and sufficient for diagonalisability:

• there exist n linearly independent eigenvectors;
• the algebraic multiplicity $r_i$ of each $\lambda_i$ is equal to its geometric multiplicity $q_i$;
• the dimension of each Jordan miniblock $J_{i,k}$ is unitary.
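A defective matrix can be detected numerically by comparing the two multiplicities. A minimal sketch (the 2×2 miniblock below is an illustrative choice, not an example from the text), computing the geometric multiplicity as the null-space dimension of $\lambda I - A$:

```python
import numpy as np

lam = 3.0
A = np.array([[lam, 1.0],
              [0.0, lam]])   # single eigenvalue, algebraic multiplicity 2

# geometric multiplicity = n - rank(lam*I - A)
n = A.shape[0]
geo_mult = n - np.linalg.matrix_rank(lam * np.eye(n) - A)

# rank(lam*I - A) = 1 here, so there is only one independent
# eigenvector: the matrix is defective (a 2x2 Jordan miniblock)
```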


5.4.4. Exponential Matrix of Defective Matrices. We start by noticing that for block-diagonal matrices, the exponential is given by the exponential of each block, since
$$\begin{bmatrix} A_1 t & 0 & \dots & 0 \\ 0 & A_2 t & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & A_n t \end{bmatrix}^k = \begin{bmatrix} (A_1 t)^k & 0 & \dots & 0 \\ 0 & (A_2 t)^k & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & (A_n t)^k \end{bmatrix},$$
which immediately yields
$$\exp\left(\begin{bmatrix} A_1 & 0 & \dots & 0 \\ 0 & A_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & A_n \end{bmatrix} t\right) = \begin{bmatrix} e^{A_1 t} & 0 & \dots & 0 \\ 0 & e^{A_2 t} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & e^{A_n t} \end{bmatrix}.$$

From the analysis of defective matrices, we have noticed that each matrix can be transformed into a block-diagonal matrix where each block comprises Jordan miniblocks. Therefore, to compute the matrix exponential in the general case, it is sufficient to compute the exponential of a Jordan miniblock:

$$e^{J_{i,k}t} = \exp\left(\begin{bmatrix} \lambda_i & 1 & 0 & \dots & 0 & 0 \\ 0 & \lambda_i & 1 & \dots & 0 & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & \lambda_i & 1 \\ 0 & 0 & 0 & \dots & 0 & \lambda_i \end{bmatrix} t\right) = e^{(\Lambda_i + N_{i,k})t} = e^{\Lambda_i t}e^{N_{i,k}t},$$
where we have written $J_{i,k} = \Lambda_i + N_{i,k}$, with $\Lambda_i = \lambda_i I$ and $N_{i,k}$ the matrix with ones on the superdiagonal and zeros elsewhere; the last equality holds since the two matrices commute (indeed, the scalar matrix $\lambda_i I$ commutes with any matrix). It is now possible to notice that $N_{i,k}$ is a nilpotent matrix of order $p = m_{i,k}$, i.e., $N_{i,k}^{p} = 0$ but $N_{i,k}^{\bar p} \neq 0$ for all $0 \le \bar p < p$. Hence,
$$e^{N_{i,k}t} = I + N_{i,k}t + N_{i,k}^2\frac{t^2}{2!} + \dots + N_{i,k}^{p-1}\frac{t^{p-1}}{(p-1)!}.$$
Finally, we have
$$e^{J_{i,k}t} = e^{\Lambda_i t}e^{N_{i,k}t} = e^{\lambda_i t}\begin{bmatrix} 1 & t & \frac{t^2}{2!} & \dots & \frac{t^{p-2}}{(p-2)!} & \frac{t^{p-1}}{(p-1)!} \\ 0 & 1 & t & \dots & \frac{t^{p-3}}{(p-3)!} & \frac{t^{p-2}}{(p-2)!} \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & 1 & t \\ 0 & 0 & 0 & \dots & 0 & 1 \end{bmatrix}.$$
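Because the nilpotent part truncates the series after finitely many terms, $e^{Jt}$ for a miniblock can be computed exactly with a finite sum. A minimal sketch for a 3×3 miniblock with illustrative $\lambda = -1$:

```python
import numpy as np
from math import factorial

lam, t, p = -1.0, 0.5, 3
J = lam * np.eye(p) + np.diag(np.ones(p - 1), k=1)  # Jordan miniblock
N = J - lam * np.eye(p)                             # nilpotent part, N^p = 0

# e^{Jt} = e^{lam t} * sum_{q=0}^{p-1} (N t)^q / q!   (finite sum)
expJt = np.zeros((p, p))
for q in range(p):
    expJt += np.linalg.matrix_power(N * t, q) / factorial(q)
expJt *= np.exp(lam * t)

# closed form: upper-triangular with t^q/q! superdiagonals, scaled by e^{lam t}
closed = np.exp(lam * t) * np.array([[1.0, t, t**2 / 2],
                                     [0.0, 1.0, t],
                                     [0.0, 0.0, 1.0]])
ok = np.allclose(expJt, closed)
```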


5.4.5. Exponential Matrix of Defective Matrices with Complex Eigenvalues. If the matrix A has complex eigenvalues associated with generalised eigenvectors, i.e., there exists $p = m_{i,k} > 1$, it is still possible to obtain a suitable real representation. Indeed, suppose p = 2 and let us define
$$V = [v_r^{(1)} + jv_c^{(1)} \,|\, v_r^{(2)} + jv_c^{(2)} \,|\, v_r^{(1)} - jv_c^{(1)} \,|\, v_r^{(2)} - jv_c^{(2)}],$$
and the invertible matrix
$$H = \frac{1}{2}\begin{bmatrix} 1 & -j & 0 & 0 \\ 0 & 0 & 1 & -j \\ 1 & j & 0 & 0 \\ 0 & 0 & 1 & j \end{bmatrix}, \qquad H^{-1} = -j\begin{bmatrix} j & 0 & j & 0 \\ -1 & 0 & 1 & 0 \\ 0 & j & 0 & j \\ 0 & -1 & 0 & 1 \end{bmatrix};$$
we have
$$VH = [v_r^{(1)}|v_c^{(1)}|v_r^{(2)}|v_c^{(2)}].$$
Moreover, $AV = VJ$ where
$$J = \begin{bmatrix} \sigma + j\omega & 1 & 0 & 0 \\ 0 & \sigma + j\omega & 0 & 0 \\ 0 & 0 & \sigma - j\omega & 1 \\ 0 & 0 & 0 & \sigma - j\omega \end{bmatrix}.$$
Hence,
$$J_r = H^{-1}JH = \begin{bmatrix} \sigma & \omega & 1 & 0 \\ -\omega & \sigma & 0 & 1 \\ 0 & 0 & \sigma & \omega \\ 0 & 0 & -\omega & \sigma \end{bmatrix} = \begin{bmatrix} W & I \\ 0 & W \end{bmatrix}.$$
For the exponential, we first notice that
$$J_r^k = \begin{bmatrix} W^k & kW^{k-1} \\ 0 & W^k \end{bmatrix},$$
hence
$$e^{J_r t} = \begin{bmatrix} I + Wt + W^2\frac{t^2}{2!} + \dots & It + 2W\frac{t^2}{2!} + 3W^2\frac{t^3}{3!} + \dots \\ 0 & I + Wt + W^2\frac{t^2}{2!} + \dots \end{bmatrix} = \begin{bmatrix} e^{Wt} & te^{Wt} \\ 0 & e^{Wt} \end{bmatrix}.$$

By recalling that
$$e^{Wt} = e^{\sigma t}\begin{bmatrix} \cos(\omega t) & \sin(\omega t) \\ -\sin(\omega t) & \cos(\omega t) \end{bmatrix},$$
we finally have, denoting by $R(\omega t)$ the rotation matrix above,
$$e^{J_r t} = \begin{bmatrix} e^{Wt} & te^{Wt} \\ 0 & e^{Wt} \end{bmatrix} = e^{\sigma t}\begin{bmatrix} R(\omega t) & t\,R(\omega t) \\ 0 & R(\omega t) \end{bmatrix}, \qquad R(\omega t) = \begin{bmatrix} \cos(\omega t) & \sin(\omega t) \\ -\sin(\omega t) & \cos(\omega t) \end{bmatrix}.$$


In general, if p > 2, following the same steps we can express
$$J_r = H^{-1}JH = \begin{bmatrix} W & I & 0 & \dots & 0 & 0 \\ 0 & W & I & \dots & 0 & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & W & I \\ 0 & 0 & 0 & \dots & 0 & W \end{bmatrix},$$
and hence
$$e^{J_r t} = \begin{bmatrix} e^{Wt} & te^{Wt} & \frac{t^2}{2!}e^{Wt} & \dots & \frac{t^{p-2}}{(p-2)!}e^{Wt} & \frac{t^{p-1}}{(p-1)!}e^{Wt} \\ 0 & e^{Wt} & te^{Wt} & \dots & \frac{t^{p-3}}{(p-3)!}e^{Wt} & \frac{t^{p-2}}{(p-2)!}e^{Wt} \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & e^{Wt} & te^{Wt} \\ 0 & 0 & 0 & \dots & 0 & e^{Wt} \end{bmatrix}.$$

5.4.6. Some Examples.

Example 5.18. Let us consider the matrix
$$A = \begin{bmatrix} 0 & \omega \\ -\omega & 0 \end{bmatrix},$$
which is the real representation of a complex diagonal matrix. Indeed, let us first compute the eigenvalues of the matrix A, which are the roots of the following scalar equation
$$P(A) = \det(\lambda I - A) = \det\left(\begin{bmatrix} \lambda & -\omega \\ \omega & \lambda \end{bmatrix}\right) = \lambda^2 + \omega^2 = (\lambda - j\omega)(\lambda + j\omega) = 0.$$
The eigenvector associated with $\lambda_1 = j\omega$ is
$$(\lambda_1 I - A)v_1 = 0 \;\Rightarrow\; \begin{bmatrix} j\omega & -\omega \\ \omega & j\omega \end{bmatrix}v_1 = 0 \;\Rightarrow\; v_1 = \begin{bmatrix} a \\ ja \end{bmatrix} \;\Rightarrow\; v_1 = \begin{bmatrix} 1 \\ j \end{bmatrix}.$$
The eigenvector associated with $\lambda_2 = -j\omega$ is the complex conjugate of $v_1$; indeed
$$(\lambda_2 I - A)v_2 = 0 \;\Rightarrow\; \begin{bmatrix} -j\omega & -\omega \\ \omega & -j\omega \end{bmatrix}v_2 = 0 \;\Rightarrow\; v_2 = \begin{bmatrix} a \\ -ja \end{bmatrix} \;\Rightarrow\; v_2 = \begin{bmatrix} 1 \\ -j \end{bmatrix}.$$
We can now define the transformation matrix
$$T = [v_1|v_2] = \begin{bmatrix} 1 & 1 \\ j & -j \end{bmatrix}, \qquad T^{-1} = -\frac{1}{2j}\begin{bmatrix} -j & -1 \\ -j & 1 \end{bmatrix},$$
which yields
$$T^{-1}AT = \Lambda = \begin{bmatrix} j\omega & 0 \\ 0 & -j\omega \end{bmatrix},$$
since the matrix A is diagonalisable, having two distinct complex conjugate eigenvalues, or equivalently
$$AT = T\Lambda,$$


since T diagonalises A. Using the invertible matrix
$$H = \frac{1}{2}\begin{bmatrix} 1 & -j \\ 1 & j \end{bmatrix}, \qquad H^{-1} = -j\begin{bmatrix} j & j \\ -1 & 1 \end{bmatrix},$$
we notice that $H = T^{-1}$ and hence
$$TH = I,$$
since the columns of $I = [v_r|v_c]$ are the real and imaginary parts, respectively, of the vector $v_1$. Therefore
$$ATH = T\Lambda H \;\Rightarrow\; A = TH\,H^{-1}\Lambda H \;\Rightarrow\; A = H^{-1}\Lambda H = T\Lambda T^{-1},$$
as expected, i.e., the matrix A is already in real form. As a consequence,
$$e^{At} = Te^{\Lambda t}T^{-1} = T\begin{bmatrix} e^{j\omega t} & 0 \\ 0 & e^{-j\omega t} \end{bmatrix}T^{-1} = \begin{bmatrix} \frac{e^{j\omega t}+e^{-j\omega t}}{2} & \frac{e^{j\omega t}-e^{-j\omega t}}{2j} \\ -\frac{e^{j\omega t}-e^{-j\omega t}}{2j} & \frac{e^{j\omega t}+e^{-j\omega t}}{2} \end{bmatrix} = \begin{bmatrix} \cos(\omega t) & \sin(\omega t) \\ -\sin(\omega t) & \cos(\omega t) \end{bmatrix},$$
as shown previously.

Example 5.19. Let us consider the matrix
$$A = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 1 & 1 \\ 1 & -1 & 2 \end{bmatrix},$$
and let us compute $e^{At}$. To this end, let us first compute the eigenvalues of A, i.e., the roots of
$$\det(\lambda I - A) = 0 \;\Rightarrow\; \lambda(\lambda^2 - 4\lambda + 5) = \lambda[(\lambda - 2)^2 + 1] = 0,$$
which has roots $\lambda_{1,2} = 2 \pm j$ and $\lambda_3 = 0$. For the associated eigenvectors, we have
$$(\lambda_1 I - A)v_1 = 0 \;\Rightarrow\; v_1 = \begin{bmatrix} 1 \\ 2 - j \\ 1 + j \end{bmatrix},$$
then
$$v_2 = \begin{bmatrix} 1 \\ 2 + j \\ 1 - j \end{bmatrix},$$
and
$$(\lambda_3 I - A)v_3 = 0 \;\Rightarrow\; v_3 = \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix}.$$
We can now define the transformation matrix
$$T = [v_{1,r}|v_{1,c}|v_3] = \begin{bmatrix} 1 & 0 & 1 \\ 2 & -1 & -1 \\ 1 & 1 & -1 \end{bmatrix}, \qquad T^{-1} = \frac{1}{5}\begin{bmatrix} 2 & 1 & 1 \\ 1 & -2 & 3 \\ 3 & -1 & -1 \end{bmatrix},$$


which yields the real Jordan form
$$T^{-1}AT = \Lambda = \begin{bmatrix} 2 & 1 & 0 \\ -1 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix},$$
and hence
$$e^{At} = Te^{\Lambda t}T^{-1} = T\begin{bmatrix} e^{2t}\cos(t) & e^{2t}\sin(t) & 0 \\ -e^{2t}\sin(t) & e^{2t}\cos(t) & 0 \\ 0 & 0 & 1 \end{bmatrix}T^{-1}.$$
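Example 5.19 can be cross-checked numerically: the real-form expression $Te^{\Lambda t}T^{-1}$ must agree with the exponential computed from the complex eigendecomposition of A. A minimal sketch:

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [2.0, 1.0, 1.0],
              [1.0, -1.0, 2.0]])
t = 0.4

# real form from the example: T = [v_r | v_c | v3]
T = np.array([[1.0, 0.0, 1.0],
              [2.0, -1.0, -1.0],
              [1.0, 1.0, -1.0]])
c, s, e2t = np.cos(t), np.sin(t), np.exp(2 * t)
E = np.array([[e2t * c, e2t * s, 0.0],
              [-e2t * s, e2t * c, 0.0],
              [0.0, 0.0, 1.0]])
expAt_real = T @ E @ np.linalg.inv(T)

# reference: complex eigendecomposition of A
lam, V = np.linalg.eig(A)
expAt_ref = (V @ np.diag(np.exp(lam * t)) @ np.linalg.inv(V)).real

ok = np.allclose(expAt_real, expAt_ref)
```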

5.5. Discrete Time Linear System Solution

For linear systems, the solution of the difference equation involved in (5.1) can be computed explicitly by iteration. Indeed,
$$x(1) = Ax(0) + Bu(0),$$
$$x(2) = Ax(1) + Bu(1) = A^2x(0) + ABu(0) + Bu(1),$$
$$\vdots$$
$$x(k) = Ax(k-1) + Bu(k-1) = A^kx(0) + A^{k-1}Bu(0) + \dots + ABu(k-2) + Bu(k-1),$$
which immediately yields
$$(5.10)\qquad x(k) = A^kx(0) + \sum_{i=0}^{k-1}A^{k-1-i}Bu(i), \qquad y(k) = CA^kx(0) + C\sum_{i=0}^{k-1}A^{k-1-i}Bu(i) + Du(k).$$

Remark 5.20. As for continuous-time systems, the state evolution solving the discrete-time linear time-invariant system (5.1) comprises two terms. The first depends only on the initial condition x(0) of the system and hence is dubbed the unforced response. The second term instead depends on the inputs u(k) but not on the initial condition, and is therefore termed the forced response.

Remark 5.21. As for continuous-time systems, since $A^k$ is constant once k is fixed, it turns out that the states are combined through a time-invariant linear transformation. Moreover, the combination of forced and unforced responses is an application of the superposition principle between the states and the inputs.
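The closed-form solution (5.10) can be checked against the straightforward iteration of the recurrence. A minimal sketch on an arbitrary illustrative second-order system and input sequence:

```python
import numpy as np

A = np.array([[0.5, 1.0], [0.0, -0.5]])
B = np.array([[0.0], [1.0]])
x0 = np.array([[1.0], [-1.0]])
u = [1.0, 0.5, -0.2, 0.0, 0.3]          # arbitrary input sequence
K = len(u)

# iterate x(k+1) = A x(k) + B u(k)
x = x0
for k in range(K):
    x = A @ x + B * u[k]

# closed form (5.10): x(K) = A^K x0 + sum_i A^{K-1-i} B u(i)
mp = np.linalg.matrix_power
x_closed = mp(A, K) @ x0 + sum(mp(A, K - 1 - i) @ B * u[i] for i in range(K))

ok = np.allclose(x, x_closed)
```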

5.5.1. Connection between the State Space Representation and the Z-Transform. To explain the connection between the Z-Transform and the state space description, let us first consider the dynamics of an autonomous system:
$$x(k+1) = Ax(k),$$


whose Z-Transforms are given by
$$\mathcal{Z}(x(k+1)) = zX(z) - zx(0), \qquad \mathcal{Z}(Ax(k)) = AX(z),$$
obtained by applying the time-shifting rule. Hence,
$$(5.11)\qquad zX(z) - zx(0) = AX(z) \;\Rightarrow\; (zI - A)X(z) = zx(0) \;\Rightarrow\; X(z) = (zI - A)^{-1}zx(0).$$
The matrix $(zI - A)^{-1}$ is defined for any $z \in \mathbb{C}$ except for the eigenvalues of A, which are the points in which $\det(zI - A) = 0$.

By applying the inverse Z-Transform to (5.11), we have
$$x(k) = \mathcal{Z}^{-1}\left(z(zI - A)^{-1}\right)x(0),$$
which, recalling (5.10), implies that the inverse Z-Transform of $z(zI - A)^{-1}$ is the matrix power $A^k$ or, equivalently, the transition matrix $\Phi(k)$.

Example 5.22. Consider the following harmonic oscillator
$$x(k+1) = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}x(k) = Ax(k).$$
Since
$$zI - A = \begin{bmatrix} z & 1 \\ -1 & z \end{bmatrix},$$
we have that the eigenvalues are $\pm j$ and hence
$$(zI - A)^{-1} = \frac{1}{z^2 + 1}\begin{bmatrix} z & -1 \\ 1 & z \end{bmatrix}.$$

Notice that the forced response involves the extension to matrices of the discrete convolution sum between the states and the inputs. In particular, for a SISO system and assuming that $u(k) = 0$ for all $k < 0$, we can write
$$\sum_{i=0}^{k-1}A^{k-1-i}Bu(i) = A^{k-1} * Bu(k),$$
and, as a consequence,
$$\mathcal{Z}\left(A^{k-1} * Bu(k)\right) = (zI - A)^{-1}BU(z).$$
Therefore, computing the Z-Transform of (5.7) and applying the superposition principle, we get
$$\mathcal{Z}(y(k)) = Cz(zI - A)^{-1}x(0) + C(zI - A)^{-1}BU(z) + DU(z).$$
Of course, if the system starts at rest, as usually assumed in the frequency-domain approach, we have $x(0) = 0$ and hence
$$\frac{Y(z)}{U(z)} = C(zI - A)^{-1}B + D.$$


Remark 5.23. In particular, for a SISO system and assuming that $u(k) = 0$ for all $k < 0$, we can compute the Z-Transform of the linear time-invariant dynamics in (5.1) as
$$zX(z) - zx(0) = AX(z) + BU(z), \qquad Y(z) = CX(z) + DU(z),$$
which, by solving the first equation for X(z) and then substituting into the second, yields
$$Y(z) = Cz(zI - A)^{-1}x(0) + \left[C(zI - A)^{-1}B + D\right]U(z),$$
as desired.

Example 5.24. The following linear system
$$x(k+1) = \begin{bmatrix} 0.5 & 1 \\ 0 & -0.5 \end{bmatrix}x(k) + \begin{bmatrix} 0 \\ 1 \end{bmatrix}u(k), \qquad y(k) = \begin{bmatrix} 1 & -1 \end{bmatrix}x(k),$$
is represented by the following transfer function
$$G(z) = \frac{-z + 1.5}{z^2 - 0.25}.$$
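Example 5.24 can be verified by comparing the impulse response generated by the state-space model with the power-series expansion of $G(z)$ in $z^{-1}$, whose first coefficients (by long division of $(-z+1.5)/(z^2-0.25)$) are $0, -1, 1.5, -0.25, 0.375, -0.0625$. A minimal sketch:

```python
import numpy as np

A = np.array([[0.5, 1.0], [0.0, -0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, -1.0]])

# impulse response: x(0) = 0, u(0) = 1, u(k) = 0 for k > 0
x = np.zeros((2, 1))
h = []
for k in range(6):
    h.append((C @ x).item())
    x = A @ x + B * (1.0 if k == 0 else 0.0)

# h must match the series coefficients of G(z) in powers of z^{-1}
expected = [0.0, -1.0, 1.5, -0.25, 0.375, -0.0625]
ok = np.allclose(h, expected)
```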

5.5.2. The Role of the Eigenvalues in the Discrete Time Transfer Function. Let us consider a SISO system. As for continuous-time systems, there is a clear connection between the Z-Transform and the state space representation. Indeed, we know that for a transfer function
$$\frac{Y(z)}{U(z)} = G(z),$$
a prominent role is played by the poles, which are the roots of the denominator. Hence, assuming $x(0) = 0$ as in the transfer-function analysis, we have
$$Y(z) = C(zI - A)^{-1}BU(z) + DU(z),$$
where the direct coupling between the input and the output given by D is now neglected, since it does not involve the system dynamics. In such a case, we have $G(z) = C(zI - A)^{-1}B$. As previously, the (i, j) element of $(zI - A)^{-1}$ can be computed using Cramer's rule as
$$(-1)^{i+j}\frac{\det \mathrm{Adj}_{i,j}}{P(A)},$$
where P(A) is the characteristic polynomial of A, defined as
$$P(A) = \det(zI - A).$$
$\det \mathrm{Adj}_{i,j}$ has degree less than n, and there is still the possibility of a cancellation between the roots of P(A) and the roots of $\det \mathrm{Adj}_{i,j}$.


5.6. The Role of the Eigenvalues for Discrete Time Systems

Let us start with some properties of the matrix power and its eigenvalues that turn out to be useful in the following development.

• $A_1^kA_2^k = A_2^kA_1^k = (A_1A_2)^k$ for all k, iff $A_1A_2 = A_2A_1$;
• if A is invertible, then $(A^k)^{-1} = (A^{-1})^k = A^{-k}$;
• if $v \in \mathbb{R}^n$ is an eigenvector of a matrix A associated with the eigenvalue $\lambda$, i.e., $Av = \lambda v$, then it immediately follows that v is also an eigenvector of the matrix $A^k$ associated with the eigenvalue $\lambda^k$, i.e., $A^kv = \lambda^kv$;
• for any given A and for any invertible transformation matrix T, we have that $(T^{-1}AT)^k = T^{-1}A^kT$;
• $\det(A^k) = [\det(A)]^k$. From this property, and recalling that $\det(AB) = \det(A)\det(B)$ (if of course A and B are square matrices), it follows that $\det(AA^{-1}) = \det(A)\det(A^{-1}) = \det(I) = 1$. Hence, $\det(zI - T^{-1}AT) = \det(T^{-1})\det(zI - A)\det(T) = \det(zI - A)$: the eigenvalues do not change under a similarity transformation, a property we have taken for granted so far, but never proved.

Bearing these properties in mind, we can now relate the matrix power to its eigenvalues.

5.6.1. Matrix Powers for Diagonal Matrices. If a matrix is diagonalisable, the matrix power can be easily obtained. Indeed, letting
$$T^{-1}AT = \Lambda = \begin{bmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{bmatrix},$$
we have that
$$A^k = T\Lambda^kT^{-1} = T\begin{bmatrix} \lambda_1^k & 0 & \dots & 0 \\ 0 & \lambda_2^k & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n^k \end{bmatrix}T^{-1}.$$
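As with the exponential, the power formula $A^k = T\Lambda^kT^{-1}$ can be validated against repeated multiplication. A minimal sketch with an arbitrary illustrative diagonalisable matrix:

```python
import numpy as np

A = np.array([[0.9, 0.2], [0.1, 0.8]])  # illustrative, distinct real eigenvalues
k = 7

lam, T = np.linalg.eig(A)
Ak_eig = (T @ np.diag(lam ** k) @ np.linalg.inv(T)).real

Ak_direct = np.linalg.matrix_power(A, k)
ok = np.allclose(Ak_eig, Ak_direct)
```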

5.6.2. Matrix Powers for Diagonal Matrices with Complex Eigenvalues. If the matrix A has complex eigenvalues, it is possible to resort to a coordinate transformation that turns a matrix which is diagonalisable over the complex field into a real block-diagonal matrix whose blocks have dimension at most 2, each block being associated with a complex conjugate pair. As derived for the continuous-time case, since $AV = V\Lambda$ implies $AVH = V\Lambda H = VH\,H^{-1}\Lambda H$, we have
$$A[v_r + jv_c \,|\, v_r - jv_c]H = A[v_r|v_c] = [v_r|v_c]\begin{bmatrix} \sigma & \omega \\ -\omega & \sigma \end{bmatrix},$$


which can be further transformed using the modulus $\rho = \sqrt{\sigma^2 + \omega^2}$ and phase $\theta = \arctan\left(\frac{\omega}{\sigma}\right)$, i.e.,
$$A[v_r|v_c] = [v_r|v_c]\,\rho\begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix}.$$
Therefore
$$A^k[v_r|v_c] = [v_r|v_c]\,\rho^k\begin{bmatrix} \cos(k\theta) & \sin(k\theta) \\ -\sin(k\theta) & \cos(k\theta) \end{bmatrix}.$$
For instance, for $A \in \mathbb{R}^{4\times 4}$ with $\lambda_1$ and $\lambda_4$ real and $\lambda_{2,3} = \sigma \pm j\omega$, we have
$$A^k = T\Lambda^kT^{-1} = T\begin{bmatrix} \lambda_1^k & 0 & 0 & 0 \\ 0 & \rho^k\cos(k\theta) & \rho^k\sin(k\theta) & 0 \\ 0 & -\rho^k\sin(k\theta) & \rho^k\cos(k\theta) & 0 \\ 0 & 0 & 0 & \lambda_4^k \end{bmatrix}T^{-1}.$$

5.6.3. Matrix Powers of Defective Matrices. In this case, it is still possible to use the Jordan form. Indeed, we still have that
$$\begin{bmatrix} A_1 & 0 & \dots & 0 \\ 0 & A_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & A_n \end{bmatrix}^k = \begin{bmatrix} A_1^k & 0 & \dots & 0 \\ 0 & A_2^k & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & A_n^k \end{bmatrix}.$$
Hence, the analysis reduces to the behaviour of the Jordan miniblocks. Writing a miniblock as $J_{i,l} = \Lambda_i + N_{i,l}$, with $\Lambda_i = \lambda_i I$ and $N_{i,l}$ the nilpotent superdiagonal part, we have
$$J_{i,l}^k = \begin{bmatrix} \lambda_i & 1 & 0 & \dots & 0 & 0 \\ 0 & \lambda_i & 1 & \dots & 0 & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & \lambda_i & 1 \\ 0 & 0 & 0 & \dots & 0 & \lambda_i \end{bmatrix}^k = (\Lambda_i + N_{i,l})^k = \sum_{q=0}^{k}C_q^k\lambda_i^{k-q}N_{i,l}^q,$$
where
$$C_q^k = \binom{k}{q} = \frac{k!}{q!(k-q)!}\ \text{if}\ k \ge q, \qquad C_q^k = 0\ \text{if}\ k < q,$$
is the binomial coefficient, a polynomial function of k of degree q. Therefore, for a miniblock of dimension $p = m_{i,l}$, we have
$$J_{i,l}^k = \begin{bmatrix} \lambda_i^k & C_1^k\lambda_i^{k-1} & C_2^k\lambda_i^{k-2} & \dots & C_{p-2}^k\lambda_i^{k-p+2} & C_{p-1}^k\lambda_i^{k-p+1} \\ 0 & \lambda_i^k & C_1^k\lambda_i^{k-1} & \dots & C_{p-3}^k\lambda_i^{k-p+3} & C_{p-2}^k\lambda_i^{k-p+2} \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & \lambda_i^k & C_1^k\lambda_i^{k-1} \\ 0 & 0 & 0 & \dots & 0 & \lambda_i^k \end{bmatrix}.$$


5.6.4. Matrix Powers of Defective Matrices with Complex Eigenvalues. If the matrix A has complex eigenvalues associated with generalised eigenvectors, i.e., there exists $p = m_{i,k} > 1$, it is still possible to obtain a suitable real representation. Indeed, suppose p = 2; following the same rationale as in the continuous-time case, we have
$$J_r = H^{-1}JH = \begin{bmatrix} \sigma & \omega & 1 & 0 \\ -\omega & \sigma & 0 & 1 \\ 0 & 0 & \sigma & \omega \\ 0 & 0 & -\omega & \sigma \end{bmatrix} = \begin{bmatrix} W & I \\ 0 & W \end{bmatrix}.$$
For the matrix power, we recall that
$$J_r^k = \begin{bmatrix} W^k & kW^{k-1} \\ 0 & W^k \end{bmatrix} = \begin{bmatrix} W^k & C_1^kW^{k-1} \\ 0 & W^k \end{bmatrix},$$
and thus, denoting by $R(k\theta)$ the rotation matrix,
$$J_r^k = \begin{bmatrix} \rho^kR(k\theta) & k\rho^{k-1}R((k-1)\theta) \\ 0 & \rho^kR(k\theta) \end{bmatrix}, \qquad R(k\theta) = \begin{bmatrix} \cos(k\theta) & \sin(k\theta) \\ -\sin(k\theta) & \cos(k\theta) \end{bmatrix}.$$
In general, if p > 2, following the same steps we can express
$$J_r = H^{-1}JH = \begin{bmatrix} W & I & 0 & \dots & 0 & 0 \\ 0 & W & I & \dots & 0 & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & W & I \\ 0 & 0 & 0 & \dots & 0 & W \end{bmatrix},$$
and hence
$$J_r^k = \begin{bmatrix} W^k & C_1^kW^{k-1} & C_2^kW^{k-2} & \dots & C_{p-2}^kW^{k-p+2} & C_{p-1}^kW^{k-p+1} \\ 0 & W^k & C_1^kW^{k-1} & \dots & C_{p-3}^kW^{k-p+3} & C_{p-2}^kW^{k-p+2} \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & W^k & C_1^kW^{k-1} \\ 0 & 0 & 0 & \dots & 0 & W^k \end{bmatrix}.$$

5.6.5. Some Examples.

Example 5.25. Let us consider the matrix
$$A = \begin{bmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{bmatrix}.$$
It follows that
$$A^k = \begin{bmatrix} \lambda^k & k\lambda^{k-1} & \frac{k(k-1)}{2}\lambda^{k-2} \\ 0 & \lambda^k & k\lambda^{k-1} \\ 0 & 0 & \lambda^k \end{bmatrix}.$$
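Example 5.25 follows from the binomial expansion of $(\lambda I + N)^k$. It can be checked numerically; a minimal sketch with illustrative values $\lambda = 0.8$ and $k = 5$:

```python
import numpy as np

lam, k = 0.8, 5
A = np.array([[lam, 1.0, 0.0],
              [0.0, lam, 1.0],
              [0.0, 0.0, lam]])

Ak_direct = np.linalg.matrix_power(A, k)

# closed form from the example
Ak_closed = np.array([
    [lam**k, k * lam**(k - 1), k * (k - 1) / 2 * lam**(k - 2)],
    [0.0, lam**k, k * lam**(k - 1)],
    [0.0, 0.0, lam**k]])
ok = np.allclose(Ak_direct, Ak_closed)
```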


5.7. Modes Analysis

The unforced response of a linear system given by the equations (5.1) is a linear combination of only the functions that appear in the real Jordan form solution. Such functions are called the modes of the system.

5.7.1. Continuous Time Systems. As previously obtained, the modes of a continuous-time system are given by:

(1) Simple exponential functions $e^{\lambda t}$, given by Jordan miniblocks of dimension one with real eigenvalue λ. These modes are convergent to zero, constant, or exponentially divergent if the eigenvalue λ is less than, equal to, or greater than zero, respectively;

(2) Composite exponential functions $t^ke^{\lambda t}$, given by Jordan miniblocks of dimension m > 1 (and hence k = 0, ..., m − 1) with real eigenvalue λ. These modes are convergent to zero if λ < 0, polynomially divergent if λ = 0 and k > 0, or exponentially divergent if λ > 0;

(3) Oscillating functions of the type $e^{\sigma t}\cos(\omega t)$ and $e^{\sigma t}\sin(\omega t)$, given by two miniblocks of dimension 1 associated with complex conjugate eigenvalues λ = σ ± jω (in the real Jordan form, there is a single 2×2 miniblock). These modes are convergent to zero if σ < 0, persistently oscillating if σ = 0, or exponentially divergent if σ > 0;

(4) Oscillating functions of the type $t^ke^{\sigma t}\cos(\omega t)$ and $t^ke^{\sigma t}\sin(\omega t)$, given by Jordan miniblocks of dimension m > 1 (and hence k = 0, ..., m − 1) with complex conjugate eigenvalues λ = σ ± jω. These modes are convergent to zero if σ < 0, polynomially divergent if σ = 0 and k > 0, or exponentially divergent if σ > 0.

5.7.2. Discrete Time Systems. As previously obtained, the modes of a discrete-time system are given by:

(1) Powers $\lambda^k$, given by Jordan miniblocks of dimension one with real eigenvalue λ. These modes are convergent to zero, constant, or divergent if $|\lambda|$ is less than, equal to, or greater than one, respectively. If λ < 0, the sequence exhibits an oscillating behaviour with alternating signs;

(2) Sequences of the form $C_q^k\lambda^{k-q}$, given by Jordan miniblocks of dimension m > 1 (and hence q = 0, ..., m − 1) with real eigenvalue λ. These modes are convergent to zero if |λ| < 1, polynomially divergent if |λ| = 1 and q > 0, or exponentially divergent if |λ| > 1. Again, the oscillating behaviour is given by λ < 0;

(3) Oscillating sequences of the type $\rho^k\cos(k\theta)$ and $\rho^k\sin(k\theta)$, given by two miniblocks of dimension 1 associated with complex conjugate eigenvalues $\lambda = \sigma \pm j\omega = \rho e^{\pm j\theta}$ (in the real Jordan form, there is a single 2×2 miniblock). These modes are convergent to zero if ρ < 1, persistently oscillating if ρ = 1, or exponentially divergent if ρ > 1. The frequency of the oscillations is proportional to θ: the greater θ, the higher the oscillation frequency. The oscillating behaviour that was previously given by λ < 0 is now obtained for the maximum value θ = π;

(4) Oscillating sequences of the type $C_q^k\rho^{k-q}\cos((k-q)\theta)$ and $C_q^k\rho^{k-q}\sin((k-q)\theta)$, given by Jordan miniblocks of dimension m > 1 (and hence q = 0, ..., m − 1) with complex conjugate eigenvalues $\lambda = \sigma \pm j\omega = \rho e^{\pm j\theta}$. These modes are convergent to zero if ρ < 1, polynomially divergent if ρ = 1 and q > 0, or exponentially divergent if ρ > 1.


CHAPTER 6

Structural Properties of Systems

The unforced response with initial condition x(0) expresses whether the states converge or diverge regardless of the input. This is a property that can be inferred from the modes analysis presented in Section 5.7. Hence the eigenvalues play a prominent role for the overall system, having a close connection with BIBO stability. However, we will see that the concept of convergence/divergence, more formally defined as stability, is a system-wide property, rather than an input-output relation like BIBO stability. It will be clear in the following that the system modes determine BIBO stability.

6.1. Stability

Consider a generic system, not necessarily linear, on a vector space $x \in \mathbb{R}^n$:
$$\frac{dx(t)}{dt} = f(x(t), u(t), t).$$
If the system is time-invariant, we can drop the explicit dependence on t and write
$$\frac{dx(t)}{dt} = f(x(t), u(t)).$$
The trajectory of the system is given by $x(t) = \Phi_x(x_0, u(t))$, which is a function of the initial condition $x(t_0) = x_0$ and of the input vector u(t). For linear systems, these two components can be computed separately and then summed using the superposition principle, i.e., the unforced and forced responses.

An important concept for the stability analysis is the equilibrium point.

Definition 6.1. An equilibrium point is a constant solution of a differential equation, or a fixed point of a difference equation.

In practice, $\bar x \in \mathbb{R}^n$ is an equilibrium point for the differential equation
$$\frac{dx(t)}{dt} = f(x(t), t),$$
if
$$f(\bar x, t) = 0, \quad \forall t.$$


Similarly, $\bar x \in \mathbb{R}^n$ is an equilibrium point (or fixed point) for the difference equation
$$x(t+1) = f(x(t), t),$$
if
$$\bar x = f(\bar x, t), \quad \forall t = 0, 1, 2, \dots$$

Example 6.2. Let us consider the RL circuit of Example 5.2 in Figure 1, whose dynamic equation is given by (Ohm's and Henry's laws)
$$L\frac{di(t)}{dt} + Ri(t) = v(t).$$
Assuming the output $y(t) = i(t)$ and the input $u(t) = v(t)$, we have the following state space realisation
$$\dot x_1(t) = -\frac{R}{L}x_1(t) + \frac{u(t)}{L}, \qquad y(t) = x_1(t).$$
The equilibrium point is given by
$$\dot{\bar x}_1(t) = 0 = -\frac{R}{L}\bar x_1(t) + \frac{\bar u(t)}{L} \;\Rightarrow\; \bar x_1 = \frac{\bar u}{R}.$$
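The RL equilibrium $\bar x_1 = \bar u/R$ can be observed by forward-Euler integration of the state equation. A minimal sketch with the values R = 2, L = 1, $\bar u$ = 10 used later in this chapter (the step size and horizon are illustrative choices):

```python
R, L_val, u_bar = 2.0, 1.0, 10.0
dt, x1 = 1e-3, 0.0

# forward-Euler integration of x1' = -(R/L) x1 + u/L
for _ in range(20000):          # 20 s of simulated time, well past the transient
    x1 += dt * (-(R / L_val) * x1 + u_bar / L_val)

# the state settles at the equilibrium u/R = 5
ok = abs(x1 - u_bar / R) < 1e-2
```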

Example 6.3. Consider the mass-damper-spring of Example 5.3 in Figure 2, whose dynamic equation is given by (Newton's, Hooke's and Rayleigh's laws)
$$m\frac{d^2p(t)}{dt^2} + K(p(t) - \bar p) + B\frac{dp(t)}{dt} = f(t),$$
where p(t) is the position of the cart, $\bar p$ is the rest point of the spring and f(t) is the external force acting on the system. Choosing $y(t) = p(t)$, $u(t) = f(t)$ and assuming $\bar p = y = 0$, the following canonical control form is obtained
$$\dot x(t) = \begin{bmatrix} 0 & 1 \\ -\frac{K}{m} & -\frac{B}{m} \end{bmatrix}x(t) + \begin{bmatrix} 0 \\ 1 \end{bmatrix}u(t), \qquad y(t) = \begin{bmatrix} \frac{1}{m} & 0 \end{bmatrix}x(t).$$
The equilibrium point is given by
$$\bar x_1 = \frac{m}{K}\bar u, \qquad \bar x_2 = 0.$$
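For a linear system $\dot x = Ax + B\bar u$, the equilibrium solves $A\bar x + B\bar u = 0$, i.e., $\bar x = -A^{-1}B\bar u$ when A is invertible. A minimal sketch using the mass-damper-spring matrices (illustrative parameters m = 10, K = 2, damping b = 10, renamed b to avoid clashing with the input matrix):

```python
import numpy as np

m, K, b = 10.0, 2.0, 10.0      # illustrative parameters
A = np.array([[0.0, 1.0],
              [-K / m, -b / m]])
Bu = np.array([[0.0], [1.0]])
u_bar = 2.0

# equilibrium: A x_bar + Bu u_bar = 0  =>  x_bar = -A^{-1} Bu u_bar
x_bar = -np.linalg.solve(A, Bu * u_bar)

# matches x1_bar = (m/K) u_bar = 10, x2_bar = 0
```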

From the previous examples it is clear how the equilibrium point can also be a function of the input. Moreover, in order to reach a constant state value, the input must be constant as well. Furthermore, the modes analysis shows that if all the modes are convergent, then the unforced response tends towards zero, which means that for u(t) = 0 the equilibrium point is the origin.



Figure 1. RL example: a) Unforced response for R = 2 Ohm and L = 1 Henry; b) Unforced response for R = −2 Ohm and L = 1 Henry.

Example 6.4. The unforced response for Example 6.2 is reported in Figure 1-a when R = 2 Ohm and L = 1 Henry. In this case the mode is convergent; indeed there is only one eigenvalue, equal to
$$\lambda_1 = -\frac{R}{L},$$
hence exponential convergence. The figure reports the time evolution for three different initial conditions. Irrespective of the initial condition, the system exponentially converges towards the origin. Similarly, if R = −2 Ohm



Figure 2. RL example: a) Response for R = 2 Ohm and L = 1 Henry; b) Response for R = −2 Ohm and L = 1 Henry. The constant input u = 10 Volts is applied after one second.

and L = 1 Henry (Figure 1-b), which is not physically feasible (otherwise the resistor would inject energy into the system instead of dissipating it), the behaviour is divergent irrespective of the initial condition. In fact, in this case the eigenvalue λ1 previously computed would be positive.

If an input equal to u = 10 Volts is applied after one second, the responses change to the sum of the unforced and forced responses, as reported in Figure 2. The equilibrium point reached is a function of the constant input value. Notice how the presence of a constant input does not change the



Figure 3. Mass-Spring-Damper example for m = 10 Kg, K = 2 N/m and B = 10: a) Unforced response; b) Overall response assuming a constant input u = 2 N applied after one second.

divergent behaviour (compare Figure 1-b and Figure 2-b). Moreover, as for the unforced response, the steady-state value reached after the transient (i.e., the equilibrium point) is not affected by the initial condition. This is immediate by recalling the definition of the equilibrium point given above.

Example 6.5. The unforced response for Example 6.3 is reported in Figure 3-a when m = 10 Kg, K = 2 N/m and B = 10. In this case the


Figure 4. Mass-Spring-Damper example: Unforced response for m = 10 Kg, K = −2 N/m and B = 10.

modes are convergent; indeed, there are two eigenvalues equal to

λ1 = (−B + √(B² − 4Km)) / (2m)   and   λ2 = (−B − √(B² − 4Km)) / (2m),

that, given the described choices of the parameters, are both negative and real. As a consequence, we have exponential convergence. The figure reports the time evolution for three different initial conditions for both the position p(t) and the velocity v(t) = ṗ(t). Irrespective of the initial conditions, the system exponentially converges towards the origin. If an input (the external force) equal to u = 2 N is applied after one second, the responses change to the sum of the unforced and forced responses, as reported in Figure 3-b. The equilibrium point reached is a function of the constant input value. Moreover, as for the unforced response, the steady state value reached after the transient (i.e., the equilibrium point) is not affected by the initial conditions. This follows immediately by recalling the definition of the equilibrium point previously computed.
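The eigenvalue formulas above are easy to evaluate numerically. The following sketch (plain Python; the helper name `eigenvalues` is ours, and the parameter values are those of the running example) computes the roots of the characteristic polynomial m s² + B s + K = 0 for the parameter sets discussed here:

```python
import math

def eigenvalues(m, K, B):
    """Roots of the characteristic polynomial m*s^2 + B*s + K = 0."""
    disc = B**2 - 4.0 * K * m
    sq = math.sqrt(abs(disc))
    if disc >= 0:
        return ((-B + sq) / (2.0 * m), (-B - sq) / (2.0 * m))
    # Complex-conjugate pair when the discriminant is negative.
    return (complex(-B, sq) / (2.0 * m), complex(-B, -sq) / (2.0 * m))

# m in Kg, K in N/m, B is the damping coefficient, as in the example.
l1, l2 = eigenvalues(10.0, 2.0, 10.0)
print(l1, l2)   # both real and negative -> exponential convergence
```

With K = −2 instead, the same helper returns one positive eigenvalue, matching the divergent behaviour of Figure 4.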

If m = 10 Kg, K = −2 N/m and B = 10 (Figure 4), which is not physically feasible (otherwise the spring would push the mass away when it is extended beyond its rest point), the behaviour of the unforced response is divergent irrespective of the initial conditions. In fact, in this case the eigenvalue λ1 previously computed would be positive. Also in this case, the presence of a constant input would not change the divergent behaviour. The important fact to notice here is that without perturbation the system does not exhibit an unstable behaviour in the unforced response, as highlighted by the red line corresponding to x1(t0) = p(t0) = p0 = 0 and x2(t0) = v(t0) = v0 = 0. This is very important and will be used extensively in the stability analysis.


Figure 5. Mass-Spring-Damper example: Unforced response for m = 10 Kg, K = 2 N/m and B = 0.

An interesting case is when m = 10 Kg, K = 2 N/m and B = 0, as reported in Figure 5. The unforced response now exhibits a persistent oscillation, whose amplitude does depend on the initial conditions. In this case the eigenvalues are given by

λ1 = √(−Km) / m   and   λ2 = −√(−Km) / m,

which are two imaginary conjugated roots. In this case, the presence of a constant input would change the offset of the oscillations but not their persistence. Again, without perturbation the system does not exhibit a persistent oscillation in the unforced response, as highlighted by the red line corresponding to x1(t0) = p(t0) = p0 = 0 and x2(t0) = v(t0) = v0 = 0.

From the concept of perturbation, it is clear that the equilibrium point only makes sense if the system has a converging behaviour, being the point in the state space that the system reaches starting from any initial condition.

6.1.1. The role of the Eigenvectors in the Unforced Response. We saw in Chapter 5 that each matrix A ∈ Rn×n has n (generalised) linearly independent eigenvectors vi associated to the eigenvalues λi. To clearly understand the role of the eigenvectors, let us consider a diagonalisable matrix A. In such a case, there exist n independent solutions

A vi = λi vi, ∀i = 1, . . . , n.

Moreover, if the system is continuous time (similar arguments can be followed for the discrete counterpart), we know that

e^{At} vi = e^{λi t} vi, ∀i = 1, . . . , n.


Furthermore, since the vi are linearly independent, any state can be expanded as

x = Σ_{i=1}^{n} αi vi, ∀x ∈ Rn,

where αi ∈ R. Since the unforced response is given by

x(t) = e^{At} x0 = e^{At} Σ_{i=1}^{n} αi vi = Σ_{i=1}^{n} αi e^{λi t} vi,

the state space dynamics align with the eigenvector directions. Without loss of generality, let us suppose that λi ∈ R and that λi < 0, ∀i = 1, . . . , n. Moreover, let us assume that λn < λn−1 < · · · < λ1 < 0. It turns out that the unforced solution will converge towards the origin along the direction given by v1, the eigenvector associated with the slowest mode. Similar considerations can be derived if λi < 0, ∀i = 2, . . . , n, and λ1 > 0: the system will diverge along the direction given by v1.
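The modal expansion above can be checked numerically. The sketch below (numpy; A is the mass-spring-damper state matrix for m = 10, K = 2, B = 10, an assumption carried over from the examples) reconstructs x(t) from the eigenpairs and verifies that for large t the normalised state aligns with the eigenvector of the slowest mode:

```python
import numpy as np

m, K, B = 10.0, 2.0, 10.0
A = np.array([[0.0, 1.0], [-K / m, -B / m]])   # MSD state matrix (assumption)

lam, V = np.linalg.eig(A)        # eigenvalues and eigenvectors (columns of V)
x0 = np.array([1.5, -1.0])
alpha = np.linalg.solve(V, x0)   # coordinates of x0 in the eigenvector basis

def x_of_t(t):
    # sum_i alpha_i e^{lambda_i t} v_i
    return (V * np.exp(lam * t)) @ alpha

# For large t the slowest mode (least negative eigenvalue) dominates, so the
# normalised state aligns with the corresponding eigenvector direction.
slow = np.argmax(lam.real)
x = np.real(x_of_t(20.0))
direction = x / np.linalg.norm(x)
v1 = np.real(V[:, slow])
v1 = v1 / np.linalg.norm(v1)
print(direction, v1)             # equal up to sign
```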

Example 6.6. The phase portrait, i.e., the solution of the differential equation in the state space as a function of time, of the unforced response for Example 6.3 is reported in Figure 6-a when m = 10 Kg, K = 2 N/m and B = 10. In this case the modes are convergent, and the phase portrait clearly shows that the origin is approached, for any initial condition, along v1, since the two eigenvalues are real and negative and equal to

λ1 = (−B + √(B² − 4Km)) / (2m)   and   λ2 = (−B − √(B² − 4Km)) / (2m).

Instead, for m = 10 Kg, K = −2 N/m and B = 10 the system is unstable and the state aligns, for any initial condition, along the eigenvector v1, since λ1 > 0 and λ2 < 0 (see Figure 6-b).

Finally, if m = 10 Kg, K = 2 N/m and B = 0, the system has a persistent oscillation with imaginary conjugated eigenvalues

λ1 = √(−Km) / m   and   λ2 = −√(−Km) / m.

Recalling the results obtained in Chapter 5, we have that the exponential matrix is a rotation matrix; therefore the phase portrait depicts closed orbits, circles up to a scaling of the axes (see Figure 7).
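The closed orbits for B = 0 can also be seen through the conserved mechanical energy E = ½Kp² + ½mv², which stays constant along the unforced motion. A small pure-Python RK4 check of that invariant (step size and horizon are arbitrary choices):

```python
m, K = 10.0, 2.0   # mass-spring parameters from the example, B = 0

def f(x):
    # Unforced dynamics: x1' = x2, x2' = -(K/m) x1
    return (x[1], -(K / m) * x[0])

def rk4_step(x, h):
    k1 = f(x)
    k2 = f((x[0] + 0.5 * h * k1[0], x[1] + 0.5 * h * k1[1]))
    k3 = f((x[0] + 0.5 * h * k2[0], x[1] + 0.5 * h * k2[1]))
    k4 = f((x[0] + h * k3[0], x[1] + h * k3[1]))
    return (x[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            x[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

def energy(x):
    return 0.5 * K * x[0] ** 2 + 0.5 * m * x[1] ** 2

x = (1.5, -1.0)
E0 = energy(x)
for _ in range(20000):      # 20 s with h = 1e-3
    x = rk4_step(x, 1e-3)
print(E0, energy(x))        # the energy stays (numerically) constant
```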

6.1.2. Lyapunov Stability Definition. It is clear from the previous analysis that what matters for a system is its behaviour in time. In particular, it can be converging, diverging or oscillating depending on the eigenvalues (i.e., the modes analysis). Moreover, the natural intuition leads to the following statement: if the behaviour is converging or oscillating the system will be stable; if it is diverging, it will be unstable. However, in order to give an answer about the stability property of the system, we had to compute the time evolution of the system, that is, we had to solve the differential or difference equations. Of course, this is a solution that is always feasible for linear


Figure 6. Mass-Spring-Damper example: phase portrait for the system having a) a converging behaviour with m = 10 Kg, K = 2 N/m and B = 10; b) a diverging behaviour with m = 10 Kg, K = −2 N/m and B = 10.

systems, since it only requires computing the eigenvalues of the dynamic matrix. Nonetheless, this can be a very difficult problem for a nonlinear system. So, is there any possibility to solve this problem in the general case and then specialise it to the linear case?

The main idea is to express the distance from the equilibrium point:if the distance decreases with time, the system is asymptotically stable; ifit is bounded, the system is stable; if it increases with time, the system is


Figure 7. Mass-Spring-Damper example: phase portrait for the system showing a persistent oscillation with m = 10 Kg, K = 2 N/m and B = 0.

unstable. An easy way to express the distance, or vector length, in Rn is by means of a vector norm.

Definition 6.7. Given a vector space V defined on a field F, a vector norm ρ satisfies the following properties:

• Positive homogeneity: ρ(av) = |a|ρ(v), ∀a ∈ F and ∀v ∈ V;
• Triangle inequality: ρ(v + u) ≤ ρ(v) + ρ(u);
• Zero vector: ρ(v) = 0 if and only if v = 0.

We will usually denote ρ(·) as ‖ · ‖.

Definition 6.8. A vector space with a norm ‖ · ‖ is called a normed vector space.

For example, the Euclidean norm (or 2-norm) is ‖v‖2 = √(v1² + v2² + · · · + vn²) = √(vᵀv).

Example 6.9. Let us consider again the unforced response for Example 6.3. In particular, the Euclidean 2-norm of the state vector x(t) − x̄ when m = 10 Kg, K = 2 N/m and B = 10 is reported in Figure 8-a. In this case the modes are convergent and, hence, the norm is converging irrespective of the initial conditions. Instead, for m = 10 Kg, K = −2 N/m and B = 10 the system is unstable and the norm is diverging (see Figure 8-b).

If m = 10 Kg, K = 2 N/m and B = 0, the system has a persistentoscillation, hence the norm is bounded (see Figure 9).

With this idea in mind, we can now present the definitions that underlie the Lyapunov Stability Theory.


Figure 8. Mass-Spring-Damper example: norm of the state for the system having a) a converging behaviour with m = 10 Kg, K = 2 N/m and B = 10; b) a diverging behaviour with m = 10 Kg, K = −2 N/m and B = 10.

Definition 6.10. An equilibrium point x̄ is stable if all the trajectories x(t) = Φx(x′, u, t) that start from initial points x′ sufficiently close to x̄ remain arbitrarily close to x̄.

More precisely: ∀ε > 0, ∃δ > 0 such that if ‖x′ − x̄‖ < δ, then ‖Φx(x′, u, t) − x̄‖ < ε, ∀t. This case is represented in Figure 10-a.


Figure 9. Mass-Spring-Damper example: norm of the state for the system showing a persistent oscillation with m = 10 Kg, K = 2 N/m and B = 0.

Figure 10. a) Trajectory originating from a stable equilibrium point. b) Trajectory originating from an asymptotically stable equilibrium point.

Definition 6.11. An equilibrium point x̄ is attractive if all the trajectories x(t) = Φx(x′, u, t) that start from initial points x′ sufficiently close to x̄ converge towards x̄ for t → +∞.

More precisely: ∃δ > 0 such that if ‖x′ − x̄‖ < δ, then lim_{t→+∞} ‖Φx(x′, u, t) − x̄‖ = 0.

Definition 6.12. An equilibrium point x̄ is asymptotically stable if it is stable and attractive.


Figure 11. a) Stable trajectory. b) Asymptotically stable trajectory.

An example of an asymptotically stable point is depicted in Figure 10-b.

Definition 6.13. An equilibrium point x that is not stable is unstable.

Remark 6.14. The same concept of stability for an equilibrium point x̄ can be applied also to reference trajectories by substituting, in the previous definitions, x̄ with x̄(t) = Φx(x0, u, t).

Graphical representations of stable and asymptotically stable trajectories are given in Figure 11.

From the previous definitions, it is clear that the norm plays a crucial role. In particular, the theory relies on the definition of positive definiteness of a function.

Definition 6.15. A function V(x) : Rn → R is positive definite (p.d.) if and only if V(x) > 0, ∀x ∈ Br \ {0}, and V(0) = 0, where Br is a ball of radius r centred at the origin. It is globally p.d. if and only if V(x) > 0, ∀x ∈ Rn \ {0}, and V(0) = 0.

Definition 6.16. A function V(x) : Rn → R is positive semidefinite (p.s.d.) if and only if V(x) ≥ 0, ∀x ∈ Br \ {0}, and V(0) = 0 (e.g., a quadratic form with P ≥ 0). It is globally p.s.d. if and only if V(x) ≥ 0, ∀x ∈ Rn \ {0}, and V(0) = 0.

Definition 6.17. A function V(x) : Rn → R is negative definite (n.d.) if and only if V(x) < 0, ∀x ∈ Br \ {0}, and V(0) = 0 (e.g., a quadratic form with P < 0). It is globally n.d. if and only if V(x) < 0, ∀x ∈ Rn \ {0}, and V(0) = 0.

Definition 6.18. A function V(x) : Rn → R is negative semidefinite (n.s.d.) if and only if V(x) ≤ 0, ∀x ∈ Br \ {0}, and V(0) = 0 (e.g., a quadratic form with P ≤ 0). It is globally n.s.d. if and only if V(x) ≤ 0, ∀x ∈ Rn \ {0}, and V(0) = 0.


Definition 6.19. A function V(x) : Rn → R is not definite if it is neither p.d., p.s.d., n.d., nor n.s.d.

A trivial example of a p.d. function is the Euclidean 2-norm. Similarly, an example of an n.d. function is minus the Euclidean 2-norm.

Definition 6.20. Given a function V(x) : Rn → R, the region of points in which V(x) = constant is called a level surface. For a p.d. quadratic form, the level surfaces are ellipsoids in Rn.

Definition 6.21. Let V(x) : Rn → R be a continuous function with continuous first partial derivatives in a neighborhood of the origin of Rn. If V(x) is p.d., then V(x) is called a Lyapunov function candidate for the system ẋ(t) = f(x(t), t) (where u(t) is supposed to be zero).

According to our intuition and to the stability definitions of an equilibrium point, if V(x) is decreasing for increasing t, then the given solution trajectory must be converging towards the equilibrium point. This is the underlying idea of the Lyapunov stability theorems. In order to understand whether the Lyapunov function candidate V(x) is decreasing or not, it is sufficient

to determine V̇(x), which is given by the total derivative of V(x) along the system trajectories, i.e.,

V̇(x) = (∂V(x)/∂x) ẋ(t) = (∂V(x)/∂x) f(x, t)
     = (∂V(x)/∂x1) f1(x, t) + (∂V(x)/∂x2) f2(x, t) + · · · + (∂V(x)/∂xn) fn(x, t)
     = ∇V(x) f(x, t) = LfV(x),

where we made the implicit assumption that

ẋ(t) = [ẋ1(t), ẋ2(t), . . . , ẋn(t)]ᵀ = f(x(t), t) = [f1(x, t), f2(x, t), . . . , fn(x, t)]ᵀ.

It is now quite easy to introduce the Lyapunov stability theorems.
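The chain-rule computation of V̇(x) can be verified numerically. In the sketch below, f is the unforced mass-spring-damper vector field from the examples, and V(x) = x1² + x2² is an arbitrary candidate chosen only for illustration (no stability claim is attached to it); the analytic Lie derivative ∇V(x)·f(x) is compared against a finite-difference estimate of dV/dt along the flow:

```python
m, K, B = 10.0, 2.0, 10.0

def f(x):
    return (x[1], -(K / m) * x[0] - (B / m) * x[1])

def V(x):
    return x[0] ** 2 + x[1] ** 2

def lie_derivative(x):
    fx = f(x)
    return 2 * x[0] * fx[0] + 2 * x[1] * fx[1]   # grad V(x) = (2 x1, 2 x2)

def dVdt_numeric(x, h=1e-6):
    # Central difference of V along the flow direction f(x).
    fx = f(x)
    xp = (x[0] + h * fx[0], x[1] + h * fx[1])
    xm = (x[0] - h * fx[0], x[1] - h * fx[1])
    return (V(xp) - V(xm)) / (2 * h)

x = (1.5, -1.0)
print(lie_derivative(x), dVdt_numeric(x))   # the two values agree
```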

Theorem 6.22. The equilibrium point of ẋ(t) = f(x(t), t) is stable if there exists a Lyapunov function candidate V(x) with LfV(x) ≤ 0.

Theorem 6.23. The equilibrium point of ẋ(t) = f(x(t), t) is unstable if there exists a Lyapunov function candidate V(x) with LfV(x) > 0.

Theorem 6.24. The equilibrium point of ẋ(t) = f(x(t), t) is asymptotically stable if there exists a Lyapunov function candidate V(x) with LfV(x) < 0.

We immediately notice that the stability property is a property of the system whatever input is applied; indeed, the theorems do not depend on the input. The following example clarifies this point.


Example 6.25. Let us recall the RL Example 6.2, whose equilibrium point is

x̄1 = u / R.

By defining the error variable x̃1 = x1 − x̄1, it is possible to define the following Lyapunov candidate

V(x̃1) = x̃1²,

which is the squared distance of the state variable x1 from the equilibrium point. Let us compute its time derivative:

V̇(x̃1) = LfV(x̃1) = 2 x̃1 (dx̃1/dt) = 2 x̃1 ẋ1,

since dx̃1/dt = ẋ1 − dx̄1/dt = ẋ1, because x̄1 is constant. Then

V̇(x̃1) = 2 x̃1 ẋ1 = 2 x̃1 (−(R/L) x1 + u/L) = −2 (R/L) x̃1 (x1 − u/R) = −2 (R/L) x̃1².

When R = 2 Ohm and L = 1 Henry, we have that V̇(x̃1) is n.d., hence the system is asymptotically stable. Instead, when R = −2 Ohm and L = 1 Henry, we have that V̇(x̃1) is p.d., hence the system is unstable.
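A simulation-based check of this example (forward Euler; step size and horizon are arbitrary choices) shows the candidate V = (x1 − u/R)² decreasing for R = 2 and growing for R = −2:

```python
# Forward-Euler simulation of the RL dynamics x1' = -(R/L) x1 + u/L,
# monitoring the Lyapunov candidate V = (x1 - u/R)^2.
def simulate(R, L, u, x0, h=1e-3, steps=5000):
    x, xbar = x0, u / R
    vs = [(x - xbar) ** 2]
    for _ in range(steps):
        x = x + h * (-(R / L) * x + u / L)
        vs.append((x - xbar) ** 2)
    return vs

stable = simulate(R=2.0, L=1.0, u=10.0, x0=10.0)
unstable = simulate(R=-2.0, L=1.0, u=10.0, x0=10.0, steps=10)
print(stable[0], stable[-1])      # V decays towards zero
print(unstable[0], unstable[-1])  # V grows
```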

The most powerful characteristic of the Lyapunov theory is that stability can be proved without explicitly computing the solution of the differential equations.

Remark 6.26. For discrete time systems, the time derivative is substituted with the difference of the Lyapunov function, i.e.,

∆fV (x) = V (f(x))− V (x).

6.1.3. Quadratic Forms. One standard choice for the Lyapunov candidate comes from quadratic forms.

Definition 6.27. Given a matrix P ∈ Rn×n, the scalar function

V (x) = xTPx

is said to be a quadratic form.

For the quadratic form, the antisymmetric part of P is not relevant. Indeed, any matrix P can be written as the sum of its symmetric and antisymmetric parts, i.e.,

P = (P + Pᵀ)/2 + (P − Pᵀ)/2.

It then follows immediately that

V(x) = xᵀPx = xᵀ((P + Pᵀ)/2 + (P − Pᵀ)/2)x = xᵀ ((P + Pᵀ)/2) x.

We should emphasise that V(x) is p.d. if P is p.d. (i.e., P > 0). Moreover, V(x) is p.s.d. if P is p.s.d. (i.e., P ≥ 0).


Definition 6.28. A real, symmetric matrix P ∈ Rn×n is positive definite if (all the following statements are equivalent):

(1) xᵀPx > 0 for every nonzero x ∈ Rn;
(2) all eigenvalues of P are positive;
(3) P = RᵀR for some nonsingular R;
(4) the leading principal minors of P are positive;
(5) all principal minors of P are positive.

It is easy to see that if P is p.d. (or p.s.d.), then −P will be n.d. (or n.s.d.).

The condition P = RᵀR highlights the fact that a quadratic form can be seen as the squared Euclidean 2-norm of a vector in a specific set of coordinates. Indeed,

xᵀPx = xᵀRᵀRx = yᵀy = ‖y‖2²,

where y = Rx. The choice of R is not unique.
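The equivalences of Definition 6.28 and the change of coordinates y = Rx can be illustrated with numpy (the sample matrix is arbitrary; numpy's `cholesky` returns a lower-triangular L with LLᵀ = P, so R = Lᵀ gives P = RᵀR):

```python
import numpy as np

P = np.array([[4.0, 1.0], [1.0, 3.0]])   # sample symmetric matrix

cond2 = bool(np.all(np.linalg.eigvalsh(P) > 0))    # all eigenvalues positive
cond4 = P[0, 0] > 0 and np.linalg.det(P) > 0       # leading principal minors
R = np.linalg.cholesky(P).T                        # P = R^T R, R nonsingular
cond3 = np.allclose(R.T @ R, P)
print(cond2, cond4, cond3)

# The quadratic form is the squared 2-norm in the coordinates y = R x:
x = np.array([1.0, -2.0])
y = R @ x
print(x @ P @ x, y @ y)    # equal up to rounding
```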

6.2. Stability of Linear Systems

6.2.1. Continuous Time Systems. Let us consider the following linear system

ẋ = Ax,

and the Lyapunov function candidate

V (x) = xTPx.

The time derivative is then given by

V̇(x) = 2xᵀPẋ = 2xᵀPAx = xᵀ(AᵀP + PA)x ≜ −xᵀQx.

In other words, −Q is the symmetric part of 2PA. The equation

ATP + PA = −Q

is termed the continuous time Lyapunov equation. Then, it follows immediately that:

(1) If P is p.d. and Q is p.s.d., then the system is stable;
(2) If P is p.d. and Q is p.d., then the system is asymptotically stable.

In general, if P is arbitrarily chosen, then the resulting Q is not guaranteed to be sign definite. A solution instead exists in the reverse order: fixing Q p.d. and assuming that A is asymptotically stable, then the solution is given by

P = ∫₀^{+∞} e^{Aᵀt} Q e^{At} dt.

From the previous equation it follows that the modes analysis is as complex as finding a solution for the Lyapunov function.
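With plain numpy, the continuous time Lyapunov equation can be solved through the Kronecker-product identities vec(AᵀP) = (Aᵀ ⊗ I)vec(P) and vec(PA) = (I ⊗ Aᵀ)vec(P) (row-major vec convention), and the result compared against a crude quadrature of the integral above. This is only a sketch: A is the mass-spring-damper state matrix from the examples, and Q = I is an arbitrary choice.

```python
import numpy as np

A = np.array([[0.0, 1.0], [-0.2, -1.0]])   # stable: eigenvalues ~ -0.28, -0.72
Q = np.eye(2)
n = A.shape[0]
I = np.eye(n)

# Solve A^T P + P A = -Q via vectorisation (row-major vec convention).
M = np.kron(A.T, I) + np.kron(I, A.T)
P = np.linalg.solve(M, -Q.flatten()).reshape(n, n)
print(P)                                    # symmetric and positive definite

# Crude quadrature of P = int_0^inf e^{A^T t} Q e^{A t} dt, advancing
# X ~ e^{A t} with a second-order one-step approximation of e^{A h}.
h, T = 1e-3, 60.0
E = I + A * h + (A @ A) * (h * h / 2)
Pint, X = np.zeros((n, n)), np.eye(n)
for _ in range(int(T / h)):
    Pint += X.T @ Q @ X * h
    X = E @ X
print(np.abs(P - Pint).max())               # small quadrature error
```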


6.2.2. Discrete Time Systems. Let us consider the following linear system

x(k + 1) = Ax(k),

and the Lyapunov function candidate

V (x) = xTPx.

The Lyapunov function difference is then given by

∆V(x) = x(k+1)ᵀPx(k+1) − x(k)ᵀPx(k) = x(k)ᵀ(AᵀPA − P)x(k) ≜ −x(k)ᵀQx(k).

The equation

AᵀPA − P = −Q

is termed the discrete time Lyapunov equation.

Then, it follows immediately that:

(1) If P is p.d. and Q is p.s.d., then the system is stable;
(2) If P is p.d. and Q is p.d., then the system is asymptotically stable.

Again, if P is arbitrarily chosen, then the resulting Q is not guaranteed to be sign definite. A solution instead exists in the reverse order: fixing Q p.d. and assuming that A is asymptotically stable, then the solution is given by

P = Σ_{i=0}^{+∞} (Aᵀ)ⁱ Q Aⁱ.

As for the continuous time case, from the previous equation it follows that the modes analysis is as complex as finding a solution for the Lyapunov function.
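A quick numpy check that the truncated series indeed solves the discrete time Lyapunov equation AᵀPA − P = −Q for a stable A (the matrix values are arbitrary, chosen with spectral radius below one):

```python
import numpy as np

A = np.array([[0.5, 0.2], [0.0, 0.3]])   # eigenvalues 0.5 and 0.3: stable in DT
Q = np.eye(2)

P = np.zeros((2, 2))
X = np.eye(2)                            # X holds A^i
for _ in range(200):                     # the series converges geometrically
    P += X.T @ Q @ X
    X = A @ X

print(np.abs(A.T @ P @ A - P + Q).max())  # residual of A^T P A - P = -Q, ~ 0
```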


APPENDIX A

Some mathematical definitions of interest

We now list some mathematical definitions of interest. In the text below we will refer to a set denoted as T and to its elements, denoted by lowercase letters such as t, t1, t2, . . ..

Definition A.1. A set T is said to be totally ordered if a total order relation ≤ is defined on pairs of elements of the set. A total order satisfies the following properties:

(1) Reflexivity: ∀t ∈ T : t ≤ t;
(2) Antisymmetry: t1 ≤ t2 ∧ t2 ≤ t1 → t1 = t2;
(3) Transitivity: t1 ≤ t2 ∧ t2 ≤ t3 → t1 ≤ t3;
(4) Comparability (trichotomy): ∀t1, t2 ∈ T, either t1 ≤ t2 or t2 ≤ t1.

If only the first three properties hold, we speak of a partial order.

As an example, all the spaces representing time in our course are at least partially ordered.

We now give the definition of a metric space.

Definition A.2. Consider a function d : T × T → R, called distance, defined on pairs of elements of a set T, which associates a real number to each pair of points. Such a function defines a metric if:

• Positivity: ∀t1, t2 ∈ T, d(t1, t2) ≥ 0, and d(t1, t2) = 0 if and only if t1 = t2.
• Symmetry: ∀t1, t2 ∈ T : d(t1, t2) = d(t2, t1).
• Triangle inequality: ∀t1, t2, t3 ∈ T : d(t1, t3) ≤ d(t1, t2) + d(t2, t3).

Definition A.3. A metric space is a set on which a metric is defined.

Definition A.4. A point t is a limit point of a set T if for any ε > 0 there exists another point t1, different from t, such that d(t, t1) ≤ ε, where d(·, ·) is a metric.

Definition A.5. The space T is a continuum if it is a closed connected set. A connected set is one where, no matter how it is divided into two disjoint sets, at least one of them will contain limit points of the other. A closed set is one that contains the limit points of all its subsets.


The time space T for CT systems is a continuum because we cannot pose constraints on when events and observations occur, just as happens in the physical world.

