Modelling Dynamics in Processes and Systems

Wojciech Mitkowski and Janusz Kacprzyk (Eds.)


Studies in Computational Intelligence, Volume 180

Editor-in-Chief: Prof. Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warsaw, Poland. E-mail: [email protected]

Further volumes of this series can be found on our homepage: springer.com

Vol. 156. Dawn E. Holmes and Lakhmi C. Jain (Eds.): Innovations in Bayesian Networks, 2008. ISBN 978-3-540-85065-6

Vol. 157. Ying-ping Chen and Meng-Hiot Lim (Eds.): Linkage in Evolutionary Computation, 2008. ISBN 978-3-540-85067-0

Vol. 158. Marina Gavrilova (Ed.): Generalized Voronoi Diagram: A Geometry-Based Approach to Computational Intelligence, 2009. ISBN 978-3-540-85125-7

Vol. 159. Dimitri Plemenos and Georgios Miaoulis (Eds.): Artificial Intelligence Techniques for Computer Graphics, 2009. ISBN 978-3-540-85127-1

Vol. 160. P. Rajasekaran and Vasantha Kalyani David: Pattern Recognition using Neural and Functional Networks, 2009. ISBN 978-3-540-85129-5

Vol. 161. Francisco Baptista Pereira and Jorge Tavares (Eds.): Bio-inspired Algorithms for the Vehicle Routing Problem, 2009. ISBN 978-3-540-85151-6

Vol. 162. Costin Badica, Giuseppe Mangioni, Vincenza Carchiolo and Dumitru Dan Burdescu (Eds.): Intelligent Distributed Computing, Systems and Applications, 2008. ISBN 978-3-540-85256-8

Vol. 163. Pawel Delimata, Mikhail Ju. Moshkov, Andrzej Skowron and Zbigniew Suraj: Inhibitory Rules in Data Analysis, 2009. ISBN 978-3-540-85637-5

Vol. 165. Djamel A. Zighed, Shusaku Tsumoto, Zbigniew W. Ras and Hakim Hacid (Eds.): Mining Complex Data, 2009. ISBN 978-3-540-88066-0

Vol. 166. Constantinos Koutsojannis and Spiros Sirmakessis (Eds.): Tools and Applications with Artificial Intelligence, 2009. ISBN 978-3-540-88068-4

Vol. 167. Ngoc Thanh Nguyen and Lakhmi C. Jain (Eds.): Intelligent Agents in the Evolution of Web and Applications, 2009. ISBN 978-3-540-88070-7

Vol. 168. Andreas Tolk and Lakhmi C. Jain (Eds.): Complex Systems in Knowledge-based Environments: Theory, Models and Applications, 2009. ISBN 978-3-540-88074-5

Vol. 169. Nadia Nedjah, Luiza de Macedo Mourelle and Janusz Kacprzyk (Eds.): Innovative Applications in Data Mining, 2009. ISBN 978-3-540-88044-8

Vol. 170. Lakhmi C. Jain and Ngoc Thanh Nguyen (Eds.): Knowledge Processing and Decision Making in Agent-Based Systems, 2009. ISBN 978-3-540-88048-6

Vol. 171. Chi-Keong Goh, Yew-Soon Ong and Kay Chen Tan (Eds.): Multi-Objective Memetic Algorithms, 2009. ISBN 978-3-540-88050-9

Vol. 172. I-Hsien Ting and Hui-Ju Wu (Eds.): Web Mining Applications in E-Commerce and E-Services, 2009. ISBN 978-3-540-88080-6

Vol. 173. Tobias Grosche: Computational Intelligence in Integrated Airline Scheduling, 2009. ISBN 978-3-540-89886-3

Vol. 174. Ajith Abraham, Rafael Falcon and Rafael Bello (Eds.): Rough Set Theory: A True Landmark in Data Analysis, 2009. ISBN 978-3-540-89886-3

Vol. 175. Godfrey C. Onwubolu and Donald Davendra (Eds.): Differential Evolution: A Handbook for Global Permutation-Based Combinatorial Optimization, 2009. ISBN 978-3-540-92150-9

Vol. 176. Beniamino Murgante, Giuseppe Borruso and Alessandra Lapucci (Eds.): Geocomputation and Urban Planning, 2009. ISBN 978-3-540-89929-7

Vol. 177. Dikai Liu, Lingfeng Wang and Kay Chen Tan (Eds.): Design and Control of Intelligent Robotic Systems, 2009. ISBN 978-3-540-89932-7

Vol. 178. Swagatam Das, Ajith Abraham and Amit Konar: Metaheuristic Clustering, 2009. ISBN 978-3-540-92172-1

Vol. 179. Mircea Gh. Negoita and Sorin Hintea: Bio-Inspired Technologies for the Hardware of Adaptive Systems, 2009. ISBN 978-3-540-76994-1

Vol. 180. Wojciech Mitkowski and Janusz Kacprzyk (Eds.): Modelling Dynamics in Processes and Systems, 2009. ISBN 978-3-540-92202-5


Prof. Wojciech Mitkowski, Faculty of Electrical Engineering, Automatics, Computer Science and Electronics, AGH University of Science and Technology, Al. Mickiewicza 30, 30-059 Krakow, Poland. Email: [email protected]

Prof. Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warsaw, Poland. Email: [email protected]

ISBN 978-3-540-92202-5 e-ISBN 978-3-540-92203-2

DOI 10.1007/978-3-540-92203-2

Studies in Computational Intelligence ISSN 1860-949X

Library of Congress Control Number: 2008942380

© 2009 Springer-Verlag Berlin Heidelberg

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typeset & Cover Design: Scientific Publishing Services Pvt. Ltd., Chennai, India.

Printed on acid-free paper


springer.com


Preface

Dynamics characterizes virtually all phenomena we face in the real world, as well as the processes that proceed in practically all kinds of inanimate and animate systems, notably social systems. For our purposes, dynamics is viewed as the time evolution of some characteristic features of the phenomena or processes under consideration. It is obvious that in virtually all non-trivial problems dynamics cannot be neglected and should be taken into account in the analysis, first, to gain insight into the problem considered and, second, to be able to obtain meaningful results.

A convenient tool for dealing with dynamics and the related evolution over time is the concept of a dynamic system which, for the purposes of this volume, can be characterized by the input (control), state and output spaces, and a state transition equation. Then, starting from an initial state, we can find a sequence of consecutive states (outputs) under consecutive inputs (controls). That is, we obtain a trajectory. The state transition equation may be given in various forms, exemplified by differential and difference equations, linear or nonlinear, deterministic or stochastic, or even fuzzy (imprecisely specified), fully or partially known, etc. These features can give rise to various problems the analyst may encounter, such as numerical difficulties, instability, strange forms of behavior (e.g. chaotic), etc.

This volume is concerned with some modern tools and techniques which can be useful for the modeling of dynamics. We focus our attention on two important areas which play a key role nowadays, namely automation and robotics, and biological systems. We also add some new applications which can greatly benefit from the availability of effective and efficient tools for modeling dynamics, exemplified by some applications in security systems.

The first part of the volume is concerned with more general tools and techniques for the modeling of dynamics. We are particularly interested in the case of complex systems which are characterized by a highly nonlinear dynamic behavior that can result in, for instance, chaotic behavior.

R. Porada and N. Mielczarek (Modeling of chaotic systems in program ChaoPhS) first consider some general issues related to non-linear dynamics, both from the perspective of gaining more knowledge on how to proceed in the case of such dynamics, and from that of the tools and techniques which can be used in practice. Notably, they deal with simulation tools, and propose a new simulation program, ChaoPhS (Chaotic Phenomena Simulations), which is meant for studying chaotic phenomena in continuous and discrete systems, including systems used in practice. The structure of the program and the algorithms employed are presented. Numerical tests on some models of chaotic systems known from the literature are presented. Moreover, as an illustration, an example of using the proposed tools and techniques for the analysis of chaotic behavior in a power electronic system is presented.

V. Vladimirov and J. Wróbel (Oscillations of vertically hang elastic rod, contacting rotating disc) present an analysis of mechanical oscillations of an elastic rod forming a friction pair with a rotating disc. In the absence of friction the model is described by a two-dimensional Hamiltonian system of ordinary differential equations which is completely integrable. However, when a Coulomb-type friction is added, the situation becomes more complicated. The authors use both qualitative methods and numerical simulation. They obtain the complete global behavior of the system, within a broad range of values of the driving parameter, for two principal types of modeling function simulating the Coulomb friction. A sequence of bifurcations (limit cycles, double-limit cycles, homoclinic bifurcations and other regimes) is observed as the driving parameter changes. The patterns of bifurcations depend essentially upon the model of the frictional force, and this dependence is analyzed in detail. Much more complicated regimes appear when one-dimensional oscillations of the rotating element are incorporated into the model. In this case the system possesses quasiperiodic, multiperiodic and, probably, chaotic solutions.

V.N. Sidorets (The bifurcations and chaotic oscillations in electric circuits with ARC) is concerned with autonomous electric circuits with an arc governed by three ordinary differential equations. By varying two parameters, many kinds of bifurcations and of periodic and chaotic behaviors of this system can be observed. Bifurcation diagrams, which are a powerful tool for investigating bifurcations, have been used and studied. Routes to chaos have been considered using one-parameter bifurcation diagrams. Three basic patterns of bifurcation diagrams, possessing the properties of softness and reversibility, stiffness and irreversibility, and stiffness and reversibility, have been observed.

The second section of the volume is devoted to a key problem of modeling dynamics in control and robotics, very relevant fields in which intelligent systems have found numerous applications.

Oscar Castillo and Patricia Melin (Soft computing models for intelligent control of non-linear dynamical systems) describe the application of soft computing techniques (fuzzy logic, neural networks, evolutionary computation and chaos theory) to controlling non-linear dynamical systems in real-world problems. Since control of real-world non-linear dynamical systems may require the use of several soft computing techniques to achieve a desired performance, several hybrid intelligent architectures have been developed. The basic idea of these hybrid architectures is to combine the advantages of each of the techniques involved. Moreover, this can also help in dealing with the fact that non-linear dynamical systems are difficult to control due to the unstable and even chaotic behaviors that may occur. Practical applications of the new control architectures proposed include robotics, aircraft systems, biochemical reactors, and manufacturing of batteries.

J. Garus (Model reference adaptive control of underwater robot in spatial motion) discusses nonlinear control of an underwater robot. Emphasis is on the tracking of a desired trajectory. Command signals are generated by an autopilot consisting of four controllers with a parameter adaptation law that has been implemented. External disturbances are assumed, and an index of control quality is introduced. Results of computer simulations are provided to demonstrate the effectiveness, efficiency, correctness and robustness of the approach proposed.


P. Skruch (Feedback stabilization of distributed parameter gyroscopic systems) discusses feedback stabilization of distributed parameter gyroscopic systems described by second-order operator equations. It is shown that the closed-loop system which consists of the controlled system, a linear non-velocity feedback and a parallel compensator is asymptotically stable. In the case where velocity is available, the parallel compensator is not necessary to stabilize the system. Results for the multi-input multi-output case are presented. The stability results are proved using the LaSalle theorem extended to infinite-dimensional systems. Numerical examples are given to illustrate the effectiveness and efficiency of the proposed controllers.

W. Mitkowski and P. Skruch (Stabilization results of second-order systems with delayed positive feedback) discuss issues related to oscillations in second-order systems with a delayed positive feedback, notably oscillation and non-oscillation criteria. The authors consider the stability conditions for the system without damping and with a gyroscopic effect. A general algorithm for determining the stability regions is proposed. Theoretical and numerical results are presented for the single-input single-output case. The results obtained improve on some oscillation criteria proposed so far in the literature.

The third part of the volume is concerned with the modeling of dynamics in various processes that occur in biological systems. This area has recently been gaining much popularity in the research community around the world, and it is hoped that a deeper understanding of the dynamics of such processes can be of great importance for solving many problems we face in the world related to, for instance, the propagation of various kinds of disease, epidemics, etc.

F.F. Matthäus (A comparison of modeling approaches for the spread of prion diseases in the brain) is concerned with prion-related diseases, exemplified by the well-known "mad cow disease" or Creutzfeldt-Jakob disease. She presents and compares two different modeling approaches for the spread of prion diseases in the brain. The first is a reaction-diffusion model, which allows the description of prion spread in simple brain subsystems, like nerves or the spine. The second approach is the combination of epidemic models with transport on complex networks. With the help of these models, she studies the dependence of the disease progression on transport phenomena and the topology of the underlying network.

Ch. Merkwirth, J. Wichard and M. Ogorzałek (Ensemble modeling for bio-medical Applications) propose the use of ensembles of models constructed by using methods of statistical learning. The input data for model construction consists of real measurements taken in the physical system under consideration. The authors then propose a program toolbox which makes it possible to construct single models as well as heterogeneous ensembles of linear and nonlinear models. Several well-performing model types, among them ridge regression, k-nearest neighbor models and neural networks, have been implemented. Ensembles of heterogeneous models typically yield a better generalization performance than homogeneous ensembles. Additionally, the authors propose methods for model validation and assessment as well as adaptor classes performing a transparent feature selection or random subspace training on a large number of input variables. The toolbox is implemented in Matlab and C++ and is available under the GPL. Several applications of the described methods and of the numerical toolbox itself are described. These include ECG modeling, classification of activity in drug design, etc.


The fourth part of the volume is devoted to various issues related to the modeling of dynamics in new application areas which have recently attracted much attention in the research community and practice.

M. Hrebień and J. Korbicz (Automatic fingerprint identification based on minutiae points) deal with a problem that has recently attracted much attention and become of utmost importance, namely the use of some individual-specific features in human identification. In the paper, fingerprint identification is considered, specifically on the basis of local ridge characteristics called minutiae points. Automatic fingerprint matching depends on the comparison of these minutiae and the relationships between them. The authors discuss several methods of fingerprint matching, namely the Hough transform, the structural global star method and the speeded-up correlation approach. Since there is still a need for finding the best matching approach, research on on-line fingerprints has been conducted to compare quality differences and time relations between the algorithms considered, and the experimental results are shown. Some issues related to image enhancement and the minutiae detection schemes employed are also dealt with.

Ł. Rauch and J. Kusiak (Image filtering using the dynamic particles method) consider holistic approaches to image processing and their use in various types of applications in the domain of applied computer science and pattern recognition. A new image filtering method based on the dynamic particles approach is presented. It employs physical principles for 3D signal smoothing. The results obtained are compared with commonly used denoising techniques, including the weighted average, Gaussian smoothing and wavelet analysis. The calculations are performed for two types of noise superimposed on the image data, i.e. Gaussian noise and salt-and-pepper noise. The algorithm of the dynamic particle method and the results of the calculations are presented.

B. Ambrożek (The Simulation of cyclic thermal swing adsorption (TSA) process) deals with the prediction of the dynamic behavior of a cyclic thermal swing adsorption (TSA) system with a column packed with a fixed bed of adsorbent, using a rigorous dynamic mathematical model. The set of partial differential equations representing the thermal swing adsorption is solved by using numerical procedures from the International Mathematical and Statistical Library (IMSL). The simulated thermal swing adsorption cycle is operated in three steps: (i) an adsorption step with a cold feed; (ii) a countercurrent desorption step with a hot inert gas; (iii) a countercurrent cooling step with a cold inert gas. Some examples of simulations are presented for propane adsorbed onto and desorbed from a fixed bed of activated carbon. Nitrogen is used as the carrier gas during adsorption and as the purge gas during desorption and cooling.

M. Danielewski, B. Wierzba and M. Pietrzyk (The stress field induced diffusion) present a mathematical description of mass transport in a multi-component solution. The model is based on the Darken concept of the drift velocity. To be able to present an example of a real system, the authors restrict the analysis to isotropic solids and liquids for which the Navier equation holds. The diffusion of components depends on the chemical potential gradients and on the stress that can be induced by the diffusion and by the boundary and/or initial conditions. In such a quasi-continuum the energy, momentum and mass transport are diffusion controlled and the fluxes are given by the Nernst-Planck formulae. It is shown that the Darken method combined with the Navier equations is valid for solid solutions as well as multi-component liquids.


We hope that the particular chapters, written by leading experts in the field, can provide the interested readers with much information on topics which may be relevant for their research, and which are difficult to find in the vast scientific literature scattered over many fields and subfields of applied mathematics, control, robotics, security analysis, bioinformatics, mechanics, etc.

The idea of this volume resulted from very interesting discussions held during and after the well-attended Special Session on “Dynamical Systems – Modelling, Analysis and Synthesis” at the CMS – “Computer Methods and Systems” International Conference held on November 14–16, 2005 and organized by the AGH University of Science and Technology in Cracow, Poland. We wish to thank all the attendees and discussion participants for the support and encouragement we have experienced while preparing this publication.

We wish to thank the contributors for their excellent work and a great collaboration in this challenging and interesting editorial project. Special thanks are due to Dr. Thomas Ditzinger and Ms. Heather King from Springer for their constant help and support.

October 2008 Wojciech Mitkowski Janusz Kacprzyk

Contents

Basic Tools and Techniques for the Modelling of Dynamics

Modeling of Chaotic Systems in the ChaoPhS Program
Ryszard Porada, Norbert Mielczarek

Model of a Tribological Sensor Contacting Rotating Disc
Vsevolod Vladimirov, Jacek Wrobel

The Bifurcations and Chaotic Oscillations in Electric Circuits with Arc
V. Sydorets

Modelling Dynamics in Control and Robotics

Soft Computing Models for Intelligent Control of Non-linear Dynamical Systems
Oscar Castillo, Patricia Melin

Model Reference Adaptive Control of Underwater Robot in Spatial Motion
Jerzy Garus

Feedback Stabilization of Distributed Parameter Gyroscopic Systems
Paweł Skruch

Stabilization Results of Second-Order Systems with Delayed Positive Feedback
Wojciech Mitkowski, Paweł Skruch

Modelling Dynamics in Biological Processes

A Comparison of Modeling Approaches for the Spread of Prion Diseases in the Brain
Franziska Matthaus

Ensemble Modeling for Bio-medical Applications
Christian Merkwirth, Jorg Wichard, Maciej J. Ogorzałek

New Application Areas

Automatic Fingerprint Identification Based on Minutiae Points
Maciej Hrebien, Jozef Korbicz

Image Filtering Using the Dynamic Particles Method
L. Rauch, J. Kusiak

The Simulation of Cyclic Thermal Swing Adsorption (TSA) Process
Bogdan Ambrozek

The Stress Field Induced Diffusion
Marek Danielewski, Bartłomiej Wierzba, Maciej Pietrzyk

Author Index

W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 1 – 20. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

Modeling of Chaotic Systems in the ChaoPhS Program

Ryszard Porada and Norbert Mielczarek

Poznan University of Technology, Institute of Industrial Electrical Engineering, Piotrowo 3a, 61-138 Poznań, Poland, {Ryszard.Porada,Norbert.Mielczarek}@put.poznan.pl

Abstract. Modeling chaotic phenomena in nonlinear dynamics requires more precise methods and simulation tools than the study of linear systems. Research on these phenomena has not only cognitive value but also technical importance: to obtain high-quality output signals in practical systems it is necessary to control, or even eliminate, chaotic behaviour. General-purpose simulation programs, e.g. Matlab, do not always meet demanding criteria concerning the accuracy and speed of simulation. In this paper we introduce a new simulation program, ChaoPhS (Chaotic Phenomena Simulations), for investigating chaotic phenomena in continuous and discrete systems, including systems encountered in practice. We also present the structure of the program and the numerical algorithms used. The program was tested on models of chaotic systems well known from the literature. Selected results of research on chaotic phenomena that appear in simple power electronic systems are also presented.

1 Introduction

In recent years the theory of deterministic chaos has attracted a lot of interest, not only among mathematicians and physicists but also among representatives of the technical sciences. This theory analyzes irregular motion in the state space of a nonlinear system. Classical laws of dynamics describe unambiguously the evolution of a system's state as a function of time when the initial conditions are known. The cause of the observed chaotic behaviour is not external noise but a property of nonlinear systems that results in exponential divergence of initially close trajectories in a bounded region of the phase space. The system behaves this way because of its sensitivity to initial conditions, which makes a long-term forecast of its trajectory impossible, because in practice we can establish initial conditions only with finite precision.

Research on deterministic chaos phenomena enables the identification of their causes and of the means of eliminating them, which is essential in practical applications.

Over longer time horizons the state vector of a nonlinear system depends on the initial conditions and also, significantly, on the numerical methods applied to solve the equations describing the system. Using a typical simulation program, e.g. Matlab, often entails a very long computation time. This, together with the limited number of implemented numerical algorithms for integrating the equations of dynamics and the lack of numerical tools for determining the quantities that characterize nonlinear dynamics (e.g. the Poincaré section, Lyapunov exponents, etc.), contributed to our decision to write our own simulation program.


The paper describes the concept of deterministic chaos and the mathematical tools used for its analysis. We introduce our own simulation program, ChaoPhS, report tests of this program, and present research results for a simple power electronic system (an example of a typical, strongly nonlinear switching structure used in practice) operating in a closed loop under various control and load conditions.

2 General Characteristic of Methods of Nonlinear Dynamics

The behaviour of a dynamical (evolutionary) system can usually be described [1,2] by differential equations in normal form:

$$\dot{\mathbf{x}}(t) = \mathbf{F}(t, \mathbf{x}(t), \mathbf{u}(t), \mathbf{\Lambda}(t)), \qquad \mathbf{x}(t_0) = \mathbf{x}_0 \tag{1}$$

where $\mathbf{x}$ is the state vector, $\mathbf{u}$ the control vector and $\mathbf{\Lambda}$ the vector of additional parameters, specified on a manifold $M$ which forms the phase space of the system. The phase flow $g(\mathbf{x}, \Lambda, t) \equiv g_\Lambda^t(\mathbf{x})$ generates the vector field $\mathbf{F}$ specified on the manifold $M$. The subset of points

$$\gamma = \{ K \in M : K = g_\Lambda^t,\ t \in \mathbb{R}^1 \} \tag{2}$$

forms the orbit of the flow. The orbit is a curve lying on the manifold $M$ and is the trajectory of equations (1). If equation (1) has a periodic solution with period $T$, then $g_\Lambda^{t+T} = g_\Lambda^t$, $t \in \mathbb{R}^1$, and the orbit (2) is closed. The orbits of the flow $g_\Lambda^t$ are the integral curves of the vector field $\mathbf{F}$.

For a system with discrete time given in the form of an algebraic mapping, the evolution in time can be described by a general iterative formula:

$$x_{n+1} = f_p(x_n) \tag{3}$$

where $x_n$ and $x_{n+1}$ describe the system state in the $n$-th and the $(n+1)$-th step of the evolution.

Among the basic notions of nonlinear dynamics [2,3,4] one can mention several mutually interrelated ones, such as fixed points, orbits, attractors, the Poincaré section, the Lyapunov exponents, the Hausdorff dimension, the correlation function and bifurcations [1,2,7,11].

An attractor is a region, trajectory or point in the phase space towards which trajectories starting in different regions of the phase space tend. The simplest attractor is a fixed point: the system has a distinguished state towards which it tends regardless of the initial conditions. In a two-dimensional phase space only one other type of attractor is possible, the limit cycle. Limit cycles appear in nonlinear systems containing both elements that dissipate energy and elements that sustain the motion.
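As an illustration of a limit cycle acting as an attractor, the sketch below integrates the Van der Pol oscillator from two very different initial states and measures the late-time amplitude of each trajectory. The choice of the Van der Pol equation, the fixed-step Runge-Kutta integrator and all function names (`van_der_pol`, `rk4_step`, `late_amplitude`) are assumptions made for this example; none of them is taken from the ChaoPhS program.

```python
def van_der_pol(state, mu=1.0):
    """Right-hand side of the Van der Pol oscillator: x' = v, v' = mu*(1 - x^2)*v - x."""
    x, v = state
    return (v, mu * (1.0 - x * x) * v - x)


def rk4_step(f, state, h):
    """One classical fourth-order Runge-Kutta step for an autonomous system."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def late_amplitude(x0, v0, h=0.01, t_end=60.0, t_measure=10.0):
    """Integrate from (x0, v0) and return max |x| over the last t_measure time units."""
    steps = int(round(t_end / h))
    tail = int(round(t_measure / h))
    state = (x0, v0)
    amplitude = 0.0
    for i in range(steps):
        state = rk4_step(van_der_pol, state, h)
        if i >= steps - tail:  # only look at the trajectory after transients
            amplitude = max(amplitude, abs(state[0]))
    return amplitude


if __name__ == "__main__":
    # A start near the origin and a start far outside the cycle
    for x0 in (0.1, 4.0):
        print(x0, late_amplitude(x0, 0.0))
```

Both starting points should report nearly the same amplitude, which is what "tending to the attractor regardless of the initial conditions" means in practice. The nonlinear damping term plays both roles mentioned above: it dissipates energy at large amplitudes and sustains the motion at small ones.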


Poincaré sections simplify the search for attractors by analyzing the points at which trajectories cut through a chosen plane. The Poincaré map arises from the orbits of the phase flow $g_\Lambda^t$, and its character, i.e. whether it is contracting or expanding, determines the behaviour of the system.

The Lyapunov exponents are used to estimate the convergence or divergence of phase flow trajectories. Positive values of the exponents mean divergence of orbits and chaos. The Lyapunov exponent is defined by the equation:

$$\lambda = \limsup_{t \to \infty} \frac{1}{t} \ln \xi(t) \tag{4}$$

where $\xi(t)$ is a phase trajectory; the exponent describes the exponential divergence or convergence of the trajectories surrounding the analyzed trajectory. In general, the number of Lyapunov exponents equals the number of dimensions of the phase space.
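A minimal numerical sketch of this idea is given below for a one-dimensional discrete map of the form (3). It relies on the standard discrete-time counterpart of definition (4), in which the exponent is estimated as the orbit average of $\ln |f'(x_n)|$; this formula and the use of the logistic map as a test system are assumptions of the example, not the procedure implemented in ChaoPhS. The function names are hypothetical.

```python
import math


def logistic(x, r):
    """Logistic map, used here only as a well-known test system."""
    return r * x * (1.0 - x)


def logistic_derivative(x, r):
    """Derivative of the logistic map with respect to x."""
    return r * (1.0 - 2.0 * x)


def lyapunov_exponent(r, x0=0.3, n_transient=1000, n_iter=100000):
    """Estimate the Lyapunov exponent as the orbit average of ln|f'(x_n)|."""
    x = x0
    for _ in range(n_transient):  # discard the transient part of the orbit
        x = logistic(x, r)
    total = 0.0
    for _ in range(n_iter):
        # The tiny floor only guards against log(0) if the orbit hits x = 0.5 exactly
        total += math.log(max(abs(logistic_derivative(x, r)), 1e-300))
        x = logistic(x, r)
    return total / n_iter


if __name__ == "__main__":
    print("r = 2.8:", lyapunov_exponent(2.8))  # stable fixed point, negative exponent
    print("r = 4.0:", lyapunov_exponent(4.0))  # chaotic regime, positive exponent
```

For r = 2.8 the orbit settles on a fixed point where the derivative equals 2 - r, so the estimate should approach ln 0.8; for r = 4 the known analytical value is ln 2. The sign of the result is what separates convergence of neighboring orbits from divergence.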

In nonlinear dynamics, a bifurcation is a change in the stable functioning of a system that takes place when a control parameter is modified. If we assume that the motion of a dynamic system is described by the partition of the phase space into trajectories, then the bifurcation values of the parameters are those for which this structure changes.

3 Numerical Modeling of Chaotic Systems

Sensitivity to initial conditions and unpredictability over long time horizons are of fundamental importance for the evolution of chaotic systems. Determining the trajectory of such systems numerically is more difficult than in the case of linear systems. The adopted mathematical models and, above all, the numerical algorithms are decisive, and the errors arising during the calculations must be controlled particularly carefully.

It is often accepted [1,2] that, for a preliminary evaluation of the system's behaviour, one can use a simplified system containing only simplified models of the nonlinear elements that are the principal cause of the chaotic behaviour. Given the high sensitivity of such a system, a quantitative analysis of its trajectory is of no use. The results of such investigations are useful only for qualitative analysis, that is, for determining fixed points, bifurcations and the existence of chaotic attractors.

A trajectory of a linear or nonlinear system defined by formula (1) can be found by means of iterative algorithms:

$$x(t+h) = x(t) + \int_t^{t+h} f(t, x)\, dt \tag{5}$$

or by expansion in a Taylor series. In this research we use several methods of solving equations (1); they are all discussed later in this paper.
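To make formula (5) concrete, the sketch below compares two common ways of approximating the integral over one step: the explicit Euler rule and the classical fourth-order Runge-Kutta rule. These two schemes are chosen here purely for illustration and are not claimed to be the ones used in ChaoPhS; the test equation x' = -x and the function names are likewise assumptions.

```python
import math


def euler_step(f, t, x, h):
    """Approximate the integral in (5) by h * f(t, x) (rectangle rule)."""
    return x + h * f(t, x)


def rk4_step(f, t, x, h):
    """Approximate the integral in (5) by the classical fourth-order Runge-Kutta rule."""
    k1 = f(t, x)
    k2 = f(t + 0.5 * h, x + 0.5 * h * k1)
    k3 = f(t + 0.5 * h, x + 0.5 * h * k2)
    k4 = f(t + h, x + h * k3)
    return x + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def integrate(step, f, t0, x0, h, n_steps):
    """Apply a one-step rule n_steps times, starting from x(t0) = x0."""
    t, x = t0, x0
    for _ in range(n_steps):
        x = step(f, t, x, h)
        t += h
    return x


def decay(t, x):
    """Test equation x' = -x with exact solution x(t) = x0 * exp(-t)."""
    return -x


if __name__ == "__main__":
    exact = math.exp(-1.0)
    for name, step in (("Euler", euler_step), ("RK4", rk4_step)):
        approx = integrate(step, decay, 0.0, 1.0, 0.1, 10)
        print(name, "error at t = 1:", abs(approx - exact))
```

With the same step size the higher-order rule is several orders of magnitude more accurate on this test problem. For chaotic systems, where local errors are amplified exponentially, this difference in per-step accuracy is one reason the choice of integration method matters so much.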

For a qualitative study of a chaotic model the Poincaré section can be very helpful. It simplifies the search for attractors by analyzing the points at which trajectories cut through a chosen plane. Instead of continuous lines we obtain a set of points situated on this plane (Fig. 1). The plane is selected so as to provide as much information as possible about whether the motion has an attractor and what its structure is. If the motion takes place on a closed trajectory, then it intersects the Poincaré plane at one point and the regularity of the motion is easy to notice. A chaotic motion gives irregular trajectories which cross the plane at ever new points. If there is no regularity (i.e. no attractor), the intersection points migrate irregularly within a certain region of the plane, favoring none of its parts.

Fig. 1. An example of the Poincaré section in autonomous systems

Fig. 2. An example of the Poincaré section in non-autonomous systems

In non-autonomous systems (particularly for input signals with a constant period $T$) we often apply the stroboscopic Poincaré section [1,2]. The points of the section are then taken at the instants $nT$, $n = 1, 2, \ldots, N$.
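The stroboscopic section can be sketched in a few lines: the state is integrated over whole periods and recorded at the instants $nT$. The example below is only an illustration under stated assumptions; the driven, damped linear oscillator and all function names are hypothetical and were chosen because its steady state is periodic, so the section should collapse to a single point.

```python
import math

def rk4_step(f, t, x, h):
    """One classical Runge-Kutta step for dx/dt = f(x, t); x is a list."""
    k1 = f(x, t)
    k2 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k1)], t + 0.5 * h)
    k3 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k2)], t + 0.5 * h)
    k4 = f([xi + h * ki for xi, ki in zip(x, k3)], t + h)
    return [xi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def stroboscopic_section(f, x0, T, n_periods, steps_per_period=200):
    """Return the states sampled at t = T, 2T, ..., n_periods * T."""
    h = T / steps_per_period
    x, points = list(x0), []
    for n in range(n_periods):
        for k in range(steps_per_period):
            x = rk4_step(f, n * T + k * h, x, h)
        points.append(tuple(x))
    return points

# Hypothetical test system: damped oscillator driven with period T.
omega = 2.0
T = 2.0 * math.pi / omega

def driven(x, t):
    return [x[1], -x[0] - 0.5 * x[1] + math.cos(omega * t)]

points = stroboscopic_section(driven, [0.0, 0.0], T, 40)
# After the transient has decayed, successive section points coincide.
drift = max(abs(a - b) for a, b in zip(points[-1], points[-2]))
```

For a chaotic system the same routine would return points that never settle, which is the irregular pattern described above.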

In nonlinear dynamics the bifurcation diagram (Fig. 3) is used to evaluate stable operation: it shows how the steady-state behaviour of a system changes as the value of a control parameter is modified. The change of state takes the form of trajectory multiplications [1,2], which eventually lead to chaotic behaviour of the system. The diagram is obtained by plotting the value of the control parameter on the x axis and, on the y axis, the points of the Poincaré section found for different initial conditions after the transient states have been eliminated.

To distinguish deterministic chaos from noise or from entirely stochastic systems, we can use the Lyapunov exponents and the family of generalized dimensions (Hausdorff, fractal, correlation). The former define the level of chaos in a dynamical system, whereas the latter measure the complexity of the system.


Fig. 3. Bifurcation diagram showing a cascade of period doubling of phase trajectory orbit

These coefficients are rather hard to calculate analytically; however, they can be determined relatively easily from a sampled time series of the investigated system.
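The construction of a bifurcation diagram such as the one in Fig. 3 can be illustrated on the logistic map, for which the iterates themselves play the role of the Poincaré section points. The sketch below is a minimal example; the parameter values, the number of discarded transient iterations and the helper names are assumptions made for this illustration.

```python
def logistic(x, r):
    """Logistic map x -> r * x * (1 - x)."""
    return r * x * (1.0 - x)

def bifurcation_points(r_values, x0=0.5, n_transient=1000, n_keep=64):
    """Return (r, x) pairs: control parameter on the x axis, iterates on the y axis."""
    points = []
    for r in r_values:
        x = x0
        for _ in range(n_transient):      # eliminate the transient states
            x = logistic(x, r)
        for _ in range(n_keep):           # keep the steady-state points
            x = logistic(x, r)
            points.append((r, x))
    return points

def distinct_values(points, r, digits=6):
    """Distinct steady-state values found for one value of the control parameter."""
    return sorted({round(x, digits) for rr, x in points if rr == r})

pts = bifurcation_points([2.8, 3.2, 3.5])
periods = [len(distinct_values(pts, r)) for r in (2.8, 3.2, 3.5)]
```

The three chosen values of the control parameter lie in the period-1, period-2 and period-4 ranges of the logistic map, so the number of distinct points doubles from one value to the next.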

The Lyapunov exponents are numerical coefficients describing the exponential growth of the distance between neighbouring points in phase space under the action of a transformation. For the simplest transformation $x_{n+1} = a x_n$, after $n$ steps we obtain $x_n = a^n x_0$, which can be written as $x_n = x_0 e^{n \ln a}$. The quantity $\ln a$ shows the proportion in which the distance between points changes in one step of the transformation. For multidimensional systems, where the transformation has the form $x_{n+1} = A x_n$, the Lyapunov exponents are equal to $\lambda_k = \ln a_k$, where $a_1, a_2, \ldots, a_k$ are the eigenvalues of the matrix $A$. In directions in which trajectories diverge from each other the Lyapunov exponents are positive, and in directions in which they converge the exponents are negative. The condition for preserving the measure is $\det A = 1$, which means that the product of all eigenvalues is equal to 1. For continuous nonlinear systems the rate of motion along each trajectory is set by a tangent vector. The transformation matrix is the Jacobian matrix $J_{ji} = \partial f_j / \partial x_i$, where the elements $J_{ji}$ are functions of the coordinates of a point in phase space and define the rate of change of the $j$-th coordinate in the $x_i$ direction. These exponents are therefore calculated locally, and their values are obtained in a small neighbourhood of the explored point.

To determine the largest Lyapunov exponent we used the algorithm developed by Rosenstein, Collins and De Luca [16]. Let the sequence:

$$x = \{x_1, x_2, x_3, \ldots, x_N\} \qquad (3)$$

represent the samples of a time series of one of the state variables for which the exponents are being estimated, whereas:

$$\mathbf{X}_i = [X_1 \; X_2 \; \ldots \; X_n]^T \qquad (4)$$

where $\mathbf{X}_i$ is the vector of state variables obtained at discrete time $i$ and $n$ is the number of state variables (the embedding dimension of the system trajectory).

Applying the Takens method [12] of attractor reconstruction from the time series, we obtain the vector of delayed state variables:

$$\mathbf{X}_i = [X_i \; X_{i+J} \; \ldots \; X_{i+(m-1)J}]^T \qquad (5)$$

where $J$ is the reconstruction delay and $m$ is the embedding dimension of the space of delayed state-variable vectors.

To choose the embedding dimension $m$ correctly, we apply the Takens theorem:

$$m \geq 2n + 1 \qquad (6)$$
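The reconstruction (5) amounts to collecting, for every admissible index $i$, the samples of the time series (3) taken $J$ steps apart. A minimal sketch is given below; the function name `delay_embed` and the toy series are assumptions used only for illustration.

```python
def delay_embed(x, m, J):
    """Return the delay vectors [x[i], x[i+J], ..., x[i+(m-1)J]] for every valid i."""
    M = len(x) - (m - 1) * J          # number of vectors that fit in the series
    if M <= 0:
        raise ValueError("time series too short for this m and J")
    return [[x[i + k * J] for k in range(m)] for i in range(M)]

# Toy series 0, 1, ..., 9 embedded with m = 3 and J = 2.
series = list(range(10))
vectors = delay_embed(series, m=3, J=2)
```

With $N$ samples the reconstruction yields $N - (m-1)J$ vectors, which is why long delays or large embedding dimensions require longer time series.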

After reconstructing the vector of state variables, we find the distance $d_j$ from the reference point $j$ to its nearest neighbour, defined by the Euclidean norm:

$$d_j(0) = \min_{\mathbf{X}_{\hat{j}}} \lVert \mathbf{X}_j - \mathbf{X}_{\hat{j}} \rVert \qquad (7)$$

where $d_j(0)$ is the initial distance of the $j$-th point from its nearest neighbour $\mathbf{X}_{\hat{j}}$. It can be assumed that the distance $d_j(i)$ is equal to:

$$d_j(i) \approx C_j e^{\lambda_1 (i \cdot \Delta t)} \qquad (8)$$

where $C_j$ is the initial distance.

Taking the logarithm of both sides of the equation, we obtain:

$$\ln d_j(i) \approx \ln C_j + \lambda_1 (i \cdot \Delta t) \qquad (9)$$

The largest Lyapunov exponent can be obtained by calculating the slope of the line (9), fitted against the step index $i$ with the least squares method, and dividing it by the sampling interval of the time series $x$.
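The procedure of equations (7)-(9) can be sketched as follows. This is a simplified illustration and not the code of ChaoPhS or of [16]: the brute-force neighbour search, the exclusion window `min_sep`, the number of fitted steps and the logistic-map test series (with $r = 4$, whose largest exponent is known to be $\ln 2$) are all assumptions made for this example.

```python
import math

def logistic_series(n, r=4.0, x0=0.3, n_transient=100):
    """Samples of the logistic map after discarding a transient."""
    x, out = x0, []
    for k in range(n + n_transient):
        x = r * x * (1.0 - x)
        if k >= n_transient:
            out.append(x)
    return out

def largest_lyapunov(x, m, J, dt, n_steps, min_sep=10):
    """Estimate lambda_1 from the mean growth of ln d_j(i), cf. (7)-(9)."""
    M = len(x) - (m - 1) * J
    X = [[x[i + k * J] for k in range(m)] for i in range(M)]   # delay vectors (5)
    usable = M - n_steps              # points that can be followed for n_steps

    def dist(a, b):
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

    pairs = []
    for j in range(usable):           # nearest neighbour of each reference point (7)
        best, best_d = None, float("inf")
        for k in range(usable):
            if abs(j - k) > min_sep:  # skip temporally close points
                d = dist(X[j], X[k])
                if 0.0 < d < best_d:
                    best, best_d = k, d
        if best is not None:
            pairs.append((j, best))

    y = []                            # mean of ln d_j(i) over all pairs
    for i in range(n_steps + 1):
        logs = [math.log(d) for d in
                (dist(X[j + i], X[k + i]) for j, k in pairs) if d > 0.0]
        y.append(sum(logs) / len(logs))

    steps = list(range(n_steps + 1))  # least squares slope versus the index i
    s_mean = sum(steps) / len(steps)
    y_mean = sum(y) / len(y)
    slope = (sum((s - s_mean) * (v - y_mean) for s, v in zip(steps, y))
             / sum((s - s_mean) ** 2 for s in steps))
    return slope / dt                 # divide by the sampling interval

lam = largest_lyapunov(logistic_series(600), m=1, J=1, dt=1.0, n_steps=4)
```

Only the first few steps are fitted because the distances saturate once they reach the size of the attractor, after which the line (9) flattens.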

Fig. 4. Method of numeric calculations of the Lyapunov exponent


For a qualitative description of the complexity of a chaotic system we can use the correlation dimension $D_2$, which is a lower bound of the Hausdorff dimension $D_0$, i.e. $D_2 \le D_0$. The correlation dimension is defined as:

$$D_2 = \lim_{r \to 0} \frac{1}{\ln r} \ln C(r) \qquad (10)$$

where $C(r)$ is the correlation integral, equal to:

$$C(r) = \frac{2}{M(M-1)} \sum_{i \neq k} \Theta\left[ r - \lVert \mathbf{X}_i - \mathbf{X}_k \rVert \right] \qquad (11)$$

where $r$ is the distance between points, $\Theta$ is the Heaviside function and $M$ is the number of points. If the time signal (5) is known, then it is possible to compute the correlation integral $C(r)$. The correlation integral is the probability that the distance between two points on the attractor is smaller than $r$.

In this work we use the Grassberger-Procaccia algorithm [11] to calculate $C(r)$ as the correlation sum. Writing equation (10) in the form:

$$\ln C(r) = f(\ln r) \qquad (12)$$

we can see that the correlation dimension can be determined as the slope of the function (12).
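The correlation sum and the slope of (12) can be illustrated with a short sketch. It is not the ChaoPhS code: the function names are hypothetical, each unordered pair of points is counted once (one possible reading of the prefactor in (11)), and the test set is a uniform grid on a line segment, whose dimension should come out close to 1.

```python
import math

def euclid(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def correlation_sum(points, r):
    """Fraction of point pairs closer than r; each unordered pair counted once."""
    M = len(points)
    count = 0
    for i in range(M - 1):
        for k in range(i + 1, M):
            if euclid(points[i], points[k]) < r:   # Heaviside(r - distance)
                count += 1
    return 2.0 * count / (M * (M - 1))

def correlation_dimension(points, radii):
    """Least squares slope of ln C(r) versus ln r over the given radii."""
    xs = [math.log(r) for r in radii]
    ys = [math.log(correlation_sum(points, r)) for r in radii]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    return (sum((a - x_mean) * (b - y_mean) for a, b in zip(xs, ys))
            / sum((a - x_mean) ** 2 for a in xs))

# Test set: 400 points spread uniformly on the unit interval.
line = [(i / 399.0,) for i in range(400)]
d2 = correlation_dimension(line, [0.05, 0.1, 0.2])
```

For data from a strange attractor the points would be the delay vectors (5), and the radii would be restricted to the range in which the chart of (12) is close to a straight line.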

The correlation integral can also be used to distinguish deterministic irregularities, arising from the internal properties of a strange attractor, from external white noise. If a strange attractor is embedded in an $n$-dimensional space and external white noise is added, then each point on the attractor is surrounded by a homogeneous $n$-dimensional cloud of points. The radius $r_0$ of this cloud is proportional to the intensity of the noise. For $r \gg r_0$ the correlation integral counts these clouds as points and the slope of the function $\ln C(r) = f(\ln r)$ is equal to the correlation dimension of the attractor. For $r \ll r_0$ most of the counted points lie inside homogeneously filled $n$-dimensional cells and the slope tends to the value $n$.

In practical applications, the sources of deterministic chaos are switched systems, e.g. power electronics systems. In numerical simulations they cause additional difficulties whose severity depends on the selected model of the switching elements. In a model of a system with a changeable structure, the element is often modelled as ideal (zero switching time, zero resistance of the switch in the on-state and infinite resistance in the off-state). The right-hand side discontinuity of the system is eliminated by numerically calculating the switching instant $t_S$, integrating according to rule (2) over the range $\langle t;\, t_S^- \rangle$, and then setting the initial conditions $x(t_S^+) = x(t_S^-)$ for a new integration (2) over the range $\langle t_S^+;\, t+h \rangle$. A left-hand side discontinuity (e.g. closing a switch in a circuit containing a capacitance with a non-zero initial condition) can be eliminated by setting the initial conditions of the new integration to $x(t_S^+) \neq x(t_S^-)$ and using a very small integration step.

When we select a model with a stationary structure, it is possible to replace the ideal switching element with a real one and to apply an algorithm for solving stiff differential equations with a very small integration step.
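The treatment of a right-hand side discontinuity described above can be sketched as follows. The example is only an illustration under stated assumptions: the switching instant is located by bisection on the length of the sub-step, the switching condition is expressed by a hypothetical function `g` that changes sign at $t_S$, and the test system (a ramp that reverses direction at a threshold) is invented for this sketch.

```python
def rk4_step(f, t, x, h):
    """One classical Runge-Kutta step for dx/dt = f(x, t); x is a list."""
    k1 = f(x, t)
    k2 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k1)], t + 0.5 * h)
    k3 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k2)], t + 0.5 * h)
    k4 = f([xi + h * ki for xi, ki in zip(x, k3)], t + h)
    return [xi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def locate_switch(f, g, t, x, h, tol=1e-12):
    """Bisect on the sub-step length until the sign change of g is bracketed."""
    g0 = g(x, t)
    lo, hi = 0.0, h
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(rk4_step(f, t, x, mid), t + mid) * g0 > 0.0:
            lo = mid                  # still before the switching instant
        else:
            hi = mid                  # at or past the switching instant
    return t + hi, rk4_step(f, t, x, hi)

def step_with_switch(f_before, f_after, g, t, x, h):
    """Advance from t to t + h, restarting the integration at t_S if g changes sign."""
    x_trial = rk4_step(f_before, t, x, h)
    if g(x, t) * g(x_trial, t + h) > 0.0:
        return x_trial, None          # no switching inside this step
    t_s, x_s = locate_switch(f_before, g, t, x, h)
    # Continuity x(t_S+) = x(t_S-), then finish the step with the new structure.
    return rk4_step(f_after, t_s, x_s, (t + h) - t_s), t_s

# Hypothetical test: x rises with slope 1 until x = 0.5, then falls with slope -1.
rising = lambda x, t: [1.0]
falling = lambda x, t: [-1.0]
threshold = lambda x, t: x[0] - 0.5
x_end, t_s = step_with_switch(rising, falling, threshold, 0.4, [0.4], 0.2)
```

A left-hand side discontinuity would differ only in the restart: the state passed to the second integration would be replaced by the new initial condition instead of being carried over unchanged.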

4 Description of the ChaoPhS Program

The ChaoPhS (Chaotic Phenomena Simulations) program, whose block diagram is shown in Fig. 5, was written in the C++ Builder software development kit, with the use of object-oriented programming technology.

The program contains a library of mathematical models of chaotic systems, including power electronics systems, from which the tested object is selected. The library is open because it is loaded dynamically: as the program is developed, new elements can be added to the library without modifying and recompiling the entire program code. Each system added to the library is treated as a class (in the programming sense) with methods used to analyse the system [6] and properties determining its parameters. The methods of the class describing a system comprise, among others, the implemented numerical algorithms for solving the equations of its mathematical model and the mathematical tools used for its analysis. This concept and some of the numerical methods were taken from [14].

To solve the differential equation (1) one can choose among the following methods: Runge-Kutta, Fehlberg, Dormand-Prince, Adams-Moulton, Gear, and Gragg with Bulirsch-Stoer extrapolation. It is also possible to choose the order of the method and the integration step, or to use automatic selection of the step and, in some cases, automatic selection of the method's order (Fig. 5). To improve the performance of the program and to simplify the implementation of methods for investigating nonlinear systems, an approximation of the state vector has been added. The discrete state vector obtained at the discrete times $t_k = t_0 + k \cdot h$ ($h$ is the integration step) can be approximated to obtain additional data between the discrete times $t_{k-1}$ and $t_k$. To the third-order spline approximation [13] we have added linear interpolation, which gives better results for some of the studied systems, where the nonlinearity and nonstationarity of the objects enforce very small integration steps.
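The linear interpolation of the state vector between two consecutive discrete times can be written in a few lines. The sketch below is illustrative only; the function name and the sample values are assumptions, and the third-order spline variant is not shown.

```python
def interpolate_state(t, t_prev, x_prev, t_next, x_next):
    """Linearly interpolate the state vector at a time t between t_prev and t_next."""
    w = (t - t_prev) / (t_next - t_prev)      # 0 at t_prev, 1 at t_next
    return [(1.0 - w) * a + w * b for a, b in zip(x_prev, x_next)]

# Hypothetical two-component state known at t = 0 and t = 1, evaluated at t = 0.25.
x_quarter = interpolate_state(0.25, 0.0, [0.0, 2.0], 1.0, [4.0, 0.0])
```

Such intermediate values are useful, for example, when a Poincaré section or a switching instant falls between two integration steps.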

Besides the waveforms of the state vector, the program can determine the phase trajectory and perform spectral analysis using the discrete Fourier transform (Horner's algorithm and the FFT algorithms of Cooley-Tukey and Sande-Tukey with a defined radix). The methods of nonlinear dynamics analysis that were added to the numerical library used [6] concern the determination of the Poincaré section, bifurcation diagrams for selected system parameters, the


Fig. 5. Exemplary screenshots of program ChaoPhS concerning the choice of numerical methods solving the equations of systems model: a) menu of model choice; b) menu responsible for choice of simulation start; c) option dialog box of simulation parameters; d) option dialog box of parameters of data analysis; e) dialog box for changing model parameters; f) menu of type of graphical chart describing system; g) table with models time series of state vector


Lyapunov exponents and a correlation function (developed by Rosenstein, Collins, and De Luca [16]).
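The spectral analysis mentioned above can be illustrated with a direct evaluation of the discrete Fourier transform. The sketch below uses the plain $O(N^2)$ definition as a stand-in and does not reproduce the Horner or FFT variants implemented in the program; the function name and the test signal are assumptions.

```python
import cmath
import math

def dft_magnitudes(x):
    """Normalized magnitudes |X_k| / N of the DFT for k = 0 .. N/2."""
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N)
                    for n in range(N))) / N
            for k in range(N // 2 + 1)]

# Test signal: 5 full periods of a sine wave in 64 samples.
signal = [math.sin(2.0 * math.pi * 5 * n / 64) for n in range(64)]
mags = dft_magnitudes(signal)
peak = max(range(len(mags)), key=mags.__getitem__)   # index of the dominant line
```

A periodic waveform gives isolated spectral lines like the single peak found here, whereas a chaotic waveform gives a broad, continuous spectrum.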

The program also makes it possible to import data calculated in other programs (e.g. Matlab) and to compute selected quantities characterizing the nonlinear dynamics (e.g. the Poincaré section or the Lyapunov exponents). The data can be selected from the program menu <File>.

To choose a model, one should open the <Model> submenu in the menu bar and select the proper system from the objects available in the library (Fig. 5a). Next one should click the <Option> command in the main menu and set the parameters of the simulation (Fig. 5c), that is: the initial and final time of the simulation, the initial values of the state vector, the ODE solver algorithm and its order, the precision and the integration step. The name of the investigated model and of the ODE solver algorithm then appear on the status bar at the bottom of the program panel. A graphic symbol of the model should be visible in the main window; for convenience it can be moved within the program panel. After setting the parameters of the model (by double clicking on the graphic symbol of the system, Fig. 5e), the calculations can be started by choosing the <Calculations> submenu and the <Solve> command. During the computation the program remains responsive, and the progress is shown on the progress bar placed below the main panel. Afterwards a window with the waveforms of the state variables (Fig. 5b) appears on the screen; it can be enlarged to the size of the main program window and resized dynamically with the mouse. In the main menu one can activate the <Charts> submenu, from which the following graphs can be chosen: waveforms of state variables, phase portraits, frequency spectrum, Poincaré section, bifurcation diagram, and the functions used to estimate the largest Lyapunov exponent and the correlation dimension (Fig. 5f). Before selecting any chart, it is necessary to set the options in the <Numeric methods II> tab of the options dialog box (Fig. 5d). There one can select the FFT algorithm, the approximation (or interpolation) method of the state vector if automatic selection of the integration step is enabled, and the parameters of the Poincaré section, the bifurcation diagram and the Lyapunov exponent.

The computed state vector of the model equations can also be viewed by selecting the <Table> command from the main menu. The vector appears in the table shown in Fig. 5g.

Besides the models of systems described by equation (1), the program can analyse models represented by algebraic equations, e.g. the logistic map, the Henon map, the Ikeda map and others.
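A model given by algebraic equations is simulated simply by iterating the map. The sketch below shows this for the Henon map; the classical parameter values $a = 1.4$, $b = 0.3$, the starting point and the function names are assumptions made for this illustration and are not taken from the program.

```python
import math

def henon(x, y, a=1.4, b=0.3):
    """One iteration of the Henon map."""
    return 1.0 - a * x * x + y, b * x

def iterate(step, state, n):
    """Return the orbit [state, step(state), ...] of length n + 1."""
    orbit = [state]
    for _ in range(n):
        state = step(*state)
        orbit.append(state)
    return orbit

orbit = iterate(henon, (0.0, 0.0), 2000)
x_max = max(abs(x) for x, y in orbit)          # the orbit stays bounded

# Fixed point of the map: a*x^2 + (1 - b)*x - 1 = 0, y = b*x.
a, b = 1.4, 0.3
x_fix = (-(1.0 - b) + math.sqrt((1.0 - b) ** 2 + 4.0 * a)) / (2.0 * a)
x_next, y_next = henon(x_fix, b * x_fix)
```

Plotting the pairs in `orbit` after discarding the first few hundred points would give a picture of the attractor of the kind shown in Fig. 7.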

5 Verification of the ChaoPhS Program

To verify the tools implemented in ChaoPhS, tests were carried out on well-known models of chaotic systems in the form of:

• recurrent maps: logistic, Henon, Ikeda,
• differential equations (1) of the models: Lorenz, Rössler, Rössler hyperchaos, Chua circuit with smooth nonlinearity.


Fig. 6. Bifurcation diagrams of tested mathematical models (panels: logistic map, Henon map, Ikeda map)

Fig. 7. Attractors of tested models (panels: logistic map, Henon map, Ikeda map, Lorenz system, Rössler system, Chua circuit with smooth nonlinearity)

Fig. 6 presents the bifurcation diagrams of the investigated test maps. All of the diagrams are identical to those obtained in other publications [1,2,10], which confirms the correctness of the calculations performed on iterative maps described by the formula $X_{n+1} = f(X_n)$.

Fig. 7 shows that the systems described by (1) are also simulated correctly and that the implemented methods of solving differential equations work properly. The phase trajectories forming strange attractors are the same as those presented in [1,2,6,7,8,9,10].

To determine the largest Lyapunov exponent, the program draws a chart of the distance between trajectory points as a function of the largest exponent, $\ln d_j(i) = f(\lambda_1(t))$.

The exponent is calculated (using the least squares method in a selected range) on the


Fig. 8. Chart of function $\ln d_j(i) = f(\lambda_1(t))$ for logistic map

Fig. 9. Chart of function $\ln C(r) = f(\ln r)$ for logistic map

basis of the slope of this function. The correlation dimension $D_2$ can be determined from a chart of the logarithm of the correlation integral (an exemplary chart is shown in Fig. 9) as a function of the logarithm of the distance between neighbouring points, $\ln C(r) = f(\ln r)$.

To determine the Lyapunov exponent and the correlation dimension, the time series of just one state variable is sufficient, because the program reconstructs the attractor on its own using the method of delayed time series [12]. Before these quantities are computed, the following parameters must be entered: the embedding dimension of the attractor obtained from the time series, the time delay, the number of time series used in the calculation of the exponent $\lambda_1$ and the dimension $D_2$, and the window size outside of which points are skipped.

Table 1 shows a comparison of the values of the largest Lyapunov exponent presented in publications [15,16,20] with the exponents calculated using ChaoPhS. The compared values are similar.

Table 1. Comparison of values of largest Lyapunov exponents calculated in program ChaoPhS with values presented in publications [15,16,20]

6 Simulation of Simple Power Electronics Converters

We present some results of simulations of practical switching systems on the example of a step-down (buck) converter [17] with different kinds of modulation. Sawtooth and triangle signals were used as the carrier signal in the PWM modulation. The investigations were carried out with the use of ChaoPhS and compared with results obtained with Matlab.


Fig. 10. Main panel of program ChaoPhS

Fig. 11. Frequency spectrum of DC/DC buck converter for control gain K as parameter: $K = 8.4$, $K = 12$, $K = 14.5$, $K = 23$

Figure 10 shows the main panel of ChaoPhS with a scheme of the DC/DC converter and a chart containing a voltage waveform in the steady state of the system.

Figure 11 shows the evolution of the converter's state from the steady state with the T1-periodic orbit, through the T2- and T4-periodic states, up to chaotic operation. One can notice the multiplication of the stripes in the frequency spectrum and, in the final chart, the absence of a dominant frequency.


For a power electronics system with periodic current and voltage waveforms whose frequency equals the frequency of the external signal, that is the frequency of the sawtooth signal in the PWM generator, the largest Lyapunov exponent should be negative, because the trajectories of the system do not diverge. The function $\ln d_j(i) = f(\lambda_1(t))$ must then have a negative slope (Fig. 12a). For a chaotic mode of operation the slope is positive (Fig. 12b).

Fig. 12. Chart for calculation of largest Lyapunov exponent for: a) stable state $K = 8.4$; b) chaos $K = 23$

Fig. 13. Nonlinear phase trajectory – chaotic activity of DC/DC buck converter

Figures 13 and 14 show the phase trajectory and the bifurcation diagram of the investigated converter, which confirm the occurrence of chaotic phenomena when the control parameter (the system gain) is changed.

Figure 15 represents a situation in which two attractors, related to bifurcations, arise during the operation of the system. Fig. 14 compares two diagrams: one obtained using the presented program and one obtained using Matlab. One can notice that:

Fig. 14. Bifurcation diagram of DC/DC buck converter with sawtooth carrier signal in PWM: a) ChaoPhS, b) Matlab

The two diagrams are similar in the range <8,25> of the control parameter. The additional clouds of points on the Matlab diagram result from insufficient filtering of the transient states of the system, a consequence of the long computation time in Matlab compared with the presented program.

Also the functioning of a buck type converter with a triangle carrier signal of the PWM modulation [19] was presented.

Figure 16 shows a panel of ChaoPhS with a diagram of the converter, a window of the phase space chart and a dialog box used to input parameters of the model. The


Fig. 15. a) Two attractors of DC/DC buck converter; b) enlargement of one attractor

picture of the phase space shows that the system is in a state of deterministic chaos. Even though the structure of the converter has not been changed, the region of stable operation occurs in a different range of values of the control parameter $K$. This can be seen by comparing the bifurcation diagrams of both systems presented in Figs. 14 and 18. Additionally, Figure 18 shows how significant the initial conditions of the simulation are. For the control parameter $K = 10$, depending on the initial conditions, the Poincaré section can have one or three stable points.

Figure 17 shows that the structure of the attractor of the buck converter with the triangular carrier signal of the PWM modulation is very complex and differs from that of the system with the sawtooth carrier signal (Fig. 15).

Fig. 16. Main panel of ChaoPhS program for DC/DC buck converter with triangle carrier PWM signal

Fig. 17. Poincaré section for DC/DC buck converter with triangle carrier PWM signal and $K = 23$


Fig. 18. Bifurcation diagram for DC/DC buck converter with triangle carrier PWM signal

Fig. 19. Chart for obtaining largest Lyapunov exponent for: a) stable state $K = 8.4$, b) chaos $K = 23$

7 Conclusions

The paper presents a simulation program, ChaoPhS (Chaotic Phenomena Simulations), intended for investigating deterministic chaos phenomena in various systems, among others in power electronics converters. The program was written in the C++ Builder software development kit using object-oriented programming technology. Thanks to the use of dynamically linked libraries of the studied systems, of the methods of solving the equations that describe them, and of the methods of analysing the chaotic phenomena that occur, the presented program can easily be developed further.

The verification carried out on well-known models of chaotic systems shows that the results obtained with the presented program agree with the results reported in the literature.

In this paper we also introduced several models of power electronics converters whose chaotic properties are an object of research of the authors.


References

[1] Ott, E.: Chaos in dynamical systems (in Polish). WNT, Warszawa (1997)
[2] Schuster, H.G.: Deterministic chaos. An introduction (in Polish). PWN, Warszawa (1995)
[3] Hamill, D.C.: Power electronics: A field rich in nonlinear dynamics. In: Nonlinear Dynamics of Electronic Systems, Dublin (1995)
[4] Hirsch, M.W., Smale, S.: Differential Equations, Dynamical Systems and Linear Algebra. Academic Press, New York (1974)
[5] Banerjee, S., Ranjan, P., Grebogi, C.: Bifurcations in one-dimensional piecewise smooth maps: Theory and applications in switching circuits. IEEE Trans. on Circuits and Systems – I 47(5) (2000)
[6] Lorenz, E.N.: Deterministic nonperiodic flow. J. Atmos. Sci. 20, 130 (1963)
[7] Eckmann, J.-P., Ruelle, D.: Ergodic theory of chaos and strange attractors. Rev. Mod. Phys. 57, 617 (1985)
[8] Rössler, O.E.: An equation for continuous chaos. Phys. Lett. A 57, 397 (1976)
[9] Rössler, O.E.: An equation for hyperchaos. Phys. Lett. A 71, 155 (1979)
[10] Hénon, M.: A two-dimensional mapping with a strange attractor. Comm. Math. Phys. 50, 69 (1976)
[11] Grassberger, P., Procaccia, I.: Characterization of strange attractors. Phys. Rev. Lett. 50, 346 (1983)
[12] Takens, F.: Lecture Notes In Math, vol. 898. Springer, Heidelberg (1981)
[13] Baron, B., Piątek, Ł.: Metody numeryczne w C++ Builder. Helion, Gliwice (2004)
[14] Baron, B.: Układ dynamiczny jako obiekt klasy C++. IC-SPETO, Gliwice-Ustroń (2005)
[15] Wolf, A., Swift, J.B., Swinney, H.L., Vastano, J.A.: Determining Lyapunov exponents from a time series. Physica D 16, 285 (1985)
[16] Rosenstein, M.T., Collins, J.J., De Luca, C.J.: A practical method for calculating largest Lyapunov exponents from small data sets (1992)
[17] Porada, R., Mielczarek, N.: Wstępne badania symulacyjne zachowań chaotycznych w układach energoelektronicznych. ZKwE, Kiekrz (2004)
[18] Porada, R., Mielczarek, N.: Preliminary Analysis of Chaotic Behaviour in Power Electronics. EPNC, Poznań (2004)
[19] Porada, R., Mielczarek, N.: Badania zjawisk chaosu deterministycznego w zamkniętych układach energoelektronicznych. In: IC-SPETO 2005, Gliwice-Ustroń (2005)
[20] Huang, P.J.: Control in Chaos (2000), http://math.arizona.edu/~ura/001/huang.pojen/

Model of a Tribological Sensor Contacting Rotating Disc

Vsevolod Vladimirov and Jacek Wrobel

AGH University of Science and Technology, Al. Mickiewicza 30, 30-059 Cracow, [email protected], [email protected]

Abstract. We study mechanical oscillations of a sensor forming a friction pair with a rotating disc. In the absence of friction the model is described by a two-dimensional Hamiltonian system of ODEs which is completely integrable. As Coulomb-type friction is added, the regimes appearing in the modelling system become more complicated. They are investigated both by qualitative methods and by numerical simulation. With such a synthesis we obtain the complete global behavior of the system within a broad range of values of the driving parameter, for two principal types of the modelling function simulating the Coulomb friction. A sequence of bifurcations (limit cycles, double-limit cycles, homoclinic bifurcations and other regimes) is observed as the driving parameter changes. The pattern of bifurcations depends in an essential way upon the model of the friction force employed, and this dependence is analyzed in detail. Much more complicated regimes appear as we incorporate into the model the one-dimensional oscillations of the rotating element. The system possesses in this case quasiperiodic, multiperiodic and, probably, chaotic solutions.

1 Model Describing the Vertical Rod Which Contacts Rotating Disc

1.1 Statement of the Problem

We consider a modelling system describing oscillations of a vertical rod. Its lower end is fixed, while the upper one contacts a rotating disc. The geometry of the mechanical system is shown in fig. 1. We assume in addition that the disc can perform vertical oscillations. In such circumstances, nonlinear oscillations of the far end of the rod can be described by the following second order equation:

$$\ddot{x} + x^3 - x + f(\dot{x} - \nu)\,[1 + \varepsilon \sin(\omega t)] = 0, \qquad (1)$$

where f is a Coulomb-type friction force, having the properties:

• $f$ is an antisymmetric function
• $f(\nu)$ has a local minimum for some $\nu_0 > 0$.

W. Mitkowski and J. Kacprzyk (Eds.): Model. Dyn. in Processes & Sys., SCI 180, pp. 21–27. springerlink.com © Springer-Verlag Berlin Heidelberg 2009


Fig. 1. Outlook of the mechanical system: A–vertical rod; B–rotating disc

1.2 Local Analysis of the Autonomous Case

At first let us analyze the case $\varepsilon = 0$, for which equation (1) can be rewritten as the following dynamical system:

$$\dot{x} = -y, \qquad \dot{y} = x^3 - x - f(y + \nu). \qquad (2)$$

To begin with, let us note that system (2) is Hamiltonian when $f = 0$ and it is completely described by the Hamiltonian function

$$H(x, y) = \frac{x^4}{4} + \frac{y^2}{2} - \frac{x^2}{2},$$

which is constant on the phase trajectories (see fig. 2).

For $f \neq 0$, all stationary points of system (2) lie on the horizontal axis and have the representation $(x_*, 0)$, where $x_*$ satisfies the equation

$$x^3 - x = f(\nu). \qquad (3)$$

We assume in addition that $f(\nu) \in \left(-\frac{2\sqrt{3}}{9}, \frac{2\sqrt{3}}{9}\right)$. Under this condition the system possesses three stationary points.
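The three stationary points can be found numerically by solving the cubic equation (3) on three bracketing intervals separated by the extrema of $x^3 - x$ at $x = \pm 1/\sqrt{3}$. The sketch below is only an illustration; the value 0.2 standing for $f(\nu)$ and the function names are assumptions.

```python
import math

def bisect_root(p, lo, hi, tol=1e-12):
    """Root of p in [lo, hi], assuming p changes sign on the interval."""
    p_lo = p(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if p(mid) * p_lo > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def stationary_points(c):
    """Roots x_-, x_0, x_+ of x**3 - x = c for |c| < 2*sqrt(3)/9."""
    if abs(c) >= 2.0 * math.sqrt(3.0) / 9.0:
        raise ValueError("three stationary points exist only for |c| < 2*sqrt(3)/9")
    p = lambda x: x ** 3 - x - c
    s = 1.0 / math.sqrt(3.0)          # location of the extrema of x**3 - x
    return (bisect_root(p, -2.0 * s, -s),
            bisect_root(p, -s, s),
            bisect_root(p, s, 2.0 * s))

# Hypothetical value of the friction force at the given velocity: f(nu) = 0.2.
x_minus, x_zero, x_plus = stationary_points(0.2)
residuals = [abs(x ** 3 - x - 0.2) for x in (x_minus, x_zero, x_plus)]
```

For a positive right-hand side the middle root is negative and the outer roots lie on either side of it, one below $-1/\sqrt{3}$ and one above 1.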

It is easy to see that the Jacobi matrix of system (2), corresponding to a stationary point $(x_*(\nu), 0)$, is given by the following formula:

$$J(\nu) = \begin{bmatrix} 0 & -1 \\ 3x_*^2(\nu) - 1 & -f'(\nu) \end{bmatrix}$$

The eigenvalues of the matrix $J(\nu)$ are as follows:

$$\lambda_1 = \frac{1}{2}\left(-f'(\nu) - \sqrt{4 - 12x_*^2(\nu) + f'(\nu)^2}\right),$$

$$\lambda_2 = \frac{1}{2}\left(-f'(\nu) + \sqrt{4 - 12x_*^2(\nu) + f'(\nu)^2}\right).$$
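The eigenvalue formulas can be evaluated directly to classify a stationary point. In the sketch below the function names, the coordinates and the values of $f'(\nu)$ are hypothetical and serve only to illustrate how the sign of $f'(\nu)$ and the value of $x_*$ decide the type of the point.

```python
import cmath

def eigenvalues(x_star, df):
    """Eigenvalues of J(nu) for a stationary point x_star and slope df = f'(nu)."""
    root = cmath.sqrt(4.0 - 12.0 * x_star ** 2 + df ** 2)
    return 0.5 * (-df - root), 0.5 * (-df + root)

def classify(x_star, df):
    """Classify the stationary point from the eigenvalues of the Jacobi matrix."""
    l1, l2 = eigenvalues(x_star, df)
    if l1.imag == 0.0 and l2.imag == 0.0:
        return "saddle" if l1.real * l2.real < 0.0 else "node"
    if l1.real > 0.0:
        return "unstable focus"
    if l1.real < 0.0:
        return "stable focus"
    return "centre"

# Hypothetical data: outer point x_* = 1.088 with f'(nu) = -0.1 or +0.1,
# and middle point x_* = -0.2 with f'(nu) = 0.1.
l1, l2 = eigenvalues(1.088, -0.1)
kinds = [classify(1.088, -0.1), classify(1.088, 0.1), classify(-0.2, 0.1)]
```

The sum and product of the two eigenvalues reproduce the trace $-f'(\nu)$ and the determinant $3x_*^2 - 1$ of the Jacobi matrix, which gives a quick consistency check of the formulas.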


Fig. 2. Phase portrait of system (2) in case f = 0

In accordance with the assumption that $f(-x) = -f(x)$, we analyze the regimes appearing in the system when $\nu > 0$. It follows then from equation (3) that the coordinates $x_-$, $x_0$ and $x_+$ of the critical points lie, respectively, inside the intervals $(-1, -0.577)$, $(-0.577, 0)$, $(1, 1.359)$. For $x_0(\nu) \in (-0.577, 0)$ the eigenvalues are real and have opposite signs. The eigenvalues corresponding to the critical points $x_-(\nu) \in (-1, -0.577)$ and $x_+(\nu) \in (1, 1.359)$ are complex. They satisfy the inequalities

$$\mathrm{Re}[\lambda^{\pm}_{1,2}] > 0 \quad \text{when } \nu < \nu_0,$$

$$\mathrm{Re}[\lambda^{\pm}_{1,2}] < 0 \quad \text{when } \nu > \nu_0.$$

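So in the vicinity of the value $\nu = \nu_0$ the creation of limit cycles occurs. In order to study their types we should calculate the real part of the first Floquet index $C_1(\nu_0)$ [1]. Luckily, we can do it for both of the critical points $A_\pm = (x_\pm(\nu_0), 0)$ simultaneously. Performing the change of variables

$$x = -\frac{W}{\Omega} + x_\pm, \qquad (4)$$

$$y = U, \qquad (5)$$

where $\Omega = \sqrt{3x_\pm^2 - 1}$, we get the canonical representation

$$\begin{bmatrix} \dot{U} \\ \dot{W} \end{bmatrix} = \begin{bmatrix} 0 & -\Omega \\ \Omega & 0 \end{bmatrix} \begin{bmatrix} U \\ W \end{bmatrix} + \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}, \qquad (6)$$

where

$$h_1 = \frac{3W^2 x_\pm}{\Omega^2} - \frac{W^3}{\Omega^3} - \left[ f''(\nu_0) U^2 + f'''(\nu_0) U^3 \right] + o(U^3),$$

$$h_2 = 0.$$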


Using the well-known formula (see e.g. [1]), we obtain that

$$\mathrm{Re}(C_1(\nu_0)) = -3 f'''(\nu_0)/8.$$

From this we conclude that the stability type of the pair of limit cycles depends completely on the sign of $f'''(\nu_0)$. If $f'''(\nu_0) > 0$, then stable limit cycles appear when $\nu < \nu_0$. Conversely, for $f'''(\nu_0) < 0$ unstable limit cycles appear when $\nu > \nu_0$.

1.3 Global Behavior of the Autonomous System

Above we have shown that the stability types of the periodic trajectories arising in system (2) depend merely upon the sign of $f'''(\nu_0)$. Now we are going to present the global behavior of this system and its dependence upon the parameter $\nu$ and the type of the modelling function $f$ simulating the Coulomb friction. We use the following approximation for $f$:

$$f(\nu) = \begin{cases} \varphi(\nu) & \text{when } \nu \in (0, \nu_1), \\ k \arctan(\nu - \nu_1) + \varphi(\nu_1) & \text{when } \nu > \nu_1, \end{cases}$$

where $\varphi(\nu) = a\nu^4 + b\nu^3 + c\nu^2 + d\nu + e$.

Numerical simulations show that, depending on the sign of $f'''(\nu_0)$, there are two types of global behavior, as illustrated in figs. 3 and 4, while the remaining peculiarities of the function $f$ seem to be unimportant. The global phase portraits presented here could serve as a basis for predicting the qualitative behavior of the autonomous system (2) in a broad range of values of the parameter $\nu$.
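The piecewise approximation of the friction force can be sketched as a small function factory. The coefficient values below are purely hypothetical placeholders (the text does not give them), chosen so that $\varphi$ has a local minimum at a positive argument; the antisymmetric extension to negative arguments follows the assumed property $f(-\nu) = -f(\nu)$, and the value at zero is set to 0 by convention.

```python
import math

def make_friction(a, b, c, d, e, k, nu1):
    """Build f: polynomial phi on (0, nu1], arctan branch above nu1, odd extension."""
    def phi(v):
        return a * v ** 4 + b * v ** 3 + c * v ** 2 + d * v + e

    def f(v):
        if v < 0.0:
            return -f(-v)             # antisymmetry f(-v) = -f(v)
        if v == 0.0:
            return 0.0                # convention assumed for this sketch
        if v <= nu1:
            return phi(v)
        return k * math.atan(v - nu1) + phi(nu1)

    return f

# Hypothetical coefficients: phi(v) = v**2 - v + 0.5 has its minimum at v = 0.5.
f = make_friction(a=0.0, b=0.0, c=1.0, d=-1.0, e=0.5, k=0.2, nu1=1.0)
jump_at_nu1 = abs(f(1.0 - 1e-9) - f(1.0 + 1e-9))   # the two branches join continuously
```

Because the arctan branch starts from $\varphi(\nu_1)$, the function is continuous at $\nu_1$ for any choice of the coefficients.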

1.4 Non-autonomous Case

In the general case equation (1) can be presented as the following dynamical system:

$$\dot{x} = -y, \qquad \dot{y} = x^3 - x - f(y + \nu)\,[1 + \varepsilon \sin(\omega t)]. \qquad (7)$$

Fig. 3. Qualitative changes of phase portrait of system (2), case $f'''(\nu_0) > 0$

Fig. 4. Qualitative changes of phase portrait of system (2), case $f'''(\nu_0) < 0$

Fig. 5. Bifurcation diagrams of system (7), obtained for ε = 0.2, and increasing ν


In what follows, we assume that $\varepsilon \in (0, 1]$. Numerical experiments show that the behavior of system (7) does not differ from that of system (2) when $\varepsilon \ll 1$.

There are also no significant changes when $\varepsilon$ is of the order of unity but $\nu > \nu_0 + d$, i.e. in those cases when the critical points $(x_\pm, 0)$ of the autonomous system are stable foci. But the behavior of system (7) differs drastically from that of (2) when $\nu \in (0, \nu_0 + d)$. Qualitative changes of the non-autonomous system, studied with the help of the Poincaré section technique [1], are shown in figs. 5–10. They present the results of numerical simulations in which



Fig. 8. Bifurcation diagrams of system (7), obtained for ε = 0.2, and decreasing ν




the driving parameter $\nu$ either grows or decreases. All figures present the results of the simulation for the case $f'''(\nu_0) < 0$.

2 Concluding Remarks

A brief presentation of the global analysis of equation (1) shows that even the autonomous case exhibits very rich behavior within the parameter range $\nu \in (0, \nu_0 + d]$ for some $d > 0$. The qualitative features of the phase trajectories depend merely on the sign of $f'''(\nu_0)$ and seem not to be sensitive to the other details of the modelling function $f$ representing the Coulomb-type friction.

The variety of solutions becomes much richer when the term describing the vertical oscillations is incorporated. Analyzing the qualitative features of the solutions, one can see that they become more and more complicated, depending on the value of the parameter $\varepsilon$. As this parameter grows, system (7) demonstrates periodic, quasiperiodic and multiperiodic regimes, period doubling cascades and, probably, chaotic oscillations. Let us note, however, that this is the case when $\nu \in (0, \nu_0 + d]$, because for sufficiently large values of the velocity, lying beyond this interval, all motions in the system become asymptotically stable and tend, depending on the initial values, to either $(x_+, 0)$ or $(x_-, 0)$.

Acknowledgements

The authors are very indebted to Dr T. Habdank-Wojewodzki for acquainting them with his experimental results and for valuable suggestions.

Reference

[1] Guckenheimer, J., Holmes, P.: Nonlinear Oscillations, Dynamical Systems and Bifurcations of Vector Fields. Springer, New York (1987)


The Bifurcations and Chaotic Oscillations in Electric Circuits with Arc

V. Sydorets

Paton Welding Institute Bozhenko 11 Kiev, Ukraine [email protected]

Abstract. Autonomous electric circuits with an arc, governed by three ordinary differential equations, were investigated. Under variation of two parameters we observed many kinds of bifurcations and periodic and chaotic behaviors of this system. The bifurcation diagrams were constructed and studied in detail. Routes to chaos were classified. Three basic patterns of bifurcation diagrams were observed, possessing the properties of (i) softness and reversibility; (ii) stiffness and irreversibility; (iii) stiffness and reversibility.

1 Introduction

In recent years the investigation of nonlinear dissipative dynamical systems has developed rapidly. Fundamental results have been obtained, one of which is the discovery of deterministic chaos in various mechanical, physical, chemical, biological, and ecological systems. The same phenomena were found in electrical engineering. They were studied in detail by L. Chua [1] and V. Anishchenko [2].

A classical nonlinearity, the electric arc in electric circuits, remains insufficiently researched. The author has tried to make up for this deficiency, all the more so since a mathematical model of the dynamic electric arc was proposed by I. Pentegov and, jointly with the author, improved and used in many applications [3].

As preliminary investigations have shown [4], deterministic chaos can emerge in an electric circuit with an arc.

The bifurcation phenomenon is of cardinal importance in nonlinear systems. Under variation of two parameters, many kinds of bifurcations and of periodic and chaotic regimes can be observed:

- Hopf bifurcation (supercritical or subcritical);
- Bifurcation of twin limit cycles (stable and unstable);
- Infinite cascade of period doubling bifurcations with transition to chaos;
- Finite cascade of period doubling bifurcations with or without transition to chaos;
- Reverse cascade of period doubling bifurcations;
- Intermittency;
- Crisis of attractor;
- Overlapping of attractor basins that leads to metastable chaos and isolated regimes.


A powerful tool for investigating nonlinear dissipative dynamic systems is the construction of one- and two-parameter bifurcation diagrams [5-10]. One-parameter bifurcation diagrams are very well suited for investigating the routes by which chaos develops. Two-parameter diagrams make it possible to generalize these results and to reveal a set of universal structures.

An electric circuit with arc is a fairly simple and convenient system for investigation because it possesses a rich collection of periodic and chaotic regimes [5-7].

In spite of the complexity and variety of the bifurcation diagrams of an electric circuit with arc, we could find several typical patterns. The classification was carried out with respect to two important properties: softness or stiffness of the onset of chaos or periodicity; reversibility or irreversibility of the process as the bifurcation parameter rises or falls.

In this classification the pattern type does not depend on the concrete bifurcation that causes it, and it also possesses self-similarity, which is a characteristic feature of embedded patterns in self-organization.

2 Electric Circuits with Arc

Eight electric circuits with arc, depicted in Fig. 1, were investigated. It is easy to show that the processes in the circuits of Fig. 1e, 1f, 1g, 1h are similar to those in the circuits of Fig. 1a, 1b, 1c, 1d respectively. In circuits 1d and 1h oscillations never exist. In circuits 1b and 1f the oscillations are only periodic. Periodic and chaotic oscillations are observed in circuits 1a, 1e, 1c, and 1g. Therefore the processes taking place in circuit 1a will be described.

According to the generalized model of the arc [3], the arc is considered as a part of an electric circuit. The voltage across this part is

\[ u_A = \frac{U(i_\theta)}{i_\theta}\, i, \tag{1} \]

where i is the arc current, U(i) is the static volt-ampere characteristic of the arc, and i_θ is the state current of the arc [3].

The dimensionless system of differential equations describing the circuit depicted in Fig. 1a consists of two Kirchhoff equations, for the loop and for the node, and the arc model equation:

\[
\begin{aligned}
\dot{x} &= \frac{1}{L}\left( y - x z^{\frac{n-1}{2}} \right);\\
\dot{y} &= \frac{1}{RC}\left( R + 1 - y - R x \right);\\
\dot{z} &= x^{2} - z,
\end{aligned}
\tag{2}
\]

where R, L, C are the resistance, inductance and capacitance of the electric circuit; n is the exponent in the approximation of the static volt-ampere characteristic of the arc; x, y, z are the dimensionless reactor current, capacitor voltage, and square of the arc state current respectively.
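A minimal numerical sketch of system (2) may help to reproduce phase portraits such as those in Figs. 3-11. It assumes that system (2) has the form ẋ = (y − x z^((n−1)/2))/L, ẏ = (R + 1 − y − R x)/(RC), ż = x² − z, which has the fixed point (1, 1, 1) quoted in the caption of Fig. 3. The exponent value N_ASSUMED, the step size and the function names are illustrative assumptions; the text does not state the value of n that was used.

```python
# Fixed-step RK4 integration of the assumed form of system (2).
N_ASSUMED = -0.25  # hypothetical exponent of a falling characteristic (not from the text)


def arc_rhs(state, R, L, C, n):
    """Right-hand side of the assumed dimensionless arc-circuit system."""
    x, y, z = state
    dx = (y - x * z ** ((n - 1) / 2)) / L
    dy = (R + 1 - y - R * x) / (R * C)
    dz = x * x - z
    return (dx, dy, dz)


def rk4_step(state, dt, R, L, C, n):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = arc_rhs(state, R, L, C, n)
    k2 = arc_rhs(shifted(k1, dt / 2), R, L, C, n)
    k3 = arc_rhs(shifted(k2, dt / 2), R, L, C, n)
    k4 = arc_rhs(shifted(k3, dt), R, L, C, n)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(state, R, L, C, n=N_ASSUMED, dt=0.01, steps=1000):
    """Return the list of states visited, including the initial one."""
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(state, dt, R, L, C, n)
        trajectory.append(state)
    return trajectory
```

Plotting the x and z components of `simulate((1.05, 1.0, 1.0), R=15, L=1.0, C=2.7)` after discarding the transient would give a projection comparable to Fig. 3, provided the assumed exponent is close to the one actually used.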


Fig. 1. Eight electric circuits with arc

When the static volt-ampere characteristic of the arc is falling, two fixed points are present; their coordinates may be found by setting the right-hand sides of system (2) equal to zero.
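As an illustration of how the second fixed point can be located, the sketch below assumes that system (2) has the form ẋ = (y − x z^((n−1)/2))/L, ẏ = (R + 1 − y − R x)/(RC), ż = x² − z. Setting the right-hand sides to zero then gives z = x², y = xⁿ (for x > 0) and xⁿ + R x = R + 1. One root is x = 1; for a falling characteristic (n < 0) with R + n > 0 a second root lies in the interval (0, 1). The values R = 15 and n = −0.25 and all function names are illustrative assumptions.

```python
# Locate the second fixed point of the assumed system (2) by bisection.

def fixed_point_residual(x, R, n):
    """g(x) = x**n + R*x - (R + 1); zeros of g are fixed-point x-coordinates."""
    return x ** n + R * x - (R + 1)


def bisect(f, lo, hi, tol=1e-14, max_iter=200):
    """Plain bisection on a bracket [lo, hi] where f changes sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("f does not change sign on the bracket")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def second_fixed_point(R, n):
    """Return (x, y, z) of the fixed point with 0 < x < 1 (requires n < 0)."""
    x = bisect(lambda v: fixed_point_residual(v, R, n), 1e-9, 1.0 - 1e-6)
    return (x, x ** n, x * x)
```

With these assumed values the second fixed point corresponds to a very small current and a capacitor voltage close to R + 1.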

The only condition that can be obtained analytically is the Hopf bifurcation condition [11]. For this purpose we linearize system (2) near the fixed point at which the Kaufman condition holds.

One of the Hopf bifurcation conditions coincides with the condition that the real part of a pair of complex-conjugate roots of the characteristic polynomial equals zero:

\[ (RLC + RC + L)(L + R + 1 + nRC) = RLC\,(R + n). \tag{3} \]
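Condition (3) can be checked numerically. The sketch below assumes that linearizing system (2), taken in the form ẋ = (y − x z^((n−1)/2))/L, ẏ = (R + 1 − y − R x)/(RC), ż = x² − z, at the fixed point (1, 1, 1) gives the characteristic polynomial RLC λ³ + (RLC + RC + L) λ² + (L + R + 1 + nRC) λ + (R + n) = 0. A pair of purely imaginary roots then requires a2·a1 = a3·a0. The exponent n = −0.25, the bracket for C and the function names are illustrative assumptions.

```python
# Hopf condition for the assumed linearization of system (2) at (1, 1, 1).

def char_poly_coeffs(R, L, C, n):
    """Coefficients (a3, a2, a1, a0) of a3*s^3 + a2*s^2 + a1*s + a0."""
    a3 = R * L * C
    a2 = R * L * C + R * C + L
    a1 = L + R + 1 + n * R * C
    a0 = R + n
    return a3, a2, a1, a0


def hopf_residual(R, L, C, n):
    """Zero exactly when a2*a1 = a3*a0, i.e. when condition (3) holds."""
    a3, a2, a1, a0 = char_poly_coeffs(R, L, C, n)
    return a2 * a1 - a3 * a0


def hopf_capacitance(R, L, n, c_lo, c_hi, tol=1e-12):
    """Bisection for the value of C on the Hopf curve within [c_lo, c_hi]."""
    f_lo = hopf_residual(R, L, c_lo, n)
    if f_lo * hopf_residual(R, L, c_hi, n) > 0:
        raise ValueError("residual does not change sign on the bracket")
    while c_hi - c_lo > tol:
        mid = 0.5 * (c_lo + c_hi)
        f_mid = hopf_residual(R, L, mid, n)
        if f_lo * f_mid <= 0:
            c_hi = mid
        else:
            c_lo, f_lo = mid, f_mid
    return 0.5 * (c_lo + c_hi)
```

For R = 15, L = 1 and the assumed n = −0.25, `hopf_capacitance(15, 1.0, -0.25, 1.0, 5.0)` returns a value of about 2.59. This number depends entirely on the assumed exponent and is not a result quoted in the text.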

The basic distinction of the Hopf bifurcation in the circuit considered is that it may be supercritical as well as subcritical. Thus local instability may arise either through the separation of a stable limit cycle from the focus or through the merging of the focus with an unstable limit cycle.

The Hopf bifurcation curve (see Fig. 2) in the parameter plane (L, C), defined by formula (3), has a minimum. It turned out that on the side of small L, up to the minimum (for R = 15, Lm = 2.7924741181414), the bifurcation is supercritical, and beyond the minimum it is subcritical. The curve of the twin cycle (tangent) bifurcation joins the Hopf curve at the point where the kind of Hopf bifurcation changes. Its location was determined more exactly. The twin cycle bifurcation curve lies below the Hopf bifurcation curve. Thus, under variation of parameter C the system develops according to different scenarios depending on the value of the fixed parameter L.

Fig. 2. The curves of Hopf bifurcation in the (L, C) plane for R = 0.5, 1.5, 5, 15, 50, and ∞

Fig. 3. Oscillations with period 1T as a result of Hopf bifurcation (R = 15, L = 1, C = 2.7). Projection of the phase portrait onto the plane (x, z). Fixed point: (1, 1, 1).

For instance, the case R = 15 will be described. At small L (L < Lm), as C rises, the Hopf bifurcation occurs with the appearance of a stable limit cycle (Fig. 3). A further rise of C leads to a period doubling bifurcation: the period-1 limit cycle becomes unstable and a stable period-2 cycle appears (Fig. 4). Self-oscillations with half the frequency are established in the system. Then a cascade of period doubling bifurcations follows, producing cycles of period four, eight, sixteen, etc. (Figs. 5-7).

3 Period Doubling Bifurcations

As is well known [2], the period doubling bifurcation cascade is one of the scenarios of transition from an ordinary attractor to a strange one, i.e. the transition from periodic self-oscillations to chaotic ones.


Fig. 4. Cascade of period doubling bifurcations. Oscillations with period 2T (R = 15, L = 1, C = 2.9; projection onto the plane (x, z)).

[Figure: phase portrait in the plane (x, z) for R = 15, L = 1, C = 3.025]

[Figure: phase portrait in the plane (x, z) for R = 15, L = 1, C = 3.035]


[Figure: phase portrait in the plane (x, z) for R = 15, L = 1, C = 3.0385]

Fig. 8. Chaotic oscillations – strange attractor (R = 15, L = 1, C = 3.19)

Fig. 9. Periodic window in chaos. Oscillations with period 5T (R = 15, L = 1, C = 3.088).

[Figure: phase portrait in the plane (x, z) for R = 15, L = 1, C = 3.13]

[Figure: phase portrait in the plane (x, z) for R = 15, L = 1, C = 3.2]

In fact, at a certain value of C chaotic self-oscillations appear in the system (Fig. 8), and the attractor becomes strange. Its strangeness consists in the fact that every trajectory on it is unstable in the Lyapunov sense while the attractor itself is stable in the Poisson sense. The strong dependence of the solution on the initial conditions demonstrates the instability in the Lyapunov sense. Whereas in a periodic regime two trajectories with close initial conditions converge, in a chaotic regime they diverge; the oscillations nevertheless remain stable because the system is characterized by an overall contraction of phase volume (the divergence of the system is negative).
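The sign of the divergence can be verified directly. The short derivation below assumes that system (2) has the form ẋ = (y − x z^((n−1)/2))/L, ẏ = (R + 1 − y − R x)/(RC), ż = x² − z, with positive R, L, C and z > 0 (z is the square of the arc state current):

\[
\operatorname{div} F
= \frac{\partial \dot{x}}{\partial x} + \frac{\partial \dot{y}}{\partial y} + \frac{\partial \dot{z}}{\partial z}
= -\frac{z^{\frac{n-1}{2}}}{L} - \frac{1}{RC} - 1 < 0 .
\]

Under these assumptions every term is negative, so any phase volume contracts along the flow at all points with z > 0, whatever the values of the parameters.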

The structure of chaos is non-homogeneous. Windows of periodicity are observed in the chaotic region. For L = 1 they qualitatively coincide with the windows of periodicity of the logistic map [2]: first comes a window in which the periodic oscillations have period six, then a period-five window (see Fig. 9), then a wide period-three window (see Fig. 10). As a result of a period doubling bifurcation a window with period six appears. At other values of L, windows of periodicity with periods 3, 4 (see Fig. 11), 6, 9, and 12 occur.

At large values of parameter L (L > Lm) the scenario of oscillation development in the system differs greatly from the scenario described above. The development is initiated by the twin limit cycle bifurcation, as a result of which a stable and an unstable limit cycle appear stiffly.

So two attractors coexist in the system: a stable fixed point and a stable limit cycle. As C increases further, these attractors develop independently. The limit cycle undergoes a period doubling bifurcation cascade, after which chaotic oscillations appear. The stable fixed point undergoes the subcritical Hopf bifurcation, and as a result, depending on parameter values, either periodic oscillations with period two or chaotic oscillations may appear stiffly. From the bifurcation diagram one can see that, depending on the initial conditions, the transient tends to different attractors: either to a limit cycle or to a strange attractor. The basins of attraction are separated by the unstable limit cycle.

4 Bifurcation Diagrams

For a more detailed study of the scenarios of chaos development, many researchers employ the technique of constructing a single-parameter bifurcation diagram. The values of the varied parameter are plotted on the abscissa axis, and one of the coordinates of the Poincaré section points on the ordinate axis. As the section surface we choose the part of the surface

\[ x^{2} - z = 0, \tag{4} \]

where x > 1. Judging by the third equation of system (2), the Poincaré section points will be the oscillation maxima of variable z.
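The construction of one column of such a diagram can be sketched as follows. The code assumes the form ẋ = (y − x z^((n−1)/2))/L, ẏ = (R + 1 − y − R x)/(RC), ż = x² − z for system (2) and an illustrative exponent N_ASSUMED; both are assumptions, as are the function names and step sizes. Maxima of z are detected as crossings of x² − z from positive to non-positive values with x > 1, and the value of z at the crossing is linearly interpolated.

```python
# Poincare-section sampling on x^2 - z = 0, x > 1 (maxima of z).
N_ASSUMED = -0.25  # hypothetical exponent, not given in the text


def arc_rhs(s, R, L, C, n):
    x, y, z = s
    return ((y - x * z ** ((n - 1) / 2)) / L,
            (R + 1 - y - R * x) / (R * C),
            x * x - z)


def rk4_step(s, dt, R, L, C, n):
    def shifted(k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = arc_rhs(s, R, L, C, n)
    k2 = arc_rhs(shifted(k1, dt / 2), R, L, C, n)
    k3 = arc_rhs(shifted(k2, dt / 2), R, L, C, n)
    k4 = arc_rhs(shifted(k3, dt), R, L, C, n)
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))


def trajectory(s, R, L, C, n=N_ASSUMED, dt=0.005, steps=20000):
    out = [s]
    for _ in range(steps):
        s = rk4_step(s, dt, R, L, C, n)
        out.append(s)
    return out


def section_maxima(traj):
    """z-values where x^2 - z crosses from > 0 to <= 0 with x > 1."""
    maxima = []
    for prev, cur in zip(traj, traj[1:]):
        g_prev = prev[0] ** 2 - prev[2]
        g_cur = cur[0] ** 2 - cur[2]
        if g_prev > 0 >= g_cur and cur[0] > 1:
            w = g_prev / (g_prev - g_cur)  # linear interpolation weight
            maxima.append(prev[2] + w * (cur[2] - prev[2]))
    return maxima
```

A bifurcation diagram is then obtained by repeating this for a range of C values, discarding the transient part of each trajectory, and plotting the remaining maxima against C. In the sense of the ordinary physical experiment, the final state for one value of C would serve as the initial state for the next.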

Fig. 12 shows the bifurcation diagram for L = 1 with parameter C varying from 2.8 to 3.4. All stages of the scenario described above are clearly visible in it, as are the periodic windows in chaos. The emergence of periodic oscillations within the chaotic region may be considered a self-organization process. The cause and mechanism of their appearance are therefore of interest.

For instance, at L = 1 the evolution of the strange attractor is clearly visible. At the beginning the chaotic state spreads among neighboring orbits of periodic oscillations and the strange attractor has a strip structure. Narrow strips merge into wider ones as a result of a "reverse" period doubling bifurcation cascade, i.e. in the order 2^k, 2^(k-1), ..., 16, 8, 4, 2, 1. After the last "reverse" period doubling bifurcation the strange attractor densely covers a part of phase space and has the structure of the so-called screw strange attractor.

In Fig. 12 periodicity windows with period 5 (C = 3.088..3.090), 3 (C = 3.123..3.145, wide window), 4 (C = 3.200..3.210), 3 (C = 3.3355), and 1 (C = 3.3800) are marked. The wide window with period 3 is present on almost all bifurcation diagrams that contain a regime of developed chaos (this fact was noted in [12]). It begins with a stiff destruction of chaos and ends with a period doubling cascade (i.e. 3, 6, 12, ..., 3·2^k).

However, on the bifurcation diagram at L = 0.3 (Fig. 13), for example, two windows with period 3 are observed. The development scenario of the first window coincides with the one described above, but the development scenario of the second window is reversed.

The periodic window 2·3 (C = 4.1204..4.1586) on the bifurcation diagram at L = 0.2 (Fig. 14) both appears stiffly and is destroyed stiffly. The window with period 2·2 (C = 4.204..4.205) is of particular interest, since two attractors coexist in it and, depending on the initial conditions, either of them can be realized.

Analysis of the scenarios described above and their comparison with well-known approaches show that they coincide with the Feigenbaum scenario, especially in the pre-chaotic regimes (the period doubling bifurcation cascade). Distinctions are observed in the chaotic regimes. There are parameter values (for instance L = 0.1 in Fig. 15) at which the number of period doubling bifurcations is finite and chaotic regimes do not arise.

At L = 0.11 an infinite period doubling bifurcation cascade occurs. However, it adjoins another cascade occurring in the reverse direction, through which the chaotic oscillations pass back to periodic ones (see Fig. 16).

Fig. 12. Bifurcation diagram at L = 1

Fig. 13. Bifurcation diagram at L = 0.3

5 Self-similarity and Scaling Invariance of Bifurcation Diagram

In spite of the complexity and variety of the bifurcation diagrams for an electric circuit with arc, several typical patterns are found. The classification was carried out with respect to two important properties: (a) softness or stiffness of the onset of chaos or periodicity; (b) reversibility or irreversibility of the process as the bifurcation parameter rises or falls.

In such a classification the type of pattern does not depend on the concrete bifurcation that causes it, and it possesses self-similarity, which is typical of embedded structures.

The property of reversibility is important when carrying out ordinary physical experiments. In what follows we call an experiment an ordinary physical experiment if the dynamic system is observed under a sufficiently smooth change of the bifurcation parameter. In this case the final values of the variables for one parameter value are the initial values for the next parameter value. Ordinary physical experiments thus include the observation of most natural phenomena, most physical experiments in which the initial conditions of the variables are not set in a particular way, and numerical experiments that simulate an ordinary physical experiment.

In the studied system only three basic patterns were revealed, possessing the following properties:

(i) softness and reversibility (Fig. 17);
(ii) stiffness and irreversibility (Fig. 17);
(iii) stiffness and reversibility (Fig. 18).

Pattern (i) is the well-known and widespread pattern of period doubling bifurcations. It can start either with a supercritical Hopf bifurcation (as, for instance, in the studied system at small values of parameter L) or with a period doubling bifurcation when it is an embedded structure (for instance, every subsequent branch of the bifurcation tree in Figs. 12-16 is similar to the previous one). The softness property means that all periodic oscillations born in period doubling bifurcations appear with zero amplitude, and that at the accumulation point of the period doubling bifurcations, although the transition to chaos is considered to take place there, the power of the chaotic component equals zero. When the bifurcation parameter changes in the reverse direction, the processes occur in reverse order. The continuation of pattern (i) into the chaotic region is a cascade of so-called 'reverse' bifurcations, in which narrow chaotic strips merge into wider ones. 'Reverse' bifurcations also possess the properties of softness and reversibility.

Fig. 17. Patterns (i) and (ii) (pattern (ii) shows hysteresis)

Fig. 18. Patterns (iii) at ordinary and special physical experiments (marked features: metastable chaos, crisis, isolated region)

Pattern (ii) differs from pattern (i) in that, for certain values of the bifurcation parameter, the system is bistable: two attractors (stable motions) coexist in it. A repeller (unstable motion), which forms the boundary of the attractor basins, is located between them. The system motion coincides with one of the attractors while the development of the other attractor proceeds unobserved. At the edges of the bistable zone the repeller merges with one of the attractors and they annihilate each other, which shows up as a jump to the remaining attractor. This phenomenon is known as hysteresis. It should be emphasized that the jumps at the two edges occur in different directions.

As the bifurcation parameter increases (at L > Lm), chaos appears stiffly in the system as a result of a period doubling bifurcation. In other cases a stiff appearance (appearance with nonzero amplitude) of periodic oscillations is possible.

As the bifurcation parameter rises further, the development of chaos in patterns (ii) and (i) coincides. However, if the bifurcation parameter then starts to fall, the irreversibility of pattern (ii) shows itself. The process follows another path: regimes that did not appear while the parameter was rising now appear. The cascade of period doubling bifurcations is observed in reverse order. The last bifurcation, at which the attractor disappears, is the tangent bifurcation of the stable and unstable cycles.

It should be noted that although in pattern (ii) not all regimes show themselves at once, they can in principle be revealed by an ordinary physical experiment with a rising and falling bifurcation parameter.

Pattern (iii) outwardly resembles pattern (ii) but has essential distinctions. The boundary of the strange attractor intersects the repeller, and the basins of the two attractors overlap. This phenomenon is called a crisis of the strange attractor. In the competition between the two attractors, the chaotic attractor undergoing the crisis loses its attracting properties.

A jump to the periodic (more stable) attractor occurs and a zone of so-called metastable chaos appears. The attracting properties are restored only when the repeller disappears by merging with the periodic attractor in a tangent bifurcation, which coincides with the second crisis.

An ordinary physical experiment in the presence of pattern (iii) looks as follows. If the bifurcation parameter rises, the oscillations in the system coincide with the periodic attractor, and at the tangent bifurcation point developed chaotic oscillations appear stiffly. As the bifurcation parameter decreases, at the crisis point the developed chaotic oscillations stiffly become periodic ones. Thus the development of chaos is stiff and reversible.

Special attention should be paid to the fact that pattern (iii) has regimes which cannot be revealed by an ordinary physical experiment. They can therefore be called 'isolated' regimes. They are bounded on one side by the tangent bifurcation and on the other side by the strange attractor crisis.

Why are they impossible to reveal? The reason is that the jumps occurring in the system are directed to the same side (unlike hysteresis, where the jumps are directed to different sides), so it is not possible to enter this region of regimes in a natural way. This can be done only by a special physical experiment, in which the initial conditions can be preset or the parameter value is changed very fast, or by modifying the studied system through superposition of pulses. In nature these regimes may show themselves as a result of some extreme (extraordinary) events. However, even if isolated regimes do occur in the system, any change of the parameter leads to a transition into the region of simple regimes.

Although isolated regimes are a rather exotic phenomenon, they are important for studying the properties of the chaotic oscillations that appear in pattern (iii). It turns out that the properties of chaos in this case are determined by the cascade of period doubling bifurcations that occurs in the isolated region, because this is the same attractor. Although in the metastable chaos region the attractor loses its attracting properties, its development continues.

Knowledge of the properties of isolated regimes helps to reveal them on the bifurcation diagrams of the studied system (see the enlarged insets in Figs. 14-16).

6 Quantitative Estimations

Feigenbaum [2] established that the cascade of period doubling bifurcations possesses not only qualitative but also quantitative universal properties. It turned out that the intervals between successive bifurcation values of the parameter form, asymptotically, a geometric progression whose ratio is determined by the universal constant δ, i.e. a value independent of the kind of nonlinear system.

For the studied system the value

δ = 4.669220751009

was obtained already at the bifurcation to the cycle of period 64. It contains five correct significant digits, which confirms the universality.
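The estimate of δ from a sequence of period doubling thresholds can be sketched as below. Since the text does not list the bifurcation values of C for the arc circuit, the sketch uses the well-known period doubling thresholds of the logistic map as illustrative input; the function name and the list LOGISTIC_R are assumptions introduced for this example only.

```python
# Successive-ratio estimates of the Feigenbaum constant.

def feigenbaum_estimates(bifurcation_values):
    """Ratios (p[k] - p[k-1]) / (p[k+1] - p[k]) of consecutive threshold gaps."""
    p = bifurcation_values
    return [(b - a) / (c - b) for a, b, c in zip(p, p[1:], p[2:])]


# Thresholds of the logistic map x -> r*x*(1 - x) at which cycles of
# period 2, 4, 8, 16 and 32 are born (illustrative data, not the arc circuit).
LOGISTIC_R = [3.0, 3.4494897, 3.5440903, 3.5644073, 3.5687594]

estimates = feigenbaum_estimates(LOGISTIC_R)
```

The ratios approach δ ≈ 4.6692 as more doublings are included. For the arc circuit the same function would be applied to the values of C at which cycles of period 2, 4, 8, ... are born.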

7 Conclusions

Electric circuits with arc possess an abundance of periodic and chaotic behaviour. Investigation of these circuits may be useful because their properties are universal and can be applied to other nonlinear dynamical systems.

References

1. Syuan, W.: Family of Chua’s circuits. Trans. IEEE. 75(8), 55–65 (1987)

2. Anishchenko, V.S.: Complicated oscillation in simple, 312 p. Nauka, Moscow (1990) (in Russian)

3. Pentegov, I.V., Sidorets, V.N.: Energy parameters in mathematical model of dynamical welding arc. Automaticheskaya svarka 11, 36–40 (1988) (in Russian)

4. Sidorets, V.N., Pentegov, I.V.: Chaotic oscillations in RLC circuit with electric arc. Doklady AN Ukrainy, vol. 10, pp. 87–90 (1992) (in Russian)


5. Sidorets, V.N., Pentegov, I.V.: Appearance and structure of strange attractor in RLC circuit with electric arc. Technicheskaya electrodynamica 2, 28–32 (1993) (in Russian)

6. Sidorets, V.N., Pentegov, I.V.: Deterministic chaos development scenarios in electric circuit with arc. Ukrainian physical journal 39(11-12), 1080–1083 (1994) (in Ukrainian)

7. Sidorets, V.N.: Structures of bifurcation diagrams for electric circuit with arc. Technichna electrodynamica 6, 15–18 (1998)

8. Vladimirov, V.A., Sidorets, V.N.: On the Peculiarities of Stochastic Invariant Solutions of a Hydrodynamic System Accounting for Non-local Effects. Symmetry in Nonlinear Mathematical Physics 2, 409–417 (1997)

9. Vladimirov, V.A., Sidorets, V.N.: On Stochastic Self Oscillation Solutions of Nonlinear Hydrodynamic Model of Continuum Accounting for Relaxation Effects. Dopovidi Nacionalnoyi akademiyi nauk Ukrayiny 2, 126–131 (1999) (in Russian)

10. Vladimirov, V.A., Sidorets, V.N., Skurativskii, S.I.: Complicated Travelling Wave Solutions of a Modelling System Describing Media with Memory and Spatial Nonlocality. Reports on Mathematical Physics 41(1/2), 275–282 (1999)

11. Sidorets, V.N.: Feature of analyses eigenvalues of mathematical models of nonlinear electrical circuits. Electronnoe modelirovanie 20(5), 60–71 (1998) (in Russian)

12. Li, T., Yorke, J.A.: Period Three Implies Chaos. American Math. Monthly 82, 985–991 (1975)


Soft Computing Models for Intelligent Control of Non-linear Dynamical Systems

Oscar Castillo and Patricia Melin

Division of Graduate Studies and Research Tijuana Institute of Technology Tijuana, Mexico [email protected]

Abstract. We describe in this paper the application of soft computing techniques to controlling non-linear dynamical systems in real-world problems. Soft computing consists of fuzzy logic, neural networks, evolutionary computation, and chaos theory. Controlling real-world non-linear dynamical systems may require the use of several soft computing techniques to achieve the desired performance in practice. For this reason, several hybrid intelligent architectures have been developed. The basic idea of these hybrid architectures is to combine the advantages of each of the techniques involved in the intelligent system. Also, non-linear dynamical systems are difficult to control due to the unstable and even chaotic behaviors that may occur in these systems. The described applications include robotics, aircraft systems, biochemical reactors, and manufacturing of batteries.

Keywords: Neural Networks, Fuzzy Logic, Genetic Algorithms, Intelligent Control.

1 Introduction

We describe in this paper the application of soft computing techniques and fractal theory to the control of non-linear dynamical systems [8]. Soft computing consists of fuzzy logic, neural networks, evolutionary computation, and chaos theory [23]. Each of these techniques has been applied successfully to real world problems. However, there are applications in which one of these techniques is not sufficient to achieve the level of accuracy and efficiency needed in practice. For this reason, it is necessary to combine several of these techniques to take advantage of the power that each technique offers. We describe several hybrid architectures that combine different soft computing techniques. We also describe the development of hybrid intelligent systems combining several of these techniques to achieve better performance in controlling real dynamical systems. We illustrate these ideas with applications to robotic systems, aircraft systems, biochemical reactors, and manufacturing systems. Each of these problems has its own characteristics, but all of them share non-linear dynamic behavior. For this reason, the use of soft computing techniques is completely justified. In all of these applications, the results of using soft computing techniques have been better than with traditional techniques.


2 Neural Network Models

A neural network model takes an input vector X and produces an output vector Y. The relationship between X and Y is determined by the network architecture [23]. There are many forms of network architecture (inspired by the neural architecture of the brain). The neural network generally consists of at least three layers: one input layer, one output layer, and one or more hidden layers. Figure 1 illustrates a neural network with p neurons in the input layer, one hidden layer with q neurons, and one output layer with one neuron.

Fig. 1. Single hidden layer feedforward neural network

In the neural network we will be using, the input layer has p+1 processing elements, i.e., one for each predictor variable plus a processing element for the bias. The bias element always has an input of one, X_{p+1} = 1. Each processing element in the input layer sends signals X_i (i = 1,…,p+1) to each of the q processing elements in the hidden layer. The q processing elements in the hidden layer (indexed by j = 1,…,q) produce an “activation” a_j = F(Σ_i w_{ij} X_i), where w_{ij} are the weights associated with the connections between the p+1 processing elements of the input layer and the jth processing element of the hidden layer. Once again, processing element q+1 of the hidden layer is a bias element and always has an activation of one, i.e. a_{q+1} = 1. Assuming that the processing element in the output layer is linear, the network model will be

\[ Y = \sum_{i=1}^{p+1} \pi_i X_i + \sum_{j=1}^{q+1} \theta_j a_j, \qquad a_j = F\Big(\sum_{i=1}^{p+1} w_{ij} X_i\Big) \ \text{for } j \le q. \tag{1} \]

Here π_i are the weights for the connections between the input layer and the output layer, and θ_j are the weights for the connections between the hidden layer and the output layer. The main requirement to be satisfied by the activation function F(.) is that it be nonlinear and differentiable. Typical functions used are the sigmoid, hyperbolic tangent, and the sine functions, i.e.:


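\[ F(x) = \frac{1}{1 + e^{-x}}, \qquad F(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}, \qquad F(x) = \sin(x). \tag{2} \]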
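The forward computation of the network described above can be sketched as follows. The sketch assumes the model Y = Σ π_i X_i + Σ θ_j a_j with a_j = F(Σ w_ij X_i), a bias input X_{p+1} = 1 and a bias activation a_{q+1} = 1, as stated in the text. The function and argument names are illustrative assumptions.

```python
import math


def sigmoid(v):
    """Logistic sigmoid, one of the typical activation functions."""
    return 1.0 / (1.0 + math.exp(-v))


def forward(x, w, pi, theta, F=sigmoid):
    """Output of a single-hidden-layer network with a linear output element.

    x     : p predictor values
    w     : q rows, each holding p+1 input-to-hidden weights
    pi    : p+1 direct input-to-output weights
    theta : q+1 hidden-to-output weights (the last one multiplies the bias)
    """
    xb = list(x) + [1.0]                      # append bias input X_{p+1} = 1
    a = [F(sum(wij * xi for wij, xi in zip(row, xb))) for row in w]
    a.append(1.0)                             # bias activation a_{q+1} = 1
    linear_part = sum(p_i * xi for p_i, xi in zip(pi, xb))
    hidden_part = sum(t_j * aj for t_j, aj in zip(theta, a))
    return linear_part + hidden_part
```

Setting all θ_j to zero removes the contribution of the hidden layer, and the output reduces to the linear regression function Σ π_i X_i.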

The weights in the neural network can be adjusted to minimize some criterion such as the sum of squared error (SSE) function:

\[ SSE = \sum_{l} \left( Y_l - \hat{Y}_l \right)^{2}. \tag{3} \]

Thus, the weights in the neural network are similar to the regression coefficients in a linear regression model. In fact, if the hidden layer is eliminated, (1) reduces to the well-known linear regression function. It has been shown [13, 24] that, given sufficiently many hidden units, (1) is capable of approximating any measurable function to any accuracy. In fact F(.) can be an arbitrary sigmoid function without any loss of flexibility.

The most popular algorithm for training feedforward neural networks is the backpropagation algorithm. As the name suggests, the error computed from the output layer is backpropagated through the network, and the weights are modified according to their contribution to the error function. Essentially, backpropagation performs a local gradient search, and hence its implementation does not guarantee reaching a global minimum. A number of heuristics are available to partly address this problem, some of which are presented below. Instead of distinguishing between the weights of the different layers as in Equation (1), we refer to them generically as wij in the following.

After some mathematical simplification the weight change equation suggested by back-propagation can be expressed as follows:

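\[ \Delta w_{ij}(t) = -\eta\, \frac{\partial E}{\partial w_{ij}} + \theta\, \Delta w_{ij}(t-1). \tag{4} \]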

Here, η is the learning coefficient and θ is the momentum term. One heuristic that is used to prevent the neural network from getting stuck at a local minimum is the random presentation of the training data. Another heuristic that can speed up convergence is the cumulative update of weights, i.e., weights are not updated after the presentation of each input-output pair, but are accumulated until a certain number of presentations are made, this number being referred to as an “epoch”. In the absence of the second term in (4), setting a low learning coefficient results in slow learning, whereas a high learning coefficient can produce divergent behavior. The second term in (4) reinforces general trends, whereas oscillatory behavior is canceled out, thus allowing a low learning coefficient but faster learning. Last, it is suggested that starting the training with a large learning coefficient and letting its value decay as training progresses speeds up convergence.
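The role of the two terms in the weight change can be illustrated with a short sketch. It assumes the update Δw(t) = −η ∂E/∂w + θ Δw(t−1), with η the learning coefficient and θ the momentum term, and applies it to a simple quadratic error function chosen only for illustration. The function names and numerical values are assumptions.

```python
# Gradient descent with a momentum term on a toy error function.

def train_with_momentum(grad, w, eta=0.1, theta=0.5, steps=200):
    """Repeatedly apply dw = -eta * grad(w) + theta * previous dw."""
    delta = [0.0] * len(w)
    for _ in range(steps):
        g = grad(w)
        delta = [-eta * gi + theta * di for gi, di in zip(g, delta)]
        w = [wi + di for wi, di in zip(w, delta)]
    return w


def toy_gradient(w):
    """Gradient of E(w) = (w0 - 1)^2 + 2*(w1 + 2)^2, minimum at (1, -2)."""
    return [2.0 * (w[0] - 1.0), 4.0 * (w[1] + 2.0)]


w_final = train_with_momentum(toy_gradient, [0.0, 0.0])
```

With θ = 0 the same routine performs plain gradient descent, which allows the effect of the momentum term on the speed of convergence to be compared for a given η.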

2.1 Levenberg-Marquardt Modifications for Neural Networks

The method of steepest descent, also known as the gradient method, is one of the oldest techniques for minimizing a given function defined on a multidimensional space. This method forms the basis for many optimization techniques. In general, the descent direction is given by the second derivatives of the objective function E. The matrix of second derivatives gives us what is known as the Hessian matrix H. In the classical Newton's method this matrix is used to define an adaptation rule for a parameter vector θ as follows:

\[ \theta_{next} = \theta_{now} - H^{-1} g, \tag{5} \]

where g is the gradient vector consisting of all the first order derivatives of function E. In Newton's method H needs to be positive definite to have convergence.

Furthermore, if the Hessian matrix is not positive definite, the Newton direction may point toward a local maximum or a saddle point. The Hessian can be altered by adding a positive definite matrix P to H to make the sum positive definite. Levenberg and Marquardt [15] introduced this notion in least-squares problems. Later, Goldfeld et al. [11] first applied this concept to Newton's method. When P = λI, Equation (5) becomes

\[ \theta_{next} = \theta_{now} - (H + \lambda I)^{-1} g, \tag{6} \]

where I is the identity matrix and λ is some nonnegative value. Depending on the magnitude of λ, the method transits smoothly between the two extremes: Newton's method (λ → 0) and the well-known steepest descent method (λ → ∞). The various Levenberg-Marquardt algorithms differ in the selection of λ. Goldfeld et al. computed the eigenvalues of H and set λ to a little larger than the magnitude of the most negative eigenvalue.

Moreover, when λ increases, ||θ_next − θ_now|| decreases. In other words, λ plays the same role as an adjustable step length: with some appropriately large λ, the step length will be the right one. Of course, the step size η can be further introduced and determined in conjunction with line search methods:

\[ \theta_{next} = \theta_{now} - \eta\, (H + \lambda I)^{-1} g. \tag{7} \]

For the case of neural networks these ideas are used to update (or learn) the weights of the network [8].
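A small sketch shows how the damping parameter λ moves the update between the Newton step and a short steepest-descent step. It assumes the rule θ_next = θ_now − η (H + λI)⁻¹ g for a two-parameter problem, solved with an explicit 2×2 inverse. The quadratic test function and all names are illustrative assumptions, not the training procedure used by the authors.

```python
# One damped-Newton (Levenberg-Marquardt style) step for two parameters.

def lm_step(theta, g, H, lam, eta=1.0):
    """Return theta - eta * (H + lam*I)^(-1) g for a 2x2 matrix H."""
    a, b = H[0][0] + lam, H[0][1]
    c, d = H[1][0], H[1][1] + lam
    det = a * d - b * c
    s0 = (d * g[0] - b * g[1]) / det
    s1 = (a * g[1] - c * g[0]) / det
    return [theta[0] - eta * s0, theta[1] - eta * s1]


# Toy objective E = 0.5 * theta^T H theta - b^T theta, minimum at H theta = b.
H_TOY = [[3.0, 1.0], [1.0, 2.0]]
B_TOY = [1.0, 1.0]


def toy_gradient(theta):
    return [H_TOY[0][0] * theta[0] + H_TOY[0][1] * theta[1] - B_TOY[0],
            H_TOY[1][0] * theta[0] + H_TOY[1][1] * theta[1] - B_TOY[1]]


start = [1.0, 1.0]
newton = lm_step(start, toy_gradient(start), H_TOY, lam=0.0)
damped = lm_step(start, toy_gradient(start), H_TOY, lam=1e6)
```

For this quadratic objective the step with λ = 0 reaches the minimum (0.2, 0.4) at once, while the step with a very large λ is a short move of approximately −g/λ along the negative gradient.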

3 Fractal Dimension of a Geometrical Object

Recently, considerable progress has been made in understanding the complexity of an object through the application of fractal concepts [14] and dynamic scaling theory [3]. For example, financial time series show scaled properties suggesting a fractal structure [8]. The fractal dimension of a geometrical object can be defined as follows:

\[ d = \lim_{r \to 0} \frac{\ln N(r)}{\ln (1/r)}, \tag{8} \]

where N(r) is the number of boxes covering the object and r is the size of the box. An approximation to the fractal dimension can be obtained by counting the number of boxes covering the boundary of the object for different r sizes and then performing a logarithmic regression to obtain d (box counting algorithm). In Figure 2, we illustrate the box counting algorithm for a hypothetical curve C. Counting the number of boxes for different sizes of r and performing a logarithmic linear regression, we can estimate the box dimension of a geometrical object with the following equation:

\[ \ln N(r) = \ln \beta - d \ln r, \tag{9} \]

where β is a constant. This algorithm is illustrated in Figure 3.

Fig. 2. Box counting algorithm for a curve C

Fig. 3. Logarithmic regression to find dimension

The fractal dimension can be used to characterize an arbitrary object. The reason for this is that the fractal dimension measures the geometrical complexity of objects. In this case, a time series can be classified by using the numeric value of the fractal dimension (d is between 1 and 2 because we are on the plane xy). The reasoning behind this classification scheme is that when the boundary is smooth the fractal dimension of the object will be close to one. On the other hand, when the boundary is rougher the fractal dimension will be close to a value of two.

We developed a computer program in MATLAB for calculating the fractal dimension of a sound signal. The computer program uses as input the figure of the signal and counts the number of boxes covering the object for different grid sizes.
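The box counting procedure can be sketched in a few lines of Python. This is not the MATLAB program mentioned above: it assumes the object is given as a list of (x, y) sample points in the unit square rather than as an image, and the function names and test shapes are hypothetical. The dimension is estimated as the least-squares slope of ln N(r) against ln(1/r).

```python
import math


def box_count(points, r):
    """Number of grid boxes of side r that contain at least one point."""
    return len({(math.floor(x / r), math.floor(y / r)) for x, y in points})


def box_dimension(points, sizes):
    """Least-squares slope of ln N(r) versus ln(1/r) over the given box sizes."""
    xs = [math.log(1.0 / r) for r in sizes]
    ys = [math.log(box_count(points, r)) for r in sizes]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Two illustrative test objects: a smooth curve and a plane-filling set.
line = [(i / 10000.0, i / 10000.0) for i in range(10000)]
square = [((i + 0.5) / 64.0, (j + 0.5) / 64.0)
          for i in range(64) for j in range(64)]
sizes = [1 / 2, 1 / 4, 1 / 8, 1 / 16]
d_line = box_dimension(line, sizes)
d_square = box_dimension(square, sizes)
```

For the straight line the estimate is close to one and for the filled square it is close to two, which are the two extremes of the classification scheme described in this section.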

4 Intelligent Control Using Soft Computing

First, we describe a new method for adaptive model-based control of robotic dynamic systems using a neuro-fuzzy-fractal approach. Intelligent control of robotic dynamic
systems is a difficult problem because the dynamics of these systems is highly non-linear [5]. We describe an intelligent system for controlling robot manipulators to illustrate our neuro-fuzzy-fractal approach for adaptive control. We use a new fuzzy inference system for reasoning with multiple differential equations for modelling based on the relevant parameters for the problem [6]. In this case, the fractal dimension [14] of a time series of measured values of the variables is used as a parameter for the fuzzy system. We use neural networks for identification and control of robotic dynamic systems [4, 21]. The neural networks are trained with the Levenberg-Marquardt learning algorithm with real data to achieve the desired level of performance. Combining a fuzzy rule base [32] for modelling with the neural networks for identification and control, an intelligent system for adaptive model-based control of robotic dynamic systems was developed. We have very good simulation results for several types of robotic systems for different conditions. The new method for control combines the advantages of fuzzy logic (use of expert knowledge) with the advantages of neural networks (learning and adaptability), and the advantages of the fractal dimension (pattern classification) to achieve the goal of robust adaptive control of robotic dynamic systems.

The neuro-fuzzy-fractal approach described above can also be applied to the case of controlling biochemical reactors [21]. In this case, we use mathematical models of the reactors to achieve adaptive model-based control. We also use a fuzzy inference system for differential equations to take into consideration several models of the biochemical reactor. The neural networks are used for identification and control. The fractal dimension of the bacteria used in the reactor is also an important parameter in the fuzzy rules to take into account the complexity of biochemical process. We have very good results for several food production processes in which the biochemical reactor is controlled to optimize the production.

We have also used our hybrid approach for the case of controlling chaotic and unstable behavior in aircraft dynamic systems [22]. For this case, we use mathematical models for the simulation of aircraft dynamics during flight. The goal of constructing these models is to capture the dynamics of the aircraft, so as to have a way of controlling this dynamics to avoid dangerous behavior of the system. Chaotic behavior has been related to the flutter effect that occurs in real airplanes, and for this reason has to be avoided during flight. The prediction of chaotic behavior can be done using the mathematical models of the dynamical system. We use a fuzzy inference system combining multiple differential equations for modelling complex aircraft dynamic systems. On the other hand, we use neural networks trained with the Levenberg-Marquardt algorithm for control and identification of the dynamic systems. The proposed adaptive controller performs rather well considering the complexity of the domain.

We also describe in this paper several hybrid approaches for controlling electrochemical processes in manufacturing applications. The hybrid approaches combine soft computing techniques to achieve the goal of controlling the manufacturing process to follow a desired production plan. Electrochemical processes, like the ones used in battery formation, are very complex and for this reason very difficult to control. Also, mathematical models of electrochemical processes are difficult to derive and are not very accurate. We need adaptive control of the electrochemical process to achieve on-line control of the production line. Of course,
adaptive control is easier to achieve if one uses a reference model of the process [21, 22]. In this case, we use a neural network to model the electrochemical process due to the difficulty in obtaining a good mathematical model for the problem. The other part of the problem is how to control the non-linear electrochemical process in the desired way to achieve the production with the required quality. We developed a set of fuzzy rules using expert knowledge for controlling the manufacturing process. The membership functions for the linguistic variables in the rules were tuned using a specific genetic algorithm. The genetic algorithm was used for searching the parameter space of the membership functions using real data from production lines. Our particular neuro-fuzzy-genetic approach has been implemented as an intelligent system to control the formation of batteries in a real plant with very good results.

5 Intelligent Control of Robotic Systems

Given the dynamic equations of motion of a robot manipulator, the purpose of robot arm control is to maintain the dynamic response of the manipulator in accordance with some pre-specified performance criterion [7]. Although the control problem can be stated in such a simple manner, its solution is complicated by inertial forces, coupling reaction forces, and gravity loading on the links. In general, the control problem consists of (1) obtaining dynamic models of the robotic system, and (2) using these models to determine control laws or strategies to achieve the desired system response and performance [10].

Among various adaptive control methods, the model-based adaptive control is the most widely used and it is also relatively easy to implement. The concept of model-based adaptive control is based on selecting an appropriate reference model and adaptation algorithm, which modifies the feedback gains to the actuators of the actual system.

Many authors have proposed linear mathematical models to be used as reference models in the general scheme described before. For example, a linear, second-order, time-invariant differential equation can be used as the reference model for each degree of freedom of the robot arm. Defining the vector y(t) to represent the reference model response and the vector x(t) to represent the manipulator response, the joint i of the reference model can be described by

(10)

If we assume that the manipulator is controlled by position and velocity feedback gains and the coupling terms are negligible, then the manipulator equation for joint i can be

$$\alpha_i(t)\,x_i''(t) + \beta_i(t)\,x_i'(t) + x_i(t) = r_i(t) \tag{11}$$

where the system parameters αi(t) and βi(t) are assumed to vary slowly with time.

The fact that this control approach is not dependent on a complex mathematical model is one of its major advantages, but stability considerations of the closed-loop adaptive system are critical. A stability analysis is difficult and has only been carried out using linearized models. However, the adaptability of the controller can become questionable if the interaction forces among the various joints are severe (non-linear). This is the main reason why soft computing techniques [7] have been proposed to control this type of dynamic system.

Adaptive fuzzy control is an extension of fuzzy control theory that widens the applicability of the fuzzy controller, either to a wider class of uncertain systems or to fine-tuning the parameters of a system to a desired accuracy [9]. In this scheme, a fuzzy controller is designed based on knowledge of a dynamic system. This fuzzy controller is characterized by a set of parameters. These parameters are either the controller constants or functions of a model's constants.

A controller is designed based on an assumed mathematical model representing a real system. It must be understood that the mathematical model does not completely match the real system to be controlled. Rather, the mathematical model is seen as an approximation of the real system. A controller designed based on this model is assumed to work effectively with the real system if the error between the actual system and its mathematical representation is relatively insignificant. However, there exists a threshold constant that sets a boundary for the effectiveness of a controller. An error above this threshold will render the controller ineffective toward the real system.

An adaptive controller is set up to take advantage of additional data collected at run time for better effectiveness. At run time, data are collected periodically at the beginning of each constant time interval, tn = tn-1 + Δt, where Δt is a constant time increment and [tn-1, tn) is the interval between data collections. Let Dn be the set of data collected at time t = tn. It is assumed that at any particular time t = tn, a history of data {D0, D1, …, Dn} is always available. The more data available, the more accurate the approximation of the system will become.

At run time, the control input is fed into both the real system and the mathematical model representing the system. The output of the real system and the output of the mathematical model are collected, and an error representing the difference between these two outputs is calculated. Let x(t) be the output of the real system, and y(t) the output of the mathematical model. The error ε(t) is defined as:

$$\varepsilon(t) = x(t) - y(t) \tag{12}$$

Figure 4 depicts this tracking of the difference between the mathematical model and the real dynamic system it represents.

[Figure: block diagram with the blocks Controller, Real Dynamic System and Mathematical Model, and the signals xdesired, u(t), x(t), y(t) and ε(t)]

Fig. 4. Tracking the error function between outputs of a real system and mathematical model


An adaptive controller will be adjusted based on the error function ε(t). These calculated data will be fed into either the mathematical model or the controller for adjustment. Since the error function ε(t) is available only at run time, an adjusting mechanism must be designed to accept this error as it becomes available, i.e., it must evolve with the accumulation of data in time. At any time t = tn, the set of calculated data in the form of a time series {ε(t0), ε(t1), ..., ε(tn)} is available and must be used by the adjusting mechanism to update appropriate parameters.

In normal practice, instead of doing re-calculation based on a lengthy set of data, the adjusting algorithm is reformulated to be based on two entities: (i) sufficient information, and (ii) newly collected data. The sufficient information is a numerical variable representing the set of data {ε(t0), ε(t1),..., ε(tn-1)} collected from the initial time t0 to the previous collecting cycle starting at time t = tn-1. The new datum ε(tn) is collected in the current cycle starting at time t = tn.

An adaptive controller will operate as follows. The controller is initially designed as a function of a parameter set and state variables of a mathematical model. The parameters can be updated any time during operation and the controller will adjust itself to the newly updated parameters. The time frame is usually divided into a series of equally spaced intervals {[tn, tn+1) | n = 0, 1, 2, ...; tn+1 = tn + Δt}. At the beginning of each time interval [tn, tn+1) observable data are collected and the error function ε(tn) is calculated. This error is used to calculate the adjustment in the parameters of the controller. New control input u(tn) for the time interval [tn, tn+1) is then calculated based on the newly calculated parameters and fed into both the real dynamic system under control and the mathematical model upon which the controller is designed. This completes one control cycle. The next control cycle will consist of the same steps repeated for the next time interval [tn+1, tn+2), and so on.
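The control cycle described above can be sketched as a short simulation loop. Everything specific in the code is an assumption made for illustration and is not the controller used in this work: the "real" system is a simulated first-order plant x' = -a x + u with unknown a, the mathematical model is a one-step-ahead predictor that uses the estimate a_hat, the parameter is adjusted with a simple gradient-type rule, and the "sufficient information" is a running mean of the squared error that is updated from its previous value and the newest datum only.

```python
def run_adaptive_loop(a_true=1.5, x_desired=1.0, dt=0.01, steps=2000,
                      gain=2.0, adapt_rate=50.0):
    """Simulate repeated control cycles; return (a_hat, x, mean squared error)."""
    x = 0.0            # output of the "real" system (simulated here)
    a_hat = 0.0        # adjustable parameter shared by model and controller
    mean_sq_err = 0.0  # sufficient information summarising all past errors
    for n in range(1, steps + 1):
        # Control input for the interval [t_n, t_{n+1}) from current parameters.
        u = a_hat * x + gain * (x_desired - x)
        # Feed the same input to the real system and to the model (Euler step).
        x_next = x + dt * (-a_true * x + u)
        y_next = x + dt * (-a_hat * x + u)
        # Error between the real system output and the model output.
        eps = x_next - y_next
        # Recursive update: old summary plus the newly collected datum.
        mean_sq_err += (eps * eps - mean_sq_err) / n
        # Adjust the parameter so that the model error shrinks.
        a_hat -= adapt_rate * eps * x
        x = x_next
    return a_hat, x, mean_sq_err


a_hat, x_final, mse = run_adaptive_loop()
```

With these illustrative settings the estimate a_hat approaches the plant parameter and the output settles at the desired value. The recursion for the running mean shows why the full error history never has to be stored.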

5.1 Mathematical Modelling of Robotic Dynamic Systems

We will consider, in this section, the case of modelling robotic manipulators [5]. The general model for this kind of robotic system is the following:

$$M(q)\,q'' + V(q, q')\,q' + G(q) + F_d\,q' = \tau \tag{13}$$

where q ∈ R^n denotes the link position, M(q) ∈ R^{n×n} is the inertia matrix, V(q, q') ∈ R^{n×n} is the centripetal-Coriolis matrix, G(q) ∈ R^n represents the gravity vector, Fd ∈ R^{n×n} is a diagonal matrix representing the friction term, and τ is the input torque applied to the links. We show in Figure 5 the case of the two-link robot arm. In this figure, we show the variables involved.

Fig. 5. Two-link robot arm indicating the variables involved

For the simplest case of a one-link robot arm, we have the scalar equation:

$$M_q\,q'' + F_d\,q' + G(q) = \tau \tag{14}$$

If G(q) is a linear function (G = Nq), then we have the "linear oscillator" model:

$$q'' + a q' + b q = c$$

where a = Fd/Mq, b = N/Mq and c = τ/Mq. This is the simplest mathematical model for a one-link robot arm. More realistic models can be obtained for more complicated functions G(q). For example, if G(q) = Nq^2, then we obtain the "quadratic oscillator" model:

$$q'' + a q' + b q^2 = c \tag{15}$$

where a, b and c are defined as above. A more interesting model is obtained if we define G(q) = N sin q. In this case, the mathematical model is

$$q'' + a q' + b \sin q = c \tag{16}$$

where a, b and c are the same as above. This is the so-called "sinusoidally forced oscillator". More complicated models for a one-link robot arm can be defined similarly.
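The one-link models above share the form q'' + a q' + b g(q) = c with g(q) = q, q^2 or sin q, so a single routine can integrate all of them. The sketch below uses a classical fourth-order Runge-Kutta scheme. The function name, the step size and the parameter values (a, b, c and the initial state) are assumptions chosen only to illustrate the behaviour and do not come from the experiments reported here.

```python
import math


def simulate_one_link(g, a, b, c, q0=0.1, dq0=0.0, dt=0.01, t_end=40.0):
    """Integrate q'' + a q' + b g(q) = c with RK4; return the final (q, q')."""
    def f(state):
        q, dq = state
        return (dq, c - a * dq - b * g(q))

    state = (q0, dq0)
    for _ in range(round(t_end / dt)):
        k1 = f(state)
        k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(s + dt / 6.0 * (w1 + 2.0 * w2 + 2.0 * w3 + w4)
                      for s, w1, w2, w3, w4 in zip(state, k1, k2, k3, k4))
    return state


# Linear oscillator (g(q) = q) and the sine model (g(q) = sin q).
linear_q, linear_dq = simulate_one_link(lambda q: q, a=2.0, b=1.0, c=0.5)
sine_q, sine_dq = simulate_one_link(math.sin, a=1.0, b=1.0, c=0.5)
```

For these illustrative values the linear model settles at q = c/b and the sine model at the angle where b sin q = c, that is, where the gravity term balances the constant input.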

For the case of a two-link robot arm, we can have two simultaneous differential equations as follows:

$$\begin{aligned} q_1'' + a_1 q_1' + b_1 q_2^2 &= c_1 \\ q_2'' + a_2 q_2' + b_2 q_1^2 &= c_2 \end{aligned} \tag{17}$$

which is called the "coupled quadratic oscillators" model. In Equation (17), a1, b1, a2, b2, c1 and c2 are defined similarly as in the previous models. We can also have the "coupled cubic oscillators" model:

$$\begin{aligned} q_1'' + a_1 q_1' + b_1 q_2^3 &= c_1 \\ q_2'' + a_2 q_2' + b_2 q_1^3 &= c_2 \end{aligned} \tag{18}$$

Fig. 6. (a) Function approximation after 9 epochs, (b) SSE of the neural network


5.2 Simulation Results

To give an idea of the performance of our neuro-fuzzy approach for adaptive model-based control of robotic systems, we show below simulation results obtained for a single-link robot arm. The desired trajectory for the link was selected to be

$$q_d = t \sin(2.0t) \tag{19}$$

and the simulation was carried out with the initial values q(0) = 0.1 and q'(0) = 0. We used three-layer neural networks (with 15 hidden neurons) with the Levenberg-Marquardt algorithm and hyperbolic tangent sigmoidal functions as the activation functions for the neurons. We show in Figure 6(a) the function approximation achieved with the neural network for control after 9 epochs of training with a variable learning rate. The identification achieved by the neural network can be considered very good because the error has been decreased to the order of 10^-4. We show in Figure 6(b) the curve relating the sum of squared errors (SSE) to the number of epochs of neural network training. We can see in this figure how the SSE diminishes rapidly from the order of 10^2 to a smaller value of the order of 10^-4. Still, we can obtain a better approximation by using more hidden neurons or more layers. In any case, we can see clearly how the neural network learns to control the robotic system, because it is able to follow the arbitrary desired trajectory.

Fig. 7. (a) Non-linear surface for modelling, (b) fuzzy reasoning procedure

Fig. 8. (a) Simulation of position q1, (b) Simulation of position q2

We show in Figure 7(a) the non-linear surface for the fuzzy rule base for modelling. The fuzzy system was implemented in the fuzzy logic toolbox of MATLAB [25]. We show in Figure 7(b) the reasoning procedure for specific values of the fractal dimension and number of links of the robotic system.

In Figure 8 we show simulation results for a two-link robot arm with a model given by two coupled second order differential equations. Figure 8(a) shows the behavior of position q1 and Figure 8(b) shows it for position q2 of the robot arm.

We can see from these figures the complex dynamic behavior of this robotic system [7]. Of course, the complexity is even greater for higher dimensional robotic systems.

We have very good simulation results for several types of robotic manipulators for different conditions. The new method for control combines the advantages of neural networks (learning and adaptability) with the advantages of fuzzy logic (use of expert knowledge) to achieve the goal of robust adaptive control of robotic dynamic systems. We consider that our method for adaptive control can be applied to general non-linear dynamical systems [8, 27] because the hybrid approach, combining neural networks and fuzzy logic, does not depend on the particular characteristics of the robotic dynamic systems.

The new method for adaptive control can also be applied for autonomous robots [8], but in this case it may be necessary to include genetic algorithms for trajectory planning.

6 Control of Biochemical Reactors

Process control of biochemical plants is also an attractive application because of the potential benefits to both adaptive network research and to actual biochemical process control. In spite of the extensive work on self-tuning controllers and model-reference control, there are many problems in chemical processing industries for which current techniques are inadequate. Many of the limitations of current adaptive controllers arise in trying to control poorly modeled non-linear systems [1]. For most of these processes extensive data are available from past runs, but it is difficult to formulate precise models. This is precisely where adaptive networks are expected to be useful [31].

Bioreactors are difficult to model because of the complexity of the living organisms in them, and they are also difficult to control because one often cannot measure on-line the concentration of the chemicals being metabolized or produced. Bioreactors can also have markedly different operating regimes, depending on whether the bacteria are rapidly growing or producing product. Model-based control of these reactors poses a dual problem: determining a realistic process model and determining effective control laws in the face of inaccurate process models and highly nonlinear processes [19, 20, 26].

Biochemical systems can be relatively simple in that they have few variables, but still very difficult to control due to strong nonlinearities which are difficult to model accurately. A prime example is the bioreactor. In its simplest form, a bioreactor is simply a tank containing water and cells (e.g., bacteria) which consume nutrients ("substrate") and produce products (both desired and undesired) and more cells. Bioreactors can be quite complex: cells are self-regulatory mechanisms, and can
adjust their growth rates and production of different products radically depending on temperature and concentrations of waste products [16]. Systems with heating or cooling, multiple reactors or unsteady operation greatly complicate the analysis. Mathematical models for these systems can be expressed as differential (or difference) equations [3, 17, 18].

Now we propose mathematical models that integrate our method for geometrical modelling of bacteria growth using the fractal dimension [14] with the method for modelling the dynamics of bacteria population using differential equations [27]. The resulting mathematical models describe bacteria growth in space and in time, because the use of the fractal dimension enables us to classify bacteria by the geometry of the colonies and the differential equations help us to understand the evolution in time of bacteria population.

We will consider first the case of using one bacteria for food production. The mathematical model in this case can be of the following form:

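$$\begin{aligned} \frac{dN}{dt} &= r\left(1 - \frac{N^{-D}}{K}\right) N^{-D} - \beta N^{-D} \\ \frac{dP}{dt} &= \beta N^{-D} \end{aligned} \tag{20}$$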

where D is the fractal dimension, N is the bacteria population, P is quantity of chemical product, r is the rate of bacteria growth, K is the environment capacity, and β is a biochemical conversion factor.

We will consider now the case of two bacteria used for food production:

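$$\begin{aligned}
\frac{dN_1}{dt} &= \left[r_1 - \frac{r_1}{K_1} N_1^{-D_1} - \frac{r_1}{K_1}\,\alpha_{12} N_2^{-D_2}\right] N_1^{-D_1} - \beta N_1^{-D_1} \\
\frac{dN_2}{dt} &= \left[r_2 - \frac{r_2}{K_2} N_2^{-D_2} - \frac{r_2}{K_2}\,\alpha_{21} N_1^{-D_1}\right] N_2^{-D_2} - \beta N_2^{-D_2} \\
\frac{dP}{dt} &= \beta N_1^{-D_1} + \beta N_2^{-D_2}
\end{aligned} \tag{21}$$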

where D1 is the fractal dimension of bacteria 1, D2 is the fractal dimension of bacteria 2 and the rest of variables are as described in the last equation.

As we can see from equations (20) and (21), the idea of our method of modelling is to use the fractal dimension D as a parameter in the differential equations, so as to identify the type of bacteria to which each equation corresponds. In this way, equation (20), for example, can represent the model for food production using one type of bacteria (the one defined by the fractal dimension D).

We have implemented a model-based neural controller using the architecture of Figure 9. Two multilayer networks are used, one for the model of the plant and the second for the controller. The Neural Networks were implemented in the MATLAB programming language to achieve a high level of efficiency on the numerical calculations needed for these modules. The Fractal module was also implemented in the MATLAB programming language for the same reason. In this way we combine the three methodologies to obtain the best of the three worlds (Neural Networks, Fuzzy Logic and Fractal Theory) using for each the appropriate implementation language.


Fig. 9. Indirect Adaptive Neuro-Fuzzy-Fractal Control

Fig. 10. Simulation of the model for two bacteria used in food production

We show in Figure 10 simulation results of bacteria population used for food production. We can see from this figure the complicated dynamics for the case of two bacteria competing in the same environment, and at the same time producing the chemical product necessary for food production.

We also show in Figure 11 simulation results for the case of two good bacteria used for food production and one bad bacteria that is attacking the other ones. We can see from this figure how one of the good bacteria is eliminated (the population goes down to zero), which of course results in a decrease of the resulting quantity of the food product. This is a case that has to be avoided because of the detrimental effect of the bad bacteria. Intelligent control helps in avoiding these types of scenarios for food production.

Fig. 11. Simulation of the model for two good bacteria and one bad one

We have used a general method for adaptive model-based control of non-linear dynamic plants using Neural Networks, Fuzzy Logic and Fractal Theory. We illustrated our method for control with the case of biochemical reactors. In this case, the models represent the process of biochemical transformation between the microbial life and their generation of the chemical product. We also describe in this paper an adaptive controller based on the use of neural networks and mathematical models for the plant. The proposed adaptive controller performs rather well considering the complexity of the domain being considered in this research work. We can say that combining Neural Networks, Fuzzy Logic and Fractal Theory, using the advantages that each of these methodologies has, can give good results for this kind of application. Also, we believe that our neuro-fuzzy-fractal approach is a good alternative for solving similar problems.

7 Intelligent Control of Aircraft Systems

The mathematical models of aircraft systems can be represented as coupled non-linear differential equations [22]. In this case, we can develop a fuzzy rule base for modelling that enables the use of the appropriate mathematical model according to the changing
conditions of the aircraft and its environment. For example, we can use the following model of an airplane when wind velocity is relatively small:

p’ = I1(-q + l), q’ = I2(p + m) (22)

where I1 and I2 are the inertia moments of the airplane with respect to axis x and y, respectively, l and m are physical constants specific to the airplane, and p, q are the positions with respect to axis x and y, respectively. However, a more realistic model of an airplane in three dimensional space, is as follows:

p’ = I1(-qr + l), q’ = I2(pr + m), r’ = I3(-pq + n) (23)

where now I3 is the inertia moment of the airplane with respect to the z axis, n is a physical constant specific to the airplane, and r is the position along the z axis. Considering now wind disturbances in the model, we have the following equation:

p’ = I1(-qr + l) - ug, q’ = I2(pr + m), r’ = I3(-pq + n) (24)

where ug is the wind velocity. The magnitude of wind velocity is dependent on the altitude of the airplane in the following form:

$$u_g = u_{wind510}\left[1 + \frac{\ln(r/510)}{\ln(51)}\right]$$

where uwind510 is the wind speed at 510 ft altitude (typical value = 20 ft/sec). If we use the models of Eq. (22)-(24) for describing aircraft dynamics, we can formulate a set of rules that relate the models to the conditions of the aircraft and its environment. Let us assume that M1 is given by Eq. (22), M2 is given by Eq. (23), and M3 is given by Eq. (24). Now using the wind velocity ug and inertia moment I1 as parameters, we can establish the fuzzy rule base for modelling [29, 30] as in Table 1.

In Table 1, we are assuming that the wind velocity ug can have only two possible fuzzy values (small and large). This is sufficient to know if we have to use the mathematical model that takes into account the effect of wind (M3) for ug large or if we don’t need to use it and simply the model M2 is sufficient (for ug small). Also, the inertia moment (I1) helps in deciding between models M1 and M2 (or M3).

To give an idea of the performance of our neuro-fuzzy-fractal approach for adaptive control, we show below simulation results for aircraft dynamic systems. First, we show in Figure 12(a) the fuzzy rule base for a prototype intelligent system developed in the fuzzy logic toolbox of the MATLAB programming language. We show in Figure 12(b) the non-linear surface for the problem of aircraft dynamics using as input variables: fractal dimension and wind velocity.

Table 1. Fuzzy rule base for modelling aircraft systems

IF                                   THEN
Wind     Inertia    Fractal Dim      Model
Small    Small      Low              M1
Small    Small      Medium           M2
Small    Large      Low              M2
Small    Large      Medium           M2
Large    Small      Medium           M3
Large    Large      Medium           M3
Large    Large      High             M3

Fig. 12. (a) Fuzzy rule base (b) Non-linear surface for aircraft dynamics
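The rule base of Table 1 can be written down as a simple lookup structure. The sketch below is a crisp simplification made for illustration: it treats the linguistic values as exact labels and returns the model named by the matching rule, whereas the actual system evaluates the rules with fuzzy membership functions in the MATLAB fuzzy logic toolbox. The names RULES and select_model are hypothetical.

```python
# Rules of Table 1: (wind, inertia, fractal dimension) -> model.
RULES = {
    ("small", "small", "low"): "M1",
    ("small", "small", "medium"): "M2",
    ("small", "large", "low"): "M2",
    ("small", "large", "medium"): "M2",
    ("large", "small", "medium"): "M3",
    ("large", "large", "medium"): "M3",
    ("large", "large", "high"): "M3",
}


def select_model(wind, inertia, fractal_dim):
    """Return the model for the given linguistic values, or None if no rule applies."""
    key = (wind.lower(), inertia.lower(), fractal_dim.lower())
    return RULES.get(key)


model_calm = select_model("Small", "Small", "Low")
model_windy = select_model("Large", "Large", "High")
```

Combinations that do not appear in Table 1 return None here. In a fuzzy inference system such inputs would instead be handled by the partial firing of neighbouring rules.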


Fig. 13. (a) Simulation of position q (b) Simulation of position p

We show simulation results for an aircraft system obtained using our new method for modelling dynamical systems. In Figure 13(a) and Figure 13(b) we show results for an airplane with inertia moments: I1 = 1, I2 = 0.4, I3 = 0.05 and the constants are: l = m = n = 1. The initial conditions are: p(0) = 0, q(0) = 0, r(0) = 0.
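The three-dimensional model of Eq. (23) can be integrated numerically with the parameter values and initial conditions quoted above. The sketch below is an independent illustration and not the simulation code used to produce Figure 13: the function names, the fourth-order Runge-Kutta integrator, the step size and the short time horizon are all assumptions.

```python
def aircraft_rhs(state, I1=1.0, I2=0.4, I3=0.05, l=1.0, m=1.0, n=1.0):
    """Right-hand side of Eq. (23): returns (p', q', r')."""
    p, q, r = state
    return (I1 * (-q * r + l), I2 * (p * r + m), I3 * (-p * q + n))


def simulate_aircraft(t_end=1.0, dt=0.001, state=(0.0, 0.0, 0.0)):
    """Integrate Eq. (23) with RK4 and return the list of visited states."""
    trajectory = [state]
    for _ in range(round(t_end / dt)):
        k1 = aircraft_rhs(state)
        k2 = aircraft_rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = aircraft_rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = aircraft_rhs(tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(s + dt / 6.0 * (w1 + 2.0 * w2 + 2.0 * w3 + w4)
                      for s, w1, w2, w3, w4 in zip(state, k1, k2, k3, k4))
        trajectory.append(state)
    return trajectory


trajectory = simulate_aircraft()
p_end, q_end, r_end = trajectory[-1]
```

Starting from rest, the initial rates are simply I1 l, I2 m and I3 n, so p grows fastest. The product terms then couple the three variables as they move away from zero. Longer horizons can be explored by increasing t_end.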


To give an idea of the performance of our neuro-fuzzy approach for adaptive model-based control of aircraft dynamics, we show below (Figure 14) simulation results obtained for the case of controlling the altitude of an airplane for a flight of 6 hours. We assume that the airplane takes about one hour to achieve the cruising altitude 30 000 ft, then cruises along for about three hours at this altitude (with minor fluctuations), and finally descends for about two hours to its final landing point. We will consider the desired trajectory as follows:

$$r_d = \begin{cases} 30t + \sin 2t & \text{for } 0 \le t \le 1 \\ 30 + 2\sin 10t & \text{for } 1 < t \le 4 \\ 90 - 15t & \text{for } 4 < t \le 6 \end{cases}$$

Of course, a complete desired trajectory for the airplane would have to include the positions of the airplane in the x and y directions (variables p, q in the models). However, for illustration purposes we think it is sufficient to show the control of the altitude r of the airplane.
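The piecewise desired trajectory can be coded directly as a function of time. In the sketch below the time t is taken in hours and the altitude in thousands of feet, which is an interpretation inferred from the 6 hour flight and the 30 000 ft cruising altitude. The term written sin2t is read as sin(2t). The function name is hypothetical.

```python
import math


def desired_altitude(t):
    """Desired altitude r_d(t), assumed in thousands of feet, for t in hours."""
    if not 0.0 <= t <= 6.0:
        raise ValueError("the trajectory is defined only for 0 <= t <= 6")
    if t <= 1.0:
        return 30.0 * t + math.sin(2.0 * t)      # climb
    if t <= 4.0:
        return 30.0 + 2.0 * math.sin(10.0 * t)   # cruise with fluctuations
    return 90.0 - 15.0 * t                       # descent


profile = [desired_altitude(0.5 * k) for k in range(13)]  # samples every 30 min
```

Sampling the function on a grid, as in the last line, gives the kind of target sequence that a neural network for control could be trained to follow.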

We used three-layer neural networks (with 10 hidden neurons) with the Levenberg-Marquardt algorithm and hyperbolic tangent sigmoidal functions as the activation functions for the neurons. We show in Figure 14 the function approximation achieved by the neural network for control after 800 epochs of training with a variable learning rate. The identification achieved by the neural network (after 800 epochs) can be considered very good because the error has been decreased to the order of 10^-1. Still, we can obtain a better approximation by using more hidden neurons or more layers. In any case, we can see clearly (from Figure 14) how the neural network learns to control the aircraft, because it is able to follow the arbitrary desired trajectory.

Fig. 14. Function approximation of the neural network for control of an airplane

We have to mention here that these simulation experiments for the case of a specific flight for a given airplane show very good results. We have also tried our approach for control with other types of flights and airplanes with good simulation results. Still, there is a lot of research to be done in this area because of the complex dynamics of aircraft systems.

We have developed a general method for adaptive model-based control of non-linear dynamic systems using Neural Networks, Fuzzy Logic and Fractal Theory. We illustrated our method for control with the case of controlling aircraft dynamics. In this case, the models represent the aircraft dynamics during flight. We also described in this paper an adaptive controller based on the use of neural networks and mathematical models for the system. The proposed adaptive controller performs rather well considering the complexity of the domain being considered in this research work. We have shown that our method can be used to control chaotic and unstable behavior in aircraft systems. Chaotic behavior has been associated with the "flutter" effect in real airplanes, and for this reason it is very important to avoid this kind of behavior. We can say that combining Neural Networks, Fuzzy Logic and Fractal Theory, using the advantages that each of these methodologies has, can give good results for this kind of application. Also, we believe that our neuro-fuzzy-fractal approach is a good alternative for solving similar problems.

8 Intelligent Control of the Battery Charging Process

In a battery, chemical energy is converted into electrical energy. The chemical energy contained in the electrode and electrolyte is converted into electrical power by means of electrochemical reactions. When the battery is connected to a source of direct current, electrons flow through the external circuit and ions flow inside the battery, so that charge accumulates in the battery. The quantity of electric current required to charge the battery is determined by a law of nature postulated by Michael Faraday, known as Faraday's law [2]. Faraday found that the quantity of electricity required to produce an electrochemical change in a metal is related to the relative weight of the metal. In the specific case of lead, this is considered to be 118 ampere-hours per pound of positive active material per cell. In practice, more energy is required to counteract the losses due to heat and to the generation of gas.

We show in Table 2 experimental data for a specific type of battery with different plate sizes and different numbers of plates per cell. In this table, we show the charging time and the average current needed for the respective charge. We can observe in Table 2 that to form a battery we need to apply a particular current intensity for a certain amount of time to achieve the required charge.

The goal of battery manufacturers is to reduce the time required to charge the battery. However, the current intensity cannot be increased arbitrarily because of the physical characteristics of the specific battery [12]. If the current is increased too much, the temperature in the battery will exceed a safe value, eventually destroying the battery.


Table 2. Experimental data for different types of batteries

                   Positive 0.060", Negative 0.050"      Positive 0.070", Negative 0.060"
Plates per cell    Total A.H.   72 hr Amp.   96 hr Amp.   Total A.H.   72 hr Amp.   96 hr Amp.
 7                 155          2.2          1.6          165          2.4          1.8
 9                 180          2.8          2.0          200          2.8          2.2
11                 230          3.2          2.4          245          3.4          2.4
13                 260          3.6          2.6          295          4.0          3.0
15                 300          4.2          3.0          345          4.8          3.6
17                 400          5.6          4.2          415          5.8          4.4

8.1 Fuzzy Method for Control

In this approach we use a statistical model to represent the electrochemical process and a fuzzy rule base for process control. The temperature in the battery depends on the electrical current that circulates in it during its formation. This means that, to maintain the temperature below a specific threshold, it is important to control the intensity of the current. Therefore, for this case the independent variable is the average current I and the dependent variable is the average temperature T. A simple statistical linear model can be stated as follows:

$$T = \beta_0 + \beta_1 I \qquad (25)$$

where $\beta_0$ and $\beta_1$ are parameters to be estimated (by least squares) using real data for this problem. In Table 3, we show experimental values for a battery of 6 Volts, which according to the manufacturer's specifications should be loaded by using 200 ampere-hours.

Table 3. Values of temperature and current for a battery of 200 ampere-hours

Hrs     T     I        Hrs     T     I
21:00   111   5.22     23:00   93    3.53
23:00   100   5.21     1:00    91    3.40
1:00    105   5.52     3:00    92    3.32
3:00    100   5.66     5:00    96    3.16
5:00    100   5.60     7:00    98    3.10
7:00    97    5.72     9:00    98    3.14
9:00    92    4.82     11:00   102   3.12
11:00   95    4.32     13:00   99    3.03
13:00   102   4.10     15:00   98    3.05
15:00   103   4.05     17:00   97    3.06
17:00   100   3.40     19:00   95    2.96
19:00   97    3.77     21:00   94    2.60
21:00   94    3.62     23:00   96    2.76

[Block diagram: fuzzy controller (inputs T and dT/dt, output I) connected to the electrochemical process (output T)]

Fig. 15. Fuzzy control of the process

Fig. 16. Fuzzy rule base for controlling the process

Using the data from Table 3 we can obtain (by the least squares method) the values of $\beta_0$ and $\beta_1$ [28]. The equation is as follows:

T = 88.03 + 2.5304 I (26)

with a correlation value of only 0.57, which is due to the complexity of the data.

For the fuzzy controller we used as input variables the temperature T and the change of temperature dT/dt, and as output variable the current intensity that should be applied to the battery. In Figure 15 we show the architecture of our control system.
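The least squares step behind a model of the form (25) can be sketched in a few lines. The data below are the 26 (T, I) pairs of Table 3 as transcribed in this section; the function name `fit_line` is only illustrative, and whether the printed coefficients agree with equation (26) depends on how faithfully the table was transcribed.

```python
# Table 3 as transcribed here: temperature T and current I, left column first.
T = [111, 100, 105, 100, 100, 97, 92, 95, 102, 103, 100, 97, 94,
     93, 91, 92, 96, 98, 98, 102, 99, 98, 97, 95, 94, 96]
I = [5.22, 5.21, 5.52, 5.66, 5.60, 5.72, 4.82, 4.32, 4.10, 4.05, 3.40, 3.77, 3.62,
     3.53, 3.40, 3.32, 3.16, 3.10, 3.14, 3.12, 3.03, 3.05, 3.06, 2.96, 2.60, 2.76]


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1, r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx                  # slope
    b0 = mean_y - b1 * mean_x       # intercept
    r = sxy / (sxx * syy) ** 0.5    # correlation coefficient
    return b0, b1, r


b0, b1, r = fit_line(I, T)
residuals = [t - (b0 + b1 * i) for t, i in zip(T, I)]
print(f"T = {b0:.2f} + {b1:.4f} I, correlation = {r:.2f}")
```

A fit with an intercept always has residuals that sum to zero, which gives a quick sanity check independent of the data.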

The control method was implemented in the MATLAB language. For each of the linguistic variables it was considered convenient to use five terms. In Figure 16 we show the fuzzy rule base implemented in the Fuzzy Logic Toolbox of MATLAB. We have 25 rules because we are using 5 linguistic terms for each of the two input variables (5 × 5 = 25). The membership functions were tuned manually until they gave the best values for the problem.
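To make the structure of such a rule base concrete, the following is a minimal sketch, not the authors' MATLAB implementation. It assumes evenly spaced triangular membership functions, a zero-order Sugeno (weighted average) inference, and purely hypothetical term names, input ranges, current levels and rule consequents; only the 5 × 5 = 25 rule layout is taken from the text.

```python
from itertools import product

TERMS = ["NB", "NS", "Z", "PS", "PB"]        # hypothetical names of the 5 terms
CURRENT_LEVELS = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical output currents [A]


def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def fuzzify(x, lo, hi):
    """Membership degrees of x in five evenly spaced terms on [lo, hi]."""
    step = (hi - lo) / 4.0
    x = min(max(x, lo), hi)  # saturate inputs outside the universe
    return [tri(x, lo + (k - 1) * step, lo + k * step, lo + (k + 1) * step)
            for k in range(5)]


# One rule per pair of terms: the hotter the battery and the faster it heats
# up, the lower the commanded current (assumed consequents).
RULES = {(i, j): CURRENT_LEVELS[4 - (i + j) // 2]
         for i, j in product(range(5), repeat=2)}


def control(temp, dtemp, t_range=(80.0, 120.0), dt_range=(-2.0, 2.0)):
    """Weighted average of the rule consequents (zero-order Sugeno)."""
    mu_t = fuzzify(temp, *t_range)
    mu_d = fuzzify(dtemp, *dt_range)
    num = den = 0.0
    for (i, j), level in RULES.items():
        w = min(mu_t[i], mu_d[j])   # firing strength of rule (i, j)
        num += w * level
        den += w
    return num / den


print(len(RULES), control(100.0, 0.0))
```

With these assumed consequents a hot, rapidly heating battery receives the lowest current level and a cold, cooling one the highest.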


8.2 Neuro-Fuzzy Method for Control

Since it is difficult to tune a particular inference system to model a complex dynamical system [1] it is convenient to use adaptive fuzzy inference systems. Adaptive neuro-fuzzy inference systems (ANFIS) can be used to adapt the membership functions and consequents of the rule base according to historical data of the problem [13]. In this case, we can use the data from Table 2 and apply the ANFIS methodology to find the best fuzzy system for our problem. We used the fuzzy logic toolbox of MATLAB to apply the ANFIS methodology to our problem with 5 membership functions and first order Sugeno functions in the consequents. We show in Figure 17 the non-linear surface for control.

Fig. 17. ANFIS surface for the process

8.3 Neuro-Fuzzy-Genetic Control

In this case, neural networks are used for modelling the electrochemical process, fuzzy logic for controlling the electrical current and genetic algorithms for adapting the membership functions of the fuzzy system [8]. A multilayer feedforward neural network was used for modelling the electrochemical process. We used the data from Table 3 and the Levenberg-Marquardt learning algorithm to train the neural network. We used a three-layer neural network with 15 nodes in the hidden layer. After training for 2000 epochs, the sum of squared errors was reduced from about 200 initially to 11.25 at the end, which is a very good approximation in this case. The fuzzy rule base was implemented in the Fuzzy Logic Toolbox of MATLAB.


In this case, 25 fuzzy rules were used because there were 5 linguistic terms for each input variable.

8.4 Experimental Results

The three hybrid control systems were compared by simulating the formation (loading) of a 6 Volts battery. This particular battery is manually loaded (in the plant) by applying 2 amperes for 50 hours under manufacturer’s specifications. We show in Table 4 the experimental results.

Table 4. Comparison of the Methods for Control

Control Method          Loading Time
Manual Control          50 hours
Conventional Control    36 hours
Fuzzy Control           32 hours
Neuro-Fuzzy Control     30 hours
Neuro-Fuzzy-Genetic     25 hours

We can see from Table 4 that the fuzzy control method reduces the time required to charge the battery by 36% compared with manual control, and by 11.11% compared with conventional PID control [27]. We can also see that ANFIS reduces this time even more, because neural networks are used for adapting the intelligent system; the reduction is now 40% with respect to manual control. Finally, the neuro-fuzzy-genetic approach reduces the time further because the genetic algorithm optimizes the fuzzy system. In this case, the reduction is 50% with respect to manual control.
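The percentages quoted above follow directly from the loading times in Table 4. A short check, with an illustrative helper name `reduction`, is:

```python
# Loading times from Table 4, in hours.
times = {
    "Manual Control": 50,
    "Conventional Control": 36,
    "Fuzzy Control": 32,
    "Neuro-Fuzzy Control": 30,
    "Neuro-Fuzzy-Genetic": 25,
}


def reduction(new, reference):
    """Relative reduction of the loading time, in percent."""
    return 100.0 * (reference - new) / reference


manual = times["Manual Control"]
for name, hours in times.items():
    print(f"{name}: {reduction(hours, manual):.2f}% less than manual control")
print(f"Fuzzy vs conventional: "
      f"{reduction(times['Fuzzy Control'], times['Conventional Control']):.2f}%")
```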

We have described in this section, three different approaches for controlling an electrochemical process. We have shown that for this type of application the use of several soft computing techniques can help in reducing the time required to produce a battery. Even fuzzy control alone can reduce the formation time of a battery, but using neural networks and genetic algorithms reduces even more the time for production. Of course, this means that manufacturers can produce the batteries in half the time needed before.

9 Conclusions

We can say that hybrid intelligent systems can be used to solve difficult real-world problems. Of course, the right hybrid architecture (and combination) has to be selected. At the moment, there are no general rules to decide on the right architecture for specific classes of problems. However, we can use the experience that other researchers have gained on these problems and use it to our advantage. Also, we always have to turn to experimental work to test different combinations of soft computing techniques and decide on the best one for ourselves. Finally, we can conclude that the use of soft computing for controlling dynamical systems is a very fruitful area of research, because of the excellent results that can be achieved without using complex mathematical models [8, 23].

Acknowledgments

We would like to thank the research grant committee of CONACYT-Mexico, for the financial support given to this research project, under grant 33780-A, and also COSNET for the research grants 743.99-P, 414.01-P and 487.02-P. We would also like to thank the Department of Computer Science of Tijuana Institute of Technology for the time and resources given to this project.

References

[1] Albertos, P., Strietzel, R., Mart, N.: Control Engineering Solutions: A practical approach. IEEE Computer Society Press, Los Alamitos (1997)

[2] Bode, H., Brodd, R.J., Kordesch, K.V.: Lead-Acid Batteries. John Wiley & Sons, Chichester (1977)

[3] Castillo, O., Melin, P.: Developing a New Method for the Identification of Microorganisms for the Food Industry using the Fractal Dimension. Journal of Fractals 2(3), 457–460 (1994)

[4] Castillo, O., Melin, P.: Mathematical Modelling and Simulation of Robotic Dynamic Systems using Fuzzy Logic Techniques and Fractal Theory. In: Proceedings of IMACS 1997, Berlin, Germany, vol. 5, pp. 343–348 (1997)

[5] Castillo, O., Melin, P.: A New Fuzzy-Fractal-Genetic Method for Automated Mathematical Modelling and Simulation of Robotic Dynamic Systems. In: Proceedings of FUZZ 1998, vol. 2, pp. 1182–1187. IEEE Press, Anchorage (1998)

[6] Castillo, O., Melin, P.: A New Fuzzy Inference System for Reasoning with Multiple Differential Equations for Modelling Complex Dynamical Systems. In: Proceedings of CIMCA 1999, pp. 224–229. IOS Press, Vienna (1999)

[7] Castillo, O., Melin, P.: Automated Mathematical Modelling, Simulation and Behavior Identification of Robotic Dynamic Systems using a New Fuzzy-Fractal-Genetic Approach. Journal of Robotics and Autonomous Systems 28(1), 19–30 (1999)

[8] Castillo, O., Melin, P.: Soft Computing for Control of Non-Linear Dynamical Systems. Springer, Heidelberg (2001)

[9] Chen, G., Pham, T.T.: Introduction to Fuzzy Sets, Fuzzy Logic, and Fuzzy Control Systems. CRC Press, Boca Raton (2001)

[10] Fu, K.S., Gonzalez, R.C., Lee, C.S.G.: Robotics: Control, Sensing, Vision and Intelligence. McGraw-Hill, New York (1987)

[11] Goldfeld, S.M., Quandt, R.E., Trotter, H.F.: Maximization by Quadratic Hill Climbing. Econometrica 34, 541–551 (1966)

[12] Hehner, N., Orsino, J.A.: Storage Battery Manufacturing Manual III. Independent Battery Manufacturers Association (1985)

[13] Jang, J.R., Sun, C.T., Mizutani, E.: Neuro-Fuzzy and Soft Computing. Prentice Hall, Englewood Cliffs (1997)

[14] Mandelbrot, B.: The Fractal Geometry of Nature. W.H. Freeman and Company, New York (1987)


[15] Marquardt, D.W.: An Algorithm for Least Squares Estimation of Non-Linear Parameters. Journal of the Society of Industrial and Applied Mathematics 11, 431–441 (1963)

[16] Melin, P., Castillo, O.: Modelling and Simulation for Bacteria Growth Control in the Food Industry using Artificial Intelligence. In: Proceedings of CESA 1996, Gerf EC Lille, Lille, France, pp. 676–681 (1996)

[17] Melin, P., Castillo, O.: An Adaptive Model-Based Neural Network Controller for Biochemical Reactors in the Food Industry. In: Proceedings of Control 1997, pp. 147–150. Acta Press, Canada (1997)

[18] Melin, P., Castillo, O.: An Adaptive Neural Network System for Bacteria Growth Control in the Food Industry using Mathematical Modelling and Simulation. In: Proceedings of IMACS World Congress 1997, vol. 4, pp. 203–208. W & T Verlag, Berlin (1997)

[19] Melin, P., Castillo, O.: Automated Mathematical Modelling and Simulation for Bacteria Growth Control in the Food Industry using Artificial Intelligence and Fractal Theory. Journal of Systems, Analysis, Modelling and Simulation, 189–206 (1997)

[20] Melin, P., Castillo, O.: An Adaptive Model-Based Neuro-Fuzzy-Fractal Controller for Biochemical Reactors in the Food Industry. In: Proceedings of IJCNN 1998, Anchorage Alaska, USA, vol. 1, pp. 106–111 (1998)

[21] Melin, P., Castillo, O.: A New Method for Adaptive Model-Based Neuro-Fuzzy-Fractal Control of Non-Linear Dynamic Plants: The Case of Biochemical Reactors. In: Proceedings of IPMU 1998, vol. 1, pp. 475–482. EDK Publishers, Paris (1998)

[22] Melin, P., Castillo, O.: A New Method for Adaptive Model-Based Neuro-Fuzzy-Fractal of Non-Linear Dynamical Systems. In: Proceedings of ICNPAA, pp. 499–506. European Conference Publications, Daytona Beach (1999)

[23] Melin, P., Castillo, O.: Modelling, Simulation and Control of Non-Linear Dynamical Systems. Taylor and Francis Publishers, London (2002)

[24] Miller, W.T., Sutton, R.S., Werbos, P.J.: Neural Networks for Control. MIT Press, Cambridge (1995)

[25] Nakamura, S.: Numerical Analysis and Graphic Visualization with MATLAB. Prentice-Hall, Englewood Cliffs (1997)

[26] Narendra, K.S., Annaswamy, A.M.: Stable Adaptive Systems. Prentice Hall Publishing, Englewood Cliffs (1989)

[27] Rasband, S.N.: Chaotic Dynamics of Non-Linear Systems. John Wiley & Sons, Chichester (1990)

[28] Sepulveda, R., Castillo, O., Montiel, O., Lopez, M.: Analysis of Fuzzy Control System for Process of Forming Batteries. In: ISRA 1998, Mexico, pp. 203–210 (1998)

[29] Sugeno, M., Kang, G.T.: Structure Identification of Fuzzy Model. Fuzzy Sets and Systems 28, 15–33 (1988)

[30] Takagi, T., Sugeno, M.: Fuzzy Identification of Systems and its Applications to Modelling and Control. IEEE Transactions on Systems, Man and Cybernetics 15, 116–132 (1985)

[31] Ungar, L.H.: A Bioreactor Benchmark for Adaptive Network-Based Process Control. In: Neural Networks for Control, pp. 387–402. MIT Press, Cambridge (1995)

[32] Zadeh, L.A.: The Concept of a Linguistic Variable and its Application to Approximate Reasoning. Information Sciences 8, 43–80 (1975)


Model Reference Adaptive Control of Underwater Robot in Spatial Motion

Jerzy Garus

Naval University, ul. Śmidowicza 69, 81-103 Gdynia, Poland

Abstract. The paper addresses nonlinear control of an underwater robot. The way-point line-of-sight scheme is incorporated for the tracking of a desired trajectory. Command signals are generated by an autopilot consisting of four controllers with a parameter adaptation law implemented. The quality of control is considered in the presence of environmental disturbances. Some computer simulations are provided to demonstrate the effectiveness, correctness and robustness of the approach.

1 Introduction

Interest in underwater robotics has been increasing in recent years. The main benefits of using Underwater Robotic Vehicles (URV) are the removal of humans from the dangers of the undersea environment and a reduction in the cost of exploring deep seas. Currently, it is common to use a URV to accomplish missions such as inspections of coastal and off-shore structures, cable maintenance, and hydrographical and biological surveys. In the military field it is employed in such tasks as surveillance, intelligence gathering, torpedo recovery and mine countermeasures.

The URV is considered to be a floating platform carrying the tools required for performing various functions, such as manipulator arms with interchangeable end-effectors, cameras, scanners, sonars, etc. Automatic control of such objects is a difficult problem because of their nonlinear dynamics [1, 3, 4, 5, 6]. Moreover, the dynamics can change when the configuration is altered to suit the mission. In order to cope with those difficulties, the control system should be flexible.

The conventional URV operates in a crab-wise manner in four degrees of freedom (DOF), with small roll and pitch angles that can be neglected during normal operations. Therefore its basic motion is movement in the horizontal plane with some variation due to diving.

The objective of the paper is to present the use of an adaptive inverse dynamics algorithm for driving the robot along a desired trajectory in spatial motion. The paper consists of four sections. Brief descriptions of the dynamical and kinematical equations of motion of the URV and of the adaptive control law are presented in Section 2. Next, some results of the simulation study are provided. Conclusions are given in Section 4.

2 Nonlinear Adaptive Control Law

The general motion of a marine vessel in six DOF is described by the following vectors [2, 4, 5]:

$$\boldsymbol{\eta} = [x, y, z, \phi, \theta, \psi]^T, \qquad \mathbf{v} = [u, v, w, p, q, r]^T, \qquad \boldsymbol{\tau} = [X, Y, Z, K, M, N]^T \qquad (1)$$

where:

η – vector of position and orientation in the inertial frame;
x, y, z – coordinates of position;
φ, θ, ψ – coordinates of orientation (Euler angles);
v – vector of linear and angular velocities with coordinates in the body-fixed frame;
u, v, w – linear velocities along longitudinal, transversal and vertical axes;
p, q, r – angular velocities about longitudinal, transversal and vertical axes;
τ – vector of forces and moments acting on the robot in the body-fixed frame;
X, Y, Z – forces along longitudinal, transversal and vertical axes;
K, M, N – moments about longitudinal, transversal and vertical axes.

Nonlinear dynamical and kinematical equations of motion in the body-fixed frame can be expressed as [4, 5]:

$$\mathbf{M}\dot{\mathbf{v}} + \mathbf{C}(\mathbf{v})\mathbf{v} + \mathbf{D}(\mathbf{v})\mathbf{v} + \mathbf{g}(\boldsymbol{\eta}) = \boldsymbol{\tau} \qquad (2a)$$

$$\dot{\boldsymbol{\eta}} = \mathbf{J}(\boldsymbol{\eta})\mathbf{v} \qquad (2b)$$

where:

M – inertia matrix (including added mass);
C(v) – matrix of Coriolis and centripetal terms (including added mass);
D(v) – hydrodynamic damping and lift matrix;
g(η) – vector of gravitational forces and moments;
J(η) – velocity transformation matrix between the body-fixed frame and the inertial one.


The robot’s dynamics in the inertial frame can be written as [4, 5]:

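$$\mathbf{M}_\eta(\boldsymbol{\eta})\ddot{\boldsymbol{\eta}} + \mathbf{C}_\eta(\mathbf{v},\boldsymbol{\eta})\dot{\boldsymbol{\eta}} + \mathbf{D}_\eta(\mathbf{v},\boldsymbol{\eta})\dot{\boldsymbol{\eta}} + \mathbf{g}_\eta(\boldsymbol{\eta}) = \boldsymbol{\tau}_\eta \qquad (3)$$

where:

$$\begin{aligned}
\mathbf{M}_\eta(\boldsymbol{\eta}) &= \mathbf{J}^{-T}(\boldsymbol{\eta})\,\mathbf{M}\,\mathbf{J}^{-1}(\boldsymbol{\eta}) \\
\mathbf{C}_\eta(\mathbf{v},\boldsymbol{\eta}) &= \mathbf{J}^{-T}(\boldsymbol{\eta})\left[\mathbf{C}(\mathbf{v}) - \mathbf{M}\,\mathbf{J}^{-1}(\boldsymbol{\eta})\,\dot{\mathbf{J}}(\boldsymbol{\eta})\right]\mathbf{J}^{-1}(\boldsymbol{\eta}) \\
\mathbf{D}_\eta(\mathbf{v},\boldsymbol{\eta}) &= \mathbf{J}^{-T}(\boldsymbol{\eta})\,\mathbf{D}(\mathbf{v})\,\mathbf{J}^{-1}(\boldsymbol{\eta}) \\
\mathbf{g}_\eta(\boldsymbol{\eta}) &= \mathbf{J}^{-T}(\boldsymbol{\eta})\,\mathbf{g}(\boldsymbol{\eta}) \\
\boldsymbol{\tau}_\eta &= \mathbf{J}^{-T}(\boldsymbol{\eta})\,\boldsymbol{\tau}
\end{aligned}$$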

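There are parametric uncertainties in the dynamic model (2a), and some parameters are generally unknown. Hence, parameter estimation is necessary in the case of model-based control. For this purpose it is assumed that the robot equations of motion are linear in a parameter vector p, that is [8]:

$$\mathbf{M}\dot{\mathbf{v}} + \mathbf{C}(\mathbf{v})\mathbf{v} + \mathbf{D}(\mathbf{v})\mathbf{v} + \mathbf{g}(\boldsymbol{\eta}) \cong \mathbf{Y}(\boldsymbol{\eta},\mathbf{v},\dot{\mathbf{v}})\,\mathbf{p} = \boldsymbol{\tau} \qquad (4)$$

where $\mathbf{Y}(\boldsymbol{\eta},\mathbf{v},\dot{\mathbf{v}})$ is a known matrix function of measured signals, usually referred to as the regressor matrix (dimension n×r), and p is a vector of uncertain or unknown parameters.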

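Let us define the nonlinear URV dynamics (2a) in a compact form as:

$$\mathbf{M}\dot{\mathbf{v}} + \mathbf{h}(\mathbf{v},\boldsymbol{\eta}) = \boldsymbol{\tau} \qquad (5)$$

where h is the nonlinear vector:

$$\mathbf{h}(\mathbf{v},\boldsymbol{\eta}) = \mathbf{C}(\mathbf{v})\mathbf{v} + \mathbf{D}(\mathbf{v})\mathbf{v} + \mathbf{g}(\boldsymbol{\eta}) \qquad (6)$$

The control law with parameter adaptation, under the assumption that the parameters of the desired trajectory $\boldsymbol{\eta}_d$, $\dot{\boldsymbol{\eta}}_d$ and $\ddot{\boldsymbol{\eta}}_d$ are given and the vectors $\boldsymbol{\eta}$, $\mathbf{v}$ and $\dot{\mathbf{v}}$ are measured, takes the form [5, 8]:

$$\boldsymbol{\tau} = \hat{\mathbf{M}}\mathbf{a} + \hat{\mathbf{h}}(\mathbf{v},\boldsymbol{\eta}) \qquad (7)$$

where the hat denotes the adaptive parameter estimates and a is the commanded acceleration.

Substituting (7) into (5) and adding and subtracting $\hat{\mathbf{M}}\dot{\mathbf{v}}$ on the left side of the dynamical equation yields:

$$\hat{\mathbf{M}}(\dot{\mathbf{v}} - \mathbf{a}) = \tilde{\mathbf{M}}\dot{\mathbf{v}} + \tilde{\mathbf{h}}(\mathbf{v},\boldsymbol{\eta}) \qquad (8)$$

where $\tilde{\mathbf{M}} = \hat{\mathbf{M}} - \mathbf{M}$ and $\tilde{\mathbf{h}}(\mathbf{v},\boldsymbol{\eta}) = \hat{\mathbf{h}}(\mathbf{v},\boldsymbol{\eta}) - \mathbf{h}(\mathbf{v},\boldsymbol{\eta})$.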


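Since the equations of motion are linear in the parameter vector p, the following parameterization can be applied:

$$\tilde{\mathbf{M}}\dot{\mathbf{v}} + \tilde{\mathbf{h}}(\mathbf{v},\boldsymbol{\eta}) = \mathbf{Y}(\boldsymbol{\eta},\mathbf{v},\dot{\mathbf{v}})\,\tilde{\mathbf{p}} \qquad (9)$$

where $\tilde{\mathbf{p}} = \hat{\mathbf{p}} - \mathbf{p}$ is the unknown parameter error vector.

Differentiation of the kinematical equation (2b) with respect to time yields:

$$\dot{\mathbf{v}} = \mathbf{J}^{-1}(\boldsymbol{\eta})\left[\ddot{\boldsymbol{\eta}} - \dot{\mathbf{J}}(\boldsymbol{\eta})\mathbf{v}\right] \qquad (10)$$

Substituting (10) into (8) and choosing the commanded acceleration a in the form $\mathbf{a} = \mathbf{J}^{-1}(\boldsymbol{\eta})\left[\mathbf{a}_\eta - \dot{\mathbf{J}}(\boldsymbol{\eta})\mathbf{v}\right]$, the following expression is obtained:

$$\hat{\mathbf{M}}\,\mathbf{J}^{-1}(\boldsymbol{\eta})\left[\ddot{\boldsymbol{\eta}} - \mathbf{a}_\eta\right] = \mathbf{Y}(\boldsymbol{\eta},\mathbf{v},\dot{\mathbf{v}})\,\tilde{\mathbf{p}} \qquad (11)$$

Multiplying (11) by $\mathbf{J}^{-T}(\boldsymbol{\eta})$ gives:

$$\hat{\mathbf{M}}_\eta(\boldsymbol{\eta})\left[\ddot{\boldsymbol{\eta}} - \mathbf{a}_\eta\right] = \mathbf{J}^{-T}(\boldsymbol{\eta})\,\mathbf{Y}(\boldsymbol{\eta},\mathbf{v},\dot{\mathbf{v}})\,\tilde{\mathbf{p}} \qquad (12)$$

where $\hat{\mathbf{M}}_\eta(\boldsymbol{\eta}) = \mathbf{J}^{-T}(\boldsymbol{\eta})\,\hat{\mathbf{M}}\,\mathbf{J}^{-1}(\boldsymbol{\eta})$. Furthermore, let the commanded acceleration $\mathbf{a}_\eta$ be chosen as the PDD²-type control [5]:

$$\mathbf{a}_\eta = \ddot{\boldsymbol{\eta}}_d - \mathbf{K}_D\dot{\tilde{\boldsymbol{\eta}}} - \mathbf{K}_P\tilde{\boldsymbol{\eta}} \qquad (13)$$

where $\tilde{\boldsymbol{\eta}} = \boldsymbol{\eta} - \boldsymbol{\eta}_d$ is the tracking error and $\mathbf{K}_P$, $\mathbf{K}_D$ are positive definite diagonal matrices. Hence, the error dynamics can be written in the form:

$$\hat{\mathbf{M}}_\eta(\boldsymbol{\eta})\left(\ddot{\tilde{\boldsymbol{\eta}}} + \mathbf{K}_D\dot{\tilde{\boldsymbol{\eta}}} + \mathbf{K}_P\tilde{\boldsymbol{\eta}}\right) = \mathbf{J}^{-T}(\boldsymbol{\eta})\,\mathbf{Y}(\boldsymbol{\eta},\mathbf{v},\dot{\mathbf{v}})\,\tilde{\mathbf{p}} \qquad (14)$$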

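Assuming that $\hat{\mathbf{M}}_\eta^{-1}(\boldsymbol{\eta})$ exists, the expression (14) can be written in a state-space form:

$$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\,\mathbf{J}^{-T}(\boldsymbol{\eta})\,\mathbf{Y}(\boldsymbol{\eta},\mathbf{v},\dot{\mathbf{v}})\,\tilde{\mathbf{p}} \qquad (15)$$

where:

$$\mathbf{x} = \begin{bmatrix} \tilde{\boldsymbol{\eta}} \\ \dot{\tilde{\boldsymbol{\eta}}} \end{bmatrix}, \qquad \mathbf{A} = \begin{bmatrix} \mathbf{0} & \mathbf{I} \\ -\mathbf{K}_P & -\mathbf{K}_D \end{bmatrix}, \qquad \mathbf{B} = \begin{bmatrix} \mathbf{0} \\ \hat{\mathbf{M}}_\eta^{-1}(\boldsymbol{\eta}) \end{bmatrix}.$$

If the parameter vector estimate is updated according to the formula [5, 6]:

$$\dot{\hat{\mathbf{p}}} = -\boldsymbol{\Gamma}^{-1}\,\mathbf{Y}^T(\boldsymbol{\eta},\mathbf{v},\dot{\mathbf{v}})\,\mathbf{J}^{-1}(\boldsymbol{\eta})\,\mathbf{B}^T\mathbf{P}\mathbf{x} \qquad (16)$$

where Γ and P are symmetric positive definite matrices, then convergence of $\tilde{\boldsymbol{\eta}}$ to zero is guaranteed. A block diagram of the control system with the parameter adaptation law is shown in Fig. 1.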

Fig. 1. A block diagram with the parameter adaptation law
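The convergence claim attached to the update law (16) rests on a standard Lyapunov argument, sketched below. The sketch assumes that the true parameter vector p is constant, so that $\dot{\tilde{\mathbf{p}}} = \dot{\hat{\mathbf{p}}}$, and that P solves the Lyapunov equation $\mathbf{A}^T\mathbf{P} + \mathbf{P}\mathbf{A} = -\mathbf{Q}$ for some symmetric positive definite Q; the matrix Q is introduced here for illustration only.

```latex
\begin{aligned}
V(\mathbf{x},\tilde{\mathbf{p}}) &= \mathbf{x}^T\mathbf{P}\mathbf{x} + \tilde{\mathbf{p}}^T\boldsymbol{\Gamma}\tilde{\mathbf{p}}, \\
\dot{V} &= \mathbf{x}^T(\mathbf{A}^T\mathbf{P} + \mathbf{P}\mathbf{A})\mathbf{x}
   + 2\,\mathbf{x}^T\mathbf{P}\mathbf{B}\,\mathbf{J}^{-T}\mathbf{Y}\tilde{\mathbf{p}}
   + 2\,\tilde{\mathbf{p}}^T\boldsymbol{\Gamma}\dot{\hat{\mathbf{p}}} \\
 &= -\mathbf{x}^T\mathbf{Q}\mathbf{x}
   + 2\,\tilde{\mathbf{p}}^T\left[\mathbf{Y}^T\mathbf{J}^{-1}\mathbf{B}^T\mathbf{P}\mathbf{x}
   + \boldsymbol{\Gamma}\dot{\hat{\mathbf{p}}}\right].
\end{aligned}
```

With $\dot{\hat{\mathbf{p}}}$ chosen as in (16) the bracket vanishes, so $\dot{V} = -\mathbf{x}^T\mathbf{Q}\mathbf{x} \le 0$ and the tracking error part of the state cannot grow.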

3 Simulation Results

The main task of the proposed tracking control system is to minimize the distance of the robot's centre of gravity from the desired trajectory under the following assumptions:

1. the robot can move with varying linear velocities u, v, w and angular velocity r;
2. its velocities u, v, w, r, coordinates of position x, y, z and heading ψ are measurable;
3. the desired trajectory is given by means of a set of way-points $\{(x_{di}, y_{di}, z_{di})\}$;
4. reference trajectories between two successive way-points are defined as smooth and bounded curves;
5. the command signal τ consists of four components: $\tau_X = X$, $\tau_Y = Y$, $\tau_Z = Z$ and $\tau_N = N$, calculated from the control law (7).

The structure of the proposed automatic control system is depicted in Fig. 2.


Fig. 2. The main parts of the control system

To validate the performance of the developed nonlinear control law, some simulation results obtained in the MATLAB/Simulink environment are presented. The mathematical model of the URV is based on a real construction, i.e. the underwater robotic vehicle called "Coral", designed and built for the Polish Navy. It is an open-frame robot controllable in four DOF, 1.5 m long and with a propulsion system consisting of six thrusters. Displacement in the horizontal plane is done by means of four thrusters which generate a force of up to ±750 N, assuring a speed of up to ±1.2 m/s and ±0.6 m/s in the x and y directions, respectively. In the vertical plane two thrusters are used, assuring a speed of up to ±0.35 m/s. All parameters of the robot's dynamics are presented in the Appendix.

The numerical simulations have been done for the following assumptions:

1. The robot has to follow the desired trajectory beginning from (10 m, 10 m, 0 m), passing target way-points: (10 m, 10 m, -5 m), (10 m, 90 m, -5 m), (30 m, 90 m, -5 m), (30 m, 10 m, -5 m), (60 m, 10 m, -5 m), (60 m, 90 m, -5 m), (60 m, 90 m, -15 m), (60 m, 10 m, -15 m), (30 m, 10 m, -15 m), (30 m, 90 m, -15 m), (10 m, 90 m, -15 m) and ending in (10 m, 10 m, -15 m);

2. The turning point is reached when the robot is inside of the 0.5 meter circle of acceptance;

3. The sea current acts on the robot's hull with a maximum velocity of 0.3 m/s and a direction of 135°;

4. Dynamic equations of the robot’s motion are integrated with higher frequency (18 Hz) than the rest of modules (6 Hz).
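The way-point logic implied by assumptions 1 and 2 can be sketched as follows. The way-point list is the one given above; the function names, the use of a three-dimensional Euclidean distance for the "circle of acceptance" and the simple line-of-sight heading formula are assumptions made for illustration, not the author's implementation.

```python
import math

# Target way-points from assumption 1, in metres (x, y, z).
WAYPOINTS = [
    (10, 10, -5), (10, 90, -5), (30, 90, -5), (30, 10, -5),
    (60, 10, -5), (60, 90, -5), (60, 90, -15), (60, 10, -15),
    (30, 10, -15), (30, 90, -15), (10, 90, -15), (10, 10, -15),
]


def next_waypoint_index(position, waypoints, index, radius=0.5):
    """Switch to the next way-point once the robot is within `radius`."""
    reached = math.dist(position, waypoints[index]) <= radius
    if reached and index < len(waypoints) - 1:
        return index + 1
    return index


def los_heading(position_xy, target_xy):
    """Line-of-sight heading [rad] from the robot to the target way-point."""
    dx = target_xy[0] - position_xy[0]
    dy = target_xy[1] - position_xy[1]
    return math.atan2(dy, dx)


index = next_waypoint_index((10.0, 10.2, -4.9), WAYPOINTS, 0)
heading = los_heading((10.0, 10.0), WAYPOINTS[index][:2])
print(index, math.degrees(heading))
```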

It has been assumed that the time-varying reference trajectories from the way-point i to the next way-point i+1 are generated using desired speed profiles [7, 8]. Such an approach allows us to keep a constant speed along a certain part of the path. For those assumptions and the following initial conditions:

$$\begin{aligned}
\eta_{dk}(t_b) &= \eta_0, & \dot{\eta}_{dk}(t_b) &= \dot{\eta}_0 \\
\eta_{dk}(t_f) &= \eta_1, & \dot{\eta}_{dk}(t_f) &= \dot{\eta}_1 \\
\max \dot{\eta}_{dk}(t) &= \dot{\eta}_{\max}
\end{aligned} \qquad (17)$$

where $k = 1, \ldots, 4$, the ith segment of the trajectory in the period of time $t \in \langle t_b, t_f \rangle$ is modelled according to the expression [8]:

( )( )

( )( )

⎪⎪⎪⎪

⎩

⎪⎪⎪⎪

⎨

⎧

≤<−−−−

−≤<−+

+−−+

≤≤−+

=

fmffm

mfm

m

mf

mbm

dk

ttttttt

tttt

tt

tt

ttttt

t

21max1

max

max01

20max0

2

2

22

ηηη

η

ηηη

ηηη

η

where $t_m = t_f - \dfrac{\eta_1 - \eta_0}{\dot{\eta}_{\max}}$.
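A speed profile of this kind, with a constant cruise speed and smooth start and stop, can be illustrated with the linear segment with parabolic blends described in [8]. The sketch below is written for a segment that starts at time 0 with zero initial and final velocity and with the blend time computed as in the definition of $t_m$ above; it is an assumed standard form and not necessarily the exact expression used in the paper. The function and argument names are illustrative.

```python
def lspb(t, q0, q1, v_max, t_f):
    """Linear segment with parabolic blends from q0 (t = 0) to q1 (t = t_f).

    Assumes q1 > q0, v_max > 0 and zero velocity at both ends.
    """
    t_m = t_f - (q1 - q0) / v_max           # blend (acceleration) time
    if not 0.0 < t_m <= t_f / 2.0:
        raise ValueError("v_max and t_f are not compatible with this profile")
    a = v_max / t_m                         # constant blend acceleration
    if t <= 0.0:
        return q0
    if t <= t_m:                            # parabolic start
        return q0 + 0.5 * a * t * t
    if t <= t_f - t_m:                      # constant-speed segment
        return 0.5 * (q1 + q0 - v_max * t_f) + v_max * t
    if t < t_f:                             # parabolic stop
        return q1 - 0.5 * a * (t_f - t) ** 2
    return q1


# Example with assumed numbers: 10 m in 10 s with a cruise speed of 1.5 m/s.
for t in (0.0, 2.5, 5.0, 7.5, 10.0):
    print(t, round(lspb(t, 0.0, 10.0, 1.5, 10.0), 4))
```

The three pieces join continuously at the switching times, which is the property that makes the reference trajectory smooth and bounded between successive way-points.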

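The control algorithm has been worked out on the basis of the simplified URV model proposed in [4, 9]:

$$\mathbf{M}_s\dot{\mathbf{v}} + \mathbf{D}_s(\mathbf{v})\mathbf{v} = \boldsymbol{\tau} \qquad (18)$$

where all kinematic and dynamic cross-coupling terms are neglected. Here $\mathbf{M}_s$ and $\mathbf{D}_s(\mathbf{v})$ are diagonal matrices with the diagonal elements of the inertia matrix M and of the nonlinear damping matrix $\mathbf{D}_n(\mathbf{v})$, respectively (see the Appendix). Uncertainties in the above model are compensated in the control system. Therefore, the robot's model for spatial motion in four DOF can be written in the following form:

$$\begin{aligned}
m_X\dot{u} + d_X u|u| &= \tau_X \\
m_Y\dot{v} + d_Y v|v| &= \tau_Y \\
m_Z\dot{w} + d_Z w|w| &= \tau_Z \\
m_N\dot{r} + d_N r|r| &= \tau_N
\end{aligned} \qquad (19)$$

Defining the parameter vector p as $\mathbf{p} = [m_X \;\; d_X \;\; m_Y \;\; d_Y \;\; m_Z \;\; d_Z \;\; m_N \;\; d_N]^T$, the equation (18) can be written in the form:

$$\mathbf{Y}(\mathbf{v},\dot{\mathbf{v}})\,\mathbf{p} = \boldsymbol{\tau} \qquad (20)$$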


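where:

$$\mathbf{Y}(\mathbf{v},\dot{\mathbf{v}}) = \begin{bmatrix}
\dot{u} & u|u| & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \dot{v} & v|v| & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \dot{w} & w|w| & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \dot{r} & r|r|
\end{bmatrix}.$$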
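The linear-in-parameters form (20) of the simplified four-DOF model can be checked numerically. In the sketch below the regressor has one row per degree of freedom and two columns (mass and quadratic damping) per row; the parameter values are the diagonal entries read from the Appendix as transcribed, and the function names and test velocities are illustrative assumptions.

```python
def regressor(v, v_dot):
    """4x8 regressor for velocities v = (u, v, w, r) and their derivatives."""
    Y = [[0.0] * 8 for _ in range(4)]
    for k in range(4):
        Y[k][2 * k] = v_dot[k]               # multiplies the mass-like term
        Y[k][2 * k + 1] = v[k] * abs(v[k])   # multiplies the damping term
    return Y


def matvec(Y, p):
    """Plain matrix-vector product, to keep the sketch dependency-free."""
    return [sum(y * pk for y, pk in zip(row, p)) for row in Y]


# p = [m_X, d_X, m_Y, d_Y, m_Z, d_Z, m_N, d_N] (values as read from the Appendix)
p = [99.0, 227.18, 108.5, 405.41, 126.5, 478.03, 29.1, 12.937]

v = (1.0, -0.5, 0.2, 0.1)        # assumed test velocities
v_dot = (0.1, 0.0, -0.05, 0.0)   # assumed test accelerations

tau = matvec(regressor(v, v_dot), p)
print(tau)
```

Each component of `tau` equals the corresponding row of the scalar model, mass times acceleration plus damping times velocity times its absolute value, which is what makes the model usable with a parameter update law such as (16).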

[Plots: three-dimensional trajectory (position x [m], position y [m], position z [m]); position x, y, z [m] and errors x, y, z [m] versus time [s]; course psi [deg] and error psi [deg] versus time [s]; forces and moment X [N], Y [N], Z [N], N [Nm] versus time [s]]

Fig. 3. Track-keeping control under interaction of sea current disturbances (maximum velocity 0.3 m/s and direction 135°): desired (d) and real (r) trajectories (upper plot), x-, y-, z-position and their errors (2nd to 4th plots), course and its error (5th plot), commands (bottom plot)

The control problem has been examined under the interaction of environmental disturbances, i.e. a sea current. To simulate its effect on the robot's motion, it is assumed that the current's velocity $V_c$ is slowly varying and that its direction is fixed. For simulation needs the current velocity was generated by using the first-order Gauss-Markov process [5]:

$$\dot{V}_c + \mu V_c = \omega \qquad (21)$$

where ω is a Gaussian white noise, $\mu \ge 0$ is a constant and $0 \le V_c(t) \le V_{c\max}$.
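A process of the form (21) can be simulated with a simple forward Euler step and saturation of the result to the admissible range. The sketch below is only illustrative: the step size, the noise intensity, the value of μ and the simplified treatment of the white noise are assumptions, and only the upper bound of 0.3 m/s is taken from the simulation assumptions.

```python
import random


def simulate_current(v0, mu, sigma, v_max, dt, steps, seed=0):
    """Euler simulation of dVc/dt + mu * Vc = w, kept inside [0, v_max]."""
    rng = random.Random(seed)            # seeded for repeatable runs
    vc = v0
    history = [vc]
    for _ in range(steps):
        w = rng.gauss(0.0, sigma)        # sample of the driving noise
        vc = vc + dt * (-mu * vc + w)    # forward Euler step of (21)
        vc = min(max(vc, 0.0), v_max)    # enforce 0 <= Vc <= Vc_max
        history.append(vc)
    return history


# Assumed parameters: 2000 s of simulated time with a 1 s step.
current = simulate_current(v0=0.15, mu=0.01, sigma=0.02, v_max=0.3,
                           dt=1.0, steps=2000)
print(min(current), max(current))
```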

[Plots: estimates of m and d versus time [s] for motion along the x, y and z axes (m [kg], d [kg/m]) and for rotation about the z axis (m [kg m²], d [kg m²])]

Fig. 4. Estimates of mass and damping coefficients: set value (s) and estimate (e)

Results of track-keeping in the presence of external disturbances and time histories of the command signals are presented in Fig. 3.

It can be seen that the proposed autopilot ensures good tracking control along the desired trajectory in spatial motion. The main advantages of the approach are the simple nonlinear law used to design the controllers and its high performance for relatively large sea current disturbances (comparable with the resultant speed of the robot).

Since the true values of the components of the vector p are unknown, the estimation process started from half of the nominal values. Time histories of the estimated parameters during track-keeping are presented in Fig. 4.

4 Conclusions

In the paper the nonlinear control system for the underwater robot has been described. The results obtained with the autopilot consisting of four controllers with the parameter adaptation law implemented have shown that the proposed control system is simple and useful in practice.

Disturbances from the sea current were added in the simulation study to verify the performance, correctness and robustness of the approach.

Further works are devoted to the problem of tuning of the autopilot parameters in relation to the robot’s dynamics.

References

[1] Antonelli, G., Caccavale, F., Sarkar, S., West, M.: Adaptive Control of an Autonomous Underwater Vehicle: Experimental Results on ODIN. IEEE Transactions on Control Systems Technology 9(5), 756–765 (2001)

[2] Bhattacharyya, R.: Dynamics of Marine Vehicles. John Wiley and Sons, Chichester (1978)

[3] Craven, J., Sutton, R., Burns, R.S.: Control Strategies for Unmanned Underwater Vehicles. Journal of Navigation 1(51), 79–105 (1998)

[4] Fossen, T.I.: Guidance and Control of Ocean Vehicles. John Wiley and Sons, Chichester (1994)

[5] Fossen, T.I.: Marine Control Systems. Marine Cybernetics AS, Trondheim (2002)

[6] Garus, J.: Design of URV Control System Using Nonlinear PD Control. WSEAS Transactions on Systems 4(5), 770–778 (2005)

[7] Garus, J., Kitowski, Z.: Tracking Autopilot for Underwater Robotic Vehicle. In: Cagnol, J., Zolesio, J.P. (eds.) Information Processing: Recent Mathematical Advances in Optimization and Control, pp. 127–138. Presses de l'Ecole des Mines de Paris (2004)

[8] Spong, M.W., Vidyasagar, M.: Robot Dynamics and Control. John Wiley and Sons, Chichester (1989)

[9] Yoerger, D.R., Slotine, J.E.: Robust Trajectory Control of Underwater Vehicles. IEEE Journal of Oceanic Engineering (4), 462–470 (1985)


Appendix

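The URV model. The following parameters of the dynamics of the underwater robot have been used in computer simulations:

$$\mathbf{M} = \mathrm{diag}\{99.0 \;\; 108.5 \;\; 126.5 \;\; 8.2 \;\; 32.9 \;\; 29.1\}$$

$$\mathbf{D}(\mathbf{v}) = \mathbf{D} + \mathbf{D}_n(\mathbf{v}) = \mathrm{diag}\{10.0 \;\; 0.0 \;\; 0.0 \;\; 0.223 \;\; 1.918 \;\; 1.603\} + \mathrm{diag}\{227.18|u| \;\; 405.41|v| \;\; 478.03|w| \;\; 3.212|p| \;\; 14.002|q| \;\; 12.937|r|\}$$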

( )

⎥⎥⎥⎥⎥⎥⎥

⎦

⎤

⎢⎢⎢⎢⎢⎢⎢

⎣

⎡

−−−−

−−−

−−

=

03.18.605.180.28

3.109.55.1800.26

8.69.500.280.260

05.180.28000

5.1800.26000

0.280.260000

pquv

pruw

qrvw

uv

uw

vw

vC

( )

( )⎥⎥⎥⎥⎥⎥⎥

⎦

⎤

⎢⎢⎢⎢⎢⎢⎢

⎣

⎡

+−−

−

=

0

)cos()cos()sin(2.279

)sin()cos(2.279

)cos()cos(0.17

)sin()cos(0.17

)sin(0.17

φθθφθ

φθφθ

θ

ηg

Feedback Stabilization of Distributed Parameter Gyroscopic Systems

Paweł Skruch

AGH University of Science and Technology, Institute of Automation, al. Mickiewicza 30/B1, 30-059 Krakow

Abstract. In this paper feedback stabilization of distributed parameter gyroscopic systems is discussed. The class of such systems is described by second-order operator equations. We show that the closed loop system which consists of the controlled system, linear non-velocity feedback and a parallel compensator is asymptotically stable. In the case where velocity is available, the parallel compensator is not necessary to stabilize the system. We present our results here for the multi-input multi-output case. The stability issues are proved by LaSalle's theorem extended to infinite dimensional systems. Numerical examples are given to illustrate the effectiveness of the proposed controllers.

1 Introduction

Many physical systems are represented by partial differential equations. As anexample we can consider robots with flexible links, vibrating structures such asbeams, buildings, bridges, etc. For the most part, it is not possible or feasibleto obtain a solution of these equations. Therefore in practice, a distributed pa-rameter system is first discretized to a matrix second-order model using someapproximate methods. Then the problem is solved for this discretized reduced-order model.

It is well-known that a dangerous situation called resonance occurs when oneor more natural frequences of the system become equal or close to a frequencyof the external force. Because a linear infinite dimensional system described byan operator second-order differential equation without damping term may havean infinite number of poles on the imaginary axis [17], [18], [26], the approxi-mate solutions are not suitable for designing the stabilizer. To combat possibleundesirable effects of vibrations, the dynamic effect of the system parts whosebehaviour are described by partial differential equations has to be taken intoaccount in designing a controller.

Stability of second-order systems both in finite and infinite dimensional casehas been studied in the past. More recently, in [19] and [20] the dynamics andstability of LC ladder network by inner resistance, by velocity feedback and byfirst range dynamic feedback are studied. Control problems for finite dimensionalundamped second-order systems are discussed in [12] and [21]. In [28], the class


86 P. Skruch

of non-linear controllers is proposed to stabilize damped gyroscopic systems. Sta-bilization problems for infinite dimensional second-order systems are discussedby very many scientists, and to mention only a few we note the works [13], [14],[17], [23] and [24]. A good source of references to papers in which stabilizationproblems are treated can by found in [18].

The paper is organized as follows. In the next section we introduce the system. We also analyze some properties of the system. In sections 3 and 4, we propose two types of control laws. We prove that the proposed control laws asymptotically stabilize the system. In section 5 we present some numerical simulation results. Finally, we give some concluding remarks.

2 Description of the System

Let $\Omega \subset R^N$ be a bounded domain with smooth boundary $\partial\Omega$. By X we denote a real Hilbert space consisting of square integrable functions on the set $\Omega$ with the following inner product:

$$\langle f, g\rangle_X = \int_\Omega f(\xi)g(\xi)\,d\xi. \tag{1}$$

Let $L^2$ and $H^k$ be defined as follows:

$$L^2 = \left\{ f : \Omega \to R^n : \int_\Omega |f(\xi)|^2\,d\xi < \infty \right\}, \tag{2}$$

$$H^k = \left\{ f \in L^2 : f, f', \ldots, f^{(k)} \in L^2 \right\}. \tag{3}$$

We consider a control system described by the second-order operator equation

$$\ddot{x}(t) + G\dot{x}(t) + Ax(t) = Bu(t), \quad t > 0, \tag{4}$$

with initial conditions

$$x(0) = x_0 \in D(A), \quad \dot{x}(0) = x_1 \in X, \tag{5}$$

where $x(t) \in X = L^2(\Omega)$. We assume that $A : (D(A) \subset X) \to X$ is a linear, generally unbounded, self-adjoint and positive definite operator with domain D(A) dense in X and compact resolvent $R(\lambda, A)$; $G \in \mathcal{L}(X)$ is a linear, bounded and skew-adjoint (gyroscopic) operator. The control force is represented by the operator $B \in \mathcal{L}(R^r, X)$ defined as follows:

$$Bu(t) = \sum_{i=1}^{r} b_i u_i(t), \tag{6}$$

where $B = [b_1\ b_2\ \cdots\ b_r]$, $b_i \in X$, $u(t) = [u_1(t)\ u_2(t)\ \cdots\ u_r(t)]^T$, $u_i(\cdot) \in L^2([0,\infty), R)$, $i = 1, 2, \ldots, r$. The state of the system is measured by averaging sensors, whose outputs are expressed by the linear and bounded operator $C \in \mathcal{L}(X, R^m)$

$$y(t) = Cx(t), \tag{7}$$

where

$$Cx = [\langle c_1, x\rangle_X\ \langle c_2, x\rangle_X\ \cdots\ \langle c_m, x\rangle_X]^T, \tag{8}$$

$c_i \in X$, $i = 1, 2, \ldots, m$ are sensor influence functions.

From the Hilbert-Schmidt theory [5], [22], [31] for compact self-adjoint operators, it is well known that the operator A satisfies the following hypotheses:

(a) $0 \in \rho(A)$, i.e. $A^{-1}$ exists and is compact ($\rho(A)$ stands for the resolvent set of the operator A),

(b) A is closed,

(c) The operator A has only purely discrete spectrum consisting entirely of distinct real positive eigenvalues $\lambda_i$ with finite multiplicity $r_i < \infty$, where $0 < \lambda_1 < \ldots < \lambda_i < \ldots$, $\lim_{i\to\infty}\lambda_i = \infty$,

(d) For each eigenvalue $\lambda_i$ there exist $r_i$ corresponding eigenfunctions $\upsilon_{ik}$, $A\upsilon_{ik} = \lambda_i\upsilon_{ik}$, where $i = 1, 2, \ldots$, $k = 1, 2, \ldots, r_i$,

(e) The set of eigenfunctions $\upsilon_{ik}$, $i = 1, 2, \ldots$, $k = 1, 2, \ldots, r_i$, forms a complete orthonormal system in X.

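By introducing the new function space $\tilde{X} = D(A^{1/2}) \times X$, the equation (4) is reduced to the following abstract first-order form:

$$\frac{d}{dt}\tilde{x}(t) = (\tilde{A} + \tilde{G})\tilde{x}(t) + \tilde{B}u(t), \tag{9}$$

where $\tilde{x} = \mathrm{col}(x, \dot{x})$, and the operators $\tilde{A}$, $\tilde{G}$ and $\tilde{B}$ are defined as

$$\tilde{A} = \begin{bmatrix} 0 & I \\ -A & 0 \end{bmatrix}, \quad \tilde{G} = \begin{bmatrix} 0 & 0 \\ 0 & -G \end{bmatrix}, \quad \tilde{B} = \begin{bmatrix} 0 \\ B \end{bmatrix}. \tag{10}$$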

Remark 1. The operator A is positive and self-adjoint on the real Hilbert space X. The operator $A^{1/2}$ is well defined. Thus the operator $\tilde{A}$ (see (10)) on $\tilde{X} = D(A^{1/2}) \times X$ (see [18], [26]) is the infinitesimal generator of a $C_0$-semigroup S(t) on $\tilde{X}$, $\|S(t)\| \le 1$, and the domain of $\tilde{A}$ is $D(\tilde{A}) = D(A) \times D(A^{1/2})$. In this case the inner product on $\tilde{X}$ is given by $\langle z, v\rangle = \langle A^{1/2}z_1, A^{1/2}v_1\rangle + \langle z_2, v_2\rangle$.

Remark 2. The operator G is bounded. Thus $\tilde{G}$ is bounded as well (see (10)). From the theorem on bounded perturbation of a generator [9], [18], [25], the operator $\tilde{A} + \tilde{G}$ (see (10)) is the infinitesimal generator of a $C_0$-semigroup on $\tilde{X}$.

Remark 3. In the real Hilbert space X and for the skew-adjoint operator G, the following equality is true:

$$\langle x, Gx\rangle_X = \tfrac{1}{2}\langle x, Gx\rangle_X + \tfrac{1}{2}\langle x, Gx\rangle_X = \tfrac{1}{2}\langle x, Gx\rangle_X - \tfrac{1}{2}\langle Gx, x\rangle_X = 0. \tag{11}$$
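As a quick finite-dimensional illustration of (11), the following Python sketch checks that the quadratic form of a skew-symmetric matrix vanishes. The matrix size, the random seed and the use of NumPy are assumptions made only for this illustration; they are not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed, chosen arbitrarily
M = rng.standard_normal((5, 5))
G = M - M.T                         # skew-symmetric matrix: G.T == -G
x = rng.standard_normal(5)

quad = x @ G @ x                    # finite-dimensional analogue of <x, Gx>_X
print(abs(quad) < 1e-12)
```

In floating point arithmetic the value is zero only up to rounding error, which is why the sketch compares against a small tolerance.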

3 Stabilization in the Case Where Velocity Feedback Is Not Available

This section is devoted to the stabilization of the system (4), (7) in the case where only position feedback is available. The stabilizer will be constructed by placing actuators and sensors at the same location, which means that $C = B^*$ and consequently

$$y(t) = B^*x(t). \tag{12}$$

We assume that the system (4) with the output (12) is approximately observable (see [1], [10], [11]).

Let us consider the linear dynamic feedback given by the formula (see also [13], [21])

$$u(t) = -K[w(t) + y(t)], \tag{13}$$

$$\dot{w}(t) + A_w w(t) = B_w u(t), \quad w(0) = w_0, \tag{14}$$

where $w(t) \in R^m$, $A_w = \mathrm{diag}[\alpha_i]$, $B_w = \mathrm{diag}[\beta_i]$, $\alpha_i, \beta_i \in R$, $\alpha_i > 0$, $\beta_i > 0$, $i = 1, 2, \ldots, m$, and $K = K^T > 0$ is a real positive definite matrix.

To analyze the closed loop system, we first define the function space $Z = H^1(\Omega) \times L^2(\Omega) \times R^m$ with the following inner product:

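$$\langle z, \tilde{z}\rangle_Z = \langle Az_1, \tilde{z}_1\rangle_X + \langle z_2, \tilde{z}_2\rangle_X + z_3^T Q\tilde{z}_3 + (B^*z_1 + z_3)^T K(B^*\tilde{z}_1 + \tilde{z}_3), \tag{15}$$

where $z = \mathrm{col}(z_1, z_2, z_3)$, $\tilde{z} = \mathrm{col}(\tilde{z}_1, \tilde{z}_2, \tilde{z}_3)$, $Q = \mathrm{diag}[\alpha_i/\beta_i] = A_w B_w^{-1}$. Let us note that the space Z with the inner product (15) is a Hilbert space. Now, the closed loop system (4), (12), (13), (14) can be written in the following abstract form:

$$\dot{z}(t) = Lz(t), \tag{16}$$

where $z(t) = \mathrm{col}(x(t), \dot{x}(t), w(t))$, and $L : (D(L) \subset Z) \to Z$ is a linear operator defined as follows:

$$L = \begin{bmatrix} 0 & I & 0 \\ -A - BKB^* & -G & -BK \\ -B_wKB^* & 0 & -A_w - B_wK \end{bmatrix}. \tag{17}$$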

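The closed loop system (16) can also be written in the following form [20]:

$$\begin{bmatrix} \dot{x}(t) \\ \ddot{x}(t) \\ \dot{w}(t) \end{bmatrix} = \begin{bmatrix} 0 & I & 0 \\ -A & -G & 0 \\ 0 & 0 & -A_w \end{bmatrix} \begin{bmatrix} x(t) \\ \dot{x}(t) \\ w(t) \end{bmatrix} + \begin{bmatrix} 0 \\ B \\ B_w \end{bmatrix} u(t), \tag{18}$$

$$s(t) = \begin{bmatrix} C_1B^* & 0 & C_2 \end{bmatrix} \begin{bmatrix} x(t) \\ \dot{x}(t) \\ w(t) \end{bmatrix}, \tag{19}$$

$$u(t) = -Ks(t), \tag{20}$$

where the matrices $C_1 = C_2 = I$.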

Theorem 1. Suppose that the matrices $C_1$ and $C_2$ are real and invertible and the system (4), (12) is observable. Then the system (18), (19) is observable.

Proof. The system (18), (19) is observable if, for any complex number s, the system of equations

$$\begin{cases} sx_1 - x_2 = 0, \\ Ax_1 + (G + sI)x_2 = 0, \\ (A_w + sI)x_3 = 0, \\ C_1B^*x_1 + C_2x_3 = 0 \end{cases} \tag{21}$$

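has no nonzero solution $x = \mathrm{col}(x_1, x_2, x_3)$ [11]. When $s \ne -\alpha_i$ for all $i = 1, 2, \ldots, m$, the matrix $A_w + sI$ is invertible, so we have $x_3 = 0$ and (21) becomes

$$\begin{cases} sx_1 - x_2 = 0, \\ Ax_1 + (G + sI)x_2 = 0, \\ B^*x_1 = 0. \end{cases} \tag{22}$$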

If the system (4), (12) is observable, (22) has no nonzero solution for any complex number s.

Next consider the case where $s = -\alpha_i$ for some $i = 1, 2, \ldots, m$. From (21) it follows that

$$Ax_1 = (\alpha_i G - \alpha_i^2 I)x_1. \tag{23}$$

From this it holds that

$$\langle Ax_1, x_1\rangle_X = \langle(\alpha_i G - \alpha_i^2 I)x_1, x_1\rangle_X = -\alpha_i^2\|x_1\|_X^2 \le 0, \tag{24}$$

which implies that $x_1 = 0$, since A is positive, self-adjoint and has compact resolvent (see also lemma 2). Consequently, $x_2 = 0$, $x_3 = 0$. Therefore, the system (21) has no nonzero solution also for $s = -\alpha_i$. We have proved the theorem. □

Theorem 2. Suppose that the system (4), (12) is approximately observable. Let us consider the system (16), where the operator L is given by (17). Then the following assertions are true:

(a) L is dissipative,

(b) $\mathrm{Ran}(\lambda_0 I - L) = Z$ for some $\lambda_0 > 0$,

(c) $\overline{D(L)} = Z$ and L is closed,

(d) The operator L generates a $C_0$-semigroup of contractions $T_L(t) \in \mathcal{L}(Z)$, $t \ge 0$,

(e) The $C_0$-semigroup $T_L(t)$ generated by L is asymptotically stable.

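Proof. (a) The linear operator L is dissipative if and only if

$$\|(\lambda I - L)z\|_Z \ge \lambda\|z\|_Z, \quad \forall z \in D(L),\ \lambda > 0 \tag{25}$$

(see [25]). In the real Hilbert space Z, squaring both sides shows that the condition (25) is equivalent to

$$\|Lz\|_Z^2 - 2\lambda\langle Lz, z\rangle_Z \ge 0, \quad \forall z \in D(L),\ \lambda > 0. \tag{26}$$

Using (15) and (17), we obtain

$$\begin{aligned} \langle Lz, z\rangle_Z ={}& \langle Az_2, z_1\rangle_X + \langle -Gz_2, z_2\rangle_X + \langle -(A + BKB^*)z_1 - BKz_3, z_2\rangle_X \\ &+ [-B_wKB^*z_1 - (A_w + B_wK)z_3]^T Qz_3 + [B^*z_2 - B_wKB^*z_1]^T K(B^*z_1 + z_3) \\ &+ [-(A_w + B_wK)z_3]^T K(B^*z_1 + z_3). \end{aligned} \tag{27}$$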

Simple calculations show that

$$\langle Lz, z\rangle_Z = -\bar{z}^T B_w \bar{z} \le 0, \tag{28}$$

where

$$\bar{z} = KB^*z_1 + (Q + K)z_3. \tag{29}$$
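The identity (28), (29) is purely algebraic, so it can be sanity-checked on a finite-dimensional analogue in which A, G, B and K are replaced by matrices. The Python sketch below does this for randomly generated data; the dimensions, the random seed, the helper names `spd` and `inner`, and the variable name `s` for the vector in (29) are all assumptions introduced for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 2                                   # hypothetical dimensions

def spd(k):
    """Random symmetric positive definite k-by-k matrix."""
    M = rng.standard_normal((k, k))
    return M @ M.T + k * np.eye(k)

A, K = spd(n), spd(m)
S = rng.standard_normal((n, n))
G = S - S.T                                   # skew-symmetric
B = rng.standard_normal((n, m))               # B* becomes B.T
Aw = np.diag(rng.uniform(0.5, 2.0, m))
Bw = np.diag(rng.uniform(0.5, 2.0, m))
Q = Aw @ np.linalg.inv(Bw)

# Matrix analogue of the operator L in (17)
L = np.block([
    [np.zeros((n, n)), np.eye(n),        np.zeros((n, m))],
    [-A - B @ K @ B.T, -G,               -B @ K],
    [-Bw @ K @ B.T,    np.zeros((m, n)), -Aw - Bw @ K],
])

def inner(z, v):
    """Inner product (15) on R^n x R^n x R^m."""
    z1, z2, z3 = z[:n], z[n:2 * n], z[2 * n:]
    v1, v2, v3 = v[:n], v[n:2 * n], v[2 * n:]
    return ((A @ z1) @ v1 + z2 @ v2 + z3 @ Q @ v3
            + (B.T @ z1 + z3) @ K @ (B.T @ v1 + v3))

z = rng.standard_normal(2 * n + m)
s = K @ B.T @ z[:n] + (Q + K) @ z[2 * n:]     # vector from (29)
lhs = inner(L @ z, z)                         # left-hand side of (28)
rhs = -s @ Bw @ s                             # right-hand side of (28)
print(np.isclose(lhs, rhs), lhs <= 0.0)
```

The check relies on Q and B_w being diagonal (so that they commute) and on G being skew-symmetric, which mirrors the assumptions of the section.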

Since $\langle Lz, z\rangle_Z \le 0$, it follows that L is dissipative (see (26)).

(b) To prove the assertion (b), it is enough to show that for some $\lambda_0 > 0$, the operator $\lambda_0 I - L : Z \to Z$ is onto. Let $\tilde{z} = \mathrm{col}(\tilde{z}_1, \tilde{z}_2, \tilde{z}_3) \in Z$ be given. We have to find $z = \mathrm{col}(z_1, z_2, z_3) \in D(L)$ such that

$$(\lambda_0 I - L)z = \tilde{z}. \tag{30}$$

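Hence the following equations should hold:

$$\lambda_0 z_1 - z_2 = \tilde{z}_1, \tag{31}$$

$$(A + BKB^*)z_1 + (\lambda_0 + G)z_2 + BKz_3 = \tilde{z}_2, \tag{32}$$

$$B_wKB^*z_1 + (\lambda_0 I + A_w + B_wK)z_3 = \tilde{z}_3. \tag{33}$$

From (31) and (33) we can determine $z_2$ and $z_3$:

$$z_2 = \lambda_0 z_1 - \tilde{z}_1, \tag{34}$$

$$z_3 = (\lambda_0 I + A_w + B_wK)^{-1}(\tilde{z}_3 - B_wKB^*z_1). \tag{35}$$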

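We can do this because the matrix $\lambda_0 I + A_w + B_wK$ is invertible (see lemma 1). Using (34) and (35) in (32) we obtain

$$\{\lambda_0^2 + \lambda_0 G + A + B[K^{-1} + (\lambda_0 B_w^{-1} + Q)^{-1}]^{-1}B^*\}z_1 = (\lambda_0 + G)\tilde{z}_1 + \tilde{z}_2 - BK(\lambda_0 I + A_w + B_wK)^{-1}\tilde{z}_3. \tag{36}$$

Define $\Gamma(\lambda_0)$ by

$$\Gamma(\lambda_0) = \lambda_0^2 + \lambda_0 G + A + B[K^{-1} + (\lambda_0 B_w^{-1} + Q)^{-1}]^{-1}B^*. \tag{37}$$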

We know that $Q = Q^T > 0$ and $\lambda_0 B_w^{-1} = \mathrm{diag}[\lambda_0\beta_i^{-1}]$. Moreover, the inverse of a real, symmetric and positive definite matrix is also a symmetric and positive definite matrix (see [32]). Hence, the matrix $[K^{-1} + (\lambda_0 B_w^{-1} + Q)^{-1}]^{-1}$ exists and is symmetric and positive definite. Thus the operator $\Gamma(\lambda_0)$ is a closed operator with domain $D(\Gamma) = D(A)$ dense in X. Additionally,

$$\langle\Gamma z_1, z_1\rangle_X \ge (\lambda_0^2 + \delta)\|z_1\|_X^2, \tag{38}$$

where the constant $\delta > 0$ can be determined by using lemmas 2 and 3. This means that the operator $\Gamma(\lambda_0)$ is invertible and the equation (36) has a unique solution $z_1 \in D(A)$. The remaining unknowns $z_2 \in H^1(\Omega)$ and $z_3 \in R^m$ can be uniquely determined from (34) and (35). This completes the proof of (b).

(c) If for some $\lambda_0 > 0$, $\mathrm{Ran}(\lambda_0 I - L) = Z$, then $\mathrm{Ran}(\lambda I - L) = Z$ for all $\lambda > 0$ [25]. Let us note that also $\mathrm{Ran}(\lambda I - L) = Z$ for $\lambda = 0$. Now, we know that the operator L is dissipative, the Hilbert space Z is reflexive and $\mathrm{Ran}(I - L) = Z$. All these properties imply that $\overline{D(L)} = Z$ and L is closed [25].

(d) Because of (a), (b) and (c), the statement that the operator L generates a $C_0$-semigroup of contractions $T_L(t) \in \mathcal{L}(Z)$, $t \ge 0$, follows from the Lumer-Phillips theorem [6], [16], [17], [25].

(e) The asymptotic stability of the closed loop system (16) can be proved by LaSalle's invariance principle [15] extended to infinite dimensional systems [7], [8], [17], [29]. We introduce the following Lyapunov function:

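$$\begin{aligned} V(x(t), \dot{x}(t), w(t)) ={}& \tfrac{1}{2}\langle\dot{x}(t), \dot{x}(t)\rangle_X + \tfrac{1}{2}\langle Ax(t), x(t)\rangle_X + \tfrac{1}{2}w(t)^T Qw(t) \\ &+ \tfrac{1}{2}[w(t) + B^*x(t)]^T K[w(t) + B^*x(t)], \end{aligned} \tag{39}$$

where $Q = \mathrm{diag}[\alpha_i/\beta_i] = A_w B_w^{-1}$. We can notice that $V(x, \dot{x}, w) = 0$ if and only if $\mathrm{col}(x, \dot{x}, w) = 0$. Otherwise $V(x, \dot{x}, w) > 0$. Taking the derivative of V with respect to time, we obtain

$$\begin{aligned} \dot{V}(x(t), \dot{x}(t), w(t)) ={}& \langle\ddot{x}(t), \dot{x}(t)\rangle_X + \langle Ax(t), \dot{x}(t)\rangle_X + \dot{w}(t)^T Qw(t) \\ &+ [\dot{w}(t) + B^*\dot{x}(t)]^T K[w(t) + B^*x(t)]. \end{aligned} \tag{40}$$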

Along trajectories of the closed loop system (16) it holds that

$$\dot{V}(x(t), \dot{x}(t), w(t)) = -s(t)^T B_w s(t) \le 0, \tag{41}$$

where $s(t) = KB^*x(t) + (Q + K)w(t)$. According to LaSalle's theorem, all solutions of (16) asymptotically tend to the maximal invariant subset of the following set

$$S = \{z \in Z : z = \mathrm{col}(x, \dot{x}, w),\ \dot{V}(z) = 0\}, \tag{42}$$

provided that the solution trajectories for $t \ge 0$ are precompact in Z. From $\dot{V} = 0$ we have $s(t) = 0$ (see (19) for $C_1 = K$, $C_2 = Q + K$). The system (18), (19) is observable (see theorem 1), thus we have $x = 0$, $\dot{x} = 0$, $w = 0$ and finally the largest invariant set contained in S is the set {0}.

The trajectories of the closed loop system (16) are precompact in Z if the set

$$\gamma(z_0) = \bigcup_{t \ge 0} T_L(t)z_0, \quad z_0 = z(0) \in D(L), \tag{43}$$

is precompact in Z. Since the operator L generates a $C_0$-semigroup of contractions on Z, the solution trajectories $\{T_L(t)z_0, t \ge 0\}$ are bounded in Z. The precompactness of the solution trajectories is guaranteed if the operator $(\lambda I - L)^{-1} : Z \to Z$ is compact for some $\lambda > 0$ [2], [27]. We first notice that $\Gamma(\lambda)^{-1}$ exists for $\lambda \ge 0$ and is bounded. Therefore the operator $(\lambda I - L)^{-1}$ for $\lambda \ge 0$ exists and is bounded as well. Since the embedding of $H^2(\Omega) \times H^1(\Omega) \times R^m$ into $H^1(\Omega) \times L^2(\Omega) \times R^m$ is compact [30], it follows that $(\lambda I - L)^{-1} : Z \to Z$ is a compact operator. We have proved the theorem. □

Lemma 1. The matrix $\lambda_0 I + A_w + B_wK$ is invertible for $\lambda_0 \ge 0$.

Proof. We can notice that $\lambda_0 I + A_w + B_wK = \tilde{A}_w + B_wK$, where $\tilde{A}_w = \mathrm{diag}[\lambda_0 + \alpha_i]$, $i = 1, 2, \ldots, m$. The matrix $B_w = \mathrm{diag}[\beta_i]$, and therefore $B_w^{-1} = \mathrm{diag}[\beta_i^{-1}]$. Hence $\tilde{A}_w + B_wK = B_w(B_w^{-1}\tilde{A}_w + K)$, where $B_w^{-1}\tilde{A}_w = \mathrm{diag}[\beta_i^{-1}(\lambda_0 + \alpha_i)]$. The matrix $B_w^{-1}\tilde{A}_w + K$ is symmetric and positive definite. This means that $(B_w^{-1}\tilde{A}_w + K)^{-1}$ exists. The proof is concluded by the remark that the product of invertible matrices is also invertible (see [32]). □

Lemma 2. The operator A satisfies the following condition:

$$\langle Ax, x\rangle_X \ge \lambda_{\min}\|x\|_X^2, \tag{44}$$

where $\lambda_{\min} = \min\{\lambda_n : \lambda_n \in \sigma(A),\ n = 1, 2, \ldots\}$ and $\sigma(A)$ stands for the discrete spectrum of A.

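Lemma 3. For any real and positive definite matrix $K = K^T > 0$ there exists $\delta > 0$, such that

$$\langle(A + BKB^*)x, x\rangle_X = \langle Ax, x\rangle_X + (B^*x)^T K(B^*x) \ge \delta\|x\|_X^2. \tag{45}$$

Lemmas 2 and 3 can be proved by using the following expansions in the Hilbert space X:

$$x = \sum_{i=1}^{\infty}\sum_{k=1}^{r_i}\langle x, \upsilon_{ik}\rangle_X\,\upsilon_{ik}, \quad x \in X, \tag{46}$$

$$Ax = \sum_{i=1}^{\infty}\lambda_i\sum_{k=1}^{r_i}\langle x, \upsilon_{ik}\rangle_X\,\upsilon_{ik}, \quad x \in D(A). \tag{47}$$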

4 Stabilization in the Case Where Velocity Feedback Is Available

In this section we consider the dynamic feedback

$$u(t) = -K_1 y(t) - K_2\dot{y}(t), \tag{48}$$

where $K_1 = K_1^T \ge 0$ and $K_2 = K_2^T > 0$ are real matrices. The control function (48) is applied to the system (4) with the output (12). The resulting closed loop system becomes

$$\ddot{x}(t) + (G + BK_2B^*)\dot{x}(t) + (A + BK_1B^*)x(t) = 0. \tag{49}$$

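We can reformulate the system (49) as a set of first-order equations. First, we introduce the Hilbert space $Z = H^1(\Omega) \times L^2(\Omega)$ with the following inner product:

$$\langle z, \tilde{z}\rangle_Z = \langle Az_1, \tilde{z}_1\rangle_X + \langle z_2, \tilde{z}_2\rangle_X + (B^*z_1)^T K_1(B^*\tilde{z}_1), \tag{50}$$

where $z = \mathrm{col}(z_1, z_2)$, $\tilde{z} = \mathrm{col}(\tilde{z}_1, \tilde{z}_2)$. In the new function space Z, the closed loop system (49) can be written in the abstract form

$$\dot{z}(t) = Lz(t), \tag{51}$$

where $z(t) = \mathrm{col}(x(t), \dot{x}(t))$, and $L : (D(L) \subset Z) \to Z$ is a linear operator defined as follows:

$$L = \begin{bmatrix} 0 & I \\ -A - BK_1B^* & -G - BK_2B^* \end{bmatrix}. \tag{52}$$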

Theorem 3. Suppose that the system (4), (12) is approximately observable. Let us consider the system (51), where the operator L is given by (52). Then the following assertions are true:

(a) L is dissipative,

(b) $\mathrm{Ran}(\lambda_0 I - L) = Z$ for some $\lambda_0 > 0$,

(c) $\overline{D(L)} = Z$ and L is closed,

(d) The operator L generates a $C_0$-semigroup of contractions $T_L(t) \in \mathcal{L}(Z)$, $t \ge 0$,

(e) The $C_0$-semigroup $T_L(t)$ generated by L is asymptotically stable.

Proof. The proof is carried out by the same method as in the proof of theorem 2. The Lyapunov function for the system (51) is given by

$$V(z(t)) = \tfrac{1}{2}\langle\dot{x}(t), \dot{x}(t)\rangle_X + \tfrac{1}{2}\langle Ax(t), x(t)\rangle_X + \tfrac{1}{2}[B^*x(t)]^T K_1[B^*x(t)], \tag{53}$$

and the stability of the closed loop system is a consequence of LaSalle's theorem. □
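For readers who want the intermediate step, the derivative of (53) along the trajectories of (49) can be worked out as sketched below. This is not taken from the paper; it follows from (49), the symmetry of A and K_1, and Remark 3.

```latex
\begin{aligned}
\dot V(z(t)) &= \langle \ddot x, \dot x\rangle_X + \langle Ax, \dot x\rangle_X
               + [B^*\dot x]^T K_1 [B^* x] \\
&= \langle -(G + BK_2B^*)\dot x - (A + BK_1B^*)x,\ \dot x\rangle_X
   + \langle Ax, \dot x\rangle_X + [B^*\dot x]^T K_1 [B^* x] \\
&= -\langle G\dot x, \dot x\rangle_X - [B^*\dot x]^T K_2 [B^*\dot x] \\
&= -[B^*\dot x]^T K_2 [B^*\dot x] \le 0 .
\end{aligned}
```

Since K_2 is positive definite, the derivative vanishes only when B^* applied to the velocity is zero, which is where the observability assumption enters the LaSalle argument.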

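5 Illustrative Examples

To illustrate our theory we consider the motion of a taut string, rotating about its ξ-axis with constant angular velocity ω (Fig. 1). It was shown in [3] and [4] that the small oscillations of such a string are governed by the system of partial differential equations

$$\begin{cases} \dfrac{\partial^2 x_1(t,\xi)}{\partial t^2} - 2\omega\dfrac{\partial x_2(t,\xi)}{\partial t} - \omega^2 x_1(t,\xi) - \dfrac{\partial^2 x_1(t,\xi)}{\partial\xi^2} = b(\xi)u(t), \\[2mm] \dfrac{\partial^2 x_2(t,\xi)}{\partial t^2} + 2\omega\dfrac{\partial x_1(t,\xi)}{\partial t} - \omega^2 x_2(t,\xi) - \dfrac{\partial^2 x_2(t,\xi)}{\partial\xi^2} = 0, \end{cases} \tag{54}$$

where $t > 0$, $\xi \in (0, 1)$.

Fig. 1. Small oscillations of a taut rotating string

The boundary conditions are of the form

$$\begin{cases} x_1(t, 0) = x_1(t, 1) = 0, \\ x_2(t, 0) = x_2(t, 1) = 0, \end{cases} \tag{55}$$

and the initial conditions are

$$\begin{cases} x_1(0, \xi) = 0.1(1 - \xi)\xi, & \dot{x}_1(0, \xi) = 0, \\ x_2(0, \xi) = 0, & \dot{x}_2(0, \xi) = 0. \end{cases} \tag{56}$$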

The function $b : [0, 1] \to R$ is defined as follows:

$$b(\xi) = \begin{cases} 1, & \text{for } 0.7 \le \xi \le 1.0, \\ 0, & \text{otherwise.} \end{cases} \tag{57}$$

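Then we find that the system (54) can be written in the form (4), where $X = L^2((0, 1), R^2)$,

$$A\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -x_1'' - \omega^2 x_1 \\ -x_2'' - \omega^2 x_2 \end{bmatrix}, \tag{58}$$

$$D(A) = \{x \in H^2((0, 1), R^2) : x = [x_1\ x_2]^T,\ x_i(0) = x_i(1) = 0,\ i = 1, 2\}, \tag{59}$$

$$Bu(t) = \begin{bmatrix} b(\xi) \\ 0 \end{bmatrix}u(t), \quad \xi \in [0, 1],\ t \ge 0, \tag{60}$$

$$G\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -2\omega x_2 \\ 2\omega x_1 \end{bmatrix}, \quad D(G) = X, \tag{61}$$

(see also [3] and [4]). The output of the system (54) is calculated in the following way:

$$y(t) = B^*x(t) = \left\langle\begin{bmatrix} b \\ 0 \end{bmatrix}, \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right\rangle_X. \tag{62}$$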

The open loop system is not asymptotically stable. In order to stabilize the system we can use one of the following controllers:

$$\begin{cases} \dot{w}(t) + 0.5w(t) = 0.1u(t), \\ u(t) = -100.0[w(t) + y(t)], \end{cases} \tag{63}$$

with $w(0) = 0.5$, or

$$u(t) = -10.0y(t) - 20.0\dot{y}(t). \tag{64}$$

Fig. 2. Effects of using the controller (63) in stabilization of the system (54) (axes: t, y)

Fig. 3. Effects of using the controller (64) in stabilization of the system (54) (axes: t, y)

Simulation results are presented in Figs. 2 and 3. For comparison purposes, the output y(t) of the open loop system (dotted line) is given together with the output of the closed loop system (solid line).


We have investigated stabilization of distributed parameter gyroscopic systems which are represented by second-order operator equations. The systems have an infinite number of poles on the imaginary axis. An important role in the stabilization process is played by the assumption that the input and output operators are collocated. We have proposed a linear dynamic velocity feedback and a linear dynamic position feedback. In the case where velocity is not available, a parallel compensator is necessary to stabilize the system. The asymptotic stability of the closed loop system in both cases has been proved by LaSalle's invariance principle extended to infinite dimensional systems. Numerical simulation results have shown the effectiveness of the proposed controllers.

Acknowledgement

This work was supported by Ministry of Science and Higher Education in Poland in the years 2008–2011 as a research project No N N514 414034.

References

[1] Curtain, R.F., Pritchard, A.J.: Infinite dimensional linear systems theory. Springer, Heidelberg (1978)

[2] Dafermos, C.M., Slemrod, M.: Asymptotic behaviour of nonlinear contraction semigroups. J. Funct. Anal. 13(1), 97–106 (1973)

[3] Datta, B.N., Ram, Y.M., Sarkissian, D.R.: Multi-input partial pole placement for distributed parameter gyroscopic systems. In: Proc. of the 39th IEEE International Conference on Decision and Control, Sydney (2000)

[4] Datta, B.N., Ram, Y.M., Sarkissian, D.R.: Single-input partial pole-assignment in gyroscopic quadratic matrix and operator pencils. In: Proc. of the 14th International Symposium of Mathematical Theory of Networks and Systems MTNS 2000, Perpignan, France (2000)

[5] Dunford, N., Schwartz, J.T.: Linear operators. Part II. Spectral theory. Self adjoint operators in Hilbert space. Interscience, New York (1963)

[6] Engel, K.J., Nagel, R.: One-parameter semigroups for linear evolution equations. Springer, New York (2000)

[7] Hale, J.K.: Dynamical systems and stability. J. Math. Anal. Appl. 26(1), 39–59 (1969)

[8] Hale, J.K., Infante, E.F.: Extended dynamical systems and stability theory. Proc. Natl. Acad. Sci. USA 58(2), 405–409 (1967)

[9] Kato, T.: Perturbation theory for linear operators. Springer, New York (1980)

[10] Klamka, J.: Controllability of dynamical systems. PWN, Warszawa (1990) (in Polish)

[11] Kobayashi, T.: Frequency domain conditions of controllability and observability for a distributed parameter system with unbounded control and observation. Int. J. Syst. Sci. 23(12), 2369–2376 (1992)

[12] Kobayashi, T.: Low gain adaptive stabilization of undamped second order systems. Arch. Control Sci. 11(XLVII) (1-2), 63–75 (2001)

[13] Kobayashi, T.: Stabilization of infinite-dimensional undamped second order systems by using a parallel compensator. IMA J. Math. Control Inf. 21(1), 85–94 (2004)


[14] Kobayashi, T., Oya, M.: Adaptive stabilization of infinite-dimensional undamped second order systems without velocity feedback. Arch. Control Sci. 14(L) (1), 73–84 (2004)

[15] La Salle, J., Lefschetz, S.: Stability by Liapunov's direct method with applications. PWN, Warszawa (1966) (in Polish)

[16] Lumer, G., Phillips, R.S.: Dissipative operators in a Banach space. Pacific J. Math. 11(2), 679–698 (1961)

[17] Luo, Z., Guo, B., Morgul, O.: Stability and stabilization of infinite dimensional systems with applications. Springer, London (1999)

[18] Mitkowski, W.: Stabilization of dynamic systems. WNT, Warszawa (1991) (in Polish)

[19] Mitkowski, W.: Dynamic feedback in LC ladder network. Bull. Pol. Acad. Sci. Tech. Sci. 51(2), 173–180 (2003)

[20] Mitkowski, W.: Stabilisation of LC ladder network. Bull. Pol. Acad. Sci. Tech. Sci. 52(2), 109–114 (2004)

[21] Mitkowski, W., Skruch, P.: Stabilization of second-order systems by linear position feedback. In: Proc. of the 10th IEEE International Conference on Methods and Models in Automation and Robotics, Miedzyzdroje, Poland, August 30–September 2, 2004, pp. 273–278 (2004)

[22] Mizohata, S.: The theory of partial differential equations. Cambridge Univ. Press, Cambridge (1973)

[23] Morgul, O.: A dynamic control law for the wave equation. Automatica 30(11), 1785–1792 (1994)

[24] Morgul, O.: Stabilization and disturbance rejection for the wave equation. IEEE Trans. Autom. Control 43(1), 89–95 (1998)

[25] Pazy, A.: Semigroups of linear operators and applications to partial differential equations. Springer, New York (1983)

[26] Pritchard, A.J., Zabczyk, J.: Stability and stabilizability of infinite dimensional systems. SIAM Rev. 23(1), 25–52 (1981)

[27] Saperstone, S.: Semidynamical systems in infinite dimensional spaces. Springer, New York (1981)

[28] Skruch, P.: Stabilization of second-order systems by non-linear feedback. Int. J. Appl. Math. Comput. Sci. 14(4), 455–460 (2004)

[29] Slemrod, M.: Stabilization of boundary control systems. J. Differ. Equations 22(2), 402–415 (1976)

[30] Tanabe, H.: Equations of evolution. Pitman, London (1979)

[31] Taylor, A.E., Lay, D.C.: Introduction to functional analysis. John Wiley & Sons, New York (1980)

[32] Turowicz, A.: Theory of matrix, 5th edn., AGH, Krakow (1995) (in Polish)

Stabilization Results of Second-Order Systems with Delayed Positive Feedback

Wojciech Mitkowski1 and Paweł Skruch1

1 AGH University of Science and Technology, Institute of Automatics, al. Mickiewicza 30/B1, 30-059 Krakow, [email protected]

2 AGH University of Science and Technology, Institute of Automatics, al. Mickiewicza 30/B1, 30-059 Krakow, [email protected]

Abstract. Oscillation and nonoscillation criteria are established for second-order systems with delayed positive feedback. We consider the stability conditions for the system without damping and with gyroscopic effect. A general algorithm for finding stability regions is proposed. Theoretical and numerical results are presented for the single-input single-output case. These results improve some oscillation criteria of [1], [2] and [6].

1 Introduction

The paper expands on a method proposed in [1], [2] and [6] for stabilizing second-order systems with delayed positive feedback. The system is described by linear second-order differential equations

$$\ddot{x}(t) + G\dot{x}(t) + Ax(t) = Bu(t), \tag{1}$$

$$y(t) = Cx(t), \tag{2}$$

where $x(t) \in R^n$, $u(t) \in R$, $y(t) \in R$, $t \ge 0$. Here $R^n$ and $R$ are real vector spaces of column vectors, x(t), u(t), y(t) are vectors of states, inputs and outputs, respectively, $A \in R^{n\times n}$, $B \in R^{n\times 1}$, $C \in R^{1\times n}$, $G \in R^{n\times n}$. We assume that the matrix $A = A^T > 0$ is positive definite and that every eigenvalue of A has multiplicity one. The matrix $G = -G^T$ is skew-symmetric (gyroscopic). If we take the Laplace transform in (1) and (2) and use zero initial conditions, we obtain

$$G_1(s) = \frac{Y(s)}{U(s)} = C(s^2 + sG + A)^{-1}B. \tag{3}$$

In [7], it has been proved that the system (1) is not asymptotically stable. The eigenvalues of (1) are different from zero, pairwise conjugated and located on the imaginary axis.

In this paper, in order to stabilize the system (1), we use the following positive, time-delay feedback

$$u(t) = ky(t - \tau), \tag{4}$$

where $k > 0$, $\tau > 0$ and $y(t) = 0$ for $t \in [-\tau, 0)$. Using the Laplace transform in (4), we obtain

$$U(s) = G_2(s)Y(s), \tag{5}$$

where

$$G_2(s) = ke^{-s\tau}. \tag{6}$$

The closed loop system (see Fig. 1) will be defined by the following transfer function:

$$G(s) = \frac{G_1(s)G_2(s)}{1 - G_1(s)G_2(s)}. \tag{7}$$

Fig. 1. Closed loop system

We try to determine the range of allowable delays in order to guarantee stability for the system (7). The analysis of the closed loop system is made by using the Nyquist criterion (see for example [5]). Because all open loop poles are located on the imaginary axis, the system represented by the transfer function (7) will be asymptotically stable if there are no clockwise encirclements of the $(-1, j0)$ point by the Nyquist plot of $G_{12}(s) = -G_1(s)G_2(s)$. If $j\omega^*$ is a pole of $G_{12}(s)$ of multiplicity $m^*$, then the "points" $G_{12}(j\omega^*_-)$ and $G_{12}(j\omega^*_+)$ are connected in the clockwise direction by a circular arc of radius $R = \infty$ and angle $\varphi = m^*\pi$, which is centered at the origin.

The analysis of time-delay systems has been discussed by many authors; to mention only a few, we note the works [3] and [4]. A good source of references to papers in which stabilization problems are treated can be found in [5].

2 System without Damping

Let us consider the controllable second-order system without damping

$$\ddot{x}(t) + Ax(t) = Bu(t), \tag{8}$$

$$y(t) = B^T x(t), \tag{9}$$

where $x(t) \in R^2$, $u(t) \in R$, $y(t) \in R$, $t \ge 0$, $A \in \mathcal{L}(R^2, R^2)$, $A = A^T > 0$, $B \in \mathcal{L}(R, R^2)$. For the purpose of theoretical and numerical analysis we assume that

$$A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \tag{10}$$

If we take the Laplace transform in (8), (9) and use zero initial conditions, we obtain

$$G_1(s) = \frac{Y(s)}{U(s)} = B^T(s^2 + A)^{-1}B. \tag{11}$$

Simple calculations show that

$$G_1(s) = \frac{s^2 + 2}{(s^2 + 1)(s^2 + 3)}. \tag{12}$$
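The "simple calculations" leading to (12) can be cross-checked numerically. The Python sketch below evaluates (11) with the matrices (10) at a few sample points and compares the result with the closed form (12); the function names and the sample points are hypothetical choices made for this illustration.

```python
import numpy as np

A = np.array([[2.0, -1.0], [-1.0, 2.0]])
B = np.array([[1.0], [0.0]])

def g1_matrix(s):
    """G1(s) = B^T (s^2 I + A)^{-1} B, evaluated directly from (11)."""
    return (B.T @ np.linalg.solve(s**2 * np.eye(2) + A, B)).item()

def g1_closed_form(s):
    """Right-hand side of (12)."""
    return (s**2 + 2) / ((s**2 + 1) * (s**2 + 3))

test_points = [0.5, 2.0 + 1.0j, 0.3j, -1.5 + 0.7j]   # arbitrary sample points
max_err = max(abs(g1_matrix(s) - g1_closed_form(s)) for s in test_points)
print(max_err < 1e-12)
print(np.linalg.eigvalsh(A))   # eigenvalues of A are the squared pole frequencies
```

The eigenvalues of A are 1 and 3, which is why the denominator of (12) factors as (s² + 1)(s² + 3).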

The open loop system represented by the transfer function (12) is not asymptotically stable. The poles of (12) are located on the imaginary axis: $s_1 = j$, $s_2 = -j$, $s_3 = \sqrt{3}j$, $s_4 = -\sqrt{3}j$, $j^2 = -1$. In order to stabilize the system we consider the following feedback:

$$U(s) = G_2(s)Y(s), \quad G_2(s) = ke^{-s\tau}. \tag{13}$$

In this case the closed loop system (11), (13) with the matrices (10) is described by the transfer function

$$G(s) = \frac{G_1(s)G_2(s)}{1 - G_1(s)G_2(s)} = \frac{k(s^2 + 2)e^{-s\tau}}{s^4 + 4s^2 + 3 - k(s^2 + 2)e^{-s\tau}}. \tag{14}$$

The stability of the closed loop system (14) will be checked by exploring the Nyquist plot of

$$G_{12}(s) = -G_1(s)G_2(s). \tag{15}$$

The Nyquist plot allows us to gain insight into stability of the closed loop system by analyzing the contour of the frequency response function $G_{12}(j\omega)$ on the complex plane. In this case

$$G_{12}(j\omega) = \mathrm{Re}\,G_{12}(j\omega) + j\,\mathrm{Im}\,G_{12}(j\omega), \tag{16}$$

where

$$\mathrm{Re}\,G_{12}(j\omega) = k\frac{\omega^2 - 2}{\omega^4 - 4\omega^2 + 3}\cos(\omega\tau), \tag{17}$$

$$\mathrm{Im}\,G_{12}(j\omega) = -k\frac{\omega^2 - 2}{\omega^4 - 4\omega^2 + 3}\sin(\omega\tau). \tag{18}$$

We plot the graph (16) only for positive frequencies, that is for $\omega \in [0, +\infty)$. The second half of the curve is obtained by reflecting it about the real axis. The magnitude and the phase of the function (16) are given by

$$|G_{12}(j\omega)| = k\left|\frac{\omega^2 - 2}{\omega^4 - 4\omega^2 + 3}\right|, \tag{19}$$

$$\tan\theta(\omega) = -\tan(\omega\tau). \tag{20}$$
If $\omega = 0$, then $\mathrm{Im}\,G_{12}(j\omega) = 0$ and $\mathrm{Re}\,G_{12}(j\omega) = -\tfrac{2}{3}k$. We note that a necessary condition for stability is that $k < \tfrac{3}{2}$. This means that the start point of the curve $G_{12}(j\omega)$ will be located to the right of the $(-1, j0)$ point.

Let us consider what happens with the plot $G_{12}(j\omega)$ for $\omega \in (0, 1) \cup (1, \sqrt{3}) \cup (\sqrt{3}, +\infty)$. In this case the magnitude (19) is finite, therefore we need to find all intersections of the polar plot with the negative real axis. They will take place at the frequencies satisfying the following conditions:

$$\mathrm{Im}\,G_{12}(j\omega) = 0 \quad\text{and}\quad \mathrm{Re}\,G_{12}(j\omega) < 0. \tag{21}$$

At these frequencies the magnitude (19) must be less than 1, i.e.

$$|G_{12}(j\omega)| = k\left|\frac{\omega^2 - 2}{\omega^4 - 4\omega^2 + 3}\right| < 1. \tag{22}$$

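Then we will be sure that there are no encirclements of the $(-1, j0)$ point. The first condition $\mathrm{Im}\,G_{12}(j\omega) = 0$ is true when

$$\omega = \sqrt{2} \quad\text{or}\quad \omega = \frac{n\pi}{\tau}, \tag{23}$$

for all $n = 1, 2, \ldots$. At $\omega = \sqrt{2}$ we have $\mathrm{Re}\,G_{12}(j\omega) = 0$. This means that the magnitude $|G_{12}(j\omega)| = 0$. At $\omega = \frac{n\pi}{\tau}$, $n = 1, 2, \ldots$, the condition $\mathrm{Re}\,G_{12}(j\omega) < 0$ is equivalent to

$$(-1)^n k\frac{(n\pi\tau)^2 - 2\tau^4}{(n\pi)^4 - 4(n\pi\tau)^2 + 3\tau^4} < 0, \tag{24}$$

for all $n = 1, 2, \ldots$.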

Now, let us consider what happens at $\omega = 1$ and $\omega = \sqrt{3}$. Since the magnitude (19) is infinite at these frequencies, we need to be sure that

$$\lim_{\omega\to 1^-}\mathrm{Im}\,G_{12}(j\omega) > 0 \tag{25}$$

and

$$\lim_{\omega\to\sqrt{3}^-}\mathrm{Im}\,G_{12}(j\omega) > 0. \tag{26}$$

Then the "points" $G_{12}(j_-)$ and $G_{12}(j_+)$ (or $G_{12}(j\sqrt{3}_-)$ and $G_{12}(j\sqrt{3}_+)$) will be connected by the polar plot in the clockwise direction by a circular arc of radius $R = \infty$ and angle $\varphi = \pi$, which is centered at the origin. In other words, the $(-1, j0)$ point will not be encircled by the curve $G_{12}(j\omega)$. The inequalities (25) and (26) are equivalent, respectively, to

$$\sin\tau > 0 \quad\text{and}\quad \sin\sqrt{3}\tau > 0. \tag{27}$$


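Combining all conditions, we have:

(a) $k \in (0, \tfrac{3}{2})$, $\sin\tau > 0$, $\sin\sqrt{3}\tau > 0$,

(b) If there exists $\omega_0 = \frac{n\pi}{\tau} \notin \{1, \sqrt{3}\}$, $n = 1, 2, \ldots$, such that

$$(-1)^n k\frac{(n\pi\tau)^2 - 2\tau^4}{(n\pi)^4 - 4(n\pi\tau)^2 + 3\tau^4} < 0,$$

then it must satisfy

$$k\left|\frac{(n\pi\tau)^2 - 2\tau^4}{(n\pi)^4 - 4(n\pi\tau)^2 + 3\tau^4}\right| < 1.$$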

Fig. 2. Stability regions for the system (14) (axes: k, τ; marked points A, B and C)

Fig. 3. Trajectories of the closed loop system (14) corresponding to the points A, B and C (axes: t, y)


Fig. 4. The plot of the function G12(jω) corresponding to the parameters k = 1.2, τ = 0.9 (point A) (axes: real, imag; marked points ω = 0, ω → 1−, ω → 1+, ω → √3−, ω → √3+)

Fig. 5. The plot of the function G12(jω) corresponding to the parameters k = 0.5, τ = 7.53 (point B) (axes: real, imag; marked points ω = 0, ω → 1−, ω → 1+, ω → √3−, ω → √3+)

Fig. 2 illustrates the stability regions for the system (14). For example, the point A = (1.2, 0.9) is located in the stability region. The point B = (0.5, 7.53) corresponds to a system that is stable but not asymptotically stable. The point C = (1.0, 3.0) lies in the unstable region. The trajectories corresponding to the points A, B and C are shown in Fig. 3. Figs. 4, 5 and 6 present the Nyquist plots for the asymptotically stable, stable and unstable systems, respectively.


Fig. 6. The plot of the function G12(jω) corresponding to the parameters k = 1.0, τ = 3.0 (point C) (axes: real, imag; marked points ω = 0, ω → 1−, ω → 1+, ω → √3−, ω → √3+)

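3 Gyroscopic System

The gyroscopic system is a system of differential equations of the form

$$\ddot{x}(t) + G\dot{x}(t) + Ax(t) = Bu(t), \tag{28}$$

$$y(t) = B^T x(t), \tag{29}$$

where $x(t) \in R^2$, $u(t) \in R$, $y(t) \in R$, $t \ge 0$, $A \in \mathcal{L}(R^2, R^2)$, $A = A^T > 0$, $G \in \mathcal{L}(R^2, R^2)$, $G = -G^T$, $B \in \mathcal{L}(R, R^2)$. We assume that

$$A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad G = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}. \tag{30}$$

The Laplace transform of the system (28) and (29) determines the following transfer function:

$$G_1(s) = \frac{Y(s)}{U(s)} = B^T(s^2 + sG + A)^{-1}B. \tag{31}$$

Using (30) in (31) we obtain

$$G_1(s) = \frac{s^2 + 2}{s^4 + 5s^2 + 3}. \tag{32}$$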
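As with the undamped case, (32) can be cross-checked numerically against (31) with the matrices (30). The Python sketch below also computes the roots of the denominator of (32); the function names and sample points are hypothetical and serve only this illustration.

```python
import numpy as np

A = np.array([[2.0, -1.0], [-1.0, 2.0]])
B = np.array([[1.0], [0.0]])
G = np.array([[0.0, 1.0], [-1.0, 0.0]])

def g1_matrix(s):
    """G1(s) = B^T (s^2 I + s G + A)^{-1} B, evaluated directly from (31)."""
    return (B.T @ np.linalg.solve(s**2 * np.eye(2) + s * G + A, B)).item()

def g1_closed_form(s):
    """Right-hand side of (32)."""
    return (s**2 + 2) / (s**4 + 5 * s**2 + 3)

test_points = [0.5, 2.0 + 1.0j, 0.3j, -1.5 + 0.7j]   # arbitrary sample points
max_err = max(abs(g1_matrix(s) - g1_closed_form(s)) for s in test_points)

# Roots of s^4 + 5 s^2 + 3: two conjugate pairs on the imaginary axis
poles = np.roots([1.0, 0.0, 5.0, 0.0, 3.0])
freqs = np.sort(np.abs(poles.imag))[::2]             # one frequency per pair
expected = np.sqrt([(5 - np.sqrt(13)) / 2, (5 + np.sqrt(13)) / 2])
print(max_err < 1e-12, np.allclose(freqs, expected))
print(np.max(np.abs(poles.real)))                    # numerically close to zero
```

The expected frequencies come from solving the quadratic in s², whose roots (−5 ± √13)/2 are both negative.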

The open loop system is not asymptotically stable; its eigenvalues are located on the imaginary axis:

$$s_{1,2} = \pm j\sqrt{\frac{5 - \sqrt{13}}{2}}, \quad s_{3,4} = \pm j\sqrt{\frac{5 + \sqrt{13}}{2}}. \tag{33}$$


Fig. 7. Stability regions for the system (35) (axes: k, τ)

Let

$$U(s) = G_2(s)Y(s), \quad G_2(s) = ke^{-s\tau}. \tag{34}$$

The closed loop system (31), (34) with the matrices (30) is given by

$$G(s) = \frac{G_1(s)G_2(s)}{1 - G_1(s)G_2(s)} = \frac{k(s^2 + 2)e^{-s\tau}}{s^4 + 5s^2 + 3 - k(s^2 + 2)e^{-s\tau}}. \tag{35}$$

Let us note that the difference between the non-gyroscopic system (12) and the gyroscopic system (32) is in the denominator of the transfer function. Using the same technique as in the previous section, we can easily give the conditions that determine the range of allowable parameters k and τ. Writing $s_1 = j\omega_1$ and $s_3 = j\omega_3$ with $\omega_1, \omega_3 > 0$ (see (33)), they are as follows:

(a) $k \in (0, \tfrac{3}{2})$, $\sin\omega_1\tau > 0$, $\sin\omega_3\tau > 0$,

(b) If there exists $\omega_0 = \frac{n\pi}{\tau} \notin \{\omega_1, \omega_3\}$, $n = 1, 2, \ldots$, such that

$$(-1)^n k\frac{(n\pi\tau)^2 - 2\tau^4}{(n\pi)^4 - 5(n\pi\tau)^2 + 3\tau^4} < 0,$$

then it must satisfy

$$k\left|\frac{(n\pi\tau)^2 - 2\tau^4}{(n\pi)^4 - 5(n\pi\tau)^2 + 3\tau^4}\right| < 1.$$

Fig. 7 shows the graphical representation of the stability regions for the system (35).

Based on our discussion, we can establish an algorithm for finding the range of allowable parameters of the positive time-delay controller (4) in order to guarantee stability of the general second-order system (1), (2). The algorithm can be easily implemented in the MATLAB-Simulink environment.


ALGORITHM: Finding stability regions for the generalized second-order system with positive time-delay feedback.

INPUT: The matrices $G \in \mathbb{R}^{n\times n}$, $A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{n\times 1}$, $C \in \mathbb{R}^{1\times n}$; the transfer function of the system

$$G_1(s) = C[s^2 + sG + A]^{-1}B,$$

the transfer function of the controller

$$G_2(s) = k e^{-s\tau},$$

and the transfer function of the closed-loop system

$$G(s) = \frac{G_1(s)G_2(s)}{1 - G_1(s)G_2(s)}.$$

OUTPUT: The set $S = \{(k, \tau) \in \mathbb{R}^2 : \text{the closed-loop system is asymptotically stable}\}$.

ASSUMPTIONS: $G = -G^T$, $A = A^T > 0$, the system is observable, the open-loop system has all eigenvalues located on the imaginary axis, and the multiplicity of all eigenvalues is equal to one.

STEP 1: Find the poles of the open-loop system: $s_i = j\omega_i$, $\omega_i > 0$, $i = 1, 2, \ldots, n$.

STEP 2: Determine the set $S_1 = \{(k, \tau) \in \mathbb{R}^2 : \tau > 0,\ k > 0,\ kCA^{-1}B < 1\}$.

STEP 3: Determine the set $\Omega = \{\omega \in (0, +\infty) \setminus \{\omega_1, \omega_2, \ldots, \omega_n\} : \operatorname{Im} G_{12}(j\omega) = 0 \text{ and } \operatorname{Re} G_{12}(j\omega) < 0\}$.

STEP 4: Determine the set $S_2 = \{(k, \tau) \in \mathbb{R}^2 : \forall \omega^* \in \Omega\ \ |G_{12}(j\omega^*)| < 1,\ k > 0,\ \tau > 0\}$.

STEP 5: Determine the set $S_3 = \{(k, \tau) \in \mathbb{R}^2 : \lim_{\omega \to \omega_i^-} \operatorname{Im} G_{12}(j\omega) > 0,\ i = 1, 2, \ldots, n\}$.

STEP 6: Determine the set $S = S_1 \cap S_2 \cap S_3$.
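The building blocks of the algorithm can be sketched for the gyroscopic example (32). The sketch below is illustrative only and uses Python rather than MATLAB-Simulink. It assumes that $G_{12}$ denotes the loop gain $G_1 G_2$ (its definition lies outside this excerpt), and it uses the fact that $CA^{-1}B = G_1(0)$. The function names are hypothetical. Only STEP 2 is implemented as a test; STEPS 3 to 5 would scan `g12` over a frequency grid.

```python
import cmath


def g1(s):
    """Open-loop transfer function (32) of the gyroscopic example."""
    return (s ** 2 + 2) / (s ** 4 + 5 * s ** 2 + 3)


def g12(omega, k, tau):
    """Assumed loop gain G1(jw) * G2(jw) with G2(s) = k * exp(-s * tau)."""
    s = 1j * omega
    return g1(s) * k * cmath.exp(-s * tau)


def in_s1(k, tau):
    """STEP 2: tau > 0, k > 0 and k * C A^{-1} B < 1, with C A^{-1} B = G1(0)."""
    static_gain = g1(0)
    return tau > 0 and k > 0 and k * static_gain < 1


print(g1(0))              # static gain of the example, 2/3
print(in_s1(1.0, 0.5))    # k = 1 lies inside S1
print(in_s1(2.0, 0.5))    # k = 2 violates k * G1(0) < 1
```

For this example $G_1(0) = 2/3$, so STEP 2 reproduces the bound $k < 3/2$ of condition (a). The delay factor $e^{-j\omega\tau}$ has unit modulus, so it rotates the Nyquist plot of $kG_1(j\omega)$ without changing its magnitude.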


In this paper the stabilization problem of matrix second-order systems has been discussed. We have presented our results for the single-input single-output case. The systems have all poles located on the imaginary axis. We have proved that the system can be stabilized by delayed positive feedback. The analysis of the closed-loop system has been performed using the Nyquist criterion. An algorithm for finding stability regions has been proposed and then validated by a series of numerical computations in the MATLAB-Simulink environment. It would be interesting to extend the results to infinite-dimensional second-order dynamical systems described by singular partial differential equations.

Acknowledgement

This work was supported by the Ministry of Science and Higher Education in Poland in the years 2008–2011 as research project No. N N514 414034.


References

[1] Abdallah, C., Dorato, P., Benitez-Read, J., Byrne, R.: Delayed positive feedbackcan stabilize oscillatory systems. In: Proc. of the American Control Conference,San Francisco CA, pp. 3106–3107 (1993)

[2] Bus�lowicz, M.: Stabilization of LC ladder network by delayed positive feedbackfrom output. In: Proc. XXVII International Conference on Fundamentals of Elec-trotechnics and Circuit Theory, IC-SPETO 2004, pp. 265–268 (2004) (in Polish)

[3] Elsgolc, L.E.: Intoduction to the theory of differential equations with delayed ar-gument. Nauka, Moscow (1964) (in Russian)

[4] Gorecki, H., Fuksa, S., Grabowski, P., Korytowski, A.: Analysis and synthesis oftime delay systems. PWN, Warszawa (1989)

[5] Mitkowski, W.: Stabilization of dynamic systems. WNT, Warszawa (1991) (in Pol-ish)

[6] Mitkowski, W.: Static feedback stabilization of RC ladder network. In: Proc.XXVIII International Conference on Fundamentals of Electrotechnics and CircuitTheory, IC-SPETO, pp. 127–130 (2005)

[7] Skruch, P.: Stabilization of second-order systems by non-linear feedback. Int. J.Appl. Math. Comput. Sci. 14(4), 455–460 (2004)

A Comparison of Modeling Approaches for the Spread of Prion Diseases in the Brain

Franziska Matthäus

Interdisciplinary Center for Scientific Computing, University of Heidelberg, [email protected]

Abstract. In this article we present and compare two different modeling approaches for the spread of prion diseases in the brain. The first is a reaction-diffusion model, which allows the description of prion spread in simple brain subsystems, like nerves or the spine. The second approach is the combination of epidemic models with transport on complex networks. With the help of these models we study the dependence of the disease progression on transport phenomena and the topology of the underlying network.

1 Introduction

The progression of prion diseases is accompanied on the one hand by the multiplication of the infective agent, and on the other hand by its spatial dispersion in the brain. However, models developed to describe the kinetics of prion disease progression usually include reaction terms but neglect prion transport. To close this gap, we present in this article different approaches to modeling prion propagation, with a special focus on the spatial component. The spatial models provide information about the distribution of prions in space, in addition to the temporal evolution of the concentration, and allow us to determine the dependence of the concentration kinetics on prion transport.

Prion diseases are fatal neurodegenerative diseases caused by an infective agent that is neither a virus nor any other conventional agent, but a particle consisting solely of a misfolded protein, PrPsc [14]. This protein is an isoform of the native cellular prion protein, PrPc, which is present in many tissues but with the highest concentration in the brain. In comparison to the mainly alpha-helical PrPc, PrPsc is dominated by beta-sheet structure and characterized by a higher resistance to degradation by proteinase K and a tendency to aggregate.

The infectivity of PrPsc lies in its ability to interact with the native PrPc, resulting in a change of PrPc into the pathologic form PrPsc. The interaction mechanism, however, is not fully understood.

Because we are interested mainly in spatial effects, we will focus on the simplest kinetic model of prion-prion interaction. In the so-called heterodimer model [14, 5], PrPc is converted upon interaction with a single prion particle (see Figure 1). After the conversion, the two resulting infective agents dissociate and the process can start again with new PrPc.

Fig. 1. Scheme of the heterodimer model

Prion transport in the brain is another field where experimental data are sparse. However, there are indications that prions move within the brain via axonal transport [1], where the transport happens in both directions (anterograde and retrograde). The speed of 1 mm/d coincides with the speed of passive neuronal transport [7].

2 The Reaction-Diffusion Approach

The heterodimer model (see Figure 1) can be written in the form of two differential equations, one describing the concentration dynamics of PrPc and one for the concentration dynamics of PrPsc. We denote the concentration of PrPc by $A$ and the concentration of PrPsc by $B$. In the model, PrPc is produced at a constant rate $v_0$ and degraded at rate $k_A$. Conversion is proportional to both concentrations, $A$ and $B$, with a constant $k_{AB}$. PrPsc is degraded at a rate $k_B$, where $k_A > k_B$ because of the higher resistance of PrPsc to proteases. For the spatial model we assume a one-dimensional domain $\Omega = \{x \in \mathbb{R}^1 : 0 \le x \le L\}$ and zero-flux boundary conditions,

$$\left.\frac{\partial A}{\partial x}\right|_{\partial\Omega\times T} = 0, \quad \text{and} \quad \left.\frac{\partial B}{\partial x}\right|_{\partial\Omega\times T} = 0. \tag{1}$$

The one-dimensional domain can be associated with simple brain substructures, like nerves or the spine. The reaction-diffusion system for the heterodimer model then has the following form:

$$\frac{\partial A}{\partial t} = v_0 - k_A \cdot A - k_{AB} \cdot A \cdot B + D\nabla^2 A$$
$$\frac{\partial B}{\partial t} = k_{AB} \cdot A \cdot B - k_B \cdot B + D\nabla^2 B, \tag{2}$$

with the initial conditions $A(0, x) = A_0(x) \ge 0$, $B(0, x) = B_0(x) \ge 0$.

This set of equations (2) has been used by Payne and Krakauer [13] to study inter-strain competition. Qualitatively, they could show how after co-infection with two different prion strains the first inoculated strain can slow down or even stop the spread of the second strain and prevail, even if it has a longer incubation period.

The parameters for this model have been estimated in [10], and are summarized in Table 1. With the estimated parameter values the solutions of the system (2) can be analyzed qualitatively as well as quantitatively, and allow comparison with results from real experiments.

Table 1. Kinetic parameters for prion-prion interaction

Parameter   Value   Unit
v0          4       μg/(g·d)
kA          4       d^-1
kB          0.03    d^-1
kAB         0.15    (μg·d/g)^-1
D           0.05    mm^2/d

2.1 Results from the Reaction-Diffusion Approach

We solve Equations (2) with (1) numerically using an implicit Euler discretization scheme and the initial conditions $A(0, x) = A_1^*$, $B(0, 0) > 0$ but small, and $B(0, x) = 0$ for $x > 0$, which corresponds to an infection with scrapie prion at one end of the domain. For these initial conditions, the solutions exhibit traveling wave behavior, as shown in Figure 2.
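A rough way to reproduce this qualitative behavior is a finite-difference simulation of system (2) with the parameters of Table 1. The sketch below is not the author's implicit Euler code: for brevity it uses an explicit Euler step with a small time step. The grid spacing of 1 mm, the time step of 0.04 d and the domain length of 100 mm are assumptions, the healthy steady state is taken as $A_1^* = v_0/k_A$, and the zero-flux boundaries (1) are imposed with mirrored ghost nodes.

```python
def simulate(t_end=450.0, length=100.0, dx=1.0, dt=0.04,
             v0=4.0, k_a=4.0, k_b=0.03, k_ab=0.15, d=0.05, b_seed=0.025):
    """Explicit finite-difference sketch of system (2) with zero-flux boundaries."""
    n = int(round(length / dx)) + 1
    a = [v0 / k_a] * n            # healthy steady state A1* everywhere
    b = [0.0] * n
    b[0] = b_seed                 # small infection at the left end, x = 0
    coef = d / dx ** 2
    for _ in range(int(round(t_end / dt))):
        # Mirrored ghost nodes enforce dA/dx = dB/dx = 0 at both ends.
        a_pad = [a[1]] + a + [a[-2]]
        b_pad = [b[1]] + b + [b[-2]]
        new_a, new_b = [], []
        for i in range(n):
            lap_a = coef * (a_pad[i] - 2.0 * a[i] + a_pad[i + 2])
            lap_b = coef * (b_pad[i] - 2.0 * b[i] + b_pad[i + 2])
            conversion = k_ab * a[i] * b[i]
            new_a.append(a[i] + dt * (v0 - k_a * a[i] - conversion + lap_a))
            new_b.append(b[i] + dt * (conversion - k_b * b[i] + lap_b))
        a, b = new_a, new_b
    return a, b


a_final, b_final = simulate()
print(b_final[0], b_final[-1])
```

The explicit step is only stable for small `dt` (the fastest reaction rate here is about 20 per day), which is one reason an implicit scheme is the more robust choice for production runs.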

Fig. 2. Snapshot of a traveling wave for the heterodimer model (t = 450 days, B(0) = 0.025 μg/g). [Plot; horizontal axis: distance in mm, vertical axis: B in μg/g.]

For the heterodimer model (2), the speed $c_B$ of the traveling wave front for scrapie prion can be determined analytically [10, 12], and depends on the kinetic parameters as $c_B = \sqrt{D \cdot k_{AB} \cdot (A_1^* - A_2^*)}$, where $A_1^*$ and $A_2^*$ stand for the steady-state concentrations of cellular prion in the healthy system (absence of scrapie prion) and in the diseased system (after infection with scrapie prions), respectively.
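The steady states follow directly from the reaction terms of (2). Without scrapie prion, $v_0 - k_A A = 0$ gives $A_1^* = v_0/k_A$. In the diseased state, $k_{AB} A - k_B = 0$ gives $A_2^* = k_B/k_{AB}$, and the first equation then yields the corresponding PrPsc level. The sketch below evaluates these values and the wave speed expression as printed above for the parameters of Table 1; the function and constant names are chosen here for illustration.

```python
import math

# Kinetic parameters from Table 1.
V0 = 4.0       # production rate of PrPc, ug/(g d)
K_A = 4.0      # degradation rate of PrPc, 1/d
K_B = 0.03     # degradation rate of PrPsc, 1/d
K_AB = 0.15    # conversion constant, (ug d/g)^-1
D = 0.05       # diffusion coefficient, mm^2/d


def steady_states(v0=V0, k_a=K_A, k_b=K_B, k_ab=K_AB):
    """Return (A1*, A2*, B2*) for the reaction terms of system (2)."""
    a_healthy = v0 / k_a                        # B = 0
    a_diseased = k_b / k_ab                     # from dB/dt = 0 with B > 0
    b_diseased = (v0 - k_a * a_diseased) / (k_ab * a_diseased)
    return a_healthy, a_diseased, b_diseased


def wave_speed(d=D, k_ab=K_AB):
    """Front speed c_B, using the expression as printed in the text."""
    a1, a2, _ = steady_states()
    return math.sqrt(d * k_ab * (a1 - a2))


a1, a2, b2 = steady_states()
c_b = wave_speed()
print(a1, a2, b2, c_b)
```

With the values of Table 1 this gives $A_1^* = 1$ μg/g, $A_2^* = 0.2$ μg/g and a diseased PrPsc level of roughly 107 μg/g, which is consistent with the plateau height visible in Fig. 2.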

With the spatial model (2) it can also be shown that the diffusion coefficient has an influence on the overall concentration dynamics of PrPsc. In Figure 3 we show the dynamics of the PrPsc concentration, averaged over the domain Ω, for varying D. For small diffusion coefficients, the traveling wave forms and the access of PrPsc to its substrate PrPc is limited. In this case the concentration dynamics are dominated by linear growth. For large diffusion coefficients PrPsc quickly distributes in space and the concentration dynamics show a sigmoidal evolution, similar to the results of the heterodimer model without diffusion.

Fig. 3. Dependence of the overall concentration dynamics on the diffusion coefficient. [Plot; horizontal axis: time in days, vertical axis: B concentration; curves for D = 0.9, D = 0.5, D = 0.1 and D = 0.05.]

Fig. 4. Incubation times depending on the interval between intraocular infection and surgical eye removal. Experimental data from [16] (× with error bars) and simulation data for two different parameter sets. [Plot titled "Heterodimer model"; horizontal axis: time of eye removal (in days), vertical axis: incubation time (in days); reference level: incubation time for unenucleated mice.]

The model (2) can also be related to experiments, for example when modeling the spread of prions in the mouse visual system. Here we can make use of the fact that the mouse visual system is nearly linear, with the optic nerve projecting from the eye to the lateral geniculate nucleus (LGN), and the optic radiations then projecting from the LGN to the visual cortex. Because of this simple structure the system can be approximated by a one-dimensional domain and our model applies.

Scott et al. [15, 16] carried out experiments to show the dependence of the incubation time $t_{inc}$ on the dose of intraocularly injected scrapie material. Furthermore, they investigated how the incubation time changes when the eye is surgically removed at different time intervals after intraocular infection. For the first experiment, the relationship $t_{inc} \propto \log(\text{dose})$ that was found can be easily reproduced with our spatial model. Here, however, the spatial component is not essential, as any model with a near-exponential initial phase would give the same result. The situation is different for the experiments regarding the surgical eye removal, where a model without a spatial component is not sufficient. With a spatial model, eye removal can be simulated by a change in the domain Ω, in particular by inserting zero-flux boundary conditions at the position where the optic nerve is cut. To compare the results of the simulations with the real data we modified the model slightly to obtain a better description of the spatial domain and the steady-state distribution of PrPc. For details see [9, 10]. The results of the simulation fit well to the experimental data, which show a decreasing incubation time for larger intervals between infection and surgical eye removal (see Figure 4).

3 Network Models

The complexity of the brain neuronal network and the fact that prions are transported across the edges of this network make the application of reaction-diffusion equations to larger brain systems very difficult. However, some results on the spread of infections on networks can be obtained by combining epidemic models with transport on networks. In the previous section we showed that the disease kinetics depend on prion transport. In the present section we will show that the topology of the underlying network affects the disease spread as well.

Networks consist of a set of N nodes and M edges. Here the nodes represent the neuronal cells, and the edges denote whether a connection (in the form of a synapse or gap junction) exists between two cells. The number of edges originating from a node corresponds to the number of neighbors of the node and is called the node's degree k. The average of the degrees of all nodes in the network is called the degree of the network ⟨k⟩. Networks can be classified according to their degree distribution P(k). In this article we will focus on one network model, called small world [17]. Small worlds can be constructed from d-dimensional regular grids by rewiring the edges with a probability p. Depending on p, this model interpolates between regular and totally random networks, and is therefore a good model for our purpose.

To describe the spread of infective diseases on networks, the network model is combined with a model of epidemic diseases, the SI model. The SI model classifies nodes into two discrete states, namely susceptible or infected. Susceptible nodes become infected with probability ξ, where ξ is a function of the transmission probability λ between two neighbors and the number m of infected neighbors:

$$\xi = 1 - (1 - \lambda)^m.$$
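The expression for ξ is the probability that at least one of m independent transmission attempts succeeds. A one-line Python sketch (the function name is chosen here for illustration):

```python
def infection_probability(lam, m):
    """Probability that a susceptible node with m infected neighbors is infected."""
    return 1.0 - (1.0 - lam) ** m


print(infection_probability(0.5, 2))
```

A node without infected neighbors (m = 0) cannot be infected, and the probability approaches one as m grows.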

3.1 Results from the Network Approach

To show how the network topology affects the speed of the epidemic spread, we run simulations on two-dimensional small worlds. All networks consist of $10^4$ nodes and have an average degree of ⟨k⟩ = 3.96, but differ in the rewiring probability p. Every simulation is started with a single (randomly chosen) infected node, and we measure the number of iterations until 95% of the network is infected. The result for every p is the average over various realizations of the network.

Fig. 5. Number of iterations until 95% of the network is infected, depending on the rewiring probability p. [Plot; horizontal axis: rewiring probability (logarithmic, $10^{-4}$ to $10^{0}$), vertical axis: number of iterations until 95% infected.]

Fig. 6. Heterogeneity of 2-dimensional small worlds in dependence on the rewiring probability. [Plot; horizontal axis: rewiring probability, vertical axis: heterogeneity ⟨k²⟩/⟨k⟩.]
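The simulation described above can be sketched in a few lines of Python. The details are assumptions made for illustration: the lattice is an open (non-periodic) square grid, which gives ⟨k⟩ = 3.96 for 100 × 100 nodes; rewiring moves one end of an edge to a random node; and all susceptible nodes are updated synchronously. A 20 × 20 grid is used here to keep the run short, and the start node is fixed so that the lattice case is deterministic.

```python
import random


def grid_edges(side):
    """Edges of an open (non-periodic) side x side square lattice."""
    edges = []
    for r in range(side):
        for c in range(side):
            node = r * side + c
            if c + 1 < side:
                edges.append((node, node + 1))
            if r + 1 < side:
                edges.append((node, node + side))
    return edges


def small_world(side, p, rng):
    """Rewire the far end of each lattice edge with probability p."""
    n = side * side
    neighbors = [set() for _ in range(n)]
    for u, v in grid_edges(side):
        neighbors[u].add(v)
        neighbors[v].add(u)
    for u, v in grid_edges(side):
        if rng.random() < p:
            w = rng.randrange(n)
            if w != u and w not in neighbors[u]:   # no self-loops, no duplicates
                neighbors[u].discard(v)
                neighbors[v].discard(u)
                neighbors[u].add(w)
                neighbors[w].add(u)
    return neighbors


def iterations_until_infected(neighbors, lam, rng, start=0,
                              fraction=0.95, max_iter=10000):
    """Run the SI model; return the iteration at which `fraction` is infected."""
    n = len(neighbors)
    infected = {start}
    for step in range(1, max_iter + 1):
        newly = set()
        for node in range(n):
            if node in infected:
                continue
            m = sum(1 for nb in neighbors[node] if nb in infected)
            if m and rng.random() < 1.0 - (1.0 - lam) ** m:
                newly.add(node)
        infected |= newly
        if len(infected) >= fraction * n:
            return step
    return None   # threshold not reached, e.g. on a disconnected network


rng = random.Random(1)
lattice = small_world(20, 0.0, rng)
rewired = small_world(20, 1.0, rng)
steps_lattice = iterations_until_infected(lattice, 1.0, rng, start=0)
print(steps_lattice)
```

With λ = 1 and p = 0 the infection front advances one lattice step per iteration, so the result depends only on the grid distance from the start node. Averaging over random start nodes and network realizations, as done in the text, would be layered on top of these functions.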

The result, displayed in Figure 5, shows clearly that the velocity of the spread increases with increasing rewiring probability. The crucial network feature is the degree heterogeneity of the network, defined as ⟨k²⟩/⟨k⟩. For small worlds, the degree heterogeneity increases with p, as shown in Figure 6. In [3] it was shown that the time scale τ of the initial exponential growth of the epidemic is related to the degree heterogeneity as

$$\tau = \frac{\langle k \rangle}{\lambda(\langle k^2 \rangle - \langle k \rangle)}. \tag{3}$$

This relation shows that for scale-free networks, characterized by a power-law degree distribution with $P(k) \propto k^{-\alpha}$, the epidemic can have an extremely fast initial growth, because here ⟨k²⟩ diverges with the network size.
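Both the degree heterogeneity and the time scale (3) are simple moments of the degree sequence. A minimal sketch, with function names chosen here for illustration:

```python
def degree_moments(degrees):
    """Return (<k>, <k^2>) for a list of node degrees."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    return mean_k, mean_k2


def degree_heterogeneity(degrees):
    """Heterogeneity <k^2> / <k>."""
    mean_k, mean_k2 = degree_moments(degrees)
    return mean_k2 / mean_k


def growth_time_scale(degrees, lam):
    """Time scale tau of the initial exponential growth, relation (3)."""
    mean_k, mean_k2 = degree_moments(degrees)
    return mean_k / (lam * (mean_k2 - mean_k))


regular = [4] * 10          # every node has degree 4
mixed = [2, 6] * 5          # same mean degree, larger spread
print(degree_heterogeneity(regular), degree_heterogeneity(mixed))
print(growth_time_scale(regular, 1.0), growth_time_scale(mixed, 1.0))
```

The two example sequences have the same mean degree of 4, but the more heterogeneous one has the shorter time scale, which is the effect described in the text.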


Fig. 7. Survival time of nodes in random networks in dependence on their degree. [Plot; horizontal axis: node degree, vertical axis: average survival time (in iterations).]

The neuronal network of the brain is an example of a very large network, and although the degree variability of the nodes is bounded by the number of synapses a cell can form, this number can be as high as $2 \cdot 10^5$ for Purkinje cells [18]. To determine the exact growth rate of the number of infected cells, estimates for the transmission probability and for the degree heterogeneity are needed; both still require experimental and theoretical investigation.

The spread of epidemics on networks differs in many aspects from diffusive spread on homogeneous domains. One example is the following: on a homogeneous domain, the time when a cell becomes infected depends only on its distance from the origin of the infection. On networks, this time is also influenced by the degree of the node. To show this, we set p = 1 to obtain networks with a large degree variation and again simulate the outbreak of an epidemic applying the SI model.

Figure 7 shows the average survival time of nodes in dependence on their degree. One can see that on average nodes of high degree are infected earlier (have shorter survival times) than nodes of low degree. The reason is that nodes of high degree have more neighbors from which they can contract the disease. Instead of looking at the survival times directly, Barthelemy et al. [3] measured (with the same result) the average degree of the newly infected nodes and the inverse participation ratio, defined as $Y_2(t) = \sum_k (I_k/I)^2$, where $I_k/I$ denotes the fraction of infected nodes of degree k in relation to all infected nodes.
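The quantity $Y_2$ can be computed from the list of degrees of the currently infected nodes. A minimal sketch, with a function name chosen here for illustration:

```python
from collections import Counter


def inverse_participation_ratio(infected_degrees):
    """Y2 = sum over k of (I_k / I)^2, from the degrees of the infected nodes."""
    total = len(infected_degrees)
    counts = Counter(infected_degrees)        # I_k for every degree k present
    return sum((i_k / total) ** 2 for i_k in counts.values())


print(inverse_participation_ratio([3, 3, 3, 3]))   # one degree class only
print(inverse_participation_ratio([1, 2, 3, 4]))   # spread over four classes
```

$Y_2$ equals one when all infected nodes share the same degree and decreases as the infection is distributed evenly over more degree classes.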

4 Conclusions

The two approaches describe the disease progression on different scales. The diffusion approach focuses on the mechanism of prion-prion interaction, but is limited to simple spatial domains. The network approach takes into account the complexity of the domain in which the transport of the infective agent takes place, but is therefore no longer specific to prion diseases.

The problem with models of prion spread in the brain is the scarcity of experimental data. Not only is the prion interaction mechanism not fully understood, but the topology of the brain neuronal network is also unclear. The aim of this article is to present some general results obtained by the use of very simple models.

With the appearance of new experimental data, the development of more detailed models will become feasible. One possibility is the combination of a kinetic model for prion-prion interaction with transport on networks, and thus the study of reaction-diffusion systems on networks. Some work on reaction-diffusion systems on networks has been carried out, for example, in [6], which deals with annihilation processes of the types A + A → 0 and A + B → 0 on scale-free networks, or in [2], where the Gierer-Meinhardt model was studied on random and scale-free networks. The models for prion spread derived by combining prion-prion interaction with transport on networks should eventually not only account for long incubation periods but also provide a description of local prion accumulations and the formation of plaques.

References

[1] Armstrong, R.A., Lantos, P.L., Cairns, N.J.: The spatial patterns of prion deposits in Creutzfeldt-Jakob disease: comparison with β-amyloid deposits in Alzheimer's disease. Neurosci. Lett. 298, 53–56 (2001)

[2] Banerjee, S., Mallik, S.B., Bose, I.: Reaction-diffusion processes on random and scale-free networks (2004) arXiv:cond-mat/0404640

[3] Barthelemy, M., Barrat, A., Pastor-Satorras, R., Vespignani, A.: Velocity and hierarchical spread of epidemic outbreaks in scale-free networks (2004) arXiv:cond-mat/0311501

[4] Eigen, M.: Prionics or the kinetic basis of prion diseases. Biophys. Chem. 63, A1–A18 (1996)

[5] Galdino, M.L., de Albuquerque, S.S., Ferreira, A.S., Cressoni, J.C., dos Santos, R.J.V.: Thermo-kinetic model for prion diseases. Phys. A 295, 58–63 (2001)

[6] Gallos, L.K., Argyrakis, P.: Absence of kinetic effects in reaction-diffusion processes in scale-free networks. Phys. Rev. Lett. 92(13), 138301 (2004)

[7] Glatzel, M., Aguzzi, A.: Peripheral pathogenesis of prion diseases. Microbes Infect. 2, 613–619 (2000)

[8] Harper, J.D., Lansbury Jr., P.T.: Models of amyloid seeding in Alzheimer's disease and scrapie: mechanistic truths and physiological consequences of the time-dependent solubility of amyloid proteins. Annu. Rev. Biochem. 66, 385–407 (1997)

[9] Matthäus, F.: Hierarchical modeling of prion spread in brain tissue, PhD thesis (2005)

[10] Matthäus, F.: Diffusion versus network models as descriptions for the spread of prion diseases in the brain. J. Theor. Biol. (in press) (2005)

[11] Masel, J., Jansen, V.A.A., Nowak, M.A.: Quantifying the kinetic parameters of prion replication. Biophys. Chem. 77, 139–152 (1999)

[12] Murray, J.D.: Mathematical Biology. Springer, Heidelberg (1989)

[13] Payne, R.J.H., Krakauer, D.C.: The spatial dynamics of prion disease. Proc. R. Soc. Lond. B 265, 2341–2346 (1998)

[14] Prusiner, S.B.: Prions. Proc. Natl. Acad. Sci. USA 95, 13363–13383 (1998)

[15] Scott, J.R., Davies, D., Fraser, H.: Scrapie in the central nervous system: neuroanatomical spread of infection and Sinc control of pathogenesis. J. Gen. Virol. 73, 1637–1644 (1992)

[16] Scott, J.R., Fraser, H.: Enucleation after intraocular scrapie injection delays the spread of infection. Brain Res. 504, 301–305 (1989)

[17] Watts, D.J., Strogatz, S.H.: Collective dynamics of 'small-world' networks. Nature 393, 440–442 (1998)

[18] http://faculty.washington.edu/chudler/facts.html#brain

Ensemble Modeling for Bio-medical Applications

Christian Merkwirth¹, Jörg Wichard²,⁴, and Maciej J. Ogorzałek¹,³

1 Department of Information Technologies, Jagiellonian University, ul. Reymonta 4, Cracow, [email protected]

2 Institute of Molecular Pharmacology, Robert-Rössle-Str. 10, Berlin, [email protected]

3 AGH University of Science and Technology, al. Mickiewicza 30, Cracow, [email protected]

4 Institut für Medizinische Informatik, Charité, Hindenburgdamm 30, 12200 Berlin, Germany

Abstract. In this paper we propose to use ensembles of models constructed using methods of Statistical Learning. The input data for model construction consist of real measurements taken in the physical system under consideration. Further, we propose a program toolbox which allows the construction of single models as well as heterogeneous ensembles of linear and nonlinear model types. Several well-performing model types, among which are ridge regression, k-nearest neighbor models and neural networks, have been implemented. Ensembles of heterogeneous models typically yield a better generalization performance than homogeneous ensembles. Additionally given are methods for model validation and assessment as well as adaptor classes performing transparent feature selection or random subspace training on a large number of input variables. The toolbox is implemented in Matlab and C++ and available under the GPL. Several applications of the described methods and the numerical toolbox itself are described. These include ECG modeling, classification of activity in drug design and ...

1 Introduction

Ensemble methods have gained increasing attention in the last decade [1, 2] and seem to be a promising approach for improving the generalization error of existing statistical learning algorithms in regression and classification tasks. The output of an ensemble model is the average of the outputs of the individual models belonging to the ensemble. In prediction problems an ensemble typically outperforms single models. Almost all ensemble methods described so far use models of one single class, e.g. neural networks [1, 2, 3, 4] or regression trees [5].

We suggest building ensembles of different model classes to improve the performance in regression problems. The theoretical background of our approach is provided by the bias/variance decomposition of the ensemble. We argue that an ensemble of heterogeneous models usually leads to a reduction of the ensemble variance because the cross terms in the variance contribution have a higher ambiguity. Further, we describe the structure of the programming toolkit and its usage.



2 Learning Dependency from Data

The modeling problem can be described as follows [6]:

Given: A series of input-output pairs $(x^\mu, y^\mu)$ with $\mu = 1, \ldots, N$, or a functional dependence $y(x)$ (possibly corrupted by noise), we would like to choose a model (function) $f$ out of some hypothesis space $H$ that is as close to the true $f$ as possible.

Two different tasks can be considered:

• Classification $f : \mathbb{R}^D \to \{0, 1, 2, \ldots\}$: discrete classes, enabling sorting of the input data into distinct classes having specific properties
• Regression $f : \mathbb{R}^D \to \mathbb{R}$: continuous output, finding a dependency on time or parameters.

2.1 Model Types Used in Statistical Learning

There exists a vast variety of available models described in the literature, which can be grouped into some general classes [6]:

• Global Models
  – Linear Models
  – Polynomial Models
  – Neural Networks (MLP)
  – Support Vector Machines

• Semi-global Models
  – Radial Basis Functions
  – Multivariate Adaptive Regression Splines (MARS)
  – Decision Trees (C4.5, CART)

• Local Models
  – k-Nearest-Neighbors

• Hybrid Models
  – Projection Based Radial Basis Functions Network (PRBFN)

Implementing any of these modeling methods usually leads to the solution of an optimization problem and, further, to operations such as matrix inversion in the case of linear regression, minimization of a loss function on the training data, or a quadratic programming problem (e.g. for SVMs).

2.2 Validation and Model Selection

The key to model selection is the generalization error: how does the model perform on unseen data (samples)? The exact generalization error is not accessible since we have only a limited number of observations. Training on a small data set tends to overfit, causing the generalization error to be significantly higher than the training error. This is a consequence of a mismatch between the capacity of the hypothesis space H (VC-dimension) and the number of training observations.


Any type of model constructed has to pass the validation stage, i.e. estimation of the generalization error using just the given data set. Logically, we select the model with the lowest (estimated) generalization error. Typical remedies to improve the generalization error are:

• Manipulating the training algorithm (e.g. early stopping)
• Regularization by adding a penalty to the loss function
• Using algorithms with built-in capacity control (e.g. SVM)
• Relying on criteria like BIC, AIC, GCV or Cross Validation to select the optimal model complexity
• Reformulating the loss function, e.g. by using an ε-insensitive loss

3 Ensemble Methods

Building an ensemble consists of averaging the outputs of several separately trained models:

• Simple average $\bar{f}(x) = \frac{1}{K} \sum_{k=1}^{K} f_k(x)$
• Weighted average $\bar{f}(x) = \sum_k w_k f_k(x)$ with $\sum_k w_k = 1$

For squared error, the ensemble generalization error is never larger than the average error of the individual models. An ensemble should consist of well-trained but diverse models.
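Both averaging rules can be expressed by one small function. The sketch below treats each trained model as a plain Python callable; the function name and the toy models are hypothetical and serve only to illustrate the two formulas.

```python
def ensemble_predict(models, x, weights=None):
    """Simple (weights=None) or weighted average of the model outputs at x."""
    k = len(models)
    if weights is None:
        weights = [1.0 / k] * k               # simple average
    if len(weights) != k or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must match the models and sum to one")
    return sum(w * model(x) for w, model in zip(weights, models))


# Three toy "models" standing in for separately trained predictors.
models = [lambda x: x, lambda x: 2.0 * x, lambda x: 3.0 * x]
simple = ensemble_predict(models, 1.0)
weighted = ensemble_predict(models, 1.0, weights=[0.5, 0.5, 0.0])
print(simple, weighted)
```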

3.1 The Bias/Variance Decomposition for Ensembles

Our approach is based on the observation that the generalization error of an ensemble model can be improved if the predictors on which the averaging is done disagree and if their residuals are uncorrelated [7]. We consider the case where we have a given data set $D = \{(x_1, y_1), \ldots, (x_N, y_N)\}$ and we want to find a function $f(x)$ that approximates $y$ also for unseen observations of $x$. These unseen observations are assumed to stem from the same but not explicitly known probability distribution $P(x)$. The expected generalization error $\mathrm{Err}(x)$ given a particular $x$ and a training set $D$ is

$$\mathrm{Err}(x) = E[(y - f(x))^2 \mid x, D] \tag{1}$$

where the expectation $E[\cdot]$ is taken with respect to the probability distribution $P$. The Bias/Variance Decomposition of $\mathrm{Err}(x)$ is

$$\mathrm{Err}(x) = \sigma^2 + (E_D[f(x)] - E[y|x])^2 + E_D[(f(x) - E_D[f(x)])^2] \tag{2}$$

$$= \sigma^2 + (\mathrm{Bias}(f(x)))^2 + \mathrm{Var}(f(x)) \tag{3}$$

where the expectation $E_D[\cdot]$ is taken with respect to all possible realizations of training sets $D$ with fixed sample size $N$, $E[y|x]$ is the deterministic part of the data, and $\sigma^2$ is the variance of $y$ given $x$. Balancing between the bias and the variance term is a crucial problem in model building. If we try to decrease the bias term on a specific training set, we usually increase the variance term and vice versa. We now consider the case of an ensemble average $\bar{f}(x)$ consisting of $K$ individual models

$$\bar{f}(x) = \sum_{i=1}^{K} \omega_i f_i(x), \qquad \omega_i \ge 0, \tag{4}$$

where the weights sum to one, $\sum_{i=1}^{K} \omega_i = 1$. If we put this into eqn. (2) we get

$$\mathrm{Err}(x) = \sigma^2 + \mathrm{Bias}(\bar{f}(x))^2 + \mathrm{Var}(\bar{f}(x)), \tag{5}$$

and we can have a look at the effects concerning bias and variance. The bias term in eqn. (5) is the weighted average of the biases of the individual models. So we should not expect a reduction in the bias term compared to single models.

The variance term of the ensemble can be decomposed in the following way:

$$\mathrm{Var}(\bar{f}) = E\left[(\bar{f} - E[\bar{f}])^2\right] = E\Big[\Big(\sum_{i=1}^{K} \omega_i f_i\Big)^2\Big] - \Big(E\Big[\sum_{i=1}^{K} \omega_i f_i\Big]\Big)^2$$

$$= \sum_{i=1}^{K} \omega_i^2 \left(E[f_i^2] - E^2[f_i]\right) + 2\sum_{i<j} \omega_i \omega_j \left(E[f_i f_j] - E[f_i]E[f_j]\right), \tag{6}$$

where the expectation is taken with respect to $D$ and $x$ is dropped for simplicity. The first sum in eqn. (6) contains the variances of the ensemble members and is a lower bound of the ensemble variance as long as the members are not negatively correlated. The second sum contains the cross terms of the ensemble members and disappears if the models are completely uncorrelated [7]. The reduction of the variance of the ensemble is related to the degree of independence of the single models. This is a key feature of the ensemble approach.
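The decomposition (6) can be checked numerically. In the sketch below, `outputs[i][d]` stands for the prediction of model i at a fixed x after training on the d-th training set; the numbers are made up for illustration, and expectations are replaced by averages over the listed training sets.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(u, v):
    """Population covariance E[uv] - E[u]E[v]; the variance if u is v."""
    mu, mv = mean(u), mean(v)
    return mean([(a - mu) * (b - mv) for a, b in zip(u, v)])


def variance_terms(outputs, weights):
    """Return the two sums of eqn. (6): member variances and cross terms."""
    k = len(outputs)
    own = sum(weights[i] ** 2 * covariance(outputs[i], outputs[i])
              for i in range(k))
    cross = 2.0 * sum(weights[i] * weights[j] * covariance(outputs[i], outputs[j])
                      for i in range(k) for j in range(i + 1, k))
    return own, cross


# Hypothetical outputs of three models over four training sets.
outputs = [[1.0, 2.0, 3.0, 4.0],
           [2.0, 1.0, 4.0, 3.0],
           [0.0, 1.0, 0.0, 1.0]]
weights = [0.5, 0.3, 0.2]

ensemble = [sum(w * row[d] for w, row in zip(weights, outputs))
            for d in range(4)]
direct = covariance(ensemble, ensemble)
own, cross = variance_terms(outputs, weights)
print(direct, own + cross)
```

The directly computed ensemble variance agrees with the sum of the two terms up to rounding, which makes the role of the cross terms explicit.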

There are several ways to increase model decorrelation. In the case of neural network ensembles, the networks can have different topology, different training algorithms or different training subsets [2, 1]. For the case of fixed topology, it is sufficient to use different initial conditions for the network training [4]. Another way of variance reduction is Bagging, where an ensemble of predictors is trained on several bootstrap replicates of the training set [8]. When constructing k-Nearest-Neighbor models, the number of neighbors and the metric coefficients could be used to generate diversity.

Krogh et al. derive the equation $E = \bar{E} - \bar{A}$, which relates the ensemble generalization error $E$ to the average generalization error $\bar{E}$ of the individual models and the variance $\bar{A}$ of the model outputs with respect to the average output. When keeping the average generalization error $\bar{E}$ of the individual models constant, the ensemble generalization error $E$ should decrease with increasing diversity $\bar{A}$ of the models. Hence we try to increase $\bar{A}$ by using two strategies:
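For a single input and squared error, the relation between ensemble error, average member error and ambiguity is an algebraic identity, which the following sketch verifies on made-up numbers (the function name and the data are illustrative assumptions):

```python
def ambiguity_decomposition(predictions, weights, target):
    """Return (ensemble error, average member error, ambiguity) at one point."""
    f_bar = sum(w * f for w, f in zip(weights, predictions))
    ensemble_error = (f_bar - target) ** 2
    average_error = sum(w * (f - target) ** 2
                        for w, f in zip(weights, predictions))
    ambiguity = sum(w * (f - f_bar) ** 2
                    for w, f in zip(weights, predictions))
    return ensemble_error, average_error, ambiguity


e_ens, e_avg, amb = ambiguity_decomposition(
    predictions=[1.0, 2.0, 4.0], weights=[1 / 3, 1 / 3, 1 / 3], target=2.0)
print(e_ens, e_avg - amb)
```

Because the ambiguity is a weighted sum of squares, it is never negative, so the ensemble error cannot exceed the average member error in this setting.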

1. Resampling: We train each model on a randomly drawn subset of 80% of all training samples. The number of models trained for one ensemble is chosen so that usually all samples of the training set are covered at least once by the different subsets.

2. Variation of model type: We employ two different model types, which are linear models trained by ridge regression and k-nearest-neighbor (k-NN) models with adaptive metric.
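The resampling strategy can be sketched as follows. The stopping rule used here, drawing subsets until every sample has been covered at least once, is an assumption made for illustration; the text only states that coverage is usually achieved.

```python
import random


def draw_training_subsets(n_samples, fraction=0.8, rng=None):
    """Draw random subsets of the sample indices until all are covered."""
    rng = rng or random.Random()
    size = max(1, int(round(fraction * n_samples)))
    subsets, covered = [], set()
    while len(covered) < n_samples:
        subset = sorted(rng.sample(range(n_samples), size))
        subsets.append(subset)        # one subset per model to be trained
        covered.update(subset)
    return subsets


subsets = draw_training_subsets(10, rng=random.Random(0))
print(len(subsets), subsets[0])
```

Each subset would then be used to train one ensemble member, and the samples left out of a subset remain available for testing that member.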

4 Out-of-Train Technique

The Out-of-Train (OOT) technique is a method for assessing the extra-sample error and can be regarded as a combination of traditional cross-validation (CV) and ensemble averaging. As for cross-validation, the data set is repeatedly divided into training and test partitions. For each partitioning, a model is constructed only on samples of the training partition. Test samples are not used for model selection, for deriving stopping criteria or the like. The OOT output for one sample of the data set is the average of the outputs of models for which this sample was not part of the training set (out-of-train), as depicted in Figure 1.

Fig. 1. Averaging scheme for OOT calculation for an example data set of ten samples. On this data set, 20 models were trained. Column j corresponds to model j. For each model, samples used for training are colored white, while samples not used for training are colored gray. For easier reading, only output values for test samples were printed on the respective row and column. To compute the OOT output for the i-th sample, which is depicted as a gray value in the rightmost column, the average over the output of all models for which this sample was not in the training fraction is calculated (averaging over all gray fields in a row). [Figure: matrix of model outputs; horizontal axis: index of model (1 to 20, followed by the OOT column), vertical axis: index of sample (1 to 10).]
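The OOT averaging rule can be written compactly once the model outputs and the training membership are available as matrices. In this sketch `outputs[i][j]` is the output of model j for sample i and `in_training[i][j]` tells whether sample i was used to train model j; both names and the small example are assumptions for illustration.

```python
def oot_outputs(outputs, in_training):
    """Average, for each sample, the outputs of models that did not train on it."""
    result = []
    for out_row, mask_row in zip(outputs, in_training):
        held_out = [o for o, used in zip(out_row, mask_row) if not used]
        # A sample that was in every training set has no OOT output.
        result.append(sum(held_out) / len(held_out) if held_out else None)
    return result


outputs = [[0.2, 0.4, 0.6],
           [1.0, 2.0, 3.0],
           [5.0, 5.0, 5.0]]
in_training = [[True, False, False],
               [False, True, True],
               [True, True, True]]
oot = oot_outputs(outputs, in_training)
print(oot)
```

This corresponds to averaging over the gray fields of one row in Figure 1.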

5 Model Training and Cross Validation

In order to select models for the final ensemble we use a cross validation scheme for model training. As the models are initialized with different parameters (number of hidden units, number of nearest neighbors, initial weights, etc.), cross validation helps us to find proper values for these model parameters.

The cross validation is done in several training rounds on different subsets of the entire training data. In every training round the data is divided into a training set and a test set. The trained models are compared by evaluating their prediction errors on the unseen data of the test set. The model with the smallest test error is selected and becomes a member of the ensemble. This is repeated several times, and the final ensemble is a simple average over its members. For example, a K-fold cross validation training leads to an ensemble with K members, where the weights in Eq. (4) become ω_i = 1/K.

6 The ENTOOL Toolbox for Statistical Learning

The ENTOOL toolbox for statistical learning is designed to make state-of-the-art machine learning algorithms available under a common interface. It allows construction of single models or ensembles of (heterogeneous) models. ENTOOL is Matlab-based with parts written in C++ and runs under Windows and Linux.

6.1 ENTOOL Software Architecture

Each model type is implemented as a separate class in our simulator, and all model classes share a common interface. Model types can be exchanged simply by exchanging the constructor call. The system allows for automatic generation of ensembles of models. Models are divided into two categories:

1. Primary models like linear models, neural networks, SVMs etc.
2. Secondary models that rely on primary models to calculate their output. All ensemble models are secondary models.

Each selected model goes through three phases: construction, training and evaluation. In the construction phase the topology of the model is specified. The model cannot be used yet; it first has to be trained on some training data set (x_i, y_i). After training, the model can be evaluated on new, unseen inputs (x_n).


6.2 Syntax

• Constructor syntax:

    model = perceptron;        % creates an MLP model with default topology
    model = perceptron(12);    % MLP model with 12 hidden layer neurons
    model = ridge;             % creates a linear model trained by ridge regression

• Training syntax:

    model = train(model,x,y,[],[],0.05);    % trains the model with an ε-insensitive loss of 0.05 on the data set (x_i, y_i)

• Evaluation syntax:

    y_new = calc(model, x_new)    % evaluates the model on new inputs

• How to build an ensemble of models:

    ens = crosstrainensemble;             % creates an empty ensemble object
    ens = train(ens,x,y,[],[],0.05);      % calls training routines for several primary models and joins them into the ensemble object

• Ensemble evaluation:

    y_new = calc(ens, x_new)    % evaluates the ensemble on new inputs

6.3 Adjusting Class Specific Training Parameters

The 5th argument when calling train specifies the training parameters. Apart from the topology, training parameters often have to be specified as well. For the perceptron class, for example, they are:

    tp = get(perceptron, 'trainparams');

    error_loss_margin: 0.0100
    decay: 0.0010
    rounds: 500
    mrate_init: 0.0100
    max_weight: 10
    mrate_grow: 1.2000
    mrate_shrink: 0.5000

6.4 Primary Models Types

The user has a choice of various primary model types:

ares           Adaptation of Friedman's MARS algorithm
ridge          Linear model trained by ridge regression
perceptron     Multilayer perceptron with iRPROP+ training
prbfn          Shimon Cohen's projection based radial basis function network
rbf            Mark Orr's radial basis function code
vicinal        k-nearest-neighbor regression with adaptive metric
mpmr           Thomas Strohmann's Minimax Probability Machine Regression
lssvm          Johan Suykens' least-squares SVM toolbox
tree           Adaptation of Matlab's built-in regression/classification trees
osusvm         SVM code based on Chih-Jen Lin's libSVM
vicinalclass   k-nearest-neighbor classification


6.5 Secondary Models Types

The user has a choice of various secondary model types which can be used for ensembling or feature selection:

ensemble. Virtual parent class for all ensemble classes.
crosstrainensemble. Ensemble class that trains models according to the crosstraining scheme. Creates ensembles of decorrelated models.
cvensemble. Ensemble class that trains models according to the crossvalidation/out-of-training scheme. Can be used to assess the OOT error.
extendingsetensemble. Boosting variant for regression.
subspaceensemble. Creates an ensemble of models where each single model is trained on a random subspace of the input data set.
optimalsvm. Wrapper that trains RBF osusvm/lssvm with optimal parameter settings (C and γ).
featureselector. Does feature selection and trains a model on the selected subset.

6.6 Experience in Ensembling

All the described methods have been implemented and are available for download from our web sites http://chopin.zet.agh.edu.pl/~wichtel and http://zti.if.uj.edu.pl/~merkwirth/entool.htm, which also contain a manual and an installation guide. The toolkit is under continuous development, and new features and algorithms are being added. In the near future we will also make an on-line statistical learning service available to users. The toolkit has been thoroughly tested on a variety of problems, including ECG modeling, CNN training, financial time series, El Niño real data, physical measurements and many others [9, 10].

6.7 ECG Modeling

We used an ECG time series (see Figure 2) from the CinC Challenge 2000 data sets: Data for development and evaluation of ECG-based apnea detectors (see the Physiobank Database at URL: pcbim2.dsi.unifi.it/physiobank/database/apnea-ecg/index.html for more information about the recording details). The data set a01.dat was cropped to approximately 50000 samples by omitting later measurements. From this time series we generated time-delay vectors of dimension 12 with delay 1 (see [11]). The modeling task consisted of learning the one-step-ahead prediction for each of the 50000 delay vectors. Since the final error MSE_E on this full data set F differs from run to run, we repeatedly partitioned F randomly into a training partition A_j of 40000 samples and a test partition A'_j of 10000 samples. A'_j is not used at all during the j-th run of the algorithm as described in section 2; instead, it is kept until the end of the training to assess the test error of ensemble E_j. As parameters for the algorithm we chose N = 100, ΔN = 10 and R = 6, which results in models m_i that are trained on a maximum of 150 samples out of A_j. The repeated partitioning and execution of the algorithm leads to a collection of ensembles E_j, j = 1, ..., 10. We then removed the worst performing ensembles, according to their error on the full data set F, from that collection and treated the reduced collection as an ensemble of ensembles EE. This method has two advantages. First, if one run diverges or produces a substantially inferior model, it is not used for the final collection of ensembles. Second, due to the stochastic nature of the initial subset and of the training method used, the outputs of the ensembles generated by the different runs are to some extent decorrelated. This results not only in a substantial reduction of the error MSE_EE of the collection of ensembles on F, but also in a reduction of the generalization error on unseen data (see [1, 2]).

[Figure 2: plot of the ECG time series. Horizontal axis: sample index (0 to 1000); vertical axis: amplitude (−100 to 300).]

Fig. 2. First 1000 samples of the ECG time series used for the numerical experiments. Time-delay reconstruction with dimension 12 and delay 1 was used to generate the data sets F and G. Both data sets contain 50000 samples each. G is taken from the later part of the ECG time series. The two data sets have no samples in common. Since the MSE of models trained solely on F is lower on G than on F itself, the dynamics of the ECG seems to be stationary.

We used four different underlying model types, which range from strictly local to global models:

1. The model type that was mostly used within the numerical experiments is a variant of the k-nearest-neighbor regressor with adaptive metric. In our case a GA-like algorithm adapts the metric coefficients of one of the L1, L2 or L∞ metrics as well as the number k of neighbors. The fitness of an individual is the negative leave-one-out error on the training data, which can be easily calculated using a fast nearest-neighbor algorithm ([24]). The metric coefficients are adapted by mutation and crossover within a predefined number of generations.
2. As a semi-local model type we decided to use the hybrid PRBFN network based on radial basis function and sigmoidal nodes (see [12]). We found this model type to perform well on several artificial and real-world test problems.
3. As a global model type we chose a fully connected neural network with one hidden layer and eight hidden layer neurons. The network is trained using a second-order Levenberg-Marquardt algorithm. The implementation was taken from the NNSYSID2.0 toolbox written by Magnus Nørgaard (see [25]).
4. Another global model type we used was a linear model that was trained using a cross-validation scheme.
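The time-delay reconstruction used to build the data sets can be sketched as follows. This is only an illustration under simple assumptions: the helper name `delay_vectors` is hypothetical, the series is a plain Python list, and each delay vector is paired with the value one step after its last element as the prediction target.

```python
def delay_vectors(series, dim=12, delay=1):
    """Build delay vectors and one-step-ahead targets from a scalar series.

    Each input is [s(t - (dim-1)*delay), ..., s(t - delay), s(t)],
    and the matching target is s(t + 1).
    """
    span = (dim - 1) * delay          # how far back the oldest coordinate reaches
    inputs, targets = [], []
    for t in range(span, len(series) - 1):
        inputs.append([series[t - k * delay] for k in range(dim - 1, -1, -1)])
        targets.append(series[t + 1])
    return inputs, targets


# Toy series 0, 1, ..., 19 (stand-in for the ECG samples).
series = list(range(20))
X, y = delay_vectors(series, dim=12, delay=1)
```

With dimension 12 and delay 1, a series of n samples yields n − 12 input/target pairs, so the cropped ECG record provides roughly as many delay vectors as it has samples.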


[Figure: three bar charts of the mean square error (vertical axis, 0 to 0.45) by model type (horizontal axis). Panel "Results of model types on training set F": k-NN, PRBFN, NeuralNet, Linear, k-NN full; legend: Sequences, Ensembles of Sequences, Ensemble of Standard Models. Panel "Results of model types on test set G": same model types and legend. Panel "Results on training set F with adding of randomly chosen samples": k-NN, PRBFN, NeuralNet, Linear; legend: Sequences, Ensembles of Sequences.]

Several interesting points could be observed:

6.8 Classification of Anti-Viral Chemicals

6.8.1 NCI AIDS Antiviral Screen Data Set

To apply the ENTOOL toolbox to a problem encountered in cheminformatics, we employed a special neural network type called Molecular Graph Networks (MGN) that can be directly applied for QSAR applications ([13]). We considered a data set of more than 42000 compounds from the DTP AIDS antiviral screen data set of the NCI Open Database. The antiviral screen utilized a soluble formazan assay to measure the ability of compounds to protect human CEM cells [14] from HIV-1-induced cell death. In the primary screening set of results, the activities of the compounds tested in the assay were described to fall into three classes: confirmed active (CA) for compounds that provided 100 % protection, confirmed moderately active (CM) for compounds that provided more than 50 % protection, and confirmed inactive (CI) for the remaining compounds or compounds that were toxic to the CEM cells and therefore seemed to not provide any protection. The data set was obtained from http://cactus.nci.nih.gov/ncidb2/download.html. The data set originally consisted of 42689 2D structures with AIDS test data as of October 1999 and was provided in SDF format. Seven compounds could not be parsed and had to be removed. Of the total of 42682 usable compounds, 41179 compounds were confirmed inactive, 1080 compounds were confirmed moderately active and 423 compounds were confirmed active.

[Figure 3: ROC plot. Horizontal axis: fraction of false positives (0 to 1); vertical axis: fraction of true positives (0 to 1). Curves: OOT Train Classes CM and CA; Test Classes CM and CA; OOT Train Class CA only; Test Class CA only.]

Fig. 3. ROC curves for the classifiers constructed on the NCI AIDS Antiviral Screen Data Set with ε-insensitive absolute loss. The figure displays two pairs of ROC curves. In this computational experiment we trained an ensemble of molecular graph networks on a data set consisting of three classes of molecules (CI, CM and CA). To be able to generate ROC curves, we had to reduce the number of classes to two by pooling the molecules of two classes into a single class. The lower pair of ROC curves was obtained by using the ensemble of classifiers to discriminate between CI as one class and CA and CM as the second class, while the upper pair details the ROC curves when using the same ensemble of classifiers to discriminate between CI and CM as one class and the confirmed actives CA as the second. The AUCs of the respective pairs of curves are 0.82 and 0.81 for classification of CI versus CA and CM, and 0.94 and 0.94 for classification of CI and CM versus CA.

To solve this multiclass classification problem, we used the one-versus-all approach based on the logistic loss function. We constructed three ensembles of classifiers. Each of these three ensembles is trained to solve the binary classification problem of discriminating one of the three classes against the rest and consists of six MGNs. Each MGN consisted of 18 individual feature nets with iteration depths ranging from 3 to 10 and a supervisor network with 24 hidden layer neurons. The MGNs were trained by stochastic gradient descent with a fixed number of 10^6 gradient calculations. The global step size μ was decreased every 70000 gradient updates by a factor of 0.8. We randomly partitioned the data set into a training set of 35000 compounds and a test set of 7682 compounds. Each MGN was trained on a random two-thirds of the 35000 training samples. Thus the OOT output for every sample of the training set was computed by averaging over 2 models on average, while the output for the held-out test set was computed by averaging over all 6 models of each ensemble.

Results for the classification experiments on the NCI data set with classification loss function are given in Figure 3. This figure displays two pairs of ROC curves. The lower pair of ROC curves in Figure 3 was obtained by using the ensemble of classifiers to discriminate between CI on the one hand and CA and CM on the other hand, while the upper pair details the ROC curves when using the same ensemble of classifiers to discriminate between CI and CM on the one hand and the confirmed actives CA on the other. The remarkable coincidence of the curves obtained by validation on the training part and from the held-out test part of more than 7000 compounds indicates that the validation was performed properly and does not exhibit overfitting. This result is supported by the AUCs of the respective pairs of curves, which are 0.82 (OOT) and 0.81 (test) for the classification of CI versus CA and CM, and both 0.94 for the classification of CI and CM versus CA.
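How pooling two classes turns the three-class problem into a binary one, and how an AUC follows from the resulting ROC curve, can be sketched as below. The function names, the single ranking score per compound and the six toy compounds are assumptions made for illustration; they do not reproduce the MGN outputs.

```python
def pooled_labels(classes, positive):
    """Map class names to 1 (in the pooled positive set) or 0 (otherwise)."""
    return [1 if c in positive else 0 for c in classes]


def roc_curve(scores, labels):
    """Return (false positive rate, true positive rate) points for a ranking."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with identical scores are moved past the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical classes and scores for six compounds.
classes = ["CA", "CI", "CM", "CI", "CA", "CI"]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.1]
auc_ca_cm = auc(roc_curve(scores, pooled_labels(classes, {"CA", "CM"})))
auc_ca_only = auc(roc_curve(scores, pooled_labels(classes, {"CA"})))
```

Running the same routine once on the OOT outputs of the training set and once on the outputs for the held-out test set gives the pairs of curves that are compared in the text.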

Results for the classification with logistic loss function are depicted in Figure 4. The obtained AUC values are similar to the best results of several variants of a classification method based on finding frequent subgraphs [15] (experiments H2 and H3 when omitting class CM from the test set for the ensemble constructed to discriminate CA versus the two other classes). Wilton et al. [16] compare several ranking methods for virtual screening on an older version of the NCI data set. The best performing method there, binary kernel discrimination, is able to locate 12 % of all actives (CM and CA pooled) in the first 1 % and 35 % of all actives in the first 5 % of the ranked NCI data set. MGNs trained with logistic loss are able to find 36 % and 74 % of all actives in the first 1 % and 5 %, respectively, of the NCI data set ranked according to the output of the ensemble of classifiers. When interpreting the output of the three classifiers as pseudo-probabilities and assigning the class label of the classifier with the highest output value to each sample, we are able to compute confusion matrices for the OOT validation on the training set and for the held-out test set, given in Tables 1 and 2. While classes CI and CA can be correctly classified in a majority of the cases, samples of class CM are recognized correctly in less than 40 % of all cases.

[Figure 4: ROC plot. Horizontal axis: fraction of false positives (0 to 1); vertical axis: fraction of true positives (0 to 1). Curves: Test Class CI; OOT Train Class CI; Test Class CM; OOT Train Class CM; Test Class CA; OOT Train Class CA.]

Fig. 4. ROC curves for the three classifiers constructed on the NCI AIDS Antiviral Screen Data Set using logistic loss and the one-versus-all approach. The figure displays three pairs of ROC curves. In this computational experiment three ensembles of molecular graph networks were trained on a data set consisting of three classes of molecules (CI, CM and CA). The green/black pair of ROC curves corresponds to the ensemble classifier discriminating class CM from the two other classes, the red/magenta pair to class CI against the others. The blue/cyan pair details the ROC curves resulting from the ensemble classifier trained to discriminate class CA against the two remaining classes CI and CM. AUCs are 0.80 (test) and 0.81 (OOT) for class CI, 0.75 (test) and 0.75 (OOT) for class CM, and 0.94 (test) and 0.91 (OOT) for class CA.

Table 1. Confusion matrix for the OOT validation on the training set obtained by the system of three classifiers on the NCI AIDS Antiviral Screen data set using logistic loss and the one-versus-all approach. The values displayed indicate the fraction of the samples of each class that are classified into the respective classes. E.g., 83.5 % of the samples of class CI are classified correctly, 12.6 % of the CI samples are classified wrongly into class CM and the remaining 3.8 % are wrongly classified into class CA. While samples of classes CI and CA are mostly classified correctly, samples of class CM (confirmed moderately active) are recognized correctly in only 38 % of the cases.

                 Predicted Class
Actual Class     CI       CM       CA
CI               0.835    0.126    0.038
CM               0.408    0.380    0.212
CA               0.124    0.187    0.690


Table 2. Confusion matrix for the held-out test set. 85 % of the samples of class CI are classified correctly, 12 % of the samples of class CI are classified wrongly as belonging to class CM and the remaining 3 % are wrongly classified as class CA.

                 Predicted Class
Actual Class     CI       CM       CA
CI               0.852    0.121    0.027
CM               0.444    0.369    0.187
CA               0.093    0.160    0.747
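The way such a row-normalized confusion matrix arises from three one-versus-all outputs can be sketched as follows. The names `predict_class` and `confusion_matrix`, the dictionary layout and the eight toy samples are assumptions for illustration; the numbers are not taken from the experiments.

```python
CLASSES = ["CI", "CM", "CA"]


def predict_class(outputs):
    """Pick the class whose one-versus-all classifier gives the highest output."""
    return max(CLASSES, key=lambda c: outputs[c])


def confusion_matrix(actual, predicted):
    """Return fractions[actual_class][predicted_class]; each row sums to 1."""
    counts = {a: {p: 0 for p in CLASSES} for a in CLASSES}
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    fractions = {}
    for a in CLASSES:
        total = sum(counts[a].values())
        fractions[a] = {p: (counts[a][p] / total if total else 0.0) for p in CLASSES}
    return fractions


# Hypothetical samples: (actual class, outputs of the three classifiers).
samples = [
    ("CI", {"CI": 0.9, "CM": 0.1, "CA": 0.0}),
    ("CI", {"CI": 0.7, "CM": 0.2, "CA": 0.1}),
    ("CI", {"CI": 0.6, "CM": 0.3, "CA": 0.2}),
    ("CI", {"CI": 0.3, "CM": 0.5, "CA": 0.1}),
    ("CM", {"CI": 0.6, "CM": 0.4, "CA": 0.2}),
    ("CM", {"CI": 0.2, "CM": 0.5, "CA": 0.3}),
    ("CA", {"CI": 0.1, "CM": 0.2, "CA": 0.8}),
    ("CA", {"CI": 0.1, "CM": 0.6, "CA": 0.5}),
]
actual = [a for a, _ in samples]
predicted = [predict_class(o) for _, o in samples]
matrix = confusion_matrix(actual, predicted)
```

Normalizing each row by the number of samples of the actual class makes the strongly unbalanced classes (far more CI than CA compounds) directly comparable.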

6.9 Classifiers for the Chromosome-Damaging Potential of Chemicals

We used ensemble models for classification in order to predict the chromosome-damaging potential of chemicals as assessed in the in vitro chromosome aberration (CA) test. This CA test is an in vitro test that is used in the early stages of the drug discovery process in order to exclude toxic compounds. A detailed description of our approach in comparison to the performance of existing methods can be found in Rothfuss et al. [17].

The CA-test data used in this study were obtained from two recently published data collections [18, 19]. Further details on the original data sources can be obtained from the references of both data compilations. The genotoxicity data collection from Snyder et al. [19] contains in vitro cytogenetics data for 248 marketed pharmaceuticals. Structural information could be retrieved for 229 of the 248 compounds. Altogether, 189 negative and 40 positive data records from this data source could be used for model-building purposes. The database collected by Kirkland et al. [18] contains CA-test data for 488 structurally diverse compounds, consisting of industrial, environmental, and pharmaceutical compounds. Structural information was retrieved for 450 of these compounds. Altogether, 168 negative and 282 positive data records from this data source could be used for model-building purposes.

The descriptors that serve as input variables for the classification were calculated with the dragonX software [20] that was originally developed by Milano Chemometrics and the QSAR Research Group. The software generates a total of 1664 molecular descriptors that are grouped into 20 different blocks, such as constitutional descriptors, topological descriptors, and walk and path counts [20]. For each compound in the data set, all 1664 descriptors were calculated. Because many of these descriptors are redundant or carry correlated information, feature selection needs to be performed in order to select the most useful subset of descriptors to build a classification model. Our feature selection approach follows the method of variable importance as proposed by Breiman [21]. The underlying idea is to select descriptors on the basis of the decrease of classification accuracy after the permutation of the descriptors [21]. Briefly, an ensemble of decision trees is built, which uses all descriptors as input variables and the associated activity (CA-test result) as output variable, using 90% of the data (training set). The prediction accuracy of the classification model on an out-of-training portion of the data (test set) is recorded. In a second step, the same is done after the successive permutation of each descriptor. The relative decrease of classification accuracy is the variable importance, following the idea that the most discriminative descriptors are the most important ones. The descriptor set was reduced iteratively, resulting in a final set of 14 descriptors, including topological charge indices, electronegativity and shape descriptors. Several of the identified descriptors can be directly related to genotoxicity and specify characteristics of structures involved in DNA modifications (see Rothfuss et al. [17] and the references therein). Our final classifier was trained with several different model classes to achieve a diverse ensemble:

• Classification and regression trees (CART)
• Support vector machines (SVM) with Gaussian kernels
• Linear and quadratic discriminant analysis
• Linear ridge models
• Feedforward neural networks with two hidden layers trained by gradient descent
• K-nearest-neighbor models with adaptive metrics
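The permutation-based variable importance used for the descriptor selection above can be sketched in a few lines. This is a simplified illustration, not the procedure of [17]: a fixed threshold rule stands in for the ensemble of decision trees, and the function names, the two random descriptors and the sample size are hypothetical.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows for which the classifier output equals the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, rng):
    """Relative drop in accuracy after shuffling one descriptor at a time."""
    base = accuracy(predict, X, y)   # assumed to be non-zero
    importances = []
    for col in range(len(X[0])):
        shuffled = [row[col] for row in X]
        rng.shuffle(shuffled)        # break the link between descriptor and label
        X_perm = [row[:col] + [value] + row[col + 1:]
                  for row, value in zip(X, shuffled)]
        importances.append((base - accuracy(predict, X_perm, y)) / base)
    return importances


def predict(row):
    """Stand-in classifier that only looks at descriptor 0."""
    return int(row[0] > 0.5)


# Hypothetical out-of-training data: descriptor 0 is informative, 1 is noise.
rng = random.Random(0)
X_test = [[rng.random(), rng.random()] for _ in range(200)]
y_test = [int(row[0] > 0.5) for row in X_test]
importances = permutation_importance(predict, X_test, y_test, rng)
```

Descriptors whose permutation leaves the accuracy unchanged are candidates for removal, which is the basis for reducing the descriptor set iteratively.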

In order to estimate the performance of the final ensemble model, we performed a 20-fold cross-validation, wherein 10% of the data was randomly kept out as test set and the remaining 90% of the data was used for model training. The results with respect to training and test set are reported in Table 3.

Table 3. The performance of the ensemble classification model; mean values calculated over 20 cross-validation folds

               Accuracy   Sensitivity   Specificity
Training set   75.9%      75.1%         76.8%
Test set       71.6%      70.8%         71.4%
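For reference, the three figures of merit in Table 3 can be computed per fold and then averaged, as in the following sketch. It assumes labels coded as 1 (positive CA-test result) and 0 (negative); the function names and the two toy folds are hypothetical.

```python
def classification_metrics(actual, predicted):
    """Accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # fraction of positives that are found
        "specificity": tn / (tn + fp),   # fraction of negatives that are found
    }


def mean_over_folds(fold_metrics):
    """Average each metric over the cross-validation folds."""
    return {key: sum(m[key] for m in fold_metrics) / len(fold_metrics)
            for key in fold_metrics[0]}


# Two hypothetical folds.
fold_1 = classification_metrics([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 0, 1])
fold_2 = classification_metrics([1, 1, 0, 0], [1, 0, 0, 0])
summary = mean_over_folds([fold_1, fold_2])
```

Because the values are averaged fold by fold, the mean accuracy need not lie between the mean sensitivity and the mean specificity.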

As pointed out by several research groups, state-of-the-art machine learning approaches in the field of toxicology prediction can compete with most of the commercial software tools [22, 23, 17], and they have the further advantage that they can be trained with the additional in-house data collections of institutes or companies.

7 Why Do Ensembling? Pros and Cons

Building ensembles of models gives several advantages:

• Straightforward extension of existing modeling algorithms
• Almost fool-proof minimization of generalization error
• Makes no assumptions on the structure of the underlying models
• Alleviates the problem of model selection

These advantages come at the expense of increased computational effort. The described strategies have been tested extensively using measured data series from experiments in electronic circuits, real ECG time series and financial data.

Acknowledgments

This work has been prepared in part within the scope of the Research Training Network COSYC of SENS No. HPRN-CT-2000-00158 of the 5th EU Framework.

References

[1] Krogh, A., Vedelsby, J.: Neural network ensembles, cross validation, and active learning. In: Tesauro, G., Touretzky, D., Leen, T. (eds.) Advances in Neural Information Processing Systems, vol. 7, pp. 231–238. MIT Press, Cambridge (1995), citeseer.ist.psu.edu/krogh95neural.html

[2] Perrone, M.P., Cooper, L.N.: When Networks Disagree: Ensemble Methods for Hybrid Neural Networks. In: Mammone, R.J. (ed.) Neural Networks for Speech and Image Processing, pp. 126–142. Chapman and Hall, Boca Raton (1993)

[3] Hansen, L., Salamon, P.: Neural Network Ensembles. IEEE Trans. on Pattern Analysis and Machine Intelligence 12(10), 993–1001 (1990)

[4] Naftaly, U., Intrator, N., Horn, D.: Optimal ensemble averaging of neural networks. Network, Comp. Neural Sys. 8, 283–296 (1997)

[5] Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Wadsworth International Group, Belmont (1984)

[6] Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning. Springer Series in Statistics. Springer, Heidelberg (2001)

[7] Krogh, A., Sollich, P.: Statistical mechanics of ensemble learning. Physical Review E 55(1), 811–825 (1997)

[8] Breiman, L.: Bagging predictors. Machine Learning 24(2), 123–140 (1996), citeseer.ist.psu.edu/breiman96bagging.html

[9] Merkwirth, C., Ogorzałek, M., Wichard, J.: Stochastic gradient descent training of ensembles of DT-CNN classifiers for digit recognition. In: Proceedings of the European Conference on Circuit Theory and Design ECCTD 2003, Krakow, Poland, vol. 2, pp. 337–341 (September 2003)

[10] Wichard, J., Ogorzałek, M.: Iterated time series prediction with ensemble models. In: Proceedings of the 23rd International Conference on Modelling Identification and Control (2004)

[11] Suykens, J., Vandewalle, J. (eds.): Nonlinear Modeling - Advanced Black-Box Techniques. Kluwer Academic Publishers, Dordrecht (1998)

[12] Cohen, S., Intrator, N.: A hybrid projection based and radial basis function architecture. In: Kittler, J., Roli, F. (eds.) MCS 2000. LNCS, vol. 1857, pp. 147–155. Springer, Heidelberg (2000)

[13] Merkwirth, C., Lengauer, T.: Automatic generation of complementary descriptors with molecular graph networks (2004)

[14] Weislow, O., Kiser, R., Fine, D., Bader, J., Shoemaker, R., Boyd, M.: New soluble formazan assay for HIV-1 cytopathic effects: application to high flux screening of synthetic and natural products for AIDS antiviral activity. J. Nat. Cancer Inst. 81, 577–586 (1989)

[15] Deshpande, M., Kuramochi, M., Karypis, G.: Frequent sub-structure-based approaches for classifying chemical compounds. In: Proceedings of the Third IEEE International Conference on Data Mining ICDM 2003, Melbourne, Florida, pp. 35–42 (November 2003)

[16] Wilton, D., Willett, P., Lawson, K., Mullier, G.: Comparison of ranking methods for virtual screening in lead-discovery programs. J. Chem. Inf. Comput. Sci. 43, 469–474 (2003)

[17] Rothfuss, A., Steger-Hartmann, T., Heinrich, N., Wichard, J.: Computational prediction of the chromosome-damaging potential of chemicals. Chemical Research in Toxicology 19(10), 1313–1319 (2006)

[18] Kirkland, D., Aardema, M., Henderson, L., Muller, L.: Evaluation of the ability of a battery of three in vitro genotoxicity tests to discriminate rodent carcinogens and non-carcinogens. Mutat. Res. 584, 1–256 (2005)

[19] Snyder, R.D., Pearl, G.S., Mandakas, G., Choy, W.N., Goodsaid, F., Rosenblum, I.Y.: Assessment of the sensitivity of the computational programs DEREK, TOPKAT and MCASE in the prediction of the genotoxicity of pharmaceutical molecules. Environ. Mol. Mutagen. 43, 143–158 (2004)

[20] Todeschini, R.: Dragon Software, http://www.talete.mi.it/dragon_exp.htm

[21] Breiman, L.: Arcing classifiers. The Annals of Statistics 26(3), 801–849 (1998), http://citeseer.nj.nec.com/breiman98arcing.html

[22] Serra, J.R., Thompson, E.D., Jurs, P.C.: Development of binary classification of structural chromosome aberrations for a diverse set of organic compounds from molecular structure. Chem. Res. Toxicol. 16, 153–163 (2003)

[23] Li, H., Ung, C., Yap, C., Xue, Y., Li, Z., Cao, Z., Chen, Y.: Prediction of genotoxicity of chemical compounds by statistical learning methods. Chem. Res. Toxicol. 18, 1071–1080 (2005)

[24] McNames, J.: Innovations in Local Modeling for Time Series Prediction, Ph.D. Thesis, Stanford University (1999)

[25] Norgaard, M.: Neural Network Based System Identification Toolbox, Tech. Report 00-E-891, Department of Automation, Technical University of Denmark (2000), http://www.iau.dtu.dk/research/control/nnsysid.html


Automatic Fingerprint Identification Based on Minutiae Points

Maciej Hrebień and Józef Korbicz

Institute of Control and Computation Engineering University of Zielona Góra ul. Podgórna 50 65-246 Zielona Góra Poland {m.hrebien,j.korbicz}@issi.uz.zgora.pl

Introduction

In recent years security systems have played an important role in our community. Payment operations without cash, restricted access to specific areas and secrecy of information stored in databases are only a small part of our daily living that requires special treatment. Besides traditional locks, keys or ID cards, there is an increased interest in biometric technologies, that is, human identification based on one’s individual features [2, 19].

Fingerprint identification is one of the most important biometric technologies considered nowadays. The uniqueness of a fingerprint is exclusively determined by local ridge characteristics called minutiae points. Automatic fingerprint matching depends on the comparison of these minutiae and the relationships between them [13, 14, 18].

In this paper several methods of fingerprint matching are discussed, namely, the Hough transform, the structural global star method and the speeded up correlation approach (Sect. 4). Because there is still a need for finding the best matching approach, research on on-line fingerprints was conducted to compare quality differences and time relations between the algorithms considered; the experimental results are grouped in Section 5. One can also find here a detailed description of image enhancement (Sect. 2) and the minutiae detection scheme (Sect. 3) used in our research.

1 Fingerprint Representation

A fingerprint is a structure of ridges and valleys that is unique to each human being. The uniqueness is exclusively determined by local ridge characteristics called minutiae points and the relationships between them [6].

The two most common minutiae points considered in today’s research are known as ending and bifurcation. An ending point is the place where a ridge ends its flow, and a bifurcation is the place where a ridge forks into two parts (Fig. 1).


Fig. 1. Example of ending and bifurcation points

2 Image Enhancement

A very common technique for reducing the quantity of information received from a fingerprint scanner in the form of a grayscale image is known as Gabor filtering [13]. The filter, which is based on local ridge orientation and frequency estimations, produces a nearly binary output: the intensity histogram has a U-shaped form [8].

The Gabor filter is defined by

$$h(x, y, f, \theta) = \exp\left(-\frac{1}{2}\left[\frac{x_\theta^2}{\delta_x^2} + \frac{y_\theta^2}{\delta_y^2}\right]\right)\cos(2\pi f x_\theta), \tag{1}$$

where

$$x_\theta = x\sin\theta + y\cos\theta \quad\text{and}\quad y_\theta = x\cos\theta - y\sin\theta,$$

θ is the local ridge orientation, i.e., the angle that fingerprint ridges form with the horizontal axis when crossing through an arbitrary small block, f is an estimation of ridge frequency in that block, δx and δy are space constants defining the stretch of the filter (Fig. 2).
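To make the filter definition concrete, the following sketch samples a Gabor kernel on a small square grid. It assumes the usual form of the filter (a Gaussian envelope with space constants δx and δy multiplied by a cosine of frequency f along the rotated coordinate x_θ = x sin θ + y cos θ, with y_θ = x cos θ − y sin θ); the function name, the 11 × 11 mask size and the pure-Python list layout are choices made for this illustration only.

```python
import math


def gabor_kernel(theta, f, delta_x=4.0, delta_y=4.0, size=11):
    """Sample the Gabor filter on a size x size grid centered at (0, 0)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotated coordinates (assumed convention, see the lead-in).
            x_t = x * math.sin(theta) + y * math.cos(theta)
            y_t = x * math.cos(theta) - y * math.sin(theta)
            envelope = math.exp(-0.5 * (x_t ** 2 / delta_x ** 2
                                        + y_t ** 2 / delta_y ** 2))
            row.append(envelope * math.cos(2.0 * math.pi * f * x_t))
        kernel.append(row)
    return kernel


# Kernel for ridges at 90 degrees to the horizontal axis, f = 1/10.
kernel = gabor_kernel(theta=math.pi / 2.0, f=0.1)
center = kernel[5][5]
```

Convolving an image block with the kernel that matches its local orientation and frequency reinforces the ridge pattern and suppresses noise in other directions.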

Fig. 2. Graphical representation of the Gabor filter (f = 1/10, δx = δy = 4.0)

Because the ridge flow does not change significantly in its local neighborhood, the orientation angle is calculated for blocks of the fingerprint image rather than for each point separately, as is done in the more direct mask method presented in [15]. The ridge orientation θ for a specified block centered at the position (i, j) can then be estimated with the equations

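$$\theta(i, j) = \frac{1}{2}\tan^{-1}\left(\frac{V_y(i, j)}{V_x(i, j)}\right), \tag{2}$$

$$V_x(i, j) = \sum_{u=i-\frac{w}{2}}^{i+\frac{w}{2}}\ \sum_{v=j-\frac{w}{2}}^{j+\frac{w}{2}} 2\,\partial_x(u, v)\,\partial_y(u, v), \tag{3}$$

$$V_y(i, j) = \sum_{u=i-\frac{w}{2}}^{i+\frac{w}{2}}\ \sum_{v=j-\frac{w}{2}}^{j+\frac{w}{2}} \left(\partial_x^2(u, v) - \partial_y^2(u, v)\right), \tag{4}$$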

where ∂x(u, v) and ∂y(u, v) are pixel gradients at the position (u, v) (e.g., estimated with Sobel’s mask [16]) and w is the block size (w = 16 for 500dpi fingerprint images, or w = 15 to avoid ambiguous selection of the central point [5]). Additionally, taking into account the fact that fingerprint ridges are not directed, an orientation equal to 225° can be considered as equal to 45°, so the orientation θ is usually described on a half-open interval, for example, θ ∈ [0…π).
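A block-wise orientation estimate from gradient sums can be sketched as follows. The sketch accumulates the same two sums, Σ 2∂x∂y and Σ (∂x² − ∂y²), but resolves the angle with `math.atan2` in the doubled-angle form, rotates the dominant gradient direction by π/2 to obtain the ridge direction and folds the result into [0, π). These choices, the function names and the synthetic stripe images are assumptions made for illustration and may differ from the authors' convention for Eq. (2).

```python
import math


def sobel_gradients(img):
    """Horizontal (gx) and vertical (gy) Sobel gradients; borders stay zero."""
    h, w = len(img), len(img[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx[r][c] = ((img[r - 1][c + 1] + 2 * img[r][c + 1] + img[r + 1][c + 1])
                        - (img[r - 1][c - 1] + 2 * img[r][c - 1] + img[r + 1][c - 1]))
            gy[r][c] = ((img[r + 1][c - 1] + 2 * img[r + 1][c] + img[r + 1][c + 1])
                        - (img[r - 1][c - 1] + 2 * img[r - 1][c] + img[r - 1][c + 1]))
    return gx, gy


def block_orientation(gx, gy, i, j, w):
    """Ridge orientation in [0, pi) for the w-sized block centered at row i, column j."""
    rows = range(max(i - w // 2, 0), min(i + w // 2, len(gx) - 1) + 1)
    cols = range(max(j - w // 2, 0), min(j + w // 2, len(gx[0]) - 1) + 1)
    v_x = sum(2.0 * gx[r][c] * gy[r][c] for r in rows for c in cols)
    v_y = sum(gx[r][c] ** 2 - gy[r][c] ** 2 for r in rows for c in cols)
    gradient_angle = 0.5 * math.atan2(v_x, v_y)      # dominant gradient direction
    return (gradient_angle + math.pi / 2.0) % math.pi  # ridges run perpendicular to it


def stripes(vertical, size=16, period=4):
    """Synthetic test image with straight vertical or horizontal ridges."""
    return [[255.0 if ((c if vertical else r) // (period // 2)) % 2 == 0 else 0.0
             for c in range(size)] for r in range(size)]


theta_vertical = block_orientation(*sobel_gradients(stripes(True)), 8, 8, 12)
theta_horizontal = block_orientation(*sobel_gradients(stripes(False)), 8, 8, 12)
```

Averaging the doubled angle rather than the angle itself keeps opposite gradient directions from cancelling, which is why the products and squared differences are summed instead of the gradients themselves.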

The local ridge frequency f can be estimated by counting an average number of pixels between two consecutive peaks of gray-levels along the direction normal to the local ridge orientation. The idea is based on a w × l (where w < l) oriented window placed at the center of each block and rotated with the θ angle. The frequency of each block is given by

$$f(i, j) = \frac{1}{T(i, j)}, \tag{5}$$

where T(i, j) is an average number of pixels between two consecutive peaks in the so-called x-signature obtained from

$$X_k = \frac{1}{w}\sum_{d=0}^{w-1} W(d, k), \qquad k = 0, 1, \ldots, l-1. \tag{6}$$
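The x-signature of Eq. (6) and the frequency of Eq. (5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the oriented window is assumed to be already extracted and rotated, and is passed as a list of w rows with l gray-level values each; the function names and the simple peak test are hypothetical and not part of the cited method.

```python
def x_signature(window):
    """Eq. (6): average the w rows of an oriented w x l window column by column."""
    w = len(window)
    l = len(window[0])
    return [sum(window[d][k] for d in range(w)) / w for k in range(l)]

def ridge_frequency(signature):
    """Eq. (5): f = 1 / T, with T the mean spacing of consecutive peaks."""
    # Assumed peak test: higher than the left neighbour, not lower than the right one
    peaks = [k for k in range(1, len(signature) - 1)
             if signature[k - 1] < signature[k] >= signature[k + 1]]
    if len(peaks) < 2:
        return None  # too few peaks for an estimate in this block
    t = (peaks[-1] - peaks[0]) / (len(peaks) - 1)
    return 1.0 / t
```

Returning `None` for blocks with fewer than two peaks is a design choice of this sketch; a real system would need some fallback, such as the constant frequency discussed next.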

The local ridge frequency f can instead be given a constant value, the same for all blocks, if the filtering time must be minimized. Proper selection of this value is crucial for the final result: a frequency that is too large creates spurious ridges, whereas a frequency that is too small merges nearby ridges into one. For 500 dpi fingerprint images, the inter-ridge distance is approximately 10 pixels, so f can be set to 1/10 [10].

The space constants δx and δy define the stretch of the Gabor filter along the OX and OY axes. Selecting their values is a trade-off. If δx and δy are too small, the filter is not effective in removing noise. If they are too large, the filter removes noise more robustly but introduces a smoothing effect, and ridge details are lost. To maximize enhancement effectiveness, δx and δy should be approximately equal to half the inter-ridge distance. For 500 dpi fingerprint images, δx and δy are usually equal to 4.0 or, sometimes, δy = 3.0 if there is a concern about the creation of spurious ridges [5].

To speed up Gabor filtering, one can exploit the fact that the filter is symmetrical about both the OX and the OY axis, so the number of calculations can be significantly reduced. Additionally, a segmentation mask (constructed, for example, using the variance approach [12, 17]) can be used so that the filter is computed only in those parts of the image that were marked as blocks containing the object’s pixels (ridges) [7].

An example result of fingerprint image enhancement with the final binary output is illustrated in Fig. 3. The example input image was first normalized to reduce variance in gray-levels of each pixel [1] (without changing the clarity between ridges and valleys) with the equations

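$$M(X) = \frac{1}{N^2}\sum_{i=0}^{N-1}\sum_{j=0}^{N-1} X(i, j), \tag{7}$$

$$VAR(X) = \frac{1}{N^2}\sum_{i=0}^{N-1}\sum_{j=0}^{N-1} \left(X(i, j) - M(X)\right)^2, \tag{8}$$

$$G(i, j) = \begin{cases} M_0 + \sqrt{\dfrac{VAR_0\,(X(i, j) - M)^2}{VAR}}, & \text{if } X(i, j) > M, \\[2ex] M_0 - \sqrt{\dfrac{VAR_0\,(X(i, j) - M)^2}{VAR}}, & \text{if } X(i, j) \le M, \end{cases} \tag{9}$$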

where M0 and VAR0 are the expected mean value and variance (both usually equal to 128), and N is the size of the N × N input image X.
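Equations (7)–(9) translate almost directly into code. The Python sketch below assumes the image is a list of rows of gray levels; the function name `normalize` and the handling of a perfectly flat image (zero variance) are assumptions added for the example.

```python
import math

def normalize(image, m0=128.0, var0=128.0):
    """Normalize gray levels to mean m0 and variance var0, Eqs. (7)-(9)."""
    n_pixels = sum(len(row) for row in image)
    mean = sum(sum(row) for row in image) / n_pixels                      # Eq. (7)
    var = sum((p - mean) ** 2 for row in image for p in row) / n_pixels   # Eq. (8)
    if var == 0:
        # Assumption: a perfectly flat image is mapped to the target mean
        return [[m0 for _ in row] for row in image]
    normalized = []
    for row in image:
        new_row = []
        for p in row:
            shift = math.sqrt(var0 * (p - mean) ** 2 / var)               # Eq. (9)
            new_row.append(m0 + shift if p > mean else m0 - shift)
        normalized.append(new_row)
    return normalized
```

Because the shift is proportional to the distance of each pixel from the mean, the ordering of gray levels is preserved, which is consistent with the remark that the clarity between ridges and valleys does not change.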

Fig. 3. Example of image enhancement and binarization based on the Gabor filter¹

¹ Off-line fingerprint taken from the U.S. National Institute of Standards and Technology database, NIST-4, http://www.nist.gov


3 Minutiae Detection

3.1 Image Thinning, Coordinates and Types

Image thinning can be considered as a process of erosion [4, 7]. All pixels from the edges of an object (a fingerprint ridge) are removed only if they do not affect the coherence of the object as a whole, and they are left untouched otherwise. The skeleton form of a fingerprint is generated until there are no more surplus pixels to remove. The thickness of ridges in the resulting image has to be equal to one pixel, and the shape and run of the original ridges should be preserved. An example of a thinned form of a fingerprint can be seen in Fig. 4.

To determine whether a pixel at the position (i, j) in the skeleton form of a fingerprint is a minutia, the mask rules illustrated in Fig. 5 are applied. A bifurcation or an ending is defined at a place where the perimeter of the mask (the eight nearest neighbors of the central point) is intersected by ridge pixels in three parts or in one part, respectively.
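One common way to implement such a mask rule is to walk around the eight neighbours and count how many separate ridge parts cross the perimeter. The sketch below assumes a binary skeleton stored as a list of rows (1 = ridge, 0 = background) and an interior pixel; the function name and the returned labels are illustrative only, and the exact masks of Fig. 5 may treat some noisy configurations differently.

```python
def classify_pixel(skeleton, i, j):
    """Classify the skeleton pixel (i, j) from its 3x3 neighbourhood."""
    if skeleton[i][j] != 1:
        return "background"
    # Eight neighbours listed in circular order around the centre
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    ring = [skeleton[i + di][j + dj] for di, dj in offsets]
    # Each 0 -> 1 transition along the ring starts a separate ridge part
    crossings = sum(1 for k in range(8)
                    if ring[k] == 0 and ring[(k + 1) % 8] == 1)
    if crossings == 1:
        return "ending"
    if crossings == 3:
        return "bifurcation"
    return "other"  # ordinary ridge pixel or noise
```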

Fig. 4. Example of a thinned form of a fingerprint image from Fig. 3

Fig. 5. Example of 3×3 masks used to define: a) bifurcation, b) non-minutiae point, c) ending, d) noise


3.2 Minutiae Orientation

To define the orientation of each minutia, we can use a 7 × 7 mask technique with angles quantized to 15° and the center placed at the minutia. The orientation of an ending point is given by the point at which the ridge crosses the perimeter of the mask. The orientation of a bifurcation point can be estimated with the same method, but only the leading ridge is considered, that is, the ridge with the maximum sum of angles to the other two ridges of the bifurcation (see, for instance, Fig. 6).

Fig. 6. Bifurcation (60°) and ending (210°) point orientation example

3.3 False Minutiae Points

At the end of the minutiae detection process, all detected points should be verified to check that they were not created by accident, for example, as a result of filtering errors. Thus, a minutia should be treated as false and removed from the set if it lies at the border of the image, lies very close to the region marked as background in the segmentation mask, was created as a result of a local ridge peak (a bifurcation very close to an ending point), or is a consequence of the pore structure of the fingerprint (a ridge hole, i.e., two bifurcations in a close neighbourhood with opposite orientations). The local ridge noise problem can be reduced, e.g., by ridge smoothing techniques (a pixel is given a value using the majority rule in its nearest neighbourhood [4]) applied just after image binarization, so that all small ridge holes are patched and all peaks are smoothed out.

4 Minutiae Matching

4.1 Hough Transform

Let MA and MB denote minutiae sets determined from the images A and B:

$$\begin{aligned} M^A &= \{m_1^A, m_2^A, \ldots, m_m^A\}, \\ M^B &= \{m_1^B, m_2^B, \ldots, m_n^B\}. \end{aligned} \tag{10}$$


Each minutia is defined by its image coordinates (x, y) and orientation angle θ ∈ [0, 2π), that is,

$$\begin{aligned} m_i^A &= \{x_i^A, y_i^A, \theta_i^A\}, \quad i = 1 \ldots m, \\ m_j^B &= \{x_j^B, y_j^B, \theta_j^B\}, \quad j = 1 \ldots n. \end{aligned} \tag{11}$$

We seek a transformation of the minutiae set MB into MC that gives the best estimate of MA, that is, the MA minutiae have to be covered by the MC minutiae within a given distance tolerance (r0) and orientation tolerance (θ0). This means that the number of minutiae for which the condition

$$\forall_i\,\exists_j\ \left(S(m_i^A, m_j^C) \le r_0 \ \text{and}\ K(\theta_i^A, \theta_j^C) \le \theta_0\right) \tag{12}$$

holds has to be maximized, where S is a function defining the distance between a pair of minutiae (in the Chebyshev sense) and K is a function defining the difference between minutiae orientations (so that the difference between 3° and 358° is equal to 5°):

$$\begin{aligned} S(a, b) &= \max(|a_x - b_x|, |a_y - b_y|), \\ K(\alpha, \beta) &= \min(|\alpha - \beta|, 2\pi - |\alpha - \beta|). \end{aligned} \tag{13}$$
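Both functions are small enough to write out. The sketch below is a direct Python transcription under the assumption that angles are given in radians and points as (x, y) tuples; the function names are hypothetical, and the modulo operation is an added safeguard for angles outside [0, 2π).

```python
import math

def chebyshev_distance(a, b):
    """S(a, b): the larger of the absolute coordinate differences."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def orientation_difference(alpha, beta):
    """K(alpha, beta): the smaller of the two arcs between the angles."""
    d = abs(alpha - beta) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

# The example from the text: 3 degrees and 358 degrees differ by about 5 degrees
print(math.degrees(orientation_difference(math.radians(3), math.radians(358))))
```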

∀i ∀j ∀k ∀l: A(i, j, k, l) ← 0
FOR {x_i^A, y_i^A, θ_i^A} ∈ M^A, i = 1…m
  FOR {x_j^B, y_j^B, θ_j^B} ∈ M^B, j = 1…n
    FOR θ_k ∈ {θ_1, θ_2, …, θ_K}, k = 1…K
      IF K(θ_i^A, θ_j^B + θ_k) ≤ θ_0
        FOR s_l ∈ {s_1, s_2, …, s_L}, l = 1…L
        {
          [Δx, Δy]^T ← [x_i^A, y_i^A]^T − s_l · [cos θ_k, −sin θ_k; sin θ_k, cos θ_k] · [x_j^B, y_j^B]^T
          (Δx#, Δy#, θ#, s#) ← discretize(Δx, Δy, θ_k, s_l)
          A(Δx#, Δy#, θ#, s#) ← A(Δx#, Δy#, θ#, s#) + 1
        }
(Δx+, Δy+, θ+, s+) ← arg max(A)

Fig. 7. Hough transform routine


The Hough transform, which was adopted for fingerprint matching [14], can be performed to find the best alignment of the sets MA and MB, taking into account the possible scale, rotation and displacement of the image A versus B. The transformation space is discretized: each parameter of the geometric transform (Δx, Δy, θ, s) comes from a finite set of values. A four-dimensional accumulator A is used to accumulate evidence of alignment between each pair of minutiae considered. The best parameters of the geometric transform, (Δx+, Δy+, θ+, s+), are the arguments of the maximum value in the accumulator (see the procedure in Fig. 7).
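The voting procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: minutiae are assumed to be (x, y, θ) tuples with θ in radians, a dictionary replaces the dense four-dimensional accumulator, the displacement is discretized with a hypothetical `bin_size`, and the rotation matrix follows the usual counter-clockwise convention.

```python
import math

def orientation_difference(alpha, beta):
    d = abs(alpha - beta) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def hough_align(set_a, set_b, thetas, scales, theta0, bin_size=8.0):
    """Return (dx, dy, theta, scale, votes) of the best accumulator cell."""
    accumulator = {}  # sparse accumulator: cell -> number of votes
    for xa, ya, ta in set_a:
        for xb, yb, tb in set_b:
            for k, theta in enumerate(thetas):
                # Vote only if the rotated orientation of B agrees with A
                if orientation_difference(ta, tb + theta) > theta0:
                    continue
                c, s = math.cos(theta), math.sin(theta)
                for l, scale in enumerate(scales):
                    dx = xa - scale * (c * xb - s * yb)
                    dy = ya - scale * (s * xb + c * yb)
                    cell = (round(dx / bin_size), round(dy / bin_size), k, l)
                    accumulator[cell] = accumulator.get(cell, 0) + 1
    if not accumulator:
        return None  # no pair of minutiae passed the orientation test
    (bx, by, k, l), votes = max(accumulator.items(), key=lambda item: item[1])
    return bx * bin_size, by * bin_size, thetas[k], scales[l], votes
```

The sparse dictionary keeps memory proportional to the number of votes actually cast; a dense array, as implied by the four-dimensional accumulator in the text, would be the natural choice for a vectorized implementation.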

After performing the transformation, minutiae points are juxtaposed to calculate the matching score with respect to their distance, orientation and type (with a given tolerance).

An example result of the Hough transform is shown in Fig. 8.

4.2 Global Star Method

The global star method is based on a structural model of fingerprints. Distinguishing between the types of minutiae (ending or bifurcation) and taking into account the possible scale, rotation and displacement of the images, a star can be created with its central point in one of the minutiae and with its arms directed to the remaining ones (Fig. 9a). Assuming, as before, that

$$\begin{aligned} M^A &= \{m_1^A, m_2^A, \ldots, m_m^A\}, \\ M^B &= \{m_1^B, m_2^B, \ldots, m_n^B\}, \end{aligned} \tag{14}$$

indicate sets of minutiae of one type, m stars for the image A and n stars for the image B can be created:

$$\begin{aligned} S^A &= \{S_1^A, S_2^A, \ldots, S_m^A\}, \\ S^B &= \{S_1^B, S_2^B, \ldots, S_n^B\}, \end{aligned} \tag{15}$$

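where each star can be defined as

$$\begin{aligned} S_i^A &= \{m_1^A, m_2^A, \ldots, m_m^A\}, \quad i = 1 \ldots m, \ \text{center in } m_i^A, \\ S_j^B &= \{m_1^B, m_2^B, \ldots, m_n^B\}, \quad j = 1 \ldots n, \ \text{center in } m_j^B. \end{aligned} \tag{16}$$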

In contrast to the local methods [18], a voting technique can be used to select the best aligned pair of stars ($S_{wi}^A$, $S_{wj}^B$) (Fig. 10), matching features such as the between-minutiae angle K and the ridge count D (Fig. 9b, c). In the final decision, the orientation of the minutiae is also taken into account (Fig. 11), after adjusting it by the orientation difference (α) between the central points of the stars from the best alignment.

An example result of the global star matching method is shown in Fig. 12.


Fig. 8. Example result of the Hough transform – matched minutiae, with a given tolerance, are marked with ellipses

Fig. 9. General explanation of the star method: a) example star created for fingerprint ending points, b) ridge counting (here equal to 5), c) example of relative angle determination between the central minutiae and the remaining ones


∀i=1…m ∀j=1…n: A(i, j) ← 0
FOR S_i^A ∈ S^A, i = 1…m
  FOR S_j^B ∈ S^B, j = 1…n
    FOR m_k^A ∈ S_i^A − {m_i^A}
      assuming that ∃l: m_l^B ∈ S_j^B − {m_j^B}
      IF (|D(m_i^A, m_k^A) − D(m_j^B, m_l^B)| ≤ d_0 and |K(m_i^A, m_k^A) − K(m_j^B, m_l^B)| ≤ k_0)
        A(i, j) ← A(i, j) + 1
S_wi^A ← S^A(arg_i(max(A)))
S_wj^B ← S^B(arg_j(max(A)))

Fig. 10. First stage of the global star matching algorithm

L ← 0
FOR m_k^A ∈ S_wi^A − {m_wi^A}
  assuming that ∃l: m_l^B ∈ S_wj^B − {m_wj^B}
  IF (|D(m_wi^A, m_k^A) − D(m_wj^B, m_l^B)| ≤ d_0 and |K(m_wi^A, m_k^A) − K(m_wj^B, m_l^B)| ≤ k_0 and T(θ_k^A, θ_l^B + α) ≤ θ_0)
  {
    L ← L + 1
  }

Fig. 11. Second stage of the global star matching algorithm – way of determining the number of the matched minutiae pairs (L)

4.3 Correlation

Because of non-linear distortion, skin conditions or finger pressure, which cause image brightness and contrast to vary [13], correlation between fingerprint images cannot be applied directly. Moreover, taking into account the possible scale, rotation and displacement, searching for the best correlation between two images using an intuitive sum of squared differences is computationally very expensive. To eliminate or at least reduce some of these problems, a binary representation of the fingerprint can be used.

Fig. 12. Example result of the global star method

Fig. 13. Example of preliminary aligned fingerprint segmentation masks (left) and correlation between two impressions of the same finger (right), where red denotes the best alignment (images obtained with a Digital Persona U.are.U 4000 scanner)

Fig. 14. Algorithm of finding the best correlation ratio (W) between the images A and B using their segmentation masks Aseg and Bseg

Fig. 15. Example result of the best minutiae match for the correlation from Fig. 13

To speed up the process of preliminary alignment, a segmentation mask can be used in conjunction with the center of gravity of the binary images. Also, the parameters of the geometric transform can be quantized, considering only the scale and rotation at the first stage (since the displacement is the difference between the centers of gravity) and minimizing the Dseg criterion (a simple image XOR):

$$D_{seg}(A_{seg}, B_{seg}) = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} \begin{cases} 1, & \text{if } A_{seg}(i, j) \ne B_{seg}(i, j), \\ 0, & \text{if } A_{seg}(i, j) = B_{seg}(i, j). \end{cases} \tag{17}$$

After finding a nearly optimal alignment of the segmentation masks (Fig. 13), the search for the best correlation is limited to a much smaller area. Taking into account the rotation, the vertical and horizontal displacement, the stretch, and an arbitrarily selected granularity of these features, the best correlation can be found (Fig. 14) by searching for the maximum value of the Dobj criterion (a double image XNOR):

$$D_{obj}(A, B) = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} \begin{cases} 1, & \text{if } A(i, j) = B(i, j) = obj, \\ 0, & \text{in the other case,} \end{cases} \tag{18}$$

where obj represents the object’s (ridge) pixel. Because fingerprint correlation does not tell us anything about minutiae matching, the thinning process with minutiae detection should be applied to both binary images from the best correlation. The two sets of minutiae can then be compared to compute the matching score.
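The two criteria of Eqs. (17) and (18) amount to counting pixels. A minimal Python sketch, assuming equally sized binary images stored as lists of rows and ridge pixels encoded as 1 (the names `d_seg`, `d_obj` and the `obj` default are assumptions for this example):

```python
def d_seg(mask_a, mask_b):
    """Eq. (17): number of positions where the segmentation masks differ."""
    return sum(1 for row_a, row_b in zip(mask_a, mask_b)
               for a, b in zip(row_a, row_b) if a != b)

def d_obj(image_a, image_b, obj=1):
    """Eq. (18): number of positions where both images contain a ridge pixel."""
    return sum(1 for row_a, row_b in zip(image_a, image_b)
               for a, b in zip(row_a, row_b) if a == obj and b == obj)

a = [[1, 0], [1, 1]]
b = [[1, 1], [0, 1]]
print(d_seg(a, b), d_obj(a, b))  # 2 2
```

In a search over transform parameters, `d_seg` would be minimized during the preliminary alignment and `d_obj` maximized during the fine search.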

An example result of the correlation algorithm is shown in Fig. 13 and Fig. 15.

5 Experimental Results

The experiments were performed on a PC with a Digital Persona U.are.U 4000 fingerprint scanner. The database consists of 20 fingerprint images with 5 different impressions (plus one more for the registration phase).

Three experiments were carried out. The first and the third differ in the parameter settings of each method. In the second, the image for the registration phase was chosen arbitrarily as the best one in the arbiter’s opinion (in the first and the third experiment, the registration image was the first fingerprint image acquired from the user).

All images were enhanced with the Gabor filter described in Section 2 and matched using the algorithms described in Section 4. The matching results, evaluated according to the Polish regulations concerning fingerprint identification based on minutiae [6], and the time relations between the methods are summarized in Tab. 1.

Table 1. Summary of the achieved results

| | Experiment | Hough Trans. | Global Star | Correlation |
|---|---|---|---|---|
| Matching percentage [%] | 1 | 85 | 45 | 37 |
| | 2 | 88 | 76 | 70 |
| | 3 | 82 | 80 | 61 |
| Avg. count of endings / bifurcations in the best match | 1 | 25 / 7 | 10 / 3 | 16 / 4 |
| | 2 | 25 / 7 | 13 / 4 | 19 / 5 |
| | 3 | 26 / 7 | 19 / 5 | 17 / 5 |
| Number of images that did not cross the matching threshold | 1 | 1 | 22 | 6 |
| | 2 | 1 | 10 | 1 |
| | 3 | 1 | 2 | 2 |
| Time relation | 1, 2, 3 | 1 HT | ~6 HT | ~14 HT |

As one can easily notice, the Hough transform gave the fastest response and the highest hit ratio of the methods considered. Additionally, it can be quite easily vectorized to run more efficiently on SIMD computers.

The global star method is scale and rotation independent but computationally more expensive because of the star creation process: determining the ridge count between the mA and mB minutiae requires an iterative process, which is time consuming (even if one exploits the fact that D(mA, mB) = D(mB, mA) when creating the stars centered in mA and mB). Moreover, filtering errors and poor image quality can cause breaks in the continuity of ridges, which disturb the ridge count determination and, as a consequence, produce a lower matching percentage in common situations.


The analysis of the error set of the correlation method shows that, of the algorithms considered, it is the most sensitive to the selection of the image for the registration phase and to the parameter settings. A fragment that is too small or a strongly deformed fingerprint impression makes finding an unambiguous best correlation (the maximum W value in Fig. 13) significantly more difficult. Additionally, the method is time consuming because of its complexity (a series of geometric transformations).

6 Conclusions

In this paper several methods of fingerprint matching were reviewed. The experimental results show quality differences and time relations between the analyzed algorithms. The influence of selecting an image for the registration phase can also be observed: the better the selected image, the higher the matching percentage and the smaller the inconvenience if the system works as a lock.

Suboptimal parameters selected for the performed preliminary experiments show that it is still a challenge to use global optimization techniques for finding the best parameters of each described method. Additionally, automatic image pre-selection (classification), e.g., one based on global features of a fingerprint (such as core and delta positions, the loop class) can speed up the whole matching process for very large databases [3, 9, 11].

Heavy software architecture dependent optimizations or even hardware implementation [14] can be considered if there is a big concern about speed. On the other hand, if security is more important, hybrid solutions including, for example, voice, face or iris recognition could be combined with fingerprint identification to increase the system’s infallibility [19].

References

[1] Andrysiak, T., Choraś, M.: Image retrieval based on hierarchical Gabor filters. Int. J. of Appl. Math. and Comput. Sci. 15(4), 471–480 (2005)

[2] Bouslama, F., Benrejeb, M.: Exploring the human handwriting process. Int. J. of Appl. Math. and Comput. Sci. 10(4), 877–904 (2000)

[3] Cappelli, R., Lumini, A., Maio, D., Maltoni, D.: Fingerprint classification by directional image partitioning. IEEE Trans. Pattern Anal. Mach. Intell. 21(5), 402–421 (1999)

[4] Fisher, R., Walker, A., Perkins, S., Wolfart, E.: Hypermedia Image Processing Reference. John Wiley & Sons, Chichester (1996)

[5] Greenberg, S., Aladjem, M., Kogan, D., Dimitrov, I.: Fingerprint image enhancement using filtering techniques. In: IEEE Proc. 15th Int. Conf. Pattern Recognition, vol. 3, pp. 322–325 (2000)

[6] Grzeszyk, C.: Dactyloscopy. PWN, Warszawa (1992) (in Polish)

[7] Gonzalez, R., Woods, R.: Digital Image Processing. Prentice-Hall, Englewood Cliffs (2002)

[8] Hong, L., Wan, Y., Jain, A.: Fingerprint image enhancement: algorithm and performance evaluation. IEEE Trans. on Pattern Analysis and Machine Intelligence 20(8), 777–789 (1998)

[9] Jain, A., Minut, S.: Hierarchical kernel fitting for fingerprint classification and alignment. In: IEEE Proc. 16th Int. Conf. Pattern Recognition, vol. 2, pp. 469–473 (2002)

[10] Jain, A., Prabhakar, S., Hong, L., Pankanti, S.: Fingercode: a filterbank for fingerprint representation and matching. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 2187–2194 (1999)

[11] Karu, K., Jain, A.: Fingerprint classification. Pattern Recognition 29(3), 389–404 (1996)

[12] Lai, J., Kuo, S.: An improved fingerprint recognition system based on partial thinning. In: Proc. 16th Conf. on Computer Vision, Graphics and Image Processing, vol. 8, pp. 169–176 (2003)

[13] Maltoni, D., Maio, D., Jain, A., Prabhakar, S.: Handbook of Fingerprint Recognition. Springer, Heidelberg (2003)

[14] Ratha, N., Karu, K., Chen, S., Jain, A.: A real-time matching system for large fingerprint databases. IEEE Trans. on Pattern Analysis and Machine Intelligence 18(8), 799–813 (1996)

[15] Stock, R., Swonger, C.: Development and Evaluation of a Reader of Fingerprint Minutiae, Cornell Aeronautical Laboratory, Technical Report (1969)

[16] Tadeusiewicz, R.: Vision systems of industrial robots, WNT (1992) (in Polish)

[17] Thai, R.: Fingerprint Image Enhancement and Minutiae Extraction, University of Western Australia (2003)

[18] Wahab, A., Chin, S., Tan, E.: Novel approach to automated fingerprint recognition. IEE Proc. in Vis. Image Signal Process 145(3) (1998)

[19] Zhang, D., Campbell, P., Maltoni, D., Bolle, R. (eds.): Special Issue on Biometric Systems. IEEE Trans. on Systems, Man, and Cybernetics 35(3), 273–450 (2005)


Image Filtering Using the Dynamic Particles Method

L. Rauch and J. Kusiak

UST, AGH University of Science and Technology, Cracow, Poland

Abstract. The holistic approaches used for image processing are considered in various types of applications in the domain of applied computer science and pattern recognition. A new image filtering method based on the dynamic particles (DP) approach is presented. It employs physics principles for the 3D signal smoothing. The obtained results were compared with commonly used denoising techniques including weighted average, Gaussian smoothing and wavelet analysis. The calculations were performed on two types of noise superimposed on the image data i.e. Gaussian noise and salt-pepper noise. The algorithm of the DP method and the results of calculations are presented.

1 Introduction

1.1 Denoising Processes

The analysis of experimental measurement data is often difficult, and sometimes even impossible, in their raw form because of superimposed noise. Properly performed analysis based on denoising techniques allows the vital part of the data to be extracted. Thanks to the denoising process, which is often very expensive and time-consuming, the experimental data can be restored and used in further calculations. There are many examples of such data obtained from experiments in different domains of science, e.g., plastometric material tests, determination of engine parameters, sound recording, market analysis, etc. In most cases the observed noise is a result of external factors such as the sensitivity of industrial measuring sensors or market impulses [1].

The above-mentioned measurements are mainly in the form of one-dimensional signals that have to be pre-processed before further analysis (Fig. 1).

However, many experimental results are presented as multi-dimensional data and also require the application of denoising algorithms. An example of such data, used in medical or industrial applications, is image data in the form of two-dimensional pictures. The analysis of pictures taken, for example, by an industrial camera is very often difficult because of the low quality of the registered image caused by the low resolution of the available equipment. Moreover, the data presented in the picture are usually superimposed with noise. In most cases the noise in the image is the difference between the real color that can be seen by the human eye and the value that is registered by the camera. Thus, if there are many pixels in the picture with an unsettled value, the whole image can be illegible.


Fig. 1. Examples of noisy measurement signals obtained from different plastometric material tests [2]

However, the character of the noise can be very different. Several types of random noise can be distinguished [3]:

Gaussian noise – used for testing denoising algorithms, where the noise is generated and then superimposed on the source image,

White noise – noise that contains every frequency within the range of human hearing (generally from 20 Hz to 20 kHz) in equal amounts,

Salt-pepper noise – a specific type of noise that changes the values of randomly chosen pixels to white or black.

The denoising method presented in this paper is intended for images saved in grayscale.

1.2 State-of-the-Art

Several commonly used methods of denoising exist. Each of them has some advantages and disadvantages, but none can be treated as a unified denoising and smoothing method. Such a unified method should be applicable to different types of measurement data burdened with noise of different types. Existing methods have to be reconfigured and adapted to new conditions even if the analyzed data have the same form but different noise. An example of such data is presented in Figure 1, where two similar plots are shown. They contain the results of compression tests of metal samples performed with different tool velocities. Each of these curves is loaded with noise of different frequencies, although they describe the same type of tested material. Therefore, denoising methods should be designed to obtain similar results independently of the noise character and, more importantly, independently of the curve shape. This would allow the method to be applied in an automated denoising process that would not require reconfiguration of input parameters or additional user interaction.


The process of denoising is in fact a problem of data approximation. There are many such algorithms, but the most widely known and used are:

moving weighted average and polynomial approximations,

wavelet analysis [4] and artificial neural networks [5],

a large family of convolution methods and frequency-based filters [6],

Kalman statistical model processing [7],

dedicated filtering (used mainly in image filtering processes), e.g., NL-means and neighborhood models [3].

In the case of the polynomial approximation approach, the algorithms return well-fitting smoothed curves, but if the data contain several thousand measured points, the calculation time is very long and the method becomes inefficient. The weighted average technique allows very fast and flexible data smoothing, but the assessment of the obtained results is very difficult and based only on the user’s intuition. Moreover, if the algorithm runs too long, the results converge to the straight line or surface joining the border points of the data set. Thus, the main disadvantage of this method is the lack of a stop criterion. Wavelet analysis is very similar to the traditional Fourier method, but it is more effective in analyzing physical situations where the signal contains discontinuities and sharp peaks. Wavelets are mathematical functions that divide the data into different frequency components, and each component is then analyzed with a resolution matched to its frequency scale. This allows the denoising process to be applied at different levels of signal decomposition, making the solution very precise and controllable. The drawbacks of the method are the necessity of setting thresholds each time the input data change and of choosing the number of decomposition levels, which can depend on the noise character. The approach based on artificial neural networks is also often used, mainly with Generalized Regression Neural Networks (GRNN). The results obtained using this technique are smoother than those of other methods, e.g., wavelet analysis, but the application has to be reconfigured each time the data change. In some cases even the type of the network must be changed, which is very inconvenient during continuous calculations. Thus, the neural network approach is suitable for single calculations, but not for automated application of the denoising process.

The review of the above-mentioned denoising methods allows the main problems related to the process of denoising to be identified:

the definition of the stop criterion and the evaluation of the quality of the results,

techniques applied as iterated algorithms run too long in most cases,

the results are too simplified, which makes further analysis of the data useless,

there is no unified method that could be applied to different types of noise characterized by different frequencies.

The main objective of this paper is the presentation of a scalable algorithm that can be applied to different types of random noise. Moreover, the algorithm should be equipped with a stop criterion that analyzes the progress of calculations by making an ongoing assessment of the obtained results. The method is described, and the results of its application to test sets of image data are presented together with their interpretation.

2 Dynamic Particle Method

2.1 Description of the DPA Idea

The idea of the Dynamic Particles (DP) algorithm is based on the definition of a particle. Many definitions exist in different domains of science, but the most general one treats the particle as an object placed in an N-dimensional space. From the mathematical point of view, the particle is a vector with N components, one for each dimension of the space. This approach characterizes the particle’s position, which can thus be analyzed relative to the others [8].

The paper presents an algorithm that performs calculations on three-dimensional particles, where a particle is in fact a pixel in the form of a three-dimensional vector:

two dimensions define the position of the pixel in the image (width and height),

the third dimension defines the value of the pixel in grayscale, from the range 0–255, where 0 indicates black and 255 indicates white.

Therefore, the whole image is obtained as a 3D surface made of points representing the corresponding pixels. The values of all three dimensions should be normalized before the calculations. This equalizes the influence that each dimension has on the results. Finally, the obtained results are re-scaled to the original range of values.

2.2 DP Algorithm

The DP algorithm employs elementary physical principles determining the laws of particle motion. Each particle has its own set of neighbour particles. The distance between the particle and one of its neighbours is denoted as dij and calculated in Euclidean space as:

$$d_{ij} = \sqrt{\sum_{k=1}^{N} (x_{jk} - x_{ik})^2} \tag{1}$$

where xjk and xik indicate the vector components in N-dimensional space. The force between two particles is proportional to the distance between them, and the force acting on a particle can be defined as the resultant of all forces exerted by its neighbours. Thus, the length of the resultant force acting on the particle can be treated as the particle’s potential Vi. The gradient of this potential is mainly responsible for the movement of the particle in each calculation step. The set of differential equations of particle movement can be written as follows:

$$\begin{cases} m_i \cdot \dfrac{dv_i}{dt} = -\nabla V_i - f_c \cdot v_i \\[1.5ex] dr = v_i \cdot dt \end{cases} \tag{2}$$

where i is the number of the considered particle (pixel), mi is the mass of the particle (the default mass is equal to 1), vi is the particle’s velocity, dr is the one-step distance, and fc is the friction coefficient.

The friction coefficient plays the role of a friction force, which is responsible for braking the motion of the particles. The value of this coefficient should be less than 0.5 to sustain the stability of the whole set of particles. It has been found, over several performed calculations, that the best start value for fc is 0.4, which makes the algorithm fast and convergent.

After each iteration of the algorithm, a correction of fc is applied. If the calculated resultant force is lower than the resultant force in the previous iteration, then fc is reduced by the quotient of these forces. Thus, the convergence of the algorithm is assured through the reduction of the forces and of fc in each step of the calculations.

The example of the calculations of a resultant force in 3D is presented in Figure 2. The set of neighbours for each particle contains eight (full set) or four (subset) particles.

Fig. 2. Visualization of the image as connected particles set

The stop criterion of the proposed algorithm is based on a threshold of movement. If the force acting on a single particle is less than the threshold defined at the beginning of the algorithm, the particle does not move any longer. The whole algorithm ends when all particles are stopped. However, the threshold responsible for the motion of the particles also defines the smoothness of the expected results. If it is set to a small value, the algorithm runs until all forces on the curve reach the threshold and the differences between the positions of two adjacent particles are very low. Otherwise, the plot of the new curve is sharper and retains all the most important peaks. The value of this parameter can vary between 10^−5 and 10^−20. If its value is too small, it no longer has any impact on the shape of the curve. Conversely, if it is too high, the algorithm stops too early and gives no smoothing effect.

2.3 Quality Assessment of Results

Precise validation of the obtained results is possible only when testing on original data that contain no noise. The testing procedure consists of three main steps:

(i) Preparation of testing data: the original image (without noise) is taken and corrupted with generated random noise, which gives the converted image.
(ii) The denoising algorithm is applied to the converted image, which gives the denoised image.
(iii) The similarity ratio between the original and denoised images is calculated.

The similarity ratio is in most cases calculated as the standard deviation between the original and denoised images [3]. A coefficient equal to zero would mean that the denoising was perfect; at the moment no algorithm gives such results. The main disadvantage is that the value of the ratio is absolute, which makes it hard to interpret. Therefore, a coefficient of denoising quality is proposed, which accounts for the differences between the original image, the converted image and the denoised image as follows:

$$
D_q = \frac{calc\_diff(S_i, N_i)}{calc\_diff(S_i, D_i)}
\tag{3}
$$

where Dq is denoising quality coefficient; Si – source (original) image; Ni – noised image; Di – denoised image. The calc_diff function used in equation (3) is defined as the modified standard deviation:

$$
calc\_diff = \frac{\sum d_i}{n - 1}
\tag{4}
$$

where $d_i$ is the distance between corresponding particles in the two images and $n$ is the number of points. A $D_q$ coefficient equal to 1 means that the denoising process had no effect. Thus, the $D_q$ value should be greater than 1, and a higher value means better denoising. Tests performed on one-dimensional data (signal denoising) indicated that high-quality denoising was obtained when the $D_q$ value was higher than 5. However, images are often very jagged, and for them a denoising quality in the range from 1.2 to 1.6 is satisfactory.
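As a hedged illustration of Eqs. (3) and (4), the sketch below computes the denoising quality coefficient for images flattened to lists of pixel values. It assumes that the distance between corresponding particles is the absolute difference of their values; the function names and sample data are hypothetical.

```python
def calc_diff(image_a, image_b):
    """Eq. (4) as printed: sum of point distances divided by (n - 1)."""
    distances = [abs(a - b) for a, b in zip(image_a, image_b)]
    return sum(distances) / (len(distances) - 1)


def denoising_quality(source, noised, denoised):
    """Eq. (3): D_q = calc_diff(S, N) / calc_diff(S, D)."""
    return calc_diff(source, noised) / calc_diff(source, denoised)


# Hypothetical pixel values, chosen only to exercise the formulas.
source = [10.0, 10.0, 20.0, 20.0, 30.0, 30.0]
noised = [12.0, 8.0, 23.0, 18.0, 27.0, 32.0]
denoised = [11.0, 9.0, 21.0, 19.0, 29.0, 31.0]
dq = denoising_quality(source, noised, denoised)
```

Passing the noised image as the "denoised" argument returns exactly 1, the no-effect value mentioned in the text.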

2.4 Computational Complexity

One of the main objectives of this paper was to create a scalable denoising algorithm. Here scalability means that the algorithm is applicable to data placed in N-dimensional space. Thus, the complexity of the algorithm should depend only weakly on the number of dimensions.

Table 1. The algorithm of DP method

___________________________________________
int i = 0;                       // iteration counter
do { // Begin of the calculations
    total = 0;
    for (int j = 0; j < data.length; j++) {
        // Motion of the particle
        if (ismoving[j] == 1) {
            // New particle position
            position[j] = calc_pos(j);
            // Store forces for reduction of Cc
            if (i > 0) force_old[j] = force[j];
            force[j] = pos_diff(data[j] - position[j]);
            // New coef. for each particle
            data[j] = position[j];
        }
        // Reduction of the Cc for each particle
        // (quotient below 1, so Cc decreases as the force decreases)
        if (force[j] < force_old[j])
            cc[j] = cc[j] * (force[j] / force_old[j]);
        // Checking the stop criterion
        if (cc[j] < thres) { ismoving[j] = 0; }
        else { ismoving[j] = 1; total = 1; }
    }
    i++;
} while (total == 1); // End of the calculations
___________________________________________

The source code presented in Table 1 shows that the computational complexity of the DP method depends mainly on the number of points in the data set. In the pessimistic variant, each iteration of the algorithm requires the calculation of every particle's position. The function responsible for these calculations, calc_pos, performs several iterations whose number depends on the number of neighbouring particles and the number of dimensions. For two-dimensional data (an image) the influence of this function on the complexity of the algorithm is insignificant. What matters is the number of main loop iterations, which depends on the initial value of Cc. Satisfactory results for image denoising (e.g. 400x400 pixels, i.e. 160 000 particles) are obtained after 25-50 iterations. Thus, the computational complexity of the designed algorithm can be estimated in O notation as follows:

$$
O(n) = n \log_2 n
\tag{5}
$$

where $n$ depends on the number of particles and the $\log_2 n$ component is related to the number of iterations.

3 Results

The algorithm was tested on a data set containing several example images. The images were characterized by different types of content:

(i) smoothed: contents with several basic colors grouped in sets of pixels, e.g. geometric figures (Fig. 3-5),
(ii) jagged: real photos containing in most cases the full range of colors mixed with each other at high frequency (Fig. 6-9).

Each of them was superimposed with two types of generated random noise:

(i) Gaussian noise: the standard deviation was set to 5, 10 and 20 in separate runs, which means that the average value of the noise was equal to 4%, 8% and 16% of the average pixel value, respectively,
(ii) Salt and pepper noise: the noise was superimposed on pixels with probability p. If the random number was less than or equal to p, the value of the pixel was changed to zero. Otherwise, if the random number was greater than 1-p, the pixel value was changed to 255.

Finally, the data set contained 30 (thirty) images submitted for analysis with the DP algorithm. Several chosen results are presented in Figures 3-9.
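The two noise models described above can be sketched as follows. This is an illustrative sketch only: it assumes 8-bit grey levels stored in a flat list, zero-mean Gaussian noise, and clipping of the result to the range 0..255, none of which is stated in the text. The function names are hypothetical.

```python
import random


def add_gaussian_noise(pixels, sigma, rng):
    """Add zero-mean Gaussian noise and clip to the 8-bit range 0..255 (assumed)."""
    return [min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in pixels]


def add_salt_and_pepper(pixels, p, rng):
    """Set a pixel to 0 if u <= p, to 255 if u > 1 - p, otherwise keep it."""
    noisy = []
    for value in pixels:
        u = rng.random()
        if u <= p:
            noisy.append(0)
        elif u > 1.0 - p:
            noisy.append(255)
        else:
            noisy.append(value)
    return noisy


rng = random.Random(0)          # fixed seed so the example is repeatable
image = [128] * 1000            # hypothetical uniform grey image
gaussian = add_gaussian_noise(image, sigma=10, rng=rng)
salt_pepper = add_salt_and_pepper(image, p=0.05, rng=rng)
```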

Fig. 3. Example of smooth image containing only 5 colors in original version, noised with Gaussian (4%) noise

Fig. 4. Result of denoising process obtained by using DP method


Fig. 5. The comparison of magnified noised (Fig. 3) and denoised (Fig. 4) images – the denoising quality coefficient is equal 1.495

Fig. 6. Example of original jagged image containing a lot of details – Lena picture

Fig. 7. The same image noised with Gaussian (8%) noise

Fig. 8. Results obtained using standard DP method

Fig. 9. Results obtained using DP method equipped with edge detection algorithm used during the neighborhood determination process


The results obtained from the denoising process seem to be satisfactory. However, the main observed disadvantage is the lack of an edge detection procedure, which can be seen in Fig. 5. The subsequent figures present the application of the DP method supported by a proper edge detection algorithm.

4 Conclusions and Discussion

The presented DP method was designed as a data denoising technique that can be applied to measurement data of different dimensions without reconfiguring the whole set of parameters. The design and implementation of such a unified method was the main objective of this paper. The algorithm was tested on one-dimensional data (measurement signals of different frequencies) and two-dimensional data (images, which are the subject of this work). Moreover, the implementation of the algorithm allows calculations to be performed on multi-dimensional data.

The two dimensional data in form of image is a special type of measurements data obtained from a photo camera. The analysis of such data is often impeded by the superimposed noise. The DP algorithm suppresses those noised parts of images allowing the further analysis.

An advantage of image data processing is the possibility of visual assessment of the results. The results obtained using the DP method are very similar to those obtained by Gaussian filtering and wavelet smoothing.

Further development of this technique should focus on its application to multi-dimensional data processing. The main objectives that should be achieved are:

(i) design and implementation of the shared nearest neighborhood algorithm,
(ii) design of a parallel version of the DP algorithm,
(iii) combination of the DP method with a proper algorithm of Multi Dimensional Scaling (MDS).

Acknowledgements

Financial assistance of the KBN project No. 11.11.110.575 is acknowledged.

References

[1] Rauch, Ł., Talar, J., Zak, T., Kusiak, J.: Filtering of thermomagnetic data curve using artificial neural network and wavelet analysis. In: Rutkowski, L., Siekmann, J.H., Tadeusiewicz, R., Zadeh, L.A. (eds.) ICAISC 2004. LNCS, vol. 3070, pp. 1093–1098. Springer, Heidelberg (2004)

[2] Gawąd, J., Kusiak, J., Pietrzyk, M., Di Rosa, S., Nicol, G.: Optimization Methods Used for Identification of Rheological Model for Brass. In: Proc. 6th ESAFORM Conf. On Material Forming, Salerno, Italy, pp. 359–362 (2003)

[3] Buades, A., Coll, B., Morel, J.M.: On image denoising methods, Centre de Matematiques et de Leurs Applications, http://www.cmla.ens-cachan.fr


[4] Adelino, R., da Silva, F.: Bayesian wavelet denoising and evolutionary calibration. Digital Signal Processing 14, 566–689 (2004)

[5] Falkus, J., Kusiak, J., Pietrzkiewicz, P., Pietrzyk, W.: The monograph, Intelligence in Small World - nanomaterials for the 21th Century. In: Filtering of the industrial data for the Artificial Neural Network Model of the Steel Oxygen Converter Process. CRC-PRESS, Boca Raton (2003)

[6] Hara, S., Tsukada, T., Sasajima, K.: An in-line digital filtering algorithm for surface roughness profiles. Precision Engineering 22, 190–195 (1998)

[7] Piovoso, M., Laplante, P.A.: Kalman filter recipes for real-time image processing. Real-time Image Processing 9, 433–439 (2003)

[8] Dzwinel, W., Alda, W., Yuen, D.A.: Cross-Scale Numerical Simulations using Discrete Particle Models. Molecular Simulation 22, 397 (1999)


The Simulation of Cyclic Thermal Swing Adsorption (TSA) Process

Bogdan Ambrożek

Szczecin University of Technology, Department of Chemical Engineering and Environmental Protection Processes, Al. Piastów 42, 71-065 Szczecin [email protected]

Abstract. The dynamic behavior of cyclic thermal swing adsorption (TSA) system with a column packed with fixed bed of adsorbent is predicted successfully with a rigorous dynamic mathematical model. The set of partial differential equations (PDEs), representing the TSA, is solved by the numerical method of lines (NMOL), using the FORTRAN subroutine DIVPAG from the International Mathematical and Statistical Library (IMSL). The simulated TSA cycle is operated in three steps: (i) an adsorption step with cold feed; (ii) a countercurrent desorption step with hot inert gas; (iii) a countercurrent cooling step with cold inert gas. Exemplary simulation results are presented for the propane adsorbed onto and desorbed from fixed bed of activated carbon. Nitrogen is used as carrier gas during adsorption and as purge gas during desorption and cooling.

1 Introduction

Cyclic thermal swing adsorption (TSA) processes have been widely used in industry for the removal and recovery of pollutants, such as volatile organic compounds (VOCs), from gaseous streams [1]. A typical TSA system consists of two adsorption columns with a fixed bed of adsorbent and operates between two different temperatures. While the adsorption process takes place in one column, the bed in the other column is subjected to regeneration. During desorption, the first step of regeneration, hot purge gas, which can be a slipstream of the purified gas or another inert gas, flows through the bed. The adsorbate concentration in the purge gas is much higher than in the feed gas, and this concentrated stream can be sent to an incinerator. It is also possible to recover the adsorbate by condensing it out of the purge gas stream. After completion of the desorption step, the bed is cooled.

From the mathematical point of view, cyclic TSA processes are classified as distributed parameter systems, described by an integrated system of partial differential and algebraic equations (IPDAEs) [2]. Each TSA process approaches a cyclic steady-state (CSS). In this state the conditions at the end of each cycle are identical to those at its start [3]. The design of TSA processes is difficult because little is known about the influence of the process variables on the dynamic behavior of the adsorption column and on the time needed to converge to the cyclic steady-state.

The purpose of the present paper is to provide a parametric analysis of thermal swing adsorption. The effect of different operating conditions and some of the model parameters on the concentration and temperature breakthrough curves is considered. The cyclic steady-state cycles are obtained by cyclic computer simulation. The system studied was propane adsorbed onto and desorbed from a fixed bed of activated carbon. Nitrogen was used as carrier gas during adsorption and as purge gas during desorption and cooling. The TSA cycle comprises the following steps: adsorption, desorption and cooling.

2 Mathematical Model

The mathematical model describing the TSA process consists of integrated partial differential and algebraic equations. The model equations were obtained by applying differential material and energy balances to the adsorbent bed. The following assumptions were made:

(1) The gas phase follows the ideal gas law.
(2) Constant pressure operation.
(3) Single adsorbate system.
(4) The velocity of the carrier gas is constant.
(5) Negligible radial concentration, temperature and velocity gradient within the bed.
(6) Negligible intraparticle heat transfer resistance.

The model considers mass and heat transfer resistances, axial diffusion and thermal conductivity.

Based on the above assumptions, the adsorbate mass balance within the gas phase is represented by the following equation:

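$$
-D_{ax}\frac{\partial^2 y}{\partial z^2}
+ \frac{G}{\varepsilon \rho_g}\frac{\partial y}{\partial z}
+ \frac{\partial y}{\partial t}
+ \frac{(1-\varepsilon)}{\varepsilon}\frac{\rho_p}{\rho_g}\frac{\partial q}{\partial t} = 0
\tag{1}
$$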

The adsorbate balance around the solid phase is formulated using a linear driving force expression:

$$
\frac{\partial q}{\partial t} = k\left(q^* - q\right)
\tag{2}
$$
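The linear driving force expression can be illustrated with a small sketch that integrates Eq. (2) for a constant equilibrium loading and compares the result with the analytical solution q(t) = q*(1 - exp(-kt)). The explicit Euler scheme and all numerical values are assumptions chosen for illustration; in the full model q* changes with the local gas composition and temperature.

```python
import math


def ldf_uptake(q_star, k, t_end, dt, q0=0.0):
    """Integrate dq/dt = k * (q_star - q) with explicit Euler steps."""
    q = q0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        q += dt * k * (q_star - q)
    return q


# Hypothetical values, not taken from the paper.
q_star = 1.25   # mol/kg, equilibrium loading held constant
k = 0.01        # 1/s, overall mass transfer coefficient
q_numeric = ldf_uptake(q_star, k, t_end=200.0, dt=0.1)
q_exact = q_star * (1.0 - math.exp(-k * 200.0))
```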

Heterogeneous energy balance around the gas phase in the packed bed accounts for axial conduction and heat transfer to the solid phase and to the column wall:

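$$
-\frac{k_{ax}}{\rho_g C_{pg}}\frac{\partial^2 T_g}{\partial z^2}
+ \frac{G}{\varepsilon \rho_g}\frac{\partial T_g}{\partial z}
+ \frac{\partial T_g}{\partial t}
+ \frac{h_f \alpha_p (1-\varepsilon)}{\varepsilon \rho_g C_{pg}}\left(T_g - T_s\right)
+ \frac{4 h_w}{\varepsilon D \rho_g C_{pg}}\left(T_g - T_c\right) = 0
\tag{3}
$$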

The energy balance of the adsorbent particle includes the heat generated by adsorption and is expressed as:

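$$
\frac{\partial T_s}{\partial t}
- \frac{h_f \alpha_p}{\rho_p C_{ps}}\left(T_g - T_s\right)
- \frac{\Delta H_a}{C_{ps}}\frac{\partial q}{\partial t} = 0
\tag{4}
$$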


Wall energy balance is stated as:

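$$
\frac{\partial T_c}{\partial t}
- \frac{h_w \alpha_c}{\rho_c C_{pc}}\left(T_g - T_c\right)
+ \frac{U \alpha_{air}}{\rho_c C_{pc}}\left(T_c - T_{amb}\right) = 0
\tag{5}
$$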

The overall mass transfer coefficient k is calculated by combining the resistances outside and inside the adsorbent particle [4,5]:

$$
\frac{1}{k} = \frac{\rho_p\, q^*}{k_f \alpha_p \rho_g\, y} + \frac{R_p}{5 D_{ps} \alpha_p}
\tag{6}
$$

The effective diffusivity, Dps, is related to Knudsen and surface diffusivities as follows:

$$
D_{ps} = D_S + \frac{\varepsilon_p \rho_g}{\rho_p} D_K \frac{\partial y^*}{\partial q}
\tag{7}
$$

The Knudsen diffusion coefficient is calculated by the following equation [6]:

$$
D_K = 9.7 \cdot 10^{-5}\, \frac{r_e}{\tau_p} \left(\frac{T}{M}\right)^{1/2}
\tag{8}
$$

The surface diffusion coefficient, $D_S$, is expressed by the following equation [6]:

$$
D_S = \frac{1.61 \cdot 10^{-6}}{\tau_s} \exp\left(-\frac{E}{RT}\right)
\tag{9}
$$

Mass and heat axial dispersion values are calculated with the following correlations [7]:

$$
\frac{\varepsilon D_{ax}}{D_M} = 20 + 0.5\, Sc\, Re
\tag{10}
$$

$$
\frac{k_{ax}}{k_g} = 7 + 0.5\, Pr\, Re
\tag{11}
$$

The $k_f$ and $h_f$ values are calculated using the equations of Wakao and Chen [8]:

$$
Sh = 2.0 + 1.1\, Sc^{1/3} Re^{0.6}
\tag{12}
$$

$$
Nu = 2.0 + 1.1\, Pr^{1/3} Re^{0.6}
\tag{13}
$$
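A short sketch of how Eqs. (12) and (13) might be turned into film coefficients is given below. The conversion step assumes the usual definitions Sh = k_f d_p / D_M and Nu = h_f d_p / k_g with the particle diameter d_p = 2 R_p as the length scale; this choice and the function names are assumptions, since the text does not state them.

```python
def sherwood(re, sc):
    """Eq. (12): Sh = 2.0 + 1.1 * Sc**(1/3) * Re**0.6."""
    return 2.0 + 1.1 * sc ** (1.0 / 3.0) * re ** 0.6


def nusselt(re, pr):
    """Eq. (13): Nu = 2.0 + 1.1 * Pr**(1/3) * Re**0.6."""
    return 2.0 + 1.1 * pr ** (1.0 / 3.0) * re ** 0.6


def film_coefficients(re, sc, pr, d_m, k_g, r_p):
    """Convert Sh and Nu to k_f [m/s] and h_f [W/(m2 K)], assuming d_p = 2 * R_p."""
    d_p = 2.0 * r_p
    k_f = sherwood(re, sc) * d_m / d_p
    h_f = nusselt(re, pr) * k_g / d_p
    return k_f, h_f


# Hypothetical inputs used only to exercise the functions.
k_f, h_f = film_coefficients(1.0, 1.0, 1.0, d_m=1.0e-5, k_g=0.026, r_p=1.0e-3)
```

In the stagnant limit (Re = 0) both correlations reduce to the value 2.0.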


Molecular diffusivity is calculated using the equation developed by Fuller et al. [9]:

$$
D_M = \frac{1.013 \cdot 10^{-2}\, T^{1.75} \left(1/M + 1/M_g\right)^{1/2}}
{P \left[ \left(D_v\right)^{1/3} + \left(D_{vg}\right)^{1/3} \right]^2}
\tag{14}
$$
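Equation (14) can be evaluated with a few lines of code. The sketch below uses SI pressure (Pa) and molecular weights in kg/kmol, as in the list of symbols. The diffusion volumes used in the example (65.34 for propane, 17.9 for nitrogen) are assumed values from commonly tabulated Fuller increments and are not given in the paper.

```python
def fuller_diffusivity(temperature, pressure, m_a, m_g, dv_a, dv_g):
    """Eq. (14): D_M in m2/s for T [K], P [Pa], M [kg/kmol] and diffusion volumes."""
    mass_term = (1.0 / m_a + 1.0 / m_g) ** 0.5
    volume_term = (dv_a ** (1.0 / 3.0) + dv_g ** (1.0 / 3.0)) ** 2
    return 1.013e-2 * temperature ** 1.75 * mass_term / (pressure * volume_term)


# Assumed inputs: propane in nitrogen at the feed conditions of the simulated cycle.
d_m = fuller_diffusivity(293.0, 0.25e6, m_a=44.1, m_g=28.0, dv_a=65.34, dv_g=17.9)
```

The result is inversely proportional to the total pressure, so halving P doubles the estimated diffusivity.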

The isosteric heat of adsorption is estimated using the Clausius-Clapeyron equation:

$$
\Delta H_a = R T^2 \left(\frac{\partial \ln p}{\partial T}\right)_q
\tag{15}
$$

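For $z = 0$ and $t > 0$ two different boundary conditions are used:

$$
-D_{ax}\left.\frac{\partial y}{\partial z}\right|_{z=0}
= \frac{G}{\rho_g}\left( y\big|_{z=0^-} - y\big|_{z=0^+} \right)
\tag{16}
$$

$$
-k_{ax}\left.\frac{\partial T_g}{\partial z}\right|_{z=0}
= G\, C_{pg}\left( T_g\big|_{z=0^-} - T_g\big|_{z=0^+} \right)
\tag{17}
$$

and

$$
y\big|_{z=0^-} = y\big|_{z=0^+} = y_o
\tag{18}
$$

$$
T_g\big|_{z=0^-} = T_g\big|_{z=0^+} = T_{go}
\tag{19}
$$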

Both of the above boundary conditions have been employed by other investigators [4-6].

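The boundary conditions at $z = L$ and $t > 0$ are written as follows:

$$
\left.\frac{\partial y}{\partial z}\right|_{z=L} = 0; \qquad
\left.\frac{\partial T_g}{\partial z}\right|_{z=L} = 0
\tag{20}
$$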

The solution of the model equations requires knowledge of the state of the column at the beginning of each step. The initial conditions for $0 < z < L$ and $t = 0$ are:

$$
\begin{aligned}
& q(0,z) = q_o(z); \quad y(0,z) = y_o(z) \\
& T_s(0,z) = T_{so}(z); \quad T_g(0,z) = T_{go}(z) \\
& T_c(0,z) = T_{co}(z)
\end{aligned}
\tag{21}
$$

In the present study, it is assumed that the final concentration and temperature profile in adsorbent bed for each step defines the initial conditions for the next step. For the adsorption step in the first adsorption cycle:


$$
\begin{aligned}
& q_o(z) = 0; \quad y_o(z) = 0, \\
& T_{so}(z) = T_{go}(z) = T_{co}(z) = T_a.
\end{aligned}
\tag{22}
$$

The temperature-dependent Langmuir isotherm equation was used to represent the adsorption equilibrium:

$$
q^* = \frac{q_o \exp(Q/T)\, b_o \exp(B/T)\, y P}{1 + b_o \exp(B/T)\, y P}
\tag{23}
$$

3 Numerical Solution

The model developed in this work consists of partial differential equations (PDEs) for mass and energy balances. The set of PDEs is first transformed into a dimensionless form, and the resulting system is solved using the numerical method of lines (NMOL) [10]. The spatial discretization is performed using second-order central differencing, and the PDEs are reduced to a set of ordinary differential equations (ODEs). The number of axial grid nodes was 30. The resulting set of ODEs was solved using the FORTRAN subroutine DIVPAG of the International Mathematical and Statistical Library (IMSL). The DIVPAG program employs the Adams-Moulton or Gear's BDF method with variable order and step size.
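The idea of the method of lines can be sketched on a much simpler problem than the TSA model. The example below discretizes the pure diffusion equation u_t = D u_zz on 30 axial nodes with second-order central differences and zero-gradient ends, then advances the resulting ODE system in time. It deliberately swaps the DIVPAG integrator for a hand-written classical Runge-Kutta step, and the diffusivity, time step and initial profile are hypothetical.

```python
def diffusion_rhs(u, diffusivity, dz):
    """Second-order central differences for u_t = D * u_zz with zero-flux ends."""
    n = len(u)
    dudt = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else u[1]           # mirror node: du/dz = 0 at z = 0
        right = u[i + 1] if i < n - 1 else u[n - 2]  # mirror node: du/dz = 0 at z = L
        dudt[i] = diffusivity * (left - 2.0 * u[i] + right) / dz ** 2
    return dudt


def rk4_step(u, dt, rhs):
    """One classical fourth-order Runge-Kutta step for the semi-discrete ODE system."""
    k1 = rhs(u)
    k2 = rhs([x + 0.5 * dt * k for x, k in zip(u, k1)])
    k3 = rhs([x + 0.5 * dt * k for x, k in zip(u, k2)])
    k4 = rhs([x + dt * k for x, k in zip(u, k3)])
    return [x + dt * (a + 2 * b + 2 * c + d) / 6.0
            for x, a, b, c, d in zip(u, k1, k2, k3, k4)]


nodes = 30                       # number of axial grid nodes, as in the text
length = 0.40                    # m, bed length of the simulated column
dz = length / (nodes - 1)
initial = [1.0 if i < nodes // 2 else 0.0 for i in range(nodes)]  # hypothetical step profile
profile = list(initial)
for _ in range(2000):
    profile = rk4_step(profile, 0.05, lambda u: diffusion_rhs(u, 1.0e-4, dz))
```

A stiff, variable-step integrator such as the one used in the paper is preferable for the full model, where adsorption and heat-transfer terms act on very different time scales.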

4 Results and Discussion

The simulated TSA cycle (Figure 1) was operated in three steps: (i) an adsorption step with cold feed (293K); (ii) a countercurrent desorption step with hot inert gas; (iii) a countercurrent cooling step with cold inert gas (293K).

The system studied was propane adsorbed onto and desorbed from a fixed bed of activated carbon (Columbia Grade L). Nitrogen was used as carrier gas during adsorption and as purge gas during desorption and cooling. The adsorbent bed was 0.40 m long, with a diameter of 0.07 m. The concentration of propane at the inlet to the adsorption column during the adsorption step was y = 0.01 mol/mol, and the total pressure was P = 0.25 MPa. The superficial gas flow rate was the same for each step and equal to 7.0 mol/(m² s).

The constants in Eq. (23) for propane on activated carbon were determined using the experimental isotherm data published in [11]. The following parameter values were obtained: $q_o$ = 1.841 mol/kg, $Q$ = 323.7 K, $b_o$ = 0.257·10⁻⁷ Pa⁻¹, $B$ = 2466.5 K.
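With these constants, Eq. (23) can be evaluated directly. The sketch below computes the equilibrium loading at the feed conditions stated above (y = 0.01, T = 293 K, P = 0.25 MPa) and at the highest purge gas temperature used in the study (394 K). The function and variable names are hypothetical.

```python
import math

Q_O = 1.841       # mol/kg
Q = 323.7         # K
B_O = 0.257e-7    # 1/Pa
B = 2466.5        # K


def langmuir_q_star(y, temperature, pressure):
    """Eq. (23): equilibrium loading q* [mol/kg] for mole fraction y at T [K], P [Pa]."""
    b = B_O * math.exp(B / temperature)
    return Q_O * math.exp(Q / temperature) * b * pressure * y / (1.0 + b * pressure * y)


q_feed = langmuir_q_star(y=0.01, temperature=293.0, pressure=0.25e6)
q_hot = langmuir_q_star(y=0.01, temperature=394.0, pressure=0.25e6)
```

The loading at 394 K is much lower than at 293 K for the same gas composition, which is the driving effect behind thermal regeneration.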

The cyclic steady-state (CSS) cycles are obtained under various conditions by a cyclic iteration method; complete cycles are run until the periodic states are achieved. The adsorption step is terminated when the outlet concentration of the organic compound rises to 5 % of the inlet concentration. The desorption step is terminated when the outlet temperature exceeds 95 % of the inlet temperature. Cooling time depends mainly on the required final outlet temperature. In this study a value of 300 K is assumed. The final concentration and temperature profile in the adsorbent bed for each step defines the initial conditions for the next step. It is assumed that the condition for a periodic state is satisfied when the amount removed from the bed during regeneration is equal to the amount accumulated in the bed during the adsorption step. The following equation is used to determine the cyclic steady-state [2]:

Fig. 1. Three-step TSA process with fixed adsorbent bed

$$
\left(\int_0^L q\,dz\right)_{n_c\text{-th cycle}}
- \left(\int_0^L q\,dz\right)_{(n_c-1)\text{-th cycle}} < \delta
\tag{24}
$$

where δ is a value close to zero (in this work δ = 1·10⁻⁵). Approximately 15-20 cycles are needed to achieve the cyclic steady-state, depending on process conditions.

The computer simulation results are used to study the effect of different operating conditions and some of the model parameters on the concentration and temperature breakthrough curves. The effects of adiabatic and non-adiabatic operation, purge gas temperature during the desorption step, boundary conditions, axial diffusion and thermal conductivity were investigated. Exemplary simulation results for the cyclic steady-state are shown in Figures 2 - 11. Typical concentration and temperature breakthrough curves for the adsorption, desorption and cooling steps are shown in Figures 2 - 6. In the case of the desorption step, two transitions are apparent, connected by a plateau. The breakthrough curves were highly influenced by the purge gas temperature (Figure 7) and by heat loss through the adsorption column wall, especially for a small-diameter adsorption column (Figure 8). The modeling results show that the concentration and temperature breakthrough curves obtained using the different boundary conditions, defined by equations (16)-(17) and (18)-(19), are practically identical (Figure 9).

Fig. 2. Concentration breakthrough curve for adsorption step (y [mol/mol] versus t [s])

Fig. 3. Temperature breakthrough curve for adsorption step (T [K] versus t [s])

Fig. 4. Concentration breakthrough curve for desorption step. Purge gas temperature: 394 K. (y [mol/mol] versus t [s])

Fig. 5. Temperature breakthrough curve for desorption step. Purge gas temperature: 394 K. (T [K] versus t [s])

Fig. 6. Temperature breakthrough curve for cooling step. Purge gas temperature during desorption step: 394 K. (T [K] versus t [s])

Fig. 7. Effect of purge gas temperature on concentration breakthrough curve for desorption step (y [mol/mol] versus t [s]; panels: T = 394 K, T = 350 K, T = 310 K)

Fig. 8. Concentration breakthrough curves for adiabatic and non-adiabatic desorption. Purge gas temperature: 394 K (y [mol/mol] versus t [s]; panels: adiabatic; non-adiabatic, D = 1 m; non-adiabatic, D = 0.07 m)

Fig. 9. Concentration breakthrough curves for different boundary conditions. Purge gas temperature: 394 K. (y [mol/mol] versus t [s]; panels: Eqs (16) and (17); Eqs (18) and (19))

Fig. 10. Effect of axial diffusion on concentration breakthrough curve for desorption step. Purge gas temperature: 394 K. (y [mol/mol] versus t [s]; panels: Eq. 10; Dax = 0)

Fig. 11. Effect of axial thermal conductivity on temperature breakthrough curve for desorption step. Purge gas temperature: 394 K. (T [K] versus t [s]; panels: Eq. 11; kax = 0)

The effect of axial diffusion on the concentration breakthrough curve is illustrated in Figure 10. The effective axial diffusion coefficient, Dax, was (i) set equal to zero, and (ii) calculated by equation (10). Figure 11 shows the effect of axial thermal conductivity on the temperature breakthrough curve. The value of kax was varied in the same manner as the axial diffusion coefficient. Neither breakthrough curve was significantly affected by axial thermal conductivity or axial diffusion, but the required computer time was sensitive to the values of Dax and kax.
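The cyclic iteration used to reach the CSS, with the stopping test of Eq. (24), can be sketched as follows. The cycle model here is a hypothetical stand-in (a simple contraction towards a fixed loading) rather than the TSA simulation itself, the integral is estimated with the trapezoidal rule, and the absolute value in the convergence test is an added assumption so that the test also works when the bed inventory decreases between cycles.

```python
def trapezoid(values, dz):
    """Trapezoidal estimate of the integral of q over the bed length."""
    return dz * (sum(values) - 0.5 * (values[0] + values[-1]))


def run_to_css(run_cycle, q_initial, dz, delta=1.0e-5, max_cycles=200):
    """Repeat full cycles until the bed inventory changes by less than delta."""
    q_profile = list(q_initial)
    previous = trapezoid(q_profile, dz)
    for cycle in range(1, max_cycles + 1):
        q_profile = run_cycle(q_profile)
        current = trapezoid(q_profile, dz)
        if abs(current - previous) < delta:
            return q_profile, cycle
        previous = current
    raise RuntimeError("cyclic steady state not reached")


def toy_cycle(q_profile):
    """Hypothetical stand-in for one adsorption/desorption/cooling cycle."""
    return [0.5 * q + 0.1 for q in q_profile]


dz = 0.40 / 29                   # 30 axial nodes over a 0.40 m bed
final_profile, cycles = run_to_css(toy_cycle, [0.0] * 30, dz)
```

In a real simulation `run_cycle` would integrate the model equations through all three steps and return the loading profile at the end of the cycle.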

5 Conclusions

A theoretical study of thermal swing adsorption was carried out. A non-equilibrium, non-adiabatic mathematical model was developed to simulate temperature and concentration breakthrough curves for adsorption and regeneration. The cyclic steady-state (CSS) cycles are obtained under various conditions by a cyclic iteration method. The modeling results were used to study the effect of different operating conditions and some of the model parameters on the concentration and temperature breakthrough curves. The effects of adiabatic and non-adiabatic operation, purge gas temperature during the desorption step, boundary conditions, axial diffusion and thermal conductivity were investigated.

Based on the modeling results the following conclusions are drawn:

(i) The breakthrough curves were highly influenced by purge gas temperature and heat loss through the adsorption column wall.

(ii) The concentration and temperature breakthrough curves obtained using different boundary conditions are practically identical.

(iii) The breakthrough curves were not significantly affected by axial thermal conductivity and axial diffusion.

Symbols

bo – constant in Eq. 23
B – constant in Eq. 23
Cpc – heat capacity of column, J/(mol K)
Cpg – heat capacity of gas, J/(mol K)
Cps – heat capacity of solid, J/(kg K)
D – internal diameter of bed, m
Dax – axial diffusion coefficient, m²/s
DK – Knudsen diffusion coefficient, m²/s
DM – molecular diffusion coefficient, m²/s
Dps – effective particle diffusion coefficient, m²/s
DS – surface diffusion coefficient, m²/s
Dv, Dvg – diffusion volume of adsorbate, inert gas
E – surface diffusion energy of activation, J/mol
G – superficial molar gas flow rate, mol/(m² s)
hf – heat transfer coefficient from the bulk gas phase to the particle, W/(m² K)
hw – heat transfer coefficient from the bulk gas phase to the column wall, W/(m² K)
ΔHa – heat of adsorption of adsorbate, J/mol
k – overall mass transfer coefficient, 1/s
kax – axial thermal conductivity, W/(m K)
kf – film mass transfer coefficient, m/s
kg – thermal conductivity of gas, W/(m K)
L – bed length, m
M, Mg – molecular weight of adsorbate, inert gas, kg/kmol
nc – number of cycles
Nu – Nusselt number
p – partial pressure of adsorbate, Pa
P – total pressure, Pa
Pr – Prandtl number
q – adsorbate concentration in solid phase, mol/kg
qo – constant in Eq. 23
q* – value of q at equilibrium with y, mol/kg
Q – constant in Eq. 23
re – mean pore radius, Å
R – gas constant, J/(mol K)
Re – Reynolds particle number
Rp – particle radius, m
Sc – Schmidt number
Sh – Sherwood number
t – time, s
T – temperature, K
Tc – column wall temperature, K
Tg – gas temperature within the bed, K
Tgo – gas temperature at the feed conditions, K
Ts – solid phase temperature, K
Tamb – ambient temperature, K
U – overall heat transfer coefficient for column insulation, W/(m² K)
y – mole fraction of adsorbate in the gas phase, mol/mol
yo – mole fraction of adsorbate in the gas phase at the feed conditions, mol/mol
y* – mole fraction of adsorbate in the gas phase in equilibrium with q, mol/mol
z – axial distance, m
αair – ratio of the log mean surface area of the insulation to the volume of the column wall, 1/m
αc – ratio of the internal surface area to the volume of the column wall, 1/m
αp – particle external surface area to volume ratio, 1/m
ε – bed voidage
εp – particle porosity
ρc – column density, kg/m³
ρg – gas density, mol/m³
ρp – particle density, kg/m³
τp, τs – pore, surface tortuosity factor


References

[1] Bathen, D., Breitbach, M.: Adsorptionstechnik. Springer, Berlin (2001)

[2] Ko, D., Moon, I., Choi, D.-K.: Analysis of the Contact Time in Cyclic Thermal Swing Adsorption Process. Ind. Eng. Chem. Res. 41, 1603 (2002)

[3] Ding, Y., LeVan, M.D.: Periodic States of Adsorption Cycles III. Convergence Acceleration for Direct Determination. Chem. Eng. Sci. 56, 5217 (2001)

[4] Schork, J.M., Fair, J.R.: Parametric Analysis of Thermal Regeneration of Adsorption Beds. Ind. Eng. Chem. Res. 27, 457 (1988)

[5] Yun, J.-H., Choi, D.-K., Moon, H.: Benzene Adsorption and Hot Purge Regeneration in Activated Carbon Beds. Chem. Eng. Sci. 55, 5857 (2000)

[6] Huang, C.-C., Fair, J.R.: Study of the Adsorption and Desorption of Multiple Adsorbates in a Fixed Bed. AIChE J. 34, 1861 (1988)

[7] Wakao, N., Funazkri, T.: Effect of Fluid Dispersion Coefficients on Particle-to-Fluid Mass Transfer Coefficients in Packed Beds. Chem. Eng. Sci. 33, 1375 (1978)

[8] Wakao, N., Chen, B.H.: Some Models for Unsteady-state Heat Transfer in Packed Bed Reactors. In: Kulkarni, B., Mashelkar, R., Sharma, M. (eds.) Recent Trends in Chemical Reaction Engineering, vol. 1, p. 254. Wiley Eastern Ltd., New Delhi (1987)

[9] Sinnott, R.K.: Coulson & Richardson’s Chemical Engineering, vol. 6. Butterworth-Heinemann, Oxford (1999)

[10] Schiesser, W.E.: The Numerical Method of Lines. Academic Press, California (1991)

[11] Valenzuela, D.P., Myers, A.L.: Adsorption Equilibrium Data Handbook. Prentice-Hall, Englewood Cliffs (1989)


The Stress Field Induced Diffusion

Marek Danielewski1, Bartłomiej Wierzba2, and Maciej Pietrzyk2

1 Faculty of Materials Science and Ceramics, AGH University of Science and Technology, Al. Mickiewicza 30, 30-059 Cracow, Poland [email protected]

2 Faculty of Metals Engineering and Industrial Computer Science, AGH University of Science and Technology, Al. Mickiewicza 30, 30-059 Cracow, Poland [email protected], [email protected]

Abstract. A mathematical description of mass transport in a multicomponent solution is presented. The model is based on the Darken concept of the drift velocity. In order to present an example of a real system, we restrict the analysis to isotropic solids and liquids for which the Navier equation holds. The diffusion of components depends on the chemical potential gradients and on the stress that can be induced by the diffusion and by the boundary and/or initial conditions. In such a quasi-continuum the energy, momentum and mass transport are diffusion controlled and the fluxes are given by the Nernst-Planck formulae. It is shown that the Darken method combined with the Navier equations is valid for solid solutions as well as multicomponent liquids.

Keywords: stress, interdiffusion, Navier equation, multicomponent solution, alloys, drift velocity.

1 Introduction

The new understanding of diffusion in multicomponent systems started with Kirkendall's experiments on the interdiffusion (ID) between Cu and Zn. The experiments proved that diffusion by direct interchange of atoms, the prevailing idea of the day, was incorrect and that a less-favored theory, the vacancy mechanism, must be considered. In 1946, Kirkendall, along with his student, Alice Smigelskas, co-authored a paper asserting that ID between Cu and Zn in brass shows movement of the interface between the “initially different phases” due to ID. This discovery, known since then as the “Kirkendall effect”, supported the idea that atomic diffusion occurs through vacancy exchange [1]. It shows the different intrinsic diffusion fluxes of the components, which cause swelling (creation) of one part and shrinkage (annihilation) of the other part of the diffusion couple. The key conclusion is that the local movement of a solid (its lattice) and of a liquid due to diffusion is a real process. Once the solution is non-uniform and the mobilities differ from each other, then a vast number of phenomena can occur: the Kirkendall marker movement, the formation of Kirkendall-Frenkel voids, stress generation, etc. The concepts initiated by Kirkendall played a decisive role in the development of the diffusion theory [2,3]. The progress in the understanding of the ID phenomenology [4] nowadays allows an attempt to further generalize the Darken method. The Darken method for multicomponent solutions is based on the postulate that the total mass flow is a sum of diffusion and drift flow [4]. The force arising from gradients causes the atoms of a particular component to move with a velocity which in general may differ from the velocity of the atoms of other components. The medium is common for all the species and all the fluxes are coupled. Thus, their local changes can affect the common drift velocity, υdrift. The physical laws that govern the process are the continuity equations and the postulate that the total molar concentration of the solution is constant. The extended Darken method in one dimension [4] allows modeling the positions of the solution boundaries, the densities and the drift velocity. The physical laws are the same as in the original Darken model. All the important differences are in the formulation of the initial and boundary conditions. The model allows modelling ID for an arbitrary initial distribution of the components, in the case of moving boundaries, of reactions, and in many other situations. The uniqueness and existence of the solution, the effective methods of numerical solution and the successful modelling of the “diffusional structures” (“up-hill diffusion”) prove the universality of the drift concept. It offers the sole opportunity to describe ID in real solutions and in three dimensions, which is an objective of this work. The presented model is solvable and there exists a unique solution of it [4].

2 The Darken Method

The core of the Darken method is the mass balance equation:

$$\frac{\partial \rho_i}{\partial t} = -\operatorname{div} J_i, \qquad i = 1, \ldots, r \tag{1}$$

and the postulated form of the flux of the i-th element, $J_i$, which contains the diffusive and the drift terms:

$$J_i = J_i^d + \rho_i \upsilon^{\mathrm{drift}}, \qquad i = 1, \ldots, r \tag{2}$$

where $\upsilon^{\mathrm{drift}}$ denotes the drift velocity, $J_i^d$ is the diffusion flux and $r$ is the number of components. The mass balance equation can be written in the internal reference frame (relative to the drift velocity).

Thus from Eqs. (1) and (2) it follows:

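$$\left.\frac{\mathrm{D}\rho_i}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}} = -\operatorname{div} J_i^d - \rho_i \operatorname{div} \upsilon^{\mathrm{drift}} = -\operatorname{div}\left(\rho_i x_i^d\right) - \rho_i \operatorname{div} \upsilon^{\mathrm{drift}}, \qquad i = 1, \ldots, r \tag{3}$$

where $x_i^d$ is the diffusion velocity. The derivative in Eq. (3) is called the Lagrangian, substantial or material derivative: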

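$$\left.\frac{\mathrm{D}\rho_i}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}} = \frac{\partial \rho_i}{\partial t} + \upsilon^{\mathrm{drift}} \cdot \operatorname{grad} \rho_i \tag{4}$$

It gives the rate of density change at a point moving with an arbitrary velocity, which here is the drift velocity.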
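The step from Eqs. (1) and (2) to Eq. (3) is not spelled out in the text. A short worked derivation, using only the product rule for the divergence and the definition in Eq. (4), might read as follows:

$$
\begin{aligned}
\frac{\partial \rho_i}{\partial t}
  &= -\operatorname{div}\left(J_i^d + \rho_i \upsilon^{\mathrm{drift}}\right) \\
  &= -\operatorname{div} J_i^d - \rho_i \operatorname{div} \upsilon^{\mathrm{drift}} - \upsilon^{\mathrm{drift}} \cdot \operatorname{grad} \rho_i , \\
\frac{\partial \rho_i}{\partial t} + \upsilon^{\mathrm{drift}} \cdot \operatorname{grad} \rho_i
  &= -\operatorname{div} J_i^d - \rho_i \operatorname{div} \upsilon^{\mathrm{drift}} .
\end{aligned}
$$

The left-hand side of the last line is the material derivative of Eq. (4), which gives Eq. (3).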

The Stress Field Induced Diffusion 181

The generally accepted form of the diffusion flux is the Nernst-Planck equation [5,6]:

$$J_i^d = \rho_i B_i F_i \tag{5}$$

where $B_i$ and $F_i$ are the mobility of the i-th component and the force acting on it:

$$F_i = -\operatorname{grad} \mu_i \tag{6}$$

Upon combining Eqs. (3), (5) and (6) the continuity equation becomes:

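$$\left.\frac{\mathrm{D}\rho_i}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}} = \operatorname{div}\left[\rho_i B_i \operatorname{grad} \mu_i\right] - \rho_i \operatorname{div} \upsilon^{\mathrm{drift}}, \qquad i = 1, \ldots, r \tag{7}$$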

3 The Diffusion and Stress in the Multi Component Solution

3.1 Mass Balance

For all processes that obey the mass conservation law and in which chemical and/or nuclear reactions are not allowed (the reaction term can be omitted), the equation of mass conservation, Eq. (3), holds. It is postulated here that the drift velocity is the sum of the Darken drift velocity $\upsilon^D$ (generated by the interdiffusion) and the deformation velocity $\upsilon^\sigma$ (generated by the stress):

$$\upsilon^{\mathrm{drift}} = \upsilon^D + \upsilon^\sigma \tag{8}$$

Darken postulated that diffusion fluxes are local and defined exclusively by the local forcing (e.g., the chemical potential gradient, stress field, electric field etc.). He postulated the existence of a unique average velocity, which he called the drift velocity. In this work, we generalize the original Darken concept to include the elastic deformation of an alloy. Darken's drift velocity, $\upsilon^D$, is given by [2]:

$$\upsilon^D \overset{\mathrm{df}}{=} \frac{1}{c}\sum_{i=1}^{r} c_i x_i - \frac{1}{c}\sum_{i=1}^{r} c_i x_i^d - \upsilon^\sigma \tag{9}$$

The average total and the diffusion velocities are given by:

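$$\upsilon \overset{\mathrm{df}}{=} \frac{1}{c}\sum_{i=1}^{r} c_i x_i \tag{10}$$

$$\upsilon^d \overset{\mathrm{df}}{=} \frac{1}{c}\sum_{i=1}^{r} c_i x_i^d \tag{11}$$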

The diffusion velocity of the i-th component and the concentration of the solution are defined by:


$$J_i^d = c_i x_i^d \tag{12}$$

$$c = \sum_{i=1}^{r} c_i \tag{13}$$

From Eqs. (8) – (12), the following relations for the flux of the i-th element and its velocity hold:

$$J_i = J_i^d + c_i \upsilon^D + c_i \upsilon^\sigma \tag{14}$$

$$x_i = \upsilon^D + \upsilon^\sigma + x_i^d = \upsilon^{\mathrm{drift}} + x_i^d \tag{15}$$

Upon summing Eq. (14) over all components, the average local velocities satisfy Eq. (9):

$$\sum_{i=1}^{r} c_i x_i^d = \sum_{i=1}^{r} c_i x_i - c\,\upsilon^D - c\,\upsilon^\sigma$$

and from Eqs. (10), (11) and (14) it follows that

$$\upsilon^d = \upsilon - \upsilon^{\mathrm{drift}} = \upsilon - \upsilon^D - \upsilon^\sigma \tag{16}$$
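The bookkeeping of Eqs. (8)-(16) can be checked numerically at a single point. The sketch below is only an illustration: the function name, the one-dimensional scalar velocities and all numerical values are assumptions and are not taken from the chapter.

```python
# Illustrative check of Eqs. (8)-(16) at one point of a 1D solution.
# All input numbers are hypothetical.

def average_velocities(c, x_d, v_darken, v_sigma):
    """Return (v, v_d, v_drift) for concentrations c_i and diffusion velocities x_i^d."""
    c_total = sum(c)                                          # Eq. (13)
    v_drift = v_darken + v_sigma                              # Eq. (8)
    x = [v_drift + xd for xd in x_d]                          # Eq. (15)
    v = sum(ci * xi for ci, xi in zip(c, x)) / c_total        # Eq. (10)
    v_d = sum(ci * xd for ci, xd in zip(c, x_d)) / c_total    # Eq. (11)
    return v, v_d, v_drift


c = [0.2, 0.3, 0.5]                 # hypothetical concentrations
x_d = [1.0e-9, -2.0e-9, 0.5e-9]     # hypothetical diffusion velocities
v, v_d, v_drift = average_velocities(c, x_d, v_darken=3.0e-10, v_sigma=-1.0e-10)

# Eq. (16): the average diffusion velocity is the total average velocity
# measured relative to the drift velocity.
assert abs(v_d - (v - v_drift)) < 1e-18
```

The check passes for any input, because Eq. (16) is an identity once Eqs. (8), (10), (11) and (15) are adopted.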

The postulate of the drift velocity allows rewriting Eq. (3) in the following form:

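$$\left.\frac{\mathrm{D}c_i}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}} + \operatorname{div}\left(c_i x_i^d\right) + c_i \operatorname{div} \upsilon^{\mathrm{drift}} = 0 \tag{17}$$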

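Summing Eq. (17) over all components and using Eqs. (11), (13) and (16), it is easy to show that

$$\left.\frac{\mathrm{D}c}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}} + \operatorname{div}\left(c\,\upsilon\right) - \upsilon^{\mathrm{drift}} \cdot \operatorname{grad} c = 0 \tag{18}$$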

and finally

$$\frac{\partial c}{\partial t} + \operatorname{div}\left(c\,\upsilon\right) = 0 \tag{19}$$

Thus, we have obtained the well-known formula for mass conservation in a multicomponent solution.
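The last step, from Eq. (18) to Eq. (19), relies only on the definition of the material derivative in Eq. (4), applied to the total concentration $c$. As a brief worked check:

$$
\begin{aligned}
\left.\frac{\mathrm{D}c}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}}
  &= \frac{\partial c}{\partial t} + \upsilon^{\mathrm{drift}} \cdot \operatorname{grad} c , \\
\frac{\partial c}{\partial t} + \upsilon^{\mathrm{drift}} \cdot \operatorname{grad} c
  + \operatorname{div}\left(c\,\upsilon\right) - \upsilon^{\mathrm{drift}} \cdot \operatorname{grad} c
  &= \frac{\partial c}{\partial t} + \operatorname{div}\left(c\,\upsilon\right) = 0 .
\end{aligned}
$$

The convective terms cancel, so the result does not depend on the choice of the reference velocity.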

3.2 Stress and Strain Relations

The general form of the equation of motion for an elastic solid is very complex. We will use the results obtained for an isotropic material. In such a case the equation of motion reduces to the vector equation $\mathbf{f} = (\lambda + \mu)\operatorname{grad}\operatorname{div}\mathbf{u} + \mu\operatorname{div}\operatorname{grad}\mathbf{u}$, where $\mathbf{f}$ is the density of the force induced by the displacement vector $\mathbf{u}$. It shows that an isotropic material has only two elastic constants. To get the equation of motion for such a material, we can set $\mathbf{f} = \rho\,\partial^2 \mathbf{u}/\partial t^2$ and, upon neglecting any body forces such as gravity, one gets [7]:

$$\rho \frac{\partial^2 \mathbf{u}}{\partial t^2} = (\lambda + \mu)\operatorname{grad}\operatorname{div}\mathbf{u} + \mu\operatorname{div}\operatorname{grad}\mathbf{u}.$$

An elastic body is defined as a material for which the stress tensor depends only on a deformation tensor F,

$$\boldsymbol{\sigma} = \boldsymbol{\sigma}(\mathbf{F}) \tag{20}$$

We postulate in this work that the displacements are small. In such a case the displacement gradient, H, is defined as the gradient of the displacement vector (u = x – X):

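$$\mathbf{H} = \operatorname{grad}\mathbf{u} = \mathbf{F} - \mathbf{1} \tag{21}$$

and the strain tensor is the symmetric part of $\mathbf{H}$:

$$\boldsymbol{\varepsilon} = \frac{1}{2}\left(\mathbf{H} + \mathbf{H}^T\right) \tag{22}$$

where

$$\varepsilon_{kl} = \frac{1}{2}\left(u_{k,l} + u_{l,k}\right)$$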

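The constitutive equation of an isotropic, linear and elastic body is known as Hooke's law [8]:

$$\boldsymbol{\sigma} = \lambda\left(\operatorname{tr}\boldsymbol{\varepsilon}\right)\mathbf{1} + 2\mu\,\boldsymbol{\varepsilon} \tag{23}$$

where $\lambda$ and $\mu$ denote the Lamé coefficients:

$$\lambda = \frac{\nu E}{(1 + \nu)(1 - 2\nu)} \quad \text{and} \quad \mu = \frac{E}{2(1 + \nu)} \tag{24}$$

where $E$ denotes Young's modulus and $\nu$ is Poisson's ratio. The divergence of the stress tensor defined by Eq. (23) equals [9]:

$$\operatorname{div}\boldsymbol{\sigma} = (\lambda + \mu)\operatorname{grad}\operatorname{div}\mathbf{u} + \mu\operatorname{div}\operatorname{grad}\mathbf{u} \tag{25}$$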
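Equations (23) and (24) translate directly into a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and the Young's modulus, Poisson's ratio and strain used in the example are assumed, steel-like values rather than data from the chapter.

```python
# Illustrative sketch of Eqs. (23) and (24); names and numbers are assumptions.

def lame_coefficients(E, nu):
    """Lame coefficients (lambda, mu) from Young's modulus E and Poisson's ratio nu, Eq. (24)."""
    lam = nu * E / ((1.0 + nu) * (1.0 - 2.0 * nu))
    mu = E / (2.0 * (1.0 + nu))
    return lam, mu


def hooke_stress(eps, lam, mu):
    """Stress tensor from a 3x3 small-strain tensor (nested lists), Eq. (23)."""
    trace = eps[0][0] + eps[1][1] + eps[2][2]
    return [[lam * trace * (1.0 if k == l else 0.0) + 2.0 * mu * eps[k][l]
             for l in range(3)]
            for k in range(3)]


# Hypothetical input: E = 200 GPa, nu = 0.3, uniaxial strain of 1e-4 along x.
lam, mu = lame_coefficients(200.0e9, 0.3)
eps = [[1.0e-4, 0.0, 0.0],
       [0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]
sigma = hooke_stress(eps, lam, mu)
print(sigma[0][0], sigma[1][1])  # axial and lateral normal stresses in Pa
```

Because the strain tensor of Eq. (22) is symmetric, the stress returned by `hooke_stress` is symmetric as well, which is the property stated later in Eq. (28).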

3.3 Momentum Balance

The Navier and Navier-Lamé equations describe the momentum balance in a compressible fluid and in an isotropic solid, respectively. The relations for the momentum and the moment of momentum obtained in the theory of mass transport in a continuum in which diffusion takes place [10] allow us to postulate the following relation:

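$$\rho \left.\frac{\mathrm{D}\upsilon}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}} = \operatorname{div}\boldsymbol{\sigma}^{*} + \rho\,\mathbf{f}_b \tag{26}$$

where $\boldsymbol{\sigma}^{*}$ and $\mathbf{f}_b$ denote the overall Cauchy stress tensor and the body force, respectively.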


When interdiffusion is analyzed it is convenient to express momentum balance as a function of concentrations. Thus upon dividing Eq. (26) by the overall molar mass and using Eq. (15) one gets:

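$$c \left.\frac{\mathrm{D}\upsilon^{\mathrm{drift}}}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}} = \operatorname{div}\boldsymbol{\sigma} + c\,\mathbf{f}_b - c \left.\frac{\mathrm{D}\upsilon^{d}}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}} \tag{27}$$

where $\boldsymbol{\sigma}$ and $\mathbf{f}_b$ denote the overall stress tensor defined by Eq. (23) and the body force, respectively. In Eqs. (26) and (27) we postulate that the drift velocity defines the local frame of reference. In the analyzed case of a regular, cubic and elastic crystal the following relation holds [9,11]:

$$\boldsymbol{\sigma} - \boldsymbol{\sigma}^T = 0 \tag{28}$$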

3.4 Diffusion and Other Fluxes

The diffusion flux of the i-th component, Eq. (14), depends on both the stress and the chemical potential gradient, Eqs. (5) and (6). Following Darken, the total flux is the sum of the diffusion and drift terms:

$$J_i = -c_i B_i \operatorname{grad}\left(\mu_i + \Omega_i p\right) + c_i \upsilon^D + c_i \upsilon^\sigma \tag{29}$$

where $\Omega_i$ denotes the molar volume of the i-th component. Moreover, we limit the free energy density to the isostatic stress component. Keeping only the diagonal terms [12,11], one gets:

$$p = -\frac{1}{3}\operatorname{tr}\boldsymbol{\sigma} \tag{30}$$
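For the Hookean solid of Eq. (23), the pressure in Eq. (30) can be tied to the volumetric strain. The short derivation below is a standard consequence of Eqs. (23), (24) and (30); the symbol $K$ for the bulk modulus is introduced here for convenience and does not appear in the original text.

$$
\begin{aligned}
\operatorname{tr}\boldsymbol{\sigma} &= 3\lambda \operatorname{tr}\boldsymbol{\varepsilon} + 2\mu \operatorname{tr}\boldsymbol{\varepsilon} , \\
p &= -\frac{1}{3}\operatorname{tr}\boldsymbol{\sigma}
   = -\left(\lambda + \frac{2}{3}\mu\right)\operatorname{tr}\boldsymbol{\varepsilon}
   = -K \operatorname{tr}\boldsymbol{\varepsilon} ,
\qquad K = \lambda + \frac{2}{3}\mu = \frac{E}{3(1 - 2\nu)} .
\end{aligned}
$$

A local dilatation ($\operatorname{tr}\boldsymbol{\varepsilon} > 0$) therefore corresponds to a negative pressure, and a compression to a positive one.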

The free energy density (pressure) gradient will induce the diffusion flux of elements if their molar volumes differ [12]. The Nernst-Einstein equation relates the mobility and the self diffusion coefficient [12]:

$$D_i = B_i k T \tag{31}$$

where k is the Boltzmann constant and T the absolute temperature.
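As a small numerical illustration, Eq. (31) can be inverted to estimate the mobilities from the self-diffusion coefficients quoted in Section 4 for 1273 K. The function and variable names are hypothetical, and keeping the diffusivities in cm² s⁻¹ while expressing $kT$ in joules is a choice made here, not one stated in the chapter.

```python
# Illustrative use of Eq. (31), D_i = B_i k T, solved for the mobility B_i.
# Names and unit choices are assumptions of this sketch.

BOLTZMANN = 1.380649e-23  # J/K (exact value in the SI)


def mobility(D, T):
    """Mobility B_i = D_i / (k T) for a self-diffusion coefficient D at temperature T."""
    return D / (BOLTZMANN * T)


# Self-diffusion coefficients at 1273 K quoted in Section 4, in cm^2/s.
D_1273K = {"Ni": 1.106e-11, "Fe": 1.923e-11, "Cr": 2.788e-11}

# Resulting mobilities in cm^2 s^-1 J^-1.
B_1273K = {element: mobility(D, 1273.0) for element, D in D_1273K.items()}

for element, B in B_1273K.items():
    print(element, B)
```

With these inputs the mobilities come out in the range of roughly 10⁸ to 10⁹ cm² s⁻¹ J⁻¹ and follow the same ordering as the diffusivities (Cr > Fe > Ni), since $kT$ is common to all components.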

3.5 Physical Laws (The Integral Form)

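The mass conservation law has the form:

$$\left.\frac{\mathrm{D}}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}} \int_{\beta(t)} c \,\mathrm{d}\vartheta + \int_{\partial\beta(t)} c\,\upsilon^{d} \cdot \mathrm{d}\mathbf{s} = 0$$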

Using Eq. (16), it can be written in terms of the drift velocity. The following formulae then form the integral balance equations for a multicomponent solution [13]:

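$$\left.\frac{\mathrm{D}}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}} \int_{\beta(t)} c \,\mathrm{d}\vartheta + \int_{\partial\beta(t)} c\,\upsilon \cdot \mathrm{d}\mathbf{s} - \int_{\partial\beta(t)} c\,\upsilon^{\mathrm{drift}} \cdot \mathrm{d}\mathbf{s} = 0 \tag{32}$$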


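$$\left.\frac{\mathrm{D}}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}} \int_{\beta(t)} c\,\upsilon \,\mathrm{d}\vartheta = \int_{\partial\beta(t)} \boldsymbol{\sigma}\,\mathrm{d}\mathbf{s} + \int_{\beta(t)} c\,\mathbf{f}_b \,\mathrm{d}\vartheta \tag{33}$$

$$\left.\frac{\mathrm{D}}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}} \int_{\beta(t)} \mathbf{x} \times c\,\upsilon \,\mathrm{d}\vartheta = \int_{\partial\beta(t)} \mathbf{x} \times \boldsymbol{\sigma}\,\mathrm{d}\mathbf{s} + \int_{\beta(t)} \mathbf{x} \times c\,\mathbf{f}_b \,\mathrm{d}\vartheta \tag{34}$$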

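$$\left.\frac{\mathrm{D}}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}} \int_{\beta(t)} \left( c\,e + \frac{1}{2}\sum_{i=1}^{r} c_i \left(\upsilon^{\mathrm{drift}} + x_i^d\right)^2 \right) \mathrm{d}\vartheta = \int_{\partial\beta(t)} \upsilon\,\boldsymbol{\sigma}\,\mathrm{d}\mathbf{s} - \int_{\partial\beta(t)} \mathbf{q}_T \cdot \mathrm{d}\mathbf{s} + \int_{\beta(t)} c\,\mathbf{f}_b \cdot \upsilon \,\mathrm{d}\vartheta + \int_{\beta(t)} q_B \,\mathrm{d}\vartheta \tag{35}$$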

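$$\left.\frac{\mathrm{D}}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}} \int_{\beta(t)} c\,\eta \,\mathrm{d}\vartheta \geq -\int_{\partial\beta(t)} \frac{\mathbf{q}_T}{T} \cdot \mathrm{d}\mathbf{s} + \int_{\beta(t)} \frac{q_B}{T} \,\mathrm{d}\vartheta \tag{36}$$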

where $e$, $\mathbf{q}_T$, $q_B$ and $\eta$ denote the specific internal energy, heat flux (vector of heat transfer), vector of heat source per unit mass produced by internal sources and density of energy production, respectively.

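The integral equations allow us to derive the following self-consistent set of differential equations [13]:

$$\left.\frac{\mathrm{D}c}{\mathrm{D}t}\right|_{\upsilon^{\mathrm{drift}}} + c \operatorname{div} \upsilon^{\mathrm{drift}} + \operatorname{div}\left(c\,\upsilon^{d}\right) = 0 \tag{37}$$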

( )D Ddiv div 0

D Ddrift drift

d driftd

bc c c ct tυ υ

υ υ υ υ+ − − =+fσ (38)

( ) ( )

( )1

1

DD Ddiv div div

D D D

1div : grad 0

2

drift drift drift

dd rd d i

i i Ti

rd

i i i i Bi

xec e c c c c x

t t t

x x c x

υ υ υ

υυ υ υυ υ

υ

=

=

− + + − +

+ + + =

− −∑

∑

q

qσ (39)

$$\boldsymbol{\sigma} - \boldsymbol{\sigma}^T = 0 \tag{40}$$

( ) ( )

( )1

1

DD Ddiv div

D D D

1div : grad 0

2

drift drift drift

dd rd d i

i ii

rd

i i i ii

xc c c c c x

t t t

x x c x

υ υ υ

ψ υψ υ υ υυ υ

υ

=

=

− + + − +

+ + ≥

− ∑

∑ σ (41)

where the specific free energy is defined as $\psi = e - T\eta$.

4 Results

A solution of the above model exists. At present we solve the problem numerically in one dimension using the finite difference method (FDM).
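The chapter does not list its numerical scheme, so the following is only a minimal sketch of how a one-dimensional finite-difference treatment of Eqs. (1) and (2) could look. It rests on simplifying assumptions that are not those of the full model: an ideal solution ($J_i^d = -D_i\,\partial c_i/\partial x$, no thermodynamic factor), no stress ($\upsilon^\sigma = 0$), mole fractions that sum to one, and a closed system, for which the drift velocity follows from $\sum_i J_i = 0$. The function names, the grid and the initial step profile are hypothetical; only the diffusivities are the values quoted later in this section.

```python
# Minimal explicit finite-difference sketch of Darken interdiffusion in 1D.
# Assumptions (not from the chapter): ideal solution, no stress, closed system,
# mole fractions summing to 1 in every cell, hypothetical grid and initial profile.

def darken_step(c, D, dx, dt):
    """Advance mole fractions c[i][j] (component i, cell j) by one explicit time step."""
    r, n = len(c), len(c[0])
    # Total fluxes at the n + 1 cell faces; the two outer faces stay zero (closed system).
    flux = [[0.0] * (n + 1) for _ in range(r)]
    for j in range(1, n):
        # Diffusion fluxes at the face between cells j - 1 and j.
        jd = [-D[i] * (c[i][j] - c[i][j - 1]) / dx for i in range(r)]
        # Drift velocity from sum_i J_i = 0 with total mole fraction 1.
        v_drift = -sum(jd)
        for i in range(r):
            c_face = 0.5 * (c[i][j] + c[i][j - 1])
            flux[i][j] = jd[i] + c_face * v_drift       # Eq. (2)
    # Conservative update of Eq. (1).
    return [[c[i][j] - dt * (flux[i][j + 1] - flux[i][j]) / dx for j in range(n)]
            for i in range(r)]


def simulate(c0, D, dx, dt, steps):
    """Run the explicit scheme for a number of steps and return the final profiles."""
    c = [row[:] for row in c0]
    for _ in range(steps):
        c = darken_step(c, D, dx, dt)
    return c


# Components ordered as Cr, Fe, Ni; diffusivities at 1273 K in cm^2/s.
D = [2.788e-11, 1.923e-11, 1.106e-11]
# Hypothetical diffusion couple: 40 cells of 5 micrometres, step profile in the middle.
c0 = [[0.1] * 20 + [0.3] * 20,
      [0.2] * 20 + [0.5] * 20,
      [0.7] * 20 + [0.2] * 20]
profiles = simulate(c0, D, dx=5.0e-4, dt=1000.0, steps=360)  # 100 h of annealing
```

The flux form guarantees that the amount of each component is conserved, and the drift term keeps the sum of the mole fractions equal to one in every cell. The time step is chosen well below the explicit stability limit $\Delta x^2 / (2 D_{\max})$.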

The Cr-Fe-Ni system has been chosen for demonstration. Interdiffusion in the closed Cr-Fe-Ni system has been modelled using the FDM and compared with experimental results. The following data have been used for the calculations:


(a) The initial distribution of concentrations, Fig. 1.

(b) The activities of the components (thermodynamic data).

(c) The calculated average self-diffusion coefficients at 1273 K:

$$
\begin{aligned}
D_{Ni} &= 1.106 \cdot 10^{-11}\ [\mathrm{cm^2\,s^{-1}}] \\
D_{Fe} &= 1.923 \cdot 10^{-11}\ [\mathrm{cm^2\,s^{-1}}] \\
D_{Cr} &= 2.788 \cdot 10^{-11}\ [\mathrm{cm^2\,s^{-1}}]
\end{aligned}
$$

Fig. 1. Initial concentration profiles of Cr-Ni-Fe

In Figure 2 the calculated concentration profiles of Cr, Fe and Ni are compared with the experimental results and show satisfactory agreement:

Fig. 2. Calculated concentration profiles of Cr-Ni-Fe


Figure 3 shows the calculated drift velocity of the diffusion couple shown in Fig. 1:

Fig. 3. Calculated drift velocity

In Figure 4 the pressure distribution is shown.

Fig. 4. Pressure diagram

The above figures illustrate the evolution of the concentration, the drift velocity and the pressure. Comparison of the simulation data with experiment shows that the model is valid and that the mathematical description of interdiffusion and stress is an effective tool for simulating these processes.



The following conclusions can be drawn:

a) The mathematical description of interdiffusion in multicomponent systems has been formulated. For the known thermodynamic data and diffusivities, the evolution of the concentration profiles and drift velocity can be computed.

b) Effective formulae enable us to calculate the concentration profiles and the drift velocity as a function of time and position.

c) The model was applied to the modelling of interdiffusion in the Cr-Fe-Ni diffusion couple. The calculated concentration profiles were consistent with the experimental results.

d) The Navier–Stokes and Navier-Lamé equations have been effectively used for multicomponent solutions in which the concentrations are not uniform.

Acknowledgments

This work has been supported by the MNiI No. 11.11.110.643, under Grant No. 4 T08C 03024, and under Grant No. 3 T08C 044 30, financed during the period 2006-2008.

References

[1] Smigelskas, A.D., Kirkendall, E.: Trans. AIME 171, 130 (1947)
[2] Darken, L.S.: Trans. AIME 174, 184 (1948)
[3] Danielewski, M.: Defect and Diffusion Forum 95–98, 125 (1993)
[4] Holly, K., Danielewski, M.: Phys. Rev. B 50, 13336 (1994)
[5] Nernst, W.: Z. Phys. Chem. 4, 129 (1889)
[6] Planck, M.: Ann. Phys. Chem. 40, 561 (1890)
[7] Feynman, R.P., Leighton, R.B., Sands, M.: The Feynman Lectures on Physics. Addison-Wesley, London (1964)
[8] Cottrell, A.H.: The mechanical properties of matter. John Wiley & Sons Inc., New York (1964)
[9] Landau, L.D., Lifszyc, E.M.: Theory of elasticity, Nauka, Moscow (1987)
[10] Danielewski, M., Krzyżański, W.: The Conservation of Momentum and Energy in Open Systems. Phys. Stat. Sol. 145, 351 (1994)
[11] Stephenson, G.B.: Acta metall. 36, 2663 (1988)
[12] Philibert, J.: Diffusion and Stress. In: Defect and Diffusion Forum, pp. 129–130 (1996)
[13] Danielewski, M., Wierzba, B.: The Unified Description of Interdiffusion in Solids and Liquids. In: Proc. Conf. 1st International Conference on Diffusion in Solids and Liquids, Aveiro, Portugal, p. 113 (2005)

Author Index

Ambrozek, Bogdan 165
Castillo, Oscar 43
Danielewski, Marek 179
Garus, Jerzy 71
Hrebien, Maciej 137
Korbicz, Jozef 137
Kusiak, J. 153
Matthaus, Franziska 109
Melin, Patricia 43
Merkwirth, Christian 119
Mielczarek, Norbert 1
Mitkowski, Wojciech 99
Ogorzałek, Maciej J. 119
Pietrzyk, Maciej 179
Porada, Ryszard 1
Rauch, L. 153
Skruch, Paweł 85, 99
Sydorets, V. 29
Vladimirov, Vsevolod 21
Wichard, Jorg 119
Wierzba, Bartłomiej 179
Wrobel, Jacek 21
