

METHODS OF APPLIED MATHEMATICS FOR ENGINEERS AND SCIENTISTS

Based on course notes from more than twenty years of teaching engineering and physical sciences at Michigan Technological University, Tomas Co's engineering mathematics textbook is rich with examples, applications, and exercises. Professor Co uses analytical approaches to solve smaller problems to provide mathematical insight and understanding, and numerical methods for large and complex problems. The book emphasizes applying matrices with strong attention to matrix structure and computational issues such as sparsity and efficiency. Chapters on vector calculus and integral theorems are used to build coordinate-free physical models, with special emphasis on orthogonal coordinates. Chapters on ordinary differential equations and partial differential equations cover both analytical and numerical approaches. Topics on analytical solutions include similarity transform methods, direct formulas for series solutions, bifurcation analysis, Lagrange-Charpit formulas, shocks/rarefaction, and others. Topics on numerical methods include stability analysis, differential algebraic equations, high-order finite-difference formulas, Delaunay meshes, and others. MATLAB implementations of the methods and concepts are fully integrated.

Tomas Co is an associate professor of chemical engineering at Michigan Technological University. After completing his PhD in chemical engineering at the University of Massachusetts at Amherst, he was a postdoctoral researcher at Lehigh University, a visiting researcher at Honeywell Corp., and a visiting professor at Korea University. He has been teaching applied mathematics to graduate and advanced undergraduate students at Michigan Tech for more than twenty years. His research areas include advanced process control, including plantwide control, nonlinear control, and fuzzy logic. His journal publications span broad areas in such journals as IEEE Transactions on Automatic Control, Automatica, AIChE Journal, Computers in Chemical Engineering, and Chemical Engineering Progress. He has been nominated twice for the Distinguished Teaching Awards at Michigan Tech and is a member of the Michigan Technological University Academy of Teaching Excellence.


Methods of Applied Mathematics for Engineers and Scientists

Tomas B. Co
Michigan Technological University


Cambridge University Press
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Mexico City

Cambridge University Press
32 Avenue of the Americas, New York, NY 10013-2473, USA

www.cambridge.org
Information on this title: www.cambridge.org/9781107004122

© Tomas B. Co 2013

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

    First published 2013

    Printed in the United States of America

    A catalog record for this publication is available from the British Library.

    Library of Congress Cataloging in Publication Data

Co, Tomas B., 1959–
Methods of applied mathematics for engineers and scientists : analytical and numerical approaches / Tomas B. Co, Michigan Technological University.
pages cm
Includes bibliographical references and index.
ISBN 978-1-107-00412-2 (hardback)
1. Matrices. 2. Differential equations – Numerical solutions. I. Title.
QA188.C63 2013
512.9'434–dc23
2012043979

    ISBN 978-1-107-00412-2 Hardback

    Additional resources for this publication at [insert URL here].

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party Internet Web sites referred to in this publication and does not guarantee that any content on such Web sites is, or will remain, accurate or appropriate.


    Contents

    Preface page xi

    I MATRIX THEORY

1 Matrix Algebra . . . . . . . . . . . . 3
    1.1 Definitions and Notations 4
    1.2 Fundamental Matrix Operations 6
    1.3 Properties of Matrix Operations 18
    1.4 Block Matrix Operations 30
    1.5 Matrix Calculus 31
    1.6 Sparse Matrices 39
    1.7 Exercises 41

2 Solution of Multiple Equations . . . . . . . . . . . . 54
    2.1 Gauss-Jordan Elimination 55
    2.2 LU Decomposition 59
    2.3 Direct Matrix Splitting 65
    2.4 Iterative Solution Methods 66
    2.5 Least-Squares Solution 71
    2.6 QR Decomposition 77
    2.7 Conjugate Gradient Method 78
    2.8 GMRES 79
    2.9 Newton's Method 80
    2.10 Enhanced Newton Methods via Line Search 82
    2.11 Exercises 86

3 Matrix Analysis . . . . . . . . . . . . 99
    3.1 Matrix Operators 100
    3.2 Eigenvalues and Eigenvectors 107
    3.3 Properties of Eigenvalues and Eigenvectors 113
    3.4 Schur Triangularization and Normal Matrices 116
    3.5 Diagonalization 117
    3.6 Jordan Canonical Form 118
    3.7 Functions of Square Matrices 120


    3.8 Stability of Matrix Operators 124
    3.9 Singular Value Decomposition 127
    3.10 Polar Decomposition 132
    3.11 Matrix Norms 135
    3.12 Exercises 138

    II VECTORS AND TENSORS

4 Vector and Tensor Algebra and Calculus . . . . . . . . . . . . 149
    4.1 Notations and Fundamental Operations 150
    4.2 Vector Algebra Based on Orthonormal Basis Vectors 154
    4.3 Tensor Algebra 157
    4.4 Matrix Representation of Vectors and Tensors 162
    4.5 Differential Operations for Vector Functions of One Variable 164
    4.6 Application to Position Vectors 165
    4.7 Differential Operations for Vector Fields 169
    4.8 Curvilinear Coordinate System: Cylindrical and Spherical 184
    4.9 Orthogonal Curvilinear Coordinates 189
    4.10 Exercises 196

5 Vector Integral Theorems . . . . . . . . . . . . 204
    5.1 Green's Lemma 205
    5.2 Divergence Theorem 208
    5.3 Stokes' Theorem and Path Independence 210
    5.4 Applications 215
    5.5 Leibnitz Derivative Formula 224
    5.6 Exercises 225

    III ORDINARY DIFFERENTIAL EQUATIONS

6 Analytical Solutions of Ordinary Differential Equations . . . . . . . . . . . . 235
    6.1 First-Order Ordinary Differential Equations 236
    6.2 Separable Forms via Similarity Transformations 238
    6.3 Exact Differential Equations via Integrating Factors 242
    6.4 Second-Order Ordinary Differential Equations 245
    6.5 Multiple Differential Equations 250
    6.6 Decoupled System Descriptions via Diagonalization 258
    6.7 Laplace Transform Methods 262
    6.8 Exercises 263

7 Numerical Solution of Initial and Boundary Value Problems . . . . . . . . . . . . 273
    7.1 Euler Methods 274
    7.2 Runge-Kutta Methods 276
    7.3 Multistep Methods 282
    7.4 Difference Equations and Stability 291
    7.5 Boundary Value Problems 299
    7.6 Differential Algebraic Equations 303
    7.7 Exercises 305


8 Qualitative Analysis of Ordinary Differential Equations . . . . . . . . . . . . 311
    8.1 Existence and Uniqueness 312
    8.2 Autonomous Systems and Equilibrium Points 313
    8.3 Integral Curves, Phase Space, Flows, and Trajectories 314
    8.4 Lyapunov and Asymptotic Stability 317
    8.5 Phase-Plane Analysis of Linear Second-Order Autonomous Systems 321
    8.6 Linearization Around Equilibrium Points 327
    8.7 Method of Lyapunov Functions 330
    8.8 Limit Cycles 332
    8.9 Bifurcation Analysis 340
    8.10 Exercises 340

9 Series Solutions of Linear Ordinary Differential Equations . . . . . . . . . . . . 347
    9.1 Power Series Solutions 347
    9.2 Legendre Equations 358
    9.3 Bessel Equations 363
    9.4 Properties and Identities of Bessel Functions and Modified Bessel Functions 369
    9.5 Exercises 371

    IV PARTIAL DIFFERENTIAL EQUATIONS

10 First-Order Partial Differential Equations and the Method of Characteristics . . . . . . . . . . . . 379
    10.1 The Method of Characteristics 380
    10.2 Alternate Forms and General Solutions 387
    10.3 The Lagrange-Charpit Method 389
    10.4 Classification Based on Principal Parts 393
    10.5 Hyperbolic Systems of Equations 397
    10.6 Exercises 399

11 Linear Partial Differential Equations . . . . . . . . . . . . 405
    11.1 Linear Partial Differential Operator 406
    11.2 Reducible Linear Partial Differential Equations 408
    11.3 Method of Separation of Variables 411
    11.4 Nonhomogeneous Partial Differential Equations 431
    11.5 Similarity Transformations 439
    11.6 Exercises 443

12 Integral Transform Methods . . . . . . . . . . . . 450
    12.1 General Integral Transforms 451
    12.2 Fourier Transforms 452
    12.3 Solution of PDEs Using Fourier Transforms 459
    12.4 Laplace Transforms 464
    12.5 Solution of PDEs Using Laplace Transforms 474
    12.6 Method of Images 476
    12.7 Exercises 477


13 Finite Difference Methods . . . . . . . . . . . . 483
    13.1 Finite Difference Approximations 484
    13.2 Time-Independent Equations 491
    13.3 Time-Dependent Equations 504
    13.4 Stability Analysis 512
    13.5 Exercises 519

14 Method of Finite Elements . . . . . . . . . . . . 523
    14.1 The Weak Form 524
    14.2 Triangular Finite Elements 527
    14.3 Assembly of Finite Elements 533
    14.4 Mesh Generation 539
    14.5 Summary of Finite Element Method 541
    14.6 Axisymmetric Case 546
    14.7 Time-Dependent Systems 547
    14.8 Exercises 552

    Bibliography B-1

    Index I-1

A Additional Details and Fortification for Chapter 1 . . . . . . . . . . . . 561
    A.1 Matrix Classes and Special Matrices 561
    A.2 Motivation for Matrix Operations from Solution of Equations 568
    A.3 Taylor Series Expansion 572
    A.4 Proofs for Lemma and Theorems of Chapter 1 576
    A.5 Positive Definite Matrices 586

B Additional Details and Fortification for Chapter 2 . . . . . . . . . . . . 589
    B.1 Gauss-Jordan Elimination Algorithm 589
    B.2 SVD to Determine Gauss-Jordan Matrices Q and W 594
    B.3 Boolean Matrices and Reducible Matrices 595
    B.4 Reduction of Matrix Bandwidth 600
    B.5 Block LU Decomposition 602
    B.6 Matrix Splitting: Diakoptic Method and Schur Complement Method 605
    B.7 Linear Vector Algebra: Fundamental Concepts 611
    B.8 Determination of Linear Independence of Functions 614
    B.9 Gram-Schmidt Orthogonalization 616
    B.10 Proofs for Lemma and Theorems in Chapter 2 617
    B.11 Conjugate Gradient Algorithm 620
    B.12 GMRES Algorithm 629
    B.13 Enhanced-Newton Using Double-Dogleg Method 635
    B.14 Nonlinear Least Squares via Levenberg-Marquardt 639

C Additional Details and Fortification for Chapter 3 . . . . . . . . . . . . 644
    C.1 Proofs of Lemmas and Theorems of Chapter 3 644
    C.2 QR Method for Eigenvalue Calculations 649
    C.3 Calculations for the Jordan Decomposition 655


    C.4 Schur Triangularization and SVD 658
    C.5 Sylvester's Matrix Theorem 659
    C.6 Danilevskii Method for Characteristic Polynomial 660

D Additional Details and Fortification for Chapter 4 . . . . . . . . . . . . 664
    D.1 Proofs of Identities of Differential Operators 664
    D.2 Derivation of Formulas in Cylindrical Coordinates 666
    D.3 Derivation of Formulas in Spherical Coordinates 669

E Additional Details and Fortification for Chapter 5 . . . . . . . . . . . . 673
    E.1 Line Integrals 673
    E.2 Surface Integrals 678
    E.3 Volume Integrals 684
    E.4 Gauss-Legendre Quadrature 687
    E.5 Proofs of Integral Theorems 691

F Additional Details and Fortification for Chapter 6 . . . . . . . . . . . . 700
    F.1 Supplemental Methods for Solving First-Order ODEs 700
    F.2 Singular Solutions 703
    F.3 Finite Series Solution of dx/dt = Ax + b(t) 705
    F.4 Proof for Lemmas and Theorems in Chapter 6 708

G Additional Details and Fortification for Chapter 7 . . . . . . . . . . . . 715
    G.1 Differential Equation Solvers in MATLAB 715
    G.2 Derivation of Fourth-Order Runge-Kutta Method 718
    G.3 Adams-Bashforth Parameters 722
    G.4 Variable Step Sizes for BDF 723
    G.5 Error Control by Varying Step Size 724
    G.6 Proof of Solution of Difference Equation, Theorem 7.1 730
    G.7 Nonlinear Boundary Value Problems 731
    G.8 Riccati Equation Method 734

H Additional Details and Fortification for Chapter 8 . . . . . . . . . . . . 738
    H.1 Bifurcation Analysis 738

I Additional Details and Fortification for Chapter 9 . . . . . . . . . . . . 745
    I.1 Details on Series Solution of Second-Order Systems 745
    I.2 Method of Order Reduction 748
    I.3 Examples of Solution of Regular Singular Points 750
    I.4 Series Solution of Legendre Equations 753
    I.5 Series Solution of Bessel Equations 757
    I.6 Proofs for Lemmas and Theorems in Chapter 9 761

J Additional Details and Fortification for Chapter 10 . . . . . . . . . . . . 771
    J.1 Shocks and Rarefaction 771
    J.2 Classification of Second-Order Semilinear Equations: n > 2 781
    J.3 Classification of High-Order Semilinear Equations 784


K Additional Details and Fortification for Chapter 11 . . . . . . . . . . . . 786
    K.1 d'Alembert Solutions 786
    K.2 Proofs of Lemmas and Theorems in Chapter 11 791

L Additional Details and Fortification for Chapter 12 . . . . . . . . . . . . 795
    L.1 The Fast Fourier Transform 795
    L.2 Integration of Complex Functions 799
    L.3 Dirichlet Conditions and the Fourier Integral Theorem 819
    L.4 Brief Introduction to Distribution Theory and Delta Distributions 820
    L.5 Tempered Distributions and Fourier Transforms 830
    L.6 Supplemental Lemmas, Theorems, and Proofs 836
    L.7 More Examples of Laplace Transform Solutions 840
    L.8 Proofs of Theorems Used in Distribution Theory 846

M Additional Details and Fortification for Chapter 13 . . . . . . . . . . . . 851
    M.1 Method of Undetermined Coefficients for Finite Difference Approximation of Mixed Partial Derivative 851
    M.2 Finite Difference Formulas for 3D Cases 852
    M.3 Finite Difference Solutions of Linear Hyperbolic Equations 855
    M.4 Alternating Direction Implicit (ADI) Schemes 863

N Additional Details and Fortification for Chapter 14 . . . . . . . . . . . . 867
    N.1 Convex Hull Algorithm 867
    N.2 Stabilization via Streamline-Upwind Petrov-Galerkin (SUPG) 870


    Preface

This book was written as a textbook on applied mathematics for engineers and scientists, with the express goal of merging analytical and numerical methods more tightly than other textbooks do. The role of applied mathematics has grown increasingly important with the advancement of science and technology, ranging from the modeling and analysis of natural phenomena to the simulation and optimization of man-made systems. With the huge and rapid advances in computing technology, larger and more complex problems can now be tackled and analyzed in a very timely fashion. In several cases, what used to require supercomputers can now be solved using personal computers. Nonetheless, as the technological tools continue to progress, it has become even more imperative that the results be understood and interpreted clearly and correctly, and that users have a deeper knowledge of the strengths and limitations of the numerical methods used. This means that we cannot forgo the analytical techniques, because they continue to provide indispensable insights into the veracity and meaning of the results. The analytical tools remain of prime importance for the basic understanding needed to build mathematical models and analyze data. Still, when it comes to solving large and complex problems, numerical methods are needed.

The level of exposition in this book is aimed at graduate students, advanced undergraduate students, and researchers in the engineering and science fields. Thus the topics were mostly chosen to continue several topics found in most undergraduate textbooks in applied mathematics. We have focused on advanced concepts and the implementation of various mathematical tools to solve the problems that most graduate students are likely to face in their research work and other advanced courses.

The contents of the book can be divided into four main parts: matrix theory, vectors and tensors, ordinary differential equations, and partial differential equations. We begin the book with matrix theory because the tools developed in matrix theory form the crucial foundations used in the rest of the book. The next part centers on the concepts used in vector and tensor theory, including the application of tensor calculus and integral theorems to develop mathematical models of physical systems, often resulting in several differential equations. The last two parts focus on the solution of ordinary and partial differential equations. It can be argued that the primary needs of applied mathematics in engineering and the physical sciences are to obtain models for a system or phenomenon in the form of differential equations


and then to be able to solve them to predict and understand the effects of changes in model parameters, boundary conditions, or initial conditions.

Although the methods of applied mathematics are independent of computing platform and programs, we have chosen to use MATLAB as a particular platform under which we investigate the mathematical methods, techniques, and ideas so that the approaches can be tested and the results can be visualized. The supplied MATLAB codes are all included on the book's website, and readers can modify the codes for their own use. There exist several excellent MATLAB toolboxes supplied by third-party software developers, and they have been optimized for speed, efficiency, and user-friendliness. However, an unintended consequence of user-friendly tools is that they can sometimes turn their users into button pushers. We contend that students in applied mathematics still need to discover the mechanisms and ideas behind the full-blown programs, at least to apply them to simple test problems and gain some basic understanding of the various approaches. The links to the supplemental MATLAB programs and files can be accessed at www.cambridge.org/Co.

The appendices are collected as chapter fortifications. They include proofs, advanced topics, additional tables, and examples. The reader can access these materials on the web via the link www.cambridge.org/Co. The index also contains topics that can be found in the appendices; these are given page numbers that continue the count from the main text.

Several colleagues and students have helped tremendously in the writing of this textbook. Most of all, I want to thank my best friend and wife, Faith Morrison, for the support, encouragement, and sacrifices she has given me to finish this extended and personally significant project. I hope the textbook contains information useful to readers, enough for them to share in the continued exploration of the methods and applications of mathematics to further improve the understanding and conditions of our world.

T. Co
Houghton, MI


    PART I

    MATRIX THEORY

Matrix theory is a powerful field of mathematics that has found applications in the solution of several real-world problems, ranging from the solution of algebraic equations to the solution of differential equations. Its importance has also been enhanced by the rapid development of several computer programs that have improved the efficiency of matrix analysis and the solution of matrix equations.

We have allotted three chapters to discussing matrix theory. Chapter 1 contains the basic notations and operations. These include conventions and notations for the various structural, algebraic, differential, and integral operations. As such, this chapter focuses on how to formulate problems in terms of matrix equations, the various approaches to matrix algebraic manipulations, and matrix partitions.

Chapter 2 then focuses on the solution of the linear equation given by Ax = b, and it includes both direct and indirect methods. The most direct method is to find the inverse of A and then evaluate x = A^{-1}b. However, the major practical issue is that matrix inverses become unwieldy when the matrices are large. This chapter is concerned with finding the solutions by reformulating the problem to take advantage of available matrix properties. Direct methods use various factorizations of A based on matrices that are more easily invertible, whereas indirect methods use an iterative process starting with an initial guess of the solution. The methods can then be applied to linear least-squares problems, as well as to the solution of multivariable nonlinear equations.

Chapter 3 focuses on matrices as operators. In this case, the discussion is concerned with the analysis of matrices, for example, using eigenvalues and eigenvectors. This allows one to obtain diagonalized matrices or Jordan canonical forms. These forms provide efficient tools for evaluating matrix functions, which are also very useful for solving simultaneous differential equations. Other analysis tools such as singular value decomposition, matrix norms, and condition numbers are also included in the chapter.

The matrix theory topics are also used in the other parts of this book. In Part II, we can use matrices to represent vector coordinates and tensors. The operations and vector/tensor properties can also be evaluated and analyzed efficiently using matrix theory. For instance, the mutual orthogonalities among the principal axes of a symmetric tensor are immediate consequences of the properties of matrix eigenvectors. In Part III, matrices are also shown to be indispensable tools for solving ordinary differential equations. Specifically, the solution and analysis of a set of simultaneous


linear ordinary differential equations can be represented in terms of matrix exponential functions. Moreover, numerical solution methods can now be coded in matrix form. Finally, in Part IV of the book, both the finite difference and finite element methods reduce partial differential equations to linear algebraic equations. Thus the tools discussed in Chapter 2 are strongly applicable because the matrices resulting from either of these methods will likely be large and sparse.


    1 Matrix Algebra

In this chapter, we review some definitions and operations of matrices. Matrices play very important roles in the computation and analysis of several mathematical problems. They allow for compact notation of large sets of linear algebraic equations. Various matrix operations such as addition, multiplication, and inverses can be combined to find the required solutions in a more tractable manner. The existence of several software tools, such as MATLAB, has also made it very efficient to approach the solution by posing several problems in the form of matrix equations. Moreover, matrices possess internal properties such as determinant, rank, trace, eigenvalues, and eigenvectors, which can help characterize the systems under consideration.

We begin with the basic notation and definitions in Section 1.1. The matrix notations introduced in this chapter are used throughout the book. Then in Section 1.2, we discuss the various matrix operations. Several matrix operations should be familiar to most readers, but some may not be as familiar, such as Kronecker products. We have classified the operations as either structural or algebraic. The structural operations are those that involve only the collection and arrangement of the elements. On the other hand, the algebraic operations pertain to those in which algebraic operations are implemented among the elements of a matrix or group of matrices. The properties of the different matrix operations, such as associativity, commutativity, and distributivity, are summarized in Section 1.3. In addition, we discuss the properties of determinants and include some matrix inverse formulas. The properties and formulas allow for the manipulation and simplification of matrix equations. These will be important tools used throughout this book.

In Section 1.4, we explore various block matrix operations. These operations are very useful when the structure of the matrices can be partitioned into submatrices. These block operations will also prove to be very useful when solving large sets of equations that exhibit a specific pattern.

From algebraic operations, we then move to topics involving differential and integral calculus in Section 1.5. We first define and fix various notations for the derivatives and integrals of matrices. These notations are also used throughout the book. The various properties of the matrix calculus operations are also summarized in this section. One of the applications of matrix calculus is optimization, in which the concept of positive (and negative) definiteness is needed for sufficient conditions. We


devote Section A.5 in the appendix to explaining positive or negative definiteness in more detail.

Finally, in Section 1.6, we include a brief discussion of sparse matrices. These matrices often result when the problem involves a large collection of smaller elements that are connected to only a few of the other elements, such as when we solve differential equations by numerical methods, for example, the finite difference or finite element methods.

    1.1 Definitions and Notations

The primary application of matrices is in solving simultaneous linear equations. These equations can come from solving problems based on mass and energy balances of physical, chemical, and biological processes; Kirchhoff's laws in electric circuits; force and moment balances in engineering structures; and so forth. The number of unknowns for these problems can be quite large, so the solution can become quite complicated. This is especially the case with modern engineering systems, which typically contain several stages (e.g., staged operations in chemical engineering), are highly integrated (e.g., large-scale integration in microelectronics), or are structurally large (e.g., large power grids and large buildings). Matrix methods offer techniques that allow for tractability and computational efficiency.

When solving large nonlinear problems, numerical methods become a necessary approach. The numerical computations often involve matrix formulations. For instance, several techniques for solving nonlinear equations and nonlinear optimization problems implement Newton's method and other gradient-based methods, in which the calculations include matrix operations. Matrix equations also result from finite approximations of systems of differential equations. For boundary value problems, the internal values are to be solved such that both the boundary conditions and the differential equations that describe the systems are satisfied. Here, the numerical techniques include finite element methods and finite difference methods, both of which translate the problem back to a linear set of equations.

Aside from calculating unknowns or solving differential equations, matrix methods are also useful in operator analysis and design. In this case, matrix equations are analyzed in terms of operators, inputs, and outputs. The matrices associated with the operators can be formulated to obtain the desired behavior. For example, if we want to move a 3D point a = (x, y, z) to another position, say, b = (x', y', z'), in a particular way, for instance, to move it radially outward or rotate it a specified number of degrees counterclockwise, then we can build matrices that would produce the desired effects. Conversely, for a system (mechanical, chemical, electrical, biological, etc.) that can be written in matrix form (both in differential equations and algebraic equations), we can often isolate the matrices associated with system operations and use matrix analysis to explore the capabilities and behavior of the system.

It is also worth mentioning that, in addition to the classical systems that are modeled with algebraic and differential equations, there are other application domains that use matrix methods extensively. These include data processing, computational geometry, and network analysis. In data processing, matrix methods help in regression analysis and statistical data analysis. These applications also include data mining in search engines, bioinformatics, and computer security. Computational geometry also uses matrix methods to handle and analyze large sets of data. Applications include computer graphics and visualization, which are also used for pattern


recognition purposes. In network analysis, matrix methods are used together with graph theory to analyze the connectivity and effects of large, complex structures. Applications include the analysis of communication and control systems, as well as large power grids.

    We now begin with the definition of a matrix and continue with some of thenotations and conventions that are used throughout this book.

    Definition 1.1. A matrix is a collection of objects, called the elements of thematrix, arranged in rows and columns.

These elements of the matrix could be numbers, such as

A = \begin{pmatrix} 1 & 0 & 0.3 \\ 2 & 3+i & 1/2 \end{pmatrix} \quad \text{with } i = \sqrt{-1}

or functions, such as

B = \begin{pmatrix} 1 & 2x(t) + a \\ \int \sin(t)\,dt & dy/dt \end{pmatrix}

The elements of matrices are restricted to a set of mathematical objects that allow algebraic binary operations such as addition, subtraction, multiplication, and division. The valid elements of the matrix are referred to as scalars. Note that a scalar is not the same as a matrix having only one row and one column.

We often use capital letters to denote matrices, whereas the corresponding small letters stand for the elements. Thus the elements of matrix A positioned at the ith row and jth column are denoted as a_{ij}; for example, for A having N rows and M columns,

A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1M} \\ a_{21} & a_{22} & \cdots & a_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ a_{N1} & a_{N2} & \cdots & a_{NM} \end{pmatrix}   (1.1)

The size of the matrix is given by the symbol [=]; for example, for matrix A having N rows and M columns,

A \;[=]\; N \times M \quad \text{or} \quad A_{[N \times M]}   (1.2)

A row vector is a matrix having one row, whereas a column vector is a matrix having one column. The length of a vector means the number of elements of the row or column vector. If the type of vector has not been specified, we take it to mean a column vector. We often use bold small letters to denote vectors. A basic vector is the ith unit vector of length N, denoted by e_i,

e_i = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \leftarrow i\text{th element}   (1.3)

The length N of the unit vector is determined by context.


A square matrix is a matrix with the same number of columns and rows. Special cases include lower triangular, upper triangular, and diagonal matrices. Lower triangular matrices have zero elements above the main diagonal, whereas upper triangular matrices have zero elements below the main diagonal. Diagonal matrices have zero off-diagonal elements. The diagonal matrix is also represented by

D = \mathrm{diag}\left( d_{11}, d_{22}, \ldots, d_{NN} \right)   (1.4)

A special diagonal matrix in which the main diagonal elements are all 1s is known as the identity matrix, denoted by I. If the size of the identity matrix needs to be specified, then we use I_N to denote an N \times N identity matrix. An extensive list of matrices that have special forms, such as bidiagonal, tridiagonal, Hessenberg, Toeplitz, and so forth, is given in Tables A.1 through A.5 in Section A.1 of the appendix for easy reference.

    1.2 Fundamental Matrix Operations

We assume that the reader is already familiar with several matrix operations. The purpose of the following sections is to summarize these operations, introduce our notations, and relate them to some of the available MATLAB commands. We can divide matrix operations into two major categories. The first category involves the restructuring or combination of matrices. The second category includes the operations that contain algebraic computations such as addition, multiplication, and inverses.

    1.2.1 Matrix Restructuring Operations

A list of matrix rearrangement operations with their respective notations is summarized in Tables 1.1 and 1.2 (together with some MATLAB commands associated with the operations).

The row and column augmentation operations are designated by horizontal and vertical bars, respectively. These are used extensively throughout the book because we take advantage of block matrix operations. The reshaping operations are given by the vectorization operation and the reshape operation. Both of these operations are quite useful when reformulating equations such as HX + XB + CXD = F into the familiar linear equation form given by Ax = b.
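As a brief illustration of this reformulation, the following MATLAB sketch (ours, with random test data) converts HX + XB + CXD = F into the form Ax = b using the identity vec(BAC) = (C^T ⊗ B) vec(A) listed later in Table 1.4:

% Sketch: recasting H*X + X*B + C*X*D = F as a linear system A*x = b,
% using vec(P*X*Q) = kron(Q.',P)*vec(X). All matrices are random test data.
N = 4;
H = rand(N); B = rand(N); C = rand(N); D = rand(N); F = rand(N);
I = eye(N);
A = kron(I,H) + kron(B.',I) + kron(D.',C);   % coefficient matrix
x = A \ F(:);                                % F(:) is vec(F)
X = reshape(x,N,N);                          % undo the vectorization
norm(H*X + X*B + C*X*D - F)                  % near machine precision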

There are two operations that involve exchanging the roles of rows and columns: the standard transpose operation, which we denote by superscript T, and the conjugate transpose, which we denote by a superscript asterisk. In general, A^* \neq A^T, except when the elements of A are all real. When A = A^T, we say that A is symmetric, and when A = A^*, we say that A is Hermitian. The two cases are generally not the same. For instance, let

A = \begin{pmatrix} 1+i & 2 \\ 2 & 3 \end{pmatrix} \qquad B = \begin{pmatrix} 1 & 2+i \\ 2-i & 3 \end{pmatrix}

then A is symmetric but not Hermitian, whereas B is Hermitian but not symmetric. On the other hand, when A = -A^T, we say that A is skew-symmetric, and when A = -A^*, we say that A is skew-Hermitian.
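A quick MATLAB check of this distinction, using the two matrices above (a sketch):

A = [1+1i 2; 2 3];        % symmetric: A = A.'
B = [1 2+1i; 2-1i 3];     % Hermitian: B = B'
isequal(A,A.'), isequal(A,A')   % true, false: symmetric but not Hermitian
isequal(B,B'), isequal(B,B.')   % true, false: Hermitian but not symmetric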


Table 1.1. Matrix restructuring operations

1. Column Augment: C = ( A | B ); MATLAB: C=[A,B]. For A [=] N \times M and B [=] N \times P, the columns of B are appended to the right of those of A:

\begin{pmatrix} c_{11} & \cdots & c_{1,M+P} \\ \vdots & \ddots & \vdots \\ c_{N1} & \cdots & c_{N,M+P} \end{pmatrix} = \left( \begin{array}{ccc|ccc} a_{11} & \cdots & a_{1M} & b_{11} & \cdots & b_{1P} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ a_{N1} & \cdots & a_{NM} & b_{N1} & \cdots & b_{NP} \end{array} \right)

2. Row Augment: C = \left( \dfrac{A}{B} \right); MATLAB: C=[A;B]. For A [=] N \times M and B [=] P \times M, the rows of B are appended below those of A:

\begin{pmatrix} c_{11} & \cdots & c_{1M} \\ \vdots & \ddots & \vdots \\ c_{N+P,1} & \cdots & c_{N+P,M} \end{pmatrix} = \begin{pmatrix} a_{11} & \cdots & a_{1M} \\ \vdots & \ddots & \vdots \\ a_{N1} & \cdots & a_{NM} \\ b_{11} & \cdots & b_{1M} \\ \vdots & \ddots & \vdots \\ b_{P1} & \cdots & b_{PM} \end{pmatrix}

3. Vectorize: C = \mathrm{vec}(A); MATLAB: C=A(:). The columns of A are stacked into a single column vector of length NM:

\begin{pmatrix} c_1 \\ \vdots \\ c_{NM} \end{pmatrix} = \begin{pmatrix} A_{\ast,1} \\ \vdots \\ A_{\ast,M} \end{pmatrix}

where A_{\ast,i} is the ith column of A.

The submatrix operation is denoted by using a list of k subscript indices and \ell superscript indices to refer to the rows and columns, respectively, extracted from a matrix. For instance,

A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} \qquad A^{2,3}_{1,2} = \begin{pmatrix} 2 & 3 \\ 5 & 6 \end{pmatrix}

For a square matrix, if the diagonals of the submatrix are a subset of the diagonals of the original matrix, then we call it a principal submatrix. This happens if the superscript indices and the subscript indices of the submatrix are the same. For instance, with the same A,

A^{1,3}_{1,3} = \begin{pmatrix} 1 & 3 \\ 7 & 9 \end{pmatrix}

is a principal submatrix.
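In MATLAB, the two submatrices above are plain index expressions (sketch):

A = [1 2 3; 4 5 6; 7 8 9];
S = A([1 2],[2 3])     % rows 1,2 and columns 2,3: [2 3; 5 6]
P = A([1 3],[1 3])     % principal submatrix: [1 3; 7 9]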


Table 1.2. Matrix rearrangement operations

4. Reshape: C = \mathrm{reshape}(v, N, M); MATLAB: reshape(v,N,M). A vector v of length NM is rearranged column by column into an N \times M matrix:

C = \begin{pmatrix} v_1 & v_{N+1} & \cdots & v_{(M-1)N+1} \\ \vdots & \vdots & & \vdots \\ v_N & v_{2N} & \cdots & v_{MN} \end{pmatrix}

5. Transpose: C = A^T; MATLAB: C=A.'. Rows and columns are interchanged: c_{ij} = a_{ji}.

6. Conjugate Transpose: C = A^*; MATLAB: C=A'. Rows and columns are interchanged and each element is conjugated: c_{ij} = \overline{a_{ji}}, where \overline{a_{ij}} is the complex conjugate of a_{ij}.

7. Submatrix: C = A^{j_1, j_2, \ldots, j_\ell}_{i_1, i_2, \ldots, i_k}; MATLAB: rows=[i1,i2,...]; cols=[j1,j2,...]; C=A(rows,cols). The k \times \ell matrix C collects the elements in rows i_1, \ldots, i_k and columns j_1, \ldots, j_\ell:

C_{[k \times \ell]} = \begin{pmatrix} a_{i_1 j_1} & \cdots & a_{i_1 j_\ell} \\ \vdots & \ddots & \vdots \\ a_{i_k j_1} & \cdots & a_{i_k j_\ell} \end{pmatrix}

8. (ij)th Redact: C = A^{-ij}; MATLAB: C=A; C(i,:)=[]; C(:,j)=[]. The ith row and jth column are removed, leaving C [=] (N-1) \times (M-1):

C = \begin{pmatrix} A^{1,\ldots,j-1}_{1,\ldots,i-1} & A^{j+1,\ldots,M}_{1,\ldots,i-1} \\ A^{1,\ldots,j-1}_{i+1,\ldots,N} & A^{j+1,\ldots,M}_{i+1,\ldots,N} \end{pmatrix}

Next, the operation to remove some specified rows and columns is referred to here as the (ij)th redact operation. We use A^{-ij} to denote the removal of the ith row and jth column. For instance,

A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} \qquad A^{-23} = \begin{pmatrix} 1 & 2 \\ 7 & 8 \end{pmatrix}   (1.5)

This operation is useful in finding determinants, cofactors, and adjugates.
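The redact operation in MATLAB, following the deletion idiom given in Table 1.2 (sketch):

A = [1 2 3; 4 5 6; 7 8 9];
C = A;
C(2,:) = [];           % remove row 2
C(:,3) = []            % remove column 3, leaving [1 2; 7 8]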


Table 1.3. Matrix algebraic operations

1. Sum: C = A + B, with c_{ij} = a_{ij} + b_{ij}; MATLAB: C=A+B
2. Scalar Product: C = qA, with c_{ij} = q\,a_{ij}; MATLAB: C=q*A
3. Matrix Product: C = AB, with c_{ij} = \sum_{k=1}^{K} a_{ik} b_{kj}; MATLAB: C=A*B
4. Hadamard Product: C = A \circ B, with c_{ij} = a_{ij} b_{ij}; MATLAB: C=A.*B
5. Kronecker Product (tensor product): C = A \otimes B = \begin{pmatrix} a_{11}B & \cdots & a_{1M}B \\ \vdots & & \vdots \\ a_{N1}B & \cdots & a_{NM}B \end{pmatrix}; MATLAB: C=kron(A,B)
6. Determinant: q = \det(A) or q = |A|, with q = \sum_{K} \sigma(K) \left( \prod_{i=1}^{N} a_{i,k_i} \right); see (1.10); MATLAB: q=det(A)
7. Cofactor: q = \mathrm{cof}(a_{ij}) = (-1)^{i+j} \det\left(A^{-ij}\right)
8. Adjugate: C = \mathrm{adj}(A), with c_{ij} = \mathrm{cof}(a_{ji})
9. Inverse: C = A^{-1} = \frac{1}{|A|}\,\mathrm{adj}(A); MATLAB: C=inv(A)
10. Trace: q = \mathrm{tr}(A) = \sum_{i=1}^{N} a_{ii}; MATLAB: q=trace(A)
11. Real Part: C = \mathrm{Real}(A), with c_{ij} = \mathrm{real}(a_{ij}); MATLAB: C=real(A)
12. Imaginary Part: C = \mathrm{Imag}(A), with c_{ij} = \mathrm{imag}(a_{ij}); MATLAB: C=imag(A)
13. Complex Conjugate: C = \overline{A}, with c_{ij} = \overline{a_{ij}}; MATLAB: C=conj(A)

    1.2.2 Matrix Algebraic Operations

The matrix algebraic operations can be classified further as either binary or unary. For binary operations, the algebraic operations require two inputs, either a scalar and a matrix or two matrices of appropriate sizes. For unary operations, the input is a matrix, and the algebraic operations are applied to the elements of the matrix. The matrix algebraic operations are given in Table 1.3, together with their corresponding MATLAB commands.


    1.2.2.1 Binary Algebraic Operations

The most basic matrix binary computational operations are matrix sums, scalar products, and matrix products, which are quite familiar to most readers. To see how these operations arise as natural consequences of solving simultaneous equations, we refer the reader to Section A.2 in the appendices.

Matrix products of A and B are denoted simply by C = AB, which requires A [=] N \times K, B [=] K \times M, and C [=] N \times M (i.e., the number of columns of A must equal the number of rows of B). If this is the case, we say that A and B are conformable for the operation AB. Furthermore, based on the sizes of the matrices, A_{[N \times K]} B_{[K \times M]} = C_{[N \times M]}, we see that dropping the common value K leaves the size of C to be N \times M. For the matrix product AB, we say that A premultiplies B, or B postmultiplies A. For instance, let

A = \begin{pmatrix} 1 & 1 \\ 2 & 1 \\ -1 & 0 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix} \quad \text{then} \quad C = AB = \begin{pmatrix} 3 & 4 \\ 5 & 5 \\ -2 & -1 \end{pmatrix}

However, B and A are not conformable for the product BA.

In several cases, AB \neq BA, even if the reversed order is conformable, and thus one needs to be clear whether a matrix premultiplies or postmultiplies another matrix. For the special case in which switching the order yields the same product (i.e., AB = BA), we say that A and B commute. It is necessary that commuting matrices be square and of the same size.

We list a few key results regarding matrix products:

1. For matrix products between a matrix A [=] N \times M and the appropriately sized identity matrix, we have

A I_M = I_N A = A

where I_M and I_N are identity matrices of size M and size N, respectively.

2. Based on the definition of matrix products, when B premultiplies A, the row elements of B are pairwise multiplied with the column elements of A, and the results are then summed together. This fact implies that to scale the ith row of A by a factor d_i, we can simply premultiply A by a diagonal matrix D = \mathrm{diag}(d_1, \ldots, d_N). For instance,

DA = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} = \begin{pmatrix} 2 & 4 & 6 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}

Likewise, to scale the jth column of A by a factor d_j, we can simply postmultiply A by a diagonal matrix D = \mathrm{diag}(d_1, \ldots, d_M). For instance,

AD = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} \begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 2 & 3 \\ 8 & 5 & 6 \\ 14 & 8 & 9 \end{pmatrix}


3. Premultiplying A by a row vector of 1s yields a row vector containing the sums of each column, whereas postmultiplying by a column vector of 1s yields a column vector containing the sums of each row. For instance,

\begin{pmatrix} 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} = \begin{pmatrix} 12 & 15 & 18 \end{pmatrix} \qquad \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 6 \\ 15 \\ 24 \end{pmatrix}

4. Let T be an identity matrix, but with additional nonzero nondiagonal elements in the jth column. Then B = TA is a matrix whose ith row (i \neq j) is given by the sum of the ith row of A and t_{ij} times the jth row of A. The jth row of B remains the jth row of A. For instance,

\begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} = \begin{pmatrix} -6 & -6 & -6 \\ 18 & 21 & 24 \\ 7 & 8 & 9 \end{pmatrix}

Likewise, let G be an identity matrix, but with nonzero nondiagonal elements in the ith row. Then C = AG is a matrix whose jth column (j \neq i) is given by the sum of the jth column of A and g_{ij} times the ith column of A. The ith column of C remains the ith column of A. For instance,

\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 2 & 1 \end{pmatrix} = \begin{pmatrix} -2 & 8 & 3 \\ -2 & 17 & 6 \\ -2 & 26 & 9 \end{pmatrix}

5. A square matrix P is known as a row permutation matrix if it is obtained by permuting the rows of an identity matrix. If P is a row permutation matrix, then PA is the matrix obtained by permuting the rows of A in the same sequence as P. For instance, let P [=] 3 \times 3 be obtained by permuting the rows of the identity matrix according to the sequence [3, 1, 2]; then

PA = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} = \begin{pmatrix} 7 & 8 & 9 \\ 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}

Likewise, a square matrix P is a column permutation matrix if it is obtained by permuting the columns of an identity matrix. If P is a column permutation matrix, then AP is obtained by permuting the columns of A in the same sequence as P. For instance, let P [=] 3 \times 3 be obtained by permuting the columns of the identity matrix according to the sequence [3, 1, 2]; then

AP = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 3 & 1 & 2 \\ 6 & 4 & 5 \\ 9 & 7 & 8 \end{pmatrix}

Remark: Matrices D, T, and P described in items 2, 4, and 5 are known as the scaling, pairwise combination, and permutation row operators, respectively. Collectively, they are known as the elementary row operators. All three operations show that premultiplication (left multiplication) is a row operation. On the other hand, D, G, and P are elementary column operators, and they operate on matrices via postmultiplication (right multiplication).¹ All these matrix operations are used extensively in the Gauss-Jordan elimination method for solving linear equations.

¹ We suggest the use of the mnemonics LR and RC to stand for "Left operation acts on Rows" and "Right operation acts on Columns," respectively.
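The three elementary row operators can be reproduced in MATLAB as follows (a sketch, reusing the numerical examples from items 2, 4, and 5):

A = [1 2 3; 4 5 6; 7 8 9];
D = diag([2 1 1]);            % scaling operator: doubles row 1
T = eye(3); T(1,3) = -1;      % pairwise combination: row1 <- row1 - row3
T(2,3) = 2;                   %   and row2 <- row2 + 2*row3
P = eye(3); P = P([3 1 2],:); % permutation: rows reordered as [3 1 2]
D*A, T*A, P*A                 % premultiplication acts on the rows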

Aside from scalar and matrix products, there are two more matrix operations involving multiplication. The Hadamard product, also known as the element-wise product, is defined as follows:

Q = A \circ B \quad \Longleftrightarrow \quad q_{ij} = a_{ij} b_{ij}, \quad i = 1, \ldots, N;\; j = 1, \ldots, M   (1.6)

For instance,

\begin{pmatrix} 1 & 1 \\ 2 & 2 \end{pmatrix} \circ \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 6 & 8 \end{pmatrix}

The Kronecker product, also known as the tensor product, is defined as follows:

C = A \otimes B = \begin{pmatrix} a_{11}B & \cdots & a_{1M}B \\ \vdots & \ddots & \vdots \\ a_{N1}B & \cdots & a_{NM}B \end{pmatrix}   (1.7)

where the matrix blocks a_{ij}B are scalar products of a_{ij} and B. For instance,

\begin{pmatrix} 1 & 1 \\ 2 & 2 \end{pmatrix} \otimes \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 1 & 2 \\ 3 & 4 & 3 & 4 \\ 2 & 4 & 2 & 4 \\ 6 & 8 & 6 & 8 \end{pmatrix}

Both the Hadamard product and the Kronecker product are useful when solving general matrix equations, some of which result from finite difference methods.
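In MATLAB the two products are computed as follows (sketch, using the same 2 × 2 examples):

A = [1 1; 2 2]; B = [1 2; 3 4];
A .* B       % Hadamard product: [1 2; 6 8]
kron(A,B)    % Kronecker product: 4-by-4 block matrix [a11*B a12*B; a21*B a22*B]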

    1.2.2.2 Unary Algebraic Operations

We first look at the set of unary operations applicable only to square matrices. The first set of unary operations to consider are highly related to each other. These operations are the determinant, cofactors, adjugates, and inverses. As before, we refer the reader to Section A.2 to see how these definitions naturally developed from the application to the solution of simultaneous linear algebraic equations.

Of these unary operations, the matrix inverse can easily be defined independent of computation.

Definition 1.2. The matrix inverse of a square matrix A is a matrix of the same size, denoted by A^{-1}, that satisfies

A^{-1} A = A A^{-1} = I   (1.8)

Unfortunately, except for some special classes of matrices, the determination of the inverse is not straightforward in general. Instead, the computation of matrix inverses requires the definition of three other operations: the determinant, the cofactor, and the adjugate.

First, we need another function called the permutation sign function.


Definition 1.3. Let K = \{k_1, k_2, \ldots, k_N\} be a sequence of distinct indices ranging from 1 to N. Let P(K) be the number of pairwise exchanges among the indices in the sequence K needed to reorder K into the ascending order \{1, 2, \ldots, N\}. Then the permutation sign function, denoted by \sigma(K), is defined by

\sigma(K) = (-1)^{P(K)}   (1.9)

which means it takes on the value of +1 or -1, depending on whether P(K) is even or odd, respectively.

For example, we have

\sigma(1, 2, 3) = +1 \quad \sigma(5, 1, 2, 4, 3) = -1 \quad \sigma(2, 1, 3, 4) = -1 \quad \sigma(6, 2, 1, 5, 3, 4) = +1

Definition 1.4. The determinant of a square matrix A of size N, denoted by either |A| or \det(A), is given by

\det(A) = \sum_{\substack{k_i \neq k_j \\ i,j = 1, \ldots, N}} \sigma(k_1, \ldots, k_N)\; a_{1,k_1} a_{2,k_2} \cdots a_{N,k_N}   (1.10)

where the summation is over all nonrepeated combinations of the indices 1, 2, \ldots, N.

Definition 1.5. The cofactor of an element a_{ij} of a square matrix A of size N, denoted by \mathrm{cof}(a_{ij}), is defined as

\mathrm{cof}(a_{ij}) = (-1)^{i+j} \det\left(A^{-ij}\right)   (1.11)

where A^{-ij} is the (ij)th redact.

Using cofactors, we can compute the determinant in a recursive manner.

LEMMA 1.1. Let A be a square matrix of size N; then \det(A) = a_{11} if N = 1. Otherwise, for any j,

\det(A) = \sum_{k=1}^{N} a_{kj}\, \mathrm{cof}(a_{kj})   (1.12)

Likewise, for any i,

\det(A) = \sum_{k=1}^{N} a_{ik}\, \mathrm{cof}(a_{ik})   (1.13)

PROOF. By induction, one can show that either the column expansion formula given in (1.12) or the row expansion formula given in (1.13) yields the same result as (1.10).
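A deliberately naive MATLAB sketch of the column expansion (1.12) follows; the function name cofactor_det is ours. It is for illustration only: the recursion costs O(N!) operations, whereas MATLAB's det uses a matrix factorization instead.

function d = cofactor_det(A)
% Determinant by cofactor expansion along the first column, as in (1.12).
% Save as cofactor_det.m; illustration only -- O(N!) cost.
N = size(A,1);
if N == 1
    d = A(1,1);
    return
end
d = 0;
for k = 1:N
    Ak = A; Ak(k,:) = []; Ak(:,1) = [];      % the (k1)th redact
    d = d + A(k,1) * (-1)^(k+1) * cofactor_det(Ak);
end
end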

We refer to A as singular if \det(A) = 0; otherwise, A is nonsingular. As we show next, only nonsingular matrices can have matrix inverses.


Definition 1.6. The adjugate² of a square matrix A is a matrix of the same size, denoted by \mathrm{adj}(A), consisting of the cofactors of each element in A but collected in a transposed arrangement, that is,

\mathrm{adj}(A) = \begin{pmatrix} \mathrm{cof}(a_{11}) & \cdots & \mathrm{cof}(a_{N1}) \\ \vdots & \ddots & \vdots \\ \mathrm{cof}(a_{1N}) & \cdots & \mathrm{cof}(a_{NN}) \end{pmatrix}   (1.14)

Using adjugates, we arrive at one key result for the computation of matrix inverses, if they exist.

LEMMA 1.2. Let A be any square matrix; then

A\, \mathrm{adj}(A) = \det(A)\, I \quad \text{and} \quad \mathrm{adj}(A)\, A = \det(A)\, I   (1.15)

Assuming matrix A is nonsingular, the inverse is given by

A^{-1} = \frac{1}{\det(A)}\, \mathrm{adj}(A)   (1.16)

PROOF. (See Section A.4.3.)

Note that matrix adjugates always exist, whereas the matrix inverse A^{-1} exists only if \det(A) \neq 0.

² In other texts, the term adjoint is used instead of adjugate. We chose to use the latter because the term adjoint is also used to refer to another matrix in linear operator theory.

EXAMPLE 1.1. Let

A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 0 \end{pmatrix}

then

\mathrm{cof}(a_{11}) = +\begin{vmatrix} 5 & 6 \\ 8 & 0 \end{vmatrix} ; \quad \mathrm{cof}(a_{21}) = -\begin{vmatrix} 2 & 3 \\ 8 & 0 \end{vmatrix} ; \quad \mathrm{cof}(a_{31}) = +\begin{vmatrix} 2 & 3 \\ 5 & 6 \end{vmatrix}

\mathrm{cof}(a_{12}) = -\begin{vmatrix} 4 & 6 \\ 7 & 0 \end{vmatrix} ; \quad \mathrm{cof}(a_{22}) = +\begin{vmatrix} 1 & 3 \\ 7 & 0 \end{vmatrix} ; \quad \mathrm{cof}(a_{32}) = -\begin{vmatrix} 1 & 3 \\ 4 & 6 \end{vmatrix}

\mathrm{cof}(a_{13}) = +\begin{vmatrix} 4 & 5 \\ 7 & 8 \end{vmatrix} ; \quad \mathrm{cof}(a_{23}) = -\begin{vmatrix} 1 & 2 \\ 7 & 8 \end{vmatrix} ; \quad \mathrm{cof}(a_{33}) = +\begin{vmatrix} 1 & 2 \\ 4 & 5 \end{vmatrix}

then

\mathrm{adj}(A) = \begin{pmatrix} -48 & 24 & -3 \\ 42 & -21 & 6 \\ -3 & 6 & -3 \end{pmatrix}

\mathrm{adj}(A)\, A = A\, \mathrm{adj}(A) = \begin{pmatrix} 27 & 0 & 0 \\ 0 & 27 & 0 \\ 0 & 0 & 27 \end{pmatrix}

and

A^{-1} = \frac{1}{27} \begin{pmatrix} -48 & 24 & -3 \\ 42 & -21 & 6 \\ -3 & 6 & -3 \end{pmatrix}

Although (1.16) is a general method for computing the inverse, there are more efficient ways to find the matrix inverse that take advantage of special structures and properties. For instance, the inverse of a diagonal matrix is another diagonal matrix consisting of the reciprocals of the diagonal elements. Another example is when the transpose happens to also be the inverse. Such matrices are known as orthogonal matrices. To determine whether a given matrix is indeed orthogonal, we can just compute A^T A and A A^T and check whether both products yield identity matrices.

The other unary operations include the trace, the real component, the imaginary component, and the complex conjugate operations. The trace of a square matrix A, denoted \mathrm{tr}(A), is defined as the sum of the diagonal elements.

EXAMPLE 1.2. Let A [=] 2 \times 2; then for M = \lambda I - A, where \lambda is a scalar parameter, we have the following results:

\det(\lambda I - A) = \lambda^2 - \mathrm{tr}(A)\, \lambda + \det(A)

\mathrm{adj}(\lambda I - A) = \begin{pmatrix} \lambda - a_{22} & a_{12} \\ a_{21} & \lambda - a_{11} \end{pmatrix}

(\lambda I - A)^{-1} = \frac{1}{\lambda^2 - \mathrm{tr}(A)\, \lambda + \det(A)} \begin{pmatrix} \lambda - a_{22} & a_{12} \\ a_{21} & \lambda - a_{11} \end{pmatrix}

Note that when \det(\lambda I - A) = 0, the inverse no longer exists, but \mathrm{adj}(\lambda I - A) will still be valid.

We now show some examples in which matrices can be used to represent indexed equations. The first example involves the matrix formulation of the finite difference approximation of a partial differential equation. The second involves the matrix formulation of a quadratic equation.

EXAMPLE 1.3. Consider the heat equation of an L \times W flat rectangular plate given by

\frac{\partial T}{\partial t} = \alpha \left( \frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} \right)   (1.17)

with stationary boundary conditions

T(0, y, t) = f_0(y) \qquad T(x, 0, t) = g_0(x)
T(L, y, t) = f_L(y) \qquad T(x, W, t) = g_W(x)

and initial condition T(x, y, 0) = h(x, y). We can introduce a uniform finite time increment \Delta t and finite differences for x and y given by \Delta x = L/(N+1) and \Delta y = W/(M+1), respectively, so that t_k = k\Delta t, x_n = n\Delta x, and y_m = m\Delta y,


[Figure 1.1. A schematic of the finite difference approximation of the temperature distribution T of a flat plate in Example 1.3.]

with k = 0, 1, \ldots, n = 0, \ldots, N+1, and m = 0, \ldots, M+1. The points corresponding to n = 0, n = N+1, m = 0, and m = M+1 represent the boundary values. We can then let [T(k)] be an N \times M matrix that represents the temperature distribution of the specific internal points of the plate at time t_k (see Figure 1.1).

Using the finite difference approximations of the partial derivatives at the point x = n\Delta x and y = m\Delta y, and time t = k\Delta t:³

\frac{\partial T}{\partial t} \approx \frac{T_{n,m}(k+1) - T_{n,m}(k)}{\Delta t}

\frac{\partial^2 T}{\partial x^2} \approx \frac{T_{n+1,m}(k) - 2 T_{n,m}(k) + T_{n-1,m}(k)}{\Delta x^2}

\frac{\partial^2 T}{\partial y^2} \approx \frac{T_{n,m+1}(k) - 2 T_{n,m}(k) + T_{n,m-1}(k)}{\Delta y^2}

then (1.17) is approximated by the following indexed equations:

T_{n,m}(k+1) = \gamma_x \left( T_{n-1,m}(k) + \left( \frac{1}{2\gamma_x} - 2 \right) T_{n,m}(k) + T_{n+1,m}(k) \right) + \gamma_y \left( T_{n,m-1}(k) + \left( \frac{1}{2\gamma_y} - 2 \right) T_{n,m}(k) + T_{n,m+1}(k) \right)   (1.18)

where

\gamma_x = \frac{\alpha\, \Delta t}{(\Delta x)^2} ; \qquad \gamma_y = \frac{\alpha\, \Delta t}{(\Delta y)^2}

and T_{n,m}(k) is the temperature at time t = k\Delta t located at (x, y) = (n\Delta x, m\Delta y).

³ The finite difference methods are discussed in more detail in Chapter 13.

The first group of terms in (1.18) involves only T_{n-1,m}, T_{n,m}, and T_{n+1,m}, that is, only a combination of row elements at fixed m. This means that the


first group of terms can be described by the product AT for some constant N \times N matrix A. Conversely, the second group of terms in (1.18) involves only a combination of column elements at fixed n, which means a product TB for some matrix B [=] M \times M. In anticipation of the boundary conditions, we need an extra matrix C [=] N \times M. Thus we should be able to represent (1.18) using a matrix formulation given by

T(k+1) = A\, T(k) + T(k)\, B + C   (1.19)

where A, B, and C are constant matrices.⁴

When formulating general matrix equations, it is often advisable to apply them to smaller matrices first. Thus let us start with a case in which N = 4 and M = 3. We can show that (1.18) can be represented by

[T(k+1)] = \gamma_x \begin{pmatrix} \beta_x & 1 & 0 & 0 \\ 1 & \beta_x & 1 & 0 \\ 0 & 1 & \beta_x & 1 \\ 0 & 0 & 1 & \beta_x \end{pmatrix} \begin{pmatrix} T_{11}(k) & T_{12}(k) & T_{13}(k) \\ T_{21}(k) & T_{22}(k) & T_{23}(k) \\ T_{31}(k) & T_{32}(k) & T_{33}(k) \\ T_{41}(k) & T_{42}(k) & T_{43}(k) \end{pmatrix}
+ \gamma_y \begin{pmatrix} T_{11}(k) & T_{12}(k) & T_{13}(k) \\ T_{21}(k) & T_{22}(k) & T_{23}(k) \\ T_{31}(k) & T_{32}(k) & T_{33}(k) \\ T_{41}(k) & T_{42}(k) & T_{43}(k) \end{pmatrix} \begin{pmatrix} \beta_y & 1 & 0 \\ 1 & \beta_y & 1 \\ 0 & 1 & \beta_y \end{pmatrix}
+ \gamma_x \begin{pmatrix} T_{01}(k) & T_{02}(k) & T_{03}(k) \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ T_{51}(k) & T_{52}(k) & T_{53}(k) \end{pmatrix} + \gamma_y \begin{pmatrix} T_{10}(k) & 0 & T_{14}(k) \\ T_{20}(k) & 0 & T_{24}(k) \\ T_{30}(k) & 0 & T_{34}(k) \\ T_{40}(k) & 0 & T_{44}(k) \end{pmatrix}

where \beta_x = 1/(2\gamma_x) - 2 and \beta_y = 1/(2\gamma_y) - 2. Generalizing, we have

A = \gamma_x \begin{pmatrix} \beta_x & 1 & & 0 \\ 1 & \ddots & \ddots & \\ & \ddots & \beta_x & 1 \\ 0 & & 1 & \beta_x \end{pmatrix} [=]\, N \times N

B = \gamma_y \begin{pmatrix} \beta_y & 1 & & 0 \\ 1 & \ddots & \ddots & \\ & \ddots & \beta_y & 1 \\ 0 & & 1 & \beta_y \end{pmatrix} [=]\, M \times M

C = \gamma_x \begin{pmatrix} p_1 & \cdots & p_M \\ 0 & \cdots & 0 \\ & \vdots & \\ 0 & \cdots & 0 \\ q_1 & \cdots & q_M \end{pmatrix} + \gamma_y \begin{pmatrix} r_1 & 0 & \cdots & 0 & s_1 \\ \vdots & \vdots & & \vdots & \vdots \\ r_N & 0 & \cdots & 0 & s_N \end{pmatrix}

⁴ More generally, if the boundary conditions are time-varying, then C = C(k). Also, if the coefficient \alpha = \alpha(t), then A and B will need to be replaced by A(k) and B(k), respectively.


where p_m = f_0(m\Delta y), q_m = f_L(m\Delta y), r_n = g_0(n\Delta x), and s_n = g_W(n\Delta x). The initial matrix is obtained using the initial condition, that is, T_{nm}(0) = h(n\Delta x, m\Delta y). Starting with T(0), one can then march iteratively through time using (1.19). (A specific example is given in exercise E1.21.)
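The following MATLAB sketch assembles A, B, and C and marches (1.19) through time. The plate size, \alpha, step sizes, boundary conditions (held at zero), and initial condition are all made-up test data, chosen so that \gamma_x and \gamma_y are small enough for the explicit scheme to stay stable:

L = 1; W = 1; alpha = 1;          % made-up plate and conductivity data
N = 20; M = 15; dt = 1e-4;
dx = L/(N+1); dy = W/(M+1);
gx = alpha*dt/dx^2; gy = alpha*dt/dy^2;   % gamma_x, gamma_y in (1.18)
bx = 1/(2*gx) - 2;  by = 1/(2*gy) - 2;    % beta_x, beta_y
A = gx*(bx*eye(N) + diag(ones(N-1,1),1) + diag(ones(N-1,1),-1));
B = gy*(by*eye(M) + diag(ones(M-1,1),1) + diag(ones(M-1,1),-1));
C = zeros(N,M);                   % all four boundaries held at zero here
x = dx*(1:N)'; y = dy*(1:M);      % internal grid points
T = sin(pi*x/L)*sin(pi*y/W);      % initial condition h(x,y), test data
for k = 1:500
    T = A*T + T*B + C;            % march (1.19) through time
end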

    EXAMPLE 1.4. The general second-order polynomial equation in N variables isgiven by

    =N

    i=1

    Nj=1

    aij xixj

    One could write this equation as

    [] = xT Axwhere

    A =

    a11 . . . a1N... . . . ...aN1 . . . aNN

    and x =x1...

    xN

Note that [ψ] is a 1 × 1 matrix in this formulation. The right-hand side is known as the quadratic form. However, because x_i x_j = x_j x_i, three alternative forms are possible:

[ψ] = x^T Q x,   [ψ] = x^T L x,   or   [ψ] = x^T U x

where

Q = [ q_11 ··· q_1N ; ⋮ ⋱ ⋮ ; q_N1 ··· q_NN ]

U = [ u_11 ··· u_1N ; ⋱ ⋮ ; 0 u_NN ]   (upper triangular)

L = [ ℓ_11 0 ; ⋮ ⋱ ; ℓ_N1 ··· ℓ_NN ]   (lower triangular)

and

q_ij = ( a_ij + a_ji ) / 2

u_ij = a_ij + a_ji if i < j ;  a_ii if i = j ;  0 if i > j

ℓ_ij = a_ij + a_ji if i > j ;  a_ii if i = j ;  0 if i < j

(The proof that all three forms are equivalent is left as an exercise in E1.34.)

This example shows that more than one matrix formulation is possible in some cases. Matrix Q is symmetric, whereas L is lower triangular and U is upper triangular. The most common formulation is to use the symmetric matrix Q.
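As a quick MATLAB sketch of these constructions (the coefficient matrix A and the point x below are arbitrary illustrative choices):

    % Build the symmetric, lower-, and upper-triangular forms from A
    A = [1 4 2; 0 3 5; 6 1 2];              % illustrative, nonsymmetric
    x = [1; -2; 3];
    Q = (A + A')/2;                          % q_ij = (a_ij + a_ji)/2
    U = triu(A + A', 1) + diag(diag(A));     % doubled strict upper part, a_ii kept
    L = tril(A + A', -1) + diag(diag(A));    % doubled strict lower part, a_ii kept
    [x'*A*x, x'*Q*x, x'*L*x, x'*U*x]         % all four values coincide

All four expressions return the same scalar, confirming the equivalence addressed in E1.34.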

    1.3 Properties of Matrix Operations

In this section, we discuss the different properties of matrix operations. With these properties, one could manipulate matrix equations to simplify them, generate efficient algorithms, or analyze the problem before actual matrix computations. We first discuss the basic properties involving addition, multiplication, and


Table 1.4. Properties of matrix operations

Commutative Operations
A ∘ B = B ∘ A        A + B = B + A
αA = Aα              AA^{-1} = A^{-1}A

Associativity of Sums and Products
A + (B + C) = (A + B) + C
A(BC) = (AB)C
A ∘ (B ∘ C) = (A ∘ B) ∘ C
A ⊗ (B ⊗ C) = (A ⊗ B) ⊗ C

Distributivity of Products
A(B + C) = AB + AC
(A + B)C = AC + BC
A ⊗ (B + C) = A ⊗ B + A ⊗ C
(A + B) ⊗ C = A ⊗ C + B ⊗ C
A ∘ (B + C) = A ∘ B + A ∘ C = B ∘ A + C ∘ A = (B + C) ∘ A
(AB) ⊗ (CD) = (A ⊗ C)(B ⊗ D)

Transpose of Products
(AB)^T = B^T A^T
(A ⊗ B)^T = A^T ⊗ B^T
(A ∘ B)^T = B^T ∘ A^T = A^T ∘ B^T

Inverse of Matrix Products and Kronecker Products
(AB)^{-1} = B^{-1}A^{-1}        (A ⊗ B)^{-1} = A^{-1} ⊗ B^{-1}

Reversible Operations
(A^T)^T = A        −(−A) = A        (A^{-1})^{-1} = A

Vectorization of Sums and Products
vec(A + B) = vec(A) + vec(B)
vec(BAC) = (C^T ⊗ B) vec(A)
vec(A ∘ B) = vec(A) ∘ vec(B)

inverses. Next is a separate subsection on the properties of determinants. Finally, we include a subsection on formulas that involve matrix inverses.

    1.3.1 Basic Properties

A list of some basic properties of matrix operations is given in Table 1.4. Most of the properties can be derived directly from the definitions given in Tables 1.1, 1.2, and 1.3. The proofs are given in Section A.4.1 as an appendix. The properties of the matrix operations allow for the manipulation of matrix equations before


Table 1.5. Definition of vectors

Vector  Description of elements
x       x_k is the annual supply rate (kg/year) of material from source k.
y       y_k is the annual production rate (kg/year) of product k.
z       z_k is the sale price per kg of product k.
w       w_k is the production cost per kg of the material from source k.

actual computations. They help in simplifying expressions that often yield important insights about the data or the system being investigated.

The first group of properties lists the commutativity, associativity, and distributivity properties of various sums and products. One general rule is to choose associations of products that reduce computation. For instance, let a, b, c, d, e, and f be column vectors of the same length; we should use the association

a b^T c d^T e f^T = a (b^T c) (d^T e) f^T

because both (b^T c) and (d^T e) are 1 × 1. A similar rule holds for using the distributive properties. For example, we can use distributivity to rearrange the following equation:

AD + ABCD = A(D + BCD) = A(I + BC)D
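The savings can be seen directly in MATLAB; the vector length below is an arbitrary illustrative choice:

    % Grouping the scalar products first avoids forming n-by-n intermediates
    n = 2000;
    a = rand(n,1); b = rand(n,1); c = rand(n,1);
    d = rand(n,1); e = rand(n,1); f = rand(n,1);
    tic; y1 = ((((a*b')*c)*d')*e)*f'; t_outer = toc;  % builds n-by-n intermediates
    tic; y2 = a*((b'*c)*(d'*e))*f';   t_inner = toc;  % only scalar intermediates
    norm(y1 - y2, 'fro')     % same n-by-n result, up to roundoff
    [t_outer, t_inner]       % the second grouping is dramatically faster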

More importantly, these properties allow matrix equations to be manipulated into simpler forms, as shown in the example that follows.

EXAMPLE 1.5. Consider a processing facility that can take raw material from M different sources to produce N different products. The fractional yield of product j per kilogram of material coming from source i can be collected in matrix form as F = (f_ij). In addition, define the cost, price, supply rates, and production rates by the column vectors given in Table 1.5. We simplify the situation by assuming that all the products are sold immediately after production without need for inventory. Let S, C, and P = (S − C) be the annual sales, annual cost, and annual net profit, respectively. We want to obtain a vector g whose kth element is the annual net profit per kilogram of material from source k, that is, P = g^T x.

Using matrix representation, we have

y = F x ;   S = z^T y ;   C = w^T x

then the net profit can be represented by

P = S − C = z^T F x − w^T x = ( z^T F − w^T ) x = g^T x

where g is given by

g = F^T z − w


More generally, the problem of maximizing the net profit by adjusting the supply rates is formulated as a typical linear programming problem:

max_x  g^T x     (objective function)

subject to

0 ≤ x ≤ x_max                (availability constraints)
y_min ≤ y (= F x) ≤ y_max    (demand constraints)
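In MATLAB, this linear program can be handed to linprog from the Optimization Toolbox; linprog minimizes, so we negate g. All data below are illustrative assumptions, not values from the text:

    % Illustrative data: M = 3 sources, N = 2 products
    F = [0.3 0.5 0.2; 0.4 0.1 0.6];     % fractional yields (N-by-M)
    z = [5; 8];  w = [1; 2; 1.5];       % prices and costs
    g = F'*z - w;                       % net profit per kg from each source
    xmax = [100; 80; 60];               % availability
    ymin = [10; 10];  ymax = [70; 90];  % demand bounds
    % express ymin <= F*x <= ymax as stacked one-sided inequalities
    Aineq = [F; -F];  bineq = [ymax; -ymin];
    x = linprog(-g, Aineq, bineq, [], [], zeros(3,1), xmax);
    profit = g'*x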

The transposes of matrix products turn out to be equal to the matrix products of the transposes but in the reversed sequence. Together with the associative property, this can be extended to the following results:

(ABC ··· EFG)^T = G^T F^T E^T ··· C^T B^T A^T
(A^k)^T = (A^T)^k
(AA^{-1})^T = (A^{-1})^T A^T = I = (A^T)^{-1} A^T

The last result shows that (A^T)^{-1} = (A^{-1})^T. Thus we often use the shorthand A^{-T} to mean either (A^T)^{-1} or (A^{-1})^T.

Similarly, the inverse of a matrix product is the product of the matrix inverses in the reverse sequence. This can be generalized to⁵

(ABC ···)^{-1} = ··· C^{-1} B^{-1} A^{-1}
(A^k)^{-1} = A^{-1} ··· A^{-1} = (A^{-1})^k
A^k A^ℓ = A^{k+ℓ}

Thus we can use A^{-k} to denote either (A^k)^{-1} or (A^{-1})^k. Note that these results are still consistent with A^0 = I.

EXAMPLE 1.6. Consider a resistive electrical network consisting of junction points, or nodes, that are connected to each other by links, where the links contain three types of electrical components: resistors, current sources, and voltage sources. We simplify our network to contain only two types of links. One type of link contains only one resistor, or one voltage source, or both connected in series.⁶

⁵ Note that the reversal does not apply to Kronecker products; that is,

(A ⊗ B ⊗ C ⊗ ···)^{-1} = A^{-1} ⊗ B^{-1} ⊗ C^{-1} ⊗ ···

⁶ If multiple resistors with resistances R_{j1}, R_{j2}, . . . are connected in series in the jth link, then they can be replaced by one resistor with resistance R_j = Σ_k R_{jk}. Likewise, if multiple voltage sources with signed voltages s_{j1}, s_{j2}, . . . are connected in series in the jth link, then they can be replaced by one voltage source with signed voltage s_j = Σ_k s_{jk}, where the sign is positive if the polarity goes from positive to negative along the current flow.


[Figure 1.2. An electrical network with resistors R_j in link j, voltage sources s_j in link j, and current sources A_{k,ℓ} from node k to node ℓ.]

The other type of link contains only a current source. One such network is shown in Figure 1.2.

Suppose there are n + 1 nodes and m ≥ (n + 1) links. By setting one of the nodes as having zero potential (the ground node), we want to determine the potentials of the remaining n nodes, as well as the current flowing through each link and the voltages across each of the resistors. To obtain the required equations, we first propose the direction of each link, select the ground node (node 0), and label the remaining nodes (nodes 1 to n). Based on the choices of current flow and node labels, we can form the node-link incidence matrix Λ[=]n × m, which is a matrix composed of only 0, 1, and −1. The ith row of Λ refers to the ith node, whereas the jth column refers to the jth link. Note that the links containing only current sources are not included during the formulation of the incidence matrix. (Instead, these links are involved only during the implementation of Kirchhoff's current laws.) We set Λ_ij = −1 if the current is flowing into node i along the jth link, and Λ_ij = +1 if the current is flowing out of node i along the jth link. For the network shown in Figure 1.2, the incidence matrix is given by

Λ = [ +1 −1 0 −1 0 0 ; 0 +1 −1 0 +1 0 ; 0 0 0 +1 −1 −1 ]

Let p_i be the potential of node i with respect to ground, and let e_j be the potential difference along link j between nodes k and ℓ, that is, where Λ_kj ≠ 0 and Λ_ℓj ≠ 0. Because the current flows from high to low potential,

e = Λ^T p

If the jth link contains a voltage source s_j, we assign it a positive value if the polarity is from positive to negative along the chosen direction of the current flow. Let v_j be the voltage across the jth resistor; then

e = v + s

Ohm's law states that the voltage across the jth resistor is given by v_j = i_j R_j, where i_j and R_j are the current and resistance in the jth link. In matrix form, we have

v = R i   where   R = [ R_1 0 ; ⋱ ; 0 R_m ]


Let the current sources flowing out of the ith node be given by A_{ij}, whereas those flowing into the ith node are given by A_{ℓi}. Then the net current inflow at node i due only to current sources will be

b_i = Σ_ℓ A_{ℓi} − Σ_j A_{ij}

Kirchhoff's current law states that the net flow of current at the ith node is zero. Thus we have

−Λ i + b = 0

In summary, for a given set of resistances, voltage sources, and current sources, we have enough information to find the potentials at each node, the voltage across each resistor, and the current flow along each link, based on the chosen ground point and proposed current flows. To solve for the node potentials, we have

e = v + s
Λ^T p = R i + s
R^{-1} Λ^T p − R^{-1} s = i
Λ R^{-1} Λ^T p − Λ R^{-1} s = Λ i = b
( Λ R^{-1} Λ^T ) p = b + Λ R^{-1} s
p = ( Λ R^{-1} Λ^T )^{-1} ( b + Λ R^{-1} s )     (1.20)

Using the values of p, we can find the voltages across the resistors,

v = Λ^T p − s     (1.21)

and finally, for the current flows,

i = R^{-1} v     (1.22)

For the network shown in Figure 1.2, suppose the values for the resistors, voltage source, and current source are given by {R_1, R_2, R_3, R_4, R_5, R_6} = {1 Ω, 2 Ω, 3 Ω, 0.5 Ω, 0.8 Ω, 10 Ω}, s_1 = 1 V, and A_{3,0} = 0.2 A. Then the solution using equations (1.20) to (1.22) yields:

p = [ 0.6118 ; 0.4254 ; 0.4643 ] volts

v = [ −0.3882 ; −0.1864 ; −0.4254 ; −0.1475 ; −0.0389 ; −0.4643 ] volts

i = [ −0.3882 ; −0.0932 ; −0.1418 ; −0.2950 ; −0.0486 ; −0.0464 ] amps
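These values can be reproduced with a short MATLAB script (a sketch of (1.20) to (1.22); the variable names are ours):

    % Solve the network of Figure 1.2 using (1.20)-(1.22)
    L = [ 1 -1  0 -1  0  0 ;
          0  1 -1  0  1  0 ;
          0  0  0  1 -1 -1 ];         % node-link incidence matrix
    R = diag([1 2 3 0.5 0.8 10]);     % resistances (ohms)
    s = [1 0 0 0 0 0]';               % 1 V source in link 1
    b = [0 0 -0.2]';                  % 0.2 A leaves node 3 via the current source
    p = (L/R*L') \ (b + L/R*s)        % node potentials, (1.20)
    v = L'*p - s                      % resistor voltages, (1.21)
    i = R \ v                         % link currents, (1.22)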

    Remarks:

1. R^{-1} is just a diagonal matrix containing the reciprocals of the diagonal elements of R.

2. ( Λ R^{-1} Λ^T ) is an n × n symmetric matrix, and its inverse is needed in equation (1.20). If n is large, it is often more efficient to approach the same problem using the numerical techniques that are covered in the next chapter, such as the conjugate gradient method.

The last group of properties given in Table 1.4 involves the relationship between vectorization, matrix products, and Kronecker products. These properties are very useful for reformulating matrix equations in which the unknown matrix X does not appear exclusively on the right or the left of the products in the equation. For example, a form known as the Sylvester matrix equation, which often results from control theory as well as from finite difference solutions, is given by

Q X + X R = C     (1.23)

where Q[=]N × N, R[=]M × M, and C[=]N × M are constant matrices, whereas X[=]N × M is the unknown matrix. After inserting appropriate identity matrices, the properties can be used to obtain the following result:

vec( Q X I_M + I_N X R ) = vec( C )

vec( Q X I_M ) + vec( I_N X R ) = ( I_M^T ⊗ Q ) vec( X ) + ( R^T ⊗ I_N ) vec( X )
                                = ( I_M ⊗ Q + R^T ⊗ I_N ) vec( X ) = vec( C )

By setting A = I_M ⊗ Q + R^T ⊗ I_N, x = vec(X), and b = vec(C), the problem can be recast as Ax = b.
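A minimal MATLAB sketch of this recasting, with arbitrary illustrative data (for random Q and R the coefficient matrix is nonsingular with probability one):

    % Solve the Sylvester equation QX + XR = C by vectorization
    N = 4; M = 3;
    Q = rand(N); R = rand(M); C = rand(N,M);   % illustrative data
    A = kron(eye(M), Q) + kron(R', eye(N));    % I_M (x) Q + R^T (x) I_N
    X = reshape(A \ C(:), N, M);               % C(:) is vec(C)
    norm(Q*X + X*R - C)                        % ~ 0, up to roundoff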

EXAMPLE 1.7. In Example 1.3, the finite difference equation resulted in the matrix equation given by

T(k+1) = A T(k) + T(k) B + C

where A[=]N × N, B[=]M × M, C[=]N × M, and T[=]N × M. At equilibrium, T(k+1) = T(k) = T_eq, a constant matrix. Thus the matrix equation becomes

T_eq = A T_eq + T_eq B + C

Using the vectorization properties in Table 1.4, we obtain

vec( T_eq ) = ( I_M ⊗ A ) vec( T_eq ) + ( B^T ⊗ I_N ) vec( T_eq ) + vec( C )

or

K x = b   →   x = K^{-1} b

where

K = I_{NM} − ( I_M ⊗ A ) − ( B^T ⊗ I_N )
x = vec( T_eq )
b = vec( C )

After solving for x, T_eq can be recovered by using the reshape operator, that is, T_eq = reshape(x, N, M).
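The same idea in MATLAB, sketched with small illustrative matrices (scaled so that K is safely nonsingular):

    % Equilibrium of T(k+1) = A*T(k) + T(k)*B + C
    N = 3; M = 2;
    A = 0.1*rand(N); B = 0.1*rand(M); C = rand(N,M);  % illustrative data
    K = eye(N*M) - kron(eye(M), A) - kron(B', eye(N));
    Teq = reshape(K \ C(:), N, M);
    norm(A*Teq + Teq*B + C - Teq)                     % ~ 0 at the fixed point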


Table 1.6. Properties of determinants†

1  Determinant of Products:  det(AB) = det(A) det(B)

2  Determinant of Triangular Matrices:  det(A) = Π_{i=1}^{N} a_ii

3  Determinant of Transpose:  det(A^T) = det(A)

4  Determinant of Inverses:  det(A^{-1}) = det(A)^{-1}

5  Permuted Columns: let B contain the columns of A permuted according to the sequence K; then det(B) = σ(K) det(A), where σ(K) is the permutation sign function.

6  Scaled Columns: if B = [ α_1 a_11 ··· α_N a_1N ; ⋮ ⋮ ; α_1 a_N1 ··· α_N a_NN ], then det(B) = ( Π_{j=1}^{N} α_j ) det(A)

7  Multilinearity (in any one column):

det[ a_11 ··· (x_1 + y_1) ··· a_1N ; ⋮ ⋮ ⋮ ; a_N1 ··· (x_N + y_N) ··· a_NN ]
  = det[ a_11 ··· x_1 ··· a_1N ; ⋮ ⋮ ⋮ ; a_N1 ··· x_N ··· a_NN ]
  + det[ a_11 ··· y_1 ··· a_1N ; ⋮ ⋮ ⋮ ; a_N1 ··· y_N ··· a_NN ]

8  Linearly Dependent Columns:  det(A) = 0 if Σ_{j=1}^{N} α_j A_{•,j} = 0 for some α_k ≠ 0, where A_{•,j} is the jth column of A.

† Using item 3 (i.e., that the transpose operation does not alter the determinant), a dual set of properties exists for items 5 to 8, in which the columns are replaced by rows.

    1.3.2 Properties of Determinants

Because the determinant is a very important matrix operation, we devote a separate table to the properties of determinants. A summary of these properties is given in Table 1.6, and the proofs are given in Section A.4.2 as an appendix.

Note that even though A and B may not commute, the determinants of both AB and BA are the same; that is,

det(AB) = det(A) det(B) = det(B) det(A) = det(BA)

Several properties of determinants help to improve computational efficiency.

For instance, the fact that the determinant of a triangular or diagonal matrix is just the product of its diagonal elements means that there is tremendous advantage in finding multiplicative factors that could diagonalize or triangularize the original matrix. Later, in Chapter 3, we try to find such a nonsingular T whose effect would be to make C = T^{-1}AT diagonal or triangular. Yet C and A will have the same determinant; that is,

det( T^{-1} A T ) = det( T^{-1} ) det( A ) det( T ) = (1/det(T)) det( A ) det( T ) = det( A )

The last property in the list is one of the key applications of determinants in linear algebra. It states that if the columns of a matrix are linearly dependent (defined next), then the determinant is zero.

Definition 1.7. Vectors {v_1, v_2, . . . , v_N} are linearly dependent if

Σ_{j=1}^{N} α_j v_j = 0     (1.24)

for some α_k ≠ 0.

This means that if {v_1, . . . , v_N} is a linearly dependent set of vectors, then at least one of the vectors in the set can be represented as a linear combination of the other (N − 1) vectors. For instance, let

v_1 = [ 1 ; 1 ; 1 ]   v_2 = [ 1 ; 2 ; 1 ]   v_3 = [ 0 ; 1 ; 0 ]

We can compute the determinant of V = ( v_1 v_2 v_3 ) to be 0 and conclude immediately that the columns are dependent. In fact, we easily check that v_1 = v_2 − v_3, v_2 = v_1 + v_3, or v_3 = v_2 − v_1.

EXAMPLE 1.8. Let a tetrahedron be described by four vertices in 3D space given by p_1, p_2, p_3, and p_4, as shown in Figure 1.3. Let v_1 = p_2 − p_1, v_2 = p_3 − p_1, and v_3 = p_4 − p_1 form a 3 × 3 matrix

V = ( v_1 v_2 v_3 )

It can be shown using the techniques given in Section 4.1, together with Section 4.2, that the volume of the tetrahedron can be found by the determinant formula:

Volume = (1/6) abs( det(V) )

For instance, let

p_1 = [ 1 ; 0 ; 0 ]   p_2 = [ 1 ; 1 ; 0 ]   p_3 = [ 1 ; 1 ; 1 ]   p_4 = [ 0 ; 1 ; 1 ]


[Figure 1.3. A tetrahedron described by four points: p_1, p_2, p_3, and p_4.]

then the tetrahedron formed by vertices p_1, p_2, p_3, and p_4 yields

V = [ 0 0 −1 ; 1 1 1 ; 0 1 1 ]   →   Volume = 1/6

If instead of p_4 we have p_4 = ( 1 0 1 )^T,

V = [ 0 0 0 ; 1 1 0 ; 0 1 1 ]   →   Volume = 0

which means that p_1, p_2, p_3, and p_4 are coplanar, with v_1 = v_2 − v_3.

    1.3.3 Matrix Inverse Formulas

In this section, we include some of the formulas for the inverses of matrices and two important results: the matrix inversion lemma (also known as the Woodbury formula) and Cramer's rule.

We start with the inverse of a diagonal matrix. The inverse of a diagonal matrix D is another diagonal matrix containing the reciprocals of the corresponding diagonal elements d_i; that is,

D^{-1} = [ d_1 0 ; ⋱ ; 0 d_N ]^{-1} = [ d_1^{-1} 0 ; ⋱ ; 0 d_N^{-1} ]     (1.25)

Direct calculation can be used to show that DD^{-1} = D^{-1}D = I.

    Next, we have a formula for the inverse of a triangular matrix T of size N.

LEMMA 1.3. For a triangular matrix T[=]N × N, let D be the diagonal matrix with d_ii = t_ii, and let K = D − T. Then

T^{-1} = D^{-1} ( I + Σ_{ℓ=1}^{N−1} (K D^{-1})^ℓ )     (1.26)

PROOF. Multiply (1.26) by T = D − K and expand:

T T^{-1} = (D − K) D^{-1} ( I + Σ_{ℓ=1}^{N−1} (K D^{-1})^ℓ ) = I − (K D^{-1})^N

but K D^{-1} is a strictly triangular matrix and is therefore nilpotent (see exercise E1.9); that is, (K D^{-1})^N = 0.


Next, we discuss an important result in matrix theory known as the matrix inversion lemma, also known as the Woodbury matrix formula.

LEMMA 1.4. Let A, C, and M = C^{-1} + DA^{-1}B be nonsingular; then

(A + BCD)^{-1} = A^{-1} − A^{-1} B [ C^{-1} + DA^{-1}B ]^{-1} DA^{-1}     (1.27)

PROOF. With M = C^{-1} + DA^{-1}B, let Q be the right-hand side of (1.27); that is,

Q = A^{-1} − A^{-1} B M^{-1} D A^{-1}     (1.28)

Then,

(A + BCD) Q = (AQ) + (BCDQ)
            = ( AA^{-1} − AA^{-1}BM^{-1}DA^{-1} ) + ( BCDA^{-1} − BCDA^{-1}BM^{-1}DA^{-1} )
            = I + BCDA^{-1} − B ( I + CDA^{-1}B ) M^{-1}DA^{-1}
            = I + BCDA^{-1} − B ( CC^{-1} + CDA^{-1}B ) M^{-1}DA^{-1}
            = I + BCDA^{-1} − BC ( C^{-1} + DA^{-1}B ) M^{-1}DA^{-1}
            = I + BCDA^{-1} − BCMM^{-1}DA^{-1}
            = I + BCDA^{-1} − BCDA^{-1}
            = I

In a similar fashion, one can also show that Q(A + BCD) = I.

Remark: The matrix inversion lemma given by (1.27) is usually applied in cases in which the inverse of A is already known and the size of C is significantly smaller than that of A.

EXAMPLE 1.9. Let

G = [ 1 0 −2 ; −2 2 0 ; 1 −1 3 ] = T + w v^T

where

T = [ 1 0 0 ; −2 2 0 ; 1 −1 3 ] ;   w = [ 1 ; 0 ; 0 ]   and   v^T = ( 0 0 −2 )

This means we split G into a triangular matrix T plus the product of a column vector w and a row vector v^T. We can use Lemma 1.3 to find

T^{-1} = [ 1 0 0 ; 1 1/2 0 ; 0 1/6 1/3 ]


Then, with (1.27),

G^{-1} = ( T + w [1] v^T )^{-1} = T^{-1} − T^{-1} w ( 1 + v^T T^{-1} w )^{-1} v^T T^{-1} = [ 1 1/3 2/3 ; 1 5/6 2/3 ; 0 1/6 1/3 ]

where we took advantage of the fact that ( 1 + v^T T^{-1} w ) [=] 1 × 1.
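Verifying the rank-one update in MATLAB:

    % Rank-one update of a known triangular inverse, via (1.27)
    T  = [1 0 0; -2 2 0; 1 -1 3];
    Ti = [1 0 0; 1 1/2 0; 0 1/6 1/3];           % T^-1 from Lemma 1.3
    w  = [1; 0; 0];  v = [0; 0; -2];
    Ginv = Ti - (Ti*w)*(v'*Ti)/(1 + v'*Ti*w)    % [1 1/3 2/3; 1 5/6 2/3; 0 1/6 1/3]
    norm(Ginv - inv(T + w*v'))                  % ~ 0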

We complete this subsection with the discussion of a technique used to solve for a subset of the unknowns in Ax = b. Suppose we want to solve for only one of the unknowns, for example, the kth element of x, for a given linear equation Ax = b. One could extract the kth element of x = A^{-1}b, but this involves the evaluation of A^{-1}, which can be computationally expensive. As it turns out, finding the inverse is unnecessary if only one unknown is needed: one can instead use Cramer's rule, as given by the following lemma.

LEMMA 1.5. Let A[=]N × N be nonsingular; then

x_k = det( A^{[k,b]} ) / det( A )     (1.29)

where A^{[k,b]} is obtained from A by replacing the kth column with b.

PROOF. Using (1.16), x = A^{-1}b can be written as

[ x_1 ; ⋮ ; x_N ] = (1/det(A)) [ cof(a_11) ··· cof(a_N1) ; ⋮ ⋱ ⋮ ; cof(a_1N) ··· cof(a_NN) ] [ b_1 ; ⋮ ; b_N ]

or, for the kth element,

x_k = ( Σ_{j=1}^{N} b_j cof(a_jk) ) / det(A)

The numerator is just the determinant of a matrix A^{[k,b]}, which is obtained from A with the kth column replaced by b.

EXAMPLE 1.10. Let

A = [ 1 0 −2 ; −2 2 0 ; 1 −1 3 ]   and   b = [ −2 ; 3 ; 2 ]

Then, for Ax = b, the value of x_2 can be found immediately using Cramer's rule:

x_2 = det( A^{[2,b]} ) / det( A ) = det[ 1 −2 −2 ; −2 3 0 ; 1 2 3 ] / det[ 1 0 −2 ; −2 2 0 ; 1 −1 3 ] = 11/6
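The same computation in MATLAB:

    % Cramer's rule (1.29) for a single unknown
    A = [1 0 -2; -2 2 0; 1 -1 3];
    b = [-2; 3; 2];
    k = 2;
    Ak = A;  Ak(:,k) = b;          % replace the kth column with b
    xk = det(Ak)/det(A)            % 11/6, the 2nd entry of A\b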


    1.4 Block Matrix Operations

A set of operations called block matrix operations (also known as partitioned matrix operations) takes advantage of special submatrix structures. The block operations are given as follows:

[ A B ; C D ] [ E F ; G H ] = [ AE + BG  AF + BH ; CE + DG  CF + DH ]     (1.30)

det[ A 0 ; C D ] = det( A ) det( D )     (1.31)

det[ A B ; C D ] = det( A ) det( D − CA^{-1}B ) ;  if A^{-1} exists     (1.32)
                 = det( D ) det( A − BD^{-1}C ) ;  if D^{-1} exists     (1.33)

[ A B ; C D ]^{-1} = [ W X ; Y Z ]     (1.34)

where W, X, Y, and Z depend on the two possible cases:

Case 1: A and Δ = D − CA^{-1}B are nonsingular; then

Z = Δ^{-1}
Y = −Δ^{-1}CA^{-1} = −ZCA^{-1}
X = −A^{-1}BΔ^{-1} = −A^{-1}BZ
W = A^{-1}( I + BΔ^{-1}CA^{-1} ) = A^{-1}( I − BY ) = ( I − XC ) A^{-1}     (1.35)

Case 2: D and Θ = A − BD^{-1}C are nonsingular; then

W = Θ^{-1}
X = −Θ^{-1}BD^{-1} = −WBD^{-1}
Y = −D^{-1}CΘ^{-1} = −D^{-1}CW
Z = D^{-1}( I + CΘ^{-1}BD^{-1} ) = D^{-1}( I − CX ) = ( I − YB ) D^{-1}     (1.36)

The proofs of (1.30) through (1.36) are given in Section A.4.5. The matrices Δ = D − CA^{-1}B and Θ = A − BD^{-1}C are known as the Schur complements of A and D, respectively.

EXAMPLE 1.11. Consider the open-loop process structure consisting of R process units, as shown in Figure 1.4. The local state vector for process unit i is given by x_i = ( x_{1i}, . . . , x_{Ni} )^T. For instance, x_{ki} could stand for the kth species in process unit i. The interaction among the process units is given by

A_1 x_1 + B_1 x_2 = p_1
C_{i−1} x_{i−1} + A_i x_i + B_i x_{i+1} = p_i    if 1 < i < R
C_{R−1} x_{R−1} + A_R x_R = p_R


[Figure 1.4. An open-loop system of R process units.]

where A_i, B_i, C_i [=] N × N and p_i [=] N × 1. A block matrix description is

F x = p

where F is the block-tridiagonal matrix

F = [ A_1 B_1 ··· 0 ; C_1 A_2 ⋱ ⋮ ; ⋮ ⋱ ⋱ B_{R−1} ; 0 ··· C_{R−1} A_R ] ;   x = [ x_1 ; ⋮ ; x_R ] ;   p = [ p_1 ; ⋮ ; p_R ]

As a numerical illustration, let N = 2 and R = 2 and

F = [ 1 2 0 1 ; 1 2 1 −2 ; 0 1 2 1 ; 1 −1 0 1 ]   and   p = [ 0.4 ; −0.5 ; 0.4 ; −0.6 ]

Using (1.36) to find F^{-1}, we have

D^{-1} = [ 0.5 −0.5 ; 0 1.0 ]

Θ^{-1} = ( A − B D^{-1} C )^{-1} = [ 0.0952 0.2857 ; 0.3333 0 ]

from which

W = [ 0.0952 0.2857 ; 0.3333 0 ]
X = [ −0.1429 0.6190 ; 0 −0.3333 ]
Y = [ −0.2857 0.1429 ; 0.2381 −0.2857 ]
Z = [ 0.4286 0.1429 ; 0.1429 0.0476 ]

and

x = [ x_1 ; x_2 ] = F^{-1} p = [ W X ; Y Z ] [ 0.4 ; −0.5 ; 0.4 ; −0.6 ] = [ −0.5333 ; 0.3333 ; −0.1 ; 0.2667 ]

Remark: Suppose we are interested only in x_1 (or x_2); then one needs only the values of W and X (or Y and Z, respectively).
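A MATLAB check of this block inversion (Case 2 of (1.36)):

    % Block inverse via the Schur complement of D, eq. (1.36)
    F = [1 2 0 1; 1 2 1 -2; 0 1 2 1; 1 -1 0 1];
    p = [0.4; -0.5; 0.4; -0.6];
    A = F(1:2,1:2); B = F(1:2,3:4); C = F(3:4,1:2); D = F(3:4,3:4);
    W = inv(A - B/D*C);            % Theta^-1
    X = -W*B/D;
    Y = -D\(C*W);
    Z = D\(eye(2) - C*X);
    x = [W X; Y Z]*p               % equals F\p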

    1.5 Matrix Calculus

In this section, we establish the conventions and notations to be used in this book for derivatives and integrals of systems of multivariable functions and equations. For simplicity, we assume that the functions are sufficiently differentiable. The main


advantage of matrix calculus is to allow for compact notation and thus improve the tractability of calculations with large systems of differential equations. This means that matrix algebra and matrix analysis tools can be used to study the solution and behavior of systems of differential equations, the numerical solution of systems of nonlinear algebraic equations, and the numerical optimization of multivariable functions.

    1.5.1 Matrix of Univariable Functions

Let A(t) be a matrix of univariable functions; then the derivative of A(t) with respect to t is defined as

(d/dt) A(t) = lim_{Δt→0} (1/Δt) ( A(t + Δt) − A(t) ) = [ da_11/dt ··· da_1M/dt ; ⋮ ⋱ ⋮ ; da_N1/dt ··· da_NM/dt ]     (1.37)

Based on (1.37), we can obtain the various properties given in Table 1.7. For the derivative of determinants, the proof is given in Section A.4.6.

Likewise, the integral of a matrix of univariable functions is defined as follows:

∫_{t_0}^{t_f} A(t) dt = lim_{Δt→0} Σ_{k=0}^{T−1} A(t_0 + kΔt) Δt = [ ∫_{t_0}^{t_f} a_11(t) dt ··· ∫_{t_0}^{t_f} a_1M(t) dt ; ⋮ ⋱ ⋮ ; ∫_{t_0}^{t_f} a_N1(t) dt ··· ∫_{t_0}^{t_f} a_NM(t) dt ]     (1.38)

where T = (t_f − t_0)/Δt. Based on the linearity property of integrals, we have the properties shown in Table 1.8.

EXAMPLE 1.12. Define the following function as the matrix exponential:

exp( A(t) ) = I + A(t) + (1/2!) A(t)² + (1/3!) A(t)³ + ···     (1.39)

Then the derivative of exp( A(t) ) is given by

(d/dt) exp( A(t) ) = 0 + (d/dt)A(t) + (1/2!) [ A(t) ( (d/dt)A(t) ) + ( (d/dt)A(t) ) A(t) ] + ···

In general, A(t) and its derivative do not commute. However, for the special case in which A and its derivative do commute, the derivative of the matrix exponential simplifies to

(d/dt) exp( A(t) ) = (d/dt)A(t) + A(t) ( (d/dt)A(t) ) + (1/2!) A(t)² ( (d/dt)A(t) ) + ···
                   = exp( A(t) ) ( (d/dt)A(t) ) = ( (d/dt)A(t) ) exp( A(t) )

One such case is when A(t) is diagonal. Another case is when A(t) = α(t)M, where M is a constant square matrix.
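As a numerical sanity check in MATLAB for the commuting case A(t) = α(t)M (the particular α, M, t, and step size below are illustrative choices):

    % d/dt expm(A(t)) = (dA/dt)*expm(A(t)) when A and dA/dt commute
    M  = [0 1; -1 0];                            % constant matrix
    a  = @(t) sin(t);   da = @(t) cos(t);        % alpha(t) and its derivative
    A  = @(t) a(t)*M;
    t  = 0.7;  h = 1e-6;
    lhs = (expm(A(t+h)) - expm(A(t-h)))/(2*h);   % central-difference derivative
    rhs = da(t)*M*expm(A(t));                    % closed-form derivative
    norm(lhs - rhs)                              % small (roundoff plus O(h^2))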


Table 1.7. Properties of derivatives of matrices of univariable functions

1  Sum of Matrices:       (d/dt)( M(t) + N(t) ) = (dM/dt) + (dN/dt)

2  Scalar Products:       (d/dt)( α(t) M(t) ) = (dα/dt) M + α (dM/dt)

3  Matrix Products:       (d/dt)( M(t) N(t) ) = (dM/dt) N + M (dN/dt)

4  Hadamard Products:     (d/dt)( M(t) ∘ N(t) ) = (dM/dt) ∘ N + M ∘ (dN/dt)

5  Kronecker Products:    (d/dt)( M(t) ⊗ N(t) ) = (dM/dt) ⊗ N + M ⊗ (dN/dt)

6  Partitioned Matrices:  (d/dt) [ A(t) B(t) ; C(t) D(t) ] = [ dA/dt dB/dt ; dC/dt dD/dt ]

7  Matrix Transpose:      (d/dt)( A(t)^T ) = ( dA/dt )^T

8  Matrix Inverse:        (d/dt)( A(t)^{-1} ) = −A^{-1} (dA/dt) A^{-1}

9  Determinants:          (d/dt) det( A(t) ) = Σ_{k=1}^{N} det( A_k(t) )

   where A_k is the matrix A with its kth row replaced by the derivative of that row:

   A_k = [ a_11 ··· a_1N ; ⋮ ; da_k1/dt ··· da_kN/dt ; ⋮ ; a_N1 ··· a_NN ]  ← kth row

EXAMPLE 1.13. Three vertices of a tetrahedron are stationary, namely p_1, p_2, and p_3. The last vertex p_4(t) moves as a function of t. As described in Example 1.8, the volume of the tetrahedron (after applying a transpose operation) is given by

Vol = (1/6) det[ (p_2 − p_1)^T ; (p_3 − p_1)^T ; (p_4 − p_1)^T ]


Table 1.8. Properties of integrals of matrices of univariable functions

1  Sum of Matrices:      ∫ ( M(t) + N(t) ) dt = ∫ M dt + ∫ N dt

2  Scalar Products:      ∫ ( αM ) dt = α ∫ M dt, if α is constant
                                     = ( ∫ α dt ) M, if M is constant

3  Matrix Products:      ∫ ( MN ) dt = M ∫ N dt, if M is constant
                                     = ( ∫ M dt ) N, if N is constant

4  Hadamard Products:    ∫ ( M(t) ∘ N(t) ) dt = M ∘ ∫ N dt, if M is constant
                                              = ( ∫ M dt ) ∘ N, if N is constant

5  Kronecker Products:   ∫ ( M(t) ⊗ N(t) ) dt = M ⊗ ∫ N dt, if M is constant
                                              = ( ∫ M dt ) ⊗ N, if N is constant

6  Partitioned Matrices: ∫ [ A(t) B(t) ; C(t) D(t) ] dt = [ ∫A dt ∫B dt ; ∫C dt ∫D dt ]

7  Matrix Transpose:     ∫ A(t)^T dt = ( ∫ A dt )^T

Using the formula for the derivative of determinants (cf. property 9 in Table 1.7), the rate of change of Vol with respect to t is given by

(d/dt) Vol = (1/6) [ 0 + 0 + det[ (p_2 − p_1)^T ; (p_3 − p_1)^T ; (d/dt)(p_4^T) ] ]

  • P1: JZP Trim: 7in 10in Top: 0.375in Gutter: 0.875inCUUS1935-01 cuus1935/Co 978 1 107 00412 2 February 11, 2013 22:11

    1.5 Matrix Calculus 35

For instance, let the points be given by

p_1 = [ 1 ; 2 ; 0 ]   p_2 = [ 0.5 ; 1 ; 0 ]   p_3 = [ 1 ; 1 ; 0 ]   p_4 = [ 2t + 3 ; t + 1 ; t + 5 ]

then

(d/dt) Vol = 0 + 0 + (1/6) det[ −0.5 −1 0 ; 0 −1 0 ; 2 1 1 ] = 1/12

EXAMPLE 1.14. Let f(θ) = p^T Q(θ) p, where p is constant and Q(θ) is a square matrix. Then the integral of f(θ) can be evaluated as

∫_0^π f(θ) dθ = ∫_0^π p^T Q(θ) p dθ = p^T ( ∫_0^π Q(θ) dθ ) p

For a specific example,

∫_0^π ( p_1 p_2 ) [ cos(θ) −sin(θ) ; sin(θ) cos(θ) ] [ p_1 ; p_2 ] dθ = ( p_1 p_2 ) [ 0 −2 ; 2 0 ] [ p_1 ; p_2 ] = 0

1.5.2 Derivatives of Multivariable Functions

Let x_i, i = 1, . . . , M, be independent variables collected in a column vector as

x = [ x_1 ; ⋮ ; x_M ]

Then a multivariable scalar function f of these variables is denoted by

f(x) = f( x_1, x_2, . . . , x_M )

whereas a vector of multivariable scalar functions f(x) is also arranged in a column vector as

f(x) = [ f_1(x) ; ⋮ ; f_N(x) ] = [ f_1( x_1, x_2, . . . , x_M ) ; ⋮ ; f_N( x_1, x_2, . . . , x_M ) ]

We denote by (d/dx) f(x) a row vector of length M, known as the gradient vector, containing the partial derivatives of f(x):

(d/dx) f(x) = ( ∂f/∂x_1 , . . . , ∂f/∂x_M )     (1.40)
