


Distributed Computation of Linear Matrix Equations: An Optimization Perspective

Xianlin Zeng, Member, IEEE, Shu Liang, Yiguang Hong, Fellow, IEEE, and Jie Chen, Senior Member, IEEE

Abstract—This paper investigates the distributed computation of the well-known linear matrix equation in the form of AXB = F, with the matrices A, B, X, and F of appropriate dimensions, over multiagent networks from an optimization perspective. In this paper, we consider the standard distributed matrix-information structures, where each agent of the considered multiagent network has access to one of the subblock matrices of A, B, and F. To be specific, we first propose different decomposition methods to reformulate the matrix equations in standard structures as distributed constrained optimization problems by introducing substitutional variables; we show that the solutions of the reformulated distributed optimization problems are equivalent to least squares solutions to the original matrix equations; and we design distributed continuous-time algorithms for the constrained optimization problems by using augmented matrices and a derivative feedback technique. Moreover, we prove the exponential convergence of the algorithms to a least squares solution to the matrix equation for any initial condition.

Index Terms—Constrained convex optimization, distributed computation, least squares solution, linear matrix equation, substitutional decomposition.

I. INTRODUCTION

RECENTLY, the increasing scale and big data of engineering systems and science problems have posed new challenges for design based on computation, communication, and control. Traditional centralized algorithms for the computation of small or modest-sized problems are often entirely infeasible for large-scale problems. As a result, distributed algorithms over multiagent networks have attracted a significant amount of research attention due to their broad range of applications in natural science, social science, and engineering. Particularly, distributed optimization, which seeks a global optimal solution with the objective function as a sum of the local objective functions of agents, has become increasingly popular [1]–[3]. In fact, distributed optimization with different types of constraints, including local constraints and coupled constraints, has been considered and investigated using either discrete-time or continuous-time solvers (see [1]–[7]). Recently, distributed continuous-time algorithms have received much attention in [2]–[4], [7]–[12], mainly because continuous-time physical systems may be involved in solving for optimal solutions, and because the continuous-time approach may provide an effective tool for analysis and design, though distributed designs for many important problems remain challenging.

Manuscript received December 21, 2017; revised April 25, 2018; accepted June 8, 2018. Date of publication June 14, 2018; date of current version April 24, 2019. This work was supported in part by the National Key Research and Development Program of China under Grant 2016YFB0901902, in part by the National Natural Science Foundation of China under Grant 61333001, Grant 61403231, and Grant 61603378, in part by the Fundamental Research Funds for the China Central Universities of USTB under Grant FRF-TP-17-088A1, and in part by the China Postdoctoral Science Foundation under Grant 2017M620020. Recommended by Associate Editor W. Michiels. (Corresponding author: Xianlin Zeng.)

X. Zeng is with the Key Laboratory of Intelligent Control and Decision of Complex Systems, School of Automation, Beijing Institute of Technology, 100081 Beijing, China (e-mail: [email protected]).

S. Liang is with the Key Laboratory of Knowledge Automation for Industrial Processes of Ministry of Education, School of Automation and Electrical Engineering, University of Science and Technology Beijing, 100083 Beijing, China (e-mail: [email protected]).

Y. Hong is with the Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, 100190 Beijing, China (e-mail: [email protected]).

J. Chen is with the Beijing Advanced Innovation Center for Intelligent Robots and Systems (Beijing Institute of Technology), Key Laboratory of Biomimetic Robots and Systems, Beijing Institute of Technology, Ministry of Education, 100081 Beijing, China (e-mail: [email protected]).

Digital Object Identifier 10.1109/TAC.2018.2847603

In fact, the distributed computation of the linear algebraic equation of the form Ax = b, where A is a matrix and x and b are vectors of appropriate dimensions, over a multiagent network has attracted much research attention, because it is fundamental for many computational tasks and practical engineering problems. Mainly based on the distributed optimization idea, distributed algorithms have appeared for solving the linear algebraic equation Ax = b. The significant results in [11]–[18] provided various distributed algorithms for the standard case in which each agent knows a few rows of A and b, while the authors in [19] proposed a distributed computation approach for another standard case, where each agent has knowledge of a few columns of the matrix A. In fact, the analysis given in [12], [15], [17]–[19] depends on the existence of exact solutions to the linear equations. Specifically, the authors in [12] proposed a discrete-time distributed algorithm for a solvable linear equation and presented necessary and sufficient conditions for exponential convergence of the algorithm, while the authors in [15] developed a continuous-time algorithm with an exponential convergence rate for a nonsingular and square A and extended the algorithm to the case where A is of full row rank, with bounds on the convergence rate. Furthermore, the authors in [18] constructed a distributed algorithm and derived necessary and sufficient conditions on a time-dependent graph for an exponential convergence rate. Additionally, the authors in [11] considered distributed computation of a least squares solution to linear equations that may have no exact solutions, by providing approximate least squares solutions, while the authors in [16] dealt with the problem for the least squares solutions with different graphs and appropriate step sizes.

Although the distributed computation of Ax = b has been studied in the past several years, results on the distributed computation of general linear matrix equations remain scarce. Note that linear matrix equations are very important and are related to fundamental problems in applied mathematics and computational technology, such as the existence of solutions of algebraic equations and the stability analysis of linear systems [20], [21]. One of the most famous matrix equations is AXB = F with the matrices A, X, B, and F of appropriate dimensions. The computation of its solution X plays a fundamental role in many important application problems, such as the computation of (generalized) Sylvester equations and generalized inverses of matrices (see [20]–[23]). It is worth pointing out that the computation of the special form AX = F (see [24]–[26]) or the still more special form Ax = b, a linear algebraic equation with vectors x and b (see [11]–[18]), has also been widely studied for a broad range of applications.

The objective of this paper is to compute a least squares solution to the well-known matrix equation AXB = F over a multiagent network with distributed information structures. Considering that the computation of a least squares solution to the linear algebraic equation Ax = b can be related to optimization problems such as \min_x \|Ax - b\|^2, we also take a distributed optimization perspective to investigate the solution of this matrix equation over a large-scale network. Note that distributed linear matrix equations may have different distributed information structures, depending on which parts of A, B, and F are known by which agents. Based on the column or row subblocks of the matrices A, B, and F that each agent may know, we obtain eight standard matrix-information structures (see Section III for details), and then provide different substitutional decomposition structures to transform the computation problem into different distributed constrained optimization problems, where each agent only knows local information (instead of the whole data of the matrices) and obtains the solution by communicating with its neighbors. Then, we propose distributed continuous-time algorithms and analyze their convergence with the help of control techniques, such as stability theory [27] and derivative feedback [28]. In other words, we employ both constrained convex optimization and control ideas to compute a least squares solution to AXB = F. The technical contribution of the paper is summarized as follows.

1) For a distributed design to solve the linear matrix equation of the form AXB = F, we propose eight standard distributed structures, and then construct different decomposition transformations with substitutional variables to reformulate the original computation problem into distributed optimization problems with different constraints (related to consensus intersections or coupled equalities), whose solutions are proved to be least squares solutions to the original matrix equation. The paper presents a distributed optimization perspective for investigating distributed computation problems of linear matrix equations.

2) Based on the reformulated optimization problems, we design distributed continuous-time algorithms to solve linear matrix equations in the proposed standard structures, respectively, by using modified Lagrangian functions and derivative feedbacks. Because the structures of the problems are different, the proposed algorithms are designed using different techniques in distributed optimization and control, although all the algorithms are of primal-dual type. Note that the distributed (continuous-time or discrete-time) algorithms for the very special case Ax = b, which were widely investigated in [11]–[16], [19], cannot be applied to the computation of the matrix equation.

3) For the various distributed algorithms in the corresponding structures, we provide rigorous proofs of the correctness and exponential convergence of the algorithms to a least squares solution, based on saddle-point dynamics and stability theory under mild conditions. Note that some assumptions (such as the existence of exact solutions or the boundedness of least squares solutions in [11], [12], [15], [17]–[19]) for Ax = b are not required in our paper, and therefore our results may also provide a new viewpoint for the distributed computation of Ax = b and its theoretical analysis.

The remainder of this paper is organized as follows. Preliminary knowledge is presented in Section II, while the problem formulation of solving a matrix equation with distributed information and the main result of this paper are given in Section III. Then, the reformulations of the matrix equation in different structures, distributed algorithms for the reformulated optimization problems, and their exponential convergence are given in Section IV. Following that, a numerical simulation is carried out for illustration in Section V. Finally, concluding remarks are provided in Section VI.

II. MATHEMATICAL PRELIMINARIES

In this section, we introduce the necessary notation and background related to matrices, graph theory, convex analysis, optimization, and convergence properties.

A. Matrices

Denote R as the set of real numbers, R^n as the set of n-dimensional real column vectors, R^{n×m} as the set of n-by-m real matrices, and I_n as the n × n identity matrix. For A ∈ R^{m×n}, we denote rank A as the rank of A, A^T as the transpose of A, range(A) as the range of A, ker(A) as the kernel of A, and tr(A) as the trace of A. Write 1_n (1_{n×q}) for the n-dimensional column vector (n × q matrix) with all elements equal to 1, 0_n (0_{n×q}) for the n-dimensional column vector (n × q matrix) with all elements equal to 0, A ⊗ B for the Kronecker product of matrices A and B, and vec(A) for the vector obtained by stacking the columns of matrix A. Furthermore, denote ‖·‖ as the Euclidean norm and ‖·‖_F as the Frobenius norm of real matrices, defined by

\|A\|_F = \sqrt{\mathrm{tr}(A^T A)} = \sqrt{\sum_{i,j} A_{i,j}^2}.

Let ⟨·, ·⟩_F be the Frobenius inner product of real matrices, defined by \langle A_1, A_2 \rangle_F = \mathrm{tr}(A_1^T A_2) = \sum_{i,j} (A_1)_{i,j} (A_2)_{i,j} for A_1, A_2 ∈ R^{m×n}, which satisfies \langle A_1 A_2, A_3 \rangle_F = \langle A_1, A_3 A_2^T \rangle_F = \langle A_2, A_1^T A_3 \rangle_F for A_1 ∈ R^{m×n}, A_2 ∈ R^{n×q}, and A_3 ∈ R^{m×q}. Let \{m_j\}_{j=1}^n and \{q_j\}_{j=1}^n be sequences of n positive integers with \sum_{j=1}^n m_j = m and \sum_{j=1}^n q_j = q, and let A_i ∈ R^{m_i×q_i} for i ∈ {1, ..., n}. Define the augmented matrices [A_i]_R^{\{m_j\}_{j=1}^n} and [A_i]_C^{\{q_j\}_{j=1}^n} as

[A_i]_R^{\{m_j\}_{j=1}^n} \triangleq [0_{q_i×m_1}, \ldots, 0_{q_i×m_{i-1}}, A_i^T, 0_{q_i×m_{i+1}}, \ldots, 0_{q_i×m_n}]^T \in R^{m×q_i}   (1)

[A_i]_C^{\{q_j\}_{j=1}^n} \triangleq [0_{m_i×q_1}, \ldots, 0_{m_i×q_{i-1}}, A_i, 0_{m_i×q_{i+1}}, \ldots, 0_{m_i×q_n}] \in R^{m_i×q}.   (2)
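As a concrete aid, the following NumPy sketch (ours; the helper names aug_R and aug_C are not from the paper) builds the augmented matrices (1) and (2) and checks the row-block identity used later, namely that the row blocks F_vi of F satisfy sum_i [F_vi]_R = F.

```python
import numpy as np

def aug_R(Ai, m_sizes, i):
    """[A_i]_R from (1): equivalently, stack A_i vertically into the
    i-th row block of an (m x q_i) zero matrix, m = sum(m_sizes)."""
    mi, qi = Ai.shape
    assert m_sizes[i] == mi
    blocks = [np.zeros((mj, qi)) for mj in m_sizes]
    blocks[i] = Ai                 # same as ([..., A_i^T, ...])^T in (1)
    return np.vstack(blocks)       # shape (m, q_i)

def aug_C(Ai, q_sizes, i):
    """[A_i]_C from (2): place A_i into the i-th column block of an
    (m_i x q) zero matrix, q = sum(q_sizes)."""
    mi, qi = Ai.shape
    assert q_sizes[i] == qi
    blocks = [np.zeros((mi, qj)) for qj in q_sizes]
    blocks[i] = Ai
    return np.hstack(blocks)       # shape (m_i, q)

# Example: with F partitioned into row blocks F_vi, sum_i [F_vi]_R = F.
F = np.arange(12.0).reshape(4, 3)
m_sizes = [1, 3]
F_blocks = [F[:1, :], F[1:, :]]
assert np.allclose(sum(aug_R(Fi, m_sizes, i)
                       for i, Fi in enumerate(F_blocks)), F)
```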

B. Graph Theory

An undirected graph G is denoted by G(V, E, A), where V = {1, ..., n} is the set of nodes, E ⊂ V × V is the set of edges, and A = [a_{i,j}] ∈ R^{n×n} is the adjacency matrix, such that a_{i,j} = a_{j,i} > 0 if {j, i} ∈ E and a_{i,j} = 0 otherwise. The Laplacian matrix is L_n = D − A, where D ∈ R^{n×n} is diagonal with D_{i,i} = \sum_{j=1}^n a_{i,j}, i ∈ {1, ..., n}. Specifically, if the graph G is connected, then L_n = L_n^T ≥ 0, rank L_n = n − 1, and ker(L_n) = {k1_n : k ∈ R} [29].
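For illustration, here is a minimal NumPy check (our example, assuming a path graph on four nodes with unit edge weights) of the quoted Laplacian properties.

```python
import numpy as np

n = 4
A = np.zeros((n, n))
for i in range(n - 1):              # path edges {i, i+1}, unit weights
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A      # L_n = D - A

assert np.allclose(L, L.T)                       # symmetric
assert np.all(np.linalg.eigvalsh(L) > -1e-12)    # positive semidefinite
assert np.linalg.matrix_rank(L) == n - 1         # rank n-1 when connected
assert np.allclose(L @ np.ones(n), 0)            # ker(L_n) = span{1_n}
```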

C. Convex Analysis and Optimization

A set Ω ⊆ R^p is convex if λz_1 + (1 − λ)z_2 ∈ Ω for any z_1, z_2 ∈ Ω and λ ∈ [0, 1]. A function f : Ω → R is said to be convex (or strictly convex) if f(λz_1 + (1 − λ)z_2) ≤ (or <) λf(z_1) + (1 − λ)f(z_2) for any z_1, z_2 ∈ Ω with z_1 ≠ z_2 and any λ ∈ (0, 1). A convex optimization problem can often be written as min_{z∈Ω} f(z), where Ω ⊆ R^p is a convex set and f : R^p → R is a convex function.

D. Convergence Property

Consider a dynamical system

\dot{x}(t) = \phi(x(t)), \quad x(0) = x_0, \quad t \ge 0   (3)

where φ : R^q → R^q is Lipschitz continuous. Given a trajectory x : [0, ∞) → R^q of (3), y is a positive limit point of x(·) if there is a positive, increasing, divergent sequence {t_i}_{i=1}^∞ ⊂ R such that y = lim_{i→∞} x(t_i), and the positive limit set of x(·) is the set of all positive limit points of x(·). A set D is said to be positive invariant with respect to (3) if x(t) ∈ D for all t ≥ 0 and every x_0 ∈ D.

Denote B_ε(x), x ∈ R^n, ε > 0, as the open ball centered at x with radius ε. Let D ⊂ R^q be a positive invariant set with respect to (3) and let z ∈ D be an equilibrium of (3). z is Lyapunov stable if, for every ε > 0, there exists δ = δ(ε) > 0 such that, for every initial condition x_0 ∈ B_δ(z) ∩ D, the solution x(t) of (3) stays in B_ε(z) for all t ≥ 0.

The following lemmas are needed in the analysis of this paper.

Lemma 2.1 ([27, Theorem 3.1]): Let D be a compact, positive invariant set with respect to system (3), V : R^q → R be a continuously differentiable function, and x(·) ∈ R^q be a solution of (3) with x(0) = x_0 ∈ D. Assume \dot{V}(x) ≤ 0 for all x ∈ D, and define Z = {x ∈ D : \dot{V}(x) = 0}. If every point in the largest invariant subset M of \bar{Z} ∩ D is Lyapunov stable, where \bar{Z} is the closure of Z ⊂ R^n, then (3) converges to one of its Lyapunov stable equilibria for any x_0 ∈ D.

Lemma 2.2: Suppose φ(x) = Mx + b with M ∈ R^{q×q}, b ∈ R^q, and D = R^q. The following statements are equivalent.

1) System (3) converges to an equilibrium exponentially for any initial condition.

2) System (3) converges to an equilibrium for any initial condition.

Proof: That (i) ⇒ (ii) is trivial, and the proof is omitted. Suppose (ii) holds. Let x* be an equilibrium of (3) and define y = x − x*. System (3) is equivalent to

\dot{y}(t) = My(t), \quad y(0) = y_0 \in R^q, \quad t \ge 0.   (4)

To show (ii) ⇒ (i), we show that y(t) converges to an equilibrium exponentially for any initial condition.

It follows from statement (ii) and [30, Definition 11.8.1, p. 727] that M is semistable (that is, its eigenvalues lie in the open left half-complex plane, except possibly for semisimple zero eigenvalues). Hence, there exists an invertible matrix P ∈ R^{q×q} such that

PMP^{-1} = \begin{bmatrix} D & 0_{r×(q-r)} \\ 0_{(q-r)×r} & 0_{(q-r)×(q-r)} \end{bmatrix},

where D ∈ R^{r×r} is Hurwitz and r = rank M ≤ q. Define z = [z_1; z_2] = Py ∈ R^q, with z_1 ∈ R^r and z_2 ∈ R^{q−r}. It follows from (4) that

\dot{z}_1(t) = Dz_1(t), \quad \dot{z}_2(t) = 0_{q-r}, \quad z_1(0) = z_{1,0}, \quad z_2(0) = z_{2,0}, \quad t \ge 0.

Hence,

\left\| z(t) - \begin{bmatrix} 0_r \\ z_2(0) \end{bmatrix} \right\| = \|z_1(t)\| = \|e^{Dt} z_1(0)\|.

Recall that D is Hurwitz. The trajectory z(t) converges to [0_r; z_2(0)] exponentially and, equivalently, y(t) converges to P^{-1}[0_r; z_2(0)] exponentially. ∎
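A quick numerical illustration of Lemma 2.2 (our toy matrix M, not from the paper): a semistable M with one semisimple zero eigenvalue drives the linear flow to an equilibrium exponentially, and the limit depends on the initial condition.

```python
import numpy as np

# Eigenvalues of M are -2, -1, 0 (the zero eigenvalue is semisimple),
# so the flow x_dot = M x converges; the third coordinate never moves.
M = np.array([[-2.0, 1.0, 0.0],
              [ 0.0,-1.0, 0.0],
              [ 0.0, 0.0, 0.0]])
x = np.array([1.0, 1.0, 5.0])
dt = 1e-3
for _ in range(20000):             # forward-Euler integration to t = 20
    x = x + dt * (M @ x)
print(x)                           # approximately [0, 0, 5]
```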

III. PROBLEM DESCRIPTION AND MAIN RESULT

In this paper, we consider the distributed computation of a least squares solution to the well-known matrix equation of the following form:

AXB = F   (5)

where A ∈ R^{m×r}, B ∈ R^{p×q}, and F ∈ R^{m×q} are known matrices, and X ∈ R^{r×p} is an unknown matrix to be solved for. Note that (5) may not have a solution X. However, it always has a least squares solution, which is defined as follows.

Definition 3.1: A least squares solution to (5) is a solution of the optimization problem \min_X \|AXB - F\|_F^2.

Obviously, if (5) has a solution, then a least squares solution is also an exact solution. The following result is well known (see [31] and [32]).

Lemma 3.1:
1) Equation (5) has an exact solution if and only if range(F) ⊂ range(A) and range(F^T) ⊂ range(B^T).
2) X* ∈ R^{r×p} is a least squares solution if and only if

0_{r×p} = \left. \frac{\partial \|AXB - F\|_F^2}{\partial X} \right|_{X = X^*} = A^T (A X^* B - F) B^T.   (6)


3) If A is full column-rank and B is full row-rank, then X* = (A^T A)^{-1} A^T F B^T (B B^T)^{-1} is the unique least squares solution.
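As a sanity check of Lemma 3.1 (a NumPy sketch of ours; the random sizes are arbitrary), the closed-form X* in statement 3) satisfies the normal equation (6) even when (5) has no exact solution.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))   # m x r, full column rank (a.s.)
B = rng.standard_normal((2, 4))   # p x q, full row rank (a.s.)
F = rng.standard_normal((5, 4))   # generic F: AXB = F is unsolvable

# X* = (A^T A)^{-1} A^T F B^T (B B^T)^{-1}
X = np.linalg.solve(A.T @ A, A.T @ F @ B.T) @ np.linalg.inv(B @ B.T)
print(np.linalg.norm(A.T @ (A @ X @ B - F) @ B.T))  # ~ 1e-14, i.e. (6)
```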

Note that (5) is one of the most famous matrix equations in matrix theory and its applications (see [20] and [21]), related to the computation of many important problems, such as (generalized) Sylvester equations and generalized inverses of matrices (see [20]–[23], [31]). Because solving (5) is one of the key problems of matrix computation, many techniques have been proposed and various centralized algorithms have been developed for problem (5) (see [23], [31], [33]–[35]). One significant method is a gradient-based approach from the optimization viewpoint (see [31, Th. 2]). Because many matrix equations in engineering and science have large scales, the distributed computation of (5) is highly desirable. However, very few results have been obtained on the distributed computation of (5), owing to its complicated structure when each agent only knows some subblocks of the (large) matrices A, B, and F.

On the other hand, the distributed computation of linear algebraic equations of the form Ax = b with vectors x and b has been widely studied in recent years, and some significant results have been obtained in [11]–[16], [19]. To solve (5), an immediate idea is to vectorize it as

\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X) = \mathrm{vec}(F)

and apply the existing linear algebraic equation results. Although this idea may work in centralized situations, it may totally spoil the original distributed information structure, because the local knowledge of each agent about subblocks of A and B may be mixed up and multiplied together by the Kronecker product. Hence, we have to develop new methods to solve the matrix equation (5) in a distributed way.

In this paper, we consider the distributed computation of a least squares solution to (5) over a multiagent network described by an undirected graph G, where the matrices A, B, and F are composed of n row-block or column-block matrices, known by the n agents.

In this complicated problem, there are different distributed information structures of the matrices A, B, and F. To distinguish the row blocks and column blocks of a matrix, we use the subscript "vi" to denote its ith row block and the subscript "li" to denote its ith column block in the sequel.

For the different information structures of the matrices A, B, and F, we can classify the distributed computation problem of (5) into the following eight standard structures.

1) Row-column-column (RCC) structure: Consider (5) with

A = \begin{bmatrix} A_{v1} \\ \vdots \\ A_{vn} \end{bmatrix} \in R^{m×r}, \quad B = [B_{l1}, \ldots, B_{ln}] \in R^{p×q}, \quad F = [F_{l1}, \ldots, F_{ln}] \in R^{m×q}   (7)

where A_{vi} ∈ R^{m_i×r}, B_{li} ∈ R^{p×q_i}, F_{li} ∈ R^{m×q_i}, \sum_{i=1}^n m_i = m, \sum_{i=1}^n q_i = q, and the subblocks of A, B, and F are distributed among the agents of network G. In this structure, agent i only knows A_{vi}, B_{li}, and F_{li} for i ∈ {1, ..., n}. By communicating with its neighbors, every agent i aims to obtain a least squares solution to (5).

Obviously, if X and F are row vectors with A = 1, the matrix equation (5) with (7) becomes a linear algebraic equation B^T X^T = F^T, where each agent knows a row subblock of B^T and the vector F^T; this case was investigated in [11]–[14], [17], [18], and references therein. However, the subblocks of the matrices A, B, and X are coupled in the original (5) with (7), and hence new techniques and ideas are needed for its distributed algorithm design.

2) Row-row-row (RRR) structure: Consider (5) with

A = \begin{bmatrix} A_{v1} \\ \vdots \\ A_{vn} \end{bmatrix} \in R^{m×r}, \quad B = \begin{bmatrix} B_{v1} \\ \vdots \\ B_{vn} \end{bmatrix} \in R^{p×q}, \quad F = \begin{bmatrix} F_{v1} \\ \vdots \\ F_{vn} \end{bmatrix} \in R^{m×q}   (8)

with X = [X_{l1}, \ldots, X_{ln}] ∈ R^{r×p}, where A_{vi} ∈ R^{m_i×r}, X_{li} ∈ R^{r×p_i}, B_{vi} ∈ R^{p_i×q}, F_{vi} ∈ R^{m_i×q}, \sum_{i=1}^n m_i = m, and \sum_{i=1}^n p_i = p. Similarly, agent i in the n-agent network G only knows A_{vi}, B_{vi}, and F_{vi} and cooperates with its neighbors to compute X_{li}. Clearly, if X and F are row vectors with A = 1, this problem becomes the one discussed in [19].

3) Column-column-row (CCR) structure: Consider (5) with

A = [A_{l1}, \ldots, A_{ln}] \in R^{m×r}, \quad B = [B_{l1}, \ldots, B_{ln}] \in R^{p×q}, \quad F = \begin{bmatrix} F_{v1} \\ \vdots \\ F_{vn} \end{bmatrix} \in R^{m×q}   (9)

where A_{li} ∈ R^{m×r_i}, B_{li} ∈ R^{p×q_i}, F_{vi} ∈ R^{m_i×q}, \sum_{i=1}^n r_i = r, \sum_{i=1}^n m_i = m, and \sum_{i=1}^n q_i = q. We use an n-agent network G to find X, where agent i knows A_{li}, B_{li}, and F_{vi} and estimates X by cooperating with its neighbors to reach consensus on a least squares solution to matrix equation (5) with (9).

4) Column-row-row (CRR) structure: Consider (5) with

A = [A_{l1}, \ldots, A_{ln}] \in R^{m×r}, \quad B = \begin{bmatrix} B_{v1} \\ \vdots \\ B_{vn} \end{bmatrix} \in R^{p×q}, \quad F = \begin{bmatrix} F_{v1} \\ \vdots \\ F_{vn} \end{bmatrix} \in R^{m×q}   (10)

where A_{li} ∈ R^{m×r_i}, X_{li} ∈ R^{r×p_i}, X = [X_{l1}, \ldots, X_{ln}] ∈ R^{r×p}, B_{vi} ∈ R^{p_i×q}, F_{vi} ∈ R^{m_i×q}, \sum_{i=1}^n r_i = r, \sum_{i=1}^n m_i = m, and \sum_{i=1}^n p_i = p. We use an n-agent system to find X, where agent i knows A_{li}, B_{vi}, and F_{vi} and cooperates with its neighbors to compute X_{li}, which composes a least squares solution to matrix equation (5) with (10). In this case, if X and F are row vectors with A = 1, (5) with (10) becomes the problem investigated in [19].

5) Row-column-row (RCR) structure: Consider (5) with

A = \begin{bmatrix} A_{v1} \\ \vdots \\ A_{vn} \end{bmatrix} \in R^{m×r}, \quad B = [B_{l1}, \ldots, B_{ln}] \in R^{p×q}, \quad F = \begin{bmatrix} F_{v1} \\ \vdots \\ F_{vn} \end{bmatrix} \in R^{m×q}.   (11)

Clearly, this structure is equivalent to the RCC structure by taking transposes of the matrices.

6) Column-column-column (CCC) structure: Consider (5) with

A = [A_{l1}, \ldots, A_{ln}] \in R^{m×r}, \quad B = [B_{l1}, \ldots, B_{ln}] \in R^{p×q}, \quad F = [F_{l1}, \ldots, F_{ln}] \in R^{m×q}.   (12)

It is equivalent to the RRR structure by taking transposes of the matrices.

7) Row-row-column (RRC) structure: Consider (5) with

A = \begin{bmatrix} A_{v1} \\ \vdots \\ A_{vn} \end{bmatrix} \in R^{m×r}, \quad B = \begin{bmatrix} B_{v1} \\ \vdots \\ B_{vn} \end{bmatrix} \in R^{p×q}, \quad F = [F_{l1}, \ldots, F_{ln}] \in R^{m×q}.   (13)

It is equivalent to the CCR structure by taking transposes of the matrices.

8) Column-row-column (CRC) structure: Consider (5) with

A = [A_{l1}, \ldots, A_{ln}] \in R^{m×r}, \quad B = \begin{bmatrix} B_{v1} \\ \vdots \\ B_{vn} \end{bmatrix} \in R^{p×q}, \quad F = [F_{l1}, \ldots, F_{ln}] \in R^{m×q}.   (14)

It is equivalent to the CRR structure by taking transposes of the matrices.

Remark 3.1: In these formulations, the rows and columns of the matrices may have different physical interpretations. Take A, for example. If A is decomposed into row blocks, each row defines a local linear space, and the matrix A defines the intersection of the local linear spaces, as in [11]–[14], [17], [18]. However, if A is decomposed into column blocks, each block contains partial information of the coupling/monotropic constraint, for example, in resource allocation problems as discussed in [2]. Due to the different structures, different combinations of the consensus design and the auxiliary decomposition design are adopted. □

The main result of this paper can, in fact, be stated as follows.

Theorem 3.1: A least squares solution to (5) in the eight standard structures can be obtained using distributed algorithms with exponential convergence rates if the undirected graph G is connected.

Clearly, because the RCR, CCC, RRC, and CRC structures are the transposes of the RCC, RRR, CCR, and CRR structures, the eight different structures reduce to four standard structures in the distributed computation design. Therefore, we only need to study (5) with the four standard structures (7)–(10) in the sequel.

IV. REFORMULATION, ALGORITHM, AND EXPONENTIAL CONVERGENCE

In this section, we first reformulate the matrix computation problem in the four different structures as solvable distributed optimization problems via different substitutional decompositions. Then, we propose distributed continuous-time algorithms for the four standard structures using a derivative feedback idea and saddle-point dynamics. Finally, we prove the exponential convergence of our algorithms with the help of stability theory and the Lyapunov method.

A. Row-Column-Column Structure

To handle the couplings between the subblocks of the matrices A, B, and F in (5) with (7), we introduce a substitutional variable Y that makes (5) with (7) equivalent to Y = AX and Y B_{li} = F_{li} for i ∈ {1, ..., n}. Let X_i ∈ R^{r×p} and Y_i ∈ R^{m×p} be the estimates of X and Y by agent i ∈ {1, ..., n}, respectively. We propose a full-consensus substitutional decomposition method by requiring both X_i and Y_i to reach consensus; namely, we rewrite (5) with (7) as

Y_i B_{li} = F_{li}, \quad Y_i = Y_j, \quad i, j \in \{1, \ldots, n\}   (15)

A X_i = Y_i, \quad X_i = X_j.   (16)

Clearly, (16) is not in a distributed form because all the subblocks of A need to be known. To decompose (16), define

Y_i \triangleq \begin{bmatrix} Y_i^{v1} \\ \vdots \\ Y_i^{vn} \end{bmatrix},

where Y_i^{vj} ∈ R^{m_j×p} for all i, j ∈ {1, ..., n}. Due to Y_i = Y_j for all i, j ∈ {1, ..., n} in (15), (16) is equivalent to

A_{vi} X_i = Y_i^{vi}, \quad X_i = X_j, \quad i, j \in \{1, \ldots, n\}.   (17)

Hence, the matrix equation (5) with (7) is equivalent to the linear matrix equations (15) and (17). Define the extended matrices X_E = [X_1^T, \ldots, X_n^T]^T ∈ R^{nr×p} and Y_E = [Y_1^T, \ldots, Y_n^T]^T ∈ R^{nm×p}. Based on (15) and (17), we reformulate the distributed computation of (5) with the RCC structure as the following distributed optimization problem:

\min_{X_E, Y_E} \; \sum_{i=1}^n \|Y_i B_{li} - F_{li}\|_F^2   (18a)

\text{s.t.} \;\; X_i = X_j, \quad Y_i = Y_j, \quad A_{vi} X_i = Y_i^{vi}, \quad i, j \in \{1, \ldots, n\}   (18b)

where agent i knows A_{vi}, B_{li}, and F_{li}, and estimates the solutions X_i and Y_i with only local information.

Remark 4.1: Problem (18) is a standard distributed optimization problem, which contains the local constraints A_{vi} X_i = Y_i^{vi} and the consensus constraints X_i = X_j and Y_i = Y_j. □

The following proposition reveals the relationship between (5) and problem (18).

Proposition 4.1: Suppose that the undirected graph G is connected. X* ∈ R^{r×p} is a least squares solution to matrix equation (5) if and only if there exists Y* = AX* ∈ R^{m×p} such that (X_E^*, Y_E^*) = (1_n ⊗ X*, 1_n ⊗ Y*) is a solution to problem (18).

The proof can be found in Appendix A.

In this structure, we focus on problem (18), and propose a distributed algorithm for agent i as follows:

\dot{X}_i(t) = -A_{vi}^T (A_{vi} X_i(t) - Y_i^{vi}(t)) - A_{vi}^T \Lambda_i^3(t) - \sum_{j=1}^n a_{i,j} (\Lambda_i^1(t) - \Lambda_j^1(t)) - \sum_{j=1}^n a_{i,j} (X_i(t) - X_j(t)), \quad X_i(0) = X_{i0} \in R^{r×p}   (19a)

\dot{Y}_i(t) = -(Y_i(t) B_{li} - F_{li}) B_{li}^T + [I_{m_i}]_R \Lambda_i^3(t) + [I_{m_i}]_R (A_{vi} X_i(t) - Y_i^{vi}(t)) - \sum_{j=1}^n a_{i,j} (Y_i(t) - Y_j(t)) - \sum_{j=1}^n a_{i,j} (\Lambda_i^2(t) - \Lambda_j^2(t)), \quad Y_i(0) = Y_{i0} \in R^{m×p}   (19b)

\dot{\Lambda}_i^1(t) = \sum_{j=1}^n a_{i,j} (X_i(t) - X_j(t)), \quad \Lambda_i^1(0) = \Lambda_{i0}^1 \in R^{r×p}   (19c)

\dot{\Lambda}_i^2(t) = \sum_{j=1}^n a_{i,j} (Y_i(t) - Y_j(t)), \quad \Lambda_i^2(0) = \Lambda_{i0}^2 \in R^{m×p}   (19d)

\dot{\Lambda}_i^3(t) = A_{vi} X_i(t) - Y_i^{vi}(t), \quad \Lambda_i^3(0) = \Lambda_{i0}^3 \in R^{m_i×p}   (19e)

where i ∈ {1, ..., n}, t ≥ 0, X_i(t) and Y_i(t) are the estimates of solutions to problem (18) by agent i at time t, Λ_i^1(t), Λ_i^2(t), and Λ_i^3(t) are the estimates of the Lagrangian multipliers for the constraints in (18b) by agent i at time t, and [I_{m_i}]_R denotes [I_{m_i}]_R^{\{m_j\}_{j=1}^n}, as defined in (1).
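To make the dynamics concrete, here is a minimal forward-Euler discretization of (19) (our sketch: the 3-agent complete graph, block sizes, unit weights a_ij = 1, and step size dt are illustrative assumptions, not from the paper).

```python
import numpy as np

rng = np.random.default_rng(2)
n, r, p = 3, 2, 2
m_sz, q_sz = [2, 1, 1], [1, 2, 1]
m = sum(m_sz)
Av = [rng.standard_normal((mi, r)) for mi in m_sz]   # row blocks of A
Bl = [rng.standard_normal((p, qi)) for qi in q_sz]   # column blocks of B
Fl = [rng.standard_normal((m, qi)) for qi in q_sz]   # column blocks of F
A, B, F = np.vstack(Av), np.hstack(Bl), np.hstack(Fl)

adj = 1.0 - np.eye(n)                                # complete graph
off = np.cumsum([0] + m_sz)
# E[i] realizes the selector [I_{m_i}]_R; E[i].T @ Y[i] is Y_i^{vi}.
E = [np.eye(m)[:, off[i]:off[i + 1]] for i in range(n)]

Zrs = lambda a, b: [np.zeros((a, b)) for _ in range(n)]
X, Y, L1, L2 = Zrs(r, p), Zrs(m, p), Zrs(r, p), Zrs(m, p)
L3 = [np.zeros((m_sz[i], p)) for i in range(n)]
lap = lambda V, i: sum(adj[i, j] * (V[i] - V[j]) for j in range(n))

dt = 5e-3
for _ in range(60000):
    res = [Av[i] @ X[i] - E[i].T @ Y[i] for i in range(n)]  # A_vi X_i - Y_i^{vi}
    dX = [-Av[i].T @ (res[i] + L3[i]) - lap(L1, i) - lap(X, i)
          for i in range(n)]
    dY = [-(Y[i] @ Bl[i] - Fl[i]) @ Bl[i].T + E[i] @ (L3[i] + res[i])
          - lap(Y, i) - lap(L2, i) for i in range(n)]
    dL1 = [lap(X, i) for i in range(n)]
    dL2 = [lap(Y, i) for i in range(n)]
    for i in range(n):
        X[i] += dt * dX[i]; Y[i] += dt * dY[i]
        L1[i] += dt * dL1[i]; L2[i] += dt * dL2[i]; L3[i] += dt * res[i]

# Residual of the normal equation (6); it should shrink toward zero.
print(np.linalg.norm(A.T @ (A @ X[0] @ B - F) @ B.T))
```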

Remark 4.2: Algorithm (19) is a primal-dual algorithm, whose primal variables are X_i and Y_i and whose dual variables are Λ_i^1, Λ_i^2, and Λ_i^3. Though substitutional variables are used in (19) for the distributed computation of (5) with (7), algorithm (19) is a fully distributed algorithm. Different from the classic (Arrow–Hurwicz–Uzawa type) primal-dual algorithm in [16], the consensus design (-\sum_{j=1}^n a_{i,j}(X_i(t) - X_j(t)) and -\sum_{j=1}^n a_{i,j}(Y_i(t) - Y_j(t)) in (19a) and (19b)) and the damping design (-A_{vi}^T(A_{vi}X_i(t) - Y_i^{vi}(t)) in (19a)) are used in the algorithm. □

Remark 4.3: Let

\Lambda^1 = \begin{bmatrix} \Lambda_1^1 \\ \vdots \\ \Lambda_n^1 \end{bmatrix} \in R^{nr×p}, \quad \Lambda^2 = \begin{bmatrix} \Lambda_1^2 \\ \vdots \\ \Lambda_n^2 \end{bmatrix} \in R^{nm×p}, \quad \Lambda^3 = \begin{bmatrix} \Lambda_1^3 \\ \vdots \\ \Lambda_n^3 \end{bmatrix} \in R^{m×p},

where Λ_i^1 ∈ R^{r×p}, Λ_i^2 ∈ R^{m×p}, and Λ_i^3 ∈ R^{m_i×p}. Algorithm (19) can be viewed as the saddle-point dynamics of the modified Lagrangian function L(X_E, Y_E, Λ^1, Λ^2, Λ^3) given by

L = \frac{1}{2}\sum_{i=1}^n \|Y_i B_{li} - F_{li}\|_F^2 + \sum_{i=1}^n \sum_{j=1}^n \langle \Lambda_i^1, a_{i,j}(X_i - X_j) \rangle_F + \sum_{i=1}^n \sum_{j=1}^n \langle \Lambda_i^2, a_{i,j}(Y_i - Y_j) \rangle_F + \sum_{i=1}^n \langle \Lambda_i^3, A_{vi}X_i - Y_i^{vi} \rangle_F + \frac{1}{2}\sum_{i=1}^n \sum_{j=1}^n \langle X_i, a_{i,j}(X_i - X_j) \rangle_F + \frac{1}{2}\sum_{i=1}^n \|A_{vi}X_i - Y_i^{vi}\|_F^2 + \frac{1}{2}\sum_{i=1}^n \sum_{j=1}^n \langle Y_i, a_{i,j}(Y_i - Y_j) \rangle_F

where a_{i,j} is the (i, j)th element of the adjacency matrix of graph G, and Λ^1, Λ^2, and Λ^3 are the Lagrangian matrix multipliers; that is, \dot{X}_i = -\nabla_{X_i} L, \dot{Y}_i = -\nabla_{Y_i} L, \dot{\Lambda}_i^1 = \nabla_{\Lambda_i^1} L, \dot{\Lambda}_i^2 = \nabla_{\Lambda_i^2} L, and \dot{\Lambda}_i^3 = \nabla_{\Lambda_i^3} L for i ∈ {1, ..., n}. In this modified Lagrangian function, \frac{1}{2}\sum_{i=1}^n \|A_{vi}X_i - Y_i^{vi}\|_F^2 is the augmented term, and \frac{1}{2}\sum_{i=1}^n \langle X_i, \sum_{j=1}^n a_{i,j}(X_i - X_j) \rangle_F and \frac{1}{2}\sum_{i=1}^n \langle Y_i, \sum_{j=1}^n a_{i,j}(Y_i - Y_j) \rangle_F are the (weighted) Laplacian regularization terms. □

Remark 4.4: Recall that a standard centralized algorithm (see [31]) is X(k) = X(k-1) + \mu A^T [F - A X(k-1) B] B^T, where μ > 0 is an appropriate real number and A^T [F - A X(k-1) B] B^T is the negative gradient of \frac{1}{2}\|AXB - F\|_F^2. Both the centralized algorithm and algorithm (19) use gradient dynamics in the design. However, in contrast to the centralized algorithm, algorithm (19) uses auxiliary variables and linear equality constraints to deal with the unavailability of matrix information. The auxiliary variables may also be considered as distributed observers and filters of the unavailable matrix information from the control viewpoint. □
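A direct NumPy transcription of this centralized iteration (our sketch; the step size mu = 0.01 is an assumption chosen small enough for stability):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))
B = rng.standard_normal((2, 4))
F = rng.standard_normal((5, 4))

X = np.zeros((3, 2))
mu = 0.01   # must be small relative to the spectra of A^T A and B B^T
for _ in range(20000):
    # X(k) = X(k-1) + mu * A^T (F - A X(k-1) B) B^T
    X = X + mu * A.T @ (F - A @ X @ B) @ B.T

print(np.linalg.norm(A.T @ (A @ X @ B - F) @ B.T))  # -> ~0, i.e. (6)
```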

The following result reveals the relationship between the equilibria of algorithm (19) and the solutions to problem (18); it is an immediate consequence of the Karush–Kuhn–Tucker (KKT) optimality condition [36, Th. 3.25], so its proof is omitted.

Lemma 4.1: For a connected undirected graph G, (X_E^*, Y_E^*) ∈ R^{nr×p} × R^{nm×p} is a solution to problem (18) if and only if there exist matrices Λ^{1*} ∈ R^{nr×p}, Λ^{2*} ∈ R^{nm×p}, and Λ^{3*} ∈ R^{m×p} such that (X_E^*, Y_E^*, Λ^{1*}, Λ^{2*}, Λ^{3*}) is an equilibrium of (19).

It is now time to show the exponential convergence of algorithm (19).

Proposition 4.2: If the undirected graph G is connected, then
1) every equilibrium of algorithm (19) is Lyapunov stable, and its trajectory is bounded for any initial condition;
2) the trajectory of algorithm (19) is exponentially convergent, and X_i(t) converges to a least squares solution to (5) exponentially for all i ∈ {1, ..., n}.

The proof can be found in Appendix B; it shows that algorithm (19) is globally convergent. In fact, if there are multiple solutions, the solution obtained by algorithm (19) depends on the selected initial condition.

Remark 4.5: In comparison with many previous results on the distributed computation of the linear algebraic equation Ax = b, we do not need the boundedness assumption for least squares solutions given in [11] or the existence assumption of exact solutions given in [12], [15], [17]–[19]. By virtue of the saddle-point dynamics, the authors in [37] proposed a distributed algorithm that achieves least squares solutions with an exponential convergence rate, without any boundedness assumption or choice of time-varying small step sizes. □

B. Row-Row-Row Structure

The decomposition method used in the RCC structure cannot convert the RRR structure into a solvable optimization problem. To deal with (5) with (8), we take substitutional variables Y_i ∈ R^{r×q} such that Y_i = XB for all i ∈ {1, ..., n}, and then we propose another method, called the Y-consensus substitutional decomposition because we need the consensus of the Y_i, as follows:

A_{vi} Y_i = F_{vi}, \quad Y_i = Y_j, \quad i, j \in \{1, \ldots, n\}   (20)

\frac{1}{n}\sum_{i=1}^n Y_i = [X_{l1}, \ldots, X_{ln}] \begin{bmatrix} B_{v1} \\ \vdots \\ B_{vn} \end{bmatrix} = \sum_{i=1}^n X_{li} B_{vi}.   (21)

In this way, agent i computes X_{li} and Y_i based on only local information. To decompose (21), we add new variables Z_i ∈ R^{r×q} such that

\frac{1}{n} Y_i - X_{li} B_{vi} + \sum_{j=1}^n a_{i,j} (Z_i - Z_j) = 0_{r×q}   (22)

where a_{i,j} is the (i, j)th element of the adjacency matrix of the connected graph for the agents. Clearly, (22) implies (21), since summing (22) over i cancels the Laplacian-difference terms. Conversely, if (21) holds, there exist Z_i ∈ R^{r×q} such that (22) holds, due to the fundamental theorem of linear algebra [38] (the proof is similar to that of part (ii) of Proposition 4.1).
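The cancellation used here is easy to check numerically (our snippet): the Laplacian-difference terms vanish when summed over i, because each a_ij(Z_i - Z_j) pairs with a_ji(Z_j - Z_i) under symmetric weights.

```python
import numpy as np

rng = np.random.default_rng(4)
n, r, q = 4, 3, 2
adj = rng.random((n, n)); adj = (adj + adj.T) / 2
np.fill_diagonal(adj, 0)                       # symmetric, no self-loops
Z = [rng.standard_normal((r, q)) for _ in range(n)]
S = sum(sum(adj[i, j] * (Z[i] - Z[j]) for j in range(n)) for i in range(n))
print(np.linalg.norm(S))   # ~ 0: summing (22) over i recovers (21)
```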

Then, we reformulate the distributed computation of (5) with the RRR structure as the following optimization problem:

\min_{X, Y_E, Z} \; \sum_{i=1}^n \|A_{vi} Y_i - F_{vi}\|_F^2   (23a)

\text{s.t.} \;\; \frac{1}{n} Y_i - X_{li} B_{vi} + \sum_{j=1}^n a_{i,j} (Z_i - Z_j) = 0_{r×q}, \quad Y_i = Y_j, \quad i, j \in \{1, \ldots, n\}   (23b)

where X = [X_{l1}, \ldots, X_{ln}] ∈ R^{r×p}, Y_E = [Y_1^T, \ldots, Y_n^T]^T ∈ R^{nr×q}, and Z = [Z_1^T, \ldots, Z_n^T]^T ∈ R^{nr×q}.

Remark 4.6: In (23b), Y_i = Y_j is a consensus constraint, and \frac{1}{n} Y_i - X_{li} B_{vi} + \sum_{j=1}^n a_{i,j}(Z_i - Z_j) = 0_{r×q} is a coupled constraint, which may be viewed as a (generalized) resource allocation constraint [39]. □

It is not hard to obtain the following result.

Proposition 4.3: Suppose that the undirected graph G is connected. X* ∈ R^{r×p} is a least squares solution to matrix equation (5) if and only if there exist Y_E^* ∈ R^{nr×q} and Z* ∈ R^{nr×q} such that (X*, Y_E^*, Z*) is a solution to problem (23).

The proof is omitted due to the space limitation and its similarity to that of Proposition 4.1.

In the RRR structure, we define Λ^1 = [(Λ_1^1)^T, \ldots, (Λ_n^1)^T]^T ∈ R^{nr×q} and Λ^2 = [(Λ_1^2)^T, \ldots, (Λ_n^2)^T]^T ∈ R^{nr×q} as estimates of the Lagrangian multipliers, where Λ_i^1 ∈ R^{r×q} and Λ_i^2 ∈ R^{r×q} for i ∈ {1, ..., n}. Then, we propose a distributed algorithm for agent i as follows:

\dot{X}_{li}(t) = \Lambda_i^1(t) B_{vi}^T, \quad X_{li}(0) = X_{li0} \in R^{r×p_i}   (24a)

\dot{Y}_i(t) = -A_{vi}^T (A_{vi} Y_i(t) - F_{vi}) - \sum_{j=1}^n a_{i,j} (Y_i(t) - Y_j(t)) - \frac{1}{n} \Lambda_i^1(t) - \sum_{j=1}^n a_{i,j} (\Lambda_i^2(t) - \Lambda_j^2(t)), \quad Y_i(0) = Y_{i0} \in R^{r×q}   (24b)

\dot{Z}_i(t) = -\sum_{j=1}^n a_{i,j} (\Lambda_i^1(t) - \Lambda_j^1(t)), \quad Z_i(0) = Z_{i0} \in R^{r×q}   (24c)

\dot{\Lambda}_i^1(t) = \frac{1}{n} (Y_i(t) + \dot{Y}_i(t)) - (X_{li}(t) + \dot{X}_{li}(t)) B_{vi} + \sum_{j=1}^n a_{i,j} (Z_i(t) - Z_j(t)) - \sum_{j=1}^n a_{i,j} (\Lambda_i^1(t) - \Lambda_j^1(t)), \quad \Lambda_i^1(0) = \Lambda_{i0}^1 \in R^{r×q}   (24d)

\dot{\Lambda}_i^2(t) = \sum_{j=1}^n a_{i,j} (Y_i(t) - Y_j(t)) + \sum_{j=1}^n a_{i,j} (\dot{Y}_i(t) - \dot{Y}_j(t)), \quad \Lambda_i^2(0) = \Lambda_{i0}^2 \in R^{r×q}   (24e)

where i ∈ {1, ..., n}, t ≥ 0, X_{li}(t), Y_i(t), and Z_i(t) are the estimates of solutions to problem (23) by agent i at time t, and a_{i,j} is the (i, j)th element of the adjacency matrix of graph G. (Note the derivative feedback terms \dot{X}_{li} and \dot{Y}_i on the right-hand sides of (24d) and (24e); see Remark 4.7.)


Remark 4.7: The derivative feedbacks \dot{X}_{li} and \dot{Y}_i are used in algorithm (24); otherwise, the trajectories of the algorithm may oscillate along a periodic orbit. In fact, the derivative feedback plays the role of a damping term that copes with the merely general (rather than strict) convexity of the objective functions [28]. □
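A toy example of this damping effect (ours, not from the paper): for the scalar Lagrangian L(x, lam) = lam * x, plain saddle-point dynamics form a pure oscillator, while adding the derivative feedback x_dot to the multiplier update shifts the eigenvalues into the open left half-plane.

```python
import numpy as np

# Plain saddle-point dynamics: x_dot = -lam, lam_dot = x  (oscillates).
M_plain = np.array([[0.0, -1.0],
                    [1.0,  0.0]])
# With derivative feedback:   lam_dot = x + x_dot = x - lam  (damped).
M_fb = np.array([[0.0, -1.0],
                 [1.0, -1.0]])

print(np.linalg.eigvals(M_plain))  # +/- 1j: sustained oscillation
print(np.linalg.eigvals(M_fb))     # -1/2 +/- j*sqrt(3)/2: exponential decay
```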

The following result shows the correctness of algorithm (24) for problem (23).

Lemma 4.2: Suppose that the undirected graph G is connected. (X*, Y_E^*, Z*) ∈ R^{r×p} × R^{nr×q} × R^{nr×q} is a solution to problem (23) if and only if there exist matrices Λ^{1*} ∈ R^{nr×q} and Λ^{2*} ∈ R^{nr×q} such that (X*, Y_E^*, Z*, Λ^{1*}, Λ^{2*}) is an equilibrium of (24).

The proof is omitted because it follows easily from the KKT optimality condition [36].

Then, we show the convergence of algorithm (24). Define the function

V(X, Y_E, Z, \Lambda^1, \Lambda^2) = V_1(Y_E) + V_2(X, Y_E, Z, \Lambda^1, \Lambda^2)   (25)

where

V_1 \triangleq \frac{1}{2}\sum_{i=1}^n \|A_{vi} Y_i - F_{vi}\|_F^2 - \frac{1}{2}\sum_{i=1}^n \|A_{vi} Y_i^* - F_{vi}\|_F^2 - \sum_{i=1}^n \langle A_{vi}^T (A_{vi} Y_i^* - F_{vi}), Y_i - Y_i^* \rangle_F + \frac{1}{2}\sum_{i=1}^n \sum_{j=1}^n a_{i,j} \langle Y_i, Y_i - Y_j \rangle_F

V_2 \triangleq \frac{1}{2}\sum_{i=1}^n \|X_{li} - X_{li}^*\|_F^2 + \frac{1}{2}\sum_{i=1}^n \|Y_i - Y_i^*\|_F^2 + \frac{1}{2}\sum_{i=1}^n \|Z_i - Z_i^*\|_F^2 + \frac{1}{2}\sum_{i=1}^n \|\Lambda_i^1 - \Lambda_i^{1*}\|_F^2 + \frac{1}{2}\sum_{i=1}^n \|\Lambda_i^2 - \Lambda_i^{2*}\|_F^2

and (X*, Y_E^*, Z*, Λ^{1*}, Λ^{2*}) is an equilibrium of (24). The following lemma is needed in the theoretical proof for our algorithm.

Lemma 4.3: If the undirected graph G is connected, then V_1(Y_E) defined in (25) is nonnegative for all Y_E ∈ R^{nr×q}.

Proof: Consider the function V_1(Y_E) defined in (25). Because G is undirected,

\frac{1}{2}\sum_{i=1}^n \sum_{j=1}^n a_{i,j} \langle Y_i, Y_i - Y_j \rangle_F = \frac{1}{4}\sum_{i=1}^n \sum_{j=1}^n a_{i,j} \|Y_i - Y_j\|_F^2 \ge 0.

Define f(Y_E) = \frac{1}{2}\sum_{i=1}^n \|A_{vi} Y_i - F_{vi}\|_F^2, which is clearly convex with respect to the matrix Y_E ∈ R^{nr×q}. Then

f(Y_E) - f(Y_E^*) \ge \sum_{i=1}^n \langle Y_i - Y_i^*, \nabla_{Y_i^*} f(Y_E^*) \rangle_F = \sum_{i=1}^n \langle Y_i - Y_i^*, A_{vi}^T (A_{vi} Y_i^* - F_{vi}) \rangle_F.

Hence, V_1(Y_E) \ge f(Y_E) - f(Y_E^*) - \sum_{i=1}^n \langle Y_i - Y_i^*, A_{vi}^T (A_{vi} Y_i^* - F_{vi}) \rangle_F \ge 0 for all Y_E ∈ R^{nr×q}. ∎
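The first identity in the proof can be checked numerically (our snippet, with random symmetric weights).

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5
a = rng.random((n, n)); a = (a + a.T) / 2
np.fill_diagonal(a, 0)
Y = [rng.standard_normal((3, 2)) for _ in range(n)]

# <A, B>_F = sum of elementwise products
lhs = 0.5  * sum(a[i, j] * np.sum(Y[i] * (Y[i] - Y[j]))
                 for i in range(n) for j in range(n))
rhs = 0.25 * sum(a[i, j] * np.sum((Y[i] - Y[j]) ** 2)
                 for i in range(n) for j in range(n))
print(abs(lhs - rhs))   # ~ 0
```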

Next, we show the exponential convergence of algorithm (24).

Proposition 4.4: If the undirected graph G is connected, then
1) every equilibrium of algorithm (24) is Lyapunov stable, and its trajectory is bounded for any initial condition;
2) the trajectory of algorithm (24) is exponentially convergent, and X(t) converges to a least squares solution of (5) exponentially.

The proof can be found in Appendix C.

Remark 4.8: Compared with related results on linear algebraic equations and others [11], [12], [15], [17]–[19], neither the boundedness assumption for least squares solutions nor the existence of exact solutions is required here. □

C. Column-Column-Row Structure

To handle the CCR structure, we take a substitutional variable

Y = \begin{bmatrix} Y_{v1} \\ \vdots \\ Y_{vn} \end{bmatrix} \in R^{r×q}, \quad Y_{vi} \in R^{r_i×q}, \quad \forall i \in \{1, \ldots, n\}.

It is clear that (5) with (9) is equivalent to AY = F, Y = XB. Let X_i ∈ R^{r×p} be the estimate of X by agent i. Define the matrices [Y_{vi}]_R^{\{r_j\}_{j=1}^n}, [F_{vi}]_R^{\{m_j\}_{j=1}^n}, and [B_{li}]_C^{\{q_j\}_{j=1}^n} as in (1) and (2), and write [Y_{vi}]_R, [F_{vi}]_R, and [B_{li}]_C for ease of notation. Clearly, \sum_{i=1}^n [Y_{vi}]_R = Y, \sum_{i=1}^n [F_{vi}]_R = F, and \sum_{i=1}^n [B_{li}]_C = B. Here, we construct a transformation, called the X-consensus substitutional decomposition because it requires the consensus of the X_i; then, (5) with (9) is equivalent to

\sum_{i=1}^n A_{li} Y_{vi} = \sum_{i=1}^n [F_{vi}]_R   (26)

\sum_{i=1}^n [Y_{vi}]_R = \sum_{i=1}^n X_i [B_{li}]_C, \quad X_i = X_j, \quad i, j \in \{1, \ldots, n\}.   (27)

To decompose (26) and (27), we add new variables U_i ∈ R^{m×q}, W_i ∈ R^{m×q}, and Z_i ∈ R^{r×q} such that

A_{li} Y_{vi} - [F_{vi}]_R - U_i = 0_{m×q}, \quad U_i = \sum_{j=1}^n a_{i,j} (W_i - W_j)   (28)

[Y_{vi}]_R - X_i [B_{li}]_C - \sum_{j=1}^n a_{i,j} (Z_i - Z_j) = 0_{r×q}, \quad X_i = X_j   (29)

where i, j ∈ {1, ..., n} and a_{i,j} is the (i, j)th element of the adjacency matrix of graph G. If (28) and (29) hold, then one can easily obtain (26) and (27). Conversely, if (26) and (27) hold, it follows from a proof similar to that of Proposition 4.1 that there exist U_i, W_i, and Z_i such that (28) and (29) hold.

Let X_E = [X_1^T, \ldots, X_n^T]^T ∈ R^{nr×p}, Y = [Y_{v1}^T, \ldots, Y_{vn}^T]^T ∈ R^{r×q}, U = [U_1^T, \ldots, U_n^T]^T ∈ R^{nm×q}, W = [W_1^T, \ldots, W_n^T]^T ∈ R^{nm×q}, and Z = [Z_1^T, \ldots, Z_n^T]^T ∈ R^{nr×q}. Then, we reformulate the distributed computation of (5) with the CCR structure as the following optimization problem:

\min_{X_E, Y, U, W, Z} \; \sum_{i=1}^n \|A_{li} Y_{vi} - [F_{vi}]_R - U_i\|_F^2   (30a)

\text{s.t.} \;\; X_i = X_j, \quad U_i = \sum_{j=1}^n a_{i,j} (W_i - W_j)   (30b)

[Y_{vi}]_R - X_i [B_{li}]_C - \sum_{j=1}^n a_{i,j} (Z_i - Z_j) = 0_{r×q}, \quad i, j \in \{1, \ldots, n\}.   (30c)

Similar to the RRR structure, (26) and (27) are a combination of a consensus constraint and coupled equality constraints. Then, we have the following result.

Proposition 4.5: Suppose that the undirected graph G is connected. X* ∈ R^{r×p} is a least squares solution to matrix equation (5) if and only if there exist X_E^* = 1_n ⊗ X*, Y* ∈ R^{r×q}, Z* ∈ R^{nr×q}, U* ∈ R^{nm×q}, and W* ∈ R^{nm×q} such that (X_E^*, Y*, Z*, U*, W*) is a solution to problem (30).

The proof is omitted due to the space limitation and its similarity to that of Proposition 4.1.

In the CCR structure, we take

\Lambda^1 = \begin{bmatrix} \Lambda_1^1 \\ \vdots \\ \Lambda_n^1 \end{bmatrix} \in R^{nr×p}, \quad \Lambda^2 = \begin{bmatrix} \Lambda_1^2 \\ \vdots \\ \Lambda_n^2 \end{bmatrix} \in R^{nm×q}, \quad \Lambda^3 = \begin{bmatrix} \Lambda_1^3 \\ \vdots \\ \Lambda_n^3 \end{bmatrix} \in R^{nr×q}

as the Lagrangian multipliers, where Λ_i^1 ∈ R^{r×p}, Λ_i^2 ∈ R^{m×q}, and Λ_i^3 ∈ R^{r×q}. The distributed algorithm of agent i is

\dot{X}_i(t) = \Lambda_i^3(t) [B_{li}]_C^T - \sum_{j=1}^n a_{i,j} (\Lambda_i^1(t) - \Lambda_j^1(t)) - \sum_{j=1}^n a_{i,j} (X_i(t) - X_j(t)), \quad X_i(0) = X_{i0} \in R^{r×p}   (31a)

\dot{Y}_{vi}(t) = -A_{li}^T (A_{li} Y_{vi}(t) - [F_{vi}]_R - U_i(t)) - [I_{r_i}]_C \Lambda_i^3(t), \quad Y_{vi}(0) = Y_{vi0} \in R^{r_i×q}   (31b)

\dot{U}_i(t) = A_{li} Y_{vi}(t) - [F_{vi}]_R - U_i(t) - \Lambda_i^2(t), \quad U_i(0) = U_{i0} \in R^{m×q}   (31c)

\dot{W}_i(t) = \sum_{j=1}^n a_{i,j} (\Lambda_i^2(t) - \Lambda_j^2(t)), \quad W_i(0) = W_{i0} \in R^{m×q}   (31d)

\dot{Z}_i(t) = \sum_{j=1}^n a_{i,j} (\Lambda_i^3(t) - \Lambda_j^3(t)), \quad Z_i(0) = Z_{i0} \in R^{r×q}   (31e)

\dot{\Lambda}_i^1(t) = \sum_{j=1}^n a_{i,j} (X_i(t) - X_j(t)), \quad \Lambda_i^1(0) = \Lambda_{i0}^1 \in R^{r×p}   (31f)

\dot{\Lambda}_i^2(t) = U_i(t) + \dot{U}_i(t) - \sum_{j=1}^n a_{i,j} (W_i(t) - W_j(t)) - \sum_{j=1}^n a_{i,j} (\Lambda_i^2(t) - \Lambda_j^2(t)), \quad \Lambda_i^2(0) = \Lambda_{i0}^2 \in R^{m×q}   (31g)

\dot{\Lambda}_i^3(t) = [Y_{vi}]_R(t) + [\dot{Y}_{vi}]_R(t) - X_i(t) [B_{li}]_C - \sum_{j=1}^n a_{i,j} (Z_i(t) - Z_j(t)) - \sum_{j=1}^n a_{i,j} (\Lambda_i^3(t) - \Lambda_j^3(t)), \quad \Lambda_i^3(0) = \Lambda_{i0}^3 \in R^{r×q}   (31h)

where i ∈ {1, ..., n}, t ≥ 0, X_i(t), Y_{vi}(t), U_i(t), W_i(t), and Z_i(t) are the estimates of solutions to problem (30) by agent i at time t, a_{i,j} is the (i, j)th element of the adjacency matrix of graph G, and [B_{li}]_C, [F_{vi}]_R, and [I_{r_i}]_C are shorthand notations for [B_{li}]_C^{\{q_j\}_{j=1}^n}, [F_{vi}]_R^{\{m_j\}_{j=1}^n}, and [I_{r_i}]_C^{\{r_j\}_{j=1}^n}, as defined in (1) and (2).

Similar to algorithm (24), algorithm (31) is the saddle-point dynamics of a modified Lagrangian function with derivative feedbacks, which act as a "damping" term (see Remark 4.7).

The following lemma reveals the connection between the solutions to problem (30) and the equilibria of algorithm (31); its proof is omitted, being immediate from the KKT optimality condition [36].

Lemma 4.4: Suppose that the undirected graph G is connected. (X_E^*, Y*, Z*, U*, W*) ∈ R^{nr×p} × R^{r×q} × R^{nr×q} × R^{nm×q} × R^{nm×q} is a solution to problem (30) if and only if there exist matrices Λ^{1*} ∈ R^{nr×p}, Λ^{2*} ∈ R^{nm×q}, and Λ^{3*} ∈ R^{nr×q} such that (X_E^*, Y*, Z*, U*, W*, Λ^{1*}, Λ^{2*}, Λ^{3*}) is an equilibrium of (31).

Define the function

V(X_E, Y, Z, U, W, \Lambda^1, \Lambda^2, \Lambda^3) = V_1(X_E, Y, U) + V_2(X_E, Y, Z, U, W, \Lambda^1, \Lambda^2, \Lambda^3)   (32)

with

with

V1 � 12

n∑

i=1

‖AliYvi − [Fvi ]R − Ui‖2F

+n∑

i=1

n∑

j=1

〈Λ1∗i , ai,j (Xi − Xj )〉F

+n∑

i=1

〈Λ2∗i , Ui〉F +

n∑

i=1

〈Λ3∗i , [Yvi ]R − Xi [Bli ]C〉F

− 12

n∑

i=1

‖AliY∗vi − [Fvi ]R − U ∗

i ‖2F

Page 10: Distributed Computation of Linear Matrix Equations: An

ZENG et al.: DISTRIBUTED COMPUTATION OF LINEAR MATRIX EQUATIONS: AN OPTIMIZATION PERSPECTIVE 1867

V2 � 12

n∑

i=1

‖Xi − X∗i ‖2

F +12

n∑

i=1

‖Yvi − Y ∗vi‖2

F

+12

n∑

i=1

‖Zi − Z∗i ‖2

F +12

n∑

i=1

‖Ui − U ∗i ‖2

F

+12

n∑

i=1

‖Wi − W ∗i ‖2

F +12

n∑

i=1

‖Λ1i − Λ1∗

i ‖2F

+12

n∑

i=1

‖Λ2i − Λ2∗

i ‖2F +

12

n∑

i=1

‖Λ3i − Λ3∗

i ‖2F

where (X∗E , Y ∗, Z∗, U ∗,W ∗,Λ1∗,Λ2∗,Λ3∗) is an equilibrium

of (31).Lemma 4.5: Suppose that the undirected graph G is con-

nected. The function V1(XE , Y, U) defined in (32) is nonnega-tive for all (XE , Y, U) ∈ Rnr×p × Rr×q × Rnm×q .

The proof is similar to that of Lemma 4.3 and omitted.The following result shows the exponential convergence of

algorithm (31).Proposition 4.6: If the undirected graph G is connected, then1) every equilibrium of algorithm (31) is Lyapunov stable,

and its trajectory is bounded for any initial condition;2) the trajectory of algorithm (31) is exponentially conver-

gent, and Xi(t) converges to a least squares solution of(5) exponentially for all i ∈ {1, . . . , n}.

The proof is similar to that of Proposition 4.4 by using theLyapunov candidate function V defined in (32). Hence, it isomitted.

D. Column-Row-Row Structure

In the CRR structure, which is the most complicated of the four standard structures, the above decomposition methods do not work. Define a substitutional variable

Y = \begin{bmatrix} Y_{v1} \\ \vdots \\ Y_{vn} \end{bmatrix} \in R^{r×q}, \quad Y_{vi} \in R^{r_i×q}, \quad \forall i \in \{1, \ldots, n\}.

Clearly, (5) with (10) is equivalent to AY = F and Y = XB. Moreover, we define the augmented matrices [Y_{vi}]_R^{\{r_j\}_{j=1}^n} and [F_{vi}]_R^{\{m_j\}_{j=1}^n} as in (1) and write [Y_{vi}]_R and [F_{vi}]_R for convenience. Then, we have

\sum_{i=1}^n A_{li} Y_{vi} = \sum_{i=1}^n [F_{vi}]_R   (33)

\sum_{i=1}^n [Y_{vi}]_R = \sum_{i=1}^n X_{li} B_{vi}.   (34)

To decompose (33) and (34), we take new variables U_i ∈ R^{m×q}, W_i ∈ R^{m×q}, and Z_i ∈ R^{r×q} such that

A_{li} Y_{vi} - [F_{vi}]_R - U_i = 0_{m×q}, \quad U_i = \sum_{j=1}^n a_{i,j} (W_i - W_j)   (35)

[Y_{vi}]_R - X_{li} B_{vi} - \sum_{j=1}^n a_{i,j} (Z_i - Z_j) = 0_{r×q}.   (36)

Let X = [X_{l1}, \ldots, X_{ln}] ∈ R^{r×p}, Y = [Y_{v1}^T, \ldots, Y_{vn}^T]^T ∈ R^{r×q}, U = [U_1^T, \ldots, U_n^T]^T ∈ R^{nm×q}, W = [W_1^T, \ldots, W_n^T]^T ∈ R^{nm×q}, and Z = [Z_1^T, \ldots, Z_n^T]^T ∈ R^{nr×q}. We reformulate the distributed computation of (5) with the CRR structure as the following optimization problem:

\min_{X, Y, U, W, Z} \; \sum_{i=1}^n \|A_{li} Y_{vi} - [F_{vi}]_R - U_i\|_F^2   (37a)

\text{s.t.} \;\; [Y_{vi}]_R - X_{li} B_{vi} - \sum_{j=1}^n a_{i,j} (Z_i - Z_j) = 0_{r×q}   (37b)

U_i = \sum_{j=1}^n a_{i,j} (W_i - W_j), \quad i \in \{1, \ldots, n\}.   (37c)

The transformation given here is called the consensus-free substitutional decomposition because we do not need the consensus of X_i or Y_i for i = 1, ..., n. Then, we have the following result.

Proposition 4.7: Suppose that the undirected graph G is connected. X* is a least squares solution to (5) if and only if there exist Y*, Z*, U*, and W* such that (X*, Y*, Z*, U*, W*) is a solution to problem (37).

The proof is omitted due to the space limitation and its similarity to that of Proposition 4.1.

In this structure, we propose a distributed algorithm for agent i as follows:

\dot{X}_{li}(t) = \Lambda_i^2(t) B_{vi}^T, \quad X_{li}(0) = X_{li0} \in R^{r×p_i}   (38a)

\dot{Y}_{vi}(t) = -A_{li}^T (A_{li} Y_{vi}(t) - [F_{vi}]_R - U_i(t)) - [I_{r_i}]_C \Lambda_i^2(t), \quad Y_{vi}(0) = Y_{vi0} \in R^{r_i×q}   (38b)

\dot{U}_i(t) = A_{li} Y_{vi}(t) - [F_{vi}]_R - U_i(t) - \Lambda_i^1(t), \quad U_i(0) = U_{i0} \in R^{m×q}   (38c)

\dot{W}_i(t) = \sum_{j=1}^n a_{i,j} (\Lambda_i^1(t) - \Lambda_j^1(t)), \quad W_i(0) = W_{i0} \in R^{m×q}   (38d)

\dot{Z}_i(t) = \sum_{j=1}^n a_{i,j} (\Lambda_i^2(t) - \Lambda_j^2(t)), \quad Z_i(0) = Z_{i0} \in R^{r×q}   (38e)

\dot{\Lambda}_i^1(t) = U_i(t) + \dot{U}_i(t) - \sum_{j=1}^n a_{i,j} (W_i(t) - W_j(t)) - \sum_{j=1}^n a_{i,j} (\Lambda_i^1(t) - \Lambda_j^1(t)), \quad \Lambda_i^1(0) = \Lambda_{i0}^1 \in R^{m×q}   (38f)

\dot{\Lambda}_i^2(t) = [Y_{vi}]_R(t) + [\dot{Y}_{vi}]_R(t) - (X_{li}(t) + \dot{X}_{li}(t)) B_{vi} - \sum_{j=1}^n a_{i,j} (Z_i(t) - Z_j(t)) - \sum_{j=1}^n a_{i,j} (\Lambda_i^2(t) - \Lambda_j^2(t)), \quad \Lambda_i^2(0) = \Lambda_{i0}^2 \in R^{r×q}   (38g)


where i ∈ {1, ..., n}, t ≥ 0, X_{li}(t), Y_{vi}(t), U_i(t), W_i(t), and Z_i(t) are the estimates of solutions to problem (37) by agent i at time t, a_{i,j} is the (i, j)th element of the adjacency matrix of graph G, and [Y_{vi}]_R = [Y_{vi}]_R^{\{r_j\}_{j=1}^n} and [I_{r_i}]_C = [I_{r_i}]_C^{\{r_j\}_{j=1}^n} are as defined in (1) and (2), respectively. Similar to algorithms (24) and (31), the design of algorithm (38) also combines the saddle-point dynamics of the modified Lagrangian function with the derivative feedback technique.

Let

\Lambda^1 = \begin{bmatrix} \Lambda_1^1 \\ \vdots \\ \Lambda_n^1 \end{bmatrix} \in R^{nm×q} \quad \text{and} \quad \Lambda^2 = \begin{bmatrix} \Lambda_1^2 \\ \vdots \\ \Lambda_n^2 \end{bmatrix} \in R^{nr×q},

where Λ_i^1 ∈ R^{m×q} and Λ_i^2 ∈ R^{r×q} for i ∈ {1, ..., n}. We have the following result, whose proof is omitted because it is straightforward from the KKT optimality condition [36, Th. 3.25].

Lemma 4.6: Suppose that the undirected graph G is connected. (X*, Y*, Z*, U*, W*) ∈ R^{r×p} × R^{r×q} × R^{nr×q} × R^{nm×q} × R^{nm×q} is a solution to problem (37) if and only if there exist Λ^{1*} ∈ R^{nm×q} and Λ^{2*} ∈ R^{nr×q} such that (X*, Y*, Z*, U*, W*, Λ^{1*}, Λ^{2*}) is an equilibrium of (38).

For further analysis, let (X*, Y*, Z*, U*, W*, Λ^{1*}, Λ^{2*}) be an equilibrium of (38), and take

V(X, Y, Z, U, W, \Lambda^1, \Lambda^2) = V_1(Y, U) + V_2(X, Y, Z, U, W, \Lambda^1, \Lambda^2)   (39)

where

V_1 = \frac{1}{2}\sum_{i=1}^n \|A_{li} Y_{vi} - [F_{vi}]_R - U_i\|_F^2 + \sum_{i=1}^n \langle \Lambda_i^{1*}, U_i \rangle_F + \sum_{i=1}^n \langle \Lambda_i^{2*}, [Y_{vi}]_R \rangle_F - \frac{1}{2}\sum_{i=1}^n \|A_{li} Y_{vi}^* - [F_{vi}]_R - U_i^*\|_F^2

and

V_2 \triangleq \frac{1}{2}\sum_{i=1}^n \|X_{li} - X_{li}^*\|_F^2 + \frac{1}{2}\sum_{i=1}^n \|Y_{vi} - Y_{vi}^*\|_F^2 + \frac{1}{2}\sum_{i=1}^n \|Z_i - Z_i^*\|_F^2 + \frac{1}{2}\sum_{i=1}^n \|U_i - U_i^*\|_F^2 + \frac{1}{2}\sum_{i=1}^n \|W_i - W_i^*\|_F^2 + \frac{1}{2}\sum_{i=1}^n \|\Lambda_i^1 - \Lambda_i^{1*}\|_F^2 + \frac{1}{2}\sum_{i=1}^n \|\Lambda_i^2 - \Lambda_i^{2*}\|_F^2.

Then, we get the following result.Lemma 4.7: The function V1(Y,U) defined in (39) is non-

negative for all (Y,U) ∈ Rr×q × Rnm×q .The proof is similar to that of Lemma 4.5 and is omitted.

We now present the main result of this subsection: the exponential convergence of algorithm (38).

Proposition 4.8: If the undirected graph $\mathcal{G}$ is connected, then
1) every equilibrium of algorithm (38) is Lyapunov stable, and its trajectory is bounded for any initial condition;
2) the trajectory of algorithm (38) is exponentially convergent, and $X(t)$ converges to a least squares solution of the matrix equation (5) exponentially.

Using the Lyapunov candidate function $V$ in (39), the proof follows steps similar to those of Proposition 4.4 and is thus omitted.

E. Discussions

The conclusion of Theorem 3.1 follows immediately from the results given in Propositions 4.1–4.8. In fact, we develop new methods for the distributed computation of a least squares solution to matrix equation (5), which is much more complicated than that of the linear algebraic equation. The main results of this section are summarized as follows.

1) We employ different substitutional decomposition methods to reformulate the original matrix-equation computation problems as distributed constrained optimization problems with different constraints in the standard structures. Note that the decompositions are new compared with those in the distributed computation of the linear algebraic equation of the form $Ax = b$ in [11]–[14], [17], [18].

2) We give distributed algorithms to deal with the distributed constrained optimization problems, which are equivalent to the matrix equation in the different standard structures. The proposed algorithms are not a direct application of existing ideas on distributed subgradient optimization designs. Derivative feedback ideas are used to deal with the general (nonstrict) convexity of the objective functions. Additionally, auxiliary variables are employed as observers to estimate the matrix information that is unavailable due to the structure decomposition. Therefore, the proposed algorithms are different from those given in [3] and [11], which did not use derivative feedbacks and auxiliary variables.

3) We give the exponential convergence analysis of the algorithms by using advanced (control) techniques such as Lyapunov stability theory and derivative feedback to deal with the convexity of the objective functions. The proposed algorithms are globally exponentially convergent, which guarantees exponential convergence to a least squares solution to the matrix equation for any initial condition.

In each standard structure of our problems, we have to employ different ideas to obtain solutions of the reformulated distributed optimization problems, because the distributed design for problems with various constraints and merely convex objective functions is a nontrivial task. Moreover, the derivative feedback plays a "damping" role in the structures with coupled constraints, which is essential for the convergence of the proposed algorithms. Specifically, different consensus variables and derivative feedback variables are used for the various structures due to their different constraints (see Table I). The developed approach may provide effective tools for general cases or mixed structures, even though there may be no universal way of generalizing this approach.

Remark 4.9: This paper sheds light on the state of the art of the distributed computation of matrix equations from a distributed optimization perspective. For different problem structures, distributed computation algorithms with exponential convergence rates are proposed by combining distributed optimization and control ideas. $\square$

TABLE I
CONSENSUS VARIABLE, COUPLED CONSTRAINT, AND DERIVATIVE FEEDBACK

V. NUMERICAL SIMULATION

In this section, we give a numerical example for illustration. Due to the space limitation, we only present a numerical simulation for the RRR structure.

Consider a linear matrix equation (5) with the structure (8) and $n = 4$, where
$$A_{v1} = [2, 1], \quad A_{v2} = [4, 3], \quad A_{v3} = [1, 3], \quad A_{v4} = [2, 4]$$
$$B_{v1} = [1, 2], \quad B_{v2} = [3, 2], \quad B_{v3} = [2, 4], \quad B_{v4} = [2, 1]$$
and $F$ is given by
$$F_{v1} = [0, 0], \quad F_{v2} = [2, 1], \quad F_{v3} = [3, 5], \quad F_{v4} = [1, 4].$$

There is no exact solution to this matrix equation, and therefore we seek a least squares solution. Let the adjacency matrix of the graph be

$$\begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{bmatrix}.$$
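Before running the algorithm, one can verify the standing connectivity assumption directly: the graph is connected exactly when the second-smallest Laplacian eigenvalue (the Fiedler value) is positive. A quick check (our illustration) in NumPy:

```python
# Connectivity check for the simulation graph: the Fiedler value
# (second-smallest eigenvalue of the Laplacian L = D - A) must be positive.
import numpy as np

Adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)
L = np.diag(Adj.sum(axis=1)) - Adj
print(np.sort(np.linalg.eigvalsh(L)))  # [0, 2, 2, 4]; second entry > 0 => connected
```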

We obtain a least squares solution with algorithm (24):
$$X = [X_{l1}, X_{l2}, X_{l3}, X_{l4}] = \begin{bmatrix} -0.2744 & 0.0973 & -0.2058 & 0.1572 \\ 0.3780 & -0.0373 & 0.2835 & -0.1163 \end{bmatrix} \in \mathbb{R}^{2 \times 4}$$
where agent $i$ estimates $X_{li}$ for $i \in \{1, \ldots, 4\}$. Fig. 1 shows that the trajectory of the algorithm converges to a least squares solution, Fig. 2 shows the trajectory of $\|AXB - F\|_F$, and Fig. 3 demonstrates the boundedness of the algorithm variables.
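As a sanity check (our illustration, not part of the distributed algorithm), a least squares solution can also be computed centrally by vectorization, $\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$, assuming per the RRR structure that $A_{vi}$, $B_{vi}$, and $F_{vi}$ are the $i$th rows of $A$, $B$, and $F$:

```python
# Centralized least squares check for min_X ||A X B - F||_F using NumPy.
# np.linalg.lstsq returns the minimum-norm least squares solution, which need
# not equal the particular solution reached by algorithm (24) (that one depends
# on initial conditions), but both attain the same minimal residual.
import numpy as np

A = np.array([[2., 1.], [4., 3.], [1., 3.], [2., 4.]])   # rows A_vi
B = np.array([[1., 2.], [3., 2.], [2., 4.], [2., 1.]])   # rows B_vi
F = np.array([[0., 0.], [2., 1.], [3., 5.], [1., 4.]])   # rows F_vi

K = np.kron(B.T, A)                  # vec(A X B) = K vec(X), column-major vec
x, *_ = np.linalg.lstsq(K, F.flatten(order="F"), rcond=None)
X = x.reshape(2, 4, order="F")

print(X)                             # one least squares solution
print(np.linalg.norm(A @ X @ B - F)) # minimal residual ||A X B - F||_F
```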

Fig. 1. Trajectories of estimates for $X$ versus time.

Fig. 2. Trajectories of estimates for $\|AXB - F\|_F$ versus time.

Fig. 3. Trajectories of estimates for $Y$, $Z$, $\Lambda^1$, and $\Lambda^2$ versus time.

VI. CONCLUSION

In this paper, the distributed computation of least squares solutions to the linear matrix equation $AXB = F$ in standard distributed structures has been studied. Based on substitutional decompositions, the computation problems have been reformulated as equivalent constrained optimization problems in the standard cases. Inspired by saddle-point dynamics and derivative feedbacks, distributed continuous-time algorithms for the reformulated problems have been proposed. Furthermore, the boundedness and exponential convergence of the proposed algorithms have been proved using stability and Lyapunov approaches. Finally, the algorithm performance has been illustrated via a numerical simulation.

This paper assumes that the matrix information is divided with respect to rows and columns, and it solves for least squares solutions. It is desirable to further investigate the distributed computation of other well-known linear matrix equations with mixed structures and of more general solutions, such as solutions to LASSO-type problems. In addition, the undirected graphs may be generalized to directed graphs, random graphs, and switching graphs, for instance, and various effective discrete-time algorithms based on the alternating direction method of multipliers (ADMM) or other methods may be constructed.

APPENDIX A
PROOF OF PROPOSITION 4.1

1) Suppose that $(X_E^*, Y_E^*) = (1_n \otimes X^*, 1_n \otimes Y^*)$ is a solution to (18). We show that $X^*$ is a least squares solution to (5).

Because $\mathcal{G}$ is undirected and connected, (18b) is equivalent to
$$\sum_{j=1}^{n} a_{i,j}(X_i - X_j) = 0_{r \times p}, \qquad \sum_{j=1}^{n} a_{i,j}(Y_i - Y_j) = 0_{m \times p}$$
$$A_{vi}X_i = Y_i^{vi}, \qquad i \in \{1, \ldots, n\}.$$

By the KKT optimality condition [36, Th. 3.25], $(X_E^*, Y_E^*) = (1_n \otimes X^*, 1_n \otimes Y^*)$ is a solution to problem (18) if and only if $AX^* = Y^*$ and there are matrices $\Lambda_i^{1*} \in \mathbb{R}^{r \times p}$, $\Lambda_i^{2*} \in \mathbb{R}^{m \times p}$, and $\Lambda_i^{3*} \in \mathbb{R}^{m_i \times p}$ such that
$$0_{r \times p} = -A_{vi}^T\Lambda_i^{3*} - \sum_{j=1}^{n} a_{j,i}\big(\Lambda_i^{1*} - \Lambda_j^{1*}\big) \tag{40a}$$
$$0_{m \times p} = -\big(Y^*B_{li} - F_{li}\big)B_{li}^T + [I_{m_i}]_R\Lambda_i^{3*} - \sum_{j=1}^{n} a_{j,i}\big(\Lambda_i^{2*} - \Lambda_j^{2*}\big) \tag{40b}$$
where, for simplicity, $[I_{m_i}]_R$ denotes $[I_{m_i}]_R^{\{m_j\}_{j=1}^n}$ as defined in (1). By (40) and $a_{i,j} = a_{j,i}$ (because $\mathcal{G}$ is undirected), we have

$$0_{r \times p} = \sum_{i=1}^{n}\Big[A_{vi}^T\Lambda_i^{3*} + \sum_{j=1}^{n} a_{j,i}\big(\Lambda_i^{1*} - \Lambda_j^{1*}\big)\Big] = \sum_{i=1}^{n} A_{vi}^T\Lambda_i^{3*} = A^T\Lambda^{3*} \tag{41}$$
$$0_{m \times p} = \sum_{i=1}^{n}\Big[-\big(Y^*B_{li} - F_{li}\big)B_{li}^T + [I_{m_i}]_R\Lambda_i^{3*} - \sum_{j=1}^{n} a_{j,i}\big(\Lambda_i^{2*} - \Lambda_j^{2*}\big)\Big] = \sum_{i=1}^{n}\Big[-\big(Y^*B_{li} - F_{li}\big)B_{li}^T + [I_{m_i}]_R\Lambda_i^{3*}\Big] = -\big(Y^*B - F\big)B^T + \Lambda^{3*} \tag{42}$$
where $\Lambda^{3*} = \big[(\Lambda_1^{3*})^T \cdots (\Lambda_n^{3*})^T\big]^T \in \mathbb{R}^{m \times p}$. It follows from (41) and (42) that $A^T(Y^*B - F)B^T = 0_{r \times p}$. Recall that $AX^* = Y^*$. Hence, (6) holds and $X^*$ is a least squares solution to (5).

2) Conversely, suppose that $X^*$ is a least squares solution to (5) and $Y^* = AX^*$. We show that $(X_E^*, Y_E^*) = (1_n \otimes X^*, 1_n \otimes Y^*)$ is a solution to problem (18) by proving (40).

Let $\Lambda^{3*} = \big[(\Lambda_1^{3*})^T \cdots (\Lambda_n^{3*})^T\big]^T = (Y^*B - F)B^T$. Then (6) can be rewritten as
$$A^T\Lambda^{3*} = \sum_{i=1}^{n} A_{vi}^T\Lambda_i^{3*} = 0_{r \times p}$$
$$\Lambda^{3*} - (Y^*B - F)B^T = \sum_{i=1}^{n}\Big[-\big(Y^*B_{li} - F_{li}\big)B_{li}^T + [I_{m_i}]_R\Lambda_i^{3*}\Big].$$
Because $\ker(L_n)$ and $\operatorname{range}(L_n)$ form an orthogonal decomposition of $\mathbb{R}^n$ by the fundamental theorem of linear algebra [38], where $L_n$ is the Laplacian matrix of $\mathcal{G}$, there are matrices $\Lambda_i^{1*} \in \mathbb{R}^{r \times p}$ and $\Lambda_i^{2*} \in \mathbb{R}^{m \times p}$ such that (40) holds. It follows from $AX^* = Y^*$ and the KKT optimality condition [36, Th. 3.25] that $(X_E^*, Y_E^*) = (1_n \otimes X^*, 1_n \otimes Y^*)$ is a solution to problem (18).
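To make condition (6) concrete, here is a small numerical check (our illustration, not part of the proof), reusing the data of Section V: any least squares solution must zero the gradient $A^T(AXB - F)B^T$.

```python
# Numerical check that a least squares solution of A X B = F satisfies (6):
# A^T (A X B - F) B^T = 0. The normal equation K^T (K vec(X) - vec(F)) = 0
# of the vectorized problem is exactly the vectorization of (6).
import numpy as np

A = np.array([[2., 1.], [4., 3.], [1., 3.], [2., 4.]])
B = np.array([[1., 2.], [3., 2.], [2., 4.], [2., 1.]])
F = np.array([[0., 0.], [2., 1.], [3., 5.], [1., 4.]])

K = np.kron(B.T, A)
x, *_ = np.linalg.lstsq(K, F.flatten(order="F"), rcond=None)
X = x.reshape(2, 4, order="F")

print(np.allclose(A.T @ (A @ X @ B - F) @ B.T, 0.0))  # expect True
```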

APPENDIX B
PROOF OF PROPOSITION 4.2

1) Let $(X_E^*, Y_E^*, \Lambda^{1*}, \Lambda^{2*}, \Lambda^{3*})$ be any equilibrium of algorithm (19), and let $V$ be the positive definite function

$$V(X_E, Y_E, \Lambda^1, \Lambda^2, \Lambda^3) \triangleq \frac{1}{2}\sum_{i=1}^{n}\|X_i - X_i^*\|_F^2 + \frac{1}{2}\sum_{i=1}^{n}\|Y_i - Y_i^*\|_F^2 + \frac{1}{2}\sum_{i=1}^{n}\|\Lambda_i^1 - \Lambda_i^{1*}\|_F^2 + \frac{1}{2}\sum_{i=1}^{n}\|\Lambda_i^2 - \Lambda_i^{2*}\|_F^2 + \frac{1}{2}\sum_{i=1}^{n}\|\Lambda_i^3 - \Lambda_i^{3*}\|_F^2.$$

The derivative of $V$ along the trajectory of algorithm (19) is given by
$$\dot{V} = \sum_{i=1}^{n}\langle X_i - X_i^*, \dot{X}_i\rangle_F + \sum_{i=1}^{n}\langle Y_i - Y_i^*, \dot{Y}_i\rangle_F + \sum_{i=1}^{n}\langle \Lambda_i^1 - \Lambda_i^{1*}, \dot{\Lambda}_i^1\rangle_F + \sum_{i=1}^{n}\langle \Lambda_i^2 - \Lambda_i^{2*}, \dot{\Lambda}_i^2\rangle_F + \sum_{i=1}^{n}\langle \Lambda_i^3 - \Lambda_i^{3*}, \dot{\Lambda}_i^3\rangle_F. \tag{43}$$

By algorithm (19) and the facts that $A_{vi}X_i^* - Y_i^{vi*} = 0_{m_i \times p}$, $X_i^* = X_j^*$, and $-A_{vi}^T\Lambda_i^{3*} - \sum_{j=1}^{n} a_{i,j}(\Lambda_i^{1*} - \Lambda_j^{1*}) = 0_{r \times p}$, we have
$$\sum_{i=1}^{n}\langle X_i - X_i^*, \dot{X}_i\rangle_F = -\sum_{i=1}^{n}\|A_{vi}(X_i - X_i^*)\|_F^2 - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}\|X_i - X_j\|_F^2 - \sum_{i=1}^{n}\langle X_i - X_i^*, A_{vi}^T(\Lambda_i^3 - \Lambda_i^{3*})\rangle_F + \sum_{i=1}^{n}\langle X_i - X_i^*, A_{vi}^T(Y_i^{vi} - Y_i^{vi*})\rangle_F - \sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}\langle \Lambda_i^1 - \Lambda_i^{1*}, X_i - X_j\rangle_F \tag{44}$$


$$\sum_{i=1}^{n}\langle Y_i - Y_i^*, \dot{Y}_i\rangle_F = -\sum_{i=1}^{n}\|(Y_i - Y_i^*)B_{li}\|_F^2 + \sum_{i=1}^{n}\langle \Lambda_i^3 - \Lambda_i^{3*}, Y_i^{vi} - Y_i^{vi*}\rangle_F + \sum_{i=1}^{n}\langle Y_i^{vi} - Y_i^{vi*}, A_{vi}(X_i - X_i^*)\rangle_F - \sum_{i=1}^{n}\|Y_i^{vi} - Y_i^{vi*}\|_F^2 - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}\|Y_i - Y_j\|_F^2 - \sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}\langle \Lambda_i^2 - \Lambda_i^{2*}, Y_i - Y_j\rangle_F \tag{45}$$
$$\sum_{i=1}^{n}\langle \Lambda_i^1 - \Lambda_i^{1*}, \dot{\Lambda}_i^1\rangle_F = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}\langle \Lambda_i^1 - \Lambda_i^{1*}, X_i - X_j\rangle_F \tag{46}$$
$$\sum_{i=1}^{n}\langle \Lambda_i^2 - \Lambda_i^{2*}, \dot{\Lambda}_i^2\rangle_F = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}\langle \Lambda_i^2 - \Lambda_i^{2*}, Y_i - Y_j\rangle_F \tag{47}$$
$$\sum_{i=1}^{n}\langle \Lambda_i^3 - \Lambda_i^{3*}, \dot{\Lambda}_i^3\rangle_F = \sum_{i=1}^{n}\langle \Lambda_i^3 - \Lambda_i^{3*}, A_{vi}(X_i - X_i^*)\rangle_F - \sum_{i=1}^{n}\langle \Lambda_i^3 - \Lambda_i^{3*}, Y_i^{vi} - Y_i^{vi*}\rangle_F. \tag{48}$$

Summing up,
$$\dot{V} = -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}\|X_i - X_j\|_F^2 - \sum_{i=1}^{n}\|A_{vi}X_i - Y_i^{vi}\|_F^2 - \sum_{i=1}^{n}\|(Y_i - Y_i^*)B_{li}\|_F^2 - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}\|Y_i - Y_j\|_F^2 \leq 0. \tag{49}$$

Hence, $(X_E^*, Y_E^*, \Lambda^{1*}, \Lambda^{2*}, \Lambda^{3*})$ is a Lyapunov stable equilibrium of algorithm (19). Because the function $V$ is positive definite and radially unbounded, it follows from (49) that the trajectory of algorithm (19) is bounded for any initial condition.

2) Define the set
$$\mathcal{R} = \big\{(X_E, Y_E, \Lambda^1, \Lambda^2, \Lambda^3) : \dot{V}(X_E, Y_E, \Lambda^1, \Lambda^2, \Lambda^3) = 0\big\} \subset \big\{(X_E, Y_E, \Lambda^1, \Lambda^2, \Lambda^3) : A_{vi}X_i - Y_i^{vi} = 0_{m_i \times p},\ (Y_i - Y_i^*)B_{li} = 0_{m \times q_i},\ X_i = X_j,\ Y_i = Y_j,\ i, j \in \{1, \ldots, n\}\big\}.$$

Let $\mathcal{M}$ be the largest invariant subset of $\mathcal{R}$. It follows from the invariance principle [40, Th. 2.41] that $(X_E(t), Y_E(t), \Lambda^1(t), \Lambda^2(t), \Lambda^3(t)) \to \mathcal{M}$ as $t \to \infty$, and $\mathcal{M}$ is positively invariant. Assume that $(\bar{X}_E(t), \bar{Y}_E(t), \bar{\Lambda}^1(t), \bar{\Lambda}^2(t), \bar{\Lambda}^3(t))$ is a trajectory of (19) that stays in $\mathcal{M}$ for all $t \geq 0$. For all $i \in \{1, \ldots, n\}$, we have $\dot{\bar{\Lambda}}_i^1(t) \equiv 0_{r \times p}$, $\dot{\bar{\Lambda}}_i^2(t) \equiv 0_{m \times p}$, and $\dot{\bar{\Lambda}}_i^3(t) \equiv 0_{m_i \times p}$, and hence
$$\dot{\bar{X}}_i(t) \equiv -A_{vi}^T\bar{\Lambda}_i^3(0) - \sum_{j=1}^{n} a_{i,j}\big(\bar{\Lambda}_i^1(0) - \bar{\Lambda}_j^1(0)\big)$$
$$\dot{\bar{Y}}_i(t) = -\big(\bar{Y}_i(t)B_{li} - F_{li}\big)B_{li}^T + [I_{m_i}]_R\bar{\Lambda}_i^3(0) - \sum_{j=1}^{n} a_{i,j}\big(\bar{\Lambda}_i^2(0) - \bar{\Lambda}_j^2(0)\big) = -\big(\bar{Y}_i(t) - Y_i^*\big)B_{li}B_{li}^T - \big(Y_i^*B_{li} - F_{li}\big)B_{li}^T + [I_{m_i}]_R\bar{\Lambda}_i^3(0) - \sum_{j=1}^{n} a_{i,j}\big(\bar{\Lambda}_i^2(0) - \bar{\Lambda}_j^2(0)\big) \equiv -\big(Y_i^*B_{li} - F_{li}\big)B_{li}^T + [I_{m_i}]_R\bar{\Lambda}_i^3(0) - \sum_{j=1}^{n} a_{i,j}\big(\bar{\Lambda}_i^2(0) - \bar{\Lambda}_j^2(0)\big).$$

Suppose that $\dot{\bar{X}}_i(t) \neq 0_{r \times p}$ (or $\dot{\bar{Y}}_i(t) \neq 0_{m \times p}$). Then $\bar{X}_i(t) \to \infty$ (or $\bar{Y}_i(t) \to \infty$) as $t \to \infty$, which contradicts the boundedness of the trajectory. Hence, $\dot{\bar{X}}_i(t) = 0_{r \times p}$, $\dot{\bar{Y}}_i(t) = 0_{m \times p}$, and $\mathcal{M} \subset \big\{(X_E, Y_E, \Lambda^1, \Lambda^2, \Lambda^3) : \dot{X}_i = 0_{r \times p},\ \dot{Y}_i = 0_{m \times p},\ \dot{\Lambda}_i^1 = 0_{r \times p},\ \dot{\Lambda}_i^2 = 0_{m \times p},\ \dot{\Lambda}_i^3 = 0_{m_i \times p}\big\}$.

Clearly, any point in $\mathcal{M}$ is an equilibrium point of algorithm (19). By part 1), any point in $\mathcal{M}$ is Lyapunov stable. It follows from Lemma 2.1 that (19) is globally convergent to an equilibrium. By Proposition 4.1 and Lemma 4.1, $X_i(t)$ converges to a least squares solution to (5). Furthermore, it follows from Lemma 2.2 that the convergence rate of algorithm (19) is exponential.

APPENDIX C
PROOF OF PROPOSITION 4.4

1) Let $(X^*, Y_E^*, Z^*, \Lambda^{1*}, \Lambda^{2*})$ be an equilibrium of algorithm (24) and define the function $V$ as in (25). The derivative of $V_1(\cdot)$ along the trajectory of algorithm (24) is

$$\dot{V}_1 = \sum_{i=1}^{n}\Big\langle A_{vi}^T(A_{vi}Y_i - F_{vi}) + \sum_{j=1}^{n} a_{i,j}(Y_i - Y_j) - A_{vi}^T(A_{vi}Y_i^* - F_{vi}),\, \dot{Y}_i\Big\rangle_F$$
$$= \sum_{i=1}^{n}\Big\langle A_{vi}^T(A_{vi}Y_i - F_{vi}) + \frac{1}{n}\Lambda_i^1 + \sum_{j=1}^{n} a_{i,j}(Y_i - Y_j) + \sum_{j=1}^{n} a_{i,j}(\Lambda_i^2 - \Lambda_j^2),\, \dot{Y}_i\Big\rangle_F + \sum_{i=1}^{n}\Big\langle -\frac{1}{n}\Lambda_i^1 - \sum_{j=1}^{n} a_{i,j}(\Lambda_i^2 - \Lambda_j^2) - A_{vi}^T(A_{vi}Y_i^* - F_{vi}),\, \dot{Y}_i\Big\rangle_F.$$

Note that $-A_{vi}^T(A_{vi}Y_i^* - F_{vi}) - \frac{1}{n}\Lambda_i^{1*} - \sum_{j=1}^{n} a_{i,j}(\Lambda_i^{2*} - \Lambda_j^{2*}) = 0_{r \times q}$ because $(X^*, Y_E^*, Z^*, \Lambda^{1*}, \Lambda^{2*})$ is an equilibrium of algorithm (24). Thus,

$$\dot{V}_1 = \sum_{i=1}^{n}\Big\langle A_{vi}^T(A_{vi}Y_i - F_{vi}) + \frac{1}{n}\Lambda_i^1 + \sum_{j=1}^{n} a_{i,j}(Y_i - Y_j) + \sum_{j=1}^{n} a_{i,j}(\Lambda_i^2 - \Lambda_j^2),\, \dot{Y}_i\Big\rangle_F - \frac{1}{n}\sum_{i=1}^{n}\langle \Lambda_i^1 - \Lambda_i^{1*}, \dot{Y}_i\rangle_F - \sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}\langle \Lambda_i^2 - \Lambda_j^2 - \Lambda_i^{2*} + \Lambda_j^{2*},\, \dot{Y}_i\rangle_F$$
$$= -\sum_{i=1}^{n}\|\dot{Y}_i\|_F^2 - \frac{1}{n}\sum_{i=1}^{n}\langle \Lambda_i^1 - \Lambda_i^{1*}, \dot{Y}_i\rangle_F - \sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}\langle \Lambda_i^2 - \Lambda_i^{2*}, \dot{Y}_i - \dot{Y}_j\rangle_F.$$

Following steps similar to those in the proof of part 1) of Proposition 4.2, the derivative of $V_2$ along the trajectory of algorithm (24) is
$$\dot{V}_2 = -\sum_{i=1}^{n}\|A_{vi}(Y_i - Y_i^*)\|_F^2 - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}\|Y_i - Y_j\|_F^2 + \frac{1}{n}\sum_{i=1}^{n}\langle \Lambda_i^1 - \Lambda_i^{1*}, \dot{Y}_i\rangle_F - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}\|\Lambda_i^1 - \Lambda_j^1\|_F^2 - \sum_{i=1}^{n}\|\Lambda_i^1 B_{vi}^T\|_F^2 + \sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}\langle \Lambda_i^2 - \Lambda_i^{2*}, \dot{Y}_i - \dot{Y}_j\rangle_F.$$

Hence,
$$\dot{V} = -\sum_{i=1}^{n}\|\dot{Y}_i\|_F^2 - \sum_{i=1}^{n}\|A_{vi}(Y_i - Y_i^*)\|_F^2 - \sum_{i=1}^{n}\|\dot{X}_{li}\|_F^2 - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}\|Y_i - Y_j\|_F^2 - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}\|\Lambda_i^1 - \Lambda_j^1\|_F^2 \leq 0. \tag{50}$$

Recall that $V_1(Y_E)$ is nonnegative for all $Y_E \in \mathbb{R}^{nr \times q}$ by Lemma 4.3. Because $V$ is positive definite and radially unbounded, $(X^*, Y_E^*, Z^*, \Lambda^{1*}, \Lambda^{2*})$ is a Lyapunov stable equilibrium; furthermore, it follows from (50) that the trajectory of algorithm (24) is bounded for any initial condition.

2) Let
$$\mathcal{R} = \big\{(X, Y_E, Z, \Lambda^1, \Lambda^2) : \dot{V}(X, Y_E, Z, \Lambda^1, \Lambda^2) = 0\big\} \subset \big\{(X, Y_E, Z, \Lambda^1, \Lambda^2) : A_{vi}(Y_i - Y_i^*) = 0_{m_i \times q},\ \dot{Y}_i = 0_{r \times q},\ \Lambda_i^1 = \Lambda_j^1,\ Y_i = Y_j,\ \dot{X}_{li} = 0_{r \times p_i},\ i, j \in \{1, \ldots, n\}\big\}.$$

Let $\mathcal{M}$ be the largest invariant subset of $\mathcal{R}$. It follows from the invariance principle [40, Th. 2.41] that $(X(t), Y_E(t), Z(t), \Lambda^1(t), \Lambda^2(t)) \to \mathcal{M}$ as $t \to \infty$. Note that $\mathcal{M}$ is invariant: the trajectory $(X(t), Y_E(t), Z(t), \Lambda^1(t), \Lambda^2(t)) \in \mathcal{M}$ for all $t \geq 0$ if $(X(0), Y_E(0), Z(0), \Lambda^1(0), \Lambda^2(0)) \in \mathcal{M}$. Assume that $(\bar{X}(t), \bar{Y}_E(t), \bar{Z}(t), \bar{\Lambda}^1(t), \bar{\Lambda}^2(t)) \in \mathcal{M}$ for all $t \geq 0$. Then $\dot{\bar{X}}_{li}(t) \equiv 0_{r \times p_i}$, $\dot{\bar{Y}}_i(t) \equiv 0_{r \times q}$, $\dot{\bar{Z}}_i(t) \equiv 0_{r \times q}$, and $\dot{\bar{\Lambda}}_i^2(t) \equiv 0_{r \times q}$, and hence
$$\dot{\bar{\Lambda}}_i^1(t) = \frac{1}{n}\bar{Y}_i(0) - \bar{X}_{li}(0)B_{vi} + \sum_{j=1}^{n} a_{i,j}\big(\bar{Z}_i(0) - \bar{Z}_j(0)\big).$$

If $\dot{\bar{\Lambda}}_i^1(t) \neq 0_{r \times q}$, then $\bar{\Lambda}_i^1(t) \to \infty$ as $t \to \infty$, which contradicts the boundedness of the trajectory. Hence, $\dot{\bar{\Lambda}}_i^1(t) \equiv 0_{r \times q}$ for $i \in \{1, \ldots, n\}$ and $\mathcal{M} \subset \big\{(X, Y_E, Z, \Lambda^1, \Lambda^2) : \dot{X}_{li} = 0_{r \times p_i},\ \dot{Y}_i = 0_{r \times q},\ \dot{Z}_i = 0_{r \times q},\ \dot{\Lambda}_i^1 = 0_{r \times q},\ \dot{\Lambda}_i^2 = 0_{r \times q}\big\}$.

Take any $(\bar{X}, \bar{Y}_E, \bar{Z}, \bar{\Lambda}^1, \bar{\Lambda}^2) \in \mathcal{M}$; it is clearly an equilibrium point of algorithm (24), and it follows from part 1) that it is Lyapunov stable. Hence, every point in $\mathcal{M}$ is Lyapunov stable. By Lemma 2.1, algorithm (24) converges to an equilibrium, and by Lemma 2.2 it does so exponentially. According to Proposition 4.3 and Lemma 4.2, $X(t)$ converges to a least squares solution to (5) exponentially.

REFERENCES

[1] A. Nedic, A. Ozdaglar, and P. A. Parrilo, "Constrained consensus and optimization in multi-agent networks," IEEE Trans. Autom. Control, vol. 55, no. 4, pp. 922–938, Apr. 2010.
[2] P. Yi, Y. Hong, and F. Liu, "Distributed gradient algorithm for constrained optimization with application to load sharing in power systems," Syst. Control Lett., vol. 83, pp. 45–52, 2015.
[3] S. S. Kia, J. Cortes, and S. Martinez, "Distributed convex optimization via continuous-time coordination algorithms with discrete-time communication," Automatica, vol. 55, pp. 254–264, 2015.
[4] Z. Qiu, S. Liu, and L. Xie, "Distributed constrained optimal consensus of multi-agent systems," Automatica, vol. 68, pp. 209–215, 2016.
[5] D. Yuan, D. W. C. Ho, and S. Xu, "Regularized primal-dual subgradient method for distributed constrained optimization," IEEE Trans. Cybern., vol. 46, no. 9, pp. 2109–2118, Sep. 2016.
[6] G. Shi and K. H. Johansson, "Randomized optimal consensus of multi-agent systems," Automatica, vol. 48, no. 12, pp. 3018–3030, 2012.
[7] X. Zeng, P. Yi, and Y. Hong, "Distributed continuous-time algorithm for constrained convex optimizations via nonsmooth analysis approach," IEEE Trans. Autom. Control, vol. 62, no. 10, pp. 5227–5233, Oct. 2017.
[8] J. Wang and N. Elia, "Control approach to distributed optimization," in Proc. Allerton Conf. Commun., Control Comput., Monticello, IL, USA, 2010, pp. 557–561.
[9] B. Gharesifard and J. Cortes, "Distributed continuous-time convex optimization on weight-balanced digraphs," IEEE Trans. Autom. Control, vol. 59, no. 3, pp. 781–786, Mar. 2014.
[10] Q. Liu and J. Wang, "A second-order multi-agent network for bound-constrained distributed optimization," IEEE Trans. Autom. Control, vol. 60, no. 12, pp. 3310–3315, Dec. 2015.
[11] G. Shi, B. D. O. Anderson, and U. Helmke, "Network flows that solve linear equations," IEEE Trans. Autom. Control, vol. 62, no. 6, pp. 2659–2674, Jun. 2017.
[12] S. Mou, J. Liu, and A. S. Morse, "A distributed algorithm for solving a linear algebraic equation," IEEE Trans. Autom. Control, vol. 60, no. 11, pp. 2863–2878, Nov. 2015.
[13] L. Wang, D. Fullmer, and A. S. Morse, "A distributed algorithm with an arbitrary initialization for solving a linear algebraic equation," in Proc. Amer. Control Conf., Boston, MA, USA, 2016, pp. 1078–1081.
[14] G. Shi and B. D. O. Anderson, "Distributed network flows solving linear algebraic equations," in Proc. Amer. Control Conf., Boston, MA, USA, 2016, pp. 2864–2869.
[15] B. Anderson, S. Mou, A. Morse, and U. Helmke, "Decentralized gradient algorithm for solution of a linear equation," Numer. Algebra Control Optim., vol. 6, no. 3, pp. 319–328, 2016.
[16] Y. Liu, C. Lageman, B. Anderson, and G. Shi, "An Arrow-Hurwicz-Uzawa type flow as least squares solver for network linear equations," arXiv:1701.03908v1.
[17] J. Liu, S. Mou, and A. S. Morse, "Asynchronous distributed algorithms for solving linear algebraic equations," IEEE Trans. Autom. Control, vol. 63, no. 2, pp. 372–385, Feb. 2018.
[18] J. Liu, A. S. Morse, A. Nedic, and T. Basar, "Exponential convergence of a distributed algorithm for solving linear algebraic equations," Automatica, vol. 83, pp. 37–46, 2017.
[19] K. Cao, X. Zeng, and Y. Hong, "Continuous-time distributed algorithms for solving linear algebraic equation," in Proc. 36th Chin. Control Conf., Dalian, China, 2017.
[20] A. Ben-Israel and T. N. E. Greville, Generalized Inverses: Theory and Applications. New York, NY, USA: Springer-Verlag, 2003.
[21] C. R. Rao and S. K. Mitra, Generalized Inverse of Matrices and Its Applications. New York, NY, USA: John Wiley & Sons, 1972.
[22] J. K. Baksalary and R. Kala, "The matrix equation AXB + CYD = E," Linear Algebra Appl., vol. 30, pp. 141–147, 1980.
[23] Y. Tian and H. Wang, "Relations between least-squares and least-rank solutions of the matrix equation AXB = C," Appl. Math. Comput., vol. 219, no. 20, pp. 10293–10301, 2013.
[24] K. G. Woodgate, "Least-squares solution of F = PG over positive semidefinite symmetric P," Linear Algebra Appl., vol. 245, pp. 171–190, 1996.
[25] L. Wu and B. Cain, "The re-nonnegative definite solutions to the matrix inverse problem AX = B," Linear Algebra Appl., vol. 236, pp. 137–146, 1996.
[26] C. J. Meng, X. Y. Hu, and L. Zhang, "The skew-symmetric orthogonal solutions of the matrix equation AX = B," Linear Algebra Appl., vol. 402, pp. 303–318, 2005.
[27] Q. Hui, W. M. Haddad, and S. P. Bhat, "Semistability, finite-time stability, differential inclusions, and discontinuous dynamical systems having a continuum of equilibria," IEEE Trans. Autom. Control, vol. 54, no. 10, pp. 2465–2470, Oct. 2009.
[28] A. S. Antipin, "Feedback-controlled saddle gradient processes," Autom. Remote Control, vol. 55, no. 3, pp. 311–320, 1994.
[29] C. Godsil and G. F. Royle, Algebraic Graph Theory. New York, NY, USA: Springer-Verlag, 2001.
[30] D. S. Bernstein, Matrix Mathematics: Theory, Facts, and Formulas, 2nd ed. Princeton, NJ, USA: Princeton Univ. Press, 2009.
[31] F. Ding, P. X. Liu, and J. Ding, "Iterative solutions of the generalized Sylvester matrix equations by using the hierarchical identification principle," Appl. Math. Comput., vol. 197, pp. 41–50, 2008.
[32] Y. Tian, "Some properties of submatrices in a solution to the matrix equation AXB = C with applications," J. Franklin Inst., vol. 346, no. 6, pp. 557–569, Aug. 2009.
[33] Y. Peng, X. Hu, and L. Zhang, "An iteration method for the symmetric solutions and the optimal approximation solution of the matrix equation AXB = C," Appl. Math. Comput., vol. 160, pp. 763–777, 2005.
[34] J. Hou, Z. Peng, and X. Zhang, "An iterative method for the least squares symmetric solution of matrix equation AXB = C," Numer. Algorithms, vol. 42, no. 2, pp. 181–192, 2006.
[35] F. Zhang, Y. Li, W. Guo, and J. Zhao, "Least squares solutions with special structure to the linear matrix equation AXB = C," Appl. Math. Comput., vol. 217, pp. 10049–10057, 2011.
[36] A. Ruszczynski, Nonlinear Optimization. Princeton, NJ, USA: Princeton Univ. Press, 2006.
[37] X. Wang, J. Zhou, S. Mou, and M. J. Corless, "A distributed algorithm for least square solutions of linear equations," arXiv:1709.10157v1.
[38] G. Strang, "The fundamental theorem of linear algebra," Amer. Math. Monthly, vol. 100, no. 9, pp. 848–855, 1993.
[39] P. Yi, Y. Hong, and F. Liu, "Initialization-free distributed algorithms for optimal resource allocation with feasibility constraints and application to economic dispatch of power systems," Automatica, vol. 74, pp. 259–269, 2016.
[40] W. M. Haddad and V. Chellaboina, Nonlinear Dynamical Systems and Control: A Lyapunov-Based Approach. Princeton, NJ, USA: Princeton Univ. Press, 2008.

[40] W. M. Haddad and V. Chellaboina, Nonlinear Dynamical Systems andControl: A Lyapunov-Based Approach. Princeton, NJ, USA: PrincetonUniv. Press, 2008.

Xianlin Zeng (S’12–M’15) received the B.S. andM.S. degrees in control science and engineeringfrom the Harbin Institute of Technology, Harbin,China, in 2009 and 2011, respectively, and thePh.D. degree in mechanical engineering fromthe Texas Tech University, Lubbock, TX, USA,in 2015.

He is currently a Postdoctoral with the KeyLaboratory of Intelligent Control and Decisionof Complex Systems, School of Automation,Beijing Institute of Technology, Beijing, China.

His research interests include distributed optimization, distributed con-trol, and distributed computation of network systems.

Shu Liang received the B.E. degree in automatic control and the Ph.D. degree in engineering from the University of Science and Technology of China, Hefei, China, in 2010 and 2015, respectively.

He is currently a Lecturer with the School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China. His research interests include nonsmooth systems and control, distributed optimization, game theory, and fractional-order systems.

Yiguang Hong (M’99–SM’02–F’17) received the B.S. and M.S. degrees from Peking University, Beijing, China, and the Ph.D. degree from the Chinese Academy of Sciences (CAS), Beijing, China.

He is currently a Professor with the Academy of Mathematics and Systems Science, CAS, and is the Director of the Key Laboratory of Systems and Control, CAS. His current research interests include nonlinear control, multiagent systems, distributed optimization/game, machine learning, and social networks.

Prof. Hong serves as the Editor-in-Chief of Control Theory and Technology and the Deputy Editor-in-Chief of Acta Automatica Sinica. He also serves or served as an Associate Editor for many journals, including the IEEE TRANSACTIONS ON AUTOMATIC CONTROL, IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, and IEEE CONTROL SYSTEMS MAGAZINE. He is a recipient of the Guang Zhaozhi Award at the Chinese Control Conference, the Young Author Prize of the IFAC World Congress, the Young Scientist Award of CAS, the Youth Award for Science and Technology of China, and the National Natural Science Prize of China.

Jie Chen (M’09–SM’12) received the B.S., M.S., and Ph.D. degrees from the Beijing Institute of Technology (BIT), Beijing, China, in 1986, 1996, and 2001, respectively.

He is currently a Professor with the School of Automation, BIT, where he serves as the Director of the Key Laboratory of Intelligent Control and Decision of Complex Systems and as the Vice-President of BIT. His research interests include complex systems, multi-agent systems, multi-objective optimization and decision, constrained nonlinear control, and optimization methods.

Prof. Chen serves as the Managing Editor of the Journal of Systems Science and Complexity, and is an Editorial Board Member and an Associate Editor for several journals, including the IEEE TRANSACTIONS ON CYBERNETICS, International Journal of Robust and Nonlinear Control, and Science China Information Sciences. He is a recipient of the National Natural Science Award of China and the National Science and Technology Progress Awards of China. He is an Academician of the Chinese Academy of Engineering, a Chief Scientist of the National 973 Basic Research Program, and the Principal Investigator of an Innovative-Research-Group Program supported by the Natural Science Foundation of China. He serves as the Vice-President of the Chinese Association of Automation and an Executive Director of the Chinese Artificial Intelligence Society.