Projection Algorithms and Monotone Operators
Heinz H. Bauschke
Diplom-Mathematiker, Goethe-Universität, 1990
A THESIS SUBMITTED IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
in the Department
of
Mathematics and Statistics
© Heinz H. Bauschke 1996
SIMON FRASER UNIVERSITY
August 1996
All rights reserved. This work may not be
reproduced in whole or in part, by photocopy
or other means, without the permission of the author.
APPROVAL
Name: Heinz H. Bauschke
Degree: Doctor of Philosophy
Title of thesis: Projection Algorithms and Monotone Operators
Examining Committee: Dr. Carl J. Schwarz
Chair
Dr. Jonathan M. Borwein
Senior Supervisor
Dr. Manfred R. Trummer
Dr. Robert D. Russell
Dr. Arvind Gupta
School of Computing Science
Dr. Frank R. Deutsch
Pennsylvania State University
External Examiner
Date Approved: August 8, 1996
PARTIAL COPYRIGHT LICENSE
I hereby grant to Simon Fraser University the right to lend my thesis, project or extended
essay (the title of which is shown below) to users of the Simon Fraser University Library,
and to make partial or single copies only for such users or in response to a request from the
library of any other university, or other educational institution, on its own behalf or for one
of its users. I further agree that permission for multiple copying of this work for scholarly
purposes may be granted by me or the Dean of Graduate Studies. It is understood that
copying or publication of this work for financial gain shall not be allowed without my written
permission.
Title of Thesis/Project/Extended Essay
Author: (signature)
Abstract
This thesis consists of two parts.
In Part I, projection algorithms for solving convex feasibility problems in Hilbert space are
studied. Powerful techniques from Convex Analysis are employed within a very general
framework that covers and extends many well-known results. Ostensibly different looking
conditions sufficient for linear convergence are shown to be special instances of regularity
- a concept new in this context. Numerous examples, including subgradient algorithms,
are presented.
Several notions of monotonicity of operators on Banach spaces are analyzed in Part II.
Utilizing Convex and Functional Analysis, it is shown that for a bounded linear positive
semi-definite operator, all these "monotonicities" coincide with the monotonicity of the
conjugate operator. Moreover, monotonicity of the conjugate operator is automatic in many
classical Banach spaces but not in spaces containing a complemented copy of the space of
absolutely convergent sequences.
Acknowledgments
I wish to express my deepest appreciation to my advisor Jon Borwein for his enthusiasm,
encouragement and able guidance. Because of his vast knowledge and understanding of
mathematics, my view on Optimization broadened immensely. It has been a great privilege
to study with him.
Thanks to Jon's restlessness, I had the opportunity to study at Dalhousie, Waterloo, and
Simon Fraser. I am grateful to all three departments for their help and support.
The CECM was a very stimulating environment. Thanks to Greg Fee, John Hebron and
Simon Plouffe for their Maple help. Special thanks go to Jack Ho for his invaluable Unix
help.
Mathematical thanks to Tamás Erdélyi, Simon Fitzpatrick, Adrian Lewis, Warren Moors,
Alex Simonič, and Jon Vanderwerff for discussing problems and making observations that
improved this thesis notably.
Many, many thanks to my friends Tamás Erdélyi, Mark Limber, and John Read, who have
always been supportive and encouraging.
I cannot thank my family enough for their great trust and their continuous support during
my entire studies. My father introduced me at a young age to the Goethe University in
Frankfurt, and I owe my education there entirely to my mother. I would also like to thank
my sister, who always has an open ear for her little brother's troubles. Thank you for
everything!
Most of all, I wish to thank my wife Stefanie for sharing these unforgettable years with me.
Widmung
To my mother and to the memory of my father
Contents
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Widmung . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Welcome, my dear reader . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Welcome to Part I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Migrating from Part I to Part II . . . . . . . . . . . . . . . . . . . . . 5
1.4 Welcome to Part II . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 The tool box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 Convex Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.1 Hilbert space geometry . . . . . . . . . . . . . . . . . . . . . 8
2.2.2 Mosco convergence . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.3 Subgradients . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2.4 Fenchel conjugates . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.5 The duality map . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.6 Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.7 Linear inequalities . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2.8 Convex relations . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3 Functional Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.1 Weak* closures . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.2 Bits from Operator Theory . . . . . . . . . . . . . . . . . . . 16
2.3.3 Machinery from Banach Space Theory . . . . . . . . . . . . . 18
2.3.4 Rugged Banach spaces . . . . . . . . . . . . . . . . . . . . . . 19
2.4 Nonexpansive maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.4.1 General nonexpansive maps . . . . . . . . . . . . . . . . . . . 21
2.4.2 Attracting maps . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.4.3 Strongly attracting maps . . . . . . . . . . . . . . . . . . . . 23
2.4.4 Averaged and firmly nonexpansive maps . . . . . . . . . . . . 24
2.5 Monotone Operator Theory . . . . . . . . . . . . . . . . . . . . . . . . 26
2.6 Miscellany . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.6.1 Hoo hoo - ha ha: the monkey house of interiors . . . . . . . . 26
2.6.2 Angles and differences of two closed convex sets . . . . . . . . 29
2.6.3 Pierra's product space formalization . . . . . . . . . . . . . . 31
2.6.4 Linearly convergent sequences . . . . . . . . . . . . . . . . . . 32
2.6.5 A "series" estimate . . . . . . . . . . . . . . . . . . . . . . . . 32
2.7 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
I Projection Algorithms in Hilbert spaces 35
3 Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2 Basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3.1 Hilbert lattice cones . . . . . . . . . . . . . . . . . . . . . . . 38
3.3.2 Polar cones . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.3.3 Rays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.3.4 Icecream cones . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3.5 Affine subspaces . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.3.6 Convex polyhedra . . . . . . . . . . . . . . . . . . . . . . . . 42
3.3.7 Balls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.3.8 Cylinders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.3.9 Polar sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.3.10 Epigraphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.3.11 Sublevel sets . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.4 Relaxations and quasi-projections . . . . . . . . . . . . . . . . . . . . 48
3.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4 Regularity for two sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.2 Basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.3 Two cones . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.4 Guaranteeing bounded linear regularity . . . . . . . . . . . . . . . . . 56
4.5 Two subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.6 Finite-dimensional results . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.7 Linearly constrained feasibility problems . . . . . . . . . . . . . . . . . 62
4.8 Limiting examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.9 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5 Regularity for finitely many sets . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.2 Basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.3 Finitely many cones . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.4 Promoting regularities . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.5 Finitely many subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.6 Finite-dimensional results . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.7 Halfspaces and convex polyhedra . . . . . . . . . . . . . . . . . . . . . 73
5.8 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6 Fejér monotone sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.2 Fejér monotone sequences . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.3 Two examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.3.1 Compositions of nonexpansive maps . . . . . . . . . . . . . . 77
6.3.2 Finding support in Hilbert spaces . . . . . . . . . . . . . . . 78
6.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
7 The convex feasibility problem . . . . . . . . . . . . . . . . . . . . . . . . . . 81
7.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
7.2 The convex feasibility problem . . . . . . . . . . . . . . . . . . . . . . 81
7.3 Motivating projection methods . . . . . . . . . . . . . . . . . . . . . . 83
7.4 Finding good weights . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.5 A fun algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
7.5.1 Convergence results . . . . . . . . . . . . . . . . . . . . . . . 93
7.5.2 Numerical experiments . . . . . . . . . . . . . . . . . . . . . 94
7.6 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
8 The general projection algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 97
8.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
8.2 One step at a time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
8.3 The general projection algorithm . . . . . . . . . . . . . . . . . . . . . 99
8.4 Asymptotically regular algorithms . . . . . . . . . . . . . . . . . . . . 101
8.5 Focusing algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
8.6 Strongly focusing algorithms . . . . . . . . . . . . . . . . . . . . . . . 103
8.7 Linearly focusing algorithms . . . . . . . . . . . . . . . . . . . . . . . 105
8.8 Overrelaxed algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . 106
8.9 Notions at a glance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
8.10 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
9 Applications! Applications! . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
9.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
9.2 Focusing algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
9.3 Strongly focusing algorithms . . . . . . . . . . . . . . . . . . . . . . . 114
9.4 Linearly focusing algorithms . . . . . . . . . . . . . . . . . . . . . . . 114
9.5 Perennial favourites . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
9.6 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
10 Subgradient algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
10.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
10.3 The general subgradient algorithm . . . . . . . . . . . . . . . . . . . . 121
10.4 Some applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
10.4.1 Censor and Lent's framework . . . . . . . . . . . . . . . . . . 123
10.4.2 Polyak's framework . . . . . . . . . . . . . . . . . . . . . . . 124
10.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
11 A farewell to Part I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
11.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
11.2 The von Neumann/Halperin result . . . . . . . . . . . . . . . . . . . . 126
11.2.1 Halperin's proof . . . . . . . . . . . . . . . . . . . . . . . . . 127
11.2.2 A relaxed version of Dykstra's algorithm . . . . . . . . . . . 127
11.3 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
11.4 Open problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
11.4.1 The alternating projections problem . . . . . . . . . . . . . . 133
11.4.2 The random projections problem . . . . . . . . . . . . . . . . 133
11.4.3 The cyclic projections problem . . . . . . . . . . . . . . . . . 134
11.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
II Monotone Operators in Banach spaces 136
12 The zoo of monotonicities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
12.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
12.2 Basic properties and facts . . . . . . . . . . . . . . . . . . . . . . . . . 137
12.3 The monotonicities for linear operators . . . . . . . . . . . . . . . . . 142
12.3.1 Symmetric operators . . . . . . . . . . . . . . . . . . . . . . 144
12.3.2 Skew operators . . . . . . . . . . . . . . . . . . . . . . . . . . 147
12.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
13 Characterizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
13.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
13.2 The main results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
13.3 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
14 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
14.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
14.2 Generating "weird" examples systematically . . . . . . . . . . . . . . . 159
14.3 Conjugate monotone spaces . . . . . . . . . . . . . . . . . . . . . . . . 163
14.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
15 Some nonlinear results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
15.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
15.2 Sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
15.3 Regularizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
15.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
16 A farewell to Part II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
16.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
16.2 Simons's strongly maximal monotone operators . . . . . . . . . . . . . 178
16.2.1 Primally strongly maximal monotone operators . . . . . . . . 178
16.2.2 Dually strongly maximal monotone operators . . . . . . . . . 179
16.3 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
16.4 Open problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
16.4.1 The interrelationship problem . . . . . . . . . . . . . . . . . . 180
16.4.2 The decomposition problem . . . . . . . . . . . . . . . . . . . 181
16.4.3 The (cms) problem . . . . . . . . . . . . . . . . . . . . . . . . 182
16.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
Glossarex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
List of Tables
8.1 Controls for projection algorithms . . . . . . . . . . . . . . . . . . . . . . . . 108
8.2 Parameters for projection algorithms . . . . . . . . . . . . . . . . . . . . . . . 108
8.3 Other notions for projection algorithms . . . . . . . . . . . . . . . . . . . . . 109
List of Figures
7.1 The weight w_1 as a function of P_2 x − x . . . . . . . . . . . . . . . . . . . . 91
7.2 The relaxation parameter ρ_good as a function of P_2 x − x . . . . . . . . . . . 92
7.3 The "progress" estimate as a function of P_2 x − x . . . . . . . . . . . . . . . 92
Chapter 1
Introduction
1.1 Welcome, my dear reader
In view of the extremely small number of people who read Ph.D. theses, I am thrilled that
you are reading this very sentence and hope very much that you bear with me for a while.
So what's in this thesis? It consists of two parts: Part I unifies and improves projection
algorithms for solving convex feasibility problems in Hilbert spaces. In Part II, various
notions of monotonicity for continuous linear monotone operators on Banach spaces are
compared and characterized.
The contents of, and the relationship between, the two parts are made more precise in the
remaining sections of this chapter.
A number of useful results - mostly from Convex Analysis - are presented in Chapter 2,
which is the common "tool box" for both Part I and Part II.
As references for various topics, I recommend the following:
Convex Analysis
Rockafellar [127], Phelps [120], Zeidler [166], Aubin and Ekeland [7], Ekeland and
Temam [58], Giles [72], Hiriart-Urruty and Lemaréchal [88, 89], Ioffe and Tihomirov
[94].
Monotone Operator Theory
Phelps [120, 121], Zeidler [167, 168]. (See also the proceedings [71, 29, 164].)
Functional Analysis
CHAPTER 1. INTRODUCTION 2
Holmes [91], Jameson [95], Wilansky [159, 161], Conway [43], Dunford and Schwartz
[55], Rudin [135].
Moore/Penrose inverse
Groetsch [78].
Banach Lattices
Schaefer [138], Meyer-Nieberg [108].
Fixed Point Theory
Goebel and Kirk [73].
Finally, I would like to point out that there is a
Glossarex
at the end of the thesis. The Glossarex, a hybrid of a glossary and an index, is meant to
make reading and using this thesis easier. Please have a look at it now and note that terms
not defined in the main body of the thesis are often explained in the Glossarex. Thanks again
for your interest - and now let the thesis begin!
1.2 Welcome to Part I
A very common problem in diverse areas of mathematics and physical sciences consists of
trying to find a "solution" satisfying certain "constraints". This problem is referred to as the
Convex Feasibility Problem. Mathematically, it is described as follows: Suppose C_1, . . . , C_N
are finitely many closed convex nonempty subsets of a Hilbert space with C := ∩_i C_i ≠ ∅.
Convex Feasibility Problem: Find x ∈ C.
Here, the sets C_i correspond to the constraints and C to the set of solutions. (For a
description of incarnations of the Convex Feasibility Problem, see Section 7.2.)
Typically, it is not possible to find a solution in C directly, but at least the constraint sets C_i
are "simple" in the sense that the associated projections (Chapter 3) are easy to compute.
Consequently, one tries to solve the convex feasibility problem algorithmically. The idea is
to involve the projections to generate a sequence of points that is supposed to converge to
a solution.
We will analyze in detail the following (by Section 7.3) well-motivated projection algorithm:
Given a current iterate x_n, compute the next iterate x_{n+1} by

(*)   x_{n+1} := x_n + α_n μ_n ( Σ_i w_{i,n} P_{i,n} x_n − x_n ).

Here, each P_{i,n} is the projection onto some superset C_{i,n} of C_i, α_n is some relaxation
parameter in [0, 2], μ_n is some extrapolation parameter greater than or equal to 1, and the
w_{i,n} are some nonnegative weights.
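To make the framework concrete, here is a small numerical sketch of one run of a projection algorithm of this weighted-average type in R². The helper names (`project_ball`, `project_halfspace`, `step`) and the toy instance are mine, not the thesis's; the extrapolation parameter is fixed at 1 and the projections are exact (no supersets), so this only illustrates the mechanics, not the general setting.

```python
import math

def project_ball(x, center, radius):
    """Projection onto a closed ball: pull the point back toward the center."""
    d = math.hypot(x[0] - center[0], x[1] - center[1])
    if d <= radius:
        return x
    t = radius / d
    return (center[0] + t * (x[0] - center[0]),
            center[1] + t * (x[1] - center[1]))

def project_halfspace(x, a, b):
    """Projection onto {y : <a, y> <= b}: move along the normal a if violated."""
    val = a[0] * x[0] + a[1] * x[1] - b
    if val <= 0:
        return x
    s = val / (a[0] ** 2 + a[1] ** 2)
    return (x[0] - s * a[0], x[1] - s * a[1])

def step(x, projections, weights, alpha=1.0, mu=1.0):
    """One iteration: x + alpha*mu*(weighted average of projections - x)."""
    avg = [0.0, 0.0]
    for w, P in zip(weights, projections):
        p = P(x)
        avg[0] += w * p[0]
        avg[1] += w * p[1]
    return (x[0] + alpha * mu * (avg[0] - x[0]),
            x[1] + alpha * mu * (avg[1] - x[1]))

# C1 = closed unit ball, C2 = {y : y1 >= 0.5}; their intersection is nonempty.
Ps = [lambda x: project_ball(x, (0.0, 0.0), 1.0),
      lambda x: project_halfspace(x, (-1.0, 0.0), -0.5)]
x = (3.0, 2.0)
for _ in range(200):
    x = step(x, Ps, [0.5, 0.5])
# x now lies (numerically) in the intersection of the two sets.
```

After 200 sweeps the iterate is numerically feasible for both constraints; the convergence theory behind this behaviour is the subject of Chapter 8.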
This framework is broad enough to cover a huge number of algorithms. (It is also closely
related to the recent studies by Bauschke and Borwein [14], by Combettes [41, 39], and
by Kiwiel and Łopuch [101]. These studies contain many pointers to important earlier
works and they are compared in the Notes of Chapter 8.) For instance, (*) covers not
only classical short-step methods (μ_n = 1) from the 1930's by von Neumann, by Kaczmarz,
and by Cimmino, but also more modern long-step methods (μ_n > 1) by Merzlyakov and by
Pierra. (See Chapter 9 for these examples and many more.)
The questions that immediately arise concern the behaviour of the sequence (x_n) generated
by the projection algorithm: when does (x_n) converge weakly to a solution? When in norm?
When even linearly?
The aim of Part I of this thesis is to analyze the projection algorithm in detail, to bring out
underlying recurring key concepts, and to improve, unify and review existing results.
This is achieved by a consistent modularization: the analysis relies upon independent
concepts that are useful and important in their own right:
projections and their properties (Chapter 3);
(bounded) (linear) regularity (Chapter 4 and Chapter 5);
Fejér monotone sequences (Chapter 6).
These key concepts work extremely well together and make it relatively easy to derive a
variety of convergence results.
Part I is organized as follows.
In Chapter 3, we study projections and their properties. A large collection of computationally
tractable examples is presented.
The four notions of (bounded) (linear) regularity are introduced in Chapter 4 for two closed
convex intersecting sets. These regularities are quantitative versions of the following, very
intuitive, geometric condition: "closeness to the two sets implies closeness to their intersection".
Bounded linear regularity holds under a certain constraint qualification. Particularization
of this constraint qualification produces a Slater-type interiority condition and a
condition on the angle (when the constraints are subspaces). These two conditions were
known to imply linear convergence of sequences generated by certain algorithms. We will
prove a new general linear convergence result under bounded linear regularity, giving us a
unifying and very elegant understanding of these earlier linear convergence results.
The generalization to finitely many sets is carried out in Chapter 5. Various verifiable
sufficient conditions for the regularities are provided.
In Chapter 6, we discuss Fejér monotone sequences. This concept is important to us,
because every sequence generated by a projection algorithm is Fejér monotone. As a different
application of Fejér monotonicity, we first introduce a new algorithm for finding a support
point of a closed convex set; we then present a general convergence proof based on Fejér
monotone sequences.
Framework (*) is motivated in detail in Chapter 7. For the case of two constraints, the
motivation naturally leads to an explicit algorithm. The proof of a prototypical convergence
result nicely illustrates how the key concepts work together.
The heart of Part I is Chapter 8, which contains the basic convergence results for the
framework (*).
The most important applications are presented in Chapter 9.
Subgradient algorithms, a particular type of projection algorithms, are the subject of Chap-
ter 10. The constraint sets involved are sublevel sets of convex functions; as such, their
projections are hard to compute and approximating supersets must be used instead.
Chapter 11 contains two self-contained proofs of the fundamental von Neumann/Halperin
result: the first proof is due to Halperin; the second one follows from a new result on a
relaxed version of Dykstra's algorithm. Part I of this thesis is concluded with a list of open
problems.
Finally, we assume throughout Part I that
X is a (real) Hilbert space.
1.3 Migrating from Part I to Part II
Although Parts I and II of this thesis are essentially independent, we point out how Monotone
Operator Theory offers an interesting and different perspective on projection algorithms.
(This section is based on Spingarn's excellent paper [148].)
Let us fix a Hilbert space X. Associated with a maximal monotone operator T on X and
ρ > 0 is
the resolvent of T, (I + ρT)⁻¹,
where I denotes the identity map.
Suppose C is a closed convex nonempty subset of X. Then the projection onto C is precisely
the resolvent of the normal cone map: P_C = (I + ρN_C)⁻¹.
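In one dimension this can be checked by hand. The sketch below (my own illustration; the function names are not the thesis's) verifies for C = [0, 1] that the resolvent of the normal cone map, computed from its defining inclusion, is exactly the projection onto C (the clamp), independently of ρ > 0.

```python
def project_interval(z, lo, hi):
    """P_C for C = [lo, hi] in the real line: clamp z into the interval."""
    return min(max(z, lo), hi)

def resolvent_normal_cone(z, lo, hi, rho):
    """(I + rho*N_C)^(-1)(z) for C = [lo, hi], checked against the definition:
    the output x must lie in C and (z - x)/rho must belong to N_C(x), which is
    {0} on (lo, hi), (-inf, 0] at lo, and [0, inf) at hi."""
    x = min(max(z, lo), hi)        # candidate: the clamp
    g = (z - x) / rho              # the normal-cone element certifying x
    if lo < x < hi:
        assert abs(g) < 1e-12
    elif x == lo:
        assert g <= 1e-12
    else:                          # x == hi
        assert g >= -1e-12
    return x

# The resolvent agrees with the projection for every input and every rho > 0.
for z in (-2.0, 0.3, 5.0):
    for rho in (0.1, 1.0, 10.0):
        assert resolvent_normal_cone(z, 0.0, 1.0, rho) == project_interval(z, 0.0, 1.0)
```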
A fundamental problem in Monotone Operator Theory consists of trying to find a zero of a
sum of finitely many maximal monotone operators T_1, . . . , T_N on X:
Zero Problem: Find x ∈ X such that 0 ∈ (T_1 + · · · + T_N)(x).
The Zero Problem is often tackled by employing Passty's generalization of the famous prox-
imal point algorithm:
Given a current iterate x_n and ρ_n > 0, compute the next iterate x_{n+1} by

(PPA)   x_{n+1} := (I + ρ_n T_N)⁻¹ · · · (I + ρ_n T_1)⁻¹ x_n.
Now suppose that C_1, . . . , C_N are closed convex nonempty subsets of X with projections
P_1, . . . , P_N respectively. If we let each T_i be the normal cone map N_{C_i}, then the Zero
Problem specializes to the
Convex Feasibility Problem: Find x ∈ C_1 ∩ · · · ∩ C_N.
This is the central problem of Part I; moreover, the iteration (PPA) corresponds to the
method of cyclic projections:
x_{n+1} := P_N · · · P_1 x_n.
It is not surprising that results on the convergence of the iteration (PPA) applied to the
Convex Feasibility Problem are weaker than the known results on (special purpose) projec-
tion algorithms; nonetheless, this connection between Projection Algorithms and Monotone
Operators is quite illuminating and stimulating.
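The correspondence can be observed numerically. Below is a toy instance of my own (two halfspaces in R², names mine) of the method of cyclic projections, together with a check of the Fejér monotonicity that Chapter 6 exploits: the distance from the iterates to a fixed solution never increases.

```python
import math

def project_halfspace(x, a, b):
    """Projection onto the halfspace {y : <a, y> <= b}."""
    val = a[0] * x[0] + a[1] * x[1] - b
    if val <= 0:
        return x
    s = val / (a[0] ** 2 + a[1] ** 2)
    return (x[0] - s * a[0], x[1] - s * a[1])

H1 = ((1.0, 1.0), 0.0)     # {y : y1 + y2 <= 0}
H2 = ((1.0, -1.0), 0.0)    # {y : y1 - y2 <= 0}
c = (-1.0, 0.0)            # a point in the intersection

x = (2.0, 0.5)
dists = []
for _ in range(50):
    x = project_halfspace(x, *H1)   # one full cycle: P2 after P1
    x = project_halfspace(x, *H2)
    dists.append(math.hypot(x[0] - c[0], x[1] - c[1]))

# Fejer monotonicity: distances to the solution c never increase.
assert all(d2 <= d1 + 1e-12 for d1, d2 in zip(dists, dists[1:]))
```

In this instance the iterate lands in the intersection after a single cycle and then stays put, which is exactly the fixed-point behaviour the resolvent viewpoint predicts.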
1.4 Welcome to Part II
A monotone operator is a set-valued map T from a Banach space X to its dual space X*
satisfying the relation
⟨Tx − Ty, x − y⟩ ≥ 0,   for all x, y ∈ X.
In the simplest case, when X is just the real line, this relation corresponds precisely to
increasing (possibly set-valued) functions, hence the name. Monotone operators appear in
quite diverse areas such as Operator Theory, Numerical Analysis, Differentiability Theory
of Convex Functions, and Partial Differential Equations, because the notion of a monotone
operator is broad enough to cover two fundamental mathematical objects: linear positive
semi-definite operators and subdifferentials of convex functions. Although the former object
gave rise to the field, it is the latter that has been receiving much of the recent attention.
The urge to extract and study the quite strong monotonicity properties of subdifferentials of
convex functions has led to the introduction of several new more powerful notions of mono-
tonicity. While these notions are automatic for maximal monotone operators on reflexive
Banach spaces, the situation in nonreflexive Banach spaces is far less well understood. (As a
consequence, almost all applications of Monotone Operator Theory - for instance, in
Partial Differential Equations - are restricted to the reflexive case, although the natural
setting of these applications is often nonreflexive; see [54, 69, 116].) Quite surprisingly, these notions
of monotonicity were largely untested even for the most natural candidates: continuous
linear positive semi-definite operators. Thus:
The aim of Part II of this thesis is to study the various notions of monotonicity for
continuous linear positive semi-definite operators.
Using elegant and potent tools from Convex Analysis, we show that these notions all coincide
with the monotonicity of the conjugate operator. Structure theorems from Banach Space
Theory then imply that monotonicity of the conjugate operator seems to be the rule -
with the notable exception of spaces that contain a complemented copy of the sequence
space ℓ₁. Part II is organized as follows.
In Chapter 12, the various notions of monotonicity by Gossez (dense type or type (D)), by
Phelps and Fitzpatrick (locally maximal monotone), and by Simons (range-dense type or
type (WD) and type (NI)) are introduced and some of their basic relationships are reviewed.
From Section 12.3 on, we focus on the case when the monotone operator is continuous and
linear; the key concept is the simple yet extremely useful decomposition of the operator into
a symmetric and a skew part.
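The decomposition is easy to illustrate in finite dimensions. The sketch below (a 2×2 example of my own; all names are mine) splits a nonsymmetric matrix into its symmetric and skew parts and checks that monotonicity, ⟨Ax − Ay, x − y⟩ ≥ 0, is decided by the symmetric part alone, since the skew part contributes nothing to the quadratic form.

```python
def apply(M, z):
    """Apply a 2x2 matrix (nested tuples) to a vector."""
    return (M[0][0] * z[0] + M[0][1] * z[1],
            M[1][0] * z[0] + M[1][1] * z[1])

def inner(u, v):
    return u[0] * v[0] + u[1] * v[1]

A = ((1.0, 3.0), (-1.0, 2.0))   # nonsymmetric; its symmetric part is PSD
sym = tuple(tuple((A[i][j] + A[j][i]) / 2 for j in range(2)) for i in range(2))
skew = tuple(tuple((A[i][j] - A[j][i]) / 2 for j in range(2)) for i in range(2))

pts = [(1.0, 0.0), (0.0, 1.0), (2.0, -3.0), (-1.5, 0.5)]
for x in pts:
    for y in pts:
        z = (x[0] - y[0], x[1] - y[1])
        # monotonicity: <Ax - Ay, x - y> = <Az, z> >= 0
        assert inner(apply(A, z), z) >= -1e-12
        # the skew part is invisible in the quadratic form
        assert abs(inner(apply(skew, z), z)) < 1e-12
```

In infinite dimensions the subtleties studied in Part II concern the conjugate operator, which this finite-dimensional picture cannot show; the sketch only captures the symmetric/skew splitting itself.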
The main result, whose proof depends crucially on Fenchel's Duality Theorem, is presented
in Chapter 13. It allows us to give a (partial) affirmative answer to a question posed by
Gossez more than two decades ago.
In Chapter 14, we derive and extend classical counter-examples by Gossez and by Fitzpatrick
and Phelps systematically from a result that can be viewed as an "instruction manual"
for constructing interesting continuous linear monotone operators whose skew parts have
nonmonotone conjugates. Such strange operators occur only in a few classical Banach
spaces like ℓ₁ and L¹[0, 1].
In Chapter 15, we investigate regularizations, i.e., (possibly) nonlinear set-valued pertur-
bations of continuous linear monotone operators by positive multiples of the duality map.
The results demonstrate the close relationship between local maximal monotonicity and
monotonicity of range-dense type even in this nonlinear context.
The final Chapter 16 employs once more Fenchel's Duality Theorem to study Simons's notion
of a strongly maximal monotone operator in the linear setting. A list of open problems
concludes this thesis.
Finally, we assume throughout Part I1 that
X is a (real) Banach space.
Chapter 2
The tool box
2.1 Overview
In this chapter, I pack my (mathematical) tool box with various results mostly from Convex
Analysis, Functional Analysis, Fixed Point Theory, and Monotone Operator Theory.
2.2 Convex Analysis
For notions not explicitly defined either in this section or in the Glossarex, the reader is
referred to [127, 58, 7].
2.2.1 Hilbert space geometry
Proposition 2.2.1 Suppose a_1, . . . , a_N are finitely many vectors in a Hilbert space and
w_1, . . . , w_N are nonnegative weights adding up to 1. Then

‖Σ_i w_i a_i‖² = Σ_i w_i ‖a_i‖² − Σ_{i<j} w_i w_j ‖a_i − a_j‖².

Proof. Expand ‖Σ_i w_i a_i‖² as ⟨Σ_i w_i a_i, Σ_j w_j a_j⟩, substitute ⟨a_i, a_j⟩ by
(−‖a_i − a_j‖² + ‖a_i‖² + ‖a_j‖²)/2, and use the fact that the weights add up to 1. ∎

Corollary 2.2.2 (Parallelogram Law) Suppose a, b are two vectors in a Hilbert space. Then

‖a + b‖² + ‖a − b‖² = 2(‖a‖² + ‖b‖²).

Proof. Let N := 2, a_1 := a, a_2 := b, w_1 := w_2 := 1/2 and apply Proposition 2.2.1. ∎
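The identity of Proposition 2.2.1, written out explicitly in the comment below in what I take to be its intended form, is easy to spot-check numerically (the instance and names here are my own):

```python
import random

# Spot-check of the convex-combination identity
#   || sum_i w_i a_i ||^2 = sum_i w_i ||a_i||^2
#                           - sum_{i<j} w_i w_j ||a_i - a_j||^2
# for nonnegative weights w_i summing to 1.

random.seed(0)
N, dim = 4, 3
a = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(N)]
raw = [random.uniform(0, 1) for _ in range(N)]
w = [r / sum(raw) for r in raw]            # nonnegative weights adding up to 1

def norm2(v):
    """Squared Euclidean norm."""
    return sum(t * t for t in v)

lhs = norm2([sum(w[i] * a[i][k] for i in range(N)) for k in range(dim)])
rhs = sum(w[i] * norm2(a[i]) for i in range(N)) - sum(
    w[i] * w[j] * norm2([a[i][k] - a[j][k] for k in range(dim)])
    for i in range(N) for j in range(i + 1, N))
assert abs(lhs - rhs) < 1e-12
```

With N = 2 and equal weights the same computation reduces to the Parallelogram Law, exactly as in the proof of Corollary 2.2.2.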
CHAPTER 2. THE TOOL BOX 9
Proposition 2.2.3 (Kadec/Klee property) Suppose (x_n) is a sequence in a Hilbert space
that converges weakly to x. If lim_n ‖x_n‖ = ‖x‖, then (x_n) converges in norm to x.
Proposition 2.2.4 (strict convexity) Suppose x, y are two vectors in a Hilbert space. If
||x + y|| = ||x|| + ||y||, then ||y||·x = ||x||·y.

Proof. We can assume that x and y are both nonzero. Let a1 := ||y||·x, a2 := ||x||·y, w1 :=
||x||/||x + y||, and w2 := ||y||/||x + y||. Then ||w1a1 + w2a2||² = ||x||²||y||² = w1||a1||² + w2||a2||²
by assumption, and now a1 − a2 = 0 by Proposition 2.2.1. ∎

2.2.2 Mosco convergence
Fact 2.2.5 (Tsukada) Suppose S is a closed convex nonempty subset of a Hilbert space
and (Sn) is a sequence of closed convex supersets of S. Then TFAE:

(i) P_{Sn} → P_S pointwise.

(ii) Sn → S in the sense of Mosco, i.e., the following two conditions are satisfied:

(1) If s ∈ S, then there exists a sequence (sn) with sn ∈ Sn, ∀n and sn → s.

(2) If (kn) is a subsequence of (n) and (s_{kn}) is a weakly convergent sequence with
s_{kn} ∈ S_{kn}, ∀n, then the weak limit lies in S.

(iii) If (kn) is a subsequence of (n) and (x_{kn}) is a weakly convergent sequence with
x_{kn} − P_{S_{kn}} x_{kn} → 0, then the weak limit belongs to S.

Moreover, if (i)-(iii) hold, then S = ∩n Sn.
Proof. "(i) ⇔ (ii)": [154, Hilbert space case of Theorem 3.2]. "(ii) ⇔ (iii)" and the "More-
over" part are straightforward. ∎

Fact 2.2.6 Suppose (Sn) is a sequence of closed convex subsets of a Hilbert space with
Sn ⊇ Sn+1, ∀n and S := ∩n Sn ≠ ∅. Then Sn → S in the sense of Mosco.

Proof. [112, Lemma 1.3]. ∎
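Facts 2.2.5 and 2.2.6 can be watched in one dimension: for the decreasing intervals Sn := [0, 1 + 1/n] with intersection S = [0, 1], the projections P_{Sn}x converge to P_Sx. A minimal sketch (plain Python; the interval data is an invented illustration):

```python
def project_interval(lo, hi, x):
    """Metric projection of a real x onto the closed interval [lo, hi]."""
    return min(max(x, lo), hi)

# Decreasing closed convex sets S_n = [0, 1 + 1/n]; their intersection is S = [0, 1].
# By Fact 2.2.6, S_n -> S in the sense of Mosco; by Fact 2.2.5, P_{S_n} -> P_S pointwise.
x = 2.5
projections = [project_interval(0.0, 1.0 + 1.0 / n, x) for n in range(1, 101)]
limit = project_interval(0.0, 1.0, x)
```

Here P_{Sn}(2.5) = 1 + 1/n decreases to P_S(2.5) = 1, matching the pointwise convergence in Fact 2.2.5(i).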
2.2.3 Subgradients
Fact 2.2.7 Suppose f is a convex finite function on a Banach space X and x0 ∈ X. Then:

(i) f is continuous at x0 and ∂f(x0) is a singleton if and only if f is lower semi-continuous
and Gâteaux-differentiable at x0. In this case, the unique subgradient of f at x0 is
precisely the Gâteaux-derivative of f at x0.

(ii) If f is continuous at x0, then f is subdifferentiable at x0.

(iii) If X is finite-dimensional, then f is continuous and subdifferentiable everywhere.

Proof. [58, Corollary 1.2.5, Proposition 1.5.3, Proposition 1.5.2, and Corollary 1.2.3]. ∎

Proposition 2.2.8 Suppose f is a convex finite function on a Banach space X. Then
TFAE:
(i) f is bounded on bounded sets.
(ii) f is Lipschitz continuous on bounded sets.
(iii) dom ∂f = X and ∂f carries bounded sets to bounded sets.
Proof. (See also [14, Proposition 7.8].) "(i)⇒(ii)": can be found in [125, Proof of The-
orem 41.B]. "(ii)⇒(iii)": By Fact 2.2.7.(ii), f is subdifferentiable everywhere. It suffices
to show that ∂f carries int rB_X to a bounded set, for every positive real r. So fix
r > 0 and obtain a Lipschitz constant for f on int rB_X, say L. Now fix x ∈ int rB_X
and obtain s > 0 such that B(x, s) ⊆ int rB_X. Pick any x* ∈ ∂f(x) and b ∈ B_X. Then
⟨x*, sb⟩ ≤ f(x + sb) − f(x) ≤ Ls||b||, thus ||x*|| ≤ L and therefore the subgradients of f are
bounded on int rB_X. "(iii)⇒(i)": It is enough to show that f is bounded on rB_X, for every
positive r. By assumption, there exists some M > 0 such that the norm of an arbitrary
subgradient of f at an arbitrary point in rB_X is at most M. Fix x ∈ rB_X. On the one
hand, pick x* ∈ ∂f(x). Then ⟨x*, 0 − x⟩ ≤ f(0) − f(x), thus f(x) ≤ f(0) + Mr. Hence f is
bounded above on rB_X by f(0) + Mr. On the other hand, fixing x0* ∈ ∂f(0) shows similarly
that f is bounded below on rB_X by f(0) − Mr. Altogether, f is bounded on rB_X and the
proof is complete. ∎

Corollary 2.2.9 Suppose X is a finite-dimensional Banach space. Then every convex finite
function is subdifferentiable everywhere and its subdifferential map carries bounded sets to
bounded sets.
Proof. By Fact 2.2.7.(iii), every convex function on X is continuous and subdifferentiable
everywhere. Apply Proposition 2.2.8. ∎
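In finite dimensions the equivalences of Proposition 2.2.8 can be observed concretely. The sketch below (plain Python; the function f(x) = x² and the sampling grid are illustrative choices, not from the thesis) compares a sampled Lipschitz constant of f on [−r, r] with a bound on its (here single-valued) subgradients there.

```python
def f(x):
    return x * x

def subgradient(x):
    """f(x) = x^2 is differentiable, so its subdifferential is the singleton {2x}."""
    return 2.0 * x

r = 5.0
step = 2 * r / 1000
grid = [-r + k * step for k in range(1001)]
# Sampled Lipschitz constant of f on [-r, r] and a bound on its subgradients there;
# per Proposition 2.2.8 (ii) <=> (iii), both are finite, and indeed both are about 2r.
lip = max(abs(f(a) - f(b)) / step for a, b in zip(grid, grid[1:]))
grad_bound = max(abs(subgradient(x)) for x in grid)
```

The sampled Lipschitz constant stays below the subgradient bound 2r = 10, as the proof of "(ii)⇒(iii)" predicts.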
One might ask whether Proposition 2.2.8 holds true for a larger class of Banach spaces.
The answer to this question is negative:
Fact 2.2.10 Suppose X is a Banach space. Then TFAE:
(i) Every continuous convex function on X is bounded on bounded sets.
(ii) The subdifferential map of every continuous convex function on X carries bounded
sets to bounded sets.
(iii) X is finite-dimensional.
Proof. [21, Theorem 2.2]. ∎
Fact 2.2.11 (sum rule) Suppose f1, f2 are convex lower semi-continuous proper functions
on a Banach space X. If 0 ∈ sri (dom f1 − dom f2), then ∂(f1 + f2) = ∂f1 + ∂f2 on X.

Proof. [5, Corollary 2.1]. ∎
2.2.4 Fenchel conjugates
The following two propositions follow straight from the definitions.
Proposition 2.2.12 Suppose f is a convex lower semi-continuous proper function on a
Banach space X, z* ∈ X*, and r ∈ R. Let h(x) := f(x) + ⟨z*, x⟩ + r, ∀x ∈ X. Then
h*(x*) = −r + f*(x* − z*), ∀x* ∈ X*.

Proposition 2.2.13 Suppose f is a convex lower semi-continuous proper function on a
Banach space X and z ∈ X. Let h(x) := f(x + z), ∀x ∈ X. Then h*(x*) = f*(x*) − ⟨x*, z⟩,
∀x* ∈ X*.

Example 2.2.14 Suppose X is a Banach space, z* ∈ X*, and r ∈ R. Let h(x) := ⟨z*, x⟩ + r,
∀x ∈ X. Then h*(x*) = −r + ι_{{z*}}(x*), ∀x* ∈ X*.

Example 2.2.15 Suppose X is a Banach space, z0, z1 ∈ X, and ε > 0. Let h := ι_C, where
C := [z0, z1] + εB_X. Then h*(x*) = ε||x*|| + max{⟨x*, z0⟩, ⟨x*, z1⟩}, ∀x* ∈ X*.
Example 2.2.16 Suppose X is a Banach space and h(x) := ½||x||², ∀x ∈ X. Then
h*(x*) = ½||x*||², ∀x* ∈ X*.

Proof. [7, Proposition 4.4.8]. ∎
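Example 2.2.16 can be checked numerically by discretizing the supremum in the definition of the conjugate. A rough one-dimensional sketch (plain Python; the grid bounds are arbitrary illustration choices):

```python
def conjugate(y, grid, f):
    """Discretized Fenchel conjugate: f*(y) ~ max_x [x*y - f(x)] over a finite grid."""
    return max(x * y - f(x) for x in grid)

h = lambda x: 0.5 * x * x
grid = [-10.0 + k * 0.01 for k in range(2001)]
# For h = ||.||^2 / 2 the conjugate is h*(y) = ||y||^2 / 2 (Example 2.2.16).
approx = conjugate(3.0, grid, h)
exact = 0.5 * 3.0 ** 2
```

The supremum of 3x − x²/2 is attained near x = 3, so the grid maximum is close to the exact value 4.5.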
Example 2.2.17 Suppose C is a closed convex nonempty subset of a Banach space X
(viewed in X**). Then ι_C** = ι_{cl weak* C}.

Proof. By assumption, ι_C is closed convex lower semi-continuous proper, hence ι_C**|_X = ι_C
(see, for instance, [7, Theorem 4.2.2]). Also, since 0 ≤ ι_C, we have 0 ≤ ι_C**. Fix an arbitrary
x** ∈ X**. Case 1: x** ∈ cl weak* C. Then there exists a net (xα) in C with xα → x** in the
weak* topology. Thus 0 ≤ ι_C**(x**) ≤ lim inf_α ι_C**(xα) = lim inf_α ι_C(xα) = 0. Case 2:
x** ∉ cl weak* C. Since cl weak* C is convex, we can separate x** from it (viewed in X** with
the weak* topology): there exists x* ∈ X* with ⟨x**, x*⟩ > ι_C*(x*). Hence
ι_C**(x**) ≥ sup_{n∈N} ⟨x**, nx*⟩ − ι_C*(nx*) = +∞. ∎

Example 2.2.18 Suppose X is a Banach space, C* is a weak* compact convex nonempty
subset of X*, and let g := ι_{C*}*|_X. Then g is convex lower semi-continuous finite on X with
g* = ι_{C*}.

Proof. The function g is convex and finite (since C* is bounded). Also, g is weakly lower
semi-continuous on X (as ι_{C*}* is weak* lower semi-continuous on X**). It remains to show
that g*(x*) = ι_{C*}(x*), ∀x* ∈ X*. Case 1: x* ∈ C*. Then 0 = ⟨x*, 0⟩ − g(0) ≤ g*(x*) =
sup_{x∈X} ⟨x*, x⟩ − ι_{C*}*(x) ≤ sup_{x**∈X**} ⟨x*, x**⟩ − ι_{C*}*(x**) = ι_{C*}**(x*) = 0 = ι_{C*}(x*). Case 2:
x* ∉ C*. Separate by some y ∈ X: ⟨y, x*⟩ > ι_{C*}*(y). Then g*(x*) ≥ sup_{n∈N} ⟨x*, ny⟩ −
g(ny) = sup_{n∈N} n(⟨y, x*⟩ − ι_{C*}*(y)) = +∞ and hence g*(x*) = ι_{C*}(x*). ∎
The reader should be warned that in various textbooks, statements like "if C is a closed
convex nonempty subset of X, then ι_C** = ι_C" appear. This does not contradict Exam-
ple 2.2.17; the ambiguity stems from the fact that most authors require that the domain
of biconjugates be just the original space X rather than X**. However, our analysis relies
on information on the behaviour of biconjugates in the bidual space. Two more results on
weak* closures can be found in Subsection 2.3.1.
2.2.5 The duality map

Suppose X is a Banach space. Recall that the duality map is the subdifferential of the
function ½||·||² on X; it is denoted by J. Hence

Jx = {x* ∈ X* : ⟨x*, x⟩ = ||x||² = ||x*||²}, ∀x ∈ X.
In Hilbert space, the duality map is just the identity; the duality map is a very successful
attempt to mimic this outside Hilbert spaces. It is well-known that J is homogeneous and
that Jx is bounded, convex, weak* closed, and nonempty, ∀x ∈ X; see, for instance, [120].
Example 2.2.19 Suppose Γ is a nonempty set and x = (x_γ) ∈ c0(Γ). Then Jx =
conv {x_γ e_γ : |x_γ| = ||x||}.

Example 2.2.20 Suppose Γ is a nonempty set and x = (x_γ) ∈ ℓ1(Γ). Then (Jx)_γ =
||x|| Sign x_γ, ∀γ ∈ Γ.
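Example 2.2.20 can be verified on finitely supported vectors by checking the two defining equations of Jx (pairing equals ||x||², dual norm equals ||x||). A sketch in plain Python, for an x with no zero entries (at entries where x_γ = 0 the duality map is genuinely set-valued, which this sketch ignores):

```python
def sign(t):
    return (t > 0) - (t < 0)

def duality_map_l1(x):
    """(Jx)_gamma = ||x||_1 * sign(x_gamma): an element of the dual space l_infinity."""
    norm1 = sum(abs(t) for t in x)
    return [norm1 * sign(t) for t in x]

x = [3.0, -1.0, 0.5]
jx = duality_map_l1(x)
norm1 = sum(abs(t) for t in x)                 # ||x||_1
pairing = sum(a * b for a, b in zip(jx, x))    # <Jx, x>
sup_norm = max(abs(t) for t in jx)             # ||Jx||_infinity
```

Both defining relations of the duality map hold: the pairing equals ||x||₁² and the sup-norm of Jx equals ||x||₁.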
Example 2.2.21 Suppose (A, 𝒜, μ) is a σ-finite complete measure space and f is a function
in L1(A, 𝒜, μ). Then (Jf)(a) = ||f|| Sign f(a), ∀a ∈ A.

Example 2.2.22 Suppose Ω is a compact Hausdorff space. Recall that the dual of C(Ω)
is M(Ω), the space of finite signed Baire measures on Ω (for details, see [87, 134]). The
total variation (resp. support) of an arbitrary measure μ ∈ M(Ω) is denoted |μ|(Ω) (resp.
supp μ). Let M+(Ω) be the set of all positive measures on Ω. Fix an arbitrary f ∈ C(Ω)
and abbreviate Ω_f := {ω ∈ Ω : |f(ω)| = ||f||}. Then:
The above examples are part of the folklore. Unfortunately, there seems to be no rich or de-
tailed collection of examples in the literature. The reader is referred to [46, Subsection 12.2]
and [91, Section 20.F] for results that are helpful in verifying the above examples; see also
Phelps's discussion on the closely related concept of subreflexivity in [117, 118, 119].
2.2.6 Optimization
Fact 2.2.23 (Fenchel Duality) Suppose A is a continuous linear operator from a Banach
space X to a Banach space Y, f is a convex lower semi-continuous function on X and g is
a convex lower semi-continuous function on Y. Consider the convex programs

(P) p := inf_{x∈X} [f(x) + g(Ax)]

and

(D) d := − inf_{y*∈Y*} [f*(−A*y*) + g*(y*)].

Then p ≥ d.
(i) Suppose x ∈ X and y* ∈ Y*. Then x solves (P), y* solves (D), and p = d if and only
if the Karush/Kuhn&Tucker conditions Ax ∈ ∂g*(y*) and −A*y* ∈ ∂f(x) hold.

(ii) If A(dom f) ∩ int dom g ≠ ∅ and p is finite, then p = d and d is attained. If y* is an
arbitrary solution of (D), then the solutions of (P) form the (possibly empty) set
A⁻¹∂g*(y*) ∩ ∂f*(−A*y*).
Proof. It is easy to check (i). For (ii) see [7, Theorem 4.6.1]. ∎

Corollary 2.2.24 Suppose A is a continuous linear operator from a Banach space X to a
Banach space Y, f is a convex lower semi-continuous function on X, g is a convex lower
semi-continuous function on Y, and y0 ∈ Y. Define

p := inf_{x∈X} {f(x) + g(Ax + y0)} and d := − inf_{y*∈Y*} {f*(−A*y*) + g*(y*) − ⟨y*, y0⟩}.

Then p ≥ d. If A(dom f) ∩ [−y0 + int dom g] ≠ ∅ and p is finite, then p = d and d is attained.

Proof. Follows readily from Fact 2.2.23.(ii) and Proposition 2.2.13. ∎

Fact 2.2.25 (Karush/Kuhn&Tucker) Suppose X is a Banach space and f, g1, . . . , gm are
finitely many convex continuous functions on X. Consider the convex program
(P) inf_{x∈X} f(x) subject to gj(x) ≤ 0, ∀j.

Suppose further that x̄ ∈ X is feasible, i.e., gj(x̄) ≤ 0, ∀j, and that Slater's condition
holds. Then x̄ solves (P) if and only if the Karush/Kuhn&Tucker conditions hold: there
exist Lagrange multipliers μ1, . . . , μm ∈ R such that

0 ∈ ∂f(x̄) + ∑j μj ∂gj(x̄), μj ≥ 0, and μj gj(x̄) = 0, ∀j.

Proof. [166, Theorem 47.E.(2)]. ∎

2.2.7 Linear inequalities
Fact 2.2.26 (Hoffman, 1952) Suppose C1, . . . , CN are finitely many halfspaces in a Eu-
clidean space X. Then there exists K > 0 such that d(x, ∩i Ci) ≤ K · maxi d(x, Ci), ∀x ∈ X.

Proof. [90, Theorem]. ∎
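Hoffman's error bound is easy to observe for concrete halfspaces. The sketch below (plain Python; the two halfspaces are an invented illustration) uses C1 = {x : x1 ≤ 0} and C2 = {x : x2 ≤ 0} in R², for which K = √2 works, and samples random points.

```python
import math
import random

def d_halfspace(x, i):
    """Distance from x to the halfspace C_i = {x : x[i] <= 0}."""
    return max(0.0, x[i])

def d_intersection(x):
    """Distance to C_1 ∩ C_2 (the third quadrant): clip both coordinates at 0."""
    return math.hypot(max(0.0, x[0]), max(0.0, x[1]))

random.seed(1)
K = math.sqrt(2.0)
worst = 0.0
for _ in range(1000):
    x = (random.uniform(-2, 2), random.uniform(-2, 2))
    m = max(d_halfspace(x, 0), d_halfspace(x, 1))
    if m > 0:
        worst = max(worst, d_intersection(x) / m)
```

The worst observed ratio d(x, ∩i Ci)/maxi d(x, Ci) stays between 1 and √2, consistent with Fact 2.2.26.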
2.2.8 Convex relations
Fact 2.2.27 (Open Mapping Theorem) Suppose X, Y are Banach spaces and R is a closed
convex relation from X to Y. Then R is open throughout core ran R.

Proof. [19, Theorem 8.(c)]. ∎
2.3 Functional Analysis
Various results from Functional Analysis/Operator Theory/Banach Space Theory are pro-
vided in this section.
2.3.1 Weak* closures
The proof of the following result on weak* closures is due to Warren Moors [109] and much
cleaner than the one we originally proposed in [17].
Proposition 2.3.1 Suppose C is a closed convex subset of a Banach space X (viewed in
X**). If x** ∈ cl weak*(C) and r > ||x**||, then x** ∈ cl weak*(rB_X ∩ C).

Proof. Suppose not. Then x** ∈ (rB_{X**} ∩ cl weak*(C)) \ cl weak*(rB_X ∩ C). Note that
cl weak*(rB_X ∩ C) ⊆ rB_{X**} ∩ cl weak*(C). Now separate by x* ∈ X* and obtain α ∈ R such
that

(*) sup ⟨rB_X ∩ C, x*⟩ < α < ⟨x**, x*⟩.

Let U_α := {z** ∈ X** : ⟨z**, x*⟩ ≥ α} and L_α := {z** ∈ X** : ⟨z**, x*⟩ ≤ α}. Then
C = (C ∩ L_α) ∪ (C ∩ U_α) and hence cl weak*(C) = cl weak*(C ∩ L_α) ∪ cl weak*(C ∩ U_α). Thus

(**) x** ∈ cl weak*(C ∩ U_α)

(otherwise, x** ∈ cl weak*(C ∩ L_α) ⊆ cl weak*(L_α) = L_α, which is absurd). Now (*) implies
that ∅ = U_α ∩ (C ∩ rB_X) = (U_α ∩ C) ∩ rB_X. Hence we can separate by some y* ∈ X* \ {0}:

sup ⟨rB_X, y*⟩ ≤ inf ⟨C ∩ U_α, y*⟩.

But then (**) implies ⟨x**, y*⟩ ≥ r||y*|| > ||x**|| ||y*||, an impossibility. ∎
Proposition 2.3.2 Suppose C is a closed convex nonempty subset of a Banach space X
(viewed in X**) and x** ∈ cl weak* C. Then there exists a net (xα) in C with xα → x** in the
weak* topology and ||xα|| → ||x**||.

Proof. (See also Gossez's [74, Corollaire 3.2].) Fix an arbitrary r > ||x**||. By Propo-
sition 2.3.1, there exists a net (xα) in C with ||xα|| ≤ r and xα → x** weak*. The weak* lower
semicontinuity of the norm in X** yields lim inf_α ||xα|| ≥ ||x**||. Now let 𝒯 be the weak topol-
ogy on cl weak* C (in the sense of General Topology) which makes the following functions
continuous (see [160, Section 6.3] or [159, Section 9.3]): ⟨x*, ·⟩, ∀x* ∈ X*, and ||·||. Since
r > ||x**|| was chosen arbitrarily, every 𝒯-neighborhood V of x** contains a point in C, say
x_V. The net (x_V) does the job. ∎

2.3.2 Bits from Operator Theory
Fact 2.3.3 (Schauder) Suppose T is a continuous linear operator from a Banach space X
to a Banach space Y. Then T is compact if and only if T* is.

Proof. See, for instance, [43, Theorem VI.3.4]. ∎

Fact 2.3.4 (Gantmacher) Suppose T is a continuous linear operator from a Banach space
X to a Banach space Y. Then TFAE: (i) T is weakly compact; (ii) cl T(B_X) is weakly
compact; (iii) T* is weakly compact.

Proof. See, for instance, [161, Theorem 11-4-2 and Theorem 11-4-4]. ∎
Fact 2.3.5 (Wilansky) Suppose T is a continuous linear operator from X to Y. Consider
the following conditions: (i) T is tauberian; (ii) ker T** = ker T; (iii) ker T is reflexive. Then:
(i)⇒(ii)⇒(iii). Moreover, if ran T is closed, then (i), (ii), and (iii) are equivalent.

Proof. [161, Theorem 11-4-5]. ∎
The next fact is a useful application of the (classical) Open Mapping Theorem:

Fact 2.3.6 Suppose X, Y are Banach spaces, T is a continuous linear operator from X to
Y, and S is a subset of X. Then T(S) is closed if and only if S + ker T is.

Proof. [91, Lemma 17.H]. ∎
Proposition 2.3.7 Suppose Q is a continuous linear surjection from a Banach space X to
a Banach space Y. Then:

(i) Q* is a tauberian injection from Y* to X*.

(ii) Q** is a surjection from X** to Y**.

(iii) Suppose T is a continuous linear operator from Y to Y*. Then T is weakly compact
if and only if Q*TQ is.

Proof. (i): Q* is an injection ([161, Theorem 11-3-4.(b)]), hence tauberian (Fact 2.3.5).
(ii): Q** is onto by (i) and [161, Theorem 11-3-4.(a)]. (iii): "⇒": fix an arbitrary x** ∈ X**.
Then: Q**x** ∈ Y** ⇒ T**Q**x** ∈ Y* ⇒ Q***T**Q**x** ∈ X* (since Q***|_{Y*} = Q*).
Hence ran (Q*TQ)** ⊆ X* and so Q*TQ is weakly compact. "⇐": We prove the contra-
positive, so suppose T is not weakly compact. Then there is some y** ∈ Y** such that
T**y** ∈ Y*** \ Y*. Since Q** is onto (by (ii)), we obtain x** ∈ X** with Q**x** = y**. How-
ever, Q* is tauberian (by (i)), hence Q***T**y** ∈ X*** \ X*. Altogether, (Q*TQ)**x** ∈
X*** \ X*; so Q*TQ is not weakly compact. ∎

Definition 2.3.8 (Moore/Penrose inverse) Suppose X, Y are Hilbert spaces and A is a
continuous linear operator from X to Y with closed range. The Moore/Penrose inverse of
A is the unique continuous linear operator A† from Y to X satisfying

A A† = P_{ran A} and A† A = P_{ran A†}.

Definition 2.3.8 is due to Moore; see [78] for this and equivalent conditions. If A is invertible,
then A† = A⁻¹. Note that ran A† is indeed closed:

Fact 2.3.9 Suppose X, Y are Hilbert spaces and A is a continuous linear operator from
X to Y with closed range. Then ran A† = ran A* is closed and P_{ran A*} = A†A. The
Moore/Penrose inverse A† can be calculated via the Tihonov regularization:

A† = lim_{t→0+} (A*A + tI)⁻¹A*.

Proof. [78, Sections II.1 and II.2]. Recall that ran A is closed if and only if ran A* is ([43,
Theorem VI.1.10]). ∎
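The Tihonov limit in Fact 2.3.9 can be watched numerically. For the toy operator A : R → R², A(s) = (s, s), we have A*A = 2 and A†(y) = (y1 + y2)/2; the sketch below (plain Python; the operator is an invented illustration) shows (A*A + tI)⁻¹A* approaching A† as t ↓ 0.

```python
def tikhonov(y, t):
    """(A*A + tI)^{-1} A* y for A(s) = (s, s): here A* y = y[0] + y[1] and A*A = 2."""
    return (y[0] + y[1]) / (2.0 + t)

y = (1.0, 3.0)
exact = (y[0] + y[1]) / 2.0            # A-dagger applied to y
approx = [tikhonov(y, 10.0 ** (-k)) for k in range(1, 9)]
errors = [abs(a - exact) for a in approx]
```

The errors decrease monotonically with t, illustrating the convergence of the regularized inverses to A†.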
Fact 2.3.10 Suppose C1, . . . , CN are finitely many closed subspaces of a Hilbert space. If
C := ∩i Ci, then P_{C_N} · · · P_{C_1} − P_C = P_{C_N ∩ C⊥} · · · P_{C_1 ∩ C⊥}.

Proof. [98, Theorem 1]. ∎
2.3.3 Machinery from Banach Space Theory
Some quite deep facts follow:
Fact 2.3.11 (Bessaga and Pelczynski) Suppose X is a Banach space. Then TFAE: (i) X
contains a complemented copy of ℓ1; (ii) X* contains a copy of c0; (iii) ℓ1 is a quotient of
X.

Proof. See, for instance, [52, Theorem V.10 on page 48] and [95, 29.17]. ∎

Fact 2.3.12

(i) (Pelczynski) Every infinite-dimensional closed subspace of ℓ1 contains a complemented
copy of ℓ1.

(ii) (Kadec and Pelczynski) Every nonreflexive closed subspace of L1[0, 1] contains a com-
plemented copy of ℓ1.
Proof. See, for instance, [52, Theorem VII.6 on page 74 and Theorem on page 94]. ∎

Fact 2.3.13 (Rosenthal) Suppose X is a Banach space. Then X does not contain a copy
of ℓ1 if and only if it is weakly sequentially Cauchy.

Proof. See [132] and [52, Chapter XI]. ∎
Proposition 2.3.14 Suppose X is a Banach space. Then X is Schur if and only if ev-
ery weakly Cauchy sequence is norm convergent. Consequently, Schur spaces are weakly
sequentially complete.

Proof. (Consult the Glossarex for definitions if necessary.) The "if" part is trivial. "Only
if": fix a weakly Cauchy sequence in X, which is assumed to be Schur. Were the sequence not
norm convergent, then it would possess a subsequence, say (xn), with inf_{n∈N} ||xn − xn+1|| > 0.
Now xn − xn+1 would converge weakly to 0, hence also in norm: a contradiction! The
"Consequently" part is clear. ∎
2.3.4 Rugged Banach spaces
Suppose X is a Banach space. We say that X is rugged, if

cl span ran (J − J) = X*,

where J denotes the duality map.

Rugged spaces have to have points where the norm is not Gâteaux differentiable; they
are never smooth. In particular, none of the following spaces is rugged: Hilbert spaces,
uniformly convex spaces, Lp[0, 1] and ℓp, for 1 < p < +∞.
The notion of a rugged space will turn out to be useful when we study regularizations in
Section 15.3.
Example 2.3.15 Suppose Γ is a nonempty set. Then c0(Γ) is rugged if and only if Γ
contains at least two elements.

Proof. If Γ is a singleton, then c0(Γ) is essentially R, which is not rugged. Now suppose
Γ contains at least two elements. Since c0*(Γ) = ℓ1(Γ) and cl span {e_γ : γ ∈ Γ} = ℓ1(Γ),
it suffices to show that e_γ ∈ span ran (J − J). So fix γ ∈ Γ and any other δ ∈ Γ, where
γ ≠ δ. Let x := e_γ + e_δ and y := e_γ − e_δ. Then, by Example 2.2.19, Jx = conv {e_γ, e_δ}
and Jy = conv {e_γ, −e_δ}. Thus e_γ − e_δ ∈ Jx − Jx and e_γ − (−e_δ) ∈ Jy − Jy. Therefore,
e_γ ∈ ½[(Jx − Jx) + (Jy − Jy)] and we are done. ∎
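The algebra at the end of the proof of Example 2.3.15 can be replayed in the two-dimensional model (R² with the sup-norm, whose dual carries the 1-norm): with x = e1 + e2 and y = e1 − e2, picking e1 − e2 ∈ Jx − Jx and e1 + e2 ∈ Jy − Jy and averaging recovers e1. A sketch in plain Python:

```python
e1, e2 = (1.0, 0.0), (0.0, 1.0)

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def avg(u, v):
    return tuple((a + b) / 2.0 for a, b in zip(u, v))

# Jx = conv{e1, e2} for x = e1 + e2 and Jy = conv{e1, -e2} for y = e1 - e2
# (Example 2.2.19); the differences below lie in Jx - Jx and Jy - Jy respectively.
d1 = sub(e1, e2)                          # e1 - e2
d2 = sub(e1, tuple(-t for t in e2))       # e1 + e2
recovered = avg(d1, d2)
```

Averaging the two differences returns e1 exactly, which is the containment e_γ ∈ ½[(Jx − Jx) + (Jy − Jy)] from the proof.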
Example 2.3.16 Suppose Γ is a nonempty set. Then ℓ1(Γ) is rugged if and only if Γ
contains at least two elements.

Proof. Again we can assume that Γ contains at least two different elements. We show that
ℓ1(Γ) is rugged for Γ = N since it is notationally much more convenient (the other cases are
proved analogously). By Example 2.2.20,
Example 2.3.17 Suppose (A, 𝒜, μ) is a σ-finite complete measure space. Then L1(A, 𝒜, μ)
is rugged if and only if 𝒜 contains at least two disjoint sets of finite strictly positive measure.

Proof. Once more, we can assume that there are A1, A2 ∈ 𝒜 with A1 ∩ A2 = ∅ and
0 < μ(A1), μ(A2) < +∞. Let f_i := χ_{A_i}/μ(A_i) for i = 1, 2 (χ denotes the characteristic
function), and proceed as in the proof of Example 2.3.16. ∎
Example 2.3.18 Suppose Ω is a compact Hausdorff space. Then C(Ω) is rugged if and
only if Ω contains at least two elements.

Proof. We can assume the existence of two distinct points ω1, ω2 in Ω. Pick two disjoint
open neighborhoods U1 of ω1 and U2 of ω2. Define E1 := cl U1, E2 := Ω \ U1, and E3 :=
E1 ∩ E2. Then E1, E2, E3 are closed with ω2 ∉ E1, ω1 ∉ E2, ω2 ∈ E2, and E1 ∪ E2 ∪ E3 = Ω.
Let ℬ be the class of all Baire sets of Ω. If E is a closed subset of Ω and μ ∈ M(Ω),
then define μ_E(B) := μ(B ∩ E), ∀B ∈ ℬ. Now fix an arbitrary finite signed Baire measure
μ ∈ M(Ω). Then we can decompose μ into

μ = μ_{E1} + μ_{E2} − μ_{E3}.

We want to show that C(Ω) is rugged; by the above decomposition, it is enough to show that
each μ_{E_i} lies in span ran (J − J). Moreover, in view of the Jordan decomposition for signed
measures and the homogeneity of span ran (J − J), we can assume WLOG that μ is a probability
measure (i.e., a positive measure with ||μ|| = |μ|(Ω) = μ(Ω) = 1). Altogether, it suffices to
establish the following

Key step: If E is a closed subset of Ω, μ is a probability measure on Ω with supp μ ⊆
E, and ω0 ∈ Ω \ E, then μ ∈ span ran (J − J).

Recall that every compact Hausdorff space is normal ([95, Corollary 9.15]). By Tietze's
Extension Theorem ([95, Theorem 12.4]), there exists f ∈ C(Ω) with ||f|| = 1, f|_E = +1, and
f(ω0) = −1. Let δ0 ∈ M(Ω) be the Dirac measure at ω0: δ0(B) = +1, if ω0 ∈ B; δ0(B) = 0,
otherwise, ∀B ∈ ℬ. A direct check or Example 2.2.22 shows that

μ ∈ Jf and −δ0 ∈ Jf.

If 1 denotes the function in C(Ω) that is identically equal to 1, then similarly

μ ∈ J1 and δ0 ∈ J1.

Consequently, μ + δ0 ∈ (J − J)(f) and μ − δ0 ∈ (J − J)(1), which yields the desired

μ ∈ ½[(J − J)(f) + (J − J)(1)] ⊆ span ran (J − J). ∎
Corollary 2.3.19 Every AM-space with unit that is at least two-dimensional is rugged.

Proof. By a well-known result of Kakutani (see [138, Corollary II.7.1] or [108, Theo-
rem 2.1.3]), every AM-space with unit is isometrically isomorphic to some C(Ω), where Ω
is a compact Hausdorff space. ∎
Example 2.3.20 C[0, 1] is rugged.

Example 2.3.21 L∞[0, 1] is rugged.

Proof. It is an infinite-dimensional AM-space with unit; see [138, page 103]. ∎

Example 2.3.22 Suppose Γ is a nonempty set. Then c(Γ) (resp. ℓ∞(Γ)) is rugged if and
only if Γ contains at least two elements.

Proof. These spaces are AM-spaces with unit and they have dimensions greater than or
equal to 2. Alternatively, equip Γ with the discrete topology. Then c(Γ) (resp. ℓ∞(Γ)) can
be identified with C(Ω), where Ω is the one-point compactification (resp. Stone/Čech
compactification). ∎

2.4 Nonexpansive maps
We record some properties of nonexpansive maps that are most useful for us. Many of these
properties hold (much) more generally; however, as we deal exclusively with Hilbert spaces
in Part I, we prefer to give self-contained proofs.
2.4.1 General nonexpansive maps
Theorem 2.4.1 (Baillon) Suppose T and (Tn) are nonexpansive maps defined on a closed
convex nonempty subset D of a Hilbert space X. Suppose (Tn) converges pointwise to T.
Then:

xn ⇀ x and xn − Tnxn → 0 imply x ∈ Fix T,

for every sequence (xn) in D.

Proof. (See also [8, Chapitre 6, Démonstration du Théorème 1.3].) Fix an arbitrary
vector u ∈ X. Denote the projection onto D by P_D (see Chapter 3 for basic properties of
projections). Because TnP_D is nonexpansive, ⟨(xn − u) − (TnP_Dxn − TnP_Du), xn − u⟩ ≥ 0,
∀n. The assumptions imply ⟨TP_Du − u, x − u⟩ ≥ 0. Now let u := x + tv, where v ∈ X is
arbitrary but fixed and t > 0. Then ⟨TP_D(x + tv) − (x + tv), v⟩ ≤ 0, which yields (after
letting t tend to 0) ⟨TP_Dx − x, v⟩ ≤ 0. Setting v := TP_Dx − x results in x = TP_Dx. As
x ∈ D, the proof is complete. ∎
Corollary 2.4.2 (Opial's Demiclosedness Principle) Suppose T is a nonexpansive map de-
fined on a closed convex nonempty subset D of a Hilbert space X. Then

xn ⇀ x and xn − Txn → 0 imply x ∈ Fix T,

for every sequence (xn) in D.

Proof. (See also [115, Lemma 2].) Apply Theorem 2.4.1 (with Tn := T, ∀n). ∎
Proposition 2.4.3 Suppose T is a nonexpansive map defined on a closed convex nonempty
set in a Hilbert space. Then Fix T is closed convex (possibly empty).

Proof. Suppose x1, x2 ∈ Fix T. Define z := w1x1 + w2x2, where w1, w2 are nonnegative
weights adding up to 1. Then, using Proposition 2.2.1,

||Tz − z||² = ||w1(Tz − x1) + w2(Tz − x2)||²
= w1||Tz − x1||² + w2||Tz − x2||² − w1w2||x1 − x2||²
≤ w1||z − x1||² + w2||z − x2||² − w1w2||x1 − x2||² = 0,

so z ∈ Fix T and Fix T is convex. Closedness follows from the continuity of T. ∎
2.4.2 Attracting maps
Definition 2.4.4 We say that a nonexpansive map T defined on a closed convex nonempty
set D in a Hilbert space is attracting, if Fix T ≠ ∅ and

||Tx − f|| < ||x − f||, ∀x ∈ D \ Fix T, f ∈ Fix T.

Theorem 2.4.5 Suppose T1, . . . , TN are finitely many attracting nonexpansive self-maps
of a closed convex nonempty set in a Hilbert space with ∩i Fix Ti ≠ ∅. Then:

(i) Fix (TN · · · T1) = ∩i Fix Ti and TN · · · T1 is attracting.

(ii) If w1, . . . , wN are positive weights adding up to 1, then Fix (∑i wiTi) = ∩i Fix Ti and
∑i wiTi is attracting.
Proof. Denote the domain of the maps by D. It is enough to prove the theorem for N = 2;
the general case follows inductively. (i): Clearly, Fix T1 ∩ Fix T2 ⊆ Fix (T2T1). To prove the
other inclusion, pick f ∈ Fix (T2T1). It is enough to show that f ∈ Fix T1. If this were false,
then T1f ∉ Fix T2. Now fix f̄ ∈ Fix T1 ∩ Fix T2. Then, since T2 is attracting,

||f − f̄|| = ||T2T1f − f̄|| < ||T1f − f̄|| ≤ ||f − f̄||,

which is absurd. Thus Fix T1 ∩ Fix T2 = Fix (T2T1). It remains to show that T2T1 is
attracting. Fix x ∈ D \ Fix (T2T1), f ∈ Fix (T2T1). If x = T1x, then T2x ≠ x and hence
||T2T1x − f|| = ||T2x − f|| < ||x − f||. Else x ≠ T1x, then ||T2T1x − f|| ≤ ||T1x − f|| < ||x − f||.
In either case, T2T1 is attracting. (ii): Clearly, Fix T1 ∩ Fix T2 ⊆ Fix (w1T1 + w2T2) (recall
that the weights add up to 1). Conversely, pick f ∈ Fix (w1T1 + w2T2), f̄ ∈ Fix T1 ∩ Fix T2.
Then, using Proposition 2.2.1,

||f − f̄||² = ||w1(T1f − f̄) + w2(T2f − f̄)||²
= w1||T1f − f̄||² + w2||T2f − f̄||² − w1w2||T1f − T2f||²
≤ ||f − f̄||² − w1w2||T1f − T2f||².

Hence the above chain of inequalities is actually one of equalities and so T1f = T2f, which
implies f = w1T1f + w2T2f = T1f = T2f. Next, we show that w1T1 + w2T2 is attracting.
Suppose x ≠ w1T1x + w2T2x and f ∈ Fix (w1T1 + w2T2). Then x ∉ Fix T1 ∩ Fix T2 and thus

||w1T1x + w2T2x − f|| ≤ w1||T1x − f|| + w2||T2x − f|| < ||x − f||. ∎
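Theorem 2.4.5(i) is the engine behind alternating projection methods. In R², projecting onto the x-axis and then onto the diagonal {(t, t)} composes two attracting maps whose only common fixed point is the origin, and the iteration is drawn to it. A minimal sketch (plain Python; the two lines are invented illustration choices):

```python
def p_axis(v):
    """Projection onto the x-axis, a closed subspace of R^2."""
    return (v[0], 0.0)

def p_diag(v):
    """Projection onto the diagonal {(t, t)} in R^2."""
    m = (v[0] + v[1]) / 2.0
    return (m, m)

# Fix(p_diag) ∩ Fix(p_axis) = {(0, 0)}, so by Theorem 2.4.5(i) the composition
# has exactly this fixed point set; iterating shrinks the point toward it.
v = (1.0, 0.7)
for _ in range(60):
    v = p_diag(p_axis(v))
```

Each full cycle halves the distance to the origin here, so after 60 iterations the iterate is numerically indistinguishable from the common fixed point.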
2.4.3 Strongly attracting maps
Definition 2.4.6 We say that a nonexpansive map T defined on a closed convex nonempty
set D in a Hilbert space is strongly attracting or κ-attracting, if Fix T ≠ ∅ and κ is a positive
real with

κ||x − Tx||² ≤ ||x − f||² − ||Tx − f||², ∀x ∈ D, f ∈ Fix T.

Clearly, every strongly attracting map is attracting. The next result is a quantitative version
of Theorem 2.4.5:
Theorem 2.4.7 Suppose T1, . . . , TN are strongly attracting self-maps of a closed convex
nonempty set in a Hilbert space with constants κ1, . . . , κN respectively. Then:

(i) TN · · · T1 is (1/N) min{κ1, . . . , κN}-attracting.

(ii) If w1, . . . , wN are positive weights adding up to 1, then ∑i wiTi is min{κ1, . . . , κN}-
attracting.

Proof. Again we denote the domain of the maps by D and recall that the weights add up to
1. (i): Given x ∈ D, f ∈ Fix (TN · · · T1), we estimate (with the help of Cauchy/Schwarz):
2.4.4 Averaged and firmly nonexpansive maps
A map T defined on a closed convex nonempty subset D in a Hilbert space is called firmly
nonexpansive, if

||Tx − Ty||² + ||(I − T)x − (I − T)y||² ≤ ||x − y||², ∀x, y ∈ D,

which is equivalent (see [165, Section 1]) to any of the following conditions:

I − T is firmly nonexpansive.

2T − I is nonexpansive.

T = ½I + ½N, for some nonexpansive map N on D.

||Tx − Ty||² ≤ ⟨Tx − Ty, x − y⟩, ∀x, y ∈ D.

The map T is called averaged, if T = (1 − ω)I + ωN, for some nonexpansive map N on
D and ω ∈ [0, 1[. The above characterization implies that every firmly nonexpansive map
is averaged and that T is averaged if and only if T = (1 − α)I + αF, for some firmly
nonexpansive map F on D and α ∈ [0, 2[.
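The last characterization above is easy to test for a concrete firmly nonexpansive map. Projections onto closed convex sets are the model examples (see Chapter 3); the sketch below (plain Python) checks ||Tx − Ty||² ≤ ⟨Tx − Ty, x − y⟩ for the projection of the real line onto [−1, 1] at random pairs.

```python
import random

def proj(x, lo=-1.0, hi=1.0):
    """Projection onto the interval [lo, hi]: a firmly nonexpansive map."""
    return min(max(x, lo), hi)

random.seed(2)
violations = 0
for _ in range(1000):
    x, y = random.uniform(-3, 3), random.uniform(-3, 3)
    px, py = proj(x), proj(y)
    # ||Tx - Ty||^2 <= <Tx - Ty, x - y>: the last condition in the list above.
    if (px - py) ** 2 > (px - py) * (x - y) + 1e-12:
        violations += 1
```

No violations occur: in one dimension the projection is monotone and 1-Lipschitz, so the inequality holds exactly.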
Averaged maps are perfect prototypes for strongly attracting maps:
Proposition 2.4.8 Suppose T is a firmly nonexpansive self-map of a closed convex non-
empty subset D with Fix T ≠ ∅. Let α ∈ [0, 2[ and R be the averaged map R := (1 − α)I + αT.
Then for every x ∈ D and f ∈ Fix T:

(i) Fix R = Fix T, if α > 0.

(ii) ⟨x − f, x − Tx⟩ ≥ ||x − Tx||² and ⟨x − Tx, Tx − f⟩ ≥ 0.

(iii) ||x − f||² − ||Rx − f||² = 2α⟨x − f, x − Tx⟩ − α²||x − Tx||².

(iv) ||x − f||² − ||Rx − f||² ≥ ((2 − α)/α)||x − Rx||² = (2 − α)α||x − Tx||²; in particular,
R is (2 − α)/α-attracting.

Proof. (i) is immediate. (ii): Since T is firmly nonexpansive, we have: ||Tx − f||² ≤
⟨Tx − f, x − f⟩ ⟺ ||Tx − x||² + ||x − f||² + 2⟨Tx − x, x − f⟩ ≤ ⟨Tx − f, x − f⟩ ⟺
||Tx − x||² ≤ ⟨x − Tx, x − f⟩ ⟺ 0 ≤ ⟨x − Tx, Tx − f⟩. (iii):

||x − f||² − ||Rx − f||² = ||x − f||² − ||(1 − α)(x − f) + α(Tx − f)||²
= ||x − f||² − [(1 − α)²||x − f||² + α²||Tx − f||² + 2α(1 − α)⟨x − f, Tx − f⟩]
= 2α||x − f||² − α²||x − f||² − α²||Tx − f||² + 2α²⟨x − f, Tx − f⟩ − 2α⟨x − f, Tx − f⟩
= 2α⟨x − f, (x − f) − (Tx − f)⟩ − α²[||x − f||² + ||Tx − f||² − 2⟨x − f, Tx − f⟩]
= 2α⟨x − f, x − Tx⟩ − α²||x − Tx||².

(iv): By (iii) and (ii),

||x − f||² − ||Rx − f||² = 2α⟨x − f, x − Tx⟩ − α²||x − Tx||² ≥ (2 − α)α||x − Tx||²
= ((2 − α)/α)||x − Rx||²,

since x − Rx = α(x − Tx). ∎
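Proposition 2.4.8(iv) can be verified numerically for a concrete firmly nonexpansive T. The sketch below (plain Python; the choice of T and the sampling ranges are illustrative) takes T to be the projection of the real line onto [−1, 1], f = 0 ∈ Fix T, and checks the inequality over a range of relaxation parameters α and points x.

```python
def T(x):
    """A firmly nonexpansive map: projection of the real line onto [-1, 1]."""
    return min(max(x, -1.0), 1.0)

def inequality_holds(alpha, x, f=0.0):
    """||x - f||^2 - ||Rx - f||^2 >= (2 - alpha) * alpha * ||x - Tx||^2 for
    the averaged map R = (1 - alpha) I + alpha T (Proposition 2.4.8(iv))."""
    rx = (1.0 - alpha) * x + alpha * T(x)
    lhs = (x - f) ** 2 - (rx - f) ** 2
    rhs = (2.0 - alpha) * alpha * (x - T(x)) ** 2
    return lhs >= rhs - 1e-12

checks = [inequality_holds(a / 10.0, x / 7.0)
          for a in range(1, 20) for x in range(-35, 36)]
```

All checks pass, including points inside [−1, 1] (where both sides vanish) and relaxation parameters close to 2.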
2.5 Monotone Operator Theory
This short section contains a selection of classical results that we will need mainly in Part 11.
Fact 2.5.1 (Rockafellar) Suppose f is a convex lower semi-continuous proper function on
a Banach space. Then the subdifferential map is maximal monotone.
Proof. [128]. ∎

Fact 2.5.2 (Rockafellar) Suppose T and T′ are maximal monotone operators on a reflexive
Banach space with (dom T ∩ int dom T′) ∪ (dom T′ ∩ int dom T) ≠ ∅. Then T + T′ is maximal
monotone.

Proof. [129]. ∎

Fact 2.5.3 (Heisler) Suppose T and T′ are maximal monotone operators on a Banach space
X with dom T = dom T′ = X. Then T + T′ is maximal monotone.

Proof. [121, Remark after Problem 2.20]. ∎
Fact 2.5.4 Suppose X is a Hilbert space and T is a set-valued map from X to X*. Then:

(i) T is monotone if and only if the resolvent (I + ρT)⁻¹ is (single-valued and) firmly
nonexpansive, for (some or every) ρ > 0.

(ii) T is maximal monotone if and only if the resolvent (I + ρT)⁻¹ is (single-valued and)
firmly nonexpansive and dom (I + ρT)⁻¹ = X, for (some or every) ρ > 0.

Proof. [57, Theorem 2]. ∎

2.6 Miscellany
2.6.1 Hoo hoo - ha ha: the monkey house of interiors
Definition 2.6.1 Suppose C is a convex subset of a Banach space X. A point x ∈ C
belongs to the . . .

(i) core of C, denoted core C, if cone (C − x) = X.

(ii) strong relative interior of C, denoted sri C, if cone (C − x) is a closed subspace.

(iii) relative interior of C, denoted ri C, if cone (C − x) is a subspace.

(iv) quasi relative interior of C, denoted qri C, if cl cone (C − x) is a subspace.

Evidently,

int C ⊆ core C ⊆ sri C ⊆ ri C ⊆ qri C.
Note that the interior and the core take into account in which space the set is viewed whereas
the relative interior versions reflect the intrinsic geometry.
Definition 2.6.2 Suppose C is a convex subset of a Banach space X. If A is a subset of
X, then the set

{x ∈ C : A ⊆ x + cone (C − x)}

is called the core of C relative to A and denoted core_A(C).

Proposition 2.6.3 Suppose C is a convex subset of a Banach space. Then:

(i) core C = core_X(C).

(ii) sri C = core_{cl aff(C)}(C).

(iii) ri C = core_{aff(C)}(C).

Proof. Let us prove (ii), so fix an arbitrary point x ∈ C. Then: x ∈ core_{cl aff(C)}(C) ⟺
cl aff(C) ⊆ x + cone (C − x) ⟺ x + cl aff(C − x) ⊆ x + cone (C − x) ⟺ cl aff(C − x) ⊆
cone (C − x) ⟺ cone (C − x) is a closed subspace ⟺ x ∈ sri (C). (i) and (iii) can be proved
analogously. ∎
analogously. . Remarks 2.6.4 A few comments on history and terminology are in order. These comments
help explain the name of this subsection.
(i) Borwein and Lewis coined and deeply explored the notion of the quasi relative interior;
see [22, 23]. It should be noted that Zarantonello suggested the equivalent notion of
"inner points" in 1971; see [165, Definition 2.5].

(ii) The three notions of strong relative interior, relative interior, and quasi relative interior
all agree in finite dimensions. The relative interior is studied extensively in [127,
Section 6]; it also coincides with Holmes's "intrinsic core" [91].
(iii) What I call the strong relative interior appeared also under three different names in
the literature. However, I believe that all of these are somewhat unfortunate: Firstly,
Attouch and Théra [6, Section 4.1] called it the "strong quasi interior". However, there
is the notion of the quasi interior of a convex set C, defined by {x ∈ C : cl cone (C − x) =
X}. But the strong quasi interior does not have to be a subset of the quasi interior
(consider R in ℓ2). Secondly, Jeyakumar and Wolkowicz [96, Definition 3.2] called
it the "strong quasi relative interior". However, the strong relative interior is closer
to the relative interior than to the quasi relative interior. Finally, Bauschke and
Borwein [15] called it the "intrinsic core". This is only justified in finite-dimensional
spaces; see (ii).
We conclude this section with some results that demonstrate the usefulness of strong relative
interiors.
Proposition 2.6.5 Suppose C is a subspace of a Banach space. Then C is closed if and
only if 0 ∈ sri C.
Theorem 2.6.6 Suppose X, Y are Banach spaces and R is a closed convex relation from
X to Y with 0 ∈ sri(ran R). Then for every ε > 0, there exists δ > 0 such that
cl span(ran R) ∩ δB_Y ⊆ R(B(x̄, ε)), ∀x̄ ∈ R⁻¹(0).
Proof. Let Z := cl span(ran R). Then R is a closed convex relation from X to Z and
0 ∈ sri(ran R) = core_{cl aff(ran R)}(ran R) = core_{cl span(ran R)}(ran R) = core_Z(ran R).
By Fact 2.2.27, R is open at 0. Hence for every x̄ ∈ R⁻¹(0) and every ε > 0, there exists
δ > 0 such that δB_Z ⊆ R(B(x̄, ε)). The result follows. ∎
Corollary 2.6.7 Suppose C₁, C₂ are closed convex nonempty subsets of a Banach space
X. Then core(C₁ − C₂) = int(C₁ − C₂).
Proof. We only have to show that core(C₁ − C₂) ⊆ int(C₁ − C₂) and it suffices WLOG
(after translation) to pick 0 ∈ core(C₁ − C₂). Then cone(C₁ − C₂) = X and hence
cl span(C₁ − C₂) = X.
Define a set-valued map R from X × X to X by
R(x₁, x₂) := {x₁ − x₂}, if x₁ ∈ C₁ and x₂ ∈ C₂; ∅, otherwise,
for every (x₁, x₂) ∈ X × X. Then R is a closed convex relation with ran R = C₁ − C₂ and
0 ∈ core(C₁ − C₂) ⊆ sri(C₁ − C₂). Hence, by Theorem 2.6.6, there exists δ > 0 such that
cl span(ran R) ∩ δB_X ⊆ ran R, i.e., δB_X ⊆ C₁ − C₂. Therefore, 0 ∈ int(C₁ − C₂). ∎
We obtain the following classical result (see, for instance, [91, Lemma 17.E]):
Corollary 2.6.8 In a Banach space, the core of every closed convex set is the same as its
interior.
2.6.2 Angles and differences of two closed convex sets
Definition 2.6.9 (Dixmier; 1949) Suppose C, D are closed convex nonempty subsets of a
Hilbert space X. The minimal angle between C and D is the angle γ₀ = γ₀(C, D) ∈ [0, π/2]
whose cosine is defined by
cos γ₀ := sup{⟨c, d⟩ : c ∈ B_X ∩ cone C, d ∈ B_X ∩ cone D}.
Note that γ₀(C, D) = γ₀(D, C). Dixmier's definition of an angle is studied most often in
the context of closed subspaces; see Deutsch's [49] for more. Nonetheless, the more general
definition provides an easy criterion for the closedness of the difference of two closed convex
nonempty sets:
Proposition 2.6.10 Suppose C, D are closed convex nonempty subsets of a Hilbert space.
If γ₀(C, D) > 0, then C − D is closed.
Proof. (See also [47, Lemma 2.5.(4)] or [84, Lemma 2.3].) Since γ₀(C, D) > 0, there exists
ε > 0 with cos γ₀(C, D) = 1 − ε. For arbitrary nonzero points c ∈ C, d ∈ D, we have
⟨c, d⟩ ≤ (1 − ε)‖c‖·‖d‖ and hence ‖c − d‖² = ‖c‖² + ‖d‖² − 2⟨c, d⟩ ≥ 2ε‖c‖·‖d‖.
Now suppose x ∈ cl(C − D). Then there exist sequences (cₙ) in C and (dₙ) in D such that
cₙ − dₙ → x.
Claim: lim inf_n ‖cₙ‖ < +∞.
Otherwise, ‖cₙ‖ → +∞ and thus (since (cₙ − dₙ) is bounded) ‖dₙ‖ → +∞. The above
inequality implies the absurdity ‖cₙ − dₙ‖² ≥ 2ε‖cₙ‖·‖dₙ‖ → +∞ and the
claim is verified.
Hence lim inf_n ‖cₙ‖ < +∞. But then there exist subsequences (c_{kₙ}) of (cₙ) and (d_{kₙ}) of (dₙ)
that are weakly convergent. It follows that x ∈ C − D. ∎
Definition 2.6.11 (Friedrichs; 1937) The angle between two closed convex cones C, D in
a Hilbert space is the angle γ = γ(C, D) ∈ [0, π/2] whose cosine is defined by
cos γ := sup{⟨c, d⟩ : c ∈ B_X ∩ C ∩ (C ∩ D)^⊖, d ∈ B_X ∩ D ∩ (C ∩ D)^⊖}.
Note that γ(D, C) = γ(C, D) = γ₀(C ∩ (C ∩ D)^⊖, D ∩ (C ∩ D)^⊖).
Friedrichs's angle is almost exclusively studied for closed subspaces; see again Deutsch's [49]
for more. However, we find it convenient to express a later result (Proposition 4.3.2) in
terms of the angle for cones.
The next fact explains our interest in angles: if the angle is positive then von Neumann's
alternating projection method produces linearly convergent sequences (Remark 9.6.1).
Fact 2.6.12 Suppose C, D are closed subspaces of a Hilbert space. Then the angle γ(C, D)
can be expressed in terms of projections: cos γ(C, D) = ‖P_D P_C − P_{C∩D}‖.
Proof. [47, Lemma 2.5.(3)]; see also [98, Theorem 2] and [49, Lemma 10.4]. ∎
Corollary 2.6.13 Suppose C, D are closed subspaces of a Hilbert space. If γ(C, D) > 0,
then [C ∩ (C ∩ D)^⊥] + [D ∩ (C ∩ D)^⊥] is closed.
Proof. From Definition 2.6.11 and Proposition 2.6.10, the difference [C ∩ (C ∩ D)^⊥] −
[D ∩ (C ∩ D)^⊥] is closed; the result follows, since the subtrahend is a subspace. ∎
Proposition 2.6.14 Suppose C is a closed convex subset of a Hilbert space and D is a
convex nonempty subset of C^⊥. Then C − D is closed if and only if D is.
Proof. "⇒": Fix d̄ ∈ cl D and pick a sequence (dₙ) in D converging to d̄. Fix an arbitrary
point c̄ ∈ C. Then (c̄ − dₙ) is a sequence in C − D converging to c̄ − d̄; hence c̄ − d̄ ∈
cl(C − D) = C − D. Thus there exist c ∈ C, d ∈ D with c̄ − d̄ = c − d, or: c̄ − c = d̄ − d.
But c̄ − c is in span C and d̄ − d is in span D ⊆ (span C)^⊥. It follows that d̄ = d ∈ D, as
desired. "⇐": Since D is contained in C^⊥ = (cone C)^⊥, we have γ₀(C, D) = π/2 and the
result follows from Proposition 2.6.10. ∎
Proposition 2.6.15 Suppose C, D are closed subspaces of a Hilbert space. Then [C ∩ (C ∩ D)^⊥] + [D ∩ (C ∩ D)^⊥] is closed if and only if C + D is.
Proof. (Alex Simonič [142]; see also [15, Lemma 4.10].)
Claim 1: [C ∩ (C ∩ D)^⊥] + [D ∩ (C ∩ D)^⊥] = (C + D) ∩ (C ∩ D)^⊥.
"⊆" is obvious. Conversely, fix x ∈ (C + D) ∩ (C ∩ D)^⊥. Then x = c + d, for some points
c ∈ C, d ∈ D, and (Proposition 3.3.8) x = P_{(C∩D)^⊥}(x) = P_{(C∩D)^⊥}(c + d) =
(I − P_{C∩D})(c) + (I − P_{C∩D})(d); so "⊇" follows and Claim 1 thus holds. Similarly, we see
Claim 2: C + D = [(C + D) ∩ (C ∩ D)^⊥] + (C ∩ D).
Both claims together imply C + D = ([C ∩ (C ∩ D)^⊥] + [D ∩ (C ∩ D)^⊥]) + (C ∩ D). The
result now follows from Proposition 2.6.14. ∎
We conclude the subsection with a classical result:
Proposition 2.6.16 Suppose C, D are closed subspaces of a Hilbert space X. Then C + D
is closed if and only if C^⊥ + D^⊥ is.
Proof. "⇒": let f₁ := ι_C and f₂ := ι_D. Then dom f₁ − dom f₂ = C + D is a closed
subspace; hence 0 ∈ sri(dom f₁ − dom f₂) (Proposition 2.6.5). Therefore, by Fact 2.2.11,
∂ι_C + ∂ι_D = ∂(ι_C + ι_D) = ∂ι_{C∩D}. Evaluated at 0, this becomes C^⊥ + D^⊥ = (C ∩ D)^⊥. But
the latter set is closed and hence so is C^⊥ + D^⊥.
"⇐": if C^⊥ + D^⊥ is closed, then (by "⇒") so is C^⊥⊥ + D^⊥⊥ = C + D. ∎
2.6.3 Pierra's product space formalization
Fact 2.6.17 Suppose C₁, …, C_N are finitely many closed convex nonempty subsets of a
Hilbert space X. Define the product space with (ordered) positive weights (w_i)_{i=1}^N adding
up to 1 by
𝐗 := Xᴺ with ⟨x, y⟩ := Σ_{i=1}^N w_i⟨x_i, y_i⟩,
the diagonal by
Δ := {(x, …, x) ∈ 𝐗 : x ∈ X},
and the product set by
𝐂 := C₁ × ⋯ × C_N.
Then ‖x − y‖² = Σ_{i=1}^N w_i‖x_i − y_i‖² for every x, y ∈ 𝐗, and the (components of the) projections onto Δ, 𝐂 are given by
P_Δ(x) = (Σ_{i=1}^N w_i x_i, …, Σ_{i=1}^N w_i x_i) and P_𝐂(x) = (P_{C₁}(x₁), …, P_{C_N}(x_N)).
Note that for x ∈ X,
x ∈ ∩_{i=1}^N C_i if and only if (x, …, x) ∈ Δ ∩ 𝐂.
Proof. [123, Section 1]. ∎
2.6.4 Linearly convergent sequences
Proposition 2.6.18 Suppose (xₙ) is a sequence in a Banach space X, p is a positive integer,
and x ∈ X. If (x_{pn})ₙ converges linearly to x and (‖xₙ − x‖) is decreasing, then the entire
sequence (xₙ) converges linearly to x.
Proof. (See also [14, Proposition 1.6].) There exist α > 0 and β ∈ [0, 1[ such that
‖x_{pn} − x‖ ≤ αβⁿ, ∀n. Now fix an arbitrary positive integer m and divide by p with
remainder: m = pn + r, where 0 ≤ r ≤ p − 1. We then estimate as desired:
‖xₘ − x‖ ≤ ‖x_{pn} − x‖ ≤ αβⁿ = α(β^{1/p})^{pn} ≤ (αβ^{−(p−1)/p})·(β^{1/p})ᵐ. ∎
2.6.5 A "series" estimate
Proposition 2.6.19 Suppose (λₖ)ₖ≥₁ is a sequence of nonnegative reals with Σ_{k=1}^∞ λₖ² <
+∞. Let Λₙ := Σ_{k=1}^n λₖ be the nth partial sum and p be an arbitrary positive integer.
Then lim inf_n Λₙ(λ_{n−p} + ⋯ + λ_{n−1} + λₙ) = 0.
Proof. By Cauchy/Schwarz, Λₙ(λ_{n−p} + ⋯ + λₙ) ≤ √n·√(Σ_{k=1}^n λₖ²)·(λ_{n−p} + ⋯ + λₙ); hence
it suffices to show that lim inf_n √n·(λ_{n−p} + ⋯ + λₙ) = 0. Suppose not. Then there exists r > 0
such that eventually λ_{n−p} + ⋯ + λₙ ≥ r/√n, which implies
λ_{n−p}² + ⋯ + λₙ² ≥ (λ_{n−p} + ⋯ + λₙ)²/(p + 1) ≥ r²/((p + 1)n).
Summing over sufficiently large n yields the desired contradiction. ∎
2.7 Notes
Remark 2.7.1 An interesting improvement of Proposition 2.3.1 was pointed out to us by
Warren Moors [109]: if C is a closed convex subset of a Banach space X and r > inf ‖C‖,
then cl_{weak*}(rB_X ∩ C) = rB_{X**} ∩ cl_{weak*}(C). Moreover, the assumption "r > inf ‖C‖" is
important in the sense that the conclusion can fail when r = inf ‖C‖.
Remark 2.7.2 Suppose X is a separable Banach space that does not contain a copy of
ℓ¹. Then every bounded subset of X** is weak* sequentially dense in its weak* closure
(Rosenthal's [133, Theorem 2]). Using Proposition 2.3.1, one thus obtains the following
interesting sequential variant of Proposition 2.3.2: if C is a closed convex nonempty subset
of X ⊆ X**, and x** ∈ cl_{weak*} C, then there exists a bounded sequence (xₙ) in C with
xₙ ⇀* x**.
Remark 2.7.3 Leach and Whitfield [102, page 121] coined the notion of a rough norm,
where rough is also meant in a "nonsmooth sense". However, the properties rough and
rugged are unrelated (the reader is referred to [51, 120] for further information and notation):
on the one hand, the Euclidean plane ℝ² is an Asplund space and thus does not admit an
equivalent rough norm ([51, Theorem I.5.3]) but does admit an equivalent rugged norm
(Proposition 2.7.4 below). Hence rugged ⇏ rough. On the other hand, Phelps constructed
an equivalent norm on ℓ¹ that is Gâteaux differentiable but nowhere Fréchet differentiable
([120, Example following Theorem 5.12]). By Deville et al.'s [51, Theorem III.1.9], the space
ℓ¹ admits an equivalent rough Gâteaux differentiable norm; this norm cannot be rugged.
Thus rough ⇏ rugged.
It is amusing that (almost) all Banach spaces can be renormed to become rugged:
Proposition 2.7.4 (Jon Vanderwerff [156]) Suppose X is a Banach space. Then X admits
an equivalent rugged norm if and only if the dimension of X is greater than 1.
Proof. "⇒": neither {0} nor ℝ is rugged. "⇐": fix an arbitrary u ∈ X with ‖u‖ = 1
and u* ∈ Ju. Then X = ker u* ⊕ ℝu, which allows us to view X as X = Y ⊕ ℝ, where
Y := ker u*. Define a norm on X by |||(y, r)||| := ‖y‖ + |r|, ∀(y, r) ∈ X = Y ⊕ ℝ. Clearly,
‖·‖ ≤ |||·|||; thus, by the Inverse Mapping Theorem ([43, Theorem III.12.5]), these norms
are equivalent. The norm in (X, |||·|||)* is given by |||(y*, r*)||| = max{‖y*‖, |r*|}, ∀y* ∈ Y*,
r* ∈ ℝ* = ℝ. For the remainder of this proof, J denotes the duality map of (X, |||·|||).
Then J(y, r) = {(y*, r*) ∈ Y* × ℝ : ⟨y*, y⟩ + r*r = ‖y‖ + |r| = max{‖y*‖, |r*|}}. On
the one hand, J(u, 0) ⊇ {u*} × [−1, +1] so that J(u, 0) − J(u, 0) ⊇ {0} × [−2, +2]. On
the other hand, J(0, 1) ⊇ B_{Y*} × {1}; thus J(0, 1) − J(0, 1) ⊇ 2B_{Y*} × {0}. Altogether,
∪_{n∈ℕ} n[(J(u, 0) − J(u, 0)) + (J(0, 1) − J(0, 1))] = Y* × ℝ = X*. Consequently, (X, |||·|||) is
rugged. ∎
The results on (strongly) attracting and averaged maps are taken from [14, Section 2].
Part I
Projection Algorithms in Hilbert
spaces
Chapter 3
Projections
3.1 Overview
In this Chapter, we consider projections - not surprisingly, a key ingredient of projection
algorithms - in some detail. After listing simple yet immensely useful properties, we present
a fairly rich collection of examples before turning to related operators (that facilitate the
understanding of Fejér monotone sequences in Section 6.2).
3.2 Basic properties
Recall that the distance function to a closed convex nonempty subset C of X is defined by
d(z, C) := inf_{c∈C} ‖z − c‖, ∀z ∈ X; it is a convex nonexpansive function on X and hence
weakly lower semi-continuous.
Theorem 3.2.1 Suppose C is a closed convex nonempty subset of X. Then for every
z ∈ X, there exists a unique closest point in C, called the projection of z onto C and
denoted P_C(z) or P_C z, with d(z, C) = ‖z − P_C z‖.
(i) P_C z is characterized by z − P_C z ∈ ∂ι_C(P_C z); that is:
P_C z ∈ C and ⟨C − P_C z, z − P_C z⟩ ≤ 0.
(ii) For every x, y ∈ X, we have:
‖x − y‖² = ‖P_C x − P_C y‖² + ‖(x − y) − (P_C x − P_C y)‖²
+ 2⟨x − P_C x, P_C x − P_C y⟩ + 2⟨y − P_C y, P_C y − P_C x⟩;
CHAPTER 3. PROJECTIONS
moreover, each term occurring in this formula is nonnegative.
Proof. For an entirely elementary proof of (i), see [165, Lemma 1.1]; here, we rather give
a Convex Analysis proof: define a function h on X by h(x) := ½‖x − z‖² + ι_C(x). Then
h has weakly compact convex lower level sets and is strictly convex. Thus argmin_{x∈X} h(x)
exists, is unique, and equal to P_C z. The optimality condition 0 ∈ ∂h(P_C z) is equivalent to
z − P_C z ∈ ∂ι_C(P_C z), hence (i) holds. (ii) follows from (i) by a trivial expansion. ∎
Observation 3.2.2 Every projection is firmly nonexpansive (by Theorem 3.2.1.(ii)), hence
1-attracting (Proposition 2.4.8.(iv)). This opens the door to the powerful results on strongly
attracting maps in Subsection 2.4.3. Here is a sampler: Suppose P₁, …, P_N are projections
onto closed convex sets C₁, …, C_N respectively with C := ∩ᵢ Cᵢ ≠ ∅. Then
Fix(P_N ⋯ P₁) = C and P_N ⋯ P₁ is (1/N)-attracting,
by Theorem 2.4.5.(i) and Theorem 2.4.7.(i). Suppose further w₁, …, w_N are positive weights
(which thus add up to 1). Then
Fix Σᵢ wᵢPᵢ = C and Σᵢ wᵢPᵢ is min{w₁, …, w_N}-attracting,
by Theorem 2.4.5.(ii) and Theorem 2.4.7.(ii).
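To make this sampler concrete, here is a small numerical sketch (illustrative only; the sets and starting point are chosen arbitrarily) that iterates the composition of two projections in ℝ², using the halfspace and orthant formulas of Examples 3.3.13 and 3.3.2 below:

```python
import numpy as np

# C1 = halfspace {x : x[0] + x[1] <= 1}, C2 = nonnegative orthant.
a, b = np.array([1.0, 1.0]), 1.0

def P1(x):  # projection onto the halfspace (Example 3.3.13)
    return x - max(a @ x - b, 0.0) / (a @ a) * a

def P2(x):  # projection onto the orthant (Example 3.3.2)
    return np.maximum(x, 0.0)

# By Observation 3.2.2, Fix(P2 P1) = C1 ∩ C2, so iterating the
# composition drives the iterates into the intersection.
x = np.array([4.0, -3.0])
for _ in range(200):
    x = P2(P1(x))

assert a @ x <= b + 1e-9 and (x >= 0).all()  # x lies in C1 ∩ C2
```

The fixed-point set of the composition is exactly the intersection, which is why the limit is feasible for both sets.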
Once the projection onto a set is known, then so are the projections onto its translations,
dilations, and expansions:
Proposition 3.2.3 Suppose C is a closed convex nonempty subset of X and x ∈ X. Then:
(i) (translation) P_{z+C}(x) = z + P_C(x − z), ∀z ∈ X.
(ii) (dilation) P_{rC}(x) = rP_C(x/r), ∀r ≠ 0.
(iii) (expansion) P_{C+εB_X}(x) = x, if d(x, C) ≤ ε; P_{C+εB_X}(x) = P_C(x) + ε(x − P_C(x))/‖x − P_C(x)‖, otherwise; ∀ε ≥ 0.
Proof. Verify the condition of Theorem 3.2.1.(i). ∎
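As a quick sanity check of (i) and (ii) (a sketch only; the box C = [0,1]³ is chosen because its projection is a componentwise clip, and the test points are arbitrary):

```python
import numpy as np

# Base set: the box C = [0,1]^3, whose projection is a componentwise clip.
P_C = lambda x: np.clip(x, 0.0, 1.0)

x = np.array([2.5, -0.7, 0.3])
z = np.array([1.0, 2.0, -1.0])

# (i) translation: P_{z+C}(x) = z + P_C(x - z); here z + C = [z, z+1].
lhs = np.clip(x, z, z + 1.0)          # direct projection onto z + C
rhs = z + P_C(x - z)
assert np.allclose(lhs, rhs)

# (ii) dilation: P_{rC}(x) = r * P_C(x/r); here rC = [0, r]^3 for r > 0.
r = 2.0
lhs = np.clip(x, 0.0, r)              # direct projection onto rC
rhs = r * P_C(x / r)
assert np.allclose(lhs, rhs)
```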
The important class of cones has well-behaved projection maps and distance functions:
Proposition 3.2.4 Suppose C is a closed convex cone in X. Then P_C is positively homogeneous. In particular, d(rx, C) = rd(x, C), for every x ∈ X, r ≥ 0.
Proof. Use Proposition 3.2.3.(ii).
A very useful explicit formula for the subdifferential of the distance function is provided in
the following:
Proposition 3.2.5 Suppose C is a closed convex nonempty subset of X and x ∈ X. Then:
∂d(·, C)(x) = {(x − P_C x)/‖x − P_C x‖}, if x ∉ C; B_X ∩ N_C(x), otherwise.
Proof. Denote the function d(·, C) by f and P_C by P. The function f is convex, finite,
and nonexpansive on X; in particular, by Proposition 2.2.8, f is subdifferentiable everywhere.
Suppose x ∈ X and x* ∈ X* = X. Then, using the fact that the distance function
is the infimal convolution of ι_C and ‖·‖, we obtain the basic equivalences: x* ∈ ∂f(x)
⟺ x ∈ ∂f*(x*) = ∂ι_C*(x*) + ∂ι_{B_X}(x*) (the usual constraint qualification holds at 0) ⟺
f(x) + f*(x*) ≤ ⟨x*, x⟩ ⟺ ‖x − Px‖ + ι_C*(x*) + ι_{B_X}(x*) ≤ ⟨x*, x⟩. So by now we know
that ∅ ≠ ∂f(x) ⊆ B_X. Case 1: x ∈ C. Then the basic equivalences are equivalent to
sup⟨x*, C − x⟩ ≤ 0, i.e., x* ∈ N_C(x). Case 2: x ∉ C. Claim: ∂f(x) ⊆ S_X. Otherwise, let
x* ∈ ∂f(x) ∩ int B_X. Then ∂ι_{B_X}(x*) = {0}. Hence, by the basic equivalences, x ∈ ∂ι_C*(x*)
or x* ∈ ∂ι_C(x), which implies the desired absurdity x ∈ C. The Claim thus holds. So
let x* ∈ ∂f(x) ⊆ S_X. Using the basic equivalences, we estimate: ⟨x*, Px − x⟩ ≤ ι_C*(x*) −
⟨x*, x⟩ ≤ −‖x − Px‖ = −‖x − Px‖·‖x*‖ ≤ −⟨x − Px, x*⟩ = ⟨x*, Px − x⟩. Hence equality
holds throughout and we obtain x* = (x − Px)/‖x − Px‖. ∎
3.3 Examples
This section contains numerous examples of directly computable projections. To prove
these formulae, you either compute "from scratch" (as in Theorem 3.3.6) or you have a
"candidate" in advance (as in Example 3.3.17). In the latter case, Theorem 3.2.1.(i) is most
helpful since it often allows a painless verification of the given candidate.
3.3.1 Hilbert lattice cones
Theorem 3.3.1 Suppose X is a Hilbert lattice and let C be the lattice cone X₊. Then
P_C x = x⁺, for every x ∈ X.
Proof. Since X is a Hilbert lattice, for every x, y ∈ X (see [24, page 220]):
(1) x ∧ y = 0 if and only if x, y ≥ 0 and ⟨x, y⟩ = 0;
moreover, Borwein and Yost show in [24, Theorem 8] that (1) implies (3): ⟨x, y⟩ ≥ 0
whenever x, y ≥ 0. (For the reader's convenience, we adapt the labeling from [24, Theorem 8].) Now fix x ∈ X.
Then x⁺ ∈ X₊ and x⁺ ∧ x⁻ = 0. We verify the condition of Theorem 3.2.1.(i): for every c ∈ X₊,
⟨c − x⁺, x − x⁺⟩ = ⟨c − x⁺, −x⁻⟩ = −⟨c, x⁻⟩ + ⟨x⁺, x⁻⟩ = −⟨c, x⁻⟩ ≤ 0. ∎
Theorem 3.3.1 applies in particular to the standard Hilbert lattices such as ℓ² and L²[0, 1].
We record an important finite-dimensional case separately:
Example 3.3.2 Let C be the nonnegative orthant in a Euclidean space ℝᴺ and x ∈ ℝᴺ. Then (P_C x)ᵢ = xᵢ⁺ = max{xᵢ, 0}, for every i.
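In code, Example 3.3.2 is a single componentwise maximum; the following sketch (arbitrary random data) also spot-checks the characterization of Theorem 3.2.1.(i) against randomly sampled points of the orthant:

```python
import numpy as np

rng = np.random.default_rng(0)

def proj_orthant(x):
    """Projection onto the nonnegative orthant: (P_C x)_i = max{x_i, 0}."""
    return np.maximum(x, 0.0)

x = rng.standard_normal(5)
p = proj_orthant(x)

# Characterization of Theorem 3.2.1.(i): <c - p, x - p> <= 0 for all c in C.
for _ in range(1000):
    c = np.abs(rng.standard_normal(5))     # random point of the orthant
    assert (c - p) @ (x - p) <= 1e-12
```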
3.3.2 Polar cones
Theorem 3.3.3 (Moreau; 1962) Suppose C is a closed convex cone in X and x ∈ X. Then
P_{C^⊖} = I − P_C and ⟨P_C(x), P_{C^⊖}(x)⟩ = 0.
Proof. (See also [110].) Abbreviate P_C by P. Theorem 3.2.1.(i) yields not only x − Px ∈
∂ι_C(Px) but also (since ι_C* = ι_{C^⊖}): Px ∈ ∂ι_{C^⊖}(x − Px) ⟺ x − (x − Px) ∈ ∂ι_{C^⊖}(x − Px) ⟺
x − Px = P_{C^⊖}(x). Thus ⟨c − P_C(x), P_{C^⊖}(x)⟩ ≤ 0, ∀c ∈ C, and (after choosing c = 0, 2P_C(x))
it follows that ⟨P_C(x), P_{C^⊖}(x)⟩ = 0. ∎
Remark 3.3.4 Suppose C is a closed convex cone in X. Theorem 3.3.3 implies that every
x ∈ X can be written as c + d, where c ∈ C and d ∈ C^⊖. In contrast to the subspace case,
this decomposition need not be unique unless c and d are assumed to be perpendicular.
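A small numerical illustration of Theorem 3.3.3 and Remark 3.3.4 (a sketch; it uses the nonnegative orthant in ℝ⁶, whose polar cone is the nonpositive orthant):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(6)

# C = nonnegative orthant; its polar cone is the nonpositive orthant,
# so P_C x = x^+ and P_{C polar} x = (I - P_C) x (Theorem 3.3.3).
p_C     = np.maximum(x, 0.0)   # P_C x
p_polar = np.minimum(x, 0.0)   # P_{C polar} x = x - x^+

assert np.allclose(p_C + p_polar, x)   # x = P_C x + P_{C polar} x
assert abs(p_C @ p_polar) < 1e-12      # the two pieces are orthogonal
```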
3.3.3 Rays
Example 3.3.5 Suppose a ∈ X \ {0} and let C := cone(a) be the ray generated by a.
Then P_C x = (1/‖a‖²)⟨a, x⟩⁺ a, ∀x ∈ X.
Proof. Fix an arbitrary x ∈ X. If ⟨a, x⟩ ≥ 0, then P_C x = P_{span(a)}(x) (Example 3.3.11);
otherwise, x ∈ C^⊖ and P_C(x) = 0 (Theorem 3.3.3). ∎
3.3.4 Icecream cones
Theorem 3.3.6 Let C_α be the icecream cone {(x, r) ∈ X × ℝ : ‖x‖ ≤ αr}, for some α > 0.
Then C_α^⊖ = −C_{1/α} and for every (x, r) ∈ X × ℝ:
P_{C_α}(x, r) = (x, r), if ‖x‖ ≤ αr; (0, 0), if α‖x‖ ≤ −r; (αr₀x/‖x‖, r₀) with r₀ := (α‖x‖ + r)/(α² + 1), otherwise.
Proof. "C_α^⊖ ⊆ −C_{1/α}": Pick (y, s) ∈ C_α^⊖. Then (αy, ‖y‖) ∈ C_α and hence ⟨(y, s), (αy, ‖y‖)⟩
= α‖y‖² + s‖y‖ ≤ 0. If y ≠ 0, then ‖y‖ ≤ −s/α; equivalently, (−y, −s) ∈ C_{1/α}. Else
y = 0. Since (0, 1) ∈ C_α, we conclude s = ⟨(0, s), (0, 1)⟩ ≤ 0 and hence ‖y‖ = 0 ≤ −s/α;
equivalently, (−0, −s) ∈ C_{1/α}. "C_α^⊖ ⊇ −C_{1/α}": Pick (y, s) ∈ −C_{1/α}, i.e., ‖y‖ ≤ −s/α. Fix
an arbitrary (x, r) ∈ C_α. Then ⟨(x, r), (y, s)⟩ = ⟨x, y⟩ + rs ≤ ‖x‖·‖y‖ + rs ≤ (αr)(−s/α) +
rs = 0, as desired. Now let us prove the projection formula; so fix (x, r) ∈ X × ℝ. The first
case is obvious while the second case (in which (x, r) ∈ C_α^⊖) follows with Theorem 3.3.3. So
we assume (1) ‖x‖ > αr and (2) α‖x‖ > −r. Let us abbreviate r₀ := (α‖x‖ + r)/(α² + 1),
x̂ := x/‖x‖, and x₀ := αr₀x̂, so that our goal is: P_{C_α}(x, r) = (x₀, r₀). By (2), r₀ > 0 and
hence (x₀, r₀) ∈ C_α. Note also that by (1),
‖x‖ − αr₀ = (‖x‖ − αr)/(α² + 1) > 0.
We check the condition of Theorem 3.2.1.(i): pick an arbitrary (y, s) ∈ C_α. Then indeed
⟨(y, s) − (x₀, r₀), (x, r) − (x₀, r₀)⟩
= ⟨(y − αr₀x̂, s − r₀), ((‖x‖ − αr₀)x̂, r − r₀)⟩
= ⟨y − αr₀x̂, (‖x‖ − αr₀)x̂⟩ + (s − r₀)(r − r₀)
= (‖x‖ − αr₀)⟨y, x̂⟩ + αr₀(αr₀ − ‖x‖) + (s − r₀)(r − r₀)
≤ (‖x‖ − αr₀)‖y‖ + αr₀(αr₀ − ‖x‖) + (s − r₀)(r − r₀)
≤ (‖x‖ − αr₀)αs + αr₀(αr₀ − ‖x‖) + (s − r₀)(r − r₀)
= α(‖x‖ − αr₀)(s − r₀) + (s − r₀)(r − r₀)
= (s − r₀)[α(‖x‖ − αr₀) + r − r₀] = (s − r₀)·0 = 0. ∎
Remarks 3.3.7 • Theorem 3.3.6 can be proved differently by Theorem 3.3.21 after noting
that C_α = epi(‖·‖/α). The explicit formula for the projection together with its non-polyhedral structure makes the icecream cone an important provider for (counter-)examples;
see Chapter 4, in particular, Section 4.8. • The icecream cone C₁ for X = ℝ² can actually
be viewed as the cone of positive semi-definite matrices in the (Euclidean) space of 2 × 2
real symmetric matrices; see Remark 3.5.1.
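The case distinction of Theorem 3.3.6 translates directly into code. The sketch below implements the three cases (the test point is arbitrary):

```python
import numpy as np

def proj_icecream(x, r, alpha=1.0):
    """Projection onto C_alpha = {(x, r) : ||x|| <= alpha*r} (Theorem 3.3.6)."""
    nx = np.linalg.norm(x)
    if nx <= alpha * r:                       # already in the cone
        return x.copy(), r
    if alpha * nx <= -r:                      # in the polar cone -C_{1/alpha}
        return np.zeros_like(x), 0.0
    r0 = (alpha * nx + r) / (alpha**2 + 1.0)  # remaining case
    return (alpha * r0 / nx) * x, r0

x, r = np.array([3.0, 4.0]), 1.0              # ||x|| = 5 > r, outside C_1
p, p_r = proj_icecream(x, r)
assert abs(np.linalg.norm(p) - p_r) < 1e-12   # the projection is a boundary point
assert abs(p_r - 3.0) < 1e-12                 # r0 = (5 + 1)/2 = 3
```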
3.3.5 Affine subspaces
In Operator Theory, the notion of the orthogonal projection (onto a closed subspace) is
fundamental (see, for instance, [43, Theorem I.2.7]). Fortunately, there is no cause for
confusion as "our" projection coincides with this operator-theoretic orthogonal projection:
Proposition 3.3.8 Suppose C is a closed subspace of X. Then the projection onto C is
precisely the orthogonal projection in the sense of Operator Theory. Also, the projection
onto the orthogonal complement of C is given by P_{C^⊥} = I − P_C.
Proof. The definitions coincide because of Theorem 3.2.1 and the usual operator-theoretic
definition (see, for instance, [43, Theorem I.2.7]). The formula for P_{C^⊥} follows from Theorem 3.3.3, since C^⊥ = C^⊖. ∎
Example 3.3.9 Suppose a₁, …, a_N are finitely many linearly independent vectors in X.
Define a continuous linear operator A from X to ℝᴺ by (Ax)ᵢ = ⟨aᵢ, x⟩, ∀x ∈ X, ∀i. Then
P_{span{a₁,…,a_N}} = A*(AA*)⁻¹A.
Proof. Note that ran A* = span{a₁, …, a_N}; apply Fact 2.3.9. ∎
Example 3.3.10 Suppose X is a Euclidean space and Y is a subspace of X with basis a₁, …, a_N. If A denotes the matrix with column vectors a₁, …, a_N, then P_Y =
A(AᵀA)⁻¹Aᵀ.
Proof. A is an operator from ℝᴺ to X with ran A = Y; apply Fact 2.3.9. ∎
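Example 3.3.10 is the familiar least-squares projection matrix; a short numpy sketch (assuming the random columns are linearly independent, which holds generically) confirms the defining properties of an orthogonal projection:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 2))        # columns a_1, a_2 span Y in R^5
P = A @ np.linalg.inv(A.T @ A) @ A.T   # P_Y = A (A^T A)^{-1} A^T

assert np.allclose(P @ P, P)           # idempotent
assert np.allclose(P.T, P)             # self-adjoint (orthogonal projection)
assert np.allclose(P @ A, A)           # fixes the spanning vectors
```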
Example 3.3.11 Suppose a ∈ X \ {0} and let C := span(a) be the line generated by a.
Then P_C x = (1/‖a‖²)⟨a, x⟩a, ∀x ∈ X.
The next example is fundamental.
Example 3.3.12 Suppose a ∈ X \ {0} and b ∈ ℝ. Let C be the hyperplane {x ∈ X :
⟨a, x⟩ = b}. Then P_C x = x − ((⟨a, x⟩ − b)/‖a‖²)a.
Proof. Check the condition of Theorem 3.2.1.(i). ∎
3.3.6 Convex polyhedra
Computing the projection onto a convex polyhedron is the same as solving a quadratic
programming problem and thus not simple; there is no "formula" one could "write down".
However, algorithms such as Lemke's method and active-set methods (which are finitely convergent but somewhat complicated, see [141, Chapter 7] and [106, Chapter 14, Section 1]),
or Hildreth's method (which is computationally easy but not finitely convergent) do the
job. The last method is actually an incarnation of Dykstra's algorithm which we discuss in
Subsection 11.2.2.
Here, we rather focus on special cases that do allow explicit formulae.
The next example is fundamental.
Example 3.3.13 Suppose a ∈ X \ {0} and b ∈ ℝ. Let C be the halfspace {x ∈ X : ⟨a, x⟩ ≤ b}. Then:
P_C x = x − ((⟨a, x⟩ − b)⁺/‖a‖²)a; in particular, d(x, C) = (⟨a, x⟩ − b)⁺/‖a‖.
Proof. Once again, check the condition of Theorem 3.2.1.(i). ∎
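A minimal numerical companion to Example 3.3.13 (data chosen arbitrarily):

```python
import numpy as np

def proj_halfspace(x, a, b):
    """Projection onto {x : <a, x> <= b} (Example 3.3.13)."""
    return x - max(a @ x - b, 0.0) / (a @ a) * a

a, b = np.array([1.0, 2.0]), 2.0
x = np.array([3.0, 3.0])               # <a, x> = 9 > b, so x is outside

p = proj_halfspace(x, a, b)
assert abs(a @ p - b) < 1e-12          # projection lands on the hyperplane
# d(x, C) = (<a,x> - b)^+ / ||a||
assert abs(np.linalg.norm(x - p) - (9.0 - 2.0) / np.sqrt(5.0)) < 1e-12
```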
The next result sharpens Example 3.3.13 considerably.
Theorem 3.3.14 Suppose {a₁, …, a_N} is a finite set of orthogonal nonzero vectors in X
and some given extended reals satisfy −∞ ≤ lᵢ ≤ uᵢ ≤ +∞, ∀i ∈ {1, …, N}. Let C be the
box {x ∈ X : lᵢ ≤ ⟨aᵢ, x⟩ ≤ uᵢ, ∀i}. Then
P_C x = x − Σ_{i=1}^N (λᵢ(x)/‖aᵢ‖²)aᵢ,
where
λᵢ(x) := 0, if lᵢ ≤ ⟨aᵢ, x⟩ ≤ uᵢ; ⟨aᵢ, x⟩ − uᵢ, if ⟨aᵢ, x⟩ > uᵢ; ⟨aᵢ, x⟩ − lᵢ, if lᵢ > ⟨aᵢ, x⟩; ∀i.
Proof. Use induction on N and Zarantonello's [165, Lemma 5.10]. We omit the somewhat
messy details as the result is not needed in the sequel. ∎
Easier to verify is the following special case of Theorem 3.3.14, which covers Example 3.3.2
as well:
Example 3.3.15 Suppose X is ℝᴺ and some given extended reals satisfy −∞ ≤ lᵢ ≤ uᵢ ≤
+∞, for every i ∈ {1, …, N}. Let C be the box {x ∈ ℝᴺ : lᵢ ≤ xᵢ ≤ uᵢ, ∀i}. Then for every
x ∈ X and every i ∈ {1, …, N}:
(P_C x)ᵢ = min{max{xᵢ, lᵢ}, uᵢ}.
Proof. Either specialize Theorem 3.3.14 or check the condition of Theorem 3.2.1.(i). ∎
Computing projections onto convex polyhedra can be complicated; nonetheless, an important reduction to the finite-dimensional setting is always possible:
Theorem 3.3.16 Suppose C is a convex polyhedron in X given by {x ∈ X : ⟨aⱼ, x⟩ ≤
bⱼ, ∀j}, for finitely many vectors aⱼ and reals bⱼ. Let K be an arbitrary closed subspace
of ∩ⱼ ker(aⱼ), and define D := C ∩ K^⊥. Then D is a convex polyhedron in K^⊥ with
P_C = P_K + P_D P_{K^⊥}.
Proof. It is clear that D is a convex polyhedron in K^⊥ (just restrict the functionals aⱼ)
and that C = K ⊕ D. Fix an arbitrary x ∈ X. Then P_K x + P_D P_{K^⊥} x ∈ C. Pick arbitrary
k ∈ K and d ∈ D. Then
⟨(k + d) − (P_K x + P_D P_{K^⊥} x), x − (P_K x + P_D P_{K^⊥} x)⟩ = ⟨d − P_D P_{K^⊥} x, P_{K^⊥} x − P_D P_{K^⊥} x⟩ ≤ 0,
since k − P_K x ∈ K is orthogonal to x − P_K x − P_D P_{K^⊥} x ∈ K^⊥,
and we are done by Theorem 3.2.1.(i). ∎
3.3.7 Balls
Example 3.3.17 Let C be the unit ball B_X. Then for every x ∈ X:
P_C(x) = x, if ‖x‖ ≤ 1;
x/‖x‖, otherwise.
Proof. Check Theorem 3.2.1.(i). ∎
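In code, Example 3.3.17 amounts to a radial rescaling (sketch with an arbitrary test point):

```python
import numpy as np

def proj_unit_ball(x):
    """Projection onto the unit ball B_X (Example 3.3.17)."""
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

p = proj_unit_ball(np.array([3.0, 4.0]))
assert np.allclose(p, [0.6, 0.8])          # x/||x|| with ||x|| = 5
assert np.allclose(proj_unit_ball(p), p)   # points of B_X are fixed
```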
3.3.8 Cylinders
Theorem 3.3.18 Suppose e ∈ S_X and r ≥ 0. Suppose further H is the hyperslab {x ∈ X :
|⟨e, x⟩| ≤ 1} and C is the infinite cylinder {x ∈ X : d(x, ℝe) ≤ r}. Let Z be the cylinder
H ∩ C. Then P_Z x = P_C P_H x = P_H P_C x, for every x ∈ X, where:
P_H x = x, if |⟨e, x⟩| ≤ 1;
x − (⟨e, x⟩ − 1)e, if ⟨e, x⟩ > 1;
x − (⟨e, x⟩ + 1)e, if ⟨e, x⟩ < −1;
and
P_C x = x, if ‖x − ⟨e, x⟩e‖ ≤ r;
⟨e, x⟩e + r·(x − ⟨e, x⟩e)/‖x − ⟨e, x⟩e‖, otherwise.
Proof. The formula for P_H follows either from Theorem 3.3.14 or directly with Theorem 3.2.1.(i). The formula for P_C follows by combining Example 3.3.11 with Proposition 3.2.3.(iii). The statement on P_Z is (a somewhat tedious) consequence of Zarantonello's
[165, Lemma 5.10]. Again, this is not needed in the sequel, so we skip the details. ∎
Note that the projection onto an arbitrary cylinder can now be found by using Proposition 3.2.3.
3.3.9 Polar sets
Theorem 3.3.19 Suppose C is a bounded closed convex nonempty subset of X. Let C^⊖
be the polar of C and z ∈ X \ C^⊖. Then
P_{C^⊖} z = z − μP_C(z/μ),
where μ = ⟨P_{C^⊖}z, z − P_{C^⊖}z⟩ is the unique positive solution of
μ = ⟨μP_C(z/μ), z − μP_C(z/μ)⟩.
Proof. The polar set C^⊖ is nothing but the sublevel set of the continuous convex finite
(as C is bounded) function ι_C* corresponding to 1. Since ι_C*(z) > 1 > 0 = ι_C*(0), we apply
Theorem 3.3.25 (which does not rely on this result) and learn that
P_{C^⊖} z = (I + μ∂ι_C*)⁻¹(z),
where μ is (for now) an arbitrary positive solution of
ι_C*((I + μ∂ι_C*)⁻¹(z)) = 1.
Further, P_{C^⊖}z = (I + μ∂ι_C*)⁻¹(z) ⟺ z ∈ (I + μ∂ι_C*)(P_{C^⊖}z) ⟺ (z − P_{C^⊖}z)/μ ∈ ∂ι_C*(P_{C^⊖}z)
⟺ P_{C^⊖}z ∈ ∂ι_C((z − P_{C^⊖}z)/μ) ⟺ (P_{C^⊖}z)/μ ∈ ∂ι_C((z − P_{C^⊖}z)/μ) (because ∂ι_C((z − P_{C^⊖}z)/μ)
is a convex cone) ⟺ P_C(z/μ) = (z − P_{C^⊖}z)/μ (by Theorem 3.2.1.(i)).
Finally, the Fenchel equality ι_C((z − P_{C^⊖}z)/μ) + ι_C*(P_{C^⊖}z) = ⟨P_{C^⊖}z, (z − P_{C^⊖}z)/μ⟩
yields μ = ⟨P_{C^⊖}z, z − P_{C^⊖}z⟩. ∎
Remark 3.3.20 If C is unbounded, then ι_C* is not necessarily finite and so Theorem 3.3.25
(which was our main tool in the proof of Theorem 3.3.19) does not apply (because there is
no "general" Karush/Kuhn/Tucker Theorem for this case). And indeed, the conclusion of
Theorem 3.3.19 is false: Let C be a closed convex unbounded cone. Then there does not exist
μ > 0 such that μ = ⟨μP_C(z/μ), z − μP_C(z/μ)⟩, because the right hand side of this equation
equals 0 (Theorem 3.3.3 and the fact that the polar set of a cone coincides with its polar
cone). Nevertheless, the formula P_{C^⊖}z = z − μP_C(z/μ)
remains true for every μ > 0, since μP_C(z/μ) = μ(1/μ)P_C z = P_C z (Proposition 3.2.3.(ii)).
3.3.10 Epigraphs
Computing the projection onto the epigraph (defined in the Glossary) of a convex function
usually requires solving a nonlinear inclusion:
Theorem 3.3.21 Suppose f is a continuous convex finite function on X. Let C be the
epigraph of f and (z, t) ∈ (X × ℝ) \ C. Then P_C(z, t) = (x, f(x)), where x is the unique
solution of
z ∈ x + (f(x) − t)∂f(x).
Proof. Let (x, r) = P_C(z, t). Then (x, r) is the (unique) argmin of the convex program
min_{(x,r)∈X×ℝ} ½‖x − z‖² + ½(r − t)² subject to f(x) − r ≤ 0.
Slater's condition clearly holds for this program; hence, by Fact 2.2.25, (x, r) is characterized
by feasibility and the existence of a Lagrange multiplier μ ≥ 0 with
z ∈ x + μ∂f(x), μ = r − t, and μ(f(x) − r) = 0.
Now μ > 0 (because (z, t) ∉ C), hence f(x) = r, which further yields μ = f(x) − t. ∎
Remark 3.3.22 Let X := ℝᴺ and f(x) := ½Σ_{i=1}^N aᵢxᵢ², ∀x ∈ X. Then ∇f(x) = (aᵢxᵢ)_{i=1}^N.
Thus given an arbitrary (z, t) ∈ (X × ℝ) \ epi(f), we have to solve the equations xᵢ =
zᵢ − (f(x) − t)aᵢxᵢ for x. Let k := f(x). Then xᵢ = zᵢ/((k − t)aᵢ + 1), ∀i, and k is thus
given implicitly by
k = ½ Σ_{i=1}^N aᵢzᵢ²/((k − t)aᵢ + 1)².
At least numerically, we can try to solve this equation for k.
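One simple numerical option, sketched below with arbitrary data a, z, t: since the right-hand side of the implicit equation is decreasing in k for k > t while the left-hand side is increasing, bisection on [t, f(z)] locates the unique root.

```python
import numpy as np

# Projecting (z, t) onto epi(f) for f(x) = (1/2) * sum(a_i x_i^2),
# following Remark 3.3.22: x_i = z_i / ((k - t) a_i + 1) with k = f(x)
# determined by k = (1/2) * sum(a_i z_i^2 / ((k - t) a_i + 1)^2).
a = np.array([1.0, 2.0, 4.0])
z = np.array([1.0, 1.0, 1.0])
t = -1.0                                     # (z, t) lies below the graph

f = lambda x: 0.5 * (a @ x**2)
h = lambda k: k - 0.5 * (a @ (z / ((k - t) * a + 1.0))**2)

lo, hi = t, f(z)                             # h(lo) < 0 <= h(hi)
for _ in range(100):                         # bisection
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if h(mid) < 0 else (lo, mid)

k = 0.5 * (lo + hi)
x = z / ((k - t) * a + 1.0)
assert abs(f(x) - k) < 1e-9                  # k = f(x): (x, k) is on the graph
```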
The next example shows the (surprising) difficulty of this problem even when N = 1.
Example 3.3.23 (parabola) Let f(x) := x², ∀x ∈ ℝ and C be the epigraph of f. If
(z, t) ∈ ℝ² \ epi(f), then P_C(z, t) = (x, x²), where x is the unique real solution of
2x³ + (1 − 2t)x − z = 0.
Remark 3.3.24 In the context of Example 3.3.23, we have to find the root of the cubic
polynomial g, where g(x) := x³ + ((1 − 2t)/2)x − z/2. The function g is convex (resp. concave)
right (resp. left) from the origin and has nice differentiability properties. Note also that
g(0) = −z/2 and g(z) = z(z² − t); thus, if z ≠ 0 (the case z = 0 is trivial: P_C(0, t) = (0, 0)),
then g(0) and g(z) have opposite signs. Under these assumptions, Newton's method will
find the solution without difficulty. The iteration step is given by
x_{n+1} := xₙ − g(xₙ)/g′(xₙ) = (4xₙ³ + z)/(6xₙ² + 1 − 2t),
for every n ≥ 0.
More generally, let m be an integer greater than 1 and apply Theorem 3.3.21 to the function
f(x) := xᵐ, ∀x ∈ ℝ. The resulting equation is of degree 2m − 1 and thus not easy to solve.
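For the parabola of Example 3.3.23, the Newton iteration of Remark 3.3.24 takes only a few lines (the data (z, t) below are arbitrary):

```python
# Newton's method for the cubic of Example 3.3.23 / Remark 3.3.24:
# g(x) = x^3 + ((1 - 2t)/2) x - z/2, whose root x gives P_C(z,t) = (x, x^2).
z, t = 2.0, 0.0                      # (z, t) outside epi(x^2) since t < z^2

g  = lambda x: x**3 + 0.5 * (1.0 - 2.0 * t) * x - 0.5 * z
dg = lambda x: 3.0 * x**2 + 0.5 * (1.0 - 2.0 * t)

x = z                                # start at z; g(0) and g(z) have opposite signs
for _ in range(50):
    x -= g(x) / dg(x)

assert abs(g(x)) < 1e-12             # x solves 2x^3 + (1 - 2t)x - z = 0
```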
3.3.11 Sublevel sets
The projection onto a sublevel set of a convex function can be expressed using a resolvent
of the subdifferential:
Theorem 3.3.25 Suppose f is a continuous convex finite function on X. Suppose further
z ∈ X and let C := {x ∈ X : f(x) ≤ t} be the sublevel set of f corresponding to t, where
f(z) > t > inf f(X). Then
P_C z = (I + μ∂f)⁻¹(z),
where μ is an arbitrary positive solution of f((I + μ∂f)⁻¹(z)) = t.
Proof. The point P_C z is the unique argmin of the convex program
min_{x∈X} ½‖x − z‖² subject to f(x) − t ≤ 0.
Slater's condition holds; hence, by Fact 2.2.25, there exists a Lagrange multiplier μ ≥ 0 with
z ∈ P_C z + μ∂f(P_C z) and μ(f(P_C z) − t) = 0.
Since z ∉ C, we conclude μ > 0. Thus f(P_C z) = t and z ∈ (I + μ∂f)(P_C z). But the
resolvent (I + μ∂f)⁻¹ is single-valued (Fact 2.5.4) and we are done. ∎
Example 3.3.26 Suppose X is a Euclidean space and f(x) := ½⟨Ax, x⟩ + ⟨b, x⟩, ∀x ∈ X,
where A is some symmetric positive semi-definite operator on X and b ∈ X. Suppose
further z ∈ X, f(z) > t > inf f(X), and let C be the sublevel set of f corresponding to t.
Then P_C z is the unique solution of
P_C z = (I + μA)⁻¹(z − μb) and f(P_C z) = t.
Proof. Follows readily from Theorem 3.3.25, since ∇f(x) = Ax + b. ∎
Remark 3.3.27 Since an ellipse is nothing but the (boundary of a) sublevel set of a
quadratic function, we can use Example 3.3.26 to find its projection. Let X be ℝᴺ,
f(x) := ½Σ_{i=1}^N aᵢxᵢ², ∀x ∈ X, and 0 = inf f(X) < t < f(z), for some z ∈ X and
a₁, …, a_N ≥ 0. Denote the sublevel set of f corresponding to t by C. If x := P_C z,
then xᵢ = zᵢ/(1 + μaᵢ), ∀i, where
½ Σ_{i=1}^N aᵢzᵢ²/(1 + μaᵢ)² = t.
The last equation has a unique solution and its LHS is a nice convex function in μ. The
equation can be solved explicitly for N = 1 and for aᵢ ≡ a (in which case C becomes just
a ball, not very interesting). It appears that for all other cases, one is forced to find
solutions numerically. Indeed, even for an ellipse in the Euclidean plane (N = 2), one is
already led to a quartic equation.
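Numerically, the monotone dependence on μ makes a bracket-and-bisect scheme dependable; the following sketch (arbitrary ellipse data with N = 2) is illustrative only:

```python
import numpy as np

# Projection onto C = {x : f(x) <= t}, f(x) = (1/2) sum(a_i x_i^2), per
# Remark 3.3.27: x_i = z_i/(1 + mu a_i), with mu > 0 solving f(x(mu)) = t.
a = np.array([1.0, 4.0])                     # an ellipse in the plane
z = np.array([3.0, 2.0])
t = 1.0                                      # f(z) = 12.5 > t > 0 = inf f

phi = lambda mu: 0.5 * (a @ (z / (1.0 + mu * a))**2) - t   # decreasing in mu

lo, hi = 0.0, 1.0
while phi(hi) > 0:                           # bracket the root
    hi *= 2.0
for _ in range(100):                         # bisection
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if phi(mid) > 0 else (lo, mid)

x = z / (1.0 + 0.5 * (lo + hi) * a)
assert abs(0.5 * (a @ x**2) - t) < 1e-9      # P_C z lies on the boundary f = t
```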
3.4 Relaxations and quasi-projections
Definition 3.4.1 Suppose C is a closed convex nonempty subset of X. Then for every
z ∈ X, the set
{x ∈ X : ‖x − c‖ ≤ ‖z − c‖, ∀c ∈ C}
is called the relaxation of z with respect to C and denoted R_C(z) or R_C z.
It is clear that the relaxation R_C(z) is a closed convex subset of X. Using Proposition 2.4.8.(iv), one readily verifies the basic inclusions
[z, 2P_C z − z] ⊆ R_C(z) ⊆ B(P_C z, d(z, C))
and the translation formula
R_{a+C}(z) = a + R_C(z − a), ∀a ∈ X.
The set R_C(z) can be strictly bigger than the segment [z, 2P_C z − z] (let C be the unit
ball in ℝ² and consider z := (z₁, 0) with z₁ large); also, it can be strictly smaller than
B(P_C z, d(z, C)):
Proposition 3.4.2 Suppose C is a closed affine subspace of X and z ∈ X. Then the
relaxation R_C(z) consists of elements of the form
P_C z + w, where w ∈ (C − C)^⊥ and ‖w‖ ≤ d(z, C).
Proof. Assume first that C is a subspace and fix an arbitrary x ∈ R_C(z). Consider
the out-of-the-blue c := (1 − t)P_C z + tP_C x ∈ C, with t ∈ ℝ. Squaring, expanding, and
simplifying ‖x − c‖ ≤ ‖z − c‖ yields
(1 − 2t)‖P_C x − P_C z‖² + ‖P_{C^⊥} x‖² ≤ ‖P_{C^⊥} z‖².
Letting t tend to −∞ implies P_C x = P_C z and so ‖P_{C^⊥} x‖ ≤ ‖P_{C^⊥} z‖. It is easy to show
that these two conditions are sufficient for x to be a member of R_C(z). Hence the formula
is verified for the subspace case. The general case follows with the translation formula. ∎
Note that in the class of closed affine subspaces, the condition "R_C(z) = [z, 2P_C z − z],
∀z ∈ X" is equivalent to C being a hyperplane. In other words, R_C very nicely generalizes
the classical notion of a relaxation in the context of relaxation methods for hyperplanes.
Definition 3.4.3 (Baillon and Bruck [lo]) Suppose C is a closed convex nonempty subset
of X and z E X. Then the set
c n Rc(4
is called the quasi-projection of z onto C and denoted Qc(z) or Qcz.
Analogously to the relaxation, we have C ∩ [P_C z, 2P_C z − z] ⊆ Q_C(z) ⊆ C ∩ B(P_C z, d(z, C)) and the translation formula Q_{a+C}(z) = a + Q_C(z − a), for every a ∈ X.
The quasi-projection onto a closed affine subspace becomes particularly nice:
Proposition 3.4.4 Suppose C is a closed affine subspace of X. Then the quasi-projection reduces to the ordinary projection: Q_C(z) = {P_C(z)}, for every z ∈ X.
Proof. Clear from Proposition 3.4.2. ∎
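Proposition 3.4.4 can be watched numerically. The following illustrative sketch (assuming the definition of Q_C via the inequality ‖x − c‖ ≤ ‖z − c‖ for all c ∈ C; the function name quasi_projection_on_line is an invention) takes the affine line C = {(s, 1) : s ∈ ℝ} in ℝ² and finds that only the ordinary projection survives the membership test:

```python
import numpy as np

# C = {(s, 1) : s real}, a closed affine subspace (line) in R^2.
# A candidate x = (s, 1) on C belongs to Q_C(z) = C ∩ R_C(z) iff
# ||x - c|| <= ||z - c|| for every c in C; we test this on a wide
# finite sample of c.  Only P_C(z) = (z_1, 1) passes.
def quasi_projection_on_line(z, tol=1e-9):
    ts = np.linspace(-1e3, 1e3, 4001)
    cs = np.stack([ts, np.ones_like(ts)], axis=1)   # sample of C
    members = []
    for s in np.linspace(0.0, 4.0, 401):            # candidates on C
        x = np.array([s, 1.0])
        if np.all(np.linalg.norm(x - cs, axis=1)
                  <= np.linalg.norm(z - cs, axis=1) + tol):
            members.append(x)
    return members

members = quasi_projection_on_line(np.array([2.0, 5.0]))
print(len(members))   # exactly one member: the projection (2, 1)
```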
3.5 Notes
Even today, the best resource for results on projections is Zarantonello's [165] article from 1971. Zarantonello had a different goal (spectral theory with respect to convex cones) and thus did not extensively study non-conical examples.
This chapter showed that the calculation of projections is not always trivial and sometimes requires sophisticated mathematical tools (Theorem 3.3.19, among others); at other times, the resulting problems can be tackled only numerically. Only a few sets allow nice clean projection formulae; most importantly, hyperplanes, halfspaces, balls, and boxes. Interestingly, although the collection of sets that allow explicit computation of the projection is rather small, there is a universe of projection methods for solving the convex feasibility problem. Concerning the examples, part of the folklore are: Example 3.3.2, Theorem 3.3.3, Example 3.3.5,
Proposition 3.3.8, Example 3.3.9, Example 3.3.10, Example 3.3.11, Example 3.3.12, Example 3.3.13, Example 3.3.15, Example 3.3.17. To the best of my knowledge, all other results are new. We now relate the icecream cone to symmetric positive semi-definite matrices.
Remark 3.5.1 It is interesting to note that for X = ℝ², the icecream cone C₁ in ℝ³ = X × ℝ from Theorem 3.3.6 corresponds directly to the cone of positive semi-definite matrices in the (Euclidean) space of 2 × 2 real symmetric matrices: indeed, denoting the standard basis in ℝ³ by e₁, e₂, e₃, an arbitrary element of the icecream cone C₁ is precisely of the form
x₁e₁ + x₂e₂ + x₃e₃, where x₁² + x₂² ≤ x₃² and x₃ ≥ 0.
Now consider an arbitrary vector y₁f₁ + y₂f₂ + y₃f₃ with respect to the new basis f₁ := (e₁ + e₃)/2, f₂ := e₂, and f₃ := (e₃ − e₁)/2. This vector belongs to C₁ exactly when
(y₁ − y₃)²/4 + y₂² ≤ (y₁ + y₃)²/4 and (y₁ + y₃)/2 ≥ 0.
But this condition is equivalent to: y₂² ≤ y₁y₃ and y₁ + y₃ ≥ 0. Finally, the last equivalence says exactly that the real symmetric matrix
( y₁  y₂ )
( y₂  y₃ )
is positive semi-definite.
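The asserted correspondence can be spot-checked numerically. The sketch below (assuming numpy; the function names are inventions for this check) compares the cone condition, in the trace/determinant form just derived, with a positive semi-definiteness test via eigenvalues on random samples:

```python
import numpy as np

def in_icecream_coords(y1, y2, y3):
    # cone membership in the basis (f1, f2, f3): the condition
    # y2^2 <= y1*y3 together with y1 + y3 >= 0 (determinant and trace)
    return (y2 * y2 <= y1 * y3) and (y1 + y3 >= 0.0)

def is_psd(y1, y2, y3, tol=1e-12):
    # PSD test for the 2x2 symmetric matrix [[y1, y2], [y2, y3]]
    M = np.array([[y1, y2], [y2, y3]])
    return bool(np.all(np.linalg.eigvalsh(M) >= -tol))

rng = np.random.default_rng(0)
samples = rng.uniform(-2.0, 2.0, size=(10_000, 3))
assert all(in_icecream_coords(*y) == is_psd(*y) for y in samples)
print("correspondence verified on 10000 random samples")
```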
The cone of positive semi-definite matrices is of utmost importance in semi-definite programming. For detailed information on Convex Analysis on the Hermitian and symmetric matrices, the reader is referred to Lewis's [105]; this article also contains the tools to establish the following:
Example 3.5.2 Suppose C is the cone of positive semi-definite matrices in the Euclidean space ℋ of Hermitian N × N matrices equipped with the inner product ⟨X, Y⟩ = trace(XY), for all X, Y ∈ ℋ. Then C^⊕ = C and the projection is given by
P_C(X) = U*D₊U, whenever X = U*DU with U unitary and D diagonal,
where D₊ denotes the nonnegative part of D.
Proof. Let λ be the eigenvalue map on ℋ, which sends a Hermitian matrix to its eigenvalues ordered decreasingly in ℝᴺ. Further let f be the indicator function of the nonnegative orthant in ℝᴺ. One central result of [105] is the beautiful conjugation formula (f ∘ λ)* = f* ∘ λ, which readily yields C^⊕ = C. Now let X = U*DU ∈ ℋ with U unitary and D diagonal. Write D = D₊ − D₋, where (D₊)_{ij} := (D_{ij})₊, for all i, j, and D₋ := D₊ − D. Clearly, U*D₊U ∈ C and ⟨U*D₊U, U*DU − U*D₊U⟩ = trace((U*D₊U)(−U*D₋U)) = −trace(D₊D₋) = 0. Hence ⟨Y − U*D₊U, U*DU − U*D₊U⟩ = ⟨Y, −U*D₋U⟩ ≤ 0, for every Y ∈ C, and we are done (once more by Theorem 3.2.1.(i)). ∎
The analogous result holds for the real symmetric matrices. The fact that the cone of
positive semi-definite Hermitian matrices is self-dual (C^⊕ = C) is known as Fejér's theorem; see [92, Corollary 7.5.4].
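The projection of Example 3.5.2 is a one-liner in practice. Here is a minimal numerical sketch (assuming numpy; project_psd is an invented name): the spectrum is computed with eigh, the negative eigenvalues are clipped, and the variational characterization ⟨Y − P, X − P⟩ ≤ 0 from Theorem 3.2.1.(i) is verified on a random instance.

```python
import numpy as np

def project_psd(X):
    # projection onto the PSD cone: X = U diag(d) U*, keep d_+ = max(d, 0)
    d, U = np.linalg.eigh(X)
    return (U * np.clip(d, 0.0, None)) @ U.conj().T

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
X = (A + A.conj().T) / 2                 # a random Hermitian matrix
P = project_psd(X)
print(np.all(np.linalg.eigvalsh(P) >= -1e-10))   # P is PSD: True
# variational characterization: <Y - P, X - P> <= 0 for every PSD Y
B = rng.standard_normal((4, 4))
Y = B @ B.T                              # a random PSD matrix
print(np.real(np.trace((Y - P).conj().T @ (X - P))) <= 1e-10)   # True
```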
The notion of a quasi-projection (Definition 3.4.3) was used by Baillon and Bruck [10] to investigate the location of possible limit points of a sequence. Similarly, we will find that some results on Fejér monotone sequences in Section 6.2 are well formulated in the language of quasi-projections and relaxations (Definition 3.4.1).
Chapter 4
Regularity for two Sets
4.1 Overview
The results in this chapter are central for building projection methods that generate norm convergent or even linearly convergent sequences. This is achieved by imposing a variation of an extremely intuitive geometric condition, called regularity, upon the constraint sets. Roughly, two sets are regular if, "whenever you are close to each of the sets, then the intersection cannot be too far away."
There are two ostensibly quite different looking conditions that imply bounded linear regularity of two sets C1, C2: (i) the interior of one set intersects the other: (C1 ∩ int C2) ∪ (C2 ∩ int C1) ≠ ∅; (ii) their difference C2 − C1 is a closed subspace.
The key result is striking: these two conditions are nothing but special incarnations of the universal constraint qualification 0 ∈ sri(C2 − C1) (Theorem 4.4.3). The analysis is carried out in the very potent framework of (set-valued) convex analysis.
The key result is then particularized and discussed in more detail; moreover, it is sharp: counter-examples are mostly built around the icecream cone.
4.2 Basic properties
Definition 4.2.1 Suppose C1, C2 are closed convex subsets of X with C := C1 ∩ C2 ≠ ∅. We say that {C1, C2} is . . .
(i) regular, if max{d(x_n, C1), d(x_n, C2)} → 0 implies d(x_n, C) → 0, for every sequence (x_n) in X.
(ii) boundedly regular, if max{d(x_n, C1), d(x_n, C2)} → 0 implies d(x_n, C) → 0, for every bounded sequence (x_n) in X.
(iii) linearly regular, if there exists κ > 0 such that d(x, C) ≤ κ max{d(x, C1), d(x, C2)}, for every x ∈ X.
(iv) boundedly linearly regular, if for every bounded set S, there exists κ_S > 0 such that d(x, C) ≤ κ_S max{d(x, C1), d(x, C2)}, for every x ∈ S.
The following implications are immediate from the definitions:
linearly regular  ⇒  boundedly linearly regular
      ⇓                        ⇓
    regular       ⇒  boundedly regular.
Proposition 4.2.2 Suppose C1, C2 are closed convex subsets of X with nonempty intersection. If C1 or C2 is boundedly compact, then {C1, C2} is boundedly regular.
Proof. (See also [15, Theorem 3.9].) We assume WLOG that C1 is boundedly compact. (Bounded compactness is defined in the Glossary.) Suppose to the contrary that {C1, C2} is not boundedly regular. Then there exist δ > 0 and a bounded sequence (x_n) with
max{d(x_n, C1), d(x_n, C2)} → 0 but d(x_n, C1 ∩ C2) ≥ δ, for all n;
after passing to a subsequence, we may assume that (x_n) converges weakly to some x̄ ∈ X. Hence x̄ ∈ C1 ∩ C2 (by weak lower semi-continuity of d(·, C1) and d(·, C2)) and (P_{C1}x_n) converges weakly to x̄.
Claim: (P_{C1}x_n) converges in norm to x̄.
Otherwise, we could find a subsequence (k_n) of (n) such that P_{C1}x_{k_n} → y, for some y ≠ x̄ (by bounded compactness of C1). But this would contradict the weak convergence of the sequence (P_{C1}x_n) to x̄. The claim thus holds.
The claim and d(x_n, C1) → 0 imply that (x_n) converges in norm to x̄. This yields d(x_n, C1 ∩ C2) → d(x̄, C1 ∩ C2) = 0, the desired contradiction. ∎
The conclusion of Proposition 4.2.2 cannot be strengthened to any of the other forms of
regularity; see Example 4.8.3.
Corollary 4.2.3 Suppose C1, C2 are closed convex subsets of X with nonempty intersection. If span C1 or span C2 is finite-dimensional, then {C1, C2} is boundedly regular.
Most of the next two results is due to Adrian Lewis [104]:
Proposition 4.2.4 (Lewis) Suppose C1, C2 are closed convex subsets of X with nonempty intersection and y is a point in X with d(y, C1 ∩ C2) > 1. Let ŷ be the point on the segment [P_{C1∩C2}(y), y] that is distance 1 away from C1 ∩ C2. Then:
max{d(ŷ, C1), d(ŷ, C2)} ≤ max{d(y, C1), d(y, C2)} / d(y, C1 ∩ C2).
Proof. After translation, we can assume WLOG that P_{C1∩C2}(y) = 0; then t := ‖y‖ = d(y, C1 ∩ C2) > 1 = ‖ŷ‖, where ŷ := y/‖y‖. We can also assume WLOG that max{d(ŷ, C1), d(ŷ, C2)} = d(ŷ, C1). Let H1 be the hyperplane
{x ∈ X : ⟨x, ŷ − P_{C1}ŷ⟩ ≤ ⟨P_{C1}ŷ, ŷ − P_{C1}ŷ⟩}.
Then H1 contains C1 (and hence 0, since 0 ∈ C1), and the distance formula in Example 3.3.13 yields
d(y, H1) = d(tŷ, H1) ≥ t · d(ŷ, H1) = t · d(ŷ, C1).
Hence
max{d(y, C1), d(y, C2)} ≥ d(y, C1) ≥ d(y, H1) ≥ d(y, C1 ∩ C2) · max{d(ŷ, C1), d(ŷ, C2)}. ∎
Remark 4.2.5 The proof of Proposition 4.2.4 actually shows the following: along the segment [P_{C1∩C2}(y), y], the quotient
max{d(·, C1), d(·, C2)} / d(·, C1 ∩ C2)
is decreasing as one moves toward C1 ∩ C2.
Theorem 4.2.6 Suppose C1, C2 are closed convex subsets of X with bounded nonempty intersection.
(i) If {C1, C2} is boundedly regular, then {C1, C2} is regular.
(ii) (Lewis) If {C1, C2} is boundedly linearly regular, then {C1, C2} is linearly regular.
Proof. (i): Suppose to the contrary that {C1, C2} is not regular. Then there is a sequence (x_n) with max{d(x_n, C1), d(x_n, C2)} → 0 but lim_n d(x_n, C1 ∩ C2) > 0. Since {C1, C2} is boundedly regular, the sequence (‖x_n‖) tends to +∞. Thus, by boundedness of C1 ∩ C2, the sequence d(x_n, C1 ∩ C2) tends to infinity. We assume WLOG that d(x_n, C1 ∩ C2) > 1, for all n. Let x̂_n be the point on the segment [P_{C1∩C2}(x_n), x_n] that is distance 1 away from C1 ∩ C2, for all n. By assumption and Proposition 4.2.4,
max{d(x̂_n, C1), d(x̂_n, C2)} ≤ max{d(x_n, C1), d(x_n, C2)} / d(x_n, C1 ∩ C2).
Hence max{d(x̂_n, C1), d(x̂_n, C2)} → 0. But the sequence (x̂_n) is bounded, so bounded regularity of {C1, C2} implies the absurdity 1 ≡ d(x̂_n, C1 ∩ C2) → 0. (ii): Let S be the bounded set (C1 ∩ C2) + B_X. Obtain κ_S > 0 such that d(x, C1 ∩ C2) ≤ κ_S max{d(x, C1), d(x, C2)}, for all x ∈ S. By Proposition 4.2.4, this inequality holds true on X. ∎
Both parts of Theorem 4.2.6 are false without the assumption of a bounded intersection;
see Example 4.8.3 and Example 4.8.4.
Corollary 4.2.7 Suppose C1, C2 are closed convex subsets of X with bounded nonempty intersection. If C1 or C2 is boundedly compact, then {C1, C2} is regular.
Proof. Combine Proposition 4.2.2 and Theorem 4.2.6. ∎
4.3 Two cones
Theorem 4.3.1 Suppose C1, C2 are closed convex cones in X. Then TFAE:
(i) {C1, C2} is linearly regular.
(ii) {C1, C2} is boundedly linearly regular.
(iii) {C1, C2} is regular.
Moreover, if C1 ∩ C2 = {0}, then (i)-(iii) are also equivalent to
(iv) {C1, C2} is boundedly regular.
Proof. "(i)=+-(ii))' and "(i)+(iii)" are clear. "(i)e(ii)": There exists K > 0 such that
Now use homogeneity (Proposition 3.2.4) to see that this estimate holds true on the entire
space X. "(i)+=(iii)": Suppose not. Then there exists a sequence (x,) in X such that
Let y, := x,/(n max{d(x,, C1), d(xn, C2))), Vn > 1. Using homogeneity (Proposition 3.2.4)
once more, we see that max{d(yn, C1), d(y,, C2)) = l l n + 0 despite d(yn, Cl nC2) > 1. This
contradicts the regularity of {C1, C2). "Moreover": (iii) implies (iv). If conversely {C1, C2)
is boundedly regular with Cl n C2 = {0), then {Cl, C2) is regular by Theorem 4.2.6.
Theorem 4.5.1 shows that for two closed subspaces, bounded regularity always does imply regularity. This is not true for closed convex cones: it does happen that (iv) holds in the absence of (i)-(iii); see Example 4.8.3. Thus the assumption "C1 ∩ C2 = {0}" in the "Moreover" part of Theorem 4.3.1 is important.
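The homogeneity property d(λx, C) = λ d(x, C), used twice in the proof of Theorem 4.3.1, is easy to observe. A minimal sketch (assuming numpy; the cone ℝ²₊ is chosen because its distance function has a closed form):

```python
import numpy as np

def d_pos_quadrant(x):
    # distance to the closed convex cone R^2_+ : norm of the negative part
    return float(np.linalg.norm(np.minimum(x, 0.0)))

x = np.array([-3.0, 2.0])
for lam in (0.0, 0.5, 2.0, 10.0):
    assert np.isclose(d_pos_quadrant(lam * x), lam * d_pos_quadrant(x))
print("d(lam*x, C) = lam * d(x, C) verified for a sample point")
```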
Proposition 4.3.2 Suppose C1, C2 are closed convex cones in X. If {C1, C2} is boundedly regular, then γ(C1, C2) > 0.
Proof. Suppose not. Then there exist sequences (c₁^(n)) in C1 ∩ (C1 ∩ C2)^⊖ and (c₂^(n)) in C2 ∩ (C1 ∩ C2)^⊖ with ‖c₁^(n)‖ = ‖c₂^(n)‖ = 1 and ⟨c₁^(n), c₂^(n)⟩ → 1. Thus (by expanding ‖c₁^(n) − c₂^(n)‖²) c₁^(n) − c₂^(n) → 0. Now define x^(n) := (c₁^(n) + c₂^(n))/2, for every n ≥ 1. Note that the sequence (x^(n)) lies in (C1 ∩ C2)^⊖. Then x^(n) − c₁^(n) → 0 and x^(n) − c₂^(n) → 0; hence max{d(x^(n), C1), d(x^(n), C2)} → 0. However, using Theorem 3.3.3 and ‖x^(n)‖² = 1/2 + (1/2)⟨c₁^(n), c₂^(n)⟩ → 1, we contradict bounded regularity of {C1, C2} by d(x^(n), C1 ∩ C2) = ‖P_{(C1∩C2)^⊖}(x^(n))‖ = ‖x^(n)‖ → 1. ∎
4.4 Guaranteeing bounded linear regularity
The development in this section follows largely [15, Section 4].
Proposition 4.4.1 Suppose A is a continuous linear operator from X to another Hilbert space Y, C is a closed convex subset of X, D is a closed convex subset of Y. Suppose further there exist z̄ ∈ C ∩ A^{-1}(D) and δ > 0 such that
δB_Y ⊆ A(C ∩ B(z̄, 1)) − D.
Then:
(i) d(y, AC ∩ D) ≤ (1/δ)(‖y − Az̄‖ + ‖A‖) d(y, D), for every y ∈ AC.
(ii) d(x, C ∩ A^{-1}(D)) ≤ (1/δ)(‖x − z̄‖ + 1) d(Ax, D), for every x ∈ C.
Proof. Fix x ∈ C and let y := Ax. We may assume WLOG that x ∉ A^{-1}(D). By hypothesis, there exist c ∈ C ∩ B(z̄, 1) and d ∈ D such that
δ(P_D Ax − Ax)/‖P_D Ax − Ax‖ = Ac − d.
Now define
λ := ‖P_D Ax − Ax‖/(‖P_D Ax − Ax‖ + δ) ∈ ]0, 1[  and  x̃ := (1 − λ)x + λc ∈ C.
Then one checks that
Ax̃ = (1 − λ)P_D Ax + λd ∈ D.
Hence x̃ ∈ C ∩ A^{-1}(D) and Ax̃ ∈ AC ∩ D. Thus (i) is clear from
‖y − Ax̃‖ = λ‖A(x − c)‖ ≤ (1/δ)(‖y − Az̄‖ + ‖A‖) d(y, D).
Arguing similarly, we always have
‖x − x̃‖ = λ‖x − c‖ ≤ (1/δ)(‖x − z̄‖ + 1) d(Ax, D),
which yields (ii). ∎
Proposition 4.4.2 Suppose C1 and C2 are closed convex subsets of X with nonempty intersection. Suppose further that there exist z̄ ∈ C1 and κ, r > 0 with d(x, C1 ∩ C2) ≤ κ d(x, C2), for every x ∈ C1 ∩ B(z̄, 2r). Then:
d(x, C1 ∩ C2) ≤ (2κ + 1) max{d(x, C1), d(x, C2)}, for every x ∈ B(z̄, r).
Proof. Claim 1: d(x, C1 ∩ C2) ≤ (2κ + 1) max{d(x, C1 ∩ B(z̄, 2r)), d(x, C2)}, for every x ∈ X.
Define a real-valued function f on X by f(x) := κ max{d(x, C1), d(x, C2)} − d(x, C1 ∩ C2). Check that f is (κ + 1)-Lipschitz on X (distance functions are nonexpansive; hence so is max{d(·, C1), d(·, C2)}). Now denote momentarily C1 ∩ B(z̄, 2r) by S. Then f(s) − f(x) ≤ (κ + 1)‖s − x‖, for every s ∈ S; hence inf f(S) ≤ f(x) + (κ + 1)d(x, S), or
κ max{d(x, C1), d(x, C2)} − d(x, C1 ∩ C2) + (κ + 1)d(x, C1 ∩ B(z̄, 2r)) ≥ inf f(S), for every x ∈ X.
The last infimum is nonnegative by assumption; hence κ max{d(x, C1), d(x, C2)} − d(x, C1 ∩ C2) + (κ + 1)d(x, C1 ∩ B(z̄, 2r)) ≥ 0, for every x ∈ X. Claim 1 follows.
Claim 2: d(x, C1) = d(x, C1 ∩ B(z̄, 2r)), for every x ∈ B(z̄, r).
For such a point x, we clearly have ‖x − P_{C1}x‖ ≤ ‖x − z̄‖ ≤ r; so ‖P_{C1}x − z̄‖ ≤ ‖P_{C1}x − x‖ + ‖x − z̄‖ ≤ 2r, i.e., P_{C1}x ∈ C1 ∩ B(z̄, 2r), and Claim 2 thus holds. The result now follows by combining the two claims. ∎
Theorem 4.4.3 Suppose A is a continuous linear operator from X to another Hilbert space Y, C is a closed convex subset of X, and D is a closed convex subset of Y. Suppose further 0 ∈ sri(AC − D). Then:
(i) {AC, D} is boundedly linearly regular.
(ii) {C, A^{-1}(D)} is boundedly linearly regular.
Proof. Define a set-valued map R from X to Y by
Rx := Ax − D, if x ∈ C;  Rx := ∅, otherwise.
Note that ran R = AC − D and hence 0 ∈ sri(ran R). It is straightforward to check that R is a closed convex relation. Then, by Theorem 2.6.6 (with ε = 1), there exists δ > 0 such that for an arbitrary but fixed z̄ ∈ R^{-1}(0) = C ∩ A^{-1}(D):
δB_Y ⊆ A(C ∩ B(z̄, 1)) − D.
Now Proposition 4.4.1.(i) yields d(y, AC ∩ D) ≤ (1/δ)(‖Az̄ − y‖ + ‖A‖)d(y, D), for every y ∈ AC. Fix an arbitrary r > 0. Then d(y, AC ∩ D) ≤ (1/δ)(2r + ‖A‖)d(y, D), for every y ∈ AC ∩ B(Az̄, 2r). Hence, by Proposition 4.4.2,
d(y, AC ∩ D) ≤ (2(2r + ‖A‖)/δ + 1) max{d(y, AC), d(y, D)}, for every y ∈ B(Az̄, r).
Thus (i) is verified; (ii) follows similarly (by use of Proposition 4.4.1.(ii)). ∎
Corollary 4.4.4 Suppose C1, C2 are closed convex subsets of X with nonempty intersection. Then {C1, C2} is boundedly linearly regular whenever one of the following conditions holds:
(i) (C1 ∩ int C2) ∪ (C2 ∩ int C1) ≠ ∅.
(ii) 0 ∈ int(C1 − C2).
(iii) 0 ∈ core(C1 − C2).
(iv) 0 ∈ sri(C1 − C2).
(v) C1 − C2 is a closed subspace.
Proof. Clearly, (i)-(iv) are increasingly less restrictive and (iv) follows from Theorem 4.4.3. To catch (v), recall Proposition 2.6.5. ∎
Corollary 4.4.4 is sharp in the sense that one can expect neither linear regularity nor regularity even when (i) holds; see Example 4.8.4.
Remark 4.4.5 Concerning the conditions (i)-(v) of Corollary 4.4.4, we record: (i) is more restrictive than (ii) (consider C1 := C2 := ℓ²₊ in ℓ²). (ii) and (iii) are the same (Corollary 2.6.7), but (iii) is easier to check. (iii) is clearly more restrictive than (iv) (consider C1 := C2 := {0} in ℝ). If C1 and C2 are subspaces, then (iv) and (v) are the same (Theorem 4.5.1).
4.5 Two subspaces
The situation for two subspaces is extremely satisfactory:
Theorem 4.5.1 Suppose C1, C2 are closed subspaces of X. Then TFAE:
(i) γ(C1, C2) > 0.
(ii) C1 + C2 is closed.
(iii) C1^⊥ + C2^⊥ is closed.
(iv) 0 ∈ sri(C1 − C2).
(v) {C1, C2} is (boundedly) (linearly) regular.
Proof. (See also [14, Proposition 5.16].) "(i)⇒(ii)": By Corollary 2.6.13, the sum [C1 ∩ (C1 ∩ C2)^⊥] + [C2 ∩ (C1 ∩ C2)^⊥] is closed, which is equivalent (by Proposition 2.6.15) to the closedness of C1 + C2. "(ii)⇔(iii)": Proposition 2.6.16. "(ii)⇒(iv)": Proposition 2.6.5. "(iv)⇒(v)": By Corollary 4.4.4, the set {C1, C2} is boundedly linearly regular; equivalently, linearly regular or regular (Theorem 4.3.1); in particular, boundedly regular. "(v)⇒(i)": Proposition 4.3.2. ∎
Remark 4.5.2 Since γ(C1, C2) = γ(C1 ∩ (C1 ∩ C2)^⊥, C2 ∩ (C1 ∩ C2)^⊥), we could double the number of items in Theorem 4.5.1 by replacing "C_i" (i = 1, 2) by (the no larger) "C_i ∩ (C1 ∩ C2)^⊥"; consequently, what really matters are the relations between these possibly smaller subspaces. Example 4.8.1 shows: (bounded) (linear) regularity for two closed subspaces is not automatic. A consequence of Theorem 4.5.1 is that two closed subspaces are (boundedly) (linearly) regular if and only if their orthogonal complements are. This goes very wrong even for simple closed cones; see Example 4.8.2.
Proposition 4.5.3 Suppose C1, C2 are closed subspaces of X. If C1 ∩ (C1 ∩ C2)^⊥ or C2 ∩ (C1 ∩ C2)^⊥ is finite-dimensional, then {C1, C2} is linearly regular.
Proof. Corollary 4.2.3 implies that {C1 ∩ (C1 ∩ C2)^⊥, C2 ∩ (C1 ∩ C2)^⊥} is boundedly regular; equivalently, by Theorem 4.5.1, {C1 ∩ (C1 ∩ C2)^⊥, C2 ∩ (C1 ∩ C2)^⊥} is linearly regular. Now recall Remark 4.5.2. ∎
We easily derive a very classical result (see, for instance, [95, Proposition 20.1]):
Corollary 4.5.4 In a Hilbert space, the sum of a closed and a finite-dimensional subspace is always closed.
Proof. Proposition 4.5.3 and Theorem 4.5.1. ∎
4.6 Finite-dimensional results
It does not come as a big surprise, but it is nonetheless very satisfying, that the usual constraint qualifications guarantee bounded linear regularity:
Proposition 4.6.1 Suppose C1, C2 are closed convex subsets of a Euclidean space X. If ri C1 ∩ ri C2 ≠ ∅, then {C1, C2} is boundedly linearly regular.
Proof. sri(C1 − C2) = ri(C1 − C2) = ri(C1) − ri(C2) ∋ 0, by Remark 2.6.4.(ii) and [127, Corollary 6.6.2]; now apply Corollary 4.4.4. ∎
Proposition 4.6.1 is sharp in the following sense: it is not true that the (less restrictive) assumption "(C1 ∩ ri C2) ∪ (C2 ∩ ri C1) ≠ ∅" implies bounded linear regularity; see Example 4.8.3.
Proposition 4.6.2 Suppose C1 is a convex polyhedron and C2 is a closed convex set in a Euclidean space X. If C1 ∩ ri C2 ≠ ∅, then {C1, C2} is boundedly linearly regular.
Proof. WLOG (translate) 0 ∈ C1 ∩ ri C2. Let Y := span C2. Note that if S is an arbitrary closed convex subset of Y, then
d²(x, S) = d²(x, Y) + d²(P_Y x, S), for every x ∈ X.
Viewed in Y, we have 0 ∈ C1 ∩ ri C2 = (C1 ∩ Y) ∩ core(C2) = (C1 ∩ Y) ∩ int_Y(C2) (Proposition 2.6.3 and Corollary 2.6.8). Consequently, by Corollary 4.4.4, the set {C1 ∩ Y, C2} is boundedly linearly regular in Y. For an arbitrary but fixed ρ > 0, there exists κ₁ > 0 such that
d²(y, C1 ∩ C2) = d²(y, (C1 ∩ Y) ∩ C2) ≤ κ₁[d²(y, C1 ∩ Y) + d²(y, C2)], for every y ∈ ρB_Y.
(Throughout this proof, we use the equivalence of the max and the Euclidean norm.) WLOG κ₁ ≥ 4. Using ‖P_Y‖ ≤ 1 (Observation 3.2.2),
d²(P_Y x, C1 ∩ C2) ≤ κ₁[d²(P_Y x, C1 ∩ Y) + d²(P_Y x, C2)], for every x ∈ ρB_X.
On the other hand, {C1, Y} is linearly regular by Fact 2.2.26 and thus we obtain κ₂ > 0 with
d(x, C1 ∩ Y) ≤ κ₂ max{d(x, C1), d(x, Y)}, for every x ∈ X.
Now fix an arbitrary x ∈ ρB_X. Then altogether, since d(P_Y x, C1 ∩ Y) ≤ d(x, Y) + d(x, C1 ∩ Y), d(P_Y x, C2) ≤ d(x, C2), and d(x, Y) ≤ d(x, C2),
d²(x, C1 ∩ C2) = d²(x, Y) + d²(P_Y x, C1 ∩ C2) ≤ d²(x, C2) + κ₁[2d²(x, C2) + 2κ₂² max²{d(x, C1), d(x, C2)} + d²(x, C2)].
Therefore, using κ₁ ≥ 4,
d(x, C1 ∩ C2) ≤ 2√(κ₁(1 + κ₂²)) max{d(x, C1), d(x, C2)}, for every x ∈ ρB_X. ∎
Proposition 4.6.2 is sharp: if merely C1 ∩ C2 ≠ ∅ with C1 polyhedral, then {C1, C2} need not be boundedly linearly regular; see Example 4.8.3. Also, the conclusion cannot be strengthened to (linear) regularity; see Example 4.8.4.
4.7 Linearly constrained feasibility problems
Proposition 4.7.1 Suppose C is a closed convex subset of X, b is a point in a Euclidean space Y, and A is a continuous linear operator from X to Y. Then {C, A^{-1}(b)} is boundedly linearly regular whenever (i) b ∈ ri(AC); or (ii) qri(C) ≠ ∅ and b ∈ A(qri C).
Proof. (See also [15, Theorem 5.3 and Remarks 5.4].) Condition (i) is equivalent to 0 ∈ sri(AC − {b}); apply Theorem 4.4.3.(ii). Note that qri C ≠ ∅ implies A(qri C) = ri(AC) (by [22, Proposition 2.10]) and the result follows. ∎
4.8 Limiting examples
Example 4.8.1 Let X be a Hilbert space with normalized Schauder basis (u_n)_{n≥1}. Suppose (γ_n)_{n≥1} is a sequence of "angles" in ]0, π/2[ with γ_n → 0. Define
e_n := u_{2n−1} and f_n := cos(γ_n)u_{2n−1} + sin(γ_n)u_{2n}, for every n ≥ 1.
Note that span{e_n, f_n} = span{u_{2n−1}, u_{2n}} and hence span{e_n, f_n : n ≥ 1} = X. Define further
C1 := cl span{e_n : n ≥ 1} and C2 := cl span{f_n : n ≥ 1}.
Then C1 + C2 is dense in X, as is C1^⊥ + C2^⊥ (for analogous reasons), and hence C1 ∩ C2 = {0}. Since
cos γ(C1, C2) ≥ sup_{n≥1} ⟨e_n, f_n⟩ = sup_{n≥1} cos γ_n = 1,
we obtain γ(C1, C2) = 0. Therefore, by Theorem 4.5.1, C1 + C2 is not closed and {C1, C2} is not (boundedly) (linearly) regular.
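A finite-dimensional caricature shows why a vanishing angle is fatal. The sketch below is illustrative (not part of the example itself) and assumes two lines through the origin of ℝ² enclosing an angle γ; the best constant κ in d(x, C1 ∩ C2) ≤ κ max{d(x, C1), d(x, C2)} blows up as γ → 0, so no uniform constant can survive angles γ_n → 0. The name estimate_kappa is an invention.

```python
import numpy as np

def estimate_kappa(gamma, n_samples=100_000):
    # C1 = span{(1,0)}, C2 = span{(cos g, sin g)}; intersection is {0}.
    rng = np.random.default_rng(2)
    f = np.array([np.cos(gamma), np.sin(gamma)])
    X = rng.standard_normal((n_samples, 2))
    d1 = np.abs(X @ np.array([0.0, 1.0]))      # distance to span{(1,0)}
    d2 = np.abs(X @ np.array([-f[1], f[0]]))   # distance to span{f}
    dC = np.linalg.norm(X, axis=1)             # distance to {0}
    return float(np.max(dC / np.maximum(d1, d2)))

for gamma in (1.0, 0.1, 0.01):
    print(gamma, estimate_kappa(gamma))   # grows roughly like 1/gamma
```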
Example 4.8.2 Let X := ℓ², C := X₊ (the cone of nonnegative sequences), and H := {a}^⊥, for some a = (a_n) ∈ S_X with a_n > 0, for all n. Clearly, C ∩ H = {0}. Let u^(n) denote the n-th unit vector, for every n. Then, using Example 3.3.12, d(u^(n), H) = a_n → 0 and d(u^(n), C) ≡ 0, but d(u^(n), C ∩ H) ≡ 1. Hence {C, H} is not boundedly regular.
In stark contrast, we prove that
{C^⊖, H^⊖} is (boundedly) (linearly) regular.
Indeed, we have only to show that {X₊, span(a)} is regular (Theorem 4.3.1). So pick a sequence (x^(n)) with max{d(x^(n), X₊), d(x^(n), span(a))} → 0. Write x^(n) = λ_n a + μ_n b^(n) with b^(n) ∈ S_X ∩ {a}^⊥, for every n. Then 0 ← d(x^(n), span(a)) = |μ_n| (Example 3.3.11); hence x^(n) ≈ λ_n a. Thus 0 ← d(x^(n), X₊) ≈ d(λ_n a, X₊) = (λ_n)₋ (by Theorem 3.3.1). Consequently, using Example 3.3.5, d(x^(n), X₊ ∩ span(a)) = d(x^(n), cone(a)) ≈ d(λ_n a, cone(a)) = (λ_n)₋ → 0, as desired.
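A finite-dimensional snapshot of the first half of this example is instructive. The sketch is illustrative only: in ℝᴺ the pair {C, H} is boundedly regular by Proposition 4.2.2, so the infinite-dimensional failure shows up here as distances d(u^(n), H) that shrink while d(u^(n), C ∩ H) stays pinned at 1.

```python
import numpy as np

N = 60
a = np.array([2.0 ** (-(k + 1)) for k in range(N)])
a /= np.linalg.norm(a)                   # a in S_X with all entries positive

def distances(n):
    u = np.zeros(N); u[n] = 1.0          # n-th unit vector
    d_H = abs(float(a @ u))              # distance to H = {a}^perp
    d_C = float(np.linalg.norm(np.minimum(u, 0.0)))   # distance to C = X_+
    d_CH = float(np.linalg.norm(u))      # C ∩ H = {0}, so d = ||u|| = 1
    return d_H, d_C, d_CH

for n in (0, 10, 30, 50):
    print(n, *distances(n))   # d_H shrinks, d_C = 0, d(u, C ∩ H) stays 1
```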
Example 4.8.3 Let K be the icecream cone {x = (x₁, x₂, x₃) ∈ ℝ³ : x₁² + x₂² ≤ x₃², x₃ ≥ 0} (so K = C₁ for X = ℝ² in the notation of Theorem 3.3.6). Then, by Theorem 3.3.6, for every y = (y₁, y₂, y₃) ∈ ℝ³:
P_K(y) = y, if y₁² + y₂² ≤ y₃² and y₃ ≥ 0;
P_K(y) = 0, if y₁² + y₂² ≤ y₃² and y₃ ≤ 0;
P_K(y) = ((√(y₁² + y₂²) + y₃)/2)(y₁/√(y₁² + y₂²), y₂/√(y₁² + y₂²), 1), otherwise.
Now let H be the hyperplane {x = (x₁, x₂, x₃) ∈ ℝ³ : x₁ = x₃}. Then (by Example 3.3.12) P_H(y) = ((y₁ + y₃)/2, y₂, (y₁ + y₃)/2), for every y ∈ ℝ³. It follows that K ∩ H = {x ∈ ℝ³ : x₁ = x₃ ≥ 0, x₂ = 0} = cone((1, 0, 1)); in particular,
K ∩ H ≠ {0}.
By Example 3.3.5, P_{K∩H}(y) = ((y₁ + y₃)₊/2)(1, 0, 1), for every y ∈ ℝ³. Proposition 4.2.2 implies that the set
{K, H} is boundedly regular.
Define a sequence in ℝ³ by x^(n) := (n, 1, n), for every n ≥ 1. Then d(x^(n), H) ≡ 0 and d(x^(n), K ∩ H) ≡ 1. Maple assists in computing d(x^(n), K) = 1/(√2(√(n² + 1) + n)) → 0. Therefore, by Theorem 4.3.1,
CHAPTER 4. REGULARITY FOR T W O SETS
{K, H} is neither regular nor (boundedly) linearly regular.
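The computation behind this example is easy to reproduce. A sketch assuming numpy (proj_icecream is an invented name implementing the standard second-order-cone projection to which the formula of Theorem 3.3.6 specializes):

```python
import numpy as np

def proj_icecream(y):
    # projection onto K = {(x1,x2,x3): x1^2 + x2^2 <= x3^2, x3 >= 0}
    z, t = np.asarray(y[:2], dtype=float), float(y[2])
    r = float(np.linalg.norm(z))
    if r <= t:
        return np.asarray(y, dtype=float)   # already in K
    if r <= -t:
        return np.zeros(3)                  # in the polar cone: project to 0
    s = (r + t) / 2.0
    return np.concatenate([s * z / r, [s]])

for n in (1, 10, 100, 1000):
    x = np.array([n, 1.0, n])
    d_K = np.linalg.norm(x - proj_icecream(x))
    d_KH = np.linalg.norm(x - n * np.array([1.0, 0.0, 1.0]))  # P_{K∩H}(x)
    print(n, d_K, d_KH)   # d(x, K) -> 0 while d(x, K ∩ H) stays 1
```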
Example 4.8.4 (Example 4.8.3 continued) Let A be the hyperplane {x ∈ ℝ³ : x₁ − x₃ = −1}; this is a translation of H: A = H + (−1, 0, 0). Consider K ∩ A. Clearly, A ∩ int K ≠ ∅ and hence, by Corollary 4.4.4, the set
{K, A} is boundedly linearly regular.
The boundary of K ∩ A with respect to aff(K ∩ A) is contained in the boundary of K, because A is affine. Moreover, as int K ≠ ∅, it is easy to check that this relative boundary is given by the parabola
((x₂² − 1)/2, x₂, (x₂² + 1)/2), where x₂ ∈ ℝ.
Now we let x^(n) := ((n² − 1)/2, n + 1, (n² + 1)/2), for every n ≥ 1; the sequence (x^(n)) lies outside K ∩ A but inside A. As n becomes large, the projection of x^(n) onto K ∩ A becomes arbitrarily close to ((n² − 1)/2, n, (n² + 1)/2), because the slope of the parabola tends to infinity. Hence d(x^(n), A) ≡ 0 and d(x^(n), K ∩ A) → 1.
On the other hand, the formula for P_K in Example 4.8.3 and Maple yield:
d(x^(n), K) → 0.
Thus the set
{K, A} is not (linearly) regular although A is polyhedral and A ∩ int K ≠ ∅.
4.9 Notes
Bauschke and Borwein coined the notion of (bounded) (linear) regularity in [15] and proved
the majority of the results in this chapter.
Lewis's observation (Proposition 4.2.4) is very nice and allows one to prove the "Moreover" part of Theorem 4.3.1 (which strengthens [15, Theorem 3.17]).
Proposition 4.4.1 is an important special case of a quantitative inversion theorem for convex relations due to Robinson; see [126, Theorem 2].
The convex relation R defined in the proof of Theorem 4.4.3 is regular at (z̄, 0) in the sense of Borwein [20, Corollary 4.1]; also, 0 is a regular value of R in the sense of Robinson [126, Section 2]. Thus the notions "(boundedly) (linearly) regular" appear to be reasonably well justified.
The finite-dimensional results (Proposition 4.6.1 and Proposition 4.6.2) are quite important
and appear to be new.
There are some sufficient conditions guaranteeing (bounded) regularity, usually described
in terms of "roundness" of the constraint sets. While interesting, I think these results are
not as useful as the preceding results. Nevertheless, here is a sample result taken from [18, Section 3.6]:
Proposition 4.9.1 Suppose C1, C2 are closed convex sets with nonempty intersection. If C1 or C2 is Kadec/Klee, then the set {C1, C2} is boundedly regular.
Proof. (See also [18, Theorem 3.6.5.(iii)].) Suppose WLOG that C1 is Kadec/Klee and let (x_n) be a bounded sequence with max{d(x_n, C1), d(x_n, C2)} → 0. Assume further WLOG (subsequence) that d(x_n, C1 ∩ C2) → L and x_n ⇀ x, for some x ∈ C1 ∩ C2 and L ≥ 0. It suffices to show that L = 0. If x ∈ bd C1, then (because x_n − P_{C1}x_n → 0, P_{C1}x_n ⇀ x, and C1 is Kadec/Klee) x_n → x and L = 0. Otherwise, x ∈ int(C1) ∩ C2 and Corollary 4.4.4 even implies the much stronger bounded linear regularity. ∎
Combettes also studied Kadec/Klee sets [41, page 221] and called them Levitin-Polyak sets. However, I prefer the notion Kadec/Klee since it is fully consistent with Banach space geometry.
Finally, concerning the examples: Example 4.8.1 is part of the folklore and Example 4.8.2 was also discussed in [15, Example 5.5]. The examples involving the icecream cone (Example 4.8.3 and Example 4.8.4) make many of the statements in this chapter sharp; they appear to be new.
Chapter 5
Regularity for finitely many sets
5.1 Overview
Based on our work in Chapter 4, we are well equipped to tackle notions of regularity for finitely many sets. Not only can the two-set results be nicely used to obtain finitely-many-set results (Theorem 5.4.1), but most of them also allow straightforward generalizations. Bounded linear regularity is guaranteed either if a typical constraint qualification holds (Corollary 5.4.2); or, in the subspace case, if the sum of the orthogonal complements is closed. As in the two-set case, these conditions look very different at first sight. In Euclidean spaces, bounded linear regularity holds under the standard constraint qualifications (Theorem 5.6.2). Linear regularity is obtained when all the sets are convex polyhedra; this is essentially Hoffman's result (Theorem 5.7.1).
5.2 Basic properties
Definition 5.2.1 Suppose C1, . . . , C_N are finitely many closed convex subsets of X with C := ∩_i C_i ≠ ∅. We say that {C1, . . . , C_N} is . . .
(i) regular, if max_i d(x_n, C_i) → 0 implies d(x_n, C) → 0, for every sequence (x_n) in X.
(ii) boundedly regular, if max_i d(x_n, C_i) → 0 implies d(x_n, C) → 0, for every bounded sequence (x_n) in X.
(iii) linearly regular, if there exists κ > 0 such that d(x, C) ≤ κ max_i d(x, C_i), for every x ∈ X.
(iv) boundedly linearly regular, if for every bounded set S, there exists κ_S > 0 such that d(x, C) ≤ κ_S max_i d(x, C_i), for every x ∈ S.
The obvious generalizations of every result for the two-set case in Section 4.2 hold true; here, we record only a selection:
Proposition 5.2.2 Suppose C1, . . . , C_N are finitely many closed convex subsets of X with nonempty intersection. If some set C_i is boundedly compact, then {C1, . . . , C_N} is boundedly regular.
Proof. (See also [14, Proposition 5.4.(i)].) Adapt the proof of Proposition 4.2.2. ∎
Theorem 5.2.3 Suppose C1, . . . , C_N are finitely many closed convex subsets of X with bounded nonempty intersection.
(i) If {C1, . . . , C_N} is boundedly regular, then it is regular.
(ii) If {C1, . . . , C_N} is boundedly linearly regular, then it is linearly regular.
Proof. Analogous to the proof of Theorem 4.2.6. ∎
Corollary 5.2.4 Suppose C1, . . . , C_N are finitely many closed convex subsets of X with bounded nonempty intersection. If some set C_i is boundedly compact, then {C1, . . . , C_N} is regular.
Proof. Combine Proposition 5.2.2 and Theorem 5.2.3.(i). ∎
5.3 Finitely many cones
Again, the proof of Theorem 4.3.1 carries over without difficulty:
Theorem 5.3.1 Suppose C1, . . . , C_N are finitely many closed convex cones in X. Then TFAE:
(i) {C1, . . . , C_N} is linearly regular.
(ii) {C1, . . . , C_N} is boundedly linearly regular.
(iii) {C1, . . . , C_N} is regular.
Moreover, if ∩_i C_i = {0}, then (i)-(iii) are also equivalent to
(iv) {C1, . . . , C_N} is boundedly regular.
5.4 Promoting regularities
The proof of Theorem 4.4.3 on bounded linear regularity for two sets does not generalize to finitely many sets; consequently, we focus on criteria that rely on regularities of two sets to tap into this powerful result.
Theorem 5.4.1 Suppose C1, . . . , C_N are finitely many closed convex subsets of X with nonempty intersection. If {C1 ∩ · · · ∩ C_i, C_{i+1}} is (boundedly) (linearly) regular, for every 1 ≤ i ≤ N − 1, then so is {C1, . . . , C_N}.
Proof. (See also [14, Theorem 5.11].) Consider the case, say, of bounded linear regularity. So fix a bounded subset S of X and obtain κ_i > 0 such that d(x, C1 ∩ · · · ∩ C_{i+1}) ≤ κ_i max{d(x, C1 ∩ · · · ∩ C_i), d(x, C_{i+1})}, for every x ∈ S and 1 ≤ i ≤ N − 1. Then d(x, C1 ∩ · · · ∩ C_N) ≤ κ₁ · · · κ_{N−1} max_j d(x, C_j), for every x ∈ S. The other cases are proved similarly. ∎
An important constraint qualification for bounded linear regularity follows:
Corollary 5.4.2 Suppose C1, . . . , C_N are finitely many closed convex subsets of X with C_N ∩ ∩_{i=1}^{N−1} int(C_i) ≠ ∅. Then {C1, . . . , C_N} is boundedly linearly regular.
Proof. (See also [79, Lemma 5].) The assumption clearly implies C_{i+1} ∩ int(C1 ∩ · · · ∩ C_i) ≠ ∅, for every 1 ≤ i ≤ N − 1. Now apply Corollary 4.4.4.(i) and Theorem 5.4.1. ∎
The next result says roughly that regularity is preserved by taking coarser partitions:
Theorem 5.4.3 Suppose C1, . . . , C_N are finitely many closed convex subsets of X with nonempty intersection. Suppose further J₁ ∪ · · · ∪ J_M = {1, . . . , N} pair-wise disjointly. Define D_j := ∩_{i∈J_j} C_i, for every 1 ≤ j ≤ M. If {C1, . . . , C_N} is (boundedly) (linearly) regular, then so is {D₁, . . . , D_M}.
Proof. (See also [14, Proposition 5.25].) Use that d(·, ∩_{j=1}^M D_j) = d(·, ∩_{i=1}^N C_i) and that max_{i∈J_j} d(·, C_i) ≤ d(·, D_j). ∎
The converse of Theorem 5.4.3 is false: let C1, C2 be as in Example 4.8.1. Then {C1, C2} is not (boundedly) (linearly) regular, but {C1 ∩ C2} = {{0}} clearly is.
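Bounded linear regularity for finitely many sets can be watched numerically. The sketch below is an illustrative construction (three boxes in ℝ² whose interiors meet, so Corollary 5.4.2 applies): distances to each box and to the intersection are exact via clipping, and the empirical ratio d(x, C)/max_i d(x, C_i) stays bounded on a bounded sample, as Definition 5.2.1(iv) predicts.

```python
import numpy as np

boxes = [((-2.0, 1.0), (-2.0, 1.0)),
         ((-1.0, 2.0), (-2.0, 2.0)),
         ((-2.0, 2.0), (-1.0, 2.0))]      # intersection: [-1,1] x [-1,1]
inter = ((-1.0, 1.0), (-1.0, 1.0))

def d_box(x, box):
    lo = np.array([b[0] for b in box]); hi = np.array([b[1] for b in box])
    return float(np.linalg.norm(x - np.clip(x, lo, hi)))

rng = np.random.default_rng(3)
worst = 0.0
for x in rng.uniform(-5.0, 5.0, size=(50_000, 2)):
    m = max(d_box(x, b) for b in boxes)
    if m > 1e-9:
        worst = max(worst, d_box(x, inter) / m)
print(worst)   # a finite empirical kappa_S on this bounded sample
```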
5.5 Finitely many subspaces
The next result provides a very satisfactory generalization of Theorem 4.5.1.
Theorem 5.5.1 Suppose C1, . . . , C_N are finitely many closed subspaces of X and C := C1 ∩ · · · ∩ C_N. Consider, as in Fact 2.6.17, the product space with positive weights (w_i)_{i=1}^N, the diagonal Δ, and the product subspace C1 × · · · × C_N. Then TFAE:
(i) C1^⊥ + · · · + C_N^⊥ is closed.
(ii) Δ + (C1 × · · · × C_N) is closed.
(iii) {Δ, C1 × · · · × C_N} is (boundedly) (linearly) regular.
(iv) {C1, . . . , C_N} is (boundedly) (linearly) regular.
(v) ‖P_{C_N} · · · P_{C1} − P_C‖ < 1.
Proof. "(i)⇔(ii)": Define a continuous linear operator T from the product space onto X by Tx := Σ_{i=1}^N w_i x_i. Then T*y = (y, y, ..., y), ∀y ∈ X; hence ran T* = Δ and ker T = (ran T*)^⊥ = Δ^⊥. Thus, by Fact 2.3.6, C_1^⊥ + ··· + C_N^⊥ = Σ_i w_i C_i^⊥ is closed if and only if C^⊥ + Δ^⊥ is. The closedness of the latter sum is (by Proposition 2.6.16) equivalent to the closedness of C + Δ.
"(ii)⇔(iii)": Theorem 4.5.1. "(iii)⇒(iv)": By linear regularity, there exists κ > 0 such that d(x, Δ ∩ C) ≤ κ max{d(x, Δ), d(x, C)}, for every x in the product space. In particular, if x ∈ X and we apply this to (x, ..., x) ∈ Δ, then
(Σ_i w_i)^{1/2} d(x, ∩_i C_i) = d((x, ..., x), Δ ∩ C) ≤ κ d((x, ..., x), C) = κ (Σ_i w_i d²(x, C_i))^{1/2} ≤ κ (Σ_i w_i)^{1/2} max_i d(x, C_i);
consequently, {C_1, ..., C_N} is linearly regular. "(iv)⇒(i)": We prove the contrapositive: if C_1^⊥ + ··· + C_N^⊥ is not closed, then {C_1, ..., C_N} is not boundedly regular. By the already established equivalence of (i) and (iii), there exist ε > 0 and a bounded sequence (x^(n)) in the product space such that max{d(x^(n), Δ), d(x^(n), C)} → 0 but d(x^(n), Δ ∩ C) → ε. Let y^(n) be the first coordinate of P_Δ(x^(n)) (the coordinates all agree anyway), ∀n ≥ 1. Since x^(n) ≈ P_Δ(x^(n)), we have: (y^(n)) is bounded, Σ_i w_i d²(y^(n), C_i) → 0, but d²(y^(n), ∩_i C_i) → ε² > 0. Hence {C_1, ..., C_N} is not boundedly regular and the equivalence of (i)-(iv) is proven. Before considering (v), let C := ∩_i C_i (the product subspace is no longer needed). Then the above implies (with Proposition 2.6.14) the following
Observation: C_1^⊥ + ··· + C_N^⊥ is closed ⇔ C_1^⊥ + ··· + C_N^⊥ + C is closed ⇔ (C_1^⊥ + C) + ··· + (C_N^⊥ + C) is closed ⇔ (C_1 ∩ C^⊥)^⊥ + ··· + (C_N ∩ C^⊥)^⊥ is closed
⇔ {C_1 ∩ C^⊥, ..., C_N ∩ C^⊥} is (boundedly) (linearly) regular.
This observation gives us the flexibility needed to handle (v).
"(i)-(iv)⇒(v)": By the observation, {C_1 ∩ C^⊥, ..., C_N ∩ C^⊥} is linearly regular. Hence there exists κ > 0 such that
‖x‖ = d(x, ∩_i (C_i ∩ C^⊥)) ≤ κ max_i d(x, C_i ∩ C^⊥), ∀x ∈ X.
On the other hand, by Observation 3.2.2, for every 1 ≤ i ≤ N, x ∈ X:
d(x, C_i ∩ C^⊥) ≤ ‖x − P_{C_i∩C^⊥} ··· P_{C_1∩C^⊥} x‖ ≤ √N (‖x‖² − ‖P_{C_N∩C^⊥} ··· P_{C_1∩C^⊥} x‖²)^{1/2}.
Altogether, ‖x‖² ≤ κ²N (‖x‖² − ‖P_{C_N∩C^⊥} ··· P_{C_1∩C^⊥} x‖²), ∀x ∈ X; this readily implies ‖P_{C_N∩C^⊥} ··· P_{C_1∩C^⊥}‖ < 1. Now apply Fact 2.3.10.
"(v)⇒(i)-(iv)": We prove the contrapositive and assume that C_1^⊥ + ··· + C_N^⊥ is not closed; equivalently (by the above observation), {C_1 ∩ C^⊥, ..., C_N ∩ C^⊥} is not boundedly regular. Hence there is some bounded sequence (x_n) with max_i d(x_n, C_i ∩ C^⊥) → 0, but d(x_n, ∩_i C_i ∩ C^⊥) = ‖x_n‖ ↛ 0. After passing to a subsequence and normalizing, we assume WLOG ‖x_n‖ = 1. Now x_n − P_{C_1∩C^⊥} x_n → 0, which implies, by nonexpansivity of P_{C_2∩C^⊥}, that P_{C_2∩C^⊥} x_n − P_{C_2∩C^⊥} P_{C_1∩C^⊥} x_n → 0. Since x_n − P_{C_2∩C^⊥} x_n → 0, we obtain x_n − P_{C_2∩C^⊥} P_{C_1∩C^⊥} x_n → 0. Repeating this line of thought eventually yields x_n − P_{C_N∩C^⊥} ··· P_{C_1∩C^⊥} x_n → 0. The triangle inequality implies ‖x_n‖ − ‖P_{C_N∩C^⊥} ··· P_{C_1∩C^⊥} x_n‖ → 0; thus ‖P_{C_N∩C^⊥} ··· P_{C_1∩C^⊥} x_n‖ → 1. Therefore, by Fact 2.3.10, ‖P_{C_N} ··· P_{C_1} − P_C‖ = 1.
The proof is complete. ∎
As (bounded) (linear) regularity is invariant under translation, it is clear that corresponding results hold for intersecting closed affine subspaces.
In striking contrast to Theorem 4.5.1, (bounded) (linear) regularity of finitely many closed subspaces C_1, ..., C_N has nothing to do with the closedness of C_1 + ··· + C_N:
Example 5.5.2 Suppose C_1, C_2 are closed subspaces of X such that C_1 + C_2 is not closed (see Example 4.8.1); thus neither is C_1^⊥ + C_2^⊥ (Proposition 2.6.16). Let N be a positive integer greater than or equal to 3.
(i) If C_3 := ··· := C_N := X, then C_1 + ··· + C_N is closed but C_1^⊥ + ··· + C_N^⊥ is not.
(ii) If C_3 := ··· := C_N := {0}, then C_1 + ··· + C_N is not closed but C_1^⊥ + ··· + C_N^⊥ is.
In view of Theorem 5.5.1, we learn that (bounded) (linear) regularity of {C_1, ..., C_N} is independent of the closedness of C_1 + ··· + C_N for N ≥ 3.
Remark 5.5.3 For every N-tuple (C_1, ..., C_N) of finitely many closed subspaces of X, define the angle γ(C_1, ..., C_N) ∈ [0, π/2] by
cos γ(C_1, ..., C_N) := ‖P_{C_N} ··· P_{C_1} − P_{C_1∩···∩C_N}‖.
Then:
(i) For N = 2, the angle γ(C_1, C_2) just defined is consistent with Definition 2.6.11 (use Fact 2.6.12) and hence independent of the ordering. It also provides a sharp bound for the method of alternating projections: ‖(P_{C_2}P_{C_1})^n − P_{C_1∩C_2}‖ = cos^{2n−1} γ(C_1, C_2), ∀n ≥ 1; see [98, Theorem 2].
(ii) For N ≥ 3, the angle γ(C_1, ..., C_N) depends on the ordering: Consider u, v, w ∈ S_X with (v, w) ≠ 0 and |(u, v)| ≠ |(u, w)|. Let C_1 := ··· := C_{N−2} := span{u}, C_{N−1} := span{v}, and C_N := span{w}. Then C_{N−1} ∩ C_N = {0} and one readily computes that cos γ(C_1, C_2, ..., C_N) = |(u, v)| |(v, w)| is different from |(u, w)| |(w, v)| = cos γ(C_1, ..., C_{N−2}, C_N, C_{N−1}).
(iii) Theorem 5.5.1 implies that γ(C_1, ..., C_N) > 0 if and only if γ(C_{π(1)}, ..., C_{π(N)}) > 0, for every permutation π of {1, ..., N}.
(iv) It is true that γ(C_1, C_2, ..., C_N) = γ(C_N, C_{N−1}, ..., C_1): if we let C := ∩_i C_i, then (using Fact 2.3.10)
cos γ(C_1, ..., C_N) = ‖P_{C_N} ··· P_{C_1} − P_C‖ = ‖(P_{C_N} ··· P_{C_1} − P_C)*‖ = ‖P_{C_1} ··· P_{C_N} − P_C‖ = cos γ(C_N, ..., C_1).
Thus the angle is invariant under reversal of the ordering, and we have yet another explanation why the angle for two closed subspaces is independent of the order.
(v) For N ≥ 3, one obtains the (non-sharp but still) useful estimate: ‖(P_{C_N} ··· P_{C_1})^n − P_{∩_i C_i}‖ ≤ cos^n γ(C_1, ..., C_N), ∀n ≥ 1.
(vi) Theorem 5.5.1 allows us to write down a pretty characterization:
{C_1, ..., C_N} is (boundedly) (linearly) regular if and only if γ(C_1, ..., C_N) > 0.
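In a Euclidean space, the angle just defined can be computed numerically. The following sketch (an illustration only; it assumes, consistently with (i) and (v), the formula cos γ(C_1, C_2) = ‖P_{C_2}P_{C_1} − P_{C_1∩C_2}‖ for N = 2) treats two lines through the origin in ℝ²:

```python
import numpy as np

def line_projector(theta):
    """Orthogonal projector onto the line spanned by (cos theta, sin theta)."""
    u = np.array([np.cos(theta), np.sin(theta)])
    return np.outer(u, u)

# Two distinct lines through the origin in R^2; their intersection is {0},
# so the projector onto the intersection is the zero matrix.
theta = 0.3
P1 = line_projector(0.0)
P2 = line_projector(theta)
P_cap = np.zeros((2, 2))

# Operator norm (largest singular value) of P2 P1 - P_cap.
cos_gamma = np.linalg.norm(P2 @ P1 - P_cap, 2)

# For two lines at angle theta this norm equals |cos theta| < 1,
# so gamma > 0 and {C1, C2} is (linearly) regular, in line with (vi).
assert abs(cos_gamma - abs(np.cos(theta))) < 1e-12
```

Since cos_gamma < 1, part (vi) predicts regularity of the pair, as expected in finite dimensions (Theorem 5.5.4.(v)).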
Here are some sufficient conditions for linear regularity of a set of closed subspaces:
Theorem 5.5.4 Suppose C_1, ..., C_N are finitely many closed subspaces of X. Then the set {C_1, ..., C_N} is linearly regular whenever one of the following conditions holds:
(i) γ(C_1 ∩ ··· ∩ C_i, C_{i+1}) > 0, for every 1 ≤ i ≤ N − 1.
(ii) Some C_i ∩ (C_1 ∩ ··· ∩ C_N)^⊥ is finite-dimensional.
(iii) All C_i, with the possible exception of one, are finite-codimensional.
(iv) Each C_i is a hyperplane.
(v) X is finite-dimensional.
Proof. (See also [18, Proposition 3.7.7].) (i): By Theorem 4.5.1, {C_1 ∩ ··· ∩ C_i, C_{i+1}} is linearly regular, for every 1 ≤ i ≤ N − 1; now apply Theorem 5.4.1. (ii): Proposition 5.2.2 yields bounded regularity of {C_1 ∩ (C_1 ∩ ··· ∩ C_N)^⊥, ..., C_N ∩ (C_1 ∩ ··· ∩ C_N)^⊥}, which is equivalent to linear regularity of {C_1, ..., C_N} (by the observation in the proof of Theorem 5.5.1). (iii): We assume WLOG that C_1, ..., C_{N−1} are finite-codimensional. Then C_1^⊥ + ··· + C_{N−1}^⊥ is finite-dimensional and hence (Corollary 4.5.4) C_1^⊥ + ··· + C_N^⊥ is closed. Apply Theorem 5.5.1. Finally, (iv) and (v) are clearly more restrictive than (iii). ∎
5.6 Finite-dimensional results
Once again, the results from the two-set case generalize beautifully:
Proposition 5.6.1 Suppose C_1, ..., C_N are finitely many closed convex subsets of a Euclidean space X. If ∩_{i=1}^N ri C_i ≠ ∅, then {C_1, ..., C_N} is boundedly linearly regular.
Proof. Combine Theorem 5.4.1 and Proposition 4.6.1. ∎
Theorem 5.6.2 Suppose that C_1, ..., C_M are finitely many convex polyhedra and that C_{M+1}, ..., C_N are finitely many closed convex sets in a Euclidean space X. If ∩_{i=1}^M C_i ∩ ∩_{i=M+1}^N ri C_i ≠ ∅, then {C_1, ..., C_N} is boundedly linearly regular.
Proof. Let D_1 := ∩_{i=1}^M C_i and D_2 := ∩_{i=M+1}^N C_i. Then the following three sets are boundedly linearly regular: {D_1, D_2} (by Proposition 4.6.2), {C_1, ..., C_M} (by Fact 2.2.26), and {C_{M+1}, ..., C_N} (by Proposition 5.6.1). The result follows readily. ∎
When only convex polyhedral sets are present, the conclusion of Theorem 5.6.2 can be strengthened to linear regularity; see Remark 5.7.3.
5.7 Halfspaces and convex polyhedra
Hoffman's result (Fact 2.2.26) can be rephrased as: "In a Euclidean space, every set of intersecting halfspaces is linearly regular." The generalization to arbitrary Hilbert spaces is not hard:
Theorem 5.7.1 Suppose C_1, ..., C_N are finitely many halfspaces in X with nonempty intersection. Then {C_1, ..., C_N} is linearly regular.
Proof. Suppose each C_i is given as {x ∈ X : (a_i, x) ≤ b_i}, for some a_i ∈ X and b_i ∈ ℝ. Let K := ∩_i ker(a_i); then K is a closed subspace of finite codimension. Let further D_i := C_i ∩ K^⊥, ∀i; each D_i is a halfspace in the Euclidean space K^⊥. Let finally C := ∩_i C_i and D := C ∩ K^⊥. On the one hand, by Fact 2.2.26, the set {D_1, ..., D_N} is linearly regular. On the other hand, by Theorem 3.3.16, if x ∈ X and y := P_{K^⊥} x ∈ K^⊥, then x − P_C x = y − P_D y and x − P_{C_i} x = y − P_{D_i} y, ∀i. Altogether, it follows easily that {C_1, ..., C_N} is linearly regular. ∎
Corollary 5.7.2 Suppose C_1, ..., C_N are finitely many convex polyhedra in X with nonempty intersection. Then {C_1, ..., C_N} is linearly regular.
Proof. (See also [14, Corollary 5.26].) Combine Theorem 5.7.1 with Theorem 5.4.3. ∎
Remark 5.7.3 Suppose X is a Euclidean space. Then, by Corollary 5.7.2, any finite set of intersecting halfspaces, affine subspaces, boxes, and hyperplanes is linearly regular. This result covers quite a few applications; it is also stronger than Theorem 5.6.2.
5.8 Notes
The notions of regularity (Definition 5.2.1) were coined in [14, Section 5].
The equivalence of (i)-(iii) in Theorem 5.3.1 appears also in [14, Proposition 5.9]; as in the two-set case, the "Moreover" part is new.
Gubin et al. first investigated bounded regularity in some detail. Their [79, Lemma 5] contains Proposition 5.2.2 and Corollary 5.4.2 (proved from scratch); they also noted that Hoffman's result (Fact 2.2.26) holds true in arbitrary Hilbert space (Theorem 5.7.1).
The subspace case is completely settled in Theorem 5.5.1; this result is a combination of [14, Lemma 5.18, Theorem 5.19] and [18, Proposition 3.7.3, Theorem 3.7.4].
In Remark 5.5.3, we touch upon angles of tuples of closed subspaces; much more on this and connections to the method of cyclic projections can be found in [49, 50].
Condition (i) of Theorem 5.5.4 is popular in Computed Tomography (see [147, Theorem 2.2]), whereas (ii) appears on [70, page 46].
The central finite-dimensional result, Theorem 5.6.2, is new.
Hoffman's result (Fact 2.2.26 and Theorem 5.7.1) was recently deduced from Fenchel Duality; see Burke and Tseng's [32].
We conclude by mentioning that - in a Baire category sense - "bounded linear regularity is the rule":
Remark 5.8.1 Suppose N is a positive integer greater than or equal to 2 and let T be the set of all N-tuples of the form (C_1, ..., C_N), where each C_i is a bounded closed convex subset of X with ∩_i C_i ≠ ∅. Let further R be the subset of all boundedly linearly regular N-tuples, that is: (C_1, ..., C_N) ∈ R if and only if (C_1, ..., C_N) ∈ T and {C_1, ..., C_N} is boundedly linearly regular. Then R is residual in T (equipped with the Hausdorff metric).
Proof. [14, Theorem 5.27]. ∎
Chapter 6
Fejér monotone sequences
6.1 Overview
We discuss the appealing and powerful concept of a Fejér monotone sequence. Besides projections and the "fab four" regularities, this is the third key concept we rely on to study projection methods for convex feasibility problems. Fejér monotonicity is also very useful outside the projection methods world, as our algorithm for finding support points demonstrates.
6.2 Fejér monotone sequences
Definition 6.2.1 Suppose (x_n)_{n≥0} is a sequence in X and C is a closed convex nonempty subset of X. Then (x_n) is Fejér monotone with respect to C, if
‖x_{n+1} − c‖ ≤ ‖x_n − c‖, for every c ∈ C and every n ≥ 0.
Fejér monotone sequences have not only an immediate intuitive appeal but, more importantly, also excellent properties:
Theorem 6.2.2 Suppose (x_n)_{n≥0} is a Fejér monotone sequence with respect to some closed convex nonempty subset C of X. Then:
(i) (x_n) is bounded and the sequence (d(x_n, C)) is decreasing (hence convergent).
(ii) If w_1, w_2 are two weak cluster points of (x_n), then w_1 − w_2 ∈ (C − C)^⊥.
CHAPTER 6. FEJÉR MONOTONE SEQUENCES 76
(iii) The sequence (P_C x_n) is norm convergent; denote its limit by c*. Let d := inf_n d(x_n, C) and suppose w is an arbitrary weak cluster point of (x_n). Then
w ∈ P_C^{−1}(c*) ∩ ∩_{n≥0} R_C(x_n) ⊆ P_C^{−1}(c*) ∩ B(c*, d).
In particular: (x_n) has at most one weak cluster point in C, namely c*. Moreover: w is a norm cluster point of (x_n) if and only if ‖w − c*‖ = d. Finally: d = 0 is equivalent to the norm convergence of (x_n) to c*, in which case ‖c* − x_n‖ ≤ 2 d(x_n, C), ∀n ≥ 0.
(iv) If C is a closed affine subspace, then P_C x_n = c*, ∀n ≥ 0.
(v) If int C ≠ ∅, then Σ_n ‖x_n − x_{n+1}‖ < +∞. In particular, (x_n) is norm convergent.
(vi) If there exists 0 ≤ θ < 1 such that d(x_{n+1}, C) ≤ θ d(x_n, C), for every n ≥ 0, then ‖x_n − c*‖ ≤ 2θ^n d(x_0, C). In particular, (x_n) converges linearly to c* with rate θ.
Proof. Throughout, let us abbreviate P_C by P. (i): (x_n) is clearly bounded. Also, ‖x_{n+1} − Px_{n+1}‖ ≤ ‖x_{n+1} − Px_n‖ ≤ ‖x_n − Px_n‖, ∀n; hence (d(x_n, C)) decreases. (ii): By (i), the limits λ_i := lim_n ‖x_n − c_i‖² exist, for i = 1, 2, where c_1, c_2 ∈ C are arbitrary. Obtain two (possibly different) subsequences (x_{k_n}), (x_{l_n}) such that x_{k_n} ⇀ w_1 and x_{l_n} ⇀ w_2, for some w_1, w_2 ∈ X. Consider
‖x_n − c_1‖² = ‖x_n − c_2‖² + ‖c_2 − c_1‖² + 2(x_n − c_2, c_2 − c_1).
Taking limits along the subsequences (k_n) and (l_n) yields λ_1 = λ_2 + ‖c_2 − c_1‖² + 2(w_1 − c_2, c_2 − c_1) and λ_1 = λ_2 + ‖c_2 − c_1‖² + 2(w_2 − c_2, c_2 − c_1). Now subtract. (iii): Fix two arbitrary positive integers m, n with m > n. Theorem 3.2.1.(ii) (applied to x_m and Px_n) yields
‖Px_n − Px_m‖² ≤ ‖x_m − Px_n‖² − ‖x_m − Px_m‖²;
hence, using Fejér monotonicity,
‖Px_n − Px_m‖² ≤ ‖x_n − Px_n‖² − ‖x_m − Px_m‖² = d²(x_n, C) − d²(x_m, C).
In view of (i), the last expression gets small as m, n get big; thus, (Px_n) is a Cauchy sequence with limit c*, say. Now let w be a weak cluster point of (x_n), say the subsequence (x_{k_n}) converges weakly to w. For every c ∈ C, we have (c − Px_{k_n}, x_{k_n} − Px_{k_n}) ≤ 0; taking limits yields (c − c*, w − c*) ≤ 0 and it follows that c* = Pw. Weak lower semicontinuity of the norm and Fejér monotonicity yield ‖w − c‖ ≤ ‖x_n − c‖, ∀n ≥ 0, c ∈ C. Thus w ∈ R_C(x_n), ∀n ≥ 0 and so w ∈ ∩_{n≥0} R_C(x_n) ⊆ ∩_{n≥0} B(Px_n, d(x_n, C)); see Section 3.4. It is easy to check that the latter intersection is contained in B(c*, d). The "In particular" part follows. "Moreover": if w is a norm cluster point of (x_n), say the subsequence (x_{l_n}) converges in norm to w, then d = lim_n ‖x_n − Px_n‖ = lim_n ‖x_{l_n} − c*‖ = ‖w − c*‖. Conversely, if x_{k_n} ⇀ w and ‖w − c*‖ = d, then x_{k_n} − c* ⇀ w − c* and the Kadec/Klee property of X implies x_{k_n} → w, as desired. "Finally": By the "Moreover" part, w is a weak but not a norm cluster point of (x_n) if and only if ‖w − c*‖ < d. Thus d = 0 if and only if x_n → c*. Note that in this case, c* ∈ R_C(x_n); hence ‖c* − Px_n‖ ≤ ‖x_n − Px_n‖ and further ‖c* − x_n‖ ≤ ‖c* − Px_n‖ + ‖Px_n − x_n‖ ≤ 2 d(x_n, C), ∀n ≥ 0. (iv): Suppose w is a weak cluster point of (x_n). Hence P_C(w) = c* and (by Proposition 3.4.2) w = P_C(x_n) + d_n ∈ R_C(x_n), ∀n ≥ 0, where (d_n) is a sequence in (par C)^⊥. Write C = c + par C, for some c ∈ C. Then, using (iii) and Proposition 3.2.3.(i),
c* = P(w) = P(Px_n + d_n) = Px_n + P_{par C}(d_n) = Px_n, ∀n ≥ 0,
as desired. (v): By assumption, there exist ε > 0 and c_0 ∈ C such that B(c_0, ε) ⊆ C. Claim:
2ε ‖x_n − x_{n+1}‖ ≤ ‖x_n − c_0‖² − ‖x_{n+1} − c_0‖², ∀n ≥ 0.
This is clear if x_{n+1} = x_n. Otherwise, let c := c_0 + ε(x_n − x_{n+1})/‖x_n − x_{n+1}‖; then square and expand ‖c − x_{n+1}‖ ≤ ‖c − x_n‖. The claim follows and implies 2ε Σ_n ‖x_n − x_{n+1}‖ ≤ ‖x_0 − c_0‖² − inf_{n≥0} ‖x_n − c_0‖² < +∞. (vi): The assumption implies d(x_n, C) ≤ θ^n d(x_0, C) → 0. Now invoke the estimate from (iii). ∎
6.3 Two examples
6.3.1 Compositions of nonexpansive maps
Suppose (T_n)_{n≥1} is a sequence of nonexpansive maps, all defined on a closed convex nonempty subset D of X. If C := ∩_{n≥1} Fix T_n ≠ ∅, then every sequence (x_n)_{n≥0} generated by
x_0 ∈ D arbitrary, x_{n+1} := T_{n+1} x_n, for every n ≥ 0,
is Fejér monotone with respect to C.
Thus Theorem 6.2.2 applies; for instance: if int C ≠ ∅, then (x_n) converges in norm.
In particular, iterating a single nonexpansive map T results in the sequence of Picard iterates (T^n x_0), which is Fejér monotone with respect to Fix T.
The above model was the foundation for the systematic investigation of algorithms in [14].
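As a concrete numerical sketch of this model (illustrative only; the maps T_n are here chosen as the projections onto the closed unit disc and a closed halfplane in ℝ², both nonexpansive with the disc resp. halfplane as fixed point sets):

```python
import numpy as np

# Two closed convex sets in R^2 with a common point:
# the unit disc and the halfplane {(x, y) : y >= 0}.
def proj_disc(z):
    n = np.linalg.norm(z)
    return z if n <= 1 else z / n

def proj_halfplane(z):
    return np.array([z[0], max(z[1], 0.0)])

c = np.array([0.5, 0.5])   # a point in the intersection C
x = np.array([3.0, -2.0])  # starting point x_0
dists = []
for n in range(30):
    dists.append(np.linalg.norm(x - c))
    # T_{n+1} alternates between the two nonexpansive projections.
    x = proj_disc(x) if n % 2 == 0 else proj_halfplane(x)
dists.append(np.linalg.norm(x - c))

# Fejér monotonicity with respect to C: ||x_{n+1} - c|| <= ||x_n - c||.
fejer = all(d2 <= d1 + 1e-12 for d1, d2 in zip(dists, dists[1:]))
assert fejer
```

The monotone decrease of the distances to c is exactly the Fejér property of Definition 6.2.1, checked along one orbit.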
6.3.2 Finding support points in Hilbert spaces
In this subsection, I present an algorithm for finding support points, developed jointly with Jon Vanderwerff [156].
We are given a closed convex nonempty subset C of X and a functional f ∈ S_X. We consider the problem of finding a support point of f in C, that is:
Find (if possible) a point s ∈ C with f(s) = max f(C).
This problem is fundamental in optimization: indeed, every linear programming problem can be interpreted this way.
For brevity, we write P instead of P_C throughout this subsection.
Proposition 6.3.1 Suppose s ∈ C. Then TFAE:
(i) s is a support point of f in C.
(ii) P(s + αf) = s, for every α > 0.
(iii) P(s + αf) = s, for some α > 0.
Proof. Follows easily with Theorem 3.2.1.(i). ∎
Proposition 6.3.2 Suppose x ∈ X and α > 0. Then for every c ∈ C:
‖P(x + αf) − c‖² ≤ ‖x − c‖² + 2α (f, P(x + αf) − c).
Proof. By Theorem 3.2.1.(ii) (applied to c and x + αf), we have ‖c − (x + αf)‖² ≥ ‖c − P(x + αf)‖² + ‖(x + αf) − P(x + αf)‖². Thus, using ‖f‖ = 1,
‖c − P(x + αf)‖² ≤ ‖c − x‖² − ‖x − P(x + αf)‖² + 2α (f, P(x + αf) − c) ≤ ‖c − x‖² + 2α (f, P(x + αf) − c). ∎
Theorem 6.3.3 Suppose (α_n)_{n≥0} is a sequence of positive reals with Σ_n α_n = +∞. Define a sequence (x_n) by
x_0 ∈ C arbitrary, x_{n+1} := P(x_n + α_n f), for every n ≥ 0.
Then: either f is a support functional and (x_n) converges weakly to some support point; or f is not a support functional and ‖x_n‖ → +∞.
Proof. Observe that the sequence (x_n) lies entirely in C.
Step 1: (f(x_n)) is increasing.
By Theorem 3.2.1.(i), 0 ≥ (x_n − x_{n+1}, (x_n + α_n f) − x_{n+1}) = ‖x_n − x_{n+1}‖² + α_n f(x_n − x_{n+1}); hence f(x_{n+1}) ≥ f(x_n).
Step 2: lim_n f(x_n) = sup f(C) ∈ ]−∞, +∞].
The limit of (f(x_n)) exists in ]−∞, +∞] by Step 1. It suffices to show that for every c ∈ C, ε > 0, eventually f(x_n) ≥ f(c) − ε. Suppose not. Then there exist c ∈ C and ε > 0 such that f(x_n) ≤ f(c) − ε, ∀n ≥ 0. By Proposition 6.3.2,
‖x_{n+1} − c‖² − ‖x_n − c‖² ≤ 2α_n (f, x_{n+1} − c) = 2α_n ((f, x_{n+1} − x_n) + f(x_n) − f(c)) ≤ 2α_n ((f, x_{n+1} − x_n) − ε).
Now as (f(x_n)) converges to a finite limit, eventually (f, x_{n+1} − x_n) ≤ ε/2. Thus there exists n̄ ≥ 0 such that ‖x_{n+1} − c‖² − ‖x_n − c‖² ≤ 2α_n(−ε/2), ∀n ≥ n̄. Summing, this implies −‖x_{n̄} − c‖² ≤ −ε Σ_{n≥n̄} α_n = −∞, which is absurd. Step 2 is verified.
Step 3: If f is not a support functional, then ‖x_n‖ → +∞.
Assume the opposite. Then some subsequence of (x_n) is bounded and WLOG weakly convergent; in view of Step 2, its weak limit would be a support point, a contradiction. Step 3 thus holds.
Last Step: If f is a support functional, then (x_n) converges weakly to some support point.
Denote the set of support points for f in C by S. Then S is closed convex nonempty and s ∈ S if and only if P(s + αf) = s for every α > 0 (Proposition 6.3.1). Hence, by nonexpansivity of P,
‖x_{n+1} − s‖ = ‖P(x_n + α_n f) − P(s + α_n f)‖ ≤ ‖x_n − s‖,
for every s ∈ S, n ≥ 0; i.e., (x_n) is Fejér monotone with respect to S. On the other hand, by Step 2, each weak cluster point of (x_n) must lie in S. Altogether, Theorem 6.2.2.(iii) implies that (x_n) converges weakly to some support point in S. The entire theorem is proven. ∎
The condition on the divergence of the series Σ_n α_n is really necessary: for instance, let C := [−1, 1] in ℝ, f := 1, x_0 := 0, and Σ_n α_n < 1. Then (x_n) cannot converge to 1, which is the only support point of f in C.
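The iteration of Theorem 6.3.3 is easy to sketch numerically. In the following toy illustration (an assumption-laden example, not part of the text), C is taken to be the closed unit disc in ℝ², f is identified with the vector (1, 0), and α_n := 1/(n + 1), so that Σ_n α_n = +∞:

```python
import numpy as np

def proj_disc(z):
    """Projection onto the closed unit disc C in R^2."""
    n = np.linalg.norm(z)
    return z if n <= 1 else z / n

f = np.array([1.0, 0.0])     # unit functional, identified with a vector
x = np.array([0.0, -0.8])    # x_0 in C
for n in range(5000):
    alpha = 1.0 / (n + 1)    # positive steps with divergent sum
    x = proj_disc(x + alpha * f)

# The unique support point of f in C is (1, 0), where f attains max f(C) = 1;
# the iterates drift along the boundary toward it.
assert np.linalg.norm(x - np.array([1.0, 0.0])) < 1e-2
```

Replacing α_n by a summable sequence makes the iterates stall short of (1, 0), in line with the preceding remark on the necessity of divergence.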
6.4 Notes
Some historical remarks on Fejér monotone sequences and Theorem 6.2.2 are in order:
The notion was coined by Motzkin and Schoenberg [113]. Parts (i) and (ii) of Theorem 6.2.2 are folklore. The norm convergence of (P_C x_n) in (iii) essentially appeared in [9, Lemme 3]; see also [53, Theorem 3.4.(c)]. The relation between weak limit points of (x_n) and c* appears to be new, as does (iv). Part (v) comes from [111] and (vi) from [79, Proof of Lemma 6].
For Fixed Point Theorists, who sometimes encounter the asymptotic centre of a sequence (see, for instance, [73, Chapter 9]), we point out the following:
Remark 6.4.1 Borrowing notation and assumptions of Theorem 6.2.2, we have: lim_n ‖x_n − c*‖ ≤ lim_n ‖x_n − c‖, ∀c ∈ C. In particular, c* is the asymptotic centre of (x_n) in C.
Proof. ‖x_n − P_C x_n‖ ≤ ‖x_n − c‖, ∀c ∈ C, n ≥ 1. Take limits. ∎
The model of Subsection 6.3.1 focused on the iteration of nonexpansive maps rather than the construction of Fejér monotone sequences; it is thus incapable of recapturing the projection algorithms based on extrapolations discussed in the next chapter.
The support point algorithm of Theorem 6.3.3 is a neat application of Fejér monotonicity. By using the duality between support points and support functionals, we can also engineer an algorithm that tackles the converse problem of finding a support functional to a given point; details might someday appear elsewhere.
Finally, Jim Burke [31] pointed out that the support point algorithm of Theorem 6.3.3 is a special case of the proximal point algorithm. For more on the proximal point algorithm, the reader is referred to [130, 80].
Chapter 7
The convex feasibility problem
7.1 Overview
We describe the convex feasibility problem and its numerous manifestations and then motivate projection methods for solving it.
When dealing with two constraints, our analysis leads to an explicit algorithm. A basic convergence result, which very well illustrates how the key concepts work together, and some comments on the performance of this algorithm conclude the chapter.
7.2 The convex feasibility problem
Throughout this section, suppose C_1, ..., C_N are finitely many closed convex subsets of X with C := ∩_i C_i ≠ ∅. The Convex Feasibility Problem is simply:
(CFP) Find a point in C.
The sets C_i are also referred to as the constraint sets or constraints; the set C is the set of all solutions.
The CFP is very common in mathematical and physical sciences; a (certainly incomplete) list of areas is:
Best Approximation Theory
Constraints: closed subspaces.
CHAPTER 7. THE CONVEX FEASIBILITY PROBLEM 82
Applications: statistics (linear prediction theory), complex analysis (Bergman kernels, conformal maps), partial differential equations (Dirichlet problem); see [48].
Discrete Image Reconstruction
Constraints: convex polyhedral sets; X is Euclidean.
Applications: medical imaging (computerized tomography), electron microscopy; see [33, 35, 36, 139, 157].
Continuous Image Reconstruction
Constraints: lattice cones, closed affine subspaces, halfspaces.
Applications: computerized tomography, signal processing; see [4
Subgradient algorithms
Constraints: sublevel sets of convex functions; usually approximated by supersets.
Applications: convex inequalities, minimization of convex nonsmooth functions; see [34, 140, 99].
Typically, the solution set C is geometrically fairly complicated, whereas the constraints C_i - or suitable supersets thereof - are simple in the sense that projections are computable.
This suggests the following iterative strategy for solving the CFP: Given a current iterate x_n, produce a "better" approximation x_{n+1} using the computable projections onto the constraints. Under side conditions, the so-generated sequence (x_n) converges (in some sense) to a solution of the CFP.
All of our work up to now comes nicely into play right here: Chapter 3 exhibits many examples of computable projections. Fejér monotone sequences from Chapter 6 are superbly suited to describe "goodness": ‖x_{n+1} − c‖ ≤ ‖x_n − c‖, ∀c ∈ C, can certainly mean that x_{n+1} is "better" than x_n. The regularities of Chapter 4 and Chapter 5 will guarantee norm or even linear convergence of the sequence (x_n) to a solution.
We defer a detailed discussion of a very flexible update scheme to Section 8.3; for the remainder of this chapter, we rather analyze the two-set case in more detail to get a better sense of the problem.
7.3 Motivating projection methods
Throughout this section, we assume that
C_1 and C_2 are closed convex subsets of X with C := C_1 ∩ C_2 ≠ ∅;
for brevity, we denote the projections onto C_1, C_2, and C by P_1, P_2, and P, respectively. All results of this section actually have obvious analogues for finitely many sets; however, we work with two constraints only. (This assumption is crucial in the next section.) Our aim is to solve the convex feasibility problem:
(CFP) Find a point in C.
We try to find a solution of the CFP iteratively by constructing a Fejér monotone sequence (x_n)_{n≥0}. Given the projections P_1, P_2 and a current iterate x_n, how should we compute x_{n+1}? One way is to let x_{n+1} be the image under either P_1 or P_2: if x_n = P_1 x_{n−1} (resp. x_n = P_2 x_{n−1}), then let x_{n+1} := P_2 x_n (resp. x_{n+1} := P_1 x_n). (Note that it is pointless to apply P_1 or P_2 twice in a row because projections are idempotent.) This simple yet incredibly successful strategy is called the method of alternating projections; it dates back to 1933 and is due to John von Neumann [158, Theorem 13.7].
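For two lines through the origin in ℝ², the method of alternating projections can be sketched in a few lines of code (an illustration only; the per-sweep contraction factor cos²θ, where θ is the angle between the lines, is consistent with the angle discussion of Remark 5.5.3):

```python
import numpy as np

def proj_line(u):
    """Projector onto the line spanned by the unit vector u."""
    return lambda z: np.dot(z, u) * u

# C1, C2: two lines through the origin in R^2; C = C1 ∩ C2 = {0}.
P1 = proj_line(np.array([1.0, 0.0]))
theta = np.pi / 3
P2 = proj_line(np.array([np.cos(theta), np.sin(theta)]))

x = np.array([5.0, 2.0])
for _ in range(60):
    x = P2(P1(x))   # one sweep of the method of alternating projections

# The iterates converge to the unique point of C1 ∩ C2, here the origin;
# after the first projection, each sweep shrinks the norm by cos^2(theta).
assert np.linalg.norm(x) < 1e-10
```

With θ = π/3 the factor is 1/4 per sweep, so 60 sweeps leave an error far below the tolerance.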
Of course, there are other, well-justified ways to update x_n. Again for brevity, we denote the current iterate x_n by x and the next iterate x_{n+1} by y. We can assume WLOG that x ∉ C (else there is nothing to do) and that y ≠ x (else we make no progress). Having at hand only x, P_1x, and P_2x, how could we choose the update y? It is quite reasonable to let
y ∈ x + cone{P_1x − x, P_2x − x},
because: if C_1, C_2 are halfspaces, then Px ∈ x + cone{P_1x − x, P_2x − x} (see Fan's [64, Theorem 14 on page 129]); if we had a convex feasibility problem with only one set C_1 = C, then Px ∈ x + cone{P_1x − x}; and it makes the analysis work (an excellent reason indeed).
One cannot expect to find a solution in just one step, i.e., y ∈ C:
Example 7.3.1 (1): Let X := ℝ², C_1 be the horizontal axis, and C_2 the diagonal, so that C = C_1 ∩ C_2 is the origin. If x = (10, −1), say, then 0 ∉ x + cone{P_1x − x, P_2x − x}, but at least 0 ∈ x + span{P_1x − x, P_2x − x}. But even the bigger x + span{P_1x − x, P_2x − x} need not contain a solution (see (2)).
(2): Let C_1, C_2 be closed subspaces with nonclosed but dense sum in X and with C = C_1 ∩ C_2 = {0} (see, for instance, Example 4.8.1). Let further x ∈ X \ (C_1 + C_2). After adding nonzero vectors from C_1 and C_2 to x if necessary, we can assume that P_1x ≠ 0 and P_2x ≠ 0. This in turn implies that 0 ∉ x + span{P_1x − x, P_2x − x}.
But we agreed to require y ∈ x + cone{P_1x − x, P_2x − x}, so let us write
y = x + ρ Σ_i w_i (P_i x − x), where ρ, w_1, w_2 ≥ 0 and w_1 + w_2 = 1.
We refer to ρ as a relaxation parameter; w_1, w_2 are nonnegative weights. As we are interested in generating a Fejér monotone sequence, we must compare ‖y − c‖ to ‖x − c‖, ∀c ∈ C. We obtain:
‖y − c‖²
= ‖(x − c) + ρ Σ_i w_i (P_i x − x)‖²
= ‖x − c‖² + ρ² ‖Σ_i w_i (P_i x − x)‖² + 2ρ (x − c, Σ_i w_i (P_i x − x))
= ‖x − c‖² + ρ² ‖Σ_i w_i (P_i x − x)‖² + 2ρ Σ_i w_i ((x − P_i x, P_i x − x) + (P_i x − c, P_i x − x))
= ‖x − c‖² + ρ² ‖Σ_i w_i (P_i x − x)‖² − 2ρ Σ_i w_i ‖x − P_i x‖² + 2ρ Σ_i w_i (c − P_i x, x − P_i x).
The crux is that we do not know C! However, since the weights are nonnegative, we conclude that the last sum is nonpositive (by Theorem 3.2.1.(i)); in fact, we only have to worry about nonzero weights. (This is where we make crucial use of the assumption y ∈ x + cone{P_1x − x, P_2x − x}.) Hence we let I := {i : w_i > 0} be the set of active indices; then
‖y − c‖² − ‖x − c‖² ≤ ρ² ‖Σ_{i∈I} w_i (x − P_i x)‖² − 2ρ Σ_{i∈I} w_i ‖x − P_i x‖².   (*)
Note that the set ∩_{i∈I} C_i contains C and, most importantly, the RHS of (*) is independent of c. Thus:
The Fejér estimate ‖y − c‖ ≤ ‖x − c‖ holds true if the RHS of (*) is nonpositive. Consider the weights w_1, w_2 fixed. If ‖Σ_i w_i (x − P_i x)‖ = 0, or equivalently x ∈ Fix(Σ_i w_i P_i) = ∩_{i∈I} C_i (Observation 3.2.2), then y = x, which we excluded from our analysis (no progress would be made). Hence we rewrite our assumption y ≠ x as x ∉ ∩_{i∈I} C_i; equivalently,
‖Σ_{i∈I} w_i (x − P_i x)‖ > 0.
Then the RHS of (*) is a quadratic in ρ; this quadratic is nonpositive exactly for
ρ ∈ [0, 2 Σ_{i∈I} w_i ‖x − P_i x‖² / ‖Σ_{i∈I} w_i (x − P_i x)‖²].
The RHS of (*) becomes as small as possible when the quadratic attains its minimum; this happens for
ρ_good := Σ_{i∈I} w_i ‖x − P_i x‖² / ‖Σ_{i∈I} w_i (x − P_i x)‖².
Let us denote the update corresponding to ρ_good by y_good.
By Proposition 2.2.1, we always have ρ_good ≥ 1 and the implications: ρ_good = 1 ⇔ {x − P_i x : i ∈ I} is a singleton ⇒ y_good ∈ ∩_{i∈I} C_i.
If exactly one of the weights is positive, then ρ_good = 1 and we perform a typical alternating projection step: y_good = P_i x, for some i.
Using a relaxation parameter ρ ∈ [0, 2ρ_good] was first suggested by Merzlyakov [107] in 1962; he considered the case when the constraints are halfspaces. This idea was recently revived by Combettes [41] and by Kiwiel [100]. Let us go back to the "ideal" situation where we "know" the solution set C. For a fixed but arbitrary point c ∈ C, the expression ‖y − c‖² − ‖x − c‖² becomes as small as possible at
ρ_c := (x − c, Σ_i w_i (x − P_i x)) / ‖Σ_i w_i (x − P_i x)‖².
This time, ‖y − c‖² − ‖x − c‖² ≤ 0 if and only if ρ ∈ [0, 2ρ_c]. To generate a Fejér monotone sequence, we want this estimate to hold for every c ∈ C; thus we define
ρ_ideal := inf_{c∈C} ρ_c = [(x, Σ_i w_i (x − P_i x)) − sup_{c∈C} (c, Σ_i w_i (x − P_i x))] / ‖Σ_i w_i (x − P_i x)‖².
By its very construction, ρ_good ≤ ρ_ideal. (This inequality also follows from Theorem 3.2.1.(i).) Also, ‖y − c‖² − ‖x − c‖² ≤ 0, ∀c ∈ C, precisely when ρ ∈ [0, 2ρ_ideal]. Denote the update corresponding to ρ_ideal by y_ideal. It makes sense to think of y_ideal as the "best" update because: if C = {c} is a singleton, then y_ideal is "best" in the sense that ‖y_ideal − c‖ < ‖y − c‖, for every other update y different from y_ideal.
We now show that the updates y_good and y_ideal coincide if the constraints are cones; this improves on [41, Proposition 5.7]:
Observation 7.3.2 If each C_i is a closed convex cone, then y_good = y_ideal.
Proof. Recall that (P_i x, x − P_i x) = 0, ∀i (Theorem 3.3.3), that C^⊖ ⊇ Σ_i C_i^⊖ ∋ Σ_i w_i (I − P_i)x, and that the support function of C vanishes on C^⊖. This implies
ρ_ideal = (x, Σ_i w_i (x − P_i x)) / ‖Σ_i w_i (x − P_i x)‖² = Σ_i w_i ‖x − P_i x‖² / ‖Σ_i w_i (x − P_i x)‖² = ρ_good. ∎
When the constraint sets are not all cones, it can happen that ρ_good < ρ_ideal:
Example 7.3.3 Let X := ℝ², C_1 := B((0, 1), 1), C_2 := B((0, −1), 1), w_1 := w_2 := 1/2, and x := (1, 0). Then C = C_1 ∩ C_2 = {(0, 0)}, P_1x = (1/√2)(1, √2 − 1), P_2x = (1/√2)(1, 1 − √2), and one computes ρ_good = 2 < 2 + √2 = ρ_ideal.
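The numbers in Example 7.3.3 are easily confirmed from the defining formulas for ρ_good and ρ_c (a small numerical sketch; since C = {(0, 0)}, the infimum defining ρ_ideal is attained at c = 0):

```python
import numpy as np

def proj_ball(z, center, r=1.0):
    """Projection onto the closed ball B(center, r)."""
    d = z - np.asarray(center)
    n = np.linalg.norm(d)
    return np.asarray(center) + (d if n <= r else r * d / n)

x = np.array([1.0, 0.0])
w = np.array([0.5, 0.5])
Px = [proj_ball(x, (0.0, 1.0)), proj_ball(x, (0.0, -1.0))]

u = sum(wi * (x - Pi) for wi, Pi in zip(w, Px))            # Σ w_i (x - P_i x)
num = sum(wi * np.linalg.norm(x - Pi) ** 2 for wi, Pi in zip(w, Px))

rho_good = num / np.linalg.norm(u) ** 2
c = np.zeros(2)                                            # C = {(0, 0)}
rho_ideal = np.dot(x - c, u) / np.linalg.norm(u) ** 2      # rho_c at c = 0

assert rho_good < rho_ideal        # strict inequality: the balls are not cones
assert abs(rho_good - 2.0) < 1e-12
assert abs(rho_ideal - (2.0 + np.sqrt(2.0))) < 1e-12
```

The strict gap between the two relaxation parameters is exactly what Observation 7.3.2 rules out in the conical case.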
During the preceding discussion, we assumed that the weights w_i were fixed and concluded that ρ_good was, in the realistic absence of explicit knowledge of C, quite a good choice. So let us now assume that we have complete freedom in choosing the weights w_i and the relaxation parameter ρ. What should we do? Well, after taking a fresh look at (*), it is natural to look for the argmins of the problem
min over ρ ≥ 0, w_1, w_2 ≥ 0, w_1 + w_2 = 1 of  ρ² ‖Σ_i w_i (x − P_i x)‖² − 2ρ Σ_i w_i ‖x − P_i x‖².
In this problem, ρ is independent of the weights; thus we can and do first minimize over ρ. This is precisely how we determined ρ_good. After substituting ρ_good, we face the problem
min over w_1, w_2 ≥ 0, w_1 + w_2 = 1 of  −(Σ_i w_i ‖x − P_i x‖²)² / ‖Σ_i w_i (x − P_i x)‖².   (**)
The strategy of solving the problem for ρ first is, of course, not new. Solutions of (**) are called "deepest surrogate cuts" in [100]. The argmin of (**), together with the corresponding ρ_good, results in an update that realizes the sharpest possible estimate in (*). While it is expensive to solve (**) explicitly for more than 2 constraint sets, a complete solution is possible in our two-constraint setting; this will be achieved in the next section.
7.4 Finding good weights
In this section, we make essential use of the fact that we deal with only two constraint sets. We keep the notation and assumptions of the last section. We start by simplifying notation. Throughout, we let a := x − P_1x, b := x − P_2x, w := w_1, and we define
f(w) := −(w‖a‖² + (1 − w)‖b‖²)² / ‖wa + (1 − w)b‖².
Finding good weights thus means studying the following problem (see (**) in the previous section):
μ := inf_{[0,1]} f = inf_{w∈[0,1]} −(w‖a‖² + (1 − w)‖b‖²)² / ‖wa + (1 − w)b‖²,
and we have first to make sense of this expression (is the infimum attained? can the denominator be 0? etc.). Denote an arbitrary element of argmin_{[0,1]} f by w_good; recall that it gives rise correspondingly to the relaxation parameter ρ_good and then to the update y_good. If w is merely in [0, 1], then we nonetheless let the relaxation parameter be ρ_good and call the corresponding update simply y.
We are thus very much interested in determining argmin_{[0,1]} f. This will be done by eliminating several special cases.
When a and b are simultaneously equal to 0, then x ∈ C = C_1 ∩ C_2 and there is nothing left to do. We now turn to the other possibility for f not being everywhere defined.
Observation 7.4.1 Suppose a = 0 and b ≠ 0. Then f ≡ −‖b‖² on [0, 1[ and f is undefined at 1. Choosing w = 1 yields the update y = x (no progress); whereas for w ∈ [0, 1[, y = P_2x. Similarly for b = 0 and a ≠ 0. Altogether: if a = 0, then we let argmin_{[0,1]} f := [0, 1[; if b = 0, we let argmin_{[0,1]} f := ]0, 1].
We henceforth assume that a and b are both nonzero. We now learn that the infimization problem has solutions:
Observation 7.4.2 Because C = C_1 ∩ C_2 ≠ ∅, the vector wa + (1 − w)b ≠ 0, ∀w ∈ [0, 1]. Consequently, the function f is continuous on [0, 1] and argmin_{[0,1]} f ≠ ∅.
Since
f(w) = −(w‖a‖² + (1 − w)‖b‖²)² / (w²‖a‖² + 2w(1 − w)(a, b) + (1 − w)²‖b‖²),
we do nothing but calculus.
Observation 7.4.3 f(0) = -)lbJ12, f (1) = -)la))2, so p 2 min{-lla))2, -)lb))2).
The next observations are more involved but painlessly verified using Maple.
Observation 7.4.4 Suppose llall = Ilbll. Then it follows that 4 E argmin(o ll f and p =
-IJa1)211b)(2/($11allllbll + $(a, b)); moreover, /I < mini-llal12, -llb/12} if and only if a # b.
Altogether: if a = b, then argmin f = [O, 11 (and hence every update y belongs to C):
otherwise, a # b and argmin wll f = (4).
Observation 7.4.5 Suppose $\|a\| \neq \|b\|$ and $\langle a,b\rangle = 2\|a\|^2\|b\|^2/(\|a\|^2 + \|b\|^2)$. Then the linear factor in $f'$ (see Observation 7.4.7 below) reduces to the constant $\|b\|^2(\langle a,b\rangle - \|a\|^2)$, so $f$ is monotone; from this we conclude: if $\|a\| > \|b\|$, then $\operatorname{argmin}_{[0,1]} f = \{1\}$; if $\|a\| < \|b\|$, then $\operatorname{argmin}_{[0,1]} f = \{0\}$. Note that we can rewrite the assumption on $\langle a,b\rangle$ in terms of the harmonic mean: $\langle a,b\rangle = \operatorname{har}\{\|a\|^2, \|b\|^2\}$; in particular, $\langle a,b\rangle \geq \min\{\|a\|^2, \|b\|^2\}$.
Observation 7.4.6 Suppose $a$ and $b$ are both nonzero, different from each other, but collinear: $b = \lambda a$, where $\lambda > 0$ and $\lambda \neq 1$. Then
$$f(w) = -\|a\|^2\left(\frac{(1-w)\lambda^2 + w}{(1-w)\lambda + w}\right)^2 \quad\text{and}\quad f'(w) = (\lambda - 1)\,\frac{2\lambda\|a\|^2\big((1-w)\lambda^2 + w\big)}{\big((1-w)\lambda + w\big)^3};$$
hence $f$ is monotone. Consequently: if $\|b\| > \|a\|$, then $\operatorname{argmin}_{[0,1]} f = \{0\}$; if $\|b\| < \|a\|$, then $\operatorname{argmin}_{[0,1]} f = \{1\}$.
It is time to consider the general case. When analyzing the general case, one is led to deal with special cases; this is why we made the observations beforehand. Thus we can assume that we are not in any of the above special cases.
Since $f$ is a quotient of two quadratics, its derivative $f'$ can be written as the quotient of a quadratic and a quartic. In particular, $f$ has at most two critical points on the entire real line.
Observation 7.4.7 $f'(0) = 2(\langle a,b\rangle - \|a\|^2)$, $f'(1) = 2(\|b\|^2 - \langle a,b\rangle)$; in general, the derivative $f'(w)$ is equal to
$$\frac{2\big(w\|a\|^2 + (1-w)\|b\|^2\big)}{\|wa + (1-w)b\|^4}\Big(\big(2\|a\|^2\|b\|^2 - (\|a\|^2 + \|b\|^2)\langle a,b\rangle\big)w + \|b\|^2\big(\langle a,b\rangle - \|a\|^2\big)\Big).$$
Thus $f$ is continuously differentiable on $[0,1]$ and has two critical points in $\mathbb{R}$:
$$\frac{\|b\|^2}{\|b\|^2 - \|a\|^2} \quad\text{and}\quad \frac{\|b\|^2\big(\langle a,b\rangle - \|a\|^2\big)}{(\|a\|^2 + \|b\|^2)\langle a,b\rangle - 2\|a\|^2\|b\|^2}.$$
The former critical point always lies outside $[0,1]$ (and actually corresponds to a local maximum). Thus we are interested in the latter critical point, which we name $w_{\mathrm{crit}}$.
We have:
$$w_{\mathrm{crit}} \in \,]0,1[\, \cap \operatorname{argmin}_{[0,1]} f \quad\text{if and only if}\quad f'(0) < 0 < f'(1).$$
(It is clear that $f'(0)$ and $f'(1)$ both have to be nonzero. Also, all other possible sign combinations lead quickly to contradictions.) Put differently:
Observation 7.4.8 $w_{\mathrm{crit}} \in \,]0,1[\, \cap \operatorname{argmin}_{[0,1]} f$ if and only if $\langle a,b\rangle < \min\{\|a\|^2, \|b\|^2\}$.
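These observations about $f$ are easy to stress-test numerically. The following sketch (my own illustration in Python; the thesis's verifications were done in Maple) checks the derivative formula of Observation 7.4.7 against finite differences and the interior-minimizer criterion of Observation 7.4.8:

```python
import numpy as np

def f(w, a, b):
    """f(w) = -(w||a||^2 + (1-w)||b||^2)^2 / ||w a + (1-w) b||^2."""
    na2, nb2 = a @ a, b @ b
    num = (w * na2 + (1 - w) * nb2) ** 2
    den = np.sum((w * a + (1 - w) * b) ** 2)
    return -num / den

def f_prime(w, a, b):
    """Closed form of f'(w) from Observation 7.4.7."""
    na2, nb2, ab = a @ a, b @ b, a @ b
    s = w * na2 + (1 - w) * nb2
    den = np.sum((w * a + (1 - w) * b) ** 2) ** 2
    linear = (2 * na2 * nb2 - (na2 + nb2) * ab) * w + nb2 * (ab - na2)
    return 2 * s * linear / den

def w_crit(a, b):
    """The interior critical point of Observation 7.4.7."""
    na2, nb2, ab = a @ a, b @ b, a @ b
    return nb2 * (ab - na2) / ((na2 + nb2) * ab - 2 * na2 * nb2)
```

For instance, with $a = (1,0)$ and $b = (0,1)$ we have $\langle a,b\rangle = 0 < 1 = \min\{\|a\|^2, \|b\|^2\}$, so an interior minimizer is predicted; indeed $w_{\mathrm{crit}} = \tfrac12$ and $f(\tfrac12) = -2 < \min\{-\|a\|^2, -\|b\|^2\}$.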
Fortunately, we are able to summarize all our observations into one:
Observation 7.4.9 If $\langle a,b\rangle < \min\{\|a\|^2, \|b\|^2\}$, then
$$w_{\mathrm{good}} = w_{\mathrm{crit}} = \frac{\|b\|^2\big(\langle a,b\rangle - \|a\|^2\big)}{(\|a\|^2 + \|b\|^2)\langle a,b\rangle - 2\|a\|^2\|b\|^2},$$
where $f(w_{\mathrm{good}}) < \min\{-\|a\|^2, -\|b\|^2\}$; otherwise, $f(w_{\mathrm{good}}) = \min\{-\|a\|^2, -\|b\|^2\}$, where
$$w_{\mathrm{good}} = \begin{cases} 1, & \text{if } \|a\| > \|b\|;\\ 0, & \text{otherwise.} \end{cases}$$
Turning back to our original notation, we obtain:
Observation 7.4.10 Given $x$, we compute the weight $w_1$ as follows.
Case 1: $\langle x - P_1x, x - P_2x\rangle < \min\{\|x - P_1x\|^2, \|x - P_2x\|^2\}$. Then:
$$w_1 := \frac{\|x - P_2x\|^2\big(\langle x - P_1x, x - P_2x\rangle - \|x - P_1x\|^2\big)}{\big(\|x - P_1x\|^2 + \|x - P_2x\|^2\big)\langle x - P_1x, x - P_2x\rangle - 2\|x - P_1x\|^2\|x - P_2x\|^2}.$$
Case 2: $\langle x - P_1x, x - P_2x\rangle \geq \min\{\|x - P_1x\|^2, \|x - P_2x\|^2\}$. Then:
$$w_1 := \begin{cases} 1, & \text{if } \|x - P_1x\| > \|x - P_2x\|;\\ 0, & \text{otherwise.} \end{cases}$$
Now $w_2 = 1 - w_1$. What about the relaxation parameter $\mu_{\mathrm{good}}$ (see page 85) corresponding to these weights? In Case 1, we have
$$\mu_{\mathrm{good}} = \frac{w_1\|x - P_1x\|^2 + w_2\|x - P_2x\|^2}{\|w_1(x - P_1x) + w_2(x - P_2x)\|^2},$$
whereas $\mu_{\mathrm{good}} = 1$ in Case 2. And the update $y_{\mathrm{good}}$ corresponding to these parameters? Well, in Case 1:
$$y_{\mathrm{good}} = x + \mu_{\mathrm{good}}\big(w_1(P_1x - x) + w_2(P_2x - x)\big).$$
Case 2 is much simpler: $y_{\mathrm{good}} = P_1x$, if $\|x - P_1x\| > \|x - P_2x\|$; $y_{\mathrm{good}} = P_2x$, otherwise. The following estimates hold (by (**) in the last section, the definition of $f$, and Observation 7.4.9) for every $c \in C$. In Case 1:
$$\|y_{\mathrm{good}} - c\|^2 - \|x - c\|^2 \leq f(w_{\mathrm{good}}) < \min\{-\|x - P_1x\|^2, -\|x - P_2x\|^2\},$$
whereas in Case 2, we obtain the less sharp
$$\|y_{\mathrm{good}} - c\|^2 - \|x - c\|^2 \leq \min\{-\|x - P_1x\|^2, -\|x - P_2x\|^2\}.$$
Therefore, in either case:
$$\|y_{\mathrm{good}} - c\|^2 - \|x - c\|^2 \leq -\max\{d^2(x, C_1), d^2(x, C_2)\}.$$
In the next section, we will see that any sequence whose terms satisfy an analogue of the
last inequality has pleasant convergence properties.
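In code, the weight selection of Observation 7.4.10 amounts to only a few lines. The sketch below is an illustration of mine (the thesis worked in Maple); `p1x` and `p2x` stand for the computed projections $P_1x$ and $P_2x$:

```python
import numpy as np

def good_weight(x, p1x, p2x):
    """w1 from Observation 7.4.10; w2 = 1 - w1.  Here a = x - P1x, b = x - P2x."""
    a, b = x - p1x, x - p2x
    na2, nb2, ab = a @ a, b @ b, a @ b
    if ab < min(na2, nb2):   # Case 1: the interior critical point w_crit
        return nb2 * (ab - na2) / ((na2 + nb2) * ab - 2 * na2 * nb2)
    # Case 2: put all weight on the farther set
    return 1.0 if na2 > nb2 else 0.0
```

By symmetry, orthogonal residuals of equal length give $w_1 = \tfrac12$, while aligned residuals fall into Case 2 and select the farther set.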
We conclude by visualizing some of the numerical quantities of Observation 7.4.10.
We assume WLOG $\|x - P_1x\| \geq \|x - P_2x\|$. Since the weights and the relaxation parameter depend only on the angle between $P_1x - x$ and $P_2x - x$ and the magnitudes $\|x - P_1x\|$, $\|x - P_2x\|$, we can WLOG work in $\mathbb{R}^2$ and further assume (after translating and scaling if necessary) $x = (0,0)$ and $P_1x = (0,1)$. Hence $P_2x =: (u,v)$ has norm less than or equal to $1$. We now vary $(u,v)$ in the unit disc and plot: the weight $w_1$ as defined in Observation 7.4.10, the corresponding relaxation parameter $\mu_{\mathrm{good}}$, and what we could call the "progress" function, i.e., the sharpest estimate we have on $\|y_{\mathrm{good}} - c\|^2 - \|x - c\|^2$, for all $c \in C$ (see Observation 7.4.10).
We see from Figure 7.1 that Case 1 of Observation 7.4.10, i.e., $w_1 < 1$, happens if the angle between $P_1x - x$ and $P_2x - x$ is large. In this case, dramatic overrelaxations are selected (Figure 7.2) and the "progress" function yields much tighter estimates than $\min\{-\|x - P_1x\|^2, -\|x - P_2x\|^2\}$ (Figure 7.3).
Figure 7.1: The weight wl as a function of P2x - x.
7.5 A fun algorithm
The determination of the update $y_{\mathrm{good}}$ in the previous section immediately suggests a corresponding algorithm: let $x$ be the current iterate $x_n$, compute $y_{\mathrm{good}}$ according to Observation 7.4.10, and then set $x_{n+1} := y_{\mathrm{good}}$. The starting point $x_0$ is chosen arbitrarily in $X$.
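A minimal self-contained rendition of one (FUN) step for two sets, given their projectors (my own Python sketch; the thesis's experiments used Maple). The weights follow Observation 7.4.10 and the relaxation parameter is the corresponding $\mu_{\mathrm{good}}$; the two lines below are the constraints of Observation 7.5.3, for which $C = \{(0,1)\}$:

```python
import numpy as np

def fun_step(x, proj1, proj2):
    """One (FUN) update: weights from Observation 7.4.10, relaxation mu_good."""
    p1x, p2x = proj1(x), proj2(x)
    a, b = x - p1x, x - p2x
    na2, nb2, ab = a @ a, b @ b, a @ b
    if na2 == 0 and nb2 == 0:
        return x                           # x is already in C1 and C2
    if ab < min(na2, nb2):                 # Case 1: interior weight w_crit
        w1 = nb2 * (ab - na2) / ((na2 + nb2) * ab - 2 * na2 * nb2)
    else:                                  # Case 2: all weight on the farther set
        w1 = 1.0 if na2 > nb2 else 0.0
    d = w1 * a + (1 - w1) * b
    mu = (w1 * na2 + (1 - w1) * nb2) / (d @ d)   # mu_good
    return x - mu * d

# The two lines of Observation 7.5.3; C = C1 n C2 = {(0, 1)}.
def proj_line1(z):
    """Orthogonal projection onto C1 = {y = x + 1}."""
    t = (z[0] + z[1] - 1) / 2
    return np.array([t, t + 1])

def proj_line2(z):
    """Orthogonal projection onto C2 = {y = 1}."""
    return np.array([z[0], 1.0])
```

Starting from $(1,-10)$, the iterates are Fejér monotone with respect to $(0,1)$ and close in on $C$ within a handful of steps, in line with Observation 7.5.3.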
Figure 7.2: The relaxation parameter $\mu_{\mathrm{good}}$ as a function of $P_2x - x$.
Figure 7.3: The "progress" estimate as a function of P2x - x.
We call this algorithm simply (FUN). (This algorithm is probably not new; however, I have
not seen its explicit formulae anywhere else.)
We investigate (a generalization of) (FUN) theoretically and practically in the following two
subsections.
7.5.1 Convergence results
We now show how Fejér monotonicity and the regularities combine to yield powerful convergence results.
Theorem 7.5.1 Suppose $C_1$, $C_2$ are closed convex subsets of $X$ with $C := C_1 \cap C_2 \neq \emptyset$. Suppose further that $(x_n)_{n\geq 0}$ is Fejér monotone with respect to $C$ with
$$(*)\qquad \|x_{n+1} - c\|^2 \leq \|x_n - c\|^2 - \max\{d^2(x_n, C_1), d^2(x_n, C_2)\}, \quad \forall c \in C,\ \forall n \geq 0.$$
Let $c^* := \lim_n P_Cx_n \in C$ (see Theorem 6.2.2.(iii)). Then:
(i) The sequence $(x_n)$ converges weakly to $c^*$.
(ii) If $\{C_1, C_2\}$ is boundedly regular, then $(x_n)$ converges in norm to $c^*$.
(iii) If $\{C_1, C_2\}$ is boundedly linearly regular, then $(x_n)$ converges linearly to $c^*$.
(iv) If $\{C_1, C_2\}$ is linearly regular, then $(x_n)$ converges linearly to $c^*$ with a rate independent of the starting point.
Proof. The sequence $(x_n)$ is bounded and so is $S := \{x_n : n \geq 0\}$. Also, the sequence $(d(x_n, C))$ is convergent (Theorem 6.2.2.(i)); hence $d^2(x_n, C) - d^2(x_{n+1}, C) \to 0$, which, by $(*)$, yields
$$\max\{d(x_n, C_1), d(x_n, C_2)\} \to 0.$$
(i): $(*)$ implies that every weak cluster point of $(x_n)$ must lie in $C$. But $(x_n)$ has at most one weak cluster point in $C$, namely $c^*$ (Theorem 6.2.2.(iii)). Hence $(x_n)$ converges weakly to $c^*$. (ii): Bounded regularity (for $S$) and $(*)$ yield $d(x_n, C) \to 0$, which is equivalent to norm convergence of $(x_n)$ to $c^*$ (Theorem 6.2.2.(iii)). (iii): There exists $\kappa > 0$ (depending on $S$) such that $d(x_n, C) \leq \kappa\max\{d(x_n, C_1), d(x_n, C_2)\}$, $\forall n \geq 0$. Hence $d^2(x_n, C) \leq \kappa^2\big(d^2(x_n, C) - d^2(x_{n+1}, C)\big)$, which in turn implies (using Theorem 6.2.2.(vi)) that $(x_n)$ converges linearly to $c^*$ with rate $\sqrt{1 - 1/\kappa^2}$. (iv): analogous to (iii), with the important difference that we can pick $\kappa$ independent of $S$. ∎
Remarks 7.5.2
(i) Two instances of algorithms satisfying the assumptions of Theorem 7.5.1 are:
the method of alternating projections; and
the (FUN) algorithm outlined at the beginning of this section.
(ii) If $C$ is an affine subspace, then the weak limit $c^*$ of $(x_n)$ from Theorem 7.5.1 is actually $P_Cx_0$ (by Theorem 6.2.2.(iv)) and $(x_n)$ thus converges weakly to a solution of the best approximation problem $\inf_{c\in C}\|x_0 - c\|$.
(iii) Finitely many constraints $C_1, \ldots, C_N$ can be dealt with by working in the product space with just two sets $A$ and $C$; see Fact 2.6.17.
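Remark (iii) deserves a small illustration. Fact 2.6.17 is not restated in this chunk, but the underlying product-space device (due to Pierra) is standard: in $X^N$, the product $C_1 \times \cdots \times C_N$ has a coordinatewise projection, and the diagonal $\{(x, \ldots, x)\}$ has averaging as its projection, so a two-set method solves the $N$-set problem. A sketch under these assumptions (the helper names and the three halfplanes are mine, not the thesis's):

```python
import numpy as np

def proj_halfspace(z, a, b):
    """Orthogonal projection onto the halfspace {z : <a, z> <= b}."""
    return z - max(0.0, a @ z - b) / (a @ a) * a

def proj_product(zs, projs):
    """The projection onto C1 x ... x CN acts coordinatewise."""
    return [p(z) for p, z in zip(projs, zs)]

def proj_diagonal(zs):
    """The projection onto the diagonal {(x, ..., x)} averages the components."""
    m = np.mean(zs, axis=0)
    return [m.copy() for _ in zs]

# three halfplanes in R^2 whose intersection is the triangle (0,0), (2,0), (0,2)
projs = [
    lambda z: proj_halfspace(z, np.array([-1.0, 0.0]), 0.0),  # x >= 0
    lambda z: proj_halfspace(z, np.array([0.0, -1.0]), 0.0),  # y >= 0
    lambda z: proj_halfspace(z, np.array([1.0, 1.0]), 2.0),   # x + y <= 2
]

zs = [np.array([3.0, -1.0])] * 3          # diagonal starting point in X^3
for _ in range(500):                      # alternating projections in the product space
    zs = proj_diagonal(proj_product(zs, projs))
x = zs[0]                                 # the common diagonal component
```

After the loop, the diagonal component (approximately) satisfies all three constraints simultaneously, which is exactly the reduction of the three-set problem to a two-set one.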
7.5.2 Numerical experiments
We briefly report on two numerical experiments we made on the (FUN) algorithm. We worked in Maple with precision Digits := 15. Three algorithms were compared. The only difference was how we picked the weights; once these were picked, we always took the corresponding $\mu_{\mathrm{good}}$ (see page 85) as the relaxation parameter. For the (FUN) algorithm, the weights were picked according to Observation 7.4.10. Let us abbreviate von Neumann's method of alternating projections by (vN); for fairness, the first set we projected onto was the one which was farther away. Thus, if (FUN) never encounters the case where it selects those sneakily calculated weights, then (FUN) is just the same as (vN), which always selects weights equal to 0 or 1. Finally, we denote Merzlyakov's method by (M); that is, we let the weights be always equal to $\tfrac{1}{2}$. (For the choice of the name, see page 85.)
All the constraints lie in the Euclidean plane $\mathbb{R}^2$.
Observation 7.5.3 We let $C_1$ be the line $y = x + 1$ and $C_2$ be the line $y = 1$. Then $C = C_1 \cap C_2 = \{(0,1)\}$. For 10 iterations, we describe how the algorithms do for the following starting points:
$(1,10)$: (FUN) and (vN) coincide; (M) does best.
$(1,-10)$: (M) defeats (vN); but (FUN) reaches $C$ in 4 steps.
$(-1,10)$: (M) surpasses (vN); (FUN) reaches $C$ in 7 steps.
$(-1,-10)$: (M) wins; (FUN) and (vN) coincide.
One can experiment with other starting points and obtain different results. Sometimes (FUN) does very well - it has the potential to reach the solution in just a few steps under "good circumstances" - in other cases, (M) wins. (vN) does surprisingly well, given its simplicity.
Let us now consider a nonlinear setting.
Observation 7.5.4 We let $C_1$ be the unit ball and $C_2$ be the line $y = 1$, so that $C = C_1 \cap C_2 = \{(0,1)\}$. We considered 20 iterations since the geometry is so "bad" that the algorithms do not produce approximate solutions in just a few steps. We looked at the starting points $(1,10)$, $(10,1)$, and $(10,10)$. In all three cases, (FUN) wins and (vN) is never worse than (M).
Conclusion: There is no such thing as a "best" algorithm. The performance does depend on the geometry of the sets as well as the starting point. If you can afford it, run them all.
Finally, I confess to the dear reader that the (FUN) algorithm appears to be numerically unstable (different settings of Maple's precision resulted in different behaviour of (FUN)) - not a total surprise in view of Figure 7.1. In practice, I would recommend safeguarding the (FUN) algorithm by ignoring weights that are very close but not equal to 0 or 1.
7.6 Notes
General recent pointers to the vast literature on Convex Feasibility Problems include [14, 40, 41].
Section 7.3 provides a rationale for using projection methods to solve convex feasibility
problems. While the experienced "projection method buff" will not find much new in this
section, I very much enjoyed writing it.
Even more fun is Section 7.4, where I find the "best" update for convex feasibility problems with two constraints; I have not seen this analysis anywhere but doubt that it is new. The corresponding algorithm, which I call (FUN), is investigated in Section 7.5. Finding the "best" update for more than two sets requires solving a quadratic programming problem and is thus more demanding; see [101, Example 8.6 and Remark 8.7].
Theorem 7.5.1 is the "mother of all convergence results for projection methods"; it covers the (FUN) algorithm as well as the method of alternating projections. It is no exaggeration to say that if one understands how this result brings together Fejér monotonicity and the regularities, then one knows what projection methods are all about - the rest is technical complications.
Chapter 8
The general projection algorithm
8.1 Overview
This chapter contains the algorithmic "backbone" of Part I. Deferring applications to the next chapter, we present basic convergence results on projection algorithms. New key ingredients are the notions of focusing algorithms, which work hand in hand with regularities and together yield powerful convergence results. Tables, helping the reader not to get lost in the babel of notions, are included for convenience. In the concluding notes, we compare our set-up to some other, very recent, important frameworks.
8.2 One step at a time
Suppose $C_1, \ldots, C_N$ are finitely many closed convex subsets of $X$ with $C := \bigcap_i C_i \neq \emptyset$. Our aim is to solve the convex feasibility problem: (CFP) Find a point in $C$. We abbreviate each projection $P_{C_i}$ by $P_i$, and $P_C$ by $P$.
We are given a point $x$ in $X$, which we think of as the current iterate of an algorithm, and we want to find a "better" point $y$, i.e., the next iterate. As motivated in Section 7.3, a promising candidate is
$$y := x + \alpha\mu\sum_i w_i(P_ix - x);$$
here, $\alpha$ and $\mu$ are two parameters and the $w_i$ are nonnegative weights. (The product $\alpha\mu$ is just like the single relaxation parameter in Section 7.3; however, I chose to have two parameters to facilitate comparisons to other methods.) Only nonzero weights matter; thus
we let $I$ be the set of active indices $\{i : w_i > 0\}$. Then $y = x + \alpha\mu\sum_{i\in I} w_i(P_ix - x)$. Note that $y = x$ (i.e., no change) precisely when $\alpha\mu = 0$ or $\sum_{i\in I} w_i(P_ix - x) = 0$. The latter condition is equivalent to $x \in \operatorname{Fix}\big(\sum_{i\in I} w_iP_i\big)$ and to $x \in \bigcap_{i\in I} C_i$ (by Observation 3.2.2).
Observation 8.2.1
(i) $\big\|\sum_{i\in I} w_i(x - P_ix)\big\|^2 = 0$ if and only if $x \in \bigcap_{i\in I} C_i$.
(ii) $\big\|\sum_{i\in I} w_i(x - P_ix)\big\|^2 = \sum_{i\in I} w_i\|x - P_ix\|^2$ if and only if $\{P_ix : i \in I\}$ is a singleton.
Proof. We already discussed (i); (ii) follows from Proposition 2.2.1. ∎
Thus the quantity
$$R := \begin{cases} \dfrac{\sum_{i\in I} w_i\|x - P_ix\|^2}{\big\|\sum_{i\in I} w_i(x - P_ix)\big\|^2}, & \text{if } x \notin \bigcap_{i\in I} C_i;\\[1ex] 1, & \text{otherwise,} \end{cases}$$
is well-defined (a.k.a. $\mu_{\mathrm{good}}$ in the two-set case; see page 85) and greater than or equal to $1$. We assume throughout
$$\alpha \in [0,2] \quad\text{and}\quad \mu \in [1, R].$$
Then in particular $\alpha\mu \in [0, 2R]$ (which corresponds to $\mu \in [0, 2\mu_{\mathrm{good}}]$ in the two-set discussion on page 85). The conditions imposed on $\alpha$ and $\mu$ are precisely what is needed to make the following:
Observation 8.2.2 $\|x - y\| = \alpha\mu\,\big\|\sum_i w_i(x - P_ix)\big\|$ and, for every $c \in \bigcap_{i\in I} C_i$, we have the following estimates:
$$\begin{aligned} \|y - c\|^2 - \|x - c\|^2 &\leq \alpha\mu(\alpha\mu - 2)\sum_i w_i\|x - P_ix\|^2 - \frac{(\alpha\mu)^2}{2}\sum_{i,j} w_iw_j\|P_ix - P_jx\|^2\\ &\leq -\alpha(2-\alpha)\mu\sum_i w_i\|x - P_ix\|^2\\ &\leq -\frac{2-\alpha}{\alpha}\,\|y - x\|^2 \qquad (\text{for } \alpha > 0)\\ &\leq 0, \end{aligned}$$
where $\big\|\sum_i w_i(x - P_ix)\big\|^2 = \sum_i w_i\|x - P_ix\|^2 - \tfrac{1}{2}\sum_{i,j} w_iw_j\|P_ix - P_jx\|^2$. Also, $\|y - x\|^2 \leq \alpha^2\mu\sum_i w_i\|x - P_ix\|^2$.
Proof. The identity is clear; so fix $c \in \bigcap_{i\in I} C_i$. On the one hand, the proof of the inequality (*) in Section 7.3 generalizes to finitely many sets and yields
$$\|y - c\|^2 - \|x - c\|^2 \leq (\alpha\mu)^2\big\|\sum_i w_i(x - P_ix)\big\|^2 - 2\alpha\mu\sum_i w_i\|x - P_ix\|^2.$$
On the other hand, by Proposition 2.2.1,
$$\big\|\sum_i w_i(x - P_ix)\big\|^2 = \sum_i w_i\|x - P_ix\|^2 - \tfrac{1}{2}\sum_{i,j} w_iw_j\|P_ix - P_jx\|^2.$$
Altogether, the first displayed inequality and the displayed equality hold. Next, $\mu \leq R$ implies $\mu\big\|\sum_i w_i(x - P_ix)\big\|^2 \leq \sum_i w_i\|x - P_ix\|^2$, hence
$$(\alpha\mu)^2\big\|\sum_i w_i(x - P_ix)\big\|^2 - 2\alpha\mu\sum_i w_i\|x - P_ix\|^2 \leq (\alpha^2\mu - 2\alpha\mu)\sum_i w_i\|x - P_ix\|^2 = -\alpha(2-\alpha)\mu\sum_i w_i\|x - P_ix\|^2,$$
which verifies the second displayed inequality. The third displayed inequality follows from $\|y - x\|^2 = \alpha^2\mu^2\big\|\sum_i w_i(x - P_ix)\big\|^2 \leq \alpha^2\mu\sum_i w_i\|x - P_ix\|^2$, whence $-\alpha(2-\alpha)\mu\sum_i w_i\|x - P_ix\|^2 \leq -((2-\alpha)/\alpha)\|y - x\|^2$. The last displayed inequality is trivial. Finally, the "Also" part was just established. ∎
Consider the term $\alpha\mu(\alpha\mu - 2)\sum_i w_i\|x - P_ix\|^2$ from the first inequality in Observation 8.2.2. Note that we know its sign for sure only when $\mu = 1$. This corresponds to the algorithms we investigated in great detail in [14].
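Observation 8.2.2 is easy to stress-test numerically. The following sketch (an illustration of mine, using halfspaces through the origin so that $c = 0$ is a common point) samples random weights, $\alpha \in [0,2]$, and $\mu \in [1, R]$, and checks the second displayed estimate together with $R \geq 1$:

```python
import numpy as np

def proj_halfspace(z, a):
    """Projection onto the halfspace {z : <a, z> <= 0}; it contains c = 0."""
    return z - max(0.0, a @ z) / (a @ a) * a

rng = np.random.default_rng(0)
violations = 0
for _ in range(200):
    A = rng.normal(size=(3, 2))                   # normals of 3 halfspaces through 0
    x = rng.normal(size=2)                        # current iterate
    w = rng.random(3); w /= w.sum()               # convex weights
    res = [x - proj_halfspace(x, a) for a in A]   # residuals x - P_i x
    d = sum(wi * r for wi, r in zip(w, res))
    s = sum(wi * (r @ r) for wi, r in zip(w, res))
    if s == 0.0:
        continue                                  # x is already feasible
    R = s / (d @ d)                               # extrapolation bound
    alpha = 2 * rng.random()                      # alpha in [0, 2]
    mu = 1 + (R - 1) * rng.random()               # mu in [1, R]
    y = x - alpha * mu * d                        # y = x + alpha mu sum_i w_i (P_i x - x)
    lhs = y @ y - x @ x                           # ||y - c||^2 - ||x - c||^2 with c = 0
    rhs = -alpha * (2 - alpha) * mu * s           # second displayed estimate
    if lhs > rhs + 1e-9 or R < 1.0 - 1e-12:
        violations += 1
```

No sample should violate the estimate; this is a sanity check of the algebra above, not a proof.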
8.3 The general projection algorithm
Throughout the remainder of this chapter, we assume the following.
SETTING. Suppose $C_1, \ldots, C_N$ are finitely many closed convex subsets of $X$ with
$$C := \bigcap_{i=1}^N C_i \neq \emptyset.$$
We aim to solve the convex feasibility problem
(CFP) Find a point in $C$.
Suppose further that $(C_{i,n})_{n\geq 0}$ is a sequence of closed convex supersets of $C_i$, $\forall i$. Denote the projections onto $C_{i,n}$, $C_i$, $C$ by $P_{i,n}$, $P_i$, $P$, respectively. We tackle (CFP) by studying the (projection) algorithm that generates a sequence $(x_n)_{n\geq 0}$ by
$$x_{n+1} := x_n + \alpha_n\mu_n\sum_i w_{i,n}(P_{i,n}x_n - x_n).$$
The point $x_0$ is referred to as the starting point. The $w_{i,n}$ are nonnegative weights (which add up to $1$), and we let $I_n := \{i : w_{i,n} > 0\}$ be the set of active indices, $\forall n \geq 0$. We also assume that each index is picked infinitely often, i.e., $\{n : i \in I_n\}$ is infinite, $\forall i$, and speak of random or repetitive control. Each $\alpha_n$ is a relaxation parameter in $[0,2]$ and each
$\mu_n$ is an extrapolation parameter in $[1, R_n]$, where
$$R_n := \begin{cases} \dfrac{\sum_i w_{i,n}\|x_n - P_{i,n}x_n\|^2}{\big\|\sum_i w_{i,n}(x_n - P_{i,n}x_n)\big\|^2}, & \text{if } x_n \notin \bigcap_{i\in I_n} C_{i,n};\\[1ex] 1, & \text{otherwise.} \end{cases}$$
It will be handy to abbreviate $\bar\alpha := \limsup_n \alpha_n$.
Proposition 8.3.1 For every $n \geq 0$ and every $c \in \bigcap_{i\in I_n} C_{i,n}$,
$$\|x_{n+1} - c\|^2 - \|x_n - c\|^2 \leq -\alpha_n(2-\alpha_n)\mu_n\sum_i w_{i,n}\|x_n - P_{i,n}x_n\|^2 \leq -\frac{2-\alpha_n}{\alpha_n}\,\|x_{n+1} - x_n\|^2 \leq 0,$$
and $\|x_{n+1} - x_n\|^2 \leq \alpha_n^2\mu_n\sum_i w_{i,n}\|x_n - P_{i,n}x_n\|^2$.
Proof. (See also [14, Lemma 3.2].) Apply Observation 8.2.2 to the supersets $C_{i,n}$, $\forall n$. ∎
Corollary 8.3.2 Every sequence $(x_n)$ generated by the algorithm is Fejér monotone with respect to $C$. Also:
(i) $\sum_n \alpha_n(2-\alpha_n)\mu_n\sum_i w_{i,n}\|x_n - P_{i,n}x_n\|^2 < +\infty$.
Proof. (See also [14, Lemma 3.2].) Since $\bigcap_{i\in I_n} C_{i,n} \supseteq C$, $\forall n$, the result follows easily from Proposition 8.3.1. ∎
Throughout, we also let (see Theorem 6.2.2.(iii))
$$c^* := \lim_n Px_n.$$
We will repeatedly use the fact (again Theorem 6.2.2.(iii)) that $(x_n)$ possesses at most one weak cluster point in $C$, namely $c^*$.
8.4 Asymptotically regular algorithms
We say the algorithm is asymptotically regular if every sequence generated by the algorithm is. This is a necessary condition for an algorithm to generate norm convergent sequences.
Proposition 8.4.1
(i) If $\operatorname{int} C \neq \emptyset$, then $\sum_n \|x_n - x_{n+1}\| < +\infty$ and $(x_n)$ thus converges.
(ii) If $\bar\alpha < 2$, then $\sum_n \|x_n - x_{n+1}\|^2 < +\infty$.
In particular, every algorithm with (i) or (ii) is asymptotically regular.
Proof. (See also [101, Corollary 3.4 and Corollary 3.5].) (i): follows from Theorem 6.2.2.(v). (ii): There exists WLOG $\epsilon > 0$ with $\alpha_n \leq 2 - \epsilon$, $\forall n$. Then, using Proposition 8.3.1,
$$\epsilon\|x_n - x_{n+1}\|^2 \leq (2-\alpha_n)\alpha_n^2\mu_n\sum_i w_{i,n}\|x_n - P_{i,n}x_n\|^2 \leq 2\alpha_n(2-\alpha_n)\mu_n\sum_i w_{i,n}\|x_n - P_{i,n}x_n\|^2, \quad \forall n.$$
Now sum over $n$ and recall Corollary 8.3.2.(i). ∎
Here is a sufficient condition guaranteeing that the distance to the supersets gets small:
Proposition 8.4.2 If $\liminf_{n:i\in I_n} \alpha_n(2-\alpha_n)\mu_nw_{i,n} > 0$ for some $i$, then $\lim_{n:i\in I_n} x_n - P_{i,n}x_n = 0$.
Proof. Clear from Corollary 8.3.2.(i). ∎
8.5 Focusing algorithms
We say the algorithm is focusing if
$$\left.\begin{aligned} x_{k_n} &\rightharpoonup x,\\ x_{k_n} - P_{i,k_n}x_{k_n} &\to 0,\\ i &\in I_{k_n} \end{aligned}\right\} \quad\Longrightarrow\quad x \in C_i,$$
for each $i$ and every subsequence $(x_{k_n})$ of every sequence $(x_n)$ generated by the algorithm. This condition helps ensure that weak limit points of $(x_n)$ lie in the desired set. Here is a verifiable sufficient condition:
Proposition 8.5.1 If $(P_{i,n})$ converges actively pointwise to $P_i$, i.e., $\lim_{n:i\in I_n} P_{i,n}x = P_ix$, $\forall x \in X$, for every $i$, then the algorithm is focusing and each $C_i = \bigcap_{n:i\in I_n} C_{i,n}$.
Proof. Fact 2.2.5; the focusing part also follows from Theorem 2.4.1. ∎
In particular, the algorithm is focusing if each $C_{i,n} \equiv C_i$, i.e., the algorithm has constant sets.
Theorem 8.5.2 (a dichotomy result) Suppose the algorithm is focusing and $\liminf_n \alpha_n(2-\alpha_n)\mu_nw_{i,n} > 0$, for each $i$. Then the sequence $(x_n)$ either converges in norm to $c^*$ or has no norm cluster points at all.
Proof. (See also [101, Theorem 3.8] and [14, Theorem 3.10].) Since $(x_n)$ is Fejér monotone with respect to $C$ (Corollary 8.3.2), it suffices to show that norm cluster points of $(x_n)$ (should they exist) lie in $C$ (Theorem 6.2.2.(iii)). Assume the opposite. Then there exists a subsequence $(x_{k_n})$ converging in norm to some point $x \notin C$. Define $I_{\mathrm{in}} := \{i : x \in C_i\}$ and $I_{\mathrm{out}} := \{i : x \notin C_i\}$; then $I_{\mathrm{out}} \neq \emptyset$. After passing to a subsequence if necessary, we assume that $I_{k_n} \cup I_{k_n+1} \cup \cdots \cup I_{k_{n+1}-1} = \{1, 2, \ldots, N\}$. Now let $k_n \leq m_n \leq k_{n+1} - 1$ be minimal such that $I_{m_n} \cap I_{\mathrm{out}} \neq \emptyset$, i.e., if $k_n \leq l < m_n$, then $I_l \subseteq I_{\mathrm{in}}$. Since $x \in \bigcap_{i\in I_{\mathrm{in}}} C_i$, repeated use of Proposition 8.3.1 yields $\|x_{k_n} - x\| \geq \|x_{m_n} - x\|$, which in turn implies
(1) $x_{m_n} \to x$.
After another pass to a subsequence if necessary, we assume that there is some index $i$ such that
(2) $i \in I_{m_n} \cap I_{\mathrm{out}}$, $\forall n$.
In view of the assumption on the parameters and Proposition 8.4.2,
(3) $x_{m_n} - P_{i,m_n}x_{m_n} \to 0$.
Now the algorithm is focusing; thus (1), (2), and (3) result in $x \in C_i$, which is absurd. ∎
We say the algorithm is intermittent, and speak of intermittent control, if there exists a positive integer $p$ such that $I_n \cup I_{n+1} \cup \cdots \cup I_{n+p-1} = \{1, 2, \ldots, N\}$, $\forall n \geq 0$. The method of alternating projections, among many other projection algorithms, is governed by intermittent control.
Theorem 8.5.3 (weak topology results) Suppose the algorithm is focusing.
(i) If the algorithm is intermittent, $\bar\alpha < 2$, and each $\liminf_{n:i\in I_n} \alpha_n\mu_nw_{i,n} > 0$, then $(x_n)$ is asymptotically regular, converges weakly to $c^*$, and $\max_{i\in I_n} d(x_n, C_{i,n}) \to 0$.
(ii) If the sequence $(x_n)$ converges weakly to some $x \in X$ and $\sum_n \alpha_n(2-\alpha_n)\mu_nw_{i,n} = +\infty$ for some index $i$, then $x \in C_i$ and $\liminf_{n:i\in I_n} d(x_n, C_{i,n}) = 0$.
Proof. (See also [14, Theorem 3.20].) (i): Asymptotic regularity comes from Proposition 8.4.1.(ii) and $\max_{i\in I_n} d(x_n, C_{i,n}) \to 0$ from Proposition 8.4.2. Suppose to the contrary that $(x_n)$ does not converge weakly to $c^*$. So pick a subsequence $(x_{k_n})$ of $(x_n)$, $x \in X$, and some index $i$ such that $x_{k_n} \rightharpoonup x \notin C_i$. By intermittence, there exist a positive integer $p$ and a sequence $(m_n)$ with $k_n \leq m_n \leq k_n + p - 1$ and $i \in I_{m_n}$, $\forall n$. Because the algorithm is asymptotically regular, $x_{k_n} - x_{m_n} \to 0$. It follows that $(x_{m_n})$ converges weakly to $x$ as well. Also, $x_{m_n} - P_{i,m_n}x_{m_n} \to 0$. Since the algorithm is focusing, we have $x \in C_i$, which is the desired contradiction. Hence $x_n \rightharpoonup c^*$. (ii): Corollary 8.3.2.(i) yields $\liminf_{n:i\in I_n}\|x_n - P_{i,n}x_n\| = 0$. Since the algorithm is focusing, it follows that $x \in C_i$. ∎
8.6 Strongly focusing algorithms
We say the algorithm is strongly focusing if
$$\left.\begin{aligned} x_{k_n} &\rightharpoonup x,\\ x_{k_n} - P_{i,k_n}x_{k_n} &\to 0,\\ i &\in I_{k_n} \end{aligned}\right\} \quad\Longrightarrow\quad d(x_{k_n}, C_i) \to 0,$$
for each $i$ and every subsequence $(x_{k_n})$ of every sequence $(x_n)$ generated by the algorithm. Every strongly focusing algorithm is focusing, because distance functions to convex sets are weakly lower semicontinuous.
Proposition 8.6.1 Suppose the algorithm is focusing. If each "orbit" $\{x_n : n \geq 0\}$ is relatively compact, then the algorithm is strongly focusing. In particular, this happens if $X$ is finite-dimensional or if $\operatorname{int} C \neq \emptyset$.
Proof. (See also [14, Corollary 4.12].) Assume the opposite. Then there exist $x \in X$ and a subsequence $(x_{k_n})$ of $(x_n)$ such that $x_{k_n} \rightharpoonup x$, $x_{k_n} - P_{i,k_n}x_{k_n} \to 0$, $i \in I_{k_n}$, but $\|x_{k_n} - P_ix_{k_n}\| \geq \epsilon$, for some index $i$, $\epsilon > 0$, and $\forall n$. As the algorithm is focusing, $x \in C_i$. We assume WLOG that $x_{k_n} \to x$ (relative compactness and subsequence). But then $x_{k_n} - P_ix_{k_n} \to x - P_ix = 0$, which is absurd. "In particular": If $X$ is finite-dimensional, then $\{x_n : n \geq 0\}$ is bounded, hence relatively compact. If $\operatorname{int} C \neq \emptyset$, then $(x_n)$ is norm convergent (Theorem 6.2.2.(v)). The proof is complete. ∎
We say the algorithm considers remotest sets, if
$$I_{\max}(n) := I_n \cap \Big\{i : d(x_n, C_{i,n}) = \max_{1\leq j\leq N} d(x_n, C_{j,n})\Big\} \neq \emptyset, \quad \forall n,$$
for every sequence $(x_n)$ generated by the algorithm. Every sequence $(i_n)$ with $i_n \in I_{\max}(n)$, $\forall n$, is called a sequence of active remotest indices.
Theorem 8.6.2 (more weak topology results) Suppose the algorithm is strongly focusing, considers remotest sets, and $(i_n)$ is some sequence of active remotest indices.
(i) If $\sum_n \alpha_n(2-\alpha_n)\mu_nw_{i_n,n} = +\infty$, then $(x_n)$ has a subsequence $(x_{k_n})$ with $\max_i d(x_{k_n}, C_i) \to 0$, and hence $(x_{k_n})$ converges weakly to $c^*$.
(ii) If $\liminf_n \alpha_n(2-\alpha_n)\mu_nw_{i_n,n} > 0$, then $\max_i d(x_n, C_i) \to 0$, and hence $(x_n)$ converges weakly to $c^*$.
Proof. (See also [14, Theorem 4.26].) (i): By Corollary 8.3.2.(i), $\liminf_n \|x_n - P_{i_n,n}x_n\|^2 = 0$. Thus there exists a subsequence $(x_{k_n})$ of $(x_n)$ with $i_{k_n} = i$ for some index $i$, $x_{k_n} \rightharpoonup x$ for some $x \in X$, and $x_{k_n} - P_{i,k_n}x_{k_n} \to 0$. As the algorithm is strongly focusing, $d(x_{k_n}, C_i) \to 0$. But $i \in I_{\max}(k_n)$, and so $\max_j d(x_{k_n}, C_j) \to 0$. The result follows. (ii) is proved similarly. ∎
Combining strongly focusing algorithms with (boundedly) regular constraints yields powerful norm convergence results:
Theorem 8.6.3 Suppose the algorithm is strongly focusing and $\{C_1, \ldots, C_N\}$ is boundedly regular. Then the sequence $(x_n)$ converges in norm to $c^*$ whenever one of the following conditions holds:
(i) The algorithm is intermittent, $\bar\alpha < 2$, and each $\liminf_{n:i\in I_n} \alpha_n\mu_nw_{i,n} > 0$.
(ii) The algorithm considers remotest sets and $\sum_n \alpha_n(2-\alpha_n)\mu_nw_{i_n,n} = +\infty$, where $(i_n)$ is some sequence of active remotest indices.
Proof. (See also [14, Theorem 5.2 and Theorem 5.3].) (i): By Theorem 8.5.3.(i), we have $\max_{i\in I_n}\|x_n - P_{i,n}x_n\| \to 0$. Since the algorithm is strongly focusing, $\max_{i\in I_n}\|x_n - P_ix_n\| \to 0$ (by a straightforward proof by contradiction). Intermittence and asymptotic regularity (Proposition 8.4.1.(ii)) yield $\max_i d(x_n, C_i) \to 0$. Now bounded regularity implies $d(x_n, C) \to 0$ and we are done (Theorem 6.2.2.(iii)). (ii): By Theorem 8.6.2.(i) and bounded regularity, there exists a subsequence $(x_{k_n})$ of $(x_n)$ with $d(x_{k_n}, C) \to 0$. Apply Theorem 6.2.2.(iii). The proof is complete. ∎
8.7 Linearly focusing algorithms
We say the algorithm is linearly focusing if there exists some $\beta > 0$ such that eventually
$$\beta\, d(x_n, C_i) \leq d(x_n, C_{i,n}),$$
for each $i$ and every sequence $(x_n)$ generated by the algorithm.
Every linearly focusing algorithm is strongly focusing, hence focusing. Clearly, every algorithm that has constant sets is linearly focusing.
Theorem 8.7.1 Suppose the algorithm is linearly focusing and $\{C_1, \ldots, C_N\}$ is boundedly linearly regular. Then the sequence $(x_n)$ converges linearly to $c^*$ whenever one of the following conditions holds:
(i) The algorithm is intermittent, $\bar\alpha < 2$, and each $\liminf_{n:i\in I_n} \alpha_n\mu_nw_{i,n} > 0$.
(ii) The algorithm considers remotest sets and $\liminf_n \alpha_n(2-\alpha_n)\mu_nw_{i_n,n} > 0$, for some sequence $(i_n)$ of active remotest indices.
Proof. (See also [14, Theorem 5.7 and Theorem 5.8].) (i): Obtain $\epsilon > 0$ such that (WLOG) $\alpha_n \leq 2 - \epsilon$ and $\alpha_n\mu_nw_{i,n} \geq \epsilon$, $\forall n$, $i \in I_n$. Pick a positive integer $p$ and $\beta > 0$ such that (WLOG) $I_n \cup I_{n+1} \cup \cdots \cup I_{n+p-1} = \{1, \ldots, N\}$ and $\beta d(x_n, C_i) \leq d(x_n, C_{i,n})$, $\forall n$, $i \in I_n$. Fix an arbitrary index $i$. Then there exists $kp \leq m_k \leq (k+1)p - 1$ with $i \in I_{m_k}$, $\forall k$. Fix an arbitrary $c \in C$.
Step 1: $d^2(x_{kp}, C_i) \leq (m_k + 1 - kp)\big(d^2(x_{m_k}, C_i) + \sum_{n=kp}^{m_k-1}\|x_n - x_{n+1}\|^2\big)$, $\forall k$.
Indeed, since distance functions are nonexpansive, $d(x_{kp}, C_i) \leq d(x_{m_k}, C_i) + \|x_{kp} - x_{m_k}\| \leq d(x_{m_k}, C_i) + \sum_{n=kp}^{m_k-1}\|x_n - x_{n+1}\|$. Now apply Cauchy-Schwarz.
Step 2: $d^2(x_{m_k}, C_i) \leq \big(\|x_{kp} - c\|^2 - \|x_{(k+1)p} - c\|^2\big)/(\epsilon^2\beta^2)$, $\forall k$.
By Fejér monotonicity and Proposition 8.3.1,
$$\epsilon^2\beta^2 d^2(x_{m_k}, C_i) \leq \alpha_{m_k}(2-\alpha_{m_k})\mu_{m_k}w_{i,m_k}\|x_{m_k} - P_{i,m_k}x_{m_k}\|^2 \leq \|x_{kp} - c\|^2 - \|x_{(k+1)p} - c\|^2.$$
Step 3: $\sum_{n=kp}^{m_k-1}\|x_n - x_{n+1}\|^2 \leq \big((2-\epsilon)/\epsilon\big)\big(\|x_{kp} - c\|^2 - \|x_{(k+1)p} - c\|^2\big)$, $\forall k$.
By Proposition 8.3.1, $\|x_n - x_{n+1}\|^2 \leq (\alpha_n/(2-\alpha_n))\big(\|x_n - c\|^2 - \|x_{n+1} - c\|^2\big)$; now sum appropriately and use $\alpha_n/(2-\alpha_n) = -1 + 2/(2-\alpha_n) \leq (2-\epsilon)/\epsilon$.
Step 4: $d^2(x_{kp}, C_i) \leq p\big(\tfrac{1}{\epsilon^2\beta^2} + \tfrac{2-\epsilon}{\epsilon}\big)\big(\|x_{kp} - c\|^2 - \|x_{(k+1)p} - c\|^2\big)$, $\forall k$.
Piece together Steps 1-3.
Final step: On the one foot, let $\lambda := p\big(\tfrac{1}{\epsilon^2\beta^2} + \tfrac{2-\epsilon}{\epsilon}\big)$ and $c := Px_{kp}$; then Step 4 implies $\max_i d^2(x_{kp}, C_i) \leq \lambda\big(d^2(x_{kp}, C) - d^2(x_{(k+1)p}, C)\big)$, $\forall k$. On the other foot, $\{C_1, \ldots, C_N\}$ is boundedly linearly regular; so we obtain $\kappa_S > 0$ such that $d(x_n, C) \leq \kappa_S\max_i d(x_n, C_i)$, $\forall n$. We conclude altogether $d^2(x_{kp}, C) \leq \kappa_S^2\lambda\big(d^2(x_{kp}, C) - d^2(x_{(k+1)p}, C)\big)$, $\forall k$. By Theorem 6.2.2.(vi), $(x_{kp})$ converges linearly to $c^*$. Therefore, using Proposition 2.6.18, the entire sequence $(x_n)$ converges linearly to $c^*$.
(ii): Pick $\beta > 0$ and $\epsilon > 0$ such that (WLOG) $\beta d(x_n, C_i) \leq d(x_n, C_{i,n})$ and $\alpha_n(2-\alpha_n)\mu_nw_{i_n,n} \geq \epsilon$, $\forall n$, $i \in I_n$. Obtain $\kappa_S > 0$ such that $d(x_n, C) \leq \kappa_S\max_i d(x_n, C_i)$, $\forall n$. Then for every $n$, we obtain (using Proposition 8.3.1):
$$\frac{\epsilon\beta^2}{\kappa_S^2}\,d^2(x_n, C) \leq \epsilon\beta^2\max_i d^2(x_n, C_i) \leq \alpha_n(2-\alpha_n)\mu_nw_{i_n,n}\|x_n - P_{i_n,n}x_n\|^2 \leq d^2(x_n, C) - d^2(x_{n+1}, C).$$
Apply Theorem 6.2.2.(vi). ∎
Remark 8.7.2 A second examination of the proof of Theorem 8.7.1 reveals that the rate of convergence is independent of the starting point whenever $\{C_1, \ldots, C_N\}$ is linearly regular.
8.8 Overrelaxed algorithms
We present two convergence results on algorithms that allow $\alpha_n = 2$, i.e., total overrelaxation, which are not covered by the results of the previous sections.
Proposition 8.8.1 Suppose the algorithm is asymptotically regular and there exists a subsequence $(x_{k_n})$ of $(x_n)$ with $x_{k_n} \to x$ for some $x \in X$, $I_{k_n} = \{1, \ldots, N\}$, $\liminf_n \alpha_{k_n}\mu_{k_n} > 0$, $\liminf_n w_{i,k_n} > 0$, and $P_{i,k_n}x \to P_ix$, $\forall i$. Then the entire sequence $(x_n)$ converges to $c^*$.
Proof. (See also [14, Theorem 4.22].) We assume WLOG (subsequence) each $w_{i,k_n} \to w_i > 0$. Now $\|x - x_{k_n}\| \to 0$, hence $\|P_{i,k_n}x - P_{i,k_n}x_{k_n}\| \to 0$ (projections are nonexpansive), and thus $P_{i,k_n}x_{k_n} \to P_ix$, $\forall i$. On the other hand, $x_{k_n+1} - x_{k_n} = \alpha_{k_n}\mu_{k_n}\sum_i w_{i,k_n}(P_{i,k_n}x_{k_n} - x_{k_n})$; taking limits yields $x = \sum_i w_iP_ix$. By Observation 3.2.2, $x \in C$. It follows from Theorem 6.2.2.(iii) that $x = c^*$ and that $(x_n)$ converges to $c^*$. ∎
Theorem 8.8.2 Suppose there exists a subsequence $(k_n)$ of $(n)$ such that $I_{k_n} = \{1, \ldots, N\}$, $0 < \inf_n \alpha_{k_n}\mu_{k_n} \leq \sup_n \alpha_{k_n}\mu_{k_n} \leq 2$, $\liminf_n w_{i,k_n} > 0$, $\forall i$, and each $C_{i,n} \equiv C_i$ is either a hyperplane or a halfspace. If the normals of the constraints span a subspace of dimension at least $2$, then $(x_n)$ converges in norm to $c^*$.
Proof. (See also [14, Theorem 6.29].) Write WLOG each $C_i$ as a halfspace $\{x : \langle a_i, x\rangle \leq b_i\}$, for some $a_i \in S_X$, $b_i \in \mathbb{R}$. (The case when some set is a hyperplane is treated analogously.) Using Proposition 8.3.1 and Example 3.3.13, we obtain
$$0 \leftarrow \sum_{i,j} w_{i,k_n}w_{j,k_n}\|P_{i,k_n}x_{k_n} - P_{j,k_n}x_{k_n}\|^2 = \sum_{i,j} w_{i,k_n}w_{j,k_n}\big\|(\langle a_i, x_{k_n}\rangle - b_i)_+a_i - (\langle a_j, x_{k_n}\rangle - b_j)_+a_j\big\|^2.$$
Fix an arbitrary index $i$ and pick another index $j$ such that $\{a_i, a_j\}$ is linearly independent. (Here we use the assumption on the normals.) Then we conclude $(\langle a_i, x_{k_n}\rangle - b_i)_+ \to 0$. Since $i$ is arbitrary, we have $\max_i d(x_{k_n}, C_i) \to 0$. Hence, by linear regularity of halfspaces/hyperplanes (Theorem 5.5.4 and Corollary 5.7.2), $d(x_{k_n}, C) \to 0$. Now $x_n \to c^*$ by Theorem 6.2.2.(iii). The proof is complete. ∎
In general, some assumption on $\alpha_n\mu_n$ is necessary:
Example 8.8.3 Let $X := \mathbb{R}^2$, let $C_{1,n} :\equiv C_1$ and $C_{2,n} :\equiv C_2$ be the vertical and horizontal axes, $w_{1,n} :\equiv w_{2,n} :\equiv \tfrac{1}{2}$, and $x_0 := (\xi, \xi) \neq 0$. One checks that $R_0 = 2$ and hence $x_1 = \tfrac{1}{2}(2 - \alpha_0\mu_0)x_0$. Similarly, if we let $\alpha_n :\equiv 2$ and $\mu_n :\equiv R_n = 2$, then $x_n = (-1)^nx_0 \not\to 0$.
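Example 8.8.3 is easy to reproduce (a small Python check of mine):

```python
import numpy as np

def step(x, alpha_mu):
    """One update of Example 8.8.3: C1, C2 are the coordinate axes, w1 = w2 = 1/2."""
    p1 = np.array([0.0, x[1]])           # projection onto the vertical axis
    p2 = np.array([x[0], 0.0])           # projection onto the horizontal axis
    d = 0.5 * (x - p1) + 0.5 * (x - p2)  # = x/2 for the coordinate axes
    return x - alpha_mu * d              # = (1 - alpha*mu/2) x

x0 = np.array([1.0, 1.0])
# totally overrelaxed long steps: alpha_n = 2, mu_n = R_n = 2, so alpha*mu = 4
x = step(x0, 4.0)                        # the iterate flips sign: x1 = -x0
```

With $\alpha_0\mu_0 = 4$ the iterates oscillate forever between $\pm x_0$, whereas $\alpha_0\mu_0 = 2$ lands on $C = \{0\}$ in a single step.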
8.9 Notions at a glance
Tables 8.1, 8.2, 8.3 show most of the notions conveniently in one place. Unfortunately, but
not unexpectedly, quite a few of these notions appear under a babel of different names in
the literature.
8.10 Notes
The setting of Section 8.3 is my basic algorithmic framework in Part I of this thesis. For beauty and simplicity, I discuss only projection algorithms rather than methods for finding common fixed points of firmly nonexpansive maps, I avoid infinitely many constraints altogether, and I refrain from some more general forms of control. However, I expect the majority of the results to hold true more generally (at the expense of uglier proofs).
Control | Meaning
random/repetitive | $\{n : i \in I_n\}$ is infinite, $\forall i$ (general assumption)
cyclic | $I_n = \{[n+1]_N\}$, $\forall n$
intermittent | $\exists p > 0 : I_n \cup I_{n+1} \cup \cdots \cup I_{n+p-1} = \{1, \ldots, N\}$, $\forall n$
weighted | $I_n = \{1, \ldots, N\}$, $\forall n$
singular | $I_n$ is a singleton, $\forall n$
almost cyclic | intermittent and singular
consideration of remotest sets | $I_{\max}(n) \neq \emptyset$, $\forall n$
remotest sets | considers remotest sets and is singular

Table 8.1: Controls at a glance.

Parameters | Meaning
relaxed | $\alpha_n \in [0,2]$, $\forall n$ (general assumption)
unrelaxed | $\alpha_n = 1$, $\forall n$
underrelaxed | $\alpha_n \in \,]0,1]$, $\forall n$
overrelaxed | $\alpha_n \in [1,2]$, $\forall n$
totally overrelaxed | $\alpha_n = 2$, $\forall n$
short-step method | $\mu_n = 1$, $\forall n$
long-step method | $\mu_n = R_n$, $\forall n$

Table 8.2: Parameters at a glance.

Other notions | Meaning
has constant sets | $C_{i,n} \equiv C_i$, $\forall i, n$
asymptotically regular | $x_n - x_{n+1} \to 0$, for every sequence $(x_n)$
focusing | $\big(x_{k_n} \rightharpoonup x,\ x_{k_n} - P_{i,k_n}x_{k_n} \to 0,\ i \in I_{k_n}\big) \Rightarrow x \in C_i$
strongly focusing | $\big(x_{k_n} \rightharpoonup x,\ x_{k_n} - P_{i,k_n}x_{k_n} \to 0,\ i \in I_{k_n}\big) \Rightarrow d(x_{k_n}, C_i) \to 0$
linearly focusing | $\exists \beta > 0 : \beta d(x_n, C_i) \leq d(x_n, C_{i,n})$, $\forall i, n$

Table 8.3: Other notions at a glance.

Using this chapter's notation, we now compare the present framework to three recent important studies that cover projection methods.
BB Bauschke and Borwein's [14]: This framework deals exclusively with short-step methods, i.e., $\mu_n \equiv 1$, but is otherwise identical to my set-up. (The authors considered $N$ relaxation parameters, which - as pointed out by Combettes, Kiwiel and Lopuch - is not necessary after a change of variables.) The important notions of focusing, strongly focusing, and linearly focusing algorithms were coined in this framework. In this chapter, I have generalized many of BB's results from $\mu_n \equiv 1$ to $\mu_n \in [1, R_n]$, $\forall n$.
C Combettes's [39,41]: A very elegant but more restrictive framework, because only linearly focusing algorithms are discussed that also obey the following two conditions: there exists ε > 0 such that

w_{i,n} ≥ ε  and  ε ≤ α_nρ_n ≤ (2 − ε)μ_n, ∀n, i ∈ I_n.

(Actually, Combettes's assumption on the weights ostensibly looks more general but is not, as a renormalization of the weights shows.) These restrictions simplify the convergence analysis. No linear convergence results are given. However, infinitely many constraints and more general controls are allowed, and some interesting applications in the engineering world are discussed in detail. In my framework, I view Combettes's
method as a long-step method with potentially small relaxation parameters: ρ_n ≡ μ_n; "ε ≤ α_nρ_n ≤ (2 − ε)μ_n" then becomes "ε/μ_n ≤ α_n ≤ 2 − ε". This interpretation allows me to recover some of Combettes's results for the controls I use.
KL Kiwiel and Lopuch's [101]: This very potent framework of "approximate cutting halfspaces" has a format that differs considerably from the present, much easier to understand, set-up. Even comparing assumptions is not an entirely trivial task. Their framework is a long-step method, i.e., ρ_n ≡ μ_n, and the authors always assume (see [101, equation (2.3)]) the existence of λ > 0 such that

(KL)  Σ_i w_{i,n}‖x_n − P_{i,n}x_n‖² / ‖Σ_i w_{i,n}(x_n − P_{i,n}x_n)‖ = ‖x_n − x_{n+1}‖/α_n ≥ λ max_{i∈I_n} d(x_n, C_{i,n}), ∀n.

This assumption does not come for free (and thus has to be discussed separately; see [101, Remark 2.2 and Remark 2.3]).
Condition (KL) implies, using Proposition 8.3.1, λ² max_{i∈I_n} d²(x_n, C_{i,n}) ≤ ‖x_n − x_{n+1}‖²/α_n² ≤ (‖x_n − c‖² − ‖x_{n+1} − c‖²)/(α_n(2 − α_n)), ∀n, c ∈ ∩_i C_i. This estimate is akin to the "progress estimate" of Theorem 7.5.1 and explains a little why this framework works well. It is nice to note that (in the authors' own words) Kiwiel and Lopuch's "work benefited greatly from BB". They observed that many of BB's proofs generalized to long-step methods almost without change. (This shortcoming of BB can be attributed to the authors' concentration on the iteration of nonexpansive maps rather than on the construction of Fejér monotone sequences.)
Kiwiel and Lopuch's framework and the present one are complementary; lurking somewhere,
there is probably a yet-more-general framework covering "everything". I fear, unfortunately,
that this hypothetical framework will be neither particularly nice nor intuitive. All these
frameworks rest on "the shoulders of giants", by which I mean that many researchers built
the foundation of this area; however, I do not even attempt to give a historical account of
each and every relevant result. The next chapter contains numerous examples and certainly
highlights some of the most important contributions.
Let us return to the contents of this chapter. Section 8.3 is folklore. Finite-dimensional versions of Theorem 8.5.2 (for firmly nonexpansive maps) were discovered by Flåm and Zowe [68], by Tseng [153], and by Elsner et al. [59].
Because of the special connection to Kiwiel and Lopuch's framework, we point out that Theorem 8.5.3, Proposition 8.6.1, Theorem 8.6.2, Theorem 8.6.3, Theorem 8.7.1, and Proposition 8.8.1 are related to Kiwiel and Lopuch's [101, Theorem 3.1, Corollary 4.4, Theorem 4.10, Theorem 5.2, Theorem 5.3, Theorem 5.7, Theorem 5.8, and Corollary 4.7], respectively.
Chapter 9
Applications! Applications!
9.1 Overview
Retaining the notation of the last chapter, we present selected applications. The examples chosen are mostly related to known results to demonstrate the power of our algorithmic set-up. (In fact, substantially more general results are often possible, and many more examples could be constructed.) We do not give a historical account here and refer the interested reader to [14, Section 6] for more information and examples; however, Section 9.5 presents some classical (and still beautiful) results.
9.2 Focusing algorithms
Example 9.2.1 (Flåm and Zowe's [68, Theorem 1 and Theorem 2]) Suppose X is finite-dimensional, the algorithm is linearly focusing, has short steps, and 0 < α̲ ≤ ᾱ < 2. Then the sequence (x_n) converges to some point in C whenever (i): there exists ε > 0 such that w_{i,n} ≥ ε, ∀n, i ∈ I_n; or (ii): int C ≠ ∅ and Σ_n w_{i,n} = +∞, ∀i.

Proof. The algorithm is linearly focusing, hence focusing. Thus Theorem 8.5.2 yields (i). (ii): By Theorem 6.2.2.(v), (x_n) is norm convergent to some point in X. Apply Theorem 8.5.3.(ii). ∎

Example 9.2.2 (Combettes's [39, Theorem 4.1 for intermittent control]) Suppose the algorithm is linearly focusing, has long steps, and there is some ε > 0 such that w_{i,n} ≥ ε and
ε ≤ α_nρ_n = α_nμ_n ≤ (2 − ε)μ_n, ∀n, i ∈ I_n. If the control is intermittent, then the sequence (x_n) converges weakly to some point in C.
Proof. Clearly, ᾱ < 2, lim inf_{n: i∈I_n} α_nρ_nw_{i,n} ≥ ε² > 0, and the algorithm is focusing. Apply Theorem 8.5.3.(i). ∎

Example 9.2.3 (Baillon's [8, Chapitre 6, Remarque 11.6]) Suppose C_{i,n} ⊇ C_{i,n+1}, ∀n, and C_i = ∩_n C_{i,n}, ∀i. Suppose further the algorithm is almost cyclic and unrelaxed. Then the sequence (x_n) is asymptotically regular and converges weakly to some point in C.

Proof. By Fact 2.2.6, each C_{i,n} converges to C_i in the sense of Mosco. By Fact 2.2.5 and Proposition 8.5.1, the algorithm is focusing. Apply Theorem 8.5.3.(i). ∎
Example 9.2.4 (Browder's [28, Theorem 2 for finitely many sets]) Suppose the algorithm is almost cyclic, unrelaxed, and has constant sets. Then the sequence (x_n) converges weakly to some point in C.

Proof. Example 9.2.3 with constant sets. ∎

Example 9.2.5 (Pierra's [123, Theorem 1.2.(i)]) Suppose the algorithm is weighted, 0 < α̲ ≤ α_n ≤ 1, has long steps and constant sets, and each w_{i,n} ≡ 1/N. Then the sequence (x_n) converges weakly to some point in C and max_i d(x_n, C_i) → 0.
Proof. Theorem 8.5.3.(i). ∎

Example 9.2.6 (Bruck's [30, Corollary 1.2]) Suppose the algorithm is singular, has constant sets, and α_n ≡ α ∈ ]0,2[. If some constraint C_j is boundedly compact, then the sequence (x_n) converges in norm to some point in C.

Proof. Proposition 8.4.2 implies x_n − P_ix_n → 0, ∀i. Since C_j is boundedly compact and (x_n) is bounded, there exist a subsequence (x_{k_n}) of (x_n) and x ∈ C_j with I_{k_n} = {j} and P_jx_{k_n} → x. It follows that (x_{k_n}) converges to x as well. By Theorem 8.5.2, x = c* and x_n → c*. ∎
9.3 Strongly focusing algorithms
Example 9.3.1 (Pierra's [123, Theorem 1.2.(ii)]) Suppose the algorithm is weighted, 0 < α̲ ≤ α_n ≤ 1, has long steps and constant sets, and each w_{i,n} ≡ 1/N. If the set of constraints is boundedly regular, then the sequence (x_n) converges in norm to some point in C. In particular, this happens if X is finite-dimensional or int ∩_i C_i ≠ ∅.

Proof. By Theorem 8.5.3.(i), max_i d(x_n, C_i) → 0. The "In particular" part follows from Proposition 5.2.2 and Corollary 5.4.2. ∎
Example 9.3.2 (Combettes's [39, Theorem 4.1 for consideration of remotest sets control]) Suppose the algorithm is linearly focusing, has long steps, and there is some ε > 0 such that w_{i,n} ≥ ε and ε ≤ α_nρ_n = α_nμ_n ≤ (2 − ε)μ_n, ∀n, i ∈ I_n. If the algorithm considers remotest sets, then the sequence (x_n) converges weakly to some point in C.

Proof. The algorithm is strongly focusing; Theorem 8.6.2.(ii) applies. ∎

Example 9.3.3 (Combettes's [39, Theorem 5.1 for intermittent and consideration of remotest sets control]) Suppose the algorithm is linearly focusing, has long steps, and there is some ε > 0 such that w_{i,n} ≥ ε and ε ≤ α_nρ_n = α_nμ_n ≤ (2 − ε)μ_n, ∀n, i ∈ I_n. Suppose further that the constraints are boundedly regular. If the algorithm is either intermittent or considers remotest sets, then the sequence (x_n) converges in norm to some point in C.

Proof. The algorithm is strongly focusing and Theorem 8.6.3 applies. ∎

9.4 Linearly focusing algorithms
Example 9.4.1 (Smith et al.'s [147, Theorem 2.2]) Suppose the algorithm is cyclic, unrelaxed, and has constant sets that are closed subspaces. If γ(C_i, C_{i+1} ∩ ⋯ ∩ C_N) > 0, for every 1 ≤ i ≤ N − 1, then the sequence (x_n) converges linearly to some point in C.

Proof. Combine Theorem 8.7.1 and Theorem 5.5.4.(i). ∎

Example 9.4.2 Suppose the algorithm is intermittent, has constant sets that are hyperplanes, and there is some ε > 0 such that ε ≤ α_n ≤ 2 − ε and ρ_nw_{i,n} ≥ ε, ∀n, i ∈ I_n. Then the sequence converges linearly to c* with a rate independent of the starting point.
Proof. Combine Theorem 8.7.1, Remark 8.7.2, and Theorem 5.5.4.(iv). ∎

Example 9.4.3 (Herman et al.'s [86, Corollary 1]; Trummer's [151, Theorem 5]) Suppose X is finite-dimensional and the algorithm is cyclic, has constant sets that are hyperplanes, and 0 < α̲ ≤ ᾱ < 2. Then the sequence converges linearly to some point in C with a rate independent of the starting point.

Proof. Example 9.4.2. ∎
Example 9.4.4 (Trummer's [152, Theorem 8]) Suppose X is finite-dimensional and the projection algorithm is weighted, unrelaxed, has short steps and constant sets that are hyperplanes given by C_i = {x ∈ X : ⟨a_i, x⟩ = b_i}, for each i and some a_i ∈ X \ {0}, b_i ∈ ℝ. If the weights are given by w_{i,n} := ‖a_i‖² / Σ_j ‖a_j‖², then the sequence converges linearly to some point in C.

Proof. Example 9.4.2. ∎
Example 9.4.5 Suppose the algorithm is intermittent, has constant sets that are halfspaces, and there is some ε > 0 such that ε ≤ α_n ≤ 2 − ε and ρ_nw_{i,n} ≥ ε, ∀n, i ∈ I_n. Then the sequence converges linearly to c* with a rate independent of the starting point.

Proof. Combine Theorem 8.7.1.(i), Remark 8.7.2, and Theorem 5.7.1. ∎
Example 9.4.6 (De Pierro and Iusem's [44, feasible case of Lemma 8]) Suppose X is finite-dimensional and the algorithm is weighted, has short steps and constant sets that are halfspaces. If 0 < α̲ ≤ ᾱ < 2 and 0 < w_{i,n} ≡ w_i, ∀i, then the sequence (x_n) converges linearly to some point in C.

Proof. Example 9.4.5. ∎
Example 9.4.7 (Gubin et al.'s [79, Theorem 1.(a) for finitely many sets]) Suppose the algorithm has remotest set control, constant sets, and 0 < α̲ ≤ ᾱ < 2. If there is some j such that C_j ∩ ∩_{i≠j} int C_i ≠ ∅, then the sequence (x_n) converges linearly to some point in C.

Proof. Combine Theorem 8.7.1.(ii) and Corollary 5.4.2. ∎
Example 9.4.8 Suppose the algorithm considers remotest sets and has constant sets that are halfspaces. If (i_n) is some sequence of active remotest indices with lim inf_n α_n(2 − α_n)ρ_nw_{i_n,n} > 0, then the sequence (x_n) converges linearly to c* with a rate independent of the starting point.

Proof. Combine Theorem 8.7.1, Remark 8.7.1, and Theorem 5.7.1. ∎
Example 9.4.9 (Gubin et al.'s [79, Theorem 1.(d)]) Suppose the algorithm has remotest set control and constant sets that are halfspaces. If 0 < α̲ ≤ ᾱ < 2, then the sequence (x_n) converges linearly to some point in C.

Proof. Example 9.4.8. ∎

9.5 Perennial favourites
Example 9.5.1 (the method of cyclic projections; Bregman's [26, Theorem 1]; 1965) Suppose the algorithm is cyclic, unrelaxed, and has constant sets. Then the sequence (x_n) converges weakly to some point in C.

Proof. Example 9.2.4. ∎

Example 9.5.2 (remotest set control; Bregman's [26, Theorem 2 for finitely many sets]; 1965) Suppose the algorithm is unrelaxed, has constant sets and remotest set control. Then the sequence (x_n) converges weakly to some point in C.

Proof. Example 9.3.2. ∎

Example 9.5.3 (random projections) Suppose the algorithm is singular, unrelaxed, and has constant sets. If some constraint C_j is boundedly compact, then the sequence (x_n) converges in norm to some point in C. In particular, this holds if X is finite-dimensional.

Proof. Example 9.2.6. ∎

Example 9.5.4 (Gubin et al.'s [79, Theorem 1.(a)]; 1967) Suppose the algorithm is cyclic, has constant sets, and 0 < α̲ ≤ ᾱ < 2. If there is some j such that C_j ∩ ∩_{i≠j} int C_i ≠ ∅, then the sequence (x_n) converges (linearly) to some point in C.
Proof. Combine Theorem 8.7.1.(i) and Corollary 5.4.2. ∎

Example 9.5.5 (Browder's [28, Corollary to Theorem 3]; 1967) Suppose the algorithm is almost cyclic, unrelaxed, and has constant sets that are closed subspaces. If C_1^⊥ + ⋯ + C_N^⊥ is closed, then the sequence (x_n) converges linearly.

Proof. Combine Theorem 8.7.1.(i) and Theorem 5.5.1. ∎
Example 9.5.6 (Kaczmarz [97]; 1937) Suppose X is finite-dimensional and the algorithm is cyclic, unrelaxed, and has constant sets that are hyperplanes. Then the sequence (x_n) converges linearly to some point in C with a rate independent of the starting point.

Proof. Example 9.4.3. ∎
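Kaczmarz's method can be tried out directly. The following sketch (with an illustrative consistent linear system, not data from the text) implements the cyclic, unrelaxed iteration, using the closed-form projection onto the hyperplane C_i = {x : ⟨a_i, x⟩ = b_i}.

```python
import numpy as np

# Kaczmarz's method (Example 9.5.6): cyclic, unrelaxed projections onto the
# hyperplanes C_i = {x : <a_i, x> = b_i}; the projection onto C_i is
#   x - (<a_i, x> - b_i) / ||a_i||^2 * a_i.
def kaczmarz(A, b, x0, sweeps=100):
    x = np.asarray(x0, dtype=float)
    for _ in range(sweeps):
        for a_i, b_i in zip(A, b):  # cyclic control
            x = x - (a_i @ x - b_i) / (a_i @ a_i) * a_i
    return x

A = np.array([[2.0, 1.0], [1.0, 3.0]])  # illustrative consistent system
b = np.array([3.0, 4.0])
x = kaczmarz(A, b, np.zeros(2))
print(x)  # converges linearly to the solution (1, 1) of Ax = b
```

Here C is the solution set of Ax = b; for this data the per-sweep error contraction depends only on the angle between the two hyperplanes, in line with the starting-point-independent rate of the example.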
Example 9.5.7 (Gubin et al.'s [79, Theorem 1.(d)]; 1967) Suppose the algorithm is cyclic, has constant sets that are halfspaces, and 0 < α̲ ≤ ᾱ < 2. Then the sequence (x_n) converges (linearly) to some point in C (with a rate independent of the starting point).

Proof. Example 9.4.5. ∎
Example 9.5.8 (Agmon's [1, Theorem 3], Motzkin and Schoenberg's [113, Case 1 in Theorem 1 and Theorem 2]; 1954) Suppose X is finite-dimensional and the algorithm has remotest set control and constant sets that are halfspaces. If 0 < α_n ≡ α < 2, then the sequence (x_n) converges to some point in C.

Proof. Example 9.4.9. ∎

Example 9.5.9 (Merzlyakov's [107, Theorem]; 1963) Suppose the algorithm considers remotest sets and has constant sets that are halfspaces and 0 < α_n ≡ α < 2. If there is some ε > 0 such that ρ_nw_{i_n,n} ≥ ε, ∀n and for some sequence (i_n) of active remotest indices, then (x_n) converges linearly to some point in C (with a rate independent of the starting point).

Proof. Example 9.4.8. ∎
Example 9.5.10 (Cimmino [38]; 1938) Suppose X is finite-dimensional and the algorithm is weighted, has short steps and constant sets that are hyperplanes with at least two linearly independent normals. Suppose further that α_n ≡ 2 and w_{i,n} ≡ w_i > 0, for each i. Then the sequence (x_n) converges in norm to some point in C.

Proof. Theorem 8.8.2. ∎
9.6 Notes
The method of cyclic projections (Example 9.5.1) is studied in some detail in [18]. Even for the method of alternating projections (i.e., two constraints), it is unknown whether or not the convergence can be only weak. For a slight perturbation of the method of cyclic projections (which is not covered by our framework), norm convergence always holds; see [11] and the references therein for more.

For two constraints, the method of Example 9.5.2 essentially becomes the method of alternating projections.

Random projections (Example 9.5.3) also pose a tantalizing question, still open in the general case: does a sequence generated by random projections converge (weakly) to some point in C? For some positive results, see [12].
Remark 9.6.1 By Example 9.5.5, the method of cyclic projections for closed subspaces C_1, ..., C_N generates linearly convergent sequences provided that C_1^⊥ + ⋯ + C_N^⊥ is closed. In view of Theorem 5.5.1 and Remark 5.5.3, this sufficient condition is equivalent to (bounded) (linear) regularity of {C_1, ..., C_N} and to the positivity of the angle γ(C_1, ..., C_N); it is interesting to note that it is also necessary: by [18, Theorem 5.7.16], a zero angle implies the existence of a sequence generated by the method of cyclic projections that converges arbitrarily slowly.
It is a remarkable fact that the method of cyclic projections for closed subspaces always
generates norm convergent sequences:
Fact 9.6.2 (von Neumann's [158, Theorem 13.7], 1933; Halperin's [83, Theorem 1], 1962) Suppose C_1, ..., C_N are closed subspaces of X. Let C := ∩_i C_i and T := P_{C_N} ⋯ P_{C_1}. Then the sequence (T^n x) converges in norm to P_C x, for every x ∈ X.

While von Neumann proved the two-subspace case, his proof does not generalize to finitely many subspaces. The general case was covered by a result of Halperin; see also Section 11.2. Fact 9.6.2 is not covered by our framework; however, in the case when C_1^⊥ + ⋯ + C_N^⊥ is closed, we deduce more, namely linear convergence (Example 9.5.5). Because of its fundamental importance and many, many applications (see Deutsch's [48]), we provide two self-contained proofs of Fact 9.6.2; see Section 11.2.
Merzlyakov was the first to consider relaxed long-step methods (for halfspaces); see Example 9.5.9.

Finally, Example 9.5.10 has a nice geometric interpretation: the next iterate is obtained by reflecting the current iterate in all N hyperplanes and then taking a weighted average.
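This reflect-and-average reading can be checked numerically. In the sketch below (two illustrative hyperplanes and equal weights), one step is computed literally as the weighted average of the reflections R_i x = 2P_ix − x; since the weights sum to 1, this agrees with the relaxed step x + 2(Σ_i w_iP_ix − x).

```python
import numpy as np

# Geometric form of Cimmino's method (Example 9.5.10): with alpha_n = 2 and
# short steps, x + 2*(sum_i w_i P_i x - x) = sum_i w_i (2 P_i x - x),
# a weighted average of reflections (using sum_i w_i = 1).
def proj(a, b, x):  # projection onto the hyperplane <a, x> = b
    return x - (a @ x - b) / (a @ a) * a

def cimmino_step(A, b, w, x):
    return sum(w_i * (2.0 * proj(a_i, b_i, x) - x)
               for a_i, b_i, w_i in zip(A, b, w))

A = np.array([[1.0, 1.0], [1.0, -2.0]])  # two linearly independent normals
b = np.array([2.0, -1.0])
w = [0.5, 0.5]
x = np.zeros(2)
for _ in range(200):
    x = cimmino_step(A, b, w, x)
print(x)  # converges in norm to the intersection point (1, 1)
```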
Chapter 10
Subgradient algorithms
10.1 Introduction
For one last time (promised!), we preserve the notation of Chapter 8. In this chapter we study a particular kind of projection algorithm, called a subgradient algorithm, that aims to solve convex feasibility problems whose constraints include sublevel sets of convex functions. Because the computation of the projection onto a sublevel set is hard, one is led to consider approximating supersets. We give sufficient conditions that allow us to apply the theory of Chapter 8; selected applications then demonstrate the power of this framework.
10.2 Motivation
Suppose f is a convex lower semicontinuous proper function on X. We aim to solve a convex feasibility problem of the following kind:

(CFP) Find x ∈ X such that f(x) ≤ 0.

We assume that (CFP) possesses solutions. In our framework, the (only) constraint is the sublevel set C := {x ∈ X : f(x) ≤ 0}. We saw in Subsection 3.3.11 that it is "hard" to compute projections onto a sublevel set. So what do we do? Well, we hunt for easier-to-handle supersets, which are covered by the algorithm. The idea is very natural: given a "current iterate" x_0 ∈ dom ∂f, we start by evaluating f(x_0). If f(x_0) ≤ 0, then x_0 ∈ C and we are done. Otherwise, we consider f̃, a first-order approximation of f at x_0:

f̃(x) := f(x_0) + ⟨g(x_0), x − x_0⟩, where g(x_0) ∈ ∂f(x_0).
The sublevel set C̃ := {x ∈ X : f̃(x) ≤ 0} associated with f̃ is much easier to deal with:

Observation 10.2.1 C̃ is always a closed convex superset of C. If g(x_0) = 0, then C̃ = X; otherwise, C̃ is a halfspace. In either case, the projection of x_0 onto C̃ is given by

P_{C̃}x_0 = x_0 − (f(x_0)/‖g(x_0)‖²) g(x_0), if f(x_0) > 0;  P_{C̃}x_0 = x_0, otherwise.

(Note that if f(x_0) > 0, then g(x_0) ≠ 0, because argmin f lies in the nonempty set C = {x ∈ X : f(x) ≤ 0}.) In summary, the strategy of considering the superset C̃ of C suggests a particular type of projection algorithm, which we will call a subgradient algorithm. So (trivially) every subgradient algorithm is a projection algorithm. Conversely, every projection algorithm with constant sets can be viewed as a subgradient algorithm (by using the distance functions to the constraints and Proposition 3.2.5).
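The projection formula of Observation 10.2.1 is easy to implement. The following sketch iterates the subgradient projection for an illustrative function f (my choice, not one from the text), whose sublevel set C is a disk.

```python
import numpy as np

# Subgradient projection (Observation 10.2.1): project the current point onto
# the halfspace cut out by the linearization of f at that point, i.e.
#   x - f(x) / ||g(x)||^2 * g(x)   if f(x) > 0,   and x itself otherwise.
def subgradient_projection(x, f, subgrad):
    fx = f(x)
    if fx <= 0:               # already feasible; the superset is all of X
        return x
    g = subgrad(x)            # g != 0 because f(x) > 0 and C is nonempty
    return x - fx / (g @ g) * g

f = lambda x: x @ x - 4.0     # illustrative choice: C = closed disk, radius 2
subgrad = lambda x: 2.0 * x   # gradient of f

x = np.array([6.0, 8.0])
for _ in range(60):
    x = subgradient_projection(x, f, subgrad)
print(x, f(x))  # the iterates approach C: ||x|| decreases to 2
```

For this particular f the iteration reduces to Newton's method on the radius, so the approach to C is very fast; in general one only gets the focusing behaviour established in Section 10.3.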
10.3 The general subgradient algorithm
A projection algorithm is called a subgradient algorithm if the following condition holds: there is some index i, called a subgradient index, such that C_i = {x ∈ X : f_i(x) ≤ 0} and C_{i,n} = {x ∈ X : f_i(x_n) + ⟨g_i(x_n), x − x_n⟩ ≤ 0}, where f_i is a convex finite function on X that is bounded on bounded sets, and g_i(x_n) ∈ ∂f_i(x_n), ∀n. The set of all subgradient indices is denoted I_∂.

Note that by Proposition 2.2.8, "f_i is bounded on bounded sets" is equivalent to "dom ∂f_i = X and ∂f_i carries bounded sets to bounded sets"; moreover, by Corollary 2.2.9, this is automatic for finite convex functions in finite dimensions. (See this chapter's notes for a comment on how to handle not necessarily finite functions.)
To apply our results on projection algorithms in this context, we provide a verifiable condition for (linearly) focusing subgradient algorithms.

Theorem 10.3.1 Suppose we are given a subgradient algorithm. Then:

(i) If (P_{i,n}) converges actively pointwise to P_i, ∀i ∉ I_∂, then the subgradient algorithm is focusing.

(ii) If there is some Slater point x̂ ∈ X, i.e., f_i(x̂) < 0, ∀i ∈ I_∂, and some β > 0 such that β d(x_n, C_i) ≤ d(x_n, C_{i,n}), ∀n, i ∈ I_n \ I_∂, then the subgradient algorithm is linearly focusing.
Proof. (See also [14, Theorem 7.7 and Theorem 7.12].) (i): Because of Proposition 8.5.1, we have only to investigate i ∈ I_∂. So suppose (x_{k_n}) is a subsequence of (x_n) with x_{k_n} ⇀ x and x_{k_n} − P_{i,k_n}x_{k_n} → 0, where x ∈ X and i ∈ I_{k_n}, ∀n. Our goal is to show that x ∈ C_i. Clearly, f_i(x) ≤ lim inf_n f_i(x_{k_n}). Now either f_i(x_{k_n}) ≤ 0 frequently or f_i(x_{k_n}) > 0 eventually. In the former case, we are done; thus, assume the latter case. Since (x_n) is bounded, there exists some M > 0 such that ‖g_i(x_n)‖ ≤ M, ∀n (Proposition 2.2.8). Hence, using Observation 10.2.1,

f_i(x_{k_n}) = ‖g_i(x_{k_n})‖ ‖x_{k_n} − P_{i,k_n}x_{k_n}‖ ≤ M ‖x_{k_n} − P_{i,k_n}x_{k_n}‖.

Therefore, f_i(x_{k_n}) → 0, which implies x ∈ C_i and (i) is verified. (ii): It suffices to show that for an arbitrary but fixed subgradient index i, there exists some β_i > 0 such that

β_i d(x_n, C_i) ≤ d(x_n, C_{i,n}), whenever i ∈ I_n.

As (x_n) is bounded, there exists (again by Proposition 2.2.8) M > 0 such that ‖x̂ − x_n‖ ≤ M and ‖g_i(x_n)‖ ≤ M, ∀n. Now fix n such that i ∈ I_n. If x_n ∈ C_i, then any β_i would do; so assume f_i(x_n) > 0. Define

λ := −f_i(x̂)/(f_i(x_n) − f_i(x̂)) ∈ ]0, 1[  and  y := (1 − λ)x̂ + λx_n.

Then f_i(y) ≤ (1 − λ)f_i(x̂) + λf_i(x_n) = 0, so y ∈ C_i. Hence, using Observation 10.2.1,

d(x_n, C_i) ≤ ‖x_n − y‖ = (1 − λ)‖x_n − x̂‖ ≤ (M/(−f_i(x̂))) f_i(x_n) = (M/(−f_i(x̂))) ‖g_i(x_n)‖ d(x_n, C_{i,n}) ≤ (M²/(−f_i(x̂))) d(x_n, C_{i,n}).

Thus β_i := −f_i(x̂)/M² does the job and we are done. ∎
10.4 Some applications
Numerous results on subgradient algorithms follow by specialization of Theorem 10.3.1. Again, we give a rather small but illustrative selection; more general results are often possible.
10.4.1 Censor and Lent's framework
We speak of Censor and Lent's framework if every index is a subgradient index, i.e., I_∂ = {1, ..., N}.

Example 10.4.1 Suppose X is finite-dimensional. If each lim inf_{n: i∈I_n} α_n(2 − α_n)ρ_nw_{i,n} > 0, then the sequence (x_n) generated by the subgradient algorithm in Censor and Lent's framework converges to c*.

Proof. (See also [14, Theorem 7.13.(i)].) The subgradient algorithm is focusing by Theorem 10.3.1.(i). Apply Theorem 8.5.2. ∎
Example 10.4.2 (Censor and Lent's [37, Theorem 1]) Suppose X is finite-dimensional, the subgradient algorithm in Censor and Lent's framework is almost cyclic, and 0 < α̲ ≤ ᾱ < 2. Then the sequence (x_n) converges to some point in C.

Proof. Example 10.4.1. ∎
Example 10.4.3 Suppose the existence of a Slater point in Censor and Lent's framework: ∃x̂ ∈ X : f_i(x̂) < 0, ∀i. If the subgradient algorithm is intermittent, ᾱ < 2, and each lim inf_{n: i∈I_n} α_nρ_nw_{i,n} > 0, then the sequence (x_n) converges linearly to c*.

Proof. (See also [14, Theorem 7.18.(ii)].) The Slater point implies that int C ≠ ∅; consequently, the sequence (x_n) is convergent to some x ∈ X by Theorem 6.2.2.(v). Also, by Theorem 10.3.1.(ii), the subgradient algorithm is linearly focusing. By Corollary 5.4.2, {C_1, ..., C_N} is boundedly linearly regular; thus Theorem 8.7.1.(i) applies. ∎
Example 10.4.4 (De Pierro and Iusem's [45, Theorem 2]) Suppose X is finite-dimensional, the subgradient algorithm in Censor and Lent's framework is almost cyclic, and 0 < α̲ ≤ ᾱ < 2. If there exists some Slater point, then the sequence (x_n) converges linearly to some point in C.

Proof. Example 10.4.3. ∎

Example 10.4.5 (Eremin's [63, Theorem 1.3]; 1969) Suppose the subgradient algorithm in Censor and Lent's framework is weighted with each w_{i,n} ≡ w_i > 0 and α_n ≡ α ∈ ]0,2[. If there exists some Slater point and the algorithm has short steps, then the sequence (x_n) converges linearly to some point in C.

Proof. Example 10.4.3. ∎
10.4.2 Polyak's framework
Polyak's framework arises when N = 2, I_∂ = {1}, and C_{2,n} ≡ C_2.

To save on subscripts, we let f := f_1. Thus Polyak's framework studies the convex feasibility problem

(Polyak's CFP) Find x ∈ C_2 such that f(x) ≤ 0.

This looks very special at first sight but is, as we explain in this chapter's notes, fairly general.
Example 10.4.6 Suppose the subgradient algorithm in Polyak's framework is intermittent, ᾱ < 2, and each lim inf_{n: i∈I_n} α_nρ_nw_{i,n} > 0. Then the sequence (x_n) converges weakly to c*.

Proof. (See also [14, Theorem 7.22].) By Theorem 10.3.1.(i), the subgradient algorithm is focusing. Now apply Theorem 8.5.3.(i). ∎

Example 10.4.7 (Polyak's [124, Theorem 1]; 1969) Suppose the subgradient algorithm in Polyak's framework is cyclic and there is some ε > 0 such that ε ≤ α_n ≤ 2 − ε, ∀n. Then the sequence (x_n) converges weakly to some point in C.

Proof. Example 10.4.6. ∎

Example 10.4.8 Suppose the subgradient algorithm in Polyak's framework is intermittent and there exists some x̂ ∈ C_2 with f(x̂) < 0. If ᾱ < 2 and each lim inf_{n: i∈I_n} α_nρ_nw_{i,n} > 0, then the sequence (x_n) converges linearly to c*.

Proof. (See also [14, Theorem 7.27].) The subgradient algorithm is linearly focusing by Theorem 10.3.1.(ii). As x̂ ∈ C_2 ∩ int C_1, the set {C_1, C_2} is boundedly linearly regular (Corollary 5.4.2). Apply Theorem 8.7.1.(i). ∎

Example 10.4.9 (Polyak's [124, a case of Theorem 4]; 1969) Suppose the subgradient algorithm in Polyak's framework is cyclic and 0 < α̲ ≤ ᾱ < 2. If there exists some x̂ ∈ C_2 with f(x̂) < 0, then the sequence (x_n) converges linearly to some point in C.

Proof. Example 10.4.8. ∎
10.5 Notes
We give several comments on Polyak's framework. What happens if the function f is only defined on C_2, i.e., dom f = C_2? (After all, Polyak's CFP does not really care about f outside C_2.) We proceed as follows: suppose f is L-Lipschitz on C_2. Let F := f □ L‖·‖, i.e., F is the infimal convolution of f and L‖·‖. Then F is L-Lipschitz everywhere on X and coincides with f on C_2 (see [89, Proposition XI.3.4.5]). Hence we work with F rather than f.

A cycle in the unrelaxed form of Polyak's algorithm (Example 10.4.7 and Example 10.4.9) can be interpreted as follows: given an iterate x_n, first project onto C_2 to obtain x_{n+1}. Then construct the approximating superset as in Observation 10.2.1 and project x_{n+1} onto it to obtain x_{n+2}. Continue in this fashion. We now explain why Polyak's CFP is quite general: suppose you are given finitely many convex functions f_j and you want to find some x ∈ C_2 with f_j(x) ≤ 0, ∀j. Then Polyak's framework is still valuable, because we simply let f := max_j f_j. (For the formula on the subdifferential of the maximum of convex functions, see [7, Proposition 4.3.8.(ii)].)

General subgradient algorithms are also investigated by Combettes [42] and by Kiwiel and Lopuch [101, Sections 7-10] from their respective frameworks (see also Chapter 8's Notes). Kiwiel and Lopuch obtain a variety of powerful results, partially based on Kiwiel's [100].
Chapter 11
A farewell to Part I
11.1 Overview
The two promised proofs of the von Neumann/Halperin result are given. (Recall that the von Neumann/Halperin result is not covered by our framework, because we require bounded regularity for any norm convergence result. On the other hand, in the presence of bounded regularity, our framework yields a stronger conclusion, namely linear convergence with a rate independent of the starting point. See also Section 9.6.) I discuss my favourite open problems and conclude Part I of this thesis.
11.2 The von Neumann/Halperin result
In this section, we provide two proofs of the fundamental von Neumann/Halperin result on the convergence of the method of cyclic projections for closed subspaces (see Fact 9.6.2):

Suppose C_1, ..., C_N are finitely many closed subspaces of X. Set C := ∩_i C_i, abbreviate each projection P_{C_i} by P_i, and let T := P_N ⋯ P_2P_1. Then the sequence (T^n x) converges in norm to P_C x, for every x ∈ X.

The next subsection contains Halperin's short and sweet proof. In the subsection thereafter, we prove a convergence result on a relaxed version of Dykstra's algorithm, which also implies the von Neumann/Halperin result.
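The result is easy to witness numerically before proving it. In the sketch below (two illustrative subspaces of ℝ³, my choice), iterating T = P_2P_1 drives the starting point to its projection onto C = C_1 ∩ C_2.

```python
import numpy as np

# Numerical illustration of the von Neumann/Halperin result (Fact 9.6.2)
# for two subspaces of R^3: T^n x -> P_C x in norm, where C = C1 ∩ C2.
def proj_factory(B):
    """Orthogonal projection onto the column space of the matrix B."""
    Q, _ = np.linalg.qr(B)          # Q Q^T is the projection, whatever signs QR picks
    return lambda z: Q @ (Q.T @ z)

P1 = proj_factory(np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]))  # xy-plane
P2 = proj_factory(np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]))  # span{e1, e2+e3}
# C1 ∩ C2 is the x-axis, so P_C x = (x_1, 0, 0).

x = np.array([1.0, 2.0, 3.0])
for _ in range(100):
    x = P2(P1(x))
print(x)  # converges in norm to (1, 0, 0)
```

Since C_1^⊥ + C_2^⊥ is closed here (finite dimensions), the convergence is in fact linear, in line with Example 9.5.5.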
11.2.1 Halperin's proof
Proof. (First proof of the von Neumann/Halperin result; see also [83, Theorem 1].) The von Neumann/Halperin result is a statement on a subsequence of a sequence generated by a cyclic unrelaxed (projection) algorithm in the sense of Chapter 8. By Proposition 8.4.1, this algorithm is asymptotically regular; hence T^n x − T^{n+1}x → 0, ∀x ∈ X. It follows that T^n y → 0, ∀y ∈ ran(I − T). Because sup_n ‖T^n‖ ≤ 1, we actually obtain T^n y → 0, ∀y ∈ cl ran(I − T). But what is cl ran(I − T)? Well, using Observation 3.2.2, cl ran(I − T) = (ker(I − T)*)^⊥ = (ker(I − P_1P_2 ⋯ P_N))^⊥ = (Fix P_1P_2 ⋯ P_N)^⊥ = C^⊥. Thus on the one hand,

T^n x → 0, ∀x ∈ C^⊥.

On the other hand, again using Observation 3.2.2,

T^n x = x, ∀x ∈ C.

Altogether, T^n x = T^n(P_C x) + T^n(P_{C^⊥}x) = P_C x + T^n(P_{C^⊥}x) → P_C x, ∀x ∈ X. ∎

11.2.2 A relaxed version of Dykstra's algorithm
We now present a new convergence result on a relaxed version of Dykstra's algorithm. Not only the von Neumann/Halperin result but also a basic result on Dykstra's algorithm by Boyle and Dykstra will follow.

Theorem 11.2.1 (Relaxed Dykstra algorithm) Suppose C_1, ..., C_N are closed convex subsets of X with C := ∩_i C_i ≠ ∅. Let q_{−(N−1)} := ⋯ := q_{−1} := q_0 := 0, [·] := [·]_N, and set C_n := C_{[n]}, P_n := P_{C_n}, ∀n. Suppose further (β_n)_{n ≥ −(N−1)} is a sequence in [0, 1] and x_0 ∈ X. Define sequences (x_n), (q_n) by

x_n := P_n(x_{n−1} + β_{n−N}q_{n−N})  and  q_n := x_{n−1} + β_{n−N}q_{n−N} − x_n, ∀n ≥ 1.

Finally, let x*_n := x_0 − Σ_{k=1}^{n−N}(1 − β_k)q_k, ∀n. If (x*_n) converges in norm to some point x*, then (x_n) converges in norm to P_C x*.
Proof. The following formulae hold true for every n ≥ 1:

Indeed, (1) follows from Theorem 3.2.1.(i), (2) from the definition of the sequences, (3) from (2), and finally (3′) is equivalent to (3). The next identity is crucial; it is true for 0 ≤ m ≤ n and c ∈ C:

(4) is proved by induction on n, where m is arbitrary but fixed: clearly, (4) holds when n = m. For n ≥ m, the induction step follows from
Horrified by these expressions? Well, I am. But they do turn out to be extremely useful, because we know the sign of each term. Indeed, as a first demonstration, we let m = 0 and keep c ∈ C. Then for every n ≥ 0:
moreover, the terms in all sums are nonnegative. It follows that

(x_n) is bounded and Σ_{k=1}^{∞} ‖x_{k−1} − x_k‖² < +∞;

in particular, (x_n) is asymptotically regular. Now consider
Denote the first (resp. second) sum on the RHS of (5) by S_1(n) (resp. S_2(n)). By (1), S_1(n) is always nonpositive. Concerning S_2(n), we will prove that

Before we do so, note the following telescoping identity:

This identity yields

and further

Using (7), we obtain
But, by Proposition 2.6.19, the limes inferior of the sequence generated by the last expression is equal to 0, and (6) thus holds. Hence we obtain a subsequence (n′) of (n) such that (recall (5) and the nonpositivity of S_1(n))

After passing to another subsequence if necessary, we also assume WLOG:

x_{n′} ⇀ c*, for some c* ∈ X;  lim_{n′} ‖x_{n′}‖ exists (≥ ‖c*‖);  [n′] ≡ j, for some index j.

It follows that c* ∈ C_j and thus c* ∈ C, by asymptotic regularity of (x_n). Hence, for an arbitrary c ∈ C, the following chain of inequalities holds:

By Theorem 3.2.1.(i),

If we let c = c* in (9), then we obtain a chain of equalities and learn that lim_{n′} ‖x_{n′}‖ = ‖c*‖. By the Kadec/Klee property of X (Proposition 2.2.3),

x_{n′} → c*,

which, in conjunction with (5), yields

The second sum, a.k.a. S_2(n′), tends to 0 by (8). Hence so must the first sum, which we know to possess exclusively nonpositive terms:
To complete the proof, we go back to equation (4), which yields, after letting m = n′, n ≥ n′ arbitrary, and c = c*:

Consequently, the entire sequence (x_n) converges to c*. ∎

Corollary 11.2.2 (Dykstra's algorithm) Suppose C_1, ..., C_N are closed convex subsets of X with C := ∩_i C_i ≠ ∅. Let q_{−(N−1)} := ⋯ := q_{−1} := q_0 := 0, [·] := [·]_N, and set each P_n := P_{C_{[n]}}. If x_0 is an arbitrary point in X and we define sequences (x_n), (q_n) by

x_n := P_n(x_{n−1} + q_{n−N})  and  q_n := x_{n−1} + q_{n−N} − x_n, ∀n ≥ 1,

then (x_n) converges in norm to P_C x_0.

Proof. (See also Boyle and Dykstra's [25].) Set β_n ≡ 1 in Theorem 11.2.1. ∎

Proof. (Second proof of the von Neumann/Halperin result) Apply Corollary 11.2.2 when each C_i is a closed subspace. Then q_n ∈ C_{[n]}^⊥, ∀n; consequently, the sequence (q_n) can be ignored when computing the sequence (x_n). ∎
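For N = 2, the recursion of Corollary 11.2.2 can be written with two running correction vectors. The sketch below uses a disk and a halfspace (illustrative choices; for this data the projection of x_0 onto the intersection works out to (0.6, 0.8)).

```python
import numpy as np

# Dykstra's algorithm (Corollary 11.2.2) for N = 2: the vectors q_n are kept
# as two running corrections p and q, one per constraint.
def P_disk(z):                        # projection onto the closed unit disk
    r = np.linalg.norm(z)
    return z if r <= 1.0 else z / r

def P_half(z):                        # projection onto {z : z[0] >= 0.6}
    return np.array([max(z[0], 0.6), z[1]])

x0 = np.array([-1.0, 1.0])
x, p, q = x0.copy(), np.zeros(2), np.zeros(2)
for _ in range(1000):
    y = P_disk(x + p); p = x + p - y  # project, then update the correction
    x = P_half(y + q); q = y + q - x
print(x)  # converges in norm to P_C(x0) = (0.6, 0.8)
```

For comparison, freezing p = q = 0 in the same loop gives the method of cyclic projections, which for this data stops at about (0.6, 0.707), a point of C but not the nearest one; only Dykstra's corrections recover the projection of x_0 itself.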
Remark 11.2.3

(i) The relaxed version of Dykstra's algorithm presented in Theorem 11.2.1 is a hybrid of the method of cyclic projections (β_n ≡ 0) and of Dykstra's algorithm (β_n ≡ 1). For the method of cyclic projections, Theorem 11.2.1 becomes:
If the sequence (x_n) = (P_{[n]} x_{n-1}) converges in norm to x*,
then (x_n) converges in norm to P_C x*.
Thus we conclude that the norm limit (should it exist) must lie in C; but that is old
hat by Example 9.5.1.
(ii) Why do we consider the iteration from Theorem 11.2.1? Well, it appears that Dyk-
stra's algorithm never converges faster than the method of cyclic projections and often
slower ([16, Example 5.3]). Hence, as the iteration from Theorem 11.2.1 is "between"
the original Dykstra's method and the method of cyclic projections, it presumably
converges faster than Dykstra's method. Detailed numerical experiments, however,
have yet to be performed.
(iii) Let (λ_k) be a sequence of positive reals with Σ_k λ_k < +∞. Then the sequence (x_n)
converges in norm whenever (1 − ρ_k)||q_k|| ≤ λ_k, ∀k. So we can choose
to obtain a truly relaxed application of Theorem 11.2.1. This choice of parameters
determines ρ_k at runtime: indeed, ρ_k depends on x_k, x_{k−N}, ρ_{k−1}, but ρ_k is used only
later for the computation of x_{k+N}.
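The iteration of Corollary 11.2.2 is easy to experiment with in Euclidean space. The following Python sketch is illustrative only (the two constraint sets, the helper names, and the number of sweeps are my own choices, not from the thesis); it implements the unrelaxed case, i.e. Dykstra's algorithm, which converges to P_C x_0 by the Boyle/Dykstra theorem:

```python
import numpy as np

def project_disc(x, radius=1.0):
    """Euclidean projection onto the disc {x : ||x|| <= radius}."""
    n = np.linalg.norm(x)
    return x if n <= radius else (radius / n) * x

def project_halfspace(x, a, b):
    """Euclidean projection onto the halfspace {x : <a, x> <= b}."""
    excess = a @ x - b
    return x if excess <= 0 else x - (excess / (a @ a)) * a

def dykstra(x0, projections, sweeps=2000):
    """Dykstra's algorithm: before each projection, add back the
    increment q_i remembered from the previous sweep."""
    x = np.asarray(x0, dtype=float)
    q = [np.zeros_like(x) for _ in projections]  # q_{-(N-1)} = ... = q_0 = 0
    for _ in range(sweeps):
        for i, P in enumerate(projections):
            y = P(x + q[i])       # project the corrected point
            q[i] = x + q[i] - y   # store the new increment
            x = y
    return x

a = np.array([1.0, 1.0])
P1 = lambda z: project_disc(z, 1.0)
P2 = lambda z: project_halfspace(z, a, 0.5)
x = dykstra(np.array([2.0, 0.3]), [P1, P2])  # approximates P_C x_0
```

Dropping the lines involving q (i.e. forcing q[i] = 0) turns the sketch into the method of cyclic projections, which finds some point of C but in general not P_C x_0.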
11.3 Notes
It is not too hard to show that the von Neumann/Halperin result holds true when the
constraints are intersecting closed affine subspaces.
A very nice survey on the many applications of the von Neumann/Halperin result is
Deutsch's [48].
Dykstra's algorithm can also be explained from a Convex Analysis viewpoint: see Gaffke
and Mathar's [70].
For two intriguing generalizations of Dykstra's algorithm, see Hundal and Deutsch's [93].
11.4 Open problems
Here are my favourite open problems out of many more that could be formulated (what
happens for infinitely many constraints? for other controls? etc.). My "top three" are
extremely simple to state; but, more importantly, I believe that genuinely new ideas are
required to tackle these problems.
11.4.1 The alternating projections problem
The alternating projections problem. Suppose C_1, C_2 are closed convex
subsets of X with projections P_1, P_2 and C := C_1 ∩ C_2 ≠ ∅. For an arbitrary
x_0 ∈ X, generate the sequence of alternating projections:
Is it possible that (x_n) converges weakly but not in norm?
In the language of Chapter 8, the sequence (x_n) is generated by a cyclic unrelaxed projection
algorithm with two constraints. We know by either Theorem 7.5.1 and Remarks 7.5.2
or Example 9.5.1 that (x_n) converges weakly to some point in C. Bounded regularity is
sufficient for norm convergence (by Theorem 8.6.3.(i)) but not necessary; see Remark 4.5.2
and Fact 9.6.2. (For an example rooted in a Hilbert lattice, see [15, Example 5.5].) Note
that a slight perturbation of the method of alternating projections always produces norm
convergent sequences; see [11].
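In finite dimensions weak and norm convergence coincide, so the open problem is invisible there; still, the iteration itself is easy to experiment with. A minimal Python sketch (the two discs and all helper names are my own illustrative choices):

```python
import numpy as np

def project_ball(x, center, radius):
    """Euclidean projection onto the closed ball B(center, radius)."""
    d = x - center
    n = np.linalg.norm(d)
    return x if n <= radius else center + (radius / n) * d

def alternating_projections(x0, P1, P2, steps=500):
    """Generate x, P1 x, P2 P1 x, P1 P2 P1 x, ... and return the last iterate."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = P2(P1(x))
    return x

# Two overlapping unit discs; C = C1 n C2 is a lens with nonempty interior.
c1, c2 = np.array([0.0, 0.0]), np.array([1.5, 0.0])
P1 = lambda z: project_ball(z, c1, 1.0)
P2 = lambda z: project_ball(z, c2, 1.0)
x = alternating_projections(np.array([0.0, 3.0]), P1, P2)
# x lies (numerically) in both discs, i.e. in C.
```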
11.4.2 The random projections problem
The random projections problem. Suppose C_1, ..., C_N are finitely many
closed convex subsets of X with projections P_1, ..., P_N and C := ∩_i C_i ≠ ∅.
Suppose further r is a map from the nonnegative integers onto {1, ..., N} that
takes each value infinitely often; in other words: r is an N-die. For an arbitrary
x_0 ∈ X, generate the sequence of random projections:
Is it possible that (x_n) does not converge even weakly to some point in C?
For N = 2, the answer is negative, as the sequence generated is "essentially" the same as a
sequence of alternating projections (projections are idempotent).
As with the alternating projections problem, some sufficient conditions are known. The most
famous one is due to Amemiya and Ando [2] and dates back to 1965: weak convergence
holds if each C_i is a subspace. (Ironically, there is no example where norm convergence fails.)
For a condition guaranteeing norm convergence, based on bounded regularity, see [12], which
also contains pointers to other results. Dye and Reich [56] obtained weak convergence for
N = 3; their proof uses combinatorial ideas. For N ≥ 4, the question is wide open. In
Euclidean spaces, (norm) convergence holds for the so-called method of random Bregman
projections. This method uses more general Bregman distances to define projections; we
refer the interested reader to [13] for further information.
11.4.3 The cyclic projections problem
The cyclic projections problem. Suppose C_1, ..., C_N are finitely many
closed convex subsets of X with projections P_1, ..., P_N and ∩_i C_i = ∅ (the
constraints have no common point). For an arbitrary x_0 ∈ X, generate the
sequence of cyclic projections:
What is the behaviour of the sequence (x_n)?
This problem is of considerable interest in applications. The theoretical model is "perfect"
(the constraints can be met simultaneously); but in practice, no solutions can be found
because measurement errors lead to an infeasible problem. Of course, if you apply cyclic
projections in such a case, you want to understand what the sequence does in the infeasible
case. For N = 2, the behaviour is well understood: subsequences try to realize the distance
between the (nonintersecting) constraint sets and thus either diverge or converge weakly,
depending on whether or not the distance between the constraints is attained. For this and
other references, the reader is referred to [15]. For N ≥ 3, the geometry is hardly understood
and only partial results are known; see [18]. The cyclic projections problem is still in its
infancy.
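For N = 2 the behaviour just described is easy to visualize numerically. A small Python sketch with two disjoint discs (an illustrative setup of my own, where the distance 2 between the sets is attained at (1, 0) and (3, 0)):

```python
import numpy as np

def project_ball(x, center, radius):
    """Euclidean projection onto the closed ball B(center, radius)."""
    d = x - center
    n = np.linalg.norm(d)
    return x if n <= radius else center + (radius / n) * d

# Two disjoint unit discs: C1 = B((0,0), 1) and C2 = B((4,0), 1).
c1, c2 = np.array([0.0, 0.0]), np.array([4.0, 0.0])
x = np.array([0.0, 2.0])
for _ in range(200):
    y = project_ball(x, c1, 1.0)  # subsequence of iterates in C1
    x = project_ball(y, c2, 1.0)  # subsequence of iterates in C2
# (y, x) approximates the pair of nearest points of C1 and C2,
# realizing dist(C1, C2) = 2.
```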
11.5 Conclusion
In Part I of this thesis, I have investigated many apparently different-looking algorithms
in the unifying framework of general projection algorithms. A strict modularization of the
problem made the analysis possible. The three key modules, each of which is of considerable
interest in itself, study:
projections and their properties;
(bounded) (linear) regularity;
Fejér monotone sequences.
I leant heavily on tools from the beautiful and powerful area of Convex Analysis.
Part II
Monotone Operators in Banach
spaces
Chapter 12
The zoo of monotonicities
12.1 Overview
We introduce the various notions of maximal monotonicity (coined by Gossez, by Fitzpatrick
and Phelps, and by Simons) and record basic relationships and known facts. We then
study these notions for a continuous linear monotone operator and lay the groundwork
for the next chapter's main results. We rely on the simple yet immensely useful decomposition
of a continuous linear monotone operator into the sum of a symmetric part and a skew
part (Proposition 12.3.5). Being the subdifferential of a (quadratic) convex function, the
symmetric part has excellent properties.
12.2 Basic properties and facts
Most of the several stronger notions of maximal monotonicity are defined in terms of exten-
sions of graphs in a larger space.
Definition 12.2.1 Suppose T is a set-valued map from X to X*. Define set-valued maps
T1, T0, T̄ from X** to X* via their graphs as follows:
(i) (x**, x*) ∈ gra T1, if there exists a bounded net (x_α, x_α*) in gra T with x_α → x** weak*
and x_α* → x* in norm.
(ii) (x**, x*) ∈ gra T0, if inf_{(y,y*) ∈ gra T} ⟨y* − x*, y − x**⟩ = 0.
(iii) (x**, x*) ∈ gra T̄, if inf_{(y,y*) ∈ gra T} ⟨y* − x*, y − x**⟩ ≥ 0.
The extension T1 is topological, whereas T0 and T̄ are more geometrical. The graph of T̄ consists of all pairs in X** × X* that are "monotonically related" to gra T.
Remarks 12.2.2 Suppose T is a set-valued map from X to X*.
(i) T is monotone if and only if gra T ⊆ gra T̄.
(ii) If T is monotone, then so is T1.
(iii) T is maximal monotone if and only if gra T = (gra T̄) ∩ (X × X*).
(iv) If T is maximal monotone, then T̄ need not be monotone; see Remark 14.4.4.
Proposition 12.2.3 Suppose T is a monotone operator from X to X*. Then the following
inclusions hold in X** × X*:
gra T ⊆ gra T1 ⊆ gra T0 ⊆ gra T̄ = (gra (T1)¯) ∩ (X** × X*).
Proof. The inclusions gra T ⊆ gra T1 and gra T0 ⊆ gra T̄ ⊇ (gra (T1)¯) ∩ (X** × X*) are obvious
(even without monotonicity). Fix an arbitrary (x**, x*) ∈ gra T1 and obtain a bounded net
(x_α, x_α*) in gra T with x_α → x** weak* and x_α* → x*. Then ⟨x_α − y, x_α* − y*⟩ ≥ 0, ∀α, (y, y*) ∈ gra T;
taking limits yields ⟨x** − y, x* − y*⟩ ≥ 0. On the other hand, ⟨x** − x_α, x* − x_α*⟩ → 0;
altogether, 0 = inf_{(y,y*) ∈ gra T} ⟨x** − y, x* − y*⟩, i.e., (x**, x*) ∈ gra T0. Hence gra T1 ⊆ gra T0.
Finally, pick (z**, z*) ∈ gra T̄. Then 0 ≤ inf_{(y,y*) ∈ gra T} ⟨y* − z*, y − z**⟩ ≤ lim_α ⟨x_α* −
z*, x_α − z**⟩ = ⟨x* − z*, x** − z**⟩; so (z**, z*) is monotonically related to gra T1, hence
gra T̄ ⊆ (gra (T1)¯) ∩ (X** × X*). ■
These basic inclusions allow a nice motivation of some of the various types of monotonicity:
Definition 12.2.4 Suppose T is a monotone operator from X to X*. Then:
(i) (Gossez [76]) T is of dense type or of type (D), if T1 = T̄.
(ii) (Simons [146, Definition 14]) T is of range-dense type or of type (WD), if for every
x* ∈ ran T̄, there exists a bounded net (x_α, x_α*) ∈ gra T with x_α* → x*.
(iii) (Simons [146, Definition 10]) T is of type (NI), if inf_{(y,y*) ∈ gra T} ⟨y* − x*, y − x**⟩ ≤ 0,
for all (x**, x*) ∈ X** × X*. If this holds only on some subset of X** × X*, then we
say that T is of type (NI) with respect to this subset.
(iv) (Fitzpatrick and Phelps [66, Section 3]) T is locally maximal monotone, if (gra T⁻¹) ∩
(V × X) is maximal monotone in V × X, for every convex open set V in X* with
V ∩ ran T ≠ ∅.
(v) (Fitzpatrick and Phelps [67, Section 3]) T is maximal monotone locally, if (gra T) ∩
(U × X*) is maximal monotone in U × X*, for every convex open set U in X with
U ∩ dom T ≠ ∅.
(vi) T is unique, if all maximal monotone extensions of T in X** × X* coincide.
In reflexive spaces, maximal monotonicity of dense type, local maximal monotonicity and
(ordinary) maximal monotonicity are all the same:
Fact 12.2.5 Suppose X is reflexive and T is a monotone operator from X to X*. Then
TFAE: (i) T is maximal monotone; (ii) T is maximal monotone and of dense type; (iii) T is
locally maximal monotone.
Proof. See, for instance, Phelps's [121, Example 3.2.(b) and Proposition 4.4]. ■
Fact 12.2.6 Suppose T is a monotone operator from X to X*. Then TFAE:
(i) T is unique.
(ii) T̄ is the unique maximal monotone extension of T in X** × X*.
(iii) T̄ is maximal monotone.
(iv) T̄ is monotone.
Moreover: If T is of type (NI), then (i)-(iv) hold.
Proof. See Simons's [146, Theorem 19]. ■
A unique maximal monotone operator need not be of type (NI): consider the operator G
in Example 14.2.2. This example also shows that Simons's [146, Theorem 19] cannot be
improved (meaning that we cannot add the condition "T is of type (NI)" to the list of
characterizations of uniqueness in Fact 12.2.6).
The following characterizations are sometimes more handy to work with:
Proposition 12.2.7 Suppose T is a monotone operator from X to X*. Then:
(i) T is of dense type if and only if T1 is maximal monotone.
(ii) T is of range-dense type if and only if ran T1 = ran T̄.
(iii) T is of type (NI) if and only if T0 = T̄.
(iv) T is locally maximal monotone if and only if for every weak* closed convex bounded
subset C of X* with ran T ∩ int C ≠ ∅, and for every x_0 ∈ X, x_0* ∈ (int C) \ Tx_0, there
exists (z, z*) ∈ gra T ∩ (X × C) with ⟨z* − x_0*, z − x_0⟩ < 0.
(v) T is maximal monotone locally if and only if for every bounded closed convex subset
C of X with dom T ∩ int C ≠ ∅, and for every x_0 ∈ int C, x_0* ∈ X* \ Tx_0, there exists
(z, z*) ∈ gra T ∩ (C × X*) with ⟨z* − x_0*, z − x_0⟩ < 0.
Proof. (i): "⇒": T is of dense type ⇔ T1 = T̄. Now T1 is monotone (Remarks 12.2.2.(ii)),
hence so is T̄. By Fact 12.2.6, T̄ = T1 is maximal monotone. "⇐": Pick (x**, x*) ∈ gra T̄.
Then (by Proposition 12.2.3) (x**, x*) ∈ gra (T1)¯, i.e., this point is monotonically related to
gra T1. Now T1 is maximal monotone, hence (x**, x*) ∈ gra T1. (ii): "⇒": Pick x* ∈ ran T̄.
By assumption, there exists a bounded net (x_α, x_α*) in gra T such that x_α* → x*. Without
loss, we can assume that x_α → x** weak*. Then (x**, x*) ∈ gra T1 and in particular x* ∈ ran T1.
"⇐" is even simpler. (iii): Let us abbreviate inf_{(y,y*) ∈ gra T} ⟨x** − y, x* − y*⟩ by I. "⇒": If
(x**, x*) ∈ gra T̄, then I ≥ 0. Now T is of type (NI), hence I ≤ 0. Thus I = 0. "⇐": Fix
(x**, x*) ∈ X** × X*. If (x**, x*) ∉ gra T̄, then I < 0. Otherwise, (x**, x*) ∈ gra T̄ = gra T0 and
hence I = 0. (iv) is Phelps's [121, Proposition 4.3], and (v) is proved analogously. ■
Proposition 12.2.8 Suppose T is a monotone operator from X to X*. Define T̄ \ T0 via
its graph: gra (T̄ \ T0) := (gra T̄) \ (gra T0). Then ran (T̄ \ T0) ⊆ ran T̄ \ ran T1.
Proof. Let x* ∈ ran (T̄ \ T0), i.e., there is some x** ∈ X** such that (x**, x*) ∈ (gra T̄) \
(gra T0): inf_{(y,y*) ∈ gra T} ⟨x** − y, x* − y*⟩ > 0. Now assume to the contrary that x* ∈ ran T1,
say (y**, x*) ∈ gra T1, for some y** ∈ X**. Then we would obtain a bounded net (y_α, y_α*) in
gra T with y_α → y** weak* and y_α* → x*. It follows that ⟨x** − y_α, x* − y_α*⟩ → ⟨x** − y**, x* − x*⟩ = 0,
implying the contradiction inf_{(y,y*) ∈ gra T} ⟨x** − y, x* − y*⟩ ≤ 0. ■
The following implications are due to Simons; to illustrate Proposition 12.2.7, we prove
some of them.
Fact 12.2.9 For any monotone operator from X to X*, the following implications hold:
dense type ⇒ range-dense type ⇒ type (NI) ⇒ unique.
Proof. (See also [146, Lemma 15 and Theorem 19].) Suppose T is a monotone operator
from X to X*. Using Definition 12.2.4, Proposition 12.2.7, and Proposition 12.2.8, we obtain
the following cascade of implications: T is of dense type ⇔ T1 = T̄ ⇒ ran T1 = ran T̄ ⇔ T
is of range-dense type ⇒ ran (T̄ \ T0) = ∅ ⇔ T0 = T̄ ⇔ T is of type (NI). The remaining
implication is contained in Fact 12.2.6. ■
All these notions were introduced to generalize the properties of the subdifferential operator;
so the following deep facts due to Gossez, Simons, Fitzpatrick and Phelps do not come as a
surprise.
Fact 12.2.10 Suppose f is a convex lower semicontinuous proper function on X. Then:
(i) ∂f is maximal monotone and of dense type; in fact, (∂f)1 = (∂f*)⁻¹.
(ii) ∂f is locally maximal monotone.
(iii) ∂f is maximal monotone locally.
Proof. (i): [74, Théorème 3.1]. (ii): [144]. (iii): [67, Corollary 3.4]. ■
Fact 12.2.10 can be viewed as a sharpening of a classical result due to Rockafellar (see
Fact 2.5.1).
Fact 12.2.11 Suppose T is a monotone operator from X to X*. If T is of range-dense type
or locally maximal monotone, then cl ran T is convex.
Proof. See Simons's [146, Theorem 17] (resp. Fitzpatrick and Phelps's [66, Theorem 3.5])
when T is of range-dense type (resp. locally maximal monotone). ■
Recall that a set-valued map T from X to X* is coercive, if lim_{||x|| → +∞} inf ⟨Tx, x⟩/||x|| = +∞.
Fact 12.2.12 Suppose T is a coercive monotone operator from X to X*. If T is of dense
type, then ran T1 = cl ran T = X*.
Proof. See Gossez's [74, Théorème 8.1]. ■
Fact 12.2.13 Suppose T is a coercive maximal monotone operator from X to X*. If
cl ran T = X*, then T is locally maximal monotone.
Proof. See Phelps's [121, Theorem 4.8.(ii)]. ■
Corollary 12.2.14 Every coercive maximal monotone operator of dense type is locally
maximal monotone.
Proof. (See also Phelps's [121, Corollary 4.9].) Fact 12.2.12 and Fact 12.2.13. ■
Corollary 12.2.14 is the only known "general" result connecting monotone operators of dense
type with locally maximal monotone operators; it will be slightly improved in this chapter's
notes.
12.3 The monotonicities for linear operators
Theorem 12.3.1 Suppose T is a continuous linear operator from X to X*. Then:
(i) gra T ⊆ (gra T**) ∩ (gra (T*|X)*) ⊆ (gra T**) ∩ (X** × X*) ⊆ gra T**.
(ii) gra T1 = cl_{weak* × ||·||} gra T = (gra T**) ∩ (X** × X*).
Proof. (See also Gossez's [75, end of Section 2].) (i): is straightforward and thus omitted.
(ii): we start with the second equality. Consider Y := (X**, weak*) × (X*, ||·||). Then
Y* = (X*, ||·||) × (X**, ||·||) ([91, Theorem 18.E]) and we can compute the closure of gra T
in Y using the Bipolar Theorem ([43, Theorem V.1.8]): one first checks that (gra T)⊥ =
{(−T*x**, x**) : x** ∈ X**} and then that cl_{weak* × ||·||} gra T = ⊥((gra T)⊥) = (gra T**) ∩
(X** × X*). Now we turn our attention to the first equality. It is clear that gra T1 ⊆
cl_{weak* × ||·||} gra T. Conversely, pick (x**, x*) ∈ cl_{weak* × ||·||} gra T, i.e., x* = T**x** ∈ X* (by
the second equality). There exists a net (x_α) in X with x_α → x** weak* and Tx_α → x*. Now fix
an arbitrary positive integer n and an arbitrary weak* neighborhood V of 0 (in X**). Let
C_n := T⁻¹(x* + (1/n)B_{X*}). Then C_n is closed and convex. Moreover, C_n contains x_α for all
large α; hence x** ∈ cl_{weak*} C_n. Proposition 2.3.2 (applied to C_n) guarantees the existence
of a point x_(n,V) with:
Let 𝒱 be the set of all weak* neighborhoods of 0 and consider the following binary relation
on ℕ × 𝒱: (n_1, V_1) ≼ (n_2, V_2), if n_1 ≤ n_2 and V_2 ⊆ V_1. Then (ℕ × 𝒱, ≼) is a directed set. The
net x_(n,V) is bounded with x_(n,V) → x** weak* and Tx_(n,V) → x*. Consequently, (x**, x*) ∈ gra T1
and the proof is complete. ■
Corollary 12.3.2 Suppose T is a continuous linear operator from X to X*. Then T is
weakly compact (resp. Tauberian) if and only if T1 = T** (resp. T1 = T).
Recall that if T is a continuous linear operator from X to X* with ⟨Tx, x⟩ ≥ 0, ∀x ∈ X,
then T is called positive or positive semi-definite. The following result is part of the folklore.
Proposition 12.3.3 Suppose T is a continuous linear operator from X to X*. Then TFAE:
(i) T is positive; (ii) T is monotone; (iii) T is maximal monotone.
Proof. By linearity of T, (i) and (ii) are equivalent; also, (iii) implies (ii). "(ii)⇒(iii)":
(See also [121, Proof of Example 1.5.(b)].) Suppose (x, x*) ∈ (gra T̄) ∩ (X × X*). Then, for
every λ > 0 and u ∈ X: 0 ≤ ⟨T(x + λu) − x*, (x + λu) − x⟩ = λ⟨Tx − x*, u⟩ + λ²⟨Tu, u⟩.
Dividing by λ and letting λ tend to 0 yields 0 ≤ ⟨Tx − x*, u⟩. Since u is arbitrary, we
conclude x* = Tx. Hence T is maximal monotone by Remarks 12.2.2.(iii). ■
The following result characterizes monotonicity of conjugate operators.
Proposition 12.3.4 Suppose T is a continuous linear operator from X to X*. Then T is
monotone and of type (NI) with respect to gra (−T*) if and only if T* is monotone.
Proof. Clearly, if T* is monotone, then so is T. So suppose T is monotone. Fix x** ∈ X**
and x ∈ X. Then ⟨Tx + T*x**, x − x**⟩ = ⟨Tx, x⟩ − ⟨Tx, x**⟩ + ⟨T*x**, x⟩ − ⟨T*x**, x**⟩ ≥
−⟨T*x**, x**⟩. Hence −⟨T*x**, x**⟩ ≤ inf_{x ∈ X} ⟨Tx + T*x**, x − x**⟩ ≤ ⟨T0 + T*x**, 0 − x**⟩ =
−⟨T*x**, x**⟩ and thus:
inf_{(y,y*) ∈ gra T} ⟨y* − (−T*x**), y − x**⟩ = inf_{x ∈ X} ⟨Tx + T*x**, x − x**⟩ = −⟨T*x**, x**⟩.
The result follows readily. ■
Our study of continuous linear monotone operators relies on the following easy-to-prove yet
immensely useful result.
Proposition 12.3.5 Suppose T is a continuous linear operator from X to X*. Then T can
be written as the sum of two continuous linear operators, T = P + S, where P is symmetric
and S is skew. This decomposition is unique; in fact:
Px = ½Tx + ½T*x and Sx = ½Tx − ½T*x, ∀x ∈ X.
We refer to P (resp. S) as the symmetric part (resp. skew part) of T.
From a monotone operator theory point of view, the symmetric part P is very nice (being
equal to the subdifferential of a convex function whenever T is monotone; see Proposi-
tion 12.3.6). Although the skew part S is very monotone (⟨Sx, x⟩ = 0, ∀x ∈ X), it is far
away from being the subdifferential of a convex function (recall that Hessians of convex
functions are symmetric, not skew!). So it is not unexpected that the skew part S plays a
major role (see Chapter 13).
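In the Euclidean case X = ℝⁿ, where T is an n×n matrix and T* its transpose, the decomposition of Proposition 12.3.5 and the "very monotone" identity ⟨Sx, x⟩ = 0 can be checked directly. A numerical sanity sketch (the random matrix is an arbitrary choice of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((4, 4))

P = (T + T.T) / 2  # symmetric part: Px = (1/2)Tx + (1/2)T*x
S = (T - T.T) / 2  # skew part:      Sx = (1/2)Tx - (1/2)T*x

assert np.allclose(T, P + S)  # T = P + S
assert np.allclose(P, P.T)    # P is symmetric
assert np.allclose(S, -S.T)   # S is skew

x = rng.standard_normal(4)
assert np.isclose(x @ (S @ x), 0.0)  # <Sx, x> = 0 for every x
```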
Let us now study the important concepts of symmetric and skew operators separately.
12.3.1 Symmetric operators
Proposition 12.3.6 Suppose T is a continuous linear operator from X to X* with sym-
metric part P. Let q(x) := ½⟨x, Tx⟩, ∀x ∈ X. Then: q is convex ⇔ T is monotone ⇔ P
is monotone. In this case, we have furthermore:
(i) ∇q = P.
(ii) q* ∘ P = q.
(iii) ran P ⊆ dom q* ⊆ cl ran P.
(iv) q* is strictly convex on ran P.
(v) q* is nonnegative and quadratic-homogeneous, i.e. q*(tx*) = t²q*(x*), ∀x* ∈ X*, t ∈ ℝ.
Proof. Since q is continuous, it suffices to check midpoint convexity; fixing two arbitrary
points x, y ∈ X, we have
(i): Pick (x, x*) ∈ ∂q (possible, because of Fact 2.2.7.(ii)). Then t⟨x*, h⟩ ≤ q(x + th) − q(x),
∀h ∈ X, t > 0; this simplifies to ⟨x*, h⟩ ≤ ⟨½Tx + ½T*x, h⟩ + ½t⟨h, Th⟩. Letting t tend to 0
yields x* = ½Tx + ½T*x = Px. (ii): Fix x_0 ∈ X. Then
q*(Px_0) = sup_{x ∈ X} ⟨Px_0, x⟩ − q(x) = −inf_{x ∈ X} {q(x) + ⟨−Px_0, x⟩};
this last infimum can be viewed as a little optimization problem which is easy to solve:
indeed, after taking gradients, we learn that the set of minimizers equals x_0 + ker P. It
follows that q*(Px_0) = q(x_0). (iii): The first inclusion follows from (ii). Now fix an arbitrary
x* ∈ dom q*. Again, viewing q* as an optimization problem turns out to be useful: let
f(x) := q(x) − ⟨x*, x⟩; then
q*(x*) = −inf_{x ∈ X} {q(x) − ⟨x*, x⟩} = −inf_{x ∈ X} f(x).
Fix ε > 0. By [120, Lemma 3.22], there exists some x ∈ X such that ||∇f(x)|| = ||Px − x*|| < ε.
As ε was chosen arbitrarily, it follows that x* ∈ cl ran P. (iv): follows precisely as in [89,
Theorem X.4.1.3]. (v): q*(x*) ≥ ⟨x*, 0⟩ − q(0) = 0, ∀x* ∈ X*. q* is quadratic-homogeneous,
since q is. ■
The next proposition complements Proposition 12.3.3. It also follows from Fitzpatrick and
Phelps's [67, Theorem 3.10]; however, here we provide a simpler Convex Analysis proof
which demonstrates clearly the usefulness of Fenchel's Duality Theorem (Fact 2.2.23) in
monotone operator theory.
Proposition 12.3.7 Suppose T is a continuous linear operator from X to X*. Then T is
maximal monotone locally if and only if T is monotone.
Proof. "⇒": choose U := X in Definition 12.2.4.(v). "⇐": by Proposition 12.2.7.(v),
we fix a bounded closed convex subset C in X, x_0 ∈ int C, and x_0* ∈ X* \ Tx_0. Let
μ := inf_{x ∈ C} ½⟨Tx − x_0*, x − x_0⟩. Our aim is μ < 0. Since C is bounded, the infimum μ is
finite. Define f(x) := q(x) − ⟨½x_0* + ½T*x_0, x⟩ + ½⟨x_0*, x_0⟩ and g := ι_C, where q(x) := ½⟨x, Tx⟩,
∀x ∈ X. Then q is convex and continuous on X (Proposition 12.3.6), hence so is f. Also,
μ = inf_{x ∈ X} f(x) + g(x). Now Fact 2.2.23 yields the existence of some x* ∈ X* such that
μ = −{f*(−x*) + g*(x*)}. Moreover, since x_0 is in the interior of C, we estimate
μ = −{f*(−x*) + g*(x*)} = −f*(−x*) − ι_C*(x*)
< −f*(−x*) − ⟨x*, x_0⟩ ≤ −{⟨−x*, x_0⟩ − f(x_0)} − ⟨x*, x_0⟩
= −{⟨−x*, x_0⟩ − q(x_0) + ⟨½x_0* + ½T*x_0, x_0⟩ − ½⟨x_0*, x_0⟩} − ⟨x*, x_0⟩ = 0. ■
The power of the decomposition stems from the following result:
Theorem 12.3.8 Suppose P is a continuous linear monotone symmetric operator from X
to X*. Then P1 = P0 = P̄ = P* = P**. Consequently: P is maximal monotone of dense
type, weakly compact, locally maximal monotone, and maximal monotone locally; P* is
monotone and symmetric.
Proof. By Proposition 12.3.6, P is the subdifferential of the continuous convex function
½⟨x, Px⟩. Hence (Fact 12.2.10 and Proposition 12.3.7) P is maximal monotone and of dense
type, locally maximal monotone, and maximal monotone locally; in particular, P1 = P0 = P̄
and P is of type (NI). It follows that on the one hand, P* is a maximal monotone extension
of P (Proposition 12.3.4 and Proposition 12.3.3); on the other hand, P̄ is the unique maximal
monotone extension of P in X** × X* (Fact 12.2.6). Altogether, P̄ = P*. Now P* = P**,
because P1 = P* and gra P1 ⊆ gra P** (Theorem 12.3.1). Finally, the weak compactness of
P follows from Corollary 12.3.2. ■
The next corollary is used repeatedly.
Corollary 12.3.9 Suppose P is a continuous linear monotone symmetric operator from X
to X*. Then for every x** ∈ X**, there exists a bounded net (x_α) in X such that x_α → x** weak*
and Px_α → P*x** = P**x**.
Proof. P1 = P* = P**. ■
Theorem 12.3.8 allows us to strengthen parts of Proposition 12.3.6:
Proposition 12.3.10 Suppose P is a continuous linear monotone symmetric operator from
X to X*. Let q(x) := ½⟨x, Px⟩, ∀x ∈ X. Then
q*(P*x**) = ½⟨x**, P*x**⟩ = q**(x**), ∀x** ∈ X**.
Also, dom ∂q* = ran P* and ∇q** = P** = P*.
Proof. Fix x** ∈ X** and define g(x) := ⟨−P*x**, x⟩ + ½⟨x**, P*x**⟩, ∀x ∈ X. Then
(x**, P*x**) ∈ gra P0 (Theorem 12.3.8) and hence
0 = ½ inf_{x ∈ X} ⟨Px − P*x**, x − x**⟩ = inf_{x ∈ X} q(x) + g(x).
The conjugate of g is given by (see Example 2.2.14) g*(x*) = −½⟨x**, P*x**⟩ + ι_{{−P*x**}}(x*),
∀x* ∈ X*. Fact 2.2.23 yields
0 = −inf_{x* ∈ X*} {q*(x*) + g*(−x*)} = ½⟨x**, P*x**⟩ − q*(P*x**),
which is the first equality. To prove the second equality, we first note that the first equality
implies
q**(x**) = sup_{x* ∈ X*} ⟨x**, x*⟩ − q*(x*) ≥ ⟨x**, P*x**⟩ − q*(P*x**) = ½⟨x**, P*x**⟩.
On the other hand, by Corollary 12.3.9, there is a bounded net (x_α) in X such that x_α → x** weak*
and Px_α → P*x**. Then for every x* ∈ X*, we estimate
q*(x*) ≥ lim_α ⟨x*, x_α⟩ − ½⟨x_α, Px_α⟩ = ⟨x**, x*⟩ − ½⟨x**, P*x**⟩.
This in turn implies ½⟨x**, P*x**⟩ ≥ ⟨x**, x*⟩ − q*(x*), hence ½⟨x**, P*x**⟩ ≥ q**(x**), which yields
the second equality. "Also" part: By Theorem 12.3.8 and Fact 12.2.10, dom ∂q* = ran (∂q)1 =
ran P*. Finally, ∇q** = P** (apply Proposition 12.3.6 to P* = P**). ■
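In finite dimensions the identity q* ∘ P = q from Proposition 12.3.6.(ii) can be tested against the well-known formula q*(y) = ½⟨y, P⁺y⟩ for y ∈ ran P, where P⁺ denotes the Moore-Penrose inverse (cf. Remark 12.4.4). A sketch (the particular P, built to be symmetric positive semidefinite and singular, is an illustrative choice of mine):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
P = A @ A.T  # symmetric, positive semidefinite, rank 3: ran P is a proper subspace

def q(x):
    """q(x) = (1/2) <x, Px>."""
    return 0.5 * x @ (P @ x)

def q_star(y):
    """On ran P, the conjugate is q*(y) = (1/2) <y, P^+ y>."""
    return 0.5 * y @ (np.linalg.pinv(P) @ y)

x0 = rng.standard_normal(5)
# Proposition 12.3.6.(ii): q* o P = q, since P P^+ P = P.
assert np.isclose(q_star(P @ x0), q(x0))
```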
12.3.2 Skew operators
Parts of the next two propositions are implicit in Gossez's work [75, 77].
Proposition 12.3.11 Suppose S is a continuous linear skew operator from X to X* and
(x**, x*) ∈ X** × X*. Then:
(i) x* ∈ S1x** ⇔ x* = S**x** = −S*x** ⇔ −x* ∈ (−S)1x**.
(ii) x* ∈ S0x** ⇔ x* = −S*x** and ⟨S*x**, x**⟩ = 0 ⇔ −x* ∈ (−S)0x**.
(iii) x* ∈ S̄x** ⇔ x* = −S*x** and ⟨S*x**, x**⟩ ≤ 0.
Consequently: gra S1 = gra S** ∩ gra (−S*) ⊆ gra (−S*), (−S)1 = −S1, and
(−S)0 = −S0. Moreover:
(iv) −S* is monotone ⇔ S̄ = −S*; and S is unique ⇔ gra S̄ ⊆ gra (−S̄).
(v) S** is skew ⇔ S is weakly compact ⇔ S1 = S** ⇔ S0 = −S* ⇔ −S̄ = (−S)¯.
Proof. Fix an arbitrary y ∈ X. Then, using the skewness of S, ⟨x** − y, x* − Sy⟩ =
⟨x**, x*⟩ − ⟨y, S*x** + x*⟩. Hence
inf_{(y,y*) ∈ gra S} ⟨x** − y, x* − y*⟩ = ⟨x**, x*⟩, if x* = −S*x**; and = −∞, otherwise.
(ii) and (iii) follow readily. For (i), observe that: x* ∈ S1x** ⇔ (x**, x*) ∈ gra S1 ∩ gra S0
(Proposition 12.2.3) ⇔ x* = S**x** = −S*x** ∈ X* ((ii) and Theorem 12.3.1; ran S* ⊆ X*).
Now the "Consequently" part follows from (i)-(iii) and Proposition 12.2.3. "Moreover": S
is unique if and only if S̄ is monotone (Fact 12.2.6); hence (iv) follows. For (v), observe
that S** is skew ⇔ S0 = −S* ⇔ −S̄ = (−S)¯ ⇔ S** = −S* ⇔ S1 = S** (since gra S1 =
gra S** ∩ gra (−S*)) ⇔ S is weakly compact (Corollary 12.3.2). ■
Although (−S)1 = −S1 and (−S)0 = −S0 for every continuous linear skew operator S,
the formula "(−S)¯ = −S̄" is false in general: see, for instance, Example 14.2.2 and Exam-
ple 14.2.4.
Proposition 12.3.12 Suppose S is a continuous linear skew operator from X to X*. Sup-
pose further S* or −S* is monotone. If x** is a point in X** with ⟨x**, S*x**⟩ = 0, then
S**x** = −S*x**. Therefore, S1 = S0 and (−S)1 = (−S)0.
Proof. Suppose x** ∈ X** with ⟨S*x**, x**⟩ = 0. Case 1: S* is monotone. Then fix
y** ∈ X** and λ > 0. Thus:
0 = ⟨S*x**, x**⟩ = ⟨S*(x** + λy**), x** + λy**⟩ − λ⟨S*(x** + λy**), y**⟩ − λ⟨S*y**, x**⟩
≥ −λ⟨S*x**, y**⟩ − λ⟨S*y**, x**⟩ − λ²⟨S*y**, y**⟩.
Now divide by λ and then let λ tend to 0 to conclude ⟨S*x**, y**⟩ ≥ −⟨S*y**, x**⟩. Replace
y** by −y** and obtain ⟨S*x**, y**⟩ ≤ −⟨S*y**, x**⟩. Altogether:
⟨S*x**, y**⟩ = −⟨S*y**, x**⟩, ∀y** ∈ X**.
It follows that S**x** = −S*x**. Case 2: −S* is monotone. Then, by the first case,
(−S)**x** = −(−S)*x**, as desired. Using Proposition 12.3.11 repeatedly, we conclude as
follows: Fix an arbitrary (x**, x*) ∈ gra S0. Then x* = −S*x** and ⟨S*x**, x**⟩ = 0; hence,
by what we just proved, S**x** = −S*x** and so (x**, x*) ∈ gra S1. Thus S1 = S0, which
yields (−S)1 = −S1 = −S0 = (−S)0. ■
12.4 Notes
The extensions T1 and T̄ were considered by Gossez [75, Section 2] and by Phelps [121,
Section 3], respectively.
We now present the announced improvement of Corollary 12.2.14. (Recall that a set-valued
map is bounded if it maps bounded sets to bounded sets.)
Theorem 12.4.1 Suppose T is a maximal monotone operator from X to X*. Consider the
following conditions: (i) T is coercive and of dense type; (ii) (T̄)⁻¹ is bounded and T is of
range-dense type; (iii) T is locally maximal monotone. Then: (i)⇒(ii)⇒(iii).
Proof. "(i)⇒(ii)": using the definition of T1 and weak* lower semicontinuity of the norm
in X**, it is easy to prove that T1 is coercive whenever T is. Now T is of dense type,
thus T1 = T̄ and T is of range-dense type; so (T̄)⁻¹ is bounded. "(ii)⇒(iii)": a closer look
at Phelps's proof of [121, Theorem 4.8.(ii)] reveals that already cl ran T = X* and T⁻¹
bounded imply local maximal monotonicity of T. Since cl ran T ⊇ ran T1 = ran T̄ (T is of
range-dense type) and since T⁻¹ is bounded (because (T̄)⁻¹ is), it suffices to establish the
following
Key step: cl ran T̄ = X*.
Fix x* ∈ X* and (u, u*) ∈ gra T and let (λ_n) be any sequence of strictly positive reals
tending to 0. Simons's [146, Theorem 12.(a)] yields the existence of sequences (x_n**) in X**
and of (y_n*), (x_n*) in X* with: x* = y_n* + λ_n x_n*, y_n* ∈ T̄x_n**, ||x_n*||² = ||x_n**||² = ⟨x_n**, x_n*⟩, ∀n.
Also, by definition of T̄, ⟨y_n* − u*, x_n** − u⟩ ≥ 0, ∀n. Altogether,
Claim: (λ_n x_n*) is bounded.
Otherwise, WLOG λ_n ||x_n*|| → +∞ (rename an appropriate subsequence). Hence ||x_n**|| →
+∞ and dividing the displayed chain of inequalities by ||x_n**|| yields +∞ ← ||y_n*||/||x_n**||. On
the other hand, ||y_n*||/||x_n**|| = ||x* − λ_n x_n*||/||x_n*|| → 0, an impossibility. The claim thus
holds.
It follows that (λ_n x_n*) is bounded and so is (y_n*). Since (T̄)⁻¹ is bounded, we conclude that
the sequence (x_n**) is actually bounded. Consequently, λ_n x_n* → 0, hence y_n* → x* and so
x* ∈ cl ran T̄, as desired. ■
as desired. . Let us now comment on the assumptions of Corollary 12.2.14 and of Theorem 12.4.1. It
is easy to see that the following implications hold: domT is bounded + T is coercive + T-' is bounded. Neither implication can be reversed in general: indeed, let X := R2 and
consider first the monotone operator T(xl, x2) := (-22, xl). Then T-' is bounded but T is
not coercive. Also, the identity is coercive but domT is unbounded. The former example
can be used to show that Theorem 12.4.1 is a genuine sharpening of Corollary 12.2.14.
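The claims about the rotation T(x_1, x_2) = (−x_2, x_1) are quick to verify numerically (an illustrative sketch of mine; T is represented by its matrix on ℝ²):

```python
import numpy as np

T = np.array([[0.0, -1.0],
              [1.0,  0.0]])  # (x1, x2) -> (-x2, x1): rotation by 90 degrees

x = np.array([3.0, -2.0])
# <Tx, x> = 0 for every x, so T is monotone (it is skew) but not
# coercive: the quotient <Tx, x>/||x|| is identically 0.
assert np.isclose((T @ x) @ x, 0.0)

# T is an isometry, hence so is T^{-1}; in particular T^{-1} maps
# bounded sets to bounded sets.
Tinv = np.linalg.inv(T)
assert np.isclose(np.linalg.norm(Tinv @ x), np.linalg.norm(x))
```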
Remark 12.4.2 The operator (x_1, x_2) ↦ (−x_2, x_1) on ℝ² shows that condition (ii) of The-
orem 12.4.1 is really more general than condition (i). However, since ℝ² is reflexive, we
know already by Fact 12.2.5 that T is locally maximal monotone. A much more sophisticated
example (in not necessarily reflexive spaces) can be obtained as follows:
Suppose J is the duality map on X and S is a continuous linear skew operator
on a Banach space Y. Suppose further S* is skew, S⁻¹ is bounded, but S is not
coercive. Let T be the map from X × Y to X* × Y* defined by T(x, y) := (Jx, Sy),
∀(x, y) ∈ X × Y. Then: T is maximal monotone of dense type, (T̄)⁻¹ is bounded,
but T is not coercive.
We omit the verification (which relies on results presented later) since this construction
is not crucial. More concretely, we could let X := ℓ¹, Y := ℝ², and S be as above to
obtain an example on ℓ¹ × ℝ². Since the space ℓ¹ × ℝ² is isometrically isomorphic to ℓ¹, the
construction results altogether in an example on the nonreflexive space ℓ¹.
Remark 12.4.3 If T is a linear monotone operator from X to X* with dom T = X, then T
is necessarily continuous (see, for instance, [168, Proposition 26.4.(b)]). Thus starting with
Proposition 12.3.4, I could have written throughout "T is a linear monotone operator from
X to X* with dom T = X" instead of "T is a continuous linear monotone operator from X
to X*"; however, I prefer the latter formulation.
Some results of Section 12.3 were also obtained independently by Phelps for linear (not nec-
essarily continuous) operators; more precisely, Phelps's [122] contains more general versions
of Proposition 12.3.5 and Proposition 12.3.6.
Remark 12.4.4 Borrowing the notation of Propositions 12.3.6 and 12.3.10 and recalling
that ran P ⊆ ran P*, we thus know precisely what q* does except at points in (cl ran P) \
ran P*. Therefore, if ran P is closed, then q* is completely determined. This happens if X
is finite-dimensional, in which case we recover a well-known formula for q* (see, for instance,
[89, Example X.1.1.4]).
Remark 12.4.5 Suppose T is a maximal monotone operator from X to X*. It is not known whether or not T₁ and T₀ can actually differ. The situation is not clear even when T is a continuous linear skew operator. In view of Proposition 12.3.12, a good candidate one might try would be the operator F from Example 14.2.4.
Finally, we saw and will see in later chapters that the decomposition (Proposition 12.3.5) of a continuous linear monotone operator into a symmetric and a skew part is immensely useful. It would be excellent to have the notions of a symmetric and a skew part for general maximal monotone operators as well. I am aware only of work by Asplund [4, Section 6], which is based on Zorn's lemma and is probably too nonconstructive to be practical.
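In finite dimensions, the decomposition of Proposition 12.3.5 is simply the splitting of a matrix into its symmetric and antisymmetric parts; a small sketch (matrix representation and numerical tolerances are our illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.normal(size=(4, 4))      # a continuous linear operator on R^4

P = (T + T.T) / 2                # symmetric part
S = (T - T.T) / 2                # skew part

assert np.allclose(T, P + S)
assert np.allclose(P, P.T)       # P is symmetric
assert np.allclose(S, -S.T)      # S is skew

# The skew part never contributes to the quadratic form:
# <Tx, x> = <Px, x>, so T is monotone iff P is positive semidefinite.
for x in rng.normal(size=(50, 4)):
    assert abs(T @ x @ x - P @ x @ x) < 1e-10
```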
Chapter 13
Characterizations
13.1 Overview
This chapter contains the main results of Part II. We show how Fenchel's Duality Theorem and our groundwork in the previous chapter lead to an elegant characterization of the various monotonicities in terms of the conjugate operator. We also give a partial affirmative answer to a question posed by Gossez more than two decades ago.
13.2 The main results
Proposition 13.2.1 Suppose T is a continuous linear operator from X to X* with symmetric part P and skew part S, and (x**, x*) belongs to X** × X*. Let q(x) := ½⟨x, Tx⟩, ∀x ∈ X. Then

and hence:

(i) T is of type (NI) with respect to (x**, x*) ⇔ q*(½x* + ½T*x**) ≥ ½⟨x**, x*⟩.

(ii) (x**, x*) ∈ gra T₀ ⇔ q*(½x* + ½T*x**) = ½⟨x**, x*⟩.

(iii) (x**, x*) ∈ gra T̃ ⇔ q*(½x* + ½T*x**) ≤ ½⟨x**, x*⟩.

If x* = (T*|X)*x**, then:

(iv) T is of type (NI) with respect to (x**, x*) ⇔ ⟨S*x**, x**⟩ ≥ 0.
(v) (x**, x*) ∈ gra T₀ ⇔ ⟨S*x**, x**⟩ = 0.

(vi) (x**, x*) ∈ gra T̃ ⇔ ⟨S*x**, x**⟩ ≤ 0.

Consequently:

(vii) S* is monotone ⇔ T is of type (NI) with respect to gra (T*|X)*.

(viii) S* is skew ⇔ T** = (T*|X)* ⇔ gra (T*|X)* ⊆ gra T₀.

(ix) −S* is monotone ⇔ gra (T*|X)* ⊆ gra T̃.
Proof. The displayed formula and (i), (ii), and (iii) are easy to check. Now suppose (x**, x*) ∈ gra (T*|X)*, i.e., x* = P*x** − S*x**. Then, using Proposition 12.3.10,

½⟨x**, x*⟩ − q*(½x* + ½T*x**) = ½⟨x**, P*x**⟩ − ½⟨x**, S*x**⟩ − q*(½P*x** − ½S*x** + ½P*x** + ½S*x**)
= ½⟨x**, P*x**⟩ − ½⟨x**, S*x**⟩ − q*(P*x**)
= −½⟨x**, S*x**⟩,

which yields (iv), (v), and (vi). The "Consequently" part follows from (iv), (v), (vi), and the fact that dom (T*|X)* = X**. ∎
Proposition 13.2.2 Suppose T is a continuous linear monotone operator from X to X* with symmetric part P and skew part S. If S* or −S* is monotone, then gra T** ∩ gra (T*|X)* = gra T₁. Consequently: S* is skew if and only if T₁ = T** = (T*|X)*.
Proof. "⊆" is clear from Theorem 12.3.1.(ii). "⊇": Fix (x**, x*) ∈ gra T₁, i.e., x* = T**x** = P**x** + S**x** ∈ X* (Theorem 12.3.1.(ii) and Theorem 12.3.8). Using Proposition 12.2.3, we thus have (x**, x*) ∈ gra T₀ ∩ gra T̃. Now Proposition 13.2.1.(ii), the fact that ⟨S**x**, x⟩ = −⟨S*x**, x⟩, ∀x ∈ X, and Proposition 12.3.10 give

½⟨x**, x*⟩ = q*(½x* + ½T*x**) = sup_{x∈X} { ⟨P*x**, x⟩ − ½⟨x, Px⟩ } = q*(P*x**) = ½⟨x**, P*x**⟩.

Hence ⟨S*x**, x**⟩ = 0 and so (by Proposition 12.3.12) S*x** = −S**x**. Thus (x**, x*) = (x**, P*x** + S**x**) = (x**, P*x** − S*x**) ∈ gra (T*|X)*. The "Consequently" part follows. ∎

We are now ready for the main result.
Theorem 13.2.3 Suppose T is a continuous linear operator from X to X* with symmetric
part P and skew part S. Then TFAE:
(i) T is monotone and of dense type.
(ii) T is monotone and of range-dense type.
(iii) T is monotone and of type (NI) .
(iv) T is monotone and of type (NI) with respect to gra (−T*).
(v) T is locally maximal monotone.
(vi) T* is monotone.
(vii) P and S* are monotone.
(viii) P is monotone and S is of dense type.
(ix) P is monotone and S is of range-dense type.
(x) P is monotone and S is of type (NI).
(xi) P is monotone and S is of type (NI) with respect to gra (−S*).
(xii) P is monotone and S is locally maximal monotone.
(xiii) gra T₁ = gra T₀ = gra T̃ = gra T** ∩ gra (T*|X)*.
Proof. Throughout this proof, let q(x) := ½⟨Tx, x⟩ = ½⟨Px, x⟩, ∀x ∈ X.

"(i)⇒(ii)⇒(iii)⇒(iv)⇒(vi)": follow from Fact 12.2.9 and Proposition 12.3.4.
"(vi)⇒(vii)": T and P are monotone, because T* is. Fix an arbitrary x** ∈ X**. By Corollary 12.3.9, obtain a bounded net (xα) in X such that xα →w* x** and Pxα → P*x**. Now

0 ≤ ⟨T*(x** − xα), x** − xα⟩ = ⟨T*x**, x** − xα⟩ − ⟨T*xα, x** − xα⟩
= ⟨T*x**, x** − xα⟩ − ⟨Pxα − Sxα, x** − xα⟩ = ⟨T*x** − Pxα, x** − xα⟩ + ⟨S*x**, xα⟩ → ⟨x**, S*x**⟩;

consequently, S* is monotone and (vii) holds.
"(vii)⇒(i)": Fix an arbitrary (x**, x*) ∈ gra T̃. By Corollary 12.3.9, obtain a bounded net (xα) in X such that xα →w* x** and Pxα → P*x**. Now Proposition 13.2.1.(iii) and the monotonicity of S* yield

hence ⟨S*x**, x**⟩ = 0 and q*(½x* + ½T*x**) = ½⟨x**, x*⟩. This has two important consequences: firstly, by Proposition 12.3.12,

Secondly, using Proposition 12.3.10, ⟨½x* + ½T*x**, x**⟩ = ½⟨x**, x*⟩ + ½⟨x**, P*x**⟩ = q*(½x* + ½T*x**) + q**(x**); thus x** ∈ ∂q*(½x* + ½T*x**) ⇔ ½x* + ½T*x** ∈ ∂q**(x**) = {P*x**}.

Altogether, x* = P*x** − S*x** = P**x** + S**x** = T**x** ∈ X*, so that (Theorem 12.3.1) (x**, x*) ∈ gra T₁, as desired.
"(v)⇒(vi)": T is maximal monotone (use V := X* in Definition 12.2.4.(iv)) and so is P: the function q is convex (Proposition 12.3.6). Fix x₀** ∈ X**. We aim for ⟨T*x₀**, x₀**⟩ ≥ 0 and can thus assume WLOG that x₀* := T*x₀** ≠ 0. Select x₁ ∈ X with ⟨x₀*, x₁⟩ < 0 and let x₁* := Tx₁. Let x₀ := 0, fix an arbitrary ε > 0, and define

Then Cε is weak* closed, convex, and bounded, with x₁* ∈ ran T ∩ int Cε. Also, x₀* ∈ (int Cε) \ Tx₀. Local maximal monotonicity (via Proposition 12.2.7.(iv)), Proposition 2.2.12, and Fact 2.2.23 yield

0 > inf_{x∈X: Tx∈Cε} ½⟨Tx − x₀*, x − x₀⟩ = inf_{x∈X} { q(x) + ⟨−½x₀*, x⟩ + ι_Cε(Tx) }
≥ − inf_{x**∈X**} { q*(−T*x** + ½x₀*) + ι*_Cε(x**) }.

Now pick x** := ½x₀**; then, by Example 2.2.15 and the fact that q*(0) = 0,

Multiply by 2 and then let ε tend to 0 to obtain 0 ≤ max{⟨x₀**, T*x₀**⟩, ⟨x₀**, Tx₁⟩}. Since ⟨x₀**, Tx₁⟩ = ⟨T*x₀**, x₁⟩ = ⟨x₀*, x₁⟩ < 0, we conclude ⟨x₀**, T*x₀**⟩ ≥ 0.
"(vii)⇒(v)": In view of Proposition 12.2.7.(iv), let us fix a weak* closed convex bounded subset C of X* with ran T ∩ int C ≠ ∅, x₀ ∈ X, and x₀* ∈ (int C) \ Tx₀. Let

ρ := inf_{x∈X: Tx∈C} ½⟨Tx − x₀*, x − x₀⟩.

Clearly, ρ < +∞ and our aim is ρ < 0. We thus can assume WLOG that ρ > −∞; hence ρ is finite. Let f(x) := q(x) + ½⟨−x₀* − T*x₀, x⟩ + ½⟨x₀*, x₀⟩, ∀x ∈ X, and let g := ι_C. Then, using Fact 2.2.23 and Proposition 2.2.12,

ρ = inf_{x∈X} { f(x) + g(Tx) } = − inf_{x**∈X**} { f*(−T*x**) + g*(x**) }
= ½⟨x₀*, x₀⟩ − inf_{x**∈X**} { q*(−T*x** + ½x₀* + ½T*x₀) + ι*_C(x**) }.

Moreover, the last infimum is attained (by Fact 2.2.23 and ran T ∩ int C ≠ ∅), say at some x₀** ∈ X**. Thus the proof of "(vii)⇒(v)" would be complete after reaching the following

(Aim) ½⟨x₀*, x₀⟩ < q*(−T*x₀** + ½x₀* + ½T*x₀) + ι*_C(x₀**).

By assumption, 0 ≤ ⟨S*(x₀ − 2x₀**), x₀ − 2x₀**⟩, which is equivalent to

⟨−T*x₀** + ½x₀* + ½T*x₀, x₀ − 2x₀**⟩ − ½⟨x₀ − 2x₀**, P*(x₀ − 2x₀**)⟩ ≥ ½⟨x₀*, x₀⟩ − ⟨x₀**, x₀*⟩.

On the other hand, Corollary 12.3.9 gives a bounded net (xα) in X such that xα →w* x₀ − 2x₀** and Pxα → P*(x₀ − 2x₀**); thus altogether

Consequently, since x₀* is in the interior of C,

which is what we aimed for!
We just proved that (i)–(vii) are equivalent for arbitrary continuous linear operators. So let us apply this to T := S. The symmetric (resp. skew) part of T is 0 (resp. S). Hence we obtain the equivalences: S is of dense type ⇔ S is of range-dense type ⇔ S is of type (NI) ⇔ S is of type (NI) with respect to gra (−S*) ⇔ S is locally maximal monotone ⇔ S* is monotone.

Therefore, (i)–(xii) are equivalent. It remains to include (xiii). "(i),(vii)⇒(xiii)": use Proposition 12.2.3 and Proposition 13.2.2. "(xiii)⇒(i)": T is monotone (since gra T ⊆ gra T₁ = gra T̃; see Remarks 12.2.2.(i)) and of dense type. ∎
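In a finite-dimensional (hence reflexive) space, the conditions of Theorem 13.2.3 all collapse: the conjugate is the transpose, and everything reduces to positive semidefiniteness of P. A sketch of the equivalence (vi) ⇔ (vii) on a random matrix (our illustrative test, not part of the proof):

```python
import numpy as np

def is_monotone(A, rng, trials=200):
    # Finite-dimensional proxy: <Ax, x> >= 0 on random test vectors.
    return all(A @ x @ x >= -1e-10 for x in rng.normal(size=(trials, A.shape[0])))

rng = np.random.default_rng(2)
B = rng.normal(size=(4, 4))
T = B @ B.T + (B - B.T)          # monotone: PSD symmetric part plus a skew part
P, S = (T + T.T) / 2, (T - T.T) / 2

# (vi): T* (here T^T) is monotone;  (vii): P and S* = S^T are monotone.
assert is_monotone(T.T, rng)
assert is_monotone(P, rng) and is_monotone(S.T, rng)
```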
Observation 13.2.4 Gossez [75, End of Section 2] found the following question interesting:

Suppose that T is a closed densely defined linear monotone operator from X to X* and that T* is monotone. Is T₁ maximal monotone?

He then proved that the answer is "yes" if T is continuous and skew. We are now able to give an affirmative answer to this question provided that T is merely continuous: indeed, this follows from Theorem 13.2.3 "(vi)⇒(i)" and Proposition 12.2.7.(i).
Theorem 13.2.5 Suppose T is a continuous linear operator from X to X* with symmetric part P and skew part S. Then TFAE:

(i) T and T*|X are monotone and of dense type.

(ii) T and T*|X are monotone and of type (NI).

(iii) T and T*|X are locally maximal monotone.

(iv) T* and (T*|X)* are monotone.

(v) T is monotone and weakly compact.

(vi) P is monotone and S is weakly compact.

(vii) P is monotone and S* is skew.

(viii) P is monotone and S, −S are of dense type.

(ix) P is monotone and S, −S are of type (NI).

(x) P is monotone and S, −S are locally maximal monotone.
Proof. Applying Theorem 13.2.3 to T = P + S and T*|X = P − S yields the equivalence of (i), (ii), (iii), (iv), (vii), (viii), (ix), and (x). Now (v) ⇔ (vi) (by weak compactness of P; see Theorem 12.3.8) ⇔ (vii) (by Proposition 12.3.11.(v)); so (i)–(x) are all equivalent. Finally, (vii) ⇒ P, S* are monotone and S* is skew ⇔ gra T₁ = gra T₀ = gra T̃ = gra T** ∩ gra (T*|X)* and T** = (T*|X)* (Theorem 13.2.3 and Proposition 13.2.1.(viii)) ⇒ (xi). ∎

Borrowing notation from the two theorems above, we see that monotonicity of S* can be interpreted as "one half of weak compactness" of T.
Remark 13.2.6 Given a continuous linear monotone operator T from X to X* with skew part S, the following three (mutually exclusive) alternatives are conceivable:

• T is "good": both S* and −S* are monotone.

• T is "so-so": either S* or −S* is monotone, but not both.

• T is "bad": neither S* nor −S* is monotone.

A priori, it is not clear that so-so or bad operators exist. However, this is indeed the case, and we will systematically find examples of so-so and bad operators in the next chapter.
13.3 Notes
Gossez's question from Observation 13.2.4 remains open for monotone linear operators that
are not continuous.
Proposition 13.3.1 Let 𝒳 be the Banach space of all continuous linear operators from X to X* (equipped with the usual operator norm). Let further M := {T ∈ 𝒳 : T is monotone}, M* := {T ∈ 𝒳 : T* is monotone}, and M₀ := {T ∈ 𝒳 : T is monotone and weakly compact}. Then M, M*, and M₀ are closed convex cones and M ⊇ M* ⊇ M₀. Moreover, lin M = {T ∈ 𝒳 : T is skew} and lin M* = lin M₀ = {T ∈ 𝒳 : T* is skew}.
Proof. Clearly, M, M*, and M₀ are convex cones; the announced inclusions follow from Theorem 13.2.3 and Theorem 13.2.5. To verify closedness of the cones, suppose Tα → T in 𝒳. Recall that Tα* → T* and Tα** → T** (because ‖Tα − T‖ = ‖Tα* − T*‖ = ‖Tα** − T**‖). (i): Suppose the net (Tα) is in M. Then ⟨Tαx, x⟩ ≥ 0, ∀α, ∀x ∈ X; hence ⟨Tx, x⟩ ≥ 0. It follows that M is closed. (ii): Suppose the net (Tα) is in M*. Then ⟨Tα*x**, x**⟩ ≥ 0, ∀α, ∀x** ∈ X**; hence ⟨T*x**, x**⟩ ≥ 0. It follows that M* is closed. (iii): Suppose the net (Tα) is in M₀. Then Tα**x** ∈ X*, ∀α, ∀x** ∈ X**. Since X* is closed in X***, we obtain T**x** ∈ X*, and we saw in (i) already that ⟨Tx, x⟩ ≥ 0, ∀x ∈ X. Hence M₀ is closed. Now T ∈ lin M ⇔ T and −T are monotone ⇔ T is skew. Finally, T ∈ lin M* ⇔ T* and −T* are monotone ⇔ P, −P, S*, −S* are monotone (Theorem 13.2.3) ⇔ P, −P are monotone and S* is skew ⇔ T, −T are monotone and weakly compact (Theorem 13.2.5) ⇔ T ∈ lin M₀. ∎
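The lineality statement lin M = {T : T is skew} is easy to visualize in finite dimensions: T and −T are both monotone exactly when the quadratic form of T vanishes identically, i.e. when the symmetric part is zero. A sketch (our illustrative finite-dimensional proxy):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(3, 3))
S = (A - A.T) / 2                         # skew, so S and -S are both monotone
xs = rng.normal(size=(100, 3))
assert all(abs(S @ x @ x) < 1e-10 for x in xs)

# Adding a nonzero symmetric part destroys membership in lin M:
T = S + np.diag([1.0, 0.0, 0.0])
vals = [T @ x @ x for x in xs]
assert min(vals) >= -1e-10                # T is still monotone ...
assert max(vals) > 0                      # ... but -T is not
```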
Borrowing notation from Proposition 13.3.1, we see with Theorem 13.2.3 that the convex cone M* is equal to the set of all continuous linear monotone operators that are of dense type (equivalently: of range-dense type, of type (NI), or locally maximal monotone). In particular, the latter set is closed under addition, nonnegative scalar multiplication, and taking limits in 𝒳.
Chapter 14
Examples
14.1 Overview
Given a continuous linear skew operator S from X to X*, we are presented with three mutually exclusive alternatives (see Remark 13.2.6): S is "good", i.e., S* is skew; S is "so-so", i.e., either S* or −S* is monotone but not both; S is "bad", i.e., neither S* nor −S* is monotone. In Section 14.2, we will systematically construct "so-so" and "bad" operators and easily recover examples by Gossez and by Fitzpatrick and Phelps in ℓ1 and L1[0,1]. Banach spaces that allow only "good" operators are studied in Section 14.3; these spaces will be referred to as "(cms) spaces". Using Banach Space Theory, we see that many classical Banach spaces like c0, c, ℓ∞, L∞[0,1], and C[0,1], as well as all reflexive Banach spaces, are (cms).
14.2 Generating "weird" examples systematically
Theorem 14.2.1 Suppose T is a continuous linear operator from X to X* and there exists some e ∈ X* such that

e ∉ cl ran T and ⟨Tx, x⟩ = ⟨e, x⟩², ∀x ∈ X.

Then T is monotone, its symmetric part P is given by Px := ⟨e, x⟩e, ∀x ∈ X, its skew part is S := T − P, and P*x** = ⟨x**, e⟩e, ∀x** ∈ X**. Moreover: S* is not monotone; S is neither of type (NI) nor locally maximal monotone; S and −S are not weakly compact.

If

(1) ran T* = ran T*|X (equivalently, (ran T)* ⊆ X),

then −S* is monotone (in fact, ⟨−S*x**, x**⟩ = ⟨x** − x̂, e⟩², ∀x** ∈ X**, where x̂ ∈ X satisfies T*x̂ = T*x**); S is unique; −S is of dense type and locally maximal monotone;

gra S₁ = gra S** ∩ gra (−S*) ⊊ gra S₀ = gra (−S*);
gra (−S)₁ = gra (−S**) ∩ gra S* = gra (−S)₀ = gra (−S)̃ ⊊ gra S*.

If ker T ⊊ ker T**, or, even more restrictively,

(2) T is one-to-one and ran T* is not norm dense in X*,

then S and −S are each not tauberian; gra S ⊊ gra S₁; and gra (−S) ⊊ gra (−S)₁.
Proof. T is obviously monotone. Let Px := ⟨e, x⟩e, ∀x ∈ X; then ⟨P*x**, x⟩ = ⟨x**, Px⟩ = ⟨x**, e⟩⟨e, x⟩ and hence P*x** = ⟨x**, e⟩e, ∀x** ∈ X**. So P is symmetric. Consider now S := T − P. Then ⟨Sx, x⟩ = ⟨Tx, x⟩ − ⟨Px, x⟩ = ⟨Tx, x⟩ − ⟨e, x⟩² = 0, ∀x ∈ X; thus S is skew. Since T = P + S, the symmetric (resp. skew) part of T is P (resp. S) by Proposition 12.3.5. Because e ∉ cl ran T = ⊥(ker T*) ([161, Lemma 11-1-7.(c)]), there exists some x₀** ∈ ker T* with ⟨x₀**, e⟩ ≠ 0. Hence

⟨S*x₀**, x₀**⟩ = ⟨T*x₀**, x₀**⟩ − ⟨P*x₀**, x₀**⟩ = 0 − ⟨x₀**, e⟩² < 0;

so S* is not monotone. Thus S is neither of type (NI) nor locally maximal monotone (Theorem 13.2.3), and S, −S are not weakly compact (Proposition 12.3.11.(v)).

Since ran T ⊆ X*, the Hahn–Banach Theorem allows us to identify (ran T)* with {x**|ran T : x** ∈ X**}. Hence, as announced:

(ran T)* ⊆ X ⇔ ∀z* ∈ (ran T)* ∃x̂ ∈ X : z* = x̂|ran T
⇔ ∀x** ∈ X** ∃x̂ ∈ X : x**|ran T = x̂|ran T
⇔ ∀x** ∈ X** ∃x̂ ∈ X ∀x ∈ X : ⟨x** − x̂, Tx⟩ = 0
⇔ ∀x** ∈ X** ∃x̂ ∈ X : T*x** = T*x̂
⇔ ran T* = ran T*|X.

Suppose now (1) holds and fix an arbitrary x** ∈ X**. Pick x̂ ∈ X ⊆ X** with x**|ran T = x̂|ran T; equivalently, T*x** = T*x̂. Then we have ⟨S*x**, x⟩ = ⟨T*x**, x⟩ − ⟨P*x**, x⟩ = ⟨T*x̂, x⟩ − ⟨P*x**, x⟩, ∀x ∈ X; hence
Because T*|X = P − S = 2P − T, we further obtain

⟨S*x**, x**⟩ = ⟨T*x̂, x**⟩ − ⟨x**, e⟩² = 2⟨Px̂, x**⟩ − ⟨Tx̂, x**⟩ − ⟨x**, e⟩²
= 2⟨Px̂, x**⟩ − ⟨x̂, T*x**⟩ − ⟨x**, e⟩² = 2⟨Px̂, x**⟩ − ⟨x̂, Tx̂⟩ − ⟨x**, e⟩²
= 2⟨e, x̂⟩⟨x**, e⟩ − ⟨e, x̂⟩² − ⟨x**, e⟩² = −⟨e, x** − x̂⟩² ≤ 0;

consequently, −S* is monotone. Then Proposition 12.3.11 and Proposition 12.3.12 imply: S is unique; gra S₁ = gra S** ∩ gra (−S*); and gra (−S)₁ = gra (−S**) ∩ gra S*. Now (−S)₁ = (−S)₀ = (−S)̃ and gra S₁ ⊊ gra S̃ (Theorem 13.2.3 on −S and S); also, gra (−S**) ∩ gra S* ⊊ gra S* (otherwise, S* is skew, which is absurd). So all announced consequences of (1) are verified. Finally, (2) ⇔ ker T = {0} and cl ran T* ⊊ X* ⇒ ker T = {0} and (cl ran T*)⊥ = ker T** ⊋ {0} ([161, Lemma 11-1-7.(b)]) ⇒ ker T ⊊ ker T** ⇒ T and −T are not tauberian ([161, Theorem 11-4-5]) ⇔ S and −S are not tauberian (since P is weakly compact) ⇒ gra S ⊊ gra S₁ and gra (−S) ⊊ gra (−S)₁ (by Theorem 12.3.1 and Corollary 12.3.2). ∎
Theorem 14.2.1 allows a painless derivation of two important examples.
Example 14.2.2 (Gossez) Define the map G from ℓ1 to ℓ∞ by

Then: G and −G are continuous linear skew operators from ℓ1 to ℓ1* = ℓ∞. G* is not monotone, but −G* is. G is neither of type (NI) nor locally maximal monotone. −G is of dense type and locally maximal monotone. Both G and −G are unique, but neither is weakly compact nor tauberian.

Proof. (See also [75, Example in Section 2].) Consider the map T from ℓ1 to ℓ∞ given by

Then T is linear, continuous (in fact, ‖T‖ = 2), and ran T ⊆ c0 ⊆ ℓ∞. Let e := (1, 1, 1, …) ∈ ℓ∞ = ℓ1*. Then e ∉ cl ran T ⊆ cl c0 = c0, and for every x ∈ ℓ1,
By Theorem 14.2.1, the symmetric part P of T is given by Px = ⟨e, x⟩e, ∀x ∈ ℓ1, and the skew part of T is S := T − P. Now for all x ∈ ℓ1 and n ∈ ℕ,

hence S = G. It follows that G* is not monotone; G is neither of type (NI) nor locally maximal monotone; and G and −G are not weakly compact (Theorem 14.2.1). The Hahn–Banach Theorem yields (ran T)* ⊆ c0* = ℓ1, so (1) in Theorem 14.2.1 holds and hence: −G* is monotone; −G is of dense type, locally maximal monotone, and unique; G is at least unique. It is easy to check that T is one-to-one. Note also that, by (1),

cl ran T* = cl ran T*|X = cl ran (2P − T) ⊆ cl (ran T + ℝe) ⊆ cl (c0 + ℝe) = c ⊊ ℓ∞ = ℓ1*.

Hence (2) in Theorem 14.2.1 holds and G, −G are not tauberian. ∎
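The mechanics of Theorem 14.2.1 can be replayed on finite truncations. The displayed formula for T is lost in this copy, so the concrete matrix below, (Tx)ᵢ = xᵢ + 2 Σ_{k>i} x_k, is our assumption for illustration; it does satisfy ⟨Tx, x⟩ = ⟨e, x⟩², and its skew part is exactly the truncated Gossez operator:

```python
import numpy as np

n = 8
# Assumed truncation with <Tx, x> = <e, x>^2, where e = (1, ..., 1):
# (Tx)_i = x_i + 2 * sum_{k > i} x_k.  (This concrete formula is an
# illustrative assumption; the thesis's display for T is garbled here.)
T = np.eye(n) + 2 * np.triu(np.ones((n, n)), k=1)
e = np.ones(n)

rng = np.random.default_rng(4)
for x in rng.normal(size=(100, n)):
    assert abs(T @ x @ x - (e @ x) ** 2) < 1e-9

P = np.outer(e, e)               # rank-one symmetric part: Px = <e, x> e
S = T - P                        # skew part
assert np.allclose(S, -S.T)
# S agrees with the Gossez-type matrix (Sx)_i = sum_{k>i} x_k - sum_{k<i} x_k:
G = np.triu(np.ones((n, n)), 1) - np.tril(np.ones((n, n)), -1)
assert np.allclose(S, G)
```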
Remark 14.2.3 Let G denote the Gossez operator from Example 14.2.2. Gossez [75] proved that G is unique but not of dense type. Phelps [121, Example 4.5] showed that G is not locally maximal monotone. We observe that our discussion of the Gossez operator via Theorem 14.2.1 is conceptually much simpler, does not require the Stone-Čech compactification, and gives a bit more insight.
The next example is a "continuous" version of the (negative) Gossez operator; see Example 14.2.2.
Example 14.2.4 (Fitzpatrick and Phelps) Define the map F from L1[0,1] to L∞[0,1] by

(Fx)(t) := ∫_t^1 x(s) ds − ∫_0^t x(s) ds, ∀x ∈ L1[0,1], t ∈ [0,1].

Then F and −F are continuous linear skew operators from L1[0,1] to L1[0,1]* = L∞[0,1] that are not of type (NI), not locally maximal monotone, and not weakly compact. Neither F* nor −F* is monotone.
Proof. (See also [67, Example 3.2].)

Step 1: Define the map T from L1[0,1] to L∞[0,1] by

Then T is linear and continuous (with ‖T‖ = 2). The range of T is contained in the subspace C0,0 of L∞[0,1] that consists of all equivalence classes that contain a continuous function
vanishing at 1. Let e denote the equivalence class in L∞[0,1] that contains the constant function 1. Then the distance from e to any member of C0,0 is at least 1; thus certainly e ∉ cl ran T. Also, for every x ∈ L1[0,1],
Then (Theorem 14.2.1) the symmetric part P of T is given by Px := ⟨e, x⟩e, ∀x ∈ L1[0,1]. The skew part S of T is given by

(Sx)(t) = (Tx)(t) − (Px)(t) = 2 ∫_t^1 x(s) ds − ⟨e, x⟩e(t) = 2 ∫_t^1 x(s) ds − ∫_0^1 x(s) ds
= ∫_t^1 x(s) ds − ∫_0^t x(s) ds,

for every x ∈ L1[0,1] and t ∈ [0,1]; consequently, S = F. Now Theorem 14.2.1 implies that F* is not monotone and that F is neither of type (NI) nor locally maximal monotone.

Step 2: This time, we define the map T by (Tx)(t) := 2 ∫_0^t x(s) ds, ∀x ∈ L1[0,1], t ∈ [0,1]. We let e be as in Step 1 and check analogously: T is continuous and linear, e ∉ cl ran T, and ⟨Tx, x⟩ = ⟨e, x⟩², ∀x ∈ L1[0,1]. This time, however, the skew part of T is equal to −F! We deduce as in Step 1 that −F* is not monotone and that −F is neither of type (NI) nor locally maximal monotone. ∎

14.3 Conjugate monotone spaces
In Remark 13.2.6, we discussed three possible distinct properties for a given continuous linear monotone operator: "good", "so-so", and "bad". We are now interested in finding Banach spaces that allow only "good" operators.

Definition 14.3.1 We say that X is a conjugate monotone space (cms) if the conjugate of every continuous linear monotone operator from X to X* is monotone.
Proposition 14.3.2 TFAE:

(i) X is (cms).

(ii) Every continuous linear skew operator from X to X* has a skew conjugate.

(iii) Every continuous linear skew operator from X to X* is weakly compact.
(iv) Every continuous linear monotone operator from X to X* is weakly compact.

Proof. "(i)⇒(iv)": Fix a continuous linear monotone operator T from X to X*. Then T*|X is continuous, linear, and monotone as well. Since X is (cms), both T* and (T*|X)* are monotone; equivalently, by Theorem 13.2.5, T is weakly compact. "(iv)⇒(iii)": is trivial. "(iii)⇒(ii)": use Proposition 12.3.11.(v). "(ii)⇒(i)": follows from Theorem 13.2.3. ∎

Remarks 14.3.3
(i) Every reflexive Banach space is (cms); this follows from Proposition 14.3.2.

(ii) Neither ℓ1 nor L1[0,1] is (cms) (Example 14.2.2, Example 14.2.4).

(iii) The Banach space ℓ∞* is not (cms) either: indeed, let T := −G*, where G is the Gossez operator from Example 14.2.2. Then T is monotone but not weakly compact.
Proposition 14.3.4 Suppose Y is a quotient of X. If X is (cms), then so is Y.

Proof. Let Q be a continuous linear surjection from X onto Y. Fix an arbitrary continuous linear monotone operator T from Y to Y*. Then Q*TQ is not only continuous and linear but also monotone (an easy check). Hence Q*TQ is weakly compact (by assumption and Proposition 14.3.2), as is T (by Proposition 2.3.7.(iii)). ∎

Remark 14.3.5 (Simon Fitzpatrick [65])
(i) Suppose X is isomorphic to a Banach space Y. Then Y is a quotient of X and vice versa. Therefore, X is (cms) if and only if Y is.

(ii) Consequently, L∞[0,1]* is not (cms): indeed, ℓ∞ and L∞[0,1] are isomorphic; see [91, Theorem 24.A]. Hence ℓ∞* and L∞[0,1]* are isomorphic as well. Since ℓ∞* is not (cms) (by Remarks 14.3.3.(iii)), neither is L∞[0,1]* (by (i)). Note that we cannot use the operator F from Example 14.2.4 to obtain a counterexample, since neither F* nor −F* is monotone.
If X contains a "complemented copy of ℓ1" (the Glossarex contains definitions and references), then we can transplant the Gossez operator (Example 14.2.2) into this space:

Proposition 14.3.6 Suppose X contains a complemented copy of ℓ1. Then X is not (cms).
Proof. The hypothesis means (consult the Glossarex if necessary) that there exist a complemented closed subspace Y of X, a continuous linear operator PY from X to X with ran PY = Y (and PY PY = PY), and a continuous linear one-to-one and onto operator I from Y to ℓ1. Denote the Gossez operator from ℓ1 to ℓ∞ of Example 14.2.2 by G. Consider S := (IPY)*G(IPY). Then S is a continuous linear skew operator from X to X*. Since G is not weakly compact, neither is S (Proposition 2.3.7.(iii)). Hence X is not (cms). ∎

Definition 14.3.7 We say that X is (c) (resp. X is (w)) if every continuous linear operator from X to X* is compact (resp. weakly compact).
Property (w) was defined by Saab and Saab [136]. Clearly, the following implications hold

and this explains our interest in the properties (c) and (w). It is obvious that finite-dimensional Banach spaces are (c) and that reflexive Banach spaces are (w).
In the remainder of this section, we will make use of some heavy Banach space theory.

Fact 14.3.8 Suppose X* is Schur. Then X is (c).

Proof. The space X* is Schur; thus every continuous linear operator from an arbitrary Banach space to X* is "completely continuous" or "Dunford-Pettis" (i.e., it sends weakly Cauchy sequences to norm Cauchy sequences). It then follows from Emmanuele's [60, Corollary 7] that X does not contain a copy of ℓ1. Now let T be an arbitrary continuous linear operator from X to X*. Fix an arbitrary sequence (bₙ) in B_X. By Fact 2.3.13, (bₙ) possesses a weakly Cauchy subsequence, say (b_kₙ). Then (Tb_kₙ) is weakly Cauchy, too. By Proposition 2.3.14, the sequence (Tb_kₙ) is norm convergent. It follows that T(B_X) is relatively sequentially compact; consequently, T is compact. ∎
Example 14.3.9 The space c0(Γ) is (c), for every nonempty set Γ.

Proof. In view of Fact 14.3.8, it is enough to show that c0(Γ)* is Schur. If Γ is finite, then c0(Γ) is finite-dimensional and thus c0(Γ)* is certainly Schur. Otherwise, use the well-known "Schurness" of ℓ1(Γ) = c0(Γ)* ([95, Proposition 27]). ∎
Fact 14.3.10 (Rosenthal/Pitt) Suppose 1 < p < +∞. Then Lp[0,1] and ℓp are (w). Moreover, ℓp is (c) if and only if 2 < p, while Lp[0,1] is never (c).

Proof. [131, Theorem A2]. ∎
Proposition 14.3.11 Suppose X does not contain a copy of ℓ1 and X* is weakly sequentially complete. Then X is (w).

Proof. (See also Saab and Saab's [137, Remark following Corollary 24].) As in the proof of Fact 14.3.8, let T be a continuous linear operator from X to X* and fix an arbitrary sequence (bₙ) in B_X. Obtain a weakly Cauchy subsequence (b_kₙ) of (bₙ) by Fact 2.3.13. Then (Tb_kₙ) is weakly Cauchy in X* and hence weakly convergent. It follows that T(B_X) is relatively weakly sequentially compact, and hence cl T(B_X) is weakly compact by Eberlein-Šmulian ([52, Theorem on page 18]). Thus T is weakly compact (Fact 2.3.4). ∎
The following proposition can be proved like Proposition 14.3.4.

Proposition 14.3.12 Suppose Y is a quotient of X. If X is (w), then so is Y.

Proposition 14.3.13 Suppose X is not (w) (resp. not (c)). Then X** is not (w) (resp. not (c)).

Proof. Pick a continuous linear operator T from X to X* that is not weakly compact (resp. not compact). By Fact 2.3.4 (resp. Fact 2.3.3), T** is not weakly compact (resp. not compact), either. Hence X** is not (w) (resp. not (c)). ∎

A complete characterization of (w) Banach lattices was provided by Saab and Saab.
Fact 14.3.14 (Saab and Saab) Suppose X is a Banach lattice. Then TFAE: (i) X is (w); (ii) X is (cms); (iii) X does not contain a complemented copy of ℓ1; (iv) X* is weakly sequentially complete.

Proof. "(i)⇔(iii)": is [136, Corollaire 11]. "(iii)⇔(iv)": (iii) ⇔ X* does not contain a copy of c0 (Fact 2.3.11) ⇔ (iv) (by [108, Theorem 5.1.14.(i)]). "(i)⇒(ii)": is trivial. "(ii)⇒(iii)": follows from Proposition 14.3.6. ∎

Examples 14.3.15 Every AM-space is (w). In particular, c0, c, ℓ∞, L∞[0,1], and C(Ω) (for any compact Hausdorff space Ω) are all (w).

Proof. The dual of an AM-space is an AL-space ([138, Proposition 11.9.1]), and every AL-space is weakly sequentially complete ([138, Corollary to Proposition 11.8.8]); so Fact 14.3.14 applies. For the examples, see [138, page 102f]. ∎
Remark 14.3.16 Note that, for instance, c0 does not contain a copy of ℓ1 ([155, page 10]), whereas ℓ∞ does (as ℓ∞ is universal for all separable Banach spaces; see [91, Section 25]). In view of Examples 14.3.15, "not containing a copy of ℓ1" is not necessary for property (w). Saab and Saab's result (Fact 14.3.14) and Proposition 14.3.6 highlight the important
role of the property "not containing a complemented copy of ℓ1". We do not know whether or not there exists a Banach space that is not (w) but does not contain a complemented copy of ℓ1.
Definition 14.3.17 We say that X is (symmetric w) if every continuous linear symmetric operator from X to X* is weakly compact.

The property (symmetric w) complements property (cms) nicely:

Proposition 14.3.18 X is (w) if and only if X is both (symmetric w) and (cms).

Proof. "⇒": use Proposition 14.3.2. "⇐": use Proposition 12.3.5. ∎

Remarks 14.3.19
(i) For more on (symmetric w) in complex Banach spaces, see [3].

(ii) According to Gutiérrez [82, page 151], it is an open problem whether or not a Banach space with property (symmetric w) is necessarily (w).

(iii) Similarly, we do not know whether or not a Banach space that is (cms) is necessarily (w).

(iv) In view of Proposition 14.3.18, the questions asked in (ii) and (iii) are really about the relationship between the properties (symmetric w) and (cms).
(v) Suppose X is a Banach space such that every continuous linear symmetric operator from X to X* is the difference of two continuous linear monotone symmetric operators. Then, using Theorem 12.3.8, X is (w) if and only if X is (cms).
(vi) Let us now briefly outline that the approach suggested in (v) works in Hilbert space: indeed, given a continuous linear symmetric operator T from X to X*, define the positive (resp. negative) part of T by T⁺ := ½|T| + ½T (resp. T⁻ := ½|T| − ½T), where |T| := √(T²) (Spectral Theorem!). Then T⁺ and T⁻ are continuous linear monotone symmetric operators and T = T⁺ − T⁻. (Since the notions considered depend only on the topology rather than the given norm, the same is true for all renormed Hilbert spaces; in particular, for all finite-dimensional Banach spaces.)
(vii) There exists a continuous linear symmetric operator from ℓ1 to ℓ∞ that cannot be written as the difference of two continuous linear monotone symmetric operators: indeed, Aron et al. [3, page 83] provide a continuous linear symmetric operator that is not weakly compact (and necessarily not monotone; see Theorem 12.3.8). In particular, ℓ1 is not (symmetric w).
(viii) It would be interesting to know if the notions of the positive (resp. negative) part of an operator (see (vi)) can be made meaningful for a "reasonable" class of Banach spaces.

(ix) Proposition 14.3.4 and Proposition 14.3.6 remain valid if we replace "(cms)" by "(symmetric w)" (for the proof of Proposition 14.3.6, use the operator mentioned in (vii) rather than the Gossez operator).

(x) In view of (ix), we could add "X is (symmetric w)" to the list of items characterizing (w) Banach lattices in Fact 14.3.14.
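The Hilbert-space construction in (vi) is easy to carry out in finite dimensions; a sketch using an eigendecomposition for |T| = √(T²) (numerical tolerances are our choice):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.normal(size=(4, 4))
T = (A + A.T) / 2                    # a symmetric operator

w, V = np.linalg.eigh(T)             # spectral decomposition: T = V diag(w) V^T
absT = V @ np.diag(np.abs(w)) @ V.T  # |T| = sqrt(T^2)
Tplus = (absT + T) / 2               # positive part
Tminus = (absT - T) / 2              # negative part

assert np.allclose(T, Tplus - Tminus)
for Q in (Tplus, Tminus):            # both parts are symmetric and monotone (PSD)
    assert np.allclose(Q, Q.T)
    assert np.linalg.eigvalsh(Q).min() >= -1e-10
```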
14.4 Notes
The Gossez operator from Example 14.2.2 arises quite naturally from an operator-theoretic viewpoint:

Remark 14.4.1 The universal non-weakly-compact operator (see [52, Exercise VII.6]) is the sum operator σ from ℓ1 to ℓ∞:

Then the symmetric part P of σ is given by (Px)ₙ := ½xₙ + ½Σₖ xₖ, ∀x ∈ ℓ1; the skew part of σ is −½G, where G denotes Gossez's operator from Example 14.2.2; and σ is monotone: denoting (1, 1, 1, …) ∈ ℓ∞ by e, we have ⟨σx, x⟩ = ½⟨e, x⟩² + ½‖x‖₂². Similarly, the tail operator τ from ℓ1 to ℓ∞ is given by (τx)ₙ := Σ_{k≥n} xₖ, ∀x ∈ ℓ1, n ∈ ℕ. Then σ and τ possess the same symmetric part (so τ is also monotone), but the skew part of τ equals ½G.
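The identities of Remark 14.4.1 can be checked on truncations of the sum operator; a sketch (the finite matrix truncation is our illustrative choice):

```python
import numpy as np

n = 10
sigma = np.tril(np.ones((n, n)))   # truncated sum operator: (sigma x)_i = sum_{k<=i} x_k
e = np.ones(n)
P = (sigma + sigma.T) / 2          # symmetric part
S = (sigma - sigma.T) / 2          # skew part
G = np.triu(np.ones((n, n)), 1) - np.tril(np.ones((n, n)), -1)  # Gossez-type matrix

rng = np.random.default_rng(6)
for x in rng.normal(size=(50, n)):
    assert np.allclose(P @ x, x / 2 + (e @ x) / 2 * e)  # (Px)_n = x_n/2 + <e,x>/2
    assert np.allclose(S @ x, -(G @ x) / 2)             # skew part is -G/2
    # Monotonicity: <sigma x, x> = <e, x>^2/2 + ||x||_2^2/2.
    assert abs(sigma @ x @ x - ((e @ x) ** 2 + x @ x) / 2) < 1e-9
```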
Remarks 14.4.2 Consider the operator F from Example 14.2.4 and let X := L1[0,1]. Fitzpatrick and Phelps showed directly that the operator F is not locally maximal monotone; see [67, Example 3.2]. We now sketch a proof that F is not tauberian. Define the operator T from X to X* by (Tx)(t) := ∫_0^t x(s) ds, ∀x ∈ X, t ∈ [0,1]. Since the skew part of T is −½F (see the proof of Example 14.2.4), the operator F is tauberian if and only if T is. If T were tauberian, then T(B_X) would be closed (see [161, Problem 11-4-126] or [114, Theorem 2.1]).

Key step (Tamás Erdélyi [62]): T(B_X) is not closed.

To see this, denote Lebesgue's singular function by Λ (see [87, Exercise 8.28] for the construction and a sketch). Let α₁ (resp. α₂) be the piecewise linear continuous function on [0,1] determined by the points {(0,0), (1/3, 1/2), (2/3, 1/2), (1,1)} (resp. by {(0,0), (1/9, 1/4), (2/9, 1/4), (1/3, 1/2), (2/3, 1/2), (7/9, 3/4), (8/9, 3/4), (1,1)}). Continue, in the spirit of the construction of Λ via Cantor's ternary set, to obtain a sequence (αₙ) of continuous piecewise linear functions, and denote their derivatives by (φₙ). Then (φₙ) lies in B_X (in fact, ‖φₙ‖₁ = 1) and the sequence (Tφₙ) = (αₙ) converges uniformly to Λ, and hence in L∞[0,1]. However, Λ is not absolutely continuous ([150, Example 3.138]); therefore, Λ ∉ ran T ([150, Theorem 6.84]).

Altogether, the operator F is not tauberian.
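The key step can be experimented with numerically. A sketch of the approximants αₘ (the recursion below is our reconstruction of the standard ternary iteration; grid size and depth are illustrative):

```python
import numpy as np

def alpha(m, t):
    """m-th piecewise linear approximant of Lebesgue's singular function."""
    t = np.clip(np.asarray(t, dtype=float), 0.0, 1.0)
    if m == 0:
        return t
    return np.where(t < 1/3, alpha(m - 1, 3 * t) / 2,
           np.where(t < 2/3, 0.5, 0.5 + alpha(m - 1, 3 * t - 2) / 2))

grid = np.linspace(0, 1, 3**7 + 1)
prev_gap = 1.0
for m in range(1, 6):
    a_m = alpha(m, grid)
    assert a_m[0] == 0.0 and a_m[-1] == 1.0
    assert np.all(np.diff(a_m) >= -1e-12)   # increasing, so ||phi_m||_1 = 1
    gap = np.abs(alpha(m + 1, grid) - a_m).max()
    assert gap <= prev_gap / 2 + 1e-12      # sup-gaps halve: uniform Cauchy
    prev_gap = gap
```

Since each αₘ is increasing with αₘ(0) = 0 and αₘ(1) = 1, its derivative φₘ has ‖φₘ‖₁ = 1; the halving sup-gaps witness the uniform convergence to Λ.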
We conclude this chapter by mentioning an interesting example, also due to Gossez.
Fact 14.4.3 There exists a continuous linear skew operator from ℓ1 to ℓ∞ that is not unique.
Proof. [77]. ∎

Remark 14.4.4 Denoting the operator from Fact 14.4.3 by S, we see that S̃ is not monotone (by Fact 12.2.6).
Remark 14.4.5 In Proposition 14.3.11, the condition "X* is weakly sequentially complete" is not necessary for "X is (w)": Leung [103, Section 2] constructed a James type space that is (c), does not contain a copy of ℓ1, and whose dual is not weakly sequentially complete.
Remark 14.4.6 It is unknown whether or not "X is not (cms)" implies "X** is not (cms)": the proof of Proposition 14.3.13 does not generalize, since conjugates of skew operators need not be skew (Example 14.2.2 and Example 14.2.4).
C H A P T E R 14. EXAMPLES 170
Concerning Remark 14.3.16: we refer the reader to van Dulst [155] for more on Banach spaces not containing a copy of ℓ1. Further facts on properties related to (w) may be found in [61] and [136].
Chapter 15
Some nonlinear results
15.1 Overview
First steps towards nonlinear results are taken in this chapter. We study regularizations of
continuous linear monotone operators, i.e., perturbations by positive multiples of the duality
map. The results presented underline the close relationship between local maximal mono-
tonicity and monotonicity of range-dense type even in this nonlinear context. In essence,
the standard constructions of "bad" or "so-so7' operators are shown to rely on the behaviour
of continuous linear monotone operators on the underlying spaces.
15.2 Sums
Proposition 15.2.1 Suppose M is a monotone operator from X to X* and S is a continuous linear skew operator from X to X*. If M and S are of type (NI), then so is M + S, and (M + S)₀x** = M₀x** + S₀x**, ∀x** ∈ X**.

Proof. Suppose M and S are of type (NI): M₀ = M̄ and S₀ = S̄ (Proposition 12.2.7.(iii)). Fix (x**, x*) ∈ X** × X*. It is clear that M + S is monotone and that (M + S)₀x** ⊆ (M + S)‾x** (Proposition 12.2.3). Then for every (u, u*) ∈ gra M:

  ⟨x** - u, x* - (u* + Su)⟩ = ⟨x** - u, x* - u*⟩ + ⟨S*x**, -u⟩
    = ⟨x** - u, x* - u*⟩ + ⟨S*x**, x** - u⟩ - ⟨S*x**, x**⟩
    = ⟨x** - u, (x* + S*x**) - u*⟩ - ⟨S*x**, x**⟩.
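In finite dimensions, where X = X** = Rⁿ and S* is simply the transpose of S, the rearrangement above is elementary linear algebra. The following sketch (our own illustration, not from the thesis) verifies the equality of the first and last lines of the display for a random skew matrix and random data.

```python
import random

random.seed(2)
n = 4
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
S = [[0.5 * (A[i][j] - A[j][i]) for j in range(n)] for i in range(n)]  # skew: S^T = -S
St = [[S[j][i] for j in range(n)] for i in range(n)]                   # adjoint S* = S^T

def mat(M, v):
    return [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

xss = [random.uniform(-1, 1) for _ in range(n)]  # plays the role of x**
xs = [random.uniform(-1, 1) for _ in range(n)]   # plays the role of x*
u = [random.uniform(-1, 1) for _ in range(n)]
us = [random.uniform(-1, 1) for _ in range(n)]   # plays the role of u*

diff = [a - b for a, b in zip(xss, u)]
# <x** - u, x* - (u* + Su)>
lhs = dot(diff, [a - (b + c) for a, b, c in zip(xs, us, mat(S, u))])
# <x** - u, (x* + S*x**) - u*> - <S*x**, x**>
rhs = dot(diff, [(a + b) - c for a, b, c in zip(xs, mat(St, xss), us)]) - dot(mat(St, xss), xss)
assert abs(lhs - rhs) < 1e-12
```

Of course, in Rⁿ the term ⟨S*x**, x**⟩ itself vanishes by skewness; it is only in the nonreflexive setting that this quantity carries information.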
Since S* is monotone (Theorem 13.2.3) and M is of type (NI), we obtain:

  inf_{(v,v*) ∈ gra(M+S)} ⟨x** - v, x* - v*⟩
    = -⟨S*x**, x**⟩ + inf_{(u,u*) ∈ gra M} ⟨x** - u, (x* + S*x**) - u*⟩
    ≤ inf_{(u,u*) ∈ gra M} ⟨x** - u, (x* + S*x**) - u*⟩ ≤ 0.

It follows that M + S is of type (NI): (M + S)₀ = (M + S)‾. If x* ∈ (M + S)‾x** = (M + S)₀x**, then (from the inequalities above) x* + S*x** ∈ M̄x** = M₀x**, so that ⟨S*x**, x**⟩ = 0. Hence (Proposition 12.3.11.(ii)) -S*x** ∈ S₀x** = S̄x**, and thus altogether (M + S)‾x** ⊆ M̄x** + S̄x** = M₀x** + S₀x**. Finally, let y* ∈ M₀x** and -S*x** ∈ S₀x**, i.e., ⟨S*x**, x**⟩ = 0 (Proposition 12.3.11.(ii)). Then

  ⟨x** - v, (y* - S*x**) - (v* + Sv)⟩ = ⟨x** - v, y* - v*⟩ ≥ 0, ∀(v, v*) ∈ gra M;

therefore, y* - S*x** ∈ (M + S)₀x** and the proof is complete. ∎
Corollary 15.2.2 Suppose f is a convex lower semi-continuous proper function on X and T is a continuous linear monotone operator from X to X*. If T is of type (NI), then so is ∂f + T.

Proof. Decompose T into its symmetric part P and its skew part S (Proposition 12.3.5). Then P is a subdifferential (Proposition 12.3.6) and so is ∂f + P (in fact, by the sum rule, it is the subdifferential of the function x ↦ f(x) + ½⟨Tx, x⟩). Hence, as a subdifferential, ∂f + P is of dense type (Fact 12.2.10.(i)) and hence of type (NI) (Fact 12.2.9). If T is of type (NI), then so is S (Theorem 13.2.3) and, by Proposition 15.2.1, (∂f + P) + S = ∂f + T is of type (NI) as well. ∎
Proposition 15.2.3 Suppose K, M are monotone operators from X to X*, K is compact, dom K ⊇ dom M, and x** ∈ X**. Then M₁x** ⊆ (K + M)₁x** - K₁x**. If K is at most single-valued, then (K + M)₁x** = K₁x** + M₁x**.

Proof. Fix (x**, x*) ∈ gra M₁ and obtain a bounded net (x_α, x*_α) in gra M with x_α →_{w*} x** and x*_α → x*. By hypothesis, cl K({x_α}) is compact; hence there are subnets (x_β) of (x_α) and y*_β ∈ Kx_β such that y*_β → y* ∈ K₁x**. Now (x_β, x*_β + y*_β) ∈ gra(K + M), x_β →_{w*} x**, and x*_β + y*_β → x* + y*; consequently, x* + y* ∈ (K + M)₁x** and thus x* ∈ (K + M)₁x** - K₁x**.

"If" part: from the above, K₁x** + M₁x** ⊆ (K + M)₁x**. Let us now pick x* ∈ (K + M)₁x**. Then there is a bounded net (x_α, Kx_α + z*_α) in gra(K + M) with (x_α, z*_α) ∈ gra M, x_α →_{w*} x**, and Kx_α + z*_α → x*. After passing to a subnet if necessary, we assume Kx_α → y* ∈ K₁x**.
Hence z*_α → x* - y* ∈ M₁x** and the entire proposition is proven. ∎

Proposition 15.2.4 Suppose M is a monotone operator from X to X* and S is a continuous linear skew operator from X to X*. If M is of dense type and S is compact, then M + S is of dense type and (M + S)₁x** = M₁x** + S₁x** = M₁x** + S**x**, ∀x** ∈ X**.

Proof. Suppose M is of dense type and S is compact. Fix x** ∈ X**. Then M₁ = M̄, and S is of dense type (and of type (NI); see Theorem 13.2.5). Hence, by Proposition 15.2.1 and Corollary 12.3.2, (M + S)‾x** = M₁x** + S₁x** = M₁x** + S**x**. On the other hand, since S is compact and single-valued, Proposition 15.2.3 and Proposition 12.2.3 apply and yield M₁x** + S₁x** = (M + S)₁x** ⊆ (M + S)‾x**. Altogether, (M + S)₁x** = (M + S)‾x**. ∎

Corollary 15.2.5 Suppose f is a convex lower semi-continuous proper function on X and
T is a continuous linear monotone operator from X to X*. If the skew part of T is compact or if X is (c), then ∂f + T is of dense type.
Proof. If X is (c), then the skew part of T is certainly compact. So, much as in the proof of Corollary 15.2.2, denote the symmetric (resp. skew) part of T by P (resp. S). Then ∂f + P is of dense type and S is compact; hence Proposition 15.2.4 applies. ∎
15.3 Regularizations
Suppose T is a monotone operator from X to X* and λ > 0. Then the operator

  T + λJ,

where J denotes the duality map, is called a regularization of T. The reader is referred to [168] for motivation and examples.
We start with some basic properties.
Proposition 15.3.1 Suppose T is a continuous linear monotone operator from X to X*. Then the regularization T + λJ is coercive and maximal monotone, ∀λ > 0.
Proof. Coercivity is obvious from ⟨(T + λJ)x, x⟩ ≥ λ‖x‖². The operators T and λJ are maximal monotone and their domain is the entire space; the same is true for T + λJ (Fact 2.5.3). ∎

Definition 15.3.2 Suppose ε ≥ 0. Then the ε-subdifferential map of the function ½‖·‖² on X is denoted by J_ε; thus

  J_ε x := {x* ∈ X* : ½‖y‖² ≥ ½‖x‖² + ⟨x*, y - x⟩ - ε, ∀y ∈ X}.
(The reader will find some more information on the ε-subdifferential in the Glossary and should be warned that the notation of Definition 15.3.2 is not quite unambiguous: on the one hand, J_ε has a precise meaning for ε equal to 0 or 1 (for ε = 0, we recover J). On the other hand, one might think that J₀ = J₁ = (J*)⁻¹ (see Definition 12.2.1 and Fact 12.2.10.(i)). However, for the remainder of this section, J_ε is always meant to be as in Definition 15.3.2.)
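In a Hilbert space identified with its dual, Definition 15.3.2 becomes quite concrete: since the conjugate of ½‖·‖² is again ½‖·‖², completing the square gives x* ∈ J_ε x ⟺ ⟨x*, x⟩ ≥ ½‖x‖² + ½‖x*‖² − ε ⟺ ½‖x* − x‖² ≤ ε, so that J_ε x is the closed ball of radius √(2ε) centred at Jx = x. The sketch below (our own illustration; it assumes X = Rⁿ with the Euclidean norm) checks this equivalence on random samples.

```python
import random

def in_J_eps(x, xs, eps):
    """Membership of xs in J_eps x, in the conjugate form of Definition 15.3.2."""
    dot = sum(a * b for a, b in zip(xs, x))
    nx2 = sum(a * a for a in x)
    nxs2 = sum(a * a for a in xs)
    return dot >= 0.5 * nx2 + 0.5 * nxs2 - eps

# In R^n with the Euclidean norm, J_eps x is the closed ball around x
# of radius sqrt(2*eps), i.e. membership means 1/2 ||xs - x||^2 <= eps.
random.seed(0)
eps = 0.1
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(3)]
    xs = [random.uniform(-1, 1) for _ in range(3)]
    margin = 0.5 * sum((a - b) ** 2 for a, b in zip(xs, x)) - eps
    if abs(margin) > 1e-9:  # stay away from the float boundary
        assert in_J_eps(x, xs, eps) == (margin <= 0)
```

For ε = 0 this collapses to J₀x = {x}, i.e., we recover the duality map of a Hilbert space.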
Fact 15.3.3 (Gossez) Suppose T is a monotone operator of dense type from X to X*. Then ran(T + λJ_ε) = X*, ∀λ > 0, ε > 0.

Proof. [74, Théorème 4.1]. ∎

Fact 15.3.4 (Gossez) Suppose T is a monotone operator from X to X* and λ > 0. If cl ran(T + λJ) = X*, then ran(T + λJ_ε) = X*, ∀ε > 0.

Proof. [75, Lemma 1]. ∎
The converse holds true in the continuous linear setting.

Proposition 15.3.5 Suppose T is a continuous linear operator from X to X* and λ > 0. If ran(T + λJ_ε) = X*, ∀ε > 0, then cl ran(T + λJ) = X*.

Proof. Fix z* ∈ X*. For every ε > 0, there exist x_ε ∈ X and u*_ε ∈ J_ε x_ε such that z* = Tx_ε + λu*_ε. By a result of Brøndsted and Rockafellar ([27, Lemma]; see also [120, Theorem 3.17]), there exist y_ε ∈ X and v*_ε ∈ Jy_ε such that ‖x_ε - y_ε‖ ≤ √ε and ‖u*_ε - v*_ε‖ ≤ √ε. Then Ty_ε + λv*_ε ∈ ran(T + λJ) and

  ‖z* - (Ty_ε + λv*_ε)‖ = ‖T(x_ε - y_ε) + λ(u*_ε - v*_ε)‖ ≤ (‖T‖ + λ)√ε.

Letting ε tend to 0 from above yields z* ∈ cl ran(T + λJ). ∎
Proposition 15.3.6 Suppose T is a monotone operator from X to X* and ε > 0. If there exists a sequence of positive reals (λ_n) with λ_n → 0 and ran(T + λ_n J_ε) = X*, ∀n ∈ N, then T is of range-dense type.

Proof. (See also [75, Lemma 2].) In view of Proposition 12.2.3 and Proposition 12.2.7.(ii), it suffices to show that ran T̄ ⊆ ran T₁. So fix z* ∈ ran T̄, say z* ∈ T̄z** for some z** ∈ X**. Obtain for every n ∈ N a vector x_n ∈ X with z* = y*_n + λ_n u*_n, where y*_n ∈ Tx_n and u*_n ∈ J_ε x_n. Then (by definition of T̄) ⟨z** - x_n, z* - y*_n⟩ ≥ 0 and so ⟨z**, u*_n⟩ ≥ ⟨x_n, u*_n⟩. Using the definition of J_ε, we estimate ½‖x_n‖² ≤ ε - ½‖u*_n‖² + ⟨u*_n, x_n⟩ ≤ ε - ½‖u*_n‖² + ⟨z**, u*_n⟩ ≤ ε + ½‖z**‖²; the sequence (x_n) stays bounded and so does the sequence (u*_n) (an easy proof by contradiction). It follows that λ_n u*_n → 0. Moreover, there is a subnet (x_β) of (x_n) with x_β →_{w*} x**, for some x** ∈ X**. Now (x_β, y*_β) ∈ gra T, x_β →_{w*} x**, and y*_β = z* - λ_β u*_β → z*; hence (x**, z*) ∈ gra T₁. Consequently, z* ∈ ran T₁, as desired. ∎
Theorem 15.3.7 Suppose T is a continuous linear monotone operator from X to X*. Then TFAE:

(i) T is of dense type.

(ii) ran(T + λJ_ε) = X*, ∀λ > 0, ε > 0.

(iii) cl ran(T + λJ) = X*, ∀λ > 0.

(iv) T + λJ is of range-dense type, ∀λ ≥ 0.

(v) There exists a sequence (λ_n) of positive reals tending to 0 with cl ran(T + λ_n J) = X*, ∀n ∈ N.

(vi) There exist ε > 0 and a sequence (λ_n) of positive reals tending to 0 with ran(T + λ_n J_ε) = X*, ∀n ∈ N.

Proof. "(i)⇒(ii)": Fact 15.3.3. "(ii)⇔(iii)": Fact 15.3.4 and Proposition 15.3.5.
"(ii)&(iii)⇒(iv)": Fix λ ≥ 0 and ε > 0. Suppose first λ = 0. By (ii), ran(T + (1/n)J_ε) = X*, ∀n ∈ N, and hence (Proposition 15.3.6) T + λJ = T is of range-dense type. Now suppose λ > 0. By (iii), cl ran(T + (λ + 1/n)J) = cl ran((T + λJ) + (1/n)J) = X*, ∀n ∈ N. Thus (Fact 15.3.4) ran((T + λJ) + (1/n)J_ε) = X*, ∀n ∈ N. Hence T + λJ is of range-dense type by Proposition 15.3.6. "(iii)⇒(v)": is trivial. "(v)⇒(vi)": immediate from Fact 15.3.4. "(iv)⇒(i)":
T + 0J = T is monotone and of range-dense type, hence of dense type (Theorem 13.2.3). "(vi)⇒(i)": By Proposition 15.3.6, T is of range-dense type, thus of dense type (Theorem 13.2.3). ∎
By contrast, Simons's [146, Theorem 12.(a)] shows that if T is an arbitrary monotone operator of type (NI) from X to X*, then T + λJ is onto, for every λ > 0.

The next two results apply to the various rugged spaces identified in Subsection 2.3.4.
Proposition 15.3.8 Suppose X is rugged, T is a continuous linear monotone operator from X to X*, and λ > 0. Then TFAE: (i) T + λJ is locally maximal monotone; (ii) cl ran(T + λJ) is convex; (iii) cl ran(T + λJ) = X*.

Proof. "(i)⇒(ii)": follows from Fact 12.2.11. "(ii)⇒(iii)": Suppose not. Then there exists x* ∈ X* \ (cl ran(T + λJ)). Now separate: ⟨x**, x*⟩ > sup_{x∈X} ⟨x**, (T + λJ)x⟩, for some x** ∈ X** \ {0}. The homogeneity of T and J implies ⟨x**, (T + λJ)x⟩ = 0, ∀x ∈ X. Taking differences (T is single-valued) and dividing by λ yields ⟨x**, x*₁ - x*₂⟩ = 0 whenever x ∈ X and x*₁, x*₂ ∈ Jx. Hence x** ∈ (ran(J - J))⊥ = (cl span ran(J - J))⊥ = (X*)⊥ = {0}, an impossibility.
"(iii)⇒(i)": By Proposition 15.3.1, T + λJ is maximal monotone and coercive; therefore, by Fact 12.2.13, (i) follows. ∎
Remark 15.3.9 Gossez proved in [76] the following: "Let G be the Gossez operator from Example 14.2.2 and λ > 0 such that cl ran(G + λJ) ≠ ℓ∞. Then cl ran(G + λJ) is not convex." His proof is quite involved (relying on the Stone-Čech compactification and the definition of G). However, his result now follows from Example 2.3.16 and Proposition 15.3.8, in a very structured and almost effortless way!
In rugged spaces, some more items can be added to the list of characterizations of Theorem 15.3.7:

Theorem 15.3.10 Suppose X is rugged and T is a continuous linear monotone operator from X to X*. Then TFAE:

(i) T is of dense type.

(ii) T + λJ is locally maximal monotone, ∀λ ≥ 0.

(iii) There exists a sequence (λ_n) of positive reals tending to 0 such that cl ran(T + λ_n J) is convex, ∀n ∈ N.
Proof. "(i)⇒(ii)": T = T + 0J is locally maximal monotone by Theorem 13.2.3. Suppose now λ > 0. By Theorem 15.3.7, cl ran(T + λJ) = X*. Hence (Proposition 15.3.8) T + λJ is locally maximal monotone. "(ii)⇒(iii)": clear from Fact 12.2.11. "(iii)⇒(i)": by Proposition 15.3.8, cl ran(T + λ_n J) = X*, ∀n ∈ N. Now apply Theorem 15.3.7. ∎
We conclude with a theorem that, in view of Example 14.3.9, applies in particular to c₀(Γ).

Theorem 15.3.11 Suppose X is (c) and T is a continuous linear monotone operator from X to X*. Then T + λJ is of dense type and locally maximal monotone, ∀λ ≥ 0.

Proof. For λ = 0, this follows from Theorem 13.2.5. So suppose λ > 0. Corollary 15.2.5 implies that T + λJ is of dense type. Then Theorem 15.3.7 yields cl ran(T + λJ) = X*. Since T + λJ is maximal monotone and coercive (Proposition 15.3.1), we apply Fact 12.2.13 and conclude that T + λJ is locally maximal monotone. ∎

15.4 Notes
Remarks 15.4.1 The operator F from Example 14.2.4 is not of dense type and L1[0,1] is rugged (Example 2.3.17); hence, by Theorem 15.3.10, we expect cl ran(F + λJ) to be nonconvex, for all small λ > 0. In fact, Fitzpatrick and Phelps showed directly that cl ran(F + J) is nonconvex; see [67, Example 3.2]. It follows that F + J is neither of range-dense type nor locally maximal monotone (Fact 12.2.11). Suppose X is rugged and T is a continuous linear monotone operator from X to X* that is not of dense type (see Example 14.2.2 or Example 14.2.4). Then T + λJ is neither of range-dense type nor locally maximal monotone, for all small λ > 0: indeed, Theorem 15.3.10 yields nonconvex cl ran(T + λJ), for all small λ > 0, and so Fact 12.2.11 applies. Finally, the conclusion of Theorem 15.3.11 remains true if we replace "X is (c)" by "X is reflexive" (Proposition 15.3.1 and Fact 12.2.5). Similar remarks apply to some of the other results in this subsection (although the known results are usually stronger).
Chapter 16
A farewell to Part II
16.1 Overview
Using one last time Fenchel's mighty Duality Theorem, we are able to give partially affirmative answers to questions raised by Simons. Part II of this thesis is then concluded by a list of open problems.
16.2 Simons's strongly maximal monotone operators
16.2.1 Primally strongly maximal monotone operators
Definition 16.2.1 Suppose T is a maximal monotone operator from X to X*. Then T is said to be primally strongly maximal monotone if for every weakly compact convex nonempty subset C of X and every y* ∈ X* \ T(C), there exists (x, x*) ∈ gra T such that ⟨x - c, x* - y*⟩ < 0, ∀c ∈ C.
Theorem 16.2.2 Suppose T is a continuous linear monotone operator from X to X*. Then T is primally strongly maximal monotone.

Proof. Fix a weakly compact convex nonempty subset C of X and y* ∈ X* \ T(C). It suffices to show that there exists some x ∈ X such that ⟨x, Tx⟩ + ⟨-y*, x⟩ + ι_C*(y* - Tx) < 0. The last condition is equivalent to

  p := inf_{x∈X} { q(x) + ⟨-½y*, x⟩ + ι_C*(-½Tx + ½y*) } < 0,
where q(x) := ½⟨Tx, x⟩, ∀x ∈ X. Because C is bounded, we have dom ι_C* = X*. So Corollary 2.2.24 yields the existence of some x** ∈ X** with

  p = -{ q*(½T*x** + ½y*) + ι_C**(x**) - ½⟨x**, y*⟩ }.

Viewed in X**, the set C is weak* compact (since C is weakly compact in X) and hence weak* closed. Hence, by Example 2.2.17, ι_C** = ι_C (the latter indicator function being viewed in X**). Thus

  x := x** ∈ C and p = -{ q*(½T*x + ½y*) - ½⟨x, y*⟩ }.

Now recall that ∂q(x) = {Px}, where P is the symmetric part of T (Proposition 12.3.6.(i)). Therefore,

  p < 0 ⇔ q*(½T*x + ½y*) - ½⟨x, y*⟩ > 0 ⇔ q*(½T*x + ½y*) + q(x) > ⟨½T*x + ½y*, x⟩
    ⇔ ½T*x + ½y* ∉ ∂q(x) ⇔ ½T*x + ½y* ≠ Px ⇔ y* ≠ Tx. ∎
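The only structural fact about T used in the final chain of equivalences is Proposition 12.3.6.(i): the quadratic form q(x) = ½⟨Tx, x⟩ has ∂q(x) = {Px}, where P is the symmetric part of T. In Rⁿ this is easy to check by numerical differentiation; the sketch below is our own illustration, not from the thesis.

```python
import random

random.seed(3)
n = 3
T = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
P = [[0.5 * (T[i][j] + T[j][i]) for j in range(n)] for i in range(n)]  # symmetric part

def q(x):
    """q(x) = 1/2 <Tx, x>; only the symmetric part of T survives in this form."""
    Tx = [sum(T[i][j] * x[j] for j in range(n)) for i in range(n)]
    return 0.5 * sum(a * b for a, b in zip(Tx, x))

x = [random.uniform(-1, 1) for _ in range(n)]
h = 1e-6
grad = []
for i in range(n):
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    grad.append((q(xp) - q(xm)) / (2 * h))  # central difference (exact for quadratics)

Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
assert all(abs(g - p) < 1e-6 for g, p in zip(grad, Px))  # grad q = Px
```

In particular, the skew part of T is invisible to q, which is why the equivalences close with y* ≠ Tx rather than with a condition on P alone.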
Observation 16.2.3 Simons's [145, Theorem 6.1] states that subdifferentials of convex functions are primally strongly maximal monotone; he asked [145, page 1387] whether or not this holds true for arbitrary maximal monotone operators. In view of Theorem 16.2.2, we now see that continuous linear monotone operators are primally strongly maximal monotone as well.
16.2.2 Dually strongly maximal monotone operators
This subsection is "dual" to the previous one.
Definition 16.2.4 Suppose T is a maximal monotone operator from X to X*. Then T is said to be dually strongly maximal monotone if for every weak* compact convex nonempty subset C* of X* and every y ∈ X \ T⁻¹(C*), there exists (x, x*) ∈ gra T such that ⟨x - y, x* - c*⟩ < 0, ∀c* ∈ C*.
Theorem 16.2.5 Suppose T is a continuous linear monotone operator from X to X*. Then T is dually strongly maximal monotone.

Proof. Fix a weak* compact convex nonempty subset C* of X* and y ∈ X with Ty ∉ C*. It suffices to show that there exists some x ∈ X with ⟨Tx, x⟩ - ⟨T*y, x⟩ + ι_{C*}*(y - x) < 0. The last condition is equivalent to

  p := inf_{x∈X} { q(x) + ⟨-½T*y, x⟩ + ι_{C*}*(½y - ½x) } < 0,
where q(x) := ½⟨Tx, x⟩, ∀x ∈ X. Let g := ι_{C*}*|_X. By Example 2.2.18, g is convex, finite, and lower semi-continuous on X, and g* = ι_{C*}. Hence Corollary 2.2.24 yields the existence of some x* ∈ X* with p = -{ q*(½x* + ½T*y) + g*(x*) - ½⟨y, x*⟩ }, i.e.,

  x* ∈ C* and p = -{ q*(½x* + ½T*y) - ½⟨y, x*⟩ }.

Recall that ∂q(y) = {Py}, where P is the symmetric part of T (Proposition 12.3.6.(i)). Consequently,

  p < 0 ⇔ q*(½x* + ½T*y) - ½⟨y, x*⟩ > 0 ⇔ q*(½x* + ½T*y) + q(y) > ⟨½x* + ½T*y, y⟩
    ⇔ ½x* + ½T*y ∉ ∂q(y) ⇔ ½x* + ½T*y ≠ Py ⇔ x* ≠ Ty. ∎
Observation 16.2.6 Simons's [145, Theorem 6.2] states that subdifferentials of convex functions are dually strongly maximal monotone; he asked [145, page 1387] whether or not this holds true for arbitrary maximal monotone operators. In view of Theorem 16.2.5, we now know that continuous linear monotone operators are dually strongly maximal monotone as well.
16.3 Notes
The notion of a primally (resp. dually) strongly maximal monotone operator was coined
by Simons [143]. Zagrodny [163] studied questions different from but related to the two
questions of Simons mentioned in Observation 16.2.3 and Observation 16.2.6.
16.4 Open problems
A flourishing mathematical theory thrives on interesting open problems. An important list
of open problems in general Monotone Operator Theory was compiled by Simons [143].
Here, I add three problems which arise quite naturally from the contents of Part II of this
thesis.
16.4.1 The interrelationship problem
We have seen that the various notions of maximal monotonicity (of dense type, of range-
dense type, of type (NI), locally maximal monotone) agree not only for subdifferentials
(Fact 12.2.10) but also for continuous linear monotone operators (Theorem 13.2.3). Hence
the following question is very natural.
The interrelationship problem. Determine the precise relationships among
the following notions of maximal monotonicity: of dense type, of range-dense
type, of type (NI), locally maximal monotone.
This is likely a hard question. More realistically, we could ask for a better comparison of the extensions. For instance: given a maximal monotone operator T from X to X*, is it possible that gra T₁ ⫋ gra T₀?
I do not know the answer even when T is a continuous linear skew operator. The only
candidate for a counter-example is the operator F from Example 14.2.4; however, this is
not an easy fellow to work with.
16.4.2 The decomposition problem
The decomposition of a continuous linear monotone operator into a symmetric and a skew
part was crucial in obtaining the main results in Chapter 13. I believe that a general
decomposition of maximal monotone operators would prove at least equally useful and this
motivates my second open problem.
The decomposition problem. Suppose T is a maximal monotone operator
from X to X*. Find a constructive decomposition of T
into the sum of a symmetric part P and a skew part S.
If T is linear and continuous, then the decomposition should return what we expect from Proposition 12.3.5. If T is a subdifferential, then I would like to have T = P and S = 0. Asplund has such a decomposition [4, Theorem 6.3]; however, I fear that his result is too
nonconstructive to be useful.
Note that for a continuous linear monotone operator T, the decomposition is constructive: P = ½T + ½T*|_X and S = ½T - ½T*|_X. This clearly suggests tackling the decomposition
problem by introducing the notion of a conjugate operator. However, I am not aware of any
work on this.
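In the finite-dimensional case the constructive decomposition is a one-liner; the sketch below (our own illustration) forms P = ½(T + Tᵀ) and S = ½(T − Tᵀ) and verifies that the skew part contributes nothing to the quadratic form, i.e., ⟨Sx, x⟩ = 0.

```python
import random

def decompose(T):
    """Split a square matrix T into its symmetric part P and skew part S, T = P + S."""
    n = len(T)
    P = [[0.5 * (T[i][j] + T[j][i]) for j in range(n)] for i in range(n)]
    S = [[0.5 * (T[i][j] - T[j][i]) for j in range(n)] for i in range(n)]
    return P, S

T = [[2.0, 1.0, 0.0],
     [3.0, 1.0, -1.0],
     [0.0, 5.0, 4.0]]
P, S = decompose(T)
n = len(T)

# T = P + S, with P symmetric and S skew.
assert all(abs(P[i][j] + S[i][j] - T[i][j]) < 1e-12 for i in range(n) for j in range(n))
assert all(P[i][j] == P[j][i] and S[i][j] == -S[j][i] for i in range(n) for j in range(n))

# The skew part is invisible to the quadratic form: <Sx, x> = 0.
random.seed(1)
x = [random.uniform(-1, 1) for _ in range(n)]
Sx = [sum(S[i][j] * x[j] for j in range(n)) for i in range(n)]
assert abs(sum(a * b for a, b in zip(Sx, x))) < 1e-12
```

The open problem asks precisely for an analogue of these two lines for a general maximal monotone operator, where no adjoint Tᵀ is available.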
16.4.3 The (crns) problem
We saw in Section 14.3 that many classical Banach spaces are (crns) and thus do not
allow "weird" continuous linear monotone skew operators. Fact 14.3.14 characterized (crns)
Banach lattices in terms of not containing a complemented copy of ℓ1. This immediately
suggests my last question.
The (crns) problem. Suppose X is a Banach space. Is the property "X does not contain a complemented copy of ℓ1" equivalent to "X is (crns)"?
Note that a (crns) Banach space does not contain a complemented copy of ℓ1 (Proposition 14.3.6). Probably, this problem is the hardest of all three and its solution might require
deep Banach Space Theory.
16.5 Conclusion
In Part II of this thesis, I have analyzed and characterized various notions of monotonicity
for continuous linear monotone operators. The study depends on results from Functional
Analysis and Banach Space Theory, but most importantly on
• Fenchel's Duality Theorem,
which continues to amaze me by its wide range of applications.
Bibliography
[I] S. AGMON. The relaxation method for linear inequalities. Canadian Journal of
Mathematics, 6:382-392, 1954.
[2] I. AMEMIYA and T. ANDO. Convergence of random products of contractions in
Hilbert space. Acta scientiarum mathematicarum (Szeged), 26:239-244, 1965.
[3] R.M. ARON, B.J. COLE, and T.W. GAMELIN. Spectra of algebras of analytic
functions on a Banach space. Journal für die reine und angewandte Mathematik,
415:51-93, 1991.
[4] E. ASPLUND. Topics in the theory of convex functions. In GHIZZETTI [71], pages
1-33. Proceedings of a NATO Advanced Study Institute held in Venice, Italy, June
17-30, 1968.
[5] H. ATTOUCH and H. BREZIS. Duality for the sum of convex functions in general
Banach spaces. In J.A. Barroso, editor, Aspects of Mathematics and its Applications,
volume 34 of North-Holland mathematical library, pages 125-133. Elsevier Science
Publ. Co., 1986. Collection of papers written in honor of Leopoldo Nachbin.
[6] H. ATTOUCH and M. THERA. A general duality principle for the sum of two
operators, 1996. Preprint.
[7] J.-P. AUBIN and I. EKELAND. Applied Nonlinear Analysis. Wiley-Interscience, New
York, 1984.
[8] J.B. BAILLON. Comportement Asymptotique des Contractions et Semi-Groupes de
Contractions. PhD thesis, University of Paris VI, 1978. In French.
[9] J.B. BAILLON and H. BREZIS. Une remarque sur le comportement asymptotique
des semigroupes non lineaires. Houston Journal of Mathematics, 2(1):5-7, 1976. In
French.
[10] J.B. BAILLON and R.E. BRUCK. Ergodic theorems and the asymptotic behavior
of contraction semigroups. In K.K. Tan, editor, Fixed Point Theory and Applica-
tions, pages 12-26, Singapore, 1992. World Scientific Publ. Proceedings of the second
international conference held in Halifax, Nova Scotia, Canada, June 9-14, 1991.
[11] H.H. BAUSCHKE. The approximation of fixed points of compositions of nonexpansive
mappings in Hilbert space. Journal of Mathematical Analysis and Applications. To
appear.
[12] H.H. BAUSCHKE. A norm convergence result on random products of relaxed
projections in Hilbert space. Transactions of the American Mathematical Society,
347(4):1365-1374, April 1995.
[13] H.H. BAUSCHKE and J.M. BORWEIN. Legendre functions and the method of random Bregman projections. Journal of Convex Analysis. To appear.
[14] H.H. BAUSCHKE and J.M. BORWEIN. On projection algorithms for solving convex
feasibility problems. SIAM Review. To appear.
[15] H.H. BAUSCHKE and J.M. BORWEIN. On the convergence of von Neumann's al-
ternating projection algorithm for two sets. Set-Valued Analysis, 1(2):185-212, 1993.
[16] H.H. BAUSCHKE and J.M. BORWEIN. Dykstra's alternating projection algorithm
for two sets. Journal of Approximation Theory, 79(3):418-443, December 1994.
[17] H.H. BAUSCHKE and J.M. BORWEIN. Continuous linear monotone operators on
Banach spaces. Technical report, Centre for Experimental & Constructive Mathemat-
ics (CECM), Simon Fraser University, 1995. CECM Information Document 95-049.
Submitted.
[18] H.H. BAUSCHKE, J.M. BORWEIN, and A.S. LEWIS. The method of cyclic pro-
jections for closed convex sets in Hilbert space. In Y. Censor and S. Reich, editors,
Optimization and Nonlinear Analysis, Contemporary Mathematics. American Math-
ematical Society, 1996. Proceedings on the Special Session on Optimization and Non-
linear Analysis, Jerusalem, May 1995. To appear.
[19] J.M. BORWEIN. Convex relations in analysis and optimization. In S. Schaible and
W.T. Ziemba, editors, Generalized Concavity in Optimization and Economics, pages
335-377. Academic Press, 1980. Proceedings of the NATO Advanced Study Institute
held at the University of British Columbia in Vancouver, Canada, August 4-15.
[20] J.M. BORWEIN. Stability and regular points of inequality systems. Journal of Opti-
mization Theory and Applications, 48(1):9-52, 1986.
[21] J.M. BORWEIN, S. FITZPATRICK, and J . VANDERWERFF. Examples of convex
functions and classifications of normed spaces. Journal of Convex Analysis, 1(1):61-73,
1994.
[22] J.M. BORWEIN and A.S. LEWIS. Partially finite convex programming Part I: Quasi
relative interiors and duality theory. Mathematical Programming, 57:15-48, 1992.
[23] J.M. BORWEIN and A.S. LEWIS. Partially finite convex programming Part II: Explicit lattice models. Mathematical Programming, 57:49-83, 1992.
[24] J.M. BORWEIN and D.T. YOST. Absolute norms on vector lattices. Proceedings of
the Edinburgh Mathematical Society, 27:215-222, 1984.
[25] J.P. BOYLE and R.L. DYKSTRA. A method for finding projections onto the in-
tersection of convex sets in Hilbert spaces. In R.L. Dykstra, T. Robertson, and
F.T. Wright, editors, Advances in Order Restricted Statistical Inference, pages 28-
47. Springer-Verlag, 1985. Lecture Notes in Statistics: vol 37. Proceedings, Iowa City.
[26] L.M. BREGMAN. The method of successive projection for finding a common point
of convex sets. Soviet Mathematics Doklady, 6:688-692, 1965.
[27] A. BRØNDSTED and R.T. ROCKAFELLAR. On the subdifferentiability of convex
functions. Proceedings of the American Mathematical Society, 16:605-611, August
1965.
[28] F.E. BROWDER. Convergence theorems for sequences of nonlinear operators in Ba-
nach spaces. Mathematische Zeitschrift, 100:201-225, 1967.
[29] F.E. BROWDER, editor. Nonlinear Functional Analysis. American Mathematical
Society, Providence, Rhode Island, 1970. Proceedings of the Symposium in Pure
Mathematics of the American Mathematical Society held in Chicago, Illinois, April
16-19, 1968. Proceedings of Symposia in Pure Mathematics Volume XVIII, Part I.
[30] R.E. BRUCK. Random products of contractions in metric and Banach spaces. Journal
of Mathematical Analysis and Applications, 88:319-332, 1982.
[31] J.V. BURKE. Personal communication.
[32] J.V. BURKE and P. TSENG. A unified analysis of Hoffman's bound via Fenchel
duality. SIAM Journal on Optimization, 6(2):265-282, May 1996.
[33] Y. CENSOR. Row-action methods for huge and sparse systems and their applications.
SIAM Review, 23(4):444-466, October 1981.
[34] Y. CENSOR. Iterative methods for the convex feasibility problem. In M. Rosenfeld
and J. Zaks, editors, Convexity and Graph Theory, pages 83-91. North-Holland, 1984.
Proceedings of the Conference on Convexity and Graph Theory, Israel, March 1981.
Annals of Discrete Mathematics (20).
[35] Y. CENSOR. Parallel application of block-iterative methods in medical imaging and
radiation therapy. Mathematical Programming, 42: 307-325, 1988.
[36] Y. CENSOR and G.T. HERMAN. On some optimization techniques in image recon-
struction from projections. Applied Numerical Mathematics, 3(5):365-391, 1987.
[37] Y. CENSOR and A. LENT. Cyclic subgradient projections. Mathematical Program-
ming, 24:233-235, 1982.
[38] G. CIMMINO. Calcolo approssimato per le soluzioni dei sistemi di equazioni lineari. La
Ricerca Scientifica ed il Progresso Tecnico nell'Economia Nazionale (Roma), 9(1):326-
333, 1938. Consiglio Nazionale delle Ricerche. Ministero dell'Educazione Nazionale.
[39] P.L. COMBETTES. Hilbertian convex feasibility problem: Convergence of projection
methods. Applied Mathematics and Optimization. To appear.
[40] P.L. COMBETTES. The foundations of set theoretic estimation. Proceedings of the
IEEE, 81(2):182-208, February 1993.
[41] P.L. COMBETTES. The Convex Feasibility Problem in Image Recovery, volume 95
of Advances in Imaging and Electron Physics, pages 155-270. Academic Press, Inc.,
1996.
[42] P.L. COMBETTES. Convex set theoretic image recovery by extrapolated iterations of
parallel subgradient projections. IEEE Transactions on Image Processing, November
or December 1996. To appear.
[43] J.B. CONWAY. A Course in Functional Analysis, volume 96 of Graduate Texts in
Mathematics. Springer-Verlag, New York, second edition, 1990.
[44] A.R. DE PIERRO and A.N. IUSEM. A simultaneous projections method for linear
inequalities. Linear Algebra and its Applications, 64:243-253, 1985.
[45] A.R. DE PIERRO and A.N. IUSEM. A finitely convergent "row-action" method for
the convex feasibility problem. Applied Mathematics and Optimization, 17:225-235,
1988.
[46] K. DEIMLING. Nonlinear Functional Analysis. Springer-Verlag, Berlin, 1985.
[47] F. DEUTSCH. Rate of convergence of the method of alternating projections. In
B. Brosowski and F. Deutsch, editors, Parametric optimization and approximation,
pages 96-107. Birkhauser, 1983. International Series of Numerical Mathematics Vol
72.
[48] F. DEUTSCH. The method of alternating orthogonal projections. In S.P. Singh, edi-
tor, Approximation Theory, Spline functions and Applications, pages 105-121. Kluwer
Academic Publ., 1992. Proceedings of a Conference held in the Hotel villa del Mare,
Maratea, Italy between April 28, 1991 and May 9, 1991.
[49] F. DEUTSCH. The angle between subspaces of a Hilbert space. In S.P. Singh, editor,
Approximation Theory, Wavelets and Applications, pages 107-130. Kluwer Academic
Publishers, 1995.
[50] F. DEUTSCH and H. HUNDAL. The rate of convergence for the method of alternating
projections II. Journal of Mathematical Analysis and Applications. To appear.
[51] R. DEVILLE, G. GODEFROY, and V. ZIZLER. Smoothness and renormings in
Banach spaces, volume 64 of Pitman monographs and surveys in pure and applied
mathematics. Longman Scientific & Technical, New York, 1993.
[52] J. DIESTEL. Sequences and Series in Banach Spaces, volume 92 of Graduate Texts
in Mathematics. Springer-Verlag, New York, 1984.
[53] B. DJAFARI ROUHANI. Asymptotic behaviour of almost nonexpansive sequences
in a Hilbert space. Journal of Mathematical Analysis and Applications, 151:226-235,
1990.
[54] V. DOLEZAL. Monotone operators and applications in Control and Network Theory.
Elsevier Scientific Publishing Company, Amsterdam, The Netherlands, 1979.
[55] N. DUNFORD and J.T. SCHWARTZ. Linear Operators Part I: General Theory.
Interscience Publishers, New York, 1964.
[56] J.M. DYE and S. REICH. Unrestricted iterations of nonexpansive mappings in Hilbert
space. Nonlinear Analysis, 18(2):199-207, 1992.
[57] J. ECKSTEIN and D.P. BERTSEKAS. On the Douglas-Rachford splitting method
and the proximal point algorithm for maximal monotone operators. Mathematical
Programming, 55(3):293-318, 1992.
[58] I. EKELAND and R. TEMAM. Convex Analysis and Variational Problems. North-
Holland Publ., Amsterdam, 1976.
[59] L. ELSNER, I. KOLTRACHT, and M. NEUMANN. Convergence of sequential and
asynchronous nonlinear paracontractions. Numerische Mathematik, 62:305-319, 1992.
[60] G. EMMANUELE. A dual characterization of Banach spaces not containing ℓ1. Bulletin of the Polish Academy of Sciences, Mathematics, 34(3-4):155-16, 1986.
[61] G. EMMANUELE. Remarks on weak compactness of operators defined on cer-
tain injective tensor products. Proceedings of the American Mathematical Society,
116(2):473-476, October 1992.
[62] T. ERDELYI. Personal communication.
[63] I.I. EREMIN. Fejér mappings and convex programming. Siberian Mathematical Journal, 10:762-772, 1969.
[64] Ky FAN. On systems of linear inequalities. In H. W. Kuhn and A.W. Tucker, editors,
Linear inequalities and related systems, pages 99-156. Princeton University Press,
1956. Annals of Mathematics Studies Number 38.
[65] S. FITZPATRICK. Personal communication.
[66] S.P. FITZPATRICK and R.R. PHELPS. Bounded approximations to monotone oper-
ators on Banach spaces. Annales de l'lnstitut Henri Poincare'. AnaZyse Non Line'aire,
9(5):573-595, 1992.
[67] S.P. FITZPATRICK and R.R. PHELPS. Some properties of maximal monotone op-
erators on nonreflexive Banach spaces. Set- Valued Analysis, 3:51-69, 1995.
[68] S.D. FLAM and J. ZOWE. Relaxed outer projections, weighted averages and convex
feasibility. BIT, 30:289-300, 1990.
1691 S. FUCIK and A. KUFNER. Nonlinear Difierential Equations, volume 2 of Studies
in Applied Mechanics. Elsevier Scientific Publishing Company, Amsterdam, 1980.
[70] N. GAFFKE and R. MATHAR. A cyclic projection algorithm via duality. Metrika,
36:29-54, 1989.
[71] A. GHIZZETTI, editor. Theory and Applications of monotone operators, Gubbio,
Italy, 1969. Edizioni "Oderisi". Proceedings of a NATO Advanced Study Institute
held in Venice, Italy, June 17-30, 1968.
[72] J.R. GILES. Convex analysis with application in the differentiation of convex func-
tions. Pitman, 1982.
[73] K. GOEBEL and W.A. KIRK. Topics in metric fixed point theory. Cambridge Uni-
versity Press, 1990. Cambridge studies in advanced mathematics 28.
[74] J.-P. GOSSEZ. Opérateurs monotones non linéaires dans les espaces de Banach non
réflexifs. Journal of Mathematical Analysis and Applications, 34:371-395, 1971.
[75] J.-P. GOSSEZ. On the range of a coercive maximal monotone operator in a nonre-
flexive Banach space. Proceedings of the American Mathematical Society, 35(1):88-92,
September 1972.
[76] J.-P. GOSSEZ. On a convexity property of the range of a maximal monotone operator.
Proceedings of the American Mathematical Society, 55(2):359-360, March 1976.
[77] J.-P. GOSSEZ. On the extensions to the bidual of a maximal monotone operator.
Proceedings of the American Mathematical Society, 62(1):67-71, January 1977.
[78] C.W. GROETSCH. Generalized inverses of linear operators. Marcel Dekker, Inc.,
New York, 1977. Monographs and textbooks in pure and applied mathematics, Vol
37.
[79] L.G. GUBIN, B.T. POLYAK, and E.V. RAIK. The method of projections for finding
the common point of convex sets. U.S.S.R. Computational Mathematics and Mathe-
matical Physics, 7(6):1-24, 1967.
[80] O. GÜLER. On the convergence of the proximal point algorithm for convex minimiza-
tion. SIAM Journal on Control and Optimization, 29(2):403-419, March 1991.
[81] O. GÜLER and L. TUNCEL. Characterization of the barrier parameter of homoge-
neous convex cones, 1996. Preprint.
[82] J.M. GUTIERREZ. Weakly continuous functions on Banach spaces not containing ℓ_1.
Proceedings of the American Mathematical Society, 119(1):147-152, September 1993.
[83] I. HALPERIN. The product of projection operators. Acta Scientiarum Mathemati-
carum (Szeged), 23:96-99, 1962.
[84] A. HARAUX. How to differentiate the projection on a convex set in Hilbert space.
Some applications to variational inequalities. Journal of the Mathematical Society of
Japan, 29(4):615-631, 1977.
[85] G.T. HERMAN. Image Reconstruction from Projections. Academic Press, New York,
1980.
[86] G.T. HERMAN, A. LENT, and P.H. LUTZ. Relaxation methods for image reconstruc-
tion. Communications of the Association for Computing Machinery, 21(2):152-158,
February 1978.
[87] E. HEWITT and K. STROMBERG. Real and abstract analysis, volume 25 of Graduate
Texts in Mathematics. Springer-Verlag, New York, 1965.
[88] J.-B. HIRIART-URRUTY and C. LEMARÉCHAL. Convex Analysis and Minimiza-
tion Algorithms I, volume 305 of Grundlehren der mathematischen Wissenschaften.
Springer-Verlag, 1993.
[89] J.-B. HIRIART-URRUTY and C. LEMARÉCHAL. Convex Analysis and Minimiza-
tion Algorithms II, volume 306 of Grundlehren der mathematischen Wissenschaften.
Springer-Verlag, 1993.
[90] A.J. HOFFMAN. On approximate solutions of systems of linear inequalities. Journal
of Research of the National Bureau of Standards, 49(4):263-265, October 1952.
[91] R.B. HOLMES. Geometric Functional Analysis and its Applications. Springer-Verlag,
New York, 1975.
[92] R.A. HORN and C.R. JOHNSON. Matrix analysis. Cambridge University Press,
second edition, 1985.
[93] H. HUNDAL and F. DEUTSCH. Two generalizations of Dykstra's cyclic projection
algorithm, 1995. Preprint.
[94] A.D. IOFFE and V.M. TIHOMIROV. Theory of Extremal Problems. North-Holland
Publ., Amsterdam, 1979. Studies in Mathematics and its Applications; v. 6.
[95] G.J.O. JAMESON. Topology and Normed Spaces. Chapman and Hall, 1974.
[96] V. JEYAKUMAR and H. WOLKOWICZ. Generalizations of Slater's constraint qual-
ification for infinite convex programs. Mathematical Programming, 57:85-101, 1992.
[97] S. KACZMARZ. Angenäherte Auflösung von Systemen linearer Gleichungen. Bulletin
International de l'Académie Polonaise des Sciences et des Lettres. Classe des Sciences
Mathématiques et Naturelles. Série A: Sciences Mathématiques, pages 355-357, 1937.
Cracovie, Imprimerie de l'Université.
[98] S. KAYALAR and H.L. WEINERT. Error bounds for the method of alternating
projections. Mathematics of Control, Signals, and Systems, 1:43-59, 1988.
[99] K.C. KIWIEL. The efficiency of subgradient projection methods for convex optimiza-
tion, part I: General level methods. SIAM Journal on Control and Optimization. To
appear.
[100] K.C. KIWIEL. Block-iterative surrogate projection methods for convex feasibility
problems. Linear Algebra and its Applications, 215:225-260, 1995.
[101] K.C. KIWIEL and B. LOPUCH. Surrogate projection methods for finding fixed points
of firmly nonexpansive mappings. SIAM Journal on Optimization. To appear.
[102] E.B. LEACH and J.H.M. WHITFIELD. Differentiable functions and rough norms
on Banach spaces. Proceedings of the American Mathematical Society, 33(1):120-126,
May 1972.
[103] D.H. LEUNG. Banach spaces with property (w). Glasgow Mathematical Journal,
35:207-217, 1993.
[104] A.S. LEWIS. Personal communication.
[105] A.S. LEWIS. Convex analysis on the Hermitian matrices. SIAM Journal on Opti-
mization, 6(1):164-177, February 1996.
[106] D.G. LUENBERGER. Linear and Nonlinear Programming. Addison-Wesley, second
edition, 1984.
[107] Y.I. MERZLYAKOV. On a relaxation method of solving systems of linear inequalities.
U.S.S.R. Computational Mathematics and Mathematical Physics, 2(3):504-510, 1963.
[108] P. MEYER-NIEBERG. Banach Lattices. Springer-Verlag, Berlin, 1991.
[109] W. MOORS. Personal communication.
[110] J.J. MOREAU. Décomposition orthogonale d'un espace hilbertien selon deux cônes
mutuellement polaires. Comptes Rendus des Séances de l'Académie des Sciences,
Séries A-B, Paris, 255:238-240, 1962. In French.
[111] J.J. MOREAU. Un cas de convergence des itérées d'une contraction d'un espace
hilbertien. Comptes Rendus des Séances de l'Académie des Sciences, Séries A-B,
Paris, 286(3):143-144, 1978. In French.
[112] U. MOSCO. Convergence of convex sets and of solutions of variational inequalities.
Advances in Mathematics, 3:510-585, 1969.
[113] T.S. MOTZKIN and I.J. SCHOENBERG. The relaxation method for linear inequal-
ities. Canadian Journal of Mathematics, 6:393-404, 1954.
[114] R. NEIDINGER and H.P. ROSENTHAL. Norm-attainment of linear functionals on
subspaces and characterizations of Tauberian operators. Pacific Journal of Mathemat-
ics, 118(1):215-228, 1985.
[115] Z. OPIAL. Weak convergence of the sequence of successive approximations for nonex-
pansive mappings. Bulletin of the American Mathematical Society, 73:591-597, 1967.
[116] D. PASCALI and S. SBURLAN. Nonlinear mappings of monotone type. Sijthoff &
Noordhoff International Publishers, Alphen aan den Rijn, The Netherlands, 1978.
[117] R.R. PHELPS. Subreflexive normed linear spaces. Archiv der Mathematik, 8:444-450,
1957.
[118] R.R. PHELPS. Correction to "Subreflexive normed linear spaces". Archiv der Math-
ematik, 9:439-440, 1958.
[119] R.R. PHELPS. Some subreflexive Banach spaces. Archiv der Mathematik, 10:162-169,
1959.
[120] R.R. PHELPS. Convex Functions, Monotone Operators and Differentiability, volume
1364 of Lecture Notes in Mathematics. Springer-Verlag, second edition, 1993.
[121] R.R. PHELPS. Lectures on maximal monotone operators, 1993. TeX file: phelpsmax-
monop.tex, Banach space bulletin board archive: ftp.math.okstate.edu.
[122] R.R. PHELPS. Linear monotone operators as subdifferentials, August 1994. Unpub-
lished notes.
[I231 G. PIERRA. Decomposition through formalization in a product space. Mathematical
Programming, 28:96-115, 1984.
[124] B.T. POLYAK. Minimization of unsmooth functionals. U.S.S.R. Computational
Mathematics and Mathematical Physics, 9(1):14-29, 1969.
[125] A.W. ROBERTS and D.E. VARBERG. Convex Functions. Academic Press, New
York, 1973.
[126] S.M. ROBINSON. Regularity and stability for convex multivalued functions. Mathe-
matics of Operations Research, 1(2):130-143, May 1976.
[127] R.T. ROCKAFELLAR. Convex Analysis. Princeton University Press, Princeton, NJ,
1970.
[128] R.T. ROCKAFELLAR. On the maximal monotonicity of subdifferential mappings.
Pacific Journal of Mathematics, 33:209-216, 1970.
[129] R.T. ROCKAFELLAR. On the maximality of sums of nonlinear monotone operators.
Transactions of the American Mathematical Society, 149:75-88, 1970.
[130] R.T. ROCKAFELLAR. Monotone operators and the proximal point algorithm. SIAM
Journal on Control and Optimization, 14(5):877-898, August 1976.
[131] H.P. ROSENTHAL. On quasi-complete subspaces of Banach spaces with an appendix
on compactness of operators from L_p(μ) to L_r(ν). Journal of Functional Analysis,
4:176-214, 1969.
[132] H.P. ROSENTHAL. A characterization of Banach spaces containing ℓ_1. Proceedings
of the National Academy of Sciences, USA, 71:2411-2413, 1974.
[133] H.P. ROSENTHAL. Some recent discoveries in the isomorphic theory of Banach
spaces. Bulletin of the American Mathematical Society, 84(5):803-831, 1978.
[134] H.L. ROYDEN. Real Analysis. Macmillan Publishing Company, New York, third
edition, 1988.
[135] W. RUDIN. Functional Analysis. McGraw Hill, 1973.
[136] E. SAAB and P. SAAB. Applications linéaires continues sur le produit tensoriel injectif
d'espaces de Banach. Comptes Rendus de l'Académie des Sciences de Paris, Série I,
311:789-792, 1990.
[137] E. SAAB and P. SAAB. On Stability Problems of Some Properties in Banach Spaces.
In K. Jarosz, editor, Function spaces, pages 367-394, New York, 1992. Marcel Dekker.
Lecture notes in pure and applied mathematics; vol 136. Proceedings of a conference
on function spaces held at Southern Illinois University at Edwardsville, from April 19
to 21, 1990.
[138] H.H. SCHAEFER. Banach Lattices and Positive Operators, volume 215 of Die
Grundlehren der mathematischen Wissenschaften. Springer-Verlag, New York, 1974.
[139] M.I. SEZAN. An overview of convex projections theory and its applications to image
recovery problems. Ultramicroscopy, 40:55-67, 1992.
[140] N.Z. SHOR. Minimization Methods for Non-Differentiable Functions. Springer-Verlag,
Berlin, 1985. Springer series in computational mathematics v. 3.
[141] D.M. SIMMONS. Nonlinear Programming for Operations Research. Prentice Hall,
1975.
[142] A. SIMONIC. Personal communication.
[143] S. SIMONS. Questions on subdifferentials and monotone operators. Unpublished
manuscript.
[144] S. SIMONS. Subdifferentials are locally maximal monotone. Bulletin of the Australian
Mathematical Society, 47:465-471, 1993.
[145] S. SIMONS. Subtangents with controlled slope. Nonlinear Analysis, Theory, Methods
& Applications, 22(11):1373-1389, 1994.
[146] S. SIMONS. The range of a monotone operator. Journal of Mathematical Analysis
and Applications, 199:176-201, 1996.
[147] K.T. SMITH, D.C. SOLMON, and S.L. WAGNER. Practical and mathematical as-
pects of the problem of reconstructing objects from radiographs. Bulletin of the Amer-
ican Mathematical Society, 83(6):1227-1270, November 1977.
[148] J.E. SPINGARN. Partial inverse of a monotone operator. Applied Mathematics and
Optimization, 10:247-265, 1983.
[149] H. STARK, editor. Image Recovery: Theory and Application. Academic Press, Or-
lando, Florida, 1987.
[150] K.R. STROMBERG. An introduction to classical real analysis. Wadsworth Interna-
tional Group, Belmont, California, 1981.
[151] M.R. TRUMMER. Reconstructing pictures from projections: On the convergence of
the ART algorithm with relaxation. Computing, 26(3):189-195, 1981.
[152] M.R. TRUMMER. SMART - an algorithm for reconstructing pictures from projec-
tions. Journal of Applied Mathematics and Physics (ZAMP), 34:746-753, September
1983.
[153] P. TSENG. On the convergence of the products of firmly nonexpansive mappings.
SIAM Journal on Optimization, 2(3):425-434, August 1992.
[154] M. TSUKADA. Convergence of best approximations in a smooth Banach space. Jour-
nal of Approximation Theory, 40:301-309, 1984.
[155] D. van DULST. Characterizations of Banach spaces not containing ℓ_1, volume 59
of Centrum voor Wiskunde en Informatica (CWI) Tract. Stichting Mathematisch
Centrum, Amsterdam, The Netherlands, 1989.
[156] J. VANDERWERFF. Personal communication.
[157] M.A. VIERGEVER. Introduction to discrete reconstruction methods in medical imag-
ing. In M.A. Viergever and A. Todd-Pokropek, editors, Mathematics and Computer
Science in Medical Imaging, pages 43-65, Berlin, 1988. Springer-Verlag. NATO Ad-
vanced Science Institute Series F: Computer and Systems Sciences Vol. 39. Proceed-
ings, held in Il Ciocco, Italy, September 21 - October 4, 1986.
[158] J. von NEUMANN. Functional Operators, Volume II: The Geometry of Orthogonal
Spaces. Princeton University Press, 1950. Annals of Mathematics Studies Vol. 22.
Reprint of mimeographed lecture notes first distributed in 1933.
[159] A. WILANSKY. Functional Analysis. Blaisdell Publishing Company, 1964.
[I601 A. WILANSKY. Topology for Analysis. Ginn and Company, 1970.
[161] A. WILANSKY. Modern methods in topological vector spaces. McGraw-Hill, 1978.
[162] D.C. YOULA and H. WEBB. Image reconstruction by the method of convex pro-
jections: Part 1 - theory. IEEE Transactions on Medical Imaging, MI-1(2):81-94,
October 1982.
[163] D. ZAGRODNY. The maximal monotonicity of the subdifferentials of convex func-
tions. Simons' problem, 1996. Preprint.
[164] E.H. ZARANTONELLO, editor. Contributions to Nonlinear Functional Analysis.
Academic Press, New York, 1971. University of Wisconsin. Mathematics Research
Center; Publication No. 27.
[165] E.H. ZARANTONELLO. Projections on convex sets in Hilbert space and spectral
theory. In E.H. Zarantonello, editor, Contributions to Nonlinear Functional Analysis,
pages 237-424, New York, 1971. Academic Press. University of Wisconsin. Mathe-
matics Research Center; Publication No. 27.
[166] E. ZEIDLER. Nonlinear Functional Analysis and its Applications Volume 3: Varia-
tional Methods and Optimization. Springer-Verlag, New York, 1985.
[167] E. ZEIDLER. Nonlinear Functional Analysis and its Applications Volume 2 Part A:
Linear Monotone Operators. Springer-Verlag, New York, 1990.
[168] E. ZEIDLER. Nonlinear Functional Analysis and its Applications Volume 2 Part B:
Nonlinear Monotone Operators. Springer-Verlag, New York, 1990.
GLOSSAREX 198
☞ Points to a different item in this Glossarex.
→ Denotes norm convergence.
⇀ Denotes weak convergence.
⇀* Denotes weak* convergence.
≈ Used for informal arguments involving sequences. If (x_n), (y_n) are two sequences in a Banach space, then x_n ≈ y_n means x_n − y_n → 0.
⊕, ⊖ See under polar cone.
° See under polar set.
■ Designates the end of a proof.
□ Stands for the infimal convolution of two convex functions.
⊕ If A, B are two subsets of a Hilbert space, then A ⊕ B indicates an orthogonal sum: ⟨a, b⟩ = 0, ∀a ∈ A, b ∈ B.
⊥ Stands for annihilators. Suppose X is a Banach space. If S is a subset of X, then S⊥ := {x* ∈ X* : ⟨x*, S⟩ = 0}. If S ⊆ X*, then ⊥S := {x ∈ X : ⟨x, S⟩ = 0}.
+, − Denotes taking the positive or the negative part. If (X, ≤) is a vector lattice, then x+ := x ∨ 0 and x− := (−x) ∨ 0, ∀x ∈ X. Note that r+ = max{r, 0}, ∀r ∈ R.
∨, ∧ See under vector lattice.
See Definition 2.3.8.
‾(·) In Part I, this stands for taking the (norm) closure of a set. Also used in conjunction with other symbols, e.g., with span. In Part II, this denotes a certain extension of a monotone operator; see Definition 12.2.1.(iii).
(·)_0 In Part II, this stands for taking certain extensions of monotone operators; see Definitions 12.2.1.(i),(ii).
* Denotes conjugation and leads to one of the following: dual space (see under X*), conjugate function, or conjugate (transpose) operator.
≡ Used for sequences: if (x_n) is a sequence in a Banach space X and x ∈ X, then "x_n ≡ x" is an abbreviation for "x_n = x, ∀n".
⟨·, ·⟩ If X is a Banach space and x ∈ X, x* ∈ X*, then the evaluation x*(x) is also written as ⟨x*, x⟩ or as ⟨x, x*⟩. If X is a Hilbert space, then (X and X* can be identified and) ⟨·, ·⟩ denotes the usual inner product.
[·, ·] Stands for a closed interval of reals or, more generally, for a line segment: if a, b are two points in a Banach space X, then [a, b] := {x ∈ X : x = λa + (1−λ)b, 0 ≤ λ ≤ 1} = conv{a, b} is the line segment between a and b.
[·] Denotes positive remainders of integer division; for instance: [5]_2 = 1 and [6]_2 = 2. The subscript stands for the divisor.
∂ See under subgradient.
∇ See under Gâteaux derivative.
∀ Stands for "for all".
∃ Stands for "there exist(s)".
γ, γ_0 See under angle.
ι If S is a subset of a Banach space, then ι_S denotes the indicator function of S: ι_S(x) = 0, if x ∈ S; ι_S(x) = +∞, if x ∉ S.
aff If S is a subset of a Banach space X, then aff S denotes the affine span of S, i.e., the smallest affine subspace containing S. It is not hard to check that aff S = S + span(S − S) = s_0 + span(S − s_0), ∀s_0 ∈ S.
alternating projections See under method of alternating projections.
AL-space/AM-space An abbreviation for abstract L-space/abstract M-space. Suppose (X, ‖·‖, ≤) is a Banach lattice. Then X is an AL-space, if ‖x + y‖ = ‖x‖ + ‖y‖, ∀x, y ≥ 0. And X is an AM-space, if ‖x ∨ y‖ = ‖x‖ ∨ ‖y‖, ∀x, y ≥ 0. The prototype for an AL-space (resp. AM-space) is L_1[0, 1] (resp. L_∞[0, 1]).
angle See Definitions 2.6.9, 2.6.11.
antisymmetric operator See under skew operator.
argmin Given an optimization problem min_{x∈S} f(x), where S is a set in a Banach space and f is a function defined on S, the set of all minimizers {x̄ ∈ S : f(x̄) = inf_{x∈S} f(x)} is denoted argmin_{x∈S} f(x).
asymptotically regular A sequence (x_n) in a Banach space is called asymptotically regular, if x_n − x_{n+1} → 0. For asymptotically regular algorithms, see Section 8.4.
attracting map See Definition 2.4.4.
averaged map See page 25.
B(·, ·) If X is a Banach space, x_0 ∈ X, and r ≥ 0, then B(x_0, r) := {x ∈ X : ‖x − x_0‖ ≤ r} = x_0 + r B_X is the ball centered at x_0 of radius r.
B_X The (closed) unit ball of a Banach space X: {x ∈ X : ‖x‖ ≤ 1}.
Banach lattice Suppose (X, ‖·‖) is a Banach space that is also a vector lattice (X, ≤). Recall that the modulus of an element x ∈ X is defined by |x| := x ∨ (−x). If the Banach space and the vector lattice structures of X mesh in the sense that |x| ≤ |y| implies ‖x‖ ≤ ‖y‖, ∀x, y ∈ X, then X is called a Banach lattice. We refer the reader to [138, 108] for more on Banach lattices.
Banach space A Banach space is a complete normed space. In this thesis, all Banach spaces are assumed to be real.
bd Stands for the boundary of a set.
bounded set-valued map See page 148.
boundedly compact A set S in a Banach space X is called boundedly compact, if S ∩ rB_X is compact, ∀r > 0.
boundedly linearly regular See under regularities.
boundedly regular See under regularities.
c The Banach space of all real convergent sequences (x_n) =: x with norm ‖x‖ = sup_n |x_n| is denoted c. Note that c is a closed subspace of ℓ_∞.
c_0 The Banach space of all real sequences (x_n) =: x converging to 0 with norm ‖x‖ = max_n |x_n| is denoted c_0. Note that c_0 is equal to c_0(Γ) with Γ = N.
c_0(Γ) Suppose Γ is a nonempty set. Let x be a real-valued function on Γ. Say that x ∈ c_0(Γ), if {γ ∈ Γ : |x(γ)| > ε} is finite, ∀ε > 0. Then c_0(Γ) is a Banach space with norm ‖x‖ := max_{γ∈Γ} |x(γ)|, and its dual space is equal to ℓ_1(Γ) ([91, Theorem 16.H]).
(c) See Definition 14.3.7.
C(Ω), C[0, 1] Suppose Ω is a compact Hausdorff space. The set of all continuous real-valued functions x on Ω is a Banach space with norm ‖x‖ = max_{ω∈Ω} |x(ω)|. The prime example is C[0, 1].
Censor and Lent's framework See Subsection 10.4.1.
cl Stands for the closure of a set. If used with a subscript, then it is the closure with
respect to the indicated topology.
closed operator A set-valued operator from
one Banach space to another is closed if its graph is. We touch upon this notion in Observation 13.2.4.
(cms) See Definition 14.3.1.
coercive set-valued map See page 141.
compact operator A continuous linear operator T from a Banach space X to a Banach space Y is compact, if cl T(B_X) is. If Y is finite-dimensional, then T is (trivially) compact. See also Fact 2.3.3.
compact set-valued map Suppose R is a set-valued map from a Banach space X to a Banach space Y. Then we say that R is compact, if cl R(B) is compact, for every bounded subset B of X. Note that this generalizes the notion of a compact operator.
complemented subspace Suppose Y is a Banach space. Then Z is a complemented subspace of Y, if Z is a closed subspace of Y and there is a continuous linear operator P_Z from Y to Y with ran P_Z = Z and P_Z P_Z = P_Z. (Unfortunately, P_Z is usually referred to as a projection.) We refer the reader to Jameson's [95, Chapter 29] for a gentle and thorough introduction to the concept of a complemented subspace.
cone If S is a subset of a Banach space, then cone S is the smallest convex cone containing S. Therefore, if S is nonempty, then cone S is the set of all linear combinations of elements in S with nonnegative coefficients.
conjugate function If f is a function on a Banach space X, then the (Fenchel) conjugate is given by
f*(x*) := sup_{x∈X} (⟨x*, x⟩ − f(x)), ∀x* ∈ X*.
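As a standard illustration (this worked example is not part of the thesis): on a Hilbert space, the conjugate of the halved squared norm can be computed directly from the definition.

```latex
% Standard example (not from the thesis): let X be a Hilbert space and
% f(x) := (1/2)||x||^2. For fixed x*, the concave function
% x |-> <x*,x> - (1/2)||x||^2 is maximized at x = x*, so
\[
  f = \tfrac{1}{2}\|\cdot\|^2
  \quad\Longrightarrow\quad
  f^*(x^*) = \sup_{x \in X}\bigl(\langle x^*, x\rangle - \tfrac{1}{2}\|x\|^2\bigr)
           = \tfrac{1}{2}\|x^*\|^2 .
\]
```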
conjugate monotone space See Definition 14.3.1.
consideration of remotest sets control See under considers remotest sets.
considers remotest sets See Table 8.1 and before Theorem 8.6.2.
contains a complemented copy Suppose X and Y are Banach spaces. Then Y contains a complemented copy of X, if there exists an injection from X to Y whose range is a complemented subspace.
contains a copy Suppose X and Y are Ba-
nach spaces. Then Y contains a copy of X , if there exists an injection from X to Y.
converges actively pointwise See Proposition 8.5.1.
converges linearly We say a sequence (x_n) in a Banach space converges linearly to x (with rate β), if β ∈ [0, 1[ and there exists some α ≥ 0 with ‖x_n − x‖ ≤ αβ^n, ∀n.
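For instance (a standard example, not from the thesis), the real sequence x_n = 2^{-n} converges linearly to 0 with rate β = 1/2:

```latex
% With alpha = 1 and beta = 1/2 the defining estimate holds with equality:
\[
  x_n = 2^{-n} \quad\text{satisfies}\quad
  \|x_n - 0\| = 2^{-n} \le \alpha\,\beta^{\,n}
  \qquad (\alpha = 1,\ \beta = \tfrac{1}{2}).
\]
```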
converges pointwise Suppose (T_n) is a sequence of maps whose common domain D is a subset of a Banach space. Suppose further T is a map defined on D as well. Then (T_n) converges pointwise to T, written T_n → T, if T_n(x) → T(x), ∀x ∈ D.
convex cone A convex cone is a nonempty set that is closed under addition and multiplication by nonnegative reals. It always contains 0.
convex feasibility problem See Chapter 7 and in particular Section 7.2.
convex polyhedron A finite intersection of
halfspaces is called a convex polyhe- dron.
core Stands for the core of a convex set. See Definitions 2.6.1,2.6.2.
cyclic control/projections See Table 8.1, Example 9.5.1, and Section 11.2.
cylinder See Theorem 3.3.18.
d(·, S) Describes the distance function to a set S in a Banach space X: d(x, S) := inf_{s∈S} ‖x − s‖, ∀x ∈ X. Distance functions are nonexpansive. If S is convex, then so is d(·, S). If S is closed convex and X is a Hilbert space, then d(x, S) = ‖x − P_S x‖ and d(·, S) is weakly lower semi-continuous (see also Theorem 3.2.1).
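A standard concrete case (not taken from the thesis): for the closed unit ball B_X of a Hilbert space X, both the distance function and the projection have closed forms.

```latex
% Distance to, and projection onto, the closed unit ball of a Hilbert space:
\[
  d(x, B_X) = \max\{\|x\| - 1,\, 0\},
  \qquad
  P_{B_X} x = \frac{x}{\max\{\|x\|,\, 1\}} .
\]
```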
(D) See Definition 12.2.4.(i).
decreasing We say a sequence of reals (r_n) is decreasing, if r_n ≥ r_{n+1}, ∀n. Note that you also find the term "nonincreasing" in the literature; I think, however, that this term is terrible: the sequence ((−1)^n) is certainly not increasing, but also not nonincreasing!
Demiclosedness Principle See Corollary 2.4.2.
dense type See Definition 12.2.4.(i).
distance function See under d(·, S).
dom Stands for the domain. Suppose X is a Banach space. The (effective) domain of a convex function f on X is defined as {x ∈ X : f(x) < +∞}. In contrast, the domain of a set-valued map R defined on X is the set {x ∈ X : R(x) ≠ ∅}.
duality map See Subsection 2.2.5.
dually strongly maximal monotone See Definition 16.2.4.
Dykstra's algorithm See Subsection 11.2.2.
e_γ Stands for unit vectors in spaces such as c_0(Γ), ℓ_p(Γ), and ℓ_∞(Γ).
either/or "either/or" is used in this thesis as an exclusive or; "either A or B" means: "A or B but not both".
Eliot Translator, soccer player, entertainer - and son of John Read!
ellipse See Remark 3.3.27.
epi If f is a convex function defined on a Banach space X, then the set
epi f := {(x, r) ∈ X × R : f(x) ≤ r}
is called the epigraph of f. For the projection onto an epigraph, see Theorem 3.3.21.
epigraph See under epi.
Euclidean space Hilbert spaces of finite dimension, such as R^N, are also called Euclidean spaces.
extrapolation parameter See Section 8.3.
Fejér monotone sequence See Definition 6.2.1.
Fenchel's Duality Theorem See Fact 2.2.23.
firmly nonexpansive See page 24.
finite-codimensional A closed subspace C of a Hilbert space is finite-codimensional, if C⊥ is of finite dimension.
Fix Suppose T is a map defined on a set D. Then Fix T := {x ∈ D : x = Tx} is the set of fixed points.
focusing algorithm See Section 8.5.
(FUN) An explicit algorithm for solving convex feasibility problems with two constraint sets; see Section 7.5.
Gâteaux derivative Suppose f is a convex function on a Banach space X and x ∈ dom f. Then f is Gâteaux differentiable at x, if
lim_{t→0+} (f(x + th) − f(x))/t
exists, for every h ∈ X. In this case, there exists a unique vector in X*, denoted ∇f(x) and called the Gâteaux derivative, such that
⟨∇f(x), h⟩ = lim_{t→0+} (f(x + th) − f(x))/t,
for every h ∈ X. See [72, 120], under subgradient, and Subsection 2.2.3.
Gossez operator See Example 14.2.2.
gra Stands for the graph of a (possibly set-
valued) map.
halfspace Suppose X is a Hilbert space. A halfspace is a set of the form {x ∈ X : ⟨a, x⟩ ≤ b}, where a ∈ X \ {0}, b ∈ R; see Example 3.3.13 for its projection.
har Stands for the harmonic mean. If r, s > 0, then har{r, s} = 2rs/(r + s).
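A quick numerical check of the formula (example added for illustration):

```latex
% Harmonic mean of 2 and 6:
\[
  \operatorname{har}\{2, 6\} = \frac{2 \cdot 2 \cdot 6}{2 + 6} = \frac{24}{8} = 3 .
\]
```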
has constant sets See Table 8.3 or after Proposition 8.5.1.
has long/short steps Informal for a long-step (resp. short-step) algorithm; see Table 8.2.
Hilbert lattice Suppose (X, ⟨·,·⟩, ≤) is a Hilbert space and a vector lattice. If X is a Banach lattice (where the norm is induced by the inner product), then X is called a Hilbert lattice.
Hilbert space A Hilbert space X is a real inner product space that is a Banach space in the norm ‖·‖ induced by the inner product ⟨·,·⟩: ‖x‖ := √⟨x, x⟩, ∀x ∈ X.
Hoffman's error bound See Fact 2.2.26.
hyperplane Suppose X is a Hilbert space. A hyperplane is a set of the form {x ∈ X : ⟨a, x⟩ = b}, where a ∈ X \ {0}, b ∈ R. For the projection onto a hyperplane, see Example 3.3.12.
hyperslab See Theorem 3.3.18.
I Unless stated otherwise, I denotes the identity map.
I_a The set of subgradient indices of a subgradient algorithm; see Subsection 10.3.
I_n The set of active indices for a (projection) algorithm; see Section 8.3.
icecream cone See Theorem 3.3.6. The icecream cone C_1 is also called Lorentz or second-order cone; see [81].
increasing A sequence of reals (r_n) is increasing if (−r_n) is decreasing.
injection A continuous linear one-to-one operator from a Banach space X to a Banach space Y with closed range is called an injection.
int Stands for the interior of a set. If used with a subscript, then it denotes the interior with respect to a topological subspace.
intermittent algorithm See before Theorem 8.5.3.
isometric isomorphism Suppose X and Y are Banach spaces. An isomorphism that preserves the norm of each element is called an isometric isomorphism; X and Y are then said to be isometrically isomorphic.
isometrically isomorphic See under isometric isomorphism.
isomorphic/isomorphism Suppose X and Y are Banach spaces. A continuous linear one-to-one operator from X onto Y is called an isomorphism; the Banach spaces X and Y are said to be isomorphic.
J Stands for the duality map.
Kadec/Klee property Stands for a property of the norm in Hilbert space (Proposition 2.2.3). More generally: a closed convex nonempty subset C in a Hilbert space is Kadec/Klee, if whenever a sequence (x_n) in C converges weakly to some point in the boundary of C, then it actually converges in norm. See also Proposition 4.9.1.
Karush/Kuhn/Tucker conditions See Fact 2.2.23 and Fact 2.2.25.
ker Denotes the kernel (or null space) of a linear operator.
ℓ_1 See under ℓ_p.
ℓ_1(Γ) See under ℓ_p(Γ).
ℓ_p Suppose 1 ≤ p < +∞. The Banach space of all real sequences (x_n) =: x with norm ‖x‖_p := (Σ_n |x_n|^p)^{1/p} < +∞ is denoted ℓ_p. Note that ℓ_p is equal to ℓ_p(Γ) with Γ = N.
ℓ_p(Γ) Suppose Γ is a nonempty set and 1 ≤ p < +∞. Let x be a real-valued function on Γ. Say x ∈ ℓ_p(Γ), if {γ ∈ Γ : x(γ) ≠ 0} is countable and ‖x‖_p := (Σ_{γ∈Γ} |x(γ)|^p)^{1/p} < +∞. Then (ℓ_p(Γ), ‖·‖_p) is a Banach space and its dual is equal to ℓ_q(Γ), where 1/p + 1/q = 1.
ℓ_∞ The Banach space of all real bounded sequences (x_n) =: x, with the norm given by ‖x‖ := sup_n |x_n|, is denoted ℓ_∞. Note that ℓ_∞ is equal to ℓ_∞(Γ) with Γ = N.
ℓ_∞(Γ) Suppose Γ is a nonempty set. The set of all bounded real-valued functions x on Γ is a Banach space with norm ‖x‖ = sup_{γ∈Γ} |x(γ)|.
L_1[0, 1] See under L_p[0, 1].
L_1(A, A, μ) See under L_p(A, A, μ).
L_p[0, 1] Suppose 1 ≤ p < +∞. The Banach space of all real Lebesgue-measurable functions f on [0, 1] with norm ‖f‖_p := (∫_0^1 |f|^p)^{1/p} < +∞ is denoted L_p[0, 1]. Note that L_p[0, 1] is equal to L_p(A, A, μ) with A = [0, 1], A = Lebesgue-measurable subsets of [0, 1], and μ = Lebesgue-measure.
L_p(A, A, μ) Suppose (A, A, μ) is a σ-finite complete measure space and 1 ≤ p < +∞. Let f be a measurable function on A. Say f ∈ L_p(A, A, μ), if ‖f‖_p := (∫_A |f|^p dμ)^{1/p} < +∞. Then (L_p(A, A, μ), ‖·‖_p) is a Banach space and its dual is equal to L_q(A, A, μ), where 1/p + 1/q = 1. (As usual, one actually considers equivalence classes of functions that are equal μ-almost everywhere.) For details, we refer the reader to [134].
L_∞[0, 1] Let λ be the Lebesgue-measure on the interval [0, 1]. The Banach space of all real Lebesgue-measurable functions f on [0, 1] with norm ‖f‖ := inf{M ≥ 0 : λ({a ∈ A : |f(a)| > M}) = 0} < +∞ is denoted L_∞[0, 1]. Note that L_∞[0, 1] is equal to L_∞(A, A, μ) with A = [0, 1], A = Lebesgue-measurable subsets of [0, 1], and μ = λ.
L_∞(A, A, μ) Suppose (A, A, μ) is a σ-finite complete measure space. The set of all measurable functions f on A with norm ‖f‖ := inf{M ≥ 0 : μ({a ∈ A : |f(a)| > M}) = 0} < +∞ is a Banach space denoted L_∞(A, A, μ). (As usual, one actually considers equivalence classes of functions that are equal μ-almost everywhere.) For details, we refer the reader to [134].
Lagrange multiplier See Fact 2.2.25.
LHS An acronym for "left hand side".
lin Denotes the lineality space of a set. Used in Proposition 13.3.1, where we deal with cones. If C is a cone in a Banach space, then lin C = C ∩ (−C).
line See Example 3.3.11.
linear convergence See under converges linearly.
linearly constrained feasibility problem A linearly constrained feasibility problem is a convex feasibility problem with one constraint set being a closed affine subspace (so the set can be written as A^{-1}(b), where A is some continuous linear operator from X to Y, X and Y are Hilbert spaces, and b ∈ Y).
linearly focusing algorithm See Section 8.7.
linearly regular See under regularities.
Lipschitz (continuous) Suppose T maps a subset D of a Banach space to a Banach space. Then T is Lipschitz or Lipschitz continuous, if there exists some L ≥ 0, called a Lipschitz constant, such that ||Tx − Ty|| ≤ L||x − y||, ∀x, y ∈ D.
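As a small illustration of this definition (a sketch of ours, not part of the thesis), the following Python snippet numerically spot-checks the Lipschitz inequality for T = cos on a sample grid; since |cos'| ≤ 1, a Lipschitz constant here is L = 1. The helper name and grid are choices of this sketch.

```python
import itertools
import math

# Numerical spot-check (not a proof) of ||Tx - Ty|| <= L ||x - y||
# for T = cos on D = [0, 10]; since |cos'| <= 1, L = 1 works.
def max_difference_quotient(T, points):
    """Largest |T(x) - T(y)| / |x - y| over distinct sample points."""
    return max(
        abs(T(x) - T(y)) / abs(x - y)
        for x, y in itertools.combinations(points, 2)
    )

points = [i / 10 for i in range(101)]   # grid on D = [0, 10]
L_est = max_difference_quotient(math.cos, points)
assert 0.9 < L_est <= 1.0               # consistent with L = 1
```

Note that sampling can only certify a lower bound on the best Lipschitz constant; the upper bound L = 1 comes from the mean value theorem.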
locally maximal monotone See Definition 12.2.4.(iv) and Proposition 12.2.7.(iv).
long-step method See Table 8.2.
lower semi-continuous Suppose f is a convex proper function from a Banach space X to ]−∞, +∞]. Then f is lower semi-continuous, if the sublevel sets {x ∈ X : f(x) ≤ r} are closed, ∀r ∈ ℝ. This is equivalent to: f(x) ≤ lim inf_n f(x_n), for every convergent sequence (x_n) with limit x. If one requires that the sublevel sets be closed with respect to the weak* topology (say), then one speaks of a weak* lower semi-continuous function (and one obtains a characterization involving weak* convergent nets). The concept of a lower semi-continuous function should not be confused with the notion of a lower semi-continuous relation; see under relation.
Maple A symbolic computer algebra program which helped me to perform some transformations and to generate some plots. See Examples 4.8.3, 4.8.4, and pages 88ff.
maximal monotone See under monotone operator.
maximal monotone locally See Definition 12.2.4.(v) and Proposition 12.2.7.(v).
method of alternating projections See page 83 and Subsection 11.4.1.
method of cyclic projections See under cyclic projections.
minimal angle See under angle.
monotone operator Suppose X is a Banach space and T is a set-valued map from X to X*. Then T is called monotone, if (Tx − Ty, x − y) ≥ 0, ∀x, y ∈ X (pointwise). T is maximal monotone, if T is monotone and the graph of T is a maximal subset of X × X* with respect to set-inclusion. Zorn's lemma guarantees the existence of maximal monotone extensions for any given monotone operator. Analogously, we can speak of (maximal) monotone operators from X* to X or from X** to X*, or of monotone operators whose graphs are maximal monotone with respect to some subsets, and so forth.
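To illustrate the monotonicity inequality (a sketch of ours, assuming X = ℝ² with the dot-product pairing and a single-valued linear T x = Ax): a linear operator is monotone exactly when the symmetric part (A + Aᵀ)/2 is positive semi-definite, since the skew part of A does not contribute to (Tx − Ty, x − y).

```python
import random

# A has symmetric part diag(1, 2) >= 0, so T x = A x is monotone,
# even though A itself is not symmetric.
A = [[1.0, -3.0],
     [3.0,  2.0]]

def apply_op(A, v):
    return [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]

def pairing(u, v):
    return sum(a * b for a, b in zip(u, v))

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(2)]
    y = [random.uniform(-5, 5) for _ in range(2)]
    Tx, Ty = apply_op(A, x), apply_op(A, y)
    diffT = [Tx[0] - Ty[0], Tx[1] - Ty[1]]
    diff = [x[0] - y[0], x[1] - y[1]]
    assert pairing(diffT, diff) >= 0   # the monotonicity inequality
```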
Moore/Penrose inverse See Definition 2.3.8.
Mosco convergence A notion of set convergence; see Fact 2.2.5.(ii).
ℕ Stands for the natural numbers: {1, 2, ...}.
N_C Stands for the normal cone: if C is a closed convex nonempty subset of a Banach space X, then N_C(x) := ∂ι_C(x), ∀x ∈ X. In other words: N_C(x) = ∅, if x ∉ C; N_C(x) = {x* ∈ X* : (x*, C − x) ≤ 0}, otherwise. See also Proposition 3.2.5.
(NI) See Definition 12.2.4.(iii).
nonexpansive map Suppose T is a map whose domain is a subset S of a Banach space. If T maps S to a Banach space, then T is nonexpansive, if ||Tx − Ty|| ≤ ||x − y||, ∀x, y ∈ S. A nonexpansive map is uniformly continuous and clearly Lipschitz with constant 1.
nonnegative orthant The set {x ∈ ℝ^N : x_i ≥ 0, ∀i} is called the nonnegative orthant.
nonnegative weights See under weights.
normalized Schauder basis See under Schauder basis.
open (for relations) See under relation.
Opial's Demiclosedness Principle See Corollary 2.4.2.
ordered vector space Suppose X is a (real) vector space with a relation "≤". Then (X, ≤) is an ordered vector space, if it satisfies the following two conditions.
Condition 1: (X, ≤) is a partially ordered set in the usual sense, i.e., it satisfies the following three axioms.
(PO 1): x ≤ x, ∀x ∈ X;
(PO 2): x ≤ y and y ≤ x implies x = y;
(PO 3): x ≤ y and y ≤ z implies x ≤ z.
Condition 2: The partial ordering is linear, i.e., it is connected with the vector space structure by
(L 1): x ≤ y implies x + z ≤ y + z, ∀z ∈ X;
(L 2): x ≤ y implies λx ≤ λy, ∀0 ≤ λ ∈ ℝ.
orthogonal projection Used in Operator Theory to describe the projection onto a closed subspace (Proposition 3.3.8).
overrelaxation See Section 8.8.
P In Part I, P stands for the projection operator (see Theorem 3.2.1). In Part II, P stands for the symmetric part of a continuous linear operator.
par If A is an affine subspace of a Hilbert space, then there exists a unique subspace S parallel to A, i.e., A = a + S, for (some or every) a ∈ A. This parallel subspace is denoted par A; in fact ([127, Theorem 1.2]): par A = A − A.
parabola See Example 3.3.23 and Example 4.8.4.
Parallelogram Law See Corollary 2.2.2.
pointwise convergence See under converges pointwise.
polar cone Given a cone C in a Banach space X, the (negative) polar cone is defined by {x* ∈ X* : (x*, C) ≤ 0} and denoted C⊖. The set −C⊖ is denoted C⊕ and called the positive polar cone. See also Theorem 3.3.3. Polar cones generalize the notion of the orthogonal complement of a subspace.
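As a concrete illustration (a sketch of ours, assuming X = ℝ² with the dot-product pairing): for a finitely generated cone C, the condition (x*, C) ≤ 0 only needs to be tested on the generators, and for C = ℝ²₊ this yields the polar cone −ℝ²₊.

```python
# Membership test for the (negative) polar cone of a finitely
# generated cone in R^2; helper names are ours, not the thesis's.
def in_polar_cone(x_star, generators):
    """Test whether x* satisfies (x*, g) <= 0 for every generator g."""
    return all(sum(a * b for a, b in zip(x_star, g)) <= 0
               for g in generators)

gens = [(1.0, 0.0), (0.0, 1.0)]                # generators of R^2_+
assert in_polar_cone((-2.0, -0.5), gens)       # lies in -R^2_+
assert not in_polar_cone((1.0, -3.0), gens)    # positive first coordinate
```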
polar set Given a set C in a Banach space X, the set {x* ∈ X* : (x*, C) ≤ 1} is called the polar (set) of C and denoted C⊙. Polar sets generalize the notion of a polar cone.
polyhedron See under convex polyhedron.
polytope A polytope is the convex hull of a finite set of points. In Euclidean spaces, polytopes correspond precisely to compact convex polyhedra ([127, Section 19]).
positive By a positive real number, we always mean a number bigger than 0, not a nonnegative number.
positive operator See page 143.
positive semi-definite matrices See Example 3.5.2 and Remark 3.5.1.
positive semi-definite operator See page 143.
positive weights See under weights.
positively homogeneous A map f defined on a Banach space X is positively homogeneous, if f(αx) = αf(x), ∀α > 0, x ∈ X.
primally strongly maximal monotone See Definition 16.2.1.
projection See Theorem 3.2.1.
projection algorithm See Section 8.3.
proper (convex function) A convex function f on a Banach space X is proper, if f(x) > −∞, ∀x ∈ X, and its domain dom f is nonempty.
qri Stands for the quasi relative interior; see Definition 2.6.1.(iv).
quasi-projection See Definition 3.4.3.
quotient A Banach space Y is a quotient of a Banach space X, if there exists a surjection from X to Y.
ℝ Stands for the real numbers.
R See under relaxation.
ran Used to describe the range of a (possibly set-valued) map.
range-dense type See Definition 12.2.4.(ii).
random projections See Example 9.5.3.
ray See Example 3.3.5.
reflexive (Banach space) A Banach space is reflexive, if its unit ball is weakly compact. (This definition is somewhat nonstandard but see [55, Corollary V.4.8].) By the Eberlein/Šmulian theorem ([55, Theorem V.6.1]), a Banach space is reflexive if and only if it is both weakly sequentially Cauchy and weakly sequentially complete.
regular See under regularities.
regularities See Definitions 4.2.1, 5.2.1.
regularization See Section 15.3.
relation A set-valued map R from a Banach space X to a Banach space Y is a relation. The relation R is closed (resp. convex), if its graph gra R is. Suppose y ∈ Y. Then R is said to be open at y [19, Section 1.2], if ∀x ∈ R^(−1)(y) ∀ε > 0 ∃δ > 0 : B(y, δ) ⊆ R(B(x, ε)). (In fact, R is open at y if and only if R^(−1) is lower semi-continuous at y; see [19, Proposition 1].)
short-step method See Table 8.2.
sign/Sign In Subsection 2.2.5, we define two maps on the real line. For every r ∈ ℝ: sign r = +1, if r > 0; sign 0 = 0; sign r = −1, if r < 0. The Sign map is set-valued: Sign r = {sign r}, if r ≠ 0; Sign 0 = [−1, +1].
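The two maps sign and Sign are easy to transcribe; the sketch below (ours, not from the thesis) represents each value of the set-valued Sign by a pair of interval endpoints, since Python has no interval type.

```python
# Direct transcription of the single-valued sign and set-valued Sign.
def sign(r):
    if r > 0:
        return 1
    if r < 0:
        return -1
    return 0

def Sign(r):
    """Set-valued Sign, encoded as (left endpoint, right endpoint)."""
    if r == 0:
        return (-1, 1)            # the whole interval [-1, +1]
    s = sign(r)
    return (s, s)                 # the singleton {sign r}

assert sign(3.5) == 1 and sign(-2) == -1 and sign(0) == 0
assert Sign(0) == (-1, 1)
assert Sign(-7) == (-1, -1)
```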
relaxation Stands for a certain map associ- ated with a closed convex set; see Defini- tion 3.4.1.
relaxation parameter See Section 8.3.
remotest sets control See Table 8.1.
resolvent If T is a set-valued map from a Banach space X to X*, then (J + ρT)^(−1) is called a resolvent of T, ∀ρ > 0.
RHS An acronym for "right hand side".
ri Stands for the relative interior; see Definition 2.6.1.(iii).
rugged Banach space See Subsection 2.3.4.
S_X Denotes the unit sphere {x ∈ X : ||x|| = 1} of a Banach space X.
Schauder basis A sequence (u_n)_{n≥1} is a Schauder basis of a Banach space X, if every x ∈ X can be written uniquely as x = Σ_{n=1}^∞ λ_n u_n. If ||u_n|| = 1 for all n, then the Schauder basis is normalized. For more on Schauder bases, we refer the reader to [95, Section 30].
singular algorithm See Table 8.1.
skew operator Suppose T is a continuous linear operator from a Banach space X to X*. Then T is skew, if T*|_X = −T. The following equivalences are easy to verify: T is skew ⟺ (Tx, y) = −(Ty, x), ∀x, y ∈ X ⟺ T and −T are monotone ⟺ (Tx, x) = 0, ∀x ∈ X. See also Proposition 12.3.5.
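A quick numerical illustration of the last equivalence (a sketch of ours, assuming X = ℝ² with the dot-product pairing): a matrix with Aᵀ = −A induces a skew operator, so (Ax, x) = 0 for every x.

```python
import random

A = [[0.0, -4.0],
     [4.0,  0.0]]                 # A^T = -A, hence skew

def apply_op(A, v):
    return [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]

random.seed(1)
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(2)]
    Ax = apply_op(A, x)
    # (Ax, x) vanishes up to floating-point rounding
    assert abs(Ax[0] * x[0] + Ax[1] * x[1]) < 1e-9
```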
skew part See Proposition 12.3.5.
skew-symmetric operator See under skew operator.
Slater point / Slater's condition See Fact 2.2.25, Theorem 10.3.1.(ii), and Example 10.4.3.
span Stands for the span, i.e., the set of all linear combinations of a given set in a vector space.
sri Denotes the strong relative interior; see Definition 2.6.1.(ii).
strict convexity A Banach space X is strictly convex, if ||x + y|| = ||x|| + ||y|| always implies ||y||·x = ||x||·y, ∀x, y ∈ X. By Proposition 2.2.4.(ii), Hilbert spaces are strictly convex.
Schur space A Banach space X is Schur, if every weakly convergent sequence in X is actually norm convergent. The prime example is ℓ_1.
sequence of active remotest indices See before Theorem 8.6.2.
strongly attracting map See Definition 2.4.6.
strongly focusing algorithm See Section 8.6.
strongly maximal monotone operator See Section 16.2.
ε-subdifferential Suppose f is a convex function on a Banach space X, x ∈ dom f, and ε > 0. Then x* ∈ X* is called an ε-subgradient of f at x, if (x*, h) ≤ f(x + h) − f(x) + ε, ∀h ∈ X. The set of all ε-subgradients of f at x is denoted ∂_ε f(x) and called the ε-subdifferential. This "approximate" subdifferential has nice properties (see [89, Chapter XI]); for instance: x* ∈ ∂_ε f(x) ⟺ f(x) + f*(x*) − (x*, x) ≤ ε.
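A worked one-dimensional check of this characterization (a sketch of ours, assuming X = ℝ and f(x) = x², whose conjugate is f*(x*) = (x*)²/4): the condition f(x) + f*(x*) − x*x ≤ ε reduces to (x* − 2x)²/4 ≤ ε, so ∂_ε f(x) is the closed interval of radius 2√ε around the gradient 2x.

```python
import math

# Test membership x* in d_eps f(x) for f(x) = x^2 via the conjugate
# characterization; helper names are ours.
def in_eps_subdiff(x_star, x, eps):
    f = lambda t: t * t
    f_conj = lambda s: s * s / 4.0
    return f(x) + f_conj(x_star) - x_star * x <= eps

x, eps = 1.0, 0.25
half_width = 2 * math.sqrt(eps)                    # interval radius, = 1.0 here
assert in_eps_subdiff(2 * x, x, eps)               # the exact gradient 2x
assert in_eps_subdiff(2 * x + half_width, x, eps)  # boundary point included
assert not in_eps_subdiff(2 * x + 1.1, x, eps)     # just outside
```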
subgradient Suppose f is a convex function on a Banach space X and x ∈ dom f. A point x* ∈ X* is called a subgradient of f at x, if (x*, h) ≤ f(x + h) − f(x), ∀h ∈ X. The set of all subgradients of f at x is denoted ∂f(x) and called the subdifferential. If ∂f(x) ≠ ∅, then f is said to be subdifferentiable at x. See also under Gâteaux derivative.
subgradient algorithm See Section 10.3.
subgradient index See Section 10.3.
sublevel set See Theorem 3.3.25.
support point See Subsection 6.3.2.
surjection A continuous linear operator from a Banach space X onto a Banach space Y is called a surjection.
symmetric operator Suppose T is a continuous linear operator from a Banach space X to X*. Then T is symmetric, if T*|_X = T; equivalently, if (Tx, y) = (Ty, x), ∀x, y ∈ X. See Proposition 12.3.5.
symmetric part See Proposition 12.3.5.
(symmetric w) See Definition 14.3.17.
tauberian operator A continuous linear operator T from a Banach space X to a Banach space Y is tauberian, if ran T**|_{X**\X} ⊆ Y**\Y; see also [161, Section II-4]. If X is reflexive, then T is (trivially) tauberian. See also Fact 2.3.5.
TFAE An acronym for "The following are equivalent".
total overrelaxation See Table 8.2.
unique (monotone operator) See Definition 12.2.4.(vi).
unit ball See under Bx.
unrelaxed algorithm See Table 8.2.
vector lattice Suppose (X, ≤) is an ordered vector space. Let A be a subset of X. Recall that x ∈ X is an upper bound of A, written x ≥ A, if a ≤ x, ∀a ∈ A; an element x_0 ∈ X is called the supremum or least upper bound of A, if it is an upper bound such that x ≥ A implies x ≥ x_0. (Lower bounds and infima are defined analogously.) Then (X, ≤) is a vector lattice, if every two-point subset {x, y} of X has a supremum and an infimum, which are denoted by x ∨ y and x ∧ y, respectively.
Vonnegut Reading work by the author Kurt Vonnegut always helped me to see things
from a different perspective.
von Neumann/Halperin result See Fact 9.6.2 and Section 11.2.
(w) See Definition 14.3.7.
weakly Cauchy sequence A sequence (x_n) in a Banach space X is weakly Cauchy, if ((x*, x_n)) is a Cauchy sequence in ℝ, ∀x* ∈ X*.
weakly compact operator A continuous linear operator T from a Banach space X to a Banach space Y is weakly compact, if ran T**|_{X**\X} ⊆ Y. If X is reflexive, then T is (trivially) weakly compact. See also Fact 2.3.4.
weakly sequentially Cauchy A Banach space X is weakly sequentially Cauchy, if every bounded sequence in X has a subsequence that is weakly Cauchy. See Fact 2.3.13.
weakly sequentially complete A Banach space X is weakly sequentially complete, if
every weakly Cauchy sequence in X is actually weakly convergent.
weighted algorithm See Table 8.1.
weights We say that finitely many nonnegative real numbers w_1, ..., w_N are (nonnegative) weights, if Σ_i w_i = 1. If each w_i is positive, then we speak of positive weights.
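A small helper (a sketch of ours, not from the thesis) makes the definition concrete: normalize finitely many nonnegative numbers so they sum to 1, then distinguish nonnegative from positive weights.

```python
# Turn nonnegative inputs into weights summing to 1.
def make_weights(values):
    if any(v < 0 for v in values):
        raise ValueError("inputs must be nonnegative")
    total = sum(values)
    if total == 0:
        raise ValueError("at least one input must be positive")
    return [v / total for v in values]

w = make_weights([2, 0, 6])
assert w == [0.25, 0.0, 0.75]
assert sum(w) == 1.0
assert not all(wi > 0 for wi in w)   # nonnegative, but not positive, weights
```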
WLOG An acronym for "Without loss of generality".
X In Part I, X always denotes a (real) Hilbert space, whereas in Part II, X stands for a (real) Banach space.
X* The dual space of a Banach space X.