arXiv:2108.13994v1 [math.OC] 31 Aug 2021


Abstract strongly convergent variants of the proximal point algorithm

Andrei Sipoș^{a,b}

^a Research Center for Logic, Optimization and Security (LOS), Department of Computer Science, Faculty of Mathematics and Computer Science, University of Bucharest, Academiei 14, 010014 Bucharest, Romania

^b Simion Stoilow Institute of Mathematics of the Romanian Academy, Calea Griviței 21, 010702 Bucharest, Romania

E-mail: [email protected]

Abstract

We prove an abstract form of the strong convergence of the Halpern-type and Tikhonov-type proximal point algorithms in CAT(0) spaces. In addition, we derive uniform and computable rates of metastability (in the sense of Tao) for these iterations using proof mining techniques.

Mathematics Subject Classification 2010: 90C25, 46N10, 47J25, 47H09, 03F10.

Keywords: Halpern iteration, proximal point algorithm, CAT(0) spaces, jointly firmly nonexpansive families, proof mining, rates of metastability.

1 Introduction

The proximal point algorithm is a fundamental tool of convex optimization, usually attributed to Martinet [38], Rockafellar [44] (who named it) and Brézis and Lions [9]. In its many variants, it usually operates by iterating on a starting point – in, say, a Hilbert space – a sequence of mappings dubbed “resolvents”, whose fixed points coincide with the solutions of the optimization problem that one is aiming at. Thus, if for any $\gamma > 0$ one denotes the resolvent of order $\gamma$ corresponding to the given problem by $J_\gamma$, then one selects a sequence $(\gamma_n)$ of ‘step-sizes’ and then forms the iterative sequence which bears the name ‘proximal point algorithm’ by putting, for any $n$, $x_{n+1}$ to be equal to $J_{\gamma_n} x_n$.

Unfortunately, this class of algorithms is usually only weakly convergent: that strong convergence does not always hold has been shown by Güler [19]. A natural question, then, is how to modify the algorithm into a strongly convergent one. A source of inspiration was found in the iterations commonly used in metric fixed point theory, for example the iteration introduced by Halpern in [20], which bears his name and which is used to find fixed points of e.g. a self-mapping $T$ of the space and which, for a given ‘anchor’ point $u$ in the space and a sequence of ‘weights’ $(\alpha_n)$, constructs, for each $n$, the step $x_{n+1}$ as

\[ \alpha_n u + (1 - \alpha_n) T x_n. \]

In order to guarantee strong convergence of the algorithm, one usually imposes some condition on the sequence $(\alpha_n)$, for example

\[ \lim_{n \to \infty} \alpha_n = 0, \qquad \sum_{n=0}^{\infty} \alpha_n = \infty. \]

It was more or less known since Halpern (see [47, Theorem 6] for an updated proof) that the above two conditions are necessary for strong convergence when $T$ is a nonexpansive mapping, but they may not be sufficient: Halpern himself in his original paper [20] proved strong convergence using some highly restrictive additional conditions which excluded the natural choice $\alpha_n := 1/(n+1)$. Only in the 1990s, Wittmann [52] managed to show strong convergence under a weaker additional condition, which included that choice.


Given this, it is then natural to consider what Kamimura and Takahashi [22] and Xu [53] independently introduced as the Halpern-type proximal point algorithm, where the map $T$ in the Halpern iteration above is replaced by the resolvent $J_{\gamma_n}$ from the proximal point algorithm, and which strongly converges in Hilbert spaces if one imposes the Halpern conditions above and the condition $\lim_{n \to \infty} \gamma_n = \infty$. (A related modification known as the Tikhonov regularization was studied in [32] and especially in [54].) Aoyama and Toyoda [2] have recently shown that the Halpern proximal point algorithm converges in Banach spaces which are uniformly convex and uniformly smooth if one imposes in addition to the two Halpern conditions above just the condition that the sequence $(\gamma_n)$ is bounded below away from 0, their proof making use of the property of the resolvents being strongly nonexpansive. (In particular, the usual Halpern iteration had already been shown by Saejung [45] to strongly converge for strongly nonexpansive $T$ with just the two Halpern conditions.)

In the last two decades, there has been a continued interest in extending results in fixed point theory and convex optimization from linear spaces like Hilbert or Banach spaces to nonlinear ones, chief among them being CAT(0) spaces (to be defined in the next section), frequently regarded as the rightful nonlinear generalization of Hilbert spaces. The first adaptation of the proximal point algorithm to this context was obtained in [5] by Bačák, who also authored the book [6], which serves as a general reference for convex optimization in CAT(0) spaces.

It has been observed by Eckstein [15] that the arguments used to prove the convergence of the usual proximal point algorithm “hinge primarily on the firmly nonexpansive properties of the resolvents”. Inspired by this remark, the author, together with L. Leuștean and A. Nicolae, has introduced in [34] the concept of jointly firmly nonexpansive family of mappings, in the context of CAT(0) spaces, which allows for a highly abstract proof of the proximal point algorithm’s convergence, encompassing virtually all known variants in the literature. (It was not accidental that we have already presented resolvents above in a quite abstract way.) We have recently revisited [48] the concept, providing a conceptual characterization of it and showing how it may be used to prove other kinds of results usually associated with resolvent-type mappings.

The goal of this paper is to present strongly convergent variants of the proximal point algorithm in the framework of jointly firmly nonexpansive families of mappings in CAT(0) spaces.

We chose to adapt the proof of Aoyama and Toyoda [2] due to the fact that it uses the weakest conditions known so far; even though full strong nonexpansiveness is not available in nonlinear spaces like CAT(0) spaces, what one needs for their proof to go through is the uniform strengthening of the weaker notion of strong quasi-nonexpansiveness, a strengthening which – with a quantitative modulus – was introduced by Ulrich Kohlenbach in [25].

The quantitative nature of this notion is due to the fact that the investigations in [25] tie into the area of proof mining, an applied subfield of mathematical logic which aims to analyze proofs in concrete mathematics using tools from proof theory in order to extract additional information from them. This program in its current form has been developed in the last decades primarily by Kohlenbach and his collaborators – see [23] for a comprehensive monograph; a recent survey which serves as a short and accessible introduction is [26]. It would be natural, then, to ask for a rate of convergence for the iterations we mentioned above; unfortunately, rates of convergence for iterative sequences which are commonly employed in nonlinear analysis and convex optimization may not be uniform or computable (see [39]). Kohlenbach’s work then suggests that one should look instead at the following (classically but not constructively) equivalent form of the Cauchy property (actually identifiable in mathematical logic as its Herbrand normal form):

\[ \forall \varepsilon > 0 \ \forall g : \mathbb{N} \to \mathbb{N} \ \exists N \in \mathbb{N} \ \forall i, j \in [N, N + g(N)] \ \left( \|x_i - x_j\| \le \varepsilon \right), \]

which has been arrived at independently by Terence Tao in his own work on ergodic theory [51] and popularized in [50] – as a result of the latter, the property got its name of metastability (at the suggestion of Jennifer Chayes). Kohlenbach’s metatheorems then guarantee the existence of a computable and uniform rate of metastability – a bound $\Theta(\varepsilon, g)$ on the $N$ in the sentence above; and this research program of proof mining has achieved over the years a number of non-trivial extractions of such rates from celebrated strong convergence proofs, see, e.g., [24, 28, 31].

Some linear space variants of the Halpern-type proximal point algorithm have already been analyzed from the point of view of proof mining – specifically by Kohlenbach [27] (whose analysis we shall follow closely, given that he analyzed the original proof of Aoyama and Toyoda [2]), as well as by Pinto [40] (who analyzed the proof of Xu [53] mentioned above) and Leuștean and Pinto [35].


The main obstacle in producing our analysis is, as suggested before, strong nonexpansiveness. We have said that for the usual (non-quantitative) proof we can use only the uniform version of the strong quasi-nonexpansive property – witnessed here by Lemma 3.1 – but this turns out not to be enough for the quantitative version. What we do is to mine the proof of that lemma in order to obtain a further ‘quantitative quasiness’ property in the form of Proposition 4.3, which gives us exactly the necessary ingredient for the proof to go through, namely the analogue of [27, Lemma 8] in Kohlenbach’s original analysis.

Section 2 presents the general concepts we shall need regarding CAT(0) spaces, self-mappings of them (including the jointly firmly nonexpansive families mentioned above) and techniques to prove convergence. We chose to present the qualitative convergence results distinctly from the quantitative ones, so that they could stand on their own. Thus, the main convergence theorems can be found in Section 3 – Theorem 3.4 and Corollary 3.5, showing the strong convergence of the Halpern-type and of the Tikhonov-type proximal point algorithm, respectively – while their corresponding quantitative versions, yielding rates of metastability, can be found in Section 4.

2 Preliminaries

One says that a metric space $(X, d)$ is geodesic if for any two points $x, y \in X$ there is a geodesic that joins them, i.e. a mapping $\gamma : [0, 1] \to X$ such that $\gamma(0) = x$, $\gamma(1) = y$ and for any $t, t' \in [0, 1]$ we have that

\[ d(\gamma(t), \gamma(t')) = |t - t'| \, d(x, y). \]

Among geodesic spaces, a subclass that is usually considered (e.g. in convex optimization) to be the rightful nonlinear generalization of Hilbert spaces is the class of CAT(0) spaces, introduced by A. Aleksandrov [1] and named as such by M. Gromov [18], defined as those geodesic spaces $(X, d)$ such that for any geodesic $\gamma : [0, 1] \to X$ and for any $z \in X$ and $t \in [0, 1]$ we have that

\[ d^2(z, \gamma(t)) \le (1 - t) d^2(z, \gamma(0)) + t d^2(z, \gamma(1)) - t(1 - t) d^2(\gamma(0), \gamma(1)). \]
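As a small sanity check of the defining inequality (a sketch that is not part of the original paper), one can evaluate both sides in the Euclidean plane, where the unique geodesic is the segment $(1-t)x + ty$ and the inequality is known to hold with equality. The helper names `dist2`, `geodesic` and `cat0_gap` are illustrative choices; exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction as F


def dist2(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def geodesic(x, y, t):
    """Point (1 - t) x + t y on the segment joining x and y."""
    return tuple((1 - t) * a + t * b for a, b in zip(x, y))


def cat0_gap(z, x, y, t):
    """Right-hand side minus left-hand side of the CAT(0) inequality."""
    lhs = dist2(z, geodesic(x, y, t))
    rhs = (1 - t) * dist2(z, x) + t * dist2(z, y) - t * (1 - t) * dist2(x, y)
    return rhs - lhs


# In a Hilbert space the gap vanishes; in a general CAT(0) space it is >= 0.
gap = cat0_gap((F(1), F(2)), (F(0), F(0)), (F(4), F(-2)), F(1, 3))
```

A nonnegative gap for every choice of points and of $t$ is exactly what the definition demands; the Euclidean plane is the boundary case.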

Another well-known fact about CAT(0) spaces is that each such space $(X, d)$ is uniquely geodesic – that is, for any $x, y \in X$ there is a unique geodesic $\gamma : [0, 1] \to X$ that joins them – and in this context we shall denote, for any $t \in [0, 1]$, the point $\gamma(t)$ by $(1 - t)x + ty$. Note also that any CAT(0) space $X$ is Busemann convex – i.e., for any $x, y, u, v \in X$ and $t \in [0, 1]$,

\[ d((1 - t)x + ty, (1 - t)u + tv) \le (1 - t) d(x, u) + t d(y, v), \]

and in particular, for any $x, u, v \in X$ and $t \in [0, 1]$,

\[ d(x, (1 - t)u + tv) \le (1 - t) d(x, u) + t d(x, v). \]

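In 2008, Berg and Nikolaev proved (see [8, Proposition 14]) that in any metric space $(X, d)$, the function $\langle \cdot, \cdot \rangle : X^2 \times X^2 \to \mathbb{R}$, defined, for any $x, y, u, v \in X$, by

\[ \langle \overrightarrow{xy}, \overrightarrow{uv} \rangle := \frac{1}{2} \left( d^2(x, v) + d^2(y, u) - d^2(x, u) - d^2(y, v) \right) \]

(where an ordered pair of points $(a, b) \in X^2$ is denoted by $\overrightarrow{ab}$), called the quasi-linearization function, is the unique one such that, for any $x, y, u, v, w \in X$, we have that:

(i) $\langle \overrightarrow{xy}, \overrightarrow{xy} \rangle = d^2(x, y)$;

(ii) $\langle \overrightarrow{xy}, \overrightarrow{uv} \rangle = \langle \overrightarrow{uv}, \overrightarrow{xy} \rangle$;

(iii) $\langle \overrightarrow{yx}, \overrightarrow{uv} \rangle = -\langle \overrightarrow{xy}, \overrightarrow{uv} \rangle$;

(iv) $\langle \overrightarrow{xy}, \overrightarrow{uv} \rangle + \langle \overrightarrow{xy}, \overrightarrow{vw} \rangle = \langle \overrightarrow{xy}, \overrightarrow{uw} \rangle$.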


The inner product notation is justified by the fact that if $X$ is a (real) Hilbert space, for any $x, y, u, v \in X$,

\[ \langle \overrightarrow{xy}, \overrightarrow{uv} \rangle = \langle x - y, u - v \rangle = \langle y - x, v - u \rangle. \tag{1} \]

The main result of [8], Theorem 1, characterized CAT(0) spaces as being exactly those geodesic spaces $(X, d)$ such that the corresponding Cauchy-Schwarz inequality is satisfied, i.e. for any $x, y, u, v \in X$,

\[ \langle \overrightarrow{xy}, \overrightarrow{uv} \rangle \le d(x, y) d(u, v). \tag{2} \]
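To make the quasi-linearization function concrete, the sketch below (an illustration, not taken from the paper) computes it from squared distances alone in the Euclidean plane and compares it with the inner product $\langle y - x, v - u \rangle$ from (1). The function names `dist2`, `quasi_lin` and `inner` are hypothetical, and the integer points are arbitrary.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def quasi_lin(x, y, u, v):
    """<xy, uv> := (d^2(x,v) + d^2(y,u) - d^2(x,u) - d^2(y,v)) / 2."""
    return (dist2(x, v) + dist2(y, u) - dist2(x, u) - dist2(y, v)) / 2


def inner(x, y, u, v):
    """Hilbert-space inner product <y - x, v - u>, as in identity (1)."""
    return sum((b - a) * (d - c) for a, b, c, d in zip(x, y, u, v))


x, y, u, v = (0, 0), (3, 1), (1, 2), (-2, 5)
value = quasi_lin(x, y, u, v)  # uses only the metric
reference = inner(x, y, u, v)  # uses the linear structure
```

Only the metric enters `quasi_lin`, which is why the same formula makes sense in an arbitrary metric space; the Cauchy-Schwarz inequality (2) then reads `value ** 2 <= dist2(x, y) * dist2(u, v)` whenever `value` is nonnegative.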

We shall use, in addition, the following inequality connected to the quasi-linearization function.

Lemma 2.1. Let $X$ be a CAT(0) space, $x, y, z \in X$ and $t \in [0, 1]$. Then

\[ d^2((1 - t)x + ty, z) \le (1 - t)^2 d^2(x, z) + 2t \langle \overrightarrow{yz}, \overrightarrow{[(1 - t)x + ty]z} \rangle. \]

Proof. Set $u := (1 - t)x + ty$. From the defining property of CAT(0) spaces, we have that

\[ d^2(z, u) \le (1 - t) d^2(z, x) + t d^2(z, y) - t(1 - t) d^2(x, y). \]

Multiplying the above by $(1 - t)$, and keeping in mind that $d(y, u) = (1 - t) d(x, y)$, we have that

\begin{align*}
(1 - t) d^2(z, u) &\le (1 - t)^2 d^2(z, x) + t(1 - t) d^2(z, y) - t(1 - t)^2 d^2(x, y) \\
&\le (1 - t)^2 d^2(z, x) + t d^2(z, y) - t(1 - t)^2 d^2(x, y) \\
&\le (1 - t)^2 d^2(z, x) + t d^2(z, y) - t d^2(y, u),
\end{align*}

so, by adding $t d^2(z, u)$, we get that

\[ d^2(z, u) \le (1 - t)^2 d^2(x, z) + t \left( d^2(z, y) + d^2(z, u) - d^2(y, u) \right) = (1 - t)^2 d^2(x, z) + 2t \langle \overrightarrow{yz}, \overrightarrow{uz} \rangle, \]

which is what we needed to show.

We shall fix now a complete CAT(0) space $X$ for the remainder of this paper, and throughout the paper, for any self-mapping $T$ of $X$, we shall denote the set of its fixed points by $\mathrm{Fix}(T)$.

A self-mapping $T$ of $X$ is called nonexpansive if for all $x, y \in X$, $d(Tx, Ty) \le d(x, y)$. If $C \subseteq X$ is closed, convex and nonempty, then there exists a corresponding nearest point projection operator, which one usually denotes by $P_C : X \to C$.

A fundamental property of nonexpansive mappings is the following so-called ‘resolvent convergence’ result, and the original idea of its proof essentially goes back to Minty [21], and was later popularized by Halpern [20]. The generalization to CAT(0) spaces stated below is due to Saejung [46].

Theorem 2.2 (cf. [46, Lemmas 2.1 and 2.2]). Let $T : X \to X$ be nonexpansive with $\mathrm{Fix}(T) \neq \emptyset$ and $u \in X$. We have that, for any $t \in (0, 1)$ there is a unique $z \in X$ having the property $z = tu + (1 - t)Tz$, and we denote it by $z_t$. Then, we have that $\lim_{t \to 0} z_t = P_{\mathrm{Fix}(T)} u$.

Firmly nonexpansive mappings were first introduced, as a refinement of nonexpansive mappings, by Browder [10] in the context of Hilbert spaces and then by Bruck [11] in the context of Banach spaces (this later definition was also studied, e.g., in [41]). The following generalization to geodesic spaces, inspired by the study of firmly nonexpansive mappings in the Hilbert ball [16, 17, 42, 43], was introduced in [3].

Definition 2.3. A mapping $T : X \to X$ is called firmly nonexpansive if for any $x, y \in X$ and any $t \in [0, 1]$ we have that

\[ d(Tx, Ty) \le d((1 - t)x + tTx, (1 - t)y + tTy). \]

As mentioned in [4] (see also [29]), every firmly nonexpansive mapping $T : X \to X$ satisfies the so-called property (P2), i.e. that for all $x, y \in X$,

\[ 2 d^2(Tx, Ty) \le d^2(x, Ty) + d^2(y, Tx) - d^2(x, Tx) - d^2(y, Ty), \]

or, using the quasi-linearization function,

\[ d^2(Tx, Ty) \le \langle \overrightarrow{TxTy}, \overrightarrow{xy} \rangle. \tag{3} \]
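For a concrete instance of property (P2), the following sketch tests the inequality on a grid of the real line for soft-thresholding, i.e. the proximal mapping of $|\cdot|$, which is a standard example of a firmly nonexpansive map. This is an illustrative check under that assumption rather than anything stated in the paper; the names `soft` and `p2_holds` are hypothetical.

```python
from fractions import Fraction as F
from itertools import product


def soft(x, gamma=F(1)):
    """Proximal map of gamma * |.| on the real line (soft-thresholding)."""
    if x > gamma:
        return x - gamma
    if x < -gamma:
        return x + gamma
    return F(0)


def p2_holds(T, x, y):
    """Property (P2): 2 d^2(Tx,Ty) <= d^2(x,Ty) + d^2(y,Tx) - d^2(x,Tx) - d^2(y,Ty)."""
    d2 = lambda a, b: (a - b) ** 2
    lhs = 2 * d2(T(x), T(y))
    rhs = d2(x, T(y)) + d2(y, T(x)) - d2(x, T(x)) - d2(y, T(y))
    return lhs <= rhs


# Grid of rationals in [-3, 3] with step 1/4; arithmetic is exact.
grid = [F(k, 4) for k in range(-12, 13)]
all_ok = all(p2_holds(soft, x, y) for x, y in product(grid, grid))
```

A map that merely fails to be nonexpansive, such as `lambda x: 2 * x`, violates the inequality on the same grid, which gives a quick negative control.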


If $X$ is a Hilbert space, property (P2) coincides with firm nonexpansiveness as (3) and (1) yield $\|Tx - Ty\|^2 \le \langle Tx - Ty, x - y \rangle$, which is equivalent to it e.g. by [7, Proposition 4.2]. Moreover, from this formulation given by (3) one immediately obtains, using (2), that a self-mapping of a CAT(0) space satisfying property (P2) is nonexpansive.

Following [34, 48], if $T$ and $U$ are self-mappings of $X$ and $\lambda, \mu > 0$, we say that $T$ and $U$ are $(\lambda, \mu)$-mutually firmly nonexpansive if for all $x, y \in X$ and all $\alpha, \beta \in [0, 1]$ such that $(1 - \alpha)\lambda = (1 - \beta)\mu$, one has that

\[ d(Tx, Uy) \le d((1 - \alpha)x + \alpha Tx, (1 - \beta)y + \beta Uy). \]

If $(T_n)_{n \in \mathbb{N}}$ is a family of self-mappings of $X$ and $(\gamma_n)_{n \in \mathbb{N}} \subseteq (0, \infty)$, we say that $(T_n)$ is jointly firmly nonexpansive with respect to $(\gamma_n)$ if for all $n, m \in \mathbb{N}$, $T_n$ and $T_m$ are $(\gamma_n, \gamma_m)$-mutually firmly nonexpansive. In addition, if $(T_\gamma)_{\gamma > 0}$ is a family of self-mappings of $X$, we say that it is plainly jointly firmly nonexpansive if for all $\lambda, \mu > 0$, $T_\lambda$ and $T_\mu$ are $(\lambda, \mu)$-mutually firmly nonexpansive. It is clear that a family $(T_\gamma)$ is jointly firmly nonexpansive if and only if for every $(\gamma_n)_{n \in \mathbb{N}} \subseteq (0, \infty)$, $(T_{\gamma_n})_{n \in \mathbb{N}}$ is jointly firmly nonexpansive with respect to $(\gamma_n)$. In [34] it was shown that examples of jointly firmly nonexpansive families of mappings are furnished by resolvent-type mappings used in convex optimization – specifically, by:

• the family $(J_{\gamma f})_{\gamma > 0}$, where $f$ is a proper convex lower semicontinuous function on $X$ and one denotes for any such function $g$ its proximal mapping by $J_g$;

• the family $(R_{T,\gamma})_{\gamma > 0}$, where $T$ is a nonexpansive self-mapping of $X$ and one denotes, for any $\gamma > 0$, its resolvent of order $\gamma$ by $R_{T,\gamma}$;

• (if $X$ is a Hilbert space) the family $(J_{\gamma A})_{\gamma > 0}$, where $A$ is a maximally monotone operator on $X$ and one denotes for any such operator $B$ its resolvent by $J_B$.

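Again, if $T$ and $U$ are self-mappings of $X$ and $\lambda, \mu > 0$, one says that $T$ and $U$ are $(\lambda, \mu)$-mutually (P2) if for all $x, y \in X$,

\[ \frac{1}{\mu} \left( d^2(Tx, Uy) + d^2(y, Uy) - d^2(y, Tx) \right) \le \frac{1}{\lambda} \left( d^2(x, Uy) - d^2(x, Tx) - d^2(Tx, Uy) \right), \]

or, using the quasi-linearization function,

\[ \frac{1}{\mu} \langle \overrightarrow{TxUy}, \overrightarrow{yUy} \rangle \le \frac{1}{\lambda} \langle \overrightarrow{TxUy}, \overrightarrow{xTx} \rangle. \]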

Proposition 2.4 (cf. [34, Proposition 3.10]). Let $\lambda, \mu > 0$ and $T$ and $U$ be $(\lambda, \mu)$-mutually (P2) self-mappings of $X$. Then, for all $x \in X$,

\[ d(Tx, Ux) \le \frac{|\lambda - \mu|}{\lambda} d(x, Tx). \]

Corollary 2.5. Let $\lambda, \mu > 0$ and $T$ and $U$ be $(\lambda, \mu)$-mutually (P2) self-mappings of $X$. Then, for all $x \in X$,

\[ d(x, Ux) \le \left( 2 + \frac{\mu}{\lambda} \right) d(x, Tx). \]

Proof. Let $x \in X$. Then

\[ d(x, Ux) \le d(x, Tx) + d(Tx, Ux) \le d(x, Tx) + \frac{|\lambda - \mu|}{\lambda} d(x, Tx) \le \left( 2 + \frac{\mu}{\lambda} \right) d(x, Tx). \]

Corollary 2.6 (cf. [34, Corollary 3.11]). Any two mutually (P2) self-mappings of $X$ have the same fixed points.
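The bounds of Proposition 2.4 and Corollary 2.5 can be observed numerically on a resolvent-type family. The sketch below assumes, as an illustration, the proximal mappings of $\gamma |\cdot|$ on the real line (soft-thresholding with threshold $\gamma$), taking $T$ of order $\lambda$ and $U$ of order $\mu$; the names `soft` and `bounds_hold` are hypothetical.

```python
from fractions import Fraction as F
from itertools import product


def soft(x, gamma):
    """Proximal map of gamma * |.| on the real line."""
    if x > gamma:
        return x - gamma
    if x < -gamma:
        return x + gamma
    return F(0)


def bounds_hold(x, lam, mu):
    """Check Proposition 2.4 and Corollary 2.5 at x for T = soft(., lam), U = soft(., mu)."""
    Tx, Ux = soft(x, lam), soft(x, mu)
    prop_2_4 = abs(Tx - Ux) <= abs(lam - mu) / lam * abs(x - Tx)
    cor_2_5 = abs(x - Ux) <= (2 + mu / lam) * abs(x - Tx)
    return prop_2_4 and cor_2_5


points = [F(k, 4) for k in range(-16, 17)]
orders = [F(1, 2), F(1), F(3, 2), F(3)]
all_ok = all(bounds_hold(x, lam, mu) for x, lam, mu in product(points, orders, orders))
```

Both estimates show that a point which is almost fixed by one member of the family is almost fixed by every other member, with a constant depending only on the ratio of the orders.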


One may then similarly state the corresponding definitions for jointly (P2) families of mappings. As shown in [34], all those (P2) notions generalize their firmly nonexpansive counterparts and coincide with them in the case where $X$ is a Hilbert space. The main result of that paper showed that this condition suffices for the working of the proximal point algorithm, namely that if $X$ is complete, $(T_n)_{n \in \mathbb{N}}$ is a family of self-mappings of $X$ with a common fixed point and $(\gamma_n)_{n \in \mathbb{N}} \subseteq (0, \infty)$ with $\sum_{n=0}^{\infty} \gamma_n^2 = \infty$, then, assuming that $(T_n)$ is jointly (P2) with respect to $(\gamma_n)$, any sequence $(x_n) \subseteq X$ such that for all $n$, $x_{n+1} = T_n x_n$, $\Delta$-converges (a generalization of weak convergence to arbitrary metric spaces, due to Lim [36]) to a common fixed point of the family. Moreover, in [48], the reason for the effectiveness of this sort of condition was further elucidated: Theorem 3.3 of that paper shows that a family of self-mappings is jointly firmly nonexpansive if and only if each mapping in it is nonexpansive and the family as a whole satisfies the well-known resolvent identity.

We will need some facts about sequences of reals. A function $\tau : \mathbb{N} \to \mathbb{N}$ is said to be unboundedly increasing if $\lim_{n \to \infty} \tau(n) = \infty$ and for all $n \in \mathbb{N}$, $\tau(n) \le \tau(n + 1)$. The following result is immediate.

Lemma 2.7 ([2, Lemma 2.6]). Let $(a_n) \subseteq \mathbb{R}$ be a sequence converging to 0 and $\tau : \mathbb{N} \to \mathbb{N}$ be unboundedly increasing. Then $\lim_{n \to \infty} a_{\tau(n)} = 0$.

Lemma 2.8 (cf. [37, Lemma 3.1]). Let $(a_n) \subseteq \mathbb{R}$ and $(n_j)$ be a strictly increasing sequence of natural numbers. Assume that for all $j \in \mathbb{N}$, $a_{n_j} < a_{n_j + 1}$. Define $\tau : \mathbb{N} \to \mathbb{N}$ by setting, for all $n \in \mathbb{N}$,

\[ \tau(n) := \max \{ k \le \max(n_0, n) \mid a_k < a_{k+1} \}. \]

Then:

• $\tau$ is unboundedly increasing;

• for all $n \in \mathbb{N}$, $a_{\tau(n)} \le a_{\tau(n)+1}$ and, for all $n \ge n_0$, $a_n \le a_{\tau(n)+1}$.
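The function $\tau$ of Lemma 2.8 simply tracks the last index, up to the current one, at which the sequence strictly increases. The following sketch computes it on a finite prefix of a sequence; the list `a` is an arbitrary example, `n0` is assumed to be the first index with $a_{n_0} < a_{n_0+1}$, and the name `tau` mirrors the lemma.

```python
def tau(a, n, n0):
    """tau(n) = max{k <= max(n0, n) : a[k] < a[k+1]}, as in Lemma 2.8.

    Requires a[n0] < a[n0 + 1], so that the set is nonempty, and
    len(a) > max(n0, n) + 1, so that every a[k + 1] is available.
    """
    bound = max(n0, n)
    return max(k for k in range(bound + 1) if a[k] < a[k + 1])


a = [3, 1, 2, 1, 0, 4, 2, 5, 1]  # strict ascents at k = 1, 4, 6
n0 = 1
taus = [tau(a, n, n0) for n in range(8)]

# The two conclusions of the lemma, checked on this prefix.
ascent_at_tau = all(a[t] <= a[t + 1] for t in taus)
dominated = all(a[n] <= a[tau(a, n, n0) + 1] for n in range(n0, 8))
```

On this example `taus` is nondecreasing, and each term of the sequence from index `n0` on is bounded by the term following the most recent ascent, which is how the lemma is used in the proof of Theorem 3.4.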

Corollary 2.9 ([2, Lemma 2.7]). Let $(a_n)$ be a non-convergent sequence of nonnegative real numbers. Then there is an $N \in \mathbb{N}$ and an unboundedly increasing $\tau : \mathbb{N} \to \mathbb{N}$ such that for all $n \in \mathbb{N}$, $a_{\tau(n)} \le a_{\tau(n)+1}$ and, for all $n \ge N$, $a_n \le a_{\tau(n)+1}$.

Proof. Assume that there is an $n$ such that for all $p > n$, $a_p \ge a_{p+1}$. Then $(a_n)$ is bounded and eventually monotone, so it is convergent, a contradiction. Thus, for all $n$, there is a $p > n$ with $a_p < a_{p+1}$, and by iterating this statement we obtain a sequence $(n_j)$ as in the hypothesis of Lemma 2.8. By applying that lemma, we obtain the desired conclusion.

The following lemma is widely used in fixed point theory.

Lemma 2.10 ([2, Lemma 2.8]). Let $(a_n) \subseteq [0, \infty)$, $(\beta_n) \subseteq \mathbb{R}$ and $(\alpha_n) \subseteq [0, 1]$. Suppose that $\sum_{n=0}^{\infty} \alpha_n = \infty$, $\limsup_{n \to \infty} \beta_n \le 0$, and, for all $n$,

\[ a_{n+1} \le (1 - \alpha_n) a_n + \alpha_n \beta_n. \]

Then $\lim_{n \to \infty} a_n = 0$.
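A quick numerical illustration of Lemma 2.10, under assumptions chosen here for convenience: the recursion is run with equality (the worst case allowed by the hypothesis), with $\alpha_n = 1/(n+2)$, $\beta_n = 1/(n+1)$ and $a_0 = 1$. The function name `iterate` is hypothetical.

```python
def iterate(a0, alpha, beta, steps):
    """Run a_{n+1} = (1 - alpha_n) a_n + alpha_n beta_n for the given number of steps."""
    a = a0
    for n in range(steps):
        a = (1 - alpha(n)) * a + alpha(n) * beta(n)
    return a


alpha = lambda n: 1.0 / (n + 2)  # sums to infinity
beta = lambda n: 1.0 / (n + 1)   # tends to 0, so limsup <= 0

a_100 = iterate(1.0, alpha, beta, 100)
a_10000 = iterate(1.0, alpha, beta, 10000)
```

For this particular choice one can verify by hand that $(n+1) a_n = 1 + H_n$, with $H_n$ the $n$-th harmonic number, so the decay to 0 is slow (of order $\log n / n$) but it does take place, as the lemma predicts.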

We shall also use, for any $a, b \in \mathbb{N}$, the notation $[a, b] := \{ n \in \mathbb{N} \mid a \le n \le b \}$, disambiguating it by the context from the real interval $[a, b]$ – we also note that if $a > b$, then $[a, b] = \emptyset$ – thus one has that $a, b \in [a, b]$ only if $a \le b$, a fact which one must remember to check whenever this property is used.

3 Convergence theorems

3.1 Preparatory lemmas

In this subsection, we state a number of lemmas and propositions which will help us in proving the main convergence theorems, which we do in the next subsection.

The following lemma shows that (P2) mappings have the property, defined in [25, Section 4], of uniform strong quasi-nonexpansiveness, and gives the corresponding ‘SQNE-modulus’.


Lemma 3.1. Let $\varepsilon, b > 0$, $z \in X$, $T : X \to X$ a (P2) mapping and $p \in \mathrm{Fix}(T)$. Assume that $d(z, p) \le b$. Then, if

\[ d(z, p) - d(Tz, p) < \frac{\varepsilon^2}{2b}, \]

we have that $d(z, Tz) < \varepsilon$.

Proof. If $d(z, Tz) = 0$, then $d(z, Tz) < \varepsilon$. Assume, then, that $d(z, Tz) \neq 0$, so $d(z, p) + d(Tz, p) > 0$. Since $T$ is (P2) and $p \in \mathrm{Fix}(T)$, we have that

\[ 2 d^2(Tz, p) \le d^2(z, p) + d^2(Tz, p) - d^2(z, Tz), \]

so

\[ d^2(Tz, p) \le d^2(z, p) - d^2(z, Tz). \]

Thus (using, for the strict inequality, the fact that $d(z, p) + d(Tz, p) > 0$),

\begin{align*}
d^2(z, Tz) &\le d^2(z, p) - d^2(Tz, p) \\
&\le (d(z, p) - d(Tz, p))(d(z, p) + d(Tz, p)) \\
&< \frac{\varepsilon^2}{2b} \cdot (d(z, p) + d(Tz, p)) \\
&\le \frac{\varepsilon^2}{2b} \cdot 2b = \varepsilon^2,
\end{align*}

so $d(z, Tz) < \varepsilon$.

Corollary 3.2. Let $b > 0$, $(z_n) \subseteq X$, $(S_n)$ be a family of (P2) self-mappings of $X$ and $p$ a common fixed point of the $S_n$’s. Assume that, for all $n \in \mathbb{N}$, $d(z_n, p) \le b$. Then, if

\[ \lim_{n \to \infty} (d(z_n, p) - d(S_n z_n, p)) = 0, \]

we have that $\lim_{n \to \infty} d(z_n, S_n z_n) = 0$.

Proof. Let $\varepsilon > 0$. We have that there is an $N \in \mathbb{N}$ such that for all $n \ge N$, $d(z_n, p) - d(S_n z_n, p) < \varepsilon^2/(2b)$. Then, by Lemma 3.1, for all $n \ge N$, $d(z_n, S_n z_n) < \varepsilon$, from which we get the conclusion.

3.2 Main results

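The following lemma, the analogue of [2, Lemma 2.9], morally forms an integral part of the main convergence proof, so we have chosen to present it in this subsection.

Lemma 3.3. Let $T : X \to X$ be nonexpansive with $\mathrm{Fix}(T) \neq \emptyset$, $(x_n) \subseteq X$ a bounded sequence, $u \in X$ and for all $t \in (0, 1)$, let $z_t$ be the unique point in $X$ such that $z_t = tu + (1 - t)Tz_t$. Then:

(i) for all $t \in (0, 1)$ and $n \in \mathbb{N}$, we have that

\[ \langle \overrightarrow{z_t x_n}, \overrightarrow{z_t u} \rangle \le \frac{t}{2} d^2(x_n, z_t) + \frac{(1 - t)^2}{2t} d(x_n, Tx_n) \left( d(x_n, Tx_n) + 2 d(x_n, z_t) \right); \]

(ii) setting $w := P_{\mathrm{Fix}(T)} u$, so that $\lim_{t \to 0} z_t = w$, and assuming that $\lim_{n \to \infty} d(x_n, Tx_n) = 0$, we have that

\[ \limsup_{n \to \infty} \langle \overrightarrow{uw}, \overrightarrow{x_n w} \rangle \le 0. \]

Proof. (i) Using Lemma 2.1, we have that

\begin{align*}
d^2(z_t, x_n) &\le (1 - t)^2 d^2(Tz_t, x_n) + 2t \langle \overrightarrow{u x_n}, \overrightarrow{z_t x_n} \rangle \\
&\le (1 - t)^2 \left( d(x_n, Tx_n) + d(Tx_n, Tz_t) \right)^2 + 2t \left( \langle \overrightarrow{x_n z_t}, \overrightarrow{x_n z_t} \rangle + \langle \overrightarrow{z_t u}, \overrightarrow{x_n z_t} \rangle \right) \\
&\le (1 - t)^2 \left( d^2(x_n, z_t) + d(x_n, Tx_n)(d(x_n, Tx_n) + 2 d(x_n, z_t)) \right) + 2t \left( d^2(x_n, z_t) - \langle \overrightarrow{z_t x_n}, \overrightarrow{z_t u} \rangle \right),
\end{align*}

from which we get the conclusion.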


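(ii) Using (i) and that $\lim_{n \to \infty} d(x_n, Tx_n) = 0$, we get that for all $t \in (0, 1)$,

\[ \limsup_{n \to \infty} \langle \overrightarrow{z_t x_n}, \overrightarrow{z_t u} \rangle \le \frac{t}{2} \limsup_{n \to \infty} d^2(x_n, z_t). \]

Also, for all $t \in (0, 1)$ and $n \in \mathbb{N}$,

\begin{align*}
\langle \overrightarrow{uw}, \overrightarrow{x_n w} \rangle &= \langle \overrightarrow{uw}, \overrightarrow{x_n w} \rangle - \langle \overrightarrow{uw}, \overrightarrow{x_n z_t} \rangle + \langle \overrightarrow{uw}, \overrightarrow{x_n z_t} \rangle - \langle \overrightarrow{u z_t}, \overrightarrow{x_n z_t} \rangle + \langle \overrightarrow{u z_t}, \overrightarrow{x_n z_t} \rangle \\
&= \langle \overrightarrow{uw}, \overrightarrow{x_n w} \rangle + \langle \overrightarrow{uw}, \overrightarrow{z_t x_n} \rangle + \langle \overrightarrow{uw}, \overrightarrow{x_n z_t} \rangle + \langle \overrightarrow{z_t u}, \overrightarrow{x_n z_t} \rangle + \langle \overrightarrow{z_t x_n}, \overrightarrow{z_t u} \rangle \\
&= \langle \overrightarrow{uw}, \overrightarrow{z_t w} \rangle + \langle \overrightarrow{z_t w}, \overrightarrow{x_n z_t} \rangle + \langle \overrightarrow{z_t x_n}, \overrightarrow{z_t u} \rangle.
\end{align*}

Let $\varepsilon > 0$. As $\lim_{t \to 0} z_t = w$, there is a $t_1 \in (0, 1)$ such that for all $t \in (0, t_1)$,

\[ \langle \overrightarrow{uw}, \overrightarrow{z_t w} \rangle \le d(u, w) d(z_t, w) \le \frac{\varepsilon}{3}, \]

and, using in addition that the set $\{ d(x_n, z_t) \mid n \in \mathbb{N}, t \in (0, 1) \}$ is bounded (since the curve $(z_t)$ is convergent, hence bounded), we get that there is a $t_2 \in (0, 1)$ such that for all $t \in (0, t_2)$ and all $n \in \mathbb{N}$,

\[ \langle \overrightarrow{z_t w}, \overrightarrow{x_n z_t} \rangle \le d(z_t, w) d(x_n, z_t) \le \frac{\varepsilon}{3}, \]

and that there is a $t_3 \in (0, 1)$ such that for all $t \in (0, t_3)$,

\[ \limsup_{n \to \infty} \langle \overrightarrow{z_t x_n}, \overrightarrow{z_t u} \rangle \le \frac{t}{2} \limsup_{n \to \infty} d^2(x_n, z_t) \le \frac{\varepsilon}{3}. \]

Let $t \in (0, 1)$ be smaller than $t_1$, $t_2$ and $t_3$. Then we get that

\[ \limsup_{n \to \infty} \langle \overrightarrow{uw}, \overrightarrow{x_n w} \rangle \le \varepsilon. \]

As $\varepsilon$ was arbitrarily chosen, we obtain the desired conclusion.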

The following is the main strong convergence theorem of this paper, showing the asymptotic behaviour of the Halpern proximal point algorithm for jointly (P2) families of mappings.

Theorem 3.4. Let $(T_n)$ be a family of self-mappings of $X$, $(\gamma_n) \subseteq (0, \infty)$ and $\gamma > 0$ be such that for all $n$, $\gamma_n \ge \gamma$. Assume that the family $(T_n)$ is jointly (P2) with respect to $(\gamma_n)$. Let $F$ be the common fixed point set of the family and assume that $F \neq \emptyset$. Let $(\alpha_n) \subseteq (0, 1]$ be such that $\lim_{n \to \infty} \alpha_n = 0$ and $\sum_{n=0}^{\infty} \alpha_n = \infty$. Let $u \in X$ and $(x_n) \subseteq X$ be such that for all $n$,

\[ x_{n+1} = \alpha_n u + (1 - \alpha_n) T_n x_n. \]

Then $(x_n)$ converges strongly to $P_F u$.
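To see the statement in action, here is a minimal numerical sketch in the Euclidean plane (a Hilbert space, hence a CAT(0) space). All concrete choices are assumptions made for illustration: $f(x_1, x_2) = x_1^2/2$, whose proximal mapping of order $\gamma$ is $(x_1, x_2) \mapsto (x_1/(1+\gamma), x_2)$, so that $F$ is the line $x_1 = 0$ and $P_F u = (0, u_2)$; constant step-sizes $\gamma_n = 1$; weights $\alpha_n = 1/(n+2)$. The function names are hypothetical.

```python
import math


def prox(x, gamma):
    """Proximal map of gamma * f for f(x1, x2) = x1**2 / 2: shrink x1, keep x2."""
    return (x[0] / (1.0 + gamma), x[1])


def halpern_ppa(u, x0, steps, gamma=1.0):
    """Iterate x_{n+1} = alpha_n u + (1 - alpha_n) T_n x_n with alpha_n = 1 / (n + 2)."""
    x = x0
    for n in range(steps):
        alpha = 1.0 / (n + 2)
        tx = prox(x, gamma)
        x = (alpha * u[0] + (1 - alpha) * tx[0],
             alpha * u[1] + (1 - alpha) * tx[1])
    return x


u, x0 = (1.0, 2.0), (5.0, -3.0)
x = halpern_ppa(u, x0, 5000)
target = (0.0, u[1])  # nearest point to u on the line x1 = 0
err = math.dist(x, target)
```

The plain proximal point iteration $x_{n+1} = T_n x_n$ would converge to $(0, -3)$, the point of $F$ determined by the starting point, whereas the anchored iteration is pulled towards the projection of the anchor $u$; the observed error decays roughly like $1/n$ for this choice of weights.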

Proof. Set $w := P_F u$. By Busemann convexity, we have that, for all $n$,

\[ d(x_{n+1}, w) \le \alpha_n d(u, w) + (1 - \alpha_n) d(T_n x_n, w) \le \alpha_n d(u, w) + (1 - \alpha_n) d(x_n, w). \]

By induction, one gets that for all $n$, $d(T_n x_n, w) \le \max(d(u, w), d(x_0, w))$ and thus $(x_n)$ and $(T_n x_n)$ are bounded sequences. Therefore,

\[ \lim_{n \to \infty} d(x_{n+1}, T_n x_n) = \lim_{n \to \infty} (\alpha_n d(u, T_n x_n)) = 0. \]

Also, we have that, for all $n$,

\[ d(x_{n+1}, w) \le \alpha_n d(u, w) + (1 - \alpha_n) d(T_n x_n, w) \le \alpha_n d(u, w) + d(T_n x_n, w), \]


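so, for all $n$,

\[ d(x_{n+1}, w) - d(T_n x_n, w) \le \alpha_n d(u, w). \tag{4} \]

Using Lemma 2.1 and that, for all $n$, $d(T_n x_n, w) \le d(x_n, w)$, we have that, for all $n$,

\[ d^2(x_{n+1}, w) \le (1 - \alpha_n) d^2(x_n, w) + 2 \alpha_n \langle \overrightarrow{uw}, \overrightarrow{x_{n+1} w} \rangle. \tag{5} \]

Claim. The sequence $(d(x_n, w))$ is convergent.

Proof of claim: Assume towards a contradiction that it is not convergent. Then, by Corollary 2.9, there is an $N \in \mathbb{N}$ and an unboundedly increasing $\tau : \mathbb{N} \to \mathbb{N}$ such that for all $n \in \mathbb{N}$, $d(x_{\tau(n)}, w) \le d(x_{\tau(n)+1}, w)$ and, for all $n \ge N$, $d(x_n, w) \le d(x_{\tau(n)+1}, w)$.

For all $n$, we have that $d(T_{\tau(n)} x_{\tau(n)}, w) \le d(x_{\tau(n)}, w)$, so, using (4), we get that, for all $n$,

\[ 0 \le d(x_{\tau(n)}, w) - d(T_{\tau(n)} x_{\tau(n)}, w) \le d(x_{\tau(n)+1}, w) - d(T_{\tau(n)} x_{\tau(n)}, w) \le \alpha_{\tau(n)} d(u, w). \]

By Lemma 2.7, we have that $\lim_{n \to \infty} \alpha_{\tau(n)} = 0$, so, from the above we get that

\[ \lim_{n \to \infty} \left( d(x_{\tau(n)+1}, w) - d(T_{\tau(n)} x_{\tau(n)}, w) \right) = 0, \]

and so, by Corollary 3.2, that

\[ \lim_{n \to \infty} d(T_{\tau(n)} x_{\tau(n)}, x_{\tau(n)}) = 0. \]

By Corollary 2.5, we have that, for all $n$,

\[ d(x_{\tau(n)}, T_{\tau(0)} x_{\tau(n)}) \le \left( 2 + \frac{\gamma_{\tau(0)}}{\gamma_{\tau(n)}} \right) d(x_{\tau(n)}, T_{\tau(n)} x_{\tau(n)}) \le \left( 2 + \frac{\gamma_{\tau(0)}}{\gamma} \right) d(x_{\tau(n)}, T_{\tau(n)} x_{\tau(n)}), \]

from which we get that

\[ \lim_{n \to \infty} d(T_{\tau(0)} x_{\tau(n)}, x_{\tau(n)}) = 0. \]

We may now apply Lemma 3.3 to get that

\[ \limsup_{n \to \infty} \langle \overrightarrow{uw}, \overrightarrow{x_{\tau(n)} w} \rangle \le 0. \tag{6} \]

On the other hand, we have that, for all $n$,

\begin{align*}
d(x_{\tau(n)}, x_{\tau(n)+1}) &\le d(x_{\tau(n)}, T_{\tau(n)} x_{\tau(n)}) + d(T_{\tau(n)} x_{\tau(n)}, x_{\tau(n)+1}) \\
&= d(x_{\tau(n)}, T_{\tau(n)} x_{\tau(n)}) + \alpha_{\tau(n)} d(u, T_{\tau(n)} x_{\tau(n)}),
\end{align*}

so

\[ \lim_{n \to \infty} d(x_{\tau(n)}, x_{\tau(n)+1}) = 0. \]

Since, for all $n$,

\[ \langle \overrightarrow{uw}, \overrightarrow{x_{\tau(n)} x_{\tau(n)+1}} \rangle \le d(u, w) d(x_{\tau(n)}, x_{\tau(n)+1}), \]

we have that

\[ \lim_{n \to \infty} \langle \overrightarrow{uw}, \overrightarrow{x_{\tau(n)} x_{\tau(n)+1}} \rangle = 0. \]

From the above and (6), we get that

\[ \limsup_{n \to \infty} \langle \overrightarrow{uw}, \overrightarrow{x_{\tau(n)+1} w} \rangle \le 0. \]

Using (5), we have that, for all $n$,

\begin{align*}
d^2(x_{\tau(n)+1}, w) &\le (1 - \alpha_{\tau(n)}) d^2(x_{\tau(n)}, w) + 2 \alpha_{\tau(n)} \langle \overrightarrow{uw}, \overrightarrow{x_{\tau(n)+1} w} \rangle \\
&\le (1 - \alpha_{\tau(n)}) d^2(x_{\tau(n)+1}, w) + 2 \alpha_{\tau(n)} \langle \overrightarrow{uw}, \overrightarrow{x_{\tau(n)+1} w} \rangle,
\end{align*}

so, for all $n$,

\[ \alpha_{\tau(n)} d^2(x_{\tau(n)+1}, w) \le 2 \alpha_{\tau(n)} \langle \overrightarrow{uw}, \overrightarrow{x_{\tau(n)+1} w} \rangle. \]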


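Since, for all $n$, $\alpha_{\tau(n)} > 0$, we have that, for all $n$,

\[ d^2(x_{\tau(n)+1}, w) \le 2 \langle \overrightarrow{uw}, \overrightarrow{x_{\tau(n)+1} w} \rangle. \]

Now, since $d(x_n, w) \le d(x_{\tau(n)+1}, w)$ for all $n \ge N$, we have that

\[ \limsup_{n \to \infty} d^2(x_n, w) \le \limsup_{n \to \infty} d^2(x_{\tau(n)+1}, w) \le 2 \limsup_{n \to \infty} \langle \overrightarrow{uw}, \overrightarrow{x_{\tau(n)+1} w} \rangle \le 0, \]

so $\lim_{n \to \infty} d^2(x_n, w) = 0$, which contradicts our assumption that the sequence $(d(x_n, w))$ is not convergent. This finishes the proof of the claim. $\blacksquare$

Now, since, for all $n$, $d(T_n x_n, w) \le d(x_n, w)$, we have that, using (4),

\[ 0 \le d(x_n, w) - d(T_n x_n, w) \le d(x_n, w) + \alpha_n d(u, w) - d(x_{n+1}, w), \]

so

\[ \lim_{n \to \infty} (d(x_n, w) - d(T_n x_n, w)) = 0, \]

and then, by Corollary 3.2, that

\[ \lim_{n \to \infty} d(x_n, T_n x_n) = 0. \]

By Corollary 2.5, we have that, for all $n$,

\[ d(x_n, T_0 x_n) \le \left( 2 + \frac{\gamma_0}{\gamma_n} \right) d(x_n, T_n x_n) \le \left( 2 + \frac{\gamma_0}{\gamma} \right) d(x_n, T_n x_n), \]

from which we get that

\[ \lim_{n \to \infty} d(x_n, T_0 x_n) = 0. \]

We may now apply Lemma 3.3 to get that

\[ \limsup_{n \to \infty} \langle \overrightarrow{uw}, \overrightarrow{x_n w} \rangle \le 0, \]

so we also have that

\[ \limsup_{n \to \infty} 2 \langle \overrightarrow{uw}, \overrightarrow{x_{n+1} w} \rangle \le 0. \]

By the above, Lemma 2.10, and (5), we get that $\lim_{n \to \infty} d^2(x_n, w) = 0$ and hence that $\lim_{n \to \infty} x_n = w$.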

The following is the analogue in our context of [2, Corollary 3.3], giving a convergence theorem for the so-called ‘Tikhonov regularization’ of the proximal point algorithm as discussed in [54]. The fact that convergence results for this kind of iteration may be immediately obtained from the Halpern ones was previously remarked in [33].

Corollary 3.5. Let $(T_n)$ be a family of self-mappings of $X$, $(\gamma_n) \subseteq (0, \infty)$ and $\gamma > 0$ be such that for all $n$, $\gamma_n \ge \gamma$. Assume that the family $(T_n)$ is jointly (P2) with respect to $(\gamma_n)$. Let $F$ be the common fixed point set of the family and assume that $F \neq \emptyset$. Let $(\beta_n) \subseteq (0, 1]$ be such that $\lim_{n \to \infty} \beta_n = 0$ and $\sum_{n=0}^{\infty} \beta_n = \infty$. Let $u \in X$ and $(y_n) \subseteq X$ be such that for all $n$,

\[ y_{n+1} = T_n(\beta_n u + (1 - \beta_n) y_n). \]

Then $(y_n)$ converges strongly to $P_F u$.

Proof. For all $n$, put $x_n := \beta_n u + (1 - \beta_n) y_n$ and $\alpha_n := \beta_{n+1}$. We see that, for all $n$, $y_{n+1} = T_n x_n$ and

\[ x_{n+1} = \beta_{n+1} u + (1 - \beta_{n+1}) y_{n+1} = \alpha_n u + (1 - \alpha_n) T_n x_n. \]

We may now apply Theorem 3.4 to get that $(x_n)$ converges strongly to $P_F u$. We also have that, for any $n$,

\[ d(y_{n+1}, P_F u) = d(T_n x_n, P_F u) \le d(x_n, P_F u), \]

so $(y_n)$ also converges strongly to $P_F u$.
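The reduction used in this proof can be replayed exactly on a small example. The sketch below runs the Tikhonov-type iteration and, independently, a Halpern-type iteration started at $x_0 = \beta_0 u + (1 - \beta_0) y_0$ with weights $\alpha_n = \beta_{n+1}$, and then compares $x_n$ with $\beta_n u + (1 - \beta_n) y_n$. The setting is an assumption made for illustration: the Euclidean plane, $T_n$ equal to the map $(x_1, x_2) \mapsto (x_1/2, x_2)$ for every $n$, and $\beta_n = 1/(n+2)$; the names `T`, `comb` and `beta` are hypothetical.

```python
from fractions import Fraction as F


def T(x):
    """Proximal map of f(x1, x2) = x1**2 / 2 (order 1): halve x1, keep x2."""
    return (x[0] / 2, x[1])


def comb(a, p, q):
    """Convex combination a * p + (1 - a) * q of two points."""
    return tuple(a * s + (1 - a) * t for s, t in zip(p, q))


def beta(n):
    return F(1, n + 2)


u, y0, steps = (F(1), F(2)), (F(5), F(-3)), 30

# Tikhonov-type iteration: y_{n+1} = T(beta_n u + (1 - beta_n) y_n).
ys = [y0]
for n in range(steps):
    ys.append(T(comb(beta(n), u, ys[n])))

# Halpern-type iteration with alpha_n = beta_{n+1}, started at x_0 as in the proof.
xs = [comb(beta(0), u, y0)]
for n in range(steps):
    xs.append(comb(beta(n + 1), u, T(xs[n])))

# The substitution x_n = beta_n u + (1 - beta_n) y_n links the two sequences.
linked = all(xs[n] == comb(beta(n), u, ys[n]) for n in range(steps + 1))
```

Rational arithmetic makes the comparison exact, so the equality of the two sequences is checked term by term rather than up to a tolerance.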


4 Quantitative results

4.1 Preparatory lemmas

Similarly to the last section, we present the preparatory lemmas and propositions in a separate subsection.

As stated in the Introduction, if $(x_n)_{n\in\mathbb{N}}$ is a sequence in $X$, then $(x_n)$ is called metastable if for any $\varepsilon > 0$ and $g : \mathbb{N} \to \mathbb{N}$ there is an $N$ such that for all $i, j \in [N, N + g(N)]$, $d(x_i, x_j) \le \varepsilon$, and a rate of metastability for $(x_n)$ is a function $\Psi : (0,\infty) \times \mathbb{N}^{\mathbb{N}} \to \mathbb{N}$ such that for any $\varepsilon$ and $g$, $\Psi(\varepsilon, g)$ gives an upper bound on the (smallest) corresponding $N$. It is an immediate exercise that this is just a reformulation of the Cauchy property.
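The definition can be read as a finite check for each candidate N. The sketch below spells this out for a sequence given as a Python function; the helper names `is_metastable_at` and `least_metastable_index`, and the bounded search limit, are hypothetical and serve only as illustration.

```python
def is_metastable_at(x, dist, eps, g, N):
    """True if dist(x(i), x(j)) <= eps for all i, j in [N, N + g(N)]."""
    window = range(N, N + g(N) + 1)
    return all(dist(x(i), x(j)) <= eps for i in window for j in window)

def least_metastable_index(x, dist, eps, g, search_limit):
    """Smallest N <= search_limit witnessing metastability for (eps, g), or None."""
    for N in range(search_limit + 1):
        if is_metastable_at(x, dist, eps, g, N):
            return N
    return None

# Illustrative data: x_n = 1/(n+1) on the real line, eps = 0.1, g(n) = n + 5.
seq = lambda n: 1.0 / (n + 1)
N0 = least_metastable_index(seq, lambda s, t: abs(s - t), 0.1, lambda n: n + 5, 100)
print(N0)
```

A rate of metastability is then any function of (eps, g) that bounds such a least index from above, uniformly in the remaining data.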

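For all $g : \mathbb{N} \to \mathbb{N}$, we define $\tilde{g} : \mathbb{N} \to \mathbb{N}$, for all $n$, by $\tilde{g}(n) := n + g(n)$. Also, for all $f : \mathbb{N} \to \mathbb{N}$ and all $n \in \mathbb{N}$, we denote by $f^{(n)}$ the $n$-fold composition of $f$ with itself. Note that for all $g$ and $n$, $\tilde{g}^{(n)}(0) \le \tilde{g}^{(n+1)}(0)$. We define, in addition, for any $f : \mathbb{N} \to \mathbb{N}$ and $c \in \mathbb{N}$ the function $f_c : \mathbb{N} \to \mathbb{N}$, setting, for any $l \in \mathbb{N}$, $f_c(l) := f(l + c)$.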

The following proposition, which we state in the form that we shall need later, gives a uniform and computable rate of metastability for nonincreasing sequences of nonnegative reals bounded above by a fixed constant.

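Proposition 4.1 (Quantitative Monotone Convergence Principle, cf. [50]). Let $b > 0$ and $(a_n)$ be a nonincreasing sequence in $[0, b]$. Then for all $\varepsilon > 0$, $g : \mathbb{N} \to \mathbb{N}$ and $l \in \mathbb{N}$ there is an $N \in \left[l, \tilde{g}^{(\lceil b/\varepsilon \rceil)}(l)\right]$, where $\tilde{g}(n) = n + g(n)$, such that for all $i, j \in [N, N + g(N)]$, $|a_i - a_j| \le \varepsilon$.

Proof. Let $\varepsilon > 0$, $g : \mathbb{N} \to \mathbb{N}$ and $l \in \mathbb{N}$. Assume that the conclusion is false, hence in particular for all $i \le \lceil b/\varepsilon \rceil$, $a_{\tilde{g}^{(i)}(l)} - a_{\tilde{g}^{(i+1)}(l)} > \varepsilon$. Then

\[ b \ge a_l \ge a_l - a_{\tilde{g}^{(\lceil b/\varepsilon \rceil + 1)}(l)} = \sum_{i=0}^{\lceil b/\varepsilon \rceil} \left( a_{\tilde{g}^{(i)}(l)} - a_{\tilde{g}^{(i+1)}(l)} \right) > \left\lceil \frac{b}{\varepsilon} \right\rceil \cdot \varepsilon \ge b, \]

a contradiction.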
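The argument above is in effect a bounded search along the iterates of the shifted function n ↦ n + g(n) starting from l. The following sketch carries out this search; the function names are hypothetical, and the code assumes, as in the proposition, that the sequence is nonincreasing with values in [0, b], so that comparing the two endpoints of a window controls all pairs inside it.

```python
import math

def shift(g):
    """The function n -> n + g(n)."""
    return lambda n: n + g(n)

def monotone_metastability_witness(a, b, eps, g, l=0):
    """Search l, g~(l), g~(g~(l)), ... for an N with a(N) - a(N + g(N)) <= eps.

    For a nonincreasing sequence a with values in [0, b], one of the first
    ceil(b / eps) + 1 candidates must succeed.
    """
    g_shifted = shift(g)
    N = l
    for _ in range(math.ceil(b / eps) + 1):
        if a(N) - a(N + g(N)) <= eps:
            return N
        N = g_shifted(N)
    raise ValueError("a is not a nonincreasing sequence in [0, b]")

# Illustrative data: a_n = 1/(n+1), b = 1, eps = 0.1, g(n) = n, l = 1.
witness = monotone_metastability_witness(lambda n: 1.0 / (n + 1), 1.0, 0.1, lambda n: n, l=1)
print(witness)
```

In this example the candidates are 1, 2, 4, ..., and the search stops well before the guaranteed bound, which here is the tenth iterate 1024.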

Lemma 4.2. Let b > 0 and x, y, z ∈ X be such that d(x, y) ≤ b and d(y, z) ≤ b. Then

\[ d^2(x, y) \le d^2(y, z) + 2b\,d(x, z). \]

Proof. Since $d(x, y) \le d(x, z) + d(y, z)$, we have that $d(x, y) - d(y, z) \le d(x, z)$. Now,

\[ d^2(x, y) - d^2(y, z) = (d(x, y) + d(y, z))(d(x, y) - d(y, z)) \le (d(x, y) + d(y, z)) \cdot d(x, z) \le 2b \cdot d(x, z), \]

from which the conclusion follows.

We shall now present the analogue in our context of [27, Lemma 8], which is proven there using full strong nonexpansiveness. Since the adaptation of that concept to the metric context would be ‘somewhat artificial’ [25], we are led, as said in the Introduction, to further mine the already partially quantitative Lemma 3.1 into the following property.

Proposition 4.3. Denote, for this and subsequent results, for any $\varepsilon, b > 0$, $\omega(b, \varepsilon) := \frac{\varepsilon^2}{15b}$. Let $\varepsilon, b > 0$, $z, p \in X$ and $T : X \to X$ a (P2) mapping. Assume that $d(z, p) \le b$ and $d(p, Tp) \le b$. Then, if

\[ d(z, p) - d(Tz, p) \le \omega(b, \varepsilon) \]

and

\[ d(p, Tp) \le \omega(b, \varepsilon), \]

we have that $d(z, Tz) \le \varepsilon$.

Proof. Since $T$ is (P2), we have that

\begin{align*}
2d^2(Tz, Tp) &\le d^2(z, Tp) + d^2(Tz, p) - d^2(z, Tz) - d^2(p, Tp) \\
&\le (d(z, p) + d(p, Tp))^2 + d^2(Tz, p) - d^2(z, Tz) - d^2(p, Tp).
\end{align*}

As

\[ d(Tz, Tp) \ge |d(Tz, p) - d(p, Tp)|, \]

we have that

\[ d^2(Tz, Tp) \ge d^2(Tz, p) + d^2(p, Tp) - 2d(Tz, p)d(p, Tp) \ge d^2(Tz, p) - 2d(Tz, p)d(p, Tp), \]

so

\[ 2d^2(Tz, p) - 4d(Tz, p)d(p, Tp) \le d^2(z, p) + d^2(p, Tp) + 2d(z, p)d(p, Tp) + d^2(Tz, p) - d^2(z, Tz) - d^2(p, Tp). \]

Thus,

\begin{align*}
d^2(z, Tz) &\le d^2(z, p) - d^2(Tz, p) + 4d(p, Tp)(d(z, p) + d(Tz, p)) \\
&= (d(z, p) - d(Tz, p) + 4d(p, Tp))(d(z, p) + d(Tz, p)) \\
&\le \left( \frac{\varepsilon^2}{15b} + 4 \cdot \frac{\varepsilon^2}{15b} \right)(d(z, p) + d(Tz, p)) \\
&= \frac{\varepsilon^2}{3b} \cdot (d(z, p) + d(Tz, p)) \\
&\le \frac{\varepsilon^2}{3b} \cdot (d(z, p) + d(Tz, Tp) + d(p, Tp)) \\
&\le \frac{\varepsilon^2}{3b} \cdot (2d(z, p) + d(p, Tp)) \\
&\le \frac{\varepsilon^2}{3b} \cdot (2b + b) = \frac{\varepsilon^2}{3b} \cdot 3b = \varepsilon^2,
\end{align*}

so $d(z, Tz) \le \varepsilon$.
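As a sanity check of the statement (not of its proof), one may test the implication numerically in the Euclidean plane. The sketch below assumes, for illustration only, that T is the metric projection onto the first coordinate axis and that p is the origin, which is a fixed point of T; we rely on the standard fact that in a Hilbert space such projections are firmly nonexpansive, which there amounts to the inequality used at the start of the proof. The helper names are hypothetical.

```python
import math

def omega(b, eps):
    """The modulus omega(b, eps) = eps^2 / (15 b) from the proposition."""
    return eps ** 2 / (15 * b)

def project_onto_axis(z):
    """Metric projection of the plane onto the first coordinate axis."""
    return (z[0], 0.0)

def implication_holds(T, z, p, b, eps):
    """True if the hypotheses fail at (z, p, b, eps) or the conclusion d(z, Tz) <= eps holds."""
    d = math.dist
    hypotheses = (
        d(z, p) <= b
        and d(p, T(p)) <= b
        and d(z, p) - d(T(z), p) <= omega(b, eps)
        and d(p, T(p)) <= omega(b, eps)
    )
    return (not hypotheses) or d(z, T(z)) <= eps

# Test the implication on a grid of points z and a few values of eps, with b = 1.
grid = [(i / 20.0, j / 20.0) for i in range(-20, 21) for j in range(-20, 21)]
results = [
    implication_holds(project_onto_axis, z, (0.0, 0.0), 1.0, eps)
    for z in grid
    for eps in (0.05, 0.2, 0.5, 1.0)
]
print(all(results))
```

For this particular T the check can also be done by hand: writing z = (s, t), the hypothesis bounds the quantity |z| - |s|, and t^2 = (|z| - |s|)(|z| + |s|) is then at most 2b times omega(b, eps), which is less than eps^2.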

The following is the quantitative version of Lemma 3.3, i.e. the analogue of [27, Lemma 11].

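Lemma 4.4. Let $T : X \to X$ be nonexpansive with $\mathrm{Fix}(T) \ne \emptyset$, $(x_n) \subseteq X$ a bounded sequence, $u \in X$ and for all $t \in (0, 1)$, let $z_t$ be the unique point in $X$ such that $z_t = tu + (1 - t)Tz_t$. Let $b > 0$ be such that for all $n \in \mathbb{N}$ and all $t \in (0, 1)$ one has $d(z_t, x_n) \le b$ and $d(x_n, Tx_n) \le b$. Let $(t_l)_{l \in \mathbb{N}^*} \subseteq (0, 1)$ and $\rho : (0, \infty) \to \mathbb{N}$ be such that for all $l \ge \rho(\varepsilon)$ we have that $t_l \le \varepsilon$. Let $\chi : \mathbb{N}^* \to \mathbb{N}^*$ be such that for all $l \in \mathbb{N}^*$ we have that $t_l \ge \frac{1}{\chi(l)}$, i.e. $\frac{1}{t_l \chi(l)} \le 1$. Take $k \ge \rho\left(\frac{\varepsilon}{b^2}\right)$, so that $\frac{t_k}{2} \cdot b^2 \le \frac{\varepsilon}{2}$. Take $n$ such that $d(x_n, Tx_n) \le \frac{\varepsilon}{3b\chi(k)}$. Then

\[ \langle \overrightarrow{u z_{t_k}}, \overrightarrow{x_n z_{t_k}} \rangle \le \varepsilon. \]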

Proof. By Lemma 3.3.(i), we have that

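\begin{align*}
\langle \overrightarrow{u z_{t_k}}, \overrightarrow{x_n z_{t_k}} \rangle &\le \frac{t_k}{2} d^2(x_n, z_{t_k}) + \frac{(1 - t_k)^2}{2t_k} d(x_n, Tx_n)\left( d(x_n, Tx_n) + 2d(x_n, z_{t_k}) \right) \\
&\le \frac{t_k}{2} \cdot b^2 + \frac{3b}{2t_k} d(x_n, Tx_n) \\
&\le \frac{\varepsilon}{2} + \frac{3b}{2t_k} \cdot \frac{\varepsilon}{3b\chi(k)} \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{align*}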

Lemma 4.5 ([27, Lemma 9.1]). For any ε > 0, g : N → N, K ∈ N, b > 0, set

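\[ \psi(\varepsilon, g, K, b) := \tilde{g}^{(\lceil b/\varepsilon \rceil)}(K) \ge K, \]

where $\tilde{g}(n) = n + g(n)$.

Let $b > 0$, $(a_n) \subseteq [0, b]$ and $\tau : \mathbb{N} \to \mathbb{N}$ be such that for all $k, n \in \mathbb{N}$ with $k \le n$ and $a_k < a_{k+1}$, we have $k \le \tau(n)$.

Then, for all $g : \mathbb{N} \to \mathbb{N}$, $K \in \mathbb{N}$ and $\varepsilon > 0$ with $\tau(\psi(\varepsilon, g, K, b)) < K$ we have that there is an $n \in [K, \psi(\varepsilon, g, K, b)]$ such that for all $i, j \in [n, n + g(n)]$, $|a_i - a_j| \le \varepsilon$.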


The following is the quantitative version of Lemma 2.10.

Lemma 4.6 ([27, Lemma 10]). For any ε > 0, S : (0,∞)× N → N, m ∈ N and b > 0, set

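\[ \varphi(\varepsilon, S, m, b) := m + S\left( \frac{\varepsilon}{4b}, m \right) + 1. \]

Let $b > 0$, $(a_n) \subseteq [0, b]$, $(\alpha_n) \subseteq (0, 1]$, $(\beta_n) \subseteq \mathbb{R}$ and $(\gamma_n) \subseteq [0, \infty)$. Suppose that for any $n \in \mathbb{N}$,

\[ a_{n+1} \le (1 - \alpha_n)a_n + \alpha_n\beta_n + \gamma_n. \]

Let $S : (0, \infty) \times \mathbb{N} \to \mathbb{N}$ be nondecreasing in the second argument such that for all $\varepsilon > 0$ and $m \in \mathbb{N}$,

\[ \prod_{k=m}^{S(\varepsilon, m)} (1 - \alpha_k) \le \varepsilon. \]

Let $\varepsilon > 0$, $g : \mathbb{N} \to \mathbb{N}$ and $P \in \mathbb{N}$ be such that there is an $m \le P$ such that for all

\[ i \in \left[ m, m + g^M\left( m + S\left( \frac{\varepsilon}{4b}, m \right) + 1 \right) + S\left( \frac{\varepsilon}{4b}, m \right) \right], \]

we have that $\beta_i \le \frac{\varepsilon}{4}$. Suppose that

\[ \sum_{i=0}^{\varphi(\varepsilon, S, P, b) + g^M(\varphi(\varepsilon, S, P, b))} \gamma_i \le \frac{\varepsilon}{2}. \]

Then there is an $N \le \varphi(\varepsilon, S, P, b)$ such that for all $i \in [N, N + g(N)]$, $a_i \le \varepsilon$.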

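Lemma 4.7 ([27, Lemma 12]). Let $(y_n)_{n \ge 1} \subseteq X$ and $\xi : (0, \infty) \times \mathbb{N}^{\mathbb{N}} \to \mathbb{N}$ be such that for any $\varepsilon > 0$ and $g : \mathbb{N} \to \mathbb{N}$ there is an $n \in [1, \xi(\varepsilon, g)]$ such that for all $i, j \in [n, g(n)]$, $d(y_i, y_j) \le \varepsilon$.

Then for all $\varepsilon > 0$, all $c \in \mathbb{N}^*$ and all $f : \mathbb{N} \to \mathbb{N}$ there is a $k \in [c, \xi(\varepsilon, f_c) + c]$ such that for all $i, j \in [k, f(k)]$, $d(y_i, y_j) \le \varepsilon$.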

The following is the quantitative version of Theorem 2.2, as obtained in [28]. An abstract version of it using the concept of jointly firmly nonexpansive families of mappings may be found in [48, Section 5], but here we shall only need the rate of metastability for the resolvents of nonexpansive mappings.

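Proposition 4.8 (cf. [28, Proposition 9.3]). Define, for all $b, \varepsilon > 0$ and $g : \mathbb{N} \to \mathbb{N}$,

\[ \xi_b(\varepsilon, g) := g^{\left( \left\lceil b^2/\varepsilon^2 \right\rceil \right)}(1). \]

Let $T : X \to X$ be nonexpansive, $u \in X$, and for all $t \in (0, 1)$ put $z_t$ to be the unique point in $X$ such that $z_t = tu + (1 - t)Tz_t$. Let $(t_n)_{n \in \mathbb{N}^*} \subseteq [0, 1]$ be nonincreasing. Put, for any $n \in \mathbb{N}^*$, $y_n := z_{t_n}$.

Let $b > 0$ and assume that, for all $n$, $d(y_n, u) \le b$. Then, for any $\varepsilon > 0$ and any $g : \mathbb{N} \to \mathbb{N}$ there is an $n \le \xi_b(\varepsilon, g)$ such that for all $i, j \in [n, g(n)]$, $d(y_i, y_j) \le \varepsilon$.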

4.2 Main results

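The main quantitative theorem includes, as expected, a rate of metastability, and in order to express it we shall introduce the following notations.

Notation 4.9. Let $b, \gamma > 0$, $(\gamma_n) \subseteq (0, \infty)$, $(\alpha_n) \subseteq (0, 1]$, $\zeta : (0, \infty) \to \mathbb{N}$, $S : (0, \infty) \times \mathbb{N} \to \mathbb{N}$, $\varepsilon > 0$ and $g : \mathbb{N} \to \mathbb{N}$.

We shall introduce a series of quantities depending on these parameters. Set

\[ C := 2 + \frac{\gamma_0}{\gamma}, \qquad \tilde{\varepsilon} := \frac{\varepsilon^2}{128b}. \]

Set, for all $l \in \mathbb{N}$,

\[ \eta_l := \frac{\varepsilon^2}{192bl}, \qquad M_1(l) := \min\left\{ \frac{1}{2}\omega\left( b, \frac{\eta_l}{C} \right), \omega\left( b, \frac{\varepsilon^2}{128b} \right), \frac{\varepsilon^2}{128b} \right\}, \qquad n_l := \max\left\{ \zeta\left( \frac{M_1(i)}{b} \right) \;\middle|\; i \le l \right\}, \]

\[ \hat{g}(l) := g^M\left( l + S\left( \frac{\varepsilon^2}{16b^2}, l \right) + 1 \right) + S\left( \frac{\varepsilon^2}{16b^2}, l \right), \qquad g'(l) := \hat{g}(l) + 2. \]

Set, for all $l, i \in \mathbb{N}$,

\[ \theta(l, i) := \psi\left( \frac{1}{2}\omega\left( b, \frac{\eta_l}{C} \right), g', i, b \right) \ge i, \]

and for all $l \in \mathbb{N}$,

\[ \theta^*(l) := \max\{ \theta(j, n_j) \mid j \le l \}, \qquad K(l) := \theta(l, n_l) + \hat{g}^M(\theta(l, n_l)) + 2, \]

\[ \tilde{K}(l) := K(l) + S\left( \frac{\varepsilon^2}{16b^2}, K(l) \right) + 1 + g^M\left( K(l) + S\left( \frac{\varepsilon^2}{16b^2}, K(l) \right) + 1 \right), \]

\[ \gamma^M_l := \max\{ \gamma_j \mid j \le l \}. \]

Set, for all $\beta > 0$ and $l \in \mathbb{N}$,

\[ \rho(\beta, l) := \frac{\left( 2 + \frac{\gamma^M_l}{\gamma} \right) \cdot b}{\beta}. \]

Set, for all $l \in \mathbb{N}$,

\[ M_2(l) := \min\left\{ \frac{\varepsilon}{2}, \frac{\varepsilon^2}{16b(\tilde{K}(l) + 1)}, \omega\left( b, \frac{\varepsilon^2}{128b} \right), \omega\left( b, \frac{\eta_l}{C} \right), \frac{\varepsilon^2}{16b} \cdot \min\{ \alpha_j \mid j \le K(l) \} \right\}, \]

\[ f(l) := \max\left( \rho(M_2(l), \tilde{K}(l)), l \right) \ge l. \]

Set, now,

\[ c := \left\lceil \frac{64b^2}{\varepsilon^2} \right\rceil, \qquad k^* := \xi_b(\tilde{\varepsilon}, f_c) + c, \qquad K^* := \theta^*(k^*) + \hat{g}^M(\theta^*(k^*)) + 2, \]

\[ \Phi := K^* + S\left( \frac{\varepsilon^2}{16b^2}, K^* \right) + 1. \]

This last quantity we shall denote in the sequel by $\Phi_{b,\gamma,(\gamma_n),(\alpha_n),\zeta,S}(\varepsilon, g)$, i.e. explicitly expressing its dependence on the parameters.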

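Armed with the above, we may now state the quantitative version of Theorem 3.4, which gives a rate of metastability for the Halpern proximal point algorithm in our context.

Theorem 4.10. Let $(T_n)$ be a family of self-mappings of $X$, $(\gamma_n) \subseteq (0, \infty)$ and $\gamma > 0$ be such that $(T_n)$ is jointly (P2) with respect to $(\gamma_n)$ and for all $n$, $\gamma_n \ge \gamma$. Let $(\tilde{\gamma}_n) \subseteq (0, \infty)$ be such that for all $n$, $\tilde{\gamma}_n \ge \gamma_n$. We denote by $F$ the common fixed point set of the family $(T_n)$. Let $(\alpha_n) \subseteq (0, 1]$, $u \in X$ and $(x_n) \subseteq X$ be such that for all $n$,

\[ x_{n+1} = \alpha_n u + (1 - \alpha_n)T_n x_n. \]

Let $(\tilde{\alpha}_n) \subseteq (0, 1]$ be such that for all $n$, $\tilde{\alpha}_n \le \alpha_n$. Let $b \in \mathbb{N}^*$ and $p \in F$ be such that $2d(x_0, p) \le b$ and $2d(u, p) \le b$. Let $\zeta : (0, \infty) \to \mathbb{N}$ be such that for all $\beta > 0$ and all $m \ge \zeta(\beta)$, $\alpha_m \le \beta$. Let $S : (0, \infty) \times \mathbb{N} \to \mathbb{N}$ be nondecreasing in the second argument such that for all $\varepsilon > 0$ and $m \in \mathbb{N}$,

\[ \prod_{k=m}^{S(\varepsilon, m)} (1 - \alpha_k) \le \varepsilon. \]

Then:

(i) for all $\varepsilon > 0$ and $g : \mathbb{N} \to \mathbb{N}$, there is a $w \in X$ and an $N \le \Phi_{b,\gamma,(\tilde{\gamma}_n),(\tilde{\alpha}_n),\zeta,S}(\varepsilon, g)$ such that for all $i \in [N, N + g(N)]$, $d(w, T_i w) \le \varepsilon/2$ and $d(x_i, w) \le \varepsilon/2$.

(ii) $\Phi_{b,\gamma,(\tilde{\gamma}_n),(\tilde{\alpha}_n),\zeta,S}$ is a rate of metastability for $(x_n)$, i.e. for all $\varepsilon > 0$ and $g : \mathbb{N} \to \mathbb{N}$, there is an $N \le \Phi_{b,\gamma,(\tilde{\gamma}_n),(\tilde{\alpha}_n),\zeta,S}(\varepsilon, g)$ such that for all $i, j \in [N, N + g(N)]$, $d(x_i, x_j) \le \varepsilon$.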
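To illustrate the two moduli required by the theorem, consider the weights 1/(n + 2), which are our own illustrative choice and are not singled out in the text. For these weights the product of the factors (1 - 1/(k + 2)) for k from m to s telescopes to (m + 1)/(s + 2), which suggests the candidates coded below. The sketch verifies them in exact rational arithmetic; the function names mirror the symbols ζ and S but are otherwise hypothetical.

```python
import math
from fractions import Fraction

def alpha(n):
    """Illustrative weights alpha_n = 1/(n + 2)."""
    return Fraction(1, n + 2)

def zeta(beta):
    """A modulus with alpha(m) <= beta for all m >= zeta(beta)."""
    return max(0, math.ceil(1 / Fraction(beta)) - 2)

def S(eps, m):
    """A modulus with prod_{k=m}^{S(eps, m)} (1 - alpha(k)) <= eps.

    The product equals (m + 1) / (S(eps, m) + 2) by telescoping.
    """
    return math.ceil(Fraction(m + 1) / Fraction(eps))

def partial_product(m, top):
    """Exact value of prod_{k=m}^{top} (1 - alpha(k))."""
    prod = Fraction(1)
    for k in range(m, top + 1):
        prod *= 1 - alpha(k)
    return prod

eps = Fraction(1, 10)
print(zeta(eps), S(eps, 3), partial_product(3, S(eps, 3)))
```

Since S is visibly nondecreasing in its second argument, these two functions (together with the same weights as their own lower bounds) are admissible inputs for the theorem in this particular case.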


Proof. Let $\varepsilon > 0$ and $g : \mathbb{N} \to \mathbb{N}$. We shall use the notations from Notation 4.9, instantiating the parameters with those from the statement of the theorem, together with this $\varepsilon$ and $g$.

We first remark that item (ii) is an immediate consequence of item (i), since $d(x_i, x_j) \le d(x_i, w) + d(x_j, w)$.

For all $t \in (0, 1)$, set $z_t$ to be the unique point such that $z_t = tu + (1 - t)T_0 z_t$. Note that, for all $t \in (0, 1)$, by Busemann convexity, we have that

\[ d(z_t, p) \le t\,d(u, p) + (1 - t)d(T_0 z_t, p) \le t\,d(u, p) + (1 - t)d(z_t, p), \]

so $d(z_t, p) \le d(u, p)$, from which we get $d(z_t, u) \le 2d(u, p) \le b$. Also, for all $n$ and $t$, $d(T_n z_t, p) \le d(z_t, p) \le d(u, p)$, so $d(T_n z_t, u) \le 2d(u, p) \le b$ and $d(T_n z_t, z_t) \le d(T_n z_t, p) + d(z_t, p) \le 2d(u, p) \le b$.

Again by Busemann convexity, we have that, for all n,

d(xn+1, p) ≤ αnd(u, p) + (1− αn)d(Tnxn, p)

≤ αnd(u, p) + (1− αn)d(xn, p).

By induction, one gets that for all $n$, $d(T_n x_n, p) \le d(x_n, p) \le \max(d(u, p), d(x_0, p)) \le b/2$, so, for all $n$, $d(T_n x_n, u) \le d(T_n x_n, p) + d(u, p) \le b$ and, for all $n$ and $t$, $d(x_n, z_t) \le d(x_n, p) + d(z_t, p) \le b$.

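For all $l \in \mathbb{N}^*$, set $y_l := z_{1/l}$. By Proposition 4.8 and Lemma 4.7, we get that there is a $k \in [c, k^*]$ such that for all $i, j \in [k, f(k)]$, $d(y_i, y_j) \le \tilde{\varepsilon}$. Set $k' := f(k) \ge k$. We get in particular that $d(y_k, y_{k'}) \le \tilde{\varepsilon}$ and that

\[ d(y_{k'}, T_0 y_{k'}) = d\left( \frac{1}{k'}u + \left( 1 - \frac{1}{k'} \right)T_0 y_{k'}, T_0 y_{k'} \right) = \frac{1}{k'}d(u, T_0 y_{k'}) \le \frac{b}{k'}. \]

We shall take $w := y_{k'}$ and thus it remains to be shown that there is an $N \le \Phi$ such that for all $i \in [N, N + g(N)]$, $d(y_{k'}, T_i y_{k'}) \le \varepsilon/2$ and $d(x_i, y_{k'}) \le \varepsilon/2$.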

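Set $A := \theta(k, n_k) \ge k$ and, for all $m$, $a_m := d(x_m, y_{k'})$. We distinguish two cases.

Case I. For all $i \le A$, $a_{i+1} \le a_i$.

Since

\[ \theta(k, n_k) = \psi\left( \frac{1}{2}\omega\left( b, \frac{\eta_k}{C} \right), g', n_k, b \right) = \widetilde{g'}^{\left( \left\lceil \frac{b}{\frac{1}{2}\omega(b, \eta_k/C)} \right\rceil \right)}(n_k), \]

we get by Proposition 4.1 that there is an $n \in [n_k, A]$ such that for all $i, j \in [n, n + g'(n)] = [n, n + \hat{g}(n) + 2]$, $|a_i - a_j| \le \frac{1}{2}\omega\left( b, \frac{\eta_k}{C} \right)$. We keep this in mind.

Case II. There is an $i \le A$ with $a_{i+1} > a_i$.

Define $\tau : \mathbb{N} \to \mathbb{N}$, for all $n \in \mathbb{N}$, by

\[ \tau(n) := \max\{ j \le \max(n, A) \mid a_j < a_{j+1} \}. \]

Then:

• for all $n \in \mathbb{N}$, $\tau(n) \le \tau(n + 1)$ and $a_{\tau(n)} \le a_{\tau(n)+1}$;

• for all $l, n \in \mathbb{N}$ with $l \le n$ and $a_l < a_{l+1}$, we have $l \le \tau(n)$;

• for all $n \ge A$, $a_n \le a_{\tau(n)+1}$ (this is the only non-trivial statement, but [27, Lemma 9.2] shows that it follows exactly as in the original proof of Lemma 2.8, i.e. see [37, Lemma 3.1]).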

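We now distinguish two sub-cases.

Sub-case II.1. For all $m \in [A, A + \hat{g}(A) + 2]$, $\tau(m) \ge n_k$.

Let $m \in [A, A + \hat{g}(A) + 2]$ be arbitrary. Then

\[ d(x_{\tau(m)+1}, y_{k'}) \le \alpha_{\tau(m)}d(u, y_{k'}) + (1 - \alpha_{\tau(m)})d(T_{\tau(m)}x_{\tau(m)}, y_{k'}) \le \alpha_{\tau(m)}d(u, y_{k'}) + d(T_{\tau(m)}x_{\tau(m)}, y_{k'}), \]

so

\[ d(x_{\tau(m)+1}, y_{k'}) - d(T_{\tau(m)}x_{\tau(m)}, y_{k'}) \le \alpha_{\tau(m)}d(u, y_{k'}) \le \alpha_{\tau(m)}b \]

and (using that $\tau(m) \ge n_k$)

\begin{align*}
d(x_{\tau(m)}, y_{k'}) - d(T_{\tau(m)}x_{\tau(m)}, y_{k'}) &\le d(x_{\tau(m)+1}, y_{k'}) - d(T_{\tau(m)}x_{\tau(m)}, y_{k'}) \\
&\le \alpha_{\tau(m)}b \le M_1(k) \le \min\left( \omega\left( b, \frac{\eta_k}{C} \right), \omega\left( b, \frac{\varepsilon^2}{128b} \right) \right).
\end{align*}

As

\[ \tau(m) \le \max(m, A) = m \le A + \hat{g}(A) + 2 \le A + \hat{g}^M(A) + 2 = K(k) \le \tilde{K}(k), \]

we have that

\begin{align*}
d(y_{k'}, T_{\tau(m)}y_{k'}) &\le \left( 2 + \frac{\gamma_{\tau(m)}}{\gamma_0} \right) \cdot d(y_{k'}, T_0 y_{k'}) \le \left( 2 + \frac{\gamma^M_{\tilde{K}(k)}}{\gamma} \right) \cdot \frac{b}{k'} \\
&\le \left( 2 + \frac{\gamma^M_{\tilde{K}(k)}}{\gamma} \right) \cdot b \cdot \frac{1}{\left( 2 + \frac{\gamma^M_{\tilde{K}(k)}}{\gamma} \right) \cdot \frac{b}{M_2(k)}} \\
&\le M_2(k) \le \min\left( \omega\left( b, \frac{\eta_k}{C} \right), \omega\left( b, \frac{\varepsilon^2}{128b} \right) \right).
\end{align*}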

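By Proposition 4.3, we get that

\[ d(x_{\tau(m)}, T_{\tau(m)}x_{\tau(m)}) \le \min\left( \frac{\eta_k}{C}, \frac{\varepsilon^2}{128b} \right), \]

so

\[ d(x_{\tau(m)}, T_0 x_{\tau(m)}) \le \left( 2 + \frac{\gamma_0}{\gamma_{\tau(m)}} \right)d(x_{\tau(m)}, T_{\tau(m)}x_{\tau(m)}) \le \left( 2 + \frac{\gamma_0}{\gamma} \right) \cdot \frac{\eta_k}{C} = \eta_k, \]

and, since $\tau(m) \ge n_k$,

\begin{align*}
d(x_{\tau(m)+1}, x_{\tau(m)}) &\le d(x_{\tau(m)+1}, T_{\tau(m)}x_{\tau(m)}) + d(T_{\tau(m)}x_{\tau(m)}, x_{\tau(m)}) \\
&= \alpha_{\tau(m)}d(u, T_{\tau(m)}x_{\tau(m)}) + d(T_{\tau(m)}x_{\tau(m)}, x_{\tau(m)}) \\
&\le \alpha_{\tau(m)} \cdot b + \frac{\varepsilon^2}{128b} \le \frac{\varepsilon^2}{128b} + \frac{\varepsilon^2}{128b} = \frac{\varepsilon^2}{64b}.
\end{align*}

As $d(x_{\tau(m)}, T_0 x_{\tau(m)}) \le \eta_k$ and $k \ge c = \left\lceil \frac{64b^2}{\varepsilon^2} \right\rceil$, we have, by Lemma 4.4, that

\[ \langle \overrightarrow{u y_k}, \overrightarrow{x_{\tau(m)} y_k} \rangle \le \frac{\varepsilon^2}{64}. \]

On the other hand,

\[ \langle \overrightarrow{u y_k}, \overrightarrow{x_{\tau(m)} x_{\tau(m)+1}} \rangle \le d(u, y_k)d(x_{\tau(m)}, x_{\tau(m)+1}) \le b \cdot \frac{\varepsilon^2}{64b} = \frac{\varepsilon^2}{64}, \]

so $\langle \overrightarrow{u y_k}, \overrightarrow{x_{\tau(m)+1} y_k} \rangle \le \frac{\varepsilon^2}{32}$. We also know that $d(y_k, y_{k'}) \le \tilde{\varepsilon} = \frac{\varepsilon^2}{128b}$, so

\begin{align*}
\langle \overrightarrow{u y_{k'}}, \overrightarrow{x_{\tau(m)+1} y_{k'}} \rangle &= \langle \overrightarrow{u y_k}, \overrightarrow{x_{\tau(m)+1} y_k} \rangle + \langle \overrightarrow{u y_k}, \overrightarrow{y_k y_{k'}} \rangle + \langle \overrightarrow{y_k y_{k'}}, \overrightarrow{x_{\tau(m)+1} y_{k'}} \rangle \\
&\le \frac{\varepsilon^2}{32} + \frac{\varepsilon^2}{128} + \frac{\varepsilon^2}{128} < \frac{\varepsilon^2}{16}.
\end{align*}

Using Lemma 2.1 and Lemma 4.2, we get that

\begin{align*}
d^2(x_{\tau(m)+1}, y_{k'}) &\le (1 - \alpha_{\tau(m)})^2 d^2(T_{\tau(m)}x_{\tau(m)}, y_{k'}) + 2\alpha_{\tau(m)}\langle \overrightarrow{u y_{k'}}, \overrightarrow{x_{\tau(m)+1} y_{k'}} \rangle \\
&\le (1 - \alpha_{\tau(m)})^2 d^2(T_{\tau(m)}x_{\tau(m)}, T_{\tau(m)}y_{k'}) + 2b\,d(y_{k'}, T_{\tau(m)}y_{k'}) + 2\alpha_{\tau(m)}\langle \overrightarrow{u y_{k'}}, \overrightarrow{x_{\tau(m)+1} y_{k'}} \rangle \\
&\le (1 - \alpha_{\tau(m)})d^2(x_{\tau(m)}, y_{k'}) + 2b\,d(y_{k'}, T_{\tau(m)}y_{k'}) + 2\alpha_{\tau(m)}\langle \overrightarrow{u y_{k'}}, \overrightarrow{x_{\tau(m)+1} y_{k'}} \rangle \\
&\le (1 - \alpha_{\tau(m)})d^2(x_{\tau(m)+1}, y_{k'}) + 2b\,d(y_{k'}, T_{\tau(m)}y_{k'}) + 2\alpha_{\tau(m)}\langle \overrightarrow{u y_{k'}}, \overrightarrow{x_{\tau(m)+1} y_{k'}} \rangle,
\end{align*}

so

\[ d^2(x_{\tau(m)+1}, y_{k'}) \le 2\langle \overrightarrow{u y_{k'}}, \overrightarrow{x_{\tau(m)+1} y_{k'}} \rangle + \frac{2b\,d(y_{k'}, T_{\tau(m)}y_{k'})}{\alpha_{\tau(m)}} \le \frac{\varepsilon^2}{8} + \frac{\varepsilon^2}{8} \le \frac{\varepsilon^2}{4}. \]

Since $m \ge A$, we have that $d^2(x_m, y_{k'}) \le d^2(x_{\tau(m)+1}, y_{k'}) \le \varepsilon^2/4$, so $d(x_m, y_{k'}) \le \varepsilon/2$.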


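As $m \le A + \hat{g}(A) + 2 \le K(k) \le \tilde{K}(k)$, we have that

\begin{align*}
d(y_{k'}, T_m y_{k'}) &\le \left( 2 + \frac{\gamma_m}{\gamma_0} \right) \cdot d(y_{k'}, T_0 y_{k'}) \le \left( 2 + \frac{\gamma^M_{\tilde{K}(k)}}{\gamma} \right) \cdot \frac{b}{k'} \\
&\le \left( 2 + \frac{\gamma^M_{\tilde{K}(k)}}{\gamma} \right) \cdot b \cdot \frac{1}{\left( 2 + \frac{\gamma^M_{\tilde{K}(k)}}{\gamma} \right) \cdot \frac{b}{M_2(k)}} \\
&\le M_2(k) \le \frac{\varepsilon}{2}.
\end{align*}

As $m$ was arbitrarily chosen, we have shown that for all $m \in [A, A + \hat{g}(A) + 2]$, $d(y_{k'}, T_m y_{k'}) \le \varepsilon/2$ and $d(x_m, y_{k'}) \le \varepsilon/2$.

We can then take $N := A$, because then, as $A = \theta(k, n_k) \le K(k)$ and $k \le k^*$, we have $\theta(k, n_k) \le \theta^*(k^*)$ and so $\hat{g}^M(\theta(k, n_k)) \le \hat{g}^M(\theta^*(k^*))$, $\theta(k, n_k) + \hat{g}^M(\theta(k, n_k)) + 2 \le \theta^*(k^*) + \hat{g}^M(\theta^*(k^*)) + 2$, so $K(k) \le K^* \le \Phi$. We have thus shown $N \le \Phi$ and we derive the needed conclusion by noting that $g(N) \le \hat{g}(N)$.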

Sub-case II.2. There is an $m \in [A, A + \hat{g}(A) + 2]$ with $\tau(m) < n_k$.

Since $A \le m$, $\tau(A) \le \tau(m) < n_k$. But $\theta(k, n_k) = \psi\left( \frac{1}{2}\omega\left( b, \frac{\eta_k}{C} \right), g', n_k, b \right)$, so, by Lemma 4.5, we get that there is an $n \in [n_k, A]$ such that for all $i, j \in [n, n + g'(n)] = [n, n + \hat{g}(n) + 2]$, $|a_i - a_j| \le \frac{1}{2}\omega\left( b, \frac{\eta_k}{C} \right)$.

We note that this was also proven in Case I, so now we may merge the two threads of the proof (and we no longer need this $m$ above).

Note that, since $n \le A$, we have that $n + g'(n) \le A + (g')^M(A) = K(k) \le \tilde{K}(k)$. Also note that, as $n \ge n_k$, for all $m \ge n$, $\alpha_m b \le M_1(k) \le \frac{1}{2}\omega\left( b, \frac{\eta_k}{C} \right)$.

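Let $m \in [n, n + g'(n) - 1] = [n, n + \hat{g}(n) + 1]$. We have that

\begin{align*}
d(x_m, y_{k'}) - d(T_m x_m, y_{k'}) &= d(x_{m+1}, y_{k'}) - d(T_m x_m, y_{k'}) + d(x_m, y_{k'}) - d(x_{m+1}, y_{k'}) \\
&\le \alpha_m b + \frac{1}{2}\omega\left( b, \frac{\eta_k}{C} \right) \le \omega\left( b, \frac{\eta_k}{C} \right).
\end{align*}

As $m \le n + g'(n) \le \tilde{K}(k)$, we have that

\begin{align*}
d(y_{k'}, T_m y_{k'}) &\le \left( 2 + \frac{\gamma_m}{\gamma_0} \right) \cdot d(y_{k'}, T_0 y_{k'}) \le \left( 2 + \frac{\gamma^M_{\tilde{K}(k)}}{\gamma} \right) \cdot \frac{b}{k'} \\
&\le \left( 2 + \frac{\gamma^M_{\tilde{K}(k)}}{\gamma} \right) \cdot b \cdot \frac{1}{\left( 2 + \frac{\gamma^M_{\tilde{K}(k)}}{\gamma} \right) \cdot \frac{b}{M_2(k)}} \\
&\le M_2(k) \le \omega\left( b, \frac{\eta_k}{C} \right).
\end{align*}

By Proposition 4.3, we get that

\[ d(x_m, T_m x_m) \le \frac{\eta_k}{C}, \]

so

\[ d(x_m, T_0 x_m) \le \left( 2 + \frac{\gamma_0}{\gamma_m} \right)d(x_m, T_m x_m) \le \left( 2 + \frac{\gamma_0}{\gamma} \right) \cdot \frac{\eta_k}{C} = \eta_k, \]

and since $k \ge c = \left\lceil \frac{64b^2}{\varepsilon^2} \right\rceil$, we have, by Lemma 4.4, that

\[ \langle \overrightarrow{u y_k}, \overrightarrow{x_m y_k} \rangle \le \frac{\varepsilon^2}{64}. \]

We also know that $d(y_k, y_{k'}) \le \tilde{\varepsilon} = \frac{\varepsilon^2}{128b}$, so

\begin{align*}
\langle \overrightarrow{u y_{k'}}, \overrightarrow{x_m y_{k'}} \rangle &= \langle \overrightarrow{u y_k}, \overrightarrow{x_m y_k} \rangle + \langle \overrightarrow{u y_k}, \overrightarrow{y_k y_{k'}} \rangle + \langle \overrightarrow{y_k y_{k'}}, \overrightarrow{x_m y_{k'}} \rangle \\
&\le \frac{\varepsilon^2}{64} + \frac{\varepsilon^2}{128} + \frac{\varepsilon^2}{128} = \frac{\varepsilon^2}{32}.
\end{align*}

So, we have shown that, for all $m \in [n, n + \hat{g}(n) + 1]$, $\langle \overrightarrow{u y_{k'}}, \overrightarrow{x_m y_{k'}} \rangle \le \frac{\varepsilon^2}{32}$.

Using Lemma 2.1 and Lemma 4.2, we get that, for all $i \in \mathbb{N}$,

\begin{align*}
d^2(x_{i+1}, y_{k'}) &\le (1 - \alpha_i)^2 d^2(T_i x_i, y_{k'}) + 2\alpha_i\langle \overrightarrow{u y_{k'}}, \overrightarrow{x_{i+1} y_{k'}} \rangle \\
&\le (1 - \alpha_i)^2 d^2(T_i x_i, T_i y_{k'}) + 2b\,d(y_{k'}, T_i y_{k'}) + 2\alpha_i\langle \overrightarrow{u y_{k'}}, \overrightarrow{x_{i+1} y_{k'}} \rangle \\
&\le (1 - \alpha_i)^2 d^2(x_i, y_{k'}) + 2b\,d(y_{k'}, T_i y_{k'}) + 2\alpha_i\langle \overrightarrow{u y_{k'}}, \overrightarrow{x_{i+1} y_{k'}} \rangle.
\end{align*}

We now seek to apply Lemma 4.6 with $\varepsilon \mapsto \frac{\varepsilon^2}{4}$, $b \mapsto b^2$, $P \mapsto K(k)$ and, for all $i$, $a_i \mapsto d^2(x_i, y_{k'})$, $\gamma_i \mapsto 2b\,d(T_i y_{k'}, y_{k'})$ and $\beta_i \mapsto 2\langle \overrightarrow{u y_{k'}}, \overrightarrow{x_{i+1} y_{k'}} \rangle$.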

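Note that

\[ n + \hat{g}(n) = n + g^M\left( n + S\left( \frac{\varepsilon^2}{16b^2}, n \right) + 1 \right) + S\left( \frac{\varepsilon^2}{16b^2}, n \right), \]

\[ \varphi\left( \frac{\varepsilon^2}{4}, S, K(k), b^2 \right) = K(k) + S\left( \frac{\varepsilon^2}{16b^2}, K(k) \right) + 1, \]

and

\[ \varphi\left( \frac{\varepsilon^2}{4}, S, K(k), b^2 \right) + g^M\left( \varphi\left( \frac{\varepsilon^2}{4}, S, K(k), b^2 \right) \right) = \tilde{K}(k), \]

so

\begin{align*}
\sum_{i=0}^{\varphi\left( \frac{\varepsilon^2}{4}, S, K(k), b^2 \right) + g^M\left( \varphi\left( \frac{\varepsilon^2}{4}, S, K(k), b^2 \right) \right)} 2b\,d(T_i y_{k'}, y_{k'}) &\le (\tilde{K}(k) + 1) \cdot 2b \cdot \max_{i \le \tilde{K}(k)} d(T_i y_{k'}, y_{k'}) \\
&\le (\tilde{K}(k) + 1) \cdot 2b \cdot M_2(k) \\
&\le (\tilde{K}(k) + 1) \cdot 2b \cdot \frac{\varepsilon^2}{16b(\tilde{K}(k) + 1)} = \frac{\varepsilon^2}{8}.
\end{align*}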

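Now we may apply Lemma 4.6 and we get that there is an $N \le K(k) + S\left( \frac{\varepsilon^2}{16b^2}, K(k) \right) + 1$ such that for all $i \in [N, N + g(N)]$, $d^2(x_i, y_{k'}) \le \frac{\varepsilon^2}{4}$, i.e. $d(x_i, y_{k'}) \le \frac{\varepsilon}{2}$. Now, for all $i \in [N, N + g(N)]$, since $N \le K(k) + S\left( \frac{\varepsilon^2}{16b^2}, K(k) \right) + 1$, and so

\[ g(N) \le g^M\left( K(k) + S\left( \frac{\varepsilon^2}{16b^2}, K(k) \right) + 1 \right), \]

we have that

\[ i \le N + g(N) \le K(k) + S\left( \frac{\varepsilon^2}{16b^2}, K(k) \right) + 1 + g^M\left( K(k) + S\left( \frac{\varepsilon^2}{16b^2}, K(k) \right) + 1 \right) = \tilde{K}(k), \]

so

\begin{align*}
d(y_{k'}, T_i y_{k'}) &\le \left( 2 + \frac{\gamma_i}{\gamma_0} \right) \cdot d(y_{k'}, T_0 y_{k'}) \le \left( 2 + \frac{\gamma^M_{\tilde{K}(k)}}{\gamma} \right) \cdot \frac{b}{k'} \\
&\le \left( 2 + \frac{\gamma^M_{\tilde{K}(k)}}{\gamma} \right) \cdot b \cdot \frac{1}{\left( 2 + \frac{\gamma^M_{\tilde{K}(k)}}{\gamma} \right) \cdot \frac{b}{M_2(k)}} \\
&\le M_2(k) \le \frac{\varepsilon}{2}.
\end{align*}

It remains to be shown that $N \le \Phi$. Since we have shown before that $K(k) \le K^*$, we have that, since $S$ is nondecreasing in the second argument,

\[ S\left( \frac{\varepsilon^2}{16b^2}, K(k) \right) \le S\left( \frac{\varepsilon^2}{16b^2}, K^* \right), \]

so

\[ N \le K(k) + S\left( \frac{\varepsilon^2}{16b^2}, K(k) \right) + 1 \le K^* + S\left( \frac{\varepsilon^2}{16b^2}, K^* \right) + 1 = \Phi. \]

The proof is now finished.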

As remarked before, results – including quantitative ones – concerning Tikhonov-regularized algorithms may be obtained from the corresponding Halpern ones, as per [33]; see also [14, Section 3.3] for examples which specifically concern metastability. (Tikhonov-regularized algorithms were also studied from the viewpoint of proof mining in [13, 12].) We may now, thus, state the corresponding quantitative version of Corollary 3.5.

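Corollary 4.11. Define, for any $g : \mathbb{N} \to \mathbb{N}$, the function $h_g : \mathbb{N} \to \mathbb{N}$, for any $n$, by $h_g(n) := g(n + 1)$, and for any $R : (0, \infty) \times \mathbb{N} \to \mathbb{N}$, the function $S_R : (0, \infty) \times \mathbb{N} \to \mathbb{N}$, for any $\varepsilon > 0$ and $m \in \mathbb{N}$, by $S_R(\varepsilon, m) := R(\varepsilon, m + 1)$. Also put, for any $b, \gamma > 0$, $(\gamma_n) \subseteq (0, \infty)$, $(\beta_n) \subseteq (0, 1]$, $\zeta : (0, \infty) \to \mathbb{N}$, $R : (0, \infty) \times \mathbb{N} \to \mathbb{N}$, $\varepsilon > 0$ and $g : \mathbb{N} \to \mathbb{N}$,

\[ \Theta_{b,\gamma,(\gamma_n),(\beta_n),\zeta,R}(\varepsilon, g) := \Phi_{b,\gamma,(\gamma_n),(\beta_{n+1}),\zeta,S_R}\left( \frac{\varepsilon}{2}, h_g \right) + 1. \]

Let $(T_n)$ be a family of self-mappings of $X$, $(\gamma_n) \subseteq (0, \infty)$ and $\gamma > 0$ be such that $(T_n)$ is jointly (P2) with respect to $(\gamma_n)$ and for all $n$, $\gamma_n \ge \gamma$. Let $(\tilde{\gamma}_n) \subseteq (0, \infty)$ be such that for all $n$, $\tilde{\gamma}_n \ge \gamma_n$. We denote by $F$ the common fixed point set of the family $(T_n)$. Let $(\beta_n) \subseteq (0, 1]$, $u \in X$ and $(y_n) \subseteq X$ be such that for all $n$,

\[ y_{n+1} = T_n(\beta_n u + (1 - \beta_n)y_n). \]

Let $(\tilde{\beta}_n) \subseteq (0, 1]$ be such that for all $n$, $\tilde{\beta}_n \le \beta_n$. Let $b \in \mathbb{N}^*$ and $p \in F$ be such that $2d(y_0, p) \le b$ and $2d(u, p) \le b$. Let $\zeta : (0, \infty) \to \mathbb{N}$ be such that for all $\beta > 0$ and all $m \ge \zeta(\beta)$, $\beta_m \le \beta$. Let $R : (0, \infty) \times \mathbb{N} \to \mathbb{N}$ be nondecreasing in the second argument such that for all $\varepsilon > 0$ and $m \in \mathbb{N}$,

\[ \prod_{k=m}^{R(\varepsilon, m)} (1 - \beta_k) \le \varepsilon. \]

Then $\Theta_{b,\gamma,(\tilde{\gamma}_n),(\tilde{\beta}_n),\zeta,R}$ is a rate of metastability for $(y_n)$, i.e. for all $\varepsilon > 0$ and $g : \mathbb{N} \to \mathbb{N}$, there is an $N \le \Theta_{b,\gamma,(\tilde{\gamma}_n),(\tilde{\beta}_n),\zeta,R}(\varepsilon, g)$ such that for all $i, j \in [N, N + g(N)]$, $d(y_i, y_j) \le \varepsilon$.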

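Proof. For all $n$, put $x_n := \beta_n u + (1 - \beta_n)y_n$, $\alpha_n := \beta_{n+1}$ and $\tilde{\alpha}_n := \tilde{\beta}_{n+1}$. We see that, for all $n$, $\tilde{\alpha}_n \le \alpha_n$, $y_{n+1} = T_n x_n$ and

\[ x_{n+1} = \beta_{n+1}u + (1 - \beta_{n+1})y_{n+1} = \alpha_n u + (1 - \alpha_n)T_n x_n. \]

We remark that for all $\beta > 0$ and all $m \ge \zeta(\beta)$, $m + 1 \ge \zeta(\beta)$ and so $\alpha_m = \beta_{m+1} \le \beta$. We also remark that $S_R$ is also nondecreasing in the second argument and that for all $\varepsilon > 0$ and $m \in \mathbb{N}$,

\[ \prod_{k=m}^{S_R(\varepsilon, m)} (1 - \alpha_k) = \prod_{k=m}^{R(\varepsilon, m+1)} (1 - \beta_{k+1}) = \prod_{k=m+1}^{R(\varepsilon, m+1)+1} (1 - \beta_k) \le \prod_{k=m+1}^{R(\varepsilon, m+1)} (1 - \beta_k) \le \varepsilon. \]

We see that, by Busemann convexity,

\[ 2d(x_0, p) \le 2(\beta_0 d(u, p) + (1 - \beta_0)d(y_0, p)) \le 2\left( \beta_0 \cdot \frac{b}{2} + (1 - \beta_0)\frac{b}{2} \right) = b. \]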

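Let $\varepsilon > 0$ and $g : \mathbb{N} \to \mathbb{N}$. We may now apply Theorem 4.10 to get that there is a $w \in X$ and an $M \in \mathbb{N}$ with $M + 1 \le \Theta_{b,\gamma,(\tilde{\gamma}_n),(\tilde{\beta}_n),\zeta,R}(\varepsilon, g)$ such that for all $q \in [M, M + h_g(M)]$, $d(w, T_q w) \le \varepsilon/4$ and $d(x_q, w) \le \varepsilon/4$.

Take $N := M + 1 \le \Theta_{b,\gamma,(\tilde{\gamma}_n),(\tilde{\beta}_n),\zeta,R}(\varepsilon, g)$. For any $i \in [N, N + g(N)]$, we have that $i - 1 \in [N - 1, N - 1 + g(N)] = [M, M + h_g(M)]$, so

\begin{align*}
d(y_i, w) &= d(T_{i-1}x_{i-1}, w) \\
&\le d(T_{i-1}x_{i-1}, T_{i-1}w) + d(T_{i-1}w, w) \\
&\le d(x_{i-1}, w) + d(T_{i-1}w, w) \le \frac{\varepsilon}{4} + \frac{\varepsilon}{4} = \frac{\varepsilon}{2}.
\end{align*}

Thus, for any $i, j \in [N, N + g(N)]$,

\[ d(y_i, y_j) \le d(y_i, w) + d(y_j, w) \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]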

The proof is now finished.

Finally, Suzuki has shown in [49] that convergence theorems of this sort for Halpern iterations – even for families of mappings like in our case – directly yield convergence theorems for the corresponding viscosity iterations; this has been recently analyzed quantitatively by Kohlenbach and Pinto [30], and the results of that paper – specifically Lemma 3.4, Remark 3.5 and Theorem 3.11 – may be used to immediately derive from our results rates of metastability for the viscosity proximal point algorithm, thus further illustrating the modularity of proof mining approaches.

5 Acknowledgements

I would like to thank Ulrich Kohlenbach and Laurenţiu Leuştean for their suggestions.

This work has been supported by a grant of the Romanian Ministry of Research, Innovation and Digitization, CNCS/CCCDI – UEFISCDI, project number PN-III-P1-1.1-PD-2019-0396, within PNCDI III.

References

[1] A. D. Aleksandrov, A theorem on triangles in a metric space and some of its applications. Trudy Math. Inst. Steklov 38, 4–23, 1951.

[2] K. Aoyama, M. Toyoda, Approximation of zeros of accretive operators in a Banach space. Israel J. Math. 220, no. 2, 803–816, 2017.

[3] D. Ariza-Ruiz, L. Leuştean, G. López-Acedo, Firmly nonexpansive mappings in classes of geodesic spaces. Trans. Amer. Math. Soc. 366, 4299–4322, 2014.

[4] D. Ariza-Ruiz, G. López-Acedo, A. Nicolae, The asymptotic behavior of the composition of firmly nonexpansive mappings. J. Optim. Theory Appl. 167, 409–429, 2015.

[5] M. Bačák, The proximal point algorithm in metric spaces. Israel J. Math. 194, 689–701, 2013.

[6] M. Bačák, Convex analysis and optimization in Hadamard spaces. De Gruyter, 2014.

[7] H. Bauschke, P. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Second Edition. Springer, 2017.

[8] I. D. Berg, I. G. Nikolaev, Quasilinearization and curvature of Alexandrov spaces. Geom. Dedicata 133, 195–218, 2008.

[9] H. Brézis, P. L. Lions, Produits infinis de résolvantes. Israel J. Math. 29, 329–345, 1978.

[10] F. E. Browder, Convergence theorems for sequences of nonlinear operators in Banach spaces. Mathematische Zeitschrift 100, 201–225, 1967.

[11] R. E. Bruck Jr., Nonexpansive projections on subsets of Banach spaces. Pacific J. Math. 47, 341–355, 1973.

[12] H. Cheval, L. Leuştean, Quadratic rates of asymptotic regularity for the Tikhonov-Mann iteration. arXiv:2107.07176 [math.OC], 2021.


[13] B. Dinis, P. Pinto, On the convergence of algorithms with Tikhonov regularization terms. Optimization Letters 15, no. 4, 1263–1276, 2021.

[14] B. Dinis, P. Pinto, Effective metastability for a method of alternating resolvents. arXiv:2101.12675 [math.FA], 2021.

[15] J. Eckstein, The Lions-Mercier Splitting Algorithm and the Alternating Direction Method are Instances of the Proximal Point Algorithm. Report LIDS-P-1769, Laboratory for Information and Decision Sciences, MIT, 1989.

[16] K. Goebel, S. Reich, Iterating holomorphic self-mappings of the Hilbert ball. Proc. Japan Acad. Ser. A Math. Sci. 58, no. 8, 349–352, 1982.

[17] K. Goebel, S. Reich, Uniform convexity, hyperbolic geometry, and nonexpansive mappings. Monographs and Textbooks in Pure and Applied Mathematics, 83. Marcel Dekker, Inc., New York, 1984.

[18] M. Gromov, Hyperbolic groups. In: S. M. Gersten (ed.), Essays in group theory. Math. Sci. Res. Inst. Publ., 8, Springer, New York, pp. 75–264, 1987.

[19] O. Güler, On the convergence of the proximal point algorithm for convex minimization. SIAM J. Control Optim. 29, 403–419, 1991.

[20] B. Halpern, Fixed points of nonexpanding maps. Bull. Amer. Math. Soc. 73, 957–961, 1967.

[21] G. J. Minty, On a “monotonicity” method for the solution of non-linear equations in Banach spaces. Proc. Natl. Acad. Sci. U.S.A. 50, 1038–1041, 1963.

[22] S. Kamimura, W. Takahashi, Approximating solutions of maximal monotone operators in Hilbert spaces. Journal of Approximation Theory 106, 226–240, 2000.

[23] U. Kohlenbach, Applied proof theory: Proof interpretations and their use in mathematics. Springer Monographs in Mathematics, Springer, 2008.

[24] U. Kohlenbach, On quantitative versions of theorems due to F. E. Browder and R. Wittmann. Adv. Math. 226, 2764–2795, 2011.

[25] U. Kohlenbach, On the quantitative asymptotic behavior of strongly nonexpansive mappings in Banach and geodesic spaces. Israel Journal of Mathematics 216, no. 1, 215–246, 2016.

[26] U. Kohlenbach, Proof-theoretic methods in nonlinear analysis. In: B. Sirakov, P. Ney de Souza, M. Viana (eds.), Proceedings of the International Congress of Mathematicians 2018 (ICM 2018), Vol. 2 (pp. 61–82). World Scientific, 2019.

[27] U. Kohlenbach, Quantitative analysis of a Halpern-type Proximal Point Algorithm for accretive operators in Banach spaces. Journal of Nonlinear and Convex Analysis 21, no. 9, 2125–2138, 2020.

[28] U. Kohlenbach, L. Leuştean, Effective metastability of Halpern iterates in CAT(0) spaces. Adv. Math. 231, 2526–2556, 2012. Addendum in: Adv. Math. 250, 650–651, 2014.

[29] U. Kohlenbach, G. López-Acedo, A. Nicolae, Quantitative asymptotic regularity for the composition of two mappings. Optimization 66, 1291–1299, 2017.

[30] U. Kohlenbach, P. Pinto, Quantitative translations for viscosity approximation methods in hyperbolic spaces. arXiv:2102.03981 [math.FA], 2021.

[31] U. Kohlenbach, A. Sipoş, The finitary content of sunny nonexpansive retractions. Commun. Contemp. Math., Volume 23, Number 1, 19550093 [63 pages], 2021.

[32] N. Lehdili, A. Moudafi, Combining the proximal algorithm and Tikhonov regularization. Optimization 37, no. 3, 239–252, 1996.


[33] L. Leuştean, A. Nicolae, A note on an alternative iterative method for nonexpansive mappings. Journal of Convex Analysis 24, 501–503, 2017.

[34] L. Leuştean, A. Nicolae, A. Sipoş, An abstract proximal point algorithm. Journal of Global Optimization, Volume 72, Issue 3, 553–577, 2018.

[35] L. Leuştean, P. Pinto, Quantitative results on a Halpern-type proximal point algorithm. Computational Optimization and Applications 79, no. 1, 101–125, 2021.

[36] T. C. Lim, Remarks on some fixed point theorems. Proc. Amer. Math. Soc. 60, 179–182, 1976.

[37] P.-E. Maingé, Strong convergence of projected subgradient methods for nonsmooth and nonstrictly convex minimization. Set-Valued Analysis 16, 899–912, 2008.

[38] B. Martinet, Régularisation d’inéquations variationnelles par approximations successives. Rev. Française Informat. Recherche Opérationnelle 4, 154–158, 1970.

[39] E. Neumann, Computational problems in metric fixed point theory and their Weihrauch degrees. Log. Methods Comput. Sci. 11, 1–44, 2015.

[40] P. Pinto, A rate of metastability for the Halpern type Proximal Point Algorithm. Numerical Functional Analysis and Optimization 42, no. 3, 320–343, 2021.

[41] S. Reich, Extension problems for accretive sets in Banach spaces. J. Functional Analysis 26, no. 4, 378–395, 1977.

[42] S. Reich, I. Shafrir, The asymptotic behavior of firmly nonexpansive mappings. Proc. Amer. Math. Soc. 101, no. 2, 246–250, 1987.

[43] S. Reich, I. Shafrir, Nonexpansive iterations in hyperbolic spaces. Nonlinear Anal. 15, no. 6, 537–558, 1990.

[44] R. T. Rockafellar, Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14, 877–898, 1976.

[45] S. Saejung, Halpern’s iteration in Banach spaces. Nonlinear Analysis 73, 3431–3439, 2010.

[46] S. Saejung, Halpern’s iteration in CAT(0) spaces. Fixed Point Theory and Applications 2010, 471781 [13 pages], 2010.

[47] T. Suzuki, Reich’s problem concerning Halpern’s convergence. Archiv der Mathematik 92, 602–613, 2009.

[48] A. Sipoş, Revisiting jointly firmly nonexpansive families of mappings. arXiv:2006.02167 [math.OC], 2020. To appear in: Optimization.

[49] T. Suzuki, Moudafi’s viscosity approximations with Meir-Keeler contractions. J. Math. Anal. Appl. 325, no. 1, 342–352, 2007.

[50] T. Tao, Soft analysis, hard analysis, and the finite convergence principle. Essay posted May 23, 2007. Appeared in: T. Tao, Structure and Randomness: Pages from Year One of a Mathematical Blog. AMS, 298 pp., 2008.

[51] T. Tao, Norm convergence of multiple ergodic averages for commuting transformations. Ergodic Theory Dynam. Systems 28, 657–688, 2008.

[52] R. Wittmann, Approximation of fixed points of nonexpansive mappings. Arch. Math. 58, 486–491, 1992.

[53] H.-K. Xu, Iterative algorithms for nonlinear operators. J. London Math. Soc. 66, 240–256, 2002.

[54] H.-K. Xu, A regularization method for the proximal point algorithm. Journal of Global Optimization 36, 115–125, 2006.