On the Complexity of the Descartes Method when using ...msagralo/ApxDescartes.pdf · On the Complexity of the Descartes Method when using Approximate Arithmetic Michael Sagraloff

On the Complexity of the Descartes Method whenusing Approximate Arithmetic

Michael Sagraloff

MPI for Informatics, Saarbrucken, Germany

Abstract

In this paper, we introduce a variant of the Descartes method to isolate the real roots of a square-freepolynomial F(x) = ∑

ni=0 Aixi with arbitrary real coefficients. It is assumed that each coefficient of F can be

approximated to any specified error bound. Our algorithm uses approximate arithmetic only, nevertheless, itis certified, complete and deterministic. We further provide a bound on the complexity of our method whichexclusively depends on the geometry of the roots and not on the complexity of the coefficients of F . For thespecial case, where F is a polynomial of degree n with integer coefficients of maximal bitsize τ , our boundon the bit complexity writes as O(n3τ2). Compared to the complexity of the classical Descartes methodfrom Collins and Akritas (based on ideas dating back to Vincent), which uses exact rational arithmetic, thisconstitutes an improvement by a factor of n. The improvement mainly stems from the fact that the maximalprecision that is needed for isolating the roots of F is by a factor n lower than the precision needed whenusing exact arithmetic.

Key words: Root isolation, Descartes method, subdivision methods, numerical computation, complexitybounds, approximate coefficients

1. Introduction

Computing the roots of a univariate polynomial can be considered as one of the fundamen-tal problems in computational algebra, and numerous approaches have been proposed in the lastdecades to solve this problem. In this paper, we focus on the problem of isolating the real roots ofa square-free polynomial F ∈R[x] with arbitrary real coefficients. More precisely, given approx-imations of the coefficients of F to an arbitrary precision, we aim to compute disjoint intervalsJ1, . . . ,Jm such that each Ji contains exactly one root of F and such that their union contains allreal roots of F . For polynomials with integer coefficients, the so-called Descartes method (or

Email address: [email protected] (Michael Sagraloff).

Preprint submitted to Elsevier 20 January 2014

”Vincent-Collins-Akritas” method) 1 , first introduced by Collins and Akritas [10], constitutesone of the simplest and most efficient algorithms. In order to better understand the contributionof this paper, we briefly review the algorithm: It starts with an interval I containing all real rootsof F and recursively proceeds as follows: For an interval I = (a,b)⊂I , Descartes’ Rule of Signis used to test I for roots of F . If it yields that the number m of roots contained in I equals zero,I is discarded. If it yields that m = 1, then I is stored as an isolating interval. In all other cases, Iis subdivided into two equally sized subintervals I` := (a,m(I)) and Ir := (m(I),b), where m(I)denotes the midpoint of I. For a polynomial F of degree n with integer coefficients of bit-size τ ,the Descartes method induces a recursion tree of size O(n(τ + logn)), where the latter bound hasshown to be optimal [15]. For Descartes’ Rule of Signs, we need to compute the polynomial 2

FI,rev(x) := (x+1)n ·F(

ax+bx+1

). (1.1)

Using asymptotically fast Taylor shifts [16, 45, 39], the cost for this computation is bounded by

O(n2(log+ max(|a|, |b|)+ log+ |b−a|−1)) = O(n3τ) (1.2)

bit operations, 3 where we define log+(x) := logmax(2, |x|) ≥ 1 for all x ∈ C and log := log2.The bound in (1.2) follows from the fact that we have to perform O(n) arithmetic operations andthat FI,rev has rational coefficients of bit-size O(n(log+ max(|a|, |b|)+ log+ |b−a|−1)) = O(n2τ).Multiplication of the bound on the recursion tree and the bound (1.2) on the bit complexity forthe computations at each node yields the bound O(n4τ2) on the overall bit complexity of theDescartes method.

The advantages of the Descartes method are its simplicity and that the size of the recursiontree adapts well to the geometric locations of the roots, that is, the recursion tree becomes largeif and only if some of the roots are clustered. A disadvantage of the Descartes method is thatthe exact computation of the polynomials FI,rev needs a precision of Θ(n2τ) in the worst case,whereas separating the roots from each other needs only O(nτ) bits. In fact, the binary repre-sentation of the endpoints of all isolating intervals returned by the algorithm needs no more thanO(nτ) bits. This brings up the question whether approximate computation of the polynomialsFI,rev yields any improvement with respect to the precision demand during the computation and,thus, also with respect to the bit complexity of the Descartes method. This question has beenaddressed in a series of previous papers: Johnson and Krandick [19] introduced a hybrid methodthat uses interval arithmetic based on floating point computation (up to a certain fixed preci-sion) to compute the polynomials FI,rev. This allows to determine the signs of the coefficientsof FI,rev (and, thus, to use Descartes’ Rule of Signs) for most of the considered intervals withinthe subdivision process by using approximate arithmetic, whereas, for the remaining intervals,the method falls back to exact computation. Hence, floating point arithmetic is used as a filter

1 There exist numerous discussions (e.g. [1]) about whether ”Descartes method” is the correct term since Descartesdid not introduce any algorithm to isolate the roots but (only) a method to estimate the number of positive roots of aunivariate polynomial (i.e. Descartes’ Rule of Signs). However, because of the fact that the algorithm from Collins andAkritas (based on ideas dating back to Vincent) exclusively uses this rule as inclusion and exclusion predicate, it isreasonable to name the algorithm after Descartes without using the possessive ”s” following his name.2 Descartes’ Rule of Sign states that the number m of roots contained in I is upper bounded by the number v of signchanges in the coefficient sequence of FI,rev and that v≡ mmod2. For more details, we refer to Section 2.6.3 According to Cauchy’s Root Bound (see e.g. [47]), we can assume that I ⊂ (−1 − 2τ ,1 + 2τ ), and thusmax(1, |a|, |b|) ≤ 1+ 2τ . In addition, Descartes method does not subdivide intervals of size less than half of the min-imal distance between two distinct roots of F (i.e. the separation σF of F), and logmax(1,σF ) = O(n(τ + logn)); seeSection 2.6 for details.

2

which allows to decrease the precision demand for most intervals, however, no improvement isachieved with respect to worst case bit complexity. Rouillier and Zimmermann [34] modified thelatter approach by arbitrarily increasing the working precision at each stage of the algorithm. Itis currently one of the fastest algorithms in practice (e.g. the univariate solver in MAPLE is basedupon this method), however, no result on the needed precision demand and its computationalcomplexity is known, and we expect that, without further modifications, there is no improve-ment upon the bound O(n4τ2) in the worst case. There also exist ”approximate versions” of theDescartes methods for which complexity results are known: In [14], Eigenwillig et al. proposeda randomized algorithm which is similar to the one from Rouillier and Zimmermann in the sensethat it computes interval approximations of the polynomials FI,rev; in fact, the method works withthe Bernstein representation and not with the monomial representation of F . However, the maindifference is that the subdivision points are randomly chosen in order to avoid unnecessarily largeworking precisions. The algorithms from [25, 35] are both deterministic, and they both start witha specific rational approximation F of F for which isolating intervals are computed. Eventually,the isolating intervals for F are obtained by enlarging the isolating intervals for F . It has beenshown that, for integer polynomials, all of the latter methods (i.e. [14, 25, 35]) need O(n4τ2) bitoperations to isolate all real roots. In summary, there exists no theoretical proof for the improvedefficiency of an ”approximate” Descartes method as observed in practice.

Main Results. A main contribution of this paper is to close the above described gap betweentheory and practice by introducing a modified Descartes method, denoted RISOLATE, whichcombines the Descartes and the Bolzano method [9, 38, 46]. More precisely, for discarding in-tervals that do not contain any root, our method mainly uses Descartes’ Rule of Signs, whereasan interval is confirmed to be isolating via a sign-change test at the endpoints of the interval andRouche’s Theorem. RISOLATE succeeds under guarantee (i.e. the method returns an exact resultfor any given F) with a working precision bounded by O(nτ) in the worst case and the size onthe recursion tree is bounded by O(nτ). This eventually yields the improved bound O(n3τ2) forisolating all real roots of F . Before we give more details, we briefly sketch how this improvementis possible: For an interval I = (a,b), let

FI(x) := F(a+(b−a) · x). (1.3)

Then, it holds that FI,rev(x) = (x+ 1)n ·FI(1/(x+ 1)). Hence, from an approximation FI of FIto L + n bits after the binary point 4 , we can directly compute an approximation FI,rev(x) ofFI,rev(x) to L bits after the binary point (see Lemma 1 (c)). Thus, in essence, we can restrictto the computation of sufficiently good approximations of the polynomials FI . In Section 2.3,we show that, for an arbitrary approximation of F to ρ0 = O(nτ) bits after the binary point,corresponding roots of F and its approximation are almost at the same location with respect totheir separations; see Theorem 3 and Appendix 6.2 for a more precise result. We conclude that,for isolating the roots of F , it should also suffice to consider approximations FI of FI to ρ0 bitsafter the binary point. But how can we compute such approximations FI in an efficient manner?Let h0, with h0 = O(nτ), denote an upper bound on the depth of the recursion tree induced bythe (modified) Descartes method. If we start with an approximation of F to ρ0 + 2h0 = O(nτ)bits after the binary point, then we can recursively compute approximations FI of FI such thatthe approximation error quadruples at most in each bisection step (Lemma 1), and thus each FIis approximated to at least ρ0 bits after the binary point. The polynomials FI have bitsize O(nτ)

4 More precisely, each coefficient of FI is approximated to an absolute error of less than 2−L−n

3

I0 = (− 12 ,

12 ), FI0 = 16

√2x2−16

√2x+4−8x+4+ π

8 +4√

2

≈ FI0 = 11585512 x2− 31363

1024 + 5145512

I1 = (− 12 ,0) I2 = (0, 1

2 )

(0, 14 ) ( 1

4 ,12 ), FI =

√2x2 +(2

√2−2)x+

√2+ π

8

≈ FI =181128 x2 + 53

64 x− 25128

(0, 18 ) ( 1

8 ,14 ) ( 1

4 ,38 ) ( 3

8 ,12 )

Fig. 1.1. Consider the recursion tree induced by the Descartes method when applied to F(x) := 16√

2 · x2− 8 · x+ π

4(with roots x1 = 0.06 . . . and x2 = 0.29 . . .). For each interval I = (a,b) in the subdivision process, we have to computethe polynomial FI(x). For instance, for I = ( 1

4 ,12 ), it holds that FI(x) = F( 1

4 + x4 ) =

√2x2 + (2

√2− 2)x+

√2+ π

8 .We start with an approximation FI0 of FI0 to a certain number ρI0 = ρ of bits after the binary point. Then, we recur-sively compute approximations FI of FI to ρI bits, where ρI is updated in each step. Notice that the polynomials FI donot necessarily correspond to a specific initial approximation G of F , that is, there might exist no polynomial G suchthat GI = FI for all considered I. In the above example, we start with FI0 (x) =

11585512 x2− 31363

1024 + 5145512 which approxi-

mates FI0 (x) = f (− 12 + x) = 16

√2x2− 16

√2x+ 4− 8x+ 4+ π

8 + 4√

2 to ρI0 = 10 bits. Then, FI0 (x2 ) and FI0 (

12 + x

2 )

are evaluated and the result is rounded to 9 bits after the binary point. The resulting polynomials are then approxi-mations of FI1 (x) = F( 1

2 + x2 ) and FI2 (x) = F( x

2 ) to ρI1 = ρI2 = 8 bits, respectively; see Lemma 1 for details. In thefollowing bisection steps, we proceed in exactly the same manner. For instance, for the interval I = ( 1

4 ,12 ), we obtain

FI(x) = 181128 x2 + 53

64 x− 25128 which approximates FI to ρI = 6 bits after the binary point.

(instead of O(n2τ) for the exact counterpart FI) which allows to reduce the cost at each node by afactor n; see also Figure 1.1 for an example in the more general setting, where F has arbitrary realcoefficients. The main difficulty is that the values h0 and ρ0 are not known in advance and thatconsidering worst case bounds for these values (e.g. for integer polynomials) yields an algorithmwhich might achieve an improved worst case complexity bound but is not practical at all; see alsoSection 4.4.1 for more details. In contrast, we propose an adaptive algorithm which succeeds witha working precision comparable to the precision that is actually needed for the given input. Inaddition, the size of the recursion tree directly depends on the actual separations of the roots; cf.the complexity bounds in (1.6) and (1.7) for a more precise result.

In the above considerations, we mainly focused on polynomials with integer coefficients.However, the proposed algorithm RISOLATE does not only apply to integer polynomials but alsoto arbitrary square-free polynomials 5

F(x) :=n

∑i=0

Aixi ∈ R[x],14≤ |An| ≤ 1, (1.4)

with real valued coefficients Ai, where we assume the existence of a coefficient oracle that pro-vides arbitrary good approximations of the coefficients at the cost of reading the approximation.In this setting, the complexity results are exclusively stated in terms of the geometry of the rootsand not in the complexity of the coefficients. More precisely, let

- ξ1, . . . ,ξn ∈ C denote the (complex) roots of F ,- ΓF := log+ maxi |ξi| the logarithmic root bound of F ,

5 The additional requirement for the leading coefficient An yields a simpler overall presentation. Notice that, for generalvalues An, we first have to multiply the polynomial F by some 2t , with t ∈ Z, such that 2t · |An| is contained in [1/4,1].

4

- σi := σ(ξi,F) := min j 6=i |ξi−ξ j| the separation of the root ξi,- σF := mini σi the separation of F , and- ΣF := ∑

ni=1 log+ σ

−1i , (1.5)

then RISOLATE induces a recursion tree of size

O(nΓF +ΣF), (1.6)

and it needs

O(n(nΓF +ΣF)2) (1.7)

bit operations to isolate all real roots of F . The coefficients of F have to be approximated toO(nΓF +ΣF) bits after the binary point. We remark that the bound in (1.7) factorizes into thebound (1.6) on the size of the recursion tree, the precision O(nΓ+ΣF) to carry out the compu-tations, and a factor O(n) for the number of arithmetic operations needed to process an intervalI in the recursion tree. Furthermore, for a polynomial F with integer coefficients of bit size τ

or less, we can first divide F by its leading coefficient (to meet the requirements in (1.4)) andthen apply RISOLATE to the polynomial F/|An|. For this special case, the bounds on the size ofthe recursion tree and the bit complexity simplify to O(nτ) and O(n3τ2), respectively, becauseΓF = O(τ), logMea(F) = O(τ + logn), and ΣF = O(nτ); see also Appendix 6.2.

Related Work. The literature on root finding mainly distinguishes between numerical methodsthat use approximate computation and methods that use exact arithmetic. Many numerical al-gorithms (e.g. based on Newton-Raphson iteration, the Weierstrass-Durand-Kerner method, (in-verse) power iteration, Eigenvalue computation, etc.) are widely used and effective in practice 6

but lack a guarantee on the global behavior. A prominent example is the Weierstrass-Durand-Kerner method, where there is still no proof known that the method converges for arbitrary givenstarting values. For a more detailed discussion, we refer to [32]. In parallel, there is a steadyongoing research on subdivision algorithms which perform rational operations on the input co-efficients. Algorithms of the latter kind are the Descartes method (e.g. [10, 13, 34]), the Bolzanomethod [9, 38], the Sturm method [11, 46], or the continued fraction method [2, 24, 41, 44]. 7

Many of these methods have been integrated into computer algebra systems, and experimentshave shown their practical evidence [18, 34]. In addition, their computational complexity hasbeen well-studied [11, 15, 38, 44]. Current experimental data shows that an approximate variantof the Descartes method [34] performs best for most polynomials, whereas, for some particularhard instances (e.g. Mignotte polynomials), the continued fraction approach is more efficient.

From a theoretical complexity point of view, the first benchmark was set by A. Schonhage[39] in 1982. He combines a newly introduced concept denoted splitting circle method withtechniques from numerical analysis (Newton iteration, Graeffe’s method, discrete Fourier trans-forms) and fast algorithms for polynomial and integer multiplication. With respect to the bench-mark problem (i.e. isolating all roots of a polynomial F of degree n with integer coefficients ofbit size τ or less), his method achieves the bit complexity bound O(n3τ). Pan and others [32, 33]gave theoretical improvements which yield record bounds with respect to bit complexity andarithmetic complexity. In particular, [33, Theorem 2.1.1] implies that isolating all complex roots

6 E.g. MPSOLVE [8] is a highly efficient implementation of the Aberth-Ehrlich method.7 The literature on root solving is extensive, hence, we decided to restrict to a selection of representative papers andrefer the reader to the references given therein.

5

of F needs no more than O(n2τ) bit operations. Very recent work [26] turns Pan’s factorizationalgorithm into a root isolation method that achieves a bit complexity bound which adapts directlyto the geometry of the roots. That is, similar to the bound in (1.7), the bound exclusively dependson the absolute values and the pairwise distances between roots. However, the main drawbackof the asymptotically fast algorithms above is that they are rather involved and difficult to im-plement. In fact, Pan’s method has not been implemented, whereas Schonhage’s method has notproven to be efficient in practice so far; see [17] for a “proof of concept” implementation of thesplitting circle method within the Computer Algebra system Pari/GP.

All of the above mentioned exact subdivision algorithms (i.e. Sturm, Bolzano, Descartes, orthe continued fraction method) need O(n4τ2) bit operations to isolate all real roots of F , thus theylag behind the (asymptotically) fastest method by three magnitudes. In a very recent work [37],we introduced a variant of the Descartes method which uses Newton iteration to speed up conver-gence. The method exclusively performs exact rational arithmetic and has bit complexity O(n3τ)which is still by one magnitude worse than the method from Pan. When compared to other ex-act subdivision methods, the improvement in [37] mainly stems from the fact that it achievesquadratic convergence in most iterations which yields a recursion tree of almost optimal size.In contrast, the improvement with respect to bit complexity (i.e. from O(n4τ2) to O(n3τ2)) asachieved by the algorithm RISOLATE in this paper is due to the use of approximate arithmeticwith a considerably smaller working precision as needed for the exact counterpart.

A first version [36] of this paper appeared in arXiv in November 2010. Since that time, thealgorithm RISOLATE has been implemented as a core function in MATHEMATICA (see [42])and the complexity results have been applied in a series of papers (e.g. [20, 21, 37, 42, 43]).In this context, we would also like to remark that our complexity results are already confirmedby experiments [42, 43] showing that the complexity of RISOLATE is exclusively related to thegeometry of the roots. Furthermore, the adaptiveness of our bound has turned out to be veryuseful in the analysis [21] of an algorithm to compute the topology of an algebraic curve whichmakes extensive use of amortization. For the future, we expect that there will be a series of furthercomplexity results based on our adaptive complexity bound for real root isolation.

Acknowledgments

A special thank goes to all anonymous reviewers for their constructive and detailed criticismthat has helped to improve the quality and exposition of this contribution.

2. Preliminaries

2.1. Notations

In addition to the definitions from (1.4) and (1.5), we define w(I) := b−a the width, m(I) :=a+b

2 the center, and r(I) = w(I)2 the radius of an interval I = (a,b). Furthermore,

I+ = (a+,b+) := (a− w(I)4n

,b+w(I)4n

) and I = (a, b) := (a− w(I)2n

,b+w(I)2n

)

denote extensions of I by w(I)4n and w(I)

2n (to both sides), respectively. We will need these intervalsfor our modified version of the Descartes method as presented in Section 3. For an arbitrary pointm ∈ C and a positive real value r, we define ∆ = ∆r(m) to be the open disk with center m andradius r. ∆ and I denote the closure of a disk ∆ and an interval I, respectively.

6

2.2. Scaling the Polynomial

Instead of isolating the roots of the given polynomial F(x) = ∑ni=0 Aixi as defined in (1.4), we

consider the equivalent task of isolating the roots of a ”scaled” polynomial

f (x) =n

∑i=0

aixi :=n

∑i=0

(Ai ·2i·Γ) · xi = F(2Γ · x), (2.1)

where Γ∈N is an integer approximation of the exact logarithmic root bound ΓF = log+(maxi |ξi|)of F such that

ΓF +1≤ Γ≤ ΓF +8logn+1. (2.2)

According to Appendix 6.1, we can compute such a Γ with O(n2ΓF) bit operations from anapproximation of F to O(nΓF) bits after the binary point. From the definition of Γ, it follows thatthe roots z1 := ξ1 ·2−Γ, . . . ,zn := ξn ·2−Γ of f are contained within the disk ∆1/2(0). Furthermore,the absolute value of each coefficient ai is upper bounded by 2O(nΓ) since |Ai| ≤

(ni

)Mea(F) ≤

2n+nΓF ≤ 2nΓ for all i. We further remark that the separations of corresponding roots of F and fscale by a factor of 2Γ (i.e. σ(ξi,F) = 2Γ ·σ(zi, f )). Thus, we have

Σ f =n

∑i=1

log+ σ(zi, f )−1 ≤ ΣF +nΓ = O(nΓF +n logn+ΣF) = O(nΓF +ΣF). (2.3)

2.3. Approximating Polynomials

We assume the existence of a coefficient oracle which, for a given ρ ∈ N, provides approxi-mations of the coefficients of F to ρ bits after the binary point. More precisely, each coefficientAi is approximated by a binary fraction Ai = mi · 2−ρ with mi ∈ Z and |Ai− Ai| ≤ 2−ρ , e.g.,Ai = sign(Ai) · b|Ai · 2ρ |c · 2−ρ . We call a polynomial F ∈ Q[x] obtained in this way a ρ-binaryapproximation of F . We only consider the cost for reading (i.e. O(n(nΓF +ρ))) but not for com-puting such an approximation. Notice that, in order to obtain a ρ-binary approximation of thescaled polynomial f , we have to approximate F to nΓ+ρ bits after the binary point since thei-th coefficient of F is shifted by i ·Γ bits.

For an arbitrary polynomial g(x) :=∑mi=0 gixi ∈C[x] with complex coefficients and an arbitrary

non-negative real number µ ∈ R≥0, we define

[g]µ :=

{g(x) =

n

∑i=0

gixi ∈ C[x] : |gi− gi| ≤ µ for all i = 0, . . . ,n

}the set of all µ-approximations of g. We remark that, since the coefficients of modulus less thanµ can be approximated by zero, a µ-approximation g of g might have lower degree than g.

Example. For g(x) := 1225665589 x10−2x2 + 1

243 x− 916 , the polynomial g(x) := 11

64 x10−2x2− 916 con-

stitutes a 6-binary approximation and g(x) :=−2x2− 34 a 2-binary approximation of g.

2.4. Taylor Shifts

The following lemma provides error bounds on how the absolute approximation error µ of apolynomial g ∈ [g]µ scales under the transformation x 7→ m+λ · x for some special values form ∈ C and λ ∈ R\{0}:

7

Lemma 1. For µ ∈ R+0 and g ∈ [g]µ an arbitrary µ-approximation of a polynomial g ∈ C[x] of

degree n, it holds that(a) g( 1

2 +12 · x) ∈ [g( 1

2 +12 · x)]2µ ,

(b) g(− 14n +(1+ 1

2n ) · x) ∈ [g(− 14n +(1+ 1

2n ) · x)]4µ ,(c) g(− 1

2 + x) ∈ [g(− 12 + x)]2nµ , and g(1+ x) ∈ [g(1+ x)]2nµ .

Proof. For µ(x) := (g− g)(x) = µnxn + . . .+µ1x+µ0, the absolute value of each coefficient µiis bounded by µ . Let m ∈ C and λ ∈ R\{0} be arbitrary values, then

µ(m+λx) =n

∑i=0

µi(m+λx)i =n

∑i=0

µi

i

∑k=0

xkλ

kmi−k(

ik

)=

n

∑k=0

xkn

∑i=k

µimi−kλ

k(

ik

)(2.4)

Thus, for |m|< 1, the absolute value of the coefficient of xk is bounded by

µ|λ |k ·∑i≥k|m|i−k

(ik

)= µ|λ |k ·∑

i≥0|m|i(

k+ ik

)= µ|λ |k · 1

(1−|m|)k+1 , (2.5)

where we used

(1−|m|)−(k+1) = ∑i≥0

(−(k+1)

i

)(−1)i|m|i = ∑

i≥0

(k+ i

i

)|m|i = ∑

i≥0

(k+ i

k

)|m|i.

For m = λ = 1/2, it follows that the absolute value of all coefficients of µ(x) is bounded by 2µ .This shows (a). For m =− 1

4n and λ = 1+ 12n , (2.5) implies that

g(− 14n

+(1+1

2n) · x) ∈

[g(− 1

4n+(1+

12n

) · x)]

µ87 ·(

1+1/(2n)1−1/(4n)

)n⊂[

g(− 14n

+(1+12n

) · x)]

4µ

because 87 ·(

1+1/(2n)1−1/(4n)

)n≤ 83

73 ·√

e ≤ 4. Hence, (b) follows. The first part of (c) is also a directimplication of (2.5). The second claim in (c) follows from the computation in (2.4) since µi is then(m = λ = 1) bounded by µ ·∑n

i=k( i

k

)= µ ·∑n

i=k( i

i−k

)= µ ·∑n−k

i=0

(i+ki

)≤ µ ·∑n−k

i=0

(ni

)≤ 2n ·µ . 2

2.5. On Sufficiently Good Approximation

In the next step, we derive a bound on how good f has to be approximated by some f suchthat, for all i, the distance of corresponding roots zi and zi of f and f is small with respect tothe separation σ(zi, f ). There exist general worst-case perturbation bounds (e.g. [40, Thm. 2.7]or [23, Chapter 15]) that apply to polynomials with multiple roots and which only depend on thedistance ‖ f − f‖1 between f and f . 8 For polynomials with roots of very large multiplicity, thesebounds are nearly optimal. However, they often constitute vast overestimations of the amount ofperturbation, in particular, for polynomials with well separated roots. In contrast, we provide amore adaptive, but implicit, bound depending on parameters, such as the separations of the rootsand the absolute values of the derivatives at the roots, which can not directly be derived from thecoefficients of f (or f ). However, our algorithm as presented in Section 4 is designed in a waysuch that it eventually succeeds with a working precision that is related to our adaptive bound.We further remark that our bound cannot be directly derived from the bound in [40, Thm. 2.7]and vice versa.

8 In the context of real root isolation, the bound in [40, Thm. 2.7] has been used in [25] in order to derive isolatingintervals for the roots of a real polynomial f ∈ R[x] from corresponding isolating intervals for the roots of a rationalapproximation f ∈Q[x].

8

The following considerations are mainly adopted from our studies in [35]. For the sake ofcomprehensibility, we decided to briefly review the results in this paper as well. We start withthe following definition:

Definition 2. For t, with t ≥ 1, an arbitrary real value and f a polynomial as in (2.1), we define

µ( f , t) :=1t· min

i=1,...,n

∣∣∣∣σ(zi, f ) · f ′(zi)

8n2

∣∣∣∣ (2.6)

We call a ρ ∈ N sufficiently large with respect to f if 9

ρ ≥ ρ f := d− log µ( f ,64n2)e. (2.7)

Notice that ρ f = O(Σ f + logn− log |an|) and ρ f = O(ΣF + logn) because of

σ(zi, f ) · | f ′(zi)|= σ(zi, f ) · |an| ·∏j 6=i|zi− z j| ≥ σ(zi, f ) · |an| ·∏

j 6=iσ(z j, f ) =

=|an|2nΓ

σ(ξi,F)∏j 6=i

σ(ξ j,F)≥ 14·2−ΣF . (2.8)

The following theorem gives an answer to our initial question how good f has to be approximatedby some f in order to ensure that corresponding roots stay at almost ”the same place” with respectto their separations:

Theorem 3. Let f be the polynomial as defined in (2.1), t ≥ 1 and f ∈ [ f ]µ( f ,t).(a) For all i = 1, . . . ,n, the disk

∆i := ∆ σ(zi , f )tn

(zi)

contains the root zi of f and a corresponding root zi of f .(b) For each z ∈ C\

⋃ni=1 ∆i, it holds that | f (z)|> (n+1)µ( f , t).

(c) If ρ ≥ ρ f , then each root zi moves by at most σ(zi, f )64n3 when passing from f to an arbitrary f ∈

[ f ]2−ρ . In particular, real roots of f stay real and non-real roots stay non-real. Furthermore,for any z ∈ C with |z− zi| ≥ σ(zi, f )

64n3 for all i, it holds that | f (z)|> (n+1)2−ρ f .

Proof. Since all roots of f are contained within ∆1/2(0), it follows that σ(zi, f )< 1 for all i and,thus, each disk ∆i is completely contained within the unit disk. For an arbitrary point z ∈ ∂∆i onthe boundary of ∆i, we have

| f (z)|= |an|n

∏j=1|z− z j|=

σ(zi, f )tn

(∏

1≤ j≤n, j 6=i

∣∣∣∣ z− z j

zi− z j

∣∣∣∣)· |an| ·

(∏

1≤ j≤n, j 6=i|zi− z j|

)

=σ(zi, f ) · | f ′(zi)|

tn ∏1≤ j≤n, j 6=i

∣∣∣∣ z− z j

zi− z j

∣∣∣∣≥ σ(zi, f ) · | f ′(zi)|tn ∏

1≤ j≤n, j 6=i

|zi− z j|− |z− zi||zi− z j|

≥ σ(zi, f ) · | f ′(zi)|tn

(1− 1

tn

)n−1

>σ(zi, f ) · | f ′(zi)|

2.72 · tn> (n+1)µ( f , t).

In addition, since f ∈ [ f ]µ( f ,t) and |z|< 1, we have |( f − f )(z)|< (n+1)µ( f , t)< | f (z)|. Hence,(a) follows from Rouche’s Theorem applied to the disks ∆i and the functions f and f . For (b), weremark that f is a holomorphic function on C\

⋃ni=1 ∆i and, thus, | f (z)| becomes minimal for a

9 This definition is motivated by our results in Theorem 3 and Section 4.1

9

point z on the boundary of one of the disks ∆i. (c) follows directly from (a), (b) and the definitionof ρ f in (2.7). 2

We conclude from the last theorem that it suffices to approximate the coefficients of f toρ , with some ρ = O(Σ f + logn− log |an|), bits after the binary point to guarantee that eachapproximation f ∈ [ f ]2−ρ has its roots at almost the same location as f .

2.6. The Descartes Method

We first resume some basic facts about the Descartes method for isolating the real roots ofa polynomial f (x) = ∑

ni=0 aixn ∈ R[x]. Descartes’ Rule of Signs states that the number var( f )

of sign changes in the coefficient sequence of f , that is, the number of pairs (i, j) with i < j,aia j < 0, and ai+1 = . . .= a j−1 = 0, is not smaller than and of the same parity as the number ofpositive real roots of f . If var( f ) = 0, then f has no positive real root, and if var( f ) = 1, f hasexactly one positive real root. The rule easily extends to an arbitrary open interval I = (a,b) viaa suitable coordinate transformation: The mapping x 7→ a+(b−a)x maps (0,1) bijectively ontoI, that is, the roots of f in I exactly correspond to those of

fI(x) := f (a+w(I)x) = f (a+(b−a)x) (2.9)

in (0,1). Hence, the composition of x 7→ a+(b−a) · x and x 7→ 1/(1+ x) constitutes a bijectivemap from (0,∞) to I. It follows that the positive real roots of

fI,rev(x) := (1+ x)n fI

(1

x+1

)= (1+ x)n · f

(ax+bx+1

)correspond bijectively to the real roots of f in I. The factor (1+x)n in the definition of fI,rev clearsdenominators and guarantees that fI,rev is a polynomial. fI,rev is computed from fI by reversingthe coefficients (i.e. the i-th coefficient is replaced by the (n− i)-th coefficient) followed by aTaylor shift by 1 (i.e. x 7→ x+1). We now define var( f , I) := var( fI,rev).

Based on Descartes’ Rule of Sign, Collins and Akritas introduced a bisection algorithm 10 forisolating the roots of f in an interval I0 (here, we assume that I0 = (−1/2,1/2)). We refer thereader to [3, 4, 5, 6, 10, 13] for extensive treatments and references.

VCA. The algorithm requires that the real roots of f in I0 are simple, otherwise it diverges. In eachstep, a set A of active intervals is maintained. Initially, A contains I0, and the algorithm stop assoon as A becomes empty. In each iteration, some interval I ∈ A is processed; If var( f , I) = 0,then I contains no root of f and we discard I. If var( f , I) = 1, then I contains exactly one rootof f and, hence, is an isolating interval for it. We add I to a list O of isolating intervals. If thereis more than one sign change, we divide I at its midpoint m(I) and add the subintervals to theset of active intervals. If m(I) is a root of f , we add the trivial interval [m(I),m(I)] to the list ofisolating intervals.

Correctness of the algorithm follows immediately from Descartes’ Rule of Signs. Terminationand complexity analysis of VCA rest on the following theorem:

10 Based on the fact that Collins and Akritas used ideas dating back to Vincent (see [5]), the algorithm has been namedVincent-Collins-Akritas method (or VCA for short). In this paper, we use both denotations, that is, VCA and Descartesmethod, in an interchangeable way.

10

Theorem 4 ([28, 31]). For a polynomial f ∈ R[x] and an interval I = (a,b), let v := var( f , I).(a) (One-Circle Theorem) If the open disk bounded by the circle centered at m(I) and passing

through the endpoints of I contains no root of f (x), then v = 0.(b) (Two-Circle Theorem) If the union of the open disks bounded by the two circles centered at

m(I)± i(1/(2√

3))w(I) and passing through the endpoints of I contains exactly one root off (x), then v = 1.

Proofs of the one- and two-circle theorems can be found in [3, 13, 22, 28, 29, 30, 31]. Theo-rem 4 implies that no interval I of length σ f /2 or less is split. Such an interval cannot contain tworeal roots and its two-circle region cannot contain any nonreal root. Thus, var( f , I)≤ 1 by Theo-rem 4. We conclude that the depth of the recursion tree is bounded by 1+ logσ

−1f . Furthermore,

it holds (see [13, Cor. 2.27] or [27, Prop. 3.1] self-contained proofs):

Theorem 5. Let I be an interval and I1 and I2 be two disjoint subintervals of I. Then,

var( f , I1)+ var( f , I2)≤ var( f , I).

According to the above theorem, there cannot be more than n/2 intervals I with var( f , I) ≥2 at any level of the recursion. Therefore, the size of the recursion tree TVCA is bounded byn(1+ logσ

−1f ). For polynomials with integer coefficients of maximal bitsize τ , it has been shown

that− logσ f = O(n(logn+τ)), thus, the latter bound writes as O(n2τ). However, a more refinedargumentation [13] shows that |TVCA| is even bounded by O(nτ) which is due to the fact thatthere are amortization effects over the separations of all roots; see Appendix 6.2.

The computation of fI,rev at each node of the tree is costly. It is better to store with everyinterval I = (a,b) the polynomial fI(x) = f (a+ x · (b−a)). If I is split at its midpoint m(I) intoI` = (a,m(I)) and Ir = (m(I),b), the polynomials associated with the subintervals are fI`(x) =fI(

x2 ) and fIr(x) = fI(

1+x2 ) = fI`(1+ x). Also, fI,rev(x) = (1+ x)n fI(

11+x ). If the coefficients of

f are integers (or dyadic fractions) of bitsize τ , then the coefficients grow by n bits in everybisection step. Thus, for a node I of depth h, the bitsize τh of the coefficients of fI is boundedby τh = τ + nh. Hence, using asymptotically fast Taylor shift (see [45, 16]), the number of bitoperations needed to compute fI` , fIr and fI,rev from fI is O(n(nh+ τ)). Since the depth of therecursion tree is O(nτ), each fI has coefficients of bitsize O(n2τ) and, thus, the cost at each nodeis bounded by O(n3τ). Eventually, the total cost for VCA is in O(n3τ) · O(nτ) = O(n4τ2).

3. A Modified Descartes Method

In c computational model, where exact operations on real numbers are assumed to be availableat unit costs, the Descartes method can be directly used to isolate the real roots of the polynomialf as defined in (2.1). Namely, in such a model, we can compute the number of sign variationsfor the polynomial fI,rev and the sign of f at the midpoint m(I) for each node I of the recursiontree no matter whether f has rational, algebraic, or transcendental coefficients. However, for anactual implementation, these computations turn out to be hard, or even infeasible, in general.Namely, if one of the coefficients of fI,rev equals zero (e.g. this is the case if one of the endpointsof I is a root of f ), then deciding the sign of this coefficient becomes infeasible since we canonly ask for approximations of f . The decision problem becomes hard if one of the coefficientshas a very small value because, in this case, we have to run our computations with a very largeworking precision. We further remark that, even for algebraic coefficients (with known algebraic

11

representation), the decision problem might be hard because this amounts to comparing alge-braic numbers of large degree. In order to overcome these issues, we do not consider the originalversion of the Descartes method but a modified variant which completely avoids such difficult de-cision problems. More precisely, we will show that our method always succeeds with a workingprecision comparable to ρ f . A crucial step in our approach is to replace the inclusion predicatevar( f , I) = 1, which is used in the Descartes method to confirm an interval to be isolating, by apredicate used in the Bolzano method.

Section 3.1 resumes some useful results which are adopted from our studies on the Bolzanomethod [38], whereas, in Section 3.2, our modified Descartes method is formulated.

3.1. The T [g,K](·)-Test: Existence of Roots

For g ∈ C[x], m ∈ C and positive real values K and r, we consider the following test whichhas already been introduced in [46] in a less general form: 11

T [g,K](m,r) : t[g,K](m,r) := |g(m)|−K ∑k≥1

∣∣∣∣∣g(k)(m)

k!

∣∣∣∣∣rk > 0. (3.1)

In order to simplify notation, we also write T [g,K](∆) or T [g,K](I) instead of T [g,K](m,r),where ∆ = ∆r(m) is disk or I = (a,b) an interval with midpoint m = m(I) and radius r = r(I). Ifthe polynomial g is fixed and no mix-up is possible, we further omit the ”g” and write T [K](m,r)for T [K](m,r) and T ′[K](m,r) for T [g′,K](m,r). We mainly use K = 3/2. Therefore, wheneverthe ”K” is suppressed (i.e. we write T [g](m,r) instead of T [g,3/2](m,r)), we consider K = 3/2.Before presenting the main technical lemmata, we first summarize the following useful propertiesof T [g,K](·):

• If T [g,K](m,r) holds, then T [g,K′](m,r′) holds for all K′ ≤ K and all r′ ≤ r.• For arbitrary values m, r and λ 6= 0, the tests T [g,K](m,r) and T [g(m+λ · x),K](0, r

λ) are

equivalent since t[g(m+λx),K](0, rλ) = t[g,K](m,r). In particular, for an interval I = (a,b),

the test T [gI ,K](0,r) is equivalent to T [g,K](a,r ·w(I)), where gI(x) = g(a+w(I) · x).• For λ ∈ R+, it holds that t[g,K](m,r) = t[λg,K](m,r) ·λ−1 and, thus, T [g,K](m,r) is equiv-

alent to T [λg,K](m,r). Hence, T [(g′)I ,K](m,r) and T [(gI)′,K](m,r) are equivalent since

(gI)′ = (g(a+w(I) · x))′ = w(I) · (g′)I .

The T [g,K](·)-test serves as an exclusion predicate but might also guarantee that a certaindisk contains at most one root. We refer to [7, Theorem 3.2] for a proof of the following lemma.

Lemma 6. Consider a disk ∆ = ∆r(m)⊂ C and a polynomial g ∈ R[x]:(a) If T [K](∆) holds for some K ≥ 1, then ∆ contains no root of g and(

1− 1K

)· |g(m)|< |g(z)|<

(1+

1K

)· |g(m)|

for all z in the closure ∆ of ∆.(b) If T ′[3/2](∆) holds, then ∆ contains at most one root of g.

11 In [46], Yakoubsohn introduces a quadtree (Weyl) construction for computing the complex roots of an analytic func-tion, where the test T [g,1](·) is exclusively used as an exclusion predicate. In [46, Section 9], he also provides boundson the arithmetic complexity and the precision that is needed by his algorithm to isolate all complex roots of a square-freepolynomial f . In particular, the bound on the precision is stated in terms of the degree of f , the absolute value of theroots, and the distance of f to the variety of all polynomials that have a multiple root, and thus, it is similar to our bound.

12

The T ′(·)-test now easily applies as an inclusion predicate:

Corollary 7. Let I = (a,b) be an interval and r≥ 1 such that T [g′I ](0,r) holds. Then, I containsa root ξ of g if and only if g(a) ·g(b)< 0. In the latter case, the disk ∆r·w(I)(a) is isolating for ξ .

Proof. If T [g′I ](0,r) holds, then T [g′](a,r ·w(I)) holds as well according to the above propertiesof T [g,K](·). It follows that the disk ∆r·w(I)(a) and, thus, I contains no root of the derivative g′.Now, since g is monotone on I, it suffices to check for a sign change of g at the endpoints of I.Namely, there exists a root ξ of g in I if and only if g(a) ·g(b)< 0. In case of existence, ∆r·w(I)(a)is isolating for ξ due to Lemma 6. 2

In order to show that the T [g′](m,r)-test in combination with sign evaluation is an efficientinclusion predicate, we give lower bounds on r in terms of σg such that the predicate succeedsunder guarantee.

Lemma 8. For g a polynomial of degree n, a disk ∆ = ∆r(m) ⊂ C, an interval I = (a,b) andI+ = (a− w(I)

4n ,b+ w(I)4n ), it holds:

(a) If r ≤ σg4n2 , then T (∆) or T ′(∆) holds.

(b) If ∆ contains a root ξ of g and r ≤ σ(ξ ,g)4n2 , then T ′(∆) holds.

(c) If var(g, I+)> 0 and T [g′I ](0,2) fails, ∆2w(I)(a) contains a root ξ of g with σ(ξ ,g)< 8n2w(I).(d) If var(g′, I)> 0 and T [gI ](0,1) fails, ∆2nw(I)(a) contains a root ξ of g with σ(ξ ,g)< 4n2w(I).

Proof. For the proof of (a) and (b), we refer to [35, Lemma 5]. For (c), suppose that var(g, I+)>0 and T [g′I ](0,2) does not hold. Then, according to Theorem 4 (a), the disk ∆r(I+)(m(I)) ⊂∆2w(I)(a) contains a root ξ of g. With (b), it follows that 2w(I) > σ(ξ ,g)

4n2 and, thus, σ(ξ ,g) <8n2w(I). For (d), we first argue by contradiction that the disk ∆2nw(I)(a) contains a root ξ of g:If |a− xi| ≥ 2nw(I) for all roots xi of g, then∣∣∣∣∣g(k)(a)g(a)

∣∣∣∣∣=∣∣∣∣∑′

i1,...,ik

1(a− xi1) . . .(a− xik)

∣∣∣∣≤ (∑ni=1

1|a− xi|

)k

≤(

12w(I)

)k

,

where the prime means that the i j’s ( j = 1 . . .k) are chosen to be distinct. It follows that T (a,w(I))

holds because of ∑nk=1

∣∣∣ g(k)(a)g(a)

∣∣∣w(I)k ≤∑nk=1 2−k < 1 < 3

2 . In addition, Theorem 4 guarantees the

existence of a root ξ ′ ∈ ∆r(I)(m(I)) of g′. Hence, we have |ξ − ξ ′| < 2nw(I)+w(I) < 4nw(I)which implies σ(ξ ,g)< 4n2w(I) due to the fact [12, 47] that there exists no root of the derivativeg′ in ∆ σ(ξ ,g)

n(ξ ). 2

3.2. DCM: A Modified Descartes Algorithm

We introduce our modified Descartes method DCM (short for “Descartes modified”) to isolatethe real roots of a polynomial f . We formulate the algorithm in the REAL-RAM model, thus, itstill does not directly apply to bitstream polynomials. However, in Section 4.1, we will present acorresponding version DCMρ of DCM which resolves this issue; see also Appendix, Algorithm 1for pseudo-code of DCM.

13

DCM. DCM maintains a list A of active nodes and a list O of isolating intervals, where weinitially set O = /0 and A := {(I0, fI0)}, with I0 := (− 1

2 ,12 ). For each active node (I, fI) from

A , we proceed as follows. We first remove (I, fI) from the list A . Then, we compute the num-ber vI+ := var( f , I+) = var( fI+,rev) of sign variations for f on the extended interval I+ (noticethat fI+(x) = fI(− 1

4n +(1+ 12n )x) and fI+,rev(x) = (1+ x)n fI+(

11+x )).

12 If vI+ = 0, we do noth-ing, that is, I is discarded. If vI+ ≥ 1, we consider the test T [ f ′I ](0,2) which is equivalent toT [ f ′](a,2w(I)). If it fails, then I is subdivided into I` = (a,m(I)) and Ir = (m(I),b) and weadd (I`, fI`) = (I`, fI(

x2 )) and (Ir, fIr) = (Ir, fI`(x+ 1)) to A . Otherwise, we evaluate the sign s

of f (a+) · f (b+) = fI+(0) · fI+(1). If s < 0 and I+ is disjoint from any other interval in O , weadd I+ to O . If s ≥ 0 or I+ intersects an interval in O , we do nothing (i.e. I is discarded). Thealgorithm stops when A becomes empty.

Theorem 9. For the polynomial f as defined in (2.1), the algorithm DCM terminates and returnsa list O = {I1, . . . , Im} of disjoint isolating intervals for all real roots of f .

Proof. If the width w(I) of an interval I = (a,b) is smaller or equal to σ f8n2 , then, according to

Theorem 4 and Lemma 8 (c), var( f , I+) = 0 or T [ f ′I ](0,2) holds. Thus, I is not further subdi-vided. This shows termination of DCM. From our construction and Corollary 7, each interval inO is isolating for a real root of f and all intervals in O are pairwise disjoint. It remains to showthat, for each real root ξ of f , there exists a corresponding isolating interval in O . Since all rootsof f have absolute value bounded by 1/2 and DCM terminates, there must be an interval I = (a,b)of minimal (positive) length whose closure I contains ξ . Since vI+ > 0, I cannot be discarded inthe first step of DCM. Hence, T [ f ′I ](0,2) holds and, thus, f is monotone on I+. Since I+ containsthe root ξ , we have f (a+) · f (b+) < 0. It follows that either I+ is added to the list of isolatingintervals or I+ intersects an interval J+ = (c+,d+) ∈ O which has been added to O before. LetJ = (c,d) be the corresponding smaller interval for J+. Since the w(I)

4n -neighborhood of I inter-sects the w(J)

4n -neighborhood of J, the following Lemma 10 shows that one of the disks ∆2w(I)(a)or ∆2w(J)(c) contains both intervals I+ and J+. Since both, T [ f ′I ](0,2) and T [ f ′J ](0,2), hold,each of the latter two disks contains at most one root due to Corollary 7. It follows that J+ ∈ Oalready isolates ξ . 2

Lemma 10. 13 Let I = (a,b) and J = (c,d) be two intervals (not necessarily of equal length)of the form

(− 1

2 + i2−h,− 12 +(i+1)2−h

), where h ∈ N and i ∈ {0, . . . ,2h − 1}. If the w(I)

2n -

neighborhood Uw(I)2n

(I) of I intersects the w(J)2n -neighborhood Uw(J)

2n(J) of J, then one of the disks

∆2w(I)(a) or ∆2w(J)(c) contains the intervals (a−w(I),b+w(I)) and (c−w(J),d +w(J)).

Proof. W.l.o.g., we can assume that w(J)≥ w(I) and, thus, w(J) = 2lw(I) with an l ∈N0. Let δ

denote the distance between I and J. If δ = 0, then ∆2w(J)(c) contains (a−w(I),b+w(I)) and(c−w(J),d +w(J)). If δ 6= 0, then δ = 2kw(I) with a k ∈ N0. Since Uw(I)

2n(I)∩Uw(J)

2n(J) 6= /0,

12 Remember that the polynomial fI+ is obtained from f by mapping all roots of f in I+ one-to-one and onto the interval(0,1). When further mapping the roots one-to-one and onto the positive real axis, we obtain fI+,rev.13 Lemma 10 proves a slightly stronger result than necessary for the proof of Theorem 9. The stronger result applies inthe proof of Theorem 14 in Section 4.2.

14

J

ImJmI

D ( ) 2w(J)

a b c d

c

d

Fig. 3.1. Wlog., we can assume that w(J)≥ w(I). w(I), w(J) and the distance δ between I and J differ by a power of 2.For δ = 0, the disk ∆ := ∆2w(J)(c) certainly contains I and J. If δ 6= 0, then w(J)≥ 2w(I) and w(J)≥ 4δ , hence I, J ⊂ ∆.

we must have w(J)2n > δ

2 . In particular, we have w(J)4 > δ

2 = 2k−1w(I). Since w(I) and w(J) differby a power of 2, it follows that w(J)≥ 2k+2w(I) = 4δ and, thus, 2w(J) = w(J)+ w(J)

2 + w(J)2 ≥

w(J)+2w(I)+2δ . From the latter inequality our claim follows. 2

Theorem 11. For a polynomial f as in (2.1), DCM induces a subdivision tree TDCM of

height h(TDCM) = O(logn− logσ f ) and size |TDCM|= O(Σ f +n logn).

Proof. The result on the height of TDCM follows directly from the proof of Theorem 9. Namely,we have shown that DCM never subdivides an interval of width less than or equal to σ f

8n2 . Forthe bound on |TDCM|, we use a similar argument as in [15] and [25]. Namely, for a root ξ off and a certain h ∈ N0 we say that I = (− 1

2 + i2−h,− 12 +(i+ 1)2−h), i = {0, . . . ,2h− 1}, is a

canonical interval for ξ if the real part of ξ is contained in [− 12 + i2−h,− 1

2 +(i+ 1)2−h) andσ(ξ , f ) < 8n22−h = 8n2w(I). We denote Tc the canonical tree which consists of all canonicalintervals. We remark that, for a canonical interval I, the parent interval of I is canonical as well.The following considerations will show that |TDCM|= O(|Tc|) and |Tc|= O(Σ f +n logn): For thesize of the canonical tree, consider a leaf I ∈ Tc and let ξI be a root of f corresponding to thisleaf. If there are several, then ξI is the root with minimal separation. Then, σ(ξI , f ) < 8n22−h

and, thus, h≤ 2logn+4− logσ(ξI , f ). Since each root of f is associated with at most one leafof the canonical tree, we conclude that |Tc| = O(n logn+Σ f ). It remains to show that |TDCM| =O(|Tc|). Consider the following mapping of internal nodes (intervals) of TDCM to canonical nodes(intervals) in Tc: Let I be a non-terminal interval of width w(I) = 2−h. Then, var( f , I+)> 0 andT [ f ′I ](0,2) does not hold. According Lemma 8 (c), the disk ∆2w(I)(a) contains a root ξ of f withσ(ξ , f )< 8n2w(I) = 8n22−h. Hence, one of the four intervals, I1 := (a−2w(I),a−w(I)), I2 :=(a−w(I),a), I3 := I or I4 := (b,b+(b−a)), is canonical for ξ . We map I to the correspondinginterval. This defines a mapping from the internal nodes of TDCM to the nodes of the canonicaltree Tc. Furthermore, each node in the canonical tree has at most four preimages in TDCM and,thus, the number of internal nodes of TDCM is bounded by O(n logn+Σ f ). Since TDCM is a binarytree, the bound on the number of internal nodes applies to the whole tree as well. 2

15

4. Algorithm

We first outline our algorithm RISOLATE to isolate the roots of f . RISOLATE decomposesinto two subroutines DCMρ and CERTIFYρ , where ρ indicates the actual working precision. Weproceed in rounds: In the first round, we start with a low working precision (e.g. ρ = 16). Ifour algorithm does not succeed in a certain round, the precision is doubled in the next round.Following this approach, we can eventually guarantee an adaptive behavior of our method, thatis, it eventually succeeds for a working precision which is at most twice the size of the actuallyneeded precision.

The first subroutine DCMρ is essentially identical to DCM with the main difference that, ateach node I = (a,b) of the recursion tree, we only consider approximations fI(x) of fI(x) =f (a+w(I) ·x) to a certain number ρI of bits after the binary point, where ρ +2logw(I)≤ ρI ≤ ρ .We remark that we process I in a way such that I is not subdivided by DCMρ if it is not subdividedby the the exact counterpart DCM. This ensures that, for any ρ , DCMρ induces a subtree TDCMρ

of TDCM and, thus, |TDCMρ | = O(Σ f + n logn) due to Theorem 11. We further show that, for aprecision ρ ≥ ρmax

f = O(Σ f + logn), DCMρ returns isolating intervals for all real roots of f ; seeTheorem 14 for the definition of ρmax

f and further details. However, for smaller ρ , DCMρ mayreturn isolating intervals only for some roots but without any information whether all real rootsare captured or not. In order to overcome such an undesirable situation, we consider an additionalsubdivision method CERTIFYρ similar to DCMρ which aims to certify that all roots are captured.We further show that CERTIFYρ also induces a recursion tree of size O(Σ f +n logn) and that itsucceeds if ρ ≥ ρmax

f .

4.1. DCMρ : An Approximate Version of DCM

We present our first subroutine DCMρ . Comments to support the approach are in italic andmarked by a ”//” at the beginning.

DCMρ . Let I0 = (− 12 ,

12 ) be the starting interval which, by construction of f , contains all real

roots of f . In a first step, we choose a (ρ + n+ 1)-binary approximation f of f and evaluatef (− 1

2 + x). Then, the resulting polynomial is approximated by a (ρ + 1)-binary approximationfI0 ∈ [ f (− 1

2 + x)]2−ρ−1 and, according to Lemma 1, we have fI0 ∈ [ fI0 ]2−ρ .

DCMρ maintains a list A of active nodes (I, fI ,ρI), where I = (a,b) ⊂ I0 is an interval, fI

approximates fI to ρI bits after the binary point and ρ + 2logw(I) ≤ ρI ≤ ρ . DCMρ eventuallyreturns a list O of tuples (J,sJ,`,sJ,r,BJ), where J = (c,d) is an isolating interval for a root off , sJ,` = sign( f (c)), sJ,r = sign( f (d)) and 0 < BJ ≤ min(| f (c)|, | f (d)|). We initially start withA := {(I0, fI0 ,ρ)} and O := /0. For each active node, we proceed as follows:

(1) Remove (I, fI ,ρI) from A .(2) Compute the polynomials

fI+(x) = fI

(− 1

4n+

(1+

12n

)· x)

and h(x) =n

∑i=0

hixi := (1+ x)n · fI+

(1

1+ x

).

(4.1)

16

(3) If hi >−2n+2−ρI for all i or hi < 2n+2−ρI for all i, do nothing (i.e., I is dicarded).

// A simple computation (see the following Lemma 12 (a)) shows that h is an approxi-mation of fI+,rev(x) = (x+ 1)n fI+(

11+x ) to ρI − n− 2 bits after the binary point. Thus, if

var( f , I+) = 0, all coefficients hi are either smaller than 2n+2−ρI or larger than −2n+2−ρI .Since we want to induce a subtree of the recursion tree TDCM induced by f , we discard I ifall coefficients of h are larger than −2n+2−ρI (or smaller than 2n+2−ρI ).

(4) If there exist hi and h j with hi≤−2n+2−ρI and h j ≥ 2n+2−ρI , consider the test T [( fI)′](0,2),

that is, evaluate t[( fI)′](0,2) = t[( fI)

′,3/2](0,2).

// Due to Lemma 12 (i), it holds that |t[( fI)′](0,2)− t[( fI)

′](0,2)| < n2n+1−ρI . Hence, ifT [( fI)

′](0,2) holds, then t[( fI)′](0,2)>−n2n+1−ρI . Thus, we proceed as follows:

(a) If t[( fI)′](0,2)>−n2n+1−ρI , consider the polynomial

fI(x) := fI(x)+ sign(( fI)′(0)) ·n ·2n+1−ρI · x, (4.2)

// Then, T [( fI)′](0,2) holds and, in particular, fI is monotone on (−2,2).

evaluate

λ− := fI

(− 1

4n

)= fI+(0)−2n−1−ρI (4.3)

λ+ := fI

(1+

14n

)= fI+(1)+(4n+1)2n−1−ρI (4.4)

λ := fI

(−1

n

)= fI

(−1

n

)−2n+1−ρI , (4.5)

and check whether the following conditions are fulfilled:

I = (a, b) = (a− w(I)2n

,b+w(I)2n

) intersects no J for any (J,sJ,`,sJ,r,BJ) ∈ O, (4.6)

λ− ·λ+ < 0, (4.7)

min(|λ−|, |λ+|)> 2n+3−ρI n, and (4.8)

|λ |> 2deg fI+n+7−ρI n2. (4.9)

If any of the conditions (4.6)-(4.9) fails, do nothing. If all conditions are fulfilled, thenadd (I,sign(λ−),sign(λ+),min(|λ−|, |λ+|)−2n+3−ρI n) to O .

// If (4.7)-(4.9) hold, I is isolating for a root ξ of f (Lemma 12 (c)). Furthermore, sincefI is monotone on (−2,2), we have | fI(− 1

2n )| > |λ−| and | fI(1+ 1

2n )| > |λ+|. Then,

inequality (4.8) and Lemma 12 (b) yields that sign( f (a)) = sign(λ−), sign( f (b)) =sign(λ+), and min(| f (a)|, | f (b)|)> min(|λ−|, |λ+|)−2n+3−ρI n.

(b) If t[( fI)′](0,2)≤−n2n+1−ρI , subdivide I into I` := (a,mI) and Ir := (mI ,b). Compute a

ρI-binary approximation fI` of fI(x2 ) and a (ρI−1)-binary approximation fIr of fI(

x+12 ),

and add (I`, fIl ,ρI−1) and (Ir, fIr ,ρI−2) to A . If ρI < 2, return “insufficient precision”.

17

0 1

-1 2-2 -1/n 1+1/n-1/4n 1+1/4n

D (0) 2

roots of fI^

g

U:=U ((0,1))1/n z

Fig. 4.1. If λ− ·λ+ = fI(− 14n ) · fI(1+ 1

4n ) < 0, then there exists a root γ ∈ (− 14n ,1+

14n ) of fI . Furthermore, ∆2(0)

contains no further root of fI . A computation shows that | fI(z)| > | fI(z)− fI(z)| for all z on the boundary of the 1n -

neighborhood U of (0,1) if the inequality (4.9) holds. Then, due to Rouche’s Theorem, U isolates a root of fI .

// Due to Lemma 1, we have fIl ∈ [ fIl ]2−ρI−1 and fIr ∈ [ fIr ]2−ρI−2 . Hence, by induction, itfollows that ρ +2logw(I)≤ ρI ≤ ρ for all active nodes.

DCMρ stops when A becomes empty. It may either return ”insufficient precision” (in Step 4 (b))or a list O of isolating intervals I for some of the roots of f together with the signs of f and alower bound on | f | at the endpoints of I.

Lemma 12. Let f be a polynomial as in (2.1), I = (a,b) an interval considered by DCMρ and hthe polynomial as defined in (4.1). Then,

(a) h(x) ∈ [ fI+,rev]2n+2−ρI and |t[( fI)′](0,2)− t[( fI)

′](0,2)|< n ·2n+1−ρI .

(b) For an arbitrary value t with |t| ≤ 1+ 1n , it holds that | f (a+ t ·w(I))− fI(t)|< 2n+3−ρI n, with

fI as defined in (4.2). In particular,

| f (a+)−λ−|, | f (a− w(I)

n)−λ |, | f (b+)−λ

+|< 2n+3−ρI n,

with λ−, λ+ and λ as defined in (4.3)-(4.5).(c) Suppose that t[( fI)

′](0,2) > −n2n+1−ρI and the inequalities (4.7)-(4.9) hold. Then, I+ con-tains a real root ξ of f and the w(I)

n -neighborhood of I is isolating for ξ .(d) For any tuple (J,sJ,`,sJ,r,BJ)∈O , the endpoints of J are located outside the union of the disks

∆i := ∆ σ(zi , f )64n3

(zi), where i = 1, . . . ,n.

Proof. Since fI ∈ [ fI ]2−ρI , we have fI+ ∈ [ fI+ ]2−ρI+2 due to Lemma 1 (b). Reversing the coeffi-cients and replacing x by x+ 1 increases the error by a factor of at most 2n (see Lemma 1 (c)),thus h ∈ [ fI+,rev]2−ρI+2+n . For the second part of (a), consider the following simple computation:

|t[( fI)′](0,2)− t[( fI)

′](0,2)| ≤ 32·n ·2−ρI

n−1

∑i=0

2i =32·n ·2−ρI (2n−1)< n2n+1−ρI ,

where the first inequality uses ( fI)′ ∈ [( fI)

′]n·2−ρI . For (b), we have

18

| f (a+ t ·w(I))− fI(t)|= | fI(t)− fI(t)| ≤ | fI(t)− fI(t)|+ |t| ·2n+1−ρI n

≤ 2−ρIn

∑i=0|t|i +

(1+

1n

)·2n+1−ρI n

≤ n2−ρI

(1+

1n

)n+1

+n2n+2−ρI < n2n+3−ρI .

Now, if the inequalities (4.7) and (4.8) hold, then sign( f (a+))= sign(λ−), sign( f (b+))= sign(λ+)

and f (a+) · f (b+) < 0, hence, f has a real root in I+. We next show that (4.9) implies theuniqueness of this root. From t[( fI)

′](0,2) > −n2n+1−ρI , it follows that T [( fI)′](0,2) succeeds

and, thus, ∆2(0) contains at most one root of fI . Since λ− = fI(− 14n ) and λ+ = fI(1 + 1

4n )

have different signs, the interval (− 14n ,1 + 1

4n ) contains a root γ of fI . We consider the 1n -

neighborhood U ⊂ C of (0,1) and an arbitrary point z on its boundary; see Figure 4.1. It holdsthat |− 1

n − γ|/|z− γ|< (1+ 54n )/(

14n ) = 4n+5 < 8n and, for any root γ 6= γ of fI , we have

|− 1n − γ||z− γ|

≤|− 1

n − z|+ |z− γ||z− γ|

≤ 1+1+ 2

n

1− 1n

= 21+ 1

2n

1− 1n

.

Hence, it follows that∣∣∣∣ λ

fI(z)

∣∣∣∣=∣∣∣∣∣ fI(− 1

n )

fI(z)

∣∣∣∣∣= |− 1n − γ||z− γ| ∏

γ 6=γ: fI(γ)=0

|− 1n − γ||z− γ|

< (4n+5) ·2deg fI−1(

1+1

2n

)deg fI−1(1− 1

n

)−deg fI+1

< (4n+5) ·2deg fI−1 ·√

2.72 ·2.72 < n2deg fI+4

and, thus, | fI(z)| > |λ | · 2−deg fI−4n−1. Since |z| ≤ 1 + 1n , we have | fI(z)− fI(z)| < n2n+3−ρI

according to (b). Then, from Rouche’s Theorem, it follows that fI has exactly one root within U if(4.9) holds. This shows (c). It remains to prove (d): Let I = (a, b) and I = (a,b) the correspondingsmaller interval. From our construction and (c), I+ contains a root ξ = zi0 of f and the w(I)

n -neighborhood of I is isolating for this root. Thus, |a− zi|> w(I)

4n for all i. If there exists an i 6= i0with a ∈ ∆i, then w(I)< 4n|a− zi|< σ(zi, f )/(16n2). Hence, we obtain

|ξ − zi| ≤ |ξ − a|+ |a− zi|<(

1+1n

)·w(I)+ σ(zi, f )

64n3

<

(1+

1n

)· σ(zi, f )

16n2 +σ(zi, f )

64n3 < σ(zi, f ),

a contradiction. It remains to show that a /∈ ∆i0 . If a ∈ ∆i0 , then w(I) < σ(ξ , f )16n2 . According to

Lemma 8 (b), T [( fJ)′](0,2) already holds for a parent node J of I and, thus, t[( fJ)

′](0,2) >−n2n+1−ρJ because of (a). This contradicts the fact that J is not terminal. In completely analo-gous manner, one shows that b is also not contained in any ∆i. This proves (d). 2

We close this section with a result on the size of the recursion tree induced by DCMρ and thebit complexity of DCMρ :

19

Theorem 13. Let f be a polynomial as in (2.1) and ρ ∈ N an arbitrary positive integer. Then,the recursion tree TDCMρ induced by DCMρ is a subtree of the tree TDCM induced by DCM, thus,

|TDCMρ | ≤ |TDCM|= O(Σ f +n logn).

Furthermore, DCMρ demands for a number of bit operations bounded by

O(n(Σ f +n logn)(nΓ+ρ− logσ f )).

Proof. For the first claim, we remark that DCMρ never splits an interval I which is not splitby DCM when applied to the exact polynomial f . Namely, if I is terminal for DCM, then eithert[( fI)

′](0,2) > 0 or var( f , I+) = var( fI+,rev) = 0. In the first case, we must have t[( fI)′](0,2) >

−n2n+1−ρI whereas, in the second case, all coefficients hi of h(x) = (1+x)n · fI+(1

1+x ) are eitherlarger than −2n+2−ρI or smaller than 2n+2−ρI ; see Lemma 12 (a). Thus, I is terminal for DCMρ

as well. The result on the size of TDCMρ then follows directly from Theorem 11.For the bit complexity, we first consider the cost in each iteration: For an active node (I, fI ,ρI)∈

A , I = (a,b), the polynomial fI approximates fI to ρI ≤ ρ bits after the binary point. The abso-lute value of each coefficient of fI is bounded by 2nΓ because the shift operation x 7→ a+(b−a) ·xdoes not increase the coefficients of f by a factor of more than 2n and the absolute value of thecoefficients of f is bounded by 2nΓ; see Section 2.2. It follows that the bitsize of the coefficientsof fI is bounded by O(nΓ)+ ρ . Hence, the cost for computing h(x), fI` and fIr(x) is boundedby O(n(nΓ+ρ)). Namely, the latter constitutes a bound on the cost for a fast asymptotic Taylorshift by an O(logn)-bit number. The cost for evaluating t[( fI)

′](0,2), λ−, λ+ and λ matches thesame bound because all these computations are evaluations of a polynomial of bitsize O(nΓ+ρ)at an O(logn)-bit number. We further remark that, in each iteration, O contains disjoint isolatingintervals J for some of the real roots of f and, thus, |O| ≤ n. Hence, the endpoints of the interval Jhave to be compared with those of at most n intervals stored in O . Since DCMρ does not produceany interval of size less than σ f

8n2 , these comparisons demand for at most O(n(logn− logσ f )) bitoperations. It follows that the total cost at each node is bounded by O(n(nΓ+ρ − logσ f )) bitoperations. The bound on the total cost then follows from our result on the size of the recursiontree. 2

4.2. Known ρ f and σ f

From Lemma 3 (c), we already know that, for ρ ≥ ρ f , each root zi of f moves by at mostσ(zi, f )

64n3 when passing from f to an arbitrary approximation f ∈ [ f ]2−ρ f ; see Definition 2 for thedefinition of ρ f . Hence, we expect it to be possible to isolate the roots of f by only consider-ing approximations of f (and the intermediate results fI) to ρ f bits after the binary point. Thefollowing theorem proves a corresponding result.

Theorem 14. Let f be a polynomial as in (2.1) and ρ ∈ N an integer with

ρ ≥ ρmaxf :=

⌈ρ f −3logσ f +16n

⌉= O(Σ f +n). (4.10)

Then, DCMρ isolates all real roots of f and BJ > 2−ρ f for all (J,sJ,`,sJ,r,BJ) ∈ O .

Proof. Due to Theorem 11 and 13, the height h(DCMρ) of TDCMρ is bounded by

h(DCMρ)≤ log16n2

σ f= 2logn+4− logσ f ≤ 4n− logσ f .

20

Then, for any interval I = (a,b) produced by DCMρ , we have

ρI ≥ ρ +2logw(I)≥ ρ−2h(DCMρ)≥ ρminf :=

⌈ρ f +8n− logσ f

⌉> 0. (4.11)

The latter inequality guarantees that DCMρ does not return “insufficient precision”. Now let I bean interval whose closure I contains a root ξ = zi0 of f . We aim to show the following facts:

(1) I is not discarded in Step 3 of DCMρ .(2) If t[( fI)

′](0,2)>−n2n+1−ρI , then all inequalities (4.7)-(4.9) are fulfilled.(3) In the latter case, either I = (a− w(I)

2n ,b+ w(I)2n ) is added to O or I only intersects intervals

J, with a corresponding (J,sJ,`,sJ,r,BJ) ∈ O , which already isolate ξ .

If (1)-(3) hold, then DCMρ outputs isolating intervals for all real roots of f . Namely, DCMρ

starts subdividing I0 = (− 12 ,

12 ) which contains all real roots of f . Thus, for each root ξ of f , we

eventually obtain an interval I such that I contains ξ and t[( fI)′](0,2)>−n2n+1−ρI . Then, either

I is added to the list of isolating intervals or O already contains an isolating interval for ξ .For the proof of (1), we have already shown that w(I) > σ(ξ , f )

16n2 . Lemma 3 (c) then ensuresthat an arbitrary g ∈ [ f ]2−ρ f has a root ξ ′ ∈ I+. Namely, the root ξ ∈ I stays real and moves by

at most σ(ξ , f )64n3 < w(I)

4n when passing from f to g. Now, suppose that all coefficients hi of h(x) =(1+ x)n · fI+(

11+x ) are larger than −2n+2−ρI ; see (4.1) for definitions. Since |hi− hi| < 2n+2−ρI

for all coefficients hi of fI+,rev = ∑ni=0 hixi (see Lemma 12 (a)), it follows that hi >−2n+3−ρI for

all i. Hence, for the polynomial

g(x) := f (x)+2n+3−ρI ∈ [ f ]2−ρ f ,

we have gI+,rev(x) = fI+,rev(x)+2n+3−LI (x+1)n and, thus, gI+,rev has only positive coefficients.In the case where hi < 2n+2−ρI for all i, we consider g(x) := f (x)−2n+3−ρI ∈ [ f ]2−ρ f and, thus,gI+,ρ has only negative coefficients. Hence, in both cases, there exists a g ∈ [ f ]2−ρ f which has noroot in I+, a contradiction. It follows that I cannot be discarded in Step 3.

For (2), suppose that t[( fI)′](0,2)>−n2n+1−ρI . Due to Lemma 12 (a), we have t[( fI)

′](0,2)>−n2n+2−ρI , and since log n2n+2−ρI

w(I) ≤ 6+3logn+n−ρI− logσ f <−ρ f , it follows that

g(x) := f (x)+ x · sign(( fI)′(0)) · n2n+2−ρI

w(I)∈ [ f ]2−ρ f .

Hence, g has a root ξ ′ in I+. Since t[(gI)′](0,2) = t[( fI)

′](0,2)+n2n+2−ρI > 0, the disk ∆2w(I)(a)is isolating for ξ ′. The following argument shows that ∆ 3w(I)

2(a) isolates ξ : Suppose that ∆ 3w(I)

2(a)

contains an additional root z j 6= ξ of f . Then, σ(ξ , f ) < 3w(I) and, thus, ξ and z j would moveby at most 3w(I)

64n3 < w(I)2 when passing from f to g. It follows that g would have at least two roots

within ∆2w(I)(a), a contradiction. Now, since ∆ 3w(I)2

(a) is isolating for ξ ∈ I, we have

σ(ξ , f )16n2 < w(I)< 2σ(ξ , f ).

The left inequality implies that the distance of ξ to any of the points a+ = a− w(I)4n , b+ = b+ w(I)

4n

and c := a− w(I)n is larger than or equal to w(I)

4n > σ(ξ , f )64n3 . Let di denote the distance between a

root zi 6= ξ and the disk ∆ 3w(I)2

(a). Then,

σ(zi, f )64n3 ≤ |zi−ξ |

64n3 ≤di +3w(I)

64n3 < di +w(I)

4.

21

It follows that the points a+, b+, c ∈ ∆ 5w(I)4

(a) are located outside the disk ∆i := ∆ σ(zi , f )64n3

(zi). In

summary, none of the disks ∆i, i = 1, . . . ,n, contains any of the points a+, b+ and c. Hence,due to Lemma 3 (c), it follows that each of the values | f (c)|, | f (a+)| and | f (b+)| is larger than(n+1)2−ρ f . A simple computation now shows that (n+1)2−ρ f > 22n+8−ρI n2. Thus, accordingto Lemma 12 (b), each of the absolute values |λ |, |λ−| and |λ+| is larger than

(n+1)2−ρ f −2n+3−ρI n > 22n+8−ρI n2−2n+3−ρI n > 22n+7−ρI n2. (4.12)

It follows that the inequalities (4.8) and (4.9) hold. Since I+ is isolating for ξ , f (a+) and f (b+)must have different signs and, thus, the same holds for λ− and λ+. Hence, the inequality (4.7)holds as well. In addition, we have BI = min(|λ−|, |λ+|)−2n+3−ρI n > 2−ρ f because of (4.12). Itremains to show (3): If t[( fI)

′](0,2)>−n2n+1−ρI , then due to (2) and Lemma 12 (b), the intervalI and the w(I)

n -neighborhood of I is isolating for ξ . If I does not intersect any other interval inO , then I is added to O and, thus, DCMρ outputs an isolating interval for ξ . We still have toconsider the case where I intersects an interval J from O . From the construction of O , J is theextension (c, d) of an interval J′ = (c,d). Now, suppose that J is isolating for a root γ 6= ξ . Theroots ξ and γ move by at most w(I)

4n and w(J′)4n , respectively, when passing from f to an arbitrary

g ∈ [ f ]2−ρ f (see the proof of (1)). Hence, it follows that the union of (a−w(I),b+w(I)) and(c−w(J′),d+w(J′)) contains at least two roots of any g∈ [ f ]2−ρ f . Due to Lemma 10, one of thedisks ∆2w(I)(a) or ∆2w(J′)(c) then also contains at least two roots of g contradicting the fact that

t[(pI)′](0,2) > 0 for p(x) := f (x)+ x · sign(( fI)

′(0)) · n2n+2−ρIw(I) ∈ [ f ]2−ρ f and t[(qJ′)

′](0,2) > 0

for q(x) := f (x)+ x · sign(( fI)′(0)) · n2n+2−ρJ′

w(J′) ∈ [ f ]2−ρ f . It follows that J already isolates ξ . 2

4.3. Unknown ρ f and σ f

For unknown ρ f and σ f , we proceed as follows: We start with an initial precision ρ (e.g.ρ = 16) and run DCMρ . If DCMρ returns ”insufficient precision”, we double ρ and start over.Otherwise, DCMρ returns a list O = {(Jk,sk,`,sk,r,Bk)}k=1,...,m, where each interval Jk = (ck,dk)isolates a real root of f , sk,` = sign f (ck), sk,r = sign f (dk) and 0 < Bk < min(| f (ck)|, | f (dk)|). Asalready mentioned, there is no guarantee that all roots of f are captured. Hence, in a second step,we use the subsequently described method CERTIFYρ to check whether the region of uncertainty

R :=[−1

2,

12

]\

m⋃k=1

Jk

contains a root of f . If we can guarantee that f (x) 6= 0 for all x ∈ R, we return the list L ={Jk}k=1,...,m of isolating intervals. Otherwise, we double ρ and start over the entire algorithm. Wehave already proven in Theorem 14 that DCMρ isolates all real roots of f if ρ ≥ ρmax

f (i.e., ρ ful-fills the inequality (4.10)). The following considerations will show that, for ρ ≥ ρmax

f , CERTIFYρ

succeeds as well.

How can we guarantee that f does not vanish on R? The crucial idea is to consider a decom-position of [− 1

2 ,12 ] into subintervals I and corresponding µI-approximations g of fI such that g

is monotone on [0,1] or T [g](0,1) holds. Namely, for such an interval I, we can easily estimatethe image g([0,1]) and, thus, conclude that f either contains no root in I ∩R or that ρ < ρmax

fbecause g(t) and fI(t) differ by at most (n+1)µI for all t ∈ [0,1]. More precisely, we have:

22

-1/2 1/2c d c d c d

-1/2

1 1 2 2 m mJ1 J2 Jm

I

>+B

a b

a

c1 d1 c2

c1

d1 c2L1 L2

1 <-B1 >+B<-B2 2 >+Bm <-Bm

l( )

c1

l( )a d1l( ) c2l( )

>+B1 <-B1 <-B2

Fig. 4.2. DCMρ returns a list O = {Jk,sk,`,sk,r,Bk}k , where Jk is isolating for a real root of f , sk,` = sign f (ck),sk,r = sign f (dk) and min(| f (ck)|, | f (dk)|) > Bk > 0. The intervals in between define the region of uncertainty R. InCERTIFYρ , we subdivide (−1/2,1/2) into intervals I such that, for a µ-approximation g of fI , either T [g](0,1) holdsor g is monotone on (0,1). If T [g](0,1) holds and |g(0)| > 8mµ , then I contains no root of f ; see Lemma 15 (a). If gis monotone on (0,1), we consider all intervals Li in the intersection of I with R and check whether the conditions inLemma 15 (b) are fulfilled. If they are fulfilled, then f has no root in Li; otherwise, we must have ρ < ρmax

f .

Lemma 15. Let I = (a,b) be an interval and g(x) a µ-binary approximation of fI with

− log µ ≥ ρ−2(4n− logσ f ). (4.13)

(a) Suppose that T [g](0,1) holds and I is not entirely contained in one of the Jk. If

|g(0)|> 8nµ, (4.14)

then I contains no root of f . Otherwise, ρ < ρmaxf .

(b) Suppose that g is monotone on [0,1] and let I ∩R =⋃s

i=1 Li be the intersection of I and R.For each endpoint q of an arbitrary Li, we define

λ (q) :=

sk,` ·Bk, if q /∈ {a,b} and q is the left endpoint of an interval Jk

sk,r ·Bk, if q /∈ {a,b} and q is the right endpoint of an interval Jk

g(0), if q = ag(1), if q = b.

(4.15)

If, for all Li = [q`,qr], min(|λ (q`)|, |λ (qr)|)> 4nµ and λ (ql) ·λ (qr)> 0, then I∩R containsno root of f . Otherwise, we have ρ < ρmax

f .

Proof. If T [g](0,1) holds, then 13 |g(0)|< |g(t)|<

53 |g(0)| for all t ∈ [0,1] according to Lemma 6.

Suppose that g(0)> 8nµ , then it follows that

| fI(t)| ≥ |g(t)|− |g(t)− fI(t)|>13|g(0)|− |g(t)− fI(t)|>

83

nµ− (n+1)µ > 0,

hence, f has no root in I. Now suppose that |g(0)| ≤ 8nµ . Since I is not contained in any Jk,there exists a t ∈ [0,1] with x = a+ t(b− a) ∈ R and | f (x)| = | fI(t)| ≤ |g(t)|+ (n+ 1)µ ≤53 |g(0)|+(n+1)µ < 16nµ . If ρ ≥ ρmax

f , then from (4.13) and the definition of ρmaxf , it follows

that− log µ ≥ ρminf =

⌈ρ f +8n− logσ f

⌉; see the computation in (4.11). Hence, we have | f (x)|<

2−ρ f . In addition, Lemma 12 (d) and Theorem 14 guarantee that DCMρ returns isolating intervalsfor all real roots of f , and each point in R has distance≥σ(zi, f )/(64n3) from each root zi. Thus,| f (x)|> (n+1)2−ρ f due to Lemma 3 (c), a contradiction. This proves (a).

23

For (b), we consider an arbitrary interval Li = [q`,qr]. Let t` and tr be corresponding values in[0,1] with ql = a+ t` ·w(I) and qr = a+ tr ·w(I). If min(|λ (q`)|, |λ (qr)|)> 4nµ , then

min(|g(t`)|, |g(tr)|)≥min(|λ (q`)|, |λ (qr)|)− (n+1)µ > 2nµ.

Namely, for q` = a, we obviously have |g(t`)|= |λ (q`)|; otherwise, |g(t`)| ≥ | fI(t`)|−(n+1)µ ≥|λ (q`)| − (n+ 1)µ . For qr, an analogous argument applies. If, in addition, λ (q`) · λ (qr) > 0,then g(t`) · g(tr) > 0 as well because λ (q`) and λ (qr) have the same sign as g(t`) and g(tr),respectively. Since we assumed that g is monotone on [0,1], it follows that |g(t)| > 2nµ for allt ∈ [t`, tr]. This shows that | fI(t)| ≥ |g(t)|− (n+ 1)µ > 0 for all t ∈ [t`, tr], thus the first part of(b) follows. For the second part, suppose that ρ ≥ ρmax

f . Then, Bk > 2−ρ f > 4nµ for all k and| f (x)| > 2−ρ f (n+ 1) for all x ∈R according to Lemma 3 (c) and Theorem 14. Thus, if a ∈R,we have

|g(0)| ≥ | fI(0)|− (n+1)µ = | f (a)|− (n+1)µ > 2−ρ f · (n+1)− (n+1)µ > 4nµ.

An analogous argument applies to b. It follows that |λ (q)| > 4nµ for all endpoints q of an ar-bitrary interval Li = [q`,qr]. It remains to show that λ (q`) ·λ (qr) > 0. We have already shownthat |λ (q)| > 4nµ for each endpoint q, thus, f (q) must have the same sign as λ (q). Namely,if q ∈ {a,b}, then f (q) differs from λ (q) > 4nµ by at most (n + 1)µ < 4nµ , and, for q /∈{a,b}, we have sign(λ (q)) = sk,` or sign(λ (q)) = sk,r depending on whether q is the left orthe right endpoint of an interval Jk. Since ρ ≥ ρmax

f , R contains no root of f , thus, we must haveλ (q`) ·λ (qr) = f (q`) · f (qr)> 0. 2

We can now formulate the subroutine CERTIFYρ (see Algorithm 3 in the Appendix for pseudo-code). CERTIFYρ is similar to DCMρ in the sense that we recursively subdivide I0 = (− 1

2 ,12 ) into

intervals I and consider corresponding ρI-binary approximations fI of fI . Then, in each iteration,we aim to apply Lemma 15 in order to certify that I ∩R contains no root of f or ρ < ρmax

f .Throughout the following consideration, we assume that

CERTIFYρ never produces an interval I of width w(I)≤σ f

8n2 . (4.16)

We will prove this fact in Theorem 16 (b). Again, we mark comments which should help tofollow the approach by an ”//” at the beginning.

CERTIFYρ . In a first step, we choose a (ρ + n+ 1)-binary approximation f of f and evaluatef (− 1

2 + x). Then, the resulting polynomial is approximated by a (ρ + 1)-binary approximationfI0 ∈ [ f (− 1

2 + x)]2−ρ−1 , thus, fI0 ∈ [ fI0 ]2−ρ according to Lemma 1.

CERTIFYρ maintains a list A of active nodes (I, fI ,ρI), where I = (a,b) ⊂ I0 is an interval,fI approximates fI to ρI bits after the binary point and ρ +2logw(I)≤ ρI ≤ ρ . We initially startwith A := {(I0, fI0 ,ρ)}. For each active node, we proceed as follows:

(1) Remove (I, fI ,ρI) from A .(2) If I∩R = /0, do nothing (i.e., discard I). Otherwise, compute t[ fI ](0,1).

// If I∩R = /0, I is contained in one of the intervals Jk, and thus we can discard I.

24

(3) If t[ fI ](0,1)>−2−ρI+2n, check whether

| fI(0)+ sign( fI(0)) ·2−ρI+2n|> 2−ρI+6n2. (4.17)

If (4.17) holds, do nothing (i.e., discard I); otherwise, return ”insufficient precision”.

// For g(x) := fI(x)+ sign( fI(0)) · 2−ρI+2n ∈ [ fI ]2−ρI+3n, the predicate T [g](0,1) holds.From our assumption on w(I), we further have

ρI−3− logn≥ ρ +2logw(I)−3− logn≥ ρ−2(3+2logn− logσ f )−3− logn≥ ρ−2logσ f −8n,

and thus 2−ρI+3n ≤ 2−ρ−2(4n−logσ f ). It follows that g fulfills the condition (4.13) fromLemma 15 and, therefore, I contains no root of f if (4.17) holds; otherwise, ρ < ρmax

f .

(4) If t[ fI ](0,1) ≤ −2−ρI+2n, compute h(x) = ∑ni=0 hixi := (1+ x)n · ( fI)

′( 11+x ) and consider

the following distinct cases:

(a) If hi >−n2n−ρI for all i (or hi < n2n−ρI for all i), consider

g(x) := fI(x)+n2n−ρI · x ∈ [ fI ]n2n+1−ρI

(g(x) := fI(x)− n2n−ρI · x, respectively). Then, for each interval Li = [q`,qr], deter-mine λ (q`) and λ (qr) as defined in (4.15). If min(|λ (q`)|, |λ (qr)|) > n2n+3−ρI andλ (q`) ·λ (qr)> 0 for all Li, discard I; otherwise, return ”insufficient precision”.

// Suppose hi >−n2n−ρI for all i and define g(x) := fI(x)+n2n−ρI x. Then, the polynomial(1+x)n ·(g)′( 1

1+x ) = (1+x)n ·( fI)′( 1

1+x )+n2n−ρI (1+x)n has only positive coefficients.It follows that var(g′,(0,1)) = 0 and, therefore, g is monotone on [0,1]. In addition, fromour assumption on w(I), we have n2n+1−ρI ≤ 2−ρ−2(4n−logσ f )+1 (see the computationabove). Hence, we can apply Lemma 15 (b) to g: That is, I ∩R contains no root of fif min(|λ (q`)|, |λ (qr)|) > n2n+3−ρI and λ (q`) ·λ (qr) > 0 for all Li = [ql ,qr]. If one ofthe latter two inequalities does not hold, then ρ < ρmax

f . The case hi < n2n−ρI for all i istreated in exactly the same manner.

(b) If there exist hi and h j with hi ≤ −n2n−ρI and h j ≥ n2n−ρI , then I is subdivided intoI` := (a,mI) and Ir := (mI ,b). We add (I`, fI` ,ρI − 1) and (Ir, fIr ,ρI − 2) to A , wherefI` is an ρI-binary approximation of fI(

x2 ) and fIr an (ρI − 1)-binary approximation of

fI(x+1

2 ); see Step 4 (b) of DCMρ for details. If ρI < 2, return ”insufficient precision”.

// Due to Lemma 1, fIl ∈ [ fIl ]2−ρI−1 and fIr ∈ [ fIr ]2−ρI−2 . Hence, by induction, it followsthat ρ +2logw(I)≤ ρI ≤ ρ for all active nodes.

CERTIFYρ stops when A becomes empty. If CERTIFYρ returns ”insufficient precision”, we knowfor sure that σ < σmax

f . Otherwise, the region of uncertainty R contains no root of f .

The following theorem proves that our assumption (4.16) for the intervals produced by CERTIFYρ

is correct. Furthermore, we show that CERTIFYρ is also efficient with respect to bit complexitymatching the worst case bound obtained for DCMρ ; see Theorem 13.

25

Theorem 16. For a polynomial f as defined in (2.1) and an arbitrary ρ ∈ N,(a) CERTIFYρ does not produce an interval I of width w(I)≤ σ f /(8n2) and it induces a recursion

tree of size O(Σ f +n logn).(b) CERTIFYρ needs no more than O(n(Σ f +n logn)(nΓ+ρ− logσ f )) bit operations.(c) For ρ ≥ ρmax

f , CERTIFYρ succeeds.

Proof. An interval I is only subdivided if t[ fI ](0,1) ≤ −2−ρI+2n (Step (3)) and if there existcoefficients hi and h j of h(x) = ∑

ni=0 hixi = (1+ x)n · ( fI)

′( 11+x ) with hi < −n2n−ρI and h j >

n2n−ρI (Step 4 (b)). In the first case, we must have t[ fI ](0,1)< 0 since |t[ fI ](0,1)− t[ fI ](0,1)|<2−ρI+2n. Hence, T [ fI ](0,1) does not hold. For the second case, we have var(( fI)

′,(0,1)) 6= 0since corresponding coefficients of h(x) = (1+x)n · ( fI)

′( 11+x ) and (1+x)n · ( fI)

′( 11+x ) differ by

at most n2n−ρI , and thus var( f ′, I) = var(( f ′)I ,(0,1)) = var(( fI)′,(0,1)) 6= 0. Hence, the first

part of (a) follows from Lemma 8 (d). This proves that our assumption (4.16) is always fulfilled,and thus Lemma 15 applies in Step 3 and Step 4 (a) of CERTIFYρ . It follows that the algorithmonly fails (i.e. it returns ”insufficient precision”) if ρ < ρ

fmax. For the second part of (a), we

remark that, due to the above argument, an interval I is terminal if the disk ∆2nw(I)(m(I)) doesnot contain a root ξ of f with σ(ξ , f ) < 4n2w(I). In [35, Section 4.2], it has been shown thatthe recursion tree T ( f ′) induced by the latter property 14 has size O(Σ f + n logn). Hence, thesame holds for the recursion tree induced by CERTIFYρ which is a subtree of T ( f ′). Finally, (c)follows in completely analogous manner as the result on the bit complexity for DCMρ as shownin the proof of Theorem 13. 2

Eventually, we present our overall root isolation method RISOLATE. It applies to a polynomialF as given in (1.4) and returns isolating intervals for all real roots of F .

RISOLATE: Choose a starting precision ρ ∈N (e.g., ρ = 16) and run DCMρ on the polynomialf as defined in (2.1). If DCMρ returns “insufficient precision”, we double ρ and start over again.Otherwise, DCMρ returns a list O = {(Jk,sk,`,sk,r,Bk)}k=1,...,m with isolating intervals Jk forsome of the real roots of f . If CERTIFYρ returns “insufficient precision”, we double ρ and startover the algorithm. If CERTIFYρ succeeds, the intervals Jk = (ck,dk) isolate all real roots of f .Hence, we return the intervals (2Γck,2Γdk), k = 1, . . . ,m, which isolate the real roots of F .

The following theorem summarizes our results:

Theorem 17. Let F be a polynomial as given in (1.4). Then, RISOLATE determines isolatingintervals J1, . . . ,Jm for all real roots of F and, for each interval J ∈ {J1, . . . ,Jm} containing aroot ξ of F, it holds that

σ(ξ ,F)

16n2 < w(J)< 2nσ(ξ ,F).

14 In [35, Section 4.2], T ( f ′) is defined as subdivision tree obtained by recursive bisection of the interval (− 14 ,

14 ) in

accordance with the following rule: At depth h ∈ N0, an interval I = (− 14 + i2−h−1,− 1

4 +(i+1)2−h−1) is subdivided ifand only if var( f ′, I) 6= 0 and ∆28n5w(I)(m(I)) contains a root ξ of f with separation σ(ξ , f ) < 27n5w(I). For the given

situation, we can alternatively define T ( f ′) as the (even smaller) tree obtained by recursive bisection of (− 12 ,

12 ), where

an interval I is subdivided if var(I, f ′) 6= 0 and ∆2nw(I)(m(I)) contains a root ξ of f with σ(ξ , f )< 4n2w(I).

26

RISOLATE demands for coefficient approximations of F to O(ΣF + nΓF) bits after the binarypoint and the total cost is bounded by

O(n(ΣF +nΓF)2) = O(n(ΣF +nτ)2)

bit operations. For F ∈ Z[x] a polynomial with integer coefficients of bitsize τ , RISOLATE com-putes isolating intervals with O(n3τ2) bit operations.

Proof. It remains to prove the complexity bounds and the claim on the width of the isolatingintervals. According to Appendix 6.1, the computation of an approximate logarithmic root boundΓ ∈ N as defined in Section 2.2 needs O(n2ΓF) bit operations. For a fixed precision ρ , the totalcost for running DCMρ and CERTIFYρ is bounded by

O(n(Σ f +n logn)(nΓ+ρ− logσ f )) = O(n(ΣF +nΓ)(nΓ+ρ− logσF))

bit operations; see Theorem 13 and Theorem 16. Since we double ρ in each step and succeedfor ρ ≥ ρmax

f , ρ is always bounded by 2ρmaxf = O(Σ f + n) = O(ΣF + nΓ) and we need at most

a logarithmic number of rounds. Hence, it follows that (up to logarithmic factors) the total costis dominated by the cost for the last run which is O(n(ΣF + nΓ)2). Furthermore, we have toapproximate the coefficients of f to O(Σ f +n) = O(ΣF +nΓ) bits after the binary point. Hence,the coefficients of F have to be approximated to O(ΣF + nΓ) bits after the binary point; seeSection 2.3 for more details. From our construction of f and Γ, it holds that Γ < 8logn+1+ΓF .Hence, we can replace Γ by ΓF in the above complexity bounds.

For the special case where F = An · xn + · · ·+A0 is a polynomial with integer coefficients ofbit size τ , we first divide F by its leading coefficient to meet the requirements in (1.4). Then, thebound on the bit complexity follows from the above general bound (applied to F(x)/An) and thefact that ΣF = O(nτ); see Appendix 6.2.

The estimate on the size of the isolating intervals is due to the following consideration: An in-terval I which contains the root z= ξ

2Γ+1 of f is not subdivided by DCMρ if w(I)≤ σ(z, f )/(8n2).Hence, any interval Jk which is returned by DCMρ as an isolating interval for z is the ex-tension I = (a− w(I)

2n ,b+ w(I)2n ) of an interval I = (a,b) with w(I) > σ(z, f )/(16n2), and thus

w(J) = 2Γ+1w(I)> σ(ξ ,F)/(16n2). From our construction, the w(I)n -neighborhood of I isolates

z as well. This shows that w(J) = 2Γ+1w(I)< 4Γnσ(zi, f ) = 2nσ(ξ ,F). 2

4.4. Some Remarks

4.4.1. On the Complexity Analysis for Integer PolynomialsWe remark that, in order to achieve the complexity bound O(n3τ2) for integer polynomials, the

subroutine CERTIFYρ and its analysis is actually not needed. Namely, due to our considerationsin Appendix 6.2, we can compute explicit upper bounds for Σ f (in terms of n and τ) and, thus,also an explicit upper bound ρ∗(n,τ) for ρmax

f which matches ρmaxf at least with respect to worst

case complexity. Then, Theorem 14 guarantees that DCMρ∗(n,τ) computes isolating intervals forall real roots of f . Unfortunately, this approach cannot be considered practical at all because suchupper bounds usually tend to be much larger than the actual ρmax

f . We would like to emphasizeon the fact that our algorithm is output sensitive in the way that it demands for a precision whichis not much larger than ρ f .

At this point, we even conjecture that, for any bisection method, the bound on the bit com-plexity as achieved by our algorithm is optimal (up to log-factors). We are not aware of any lower

27

bounds for the bit complexity of root isolation algorithms, and we can also not provide a rigor-ous proof for the optimality of our bound. However, the intuition behind our claim is that, for aMignotte polynomial F , the bounds on the precision demand and the size of the recursion treeseem to be optimal. Namely, any bisection algorithm needs Θ(nτ) steps to separate two roots ofF with pairwise distance 2−Θ(nτ). In addition, if we perturb the coefficients of F by more than2−Θ(nτ), the two ordinary roots can move by more than their initial separations, hence it seemsthat a precision of size Θ(nτ) is needed to isolate these roots from each other.

We finally remark that, in practice, it may be advantageous to start with the exact represen-tation of the rational polynomial F/An and not with an (artificial) approximation to a certainnumber ρ of bits. In this case, we propose to use the classical Descartes method with exactarithmetic for the first iterations, and only if the exact representation of the polynomials con-structed during the subdivision process exceeds the given working precision ρ , we switch to ourmodified Descartes method, where approximations are considered. However, as already arguedabove, we do not expect that this approach yields any improvement with respect to worst case bitcomplexity.

4.4.2. On Efficient ImplementationWe formulated our algorithm in a way to make it accessible to the complexity analysis but still

feasible and efficient for an implementation. Nevertheless, we recommend to consider a slightmodification of our algorithm when actually implementing it.

For our certification step CERTIFYρ , the most obvious modification is to only subdivide theregion R instead of the entire interval (− 1

2 ,12 ). More precisely, R decomposes into intervals L j

”in between” the isolating intervals Jk. Then, we approximate the polynomials fL j to ρ bits afterthe binary point and recursively proceed each L j in a similar way as proposed in CERTIFYρ .An experimental implementation of our algorithm in MAPLE has shown that, following thisapproach, the running time for the certification step is almost negligible, whereas, for the originalformulation, it is approximately of the same magnitude as the running time for DCMρ .

Furthermore, we propose to also use the inclusion predicate based on Descartes’ Rule ofSigns. With respect to complexity, our inclusion predicate based on the T ′[3/2](·)-test (seeCorollary 7) is comparable to Descartes’ Rule of Signs, where we check whether f has exactlyone sign variation for a certain interval. However, in practice, this subtle difference is crucialbecause already logn bisection steps more for each root may render an algorithm inefficient. Asan alternative, for an interval I, we propose to check whether there exists a ”good” approximationg of fI with var(g,(0,1)) = 1. 15 Namely, if there exists such a g, we can proceed with fI := gwhich has exactly one root in I. Thus, it is easy to refine I (via simple bisection or quadraticinterval refinement) such that T [g′](0,2) holds as well.

We finally report on an interesting behavior of the proposed method. It is easy to see that,for small intervals I = (a,b), the leading coefficients of fI(x) = f (a+w(I) · x) are considerablysmaller than the first-order coefficients. Since we only consider a certain number ρI ≤ ρ of bitsafter the binary point, the approximations fI can be chosen of a considerably lower degree thanfI . As a consequence, the cost at such an interval is considerably reduced because we have tocompute the polynomial fI+ = fI(

14n + (1 + 1

2n ) · x) which is expensive for large degrees. In

15 Since fI approximates fI to ρI bits after the binary point, we can derive an interval approximation [ fI,rev] := [a−0 ,a+0 ]+

· · · [a−n ,a+n ] ·xn of fI,rev to ρI−n bits after the binary point. We can then easily check whether there exists a polynomial h∈[ fI,rev] with var(h)= 1, and transform h back to g(x)= (x−1)n ·h(1/(x−1)). The so-obtained polynomial g approximatesfI to at least ρI − 2(n+ 1) bits after the binary point and var(g, [0,1]) = 1. Notice that, following this approach, theprecision ρI has to be updated accordingly.

28

particular, for a polynomial with two very nearby roots (such as Mignotte polynomials), thisbehavior can be clearly observed. More precisely, when refining an interval I that contains twonearby roots, the degree of fI decreases in each bisection step and eventually equals 2 for I smallenough. We consider this behavior as quite natural because fI implicitly captures the informationon the location of the roots in a neighborhood of I, whereas the influence of all other rootsbecomes almost negligible.

5. Conclusion

We presented a novel deterministic algorithm to isolate the real roots of a square-free poly-nomial F with arbitrary real coefficients. Our analysis shows that the hardness of isolating thereal roots exclusively depends on the location of the roots and not on the coefficient type of F .Furthermore, the overall running time is significantly reduced by considering approximations ateach node of the recursion tree. In particular, for integer polynomials, we achieve an improve-ment with respect to worst case bit complexity by a factor n = degF compared to other practicalmethods such as the Descartes method, the continued fraction method, or the Sturm method. Theimprovement stems from the fact that exact arithmetic produces too much information for thetask of root isolation and, thus, a significant overhead of computation. Hence, for the main part,we consider our result to be the missing theoretical proof of a fact that has already been observedin practice, namely, that using approximate but certified arithmetic instead of exact arithmeticyields a significant improvement. We are confident that this result does not only hold for theDescartes method but also for a majority of the known real roots solvers and encourage otherresearchers to develop corresponding approximate variants.

Very recent work [37] shows how to combine the Descartes method with Newton iteration toimprove upon the linear convergence of the original Descartes method. The algorithm NEWDSCfrom [37] can only be used to isolate the real roots of an integer polynomial F , and it achieves aworst case bit complexity of O(n3τ). Hence, the major remaining research question is whethercombining the two approaches, that is, using approximate arithmetic as proposed in this paperand using Newton iteration as proposed in [37], yields a practical algorithm that achieves recordcomplexity bounds comparable to the bounds achieved by Pan’s method.

29

References

[1] A. G. Akritas. There is no ”Descartes’ method”. In Computer Algebra in Education, pages19–35, 2008.

[2] A. G. Akritas and A. Strzebonski. A comparative study of two real root isolation methods.Nonlinear Analysis:Modelling and Control, 10(4):297–304, 2005.

[3] A. Alesina and M. Galuzzi. A new proof of Vicent’s theorem. L’Enseignement Mathema-tique, 44:219–256, 1998.

[4] A. Alesina and M. Galuzzi. Addendum to the paper ”A new proof of Vicent’s theorem”.L’Enseignement Mathematique, 45:379–380, 1999.

[5] A. Alesina and M. Galuzzi. Vincent’s theorem from a modern point of view. CategoricalStudies in Italy 2000, Rendiconti del Circolo Matematico di Palermo, Serie II, 64:179–191,2000.

[6] S. Basu, R. Pollack, and M.-F. Roy. Algorithms in Real and Algebraic Geometry. Springer,2nd edition, 2006.

[7] E. Berberich, P. Emeliyanenko, and M. Sagraloff. An elimination method for solving bi-variate polynomial systems: Eliminating the usual drawbacks. In ALENEX: Algorithm En-gineering and Experiments, pages 35–47, 2011.

[8] D. Bini and G. Fiorentino. Design, analysis and implementation of a multiprecision poly-nomial rootfinder. Numerical Algorithms, 23:127–173, 2000.

[9] M. Burr and F. Krahmer. SqFreeEVAL: An (almost) optimal real-root isolation algorithm.Journal of Symbolic Computation, 47(2):153–166, 2012.

[10] G. E. Collins and A. G. Akritas. Polynomial real root isolation using descarte’s rule ofsigns. In SYMSAC: Symposium on Symbolic and Algebraic Computation, pages 272–275,1976.

[11] Z. Du, V. Sharma, and C. Yap. Amortized bounds for root isolation via Sturm sequences.In SNC: Symbolic and Numeric Computation, pages 113–130. 2007.

[12] A. Eigenwillig. On multiple roots in Descartes’ rule and their distance to roots of higherderivatives. Journal of Computational and Applied Mathematics, 200(1):226–230, 2007.

[13] A. Eigenwillig. Real Root Isolation for Exact and Approximate Polynomials usingDescartes’ Rule of Signs. PhD thesis, Universitat des Saarlandes, May 2008.

[14] A. Eigenwillig, L. Kettner, W. Krandick, K. Mehlhorn, S. Schmitt, and N. Wolpert. Anexact descartes algorithm with approximate coefficients. In CASC: Computer Algebra inScientific Computing, pages 138–149, 2005.

[15] A. Eigenwillig, V. Sharma, and C. K. Yap. Almost tight recursion tree bounds for thedescartes method. In ISSAC: International Symposium on Symbolic and Algebraic Compu-tation, pages 71–78, 2006.

[16] J. Gerhard. Modular algorithms in symbolic summation and symbolic integration. LNCS,Springer, 3218, 2004.

[17] X. Gourdon. Combinatoire, Algorithmique et Geometrie des Polynomes. These, Ecolepolytechnique, 1996.

[18] M. Hemmer, E. P. Tsigaridas, Z. Zafeirakopoulos, I. Z. Emiris, M. I. Karavelas, andB. Mourrain. Experimental evaluation and cross-benchmarking of univariate real solvers.In SNC: Symbolic Numeric Computation, pages 45–54, 2009.

[19] J. R. Johnson and W. Krandick. Polynomial real root isolation using approximate arith-metic. In ISSAC: International Symposium on Symbolic and Algebraic Computation, pages225–232, 1997.

30

[20] M. Kerber and M. Sagraloff. Efficient real root approximation. In ISSAC: InternationalSymposium on Symbolic and Algebraic Computation, pages 209–216, 2011.

[21] M. Kerber and M. Sagraloff. A worst-case bound for topology computation of algebraiccurves. Journal of Symbolic Computation, 47(3):239–258, 2012.

[22] W. Krandick and K. Mehlhorn. New bounds for the Descartes method. Journal of SymbolicComputation, 41(1):49–66, 2006.

[23] J. McNamee and V. Pan. Numerical Methods for Roots of Polynomials -. Number 2 inStudies in Computational Mathematics. Elsevier Science, 2013.

[24] K. Mehlhorn and S. Ray. Faster algorithms for computing Hong’s bound on absolute posi-tiveness. Journal of Symbolic Computation, 45(6):677–683, 2010.

[25] K. Mehlhorn and M. Sagraloff. A deterministic algorithm for isolating real roots of a realpolynomial. Journal of Symbolic Computation, 46(1):70–90, 2011.

[26] K. Mehlhorn, M. Sagraloff, and P. Wang. From approximate factorization to root isolation.In ISSAC: International Symposium on Symbolic and Algebraic Computation, pages 283–290, 2013.

[27] B. Mourrain, F. Rouillier, and M.-F. Roy. Bernsteins basis and real root isolation. Combi-natorial and Computational Geometry, 52:459–478, 2005.

[28] N. Obreschkoff. Uber die Wurzeln von algebraischen Gleichungen. Jahresbericht derDeutschen Mathematiker-Vereinigung, 33:52–64, 1925.

[29] N. Obreschkoff. Verteilung und Berechnung der Nullstellen reeller Polynome. VEBDeutscher Verlag der Wissenschaften, 1963.

[30] N. Obreschkoff. Zeros of Polynomials. Marina Drinov, Sofia, 2003. Translation of theBulgarian original.

[31] A. M. Ostrowski. Note on Vincent’s theorem. Annals of Mathematics, Second Series,52(3):702–707, 1950. Reprinted in: Alexander Ostrowski, Collected Mathematical Papers,vol. 1, Birkhauser Verlag, 1983, pp. 728–733.

[32] V. Y. Pan. Solving a polynomial equation: some history and recent progress. SIAM Review,39(2):187–220, 1997.

[33] V. Y. Pan. Univariate polynomials: Nearly optimal algorithms for numerical factorizationand root-finding. Journal of Symbolic Computation, 33(5):701–733, 2002.

[34] F. Roullier and P. Zimmermann. Efficient isolation of a polynomial’s real roots. Journal ofComputational and Applied Mathematics, pages 33–50, 2004.

[35] M. Sagraloff. A general approach to isolating roots of a bitstream polynomial. Mathematicsin Computer Science, 4(4):481–506, 2010.

[36] M. Sagraloff. On the complexity of real root isolation. CoRR, abs/1011.0344, 2010.[37] M. Sagraloff. When Newton meets Descartes: a simple and fast algorithm to isolate the

real roots of a polynomial. In ISSAC: International Symposium on Symbolic and AlgebraicComputation, pages 297–304, 2012.

[38] M. Sagraloff and C.-K. Yap. A simple but exact and efficient algorithm for complex rootisolation. In ISSAC: International Symposium on Symbolic and Algebraic Computation,pages 353–360, 2011.

[39] A. Schonhage. The fundamental theorem of algebra in terms of computational complexity,1982. Manuscript, Department of Mathematics, University of Tubingen. Updated 2004.

[40] A. Schonhage. Quasi-GCD computations. Journal of Complexity, 1(1):118–137, 1985.[41] V. Sharma. Complexity of real root isolation using continued fractions. Theoretical Com-

puter Science, 409(2):292–310, 2008.

31

[42] A. W. Strzebonski and E. P. Tsigaridas. Univariate real root isolation in an extension field.In ISSAC: International Symposium on Symbolic and Algebraic Computation, pages 321–328, 2011.

[43] A. W. Strzebonski and E. P. Tsigaridas. Univariate real root isolation in multiple extensionfields. In ISSAC: International Symposium on Symbolic and Algebraic Computation, pages343–350, 2012.

[44] E. Tsigaridas and I. Emiris. On the complexity of real root isolation using continued frac-tions. Theoretical Computer Science, pages 158–173, 2008.

[45] J. von zur Gathen and J. Gerhard. Fast algorithms for Taylor shifts and certain differenceequations. In ISSAC: International Symposium on Symbolic and Algebraic Computation,pages 40–47, New York, NY, USA, 1997. ACM.

[46] J.-C. Yakoubsohn. Numerical analysis of a bisection-exclusion method to find zeros ofunivariate analytic functions. Journal of Complexity, 21(5):652 – 690, 2005.

[47] C. Yap. Fundamental Problems in Algorithmic Algebra. Oxford University Press, 2000.

32

6. Appendix

6.1. Approximating ΓF

Theorem 1. An integer Γ ∈ N with

Γp ≤ Γ < 8logn+Γp (6.1)

can be computed with O(n2Γp) bit operations. The computation uses an approximation of F toL = O(nΓF) bits after the binary point.

Proof. Consider the Cauchy polynomial

F(x) := |An|xn−n−1

∑i=0|Ai|xi

of F . Then, according to [13, Proposition 2.51], F has a unique positive real root ξ ∈ R+, andthe following inequality holds:

maxi=1,...,n |ξi| ≤ ξ <n

ln2·maxi=1,...,n |zi|< 2n ·maxi=1,...,n |zi|.

It follows that F(x)> 0 for all x≥ ξ and F(x)< 0 for all x < ξ . Furthermore, since F coincideswith its own Cauchy polynomial, each complex root of F has absolute value less than or equal toξ . Let k0 be the smallest non-negative integer k with F(2k)> 0 (which is equal to the smallest kwith 2k > ξ ). Our goal is to compute an integer Γ with k0 ≤ Γ≤ k0 +1. Namely, if Γ fulfills thelatter inequality, then

max(1,maxi|zi|))≤max(1,ξ )≤ 2Γ < 4max(1,ξ )< 8n ·max(max

i|zi|,1),

and thus Γ fulfills inequality (6.1). In order to compute a Γ with k0 ≤ Γ ≤ k0 +1, we use expo-nential and binary search (try k = 1,2,4,8, . . . until F(2k) > 0 and, then, perform binary searchon the interval k/2 to k) and approximate evaluation of F at the points 2k: More precisely, weevaluate F(2k) using interval arithmetic with a precision ρ (using fixed point arithmetic) whichguarantees that the width w of B(F(2k),ρ) is smaller than 1/4, where B(E,ρ) is the intervalobtained by evaluating a polynomial expression E via interval arithmetic with precision ρ for thebasic arithmetic operations; see [20, Section 4] for details. We use [20, Lemma 3] to estimate thecost for each such evaluation: Since F has coefficients of size less than 2nMea(F), we have tochoose ρ such that

2−ρ+2(n+1)2Mea(F)2n+nk < 1/4in order to ensure that w < 1/4. Hence, ρ is bounded by O(logMea(F)+ nk) and, thus, eachinterval evaluation needs O(n(logMea(F)+ nk)) bit operations. We now use exponential plusbinary search to find the smallest k such that B(F(2k),ρ) contains only positive values. Thefollowing argument then shows that k0 ≤ k ≤ k0 + 1: Obviously, we must have k ≥ k0 sinceF(2k)< 0 and F(2k) ∈B(F(2k),ρ) for all k < k0. Furthermore, the point x = 2k0+1 has distancemore than 1 to each of the roots of F , and thus |F(2k0+1)| ≥ |An| ≥ 1/4. Hence, it follows thatB(F(2k0 +1),ρ) contains only positive values. For the search, we need

O(logk0) = O(log logξ ) = O(log(logn+ΓF))

iterations, and the cost for each of these iterations is bounded by O(n(logMea(F) + nk0)) =O(n2ΓF) bit operations. 2

33

6.2. Integer Polynomials

For a polynomial F with integer coefficients of bit size τ , we aim to show that ΣF = O(nτ).We proceed in two steps: First, we cluster the roots ξi of F into subsets consisting of nearbyroots. Second, we apply the generalized Davenport-Mahler bound [11, 13] to the roots of F .

W.l.o.g., we can assume that σ(ξ1,F)≤ ·· · ≤σ(ξn,F). For h∈N, we denote i(h) the maximalindex i with σ(ξi,F) ≤ 2−h and R(h) := {ξ1, . . . ,ξi(h)} the corresponding set of roots. If h ≤log(1/σF), then R contains at least two roots. For a fixed h, we are interested in a partition ofR := R(h) into disjoint subsets R1, . . . ,Rl that consist of nearby points, only.

Lemma 18. Suppose that h≤ log(1/σF). Then, there exists a partition of R := R(h) into disjointsets R1, . . . ,Rl such that |Ri| ≥ 2 for all i ∈ {1, . . . , l} and |ξ −ξ ′| ≤ n2−h for all ξ , ξ ′ ∈ Ri.

Proof. We initially set R1 := {ξ1}. Then, we add all roots ξi to R1 that satisfy |ξi− ξ1| ≤ 2−h.For each root in R1, we proceed in the same way. More precisely, for each ξ ∈ R1, we add thoseroots ξ ′ ∈ R to R1 with |ξ −ξ ′| ≤ 2−h. If no further root can be added to R1, we consider the setR\R1 of the remaining roots and treat it in exactly the same manner. Finally, we end up with apartition R1, . . . ,Rl of R such that, for any two points in any Ri, their distance is less than or equalto (|Ri|−1)2−h < n ·2−h. Furthermore, each of the sets Ri must contain at least two roots sinceσ(ξi,F)≤ 2−h for all i = 1, . . . , i(h). 2

We now fix a h > 4logn and consider a partitioning of R := R(h) as in the above lemma.Then, we define Gi to be the directed graph on each Ri which connects consecutive roots of Riin ascending order of their absolute values. We further define G := (R,E) as the union of all Gi.Then, G is a directed graph on R with the following properties:

(1) each edge (α,β ) ∈ E satisfies |α| ≤ |β |,(2) G is acyclic, and(3) the in-degree of each node is at most 1.

Hence, we can apply the generalized Davenport-Mahler bound [11, 13] to G :

∏(α,β )∈E

|α−β | ≥ 1(√

n+1 ·2τ)n−1·

(√3

n

)#E

·(

1n

)n/2

Since each set Ri contains at least 2 roots, we must have i(h) > #E ≥ i(h)/2. Furthermore, foreach edge (α,β ) ∈ E, we have |α−β | ≤ n ·2−h. Thus, it follows that (notice that n ·2−h < 1)

(n ·2−h)i(h)

2 >1

(√

n+1 ·2τ)n−1·

(√3

n

)i(h)

·(

1n

)n/2

>1

(n+1)n ·2nτ·(

3n2

)i(h)/2

,

and thus

i(h)<2n(τ + log(n+1))log3−3logn+h

<2n(τ + log(n+1))

h/4=

8n(τ + log(n+1))h

.

From the latter inequality, we conclude that log(1/σF)< 4n(τ+ log(n+1))+1 since, otherwise,there exists an h with 4n(τ + log(n+ 1)) ≤ h ≤ log(1/σF) and i(h) < 2 which is not possible.For the bound on ΣF , it suffices to consider only the roots ξ1, . . . ,ξk with separation less than1/n4 since each root with a larger separation contributes with at most max(τ +1,4logn) to ΣF .Thus, ΣF = O(nτ) follows from

−k

∑i=1

logσ(ξi,F)<d4n(τ+log(n+1))e

∑h=1

i(h)< 8n(τ + log(n+1))d4n(τ+log(n+1))e

∑h=1

1h= O(nτ log(nτ)).

34

6.3. Algorithms

Algorithm 1 DCM

Require: polynomial f = ∑0≤i≤n aixi ∈ R[x] as defined in (2.1)

Ensure: returns a list O of disjoint isolating intervals for all real roots of f{only in the REAL-RAM model}

I0 :=(− 12 ,

12 )

fI0(x) := f (− 12 + x)

A :={(I0, fI0)}; O := /0 {list of active and isolating intervals}repeat(I, fI) some element in A with I = (a,b); delete (I, fI) from AfI+ := fI

(− 1

4n +(1+ 1

2n

)x)

and fI+,rev(x) = ∑ni=0 hixi :=(1+ x)n · fI+

( 11+x

)if var( fI+,rev) = 0 then

do nothingelse

if t[( fI)′,3/2](0,2)> 0 then

s := sign( fI+(0) · fI+(1))if s≥ 0 then

do nothingelse

if I+ does not intersect any interval in O thenadd I+ to O

elsedo nothing

end ifend if

elsesubdivide I into Il :=(a,mI) and Ir :=(mI ,b)fI` := fI

( x2

)and fIr := fI

( x+12

)= fI`(x+1)

add (Il , fI`) and (Ir, fIr) to Aend if

end ifuntil A is emptyreturn O

35

Algorithm 2 DCMρ

Require: polynomial f = ∑0≤i≤n aixi ∈ R[x] as in (2.1) and a ρ ∈ N

Ensure: returns ”insufficient precision” or a list O = {Jk,sk,`,sk,r,Bk} of disjoint isolating in-tervals Jk = (ck,dk) for some of the real roots of f (and sk,` = sign( f (ck)), sk,r = sign( f (dk))and 0 < Bk ≤min(| f (ck)|, | f (dk)|).

I0 :=(− 12 ,

12 )

f a (ρ +n+1)-binary approximation of ffI0 a (ρ +1)-binary approximation of f (− 1

2 + x) {⇒ fI0 ∈ [ fI0 ]2−ρ }A :={(I0, fI0 ,ρ)}; O := /0 {list of active and isolating intervals}

repeat(I, fI ,ρI), where I :=(a,b), some element in A ; delete (I, fI ,ρI) from AfI+(x) := fI

(− 1

4n +(1+ 1

2n

)x)

and h(x) = ∑ni=0 hixi := (1+ x)n · fI+

( 11+x

)if hi >−2n+2−ρI for all i or hi < 2n+2−ρI for all i then

do nothingelse

if t[( fI)′,3/2](0,2)>−n ·2n+1−ρI then

fI(x) := fI(x)+ sign(( fI)′(0)) ·n ·2n+1−ρI · x

λ− := fI+(0)−2n−1−ρI , λ+ := fI+(1)+(4n+1)2n−1−ρI and λ := fI(−1/n)−2n+1−ρI .if I :=

(a− w(I)

2n ,b+ w(I)2n

)intersects no J for all (J,sJ,`,sJ,r,BJ) ∈ O and λ− ·λ+ < 0

and min(|λ−|, |λ+|)> n2n+3−ρI and |λ |> n22deg( fI)+7+n−ρI thenadd (I,sign(λ−),sign(λ+),min(|λ−|, |λ+|)−2n+3−ρI n) to O

{⇒ I contains a root ξ of f and the w(I)n -neighborhood of I is isolating for ξ}

elsedo nothing {J is already isolating for ξ}

end ifelse

do nothingend if

elseif ρI < 0 then

return ”insufficient precision”else

if ρI < 2 thenreturn ”insufficient precision”

elseSubdivide I into I` :=(a,mI) and Ir :=(mI ,b)fI` an ρI-binary approximation of fI

( x2

){⇒ fIl ∈ [ fI` ]2−(ρI−1)}

fIr an (ρI−1)-binary approximation of fI( 1+x

2

){⇒ fIr ∈ [ fIr ]2−(ρI−2)}

Add (I`, fI` ,ρI−1) and (Ir, fIr ,ρI−2) to Aend if

end ifend if

until A is emptyreturn O

36

Algorithm 3 CERTIFYρ

Require: polynomial f = ∑0≤i≤n aixi ∈ R[x] as defined in (2.1), an integer ρ ∈ N, and the listO = {(Jk,sk,`,sk,r,Bk)}k=1,...,s returned by DCMρ .

Ensure: returns ”insufficient precision” or the list L = {Jk}k=1,...,s of isolating intervals withthe guarantee that, for each real root of f , there exists a corresponding interval in L .

I0 :=(− 12 ,

12 )

f a (ρ +n+1)-binary approximation of ffI0 a (ρ +1)-binary approximation of f (− 1

2 + x) {⇒ fI0 ∈ [ fI0 ]2−ρ }A :={(I0, fI0 ,ρ)} {list of active intervals}

repeat(I, fI ,ρI), where I :=(a,b), some element in A ; delete (I, fI ,ρI) from A .if I∩R =

⋃si=1 Li = /0 then

do nothingelse

if t[ fI ,3/2](0,1)>−n ·2−ρI+2 thenif | fI(0)+ sign( fI(0)) ·2−ρI+2n|> n2 ·2−ρI+6 then

do nothing {I contains no root of f}else

return ”insufficient precision” {ρ < ρmaxf }

end ifelse

h(x) :=∑ni=0 hixi = (1+ x)n · ( fI)

′( 11+x )

if hi < n ·2n−ρI for all i (or hi >−n ·2n−ρI for all i) theng(x) := fI(x)−n ·2n−ρI (or g(x) := fI(x)+n ·2n−ρI , respectively);if for each Li = [q`,qr], min(|λ (q`)|, |λ (qr)|) > n · 2n+3−ρI and λ (ql) · λ (qr) < 0then

do nothing {I∩R contains no root of f ; λ (ql), λ (qr) defined as in (4.15)}else

return ”insufficient precision” {ρ < ρmaxf }

end ifelse

if ρI < 2 thenreturn ”insufficient precision”

elseSubdivide I into Il :=(a,m`) and Ir :=(m`,b)fI` an ρI-binary approximation of fI

( x2

){⇒ fI` ∈ [ fI` ]2−(ρI−1)}

fIr an (ρI−1)-binary approximation of f`( 1+x

2

){⇒ fIr ∈ [ fIr ]2−(ρI−2)}

Add (Il , fI` ,ρI−1) and (Ir, fIr ,ρI−2) to Aend if

end ifend if

end ifuntil A is emptyreturn ”certification successful” {The region of uncertainty R contains no root of f}

37

Documents

On the Complexity of the Descartes Method when using ...msagralo/ApxDescartes.pdf · On the Complexity of the Descartes Method when using Approximate Arithmetic Michael Sagraloff