Primes in Tuples I

PRIMES IN TUPLES I 1

Primes in Tuples I

By D. A. Goldston, J. Pintz, and C. Y. Yıldırım *

Abstract

We introduce a method for showing that there exist prime numbers whichare very close together. The method depends on the level of distribution ofprimes in arithmetic progressions. Assuming the Elliott-Halberstam conjecture,we prove that there are infinitely often primes differing by 16 or less. Even amuch weaker conjecture implies that there are infinitely often primes a boundeddistance apart. Unconditionally, we prove that there exist consecutive primeswhich are closer than any arbitrarily small multiple of the average spacing, thatis,

lim infn→∞

pn+1 − pn

log pn= 0.

We will quantify this result further in a later paper (see (1.9) below).

1. Introduction

One of the most important unsolved problems in number theory is toestablish the existence of infinitely many prime tuples. Not only is this problembelieved to be difficult, but it has also earned the reputation among mostmathematicians in the field as hopeless in the sense that there is no knownunconditional approach for tackling the problem. The purpose of this paper,the first in a series, is to provide what we believe is a method which couldlead to a partial solution for this problem. At present, our results on primesin tuples are conditional on information about the distribution of primes inarithmetic progressions. However, the information needed to prove that thereare infinitely often two primes in a given k-tuple for sufficiently large k does notseem to be too far beyond the currently known results. Moreover, we can gainenough in the argument by averaging over many tuples to obtain unconditionalresults concerning small gaps between primes which go far beyond anythingthat has been proved before. Thus, we are able to prove the existence of very

*The first author was supported by NSF grant DMS-0300563, the NSF Focused Research Groupgrant 0244660, and the American Institute of Mathematics; the second author by OTKA grants No.T38396, T43623, T49693 and the Balaton program; the third author by TUBITAK


small gaps between primes which, however, go slowly to infinity with the sizeof the primes.

The information on primes we utilize in our method is often referred toas the level of distribution of primes in arithmetic progressions. Let

θ(n) ={

logn if n is prime,0 otherwise,

(1.1)

and consider the counting function

θ(N ; q, a) =∑

n≤Nn≡a(mod q)

θ(n). (1.2)

The Bombieri-Vinogradov theorem states that for any A > 0 there is a B =B(A) such that, for Q = N

12 (logN)−B,

∑

q≤Q

maxa

(a,q)=1

∣∣∣∣θ(N ; q, a)− N

φ(q)

∣∣∣∣�N

(logN)A. (1.3)

We say that the primes have level of distribution ϑ if (1.3) holds for any A > 0and any ε > 0 with

Q = Nϑ−ε. (1.4)

Elliott and Halberstam [5] conjectured that the primes have level of distribution1. According to the Bombieri-Vinogradov theorem, the primes are known tohave level of distribution 1/2.

Let n be a natural number and consider the k-tuple

(n + h1, n + h2, . . . , n + hk), (1.5)

where H = {h1, h2, . . . , hk} is a set composed of distinct non-negative integers.If every component of the tuple is a prime we call this a prime tuple. Lettingn range over the natural numbers, we wish to see how often (1.5) is a primetuple. For instance, consider H = {0, 1} and the tuple (n, n + 1). If n = 2,we have the prime tuple (2 , 3). Notice that this is the only prime tuple of thisform because, for n > 2, one of the numbers n or n + 1 is an even numberbigger than 2. On the other hand, if H = {0, 2}, then we expect that thereare infinitely many prime tuples of the form (n, n + 2). This is the twin primeconjecture. In general, the tuple (1.5) can be a prime tuple for more than one n

only if for every prime p the hi’s never occupy all of the residue classes modulop. This is immediately true for all primes p > k, so to test this conditionwe need only to examine small primes. If we denote by νp(H) the numberof distinct residue classes modulo p occupied by the integers hi, then we canavoid p dividing some component of (1.5) for every n by requiring

νp(H) < p for all primes p. (1.6)


If this condition holds we say that H is admissible and we call the tuple (1.5)corresponding to this H an admissible tuple. It is a long-standing conjecturethat admissible tuples will infinitely often be prime tuples. Our first result isa step towards confirming this conjecture.

Theorem 1. Suppose the primes have level of distribution ϑ > 1/2. Thenthere exists an explicitly calculable constant C(ϑ) depending only on ϑ such thatany admissible k-tuple with k ≥ C(ϑ) contains at least two primes infinitelyoften. Specifically, if ϑ ≥ 0.971, then this is true for k ≥ 6.

Since the 6-tuple (n, n + 4, n + 6, n + 10, n + 12, n + 16) is admissible, theElliott-Halberstam conjecture implies that

lim infn→∞

(pn+1 − pn) ≤ 16, (1.7)

where the notation pn is used to denote the n-th prime. This means thatpn+1−pn ≤ 16 for infinitely many n. Unconditionally, we prove a long-standingconjecture concerning gaps between consecutive primes.

Theorem 2. We have

∆1 := lim infn→∞

pn+1 − pn

log pn= 0. (1.8)

There is a long history of results on this topic which we will briefly men-tion. The inequality ∆1 ≤ 1 is a trivial consequence of the prime numbertheorem. The first result of type ∆1 < 1 was proved in 1926 by Hardyand Littlewood [18], who on assuming the Generalized Riemann Hypothe-sis (GRH) obtained ∆1 ≤ 2/3. This result was improved by Rankin [26]to ∆1 ≤ 3/5, also assuming the GRH. The first unconditional estimate wasproved by Erdos [7] in 1940. Using Brun’s sieve, he showed that ∆1 < 1 − c

with an unspecified positive explicitly calculable constant c. His estimate wasimproved by Ricci [27] in 1954 to ∆1 ≤ 15/16. In 1965 Bombieri and Dav-enport [2] refined and made unconditional the method of Hardy and Little-wood by substituting the Bombieri–Vinogradov theorem for the GRH, andobtained ∆1 ≤ 1/2. They also combined their method with the method ofErdos and obtained ∆1 ≤ 0.4665 . . . . Their result was further refined byPilt’ai [25] to ∆1 ≤ 0.4571 . . . , Uchiyama [33] to ∆1 ≤ 0.4542 . . . and in sev-eral steps by Huxley [20] [21] to yield ∆1 ≤ 0.4425 . . . , and finally in 1984 to∆1 ≤ .4393 . . . [22]. This was further improved by Fouvry and Grupp [9] to∆1 ≤ .4342 . . . . In 1988 Maier [23] used his matrix-method to improve Hux-ley’s result to ∆1 ≤ e−γ · 0.4425 . . . = 0.2484 . . . , where γ is Euler’s constant.Maier’s method by itself gives ∆1 ≤ e−γ = 0.5614 . . . . The recent version ofthe method of Goldston and Yıldırım [13] led, without combination with othermethods, to ∆1 ≤ 1/4.


In a later paper in this series we will prove the quantitative result that

lim infn→∞

pn+1 − pn

(log pn)12 (log log pn)2

< ∞. (1.9)

While Theorem 1 is a striking new result, it also reflects the limitationsof our current method. Whether these limitations are real or can be overcomeis a critical issue for further investigation. We highlight the following fourquestions.Question 1. Can it be proved unconditionally by the current method thatthere are infinitely often bounded gaps between primes? Theorem 1 wouldappear to be within a hair’s breadth of obtaining this result. However, anyimprovement in the level of distribution ϑ beyond 1/2 probably lies very deep,and even the GRH does not help. Still, there are stronger versions of theBombieri-Vinogradov theorem, as found in [3], and the circle of ideas used toprove these results, which may help to obtain this result.Question 2. Is ϑ = 1/2 a true barrier for obtaining primes in tuples?Soundararajan [31] has demonstrated this is the case for the current argument,but perhaps more efficient arguments may be devised.Question 3. Assuming the Elliott-Halberstam conjecture, can it be provedthat there are three or more primes in admissible k-tuples with large enoughk? Even under the strongest assumptions, our method fails to prove anythingabout more than two primes in a given tuple.Question 4. Assuming the Elliott-Halberstam conjecture, can the twin primeconjecture be proved with a refinement of our method?

The limitation of our method, identified in Question 3, is the reason weare less successful in finding more than two primes close together. However, weare able to improve on earlier results, in particular the recent results in [13].For ν ≥ 1, let

∆ν = lim infn→∞

pn+ν − pn

log pn. (1.10)

Bombieri and Davenport [2] showed ∆ν ≤ ν − 1/2. This bound was later im-proved by Huxley [20, 21] to ∆ν ≤ ν−5/8+O(1/ν), by Goldston and Yıldırım[13] to ∆ν ≤ (

√ν − 1/2)2, and by Maier [23] to ∆ν ≤ e−γ (ν − 5/8 + O (1/ν)).

In proving Theorem 2 we will also show, assuming the primes have level ofdistribution ϑ,

∆ν ≤ max(ν − 2ϑ, 0), (1.11)

and hence unconditionally ∆ν ≤ ν − 1. However, by a more complicatedargument, we will prove the following result.

Theorem 3. Suppose the primes have level of distribution ϑ. Then forν ≥ 2,

∆ν ≤ (√

ν −√

2ϑ)2. (1.12)


In particular, we have unconditionally, for ν ≥ 1,

∆ν ≤ (√

ν − 1)2. (1.13)

From (1.11) or (1.12) we see that the Elliott-Halberstam conjecture impliesthat

∆2 = lim infn→∞

pn+2 − pn

log pn= 0. (1.14)

We can improve on (1.13) by combining our method with Maier’s matrixmethod [23] to obtain

∆ν ≤ e−γ(√

ν − 1)2. (1.15)

Huxley [20] generalized the results of Bombieri and Davenport [2] for ∆ν toprimes in arithmetic progressions with a fixed modulus. We are able to provethe analogue of (1.15) for primes in arithmetic progressions where the moduluscan tend slowly to infinity with the size of the primes considered.

Another extension of our work is that we can find primes in other setsbesides intervals. Thus we can prove that there are two primes among thenumbers n + ai, 1 ≤ i ≤ h, for N < n ≤ 2N and the ai’s are given arbitraryintegers in the interval [1, N ] if h < C

√logN(log logN)2 and N is restricted

to some sequence Nν tending to infinity, which avoids Siegel zeros for modulinear to N . It is interesting to remark that such a general result can be provedregardless of the distribution of the ai values, in contrast to our present casewhere Gallagher’s theorem (3.7) requires the ai’s to lie in an interval. Theproofs of these results will appear in later papers in this series.

While this paper is our first paper on this subject, we have two other pa-pers that overlap some of the results in this paper. The first paper [15], writtenjointly with Motohashi, gives a short and simplified proof of Theorems 1 and2. The second paper [14], written jointly with Graham, uses sieve methods toprove Theorems 1 and 2 and provides applications for tuples of almost-primes(products of a bounded number of primes.)

The present paper is organized as follows. In Section 2, we describe ourmethod and its relation to earlier work. We also state Propositions 1 and2 which incorporate the key new ideas in this paper. These are developedin a more general form than in [14] or [15] so as to be employable in manyapplications. In Section 3, we prove Theorems 1 and 2 using these propositions.The method of proof is due to Granville and Soundararajan. In Section 4 wemake some further comments on the method used in Section 3. In Section 5we prove two lemmas needed later. In Section 6, we prove a special case ofProposition 1 which illustrates the key points in the general case. In Section7 we begin the proof of Proposition 1 which is reduced to evaluating a certaincontour integral. In Section 8 we evaluate a more general contour integral thatoccurs in the proof of both propositions. In Section 9, we prove Proposition2. In this paper we do not obtain results that are uniform in k, and therefore


we assume here that our tuples have a fixed length. However, uniform resultsare needed for (1.9), and they will be the topic of the next paper in this series.Finally, we prove Theorem 3 in Section 10.

Notation. In the following c and C will denote (sufficiently) small and(sufficiently) large absolute positive constants, respectively, which have beenchosen appropriately. This is also true for constants formed from c or C withsubscripts or accents. We unconventionally will allow these constants to bedifferent at different occurences. Constants implied by pure o, O, � symbolswill be absolute, unless otherwise stated. [S] is 1 if the statement S is trueand is 0 if S is false. The symbol

∑[ indicates the summation is over square-free integers, and

∑′ indicates the summation variables are pairwise relativelyprime.

The ideas used in this paper have developed over many years. We areindebted to many people, not all of whom we can mention. In particular, wewould like to thank A. Balog, E. Bombieri, T. H. Chan, J. B. Conrey, P. Deift,D. Farmer, K. Ford, J. Friedlander, S. W. Graham, A. Granville, C. Hughes,D. R. Heath-Brown, A. Ledoan, H. L. Montgomery, Y. Motohashi, Sz. Gy.Revesz, P. Sarnak, J. Sivak, and K. Soundararajan.

2. Approximating prime tuples

Let

H = {h1, h2, . . . , hk} with 1 ≤ h1, h2, . . . , hk ≤ h distinct integers, (2.1)

and let νp(H) denote the number of distinct residue classes modulo p occupiedby the elements of H.1 For squarefree integers d, we extend this definition toνd(H) by multiplicativity. We denote by

S(H) :=∏

p

(1 − 1

p

)−k(1 − νp(H)

p

)(2.2)

the singular series associated with H. Since νp(H) = k for p > h, we see thatthe product is convergent and therefore H is admissible as defined in (1.6) if andonly if S(H) 6= 0. Hardy and Littlewood conjectured an asymptotic formulafor the number of prime tuples (n + h1, n + h2, . . . , n + hk), with 1 ≤ n ≤ N ,as N → ∞. Let Λ(n) denote the von Mangoldt function which equals log p ifn = pm, m ≥ 1, and zero otherwise. We define

Λ(n;H) := Λ(n + h1)Λ(n + h2) · · ·Λ(n + hk) (2.3)

and use this function to detect prime tuples and tuples with prime powers incomponents, the latter of which can be removed in applications. The Hardy–

1The restriction of the set H to positive integers is only for simplicity, and, if desired, can easilybe removed later from all of our results.


Littlewood prime-tuple conjecture [17] can be stated in the form∑

n≤N

Λ(n;H) = N(S(H) + o(1)), as N → ∞. (2.4)

(This conjecture is trivially true if H is not admissible.) Except for the primenumber theorem (1-tuples), this conjecture is unproved.2

The program the first and third authors have been working on since 1999is to compute approximations for (2.3) with k ≥ 3 using short divisor sumsand apply the results to problems on primes. The simplest approximation ofΛ(n) is based on the elementary formula

Λ(n) =∑

d|n

µ(d) logn

d, (2.5)

which can be approximated with the smoothly truncated divisor sum

ΛR(n) =∑

d|nd≤R

µ(d) logR

d. (2.6)

Thus, an approximation for Λ(n;H) is given by

ΛR(n + h1)ΛR(n + h2) · · ·ΛR(n + hk). (2.7)

In [13], Goldston and Yıldırım applied (2.7) to detect small gaps betweenprimes and proved

∆1 = lim infn→∞

(pn+1 − pn

log pn

)≤ 1

4.

In this paper we introduce a new approximation, the idea for which camepartly from a paper of Heath-Brown [19] on almost prime tuples. His result isitself a generalization of Selberg’s proof from 1951 (see [29], p. 233–245) thatthe polynomial n(n + 2) will infinitely often have at most five distinct primefactors, so that the same is true for the tuple (n, n + 2). Not only does ourapproximation have its origin in these papers, but in hindsight the argumentof Granville and Soundararajan (employed in the proof of Theorems 1 and 2)is essentially the same as the method used in these papers.

In connection with the tuple (1.5), we consider the polynomial

PH(n) = (n + h1)(n + h2) · · ·(n + hk). (2.8)

If the tuple (1.5) is a prime tuple then PH(n) has exactly k prime factors. Wedetect this condition by using the k-th generalized von Mangoldt function

Λk(n) =∑

d|n

µ(d)(

logn

d

)k

, (2.9)

2Asymptotic results for the number of primes in tuples, unlike the existence result in Theorem 1,are beyond the reach of our method.


which vanishes if n has more than k distinct prime factors.3 With this, ourprime tuple detecting function becomes

Λk(n;H) :=1k!

Λk(PH(n)). (2.10)

The normalization factor 1/k! simplifies the statement of our results. As wewill see in Section 5, this approximation suggests the Hardy–Littlewood typeconjecture ∑

n≤N

Λk(n;H) = N (S(H) + o(1)) . (2.11)

This is a special case of the general conjecture of Bateman–Horn [1] which isthe quantitative form of Schinzel’s conjecture [28].

In analogy with (2.6) (when k = 1), we approximate Λk(n) by thesmoothed and truncated divisor sum

∑

d|nd≤R

µ(d)(

logR

d

)k

and define

ΛR(n;H) =1k!

∑

d|PH(n)d≤R

µ(d)(

logR

d

)k

. (2.12)

However, as we will see in the next section, this approximation is not adequateto prove Theorems 1 and 2.

A second simple but crucial idea is needed: rather than only approximateprime tuples, one should approximate tuples with primes in many components.Thus, we consider when PH(n) has k + ` or less distinct prime factors, where0 ≤ ` ≤ k, and define

ΛR(n;H, `) =1

(k + `)!

∑

d|PH(n)d≤R

µ(d)(

logR

d

)k+`

, (2.13)

where |H| = k. If H = ∅, then k = ` = 0 and we define ΛR(n; ∅, 0) = 1.The advantage of (2.13) over (2.7) can be seen as follows. If in (2.13) we

restrict ourselves to d’s with all prime factors larger than h, then the conditiond|PH(n) implies that we can write d = d1d2 · · ·dk uniquely with di|n + hi,1 ≤ i ≤ k, the di’s pairwise relatively prime, and d1d2 · · ·dk ≤ R. In ourapplication to prime gaps we require that R ≤ N

14−ε. On the other hand, on

expanding, (2.7) becomes a sum over di|n+hi, 1 ≤ i ≤ k, with d1 ≤ R, d2 ≤ R,

3As with Λ(n), we overcount the prime tuples by including factors which are proper prime powers,but these can be removed in applications with a negligible error. The slightly misleading notationalconflict between the generalized von Mangoldt function Λk and ΛR will only occur in this section.


. . . , dk ≤ R. The application to prime gaps here requires that Rk ≤ N14−ε,

and so R ≤ N14k

− ε

k . Thus (2.7) has a more severe restriction on the range ofthe divisors. An additional technical advantage is that having one truncationrather than k truncations simplifies our calculations.

Our main results on ΛR(n;H, `) are summarized in the following twopropositions. Suppose H1 and H2 are, respectively, sets of k1 and k2 dis-tinct non-negative integers ≤ h. We always assume that at least one of thesesets is non-empty. Let M = k1 + k2 + `1 + `2.

Proposition 1. Let H = H1 ∪ H2, |Hi| = ki, and r = |H1 ∩ H2|. IfR � N

12 (logN)−4M and h ≤ RC for any given constant C > 0, then as

R, N → ∞ we have∑

n≤N

ΛR(n;H1, `1)ΛR(n;H2, `2) =(

`1 + `2

`1

)(logR)r+`1+`2

(r + `1 + `2)!(S(H) + oM (1))N.

(2.14)

Proposition 2. Let H = H1 ∪H2, |Hi| = ki, r = |H1 ∩H2|, 1 ≤ h0 ≤ h,and H0 = H ∪ {h0}. If R �M N

14 (logN)−B(M) for a sufficiently large positive

constant B(M), and h ≤ R, then as R, N → ∞ we have∑

n≤N

ΛR(n;H1, `1)ΛR(n;H2, `2)θ(n + h0)

=

(`1 + `2

`1

)(logR)r+`1+`2

(r + `1 + `2)!(S(H0) + oM(1))N if h0 6∈ H,

(`1+`2+1

`1 + 1

)(logR)r+`1+`2+1

(r+`1+`2+1)!(S(H)+oM(1))N if h0 ∈ H1 and h0 6∈ H2,

(`1+`2+2

`1 + 1

)(logR)r+`1+`2+1

(r+`1+`2+1)!(S(H)+oM(1))N if h0 ∈ H1 ∩ H2.

(2.15)

Assuming the primes have level of distribution ϑ > 1/2, i.e. (1.3) with (1.4)holds, the asymptotics in (2.15) hold with R � N

ϑ

2−ε and h ≤ Rε, for any

fixed ε > 0.

By relabeling the variables we obtain the corresponding form if h0 ∈H2, h0 6∈ H1.

Propositions 1 and 2 can be strengthened in several ways. We will showthat the error terms oM (1) can be replaced by a series of lower order termsand a prime number theorem type of error term. Moreover, we can make theresult uniform for M → ∞ as an explicit function of N and R. This will beproved in a later paper and used in the proof of (1.9).


3. Proofs of Theorems 1 and 2

In this section we employ Propositions 1 and 2 and a simple argumentdue to Granville and Soundararajan to prove Theorems 1 and 2.

For ` ≥ 0, Hk = {h1, h2, . . . , hk}, 1 ≤ h1, h2, . . . , hk ≤ h ≤ R, we deducefrom Proposition 1, for R � N

12 (logN)−B(M) and R, N → ∞, that

∑

n≤N

ΛR(n;Hk, `)2 ∼1

(k + 2`)!

(2`

`

)S(Hk)N(logR)k+2`. (3.1)

For any hi ∈ Hk , we have from Proposition 2, for R � Nϑ

2−ε, and R, N → ∞,

∑

n≤N

ΛR(n;Hk, `)2θ(n + hi) ∼1

(k + 2` + 1)!

(2` + 2` + 1

)S(Hk)N(logR)k+2`+1.

(3.2)Taking R = N

ϑ

2−ε, we obtain4

S : =2N∑

n=N+1

(k∑

i=1

θ(n + hi) − log 3N

)ΛR(n;Hk, `)2

∼ k

(k + 2` + 1)!

(2` + 2` + 1

)S(Hk)N(logR)k+2`+1

− log 3N1

(k + 2`)!

(2`

`

)S(Hk)N(logR)k+2`

∼(

2k

k + 2` + 12` + 1` + 1

logR − log 3N

)1

(k + 2`)!

(2`

`

)S(Hk)N(logR)k+2`.

(3.3)

Here we note that if S > 0 then there exists an n ∈ [N + 1, 2N ] such that atleast two of the numbers n+h1, n+h2, . . . , n+hk will be prime. This situationoccurs when

k

k + 2` + 12` + 1` + 1

ϑ > 1. (3.4)

If k, ` → ∞ with ` = o(k), then the left-hand side has the limit 2ϑ, and thus(3.4) holds for any ϑ > 1/2 if we choose k and ` appropriately depending on ϑ.This proves the first part of Theorem 1. Next, assuming ϑ > 20/21, we see that(3.4) holds with ` = 1 and k = 7. This proves the second part of Theorem 1but with k = 7. The case k = 6 requires a slightly more complicated argumentand is treated later in this section.

The table below gives the values of C(ϑ), defined in Theorem 1, obtainedfrom (3.4). For a given ϑ, it gives the smallest k and corresponding smallest

4In (3.3), as well as later in (3.8), the asymptotic sign replaces an error term of size o(log N) inthe parenthesis term after log3N . We thus make the convention that the asymptotic relationshipholds only up to the size of the apparent main term.


` for which (3.4) is true. Here h(k) is the shortest length of any admissiblek-tuple, which has been computed by Engelsma [6] by exhaustive search for1 ≤ k ≤ 305 and covers every value in this table and the next except h(421),where we have taken the upper bound value from [6].

ϑ k ` h(k)1 7 1 20.95 8 1 26.90 9 1 30.85 11 1 36.80 16 1 60.75 21 2 84.70 31 2 140.65 51 3 252.60 111 5 634.55 421 10 2956∗

* indicates this value could be an upper bound of the true value.

To prove Theorem 2, we modify the previous proof by considering

S :=2N∑

n=N+1

∑

1≤h0≤h

θ(n + h0) − ν log 3N

∑

1≤h1,h2,...,hk≤hdistinct

ΛR(n;Hk, `)2,

(3.5)where ν is a positive integer. To evaluate S, we need the case of Proposition2 where h0 6∈ Hk,∑

n≤N

ΛR(n;Hk, `)2 θ(n+h0) ∼1

(k + 2`)!

(2`

`

)S(Hk∪{h0})N(logR)k+2`. (3.6)

We also need a result of Gallagher [10]: as h → ∞,∑

1≤h1,h2,...hk≤hdistinct

S(Hk) ∼ hk . (3.7)


Taking R = Nϑ

2−ε, and applying (3.1), (3.2), (3.6), and (3.7), we find that

S ∼∑

1≤h1,h2,...,hk≤hdistinct

(k

(k + 2` + 1)!

(2` + 2` + 1

)S(Hk)N(logR)k+2`+1

+∑

1≤h0≤hh0 6=hi ,1≤i≤k

1(k + 2`)!

(2`

`

)S(Hk ∪ {h0})N(logR)k+2`

− ν log 3N1

(k + 2`)!

(2`

`

)S(Hk)N(logR)k+2`

)

∼(

2k

k + 2` + 12` + 1` + 1

log R + h − ν log 3N

)1

(k + 2`)!

(2`

`

)Nhk(logR)k+2`.

(3.8)

Thus, there are at least ν + 1 primes in some interval (n, n + h], N < n ≤ 2N ,provided that

h >

(ν − 2k

k + 2` + 12` + 1` + 1

(ϑ

2− ε

))logN, (3.9)

which, on letting ` = [√

k/2] and taking k sufficiently large, gives

h >

(ν − 2ϑ + 4ε + O

(1√k

))log N. (3.10)

This proves (1.11). Theorem 2 is the special case ν = 1 and ϑ = 1/2.We are now ready to prove the last part of Theorem 1. Consider

S ′ : =2N∑

n=N+1

(k∑

i=1


)(L∑

`=0

a`ΛR(n;Hk, `)

)2

=2N∑

n=N+1

(k∑

i=1


) ∑

0≤`1,`2≤L

a`1a`2ΛR(n;Hk, `1)ΛR(n;Hk, `2)

=∑

0≤`1,`2≤L

a`1a`2M`1,`2 ,

(3.11)

whereM`1,`2 = M`1,`2 − (log 3N)M`1,`2 , (3.12)

say. Applying Propositions 1 and 2 with R = Nϑ

2−ε, we deduce that

M`1,`2 ∼(

`1 + `2

`1

)(logR)k+`1+`2

(k + `1 + `2)!S(Hk)N

and

M`1,`2 ∼ k

(`1 + `2 + 2

`1 + 1

)(logR)k+`1+`2+1

(k + `1 + `2 + 1)!S(Hk)N.


Therefore,

M`1,`2 ∼(

`1 + `2

`1

)S(Hk)N

(logR)k+`1+`2

(k + `1 + `2)!

×(

k(`1 + `2 + 2)(`1 + `2 + 1)(`1 + 1)(`2 + 1)(k + `1 + `2 + 1)

logR − log 3N

).

Defining b` = (logR)`a` and b to be the column matrix corresponding to thevector (b0, b1, . . . , bL), we obtain

S∗(N,Hk, ϑ, b) :=1

S(Hk)N(logR)k+1S ′

∼∑

0≤`1,`2≤L

b`1b`2

(`1 + `2

`1

)1

(k + `1 + `2)!

(k(`1 + `2 + 2)(`1 + `2 + 1)

(`1 + 1)(`2 + 1)(k + `1 + `2 + 1)− 2

ϑ

)

∼ bT Mb,

(3.13)

where

M =[(

i + j

i

)1

(k + i + j)!

(k(i + j + 2)(i + j + 1)

(i + 1)(j + 1)(k + i + j + 1)− 2

ϑ

)]

0≤i,j≤L

.

(3.14)We need to choose b so that S∗ > 0 for a given ϑ and minimal k. On taking b

to be an eigenvector of the matrix M with eigenvalue λ, we see that

S∗ ∼ bT λb = λ

k∑

i=0

|bi|2 (3.15)

will be > 0 provided that λ is positive. Therefore S∗ > 0 if M has a positiveeigenvalue and b is chosen to be the corresponding eigenvector. Using Mathe-matica we computed the values of C(ϑ) indicated in the following table, whichmay be compared to the earlier table obtained from (3.4).

ϑ k L h(k)1 6 1 16.95 7 1 20.90 8 2 26.85 10 2 32.80 12 2 42.75 16 2 60.70 22 4 90.65 35 4 158.60 65 6 336.55 193 9 1204


In particular, taking k = 6, L = 1, b0 = 1, and b1 = b in (3.13), we get

S∗ ∼ 18!

(96 − 112

ϑ+ 2b

(18 − 16

ϑ

)+ b2

(4 − 4

ϑ

))

∼ −4(1− ϑ)8!ϑ

(b2 − 2b

18ϑ− 164(1− ϑ)

− 96ϑ − 1124(1− ϑ)

)

∼ −4(1− ϑ)8!ϑ

((b − 18ϑ − 16

4(1 − ϑ)

)2

+15ϑ2 − 64ϑ + 48

4(1 − ϑ)2

).

Choosing b = 18ϑ−164(1−ϑ) , we then have

S∗ ∼ −15ϑ2 − 64ϑ + 488!ϑ(1− ϑ)

,

of which the right-hand side is > 0 if ϑ ≤ 1 lies between the two roots of thequadratic; this occurs when 4(8 −

√19)/15 < ϑ ≤ 1. Thus, there are at least

two primes in any admissible tuple Hk for k = 6, if

ϑ >4(8−

√19)

15= 0.97096 . . . . (3.16)

This completes the proof of Theorem 1.

4. Further Remarks on Section 3

We can formulate the method of Section 3 as follows. For a given tupleH = {h1, h2, . . . , hk} we define

Q1 :=2N∑

n=N+1

fR(n;H)2, Q2 :=2N∑

n=N+1

(k∑

i=1

θ(n + hi)

)fR(n;H)2, (4.1)

where f should be chosen to make Q2 large compared with Q1, and R = R(N)will be chosen later. It is reasonable to assume

fR(n;H) =∑

d|PH(n)d≤R

λd,R. (4.2)

Our goal is to select the λd,R which maximizes

ρ = ρ(N ;H, f) :=1

log 3N

(Q2

Q1

)(4.3)

for the purpose of obtaining a good lower bound for ρ. If ρ > ν for some N

and positive integer ν, then there exists an n, N < n ≤ 2N , such that thetuple (1.5) has at least ν + 1 prime components.

This method has much in common with the method introduced for twinprimes by Selberg and for general tuples by Heath-Brown. However, they


used the divisor function d(n + hi) in Q2 in place of θ(n + hi) and sought tominimize (4.3) to obtain a good upper bound for ρ. Heath-Brown even chosef = ΛR(n;H, 1).

As a first example, suppose we choose f as in (2.6) and (2.7), so that

fR(n;H) =k∏

i=1

ΛR(n + hi). (4.4)

By [13], we have, as R, N → ∞, 5

Q1 ∼ NS(H)(logR)k if R ≤ N12k

(1−ε),

Q2 ∼ kNS(H)(logR)k+1 if R ≤ Nϑ

2k(1−ε).

(4.5)

On taking R = Nϑ02k , 0 < ϑ0 < ϑ, we see that, as N → ∞,

ρ ∼ klog R

logN∼ ϑ0

2. (4.6)

Notice that ρ < 1, so we fail to detect primes in tuples. In Section 3, we provedthat on choosing f = ΛR(n;H, `), by (3.1) and (3.2), as N → ∞,

ρ ∼ k

k + 2` + 12` + 1` + 1

ϑ0. (4.7)

If ` = 0 this gives ρ ∼ kk+1ϑ0, which, for large k, is twice as large as (4.6),

while (4.7) gains another factor of two when ` → ∞ slowly as k → ∞. Thisfinally shows ρ > 1 if ϑ > 1/2, but just fails if ϑ = 1/2.

In (3.11) we chose

fR(n;H) =L∑

`=0

b`

(logR)`ΛR(n;Hk, `) =

∑

d|PH(n)d≤R

µ(d)P(

log(R/d)logR

)(4.8)

where P is a polynomial with a k-th order zero at 0. The matrix procedure doesnot provide a method for analyzing ρ unless L is taken fixed, but the generalproblem has been solved by Soundararajan [31]. In particular, he showedthat ρ < 1 if ϑ = 1/2, so that one can not prove there are bounded gapsbetween primes using (4.8). The exact solution from Soundrarajan’s analysiswas obtained by a calculus of variations argument by Conrey, which gives, asN → ∞,

ρ =k(k − 1)

2βϑ0, (4.9)

5For special reasons, the validity of the formula for Q2 actually holds here for R ≤ Nϑ

2(k−1) (1−ε)

if k ≥ 2, but this is insignificant for the present discussion.


where β is determined as the solution of the equation

1β

=

∫ 10 yk−2q(y)2 dy∫ 10 yk−1q′(y)2 dy

with q(y) = Jk−2(2√

β) − y1− k

2 Jk−2(2√

βy), (4.10)

where Jk is the Bessel function of the first type. Using Mathematica, one cancheck that this gives exactly the values of k in the previous table, which is inagreement with our earlier calculations; but it provides somewhat smaller val-ues of ϑ for which a given k-tuple will contain two primes. Thus, for example,we can replace (3.16) by the result that every admissible 6-tuple will containat least two primes if

ϑ > .95971 . . . . (4.11)

5. Two Lemmas

In this section we will prove two lemmas needed for the proof of Proposi-tions 1 and 2. The conditions on these lemmas have been constructed in orderfor them to hold uniformly in the given variables.

The Riemann zeta-function has the Euler product representation, withs = σ + it,

ζ(s) =∏

p

(1 − 1

ps

)−1

, σ > 1. (5.1)

The zeta-function is analytic except for a simple pole at s = 1, where as s → 1

ζ(s) =1

s − 1+ γ + O(|s − 1|). (5.2)

(Here γ is Euler’s constant.) We need standard information concerning theclassical zero-free region of the Riemann zeta-function. By Theorem 3.11 and(3.11.8) in [32], there exists a small constant c > 0, for which we assumec ≤ 10−2, such that ζ(σ + it) 6= 0 in the region

σ ≥ 1 − 4c

log(|t|+ 3)(5.3)

for all t. Furthermore, we have

ζ(σ + it) − 1σ − 1 + it

� log(|t| + 3),1

ζ(σ + it)� log(|t| + 3),

ζ ′

ζ(σ + it) +

1σ − 1 + it

� log(|t| + 3),(5.4)

in this region. We will fix this c for the rest of the paper (we could take, forinstance, c = 10−2, see [8]). Let L denote the contour given by

s = − c

log(|t| + 3)+ it. (5.5)


Lemma 1. We have, for R ≥ C, k ≥ 2, B ≤ Ck,∫

L(log(|s|+ 3))B

∣∣∣∣Rs

skds

∣∣∣∣� Ck1 R−c2 + e−

√c log R/2, (5.6)

where C1, c2 and the implied constant in � depends only on the constant C inthe formulation of the lemma. In addition, if k ≤ c3 logR with a sufficientlysmall c3 depending only on C, then

∫

L(log(|s|+ 3))B

∣∣∣∣Rs

skds

∣∣∣∣� e−√

c logR/2. (5.7)

Proof. The left-hand side of (5.6) is, with C4 depending on C,

�∫ ∞

0Rσ(t) (log(|t| + 4))B

(|t| + c/2)kdt

�∫ C4

0Ck

1 R−c2dt +∫ ω−3

C4

R− c

log(|t|+3)

t3/2dt +

∫ ∞

ω−3t−3/2dt

� Ck1 R−c2 + e−

c log R

log ω + ω− 12 ,

(5.8)

where now C1 is a constant depending on C. On choosing logω =√

c logR, thefirst part of the lemma follows. The second part is an immediate consequenceof the first part.

The next lemma provides some explicit estimates for sums of the gen-eralized divisor function. Let ω(q) denote the number of prime factors of asquarefree integer q. For any real number m, we define

dm(q) = mω(q). (5.9)

This agrees with the usual definition of the divisor functions when m is apositive integer. Clearly, dm(q) is a monotonically increasing function of m

(for a fixed q), and for real m1, m2, and y, we see that

dm1(q)dm2(q) = dm1m2(q), (dm(q))y = dmy(q). (5.10)

Recall∑[ indicates a sum over squarefree integers. We use the ceiling function

dye := min{n ∈ Z; y ≤ n}.

Lemma 2. We have, for any positive real m and x ≥ 1

D′(x, m) :=∑[

q≤x

dm(q)q

≤ (dme + logx)dme ≤ (m + 1 + log x)m+1 (5.11)

and

D∗(x, m) :=∑[

q≤x

dm(q) ≤ x(dme + log x)dme ≤ x(m + 1 + logx)m+1. (5.12)


Proof. First, we treat the case when m is a positive integer. We prove (5.11)by induction. Observe the assertion is true for m = 1, that is, when d1(q) = 1by definition. Suppose (5.11) is proved for m − 1. Let us denote the smallestterm in a given product representation of q by j = j(q) ≤ x1/m. Then thisfactor can stand at m places, and, therefore, with q = q′j(q) = q′j, we have

∑[

q≤x

dm(q)q

≤ m

x1/m∑

j=1

[ 1j

∑[

q′≤x/j

dm−1(q′)q′

≤ m(1 + logx1m ) (m − 1 + logx)m−1

≤ (m + logx)(m + logx)m−1 = (m + logx)m.

This completes the induction. For real m, the result holds since D′(x, m) ≤D′(x, dme). We note that (5.12) follows from (5.11) because D∗(x, m) ≤xD′(x, m).

6. A special case of Proposition 1

In this section we will prove a special case of Proposition 1 which illustratesthe method without involving the technical complications that appear in thegeneral case. This allows us to set up some notation and obtain estimates foruse in the general case. We also obtain the result uniformly in k.

Assume H is non-empty (so that k ≥ 1), ` = 0, and ΛR(n;H, 0) =ΛR(n;H).

Proposition 3. Supposing

k �η0 (logR)12−η0 with an arbitrarily small fixed η0 > 0, (6.1)

and h ≤ RC , with C any fixed positive number, we haveN∑

n=1

ΛR(n;H) = S(H)N + O(Ne−c√

log R) + O(R(2 logR)2k

). (6.2)

This result motivates the conjecture (2.11).

Proof. We have

SR(N ;H) :=N∑

n=1

ΛR(n;H) =1k!

∑

d≤R

µ(d)(

logR

d

)k ∑

1≤n≤Nd|PH(n)

1. (6.3)

If for a prime p we have p|PH(n), then among the solutions n ≡ −hi(mod p),1 ≤ i ≤ k, there will be νp(H) distinct solutions modulo p. For d squarefreewe then have by multiplicativity νd(H) distinct solutions for n modulo d whichsatisfy d|PH(n), and for each solution one has n running through a residue


class modulo d. Hence we see that∑

1≤n≤Nd|PH(n)

1 = νd(H)(

N

d+ O(1)

). (6.4)

Trivially νq(H) ≤ kω(q) = dk(q) for squarefree q. Therefore, we conclude that

SR(N ;H) = N

1

k!

∑

d≤R

µ(d)νd(H)d

(log

R

d

)k+ O

(logR)k

k!

∑[

d≤R

νd(H)

= NTR(H) + O(R(k + logR)2k

),

(6.5)

by Lemma 2.Let (a) denote the contour s = a+it, −∞ < t < ∞. We apply the formula

12πi

∫

(c)

xs

sk+1ds =

{0 if 0 < x ≤ 1,1k!(logx)k if x ≥ 1,

(6.6)

for c > 0, and have that

TR(H) =1

2πi

∫

(1)

F (s)Rs

sk+1ds, (6.7)

where, letting s = σ + it and assuming σ > 0,

F (s) =∞∑

d=1

µ(d)νd(H)d1+s

=∏

p

(1 − νp(H)

p1+s

). (6.8)

Since νp(H) = k for all p > h, we write

F (s) =GH(s)

ζ(1 + s)k, (6.9)

where by (5.1)

GH(s) =∏

p

(1− νp(H)

p1+s

)(1 − 1

p1+s

)−k

=∏

p

(1 +

k − νp(H)p1+s

+ Oh

(k2

p2+2σ

)),

(6.10)

which is analytic and uniformly bounded for σ > −1/2+δ for any δ > 0. Also,by (2.2) we see that

GH(0) = S(H). (6.11)

From (5.4) and (6.9), the function F (s) satisfies the bound

F (s) � |GH(s)|(C log(|t|+ 3))k. (6.12)


in the region on and to right of L. Here GH(s) is analytic and bounded in thisregion, and has a dependence on both k and the size h of the components ofH. We note that νp(H) = k not only when p > h, but whenever p 6 | ∆, where

∆ :=∏

1≤i<j≤k

|hj − hi|, (6.13)

since then all k of the hi’s are distinct modulo p. We now introduce an impor-tant parameter U that is used throughout the rest of the paper. We want U

to be an upper bound for log ∆, and since trivially ∆ ≤ hk2we choose

U := Ck2 log(2h) (6.14)

and havelog∆ ≤ U. (6.15)

We now prove, for −1/4 < σ ≤ 1,

|GH(s)| � exp(5kU δ log log U), where δ = max(−σ, 0). (6.16)

We treat separately the different pieces of the product defining GH. First, byuse of the inequality log(1 + x) ≤ x for x ≥ 0, we have

∣∣∣∣∣∣∏

p≤U

(1 − νp(H)

p1+s

)∣∣∣∣∣∣≤∏

p≤U

(1 +

k

p1−δ

)

= exp(∑

p≤U

log(1 +

k

p1−δ

))

≤ exp(∑

p≤U

k

p1−δ

)

≤ exp(

kU δ∑

p≤U

1p

)

� exp(kU δ log logU

).

Second, by the same estimates and the inequality (1 − x)−1 ≤ 1 + 3x for0 ≤ x ≤ 2/3, we see that∣∣∣∣∣∣∏

p≤U

(1 − 1

p1+s

)−k∣∣∣∣∣∣≤( ∏

p≤U

(1− 1

p1−δ

)−1 )k

≤( ∏

p≤U

(1 +

3p1−δ

))k (since

1p1−δ

≤ 123/4

<23

),

� exp(3kU δ log logU

).


Hence, the terms in the product for GH(s) with p ≤ U are �exp

(4kU δ log log U

).

For the terms p > U , we first consider those for which p|∆. In absolutevalue, they are

≤∏

p|∆p>U

(1 +

k

p1−δ

)(1 +

3p1−δ

)k

≤ exp(∑

p|∆p>U

4k

p1−δ

).

Since there are less than (1 + o(1)) log∆ < U primes with p|∆, the sum aboveis increased if we replace these terms with the integers between U and 2U .Therefore the right-hand side above is

≤ exp(4k

∑

U<n≤2U

1n1−δ

)≤ exp

(4k(2U)δ

∑

U<n≤2U

1n

)≤ exp(4kU δ).

Finally, if p > U ∣∣∣∣k

p1+s

∣∣∣∣ ≤k

U1−δ≤ 1

2,

so that in absolute value the terms with p > U and p 6 | ∆ are

=

∣∣∣∣∣∏

p 6 |∆p>U

(1 − k

p1+s

)(1 − 1

p1+s

)−k∣∣∣∣∣

=

∣∣∣∣∣ exp

(∑

p 6 |∆p>U

(−

∞∑

ν=1

1ν

( k

p1+s

)ν+ k

∞∑

ν=1

1ν

( 1p1+s

)ν))∣∣∣∣∣

≤ exp(∑

p>U

∞∑

ν=2

2ν

( k

p1−δ

)ν)

≤ exp(∑

p>U

∞∑

ν=2

( k

p1−δ

)ν)

≤ exp(

2k2∑

n>U

1n2−2δ

)

≤ exp(4k2U δ

U1−δ

)

≤ exp(2kU δ

).

Thus, the terms with p > U contribute ≤ exp(6kU δ

), from which we obtain

(6.16).In conclusion, for h � RC (where C > 0 is fixed and as large as we wish)

and for s on or to the right of L, we have

F (s) � (C log(|t| + 3))k exp(5kU δ log logU). (6.17)


Returning to the integral in (6.7), we see that the integrand vanishes as |t| →∞, −1/4 < σ ≤ 1. By (6.9) we see that in moving the contour from (1) to theleft to L we either pass through a simple pole at s = 0 when H is admissible(so that S(H) 6= 0), or we pass through a regular point at s = 0 when H is notadmissible. In either case, we have by virtue of (5.2), (6.11), (6.14), (6.17),and Lemma 1, for any k satisfying (6.1),

TR(H) = GH(0) +1

2πi

∫

L

F (s)Rs

sk+1ds

= S(H) + O(e−c√

log R).

(6.18)

Equation (6.2) now follows from this and (6.5).

Remark. The exponent 1/2 in the restriction k � (logR)1/2−η0 is not signif-icant. Using Vinogradov’s zero-free region for ζ(s) we could replace 1/2 by3/5.

7. First Part of the Proof of Proposition 1

LetH = H1 ∪H2, |H1| = k1, |H2| = k2, k = k1 + k2,

r = |H1 ∩H2|, M = k1 + k2 + `1 + `2.(7.1)

Thus |H| = k − r. We prove Proposition 1 in the following sharper form.

Proposition 4. Let h � RC, where C is any positive fixed constant. AsR, N → ∞, we have∑

n≤N

ΛR(n;H1, `1)ΛR(n;H2, `2) =(

`1 + `2

`1

)(logR)r+`1+`2

(r + `1 + `2)!S(H)N

+ N

r+`1+`2∑

j=1

Dj(`1, `2,H1,H2)(logR)r+`1+`2−j

+ OM(Ne−c√

logR) + O(R2(3 logR)3k+M),(7.2)

where the Dj(`1, `2,H1,H2)’s are functions independent of R and N whichsatisfy the bound

Dj(`1, `2,H1,H2) �M (logU)Cj �M (log log 10h)C′j (7.3)

where U is defined in (6.14) and Cj and C ′j are two positive constants depending

on M .

Proof. We can assume that both H1 and H2 are non-empty since the casewhere one of these sets is empty can be covered in the same way as we did in


case of ` = 0 in Section 6. Thus k ≥ 2 and we have

SR(N ;H1,H2, `1, `2) :=N∑

n=1

ΛR(n;H1, `1)ΛR(n;H2, `2)

=1

(k1 + `1)!(k2 + `2)!

∑

d,e≤R

µ(d)µ(e)(

logR

d

)k1+`1 (log

R

e

)k2+`2 ∑

1≤n≤Nd|PH1(n)e|PH2 (n)

1.

(7.4)

For the inner sum, we let d = a1a12, e = a2a12 where (d, e) = a12. Thus a1, a2,and a12 are pairwise relatively prime, and the divisibility conditions d|PH1(n)and e|PH2(n) become a1|PH1(n), a2|PH2(n), a12|PH1(n), and a12|PH2(n). Asin Section 6, we get νa1(H1) solutions for n modulo a1, and νa2(H2) solutionsfor n modulo a2. If p|a12, then from the two divisibility conditions we haveνp(H1(p)∩ H2(p)) solutions for n modulo p, where

H(p) = {h′1, . . . , h

′νp(H) : h′

j ≡ hi ∈ H for some i, 1 ≤ h′j ≤ p}

Notice that H(p) = H if p > h . Alternatively, we can avoid this definitionwhich is necessary only for small primes by defining

νp(H1∩H2) := νp(H1(p)∩ H2(p)) := νp(H1) + νp(H2) − νp(H) (7.5)

and then extend this definition to squarefree numbers by multiplicativity.6

Thus we see that∑


1 = νa1(H1)νa2(H2)νa12(H1∩H2)(

N

a1a2a12+ O(1)

),

and haveSR(N ; `1, `2,H1,H2)

=N

(k1 + `1)!(k2 + `2)!

∑′

a1a12≤Ra2a12≤R

µ(a1)µ(a2)µ(a12)2νa1(H1)νa2(H2)νa12(H1∩H2)a1a2a12

×(

logR

a1a12

)k1+`1 (log

R

a2a12

)k2+`2

+ O

(logR)M

∑′

a1a12≤Ra2a12≤R

µ(a1)2µ(a2)2µ(a12)2νa1(H1)νa2(H2)νa12(H1∩H2)

= NTR(`1, `2;H1,H2) + O(R2(3 logR)3k+M),(7.6)

6We are making a convention here that for νp we take intersections modulo p.


where∑′ indicates the summands are pairwise relatively prime. Notice that

by Lemma 2, the error term was bounded by

� (logR)M∑[

q≤R2

∑

q=a1a2a12

dk(q)

= (logR)M∑[

q≤R2

d3(q)dk(q)

= (logR)M∑[

q≤R2

d3k(q)

� R2(3 logR)3k+M .

By (6.6), we have

TR(`1, `2;H1,H2) =1

(2πi)2

∫

(1)

∫

(1)

F (s1, s2)Rs1

s1k1+`1+1

Rs2

s2k2+`2+1

ds1 ds2, (7.7)

where, by letting sj = σj + itj and assuming σ1, σ2 > 0,

F (s1, s2) =∑′

1≤a1,a2,a12<∞

µ(a1)µ(a2)µ(a12)2νa1(H1)νa2(H2)νa12(H1∩H2)a1

1+s1a21+s2a12

1+s1+s2

=∏

p

(1 − νp(H1)

p1+s1− νp(H2)

p1+s2+

νp(H1∩H2)p1+s1+s2

).

(7.8)

Since for all p > h we have νp(H1) = k1, νp(H2) = k2, and νp(H1 ∩ H2) = r,we factor out the dominant zeta-factors and write

F (s1, s2) = GH1,H2(s1, s2)ζ(1 + s1 + s2)r

ζ(1 + s1)k1ζ(1 + s2)k2, (7.9)

where by (5.1)

GH1,H2(s1, s2) =∏

p

(1 − νp(H1)

p1+s1 − νp(H2)p1+s2 + νp(H1∩H2)

p1+s1+s2

)(1 − 1

p1+s1+s2

)r

(1 − 1

p1+s1

)k1(1 − 1

p1+s2

)k2

(7.10)is analytic and uniformly bounded for σ1, σ2 > −1/4 + δ, for any fixed δ > 0.Also, from (2.2), (7.1), and (7.5) we see immediately that

GH1,H2(0, 0) = S(H). (7.11)

Furthermore, the same argument leading to (6.16) shows that for s1, s2 on Lor to the right of L

GH1,H2(s1, s2) � exp(CkU δ1+δ2 log log U). (7.12)


with δi = −min(σi, 0) and U defined in (6.14). We define

W (s) := sζ(1 + s) (7.13)

andD(s1, s2) = GH1,H2(s1, s2)

W (s1 + s2)r

W (s1)k1W (s2)k2, (7.14)

so that

TR(`1, `2;H1,H2) =1

(2πi)2

∫

(1)

∫

(1)

D(s1, s2)Rs1+s2

s1`1+1s2

`2+1(s1 + s2)rds1ds2.

(7.15)To complete the proof of Proposition 1, we need to evaluate this integral. Wewill also need to evaluate a similar integral in the proof of Proposition 2, wherethe parameters k1, k2, and r have several slightly different relationships withH1 and H2, and G is slightly altered. Therefore we change notation to handlethese situations simultaneously.

8. Completion of the proof of Proposition 1: Evaluating an integral

Let

T ∗R(a, b, d, u, v, h) :=

1(2πi)2

∫

(1)

∫

(1)

D(s1, s2)Rs1+s2

su+11 sv+1

2 (s1 + s2)dds1 ds2, (8.1)

where

D(s1, s2) =G(s1, s2)W d(s1 + s2)

W a(s1)W b(s2)(8.2)

and W is from (7.13). We assume G(s1, s2) is regular on L and to the right ofL and satisfies the bound, with δi = −min(σi, 0),

G(s1, s2) �M exp(CMU δ1+δ2 log log U), where U = CM2 log(2h). (8.3)

Lemma 3. Suppose that

0 ≤ a, b, d, u, v ≤ M, a + u ≥ 1, b + v ≥ 1, d ≤ min(a, b), (8.4)

where M is a large constant and our estimates may depend on M . Let h � RC,with C any positive fixed constant. Then we have, as R → ∞,

T ∗R(a, b, d,u, v, h) =

(u + v

u

)(logR)u+v+d

(u + v + d)!G(0, 0)

+u+v+d∑

j=1

Dj(a, b, d, u, v, h)(logR)u+v+d−j + OM(e−c√

log R),(8.5)

where the Dj(a, b, d, u, v, h)’s are functions independent of R which satisfy the


boundDj(a, b, d, u, v, h) �M (logU)Cj �M (log log 10h)C′

j (8.6)

for some positive constants Cj , C ′j depending on M .

Proof. One would expect to proceed exactly as in Section 6 by movingboth contours to the left to L. There is, however, a complication because theintegrand now contains the function ζ(1+ s1 + s2) which necessitates that alsos1 +s2 be restricted to the region to the right of L if we wish to use the boundsin (5.4). 7 By the conditions of Lemma 3, (5.4), and (8.3), we have

D(s1, s2)su+11 sv+1

2 (s1 + s2)d

�Mexp(CMU δ1+δ2 log log U)

(log(|t1|+ 3) log(|t2| + 3)

)2M max(1, |s1 + s2|−d)|s1|a+u+1|s2|b+v+1

(8.7)

provided s1, s2, and s1 + s2 are on or to the right of L. We next let

V = e√

logR (8.8)

and define the contours, for j = 1 or 2,

L′j =

( 4−jc

logV

)={ 4−jc

logV+ it : −∞ < t < ∞

},

Lj ={ 4−jc

log V+ it : |t| ≤ 4−jV

},

Lj ={− 4−jc

log V+ it : |t| ≤ 4−jV

},

Hj ={σj ± i4−jV : |σj | ≤

4−jc

logV

}.

(8.9)

By (8.7) the integrand in (8.1) vanishes as |t1| → ∞ or |t2| → ∞ provided s1

and s2 are to the right of L′2. We first shift the contours (1) for the integrals

over s1 and s2 to L′1 and L′

2, respectively. Next, we truncate these contours sothat they may be replaced with L1 and L2. In doing this there are two errorterms which are estimated by (8.7), for example the error term coming fromL′

1 and the truncated piece of L′2 is

�M (logU)CM (logV )MV5c

16

(∫ ∞

−∞

(log |t| + 3)2M

| c4 logV + it|a+u+1

dt

)(∫ ∞

V/16

(log t)2M

t2dt

)

�M(logV )6M

V 1− 5c

16

�M e−c√

log R.

7This was pointed out to us by J. Sivak and also Y. Motohashi. This problem was handled insimilar ways in [30] and in [15]; we have also adopted this approach here.


Hence

T ∗R =

1(2πi)2

∫

L2

∫

L1

D(s1, s2)Rs1+s2

su+11 sv+1

2 (s1 + s2)dds1 ds2 + OM (e−c

√logR). (8.10)

To replace the s1-contour along L1 with the contour along L1 we consider therectangle formed by L1, H1, and L1 which contains poles of the integrand asa function of s1 at s1 = 0 and s1 = −s2. Hence we see

T ∗R =

12πi

∫

L2

Ress1=0

(D(s1, s2)Rs1+s2

su+11 sv+1

2 (s1 + s2)d

)ds2

+1

2πi

∫

L2

Ress1=−s2

(D(s1, s2)Rs1+s2

su+11 sv+1

2 (s1 + s2)d

)ds2

+1

(2πi)2

∫

L2

∫

L1∪H1

D(s1, s2)Rs1+s2

su+11 sv+1

2 (s1 + s2)dds1 ds2 + OM (e−c

√logR).

(8.11)

Here the contours along L1 and H1 are oriented clockwise. In the first andthird integrals we move the contour over L2 to L2 in the same fashion, butnow we only pass a pole at s2 = 0. Thus we obtain

T ∗R = Res

s2=0Ress1=0

D(s1, s2)Rs1+s2

su+11 sv+1

2 (s1 + s2)d+

12πi

∫

L2∪H2

Ress1=0

(D(s1, s2)Rs1+s2

su+11 sv+1

2 (s1 + s2)d

)ds2

+1

2πi

∫

L1∪H1

Ress2=0

(D(s1, s2)Rs1+s2

su+11 sv+1

2 (s1 + s2)d

)ds1 +

12πi

∫

L2

Ress1=−s2

(D(s1, s2)Rs1+s2

su+11 sv+1

2 (s1 + s2)d

)ds2

+1

(2πi)2

∫

L2∪H2

∫

L1∪H1

D(s1, s2)Rs1+s2

su+11 sv+1

2 (s1 + s2)dds1 ds2 + OM(e−c

√logN )

:= I0 + I1 + I2 + I3 + I4 + OM(e−c√

logR).(8.12)

We will see that the residue I0 provides the main term and some of the lowerorder terms, the integral I3 provides the remaining lower order terms, and theintegrals I1, I2, and I4 are error terms.

We consider first I0. At s1 = 0 there is a pole of order ≤ u + 1, andtherefore 8 by Leibniz’s rule we have

Ress1=0

D(s1, s2)Rs1

su+11 (s1 + s2)d

=1u!

u∑

i=0

(u

i

)(logR)u−i ∂i

∂si1

(D(s1, s2)(s1 + s2)d

) ∣∣∣∣∣s1=0

8If G(0,0) = 0 then the order of the pole is u or less, but the formula we use to compute theresidue is still valid. In this situation one or more of the initial terms will have the value zero.


and

∂i

∂si1

(D(s1, s2)(s1 + s2)d

) ∣∣∣∣∣s1=0

= (−1)iD(0, s2)d(d + 1) · · ·(d + i − 1)sd+i2

+i∑

j=1

(i

j

)∂j

∂sj1

D(s1, s2)

∣∣∣∣∣s1=0

(−1)i−j d(d + 1) · · ·(d + i− j − 1)

sd+i−j2

,

where in case of i = j (including the case when i = j = 0 and d ≥ 0 arbitrary)the empty product in the numerator is 1. We conclude that

Ress1=0

D(s1, s2)Rs1

su+11 (s1 + s2)d

=u∑

i=0

i∑

j=0

a(i, j)(logR)u−i

sd+i−j2

∂j

∂sj1

D(s1, s2)

∣∣∣∣∣s1=0

(8.13)

with a(i, j) given explicitly in the previous equations. To complete the evalu-ation of I0, we see that the (i, j)th term contributes to I0 a pole at s2 = 0 oforder v + 1 + d + i− j (or less), and therefore by Leibniz’s formula

Ress2=0

Rs2

sv+1+d+i−j2

∂j

∂sj1

D(s1, s2)

∣∣∣∣∣s1=0

=1

(v + d + i− j)!

v+d+i−j∑

m=0

(v + d + i− j

m

)(logR)v+d+i−j−m ∂m

∂sm2

∂j

∂sj1

D(s1, s2)

∣∣∣∣∣s1=0s2=0

.

This completes the evaluation of I0, and we conclude

I0 =u∑

i=0

i∑

j=0

v+d+i−j∑

m=0

b(i, j, m)(

∂m

∂sm2

∂j

∂sj1

D(s1, s2)

∣∣∣∣∣s1=0s2=0

)(logR)u+v+d−j−m ,

(8.14)where

b(i, j, m) = (−1)i−j

(u

i

)(i

j

)(v + d + i − j

m

)d(d + 1) · · ·(d + i − j − 1)

u!(v + d + i − j)!.

(8.15)The main term is of order (logR)u+v+d and occurs when j = m = 0.

Therefore, it is given by

G(0, 0)(logR)u+v+d

(1u!

u∑

i=0

(u

i

)(−1)id(d + 1) · · ·(d + i− 1)

(v + d + i)!

).

It is not hard to prove that

1u!

u∑

i=0

(u

i

)(−1)id(d + 1) · · ·(d + i − 1)

(v + d + i)!=(

u + v

u

)1

(u + v + d)!, (8.16)

from which we conclude that the main term is

G(0, 0)(

u + v

u

)1

(u + v + d)!(logR)d+u+v . (8.17)


Motohashi found the following approach which avoids proving (8.16) directlyand can be used to simplify some of the previous analysis. Granville also madea similar observation. The residue we are computing is equal to

1(2πi)2

∫

Γ2

∫

Γ1

D(s1, s2)Rs1+s2

su+11 sv+1

2 (s1 + s2)dds1 ds2,

where Γ1 and Γ2 are the circles |s1| = ρ and |s2| = 2ρ, respectively, with asmall ρ > 0. Writing s1 = s and s2 = sw, this is equal to

1(2πi)2

∫

Γ3

∫

Γ1

D(s, sw)Rs(w+1)

su+v+d+1wv+1(w + 1)dds dw,

with Γ3 the circle |w| = 2. The main term is obtained from the constant termG(0, 0) in the Taylor expansion of D(s, sw) and, therefore, equals

G(0, 0)(logR)u+v+d

(u + v + d)!1

2πi

∫

Γ3

(w + 1)u+v

wv+1dw = G(0, 0)

(logR)u+v+d

(u + v + d)!

(u + v

v

),

by the binomial expansion.To complete the analysis of I0, we only need to show that the partial

derivatives of D(s1, s2) at (0, 0) satisfy the bounds given in the lemma. Forthis, we use Cauchy’s estimate for derivatives

|f (j)(z0)| ≤ max|z−z0 |=η

|f(z)| j!ηj

, (8.18)

if f(z) is analytic for |z − z0| ≤ η. In the application below we will choose z0

on L or to the right of L and

η =1

C logU logT, where T = |s1| + |s2| + 3. (8.19)

Thus we see the whole circle |z−z0−1| = η will remain in the region (5.3) andthe estimates (5.4) hold in this circle. (We remind the reader that the genericconstants c, C take different values at different appearances.) Thus, we havefor s1, s2 on L or to the right of L, and j ≤ M , m ≤ 2M ,

∂m

∂sm2

∂j

∂sj1

D(s1, s2)

≤ j!m!(C log U logT )j+m max|s∗

1−s1|≤η,|s∗2−s2|≤η

|D(s∗1, s∗2)|

�M exp(CMU δ1+δ2 log log U)(logT )6M max(1, |s1 + s2|)d

max(1, |s1|)a max(1, |s2|)b,

(8.20)

which, if max(|s1|, |s2|) ≤ C, reduces to

∂m

∂sm2

∂j

∂sj1

D(s1, s2) �M exp(CMU δ1+δ2 log logU

). (8.21)


In particular, we have

∂m

∂sm2

∂j

∂sj1

D(s1, s2)

∣∣∣∣∣s1=0s2=0

�M (logU)CM . (8.22)

We conclude from (8.14), (8.17), and (8.22) that I0 provides the main termand some of the secondary terms in Lemma 3 which satisfy the stated bound.

We now consider I1. By (8.13) and (8.20) we have

I1 �M (logR)u

∫

L2∪H2

eCMUδ2 log log U (log(|t2| + 3))3M max(1, |s2|d)|s2|v+1+d max(1, |s2|b)

|Rs2 ||ds2|.

(8.23)By (8.4) we have b+ v ≥ 1, along H2 we have |Rs2 | � ec

√logR, and along both

L2 and H2 we have U δ2 � 1. When |s2| ≥ 1 we have

max(1, |s2|d)|s2|v+1+d max(1, |s2|b)

� 1|s2|v+1+b

� 1|s2|2

,

and therefore the contribution from H2 to I1 is

�M(logR)7M/2−1/2

V 2ec

√log R �M e−c

√logR.

Similarly the integral along L2 is bounded by

�M (logR)2MR− c

16 log V

∫ V

−V(log(|t|+ 3))3M min

(1

(logV )−3M,

1t2

)dt

�M (logR)3MR− c

16 log V

�M e−c√

log R,

and therefore I1 also satisfies this bound. The same bound holds for I2 sinceit is with relabeling equal to I1. Further, I4 also satisfies this bound by sameargument on applying (8.7) and noting that |s1 + s2| � c

log V in I4.Finally, we examine I3, which only occurs if d ≥ 1. We have

Ress1=−s2

(D(s1, s2)Rs1+s2

su+11 sv+1

2 (s1 + s2)d

)= lim

s1→−s2

1(d − 1)!

∂d−1

∂s1d−1

(D(s1, s2)Rs1+s2

s1u+1s2

v+1

)

=1

(d− 1)!

d−1∑

i=0

Bi(s2)(logR)d−1−i,

(8.24)

where

Bi(s2) =(

d − 1i

) i∑

j=0

(i

j

)∂i−j

∂si−j1

D(s1, s2)

∣∣∣∣∣s1=−s2

(−1)j(u + 1) · · ·(u + j)

(−1)u+j+1su+v+j+22

,

(8.25)


and therefore by (8.12), (8.24), and (8.25) we have

I3 =1

(d− 1)!

d−1∑

i=0

Ci(logR)d−1−i, (8.26)

whereCi =

12πi

∫

L2

Bi(s2) ds2, 0 ≤ i ≤ d− 1. (8.27)

By (8.20) and (8.25) we see that for s2 to the right of L

Bi(s2) �M exp(CMU |δ2| log logU

) (log(|t2| + 3)

)4M

|t2|u+v+a+b+2 max(1, |t2|i). (8.28)

In (8.27) we may shift the contour L2 to the imaginary axis with a semicircle ofradius 1/ logU centered at and to the right of s2 = 0. Further, we can extendthis contour to the complete imaginary axis with an error OM(e−c

√log R) using

(8.28) and the same argument we use above (8.10). Letting

L′ ={

s = it : |t| ≥ 1logU

}∪{s =

eiϑ

log U: −π

2≤ ϑ ≤ π

2

}(8.29)

oriented from −i∞ up to i∞, we conclude

Ci =1

2πi

∫

L′

Bi(s2) ds2 + OM(e−c√

logR), 0 ≤ i ≤ d − 1. (8.30)

The integral here is independent of R but depends on h. Therefore this providesin (8.26) some further lower order terms in Lemma 3. The contribution to Ci

from the integral along the imaginary axis is

�M (logU)u+v+i+a+b+1 exp(CM log logU) �M (logU)C′M . (8.31)

This expression also bounds the contribution to Ci from the semicircle contourand thus completes the evaluation of I3. Combining our results, we obtainLemma 3.

9. Proof of Proposition 2

We introduce some standard notation associated with (1.2) and (1.3). Let

θ(x; q, a) :=∑

p≤xp≡a(mod q)

log p = [(a, q) = 1]x

φ(q)+ E(x; q, a), (9.1)

where [S] is 1 if the statement S is true and is 0 if S is false. Next, we define

E ′(x, q) := maxa

(a,q)=1

|E(x; q, a)|, E∗(x, q) = maxy≤x

E ′(y, q). (9.2)


In this paper we only need level of distribution results for E ′, but usually theseresults are stated in the stronger form for E∗. Thus, for some 1/2 ≤ ϑ ≤ 1, weassume, given any A > 0 and ε > 0, that

∑

q≤xϑ−ε

E∗(x, q) �A,εx

(logx)A. (9.3)

This is known to hold with ϑ = 1/2.We prove the following stronger version of Proposition 2. Let

CR(`1, `2,H1,H2, h0) =

1 if h0 6∈ H,(`1+`2+1) log R

(`1+1)(r+`1+`2+1) if h0 ∈ H1 and h0 6∈ H2,(`1+`2+2)(`1+`2+1) log R(`1+1)(`2+1)(r+`1+`2+1) if h0 ∈ H1 ∩H2.

(9.4)By relabeling the variables we obtain the corresponding form if h0 ∈ H2 andh0 6∈ H1. We continue to use the notation (7.1).

Proposition 5. Suppose h � R. Given any positive A, there is a B =B(A, M) such that for R �M,A N

14 /(logN)B and R, N → ∞,

N∑

n=1

ΛR(n;H1, `1)ΛR(n;H2, `2)θ(n + h0)

=CR(`1, `2,H1,H2, h0)

(r + `1 + `2)!

(`1 + `2

`1

)S(H0)N(logR)r+`1+`2

+ N

r∑

j=1

Dj(`1, `2,H1,H2, h0)(logR)r+`1+`2−j + OM,A

(N

(logN)A

),

(9.5)

where the Dj(`1, `2,H1,H2, h0)’s are functions independent of R and N whichsatisfy the bound

Dj(H1,H2, h0) �M (logU)Cj �M (log log 10h)C′j (9.6)

for some positive constants Cj , C ′j depending on M . Assuming that conjecture

(9.3) holds, then (9.5) holds for R �M Nϑ

2−ε and h ≤ Rε, for any given ε > 0.

Proof. We will assume that both H1 and H2 are non-empty so that k1 ≥ 1 andk2 ≥ 1. The proof in the case when one of these sets is empty is much easier


and may be obtained by an argument analogous to that of Section 6. We have

SR(N ;H1,H2, `1, `2, h0) :=N∑

n=1

ΛR(n;H1, `1)ΛR(n;H2, `2)θ(n + h0)

=1

(k1 + `1)!(k2 + `2)!

∑

d,e≤R

µ(d)µ(e)(

logR

d

)k1+`1 (log

R

e

)k2+`2 ∑


θ(n + h0).

(9.7)To treat the inner sum above, let d = a1a12 and e = a2a12, where (d, e) = a12,so that a1, a2, and a12 are pairwise relatively prime. As in Section 7, the n

for which d|PH1(n) and e|PH2(n) cover certain residue classes modulo [d, e].If n ≡ b (moda1a2a12) is such a residue class, then letting m = n + h0 ≡b + h0(mod a1a2a12), we see that this residue class contributes to the innersum

∑

1+h0≤m≤N+h0m≡b+h0 (moda1a2a12)

θ(m) = θ(N + h0; a1a2a12, b + h0)− θ(h0; a1a2a12, b + h0)

= [(b + h0, a1a2a12) = 1]N

φ(a1a2a12)+ E(N ; a1a2a12, b + h0) + O(h logN).

(9.8)

We need to determine the number of these residue classes where (b +h0, a1a2a12) = 1 so that the main term is non-zero. If p|a1, then b ≡−hj (mod p) for some hj ∈ H1, and therefore b + h0 ≡ h0 − hj (mod p).Thus, if h0 is distinct modulo p from all the hj ∈ H1, then all νp(H1) residueclasses satisfy the relatively prime condition, while otherwise h0 ≡ hj(mod p)for some hj ∈ H1 leaving νp(H1)−1 residue classes with a non-zero main term.We introduce the notation ν∗

p(H10) for this number in either case, where we

define for a set H and integer h0

ν∗p(H0) = νp(H0) − 1, (9.9)

whereH0 = H ∪ {h0}. (9.10)

We extend this definition to ν∗d(H0) for squarefree numbers d by multiplica-

tivity. The function ν∗d is familiar in sieve theory, see [16]. A more algebraic

discussion of νd∗ may also be found in [14, 15]. We define ν∗

d

((H1∩H2)0

)as in

(7.5).Next, the divisibility conditions a2|PH2(n), a12|PH1(n), and a12|PH2(n)

are handled as in Section 7 together with the above considerations. Since


E(n; q, a) � (logN) if (a, q) > 1 and q ≤ N , we conclude that∑

1≤n≤Nd|PH1 (n)e|PH2 (n)

θ(n + h0) = ν∗a1

(H10)ν∗

a2(H2

0)ν∗a12

((H1∩H2)0

) N

φ(a1a2a12)

+ O

dk(a1a2a12)

max

b(b,a1a2a12)=1

∣∣E(N ; a1a2a12, b)∣∣ + h(logN)

.

(9.11)

Substituting this into (9.7) we obtain

SR(N ;H1,H2, `1, `2, h0)

=N

(k1 + `1)!(k2 + `2)!

∑′

a1a12≤Ra2a12≤R

µ(a1)µ(a2)µ(a12)2ν∗a1

(H10)ν∗

a2(H2

0)ν∗a12

((H1∩H2)0

)

φ(a1a2a12)

×(

logR

a1a12

)k1+`1 (log

R

a2a12

)k2+`2

+ O

(logR)M

∑′

a1a12≤Ra2a12≤R

dk(a1a2a12)E ′(N, a1a2a12)

+ O(hR2(3 logN)M+3k+1)

= N TR(H1,H2, `1, `2, h0) + O((logR)MEk(N)

)+ O(hR2(3 logN)M+3k+1),

(9.12)

where the last error term was obtained using Lemma 2. To estimate thefirst error term we use Lemma 2, (1.3), and the trivial estimate E ′(N, q) ≤(2N/q) logN for q ≤ N to find, uniformly for k ≤

√(logN)/18, that

|Ek(N)| ≤∑[

q≤R2

dk(q) maxb

(b,q)=1

∣∣E(N ; q, b)∣∣ ∑

q=a1a2a12

1

=∑[

q≤R2

dk(q)d3(q)E ′(N, q)

≤

√√√√∑[

q≤R2

d3k(q)2

q

√∑

q≤R2

q(E ′(N, q))2

≤√

(logN)9k2√

2N log N

√∑

q≤R2

E ′(N, q)

� N(logN)(9k2+1−A)/2,

(9.13)

provided R2 � N12 /(logN)B. On relabeling, we conclude that given any

positive integers A and M there is a positive constant B = B(A, M) so that


for R � N14

(logN)B and h ≤ R,

SR(N ;H1,H2, `1, `2, h0) = N TR(H1,H2, `1, `2, h0)+OM

(N

(logN)A

). (9.14)

Using (9.3) with any ϑ > 1/2, we see that (9.14) holds for the longer rangeR �M N

ϑ

2−ε, h � N ε.

Returning to the main term in (9.12), we have by (6.6) that

TR(H1,H2, `1, `2, h0) =1

(2πi)2

∫

(1)

∫

(1)

F (s1, s2)Rs1

s1k1+`1+1

Rs2

s2k2+`2+1

ds1ds2,

(9.15)where, by letting sj = σj + itj and assuming σ1, σ2 > 0,

F (s1, s2) =∑′

1≤a1,a2,a12<∞

µ(a1)µ(a2)µ(a12)2ν∗a1

(H10)ν∗

a2(H2

0)ν∗a12

((H1∩H2)0

)

φ(a1)a1s1φ(a2)a2

s2φ(a12)a12s1+s2

=∏

p

(1 −

ν∗p(H1

0)(p− 1)ps1

−ν∗p(H2

0)(p − 1)ps2

+ν∗

p

((H1∩H2)0

)

(p− 1)ps1+s2

).

(9.16)

We now consider three cases.

Case 1. Suppose h0 6∈ H. Then we have, for p > h,

ν∗p(H1

0) = k1, ν∗p(H2

0) = k2, ν∗p

((H1∩H2)0

)= r.

Therefore in this case we define GH1,H2(s1, s2) by

F (s1, s2) = GH1,H2(s1, s2)ζ(1 + s1 + s2)r

ζ(1 + s1)k1ζ(1 + s2)k2. (9.17)

Case 2. Suppose h0 ∈ H1 but h0 6∈ H2. (By relabeling this also coversthe case where h0 ∈ H2 and h0 6∈ H1.) Then for p > h

ν∗p(H1

0) = k1 − 1, ν∗p(H2

0) = k2, ν∗p

((H1∩H2)0

)= r.

Therefore, we define GH1,H2(s1, s2) by

F (s1, s2) = GH1,H2(s1, s2)ζ(1 + s1 + s2)r

ζ(1 + s1)k1−1ζ(1 + s2)k2. (9.18)

Case 3. Suppose h0 ∈ H1 ∩ H2. Then for p > h

ν∗p(H1

0) = k1 − 1, ν∗p (H2

0) = k2 − 1, ν∗p

((H1∩H2)0

)= r − 1.

Thus, we define GH1,H2(s1, s2) by

F (s1, s2) = GH1,H2(s1, s2)ζ(1 + s1 + s2)r−1

ζ(1 + s1)k1−1ζ(1 + s2)k2−1. (9.19)


In each case, G is analytic and uniformly bounded for σ1, σ2 > −c, with anyc < 1/4.

We now show that in all three cases

GH1,H2(0, 0) = S(H0). (9.20)

Notice that in Cases 2 and 3 we have H0 = H. By (5.1), (7.5), (9.9), and(9.16), we find in all three cases

GH1,H2(0, 0)

=∏

p

(1− νp(H1

0) + νp(H20) − νp((H1∩H2)0) − 1p − 1

)(1 − 1

p

)−a(H1,H2,h0)

=∏

p

(1− νp(H0) − 1

p − 1

)(1 − 1

p

)−a(H1,H2,h0)

,

(9.21)

where in Case 1 a(H1,H2, h0) = k1+k2−r = k−r, in Case 2 a(H1,H2, h0) =(k1 − 1) + k2 − r = k − r − 1, and in Case 3 a(H1,H2, h0) = (k1 − 1) + (k2 −1) − (r − 1) = k − r − 1. Hence, in Case 1 we have

GH1,H2(0, 0) =∏

p

(p − νp(H0)

p − 1

)(1 − 1

p

)−(k−r)

=∏

p

(1 − νp(H0)

p

)(1 − 1

p

)−(k−r+1)

= S(H0),

(9.22)

while in Cases 2 and 3 we have

GH1,H2(0, 0) =∏

p

(p − νp(H)

p − 1

)(1 − 1

p

)−(k−r−1)

=∏

p

(1 − νp(H)

p

)(1 − 1

p

)−(k−r)

= S(H) (= S(H0)).

(9.23)

We are now ready to evaluate TR(H1,H2, `1, `2, h0). There are two dif-ferences between the functions F and G that appear in (9.16)–(9.19) and theearlier (7.8)–(7.10). The first difference is that a factor of p in the denomi-nator of the Euler product in (7.8) has been replaced by p − 1, which onlyeffects the value of constants in calculations. The second difference is the re-lationship between k1, k2, and r, which effects the residue calculations of themain terms. However, the analysis of lower order terms and the error analy-sis is essentially unchanged and, therefore, we only need to examine the main


terms. We use Lemma 3 here to cover all of the cases. Taking into account(9.17)–(9.19) we have in Case 1 that a = k1, b = k2, d = r, u = `1, v = `2; inCase 2 that a = k1 − 1, b = k2, d = r, u = `1 + 1, v = `2; and in Case 3 thata = k1 − 1, b = k2 − 1, d = r − 1, u = `1 + 1, v = `2 + 1. By (9.22) and (9.23),the proof of Theorems 2 and 2′ is thus complete.

10. Proof of Theorem 3

For convenience, we agree in our notation below that we consider every setof size k with a multiplicity k! according to all permutations of the elementshi ∈ H, unless mentioned otherwise. While unconventional, this will clarifysome of the calculations.

To prove Theorem 3 we consider in place of (3.5)

SR(N, k, `, h, ν)

:=1

Nh2k+1

2N∑

n=N+1

( ∑

1≤h0≤h

θ(n + h0)− ν log 3N

)( ∑

H⊂{1,2,...,h}|H|=k

ΛR(n;H, `)

)2

= MR(N, k, `, h)− νlog 3N

hMR(N, k, `, h),

(10.1)

where

MR(N, k, `, h) =1

Nh2k

2N∑

n=N+1

( ∑

H⊂{1,2,...,h}|H|=k

ΛR(n;H, `)

)2

(10.2)

and

MR(N, k, `, h) =1

Nh2k+1

2N∑

n=N+1

( ∑

1≤h0≤h

θ(n+h0))( ∑

H⊂{1,2,...,h}|H|=k

ΛR(n;H, `)

)2

.

(10.3)To evaluate MR and MR we multiply out the sum and apply Propositions 1and 2. We need to group the pairs of sets H1 and H2 according to size ofthe intersection r = |H1 ∩ H2|, and thus |H| = |H1 ∪ H2| = 2k − r. Let uschoose now a set H and here exceptionally we disregard the permutation of theelements in H. (However for H1 and H2 we take into account all permutations.)Given the set H of size 2k − r, we can choose H1 in

(2k−rk

)ways. Afterwards,

we can choose the intersection set in(kr

)ways. Finally, we can arrange the

elements both in H1 and H2 in k! ways. This gives(

2k − r

k

)(k

r

)(k!)2 = (2k − r)!

(k

r

)2

r! (10.4)


choices for H1 and H2, taking into account the permutation of the elementsin H1 and H2. If we consider in the summation every union set H of size j

just once, independently of the arrangement of the elements, then Gallagher’stheorem (3.7) may be formulated as

∑∗

H⊂{1,2,...,h}|H|=j

S(H) ∼ hj

j!, (10.5)

where∑∗ indicates every set is counted just once. Applying this, we obtain

on letting

x =logR

h, (10.6)

and using Proposition 1

MR(N, k, `, h) ∼ 1Nh2k

k∑

r=0

(2k − r)!(

k

r

)2

r!(

2`

`

)(logR)2`+r

(r + 2`)!N

∑

|H|=2k−r

∗S(H)

∼(

2`

`

)(logR)2`

k∑

r=0

(k

r

)2 xr

(r + 1) · · ·(r + 2`).

(10.7)By Proposition 2 and (10.5) we have

MR(N, k, `, h)∼ 1Nh2k+1

k∑

r=0

(2k − r)!(

k

r

)2

r! Zr, (10.8)

where abbreviating a = 2`+1`+1 =

(2`+1`+1

)(2``

)−1= 1

2

(2`+2`+1

)(2``

)−1, we have

Zr : =(

2`

`

)(logR)2`+r

(r + 2`)!

{r∑∗

|H|=2k−r

2alogR

r + 2` + 1S(H)N

+ (2k − 2r)∑∗

|H|=2k−r

alogR

r + 2` + 1S(H)N +

∑∗

|H|=2k−r

h∑

h0=1h0 /∈H

S(H0)N

}

∼ N

(2`

`

)(logR)2`+r

(r + 2`)!

{h2k−r

(2k − r)!2ak logR

r + 2` + 1+

2k − r + 1(2k − r + 1)!

h2k−r+1

},

(10.9)

where in the last sum we took into account which element of H0 is h0, whichcan be chosen in 2k − r + 1 ways. Thus we obtain

MR(N, k, `, h) ∼(

2`

`

)(logR)2`

k∑

r=0

(k

r

)2 xr

(r + 1) · · ·(r + 2`)

(2ak

r + 2` + 1x + 1

).

(10.10)


We conclude, on introducing the parameters

ϕ =1

` + 1, (so a = 2 − ϕ), Θ =

log R

log 3N, (so R = (3N)Θ), (10.11)

that

SR(N, k, `, h, ν) ∼(

2`

`

)(logR)2`Pk,`,ν(x), (10.12)

where

Pk,`,ν(x) =k∑

r=0

(k

r

)2 xr

(r + 1) · · ·(r + 2`)

(1 + x

(4(1 − ϕ

2 )kr + 2` + 1

− ν

Θ

)).

(10.13)Let

h = λ log3N, so that x =Θλ

. (10.14)

The analysis of when S > 0 now depends on the polynomial Pk,`,ν(x). Weexamine this polynomial as k, ` → ∞ in such a way that ` = o(k). In the firstplace, the size of the terms of the polynomial are determined by the factor

g(r) =(

k

r

)2

xr,

and since g(r) > g(r − 1) is equivalent to

r <k + 1

1 + 1√x

we should expect that the polynomial is controlled by terms with r close tok/(z + 1), where

z =1√x

. (10.15)

Consider now the sign of each term. For small x, the terms in the polynomialare positive, but they become negative when

1 + x

(4(1 − ϕ

2 )kr + 2` + 1

− ν

Θ

)< 0.

When r = k/(z + 1) and letting k, ` → ∞, ` = o(k), we have heuristically

1 + x

(4(1− ϕ

2 )kr + 2` + 1

− ν

Θ

)≈ 1 +

1z2

(4kk

z+1

− ν

Θ

)

=1z2

((z + 2)2 − ν

Θ

).

Therefore, the terms will be positive for r near k/(z +1) if z >√

νΘ − 2, which

is equivalent to λ > (√

ν − 2√

Θ)2. Since we can take Θ as close to ϑ/2 as wewish, this will imply Theorem 3. To make this argument precise, we choose r0


slightly smaller than where g(r) is maximal, and prove that all the negativeterms together contribute less then the single term r0, which will be positivefor z and thus λ close to the values above.

For the proof, we may assume ν ≥ 2 and 1/2 ≤ ϑ0 ≤ 1 are fixed, withϑ0 < 1 in case of ν = 2. (The case ν = 1 is covered by Theorem 2, and thecase ν = 2, ϑ0 = 1, E2 = 0 is covered by (1.11) proved in Section 3.) First,we choose ε0 as a sufficiently small fixed positive number. We will choose `

sufficiently large, depending on ν, ϑ0, ε0, and set

k = (` + 1)2 = ϕ−2, ` > `0(ν, ϑ0, ε0), so that ϕ < ϕ0(ν, ϑ0, ε0). (10.16)

Furthermore, we choose

Θ =logR

log 3N=

ϑ0(1− ϕ)2

, (10.17)

and (because of our assumptions on ν) we can define

z0 :=√

2ν/ϑ0 − 2 > 0. (10.18)

Thus, we see that

1 +1

z02

(4kk

z0+1

− 2ν

ϑ0

)=

1z0

2

((z0 + 2)2 − 2ν

ϑ0

)= 0, (10.19)

Let us choose now

r0 =[

k + 1z0 + 1

], r1 = r0 + ϕk = r0 + ` + 1, (10.20)

and putz = z0(1 + ε0). (10.21)

The linear factor in each term of Pk,`,ν(x) is, for r0 ≤ r ≤ r1,

1 + x

(4(1− ϕ

2 )kr + 2` + 1

− ν

Θ

)= 1 +

1z20(1 + ε0)2

(4k(1 + O(ϕ))

kz0+1 + O(kϕ)

− 2ν

ϑ0(1 − ϕ)

)

= 1 +−z2

0 + O(√

νϕ) + O(νϕ)z20(1 + ε0)2

> c(ν, ϑ0)ε0 if ϕ < ϕ0(ν, ϑ, ε0),(10.22)

where c(ν, ϑ0) > 0 is a constant. Letting

f(r) :=(

k

r

)2 xr

(r + 1) · · ·(r + 2`), (10.23)


we have, for any r2 > r1,

f(r2)f(r0)

<∏

r0<r≤r2

(k + 1 − r

r· 1z

)2

<∏

r0<r≤r1

(k + 1 − r

r· 1z

)2

(10.24)

<

((k + 1r0 + 1

− 1)

1z

)2`

≤(

z0 + 1 − 1z0(1 + ε0)

)2`

< e−ε0`.

Thus, the total contribution in absolute value of the negative terms of Pk,`,ν(x)will be, for sufficiently large `, at most

k

(1 +

4(k + ν)z2

)e−ε0`f(r0) < e−ε0`/2f(r0), (10.25)

while that of the single term r0 will be by (10.22) at least

c(ν, ϑ0)ε0f(r0) > e−ε0`/2f(r0) if ` > `0(ν, ϑ0, ε0). (10.26)

This shows that Pk,`,ν(x) > 0. Hence, we must have at least ν + 1 primes insome interval

[n + 1, n + h] = [n + 1, n + λ log3N ], n ∈ [N + 1, 2N ], (10.27)

where

λ = Θz2 <ϑ0

2z20(1 + ε0)2 = (1 + ε0)2(

√ν −

√2ϑ0)2. (10.28)

Since ε0 can be chosen arbitrarily small, this proves Theorem 3.

References

1. P. Bateman and R. A. Horn, A heuristic formula concerning the distribution of prime numbers,Math. Comp. 16 (1962) , 363–367.

2. E. Bombieri and H. Davenport, Small differences between prime numbers, Proc. Roy. Soc. Ser.A 293 (1966), 1–18.

3. E. Bombieri, J. B. Friedlander, and H. Iwaniec, Primes in arithmetic progressions to large moduli.III. J. Am. Math. Soc. 2, No.2, 215-223 (1989).

4. H. Davenport, Multiplicative Number Theory, Second Edition, Revised by Hugh L. Montgomery,Springer, Berlin, Heidelberg, New York, 1980.

5. P. D. T. A. Elliott and H. Halberstam, A conjecture in prime number theory, Symposia Mathe-matica 4 (INDAM, Rome, 1968/69) 59–72, Academic Press, London.

6. T. J. Engelsma, k-tuple permissible patterns, Feb. 2005, http://www.opertech.com/primes/k-tuples.html

7. P. Erdos, The difference of consecutive primes, Duke Math. J. 6 (1940), 438–441.

8. K. Ford, Zero-free region for the Riemann zeta function, Number Theory for the Millennium II,(Urbana, IL., 2000), 25–56, A. K. Peters, Natick, MA 2002.

9. E. Fouvry and F. Grupp, On the switching principle in sieve theory, J. Reine Angew. Math. 370(1986), 101–126.

10. P. X. Gallagher, On the distribution of primes in short intervals, Mathematika 23 (1976), 4–9.

11. D. A. Goldston, On Bombieri and Davenport’s theorem concerning small gaps between primes,Mathematika 39 (1992), 10–17.


12. D. A. Goldston and C. Y. Yıldırım, Higher correlations of divisor sums related to primes. I:Triple correlations, Integers 3 (2003), A5, 66 pp. (electronic).

13. D. A. Goldston and C. Y. Yıldırım, Higher correlations of divisor sums related to primes III:Small gaps between primes, to appear in Proc. London Math. Soc. .

14. D. A. Goldston, S. W. Graham, J. Pintz, and C. Y. Yıldırım, Small gaps between primes andalmost primes, preprint.

15. D. A. Goldston, Y. Motohashi, J. Pintz, and C. Y. Yıldırım, Small gaps between primes exist,Proc. Japan Acad., 82A (2006), 61–65.

16. H. Halberstam and H. -E. Richert, Sieve methods, Academic Press, London, New York, 1975.

17. G. H. Hardy and J. E. Littlewood, Some problems of ‘Partitio Numerorum’: III On the expressionof a number as a sum of primes, Acta Math. 44 (1923), 1–70.

18. G. H. Hardy and J. E. Littlewood, unpublished manuscript, see [26].

19. D. R. Heath-Brown, Almost-prime k-tuples, Mathematika 44 (1997), 245–266.

20. M. N. Huxley, On the differencesof primes in arithmetical progressions, Acta Arith. 15 (1968/69),367–392.

21. M. N. Huxley, Small differences between consecutive primes II. Mathematika 24 (1977), 142–152.

22. M. N. Huxley, An application of the Fouvry-Iwaniec theorem, Acta Arith. 43 (1984), 441–443.

23. H. Maier, Small differences between prime numbers, Michigan Math. J. 35 (1988), 323–344.

24. H. L. Montgomery, Topics in Multiplicative Number Theory, Lecture Notes in Mathematics,Springer, Berlin, Heidelberg, New York, 1971.

25. G. Z. Pilt’ai, On the size of the difference between consecutive primes, Issledovania po teoriichisel, 4 (1972), 73–79.

26. R. A. Rankin, The difference between consecutive prime numbers. II, Proc. Cambridge Philos.Soc. 36 (1940), 255–266.

27. G. Ricci, Sull’andamento della differenza di numeri primi consecutivi, Riv. Mat. Univ. Parma5 (1954), 3–54.

28. A. Schinzel and W. Sierpinski, Sur certaines hypotheses concernant les nombres premiers, ActaArith. 4 (1958), 185–208, erratum 5 (1958), 259.

29. A. Selberg, Collected Papers, Vol. II, Springer, Berlin, 1991.

30. J. Sivak, Methodes de crible appliquees aux sommes de Kloosterman et aux petits ecarts entrenombres premiers, These de Doctorat de l’Universite Paris Sud (Paris XI) 2005.

31. K. Soundararajan, Small gaps between prime numbers: the work of Goldston-Pintz-Yıldırım,Bull. Amer. Math. Soc. 44, Number 1, January 2007, 1–18.

32. E. C. Titchmarsh, The theory of the Riemann zeta-function, Second edition, Edited and with apreface by D. R. Heath-Brown, The Clarendon Press, Oxford University Press, New York, 1986.

33. S. Uchiyama, On the difference between consecutive prime numbers, Acta Arith. 27 (1975),153–157.

D. A. GoldstonDepartment of MathematicsSan Jose State UniversitySan Jose, CA 95192–0103e-mail: [email protected]

J. PintzRenyi Mathematical Institute of the Hungarian Academy of SciencesH-1053 Budapest, Realtanoda u. 13-15, Hungaryemail: [email protected]


C. Y. YıldırımDepartment of Mathematics, Bogazici UniversityIstanbul 34342&Feza Gursey EnstitusuCengelkoy, Istanbul, P.K. 6, 81220, Turkeyemail: [email protected]

Documents

Primes in Tuples I