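After checking, we found that the preface of a recent international monograph on sieve methods specifically mentions the modern significance of Chen Jingrun's theorem, and yet we, his own countrymen, fail to understand Chen Jingrun. Alas!

Please see the attachment to this post.

Yuan Meng, Chen Qiqing, February 4

Attachment: the preface of the latest monograph on sieve methods, which specifically mentions the modern significance of Chen Jingrun's theorem.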

Sieve Methods

DENIS XAVIER CHARLES

Preface

Sieve methods have had a long and fruitful history. The sieve of Eratosthenes (around the 3rd century B.C.) was a device to generate prime numbers. Later Legendre used it in his studies of the prime counting function π(x). Sieve methods bloomed and became a topic of intense investigation after the pioneering work of Viggo Brun (see [Bru16], [Bru19], [Bru22]). Using his formulation of the sieve, Brun proved that the sum

$$\sum_{p,\ p+2 \text{ both prime}} \frac{1}{p}$$

converges. This was the first result of its kind regarding the twin prime problem. A slew of sieve methods were developed over the years: Selberg's upper bound sieve, Rosser's sieve, the Large Sieve, and the Asymptotic sieve, to name a few. Many beautiful results have been proved using these sieves. The Brun-Titchmarsh theorem and the extremely powerful result of Bombieri are two important examples. Chen's theorem [Che73], namely that there are infinitely many primes p such that p+2 is a product of at most two primes, is another indication of the power of sieve methods.

Sieve methods are of importance even in applied fields of number theory such as Algorithmic Number Theory and Cryptography. There are many direct applications, for example finding all the prime numbers below a certain bound, or constructing numbers free of large prime factors. There are indirect applications too: for example, the running time of several factoring algorithms depends directly on the distribution of smooth numbers in short intervals. The so-called undeniable signature schemes require prime numbers of the form 2p+1 such that p is also prime. Sieve methods can yield valuable clues about these distributions and hence allow us to bound the running times of these algorithms.

In this treatise we survey the major sieve methods and their important applications in number theory. We apply sieves to study the distribution of square-free numbers, smooth numbers, and prime numbers. The first chapter is a discussion of the basic sieve formulation of Legendre. We show that the distribution of square-free numbers can be deduced using a square-free sieve (footnote 1). We give an account of improvements in the error term of this distribution, using known results regarding the Riemann zeta function.

The second chapter deals with Brun's Combinatorial sieve as presented in the modern language of [HR74]. We apply the general sieve to give a simpler proof of a theorem of Rademacher [Rad24]. The bound obtained by this simpler proof is slightly inferior, but still sufficient for applications such as the result of Erdős, Chowla and Briggs on the number of mutually orthogonal Latin squares. The formulation of Brun's sieve in [HR74] also includes a proof of the important Buchstab identity. We use it to derive some bounds on the distribution of smooth numbers ([Hal70]).

The third chapter deals with the development and the applications of Selberg's upper bound method. The proof by van Lint and Richert [vLR65] of the Brun-Titchmarsh theorem is given as the chief application. Hooley's improvement of bounds on prime factors in a problem studied by Chebyshev is also outlined here. The last chapter is a study of the Large Sieve. We give an outline of a proof of Bombieri's central theorem on the error term in the distribution of primes. A new application of the Bombieri theorem is shown: we prove that there are infinitely many primes p such that p+2 is a square-free number with at most 7 prime factors.

Acknowledgements: I would like to thank my advisor Dr. Ken Regan, for allowing me to work on a topic of my own interest. His support, encouragement and advice have been invaluable for my work. I thank him for proofreading the entire document and for his constructive comments. A special word of thanks to Dr. Jin-Yi for helping me with character sums. I thank him for answering my queries in such a way that I gained a new insight into the problem. I thank Dr. Alan Selman for his encouragement and advice. I am deeply grateful to Professors Eric Bach, Tom Cusick, Kevin Ford, and Andrew Granville for promptly answering my queries. Their suggestions, pointers, and ideas were invaluable for this work. I am indebted to the National Science Foundation for the monetary support for this work, under my advisor's grant CCR 98-20140.

Footnote 1: This is not a new proof; it is implicit in the work of Erdős [Erd60].

I thank my parents for their love, encouragement and prayers. I thank Pavan, Maurice, and Samik for pretending to be interested in sieves, and for reviewing the proofs. A special word of thanks to all my friends for anchoring me in sanity through this summer.

Denis Charles. July 2000

To Truth and Purity

Contents

Preface

Chapter 0. Notation and preliminaries
0.1. Standard Nomenclature
0.2. Conventions
0.3. Preliminaries

Chapter 1. The sieve of Eratosthenes
1.1. Introduction
1.2. Sieve of Eratosthenes-Legendre
1.3. Smooth numbers
1.4. Density of squarefree numbers
1.5. The error term in the distribution of Squarefree numbers
1.6. Pairs of squarefree numbers
1.7. The smallest squarefree number in an arithmetic progression
1.8. The Sieve Problem

Chapter 2. The Combinatorial Sieve
2.1. Brun's Pure Sieve
2.2. Brun's Sieve
2.3. Orthogonal Latin Squares and the Euler Conjecture
2.4. A Theorem of Schinzel
2.5. Smooth Numbers
2.6. On the number of integers prime to a given number

Chapter 3. Selberg's Sieve
3.1. The Selberg upper-bound method
3.2. The Brun-Titchmarsh Theorem
3.3. Prelude to a theorem of Hooley
3.4. A theorem of Hooley

Chapter 4. The Large Sieve
4.1. Bounds on exponential sums
4.2. The Large Sieve
4.3. The Brun-Titchmarsh Theorem revisited
4.4. Bombieri's Theorem
4.5. Prime and Squarefree pairs

Bibliography

CHAPTER 0

Notation and preliminaries

0.1. Standard Nomenclature

The largest integer not exceeding x is denoted $\lfloor x \rfloor$. We write $a \mid b$ for two integers a, b with $a \neq 0$ if a divides b. The Möbius function is denoted by μ(n) and defined as

$$\mu(n) = \begin{cases} (-1)^k & \text{if } n = p_1 \cdots p_k \text{ with } p_i \neq p_j \text{ for } 1 \le i < j \le k, \\ 0 & \text{otherwise.} \end{cases}$$

The prime counting function π(x) is defined as the cardinality of the set $P = \{p \le x \mid p \text{ a prime}\}$, while π(x; q, a) denotes the cardinality of $\{p \le x \mid p \equiv a \bmod q\}$. We denote the von Mangoldt function by Λ(n):

$$\Lambda(n) = \begin{cases} \log p & \text{if } n = p^k \text{ for a prime } p, \\ 0 & \text{otherwise,} \end{cases}$$

and its cumulation by $\psi(x) = \sum_{n \le x} \Lambda(n)$. If $n = p_1^{e_1} \cdots p_k^{e_k}$ is the prime factorization of n, then ν(n) = k denotes the number of distinct primes in the factorization. We write φ(n) for Euler's totient function:

$$\varphi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right).$$

0.2. Conventions

The letter p will always denote a prime number. Consequently, $\sum_{n \le p \le m} f(p)$ will denote a sum over the prime numbers in the range of summation. A will stand for a general integer sequence to be sifted, and P for the sifting set of primes. We employ the standard O and o-notation. We use the Vinogradov notation $\ll$ to mean that an inequality holds with some constant, i.e., $f(n) \ll g(n) \iff \exists\, c > 0 : f(n) \le c\,g(n)$. If gcd(a, b) = 1 for two integers a and b, then we also write $a \perp b$.

0.3. Preliminaries

THEOREM 0.3.1. Let $n \ge 1$ be an integer. Then

$$\sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{otherwise.} \end{cases}$$

Proof : The case n = 1 is immediate, so let n > 1. Since divisors that are not squarefree drop out of the sum by the definition of μ, we may without loss of generality assume that n is squarefree. Let $n = p_1 p_2 \cdots p_l$ with $l \ge 1$; then any divisor d of n has the form $p_1^{e_1} p_2^{e_2} \cdots p_l^{e_l}$ with $e_i \in \{0, 1\}$ for $1 \le i \le l$. Using this we can split up the sum we wish to evaluate:

$$\sum_{d \mid n} \mu(d) = \sum_{\substack{p_1^{e_1} \cdots p_l^{e_l} \\ e_1 + \cdots + e_l \text{ even}}} 1 \;-\; \sum_{\substack{p_1^{e_1} \cdots p_l^{e_l} \\ e_1 + \cdots + e_l \text{ odd}}} 1 = \binom{l}{0} - \binom{l}{1} + \binom{l}{2} - \cdots + (-1)^l \binom{l}{l} = (1 - 1)^l = 0.$$


There is another way we could have evaluated the sum. Let T(l) be the number of 0-1 strings of length l that have an odd number of 1s in them. Consider the last position of such a string. If it is a 1, then we must fill the rest of the positions with an even number of 1s, which can be done in $2^{l-1} - T(l-1)$ ways. If the last position is a 0, then the rest of the string must have an odd number of 1s, which can be done in $T(l-1)$ ways. We have argued that T(l) satisfies the following recurrence: $T(l) = T(l-1) + (2^{l-1} - T(l-1)) = 2^{l-1}$.

Thus the number of sequences with odd number of 1s and the number of them with even number of 1s is the same, and so the above sum is zero.

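THEOREM 0.3.2. (Möbius Inversion) If

$$f(n) = \sum_{d \mid n} g(d)$$

then

$$g(n) = \sum_{d \mid n} \mu(d)\, f\!\left(\frac{n}{d}\right).$$

Proof :

$$\sum_{d \mid n} \mu(d)\, f\!\left(\frac{n}{d}\right) = \sum_{d \mid n} \mu(d) \sum_{l \mid (n/d)} g(l) = \sum_{l \mid n} g(l) \sum_{d \mid (n/l)} \mu(d) = \sum_{l = n} g(l) \quad \text{(by Theorem 0.3.1)} \quad = g(n).$$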
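As a quick numerical sanity check of Theorems 0.3.1 and 0.3.2, the following minimal Python sketch computes μ(n) by trial division and verifies both identities on small inputs. The helper names (`mobius`, `divisors`, `g`, `f`, `g_recovered`) and the test function g(n) = n² + 1 are illustrative assumptions, not part of the original text.

```python
def mobius(n: int) -> int:
    """Return the Moebius function mu(n), computed by trial division."""
    if n == 1:
        return 1
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            result = -result    # one more distinct prime factor
        p += 1
    if n > 1:                   # a single prime factor is left over
        result = -result
    return result


def divisors(n: int) -> list[int]:
    """All positive divisors of n (naive, fine for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def g(n: int) -> int:
    """Arbitrary test function (an assumption made for this example)."""
    return n * n + 1


def f(n: int) -> int:
    """f(n) = sum of g(d) over the divisors d of n."""
    return sum(g(d) for d in divisors(n))


def g_recovered(n: int) -> int:
    """Moebius inversion: sum of mu(d) * f(n/d) over the divisors d of n."""
    return sum(mobius(d) * f(n // d) for d in divisors(n))


if __name__ == "__main__":
    for n in (1, 6, 12, 30):
        print(n, sum(mobius(d) for d in divisors(n)), g(n), g_recovered(n))
```

The inversion is only checked here for one sample g; the proof above is what guarantees it in general.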

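THEOREM 0.3.3. If

$$f(n) = \sum_{d \mid n} g(d)$$

then

$$g(n) = \sum_{d \mid n} \mu\!\left(\frac{n}{d}\right) f(d).$$

Proof :

$$\sum_{d \mid n} \mu\!\left(\frac{n}{d}\right) f(d) = \sum_{d \mid n} \mu\!\left(\frac{n}{d}\right) \sum_{l \mid d} g(l) = \sum_{l \mid n} g(l) \sum_{d \mid (n/l)} \mu\!\left(\frac{n}{dl}\right) = \sum_{l = n} g(l) \quad \text{(by Theorem 0.3.1)} \quad = g(n).$$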


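THEOREM 0.3.4.

$$\sum_{d \mid n} \frac{\mu(d)}{d} = \prod_{p \mid n} \left(1 - \frac{1}{p}\right) = \prod_{p \mid n} \left(1 + \frac{\mu(p)}{p}\right).$$

Proof : We know that $\sum_{d \mid n} \varphi(d) = n$. Using Möbius inversion on this we get:

$$n \prod_{p \mid n} \left(1 - \frac{1}{p}\right) = \varphi(n) = \sum_{d \mid n} \mu(d)\, \frac{n}{d} = n \sum_{d \mid n} \frac{\mu(d)}{d}.$$

REMARK 0.3.5. The proof of Theorem 0.3.4 actually works for any multiplicative function of the divisors of n in the denominator, provided it is zero at non-squarefree divisors. We could have also proved Theorem 0.3.1 using the identity:

$$\sum_{d \mid n} \mu(d) = \prod_{p \mid n} (1 + \mu(p)).$$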


CHAPTER 1

The sieve of Eratosthenes

1.1. Introduction

The sieve of Eratosthenes is a simple, effective procedure for finding all the primes up to a certain bound x. Take a list of the numbers $2, 3, \ldots, \lfloor x \rfloor$. Call 2 a prime, and start by crossing out all the multiples of 2. Because 3 is uncrossed at this stage, 3 must be prime. Cross out the multiples of 3 since they are composite, and then pick the next number that is still uncrossed and repeat. If after a stage the next uncrossed number exceeds $\sqrt{x}$ then stop. At this stage all the numbers that are not crossed out are prime.
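The crossing-out procedure just described can be sketched in a few lines of Python. This is a minimal illustration, not an optimized implementation; the function name `primes_up_to` is an assumption of this example. Crossing out starts at n² because smaller multiples of n were already crossed out by smaller primes, which is also why no crossing out is needed once n exceeds √x.

```python
def primes_up_to(x: int) -> list[int]:
    """Return all primes <= x using the sieve of Eratosthenes."""
    if x < 2:
        return []
    crossed = [False] * (x + 1)     # crossed[n] is True once n is known composite
    primes = []
    for n in range(2, x + 1):
        if crossed[n]:
            continue
        primes.append(n)            # n is still uncrossed, hence prime
        if n * n <= x:              # only primes up to sqrt(x) need to cross out
            for multiple in range(n * n, x + 1, n):
                crossed[multiple] = True
    return primes


if __name__ == "__main__":
    print(primes_up_to(30))
```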

Legendre realized that this procedure can be captured succinctly in a theoretical analog of the sifting process, and used this in his study of the function $\pi(x) = |\{p \le x \mid p \text{ a prime}\}|$. In this chapter we will try to apply this basic technique to study some simple problems. First we shall look at the sieve applied to the problem of estimating π(x). Although the method would lead to an exact formula for $\pi(x) - \pi(\sqrt{x})$, this does not give useful estimates for π(x) owing to a huge error term. However, we can adapt the basic method to study other sequences of numbers, for example the squarefree numbers, meaning numbers that are products of distinct primes. The basic sieve we develop will be more successful in dealing with squarefree numbers, essentially because they are denser than the primes. We will be able to give interesting bounds on the density of these numbers in arithmetic progressions and in pairs (n, n+2). We shall also find a bound on the smallest squarefree number in an arithmetic progression. Finally we shall give the general setup of a sieve problem and re-formulate the classical sieve of Eratosthenes-Legendre in this framework.

1.2. Sieve of Eratosthenes-Legendre

Let $P_z = \prod_{p < z} p$. The sieve of Eratosthenes deletes from the list of numbers all those numbers that are not relatively prime to $P_z$, except the primes dividing $P_z$ itself. We are interested in finding bounds on the cardinality of the set $S = \{n \mid n \le x,\ n \perp P_z\}$. We define

$$s(n) = \begin{cases} 1 & \text{if } n \in S, \\ 0 & \text{otherwise.} \end{cases}$$

This is the characteristic function of the set S. Using the properties of the Möbius function (see Chapter 0), we can write an explicit expression for s(n):

$$s(n) = \sum_{d \mid \gcd(n, P_z)} \mu(d).$$

We will call such a function s(n) the sifting function.


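Then

$$|S| = \sum_{n \le x} s(n) = \sum_{n \le x} \sum_{d \mid \gcd(n, P_z)} \mu(d) = \sum_{d \mid P_z} \mu(d) \sum_{\substack{n \le x \\ d \mid n}} 1 = \sum_{d \mid P_z} \mu(d) \left\lfloor \frac{x}{d} \right\rfloor$$

$$= \sum_{d \mid P_z} \mu(d) \left( \frac{x}{d} + \left\lfloor \frac{x}{d} \right\rfloor - \frac{x}{d} \right) = \sum_{d \mid P_z} \mu(d)\, \frac{x}{d} + \sum_{d \mid P_z} \mu(d) \left( \left\lfloor \frac{x}{d} \right\rfloor - \frac{x}{d} \right).$$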

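Since each term in the second sum has absolute value at most 1, and $P_z$ has $2^{\pi(z)}$ divisors, we obtain

$$|S| \le x \sum_{d \mid P_z} \frac{\mu(d)}{d} + 2^{\pi(z)} = x \prod_{p \mid P_z} \left(1 - \frac{1}{p}\right) + 2^{\pi(z)}.$$

Now a theorem of Mertens states that

$$\prod_{p < z} \left(1 - \frac{1}{p}\right) \sim \frac{e^{-\gamma}}{\ln z},$$

and this yields the estimate:

$$|S| \le x\, \frac{e^{-\gamma}}{\ln z} + 2^{\pi(z)}$$

provided $z \to \infty$ as $x \to \infty$. The usefulness of the above scheme is restricted by the huge error term $2^{\pi(z)}$. For $z = O(\ln x)$, for example, we get $\pi(x) - \pi(\ln x) = O\!\left(\frac{x}{\ln \ln x}\right)$, and since $\pi(x) \le x$ we get the estimate $\pi(x) = O\!\left(\frac{x}{\ln \ln x}\right)$. This is markedly inferior to the truth $\pi(x) \sim \frac{x}{\ln x}$. Note that if $z = \sqrt{x}$ then |S| measures $\pi(x) - \pi(\sqrt{x})$, for which we have derived the following exact formula:

$$\pi(x) - \pi(\sqrt{x}) + 1 = x \prod_{p < \sqrt{x}} \left(1 - \frac{1}{p}\right) + \sum_{d \mid P_{\sqrt{x}}} \mu(d) \left( \left\lfloor \frac{x}{d} \right\rfloor - \frac{x}{d} \right).$$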
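To make the exact identity $|S| = \sum_{d \mid P_z} \mu(d) \lfloor x/d \rfloor$ concrete, here is a small Python sketch that evaluates the sum over all squarefree products of the sifting primes and compares it with a brute-force count. The function names are assumptions of this example. The loop visits all $2^{\pi(z)}$ divisors of $P_z$, which is exactly the source of the large error term discussed above, so this is only practical for a handful of primes.

```python
from itertools import combinations
from math import gcd, prod


def legendre_count(x: int, primes: list[int]) -> int:
    """Sum of mu(d) * floor(x / d) over all divisors d of the product of `primes`."""
    total = 0
    for r in range(len(primes) + 1):
        for subset in combinations(primes, r):
            d = prod(subset)                  # squarefree divisor with r prime factors
            total += (-1) ** r * (x // d)     # mu(d) = (-1)^r
    return total


def brute_count(x: int, primes: list[int]) -> int:
    """Count n <= x that are coprime to every prime in `primes`."""
    P = prod(primes)
    return sum(1 for n in range(1, x + 1) if gcd(n, P) == 1)


if __name__ == "__main__":
    # Sifting 1..100 by the primes up to sqrt(100) leaves 1 and the
    # primes in (10, 100], i.e. pi(100) - pi(10) + 1 = 25 - 4 + 1 numbers.
    print(legendre_count(100, [2, 3, 5, 7]), brute_count(100, [2, 3, 5, 7]))
```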


1.3. Smooth numbers

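DEFINITION 1.3.1. A number n will be called k-smooth if $\forall p : (p \mid n) \Rightarrow (p < k)$.

Let $\Psi(x, k) = |\{n \le x \mid n \text{ is } k\text{-smooth}\}|$, i.e., the number of k-smooth numbers up to a bound x. We can use our sieve argument to try to find a bound on Ψ(x, k). The weakness of this simple sieve will be apparent in the bound it gives us.

PROPOSITION 1.3.2.

$$\Psi(x, k) = O\!\left( \frac{x \ln k}{\ln x} + 2^{\pi(x) - \pi(k)} \right).$$

Proof : Since a number is k-smooth only if all its prime divisors are below k, we can find the k-smooth numbers below a bound x by using as our sifting set $P = \{p \mid k < p \le x\}$. Let $P_{k,x} = \prod_{p \in P} p$. Let $S = \{n \mid n \text{ is } k\text{-smooth}\}$, and this time define

$$s(n) = \begin{cases} 1 & \text{if } n \in S \text{ or } n = 1, \\ 0 & \text{otherwise.} \end{cases}$$

Now rewriting s(n) using the Möbius function, we obtain $s(n) = \sum_{d \mid \gcd(n, P_{k,x})} \mu(d)$. Setting $S(n) = |S|$, we apply Mertens' Theorem at the end to conclude:

$$S(n) = \sum_{n \le x} s(n) = \sum_{n \le x} \sum_{d \mid \gcd(n, P_{k,x})} \mu(d) = \sum_{d \mid P_{k,x}} \mu(d) \sum_{\substack{n \le x \\ d \mid n}} 1 = \sum_{d \mid P_{k,x}} \mu(d) \left\lfloor \frac{x}{d} \right\rfloor$$

$$= x \prod_{k < p \le x} \left(1 - \frac{1}{p}\right) + O\!\left(2^{\pi(x) - \pi(k)}\right) = O\!\left( \frac{x \ln k}{\ln x} + 2^{\pi(x) - \pi(k)} \right).$$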

The bound is clearly very poor. However we can improve this bound using more advanced sieve techniques. In [Warl90], a much better bound is given under some conditions on the sifting primes.

1.4. Density of squarefree numbers

The basic method of the sieve of Eratosthenes-Legendre can be adapted to prove a more interesting result. Let $S = \{n \mid n \le x,\ n \text{ is squarefree}\}$, and let $\kappa(x) = |S|$. To obtain S as a result of a sifting process, all we need to do is take primes $p < \sqrt{x}$ and cross off multiples of $p^2$ from the list. We shall show that a variant of the function s(n) introduced earlier works in this case.

THEOREM 1.4.1.

$$\kappa(x) = \frac{6}{\pi^2}\, x + O(\sqrt{x}).$$


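Proof : The sifting function for this set is now $s(n) = |\mu(n)|$, and $\kappa(x) = \sum_{n \le x} s(n) = \sum_{n \le x} |\mu(n)|$. Now we reach an impasse, because there does not seem to be any easy way of evaluating this sum. The trick is to look for another expression for the sifting function. We claim that

$$s(n) = \sum_{d^2 \mid n} \mu(d).$$

Any number n can be represented as $n = m^2 w$, where w is squarefree and $m^2$ is the largest square divisor of n. If $n = p_1^{e_1} p_2^{e_2} \cdots p_l^{e_l}$ with $e_i = 2q_i + r_i$, $0 \le r_i \le 1$, then $m = \prod_i p_i^{q_i}$ satisfies the expression. We shall write $\rho(n)$ for this m, the square root of the largest square divisor of n. Now

$$\sum_{d^2 \mid n} \mu(d) = \sum_{d \mid \rho(n)} \mu(d),$$

and by Theorem 0.3.1 this sum is 0 unless $\rho(n) = 1$, in which case it is 1. This proves the claim. Setting $m = \sqrt{x}$, we obtain:

$$\kappa(x) = \sum_{n \le x} s(n) = \sum_{n \le x} \sum_{d^2 \mid n} \mu(d) = \sum_{d \le m} \mu(d) \sum_{\substack{n \le x \\ d^2 \mid n}} 1 = \sum_{d \le m} \mu(d) \left\lfloor \frac{x}{d^2} \right\rfloor$$

$$= x \sum_{d \le m} \frac{\mu(d)}{d^2} + \sum_{d \le m} \mu(d) \left( \left\lfloor \frac{x}{d^2} \right\rfloor - \frac{x}{d^2} \right) = x \sum_{d \le m} \frac{\mu(d)}{d^2} + O(m).$$

Using the fact that

$$\prod_p \left(1 - \frac{1}{p^2}\right) = \sum_{n \ge 1} \frac{\mu(n)}{n^2}$$

we get

$$\kappa(x) = x \left( \prod_p \left(1 - \frac{1}{p^2}\right) - \sum_{d > m} \frac{\mu(d)}{d^2} \right) + O(m) = x \prod_p \left(1 - \frac{1}{p^2}\right) + O(m),$$

since the tail $\sum_{d > m} 1/d^2$ is $O(1/m)$ and $x/m = \sqrt{x}$. Also

$$\prod_p \left(1 - \frac{1}{p^2}\right) = \frac{1}{\zeta(2)},$$

so that we finally get

$$\kappa(x) = \frac{x}{\zeta(2)} + O(\sqrt{x}).$$

Euler showed that $\zeta(2) = \frac{\pi^2}{6}$, and using this in the above expression we have

$$\kappa(x) = \frac{6}{\pi^2}\, x + O(\sqrt{x}).$$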
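The identity $\kappa(x) = \sum_{d \le \sqrt{x}} \mu(d) \lfloor x/d^2 \rfloor$ used in the proof is easy to try out numerically. The Python sketch below (function names are assumptions of this example) tabulates μ with a simple sieve, evaluates the sum, and compares it both with a brute-force count and with the main term $6x/\pi^2$.

```python
from math import isqrt, pi


def mobius_sieve(limit: int) -> list[int]:
    """Return a list mu with mu[n] = Moebius function of n for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]              # one more prime factor
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0                   # divisible by a square
    return mu


def squarefree_count(x: int) -> int:
    """kappa(x) computed as the sum of mu(d) * floor(x / d^2) for d <= sqrt(x)."""
    m = isqrt(x)
    mu = mobius_sieve(m)
    return sum(mu[d] * (x // (d * d)) for d in range(1, m + 1))


def is_squarefree(n: int) -> bool:
    """Brute-force squarefree test, used only for cross-checking."""
    return all(n % (d * d) != 0 for d in range(2, isqrt(n) + 1))


if __name__ == "__main__":
    for x in (100, 10_000, 1_000_000):
        print(x, squarefree_count(x), 6 * x / pi ** 2)
```

In such experiments the observed difference from $6x/\pi^2$ tends to be far smaller than $\sqrt{x}$, which is consistent with the improved error terms discussed in Section 1.5.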


Another natural question to ask is: what is the density of squarefree numbers in an arithmetic progression? We shall give a partial answer to that question in the next theorem. Let $\kappa(x; a, l) = |\{n \le x \mid n \text{ is squarefree},\ n \equiv a \bmod l\}|$.

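THEOREM 1.4.2. Let q > 2 be a prime, and let a be a positive integer relatively prime to q. Then there is a constant c > 0 depending only on q such that $\kappa(x; a, q) \ge cx + O(\sqrt{x})$.

Proof : Using the same idea as in the previous proof we have:

$$\kappa(x; a, q) = \sum_{\substack{n \equiv a \bmod q \\ n \le x}} \sum_{d^2 \mid n} \mu(d) \tag{1.1}$$

$$= \sum_{d \le m} \mu(d) \Bigg( \sum_{\substack{d^2 \mid n \\ n \equiv a \bmod q \\ n \le x}} 1 \Bigg), \quad \text{where } m = \lfloor \sqrt{x} \rfloor. \tag{1.2}$$

The quantity we need to bound is defined by

$$N(x; d, a, q) = \sum_{\substack{d^2 \mid n \\ n \equiv a \bmod q \\ n \le x}} 1.$$

This is essentially the number of solutions in k to the congruence $k d^2 \equiv a \bmod q$. There are two cases:

[$d \perp q$] In this case there is a unique solution k modulo q, namely $k \equiv a\, d^{-2} \bmod q$. However, if $k \in \{0, 1, \ldots, q-1\}$ is such a solution, then for $e \ge 1$, $k + eq$ is also a solution. Now $(k + eq) d^2 = n \le x$, so

$$k + eq \le \frac{x}{d^2}, \qquad e \le \frac{x}{d^2 q} - \frac{k}{q}, \qquad e \le \frac{x}{d^2 q} \quad \text{as } k < q.$$

[$d \not\perp q$] In this case q divides d, so $k d^2 \equiv 0 \bmod q$, and there are no solutions to the congruence because $a \perp q$.

Thus $N(x; d, a, q) = \left\lfloor \frac{x}{d^2 q} \right\rfloor$ if $d \perp q$, and 0 otherwise. Substituting in (1.2) we get

$$\kappa(x; a, q) = \sum_{d \le m} \mu(d) \left\lfloor \frac{x}{d^2 q} \right\rfloor - \sum_{\substack{d \le m \\ d \not\perp q}} \mu(d) \left\lfloor \frac{x}{d^2 q} \right\rfloor = \frac{x}{q} \left( \sum_{d \le m} \frac{\mu(d)}{d^2} - \sum_{d \not\perp q} \frac{\mu(d)}{d^2} \right) + O(m).$$

Now

$$\sum_{d \not\perp q} \frac{\mu(d)}{d^2} \le \sum_{d \not\perp q} \frac{1}{d^2} = \sum_{q \mid d,\ d \le x} \frac{1}{d^2} = \sum_{k \le x/q} \frac{1}{k^2 q^2} = \frac{1}{q^2} \sum_{k \le x/q} \frac{1}{k^2} \le \frac{1}{q^2} \sum_{k \ge 1} \frac{1}{k^2} = \frac{\pi^2}{6 q^2}.$$

Thus we get

$$\kappa(x; a, q) \ge \frac{x}{q} \left( \frac{1}{\zeta(2)} - \frac{\zeta(2)}{q^2} \right) + O(\sqrt{x}),$$

and hence $\kappa(x; a, q) \ge cx + O(\sqrt{x})$.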

1.5. The error term in the distribution of Squarefree numbers

We proved in the previous section that $\kappa(x) - \frac{6}{\pi^2} x = O(\sqrt{x})$, and it turns out to be extremely difficult to improve on this bound. In this section we briefly digress from the topic of sieves to show a strengthening of the error term if one assumes the Riemann Hypothesis (henceforth called RH). First we shall strengthen the error term (unconditionally) using a theorem of Walfisz.

THEOREM 1.5.1 ([Wal63] Satz §5.5.3).

$$\left| \sum_{n \le x} \mu(n) \right| \le B\, x \exp\!\left\{ -A \log^{3/5} x \,(\log\log x)^{-1/5} \right\}$$

for some positive constants A and B.

We simplify the proof in [Wal63] of the following theorem:

THEOREM 1.5.2 ([Wal63] Satz §5.6.1).

$$\kappa(x) = \frac{6}{\pi^2}\, x + O\!\left( \sqrt{x}\, \exp\!\left\{ -c \log^{3/5} x \,(\log\log x)^{-1/5} \right\} \right)$$

for some positive constant c > 0.

Proof :

$$\kappa(x) = \sum_{1 \le n \le x} \sum_{d^2 \mid n} \mu(d) = \sum_{d^2 m \le x} \mu(d) = \sum_{d^2 \le x} \mu(d) \left\lfloor \frac{x}{d^2} \right\rfloor.$$

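Let $S_2(x, y) = \sum_{d \le y} \mu(d)\, \delta\!\left(\frac{x}{d^2}\right)$, where $\delta(z) = z - \lfloor z \rfloor - \frac{1}{2}$, and $M(y) = \sum_{n \le y} \mu(n)$. Then

$$\kappa(x) = x \sum_{d^2 \le x} \frac{\mu(d)}{d^2} - S_2(x, \sqrt{x}) - \frac{1}{2} M(\sqrt{x}).$$

In [MV81] (see p.255) the following bound is proved:

$$S_2(x, y) = O\!\left( x^{2/7} + y^{1/2} x^{1/7 + \varepsilon} \right),$$

and this implies that $S_2(x, \sqrt{x}) = O(x^{11/28})$. Now consider:

$$\sum_{d > y} \frac{\mu(d)}{d^2} = 2 \sum_{d > y} \mu(d) \int_d^\infty \frac{dz}{z^3} = 2 \int_y^\infty \frac{dz}{z^3} \sum_{y < n < z} \mu(n)$$

(interchanging the sum and the integral is valid since both of them are convergent)

$$= 2 \int_y^\infty \frac{M(z)\, dz}{z^3} - 2 M(y) \int_y^\infty \frac{dz}{z^3} = O\!\left( M(y) \int_y^\infty \frac{dz}{z^3} \right) = O\!\left( \frac{M(y)}{y^2} \right),$$

where in the O-terms M(y) stands for the bound on the Möbius sum given by Theorem 1.5.1. Hence

$$\sum_{d > \sqrt{x}} \frac{\mu(d)}{d^2} = O\!\left( \frac{\exp\{ -c \log^{3/5} x \,(\log\log x)^{-1/5} \}}{\sqrt{x}} \right)$$

and also

$$\sum_{d \le \sqrt{x}} \frac{\mu(d)}{d^2} = \frac{1}{\zeta(2)} + O\!\left( \frac{\exp\{ -c \log^{3/5} x \,(\log\log x)^{-1/5} \}}{\sqrt{x}} \right).$$

The theorem follows from these estimates.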

COROLLARY 1.5.3. The number of squarefree numbers in the interval $[x, x + \sqrt{x}]$ is asymptotic to $\frac{6 \sqrt{x}}{\pi^2}$.

The corresponding problem for primes seems to be far more difficult, see [HB88]. It turns out that if the Riemann Hypothesis holds then $M(y) = O(y^{1/2 + \varepsilon})$, and using this in the above proof we get the following theorem:

THEOREM 1.5.4. Assuming the Riemann Hypothesis,

$$\kappa(x) = \frac{6}{\pi^2}\, x + O(x^{11/28}).$$


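It turns out that if we assume the Riemann Hypothesis we can do better even without the strong bound on $S_2(x, y)$. We begin as we did before,

$$\kappa(x) = \sum_{1 \le d \le x} \mu(d) \sum_{\substack{1 \le n \le x \\ d^2 \mid n}} 1 = \sum_{d^2 n \le x} \mu(d) = \sum_{\substack{d^2 n \le x \\ d \le y}} \mu(d) + \sum_{\substack{d^2 n \le x \\ d > y}} \mu(d) = \Sigma_1 + \Sigma_2 \quad \text{(say)}.$$

Now (as in the proof of the previous theorem)

$$\Sigma_1 = \sum_{d \le y} \mu(d) \left\lfloor \frac{x}{d^2} \right\rfloor = \sum_{d \le y} \mu(d) \left( \frac{x}{d^2} - \left( \frac{x}{d^2} - \left\lfloor \frac{x}{d^2} \right\rfloor - \frac{1}{2} \right) \right) - \frac{1}{2} \sum_{d \le y} \mu(d).$$

Let as before

$$S_2(x, y) = \sum_{d \le y} \mu(d)\, \delta\!\left( \frac{x}{d^2} \right)$$

and $M(y) = \sum_{d \le y} \mu(d)$, where $\delta(z) = z - \lfloor z \rfloor - \frac{1}{2}$, so that

$$\Sigma_1 = x \sum_{d \le y} \frac{\mu(d)}{d^2} - S_2(x, y) - \frac{1}{2} M(y).$$

Let

$$f_y(s) = \frac{1}{\zeta(s)} - \sum_{d \le y} \frac{\mu(d)}{d^s}.$$

We adopt the standard convention of referring to the real part of s as σ and the imaginary part as t. If σ > 1 then we have

$$f_y(s) = \sum_{d > y} \frac{\mu(d)}{d^s},$$

since in this case we also have

$$\frac{1}{\zeta(s)} = \sum_{d \ge 1} \frac{\mu(d)}{d^s}.$$

Consider

$$\zeta(s) f_y(2s) = \sum_{n \ge 1} \frac{1}{n^s} \sum_{d > y} \frac{\mu(d)}{d^{2s}} = \sum_{n \ge 1} \frac{1}{n^s} \sum_{\substack{d > y \\ d^2 \mid n}} \mu(d).$$

If we look at the restricted version of this sum, namely,

$$\sum_{1 \le n \le x} \frac{1}{n^s} \sum_{\substack{d > y \\ d^2 \mid n}} \mu(d),$$

then as $s \to 0$ this sum equals $\Sigma_2$. Thus we need a way of evaluating this sum when $s \to 0$. The following result (Lemma (3.12) [Tit86] p.60) will help us do just that.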


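LEMMA 1.5.5. [Tit86] Let $\langle a_n \rangle$ be a sequence of real numbers such that, as $\sigma \to 1$ from above,

$$\sum_{n \ge 1} \frac{|a_n|}{n^\sigma} = O\!\left( \frac{1}{(\sigma - 1)^\alpha} \right)$$

for some $\alpha \ge 1$. Let ψ(n) be an upper bound for $|a_n|$, and define

$$f(s) = \sum_{n \ge 1} \frac{a_n}{n^s} \quad \text{for } \sigma > 1.$$

If $c > 0$, $\sigma \ge 0$, $\sigma + c > 1$, x is not an integer, and N is the nearest integer to x, then for all T > 0:

$$\sum_{n < x} \frac{a_n}{n^s} = \frac{1}{2\pi i} \int_{c - iT}^{c + iT} f(s + w)\, \frac{x^w}{w}\, dw + O\!\left( \frac{x^c}{T (\sigma + c - 1)^\alpha} \right) + O\!\left( \frac{\psi(2x)\, x^{1 - \sigma} \log x}{T} \right) + O\!\left( \frac{\psi(N)\, x^{1 - \sigma}}{T |x - N|} \right).$$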

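Applying this lemma to the series

$$\sum_{1 \le n \le x} \frac{1}{n^s} \sum_{\substack{d > y \\ d^2 \mid n}} \mu(d)$$

with $c = 1 + \frac{1}{\log x}$ and $T = x$ gives remainder terms of $O(x^\varepsilon)$, since $\psi(z) = O(\sqrt{z})$. Making the change of variable $w \leftarrow s$, taking the s in the lemma to be 0, and setting $x_0 = \lfloor x \rfloor + \frac{1}{2}$ so that $x_0$ is not an integer, we obtain

$$\Sigma_2 = \frac{1}{2\pi i} \int_{c - ix}^{c + ix} \zeta(s) f_y(2s)\, \frac{x_0^s}{s}\, ds + O(x^\varepsilon).$$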

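Now consider the closed contour made up of four pieces (the integrand is the same as above):

$$\int_{c - ix}^{c + ix} + \int_{c + ix}^{\frac{1}{2} + ix} + \int_{\frac{1}{2} + ix}^{\frac{1}{2} - ix} + \int_{\frac{1}{2} - ix}^{c - ix}.$$

Since the integrand has a simple pole at s = 1 with residue $f_y(2)\, x_0$, the residue theorem gives

$$\int_{c - ix}^{c + ix} + \int_{c + ix}^{\frac{1}{2} + ix} + \int_{\frac{1}{2} + ix}^{\frac{1}{2} - ix} + \int_{\frac{1}{2} - ix}^{c - ix} = 2\pi i\, f_y(2)\, x_0$$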

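and so

$$\Sigma_2 = f_y(2)\, x_0 + \frac{1}{2\pi i} \int_C \zeta(s) f_y(2s)\, \frac{x_0^s}{s}\, ds + O(x^\varepsilon),$$

where C is the path made up of the line segments

$$c - ix \to \tfrac{1}{2} - ix, \qquad \tfrac{1}{2} - ix \to \tfrac{1}{2} + ix, \qquad \tfrac{1}{2} + ix \to c + ix.$$

By Theorem (14.2) on p.337 of [Tit86], RH implies that $\frac{1}{\zeta(s)} = O(|t|^\varepsilon)$. Also

THEOREM 1.5.6 ([Tit86] (14.25A)). Assume RH. For s with $\sigma > \frac{1}{2}$,

$$\sum_{n < x} \frac{\mu(n)}{n^s} = \frac{1}{\zeta(s)} + O\!\left( T^{-1 + \varepsilon} x^2 \right) + O\!\left( T^\varepsilon x^{\frac{1}{2} - \sigma + \delta} \right).$$

Using this we can take T large so that

$$f_y(s) = O\!\left( y^{\frac{1}{2} - \sigma + \delta'} \right) \tag{1.3}$$

under RH.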


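Also by Theorem (14.25C) [Tit86], RH implies $M(z) = O(z^{\frac{1}{2} + \varepsilon})$. Using all this information we can bound

$$\int_C \zeta(s) f_y(2s)\, \frac{x_0^s}{s}\, ds$$

on the contour C: by (1.3) we have $f_y(2s) = O(y^{\frac{1}{2} - 1 + \varepsilon}) = O(y^{-\frac{1}{2} + \varepsilon})$ and $\zeta(s) = \frac{1}{s - 1} + O(t^\varepsilon)$, and since $x^s = x^{\sigma + it} = e^{(\sigma + it) \log x} = e^{\sigma \log x + it \log x}$, we have $|x^s| = x^\sigma$. Thus the integral above is $O(x^{\frac{1}{2} + \varepsilon}\, y^{-\frac{1}{2} + \varepsilon})$.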

Combining all these estimates we get the following bounds:

THEOREM 1.5.7 ([MV81]). Assuming the Riemann Hypothesis, for any y > 0

$$\kappa(x) = \frac{x}{\zeta(2)} - S_2(x, y) + O\!\left( x^{\frac{1}{2} + \varepsilon}\, y^{-\frac{1}{2} + \varepsilon} + y^{\frac{1}{2} + \varepsilon} \right).$$

COROLLARY 1.5.8. Assuming the Riemann Hypothesis,

$$\kappa(x) = \frac{x}{\zeta(2)} + O\!\left( x^{\frac{1}{3} + \delta} \right).$$

Proof : Clearly we have $S_2(x, y) = O(y)$; now setting $y = x^{1/3}$ in the above theorem we get the result.

In the same article [MV81] Montgomery and Vaughan went on to estimate the sums involved more precisely to show that $\kappa(x) = \frac{1}{\zeta(2)} x + O(x^{\frac{9}{28} + \varepsilon})$. Subsequently the exponent of the error term was reduced to $\frac{7}{22}$ by various authors (see [BakPin85]).

1.6. Pairs of squarefree numbers

The famous twin prime problem asks whether there are infinitely many primes p such that p+2 is also prime. Although this problem is still open, the analogous question for the squarefree numbers can be settled rather easily using the methods we have seen so far. For a more general version of this result see [Mir49]. Let

$$\kappa_2(x) = \left| \{ n(n+2) \mid \mu(n)^2 = \mu(n+2)^2 = 1,\ n \le x \} \right|.$$

THEOREM 1.6.1.

$$\kappa_2(x) = \prod_p \left(1 - \frac{2}{p^2}\right) x + O\!\left( x^{\frac{2}{3}} \ln^{\frac{4}{3}} x \right).$$

Proof : Let $s(n) = \sum_{d^2 \mid n} \mu(d)$. Using this we have

$$\kappa_2(x) = \sum_{n \le x} s(n)\, s(n+2) = \sum_{n \le x} \sum_{b^2 \mid n} \mu(b) \sum_{a^2 \mid (n+2)} \mu(a).$$

If $b^2 \mid n$ and $a^2 \mid (n+2)$, then writing $n = k_2 b^2$ and $n + 2 = k_1 a^2$ we have $k_1 a^2 - k_2 b^2 = 2$. This says that $\gcd(a^2, b^2)$ divides 2, so gcd(a, b) must be 1, i.e. $a \perp b$. Now interchanging the sums we get

$$\kappa_2(x) = \sum_{\substack{k_1 a^2 - k_2 b^2 = 2 \\ k_2 b^2 \le x \\ a \perp b}} \mu(a) \mu(b).$$

The rest of the proof is now to bound the above sum, and to this end we split up the sum into two parts:

$$\kappa_2(x) = \sum_{ab \le y} \mu(a) \mu(b)\, N(x; a^2, b^2, 2) + \sum_{\substack{ab > y \\ k_1 a^2 - k_2 b^2 = 2,\ k_2 b^2 \le x}} \mu(a) \mu(b).$$

Here $N(x; a^2, b^2, 2)$ is a count of the number of solutions to the equation $k_1 a^2 - k_2 b^2 = 2$, $k_2 b^2 \le x$.


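It is clear that $N(x; a^2, b^2, 2) = 0$ if $\gcd(a^2, b^2)$ does not divide 2, and otherwise

$$N(x; a^2, b^2, 2) = \frac{x}{\mathrm{lcm}(a^2, b^2)} + O(1) = \frac{x}{(ab)^2} + O(1),$$

since $a \perp b$.

Using this we have

$$\sum_{\substack{ab \le y \\ a \perp b}} \mu(a) \mu(b)\, N(x; a^2, b^2, 2) \le \sum_{\substack{ab \le y \\ a \perp b}} \mu(a) \mu(b) \left( \frac{x}{(ab)^2} + O(1) \right) = x \sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} + \sum_{ab \le y} \mu(a) \mu(b),$$

since the terms with $a \not\perp b$ are killed by the Möbius function.

Thus

$$\sum_{ab \le y} \mu(a) \mu(b) \le \sum_{ab \le y} 1 = \left\lfloor \frac{y}{1} \right\rfloor + \left\lfloor \frac{y}{2} \right\rfloor + \cdots + \left\lfloor \frac{y}{y} \right\rfloor = O\!\left( y \sum_{1 \le k \le y} \frac{1}{k} \right) = O(y \ln y).$$

Now the sum

$$\sum_{ab \le y} \frac{\mu(ab)}{(ab)^2}$$

can be evaluated by looking at the terms with ν(ab) = k. Write $a = p_1^{\varepsilon_1} p_2^{\varepsilon_2} \cdots p_k^{\varepsilon_k}$ and $b = p_1^{\delta_1} p_2^{\delta_2} \cdots p_k^{\delta_k}$. Since $a \perp b$ we should have $(\forall i : 1 \le i \le k)\ \varepsilon_i + \delta_i = 1$, so there are $2^{\nu(ab)}$ terms whose denominator is $(ab)^2$. Hence

$$\sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} = \sum_{n \le y} \frac{\mu(n)\, 2^{\nu(n)}}{n^2} = \prod_{p \le y} \left(1 - \frac{2}{p^2}\right).$$

So

$$\sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} = \prod_p \left(1 - \frac{2}{p^2}\right) - \sum_{n > y} \frac{\mu(n)\, 2^{\nu(n)}}{n^2}.$$

We need a bound on the sum on the right hand side of the above equation. Now

$$\sum_{ab > y} \frac{1}{(ab)^2} = \sum_{b < y,\ ab > y} \frac{1}{(ab)^2} + \sum_{a > y,\ b > y} \frac{1}{(ab)^2}.$$

The second sum converges, so we only need to bound the first part of the sum. Now:

$$\sum_{b < y,\ ab > y} \frac{1}{(ab)^2} \le \sum_{1 \le b \le y} \frac{1}{b^2} \sum_{a > y/b} \frac{1}{a^2}$$

and

$$\sum_{a > y/b} \frac{1}{a^2} \le \int_{y/b}^\infty \frac{da}{a^2} = \frac{b}{y},$$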

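so we have

$$\sum_{b < y,\ ab > y} \frac{1}{(ab)^2} \le \frac{1}{y} \sum_{1 \le b \le y} \frac{1}{b} = \frac{1}{y} \ln y.$$

We finally get

$$x \sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} = x \prod_p \left(1 - \frac{2}{p^2}\right) + O\!\left( \frac{x}{y} \ln y \right).$$

Now we have to bound the sum

$$\sum_{\substack{ab > y \\ k_1 a^2 - k_2 b^2 = 2,\ k_2 b^2 \le x}} \mu(a) \mu(b).$$

We re-express this sum as follows:

$$\sum_{\substack{ab > y \\ a^2 c - b^2 d = 2 \\ b^2 d \le x}} \mu(a) \mu(b) \le \sum_{\substack{a^2 c - b^2 d = 2 \\ b^2 d \le x \\ ab > y}} 1.$$

Since $a^2 c = 2 + b^2 d$, we have $a^2 c \le 2 + x$, and this gives us $c \le \frac{x + 2}{a^2}$. Since $d \le \frac{x}{b^2}$ and $y < ab$, we have $cd \le \frac{x(x+2)}{a^2 b^2} < \frac{x(x+2)}{y^2}$. This gives

$$\sum_{\substack{a^2 c - b^2 d = 2 \\ b^2 d \le x,\ ab > y}} 1 \le \sum_{cd < \frac{x(x+2)}{y^2}} M(x; c, d, 2),$$

where M(x; c, d, 2) is the number of solutions of

$$c a^2 - d b^2 = 2, \qquad d b^2 \le x. \tag{1.4}$$

The above equation implies that $2c$ is a quadratic residue modulo p for all $p \mid d$, and that $-2d$ is a quadratic residue modulo p for all $p \mid c$. Estermann studied these congruences, and for the case where cd is not a square he proved [Est31]:

$$M(x; c, d, 2) = O(\ln x),$$

in fact that $M(x; c, d, 2) \le 4(\ln(x + 2) + 1)$. If cd is a square then, since equation (1.4) implies $c \perp d$, we can set $c = l^2$, $d = m^2$ to obtain:

$$M(x; c, d, 2) = \sum_{l^2 a^2 - m^2 b^2 = 2} 1 \le \sum_{r^2 - s^2 = 2} 1 = 0.$$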


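In any case we have $M(x; c, d, 2) = O(\ln x)$, and using this we have:

$$\sum_{cd < \frac{x(x+2)}{y^2}} M(x; c, d, 2) \le \ln x \sum_{cd < \frac{x(x+2)}{y^2}} 1.$$

For any positive constant K we have:

$$\sum_{cd < K} 1 = \sum_{c < K} \left\lfloor \frac{K}{c} \right\rfloor \le K \ln K,$$

so

$$\sum_{cd < \frac{x(x+2)}{y^2}} M(x; c, d, 2) \le \ln^2 x \cdot \frac{x(x+2)}{y^2} = O\!\left( \frac{x^2}{y^2} \ln^2 x \right).$$

Setting $y = x^{\frac{2}{3}} \ln^{\frac{1}{3}} x$ we have

$$\sum_{ab > y} \mu(ab) \le \sum_{cd < \frac{x(x+2)}{y^2}} M(x; c, d, 2) \le x^{\frac{2}{3}} \ln^{\frac{4}{3}} x,$$

and also

$$x \sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} = x \prod_p \left(1 - \frac{2}{p^2}\right) + O\!\left( x^{\frac{1}{3}} \ln^{\frac{2}{3}} x \right) + o(1).$$

The theorem follows from these two bounds.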
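Theorem 1.6.1 can be explored numerically. The following Python sketch (all function names are assumptions of this example) counts the n ≤ x for which n and n+2 are both squarefree and compares the proportion with a truncated version of the product $\prod_p (1 - 2/p^2)$. Truncating the product at a finite prime bound is an approximation made for the illustration; the neglected tail is tiny.

```python
from math import isqrt


def squarefree_flags(limit: int) -> list[bool]:
    """flags[n] is True exactly when n >= 1 is squarefree."""
    flags = [True] * (limit + 1)
    flags[0] = False
    for d in range(2, isqrt(limit) + 1):
        for m in range(d * d, limit + 1, d * d):
            flags[m] = False            # m is divisible by the square d^2
    return flags


def count_squarefree_pairs(x: int) -> int:
    """Number of n <= x such that n and n + 2 are both squarefree."""
    flags = squarefree_flags(x + 2)
    return sum(1 for n in range(1, x + 1) if flags[n] and flags[n + 2])


def pair_density(prime_bound: int = 10_000) -> float:
    """Product of (1 - 2/p^2) over primes p <= prime_bound (truncated product)."""
    is_prime = [True] * (prime_bound + 1)
    density = 1.0
    for p in range(2, prime_bound + 1):
        if is_prime[p]:
            density *= 1 - 2 / (p * p)
            for m in range(p * p, prime_bound + 1, p):
                is_prime[m] = False
    return density


if __name__ == "__main__":
    x = 100_000
    print(count_squarefree_pairs(x) / x, pair_density())
```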

1.7. The smallest squarefree number in an arithmetic progression

The simple methods that we have seen so far are surprisingly powerful and provide a quick bound on the smallest squarefree number in an arithmetic progression. The following result is from [Erd60] and is one of the early uses of a squarefree sieve.

THEOREM 1.7.1. Let $a \perp D$, $1 \le a < D$. Then the smallest squarefree number in the arithmetic progression $\langle a + kD : k \ge 0 \rangle$ is $O\!\left( \frac{D^{3/2}}{\ln D} \right)$.

Proof : Let $A = \langle a + kD : k \ge 0 \rangle$ be the sequence. The first step would be to sift A by all squares of primes below a certain limit z. This will leave out only those numbers that could have a large prime as their square divisor. We will finally bound the number of such integers below x and show that there are still some numbers left over, and that will prove the theorem. Let $P_z = \prod_{p < z} p$. The result of the sifting of the sequence A by $P_z$ is:

$$S(A; P_z, x) = \sum_{\substack{n \in A \\ n \le x}} \sum_{\substack{d^2 \mid n \\ d \mid P_z}} \mu(d) = \sum_{d \mid P_z} \mu(d) \Bigg( \sum_{\substack{n \in A \\ n \le x,\ d^2 \mid n}} 1 \Bigg).$$


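Now

$$\sum_{\substack{n \in A \\ n \le x,\ d^2 \mid n}} 1$$

is exactly the number of solutions to the following pair of congruences:

$$n \equiv 0 \bmod d^2, \qquad n \equiv a \bmod D.$$

Suppose $d \perp D$. Then there is exactly one solution in any interval of length $\mathrm{lcm}(D, d^2) = D d^2$, so the total number of solutions in $1 \le n \le x$ is at most $\frac{x}{D d^2} + 1$. If $\gcd(d, D) = \delta$ then $n = k\delta$ by the first congruence and $n - a = k'\delta$ by the second congruence. This yields $a = (k - k')\delta$ and so $\gcd(a, D) \ne 1$. This is a contradiction, so if $d \not\perp D$ there are no solutions to the congruence. Let $k = \left\lfloor \frac{x - a}{D} \right\rfloor$, which is the maximum value of k for a + kD to be in A. Then

$$S(A; P_z, x) = \sum_{d \mid P_z,\ d \perp D} \mu(d) \left( \frac{x}{D d^2} + 1 \right) = \frac{x}{D} \sum_{d \mid P_z,\ d \perp D} \frac{\mu(d)}{d^2} = k \sum_{\substack{d \mid P_z \\ d \perp D}} \frac{\mu(d)}{d^2} + o(1)$$

$$= k \prod_{p \mid P_z,\ p \nmid D} \left(1 - \frac{1}{p^2}\right) + o(1) \ge k \prod_p \left(1 - \frac{1}{p^2}\right) + o(1) = k\, \frac{6}{\pi^2} + o(1).$$

Taking k to be $\frac{c \sqrt{D}}{\ln D}$ we have $S(A; P_z, k) \ge \frac{6}{\pi^2} \frac{c \sqrt{D}}{\ln D}$. The number of integers a + kD in A for which $k < \frac{c \sqrt{D}}{\ln D}$ and also

$$n \equiv 0 \bmod p^2, \qquad n \equiv a \bmod D$$

is at most $\frac{c \sqrt{D}}{p^2 \ln D} + 1$. Let N stand for the number of integers $k < \frac{c \sqrt{D}}{\ln D}$ for which $a + kD \not\equiv 0 \bmod p^2$ for all $p \le \sqrt{cD}$. Then

$$N \ge \frac{6}{\pi^2} \frac{c \sqrt{D}}{\ln D} - \frac{c \sqrt{D}}{\ln D} \sum_{p \ge z} \frac{1}{p^2} - \sum_{p \ge z,\ p \le \sqrt{cD}} 1 \tag{1.5}$$

$$\ge \frac{6}{\pi^2} \frac{c \sqrt{D}}{\ln D} - \frac{c \sqrt{D}}{\ln D} \cdot \frac{1}{z} - \pi(\sqrt{cD}), \tag{1.6}$$

and so for large enough c and L

$$N > \frac{1}{2} \frac{c \sqrt{D}}{\ln D}. \tag{1.7}$$

We have used the fact that $\pi(x) < \frac{2x}{\ln x}$ for large enough x.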

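Now we are left with the numbers that are either squarefree or divisible by the square of a prime $p > \sqrt{cD}$. For a number a + kD of the latter kind we have $a + kD \equiv 0 \bmod p^2$ with $k < \frac{c \sqrt{D}}{\ln D}$ and $p > \sqrt{cD}$, so that

$$a + kD = \alpha p^2 \quad \text{with} \quad \alpha < \frac{\sqrt{D}}{\ln D}.$$

Supposing $p > D^{\frac{1}{2} + \varepsilon}$, we would have $\alpha < D^{\frac{1}{2} - \varepsilon}$ if D is large enough, so we also have $p < D$. Thus $a + kD = \alpha p^2$ yields a congruence $a \equiv \alpha p^2 \bmod D$. Let us fix an α; then clearly the number of such prime solutions is less than the number of solutions of the congruence $x^2 \equiv a \alpha^{-1} \bmod D$, $0 < x < D$. If $a \alpha^{-1}$ is a quadratic residue modulo D, then by the Chinese Remainder Theorem there are at most $2^{\nu(D)}$ such solutions to this congruence. Since $\nu(n) = o(\ln n)$, we can write $2^{\nu(D)} = o(D^{\varepsilon/2})$. If $p > D^{\frac{1}{2} + \varepsilon}$ then there are only $D^{\frac{1}{2} - \varepsilon}$ choices for α, so on the whole there are only $o(D^{\frac{1}{2} - \frac{\varepsilon}{2}})$ such solutions.

Let us consider the solutions for $\sqrt{cD} < p < D^{\frac{1}{2} + \varepsilon}$. We have

$$p^2 \equiv a \alpha^{-1} \bmod D, \qquad \alpha < \frac{\sqrt{D}}{\ln D}, \qquad \sqrt{cD} < p < D^{\frac{1}{2} + \varepsilon}.$$

Let $c_\alpha$ be the number of solutions of this congruence for a fixed α. These solutions give rise to $\sum_\alpha \binom{c_\alpha}{2}$ solutions to the congruence

$$p^2 \equiv q^2 \bmod D, \qquad p, q < D^{\frac{1}{2} + \varepsilon}. \tag{1.8}$$

Since (1.8) implies $(p - q)(p + q) \equiv 0 \bmod D$, the number of such solutions is at most the number of solutions to

$$uv \equiv 0 \bmod D, \qquad u < 2 D^{\frac{1}{2} + \varepsilon}, \quad v < 2 D^{\frac{1}{2} + \varepsilon}.$$

This gives us

$$uv = \beta D, \qquad 1 \le \beta < 4 D^{2\varepsilon}. \tag{1.9}$$

Also, for a fixed β the number of such solutions is less than the number of factors of the number βD, which is $o((\beta D)^\varepsilon)$, so the number of solutions of (1.9) is $o((\beta D)^\varepsilon) \cdot 4 D^{2\varepsilon} = o(D^{4\varepsilon})$. This gives $\sum_\alpha \binom{c_\alpha}{2} = o(D^{4\varepsilon})$ and hence

$$\sum_{c_\alpha > 1} c_\alpha = o(D^{4\varepsilon}).$$

Since $\alpha < \frac{\sqrt{D}}{\ln D}$, we get $\sum_\alpha c_\alpha \le \frac{\sqrt{D}}{\ln D} + o(D^{4\varepsilon})$. Thus the number of integers $0 \le k < \frac{c \sqrt{D}}{\ln D}$ for which

$$a + kD \equiv 0 \bmod p^2$$

for some $p > \sqrt{cD}$ is at most $\frac{D^{1/2}}{\ln D} + o(D^{\frac{1}{2} - \frac{\varepsilon}{2}})$. So the number of integers k, $0 \le k < \frac{c \sqrt{D}}{\ln D}$, for which a + kD is squarefree is at least

$$\frac{1}{2} \frac{c \sqrt{D}}{\ln D} - \frac{\sqrt{D}}{\ln D} - o\!\left( D^{\frac{1 - \varepsilon}{2}} \right) > 0$$

for large enough c.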
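For small moduli the smallest squarefree member of a progression can simply be searched for, which gives a feel for how early it appears in practice. The sketch below is a brute-force illustration only (function names are assumptions of this example); it does not implement the sieve argument of the proof, and Theorem 1.7.1 is an asymptotic statement, so no comparison with $D^{3/2}/\ln D$ is asserted for small D.

```python
from math import gcd, isqrt


def is_squarefree(n: int) -> bool:
    """True if no square d^2 with d >= 2 divides n."""
    return all(n % (d * d) != 0 for d in range(2, isqrt(n) + 1))


def smallest_squarefree_in_progression(a: int, D: int) -> int:
    """Smallest squarefree number of the form a + k*D with k >= 0, for gcd(a, D) = 1."""
    if gcd(a, D) != 1:
        raise ValueError("a and D must be relatively prime")
    n = a
    while not is_squarefree(n):
        n += D                      # move to the next term of the progression
    return n


if __name__ == "__main__":
    for a, D in ((4, 5), (4, 9), (8, 9), (1, 4)):
        print(a, D, smallest_squarefree_in_progression(a, D))
```

For example, with a = 4 and D = 5 the progression starts 4, 9, 14, and 14 is the first squarefree term.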

1.8. The Sieve Problem

Now that we have seen some examples of sieve techniques at work, we can formulate the sieve problem in a generic setting so that the essential quantities are clearly visible. The notation we shall adopt is that of the seminal book by Halberstam and Richert [HR74].


1.8.1. Notation.

1. $A, B, \ldots$ will stand for integer sequences.
2. $A_d = \langle a \in A : a \equiv 0 \bmod d \rangle$.
3. $A^z = \langle a \in A : a \le z \rangle$.
4. If A is a finite sequence then |A| will denote the length of the sequence.
5. $P = \langle p_i : p_i \text{ is the } i\text{-th prime} \rangle$.
6. $P_z = \prod_{p \in P^z} p$.
7. $S(A; P_z, x)$ will be the number of elements in $A^x$ that survive the sifting process by the sequence $P^z$. In general the sifting is determined by a sifting function $\sigma : A \to \{0, 1\}$ which determines whether a number survives the sifting, but usually we will only be considering simple sifting functions like

$$\sigma(n) = 1 \iff n \perp \prod_{p \in S^z} p.$$

8. If A is a finite sequence then ω(p) is defined such that $\frac{\omega(p)}{p}\, x$ is a good approximation to $|A^x_p|$. If d is any squarefree integer we can generalize this notation by defining $\omega(d) = \prod_{p \mid d} \omega(p)$.
9. Define $R_d(x) = |A^x_d| - \frac{\omega(d)}{d}\, x$, i.e. the remainder term in our estimate of $|A^x_d|$.
10. Define

$$W(z) = \prod_{p \mid P_z} \left(1 - \frac{\omega(p)}{p}\right).$$

1.8.2. The Sieve of Eratosthenes-Legendre revisited. The generic sieve problem is to estimate S(A;Pz,x). Needless to say solving the problem as stated in this generality is too great a task. This treatise will only be concerned with restricted versions of the sieve problem which nevertheless yield interesting and non-trivial results. The case of great importance is when Sz =Pz and A is some subsequence of positive integers.

The sieve of Eratosthenes-Legendre can be recast in this framework as follows. Let A be the sequence to be sifted, and let ω(d) and Rd be the modulo counting function and the remainder function for the sequence, respectively. Let Pz be the sifting sequence; then the sifting function is

\[ \sigma(n) = \begin{cases} 1 & \text{if } n \perp P_z, \\ 0 & \text{otherwise.} \end{cases} \]

We can rewrite $\sigma(n)$ as

\[ \sigma(n) = \sum_{d \mid \gcd(n,P_z)} \mu(d). \]


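Thus we have

\begin{align*}
S(A;P_z,x) &= \sum_{n \in A,\ n \le x} \sigma(n) \\
&= \sum_{n \in A^x} \ \sum_{d \mid \gcd(n,P_z)} \mu(d) \\
&= \sum_{d \mid P_z} \mu(d) \sum_{n \in A^x,\ d \mid n} 1 \\
&= \sum_{d \mid P_z} \mu(d)\,|A^x_d| \\
&= \sum_{d \mid P_z} \mu(d)\Bigl(\frac{\omega(d)}{d}\,x + R_d(x)\Bigr) \\
&= x \sum_{d \mid P_z} \frac{\mu(d)\omega(d)}{d} + \sum_{d \mid P_z} \mu(d) R_d(x) \\
&= x \prod_{p \mid P_z} \Bigl(1 - \frac{\omega(p)}{p}\Bigr) + \sum_{d \mid P_z} \mu(d) R_d(x) \\
&= xW(z) + \sum_{d \mid P_z} \mu(d) R_d(x) \\
&= xW(z) + \theta \sum_{d \mid P_z} |R_d(x)|,
\end{align*}

where $|\theta| \le 1$. If we assume that $|R_d(x)| \le \omega(d)$ and suppose that $\omega(p) \le C_0$, where $C_0$ is some constant, then $\omega(d) \le C_0^{\nu(d)}$. So

\[ \sum_{d \mid P_z} |R_d(x)| \le \sum_{d \mid P_z} C_0^{\nu(d)} = \prod_{p \mid P_z} (1 + C_0) = (1+C_0)^{\pi(z)}. \]

Thus we have proved the following theorem.

THEOREM 1.8.1. For all sufficiently large $x$ and $z < x$, there is a $\theta$ with $|\theta| \le 1$ ($\theta$ depending on $z$), such that

\[ S(A;P_z,x) = xW(z) + \theta \sum_{d \mid P_z} |R_d(x)|. \]

If we have $|R_d(x)| \le \omega(d)$ and $\omega(p) \le C_0$ then

\[ S(A;P_z,x) = xW(z) + O\bigl((1+C_0)^{\pi(z)}\bigr). \]

The effectiveness of the basic sieve is limited by the fact that the remainder term is a sum over all the divisors of $P_z$. Beginning with the next chapter we shall systematically try to reduce this term.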
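As a sanity check, the identity $S(A;P_z,x) = \sum_{d \mid P_z} \mu(d)|A_d|$ and the remainder bound of Theorem 1.8.1 can be verified numerically. The sketch below assumes the simplest case $A = \{1,\ldots,x\}$, so that $\omega(p) = 1$, $C_0 = 1$ and $|A_d| = \lfloor x/d \rfloor$; the function names are illustrative and not part of the text.

```python
from itertools import combinations
from math import prod


def primes_upto(z):
    """Return the primes p <= z (trial division; adequate for small z)."""
    return [p for p in range(2, z + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def sifted_count(x, primes):
    """S(A; P_z, x) for A = {1, ..., x}: count n with no prime factor in `primes`."""
    return sum(1 for n in range(1, x + 1) if all(n % p for p in primes))


def legendre_sum(x, primes):
    """Sum of mu(d) * |A_d| over all d | P_z, using |A_d| = floor(x / d)."""
    total = 0
    for k in range(len(primes) + 1):
        for combo in combinations(primes, k):
            total += (-1) ** k * (x // prod(combo))
    return total


def main_term(x, primes):
    """x * W(z) with omega(p) = 1, i.e. x * prod(1 - 1/p)."""
    return x * prod(1 - 1 / p for p in primes)


x, z = 1000, 13
ps = primes_upto(z)
# Exact count, inclusion-exclusion value, main term, and the bound 2^pi(z).
print(sifted_count(x, ps), legendre_sum(x, ps), main_term(x, ps), 2 ** len(ps))
```

The remainder bound $2^{\pi(z)}$ grows exponentially in the number of sifting primes, which is exactly the weakness the next chapter addresses.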


CHAPTER 2

The Combinatorial Sieve

In this chapter we begin by exploring the ideas of Viggo Brun, who first showed how we can improve on the Legendre method if we relax our requirement of asymptotic results and look for inequalities instead. After developing Brun's sieve in general we shall look at applications that bring out the surprising power of the technique. We follow the presentation in Halberstam & Richert [HR74] rather closely since its form is well suited for our applications. However, our development will be targeted only at Brun's sieve.

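2.1. Brun's Pure Sieve

Let $A^x$ be a finite sequence of integers and let $S_z$ be the sifting primes. In the previous chapter the sifting function was

\[ \sigma(n) = \sum_{d \mid \gcd(n,P_z)} \mu(d). \]

Let us see what can be done if instead we have a pair of functions $\chi_1(d)$ and $\chi_2(d)$ such that

\[ \sigma_2(n) \equiv \sum_{d \mid n} \mu(d)\chi_2(d) \le \sum_{d \mid n} \mu(d) \le \sum_{d \mid n} \mu(d)\chi_1(d) \equiv \sigma_1(n). \]

Since

\[ S(A;P_z,x) = \sum_{d \mid P_z} \mu(d)|A_d| = |A| - \sum_{p \mid P_z} |A_p| + \sum_{pq \mid P_z} |A_{pq}| - \cdots \]

we expect that truncating the series after an even (odd) number of sums will give us a lower (upper) bound. Brun's pure sieve is an application of this well-known idea.

Using the notation developed in the last chapter we have

\[ \sum_{n \in A} \ \sum_{d \mid n,\ d \mid P_z} \mu(d)\chi_2(d) \le S(A;P_z,x) \le \sum_{n \in A} \ \sum_{d \mid n,\ d \mid P_z} \mu(d)\chi_1(d). \]

Let us first look at the upper bound:

\begin{align*}
\sum_{n \in A} \ \sum_{d \mid n,\ d \mid P_z} \mu(d)\chi_1(d) &= \sum_{d \mid P_z} \mu(d)\chi_1(d)\,|A^x_d| \\
&= \sum_{d \mid P_z} \mu(d)\chi_1(d)\Bigl(\frac{\omega(d)}{d}\,x + R_d(x)\Bigr) \\
&= x \sum_{d \mid P_z} \mu(d)\chi_1(d)\frac{\omega(d)}{d} + \sum_{d \mid P_z} \mu(d)\chi_1(d) R_d(x).
\end{align*}

Let $\sigma_1(n) = \sum_{d \mid n} \mu(d)\chi_1(d)$; then by Möbius inversion we get

\[ \mu(d)\chi_1(d) = \sum_{\delta \mid d} \mu\Bigl(\frac{d}{\delta}\Bigr)\sigma_1(\delta). \]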


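Substituting this in the above expression we get

\begin{align*}
x \sum_{d \mid P_z} \mu(d)\chi_1(d)\frac{\omega(d)}{d}
&= x \sum_{d \mid P_z} \frac{\omega(d)}{d} \sum_{\delta \mid d} \mu\Bigl(\frac{d}{\delta}\Bigr)\sigma_1(\delta) \\
&= x \sum_{\delta \mid P_z} \sigma_1(\delta)\frac{\omega(\delta)}{\delta} \sum_{t \mid (P_z/\delta)} \mu(t)\frac{\omega(t)}{t}
\qquad \text{(since $\omega(d)$ is a multiplicative function)} \\
&= x \sum_{\delta \mid P_z} \sigma_1(\delta)\frac{\omega(\delta)}{\delta} \prod_{p \mid (P_z/\delta)} \Bigl(1 - \frac{\omega(p)}{p}\Bigr) \\
&= xW(z) \sum_{\delta \mid P_z} \sigma_1(\delta)\frac{\omega(\delta)}{\delta \prod_{p \mid \delta}\bigl(1 - \frac{\omega(p)}{p}\bigr)} \\
&= xW(z) \sum_{\delta \mid P_z} \sigma_1(\delta) g(\delta) = xW(z)\Bigl(1 + \sum_{1 < \delta \mid P_z} \sigma_1(\delta) g(\delta)\Bigr),
\end{align*}

where $g(d)$ abbreviates

\[ g(d) = \frac{\omega(d)}{d \prod_{p \mid d}\bigl(1 - \frac{\omega(p)}{p}\bigr)}. \]

The remainder term is clearly at most

\[ \sum_{d \mid P_z} \mu(d)\chi_1(d) R_d(x) \le \sum_{d \mid P_z} |\chi_1(d)|\,|R_d(x)|. \]

A similar argument works for the lower bound too. Thus we have:

\begin{align*}
xW(z)\Bigl(1 + \sum_{1 < \delta \mid P_z} \sigma_2(\delta) g(\delta)\Bigr) - \sum_{d \mid P_z} |\chi_2(d)|\,|R_d(x)| &\le S(A;P_z,x) \tag{2.10} \\
&\le xW(z)\Bigl(1 + \sum_{1 < \delta \mid P_z} \sigma_1(\delta) g(\delta)\Bigr) + \sum_{d \mid P_z} |\chi_1(d)|\,|R_d(x)|. \tag{2.11}
\end{align*}

Our aim will be to minimize $\bigl|\sum_{1 < \delta \mid P_z} \sigma_i(\delta) g(\delta)\bigr|$ for $i = 1,2$ while keeping the remainder term $\sum_{d \mid P_z} |\chi_i(d)|\,|R_d|$ small. A whole class of estimates can be obtained by restricting the functions $\chi_i$ to be the characteristic sequences of two divisor sets $D^+$ and $D^-$ of $P_z$. The resulting sieves are called Combinatorial Sieves. Let us consider the following functions:

\[ \chi^{(r)}(d) = \begin{cases} 1 & \text{if } \nu(d) \le r \text{ and } \mu^2(d) = 1, \\ 0 & \text{otherwise.} \end{cases} \]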

These functions restrict the divisor sets over which we take the sum. In particular the restriction is on the number of distinct prime factors of the divisors. We will require the following lemma.

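LEMMA 2.1.1.

\[ \sum_{0 \le i \le k} (-1)^i \binom{n}{i} = (-1)^k \binom{n-1}{k}. \]

Proof : The proof is by induction on $k$. For $k = 0$ we have

\[ (-1)^0 \binom{n}{0} = \binom{n-1}{0} + \binom{n-1}{-1} = \binom{n-1}{0}. \]

Now

\begin{align*}
\sum_{0 \le i \le k+1} (-1)^i \binom{n}{i} &= \sum_{0 \le i \le k} (-1)^i \binom{n}{i} + (-1)^{k+1}\binom{n}{k+1} \\
&= (-1)^k \binom{n-1}{k} + (-1)^{k+1}\binom{n}{k+1} \\
&= (-1)^k \binom{n-1}{k} + (-1)^{k+1}\Bigl(\binom{n-1}{k} + \binom{n-1}{k+1}\Bigr) \\
&= (-1)^{k+1}\binom{n-1}{k+1}.
\end{align*}

LEMMA 2.1.2. Let $n$ be a positive integer and $s$ a non-negative integer. Then

\[ \sum_{d \mid n} \mu(d)\chi^{(2s+1)}(d) \le \sum_{d \mid n} \mu(d) \le \sum_{d \mid n} \mu(d)\chi^{(2s)}(d). \]

Proof : When $n = 1$ all the sums are equal so we can assume $n > 1$. Then

\[ \sum_{d \mid n} \mu(d)\chi^{(r)}(d) = \sum_{0 \le k \le r} (-1)^k \binom{\nu(n)}{k} = (-1)^r \binom{\nu(n)-1}{r} \]

by Lemma (2.1.1). Since $\sum_{d \mid n} \mu(d) = 0$ for $n > 1$, and the right-hand side is non-negative for even $r$ and non-positive for odd $r$, the lemma follows.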
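Lemmas 2.1.1 and 2.1.2 are easy to check by brute force. The following sketch enumerates the squarefree divisors of $n = 2\cdot3\cdot5\cdot7\cdot11$ (an arbitrary choice made here for illustration) and compares the truncated Möbius sum with the closed form $(-1)^r\binom{\nu(n)-1}{r}$.

```python
from itertools import combinations
from math import comb


def truncated_mobius_sum(primes, r):
    """Sum of mu(d) over squarefree d | n with nu(d) <= r, where n = product of `primes`."""
    return sum((-1) ** k
               for k in range(r + 1)
               for _ in combinations(primes, k))


primes = [2, 3, 5, 7, 11]          # n = 2310, nu(n) = 5
for r in range(8):
    value = truncated_mobius_sum(primes, r)
    # Lemma 2.1.1 with n replaced by nu(n): closed form of the truncated sum.
    assert value == (-1) ** r * comb(len(primes) - 1, r)
    print(r, value)
```

The printed values alternate in sign, which is the content of Lemma 2.1.2: even truncations overshoot the full sum (which is 0 for $n > 1$) and odd truncations undershoot it.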

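Now let us try to bound the terms involved in (2.11). Let $\sigma^{(r)}(n) = \sum_{d \mid n} \mu(d)\chi^{(r)}(d)$, so that we have

\[ \sigma^{(r)}(n) = \sum_{d \mid n,\ \nu(d) \le r} \mu(d) = (-1)^r \binom{\nu(n)-1}{r} \]

and hence $|\sigma^{(r)}(n)| = \binom{\nu(n)-1}{r} \le \binom{\nu(n)}{r}$. Then we have

\begin{align*}
\Bigl|\sum_{1 < d \mid P_z} \sigma^{(r)}(d) g(d)\Bigr|
&\le \sum_{1 < d \mid P_z} \binom{\nu(d)}{r} g(d)
= \sum_{r \le m \le \nu(P_z) = \pi(z)} \binom{m}{r} \sum_{1 < d \mid P_z,\ \nu(d) = m} g(d) \\
&\le \sum_{m \ge r} \binom{m}{r} \frac{1}{m!} \Bigl(\sum_{p < z} g(p)\Bigr)^m \\
&= \frac{1}{r!} \Bigl(\sum_{p < z} g(p)\Bigr)^r \exp\Bigl(\sum_{p < z} g(p)\Bigr).
\end{align*}

Suppose we make the assumption $|R_d(x)| \le \omega(d)$; then we can also bound the remainder term as follows:

\[ \sum_{d \mid P_z} |\chi^{(r)}(d)|\,|R_d(x)| \le \sum_{d \mid P_z,\ \nu(d) \le r} \omega(d) \le \Bigl(1 + \sum_{p < z} \omega(p)\Bigr)^r. \]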


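Since

\[ \sum_{d \mid P_z,\ \nu(d) \le 2s+1} \mu(d)|A_d| = \sum_{d \mid P_z,\ \nu(d) \le 2s} \mu(d)|A_d| - \sum_{d \mid P_z,\ \nu(d) = 2s+1} |A_d| \le S(A;P_z,x) \le \sum_{d \mid P_z,\ \nu(d) \le 2s} \mu(d)|A_d| \]

we can always write

\[ S(A;P_z,x) = \sum_{d \mid P_z,\ \nu(d) \le r} \mu(d)|A_d| + \theta \sum_{d \mid P_z,\ \nu(d) = r+1} |A_d|, \qquad |\theta| \le 1. \]

Putting all these together we have:

\[ S(A;P_z,x) = xW(z)\Bigl(1 + \theta\,\frac{1}{r!}\Bigl(\sum_{p < z} g(p)\Bigr)^r \exp\Bigl(\sum_{p < z} g(p)\Bigr)\Bigr) + \theta'\Bigl(1 + \sum_{p < z} \omega(p)\Bigr)^r \]

for some positive integer $r$, and with $|\theta| \le 1$, $|\theta'| \le 1$. Thus we have proved: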

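THEOREM 2.1.3 (Brun's Pure Sieve). Let

\[ g(d) = \frac{\omega(d)}{d \prod_{p \mid d}\bigl(1 - \frac{\omega(p)}{p}\bigr)} \]

be well defined for all $d$ with $\mu(d) \ne 0$, and suppose $|R_d(x)| \le \omega(d)$. Then for every non-negative integer $r$ there exist $\theta, \theta'$ with $|\theta| \le 1$, $|\theta'| \le 1$ such that

\[ S(A;P_z,x) = xW(z)\Bigl(1 + \theta\,\frac{1}{r!}\Bigl(\sum_{p < z} g(p)\Bigr)^r \exp\Bigl(\sum_{p < z} g(p)\Bigr)\Bigr) + \theta'\Bigl(1 + \sum_{p < z} \omega(p)\Bigr)^r. \]

We can apply this theorem to derive a much better bound on $\pi(x)$ than we obtained earlier. We consider the sequence $A = \langle n : n \le x \rangle$ of positive integers up to $x$, and in this case since $A_p = \{n \le x \mid n \equiv 0 \bmod p\}$ we have $|A_p| = \lfloor x/p \rfloor = \frac{1}{p}x + \delta'$, $|\delta'| < 1$. So we can take $\omega(p) = 1$, and the condition $|R_d(x)| \le \omega(d)$ is also satisfied. Also

\[ g(p) = \frac{1}{p\bigl(1 - \frac{1}{p}\bigr)} \le \frac{2}{p}, \]

and this gives us

\[ \sum_{p < z} g(p) \le 2 \sum_{p < z} \frac{1}{p} < 2(\ln\ln z + 1). \]

We use the trivial estimate $1 + \sum_{p < z} \omega(p) \le z$. In this case we have

\[ W(z) = \prod_{p < z} \Bigl(1 - \frac{1}{p}\Bigr) \sim \frac{e^{-\gamma}}{\ln z}. \]

We begin with the following observations. First, $\frac{1}{r!} \le \bigl(\frac{e}{r}\bigr)^r$ by the Stirling approximation. Next, if we set $z$ such that $\sum_{p < z} g(p) \le \lambda r$, then the result of the theorem simplifies to

\[ S(A;P_z,x) = xW(z)\bigl(1 + \theta\,(e^{1+\lambda}\lambda)^r\bigr) + \theta' z^r. \]

Defining

\[ r = \Bigl\lfloor \frac{2(\ln\ln z + 1)}{\lambda} \Bigr\rfloor + 1 \]

gives us $\sum_{p < z} g(p) \le \lambda r$. We restrict $z$ so that

\[ \ln z = \frac{\ln x}{\gamma \ln\ln x}, \]

and set

\[ \lambda = \frac{\xi \ln z\,(\ln\ln z + 1)}{\ln x} \]

so that for a large enough $x$ and appropriate settings of $\xi, \gamma$ we get $\lambda e^{1+\lambda} \le 1$. For this setting of $z$ and $r$ we have $z^r = o(x^{1-\varepsilon})$ for some $\varepsilon > 0$. Thus the theorem gives

\[ \pi(x) = O\Bigl(\frac{x \ln\ln x}{\ln x}\bigl(1 + \theta e^{-c \ln\ln x}\bigr)\Bigr) + o(x^{1-\varepsilon}). \]

This approximation is significantly better than our first and shows the improvement that can be made using this simple idea.

Next we will look at the twin primes problem, which was Brun's primary application of his pure sieve. In this case we take the sequence to be $A = \{n(n+2) \mid n \le x\}$. Let $p > 2$; then $A_p = \{n(n+2) \mid n \le x,\ n(n+2) \equiv 0 \bmod p\}$. Now $n(n+2) \equiv 0 \bmod p$ only if $n \equiv 0$ or $n + 2 \equiv 0$, since $p$ is an odd prime. Clearly $0$ and $p-2$ are two solutions in the interval $0, \ldots, p-1$. So we can take $\omega(p) = 2$ for $p > 2$. For $p = 2$ we have $\omega(p) = 1$. By the Chinese Remainder Theorem we have $|R_d(x)| \le \omega(d)$. We take the sifting primes to be $P = \{p \mid p > 2\}$. Since $S(A;P_z,x)$ counts all the twin-prime pairs above $z$, $S(A;P_z,x) + 2z$ is an upper bound on the number of twin primes below $x$. Then:

\[ W(z) = \prod_{2 < p < z} \Bigl(1 - \frac{2}{p}\Bigr) \le \prod_{2 < p < z} \Bigl(1 - \frac{1}{p}\Bigr)^2 = O\Bigl(\frac{1}{\ln^2 z}\Bigr). \]

Carrying out the rest of the analysis, again using $\ln z = \frac{\ln x}{\gamma \ln\ln x}$, we get the following theorem.

THEOREM 2.1.4. Let $\pi_2(x) = |\{p \le x \mid p + 2 = q\}|$, where $p$ and $q$ denote primes. Then

\[ \pi_2(x) = O\Bigl(x \Bigl(\frac{\ln\ln x}{\ln x}\Bigr)^2\Bigr). \]

The above theorem can be put in a more impressive form.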

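THEOREM 2.1.5. The sum

\[ \sum_{p:\ p+2 = q} \frac{1}{p} \]

converges.

Proof :

\begin{align*}
\sum_{p:\ p+2 = q} \frac{1}{p}
&= \sum_n \frac{\pi_2(n) - \pi_2(n-1)}{n} \\
&= \sum_n \pi_2(n)\Bigl(\frac{1}{n} - \frac{1}{n+1}\Bigr) \\
&= \sum_n \pi_2(n)\,\frac{1}{n(n+1)} \\
&\le B \sum_n \frac{n (\ln\ln n)^2}{n(n+1)\ln^2 n} \\
&\le B \sum_n \frac{1}{n}\Bigl(\frac{\ln\ln n}{\ln n}\Bigr)^2 = O(1).
\end{align*}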

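The last step follows because the tail of the series satisfies

\[ \sum_{n \ge x} \frac{1}{n}\Bigl(\frac{\ln\ln n}{\ln n}\Bigr)^2 \le \Bigl(\frac{2}{\ln x} + \frac{2\ln\ln x}{\ln x} + \frac{(\ln\ln x)^2}{\ln x}\Bigr)(1 + o(1)), \]

using approximation by integration, and the right-hand side tends to $0$ as $x \to \infty$.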
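To get a feeling for Theorems 2.1.4 and 2.1.5, one can tabulate $\pi_2(x)$ and the partial sums of $\sum 1/p$ over primes $p$ with $p+2$ also prime. The sketch below is only a numerical illustration (it proves nothing about convergence); note that it sums $1/p$ as in Theorem 2.1.5, not $1/p + 1/(p+2)$ as in the usual definition of Brun's constant.

```python
def prime_flags(limit):
    """Sieve of Eratosthenes: flags[n] is True exactly when n is prime, 0 <= n <= limit."""
    flags = [True] * (limit + 1)
    flags[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if flags[p]:
            for multiple in range(p * p, limit + 1, p):
                flags[multiple] = False
    return flags


def twin_prime_data(x):
    """Return (pi_2(x), sum of 1/p) over primes p <= x with p + 2 also prime."""
    flags = prime_flags(x + 2)
    twins = [p for p in range(2, x + 1) if flags[p] and flags[p + 2]]
    return len(twins), sum(1 / p for p in twins)


for x in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
    count, partial = twin_prime_data(x)
    print(x, count, round(partial, 5))
```

The partial sums grow very slowly with $x$, in line with the bound $\pi_2(x) = O\bigl(x(\ln\ln x/\ln x)^2\bigr)$.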

2.2. Brun’s Sieve

The second idea of Brun was to limit the remainder term by restricting the size of the primes making up the divisors. This simple idea results in a sieve of remarkable power which can be used to prove rather sharp bounds on $S(A;P_z,x)$. Since we are modifying the divisor sets in a non-trivial fashion we would like to have some simple conditions on the characteristic functions $\chi$ of the divisor sets, such that $\chi$ still yields good lower or upper bounds. Our first task is to find such a set of conditions. We begin with the following observation.

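PROPOSITION 2.2.1.

\[ S(A;P_z,x) = \sum_{d \mid P_z} \mu(d)\chi(d)|A_d| - \sum_{1 < d \mid P_z} \sigma(d)\,S(A_d; P_z^{(d)}, x), \]

where $\sigma(d) = \sum_{l \mid d} \mu(l)\chi(l)$ and

\[ P_z^{(d)} = \prod_{p \mid P_z,\ p \nmid d} p. \]

Proof :

\begin{align*}
\sum_{d \mid P_z} \mu(d)\chi(d)|A^x_d|
&= \sum_{d \mid P_z} |A^x_d| \sum_{\delta \mid d} \mu\Bigl(\frac{d}{\delta}\Bigr)\sigma(\delta) \\
&= \sum_{\delta \mid P_z} \sigma(\delta) \sum_{t \mid P_z/\delta} \mu(t)|A_{\delta t}| \\
&= \sum_{t \mid P_z} \mu(t)|A_t| + \sum_{1 < \delta \mid P_z} \sigma(\delta) \sum_{t \mid P_z/\delta} \mu(t)|A_{\delta t}| \\
&= S(A;P_z,x) + \sum_{1 < \delta \mid P_z} \sigma(\delta) \sum_{t \mid P_z/\delta} \mu(t)|A_{\delta t}| \\
&= S(A;P_z,x) + \sum_{1 < d \mid P_z} \sigma(d)\,S(A_d; P_z^{(d)}, x),
\end{align*}

where we have used Möbius inversion on the expression for $\sigma(d)$ as in the previous section.

We will use the above proposition to compare $\sum_{d \mid P_z} \mu(d)|A_d|$ with $\sum_{d \mid P_z} \mu(d)\chi(d)|A_d|$.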


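Now, for any prime $p \mid d$,

\begin{align*}
\sigma(d) &= \sum_{l \mid d} \mu(l)\chi(l) \\
&= \sum_{l \mid d/p} \mu(l)\chi(l) + \sum_{l \mid d/p} \mu(lp)\chi(lp) \\
&= \sum_{l \mid d/p} \mu(l)\chi(l) - \sum_{l \mid d/p} \mu(l)\chi(lp) \\
&= \sum_{l \mid d/p} \mu(l)\bigl(\chi(l) - \chi(lp)\bigr).
\end{align*}

Let $q(d)$ be the smallest prime divisor of $d$. Now using the above expression we can write

\begin{align*}
\sum_{1 < d \mid P_z} \sigma(d)\,S(A_d; P_z^{(d)}, x)
&= \sum_{\delta \mid P_z} \ \sum_{p \mid P_z,\ p < q(\delta)} \sigma(p\delta)\,S(A_{p\delta}; P_z^{(p\delta)}, x) \\
&= \sum_{\delta \mid P_z} \ \sum_{p \mid P_z,\ p < q(\delta)} S(A_{p\delta}; P_z^{(p\delta)}, x) \sum_{l \mid \delta} \mu(l)\bigl(\chi(l) - \chi(pl)\bigr) \\
&= \sum_{l \mid P_z} \ \sum_{p \mid P_z,\ p < q(l)} \mu(l)\bigl(\chi(l) - \chi(pl)\bigr) \sum_{t \mid P_z/l,\ p < q(t)} S(A_{plt}; P_z^{(plt)}, x) \\
&= \sum_{l \mid P_z} \ \sum_{p \mid P_z,\ p < q(l)} \mu(l)\bigl(\chi(l) - \chi(pl)\bigr)\,S(A_{pl}; P_p^{(pl)}, x).
\end{align*}

Using this in the above proposition,

\begin{align*}
S(A;P_z,x) &= \sum_{d \mid P_z} \mu(d)\chi(d)|A_d| - \sum_{d \mid P_z} \ \sum_{p \mid P_z,\ p < q(d)} \mu(d)\bigl(\chi(d) - \chi(pd)\bigr)\,S(A_{pd}; P_p^{(pd)}, x) \\
&= \sum_{d \mid P_z} \mu(d)\chi(d)|A_d| - \sum_{d \mid P_z} \ \sum_{p \mid P_z,\ p < q(d)} \mu(d)\bigl(\chi(d) - \chi(pd)\bigr)\,S(A_{pd}; P_p, x)
\end{align*}

since $P_p^{(pd)} = P_p$. Suppose we have $\chi(1) = 1$ and $\chi(d) = 0$ for $d > 1$. Then

\[ S(A;P_z,x) = |A| - \sum_{p < z,\ p \in P} S(A_p; P_p, x). \]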
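The last identity can be checked directly: every $n > 1$ with a prime factor below $z$ is removed exactly once, namely by its least prime factor. The sketch below assumes $A = \{1,\ldots,x\}$ and takes $P_z$ to be the product of the primes $p < z$, as in this chapter; the helper names are illustrative.

```python
def least_prime_factors(x):
    """lpf[n] = least prime factor of n for 2 <= n <= x (lpf[0] = lpf[1] = 0)."""
    lpf = [0] * (x + 1)
    for p in range(2, x + 1):
        if lpf[p] == 0:                      # p is prime
            for multiple in range(p, x + 1, p):
                if lpf[multiple] == 0:
                    lpf[multiple] = p
    return lpf


def sifted(x, z, lpf):
    """S(A; P_z, x) for A = {1..x}: n with no prime factor p < z."""
    return sum(1 for n in range(1, x + 1) if n == 1 or lpf[n] >= z)


def sifted_multiples(x, p, lpf):
    """S(A_p; P_p, x): multiples of p up to x with no prime factor below p."""
    return sum(1 for n in range(2, x + 1) if lpf[n] == p)


x, z = 1000, 30
lpf = least_prime_factors(x)
primes_below_z = [p for p in range(2, z) if lpf[p] == p]
lhs = sifted(x, z, lpf)
rhs = x - sum(sifted_multiples(x, p, lpf) for p in primes_below_z)
print(lhs, rhs)
```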

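Now let $\chi_1, \chi_2$ be the characteristic functions of the divisor sets that we wish to use to get upper and lower bounds respectively. If we arrange

\[ (-1)^{i-1}\mu(d)\bigl(\chi_i(d) - \chi_i(pd)\bigr) \ge 0 \quad \text{whenever } pd \mid P_z \text{ and } p < q(d), \text{ for } i = 1,2, \]

then

\[ \sum_{d \mid P_z} \mu(d)\chi_2(d)|A_d| \le S(A;P_z,x) \le \sum_{d \mid P_z} \mu(d)\chi_1(d)|A_d|. \]

The above inequality is valid (needless to say) only if the sums involving $\chi_i$ are positive. This gives us a set of sufficient conditions for our functions $\chi_i$ to be well behaved.

If $pd \mid P_z$ and $p < q(d)$ then the conditions can be satisfied in only one of the following ways:

1. $\chi_i(d) = \chi_i(pd)$;
2. $\chi_i(d) = 1$, $\chi_i(pd) = 0$ and $\mu(d) = (-1)^{i-1}$;
3. $\chi_i(d) = 0$, $\chi_i(pd) = 1$ and $\mu(d) = (-1)^i$.

We can avoid the last possibility by requiring that the functions $\chi_i$ be divisor closed, i.e. that $\chi_i(d) = 1 \Rightarrow \forall\, \delta \mid d : \chi_i(\delta) = 1$. So the functions $\chi_i$ for $i = 1,2$ should have the following properties:

1. If $d \mid P_z$, then either $\chi_i(d) = 0$ or $\chi_i(d) = 1$;
2. $\chi_i(1) = 1$ (this is required for the derivation in Proposition (2.2.1));
3. $\chi_i(d) = 1 \Rightarrow \forall\, \delta \mid d : \chi_i(\delta) = 1$;
4. $\chi_i(d) = 1,\ \mu(d) = (-1)^i \Rightarrow \chi_i(pd) = 1$ for all $pd \mid P_z$, where $p < q(d)$.

Suppose we restrict $\chi^{(r)}$ (which was the divisor selection function of the previous section) to also limit the number of prime factors that come from a certain interval. Suppose at most $\delta_1$ prime divisors can come from the interval $z_1 < p < z$. Then the remainder term obeys

\[ \sum_{d \mid P_z} \chi^{(r)}(d)|R_d| \le \Bigl(1 + \sum_{p < z} \omega(p)\Bigr)^{\delta_1} \Bigl(1 + \sum_{p < z_1} \omega(p)\Bigr)^{r - \delta_1}. \]

This allows a more accurate estimation of the remainder term. The full Brun Sieve uses $n$ such intervals to minimize the remainder term.

The first step is to compare $\sum_{d \mid P_z} \mu(d)\frac{\omega(d)}{d}$ with $\sum_{d \mid P_z} \mu(d)\chi_i(d)\frac{\omega(d)}{d}$. By writing $\chi_i(d) = 1 + \bar{\chi}_i(d)$, we can split the sum

\[ \sum_{d \mid P_z} \mu(d)\chi_i(d)\frac{\omega(d)}{d} = \sum_{d \mid P_z} \mu(d)\frac{\omega(d)}{d} + \sum_{d \mid P_z} \mu(d)\bar{\chi}_i(d)\frac{\omega(d)}{d}. \]

Let $d = p_1 \cdots p_r$ with $p_1 < \cdots < p_r$; then

\[ 1 - \chi_i(d) = \bigl(\chi_i(p_2 \cdots p_r) - \chi_i(p_1 \cdots p_r)\bigr) + \bigl(\chi_i(p_3 \cdots p_r) - \chi_i(p_2 \cdots p_r)\bigr) + \cdots + \bigl(\chi_i(1) - \chi_i(p_r)\bigr). \]

If we write $P(p^+,z) = \prod_{p < q < z,\ q \in P} q$ and $P(p,z) = \prod_{p \le q < z,\ q \in P} q$, then we can write the above as:

\[ 1 - \chi_i(d) = \sum_{p \mid d} \Bigl(\chi_i\bigl(\gcd(d, P(p^+,z))\bigr) - \chi_i\bigl(\gcd(d, P(p,z))\bigr)\Bigr). \]

This gives us

\[ \sum_{d \mid P_z} \mu(d)\chi_i(d)\frac{\omega(d)}{d} = W(z) + \sum_{d \mid P_z} \sum_{p \mid d} \mu\Bigl(\frac{d}{p}\Bigr)\Bigl(\chi_i\bigl(\gcd(d, P(p^+,z))\bigr) - \chi_i\bigl(\gcd(d, P(p,z))\bigr)\Bigr)\frac{\omega(d)}{d}. \]

Let $d = \delta p t$, where $\delta \mid P_p$ and $t \mid P(p^+,z)$. Rewriting the above expression we get:

\begin{align*}
\sum_{d \mid P_z} \mu(d)\chi_i(d)\frac{\omega(d)}{d}
&= W(z) + \sum_{p < z} \frac{\omega(p)}{p} \sum_{\delta \mid P_p} \mu(\delta)\frac{\omega(\delta)}{\delta} \sum_{t \mid P(p^+,z)} \mu(t)\bigl(\chi_i(t) - \chi_i(pt)\bigr)\frac{\omega(t)}{t} \\
&= W(z) + (-1)^{i-1} \sum_{p < z} \frac{\omega(p)}{p}\,W(p) \sum_{t \mid P(p^+,z)} \chi_i(t)\bigl(1 - \chi_i(pt)\bigr)\frac{\omega(t)}{t},
\end{align*}

where we have used $\chi_i(t) - \chi_i(pt) = (-1)^{i-1}\mu(t)\chi_i(t)\bigl(1 - \chi_i(pt)\bigr)$ if $pt \mid P_z$ and $p < q(t)$. To verify this: if $\chi_i(t) = \chi_i(pt)$, then both sides are $0$, and this is the case if $\chi_i(pt) = 1$ (since the $\chi_i$ are divisor closed). If $\chi_i(t) = 1$ and $\chi_i(pt) = 0$, then from the properties of $\chi_i$ listed above we have $\mu(t) = (-1)^{i-1}$, and so the relation holds. So finally we get

\[ \sum_{d \mid P_z} \mu(d)\chi_i(d)\frac{\omega(d)}{d} = W(z)\Bigl(1 + (-1)^{i-1} \sum_{p < z} \frac{\omega(p)}{p}\,\frac{W(p)}{W(z)} \sum_{t \mid P(p^+,z)} \chi_i(t)\bigl(1 - \chi_i(pt)\bigr)\frac{\omega(t)}{t}\Bigr). \]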


This identity holds in general for every combinatorial sieve with $\chi_i$ satisfying the properties listed above, provided $W(z)$ and $W(p)$ are well defined. This will happen if $g(d)$ stays bounded.

Construction of the Divisor sets: Let $r$ be a positive integer and let $z_i$ for $1 \le i \le r$ be real numbers. We will divide the interval $[2, z]$ into $r$ intervals as follows: let

\[ 2 = z_r < z_{r-1} < \cdots < z_1 < z_0 = z. \]

Let $d \mid P_z$ and $\beta_n = \gcd(d, P(z_n,z))$ for $1 \le n \le r$. Let us set $\chi_i(d) = 1$ if for all $1 \le n \le r$ we have $\nu(\beta_n) \le A + Cn$, where $A$ and $C$ will be picked to make $\chi_i$ an acceptable function. For the current choice $\chi_i$ is already divisor closed, so the only property we need to check is:

\[ \chi_i(t) = 1,\ \mu(t) = (-1)^i \ \Rightarrow\ \forall\, pt \mid P_z,\ p < q(t) : \chi_i(pt) = 1. \]

Let $z_m \le p < z_{m-1}$. Since $\chi_i(t) = 1$ we should have $\nu(\beta_m) \le A + Cm$. If $\nu(\beta_m) < A + Cm$ then $\chi_i(pt) = 1$. Now if $\nu(\beta_m) = A + Cm$, then we also have $\mu(t) = (-1)^i$. By definition $\mu(t) = (-1)^{\nu(t)}$, and since $\nu(t) = A + Cm$, we have $\mu(t) = (-1)^{A + Cm}$. This suggests that we set $A = B - i$. Then we have that $(-1)^{B + Cm} = 1$, or $B + Cm$ should be even. If we make $B + Cm$ an odd number, then the assumption that $\chi_i(t) = 1$ and $\mu(t) = (-1)^i$ results in a contradiction. Consequently, $\nu(\beta_m) = A + Cm$ cannot happen if $\chi_i(t) = 1$ and $\mu(t) = (-1)^i$. For some integer $b$ we set $B = 2b - 1$ and $C = 2$. This suggests using $\nu(\beta_n) \le 2b - 1 - i + 2n$ as the condition on the number of prime factors of $d$ in the interval $[z_n, z)$. Summarizing, the characteristic functions of the divisor sets will be (for $i = 1,2$)

\[ \chi_i(d) = \begin{cases} 1 & \text{if } \forall\, m : 1 \le m \le r,\ \nu(\beta_m) \le 2b - i - 1 + 2m, \\ 0 & \text{otherwise.} \end{cases} \]

The construction was such that the above function is the characteristic function of an acceptable divisor set.
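The construction can be checked mechanically on a small example. The sketch below builds $\chi_1$ and $\chi_2$ for the primes below $z = 30$ with the cut points $z_0 = 30$, $z_1 = 10$, $z_2 = 2$ and $b = 1$ (all of these values are arbitrary choices made here for illustration), verifies that both functions are divisor closed and satisfy property 4, and confirms the resulting bounds on $S(A;P_z,x)$ for $A = \{1,\ldots,x\}$.

```python
from itertools import combinations
from math import prod


def make_chi(i, b, cuts):
    """chi_i for cut points cuts = [z_0, z_1, ..., z_r] with z_0 = z > ... > z_r = 2.

    A divisor d is given as a tuple of its (distinct) prime factors, all < z.
    chi_i(d) = 1 iff, for every n, d has at most 2b - i - 1 + 2n prime factors >= z_n.
    """
    def chi(d):
        for n in range(1, len(cuts)):
            if sum(1 for p in d if p >= cuts[n]) > 2 * b - i - 1 + 2 * n:
                return 0
        return 1
    return chi


def divisors(primes):
    """All squarefree divisors of prod(primes), each as a sorted tuple of primes."""
    for k in range(len(primes) + 1):
        yield from combinations(primes, k)


def is_divisor_closed(chi, primes):
    """chi(d) = 1 implies chi = 1 on d with any single prime removed (hence on all divisors)."""
    return all(chi(d[:j] + d[j + 1:]) == 1
               for d in divisors(primes) if chi(d) == 1
               for j in range(len(d)))


def has_property_4(chi, i, primes):
    """chi(t) = 1 and mu(t) = (-1)^i imply chi(pt) = 1 for every prime p < q(t)."""
    for t in divisors(primes):
        if chi(t) == 1 and (-1) ** len(t) == (-1) ** i:
            smaller = [p for p in primes if not t or p < t[0]]
            if any(chi((p,) + t) == 0 for p in smaller):
                return False
    return True


def truncated_sum(chi, primes, x):
    """Sum of mu(d) chi(d) |A_d| for A = {1..x}, with |A_d| = floor(x / d)."""
    return sum((-1) ** len(d) * chi(d) * (x // prod(d)) for d in divisors(primes))


primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]      # primes below z = 30
cuts = [30, 10, 2]                                  # z_0 = z, z_1 = 10, z_2 = 2
x = 10_000
exact = sum(1 for n in range(1, x + 1) if all(n % p for p in primes))
chi1, chi2 = make_chi(1, 1, cuts), make_chi(2, 1, cuts)
# Lower bound, exact sifted count, upper bound.
print(truncated_sum(chi2, primes, x), exact, truncated_sum(chi1, primes, x))
```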

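Derivation of the Sieve bounds: Now

\begin{align*}
&\sum_{1 \le n \le r} \ \sum_{z_n \le p < z_{n-1}} \frac{\omega(p)W(p)}{pW(z)} \sum_{t \mid P(p^+,z)} \chi_i(t)\bigl(1 - \chi_i(pt)\bigr)\frac{\omega(t)}{t} \\
&\qquad \le \sum_{1 \le n \le r} \frac{W(z_n)}{W(z)} \sum_{z_n \le p < z_{n-1}} \frac{\omega(p)}{p} \sum_{t \mid P(p^+,z)} \chi_i(t)\bigl(1 - \chi_i(pt)\bigr)\frac{\omega(t)}{t}.
\end{align*}

We have used the fact that $W(p) \le W(z_n)$ if $z_n \le p < z_{n-1}$. Now for each $t$ which makes a contribution we have $\chi_i(pt) = 0$ and $\chi_i(t) = 1$. So we must have $\nu(t) = 2b - i + 2n - 1$ for $z_n \le p < z_{n-1}$. Hence this sum is at most

\[ \sum_{1 \le n \le r} \frac{W(z_n)}{W(z)} \sum_{d \mid P(z_n,z),\ \nu(d) = 2b - i + 2n} \frac{\omega(d)}{d}, \]

and so

\[ \sum_{p < z} \frac{\omega(p)W(p)}{pW(z)} \sum_{t \mid P(p^+,z)} \chi_i(t)\bigl(1 - \chi_i(pt)\bigr)\frac{\omega(t)}{t} \le \sum_{1 \le n \le r} \frac{W(z_n)}{W(z)}\,\frac{1}{(2b - i + 2n)!} \Bigl(\sum_{z_n \le p < z} \frac{\omega(p)}{p}\Bigr)^{2b - i + 2n}. \]

Now to simplify this sum further we have to make some assumptions about the function $\omega(p)$; instead of assuming $\omega(p) = O(1)$ we shall use the more general assumption:

\[ \sum_{w \le p < z} \frac{\omega(p)\ln p}{p} \le \kappa \ln\frac{z}{w} + \eta, \quad \text{for } 2 \le w \le z. \tag{2.12} \]

If indeed we had $\omega(p) = 1$, then we have

\[ \sum_{w \le p < z} \frac{\ln p}{p} \le \ln\frac{z}{w} + 1, \quad \text{for } 2 \le w \le z. \]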


So the assumption we have made is an assumption on the average distribution of ω(p). Such an assumption usually holds, and is much easier to verify in more complicated situations.

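A question we can ask is: does the above assumption imply a bound for the sum

\[ \sum_{w \le p < z} \frac{\omega(p)}{p}\,? \]

Let

\[ S(k) \equiv \sum_{w \le p < k} \frac{\omega(p)\ln p}{p}. \]

Since $S(k) - S(k-1) = \frac{\omega(k)\ln k}{k}$ if $k$ is prime we have

\begin{align*}
\sum_{w \le p < z} \frac{\omega(p)}{p} &= \sum_{w \le k < z} \frac{S(k) - S(k-1)}{\ln k} \\
&= \sum_{w \le k < z-1} S(k)\Bigl(\frac{1}{\ln k} - \frac{1}{\ln(k+1)}\Bigr) \\
&= \sum_{w \le k < z-1} S(k)\Bigl(\frac{\ln(k+1) - \ln k}{\ln k\,\ln(k+1)}\Bigr).
\end{align*}

Now

\[ \ln(k+1) = \ln k + \ln\Bigl(1 + \frac{1}{k}\Bigr), \]

and since $1 + x \le e^x$ we have $\ln\bigl(1 + \frac{1}{k}\bigr) \le \frac{1}{k}$. Then

\begin{align*}
\sum_{w \le p < z} \frac{\omega(p)}{p} &\le \sum_{w \le k < z-1} \frac{S(k)}{k \ln^2 k} \tag{2.13} \\
&\le \sum_{w \le k < z-1} \frac{\kappa \ln\frac{k}{w} + \eta}{k \ln^2 k} \tag{2.14} \\
&\le \kappa\Bigl(\sum_{w \le k < z-1} \frac{1}{k \ln k} - \ln w \sum_{w \le k < z-1} \frac{1}{k \ln^2 k}\Bigr) + \eta \sum_{w \le k < z-1} \frac{1}{k \ln^2 k} \tag{2.15} \\
&\le \kappa \ln\frac{\ln z}{\ln w} + \frac{\eta}{\ln w}. \tag{2.16}
\end{align*}

Here we have used

\[ \int_w^z \frac{1}{x \ln x}\,dx = -\ln\ln w + \ln\ln z, \]

and

\[ \int_w^z \frac{1}{x \ln^2 x}\,dx = \frac{1}{\ln w} - \frac{1}{\ln z}. \]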
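For the model case $\omega(p) = 1$, $\kappa = \eta = 1$, both the hypothesis (2.12) and the conclusion (2.16) can be compared with actual prime sums. The sketch below is a numerical illustration for a few ranges $[w, z)$ chosen arbitrarily here; it is not a substitute for the derivation.

```python
from math import log


def primes_below(n):
    """Sieve of Eratosthenes returning all primes p < n."""
    flags = [True] * max(n, 2)
    flags[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if flags[p]:
            for multiple in range(p * p, n, p):
                flags[multiple] = False
    return [p for p in range(n) if flags[p]]


def compare_bounds(w, z):
    """Return ((sum ln p / p, bound 2.12), (sum 1/p, bound 2.16)) with omega = kappa = eta = 1."""
    ps = [p for p in primes_below(z) if p >= w]
    weighted = sum(log(p) / p for p in ps)
    plain = sum(1 / p for p in ps)
    return ((weighted, log(z / w) + 1),
            (plain, log(log(z) / log(w)) + 1 / log(w)))


for w, z in [(2, 100), (10, 1000), (100, 10000)]:
    print(w, z, compare_bounds(w, z))
```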

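Now returning to our original problem we need bounds on

\[ \frac{W(z_n)}{W(z)} = \prod_{z_n \le p < z} \frac{1}{1 - \frac{\omega(p)}{p}}, \]

and so

\[ \ln\frac{W(z_n)}{W(z)} \approx \sum_{z_n \le p < z} \frac{\omega(p)}{p}. \]

Our assumption (2.12) yields a bound on

\[ \sum_{z_n \le p < z} \frac{\omega(p)}{p} \]

by (2.16). Thus we expect that a bound of the form

\[ \frac{W(z_n)}{W(z)} \le e^{\gamma\left(n\lambda + \frac{c}{\ln z}\right)} \]

can be enforced with some constants $\gamma$, $\lambda$ and $c$. This can be achieved, for example, with a double-exponential fall-off of $z_n$ with respect to $z$; in fact this is what we shall do later. If a bound for $\frac{W(z_n)}{W(z)}$ of the above form exists, then this also gives us (as we might expect)

\[ \sum_{z_n \le p < z} \frac{\omega(p)}{p} \le \sum_{z_n \le p < z} \ln\frac{1}{1 - \frac{\omega(p)}{p}} \le \ln\frac{W(z_n)}{W(z)} < \gamma\Bigl(n\lambda + \frac{c}{\ln z}\Bigr). \]

Let $f = \frac{c}{\ln z}$, and suppose we can enforce $\gamma = 2$ (this helps in the simplification to follow). Then

\begin{align*}
\sum_{1 \le n \le r} \frac{W(z_n)}{W(z)} \sum_{d \mid P(z_n,z),\ \nu(d) = 2b - i + 2n} \frac{\omega(d)}{d}
&\le \sum_{1 \le n \le r} e^{2n\lambda + 2f}\,\frac{(2n\lambda + 2f)^{2b - i + 2n}}{(2b - i + 2n)!} \\
&\le \sum_{1 \le n \le r} e^{2f} e^{2\lambda n}\,\frac{(2n)^{2b - i + 2n}}{(2n)!\,(2n)^{2b - i}} \Bigl(\lambda + \frac{f}{n}\Bigr)^{2b - i + 2n}
\qquad \text{(since $(2b - i + 2n)! \ge (2n)!\,(2n)^{2b - i}$)} \\
&= \sum_{1 \le n \le r} e^{2f} (\lambda e^{\lambda})^{2n}\,\frac{(2ne^{-1})^{2n} e^{2n}}{(2n)!}\,\lambda^{2b - i}\Bigl(1 + \frac{f}{n\lambda}\Bigr)^{2b - i}\Bigl(1 + \frac{f}{n\lambda}\Bigr)^{2n} \\
&\le e^{2f} (\lambda + f)^{2b - i} \sum_{1 \le n \le r} \frac{(2ne^{-1})^{2n}}{(2n)!}\,(\lambda e^{1+\lambda})^{2n}\Bigl(1 + \frac{f}{n\lambda}\Bigr)^{2n}.
\end{align*}

Now $\frac{(ne^{-1})^n}{n!}$ is decreasing, and $\bigl(1 + \frac{f}{n\lambda}\bigr)^{2n} \le e^{2f/\lambda}$. Assuming also that $\lambda e^{1+\lambda} < 1$, we get

\begin{align*}
\sum_{1 \le n \le r} \frac{W(z_n)}{W(z)} \sum_{d \mid P(z_n,z),\ \nu(d) = 2b - i + 2n} \frac{\omega(d)}{d}
&\le e^{2f} (\lambda + f)^{2b - i}\,2e^{-2}\,e^{2f/\lambda} \sum_{1 \le n} (\lambda e^{1+\lambda})^{2n} \\
&= \frac{2\lambda^{2b - i + 2} e^{2\lambda}}{1 - (\lambda e^{1+\lambda})^2}\Bigl(1 + \frac{f}{\lambda}\Bigr)^{2b - i} e^{2f\left(1 + \frac{1}{\lambda}\right)} \\
&\le \frac{2\lambda^{2b - i + 2} e^{2\lambda}}{1 - (\lambda e^{1+\lambda})^2}\,e^{(2b - i + 4)\frac{f}{\lambda}}.
\end{align*}

Thus

\[ \sum_{d \mid P_z} \mu(d)\chi_i(d)\frac{\omega(d)}{d} = W(z)\Bigl(1 + 2\theta\,\frac{\lambda^{2b - i + 2} e^{2\lambda}}{1 - (\lambda e^{1+\lambda})^2}\,e^{(2b - i + 4)\frac{f}{\lambda}}\Bigr) \quad \text{for } i = 1,2. \]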


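Now we have to bound the remainder term, which is significantly easier. Let us assume that $\omega(p) \le A$ for some constant $A > 0$. Then

\begin{align*}
\sum_{d \mid P_z} \chi_i(d)|R_d| &\le \sum_{d \mid P_z} \chi_i(d)\omega(d) \\
&\le \Bigl(1 + \sum_{p < z} \omega(p)\Bigr)^{2b - i + 1} \prod_{1 \le n \le r-1} \Bigl(1 + \sum_{p < z_n} \omega(p)\Bigr)^2 \\
&\le \bigl(1 + A(2\,\mathrm{li}\,z + 3)\bigr)^{2b - i + 1} \prod_{1 \le n \le r-1} \bigl(1 + A(2\,\mathrm{li}\,z_n + 3)\bigr)^2
\end{align*}

for $i = 1,2$.

Selection of the intervals: We select the numbers $z_n$ with an exponential fall-off in the logarithm. Let $\Lambda > 0$ be a real number. Define

\[ \ln z_n = e^{-n\Lambda} \ln z \quad \text{for } n = 1, \ldots, r-1; \]

and set $z_r = 2$. Here $r$ is selected such that

\[ \ln z_{r-1} = e^{-(r-1)\Lambda} \ln z > \ln 2, \]

and

\[ e^{-r\Lambda} \ln z \le \ln 2, \]

so we have

\[ e^{(r-1)\Lambda} < \frac{\ln z}{\ln 2} \le e^{r\Lambda}. \]

Thus for a suitable constant $B$ the remainder term becomes

\[ \sum_{d \mid P_z} \chi_i(d)|R_d| \le \Bigl(\frac{Bz}{\ln z}\Bigr)^{2b - i + 1} \prod_{1 \le n < r} \Bigl(\frac{B z_n e^{n\Lambda}}{\ln z}\Bigr)^2 = \Bigl(\frac{Bz}{\ln z}\Bigr)^{2b - i + 1} \Bigl(\frac{B e^{\frac{1}{2} r\Lambda}}{\ln z}\Bigr)^{2(r-1)} \prod_{1 \le n \le r-1} z_n^2. \]

Now

\[ \frac{B e^{\frac{1}{2} r\Lambda}}{\ln z} \le \frac{B e^{\Lambda/2}}{\ln z} \sqrt{\frac{\ln z}{\ln 2}} < 1 \]

for large enough $z$, and also

\[ \prod_{1 \le n \le r-1} z_n^2 = \exp\Bigl(2 \ln z \sum_{1 \le n \le r-1} e^{-n\Lambda}\Bigr) \le z^{\frac{2}{e^{\Lambda} - 1}}. \]

Thus

\[ \sum_{d \mid P_z} \chi_i(d)|R_d| = O\Bigl(z^{2b - i + 1 + \frac{2}{e^{\Lambda} - 1}}\Bigr) \quad \text{for } i = 1,2. \]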
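The choice of cut points is simple to compute. The following sketch, with the hypothetical helper `brun_intervals`, takes $z$ and $\Lambda$, determines $r$ from $e^{(r-1)\Lambda} < \ln z/\ln 2 \le e^{r\Lambda}$, and returns $z_0 > z_1 > \cdots > z_r = 2$; the sample values $z = 10^6$ and $\Lambda = 0.5$ are arbitrary.

```python
from math import ceil, exp, log, prod


def brun_intervals(z, big_lambda):
    """Return (r, [z_0, ..., z_r]) with ln z_n = exp(-n * Lambda) * ln z and z_r = 2.

    Requires z > 2 and Lambda > 0, so that r >= 1.
    """
    ratio = log(z) / log(2)
    r = ceil(log(ratio) / big_lambda)          # smallest r with exp(r * Lambda) >= ratio
    cuts = [float(z)]
    cuts += [exp(exp(-n * big_lambda) * log(z)) for n in range(1, r)]
    cuts.append(2.0)
    return r, cuts


z, big_lambda = 10 ** 6, 0.5
r, cuts = brun_intervals(z, big_lambda)
print(r, [round(c, 3) for c in cuts])
# Product of z_n^2 over 1 <= n <= r-1, compared with z^(2 / (e^Lambda - 1)).
print(prod(c * c for c in cuts[1:r]), z ** (2 / (exp(big_lambda) - 1)))
```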

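We still have to check that $\frac{W(z_n)}{W(z)} \le e^{2(n\lambda + f)}$. By our assumptions about the sum $\sum_{w \le p < z} \frac{\omega(p)\ln p}{p}$ we have

\[ \frac{W(z_n)}{W(z)} \le \exp\Bigl(n\Lambda\kappa + \frac{2c e^{n\Lambda}}{\ln z}\Bigr) = e^{\frac{2c}{\ln z}} \exp\Bigl(n\Bigl(\Lambda\kappa + \frac{2c}{\ln z}\,\frac{e^{n\Lambda} - 1}{n}\Bigr)\Bigr), \quad n = 1, \ldots, r. \]

If $1 \le \frac{1}{1 - \frac{\omega(p)}{p}} \le A$, then

\[ c = \frac{\eta}{2}\Bigl(1 + A\kappa + \frac{\eta A}{\ln 2}\Bigr). \]

Since $\Lambda > 0$ we have

\[ \frac{e^{n\Lambda} - 1}{n} \le \frac{e^{r\Lambda} - 1}{r}, \]

and this is at most

\[ \Lambda\,\frac{e^{r\Lambda}}{r\Lambda} \le \Lambda\,\frac{e^{\Lambda}}{\ln 2}\,\frac{\ln z}{\ln(\ln z/\ln 2)}. \]

So we get

\[ \frac{W(z_n)}{W(z)} \le e^{\frac{2c}{\ln z}} \exp\Bigl(n\Lambda\kappa\Bigl(1 + \frac{2c e^{\Lambda}}{\kappa \ln 2}\,\frac{1}{\ln(\ln z/\ln 2)}\Bigr)\Bigr) \quad \text{for } n = 1, \ldots, r. \]

To meet our conditions on $\frac{W(z_n)}{W(z)}$ we take

\[ \Lambda = \frac{2\lambda}{\kappa}\,\frac{1}{1 + \varepsilon}, \qquad \varepsilon = \frac{1}{\delta e^{1/\kappa}}, \]

and so

\[ e^{\frac{2\lambda}{\kappa}} - e^{\Lambda} \le \Bigl(\frac{2\lambda}{\kappa} - \Lambda\Bigr) e^{\frac{2\lambda}{\kappa}} \le \varepsilon \Lambda e^{\frac{1}{\kappa}}. \]

Since $e^{\Lambda} - 1 \ge \Lambda$ we have

\[ \frac{e^{\frac{2\lambda}{\kappa}} - 1}{e^{\Lambda} - 1} \le 1 + \frac{\varepsilon \Lambda e^{\frac{1}{\kappa}}}{e^{\Lambda} - 1} \le 1 + \varepsilon e^{\frac{1}{\kappa}} = 1 + \frac{1}{\delta}. \]

With $\xi = 1 + \frac{1}{\delta}$ we obtain

\[ \sum_{d \mid P_z} \chi_i(d)|R_d| = O\Bigl(z^{2b - i + 1 + \frac{2\xi}{e^{2\lambda/\kappa} - 1}}\Bigr) \quad \text{for } i = 1,2. \]

Thus we have proved the following theorem.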

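THEOREM 2.2.2. Assume that

\[ 1 \le \frac{1}{1 - \frac{\omega(p)}{p}} \le A, \]

\[ \sum_{w \le p < z} \frac{\omega(p)\ln p}{p} \le \kappa \ln\frac{z}{w} + \eta, \]

and

\[ |R_d| \le \omega(d). \]

Let $\lambda$ be such that $0 < \lambda e^{1+\lambda} < 1$. Then

\[ S(A;P_z,x) \le xW(z)\Bigl(1 + 2\,\frac{\lambda^{2b+1} e^{2\lambda}}{1 - (\lambda e^{1+\lambda})^2} \exp\Bigl((2b+3)\frac{c}{\lambda \ln z}\Bigr)\Bigr) + O\Bigl(z^{2b + \frac{2\xi}{e^{2\lambda/\kappa} - 1}}\Bigr), \tag{2.17} \]

and

\[ S(A;P_z,x) \ge xW(z)\Bigl(1 - 2\,\frac{\lambda^{2b} e^{2\lambda}}{1 - (\lambda e^{1+\lambda})^2} \exp\Bigl((2b+2)\frac{c}{\lambda \ln z}\Bigr)\Bigr) + O\Bigl(z^{2b - 1 + \frac{2\xi}{e^{2\lambda/\kappa} - 1}}\Bigr), \tag{2.18} \]

where

\[ c = \frac{\eta}{2}\Bigl(1 + A\Bigl(\kappa + \frac{\eta}{\ln 2}\Bigr)\Bigr) \]

and $\xi = 1 + \varepsilon$ for $0 < \varepsilon < 1$.

Application to the Twin Primes problem: We set $A = \{n(n+2) \mid n \le x\}$. In this case we have $\omega(2) = 1$ and $\omega(p) = 2$ for $p > 2$. Further, all the conditions of Theorem (2.2.2) hold, and the lower bound is seen to be positive. Thus (2.18) tends to infinity with $x$ ([HR74], p.63) for $z = x^{1/u}$ with $u < 8$. This implies that every prime divisor of a number in the sifted set is $\ge x^{1/u}$, so each number in the sifted set can have at most $u < 8$ prime factors¹. Thus we have the following theorem.

THEOREM 2.2.3. There are infinitely many $n$ such that $\nu(n(n+2)) \le 7$.

We will look at some interesting applications of Brun's sieve in the following sections.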

2.3. Orthogonal Latin Squares and the Euler Conjecture

DEFINITION 2.3.1. A Latin square of order $n$ is an $n \times n$ matrix with entries in $S = \{1, \ldots, n\}$ such that every row and column is a permutation of the set $S$.

DEFINITION 2.3.2. Two Latin squares $A$ and $B$ of order $n$ are said to be mutually orthogonal if the $n^2$ pairs $(a_{ij}, b_{ij})$ are distinct.

Here is a Latin square of order 3:

\[ A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 3 & 1 & 2 \end{pmatrix}, \]

and here is a Latin square that is orthogonal to it:

\[ B = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \\ 2 & 3 & 1 \end{pmatrix}. \]

Euler conjectured that there are no mutually orthogonal Latin squares of order $n$, where $n \equiv 2 \bmod 4$. The conjecture was disproved for the case $n = 10$, and later Bose, Parker and Shrikhande [BPS60] showed that the conjecture is false for every such $n > 6$. Let $\perp(n)$ be the maximum number of mutually orthogonal Latin squares of order $n$. Chowla, Erdős and Straus [CES60], building on this and some previous results, established that $\perp(n) > \frac{1}{3} n^{\frac{1}{91}}$ for large enough $n$. The proof involves an interesting use of the Brun Sieve, and we shall give an account of this. The exponent $\frac{1}{91}$ is far from optimal and has been subsequently improved. The starting point for the proof is the following pair of results:

THEOREM 2.3.3. [BPS60] If $k \le \perp(m) + 1$ and $1 < u < m$ then

\[ \perp(km + u) \ge \min\{\perp(k), \perp(k+1), \perp(m) + 1, \perp(u) + 1\} - 1. \]

THEOREM 2.3.4 (MacNeish).

1. $\perp(ab) \ge \min\{\perp(a), \perp(b)\}$;
2. $\perp(q) = q - 1$ if $q$ is a power of a prime.

First we shall prove the following:

THEOREM 2.3.5.

\[ \lim_{n \to \infty} \perp(n) = \infty. \]

¹ For a similar derivation see Theorem (2.3.6).


Proof : The idea is to have a lower bound on each of the quantities involved in Theorem (2.3.3), and then use the theorem with $km + u = n$.

Let $x$ be a large positive integer. If

\[ k + 1 = \prod_{p \le x} p^x, \]

then by Theorem (2.3.4) we have $\perp(k+1) \ge 2^x - 1 \ge x$. Also, since $k \equiv -1 \bmod p$ for $p \le x$, all the prime factors of $k$ are larger than $x$, so applying Theorem (2.3.4) again we have $\perp(k) \ge x$. Now we select $m$ in two pieces $m_1$ and $m_2$. The first piece is set to be

\[ m_1 = k^k \prod_{q \nmid n,\ q \le x} q^k. \]

Note that $m_1$ is bounded in terms of $x$ alone. Now if $n$ is large enough the interval

\[ \Bigl[\frac{n}{(k+1)m_1},\ \frac{n-1}{k m_1}\Bigr] \]

contains an integer $m_2$ such that $m_2 \equiv 1 \bmod k!$, simply because the length of the interval becomes larger than $k!$. Now set $m = m_1 m_2$; then

\[ \perp(m) \ge \min\{\perp(m_1), \perp(m_2)\} \ge \min\{2^k - 1, k\} \ge k. \]

Thus we have $\perp(m) + 1 \ge k$, which satisfies the condition of Theorem (2.3.3). Set $u = n - km$; we need to bound $\perp(u)$, but first we need to verify that $1 < u < m$. We have

\[ \frac{n}{(k+1)m_1} < m_2 < \frac{n-1}{k m_1}, \quad \text{or} \quad \frac{n}{k+1} < m < \frac{n-1}{k}. \]

This yields $km + 1 < n$ and $km + m > n$, which implies that $1 < u < m$. Let $p \le x$; then $km \not\equiv n \bmod p$. This is because $k$ has prime factors only above $x$, $m_1$ has a small prime factor only if it does not divide $n$, and $m_2$ has prime factors only above $k \ge x$. Thus $u = n - km \not\equiv 0 \bmod p$ for $p \le x$, and so no prime up to $x$ divides $u$. Thus we get $\perp(u) \ge x$. Now applying Theorem (2.3.3) we have $\perp(km + u) = \perp(n) \ge x$.
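The definitions and the MacNeish bound used in this proof are easy to experiment with. The sketch below checks that the two order-3 squares given earlier are Latin and mutually orthogonal, and computes the lower bound $\min_j (q_j - 1)$ over the prime-power factors $q_j$ of $n$ that follows from Theorem 2.3.4; the function names are illustrative.

```python
def is_latin_square(square):
    """True if every row and every column is a permutation of {1, ..., n}."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


def are_orthogonal(a, b):
    """True if the n^2 pairs (a_ij, b_ij) are all distinct."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n


def macneish_lower_bound(n):
    """min over prime-power factors q of n of (q - 1); a lower bound for the
    maximum number of mutually orthogonal Latin squares of order n >= 2."""
    bound, p = None, 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                n //= p
                q *= p
            bound = q - 1 if bound is None else min(bound, q - 1)
        p += 1
    if n > 1:                                   # leftover prime factor
        bound = n - 1 if bound is None else min(bound, n - 1)
    return bound


A = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
B = [[1, 2, 3], [3, 1, 2], [2, 3, 1]]
print(is_latin_square(A), is_latin_square(B), are_orthogonal(A, B))
print([(n, macneish_lower_bound(n)) for n in (9, 10, 12, 15)])
```

For $n \equiv 2 \bmod 4$ the MacNeish bound is only 1, which is why the Euler conjecture needs the stronger construction of Theorem 2.3.3.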

Note that this has already disproved Euler's conjecture. It is clear that our method of proof relied on our ability to produce some numbers with large prime factors and some congruence properties; this indicates that a sieve argument might help. The necessary machinery from sieves is encapsulated in the following theorem:

THEOREM 2.3.6. [Rad24] Let $p_1, \ldots, p_r$ be primes, and let $a_i < p_i$, $b_i < p_i$ be non-negative integers for $1 \le i \le r$. Let $D > 1$ be an integer with $\gcd(D, p_i) = 1$ for each $i$, $1 \le i \le r$, and let $\Lambda$ be an integer, $0 < \Lambda < D$, such that $\gcd(\Lambda, D) = 1$. Let

\[ P(D,x; p_1,a_1,b_1; p_2,a_2,b_2; \cdots; p_r,a_r,b_r) = \bigl|\{n \le x \mid n \equiv \Lambda \ (\mathrm{mod}\ D),\ (\forall i : 1 \le i \le r) : n \not\equiv a_i \ (\mathrm{mod}\ p_i),\ n \not\equiv b_i \ (\mathrm{mod}\ p_i)\}\bigr|. \]

If $p_1 < p_2 < \cdots < p_r$ and $p_i > 2$, then

\[ P(D,x; p_1,a_1,b_1; \cdots; p_r,a_r,b_r) > \frac{Cx}{D \ln^2 p_r} - C' p_r^{7.938}, \]

where $C$ and $C'$ are positive constants.

REMARK 2.3.7. The original theorem has 7.9 instead of our slightly worse 7.938, but this can be improved using a more detailed analysis of our proof.

Proof : The quantity $S(A;P_z,x)$ is the number of integers in $A$ that are $\not\equiv 0$ modulo $p_i$ for each $p_i \in P$, $p_i \le z$. In this case we have two constraints for each prime $p_i$. But we can collapse these two constraints into one as follows. The constraint for the prime $p_i$ is that $n \not\equiv a_i$, $n \not\equiv b_i$ modulo $p_i$. So the constraint fails iff $(n - a_i)(n - b_i) \equiv 0 \bmod p_i$. Let $A = \{n \le x \mid n \equiv \Lambda \bmod D\}$, $A_{p_i} = \{n \le x \mid (n - a_i)(n - b_i) \equiv 0 \bmod p_i\}$, and if $d = p_{i_1} \cdots p_{i_k}$ then $A_d = \{n \le x \mid \prod_{1 \le j \le k} (n - a_{i_j})(n - b_{i_j}) \equiv 0 \bmod d\}$. Suppose $|A_{p_i}| = \frac{\omega(p_i)}{p_i} x + R_{p_i}$; then we see that if $d$ is squarefree then $|A_d| = \frac{\omega(d)}{d} x + R_d$, where $\omega(d)$ is defined multiplicatively. Thus we are interested in the estimate:


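\[ P(D,x; p_1,a_1,b_1; \cdots; p_r,a_r,b_r) = |A| - \sum |A_{p_i}| + \sum |A_{p_i p_j}| - \cdots = \sum_{d \mid p_1 \cdots p_r} \mu(d)|A_d|, \]

which is just the sieve estimate.

The congruence $(n - a_i)(n - b_i) \equiv 0 \bmod p_i$ has at most 2 solutions modulo $p_i$, so we can take $\omega(p_i) = 2$ for each $i$. We will try to apply Brun's Sieve to this problem.

We just need to verify that the conditions of the proof of Theorem (2.2.2) are valid. First, $\frac{1}{1 - \frac{\omega(p)}{p}} \le 3$, so $A = 3$. Next,

\[ \sum_{w \le p < z,\ p \in \{p_1, \ldots, p_r\}} \frac{\omega(p)\ln p}{p} \le 2 \sum_{w \le p < z} \frac{\ln p}{p} \le 2\Bigl(\ln\frac{z}{w} + 1\Bigr), \]

from which we have $\kappa = 2$ and $\eta = 2$. $R_d \le \omega(d)$ also holds. Thus by the lower bound we have (with $b = 1$):

\[ S(A; P = \{p_1, \ldots, p_r\}, z) \ge |A^x|\,W(z)\Bigl(1 - \frac{2(\lambda e^{\lambda})^2}{1 - (\lambda e^{1+\lambda})^2} \exp\Bigl(\frac{4c}{\lambda \ln z}\Bigr)\Bigr) + O\Bigl(z^{1 + \frac{2\xi}{e^{\lambda} - 1}}\Bigr). \]

So all we need to show is that there is a $\lambda$ such that

\[ 1 + \frac{2 + 2\varepsilon}{e^{\lambda} - 1} \le u \le 7.938 \]

and

\[ 1 - \frac{2(\lambda e^{\lambda})^2}{1 - (\lambda e^{1+\lambda})^2} > 0. \]

Then the second condition implies

\[ \lambda e^{\lambda} < \frac{1}{\sqrt{2 + e^2}} \approx 0.3263540699\cdots \]

and the first implies

\[ \frac{2 + 2\varepsilon}{6.938} + 1 \le e^{\lambda}. \]

Now set $\xi = \frac{10}{9}$, so we must have $\lambda \ge \log 1.288267513692707$. This value of $\lambda$ also satisfies the other constraint. Now we take $z = p_r$, and using $|A^x| = \frac{x}{D} + \theta$, $|\theta| < 1$,

\[ S(A; P_{p_r}, x) \ge \frac{Cx}{D} \prod_{1 \le i \le r} \Bigl(1 - \frac{2}{p_i}\Bigr) + O(p_r^{7.938}), \]

and also

\[ \prod_{i} \Bigl(1 - \frac{2}{p_i}\Bigr) \ge \prod_{2 < p \le p_r} \Bigl(1 - \frac{2}{p}\Bigr). \]

Now in

\[ \ln \prod_{2 < p \le p_r} \Bigl(1 - \frac{2}{p}\Bigr) = -2 \sum_{2 < p \le p_r} \frac{1}{p} - \sum_{2 < p \le p_r} \sum_{m > 1} \frac{2^m}{m p^m} \]

the second sum converges, so we have

\[ \prod_{2 < p \le p_r} \Bigl(1 - \frac{2}{p}\Bigr) = \frac{c_0}{\ln^2 p_r} + o\Bigl(\frac{1}{\ln^2 p_r}\Bigr) \]

for some constant $c_0 > 0$.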


The theorem follows.
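The numerical constants in this proof can be rechecked in a few lines. The sketch below evaluates $1/\sqrt{2+e^2}$, takes $\lambda = \log 1.288267513692707$ as quoted above, and confirms that $\lambda e^{\lambda}$ stays (narrowly) below the threshold, so the bracket in the lower bound is positive. As a side observation, the quoted value of $e^{\lambda}$ agrees with $1 + 2/6.938$ to several decimal places.

```python
from math import e, exp, log, sqrt


def lower_bound_factor(lam):
    """1 - 2 (lam e^lam)^2 / (1 - (lam e^(1+lam))^2): must be positive for the lower bound."""
    return 1 - 2 * (lam * exp(lam)) ** 2 / (1 - (lam * exp(1 + lam)) ** 2)


threshold = 1 / sqrt(2 + e ** 2)          # upper limit for lam * e^lam
lam = log(1.288267513692707)              # value of lambda quoted in the text
print(threshold, lam * exp(lam), lower_bound_factor(lam))
```

The margin is small, which is consistent with Remark 2.3.7: the exponent 7.938 is close to the limit of what this choice of parameters allows.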

Now we have the following simple lemma:

LEMMA 2.3.8. For all $c$, $0 < c < 1$, the number of integers $y \le x$ that are divisible by a prime factor $p > n^c$ of $n$ is at most $\frac{x}{c\,n^c}$.

Proof : At most $\frac{x}{p}$ integers $y \le x$ are divisible by $p$, and so the total number of such integers is at most

\[ \sum_{p \mid n,\ p > n^c} \frac{x}{p} \le \frac{x}{n^c} \sum_{p \mid n,\ p > n^c} 1 \le \frac{x}{c\,n^c}. \]

The last part follows because there can be at most $1/c$ prime factors of a number $n$ that are greater than $n^c$.
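A direct count illustrates Lemma 2.3.8. The sketch below uses the sample values $n = 2 \cdot 3 \cdot 101 \cdot 103$, $c = 0.3$ and $x = 10^4$ (chosen here only for illustration) and compares the number of $y \le x$ divisible by a prime factor $p > n^c$ of $n$ with the bound $x/(c\,n^c)$.

```python
def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def count_sharing_large_prime(n, c, x):
    """Number of y <= x divisible by some prime factor p > n^c of n."""
    large = [p for p in prime_factors(n) if p > n ** c]
    return sum(1 for y in range(1, x + 1) if any(y % p == 0 for p in large))


n, c, x = 2 * 3 * 101 * 103, 0.3, 10_000
print(count_sharing_large_prime(n, c, x), x / (c * n ** c))
```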

THEOREM 2.3.9. [CES60] There is an $n_0 > 0$ such that for all $n > n_0$, $\perp(n) > \frac{1}{3} n^{\frac{1}{91}}$.

Proof : The idea as before is to apply Theorem (2.3.3) to suitable $k$, $m$ and $u$ for a given $n$ such that $n = km + u$. For this to yield a lower bound on $\perp(n)$ we need lower bounds on $\perp(k)$, $\perp(k+1)$, $\perp(m)$ and $\perp(u)$. We begin with the selection of $k$: we need $k$ as well as $k+1$ to have no small prime factors. This is exactly the sort of problem handled by the theorem we have just proved. It turns out that the constraints on $k$ depend on the parity of $n$.

Case 1. ($n$ even). Consider the constraints:

\[ k \equiv -1 \bmod 2^{\lfloor \frac{1}{91} \lg n \rfloor}, \qquad k \not\equiv 0 \text{ or } -1 \bmod p \ \text{ for } 2 < p \le n^{\frac{1}{90}}, \qquad k < n^{\frac{1}{10}}. \]

The first congruence restricts $k$ to lie in an arithmetic progression with difference $2^{\lfloor \frac{1}{91} \lg n \rfloor} < c_1 n^{\frac{1}{91}}$. The second incongruence implies that both $k$ and $k+1$ are free of small prime factors, apart from the large power of 2 dividing $k+1$. Now applying Theorem (2.3.6) there are at least

\[ \frac{C n^{\frac{1}{10}}}{c_1 \bigl(\frac{1}{90}\bigr)^2 n^{\frac{1}{91}} \log^2 n} - C' n^{\frac{79.38}{10} \cdot \frac{1}{90}} = c_2\,\frac{n^{\frac{81}{910}}}{\log^2 n} - C' n^{\frac{79.38}{900}} > c_3\,\frac{n^{\frac{81}{910}}}{\log^2 n} \]

values of $k$ satisfying the constraints. By Lemma (2.3.8) the number of integers below $n^{\frac{1}{10}}$ that have a prime factor greater than $n^{\frac{1}{90}}$ in common with $n$ is at most $90\,n^{\frac{8}{90}}$. Thus from the bound for the values of $k$ we have that there is a $k$ such that $\gcd(k,n) = 1$. Just by our selection of $k$ we have that $k$ has no small prime factors, and though $k+1$ has 2 as a prime factor we still have that $k + 1 \equiv 0 \bmod 2^{\lfloor \frac{1}{91} \lg n \rfloor}$ and all the other prime factors are bigger than $n^{\frac{1}{90}}$, so using Theorem (2.3.4)

\[ \perp(k) > n^{\frac{1}{90}} - 1 > \tfrac{1}{3} n^{\frac{1}{91}}, \qquad \perp(k+1) > \min\bigl\{\tfrac{1}{2} n^{\frac{1}{91}},\ n^{\frac{1}{90}}\bigr\} - 1 > \tfrac{1}{3} n^{\frac{1}{91}}. \]

We now set $n = n_1 + n_2 k$ where $0 < n_1 < k$. We cannot directly use $n_1$ and $n_2$ in our application of Theorem (2.3.3), since we have no bounds for $\perp(n_1)$ and $\perp(n_2)$. Though we have freedom in our choice of $m$ we are still forced to pick $k$ as our quotient in the division of $n$ by $m$ to write $n = km + u$. This suggests picking a $u$ subject to certain conditions and then setting $m = \frac{n - u}{k}$. Again this immediately restricts us to look at numbers that are congruent to $n_1$ modulo $k$. Let


$u = n_1 + u_1 k$ where $u_1$ is picked according to the following conditions:

\[ u_1 \not\equiv n_1 \bmod 2, \qquad u_1 \not\equiv -\frac{n_1}{k} \bmod p \ \ (p \nmid k), \qquad u_1 \not\equiv n_2 \bmod p, \quad \text{for } 3 \le p \le k, \]

and $u_1 < n^{\frac{159}{200}}$. The first incongruence forces $u_1 k$ to be of opposite parity from $n_1$ and always fixes $u$ to be odd. In this setup we will set $m = \frac{n - u}{k} = n_2 - u_1$. We want $m$ to be free of small prime factors to guarantee a good lower bound for $\perp(m)$, and this is taken care of by the third incongruence. Meanwhile, the second incongruence arranges for $u$ itself to have no small prime factors. The limit on $u_1$ is forced on us because of the limitations of Theorem (2.3.6).

The restrictions of the incongruences modulo the primes 2, 3, and 5 can be handled by restricting $u_1$ to belong to an arithmetic progression with difference 30. To apply Theorem (2.3.6) we need $\gcd(u_1, 30) = 1$. If we had $\gcd(u_1, 30) > 1$, then we can set $u_1' = \frac{u_1}{\gcd(u_1, 30)}$ and apply Theorem (2.3.6). Thus there are at least

\[ \frac{C n^{\frac{159}{200}}}{30 \log^2 k} - C' k^{\frac{79.38}{10}} > c_4\,\frac{n^{\frac{159}{200}}}{\log^2 n} - C' n^{\frac{79.38}{100}} > 0 \]

choices for $u_1$, if $n$ is large enough. Now $u$ is not divisible by any prime $p \le k$. First suppose that $p \nmid k$; then $p \mid u$ contradicts the incongruence $n_1 \not\equiv -u_1 k \bmod p$. Next, if $p \mid k$, then $p \mid n_1$, which implies $p \mid n$, a contradiction to $\gcd(k,n) = 1$. Thus $\perp(u) \ge k$, but $k$ is not divisible by any prime $\le n^{\frac{1}{90}}$, so $\perp(u) \ge n^{\frac{1}{90}} > \frac{1}{3} n^{\frac{1}{91}}$. Now as promised we set $m = \frac{n - u}{k}$. We need to verify that $m > u > 1$ to apply Theorem (2.3.3), and observe that

\[ m > \frac{n}{n^{\frac{1}{10}}} - \bigl(1 + n^{\frac{159}{200}}\bigr) > \frac{1}{2} n^{\frac{9}{10}} > n^{\frac{1}{10}}\bigl(1 + n^{\frac{159}{200}}\bigr) > u > 1, \]

for large enough $n$. Furthermore, all prime factors of $m$ exceed $k$ by our choice of $u$, and hence:

\[ \perp(m) \ge k > \frac{1}{3} n^{\frac{1}{91}}. \]

Finally, putting all these together and applying Theorem (2.3.3) we get $\perp(n) > \frac{1}{3} n^{\frac{1}{91}}$ for large enough even numbers $n$.

Case 2. ($n$ odd). We apply Theorem (2.3.6) to $k+1$ instead, with the following constraints:

\[ k + 1 \equiv 1 \bmod 2^{\lfloor \frac{1}{91} \lg n \rfloor}, \qquad k + 1 \not\equiv 0 \text{ or } 1 \bmod p \ \ (p \le n^{\frac{1}{90}}), \qquad k + 1 \le n^{\frac{1}{10}}. \]

Now the argument proceeds with the roles of $k$ and $k+1$ interchanged, and the second set of constraints becomes:

\[ u_1 \not\equiv n_2 \bmod 2, \qquad u_1 \not\equiv -\frac{n_1}{k} \bmod p \ \ (p \le k,\ p \nmid k), \qquad u_1 \not\equiv n_2 \bmod p \ \ (p \le k), \]

and $u_1 < n^{\frac{159}{200}}$. So here both $n$ and $m$ are odd. The argument then proceeds similarly.

Better estimates for $\perp(n)$ are known; for example, in [Wil74] a bound $\perp(n) \ge n^{\frac{1}{17}} - 2$ is proved (for large enough $n$).

The current best estimate seems to be $\perp(n) \ge n^{\frac{1}{14.8}}$ [Be83].


2.4. A Theorem of Schinzel

In this section we will give an application involving a variation of Theorem 2.3.6, where we look at some constant number of constraints. The proof is an interesting use of Brun's sieve.

THEOREM 2.4.1. [Sch66] For all positive integers $h$ and $N \ge 3$ there is an integer $D$ such that:

1. $1 \le D \le (\log N)^{20h}$;
2. $\gcd(iD + 1, N) = 1$, for $1 \le i \le h$.

Proof : For $h = 1$ we can take $D = q - 1$, where $q$ is the least prime not dividing $N$. Since $\sum_{p \le D} \log p \le \log N$, we have from [RS62] Theorem 10 that either $D \le 100$ or $0.84D \le \log N$. Since $D \le N$ we have $D \le (\log N)^{20}$, for all $N \ge 3$. If $N \le (\log N)^{20h}$, then $D = N$ satisfies the conditions of the theorem, so we can assume $N > (\log N)^{20h}$, with $h \ge 2$. Now $N > (\log N)^{20h} \Rightarrow \log N > 20h \log\log N$. If $\log N < 110h$, then $N < e^{110h}$ and

\[ (\log N)^{20h} \ge (110h)^{20h} = e^{(\log 110h)\,20h} \ge e^{(\log 110 + \log h)\,20h} = e^{94.0096h + 20h \log h} \ge e^{114.0096h}, \]

which is a contradiction to $N > (\log N)^{20h}$. Hence we must have $\log N \ge 110h$, and $\log\log N \ge \log 110 + \log h \ge 5.3936$, or $\log\log N > 5$.
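As a sanity check on the statement of Theorem 2.4.1 (not on its proof), the least admissible $D$ can be found by brute force for small inputs and compared with the bound $(\log N)^{20h}$. The helper name `smallest_schinzel_D` and the sample value $N = 30030$ are choices made here for illustration; the search always terminates because $D = N$ satisfies the second condition.

```python
from math import gcd, log


def smallest_schinzel_D(h, N):
    """Least D >= 1 with gcd(i*D + 1, N) = 1 for all 1 <= i <= h."""
    D = 1
    while any(gcd(i * D + 1, N) != 1 for i in range(1, h + 1)):
        D += 1
    return D


N = 2 * 3 * 5 * 7 * 11 * 13                  # 30030
for h in (1, 2, 3):
    D = smallest_schinzel_D(h, N)
    print(h, D, D <= log(N) ** (20 * h))
```

For $h = 1$ the search returns $q - 1$ with $q$ the least prime not dividing $N$, matching the first step of the proof.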

Let $H = \prod_{p \le 10h} p$, and let $p_1, \dots, p_r$ be the primes $p_i > 10h$ such that $p_i \mid N$, ordered so that $p_1 < p_2 < \dots < p_r$. Let $P(H, x; p_1, \dots, p_r)$ be the number of integers $n \le x$ such that $n \equiv 0 \bmod H$ and

$$in + 1 \not\equiv 0 \bmod p_j \quad \text{for all } 1 \le i \le h,\ 1 \le j \le r.$$

Since $p_j > 10h$, every value of $i$ in the incongruences is invertible modulo $p_j$. Thus the above constraints are equivalent to a system of $h$ incongruences per prime (we had 2 such constraints in Theorem 2.3.6). Thus we have a system of incongruences $x \not\equiv a_{ij} \bmod p_j$, for some $a_{ij}$.

Here we are in a special situation of the sieve problem. The number of primes with respect to which we sift the sequence is very small: we sift only by the prime factors of $N$, of which there can be at most $\log N$. Hence we shall redo the analysis of the Brun sieve and thereby get a better estimate. Let $A = \{n \le x \mid n \equiv 0 \bmod H\}$, $P = \prod_{1 \le i \le r} p_i$, and let

$$A_{p_j} = \Bigl\{ n \in A \;\Big|\; \prod_{1 \le i \le h} (n - a_{ij}) \equiv 0 \bmod p_j \Bigr\}.$$

We extend the notation to $A_d$ for $d$ a divisor of $P$. We have that $P(H; x; p_1, \dots, p_r) = \sum_{d \mid P} \mu(d) |A_d|$. For $|A_{p_j}|$ the main term is $\frac{hx}{H p_j}$, so we can select $\omega(p) = h$, and $R_{p_j} \le h$ since for each congruence there is an error of at most 1 in the approximation. The denominator $H$ can be taken out of our analysis if we set $x \leftarrow x/H$. We also have that $R_d \le \omega(d)$.

Hence

$$W(k) = \prod_{1 \le i \le k} \left(1 - \frac{h}{p_i}\right).$$

From our earlier work in section (2), we have

P(H;x,p1,···,pr) >

xW(pr) H

(1+Θ)+R,

where

Θ = 1 ∑ i≤r

ω(pi) pi

W(pi) W(pr) ∑ t\P(i···r]

χ(t)(1 χ(pt)) t

ω(t)

and $P(i \cdots r) = \prod_{i < k \le r} p_k$. We let $1 \le r_t \le r_{t-1} \le \dots \le r_0 = r$ be a sequence of integers. These correspond to the real numbers $z_i$, but here we select the indices of the primes instead. We use the function $\chi \equiv \chi_2$ (in the proof of Theorem 2.2.2), with $P(r_i \cdots r)$ instead of $P(z_n, z)$ in the definition. We will show that in this case we can select the intervals $(r_i)$ such that $\Theta < 1$.

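Following the same argument as in Section 2 (with $b = 1$), we arrive at the following upper bound for $\Theta$:

$$\sum_{1 \le n \le t} \frac{W(r_n)}{W(r)} \, \frac{1}{(2n+1)!} \Bigl( \sum_{r_n \le i \le r} \frac{\omega(p_i)}{p_i} \Bigr)^{2n+1}.$$

We will show later that we can pick $r_i$ such that

$$\frac{W(r_n)}{W(r)} = \frac{1}{\prod_{r_n \le i \le r} \left(1 - \frac{h}{p_i}\right)} \le e^{n\gamma}, \quad \text{where } \gamma = \log 1.3.$$

As before,

$$\sum_{r_n \le i \le r} \frac{\omega(p_i)}{p_i} \le \log \frac{W(r_n)}{W(r)} \le n\gamma.$$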

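So the bound for $\Theta$ is

$$\sum_{1 \le n \le t} \frac{e^{n\gamma}}{(2n+1)!} (n\gamma)^{2n+1} = \sum_{1 \le n \le t} \frac{(n e^{-1})^{2n+1}}{(2n+1)!} \, e^{2n+1} \gamma^{2n+1} e^{n\gamma} \le \frac{1}{e^3 \, 3!} \sum_{1 \le n \le t} (\gamma e^{1+\gamma})^{2n+1}$$

(since $\frac{(n e^{-1})^{2n+1}}{(2n+1)!}$ is decreasing in $n$ and $e^{n\gamma} \le e^{(2n+1)\gamma}$)

$$\le \frac{1}{e^3 \, 3!} \, \gamma e^{1+\gamma} \sum_{0 \le n < \infty} (\gamma e^{1+\gamma})^{2n} = \frac{1}{e^3 \, 3!} \cdot \frac{\gamma e^{1+\gamma}}{1 - (\gamma e^{1+\gamma})^2}.$$

The last step follows because $\gamma e^{1+\gamma} < 1$. The final expression is $\approx 0.05478 < 1$. Thus $\Theta < 1$. Let us define the intervals by selecting $r_i$ (for $1 \le i \le t$) as the least index such that

$$\pi_i = \prod_{r_i < k \le r_{i-1}} \left(1 - \frac{h}{p_k}\right) \ge \frac{1}{1.3}.$$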
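The numerical claim above is easy to check. The following is a minimal sketch (the variable and function names are illustrative, not from the text) that evaluates $\gamma e^{1+\gamma}$, the closed-form bound, and a partial sum of the original series for comparison.

```python
import math

gamma = math.log(1.3)                 # gamma = log 1.3, as in the text
r = gamma * math.exp(1 + gamma)       # the ratio gamma * e^(1 + gamma)

# Closed-form bound: r / (e^3 * 3! * (1 - r^2)).
closed_form = r / (math.exp(3) * math.factorial(3) * (1 - r * r))

def theta_series(t):
    """Partial sum of e^(n*gamma) * (n*gamma)^(2n+1) / (2n+1)! for 1 <= n <= t."""
    return sum(
        math.exp(n * gamma) * (n * gamma) ** (2 * n + 1) / math.factorial(2 * n + 1)
        for n in range(1, t + 1)
    )

print(f"gamma * e^(1+gamma) = {r:.6f}")
print(f"closed-form bound   = {closed_form:.5f}")
print(f"series with t = 30  = {theta_series(30):.5f}")
```

The series itself is far smaller than the closed-form bound, so the estimate $\Theta < 1$ has a lot of room.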

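Since $p_i > 10h$ this is always possible. This automatically satisfies the requirements set earlier on $\gamma$. Select $t$ such that

$$\pi_t = \prod_{1 \le k \le r_{t-1}} \left(1 - \frac{h}{p_k}\right) \ge \frac{1}{1.3}.$$

Since $p_i > 10h$ we have

$$1 - \frac{h}{p_i} > 1 - \frac{h}{10h} = \frac{9}{10},$$

so

$$\frac{9}{10}\, \pi_i = \left(1 - \frac{h}{10h}\right) \pi_i < \left(1 - \frac{h}{p_{r_i}}\right) \pi_i,$$

which by the definition of $r_i$ is $< \frac{1}{1.3}$. Thus

$$\pi_i \le \frac{10}{9} \cdot \frac{1}{1.3} = \frac{1}{1.17} < \frac{8}{9}.$$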

We will show that

$$\log \prod_{1 \le i \le r} \left(1 - \frac{h}{p_i}\right) > -\frac{h \log\log N}{e \log eh} > -0.2\, h \log\log N.$$

Using the series expansion of log(1+x) we see that

log ∏ 1≤i≤r1  h pi+log ∏ 1≤i≤r1  h pi h ≥  ∑ 1≤i≤r

∑ 2≤m

1 mh pim

≥ ∑ 1≤i≤r

1 2 ∑ 1≤mh pim

=

1 2 ∑ 1≤i≤rh pi2 1 1  h pi.

We need a good bound on $\sum_i \frac{1}{p_i^2}$. We have by [RS62] (p. 87) that

$$\sum_{x < p} \frac{1}{p^n} \le \frac{1.02\, n}{x^{n-1} \ln x}.$$

Using this with $n = 2$ and $x = 10h$ (all the primes $p_i > 10h$ by our choice) we have

$$\sum_{1 \le i \le r} \frac{1}{p_i^2} \le \frac{2.04}{10h \log 10h}.$$

Thus

1 2 ∑ 1≤i≤r

1 1  h pih pi2 ≥ 5 9

h2 ∑ 1≤i≤r

1 p2 i

0.2h log10h

.

Now if we can bound from above

log ∏ 1≤i≤r1  h pi h,

then we can obtain a lower bound on $\log \prod_{1 \le i \le r} \left(1 - \frac{h}{p_i}\right)$. Let $N_0 = \frac{N}{\gcd(H, N)}$. We have

A  (A)

1 ∏1≤i≤r1  1 pi

=

AN0  (AN0)

.

By [RS62] Theorem (15): For $n \ge 3$

$$\frac{n}{\varphi(n)} < e^{\gamma} \log\log n + \frac{5}{2 \log\log n},$$

where γ is the Euler constant. Also by [RS62] Theorem (9): logH < 11h < 0.1logN. Using this we have HN0  (HN0) < eγ loglogHN0+ 2.51 loglogHN0 < eγ loglogN0+ eγ 10 + 2.51 5 < eγ(loglogN +0.4), as HN0≥N, loglogN > 5 by our conditions, and also N0≤N. Now by [RS62], where a lower bound of e γ logx1  1 log2 xfor ∏p≤x 1 1 1 p is given, we have: H  (H) > eγ log10h1  1 2log2 10h> eγ(logh+2.1). Since loglogN > log10h,

∏ 1≤i≤r1  h pi 1 < 1 eγ(logh+2.1)eγ loglogN +0.4

yielding

hlog ∏ 1≤i≤r1  1 pi< hlog(loglogN +0.4) log(logh+2.1)+ 0.2 log10h

and  nally

log ∏ 1≤i≤r1  h pi> hlogloglogN loglogeh.

Using logx loga = 1+logx ae≤ x ae, we have log ∏ 1≤i≤r1  h pi>  hloglogN elogeh

.

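Since $\pi_i \le \frac{1}{1.17}$, we obtain

$$(t - 1) \log 1.17 \le \log \prod_{1 \le i \le r} \left(1 - \frac{h}{p_i}\right)^{-1} \le \frac{h \log\log N}{e \log eh} < \frac{h \log\log N}{e \log(h+1)}.$$

This yields

$$(2t + 1) \log(h+1) < 3 \log(h+1) + \frac{2h \log\log N}{e \log 1.17} < 3 \log(h+1) + 4.7\, h \log\log N.$$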

Now pi > ilogi, by [RS62] (Corollary to Theorem 3). Hence

logπi = ∑ rn<i≤rn 1

log1  h pi

>

10 9 ∑ rn<i≤rn 1

h ps

>

10h 9

rn 1 rn

dt t logt

=

10h 9

log

logrn 1 logrn

.

Since πi ≤ 1 1.17, we have

logrn logrn 1

< 1 1.17

9 10h

<1+ 9 10h

log1.17 1 ≤(1+0.141h 1) 1,

and so

logrn logr

< (1+0.141h 1) n

for 1≤n≤t 1. Further

logN ≥ ∑ 1≤i≤r

logpi > rlog10h≥rlog20,

so logr < loglogN 1.

Now for the remainder term: R = ∑ d\P

χ(d)|Rd|≤1+ ∑ 1≤i≤r

ω(pi) ∏ 1≤i≤t 11+ ∑ j≤ri

ω(pj)2

(since ω(p) = h)

≤(1+hr) ∏ 1≤i≤t 1 (1+hri)2.

Thus

logR≤log(1+h)+logr+2(t 1)log(h+1)+2 ∑ 1≤i≤t 1

logri

= (2t 1)log(h+1)+logr+2 ∑ 1≤i≤t 1

logri < 3log(h+1)+4.7hloglogN +(loglogN 1)2 ∑ 0≤n (1+0.141h 1) n 1 < 3log(h+1)+4.7hloglogN +(loglogN 1)(14.2h+1) < 19.4hloglogN 11h 1. Since logH < 11h, we have logR < 19.4hloglogN logH 1, and logc(logN)20h H ∏ 1≤i≤r1  h pi> logR,

where $c = 1 - 0.05478$. Thus $P(H, (\log N)^{20h}, p_1, \dots, p_r) > 0$. Thus there is an integer $D$ satisfying the conditions of the theorem.
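For small parameters the integer $D$ of Theorem 2.4.1 can be found by direct search. The sketch below is only an illustration (the helper name `least_admissible_D` and the sample values of $N$ and $h$ are assumptions made here); it finds the least admissible $D$ and prints the bound $(\log N)^{20h}$ next to it.

```python
import math

def least_admissible_D(N, h):
    """Smallest D >= 1 with gcd(i*D + 1, N) == 1 for every 1 <= i <= h."""
    D = 1
    # The loop terminates: D = N always works, since i*N + 1 is coprime to N.
    while any(math.gcd(i * D + 1, N) != 1 for i in range(1, h + 1)):
        D += 1
    return D

N, h = 2 * 3 * 5 * 7 * 11 * 13, 3      # sample values chosen for illustration
D = least_admissible_D(N, h)
print("least D:", D)
print("bound (log N)^(20h):", math.log(N) ** (20 * h))
```

In practice the least $D$ is tiny compared with the bound; the theorem is about guaranteeing a polylogarithmic bound uniformly in $N$, not about sharpness.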

2.5. Smooth Numbers

Here we illustrate the surprising power of the identity proved in Proposition 2.2.1.

Let $P_z^x = \{p \mid z \le p < x\}$. Then setting χ(1)=1 and χ(d)=1 for d > 1 in Proposition 2.2.1, we have that for $2 \le z_1 \le z$:

$$S(A_x; P_{z_1}^x) = S(A_x; P_z^x) - \sum_{z_1 \le p < z} S(A_p; P_p^x).$$

Recall that $S(A_x; P_z^x) = \Psi(x, z)$, the number of $z$-smooth integers below $x$; also $S(A_p; P_p^x) = \Psi\left(\frac{x}{p}, p\right)$. Hence we have, for $2 \le z_1 \le z$, that

$$\Psi(x, z) = \Psi(x, z_1) + \sum_{z_1 \le p < z} \Psi\left(\frac{x}{p}, p\right). \tag{2.19}$$

As an application we show the following theorem.

THEOREM 2.5.1 ([Hal70]). Let $y = x^{1/\theta}$ where $1 < \theta \le 2$. Then

$$\Psi(x, y) = x \left(1 - \log\theta + O\left(\frac{1}{\log x}\right)\right).$$

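Proof: Applying the identity (2.19) with $z = x$ and $z_1 = y$, we have

$$\Psi(x, y) = \Psi(x, x) - \sum_{y \le p < x} \Psi\left(\frac{x}{p}, p\right). \tag{2.20}$$

Now $\Psi(x, x) = \lfloor x \rfloor$. Since $1 < \theta \le 2$, we have $p \ge \sqrt{x}$ and so $\frac{x}{p} \le \sqrt{x} \le p$. Consequently, $\Psi\left(\frac{x}{p}, p\right) = \left\lfloor \frac{x}{p} \right\rfloor$. Substituting in (2.20), we have

$$\Psi(x, y) = \lfloor x \rfloor - \sum_{y \le p < x} \left\lfloor \frac{x}{p} \right\rfloor = x - x \sum_{y \le p < x} \frac{1}{p} + O(\pi(x)) = x \left(1 - \log\log x + \log\log y + O\left(\frac{1}{\log x}\right)\right).$$

Now $x = y^{\theta}$, so $\log x = \theta \log y$ and $\log\log x = \log\theta + \log\log y$; this yields

$$\Psi(x, y) = x \left(1 - \log\theta + O\left(\frac{1}{\log x}\right)\right).$$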
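Theorem 2.5.1 can be compared with an exact count for moderate $x$. The following sketch (function names and the choice $x = 200000$, $\theta = 1.5$ are assumptions for illustration) tabulates the largest prime factor of every $n \le x$ with a sieve and counts the integers all of whose prime factors lie below $y = x^{1/\theta}$.

```python
import math

def largest_prime_factors(x):
    """lpf[n] = largest prime factor of n for 2 <= n <= x (with lpf[1] = 1)."""
    lpf = [0] * (x + 1)
    lpf[1] = 1
    for p in range(2, x + 1):
        if lpf[p] == 0:                      # p is prime: no smaller prime marked it
            for m in range(p, x + 1, p):
                lpf[m] = p                   # larger primes overwrite smaller ones
    return lpf

def psi(x, y, lpf):
    """Number of n <= x whose prime factors are all < y."""
    return sum(1 for n in range(1, x + 1) if lpf[n] < y)

x, theta = 200_000, 1.5
lpf = largest_prime_factors(x)
ratio = psi(x, x ** (1 / theta), lpf) / x
print(f"Psi(x, y)/x = {ratio:.4f},  1 - log(theta) = {1 - math.log(theta):.4f}")
```

The two values differ by an amount of order $1/\log x$, which is what the error term in the theorem allows.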

The recurrence formula can be used to convert upper bounds to other useful lower bounds, and can also be used iteratively. Here is a simple example. Let us try to evaluate $\Psi(x, x^{1/\delta})$ for $2 < \delta < e$ using the recurrence formula

$$\Psi(x, x^{1/\delta}) = \Psi(x, x^{1/2}) - \sum_{x^{1/\delta} \le p \le x^{1/2}} \Psi\left(\frac{x}{p}, p\right).$$

Applying the trivial bound $\Psi\left(\frac{x}{p}, p\right) \le \frac{x}{p}$,

$$\sum_{x^{1/\delta} \le p \le x^{1/2}} \Psi\left(\frac{x}{p}, p\right) \le x \sum_{x^{1/\delta} \le p \le x^{1/2}} \frac{1}{p} = x(\log\delta - \log 2) + O\left(\frac{x}{\log x}\right).$$

Now applying the theorem with $\theta = 2$, we have

$$\Psi(x, x^{1/2}) = x \left(1 - \log 2 + O\left(\frac{1}{\log x}\right)\right).$$

Thus we obtain

$$\Psi(x, x^{1/\delta}) \ge x \left(1 - \log\delta + O\left(\frac{1}{\log x}\right)\right).$$

Of course, in this case we could have directly derived this result as in the theorem, but this is just an illustration of the usage of Buchstab’s identity. In estimating $\Psi(x, y)$ we could try to use Brun’s sieve as in section (1.3). It is clear however, that to obtain a good estimate we need to take $\ln z < \varepsilon \ln x$, but this would make the error term very large, since that depends on the size of the interval x z.

2.6. On the number of integers prime to a given number

Let $k > 1$ be an integer and $x > 1$ a real number; here we will find bounds for the sum

$$\sum_{\substack{n \le x \\ \gcd(n,k) = 1}} 1.$$

It is clear that in every interval mod k there are  (k) such integers. However, it is not clear how uniform the distribution of these numbers are inside the interval. The sequence to be sifted is A ={n|1≤n≤x}, and the sifting primes are P ={p| p\k}. We assume x≥k. In this case we can take|Ad|= x d +Rd, where ω(d) = 1 and Rd ≤1. Now, 1 1 1 p ≤2. Hence A = 2, we also have ∑ w≤p≤z p∈P ω(p)ln p p ≤lnlnz lnw+ 1 lnw thus κ = η = 1. To apply the lower bound estimate of the Brun sieve (with b = 1), we need to  nd λ such that

1

2(λeλ)2 1+(λe1+λ)2

> 0

and

1+

2.01 e2λ 1

< γ,

where we have used ξ = 1.005. It turns out that we can take γ < 5, and satisfy both the constraints for λ = 0.204. This gives S(A;P,z)≥xW(z)1 o(1)+Oz4.85.T aking z = x 1 5 , we obtain S(A;P,z)≥c∏ p\k1  1 px+Ox0.97. Now to get the actual estimate ∑n≤x,n⊥k 1 we need to account for the numbers that might have been included in this estimate which are not really prime to k. Clearly, by our choice of the limit for z, each number which is over-counted must share a factor p with k that is larger than x 1 5 . Let us assume that the largest prime factor of k is < x 1 5 .

Thus we have:

$$\sum_{\substack{n \le x \\ \gcd(n,k) = 1}} 1 \ge c\, \frac{\varphi(k)}{k}\, x + O(x^{0.97}),$$

where $c < 1$. For the upper bound we can take the same value of $\lambda$ as for the lower bound, but this forces us to take $z = x^{1/6}$ in this case and we get

$$\sum_{\substack{n \le x \\ \gcd(n,k) = 1}} 1 \le c'\, \frac{\varphi(k)}{k}\, x + O(x^{0.975}),$$

where $c' < 4$. In summary we have proved:

THEOREM 2.6.1. Let $x > 0$ and $k$ a positive integer whose largest prime factor $p$ is less than $x^{1/5}$. Then

$$c\, \frac{\varphi(k)}{k}\, x + O(x^{0.97}) \le \sum_{\substack{n \le x \\ \gcd(n,k) = 1}} 1 \le c'\, \frac{\varphi(k)}{k}\, x + O(x^{0.975}),$$

where $c < 1$ and $c' < 4$ are constants.
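A direct count shows how the quantity bounded in Theorem 2.6.1 behaves for a concrete modulus. This is a minimal sketch; the sample values $x = 100000$ and $k = 210$ are assumptions chosen so that the largest prime factor of $k$ is below $x^{1/5}$.

```python
from math import gcd

def coprime_count(x, k):
    """Number of integers 1 <= n <= x with gcd(n, k) == 1."""
    return sum(1 for n in range(1, x + 1) if gcd(n, k) == 1)

def euler_phi(k):
    """Euler's totient, computed naively from the definition."""
    return sum(1 for n in range(1, k + 1) if gcd(n, k) == 1)

x, k = 100_000, 210          # largest prime factor of k is 7 < x**(1/5)
count = coprime_count(x, k)
main_term = euler_phi(k) / k * x
print(count, main_term, count / main_term)
```

For a fixed small $k$ the ratio is very close to 1; the sieve bounds with constants $c$ and $c'$ matter when $k$ has many prime factors relative to $x$.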

CHAPTER 3

Selberg’s Sieve

Around 1946 Atle Selberg introduced a new method for finding upper bounds to the sieve estimate [Sel47]. The method usually gives much better bounds than Brun’s sieve. To obtain lower bounds one can couple the Selberg sieve with the Buchstab identities. After developing the basic ideas of this sieve technique, we shall look at the most important application of this method: deriving inequalities of the Brun-Titchmarsh type.

3.1. The Selberg upper-bound method

Selberg’s method of estimating the sum

$$S(A; P_z, x) = \sum_{a \in A} \sum_{d \mid \gcd(a, P_z)} \mu(d)$$

relies on finding a sequence of numbers $\lambda_d$ such that $\lambda_1 = 1$ and using the inequality

$$S(A; P_z, x) \le \sum_{a \in A} \Bigl( \sum_{d \mid \gcd(a, P_z)} \lambda_d \Bigr)^2.$$

This allows us complete freedom in our choice of the numbers $\lambda_d$ for $d > 1$, and the idea of this method is to select the $\lambda_d$ such that the sum is minimized. Note that setting $\lambda_1 = 1$ and $\lambda_d = 0$ for $d > 1$ leads to the trivial estimate $S(A; P_z, x) \le |A_x|$. Selberg’s method relies on choices of $\lambda_d$ that mimic the cancellation occurring in the sum $\sum_{d \mid n} \mu(d)$. Such choices lead to better estimates when we interchange the sum.

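Now

$$\sum_{a \in A_x} \Bigl( \sum_{d \mid \gcd(a, P_z)} \lambda_d \Bigr)^2 = \sum_{\substack{d_i \mid P_z \\ i = 1,2}} \lambda_{d_1} \lambda_{d_2} \sum_{\substack{a \in A_x \\ a \equiv 0 \bmod D}} 1,$$

where $D = \operatorname{lcm}(d_1, d_2)$. By our conventions about the sequence $A$, we have

$$\sum_{\substack{a \in A_x \\ a \equiv 0 \bmod D}} 1 = |A_D^x| = \frac{\omega(D)}{D}\, x + R_D.$$

This yields

$$\sum_{\substack{d_i \mid P_z \\ i = 1,2}} \lambda_{d_1} \lambda_{d_2} |A_D^x| = x \sum_{\substack{d_i \mid P_z \\ i = 1,2}} \lambda_{d_1} \lambda_{d_2} \frac{\omega(D)}{D} + \sum_{\substack{d_i \mid P_z \\ i = 1,2}} \lambda_{d_1} \lambda_{d_2} |R_D| = x \Sigma_1 + \Sigma_2.$$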

The problem of selecting $\lambda_d$ already seems difficult. We can make the assumption that $\lambda_d = 0$ for $d > z$ and hope that, since the second sum $\Sigma_2$ contains only $z^2$ terms, we can concentrate on minimizing the leading sum $\Sigma_1$. Our first effort will be directed towards this.

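Minimization of $\Sigma_1$: Using the fact that $\omega(d)$ is a multiplicative function, we have

$$\frac{\omega(D)}{D} = \frac{\omega(d_1)\, \omega(d_2)}{\omega(\gcd(d_1, d_2))} \cdot \frac{\gcd(d_1, d_2)}{d_1 d_2},$$

so

$$\Sigma_1 = \sum_{d_i \mid P_z} \lambda_{d_1} \lambda_{d_2} \, \frac{\omega(d_1)}{d_1} \, \frac{\omega(d_2)}{d_2} \, \frac{\gcd(d_1, d_2)}{\omega(\gcd(d_1, d_2))}.$$

Let $f(d) = \frac{\omega(d)}{d}$, so that the sum becomes

$$\Sigma_1 = \sum_{d_i \mid P_z} \lambda_{d_1} \lambda_{d_2} \frac{f(d_1) f(d_2)}{f(d)}, \tag{3.21}$$

where $d = \gcd(d_1, d_2)$.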

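We need to get rid of the term in the denominator, and to this end we introduce the function

$$J(r) = \frac{1}{f(r)} \prod_{p \mid r} (1 - f(p)).$$

Let $r = ps$, and consider:

$$\sum_{\delta \mid ps} J(\delta) = \sum_{\delta \mid s} \bigl( J(p\delta) + J(\delta) \bigr) = \sum_{\delta \mid s} \Bigl( \frac{1}{f(p\delta)} \prod_{q \mid p\delta} (1 - f(q)) + \frac{1}{f(\delta)} \prod_{q \mid \delta} (1 - f(q)) \Bigr)$$

$$= \sum_{\delta \mid s} J(\delta) \Bigl( \frac{1}{f(p)} (1 - f(p)) + 1 \Bigr) = \frac{1}{f(p)} \sum_{\delta \mid s} J(\delta),$$

together with

$$\sum_{\delta \mid p} J(\delta) = J(p) + J(1) = \frac{1}{f(p)}.$$

Thus we have

$$\frac{1}{f(d)} = \sum_{\delta \mid d} J(\delta).$$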
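The identity $\frac{1}{f(d)} = \sum_{\delta \mid d} J(\delta)$ can be verified exactly for squarefree $d$ with rational arithmetic. In this sketch a squarefree number is represented by the tuple of its prime factors, and the choice $\omega(p) = 2$ is a hypothetical example, not something fixed by the text.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def f(primes, omega):
    """f(d) = omega(d)/d for the squarefree d whose prime factors are `primes`."""
    return prod((Fraction(omega(p), p) for p in primes), start=Fraction(1))

def J(primes, omega):
    """J(r) = (1/f(r)) * prod over p | r of (1 - f(p))."""
    numerator = prod((1 - Fraction(omega(p), p) for p in primes), start=Fraction(1))
    return numerator / f(primes, omega)

def divisor_sum_of_J(primes, omega):
    """Sum of J(delta) over all divisors delta of d = prod(primes)."""
    return sum(
        J(subset, omega)
        for size in range(len(primes) + 1)
        for subset in combinations(primes, size)
    )

omega = lambda p: 2            # hypothetical: two excluded residue classes per prime
d = (3, 5, 7, 11)
print(divisor_sum_of_J(d, omega), 1 / f(d, omega))
```

With $\omega(p) = 1$ the function $J$ reduces to Euler's $\varphi$, and the identity becomes the familiar $\sum_{\delta \mid d} \varphi(\delta) = d$.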

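Substituting this for $\frac{1}{f(d)}$ in (3.21) we get

$$\sum_{d_i \mid P_z} \lambda_{d_1} \lambda_{d_2} \frac{f(d_1) f(d_2)}{f(d)} = \sum_{d_i \mid P_z} \lambda_{d_1} \lambda_{d_2} f(d_1) f(d_2) \sum_{\delta \mid d_1,\ \delta \mid d_2} J(\delta) = \sum_{\substack{r \le z \\ r \mid P_z}} J(r) \Bigl( \sum_{\substack{r \mid d \\ d \le z}} \lambda_d f(d) \Bigr)^2.$$

Let $\xi_r = \sum_{r \mid d,\ d \le z} \lambda_d f(d)$, so that

$$\Sigma_1 = \sum_{\substack{r \le z \\ r \mid P_z}} J(r)\, \xi_r^2.$$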

This is what we need to minimize subject to the restriction λ1 = 1. We wish to write this constraint as a constraint among the variables ξi, which would allow us to convert the minimization problem to one entirely involving the variables ξi.

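The idea is to use Möbius inversion to pick out $\lambda_1$, and this is not difficult:

$$\sum_{r \le z} \mu(r)\, \xi_r = \sum_{r \le z} \mu(r) \sum_{\substack{r \mid d \\ d \le z}} \lambda_d f(d) = \sum_{d \le z} f(d) \lambda_d \sum_{r \mid d} \mu(r) = \lambda_1 f(1) = \lambda_1 = 1.$$

Thus we need to minimize $\sum_{r \le z} J(r) \xi_r^2$, subject to the constraint $\sum_{r \le z} \mu(r) \xi_r = 1$. Let $F = \sum_r J(r) \xi_r^2 - \vartheta$ for some real $\vartheta$. Since $\sum_{r \le z} \mu(r) \xi_r = 1$, we have $F = \sum_r J(r) \xi_r^2 - \vartheta \sum_r \mu(r) \xi_r$. Minimizing $F$ is the same as minimizing the function $\sum_{r \le z} J(r) \xi_r^2$. Let us try to complete the square in the first sum in $F$. This suggests setting $\vartheta \leftarrow 2\omega$, so

$$\sum_{r \le z} J(r) \xi_r^2 - 2\omega \sum_{r \le z} \mu(r) \xi_r = \sum_{r \le z} J(r) \Bigl( \xi_r^2 - \frac{2\omega \mu(r) \xi_r}{J(r)} \Bigr)$$

$$= \sum_{r \le z} J(r) \Bigl( \xi_r^2 - \frac{2\omega \mu(r) \xi_r}{J(r)} + \Bigl( \frac{\omega \mu(r)}{J(r)} \Bigr)^2 \Bigr) - \sum_{r \le z} \frac{\omega^2 \mu(r)^2}{J(r)} = \sum_{r \le z} J(r) \Bigl( \xi_r - \frac{\omega \mu(r)}{J(r)} \Bigr)^2 - \omega^2 \sum_{r \le z} \frac{\mu^2(r)}{J(r)}.$$

Thus at the minimum value of $F$ we should have $\xi_r = \frac{\omega \mu(r)}{J(r)}$, and the minimum value of $F$ would be $-\omega^2 \sum_{r \le z} \frac{\mu^2(r)}{J(r)}$. To find the value of $\omega$, we can substitute $\xi_r$ into the constraint $\sum_{r \le z} \mu(r) \xi_r = 1$, and this gives us immediately that

$$\omega = \frac{1}{\sum_{r \le z} \frac{\mu(r)^2}{J(r)}}.$$

So

$$\min \sum_{r \le z} J(r) \xi_r^2 = \sum_{r \le z} \omega^2 \frac{\mu(r)^2}{J(r)} = \omega^2 \sum_{r \le z} \frac{\mu(r)^2}{J(r)} = \frac{\omega^2}{\omega} = \omega = \frac{1}{\sum_{r \le z} \frac{\mu(r)^2}{J(r)}}.$$

By our definition of the function $g(d)$ we have $g(r) = \frac{1}{J(r)}$, so

$$\sum_{r \le z} \frac{\mu(r)^2}{J(r)} = \sum_{r \le z} \mu(r)^2 g(r).$$

Set

$$G(z) = \sum_{r \le z} \mu^2(r)\, g(r).$$

Then the minimum value of $x \Sigma_1$ is $\frac{x}{G(z)}$.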

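Evaluation of $\Sigma_2$: To estimate the remainder term $\Sigma_2$, we need an estimate on the size of the $\lambda_d$. We had earlier used Möbius inversion to extract $\lambda_1$ from a combination of the $\xi_r$, and we can repeat the process to get $\lambda_\delta$ for any $\delta$.

Now by definition

$$\xi_r = \sum_{\substack{r \mid d \\ d \le z}} \lambda_d f(d).$$

Let $r = \gamma\delta$, so that

$$\xi_{\gamma\delta} = \sum_{\substack{\gamma\delta \mid d \\ d \le z}} \lambda_d f(d) = \sum_{\substack{\gamma \mid \frac{d}{\delta} \\ d \le z}} \lambda_d f(d) = \sum_{\substack{\gamma \mid v,\ v \le z/\delta \\ \gcd(v, \delta) = 1}} \lambda_{\delta v} f(\delta v).$$

Since we want to extract the term with $v = 1$, we calculate:

$$\sum_{\substack{\gamma \le z/\delta \\ \gamma \perp \delta}} \mu(\gamma)\, \xi_{\gamma\delta} = \sum_{\substack{\gamma \le z/\delta \\ \gamma \perp \delta}} \mu(\gamma) \sum_{\substack{\gamma \mid v,\ v \le z/\delta \\ v \perp \delta}} \lambda_{\delta v} f(\delta v) = \sum_{\substack{v \le z/\delta \\ v \perp \delta}} \lambda_{\delta v} f(\delta v) \sum_{\gamma \mid v} \mu(\gamma) = \lambda_\delta f(\delta).$$

Thus

$$\lambda_\delta = \frac{1}{f(\delta)} \sum_{\substack{\gamma \le z/\delta \\ \gamma \perp \delta}} \mu(\gamma)\, \xi_{\gamma\delta},$$

and substituting for $\xi_{\gamma\delta}$ gives

$$\lambda_\delta = \frac{\omega}{f(\delta)} \sum_{\substack{\gamma \le z/\delta \\ \gamma \perp \delta}} \frac{\mu(\gamma\delta) \mu(\gamma)}{J(\gamma\delta)} = \frac{\omega \mu(\delta)}{f(\delta) J(\delta)} \sum_{\substack{\gamma \le z/\delta \\ \gamma \perp \delta}} \frac{\mu(\gamma)^2}{J(\gamma)}.$$

Let

$$G_d(y) = \sum_{\delta < y,\ \delta \perp d} \mu^2(\delta)\, g(\delta).$$

Then

$$\lambda_\delta = \frac{\omega \mu(\delta)}{f(\delta) J(\delta)}\, G_\delta\!\left(\frac{z}{\delta}\right). \tag{3.22}$$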

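We will show that $|\lambda_d| \le 1$. Observe that

$$G(z) = \sum_{l \mid d} \sum_{\substack{m \le z \\ \gcd(m, d) = l}} \mu(m)^2 g(m) = \sum_{l \mid d} \sum_{\substack{h < z/l \\ \gcd(h, l) = 1 \\ \gcd(h, d/l) = 1}} \mu(lh)^2 g(lh) = \sum_{l \mid d} \mu(l)^2 g(l)\, G_d\!\left(\frac{z}{l}\right) \ge \sum_{l \mid d} \mu(l)^2 g(l)\, G_d\!\left(\frac{z}{d}\right)$$

and

$$\sum_{l \mid d} \mu(l)^2 g(l) = \prod_{p \mid d} (1 + g(p)) = \prod_{p \mid d} \frac{p}{p - \omega(p)} = \frac{1}{\prod_{p \mid d} \left(1 - \frac{\omega(p)}{p}\right)},$$

and so

$$G_d\!\left(\frac{z}{d}\right) \le \prod_{p \mid d} \left(1 - \frac{\omega(p)}{p}\right) G(z). \tag{3.23}$$

Now substituting for $J(\delta)$ in (3.22), we get:

$$\lambda_d = \frac{\mu(d)}{\prod_{p \mid d} \left(1 - \frac{\omega(p)}{p}\right)} \cdot \frac{G_d(z/d)}{G(z)}. \tag{3.24}$$

Thus by (3.23) and (3.24), we have $|\lambda_d| \le 1$.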
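The weights in (3.24) can be computed exactly in the simplest case $\omega(d) = 1$, where $g(d) = 1/\varphi(d)$. The sketch below makes two assumptions of its own: all primes are sifting primes, and every range condition is read as "$\le$" so that the algebra closes up exactly. It checks that $\lambda_1 = 1$, that $|\lambda_d| \le 1$, and that the quadratic form $\Sigma_1$ equals $1/G(z)$.

```python
from fractions import Fraction
from math import gcd

def mu_phi(n):
    """Return (mu(n), phi(n)) for squarefree n, and (0, 0) otherwise."""
    mu, phi, m, p = 1, 1, n, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:           # p^2 divides n: not squarefree
                return 0, 0
            mu, phi = -mu, phi * (p - 1)
        p += 1
    if m > 1:                        # one remaining prime factor
        mu, phi = -mu, phi * (m - 1)
    return mu, phi

def G(y, d=1):
    """G_d(y): sum of mu^2(r)/phi(r) over r <= y coprime to d (case omega = 1)."""
    total = Fraction(0)
    for r in range(1, y + 1):
        mu, phi = mu_phi(r)
        if mu != 0 and gcd(r, d) == 1:
            total += Fraction(1, phi)
    return total

def selberg_lambda(d, z):
    """lambda_d = mu(d) * (d/phi(d)) * G_d(z/d) / G(z), i.e. (3.24) with omega = 1."""
    mu, phi = mu_phi(d)
    if mu == 0:
        return Fraction(0)
    return mu * Fraction(d, phi) * G(z // d, d) / G(z)

z = 30
lam = {d: selberg_lambda(d, z) for d in range(1, z + 1)}
# Sigma_1 = sum of lambda_a * lambda_b / lcm(a, b), using 1/lcm = gcd/(a*b).
sigma_1 = sum(lam[a] * lam[b] * Fraction(gcd(a, b), a * b) for a in lam for b in lam)
print(lam[1], max(abs(v) for v in lam.values()), sigma_1 == 1 / G(z))
```

Exact fractions are used so that the identity $\Sigma_1 = 1/G(z)$ is tested as an equality rather than up to rounding.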

Now

$$\Sigma_2 \le \sum_{\substack{d_i < z \\ d_i \mid P_z}} \left| R_{\operatorname{lcm}(d_1, d_2)} \right|.$$

Fix a $d$; we can estimate the number of integers $d_1, d_2$ for which $d = \operatorname{lcm}(d_1, d_2)$. Now $d$ as well as $d_1$ and $d_2$ are squarefree. If $d_1 = \prod_i p_i^{e_i}$ and $d_2 = \prod_i p_i^{f_i}$, then $d = \prod_i p_i^{\max\{e_i, f_i\}}$. Suppose $p \mid d$; then $p \mid d_1$ or $p \mid d_2$ or $p$ divides both of them. So the number of pairs which can give rise to $d$ as their lcm is exactly $3^{\nu(d)}$. Using this and the fact that $d < z^2$, we get

$$\Sigma_2 \le \sum_{d < z^2} 3^{\nu(d)} |R_d|.$$
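The count $3^{\nu(d)}$ of pairs with a given least common multiple is easy to confirm by enumeration. A minimal sketch (the helper name is illustrative):

```python
from math import gcd

def lcm_pair_count(d):
    """Number of ordered pairs (d1, d2) of divisors of d with lcm(d1, d2) == d."""
    divisors = [t for t in range(1, d + 1) if d % t == 0]
    return sum(1 for a in divisors for b in divisors if a * b // gcd(a, b) == d)

# Squarefree d with nu(d) = 3, 4, 5 prime factors.
for d, nu in [(30, 3), (210, 4), (2310, 5)]:
    print(d, lcm_pair_count(d), 3 ** nu)
```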

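If we also have the remainder condition $|R_d| \le \omega(d)$, then we can simplify further:

$$\sum_{d < z^2} 3^{\nu(d)} |R_d| \le \sum_{\substack{d < z^2 \\ d \mid P_z}} 3^{\nu(d)} \omega(d) \le z^2 \sum_{d \mid P_z} \frac{3^{\nu(d)} \omega(d)}{d} = z^2 \prod_{p < z,\ p \in P} \left(1 + \frac{3\omega(p)}{p}\right) \le z^2 \prod_{p < z} \left(1 + \frac{\omega(p)}{p}\right)^3 \le \frac{z^2}{W^3(z)}.$$

Thus we have proved:

THEOREM 3.1.1. If $|R_d| \le \omega(d)$, then

$$S(A; P_z, x) \le \frac{x}{G(z)} + \frac{z^2}{W^3(z)},$$

where

$$G(z) = \sum_{r \le z} \mu^2(r)\, g(r).$$

The second term can also be upper bounded by

$$\sum_{\substack{d < z^2 \\ d \mid P_z}} 3^{\nu(d)} |R_d|,$$

which is also upper bounded by

$$\sum_{\substack{d < z^2 \\ \Gamma(d) \subseteq P}} \mu^2(d)\, 3^{\nu(d)} |R_d|.$$

Here $\Gamma(d)$ stands for the set of prime divisors of $d$.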

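We will apply the Selberg method to the simple but important case where $\omega(d) = 1$ and $|R_d| \le 1$.

THEOREM 3.1.2. Suppose $\omega(d) = 1$ and $|R_d| \le 1$. If $d$ is squarefree and $p \notin P \Rightarrow p \perp d$, then

$$S(A; P, z) \le \frac{x}{\prod_{p < z,\ p \notin P} \left(1 - \frac{1}{p}\right) \log z} + z^2.$$

Proof: Recall that

$$g(d) = \frac{\omega(d)}{d \prod_{p \mid d} \left(1 - \frac{\omega(p)}{p}\right)}$$

where $d \mid P_z$. In this case we have $\omega(d) = 1$, so we have $g(d) = \frac{1}{\varphi(d)}$. Let $k = \prod_{p < z,\ p \notin P} p$. Then by the definition of $G(z)$ in this case we get

$$G(z) = \sum_{\substack{d < z \\ d \perp k}} \frac{\mu^2(d)}{\varphi(d)}.$$

Let

$$S_k(z) = \sum_{\substack{d < z \\ d \perp k}} \frac{\mu^2(d)}{\varphi(d)}.$$

Then

$$S_1(z) = \sum_{d < z} \frac{\mu^2(d)}{\varphi(d)} = \sum_{l \mid k} \sum_{\substack{d < z \\ \gcd(d, k) = l}} \frac{\mu^2(d)}{\varphi(d)} = \sum_{l \mid k} \sum_{\substack{h < z/l \\ \gcd(h, k/l) = 1 \\ \gcd(h, l) = 1}} \frac{\mu^2(lh)}{\varphi(lh)}$$

$$= \sum_{l \mid k} \frac{\mu^2(l)}{\varphi(l)} \sum_{\substack{h < z/l \\ h \perp k}} \frac{\mu^2(h)}{\varphi(h)} = \sum_{l \mid k} \frac{\mu^2(l)}{\varphi(l)}\, S_k\!\left(\frac{z}{l}\right) \le \sum_{l \mid k} \frac{\mu^2(l)}{\varphi(l)}\, S_k(z),$$

because $S_k(z)$ is an increasing function of $z$.

Now

$$\sum_{l \mid k} \frac{\mu^2(l)}{\varphi(l)} = \prod_{p \mid k} \left(1 + \frac{1}{p - 1}\right) = \frac{1}{\prod_{p \mid k} \left(1 - \frac{1}{p}\right)} = \frac{k}{\varphi(k)},$$

and so

$$S_k(z) \ge \frac{\varphi(k)}{k}\, S_1(z).$$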

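To apply Theorem 3.1.1 we need a good lower bound on $G(z)$. Since $G(z) = S_k(z)$, the above derivation says that we can translate a lower bound on $S_1(z)$ to a lower bound on $S_k(z)$. We have

$$S_1(z) = \sum_{d < z} \frac{\mu^2(d)}{d} \cdot \frac{1}{\prod_{p \mid d} \left(1 - \frac{1}{p}\right)} = \sum_{d < z} \frac{\mu^2(d)}{d} \prod_{p \mid d} \left(1 + \frac{1}{p} + \frac{1}{p^2} + \cdots \right).$$

If we set $\kappa(n)$ to be the largest squarefree divisor of $n$, then

$$S_1(z) = \sum_{\kappa(n) < z} \frac{1}{n} \ge \sum_{n < z} \frac{1}{n} \ge \log z.$$

So $S_k(z) \ge \frac{\varphi(k)}{k} \log z$. We know from the proof of Theorem (3.1.1) that the remainder term is at most

$$\sum_{\substack{d_i \mid P_z \\ d_i < z}} \left| R_{\operatorname{lcm}(d_1, d_2)} \right| \le \Bigl( \sum_{d < z} \mu^2(d) \Bigr)^2 < z^2.$$

Thus

$$S(A; P_z, x) \le \frac{1}{\prod_{p < z,\ p \notin P} \left(1 - \frac{1}{p}\right) \log z}\, x + z^2.$$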
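The key inequality $S_1(z) \ge \log z$ in the proof can be checked numerically. This sketch is illustrative only (the helper names are assumptions); it sums $\mu^2(d)/\varphi(d)$ over squarefree $d < z$ and prints the value next to $\log z$.

```python
import math

def phi_if_squarefree(n):
    """Return phi(n) when n is squarefree, and None otherwise."""
    phi, m, p = 1, n, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:           # p^2 divides n
                return None
            phi *= p - 1
        p += 1
    if m > 1:
        phi *= m - 1
    return phi

def S1(z):
    """Sum of mu^2(d)/phi(d) over 1 <= d < z."""
    total = 0.0
    for d in range(1, z):
        phi = phi_if_squarefree(d)
        if phi is not None:
            total += 1 / phi
    return total

for z in (10, 100, 1000, 10_000):
    print(z, round(S1(z), 4), round(math.log(z), 4))
```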

3.2. The Brun-Titchmarsh Theorem

The prime number theorem for arithmetic progressions states that

$$\pi(x; l, k) = \frac{\operatorname{li} x}{\varphi(k)} + O\!\left(x e^{-A\sqrt{\log x}}\right)$$

uniformly for $k \le (\log x)^B$, where $B$ is any positive constant and $A$ is a positive constant depending only on $B$. This is a very narrow range of values of $k$. It turns out that if we assume the Extended Riemann Hypothesis, then

$$\pi(x; l, k) = \frac{\operatorname{li} x}{\varphi(k)} + O\!\left(\sqrt{x} \log x\right)$$

uniformly for $k \le \frac{\sqrt{x}}{\log^2 x}$. By a careful analysis of the Selberg sieve (especially the remainder term) van Lint and Richert [vLR65] showed a good upper bound for $\pi(x; l, k)$ valid for any $k < x$. In this section we shall look at the proof of this result (see Theorem 3.2.5). In a later chapter we shall improve on this result using the so-called Large sieve.

Let $k, l > 0$ be relatively prime integers, and let $x, y > 1$ be reals with $y \le x$. We will concentrate on the sequence $A = \{n \mid x - y < n \le x,\ n \equiv l \bmod k\}$. For $K$ a multiple of $k$, we take as the sifting primes $P_K = \{p \mid p \nmid K\}$. First we shall prove a form of the Selberg sieve where we have a better estimate of the remainder term. We define

$$S_K(z) = \sum_{\substack{1 \le n \le z \\ n \perp K}} \frac{\mu^2(n)}{\varphi(n)}$$

as in the proof of Theorem (3.1.2), and

$$H_K(z) = \sum_{\substack{1 \le n \le z \\ n \perp K}} \frac{\mu^2(n)\, \sigma(n)}{\varphi(n)}$$

with $\sigma(n) = \sum_{d \mid n} d$.

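LEMMA 3.2.1.

$$S(A; P_K^z, x, y) \le \frac{y}{k S_K(z)} + \frac{H_K^2(z)}{S_K^2(z)}.$$

Proof: The cardinality of the set $A_D = \{n \mid x - y < n \le x,\ n \equiv l \bmod k,\ n \equiv 0 \bmod D\}$ is $\frac{y}{kD} + R_D$. Following the proof of the Selberg sieve and using the analysis in Theorem (3.1.2) we get the first term to be $\frac{y}{k S_K(z)}$. Now the remainder term is (using $|R_d| \le 1$) at most

$$\sum_{\substack{d_i \mid P_K \\ i = 1,2}} |\lambda_{d_1} \lambda_{d_2}| = \Bigl( \sum_{d \mid P_K} |\lambda_d| \Bigr)^2.$$

In the notation of this proof we have

$$\lambda_d = \mu(d)\, \frac{d}{\varphi(d)} \cdot \frac{S_{Kd}\!\left(\frac{z}{d}\right)}{S_K(z)},$$

so

$$\sum_{d \mid P_K} |\lambda_d| = \sum_{\substack{1 \le d \le z \\ d \perp K}} \frac{\mu^2(d)\, d}{\varphi(d)} \cdot \frac{1}{S_K(z)} \sum_{\substack{1 \le m \le z/d \\ m \perp Kd}} \frac{\mu^2(m)}{\varphi(m)} = \frac{1}{S_K(z)} \sum_{\substack{1 \le d \le z \\ d \perp K}} \sum_{\substack{1 \le m \le z/d \\ m \perp Kd}} \frac{\mu^2(md)}{\varphi(md)}\, d$$

$$= \frac{1}{S_K(z)} \sum_{\substack{1 \le n \le z \\ n \perp K}} \frac{\mu^2(n)}{\varphi(n)} \sum_{d \mid n} d = \frac{H_K(z)}{S_K(z)}.$$

Hence the remainder term is at most $\frac{H_K^2(z)}{S_K^2(z)}$, and the lemma follows.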

Our aim now is to find a good upper bound on $H_K^2(z)$. One idea is to use Cauchy’s inequality on this sum, and this suggests that we first find a concrete upper bound for the sum $\sum_{n \le x,\ n \perp K} 1$, which we have seen in the last chapter. Using Theorem (3.1.2) we have

THEOREM 3.2.2. If $1 \le k < y \le x$ and $P$ is a set of primes $p$ with $k \perp p$, then we have for any $z \ge 2$ that

$$\bigl| \{n \mid x - y < n \le x,\ n \equiv l \bmod k,\ n \perp P_z\} \bigr| \le \frac{y}{k \prod_{p < z,\ p \notin P} \left(1 - \frac{1}{p}\right) \log z} + z^2.$$

LEMMA 3.2.3. Let $p(k)$ be the largest prime divisor of $k$. For $x \ge e^6$ and $p(k) \le x$ we have

$$\sum_{\substack{n \le x \\ n \perp K}} 1 < 7\, \frac{\varphi(k)}{k}\, x.$$

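Proof: Take $k = 1$, $y = x$ and $P = \{p \mid p \nmid k\}$ in Theorem (3.2.2). For $z \le x$ we have

$$\Phi_K(x) = \Bigl| \bigl\{ n : n \le x,\ \gcd\bigl(n, \textstyle\prod_{p < z,\ p \perp K} p \bigr) = 1 \bigr\} \Bigr| \le \frac{x}{\prod_{p < z,\ p \mid K} \left(1 - \frac{1}{p}\right) \log z} + z^2.$$

Thus

$$\frac{k}{\varphi(k)} \cdot \frac{\Phi_K(x)}{x} \le \frac{1}{\prod_{p \le x} \left(1 - \frac{1}{p}\right)} \cdot \frac{1}{\log z} + \frac{z^2}{x},$$

and using

$$\prod_{p \le x} \left(1 - \frac{1}{p}\right)^{-1} \le e^{\gamma} \log x \left(1 + \frac{1}{2 \log^2 x}\right)$$

and setting $z = x^{1/3}$, we get

$$\frac{k}{\varphi(k)} \cdot \frac{\Phi_K(x)}{x} \le e^{\gamma} \log x \left(1 + \frac{1}{2 \log^2 x}\right) \frac{3}{\log x} + \frac{1}{x^{1/3}}.$$

The right hand side is decreasing, and for $x = e^6$ is $< 7$.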

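LEMMA 3.2.4. For $z > 10^3$, $h$ even,

$$\frac{H_h^2(z)}{S_h^2(z)} < 22.5\, \frac{h}{\varphi(h)} \cdot \frac{z^2}{\log^2 z}.$$

Proof: Let

$$J_h(z) = \sum_{\substack{1 \le n \le z \\ n \perp h}} \frac{\mu^2(n)\, \sigma^2(n)}{\varphi^2(n)},$$

and as above let $\Phi_h(z) = \sum_{1 \le n \le z,\ n \perp h} 1$. Now

$$H_h(z) = \sum_{\substack{1 \le n \le z \\ n \perp h}} \frac{\mu^2(n)\, \sigma(n)}{\varphi(n)}.$$

Cauchy’s inequality states that

$$\Bigl( \sum_{1 \le n \le N} a_n b_n \Bigr)^2 \le \sum_{1 \le n \le N} a_n^2 \sum_{1 \le n \le N} b_n^2.$$

Using this with $b_n = 1$, $a_n = \frac{\mu^2(n) \sigma(n)}{\varphi(n)}$ and observing that $\mu^4(n) = \mu^2(n)$, we have

$$H_h^2(z) \le \Phi_h(z)\, J_h(z).$$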

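Let $n$ be an integer and $p \perp n$; then

$$\sigma(np) = \sum_{d \mid np} d = \sum_{d \mid n} d + p \sum_{d \mid n} d = \sigma(n)(1 + p),$$

and also $\varphi(np) = \varphi(n) \varphi(p)$. If $n$ is squarefree, then

$$\frac{\sigma^2(np)}{\varphi^2(np)} = \frac{\sigma^2(n)}{\varphi^2(n)} \cdot \frac{(1 + p)^2}{\varphi^2(p)} = \frac{\sigma^2(n)}{\varphi^2(n)} \cdot \frac{\varphi^2(p) + 4p}{\varphi^2(p)} = \frac{\sigma^2(n)}{\varphi^2(n)} \left(1 + \frac{4p}{\varphi^2(p)}\right).$$

By induction we have

$$\frac{\sigma^2(n)}{\varphi^2(n)} = \prod_{p \mid n} \left(1 + \frac{4p}{\varphi^2(p)}\right) = \sum_{d \mid n} \frac{4^{\nu(d)} d}{\varphi^2(d)}, \qquad \mu^2(n) = 1.$$

Since $2 \mid h$ we have $J_h(z) \le J_2(z)$ and

$$J_2(z) = \sum_{\substack{1 \le n \le z \\ n \perp 2}} \mu^2(n) \sum_{d \mid n} \frac{4^{\nu(d)} d}{\varphi^2(d)} = \sum_{\substack{1 \le d \le z \\ d \perp 2}} \frac{\mu^2(d)\, 4^{\nu(d)} d}{\varphi^2(d)} \sum_{\substack{1 \le m \le z/d \\ m \perp 2d}} \mu^2(m)$$

$$\le z \sum_{\substack{1 \le d \le z \\ d \perp 2}} \frac{\mu^2(d)\, 4^{\nu(d)}}{\varphi^2(d)} \le z \prod_{p > 2} \left(1 + \frac{4}{(p - 1)^2}\right) < \frac{16}{5}\, z.$$

In the proof of Theorem (3.1.2) we had proved $S_h(x) \ge \frac{\varphi(h)}{h} \log x$; now using this and Lemma (3.2.3) we have:

$$\frac{H_h^2(z)}{S_h^2(z)} \le \frac{7\, \frac{\varphi(h)}{h}\, z \cdot \frac{16}{5}\, z}{\frac{\varphi^2(h)}{h^2} \log^2 z} = 22.4\, \frac{z^2}{\log^2 z} \cdot \frac{h}{\varphi(h)} < 22.5\, \frac{z^2}{\log^2 z} \cdot \frac{h}{\varphi(h)}.$$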

THEOREM 3.2.5. If $x$ and $y$ are real numbers and $k$ and $l$ are integers satisfying $1 \le k < y \le x$ with $k \perp l$, then

$$\pi(x; k, l) - \pi(x - y; k, l) < \frac{3y}{\varphi(k) \log \frac{y}{k}} \tag{3.25}$$

and

$$\pi(x; k, l) - \pi(x - y; k, l) < \frac{y}{\varphi(k) \log \sqrt{y/k}} \left(1 + \frac{4}{\log \sqrt{y/k}}\right). \tag{3.26}$$

Proof: Let $\Delta(x, y, k, l) = \pi(x; k, l) - \pi(x - y; k, l)$ and $h = \frac{2k}{\gcd(2, k)}$. Then there is an $l_1$ such that $\Delta(x, y, k, l) \le \Delta(x, y, h, l_1) + 1$. For if $k$ is even, then $h = k$, and we can take $l_1 = l$. If $k$ is odd, then the parity of $mk + l$ alternates with $m$. In this case, we can set $l_1$ to be the solution to $l_1 \equiv 1 \bmod 2$ and $l_1 \equiv l \bmod k$. So at worst we miss one prime in the even subsequence.

By what we have proved so far, the sifting of the sequence A by Pz yields the following upper bound:  (x,y,k,l)≤ (x,y,h,l1)+1(3.27) ≤ y  (h)S1(z) + H2 h(z) S2 h(z) +π(z,h,l1)+1(3.28) ≤ y  (k)S1(z) + H2 h(z) S2 h(z) +π(z,h,l1)+1 for any z > 1.(3.29)

We begin with a trivial estimate

(x,y,h,l1)≤ ∑ x y<n≤x n≡l1 modh

1

y h

+1.

So  (x,y,k,l)≤ y h +2. Let u =qy k. Since  (k) =  (h)≤ 1 2h, we have  (x,y,k,l) y ≤ 1 k + 2 y . Using y = u2k we obtain

(k) (x,y,k,l) y ≤

(k) k

+

2 (k) y ≤

1 2

+

2 (k) u2k

1 2

+

2 u2

.

Thus

Q =

logqy k (k) y

(x,y,k,l)≤logu1 2

+

2 u2

<

3 2

for 1 < u≤e2.9.

Now

π(z,h,l1)+1≤ ∑ 1≤n≤z,k⊥2

μ2(k)≤ z 1 2

for z≥9.

The remainder term is at most

∑ d<z gcd(d,h)=1

μ2(d)2 ≤ ∑ d<z gcd(d,2)=1

μ2(d)2, since 2\h

≤z 1 2 2 if z≥9.

By (3.27), and the above bounds we have: Q≤logu 1 logz

+

1 u2z 1 2 2 + z 1 2

< logu 1 logz

+

z2 4u2if z≥9.

De ne ω by

u =

ω √2eω,

and set z = eω so that

Q≤

logω √2+ω ω 1+ 1 2ωfor ω≥log9. For ω≥√2e > log9 this function is decreasing, and for ω =√2e it is < 3 2. This proves (3.25). Now (3.26) is a consequence of (3.25) for u≤e8. If e8 < u < e10, then using the above bound for Q, we obtain Q≤ logω √2+ω ω 1+ 1 2ω. If u > e8, then ω < 6.4 and this gives Q < 1.4 < 1+ 4 logu. This shows (3.26) for u < e10. Now using (3.27) and setting logz = logu 2, we get logry k (Q 1)≤logulogu logz  1+48 logu u2 z2 log2 z + logu u2 z, = logu 2 logu 2 + 48 e4 logu (logu 2)2 + logu e2u, which is a decreasing function in u. In particular it is < 4 if u≥e10. This proves (3.26).
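The two inequalities of Theorem 3.2.5 can be compared with an actual prime count. The sketch below uses a plain sieve of Eratosthenes; the sample values of $x$, $y$, $k$, $l$ are assumptions chosen for illustration, and the helper names are not from the text.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]

def euler_phi(k):
    return sum(1 for n in range(1, k + 1) if math.gcd(n, k) == 1)

def primes_in_progression(x, y, k, l, primes):
    """Number of primes p with x - y < p <= x and p congruent to l mod k."""
    return sum(1 for p in primes if x - y < p <= x and p % k == l % k)

x, y, k, l = 100_000, 10_000, 7, 3
count = primes_in_progression(x, y, k, l, primes_up_to(x))
log_u = math.log(math.sqrt(y / k))
bound_325 = 3 * y / (euler_phi(k) * math.log(y / k))
bound_326 = y / (euler_phi(k) * log_u) * (1 + 4 / log_u)
print(count, round(bound_325, 1), round(bound_326, 1))
```

For a small modulus such as this one, (3.25) is the sharper of the two bounds; (3.26) is aimed at the regime where $y/k$ is large and the factor $1 + 4/\log\sqrt{y/k}$ is close to 1.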

3.3. Prelude to a theorem of Hooley

In this section we will look at a variation of a problem of Chebyshev that we shall see in the next section. The problem is to prove a lower bound on the largest prime divisor of

$$\prod_{p \le x} (p^2 - 1) = \prod_{p \le x} (p + 1) \prod_{p \le x} (p - 1).$$

We will prove the following theorem of Motohashi [Mot70].

THEOREM 3.3.1. Let $P_x$ be the largest prime divisor of

$$\prod_{p \le x} (p^2 - 1).$$

Then $P_x > x^{\theta}$ for any $\theta < 1 - \frac{1}{2} e^{-1/4}$.

Proof : In this proof q will also stand for primes, and sums or products over q will represent sums or products over primes in the range.

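Consider the product $\Xi = \prod_{p \le x} (p^2 - 1)$. Taking log on both sides, we have

$$\log \Xi = \log \prod_{p \le x} p^2 \left(1 - \frac{1}{p^2}\right) = 2 \sum_{p \le x} \log p - O\Bigl( \sum_{p \le x} \frac{1}{p^2} \Bigr) = 2x + O\!\left(x e^{-c\sqrt{\log x}}\right) - O(1).$$

Let $\pi(x, k)$ be the number of primes below $x$ such that $p^2 - 1 \equiv 0 \bmod k$. We have that $p^2 - 1 = (p + 1)(p - 1)$ and for $p > 2$ we have $\gcd(p + 1, p - 1) = 2$. If $k = q^a$, $q \ne 2$, then $p^2 - 1 \equiv 0 \bmod k$ implies that either $p + 1 \equiv 0 \bmod k$ or $p - 1 \equiv 0 \bmod k$. In this case we have $\pi(x, q^a) = \pi(x; -1, q^a) + \pi(x; +1, q^a)$. Furthermore, $\pi(x, 2) = \pi(x)$ and $\pi(x, 4) = \pi(x)$. For $a > 2$, we have $\pi(x, 2^a) = \pi(x; -1, 2^{a-1}) + \pi(x; +1, 2^{a-1})$. Using the function $\pi(x, q^a)$, we can write $\Xi$ as

$$\prod_{q^a < x} q^{\pi(x, q^a)}.$$

For if $q^a$ divides $\Xi$, then it is counted exactly $a$ times in this product. Taking logarithms we have

$$\sum_{q^a < x} \pi(x, q^a) \log q = 2x + O\!\left(x e^{-c\sqrt{\log x}}\right).$$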

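We split up the sum as follows:

$$\sum_{q^a < x} \pi(x, q^a) \log q = \sum_{\substack{q \le \sqrt{x}/\log^B x \\ a = 1}} + \sum_{\substack{\sqrt{x}/\log^B x < q \le x^{\theta} \\ a = 1}} + \sum_{\substack{x^{\theta} < q < x \\ a = 1}} + \sum_{\substack{q^a < x \\ a \ge 2}} = \Sigma_1 + \Sigma_2 + \Sigma_3 + \Sigma_4,$$

where $B$ is a positive real number. We wish to show that $\Sigma_3$ is non-zero for the value of $\theta$ claimed. Since we already have an asymptotic formula for the sum, to obtain a lower bound for $\Sigma_3$ we need upper bounds for the remaining sums. We have $\pi(x, k) \sim \frac{2 \operatorname{li} x}{\varphi(k)}$.

$\Sigma_1$: Bombieri’s Theorem, which we shall prove in Chapter 4, can be used directly to bound this sum, and we get:

$$\Sigma_1 = \frac{2x}{\log x} \sum_{q \le \sqrt{x}/\log^B x} \frac{\log q}{q - 1} + O\!\left(\frac{x}{\log x}\right) = x + O\!\left(\frac{x \log\log x}{\log x}\right).$$

$\Sigma_2$: We have from the Brun-Titchmarsh Theorem (3.2.5) that

$$\pi(x, q) \le \frac{4x}{(q - 1) \log \frac{x}{q}} \left(1 + \frac{8}{\log \frac{x}{q}}\right).$$

Hence

$$\Sigma_2 \le 4x \Bigl( \sum_{\sqrt{x}/\log^B x < q \le x^{\theta}} \frac{\log q}{(q - 1) \log \frac{x}{q}} + O\Bigl( \frac{1}{\log^2 x} \sum_{q \le x} \frac{\log q}{q} \Bigr) \Bigr),$$

and using $\sum_{p \le x} \frac{\log p}{p} \sim \log x$, we have

$$\Sigma_2 = 4x \sum_{\sqrt{x}/\log^B x < q \le x^{\theta}} \frac{\log q}{q \log \frac{x}{q}} + O\!\left(\frac{x}{\log x}\right).$$

Writing $\vartheta(x)$ for $\sum_{p \le x} \log p$, we have by partial summation:

$$\sum_{y < p \le z} \frac{\log p}{p \log \frac{x}{p}} = \sum_{y < k \le z} \frac{\vartheta(k) - \vartheta(k - 1)}{k \log \frac{x}{k}} = \sum_{y < k \le z} \vartheta(k) \Bigl( \frac{1}{k \log \frac{x}{k}} - \frac{1}{(k + 1) \log \frac{x}{k + 1}} \Bigr).$$

This sum boils down to $\sum_{y < k \le z} \frac{\vartheta(k)}{k(k + 1) \log \frac{x}{k}}$, and using $\vartheta(x) < x \left(1 + \frac{1}{2 \log x}\right)$, we get

$$\sum_{y < p \le z} \frac{\log p}{p \log \frac{x}{p}} \le \sum_{y < k \le z} \frac{1}{k \log \frac{x}{k}}.$$

Now we can bound this sum using integration to get

$$\sum_{y < p \le z} \frac{\log p}{p \log \frac{x}{p}} = \log\log \frac{x}{y} - \log\log \frac{x}{z} + o(1).$$

Thus

$$\sum_{\sqrt{x}/\log^B x < q \le x^{\theta}} \frac{\log q}{q \log \frac{x}{q}} = -\log 2(1 - \theta) + o(1),$$

and so

$$\Sigma_2 \le -4 \log 2(1 - \theta)\, x + o(x).$$

$\Sigma_4$: We split up $\Sigma_4$ into two parts,

$$\Sigma_4 = \sum_{\substack{q^a \le x^{2/3} \\ a \ge 2}} + \sum_{\substack{x^{2/3} < q^a < x \\ a \ge 2}} = \Sigma_{41} + \Sigma_{42}, \text{ say.}$$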

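Using the Brun-Titchmarsh theorem:

$$\Sigma_{41} = O\Bigl( \sum_{q \le \sqrt{x}} \log q \cdot \frac{x}{\log x} \sum_{a \ge 2} \frac{1}{\varphi(q^a)} \Bigr) = O\Bigl( \frac{x}{\log x} \sum_{q \le \sqrt{x}} \frac{\log q}{q^2} \Bigr) = O\!\left(\frac{x}{\log x}\right)$$

and

$$\Sigma_{42} = O\Bigl( \sum_{q \le \sqrt{x}} \log q \sum_{x^{2/3} < q^a < x} \frac{x}{q^a} \Bigr) = O\Bigl( x^{1/3} \sum_{q \le \sqrt{x}} \log q \cdot \frac{\log x}{\log q} \Bigr) = O\!\left(x^{5/6}\right).$$

Thus

$$\Sigma_4 = O\!\left(\frac{x}{\log x}\right).$$

From the bounds we have derived we get:

$$\Sigma_3 > \bigl(1 + 4 \log 2(1 - \theta)\bigr)\, x + o(x).$$

Hence if $1 + 4 \log 2(1 - \theta) > 0$, i.e., if

$$1 - \frac{1}{2} e^{-1/4} > \theta,$$

then there is a prime factor exceeding $x^{\theta}$.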
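The threshold in Theorem 3.3.1 is exactly the point where $1 + 4\log 2(1-\theta)$ changes sign. The sketch below evaluates it and, purely as an illustration, computes $P_x$ by trial division for a small $x$ (the value $x = 2000$ and the helper names are assumptions; the theorem itself is asymptotic).

```python
import math

theta_max = 1 - 0.5 * math.exp(-0.25)     # threshold 1 - (1/2) e^(-1/4)

def margin(theta):
    """Coefficient 1 + 4 log(2 (1 - theta)) of x in the lower bound for Sigma_3."""
    return 1 + 4 * math.log(2 * (1 - theta))

def largest_prime_factor(n):
    """Largest prime factor of n >= 2 by trial division (returns 1 for n = 1)."""
    largest, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            largest, n = p, n // p
        p += 1
    return n if n > 1 else largest

x = 2000
primes = [p for p in range(2, x + 1) if largest_prime_factor(p) == p]
# The largest prime factor of prod (p^2 - 1) is the largest one among all p - 1 and p + 1.
P_x = max(max(largest_prime_factor(p - 1), largest_prime_factor(p + 1)) for p in primes)
print(f"theta_max = {theta_max:.4f}, margin at theta_max = {margin(theta_max):.2e}")
print(f"P_x = {P_x}, x^theta_max = {x ** theta_max:.1f}")
```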

Among known improvements to this result, the best one is that the largest prime factor exceeds xθ for θ = 0.677 (see [BakHar95], [BakHar98], and also [Ho73]).

3.4. A theorem of Hooley

Chebyshev proved that if $P_x$ is the largest prime factor of $\prod_{n \le x} (n^2 + 1)$, then $\frac{P_x}{x} \to \infty$. Hooley [Ho67] (see also [Ho76]) improved the previous best known result of

$$\frac{P_x}{x} > (\log x)^{A_1 \log\log\log x}$$

by Erdős [Erd52] to $P_x > x^{11/10}$ using the Selberg sieve. In this section we shall outline the proof given by Hooley in [Ho76]. The exponent $\frac{11}{10}$ has since been improved to $\theta < 1.202\cdots$, where $\theta$ is the solution to $2 - \theta - 2\log(2 - \theta) = \frac{5}{4}$, by Deshouillers and Iwaniec [DI83] (see also [Dar96]).

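THEOREM 3.4.1 ([Ho76]). The largest prime factor of

$$\prod_{n \le x} (n^2 + 1)$$

exceeds $x^{11/10}$ for all large enough values of $x$.

Proof: Let $P_x$ be the largest prime factor of $\prod_{n \le x} (n^2 + 1)$, and set $N_x(l) = \bigl| \{n \le x \mid n^2 \equiv -1 \bmod l\} \bigr|$. We begin by finding a lower bound for $\sum_{x \le p \le P_x} N_x(p) \log p$, as in the proof of Theorem (3.3.1). We have

$$\prod_{n \le x} (n^2 + 1) = \prod_{\substack{p \le P_x \\ p^{\alpha} < x^2 + 1}} p^{N_x(p^{\alpha})}.$$

Taking logs,

$$\log \prod_{n \le x} (n^2 + 1) = \log \prod_{n \le x} n^2 \left(1 + \frac{1}{n^2}\right) > \log \bigl( \lfloor x \rfloor ! \bigr)^2 = 2x \log x + O(x)$$

by Stirling’s theorem, and so

$$\sum_{\substack{p \le P_x \\ p^{\alpha} < x^2 + 1}} N_x(p^{\alpha}) \log p > 2x \log x + O(x).$$

Now

$$\sum_{\substack{p \le P_x \\ p^{\alpha} < x^2 + 1}} N_x(p^{\alpha}) \log p = \sum_{x \le p \le P_x} N_x(p) \log p + \sum_{p \le x} N_x(p) \log p + \sum_{\substack{p \le P_x \\ \alpha > 1}} N_x(p^{\alpha}) \log p = \Sigma_A + \Sigma_B + \Sigma_C.$$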

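As before we proceed to upper-bound $\Sigma_B$ and $\Sigma_C$, thereby obtaining a lower bound for $\Sigma_A$. Now

$$N_x(l) = \sum_{\substack{n^2 + 1 \equiv 0 \bmod l \\ n \le x}} 1 = \sum_{\substack{v^2 + 1 \equiv 0 \bmod l \\ 0 < v \le l}} \ \sum_{\substack{n \equiv v \bmod l \\ n \le x}} 1.$$

Let $\rho(l)$ be the number of solutions to the congruence $v^2 + 1 \equiv 0 \bmod l$. Then since $\sum_{n \equiv v \bmod l,\ n \le x} 1 - \frac{x}{l} = O(1)$, we have

$$N_x(l) = \frac{x \rho(l)}{l} + O(\rho(l)).$$

Now $\rho(2) = 1$. By Euler’s criterion, $\left(\frac{-1}{p}\right) \equiv (-1)^{\frac{p - 1}{2}} \bmod p$, so the congruence $v^2 + 1 \equiv 0 \bmod p$ has no solutions for $p \equiv 3 \bmod 4$ and has exactly two solutions for $p \equiv 1 \bmod 4$. We conclude

$$\rho(p) = \begin{cases} 2 & \text{if } p \equiv 1 \bmod 4, \\ 0 & \text{if } p \equiv 3 \bmod 4. \end{cases}$$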
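The values of $\rho$ can be confirmed by brute force. This is a minimal sketch (the function name `rho` mirrors the text's notation; the range of primes tested is arbitrary).

```python
def rho(l):
    """Number of residues v modulo l with v^2 + 1 congruent to 0 mod l."""
    return sum(1 for v in range(l) if (v * v + 1) % l == 0)

def is_prime(n):
    return n >= 2 and all(n % p for p in range(2, int(n ** 0.5) + 1))

for p in (q for q in range(2, 60) if is_prime(q)):
    print(p, p % 4, rho(p))
```

Since $\rho$ is multiplicative, the same function also gives, for example, $\rho(65) = \rho(5)\rho(13) = 4$.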

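The needed bounds are given by:

$$\Sigma_B = x \sum_{p \le x} \frac{\rho(p) \log p}{p} + O\Bigl( \sum_{p \le x} \rho(p) \log p \Bigr) = 2x \sum_{\substack{p \le x \\ p \equiv 1 \bmod 4}} \frac{\log p}{p} + O(x) + O\Bigl( \sum_{p \le x} \log p \Bigr) = x \log x + O(x),$$

using

$$\sum_{\substack{p \le x \\ p \equiv l \bmod k}} \frac{\log p}{p} = \frac{1}{\varphi(k)} \log x + O(1),$$

and

$$\Sigma_C = O\Bigl( \sum_{p \le \sqrt{x^2 + 1}} \log p \sum_{2 \le \alpha} \Bigl( \frac{x}{p^{\alpha}} + 1 \Bigr) \Bigr) = O\Bigl( x \sum_p \frac{\log p}{p^2 \left(1 - \frac{1}{p}\right)} \Bigr) = O\Bigl( x \sum_p \frac{\log p}{p(p - 1)} \Bigr) = O(x)$$

since the sum converges. Thus we get $\Sigma_A > x \log x + O(x)$. Our next task is to upper-bound the sum $T_x(y) = \sum_{x < p \le y} N_x(p) \log p$, which in conjunction with the above lower bound will yield a lower bound for $y$. It turns out that to estimate $T_x(y)$ effectively, we need to split up the sum into two parts and evaluate each of them separately. To this end let $X = x^{1/11}$, and assume that $x^{12/11} < y < x^2$. Then

$$T_x(y) = \sum_{x < p \le xX} N_x(p) \log p + \sum_{xX < p \le y} N_x(p) \log p = T_x(xX) + T'_x(y).$$

To evaluate $T_x(xX)$, we let $V_x(v) = \sum_{v < p \le ev} N_x(p)$. Then

$$T_x(xX) = \sum_{0 \le \alpha < \log X} \ \sum_{x e^{\alpha} < p \le x e^{\alpha + 1}} N_x(p) \log p \le \sum_{0 \le \alpha < \log X} \log(x e^{\alpha + 1})\, V_x(x e^{\alpha}).$$

Now for the sum $T'_x(y)$, using the definition of $N_x(l)$, we have:

$$T'_x(y) = \sum_{\substack{xX < p \le y \\ pm = n^2 + 1 \\ n \le x}} \log p = \sum_{m > \frac{x^2}{y \log^8 x}} \log p + \sum_{m \le \frac{x^2}{y \log^8 x}} \log p = T''_x(y) + T'''_x(y) \text{ (say).}$$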

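Now the conditions of the summation $T'''_x(y)$ yield $m \le \frac{x^2}{y \log^8 x}$, and so

$$n < \sqrt{pm} \le \sqrt{y \cdot \frac{x^2}{y \log^8 x}} = \frac{x}{\log^4 x}.$$

Since $m \le n$, we have $m \le \frac{x}{\log^4 x}$. Using this we have

$$T'''_x(y) \le 2 \log x \sum_{\substack{lm = n^2 + 1 \\ m, n \le x/\log^4 x}} 1 = 2 \log x \sum_{m \le x/\log^4 x} N_{x/\log^4 x}(m).$$

Now if $m = \prod_i p_i^{h_i}$, then $\rho(m) = \prod_i \rho(p_i^{h_i})$, and each of the individual terms is bounded by a constant. So $\rho(m) \le 2^{\nu(m)}$, and this itself is upper bounded by $d(m)$, i.e. the number of divisors of $m$. Therefore:

$$T'''_x(y) \le \frac{2x}{\log^3 x} \sum_{m \le x/\log^4 x} \frac{\rho(m)}{m} + O\Bigl( \log x \sum_{m \le x/\log^4 x} \rho(m) \Bigr) = O\Bigl( \frac{x}{\log^3 x} \sum_{m \le x/\log^4 x} \frac{\rho(m)}{m} \Bigr) = O\Bigl( \frac{x}{\log^3 x} \sum_{m \le x} \frac{d(m)}{m} \Bigr).$$

Now consider

$$\Bigl( \sum_{1 \le n \le x} \frac{1}{n} \Bigr) \Bigl( \sum_{1 \le m \le x} \frac{1}{m} \Bigr) = \sum_{\substack{1 \le n \le x^2 \\ n \text{ is } x\text{-smooth}}} \frac{1}{n} \sum_{\substack{u, v \le x \\ uv = n}} 1 \ge \sum_{1 \le n \le x} \frac{d(n)}{n}.$$

This yields $\sum_{1 \le n \le x} \frac{d(n)}{n} = O(\log^2 x)$, and so

$$T'''_x(y) = O\!\left(\frac{x}{\log x}\right).$$

In $T''_x(y)$, we have $m > \frac{x^2}{y \log^8 x}$ and $pm \le x^2 + 1$, so $m \le \frac{x^2 + 1}{p}$. Furthermore $p > xX$, and so $m \le \frac{x}{X} \left(1 + \frac{1}{x^2}\right) \le \frac{ex}{X}$. Thus we have

$$T''_x(y) \le \sum_{\substack{\frac{x^2}{y \log^8 x} < m \le \frac{ex}{X} \\ pm = n^2 + 1 \\ n \le x,\ p \ge xX}} \log \frac{e x^2}{m}.$$

Let

$$W_x(w) = \sum_{\substack{w < m \le ew \\ pm = n^2 + 1 \\ n \le x,\ p \ge x}} 1.$$

Then

$$T''_x(y) \le \sum_{0 \le \alpha < \log Y} \log\bigl(xX e^{\alpha + 1}\bigr)\, W_x\!\left(\frac{x e^{-\alpha}}{X}\right),$$

where $Y = \frac{e y \log^8 x}{xX}$. Finally,

$$T'_x(y) \le \sum_{0 \le \alpha < \log Y} \log\bigl(xX e^{\alpha + 1}\bigr)\, W_x\!\left(\frac{x e^{-\alpha}}{X}\right) + O\!\left(\frac{x}{\log x}\right).$$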

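We will format the sums involved for application of the Selberg sieve. Let $\lambda$ be a squarefree number, and define

$$\Upsilon(u; \lambda) = \sum_{u < \lambda k \le eu} N_x(\lambda k).$$

We impose the conditions $x^{4/5} < u < x^{4/3}$ and $\lambda < \min\left\{ \frac{u^{5/4}}{x}, \frac{x}{u^{3/4}} \right\}$. By a rather ingenious and elaborate argument Hooley showed that

$$\Upsilon(u; \lambda) = \frac{3x \rho(\lambda)}{2\pi\lambda} \cdot \frac{1}{\prod_{p \mid \lambda} \left(1 + \frac{1}{p}\right)} + O\!\left(x^{\frac{1}{2} + \varepsilon} u^{\frac{3}{8}} \lambda^{-\frac{1}{2}}\right)$$

(see [Ho76] §2.3 - §2.6). Since the argument is not central to our application of the sieve, we exclude the derivation of this bound here.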

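Application of the Sieve: Let $x \le v < x^{12/11}$, so that $v$ satisfies the conditions on $u$ imposed by our bounds on $\Upsilon(u; \lambda)$. Let $d$ denote a squarefree number, and let $\lambda_d$ be the Selberg coefficients. Then

$$V_x(v) \le \sum_{v < l \le ev} N_x(l) \Bigl( \sum_{d \mid l} \lambda_d \Bigr)^2 = \sum_{d_1, d_2 \le z} \lambda_{d_1} \lambda_{d_2} \sum_{\substack{v < l \le ev \\ l \equiv 0 \bmod \operatorname{lcm}(d_1, d_2)}} N_x(l) = \sum_{d_1, d_2 \le z} \lambda_{d_1} \lambda_{d_2}\, \Upsilon\bigl(v; \operatorname{lcm}(d_1, d_2)\bigr)$$

(since $\operatorname{lcm}(d_1, d_2) < x v^{-3/4}$)

$$\le \frac{3x}{2\pi} \sum_{d_1, d_2 \le z} \frac{\lambda_{d_1} \lambda_{d_2}\, \omega(\operatorname{lcm}(d_1, d_2))}{\operatorname{lcm}(d_1, d_2)} + O\Bigl( x^{\frac{1}{2} + \varepsilon} v^{\frac{3}{8}} \sum_{d_1, d_2 \le z} \frac{|\lambda_{d_1}| |\lambda_{d_2}|}{\sqrt{\operatorname{lcm}(d_1, d_2)}} \Bigr).$$

Here

$$\omega(d) = \frac{\rho(d)}{\prod_{p \mid d} \left(1 + \frac{1}{p}\right)},$$

which is clearly multiplicative. So we can apply Selberg’s sieve without modification, except that the remainder term is more clearly specified in this case. Thus by Theorem (3.1.1), we have

$$V_x(v) \le \frac{3x}{2\pi G(z)} + R,$$

where $R$ is the remainder term. Now

$$G(z) = \sum_{d < z} \mu^2(d)\, g(d),$$

and, for $p \equiv 1 \bmod 4$,

$$g(p) = \frac{\omega(p)}{p \left(1 - \frac{\omega(p)}{p}\right)} = \frac{2 \left(1 + \frac{1}{p}\right)^{-1}}{p \left(1 - \frac{2}{p} \left(1 + \frac{1}{p}\right)^{-1}\right)} = \frac{2}{p \left(1 - \frac{1}{p}\right)}.$$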

76 3. SELBERG’S SIEVE

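Thus

\[ g(d) = \frac{\rho(d)}{d}\prod_{\substack{p\mid d \\ p\ne 2,\ p\equiv 1 \bmod 4}}\Bigl(1-\frac{1}{p}\Bigr)^{-1} = \sum_{d'}\frac{\rho(dd')}{dd'}, \]

where $d'$ indicates any number whose prime factors divide $d$. Also $\rho(2^\alpha) = 0$ if $\alpha > 1$, and we have

\[ \sum_{d\le z}\mu^2(d)g(d) = \sum_{d\le z}\sum_{d'}\frac{\rho(dd')}{dd'} \ge \sum_{m\le z}\frac{\rho(m)}{m} \ge \frac{3(1-\eta_1)}{2\pi}\log z, \]

where $\eta_1 < 1$ can be chosen very small. Here we have used

\[ \sum_{m\le z}\rho(m) = \frac{3z}{2\pi} + O(z^{3/4}) \]

(which is proved in [Ho76] p. 32) and partial summation. Also the remainder term can be bounded as follows, writing $d = \gcd(d_1,d_2)$, $d_1 = dl_1$ and $d_2 = dl_2$:

\[ R = O\Bigl(x^{1/2+\varepsilon}v^{3/8}\sum_{d_1,d_2\le z}\frac{1}{\sqrt{\mathrm{lcm}(d_1,d_2)}}\Bigr) = O\Bigl(x^{1/2+\varepsilon}v^{3/8}\sum_{d\le z}\sum_{\substack{l_1,l_2\le z/d \\ l_1\perp l_2}}\frac{1}{\sqrt{dl_1l_2}}\Bigr) \]
\[ = O\Bigl(x^{1/2+\varepsilon}v^{3/8}\sum_{d\le z}\frac{1}{\sqrt{d}}\sum_{\substack{l_1,l_2\le z/d \\ l_1\perp l_2}}\frac{1}{\sqrt{l_1l_2}}\Bigr) = O\Bigl(x^{1/2+\varepsilon}v^{3/8}\sum_{d\le z}\frac{z}{d^{3/2}}\Bigr) = O\bigl(x^{1/2+\varepsilon}v^{3/8}z\bigr). \]

Selecting $z = x^{1/2-\eta}v^{-3/8}$, we get

\[ V_x(v) < \frac{(1+\eta_2)x}{\log\bigl(\sqrt{x}\,v^{-3/8}\bigr)}, \]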

where η2 can be made arbitrarily small. Similarly

Wx(w)≤ ∑ w<m≤ew∑ d\l

λd2

= ∑ d1,d2≤z

λd1λd2 ∑ w<m≤ewmr ×lcm(d1,d2)=n2+1 n≤x

1

= ∑ d1,d2≤z

λd1λd2 lcm(d1,d2) w;lcm(d1,d2).

Carrying through the sieve estimate, we get with $z = x^{2/7-\eta}w^{-3/14}$ that

\[ W_x(w) < \frac{(1+\eta_2)x}{\log\bigl(x^{2/7}w^{-3/14}\bigr)}. \]

Let $y = x^{11/10}$ and $\gamma = \log x$. Using the above estimates, we find that

\[ T_x(xX) < x(1+\eta_2)\sum_{0\le\alpha<\log X}\frac{\alpha+\gamma+1}{\frac{1}{8}\gamma-\frac{3}{8}\alpha} < 0.8902\,x\log x, \]

where we have used integration to upper-bound the sum. Similarly we find $T'_x(y) < 0.1081\,x\log x$ for large enough $x$. Thus we get

\[ T_x(x^{11/10}) < 0.9983\,x\log x, \]

and so the largest prime factor of $\prod_{n\le x}(n^2+1)$ exceeds $x^{11/10}$, for all large enough values of $x$.
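The conclusion can be explored numerically for small $x$. The following is a minimal sketch, not part of the original argument: the helper names `largest_prime_factor` and `largest_prime_factor_of_product` are hypothetical, and plain trial division is assumed for simplicity. Since the theorem is asymptotic, small cases only illustrate it and prove nothing.

```python
def largest_prime_factor(m: int) -> int:
    """Largest prime factor of m >= 2, found by trial division."""
    largest = 1
    d = 2
    while d * d <= m:
        while m % d == 0:
            largest = d
            m //= d
        d += 1
    # Whatever is left above 1 is a prime larger than every divisor removed so far.
    return m if m > 1 else largest


def largest_prime_factor_of_product(x: int) -> int:
    """Largest prime dividing prod_{n <= x} (n^2 + 1)."""
    return max(largest_prime_factor(n * n + 1) for n in range(1, x + 1))


if __name__ == "__main__":
    for x in (10, 100, 1000):
        P = largest_prime_factor_of_product(x)
        print(x, P, P > x ** 1.1)
```

For such small $x$ the comparison is easy because $n^2+1$ is itself prime for many $n$; the content of the theorem lies in the uniform statement for all large $x$.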


CHAPTER 4

The Large Sieve

The Selberg sieve does not give good bounds if we sieve out a large number of residue classes modulo each prime in the sifting set. The large sieve was designed to handle this problem (hence the name). The bounds are derived by relating the properties of the integer sequence to the behavior of certain exponential sums.

4.1. Bounds on exponential sums

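Define $e(t) = e^{2\pi i t}$. We have $e\bigl(\frac{n}{q}\bigr) = e\bigl(\frac{m}{q}\bigr)$ if $n \equiv m \bmod q$. The following property of the exponential function resembles that of the Möbius function, and is useful to study the distribution of a sequence of integers in residue classes modulo some number.

PROPOSITION 4.1.1.

\[ \sum_{1\le a\le q} e\Bigl(\frac{an}{q}\Bigr) = \begin{cases} q, & \text{if } n\equiv 0 \bmod q, \\ 0, & \text{otherwise.} \end{cases} \]

Proof : If $n\equiv 0 \bmod q$, then $e\bigl(\frac{an}{q}\bigr) = 1$ for each $a$. So $\sum_{1\le a\le q} e\bigl(\frac{an}{q}\bigr) = q$. If $n\not\equiv 0 \bmod q$, then

\[ \sum_{1\le a\le q} e\Bigl(\frac{an}{q}\Bigr) = \sum_{0\le a\le q-1} e\Bigl(\frac{an}{q}\Bigr) = \frac{e\bigl(\frac{qn}{q}\bigr)-1}{e\bigl(\frac{n}{q}\bigr)-1} = 0. \]

Let $a_1,\dots,a_z$ be a sequence of integers, and define

\[ Z(q,h) = \bigl|\{\, i \mid 1\le i\le z,\ a_i\equiv h \bmod q \,\}\bigr| \]

and $S(x) = \sum_{1\le i\le z} e(a_i x)$. Now for all integers $a$ we have

\[ S\Bigl(\frac{a}{q}\Bigr) = \sum_{1\le h\le q} Z(q,h)\, e\Bigl(\frac{ah}{q}\Bigr). \tag{4.30} \]

Suppose all the integers in the sequence are distributed evenly among the residue classes modulo $q$, so that $Z(q,h)$ does not depend on $h$; then using Proposition 4.1.1 we have

\[ S\Bigl(\frac{a}{q}\Bigr) = Z(q,h)\sum_{1\le h\le q} e\Bigl(\frac{ah}{q}\Bigr) = 0, \quad \text{if } a\not\equiv 0 \bmod q. \]

If on the other hand all the integers $a_i$ belong to a single residue class modulo $q$, then $\bigl|S\bigl(\frac{a}{q}\bigr)\bigr| = z$ for all integers $a$. Hence the distribution of the integers among the residue classes is related to $\bigl|S\bigl(\frac{a}{q}\bigr)\bigr|$. In fact, we can express $Z(q,h)$ in terms of $S\bigl(\frac{a}{q}\bigr)$ as follows:

\[ S\Bigl(\frac{a}{q}\Bigr) e\Bigl(-\frac{h'a}{q}\Bigr) = \sum_{1\le h\le q} Z(q,h)\, e\Bigl(\frac{ah}{q}\Bigr) e\Bigl(-\frac{h'a}{q}\Bigr), \]

and therefore

\[ \sum_{1\le a\le q} S\Bigl(\frac{a}{q}\Bigr) e\Bigl(-\frac{h'a}{q}\Bigr) = \sum_{1\le h\le q} Z(q,h)\sum_{1\le a\le q} e\Bigl(\frac{a(h-h')}{q}\Bigr) = Z(q,h')\,q. \]

Hence

\[ qZ(q,h) = \sum_{1\le a\le q} S\Bigl(\frac{a}{q}\Bigr) e\Bigl(-\frac{ha}{q}\Bigr). \tag{4.31} \]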
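Proposition 4.1.1 and the inversion formula (4.31) are easy to check numerically. The sketch below is only an illustration; the function names `e`, `S`, `residue_counts` and `counts_from_exponential_sums` are our own, and floating-point arithmetic is assumed to be accurate enough for small inputs.

```python
import cmath


def e(t: float) -> complex:
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def S(seq, x: float) -> complex:
    """S(x) = sum_i e(a_i * x) for the integer sequence seq."""
    return sum(e(a * x) for a in seq)


def residue_counts(seq, q: int) -> dict:
    """Z(q, h) for h = 1..q, where h = q stands for the residue class 0."""
    counts = {h: 0 for h in range(1, q + 1)}
    for a in seq:
        h = a % q
        counts[h if h != 0 else q] += 1
    return counts


def counts_from_exponential_sums(seq, q: int) -> dict:
    """Recover Z(q, h) from the values S(a/q) through formula (4.31)."""
    recovered = {}
    for h in range(1, q + 1):
        total = sum(S(seq, a / q) * e(-h * a / q) for a in range(1, q + 1))
        recovered[h] = total.real / q
    return recovered


if __name__ == "__main__":
    sequence = [1, 4, 6, 7, 11, 13, 20]
    print(residue_counts(sequence, 5))
    print(counts_from_exponential_sums(sequence, 5))
```

The two printed dictionaries agree up to rounding error, which is exactly the statement that the values $S(a/q)$ determine the distribution of the sequence modulo $q$.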

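It turns out that useful upper bounds can be obtained for the sum

\[ \sum_{p\le x}\ \sum_{1\le a\le p-1}\Bigl|S\Bigl(\frac{a}{p}\Bigr)\Bigr|^2 \]

that are largely independent of the integer sequence used to define $S(x)$.

We first prove a result that shows how the above sums are related to the distribution of the integer sequence in the residue classes.

LEMMA 4.1.2. For all integers $q\ge 2$,

\[ \sum_{1\le a\le q-1}\Bigl|S\Bigl(\frac{a}{q}\Bigr)\Bigr|^2 = q\sum_{1\le h\le q}\Bigl(Z(q,h)-\frac{z}{q}\Bigr)^2. \]

Proof :

\[ \sum_{1\le a\le q-1}\Bigl|S\Bigl(\frac{a}{q}\Bigr)\Bigr|^2 = \sum_{1\le a\le q-1}\Bigl(\sum_{1\le h\le q} Z(q,h)\,e\Bigl(\frac{ah}{q}\Bigr)\Bigr)\Bigl(\sum_{1\le k\le q} Z(q,k)\,e\Bigl(-\frac{ka}{q}\Bigr)\Bigr) \]
\[ = \sum_{1\le a\le q-1}\ \sum_{1\le h,k\le q} Z(q,h)Z(q,k)\,e\Bigl(\frac{a(h-k)}{q}\Bigr) = \sum_{1\le h,k\le q} Z(q,h)Z(q,k)\Bigl(\sum_{1\le a\le q-1} e\Bigl(\frac{a(h-k)}{q}\Bigr)\Bigr). \]

It is easy to see that

\[ \sum_{1\le a\le q-1} e\Bigl(\frac{a(h-k)}{q}\Bigr) = \begin{cases} q-1, & \text{if } h\equiv k \bmod q, \\ -1, & \text{otherwise.} \end{cases} \]

Thus

\[ \sum_{1\le a\le q-1}\Bigl|S\Bigl(\frac{a}{q}\Bigr)\Bigr|^2 = q\sum_{1\le h\le q} Z(q,h)^2 - \sum_{1\le h,k\le q} Z(q,h)Z(q,k) = q\sum_{1\le h\le q} Z(q,h)^2 - \Bigl(\sum_{1\le h\le q} Z(q,h)\Bigr)^2 \]
\[ = q\sum_{1\le h\le q} Z(q,h)^2 - z^2 = q\sum_{1\le h\le q}\Bigl(Z(q,h)^2 - \frac{2zZ(q,h)}{q} + \frac{z^2}{q^2}\Bigr) = q\sum_{1\le h\le q}\Bigl(Z(q,h)-\frac{z}{q}\Bigr)^2. \]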
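Lemma 4.1.2 says that the exponential sums measure the variance of the sequence over the residue classes. The following sketch (the function name `lemma_4_1_2_sides` is hypothetical, and floating-point rounding is assumed to be negligible) evaluates both sides of the identity for a given integer sequence.

```python
import cmath


def lemma_4_1_2_sides(seq, q: int):
    """Return (left side, right side) of Lemma 4.1.2 for the integer sequence seq."""
    z = len(seq)

    def S(x: float) -> complex:
        return sum(cmath.exp(2j * cmath.pi * a * x) for a in seq)

    # Left side: sum of |S(a/q)|^2 over a = 1, ..., q - 1.
    lhs = sum(abs(S(a / q)) ** 2 for a in range(1, q))
    # Right side: q times the sum of squared deviations of Z(q, h) from z / q.
    counts = [sum(1 for a in seq if a % q == h) for h in range(q)]
    rhs = q * sum((c - z / q) ** 2 for c in counts)
    return lhs, rhs


if __name__ == "__main__":
    print(lemma_4_1_2_sides([2, 3, 5, 7, 11, 13, 17, 19, 23], 6))
```

A sequence spread evenly over the classes makes both sides vanish, while a sequence concentrated in one class makes them as large as possible.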

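We will look at exponential sums of the form

\[ S(x) = \sum_{-K\le n\le K} a_n e(nx), \]

where $K$ is a positive integer and $a_n\in\mathbb{C}$.

Notation : We write $\|t\|$ to mean the distance from $t$ to the nearest integer, i.e., $\|t\| = \min_n |t-n| = \bigl|\lfloor t+\tfrac{1}{2}\rfloor - t\bigr|$.

THEOREM 4.1.3 ([Gal67]). If $S(x) = \sum_{-K\le n\le K} a_n e(nx)$ and $x_1,\dots,x_R$ are real numbers such that $\|x_r-x_s\|\ge\delta>0$ for $r\ne s$, then

\[ \sum_{1\le r\le R}|S(x_r)|^2 \le (\delta^{-1}+2\pi K)\sum_{-K\le n\le K}|a_n|^2. \]

Proof : For any $u$ we can write

\[ S^2(x_r) = S^2(u) + 2\int_u^{x_r} S'(t)S(t)\,dt. \]

Using this we have

\[ |S^2(x_r)| \le |S^2(u)| + 2\Bigl|\int_u^{x_r}|S'(t)S(t)|\,dt\Bigr|. \]

We now integrate with respect to $u$ over the interval $I_r = \bigl[x_r-\frac{\delta}{2},\,x_r+\frac{\delta}{2}\bigr]$, to get

\[ \delta|S(x_r)|^2 \le \int_{I_r}|S(u)|^2\,du + 2\int_{I_r}\Bigl|\int_u^{x_r}|S'(t)S(t)|\,dt\Bigr|\,du. \]

Then

\[ \int_{I_r}\Bigl|\int_u^{x_r}|S'(t)S(t)|\,dt\Bigr|\,du = \int_{x_r}^{x_r+\delta/2}\int_{x_r}^{u}|S'(t)S(t)|\,dt\,du + \int_{x_r-\delta/2}^{x_r}\int_u^{x_r}|S'(t)S(t)|\,dt\,du \]
\[ = \int_{x_r}^{x_r+\delta/2}|S'(t)S(t)|\Bigl(x_r+\frac{\delta}{2}-t\Bigr)dt + \int_{x_r-\delta/2}^{x_r}|S'(t)S(t)|\Bigl(t-x_r+\frac{\delta}{2}\Bigr)dt \le \frac{\delta}{2}\int_{I_r}|S'(t)S(t)|\,dt. \]

Thus

\[ \delta|S(x_r)|^2 \le \int_{I_r}|S(u)|^2\,du + \delta\int_{I_r}|S'(t)S(t)|\,dt. \]

By our condition on the numbers $x_r$ the intervals $I_r$ are disjoint modulo 1, meaning that if $r\ne s$, then no point of $I_r$ differs by an integer from a point of $I_s$. Since $S$ is periodic with period 1 and the integrands are non-negative, the value of each integral over $\bigcup_r I_r$ is upper bounded by its integral over $[0,1]$. Thus summing over $r$:

\[ \delta\sum_{1\le r\le R}|S(x_r)|^2 \le \int_0^1|S(t)|^2\,dt + \delta\int_0^1|S'(t)S(t)|\,dt. \]

Let us analyze the first integral. The exponential function satisfies

\[ \int_0^1 e(nx)\,dx = \begin{cases} 1 & \text{if } n=0, \\ 0 & \text{otherwise.} \end{cases} \]

We have

\[ \int_0^1|S(x)|^2\,dx = \int_0^1 S(x)\overline{S(x)}\,dx = \int_0^1\sum_{-K\le m,n\le K} a_n\overline{a_m}\,e((n-m)x)\,dx = \sum_{-K\le n\le K}|a_n|^2. \]

Thus the first integral is $\sum_{-K\le n\le K}|a_n|^2$. The second satisfies:

\[ \int_0^1|S'(t)S(t)|\,dt \le \Bigl(\int_0^1|S(t)|^2\,dt\Bigr)^{1/2}\Bigl(\int_0^1|S'(t)|^2\,dt\Bigr)^{1/2}, \]

and on substituting $S'(t)$ by $\sum_{-K\le n\le K} 2\pi i\,n\,a_n e(nt)$, the right-hand side becomes

\[ \Bigl(\sum_{-K\le n\le K}|a_n|^2\Bigr)^{1/2}\Bigl(\sum_{-K\le n\le K}|2\pi n a_n|^2\Bigr)^{1/2} \le 2\pi K\sum_{-K\le n\le K}|a_n|^2. \]

Thus

\[ \delta\sum_{1\le r\le R}|S(x_r)|^2 \le (1+\delta\,2\pi K)\sum_{-K\le n\le K}|a_n|^2, \]

and dividing by $\delta$ gives the theorem.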
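The inequality of Theorem 4.1.3 can be tested on well-spaced points. The sketch below is an illustration under our own assumptions: the points are taken to be the reduced fractions $a/q$ with $q\le Q$, which are spaced at least $1/Q^2$ apart modulo 1, and the names `farey_points` and `gallagher_sides` are hypothetical.

```python
import cmath
from math import gcd, pi


def farey_points(Q: int):
    """Reduced fractions a/q with 1 <= a <= q <= Q; pairwise distance mod 1 is >= 1/Q^2."""
    return [a / q for q in range(1, Q + 1) for a in range(1, q + 1) if gcd(a, q) == 1]


def gallagher_sides(coeffs, K: int, points, delta: float):
    """Both sides of Theorem 4.1.3, where coeffs[n + K] holds a_n for -K <= n <= K."""

    def S(x: float) -> complex:
        return sum(coeffs[n + K] * cmath.exp(2j * pi * n * x) for n in range(-K, K + 1))

    lhs = sum(abs(S(x)) ** 2 for x in points)
    rhs = (1 / delta + 2 * pi * K) * sum(abs(a) ** 2 for a in coeffs)
    return lhs, rhs


if __name__ == "__main__":
    K, Q = 6, 5
    coefficients = [complex(n % 3 - 1, (n * n) % 5 - 2) for n in range(2 * K + 1)]
    print(gallagher_sides(coefficients, K, farey_points(Q), 1 / Q ** 2))
```

The left side is the quantity the large sieve controls; the right side depends on the coefficients only through $\sum|a_n|^2$.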

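There is a stronger bound on the sum $\sum_{1\le r\le R}|S(x_r)|^2$ due to Montgomery. To prove this we require the following result.

THEOREM 4.1.4. Let $\Phi_1,\dots,\Phi_R$ and $\xi$ be arbitrary vectors in an inner product space $V$ over the complex numbers. Then

\[ \sum_{1\le r\le R}|(\xi,\Phi_r)|^2 \le A\|\xi\|^2, \quad\text{where}\quad A = \max_r\sum_{1\le s\le R}|(\Phi_r,\Phi_s)|. \]

THEOREM 4.1.5. Let $S(x)$ be as above, and let $x_1,\dots,x_R$ be real numbers with $\|x_r-x_s\|\ge\delta>0$ for $r\ne s$. Then

\[ \sum_{1\le r\le R}|S(x_r)|^2 \le (2K+3\delta^{-1})\sum_{-K\le k\le K}|a_k|^2. \]

Proof : If $R=1$ we have

\[ |S(x)|^2 \le N\sum_{M+1\le n\le M+N}|a_n|^2 \]

by Cauchy's inequality (here with $N = 2K+1$ terms). Hence we may assume $R\ge 2$, so $\delta\le\frac{1}{2}$. We apply Theorem (4.1.4) with the inner product defined to be $(\varphi,\psi) = \sum_k\varphi_k\overline{\psi_k}$. Take $\xi = \{a_k b_k^{-1/2}\}_{-K\le k\le K}$ and $\varphi_r = \{b_k^{1/2}e(-kx_r)\}_{-\infty<k<\infty}$, where $b_k$ will be defined later to be positive for $-K\le k\le K$, and non-negative for other $k$. Using Theorem (4.1.4) we have

\[ \sum_{1\le r\le R}|S(x_r)|^2 \le A\sum_{-K\le k\le K}|a_k|^2 b_k^{-1}, \]

where $A = \max_r\sum_{1\le s\le R}|B(x_r-x_s)|$ and $B(x) = \sum_{-\infty<k<\infty} b_k e(kx)$. To finish the proof it suffices to pick $b_k$ such that $b_k\ge 1$ for $-K\le k\le K$ and such that

\[ \sum_{1\le s\le R}|B(x_r-x_s)| \le 2K+3\delta^{-1} \]

for all $r$. If we took $b_k=1$ for $-K\le k\le K$ and $b_k=0$ otherwise, we would get the inferior estimate $\sum_{1\le s\le R}|B(x_r-x_s)| \le 2K+O(\delta^{-1}\log\delta^{-1})$. Instead, take $b_k$ to be

\[ b_k = \begin{cases} 1 & \text{if } |k|\le K, \\ 1-\dfrac{|k|-K}{L} & \text{if } K\le|k|\le K+L, \\ 0 & \text{if } |k|\ge K+L, \end{cases} \]

where $L$ will be selected later. Using the identity

\[ \sum_{|j|\le J}(J-|j|)\,e(jx) = \Bigl|\sum_{1\le j\le J} e(jx)\Bigr|^2 = \Bigl(\frac{\sin\pi Jx}{\sin\pi x}\Bigr)^2, \]

we can write

\[ B(x) = \frac{1}{L\sin^2\pi x}\Bigl((\sin\pi(K+L)x)^2 - (\sin\pi Kx)^2\Bigr). \]

Hence $B(0) = 2K+L$, and

\[ |B(x)| \le \frac{1}{L\sin^2\pi x} \le \frac{1}{4L\|x\|^2}, \]

so that

\[ \sum_{1\le s\le R}|B(x_r-x_s)| \le 2K+L+2\sum_{1\le h}\frac{1}{4Lh^2\delta^2}. \]

Since $\sum_{1\le h}\frac{1}{h^2} = \frac{\pi^2}{6} < 2$, we have

\[ \sum_{1\le s\le R}|B(x_r-x_s)| \le 2K+L+\frac{1}{L\delta^2} \le 2K+\frac{3}{\delta} \]

upon taking $L$ to be the least integer $\ge\delta^{-1}$.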
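The properties of the trapezoidal weights $b_k$ used in this proof can be verified directly. The sketch below (with hypothetical helper names `b`, `B_direct` and `B_closed`) compares the defining sum for $B(x)$ with the closed form, and checks $B(0) = 2K+L$ and the pointwise bound $|B(x)|\le 1/(4L\|x\|^2)$ at a few sample points.

```python
import math


def b(k: int, K: int, L: int) -> float:
    """Trapezoidal weight: 1 on |k| <= K, decaying linearly to 0 at |k| = K + L."""
    k = abs(k)
    if k <= K:
        return 1.0
    if k <= K + L:
        return 1.0 - (k - K) / L
    return 0.0


def B_direct(x: float, K: int, L: int) -> float:
    """B(x) = sum_k b_k e(kx); the sum is real because b_k is even in k."""
    return sum(b(k, K, L) * math.cos(2 * math.pi * k * x) for k in range(-(K + L), K + L + 1))


def B_closed(x: float, K: int, L: int) -> float:
    """Closed form of B(x), valid when x is not an integer."""
    s = math.sin(math.pi * x)
    return (math.sin(math.pi * (K + L) * x) ** 2 - math.sin(math.pi * K * x) ** 2) / (L * s * s)


if __name__ == "__main__":
    K, L = 5, 4
    print(B_direct(0.0, K, L), 2 * K + L)
    for x in (0.137, 0.31, 0.5):
        print(x, B_direct(x, K, L), B_closed(x, K, L))
```

The rapid decay of $B$ away from the integers is what replaces the factor $\log\delta^{-1}$ of the naive choice $b_k = 1$ by a constant.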


Consider the sum $S(x) = \sum_{M+1\le n\le M+N} a_n e(nx)$. The value of $M$ is irrelevant to the magnitude of this sum, since for any $K$ we can set

\[ T(x) = \sum_{K+1\le n\le K+N} a_{M-K+n}\,e(nx) = e((K-M)x)\,S(x), \]

and then $|T(x)| = |S(x)|$. Thus the above theorem can be rephrased as follows.

THEOREM 4.1.6. Let $S(x) = \sum_{M+1\le n\le M+N} a_n e(nx)$ where $M$ and $N$ are integers, $N>0$. Let $x_1,\dots,x_R$ be distinct real numbers modulo 1 and let $\delta>0$ be such that $\|x_r-x_s\|\ge\delta$ for $r\ne s$. Then for arbitrary $a_n$,

\[ \sum_{1\le r\le R}|S(x_r)|^2 \le (N+3\delta^{-1})\sum_{M+1\le n\le M+N}|a_n|^2. \]

We state (without proof) another version of the large sieve inequalities due to Montgomery and Vaughan [MV73] (Theorem 1).

THEOREM 4.1.7 ([MV73]). Let

\[ S(x) = \sum_{M+1\le n\le M+N} a_n e(nx), \]

let $x_1,\dots,x_R$ be real numbers, and set

\[ \delta = \min_{r\ne s}\|x_r-x_s\|. \]

Then

\[ \sum_{1\le r\le R}|S(x_r)|^2 \le (N+\delta^{-1})\sum_{M+1\le n\le M+N}|a_n|^2. \]

Moreover, if

\[ \delta_r = \min_{s,\ s\ne r}\|x_r-x_s\| \]

for all $r$, then

\[ \sum_{1\le r\le R}\Bigl(N+\frac{3}{2}\delta_r^{-1}\Bigr)^{-1}|S(x_r)|^2 \le \sum_{M+1\le n\le M+N}|a_n|^2. \]

4.2. The Large Sieve

In this section we will use the bounds derived in the previous section to study the distribution of integer sequences in residue classes modulo primes.

Let $a_n$ be a sequence of complex numbers defined for $M+1\le n\le M+N$ (where $M,N$ are integers and $N>0$). Define

\[ Z(q,h) = \sum_{\substack{M+1\le n\le M+N \\ n\equiv h \bmod q}} a_n \quad\text{and}\quad Z(1,1) = Z = \sum_{M+1\le n\le M+N} a_n. \]

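LEMMA 4.2.1. Let

\[ S(x) = \sum_{M+1\le n\le M+N} a_n e(nx). \]

If $q$ is a positive integer, then

\[ \sum_{\substack{1\le a\le q \\ a\perp q}}\Bigl|S\Bigl(\frac{a}{q}\Bigr)\Bigr|^2 = q\sum_{1\le h\le q}\Bigl|\sum_{d\mid q}\frac{\mu(d)}{d}\,Z\Bigl(\frac{q}{d},h\Bigr)\Bigr|^2. \]

Proof : For an integer $a$ we have (using (4.30))

\[ S\Bigl(\frac{a}{q}\Bigr) = \sum_{1\le h\le q} Z(q,h)\,e\Bigl(\frac{ah}{q}\Bigr). \]

By (4.31)

\[ qZ(q,h) = \sum_{1\le a\le q} S\Bigl(\frac{a}{q}\Bigr)e\Bigl(-\frac{ah}{q}\Bigr) = \sum_{d\mid q}\ \sum_{\substack{1\le b\le q/d \\ \gcd(b,\,q/d)=1}} S\Bigl(\frac{bd}{q}\Bigr)e\Bigl(-\frac{bdh}{q}\Bigr). \]

Let

\[ T(q,h) = \sum_{\substack{1\le a\le q \\ a\perp q}} S\Bigl(\frac{a}{q}\Bigr)e\Bigl(-\frac{ah}{q}\Bigr), \]

so that

\[ qZ(q,h) = \sum_{d\mid q} T\Bigl(\frac{q}{d},h\Bigr). \]

Applying Möbius inversion to this we get

\[ T(q,h) = q\sum_{d\mid q}\frac{\mu(d)}{d}\,Z\Bigl(\frac{q}{d},h\Bigr). \]

Hence

\[ |T(q,h)|^2 = q^2\Bigl|\sum_{d\mid q}\frac{\mu(d)}{d}\,Z\Bigl(\frac{q}{d},h\Bigr)\Bigr|^2, \]

and therefore

\[ \frac{1}{q}\sum_{1\le h\le q}|T(q,h)|^2 = q\sum_{1\le h\le q}\Bigl|\sum_{d\mid q}\frac{\mu(d)}{d}\,Z\Bigl(\frac{q}{d},h\Bigr)\Bigr|^2. \]

Now

\[ q\sum_{1\le h\le q}\Bigl|\sum_{d\mid q}\frac{\mu(d)}{d}\,Z\Bigl(\frac{q}{d},h\Bigr)\Bigr|^2 = \frac{1}{q}\sum_{1\le h\le q}|T(q,h)|^2 = \frac{1}{q}\sum_{1\le h\le q}\ \sum_{\substack{1\le a,b\le q \\ a\perp q,\ b\perp q}} S\Bigl(\frac{a}{q}\Bigr)\overline{S\Bigl(\frac{b}{q}\Bigr)}\,e\Bigl(\frac{(b-a)h}{q}\Bigr) \]
\[ = \frac{1}{q}\sum_{\substack{1\le a,b\le q \\ a\perp q,\ b\perp q}} S\Bigl(\frac{a}{q}\Bigr)\overline{S\Bigl(\frac{b}{q}\Bigr)}\sum_{1\le h\le q} e\Bigl(\frac{(b-a)h}{q}\Bigr) = \sum_{\substack{1\le a\le q \\ a\perp q}}\Bigl|S\Bigl(\frac{a}{q}\Bigr)\Bigr|^2. \]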
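Lemma 4.2.1 can be checked numerically for arbitrary complex coefficients. The following sketch is an illustration only: `mobius` and `lemma_4_2_1_sides` are hypothetical helper names, the coefficient list `a` is indexed so that `a[i]` stands for $a_n$ with $n = M+1+i$, and the left-hand sum is taken over $a$ coprime to $q$.

```python
import cmath
from math import gcd


def mobius(n: int) -> int:
    """Moebius function mu(n), computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a square divides the original n
            result = -result
        p += 1
    return -result if n > 1 else result


def lemma_4_2_1_sides(a, M: int, q: int):
    """Return (left side, right side) of Lemma 4.2.1."""
    N = len(a)

    def S(x: float) -> complex:
        return sum(a[i] * cmath.exp(2j * cmath.pi * (M + 1 + i) * x) for i in range(N))

    def Z(modulus: int, h: int) -> complex:
        return sum(a[i] for i in range(N) if (M + 1 + i - h) % modulus == 0)

    lhs = sum(abs(S(r / q)) ** 2 for r in range(1, q + 1) if gcd(r, q) == 1)
    rhs = q * sum(
        abs(sum(mobius(d) / d * Z(q // d, h) for d in range(1, q + 1) if q % d == 0)) ** 2
        for h in range(1, q + 1)
    )
    return lhs, rhs


if __name__ == "__main__":
    coefficients = [1, 2 - 1j, 0, 3, 1j, -2, 1, 0.5, 4, -1]
    print(lemma_4_2_1_sides(coefficients, 3, 6))
```

For prime $q$ the inner sum reduces to $Z(q,h) - Z/q$, which recovers the shape of Lemma 4.1.2.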


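THEOREM 4.2.2 ([Mon68]). Let $Z(q,h)$ and $Z$ be defined as before, and let $x\ge 1$. For each prime $p\le x$ let $H(p)$ be the union of $\omega(p)$ distinct residue classes modulo $p$. Let $a_n$ be complex numbers that satisfy $a_n = 0$ if $n\in H(p)$ for some $p\le x$. Then for each $q\le x$,

\[ \mu^2(q)\,|Z|^2\prod_{p\mid q}\frac{\omega(p)}{p-\omega(p)} \le q\sum_{1\le h\le q}\Bigl|\sum_{d\mid q}\frac{\mu(d)}{d}\,Z\Bigl(\frac{q}{d},h\Bigr)\Bigr|^2. \]

Proof : This is clearly true if $\mu(q) = 0$, so we may assume $q\le x$ is a fixed squarefree integer. If $d\mid q$, we define

\[ K(d) = \Bigl\{\, h \ \Big|\ 1\le h\le q, \text{ and if } p\mid d \text{ then } h\in H(p), \text{ while if } p\mid\tfrac{q}{d} \text{ then } h\notin H(p) \,\Bigr\}. \]

Defining $h_1\equiv h_2$ if there is a $d$ such that $\{h_1,h_2\}\subseteq K(d)$ yields an equivalence relation. Thus $K(d)$, as $d$ runs through all the divisors of $q$, gives a partition of $\{1,\dots,q\}$. Indeed, for each $h$ we can write $q$ uniquely as

\[ q = \prod_{\substack{p\mid q \\ h\in H(p)}} p\ \prod_{\substack{p\mid q \\ h\notin H(p)}} p. \]

Thus we can write any sum of the form $\sum_{1\le h\le q} f(h)$ as $\sum_{d\mid q}\sum_{h\in K(d)} f(h)$. Fix a $\delta$ with $\delta\mid q$. Observe that

\[ \Bigl|\sum_{d\mid q}\mu\Bigl(\frac{q}{d}\Bigr)d\sum_{h\in K(\delta)} Z(d,h)\Bigr|^2 = \Bigl|\sum_{d\mid q}\mu(d)\frac{q}{d}\sum_{h\in K(\delta)} Z\Bigl(\frac{q}{d},h\Bigr)\Bigr|^2 \tag{4.32} \]
\[ = \Bigl|\sum_{h\in K(\delta)}\sum_{d\mid q}\mu(d)\frac{q}{d}\,Z\Bigl(\frac{q}{d},h\Bigr)\Bigr|^2 \tag{4.33} \]

by changing the variable of summation from $d$ to $\frac{q}{d}$.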

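Using the Cauchy-Schwarz inequality,

\[ \Bigl|\sum_{h\in K(\delta)}\sum_{d\mid q}\mu(d)\frac{q}{d}\,Z\Bigl(\frac{q}{d},h\Bigr)\Bigr|^2 \le \Bigl(\sum_{h\in K(\delta)} 1\Bigr)\sum_{h\in K(\delta)}\Bigl|\sum_{d\mid q}\mu(d)\frac{q}{d}\,Z\Bigl(\frac{q}{d},h\Bigr)\Bigr|^2. \tag{4.34} \]

Now consider

\[ \Bigl|\sum_{d\mid q}\mu\Bigl(\frac{q}{d}\Bigr)d\sum_{h\in K(\delta)} Z(d,h)\Bigr|^2. \]

Supposing $\gcd(\delta,d) > 1$, we can select a prime $p$ such that $p\mid\gcd(\delta,d)$. Then $Z(d,h)$ is a sum of $a_n$ with $n\equiv h \bmod d$; since $p\mid d$ we also have $n\equiv h \bmod p$. But $p\mid\delta$ and $h\in K(\delta)$ imply that $n\in H(p)$ by the definition of $K(\delta)$. Thus by hypothesis $a_n = 0$ whenever $n\equiv h \bmod d$ and $h\in K(\delta)$. Hence the inner sum of the expression above vanishes when $\gcd(\delta,d) > 1$. Thus we obtain

\[ \sum_{d\mid q}\mu\Bigl(\frac{q}{d}\Bigr)d\sum_{h\in K(\delta)} Z(d,h) = \sum_{d\mid(q/\delta)}\mu\Bigl(\frac{q}{d}\Bigr)d\sum_{h\in K(\delta)} Z(d,h). \]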

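Fix $d$ with $d\mid(q/\delta)$. If $k\in H(p)$ for some prime $p\mid d$, then $Z(d,k) = 0$, and hence

\[ \sum_{h\in K(\delta)} Z(d,h) = \sum_{\substack{1\le k\le d \\ \forall p\mid d:\ k\notin H(p)}} Z(d,k)\,\bigl|\{\, h \mid h\in K(\delta),\ h\equiv k \bmod d \,\}\bigr|. \]

Let

\[ S(\delta,d,k) = \bigl|\{\, h \mid h\in K(\delta),\ h\equiv k \bmod d \,\}\bigr| \]

for $k$ such that $k\notin H(p)$ for all primes $p$ that divide $d$. By the Chinese Remainder Theorem, $h\equiv k \bmod d$ is equivalent to $h\equiv k \bmod p$ for all primes $p$ dividing $d$. Also $h\in K(\delta)$ implies that $h\in H(p)$ for all primes $p$ dividing $\delta$, and that $h\notin H(p)$ for all primes $p$ dividing $q/\delta$. Summarizing, we have shown that $h\in K(\delta)$ and $h\equiv k \bmod d$ iff the following are satisfied:

1. $p\mid d \Rightarrow h\equiv k \bmod p$ and $h\notin H(p)$;
2. $p\mid\delta \Rightarrow h\in H(p)$; and
3. $p\mid q/(d\delta) \Rightarrow h\notin H(p)$.

Since we have $k$ such that $k\notin H(p)$ for all primes $p$ dividing $d$, the second condition in (1) is satisfied whenever the first is satisfied. If $p\mid d$, then there is exactly one solution of (1) modulo $p$; if $p\mid\delta$, there are $\omega(p)$ solutions of (2) modulo $p$; while if $p\mid q/(d\delta)$, then there are $p-\omega(p)$ solutions to (3) modulo $p$. Applying the Chinese Remainder Theorem, we have

\[ S(\delta,d,k) = \bigl|\{\, h \mid 1\le h\le q,\ h \text{ satisfies conditions (1), (2) and (3)} \,\}\bigr| = \prod_{p\mid\delta}\omega(p)\prod_{p\mid q/(d\delta)}(p-\omega(p)). \]

This number is independent of $k$, and so

\[ \sum_{h\in K(\delta)} Z(d,h) = \sum_{\substack{1\le k\le d \\ \forall p\mid d:\ k\notin H(p)}} Z(d,k)\prod_{p\mid\delta}\omega(p)\prod_{p\mid q/(d\delta)}(p-\omega(p)) = \sum_{1\le k\le d} Z(d,k)\prod_{p\mid\delta}\omega(p)\prod_{p\mid q/(d\delta)}(p-\omega(p)) \]
\[ = Z\prod_{p\mid\delta}\omega(p)\prod_{p\mid q/(d\delta)}(p-\omega(p)). \]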

∑ d\q

μq dd ∑ h∈K(δ)

Z(d,h) = ∑ d\q/δ

μq ddZ∏ p\δ

ω(p) ∏ p\q/dδ (p ω(p))(4.35) = μ(q)Z∏ p\δ ω(p) ∏ p\q/δ (p ω(p)) ∑ d\q/δ μ(d)d∏ p\d (p ω(p)) 1(4.36) = μ(q)Z∏ p\δ f(p) ∏ d\q/δ (p ω(p)) ∏ p\q/δ1  p p ω(p)(4.37) = μ(δ)Z∏ p\δ ω(p) ∏ p\q/δ ω(p)(4.38) = μ(δ)Z∏ p\q ω(p).(4.39) Now ∑ h∈K(δ) 1 = S(δ,1,1) =∏ p\δ ω(p) ∏ p\q/δ (p ω(p)).

88 4. THE LARGE SIEVE

Dividing (4.32) by the above factor and using (4.35) (4.39) we  nd that |Z|2∏ p\q ω(p)2∏ p\δ ω(p) 1 ∏ p\q/δ (p ω(p)) 1 ≤ ∑ h∈K(δ)

∑ d\q

μ(d)q d

Zq d

,h

2

.

Summing over all δ\q, the right hand side yields ∑ 1≤h≤q

∑ d\q

μ(d)q d

Zq d

,h

2

.

Since the K(δ) partition{1,···,q}, summing the left hand side yields |Z|2∏ p\d2∑ δ\q∏ p\δ ω(p) 1∏ p\q/δ (p ω(p)) 1 =|Z|2∏ p\q

ω(p)∑ δ\q

∏ p\q/δ

ω(p) ∏ p\q/δ (p ω(p)) 1

=|Z|2∏ p\q

ω(p)∏ p\q1+ ω(p) p ω(p) = q|Z|2∏ p\q ω(p) p ω(p) .

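THEOREM 4.2.3 ([MV73]). Let $\mathcal{N}$ be a set of $Z$ integers in an interval $[M+1, M+N]$. For each prime $p$ let $\omega(p)$ denote the number of residue classes modulo $p$ that contain no element of $\mathcal{N}$. Then $Z\le L^{-1}$, where

\[ L = \sum_{q\le z}\Bigl(N+\frac{3}{2}qz\Bigr)^{-1}\mu^2(q)\prod_{p\mid q}\frac{\omega(p)}{p-\omega(p)} \]

and $z$ is an arbitrary positive real number.

Proof : Let $x_r$ be the numbers $\frac{a}{q}$ where $1\le a\le q$, $a\perp q$ and $q\le z$. If $\frac{a'}{q'}\ne\frac{a}{q}$, then

\[ \Bigl\|\frac{a}{q}-\frac{a'}{q'}\Bigr\| \ge \frac{1}{qq'} \ge \frac{1}{qz}. \]

By Theorem (4.1.7) we have

\[ \sum_{q\le z}\Bigl(N+\frac{3}{2}qz\Bigr)^{-1}\sum_{\substack{1\le a\le q \\ a\perp q}}\Bigl|S\Bigl(\frac{a}{q}\Bigr)\Bigr|^2 \le \sum_{M+1\le n\le M+N}|a_n|^2. \]

Set $a_n = 1$ or $0$ according as $n\in\mathcal{N}$ or $n\notin\mathcal{N}$. Then by Theorem (4.2.2) and Lemma (4.2.1) we get

\[ Z^2\mu^2(q)\prod_{p\mid q}\frac{\omega(p)}{p-\omega(p)} \le \sum_{\substack{1\le a\le q \\ a\perp q}}\Bigl|S\Bigl(\frac{a}{q}\Bigr)\Bigr|^2. \]

With this choice of $a_n$ the sum $\sum_{M+1\le n\le M+N}|a_n|^2$ equals $Z$, so $Z^2L\le Z$, and this proves the theorem.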
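The bound $Z\le L^{-1}$ of Theorem 4.2.3 can be evaluated for a concrete set. The sketch below is an illustration under our own assumptions: `primes_up_to` and `large_sieve_bound` are hypothetical names, the set is assumed non-empty, and $z\ge 1$ is assumed so that the sum defining $L$ is not empty.

```python
def primes_up_to(n: int):
    """All primes <= n, by the sieve of Eratosthenes."""
    flags = bytearray([1]) * (n + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if flags[i]]


def large_sieve_bound(members, N: int, z: float) -> float:
    """Upper bound 1/L of Theorem 4.2.3 for a set contained in an interval of length N."""
    zi = int(z)
    primes = primes_up_to(max(zi, 2))
    # omega(p) = number of residue classes mod p that contain no element of the set.
    omega = {p: p - len({n % p for n in members}) for p in primes}
    L = 0.0
    for q in range(1, zi + 1):
        weight, m, squarefree = 1.0, q, True
        for p in primes:
            if p > m:
                break
            if m % p == 0:
                m //= p
                if m % p == 0:
                    squarefree = False  # mu^2(q) = 0, so q contributes nothing
                    break
                weight *= omega[p] / (p - omega[p])
        if squarefree:
            L += weight / (N + 1.5 * q * z)
    return 1.0 / L


if __name__ == "__main__":
    N, z = 1000, 10
    sifted = [p for p in primes_up_to(N) if p > z]
    print(len(sifted), large_sieve_bound(sifted, N, z))
```

In this example the set consists of the primes in $(10, 1000]$, which avoid the class $0$ modulo each prime $p\le 10$, so $\omega(p) = 1$ for those primes.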

4.3. The Brun-Titchmarsh Theorem revisited

The large sieve can be used to strengthen the Brun-Titchmarsh theorem (Theorem 3.2.5). We require the following lemma.

LEMMA 4.3.1. Let $u$ and $v$ be any positive real numbers. Then

\[ \sum_{\substack{q\le u \\ q\perp k}}(1+vq)^{-1}\frac{\mu^2(q)}{\varphi(q)} \ge \frac{\varphi(k)}{k}\sum_{q\le u}(1+vq)^{-1}\frac{\mu^2(q)}{\varphi(q)}. \]

Proof : Note that

\[ \frac{k}{\varphi(k)} = \sum_{r\mid k}\frac{\mu^2(r)}{\varphi(r)}. \]

Multiplying the sum on the left by this we get

\[ \sum_{\substack{q\le u \\ q\perp k}}(1+vq)^{-1}\sum_{r\mid k}\frac{\mu^2(qr)}{\varphi(qr)}, \]

which includes all the terms of the sum on the right.

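THEOREM 4.3.2 ([MV73]). Let $x$ and $y$ be positive real numbers, and let $k$ and $l$ be relatively prime positive integers. Then

\[ \pi(x+y;k,l)-\pi(x;k,l) < \frac{2y}{\varphi(k)\bigl(\frac{5}{6}+\log\frac{y}{k}\bigr)}. \]

Proof : We take $M = \bigl\lfloor\frac{x-l}{k}\bigr\rfloor$ and $N = \bigl\lfloor\frac{x+y-l}{k}\bigr\rfloor - M$. Let $\mathcal{N}$ be the set of those integers $n$ for which $M < n\le M+N$, $kn+l$ is prime, and $kn+l > z$. Then $\omega(p) = 1$ whenever $p\le z$ and $p\nmid k$. Thus by Theorem (4.2.3) we have

\[ \pi(x+y;k,l)-\pi(x;k,l) \le L^{-1}+\pi(z), \]

where

\[ L = \sum_{\substack{q\le z \\ q\perp k}}\Bigl(N+\frac{3}{2}qz\Bigr)^{-1}\frac{\mu^2(q)}{\varphi(q)}. \]

Taking $z = \sqrt{\tfrac{2}{3}N}$ and using Lemma (4.3.1), we have

\[ \pi(x+y;k,l)-\pi(x;k,l) < \frac{kN}{\varphi(k)J}+\sqrt{N}, \]

where

\[ J = \sum_{q\le z}(1+qz^{-1})^{-1}\frac{\mu^2(q)}{\varphi(q)}. \]

From [War27] we have

\[ \sum_{q\le v}\frac{\mu^2(q)}{\varphi(q)} = \log v+\gamma+\sum_p\frac{\log p}{p(p-1)}+o(1) \]

as $v\to\infty$. By partial summation we find that

\[ J = \log z+\gamma+\sum_p\frac{\log p}{p(p-1)}-\log 2+o(1) \]

as $z\to\infty$. Setting $z = \sqrt{\tfrac{2}{3}N}$ we get

\[ J = \frac{1}{2}\log N+\gamma+\sum_p\frac{\log p}{p(p-1)}-\frac{1}{2}\log\frac{3}{2}-\log 2+o(1) \]

as $N\to\infty$. Since $\gamma > 0.577$ and

\[ \sum_p\frac{\log p}{p(p-1)} > 0.737, \]

filling in $\log 2 < 0.694$ and $\frac{1}{2}\log\frac{3}{2} < 0.203$, we finally obtain

\[ J > \frac{1}{2}\log N+0.417, \]

for large enough $N$.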
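The bound of Theorem 4.3.2 can be compared with actual prime counts in an arithmetic progression. The sketch below is only illustrative: the function names are hypothetical, primality is tested by naive trial division, and $y > k$ is assumed so that the logarithm in the denominator is positive.

```python
from math import gcd, log


def is_prime(n: int) -> bool:
    """Primality test by trial division (adequate for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def euler_phi(k: int) -> int:
    """Euler's totient, by direct counting."""
    return sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)


def primes_in_progression(x: int, y: int, k: int, l: int) -> int:
    """pi(x + y; k, l) - pi(x; k, l): primes p with x < p <= x + y and p = l mod k."""
    return sum(1 for p in range(x + 1, x + y + 1) if p % k == l % k and is_prime(p))


def brun_titchmarsh_bound(y: float, k: int) -> float:
    """Right-hand side of Theorem 4.3.2: 2y / (phi(k) (5/6 + log(y/k)))."""
    return 2 * y / (euler_phi(k) * (5 / 6 + log(y / k)))


if __name__ == "__main__":
    for k, l in ((3, 1), (4, 3), (10, 7)):
        print(k, l, primes_in_progression(1000, 1000, k, l), brun_titchmarsh_bound(1000, k))
```

The bound is roughly twice the count suggested by the prime number theorem for arithmetic progressions; removing that factor of 2 is a famous open problem.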

4.4. Bombieri’s Theorem

The large sieve inequalities imply that if a sequence of integers is distributed rather densely in an interval, then it cannot be very unevenly distributed modulo the primes. In this section we will prove an important theorem that quantifies the above statement for the primes themselves. Define $\psi(x) = \sum_{n\le x}\Lambda(n)$, where $\Lambda(n)$ is von Mangoldt's function

\[ \Lambda(n) = \begin{cases} \log p & \text{if } n = p^k, \\ 0 & \text{otherwise.} \end{cases} \]

Also define $\psi(x;q,a) = \sum_{n\le x,\ n\equiv a \bmod q}\Lambda(n)$. Let $E(x;q,a) = \psi(x;q,a)-\frac{x}{\varphi(q)}$ for $a\perp q$, $E(x;q) = \max_{a\perp q}|E(x;q,a)|$, and $E^*(x,q) = \max_{y\le x}E(y,q)$. We will prove Bombieri's Theorem in the following form:

THEOREM 4.4.1 ([Dav80]). Let $A > 0$ be fixed, and suppose $x^{1/2}(\log x)^{-A}\le Q\le x^{1/2}$. Then

\[ \sum_{q\le Q}E^*(x,q) \ll x^{1/2}Q(\log x)^5. \]

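Proof : If $\chi$ is a multiplicative character modulo $q$, define $\psi(y,\chi) = \sum_{n\le y}\chi(n)\Lambda(n)$. We begin with the identity

\[ \psi(y;q,a) = \frac{1}{\varphi(q)}\sum_\chi\overline{\chi}(a)\,\psi(y,\chi), \]

where the sum is over all the characters modulo $q$. Let $\chi_0$ be the principal character; we then define

\[ \psi'(y,\chi) = \begin{cases} \psi(y,\chi) & \text{if } \chi\ne\chi_0, \\ \psi(y,\chi_0)-y & \text{if } \chi = \chi_0. \end{cases} \]

Then we have

\[ \psi(y;q,a)-\frac{y}{\varphi(q)} = \frac{1}{\varphi(q)}\sum_\chi\overline{\chi}(a)\,\psi'(y,\chi), \]

and so

\[ |E(y;q,a)| \le \frac{1}{\varphi(q)}\sum_\chi|\psi'(y,\chi)| \]

since $|\chi(a)|\le 1$. This estimate is independent of $a$, so that $E(y;q)\le\frac{1}{\varphi(q)}\sum_\chi|\psi'(y,\chi)|$. If $\chi \bmod q$ is a character (possibly imprimitive) that is induced by $\chi_1 \bmod q_1$, where $\chi_1$ is primitive, then $\psi'(y,\chi)$ and $\psi'(y,\chi_1)$ do not differ very much:

\[ \psi'(y,\chi_1)-\psi'(y,\chi) = \sum_{\substack{p^k\le y \\ p\mid q}}\chi_1(p^k)\log p \ll \sum_{p\mid q}\Bigl\lfloor\frac{\log y}{\log p}\Bigr\rfloor\log p \ll (\log y)\sum_{p\mid q}\log p \ll (\log qy)^2. \]

Hence we can replace the sum over all characters by one over the primitive characters only. Thus

\[ E(y,q) \ll (\log qy)^2+\frac{1}{\varphi(q)}\sum_\chi|\psi'(y,\chi_1)|, \]

and

\[ E^*(x,q) \ll (\log qx)^2+\frac{1}{\varphi(q)}\sum_\chi\max_{y\le x}|\psi'(y,\chi_1)|. \]

We can combine the contributions from each of the primitive characters. Since a primitive character modulo $q$ induces characters to moduli that are multiples of $q$, we have

\[ \sum_{q\le Q}E^*(x,q) \ll Q(\log Qx)^2+\sum_{q\le Q}\ \sideset{}{^*}\sum_\chi\max_{y\le x}|\psi'(y,\chi)|\sum_{k\le Q/q}\frac{1}{\varphi(kq)}, \]

where $\sum^*$ means the sum is over primitive characters modulo $q$. Since $\varphi(kq)\ge\varphi(k)\varphi(q)$ we have

\[ \sum_{k\le z}\frac{1}{\varphi(kq)} \le \frac{1}{\varphi(q)}\sum_{k\le z}\frac{1}{\varphi(k)}. \]

Now

\[ \sum_{k\le z}\frac{1}{\varphi(k)} \le \prod_{p\le z}\Bigl(1+\frac{1}{p-1}+\frac{1}{p(p-1)}+\frac{1}{p^2(p-1)}+\cdots\Bigr). \]

Note that

\[ \frac{1}{p-1}\cdot\frac{1}{1-\frac{1}{p}} = \frac{1}{p-1}\Bigl(1+\frac{1}{p}+\frac{1}{p^2}+\cdots\Bigr). \]

Thus

\[ 1+\frac{1}{p-1}+\frac{1}{p(p-1)}+\frac{1}{p^2(p-1)}+\cdots = 1+\frac{1}{p-1}\cdot\frac{1}{1-\frac{1}{p}} = \Bigl(1+\frac{1}{p(p-1)}\Bigr)\frac{1}{1-\frac{1}{p}}. \]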

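Using this we have

\[ \sum_{k\le z}\frac{1}{\varphi(k)} \le \prod_{p\le z}\Bigl(1-\frac{1}{p}\Bigr)^{-1}\Bigl(1+\frac{1}{p(p-1)}\Bigr) \ll \log z, \]

and so

\[ \sum_{q\le Q}\ \sideset{}{^*}\sum_\chi\max_{y\le x}|\psi'(y,\chi)|\sum_{k\le Q/q}\frac{1}{\varphi(kq)} \ll \log x\sum_{q\le Q}\frac{1}{\varphi(q)}\ \sideset{}{^*}\sum_\chi\max_{y\le x}|\psi'(y,\chi)|. \]

Thus it suffices to show that

\[ \sum_{q\le Q}\frac{1}{\varphi(q)}\ \sideset{}{^*}\sum_\chi\max_{y\le x}|\psi'(y,\chi)| \ll x^{1/2}Q(\log x)^4 \tag{4.40} \]

for $x^{1/2}(\log x)^{-A}\le Q\le x^{1/2}$.

Using the large sieve we will show that

\[ \sum_{q\le Q}\frac{q}{\varphi(q)}\ \sideset{}{^*}\sum_\chi\max_{y\le x}|\psi(y,\chi)| \ll \bigl(x+x^{5/6}Q+x^{1/2}Q^2\bigr)(\log Qx)^4 \tag{4.41} \]

for all $x\ge 1$ and $Q\ge 1$.

Now observe that

\[ \sum_{U<q\le 2U}\frac{q}{\varphi(q)}\ \sideset{}{^*}\sum_\chi\max_{y\le x}|\psi(y,\chi)| \ge U\sum_{U<q\le 2U}\frac{1}{\varphi(q)}\ \sideset{}{^*}\sum_\chi\max_{y\le x}|\psi(y,\chi)|, \]

and so

\[ \sum_{U<q\le 2U}\frac{1}{\varphi(q)}\ \sideset{}{^*}\sum_\chi\max_{y\le x}|\psi(y,\chi)| \ll \Bigl(\frac{x}{U}+x^{5/6}+x^{1/2}U\Bigr)(\log Ux)^4 \]

by (4.41). Summing over $U = 2^k$ for $k\le\log Q$, we have

\[ \sum_{Q_1<q\le Q}\frac{1}{\varphi(q)}\ \sideset{}{^*}\sum_\chi\max_{y\le x}|\psi(y,\chi)| \ll \Bigl(\frac{x}{Q_1}+x^{5/6}\log Q+x^{1/2}Q\Bigr)(\log Qx)^4. \]

We have used the fact that for $\chi = \chi_0$ we have $|\psi'(y,\chi_0)|\le|\psi(y,\chi_0)|$, and $\psi'(y,\chi) = \psi(y,\chi)$ if $\chi\ne\chi_0$. This shows (4.40) for the moduli $q > Q_1$, with $Q_1 = \log^A x$. By the Siegel-Walfisz theorem, if $\chi$ is a primitive character modulo $q$, $q\le(\log x)^A$, and $y\le x$, then

\[ |\psi'(y,\chi)| \ll x(\log x)^{-2A}, \]

which handles the remaining moduli. Thus the theorem follows from (4.41).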

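We will now sketch the proof of (4.41) (for details see [Dav80]). Using the large sieve we can derive the following:

\[ \sum_{q\le Q}\frac{q}{\varphi(q)}\ \sideset{}{^*}\sum_\chi\max_u\Bigl|\sum_{1\le m\le M}\ \sum_{\substack{1\le n\le N \\ mn\le u}} a_m b_n\chi(mn)\Bigr| \tag{4.42} \]
\[ \ll (M+Q^2)^{1/2}(N+Q^2)^{1/2}\Bigl(\sum_{1\le m\le M}|a_m|^2\Bigr)^{1/2}\Bigl(\sum_{1\le n\le N}|b_n|^2\Bigr)^{1/2}\log 2MN. \tag{4.43} \]

If $Q^2 > x$ then (4.41) follows from the above with $M = 1$, $a_1 = 1$, $b_n = \Lambda(n)$, $N = x$. Thus we may assume $Q^2\le x$. It turns out that we can write

\[ \psi(y,\chi) = S_1+S_2+S_3+S_4, \]

where

\[ S_1 = \sum_{n\le U}\Lambda(n)\chi(n) \ll U, \]
\[ S_2 = -\sum_{t\le UV}\Bigl(\sum_{\substack{t = md \\ m\le U,\ d\le V}}\mu(d)\Lambda(m)\Bigr)\sum_{r\le y/t}\chi(rt), \]
\[ S_3 \ll (\log y)\sum_{d\le V}\max_w\Bigl|\sum_{w\le h\le y/d}\chi(h)\Bigr|, \]

and

\[ S_4 = -\sum_{U<m\le y/V}\Lambda(m)\sum_{V<k\le y/m}\Bigl(\sum_{\substack{d\mid k \\ d\le V}}\mu(d)\Bigr)\chi(mk). \]

Using (4.42) and the Pólya-Vinogradov inequality (see [Dav80]), we can show that

\[ \sum_{q\le Q}\frac{q}{\varphi(q)}\ \sideset{}{^*}\sum_\chi\max_{y\le x}|S_4| \ll \bigl(Q^2x^{1/2}+QxU^{-1/2}+QxV^{-1/2}+x\bigr)(\log x)^4. \]

The sum $S_2$ can be split into $S_2 = \sum_{t\le UV} = \sum_{t\le U}+\sum_{U<t\le UV} = S_2'+S_2''$, and it can be shown that

\[ \sum_{q\le Q}\frac{q}{\varphi(q)}\ \sideset{}{^*}\sum_\chi\max_{y\le x}|S_2''| \ll \bigl(Q^2x^{1/2}+QxU^{-1/2}+Qx^{1/2}U^{1/2}V^{1/2}+x\bigr)(\log x)^2 \]

and

\[ \sum_{q\le Q}\frac{q}{\varphi(q)}\ \sideset{}{^*}\sum_\chi\max_{y\le x}|S_2'| \ll \bigl(Q^{5/2}U+x\bigr)(\log Ux)^2. \]

Also

\[ \sum_{q\le Q}\frac{q}{\varphi(q)}\ \sideset{}{^*}\sum_\chi\max_{y\le x}|S_3| \ll \bigl(Q^{5/2}V+x\bigr)(\log Vx)^2. \]

On combining these estimates and taking $U = V = x^{2/3}Q^{-1}$ for $x^{1/3}\le Q\le x^{1/2}$, we obtain (4.41) in this range. For $Q\le x^{1/3}$, we can take $U = x^{1/3}$ to complete the proof of (4.41).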

The Bombieri result can be formulated as follows:

THEOREM 4.4.2. Let $E(x;q,a) = \pi(x;q,a)-\frac{\operatorname{li} x}{\varphi(q)}$ for $a\perp q$, $E(x;q) = \max_{a\perp q}|E(x;q,a)|$, and $E^*(x,q) = \max_{y\le x}E(y,q)$. Then for all $A > 0$ there exists $B > 0$ such that

\[ \sum_{q\le x^{1/2}(\log x)^{-B}}E^*(x,q) \ll \frac{x}{\log^{1+A}x}. \]
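The error terms averaged in Bombieri's theorem can be computed directly for small $x$. The sketch below is an illustration under our own simplifications: it uses the $\psi$ form of the error, $E(x;q,a) = \psi(x;q,a) - x/\varphi(q)$, it evaluates $E(x;q)$ only at the endpoint $y = x$ rather than taking the maximum over $y\le x$, and all function names are hypothetical.

```python
from math import gcd, log


def von_mangoldt_table(x: int):
    """lam[n] = Lambda(n) for 0 <= n <= x."""
    lam = [0.0] * (x + 1)
    composite = [False] * (x + 1)
    for p in range(2, x + 1):
        if not composite[p]:
            for m in range(p * p, x + 1, p):
                composite[m] = True
            pk = p
            while pk <= x:  # prime powers p, p^2, ... all get weight log p
                lam[pk] = log(p)
                pk *= p
    return lam


def psi_progression(lam, x: int, q: int, a: int) -> float:
    """psi(x; q, a): sum of Lambda(n) over n <= x with n = a mod q."""
    return sum(lam[n] for n in range(1, x + 1) if (n - a) % q == 0)


def euler_phi(q: int) -> int:
    return sum(1 for a in range(1, q + 1) if gcd(a, q) == 1)


def max_error(lam, x: int, q: int) -> float:
    """E(x; q) = max over a coprime to q of |psi(x; q, a) - x / phi(q)|."""
    phi = euler_phi(q)
    return max(
        abs(psi_progression(lam, x, q, a) - x / phi)
        for a in range(1, q + 1)
        if gcd(a, q) == 1
    )


def averaged_error(x: int, Q: int) -> float:
    """Sum of E(x; q) over q <= Q."""
    lam = von_mangoldt_table(x)
    return sum(max_error(lam, x, q) for q in range(1, Q + 1))


if __name__ == "__main__":
    x = 20000
    for Q in (5, 20, 50):
        print(Q, averaged_error(x, Q), x ** 0.5 * Q)
```

Printing the averaged error next to $x^{1/2}Q$ gives a rough feeling for the size of the bound, although nothing can be concluded from such small ranges.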

4.5. Prime and Squarefree pairs

We can pose the following variation of the twin prime problem: "Are there infinitely many primes p such that p+2 is squarefree?" The answer to the question is yes, and this is an almost immediate consequence of the powerful result we have proved.

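THEOREM 4.5.1. Let

\[ \Xi(x) = \bigl|\{\, p\le x \mid \mu^2(p+2) = 1 \,\}\bigr|. \]

Then

\[ \Xi(x) = \mathrm{Li}(x)\prod_{p>2}\Bigl(1-\frac{1}{p(p-1)}\Bigr)+O\Bigl(\frac{\ln x}{\sqrt{x}}\Bigr)+O\Bigl(\frac{x}{\ln^{1+U}(x)}\Bigr)+O\bigl(x^{3/4}\ln^C(x)\bigr) \]

for some constants $U > 0$ and $C > 0$.

Proof : Let $\mathcal{A} = \{p+2 \mid p\le x\}$. We have

\[ \Xi(x) = \sum_{d^2\le x}\mu(d)\sum_{\substack{n\in\mathcal{A} \\ d^2\mid n}} 1. \]

Let $\mathcal{A}_{d^2} = \{p+2 \mid p\le x,\ p+2\equiv 0 \bmod d^2\}$. Thus by definition $|\mathcal{A}_{d^2}| = \pi(x;d^2,-2)$. Define

\[ R_{d^2} = \pi(x;d^2,-2)-\frac{\mathrm{Li}(x)}{\varphi(d^2)}. \]

Then we have

\[ \Xi(x) = \mathrm{Li}(x)\sum_{d^2\le x}\frac{\mu(d)}{\varphi(d^2)}+\sum_{d^2\le x}\mu(d)R_{d^2} = \Sigma_1+\Sigma_2, \]
\[ \Sigma_1 = \mathrm{Li}(x)\Bigl(\sum_d\frac{\mu(d)}{\varphi(d^2)}-\sum_{d>\sqrt{x}}\frac{\mu(d)}{\varphi(d^2)}\Bigr) = \mathrm{Li}(x)\Bigl(\prod_{p>2}\Bigl(1-\frac{1}{p(p-1)}\Bigr)-\sum_{d>\sqrt{x}}\frac{\mu(d)}{\varphi(d^2)}\Bigr), \]

since $|\mathcal{A}_4| = 0$ allows omitting the prime 2.

The second sum can be upper-bounded by:

\[ \sum_{d>\sqrt{x}}\frac{1}{\varphi(d^2)} \le \sum_{d>\sqrt{x}}\frac{2\ln d}{d^2} = O\Bigl(\frac{\ln x}{\sqrt{x}}\Bigr). \]

The remainder term is bounded by:

\[ \Sigma_2 \le \sum_{d^2\le x}|R_{d^2}| = \sum_{d^2\le\sqrt{x}/\ln^C x}|R_{d^2}|+\sum_{\sqrt{x}/\ln^C x<d^2<x}|R_{d^2}| = O\Bigl(\frac{x}{\ln^{1+U}x}\Bigr)+\sum_{\sqrt{x}/\ln^C x<d^2<x}|R_{d^2}|, \]

using Bombieri's result to bound the first sum.

For the second sum, since $|R_{d^2}|\le\bigl\lfloor\frac{x}{d^2}\bigr\rfloor\le\frac{x}{d^2}$, we have

\[ \sum_{\sqrt{x}/\ln^C x<d^2<x}|R_{d^2}| \le x\sum_{\sqrt{x}/\ln^C x<d^2}\frac{1}{d^2} = O\Bigl(\frac{x\ln^C x}{x^{1/4}}\Bigr) = O\bigl(x^{3/4}\ln^C x\bigr). \]

The theorem follows from the estimates for $\Sigma_1$ and $\Sigma_2$.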
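The main term of Theorem 4.5.1 predicts that the proportion of primes $p$ with $p+2$ squarefree tends to $\prod_{p>2}\bigl(1-\frac{1}{p(p-1)}\bigr)\approx 0.748$. The sketch below is a numerical illustration only; it makes the simplifying assumption of comparing with $\pi(x)$ instead of $\mathrm{Li}(x)$, truncates the infinite product at a finite limit, and uses hypothetical function names.

```python
def sieve_primes(n: int):
    """All primes <= n, by the sieve of Eratosthenes."""
    flags = bytearray([1]) * (n + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if flags[i]]


def is_squarefree(m: int) -> bool:
    """True if no square d^2 with d >= 2 divides m."""
    d = 2
    while d * d <= m:
        if m % (d * d) == 0:
            return False
        d += 1
    return True


def count_prime_squarefree_pairs(x: int) -> int:
    """Xi(x): number of primes p <= x such that p + 2 is squarefree."""
    return sum(1 for p in sieve_primes(x) if is_squarefree(p + 2))


def density_constant(limit: int = 1000) -> float:
    """Truncation of prod_{p > 2} (1 - 1/(p(p-1))) to the odd primes p <= limit."""
    product = 1.0
    for p in sieve_primes(limit):
        if p > 2:
            product *= 1.0 - 1.0 / (p * (p - 1))
    return product


if __name__ == "__main__":
    x = 50000
    ratio = count_prime_squarefree_pairs(x) / len(sieve_primes(x))
    print(ratio, density_constant())
```

The observed ratio and the truncated product agree to about two decimal places at this range.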

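Let $\Psi(x) = \sum_{n\le x}\Lambda'(n)$, where

\[ \Lambda'(n) = \begin{cases} \log p & \text{if } n = p^k \text{ and } \mu^2(n+2) = 1, \\ 0 & \text{otherwise.} \end{cases} \]

Let

\[ \Psi(x;q,a) = \sum_{\substack{n\le x \\ n\equiv a \bmod q}}\Lambda'(n), \]

and further let $E(x;q,a) = \Psi(x;q,a)-\frac{Cx}{\varphi(q)}$, $E(x;q) = \max_{a\perp q}|E(x;q,a)|$, and $E^*(x,q) = \max_{y\le x}E(y,q)$, where $C = \prod_p\bigl(1-\frac{1}{p(p-1)}\bigr)$. Using partial summation and the above theorem we can show that for any $U > 0$,

\[ \Psi(x) = Cx+O\Bigl(\frac{x}{\log^{1+U}x}\Bigr) \quad\text{and}\quad \Psi(x;q,a) = \frac{Cx}{\varphi(q)}+O\Bigl(\frac{x}{\log^{1+U}x}\Bigr), \text{ for } a\perp q. \]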

THEOREM 4.5.2. Let $A > 0$ be fixed. Then

\[ \sum_{(\log x)^A<q\le Q}E^*(x,q) \ll x^{1/2}Q(\log x)^5, \]

provided $x^{1/2}(\log x)^{-A}\le Q\le x^{1/2}$.

The proof is a careful verification that the proof of the Bombieri Theorem goes through except for $q < (\log x)^A$. But in this range the maximum error possible is $O\bigl(\frac{x}{\log^{1+U}x}\bigr)$, so selecting $U$ large enough we have:

THEOREM 4.5.3. Let $A > 0$ be fixed. Then

\[ \sum_{q\le Q}E^*(x,q) \ll x^{1/2}Q(\log x)^5, \]

provided $x^{1/2}(\log x)^{-A}\le Q\le x^{1/2}$.

There is a version of Brun's sieve that makes use of the result on the average behaviour of error terms to yield a better estimate. In particular we have ([HR74] Theorem 2.10 p. 65)

THEOREM 4.5.4. Let the following conditions hold on the sequence $\mathcal{A}$:
1. $1\le\dfrac{1}{1-\frac{\omega(p)}{p}}\le A_1$;
2. $\sum_{w\le p\le z}\dfrac{\omega(p)\log p}{p}\le\kappa\log\dfrac{z}{w}+A_2$, if $2\le w\le z$;
3. There is a constant $A_0'$ such that $|R_d|\le L\Bigl(\dfrac{x\log x}{d}+1\Bigr)A_0'^{\,\nu(d)}$;
4. For every positive constant $U\ge 1$ there is a $C_0$ such that $\sum_{d<x^\alpha\log^{-C_0}x}\mu^2(d)|R_d| = O\Bigl(\dfrac{x}{\log^{\kappa+U}x}\Bigr)$.

Let $b$ be a positive integer, let $\lambda$ be a real number satisfying $\lambda e^{1+\lambda} < 1$, let

\[ c_1 = \frac{A_2}{2}\Bigl(1+A_1\kappa+\frac{A_1A_2}{\log 2}\Bigr), \]

and let $u = \frac{\log x}{\log z}$. Then

\[ S(\mathcal{A};\mathcal{P},z) \ge xW(z)\Bigl\{1-\frac{2(\lambda^be^\lambda)^2}{1-(\lambda e^{1+\lambda})^2}\exp\Bigl((2b+2)\frac{c_1}{\lambda\log z}\Bigr)+O\Bigl(Lz^{-\alpha u+2b-1+\frac{2.01}{e^{2\lambda/\kappa}-1}}\,u^{C_0+1}\log^{C_0+\kappa+1}z\Bigr)+O\bigl(u^{-\kappa}\log^{-U}X\bigr)\Bigr\}, \]

where the O-constants may depend on $A_0'$, $A_1$, $A_2$, $\kappa$, $\alpha$ and $U$, but not on $\lambda$ or $b$.

Using this theorem with $\mathcal{A} = \{p+2 \mid p\le x,\ \mu^2(p+2) = 1\}$, and taking the sifting primes to be $\mathcal{P} = \{p : p > 2\}$, we find that the lower bound is positive (and diverges) for $u < 9$. Following the same analysis as in [HR74] (p. 67), we can also take $u < 8$ with a slightly better treatment of the principal and secondary terms involved in the proof of the above theorem. This allows us to conclude that the lower bound diverges even with $z = x^{1/7}$, and thus we have:

THEOREM 4.5.5. There are infinitely many primes p such that p+2 is a squarefree number with at most 7 prime factors.

The above result is different from earlier ones because of the extra condition that p+2 be made up only of distinct primes.

Bibliography

[BakHar98] Baker R. C., Harman G., Shifted primes without large prime factors, Acta Arith., (83), 331-361, (1998).
[BakHar95] Baker R. C., Harman G., The Brun-Titchmarsh Theorem on average, Analytic Number Theory Vol 1, (Allerton Park, IL), 39-103, Progr. Math. 138, Birkhäuser Boston, Boston, (1996).
[BakPin85] Baker R. C., Pintz J., The distribution of square-free numbers, Acta Arith., (46), 71-79, (1985).
[Be83] Beth Thomas, Eine Bemerkung zur Abschätzung der Anzahl orthogonaler lateinischer Quadrate mittels Siebverfahren, Abh. Math. Sem. Univ. Hamburg, (53), 284-288, (1983).
[BPS60] Bose R. C., Shrikhande S. S., Parker E. T., Further results on the construction of mutually orthogonal Latin squares and the falsity of Euler's conjecture, Canad. J. Math., (12), 189-203, (1960).
[Bru16] Brun V., Om fordelingen av primtallene i forskjellige talklasser. En øvre begrænsning, Nyt Tidsskr. f. Math. (27) B, 45-58, (1916).
[Bru19] Brun V., Le crible d'Eratosthène et le théorème de Goldbach, C. R. Acad. Sci. Paris, (168), 544-546, (1919).
[Bru22] Brun V., Das Sieb des Eratosthenes, 5. Skand. Mat. Kongr., Helsingfors, 197-203, (1922).
[CES60] Chowla S., Erdős P., Straus E. G., On the maximal number of pairwise orthogonal Latin squares of a given order, Canad. J. Math., (12), 204-208, (1960).
[Che73] Chen J., On the representation of a large even integer as the sum of a prime and the product of at most two primes, Sci. Sinica, (16), 157-176, (1973).
[Dar96] Dartyge Cécile, Le plus grand facteur de n^2 + 1 où n est presque premier, Acta Arith., (76), no. 3, 199-226, (1996).
[Dav80] Davenport Harold, Montgomery H. L., Multiplicative Number Theory, 2nd ed., Springer-Verlag, (1980).
[DI83] Deshouillers J.-M., Iwaniec Henryk, On the greatest prime factor of n^2 + 1, Ann. Inst. Fourier (Grenoble), (32), no. 4, 1-11, (1983).
[Erd52] Erdős Pál, On the greatest prime factor of ∏ f(k), J. Lond. Math. Soc., (27), 379-384, (1952).
[Erd60] Erdős Pál, Über die kleinste quadratfreie Zahl einer arithmetischen Reihe, Monatsh. Math., (64), 314-316, (1960).
[Erd49] Erdős Pál, On some applications of Brun's method, Acta Univ. Szeged. Sect. Sci. Math., (13), 57-63, (1949).
[Est31] Estermann Theodor, Einige Sätze über quadratfreie Zahlen, Math. Ann., (105), 653-662, (1931).
[Gal67] Gallagher P. X., The large sieve, Mathematika, (14), 14-20, (1967).
[GeL66] Gel'fond A. O., Linnik Yu. V., Elementary Methods in the Analytic Theory of Numbers, MIT Press, (1966).
[Hal70] Halberstam H., On integers all of whose prime factors are small, Proc. Lond. Math. Soc. (3), (21), 102-107, (1970).
[HR74] Halberstam H., Richert H.-E., Sieve Methods, Academic Press, (1974).
[HalRo66] Halberstam H., Roth K. F., Sequences, Oxford University Press, (1966).
[HB84] Heath-Brown D. R., The Square Sieve and Consecutive Square-Free Numbers, Math. Ann., (266), 251-259, (1984).
[HB88] Heath-Brown D. R., The number of primes in a short interval, J. Reine Angew. Math., (389), 22-63, (1988).
[Ho67] Hooley C., On the greatest prime factor of a quadratic polynomial, Acta Math., (17), 281-299, (1967).
[Ho73] Hooley C., On the largest prime factor of p+a, Mathematika, (20), 135-143, (1973).
[Ho76] Hooley C., Applications of sieve methods to the theory of numbers, Cambridge University Press, (1976).
[Iwan82] Iwaniec Henryk, On the Brun-Titchmarsh theorem, J. Math. Soc. Japan, (34), no. 1, 95-123, (1982).
[vLR65] van Lint J. H., Richert H.-E., On primes in arithmetic progression, Acta Arith., (11), 209-216, (1965).
[Mir49] Mirsky L., On the frequency of pairs of squarefree numbers with a given difference, Bull. Amer. Math. Soc., (55), 936-939, (1949).
[Mon68] Montgomery H. L., A note on the large sieve, J. Lond. Math. Soc., (43), 93-98, (1968).
[MV73] Montgomery H. L., Vaughan R. C., The Large Sieve, Mathematika, (20), no. 40, 119-134, (1973).
[MV81] Montgomery H. L., Vaughan R. C., The Distribution of Squarefree Numbers, Recent Progress in Analytic Number Theory, Academic Press, 247-256, (1981).
[Mot70] Motohashi Yoichi, A note on the least prime in an arithmetic progression with a prime difference, Acta Arith., (17), 283-285, (1970).
[Odl71] Odlyzko Andrew M., Sieve Methods, Senior Thesis, California Institute of Technology, Pasadena, California, (1971).
[Rad24] Rademacher Hans, Beiträge zur Viggo Brunschen Methode in der Zahlentheorie, Abh. Math. Sem. Hamburg, (3), 12-30, (1924).
[RS62] Rosser J. B., Schoenfeld L., Approximate formulas for some functions of prime numbers, Illinois J. Math., (6), 64-89, (1962).
[Sch66] Schinzel A., On sums of roots of unity. (Solution of two problems of R. M. Robinson), Acta Arith., (11), 419-432, (1966).
[SchWa58] Schinzel A., Wang Y., A note on some properties of the functions φ(n), σ(n) and θ(n), Ann. Polon. Math., (4), 201-213, (1958).
[Sel47] Selberg A., On an elementary method in the theory of primes, Norske Vid. Selsk. Forh. Trondhjem, (19), no. 18, 64-67, (1947).
[Sel71] Selberg A., Sieve Methods, Proc. Symp. Pure Math., (20), 311-351, (1971).
[Tit86] Titchmarsh E. C., The Theory of the Riemann Zeta-function, 2nd ed., Oxford University Press, (1986).
[Wal63] Walfisz Arnold, Weylsche Exponentialsummen in der neueren Zahlentheorie, Deutscher Verlag der Wissenschaften, Berlin, (1963).
[War27] Ward D. R., Some series involving Euler's function, J. Lond. Math. Soc., (2), 210-214, (1927).
[Warl90] Warlimont Richard, Sieving by large prime factors, Monatsh. Math., (109), no. 3, 247-256, (1990).
[Wil74] Wilson, Richard

陈景润定理对筛法理论的贡献

经过查证,在国际最新筛法专著的前言中,作者专门提及陈景润定理的现代意义,而我们国人却陈景润不理解。呜呼!

请看本文附件。

袁萌  陈启清  2月4日

附件:在最新筛法专著的前言中,专门提及陈景润定理的现代意义。

Sieve Methods

DENIS XAVIER CHARLES

Preface(前言)

Sieve methods have had a long and fruitful history. The sieve of Eratosthenes (around 3rd century B.C.) was a device to generate prime numbers. Later Legendre used it in his studies of the prime number counting function π(x). Sieve methods bloomed and became a topic of intense investigation after the pioneering work of Viggo Brun (see [Bru16],[Bru19], [Bru22]). Using his formulation of the sieve Brun proved, that the sum

∑ p, p+2 both prime

1 p

converges. This was the  rst result of its kind, regarding the Twin-prime problem. A slew of sieve methods were developed over the years — Selberg’s upper bound sieve, Rosser’s Sieve, the Large Sieve, the Asymptotic sieve, to name a few. Many beautiful results have been proved using these sieves. The Brun-Titchmarsh theorem and the extremely powerful result of Bombieri are two important examples. Chen’s theorem [Che73], namely that there are in nitely many primes p such that p+2 is a product of at most two primes, is another indication of the power of sieve methods.

Sieve methods are of importance even in applied fields of number theory such as Algorithmic Number Theory and Cryptography. There are many direct applications, for example finding all the prime numbers below a certain bound, or constructing numbers free of large prime factors. There are indirect applications too; for example, the running time of several factoring algorithms depends directly on the distribution of smooth numbers in short intervals. The so-called undeniable signature schemes require prime numbers of the form $2p+1$ such that $p$ is also prime. Sieve methods can yield valuable clues about these distributions and hence allow us to bound the running times of these algorithms.

In this treatise we survey the major sieve methods and their important applications in number theory. We apply sieves to study the distribution of square-free numbers, smooth numbers, and prime numbers. The first chapter is a discussion of the basic sieve formulation of Legendre. We show that the distribution of square-free numbers can be deduced using a square-free sieve¹. We give an account of improvements in the error term of this distribution, using known results regarding the Riemann zeta function.

The second chapter deals with Brun’s Combinatorial sieve as presented in the modern language of [HR74]. We apply the general sieve to give a simpler proof of a theorem of Rademacher [Rad24]. The bound obtained by this simpler proof is slightly inferior, but still sufficient for applications such as the result of Erdős, Chowla and Briggs on the number of mutually orthogonal Latin squares. The formulation of Brun’s sieve in [HR74] also includes a proof of the important Buchstab identity. We use it to derive some bounds on the distribution of smooth numbers ([Hal70]).

The third chapter deals with the development and the applications of Selberg’s upper bound method. The proof by van Lint and Richert [vLR65] of the Brun-Titchmarsh theorem is given as the chief application. Hooley’s improvement of bounds on prime factors in a problem studied by Chebyshev is also outlined here. The last chapter is a study of the Large Sieve. We give an outline of a proof of Bombieri’s central theorem on the error term in the distribution of primes. A new application of the Bombieri theorem is shown; we prove that there are infinitely many primes $p$ such that $p+2$ is a square-free number with at most 7 prime factors.

Acknowledgements: I would like to thank my advisor Dr. Ken Regan for allowing me to work on a topic of my own interest. His support, encouragement and advice have been invaluable for my work. I thank him for proofreading the entire document and for his constructive comments. A special word of thanks to Dr. Jin-Yi for helping me with character sums. I thank him for answering my queries in such a way that I gained a new insight into the problem. I thank Dr. Alan Selman for his encouragement and advice. I am deeply grateful to Professors Eric Bach, Tom Cusick, Kevin Ford, and Andrew Granville for promptly answering my queries. Their suggestions, pointers, and ideas were invaluable for this work. I am indebted to the National Science Foundation for the monetary support for this work, under my advisor’s grant CCR 98-20140.

¹ This is not a new proof; it is implicit in the work of Erdős [Erd60].

I thank my parents for their love, encouragement and prayers. I thank Pavan, Maurice, and Samik for pretending to be interested in sieves, and for reviewing the proofs. A special word of thanks to all my friends for anchoring me in sanity through this summer.

Denis Charles. July 2000

To Truth and Purity

Contents

Preface

Chapter 0. Notation and preliminaries
  0.1. Standard Nomenclature
  0.2. Conventions
  0.3. Preliminaries

Chapter 1. The sieve of Eratosthenes
  1.1. Introduction
  1.2. Sieve of Eratosthenes-Legendre
  1.3. Smooth numbers
  1.4. Density of squarefree numbers
  1.5. The error term in the distribution of Squarefree numbers
  1.6. Pairs of squarefree numbers
  1.7. The smallest squarefree number in an arithmetic progression
  1.8. The Sieve Problem

Chapter 2. The Combinatorial Sieve
  2.1. Brun’s Pure Sieve
  2.2. Brun’s Sieve
  2.3. Orthogonal Latin Squares and the Euler Conjecture
  2.4. A Theorem of Schinzel
  2.5. Smooth Numbers
  2.6. On the number of integers prime to a given number

Chapter 3. Selberg’s Sieve
  3.1. The Selberg upper-bound method
  3.2. The Brun-Titchmarsh Theorem
  3.3. Prelude to a theorem of Hooley
  3.4. A theorem of Hooley

Chapter 4. The Large Sieve
  4.1. Bounds on exponential sums
  4.2. The Large Sieve
  4.3. The Brun-Titchmarsh Theorem revisited
  4.4. Bombieri’s Theorem
  4.5. Prime and Squarefree pairs

Bibliography


CHAPTER 0

Notation and preliminaries

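0.1. Standard Nomenclature

The largest integer not exceeding $x$ is denoted $\lfloor x \rfloor$. We write $a \backslash b$ for two integers $a, b$ with $a \neq 0$ if $a$ divides $b$. The Möbius function is denoted by $\mu(n)$ and defined as:

\[ \mu(n) = \begin{cases} (-1)^k & \text{if } n = p_1 \cdots p_k \text{ with } p_i \neq p_j \text{ for } 1 \le i < j \le k, \\ 0 & \text{otherwise.} \end{cases} \]

The prime counting function is $\pi(x)$, defined as the cardinality of the set $\mathcal{P} = \{p \le x \mid p \text{ a prime}\}$, while $\pi(x;q,a)$ will denote the cardinality of $\{p \le x \mid p \equiv a \bmod q\}$. We denote the von Mangoldt function by $\Lambda(n)$:

\[ \Lambda(n) = \begin{cases} \log p & \text{if } n = p^k \text{ for a prime } p, \\ 0 & \text{otherwise,} \end{cases} \]

and its cumulation by $\psi(x) = \sum_{n \le x} \Lambda(n)$. If $n = p_1^{e_1} \cdots p_k^{e_k}$ is the prime factorization of $n$ then $\nu(n) = k$ denotes the number of distinct primes in the factorization. We write $\varphi(n)$ for Euler’s totient function:

\[ \varphi(n) = n \prod_{p \backslash n} \left(1 - \frac{1}{p}\right). \]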

0.2. Conventions

The letter $p$ will always denote a prime number. Consequently, $\sum_{n \le p \le m} f(p)$ will denote a sum over the prime numbers in the range of summation. $A$ will stand for a general integer sequence to be sifted, and $P$ for the sifting set of primes. We employ the standard $O$ and $o$ notation. We use the Vinogradov notation $\ll$ to mean that an inequality holds with some constant, i.e., $f(n) \ll g(n) \iff \exists\, c > 0 : f(n) \le c\, g(n)$. If $\gcd(a,b) = 1$ for two integers $a$ and $b$, then we also write $a \perp b$.

0.3. Preliminaries

THEOREM 0.3.1. Let $n \ge 1$ be an integer. Then

\[ \sum_{d \backslash n} \mu(d) = \begin{cases} 1, & \text{if } n = 1, \\ 0, & \text{otherwise.} \end{cases} \]

Proof : Since divisors that are not squarefree drop out of the sum by the definition of $\mu$, we may without loss of generality assume that $n$ is squarefree. Let $n = p_1 p_2 \cdots p_l$ with $l \ge 1$; then any divisor $d$ of $n$ has the form $p_1^{e_1} p_2^{e_2} \cdots p_l^{e_l}$ with $e_i \in \{0,1\}$ for $1 \le i \le l$. Using this we can split up the sum we wish to evaluate:

\[ \sum_{d \backslash n} \mu(d) = \sum_{\substack{p_1^{e_1} \cdots p_l^{e_l} \\ e_1 + \cdots + e_l \text{ even}}} 1 \;-\; \sum_{\substack{p_1^{e_1} \cdots p_l^{e_l} \\ e_1 + \cdots + e_l \text{ odd}}} 1 = \binom{l}{0} - \binom{l}{1} + \binom{l}{2} - \cdots + (-1)^l \binom{l}{l} = (1-1)^l = 0. \]


There is another way we could have evaluated the sum. Let $T(l)$ be the number of 0-1 strings of length $l$ that have an odd number of 1s in them. Consider the last position of such a string. If it is a 1, then we must fill the rest of the positions with an even number of 1s, which can be done in $2^{l-1} - T(l-1)$ ways. If the last position is a 0, then the rest of the string must have an odd number of 1s, which can be done in $T(l-1)$ ways. We have argued that $T(l)$ satisfies the following recurrence:

\[ T(l) = T(l-1) + \bigl(2^{l-1} - T(l-1)\bigr) = 2^{l-1}. \]

Thus the number of sequences with an odd number of 1s and the number of them with an even number of 1s is the same, and so the above sum is zero.
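Theorem 0.3.1 is easy to check numerically. The following is a minimal sketch, not part of the original text: the function names `mobius` and `divisor_sum_of_mobius` are illustrative choices, and trial division is used only because it is short, not because it is efficient.

```python
def mobius(n: int) -> int:
    """Return mu(n), computed by trial division (adequate for small n)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                # p^2 divides the original n, so mu vanishes.
                return 0
            result = -result
        p += 1
    if n > 1:
        # One prime factor larger than the square root remains.
        result = -result
    return result


def divisor_sum_of_mobius(n: int) -> int:
    """Return the sum of mu(d) over all divisors d of n."""
    return sum(mobius(d) for d in range(1, n + 1) if n % d == 0)
```

Under these definitions `divisor_sum_of_mobius(n)` should return 1 for `n = 1` and 0 for every larger `n`, which is the statement of the theorem.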

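THEOREM 0.3.2. (Möbius Inversion) If

\[ f(n) = \sum_{d \backslash n} g(d) \]

then

\[ g(n) = \sum_{d \backslash n} \mu(d)\, f\!\left(\frac{n}{d}\right). \]

Proof :

\[
\begin{aligned}
\sum_{d \backslash n} \mu(d)\, f\!\left(\frac{n}{d}\right) &= \sum_{d \backslash n} \mu(d) \sum_{l \backslash (n/d)} g(l) \\
&= \sum_{l \backslash n} g(l) \sum_{d \backslash (n/l)} \mu(d) \\
&= \sum_{l = n} g(l) \quad \text{by Theorem 0.3.1} \\
&= g(n).
\end{aligned}
\]

THEOREM 0.3.3. If

\[ f(n) = \sum_{d \backslash n} g(d) \]

then

\[ g(n) = \sum_{d \backslash n} \mu\!\left(\frac{n}{d}\right) f(d). \]

Proof :

\[
\begin{aligned}
\sum_{d \backslash n} \mu\!\left(\frac{n}{d}\right) f(d) &= \sum_{d \backslash n} \mu\!\left(\frac{n}{d}\right) \sum_{l \backslash d} g(l) \\
&= \sum_{l \backslash n} g(l) \sum_{d \backslash (n/l)} \mu\!\left(\frac{n}{dl}\right) \\
&= \sum_{l = n} g(l) \quad \text{by Theorem 0.3.1} \\
&= g(n).
\end{aligned}
\]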


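THEOREM 0.3.4.

\[ \sum_{d \backslash n} \frac{\mu(d)}{d} = \prod_{p \backslash n} \left(1 - \frac{1}{p}\right) = \prod_{p \backslash n} \left(1 + \frac{\mu(p)}{p}\right). \]

Proof : We know that $\sum_{d \backslash n} \varphi(d) = n$. Using Möbius inversion on this we get:

\[ n \prod_{p \backslash n} \left(1 - \frac{1}{p}\right) = \varphi(n) = \sum_{d \backslash n} \mu(d)\, \frac{n}{d} = n \sum_{d \backslash n} \frac{\mu(d)}{d}. \]

REMARK 0.3.5. The proof of Theorem 0.3.4 actually works for any multiplicative function of the divisors of $n$ in the denominator, provided it is zero at non-squarefree divisors. We could have also proved Theorem 0.3.1 using the identity:

\[ \sum_{d \backslash n} \mu(d) = \prod_{p \backslash n} \bigl(1 + \mu(p)\bigr). \]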


CHAPTER 1

The sieve of Eratosthenes

1.1. Introduction

The sieve of Eratosthenes is a simple effective procedure for finding all the primes up to a certain bound $x$. Take a list of the numbers $2, 3, \ldots, \lfloor x \rfloor$. Call 2 a prime, and start by crossing out all the multiples of 2. Because 3 is uncrossed at this stage, 3 must be prime. Cross out the multiples of 3 since they are composite, and then pick the next number that is still uncrossed and repeat. If after a stage the next uncrossed number exceeds $\sqrt{x}$ then stop. At this stage all the numbers that are not crossed out are prime.
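The procedure just described can be sketched in a few lines. This is an illustrative sketch only: the name `primes_up_to` and the boolean list `crossed` are assumptions of this example, and the multiples of each prime are crossed out starting from twice the prime, as in the description above.

```python
def primes_up_to(x: int) -> list:
    """Return all primes p <= x using the sieve of Eratosthenes."""
    if x < 2:
        return []
    # crossed[n] is True once n has been crossed out as a proper multiple.
    crossed = [False] * (x + 1)
    p = 2
    # Stop as soon as the next candidate exceeds sqrt(x).
    while p * p <= x:
        if not crossed[p]:
            # p is still uncrossed, hence prime: cross out 2p, 3p, ...
            for multiple in range(2 * p, x + 1, p):
                crossed[multiple] = True
        p += 1
    return [n for n in range(2, x + 1) if not crossed[n]]
```

For instance, `primes_up_to(30)` returns the ten primes from 2 to 29.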

Legendre realized that this procedure can be captured succinctly in a theoretical analog of the sifting process, and used this in his study of the function $\pi(x) = |\{p \le x \mid p \text{ a prime}\}|$. In this chapter we will try to apply this basic technique to study some simple problems. First we shall look at the sieve applied to the problem of estimating $\pi(x)$. Although the method would lead to an exact formula for $\pi(x) - \pi(\sqrt{x})$, this does not give useful estimates for $\pi(x)$ owing to a huge error term. However we can adapt the basic method to study other sequences of numbers, for example the squarefree numbers, meaning numbers that are products of distinct primes. The basic sieve we develop will be more successful in dealing with squarefree numbers, essentially because they are denser than the primes. We will be able to give interesting bounds on the density of these numbers in arithmetic progressions and in pairs $(n, n+2)$. We shall also find a bound on the smallest squarefree number in an arithmetic progression. Finally we shall give the general setup of a sieve problem and re-formulate the classical sieve of Eratosthenes-Legendre in this framework.

1.2. Sieve of Eratosthenes-Legendre

Let $P_z = \prod_{p<z} p$. The sieve of Eratosthenes deletes from the list of numbers all those numbers that are not relatively prime to $P_z$, except the primes dividing $P_z$ itself. We are interested in finding bounds on the cardinality of the set $S = \{n \mid n \le x,\ n \perp P_z\}$. We define

\[ s(n) = \begin{cases} 1, & \text{if } n \in S, \\ 0, & \text{otherwise.} \end{cases} \]

This is the characteristic function of the set $S$. Using the properties of the Möbius function (see Chapter 0), we can write an explicit expression for $s(n)$:

\[ s(n) = \sum_{d \backslash \gcd(n, P_z)} \mu(d). \]

We will call such a function $s(n)$ the sifting function.


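Then

\[
\begin{aligned}
|S| &= \sum_{n \le x} s(n) \\
&= \sum_{n \le x} \sum_{d \backslash \gcd(n, P_z)} \mu(d) \\
&= \sum_{d \backslash P_z} \mu(d) \sum_{\substack{n \le x \\ d \backslash n}} 1 \\
&= \sum_{d \backslash P_z} \mu(d) \left\lfloor \frac{x}{d} \right\rfloor \\
&= \sum_{d \backslash P_z} \mu(d) \left( \frac{x}{d} + \left\lfloor \frac{x}{d} \right\rfloor - \frac{x}{d} \right) \\
&= \sum_{d \backslash P_z} \mu(d)\, \frac{x}{d} + \sum_{d \backslash P_z} \mu(d) \left( \left\lfloor \frac{x}{d} \right\rfloor - \frac{x}{d} \right).
\end{aligned}
\]

Since each term in the second sum has absolute value at most 1, we obtain

\[ |S| \le x \sum_{d \backslash P_z} \frac{\mu(d)}{d} + 2^{\pi(z)} = x \prod_{p \backslash P_z} \left(1 - \frac{1}{p}\right) + 2^{\pi(z)}. \]

Now a theorem of Mertens states that

\[ \prod_{p < z} \left(1 - \frac{1}{p}\right) \sim \frac{e^{-\gamma}}{\ln z}, \]

and this yields the estimate:

\[ |S| \le \frac{x\, e^{-\gamma}}{\ln z} + 2^{\pi(z)} \]

provided $z \to \infty$ as $x \to \infty$. The usefulness of the above scheme is restricted by the huge error term $2^{\pi(z)}$. For $z = O(\ln x)$, for example, we get $\pi(x) - \pi(\ln x) = O\!\left(\frac{x}{\ln\ln x}\right)$, and since $\pi(x) \le x$ we get the estimate $\pi(x) = O\!\left(\frac{x}{\ln\ln x}\right)$. This is markedly inferior to the truth $\pi(x) \sim \frac{x}{\ln x}$. Note that if $z = \sqrt{x}$ then $|S|$ measures $\pi(x) - \pi(\sqrt{x})$, for which we have derived the following exact formula:

\[ \pi(x) - \pi(\sqrt{x}) + 1 = x \prod_{p < \sqrt{x}} \left(1 - \frac{1}{p}\right) + \sum_{d \backslash P_{\sqrt{x}}} \mu(d) \left( \left\lfloor \frac{x}{d} \right\rfloor - \frac{x}{d} \right). \]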
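The exact formula can be checked for small $x$ by summing $\mu(d)\lfloor x/d \rfloor$ over all squarefree products $d$ of the small primes. The sketch below is illustrative only; the names `primes_up_to` and `legendre_count` are assumptions of this example, and it sifts by the primes $p \le \sqrt{x}$ rather than $p < \sqrt{x}$, a choice made here so that the identity also holds when $x$ is the square of a prime.

```python
from itertools import combinations
from math import isqrt, prod


def primes_up_to(x: int) -> list:
    """Return all primes p <= x (simple sieve of Eratosthenes)."""
    if x < 2:
        return []
    crossed = [False] * (x + 1)
    for p in range(2, isqrt(x) + 1):
        if not crossed[p]:
            for multiple in range(p * p, x + 1, p):
                crossed[multiple] = True
    return [n for n in range(2, x + 1) if not crossed[n]]


def legendre_count(x: int) -> int:
    """Sum mu(d) * floor(x / d) over all d dividing the product of primes <= sqrt(x)."""
    small_primes = primes_up_to(isqrt(x))
    total = 0
    for r in range(len(small_primes) + 1):
        # Each r-element subset gives a squarefree divisor d with mu(d) = (-1)^r.
        for subset in combinations(small_primes, r):
            d = prod(subset)
            total += (-1) ** r * (x // d)
    return total
```

For $x = 100$ the sum runs over the 16 divisors of $2 \cdot 3 \cdot 5 \cdot 7$ and gives 22, which equals $\pi(100) - \pi(10) + 1 = 25 - 4 + 1$. The number of terms doubles with every additional sifting prime, which is the $2^{\pi(z)}$ error term in concrete form.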


1.3. Smooth numbers

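DEFINITION 1.3.1. A number $n$ will be called $k$-smooth if $\forall p : (p \backslash n) \Rightarrow (p < k)$. Let $\Psi(x,k) = |\{n \le x \mid n \text{ is } k\text{-smooth}\}|$, i.e., the number of $k$-smooth numbers up to a bound $x$. We can use our sieve argument to try to find a bound on $\Psi(x,k)$. The weakness of this simple sieve will be apparent in the bound it gives us.

PROPOSITION 1.3.2.

\[ \Psi(x,k) = O\!\left( \frac{x \ln k}{\ln x} + 2^{\pi(x) - \pi(k)} \right). \]

Proof : Since a number is $k$-smooth only if all its prime divisors are below $k$, we can find the $k$-smooth numbers below a bound $x$ by using as our sifting set $P = \{p \mid k < p \le x\}$. Let $P_{k,x} = \prod_{p \in P} p$. Let $S = \{n \mid n \text{ is } k\text{-smooth}\}$, and this time define

\[ s(n) = \begin{cases} 1 & \text{if } n \in S \text{ or } n = 1, \\ 0 & \text{otherwise.} \end{cases} \]

Now rewriting $s(n)$ using the Möbius function, we obtain

\[ s(n) = \sum_{d \backslash \gcd(n, P_{k,x})} \mu(d). \]

Setting $S(n) = |S|$, we apply Mertens’ Theorem at the end to conclude:

\[
\begin{aligned}
S(n) &= \sum_{n \le x} s(n) \\
&= \sum_{n \le x} \sum_{d \backslash \gcd(n, P_{k,x})} \mu(d) \\
&= \sum_{d \backslash P_{k,x}} \mu(d) \sum_{\substack{n \le x \\ d \backslash n}} 1 \\
&= \sum_{d \backslash P_{k,x}} \mu(d) \left\lfloor \frac{x}{d} \right\rfloor \\
&= x \prod_{k < p \le x} \left(1 - \frac{1}{p}\right) + O\!\left(2^{\pi(x) - \pi(k)}\right) \\
&= O\!\left( \frac{x \ln k}{\ln x} + 2^{\pi(x) - \pi(k)} \right).
\end{aligned}
\]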

The bound is clearly very poor. However we can improve this bound using more advanced sieve techniques. In [Warl90], a much better bound is given under some conditions on the sifting primes.
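To get a feeling for how small $\Psi(x,k)$ really is compared with this bound, one can simply count. The following brute-force sketch is not from the source; the names `largest_prime_factor` and `count_smooth` are illustrative, and the strict inequality $p < k$ follows Definition 1.3.1.

```python
def largest_prime_factor(n: int) -> int:
    """Return the largest prime factor of n, or 1 when n == 1."""
    largest = 1
    p = 2
    while p * p <= n:
        while n % p == 0:
            largest = p
            n //= p
        p += 1
    if n > 1:
        # Whatever remains is a single prime larger than all factors found so far.
        largest = n
    return largest


def count_smooth(x: int, k: int) -> int:
    """Count n <= x all of whose prime factors are strictly below k."""
    return sum(1 for n in range(1, x + 1) if largest_prime_factor(n) < k)
```

With `k = 3` only the powers of 2 survive, so `count_smooth(100, 3)` counts 1, 2, 4, ..., 64.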

1.4. Density of squarefree numbers

The basic method of the sieve of Eratosthenes-Legendre can be adapted to prove a more interesting result. Let $S = \{n \mid n \le x,\ n \text{ is squarefree}\}$, and let $\kappa(x) = |S|$. To obtain $S$ as a result of a sifting process, all we need to do is take primes $p < \sqrt{x}$ and cross off multiples of $p^2$ from the list. We shall show that a variant of the function $s(n)$ introduced earlier works in this case.

THEOREM 1.4.1.

\[ \kappa(x) = \frac{6}{\pi^2}\, x + O(\sqrt{x}). \]


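Proof : The sifting function for this set is now $s(n) = |\mu(n)|$, and $\kappa(x) = \sum_{n \le x} s(n) = \sum_{n \le x} |\mu(n)|$. Now we reach an impasse, because there does not seem to be any easy way of evaluating this sum. The trick is to look for another expression for the sifting function.

CLAIM.

\[ s(n) = \sum_{d^2 \backslash n} \mu(d). \]

Proof of claim : Any number $n$ can be represented as $n = m^2 w$, where $w$ is squarefree and $m^2$ is the largest square divisor of $n$. If $n = p_1^{e_1} p_2^{e_2} \cdots p_l^{e_l}$ with $e_i = 2q_i + r_i$, $0 \le r_i \le 1$, then $m = \prod_i p_i^{q_i}$ satisfies the expression. We shall write $\rho(n)$ for this $m$, the square root of the largest square divisor of $n$. Now

\[ \sum_{d^2 \backslash n} \mu(d) = \sum_{d \backslash \rho(n)} \mu(d), \]

and this sum is 0 unless $\rho(n) = 1$, in which case it is 1. This proves the claim.

Setting $m = \sqrt{x}$, we obtain:

\[
\begin{aligned}
\kappa(x) &= \sum_{n \le x} s(n) = \sum_{n \le x} \sum_{d^2 \backslash n} \mu(d) \\
&= \sum_{d \le m} \mu(d) \sum_{\substack{n \le x \\ d^2 \backslash n}} 1 = \sum_{d \le m} \mu(d) \left\lfloor \frac{x}{d^2} \right\rfloor \\
&= x \sum_{d \le m} \frac{\mu(d)}{d^2} + \sum_{d \le m} \mu(d) \left( \left\lfloor \frac{x}{d^2} \right\rfloor - \frac{x}{d^2} \right) \\
&= x \sum_{d \le m} \frac{\mu(d)}{d^2} + O(m).
\end{aligned}
\]

Using the fact that

\[ \prod_p \left(1 - \frac{1}{p^2}\right) = \sum_{n \ge 1} \frac{\mu(n)}{n^2} \]

we get

\[
\begin{aligned}
\kappa(x) &= x \left( \prod_p \left(1 - \frac{1}{p^2}\right) - \sum_{d > m} \frac{\mu(d)}{d^2} \right) + O(m) \\
&= x \prod_p \left(1 - \frac{1}{p^2}\right) + O(m),
\end{aligned}
\]

since $\bigl|\sum_{d > m} \mu(d)/d^2\bigr| \le \sum_{d > m} 1/d^2 = O(1/m)$ and $x/m = m$. Also

\[ \prod_p \left(1 - \frac{1}{p^2}\right) = \frac{1}{\zeta(2)}, \]

so that we finally get

\[ \kappa(x) = \frac{x}{\zeta(2)} + O(\sqrt{x}). \]

Euler showed that $\zeta(2) = \frac{\pi^2}{6}$, and using this in the above expression we have

\[ \kappa(x) = \frac{6}{\pi^2}\, x + O(\sqrt{x}). \]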
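The identity $\kappa(x) = \sum_{d \le \sqrt{x}} \mu(d) \lfloor x/d^2 \rfloor$ used in the proof is also a practical way to count squarefree numbers. Below is a small sketch under stated assumptions: the helper names `mobius_table`, `count_squarefree` and `main_term` are chosen for this illustration and do not appear in the source.

```python
from math import isqrt, pi


def mobius_table(m: int) -> list:
    """Return a list mu with mu[d] equal to the Moebius function for 1 <= d <= m."""
    mu = [1] * (m + 1)
    is_prime = [True] * (m + 1)
    for p in range(2, m + 1):
        if is_prime[p]:
            for multiple in range(p, m + 1, p):
                if multiple > p:
                    is_prime[multiple] = False
                # Each distinct prime factor flips the sign once.
                mu[multiple] = -mu[multiple]
            for multiple in range(p * p, m + 1, p * p):
                # Numbers divisible by p^2 are not squarefree.
                mu[multiple] = 0
    return mu


def count_squarefree(x: int) -> int:
    """Return kappa(x) as the sum of mu(d) * floor(x / d^2) over d <= sqrt(x)."""
    m = isqrt(x)
    mu = mobius_table(m)
    return sum(mu[d] * (x // (d * d)) for d in range(1, m + 1))


def main_term(x: int) -> float:
    """Return the main term 6x / pi^2 of Theorem 1.4.1."""
    return 6.0 * x / pi ** 2
```

For example `count_squarefree(100)` gives 61, against a main term of about 60.8, comfortably inside the $O(\sqrt{x})$ error allowed by the theorem.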


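Another natural question to ask is: what is the density of squarefree numbers in an arithmetic progression? We shall give a partial answer to that question in the next theorem. Let $\kappa(x;a,l) = |\{n \le x \mid n \text{ is squarefree},\ n \equiv a \bmod l\}|$.

THEOREM 1.4.2. Let $q > 2$ be a prime, and let $a$ be a positive integer relatively prime to $q$. Then there is a constant $c > 0$ depending only on $q$ such that $\kappa(x;a,q) \ge cx + O(\sqrt{x})$.

Proof : Using the same idea as in the previous proof we have:

\[ \kappa(x;a,q) = \sum_{\substack{n \equiv a \bmod q \\ n \le x}} \sum_{d^2 \backslash n} \mu(d) \tag{1.1} \]

\[ = \sum_{d \le m} \mu(d) \Biggl( \sum_{\substack{d^2 \backslash n \\ n \equiv a \bmod q \\ n \le x}} 1 \Biggr), \quad \text{where } m \text{ is } \lfloor \sqrt{x} \rfloor. \tag{1.2} \]

The quantity we need to bound is defined by

\[ N(x;d,a,q) = \sum_{\substack{d^2 \backslash n \\ n \equiv a \bmod q \\ n \le x}} 1. \]

This is essentially the number of solutions in $k$ to the congruence $kd^2 \equiv a \bmod q$. There are two cases:

[$d \perp q$] In this case there is a unique solution $k$ such that $k \equiv a (d^{-2}) \bmod q$. However, if $k \in \{0, 1, \ldots, q-1\}$ is such a solution then for $e \ge 1$, $k + eq$ is also a solution. Now $(k + eq)d^2 = n \le x$, so

\[ k + eq \le \frac{x}{d^2}, \qquad e \le \frac{x}{d^2 q} - \frac{k}{q}, \qquad e \le \frac{x}{d^2 q} \quad \text{as } k < q. \]

[$d \not\perp q$] In this case there are no solutions to the congruence, since $a$ is relatively prime to $q$.

Thus $N(x;d,a,q) = \left\lfloor \frac{x}{d^2 q} \right\rfloor$ if $d \perp q$, and 0 otherwise. Substituting in (1.2) we get

\[
\begin{aligned}
\kappa(x;a,q) &= \sum_{d \le m} \mu(d) \left\lfloor \frac{x}{d^2 q} \right\rfloor - \sum_{\substack{d \le m \\ d \not\perp q}} \mu(d) \left\lfloor \frac{x}{d^2 q} \right\rfloor \\
&= \frac{x}{q} \Biggl( \sum_{d \le m} \frac{\mu(d)}{d^2} - \sum_{d \not\perp q} \frac{\mu(d)}{d^2} \Biggr) + O(m).
\end{aligned}
\]

The second sum is bounded as follows:

\[
\begin{aligned}
\sum_{d \not\perp q} \frac{\mu(d)}{d^2} &\le \sum_{d \not\perp q} \frac{1}{d^2} = \sum_{q \backslash d,\ d \le x} \frac{1}{d^2} = \sum_{k \le (x/q)} \frac{1}{k^2 q^2} \\
&= \frac{1}{q^2} \sum_{k \le (x/q)} \frac{1}{k^2} \le \frac{1}{q^2} \sum_{k \ge 1} \frac{1}{k^2} = \frac{\pi^2}{6 q^2}.
\end{aligned}
\]

Thus we get

\[ \kappa(x;a,q) \ge x \left( \frac{1}{q\,\zeta(2)} - \frac{\zeta(2)}{q^2} \right) + O(\sqrt{x}), \]

and hence $\kappa(x;a,q) \ge cx + O(\sqrt{x})$ with $c = \frac{1}{q\zeta(2)} - \frac{\zeta(2)}{q^2}$, which is positive for $q \ge 3$ because $\zeta(2)^2 < 3$.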

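1.5. The error term in the distribution of Squarefree numbers

We proved in the previous section that $\kappa(x) - \frac{6}{\pi^2} x = O(\sqrt{x})$, and it turns out to be extremely difficult to improve on this bound. In this section we briefly digress from the topic of sieves to show a strengthening of the error term if one assumes the Riemann Hypothesis (henceforth called RH). First we shall strengthen the error term (unconditionally) using a theorem of Walfisz.

THEOREM 1.5.1 ([Wal63] Satz §5.5.3).

\[ \Bigl| \sum_{n \le x} \mu(n) \Bigr| \le B\, x \exp\!\left( -A \log^{3/5} x\, (\log\log x)^{-1/5} \right) \]

for some positive constants $A$ and $B$.

We simplify the proof in [Wal63] of the following theorem:

THEOREM 1.5.2 ([Wal63] Satz §5.6.1).

\[ \kappa(x) = \frac{6}{\pi^2}\, x + O\!\left( \sqrt{x} \exp\!\left( -c \log^{3/5} x\, (\log\log x)^{-1/5} \right) \right) \]

for some positive constant $c > 0$.

Proof :

\[ \kappa(x) = \sum_{1 \le n \le x} \sum_{d^2 \backslash n} \mu(d) = \sum_{d^2 m \le x} \mu(d) = \sum_{d^2 \le x} \mu(d) \left\lfloor \frac{x}{d^2} \right\rfloor. \]

Let $S_2(x,y) = \sum_{d \le y} \mu(d)\, \delta\!\left(\frac{x}{d^2}\right)$, where $\delta(z) = z - \lfloor z \rfloor - \frac{1}{2}$, and $M(y) = \sum_{n \le y} \mu(n)$. Then

\[ \kappa(x) = x \sum_{d^2 \le x} \frac{\mu(d)}{d^2} - S_2(x, \sqrt{x}) - \frac{1}{2} M(\sqrt{x}). \]

In [MV81] (see p.255) the following bound is proved:

\[ S_2(x,y) = O\!\left( x^{2/7} + y^{1/2} x^{1/7 + \varepsilon} \right), \]

and this implies that $S_2(x, \sqrt{x}) = O(x^{11/28 + \varepsilon})$. Now consider:

\[
\begin{aligned}
\sum_{d > y} \frac{\mu(d)}{d^2} &= 2 \sum_{d > y} \mu(d) \int_d^\infty \frac{dz}{z^3} \\
&= 2 \int_y^\infty \frac{dz}{z^3} \sum_{y < n < z} \mu(n) \\
&\qquad \text{(interchanging of the sum and the integral is valid since both of them are convergent)} \\
&= 2 \int_y^\infty \frac{M(z)\, dz}{z^3} - 2 M(y) \int_y^\infty \frac{dz}{z^3} \\
&= O\!\left( M(y) \int_y^\infty \frac{dz}{z^3} - o(1) \right) \\
&= O\!\left( \frac{M(y)}{y^2} \right),
\end{aligned}
\]

where in the last two lines $M(y)$ stands for the upper bound on $|M(z)|$, $z \ge y$, supplied by Theorem 1.5.1. Hence

\[ \sum_{d > \sqrt{x}} \frac{\mu(d)}{d^2} = O\!\left( \frac{\exp\{-c \log^{3/5} x\, (\log\log x)^{-1/5}\}}{\sqrt{x}} \right) \]

and also

\[ \sum_{d \le \sqrt{x}} \frac{\mu(d)}{d^2} = \frac{1}{\zeta(2)} + O\!\left( \frac{\exp\{-c \log^{3/5} x\, (\log\log x)^{-1/5}\}}{\sqrt{x}} \right). \]

The theorem follows from these estimates.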

COROLLARY 1.5.3. The number of squarefree numbers in the interval $[x, \ldots, x + \sqrt{x}]$ is asymptotic to $\frac{6\sqrt{x}}{\pi^2}$.

The corresponding problem for primes seems to be far more difficult, see [HB88]. It turns out that if the Riemann Hypothesis holds then $M(y) = O(y^{1/2 + \varepsilon})$, and using this in the above proof together with the bound on $S_2(x, \sqrt{x})$ we get the following theorem:

THEOREM 1.5.4. Assuming the Riemann Hypothesis,

\[ \kappa(x) = \frac{6}{\pi^2}\, x + O(x^{11/28 + \varepsilon}). \]


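It turns out that if we assume the Riemann Hypothesis we can do better even without the strong bound on $S_2(x,y)$. We begin as we did before,

\[
\begin{aligned}
\kappa(x) &= \sum_{1 \le d \le x} \mu(d) \sum_{\substack{1 \le n \le x \\ d^2 \backslash n}} 1 \\
&= \sum_{d^2 n \le x} \mu(d) \\
&= \sum_{\substack{d^2 n \le x \\ d \le y}} \mu(d) + \sum_{\substack{d^2 n \le x \\ d > y}} \mu(d) \\
&= \Sigma_1 + \Sigma_2 \quad \text{(say)}.
\end{aligned}
\]

Now (as in the proof of the previous theorem)

\[
\begin{aligned}
\Sigma_1 &= \sum_{d \le y} \mu(d) \left\lfloor \frac{x}{d^2} \right\rfloor \\
&= \sum_{d \le y} \mu(d) \left( \frac{x}{d^2} - \left( \frac{x}{d^2} - \left\lfloor \frac{x}{d^2} \right\rfloor - \frac{1}{2} \right) \right) - \frac{1}{2} \sum_{d \le y} \mu(d).
\end{aligned}
\]

Let as before

\[ S_2(x,y) = \sum_{d \le y} \mu(d)\, \delta\!\left( \frac{x}{d^2} \right) \]

and $M(y) = \sum_{d \le y} \mu(d)$, where $\delta(z) = z - \lfloor z \rfloor - \frac{1}{2}$, so that

\[ \Sigma_1 = x \sum_{d \le y} \frac{\mu(d)}{d^2} - S_2(x,y) - \frac{1}{2} M(y). \]

Let

\[ f_y(s) = \frac{1}{\zeta(s)} - \sum_{d \le y} \frac{\mu(d)}{d^s}. \]

We adopt the standard convention of referring to the real part of $s$ as $\sigma$ and the imaginary part as $t$. If $\sigma > 1$ then we have

\[ f_y(s) = \sum_{d > y} \frac{\mu(d)}{d^s}, \]

since in this case we also have

\[ \frac{1}{\zeta(s)} = \sum_{1 \le d} \frac{\mu(d)}{d^s}. \]

Consider

\[ \zeta(s) f_y(2s) = \sum_{1 \le n} \frac{1}{n^s} \sum_{d > y} \frac{\mu(d)}{d^{2s}} = \sum_{1 \le n} \frac{1}{n^s} \sum_{\substack{d > y \\ d^2 \backslash n}} \mu(d). \]

If we look at the restricted version of this sum, namely,

\[ \sum_{1 \le n \le x} \frac{1}{n^s} \sum_{\substack{d > y \\ d^2 \backslash n}} \mu(d), \]

then as $s \to 0$ this sum equals $\Sigma_2$. Thus we need a way of evaluating this sum when $s \to 0$. The following result (Lemma (3.12) [Tit86] p60) will help us do just that.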


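LEMMA 1.5.5. [Tit86] Let $\langle a_n \rangle$ be a sequence of real numbers, such that as $\sigma \to 1$ from above,

\[ \sum_{n \ge 1} \frac{|a_n|}{n^\sigma} = O\!\left( \frac{1}{(\sigma - 1)^\alpha} \right), \]

for some $\alpha \ge 1$. Let $\psi(n)$ be an upper bound for $|a_n|$, and define:

\[ f(s) = \sum_{n \ge 1} \frac{a_n}{n^s}, \quad \text{for } \sigma > 1. \]

If $c > 0$, $\sigma \ge 0$, $\sigma + c > 1$, $x$ is not an integer, and $N$ is the nearest integer to $x$, then for all $T > 0$:

\[ \sum_{n < x} \frac{a_n}{n^s} = \frac{1}{2\pi i} \int_{c - iT}^{c + iT} f(s + w)\, \frac{x^w}{w}\, dw + O\!\left( \frac{x^c}{T (\sigma + c - 1)^\alpha} \right) + O\!\left( \frac{\psi(2x)\, x^{1 - \sigma} \log x}{T} \right) + O\!\left( \frac{\psi(N)\, x^{1 - \sigma}}{T |x - N|} \right). \]

Applying this lemma to the series

\[ \sum_{1 \le n \le x} \frac{1}{n^s} \sum_{\substack{d > y \\ d^2 \backslash n}} \mu(d) \]

with $c = 1 + \frac{1}{\log x}$ and $T = x$ gives remainder terms of $O(x^\varepsilon)$, since $\psi(z) = O(\sqrt{z})$. Making the change of variable $w \leftarrow s$, taking the $s$ in the lemma to be 0, and setting $x_0 = \lfloor x \rfloor + \frac{1}{2}$ so that $x_0$ is not an integer, we obtain

\[ \Sigma_2 = \frac{1}{2\pi i} \int_{c - ix}^{c + ix} \zeta(s) f_y(2s)\, \frac{x_0^s}{s}\, ds + O(x^\varepsilon). \]

Now consider the closed contour made up of four segments:

\[ \int_{c - ix}^{c + ix} + \int_{c + ix}^{\frac{1}{2} + ix} + \int_{\frac{1}{2} + ix}^{\frac{1}{2} - ix} + \int_{\frac{1}{2} - ix}^{c - ix} \]

(where the integrand is the same as above). Since the integrand has a simple pole at $s = 1$ with residue $f_y(2) x_0$, the residue theorem gives

\[ \int_{c - ix}^{c + ix} + \int_{c + ix}^{\frac{1}{2} + ix} + \int_{\frac{1}{2} + ix}^{\frac{1}{2} - ix} + \int_{\frac{1}{2} - ix}^{c - ix} = 2\pi i\, f_y(2)\, x_0 \]

and so

\[ \Sigma_2 = f_y(2)\, x_0 + \frac{1}{2\pi i} \int_C \zeta(s) f_y(2s)\, \frac{x_0^s}{s}\, ds + O(x^\varepsilon), \]

where $C$ is the path made up of the line segments

\[ c - ix \to \tfrac{1}{2} - ix, \qquad \tfrac{1}{2} - ix \to \tfrac{1}{2} + ix, \qquad \tfrac{1}{2} + ix \to c + ix. \]

By Theorem (14.2) on p.337 of [Tit86], RH implies that $\frac{1}{\zeta(s)} = O(|t|^\varepsilon)$. Also

THEOREM 1.5.6 ([Tit86] (14.25A)). Assume RH. For $s$ with $\sigma > \frac{1}{2}$,

\[ \sum_{n < x} \frac{\mu(n)}{n^s} = \frac{1}{\zeta(s)} + O(T^{-1 + \varepsilon} x^2) + O(T^\varepsilon x^{\frac{1}{2} - \sigma + \delta}). \]

Using this we can take $T$ large so that

\[ f_y(s) = O(y^{\frac{1}{2} - \sigma + \delta'}) \tag{1.3} \]

under RH.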


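Also by Theorem (14.25C) [Tit86], RH implies $M(z) = O(z^{\frac{1}{2} + \varepsilon})$. Using all this information we can bound

\[ \int_C \zeta(s) f_y(2s)\, \frac{x_0^s}{s}\, ds \]

on the contour $C$: we have $f_y(2s) = O(y^{\frac{1}{2} - 1 + \varepsilon}) = O(y^{-\frac{1}{2} + \varepsilon})$ and $\zeta(s) = \frac{1}{s - 1} + O(t^\varepsilon)$, and since $x^s = x^{\sigma + it} = e^{(\sigma + it)\log x} = e^{\sigma \log x + it \log x}$, we have $|x^s| = x^\sigma$. Thus the integral above is:

\[ O(x^{\frac{1}{2} + \varepsilon} y^{-\frac{1}{2} + \varepsilon}). \]

Combining all these estimates we get the following bounds:

THEOREM 1.5.7 ([MV81]). Assuming the Riemann Hypothesis, for any $y > 0$

\[ \kappa(x) = \frac{x}{\zeta(2)} - S_2(x,y) + O\!\left( x^{\frac{1}{2} + \varepsilon} y^{-\frac{1}{2} + \varepsilon} + y^{\frac{1}{2} + \varepsilon} \right). \]

COROLLARY 1.5.8. Assuming the Riemann Hypothesis,

\[ \kappa(x) = \frac{x}{\zeta(2)} + O\!\left( x^{\frac{1}{3} + \delta} \right). \]

Proof : Clearly we have $S_2(x,y) = O(y)$; now setting $y = x^{\frac{1}{3}}$ in the above theorem we get the result.

In the same article [MV81] Montgomery and Vaughan went on to estimate the sums involved more precisely to show that $\kappa(x) = \frac{1}{\zeta(2)} x + O(x^{\frac{9}{28} + \varepsilon})$. Subsequently the exponent of the error term was reduced to $\frac{7}{22}$ by various authors (see [BakPin85]).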

1.6. Pairs of squarefree numbers

The famous twin prime problem asks whether there are infinitely many primes $p$ such that $p+2$ is also prime. Although this problem is still open, the analogous question for the squarefree numbers can be settled rather easily using the methods we have seen so far. For a more general version of this result see [Mir49]. Let

\[ \kappa_2(x) = \bigl| \{ n(n+2) \mid \mu(n)^2 = \mu(n+2)^2 = 1,\ n \le x \} \bigr|. \]

THEOREM 1.6.1.

\[ \kappa_2(x) = \prod_p \left(1 - \frac{2}{p^2}\right) x + O\!\left( x^{\frac{2}{3}} \ln^{\frac{4}{3}} x \right). \]

Proof : Let $s(n) = \sum_{d^2 \backslash n} \mu(d)$. Using this we have

\[ \kappa_2(x) = \sum_{n \le x} s(n) s(n+2) = \sum_{n \le x} \sum_{a^2 \backslash n} \mu(a) \sum_{b^2 \backslash (n+2)} \mu(b). \]

If $a^2 \backslash n$ and $b^2 \backslash (n+2)$, then writing $n = k_1 a^2$ and $n + 2 = k_2 b^2$ we have $k_1' a^2 + k_2 b^2 = 2$ ($k_1' = -k_1$). This says that $\gcd(a^2, b^2)$ divides 2, so $\gcd(a,b)$ must be 1, i.e. $a \perp b$. Now interchanging the sum we get

\[ \kappa_2(x) = \sum_{\substack{k_1 a^2 - k_2 b^2 = 2 \\ k_2 b^2 \le x \\ a \perp b}} \mu(a) \mu(b). \]

The rest of the proof is now to bound the above sum, and to this end we split up the sum into two parts:

\[ \kappa_2(x) = \sum_{ab \le y} \mu(a) \mu(b)\, N(x; a^2, b^2, 2) + \sum_{\substack{ab > y \\ k_1 a^2 - k_2 b^2 = 2,\ k_2 b^2 \le x}} \mu(a) \mu(b). \]

Here $N(x; a^2, b^2, 2)$ is a count of the number of solutions to the equation $k_1 a^2 - k_2 b^2 = 2$, $k_2 b^2 \le x$.


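It is clear that $N(x; a^2, b^2, 2) = 0$ if $\gcd(a^2, b^2)$ does not divide 2, and otherwise

\[ N(x; a^2, b^2, 2) = \frac{x}{\operatorname{lcm}(a^2, b^2)} + O(1) = \frac{x}{(ab)^2} + O(1), \]

since $a \perp b$.

Using this we have

\[
\begin{aligned}
\sum_{\substack{ab \le y \\ a \perp b}} \mu(a) \mu(b)\, N(x; a^2, b^2, 2) &\le \sum_{\substack{ab \le y \\ a \perp b}} \mu(a) \mu(b) \left( \frac{x}{(ab)^2} + O(1) \right) \\
&= x \sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} + \sum_{ab \le y} \mu(a) \mu(b),
\end{aligned}
\]

since the terms with $a \not\perp b$ are killed by the Möbius function.

Thus

\[ \sum_{ab \le y} \mu(a) \mu(b) \le \sum_{ab \le y} 1 = \left\lfloor \frac{y}{1} \right\rfloor + \left\lfloor \frac{y}{2} \right\rfloor + \cdots + \left\lfloor \frac{y}{y} \right\rfloor = O\!\left( y \sum_{1 \le k \le y} \frac{1}{k} \right) = O(y \ln y). \]

Now the sum

\[ \sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} \]

can be evaluated by looking at the terms with $\nu(ab) = k$. Write $a = p_1^{\varepsilon_1} p_2^{\varepsilon_2} \cdots p_k^{\varepsilon_k}$ and $b = p_1^{\delta_1} p_2^{\delta_2} \cdots p_k^{\delta_k}$. Since $a \perp b$ we should have $(\forall i : 1 \le i \le k)\ \varepsilon_i + \delta_i = 1$, so there are $2^{\nu(ab)}$ terms whose denominator is $(ab)^2$. Hence

\[ \sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} = \sum_{n \le y} \frac{\mu(n)\, 2^{\nu(n)}}{n^2} = \prod_{p \le y} \left(1 - \frac{2}{p^2}\right). \]

So

\[ \sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} = \prod_p \left(1 - \frac{2}{p^2}\right) - \sum_{n > y} \frac{\mu(n)\, 2^{\nu(n)}}{n^2}. \]

We need a bound on the sum on the right hand side of the above equation. Now

\[ \sum_{ab > y} \frac{1}{(ab)^2} = \sum_{b < y,\ ab > y} \frac{1}{(ab)^2} + \sum_{a > y,\ b > y} \frac{1}{(ab)^2}. \]

The second sum converges, so we need a bound on the first part of the sum. Now:

\[ \sum_{b < y,\ ab > y} \frac{1}{(ab)^2} \le \sum_{1 \le b \le y} \frac{1}{b^2} \sum_{a > \frac{y}{b}} \frac{1}{a^2}. \]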


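\[ \sum_{a > \frac{y}{b}} \frac{1}{a^2} \le \int_{\frac{y}{b}}^{\infty} \frac{1}{a^2}\, da = \frac{b}{y} \]

so we have

\[ \sum_{b < y,\ ab > y} \frac{1}{(ab)^2} \le \frac{1}{y} \sum_{1 \le b \le y} \frac{1}{b} = \frac{1}{y} \ln y. \]

We finally get

\[ x \sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} = x \prod_p \left(1 - \frac{2}{p^2}\right) + O\!\left( \frac{x}{y} \ln y \right). \]

Now we have to bound the sum

\[ \sum_{\substack{ab > y \\ k_1 a^2 - k_2 b^2 = 2,\ k_2 b^2 \le x}} \mu(a) \mu(b). \]

We re-express this sum as follows:

\[ \sum_{\substack{ab > y \\ a^2 c - b^2 d = 2 \\ b^2 d \le x}} \mu(a) \mu(b) \le \sum_{\substack{a^2 c - b^2 d = 2 \\ b^2 d \le x \\ ab > y}} 1. \]

Since $a^2 c = 2 + b^2 d$, we have $a^2 c \le 2 + x$, and this gives us $c \le \frac{x+2}{a^2}$. Since $d \le \frac{x}{b^2}$ and $y < ab$ we have $cd \le \frac{x(x+2)}{a^2 b^2}$, and hence $cd < \frac{x(x+2)}{y^2}$. This gives

\[ \sum_{\substack{a^2 c - b^2 d = 2 \\ b^2 d \le x,\ ab > y}} 1 \le \sum_{cd < \frac{x(x+2)}{y^2}} M(x; c, d, 2), \]

where $M(x; c, d, 2)$ is the number of solutions of

\[ c a^2 - d b^2 = 2, \quad d b^2 \le x. \tag{1.4} \]

The above equation implies that

\[ \left( \frac{2 c^{-1}}{p} \right) \equiv 1 \bmod p \ \text{ for all } p \backslash d, \qquad \left( \frac{-2 d^{-1}}{p} \right) \equiv 1 \bmod p \ \text{ for all } p \backslash c. \]

Estermann studied these congruences and for the case $cd$ not a square he proved [Est31]:

\[ M(x; c, d, 2) = O(\ln x), \]

in fact that $M(x; c, d, 2) \le 4(\ln(x+2) + 1)$. If $cd$ is a square then, since the equation (1.4) implies $c \perp d$, we can set $c = l^2$, $d = m^2$ to obtain:

\[ M(x; c, d, 2) = \sum_{l^2 a^2 - m^2 b^2 = 2} 1 \le \sum_{r^2 - s^2 = 2} 1 = 0. \]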


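In any case we have $M(x; c, d, 2) = O(\ln x)$, and using this we have:

\[ \sum_{cd < \frac{x(x+2)}{y^2}} M(x; c, d, 2) \le \ln x \sum_{cd < \frac{x(x+2)}{y^2}} 1. \]

For any positive constant $K$ we have:

\[ \sum_{cd < K} 1 = \sum_{c < K} \left\lfloor \frac{K}{c} \right\rfloor \le K \ln K, \]

so

\[ \sum_{cd < \frac{x(x+2)}{y^2}} M(x; c, d, 2) \le \ln^2 x\, \frac{x(x+2)}{y^2} = O\!\left( \frac{x^2}{y^2} \ln^2 x \right). \]

Setting $y = x^{\frac{2}{3}} \ln^{\frac{1}{3}} x$ we have

\[ \sum_{ab > y} \mu(ab) \le \sum_{cd < \frac{x(x+2)}{y^2}} M(x; c, d, 2) \le x^{\frac{2}{3}} \ln^{\frac{4}{3}} x, \]

and also

\[ x \sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} = x \prod_p \left(1 - \frac{2}{p^2}\right) + O\!\left( x^{\frac{1}{3}} \ln^{\frac{2}{3}} x \right) + o(1). \]

The theorem follows from these two bounds.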
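As a sanity check on the main term of Theorem 1.6.1, one can count the pairs directly and compare the observed density with a truncation of the product $\prod_p (1 - 2/p^2)$. The sketch below is illustrative only; the function names and the truncation parameter `prime_bound` are assumptions of this example, not part of the source.

```python
def squarefree_flags(limit: int) -> list:
    """Return flags with flags[n] True exactly when n >= 1 is squarefree."""
    flags = [True] * (limit + 1)
    flags[0] = False
    d = 2
    while d * d <= limit:
        # Cross off every multiple of d^2.
        for multiple in range(d * d, limit + 1, d * d):
            flags[multiple] = False
        d += 1
    return flags


def count_squarefree_pairs(x: int) -> int:
    """Count n <= x such that both n and n + 2 are squarefree."""
    flags = squarefree_flags(x + 2)
    return sum(1 for n in range(1, x + 1) if flags[n] and flags[n + 2])


def pair_density(prime_bound: int) -> float:
    """Return the product of (1 - 2/p^2) over primes p <= prime_bound."""
    density = 1.0
    composite = [False] * (prime_bound + 1)
    for p in range(2, prime_bound + 1):
        if not composite[p]:
            density *= 1.0 - 2.0 / (p * p)
            for multiple in range(p * p, prime_bound + 1, p):
                composite[multiple] = True
    return density
```

For $x = 10$ the qualifying values are $n = 1, 3, 5$, so `count_squarefree_pairs(10)` is 3. For larger $x$ the ratio `count_squarefree_pairs(x) / x` should settle near the value of `pair_density` for a moderately large `prime_bound`, roughly 0.32.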

1.7. The smallest squarefree number in an arithmetic progression

The simple methods that we have seen so far are surprisingly powerful and provide a quick bound on the smallest squarefree number in an arithmetic progression. The following result is from [Erd60] and is one of the early uses of a squarefree sieve.

THEOREM 1.7.1. Let $a \perp D$, $1 \le a < D$. Then the smallest squarefree number in the arithmetic progression $\langle a + kD : k \ge 0 \rangle$ is $O\!\left( \frac{D^{3/2}}{\ln D} \right)$.

Proof : Let $A = \langle a + kD : k \ge 0 \rangle$ be the sequence. The first step would be to sift $A$ by all squares of primes below a certain limit $z$. This will leave out only those numbers that could have a large prime as their square divisor. We will finally bound the number of such integers below $x$ and show that there are still some numbers left over, and that will prove the theorem. Let $P_z = \prod_{p < z} p$. The result of the sifting of the sequence $A$ by $P_z$ is:

\[ S(A; P_z, x) = \sum_{\substack{n \in A \\ n \le x}} \sum_{\substack{d^2 \backslash n \\ d \backslash P_z}} \mu(d) = \sum_{d \backslash P_z} \mu(d) \Biggl( \sum_{\substack{n \in A \\ n \le x,\ d^2 \backslash n}} 1 \Biggr). \]


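Now

\[ \sum_{\substack{n \in A \\ n \le x,\ d^2 \backslash n}} 1 \]

is exactly the number of solutions to the following pair of congruences:

\[ n \equiv 0 \bmod d^2, \qquad n \equiv a \bmod D. \]

Suppose $d \perp D$. Then there is exactly one solution in each interval of length $\operatorname{lcm}(D, d^2) = D d^2$, so the total number of solutions in $1 \le n \le x$ is at most $\frac{x}{D d^2} + 1$. If $\gcd(d, D) = \delta$ then $n = k\delta$ by the first congruence and $n - a = k'\delta$ by the second congruence. This yields $a = (k - k')\delta$ and so $\gcd(a, D) \neq 1$. This is a contradiction, so if $d \not\perp D$ there are no solutions to the congruence. Let $k = \left\lfloor \frac{x - a}{D} \right\rfloor$, which is the maximum value of $k$ for $a + kD$ to be in $A$. Then

\[
\begin{aligned}
S(A; P_z, x) &= \sum_{d \backslash P_z,\ d \perp D} \mu(d) \left( \frac{x}{D d^2} + 1 \right) \\
&= \frac{x}{D} \sum_{d \backslash P_z,\ d \perp D} \frac{\mu(d)}{d^2} \\
&= k \sum_{\substack{d \backslash P_z \\ d \perp D}} \frac{\mu(d)}{d^2} + o(1) \\
&= k \prod_{p \backslash P_z,\ p \not\backslash D} \left(1 - \frac{1}{p^2}\right) + o(1) \\
&\ge k \prod_p \left(1 - \frac{1}{p^2}\right) + o(1) \\
&= k\, \frac{6}{\pi^2} + o(1).
\end{aligned}
\]

Taking $k$ to be $\frac{c\sqrt{D}}{\ln D}$ we have $S(A; P_z, k) \ge \frac{6}{\pi^2} \frac{c\sqrt{D}}{\ln D}$. The number of integers $a + kD$ in $A$ for which $k < \frac{c\sqrt{D}}{\ln D}$ and also

\[ n \equiv 0 \bmod p^2, \qquad n \equiv a \bmod D \]

is at most $\frac{c\sqrt{D}}{p^2 \ln D} + 1$. Let $N$ stand for the number of integers $k < \frac{c\sqrt{D}}{\ln D}$ in $A$ for which $a + kD \not\equiv 0 \bmod p^2$ for all $p \le \sqrt{cD}$. Then

\[ N \ge \frac{6}{\pi^2} \frac{c\sqrt{D}}{\ln D} - \frac{c\sqrt{D}}{\ln D} \sum_{p \ge z} \frac{1}{p^2} - \sum_{p \ge z,\ p \le \sqrt{cD}} 1 \tag{1.5} \]

\[ \ge \frac{6}{\pi^2} \frac{c\sqrt{D}}{\ln D} - \frac{c\sqrt{D}}{\ln D} \cdot \frac{1}{z} - \pi(\sqrt{cD}), \tag{1.6} \]

and so for large enough $c$ and $L$

\[ N > \frac{1}{2} \frac{c\sqrt{D}}{\ln D}. \tag{1.7} \]

We have used the fact that $\pi(x) < \frac{2x}{\ln x}$ for large enough $x$.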

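Now we are left with the numbers that are either squarefree or divisible by the square of a prime $p > \sqrt{cD}$. For these numbers $a + kD$ either $a + kD \equiv 0 \bmod p^2$, $k < \frac{c\sqrt{D}}{\ln D}$ and $p > \sqrt{cD}$, or

\[ a + kD = \alpha p^2 \quad \text{with } \alpha < \frac{\sqrt{D}}{\ln D}. \]

Supposing $p > D^{\frac{1}{2} + \varepsilon}$, we would have $\alpha < D^{\frac{1}{2} - \varepsilon}$ if $D$ is large enough, so we also have $p < D$. Thus $a + kD = \alpha p^2$ yields a congruence $a \equiv \alpha p^2 \bmod D$. Let us fix an $\alpha$; then clearly the number of such prime solutions is less than the number of solutions for the congruence $x^2 \equiv a\alpha^{-1} \bmod D$, $0 < x < D$. If $a\alpha^{-1}$ is a quadratic residue modulo $D$, then by the Chinese Remainder Theorem there are at most $2^{\nu(D)}$ such solutions to this congruence. Since $\nu(n) = o(\ln n)$, we can write $2^{\nu(D)} = o(D^{\frac{\varepsilon}{2}})$. If $p > D^{\frac{1}{2} + \varepsilon}$ then there are only $D^{\frac{1}{2} - \varepsilon}$ choices for $\alpha$, so on the whole there are only $o(D^{\frac{1}{2} - \frac{\varepsilon}{2}})$ such solutions.

Let us consider the solutions for $\sqrt{cD} < p < D^{\frac{1}{2} + \varepsilon}$. We have

\[ p^2 \equiv a\alpha^{-1} \bmod D, \qquad \alpha < \frac{\sqrt{D}}{\ln D}, \qquad \sqrt{cD} < p < D^{\frac{1}{2} + \varepsilon}. \]

Let $c_\alpha$ be the number of solutions of this congruence for a fixed $\alpha$. These solutions give rise to $\sum \binom{c_\alpha}{2}$ solutions to the congruence

\[ p^2 \equiv q^2, \qquad p, q < D^{\frac{1}{2} + \varepsilon}. \tag{1.8} \]

Since (1.8) implies $(p - q)(p + q) \equiv 0 \bmod D$, the number of such solutions is at most the number of solutions to $uv \equiv 0 \bmod D$, $u < 2D^{\frac{1}{2} + \varepsilon}$, $v < 2D^{\frac{1}{2} + \varepsilon}$. This gives us

\[ uv = \beta D, \qquad 1 \le \beta < 4D^{2\varepsilon}. \tag{1.9} \]

Also for a fixed $\beta$ the number of such solutions is less than the number of factors of the number $\beta D$, which is $o((\beta D)^\varepsilon)$, so the number of solutions of (1.9) is $o((\beta D)^\varepsilon)\, 4D^{2\varepsilon} = o(D^{4\varepsilon})$. This gives $\sum \binom{c_\alpha}{2} = o(D^{4\varepsilon})$ and hence

\[ \sum_{c_\alpha > 1} c_\alpha = o(D^{4\varepsilon}). \]

Since $\alpha < \frac{\sqrt{D}}{\ln D}$, we have $\sum c_\alpha \le \frac{\sqrt{D}}{\ln D} + o(D^{4\varepsilon})$. Thus the number of integers $0 \le k < \frac{c\sqrt{D}}{\ln D}$ for which

\[ a + kD \equiv 0 \bmod p^2 \]

for some $p > \sqrt{cD}$ is at most $\frac{D^{\frac{1}{2}}}{\ln D} + o(D^{\frac{1}{2} - \frac{\varepsilon}{2}})$. So the number of integers $k$, $0 \le k < \frac{c\sqrt{D}}{\ln D}$, for which $a + kD$ is squarefree is at least

\[ \frac{1}{2} \frac{c\sqrt{D}}{\ln D} - \frac{\sqrt{D}}{\ln D} - o(D^{\frac{1}{2} - \frac{\varepsilon}{2}}) > 0 \]

for large enough $c$.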
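For small moduli the quantity bounded in Theorem 1.7.1 can simply be computed. The following is a minimal sketch, assuming a direct search along the progression; the names `is_squarefree` and `smallest_squarefree_in_progression` are illustrative and do not come from the source.

```python
from math import gcd


def is_squarefree(n: int) -> bool:
    """Return True when no square d^2 with d >= 2 divides n."""
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True


def smallest_squarefree_in_progression(a: int, D: int) -> int:
    """Return the least squarefree term of a, a + D, a + 2D, ... for gcd(a, D) = 1."""
    if not (1 <= a < D) or gcd(a, D) != 1:
        raise ValueError("expected 1 <= a < D with a relatively prime to D")
    n = a
    # The theorem guarantees a squarefree term, so the loop terminates.
    while not is_squarefree(n):
        n += D
    return n
```

For example, with $a = 4$ and $D = 9$ the progression starts 4, 13, and 13 is the first squarefree term. Tabulating the result over all admissible $a$ for a fixed $D$ gives an empirical view of how far the actual values sit below the $O(D^{3/2}/\ln D)$ bound.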

1.8. The Sieve Problem

Now that we have seen some examples of sieve techniques at work, we can formulate the sieve problem in a generic setting so that the essential quantities are clearly visible. The notation we shall adopt is that of the seminal book by Halberstam and Richert [HR74].


1.8.1. Notation.

1. $A, B, \ldots$ will stand for integer sequences.
2. $A_d = \langle a \in A : a \equiv 0 \bmod d \rangle$.
3. $A^z = \langle a \in A : a \le z \rangle$.
4. If $A$ is a finite sequence then $|A|$ will denote the length of the sequence.
5. $\mathcal{P} = \langle p_i : p_i \text{ is the } i\text{-th prime} \rangle$.
6. $P_z = \prod_{p \in \mathcal{P}^z} p$.
7. $S(A; P_z, x)$ will be the number of elements in $A^x$ that survive the sifting process by the sequence $\mathcal{P}^z$. In general the sifting is determined by a sifting function $\sigma : A \to \{0,1\}$ which determines whether a number survives the sifting, but usually we will only be considering simple sifting functions like

\[ \sigma(n) = 1 \iff n \perp \prod_{p \in S_z} p. \]

8. If $A$ is a finite sequence then $\omega(p)$ is defined such that $\frac{\omega(p)}{p} x$ is a good approximation to $|A^x_p|$. If $d$ is any squarefree integer we can generalize this notation by defining $\omega(d) = \prod_{p \backslash d} \omega(p)$.
9. Define $R_d(x) = |A^x_d| - \frac{\omega(d)}{d} x$, i.e. the remainder term in our estimate of $|A^x_d|$.
10. Define

\[ W(z) = \prod_{p \backslash P_z} \left(1 - \frac{\omega(p)}{p}\right). \]

1.8.2. The Sieve of Eratosthenes-Legendre revisited. The generic sieve problem is to estimate $S(A; P_z, x)$. Needless to say, solving the problem as stated in this generality is too great a task. This treatise will only be concerned with restricted versions of the sieve problem which nevertheless yield interesting and non-trivial results. The case of great importance is when $S_z = \mathcal{P}^z$ and $A$ is some subsequence of positive integers.

The sieve of Eratosthenes-Legendre can be recast in this framework as follows. Let $A$ be the sequence to be sifted, and let $\omega(d)$ and $R_d$ be the modulo counting function and the remainder function for the sequence, respectively. Let $\mathcal{P}^z$ be the sifting sequence; then the sifting function is

\[ \sigma(n) = \begin{cases} 1 & \text{if } n \perp P_z, \\ 0 & \text{otherwise.} \end{cases} \]

We can rewrite $\sigma(n)$ as

\[ \sigma(n) = \sum_{d \backslash \gcd(n, P_z)} \mu(d). \]


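Thus we have

\[
\begin{aligned}
S(A; P_z, x) &= \sum_{n \in A,\ n \le x} \sigma(n) \\
&= \sum_{n \in A^x} \sum_{d \backslash \gcd(n, P_z)} \mu(d) \\
&= \sum_{d \backslash P_z} \mu(d) \sum_{\substack{n \in A^x \\ d \backslash n}} 1 \\
&= \sum_{d \backslash P_z} \mu(d)\, |A^x_d| \\
&= \sum_{d \backslash P_z} \mu(d) \left( \frac{\omega(d)}{d} x + R_d(x) \right) \\
&= x \sum_{d \backslash P_z} \frac{\mu(d)\, \omega(d)}{d} + \sum_{d \backslash P_z} \mu(d) R_d(x) \\
&= x \prod_{p \backslash P_z} \left(1 - \frac{\omega(p)}{p}\right) + \sum_{d \backslash P_z} \mu(d) R_d(x) \\
&= x W(z) + \sum_{d \backslash P_z} \mu(d) R_d(x) \\
&= x W(z) + \theta \sum_{d \backslash P_z} |R_d(x)|,
\end{aligned}
\]

where $|\theta| \le 1$. If we assume that $|R_d(x)| \le \omega(d)$ and suppose that $\omega(p) \le C_0$, where $C_0$ is some constant, then $\omega(d) \le C_0^{\nu(d)}$. So

\[ \sum_{d \backslash P_z} |R_d(x)| \le \sum_{d \backslash P_z} C_0^{\nu(d)} = \prod_{p \backslash P_z} (1 + C_0) = (1 + C_0)^{\pi(z)}. \]

Thus we have proved the following theorem.

THEOREM 1.8.1. For all sufficiently large $x$ and $z < x$, there is a $\theta$ with $|\theta| \le 1$ ($\theta$ depending on $z$), such that

\[ S(A; P_z, x) = x W(z) + \theta \sum_{d \backslash P_z} |R_d(x)|. \]

If we have $|R_d(x)| \le \omega(d)$ and $\omega(p) \le C_0$ then

\[ S(A; P_z, x) = x W(z) + O\!\left( (1 + C_0)^{\pi(z)} \right). \]

It is very clear that the effectiveness of the basic sieve is limited by the fact that the remainder term is a sum over all the divisors of $P_z$. Beginning with the next chapter we shall systematically try to reduce this term.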


CHAPTER 2

The Combinatorial Sieve

In this chapter we begin by exploring the ideas of Viggo Brun, who first showed how we can improve on the Legendre method if we relax our requirement of asymptotic results and instead look for inequalities. After developing Brun’s sieve in general we shall look at applications that bring out the surprising power of the technique. We follow the presentation in Halberstam & Richert [HR74] rather closely since its form is well suited for our applications. However, our development will be targeted only at Brun’s sieve.

2.1. Brun’s Pure Sieve

Let $A_x$ be a finite sequence of integers and let $S_z$ be the sifting primes. In the previous chapter the sifting function was
\[
\sigma(n) = \sum_{d \mid \gcd(n, P_z)} \mu(d).
\]
Let us see what can be done if instead we have a pair of functions $\chi_1(d)$ and $\chi_2(d)$ such that
\[
\sigma_2(n) \equiv \sum_{d \mid n} \mu(d)\chi_2(d) \le \sum_{d \mid n} \mu(d) \le \sum_{d \mid n} \mu(d)\chi_1(d) \equiv \sigma_1(n).
\]
Since
\[
S(A; P_z, x) = \sum_{d \mid P_z} \mu(d)|A_d| = |A| - \sum_{p \mid P_z} |A_p| + \sum_{pq \mid P_z} |A_{pq}| - \cdots
\]
we expect that truncating the series after an even (odd) number of sums will give us a lower (upper) bound. Brun’s pure sieve is an application of this well-known idea.

Using the notation developed in the last chapter we have
\[
\sum_{n \in A} \sum_{\substack{d \mid n \\ d \mid P_z}} \mu(d)\chi_2(d) \le S(A, P_z, x) \le \sum_{n \in A} \sum_{\substack{d \mid n \\ d \mid P_z}} \mu(d)\chi_1(d).
\]

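Let us first look at the upper bound:
\[
\begin{aligned}
\sum_{n \in A} \sum_{\substack{d \mid n \\ d \mid P_z}} \mu(d)\chi_1(d)
&= \sum_{d \mid P_z} \mu(d)\chi_1(d)\,|A_d| \\
&= \sum_{d \mid P_z} \mu(d)\chi_1(d)\left(\frac{\omega(d)}{d}\,x + |R_d(x)|\right) \\
&= x \sum_{d \mid P_z} \mu(d)\chi_1(d)\frac{\omega(d)}{d} + \sum_{d \mid P_z} \mu(d)\chi_1(d)\,|R_d(x)|.
\end{aligned}
\]
Let $\sigma_1(n) = \sum_{d \mid n} \mu(d)\chi_1(d)$; then by Möbius inversion we get
\[
\mu(d)\chi_1(d) = \sum_{\delta \mid d} \mu\!\left(\frac{d}{\delta}\right)\sigma_1(\delta).
\]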

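Substituting this in the above expression we get
\[
\begin{aligned}
x \sum_{d \mid P_z} \mu(d)\chi_1(d)\frac{\omega(d)}{d}
&= x \sum_{d \mid P_z} \frac{\omega(d)}{d} \sum_{\delta \mid d} \mu\!\left(\frac{d}{\delta}\right)\sigma_1(\delta) \\
&= x \sum_{\delta \mid P_z} \sigma_1(\delta)\frac{\omega(\delta)}{\delta} \sum_{t \mid (P_z/\delta)} \mu(t)\frac{\omega(t)}{t}
\qquad \text{(since $\omega(d)$ is a multiplicative function)} \\
&= x \sum_{\delta \mid P_z} \sigma_1(\delta)\frac{\omega(\delta)}{\delta} \prod_{p \mid (P_z/\delta)}\left(1 - \frac{\omega(p)}{p}\right) \\
&= xW(z) \sum_{\delta \mid P_z} \sigma_1(\delta)\,\frac{\omega(\delta)}{\delta \prod_{p \mid \delta}\left(1 - \frac{\omega(p)}{p}\right)} \\
&= xW(z) \sum_{\delta \mid P_z} \sigma_1(\delta)g(\delta)
 = xW(z)\left(1 + \sum_{1 < \delta \mid P_z} \sigma_1(\delta)g(\delta)\right),
\end{aligned}
\]
where $g(d)$ abbreviates
\[
g(d) = \frac{\omega(d)}{d \prod_{p \mid d}\left(1 - \frac{\omega(p)}{p}\right)}.
\]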

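The remainder term is clearly at most
\[
\sum_{d \mid P_z} \mu(d)\chi_1(d)\,|R_d(x)| \le \sum_{d \mid P_z} |\chi_1(d)|\,|R_d(x)|.
\]
A similar argument works for the lower bound too. Thus we have:
\[
xW(z)\left(1 + \sum_{1 < \delta \mid P_z} \sigma_2(\delta)g(\delta)\right) - \sum_{d \mid P_z} |\chi_2(d)|\,|R_d(x)| \le S(A, P_z, x) \tag{2.10}
\]
\[
S(A, P_z, x) \le xW(z)\left(1 + \sum_{1 < \delta \mid P_z} \sigma_1(\delta)g(\delta)\right) + \sum_{d \mid P_z} |\chi_1(d)|\,|R_d(x)|. \tag{2.11}
\]
Our aim will be to minimize $\left|\sum_{1 < \delta \mid P_z} \sigma_i(\delta)g(\delta)\right|$ for $i = 1, 2$ such that the remainder term $\sum_{d \mid P_z} |\chi_i(d)|\,|R_d|$ is small. A whole class of estimates can be obtained by restricting the functions $\chi_i$ to be the characteristic sequences of two divisor sets $D^+$ and $D^-$ of $P_z$. The resulting sieves are called Combinatorial Sieves. Let us consider the following functions:
\[
\chi^{(r)}(d) =
\begin{cases}
1 & \text{if } \nu(d) \le r \text{ and } \mu^2(d) = 1, \\
0 & \text{otherwise.}
\end{cases}
\]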

These functions restrict the divisor sets over which we take the sum. In particular the restriction is on the number of distinct prime factors of the divisors. We will require the following lemma.

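LEMMA 2.1.1.
\[
\sum_{0 \le i \le k} (-1)^i \binom{n}{i} = (-1)^k \binom{n-1}{k}.
\]

Proof : The proof is by induction on $k$. For $k = 0$ we have
\[
(-1)^0 \binom{n}{0} = \binom{n-1}{0} + \binom{n-1}{-1} = \binom{n-1}{0}.
\]
Now
\[
\begin{aligned}
\sum_{0 \le i \le k+1} (-1)^i \binom{n}{i}
&= \sum_{0 \le i \le k} (-1)^i \binom{n}{i} + (-1)^{k+1}\binom{n}{k+1} \\
&= (-1)^k \binom{n-1}{k} + (-1)^{k+1}\binom{n}{k+1} \\
&= (-1)^k \binom{n-1}{k} + (-1)^{k+1}\left(\binom{n-1}{k} + \binom{n-1}{k+1}\right) \\
&= (-1)^{k+1}\binom{n-1}{k+1}.
\end{aligned}
\]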
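The identity of Lemma 2.1.1 is easy to confirm by direct computation. This is a small sketch using the standard-library `math.comb`; the helper name and the ranges of `n` and `k` are arbitrary illustrative choices.

```python
from math import comb

def alternating_binomial_sum(n, k):
    """Left-hand side of Lemma 2.1.1: sum_{0 <= i <= k} (-1)^i * C(n, i)."""
    return sum((-1) ** i * comb(n, i) for i in range(k + 1))

# Print both sides of the identity for a few small cases.
for n in range(1, 6):
    for k in range(0, n + 1):
        lhs = alternating_binomial_sum(n, k)
        rhs = (-1) ** k * comb(n - 1, k)
        print(n, k, lhs, rhs)
```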

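LEMMA 2.1.2. Let $n$ be a positive integer and $s$ a non-negative integer. Then
\[
\sum_{d \mid n} \mu(d)\chi^{(2s+1)}(d) \le \sum_{d \mid n} \mu(d) \le \sum_{d \mid n} \mu(d)\chi^{(2s)}(d).
\]

Proof : When $n = 1$ all the sums are equal so we can assume $n > 1$. Then
\[
\sum_{d \mid n} \mu(d)\chi^{(r)}(d) = \sum_{0 \le k \le r} (-1)^k \binom{\nu(n)}{k} = (-1)^r \binom{\nu(n)-1}{r}
\]
by Lemma (2.1.1). This is non-negative for even $r$ and non-positive for odd $r$, while $\sum_{d \mid n} \mu(d) = 0$ for $n > 1$, which gives the claim.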

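Now let us try to bound the terms involved in (2.11). Let $\sigma^{(r)}(n) = \sum_{d \mid n} \mu(d)\chi^{(r)}(d)$, so that we have
\[
\sigma^{(r)}(n) = \sum_{\substack{d \mid n \\ \nu(d) \le r}} \mu(d) = (-1)^r \binom{\nu(n)-1}{r}
\]
and hence $|\sigma^{(r)}(n)| = \binom{\nu(n)-1}{r} \le \binom{\nu(n)}{r}$. Then we have
\[
\begin{aligned}
\left|\sum_{1 < d \mid P_z} \sigma^{(r)}(d)g(d)\right|
&\le \sum_{1 < d \mid P_z} \binom{\nu(d)}{r} g(d)
 = \sum_{r \le m \le \nu(P_z) = \pi(z)} \binom{m}{r} \sum_{\substack{1 < d \mid P_z \\ \nu(d) = m}} g(d) \\
&\le \sum_{m \ge r} \binom{m}{r} \frac{1}{m!} \left(\sum_{p < z} g(p)\right)^{m} \\
&= \frac{1}{r!} \left(\sum_{p < z} g(p)\right)^{r} \exp\left(\sum_{p < z} g(p)\right).
\end{aligned}
\]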

Suppose we make the assumption $|R_d(x)| \le \omega(d)$; then we can also bound the remainder term as follows:
\[
\sum_{d \mid P_z} |\chi^{(r)}(d)|\,|R_d(x)| \le \sum_{\substack{d \mid P_z \\ \nu(d) \le r}} \omega(d) \le \left(1 + \sum_{p < z} \omega(p)\right)^{r}.
\]

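Since
\[
\sum_{\substack{d \mid P_z \\ \nu(d) \le 2s+1}} \mu(d)|A_d|
= \sum_{\substack{d \mid P_z \\ \nu(d) \le 2s}} \mu(d)|A_d| - \sum_{\substack{d \mid P_z \\ \nu(d) = 2s+1}} |A_d|
\le S(A; P_z, x)
\le \sum_{\substack{d \mid P_z \\ \nu(d) \le 2s}} \mu(d)|A_d|
\]
we can always write
\[
S(A; P_z, x) = \sum_{\substack{d \mid P_z \\ \nu(d) \le r}} \mu(d)|A_d| + \theta \sum_{\substack{d \mid P_z \\ \nu(d) = r+1}} |A_d|, \qquad |\theta| \le 1.
\]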
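The alternating bounds behind the pure sieve can be observed directly. The sketch below truncates the inclusion-exclusion sum at $\nu(d) \le r$ for the assumed sequence $A = \{1, \dots, x\}$ (so $|A_d| = \lfloor x/d \rfloor$); the prime list and `x` are illustrative values only.

```python
from itertools import combinations
from math import prod

def truncated_sum(x, primes, r):
    """Sum of mu(d) * |A_d| over squarefree d | P_z with nu(d) <= r, A = {1..x}."""
    total = 0
    for k in range(min(r, len(primes)) + 1):
        for combo in combinations(primes, k):
            total += (-1) ** k * (x // prod(combo))
    return total

primes = [2, 3, 5, 7, 11, 13]
x = 10_000
exact = truncated_sum(x, primes, len(primes))   # full inclusion-exclusion

for r in range(len(primes) + 1):
    t = truncated_sum(x, primes, r)
    kind = "upper bound" if r % 2 == 0 else "lower bound"
    print(r, t, kind, "exact =", exact)
```

Even truncation levels overshoot and odd levels undershoot the exact sifted count, in line with Lemma 2.1.2.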

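Putting all these together we have:
\[
S(A; P_z, x) = xW(z)\left(1 + \theta\,\frac{1}{r!}\left(\sum_{p < z} g(p)\right)^{r} \exp\left(\sum_{p < z} g(p)\right)\right) + \theta'\left(1 + \sum_{p < z} \omega(p)\right)^{r}
\]
for some positive integer $r$, and with $|\theta| \le 1$, $|\theta'| \le 1$. Thus we have proved: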

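THEOREM 2.1.3 (Brun’s Pure Sieve). Let
\[
g(d) = \frac{\omega(d)}{d \prod_{p \mid d}\left(1 - \frac{\omega(p)}{p}\right)}
\]
be well defined for all $d$ with $\mu(d) \ne 0$, and suppose $|R_d(x)| \le \omega(d)$. Then for every non-negative integer $r$ there exist $\theta, \theta'$ with $|\theta| \le 1$, $|\theta'| \le 1$ such that
\[
S(A; P_z, x) = xW(z)\left(1 + \theta\,\frac{1}{r!}\left(\sum_{p < z} g(p)\right)^{r} \exp\left(\sum_{p < z} g(p)\right)\right) + \theta'\left(1 + \sum_{p < z} \omega(p)\right)^{r}.
\]

We can apply this theorem to derive a much better bound on $\pi(x)$ than we obtained earlier. We consider the sequence $A = \{n \le x\}$, and in this case since $A_p = \{n \le x \mid n \equiv 0 \bmod p\}$ we have
\[
|A_p| = \left\lfloor \frac{x}{p} \right\rfloor = \frac{1}{p}\,x + \delta', \qquad |\delta'| < 1.
\]
So we can take $\omega(p) = 1$, and the condition $|R_d(x)| \le \omega(d)$ is also satisfied. Also
\[
g(p) = \frac{1}{p\left(1 - \frac{1}{p}\right)} \le \frac{2}{p},
\]
and this gives us
\[
\sum_{p < z} g(p) \le 2 \sum_{p < z} \frac{1}{p} < 2(\ln\ln z + 1).
\]
We use the trivial estimate $1 + \sum_{p < z} \omega(p) \le z$. In this case we have
\[
W(z) = \prod_{p < z}\left(1 - \frac{1}{p}\right) \sim \frac{e^{-\gamma}}{\ln z}.
\]
We begin with the following observations. First, $\frac{1}{r!} \le \left(\frac{e}{r}\right)^{r}$ by the Stirling approximation. Next, if we set $z$ such that $\sum_{p < z} g(p) \le \lambda r$, then the result of the theorem simplifies to
\[
S(A; P_z, x) = xW(z)\left(1 + \theta\left(\lambda e^{1+\lambda}\right)^{r}\right) + \theta' z^{r}.
\]
Defining
\[
r = \left\lfloor \frac{2(\ln\ln z + 1)}{\lambda} \right\rfloor + 1
\]
gives us $\sum_{p < z} g(p) \le \lambda r$. We restrict $z$ so that
\[
\ln z = \frac{\ln x}{\gamma \ln\ln x},
\]
and set
\[
\lambda = \frac{\xi \ln z\,(\ln\ln z + 1)}{\ln x}
\]
so that for a large enough $x$ and appropriate settings of $\xi, \gamma$ we get $\lambda e^{1+\lambda} \le 1$. For this setting of $z$ and $r$ we have $z^{r} = o(x^{1-\varepsilon})$ for some $\varepsilon > 0$. Thus the theorem gives
\[
\pi(x) = O\left(\frac{x \ln\ln x}{\ln x}\left(1 + \theta e^{-c \ln\ln x}\right)\right) + o(x^{1-\varepsilon}).
\]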

This approximation is significantly better than our first and shows the improvement that can be made using this simple idea. Next we will look at the twin primes problem, which was Brun’s primary application of his pure sieve. In this case we take the sequence to be $A = \{n(n+2) \mid n \le x\}$. Let $p > 2$; then $A_p = \{n(n+2) \mid n \le x,\ n(n+2) \equiv 0 \bmod p\}$. Now $n(n+2) \equiv 0 \bmod p$ only if $n \equiv 0$ or $n + 2 \equiv 0$, since $p$ is an odd prime. Clearly $0$ and $p - 2$ are the two solutions in the interval $0, \dots, p-1$. So we can take $\omega(p) = 2$ for $p > 2$. For $p = 2$ we have $\omega(p) = 1$. By the Chinese Remainder Theorem we have $|R_d(x)| \le \omega(d)$. We take the sifting primes to be $P = \{p \mid p > 2\}$. Since $S(A; P_z, x)$ counts all the twin-prime pairs above $z$, $S(A; P_z, x) + 2z$ is an upper bound on the number of twin primes below $x$. Then:
\[
W(z) = \prod_{2 < p < z}\left(1 - \frac{2}{p}\right) \le \prod_{2 < p < z}\left(1 - \frac{1}{p}\right)^{2} = O\left(\frac{1}{\ln^2 z}\right).
\]
Carrying out the rest of the analysis again using $\ln z = \frac{\ln x}{\gamma \ln\ln x}$ we get the following theorem.

THEOREM 2.1.4. Let $\pi_2(x) = |\{p \le x \mid p + 2 = q\}|$, where $p$ and $q$ denote primes. Then
\[
\pi_2(x) = O\left(x\left(\frac{\ln\ln x}{\ln x}\right)^{2}\right).
\]
The above theorem can be put in a more impressive form.

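THEOREM 2.1.5.
\[
\sum_{\substack{p \\ p+2=q}} \frac{1}{p}
\]
converges.

Proof :
\[
\begin{aligned}
\sum_{\substack{p \\ p+2=q}} \frac{1}{p}
&= \sum_{n} \frac{\pi_2(n) - \pi_2(n-1)}{n} \\
&= \sum_{n} \pi_2(n)\left(\frac{1}{n} - \frac{1}{n+1}\right) \\
&= \sum_{n} \pi_2(n)\,\frac{1}{n(n+1)} \\
&\le B \sum_{n} \frac{n (\ln\ln n)^2}{n(n+1)\ln^2 n} \\
&\le B \sum_{n} \frac{1}{n}\left(\frac{\ln\ln n}{\ln n}\right)^{2} = O(1).
\end{aligned}
\]
The last step follows via
\[
\sum_{n \le x} \frac{1}{n}\left(\frac{\ln\ln n}{\ln n}\right)^{2} \le \frac{2}{\ln x} + \frac{2\ln\ln x}{\ln x} + \frac{(\ln\ln x)^2}{\ln x}\,(1 + o(1))
\]
using approximation by integration and taking the limit $x \to \infty$.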
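The slow growth of the twin-prime sum can be seen in a quick experiment. The following sketch uses a plain sieve of Eratosthenes (not Brun's sieve) to compute $\pi_2(x)$ and the partial sums of $1/p$ over primes $p \le x$ with $p + 2$ prime; the function names and cut-offs are illustrative assumptions.

```python
def sieve(limit):
    """Return a bytearray flag table: flags[n] == 1 iff n is prime, 0 <= n <= limit."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if flags[p]:
            # Clear every multiple of p starting at p*p.
            flags[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return flags

def twin_prime_sum(x):
    """Return (pi_2(x), sum of 1/p over primes p <= x such that p + 2 is prime)."""
    flags = sieve(x + 2)
    count, total = 0, 0.0
    for p in range(2, x + 1):
        if flags[p] and flags[p + 2]:
            count += 1
            total += 1.0 / p
    return count, total

for x in (100, 10_000, 1_000_000):
    print(x, twin_prime_sum(x))
```

The partial sums increase with $x$ but flatten quickly, which is the numerical face of Theorem 2.1.5; the computation of course proves nothing about the limit.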

2.2. Brun’s Sieve

The second idea of Brun was to limit the remainder term by restricting the size of the primes making up the divisors. This simple idea results in a sieve of remarkable power which can be used to prove rather sharp bounds on $S(A, P_z, x)$. Since we are modifying the divisor sets in a non-trivial fashion we would like to have some simple conditions on the characteristic functions $\chi$ of the divisor sets, such that $\chi$ still yields good lower or upper bounds. Our first task is to find such a set of conditions. We begin with the following observation.

PROPOSITION 2.2.1.
\[
S(A, P_z, x) = \sum_{d \mid P_z} \mu(d)\chi(d)|A_d| - \sum_{1 < d \mid P_z} \sigma(d)\,S(A_d; P_z^{(d)}, x)
\]
where
\[
P_z^{(d)} = \prod_{\substack{p \in P_z \\ p \nmid d}} p.
\]

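Proof :
\[
\begin{aligned}
\sum_{d \mid P_z} \mu(d)\chi(d)|A_d|
&= \sum_{d \mid P_z} |A_d| \sum_{\delta \mid d} \mu\!\left(\frac{d}{\delta}\right)\sigma(\delta) \\
&= \sum_{\delta \mid P_z} \sigma(\delta) \sum_{t \mid P_z/\delta} \mu(t)|A_{\delta t}| \\
&= \sum_{t \mid P_z} \mu(t)|A_t| + \sum_{1 < \delta \mid P_z} \sigma(\delta) \sum_{t \mid P_z/\delta} \mu(t)|A_{\delta t}| \\
&= S(A, P_z, x) + \sum_{1 < \delta \mid P_z} \sigma(\delta) \sum_{t \mid P_z/\delta} \mu(t)|A_{\delta t}| \\
&= S(A, P_z, x) + \sum_{1 < d \mid P_z} \sigma(d)\,S(A_d; P_z^{(d)}, x),
\end{aligned}
\]
where we have used the Möbius inversion on the expression for $\sigma(d)$ as in the previous section.

We will use the above proposition to compare $\sum_{d \mid P_z} \mu(d)|A_d|$ with $\sum_{d \mid P_z} \mu(d)\chi(d)|A_d|$.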

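Now, for a prime $p$ dividing $d$,
\[
\begin{aligned}
\sigma(d) &= \sum_{l \mid d} \mu(l)\chi(l) \\
&= \sum_{l \mid d/p} \mu(l)\chi(l) + \sum_{l \mid d/p} \mu(lp)\chi(lp) \\
&= \sum_{l \mid d/p} \mu(l)\chi(l) - \sum_{l \mid d/p} \mu(l)\chi(lp) \\
&= \sum_{l \mid d/p} \mu(l)\bigl(\chi(l) - \chi(lp)\bigr).
\end{aligned}
\]
Let $q(d)$ be the smallest prime divisor of $d$. Now using the above expression we can write
\[
\begin{aligned}
\sum_{1 < d \mid P_z} \sigma(d)\,S(A_d; P_z^{(d)}, x)
&= \sum_{\delta \mid P_z} \sum_{\substack{p \mid P_z \\ p < q(\delta)}} \sigma(p\delta)\,S(A_{p\delta}; P_z^{(p\delta)}, x) \\
&= \sum_{\delta \mid P_z} \sum_{\substack{p \mid P_z \\ p < q(\delta)}} S(A_{p\delta}; P_z^{(p\delta)}, x) \sum_{l \mid \delta} \mu(l)\bigl(\chi(l) - \chi(pl)\bigr) \\
&= \sum_{l \mid P_z} \sum_{\substack{p \mid P_z \\ p < q(l)}} \mu(l)\bigl(\chi(l) - \chi(pl)\bigr) \sum_{\substack{t \mid P_z/l \\ p < q(t)}} S(A_{plt}; P_z^{(plt)}, x) \\
&= \sum_{l \mid P_z} \sum_{\substack{p \mid P_z \\ p < q(l)}} \mu(l)\bigl(\chi(l) - \chi(pl)\bigr)\,S(A_{pl}; P_p^{(pl)}, x).
\end{aligned}
\]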

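Using this in the above proposition,
\[
\begin{aligned}
S(A; P_z, x)
&= \sum_{d \mid P_z} \mu(d)\chi(d)|A_d| - \sum_{d \mid P_z} \sum_{\substack{p \mid P_z \\ p < q(d)}} \mu(d)\bigl(\chi(d) - \chi(pd)\bigr)\,S(A_{pd}; P_p^{(pd)}, x) \\
&= \sum_{d \mid P_z} \mu(d)\chi(d)|A_d| - \sum_{d \mid P_z} \sum_{\substack{p \mid P_z \\ p < q(d)}} \mu(d)\bigl(\chi(d) - \chi(pd)\bigr)\,S(A_{pd}; P_p, x)
\end{aligned}
\]
since $P_p^{(pd)} = P_p$. Suppose we have $\chi(1) = 1$ and $\chi(d) = 0$ for $d > 1$. Then
\[
S(A; P_z, x) = |A| - \sum_{\substack{p < z \\ p \in P}} S(A_p; P_p, x).
\]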
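The last identity (the special case $\chi(1) = 1$, $\chi(d) = 0$ for $d > 1$) groups the removed integers by their smallest prime factor, and it can be verified by brute force. A minimal sketch, assuming $A = \{1, \dots, x\}$ and all primes below $z$ as sifting primes; the helper names are hypothetical.

```python
def primes_below(z):
    """Primes p < z by trial division (adequate for small z)."""
    return [p for p in range(2, z)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]

def sifted_count(numbers, primes):
    """Count the elements of `numbers` divisible by none of `primes`."""
    return sum(1 for n in numbers if all(n % q for q in primes))

def buchstab_rhs(x, z):
    """|A| - sum over p < z of S(A_p; P_p, x), for A = {1, ..., x}."""
    A = range(1, x + 1)
    ps = primes_below(z)
    total = len(A)
    for p in ps:
        A_p = [n for n in A if n % p == 0]        # multiples of p
        smaller = [q for q in ps if q < p]        # the primes making up P_p
        total -= sifted_count(A_p, smaller)
    return total

x, z = 500, 20
print(sifted_count(range(1, x + 1), primes_below(z)), buchstab_rhs(x, z))
```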

Now let $\chi_1, \chi_2$ be the characteristic functions of the divisor sets that we wish to use to get upper and lower bounds respectively. If we arrange
\[
(-1)^{i-1}\mu(d)\bigl(\chi_i(d) - \chi_i(pd)\bigr) \ge 0
\]
whenever $pd \mid P_z$ and $p < q(d)$ for $i = 1, 2$, then
\[
\sum_{d \mid P_z} \mu(d)\chi_2(d)|A_d| \le S(A; P_z, x) \le \sum_{d \mid P_z} \mu(d)\chi_1(d)|A_d|.
\]
The above inequality is valid (needless to say) only if the sums involving $\chi_i$ are positive. This gives us a set of sufficient conditions for our functions $\chi_i$ to be well behaved.

If $pd \mid P_z$ and $p < q(d)$ then the conditions can be satisfied in only one of the following ways:
1. $\chi_i(d) = \chi_i(pd)$;
2. $\chi_i(d) = 1$, $\chi_i(pd) = 0$ and $\mu(d) = (-1)^{i-1}$;
3. $\chi_i(d) = 0$, $\chi_i(pd) = 1$ and $\mu(d) = (-1)^{i}$.

We can avoid the last possibility by requiring that the functions $\chi_i$ be divisor closed, i.e. that $\chi_i(d) = 1 \Rightarrow \forall \delta \mid d : \chi_i(\delta) = 1$. So the functions $\chi_i$ for $i = 1, 2$ should have the following properties:
1. If $d \mid P_z$, then either $\chi_i(d) = 0$ or $\chi_i(d) = 1$;
2. $\chi_i(1) = 1$ (this is required for the derivation in Proposition (2.2.1));
3. $\chi_i(d) = 1 \Rightarrow \forall \delta \mid d : \chi_i(\delta) = 1$;
4. $\chi_i(d) = 1$, $\mu(d) = (-1)^{i} \Rightarrow \chi_i(pd) = 1$ for all $pd \mid P_z$, where $p < q(d)$.

Suppose we restrict $\chi^{(r)}$ (which was the divisor selection function of the previous section) to also limit the number of prime factors that come from a certain interval. Suppose at most $\delta_1$ prime factors can come from the interval $z_1 < p < z$. Then the remainder term obeys

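\[
\sum_{d \mid P_z} \chi^{(r)}(d)\,|R_d| \le \left(1 + \sum_{p < z} \omega(p)\right)^{\delta_1} \left(1 + \sum_{p < z_1} \omega(p)\right)^{r - \delta_1}.
\]
This allows a more accurate estimation of the remainder term. The full Brun Sieve uses $n$ such intervals to minimize the remainder term. The first step is to compare $\sum_{d \mid P_z} \mu(d)\frac{\omega(d)}{d}$ with $\sum_{d \mid P_z} \mu(d)\chi_i(d)\frac{\omega(d)}{d}$. By writing $\chi_i(d) = 1 - (1 - \chi_i(d))$, we can split the sum
\[
\sum_{d \mid P_z} \mu(d)\chi_i(d)\frac{\omega(d)}{d}
= \sum_{d \mid P_z} \mu(d)\frac{\omega(d)}{d} - \sum_{d \mid P_z} \mu(d)\bigl(1 - \chi_i(d)\bigr)\frac{\omega(d)}{d}.
\]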

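Let $d = p_1 \cdots p_r$ with $p_1 < \cdots < p_r$; then
\[
\begin{aligned}
1 - \chi_i(d) ={}& \bigl(\chi_i(p_2 \cdots p_r) - \chi_i(p_1 \cdots p_r)\bigr) \\
&+ \bigl(\chi_i(p_3 \cdots p_r) - \chi_i(p_2 \cdots p_r)\bigr) \\
&+ \cdots \\
&+ \bigl(\chi_i(1) - \chi_i(p_r)\bigr).
\end{aligned}
\]
If we write $P(p^+, z) = \prod_{p < q < z,\ q \in P} q$ then we can write the above as:
\[
1 - \chi_i(d) = \sum_{p \mid d} \Bigl(\chi_i\bigl(\gcd(d, P(p^+, z))\bigr) - \chi_i\bigl(\gcd(d, P(p, z))\bigr)\Bigr).
\]
This gives us
\[
\sum_{d \mid P_z} \mu(d)\chi_i(d)\frac{\omega(d)}{d}
= W(z) + \sum_{d \mid P_z} \sum_{p \mid d} \mu\!\left(\frac{d}{p}\right)\Bigl(\chi_i\bigl(\gcd(d, P(p^+, z))\bigr) - \chi_i\bigl(\gcd(d, P(p, z))\bigr)\Bigr)\frac{\omega(d)}{d}.
\]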

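Let $d = \delta p t$, where $\delta \mid P_p$ and $t \mid P(p^+, z)$. Rewriting the above expression we get:
\[
\begin{aligned}
\sum_{d \mid P_z} \mu(d)\chi_i(d)\frac{\omega(d)}{d}
&= W(z) + \sum_{p < z} \frac{\omega(p)}{p} \sum_{\delta \mid P_p} \mu(\delta)\frac{\omega(\delta)}{\delta} \sum_{t \mid P(p^+, z)} \mu(t)\bigl(\chi_i(t) - \chi_i(pt)\bigr)\frac{\omega(t)}{t} \\
&= W(z) + (-1)^{i-1} \sum_{p < z} \frac{\omega(p)}{p}\,W(p) \sum_{t \mid P(p^+, z)} \chi_i(t)\bigl(1 - \chi_i(pt)\bigr)\frac{\omega(t)}{t},
\end{aligned}
\]
where we have used $\chi_i(t) - \chi_i(pt) = (-1)^{i-1}\mu(t)\chi_i(t)\bigl(1 - \chi_i(pt)\bigr)$ if $pt \mid P_z$ and $p < q(t)$. To verify this, if $\chi_i(t) = \chi_i(pt)$, then both sides are 0, and this is the case if $\chi_i(pt) = 1$ (since the $\chi_i$ are divisor closed). Now if $\chi_i(t) = 1$ and $\chi_i(pt) = 0$, then from the properties of $\chi_i$ listed above, we have that $\mu(t) = (-1)^{i-1}$, and so the relation holds. So finally we get
\[
\sum_{d \mid P_z} \mu(d)\chi_i(d)\frac{\omega(d)}{d}
= W(z)\left(1 + (-1)^{i-1} \sum_{p < z} \frac{\omega(p)}{p}\,\frac{W(p)}{W(z)} \sum_{t \mid P(p^+, z)} \chi_i(t)\bigl(1 - \chi_i(pt)\bigr)\frac{\omega(t)}{t}\right).
\]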

This identity holds in general for every combinatorial sieve with $\chi_i$ satisfying the properties listed above, provided $W(z)$ and $W(p)$ are well defined. This will happen if $g(d)$ stays bounded.

Construction of the Divisor sets: Let $r$ be a positive integer and let $z_i$ for $1 \le i \le r$ be real numbers. We will divide the interval $[2, z]$ into $r$ intervals as follows: let $2 = z_r < z_{r-1} < \cdots < z_1 < z_0 = z$. Let $d \mid P_z$ and $\beta_n = \gcd(d, P(z_n, z))$ for $1 \le n \le r$. Let us set $\chi_i(d) = 1$ if for all $1 \le n \le r$ we have $\nu(\beta_n) \le A + Cn$, where $A$ and $C$ will be picked to make $\chi_i$ an acceptable function. For the current choice $\chi_i$ is already divisor closed, so the only property we need to check is:
\[
\chi_i(t) = 1,\ \mu(t) = (-1)^{i} \ \Rightarrow\ \forall\, pt \mid P_z,\ p < q(t) : \chi_i(pt) = 1.
\]
Let $z_m \le p < z_{m-1}$. Since $\chi_i(t) = 1$ we should have $\nu(\beta_m) \le A + Cm$. If $\nu(\beta_m) < A + Cm$ then $\chi_i(pt) = 1$. Now if $\nu(\beta_m) = A + Cm$, then we also have $\mu(t) = (-1)^{i}$. By definition $\mu(t) = (-1)^{\nu(t)}$; since $\nu(t) = A + Cm$, we have $\mu(t) = (-1)^{A + Cm}$. This suggests that we set $A = B - i$. Then we have that $(-1)^{B + Cm} = 1$, or $B + Cm$ should be even. If we make $B + Cm$ an odd number, then the assumption that $\chi_i(t) = 1$ and $\mu(t) = (-1)^{i}$ results in a contradiction. Consequently, $\nu(\beta_m) = A + Cm$ cannot happen if $\chi_i(t) = 1$. For some integer $b$ we set $B = 2b - 1$ and $C = 2$. This suggests using $\nu(\beta_n) \le 2b - 1 - i + 2n$ as the condition on the number of factors of $d$ in the interval $[z_n, z)$. Summarizing, the characteristic functions of the divisor sets will be (for $i = 1, 2$)
\[
\chi_i(d) =
\begin{cases}
1 & \text{if } \forall m : 1 \le m \le r,\ \nu(\beta_m) \le 2b - i - 1 + 2m, \\
0 & \text{otherwise.}
\end{cases}
\]

The construction was such that the above function is the characteristic function of an acceptable divisor set.
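The divisor sets just constructed can be tried out on a toy case. The sketch below implements $\chi_i$ as defined above and checks that the two restricted sums bracket the exact sifted count; it assumes $A = \{1, \dots, x\}$ (so $|A_d| = \lfloor x/d \rfloor$), and the breakpoints, $b$ and $x$ are arbitrary illustrative choices rather than the optimized ones derived later.

```python
from itertools import combinations
from math import prod

def chi(i, d_primes, breakpoints, b):
    """Characteristic function chi_i of Brun's divisor set (i = 1 upper, i = 2 lower).

    d_primes:    prime factors of a squarefree d | P_z
    breakpoints: [z_1, ..., z_r] with z > z_1 > ... > z_r = 2
    chi_i(d) = 1 iff, for every m, d has at most 2b - i - 1 + 2m
    prime factors p with p >= z_m.
    """
    for m, z_m in enumerate(breakpoints, start=1):
        if sum(1 for p in d_primes if p >= z_m) > 2 * b - i - 1 + 2 * m:
            return 0
    return 1

def brun_sum(i, x, primes, breakpoints, b):
    """Sum of mu(d) * chi_i(d) * floor(x / d) over squarefree d | P_z."""
    total = 0
    for k in range(len(primes) + 1):
        for combo in combinations(primes, k):
            if chi(i, combo, breakpoints, b):
                total += (-1) ** k * (x // prod(combo))
    return total

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]   # all primes below z = 30
breakpoints = [10, 2]                            # z_1 = 10, z_2 = 2
x, b = 5000, 1

exact = sum(1 for n in range(1, x + 1) if all(n % p for p in primes))
lower = brun_sum(2, x, primes, breakpoints, b)
upper = brun_sum(1, x, primes, breakpoints, b)
print(lower, exact, upper)
```

Only a fraction of the $2^{10}$ divisors contribute to either sum, which is the point of the construction: fewer remainder terms at the cost of an inequality instead of an identity.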

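Derivation of the Sieve bounds: Now
\[
\begin{aligned}
&\sum_{1 \le n \le r} \sum_{z_n \le p < z_{n-1}} \frac{\omega(p)W(p)}{pW(z)} \sum_{t \mid P(p^+, z)} \chi_i(t)\bigl(1 - \chi_i(pt)\bigr)\frac{\omega(t)}{t} \\
&\qquad \le \sum_{1 \le n \le r} \frac{W(z_n)}{W(z)} \sum_{z_n \le p < z_{n-1}} \frac{\omega(p)}{p} \sum_{t \mid P(p^+, z)} \chi_i(t)\bigl(1 - \chi_i(pt)\bigr)\frac{\omega(t)}{t}.
\end{aligned}
\]
We have used the fact that $W(p) \le W(z_n)$ if $z_n \le p < z_{n-1}$. Now for each $t$ which makes a contribution we have $\chi_i(pt) = 0$ and $\chi_i(t) = 1$. So we must have $\nu(t) = 2b - i + 2n - 1$ for $z_n \le p < z_{n-1}$. Hence this sum is at most
\[
\sum_{1 \le n \le r} \frac{W(z_n)}{W(z)} \sum_{\substack{d \mid P(z_n, z) \\ \nu(d) = 2b - i + 2n}} \frac{\omega(d)}{d},
\]
and so
\[
\sum_{p < z} \frac{\omega(p)W(p)}{pW(z)} \sum_{t \mid P(p^+, z)} \chi_i(t)\bigl(1 - \chi_i(pt)\bigr)\frac{\omega(t)}{t}
\le \sum_{1 \le n \le r} \frac{W(z_n)}{W(z)}\,\frac{1}{(2b - i + 2n)!}\left(\sum_{z_n \le p < z} \frac{\omega(p)}{p}\right)^{2b - i + 2n}.
\]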

Now to simplify this sum further we have to make some assumptions about the function $\omega(p)$; instead of assuming $\omega(p) = O(1)$ we shall use the more general assumption:
\[
\sum_{w \le p < z} \frac{\omega(p)\ln p}{p} \le \kappa \ln\frac{z}{w} + \eta, \qquad \text{for } 2 \le w \le z. \tag{2.12}
\]
If indeed we had $\omega(p) = 1$, then we have
\[
\sum_{w \le p < z} \frac{\ln p}{p} \le \ln\frac{z}{w} + 1, \qquad \text{for } 2 \le w \le z.
\]

So the assumption we have made is an assumption on the average distribution of ω(p). Such an assumption usually holds, and is much easier to verify in more complicated situations.

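A question we can ask is: does the above assumption imply a bound for the sum
\[
\sum_{w \le p < z} \frac{\omega(p)}{p}\,?
\]
Let
\[
S(k) \equiv \sum_{w \le p < k} \frac{\omega(p)\ln p}{p}.
\]
Since $S(k) - S(k-1) = \frac{\omega(k)\ln k}{k}$ if $k$ is prime, we have
\[
\begin{aligned}
\sum_{w \le p < z} \frac{\omega(p)}{p}
&= \sum_{w \le k < z} \frac{S(k) - S(k-1)}{\ln k} \\
&= \sum_{w \le k < z-1} S(k)\left(\frac{1}{\ln k} - \frac{1}{\ln(k+1)}\right) \\
&= \sum_{w \le k < z-1} S(k)\left(\frac{\ln(k+1) - \ln k}{\ln k\,\ln(k+1)}\right).
\end{aligned}
\]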

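Now
\[
\ln(k+1) = \ln k + \ln\left(1 + \frac{1}{k}\right),
\]
and since $1 + x \le e^{x}$ we have $\ln\left(1 + \frac{1}{k}\right) \le \frac{1}{k}$. Then
\[
\sum_{w \le p < z} \frac{\omega(p)}{p} \le \sum_{w \le k < z-1} \frac{S(k)}{k \ln^2 k} \tag{2.13}
\]
\[
\le \sum_{w \le k < z-1} \frac{\kappa \ln\frac{k}{w} + \eta}{k \ln^2 k} \tag{2.14}
\]
\[
\le \kappa\left(\sum_{w \le k < z-1} \frac{1}{k \ln k} - \ln w \sum_{w \le k < z-1} \frac{1}{k \ln^2 k}\right) + \eta \sum_{w \le k < z-1} \frac{1}{k \ln^2 k} \tag{2.15}
\]
\[
\le \kappa \ln\frac{\ln z}{\ln w} + \frac{\eta}{\ln w}. \tag{2.16}
\]
Here we have used
\[
\int_{w}^{z} \frac{1}{x \ln x}\,dx = -\ln\ln w + \ln\ln z,
\]
and
\[
\int_{w}^{z} \frac{1}{x \ln^2 x}\,dx = \frac{1}{\ln w} - \frac{1}{\ln z}.
\]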
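For $\omega(p) = 1$ both the hypothesis (2.12) and the derived bound (2.16), taken with $\kappa = \eta = 1$, can be spot-checked numerically. This is only a sketch on a few hand-picked pairs $(w, z)$, which are illustrative assumptions; it does not establish the inequalities in general.

```python
from math import log

def primes_below(z):
    """Sieve of Eratosthenes returning all primes p < z."""
    flags = bytearray([1]) * max(z, 2)
    flags[0:2] = b"\x00\x00"
    for p in range(2, int(z ** 0.5) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, z, p)))
    return [n for n in range(2, z) if flags[n]]

def weighted_prime_sums(w, z):
    """Return (sum of ln p / p, sum of 1 / p) over primes w <= p < z."""
    ps = [p for p in primes_below(z) if p >= w]
    return sum(log(p) / p for p in ps), sum(1 / p for p in ps)

pairs = [(2, 100), (2, 1000), (10, 1000), (100, 10000)]
for w, z in pairs:
    log_sum, recip_sum = weighted_prime_sums(w, z)
    bound_212 = log(z / w) + 1                     # right side of (2.12)
    bound_216 = log(log(z) / log(w)) + 1 / log(w)  # right side of (2.16)
    print(w, z, round(log_sum, 4), round(bound_212, 4),
          round(recip_sum, 4), round(bound_216, 4))
```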

Now returning to our original problem we need bounds on
\[
\frac{W(z_n)}{W(z)} = \prod_{z_n \le p < z} \frac{1}{1 - \frac{\omega(p)}{p}},
\]
and so
\[
\ln\frac{W(z_n)}{W(z)} \approx \sum_{z_n \le p < z} \frac{\omega(p)}{p}.
\]
Our assumption (2.12) yields a bound on
\[
\sum_{z_n \le p < z} \frac{\omega(p)}{p}
\]
by (2.16). Thus we expect that a bound of the form
\[
\frac{W(z_n)}{W(z)} \le e^{\gamma\left(n\lambda + \frac{c}{\ln z}\right)}
\]
can be enforced with some constants $\gamma$, $\lambda$ and $c$. This can be achieved, for example, with a double-exponential fall-off of $z_n$ with respect to $z$; in fact this is what we shall do later. If a bound for $\frac{W(z_n)}{W(z)}$ of the above form exists, then this also gives us (as we might expect)
\[
\sum_{z_n \le p < z} \frac{\omega(p)}{p} \le \sum_{z_n \le p < z} \ln\frac{1}{1 - \frac{\omega(p)}{p}} \le \ln\frac{W(z_n)}{W(z)} < \gamma\left(n\lambda + \frac{c}{\ln z}\right).
\]

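Let $f = \frac{c}{\ln z}$, and suppose we can enforce $\gamma = 2$ (this helps in the simplification to follow). Then
\[
\begin{aligned}
\sum_{1 \le n \le r} \frac{W(z_n)}{W(z)} \sum_{\substack{d \mid P(z_n, z) \\ \nu(d) = 2b - i + 2n}} \frac{\omega(d)}{d}
&\le \sum_{1 \le n \le r} e^{2n\lambda + 2f}\,\frac{(2n\lambda + 2f)^{2b - i + 2n}}{(2b - i + 2n)!} \\
&\le \sum_{1 \le n \le r} e^{2f} e^{2\lambda n}\,\frac{(2n)^{2b - i + 2n}}{(2n)!\,(2n)^{2b - i}}\,\lambda^{2b - i + 2n}\left(1 + \frac{f}{n\lambda}\right)^{2b - i + 2n}
\qquad \text{(since } (2b - i + 2n)! \ge (2n)!\,(2n)^{2b - i}\text{)} \\
&= \sum_{1 \le n \le r} e^{2f}\left(\lambda e^{\lambda}\right)^{2n}\frac{(2ne^{-1})^{2n} e^{2n}}{(2n)!}\,\lambda^{2b - i}\left(1 + \frac{f}{n\lambda}\right)^{2b - i}\left(1 + \frac{f}{n\lambda}\right)^{2n} \\
&\le e^{2f}(\lambda + f)^{2b - i} \sum_{1 \le n \le r} \frac{(2ne^{-1})^{2n}}{(2n)!}\left(\lambda e^{1+\lambda}\right)^{2n}\left(1 + \frac{f}{n\lambda}\right)^{2n}.
\end{aligned}
\]
Since $\frac{(ne^{-1})^{n}}{n!}$ is decreasing, and $\left(1 + \frac{f}{n\lambda}\right)^{2n} \le e^{2f/\lambda}$, and also assuming $\lambda e^{1+\lambda} \le 1$, we get
\[
\begin{aligned}
\sum_{1 \le n \le r} \frac{W(z_n)}{W(z)} \sum_{\substack{d \mid P(z_n, z) \\ \nu(d) = 2b - i + 2n}} \frac{\omega(d)}{d}
&\le e^{2f}(\lambda + f)^{2b - i}\,2e^{-2}\,e^{2f/\lambda} \sum_{1 \le n} \left(\lambda e^{1+\lambda}\right)^{2n} \\
&= \frac{2\lambda^{2b - i + 2} e^{2\lambda}}{1 - \left(\lambda e^{1+\lambda}\right)^{2}}\left(1 + \frac{f}{\lambda}\right)^{2b - i} e^{2f\left(1 + \frac{1}{\lambda}\right)} \\
&\le \frac{2\lambda^{2b - i + 2} e^{2\lambda}}{1 - \left(\lambda e^{1+\lambda}\right)^{2}}\,e^{(2b - i + 4)\frac{f}{\lambda}}.
\end{aligned}
\]
Thus
\[
\sum_{d \mid P_z} \mu(d)\chi_i(d)\frac{\omega(d)}{d}
= W(z)\left(1 + 2\theta\,\frac{\lambda^{2b - i + 2} e^{2\lambda}}{1 - \left(\lambda e^{1+\lambda}\right)^{2}}\,e^{(2b - i + 4)\frac{f}{\lambda}}\right) \qquad \text{for } i = 1, 2.
\]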

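Now we have to bound the remainder term, which is significantly easier. Let us assume that $\omega(p) \le A$ for some constant $A > 0$. Then
\[
\begin{aligned}
\sum_{d \mid P_z} \chi_i(d)\,|R_d| \le \sum_{d \mid P_z} \chi_i(d)\,\omega(d)
&\le \left(1 + \sum_{p < z} \omega(p)\right)^{2b - i + 1} \prod_{1 \le n \le r-1} \left(1 + \sum_{p < z_n} \omega(p)\right)^{2} \\
&\le \bigl(1 + A(2\,\mathrm{li}\,z + 3)\bigr)^{2b - i + 1} \prod_{1 \le n \le r-1} \bigl(1 + A(2\,\mathrm{li}\,z_n + 3)\bigr)^{2}
\end{aligned}
\]
for $i = 1, 2$.

Selection of the intervals: We select the numbers $z_n$ with an exponential fall-off in the logarithm. Let $\Lambda > 0$ be a real number. Define
\[
\ln z_n = e^{-n\Lambda} \ln z \qquad \text{for } n = 1, \dots, r-1;
\]
and set $z_r = 2$. Here $r$ is selected such that
\[
\ln z_{r-1} = e^{-(r-1)\Lambda} \ln z > \ln 2,
\]
and
\[
e^{-r\Lambda} \ln z \le \ln 2,
\]
so we have
\[
e^{(r-1)\Lambda} < \frac{\ln z}{\ln 2} \le e^{r\Lambda}.
\]
Thus for a suitable constant $B$ the remainder term becomes
\[
\sum_{d \mid P_z} \chi_i(d)\,|R_d| \le \left(\frac{Bz}{\ln z}\right)^{2b - i + 1} \prod_{1 \le n < r} \left(\frac{B z_n e^{n\Lambda}}{\ln z}\right)^{2}
= \left(\frac{Bz}{\ln z}\right)^{2b - i + 1} \left(\frac{B e^{\frac{1}{2} r\Lambda}}{\ln z}\right)^{2(r-1)} \prod_{1 \le n \le r-1} z_n^{2}.
\]
Now
\[
\frac{B e^{\frac{1}{2} r\Lambda}}{\ln z} \le \frac{B e^{\Lambda/2}}{\ln z}\sqrt{\frac{\ln z}{\ln 2}} < 1,
\]
and also
\[
\prod_{1 \le n \le r-1} z_n^{2} = \exp\left(2 \ln z \sum_{1 \le n \le r-1} e^{-n\Lambda}\right) \le z^{\frac{2}{e^{\Lambda} - 1}}.
\]
Thus
\[
\sum_{d \mid P_z} \chi_i(d)\,|R_d| = O\left(z^{2b - i + 1 + \frac{2}{e^{\Lambda} - 1}}\right) \qquad \text{for } i = 1, 2.
\]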

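We still have to check that $\frac{W(z_n)}{W(z)} \le e^{2(n\lambda + f)}$. By our assumptions about the sum $\sum_{w \le p < z} \frac{\omega(p)\ln p}{p}$ we have
\[
\frac{W(z_n)}{W(z)} \le \exp\left(n\Lambda\kappa + \frac{2c\,e^{n\Lambda}}{\ln z}\right)
= e^{2c/\ln z} \exp\left(n\left(\Lambda\kappa + \frac{2c}{\ln z}\cdot\frac{e^{n\Lambda} - 1}{n}\right)\right), \qquad n = 1, \dots, r.
\]
If $1 \le \frac{1}{1 - \frac{\omega(p)}{p}} \le A$, then
\[
c = \frac{\eta}{2}\left(1 + A\kappa + \frac{\eta A}{\ln 2}\right).
\]
Since $\Lambda > 0$ we have
\[
\frac{e^{n\Lambda} - 1}{n} \le \frac{e^{r\Lambda} - 1}{r},
\]
and this is at most
\[
\Lambda\,\frac{e^{r\Lambda}}{r\Lambda} \le \Lambda\,\frac{e^{\Lambda}}{\ln 2}\cdot\frac{\ln z}{\ln(\ln z/\ln 2)}.
\]
So we get
\[
\frac{W(z_n)}{W(z)} \le e^{2c/\ln z} \exp\left(n\Lambda\kappa\left(1 + \frac{2c\,e^{\Lambda}}{\kappa \ln 2}\cdot\frac{1}{\ln(\ln z/\ln 2)}\right)\right) \qquad \text{for } n = 1, \dots, r.
\]
To meet our conditions on $\frac{W(z_n)}{W(z)}$ we take
\[
\Lambda = \frac{2\lambda}{\kappa}\cdot\frac{1}{1 + \varepsilon}, \qquad \varepsilon = \frac{1}{\delta}\,e^{-\frac{1}{\kappa}},
\]
and so
\[
e^{\frac{2\lambda}{\kappa}} - e^{\Lambda} \le \left(\frac{2\lambda}{\kappa} - \Lambda\right) e^{\frac{2\lambda}{\kappa}} \le \varepsilon\Lambda e^{\frac{1}{\kappa}}.
\]
Since $e^{\Lambda} - 1 \ge \Lambda$ we have
\[
\frac{e^{\frac{2\lambda}{\kappa}} - 1}{e^{\Lambda} - 1} \le 1 + \frac{\varepsilon\Lambda e^{\frac{1}{\kappa}}}{e^{\Lambda} - 1} \le 1 + \varepsilon e^{\frac{1}{\kappa}} = 1 + \frac{1}{\delta}.
\]
With $\xi = 1 + \frac{1}{\delta}$ we obtain
\[
\sum_{d \mid P_z} \chi_i(d)\,|R_d| = O\left(z^{2b - i + 1 + \frac{2\xi}{e^{2\lambda/\kappa} - 1}}\right) \qquad \text{for } i = 1, 2.
\]
Thus we have proved the following theorem.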

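THEOREM 2.2.2. Assume that
\[
1 \le \frac{1}{1 - \frac{\omega(p)}{p}} \le A,
\]
\[
\sum_{w \le p < z} \frac{\omega(p)\ln p}{p} \le \kappa \ln\frac{z}{w} + \eta \qquad \text{for } 2 \le w \le z,
\]
and
\[
|R_d| \le \omega(d).
\]
Let $\lambda$ be such that $0 < \lambda e^{1+\lambda} < 1$. Then
\[
S(A; P_z, x) \le xW(z)\left(1 + 2\,\frac{\lambda^{2b+1} e^{2\lambda}}{1 - \left(\lambda e^{1+\lambda}\right)^{2}}\exp\left((2b+3)\frac{c}{\lambda \ln z}\right)\right) + O\left(z^{2b + \frac{2\xi}{e^{2\lambda/\kappa} - 1}}\right), \tag{2.17}
\]
and
\[
S(A; P_z, x) \ge xW(z)\left(1 - 2\,\frac{\lambda^{2b} e^{2\lambda}}{1 - \left(\lambda e^{1+\lambda}\right)^{2}}\exp\left((2b+2)\frac{c}{\lambda \ln z}\right)\right) + O\left(z^{2b - 1 + \frac{2\xi}{e^{2\lambda/\kappa} - 1}}\right), \tag{2.18}
\]
where
\[
c = \frac{\eta}{2}\left(1 + A\kappa + \frac{\eta A}{\ln 2}\right)
\]
and $\xi = 1 + \varepsilon$ for $0 < \varepsilon < 1$.

Application to the Twin Primes problem : We set $A = \{n(n+2) \mid n \le x\}$. In this case we have $\omega(2) = 1$ and $\omega(p) = 2$ for $p > 2$. Further, all the conditions of Theorem (2.2.2) hold, and the lower bound is seen to be positive. Thus (2.18) tends to infinity with $x$ ([HR74], p.63) for $z = x^{1/u}$ with $u < 8$. This implies that every prime divisor of a number in the sifted set is $\ge x^{1/u}$, so each number in the sifted set can have at most $u < 8$ prime factors¹. Thus we have the following theorem.

THEOREM 2.2.3. There are infinitely many $n$ such that $\nu(n(n+2)) \le 7$.

We will look at some interesting applications of Brun’s sieve in the following sections.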

2.3. Orthogonal Latin Squares and the Euler Conjecture

DEFINITION 2.3.1. A Latin square of order $n$ is an $n \times n$ matrix with entries in $S = \{1, \dots, n\}$ such that every row and column is a permutation of the set $S$.

DEFINITION 2.3.2. Two Latin squares $A$ and $B$ of order $n$ are said to be mutually orthogonal if the $n^2$ pairs $(a_{ij}, b_{ij})$ are distinct.
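Both definitions translate directly into checks. The sketch below also builds the squares $L_a(i, j) = (a i + j) \bmod n$ for a prime $n$, a standard construction that is not taken from the text, and confirms that they form $n - 1$ mutually orthogonal Latin squares; all function names are illustrative.

```python
def is_latin(square):
    """True iff every row and every column is a permutation of {1, ..., n}."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok

def are_orthogonal(a, b):
    """True iff the n^2 pairs (a[i][j], b[i][j]) are pairwise distinct."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n

def linear_square(n, a):
    """Square with entries (a*i + j) mod n, shifted to symbols 1..n (n prime, 1 <= a < n)."""
    return [[(a * i + j) % n + 1 for j in range(n)] for i in range(n)]

n = 5
squares = [linear_square(n, a) for a in range(1, n)]
print(all(is_latin(s) for s in squares))
print(all(are_orthogonal(squares[u], squares[v])
          for u in range(len(squares)) for v in range(u + 1, len(squares))))
```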

Here is a Latin square of order 3:
\[
A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 3 & 1 & 2 \end{pmatrix},
\]
and here is a Latin square that is orthogonal to it:
\[
B = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \\ 2 & 3 & 1 \end{pmatrix}.
\]
Euler conjectured that there are no mutually orthogonal Latin squares of order $n$, where $n \equiv 2 \bmod 4$. The conjecture was disproved for the case $n = 10$, and later Bose, Parker and Shrikhande [BPS60] showed that for every $n > 6$ the conjecture was false. Let $\perp(n)$ be the maximum number of mutually orthogonal Latin squares of order $n$. Chowla, Erdős and Straus [CES60], building on this and some previous results, established that $\perp(n) > \frac{1}{3} n^{\frac{1}{91}}$ for large enough $n$. The proof involves an interesting use of the Brun Sieve, and we shall give an account of this. The exponent $\frac{1}{91}$ is far from optimal and has been subsequently improved. The starting point for the proof is the following pair of results:

THEOREM 2.3.3. [BPS60] If $k \le \perp(m) + 1$ and $1 < u < m$ then
\[
\perp(km + u) \ge \min\{\perp(k),\ \perp(k+1),\ \perp(m) + 1,\ \perp(u) + 1\} - 1.
\]

THEOREM 2.3.4 (MacNeish).
1. $\perp(ab) \ge \min\{\perp(a), \perp(b)\}$;
2. $\perp(q) = q - 1$ if $q$ is a power of a prime.

First we shall prove the following:

THEOREM 2.3.5.
\[
\lim_{n \to \infty} \perp(n) = \infty.
\]

¹ For a similar derivation see Theorem (2.3.6).

Proof : The idea is to have a lower bound on each of the quantities involved in Theorem (2.3.3), and then use the theorem with $km + u = n$.

Let $x$ be a large positive integer. If
\[
k + 1 = \prod_{p \le x} p^{x},
\]
then by Theorem (2.3.4) we have $\perp(k+1) \ge 2^{x} - 1 \ge x$. Also since $k \equiv -1 \bmod p$ for $p \le x$, all the prime factors of $k$ are larger than $x$, so applying Theorem (2.3.4) again we have $\perp(k) \ge x$. Now we select $m$ in two pieces $m_1$ and $m_2$. The first piece is set to be
\[
m_1 = k^{k} \prod_{\substack{q \nmid n \\ q \le x}} q^{k}.
\]
Note that $m_1$ is bounded in terms of $x$ alone. Now if $n$ is large enough the interval $\left(\frac{n}{(k+1)m_1}, \frac{n-1}{km_1}\right)$ contains an integer $m_2$ such that $m_2 \equiv 1 \bmod k!$, simply because the length of the interval becomes larger than $k!$. Now set $m = m_1 m_2$; then $\perp(m) \ge \min\{\perp(m_1), \perp(m_2)\} \ge \min\{2^{k} - 1, k\} \ge k$. Thus we have $\perp(m) + 1 \ge k$ to satisfy the condition of Theorem (2.3.3). Set $u = n - km$; we need to bound $\perp(u)$, but first we need to verify that $1 < u < m$. We have
\[
\frac{n}{(k+1)m_1} < m_2 < \frac{n-1}{km_1} \qquad \text{or} \qquad \frac{n}{k+1} < m < \frac{n-1}{k}.
\]
This yields $km + 1 < n$ and $km + m > n$, which implies that $1 < u < m$. Let $p \le x$; then $km \not\equiv n \bmod p$. This is because $k$ has prime factors only above $x$, $m_1$ has a small prime factor only if it does not divide $n$, and $m_2$ has prime factors only above $k \ge x$. Thus $u = n - km \not\equiv 0 \bmod p$ for $p \le x$, and so no prime smaller than $x$ divides $u$. Thus we get $\perp(u) \ge x$. Now applying Theorem (2.3.3) we have $\perp(km + u) = \perp(n) \ge x$.

Note that this has already disproved Euler’s conjecture. It is clear that our method of proof relied on our ability to produce some numbers with large prime factors and some congruence properties; this indicates that a sieve argument might help. The necessary machinery from sieves is encapsulated in the following theorem:

THEOREM 2.3.6. [Rad24] Let $p_1, \dots, p_r$ be primes, and let $a_i < p_i$, $b_i < p_i$ be non-negative integers for $1 \le i \le r$. Let $D > 1$ be an integer with $\gcd(D, p_i) = 1$ for each $i$, $1 \le i \le r$, and let $\Lambda$ be an integer, $0 < \Lambda < D$, such that $\gcd(\Lambda, D) = 1$. Let
\[
P(D, x; p_1, a_1, b_1; p_2, a_2, b_2; \cdots; p_r, a_r, b_r) =
\bigl|\{n \le x \mid n \equiv \Lambda \ (\mathrm{mod}\ D),\ (\forall i : 1 \le i \le r) : n \not\equiv a_i \ (\mathrm{mod}\ p_i),\ n \not\equiv b_i \ (\mathrm{mod}\ p_i)\}\bigr|.
\]
If $p_1 < p_2 < \cdots < p_r$ and $p_i > 2$, then
\[
P(D, x; p_1, a_1, b_1; \cdots; p_r, a_r, b_r) > \frac{Cx}{D \ln^2 p_r} - C' p_r^{7.938},
\]
where $C$ and $C'$ are positive constants.

REMARK 2.3.7. The original theorem has 7.9 instead of our slightly worse 7.938, but this can be improved using a more detailed analysis of our proof.

Proof : The quantity $S(A; P_z, x)$ is the number of integers in $A$ that are $\not\equiv 0$ modulo $p_i$ for each $p_i \in P$, $p_i \le z$. In this case we have two constraints for each prime $p_i$. But we can collapse these two constraints into one as follows. The constraint for the prime $p_i$ is that $n \not\equiv a_i$, $n \not\equiv b_i$ modulo $p_i$. So the constraint fails iff $(n - a_i)(n - b_i) \equiv 0 \bmod p_i$. Let $A = \{n \le x \mid n \equiv \Lambda \bmod D\}$, $A_{p_i} = \{n \in A \mid (n - a_i)(n - b_i) \equiv 0 \bmod p_i\}$, and if $d = p_{i_1} \cdots p_{i_k}$ then $A_d = \{n \in A \mid \prod_{1 \le j \le k} (n - a_{i_j})(n - b_{i_j}) \equiv 0 \bmod d\}$. Suppose $|A_{p_i}| = \frac{\omega(p_i)}{p_i}\,x + R_{p_i}$; then we see that if $d$ is squarefree then $|A_d| = \frac{\omega(d)}{d}\,x + R_d$, where $\omega(d)$ is defined multiplicatively. Thus we are interested in the estimate:
\[
P(D, x; p_1, a_1, b_1; \cdots; p_r, a_r, b_r) = |A| - \sum |A_{p_i}| + \sum |A_{p_i p_j}| - \cdots = \sum_{d \mid p_1 \cdots p_r} \mu(d)|A_d|,
\]
which is just the sieve estimate.

The congruence $(n - a_i)(n - b_i) \equiv 0 \bmod p_i$ has at most 2 solutions modulo $p_i$, so $\omega(p_i) = 2$ for each $i$. We will try to apply Brun’s Sieve to this problem.

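We just need to verify that the conditions of the proof of Theorem (2.2.2) are valid. First $\frac{1}{1 - \frac{\omega(p)}{p}} \le 3$ so $A = 3$. Next
\[
\sum_{\substack{w \le p < z \\ p \in \{p_1, \dots, p_r\}}} \frac{\omega(p)\ln p}{p} \le 2 \sum_{w \le p < z} \frac{\ln p}{p} \le 2\left(\ln\frac{z}{w} + 1\right),
\]
from which we have $\kappa = 2$ and $\eta = 2$. $R_d \le \omega(d)$ also holds. Thus by the lower bound we have (with $b = 1$):
\[
S(A; P = \{p_1, \dots, p_r\}, z) \ge |A|\,W(z)\left(1 - 2\,\frac{(\lambda e^{\lambda})^{2}}{1 - (\lambda e^{1+\lambda})^{2}}\exp\left(\frac{4c}{\lambda \ln z}\right)\right) + O\left(z^{1 + \frac{2\xi}{e^{\lambda} - 1}}\right).
\]
So all we need to show is that there is a $\lambda$ such that
\[
1 + \frac{2 + 2\varepsilon}{e^{\lambda} - 1} \le u \le 7.938
\]
and
\[
1 - \frac{2(\lambda e^{\lambda})^{2}}{1 - (\lambda e^{1+\lambda})^{2}} > 0.
\]
Then the second condition implies
\[
\lambda e^{\lambda} < \frac{1}{\sqrt{2 + e^{2}}} \approx 0.3263540699\cdots
\]
and the first implies
\[
\frac{2 + 2\xi}{6.938} + 1 \le e^{\lambda}.
\]
Now set $\xi = \frac{10}{9}$, so we must have $\lambda \ge \log 1.288267513692707$. This value of $\lambda$ also satisfies the other constraint. Now we take $z = p_r$, and using $|A| = \frac{x}{D} + \theta$, $|\theta| < 1$,
\[
S(A; P_{p_r}, x) \ge \frac{Cx}{D} \prod_{1 \le i \le r}\left(1 - \frac{2}{p_i}\right) + O\left(p_r^{7.938}\right),
\]
and also
\[
\prod_{i}\left(1 - \frac{2}{p_i}\right) \ge \prod_{2 < p \le p_r}\left(1 - \frac{2}{p}\right).
\]
Now in
\[
\ln \prod_{2 < p \le p_r}\left(1 - \frac{2}{p}\right) = -2 \sum_{2 < p \le p_r} \frac{1}{p} - \sum_{2 < p \le p_r} \sum_{m > 1} \frac{2^{m}}{m p^{m}}
\]
the second sum converges, so we have
\[
\prod_{2 < p \le p_r}\left(1 - \frac{2}{p}\right) = \frac{1}{\ln^2 p_r} + o\left(\frac{1}{\ln^2 p_r}\right).
\]
The theorem follows.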

Now we have the following simple lemma:

LEMMA 2.3.8. For all $c$, $0 < c < 1$, the number of integers $y \le x$ that are divisible by a prime factor $p > n^{c}$ of $n$ is at most $\frac{x}{c\,n^{c}}$.

Proof : At most $\frac{x}{p}$ integers $y \le x$ are divisible by $p$ and so the total number of such integers is at most
\[
\sum_{\substack{p \mid n \\ p > n^{c}}} \frac{x}{p} \le \frac{x}{n^{c}} \sum_{\substack{p \mid n \\ p > n^{c}}} 1 \le \frac{x}{c\,n^{c}}.
\]
The last part follows because there can be at most $1/c$ prime factors of a number $n$ that are greater than $n^{c}$.
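Lemma 2.3.8 can be illustrated with a direct count. In this sketch the values of `n`, `x` and `c` are arbitrary choices made so that `n` has exactly two prime factors above $n^{c}$, and the helper names are hypothetical.

```python
def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def count_with_large_common_factor(x, n, c):
    """Number of y <= x divisible by some prime factor p > n**c of n."""
    large = [p for p in prime_factors(n) if p > n ** c]
    return sum(1 for y in range(1, x + 1) if any(y % p == 0 for p in large))

n = 2 * 3 * 101 * 103          # prime factors 2, 3, 101, 103
x, c = 10_000, 0.3             # n**c is about 27.4, so 101 and 103 are "large"
count = count_with_large_common_factor(x, n, c)
bound = x / (c * n ** c)
print(count, round(bound, 1))
```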

THEOREM 2.3.9. [CES60] There is an $n_0 > 0$ such that for all $n > n_0$, $\perp(n) > \frac{1}{3} n^{\frac{1}{91}}$.

Proof : The idea as before is to apply Theorem (2.3.3) to suitable $k$, $m$ and $u$ for a given $n$ such that $n = km + u$. For this to yield a lower bound on $\perp(n)$ we need lower bounds on $\perp(k)$, $\perp(k+1)$, $\perp(m)$ and $\perp(u)$. We begin with the selection of $k$: we need $k$ as well as $k+1$ to have no small prime factors. This is exactly the sort of problem handled by the theorem we have just proved. It turns out that the constraints on $k$ depend on the parity of $n$.

Case 1. ($n$ even). Consider the constraints:
\[
k \equiv -1 \bmod 2^{\lfloor \frac{1}{91} \lg n \rfloor},
\]
\[
k \not\equiv 0 \text{ or } -1 \bmod p \quad \text{for } p \le n^{\frac{1}{90}},
\]
and
\[
k < n^{\frac{1}{10}}.
\]
The first congruence restricts $k$ to lie in an arithmetic progression with difference $2^{\lfloor \frac{1}{91} \lg n \rfloor} < c_1 n^{\frac{1}{91}}$. The second incongruence implies that both $k$ and $k + 1$ are free of small prime factors, apart from the large power of 2 dividing $k + 1$. Now applying Theorem (2.3.6) there are at least
\[
\frac{C n^{\frac{1}{10}}}{c_1 \left(\frac{1}{90}\right)^{2} n^{\frac{1}{91}} \log^2 n} - C' n^{\frac{79.38}{10}\cdot\frac{1}{90}}
= c_2\,\frac{n^{\frac{81}{910}}}{\log^2 n} - C' n^{\frac{79.38}{900}}
> c_3\,\frac{n^{\frac{81}{910}}}{\log^2 n}
\]
values of $k$ satisfying the constraints. By Lemma (2.3.8) the number of integers below $n^{\frac{1}{10}}$ that have a prime factor greater than $n^{\frac{1}{90}}$ in common with $n$ is at most $90\,n^{\frac{8}{90}}$. Thus from the bound for the values of $k$ we have that there is a $k$ such that $\gcd(k, n) = 1$. Just by our selection of $k$ we have that $k$ has no small prime factors, and though $k + 1$ has 2 as a prime factor we still have that $k + 1 \equiv 0 \bmod 2^{\lfloor \frac{1}{91} \lg n \rfloor}$ and all the other factors are bigger than $n^{\frac{1}{90}}$, so using Theorem (2.3.4)
\[
\perp(k) > n^{\frac{1}{90}} - 1 > \frac{1}{3}\,n^{\frac{1}{91}},
\]
\[
\perp(k+1) > \min\left\{\frac{1}{2}\,n^{\frac{1}{91}},\ n^{\frac{1}{90}}\right\} - 1 > \frac{1}{3}\,n^{\frac{1}{91}}.
\]
We now set $n = n_1 + n_2 k$ where $0 < n_1 < k$. We cannot directly use $n_1$ and $n_2$ in our application of Theorem (2.3.3), since we have no bounds for $\perp(n_1)$ and $\perp(n_2)$. Though we have freedom in our choice of $m$ we are still forced to pick $k$ as our quotient in the division of $n$ by $m$ to write $n = km + u$. This suggests picking a $u$ subject to certain conditions and then setting $m = \frac{n - u}{k}$. Again this immediately restricts us to look at numbers that are congruent to $n_1$ modulo $k$. Let $u = n_1 + u_1 k$ where $u_1$ is picked according to the following conditions:
\[
u_1 \not\equiv n_1 \bmod 2,
\]
\[
u_1 \not\equiv -\frac{n_1}{k} \bmod p, \quad p \nmid k,
\]
\[
u_1 \not\equiv n_2 \bmod p, \quad \text{for } 3 \le p \le k,
\]
and
\[
u_1 < n^{\frac{159}{200}}.
\]
The first incongruence forces $u_1 k$ to be of opposite parity from $n_1$ and always fixes $u$ to be odd. In this setup we will set $m = \frac{n - u}{k} = n_2 - u_1$. We want $m$ to be free of small prime factors to guarantee a good lower bound for $\perp(m)$ and this is taken care of by the third incongruence. Meanwhile, the second incongruence arranges for $u$ itself to have no small prime factors. The limit on $u_1$ is forced on us because of the limitations of Theorem (2.3.6).

The restrictions of the incongruences modulo the primes 2, 3, and 5 can be handled by restricting $u_1$ to belong to an arithmetic progression with difference 30. To apply Theorem (2.3.6) we need $\gcd(u_1, 30) = 1$. If we had $\gcd(u_1, 30) > 1$, then we can set $u_1' = \frac{u_1}{\gcd(u_1, 30)}$ and apply Theorem (2.3.6). Thus there are at least
\[
\frac{C n^{\frac{159}{200}}}{30 \log^2 k} - C' k^{\frac{79.38}{10}} > c_4\,\frac{n^{\frac{159}{200}}}{\log^2 n} - C' n^{\frac{79.38}{100}} > 0
\]
choices for $u_1$, if $n$ is large enough. Now $u$ is not divisible by any prime $p \le k$. First suppose that $p \nmid k$; then this contradicts the incongruence $n_1 \not\equiv -u_1 k \bmod p$. Next, if $p \mid k$, then $p \mid n_1$, which implies $p \mid n$, a contradiction to $\gcd(k, n) = 1$. Thus $\perp(u) \ge k$, but $k$ is not divisible by any prime $\le n^{\frac{1}{90}}$, so $\perp(u) \ge n^{\frac{1}{90}} > \frac{1}{3} n^{\frac{1}{91}}$. Now as promised we set $m = \frac{n - u}{k}$. We need to verify that $m > u > 1$ to apply Theorem (2.3.3), and observe that
\[
m > \frac{n}{n^{\frac{1}{10}}} - \left(1 + n^{\frac{159}{200}}\right) > \frac{1}{2}\,n^{\frac{9}{10}} > n^{\frac{1}{10}}\left(1 + n^{\frac{159}{200}}\right) > u > 1,
\]
for large enough $n$. Furthermore, all prime factors of $m$ exceed $k$ by our choice of $u$, and hence:
\[
\perp(m) \ge k > \frac{1}{3}\,n^{\frac{1}{91}}.
\]
Finally putting all these together and applying Theorem (2.3.3) we get $\perp(n) > \frac{1}{3} n^{\frac{1}{91}}$ for large enough even numbers $n$.

Case 2. ($n$ odd). We apply Theorem (2.3.6) to $k + 1$ instead with the following constraints:
\[
k + 1 \equiv 1 \bmod 2^{\lfloor \frac{1}{91} \lg n \rfloor},
\]
\[
k + 1 \not\equiv 0 \text{ or } 1 \bmod p, \quad p \le n^{\frac{1}{90}},
\]
\[
k + 1 \le n^{\frac{1}{10}}.
\]
Now the argument proceeds with the role of $k$ and $k + 1$ interchanged, and the second set of constraints becomes:
\[
u_1 \not\equiv n_2 \bmod 2,
\]
\[
u_1 \not\equiv -\frac{n_1}{k} \bmod p, \quad p \le k,\ p \nmid k,
\]
\[
u_1 \not\equiv n_2 \bmod p, \quad p \le k,
\]
and
\[
u_1 < n^{\frac{159}{200}}.
\]
So here both $n$ and $m$ are odd. The argument then proceeds similarly.

Better estimates for $\perp(n)$ are known; for example, in [Wil74] a bound $\perp(n) \ge n^{\frac{1}{17}} - 2$ is proved (for large enough $n$).

The current best estimate seems to be $\perp(n) \ge n^{\frac{1}{14.8}}$ [Be83].

2.4. A Theorem of Schinzel

In this section we will give an application involving a variation of Theorem 2.3.6, where we look at some constant number of constraints. The proof is an interesting use of Brun’s sieve.

THEOREM 2.4.1. [Sch66] For all positive integers $h$ and $N \ge 3$ there is an integer $D$ such that:
1. $1 \le D \le (\log N)^{20h}$;
2. $\gcd(iD + 1, N) = 1$, for $1 \le i \le h$.

Proof : For $h = 1$ we can take $D = q - 1$, where $q$ is the least prime not dividing $N$. Since $\sum_{p \le D} \log p \le \log N$, we have from [RS62] Theorem 10 that either $D \le 100$ or $0.84D \le \log N$. Since $D \le N$ we have $D \le (\log N)^{20}$, for all $N \ge 3$. If $N \le (\log N)^{20h}$, then $D = N$ satisfies the conditions of the theorem, so we can assume $N > (\log N)^{20h}$, with $h \ge 2$. Now $N > (\log N)^{20h} \Rightarrow \log N > 20h \log\log N$. If $\log N < 110h$, then $N < e^{110h}$ and
\[
(\log N)^{20h} \ge (110h)^{20h} = e^{20h \log 110h} \ge e^{(\log 110 + \log h)20h} = e^{94.0096h + 20h \log h} \ge e^{114.0096h},
\]
which is a contradiction to $N > (\log N)^{20h}$. Hence we must have $\log N \ge 110h$, and $\log\log N \ge \log 110 + \log h \ge 5.3936$, or $\log\log N > 5$.
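For small inputs the statement of Theorem 2.4.1 can be checked by exhaustive search. The sketch below looks for the least admissible $D$ and compares it with the bound $(\log N)^{20h}$, which is enormously generous at this scale; the function name and the choice of `N` and `h` are illustrative, and the search is not the sieve argument of the proof.

```python
from math import gcd, log

def least_schinzel_D(N, h):
    """Smallest D >= 1 with gcd(i*D + 1, N) == 1 for every 1 <= i <= h."""
    D = 1
    while not all(gcd(i * D + 1, N) == 1 for i in range(1, h + 1)):
        D += 1
    return D

N = 2 * 3 * 5 * 7 * 11 * 13    # 30030
h = 3
D = least_schinzel_D(N, h)
bound = log(N) ** (20 * h)
print(D, [i * D + 1 for i in range(1, h + 1)], bound)
```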

Let $H = \prod_{p \le 10h} p$, and let $p_1, \dots, p_r$ be the primes $p_i > 10h$ such that $p_i \mid N$. Let $p_1 < p_2 < \cdots < p_r$. Let $P(H, x; p_1, \dots, p_r)$ be the number of integers $n \le x$ such that $n \equiv 0 \bmod H$, and
\[
(\forall i\ \forall j) : 1 \le i \le h,\ 1 \le j \le r : in + 1 \not\equiv 0 \bmod p_j.
\]
Since $p_j > 10h$ for all the primes in the incongruences, $i$ is invertible. Thus, the above constraints are equivalent to a system of $h$ incongruences per prime (we had 2 such constraints in Theorem 2.3.6). Thus we have a system of incongruences: $n \not\equiv a_{ij} \bmod p_j$, for some $a_{ij}$.

Here we are in a special situation of the Sieve problem. The number of primes with respect to which we sift the sequence is very small, namely we sift only by the prime factors of $N$, of which there can be at most $\log N$. Hence we shall redo the analysis of the Brun sieve and thereby get a better estimate. Let $A = \{n \le x \mid n \equiv 0 \bmod H\}$, $P = \prod_{1 \le i \le r} p_i$ and let
\[
A_{p_j} = \Bigl\{n \in A \ \Bigm|\ \prod_{1 \le i \le h} (n - a_{ij}) \equiv 0 \bmod p_j\Bigr\}.
\]
We extend the notation to $A_d$ for $d$ a divisor of $P$. We have that
\[
P(H; x; p_1, \dots, p_r) = \sum_{d \mid P} \mu(d)|A_d|.
\]
For $|A_{p_j}|$ we have the approximation $\frac{hx}{Hp_j}$, so we can select $\omega(p_j) = h$, and $R_{p_j} \le h$ since for each congruence there is an error of at most 1 in the approximation. The denominator $H$ can be taken out of our analysis if we set $x \leftarrow \frac{x}{H}$. We also have that $R_d \le \omega(d)$. Hence
\[
W(k) = \prod_{1 \le i \le k}\left(1 - \frac{h}{p_i}\right).
\]

From our earlier work in Section 2.2, we have
\[
P(H; x, p_1, \dots, p_r) > \frac{xW(p_r)}{H}(1 + \Theta) + R,
\]
where
\[
\Theta = -\sum_{i \le r} \frac{\omega(p_i)}{p_i}\,\frac{W(p_i)}{W(p_r)} \sum_{t \mid P(i \cdots r]} \chi(t)\bigl(1 - \chi(p_i t)\bigr)\frac{\omega(t)}{t}
\]
and $P(i \cdots r] = \prod_{i < k \le r} p_k$. We let $1 \le r_t \le r_{t-1} \le \cdots \le r_0 = r$ be a sequence of integers. These correspond to the real numbers $z_i$, but here we select the indices of the primes instead. We use the function $\chi \equiv \chi_2$ (in the proof of Theorem 2.2.2), with $P(r_i \cdots r]$ instead of $P(z_n, z)$ in the definition. We will show that in this case we can select the intervals $(r_i)$ such that $|\Theta| < 1$.

Following the same argument as in Section 2 (with b = 1), we arrive at the following upper bound for Θ:

∑ 1≤n≤t

W(rn) W(r)

1 (2n+1)! ∑ rn≤i≤r

ω(pi) pi 2n+1.

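We will show later that we can pick the $r_i$ such that

\[ \frac{W(r_n)}{W(r)} = \frac{1}{\prod_{r_n \le i \le r} \bigl(1 - \frac{h}{p_i}\bigr)} \le e^{n\gamma}, \quad \text{where } \gamma = \log 1.3. \]

As before,

\[ \sum_{r_n \le i \le r} \frac{\omega(p_i)}{p_i} \le \log \frac{W(r_n)}{W(r)} \le n\gamma. \]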

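So the bound for $\Theta$ is

\[ \sum_{1 \le n \le t} \frac{e^{n\gamma}}{(2n+1)!} (n\gamma)^{2n+1} = \sum_{1 \le n \le t} \frac{(ne^{-1})^{2n+1}}{(2n+1)!} \, e^{2n+1} \gamma^{2n+1} e^{n\gamma} \]

\[ \le \frac{1}{e^3 (3!)} \sum_{1 \le n \le t} (\gamma e^{1+\gamma})^{2n+1} \quad \Bigl(\text{since } \frac{(ne^{-1})^{2n+1}}{(2n+1)!} \text{ is decreasing}\Bigr) \]

\[ \le \frac{1}{e^3 (3!)} \, \gamma e^{1+\gamma} \sum_{0 \le n < \infty} (\gamma e^{1+\gamma})^{2n} \]

\[ = \frac{1}{e^3 (3!)} \cdot \frac{\gamma e^{1+\gamma}}{1 - (\gamma e^{1+\gamma})^2}. \]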

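The last step follows because $\gamma e^{1+\gamma} < 1$. The final expression is $\approx 0.05478 < 1$. Thus $\Theta < 1$. Let us define the intervals by selecting $r_i$ (for $1 \le i \le t$) as the least index such that

\[ \pi_i = \prod_{r_i < k \le r_{i-1}} \Bigl(1 - \frac{h}{p_k}\Bigr) \ge \frac{1}{1.3}. \]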
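The numerical value quoted for the bound on $\Theta$ can be checked directly. The following is a minimal sketch, assuming only the closed form $\frac{1}{e^3 3!} \cdot \frac{q}{1-q^2}$ with $q = \gamma e^{1+\gamma}$ and $\gamma = \log 1.3$; the function name `theta_bound` is ours and does not appear in the text.

```python
import math

def theta_bound(gamma):
    """Closed-form bound q / ((1 - q^2) * e^3 * 3!) with q = gamma * e^(1 + gamma)."""
    q = gamma * math.exp(1 + gamma)
    if q >= 1:
        # The geometric series used in the derivation only converges for q < 1.
        raise ValueError("gamma * e^(1 + gamma) must be below 1")
    return q / (1 - q * q) / (math.exp(3) * math.factorial(3))

gamma = math.log(1.3)
print(gamma * math.exp(1 + gamma))  # the ratio q, which must be below 1
print(theta_bound(gamma))           # the bound on Theta
```

The ratio $q$ comes out close to 1 (about 0.93), which suggests that the choice $\gamma = \log 1.3$ leaves little room: a noticeably larger $\gamma$ would make the geometric series diverge.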


Since $p_i > 10h$ this is always possible. This automatically satisfies the requirements set earlier on $\gamma$. Select $t$ such that

\[ \pi_t = \prod_{1 \le k \le r_{t-1}} \Bigl(1 - \frac{h}{p_k}\Bigr) \ge \frac{1}{1.3}. \]

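Since $p_i > 10h$ we have

\[ 1 - \frac{h}{p_i} > 1 - \frac{h}{10h} = \frac{9}{10}, \]

so

\[ \frac{9}{10} \pi_i = \Bigl(1 - \frac{h}{10h}\Bigr) \pi_i < \Bigl(1 - \frac{h}{p_{r_i}}\Bigr) \pi_i, \]

which by the definition of $r_i$ is $< \frac{1}{1.3}$. Thus

\[ \pi_i \le \frac{10}{9} \cdot \frac{1}{1.3} = \frac{1}{1.17} < \frac{8}{9}. \]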

We will show that

\[ \log \prod_{1 \le i \le r} \Bigl(1 - \frac{h}{p_i}\Bigr) > -\frac{h \log\log N}{e \log(eh)} > -0.2\, h \log\log N. \]

Using the series expansion of log(1+x) we see that

log ∏ 1≤i≤r1  h pi+log ∏ 1≤i≤r1  h pi h ≥  ∑ 1≤i≤r

∑ 2≤m

1 mh pim

≥ ∑ 1≤i≤r

1 2 ∑ 1≤mh pim

=

1 2 ∑ 1≤i≤rh pi2 1 1  h pi.

We need a good bound on $\sum_i \frac{1}{p_i^2}$. We have by [RS62] (p. 87) that

\[ \sum_{x < p} \frac{1}{p^n} \le \frac{1.02\, n}{x^{n-1} \ln x}. \]

Using this with $n = 2$ and $x = 10h$ (all the primes $p_i > 10h$ by our choice), we have

\[ \sum_{1 \le i \le r} \frac{1}{p_i^2} \le \frac{2.04}{10h \log 10h}. \]

Thus

1 2 ∑ 1≤i≤r

1 1  h pih pi2 ≥ 5 9

h2 ∑ 1≤i≤r

1 p2 i

0.2h log10h

.

Now if we can bound from above

log ∏ 1≤i≤r1  h pi h,

then we can obtain a lower bound on $\log \prod_{1 \le i \le r} \bigl(1 - \frac{h}{p_i}\bigr)$. Let $N_0 = \frac{N}{\gcd(H,N)}$. We have

A  (A)

1 ∏1≤i≤r1  1 pi

=

AN0  (AN0)

.

By [RS62] Theorem (15): for $n \ge 3$,

\[ \frac{n}{\varphi(n)} < e^{\gamma} \log\log n + \frac{5}{2 \log\log n}, \]

where γ is the Euler constant. Also by [RS62] Theorem (9): logH < 11h < 0.1logN. Using this we have HN0  (HN0) < eγ loglogHN0+ 2.51 loglogHN0 < eγ loglogN0+ eγ 10 + 2.51 5 < eγ(loglogN +0.4), as HN0≥N, loglogN > 5 by our conditions, and also N0≤N. Now by [RS62], where a lower bound of e γ logx1  1 log2 xfor ∏p≤x 1 1 1 p is given, we have: H  (H) > eγ log10h1  1 2log2 10h> eγ(logh+2.1). Since loglogN > log10h,

∏ 1≤i≤r1  h pi 1 < 1 eγ(logh+2.1)eγ loglogN +0.4

yielding

hlog ∏ 1≤i≤r1  1 pi< hlog(loglogN +0.4) log(logh+2.1)+ 0.2 log10h

and  nally

log ∏ 1≤i≤r1  h pi> hlogloglogN loglogeh.

Using logx loga = 1+logx ae≤ x ae, we have log ∏ 1≤i≤r1  h pi>  hloglogN elogeh

.

Since πi ≤ 1 1.17, we obtain (t 1)log1.17≤log ∏ 1≤i≤r1  h pi 1 ≤ hloglogN elogeh

<

hloglogN elog(h+1)

.

This yields

(2t +1)log(h+1) < 3log(h+1)+

2hloglogN elog1.17 < 3log(h+1)+4.7hloglogN.


Now pi > ilogi, by [RS62] (Corollary to Theorem 3). Hence

logπi = ∑ rn<i≤rn 1

log1  h pi

>

10 9 ∑ rn<i≤rn 1

h ps

>

10h 9

rn 1 rn

dt t logt

=

10h 9

log

logrn 1 logrn

.

Since πi ≤ 1 1.17, we have

logrn logrn 1

< 1 1.17

9 10h

<1+ 9 10h

log1.17 1 ≤(1+0.141h 1) 1,

and so

logrn logr

< (1+0.141h 1) n

for $1 \le n \le t-1$. Further

\[ \log N \ge \sum_{1 \le i \le r} \log p_i > r \log 10h \ge r \log 20, \]

so $\log r < \log\log N - 1$.

Now for the remainder term: R = ∑ d\P

χ(d)|Rd|≤1+ ∑ 1≤i≤r

ω(pi) ∏ 1≤i≤t 11+ ∑ j≤ri

ω(pj)2

(since ω(p) = h)

≤(1+hr) ∏ 1≤i≤t 1 (1+hri)2.

Thus

logR≤log(1+h)+logr+2(t 1)log(h+1)+2 ∑ 1≤i≤t 1

logri

= (2t 1)log(h+1)+logr+2 ∑ 1≤i≤t 1

logri < 3log(h+1)+4.7hloglogN +(loglogN 1)2 ∑ 0≤n (1+0.141h 1) n 1 < 3log(h+1)+4.7hloglogN +(loglogN 1)(14.2h+1) < 19.4hloglogN 11h 1. Since logH < 11h, we have logR < 19.4hloglogN logH 1, and logc(logN)20h H ∏ 1≤i≤r1  h pi> logR,


where $c = 1 - 0.05478$. Thus $P(H,(\log N)^{20h};p_1,\ldots,p_r) > 0$. Thus there is an integer $D$ satisfying the conditions of the theorem.

2.5. Smooth Numbers

Here we illustrate the surprising power of the identity proved in Proposition 2.2.1.

Let $P^x_z = \{p \mid z \le p < x\}$. Then setting $\chi(1) = 1$ and $\chi(d) = 1$ for $d > 1$ in Proposition 2.2.1, we have that for $2 \le z_1 \le z$:

\[ S(A_x; P^x_{z_1}) = S(A_x; P^x_z) - \sum_{z_1 \le p < z} S(A_p; P^x_p). \]

Recall that $S(A_x; P^x_z) = \Psi(x,z)$, the number of $z$-smooth integers below $x$; also $S(A_p; P^x_p) = \Psi(\frac{x}{p}, p)$. Hence we have, for $2 \le z_1 \le z$, that

\[ \Psi(x,z) = \Psi(x,z_1) + \sum_{z_1 \le p < z} \Psi\Bigl(\frac{x}{p}, p\Bigr). \tag{2.19} \]

As an application we show the following theorem.

THEOREM 2.5.1 ([Hal70]). Let $y = x^{1/\theta}$ where $1 < \theta \le 2$. Then

\[ \Psi(x,y) = x \Bigl(1 - \log\theta + O\Bigl(\frac{1}{\log x}\Bigr)\Bigr). \]

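Proof: Applying the identity (2.19) with $z = x$ and $z_1 = y$, we have

\[ \Psi(x,y) = \Psi(x,x) - \sum_{y \le p < x} \Psi\Bigl(\frac{x}{p}, p\Bigr). \tag{2.20} \]

Now $\Psi(x,x) = \lfloor x \rfloor$. Since $1 < \theta \le 2$, every prime in the sum satisfies $p \ge \sqrt{x}$, so $\frac{x}{p} \le \sqrt{x} \le p$. Consequently $\Psi(\frac{x}{p}, p) = \lfloor \frac{x}{p} \rfloor$. Substituting in (2.20), we have

\[ \Psi(x,y) = \lfloor x \rfloor - \sum_{y \le p < x} \Bigl\lfloor \frac{x}{p} \Bigr\rfloor = x - x \sum_{y \le p < x} \frac{1}{p} + O(\pi(x)) = x \Bigl(1 - \log\log x + \log\log y + O\Bigl(\frac{1}{\log x}\Bigr)\Bigr). \]

Now $x = y^{\theta}$, so $\log x = \theta \log y$ and $\log\log x = \log\theta + \log\log y$; this yields

\[ \Psi(x,y) = x \Bigl(1 - \log\theta + O\Bigl(\frac{1}{\log x}\Bigr)\Bigr). \]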
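The recurrence (2.19) and the estimate of Theorem 2.5.1 can be explored by brute force. The sketch below is illustrative only and makes one assumption that differs slightly from the text: it counts $n \le x$ whose prime factors are all $\le y$ (inclusive), so the recurrence is checked in the form $\Psi(x,z) = \Psi(x,z_1) + \sum_{z_1 < p \le z} \Psi(x/p, p)$. The helper names are ours.

```python
import math

def largest_prime_factors(limit):
    """lpf[n] = largest prime factor of n (with lpf[1] = 1), by a simple sieve."""
    lpf = [1] * (limit + 1)
    for p in range(2, limit + 1):
        if lpf[p] == 1:                      # untouched so far, hence p is prime
            for m in range(p, limit + 1, p):
                lpf[m] = p                   # larger primes overwrite smaller ones
    return lpf

LIMIT = 10_000
lpf = largest_prime_factors(LIMIT)
primes = [p for p in range(2, LIMIT + 1) if lpf[p] == p]

def psi(x, y):
    """Number of integers 1 <= n <= x all of whose prime factors are <= y."""
    return sum(1 for n in range(1, int(x) + 1) if lpf[n] <= y)

# Recurrence: split the z-smooth numbers by their largest prime factor p.
x, z1, z = 10_000, 10, 100
lhs = psi(x, z)
rhs = psi(x, z1) + sum(psi(x // p, p) for p in primes if z1 < p <= z)
print(lhs, rhs)

# Theorem 2.5.1 with theta = 2: psi(x, sqrt(x)) / x versus 1 - log 2.
print(psi(x, math.isqrt(x)) / x, 1 - math.log(2))
```

At $x = 10^4$ the ratio is only roughly near $1 - \log 2$; the theorem allows an error of order $1/\log x$, so close agreement should not be expected at this size.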

The recurrence formula can be used to convert upper bounds to other useful lower bounds, and can also be used iteratively. Here is a simple example. Let us try to evaluate $\Psi(x, x^{1/\delta})$ for $2 < \delta < e$ using the recurrence formula

\[ \Psi(x, x^{1/\delta}) = \Psi(x, x^{1/2}) - \sum_{x^{1/\delta} \le p \le x^{1/2}} \Psi\Bigl(\frac{x}{p}, p\Bigr). \]


Applying the trivial bound $\Psi(\frac{x}{p}, p) \le \frac{x}{p}$, we get

\[ \sum_{x^{1/\delta} \le p \le x^{1/2}} \Psi\Bigl(\frac{x}{p}, p\Bigr) \le x \sum_{x^{1/\delta} \le p \le x^{1/2}} \frac{1}{p} = x(\log\delta - \log 2). \]

Now applying the theorem with $\theta = 2$, we have

\[ \Psi(x, x^{1/2}) = x \Bigl(1 - \log 2 + O\Bigl(\frac{1}{\log x}\Bigr)\Bigr). \]

Thus we obtain

\[ \Psi(x, x^{1/\delta}) \ge x \Bigl(1 - \log\delta + O\Bigl(\frac{1}{\log x}\Bigr)\Bigr). \]

Of course, in this case we could have directly derived this result as in the theorem, but this is just an illustration of the usage of Buchstab's identity. In estimating $\Psi(x,y)$ we could try to use Brun's sieve as in section (1.3). It is clear, however, that to obtain a good estimate we need to take $\ln z < \varepsilon \ln x$, but this would make the error term very large, since that depends on the size of the interval x z.

2.6. On the number of integers prime to a given number

Let $k > 1$ be an integer and $x > 1$ a real number; here we will find bounds for the sum

\[ \sum_{n \le x,\ \gcd(n,k) = 1} 1. \]

It is clear that in every interval mod k there are  (k) such integers. However, it is not clear how uniform the distribution of these numbers are inside the interval. The sequence to be sifted is A ={n|1≤n≤x}, and the sifting primes are P ={p| p\k}. We assume x≥k. In this case we can take|Ad|= x d +Rd, where ω(d) = 1 and Rd ≤1. Now, 1 1 1 p ≤2. Hence A = 2, we also have ∑ w≤p≤z p∈P ω(p)ln p p ≤lnlnz lnw+ 1 lnw thus κ = η = 1. To apply the lower bound estimate of the Brun sieve (with b = 1), we need to  nd λ such that

1

2(λeλ)2 1+(λe1+λ)2

> 0

and

1+

2.01 e2λ 1

< γ,

where we have used ξ = 1.005. It turns out that we can take γ < 5, and satisfy both the constraints for λ = 0.204. This gives S(A;P,z)≥xW(z)1 o(1)+Oz4.85.T aking z = x 1 5 , we obtain S(A;P,z)≥c∏ p\k1  1 px+Ox0.97. Now to get the actual estimate ∑n≤x,n⊥k 1 we need to account for the numbers that might have been included in this estimate which are not really prime to k. Clearly, by our choice of the limit for z, each number which is over-counted must share a factor p with k that is larger than x 1 5 . Let us assume that the largest prime factor of k is < x 1 5 .


Thus we have:

\[ \sum_{n \le x,\ \gcd(n,k) = 1} 1 \ge c \, \frac{\varphi(k)}{k} \, x + O(x^{0.97}), \]

where $c < 1$. For the upper bound we can take the same value of $\lambda$ as for the lower bound, but this forces us to take $z = x^{1/6}$ in this case, and we get

\[ \sum_{n \le x,\ \gcd(n,k) = 1} 1 \le c' \, \frac{\varphi(k)}{k} \, x + O(x^{0.975}), \]

where $c' < 4$. In summary we have proved:

THEOREM 2.6.1. Let $x > 0$ and $k$ a positive integer whose largest prime factor $p$ is less than $x^{1/5}$. Then

\[ c \, \frac{\varphi(k)}{k} \, x + O(x^{0.97}) \le \sum_{n \le x,\ \gcd(n,k) = 1} 1 \le c' \, \frac{\varphi(k)}{k} \, x + O(x^{0.975}), \]

where $c < 1$ and $c' < 4$ are constants.
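For small parameters the count in Theorem 2.6.1 can be compared with the density term $\frac{\varphi(k)}{k} x$ by direct enumeration. This is only an illustrative sketch with helper names of our own choosing; it does not implement the sieve argument itself.

```python
from math import gcd

def euler_phi(k):
    """Euler's totient by trial division."""
    result, n, p = k, k, 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            result -= result // p
        p += 1
    if n > 1:
        result -= result // n
    return result

def coprime_count(x, k):
    """Number of integers 1 <= n <= x with gcd(n, k) = 1."""
    return sum(1 for n in range(1, int(x) + 1) if gcd(n, k) == 1)

# k = 30 = 2 * 3 * 5; compare the exact count with phi(k)/k * x.
for x in (100, 1_000, 10_000):
    print(x, coprime_count(x, 30), euler_phi(30) * x / 30)
```

In such small cases the exact count stays within a bounded distance of $\frac{\varphi(k)}{k} x$, which is consistent with the theorem (the constants $c$ and $c'$ there are not claimed to be optimal).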

CHAPTER 3

Selberg’s Sieve

Around 1946 Atle Selberg introduced a new method for finding upper bounds to the sieve estimate [Sel47]. The method usually gives much better bounds than Brun's sieve. To obtain lower bounds one can couple the Selberg sieve with the Buchstab identities. After developing the basic ideas of this sieve technique, we shall look at the most important application of this method: deriving inequalities of the Brun-Titchmarsh type.

3.1. The Selberg upper-bound method

Selberg's method of estimating the sum

\[ S(A;P_z,x) = \sum_{a \in A} \ \sum_{d \mid \gcd(a,P_z)} \mu(d) \]

relies on finding a sequence of numbers $\lambda_d$ such that $\lambda_1 = 1$ and using the inequality

\[ S(A;P_z,x) \le \sum_{a \in A} \Bigl( \sum_{d \mid \gcd(a,P_z)} \lambda_d \Bigr)^2. \]

This allows us complete freedom in our choice of the numbers $\lambda_d$ for $d > 1$, and the idea of this method is to select the $\lambda_d$ such that the sum is minimized. Note that setting $\lambda_1 = 1$ and $\lambda_d = 0$ for $d > 1$ leads to the trivial estimate $S(A;P_z,x) \le |A_x|$. Selberg's method relies on choices of $\lambda_d$ that mimic the cancellation occurring in the sum $\sum_{d \mid n} \mu(d)$. Such choices lead to better estimates when we interchange the sums.

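Now

\[ \sum_{a \in A_x} \Bigl( \sum_{d \mid \gcd(a,P_z)} \lambda_d \Bigr)^2 = \sum_{d_i \mid P_z,\ i=1,2} \lambda_{d_1} \lambda_{d_2} \sum_{a \in A_x,\ a \equiv 0 \bmod D} 1, \]

where $D = \operatorname{lcm}(d_1,d_2)$. By our conventions about the sequence $A$, we have

\[ \sum_{a \in A_x,\ a \equiv 0 \bmod D} 1 = |A^x_D| = \frac{\omega(D)}{D} x + R_D. \]

This yields

\[ \sum_{d_i \mid P_z,\ i=1,2} \lambda_{d_1} \lambda_{d_2} |A^x_D| = x \sum_{d_i \mid P_z,\ i=1,2} \lambda_{d_1} \lambda_{d_2} \frac{\omega(D)}{D} + \sum_{d_i \mid P_z,\ i=1,2} \lambda_{d_1} \lambda_{d_2} R_D = x\Sigma_1 + \Sigma_2. \]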

The problem of selecting the $\lambda_d$ already seems difficult. We can make the assumption that $\lambda_d = 0$ for $d > z$ and hope that, since the second sum $\Sigma_2$ contains only $z^2$ terms, we can concentrate on minimizing the leading sum $\Sigma_1$. Our first effort will be directed towards this.

Minimization of $\Sigma_1$: Using the fact that $\omega(d)$ is a multiplicative function, we have

\[ \frac{\omega(D)}{D} = \frac{\omega(d_1)\,\omega(d_2)}{\omega(\gcd(d_1,d_2))} \cdot \frac{\gcd(d_1,d_2)}{d_1 d_2}, \]


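so

\[ \Sigma_1 = \sum_{d_i \mid P_z} \lambda_{d_1} \lambda_{d_2} \, \frac{\omega(d_1)}{d_1} \, \frac{\omega(d_2)}{d_2} \, \frac{\gcd(d_1,d_2)}{\omega(\gcd(d_1,d_2))}. \]

Let $f(d) = \frac{\omega(d)}{d}$, so that the sum becomes

\[ \Sigma_1 = \sum_{d_i \mid P_z} \lambda_{d_1} \lambda_{d_2} \, \frac{f(d_1) f(d_2)}{f(d)}, \tag{3.21} \]

where $d = \gcd(d_1,d_2)$.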

We need to get rid of the term in the denominator, and to this end we introduce the function

\[ J(r) = \frac{1}{f(r)} \prod_{p \mid r} \bigl(1 - f(p)\bigr). \]

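Let $r = ps$, and consider:

\[ \sum_{\delta \mid ps} J(\delta) = \sum_{\delta \mid s} \bigl( J(p\delta) + J(\delta) \bigr) = \sum_{\delta \mid s} \Bigl( \frac{1}{f(p\delta)} \prod_{q \mid p\delta} \bigl(1 - f(q)\bigr) + \frac{1}{f(\delta)} \prod_{q \mid \delta} \bigl(1 - f(q)\bigr) \Bigr) \]

\[ = \sum_{\delta \mid s} J(\delta) \Bigl( \frac{1}{f(p)} \bigl(1 - f(p)\bigr) + 1 \Bigr) = \frac{1}{f(p)} \sum_{\delta \mid s} J(\delta), \]

together with

\[ \sum_{\delta \mid p} J(\delta) = J(p) + J(1) = \frac{1}{f(p)}. \]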

Thus we have

\[ \frac{1}{f(d)} = \sum_{\delta \mid d} J(\delta). \]

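Substituting this for $\frac{1}{f(d)}$ in (3.21) we get

\[ \sum_{d_i \mid P_z} \lambda_{d_1} \lambda_{d_2} \, \frac{f(d_1) f(d_2)}{f(d)} = \sum_{d_i \mid P_z} \lambda_{d_1} \lambda_{d_2} \, f(d_1) f(d_2) \sum_{\delta \mid d_1,\ \delta \mid d_2} J(\delta) = \sum_{r \le z,\ r \mid P_z} J(r) \Bigl( \sum_{r \mid d,\ d \le z} \lambda_d f(d) \Bigr)^2. \]

Let $\xi_r = \sum_{r \mid d,\ d \le z} \lambda_d f(d)$, so that

\[ \Sigma_1 = \sum_{r \le z,\ r \mid P_z} J(r)\, \xi_r^2. \]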

This is what we need to minimize subject to the restriction λ1 = 1. We wish to write this constraint as a constraint among the variables ξi, which would allow us to convert the minimization problem to one entirely involving the variables ξi.


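The idea is to use Möbius inversion to pick out $\lambda_1$, and this is not difficult:

\[ \sum_{r \le z} \mu(r) \xi_r = \sum_{r \le z} \mu(r) \sum_{r \mid d,\ d \le z} \lambda_d f(d) = \sum_{d \le z} f(d) \lambda_d \sum_{r \mid d} \mu(r) = \lambda_1 f(1) = \lambda_1 = 1. \]

Thus we need to minimize $\sum_{r \le z} J(r) \xi_r^2$ subject to the constraint $\sum_{r \le z} \mu(r) \xi_r = 1$. Let $F = \sum_r J(r) \xi_r^2 - \ell$ for some real $\ell$. Since $\sum_{r \le z} \mu(r) \xi_r = 1$, we have $F = \sum_r J(r) \xi_r^2 - \ell \sum_r \mu(r) \xi_r$. Minimizing $F$ is the same as minimizing the function $\sum_{r \le z} J(r) \xi_r^2$. Let us try to complete the square in the first sum in $F$. This suggests setting $\ell \leftarrow 2\omega$, so

\[ \sum_{r \le z} J(r) \xi_r^2 - 2\omega \sum_{r \le z} \mu(r) \xi_r = \sum_{r \le z} J(r) \Bigl( \xi_r^2 - \frac{2\omega \mu(r) \xi_r}{J(r)} \Bigr) \]

\[ = \sum_{r \le z} J(r) \Bigl( \xi_r^2 - \frac{2\omega \mu(r) \xi_r}{J(r)} + \Bigl( \frac{\omega \mu(r)}{J(r)} \Bigr)^2 \Bigr) - \sum_{r \le z} \frac{\omega^2 \mu(r)^2}{J(r)} = \sum_{r \le z} J(r) \Bigl( \xi_r - \frac{\omega \mu(r)}{J(r)} \Bigr)^2 - \omega^2 \sum_{r \le z} \frac{\mu^2(r)}{J(r)}. \]

Thus at the minimum value of $F$ we should have $\xi_r = \frac{\omega \mu(r)}{J(r)}$, and the minimum value of $F$ would be $-\omega^2 \sum_{r \le z} \frac{\mu^2(r)}{J(r)}$. To find the value of $\omega$, we can substitute $\xi_r$ into the constraint $\sum_{r \le z} \mu(r) \xi_r = 1$, and this gives us immediately that

\[ \omega = \frac{1}{\sum_{r \le z} \mu(r)^2 / J(r)}. \]

So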

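\[ \min \sum_{r \le z} J(r) \xi_r^2 = \sum_{r \le z} \omega^2 \frac{\mu(r)^2}{J(r)} = \omega^2 \sum_{r \le z} \frac{\mu(r)^2}{J(r)} = \frac{\omega^2}{\omega} = \omega = \frac{1}{\sum_{r \le z} \mu(r)^2 / J(r)}. \]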

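By our definition of the function $g(d)$ we have $g(r) = \frac{1}{J(r)}$, so

\[ \sum_{r \le z} \frac{\mu(r)^2}{J(r)} = \sum_{r \le z} \mu(r)^2 g(r). \]

Set

\[ G(z) = \sum_{r \le z} \mu^2(r) g(r). \]

Then the minimum value of $\Sigma_1$ is $\frac{1}{G(z)}$, so the minimum value of the leading term $x\Sigma_1$ is $\frac{x}{G(z)}$.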

Evaluation of $\Sigma_2$: To estimate the remainder term $\Sigma_2$, we need an estimate on the size of the $\lambda_d$. We had earlier used Möbius inversion to extract $\lambda_1$ from a combination of the $\xi_r$, and we can repeat the process to get $\lambda_\delta$ for any $\delta$.


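Now by definition

\[ \xi_r = \sum_{r \mid d,\ d \le z} \lambda_d f(d). \]

Let $r = \gamma\delta$, so that

\[ \xi_{\gamma\delta} = \sum_{\gamma\delta \mid d,\ d \le z} \lambda_d f(d) = \sum_{\gamma \mid \frac{d}{\delta},\ d \le z} \lambda_d f(d) = \sum_{\gamma \mid v,\ v \le \frac{z}{\delta},\ \gcd(v,\delta) = 1} \lambda_{\delta v} f(\delta v). \]

Since we want to extract the term with $v = 1$, we calculate:

\[ \sum_{\gamma \le \frac{z}{\delta},\ \gamma \perp \delta} \mu(\gamma) \xi_{\gamma\delta} = \sum_{\gamma \le \frac{z}{\delta},\ \gamma \perp \delta} \mu(\gamma) \sum_{\gamma \mid v,\ v \le \frac{z}{\delta},\ v \perp \delta} \lambda_{\delta v} f(\delta v) = \sum_{v \le \frac{z}{\delta},\ v \perp \delta} \lambda_{\delta v} f(\delta v) \sum_{\gamma \mid v} \mu(\gamma) = \lambda_\delta f(\delta). \]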

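Thus

\[ \lambda_\delta = \frac{1}{f(\delta)} \sum_{\gamma \le \frac{z}{\delta},\ \gamma \perp \delta} \mu(\gamma) \xi_{\gamma\delta}, \]

and substituting for $\xi_{\gamma\delta}$ gives

\[ \lambda_\delta = \frac{\omega}{f(\delta)} \sum_{\gamma \le \frac{z}{\delta},\ \gamma \perp \delta} \frac{\mu(\gamma\delta) \mu(\gamma)}{J(\gamma\delta)} = \frac{\omega \mu(\delta)}{f(\delta) J(\delta)} \sum_{\gamma \le \frac{z}{\delta},\ \gamma \perp \delta} \frac{\mu(\gamma)^2}{J(\gamma)}. \]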

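Let

\[ G_d(y) = \sum_{\delta < y,\ \delta \perp d} \mu^2(\delta) g(\delta). \]

Then

\[ \lambda_\delta = \frac{\omega \mu(\delta)}{f(\delta) J(\delta)} \, G_\delta\Bigl(\frac{z}{\delta}\Bigr). \tag{3.22} \]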


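We will show that $|\lambda_d| \le 1$. Observe that

\[ G(z) = \sum_{l \mid d} \ \sum_{m \le z,\ \gcd(m,d) = l} \mu(m)^2 g(m) = \sum_{l \mid d} \ \sum_{h < \frac{z}{l},\ \gcd(h,l) = 1,\ \gcd(h,\frac{d}{l}) = 1} \mu(lh)^2 g(lh) \]

\[ = \sum_{l \mid d} \mu(l)^2 g(l) \, G_d\Bigl(\frac{z}{l}\Bigr) \ge \sum_{l \mid d} \mu(l)^2 g(l) \, G_d\Bigl(\frac{z}{d}\Bigr) \]

and

\[ \sum_{l \mid d} \mu(l)^2 g(l) = \prod_{p \mid d} \bigl(1 + g(p)\bigr) = \prod_{p \mid d} \frac{p}{p - \omega(p)} = \frac{1}{\prod_{p \mid d} \bigl(1 - \frac{\omega(p)}{p}\bigr)}, \]

and so

\[ G_d\Bigl(\frac{z}{d}\Bigr) \le \prod_{p \mid d} \Bigl(1 - \frac{\omega(p)}{p}\Bigr) G(z). \tag{3.23} \]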

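Now substituting for $J(\delta)$ in (3.22), we get:

\[ \lambda_d = \frac{\mu(d)}{\prod_{p \mid d} \bigl(1 - \frac{\omega(p)}{p}\bigr)} \cdot \frac{G_d(z/d)}{G(z)}. \tag{3.24} \]

Thus by (3.23) and (3.24), we have $|\lambda_d| \le 1$.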
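The optimal weights can be computed exactly in the simplest case $\omega(d) = 1$, where $f(d) = 1/d$, $g(d) = 1/\varphi(d)$ and $\frac{f(d_1)f(d_2)}{f(\gcd(d_1,d_2))} = \frac{1}{\operatorname{lcm}(d_1,d_2)}$. The sketch below is a hedged illustration, not the author's code: it assumes the sifting set contains every prime up to $z$, uses the convention $r \le y$ in all sums (the text mixes $\le$ and $<$), and the function names are ours. It evaluates (3.24) in rational arithmetic and checks that $\lambda_1 = 1$, $|\lambda_d| \le 1$ and $\Sigma_1 = 1/G(z)$.

```python
from fractions import Fraction
from math import gcd

def prime_factors(n):
    """Prime factorisation of n as (prime, exponent) pairs, by trial division."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            out.append((p, e))
        p += 1
    if n > 1:
        out.append((n, 1))
    return out

def is_squarefree(n):
    return all(e == 1 for _, e in prime_factors(n))

def mobius(n):
    return (-1) ** len(prime_factors(n)) if is_squarefree(n) else 0

def phi(n):
    result = n
    for p, _ in prime_factors(n):
        result = result // p * (p - 1)
    return result

def G(y, coprime_to=1):
    """G_d(y): sum of mu^2(r)/phi(r) over r <= y with gcd(r, d) = 1 (here g = 1/phi)."""
    return sum(Fraction(1, phi(r)) for r in range(1, int(y) + 1)
               if is_squarefree(r) and gcd(r, coprime_to) == 1)

def selberg_weights(z):
    """lambda_d from (3.24) for squarefree d <= z, assuming omega(d) = 1."""
    Gz = G(z)
    weights = {}
    for d in range(1, z + 1):
        if is_squarefree(d):
            local = Fraction(1)
            for p, _ in prime_factors(d):
                local *= 1 - Fraction(1, p)
            weights[d] = mobius(d) / local * G(z // d, d) / Gz
    return weights

def quadratic_form(weights):
    """Sigma_1 = sum over d1, d2 of lambda_d1 * lambda_d2 / lcm(d1, d2)."""
    return sum(w1 * w2 * Fraction(gcd(d1, d2), d1 * d2)
               for d1, w1 in weights.items() for d2, w2 in weights.items())

weights = selberg_weights(10)
print(weights)
print(quadratic_form(weights), 1 / G(10))  # the two values should coincide
```

Exact fractions are used so that the identity $\Sigma_1 = 1/G(z)$ can be compared with equality rather than up to rounding.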

Now

\[ \Sigma_2 \le \sum_{d_i < z,\ d_i \mid P_z} |R_{\operatorname{lcm}(d_1,d_2)}|. \]

Fix a $d$; we can estimate the number of integers $d_1, d_2$ for which $d = \operatorname{lcm}(d_1,d_2)$. Now $d$ as well as $d_1$ and $d_2$ are squarefree. If $d_1 = \prod_i p_i^{e_i}$ and $d_2 = \prod_i p_i^{f_i}$, then $d = \prod_i p_i^{\max\{e_i,f_i\}}$. Suppose $p \mid d$; then $p$ divides only $d_1$, or only $d_2$, or both of them. So the number of pairs which can give rise to $d$ as their lcm is exactly $3^{\nu(d)}$. Using this and the fact that $d < z^2$, we get

\[ \Sigma_2 \le \sum_{d < z^2} 3^{\nu(d)} |R_d|. \]
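The count $3^{\nu(d)}$ of ordered pairs $(d_1,d_2)$ with $\operatorname{lcm}(d_1,d_2) = d$ for squarefree $d$ is easy to confirm by enumeration. A small sketch with helper names of our own:

```python
from math import gcd

def lcm_pairs(d):
    """Number of ordered pairs (d1, d2) of divisors of d with lcm(d1, d2) = d."""
    divisors = [t for t in range(1, d + 1) if d % t == 0]
    return sum(1 for a in divisors for b in divisors if a * b // gcd(a, b) == d)

def distinct_prime_count(d):
    """nu(d): the number of distinct prime factors of d."""
    count, p = 0, 2
    while p * p <= d:
        if d % p == 0:
            count += 1
            while d % p == 0:
                d //= p
        p += 1
    return count + (1 if d > 1 else 0)

for d in (1, 6, 30, 210):          # squarefree examples
    print(d, lcm_pairs(d), 3 ** distinct_prime_count(d))
```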

If we also have the remainder condition|Rd|≤ω(d), then we can simplify further:


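\[ \sum_{d < z^2} 3^{\nu(d)} |R_d| \le \sum_{d < z^2,\ d \mid P_z} 3^{\nu(d)} \omega(d) \le z^2 \sum_{d \mid P_z} \frac{3^{\nu(d)} \omega(d)}{d} \]

\[ = z^2 \prod_{p < z,\ p \in P} \Bigl(1 + \frac{3\omega(p)}{p}\Bigr) \le z^2 \prod_{p < z} \Bigl(1 + \frac{\omega(p)}{p}\Bigr)^3 \le \frac{z^2}{W^3(z)}. \]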

Thus we have proved:

THEOREM 3.1.1. If $|R_d| \le \omega(d)$, then

\[ S(A;P_z,x) \le \frac{x}{G(z)} + \frac{z^2}{W^3(z)}, \]

where

\[ G(z) = \sum_{r \le z} \mu^2(r) g(r). \]

The second term can also be upper bounded by

\[ \sum_{d < z^2,\ d \mid P_z} 3^{\nu(d)} |R_d|, \]

which is also upper bounded by

\[ \sum_{d < z^2,\ \Gamma(d) \subseteq P} \mu^2(d)\, 3^{\nu(d)} |R_d|. \]

Here $\Gamma(d)$ stands for the set of prime divisors of $d$.

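We will apply the Selberg method to the simple but important case where $\omega(d) = 1$ and $|R_d| \le 1$.

THEOREM 3.1.2. Suppose $\omega(d) = 1$ and $|R_d| \le 1$. If $d$ is squarefree and $p \notin P \Rightarrow p \perp d$, then

\[ S(A;P,z) \le \frac{x}{\prod_{p < z,\ p \notin P} \bigl(1 - \frac{1}{p}\bigr) \log z} + z^2. \]

Proof: Recall that

\[ g(d) = \frac{\omega(d)}{d \prod_{p \mid d} \bigl(1 - \frac{\omega(p)}{p}\bigr)}, \]

where $d \mid P_z$. In this case we have $\omega(d) = 1$, so $g(d) = \frac{1}{\varphi(d)}$. Let $k = \prod_{p < z,\ p \notin P} p$. Then by the definition of $G(z)$ we get in this case

\[ G(z) = \sum_{d < z,\ d \perp k} \frac{\mu^2(d)}{\varphi(d)}. \]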


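Let

\[ S_k(z) = \sum_{d < z,\ d \perp k} \frac{\mu^2(d)}{\varphi(d)}. \]

Then

\[ S_1(z) = \sum_{d < z} \frac{\mu^2(d)}{\varphi(d)} = \sum_{l \mid k} \ \sum_{d < z,\ \gcd(d,k) = l} \frac{\mu^2(d)}{\varphi(d)} = \sum_{l \mid k} \ \sum_{h < \frac{z}{l},\ \gcd(h,k/l) = 1,\ \gcd(h,l) = 1} \frac{\mu^2(lh)}{\varphi(lh)} \]

\[ = \sum_{l \mid k} \frac{\mu^2(l)}{\varphi(l)} \sum_{h < \frac{z}{l},\ h \perp k} \frac{\mu^2(h)}{\varphi(h)} = \sum_{l \mid k} \frac{\mu^2(l)}{\varphi(l)} \, S_k\Bigl(\frac{z}{l}\Bigr) \le \sum_{l \mid k} \frac{\mu^2(l)}{\varphi(l)} \, S_k(z), \]

because $S_k(z)$ is an increasing function of $z$.

Now

\[ \sum_{l \mid k} \frac{\mu^2(l)}{\varphi(l)} = \prod_{p \mid k} \Bigl(1 + \frac{1}{p-1}\Bigr) = \frac{1}{\prod_{p \mid k} \bigl(1 - \frac{1}{p}\bigr)} = \frac{k}{\varphi(k)}, \]

and so

\[ S_k(z) \ge \frac{\varphi(k)}{k} \, S_1(z). \]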

To apply Theorem 3.1.1 we need a good lower bound on G(z). Since G(z) = Sk(z), the above derivation says that we can translate a lower bound on S1(z) to a lower bound on Sk(z). We have

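\[ S_1(z) = \sum_{d < z} \frac{\mu^2(d)}{d} \cdot \frac{1}{\prod_{p \mid d} \bigl(1 - \frac{1}{p}\bigr)} = \sum_{d < z} \frac{\mu^2(d)}{d} \prod_{p \mid d} \Bigl(1 + \frac{1}{p} + \frac{1}{p^2} + \cdots\Bigr). \]

If we set $\operatorname{rad}(n)$ to be the largest squarefree divisor of $n$, then

\[ S_1(z) = \sum_{\operatorname{rad}(n) < z} \frac{1}{n} \ge \sum_{n < z} \frac{1}{n} \ge \log z. \]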

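So $S_k(z) \ge \frac{\varphi(k)}{k} \log z$. We know from the proof of Theorem 3.1.1 that the remainder term is at most

\[ \sum_{d_i \mid P_z,\ d_i < z} |R_{\operatorname{lcm}(d_1,d_2)}| \le \Bigl( \sum_{d < z} \mu^2(d) \Bigr)^2 < z^2. \]

Thus

\[ S(A;P_z,x) \le \frac{x}{\prod_{p < z,\ p \notin P} \bigl(1 - \frac{1}{p}\bigr) \log z} + z^2. \]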
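The inequality $S_1(z) = \sum_{d < z} \mu^2(d)/\varphi(d) \ge \log z$ used in this proof can be observed numerically. The sketch below is illustrative only; it sieves $\varphi$ and a squarefree flag, and the names are ours.

```python
import math

def phi_and_squarefree(limit):
    """Sieve Euler's phi and a squarefree flag for every n <= limit."""
    phi = list(range(limit + 1))
    squarefree = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if phi[p] == p:                          # phi[p] untouched, so p is prime
            for m in range(p, limit + 1, p):
                phi[m] -= phi[m] // p
            for m in range(p * p, limit + 1, p * p):
                squarefree[m] = False
    return phi, squarefree

PHI, SQUAREFREE = phi_and_squarefree(10_000)

def S1(z):
    """Sum of mu^2(d)/phi(d) over 1 <= d < z."""
    return sum(1 / PHI[d] for d in range(1, z) if SQUAREFREE[d])

for z in (10, 100, 1_000, 10_000):
    print(z, S1(z), math.log(z))
```

In these examples $S_1(z)$ exceeds $\log z$ by a roughly constant amount, so the bound appears to lose little.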

3.2. The Brun-Titchmarsh Theorem

The prime number theorem for arithmetic progressions states that

\[ \pi(x;l,k) = \frac{\operatorname{li} x}{\varphi(k)} + O\bigl(x e^{-A\sqrt{\log x}}\bigr) \]

uniformly for $k \le (\log x)^B$, where $B$ is any positive constant and $A$ is a positive constant depending only on $B$. This is a very narrow range of values of $k$. It turns out that if we assume the Extended Riemann Hypothesis, then

\[ \pi(x;l,k) = \frac{\operatorname{li} x}{\varphi(k)} + O(\sqrt{x} \log x) \]

uniformly for $k \le \frac{\sqrt{x}}{\log^2 x}$. By a careful analysis of the Selberg sieve (especially the remainder term), van Lint and Richert [vLR65] showed a good upper bound for $\pi(x;l,k)$ valid for any $k < x$. In this section we shall look at the proof of this result (see Theorem 3.2.5). In a later chapter we shall improve on this result using the so-called large sieve.

Let $k, l > 0$ be relatively prime integers, and let $x, y > 1$ be reals with $y \le x$. We will concentrate on the sequence $A = \{n \mid x - y < n \le x,\ n \equiv l \bmod k\}$. For $K$ a multiple of $k$, we take as the sifting primes $P_K = \{p \mid p \nmid K\}$. First we shall prove a form of the Selberg sieve where we have a better estimate of the remainder term. We define

\[ S_K(z) = \sum_{1 \le n \le z,\ n \perp K} \frac{\mu^2(n)}{\varphi(n)} \]

as in the proof of Theorem 3.1.2, and

\[ H_K(z) = \sum_{1 \le n \le z,\ n \perp K} \frac{\mu^2(n) \sigma(n)}{\varphi(n)} \]

with $\sigma(n) = \sum_{d \mid n} d$.


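LEMMA 3.2.1.

\[ S(A;P^z_K,x,y) \le \frac{y}{k S_K(z)} + \frac{H_K^2(z)}{S_K^2(z)}. \]

Proof: The cardinality of the set $A_D = \{n \mid x - y < n \le x,\ n \equiv l \bmod k,\ n \equiv 0 \bmod D\}$ is $\frac{y}{kD} + R_D$. Following the proof of the Selberg sieve and using the analysis in Theorem 3.1.2, we get the first term to be $\frac{y}{k S_K(z)}$. Now the remainder term is (using $|R_d| \le 1$) at most

\[ \sum_{d_i \mid P_K,\ i=1,2} |\lambda_{d_1} \lambda_{d_2}| = \Bigl( \sum_{d \mid P_K} |\lambda_d| \Bigr)^2. \]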

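In the notation of this proof we have

\[ \lambda_d = \mu(d) \, \frac{d}{\varphi(d)} \cdot \frac{S_{Kd}(z/d)}{S_K(z)}, \]

so

\[ \sum_{d \mid P_K} |\lambda_d| = \sum_{1 \le d \le z,\ d \perp K} \frac{\mu^2(d)\, d}{\varphi(d)} \cdot \frac{1}{S_K(z)} \sum_{1 \le m \le z/d,\ m \perp Kd} \frac{\mu^2(m)}{\varphi(m)} \]

\[ = \frac{1}{S_K(z)} \sum_{1 \le d \le z,\ d \perp K} \ \sum_{1 \le m \le z/d,\ m \perp Kd} \frac{\mu^2(md)}{\varphi(md)} \, d = \frac{1}{S_K(z)} \sum_{1 \le n \le z,\ n \perp K} \frac{\mu^2(n)}{\varphi(n)} \sum_{d \mid n} d = \frac{H_K(z)}{S_K(z)}. \]

Hence the remainder term is at most $\frac{H_K^2(z)}{S_K^2(z)}$, and the lemma follows.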

Our aim now is to find a good upper bound on $H_K^2(z)$. One idea is to use Cauchy's inequality on this sum, and this suggests that we first find a concrete upper bound for the sum $\sum_{n \le x,\ n \perp K} 1$, which we have seen in the last chapter. Using Theorem 3.1.2 we have

THEOREM 3.2.2. If $1 \le k < y \le x$ and $P$ is a set of primes $p$ with $k \perp p$, then we have for any $z \ge 2$ that

\[ \bigl| \{ n \mid x - y < n \le x,\ n \equiv l \bmod k,\ n \perp P_z \} \bigr| \le \frac{y}{k \prod_{p < z,\ p \notin P} \bigl(1 - \frac{1}{p}\bigr) \log z} + z^2. \]

LEMMA 3.2.3. Let $p(k)$ be the largest prime divisor of $k$. For $x \ge e^6$ and $p(k) \le x$ we have

\[ \sum_{n \le x,\ n \perp K} 1 < 7 \, \frac{\varphi(k)}{k} \, x. \]


Proof : Take k = 1,y = x and P ={p| p6\k}in Theorem (3.2.2). For z≤x we have ΦK(x) =
 
{n : n≤x,gcd(n, ∏ p<z,p⊥K p) = 1}
 
≤ x ∏p<z p\K1  1 plogz

+z2.

Thus

k  (k)

ΦK(x) x ≤

1 ∏p≤x1  1 p 1 logz

+

z2 x,

and using

∏ p≤x1  1 p 1 ≤eγlogx1+ 1 2log2 x

and setting z = x

1 3 , we get

k  (k)

ΦK(x) x ≤eγlogx1+ 1 2log2 x 3 logx

+

1

x

1 3.

The right hand side is decreasing, and for x = e6 is < 7.

LEMMA 3.2.4. For $z > 10^3$, $h$ even,

\[ \frac{H_h^2(z)}{S_h^2(z)} < 22.5 \, \frac{h}{\varphi(h)} \, \frac{z^2}{\log^2 z}. \]

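Proof: Let

\[ J_h(z) = \sum_{1 \le n \le z,\ n \perp h} \frac{\mu^2(n) \sigma^2(n)}{\varphi^2(n)}, \]

and as above let $\Phi_h(z) = \sum_{1 \le n \le z,\ n \perp h} 1$. Now

\[ H_h(z) = \sum_{1 \le n \le z,\ n \perp h} \frac{\mu^2(n) \sigma(n)}{\varphi(n)}. \]

Cauchy's inequality states that

\[ \Bigl( \sum_{1 \le n \le N} a_n b_n \Bigr)^2 \le \sum_{1 \le n \le N} a_n^2 \sum_{1 \le n \le N} b_n^2. \]

Using this with $b_n = 1$, $a_n = \frac{\mu^2(n) \sigma(n)}{\varphi(n)}$ and observing that $\mu^4(n) = \mu^2(n)$, we have

\[ H_h^2(z) \le \Phi_h(z) J_h(z). \]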

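Let $n$ be an integer and $p \perp n$; then

\[ \sigma(np) = \sum_{d \mid np} d = \sum_{d \mid n} d + p \sum_{d \mid n} d = \sigma(n)(1 + p), \]

and also $\varphi(np) = \varphi(n)\varphi(p)$. If $n$ is squarefree, then

\[ \frac{\sigma^2(np)}{\varphi^2(np)} = \frac{\sigma^2(n)}{\varphi^2(n)} \cdot \frac{(1+p)^2}{\varphi^2(p)} = \frac{\sigma^2(n)}{\varphi^2(n)} \cdot \frac{\varphi^2(p) + 4p}{\varphi^2(p)} = \frac{\sigma^2(n)}{\varphi^2(n)} \Bigl(1 + \frac{4p}{\varphi^2(p)}\Bigr). \]

By induction we have

\[ \frac{\sigma^2(n)}{\varphi^2(n)} = \prod_{p \mid n} \Bigl(1 + \frac{4p}{\varphi^2(p)}\Bigr) = \sum_{d \mid n} \frac{4^{\nu(d)} d}{\varphi^2(d)}, \quad \mu^2(n) = 1. \]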

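Since $2 \mid h$ we have $J_h(z) \le J_2(z)$ and

\[ J_2(z) = \sum_{1 \le n \le z,\ n \perp 2} \mu^2(n) \sum_{d \mid n} \frac{4^{\nu(d)} d}{\varphi^2(d)} = \sum_{1 \le d \le z,\ d \perp 2} \frac{\mu^2(d)\, 4^{\nu(d)} d}{\varphi^2(d)} \sum_{1 \le m \le z/d,\ m \perp 2d} \mu^2(m) \]

\[ \le z \sum_{1 \le d \le z,\ d \perp 2} \frac{\mu^2(d)\, 4^{\nu(d)}}{\varphi^2(d)} \le z \prod_{p > 2} \Bigl(1 + \frac{4}{(p-1)^2}\Bigr) < \frac{16}{5} z. \]

In the proof of Theorem 3.1.2 we had proved $S_h(x) \ge \frac{\varphi(h)}{h} \log x$; now using this and Lemma 3.2.3 we have:

\[ \frac{H_h^2(z)}{S_h^2(z)} \le \frac{7 \frac{\varphi(h)}{h} z \cdot \frac{16}{5} z}{\frac{\varphi^2(h)}{h^2} \log^2 z} < 22.5 \, \frac{z^2}{\log^2 z} \cdot \frac{h}{\varphi(h)}. \]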

THEOREM 3.2.5. If $x$ and $y$ are real numbers and $k$ and $l$ are integers satisfying $1 \le k < y \le x$ with $k \perp l$, then

\[ \pi(x;k,l) - \pi(x-y;k,l) < \frac{3y}{\varphi(k) \log\frac{y}{k}} \tag{3.25} \]

and

\[ \pi(x;k,l) - \pi(x-y;k,l) < \frac{y}{\varphi(k) \log\sqrt{y/k}} \Bigl(1 + \frac{4}{\log\sqrt{y/k}}\Bigr). \tag{3.26} \]

Proof: Let $\Delta(x,y,k,l) = \pi(x;k,l) - \pi(x-y;k,l)$ and $h = \frac{2k}{\gcd(2,k)}$. Then there is an $l_1$ such that $\Delta(x,y,k,l) \le \Delta(x,y,h,l_1) + 1$. For if $k$ is even, then $h = k$, and we can take $l_1 = l$. If $k$ is odd, then the parity of $mk + l$ alternates. In this case, we can set $l_1$ to be the solution to $l_1 \equiv 1 \bmod 2$ and $l_1 \equiv l \bmod k$. So at worst we miss one prime in the even subsequence.


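By what we have proved so far, the sifting of the sequence $A$ by $P_z$ yields the following upper bound:

\[ \Delta(x,y,k,l) \le \Delta(x,y,h,l_1) + 1 \tag{3.27} \]

\[ \le \frac{y}{\varphi(h) S_1(z)} + \frac{H_h^2(z)}{S_h^2(z)} + \pi(z,h,l_1) + 1 \tag{3.28} \]

\[ \le \frac{y}{\varphi(k) S_1(z)} + \frac{H_h^2(z)}{S_h^2(z)} + \pi(z,h,l_1) + 1 \quad \text{for any } z > 1. \tag{3.29} \]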

We begin with a trivial estimate

(x,y,h,l1)≤ ∑ x y<n≤x n≡l1 modh

1

y h

+1.

So  (x,y,k,l)≤ y h +2. Let u =qy k. Since  (k) =  (h)≤ 1 2h, we have  (x,y,k,l) y ≤ 1 k + 2 y . Using y = u2k we obtain

(k) (x,y,k,l) y ≤

(k) k

+

2 (k) y ≤

1 2

+

2 (k) u2k

1 2

+

2 u2

.

Thus

Q =

logqy k (k) y

(x,y,k,l)≤logu1 2

+

2 u2

<

3 2

for 1 < u≤e2.9.

Now

π(z,h,l1)+1≤ ∑ 1≤n≤z,k⊥2

μ2(k)≤ z 1 2

for z≥9.

The remainder term is at most

∑ d<z gcd(d,h)=1

μ2(d)2 ≤ ∑ d<z gcd(d,2)=1

μ2(d)2, since 2\h

≤z 1 2 2 if z≥9.

By (3.27), and the above bounds we have: Q≤logu 1 logz

+

1 u2z 1 2 2 + z 1 2

< logu 1 logz

+

z2 4u2if z≥9.

De ne ω by

u =

ω √2eω,


and set z = eω so that

Q≤

logω √2+ω ω 1+ 1 2ωfor ω≥log9. For ω≥√2e > log9 this function is decreasing, and for ω =√2e it is < 3 2. This proves (3.25). Now (3.26) is a consequence of (3.25) for u≤e8. If e8 < u < e10, then using the above bound for Q, we obtain Q≤ logω √2+ω ω 1+ 1 2ω. If u > e8, then ω < 6.4 and this gives Q < 1.4 < 1+ 4 logu. This shows (3.26) for u < e10. Now using (3.27) and setting logz = logu 2, we get logry k (Q 1)≤logulogu logz  1+48 logu u2 z2 log2 z + logu u2 z, = logu 2 logu 2 + 48 e4 logu (logu 2)2 + logu e2u, which is a decreasing function in u. In particular it is < 4 if u≥e10. This proves (3.26).
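The bound (3.25), read here as $\pi(x;k,l) - \pi(x-y;k,l) < \frac{3y}{\varphi(k)\log(y/k)}$ for $1 \le k < y \le x$ and $\gcd(k,l) = 1$, can be compared with actual prime counts in short intervals. The sketch below is only a numerical illustration under that reading; the helper names are ours.

```python
import math
from math import gcd

def primes_upto(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(n) + 1):
        if is_prime[p]:
            for m in range(p * p, n + 1, p):
                is_prime[m] = False
    return [p for p in range(2, n + 1) if is_prime[p]]

def euler_phi(k):
    return sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)

PRIMES = primes_upto(10_000)

def primes_in_progression(x, y, k, l):
    """Number of primes p with x - y < p <= x and p congruent to l mod k."""
    return sum(1 for p in PRIMES if x - y < p <= x and p % k == l % k)

def brun_titchmarsh_bound(y, k):
    """Right-hand side of (3.25): 3y / (phi(k) * log(y / k))."""
    return 3 * y / (euler_phi(k) * math.log(y / k))

x, y, k = 10_000, 1_000, 7
for l in range(1, k):
    print(l, primes_in_progression(x, y, k, l), brun_titchmarsh_bound(y, k))
```

For these parameters the actual counts are several times smaller than the bound, which is expected: the strength of the Brun-Titchmarsh inequality lies in its uniformity in $k$, not in the constant.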

3.3. Prelude to a theorem of Hooley

In this section we will look at a variation of a problem of Chebyshev that we shall see in the next section. The problem is to prove a lower bound on the largest prime divisor of

\[ \prod_{p \le x} (p^2 - 1) = \prod_{p \le x} (p+1) \prod_{p \le x} (p-1). \]

We will prove the following theorem of Motohashi [Mot70].

THEOREM 3.3.1. Let $P_x$ be the largest prime divisor of $\prod_{p \le x} (p^2 - 1)$. Then $P_x > x^{\theta}$ for any $\theta < 1 - \frac{1}{2} e^{-1/4}$.

Proof : In this proof q will also stand for primes, and sums or products over q will represent sums or products over primes in the range.

Consider the product $\Xi = \prod_{p \le x} (p^2 - 1)$. Taking logarithms on both sides, we have

\[ \log\Xi = \log \prod_{p \le x} p^2 \Bigl(1 - \frac{1}{p^2}\Bigr) = 2 \sum_{p \le x} \log p - O\Bigl( \sum_{p \le x} \frac{1}{p^2} \Bigr) = 2x + O\bigl(x e^{-c\sqrt{\log x}}\bigr) - O(1). \]

Let $\pi(x,k)$ be the number of primes below $x$ such that $p^2 - 1 \equiv 0 \bmod k$. We have that $p^2 - 1 = (p+1)(p-1)$, and for $p > 2$ we have $\gcd(p+1,p-1) = 2$. If $k = q^a$, $q \ne 2$, then $p^2 - 1 \equiv 0 \bmod k$ implies that either $p + 1 \equiv 0 \bmod k$ or $p - 1 \equiv 0 \bmod k$. In this case we have $\pi(x,q^a) = \pi(x;-1,q^a) + \pi(x;+1,q^a)$. Furthermore, $\pi(x,2) = \pi(x)$ and $\pi(x,4) = \pi(x)$. For $a > 2$, we have $\pi(x,2^a) = \pi(x;-1,2^{a-1}) + \pi(x;+1,2^{a-1})$. Using the function $\pi(x,q^a)$, we can write $\Xi$ as

\[ \prod_{q^a < x} q^{\pi(x,q^a)}. \]

For if $q^a$ divides $\Xi$, then it is counted exactly $a$ times in this product. Taking logarithms we have

\[ \sum_{q^a < x} \pi(x,q^a) \log q = 2x + O\bigl(x e^{-c\sqrt{\log x}}\bigr). \]


We split up the sum as follows:

∑ qa<x

π(x,qa)logq = ∑ q≤ √x logB x a=1

+ ∑ √x logB x <q≤xθ a=1

+ ∑ xθ<q<x a=1

+ ∑ qa<x a≥2

= Σ1 +Σ2 +Σ3 +Σ4, where B is a positive real number. We wish to show that Σ3 is non-zero for the value of θ claimed. Since we already have an asymptotic formula for the sum, to obtain a lower bound for Σ3 we need upper bounds for the remaining sums. We have π(x,k)~ 2lix  (k). Σ1 Bombieri’s Theorem— which we shall prove in Chapter 4, can be used directly to bound this sum we get: Σ1 = 2x logx ∑ q≤ √x logB x logq (q 1) +O x logx = x+Oxloglogx logx . Σ2 We have from the Brun-Titchmarsh Theorem (3.2.5) that π(x,q)≤4 x (q 1)logx q1+ 8 logx q.Hence Σ2 ≤4x( ∑ √x logB x <q≤xθ logq (q 1)logx q +O 1 (log2 x) ∑ q≤x logq q ), and using ∑p≤x logp p ~logx, we have Σ2 = 4x ∑ √x logB x <q≤xθ logq qlogx q+O x logx. Writing  (x) for ∑p≤x logp, we have by partial summation: ∑ y<p≤z log p qlogx q = ∑ y<k≤z  (k)  (k 1) klogx k = ∑ y<k≤z  (k) 1 klogx k  1 (k+1)log x k+1.This sum boils down to ∑ y<k≤z  (k) k(k+1)logx k , and using  (x) < x1+ 1 2logx, we get ∑ y<p≤z logp qlogx q≤ ∑ y<k≤z 1 klogx k . Now we can bound this sum using integration to get

∑ y<p≤z

logp qlogx q

= loglog x z loglog x y+o(1).

Thus

√x logB x

<q≤xθ

logq qlogx q

= log2(1 θ)+o(1),


and so

Σ2 ≤ 4log2(1 θ)x+o(x). Σ4 We split up Σ4 into two parts, Σ4 = ∑ qa≤x 2 3 a≥2 + ∑ x 2 3 <qa<x a≥2 = Σ41 +Σ42, say.

Using the Brun-Titchmarsh theorem: Σ41 = O∑ q≤√x

logq

x logx ∑ a≥2

1  (qa)

= O x logx ∑ q≤√x

logq q2

= O x logx

and

Σ42 = O∑ q≤√x

logq ∑

x

2 3 <qa<x

x qa

= Ox1 3 ∑ q≤√x

logq

logx logq

= O(x

5 6 ).

Thus

Σ4 = O x logx.

From the bounds we have derived we get:

\[ \Sigma_3 > \bigl(1 + 4\log(2(1-\theta))\bigr) x + o(x). \]

Hence if $1 + 4\log(2(1-\theta)) > 0$, i.e., if

\[ 1 - \frac{1}{2e^{1/4}} > \theta, \]

then there is a prime factor exceeding $x^{\theta}$.

Among known improvements to this result, the best one is that the largest prime factor exceeds xθ for θ = 0.677 (see [BakHar95], [BakHar98], and also [Ho73]).

3.4. A theorem of Hooley

Chebyshev proved that if $P_x$ is the largest prime factor of $\prod_{n \le x} (n^2+1)$, then $\frac{P_x}{x} \to \infty$. Hooley [Ho67] (see also [Ho76]) improved the previous best known result of

\[ \frac{P_x}{x} > (\log x)^{A_1 \log\log\log x} \]

by Erdős [Erd52] to $P_x > x^{11/10}$ using the Selberg sieve. In this section we shall outline the proof given by Hooley in [Ho76]. The exponent $\frac{11}{10}$ has since been improved to $\theta < 1.202\cdots$, where $\theta$ is the solution to $2 - \theta - 2\log(2-\theta) = \frac{5}{4}$, by Deshouillers and Iwaniec [DI83] (see also [Dar96]).


THEOREM 3.4.1 ([Ho76]). The largest prime factor of

\[ \prod_{n \le x} (n^2 + 1) \]

exceeds $x^{11/10}$ for all large enough values of $x$.

Proof: Let $P_x$ be the largest prime factor of $\prod_{n \le x} (n^2+1)$, and set $N_x(l) = \bigl| \{ n \le x \mid n^2 \equiv -1 \bmod l \} \bigr|$. We begin by finding a lower bound for $\sum_{x \le p \le P_x} N_x(p) \log p$, as in the proof of Theorem 3.3.1. We have

\[ \prod_{n \le x} (n^2 + 1) = \prod_{p \le P_x,\ p^{\alpha} < x^2+1} p^{N_x(p^{\alpha})}. \]

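Taking logs,

\[ \log \prod_{n \le x} (n^2 + 1) = \log \prod_{n \le x} n^2 \Bigl(1 + \frac{1}{n^2}\Bigr) > \log (\lfloor x \rfloor !)^2 = 2x\log x + O(x) \]

by Stirling's theorem, and so

\[ \sum_{p \le P_x,\ p^{\alpha} < x^2+1} N_x(p^{\alpha}) \log p > 2x\log x + O(x). \]

Now

\[ \sum_{p \le P_x,\ p^{\alpha} < x^2+1} N_x(p^{\alpha}) \log p = \sum_{x \le p \le P_x} N_x(p) \log p + \sum_{p \le x} N_x(p) \log p + \sum_{p \le P_x,\ \alpha > 1} N_x(p^{\alpha}) \log p = \Sigma_A + \Sigma_B + \Sigma_C. \]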

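As before we proceed to upper-bound $\Sigma_B$ and $\Sigma_C$, thereby obtaining a lower bound for $\Sigma_A$. Now

\[ N_x(l) = \sum_{n \le x,\ n^2+1 \equiv 0 \bmod l} 1 = \sum_{0 < v \le l,\ v^2+1 \equiv 0 \bmod l} \ \sum_{n \le x,\ n \equiv v \bmod l} 1. \]

Let $\rho(l)$ be the number of solutions to the congruence $v^2 + 1 \equiv 0 \bmod l$. Then since $\sum_{n \le x,\ n \equiv v \bmod l} 1 - \frac{x}{l} = O(1)$, we have

\[ N_x(l) = \frac{x\rho(l)}{l} + O(\rho(l)). \]

Now $\rho(2) = 1$, and since $\bigl(\frac{-1}{p}\bigr) \equiv (-1)^{\frac{p-1}{2}} \bmod p$, the congruence has no solutions for $p \equiv 3 \bmod 4$ and has exactly two solutions for $p \equiv 1 \bmod 4$. We conclude

\[ \rho(p) = \begin{cases} 2 & \text{if } p \equiv 1 \bmod 4, \\ 0 & \text{if } p \equiv 3 \bmod 4. \end{cases} \]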
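The values of $\rho(l)$ and the approximation $N_x(l) = \frac{x\rho(l)}{l} + O(\rho(l))$ are easy to confirm by brute force for small moduli. This is a minimal sketch with function names of our own; the error of at most $\rho(l)$ reflects an error below 1 for each of the $\rho(l)$ residue classes.

```python
def rho(l):
    """Number of residues v mod l with v^2 + 1 divisible by l."""
    return sum(1 for v in range(l) if (v * v + 1) % l == 0)

def N(x, l):
    """Number of integers 1 <= n <= x with n^2 + 1 divisible by l."""
    return sum(1 for n in range(1, int(x) + 1) if (n * n + 1) % l == 0)

# rho(p) is 2 for p = 1 mod 4, 0 for p = 3 mod 4, and rho(2) = 1.
for p in (2, 3, 5, 7, 11, 13, 17, 19):
    print(p, p % 4, rho(p))

# rho is multiplicative on coprime moduli: rho(65) = rho(5) * rho(13).
print(rho(65), rho(5) * rho(13))

# N_x(l) stays within rho(l) of x * rho(l) / l.
x = 1_000
for l in (5, 13, 65):
    print(l, N(x, l), x * rho(l) / l)
```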


The needed bounds are given by:

ΣB = x ∑ p≤x

ρ(p)logp p

+O∑ p≤x

ρ(p)logp

= 2x ∑ p≤xp ≡1 mod 4

log p p

+O(x)+O(∑ p≤x

logp),

= xlogx+O(x).

using ∑ p≤x p≡l mod k

logp p = 1  (k) logx+O(1),

ΣC = O ∑ p≤√x2+1

logp ∑ 2≤α x pα

+1

= Ox∑ p

logp p1  1 p= Ox∑ p log p p(p 1) = O(x)

since the sum converges. Thus we get ΣA > xlogx+O(x). Our next task is to upper-bound the sum Tx(y) = ∑x<p≤y Nx(p)log p, which in conjunction with the above lower bound will yield a lower bound for y. It turns out that to estimate Tx(y) effectively, we need to split up the sum into two parts and evaluate each of them separately. To this end let X = x 1 11 , and assume that x 12 11 < y < x2. Then

Tx(y) = ∑ x<p≤xX

Nx(p)log p+ ∑ xX<p≤y

Nx(p)log p

= Tx(xX)+T0 x(y).

To evaluate Tx(xX), we let Vx(v) = ∑v<p≤ev Nx(p). Then

Tx(xX) = ∑ 0≤α<logX

∑ xeα<p≤xeα+1

Nx(p)log p

≤ ∑ 0≤α<logX

log(xeα+1)Vx(xeα).

Now for the sum T0 x(y), using the de nition of Nx(l), we have:

T0 x(y) = ∑ xX<p≤y pm=n2+1 n≤x

logp

= ∑ m> x2 ylog8 x

log p+ ∑ m≤ x2 ylog8 x

log p

= T00 x (y)+T000 x (y)(say).


Now the conditions of the summation T000 x (y) yield m≤ x2 ylog8 x

, and so n <p(pm)≤r yx2 ylog8 x= x log4 x

. Since m≤n,

we have m≤ x log4 x

. Using this we have

T000 x (y) = 2logx ∑ lm=n2+1 m,n≤ x log4 x

1

= 2logx ∑ m≤ x log4 x

N x log4 x

(m).

Now if m = ∏i phi i , then ρ(m) = ∏iρ(phi), and each of the individual terms is a constant. So ρ(m)≤2ν(m), and this itself is upper bounded by d(m), i.e. the number of divisors of m. Therefore:

T000 x (y)≤

2x log3 x ∑ m≤ x log4 x

ρ(m) m

+Ologx ∑ m≤ x log4 x

ρ(m)

= O x log3 x ∑ m≤ x log4 x

ρ(m) m

= O x log3 x ∑ m≤x

d(m) m .

Now consider

∑ 1≤n≤x

1 n ∑ 1≤m≤x

1 m= ∑ 1≤n≤x2n is x smooth

1 n∑ u,v≤x uv=n

1

≥ ∑ 1≤n≤x

d(n) n .

This yields ∑1≤n≤x d(n) n = O(log2 x), and so

T000 x (y) = O x logx.

In T00 x (y), we have m > x2 ylog8 x

and pm≤x2 +1, so m≤ x2+1 p . Furthermore p > xX, and so m≤ x X1+ 1 x2≤ ex X . Thus

we have

T00 x (y)≤ ∑ x2 ylog8 x <m≤ex X pm=n2+1 n≤x,p≥xX

log

ex2 m

.

Let

Wx(w) = ∑ w<m≤ew pm=n2+1 n≤x,p≥x

1.

Then

T00 x (y)≤ ∑ 0≤α<logY

logxXeα+1Wxxe α X ,

where Y = eylog8 x xX . Finally,

T0 x(y)≤ ∑ 0≤α<logY

log(xXeα+1)Wxxe α x +O x logx.


We will format the sums involved for application of the Selberg sieve. Let λ be a squarefree number, and de ne

(u;λ) = ∑ u<λk≤eu

Nx(λk).

We impose the conditions x

4 5 < u < x

4 3 and λ < minu

5 4 x , x u 3 4    . By a rather ingenious and elaborate argument Hooleysho wed that

(u;λ) =

3xρ(λ) 2πλ

1 ∏p\λ1+ 1 p

+Ox1 2+εu3 8 λ 1 2 (see [Ho76]§2.3 -§2.6). Since the argument is not central to our application of the sieve, we exclude the derivation of this bound here.

Application of the Sieve: Let x≤v < x

12 11 , so that v satis es the conditions on u imposed by our bounds on  (u;λ). Let d denote a squarefree number, and let λd be the Selberg coef cients. Then

Vx(v)≤ ∑ v<l≤ev

Nx(l)∑ d\l

λ2 d

= ∑ d1,d2≤z

λd1λd2 ∑ v<l≤evl ≡0 mod lcm(d1,d2)

Nx(l)

= ∑ d1,d2≤z

λd1λd2 (v;lcm(d1,d2) )

(since lcm(d1,d2) < xv 3 4 )

3x 2π ∑ d1,d2≤z

λd1λd2ω(lcm(d1,d2)) lcm(d1,d2)

+Ox1 2+εv3 8 ∑ d1,d2≤z

|λd1||λd2| plcm(d1,d2).Here

ω(d) =

ρ(d) ∏p\d1+ 1 p

,

which is clearly multiplicative. So we can apply Selberg’s sieve without modi cation, except that the remainder term is more clearly speci ed in this case. Thus by Theorem (3.1.1), we have

Vx(v)≤

3x 2πG(z)

+R,

where R is the remainder term. Now

G(z) = ∑ d<z

μ2(d)g(d),

and

g(p) =

ω(p) p1 ω(p) p

=

21+ 1 p 1p 1  2 p1+ 1 p 1= 2 p1  1 p .


Thus

g(d) =

ρ(d) d∏ p\d p6=2,p≡1 mod 41  1 p =∑ d0 ρ(dd0) dd0 , where d0 indicates any number whose prime factors divide d. Also ρ(2α) = 0, if α > 1, and we have

∑ d≤z

μ2(d)g(d) = ∑ d≤z

∑ d0

ρ(dd0) dd0

≥ ∑ m≤z

ρ(m) m

3(1 η1) 2π

logz,

where η1 < 1 can be chosen very small. Here we have used

∑ m≤z

ρ(m) =

3z 2π

+O(z

3 4 )

(which is proved in [Ho76] p. 32) and partial summation. Also the remainder term can be bounded as follows: R = Ox1 2+εv3 8 ∑ d1,d2≤z

1 plcm(d1,d2)= Ox1 2+εv3 8 ∑ d≤z ∑ l1,l2≤z d l1⊥l2 1 √dl1l2 = Ox1 2+εv3 8 ∑ d≤z z √d ∑ l1,l2≤z/d l1⊥l2 1 √l1l2 = Ox1 2+εv3 8 ∑ d≤z z d 3 2 = Ox1 2+εv3 8 z.Selecting z = x 1 2 ηv 3 8 , we get

Vx(v) <

(1+η2)x log√xv 3 8

,

where η2 can be made arbitrarily small. Similarly

Wx(w)≤ ∑ w<m≤ew∑ d\l

λd2

= ∑ d1,d2≤z

λd1λd2 ∑ w<m≤ewmr ×lcm(d1,d2)=n2+1 n≤x

1

= ∑ d1,d2≤z

λd1λd2 lcm(d1,d2) w;lcm(d1,d2).


Carrying through the sieve estimate, we get with z = x

2 7 ηw  3 14 that

Wx(w) <

(1+η2)x

logx

2 7 w  3 14

.

Let y = x

11 10 and γ = logx. Using the above estimates, we  nd that

Tx(xX) < x(1+η2) ∑ 0≤α<logx

α+γ+1 1 8γ  3 8α

< 0.8902xlogx,

where we have used integration to upper-bound the sum. Similarly we  nd T0x(y) < 0.1081xlogx

for large enough x. Thus we get

Tx(x

11 10 ) < 0.9983xlogx, and so the largest prime factor of ∏n≤x(n2 +1) exceeds x11 10 , for all large enough values of x.


CHAPTER 4

The Large Sieve

The Selberg sieve does not give good bounds if we sieve out a large number of residue classes modulo each prime in the sifting set. The large sieve was designed to handle this problem, (hence the name). The bounds are derived by relating the properties of the integer sequence to the behavior of certain exponential sums.

4.1. Bounds on exponential sums

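Define $e(t) = e^{2\pi i t}$. We have $e\left(\frac nq\right) = e\left(\frac mq\right)$ if $n \equiv m \bmod q$. The following property of the exponential function resembles that of the Möbius function, and is useful to study the distribution of a sequence of integers in residue classes modulo some number.

PROPOSITION 4.1.1.

$$\sum_{1\le a\le q} e\left(\frac{an}{q}\right) = \begin{cases} q, & \text{if } n\equiv 0 \bmod q,\\ 0, & \text{otherwise.}\end{cases}$$

Proof : If $n\equiv 0 \bmod q$, then $e\left(\frac{an}{q}\right) = 1$ for each $a$. So $\sum_{1\le a\le q} e\left(\frac{an}{q}\right) = q$. If $n\not\equiv 0 \bmod q$, then summing the geometric series gives

$$\sum_{1\le a\le q} e\left(\frac{an}{q}\right) = \sum_{0\le a\le q-1} e\left(\frac{an}{q}\right) = \frac{e\left(\frac{qn}{q}\right)-1}{e\left(\frac nq\right)-1} = 0.$$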

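Let $a_1,\dots,a_z$ be a sequence of integers, and define

$$Z(q,h) = \left|\{\, i \mid 1\le i\le z,\ a_i\equiv h \bmod q \,\}\right|$$

and $S(x) = \sum_{1\le i\le z} e(a_i x)$. Now for all integers $a$ we have

$$S\left(\frac aq\right) = \sum_{1\le h\le q} Z(q,h)\, e\left(\frac{ah}{q}\right). \tag{4.30}$$

Suppose all the integers in the sequence are distributed evenly among the residue classes modulo $q$, so that $Z(q,h) = z/q$ for every $h$; then using Proposition 4.1.1 we have

$$S\left(\frac aq\right) = \frac zq\sum_{1\le h\le q} e\left(\frac{ah}{q}\right) = 0, \quad\text{if } a\not\equiv 0 \bmod q.$$

If on the other hand all the integers $a_i$ belong to a single residue class modulo $q$, then $\left|S\left(\frac aq\right)\right| = z$ for all integers $a$. Hence the distribution of the integers among the residue classes is related to $\left|S\left(\frac aq\right)\right|$. In fact, we can express $Z(q,h)$ in terms of $S\left(\frac aq\right)$ as follows:

$$S\left(\frac aq\right) e\left(-\frac{h_0a}{q}\right) = \sum_{1\le h\le q} Z(q,h)\, e\left(\frac{ah}{q}\right) e\left(-\frac{h_0a}{q}\right),$$

and therefore

$$\sum_{1\le a\le q} S\left(\frac aq\right) e\left(-\frac{h_0a}{q}\right) = \sum_{1\le h\le q} Z(q,h)\sum_{1\le a\le q} e\left(\frac{a(h-h_0)}{q}\right) = Z(q,h_0)\,q.$$

Hence

$$qZ(q,h) = \sum_{1\le a\le q} S\left(\frac aq\right) e\left(-\frac{ha}{q}\right). \tag{4.31}$$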

It turns out that useful upper bounds can be obtained for the sum

$$\sum_{p\le x}\ \sum_{1\le a\le p-1}\left|S\left(\frac ap\right)\right|^2$$

that are largely independent of the integer sequence used to define $S(x)$.

We first prove a result that shows how the above sums are related to the distribution of the integer sequence in the residue classes.

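LEMMA 4.1.2. For all integers $q\ge 2$,

$$\sum_{1\le a\le q-1}\left|S\left(\frac aq\right)\right|^2 = q\sum_{1\le h\le q}\left(Z(q,h)-\frac zq\right)^2.$$

Proof :

$$\begin{aligned}
\sum_{1\le a\le q-1}\left|S\left(\frac aq\right)\right|^2
&= \sum_{1\le a\le q-1}\Big(\sum_{1\le h\le q} Z(q,h)\,e\left(\frac{ah}{q}\right)\Big)\Big(\sum_{1\le k\le q} Z(q,k)\,e\left(-\frac{ka}{q}\right)\Big)\\
&= \sum_{1\le a\le q-1}\ \sum_{1\le h,k\le q} Z(q,h)Z(q,k)\,e\left(\frac{a(h-k)}{q}\right)\\
&= \sum_{1\le h,k\le q} Z(q,h)Z(q,k)\Big(\sum_{1\le a\le q-1} e\left(\frac{a(h-k)}{q}\right)\Big).
\end{aligned}$$

It is easy to see that

$$\sum_{1\le a\le q-1} e\left(\frac{a(h-k)}{q}\right) = \begin{cases} q-1, & \text{if } h\equiv k \bmod q,\\ -1, & \text{otherwise.}\end{cases}$$

Thus

$$\begin{aligned}
\sum_{1\le a\le q-1}\left|S\left(\frac aq\right)\right|^2
&= q\sum_{1\le h\le q} Z(q,h)^2 - \sum_{1\le h,k\le q} Z(q,h)Z(q,k)\\
&= q\sum_{1\le h\le q} Z(q,h)^2 - \Big(\sum_{1\le h\le q} Z(q,h)\Big)^2\\
&= q\sum_{1\le h\le q} Z(q,h)^2 - z^2\\
&= q\sum_{1\le h\le q}\Big(Z(q,h)^2 - \frac{2zZ(q,h)}{q} + \frac{z^2}{q^2}\Big)\\
&= q\sum_{1\le h\le q}\left(Z(q,h)-\frac zq\right)^2.
\end{aligned}$$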
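Lemma 4.1.2 is an exact identity, so it can be checked numerically on any finite sequence. The sketch below is an illustration only: it evaluates both sides for the squares $1^2,\dots,40^2$ (an arbitrary choice made here, not taken from the text), with `lemma_sides` a hypothetical helper name.

```python
import cmath
from collections import Counter


def e(t: float) -> complex:
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def lemma_sides(seq, q):
    """Return (sum_{a=1}^{q-1} |S(a/q)|^2, q * sum_h (Z(q,h) - z/q)^2)."""
    z = len(seq)
    counts = Counter(n % q for n in seq)  # Z(q, h) for h = 0..q-1
    lhs = 0.0
    for a in range(1, q):
        # S(a/q) straight from the definition; a*n is reduced mod q to keep floats accurate
        s = sum(e((a * n % q) / q) for n in seq)
        lhs += abs(s) ** 2
    rhs = q * sum((counts.get(h, 0) - z / q) ** 2 for h in range(q))
    return lhs, rhs


seq = [n * n for n in range(1, 41)]  # illustrative sequence
for q in (2, 3, 5, 7, 12):
    lhs, rhs = lemma_sides(seq, q)
    print(q, round(lhs, 6), round(rhs, 6))
```

The two columns agree up to rounding. For the squares the right-hand side is large, because squares occupy only about half of the residue classes modulo an odd prime.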

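We will look at exponential sums of the form

$$S(x) = \sum_{-K\le n\le K} a_n e(nx),$$

where $K$ is a positive integer and $a_n\in\mathbb{C}$. Notation : We write $\|t\|$ to mean the distance from $t$ to the nearest integer, i.e., $\|t\| = \min_n |t-n| = \left|\lfloor t+\tfrac12\rfloor - t\right|$.

THEOREM 4.1.3 ([Gal67]). If $S(x) = \sum_{-K\le n\le K} a_n e(nx)$ and $x_1,\dots,x_R$ are real numbers such that $\|x_r-x_s\|\ge\delta>0$ for $r\ne s$, then

$$\sum_{1\le r\le R}|S(x_r)|^2 \le \left(\delta^{-1}+2\pi K\right)\sum_{-K\le n\le K}|a_n|^2.$$

Proof : For any $u$ we can write

$$S^2(x_r) = S^2(u) + 2\int_u^{x_r} S'(t)S(t)\,dt.$$

Using this we have

$$|S^2(x_r)| \le |S^2(u)| + 2\left|\int_u^{x_r}|S'(t)S(t)|\,dt\right|.$$

We now integrate with respect to $u$ over the interval $I_r = \left[x_r-\frac\delta2,\ x_r+\frac\delta2\right]$, to get

$$\delta|S(x_r)|^2 \le \int_{I_r}|S(u)|^2\,du + 2\int_{I_r}\left|\int_u^{x_r}|S'(t)S(t)|\,dt\right|du.$$

Then

$$\begin{aligned}
\int_{I_r}\left|\int_u^{x_r}|S'(t)S(t)|\,dt\right|du
&= \int_{x_r}^{x_r+\frac\delta2}\int_{x_r}^{u}|S'(t)S(t)|\,dt\,du + \int_{x_r-\frac\delta2}^{x_r}\int_u^{x_r}|S'(t)S(t)|\,dt\,du\\
&= \int_{x_r}^{x_r+\frac\delta2}|S'(t)S(t)|\left(x_r+\frac\delta2-t\right)dt + \int_{x_r-\frac\delta2}^{x_r}|S'(t)S(t)|\left(t-x_r+\frac\delta2\right)dt\\
&\le \frac\delta2\int_{I_r}|S'(t)S(t)|\,dt.
\end{aligned}$$

Thus

$$\delta|S(x_r)|^2 \le \int_{I_r}|S(u)|^2\,du + \delta\int_{I_r}|S'(t)S(t)|\,dt.$$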


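By our condition on the numbers $x_r$ the intervals $I_r$ are disjoint modulo 1, meaning that if $r\ne s$, then no point of $I_r$ differs by an integer from a point of $I_s$. Since the integrands are periodic with period 1 and non-negative, the sum of their integrals over all the $I_r$ is bounded above by the integral over $[0,1]$. Thus summing over $r$:

$$\delta\sum_{1\le r\le R}|S(x_r)|^2 \le \int_0^1|S(t)|^2\,dt + \delta\int_0^1|S'(t)S(t)|\,dt.$$

Let us analyze the first integral. The exponential function satisfies

$$\int_0^1 e(nx)\,dx = \begin{cases} 1 & \text{if } n=0,\\ 0 & \text{otherwise.}\end{cases}$$

We have

$$\int_0^1|S(x)|^2\,dx = \int_0^1 S(x)\overline{S(x)}\,dx = \int_0^1\sum_{-K\le m,n\le K} a_n\overline{a_m}\,e((n-m)x)\,dx = \sum_{-K\le n\le K}|a_n|^2.$$

Thus the first integral is $\sum_{-K\le n\le K}|a_n|^2$. The second satisfies, by the Cauchy-Schwarz inequality,

$$\int_0^1|S'(t)S(t)|\,dt \le \Big(\int_0^1|S(t)|^2\,dt\Big)^{\frac12}\Big(\int_0^1|S'(t)|^2\,dt\Big)^{\frac12},$$

and on substituting $S'(t) = \sum_{-K\le n\le K} 2\pi i\,n\,a_n e(nt)$, the right-hand side becomes

$$\Big(\sum_{-K\le n\le K}|a_n|^2\Big)^{\frac12}\Big(\sum_{-K\le n\le K}|2\pi n a_n|^2\Big)^{\frac12} \le 2\pi K\sum_{-K\le n\le K}|a_n|^2.$$

Thus

$$\delta\sum_{1\le r\le R}|S(x_r)|^2 \le (1+2\pi K\delta)\sum_{-K\le n\le K}|a_n|^2.$$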
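The inequality of Theorem 4.1.3 can be tested numerically. The sketch below is an illustration under assumptions made here, not part of the text: the points $x_r$ are taken to be the reduced fractions $a/q$ with $q\le Q$ (the choice used later in the large sieve), the coefficients $a_n$ are random, and the names `farey_points`, `min_spacing` and `gallagher_check` are hypothetical.

```python
import cmath
import math
import random
from fractions import Fraction


def e(t: float) -> complex:
    return cmath.exp(2j * math.pi * t)


def farey_points(Q):
    """Reduced fractions a/q in [0, 1) with q <= Q, as exact Fractions."""
    return sorted({Fraction(a, q) % 1
                   for q in range(1, Q + 1)
                   for a in range(1, q + 1) if math.gcd(a, q) == 1})


def min_spacing(points):
    """delta = min over r != s of the distance of x_r - x_s to the nearest integer."""
    best = Fraction(1)
    for i, x in enumerate(points):
        for y in points[i + 1:]:
            d = (y - x) % 1
            best = min(best, d, 1 - d)
    return best


def gallagher_check(K, Q, seed=0):
    """Return (sum_r |S(x_r)|^2, (1/delta + 2*pi*K) * sum_n |a_n|^2)."""
    rng = random.Random(seed)
    coeffs = {n: complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
              for n in range(-K, K + 1)}
    pts = farey_points(Q)
    delta = float(min_spacing(pts))
    lhs = sum(abs(sum(a * e(n * float(x)) for n, a in coeffs.items())) ** 2
              for x in pts)
    norm = sum(abs(a) ** 2 for a in coeffs.values())
    return lhs, (1 / delta + 2 * math.pi * K) * norm


lhs, bound = gallagher_check(K=20, Q=8)
print(lhs, bound)
```

With random coefficients the left-hand side is usually far below the bound; the inequality is only close to sharp for specially structured $a_n$.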

There is a stronger bound on the sum $\sum_{1\le r\le R}|S(x_r)|^2$ due to Montgomery. To prove this we require the following result.

THEOREM 4.1.4. Let $\Phi_1,\dots,\Phi_R$ and $\xi$ be arbitrary vectors in an inner product space $V$ over the complex numbers. Then

$$\sum_{1\le r\le R}|(\xi,\Phi_r)|^2 \le A\|\xi\|^2,$$

where

$$A = \max_r\sum_{1\le s\le R}|(\Phi_r,\Phi_s)|.$$

THEOREM 4.1.5. Let $S(x)$ be as above, and $x_1,\dots,x_R$ be real numbers with $\|x_r-x_s\|\ge\delta>0$ for $r\ne s$. Then

$$\sum_{1\le r\le R}|S(x_r)|^2 \le \left(2K+3\delta^{-1}\right)\sum_{-K\le k\le K}|a_k|^2.$$


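Proof : If $R = 1$ we have

$$|S(x)|^2 \le N\sum_{M+1\le n\le M+N}|a_n|^2$$

by Cauchy's inequality. Hence we may assume $R\ge 2$, so $\delta\le\frac12$. We apply Theorem (4.1.4) with the inner product defined to be $(\varphi,\psi) = \sum_k \varphi_k\overline{\psi_k}$. Take $\xi = \{a_k b_k^{-1/2}\}_{-K\le k\le K}$ and $\varphi_r = \{b_k^{1/2}e(-kx_r)\}_{-\infty<k<\infty}$, where $b_k$ will be defined later to be positive for $-K\le k\le K$, and non-negative for other $k$. Using Theorem (4.1.4) we have

$$\sum_{1\le r\le R}|S(x_r)|^2 \le A\sum_{-K\le k\le K}|a_k|^2 b_k^{-1},$$

where $A = \max_r\sum_{1\le s\le R}|B(x_r-x_s)|$ and $B(x) = \sum_{-\infty<k<\infty} b_k e(kx)$. To finish the proof it suffices to pick $b_k$ with $b_k\ge 1$ for $-K\le k\le K$ such that

$$\sum_{1\le s\le R}|B(x_r-x_s)| \le 2K+3\delta^{-1}$$

for all $r$. If we took $b_k = 1$ for $-K\le k\le K$ and $b_k = 0$ otherwise, we would get the inferior estimate

$$\sum_{1\le s\le R}|B(x_r-x_s)| \le 2K+O\left(\delta^{-1}\log\delta^{-1}\right).$$

Instead, take $b_k$ to be

$$b_k = \begin{cases} 1 & \text{if } |k|\le K,\\ 1-\dfrac{|k|-K}{L} & \text{if } K\le|k|\le K+L,\\ 0 & \text{if } |k|\ge K+L,\end{cases}$$

where $L$ will be selected later. Using the identity

$$\sum_{|j|\le J}(J-|j|)\,e(jx) = \Big|\sum_{1\le j\le J} e(jx)\Big|^2 = \left(\frac{\sin\pi Jx}{\sin\pi x}\right)^2,$$

we can write

$$B(x) = \frac{1}{L\sin^2\pi x}\left((\sin\pi(K+L)x)^2 - (\sin\pi Kx)^2\right).$$

Hence $B(0) = 2K+L$, and

$$|B(x)| \le \frac{1}{L\sin^2\pi x} \le \frac{1}{4L\|x\|^2},$$

so that

$$\sum_{1\le s\le R}|B(x_r-x_s)| \le 2K+L+2\sum_{1\le h}\frac{1}{4Lh^2\delta^2}.$$

Since $\sum_{1\le h}\frac{1}{h^2} = \frac{\pi^2}{6} < 2$, we have

$$\sum_{1\le s\le R}|B(x_r-x_s)| \le 2K+L+\frac{1}{L\delta^2} \le 2K+\frac3\delta$$

upon taking $L$ to be the least integer $\ge\delta^{-1}$.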


Consider the sum $S(x) = \sum_{M+1\le n\le M+N} a_n e(nx)$. The value of $M$ is irrelevant to the magnitude of this sum since for any $K$ we can set

$$T(x) = \sum_{K+1\le n\le K+N} a_{M-K+n}\,e(nx) = e((K-M)x)\,S(x)$$

and then $|T(x)| = |S(x)|$. Thus the above theorem can be rephrased as follows.

THEOREM 4.1.6. Let $S(x) = \sum_{M+1\le n\le M+N} a_n e(nx)$ where $M$ and $N$ are integers, $N>0$. Let $x_1,\dots,x_R$ be distinct real numbers modulo 1 and let $\delta>0$ be such that $\|x_r-x_s\|\ge\delta$ for $r\ne s$. Then for arbitrary $a_n$

$$\sum_{1\le r\le R}|S(x_r)|^2 \le \left(N+3\delta^{-1}\right)\sum_{M+1\le n\le M+N}|a_n|^2.$$

We state (without proof) another version of the large sieve inequalities due to Montgomery and Vaughan [MV73] (Theorem 1).

THEOREM 4.1.7 ([MV73]). Let

$$S(x) = \sum_{M+1\le n\le M+N} a_n e(nx),$$

let $x_1,\dots,x_R$ be real numbers, and set

$$\delta = \min_{r\ne s}\|x_r-x_s\|.$$

Then

$$\sum_{1\le r\le R}|S(x_r)|^2 \le \left(N+\delta^{-1}\right)\sum_{M+1\le n\le M+N}|a_n|^2.$$

Moreover, if

$$\delta_r = \min_{s\ne r}\|x_r-x_s\|$$

for all $r$, then

$$\sum_{1\le r\le R}\left(N+\tfrac32\delta_r^{-1}\right)^{-1}|S(x_r)|^2 \le \sum_{M+1\le n\le M+N}|a_n|^2.$$

4.2. The Large Sieve

In this section we will use the bounds derived in the previous section to study the distribution of integer sequences in residue classes modulo primes.

Let $a_n$ be a sequence of complex numbers defined for $M+1\le n\le M+N$ (where $M,N$ are integers and $N>0$). Define

$$Z(q,h) = \sum_{\substack{M+1\le n\le M+N\\ n\equiv h \bmod q}} a_n \qquad\text{and}\qquad Z(1,1) = Z = \sum_{M+1\le n\le M+N} a_n.$$


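LEMMA 4.2.1. Let

$$S(x) = \sum_{M+1\le n\le M+N} a_n e(nx).$$

If $q$ is a positive integer, then

$$\sum_{\substack{1\le a\le q\\ a\perp q}}\left|S\left(\frac aq\right)\right|^2 = q\sum_{1\le h\le q}\Big|\sum_{d\mid q}\frac{\mu(d)}{d}\,Z\left(\frac qd,h\right)\Big|^2.$$

Proof : For an integer $a$ we have (using (4.30))

$$S\left(\frac aq\right) = \sum_{1\le h\le q} Z(q,h)\,e\left(\frac{ah}{q}\right).$$

By (4.31)

$$qZ(q,h) = \sum_{1\le a\le q} S\left(\frac aq\right)e\left(-\frac{ah}{q}\right) = \sum_{d\mid q}\ \sum_{\substack{1\le b\le q/d\\ \gcd(b,\,q/d)=1}} S\left(\frac{bd}{q}\right)e\left(-\frac{bdh}{q}\right).$$

Let

$$T(q,h) = \sum_{\substack{1\le a\le q\\ a\perp q}} S\left(\frac aq\right)e\left(-\frac{ah}{q}\right),$$

so that

$$qZ(q,h) = \sum_{d\mid q} T\left(\frac qd,h\right).$$

Applying Möbius inversion to this we get

$$T(q,h) = q\sum_{d\mid q}\frac{\mu(d)}{d}\,Z\left(\frac qd,h\right).$$

Hence

$$|T(q,h)|^2 = q^2\Big|\sum_{d\mid q}\frac{\mu(d)}{d}\,Z\left(\frac qd,h\right)\Big|^2,$$

and therefore

$$\frac1q\sum_{1\le h\le q}|T(q,h)|^2 = q\sum_{1\le h\le q}\Big|\sum_{d\mid q}\frac{\mu(d)}{d}\,Z\left(\frac qd,h\right)\Big|^2.$$

Now

$$\begin{aligned}
q\sum_{1\le h\le q}\Big|\sum_{d\mid q}\frac{\mu(d)}{d}\,Z\left(\frac qd,h\right)\Big|^2
&= \frac1q\sum_{1\le h\le q}|T(q,h)|^2\\
&= \frac1q\sum_{1\le h\le q}\ \sum_{\substack{1\le a,b\le q\\ a\perp q,\ b\perp q}} S\left(\frac aq\right)\overline{S\left(\frac bq\right)}\,e\left(\frac{(b-a)h}{q}\right)\\
&= \frac1q\sum_{\substack{1\le a,b\le q\\ a\perp q,\ b\perp q}} S\left(\frac aq\right)\overline{S\left(\frac bq\right)}\sum_{1\le h\le q} e\left(\frac{(b-a)h}{q}\right)\\
&= \sum_{\substack{1\le a\le q\\ a\perp q}}\left|S\left(\frac aq\right)\right|^2.
\end{aligned}$$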


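THEOREM 4.2.2. [Mon68] Let $Z(q,h)$ and $Z$ be defined as before, and let $x\ge 1$. For each prime $p\le x$ let $H(p)$ be the union of $\omega(p)$ distinct residue classes modulo $p$. Let $a_n$ be complex numbers that satisfy $a_n = 0$ if $n\in H(p)$ for some $p\le x$. Then for each $q\le x$,

$$\mu^2(q)\,|Z|^2\prod_{p\mid q}\frac{\omega(p)}{p-\omega(p)} \le q\sum_{1\le h\le q}\Big|\sum_{d\mid q}\frac{\mu(d)}{d}\,Z\left(\frac qd,h\right)\Big|^2.$$

Proof : This is clearly true if $\mu(q) = 0$, so we may assume $q\le x$ is a fixed squarefree integer. If $d\mid q$, we define

$$K(d) = \left\{\, h \mid 1\le h\le q, \text{ and if } p\mid d \text{ then } h\in H(p), \text{ while if } p\mid \tfrac qd \text{ then } h\notin H(p) \,\right\}.$$

Defining $h_1\equiv h_2$ if there is a $d$ such that $\{h_1,h_2\}\subseteq K(d)$ yields an equivalence relation. Thus $K(d)$, when $d$ goes through all the divisors of $q$, gives a partition of $\{1,\dots,q\}$. Indeed, for each $h$ we can write $q$ uniquely as

$$q = \prod_{\substack{p\mid q\\ h\in H(p)}} p\ \prod_{\substack{p\mid q\\ h\notin H(p)}} p.$$

Thus we can write any sum of the form $\sum_{1\le h\le q} f(h)$ as $\sum_{d\mid q}\sum_{h\in K(d)} f(h)$. Fix $\delta$ with $\delta\mid q$. Observe that

$$\Big|\sum_{d\mid q}\mu\left(\frac qd\right)d\sum_{h\in K(\delta)} Z(d,h)\Big|^2 = \Big|\sum_{d\mid q}\frac{\mu(d)\,q}{d}\sum_{h\in K(\delta)} Z\left(\frac qd,h\right)\Big|^2 \tag{4.32}$$

$$= \Big|\sum_{h\in K(\delta)}\sum_{d\mid q}\frac{\mu(d)\,q}{d}\,Z\left(\frac qd,h\right)\Big|^2 \tag{4.33}$$

by changing the variable of summation from $d$ to $\frac qd$.

Using the Cauchy-Schwarz inequality

$$\Big|\sum_{h\in K(\delta)}\sum_{d\mid q}\frac{\mu(d)\,q}{d}\,Z\left(\frac qd,h\right)\Big|^2 \le \Big(\sum_{h\in K(\delta)} 1\Big)\sum_{h\in K(\delta)}\Big|\sum_{d\mid q}\frac{\mu(d)\,q}{d}\,Z\left(\frac qd,h\right)\Big|^2. \tag{4.34}$$

Now consider

$$\Big|\sum_{d\mid q}\mu\left(\frac qd\right)d\sum_{h\in K(\delta)} Z(d,h)\Big|^2.$$

Supposing $\gcd(\delta,d)>1$, we can select a prime $p$ such that $p\mid\gcd(\delta,d)$. Then $Z(d,h)$ is a sum of $a_n$ with $n\equiv h \bmod d$; since $p\mid d$ we also have $n\equiv h \bmod p$. But $p\mid\delta$ and $h\in K(\delta)$ imply that $n\in H(p)$ by the definition of $K(\delta)$. Thus by hypothesis $a_n = 0$ whenever $n\equiv h \bmod d$ and $h\in K(\delta)$. Hence the inner sum of

$$\Big|\sum_{d\mid q}\mu\left(\frac qd\right)d\sum_{h\in K(\delta)} Z(d,h)\Big|^2$$

vanishes when $\gcd(\delta,d)>1$. Thus we obtain

$$\sum_{d\mid q}\mu\left(\frac qd\right)d\sum_{h\in K(\delta)} Z(d,h) = \sum_{d\mid\frac q\delta}\mu\left(\frac qd\right)d\sum_{h\in K(\delta)} Z(d,h).$$

Fix $d$ with $d\mid(q/\delta)$. If $k\in H(p)$ for some prime $p\mid d$, then $Z(d,k) = 0$, and hence

$$\sum_{h\in K(\delta)} Z(d,h) = \sum_{\substack{1\le k\le d\\ \forall p\mid d:\ k\notin H(p)}} Z(d,k)\,\left|\{\, h \mid h\in K(\delta),\ h\equiv k \bmod d \,\}\right|.$$

Let

$$S(\delta,d,k) = \left|\{\, h \mid h\in K(\delta),\ h\equiv k \bmod d \,\}\right|$$

for $k$ such that $k\notin H(p)$ for all primes $p$ that divide $d$. By the Chinese Remainder Theorem $h\equiv k \bmod d$ is equivalent to $h\equiv k \bmod p$ for all primes $p$ dividing $d$. Also $h\in K(\delta)$ implies that $h\in H(p)$ for all primes $p$ dividing $\delta$, and that $h\notin H(p)$ for all primes $p$ dividing $q/\delta$. Summarizing, we have shown that $h\in K(\delta)$ and $h\equiv k \bmod d$ iff the following are satisfied:

1. $p\mid d \Rightarrow h\equiv k \bmod p,\ h\notin H(p)$;
2. $p\mid\delta \Rightarrow h\in H(p)$; and
3. $p\mid q/(d\delta) \Rightarrow h\notin H(p)$.

Since we have $k$ such that $k\notin H(p)$ for all primes $p$ dividing $d$, the second condition in (1) is satisfied whenever the first is satisfied. If $p\mid d$, then there is exactly one solution of (1) modulo $p$; if $p\mid\delta$, there are $\omega(p)$ solutions of (2) modulo $p$; while if $p\mid q/(d\delta)$, then there are $p-\omega(p)$ solutions to (3) modulo $p$. Applying the Chinese Remainder Theorem, we have

$$S(\delta,d,k) = \left|\{\, h \mid 1\le h\le q,\ h \text{ satisfies conditions (1), (2) and (3)} \,\}\right| = \prod_{p\mid\delta}\omega(p)\prod_{p\mid q/(d\delta)}(p-\omega(p)).$$

This number is independent of $k$, and so

$$\begin{aligned}
\sum_{h\in K(\delta)} Z(d,h)
&= \sum_{\substack{1\le k\le d\\ \forall p\mid d:\ k\notin H(p)}} Z(d,k)\prod_{p\mid\delta}\omega(p)\prod_{p\mid q/(d\delta)}(p-\omega(p))\\
&= \sum_{1\le k\le d} Z(d,k)\prod_{p\mid\delta}\omega(p)\prod_{p\mid q/(d\delta)}(p-\omega(p))\\
&= Z\prod_{p\mid\delta}\omega(p)\prod_{p\mid q/(d\delta)}(p-\omega(p)).
\end{aligned}$$

From this we get

$$\sum_{d\mid q}\mu\left(\frac qd\right)d\sum_{h\in K(\delta)} Z(d,h) = \sum_{d\mid q/\delta}\mu\left(\frac qd\right)d\,Z\prod_{p\mid\delta}\omega(p)\prod_{p\mid q/(d\delta)}(p-\omega(p)) \tag{4.35}$$

$$= \mu(q)\,Z\prod_{p\mid\delta}\omega(p)\prod_{p\mid q/\delta}(p-\omega(p))\sum_{d\mid q/\delta}\mu(d)\,d\prod_{p\mid d}(p-\omega(p))^{-1} \tag{4.36}$$

$$= \mu(q)\,Z\prod_{p\mid\delta}\omega(p)\prod_{p\mid q/\delta}(p-\omega(p))\prod_{p\mid q/\delta}\left(1-\frac{p}{p-\omega(p)}\right) \tag{4.37}$$

$$= \mu(\delta)\,Z\prod_{p\mid\delta}\omega(p)\prod_{p\mid q/\delta}\omega(p) \tag{4.38}$$

$$= \mu(\delta)\,Z\prod_{p\mid q}\omega(p). \tag{4.39}$$

Now

$$\sum_{h\in K(\delta)} 1 = S(\delta,1,1) = \prod_{p\mid\delta}\omega(p)\prod_{p\mid q/\delta}(p-\omega(p)).$$

Dividing (4.32) by the above factor and using (4.34) together with (4.35) to (4.39) we find that

$$|Z|^2\prod_{p\mid q}\omega(p)^2\prod_{p\mid\delta}\omega(p)^{-1}\prod_{p\mid q/\delta}(p-\omega(p))^{-1} \le \sum_{h\in K(\delta)}\Big|\sum_{d\mid q}\frac{\mu(d)\,q}{d}\,Z\left(\frac qd,h\right)\Big|^2.$$

Summing over all $\delta\mid q$, the right hand side yields, since the $K(\delta)$ partition $\{1,\dots,q\}$,

$$\sum_{1\le h\le q}\Big|\sum_{d\mid q}\frac{\mu(d)\,q}{d}\,Z\left(\frac qd,h\right)\Big|^2.$$

Summing the left hand side yields

$$\begin{aligned}
|Z|^2\prod_{p\mid q}\omega(p)^2\sum_{\delta\mid q}\ \prod_{p\mid\delta}\omega(p)^{-1}\prod_{p\mid q/\delta}(p-\omega(p))^{-1}
&= |Z|^2\prod_{p\mid q}\omega(p)\sum_{\delta\mid q}\ \prod_{p\mid q/\delta}\omega(p)\prod_{p\mid q/\delta}(p-\omega(p))^{-1}\\
&= |Z|^2\prod_{p\mid q}\omega(p)\prod_{p\mid q}\left(1+\frac{\omega(p)}{p-\omega(p)}\right)\\
&= q\,|Z|^2\prod_{p\mid q}\frac{\omega(p)}{p-\omega(p)}.
\end{aligned}$$

Dividing both sides by $q$ gives the theorem.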

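THEOREM 4.2.3. [MV73] Let $\mathcal{N}$ be a set of $Z$ integers in an interval $[M+1, M+N]$. For each prime $p$ let $\omega(p)$ denote the number of residue classes modulo $p$ that contain no element of $\mathcal{N}$. Then $Z\le L^{-1}$, where

$$L = \sum_{q\le z}\left(N+\tfrac32 qz\right)^{-1}\mu^2(q)\prod_{p\mid q}\frac{\omega(p)}{p-\omega(p)}$$

and $z$ is an arbitrary positive real number.

Proof : Let $x_r$ be the numbers $\frac aq$ where $1\le a\le q$, $a\perp q$ and $q\le z$. If $\frac{a'}{q'}\ne\frac aq$, then

$$\left\|\frac aq-\frac{a'}{q'}\right\| \ge \frac{1}{qq'} \ge \frac{1}{qz}.$$

By Theorem (4.1.7) we have

$$\sum_{q\le z}\left(N+\tfrac32 qz\right)^{-1}\sum_{\substack{1\le a\le q\\ a\perp q}}\left|S\left(\frac aq\right)\right|^2 \le \sum_{M+1\le n\le M+N}|a_n|^2.$$

Set $a_n = 1$ or $0$ according as $n\in\mathcal{N}$ or $n\notin\mathcal{N}$. Then by Theorem (4.2.2) and Lemma (4.2.1) we get

$$Z^2\mu^2(q)\prod_{p\mid q}\frac{\omega(p)}{p-\omega(p)} \le \sum_{\substack{1\le a\le q\\ a\perp q}}\left|S\left(\frac aq\right)\right|^2.$$

With this choice of $a_n$ the right hand side of the large sieve inequality, $\sum|a_n|^2$, equals $Z$. Combining the two inequalities gives $Z^2L\le Z$, and this proves the theorem.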
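To see what Theorem 4.2.3 gives in practice, the bound $L^{-1}$ can be evaluated directly. The sketch below is an illustration under assumptions made here: the set consists of the integers in $[1,N]$ with no prime factor $\le z$, and $\omega(p) = 1$ is used for every prime $p\le z$. That is only a lower bound for the true number of missing classes, which can only make the computed bound weaker, never invalid. The function names and the parameters $N = 10000$, $z = \sqrt{2N/3}$ are illustrative choices.

```python
import math


def primes_upto(n: int):
    """Primes <= n (n >= 2) by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]


def large_sieve_bound(N: int, z: float, omega) -> float:
    """1/L with L = sum_{q<=z} mu^2(q) prod_{p|q} omega(p)/(p-omega(p)) / (N + 1.5*q*z)."""
    primes = primes_upto(int(z))
    L = 0.0
    for q in range(1, int(z) + 1):
        weight, m, squarefree = 1.0, q, True
        for p in primes:
            if p > m:
                break
            if m % p == 0:
                m //= p
                if m % p == 0:        # p^2 divides q, so mu(q) = 0
                    squarefree = False
                    break
                weight *= omega(p) / (p - omega(p))
        if squarefree:
            L += weight / (N + 1.5 * q * z)
    return 1.0 / L


def unsifted_count(N: int, z: float) -> int:
    """Number of n in [1, N] with no prime factor <= z."""
    small = primes_upto(int(z))
    return sum(1 for n in range(1, N + 1) if all(n % p for p in small))


N = 10_000
z = math.sqrt(2 * N / 3)          # the choice used in the text for Brun-Titchmarsh
Z = unsifted_count(N, z)
bound = large_sieve_bound(N, z, lambda p: 1)
print(Z, bound)
```

The bound overshoots the true count by a factor of roughly 2 for this kind of set, which matches the factor 2 in the Brun-Titchmarsh estimates.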

4.3. The Brun-Titchmarsh Theorem revisited

The large sieve can be used to strengthen the Brun-Titchmarsh theorem (Theorem 3.2.5). We require the following lemma.


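LEMMA 4.3.1. Let $u$ and $v$ be any positive real numbers. Then

$$\sum_{\substack{q\le u\\ q\perp k}}(1+vq)^{-1}\frac{\mu^2(q)}{\varphi(q)} \ge \frac{\varphi(k)}{k}\sum_{q\le u}(1+vq)^{-1}\frac{\mu^2(q)}{\varphi(q)}.$$

Proof : Note that

$$\frac{k}{\varphi(k)} = \sum_{r\mid k}\frac{\mu^2(r)}{\varphi(r)}.$$

Multiplying the sum on the left by this we get

$$\sum_{\substack{q\le u\\ q\perp k}}(1+vq)^{-1}\sum_{r\mid k}\frac{\mu^2(qr)}{\varphi(qr)},$$

which includes all the terms of the sum on the right.

THEOREM 4.3.2 ([MV73]). Let $x$ and $y$ be positive real numbers, and let $k$ and $l$ be relatively prime positive integers. Then

$$\pi(x+y;k,l)-\pi(x;k,l) < \frac{2y}{\varphi(k)\left(\frac56+\log\frac yk\right)}.$$

Proof : We take $M = \left\lfloor\frac{x-l}{k}\right\rfloor$ and $N = \left\lfloor\frac{x+y-l}{k}\right\rfloor - M$. Let $\mathcal{N}$ be the set of those integers $n$ for which $M<n\le M+N$, $kn+l$ is prime, and $kn+l>z$. Then $\omega(p)\ge 1$ whenever $p\le z$ and $p\nmid k$. Thus by Theorem (4.2.3) we have

$$\pi(x+y;k,l)-\pi(x;k,l) \le L^{-1}+\pi(z),$$

where

$$L = \sum_{\substack{q\le z\\ q\perp k}}\left(N+\tfrac32 qz\right)^{-1}\frac{\mu^2(q)}{\varphi(q)}.$$

Taking $z = \sqrt{\tfrac23 N}$, so that $N+\tfrac32 qz = N(1+qz^{-1})$, and using Lemma (4.3.1), we have

$$\pi(x+y;k,l)-\pi(x;k,l) < \frac{kN}{\varphi(k)\,J}+\sqrt N,$$

where

$$J = \sum_{q\le z}\left(1+qz^{-1}\right)^{-1}\frac{\mu^2(q)}{\varphi(q)}.$$

From [War27] we have

$$\sum_{q\le v}\frac{\mu^2(q)}{\varphi(q)} = \log v+\gamma+\sum_p\frac{\log p}{p(p-1)}+o(1)$$

as $v\to\infty$. By partial summation we find that

$$J = \log z+\gamma+\sum_p\frac{\log p}{p(p-1)}-\log 2+o(1)$$

as $z\to\infty$. Setting $z = \sqrt{\tfrac23 N}$ we get

$$J = \frac12\log N+\gamma+\sum_p\frac{\log p}{p(p-1)}-\frac12\log\frac32-\log 2+o(1)$$

as $N\to\infty$. Since $\gamma>0.577$ and

$$\sum_p\frac{\log p}{p(p-1)} > 0.737,$$

filling in $\log 2<0.694$ and $\frac12\log\frac32<0.203$, we finally obtain

$$J > \frac12\log N+0.417$$

for large enough $N$.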

4.4. Bombieri’s Theorem

The large sieve inequalities imply that if a sequence of integers is distributed rather densely in an interval, then it cannot be very unevenly distributed modulo the primes. In this section we will prove an important theorem that quantifies the above statement for the primes themselves. Define $\psi(x) = \sum_{n\le x}\Lambda(n)$, where $\Lambda(n)$ is von Mangoldt's function

$$\Lambda(n) = \begin{cases}\log p & \text{if } n = p^k,\\ 0 & \text{otherwise.}\end{cases}$$

Also define

$$\psi(x;q,a) = \sum_{\substack{n\le x\\ n\equiv a \bmod q}}\Lambda(n).$$

Let $E(x;q,a) = \psi(x;q,a)-\frac{x}{\varphi(q)}$ for $a\perp q$, let $E(x;q) = \max_{a\perp q}|E(x;q,a)|$, and $E^*(x,q) = \max_{y\le x}E(y;q)$. We will prove Bombieri's Theorem in the following form:

THEOREM 4.4.1 ([Dav80]). Let $A>0$ be fixed, and suppose $x^{\frac12}(\log x)^{-A}\le Q\le x^{\frac12}$. Then

$$\sum_{q\le Q}E^*(x,q) \ll x^{\frac12}Q(\log x)^5.$$

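Proof : If $\chi$ is a multiplicative character modulo $q$, define $\psi(y,\chi) = \sum_{n\le y}\chi(n)\Lambda(n)$. We begin with the identity

$$\psi(y;q,a) = \frac{1}{\varphi(q)}\sum_\chi\overline{\chi}(a)\,\psi(y,\chi),$$

where the sum is over all the characters modulo $q$. Let $\chi_0$ be the principal character; we then define

$$\psi'(y,\chi) = \begin{cases}\psi(y,\chi) & \text{if } \chi\ne\chi_0,\\ \psi(y,\chi_0)-y & \text{if } \chi = \chi_0.\end{cases}$$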


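Then we have

$$\psi(y;q,a)-\frac{y}{\varphi(q)} = \frac{1}{\varphi(q)}\sum_\chi\overline{\chi}(a)\,\psi'(y,\chi),$$

and so

$$|E(y;q,a)| \le \frac{1}{\varphi(q)}\sum_\chi|\psi'(y,\chi)|$$

since $|\chi(a)|\le 1$. This estimate is independent of $a$, so that

$$E(y;q) \le \frac{1}{\varphi(q)}\sum_\chi|\psi'(y,\chi)|.$$

If $\chi \bmod q$ is a character (possibly imprimitive) that is induced by $\chi_1 \bmod q_1$, where $\chi_1$ is primitive, then $\psi'(y,\chi)$ and $\psi'(y,\chi_1)$ do not differ very much:

$$\psi'(y,\chi_1)-\psi'(y,\chi) = \sum_{\substack{p^k\le y\\ p\mid q}}\chi_1(p^k)\log p \ll \sum_{p\mid q}\frac{\log y}{\log p}\log p \ll (\log y)\sum_{p\mid q}\log p \ll (\log qy)^2.$$

Hence we can replace the sum over all characters by one over the primitive characters only. Thus

$$E(x,q) \ll (\log qx)^2+\frac{1}{\varphi(q)}\sum_\chi|\psi'(x,\chi_1)|,$$

and

$$E^*(x,q) \ll (\log qx)^2+\frac{1}{\varphi(q)}\sum_\chi\max_{y\le x}|\psi'(y,\chi_1)|.$$

We can combine the contributions from each of the primitive characters. Since a primitive character modulo $q$ induces characters to moduli that are multiples of $q$, we have

$$\sum_{q\le Q}E^*(x,q) \ll Q(\log Qx)^2+\sum_{q\le Q}\ {\sum_\chi}^{*}\max_{y\le x}|\psi'(y,\chi)|\sum_{k\le Q/q}\frac{1}{\varphi(kq)},$$

where $\sum^*$ means the sum is over primitive characters modulo $q$. Since $\varphi(kq)\ge\varphi(k)\varphi(q)$ we have

$$\sum_{k\le z}\frac{1}{\varphi(kq)} \le \frac{1}{\varphi(q)}\sum_{k\le z}\frac{1}{\varphi(k)}.$$

Now

$$\sum_{k\le z}\frac{1}{\varphi(k)} \le \prod_{p\le z}\left(1+\frac{1}{p-1}+\frac{1}{p(p-1)}+\frac{1}{p^2(p-1)}+\cdots\right).$$

Note that

$$\frac{1}{p-1}\cdot\frac{1}{1-\frac1p} = \frac{1}{p-1}\left(1+\frac1p+\frac{1}{p^2}+\cdots\right).$$

Thus

$$1+\frac{1}{p-1}+\frac{1}{p(p-1)}+\frac{1}{p^2(p-1)}+\cdots = 1+\frac{1}{p-1}\cdot\frac{1}{1-\frac1p} = \left(1+\frac{1}{p(p-1)}\right)\frac{1}{1-\frac1p}.$$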


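Using this we have

$$\sum_{k\le z}\frac{1}{\varphi(k)} \le \prod_{p\le z}\left(1-\frac1p\right)^{-1}\left(1+\frac{1}{p(p-1)}\right) \ll \log z,$$

and so

$$\sum_{q\le Q}\ {\sum_\chi}^{*}\max_{y\le x}|\psi'(y,\chi)|\sum_{k\le Q/q}\frac{1}{\varphi(kq)} \ll \log x\sum_{q\le Q}\frac{1}{\varphi(q)}\ {\sum_\chi}^{*}\max_{y\le x}|\psi'(y,\chi)|.$$

Thus it suffices to show that

$$\sum_{q\le Q}\frac{1}{\varphi(q)}\ {\sum_\chi}^{*}\max_{y\le x}|\psi'(y,\chi)| \ll x^{\frac12}Q(\log x)^4 \tag{4.40}$$

for $x^{\frac12}(\log x)^{-A}\le Q\le x^{\frac12}$.

Using the large sieve we will show that

$$\sum_{q\le Q}\frac{q}{\varphi(q)}\ {\sum_\chi}^{*}\max_{y\le x}|\psi(y,\chi)| \ll \left(x+x^{\frac56}Q+x^{\frac12}Q^2\right)(\log Qx)^4 \tag{4.41}$$

for all $x\ge 1$ and $Q\ge 1$.

Now observe that

$$\sum_{U<q\le 2U}\frac{q}{\varphi(q)}\ {\sum_\chi}^{*}\max_{y\le x}|\psi(y,\chi)| \ge U\sum_{U<q\le 2U}\frac{1}{\varphi(q)}\ {\sum_\chi}^{*}\max_{y\le x}|\psi(y,\chi)|,$$

and so

$$\sum_{U<q\le 2U}\frac{1}{\varphi(q)}\ {\sum_\chi}^{*}\max_{y\le x}|\psi(y,\chi)| \ll \left(\frac xU+x^{\frac56}+x^{\frac12}U\right)(\log Ux)^4$$

by (4.41). Summing over $U = 2^k$ for $k\le\log Q$, we have

$$\sum_{Q_1<q\le Q}\frac{1}{\varphi(q)}\ {\sum_\chi}^{*}\max_{y\le x}|\psi(y,\chi)| \ll \left(\frac{x}{Q_1}+x^{\frac56}\log Q+x^{\frac12}Q\right)(\log Qx)^4.$$

We have used the fact that for $\chi = \chi_0$ we have $|\psi'(y,\chi_0)|\le|\psi(y,\chi_0)|$, and $\psi'(y,\chi) = \psi(y,\chi)$ if $\chi\ne\chi_0$. With $Q_1 = \log^A x$ this gives the contribution to (4.40) from the moduli $q>Q_1$. By the Siegel-Walfisz theorem, if $\chi$ is a primitive character modulo $q$, $q\le(\log x)^A$, and $y\le x$, then

$$|\psi'(y,\chi)| \ll x(\log x)^{-2A},$$

which handles the remaining moduli. Thus the theorem follows from (4.41).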

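We will now sketch the proof of (4.41) (for details see [Dav80]). Using the large sieve we can derive the following:

$$\sum_{q\le Q}\frac{q}{\varphi(q)}\ {\sum_\chi}^{*}\max_u\Big|\sum_{1\le m\le M}\ \sum_{\substack{1\le n\le N\\ mn\le u}} a_mb_n\chi(mn)\Big| \tag{4.42}$$

$$\ll (M+Q^2)^{\frac12}(N+Q^2)^{\frac12}\Big(\sum_{1\le m\le M}|a_m|^2\Big)^{\frac12}\Big(\sum_{1\le n\le N}|b_n|^2\Big)^{\frac12}\log 2MN. \tag{4.43}$$

If $Q^2>x$ then (4.41) follows from the above with $M = 1$, $a_1 = 1$, $b_n = \Lambda(n)$, $N = x$. Thus we may assume $Q^2\le x$. It turns out that we can write

$$\psi(y,\chi) = S_1+S_2+S_3+S_4,$$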


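where

$$S_1 = \sum_{n\le U}\Lambda(n)\chi(n) \ll U,$$

$$S_2 = -\sum_{t\le UV}\Big(\sum_{\substack{t=md\\ m\le U,\ d\le V}}\mu(d)\Lambda(m)\Big)\sum_{r\le y/t}\chi(rt),$$

$$S_3 \ll (\log y)\sum_{d\le V}\max_w\Big|\sum_{w\le h\le y/d}\chi(h)\Big|,$$

and

$$S_4 = \sum_{U<m\le y/V}\Lambda(m)\sum_{V<k\le y/m}\Big(\sum_{\substack{d\mid k\\ d\le V}}\mu(d)\Big)\chi(mk).$$

Using (4.42) and the Pólya-Vinogradov inequality (see [Dav80]), we can show that

$$\sum_{q\le Q}\frac{q}{\varphi(q)}\ {\sum_\chi}^{*}\max_{y\le x}|S_4| \ll \left(Q^2x^{\frac12}+QxU^{-\frac12}+QxV^{-\frac12}+x\right)(\log x)^4.$$

The sum $S_2$ can be split into $S_2 = \sum_{t\le UV} = \sum_{t\le U}+\sum_{U<t\le UV} = S_2'+S_2''$, and it can be shown that

$$\sum_{q\le Q}\frac{q}{\varphi(q)}\ {\sum_\chi}^{*}\max_{y\le x}|S_2''| \ll \left(Q^2x^{\frac12}+QxU^{-\frac12}+Qx^{\frac12}U^{\frac12}V^{\frac12}+x\right)(\log x)^2$$

and

$$\sum_{q\le Q}\frac{q}{\varphi(q)}\ {\sum_\chi}^{*}\max_{y\le x}|S_2'| \ll \left(Q^{\frac52}U+x\right)(\log Ux)^2.$$

Also

$$\sum_{q\le Q}\frac{q}{\varphi(q)}\ {\sum_\chi}^{*}\max_{y\le x}|S_3| \ll \left(Q^{\frac52}V+x\right)(\log Vx)^2.$$

On combining these estimates and taking $U = V = x^{\frac23}Q^{-1}$ for $x^{\frac13}\le Q\le x^{\frac12}$, we obtain (4.41) in this range. For $Q\le x^{\frac13}$, we can take $U = x^{\frac13}$ to complete the proof of (4.41).

The Bombieri result can be formulated as follows:

THEOREM 4.4.2. Let $E(x;q,a) = \pi(x;q,a)-\frac{\operatorname{li}x}{\varphi(q)}$ for $a\perp q$, $E(x;q) = \max_{a\perp q}|E(x;q,a)|$, and $E^*(x,q) = \max_{y\le x}E(y,q)$. Then for all $A>0$ there exists $B>0$ such that

$$\sum_{q\le x^{\frac12}(\log x)^{-B}}E^*(x,q) \ll \frac{x}{\log^{1+A}x}.$$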
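The quantities in Bombieri's theorem are easy to tabulate for small $x$, which helps make the definitions concrete. The sketch below is an illustration under assumptions made here: it uses the $\psi$ form of the error, $E(x;q) = \max_{a\perp q}|\psi(x;q,a)-x/\varphi(q)|$, at a single $x$ (not the maximum over $y\le x$ that defines $E^*$), and the function names are hypothetical. It says nothing about the asymptotic bound itself.

```python
import math


def von_mangoldt_table(x: int):
    """lam[n] = Lambda(n) for 0 <= n <= x (log p on prime powers p^k, else 0)."""
    lam = [0.0] * (x + 1)
    is_prime = [True] * (x + 1)
    for p in range(2, x + 1):
        if is_prime[p]:
            for m in range(p * p, x + 1, p):
                is_prime[m] = False
            pk = p
            while pk <= x:
                lam[pk] = math.log(p)
                pk *= p
    return lam


def psi_progression(lam, x: int, q: int, a: int) -> float:
    """psi(x; q, a) = sum of Lambda(n) over n <= x with n = a mod q."""
    return sum(lam[n] for n in range(1, x + 1) if n % q == a % q)


def error_max(lam, x: int, q: int) -> float:
    """E(x; q) = max over a coprime to q of |psi(x; q, a) - x / phi(q)|."""
    units = [a for a in range(1, q + 1) if math.gcd(a, q) == 1]
    phi = len(units)
    return max(abs(psi_progression(lam, x, q, a) - x / phi) for a in units)


x = 1000
lam = von_mangoldt_table(x)
psi_x = sum(lam)
print(round(psi_x, 3))
for q in (3, 4, 5, 7, 12):
    print(q, round(error_max(lam, x, q), 3))
```

Summing such errors over $q\le Q$ gives the left-hand side that the theorem controls on average, even though no comparably strong bound is known for each individual modulus.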

4.5. Prime and Squarefree pairs

We can pose the following variation of the twin prime problem: "Are there infinitely many primes p such that p+2 is squarefree?" The answer to the question is yes, and this is an almost immediate consequence of the powerful result we have proved.

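THEOREM 4.5.1. Let

$$\Xi(x) = \left|\{\, p\le x \mid \mu^2(p+2) = 1 \,\}\right|.$$

Then

$$\Xi(x) = \operatorname{Li}(x)\prod_{p>2}\left(1-\frac{1}{p(p-1)}\right)+O\left(\frac{\ln x}{\sqrt x}\right)+O\left(\frac{x}{\ln^{1+U}(x)}\right)+O\left(x^{\frac34}\ln^C(x)\right)$$

for some constants $U>0$ and $C>0$.

Proof : Let $\mathcal{A} = \{\, p+2 \mid p\le x \,\}$. We have

$$\Xi(x) = \sum_{d^2\le x}\mu(d)\sum_{\substack{n\in\mathcal{A}\\ d^2\mid n}}1.$$

Let $\mathcal{A}_{d^2} = \{\, p+2 \mid p\le x,\ p+2\equiv 0 \bmod d^2 \,\}$. Thus by definition $|\mathcal{A}_{d^2}| = \pi(x;d^2,-2)$. Define

$$R_{d^2} = \pi(x;d^2,-2)-\frac{\operatorname{Li}(x)}{\varphi(d^2)}.$$

Then we have

$$\Xi(x) = \operatorname{Li}(x)\sum_{d^2\le x}\frac{\mu(d)}{\varphi(d^2)}+\sum_{d^2\le x}\mu(d)R_{d^2} = \Sigma_1+\Sigma_2,$$

$$\Sigma_1 = \operatorname{Li}(x)\Big(\sum_d\frac{\mu(d)}{\varphi(d^2)}-\sum_{d>\sqrt x}\frac{\mu(d)}{\varphi(d^2)}\Big) = \operatorname{Li}(x)\Big(\prod_{p>2}\left(1-\frac{1}{p(p-1)}\right)-\sum_{d>\sqrt x}\frac{\mu(d)}{\varphi(d^2)}\Big),$$

since $|\mathcal{A}_{d^2}|\le 1$ for even $d$ (only $p = 2$ gives an even $p+2$), which allows omitting the prime 2.

The second sum can be upper-bounded by:

$$\sum_{d>\sqrt x}\frac{1}{\varphi(d^2)} \le \sum_{d>\sqrt x}\frac{2\ln d}{d^2} = O\left(\frac{\ln x}{\sqrt x}\right).$$

The remainder term is bounded by:

$$|\Sigma_2| \le \sum_{d^2\le x}|R_{d^2}| = \sum_{d^2\le\frac{\sqrt x}{\ln^Cx}}|R_{d^2}|+\sum_{\frac{\sqrt x}{\ln^Cx}<d^2<x}|R_{d^2}| = O\left(\frac{x}{\ln^{1+U}x}\right)+\sum_{\frac{\sqrt x}{\ln^Cx}<d^2<x}|R_{d^2}|,$$

using Bombieri's result to bound the first sum.

For the second sum, since $|R_{d^2}|\le\left\lfloor\frac{x}{d^2}\right\rfloor\le\frac{x}{d^2}$, we have

$$\sum_{\frac{\sqrt x}{\ln^Cx}<d^2<x}|R_{d^2}| \le x\sum_{\frac{\sqrt x}{\ln^Cx}<d^2}\frac{1}{d^2} = O\left(\frac{x\ln^Cx}{x^{\frac14}}\right) = O\left(x^{\frac34}\ln^Cx\right).$$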


The theorem follows from the estimates for Σ1 and Σ2.
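The main term of Theorem 4.5.1 predicts that a fixed proportion $\prod_{p>2}\left(1-\frac{1}{p(p-1)}\right)\approx 0.748$ of the primes $p$ have $p+2$ squarefree. The sketch below is a rough numerical illustration, with assumptions made here: the count is compared with $\pi(x)$ rather than $\operatorname{Li}(x)$ to avoid evaluating the logarithmic integral, the product is truncated at the primes below 1000, and $x = 10^5$ and the function names are illustrative choices.

```python
def primes_upto(n: int):
    """Primes <= n (n >= 2) by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]


def is_squarefree(m: int) -> bool:
    """True if no square d^2 > 1 divides m."""
    d = 2
    while d * d <= m:
        if m % (d * d) == 0:
            return False
        d += 1
    return True


def squarefree_shift_count(x: int):
    """Return (number of primes p <= x with p + 2 squarefree, pi(x))."""
    primes = primes_upto(x)
    return sum(1 for p in primes if is_squarefree(p + 2)), len(primes)


# Truncated product over odd primes; the neglected tail changes it by about 1e-4.
constant = 1.0
for p in primes_upto(1000):
    if p > 2:
        constant *= 1 - 1 / (p * (p - 1))

count, pi_x = squarefree_shift_count(100_000)
print(count, pi_x, count / pi_x, constant)
```

The observed ratio and the constant agree to within the statistical noise expected at this size of $x$; the error terms in the theorem are what make the agreement rigorous as $x\to\infty$.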

Let $\Psi(x) = \sum_{n\le x}\Lambda'(n)$, where

$$\Lambda'(n) = \begin{cases}\log p & \text{if } n = p^k \text{ and } \mu^2(n+2) = 1,\\ 0 & \text{otherwise.}\end{cases}$$

Let

$$\Psi(x;q,a) = \sum_{\substack{n\le x\\ n\equiv a \bmod q}}\Lambda'(n),$$

and further let $E(x;q,a) = \Psi(x;q,a)-\frac{Cx}{\varphi(q)}$, $E(x;q) = \max_{a\perp q}|E(x;q,a)|$, and $E^*(x,q) = \max_{y\le x}E(y,q)$, where $C = \prod_p\left(1-\frac{1}{p(p-1)}\right)$. Using partial summation and the above theorem we can show that for any $U>0$,

$$\Psi(x) = Cx+O\left(\frac{x}{\log^{1+U}x}\right)$$

and

$$\Psi(x;q,a) = \frac{Cx}{\varphi(q)}+O\left(\frac{x}{\log^{1+U}x}\right), \quad\text{for } a\perp q.$$

THEOREM 4.5.2. Let $A>0$ be fixed. Then

$$\sum_{(\log x)^A<q\le Q}E^*(x,q) \ll x^{\frac12}Q(\log x)^5,$$

provided $x^{\frac12}(\log x)^{-A}\le Q\le x^{\frac12}$.

The proof is a careful verification that the proof of the Bombieri Theorem goes through except for $q<(\log x)^A$. But in this range the maximum error possible is $O\left(\frac{x}{\log^{1+U}x}\right)$, so selecting $U$ large enough we have:

THEOREM 4.5.3. Let $A>0$ be fixed. Then

$$\sum_{q\le Q}E^*(x,q) \ll x^{\frac12}Q(\log x)^5,$$

provided $x^{\frac12}(\log x)^{-A}\le Q\le x^{\frac12}$.

There is a version of Brun's sieve that makes use of the result on the average behaviour of error terms to yield a better estimate. In particular we have ([HR74] Theorem 2.10 p. 65)

THEOREM 4.5.4. Let the following conditions hold on the sequence $\mathcal{A}$:

1. $1\le\dfrac{1}{1-\frac{\omega(p)}{p}}\le A_1$;
2. $\displaystyle\sum_{w\le p\le z}\frac{\omega(p)\log p}{p}\le\kappa\log\frac zw+A_2$, if $2\le w\le z$;
3. There is a constant $A_0'$ such that $|R_d|\le L\left(\dfrac{x\log x}{d}+1\right)A_0'^{\,\nu(d)}$;
4. For every positive constant $U\ge 1$ there is a $C_0$ such that $\displaystyle\sum_{d<x^\alpha\log^{-C_0}x}\mu^2(d)|R_d| = O\left(\frac{x}{\log^{\kappa+U}x}\right)$.


Let b be a positive integer, let λ be a real number satisfying λe1+λ < 1, let

c1 =

A2 21+A1κ+ A1A2 log2,

and let u = logx logz. Then

S(A;P,z)≥xW(z)1 2 λbeλ2 1 λe1+λ2

exp(2b+2) c1 λlogz +OLz αu+2b 1+ 2.01 (e2λ/κ 1) uC0+1 logC0+κ+1 z+ O(u κlog U X),

where the $O$-constants may depend on $A_0', A_1, A_2, \kappa, \alpha$ and $U$, but not on $\lambda$ or $b$. Using this theorem with $\mathcal{A} = \{\, p+2 \mid p\le x,\ \mu^2(p+2) = 1 \,\}$, and taking the sifting primes to be $\mathcal{P} = \{\, p : p>2 \,\}$, we find that the lower bound is positive (and diverges) for $u<9$. Following the same analysis as in [HR74] (p. 67), we can also take $u<8$ with a slightly better treatment of the principal and secondary terms involved in the proof of the above theorem. This allows us to conclude that the lower bound diverges even with $z = x^{\frac17}$, and thus we have:

THEOREM 4.5.5. There are infinitely many primes $p$ such that $p+2$ is a squarefree number with at most 7 prime factors.

The above result is different from earlier ones because of the extra condition that p+2 be made up only of distinct primes.

Bibliography

[BakHar98] Baker R. C., Harman G., Shifted primes without large prime factors, Acta Arith., (83), 331-361, (1998).
[BakHar95] Baker R. C., Harman G., The Brun-Titchmarsh Theorem on average, Analytic Number Theory Vol 1, (Allerton Park, IL), 39-103, Progr. Math. 138, Birkhäuser Boston, Boston, (1996).
[BakPin85] Baker R. C., Pintz J., The distribution of square-free numbers, Acta Arith., (46), 71-79, (1985).
[Be83] Beth Thomas, Eine Bemerkung zur Abschätzung der Anzahl orthogonaler lateinischer Quadrate mittels Siebverfahren, Abh. Math. Sem. Univ. Hamburg, 53, 284-288, (1983).
[BPS60] Bose R. C., Shrikhande S. S., Parker E. T., Further results on the construction of mutually orthogonal latin squares and the Falsity of Euler's conjecture, Canad. J. Math. 12, 189-203, 1960.
[Bru16] Brun V., Om fordelingen av primtallene i forskjellige talklasser. En øvre begrænsning, Nyt Tidsskr. f. Math. (27) B, 45-58, (1916).
[Bru19] Brun V., Le crible d'Ératosthène et le théorème de Goldbach, C. R. Acad. Sci. Paris, (168), 544-546, (1919).
[Bru22] Brun V., Das Sieb des Eratosthenes, 5. Skand. Mat. Kongr., Helsingfors, 197-203, (1922).
[CES60] Chowla S., Erdős P., Straus E. G., On the maximal number of pairwise orthogonal Latin squares of a given order, Canad. J. Math. 12, 204-208, 1960.
[Che73] Chen J., On the representation of a large even integer as the sum of a prime and the product of at most two primes, Sci. Sinica, (16), 157-176, (1973).
[Dar96] Dartyge Cécile, Le plus grand facteur de n^2+1 où n est presque premier, Acta Arith. (76), no. 3, 199-226, (1996).
[Dav80] Harold Davenport, Montgomery H. L., Multiplicative Number Theory, 2nd ed., Springer-Verlag, (1980).
[DI83] Deshouillers J.-M., Iwaniec Henryk, On the greatest prime factor of n^2+1, Ann. Inst. Fourier (Grenoble), (32), no. 4, 1-11, (1983).
[Erd52] Erdős Pál, On the greatest prime factor of ∏ f(k), J. Lond. Math. Soc., (27), 379-384, (1952).
[Erd60] Erdős Pál, Über die kleinste quadratfreie Zahl einer arithmetischen Reihe, Monatsh. Math. 64, (1960), 314-316.
[Erd49] Erdős Pál, On some applications of Brun's method, Acta Univ. Szeged. Sect. Sci. Math. 13, (1949), 57-63.
[Est31] Estermann Theodor, Einige Sätze über quadratfreie Zahlen, Math. Ann. vol. (105), 1931, 653-662.
[Gal67] Gallagher P. X., The large sieve, Mathematika, (14), 14-20, (1967).
[GeL66] Gel'fond A. O., Linnik Yu. V., Elementary Methods in the Analytic Theory of Numbers, MIT Press, (1966).
[Hal70] Halberstam H., On integers all of whose prime factors are small, Proc. Lond. Math. Soc. (3), (21), 102-107, 1970.
[HR74] Halberstam H., Richert H.-E., Sieve Methods, Academic Press, 1974.
[HalRo66] Halberstam H., Roth K. F., Sequences, Oxford University Press, (1966).
[HB84] Heath-Brown D. R., The Square Sieve and Consecutive Square-Free Numbers, Math. Ann. (266), (1984), 251-259.
[HB88] Heath-Brown D. R., The number of primes in a short interval, J. Reine Angew. Math. (389), 22-63, (1988).
[Ho67] Hooley C., On the greatest prime factor of a quadratic polynomial, Acta Math., (17), 281-299, (1967).
[Ho73] Hooley C., On the largest prime factor of p+a, Mathematika, (20), 135-143, (1973).
[Ho76] Hooley C., Applications of sieve methods to the theory of numbers, Cambridge University Press, (1976).
[Iwan82] Iwaniec Henryk, On the Brun-Titchmarsh theorem, J. Math. Soc. Japan, (34), No. 1, 95-123, (1982).
[vLR65] van Lint J. H., Richert H.-E., On primes in arithmetic progression, Acta Arith., (11), 209-216, (1965).
[Mir49] Mirsky L., On the frequency of pairs of squarefree numbers with a given difference, Bull. Amer. Math. Soc. (55), 936-939, (1949).
[Mon68] Montgomery H. L., A note on the large sieve, J. Lond. Math. Soc., (43), 93-98, (1968).
[MV73] Montgomery H. L., Vaughan R. C., The Large Sieve, Mathematika, (20), No. 40, 119-134, (1973).
[MV81] Montgomery H. L., Vaughan R. C., The Distribution of Squarefree Numbers, Recent Progress in Analytic Number Theory, Academic Press, 247-256, (1981).
[Mot70] Motohashi Yoichi, A note on the least prime in an arithmetic progression with a prime difference, Acta Arith., (17), 283-285, (1970).
[Odl71] Odlyzko Andrew M., Sieve Methods, Senior Thesis, California Institute of Technology, Pasadena, California, (1971).
[Rad24] Rademacher Hans, Beiträge zur Viggo Brunschen Methode in der Zahlentheorie, Abh. Math. Sem. Hamburg, (3), 12-30, (1924).
[RS62] Rosser J. B., Schoenfeld L., Approximate formulas for some functions of prime numbers, Illinois J. Math. (6), 64-89, (1962).
[Sch66] Schinzel A., On sums of roots of unity. (Solution of two problems of R. M. Robinson), Acta Arith., (11), 419-432, (1966).
[SchWa58] Schinzel A., Wang Y., A note on some properties of the functions φ(n), σ(n) and θ(n), Ann. Polon. Math. (4), 201-213, (1958).
[Sel47] Selberg A., On an elementary method in the theory of primes, Norske Vid. Selsk. Forh. Trondhjem (19), no. 18, 64-67, (1947).
[Sel71] Selberg A., Sieve Methods, Proc. Symp. Pure Math. (20), 311-351, (1971).
[Tit86] Titchmarsh E. C., The Theory of the Riemann Zeta-function, 2nd Ed., Oxford University Press, (1986).
[Wal63] Walfisz Arnold, Weylsche Exponentialsummen in der neueren Zahlentheorie, Deutscher Verlag der Wissenschaften, Berlin, (1963).
[War27] Ward D. R., Some series involving Euler's function, J. Lond. Math. Soc., (2), 210-214, (1927).
[Warl90] Warlimont Richard, Sieving by large prime factors, Monatsh. Math., (109), no. 3, 247-256, (1990).
[Wil74] Wilson, Richard
