
On average sizes of Selmer groups and ranks in families of elliptic curves having marked points

Manjul Bhargava and Wei Ho

1 Introduction

Let F0, F1, F1(2), F1(3), and F2 denote the following families of elliptic curves over Q:

F0 = { y^2 = x^3 + a_4 x + a_6 | a_4, a_6 ∈ Z, Δ ≠ 0 };

F1 = { y^2 + a_3 y = x^3 + a_2 x^2 + a_4 x | a_2, a_3, a_4 ∈ Z, Δ ≠ 0 };

F1(2) = { y^2 = x^3 + a_2 x^2 + a_4 x | a_2, a_4 ∈ Z, Δ ≠ 0 };

F1(3) = { y^2 + a_1 xy + a_3 y = x^3 | a_1, a_3 ∈ Z, Δ ≠ 0 }; and

F2 = { y^2 + a_1 xy + a_3 y = (x − a_2)(x − a′_2)(x − a′′_2) | a_1, a_2, a′_2, a′′_2, a_3 ∈ Z, a_2 + a′_2 + a′′_2 = 0, Δ ≠ 0 },

where, in each case, Δ is the discriminant polynomial in the coefficients a_i, a′_i, a′′_i whose nonvanishing is equivalent to the curve being nonsingular. Then we see that F0 is the family of all elliptic curves; F1 is the family of all elliptic curves with a marked rational point (at (0,0)); F1(2) is the family of all elliptic curves with a marked rational point of order two (at (0,0)); F1(3) is the family of all elliptic curves with a marked rational point of order three (at (0,0)); and F2 is the family of all elliptic curves with two marked rational points (at, e.g., (a_2, 0) and (a′_2, 0)). In fact, we will show in §10 that 100% of the curves in F1 (respectively, F2) have rank at least 1 (resp., 2) when ordered by height, to be defined below.
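As an illustration of the nonvanishing condition Δ ≠ 0, the following minimal sketch evaluates the standard discriminant of a general Weierstrass equation and specializes it to the families above. The function names (`weierstrass_discriminant`, `f2_to_weierstrass`) are hypothetical helpers introduced here for illustration; they are not notation from the paper.

```python
def weierstrass_discriminant(a1=0, a2=0, a3=0, a4=0, a6=0):
    """Discriminant of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    return -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6


def f2_to_weierstrass(a1, a3, r, r1, r2):
    """Weierstrass coefficients for y^2 + a1*x*y + a3*y = (x-r)(x-r1)(x-r2).

    The three roots play the role of a_2, a'_2, a''_2 and must sum to zero,
    so the expanded cubic has no x^2 term.
    """
    if r + r1 + r2 != 0:
        raise ValueError("the three roots must sum to zero")
    return {
        "a1": a1,
        "a3": a3,
        "a4": r * r1 + r * r2 + r1 * r2,
        "a6": -r * r1 * r2,
    }


# F0: y^2 = x^3 - x
print(weierstrass_discriminant(a4=-1))                              # 64
# F1(2): Delta = 16 * a4^2 * (a2^2 - 4*a4)
print(weierstrass_discriminant(a2=1, a4=1))                         # -48
# F1(3): Delta = a3^3 * (a1^3 - 27*a3)
print(weierstrass_discriminant(a1=1, a3=1))                         # -26
# F2 with roots 1, -1, 0 and a1 = a3 = 0 is again y^2 = x^3 - x
print(weierstrass_discriminant(**f2_to_weierstrass(0, 0, 1, -1, 0)))  # 64
```

A tuple of integer coefficients belongs to the corresponding family exactly when this value is nonzero; for example, y^2 = x^3 + 2x^2 + x = x(x + 1)^2 has discriminant 0 and is excluded from F1(2).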

The aim of this article is to demonstrate that the average rank of elliptic curves in each of these five families is bounded above. In fact, we will prove the stronger result that the average size of the 2- and/or 3-Selmer groups of elliptic curves in each of these families is bounded above.

To state these results more precisely, define the (naive) height of a Weierstrass elliptic curve E in any of the families F0, F1, F1(2), or F1(3) by ht(E) := max{|a_i|^{12/i}}; similarly, define the (naive) height of an elliptic curve E in the family F2 by ht(E) := max{|a_i|^{12/i}, |a′_2|^6, |a′′_2|^6}. We prove the following theorem.
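The height can be computed in exact integer arithmetic, since 12/i is an integer for every index i in {1, 2, 3, 4, 6}. The sketch below is only an illustration; the name `naive_height` and the way coefficients are passed (a dictionary from index to value, plus an optional tuple for a′_2 and a′′_2) are assumptions made here.

```python
def naive_height(coeffs, extra_weight2=()):
    """Return max |a_i|^(12/i), including |a'_2|^6 and |a''_2|^6 if given.

    coeffs maps the index i to the integer a_i; extra_weight2 holds the
    additional weight-2 coefficients a'_2, a''_2 used for the family F2.
    """
    terms = []
    for i, a in coeffs.items():
        if i not in (1, 2, 3, 4, 6):
            raise ValueError("unexpected coefficient index: %r" % (i,))
        terms.append(abs(a) ** (12 // i))
    terms.extend(abs(a) ** 6 for a in extra_weight2)
    return max(terms)


# F0 curve y^2 = x^3 + 2x - 3: max(|2|^3, |-3|^2) = max(8, 9)
print(naive_height({4: 2, 6: -3}))                            # 9
# F2 curve with a_1 = 1, a_3 = 1 and roots (2, -1, -1)
print(naive_height({1: 1, 2: 2, 3: 1}, extra_weight2=(-1, -1)))  # 64
```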

Theorem 1.1. When elliptic curves in the families F0, F1, F1(3), F1(2), F2 are ordered by height:

(a) The average size of the 2-Selmer group in F0 is 3.

(b) The average size of the 3-Selmer group in F0 is 4.

(c) The average size of the 2-Selmer group in F1 is at most 6.

(d) The average size of the 3-Selmer group in F1 is 12.


(e) The average size of the 3-Selmer group in F1(2) is 4.

(f) The average size of the 2-Selmer group in F1(3) is at most 3.

(g) The average size of the 2-Selmer group in F2 is at most 12.

For cases (c), (f), and (g), the statement that the average size is at most a given integer N means that the limsup of the corresponding ratio is at most N. Note that, in each of these averages, we are in fact counting certain isomorphism classes of elliptic curves infinitely many times. However, for each such family F, we observe that an elliptic curve E ∈ F always has a unique representative in F of minimal height, which we then call a minimal element of F. For example, a curve E in F0 is minimal if and only if there is no prime p such that p^4 | a_4 and p^6 | a_6. In general, for each of our families F, a curve is minimal in F exactly when, for every prime p, it is not the case that p^i | a_i (and p^i | a′_i and p^i | a′′_i) for all i.
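The minimality criterion is a finite check. A prime p with p^i | a_i for all i must divide every nonzero coefficient, so it suffices to test the primes dividing one nonzero coefficient. The following sketch assumes the coefficients are given as (i, a_i) pairs, with a′_2 and a′′_2 entered with weight 2; `prime_factors` and `is_minimal` are hypothetical helper names, and trial division is used purely for simplicity.

```python
def prime_factors(n):
    """Set of primes dividing the nonzero integer n (trial division)."""
    n = abs(n)
    primes = set()
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.add(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes


def is_minimal(weighted_coeffs):
    """True if no prime p satisfies p^i | a_i for every pair (i, a_i)."""
    nonzero = [(i, a) for i, a in weighted_coeffs if a != 0]
    if not nonzero:
        raise ValueError("all coefficients vanish, so the curve is singular")
    # Zero coefficients are divisible by every prime power, so only the
    # nonzero ones constrain p; test the primes dividing the first of them.
    _, first = nonzero[0]
    for p in prime_factors(first):
        if all(a % p ** i == 0 for i, a in nonzero):
            return False
    return True


print(is_minimal([(4, 16), (6, 64)]))   # False: 2^4 | 16 and 2^6 | 64
print(is_minimal([(4, 16), (6, 32)]))   # True: 2^6 does not divide 32
```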

If we restrict ourselves only to the minimal curves in each of our families F, then any Q-isomorphism class of elliptic curves in any such family will be represented exactly once. We prove that the averages and upper bounds in Theorem 1.1 do not change even when one averages only over these minimal curves (i.e., over all isomorphism classes of curves in these families, ordered by their minimal heights); moreover, these averages and upper bounds do not change even when any additional finite set of congruence conditions is imposed on the coefficients of the elliptic curves.

More generally, for F = F0, F1, F1(2), F1(3), or F2, let Φ be any subfamily of F that is defined, for each prime p, by congruence conditions modulo some power of p. We say that such a subfamily Φ of elliptic curves over Q is large at p if the congruence conditions defining Φ at p contain all elliptic curves E in the family such that p^2 does not divide the reduced discriminant Δ_red of E; and we say that the subfamily Φ of F is large if it is large at all but finitely many primes p. The reduced discriminant Δ_red is the squarefree part of the discriminant polynomial Δ. We prove the following strengthening of Theorem 1.1:

Theorem 1.2. The average values and upper bounds given in Theorem 1.1 for the families F0, F1, F1(2), F1(3), and F2 remain the same even when one averages over any large subfamily of one of these families.

Note that the sets of all curves and the sets of all minimal curves in F0, F1, F1(2), F1(3), and F2 are large. So too are the subfamilies of all elliptic curves (resp. all minimal curves) in these families defined by any finite set of congruence conditions, as are those having semistable reduction in the families F0, F1, and F2. Thus Theorem 1.2 applies to quite general subfamilies of these families of elliptic curves.

Using the fact that the p-rank of the p-Selmer group of an elliptic curve bounds its algebraic rank, we obtain the following:

Theorem 1.3. When elliptic curves in any large subfamily of one of the families F0, F1, F1(2), F1(3), and F2 are ordered by height (resp., minimal height), the average rank is bounded.

Indeed, we obtain the explicit bounds of 7/6, 13/6, 7/6, 3/2, and 7/2 for the limsup of the average ranks of the curves in any large subfamily of F0, F1, F1(2), F1(3), and F2, respectively.
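One standard way to pass from a bound N on the average size of the p-Selmer group to a bound on the average rank is convexity: since p^r ≤ |Sel_p(E)| for a curve of rank r, and since p^r ≥ p^k + p^k(p − 1)(r − k) for all integers r, k ≥ 0, averaging gives avg(r) ≤ k + (N − p^k)/(p^k(p − 1)) for every k. The sketch below minimizes this over k. It is offered under the assumption that this is how the constants above arise; the text here does not spell out the derivation, but the computation does reproduce each stated value. The name `rank_bound` is hypothetical.

```python
from fractions import Fraction


def rank_bound(p, selmer_average, max_k=32):
    """Best convexity bound on the average rank from avg |Sel_p| <= N.

    Every k gives a valid upper bound, so taking the minimum over a
    finite range of k is safe; the bound grows again for large k.
    """
    best = None
    for k in range(max_k + 1):
        bound = k + Fraction(selmer_average - p ** k, p ** k * (p - 1))
        if best is None or bound < best:
            best = bound
    return best


print(rank_bound(3, 4))    # 7/6   (F0 and F1(2), from the 3-Selmer average 4)
print(rank_bound(3, 12))   # 13/6  (F1, from the 3-Selmer average 12)
print(rank_bound(2, 3))    # 3/2   (F1(3), from the 2-Selmer bound 3)
print(rank_bound(2, 12))   # 7/2   (F2, from the 2-Selmer bound 12)
```

For F1 the 2-Selmer bound of 6 would give only 5/2 by the same computation, which is weaker than the 13/6 coming from the 3-Selmer average.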

We note that considering only the 2-Selmer group is not sufficient to prove a finite bound on the average rank in the family F1(2) of elliptic curves over Q with a marked rational 2-torsion point. As shown by Klagsbrun and Lemke-Oliver [KLO14], the average size of the 2-Selmer group in the family F1(2) is infinite. Moreover, if φ denotes the 2-isogeny of elliptic curves E → E′ associated to the marked rational 2-torsion point on the elliptic curve E in F1(2), and φ̂ denotes the dual isogeny, then Kane and Klagsbrun [KK17] have shown that the average sizes of the φ- and φ̂-Selmer groups in F1(2) are also each unbounded. Since Theorem 1.1(e) shows that the average size of 3^{rk(E)}, and therefore 2^{rk(E)}, is bounded for the elliptic curves in F1(2), it follows that most (indeed, a density of 100% of) elements in the 2-Selmer groups, and the φ- and φ̂-Selmer groups, of elliptic curves in F1(2) must correspond to nontrivial elements of the Tate–Shafarevich group.

Theorem 1.4. When all 2-Selmer elements (resp. φ- or φ̂-Selmer elements) of elliptic curves over Q with a marked rational 2-torsion point are ordered by the heights of these elliptic curves, a density of 100% have no rational point, i.e., correspond to nontrivial elements of the Tate–Shafarevich group.

As a consequence, we also obtain:

Theorem 1.5. When elliptic curves over Q with a marked rational 2-torsion point are ordered by height, the average size of the 2-torsion subgroup of the Tate–Shafarevich group is infinite.

Since both [KLO14] and [KK17] obtain more precise results on the distribution of 2-Selmer (resp. φ- and φ̂-Selmer) groups in F1(2), due to Theorem 1.1(e), these same distribution results thus also apply to the 2-torsion (resp. φ- and φ̂-torsion) subgroups of the Tate–Shafarevich groups of these curves.

Theorem 1.4 can be made more explicit in the case of φ-Selmer groups:

Theorem 1.6. When curves y^2 = Ax^4 + Bx^2z^2 + Cz^4 (A, B, C ∈ Z) having points everywhere locally are ordered by the height max{|B|^2, |AC|}, a density of 100% have no rational point.

Theorem 1.6 follows from Theorem 1.4 because elements of the φ-Selmer group of the elliptic curve E : y^2 = x^3 + a_2 x^2 + a_4 x in F1(2) can be represented by the genus one curves Y_{A,B,C} : y^2 = Ax^4 + Bx^2z^2 + Cz^4 that have points locally at every place, where A, B, C ∈ Z, AC = a_4, and B = a_2. We conclude from Theorem 1.4 that 100% of such curves Y_{A,B,C} in fact have no rational point when ordered by the heights of their Jacobians.
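To make the correspondence concrete, the sketch below lists the integer triples (A, B, C) with AC = a_4 and B = a_2 attached to a given curve in F1(2), together with the height used in Theorem 1.6. It deliberately does not test local solubility, and different triples may represent the same Selmer element, so the list is only a set of candidates. The function names are hypothetical.

```python
def divisors(n):
    """Positive divisors of the nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def quartic_candidates(a2, a4):
    """All integer triples (A, B, C) with A*C = a4 and B = a2."""
    if a4 == 0:
        raise ValueError("a4 = 0 gives a singular curve")
    triples = []
    for d in divisors(a4):
        for A in (d, -d):
            triples.append((A, a2, a4 // A))  # exact, since A divides a4
    return triples


def quartic_height(A, B, C):
    """The height max{|B|^2, |AC|} from Theorem 1.6."""
    return max(B * B, abs(A * C))


candidates = quartic_candidates(1, 6)   # E: y^2 = x^3 + x^2 + 6x
print(len(candidates))                  # 8
print(quartic_height(2, 1, 3))          # 6
```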

It is precisely the curves appearing in Theorem 1.6 that are studied in first courses on descent on elliptic curves having a rational 2-torsion point. Theorem 1.6 states that most such torsors for elliptic curves have no rational points.

Method of proof

The proofs of Theorems 1.1–1.6 rely on recently developed parametrizations of Selmer elements of elliptic curves having marked points, via orbits in certain coregular representations, which we obtained in [BH16]. A representation of an algebraic group G on a vector space V is called coregular if the ring of invariants is a free polynomial ring. In [BH16], we classified the orbits of G(K) on V(K) for a field K for many coregular representations, and showed that the K-orbits in these cases correspond to genus one curves over K together with extra data, such as line bundles on these curves and marked points on their Jacobians.

Specifically, we proved that the nondegenerate Q-orbits having integral invariants in these coregular representations involving “Rubik’s cubes”, “hypercubes”, and their symmetrizations (see Table 1) naturally correspond to line bundles of degree 2 or 3 on principal homogeneous spaces for elliptic curves in these families. By a result of Cassels [Cas62], elements of the p-Selmer group of an elliptic curve E may be realized as isomorphism classes of locally soluble principal homogeneous spaces C for E along with a degree p line bundle on the curve C. We conclude that the “locally soluble” Q-orbits of these representations correspond to elements of the 2- or 3-Selmer groups for elliptic curves in these families. (Note that for the family F0 these parametrizations were classically known and were used in [BS15a] and [BS15b] to prove Theorems 1.1–1.3 for F0.) In order to obtain the averages in Theorem 1.1, we are thereby reduced to counting Q-orbits in these representations that are both locally soluble and have bounded integral invariants.

To successfully count these Q-orbits, we select a representative integral orbit, i.e., an orbit of G(Z) on V(Z), for each Q-orbit. In particular, we show that for any locally soluble element of V(Q) with integral invariants, there exists an integral representative in V(Z) with essentially the same invariants (i.e., up to absolutely bounded factors). Thus, to count rational orbits corresponding to Selmer elements of elliptic curves having bounded invariants, it suffices to count integer orbits corresponding to Selmer elements having essentially those same invariants.

By a suitable adaptation of the counting techniques of [Bha10] and [BS15a], we first carry out a count of the total number of integral orbits in these representations having bounded height satisfying the appropriate irreducibility conditions (to be defined). The primary obstacle in this counting, as in previous representations encountered in, e.g., [Bha10, BS15a], is that the fundamental region in which one has to count points is not bounded but has a complex system of cusps going off to infinity. A priori, it could be difficult to obtain exact counts of points of bounded height in the cusps of these fundamental regions. We show, however, that most of the points in the cusps corresponding to Selmer group elements lie on subvarieties given by certain algebraic conditions on the associated geometric data, and we show that these reducible points can be counted by a different argument. The problem then reduces to counting points, corresponding to Selmer elements, in the main body of the fundamental region, which we show are predominantly irreducible.

Since not all of the genus one curves correspond to Selmer elements, obtaining the averages in Theorem 1.1 requires a sieve to the locally soluble Q-orbits. The upper bound sieves are easier; the lower bounds are much harder, and we accomplish them in cases (a), (b), (d), and (e) of Theorem 1.1. We note that only the upper bounds are required for the upper bounds on average rank. By carrying out these sieves, we prove that each of the averages or bounds in Theorem 1.1 arises naturally as the sum of two contributions: one from the “main body” of the fundamental region, which is essentially the Tamagawa number of the group acting; and one from the cusp of the fundamental region, which is essentially a count of the number of elements corresponding to a certain subgroup S′ of the Selmer group S, namely, the image in S of the subgroup of the Mordell–Weil group generated by the marked points.

We summarize these cusp (S′) and Tamagawa (non-S′) contributions in Table 1. The second and third columns of Table 1 give (the isomorphism class of) the algebraic group G acting and the representation V of G, respectively. The fourth column gives the interpretation of orbits of G(K) on V(K) for a field K, as classified in [BH16], namely, as an elliptic curve E in the appropriate family with an E-torsor C and a line bundle L (on C), whose degree is given in the fifth column. The sixth column lists generators for the ring of invariants, and the degrees of these invariants, for the action of G on V. (These generators are also the coefficients of the corresponding elliptic curves in the family F.) The seventh column gives the dimension of V, and the eighth gives the degree of the height function on V. The ninth column gives the number of elements in the subgroup S′ of S, and the tenth column lists the Tamagawa number of G. Finally, the sum of the ninth and tenth columns yields the averages or upper bounds given in the eleventh column of the table, and in Theorems 1.1 and 1.2.
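The bookkeeping described here can be checked mechanically. The sketch below records, for each row of Table 1, the degree d, the cusp contribution, the Tamagawa number, and the total, and confirms that cusp + Tamagawa equals the total and that the totals agree with the values in Theorem 1.1. The field `free_points` (the number of marked points that generically have infinite order: 0 for F0, F1(2), F1(3); 1 for F1; 2 for F2) is an interpretive label added here so that the cusp contribution can be compared with d raised to that power.

```python
# (row, family, d, free_points, cusp, tamagawa, total)
ROWS = [
    (1, "F0", 2, 0, 1, 2, 3),
    (2, "F0", 3, 0, 1, 3, 4),
    (3, "F1", 2, 1, 2, 4, 6),
    (4, "F1", 3, 1, 3, 9, 12),
    (5, "F1(2)", 3, 0, 1, 3, 4),
    (6, "F1(3)", 2, 0, 1, 2, 3),
    (7, "F2", 2, 2, 4, 8, 12),
]

# Averages and upper bounds from Theorem 1.1, keyed by (family, d).
THEOREM_1_1 = {
    ("F0", 2): 3, ("F0", 3): 4, ("F1", 2): 6, ("F1", 3): 12,
    ("F1(2)", 3): 4, ("F1(3)", 2): 3, ("F2", 2): 12,
}


def check_rows(rows=ROWS):
    """Return True if every row is internally consistent."""
    for _, family, d, free_points, cusp, tamagawa, total in rows:
        if cusp + tamagawa != total:
            return False
        if cusp != d ** free_points:
            return False
        if THEOREM_1_1[(family, d)] != total:
            return False
    return True


print(check_rows())   # True
```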

We explain more precisely the details of our strategy to prove the contents of Table 1 (and thus Theorems 1.1 and 1.3) in §2.


Table 1: Coregular spaces and Selmer groups

| # | Group G | Representation V | Q-orbits ↔ (C, E, L) | deg. of L | Invariants a_i, a′_i and their degrees | dim. n | ht. deg. k | cusp contr. | Tamagawa # | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| 1. | PGL_2 | Sym^4(2): binary quartic forms | E ∈ F0 | 2 | a_4, a_6 (degrees 2, 3) | 5 | 6 | 1 | 2 | 3 |
| 2. | PGL_3 | Sym^3(3): ternary cubic forms | E ∈ F0 | 3 | a_4, a_6 (degrees 4, 6) | 10 | 12 | 1 | 3 | 4 |
| 3. | PGL_2^2 | Sym^2(2) ⊗ Sym^2(2): bidegree (2,2) forms | E ∈ F1 | 2 | a_2, a_3, a_4 (degrees 2, 3, 4) | 9 | 12 | 2 | 4 | 6 |
| 4. | SL_3^3/µ_3^2 | 3 ⊗ 3 ⊗ 3: Rubik’s cubes | E ∈ F1 | 3 | a_2, a_3, a_4 (degrees 6, 9, 12) | 27 | 36 | 3 | 9 | 12 |
| 5. | SL_3^2/µ_3 | 3 ⊗ Sym^2(3): doubly symmetric Rubik’s cubes | E ∈ F1(2) | 3 | a_2, a_4 (degrees 6, 12) | 18 | 36 | 1 | 3 | 4 |
| 6. | SL_2^2/µ_2 | 2 ⊗ Sym^3(2): triply symmetric hypercubes | E ∈ F1(3) | 2 | a_1, a_3 (degrees 2, 6) | 8 | 24 | 1 | 2 | 3 |
| 7. | SL_2^4/µ_2^3 | 2 ⊗ 2 ⊗ 2 ⊗ 2: hypercubes | E ∈ F2 | 2 | a_1, a_2, a′_2, a_3 (degrees 2, 4, 4, 6) | 16 | 24 | 4 | 8 | 12 |

Notation: For each row, the nondegenerate Q-orbits (that is, elements of G(Q)\V(Q)) correspond to data in the fourth column, denoted (C, [E], L), where C is a genus one curve, L is a line bundle of degree 2 or 3 (as specified) on C, and E is an elliptic curve with a model in the specified family F with Jac(C) ≅ E. (For a more precise description of the action of G on V, see §3.) In the language of [BH16], Lines 4 and 5 of Table 1 are the representations corresponding to Rubik’s cubes and doubly symmetric Rubik’s cubes, respectively; meanwhile, Lines 7 and 6 of Table 1 are the representations corresponding to hypercubes and triply symmetric hypercubes, respectively. Note that the spaces in Lines 1, 2, and 3 may also be viewed as subspaces of these, namely quadruply symmetric hypercubes, triply symmetric Rubik’s cubes, and doubly doubly symmetric hypercubes, respectively.


2 Outline of proof

Let (G, V) denote any one of the representations listed in Table 1. Let K be a field not of characteristic 2 or 3. In Section 3, to any (nondegenerate) orbit of G(K) on V(K), we describe how to associate an elliptic curve E in the corresponding family F, but where the coefficients of E lie in K rather than Z; we denote this family of elliptic curves F(K). Namely, for a particular v ∈ V(K), the model of E in F(K) has coefficients a_i (and a′_2 for F = F2) given by the generators of the invariant ring for the action of G on V. These invariants a_i are fixed polynomials in Z[V].
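For the first row of Table 1 (binary quartic forms), the invariant ring is generated by the classical invariants I and J of degrees 2 and 3, and their invariance can be checked directly. The sketch below evaluates I and J on f = ax^4 + bx^3y + cx^2y^2 + dxy^3 + ey^4 and verifies that they are unchanged by the substitutions x → x + ty and (x, y) → (y, −x), which generate SL_2(Z). How the paper scales I and J to obtain a_4 and a_6 is not fixed in this section; one common normalization in the related literature takes the Jacobian to be y^2 = x^3 − (I/3)x − J/27. The function names are hypothetical.

```python
def invariants_IJ(f):
    """Classical invariants I (degree 2) and J (degree 3) of a binary quartic."""
    a, b, c, d, e = f
    I = 12 * a * e - 3 * b * d + c * c
    J = (72 * a * c * e + 9 * b * c * d - 27 * a * d * d
         - 27 * e * b * b - 2 * c ** 3)
    return I, J


def translate(f, t):
    """Coefficients of f(x + t*y, y)."""
    a, b, c, d, e = f
    return (
        a,
        4 * a * t + b,
        6 * a * t * t + 3 * b * t + c,
        4 * a * t ** 3 + 3 * b * t * t + 2 * c * t + d,
        a * t ** 4 + b * t ** 3 + c * t * t + d * t + e,
    )


def swap(f):
    """Coefficients of f(y, -x)."""
    a, b, c, d, e = f
    return (e, -d, c, -b, a)


f = (1, 2, -3, 5, 7)
print(invariants_IJ(f))                          # (63, -3159)
print(invariants_IJ(translate(f, 4)) == invariants_IJ(f))   # True
print(invariants_IJ(swap(f)) == invariants_IJ(f))           # True
```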

The discriminant Δ(v) of v ∈ V(K) is the usual discriminant polynomial Δ in the coefficients a_i, whose nonvanishing is equivalent to the curve E being nonsingular. The discriminant on V(K) is thus invariant under G(K). We use V^stab(K) ⊂ V(K) to denote the subset of stable (or nondegenerate) points in V(K), i.e., the points in V(K) where the discriminant is nonzero. The height H(v) of a point v ∈ V(Z) (or v ∈ V(R), using the same formulas) is the height H(E) of the corresponding elliptic curve.

We review in §3 the theorems from [BH16] which state that the G(Q)-orbits of points v ∈ V^stab(Q) are in canonical bijection with triples (E, C, L), where C is a principal homogeneous space for an elliptic curve E in F(Q) for one of the families F = F0, . . . , F2, and L is a line bundle on C of degree 2 or 3. The aforementioned elliptic curve is isomorphic to the Jacobian of the genus one curve C obtained via these bijections. Given a specific v ∈ V^stab(Q), there is a natural choice for a model of this elliptic curve with coefficients given by the generators a_i, a′_i of the ring of invariants of the action of G on V. Thus, given v ∈ V^stab(Q), there exists a model E of the elliptic curve in F(Q) such that the discriminant Δ, height H, and invariants a_i, a′_i of v and E agree.

Among the G(Q)-orbits on V^stab(Q), only the locally soluble orbits yield (2- or 3-) Selmer group elements for the corresponding elliptic curve E. Let V^ls(Q) ⊂ V^stab(Q) denote the set of elements in those G(Q)-orbits on V(Q) corresponding to triples (C, E, L) that are in the 2- or 3-Selmer group of the elliptic curve E. Thus, the set of all G(Q)-orbits on V^ls(Q) yielding a given elliptic curve E naturally has the structure of a finite abelian 2- or 3-group. We denote this group by S(E). The group S(E) naturally has a subgroup S′(E), namely, the subgroup generated by the images in S(E) of the marked points on E. We also define a notion of irreducible for points v ∈ V(Z), and we show that the points in

V^ls(Z) := V(Z) ∩ V^ls(Q)

that do not correspond to elements of S′(E) are all irreducible.

Let us now restrict ourselves to those orbits in V^stab(Q) having integer invariants, so that the elliptic curve E associated to a point v in this subset of V^stab(Q) is genuinely in the family F. In order to count the orbits of G(Q) on V^ls(Q) having integral invariants and bounded height (thereby yielding a count of the total number of 2- or 3-Selmer group elements for elliptic curves of bounded height in F), we carry out the following steps:

(1) We show that for each v′ ∈ V^ls(Q) with integral invariants, there exists an integral representative v ∈ V(Z) having the identical integral invariants (up to absolutely bounded factors). In this way, we may associate an integral orbit (the G(Z)-orbit of v ∈ V(Z)) to the subset in a rational orbit with fixed integral invariants. (In previous work like [BS15a, BS15b], this step was a consequence of minimization results of Cremona, Fisher, and Stoll [CFS10, Fis13]. See also recent work of Fisher and Radicevic [FR17] for minimization algorithms of some of the spaces considered in this paper.)

(2) We construct fundamental domains for the action of G(Z) on V(R):

(a) First, we show that we may construct a fundamental set L for the action of R^× × G(R) on V(R) that is an absolutely bounded set in V(R).

(b) Second, we construct a fundamental domain F for the action of G(Z) on G(R) that is contained in a Siegel domain. Then for any g ∈ G(R), we see that FgL ⊂ V(R) yields a finite union of fundamental domains for the action of G(Z) on V(R).

We choose g to vary in a compact set G_0 ⊂ G that is the closure of some open set in G. This yields a compact but continuously-varying set of fundamental domains FgL for the action of G(Z) on V(R).

(3) As we want to count the irreducible G(Z)-orbits on V^ls(Z), we give some sufficient conditions for reducibility.

(4) We give an asymptotic count of all irreducible G(Z)-orbits on V^stab(Z). First, by adapting the techniques of [Bha10] and [BS15a], we show that: (a) the cusps of the fundamental regions FgL have a negligible number of irreducible points; (b) the main bodies of the fundamental regions FgL have a negligible number of points that are reducible; and (c) the total number of irreducible points in the main bodies of the fundamental regions FgL having height less than X is asymptotically equal to the Euclidean volume of FgL ∩ {v ∈ V(R) : H(v) < X}.

(5) The elements of V(Z) that are in V^ls(Z) are defined by infinitely many congruence conditions. In order to count just those irreducible G(Z)-orbits on V^stab(Z) that are locally soluble, one must perform a sieve. The sieve we require is a certain “geometric sieve”, which originates in the work of Ekedahl [Eke91] and was further developed by Poonen [Poo03] and also in [BS15b]. This sieve allows us to obtain a count of just the irreducible G(Z)-orbits of bounded height in V^ls(Z).

(6) In order to obtain the averages in Theorem 1.1, we must count the total number of curves of height less than X in the family F. For almost all families, these counts are fairly straightforward; but for the family F2, we must establish the uniformity estimate by embedding the family into the cusp region of a larger space and using additional invariants, extending the idea of the “Q-invariant” from [BSW16].

By using the next step (7), we may add back to the count in (5) the number of Selmer elements in S(E), over all curves E ∈ F of height < X, that reduce to the identity in S′(E) (i.e., the number of reducible orbits on V^ls(Z) having height < X). The total number of Selmer elements in S(E) over all elliptic curves E ∈ F of height less than X, divided by the total number of elliptic curves E ∈ F of height less than X, as X tends to infinity, then yields the desired averages in Theorem 1.1.

(7) We prove an auxiliary lemma which shows that, when elliptic curves in the families F1 and F2 are ordered by height, a density of 100% of the marked points on these curves have infinite order. This implies that, for 100% of the curves E in Fi, we have that |S′(E)| = p^i, where p = 2 or 3. In other words, for 100% of the elliptic curves E in the family F = Fi, there are p^i reducible orbits on V^ls(Z) corresponding to the elliptic curve E. In the cases corresponding to the families F1(·), the group S′(E) is trivial.

We will carry out the details of item (j) above in Section j + 3 below.
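Step (7) concerns the order of the marked point (0, 0). For a single curve over Q this can be decided by exact arithmetic: by Mazur's theorem, a rational torsion point has order at most 12, so a point whose first twelve multiples are all nonzero has infinite order. The following sketch implements the standard chord-and-tangent formulas for a general Weierstrass equation. It illustrates the statement for individual curves only and is not the density argument of the auxiliary lemma; the function names are hypothetical.

```python
from fractions import Fraction

O = None  # the point at infinity


def add(curve, P, Q):
    """Add two points on y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    a1, a2, a3, a4, a6 = curve
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y1 + y2 + a1 * x2 + a3 == 0:
        return O  # Q is the inverse of P
    if x1 != x2:
        lam = Fraction(y2 - y1) / (x2 - x1)
        nu = Fraction(y1 * x2 - y2 * x1) / (x2 - x1)
    else:
        # Doubling; the denominator is nonzero because P != -P here.
        den = 2 * y1 + a1 * x1 + a3
        lam = Fraction(3 * x1 * x1 + 2 * a2 * x1 + a4 - a1 * y1) / den
        nu = Fraction(-x1 ** 3 + a4 * x1 + 2 * a6 - a3 * y1) / den
    x3 = lam * lam + a1 * lam - a2 - x1 - x2
    y3 = -(lam + a1) * x3 - nu - a3
    return (x3, y3)


def torsion_order(curve, P, bound=12):
    """Order of P if it is at most `bound`, else None (infinite order over Q)."""
    Q = P
    for n in range(1, bound + 1):
        if Q is O:
            return n
        Q = add(curve, Q, P)
    return None


marked = (0, 0)
print(torsion_order((0, 0, 1, -1, 0), marked))   # None: y^2 + y = x^3 - x in F1
print(torsion_order((0, 1, 0, 1, 0), marked))    # 2: y^2 = x^3 + x^2 + x in F1(2)
print(torsion_order((1, 0, 1, 0, 0), marked))    # 3: y^2 + xy + y = x^3 in F1(3)
```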


3 Parametrizations of Selmer elements

We recall (and appropriately modify) the relevant results from [BH16]. In particular, we describe the bijections between the nondegenerate G(Q)-orbits of V(Q) and the principal homogeneous spaces C for elliptic curves in the corresponding families, including how these parametrization results simplify when only considering locally soluble C and irreducible orbits. In this paper, we only need these results over Q, Q_p, and R, but the bijections hold over any base field not of characteristic 2 or 3 (as shown in [BH16]), and the statements we make in this section about the case of locally soluble curves also extend to other number fields.

We first make a few definitions to state the main parametrization theorem. Recall that a genus one curve over Q is locally soluble if it has points over Q_p for every prime p and over R. For each line in Table 1, let V^ls(Q) denote the elements of V(Q) where the associated genus one curve is locally soluble; as we will see, this set is preserved by G(Q).

Also, for each case, we show in [BH16] that an element of V(Q) gives rise to a number of binary quartic forms or ternary cubic forms over Q (for d = 2 or 3, respectively). For v ∈ V(Q), we say that v is irreducible if each of its covariant binary quartic forms (resp., covariant ternary cubic forms) has no rational linear factor (resp., defines a smooth cubic curve in P^2 with no rational flex); we say that v is reducible otherwise.
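For a single binary quartic form with integer coefficients, the condition of having no rational linear factor is a finite check: a factor qx − py with coprime integers p, q forces f(p, q) = 0, hence q | a and p | e when a and e are nonzero. The sketch below tests exactly this condition for one form. It does not construct the covariant forms attached to an element of V(Q) in the other rows of Table 1, and the function names are hypothetical.

```python
from math import gcd


def divisors(n):
    """Positive divisors of the nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def evaluate(f, x, y):
    """Value of a*x^4 + b*x^3*y + c*x^2*y^2 + d*x*y^3 + e*y^4."""
    a, b, c, d, e = f
    return (a * x ** 4 + b * x ** 3 * y + c * x ** 2 * y ** 2
            + d * x * y ** 3 + e * y ** 4)


def has_rational_linear_factor(f):
    """True if the integral binary quartic f has a linear factor over Q."""
    a, _, _, _, e = f
    if a == 0 or e == 0:
        return True  # y divides f when a = 0, and x divides f when e = 0
    for p in divisors(e):
        for q in divisors(a):
            if gcd(p, q) != 1:
                continue
            if evaluate(f, p, q) == 0 or evaluate(f, -p, q) == 0:
                return True
    return False


print(has_rational_linear_factor((1, 0, 0, 0, -1)))   # True: x - y divides x^4 - y^4
print(has_rational_linear_factor((1, 0, 1, 0, 1)))    # False: only quadratic factors
```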

We obtain the following bijections for each representation (the descriptions of the group actions may be found at the end of this section):

Theorem 3.1. Consider one of the following lines from Table 1:

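| number | group G | representation V | degree d | family F | marked point(s) |
|---|---|---|---|---|---|
| 1. | PGL_2 | Sym^4(2) | 2 | F0 | none |
| 2. | PGL_3 | Sym^3(3) | 3 | F0 | none |
| 3. | PGL_2^2 | Sym^2(2) ⊗ Sym^2(2) | 2 | F1 | (0, 0) |
| 4. | SL_3^3/µ_3^2 | 3 ⊗ 3 ⊗ 3 | 3 | F1 | (0, 0) |
| 5. | SL_3^2/µ_3 | 3 ⊗ Sym^2(3) | 3 | F1(2) | (0, 0) |
| 6. | SL_2^2/µ_2 | 2 ⊗ Sym^3(2) | 2 | F1(3) | (0, 0) |
| 7. | SL_2^4/µ_2^3 | 2 ⊗ 2 ⊗ 2 ⊗ 2 | 2 | F2 | (a_2, 0), (a′_2, 0) |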

(a) There exists a bijection between the nondegenerate G(Q)-orbits of V(Q) and isomorphism classes of triples (E, C, L), where E is an elliptic curve in the family F with marked point(s) as indicated, C is an E-torsor over Q, L is a line bundle of degree d on C, and the marked point(s) represent rational degree 0 divisors of C. Two such triples (E, C, L) and (E′, C′, L′) are isomorphic if E and E′ refer to the same elliptic curve in the family F and there is an isomorphism between C and C′ preserving both the line bundles and the torsor structure.

(b) There exists a bijection between the nondegenerate G(Q)-orbits of V^ls(Q) and pairs (E, ξ), where E is an elliptic curve in F and ξ is an element of the d-Selmer group S_d(E) of E.

(c) For an elliptic curve E ∈ F, let S′(E) denote the subgroup of S_d(E) generated by the images of the marked points on E under the natural map E(Q)/dE(Q) → S_d(E). Then the nondegenerate G(Q)-orbits of the irreducible elements of V^ls(Q) are in bijection with pairs (E, ξ), where E is an elliptic curve in F and ξ ∈ S_d(E) \ S′(E).

Theorem 3.1(a) is proved for each case in [BH16], over any field not of characteristic 2 or 3, although the exact statements are slightly different. There, we study the action of a larger group G′ on the corresponding representation V for which the a_i are relative invariants. The geometric data parametrized includes the elliptic curve in the family only up to isomorphism. Here, the group G is the subgroup of G′ that fixes these invariants a_i, which correspond to the coefficients of the elliptic curve.

Part (b) of Theorem 3.1 follows directly from part (a) by the following lemma, whose proof may be found in, e.g., [Cas62, Theorem 1.2]:

Lemma 3.2. Let C be a genus one curve over Q. If C is locally soluble, then any rational point P on the Jacobian of C represents a rational divisor on C, not just a rational divisor class.

Finally, reducibility or irreducibility for an element in V(Q) is a G(Q)-invariant notion. An element v ∈ V(Q) is reducible if and only if the genus one curve defined by at least one of the covariant binary quartic forms (resp., ternary cubic forms) is isomorphic to its Jacobian elliptic curve, in other words, if and only if it corresponds to the trivial element in the 2-Selmer (resp., 3-Selmer) group of the elliptic curve associated to v. We thus obtain part (c) of Theorem 3.1.

The groups and representations

We more precisely describe the group actions for each of the spaces in Table 1 and Theorem 3.1, over any field of characteristic not 2 or 3.

1. Binary quartic forms. For a 2-dimensional vector space V, the GL(V)-action on the representation Sym^4(V) ⊗ (∧^2 V)^{−2} factors through a PGL(V)-action on Sym^4(V).

2. Ternary cubic forms. For a 3-dimensional vector space W, the GL(W)-action on the space Sym^3(W) ⊗ (∧^3 W)^{−1} factors through an action of PGL(W) on the space Sym^3(W).

3. Bidegree (2, 2) forms. Let V_1 and V_2 be 2-dimensional vector spaces. The GL(V_1) × GL(V_2)-action on Sym^2(V_1) ⊗ (∧^2 V_1)^{−1} ⊗ Sym^2(V_2) ⊗ (∧^2 V_2)^{−1} factors through an action of PGL(V_1) × PGL(V_2) on Sym^2(V_1) ⊗ Sym^2(V_2).

4. Rubik’s cubes. Let W_1, W_2, and W_3 be 3-dimensional vector spaces. The group GL(W_1) × GL(W_2) × GL(W_3) naturally acts on W_1 ⊗ W_2 ⊗ W_3. Consider the subgroup G′ ⊂ ∏_{i=1}^{3} GL(W_i) consisting of triples (g_1, g_2, g_3) with ∏_{i=1}^{3} det(g_i) = 1. Let G_m denote the center of each GL(W_i). Then each element of the representation is stabilized by the kernel of the multiplication map G_m^3 ∩ G′ → µ_3. We let G be the quotient of G′ by this kernel and consider its representation W_1 ⊗ W_2 ⊗ W_3. Note that G is isomorphic to SL_3^3/µ_3^2.

5. Doubly symmetric Rubik’s cubes. Let W_1 and W_2 be 3-dimensional vector spaces. Let G′ be the subgroup of GL(W_1) × GL(W_2) consisting of pairs (g_1, g_2) such that (det g_1)(det g_2)^2 = 1. The stabilizer of the action of G′ on W_1 ⊗ Sym^2(W_2) is the kernel of the multiplication map sending (γ_1, γ_2) ∈ G_m^2 ∩ G′ to γ_1 γ_2^2. We will consider the quotient G of G′ by this kernel and its action on W_1 ⊗ Sym^2(W_2). Note that G is isomorphic to SL_3^2/µ_3.

6. Triply symmetric hypercubes. Let V_1 and V_2 be 2-dimensional vector spaces. As before, we consider the subgroup G′ of GL(V_1) × GL(V_2) of pairs (g_1, g_2) such that (det g_1)(det g_2)^3 = 1, and we consider the quotient of G′ by the stabilizer of its action on V_1 ⊗ Sym^3(V_2). The group is isomorphic to SL_2^2/µ_2.

7. Hypercubes. Let $V_i$ be a 2-dimensional vector space for $i = 1, 2, 3, 4$ with the usual $\mathrm{GL}(V_i)$-action on each factor. Let $G'$ be the subgroup of $\prod_{1 \le i \le 4} \mathrm{GL}(V_i)$ consisting of tuples $(g_1, g_2, g_3, g_4)$ with $\prod_{i=1}^{4} \det g_i = 1$. Let $G$ be the quotient of $G'$ by the stabilizer of its action on $V_1 \otimes V_2 \otimes V_3 \otimes V_4$, i.e., by the kernel of the multiplication map $\mathbb{G}_m^4 \cap G' \to \mu_2$. The group $G$ is isomorphic to $\mathrm{SL}_2^4/\mu_2^3$.


In each case, we show in [BH16] that the covariant binary quartic forms or ternary cubic forms correspond to the genus one curve $C$ and line bundles $L$ or $L \otimes P$, where $P$ refers to the marked point(s) as elements of $\mathrm{Pic}^0(C)$. These are obtained in general by determinantal constructions, and there is one binary quartic or ternary cubic form for each factor of $\mathrm{GL}_2$ or, respectively, $\mathrm{GL}_3$.

4 Integral representatives for Selmer elements

In this section, we describe how to find representatives in $V(\mathbb{Z})$ for rational orbits with integral invariants. More precisely, given $v' \in V^{\mathrm{ls}}(\mathbb{Q})$ such that the $G(\mathbb{Q})$-invariants of $v'$ are integers, we show that there exists an element $v \in V(\mathbb{Z})$ with the same invariants, up to absolutely bounded factors. To find such an integral representative $v$, we first consider the question locally, i.e., we find an element of $V(\mathbb{Z}_p)$ in the same $G(\mathbb{Q}_p)$-orbit as $v'$. Then strong approximation implies that these local representatives may be glued together to obtain an integral element of $V(\mathbb{Z})$. Our method to find $v$ will rely heavily on the parametrizations of §3.

Let $v'$ be an element of $V^{\mathrm{ls}}(\mathbb{Q})$ such that the $G(\mathbb{Q})$-invariants of $v'$ are integral, i.e., the generators of the $G(\mathbb{Q})$-invariant ring, when evaluated on $v'$, are integral. By Theorem 3.1, the $G(\mathbb{Q})$-orbit of $v'$ corresponds to (the equivalence class of) an elliptic curve $E$ in the family $F$ and a torsor $C$ for $E$ with a line bundle $L$ of degree $d$ on $C$. In particular, the elliptic curve $E$ has an affine integral Weierstrass model $E^\circ$, as an element of $F$ with coefficients equal to the invariants of $v'$.

Let $p$ be any prime. By assumption, the genus one curve $C$ has a $\mathbb{Q}_p$-point, so $C$ is isomorphic to its Jacobian $E$ over $\mathbb{Q}_p$. We may write the line bundle $L_{\mathbb{Q}_p}$ on $C_{\mathbb{Q}_p} \cong E_{\mathbb{Q}_p}$ as $\mathcal{O}(D)$ for a divisor $D = (d-1) \cdot O + Q$ with $Q \in E(\mathbb{Q}_p)$. There exists an automorphism of $C_{\mathbb{Q}_p}$ (as a genus one curve) taking $D = (d-1) \cdot O + Q$ to $(d-1) \cdot O + (Q + dQ')$ for any point $Q' \in E(\mathbb{Q}_p)$, so for the equivalence class of $(C, L)$, the point $Q$ in the divisor $D$ is only determined up to $dE(\mathbb{Q}_p)$. We claim that we may choose a representative of the $dE(\mathbb{Q}_p)$-coset of any $Q \in E(\mathbb{Q}_p)$ that is either the point $O$ at infinity or (almost) integral as a point on $E^\circ(\mathbb{Q}_p)$.

Lemma 4.1. Let $p$ be a prime and let $d = 2$ or $3$. Let $E$ be an elliptic curve over $\mathbb{Q}_p$ with an affine integral Weierstrass model $E^\circ$ (with coefficients in $\mathbb{Z}_p$).

• If $p \nmid d$, then every coset $E(\mathbb{Q}_p)/dE(\mathbb{Q}_p)$ has a representative that is either the point $O$ at infinity or a point $(x, y) \in E^\circ(\mathbb{Q}_p)$ with $x, y \in \mathbb{Z}_p$.

• If $p = d = 2$, then every coset $E(\mathbb{Q}_p)/dE(\mathbb{Q}_p)$ has a representative that is either $O$ or a point $(x, y) \in E^\circ(\mathbb{Q}_p)$ with $2^4 x, 2^6 y \in \mathbb{Z}_p$.

• If $p = d = 3$, then every coset $E(\mathbb{Q}_p)/dE(\mathbb{Q}_p)$ has a representative that is either $O$ or a point $(x, y) \in E^\circ(\mathbb{Q}_p)$ with $3^2 x, 3^3 y \in \mathbb{Z}_p$.

Proof. Let $\mathcal{E}$ denote the projective closure of the affine Weierstrass model $E^\circ$ in $\mathbb{P}^2$, and let $\overline{\mathcal{E}}$ denote the reduction of $\mathcal{E}$ modulo $p$.

If $p \nmid d$, then we need only consider the standard reduction map
\[ \mathcal{E}(\mathbb{Q}_p) \to \overline{\mathcal{E}}(\mathbb{F}_p). \]
All of the points in $\mathcal{E}(\mathbb{Q}_p)$ that are not in the preimage of the reduction of $O$ are integral (i.e., the coordinates on $E^\circ$ are in $\mathbb{Z}_p$) by definition of the reduction map. On the other hand, the kernel consists of exactly the $\mathbb{Z}_p$-points of the formal group $\hat{E}$ for $E$, which is $d$-divisible (in itself), and all of these points are in the $dE(\mathbb{Q}_p)$-coset of $O$.


If $p = d$, then this kernel is no longer $d$-divisible, but we only need to slightly modify this argument to allow for bounded denominators. For positive integers $i$, let $E_i$ denote the $\mathbb{Z}/p^i\mathbb{Z}$-points of the formal group $\hat{E}$, so the canonical reduction maps $E_{i+1} \to E_i$ are surjective with kernels $\mathbb{Z}/p\mathbb{Z}$. Note that $\hat{E}(\mathbb{Z}_p) = \varprojlim E_i$, and $E_1$ is trivial by definition. Let $K_i$ denote the kernel of the surjective map $\hat{E}(\mathbb{Z}_p) \to E_i$, which corresponds precisely to the subset of points in $E(\mathbb{Q}_p)$ that reduce to $O$ modulo $p^i$. Let $\nu_p$ denote the $p$-adic valuation of an element of $\mathbb{Q}_p$. If $(x, y) \in E^\circ(\mathbb{Q}_p)$ with $\nu_p(y) \le -3i$, then $\nu_p(x) = \frac{2}{3}\nu_p(y) \le -2i$ and the point $(x, y)$ corresponds to a point in $K_i$. Therefore, for all $(x, y) \notin K_i$, we have that
\[ \nu_p(x) \ge -2(i-1) \quad \text{and} \quad \nu_p(y) \ge -3(i-1) \tag{1} \]
since valuations are integers and $3\nu_p(x) = 2\nu_p(y)$ when either valuation is negative.

The argument above for $p \nmid d$ uses the fact that $K_1 = \hat{E}(\mathbb{Z}_p)$ is $d$-divisible. For $p = d$, we claim that $K_i$ is $p$-divisible in $K_{i-1}$ for $i \ge 3$ when $p = 2$ and $i \ge 2$ when $p = 3$. This follows from the fact that for all such $i$, the formal logarithm induces an identification of $K_{i-1}$ with $p^{i-1}\mathbb{Z}_p$ in a manner compatible with inclusion as $i$ changes [Sil92, Theorem IV.6.4(b)]. Thus, $K_i$ is $p$-divisible in $K_{i-1}$, so the points in $K_i$ are in the coset of $O$ in $E(\mathbb{Q}_p)/dE(\mathbb{Q}_p)$.

Therefore, for $p = d = 2$ or $3$, we see that the natural map
\[ (E(\mathbb{Q}_p) \setminus K_i) \cup \{O\} \hookrightarrow E(\mathbb{Q}_p) \to E(\mathbb{Q}_p)/pE(\mathbb{Q}_p) \]
is surjective for $i = 3$ or $2$, respectively. When $p = 2$, the inequalities (1) imply that $2^4 x, 2^6 y \in \mathbb{Z}_2$ for $(x, y) \in E^\circ(\mathbb{Q}_2)$ not in $K_3$; similarly, for $p = 3$, all $(x, y) \notin K_2$ have $3^2 x, 3^3 y \in \mathbb{Z}_3$, as desired.
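As a concrete illustration of the valuation bookkeeping in this proof, the following Python sketch computes 2-adic valuations for one rational point. The curve $y^2 = x^3 + 3x$, the point $(1/4, 7/8)$, and all function names are hypothetical choices made for this example and do not come from the paper. The point reduces to $O$ modulo 2, satisfies $3\nu_2(x) = 2\nu_2(y)$, and still meets the bounds (1) for $i = 3$.

```python
from fractions import Fraction

def valuation(q, p):
    """Return the p-adic valuation of a nonzero rational number q."""
    q = Fraction(q)
    if q == 0:
        raise ValueError("the valuation of 0 is infinite")
    num, den, v = q.numerator, q.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def on_curve(x, y, a4, a6):
    """Check exactly that (x, y) lies on y^2 = x^3 + a4*x + a6."""
    return y * y == x ** 3 + a4 * x + a6

def satisfies_bounds(x, y, p, i):
    """Check the inequalities (1): v_p(x) >= -2(i-1) and v_p(y) >= -3(i-1)."""
    return valuation(x, p) >= -2 * (i - 1) and valuation(y, p) >= -3 * (i - 1)

# Hypothetical example data: the point (1/4, 7/8) on y^2 = x^3 + 3x.
x, y = Fraction(1, 4), Fraction(7, 8)
vx, vy = valuation(x, 2), valuation(y, 2)
print(on_curve(x, y, 3, 0), vx, vy, satisfies_bounds(x, y, 2, 3))
```

In this example $2^4 x = 4$ and $2^6 y = 56$ are integers, in line with the second bullet of Lemma 4.1.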

For the remainder of this section, we call a point $(x, y) \in E^\circ(\mathbb{Q}_p)$ as in Lemma 4.1 almost integral. Given the elliptic curve $E$ over $\mathbb{Q}_p$ with affine integral model $E^\circ$ and the line bundle $d \cdot O$ or $(d-1) \cdot O + Q$, where $Q$ is an almost integral point of $E^\circ(\mathbb{Q}_p)$, we need to exhibit a corresponding element of $V(\mathbb{Z}_p)$ (or more precisely, an element of $V(\mathbb{Q}_p)$ with bounded denominators at 2 and 3). We list below such explicit elements in each case; note that we may control the powers of 2 and 3 in the denominators of all the entries:

1. Binary quartic forms. We slightly modify the formulas found in [CFS10, Section 3.2]. Let $E$ be the elliptic curve $y^2 = x^3 + a_4 x + a_6$, where $a_4, a_6 \in \mathbb{Z}_p$ and the discriminant $\Delta = -16(4a_4^3 + 27a_6^2)$ is nonzero.

(a) Suppose $L \cong \mathcal{O}(O + Q)$ for an almost integral $Q = (x_1, y_1) \in E^\circ(\mathbb{Q}_p)$. Set the binary quartic form
\[ f(w_1, w_2) = \tfrac{1}{4} w_1^4 - \tfrac{3}{2} x_1 w_1^2 w_2^2 - 2 y_1 w_1 w_2^3 + \left(-\tfrac{3}{4} x_1^2 - a_4\right) w_2^4. \tag{2} \]

(b) Suppose $L \cong \mathcal{O}(2O)$. Set the binary quartic form
\[ f(w_1, w_2) = w_1^3 w_2 + a_4 w_1 w_2^3 + a_6 w_2^4. \tag{3} \]

In both cases, the binary quartic form $f$ determines a genus one curve isomorphic to $E$ (i.e., the normalization of $z^2 = f(w_1, w_2)$), along with the appropriate line bundle: in the first case, $\mathcal{O}(O + Q)$ where $O$ and $Q$ are taken to $(w_1, w_2, z) = (1, 0, \tfrac{1}{2})$ and $(1, 0, -\tfrac{1}{2})$, respectively, and in the second case, $\mathcal{O}(2O)$ where $O$ is taken to the point $(z, w_1, w_2) = (0, 1, 0)$ on this model. The usual degree 2 and 3 invariants $I$ and $J$ of $f$ are $-3a_4$ and $-27a_6$, respectively, implying that the Jacobian of these models is the original elliptic curve.


2. Ternary cubic forms. Again, we modify the formulas found in [CFS10, Section 3.2]. Let $E$ be the elliptic curve given by $y^2 = x^3 + a_4 x + a_6$, where $a_4, a_6 \in \mathbb{Z}_p$ and the discriminant is nonzero.

(a) Suppose $L \cong \mathcal{O}(2O + Q)$ for an almost integral $Q = (x_1, y_1) \in E^\circ(\mathbb{Q}_p)$. Then the ternary cubic form
\[ Y^2 Z - X^2 Y + 3x_1 Y Z^2 + 2y_1 X Z^2 + (3x_1^2 + a_4) Z^3 \tag{4} \]
cuts out a curve in $\mathbb{P}^2$ for which the pullback of $\mathcal{O}(1)$ from $\mathbb{P}^2$ is isomorphic to $\mathcal{O}(2O + Q)$, where $O$ and $Q$ are taken to the points $[X, Y, Z] = [0, 1, 0]$ and $[1, 0, 0]$, respectively.

(b) Suppose $L \cong \mathcal{O}(3O)$. Then the ternary cubic form
\[ -Y^2 Z + X^3 + a_4 X Z^2 + a_6 Z^3 \tag{5} \]
clearly cuts out a curve in $\mathbb{P}^2$ that is isomorphic to $E$, and the pullback of $\mathcal{O}(1)$ from $\mathbb{P}^2$ is isomorphic to $\mathcal{O}(3O)$, where $O$ is taken to the point $[X, Y, Z] = [0, 1, 0]$.

3. Bidegree (2, 2) forms. Let $E$ be the elliptic curve $y^2 + a_3 y = x^3 + a_2 x^2 + a_4 x$ with the point $P = (0, 0)$ on $E^\circ(\mathbb{Q}_p)$. We first show that there exist divisors for both the line bundle $L$ and the bundle $L \otimes P$ that are the sum of almost integral points and/or the point $O$ at infinity.

We may assume from the earlier argument that the line bundle $L$ either is $\mathcal{O}(O + Q)$ for some almost integral point $Q = (x_1, y_1)$ on $E^\circ(\mathbb{Q}_p)$ or is $\mathcal{O}(2O)$. In the former case, the line bundle $L \otimes P$ is then isomorphic to $\mathcal{O}(O + Q')$, where $Q' = (x_2, y_2)$ is the sum of $Q$ and $P$; we have
\[ x_2 = \frac{y_1^2}{x_1^2} - a_2 - x_1 \quad \text{and} \quad y_2 = -\frac{y_1^3}{x_1^3} + a_2 \frac{y_1}{x_1} + y_1 - a_3. \tag{6} \]
If $Q'$ is not in $2E(\mathbb{Q}_p)$, then $Q'$ is also almost integral as a point on $E^\circ(\mathbb{Q}_p)$, which implies that $\frac{y_1}{x_1} \in \mathbb{Z}_p$ if $p \ne 2$ and $\frac{y_1}{x_1} \in \frac{1}{4}\mathbb{Z}_2$ if $p = 2$. Otherwise, we may translate the elliptic curve by half of $Q'$ to obtain isomorphisms $L \otimes P \cong \mathcal{O}(2O)$ and $L \cong \mathcal{O}(O + [-P])$; note that $-P = (0, -a_3)$ is also integral. We may use the automorphism of $E$ fixing $O$ and sending $P$ to $-P$ to reduce to the case where $L \cong \mathcal{O}(2O)$ (though the roles of $L$ and $L \otimes P$ will be switched). We thus reduce to two cases:

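(a) Suppose the line bundle $L$ is isomorphic to $\mathcal{O}(O + Q)$, and $L \otimes P$ is isomorphic to $\mathcal{O}(O + Q')$, where $Q = (x_1, y_1)$ and $Q' = (x_2, y_2) = Q + P$ are almost integral points on $E^\circ(\mathbb{Q}_p)$. The coordinates $x_2$ and $y_2$ are computed by (6), and we have that $y_1/x_1 \in \mathbb{Z}_p$ (or $\frac{1}{4}\mathbb{Z}_2$). Then the following bidegree (2, 2) form, with coordinates $([w_1, w_2], [z_1, z_2])$, recovers the elliptic curve $E$ with the same line bundles, while having the correct polynomial invariants:
\[
\begin{pmatrix} w_1^2 & w_1 w_2 & w_2^2 \end{pmatrix}
\begin{pmatrix}
0 & \frac{1}{2} & -\frac{y_1}{2x_1} \\
-\frac{1}{2} & \frac{y_1}{x_1} & -\frac{a_2}{2} - x_1 - \frac{x_2}{2} \\
-\frac{y_1}{2x_1} & \frac{a_2}{2} + \frac{x_1}{2} + x_2 & \frac{y_2 - y_1}{2}
\end{pmatrix}
\cdot
\begin{pmatrix} z_1^2 \\ z_1 z_2 \\ z_2^2 \end{pmatrix}.
\tag{7}
\]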

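(b) Suppose the line bundle $L$ is isomorphic to $\mathcal{O}(2O)$, and $L \otimes P$ is isomorphic to $\mathcal{O}(O + P)$. Then the bidegree (2, 2) form below
\[
\begin{pmatrix} w_1^2 & w_1 w_2 & w_2^2 \end{pmatrix}
\begin{pmatrix}
0 & 0 & -\frac{1}{2} \\
\frac{1}{2} & 0 & -\frac{a_2}{2} \\
0 & -\frac{a_3}{2} & -\frac{a_4}{2}
\end{pmatrix}
\cdot
\begin{pmatrix} z_1^2 \\ z_1 z_2 \\ z_2^2 \end{pmatrix}
\tag{8}
\]
has the desired properties.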


4. Rubik's cubes. The geometric data here is almost identical to bidegree (2, 2) forms (case 3), except that the degree of the line bundle is different. Let $E$ be the elliptic curve $y^2 + a_3 y = x^3 + a_2 x^2 + a_4 x$ with the point $P = (0, 0)$. By the same arguments as before, there are two cases:

(a) Suppose the line bundle $L$ is isomorphic to $\mathcal{O}(2O + Q)$ and $L \otimes P$ is isomorphic to $\mathcal{O}(2O + Q')$, where $Q = (x_1, y_1)$ and $Q' = (x_2, y_2) = Q + P$ are almost integral points on $E^\circ(\mathbb{Q}_p)$. Recall that $y_1/x_1$ is integral (or in $\frac{1}{3}\mathbb{Z}_3$ for $p = 3$). Then the following element of $\mathbb{Q}_p^3 \otimes \mathbb{Q}_p^3 \otimes \mathbb{Q}_p^3$ recovers this geometric data with the correct polynomial invariants:
\[
\begin{pmatrix} y_2 - y_1 & x_2 - x_1 & 0 \\ x_2 - x_1 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}
\quad
\begin{pmatrix} x_2 - x_1 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}
\quad
\begin{pmatrix} -a_2 - 2x_1 - x_2 & y_1/x_1 & 0 \\ -y_1/x_1 & 1 & 0 \\ -1 & 0 & 0 \end{pmatrix}
\tag{9}
\]
where these three matrices represent the $3 \times 3 \times 3$ array of coefficients. One checks that two of the ternary cubics obtained from this element are
\[ Y^2 Z - X^2 Y + (a_2 + 3x_i) Y Z^2 + (a_3 + 2y_i) X Z^2 + (a_4 + 2a_2 x_i + 3x_i^2) Z^3 \]
for $i = 1$ and $2$, which are the embeddings of $E$ by $\mathcal{O}(2O + Q)$ and $\mathcal{O}(2O + Q')$, respectively.

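(b) Suppose the line bundle $L$ is isomorphic to $\mathcal{O}(3O)$ and $L \otimes P \cong \mathcal{O}(2O + P)$. Then the following element of $\mathbb{Z}_p^3 \otimes \mathbb{Z}_p^3 \otimes \mathbb{Z}_p^3$ recovers $E$ with these line bundles and the correct polynomial invariants:
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 0 & -a_3 \end{pmatrix}
\quad
\begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & a_2 & a_4 \end{pmatrix}
\quad
\begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & -1 & 0 \end{pmatrix}
\tag{10}
\]
where these three matrices represent the $3 \times 3 \times 3$ array of coefficients. It is easy to check that two of the ternary cubics obtained from this Rubik's cube are exactly those giving the embeddings of $E$ by $\mathcal{O}(3O)$ and $\mathcal{O}(2O + P)$.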

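5. Doubly symmetric Rubik's cubes. Let $E$ be the elliptic curve $y^2 = x^3 + a_2 x^2 + a_4 x$ with the 2-torsion point $P = (0, 0)$. We again have two cases, where either $L \cong \mathcal{O}(2O + Q)$ for an almost integral point $Q = (x_1, y_1)$, or $L \cong \mathcal{O}(3O)$.

(a) Suppose $L \cong \mathcal{O}(2O + Q)$ for an almost integral point $Q = (x_1, y_1)$ on $E^\circ(\mathbb{Q}_p)$, and recall that we may assume $y_1/x_1$ is also integral (or in $\frac{1}{3}\mathbb{Z}_3$ for $p = 3$). Then the following element of $\mathbb{Q}_p^3 \otimes \mathrm{Sym}^2(\mathbb{Q}_p^3)$, represented as a triple of symmetric $3 \times 3$ matrices, has (almost) integral coefficients and the desired minimal invariants $a_2$ and $a_4$:
\[
\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\quad
\begin{pmatrix} -x_1 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & -1 & 1 \end{pmatrix}
\quad
\begin{pmatrix} -a_2 - x_1 & \frac{y_1}{x_1} & -\frac{y_1}{x_1} \\ \frac{y_1}{x_1} & 0 & 0 \\ -\frac{y_1}{x_1} & 0 & -1 \end{pmatrix}.
\tag{11}
\]
The ternary cubic form
\[ -X^2 Y + X^2 Z + x_1 Y^2 Z + (a_2 + x_1) Y Z^2 + 2\frac{y_1}{x_1} X Z^2 + \frac{y_1^2}{x_1^2} Z^3 \]
obtained from this element is $\mathrm{PGL}_3(\mathbb{Q}_p)$-equivalent to the ternary cubic form
\[ Y^2 Z - X^2 Y + (a_2 + 3x_1) Y Z^2 + 2y_1 X Z^2 + (a_4 + 2a_2 x_1 + 3x_1^2) Z^3, \]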


which by (4) corresponds to the embedding of $E$ via the line bundle $\mathcal{O}(2O + Q)$. Therefore, the genus one curve and degree 3 line bundle obtained via Theorem 3.1 are exactly $E$ with the line bundle $\mathcal{O}(2O + Q)$.

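(b) Suppose $L \cong \mathcal{O}(3O)$. Then the following element of $\mathbb{Z}_p^3 \otimes \mathrm{Sym}^2(\mathbb{Z}_p^3)$, represented as a triple of symmetric $3 \times 3$ matrices, has the desired minimal invariants $a_2$ and $a_4$:
\[
\begin{pmatrix} a_2 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}
\quad
\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\quad
\begin{pmatrix} a_4 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}
\tag{12}
\]
The genus one curve and degree 3 line bundle obtained via Theorem 3.1 correspond to the ternary cubic form $X^3 + a_2 X^2 Z - Y^2 Z + a_4 X Z^2$, which by (5) is exactly $E$ embedded using the line bundle $\mathcal{O}(3O)$.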

6. Triply symmetric hypercubes. Let $E$ be the elliptic curve $y^2 + a_1 xy + a_3 y = x^3$; note that the point $P = (0, 0)$ on $E^\circ$ has order 3 and is an element of $2E(\mathbb{Q}_p)$. Just as before, we may reduce to the cases where $L$ is isomorphic to $\mathcal{O}(O + Q)$, for some almost integral point $Q = (x_1, y_1)$ not equal to $O$ or $P$, or where $L$ is isomorphic to $\mathcal{O}(2O)$.

(a) For $L \cong \mathcal{O}(O + Q)$, with $Q = (x_1, y_1)$, the pair of binary cubic forms with coefficients
\[ (y_1, 0, 0, 1) \quad \text{and} \quad (0, -y_1, -x_1, -a_1) \tag{13} \]
is close to the desired integral element, but the invariants of this pair have extra factors of 2 and $y_1$ (namely, $2^i y_1^i a_i$ instead of $a_i$).

By applying transformations of the form $\begin{pmatrix} p^\alpha & 0 \\ 0 & p^\beta \end{pmatrix}$ in the two copies of $\mathrm{GL}_2(\mathbb{Q}_p)$ to (13), we obtain the following element with the desired minimal invariants, up to powers of 2 and units in $\mathbb{Z}_p$:
\[ (y_1 p^{-v}, 0, 0, p^{3w - v}) \quad \text{and} \quad (0, -y_1 p^{-2w}, -x_1 p^{-w}, -a_1) \tag{14} \]
where $w$ is the $p$-adic valuation of $x_1$ and $v$ is the $p$-adic valuation of $y_1$. Because $y_1$ divides $x_1^3$, we have that $3w - v$ is nonnegative, and all of the coordinates of (14) are integral. Dividing (14) by $2y_1 p^{-v}$, which is a unit in $\mathbb{Z}_p$ when $p \ne 2$ or $3$, produces a pair of binary cubic forms with fundamental polynomial invariants $a_1$ and $a_3$ and gives rise to the desired binary quartic form corresponding to the elliptic curve $E$ with the marked point $P$ and the line bundle $L$. (If $p = 2$ or $3$, then since the valuation of $2y_1 p^{-v}$ is bounded, we still obtain an element of $V(\mathbb{Q}_p)$ with bounded denominators.)

(b) For $L \cong \mathcal{O}(2O)$, the pair of binary cubics
\[ (1, 0, 0, a_3) \quad \text{and} \quad (0, 0, 1, a_1) \tag{15} \]
is an element of $V(\mathbb{Z}_p)$ with the invariants $a_1$ and $a_3$, and via Theorem 3.1 and (3), it is easy to compute that it gives rise to the elliptic curve $E$ with the marked point $P$ and the line bundle $\mathcal{O}(2O)$.

7. Hypercubes. Let $E$ be the elliptic curve $y^2 + a_1 xy + a_3 y = (x - a_2)(x - a_2')(x - a_2'')$ with the three points $(a_2, 0)$, $(a_2', 0)$, and $(a_2'', 0)$ summing to the identity point. Again, we consider the two cases where $L \cong \mathcal{O}(O + Q)$, for an almost integral point $Q = (x_1, y_1) \in E^\circ(\mathbb{Q}_p)$, or $L \cong \mathcal{O}(2O)$.


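(a) Suppose $L \cong \mathcal{O}(O + Q)$ for an almost integral point $Q = (x_1, y_1)$ of $E^\circ(\mathbb{Q}_p)$. Then the following element of $\mathbb{Q}_p^2 \otimes \mathbb{Q}_p^2 \otimes \mathbb{Q}_p^2 \otimes \mathbb{Q}_p^2$, represented as a $2 \times 2$ matrix of $2 \times 2$ matrices, gives the elliptic curve $E$ (up to isomorphism) and the line bundle $L$:
\[
\begin{array}{cc}
\begin{pmatrix} y_1 & 0 \\ 0 & 0 \end{pmatrix} & \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \\[2ex]
\begin{pmatrix} 0 & -y_1 \\ -y_1 & a_2 - x_1 \end{pmatrix} & \begin{pmatrix} -y_1 & a_2' - x_1 \\ a_2'' - x_1 & -a_1 \end{pmatrix}
\end{array}
\tag{16}
\]
In other words, the natural binary quartic arising from (16), via Theorem 3.1, indeed corresponds to the map of $E$ to $\mathbb{P}^1$ via the sections of $L$. The invariants coming from (16) are not quite the desired minimal invariants $a_1, a_2, a_2', a_2'', a_3$, however, as they are scaled by powers of $y_1$ (and 2); namely, instead of $a_i$, we obtain $2^i y_1^i a_i$. By applying transformations of the form $\begin{pmatrix} p^\alpha & 0 \\ 0 & p^\beta \end{pmatrix}$ in the four copies of $\mathrm{GL}_2(\mathbb{Q}_p)$ to (16), we obtain the following element with the desired minimal invariants, up to powers of 2 and units in $\mathbb{Z}_p$:
\[
\begin{array}{cc}
\begin{pmatrix} y_1 p^{-v} & 0 \\ 0 & 0 \end{pmatrix} & \begin{pmatrix} 0 & 0 \\ 0 & p^{w + w' + w'' - v} \end{pmatrix} \\[2ex]
\begin{pmatrix} 0 & -y_1 p^{-w - w'} \\ -y_1 p^{-w - w''} & (a_2 - x_1) p^{-w} \end{pmatrix} & \begin{pmatrix} -y_1 p^{-w' - w''} & (a_2' - x_1) p^{-w'} \\ (a_2'' - x_1) p^{-w''} & -a_1 \end{pmatrix}
\end{array}
\]
where $v$ is the $p$-adic valuation $v_p(y_1)$ and $w = v_p(a_2 - x_1)$, $w' = v_p(a_2' - x_1)$, and $w'' = v_p(a_2'' - x_1)$. Because $y_1$ divides $(x_1 - a_2)(x_1 - a_2')(x_1 - a_2'')$, the inequality $v \le w + w' + w''$ holds, and every entry of the above array is indeed integral. Dividing the top two $2 \times 2$ matrices by $2y_1 p^{-v}$, which is a unit in $\mathbb{Z}_p$ when $p \ne 2$ or $3$, gives a hypercube with the precise invariants $a_1, a_2, a_2', a_2'', a_3$. (If $p = 2$ or $3$, like in case 4, we see that the $p$-adic valuation of $2y_1 p^{-v}$ is still bounded, so the same hypercube still gives the correct invariants and has bounded denominators.)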

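(b) Suppose $L \cong \mathcal{O}(2O)$. Then the following element of $\mathbb{Z}_p^2 \otimes \mathbb{Z}_p^2 \otimes \mathbb{Z}_p^2 \otimes \mathbb{Z}_p^2$, represented as a $2 \times 2$ matrix of $2 \times 2$ matrices, recovers the elliptic curve $E$ and $L$ and has the desired invariants:
\[
\begin{array}{cc}
\begin{pmatrix} 1 & 0 \\ 0 & -a_2 \end{pmatrix} & \begin{pmatrix} 0 & -a_2' \\ -a_2'' & a_3 \end{pmatrix} \\[2ex]
\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} & \begin{pmatrix} 0 & 1 \\ 1 & a_1 \end{pmatrix}
\end{array}
\tag{17}
\]
The binary quartic obtained from (17) via Theorem 3.1 is
\[ w_1^3 w_2 + \tfrac{1}{4} a_1^2 w_1^2 w_2^2 + \left(\tfrac{1}{2} a_1 a_3 + a_2 a_2' + a_2' a_2'' + a_2 a_2''\right) w_1 w_2^3 + \left(-a_2 a_2' a_2'' + \tfrac{a_3^2}{4}\right) w_2^4, \]
which matches (3) in this case after changing the equation of $E$ into short Weierstrass form and applying a change of variables to the quartic.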
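The invariant computations for the binary quartics (2) and (3) can be checked in exact arithmetic. The sketch below assumes the normalization $I = 12ae - 3bd + c^2$ and $J = 72ace + 9bcd - 27ad^2 - 27eb^2 - 2c^3$ for a quartic $a w_1^4 + b w_1^3 w_2 + c w_1^2 w_2^2 + d w_1 w_2^3 + e w_2^4$; this normalization is our assumption, chosen because it reproduces the stated values $I = -3a_4$ and $J = -27a_6$. The sample curve $y^2 = x^3 - 2x + 4$ with the point $(3, 5)$ and all function names are hypothetical.

```python
from fractions import Fraction

def invariants_IJ(a, b, c, d, e):
    """Invariants I, J of a*w1^4 + b*w1^3*w2 + c*w1^2*w2^2 + d*w1*w2^3 + e*w2^4
    (normalization assumed as described in the text above)."""
    I = 12 * a * e - 3 * b * d + c * c
    J = (72 * a * c * e + 9 * b * c * d - 27 * a * d * d
         - 27 * e * b * b - 2 * c ** 3)
    return I, J

def quartic_2(x1, y1, a4):
    """Coefficients (a, b, c, d, e) of the quartic (2) for Q = (x1, y1)."""
    return (Fraction(1, 4), 0, Fraction(-3, 2) * x1, -2 * y1,
            Fraction(-3, 4) * x1 ** 2 - a4)

def quartic_3(a4, a6):
    """Coefficients (a, b, c, d, e) of the quartic (3)."""
    return (0, 1, 0, a4, a6)

# Hypothetical sample data: Q = (3, 5) on y^2 = x^3 - 2x + 4.
a4, a6 = -2, 4
x1, y1 = 3, 5
assert y1 ** 2 == x1 ** 3 + a4 * x1 + a6

print(invariants_IJ(*quartic_2(x1, y1, a4)))  # expected to equal (-3*a4, -27*a6)
print(invariants_IJ(*quartic_3(a4, a6)))      # expected to equal (-3*a4, -27*a6)
```

Both quartics return the same pair $(I, J)$, which is the numerical counterpart of the statement that the Jacobian of each model is the original curve.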

Therefore, in each case, from an element of $V^{\mathrm{ls}}(\mathbb{Q}_p)$, we obtain a $G(\mathbb{Q}_p)$-equivalent element of $V^{\mathrm{ls}}(\mathbb{Q}_p)$ with absolutely bounded denominators and the same invariants. In other words, we may find a $G(\mathbb{Q}_p)$-equivalent integral element in $V(\mathbb{Z}_p)$ with invariants up to absolutely bounded factors of 2 and 3. In each case, a standard argument on strong approximation (see, e.g., [Fis07, Lemmas 3.2 and 3.3] and [CFS10, Theorem 4.17]) allows us to patch together these local integral representatives into an integral model over $\mathbb{Q}$:


Theorem 4.2. For each of the cases in Table 1, for an element of $V^{\mathrm{ls}}(\mathbb{Q})$ such that its $G(\mathbb{Q})$-invariants are in $\mathbb{Z}$, there is a $G(\mathbb{Q})$-equivalent element of $V^{\mathrm{ls}}(\mathbb{Z})$ with the same invariants, up to absolutely bounded factors of 2 and 3.

The exact factors of 2 and 3 differ for each case but may be explicitly computed by the formulas in this section, though they are not needed for the later counting arguments. We obtain the following:

Corollary 4.3. Consider any of the cases in Table 1. Let $E$ be an elliptic curve in the family $F$ and let $d$ be the degree of the associated line bundle. Then the elements in the $d$-Selmer group $S_d(E)$ of $E$ are in bijection with $G(\mathbb{Q})$-equivalence classes on the set $V^{\mathrm{ls}}(\mathbb{Z})$ of locally soluble elements having invariants equal to the coefficients $M^i a_i$ (and $M^i a_i'$) of $E$, for some absolutely bounded integer $M$.
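The explicit representatives of this section can be spot-checked numerically. As one example, the sketch below verifies in exact arithmetic that the determinant of the pencil $XA + YB + ZC$, built from the three symmetric matrices of (11), agrees with the ternary cubic displayed immediately after (11). Treating that cubic as the determinant of the pencil is our reading of the construction, and the function names and sample values are hypothetical. The identity is polynomial in $x_1$, $y_1/x_1$, and $a_2$, so the check does not require $(x_1, y_1)$ to lie on the curve.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def pencil_cubic(x1, y1, a2, X, Y, Z):
    """det(X*A + Y*B + Z*C) for the three symmetric matrices of (11)."""
    lam = Fraction(y1, x1)
    A = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
    B = [[-x1, 0, 0], [0, 1, -1], [0, -1, 1]]
    C = [[-a2 - x1, lam, -lam], [lam, 0, 0], [-lam, 0, -1]]
    M = [[X * A[r][c] + Y * B[r][c] + Z * C[r][c] for c in range(3)]
         for r in range(3)]
    return det3(M)

def stated_cubic(x1, y1, a2, X, Y, Z):
    """The ternary cubic displayed after (11), evaluated at (X, Y, Z)."""
    lam = Fraction(y1, x1)
    return (-X * X * Y + X * X * Z + x1 * Y * Y * Z + (a2 + x1) * Y * Z * Z
            + 2 * lam * X * Z * Z + lam * lam * Z ** 3)

# Compare the two expressions on a small grid of integer points.
agree = all(
    pencil_cubic(3, 5, -2, X, Y, Z) == stated_cubic(3, 5, -2, X, Y, Z)
    for X in range(-2, 3) for Y in range(-2, 3) for Z in range(-2, 3)
)
print(agree)
```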

5 Construction of fundamental domains

For each case in Table 1, we wish to count irreducible $G(\mathbb{Z})$-orbits on $V(\mathbb{Z})$ having bounded height. We will accomplish this by counting suitable integer points in fundamental domains for the action of $G(\mathbb{Z})$ on $V(\mathbb{R})$.

5.1 Fundamental sets for the action of G(R) on V (R)

In this subsection, we would like to find a bounded fundamental set for the action of $G(\mathbb{R}) \times \mathbb{R}^\times$ on $V(\mathbb{R})$, or equivalently, a bounded fundamental set for the action of $G(\mathbb{R})$ on the elements of height at most 1 in $V(\mathbb{R})$. We break up $V^{\mathrm{stab}}(\mathbb{R})$, considered with its real topology, into its connected components $V^{(i)}$ for $i \in \{1, \ldots, N\}$. We prove the following:

Theorem 5.1. There exists an absolutely bounded fundamental set for the action of $G(\mathbb{R}) \times \mathbb{R}^\times$ on each component $V^{(i)}$.

A point $v \in V^{\mathrm{stab}}(\mathbb{R})$ is called $\mathbb{R}$-soluble if the genus one curve $C$ arising from $v$ under the parametrization theorems of §3 has a real point. Note that when $d = 2$, a point $v$ is $\mathbb{R}$-insoluble if and only if the corresponding binary quartic form is negative definite, and for $d = 3$, all points are $\mathbb{R}$-soluble. In each component $V^{(i)}$, the points $v \in V^{(i)}$ are either all $\mathbb{R}$-soluble or all not, and we need only study the $\mathbb{R}$-soluble components.

Let $m$ denote the number of independent polynomial invariants for the action of $G$ on $V$, i.e., the real dimension of $V(\mathbb{R})/G(\mathbb{R})$ or the complex dimension of $V(\mathbb{C})/G(\mathbb{C})$. There exists a $G(\mathbb{R})$-invariant map $\pi : V^{(i)} \to \mathbb{R}^m$ sending an element to its invariant vector $\vec{a}$, or equivalently, a map $\pi : V^{(i)}/G(\mathbb{R}) \to \mathbb{R}^m$ from orbits to invariant vectors.

For invariant vectors $\vec{a} \in \mathbb{R}^m$, we define several properties, such as height and discriminant, in the same way as for the corresponding elliptic curves $E_{\vec{a}}$ with coefficients $\vec{a}$ in the family $F$ as in §1. For example, let the height of $\vec{a} \in \mathbb{R}^m$ be the maximum of the values $|a_j|^{12/j}$, where $j$ denotes the degree of the invariant $a_j$. Let $\mathbb{R}^m_{H \le 1}$ denote the set of nondegenerate invariant vectors of height at most 1, where nondegenerate means that the corresponding discriminant form is nonzero (so the vector yields a genuine elliptic curve $E_{\vec{a}}$). It is clear that $\mathbb{R}^m_{H \le 1}$ is a bounded set in $\mathbb{R}^m$.
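To make the definition of height concrete, here is a minimal Python sketch that evaluates $\max_j |a_j|^{12/j}$ for an invariant vector given as a list of (degree, value) pairs. The pair representation, the function names, and the sample vector are our own illustrative choices. The sketch also checks the weighted homogeneity that makes the region of height at most 1 bounded: replacing each $a_j$ by $\lambda^j a_j$ multiplies the height by $\lambda^{12}$.

```python
def height(invariants):
    """Height of an invariant vector given as (degree j, value a_j) pairs."""
    return max(abs(a) ** (12 / j) for j, a in invariants)

def rescale(invariants, lam):
    """Replace each a_j by lam**j * a_j (weighted scaling of the invariants)."""
    return [(j, lam ** j * a) for j, a in invariants]

# Hypothetical sample vector (a_4, a_6) = (-2, 4), as for a curve in F_0.
vec = [(4, -2), (6, 4)]
h = height(vec)
h_scaled = height(rescale(vec, 2))
print(h, h_scaled, h_scaled / h)
```

A list of pairs rather than a dictionary keyed by degree is used so that families such as $F_2$, which has several invariants of degree 2, can be represented.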

In §4, for a given vector $\vec{a} \in \mathbb{R}^m$ of invariants, we gave explicit formulas for points $v(\vec{a}) \in V^{(i)}$ having $\pi(v(\vec{a})) = \vec{a}$, in other words, sections of $\pi$. Note that these $v(\vec{a})$ are all $\mathbb{R}$-soluble by construction. Because $\mathbb{R}^m_{H \le 1}$ is bounded, we would like to use the algebraic formulas for these sections to construct bounded fundamental sets in $V^{(i)}$ for the elements of height at most 1; the only complication is that the fibers of $\pi$ are not always a single orbit.


Recall that in the parametrization theorems of §3, for an element $v \in V^{(i)}$, the invariants $\pi(v)$ precisely correspond to the coefficients of the elliptic curve $E$ arising from $v$. Thus, given a nondegenerate vector $\vec{a}$ of invariants, the $G(\mathbb{R})$-orbits in $V(\mathbb{R})$ with invariant vector $\vec{a}$ correspond to isomorphism classes of pairs $(C, L)$ associated to the elliptic curve $E_{\vec{a}}$ with coefficients $\vec{a}$. For $d = 2$, there are either one or two $G(\mathbb{R})$-orbits associated to any nondegenerate $\vec{a}$, depending on whether the elliptic curve $E_{\vec{a}}$ has one or two real components, respectively; since this condition is locally constant, the nonempty fibers of $\pi$ are of constant size (either 1 or 2) for each $V^{(i)}$. For $d = 3$, the situation is simpler, since there is always only one $G(\mathbb{R})$-orbit for each nondegenerate invariant vector $\vec{a}$.

In the cases where $d = 3$, because the nonempty fibers of $\pi$ have size 1, we may directly use the algebraic formulas from §4 to find representatives $v(\vec{a}) \in V^{(i)}$ for each orbit corresponding to nondegenerate $\vec{a} \in \mathbb{R}^m_{H \le 1}$. For simplicity, we may use the formulas in part (b) of each case, where $L \cong \mathcal{O}(3O)$; it is clear that applying the formulas (which are polynomial in the invariants) to $\mathbb{R}^m_{H \le 1}$ produces a bounded fundamental set. In other words, we may break up $G(\mathbb{R}) \backslash V^{\mathrm{stab}}(\mathbb{R})$ into $R^{(i)} \subset V^{(i)}$ for $i \in \{1, \ldots, N\}$, where
\[ R^{(i)} := \{\lambda \cdot v(\vec{a}) : \lambda \in \mathbb{R}_{>0},\ \Delta(\vec{a}) > 0\} \tag{18} \]
or
\[ R^{(i)} := \{\lambda \cdot v(\vec{a}) : \lambda \in \mathbb{R}_{>0},\ \Delta(\vec{a}) < 0\} \tag{19} \]
in accordance with whether discriminants are positive or negative on $V^{(i)}$, respectively.

For $d = 2$, an identical argument constructs the sets $R^{(i)}$ for the components $V^{(i)}$ for which

the fibers of $\pi$ have size 1, by using the formulas from §4 where $L \cong \mathcal{O}(2O)$. However, for the components $V^{(i)}$ where the fibers have size 2, we need to find a representative $w(\vec{a}) \in V^{(i)}$ that represents the other orbit in $\pi^{-1}(\vec{a})$, for any $\vec{a} \in \mathbb{R}^m_{H \le 1}$. In terms of the geometric data, these orbits correspond to elliptic curves $E$ with two real components (and a trivial torsor $C$), where the degree 2 line bundles $L$ are isomorphic to $\mathcal{O}(O + Q)$ for any point $Q$ on the non-identity component of $E(\mathbb{R})$. Such elliptic curves over $\mathbb{R}$ are exactly those with positive discriminant and may be written in the form
\[ E_{I,J} : y^2 = x^3 - \frac{I}{3} x - \frac{J}{27} \]
with $4I^3 - J^2 > 0$ for $I, J \in \mathbb{R}$. The point $Q$ may be taken to be¹
\[ (x_1, y_1) = \left( -\frac{\sqrt{I}}{3},\ \frac{I^{3/4} \sqrt{2 - I^{-3/2} J}}{3\sqrt{3}} \right). \tag{20} \]

Note that the positive discriminant of $E$ implies that $-2 < I^{-3/2} J < 2$, so $Q \in E_{I,J}(\mathbb{R})$.

Thus, in each of the remaining $V^{(i)}$, for a vector $\vec{a} \in \mathbb{R}^m_{H \le 1}$ where $\pi^{-1}(\vec{a})$ consists of two orbits, the elliptic curve $E_{\vec{a}}$ is isomorphic to an elliptic curve $E_{I,J}$ for some $I, J \in \mathbb{R}$ with $4I^3 - J^2 > 0$. A simple change of variables to put $E_{\vec{a}}$ into short Weierstrass form yields polynomial formulas for $I$ and $J$ in terms of the $a$-invariants. Then using the formulas in §4 with $L \cong \mathcal{O}(O + Q)$ for $Q$ as in (20), we obtain $w(\vec{a}) \in V^{(i)}$ as desired. It remains to check that this process gives a bounded set of $w(\vec{a})$ in $V^{(i)}$; the formulas are algebraic in the $a$-invariants but there are some negative exponents. Observe, however, that the only denominators occur in the definition of $y_1$ in (20) and in the expression $y_1/x_1$ in some of the formulas. For the first, as noted above, the term under the square root in the formula for $y_1$ is bounded between 0 and 4, so the possible values for $y_1$ itself are bounded (since $I$ is also bounded). For the second, observe that $y_1/x_1 = -I^{1/4} \sqrt{2 - I^{-3/2} J} / \sqrt{3}$, which is also bounded for similar reasons. Therefore, in these cases, the union of the set of $w(\vec{a})$ and the set of $v(\vec{a})$, for $\vec{a} \in \mathbb{R}^m_{H \le 1}$, is a bounded fundamental set for $G(\mathbb{R}) \times \mathbb{R}^\times$ acting on $V(\mathbb{R})$.

¹ One may derive the formula (20) for the point $Q$ from the binary quartic forms in $L^{(1)}$ in [BS15a, Table 1], since those quartics yield genus one curves isomorphic to the elliptic curves $E_{1,J}$ with hyperelliptic map given by the line bundle $\mathcal{O}(O + Q)$ for $I = 1$.
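The claim that the point $Q$ of (20) lies on $E_{I,J}$, together with the closed form for $y_1/x_1$, can be checked numerically. The sketch below reads (20) as $x_1 = -\sqrt{I}/3$ and $y_1 = I^{3/4}\sqrt{2 - I^{-3/2}J}/(3\sqrt{3})$; the function names and sample values of $(I, J)$ are hypothetical, and floating-point arithmetic is used, so equalities hold only up to rounding.

```python
import math

def point_Q(I, J):
    """The point (x1, y1) of (20) on E_{I,J}: y^2 = x^3 - (I/3) x - J/27."""
    if not (I > 0 and 4 * I ** 3 - J ** 2 > 0):
        raise ValueError("need I > 0 and 4*I^3 - J^2 > 0")
    x1 = -math.sqrt(I) / 3
    y1 = I ** 0.75 * math.sqrt(2 - I ** -1.5 * J) / (3 * math.sqrt(3))
    return x1, y1

def residual(I, J):
    """y1^2 minus the right-hand side of the curve equation at x1."""
    x1, y1 = point_Q(I, J)
    return y1 ** 2 - (x1 ** 3 - (I / 3) * x1 - J / 27)

def slope(I, J):
    """The ratio y1/x1 that appears in some of the formulas of Section 4."""
    x1, y1 = point_Q(I, J)
    return y1 / x1

# Hypothetical sample invariants with 4*I^3 - J^2 > 0.
for I, J in [(1.0, 0.5), (3.0, -4.0), (0.25, 0.1)]:
    print(I, J, residual(I, J), slope(I, J))
```

The residuals printed are zero up to rounding error, and the slopes agree with $-I^{1/4}\sqrt{2 - I^{-3/2}J}/\sqrt{3}$.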

For these components $V^{(i)}$, in analogy with (18) and (19), we thus have
\[ R^{(i)} := \{\lambda \cdot v(\vec{a}) : \lambda \in \mathbb{R}_{\ge 0},\ \Delta(\vec{a}) > 0\} \cup \{\lambda \cdot w(\vec{a}) : \lambda \in \mathbb{R}_{\ge 0},\ \Delta(\vec{a}) > 0\} \]
as fundamental sets.

Let $R^{(i)}(X)$ be the set of all elements in $R^{(i)}$ having height less than $X$. Since $H(\lambda \cdot v) = \lambda^k H(v)$, we see that the coefficients of all the elements in $R^{(i)}(X)$ are bounded by $O(X^{1/k})$. Note also that for any $g \in G(\mathbb{R})$, the set $g \cdot R^{(i)}$ is also a fundamental domain for the action of $G(\mathbb{R})$ on $V^{(i)}$. Furthermore, for any compact set $G_0 \subset G(\mathbb{R})$, the coefficients of elements in $g \cdot R^{(i)}$, with $g \in G_0$, having height less than $X$, are bounded by $O(X^{1/k})$, where the implied constant depends only on $G_0$.

Note that in each component, the groups $E[d](\mathbb{R})$ and $E(\mathbb{R})/dE(\mathbb{R})$ are locally constant, where $E$ refers to the elliptic curve arising from a given point $v$. By the parametrization theorems, the generic stabilizer in $G(\mathbb{R})$ for an element in $V^{(i)}$ corresponds to $E[d](\mathbb{R})$. We denote the cardinality of the generic stabilizer in $G(\mathbb{R})$ for an element in $V^{(i)}$ by $n_i$.

5.2 Arithmetic reduction theory: fundamental domains for G(Z) acting on V (R)

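Let $F$ denote a fundamental domain for the left action of $G(\mathbb{Z})$ on $G(\mathbb{R})$ that is Haar-measurable and contained in a standard Siegel set [BHC62, §2]. As the group $G$ is a finite index quotient of the product of finitely many $\mathrm{SL}_2$'s or $\mathrm{SL}_3$'s, we may write $F$ naturally as a subset of a product of $F_2$'s or $F_3$'s, where $F_2$ (resp. $F_3$) denotes a fundamental domain for the left action of $\mathrm{SL}_2(\mathbb{Z})$ on $\mathrm{SL}_2(\mathbb{R})$ (resp. $\mathrm{SL}_3(\mathbb{Z})$ on $\mathrm{SL}_3(\mathbb{R})$). To arrange $F$ to lie in a Siegel domain, we simply take $F_2$ and $F_3$ to lie in Siegel domains; explicitly, we may take $F_2 = \{\nu\alpha\kappa : \nu(x) \in N'(\alpha),\ \alpha(s) \in A',\ \kappa \in K\}$, where
\[
N'(\alpha) = \left\{ \begin{pmatrix} 1 & \\ x & 1 \end{pmatrix} : x \in I(\alpha) \right\}, \qquad
A' = \left\{ \begin{pmatrix} s^{-1} & \\ & s \end{pmatrix} : s \ge \sqrt{3}/2 \right\},
\]
and $K$ is as usual the (compact) real orthogonal group $\mathrm{SO}_2(\mathbb{R})$; here $I(\alpha)$ is a union of one or two subintervals of $[-\frac{1}{2}, \frac{1}{2}]$ depending only on the value of $\alpha \in A'$. Similarly, $F_3 = \{\nu\alpha\kappa : \nu(x, x', x'') \in N'(\alpha),\ \alpha(t, u) \in A',\ \kappa \in K\}$, where

$K$ = subgroup $\mathrm{SO}_3(\mathbb{R}) \subset \mathrm{GL}_3^+(\mathbb{R})$ of orthogonal transformations;

$A' = \{\alpha(t, u) : t, u > c\}$, where $\alpha(t, u) = \begin{pmatrix} t^{-2} u^{-1} & & \\ & t u^{-1} & \\ & & t u^2 \end{pmatrix}$;

$N' = \{\nu(x, x', x'') : (x, x', x'') \in I'(\alpha)\}$, where $\nu(x, x', x'') = \begin{pmatrix} 1 & & \\ x & 1 & \\ x' & x'' & 1 \end{pmatrix}$;

here $I'(\alpha)$ is a measurable subset of $[-1/2, 1/2]^3$ dependent only on $\alpha \in A'$, and $c > 0$ is an absolute constant.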


When $F$ is a subset of a product of multiple $F_2$'s or $F_3$'s, we use subscripts to distinguish the coordinates on these $F_2$'s or $F_3$'s. Thus, for example, in the case $V = 2 \otimes 2 \otimes 2 \otimes 2$, the coordinates on $G(\mathbb{Z}) \backslash G(\mathbb{R})$ will be the $s_i$ ($i \in \{1, 2, 3, 4\}$), $x_i$ ($i \in \{1, 2, 3, 4\}$), and the $\kappa_i \in K_i$ ($i \in \{1, 2, 3, 4\}$).

We may now construct fundamental sets for the action of $G(\mathbb{Z})$ on $V(\mathbb{R})$. Namely, for any $h \in G(\mathbb{R})$ we see that $F h R^{(i)}$ is the union of $n_i$ fundamental domains for the action of $G(\mathbb{Z})$ on $V^{(i)}$; here, we regard $F h R^{(i)}$ as a multiset, where the multiplicity of a point $x$ in $F h R^{(i)}$ is given by the cardinality of the set $\{g \in F : x \in g h R^{(i)}\}$. (See [BS15a, §2.1] for a more detailed explanation.) Thus a $G(\mathbb{Z})$-equivalence class $x$ in $V^{(i)}$ is represented in this multiset $\sigma(x)$ times, where $\sigma(x) = \#\mathrm{Stab}_{G(\mathbb{R})}(x) / \#\mathrm{Stab}_{G(\mathbb{Z})}(x)$. In particular, $\sigma(x)$ for $x \in V^{(i)}$ is always a number between 1 and $n_i$.

For any $G(\mathbb{Z})$-invariant set $S \subset V^{(i)} \cap V(\mathbb{Z})$, let $N(S; X)$ denote the number of $G(\mathbb{Z})$-equivalence classes of irreducible elements $B \in S$ satisfying $H(B) < X$. Then we conclude that, for any $h \in G(\mathbb{R})$, the product $n_i \cdot N(S; X)$ is exactly equal to the number of irreducible integer points in $F h R^{(i)}$ having height less than $X$, with the slight caveat that the (relatively rare; see Lemma 7.5) points with $G(\mathbb{Z})$-stabilizers of cardinality $r$ ($r > 1$) are counted with weight $1/r$.

As mentioned earlier, the main obstacle to counting integer points of bounded height in a single domain $F h R^{(i)}$ is that the relevant region is not compact, but rather has cusps going off to infinity. We simplify the counting in this cuspidal region by “thickening” the cusp; more precisely, we compute the number of integer points of bounded height in the region $F h R^{(i)}$ by averaging over many such fundamental regions, i.e., by averaging over the domains $F h R^{(i)}$ where $h$ ranges over a certain compact subset $G_0 \subset G(\mathbb{R})$. The method, which is an extension of the method of [Bha10], is described in Section 7.

6 Some sufficient conditions for reducibility in V (Z)

A simple sufficient condition for a binary quartic form $ax^4 + bx^3y + cx^2y^2 + dxy^3 + ey^4$ to be reducible is that its coefficient $a$ of $x^4$ is 0. Similarly, a ternary cubic form $f(x, y, z)$ is reducible if the coefficients of $x^3$, $x^2y$, and $xy^2$ all simultaneously vanish, or if the coefficients of $x^3$, $x^2y$, and $x^2z$ all simultaneously vanish; indeed, in both of these cases, we see that $[1 : 0 : 0]$ is then a flex of $f$, i.e., both $f$ and the Hessian of $f$ vanish at $[1 : 0 : 0]$.

For each of the representations $V$ in Cases 3–7 of Table 1, we now provide some analogous sufficient conditions which guarantee that a point in $V(\mathbb{Q})$ is reducible. We begin with the space of hypercubes $(b_{ijk\ell})$ ($i, j, k, \ell \in \{1, 2\}$) in $V(\mathbb{Q}) = \mathbb{Q}^2 \otimes \mathbb{Q}^2 \otimes \mathbb{Q}^2 \otimes \mathbb{Q}^2$. Note that $V(\mathbb{Q})$ has a natural action by $S_4$ given by permuting the tensor factors.

Lemma 6.1. Let $B = (b_{ijk\ell})$ be an element in $V(\mathbb{Q}) = \mathbb{Q}^2 \otimes \mathbb{Q}^2 \otimes \mathbb{Q}^2 \otimes \mathbb{Q}^2$ such that, after a suitable action by an element of $S_4$, all the coordinates in at least one of the following sets vanish:

(i) $\{b_{1111}, b_{1112}, b_{1121}, b_{1122}\}$

(ii) $\{b_{1111}, b_{1112}, b_{1121}, b_{1211}\}$

Then $B$ is reducible.

Proof. In both cases (i) and (ii), one checks that $y$ is a linear factor of $\mathrm{Disc}(b_{1ijk} x + b_{2ijk} y)$, and hence $B$ is reducible.
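The check in this proof can be carried out mechanically. The sketch below assumes that $\mathrm{Disc}$ of a $2 \times 2 \times 2$ cube denotes Cayley's hyperdeterminant, computed here as the discriminant of the binary quadratic $\det(M_0 u + M_1 v)$ attached to the two slices $M_0, M_1$ of the cube; that interpretation, the zero-based indexing, the function names, and the sample entries are our assumptions for illustration. Since the $x^4$ coefficient of the binary quartic $\mathrm{Disc}(b_{1ijk} x + b_{2ijk} y)$ is its value at $(x, y) = (1, 0)$, it suffices to see that this value vanishes in cases (i) and (ii). The $S_4$ action is not modelled.

```python
def cube_disc(a):
    """Discriminant of a 2x2x2 array a[j][k][l], computed as the
    discriminant B^2 - 4AC of the binary quadratic det(a[0]*u + a[1]*v)."""
    A = a[0][0][0] * a[0][1][1] - a[0][0][1] * a[0][1][0]
    C = a[1][0][0] * a[1][1][1] - a[1][0][1] * a[1][1][0]
    B = (a[0][0][0] * a[1][1][1] + a[1][0][0] * a[0][1][1]
         - a[0][0][1] * a[1][1][0] - a[0][1][0] * a[1][0][1])
    return B * B - 4 * A * C

def quartic_value(b, x, y):
    """Value at (x, y) of Disc(b_1jkl*x + b_2jkl*y) for a hypercube b[i][j][k][l]."""
    cube = [[[x * b[0][j][k][l] + y * b[1][j][k][l] for l in range(2)]
             for k in range(2)] for j in range(2)]
    return cube_disc(cube)

def sample_hypercube(zero_positions=()):
    """Hypothetical integer hypercube with the listed (i, j, k, l) entries set to 0."""
    b = [[[[(4 * j + 2 * k + l + 1) ** 2 + 7 * i for l in range(2)]
           for k in range(2)] for j in range(2)] for i in range(2)]
    for i, j, k, l in zero_positions:
        b[i][j][k][l] = 0
    return b

# Zero-based versions of the sets (i) and (ii) in Lemma 6.1.
SET_I = [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0), (0, 0, 1, 1)]
SET_II = [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0), (0, 1, 0, 0)]

for zeros in (SET_I, SET_II):
    b = sample_hypercube(zeros)
    # The x^4 coefficient of the quartic is its value at (1, 0).
    print(quartic_value(b, 1, 0))
```

For the unmodified sample hypercube the same coefficient is nonzero, so the vanishing is caused by the zero pattern and not by the particular entries chosen.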

Note that the two spaces $V(\mathbb{Q}) = \mathbb{Q}^2 \otimes \mathrm{Sym}^3 \mathbb{Q}^2$ and $V(\mathbb{Q}) = \mathrm{Sym}^2 \mathbb{Q}^2 \otimes \mathrm{Sym}^2 \mathbb{Q}^2$ may be viewed as the spaces of triply symmetric and doubly doubly symmetric hypercubes $(b_{ijk\ell})$ over $\mathbb{Q}$, respectively. Thus the reducibility criteria given in Lemma 6.1 apply to these spaces too, by viewing each as a subspace of the space $\mathbb{Q}^2 \otimes \mathbb{Q}^2 \otimes \mathbb{Q}^2 \otimes \mathbb{Q}^2$ of hypercubes. However, there are some cases for these two spaces that are not quite covered by Lemma 6.1, and hence we state lemmas for these two spaces separately below.

Lemma 6.2. Let $B = (b_{ijk\ell})$ be an element in $V(\mathbb{Q}) = \mathbb{Q}^2 \otimes \mathrm{Sym}^3 \mathbb{Q}^2$ such that all the coordinates in at least one of the following sets vanish:

(i) $\{b_{1111}, b_{1112}\}$

(ii) $\{b_{1111}, b_{2111}\}$

Then $B$ is reducible.

Proof. In case (i), we see that $y$ is a factor of $\mathrm{Disc}(b_{1ijk} x + b_{2ijk} y)$. In case (ii), by replacing the cubical matrix $b_{1ijk}$ by a suitable $\mathbb{Q}$-linear combination of $b_{1ijk}$ and $b_{2ijk}$, we may transform $B$ (by an element of $G(\mathbb{Q})$) so that $b_{1112}$ is zero. Since $b_{1111}$ will remain zero, we are then in case (i). Hence $B$ is reducible in either case.

The space $V(\mathbb{Q}) = \mathrm{Sym}^2 \mathbb{Q}^2 \otimes \mathrm{Sym}^2 \mathbb{Q}^2$ has a natural action by $S_2$, given by permuting the tensor factors.

Lemma 6.3. Let $B = (b_{ijk\ell})$ be an element in $V(\mathbb{Q}) = \mathrm{Sym}^2 \mathbb{Q}^2 \otimes \mathrm{Sym}^2 \mathbb{Q}^2$ such that, after a suitable action by an element of $S_2$, we have $b_{1111} = b_{1112} = 0$. Then $B$ is reducible.

Proof. The covariant binary quartics have no quartic or cubic terms and thus are reducible.

We next turn to the space of Rubik’s cubes (b_{ijk}) (i, j, k ∈ {1, 2, 3}) in V(Q) = Q^3 ⊗ Q^3 ⊗ Q^3. The space V(Q) has a natural action by S_3, again given by permuting the tensor factors.

Lemma 6.4. Let B = (b_{ijk}) be an element in V(Q) = Q^3 ⊗ Q^3 ⊗ Q^3 such that, after a suitable action by an element of S_3, all the coordinates in at least one of the following sets vanish:

(i) {b111, b112, b113, b121, b122, b123}

(ii) {b111, b112, b113, b121, b122, b131, b211, b212, b221}

(iii) {b111, b112, b113, b121, b131, b211, b311}

(iv) {b111, b112, b121, b122, b211, b212, b221, b222}

Then B is reducible.

Proof. In all cases, we see that the curve in P^2 defined by det(b_{1ij} x + b_{2ij} y + b_{3ij} z) = 0 is not smooth or has a flex at the point (1 : 0 : 0). Hence B is reducible in all cases.

Finally, the space V(Q) = Q^3 ⊗ Sym^2 Q^3 may be viewed as the space (b_{ijk}) of doubly symmetric Rubik’s cubes over Q, and thus the reducibility criteria in Lemma 6.4 apply also to this space. However, again, there is a case that we will need that is not quite covered by Lemma 6.4, and so we state the corresponding lemma for Q^3 ⊗ Sym^2 Q^3 separately.

Lemma 6.5. Let B = (b_{ijk}) be an element in V(Q) = Q^3 ⊗ Sym^2 Q^3 such that all the coordinates in at least one of the following sets vanish:

(i) {b111, b112, b113, b122, b123}

(ii) {b111, b112, b113, b122, b211, b212}

(iii) {b111, b112, b113, b211, b212, b213}

(iv) {b111, b112, b122, b211, b212, b222}

(v) {b111, b211, b311}

Then B is reducible.

Proof. In cases (i)–(iv), we see that the curve in P^2 defined by det(b_{1ij} x + b_{2ij} y + b_{3ij} z) = 0 is not smooth or has a flex at the point (1 : 0 : 0). In case (v), by replacing the matrix b_{1ij} by a suitable Q-linear combination of b_{1ij}, b_{2ij}, and b_{3ij}, we may transform B (by an element of G(Q)) so that b_{112} and b_{113} are zero. Since b_{111} will remain zero, we are then in case (iii) of Lemma 6.4. Hence B is reducible in all cases (i)–(v).

7 Counting irreducible elements of bounded height

In this section, we derive asymptotics for the number of G(Z)-equivalence classes of irreducible elements of V(Z) having bounded invariants. We also describe how these asymptotics change when we restrict to counting elements in V(Z) satisfying a finite set of congruence conditions.

Let V^{(i)} (i ∈ {1, . . . , N}) denote again the components of V^{stab}(R), and let

$$c_i = \frac{\mathrm{Vol}\bigl(\mathcal{F}R^{(i)} \cap \{v \in V(\mathbb{R}) : H(v) < 1\}\bigr)}{n_i}.$$

Then in this section we prove the following theorem:

Theorem 7.1. Fix i ∈ {1, . . . , N}. For any G(Z)-invariant set S ⊂ V(Z)^{(i)} := V(Z) ∩ V^{(i)}, let N(S;X) denote the number of G(Z)-equivalence classes of irreducible elements B ∈ S satisfying H(B) < X. Then

$$N(V(\mathbb{Z})^{(i)};X) = c_i X^{n/k} + o(X^{n/k}).$$

7.1 Averaging over fundamental domains

Let G_0 be a compact, semialgebraic, left K-invariant set in G(R) that is the closure of a nonempty open set and in which every element has determinant greater than or equal to 1. Let V(Z)^{irr} denote the subset of elements of V(Z) that are irreducible. Then for any i ∈ {1, . . . , N}, we may write

$$N(V(\mathbb{Z})^{(i)};X) = \frac{\int_{h\in G_0} \#\{x \in \mathcal{F}hR \cap V(\mathbb{Z})^{\mathrm{irr}} : H(x) < X\}\,dh}{n_i \cdot \int_{h\in G_0} dh},$$

where R is equal to R^{(i)}. The denominator of the latter expression is an absolute constant $C^{(i)}_{G_0}$ greater than zero.

More generally, for any G(Z)-invariant subset S ⊂ V(Z)^{(i)}, let N(S;X) denote the number of irreducible G(Z)-orbits in S having height less than X. Let S^{irr} denote the subset of irreducible points of S. Then N(S;X) can be similarly expressed as

$$N(S;X) = \frac{\int_{h\in G_0} \#\{x \in \mathcal{F}hR \cap S^{\mathrm{irr}} : H(x) < X\}\,dh}{C^{(i)}_{G_0}}. \tag{21}$$

We use (21) to define N(S;X) even for sets S ⊂ V(Z) that are not necessarily G(Z)-invariant. As in [BS15a, Thm. 2.5], we may write N(S;X) alternatively as

$$N(S;X) = \frac{1}{C^{(i)}_{G_0}} \int_{g \in N'(\alpha)A'K} \#\{x \in S^{\mathrm{irr}} \cap \nu\alpha\kappa G_0 R : H(x) < X\}\,dg,$$

where dg is a Haar measure on G(R). Explicitly, if we write G as a finite quotient of $\prod_i \mathrm{SL}_2$ or $\prod_i \mathrm{SL}_3$, where the i-th factor of SL has Iwasawa decomposition N_i A_i K_i, then we have

$$dg = \prod_i s_i^{-2}\,du_i\,d^\times s_i\,d\kappa_i \quad\text{or}\quad \prod_i t_i^{-6}u_i^{-6}\,d\nu_i\,d^\times t_i\,d^\times u_i\,d\kappa_i,$$

respectively, where dν_i and dκ_i are invariant measures on N_i and K_i, respectively. We normalize the invariant measure dκ_i on K_i so that $\int_{K_i} d\kappa_i = 1$.

Let us write E(ν, α, X) = ναG_0R ∩ {x ∈ V^{(i)} : H(x) < X}, again viewed as a multiset. As KG_0 = G_0 and $\int_K dk = 1$, we have

$$N(S;X) = \frac{1}{C^{(i)}_{G_0}} \int_{g\in N'(\alpha)A'} \#\{x \in S^{\mathrm{irr}} \cap E(\nu,\alpha,X)\}\,dg. \tag{22}$$

We note that the same counting method may be used even if we are interested in counting both reducible and irreducible orbits in V(Z). For any set S ⊂ V^{(i)}, let N^*(S;X) be defined by (22), but where the superscript “irr” is removed:

$$N^*(S;X) = \frac{1}{C^{(i)}_{G_0}} \int_{g\in N'(\alpha)A'} \#\{x \in S \cap E(\nu,\alpha,X)\}\,dg. \tag{23}$$

Thus for a G(Z)-invariant set S ⊂ V^{(i)}, N^*(S;X) counts the total (weighted) number of G(Z)-orbits in S having height less than X (not just the irreducible ones).

The expression (22) for N(S;X), and its analogue (23) for N^*(S;X), will be useful to us in the sections that follow.

7.2 An estimate from the geometry of numbers

To estimate the number of lattice points in E(ν, α, X), we have the following result due to Davenport [Dav51].

Proposition 7.2. Let $\mathcal{R}$ be a bounded semialgebraic multiset in $\mathbb{R}^n$ having maximum multiplicity m and defined by at most k polynomial inequalities each having degree at most ℓ. Let $\mathcal{R}'$ denote the image of $\mathcal{R}$ under any (upper or lower) triangular unipotent transformation of $\mathbb{R}^n$. Then the number of integral lattice points, counted with multiplicity, contained in the region $\mathcal{R}'$ is

$$\mathrm{Vol}(\mathcal{R}) + O(\max\{\mathrm{Vol}(\bar{\mathcal{R}}), 1\}),$$

where $\mathrm{Vol}(\bar{\mathcal{R}})$ denotes the greatest d-dimensional volume of any projection of $\mathcal{R}$ onto a coordinate subspace obtained by equating n − d coordinates to zero, for any d between 1 and n − 1. The implied constant in the second summand depends only on n, m, k, and ℓ.

Although Davenport states Proposition 7.2 only for compact semialgebraic sets, his proof adapts without significant change to the more general case of a bounded semialgebraic multiset $\mathcal{R} \subset \mathbb{R}^n$, with the same estimate applying also to any image $\mathcal{R}'$ of $\mathcal{R}$ under a unipotent triangular transformation.
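As a toy illustration of the shape of this estimate (not of the regions actually used in this section), one can count lattice points in a disc of integer radius r in the plane. Here n = 2, the volume is πr^2, and the largest projection onto a coordinate axis has length 2r, so the estimate predicts an error of size O(r). The function names and the constant 4 used in the check are illustrative assumptions.

```python
import math

def lattice_points_in_disc(r):
    """Count integer points (x, y) with x^2 + y^2 <= r^2, for an integer r >= 0."""
    count = 0
    for x in range(-r, r + 1):
        y_max = math.isqrt(r * r - x * x)   # largest y with x^2 + y^2 <= r^2
        count += 2 * y_max + 1
    return count

def disc_report(r):
    """Return (lattice count, area, length of the projection onto an axis)."""
    return lattice_points_in_disc(r), math.pi * r * r, 2 * r

for r in (5, 10, 50, 200):
    count, volume, projection = disc_report(r)
    ratio = abs(count - volume) / max(projection, 1)
    print(r, count, round(volume, 1), round(ratio, 3))
```

The last printed column is the error divided by max{2r, 1}; the estimate says this ratio stays bounded as r grows.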

7.3 Cutting off the cusps

The following proposition shows that the number of points B ∈ $\mathcal{F}hR^{(i)}$ ∩ V(Z) having bounded height, where the coordinate of lowest weight (namely, b_{1111} or b_{111}) vanishes, is negligible.

Proposition 7.3. Let h take a random value in G_0 uniformly with respect to the Haar measure dg. Then the expected number of irreducible elements B ∈ $\mathcal{F}hR^{(i)}$ ∩ V(Z) such that H(B) < X and b_{1111} = 0 (resp. b_{111} = 0) is $O_\varepsilon(X^{(n-1)/k+\varepsilon})$.

Proof. We follow the method in [Bha10]. Namely, we divide the set of all B ∈ V(Z) into a number of cases depending on which initial coordinates are zero and which are nonzero. These cases are described in the second columns of Tables 2(a)–(e). The vanishing conditions in the various subcases of Case m + 1 are obtained by setting equal to 0, one at a time, each variable that was assumed to be nonzero in Case m. If such a resulting subcase satisfies the reducibility conditions of the corresponding lemma among Lemmas 6.1–6.5, it is not listed. In this way, it becomes clear that any irreducible element in V(Z) must satisfy precisely one of the conditions enumerated in the second column of the corresponding table.

Let T denote the set of all n variables b_{ijkℓ} (or b_{ijk}) corresponding to the coordinates on V(Z). For a subcase C of the corresponding table among Tables 2(a)–(e), we use T_0 = T_0(C) to denote the set of variables in T assumed to be 0 in Subcase C, and T_1 to denote the set of variables in T assumed to be nonzero.

Each variable b ∈ T has a weight, defined as follows. The action of r = (s_1, s_2, . . .) or r = (t_1, u_1, t_2, u_2, . . .) on B ∈ V causes each variable b to multiply by a certain weight which we denote by w(b). These weights w(b) are evidently rational functions in s_1, s_2, . . . or t_1, u_1, t_2, u_2, . . ..

Let V(C) denote the set of B ∈ V(R) such that B satisfies the vanishing and nonvanishing conditions of Subcase C. For example, in Subcase 3b of Table 2(d) we have T_0(3b) = {b_{111}, b_{112}, b_{121}} and T_1(3b) = {b_{113}, b_{122}, b_{131}, b_{211}}; thus V(3b) denotes the set of all B ∈ V(Z) = Z^3 ⊗ Z^3 ⊗ Z^3 such that b_{111} = b_{112} = b_{121} = 0 but b_{113}, b_{122}, b_{131}, b_{211} ≠ 0.

For each subcase C of Case m (m > 0), we wish to show that N(V(C);X) is $O_\varepsilon(X^{(n-1)/k+\varepsilon})$. Since N′(α) is absolutely bounded, the equality (23) implies that

$$N^*(V(C);X) \ll \int_{s_1,s_2,\ldots=c}^{\infty} \sigma(V(C))\, s_1^{-2}s_2^{-2}\cdots\, d^\times s_2\, d^\times s_1$$

or

$$N^*(V(C);X) \ll \int_{t_1,u_1,t_2,u_2,\ldots=c}^{\infty} \sigma(V(C))\, t_1^{-6}u_1^{-6}t_2^{-6}u_2^{-6}\cdots\, d^\times u_2\, d^\times t_2\, d^\times u_1\, d^\times t_1,$$

where σ(V(C)) denotes the number of integer points in the region E(ν, α, X) that also satisfy the conditions

$$b = 0 \text{ for } b \in T_0 \quad\text{and}\quad |b| \ge 1 \text{ for } b \in T_1. \tag{24}$$

Now for an element B ∈ E(ν, α, X), we evidently have

$$|b| \le J\,w(b)\,X^{1/k} \tag{25}$$

for some absolute constant J > 0, and therefore the number of integer points in E(ν, α, X) satisfying (24) will be nonzero only if we have

$$J\,w(b)\,X^{1/k} \ge 1 \tag{26}$$

for all weights w(b) such that b ∈ T_1. Now the sets T_1 in each subcase of Table 2 have been chosen to be precisely the set of variables having the minimal weights w(b) among the variables b ∈ T \ T_0; by “minimal weight” in T \ T_0, we mean that there is no other variable b ∈ T \ T_0 with weight having equal or smaller exponents for all parameters s_1, s_2, . . . (resp. t_1, u_1, t_2, u_2, . . .). Thus if the condition (26) holds for all weights w(b) corresponding to b ∈ T_1, then, by the very choice of T_1, we will also have $J\,w(b)\,X^{1/k} \gg 1$ for all weights w(b) such that b ∈ T \ T_0.

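Therefore, if the region $\mathcal{H} = \{B \in E(\nu,\alpha,X) : b = 0\ \forall b \in T_0;\ |b| \ge 1\ \forall b \in T_1\} \subset \mathbb{R}^{n-|T_0|}$ contains an integer point, then (26) and Proposition 7.2 together imply that the number of integer points in $\mathcal{H}$ is $O(\mathrm{Vol}(\mathcal{H}))$, since the volumes of all the projections of $\nu^{-1}\mathcal{H}$ will in that case also be $O(\mathrm{Vol}(\mathcal{H}))$. Now clearly

$$\mathrm{Vol}(\mathcal{H}) = O\Bigl(J^{\,n-|T_0|}\, X^{\frac{n-|T_0|}{k}} \prod_{b\in T\setminus T_0} w(b)\Bigr),$$

so we obtain

$$N(V(C);X) \ll \int_{s_1,s_2,\ldots=c}^{\infty} X^{\frac{n-|T_0|}{k}} \prod_{b\in T\setminus T_0} w(b)\; s_1^{-2}s_2^{-2}\cdots\, d^\times s_2\, d^\times s_1 \tag{27}$$

or

$$N(V(C);X) \ll \int_{t_1,u_1,t_2,u_2,\ldots=c}^{\infty} X^{\frac{n-|T_0|}{k}} \prod_{b\in T\setminus T_0} w(b)\; t_1^{-6}u_1^{-6}t_2^{-6}u_2^{-6}\cdots\, d^\times u_2\, d^\times t_2\, d^\times u_1\, d^\times t_1. \tag{28}$$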

The latter integral can be explicitly carried out for each of the subcases in Table 2. It suffices, however, to have a simple estimate of the form O(X^r), with r ≤ (n − 1)/k + ε, for the integral corresponding to each subcase. For example, if the total exponent of s_i (resp. t_i, u_i) in (27) or (28) is negative for all i, then it is clear that the resulting integral will be at most O(X^{(n−|T_0|)/k}) in value. This condition holds for many of the subcases in Table 2 (indicated in the fourth column by “-”), immediately yielding the estimates given in the third column.

For cases where this negative exponent condition does not hold, the estimate given in the third column can be obtained as follows. The factor π given in the fourth column is a product of variables in T_1, and so it is at least one in absolute value. The integrand in (27) or (28) may thus be multiplied by π without harm, and the estimates (27) and (28) will remain true; we may then apply the inequalities (25) to each of the variables in π, yielding

$$N(V(C);X) \ll \int_{s_1,s_2,\ldots=c}^{\infty} X^{\frac{n-|T_0|+\#\pi}{k}} \prod_{b\in T\setminus T_0} w(b)\; w(\pi)\; s_1^{-2}s_2^{-2}\cdots\, d^\times s_2\, d^\times s_1 \tag{29}$$

or

$$N(V(C);X) \ll \int_{t_1,u_1,t_2,u_2,\ldots=c}^{\infty} X^{\frac{n-|T_0|+\#\pi}{k}} \prod_{b\in T\setminus T_0} w(b)\; w(\pi)\; t_1^{-6}u_1^{-6}t_2^{-6}u_2^{-6}\cdots\, d^\times u_2\, d^\times t_2\, d^\times u_1\, d^\times t_1, \tag{30}$$

where #π denotes the total number of variables of T appearing in π (counted with multiplicity), and we extend the notation w multiplicatively, i.e., w(ab) = w(a)w(b). In each subcase of Table 2, we have chosen the factor π so that the total exponent of each s_i (resp. each t_i and each u_i) in (29) (resp. (30)) is negative. Thus we obtain from (29) and (30) that N(V(C);X) = O(X^{(n−#T_0(C)+#π)/k}), and this is precisely the estimate given in the third column of Table 2. In every subcase, aside from Case 0, we see that n − #T_0 + #π < n, as desired.
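The exponent bookkeeping in (27) and (29) can be sketched for the hypercube case 2 ⊗ 2 ⊗ 2 ⊗ 2 (n = 16, k = 24). The weight convention below is an assumption, since the weights are defined earlier in the paper and not in this section: the torus element (s_1, s_2, s_3, s_4) is taken to scale b_{ijkℓ} by s_m^{-1} for each index equal to 1 and by s_m for each index equal to 2. The helper names are hypothetical.

```python
from itertools import product

N_FACTORS = 4   # V = 2 x 2 x 2 x 2, so n = 16 coordinates and k = 24

def weight(index):
    """Exponents of (s1, ..., s4) in w(b_index), under the assumed convention."""
    return tuple(-1 if digit == 1 else 1 for digit in index)

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

ALL_COORDS = list(product((1, 2), repeat=N_FACTORS))

def integrand_exponents(t0, pi=()):
    """Return (numerator of the X-exponent, exponents of s1..s4) in (27)/(29)."""
    total = (-2,) * N_FACTORS            # the Haar measure factor s_i^{-2}
    for b in ALL_COORDS:
        if b not in t0:                  # product of w(b) over T \ T0
            total = add(total, weight(b))
    for b in pi:                         # the extra factor w(pi)
        total = add(total, weight(b))
    return len(ALL_COORDS) - len(t0) + len(pi), total

# Subcases 1-3 of Table 2(a): (T0, factor pi).
CASES = {
    "1": ([(1, 1, 1, 1)], []),
    "2": ([(1, 1, 1, 1), (1, 1, 1, 2)], []),
    "3": ([(1, 1, 1, 1), (1, 1, 1, 2), (1, 1, 2, 1)], [(1, 1, 2, 2)]),
}
for name, (t0, pi) in CASES.items():
    print(name, integrand_exponents(t0, pi))
```

Under this assumed convention, Case 1 has all exponents negative, while Cases 2 and 3 have some exponents equal to zero, which would be consistent with the extra ε appearing in their rows of Table 2(a).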

Case | The set S ⊂ V(Z) defined by | N(S;X) ≪ | Use factor
0. | b1111 ≠ 0 | X^{16/24} | –
1. | b1111 = 0; b1112, b1121, b1211, b2111 ≠ 0 | X^{15/24} | –
2. | b1111, b1112 = 0; b1121, b1211, b2111 ≠ 0 | X^{14/24+ε} | –
3. | b1111, b1112, b1121 = 0; b1122, b1211, b2111 ≠ 0 | X^{14/24+ε} | b1122

Table 2(a). Estimates for 2 ⊗ 2 ⊗ 2 ⊗ 2.

Case | The set S ⊂ V(Z) defined by | N(S;X) ≪ | Use factor
0. | b1111 ≠ 0 | X^{8/24} | –
1. | b1111 = 0; b1112, b2111 ≠ 0 | X^{22/72+ε} | (b2111)^{1/3}

Table 2(b). Estimates for 2 ⊗ Sym^3(2).

Case | The set S ⊂ V(Z) defined by | N(S;X) ≪ | Use factor
0. | b1111 ≠ 0 | X^{9/12} | –
1. | b1111 = 0; b1112, b1211 ≠ 0 | X^{8/12+ε} | –

Table 2(c). Estimates for Sym^2(2) ⊗ Sym^2(2).

Case | The set S ⊂ V(Z) defined by | N(S;X) ≪ | Use factor
0. | b111 ≠ 0 | X^{27/36} | –
1. | b111 = 0; b112, b121, b211 ≠ 0 | X^{26/36} | –
2. | b111, b112 = 0; b113, b121, b211 ≠ 0 | X^{25/36} | –
3a. | b111, b112, b113 = 0; b121, b211 ≠ 0 | X^{24/36+ε} | –
3b. | b111, b112, b121 = 0; b113, b122, b131, b211 ≠ 0 | X^{24/36+ε} | –
4a. | b111, b112, b113, b121 = 0; b122, b131, b211 ≠ 0 | X^{24/36+ε} | b122
4b. | b111, b112, b121, b122 = 0; b113, b131, b211 ≠ 0 | X^{24/36+ε} | b113
4c. | b111, b112, b121, b211 = 0; b113, b122, b131, b212, b221, b311 ≠ 0 | X^{23/36} | –

Table 2(d). Subcases 0–4c of estimates for 3 ⊗ 3 ⊗ 3.

Case | The set S ⊂ V(Z) defined by | N(S;X) ≪ | Use factor
5a. | b111, b112, b113, b121, b122 = 0; b123, b131, b211 ≠ 0 | X^{24/36+ε} | (b123)^2
5b. | b111, b112, b113, b121, b131 = 0; b122, b211 ≠ 0 | X^{24/36+ε} | (b122)^2
5c. | b111, b112, b113, b121, b211 = 0; b122, b131, b212, b221, b311 ≠ 0 | X^{24/36+ε} | b122 b212
5d. | b111, b112, b121, b122, b211 = 0; b113, b131, b212, b221, b311 ≠ 0 | X^{24/36+ε} | b113 b131
6a. | b111, b112, b113, b121, b122, b131 = 0; b123, b132, b211 ≠ 0 | X^{26/36} | (b123)^2 (b132)^2 b211
6b. | b111, b112, b113, b121, b122, b211 = 0; b123, b131, b212, b221, b311 ≠ 0 | X^{24/36+ε} | b123 b131 b212
6c. | b111, b112, b113, b121, b131, b211 = 0; b122, b212, b221, b311 ≠ 0 | X^{24/36+ε} | (b122)^2 b311
6d. | b111, b112, b113, b121, b211, b221 = 0; b122, b131, b212, b311 ≠ 0 | X^{21/36+ε} | –
6e. | b111, b112, b121, b122, b211, b212 = 0; b113, b131, b221, b311 ≠ 0 | X^{21/36+ε} | –
7a. | b111, b112, b113, b121, b122, b131, b211 = 0; b123, b132, b212, b221, b311 ≠ 0 | X^{24/36+ε} | (b123)^2 b132 b311
7b. | b111, b112, b113, b121, b122, b211, b212 = 0; b123, b131, b213, b221, b311 ≠ 0 | X^{24/36+ε} | b123 b213 b131 b311
7c. | b111, b112, b113, b121, b122, b211, b221 = 0; b123, b131, b212, b311 ≠ 0 | X^{21/36+ε} | b123
7d. | b111, b112, b113, b121, b131, b211, b212 = 0; b122, b213, b221, b311 ≠ 0 | X^{24/36+ε} | (b122)^2 b213 b311
7e. | b111, b112, b121, b122, b211, b212, b221 = 0; b113, b131, b222, b311 ≠ 0 | X^{21/36+ε} | b222
8a. | b111, b112, b113, b121, b122, b131, b211, b212 = 0; b123, b132, b213, b221, b311 ≠ 0 | X^{24/36+ε} | b123 (b132)^2 b213 b311
8b. | b111, b112, b113, b121, b122, b211, b212, b221 = 0; b123, b131, b213, b222, b311 ≠ 0 | X^{24/36+ε} | b123 b132 b213 b221 b311
8c. | b111, b112, b113, b121, b131, b211, b212, b221 = 0; b122, b213, b222, b231, b311 ≠ 0 | X^{24/36+ε} | (b122)^2 b213 b231 b311

Table 2(d) cont’d. Subcases 5a–8c of estimates for 3 ⊗ 3 ⊗ 3.

Case | The set S ⊂ V(Z) defined by | N(S;X) ≪ | Use factor
0. | b111 ≠ 0 | X^{18/36} | –
1. | b111 = 0; b112, b211 ≠ 0 | X^{17/36} | –
2a. | b111, b112 = 0; b113, b122, b211 ≠ 0 | X^{16/36} | –
2b. | b111, b211 = 0; b112, b311 ≠ 0 | X^{17/36} | b311
3a. | b111, b112, b113 = 0; b122, b211 ≠ 0 | X^{15/36+ε} | –
3b. | b111, b112, b122 = 0; b113, b211 ≠ 0 | X^{15/36+ε} | –
3c. | b111, b112, b211 = 0; b113, b122, b212, b311 ≠ 0 | X^{16/36} | b311
4a. | b111, b112, b113, b122 = 0; b123, b211 ≠ 0 | X^{15/36+ε} | b123
4b. | b111, b112, b113, b211 = 0; b122, b212, b311 ≠ 0 | X^{15/36+ε} | b311
4c. | b111, b112, b122, b211 = 0; b113, b212, b311 ≠ 0 | X^{15/36+ε} | b311
4d. | b111, b112, b211, b212 = 0; b113, b122, b311 ≠ 0 | X^{15/36+ε} | b311
5a. | b111, b112, b113, b122, b211 = 0; b123, b212, b311 ≠ 0 | X^{15/36+ε} | b123 b311
5b. | b111, b112, b113, b211, b212 = 0; b122, b213, b311 ≠ 0 | X^{15/36+ε} | b213 b311
5c. | b111, b112, b122, b211, b212 = 0; b113, b222, b311 ≠ 0 | X^{15/36+ε} | b222 b311

Table 2(e). Estimates for 3 ⊗ Sym^2(3).

7.4 The number of irreducible points in the main body

We now give an estimate on the number of reducible elements B ∈ $\mathcal{F}hR$ ∩ V(Z), on average, satisfying b_{1111} ≠ 0 (resp. b_{111} ≠ 0):

Proposition 7.4. Let h take a random value in G_0 uniformly with respect to the measure dg. Then the expected number of reducible elements B ∈ $\mathcal{F}hR^{(i)}$ ∩ V(Z) such that H(B) < X and b_{1111} ≠ 0 (resp. b_{111} ≠ 0) is o(X^{n/k}).

We defer the proof of this proposition to the end of the section.

We also have the following proposition which bounds the number of G(Z)-equivalence classes of integral elements in $\mathcal{F}hR^{(i)}$ with height less than X that have large stabilizers inside G(Q); we again defer the proof to the end of the section.

Proposition 7.5. Let h ∈ G_0 be any element, where G_0 is any fixed compact subset of G(R). Then the number of integral elements B ∈ $\mathcal{R}_X(h)$ whose stabilizer in G(Q) has size greater than 1 (for B ∈ V^{(i)}) is o(X^{n/k}).

7.5 The main term

Fix again i ∈ {1, . . . , N} and let R = R^{(i)}. The results of §7.4 show that, in order to obtain Theorem 7.1, it suffices to count those integral elements B ∈ $\mathcal{F}hR$ of bounded height for which b_{1111} ≠ 0 (resp. b_{111} ≠ 0), as h ranges over G_0.

Let $\mathcal{R}_X(h)$ denote the region $\mathcal{F}hR \cap \{B \in V(\mathbb{R}) : H(B) < X\}$; let $\mathcal{R}_X := \mathcal{R}_X(1)$. Then we have the following result counting the number of integral points in $\mathcal{R}_X(h)$, on average, satisfying b_{1111} ≠ 0 (resp. b_{111} ≠ 0):

Proposition 7.6. Let h take a random value in G_0 uniformly with respect to the Haar measure dg. Then the expected number of elements B ∈ $\mathcal{F}hR$ ∩ V(Z)^{(i)} such that |H(B)| < X and b_{1111} ≠ 0 (resp. b_{111} ≠ 0) is $\mathrm{Vol}(\mathcal{R}_X) + O(X^{(n-1)/k})$.

Proof. Following the proof of Proposition 7.3, let V^{(i)}(φ) denote the subset of V(R) such that b_{1111} ≠ 0 (resp. b_{111} ≠ 0). We wish to show that

$$N^*(V^{(i)}(\phi);X) = \mathrm{Vol}(\mathcal{R}_X) + O(X^{(n-1)/k}).$$

We have

$$N^*(V^{(i)}(\phi);X) = \frac{1}{C^{(i)}_{G_0}} \int_{s_1,s_2,\ldots=c}^{\infty} \int_{\nu\in N'(\alpha)} \sigma(V(\phi))\, s_1^{-2}s_2^{-2}\cdots\, d\nu\, d^\times s_2\, d^\times s_1$$

or

$$N^*(V^{(i)}(\phi);X) = \frac{1}{C^{(i)}_{G_0}} \int_{t_1,u_1,t_2,u_2,\ldots=c}^{\infty} \int_{\nu\in N'(\alpha)} \sigma(V(\phi))\, t_1^{-6}u_1^{-6}t_2^{-6}u_2^{-6}\cdots\, d\nu\, d^\times u_2\, d^\times t_2\, d^\times u_1\, d^\times t_1,$$

where σ(V(φ)) denotes the number of integer points in the region E(ν, α, X) satisfying |b_min| ≥ 1, where b_min denotes b_{1111} (resp. b_{111}). Evidently, the number of integer points in E(ν, α, X) with |b_min| ≥ 1 can be nonzero only if we have

$$J\,w(b_{\min})\,X^{1/k} \ge 1. \tag{31}$$

Therefore, if the region $\mathcal{B} = \{B \in E(\nu,\alpha,X) : |b_{\min}| \ge 1\}$ contains an integer point, then (31) and Proposition 7.2 imply that the number of integer points in $\mathcal{B}$ is $\mathrm{Vol}(\mathcal{B}) + O(\mathrm{Vol}(\mathcal{B})/(w(b_{\min})X^{1/k}))$, since all smaller-dimensional projections of $\nu^{-1}\mathcal{B}$ are clearly bounded by a constant times the projection of $\mathcal{B}$ onto the hyperplane b_min = 0 (since b_min has minimal weight).

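Therefore, since $\mathcal{B} = E(\nu,\alpha,X) - \bigl(E(\nu,\alpha,X) - \mathcal{B}\bigr)$, we may write

$$N^*(V^{(i)}(\phi);X) = \frac{1}{C^{(i)}_{G_0}} \int_{s_1,s_2,\ldots=c}^{\infty}\int_{\nu\in N'(\alpha)} \Bigl(\mathrm{Vol}\bigl(E(\nu,\alpha,X)\bigr) - \mathrm{Vol}\bigl(E(\nu,\alpha,X)-\mathcal{B}\bigr) + O\bigl(\max\{X^{\frac{n-1}{k}} s_1^{2}s_2^{2}\cdots, 1\}\bigr)\Bigr)\, s_1^{-2}s_2^{-2}\cdots\, d\nu\, d^\times s_2\, d^\times s_1 \tag{32}$$

or

$$N^*(V^{(i)}(\phi);X) = \frac{1}{C^{(i)}_{G_0}} \int_{t_1,u_1,t_2,u_2,\ldots=c}^{\infty}\int_{\nu\in N'(\alpha)} \Bigl(\mathrm{Vol}\bigl(E(\nu,\alpha,X)\bigr) - \mathrm{Vol}\bigl(E(\nu,\alpha,X)-\mathcal{B}\bigr) + O\bigl(\max\{X^{\frac{n-1}{k}} t_1^{6}u_1^{6}t_2^{6}u_2^{6}\cdots, 1\}\bigr)\Bigr)\, t_1^{-6}u_1^{-6}t_2^{-6}u_2^{-6}\cdots\, d\nu\, d^\times u_2\, d^\times t_2\, d^\times u_1\, d^\times t_1. \tag{33}$$

The integral of the first term in (32) or (33) is $\int_{h\in G_0} \mathrm{Vol}(\mathcal{R}_X(h))\,dg$. Since $\mathrm{Vol}(\mathcal{R}_X(h))$ does not depend on the choice of h ∈ G_0, the latter integral is simply $C^{(i)}_{G_0}\cdot\mathrm{Vol}(\mathcal{R}_X)$.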

To estimate the integral of the second term in (32) or (33), let $\mathcal{B}' = E(\nu,\alpha,X) - \mathcal{B}$, and for each |b_min| ≤ 1, let $\mathcal{B}'(b_{\min})$ be the subset of all elements B ∈ $\mathcal{B}'$ with the given value of b_min. Then the (n − 1)-dimensional volume of $\mathcal{B}'(b_{\min})$ is at most $O\bigl(X^{\frac{n-1}{k}}\prod_{b\in T\setminus\{b_{\min}\}} w(b)\bigr)$, and so we have the estimate

$$\mathrm{Vol}(\mathcal{B}') \ll \int_{-1}^{1} X^{\frac{n-1}{k}} \prod_{b\in T\setminus\{b_{\min}\}} w(b)\; db_{\min} = O\Bigl(X^{\frac{n-1}{k}} \prod_{b\in T\setminus\{b_{\min}\}} w(b)\Bigr).$$

The second term of the integrand in (32) or (33) can thus be absorbed into the third term. Finally, one easily computes the integral of the third term in (32) and (33) to be O(X^{(n−1)/k}). We thus obtain

$$N^*(V^{(i)}(\phi);X) = \mathrm{Vol}(\mathcal{R}_X) + O(X^{(n-1)/k}),$$

as desired.

Combining Propositions 7.3, 7.4, 7.5, and 7.6 yields Theorem 7.1.

7.6 Computation of the volume

In this subsection, we describe how to compute the volume of $\mathcal{R}_X$ in V^{(i)} ⊂ V(R).

Let m denote the number of independent invariants for the action of G1 on V. Given an element v ∈ V(R), we may attach to v a vector $\vec{a} = \vec{a}(v) \in \mathbb{R}^m$ whose coordinates are the independent invariants of v. For example, in the case of V = 2 ⊗ 2 ⊗ 2 ⊗ 2, we have $\vec{a}(v) = (a_2(v), a_4(v), a_4'(v), a_6(v))$. For each $\vec{a} \in \mathbb{R}^m$, the set R^{(i)} contains at most one point $p^{(i)}(\vec{a})$ having invariant vector $\vec{a}$. Let R^{(i)}(X) denote the set of all those points in R^{(i)} having height less than X. Then $\mathrm{Vol}(\mathcal{R}_X) = \mathrm{Vol}(\mathcal{F}\cdot R^{(i)}(X))$.

The set R^{(i)} is in canonical one-to-one correspondence with the set $\{\vec{a} \in \mathbb{R}^m : \Delta(\vec{a}) > 0\}$ or $\{\vec{a} \in \mathbb{R}^m : \Delta(\vec{a}) < 0\}$ in accordance with whether Δ takes positive or negative values on R^{(i)}. There is thus a natural measure $d\vec{a}$ on each of these sets R^{(i)}, given by the standard Euclidean measure on $\{\vec{a}(v) : v \in R^{(i)}\}$ viewed as a subset of $\mathbb{R}^m$.

We then have the following proposition, whose proof is identical to that of [BS15a, Thm. 2.8]:

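Proposition 7.7. There exists a rational constant J such that, for any measurable function φ on V(R), we have

$$\frac{|J|}{n_i} \int_{R^{(i)}} \int_{G(\mathbb{R})} \phi\bigl(g \cdot p^{(i)}(\vec{a})\bigr)\, dg\, d\vec{a} = \int_{V^{(i)}} \phi(v)\,dv. \tag{34}$$

We may use Proposition 7.7 to give a convenient expression for the volume of the multiset $\mathcal{R}_X$:

$$\int_{\mathcal{R}_X} dv = \int_{\mathcal{F}\cdot R^{(i)}(X)} dv = |J| \cdot \int_{R^{(i)}(X)} \int_{\mathcal{F}} dg\, d\vec{a} = |J| \cdot \mathrm{Vol}(G(\mathbb{Z})\backslash G(\mathbb{R})) \cdot \int_{R^{(i)}(X)} d\vec{a}. \tag{35}$$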

7.7 Congruence conditions

In this subsection, we prove the following version of Theorem 7.1 where we count elements of V(Z) satisfying a finite set of congruence conditions:

Theorem 7.8. Suppose S is a subset of V(Z) defined by congruence conditions modulo finitely many prime powers. Then

$$N(S \cap V^{(i)};X) = N(V(\mathbb{Z})^{(i)};X) \cdot \prod_p \mu_p(S) + o(X^{n/k}), \tag{36}$$

where µ_p(S) denotes the p-adic density of S in V(Z).

Proof. Suppose S is defined by congruence conditions modulo some integer m. Then S may be viewed as the union of (say) r translates L_1, . . . , L_r of the lattice m · V(Z). For each such lattice translate L_j, we may use formula (22) and the discussion following that formula to compute N(S;X), but where each d-dimensional volume is scaled by a factor of 1/m^d to reflect the fact that our new lattice has been scaled by a factor of m. For a fixed value of m, we thus obtain

$$N(L_j \cap V^{(i)};X) = m^{-n}\,\mathrm{Vol}(\mathcal{R}_X) + o(X^{n/k}). \tag{37}$$

Summing (37) over j, and noting that $r m^{-n} = \prod_p \mu_p(S)$, yields Theorem 7.8.
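The identity r m^{-n} = ∏_p µ_p(S) used in this proof is an instance of the Chinese remainder theorem, and it can be seen on a toy example. The sketch below takes n = 2 and two made-up congruence conditions (one modulo 4, one modulo 3); these conditions and the helper names are purely illustrative and have nothing to do with the sets S used in the paper.

```python
from fractions import Fraction
from itertools import product

def density(condition, modulus, n):
    """Fraction of residue vectors in (Z/modulus)^n satisfying `condition`."""
    hits = sum(1 for v in product(range(modulus), repeat=n) if condition(v))
    return Fraction(hits, modulus ** n)

N = 2

def cond_2(v):
    return v[0] % 4 == 1                 # toy condition modulo 4

def cond_3(v):
    return (v[0] * v[1]) % 3 != 0        # toy condition modulo 3

def both(v):
    return cond_2(v) and cond_3(v)

mu_2 = density(cond_2, 4, N)             # local density at p = 2
mu_3 = density(cond_3, 3, N)             # local density at p = 3

m = 12                                   # S is a union of r translates of m * Z^N
r = sum(1 for v in product(range(m), repeat=N) if both(v))
print(r, Fraction(r, m ** N), mu_2 * mu_3)
```

The count r of admissible residue classes modulo m, divided by m^n, agrees with the product of the local densities.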

We will also have occasion to use the following weighted version of Theorem 7.8; the proof is identical.

Theorem 7.9. Let p_1, . . . , p_r be distinct prime numbers. For j = 1, . . . , r, let φ_{p_j} : V(Z) → R be a G(Z)-invariant function on V(Z) such that φ_{p_j}(x) depends only on the congruence class of x modulo some power $p_j^{a_j}$ of p_j. Let N_φ(V(Z) ∩ V^{(i)};X) denote the number of irreducible G(Z)-orbits in V(Z) ∩ V^{(i)} having height bounded by X, where each orbit G(Z) · B is counted with weight $\phi(B) := \prod_{j=1}^{r} \phi_{p_j}(B)$. Then

$$N_\phi(V(\mathbb{Z}) \cap V^{(i)};X) = N(V(\mathbb{Z}) \cap V^{(i)};X) \prod_{j=1}^{r} \int_{B\in V(\mathbb{Z}_{p_j})} \phi_{p_j}(B)\, dB + o(X^{n/k}), \tag{38}$$

where φ_{p_j} in the integrand denotes the natural extension of φ_{p_j} to V(Z_{p_j}) by continuity, and dB denotes the additive measure on V(Z_{p_j}) normalized so that $\int_{B\in V(\mathbb{Z}_{p_j})} dB = 1$.

7.8 Proof of Propositions 7.4 and 7.5

We may use the results of §7.7 to prove Propositions 7.4 and 7.5 on estimates for reducible points in the main body and points with large stabilizer, respectively. Indeed, to prove Proposition 7.4, we note that if an element B ∈ V(Z) is reducible over Q then it also must be reducible modulo p for every p.

Let S^{red} denote the set of elements in V(Z) that are reducible over Q, and let S^{red}_p denote the set of all elements in V(Z) that are reducible mod p. Then $S^{\mathrm{red}} \subset \cap_p S^{\mathrm{red}}_p$. Let $S^{\mathrm{red}}(Y) = \cap_{p<Y} S^{\mathrm{red}}_p$ for any positive integer Y, and let us use as before V(φ) to denote the set of B ∈ V(Z) such that b_min ≠ 0. Then the proof of Theorem 7.8 (without assuming Propositions 7.4 and 7.5!) gives that

$$N^*(S^{\mathrm{red}}(Y) \cap V(\phi);X) \le N^*(V(\phi);X) \cdot \prod_{p<Y} \mu_p(S^{\mathrm{red}}_p) + o(X^{n/k}). \tag{39}$$

Note that the inequality in (39) also holds when the product is over subsets of primes p < Y. To estimate µ_p(S^{red}_p) in each case (for p large enough), first recall that an element v of V(Z) is reducible mod p if and only if any one of the covariant binary quartic forms or ternary cubic forms coming from v, considered mod p, corresponds to a trivial Selmer element, i.e., has a root or a flex defined over F_p, respectively. In each of the seven cases, we show that for infinitely many p, there exists δ ∈ (0, 1) such that

$$\mu_p(S^{\mathrm{red}}_p) \le 1 - \delta + O(p^{\beta}),$$

where β = −1/2 or −1. The latter five cases will use the first two cases, and some cases will need a weak form of the Hasse bound

$$\#E(\mathbb{F}_p) = p + O(\sqrt{p})$$

for elliptic curves E to conclude that all of the fibers of the covariant binary quartic or ternary cubic maps are roughly the same size.
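The Hasse bound quoted here, in its precise form |#E(F_p) − (p + 1)| ≤ 2√p, can be checked by brute force for small primes. The sketch below restricts to short Weierstrass models y^2 = x^3 + ax + b with p ≥ 5, which is a simplifying assumption made only for this illustration; the function names are hypothetical.

```python
import math

def count_points(a, b, p):
    """Number of F_p-points on y^2 = x^3 + a*x + b, including the point at infinity."""
    square_counts = {}
    for y in range(p):
        value = y * y % p
        square_counts[value] = square_counts.get(value, 0) + 1
    affine = sum(square_counts.get((x * x * x + a * x + b) % p, 0) for x in range(p))
    return affine + 1

def hasse_deviations(p):
    """Yield |#E(F_p) - (p + 1)| over all nonsingular short Weierstrass curves mod p."""
    for a in range(p):
        for b in range(p):
            if (4 * a ** 3 + 27 * b ** 2) % p != 0:   # nonzero discriminant
                yield abs(count_points(a, b, p) - (p + 1))

for p in (5, 7, 11, 13):
    print(p, max(hasse_deviations(p)), 2 * math.sqrt(p))
```

For each prime, the largest deviation found stays below 2√p, as the Hasse bound guarantees.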

1. Binary quartic forms. It is easy to check that (1/4)(p^5 − p^4 + p^3 − 3p^2 + 2p) of the binary quartic forms over F_p are irreducible over F_p (out of p^5 total), so in this case, we have µ_p(S^{red}_p) ≤ 3/4 + O(1/p).

2. Ternary cubic forms. We will show that a positive density of smooth ternary cubic forms over F_p have Jacobians with a nontrivial rational 3-torsion point, and most of those will not have a rational flex. We first claim that the density of the singular ternary cubics over F_p is O(1/p); this follows from using the Grothendieck-Lefschetz trace formula to count points in the smooth locus of the moduli space of ternary cubics (there is no 1/√p term because the moduli space is rational and thus has vanishing H^1).

Note that since #E(F_p)/3E(F_p) = #E[3](F_p), each elliptic curve E over F_p arises as the Jacobian of a ternary cubic form over F_p the same number of times (in fact, exactly #GL_3(F_p) times). Now consider the degree 8 forgetful map from the modular curve Y_1(3) to the moduli space M_{1,1} of elliptic curves; over F_p, the image consists of the elliptic curves over F_p with a nontrivial 3-torsion point defined over F_p, and this image must have density at least 1/8 + O(1/p) (the error term due to the cusps of the genus 0 curve X_1(3)). Finally, if #E[3](F_p) = 3 or 9, then a density of 2/3 or 8/9 (respectively) of the ternary cubics with Jacobian E will not have a rational flex. Combining all of these proportions shows that the density of irreducible ternary cubics over F_p is at least 1/12 + O(1/p), so

µ_p(S^{red}_p) ≤ 11/12 + O(1/p).

3. Bidegree (2, 2) forms. A bidegree (2, 2) form is irreducible if and only if both covariant binary quartics are irreducible. Consider the map φ from bidegree (2, 2) forms to one of the covariant binary quartics. Given a bidegree (2, 2) form v ∈ V(F_p), if f = φ(v) is irreducible, then f corresponds to a genus one curve C (isomorphic to its Jacobian E) and a degree 2 line bundle L such that L is not isomorphic to O(2Q) for a point Q ∈ E(F_p). Thus, the group E(F_p)/2E(F_p) is nontrivial. Adding any nonzero point P ∈ E(F_p) in all but one (nonzero) coset of 2E(F_p) to the line bundle L gives a second line bundle L′ with (C, L′) corresponding to an irreducible binary quartic. Thus, a positive proportion (either 1/2 − 1/#E(F_p) or 3/4 − 1/#E(F_p)) of the bidegree (2, 2) forms above f are irreducible. Combining this with case 1 and the Hasse bound shows that for large enough p, we have

µ_p(S^{red}_p) ≤ 7/8 + O(1/√p).

4. Rubik’s cubes. The argument for the case of Rubik’s cubes is very similar to that for bidegree (2, 2) forms, replacing covariant binary quartics with covariant ternary cubics. Given an irreducible ternary cubic f corresponding to a genus one curve C, degree 3 line bundle L, and Jacobian E, we want to find points P ∈ E(F_p) such that L + P and L − P (as degree 3 line bundles) are not isomorphic to O(3Q) for any Q. Again, we have that E(F_p)/3E(F_p) is nontrivial (since f is irreducible) so has size either 3 or 9. All nonzero points P not in two of the nonzero cosets of 3E(F_p) will thereby give an irreducible Rubik’s cube (with three irreducible ternary cubics). For large enough p, we have

µ_p(S^{red}_p) ≤ 35/36 + O(1/√p).

5. Doubly symmetric Rubik’s cubes. We combine the argument for Rubik’s cubes with the observation that 2-torsion points P are always equal to 3P. Thus, a doubly symmetric Rubik’s cube with one irreducible ternary cubic will be irreducible, and since there are either zero, one, or three nontrivial 2-torsion points for any elliptic curve, we have µ_p(S^{red}_p) ≤ 35/36 + O(1/p).

6. Triply symmetric hypercubes. We combine the argument for bidegree (2, 2) forms with the observation that 3-torsion points P are always equal to 2(−P). So a triply symmetric hypercube with one irreducible binary quartic will be irreducible, and since an elliptic curve has zero, two, or eight nontrivial 3-torsion points, we compute µ_p(S^{red}_p) ≤ 15/16 + O(1/p).

7. Hypercubes. We use a similar argument as the previous cases, but we now need to eliminate three cosets of 2E(F_p). It is easy to check that the number of binary cubic forms over F_p with three distinct roots in F_p is (1/6)p(p^2 − 1)(p − 1), so E(F_p)/2E(F_p) has order 4 for 1/6 + O(1/p) of elliptic curves over F_p. Thus, we obtain, for large enough p,

µ_p(S^{red}_p) ≤ (1 − 1/4 · 1/6 · 1/4) + O(1/√p) = 95/96 + O(1/√p).
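The count of binary cubic forms in case 7 is easy to confirm by enumeration for small primes, reading "three distinct roots in F_p" as three distinct roots in P^1(F_p) (so that a root at infinity is allowed); that reading, and the helper names below, are assumptions of this sketch. The last lines redo the arithmetic 1 − 1/4 · 1/6 · 1/4 exactly.

```python
from fractions import Fraction
from itertools import product

def roots_in_p1(coeffs, p):
    """Roots in P^1(F_p) of the binary form sum_i coeffs[i] * x^(d-i) * y^i."""
    d = len(coeffs) - 1
    points = [(1, t) for t in range(p)] + [(0, 1)]   # representatives of P^1(F_p)
    return [
        (x, y) for x, y in points
        if sum(c * pow(x, d - i, p) * pow(y, i, p) for i, c in enumerate(coeffs)) % p == 0
    ]

def count_split_cubics(p):
    """Count nonzero binary cubic forms over F_p with three distinct roots in P^1(F_p)."""
    return sum(
        1 for coeffs in product(range(p), repeat=4)
        if any(coeffs) and len(roots_in_p1(coeffs, p)) == 3
    )

for p in (2, 3, 5, 7):
    print(p, count_split_cubics(p), p * (p * p - 1) * (p - 1) // 6)

bound = 1 - Fraction(1, 4) * Fraction(1, 6) * Fraction(1, 4)
print(bound)
```

A nonzero binary cubic has at most three roots in P^1(F_p), so having exactly three roots forces them to be distinct and simple; this is why counting roots suffices.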

Combining with (39), we see that

$$\lim_{X\to\infty} \frac{N^*(S^{\mathrm{red}} \cap V(\phi);X)}{X^{n/k}} \ll \prod_{p<Y} \mu_p(S^{\mathrm{red}}_p) \ll \prod_{p<Y} \bigl(1 - \delta + O(p^{\beta})\bigr).$$

When Y tends to infinity, the product on the right tends to 0, proving Proposition 7.4.

We may proceed similarly with Proposition 7.5. If an element B ∈ V(Z) with nonzero discriminant has a nontrivial stabilizer in G(Q), then any of the covariant binary quartic forms or ternary cubic forms has a nontrivial stabilizer, or equivalently, the corresponding Jacobian has a rational 2- or 3-torsion point. Let S^{bigstab} ⊂ V(Z) denote the elements B ∈ V(Z) that have nontrivial stabilizers in G(Q) and let S^{bigstab}_p ⊂ V(Z) denote the elements B ∈ V(Z) such that B modulo p has a nontrivial stabilizer in G(F_p). Let $S^{\mathrm{bigstab}}(Y) = \cap_{p<Y} S^{\mathrm{bigstab}}_p$. Then we claim that in each case, we have

$$\mu_p(S^{\mathrm{bigstab}}_p) \le (1 - \delta') + O(p^{\beta}) \tag{40}$$

for β = −1/2 or −1 and some δ′ ∈ (0, 1).

We need only compute δ′ for binary quartics and ternary cubics; inequality (40) for the other cases will follow from the Hasse bound argument because inclusion in S^{bigstab} is determined by the stabilizer for any of the covariant forms. For binary quartic forms, note that (1/3)p(p^2 − 1)^2 of the p^5 binary quartic forms over F_p factor into an irreducible cubic factor and a linear factor over F_p. All of the Jacobians of these curves have no 2-torsion point over F_p since the cubic does not factor, so µ_p(S^{bigstab}_p) ≤ 2/3 + O(1/p). For ternary cubic forms, we want to find the density of ternary cubics whose Jacobians have no 3-torsion point over F_p. As above, consider the forgetful map Y_1(3) → M_{1,1}; both the source and the target over F_p are genus zero curves (with cusps), and since the fibers have order 0, 2, or 8, we must have that at least 1/2 + O(1/p) of the points in M_{1,1} are not in the image of the map, i.e., µ_p(S^{bigstab}_p) ≤ 1/2 + O(1/p).
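The count (1/3)p(p^2 − 1)^2 of binary quartic forms that split as a linear factor times an irreducible cubic can likewise be confirmed by enumeration for small p. The sketch below uses the following characterization, which is an observation made for this illustration rather than a statement from the text: a nonzero binary quartic has this shape exactly when it has a single root in P^1(F_p) and that root is simple (a cubic cofactor with no rational root is irreducible). Simplicity is tested by checking that the two partial derivatives do not both vanish at the root.

```python
from itertools import product

def evaluate(coeffs, x, y, p):
    """Value mod p of the binary form sum_i coeffs[i] * x^(d-i) * y^i."""
    d = len(coeffs) - 1
    return sum(c * pow(x, d - i, p) * pow(y, i, p) for i, c in enumerate(coeffs)) % p

def partials(coeffs):
    """Coefficient lists of the partial derivatives in x and in y."""
    d = len(coeffs) - 1
    fx = [(d - i) * coeffs[i] for i in range(d)]
    fy = [(i + 1) * coeffs[i + 1] for i in range(d)]
    return fx, fy

def count_linear_times_irreducible_cubic(p):
    points = [(1, t) for t in range(p)] + [(0, 1)]   # representatives of P^1(F_p)
    count = 0
    for coeffs in product(range(p), repeat=5):
        if not any(coeffs):
            continue
        roots = [pt for pt in points if evaluate(coeffs, pt[0], pt[1], p) == 0]
        if len(roots) != 1:
            continue
        fx, fy = partials(coeffs)
        x, y = roots[0]
        if evaluate(fx, x, y, p) != 0 or evaluate(fy, x, y, p) != 0:
            count += 1   # exactly one rational root, and it is simple
    return count

for p in (2, 3, 5):
    print(p, count_linear_times_irreducible_cubic(p), p * (p * p - 1) ** 2 // 3)
```

Dividing by p^5 gives a proportion tending to 1/3, which is where the bound 2/3 + O(1/p) comes from.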

Finally, we have by the same argument as for Proposition 7.4 that

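\[ \lim_{X\to\infty} \frac{N^*(S^{\mathrm{bigstab}} \cap V(\phi); X)}{X^{n/k}} \ll \prod_{p<Y} \mu_p(S^{\mathrm{bigstab}}_p) \ll \prod_{p<Y} \bigl(1 - \delta' + O(p^{\beta})\bigr), \]

and letting Y tend to infinity proves Proposition 7.5.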
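To see numerically why such products tend to 0, one can evaluate the main term $\prod_{p<Y}(1-\delta)$ for growing Y. The sketch below is an illustration only: it assumes a constant δ (for instance δ = 1/96 from the bound 95/96 above) and deliberately ignores the $O(p^{\beta})$ error terms, which are not explicit in the text.

```python
def primes_below(y):
    """Return all primes p < y using a simple sieve of Eratosthenes."""
    if y < 2:
        return []
    sieve = [True] * y
    sieve[0] = False
    if y > 1:
        sieve[1] = False
    for i in range(2, int(y ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, y, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def main_term_product(delta, y):
    """Product of (1 - delta) over primes p < y (error terms O(p^beta) ignored)."""
    result = 1.0
    for _ in primes_below(y):
        result *= 1.0 - delta
    return result
```

Since there are infinitely many primes, the product is $(1-\delta)^{\pi(Y)}$ and decays geometrically in the number of primes below Y.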

8 Sieving to Selmer elements

We have seen that locally soluble orbits of elements of V(Q) correspond to elements in the d-Selmer group S(E) of elliptic curves E in the family F, where $F = F_0$, $F_1$, $F_1(2)$, $F_1(3)$, or $F_2$ and d = 2 or 3. More precisely, recall from Theorem 3.1(c) that irreducible such orbits correspond to elements of the Selmer group S(E) that are not in the subgroup S′ = S′(E) given by the image in S(E) of the marked points on E.

Let Φ be a subfamily of F that is defined by local congruence conditions modulo prime powers. For each prime p, we assume that the elliptic curves over $\mathbb{Z}_p$ defined by the congruence conditions modulo powers of p form a closed subset of $\mathbb{Z}_p^m$ with boundary of measure 0. We use $\Phi^{\mathrm{inv}}$ to denote the set $\{\vec{a} \in \mathbb{Z}^m : \vec{a} = \vec{a}(E) \text{ for some } E \in \Phi\}$, and $\Phi^{\mathrm{inv}}_p$ to denote the p-adic closure of $\Phi^{\mathrm{inv}}$ in $\mathbb{Z}_p^m$. We say that such a subfamily Φ of elliptic curves over Q is large (or acceptable) at p if $\Phi^{\mathrm{inv}}_p$ contains all elliptic curves E in the family such that $p^2 \nmid \Delta(E)$. The subfamily Φ of elliptic curves is called large (or acceptable) if it is large at all but finitely many primes p. In this section, we prove Theorem 1.2 for this slightly more general definition of an acceptable subfamily, using an appropriate sieve applied to the counts of G(Z)-orbits on V(Z) having bounded height as obtained in Section 7.

8.1 A weighted set U(Φ) in V (Z) corresponding to a large family Φ

Theorem 3.1(c) implies that non-S′ elements of the Selmer group of the Jacobian of the elliptic curve $E(\vec{a}) \in F$ for $\vec{a} \in \mathbb{Z}^m$ are in bijective correspondence with G(Q)-equivalence classes of irreducible locally soluble elements B ∈ V(Z) having invariants $M^i a_i$ and $M^i a'_i$ for all i; in this bijection, we have $H(B) = M^6 H(C)$. Let us write $\vec{a}_M$ to denote the vector $\vec{a}$ in which each $a_i$ and $a'_i$ are replaced by $M^i a_i$ and $M^i a'_i$, respectively.

In §7, we computed the asymptotic number of G(Z)-equivalence classes of irreducible elements B ∈ V(Z) having bounded height. In order to use this to compute the number of irreducible locally soluble G(Q)-equivalence classes of elements B ∈ V(Z) having invariants in

\[ \{\vec{a}_M : \vec{a} \in \Phi^{\mathrm{inv}}\} \tag{41} \]

and bounded height (where Φ is any large family), we need to count each G(Z)-orbit G(Z) · B with a weight of 1/n(B), where n(B) is equal to the number of G(Z)-orbits inside the G(Q)-equivalence class of B in V(Z).

To count the number of irreducible locally soluble G(Z)-orbits having invariants in the set (41) and bounded height, where each orbit G(Z) · B is weighted by 1/n(B), it suffices to count the number of such G(Z)-orbits of bounded height such that each orbit G(Z) · B is weighted instead


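by 1/m(B), where

\[ m(B) := \sum_{B' \in O(f)} \frac{\#\mathrm{Aut}_{\mathbb{Q}}(B')}{\#\mathrm{Aut}_{\mathbb{Z}}(B')} = \sum_{B' \in O(f)} \frac{\#\mathrm{Aut}_{\mathbb{Q}}(B)}{\#\mathrm{Aut}_{\mathbb{Z}}(B')}; \]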

here O(f) denotes a set of orbit representatives for the action of G(Z) on the G(Q)-equivalence class of B in V(Z), and $\mathrm{Aut}_{\mathbb{Q}}(B')$ (resp. $\mathrm{Aut}_{\mathbb{Z}}(B')$) denotes the stabilizer of B′ in G(Q) (resp. G(Z)). The reason it suffices to weight by 1/m(B) instead of 1/n(B) is that we have shown in the proof of Proposition 7.5 that all but a negligible number $o(X^{n/k})$ of G(Z)-orbits having bounded height have trivial stabilizer in G(Q) (and thus also in G(Z)), while the number of elliptic curves in Φ of bounded height is $\gg X^{n/k}$.

We use U(Φ) to denote the weighted set of all locally soluble elements in V(Z) having invariants in the set (41), where each element B ∈ U(Φ) is assigned a weight of 1/m(B). Then we have concluded that the weighted number of irreducible G(Z)-orbits of height less than $M^6 X$ in U(Φ) is asymptotically equal to the number of elements in S(E) \ S′(E) for elliptic curves E of height less than X in Φ.

The global weights m(B) assigned to elements B ∈ U(Φ) are useful for the following reason. For a prime p and any element $B \in V(\mathbb{Z}_p)$, define the local weight $m_p(B)$ by

\[ m_p(B) := \sum_{B' \in O_p(B)} \frac{\#\mathrm{Aut}_{\mathbb{Q}_p}(B)}{\#\mathrm{Aut}_{\mathbb{Z}_p}(B')}, \]

where $O_p(B)$ denotes a set of orbit representatives for the action of $G(\mathbb{Z}_p)$ on the $G(\mathbb{Q}_p)$-equivalence class of B in $V(\mathbb{Z}_p)$, and $\mathrm{Aut}_{\mathbb{Q}_p}(B)$ (resp. $\mathrm{Aut}_{\mathbb{Z}_p}(B)$) denotes the stabilizer of B in $G(\mathbb{Q}_p)$ (resp. $G(\mathbb{Z}_p)$). Using the fact that G has class number one, by an argument identical to [BS15a, Prop. 3.6], we have the following identity:

\[ m(B) = \prod_p m_p(B). \tag{42} \]

Thus the global weights of elements in U(Φ) are products of local weights, so we may express the global density of elements U(Φ) in V(Z) as products of local densities of the closures of the set U(Φ) in $V(\mathbb{Z}_p)$. We consider these local densities next.

8.2 Local densities of the weighted sets U(Φ)

Suppose that Φ is a large subfamily of elliptic curves in F, and for each prime p, let $\Phi_p$ denote the resulting family of curves defined by congruence conditions over $\mathbb{Z}_p$. Let U(Φ) denote the associated weighted set in V(Z), and let $U_p(\Phi)$ denote the p-adic closure of U(Φ) in $V(\mathbb{Z}_p)$. We can now determine the p-adic density of $U_p(\Phi)$, where each element $B \in U_p(\Phi)$ is weighted by $1/m_p(B)$, in terms of a local (p-adic) mass $M_p(V, \Phi)$ involving all elements of $E(\mathbb{Q}_p)/dE(\mathbb{Q}_p)$ for curves E in Φ over $\mathbb{Q}_p$; the proof is identical to [BS15a, Prop. 3.9]:

Proposition 8.1. Let J be the constant of Proposition 7.7, and let Φ be any large subfamily of elliptic curves in F. Then

\[ \int_{U_p(\Phi)} \frac{1}{m_p(v)}\,dv = |M^n J|_p \cdot \mathrm{Vol}(G(\mathbb{Z}_p)) \cdot M_p(V, \Phi), \]

where

\[ M_p(V, \Phi) := \int_{E = E(\vec{a}) \in \Phi_p} \ \sum_{\sigma \in E(\mathbb{Q}_p)/dE(\mathbb{Q}_p)} \frac{1}{\#E[d](\mathbb{Q}_p)}\, d\vec{a}. \]


In the analogous manner, if Φ is a large subfamily of F, then we may define $M_p(\Phi)$ to be the measure of $\Phi^{\mathrm{inv}}_p$ with respect to the measure $d\vec{a}$ on $\mathbb{Z}_p^m$, where the measure $d\vec{a}$ on $\mathbb{Z}_p^m$ is normalized so that the total measure is 1. That is, we have

\[ M_p(\Phi) = \int_{E = E(\vec{a}) \in \Phi_p} d\vec{a}. \tag{43} \]

In Section 9, we will be interested in comparing the masses Mp(V,Φ) and Mp(Φ).

8.3 Squarefree conditions

In this section, we describe conditions for elements in V(Z) that will be removed in the sieve for Selmer elements. For example, we show that conditions like insolubility at p imply that $p^2$ divides the discriminant (or a specific factor of it).

For two of the cases, the discriminant polynomial factors as a polynomial with repeated factors, so sieving naively for elements with squarefree discriminant would remove all elements. Specifically, for Case 5 (doubly symmetric Rubik's cubes), for an elliptic curve in $F_1(2)$, we have a factorization of the discriminant $\Delta = 16a_4^2(-4a_4 + a_2^2)$ (see [BH16, §5.2]); let $\alpha(v) = a_4$ and $\Delta'(v) = -4a_4 + a_2^2$, which are both degree 12 invariants of V. Similarly, for Case 6 (triply symmetric hypercubes), the discriminant polynomial for an elliptic curve in $F_1(3)$ factors as a rational multiple of $a_3^3(a_1^3 - 27a_3)$ [BH16, §6.3]; in this case, let $\alpha(v) = a_3$ and $\Delta'(v) = a_1^3 - 27a_3$, which are degree 6 invariants of V.

Recall that we define the reduced discriminant $\Delta_{\mathrm{red}}(v)$ to be the squarefree part of the discriminant, so it is $\alpha(v)\Delta'(v)$ for these two cases (and just ∆(v) for the other cases).
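The two factorizations can be sanity-checked against the standard Weierstrass discriminant formula in terms of $b_2, b_4, b_6, b_8$. The sketch below is an illustration under that assumption (the function names are ours); with this normalization the rational multiple in the $F_1(3)$ case turns out to be 1.

```python
def weierstrass_discriminant(a1, a2, a3, a4, a6):
    """Discriminant of y^2 + a1 xy + a3 y = x^3 + a2 x^2 + a4 x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    return -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6


def check_factorizations(bound=6):
    """Compare both factorizations with the general formula on an integer grid."""
    values = range(-bound, bound + 1)
    for u in values:
        for w in values:
            # F_1(2): y^2 = x^3 + a2 x^2 + a4 x with (a2, a4) = (u, w)
            if weierstrass_discriminant(0, u, 0, w, 0) != 16 * w ** 2 * (u ** 2 - 4 * w):
                return False
            # F_1(3): y^2 + a1 xy + a3 y = x^3 with (a1, a3) = (u, w)
            if weierstrass_discriminant(u, 0, w, 0, 0) != w ** 3 * (u ** 3 - 27 * w):
                return False
    return True
```

In both families the repeated factor ($a_4^2$, respectively $a_3^3$) is what makes a naive squarefree sieve on ∆ empty, which is why the text passes to α and ∆′.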

We will use the following definition repeatedly in the sequel:

Definition 8.2. For any integer polynomial $g(t_1, \ldots, t_r)$ where $p^2$ divides $g(\vec{b})$ with $\vec{b} \in \mathbb{Z}^r$, we say that $g(\vec{b})$ is a multiple of $p^2$ for "mod p reasons" if $p^2 \mid g(\vec{b}')$ for all $\vec{b}' \equiv \vec{b} \pmod{p}$, and for "mod $p^2$ reasons" otherwise.
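Definition 8.2 can be tested mechanically: since g has integer coefficients, whether $p^2$ divides $g(\vec{b}')$ depends only on $\vec{b}'$ modulo $p^2$, so it suffices to run over the $p^r$ lifts of $\vec{b}$ modulo $p^2$. The following sketch is illustrative only; the helper names and the choice of the short Weierstrass discriminant as a test polynomial are our assumptions, not part of the paper.

```python
from itertools import product


def divisibility_reason(g, b, p):
    """Classify why p^2 divides g(b), in the sense of Definition 8.2.

    Returns None if p^2 does not divide g(b), "mod p" if p^2 divides g(b')
    for every b' congruent to b modulo p, and "mod p^2" otherwise.
    """
    if g(*b) % (p * p) != 0:
        return None
    # All lifts of b modulo p^2 within its residue class modulo p.
    lifts = product(*[[bi + p * k for k in range(p)] for bi in b])
    if all(g(*lift) % (p * p) == 0 for lift in lifts):
        return "mod p"
    return "mod p^2"


def short_weierstrass_disc(a4, a6):
    """Discriminant of y^2 = x^3 + a4 x + a6 (example polynomial only)."""
    return -16 * (4 * a4 ** 3 + 27 * a6 ** 2)
```

For p = 5, the point (0, 0) gives divisibility for mod p reasons, whereas (−3, 2), where the discriminant vanishes but a nearby lift such as (2, 2) is not divisible by 25, gives divisibility for mod $p^2$ reasons.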

Proposition 8.3. Let v ∈ V(Z). If the covariant binary quartic or ternary cubic forms associated to v are insoluble at p (i.e., do not have a $\mathbb{Q}_p$-point) or if $m_p(v) \neq 1$, then

(i) $p^2$ divides the discriminant ∆(v), and

(ii) in Cases 5 and 6, either $p^2$ divides α(v) for mod p reasons, or p divides both α(v) and ∆′(v), or $p^2$ divides ∆′(v) for mod $p^2$ reasons.

Proof. For binary quartics and ternary cubics (Cases 1 and 2), this result is proved in [BS15a, Proposition 3.18] and [BS15b, Proposition 38], respectively.

Part (i) follows directly from Cases 1 and 2, since the discriminant of a bidegree (2, 2) form, a Rubik's cube, or a hypercube coincides with the discriminant of any of the covariant binary quartics or ternary cubics. For example, if a bidegree (2, 2) form (or a hypercube) v gives rise to a covariant binary quartic f that is insoluble at p, then ∆(v) = ∆(f) is divisible by $p^2$. Furthermore, if one of the covariant binary quartics f of v ∈ V(Z) has $m_p(f) \neq 1$, then there is an element of $\mathrm{SL}_2(\mathbb{Q}_p) \setminus \mathrm{SL}_2(\mathbb{Z}_p)$ that takes f to another integral binary quartic. By taking the identity in all other factors of $\mathrm{SL}_2(\mathbb{Q}_p)$, we then obtain a non-integral element of $\mathrm{SL}_2(\mathbb{Q}_p)^r$ (for r = 2 or 4, respectively) taking v to another element of $V(\mathbb{Z}_p)$, so $m_p(v) \neq 1$. The argument for Rubik's cubes is analogous.

For a doubly symmetric Rubik's cube v (Case 5), if any of the covariant ternary cubics of v is insoluble at p, then all the covariant cubics are. Let f be the covariant cubic det(Ax + By + Cz), when we view v as a triple of symmetric matrices (A, B, C). By the argument for ternary cubics (see [BS15b, Proposition 38]), we find that insolubility of f implies that f modulo p factors over


$\mathbb{F}_p$ into linear factors. The three singularities [x : y : z] of the curve f = 0 modulo p correspond to where the rank of the matrix Ax + By + Cz (modulo p) drops by 2. Thus, a change of coordinates (over $\mathbb{F}_p$) will take v to the triple $(E_{11}, E_{22}, E_{33})$ modulo p, where $E_{ij}$ is the 3 × 3 matrix with a 1 in the ij-th entry and 0 elsewhere. An easy explicit computation shows that $p^2$ divides α(v) for v congruent to $(E_{11}, E_{22}, E_{33})$ modulo p (so $p^2$ divides α(v) for "mod p reasons").

If a doubly symmetric Rubik's cube v = (A, B, C) has $m_p(v) \neq 1$, then there exists a nontrivial element $\gamma = (\gamma_1, \gamma_2) \in \mathrm{GL}_3(\mathbb{Q}_p)^2$, not in $\mathrm{GL}_3(\mathbb{Z}_p)^2$, such that $\gamma(v) \in V(\mathbb{Z}_p)$ and $(\det \gamma_1)(\det \gamma_2)^2 = 1$ (where, say, $V = V_1 \otimes \mathrm{Sym}^2(V_2)$ and $\gamma_1$ acts on $V_1$ and $\gamma_2$ acts on $V_2$). Without loss of generality, by scaling, we may take $\gamma_2$ to have determinant 1, $p^{-1}$, or $p^{-2}$ (so $\gamma_1$ has determinant 1, $p^2$, or $p^4$, respectively).

First suppose $\gamma_2$ has determinant 1. If $\gamma_1$ is nontrivial, then $\gamma_1$ also takes the covariant ternary cubic f = det(Ax + By + Cz) to an integral ternary cubic form f′. A change of basis puts $\gamma_1$ into the form $\begin{pmatrix} p^r & & \\ & p^s & \\ & & p^t \end{pmatrix}$, where r + s + t = 0 and r ≤ s ≤ t with at least one nonzero. Then either f or f′ has a linear factor when reduced modulo p; assume without loss of generality f factors into a linear and a quadratic factor modulo p. Then the curve f = 0 modulo p has at least two singularities (over $\mathbb{F}_p$), corresponding to where the rank of the matrix Ax + By + Cz modulo p drops by 2. As above, a change of coordinates (over $\mathbb{F}_p$) will take v to the triple $(E_{11}, E_{22}, C')$ modulo p for some symmetric matrix C′. An explicit computation (see footnote 2) shows that $p^2$ divides α(v) for v congruent to $(E_{11}, E_{22}, C')$ modulo p, so $p^2$ divides α(v) for mod p reasons.

Now if $\gamma_1$ is trivial, then $\gamma_2$ must be nontrivial, and $\gamma_2$ takes the other covariant ternary cubic g to an integral ternary cubic. Again, we may take $\gamma_2$ to be of the form $\begin{pmatrix} p^r & & \\ & p^s & \\ & & p^t \end{pmatrix}$, where r + s + t = 0 and r ≤ s ≤ t with at least one nonzero. This implies that either v or $\gamma_2(v)$, up to an appropriate change of coordinates, has the following factors of p in each of the three matrices:

\[ \begin{pmatrix} p^2 & p & p \\ p & & \\ p & & \end{pmatrix}. \]

Then the ternary cubic f is a multiple of $p^2$, so both the invariants α(v) and ∆′(v) are divisible by $p^2$.

If $\gamma_2$ has determinant $p^{-u}$ for u = 1 or 2, then we can similarly change the basis to make $\gamma_1$ a diagonal matrix $(p^r, p^s, p^t)$ with r + s + t = 2u and r ≤ s ≤ t. To examine how γ acts on the coefficients of the covariant ternary cubic form f(x, y, z), first note that $\gamma_2$ sends f to $p^{-u} f$. We claim that either γ(f) modulo p is divisible by x, or f modulo p is divisible by z, or γ(f) is a multiple of p. This is a straightforward computation: for u = 1, if s ≥ 1, then under the action of γ, the coefficients of xyz, $y^3$, $y^2z$, $yz^2$, and $z^3$ in f are all multiplied by positive powers of p, so γ(f) modulo p is divisible by x. If s ≤ 0, then the coefficients of $x^3$, $x^2y$, $xy^2$, and $y^3$ are multiplied by negative powers of p by the action of γ, so since f is integral, it must be divisible by z modulo p. For u = 2, if r ≤ 0 and s ≥ 1, then x divides γ(f) modulo p; if r ≤ 0 and s ≤ 0, then z divides f modulo p; and if (r, s, t) = (1, 1, 2) (the only remaining case), then p divides γ(f). Now we may use the arguments from the case of $\det \gamma_2 = 1$, since either f or γ(f) has a linear factor when reduced modulo p or γ(f) is a multiple of p.

For triply symmetric hypercubes v (Case 6), first suppose the covariant binary quartics arising from v are insoluble at p. Then they must be a square of a quadratic polynomial modulo p (possibly a fourth power of a linear factor). Viewing v as a pair of binary cubic forms (A, B) in variables t and u, we have that the pencil of binary cubic forms has two points where the cubic is in fact the cube of a linear form modulo p. In other words, up to appropriate changes of coordinates, we have $A = t^3$ and $B = u^3$ modulo p. It is trivial to check in this case that the invariant α(v) is divisible

Footnote 2. This computation may be done without having an explicit formula for α by computing the usual invariants for the ternary cubic f, which are degree 12 and 18 in the entries of V, and comparing them modulo low powers of p to the degree 6 and 12 G-invariants of V, the latter of which is α.


by $p^2$, and for mod p reasons.

If a triply symmetric hypercube v has $m_p(v) \neq 1$, then there exists a nontrivial element $\gamma = (\gamma_1, \gamma_2) \in \mathrm{GL}_2(\mathbb{Q}_p)^2$, not in $\mathrm{GL}_2(\mathbb{Z}_p)^2$, such that $\gamma(v) \in V(\mathbb{Z}_p)$ and $(\det \gamma_1)(\det \gamma_2)^3 = 1$ (where, say, $V = V_1 \otimes \mathrm{Sym}^3(V_2)$ and $\gamma_1$ acts on $V_1$ and $\gamma_2$ acts on $V_2$). A change of basis puts $\gamma_1$ into the form $\begin{pmatrix} p^r & \\ & p^s \end{pmatrix}$ and $\gamma_2$ into the form $\begin{pmatrix} p^t & \\ & p^u \end{pmatrix}$ for integers r, s, t, u. Since the action of the diagonal matrices $(p^3\,\mathrm{Id}_2, p\,\mathrm{Id}_2)$ on $V_1 \otimes \mathrm{Sym}^3(V_2)$ is trivial, we may assume $\gamma_2$ has determinant 1 or $p^{-1}$.

If $\gamma_2$ has determinant 1, so does $\gamma_1$, and r = −s and t = −u. First suppose $\gamma_1$ is nontrivial, so without loss of generality, we may take r > 0. Then $\gamma_1$ takes the covariant binary quartic f = disc(Ax + By) to an integral binary quartic form f′. The binary cubic B is thus a multiple of p, in which case it is easy to see that $p^3$ divides both α(v) and ∆′(v). If instead $\gamma_1$ is trivial and $\gamma_2$ is nontrivial, then since we may take t > 0, we find that the binary cubic forms A and B must have multiple factors of p in their coefficients, namely both are of the form $c_3X^3 + c_2X^2Y + p^r c_1XY^2 + p^{3r}c_0Y^3$ for $c_i \in \mathbb{Z}_p$. Again, this immediately implies that both α(v) and ∆′(v) are divisible by $p^3$.

If $\gamma_2$ has determinant $p^{-1}$, then we have r = 3 − s and u = −t − 1. Then the action of $(\gamma_1, \gamma_2)$

on a pair of binary cubics (A, B) with integral coefficients $((a_0, a_1, a_2, a_3), (b_0, b_1, b_2, b_3))$ produces a pair of cubics whose corresponding (integral) coefficients are scaled by the following powers of p:

\[ \bigl((r + 3t,\ r + t - 1,\ r - t - 2,\ r - 3t - 3),\ (-r + 3t + 3,\ -r + t + 2,\ -r - t + 1,\ -r - 3t)\bigr). \]

Note that the powers for the coefficients of B are the powers for A, negated and reversed. When any of the powers is negative, we find that the corresponding coefficient is divisible by the negative of that power of p. For example, if r = 2 and t = 1, then the integrality of both (A, B) and γ(A, B) implies that p divides $a_2$, $p^4$ divides $a_3$, $p^2$ divides $b_2$, and $p^5$ divides $b_3$. Explicit computations with the invariants α and ∆′ in this case give the following implications:

(i) if $p \mid a_2, a_3, b_2, b_3$, then $p^2 \mid \alpha$ and $p^2 \mid \Delta'$;

(ii) if $p \mid a_3, b_1, b_2, b_3$, then $p \mid \alpha$ and $p \mid \Delta'$;

(iii) if $p \mid b_0, b_1, b_2, b_3$ (i.e., B is a multiple of p), then $p^3 \mid \alpha$ and $p^3 \mid \Delta'$.

Without loss of generality, we may assume r ≥ 2 and t ≥ 0. Then one of the above three cases holds unless r = 3t + 3 or r = t + 2. If r = 3t + 3, then $p^i$ divides $b_i$ for i = 1, 2, 3, so we compute that $p^2$ divides α. If r = t + 2, then p divides $a_3$, p divides $b_2$, and $p^2$ divides $b_3$; in this case, we compute that $p^2$ divides ∆′, and for mod $p^2$ reasons.
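The exponent bookkeeping in the determinant $p^{-1}$ case can be reproduced mechanically. The sketch below is an illustration with hypothetical helper names; it assumes that the i-th coefficient of each cubic is scaled by $p^{(3-i)t + iu}$ under $\gamma_2 = \mathrm{diag}(p^t, p^u)$ and that A and B are scaled by $p^r$ and $p^s$ under $\gamma_1$, with s = 3 − r and u = −t − 1. It then checks which of the divisibility patterns (i) to (iii) is forced.

```python
def scaling_exponents(r, t):
    """Powers of p scaling (a0..a3) and (b0..b3), assuming s = 3 - r, u = -t - 1."""
    s, u = 3 - r, -t - 1
    a = tuple(r + (3 - i) * t + i * u for i in range(4))
    b = tuple(s + (3 - i) * t + i * u for i in range(4))
    return a, b


def forced_valuations(r, t):
    """Minimal p-adic valuations forced on the coefficients by integrality."""
    a, b = scaling_exponents(r, t)
    return tuple(max(0, -e) for e in a), tuple(max(0, -e) for e in b)


def matching_case(r, t):
    """Return which of the cases (i), (ii), (iii) applies, or None."""
    va, vb = forced_valuations(r, t)
    if min(va[2], va[3], vb[2], vb[3]) >= 1:
        return "i"
    if min(va[3], vb[1], vb[2], vb[3]) >= 1:
        return "ii"
    if min(vb) >= 1:
        return "iii"
    return None
```

For r = 2, t = 1 this reproduces the example in the text, and over a range of (r, t) with r ≥ 2, t ≥ 0 the only pairs not covered by the three cases are those on the two lines r = 3t + 3 and r = t + 2.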

8.4 Uniformity estimates and a squarefree sieve

To obtain the cases of equality in Theorem 1.1, we require a more general version of Theorem 7.9, namely one that counts weighted elements of V(Z), where the weight functions are defined by appropriate infinite sets of congruence conditions. A function $\phi : V(\mathbb{Z}) \to [0, 1] \subset \mathbb{R}$ is said to be defined by congruence conditions if, for all primes p, there exist functions $\phi_p : V(\mathbb{Z}_p) \to [0, 1]$ satisfying the following conditions:

(1) For all B ∈ V(Z), the product $\prod_p \phi_p(B)$ converges to φ(B).

(2) For each prime p, the function $\phi_p$ is locally constant outside some closed set $S_p \subset V(\mathbb{Z}_p)$ of measure zero.


We say that such a function φ is acceptable if for sufficiently large primes p, we have $\phi_p(B) = 1$ whenever $p^2$ divides $\Delta_{\mathrm{red}}(B)$.

Our purpose in this section is to prove the following generalization of Theorem 7.9, which allows for certain infinite sets of congruence conditions:

Theorem 8.4. Let $\phi : V(\mathbb{Z}) \to [0, 1]$ be an acceptable function that is defined by congruence conditions via the local functions $\phi_p : V(\mathbb{Z}_p) \to [0, 1]$. Then, with notation as in Theorem 7.9, we have:

\[ N_\phi(V(\mathbb{Z})^{(i)}; X) \le N(V(\mathbb{Z})^{(i)}; X) \prod_p \int_{B \in V(\mathbb{Z}_p)} \phi_p(B)\, dB + o(X^{n/k}), \tag{44} \]

with equality in Cases 1, 2, 4, and 5 of Table 1.

To prove Theorem 8.4, we follow the method of [Bha13] to establish the following tail estimate:

Proposition 8.5. Let $W_p(V)$ be the set of v ∈ V(Z) such that $p^2$ divides $\Delta_{\mathrm{red}}(v)$. In Cases 1, 2, 4, and 5 of Table 1, for any fixed ε > 0, we have

\[ N(\cup_{p>M} W_p(V); X) = O_\epsilon\bigl(X^{n/k}/(M \log M) + X^{(n-1)/k}\bigr) + O(\epsilon X^{n/k}). \tag{45} \]

We expect Proposition 8.5 to hold also for Cases 3, 6, and 7 (which together with Theorem 8.7 would imply that the upper bounds in Theorem 1.1 are exact averages for those cases).

Proof. If $p^2 \le X^{1/k}$, then the counting method of §7.7, with the relevant congruence conditions modulo $p^2$ imposed, immediately yields the individual estimate $N(W_p(V); X) = O(X^{n/k}/p^2)$ (noting that $R^{(i)}(X) = X^{1/k} R^{(i)}(1)$). Hence, to prove Proposition 8.5, it suffices to assume that $M > X^{1/(2k)}$.

To simplify notation, let ∆′ = ∆ for Cases 1, 2, and 4. Let $W_p^{(2)}$ denote the set of B ∈ V(Z) such that $p^2 \mid \Delta'(B)$ for "mod $p^2$ reasons", i.e., such that there exists B′ ≡ B (mod p) such that $p^2 \nmid \Delta'(B')$. Let $W_p^{(1)} := W_p(V) \setminus W_p^{(2)}$; for $B \in W_p^{(1)}$, we have that $p^2$ divides ∆′(B) for mod p reasons or (only relevant in Case 5) either p divides both α(B) and ∆′(B) or $p^2$ divides α(B). In Case 5, it is easy to check that if $p^2$ divides α(B) for mod $p^2$ reasons, then p divides ∆′(B) also. The two sets $W_p^{(1)}$ and $W_p^{(2)}$ are preserved under G(Z)-transformations.

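For any ε > 0, let $\mathcal{F}^{(\epsilon)} \subset \mathcal{F}$ denote the subset of elements $n\,a(t_i, u_i)\,k \in \mathcal{F}$ such that $t_i$ and $u_i$ are bounded above by an appropriate constant to ensure that $\mathrm{Vol}(\mathcal{F}^{(\epsilon)}) = (1 - \epsilon)\mathrm{Vol}(\mathcal{F})$. Then $\mathcal{F}^{(\epsilon)} \cdot R^{(i)}(X)$ is a bounded domain in V(R) that expands homogeneously with X. By [Bha13, Theorem 3.3], we have

\[ \#\bigl\{\mathcal{F}^{(\epsilon)} \cdot R^{(i)}(X) \cap (\cup_{p>M} W_p^{(1)})\bigr\} = O_\epsilon\bigl(X^{n/k}/(M \log M) + X^{(n-1)/k}\bigr). \tag{46} \]

Furthermore, the results of §7.5 imply that

\[ \#\bigl\{(\mathcal{F} \setminus \mathcal{F}^{(\epsilon)}) \cdot R^{(i)}(X) \cap V(\mathbb{Z})^{\mathrm{irr}}\bigr\} = O(\epsilon X^{n/k}). \tag{47} \]

Combining the two estimates (46) and (47) yields (45) with $W_p$ replaced with $W_p^{(1)}$.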

Proposition 8.5 is already known in Cases 1 and 2 of Table 1 (cf. [BS15a, BS15b]), so we prove the estimate for $W_p^{(2)}$ only for Cases 4 and 5, where the elements of V(Z) are (possibly symmetric) 3 × 3 × 3 matrices.

Suppose B belongs to $W_p^{(2)}$. Let $f(x, y, z) = \det(B_{1jk}x + B_{2jk}y + B_{3jk}z)$ be the first of the three ternary cubic forms arising from B; then ∆(B) = ∆(f). Note that the discriminant of f


must also be a multiple of $p^2$ for mod $p^2$ reasons (for otherwise B would then be in $W_p^{(1)}$). In [BS15b, Prop. 25], it was shown that if a ternary cubic form f has discriminant a multiple of $p^2$ for mod $p^2$ reasons, then there is an $\mathrm{SL}_3(\mathbb{Z})$-transformation taking f to f′, such that p divides the $xz^2$- and $yz^2$-coefficients and $p^2$ divides the $z^3$-coefficient of f′(x, y, z). Let B′ be the result of the corresponding $\mathrm{SL}_3(\mathbb{Z})$-transformation on B; then f′ is the first ternary cubic form arising from B′.

Since the $z^3$-coefficient of f′ is a multiple of $p^2$, we see that the determinant of the 3 × 3 matrix $(B'_{3jk})$ is a multiple of $p^2$, and it must be so for mod $p^2$ reasons. It follows that the matrix $(B'_{3jk})$ modulo p has rank 2. By an $\mathrm{SL}_3(\mathbb{Z}) \times \mathrm{SL}_3(\mathbb{Z})$-transformation in Case 4, or simply an $\mathrm{SL}_3(\mathbb{Z})$-transformation in Case 5, we may obtain an element B′′ ∈ V(Z) from B′ such that: a) the last row and column of $(B''_{3jk})$ is a multiple of p; b) the determinant of the 2 × 2 matrix $(B''_{3jk})_{1 \le j,k \le 2}$ is coprime to p; and c) $B''_{333}$ is a multiple of $p^2$. Note that the first of the associated ternary cubic forms of B′′ remains f′. The fact that p divides the coefficients of $xz^2$ and $yz^2$ implies that $B''_{133}$ and $B''_{233}$ are also multiples of p.

Define the element B′′′ by

\[ \bigl(\mathrm{diag}(1, 1, p^{-1}),\ \mathrm{diag}(1, 1, p^{-1}),\ \mathrm{diag}(1, 1, p^{-1})\bigr) \cdot pB \quad\text{or}\quad \bigl(\mathrm{diag}(1, 1, p^{-1}),\ \mathrm{diag}(1, 1, p^{-1})\bigr) \cdot pB \tag{48} \]

depending on whether we are in Case 4 or 5, respectively. Then B′′′ has the same discriminant as B and is in $W_p^{(1)}$, because its first associated ternary cubic form f′′′ has its $x^3$-, $x^2y$-, $xy^2$-, and $y^3$-coefficients divisible by p.

We therefore have obtained a discriminant-preserving map φ from G(Z)-orbits on $W_p^{(2)}$ to G(Z)-orbits on $W_p^{(1)}$. The following lemma states that this map is at most 3 to 1:

Lemma 8.6. Given a G(Z)-orbit on $W_p^{(1)}$, there are at most three G(Z)-orbits on $W_p^{(2)}$ that map to it under φ.

Proof. Let B be an element of $W_p^{(2)}$ and f its first associated ternary cubic form, i.e., $f(x, y, z) = \det(B_{1jk}x + B_{2jk}y + B_{3jk}z)$. If the reduction of $f \in W_p^{(2)}$ modulo p has a nodal singularity at $[0 : 0 : 1] \in \mathbb{P}^2(\mathbb{F}_p)$ (i.e., its $xz^2$-, $yz^2$-, and $z^3$-coefficients vanish modulo p), then the first associated ternary cubic form f′′′ of the 3 × 3 × 3 matrix (48), when reduced modulo p, has z as a factor. Moreover, for such an f′′′, the first associated ternary cubic form

\[ \mathrm{diag}(1, 1, p) \cdot p^{-1} f''' \tag{49} \]

of the 3 × 3 × 3 matrix

\[ \bigl(\mathrm{diag}(1, 1, p),\ \mathrm{diag}(1, 1, p),\ \mathrm{diag}(1, 1, p)\bigr) \cdot p^{-1}B \quad\text{or}\quad \bigl(\mathrm{diag}(1, 1, p),\ \mathrm{diag}(1, 1, p)\bigr) \cdot p^{-1}B \tag{50} \]

can be integral only if the $x^3$-, $x^2y$-, $xy^2$-, and $y^3$-coefficients of f′′′ are zero modulo p. Therefore, the preimages under φ of the G(Z)-orbit of $B \in W_p^{(1)}$ are associated to linear factors of the reduction of f′′′ modulo p. The reduction of f′′′ modulo p has at most 3 linear factors, unless f′′′ ≡ 0 (mod p), in which case (50) belongs to $W_p^{(1)}$. Thus, the map $\phi : G(\mathbb{Z}) \backslash W_p^{(2)} \to G(\mathbb{Z}) \backslash W_p^{(1)}$ is at most 3 to 1, and the lemma follows.

Therefore, since discriminants less than X can have at most 2k distinct prime factors $p > M > X^{1/(2k)}$, we obtain

\[ N(\cup_{p>M} W_p^{(2)}; X) \le 3 \cdot 2k \cdot N(\cup_{p>M} W_p^{(1)}; X) = O_\epsilon\bigl(X^{n/k}/(M \log M) + X^{(n-1)/k}\bigr) + O(\epsilon X^{n/k}). \tag{51} \]

This concludes the proof of the proposition.


8.5 Weighted count of elements in U(Φ) having bounded height

For a large family Φ, we may now describe the asymptotic number of G(Z)-orbits in U(Φ) having bounded height.

Theorem 8.7. Let Φ be any large subfamily of F. Then $N(U(\Phi); M^6X)$, the weighted number of G(Z)-orbits in U(Φ) having height less than $M^6X$, is given by

\[ N(U(\Phi); M^6X) \le M^{6n/k} \cdot \sum_{i=1}^{N} \frac{\mathrm{Vol}(R_X)}{n_i} \cdot \prod_p \int_{U_p(\Phi)} \frac{1}{m_p(v)}\,dv \cdot X^{n/k} + o(X^{n/k}), \tag{52} \]

with equality in Cases 1, 2, 4, and 5 of Table 1.

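Proof. By Theorem 7.1, Theorem 7.9, and the multiplicativity of weights (42), it follows that for any fixed positive integer Y, we have

\[ \lim_{X\to\infty} \frac{N(V(\mathbb{Z}) \cap [\cap_{p<Y} U_p(\Phi)]; M^6X)}{(M^6X)^{n/k}} = \sum_{i=1}^{N} \frac{\mathrm{Vol}(R_1)}{n_i} \cdot \prod_{p<Y} \int_{U_p(\Phi)} \frac{1}{m_p(v)}\,dv, \]

where $V(\mathbb{Z}) \cap [\cap_{p<Y} U_p(\Phi)]$ is viewed as a weighted set in which each element B is weighted by 1/m(B). Letting Y tend to infinity, we obtain that

\[ \limsup_{X\to\infty} \frac{N(U(\Phi); M^6X)}{X^{n/k}} \le M^{6n/k} \cdot \sum_{i=1}^{N} \frac{\mathrm{Vol}(R_X)}{n_i} \cdot \prod_p \int_{U_p(\Phi)} \frac{1}{m_p(v)}\,dv. \tag{53} \]

To obtain a lower bound for $N(U(\Phi); M^6X)$ in Cases 1, 2, 4, and 5 of Table 1, we note that

\[ V(\mathbb{Z}) \cap \bigcap_{p<M} U_p(\Phi) \subset \Bigl(U(\Phi) \cup \bigcup_{p>M} W_p\Bigr) \]

(even as weighted sets, since all weights in $U_p(\Phi)$ are less than 1). Hence, by the uniformity estimate of Proposition 8.5, we obtain

\[ \liminf_{X\to\infty} \frac{N(U(\Phi); M^6X)}{(M^6X)^{n/k}} \ge \sum_{i=1}^{N} \frac{\mathrm{Vol}(R_X)}{n_i} \cdot \prod_{p<Y} \int_{U_p(\Phi)} \frac{1}{m_p(v)}\,dv - O(Y^{-1}). \]

Letting Y tend to infinity completes the proof.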

It remains to evaluate expression (52) in terms of the total number of elliptic curves in Φ having height less than X.

9 Proof of Theorem 1.1

9.1 The number of elliptic curves in Φ of bounded height

Theorem 9.1. Let Φ be any large subfamily of F. Then the number of elliptic curves E in Φ with H(E) < X is given by

\[ \int_{H(\vec{a})<X} d\vec{a} \cdot \prod_p M_p(\Phi) \cdot X^{n/k} + o(X^{(n-1)/k}). \]


This theorem is proved in much the same manner as Theorem 8.7 (but is easier), again using the uniformity estimate of Proposition 8.5 for the families $F = F_0$, $F_1$, $F_1(2)$, and $F_1(3)$.

For the family $F = F_2$, the proof of Theorem 9.1 is more involved and comprises the remainder of this subsection. We embed the family $F_2$ of elliptic curves into the space of hypercubes (using the Kostant section, so that the image of each curve in $F_2$ corresponds to the trivial Selmer element associated to that curve), and then we establish an appropriate uniformity estimate in the "cusp" region of the space of hypercubes. The main idea is to consider a subspace of the space of hypercubes, which mostly coincides with the reducible elements from Lemma 6.1; the action of the subgroup of $G = \mathrm{GL}_2^4$ preserving this subspace has additional invariants, and one may estimate the number of orbits with one of those invariants large and fixed. Similar ideas were used in [BSW16] for counting monic integer polynomials with squarefree discriminant.

Recall that a curve in the family F2 is of the form

y2 + a1xy + a3y = (x− a2)(x− a′2)(x− a′′2) (54)

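with all $a_i$ integral and $a_2 + a'_2 + a''_2 = 0$ (and nonzero discriminant). We embed $F_2$ into the space of hypercubes (see (17) in Case 7(b) in §4) by sending the curve with invariants $(a_1, a_2, a'_2, a''_2, a_3)$ to the hypercube

\[ \begin{array}{cc} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} & \begin{pmatrix} 0 & 1 \\ 1 & a_1 \end{pmatrix} \\[6pt] \begin{pmatrix} 1 & 0 \\ 0 & -a_2 \end{pmatrix} & \begin{pmatrix} 0 & -a'_2 \\ -a''_2 & a_3 \end{pmatrix} \end{array}. \tag{55} \]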

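The discriminant of the hypercube (55) (e.g., the discriminant of the covariant binary quartics) is the discriminant ∆ of the curve (54). The space of hypercubes of the form (55) is contained in the subspace $W_0$ of hypercubes of the form

\[ \begin{array}{cc} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} & \begin{pmatrix} 0 & 1 \\ 1 & * \end{pmatrix} \\[6pt] \begin{pmatrix} \ddagger & * \\ * & * \end{pmatrix} & \begin{pmatrix} * & * \\ * & * \end{pmatrix} \end{array}, \tag{56} \]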

where ∗ and ‡ denote any element of Q (or more generally, of the base field). This space $W_0$ is fixed by $U^4 \subset \mathrm{SL}_2^4$, where U denotes the group of lower triangular unipotent matrices. Let H be the subgroup of $\mathrm{GL}_2^4$ fixing this space $W_0$; it contains $U^4$ as well as, e.g., quadruples of diagonal matrices such as $([1, \gamma], [1, \gamma], [1, \gamma], [\gamma^{-2}, \gamma^{-1}])$.

Note that the subspace $W_0$ has an obvious $U^4$-invariant, namely the entry denoted by ‡. We call this the Q-invariant of the element of $W_0$ (after the analogous invariant in [BSW16]). Furthermore, the discriminant of any hypercube of the form (56) has a square factor of the Q-invariant.

We want to show that if $p^2$ divides the discriminant ∆ of the curve (54) for mod $p^2$ reasons, then we can find a hypercube of the form (56) that is H-equivalent to (55) such that $p^2$ divides its discriminant for mod p reasons and its Q-invariant is p.

Suppose $p^2$ divides ∆ for mod $p^2$ reasons. Then all the covariant binary quartic forms also have discriminant (equal to ∆) a multiple of $p^2$ for mod $p^2$ reasons. There exists a lower triangular unipotent transformation from U that transforms such a binary quartic to one with last coefficient divisible by $p^2$ and second-to-last coefficient divisible by p. Apply such a transformation to (55) in the direction of the top and bottom rows of matrices; then the top row (cube) is unchanged and the bottom row (cube) is of the form

\[ \begin{pmatrix} 1 & 0 \\ 0 & * \end{pmatrix} \quad \begin{pmatrix} 0 & * \\ * & * \end{pmatrix}. \tag{57} \]


Because the transformed binary quartic form f has last coefficient divisible by $p^2$, the cube (57) has discriminant a multiple of $p^2$ (for mod $p^2$ reasons also). Thus, an appropriate lower triangular unipotent transformation moves one of its covariant binary quadratic forms $q_1$ (see [Bha04] for the construction of three binary quadratic forms from a 2 × 2 × 2 cube) to one with last coefficient divisible by $p^2$ and middle coefficient divisible by p. Finally, using lower triangular unipotent linear transformations in the other two directions, we obtain a cube of the following form modulo p:

\[ \begin{pmatrix} 1 & * \\ * & * \end{pmatrix} \quad \begin{pmatrix} \text{unit} & 0 \\ 0 & 0 \end{pmatrix} \tag{58} \]

Note that if the upper left entry of the second matrix were not a unit, then the discriminant of the cube (57) would have been a multiple of $p^2$ for mod p reasons.

Because p divides the middle coefficient of the binary quadratic form $q_1$, we in fact must have that the lower right entry of the left matrix of (58) is divisible by p. Since $p^2$ divides the discriminant of the cube (58), the lower right entry of the second matrix must in fact be divisible by $p^2$.

Finally, we note that elements of $U^4$ do not change most of the top row of (55); only the entry where $a_1$ originally was in (55) can be changed. In fact, after all the transformations, that entry is divisible by p, since the binary quartic form f is not changed by any of the last three transformations and its second-to-last coefficient is divisible by p.

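Therefore, after the lower triangular unipotent transformations described above in the four directions, we obtain a hypercube of the form

\[ \begin{array}{cc} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} & \begin{pmatrix} 0 & 1 \\ 1 & p* \end{pmatrix} \\[6pt] \begin{pmatrix} 1 & * \\ * & p* \end{pmatrix} & \begin{pmatrix} * & p* \\ p* & p^2* \end{pmatrix} \end{array}. \tag{59} \]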

Multiplying the hypercube (59) by $p^2$ and acting by $\begin{pmatrix} 1 & 0 \\ 0 & p^{-1} \end{pmatrix}$ in all four directions gives the hypercube

\[ \begin{array}{cc} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} & \begin{pmatrix} 0 & 1 \\ 1 & * \end{pmatrix} \\[6pt] \begin{pmatrix} p & * \\ * & * \end{pmatrix} & \begin{pmatrix} * & * \\ * & * \end{pmatrix} \end{array} \tag{60} \]

with Q-invariant p.

As in Section 8.4 (and especially Proposition 8.5), the difficult part of Theorem 9.1 is to estimate the number of curves in $F_2$ whose discriminant is divisible by $p^2$ for mod $p^2$ reasons (in the language of §8.4, an estimate for the size of $W_p^{(2)}$). From the argument above, it suffices to estimate the number of H-equivalence classes of elements of the form (56) with Q-invariant larger than M. This estimate follows from arguments analogous to those in [BSW16, §3.4].
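The passage from (59) to (60) is a bookkeeping of p-adic valuations, which can be traced mechanically. The sketch below is an illustration under an indexing convention of our own choosing: each entry is labelled by (row, column, i, j) with all indices in {0, 1} and the bottom row labelled 1, so that multiplying by $p^2$ and acting by diag(1, $p^{-1}$) in all four directions shifts the valuation of an entry by 2 − (row + column + i + j). Zero entries are given valuation infinity, and a generic entry ∗ is given the lower bound 0.

```python
from itertools import product

INF = float("inf")  # valuation of a zero entry

# Lower bounds on the p-adic valuations of the entries of (59),
# keyed by (row, column) of the 2 x 2 array of 2 x 2 matrices.
HYPERCUBE_59 = {
    (0, 0): [[INF, INF], [INF, 0]],  # (0 0; 0 1)
    (0, 1): [[INF, 0], [0, 1]],      # (0 1; 1 p*)
    (1, 0): [[0, 0], [0, 1]],        # (1 *; * p*)
    (1, 1): [[0, 1], [1, 2]],        # (* p*; p* p^2 *)
}


def rescale(valuations):
    """Valuations after multiplying by p^2 and acting by diag(1, 1/p) four times."""
    out = {}
    for (row, col), block in valuations.items():
        for i, j in product(range(2), repeat=2):
            out[(row, col, i, j)] = block[i][j] + 2 - (row + col + i + j)
    return out
```

Under these assumptions every entry of the rescaled hypercube has nonnegative valuation, the entries equal to 1 in the top row keep valuation 0, and the entry in position ‡ acquires valuation exactly 1, matching the shape (60) with Q-invariant p.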

9.2 Evaluation of the average size of the 2-Selmer group

We now have the following theorem, from which Theorem 1.2 (and thus Theorem 1.1) will be seen to follow.


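Theorem 9.2. Let Φ be any large subfamily of F. Then we have

\[
\lim_{X\to\infty} \frac{\displaystyle\sum_{\substack{E \in \Phi \\ H(E)<X}} \#(S(E) \setminus S'(E))}{\displaystyle\sum_{\substack{E \in \Phi \\ H(E)<X}} 1}
= \frac{\displaystyle |J| \cdot \mathrm{Vol}(G(\mathbb{Z}) \backslash G(\mathbb{R})) \cdot \sum_{i=1}^{N} \frac{1}{n_i} \cdot \int_{\substack{H(\vec{a})<X \\ \pm\Delta(\vec{a})>0}} d\vec{a}}{\displaystyle\sum_{i=1}^{n} \int_{\substack{H(\vec{a})<X \\ \pm\Delta(\vec{a})>0}} d\vec{a}}
\cdot \prod_p \frac{\displaystyle |J|_p \cdot \mathrm{Vol}(G(\mathbb{Z}_p)) \cdot \int_{C = C(\vec{a}) \in \Phi_p} \ \sum_{\sigma \in E(\mathbb{Q}_p)/dE(\mathbb{Q}_p)} \frac{1}{\#E[d](\mathbb{Q}_p)}\, d\vec{a}}{\displaystyle\int_{C = C(\vec{a}) \in \Phi_p} d\vec{a}}.
\]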

Proof. This follows by combining Proposition 8.1, Theorem 8.7 and expression (35) for the volume $\mathrm{Vol}(R_X)$, and Theorem 9.1.

In order to evaluate the right hand side of the expression in Theorem 9.2, we use the following fact (see [BK77, Lemma 3.1]):

Lemma 9.3. Let J be an abelian variety over Qp of dimension n. Then

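\[ \#\bigl(J(\mathbb{Q}_p)/dJ(\mathbb{Q}_p)\bigr) = \begin{cases} \#J[d](\mathbb{Q}_p) & \text{if } p \neq d; \\ d^n \cdot \#J[d](\mathbb{Q}_p) & \text{if } p = d. \end{cases} \]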

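Proof. It follows from the theory of formal groups that there exists a subgroup $M \subset J(\mathbb{Q}_p)$ of finite index that is isomorphic to $\mathbb{Z}_p^n$. Let H denote the finite group $J(\mathbb{Q}_p)/M$. Then by applying the snake lemma to the following diagram

\[
\begin{array}{ccccccccc}
0 & \longrightarrow & M & \longrightarrow & J(\mathbb{Q}_p) & \longrightarrow & H & \longrightarrow & 0 \\
 & & \downarrow{\scriptstyle [d]} & & \downarrow{\scriptstyle [d]} & & \downarrow{\scriptstyle [d]} & & \\
0 & \longrightarrow & M & \longrightarrow & J(\mathbb{Q}_p) & \longrightarrow & H & \longrightarrow & 0
\end{array}
\]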

we obtain the exact sequence

0→M [d]→ J(Qp)[d]→ H[d]→M/dM → J(Qp)/dJ(Qp)→ H/dH → 0.

Since H is a finite group and M is isomorphic to Znp , Lemma 9.3 follows.
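The last step can be spelled out as an order count in the six-term exact sequence. The following is a sketch under the assumptions already made in the proof ($M \cong \mathbb{Z}_p^n$ is torsion-free and H is finite), with d a prime as in the applications d = 2, 3:

```latex
% Alternating product of orders in the exact sequence
% 0 -> M[d] -> J(Q_p)[d] -> H[d] -> M/dM -> J(Q_p)/dJ(Q_p) -> H/dH -> 0
\[
\#\bigl(J(\mathbb{Q}_p)/dJ(\mathbb{Q}_p)\bigr)
  = \#J(\mathbb{Q}_p)[d] \cdot \frac{\#(M/dM)}{\#M[d]} \cdot \frac{\#(H/dH)}{\#H[d]} .
\]
% M is torsion-free, and multiplication by d on a finite abelian group
% has kernel and cokernel of the same order.
\[
\#M[d] = 1, \qquad \#H[d] = \#(H/dH), \qquad
\#(M/dM) = \#(\mathbb{Z}_p/d\mathbb{Z}_p)^n =
\begin{cases} 1 & \text{if } p \neq d, \\ d^n & \text{if } p = d. \end{cases}
\]
```

Substituting the three identities into the first display gives exactly the two cases of Lemma 9.3.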

The expression on the right hand side in Theorem 9.2 thus reduces simply to the Tamagawa number $\tau(G) = \mathrm{Vol}(G(\mathbb{Z}) \backslash G(\mathbb{R})) \cdot \prod_p \mathrm{Vol}(G(\mathbb{Z}_p))$ of G. This gives the average number of elements in S(E) \ S′(E) over all elliptic curves E ∈ Φ.

Meanwhile, the size of S′(E) is 1 except in the cases where $F = F_i$ for i ∈ {1, 2}. In that case, as we will prove in the next section (Theorem 10.1), we have for 100% of elliptic curves $E \in F_i$ that $|S'(E)| = d^i$. This completes the proof of Theorem 1.2, and thus of Theorem 1.1.


10 The marked points in F1 and F2 are independent

In this section, we prove that the marked points in the families F_1 and F_2 are non-torsion and “independent” asymptotically 100% of the time. More precisely, we prove the following theorem.

Theorem 10.1. Let j ∈ {1, 2} and let d be any positive integer. For an elliptic curve E ∈ F_j, let S(E) denote the d-Selmer group of E and S′(E) the subgroup of S(E) generated by the images of the marked points on E. Then, when elliptic curves E ∈ F_j are ordered by height, 100% of these curves E have the property that |S′(E)| = d^j.

Theorem 10.1 implies that, for 100% of the curves E in F_j, the subgroup in E(Q) generated by the marked points has rank j. Thus, when elliptic curves E ∈ F_j are ordered by height, 100% of these curves E have rank at least j.

Proof. Let j ∈ {1, 2}. Let n_1 = 3 and k_1 = 4, and n_2 = 4 and k_2 = 6. Then the total number of elliptic curves E ∈ F_j with height less than X is ∼ c_j X^{n_j/k_j} for a positive constant c_j. This is seen simply by counting lattice points in a certain bounded region in R^{n_j}.
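The exponents n_j/k_j can be recovered from the definition of the height in the introduction: a coefficient of weight i satisfies |a_i| < X^{i/12}, so it contributes i/12 to the exponent. The sketch below is an illustration under that reading, with hypothetical function names. It ignores the discriminant condition, which removes a negligible set, and for F_2 it only tracks the exponent (the constraint on a''_2 changes the constant c_2 but not the exponent).

```python
from fractions import Fraction


def growth_exponent(weights):
    """Sum of i/12 over the free coefficients a_i of weight i."""
    return sum((Fraction(i, 12) for i in weights), Fraction(0))


# Free coefficients of each family, listed by weight.
F1_WEIGHTS = [2, 3, 4]     # a_2, a_3, a_4
F2_WEIGHTS = [1, 2, 2, 3]  # a_1, a_2, a'_2, a_3 (a''_2 is determined by the sum-zero condition)


def box_count_F1(t):
    """Number of integer triples (a_2, a_3, a_4) with |a_i| < t**i, i.e. height < X = t**12."""
    return (2 * t**2 - 1) * (2 * t**3 - 1) * (2 * t**4 - 1)


if __name__ == "__main__":
    print(growth_exponent(F1_WEIGHTS), growth_exponent(F2_WEIGHTS))
    for t in (10, 100):
        # X^(3/4) = t**9, so this ratio should approach the box constant 8.
        print(t, box_count_F1(t) / t**9)
```

The ratio printed for F_1 approaches 8, the volume constant of the box before the discriminant condition is imposed.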

We first note that asymptotically 100% of the elliptic curves in F_j have trivial rational torsion. The generic elliptic curve for each F_j (over Q({a_i})) has no nonzero rational ℓ-torsion for primes ℓ ≤ 7. By the Hilbert irreducibility theorem, the same holds for asymptotically 100% of the curves in F_j, and Mazur’s theorem implies that asymptotically 100% of the curves in F_j have trivial rational torsion. This immediately implies that the marked points on 100% of the curves in F_j have infinite order.
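The reduction to primes ℓ ≤ 7 rests on Mazur's classification of rational torsion: the torsion subgroup of E(Q) is Z/N for N = 1, ..., 10 or 12, or Z/2 × Z/2N for N = 1, ..., 4. A short check (an illustrative sketch with hypothetical names, not taken from the paper) confirms that every nontrivial group on this list has order divisible only by primes at most 7, so it contains a point of prime order ℓ ≤ 7.

```python
def prime_factors(n):
    """Set of prime divisors of a positive integer n, by trial division."""
    factors = set()
    f = 2
    while f * f <= n:
        while n % f == 0:
            factors.add(f)
            n //= f
        f += 1
    if n > 1:
        factors.add(n)
    return factors


# Orders of the groups in Mazur's list: Z/N (N = 1..10, 12) and Z/2 x Z/2N (N = 1..4).
MAZUR_ORDERS = list(range(1, 11)) + [12] + [4 * n for n in range(1, 5)]


def torsion_primes():
    """All primes dividing the order of some group in Mazur's list."""
    primes = set()
    for order in MAZUR_ORDERS:
        primes |= prime_factors(order)
    return primes


if __name__ == "__main__":
    print(sorted(torsion_primes()))
```

Hence a curve with no rational ℓ-torsion for ℓ = 2, 3, 5, 7 has trivial rational torsion.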

We now show that in F_2, the two marked points P_1 and P_2 are independent asymptotically 100% of the time as well. The argument is similar to the argument in §7.8 proving that reducible elements in V(Q) are rare, namely reducing modulo p and showing that the desired property is rare enough modulo p. Here, we take dependence to mean that the two points generate a cyclic subgroup, and we restrict our attention to the set F_2^nt of curves in F_2 that have trivial rational torsion. Note that dependence of two points on E(Q) implies dependence after reducing E modulo any prime.

Let S^dep denote the set of curves E in F_2^nt where P_1 and P_2 are dependent, and let S_p^dep denote the set of curves E in F_2^nt where P_1 and P_2 are dependent modulo the prime p (i.e., dependent as points of E(F_p)). Then S^dep ⊂ ∩_p S_p^dep.

Let S_p^cyc denote the set of curves E in F_2 for which the group E(F_p) of points on the reduction modulo p is cyclic. Let T_p^cyc denote the set of (isomorphism classes of) elliptic curves E over F_p where E(F_p) is a cyclic group. Note that for a curve E ∈ S_p^cyc ∩ F_2^nt, we have E ∈ S_p^dep and the reduction of E modulo p lies in T_p^cyc. For a fixed prime p, Vladut [Vla99] shows that the probability of E(F_p) being cyclic for E an elliptic curve over F_p is

\[
\mu_p(T_p^{\mathrm{cyc}}) := \prod_{\text{primes } \ell \,\mid\, p-1}\left(1 - \frac{1}{\ell(\ell^2-1)}\right) + O\bigl(p^{-1/2+\varepsilon}\bigr),
\]

which is ≤ 5/6 + O(p^{−1/2+ε}) for p ≠ 2, since the factor for ℓ = 2 is 1 − 1/6 = 5/6. Since the constant in the error term does not depend on p, for any δ > 0 we have µ_p(T_p^cyc) ≤ 5/6 + δ for all sufficiently large p.
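The bound 5/6 on the main term can be confirmed exactly for small primes. The following sketch evaluates only the product in the displayed formula (the error term O(p^{−1/2+ε}) is not modelled), using exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction


def prime_divisors(n):
    """Distinct prime divisors of a positive integer n, by trial division."""
    divisors = []
    f = 2
    while f * f <= n:
        if n % f == 0:
            divisors.append(f)
            while n % f == 0:
                n //= f
        f += 1
    if n > 1:
        divisors.append(n)
    return divisors


def cyclic_main_term(p):
    """Product over primes l dividing p - 1 of (1 - 1/(l(l^2 - 1)))."""
    result = Fraction(1)
    for l in prime_divisors(p - 1):
        result *= 1 - Fraction(1, l * (l * l - 1))
    return result


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13, 31):
        term = cyclic_main_term(p)
        print(p, term, term <= Fraction(5, 6))
```

For every odd prime p the factor at ℓ = 2 is present and equals 5/6, and all other factors are less than 1, which is why the product never exceeds 5/6.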

In order to use the above bound on µ_p(T_p^cyc) to compute the probability that a curve E ∈ F_2 lies in S_p^cyc, recall that the set of curves in F_2 modulo p is the same as elliptic curves modulo p, with multiplicity given by the number of pairs of distinct non-identity points on the curve. Using


(a weak version of) the Weil bound, we thus find that the density of S_p^cyc in F_2 is bounded above by

\[
\frac{(5/6+\delta)\,(p+O(\sqrt{p}))^2}{(5/6+\delta)\,(p+O(\sqrt{p}))^2 + (1/6-\delta)\,(p+O(\sqrt{p}))^2}
\]

for any δ > 0. For sufficiently large p, we thus find that this density of S_p^cyc in F_2 is at most 5/6 + δ′

for any δ′ > 0.

If E ∈ S_p^dep but not in S_p^cyc, then the group E(F_p) is the product of two nontrivial cyclic groups, and the probability that the two marked points are dependent in E(F_p) is bounded above by 1/2. Thus, combining this with the density of S_p^cyc, for sufficiently large p we find that the density of S_p^dep in F_2 is at most

\[
(5/6+\delta') + \tfrac{1}{2}\,(1/6-\delta') = 11/12 + \delta'/2
\]

for any δ′ > 0.

By the Chinese remainder theorem, for any finite set Σ of sufficiently large primes p, the density of curves in F_2 where the two marked points are dependent modulo p for all p ∈ Σ is at most ∏_{p∈Σ} (11/12 + δ′/2). Since each factor is less than 1 once δ′ < 1/6, this product tends to 0 as the size of Σ approaches infinity, which yields the result.
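The arithmetic in this last step is easy to reproduce. The sketch below (hypothetical names, exact rationals) recomputes the per-prime bound 11/12 + δ′/2 and counts how many primes in Σ are needed before the product of the bounds drops below a given threshold. It illustrates only the decay of the product, not the density estimates themselves.

```python
from fractions import Fraction


def dependent_density_bound(delta):
    """Per-prime bound (5/6 + delta) + (1/2)(1/6 - delta) on the density of S_p^dep."""
    return (Fraction(5, 6) + delta) + Fraction(1, 2) * (Fraction(1, 6) - delta)


def primes_needed(delta, eps):
    """Smallest k with bound**k < eps, i.e. the size of Sigma that forces density below eps."""
    bound = dependent_density_bound(delta)
    if bound >= 1:
        raise ValueError("need delta < 1/6 so that each factor is less than 1")
    count, density = 0, Fraction(1)
    while density >= eps:
        density *= bound
        count += 1
    return count


if __name__ == "__main__":
    print(dependent_density_bound(Fraction(1, 100)))
    for eps in (Fraction(1, 2), Fraction(1, 100), Fraction(1, 10**6)):
        print(eps, primes_needed(Fraction(1, 100), eps))
```

Because the required number of primes is finite for every threshold, the density of curves in F_2 with dependent marked points is smaller than any positive number, hence 0.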

Acknowledgments

We are grateful to Bhargav Bhatt, John Cremona, Aise Johan de Jong, Tom Fisher, Benedict Gross, Catherine O’Neil, Arul Shankar, Christopher Skinner, and Xiaoheng Wang for helpful conversations. The first author was partially supported by a Simons Investigator Grant and NSF Grant DMS-1001828. The second author was supported by NSF Grant DMS-1701437.

References

[BH16] Manjul Bhargava and Wei Ho, Coregular spaces and genus one curves, Cambridge J. Math. 4 (2016), no. 1, 1–119.

[Bha04] Manjul Bhargava, Higher composition laws. I. A new view on Gauss composition, and quadratic generalizations, Ann. of Math. (2) 159 (2004), no. 1, 217–250.

[Bha10] Manjul Bhargava, The density of discriminants of quintic rings and fields, Ann. of Math. (2) 172 (2010), no. 3, 1559–1591.

[Bha13] Manjul Bhargava, The geometric sieve and squarefree values of polynomial discriminants and other invariant polynomials, 2013.

[BHC62] Armand Borel and Harish-Chandra, Arithmetic subgroups of algebraic groups, Ann. of Math. 75 (1962), 485–535.

[BK77] Armand Brumer and Kenneth Kramer, The rank of elliptic curves, Duke Math. J. 44 (1977), no. 4, 715–743.

[BS15a] Manjul Bhargava and Arul Shankar, Binary quartic forms having bounded invariants, and the boundedness of the average rank of elliptic curves, Ann. of Math. (2) 181 (2015), no. 1, 191–242.

[BS15b] Manjul Bhargava and Arul Shankar, Ternary cubic forms having bounded invariants, and the existence of a positive proportion of elliptic curves having rank 0, Ann. of Math. (2) 181 (2015), no. 2, 587–621.


[BSW16] Manjul Bhargava, Arul Shankar, and Xiaoheng Wang, Squarefree values of polynomial discriminants I, 2016, https://arxiv.org/abs/1611.09806.

[Cas62] J. W. S. Cassels, Arithmetic on curves of genus 1. IV. Proof of the Hauptvermutung, J. Reine Angew. Math. 211 (1962), 95–112.

[CFS10] John E. Cremona, Tom A. Fisher, and Michael Stoll, Minimisation and reduction of 2-, 3- and 4-coverings of elliptic curves, Algebra Number Theory 4 (2010), no. 6, 763–820.

[Dav51] H. Davenport, On a principle of Lipschitz, J. London Math. Soc. 26 (1951), 179–183.

[Eke91] Torsten Ekedahl, An infinite version of the Chinese remainder theorem, Comment. Math. Univ. St. Paul. 40 (1991), no. 1, 53–59.

[Fis07] Tom Fisher, A new approach to minimising binary quartics and ternary cubics, Math. Res. Lett. 14 (2007), no. 4, 597–613.

[Fis13] Tom Fisher, Minimisation and reduction of 5-coverings of elliptic curves, Algebra Number Theory 7 (2013), no. 5, 1179–1205. MR 3101076

[FR17] Tom Fisher and Lazar Radicevic, Some minimisation algorithms in arithmetic invariant theory, 2017, https://arxiv.org/abs/1703.01940.

[KK17] Daniel Kane and Zev Klagsbrun, On the joint distribution of Selφ(E/Q) and Selφ(E′/Q) in quadratic twist families, 2017, https://arxiv.org/abs/1702.02687.

[KLO14] Zev Klagsbrun and Robert J. Lemke Oliver, The distribution of the Tamagawa ratio in the family of elliptic curves with a two-torsion point, Research in the Mathematical Sciences 1 (2014), no. 1, 15.

[Poo03] Bjorn Poonen, Squarefree values of multivariable polynomials, Duke Math. J. 118 (2003), no. 2, 353–373.

[Sil92] Joseph H. Silverman, The arithmetic of elliptic curves, Graduate Texts in Mathematics, vol. 106, Springer-Verlag, New York, 1992, Corrected reprint of the 1986 original.

[Vla99] S. G. Vladut, Cyclicity statistics for elliptic curves over finite fields, Finite Fields Appl. 5 (1999), no. 1, 13–25.
