9
Excess variants in AFF2 detected by massively parallel sequencing of males with autism spectrum disorder Kajari Mondal, Dhanya Ramachandran, Viren C. Patel, Katie R. Hagen, Promita Bose, David J. Cutler and Michael E. Zwick * Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA Received February 6, 2012; Revised June 13, 2012; Accepted June 29, 2012 Autism spectrum disorder (ASD) is a heterogeneous disorder with substantial heritability, most of which is unexplained. ASD has a population prevalence of one percent and affects four times as many males as females. Patients with fragile X E (FRAXE) intellectual disability, which is caused by a silencing of the X-linked gene AFF2, display a number of ASD-like phenotypes. Duplications and deletions at the AFF2 locus have also been reported in cases with moderate intellectual disability and ASD. We hypothesized that other rare X-linked sequence variants at the AFF2 locus might contribute to ASD. We sequenced the AFF2 genomic region in 202 male ASD probands and found that 2.5% of males sequenced had missense mutations at highly conserved evolutionary sites. When compared with the frequency of missense mutations in 5545 X chromosomes from unaffected controls, we saw a statistically significant enrichment in patients with ASD (OR: 4.9; P < 0.014). In addition, we identified rare AFF2 3 UTR variants at conserved sites which alter gene expression in a luciferase assay. These data suggest that rare variation in AFF2 may be a previously unrecognized ASD susceptibility locus and may help explain some of the male excess of ASD. INTRODUCTION The recent surge of advances in second-generation sequencing technologies and better methods of targeted enrichment have made the detection of a more complete spectrum of genetic variation increasingly feasible (1,2). There is keen interest in applying these approaches to uncover the genetic basis of polygenic complex human disease, including autism (OMIM 209850), a childhood-onset disorder characterized by impaired social interactions, abnormal verbal communication, restricted interests and repetitive behaviors. Autism has an estimated prevalence of 1% (3,4), and one of its most striking epidemio- logical features is a 4-fold excess of affected males. Autism, or the broader autism spectrum disorder (ASD) phenotype, is an example of a highly heterogenous, multifactor- ial disorder with substantial heritability (5 8, reviewed in 9). To date, more than 100 different genes and genomic regions have been linked to this complex trait (reviewed in 10,11). Despite this, most of the genetic risk for ASD remains unexplained. Recent studies exploring ASD genetics generally adopt one of two study designs. The first employs genome-wide association studies, which have identified a few loci of interest, but largely failed to replicate findings between studies (12 14). A meta-analysis of these studies, with over 2500 study subjects, reveals it is extremely unlikely that there is any common variant influencing autism susceptibility with an odds ratio of .1.5 (15). The second design focuses on large but very rare (fre- quency usually less than 1 in a thousand in the general popula- tion) de novo and inherited copy number variants (CNVs). Numerous studies have now shown convincingly that this class of rare variation makes a significant contribution to autism susceptibility (16 27), explaining up to 15% of all ASD cases. Unfortunately, these studies point to a highly heter- ogenous allelic architecture, as no single risk variant is found in more than 1% of surveyed cases. Overall, although genetic studies have uncovered a large number of candidate loci, much ASD heritability remains unexplained. Like other disorders with a male preponderance, there is evi- dence that mutations at X-linked loci in hemizygous males con- tribute to the observed ASD sex bias (24,28 30). Here, we * To whom correspondence should be addressed at: Department of Human Genetics, Emory University, Whitehead Biomedical Research Building, Suite 301, Atlanta, GA 30322, USA. Tel: +1 4047279924; Fax: +1 4047273949; Email: [email protected] # The Author 2012. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected] Human Molecular Genetics, 2012 1–9 doi:10.1093/hmg/dds267 HMG Advance Access published July 10, 2012 at Emory University on July 11, 2012 http://hmg.oxfordjournals.org/ Downloaded from

Excess variants in AFF2 detected by massively parallel sequencing of males with autism spectrum disorder

Embed Size (px)

Citation preview

Excess variants in AFF2 detected by massivelyparallel sequencing of males with autism spectrumdisorderKajari Mondal, Dhanya Ramachandran, Viren C. Patel, Katie R. Hagen, Promita Bose,

David J. Cutler and Michael E. Zwick!

Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA

Received February 6, 2012; Revised June 13, 2012; Accepted June 29, 2012

Autism spectrum disorder (ASD) is a heterogeneous disorder with substantial heritability, most of which isunexplained. ASD has a population prevalence of one percent and affects four times as many males asfemales. Patients with fragile X E (FRAXE) intellectual disability, which is caused by a silencing of theX-linked gene AFF2, display a number of ASD-like phenotypes. Duplications and deletions at the AFF2locus have also been reported in cases with moderate intellectual disability and ASD. We hypothesizedthat other rare X-linked sequence variants at the AFF2 locus might contribute to ASD. We sequenced theAFF2 genomic region in 202 male ASD probands and found that 2.5% of males sequenced had missensemutations at highly conserved evolutionary sites. When compared with the frequency of missense mutationsin 5545 X chromosomes from unaffected controls, we saw a statistically significant enrichment in patientswith ASD (OR: 4.9; P < 0.014). In addition, we identified rare AFF2 3" UTR variants at conserved siteswhich alter gene expression in a luciferase assay. These data suggest that rare variation in AFF2 may be apreviously unrecognized ASD susceptibility locus and may help explain some of the male excess of ASD.

INTRODUCTION

The recent surge of advances in second-generation sequencingtechnologies and better methods of targeted enrichment havemade the detection of a more complete spectrum of geneticvariation increasingly feasible (1,2). There is keen interest inapplying these approaches to uncover the genetic basis ofpolygenic complex human disease, including autism (OMIM209850), a childhood-onset disorder characterized by impairedsocial interactions, abnormal verbal communication, restrictedinterests and repetitive behaviors. Autism has an estimatedprevalence of 1% (3,4), and one of its most striking epidemio-logical features is a 4-fold excess of affected males.

Autism, or the broader autism spectrum disorder (ASD)phenotype, is an example of a highly heterogenous, multifactor-ial disorder with substantial heritability (5–8, reviewed in 9). Todate, more than 100 different genes and genomic regions havebeen linked to this complex trait (reviewed in 10,11). Despitethis, most of the genetic risk for ASD remains unexplained.Recent studies exploring ASD genetics generally adopt one of

two study designs. The first employs genome-wide associationstudies, which have identified a few loci of interest, butlargely failed to replicate findings between studies (12–14). Ameta-analysis of these studies, with over 2500 study subjects,reveals it is extremely unlikely that there is any commonvariant influencing autism susceptibility with an odds ratio of.1.5 (15). The second design focuses on large but very rare (fre-quency usually less than 1 in a thousand in the general popula-tion) de novo and inherited copy number variants (CNVs).Numerous studies have now shown convincingly that thisclass of rare variation makes a significant contribution toautism susceptibility (16–27), explaining up to 15% of allASD cases. Unfortunately, these studies point to a highly heter-ogenous allelic architecture, as no single risk variant is found inmore than 1% of surveyed cases. Overall, although geneticstudies have uncovered a large number of candidate loci,much ASD heritability remains unexplained.

Like other disorders with a male preponderance, there is evi-dence that mutations at X-linked loci in hemizygous males con-tribute to the observed ASD sex bias (24,28–30). Here, we

!To whom correspondence should be addressed at: Department of Human Genetics, Emory University, Whitehead Biomedical Research Building,Suite 301, Atlanta, GA 30322, USA. Tel: +1 4047279924; Fax: +1 4047273949; Email: [email protected]

# The Author 2012. Published by Oxford University Press. All rights reserved.For Permissions, please email: [email protected]

Human Molecular Genetics, 2012 1–9doi:10.1093/hmg/dds267

HMG Advance Access published July 10, 2012 at Em

ory University on July 11, 2012

http://hmg.oxfordjournals.org/

Dow

nloaded from

further explore the hypothesis that rare mutations in X-linkedgenes contribute to ASD by performing targeted sequencingof the AFF2 (formerly FMR2) locus in two large autismcohorts. Our study is motivated by findings from FRAXE(OMIM #309548), a rare form of mild-to-moderate intellectualdisability (ID) affecting 1 in 50 000 newborn males. Hallmarksof the FRAXE phenotype include learning difficulties, commu-nication deficits, attention problems, hyperactivity and autisticbehavior. The initial discovery and mapping of the FRAXElocus identified a CGG expansion upstream of AFF2 thatsilences the gene resulting in a complete loss of function. Thissimultaneously explained the ID and the presence of a chromo-somal fragile site (31–33). Large deletions that include bothAFF2 and the adjacent FMR1 gene result in a severe ID (34–36), whereas deletions of the AFF2 locus only present withautism and mild ID (37). A duplication of the AFF2 locus hasalso been reported in a young male with ID (38). Thus, sincecomplete loss of AFF2 leads to a mild non-syndromic form ofID often presenting with autistic features, we hypothesizedthat hypomorphic alleles with reduced function might act asautism susceptibility loci. If correct, our hypothesis predictsthat: (i) males with autism will have an excess of rare variants,(ii) these rare variants will be found at evolutionarily conservedsites, and (iii) these variants will influence gene function.Because males are hemizygous for AFF2, whereas females arediploid, rare variants at this locus could help explain the prepon-derance of males with ASD, if those damaging rare variants arelargely recessive in nature.

Here, we report the results of a targeted second-generationsequencing study of the AFF2 locus in male ASD patientsascertained using different sampling designs from theAutism Genetic Resource Exchange (AGRE) and the SimonsSimplex Collection (SSC). As predicted, we see an over-representation of rare variants, and missense mutations in par-ticular, in ASD cases, with a more than 4-fold increase in ASDrisk among carriers of rare AFF2 protein-coding changes. Weobserve both inherited and de novo risk alleles in our sample.Furthermore, we also identify two rare 3"UTR variants thatappear to alter the expression of a reporter in a luciferaseassay in a tissue specific fashion. Taken together, these datashow that AFF2 is a susceptibility locus conferring increasedrisk for ASD in males. These data also suggest that rarecoding sequence variations contribute to susceptibility forcomplex traits like autism, and such variations may accountfor some of the high heritability for these disorders.

RESULTS

We sequenced the AFF2 locus in a sample of 202 males with adiagnosis of autism. A total of 127 of the cases were obtainedfrom the multiplex AGRE, while the remaining 75 cases werefrom the SSC. We identified a total of 286 sites of variation,with 269 single nucleotide variants (SNVs) and 17 insertionsor deletions (indels). Overall levels of variation were similarbetween the two data sets [Qw per site (39); AGRE 6.0 #1024, SSC 6.7 # 1024], with an excess of rare variants as evi-denced by a negative value for the Tajima’s D test statistics forboth sets of samples [(40); AGRE 1.46, SSC 1.41]. For theSNVs, a total of 128, or 48%, had not been reported before. As

summarized in Figure 1, almost all common variation (.5% fre-quency in our population) is contained in dbSNP, whereas mostrare variants (,5%) have not been cataloged in dbSNP.

Our study is focused on rare variation, so we did not follow-up the 141 previously cataloged variants. Functional annota-tion of the remaining 128 SNVs revealed that 16 werelocated at sites with elevated evolutionary conservation (Sup-plementary Material, Table S1). Each of these variants wasobserved in a single case. To arrive at a better estimate oftheir population frequency, we genotyped 13 of the variantsin a collection of 1400 unaffected male controls, obtainedfrom the NIMH Human Genetics Initiative (SupplementaryMaterial, Table S1). Only one intronic variant had a frequency.0.01, suggesting that the variants we identified are very rarein the general population.

We next restricted our attention to missense mutations. Inour cases, there were 5 (2.5% of total cases sequenced)singleton non-synonymous variants. We estimated that thepopulation frequency for two of these variants was $0.001,with the other three having frequencies below 0.0001 (Sup-plementary Material, Table S1). We further noted thatthese mutations were all at highly conserved evolutionarysites (PhyloP .1, Table 1). To ask whether this distribution

Figure 1. Summary of SNV and indel variation discovered at the AFF2 locusin males with ASD. The frequency of SNVs and indels (minor alleles) in casesis plotted against their level of evolutionary conservation. Most common vari-ation has already been discovered and exists in public databases (blue; circlesand diamonds). Most of the rare variation at AFF2 was discovered in our studyand not contained in public databases (red; circles and diamonds).

Table 1. Rare non-synonymous variants observed at the AFF2 locus

Sample collection Genomeposition(hg19)

Aminoacidchange

PhyloP Inherited/ denovo

(SSC, 11251.p1) 148037554 S660T 2.33 Inherited(AGRE, AU1016302) 148037715 D714N 2.37 Inherited(SSC, 11554.p1) 148038084 R837C 2.48 Inherited(AGRE, AU1284302) 148044334 R927H 2.62 Inherited(SSC, 11012.p1) 148048494 I1030L 1.86 De novo

2 Human Molecular Genetics, 2012

at Emory U

niversity on July 11, 2012http://hm

g.oxfordjournals.org/D

ownloaded from

was unusual, we compared our case data to sequence datafrom 5545 X chromosomes of European American controls,obtained from the Exome Variant Server from the NHLBIExome Sequencing Project. As shown in Table 2, for var-iants with PhyloP scores .1.0 (hence the top 10% of con-served sites in the human genome), there is a 4.2-foldexcess of mutations in ASD cases (95% CI: 1.28–11.09,P-value 0.01). If we further restrict our analysis to siteswith PhyloP scores .2, there is a 4.9-fold excess of rare var-iants in ASD cases (95% CI: 1.21–14.37, P-value 0.014,Table 2).

One potential explanation for our finding is that thesequenced cases are poorly matched with the controls, andthe cases are systemically more variable at many or all Xchromosome loci. To test this hypothesis, we sequenced asubset of our cases (75 SSC males) for virtually all codingloci on the X (1006 loci total). For each locus, we performedthe same analysis as we did at AFF2, asking whether the caseswere enriched for mutations with PhyloP .1, or .2. Our ana-lysis rejects the hypothesis that our cases are systematicallymore variable than the controls (Supplementary Material,Fig. S1). At the vast majority of loci, controls have the sameor more variation than cases. Furthermore, AFF2 appears tobe especially enriched for variation in our cases. Of the1006 loci, it exhibits the 77th greatest enrichment for variationrelative to controls for sites with PhyloP .1 (P , 0.08) andthe 21st most enrichment for variants with a PhyloP .2

(P , 0.02). Because the 75 SSC males were a subset of thetotal samples used in this analysis, we also asked whetherthis subset shows evidence of population stratification relativeto the whole case sample. To do so, we obtained wholegenome association X chromosome genotype data for theAGRE case samples, and assessed stratification for all of ourcase samples (41–43). We observed no evidence of systematicstratification between our SSC cases and AGRE cases (Sup-plementary Material, Fig. S2), suggesting that we can rejectthe hypothesis that a simple model of stratification canaccount for our findings.

To confirm the patterns of segregation of putative AFF2susceptibility variants, we genotyped the mother and availablesiblings for each of the five missense mutations to determinewhether they showed segregation as expected (Fig. 2). Fortwo of the missense mutations (R927C, D714N), we observedsegregation from the mother to both affected sons. For anothermissense mutation (S660T), we observed transmission of therisk allele from the mother to the affected son, and transmis-sion of the reference allele to the unaffected son (Fig. 2). Mis-sense mutation R837C was transmitted from mother toaffected son. For the AGRE cases, these patterns of transmis-sion were expected because of the manner in which weselected our cases for sequencing. Interestingly, missense mu-tation I1030L in a SSC case arose as a de novo mutation in themale patient with autism, providing further evidence thatAFF2 is an autism susceptibility locus.

Table 2. Odds ratio (OR) for missense mutations at AFF2 for males with ASD when compared with unaffected controls

Study PhyloP .1 PhyloP .2Missense variants(PhyloP .1)

Odds ratio versuscontrols (95% CI)

P-value Missense variants(PhyloP .2)

Odds ratio versuscontrols (95% CI)

P-value

Present study (n ! 202) 5 4.2 (1.28–11.09) 0.010 4 4.9 (1.21–14.37) 0.014Control X chromosomes (n ! 5545) 33 Reference N/A 23 Reference N/A

Figure 2. Segregation pattern of the non-synonymous AFF2 variants in families from AGRE and the SSC ASD patient collections. Note that the I1030L variantis a de novo variant in the male with ASD.

Human Molecular Genetics, 2012 3

at Emory U

niversity on July 11, 2012http://hm

g.oxfordjournals.org/D

ownloaded from

Given that we observed a statistically significant excess ofrare missense mutations in AFF2 in patients with autism, wenext wondered whether, among the collection of rare variantswe discovered, non-coding variants were also autism suscepti-bility variants. One way non-coding alleles could exert influ-ence is by changing the expression of AFF2. To test thishypothesis, we performed a luciferase assay using constructscontaining two of the three 3"UTR variants and a third withthe human genome reference sequence as a comparison intwo different cell lines. No other sequence variants werepresent in the 3"UTR. As shown in Figure 3, the 3"UTRvariant (ChrX:148076068[C.T]) showed a statistically signifi-cant reduction in gene expression in the luciferase assay per-formed in Human HEK293T cells (P , 0.02), while the3"UTR variant (ChrX:148075200[T.C]) did not show a statis-tically significant reduction in expression (P , 0.31). We repli-cated these experiments using Mouse Neuro 2A cells and both3"UTR variants showed a strong increase in expression relativeto a construct containing the human genome reference sequence(ChrX:148076068[C.T], P , 0.03; ChrX:148075200[T.C],P , 0.02). Together, these data suggest these rare 3"UTR var-iants could act to alter the expression of AFF2, perhaps in atissue-specific manner, and provide further support for therole of AFF2 as an autism susceptibility locus.

DISCUSSION

Uncovering the genetic basis of polygenic disorders remains achallenge for research. The idea that common genetic vari-ation contributes significantly to common diseases has beena leading model in human genetics for the past 15 years

(44–46); evaluating this model required the development ofhigh-throughput genotyping platforms and a catalog ofcommon human genetic variation as seen in the HapMapproject (47,48). The ‘common-disease-common-variant’model was bolstered by technology development, sincedirect sequencing was not a viable strategy. Therefore, system-atic genotyping of common variants was perceived as the bestway to begin to characterize the allelic architecture of complexhuman traits (49). In the time elapsed since, genome-wide as-sociation studies have shown that common variants with largeeffects are unlikely to exist in the human population for manydisorders, although a large number of loci with alleles withmuch smaller ORs (,1.2) remains plausible (reviewed in 50).

Following Haldane in the 1920s, population biologists havelong been aware that deleterious alleles of large effect will bemaintained only at very low frequencies in the general popu-lation (51). Most quantitative traits, including human diseases,show substantial heritability in most populations (52). This ob-servation suggests a possible paradox arising from the beliefthat these traits are influenced by stabilizing selection for anoptimum, or simple directional selection to remove diseasealleles, both of which should act to eliminate genetic variation(52,53). One model to account for the maintenance of vari-ation supposes that variation is maintained by a very largenumber of loci, each with rare alleles of exceedingly smalleffect (too small to ever be individually detectable in an ex-periment), and which are individually at mutation-selectionbalance frequency (54). A later analysis, in contrast, arguedthat even if mutation-selection balance maintains much vari-ation, then there are likely to be rare alleles with appreciableeffects (55), and/or alleles at intermediate frequency createdby balancing selection due to their pleiotropic effects on

Figure 3. Histogram showing the level of expression for 3"UTR variants in luciferase assays. Expression levels are normalized to a construct with the humangenome reference sequence. The 3"UTR variant (ChrX:148076068[C.T]) showed statistically significant reduction of gene expression in the luciferase assayperformed in human HEK293T cells (P , 0.02), while the 3"UTR variant (ChrX:148075200[T.C]) did not (P , 0.31). Repeating these experiments usingmouse Neuro 2A cells resulted in both 3"UTR variants showing a strong increase of expression relative to a construct containing the human genome referencesequence (ChrX:148076068[C.T], P , 0.03; ChrX:148075200[T.C], P , 0.02), suggesting that these variants influence gene expression in a tissue specificmanner.

4 Human Molecular Genetics, 2012

at Emory U

niversity on July 11, 2012http://hm

g.oxfordjournals.org/D

ownloaded from

multiple phenotypes (52). In support of the former view, copynumber variation studies have identified variants with a largeeffect size, having odds ratios often greater than 5 (16–27);however, these variants are quite rare, often occurring muchless often than 1 in 1000 in the general population, a frequencygenerally consistent with a large effect locus at mutation selec-tion balance. It has been reported that in aggregate, largeCNVs account for 15% of the cases of autism. Since WGAstudies generally preclude the existence of rampant intermedi-ate frequency loci of large effect, Turelli’s theory predicts theexistence of loci with multiple rare alleles, and intermediateORs should make a significant contribution to polygenic disor-ders such as autism. Our study of AFF2 has identified one suchlocus.

In the 202 autistic males sequenced, we detected five rarereplacement variants: S660T, D714N, R837C, R927H andI1030L. These variants appear in $2.5% of boys with a diag-nosis of autism. One variant (I1030L) arose as a de novo event.We have also identified rare 3"UTR sequence variants thatappear to alter levels of gene expression in a luciferaseassay in a tissue-specific manner. AFF2 missense mutationshave also been seen in other males with neurodevelopmentaldisorders (56,57). The observation of deletions of just theAFF2 locus in a patient with autism and mild ID (37) and aduplication of the AFF2 locus in a young male with ID (38)provide further independent support for our findings.

We do not claim that the variants we have identified aremonogenic causes of autism. Instead, our data support theidea that AFF2 harbors susceptibility alleles that makemodest contributions to the risk of ASD. Furthermore,because AFF2 is found on the X chromosome, recessiveacting effects would be expected to preferentially influencethe risk of ASD in males. Even so, the size of the effectswe observed could be overestimated for at least two reasons.First, the observed excess of rare variation in the humangenome might make statistical significance estimates morechallenging than modeled here (58). Second, our sample sizerendered this study underpowered for detecting very weakeffects. Therefore, there is the real possibility that the trueOR of AFF2 mutations might be even smaller than we haveestimated due to ‘Winner’s Curse’ (59). Finally, we note thatwe did not perform a genome-wide study, and had we doneso with only our sample size, the magnitude of the effect esti-mated here would not have survived a multiple test correction.Detecting effect sizes this small would require vastly largernumbers of samples for a whole genome study. While recentexome sequencing studies identified de novo variants thatappear enriched in patients with ASD (60–63), it is worthnoting that the variant we observed could have arisen bychance. Replication of our finding regarding AFF2 in a largercollection of individuals with ASD could help resolve anumber of these potential issues and provide further supportfor our findings.

Much remains to be discovered concerning the structure andfunction of the AFF2 protein. We know AFF2 is a large genewith 21 exons and 6 annotated isoforms with alternative spli-cing among exons 2, 3, 5 and 7 (31–33). The longest of theAFF2 isoforms is composed of 1311 amino acids and contains2 nuclear localization signal sequences (33). AFF2 belongs toa gene family that includes AF4, LAF4 and AF5q31. The

AFF2 gene has a common ancestor in Drosophila melanoga-ster, the Lilliputian. Inactivation of lilli generates a fly ofreduced size. By analogy with AF4, AFF2 has been considereda putative transcription factor (64), with the N-terminaldomain of the protein (amino acids 1–541) known to havetransactivation activity (65). Four of our mutations (S660T,D714N, R837C,R927H) fall in the C-terminal domain of theprotein (amino acids 633–966) shown to be responsible for lo-calization of the protein in nuclear speckles (64); none of ourmutations fell into putative RNA-binding domains betweenamino acids 787–815 and 890–917 (64). The careful func-tional characterization of these and other alleles of AFF2should help elucidate novel pathways and candidates that,when disrupted, contribute to autism susceptibility.

MATERIALS AND METHODS

Selection of samples

We selected 127 males from the Autism Genetic Resource Ex-change (AGRE) multiplex collection and 75 males from theSimons Foundation Autism Research Initiative (SFARI)Simplex Collection, New York, NY, USA (SSC) for targetDNA amplification and DNA sequencing. From the AGREcollection, we chose multiplex families with two or moremale affected sib-pairs who shared .99% of 76 genotypedsingle nucleotide polymorphisms (SNPs) in the AFF2genomic region (66). One male was randomly chosen if bothaffected siblings were equally affected; otherwise, the malewith autism was chosen over those boys with a diagnosis ofnot quite autism or broad spectrum. From the SSC collection,we chose only those boys who were described as autistic andnot reported to have any other syndromes. From the SSC col-lection, we chose 75 male children from different familieswith a diagnosis of ASD (67). We required subjects to haveat least one unaffected sibling with no history of ASD orID based on medical history and/or pedigree evaluation.Parents showed no signs of a broader autism phenotype.Prior to processing, DNA samples were quantified by measur-ing OD260/280 using a NanoDrop instrument. Following quan-tification, 100 ng of DNA from each sample were run on a0.8% agarose gel to verify the integrity of the genomic DNA.

Long PCR-based amplification of AFF2 genomic regionfrom AGRE samples

We prepared target DNA for sequencing the AGRE samplesby performing long PCR (LPCR) amplification of the AFF2genomic region. Long-range PCR primers were designedusing EmPrimer (http://primer.genetics.emory.edu) using thefollowing selection criteria: length, 29–32 bases; GC percent-age, 45–60; and Tm $ 688C. The LPCR primers used are con-tained within Supplementary Material, Table S2. Five hundrednanograms of DNA from each of the AGRE samples were ali-quoted and PCR master mix [1# LA Taq buffer (TaKaRa BioInc., Otsu Shiga, JP), 250 mM dNTP (TaKaRa Bio Inc.),400 nM of both forward and reverse LPCR primers and0.1 U/ml of LA Taq (TaKaRa Bio Inc.) were added. If anamplicon had a high GC content, we used 1# GC Buffer(TaKaRa Bio Inc.) in place of 1# LA Taq buffer. PCR was

Human Molecular Genetics, 2012 5

at Emory U

niversity on July 11, 2012http://hm

g.oxfordjournals.org/D

ownloaded from

performed using the following parameters: denaturation at948C for 2 min, 29 cycles of 948C for 10 s, 688C for 1 minper kilobase of amplicon size and final extension at 688C for5 min. Amplification was confirmed by using 1% agarose96-well E-Gels (Invitrogen, Carlsbad, CA, USA). After purifi-cation, concentration of each of the amplicons was quantitatedusing PicoGreen dsDNA Quantitation reagent (Invitrogen) andthe Tecan Ultra Evolution plate reader. An equimolar amountof each amplicon was then pooled by sample. Pooled ampli-cons were then purified using the Invitrogen PureLink PCRPurification Kit (Invitrogen) with the HC buffer.LPCR-enriched, purified products were then sheared usingCovaris E210 (Duty cycle 10%, Intensity cycle 5, cycle/burst: 200, time: 180 s).

RainDance microdroplet-based PCR enrichment of ChrXexome from SSC samples

We prepared target DNA for sequencing the SSC samples byusing RainDance Technology’s (RDT) microdroplet-basedtechnology to enrich for the human X chromosome exomeas described previously (68). For 15 samples, a slightly modi-fied version of the manufacturer’s protocol was used where thepurified PCR product was run in 1% agarose gel, and thebands between 200 and 700 bp were extracted using aQiagen Gel purification kit. The ends of the DNA fragmentswere repaired at 258C for 30 min using the New EnglandBioLabs End Repair Module (New England BioLabs,Ipswich, MA, USA, catalogue #E6050L), followed by purifi-cation using the Qiagen MinElute columns (Qiagen, Valencia,CA, USA). The PCR fragments were then concatenated at208C for 30 min using NEB Quick Ligation Kit (NEB,Ipswich, MA, USA, catalogue #M2200L). The ligated pro-ducts were purified using the Qiagen MinElute columns andwere made up to 100 ml volume by adding Qiagen elutionbuffer, and then fragmented using a Covaris E210 machine(duty cycle: 10%, intensity cycle: 5, cycle/burst: 200,number of treatments: 4, total time per treatment: 45 s).

Illumina library preparation, sequencing and data analysis

The sheared fragments from the LPCR and RainDance librar-ies were purified using a Qiagen QIAquick PCR purificationcolumn and eluted in 32 ml of elution buffer (Qiagen). Thestandard Illumina Genome Analyzer (IGA) multiplex librarypreparation protocol was then used to prepare samples for se-quencing with only one modification: while purifying theadaptor-ligated products, we used Invitrogen E-Gel SizeSelect2% (Invitrogen, catalogue #G6610-02) in place of the gel puri-fication method suggested by Illumina. We confirmed sampleenrichment by running an Agilent BioAnalyzer 7500 DNAchip. A quantitative qPCR was performed to quantitate thelibrary using a KAPA Library Quantification Kit (Kapa Bio-systems, Woburn, MA, USA, catalogue # KK4824). EnrichedDNA was denatured and diluted to a concentration of 8 pM.Cluster generation and 70 bp or 100 bp single-end sequencingwas performed using standard IGA or Illumina HiSeq proto-cols. We performed multiplex single-end sequencing of 12samples per lane for the AGRE samples and 3–4 samplesper Illumina lane for the SSC samples.

After the completion of Illumina sequencing, the filteredreads were mapped and variants called using PEMapper(Cutler et al., submitted). The AFF2 reference sequenceused for the AGRE samples consists of 10 discontiguous frag-ments covering 84.8 kb. For the AGRE samples, 99% of thebases had more than 8# coverage. Median depth of coveragewas in the range of 388–1548, while the range of averagedepth of coverage was 459–1907 (Supplementary Material,Table S3). Since the entire human X chromosome exomewas sequenced in the SSC samples, we used a reference se-quence consisting of 5748 discontiguous fragments covering4.7 Mb. For the SSC samples, between 83 and 97% of the tar-geted reference bases had more than 8# coverage. Mediandepth of coverage was in the range of 20–607, while therange of average depth of coverage was 21–675 (Supplemen-tary Material, Table S3).

Variant annotation, validation and segregation analysis

All common and rare single base pair and indel sequence var-iants identified by PEMapper were annotated into functionalclasses using the sequence annotator program, SeqAnt (69).Selected highly conserved rare variants (phastCon score .0.65) were independently validated by Sanger sequencing.The primers for validation were designed using Primer 3 soft-ware (http://frodo.wi.mit.edu/). After validating replacementand UTR variants in cases, we next sequenced the mothersand affected or unaffected male siblings to verify the segrega-tion pattern of the variant. The de novo variant at I1030L wasconfirmed by Sanger sequencing using a DNA sampleobtained from blood for both the mother and male offspring.

Control genotyping

To aid in the estimation of allele frequencies in matched con-trols, we performed genotyping of male samples of Europeanancestry obtained from the NIMH Human Genetics Initiative(https://www.nimhgenetics.org/available_data/controls/).These control samples used were identical to those used in aprevious study (70). Genotyping was performed by theiPLEX Gold method (Sequenom, San Diego, CA, USA) perthe manufacturer’s instructions, using primers (see Supple-mentary Material, Table S4) designed with the SequenomAssay Design 3.1 software. A positive control was includedin every plate to confirm the sensitivity of the assay. The geno-typing for the replacement variants, S660T and I1030L, couldnot be performed, due to failure of the primer design assay byboth TaqMan and Sequenom.

Statistical analysis

All population genetic analyses were calculated using the pop-gen_fasta2.0.c code (Cutler, unpublished work) on the fastafiles as previously described (71). This code calculated theaverage number of pairwise differences and Watterson’s esti-mator of the population mutation rate (Qw per site) for theentire sequenced region and different annotated SNV function-al classes, while accounting for missing data (39). A point es-timate for Tajima’s D was determined for all the data anddifferent SNV functional classes (40). Control data from

6 Human Molecular Genetics, 2012

at Emory U

niversity on July 11, 2012http://hm

g.oxfordjournals.org/D

ownloaded from

5545 X chromosomes were obtained from exome sequencingdata of individuals of European ancestry publicly availableat the NHLBI Exome Variant Server (72). P-values andodds ratios were calculated with Fisher’s exact test in the Rstatistical package. PhyloP scores (phyloP46wayPlacental)were obtained using the February 2009, GRCh37(hg19) as-sembly from the UCSC Genome Browser (73). Affymetrix5.0 data generated by the BROAD Institute for AGREsamples were downloaded from AGRE website (19). A totalof 215 SNPs that spanned the X chromosome were successful-ly genotyped in the AGRE cases and sequenced in our SSCcases. We used PLINK to perform LD-based pruning (indep-pairwise option) resulting in a total of 179 SNPs that werein approximate linkage equilibrium (74,75). We then usedEIGENSOFT 4.2 and GCTA for principal component analysisto determine whether there was any evidence for populationstratification between our AGRE and SSC cases (41–43).

Luciferase assays

We performed luciferase assays in HEK293T and in Neuro-2acell lines to determine whether the AFF2 3"UTR variants weidentified reduced gene expression relative to a 3"UTR con-struct with the reference sequence.

Plasmid constructionFull-length 3"UTR was amplified for two rare UTR variantswith low (ChrX: 148075200 [T.C]) and high evolutionaryconservation (ChrX: 148076068[C.T]). The resulting pro-ducts were cloned into the pmirGLO Dual-Lucifease vector(Promega, Madison, WI, USA, Cat #E1330). As a control, afull-length 3"UTR that consisted of the human genome refer-ence sequence was amplified from an unaffected HapMapnormal control and also cloned into the same vector. Wewere unable to successfully generate a construct for one ofthe variants we discovered (ChrX:148075804[T.A]).Sanger sequencing confirmed that each variant construct con-tained the correct novel variant site.

Cell culture and transfectionsHEK293T and Neuro-2a cells were cultured at 378C with 5%CO2 in DMEM and RPMI media, respectively, with 10% fetalbovine serum. Twenty-four hours before transfection, 2 # 105

cells were plated in 1 ml of media in each well of 12-well cellculture dishes. Transfections were carried out in Opti-MEM(Invitrogen), using 8 ml of Lipofectamine 2000 (Invitrogen)and 0.5 mg of plasmid. Each plasmid was transfected intothree separate wells. Just before transfection, the old mediareplaced with fresh media (DMEM or RPMI with 10% fetalbovine serum). Twenty-four hours after transfection, cellswere harvested with 250 ml 1# Passive Lysis Buffer(Promega) by rocking at room temperature for 15 min.Lysates were cleared of cell debris by centrifugation at14000 rpm for 5 min at 48C.

Luciferase assayFrom each transfection, 20 ml of the lysate was added to aluminometer tube. To each luminometer tube, 100 ml ofLAR II was added. A manual-load luminometer was used tomeasure the luminescence over a 10 s period, following a

2 s premeasurement delay. The luminometer measurementwas repeated after the addition of 100 ml of Stop & Gloreagent. For each lysate, the firefly luciferase values weredivided by the Renilla luciferase values. The results of the in-dependent transfections were averaged and the standard devi-ation was calculated for each plasmid. We performed threeindependent transformations of each construct, with threereplications of each transformation. The raw data were nor-malized relative to the reference sequence construct and a two-tailed, unequal variance Student’s t-test was performed to de-termine whether the 3"UTR containing vectors reduced geneexpression relative to a construct with the human genome ref-erence sequence.

SUPPLEMENTARY MATERIAL

Supplementary Material is available at HMG online.

ACKNOWLEDGEMENTS

The Emory Custom Cloning Core facility generated constructsto our specifications for our expression analysis. We thankmembers of the Cutler and Zwick labs for comments on themanuscript, Jennifer Mulle and Stephen T. Warren for discus-sion, Cheryl T. Strauss for editing and the Emory-Georgia Re-search Alliance Genome Center (EGC), supported in part byPHS Grant UL1 RR025008 from the Clinical and TranslationalScience Award program, National Institutes of Health, NationalCenter for Research Resources, for performing the Illumina se-quencing runs. The ELLIPSE Emory High Performance Com-puting Cluster was used for this project. The authors would liketo thank the NHLBI GO Exome Sequencing Project and itsongoing studies, which produced and provided exome variantcalls for comparison: the Lung GO Sequencing Project(HL-102923), the WHI Sequencing Project (HL-102924), theBroad GO Sequencing Project (HL-102925), the Seattle GO Se-quencing Project (HL-102926), and the Heart GO SequencingProject (HL-103010).

Conflict of Interest statement. None declared.

FUNDING

This work was supported by the National Institutes of Health/National Institutes of Mental Health (NIH/NIMH) and GiftFund (grant number: MH076439, M.E.Z.); the Simons Foun-dation Autism Research Initiative (M.E.Z.); and the TrainingProgram in Human Disease Genetics (grant number:1T32MH087977, D.R.).

REFERENCES

1. Mamanova, L., Coffey, A.J., Scott, C.E., Kozarewa, I., Turner, E.H.,Kumar, A., Howard, E., Shendure, J. and Turner, D.J. (2010)Target-enrichment strategies for next-generation sequencing. Nat.Methods, 7, 111–118.

2. Shendure, J. and Ji, H. (2008) Next-generation DNA sequencing. Nat.Biotechnol., 26, 1135–1145.

3. Kogan, M.D., Blumberg, S.J., Schieve, L.A., Boyle, C.A., Perrin, J.M.,Ghandour, R.M., Singh, G.K., Strickland, B.B., Trevathan, E. and van

Human Molecular Genetics, 2012 7

at Emory U

niversity on July 11, 2012http://hm

g.oxfordjournals.org/D

ownloaded from

Dyck, P.C. (2009) Prevalence of parent-reported diagnosis of autismspectrum disorder among children in the US, 2007. Pediatrics, 124,1395–1403.

4. Fombonne, E. (2003) The prevalence of autism. JAMA, 289, 87–89.5. Bailey, A., Le Couteur, A., Gottesman, I., Bolton, P., Simonoff, E.,

Yuzda, E. and Rutter, M. (1995) Autism as a strongly genetic disorder:evidence from a British twin study. Psychol. Med., 25, 63–77.

6. Constantino, J.N., Zhang, Y., Frazier, T., Abbacchi, A.M. and Law, P.(2010) Sibling recurrence and the genetic epidemiology of autism.Am. J. Psychiatry, 167, 1349–1356.

7. Lichtenstein, P., Carlstrom, E., Rastam, M., Gillberg, C. and Anckarsater,H. (2010) The genetics of autism spectrum disorders and relatedneuropsychiatric disorders in childhood. Am. J. Psychiatry, 167, 1357–1363.

8. Hallmayer, J., Cleveland, S., Torres, A., Phillips, J., Cohen, B., Torigoe,T., Miller, J., Fedele, A., Collins, J., Smith, K. et al. (2011) Geneticheritability and shared environmental factors among twin pairs withautism. Arch. Gen. Psychiatry, 68, 1095–1102.

9. Ronald, A. and Hoekstra, R.A. (2011) Autism spectrum disorders andautistic traits: a decade of new twin studies. Am. J. Med. Genet. B, 156B,255–274.

10. Betancur, C. (2011) Etiological heterogeneity in autism spectrumdisorders: more than 100 genetic and genomic disorders and still counting.Brain Res., 1380, 42–77.

11. State, M.W. and Levitt, P. (2011) The conundrums of understandinggenetic risks for autism spectrum disorders. Nat. Neurosci., 14, 1499–1506.

12. Wang, K., Zhang, H., Ma, D., Bucan, M., Glessner, J.T., Abrahams, B.S.,Salyakina, D., Imielinski, M., Bradfield, J.P., Sleiman, P.M.A. et al.(2009) Common genetic variants on 5p14.1 associate with autismspectrum disorders. Nature, 459, 528–533.

13. Weiss, L.A., Arking, D.E., Daly, M.J. and Chakravarti, A. (2009) Agenome-wide linkage and association scan reveals novel loci for autism.Nature, 461, 802–808.

14. Anney, R., Klei, L., Pinto, D., Regan, R., Conroy, J., Magalhaes, T.R.,Correia, C., Abrahams, B.S., Sykes, N., Pagnamenta, A.T. et al. (2010) Agenome-wide scan for common alleles affecting risk for autism. Hum.Mol. Genet., 19, 4072–4082.

15. Devlin, B., Melhem, N. and Roeder, K. (2011) Do common variants play arole in risk for autism? Evidence and theoretical musings. Brain Res.,1380, 78–84.

16. Christian, S.L., Brune, C.W., Sudi, J., Kumar, R.A., Liu, S.,KaraMohamed, S., Badner, J.A., Matsui, S., Conroy, J., McQuaid, D.et al. (2008) Novel submicroscopic chromosomal abnormalities detectedin autism spectrum disorder. Biol. Psychiatry, 63, 1111–1117.

17. Kumar, R.A., KaraMohamed, S., Sudi, J., Conrad, D.F., Brune, C.,Badner, J.A., Gilliam, T.C., Nowak, N.J., Cook, E.H., Dobyns, W.B. et al.(2008) Recurrent 16p11.2 microdeletions in autism. Hum. Mol. Genet.,17, 628–638.

18. Marshall, C.R., Noor, A., Vincent, J.B., Lionel, A.C., Feuk, L., Skaug, J.,Shago, M., Moessner, R., Pinto, D., Ren, Y. et al. (2008) Structuralvariation of chromosomes in autism spectrum disorder. Am. J. Hum.Genet., 82, 477–488.

19. Weiss, L.A., Shen, Y., Korn, J.M., Arking, D.E., Miller, D.T., Fossdal, R.,Saemundsen, E., Stefansson, H., Ferreira, M.A., Green, T. et al. (2008)Association between microdeletion and microduplication at 16p11.2 andautism. N. Engl. J. Med., 358, 667–675.

20. Bucan, M., Abrahams, B.S., Wang, K., Glessner, J.T., Herman, E.I.,Sonnenblick, L.I., Alvarez Retuerto, A.I., Imielinski, M., Hadley, D.,Bradfield, J.P. et al. (2009) Genome-wide analyses of exonic copy numbervariants in a family-based study point to novel autism susceptibility genes.PLoS Genet., 5, e1000536.

21. Glessner, J.T., Wang, K., Cai, G., Korvatska, O., Kim, C.E., Wood, S.,Zhang, H., Estes, A., Brune, C.W., Bradfield, J.P. et al. (2009) Autismgenome-wide copy number variation reveals ubiquitin and neuronalgenes. Nature, 459, 569–573.

22. Pinto, D., Pagnamenta, A.T., Klei, L., Anney, R., Merico, D., Regan, R.,Conroy, J., Magalhaes, T.R., Correia, C., Abrahams, B.S. et al. (2010)Functional impact of global rare copy number variation in autismspectrum disorders. Nature, 466, 368–372.

23. Moreno-De-Luca, D., Mulle, J.G., Kaminsky, E.B., Sanders, S.J., Myers,S.M., Adam, M.P., Pakula, A.T., Eisenhauer, N.J., Uhas, K., Weik, L.et al. (2010) Deletion 17q12 is a recurrent copy number variant that

confers high risk of autism and schizophrenia. Am. J. Hum. Genet., 87,618–630.

24. Noor, A., Whibley, A., Marshall, C.R., Gianakopoulos, P.J., Piton, A.,Carson, A.R., Orlic-Milacic, M., Lionel, A.C., Sato, D., Pinto, D. et al.(2010) Disruption at the PTCHD1 Locus on Xp22.11 in Autism spectrumdisorder and intellectual disability. Sci. Transl. Med., 2, 49ra68.

25. Sanders, S.J., Ercan-Sencicek, A.G., Hus, V., Luo, R., Murtha, M.T.,Moreno-De-Luca, D., Chu, S.H., Moreau, M.P., Gupta, A.R., Thomson,S.A. et al. (2011) Multiple recurrent de novo CNVs, includingduplications of the 7q11.23 Williams syndrome region, are stronglyassociated with autism. Neuron, 70, 863–885.

26. Levy, D., Ronemus, M., Yamrom, B., Lee, Y.-H., Leotta, A., Kendall, J.,Marks, S., Lakshmi, B., Pai, D., Ye, K. et al. (2011) Rare de novo andtransmitted copy-number variation in autistic spectrum disorders. Neuron,70, 886–897.

27. Celestino-Soper, P.B.S., Shaw, C.A., Sanders, S.J., Li, J., Murtha, M.T.,Ercan-Sencicek, A.G., Davis, L., Thomson, S., Gambin, T., Chinault, A.C.et al. (2011) Use of array CGH to detect exonic copy number variantsthroughout the genome in autism families detects a novel deletion in,TMLHE. Hum. Mol. Genet., 20, 4360–4370.

28. Jamain, S., Quach, H., Betancur, C., Rastam, M., Colineaux, C., Gillberg,I.C., Soderstrom, H., Giros, B., Leboyer, M., Gillberg, C. et al. (2003)Mutations of the X-linked genes encoding neuroligins NLGN3 andNLGN4 are associated with autism. Nat. Genet., 34, 27–29.

29. Laumonnier, F., Bonnet-Brilhault, F., Gomot, M., Blanc, R., David, A.,Moizard, M.-P., Raynaud, M., Ronce, N., Lemonnier, E., Calvas, P. et al.(2004) X-linked mental retardation and autism are associated with amutation in the NLGN4 gene, a member of the neuroligin family.Am. J. Hum. Genet., 74, 552–557.

30. Chung, R.-H., Ma, D., Wang, K., Hedges, D.J., Jaworski, J.M., Gilbert,J.R., Cuccaro, M.L., Wright, H.H., Abramson, R.K., Konidari, I. et al.(2011) An X chromosome-wide association study in autism familiesidentifies TBL1X as a novel autism spectrum disorder candidate gene inmales. Mol. Autism, 2, 18.

31. Gu, Y., Shen, Y., Gibbs, R.A. and Nelson, D.L. (1996) Identification ofFMR2, a novel gene associated with the FRAXE CCG repeat and CpGisland. Nat. Genet., 13, 109–113.

32. Gecz, J., Gedeon, A.K., Sutherland, G.R. and Mulley, J.C. (1996)Identification of the gene FMR2, associated with FRAXE mentalretardation. Nat. Genet., 13, 105–108.

33. Gecz, J., Bielby, S., Sutherland, G.R. and Mulley, J.C. (1997) Genestructure and subcellular localization of FMR2, a member of a new familyof putative transcription activators. Genomics, 44, 201–213.

34. Moore, S.J., Strain, L., Cole, G.F., Miedzybrodzka, Z., Kelly, K.F. andDean, J.C. (1999) Fragile X syndrome with FMR1 and FMR2 deletion.J. Med. Genet., 36, 565–566.

35. Probst, F.J., Roeder, E.R., Enciso, V.B., Ou, Z., Cooper, M.L., Eng, P., Li,J., Gu, Y., Stratton, R.F., Chinault, A.C. et al. (2007) Chromosomalmicroarray analysis (CMA) detects a large X chromosome deletionincluding FMR1, FMR2, and IDS in a female patient with mentalretardation. Am. J. Med. Genet. A, 143A, 1358–1365.

36. Cavani, S., Prontera, P., Grasso, M., Ardisia, C., Malacarne, M., Gradassi,C., Cecconi, M., Mencarelli, A., Donti, E. and Pierluigi, M. (2011) FMR1,FMR2, and SLITRK2 deletion inside a paracentric inversion involvingbands Xq27.3-q28 in a male and his mother. Am. J. Med. Genet. A, 155A,221–224.

37. Stettner, G.M., Shoukier, M., Hoger, C., Brockmann, K. and Auber, B.(2011) Familial intellectual disability and autistic behavior caused by asmall FMR2 gene deletion. Am. J. Med. Genet. A, 155A, 2003–2007.

38. Whibley, A.C., Plagnol, V., Tarpey, P.S., Abidi, F., Fullston, T., Choma,M.K., Boucher, C.A., Shepherd, L., Willatt, L., Parkin, G. et al. (2010)Fine-scale survey of X chromosome copy number variants and indelsunderlying intellectual disability. Am. J. Hum. Genet., 87, 173–188.

39. Watterson, G.A. (1975) On the number of segregating sites in geneticalmodels without recombination. Theor. Pop. Biol., 7, 256–276.

40. Tajima, F. (1989) Statistical method for testing the neutral mutationhypothesis by DNA polymorphism. Genetics, 123, 585–595.

41. Patterson, N., Price, A.L. and Reich, D. (2006) Population structure andeigenanalysis. PLoS Genet., 2, e190.

42. Price, A.L., Patterson, N.J., Plenge, R.M., Weinblatt, M.E., Shadick, N.A.and Reich, D. (2006) Principal components analysis corrects forstratification in genome-wide association studies. Nat. Genet., 38, 904–909.

8 Human Molecular Genetics, 2012

at Emory U

niversity on July 11, 2012http://hm

g.oxfordjournals.org/D

ownloaded from

43. Yang, J., Weedon, M.N., Purcell, S., Lettre, G., Estrada, K., Willer, C.J.,Smith, A.V., Ingelsson, E., O’Connell, J.R., Mangino, M. et al. (2011)Genomic inflation factors under polygenic inheritance. Eur. J. HumanGenet., 19, 807–812.

44. Risch, N. and Merikangas, K. (1996) The future of genetic studies ofcomplex human diseases. Science, 273, 1516–1517.

45. Lander, E.S. (1996) The new genomics: global views of biology. Science,274, 536–539.

46. Kruglyak, L. (1999) Prospects for whole-genome linkage disequilibriummapping of common disease genes. Nat. Genet., 22, 139–144.

47. Consortium, I.H. (2005) A haplotype map of the human genome. Nature,437, 1299–1320.

48. Consortium, I.H., Frazer, K.A., Ballinger, D.G., Cox, D.R., Hinds, D.A.,Stuve, L.L., Gibbs, R.A., Belmont, J.W., Boudreau, A., Hardenbol, P.et al. (2007) A second generation human haplotype map of over 3.1million SNPs. Nature, 449, 851–861.

49. Zwick, M.E., Cutler, D.J. and Chakravarti, A. (2000) Patterns of geneticvariation in Mendelian and complex traits. Annu. Rev. Genomics Hum.Genet., 1, 387–407.

50. Visscher, P.M., Brown, M.A., McCarthy, M.I. and Yang, J. (2012) Fiveyears of GWAS discovery. Am. J. Hum. Genet., 90, 7–24.

51. Haldane, J.B.S. (1927) A mathematical theory of natural and artificialselection, part V: selection and mutation. Math. Proc. Cambridge, 23,838–844.

52. Barton, N.H. and Turelli, M. (1989) Evolutionary quantitative genetics:how little do we know? Annu. Rev. Genet., 23, 337–370.

53. Barton, N.H. and Keightley, P.D. (2002) Understanding quantitativegenetic variation. Nat. Rev. Genet., 3, 11–21.

54. Lande, R. (1975) The maintenance of genetic variability by mutation in apolygenic character with linked loci. Genet. Res., 26, 221–235.

55. Turelli, M. (1984) Heritable genetic variation via mutation-selectionbalance: Lerch’s zeta meets the abdominal bristle. Theor. Pop. Biol., 25,138–193.

56. Tarpey, P.S., Smith, R., Pleasance, E., Whibley, A., Edkins, S., Hardy, C.,O’Meara, S., Latimer, C., Dicks, E., Menzies, A. et al. (2009) Asystematic, large-scale resequencing screen of X-chromosome codingexons in mental retardation. Nat. Genet., 41, 535–543.

57. Piton, A., Gauthier, J., Hamdan, F.F., Lafreniere, R.G., Yang, Y., Henrion,E., Laurent, S., Noreau, A., Thibodeau, P., Karemera, L. et al. (2011)Systematic resequencing of X-chromosome synaptic genes in autismspectrum disorder and schizophrenia. Mol. Psychiatry, 16, 867–880.

58. Keinan, A. and Clark, A.G. (2012) Recent explosive human populationgrowth has resulted in an excess of rare genetic variants. Science, 336,740–743.

59. Ioannidis, J.P., Ntzani, E.E., Trikalinos, T.A. and Contopoulos-Ioannidis,D.G. (2001) Replication validity of genetic association studies. Nat.Genet., 29, 306–309.

60. Neale, B.M., Kou, Y., Liu, L., Ma’ayan, A., Samocha, K.E., Sabo, A., Lin,C.F., Stevens, C., Wang, L.S., Makarov, V. et al. (2012) Patterns and ratesof exonic de novo mutations in autism spectrum disorders. Nature, 485,242–245.

61. O’Roak, B.J., Vives, L., Girirajan, S., Karakoc, E., Krumm, N., Coe, B.P.,Levy, R., Ko, A., Lee, C., Smith, J.D. et al. (2012) Sporadic autism

exomes reveal a highly interconnected protein network of de novomutations. Nature, 485, 246–250.

62. Sanders, S.J., Murtha, M.T., Gupta, A.R., Murdoch, J.D., Raubeson, M.J.,Willsey, A.J., Ercan-Sencicek, A.G., DiLullo, N.M., Parikshak, N.N.,Stein, J.L. et al. (2012) De novo mutations revealed by whole-exomesequencing are strongly associated with autism. Nature, 485, 237–241.

63. Iossifov, I., Ronemus, M., Levy, D., Wang, Z., Hakker, I., Rosenbaum, J.,Yamrom, B., Lee, Y.-H., Narzisi, G., Leotta, A. et al. (2012) De novogene disruptions in children on the autistic spectrum. Neuron, 74, 285–299.

64. Bensaid, M., Melko, M., Bechara, E.G., Davidovic, L., Berretta, A.,Catania, M.V., Gecz, J., Lalli, E. and Bardoni, B. (2009)FRAXE-associated mental retardation protein (FMR2) is an RNA-bindingprotein with high affinity for G-quartet RNA forming structure. NucleicAcids Res., 37, 1269–1279.

65. Hillman, M.A. and Gecz, J. (2001) Fragile XE-associated familial mentalretardation protein 2 (FMR2) acts as a potent transcription activator.J. Hum. Genet., 46, 251–259.

66. Geschwind, D.H., Sowinski, J., Lord, C., Iversen, P., Shestack, J., Jones,P., Ducat, L. and Spence, S.J., AGRE Steering Committee. (2001) Theautism genetic resource exchange: a resource for the study of autism andrelated neuropsychiatric conditions. Am. J. Hum. Genet., 69, 463–466.

67. Fischbach, G.D. and Lord, C. (2010) The Simons Simplex Collection:a resource for identification of autism genetic risk factors. Neuron, 68,192–195.

68. Mondal, K., Shetty, A.C., Patel, V., Cutler, D.J. and Zwick, M.E. (2011)Targeted sequencing of the human X chromosome exome. Genomics, 98,260–265.

69. Shetty, A.C., Athri, P., Mondal, K., Horner, V.L., Steinberg, K.M., Patel,V., Caspary, T., Cutler, D.J. and Zwick, M.E. (2010) SeqAnt: a webservice to rapidly identify and annotate DNA sequence variations. BMCBioinformatics, 11, 471.

70. Collins, S.C., Bray, S.M., Suhl, J.A., Cutler, D.J., Coffee, B., Zwick, M.E.and Warren, S.T. (2010) Identification of novel FMR1 variants bymassively parallel sequencing in developmentally delayed males.Am. J. Med. Genet. A, 152A, 2512–2520.

71. Zwick, M.E., Mcafee, F., Cutler, D.J., Read, T.D., Ravel, J., Bowman,G.R., Galloway, D.R. and Mateczun, A. (2005) Microarray-basedresequencing of multiple Bacillus anthracis isolates. Genome Biol., 6,R10.

72. Exome Variant Server, NHLBI Exome Sequencing Project (ESP), Seattle,WA. Accessed 12/2011. Available: http://evs.gs.washington.edu/EVS/ viathe Internet.

73. Dreszer, T.R., Karolchik, D., Zweig, A.S., Hinrichs, A.S., Raney, B.J.,Kuhn, R.M., Meyer, L.R., Wong, M., Sloan, C.A., Rosenbloom, K.R.et al. (2012) The UCSC Genome Browser database: extensions andupdates 2011. Nucleic Acids Res., 40, D918–D923.

74. Plink 1.0.7. Available: http://pngu.mgh.harvard.edu/purcell/plink via theInternet.

75. Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M.A.,Bender, D., Maller, J., Sklar, P., de Bakker, P.I., Daly, M.J. et al. (2007)PLINK: a tool set for whole-genome association and population-basedlinkage analyses. Am. J. Hum. Genet., 81, 559–575.

Human Molecular Genetics, 2012 9

at Emory U

niversity on July 11, 2012http://hm

g.oxfordjournals.org/D

ownloaded from