76
1 A contribution of novel CNVs to schizophrenia from a genome-wide study of 41,321 subjects CNV Analysis Group and the Schizophrenia Working Group of the Psychiatric Genomics Consortium Christian R. Marshall 1 *, Daniel P. Howrigan 2,3 *, Daniele Merico 1 *, Bhooma Thiruvahindrapuram 1 , Wenting Wu 4,5 , Douglas S. Greer 4,5 , Danny Antaki 4,5 , Aniket Shetty 4,5 , Peter A. Holmans 6,7 , Dalila Pinto 8,9 , Madhusudan Gujral 4,5 , William M. Brandler 4,5 , Dheeraj Malhotra 4,5,10 , Zhouzhi Wang 1 , Karin V. Fuentes Fajarado 4,5 , Stephan Ripke 2,3 , Ingrid Agartz 11,12,13 , Esben Agerbo 14,15,16 , Margot Albus 17 , Madeline Alexander 18 , Farooq Amin 19,20 , Joshua Atkins 21,22 , Silviu A. Bacanu 23 ,Richard A. Belliveau Jr 3 , Sarah E. Bergen 3,24 , Marcelo Bertalan 16,25 , Elizabeth Bevilacqua 3 , Tim B. Bigdeli 23 , Donald W. Black 26 , Richard Bruggeman 27 , Nancy G. Buccola 28 , Randy L. Buckner 29,30,31 , Brendan Bulik-Sullivan 2,3 , William Byerley 32 , Wiepke Cahn 33 , Guiqing Cai 8,34 , Murray J. Cairns 21,35,36 , Dominique Campion 37 , Rita M. Cantor 38 , Vaughan J. Carr 35,39 , Noa Carrera 6 , Stanley V. Catts 35,40 , Kimberley D. Chambert 3 , Wei Cheng 41 , C. Robert Cloninger 42 , David Cohen 43 , Paul Cormican 44 , Nick Craddock 6,7 , Benedicto Crespo-Facorro 45,46 , James J. Crowley 47 , David Curtis 48,49 , Michael Davidson 50 , Kenneth L, Davis 8 , Franziska Degenhardt 51,52 , Jurgen Del Favero 53 , Lynn E. DeLisi 54,55 , Ditte Demontis 16,56,57 , Dimitris Dikeos 58 , Timothy Dinan 59 , Srdjan Djurovic 11,60 , Gary Donohoe 44,61 , Elodie Drapeau 8 , Jubao Duan 62,63 , Frank Dudbridge 64 , Peter Eichhammer 65 , Johan Eriksson 66,67,68 , Valentina Escott-Price 6 , Laurent Essioux 69 , Ayman H. Fanous 70,71,72,73 , Kai-How Farh 2 , Martilias S. Farrell 47 , Josef Frank 74 , Lude Franke 75 , Robert Freedman 76 , Nelson B. Freimer 77 , Joseph I. Friedman 8 , Andreas J. Forstner 51,52 , Menachem Fromer 2,3,78,79 , Giulio Genovese 3 , Lyudmila Georgieva 6 , Elliot S. Gershon 80 , Ina Giegling 81,82 , Paola Giusti-Rodríguez 47 , Stephanie Godard 83 , Jacqueline I. Goldstein 2,84 , Jacob Gratten 85 , Lieuwe de Haan 86 , Marian L. Hamshere 6 , Mark Hansen 87 , Thomas Hansen 16,25 , Vahram Haroutunian 8,88,89 , Annette M. Hartmann 81 , Frans A. Henskens 35,36,90 , Stefan Herms 51,52,91 , Joel N. Hirschhorn 84,92,93 , Per Hoffmann 51,52,91 , Andrea Hofman 51,52 , Mads V. Hollegaard 94 , David M. Hougaard 94 , Hailiang Huang 2,84 , Masashi Ikeda 95 , Inge Joa 96 , Anna K Kähler 24 , René S Kahn 33 , Luba Kalaydjieva 97,167 , Juha Karjalainen 75 , David Kavanagh 6 , Matthew C. Keller 99 , Brian J. Kelly 36 , James L. Kennedy 100,101,102 , Yunjung Kim 47 , James A. Knowles 103 , Bettina Konte 81 , Claudine Laurent 18,104 , Phil Lee 2,3,79 , S. Hong Lee 85 , Sophie E. Legge 6 , Bernard Lerer 105 , Deborah L. Levy 55,106 , Kung-Yee Liang 107 , Jeffrey Lieberman 108 , Jouko Lönnqvist 109 , Carmel M. Loughland 35,36 , Patrik K.E. Magnusson 24 , Brion S. Maher 110 , Wolfgang Maier 111 , Jacques Mallet 112 , Manuel Mattheisen 16,56,57,113 , Morten Mattingsdal 11,114 , Robert W McCarley 54,55 , Colm McDonald 115 , Andrew M. McIntosh 116,117 , Sandra Meier 74 , Carin J. Meijer 86 , Ingrid Melle 11,118 , Raquelle I. Mesholam-Gately 55,119 , Andres Metspalu 120 , Patricia T. Michie 35,121 , Lili Milani 120 , Vihra Milanova 122 , Younes Mokrab 123 , Derek W. Morris 44,61 , Ole Mors 16,57,124 , Bertram Müller- Myhsok 125,126,127 , Kieran C. Murphy 128 , Robin M. Murray 129 , Inez Myin-Germeys 130 , Igor Nenadic 131 , Deborah A. Nertney 132 , Gerald Nestadt 133 , Kristin K. Nicodemus 134 , Laura Nisenbaum 135 , Annelie Nordin 136 , Eadbhard O'Callaghan 137 , Colm O'Dushlaine 3 , Sang-Yun Oh 138 , Ann Olincy 76 , Line Olsen 16,25 , F. Anthony O'Neill 139 , Jim Van Os 130,140 , Christos Pantelis 35,141 , George N. Papadimitriou 58 , Elena Parkhomenko 8 , Michele T. Pato 103 , Tiina Paunio 142 , Psychosis Endophenotypes International Consortium, Diana O. Perkins 143 , Tune H. Pers 84,93,144 , Olli Pietiläinen 142,145 , Jonathan Pimm 49 , Andrew J. Pocklington 6 , John Powell 129 , Alkes Price 84,146 , Ann . CC-BY-NC-ND 4.0 International license peer-reviewed) is the author/funder. It is made available under a The copyright holder for this preprint (which was not . http://dx.doi.org/10.1101/040493 doi: bioRxiv preprint first posted online Feb. 23, 2016;

A contribution of novel CNVs to schizophrenia from a ... · A contribution of novel CNVs to schizophrenia from a genome-wide study of 41,321 subjects ... , Jacob Gratten85, Lieuwe

  • Upload
    dongoc

  • View
    214

  • Download
    0

Embed Size (px)

Citation preview

1

AcontributionofnovelCNVstoschizophreniafromagenome-widestudyof41,321subjects

CNVAnalysisGroupandtheSchizophreniaWorkingGroupofthePsychiatricGenomicsConsortium

ChristianR.Marshall1*,DanielP.Howrigan2,3*,DanieleMerico1*,BhoomaThiruvahindrapuram1,WentingWu4,5,DouglasS.Greer4,5,DannyAntaki4,5,AniketShetty4,5,PeterA.Holmans6,7,DalilaPinto8,9,MadhusudanGujral4,5,WilliamM.Brandler4,5,DheerajMalhotra4,5,10,ZhouzhiWang1,KarinV.FuentesFajarado4,5,StephanRipke2,3,IngridAgartz11,12,13,EsbenAgerbo14,15,16,MargotAlbus17,MadelineAlexander18,FarooqAmin19,20,JoshuaAtkins21,22,SilviuA.Bacanu23,RichardA.BelliveauJr3,SarahE.Bergen3,24,MarceloBertalan16,25,ElizabethBevilacqua3,TimB.Bigdeli23,DonaldW.Black26,RichardBruggeman27,NancyG.Buccola28,RandyL.Buckner29,30,31,BrendanBulik-Sullivan2,3,WilliamByerley32,WiepkeCahn33,GuiqingCai8,34,MurrayJ.Cairns21,35,36,DominiqueCampion37,RitaM.Cantor38,VaughanJ.Carr35,39,NoaCarrera6,StanleyV.Catts35,40,KimberleyD.Chambert3,WeiCheng41,C.RobertCloninger42,DavidCohen43,PaulCormican44,NickCraddock6,7,BenedictoCrespo-Facorro45,46,JamesJ.Crowley47,DavidCurtis48,49,MichaelDavidson50,KennethL,Davis8,FranziskaDegenhardt51,52,JurgenDelFavero53,LynnE.DeLisi54,55,DitteDemontis16,56,57,DimitrisDikeos58,TimothyDinan59,SrdjanDjurovic11,60,GaryDonohoe44,61,ElodieDrapeau8,JubaoDuan62,63,FrankDudbridge64,PeterEichhammer65,JohanEriksson66,67,68,ValentinaEscott-Price6,LaurentEssioux69,AymanH.Fanous70,71,72,73,Kai-HowFarh2,MartiliasS.Farrell47,JosefFrank74,LudeFranke75,RobertFreedman76,NelsonB.Freimer77,JosephI.Friedman8,AndreasJ.Forstner51,52,MenachemFromer2,3,78,79,GiulioGenovese3,LyudmilaGeorgieva6,ElliotS.Gershon80,InaGiegling81,82,PaolaGiusti-Rodríguez47,StephanieGodard83,JacquelineI.Goldstein2,84,JacobGratten85,LieuwedeHaan86,MarianL.Hamshere6,MarkHansen87,ThomasHansen16,25,VahramHaroutunian8,88,89,AnnetteM.Hartmann81,FransA.Henskens35,36,90,StefanHerms51,52,91,JoelN.Hirschhorn84,92,93,PerHoffmann51,52,91,AndreaHofman51,52,MadsV.Hollegaard94,DavidM.Hougaard94,HailiangHuang2,84,MasashiIkeda95,IngeJoa96,AnnaKKähler24,RenéSKahn33,LubaKalaydjieva97,167,JuhaKarjalainen75,DavidKavanagh6,MatthewC.Keller99,BrianJ.Kelly36,JamesL.Kennedy100,101,102,YunjungKim47,JamesA.Knowles103,BettinaKonte81,ClaudineLaurent18,104,PhilLee2,3,79,S.HongLee85,SophieE.Legge6,BernardLerer105,DeborahL.Levy55,106,Kung-YeeLiang107,JeffreyLieberman108,JoukoLönnqvist109,CarmelM.Loughland35,36,PatrikK.E.Magnusson24,BrionS.Maher110,WolfgangMaier111,JacquesMallet112,ManuelMattheisen16,56,57,113,MortenMattingsdal11,114,RobertWMcCarley54,55,ColmMcDonald115,AndrewM.McIntosh116,117,SandraMeier74,CarinJ.Meijer86,IngridMelle11,118,RaquelleI.Mesholam-Gately55,119,AndresMetspalu120,PatriciaT.Michie35,121,LiliMilani120,VihraMilanova122,YounesMokrab123,DerekW.Morris44,61,OleMors16,57,124,BertramMüller-Myhsok125,126,127,KieranC.Murphy128,RobinM.Murray129,InezMyin-Germeys130,IgorNenadic131,DeborahA.Nertney132,GeraldNestadt133,KristinK.Nicodemus134,LauraNisenbaum135,AnnelieNordin136,EadbhardO'Callaghan137,ColmO'Dushlaine3,Sang-YunOh138,AnnOlincy76,LineOlsen16,25,F.AnthonyO'Neill139,JimVanOs130,140,ChristosPantelis35,141,GeorgeN.Papadimitriou58,ElenaParkhomenko8,MicheleT.Pato103,TiinaPaunio142,PsychosisEndophenotypesInternationalConsortium,DianaO.Perkins143,TuneH.Pers84,93,144,OlliPietiläinen142,145,JonathanPimm49,AndrewJ.Pocklington6,JohnPowell129,AlkesPrice84,146,Ann

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

2

E.Pulver133,ShaunM.Purcell78,DigbyQuested147,HenrikB.Rasmussen16,25,AbrahamReichenberg8,89,MarkA.Reimers23,AlexanderL.Richards6,7,JoshuaL.Roffman30,31,PanosRoussos78,148,DouglasM.Ruderfer6,78,VeikkoSalomaa67,AlanR.Sanders62,63,AdamSavitz149,UlrichSchall35,36,ThomasG.Schulze74,150,SibylleG.Schwab151,EdwardM.Scolnick3,RodneyJ.Scott21,35,152,LarryJ.Seidman55,119,JianxinShi153,JeremyM.Silverman8,154,JordanW.Smoller3,79,ErikSöderman13,ChrisC.A.Spencer155,EliA.Stahl78,84,EricStrengman33,156,JanaStrohmaier74,T.ScottStroup108,JaanaSuvisaari109,DraganM.Svrakic42,JinP.Szatkiewicz47,SrinivasThirumalai157,PaulA.Tooney21,35,36,JuhaVeijola158,159,PeterM.Visscher85,JohnWaddington160,DermotWalsh161,BradleyT.Webb23,MarkWeiser50,DieterB.Wildenauer98,NigelM.Williams6,StephanieWilliams47,StephanieH.Witt74,AaronR.Wolen23,BrandonK.Wormley23,NaomiRWray85,JingQinWu21,35,ClementC.Zai100,101,WellcomeTrustCase-ControlConsortium2,RolfAdolfsson136,OleA.Andreassen11,118,DouglasH.R.Blackwood116,AndersD.Børglum16,56,57,124,ElviraBramon162,JosephD.Buxbaum8,34,89,163,SvenCichon51,52,91,164,DavidA.Collier123,165,AidenCorvin44,MarkJ.Daly2,3,84,ArielDarvasi166,EnricoDomenici10,TõnuEsko84,92,93,120,PabloV.Gejman62,63,MichaelGill44,HughGurling49,ChristinaM.Hultman24,NakaoIwata95,AssenV.Jablensky35,98,167,168,ErikGJönsson11,13,KennethSKendler23,GeorgeKirov6,JoKnight100,101,102,DouglasF.Levinson18,QingqinSLi149,StevenAMcCarroll3,92,AndrewMcQuillin49,JenniferL.Moran3,PrebenB.Mortensen14,15,16,BryanJ.Mowry85,132,MarkusM.Nöthen51,52,RoelA.Ophoff33,38,77,MichaelJ.Owen6,7,AarnoPalotie3,79,145,CarlosN.Pato103,TraceyL.Petryshen3,55,169,DaniellePosthuma170,171,172,MarcellaRietschel74,BrienP.Riley23,DanRujescu81,82,PamelaSklar78,89,148,DavidSt.Clair173,JamesT.R.Walters6,ThomasWerge16,25,174,PatrickF.Sullivan24,47,143,MichaelCO’Donovan6,7†,StephenW.Scherer1,175†,BenjaminM.Neale2,3,79,84†,JonathanSebat4,5,176†*theseauthorscontributedequally†theseauthorsco-supervisedthestudyCorrespondence:jsebat@ucsd.edu1TheCentreforAppliedGenomicsandPrograminGeneticsandGenomeBiology,TheHospitalforSickChildren,Toronto,ON,Canada2AnalyticandTranslationalGeneticsUnit,MassachusettsGeneralHospital,Boston,Massachusetts02114,USA3StanleyCenterforPsychiatricResearch,BroadInstituteofMITandHarvard,Cambridge,Massachusetts02142,USA4BeysterCenterforPsychiatricGenomics,UniversityofCalifornia,SanDiego,LaJolla,CA92093,USA5DepartmentofPsychiatry,UniversityofCalifornia,SanDiego,LaJolla,CA92093,USA6MRCCentreforNeuropsychiatricGeneticsandGenomics,InstituteofPsychologicalMedicineandClinicalNeurosciences,SchoolofMedicine,CardiffUniversity,Cardiff,CF244HQ,UK7NationalCentreforMentalHealth,CardiffUniversity,Cardiff,CF244HQ,UK8DepartmentofPsychiatry,IcahnSchoolofMedicineatMountSinai,NewYork,NewYork10029,USA9DepartmentofGeneticsandGenomicSciences,SeaverAutismCenter,TheMindichChildHealth&DevelopmentInstitute,IcahnSchoolofMedicineatMountSinai,NewYork,NewYork10029,USA10NeuroscienceDiscoveryandTranslationalArea,PharmaResearch&EarlyDevelopment,F.Hoffmann-LaRocheLtd,CH-4070Basel,Switzerland11NORMENT,KGJebsenCentreforPsychosisResearch,InstituteofClinicalMedicine,UniversityofOslo,0424Oslo,Norway12DepartmentofPsychiatry,DiakonhjemmetHospital,0319Oslo,Norway

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

3

13DepartmentofClinicalNeuroscience,PsychiatrySection,KarolinskaInstitutet,SE-17176Stockholm,Sweden14NationalCentreforRegister-basedResearch,AarhusUniversity,DK-8210Aarhus,Denmark15CentreforIntegrativeRegister-basedResearch,CIRRAU,AarhusUniversity,DK-8210Aarhus,Denmark16TheLundbeckFoundationInitiativeforIntegrativePsychiatricResearch,iPSYCH,Denmark17StateMentalHospital,85540Haar,Germany18DepartmentofPsychiatryandBehavioralSciences,StanfordUniversity,Stanford,California94305,USA19DepartmentofPsychiatryandBehavioralSciences,EmoryUniversity,Atlanta,Georgia30322,USA20DepartmentofPsychiatryandBehavioralSciences,AtlantaVeteransAffairsMedicalCenter,Atlanta,Georgia30033,USA21SchoolofBiomedicalSciencesandPharmacy,UniversityofNewcastle,CallaghanNSW2308,Australia22HunterMedicalResearchInstitute,NewLambton,NewSouthWales,Australia23VirginiaInstituteforPsychiatricandBehavioralGenetics,DepartmentofPsychiatry,VirginiaCommonwealthUniversity,Richmond,Virginia23298,USA24DepartmentofMedicalEpidemiologyandBiostatistics,KarolinskaInstitutet,StockholmSE-17177,Sweden25InstituteofBiologicalPsychiatry,MentalHealthCentreSct.Hans,MentalHealthServicesCopenhagen,DK-4000,Denmark26DepartmentofPsychiatry,UniversityofIowaCarverCollegeofMedicine,IowaCity,Iowa52242,USA27UniversityMedicalCenterGroningen,DepartmentofPsychiatry,UniversityofGroningen,NL-9700RB,TheNetherlands28SchoolofNursing,LouisianaStateUniversityHealthSciencesCenter,NewOrleans,Louisiana70112,USA29CenterforBrainScience,HarvardUniversity,Cambridge,Massachusetts02138,USA30DepartmentofPsychiatry,MassachusettsGeneralHospital,Boston,Massachusetts02114,USA31AthinoulaA.MartinosCenter,MassachusettsGeneralHospital,Boston,Massachusetts02129,USA32DepartmentofPsychiatry,UniversityofCaliforniaatSanFrancisco,SanFrancisco,California,94143USA33UniversityMedicalCenterUtrecht,DepartmentofPsychiatry,RudolfMagnusInstituteofNeuroscience,3584Utrecht,TheNetherlands34DepartmentofHumanGenetics,IcahnSchoolofMedicineatMountSinai,NewYork,NewYork10029,USA35SchizophreniaResearchInstitute,SydneyNSW2010,Australia36PriorityCentreforTranslationalNeuroscienceandMentalHealth,UniversityofNewcastle,NewcastleNSW2300,Australia37CentreHospitalierduRouvrayandINSERMU1079FacultyofMedicine,76301Rouen,France38DepartmentofHumanGenetics,DavidGeffenSchoolofMedicine,UniversityofCalifornia,LosAngeles,California90095,USA39SchoolofPsychiatry,UniversityofNewSouthWales,SydneyNSW2031,Australia40RoyalBrisbaneandWomen'sHospital,UniversityofQueensland,BrisbaneQLD4072,Australia41DepartmentofComputerScience,UniversityofNorthCarolina,ChapelHill,NorthCarolina27514,USA42DepartmentofPsychiatry,WashingtonUniversity,St.Louis,Missouri63110,USA43DepartmentofChildandAdolescentPsychiatry,AssistancePubliqueHospitauxdeParis,PierreandMarieCurieFacultyofMedicineandInstituteforIntelligentSystemsandRobotics,Paris,75013,France44NeuropsychiatricGeneticsResearchGroup,DepartmentofPsychiatry,TrinityCollegeDublin,Dublin8,Ireland45UniversityHospitalMarquésdeValdecilla,InstitutodeFormacióneInvestigaciónMarquésdeValdecilla,UniversityofCantabria,E-39008Santander,Spain46CentroInvestigaciónBiomédicaenRedSaludMental,Madrid,Spain47DepartmentofGenetics,UniversityofNorthCarolina,ChapelHill,NorthCarolina27599-7264,USA48DepartmentofPsychologicalMedicine,QueenMaryUniversityofLondon,LondonE11BB,UK49MolecularPsychiatryLaboratory,DivisionofPsychiatry,UniversityCollegeLondon,LondonWC1E6JJ,UK50ShebaMedicalCenter,TelHashomer52621,Israel51InstituteofHumanGenetics,UniversityofBonn,D-53127Bonn,Germany52DepartmentofGenomics,LifeandBrainCenter,D-53127Bonn,Germany53AppliedMolecularGenomicsUnit,VIBDepartmentofMolecularGenetics,UniversityofAntwerp,B-2610Antwerp,Belgium54VABostonHealthCareSystem,Brockton,Massachusetts02301,USA55DepartmentofPsychiatry,HarvardMedicalSchool,Boston,Massachusetts02115,USA

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

4

56DepartmentofBiomedicine,AarhusUniversity,DK-8000AarhusC,Denmark57CentreforIntegrativeSequencing,iSEQ,AarhusUniversity,DK-8000AarhusC,Denmark58FirstDepartmentofPsychiatry,UniversityofAthensMedicalSchool,Athens11528,Greece59DepartmentofPsychiatry,UniversityCollegeCork,Co.Cork,Ireland60DepartmentofMedicalGenetics,OsloUniversityHospital,0424Oslo,Norway61CognitiveGeneticsandTherapyGroup,SchoolofPsychologyandDisciplineofBiochemistry,NationalUniversityofIrelandGalway,Co.Galway,Ireland62DepartmentofPsychiatryandBehavioralSciences,NorthShoreUniversityHealthSystem,Evanston,Illinois60201,USA63DepartmentofPsychiatryandBehavioralNeuroscience,UniversityofChicago,Chicago,Illinois60637,USA64DepartmentofNon-CommunicableDiseaseEpidemiology,LondonSchoolofHygieneandTropicalMedicine,LondonWC1E7HT,UK65DepartmentofPsychiatry,UniversityofRegensburg,93053Regensburg,Germany66FolkhälsanResearchCenter,Helsinki,Finland,BiomedicumHelsinki1,Haartmaninkatu8,FI-00290,Helsinki,Finland67NationalInstituteforHealthandWelfare,P.O.BOX30,FI-00271Helsinki,Finland68DepartmentofGeneralPractice,HelsinkiUniversityCentralHospital,UniversityofHelsinkiP.O.BOX20,Tukholmankatu8B,FI-00014,Helsinki,Finland69TranslationalTechnologiesandBioinformatics,PharmaResearchandEarlyDevelopment,F.Hoffman-LaRoche,CH-4070Basel,Switzerland70MentalHealthServiceLine,WashingtonVAMedicalCenter,WashingtonDC20422,USA71DepartmentofPsychiatry,GeorgetownUniversity,WashingtonDC20057,USA72DepartmentofPsychiatry,VirginiaCommonwealthUniversity,Richmond,Virginia23298,USA73DepartmentofPsychiatry,KeckSchoolofMedicineatUniversityofSouthernCalifornia,LosAngeles,California90033,USA74DepartmentofGeneticEpidemiologyinPsychiatry,CentralInstituteofMentalHealth,MedicalFacultyMannheim,UniversityofHeidelberg,Heidelberg,D-68159Mannheim,Germany75DepartmentofGenetics,UniversityofGroningen,UniversityMedicalCentreGroningen,9700RBGroningen,TheNetherlands76DepartmentofPsychiatry,UniversityofColoradoDenver,Aurora,Colorado80045,USA77CenterforNeurobehavioralGenetics,SemelInstituteforNeuroscienceandHumanBehavior,UniversityofCalifornia,LosAngeles,California90095,USA78DivisionofPsychiatricGenomics,DepartmentofPsychiatry,IcahnSchoolofMedicineatMountSinai,NewYork,NewYork10029,USA79PsychiatricandNeurodevelopmentalGeneticsUnit,MassachusettsGeneralHospital,Boston,Massachusetts02114,USA80DepartmentsofPsychiatryandHumanGenetics,UniversityofChicago,Chicago,Illinois60637USA81DepartmentofPsychiatry,UniversityofHalle,06112Halle,Germany82DepartmentofPsychiatry,UniversityofMunich,80336,Munich,Germany83DepartmentsofPsychiatryandHumanandMolecularGenetics,INSERM,InstitutdeMyologie,HôpitaldelaPitiè-Salpêtrière,Paris,75013,France84MedicalandPopulationGeneticsProgram,BroadInstituteofMITandHarvard,Cambridge,Massachusetts02142,USA85QueenslandBrainInstitute,TheUniversityofQueensland,Brisbane,QLD4072,Australia86AcademicMedicalCentreUniversityofAmsterdam,DepartmentofPsychiatry,1105AZAmsterdam,TheNetherlands87Illumina,LaJolla,California,California92122,USA88J.J.PetersVAMedicalCenter,Bronx,NewYork,NewYork10468,USA89FriedmanBrainInstitute,IcahnSchoolofMedicineatMountSinai,NewYork,NewYork10029,USA90SchoolofElectricalEngineeringandComputerScience,UniversityofNewcastle,NewcastleNSW2308,Australia91DivisionofMedicalGenetics,DepartmentofBiomedicine,UniversityofBasel,Basel,CH-4058,Switzerland92DepartmentofGenetics,HarvardMedicalSchool,Boston,Massachusetts02115,USA

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

5

93DivisionofEndocrinologyandCenterforBasicandTranslationalObesityResearch,BostonChildren'sHospital,Boston,Massachusetts02115,USA94SectionofNeonatalScreeningandHormones,DepartmentofClinicalBiochemistry,ImmunologyandGenetics,StatensSerumInstitut,Copenhagen,DK-2300,Denmark95DepartmentofPsychiatry,FujitaHealthUniversitySchoolofMedicine,Toyoake,Aichi,470-1192,Japan96RegionalCentreforClinicalResearchinPsychosis,DepartmentofPsychiatry,StavangerUniversityHospital,4011Stavanger,Norway97CentreforMedicalResearch,TheUniversityofWesternAustralia,Perth,WA6009,Australia98SchoolofPsychiatryandClinicalNeurosciences,TheUniversityofWesternAustralia,Perth,WA6009,Australia99DepartmentofPsychology,UniversityofColoradoBoulder,Boulder,Colorado80309,USA100CampbellFamilyMentalHealthResearchInstitute,CentreforAddictionandMentalHealth,Toronto,Ontario,M5T1R8,Canada101DepartmentofPsychiatry,UniversityofToronto,Toronto,Ontario,M5T1R8,Canada102InstituteofMedicalScience,UniversityofToronto,Toronto,Ontario,M5S1A8,Canada103DepartmentofPsychiatryandZilkhaNeurogeneticsInstitute,KeckSchoolofMedicineatUniversityofSouthernCalifornia,LosAngeles,California90089,USA104DepartmentofChildandAdolescentPsychiatry,PierreandMarieCurieFacultyofMedicine,Paris75013,France105DepartmentofPsychiatry,Hadassah-HebrewUniversityMedicalCenter,Jerusalem91120,Israel106PsychologyResearchLaboratory,McLeanHospital,Belmont,MA107DepartmentofBiostatistics,JohnsHopkinsUniversityBloombergSchoolofPublicHealth,Baltimore,Maryland21205,USA108DepartmentofPsychiatry,ColumbiaUniversity,NewYork,NewYork10032,USA109DepartmentofMentalHealthandSubstanceAbuseServices,NationalInstituteforHealthandWelfare,P.O.BOX30,FI-00271Helsinki,Finland110DepartmentofMentalHealth,BloombergSchoolofPublicHealth,JohnsHopkinsUniversity,Baltimore,Maryland21205,USA111DepartmentofPsychiatry,UniversityofBonn,D-53127Bonn,Germany112CentreNationaldelaRechercheScientifique,LaboratoiredeGénétiqueMoléculairedelaNeurotransmissionetdesProcessusNeurodégénératifs,HôpitaldelaPitiéSalpêtrière,75013,Paris,France113DepartmentofGenomicsMathematics,UniversityofBonn,D-53127Bonn,Germany114ResearchUnit,SørlandetHospital,4604Kristiansand,Norway115DepartmentofPsychiatry,NationalUniversityofIrelandGalway,Co.Galway,Ireland116DivisionofPsychiatry,UniversityofEdinburgh,EdinburghEH164SB,UK117CentreforCognitiveAgeingandCognitiveEpidemiology,UniversityofEdinburgh,EdinburghEH164SB,UK118DivisionofMentalHealthandAddiction,OsloUniversityHospital,0424Oslo,Norway119MassachusettsMentalHealthCenterPublicPsychiatryDivisionoftheBethIsraelDeaconessMedicalCenter,Boston,Massachusetts02114,USA120EstonianGenomeCenter,UniversityofTartu,Tartu50090,Estonia121SchoolofPsychology,UniversityofNewcastle,NewcastleNSW2308,Australia122FirstPsychiatricClinic,MedicalUniversity,Sofia1431,Bulgaria123EliLillyandCompanyLimited,ErlWoodManor,SunninghillRoad,Windlesham,Surrey,GU206PHUK124DepartmentP,AarhusUniversityHospital,DK-8240Risskov,Denmark125MaxPlanckInstituteofPsychiatry,80336Munich,Germany126InstituteofTranslationalMedicine,UniversityofLiverpool,LiverpoolL693BX,UK127Munich127ClusterforSystemsNeurology(SyNergy),80336Munich,Germany128DepartmentofPsychiatry,RoyalCollegeofSurgeonsinIreland,Dublin2,Ireland129King'sCollegeLondon,LondonSE58AF,UK130MaastrichtUniversityMedicalCentre,SouthLimburgMentalHealthResearchandTeachingNetwork,EURON,6229HXMaastricht,TheNetherlands131DepartmentofPsychiatryandPsychotherapy,JenaUniversityHospital,07743Jena,Germany132QueenslandCentreforMentalHealthResearch,UniversityofQueensland,BrisbaneQLD4076,Australia133DepartmentofPsychiatryandBehavioralSciences,JohnsHopkinsUniversitySchoolofMedicine,Baltimore,Maryland21205,USA

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

6

134DepartmentofPsychiatry,TrinityCollegeDublin,Dublin2,Ireland135EliLillyandCompany,LillyCorporateCenter,Indianapolis,46285Indiana,USA136DepartmentofClinicalSciences,Psychiatry,UmeåUniversity,SE-90187Umeå,Sweden137DETECTEarlyInterventionServiceforPsychosis,Blackrock,Co.Dublin,Ireland138LawrenceBerkeleyNationalLaboratory,UniversityofCaliforniaatBerkeley,Berkeley,California94720,USA139CentreforPublicHealth,InstituteofClinicalSciences,Queen'sUniversityBelfast,BelfastBT126AB,UK140InstituteofPsychiatry,King'sCollegeLondon,LondonSE58AF,UK141MelbourneNeuropsychiatryCentre,UniversityofMelbourne&MelbourneHealth,MelbourneVIC3053,Australia142PublicHealthGenomicsUnit,NationalInstituteforHealthandWelfare,P.O.BOX30,FI-00271Helsinki,Finland143DepartmentofPsychiatry,UniversityofNorthCarolina,ChapelHill,NorthCarolina27599-7160,USA144CenterforBiologicalSequenceAnalysis,DepartmentofSystemsBiology,TechnicalUniversityofDenmark,DK-2800,Denmark145InstituteforMolecularMedicineFinland,FIMM,UniversityofHelsinki,P.O.BOX20FI-00014,Helsinki,Finland146DepartmentofEpidemiology,HarvardSchoolofPublicHealth,Boston,Massachusetts02115,USA147DepartmentofPsychiatry,UniversityofOxford,Oxford,OX37JX,UK148InstituteforMultiscaleBiology,IcahnSchoolofMedicineatMountSinai,NewYork,NewYork10029,USA149NeuroscienceTherapeuticArea,JanssenResearchandDevelopment,Raritan,NewJersey08869,USA150DepartmentofPsychiatryandPsychotherapy,UniversityofGöttingen,37073Göttingen,Germany151PsychiatryandPsychotherapyClinic,UniversityofErlangen,91054Erlangen,Germany152HunterNewEnglandHealthService,NewcastleNSW2308,Australia153DivisionofCancerEpidemiologyandGenetics,NationalCancerInstitute,Bethesda,Maryland20892,USA154ResearchandDevelopment,BronxVeteransAffairsMedicalCenter,NewYork,NewYork10468,USA155WellcomeTrustCentreforHumanGenetics,Oxford,OX37BN,UK156DepartmentofMedicalGenetics,UniversityMedicalCentreUtrecht,Universiteitsweg100,3584CG,Utrecht,TheNetherlands157BerkshireHealthcareNHSFoundationTrust,BracknellRG121BQ,UK158DepartmentofPsychiatry,UniversityofOulu,P.O.BOX5000,90014,Finland159UniversityHospitalofOulu,P.O.BOX20,90029OYS,Finland160MolecularandCellularTherapeutics,RoyalCollegeofSurgeonsinIreland,Dublin2,Ireland161HealthResearchBoard,Dublin2,Ireland162UniversityCollegeLondon,LondonWC1E6BT,UK163DepartmentofNeuroscience,IcahnSchoolofMedicineatMountSinai,NewYork,NewYork10029,USA164InstituteofNeuroscienceandMedicine(INM-1),ResearchCenterJuelich,52428Juelich,Germany165Social,GeneticandDevelopmentalPsychiatryCentre,InstituteofPsychiatry,King'sCollegeLondon,London,SE58AF,UK166DepartmentofGenetics,TheHebrewUniversityofJerusalem,91905Jerusalem,Israel167ThePerkinsInstituteforMedicalResearch,TheUniversityofWesternAustralia,Perth,WA6009,Australia168CentreforClinicalResearchinNeuropsychiatry,SchoolofPsychiatryandClinicalNeurosciences,TheUniversityofWesternAustralia,MedicalResearchFoundationBuilding,PerthWA6000,Australia169CenterforHumanGeneticResearchandDepartmentofPsychiatry,MassachusettsGeneralHospital,Boston,Massachusetts02114,USA170DepartmentofFunctionalGenomics,CenterforNeurogenomicsandCognitiveResearch,NeuroscienceCampusAmsterdam,VUUniversity,Amsterdam1081,TheNetherlands171DepartmentofComplexTraitGenetics,NeuroscienceCampusAmsterdam,VUUniversityMedicalCenterAmsterdam,Amsterdam1081,TheNetherlands172DepartmentofChildandAdolescentPsychiatry,ErasmusUniversityMedicalCentre,Rotterdam3000,TheNetherlands173UniversityofAberdeen,InstituteofMedicalSciences,Aberdeen,AB252ZD,UK174DepartmentofClinicalMedicine,UniversityofCopenhagen,Copenhagen2200,Denmark175DepartmentofMolecularGeneticsandMcLaughlinCentre,UniversityofToronto,Toronto,Ontario,Canada176DepartmentofCellularandMolecularMedicine,UniversityofCalifornia,SanDiego,LaJolla,CA92093,USA

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

7

Abstract

Genomiccopynumbervariants(CNVs)havebeenstronglyimplicatedintheetiologyof

schizophrenia(SCZ).However,apartfromasmallnumberofriskvariants,elucidationofthe

CNVcontributiontoriskhasbeendifficultduetotherarityofriskalleles,alloccurringinless

than1%ofcases.Wesoughttoaddressthisobstaclethroughacollaborativeeffortinwhichwe

appliedacentralizedanalysispipelinetoaSCZcohortof21,094casesand20,227controls.We

observedaglobalenrichmentofCNVburdenincases(OR=1.11,P=5.7e-15),whichpersistedafter

excludinglociimplicatedinpreviousstudies(OR=1.07,P=1.7e-6).CNVburdenisalsoenrichedfor

genesassociatedwithsynapticfunction(OR=1.68,P=2.8e-11)andneurobehavioralphenotypes

inmouse(OR=1.18,P=7.3e-5).Weidentifiedgenome-widesignificantsupportforeightloci,

including1q21.1,2p16.3(NRXN1),3q29,7q11.2,15q13.3,distal16p11.2,proximal16p11.2and

22q11.2.Wefindsupportatasuggestivelevelfornineadditionalcandidatesusceptibilityand

protectiveloci,whichconsistpredominantlyofCNVsmediatedbynon-allelichomologous

recombination(NAHR).

Introduction

Studiesofgenomiccopynumbervariation(CNV)haveestablishedaroleforraregenetic

variantsintheetiologyofSCZ1.TherearethreelinesofevidencethatCNVscontributetorisk

forSCZ:genome-wideenrichmentofraredeletionsandduplicationsinSCZcasesrelativeto

controls2,3,ahigherrateofdenovoCNVsincasesrelativetocontrols4-6,andassociation

evidenceimplicatingasmallnumberofspecificloci(Extendeddatatable1).AllCNVsthathave

beenimplicatedinSCZarerareinthepopulation,butconfersignificantrisk(oddsratios2-60).

Todate,CNVsassociatedwithSCZhavelargelyemergedfrommergersofsummarydata

forspecificcandidateloci7-9;yeteventhelargestgenome-widescans(samplesizestypically

<10,000)remainunder-poweredtorobustlyconfirmgeneticassociationforthemajorityof

pathogenicCNVsreportedsofar,particularlyforthosewithlowfrequencies(<0.5%incases)or

intermediateeffectsizes(oddsratios2-10).Itisimportanttoaddressthelowpowerof

systematicCNVstudieswithlargersamplesgiventhatthistypeofmutationhasalreadyproven

usefulforhighlightingsomeaspectsofSCZrelatedbiology6,10-13.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

8

Thelimitedstatisticalpowerprovidedbysmallsamplesisasignificantobstaclein

studiesofrareandcommongeneticvariation.Inresponse,globalcollaborationshavebeen

formedinordertoattainlargesamplesizes,asexemplifiedbythestudyoftheSchizophrenia

WorkingGroupofthePsychiatricGenomicsConsortium(PGC)inwhich108independent

schizophreniaassociatedlociwereidentified14.Recognizingtheneedforsimilarlylarge

samplesinstudiesofCNVsforpsychiatricdisorders,weformedthePGCCNVAnalysisGroup.

Ourgoalwastoenablelarge-scaleanalysesofCNVsinpsychiatryusingcentralizedanduniform

methodologiesforCNVcalling,qualitycontrol,andstatisticalanalysis.Here,wereportthe

largestgenome-wideanalysisofCNVsforanypsychiatricdisordertodate,usingdatasets

assembledbytheSchizophreniaWorkingGroupofthePGC.

Dataprocessingandmeta-analyticmethods

Rawintensitydatawereobtainedfrom57,577subjectsfrom43separatedatasets

(Extendeddatatable2).AfterCNVcallingandqualitycontrol(QC),41,321subjectswere

retainedforanalysis.Inlargedatasetsderivedfrommultiplestudies,variabilityinCNV

detectionbetweenstudiesandarrayplatformspresentsasignificantchallenge.Tominimize

thetechnicalvariabilityacrossdifferentstudies,wedevelopedacentralizedpipelinefor

systematiccallingofCNVsforAffymetrixandIlluminaplatforms.(MethodsandExtendeddata

figure1).ThepipelineincludedmultipleCNVcallersruninparallel.DatafromIllumina

platformswereprocessedusingPennCNV15andiPattern16.DatafromAffymetrixplatforms

wereanalyzedusingPennCNVandBirdsuite17.Twoadditionalmethods,iPatternandC-score18,

wereappliedtodatafromtheAffymetrix6.0platform.TheCNVcallsfromeachprogramwere

convertedtoastandardizedformatandaconsensuscallsetwasconstructedbymergingCNV

outputsatthesamplelevel.OnlyCNVsegmentsthatweredetectedbyallalgorithmswere

retained.WeperformedrigorousQCattheplatformleveltoexcludesampleswithpoorprobe

intensityand/oranexcessiveCNVload(numberandlength).LargerCNVsthatappearedtobe

fragmentedweremergedandretained.CNVsspanningcentromeresorthosewith>50%

overlapwithsegmentalduplicationsorregionspronetoVDJrecombination(e.g.,

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

9

immunoglobulinorTcellreceptorloci)wereexcluded.Afinalsetofrare,highqualityCNVswas

definedasthose>20kbinlength,atleast10probes,and<1%MAF.

Geneticassociationswereinvestigatedbycase-controltestsofCNVburdenatfour

levels:(1)genome-wide(2)pathways,(3)genes,and(4)probes.AnalysescontrolledforSNP-

derivedprincipalcomponents,sex,genotypingplatform,andindividual-levelprobeintensity.

Multiple-testingthresholdsforgenome-widesignificancewereestimatedfromfamily-wise

errorratesdrawnfrompermutation

GenomewideanalysisofCNVburdenrevealsanenrichmentofultra-rarevariants

AnelevatedburdenofrareCNVshasbeenwellestablishedamongSCZcases2.We

appliedourmeta-analyticframeworktomeasuretheconsistencyofoverallCNVburdenacross

thegenotypingplatforms,andwhetherameasurableamountofCNVburdenpersistsoutsideof

previouslyimplicatedCNVregions.Consistentwithpreviousestimates,theoverallCNVburden

issignificantlygreateramongSCZcaseswhenmeasuredastotalKbcovered(OR=1.12,p=5.7e-

15),genesaffected(OR=1.21,p=6.6e-21),orCNVnumber(OR=1.03,p=1e-3).Focusingongenes

affectedbyCNV,ourstrongestsignalofenrichment,theeffectsizeisconsistentacrossall

genotypingplatforms(Figure1A).WhenwesplitbyCNVtype,theeffectsizeforcopynumber

losses(OR=1.40,p=4e-16)isgreaterthanforgains(OR=1.12,p=2e-7)(Extendeddatafigures2-

3).PartitioningbyCNVfrequency(basedon50%reciprocaloverlapwiththefullcallset,

Methods),CNVburdenisenrichedamongcasesacrossarangeoffrequencies,uptocountsof

80(MAF=0.2%)inthecombinedsample(Figure1B).

AprimaryquestioninthisstudyisthecontributionofnovellocitotheexcessCNV

burdenincases.AfterremovingninepreviouslyimplicatedCNVloci(wherereportedp-values

exceedourdesignatedmultipletestingthreshold,Extendeddatatable1),excessCNVburden

inSCZremainssignificantlyenriched(genesaffectedOR=1.11,p=1.3e-7,Figure1B).CNV

burdenalsoremainedsignificantlyenrichedafterremovalofallreportedlocifromExtended

datatable1,buttheeffect-sizewasgreatlyreduced(OR=1.08)comparedtotheenrichment

overall(OR=1.21).WhenwepartitionCNVburdenbyfrequency,wefindthatmuchofthe

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

10

previouslyunexplainedsignalisrestrictedtocomparativelyrareevents(i.e.,MAF<0.1%,Figure

1B).

Gene-set(pathway)burden

WeassessedwhetherCNVburdenwasconcentratedwithindefinedsetsofgenesinvolvedin

neurodevelopmentorneurologicalfunction.Atotalof36gene-setswereevaluated(fora

descriptionseeExtendeddatatable3),consistingofgene-setsrepresentingneuronalfunction,

synapticcomponentsandneurologicalandneurodevelopmentalphenotypesinhuman(19

sets),gene-setsbasedonbrainexpressionpatterns(7sets),andhumanorthologsofmouse

geneswhosedisruptioncausesphenotypicabnormalities,includingneurobehavioraland

nervoussystemabnormality(10sets).Somegene-setscanbeconsidered“negativecontrols”,

includinggenesnotexpressedinbrain(1set)orassociatedwithabnormalphenotypesin

mouseorgansystemsunrelatedtobrain(7sets).WemappedCNVstogenesiftheyoverlapped

byatleastoneexonicbp.

Gene-setburdenwastestedusinglogisticregressiondeviancetest6.Inadditiontousing

thesamecovariatesincludedingenome-wideburdenanalysis,wecontrolledforthetotal

numberofgenespersubjectspannedbyrareCNVstoaccountforsignalthatmerelyreflects

theglobalenrichmentofCNVburdenincases19.Multiple-testingcorrection(Benjamini-

HochbergFalseDiscoveryRate,BH-FDR)wasperformedseparatelyforeachgene-setgroupand

CNVtype(gains,losses).Aftermultipletestcorrection(Benjamini-HochbergFDR≤10%)15

gene-setswereenrichedforrarelossburdenincasesand4forraregainsincases,allofwhich

arebrain-relatedgenesets(Figure2).

Ofthe15setssignificantforlosses,themajorityconsistofsynapticorotherneuronal

components(9sets)fromgene-setgroup(a);inparticular,“GOsynaptic”(GO:0045202)and

“ARCcomplex”rankfirstbasedonstatisticalsignificanceandeffect-sizerespectively(“GO

synaptic”deviancetestp-value=2.8e-11,“ARCcomplex”regressionodds-ratio>1.8,Figure

2a).Lossesincaseswerealsosignificantlyenrichedforgenesinvolvedinnervoussystemor

behavioralphenotypesinmousebutnotforgene-setsrelatedtootherorgansystem

phenotypes(Figure2c).Toaccountfordependencybetweensynapticandneuronalgene-sets,

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

11

were-testedlossburdenfollowingastep-downlogisticregressionapproach,rankinggene-sets

basedonsignificanceoreffectsize(Extendeddatatable4).OnlyGOsynapticandARCcomplex

weresignificantinatleastoneofthetwostep-downanalyses,suggestingthatburden

enrichmentintheotherneuronalcategoriesismostlyaccountedbytheoverlapwithsynaptic

genes.Followingthesameapproach,themouseneurological/neurobehavioralphenotypeset

remainednominallysignificant,pointingtotheexistenceofadditionalsignalnotcapturedby

thesynapticset.Pathwayenrichmentwaslesspronouncedforduplications,consistentwiththe

smallerburdeneffectsforthisclassofCNV.Duplicationburdenwassignificantlyenrichedfor

NMDAreceptorcomplex,highlybrain-expressedgenes,medium/lowbrain-expressedgenes

andprenatallyexpressedbraingenes(Figure2b).

Giventhatsynapticgenesetswererobustlyenrichedfordeletionsincases,andwithan

appreciablecontributionfromlocithathavenotbeenstronglyassociatedwithSCZpreviously,

pathway-levelinteractionsofthesesetswerefurtherinvestigated.Aprotein-interaction

networkwasseededusingthesynapticandARCcomplexgenesthatwereintersectedbyrare

deletionsinthisstudy(Figure3).Agraphofthenetworkhighlightsmultiplesubnetworksof

synapticproteinsincludingpre-synapticadhesionmolecules(NRXN1,NRXN3),post-synaptic

scaffoldingproteins(DLG1,DLG2,DLGAP1,SHANK1,SHANK2),glutamatergicionotropic

receptors(GRID1,GRID2,GRIN1,GRIA4),andcomplexessuchasDystrophinanditssynaptic

interactingproteins(DMD,DTNB,SNTB1,UTRN).AsubsequenttestoftheDystrophin

glycoproteincomplex(DGC)revealedthatdeletionburdenofthesynapticDGCproteins

(intersectionof“GODGC”GO:0016010and“GOsynapse”GO:0045202)wasenrichedincases

(DeviancetestP=0.05),butdeletionburdenofthefullDGCwasnotsignificant(P=0.69).

GeneCNVburden

TodefinespecificlocithatconferriskforSCZ,wetestedCNVburdenatthelevelofindividual

genes,usinglogisticregressiondeviancetestandthesamecovariatesincludedingenome-wide

burdenanalysis.TocorrectlyaccountforlargeCNVsthataffectmultiplegenes,weaggregated

adjacentgenesintosinglelociiftheircopynumberwashighlycorrelatedacrosssubjects.CNVs

weremappedtogenesiftheyoverlappedoneormoreexons.Thecriterionforgenome-wide

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

12

significanceusedtheFamily-WiseErrorRate(FWER)<0.05.Thecriterionforsuggestive

evidenceusedaBenjamini-HochbergFalseDiscoveryRate(BH-FDR)<0.05.

OfnineteenindependentCNVlociwithgene-basedBH-FDR<0.05,twowereexcluded

basedonCNVcallingaccuracyorevidenceofabatcheffect(SupplementaryInformation).The

seventeenlocithatremainedaftertheseadditionalQCstepsarelistedinTable1.P-valuesfor

thissummarytablewereobtainedbyre-runningourstatisticalmodelacrosstheentireregion

(SupplementaryResults).Theseseventeenlocirepresentasetofnovel(n=6),previously

reported(n=4),andpreviouslyimplicated(n=7)loci.Manhattanplotsofthegeneassociation

forlossesandgainsareprovidedinFigure4.Apermutation-basedfalsediscoveryrateyielded

similarestimatestoBH-FDR.

Eightlociattaingenome-widesignificance,includingcopynumberlossesat1q21.1,

2p16.3(NRXN1),3q29,15q13.3,16p11.2(distal)and22q11.2alongwithgainsat7q11.23and

16p11.2(proximal).Anadditionalninelocimeetcriterionforsuggestiveassociation.Basedon

ourestimationofFalseDiscoveryRates(BHandpermutations),weexpecttoobservelessthan

twoassociationsmeetingsuggestivecriteriabychance.

ProbelevelCNVburden

WithourcurrentsamplesizeanduniformCNVcalling,manyindividualCNVlocicanbe

testedwithadequatepowerattheprobelevel,potentiallyfacilitatingdiscoveryatafinergrain

thanlocus-widetests.TestsforassociationwereperformedateachCNVbreakpointusingthe

residualsofcase-controlstatusaftercontrollingforanalysiscovariates,withsignificance

determinedthroughpermutation.ResultsforlossesandgainsareshowninExtendeddata

figure4.FourindependentCNVlocisurpassgenome-widesignificance,allofwhichwerealso

identifiedinthegene-basedtest,includingthe15q13.2-13.3and22q11.21deletions,16p11.2

duplication,and1q21.1deletionandduplication.Whiletheselocirepresentlessthanhalfof

thepreviouslyimplicatedSCZloci,wedofindsupportforalllociwheretheassociation

originallyreportedmeetsthecriteriaforgenome-widecorrectioninthisstudy.Weexamined

associationamongallpreviouslyreportedlocishowingassociationtoSCZ,including12CNV

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

13

lossesand20CNVgains(Extendeddatatable5),and14ofthe33lociwereassociatedwithSCZ

atp<.05.

Whenaprobe-leveltestisapplied,associationsatsomelocibecomebetterdelineated.

Forinstance,TheNRXN1geneat2p16.3isaCNVhotspot,andexonicdeletionsofthisgeneare

significantlyenrichedinSCZ9,20.Inthislargesample,weobserveahighdensityof“non-

recurrent”deletionbreakpointsincasesandcontrols.Theprobe-levelManhattanplotrevealsa

sawtoothpatternofassociation,wherepeakscorrespondtotranscriptionalstartsitesand

exonsofNRXN1(Figure5).Thisexamplehighlightshow,withhighdiversityofallelesatasingle

locus,theassociationpeakmaybecomemorerefined,andinsomecasesconvergetoward

individualfunctionalelements.Similarly,ahighdensityofduplicationbreakpointsatpreviously

reportedSCZrisklocion16p13.2(http://bit.ly/1NPgIuq)and8q11.23(http://bit.ly/1PwdYTt)

exhibitpatternsofassociationthatbetterdelineategenesintheseregions.

[theaboveURLslinktoaPGCCNVbrowserdisplayoftherespectivegenomicregions.ThebrowsercanalsobeaccesseddirectlyatthefollowingURLhttp://pgc.tcag.ca/gb2/gbrowse/pgc_hg18/]

NovelrisklociarepredominantlyNAHR-mediatedCNVs

ManyCNVlocithathavebeenstronglyimplicatedinhumandiseasearehotspotsfor

non-allelichomologousrecombination(NAHR),aprocesswhichinmostcasesismediatedby

flankingsegmentalduplications21.ConsistentwiththeimportanceofNAHRingeneratingCNV

riskallelesforschizophrenia,mostofthelociinTable1areflankedbysegmentalduplications.

Afterexcludinglocithathavebeenimplicatedinpreviousstudies,weinvestigatedwhether

NAHRmutationalmechanismswerealsoenrichedamongnovelassociatedCNVs.Wedefineda

CNVas“NAHR”whenboththestartandendbreakpointislocatedwithinasegmental

duplication.AcrossalllociwithFDR<0.05inthegene-baseburdentest,NAHR-mediatedCNVs

weresignificantlyenriched,6.03-fold(P=0.008;Extendeddatafigure5),whencomparedtoa

nulldistributiondeterminedbyrandomizingthegenomicpositionsofassociatedgenes

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

14

(SupplementalMaterial).TheseresultssuggestthatnovelSCZCNVstendtooccurinregions

pronetohighratesofrecurrentmutation.

Discussion

ThepresentstudyofthePGCSCZCNVdatasetincludesthemajorityofallmicroarray

datathathasbeengeneratedingeneticstudiesofSCZtodate.Inthis,thebestbodyof

evidencetodatewithwhichtoevaluateCNVassociations,wefinddefinitiveevidenceforeight

lociandwefindsignificantevidenceforacontributionfromnovelCNVsconferringbothrisk

andprotection.Thecompleteresults,includingCNVcallsandstatisticalevidenceatthegeneor

probelevel,canbeviewedusingthePGCCNVbrowser(URLs).Ourdatasuggestthatthenovel

risklocithatcanbedetectedwithcurrentgenotypingplatformslieattheultra-rareendofthe

frequencyspectrumandstilllargersampleswillbeneededtoidentifythematconvincinglevels

ofstatisticalevidence.

Collectively,theeightSCZrisklocithatsurpassgenome-widesignificancearecarriedby

asmallfraction(1.4%)ofSCZcasesinthePGCsample.Weestimate0.85%ofthevariancein

SCZliabilityisexplainedbycarryingaCNVriskallelewithintheseloci(SupplementaryResults).

Asacomparison,3.4%ofthevarianceinSCZliabilityisexplainedbythe108genome-wide

significantlociidentifiedinthecompanionPGCGWASanalysis.Combined,theCNVandSNP

locithathavebeenidentifiedtodateexplainasmallproportion(<5%)ofheritability.

Thelargedatasethereprovidesanopportunitytoevaluatethestrengthofevidencefor

avarietyoflociwhereanassociationwithschizophreniahasbeenreportedpreviously.Of33

publishedfindingsfromtherecentliterature,wefindevidencefor14loci(P<0.05,Extended

datatable5);thus,nearlyhalfoftheexistingcandidatelociaresupportedbyourdata.

Howeverwealsofindalackofevidenceformany.Alackofstrongevidenceinthisdataset

(whichincludessamplesthatoverlapwithmanyofthepreviousstudies)mayinsomecases

simplyreflectthatstatisticalpowerislimitedforveryrarevariants,eveninlargesamples.

However,itislikelythatsomeoftheseoriginalfindingsrepresentspuriousassociations.

Indeed,thelocithatarenotsupportedbyourdataconsistlargelyoflociforwhichtheoriginal

statisticalevidencewasmodest(Extendeddatatable5).Thus,ourresultshelptorefinethelist

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

15

ofpromisingcandidateCNVs.Continuedeffortstoevaluatethegrowingnumberofcandidate

variantshasconsiderablevaluefordirectingfutureresearcheffortsfocusedonspecificloci.

Novelcandidatelocimeetingsuggestivecriteriainthisstudyhighlightstrongcandidate

locithathavenotbeenpreviouslyimplicatedinSCZ.TwosuchassociationsarelocatedontheX

chromosomeinaregionofXq28thatishighlypronetorecurrentrearrangements22-24

(Extendeddatafigure6).GainsatthedistalXq28locusareenrichedincasesinthisstudy;

similarduplicationshavebeenreportedinassociationwithintellectualdisability,while

reciprocaldeletionsofthisregionareassociatedwithembryoniclethalityinmales25.

DuplicationsattheproximalXq28locus,includingasinglegeneMAGEA11,areenrichedin

controlsinthisstudy,andtoourknowledgehavenotbeendocumentedinotherdisorders.

Weobservedmultiple“protective”CNVsthatshowedasuggestiveenrichmentin

controls,includingduplicationsof22q11.2,MAGEA11,andZMYM5alongwithdeletionsand

duplicationsofZNF92.Noprotectiveeffectsweresignificantaftergenome-widecorrection.

Moreover,arareCNVthatconfersreducedriskforSCZmaynotconferageneralprotection

fromneurodevelopmentaldisorders.Forexample,microduplicationsof22q11.2appearto

conferprotectionfromSCZ26;however,suchduplicationshavebeenshowntoincreaseriskfor

developmentaldelayandavarietyofcongenitalanomaliesinpediatricclinicalpopulations27.It

isprobablethatsomeoftheundiscoveredrareallelesinSCZarevariantsthatconferprotection

butlargersamplesizesareneededtodeterminethisunequivocally.Iftrue,ourestimatesofthe

excessCNVburdenincasesmaynotfullyaccountforthevariationSCZliabilitythatisexplained

byrareCNVs.

OurresultsprovidestrongevidencethatdeletionsinSCZareenrichedwithinahighly

connectednetworkofsynapticproteins,consistentwithpreviousstudies2,6,10,28.ThelargeCNV

datasethereallowsamoredetailedviewofthesynapticnetworkandhighlightssubsetsof

genesaccountfortheexcessdeletionburdeninSCZ,includingsynapticcelladhesionand

scaffoldingproteins,glutamatergicionotropicreceptorsandproteincomplexessuchastheARC

complexandDGC.ModestCNVevidenceimplicatingDystrophin(DMD)anditsbindingpartners

isintriguinggiventhattheinvolvementofcertaincomponentsoftheDGChavebeen

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

16

postulated29,30anddisputed31previously.LargerstudiesofCNVareneededtodefinearole

forthisandothersynapticsub-networksinSCZ.

Thisstudyrepresentsamilestone.Large-scalecollaborationsinpsychiatricgenetics

havegreatlyadvanceddiscoverythroughgenome-wideassociationstudies.Herewehave

extendedthisframeworktorareCNVs.Ourknowledgeofthecontributionfromlower

frequencyvariantsgivesusconfidencethattheapplicationofthisframeworktolargenewly

acquireddatasetshasthepotentialtofurtherthediscoveryoflociandidentificationofthe

relevantgenesandfunctionalelements.ThePGCCNVresourceisnowpubliclyavailable

throughacustombrowserathttp://pgc.tcag.ca/gb2/gbrowse/pgc_hg18/.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

17

AuthorContributionsManagementofthestudy,coreanalysesandcontentofthemanuscriptwastheresponsibilityoftheCNVAnalysisGroupchairedbyJ.S.andjointlysupervisedbyS.W.S.andB.M.N.togetherwiththeSchizophreniaWorkingGroupchairedbyM.C.O’D.CoreanalyseswerecarriedoutbyD.P.H.,D.M.,andC.R.M.DataProcessingpipelinewasimplementedbyC.R.M.,B.T.,W.W.,D.G.,M.G.,A.S.andW.B.TheAcustomPGCCNVbrowserwasdevelopedbyC.R.M,D.P.H.,andB.T.AdditionalanalysesandinterpretationswerecontributedbyW.W.,D.A.andP.A.H.TheindividualstudiesorconsortiacontributingtotheCNVmeta-analysiswereledbyR.A.,O.A.A.,D.H.R.B.,A.D.B.,E.Bramon,J.D.B.,A.C.,D.A.C.,S.C.,A.D.,E.Domenici,H.E.,T.E.,P.V.G.,M.G.,H.G.,C.M.H.,N.I.,A.V.J.,E.G.J.,K.S.K.,G.K.,J.Knight,T.Lencz,D.F.L.,Q.S.L.,J.Liu,A.K.M.,S.A.M.,A.McQuillin,J.L.M.,P.B.M.,B.J.M.,M.M.N.,M.C.O’D.,R.A.O.,M.J.O.,A.Palotie,C.N.P.,T.L.P.,M.R.,B.P.R.,D.R.,P.C.S,P.Sklar.D.St.C.,P.F.S.,D.R.W.,J.R.W.,J.T.R.W.andT.W.Theremainingauthorscontributedtotherecruitment,genotyping,ordataprocessingforthecontributingcomponentsofthemeta-analysis.J.S.,B.M.N,C.R.M,D.P.H.,andD.M.draftedthemanuscript,whichwasshapedbythemanagementgroup.Allotherauthorssaw,hadtheopportunitytocommenton,andapprovedthefinaldraft.CompetingFinancialInterestSeveraloftheauthorsareemployeesofthefollowingpharmaceuticalcompanies:F.Hoffman-LaRoche(E.D.,L.E.),EliLilly(D.A.C.,Y.M.,L.N.)andJanssen(A.S.,Q.S.L).Noneofthesecompaniesinfluencedthedesignofthestudy,theinterpretationofthedata,theamountofdatareported,orfinanciallyprofitbypublicationoftheresults,whicharepre-competitive.Theotherauthorsdeclarenocompetinginterests.AcknowledgementsCorefundingforthePsychiatricGenomicsConsortiumisfromtheUSNationalInstituteofMentalHealth(U01MH094421).WethankT.LehnerandAnjeneAddington(NIMH).Theworkofthecontributinggroupswassupportedbynumerousgrantsfromgovernmentalandcharitablebodiesaswellasphilanthropicdonation.DetailsareprovidedintheSupplementaryNotes.MembershipoftheWellcomeTrustCaseControlConsortiumandPsychosisEndophenotypeInternationalConsortiumareprovidedintheSupplementaryNotes.URLsPGCCNVbrowser,http://pgc.tcag.ca/gb2/gbrowse/pgc_hg18.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

18

References

1. Malhotra,D.&Sebat,J.CNVs:harbingersofararevariantrevolutioninpsychiatricgenetics.Cell148,1223-41(2012).

2. Walsh,T.etal.Rarestructuralvariantsdisruptmultiplegenesinneurodevelopmentalpathwaysinschizophrenia.Science320,539-43(2008).

3. TheInternationalSchizophrenia,C.Rarechromosomaldeletionsandduplicationsincreaseriskofschizophrenia.Nature.455,237-241(2008).

4. Malhotra,D.etal.HighfrequenciesofdenovoCNVsinbipolardisorderandschizophrenia.Neuron72,951-63(2011).

5. Xu,B.etal.Strongassociationofdenovocopynumbermutationswithsporadicschizophrenia.NatGenet40,880-5(2008).

6. Kirov,G.etal.DenovoCNVanalysisimplicatesspecificabnormalitiesofpostsynapticsignallingcomplexesinthepathogenesisofschizophrenia.Molecularpsychiatry17,142-53(2012).

7. McCarthy,S.E.etal.Microduplicationsof16p11.2areassociatedwithschizophrenia.NatGenet41,1223-7(2009).

8. Mulle,J.G.etal.Microdeletionsof3q29conferhighriskforschizophrenia.AmJHumGenet87,229-36(2010).

9. Rujescu,D.etal.Disruptionoftheneurexin1geneisassociatedwithschizophrenia.HumMolGenet(2008).

10. Pocklington,A.J.etal.NovelFindingsfromCNVsImplicateInhibitoryandExcitatorySignalingComplexesinSchizophrenia.Neuron86,1203-14(2015).

11. Horev,G.etal.Dosage-dependentphenotypesinmodelsof16p11.2lesionsfoundinautism.ProcNatlAcadSciUSA108,17076-81(2011).

12. Golzio,C.etal.KCTD13isamajordriverofmirroredneuroanatomicalphenotypesofthe16p11.2copynumbervariant.Nature485,363-7(2012).

13. Holmes,A.J.etal.Individualdifferencesinamygdala-medialprefrontalanatomylinknegativeaffect,impairedsocialfunctioning,andpolygenicdepressionrisk.JNeurosci32,18087-100(2012).

14. SchizophreniaWorkingGroupofthePsychiatricGenomics,C.Biologicalinsightsfrom108schizophrenia-associatedgeneticloci.Nature511,421-7(2014).

15. Wang,K.etal.PennCNV:anintegratedhiddenMarkovmodeldesignedforhigh-resolutioncopynumbervariationdetectioninwhole-genomeSNPgenotypingdata.GenomeRes17,1665-74(2007).

16. Pinto,D.etal.Functionalimpactofglobalrarecopynumbervariationinautismspectrumdisorders.Nature466,368-72(2010).

17. Korn,J.M.etal.IntegratedgenotypecallingandassociationanalysisofSNPs,commoncopynumberpolymorphismsandrareCNVs.Nat.Genet.40,1253-1260(2008).

18. Vacic,V.etal.DuplicationsoftheneuropeptidereceptorgeneVIPR2confersignificantriskforschizophrenia.Nature471,499-503(2011).

19. Raychaudhuri,S.etal.Accuratelyassessingtheriskofschizophreniaconferredbyrarecopy-numbervariationaffectinggeneswithbrainfunction.PLoSGenet6(2010).

20. Kirov,G.etal.ComparativegenomehybridizationsuggestsaroleforNRXN1andAPBA2inschizophrenia.HumMolGenet17,458-65(2008).

21. Lupski,J.R.Genomicdisorders:structuralfeaturesofthegenomecanleadtoDNArearrangementsandhumandiseasetraits.TrendsGenet14,417-22(1998).

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

19

22. Calhoun,A.R.&Raymond,G.V.DistalXq28microdeletions:clarificationofthespectrumofcontiguousgenedeletionsinvolvingABCD1,BCAP31,andSLC6A8withanewcaseandreviewoftheliterature.AmJMedGenetA164A,2613-7(2014).

23. El-Hattab,A.W.etal.Clinicalcharacterizationofint22h1/int22h2-mediatedXq28duplication/deletion:newcasesandliteraturereview.BMCMedGenet16,12(2015).

24. Ravn,K.etal.LargegenomicrearrangementsinMECP2.HumMutat25,324(2005).25. El-Hattab,A.W.etal.Int22h-1/int22h-2-mediatedXq28rearrangements:intellectualdisability

associatedwithduplicationsandinuteromalelethalitywithdeletions.JMedGenet48,840-50(2011).

26. Rees,E.etal.Evidencethatduplicationsof22q11.2protectagainstschizophrenia.MolPsychiatry19,37-40(2014).

27. VanCampenhout,S.etal.Microduplication22q11.2:adescriptionoftheclinical,developmentalandbehavioralcharacteristicsduringchildhood.GenetCouns23,135-48(2012).

28. Fromer,M.etal.Denovomutationsinschizophreniaimplicatesynapticnetworks.Nature506,179-84(2014).

29. Zatz,M.etal.CosegregationofschizophreniawithBeckermusculardystrophy:susceptibilitylocusforschizophreniaatXp21oraneffectofthedystrophingeneinthebrain?JMedGenet30,131-4(1993).

30. Straub,R.E.etal.Geneticvariationinthe6p22.3geneDTNBP1,thehumanorthologofthemousedysbindingene,isassociatedwithschizophrenia.AmJHumGenet71,337-48(2002).

31. Mutsuddi,M.etal.Analysisofhigh-resolutionHapMapofDTNBP1(Dysbindin)suggestsnoconsistencybetweenreportedcommonvariantassociationsandschizophrenia.AmJHumGenet79,903-9(2006).

32. Zuberi,K.etal.GeneMANIApredictionserver2013update.NucleicAcidsRes41,W115-22(2013).

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

20

CHR BP1 BP2 Locus(GENE)PutativeCNVMechanism CNVtest Direction FWER BH-FDR Cases Controls

Regionalp-value OddsRatio[95%CI]

22 17,400,000 19,750,000 22q11.21 NAHR loss risk yes 3.54E-15 64 1 5.70E-18 67.7[9.3-492.8]

16 29,560,000 30,110,000 16p11.2(proximal) NAHR gain risk yes 5.82E-10 70 7 2.52E-12 9.4[4.2-20.9]

2 50,000,992 51,113,178 2p16.3(NRXN1) NHEJ loss risk yes 3.52E-07 35 3 4.92E-09 14.4[4.2-46.9]

15 28,920,000 30,270,000 15q13.3 NAHR loss risk yes 2.22E-05 28 2 2.13E-07 15.6[3.7-66.5]

1 144,646,000 146,176,000 1q21.1 NAHR loss+gain risk yes 0.00011 60 14 1.50E-06 3.8[2.1-6.9]

3 197,230,000 198,840,000 3q29 NAHR loss risk yes 0.00024 16 0 1.86E-06 NA[0-Inf]

16 28,730,000 28,960,000 16p11.2(distal) NAHR loss risk yes 0.0029 11 1 5.52E-05 20.6[2.6-162.2]

7 72,380,000 73,780,000 7q11.23 NAHR gain risk yes 0.0048 16 1 1.68E-04 16.1[3.1-125.7]

X 153,800,000 154,225,000 Xq28(distal) NAHR gain risk no 0.049 18 2 3.61E-04 8.9[2.0-39.9]

22 17,400,000 19,750,000 22q11.21 NAHR gain protective no 0.024 3 16 4.54E-04 0.15[0.04-0.52]

7 64,476,203 64,503,433 7q11.21(ZNF92) NAHR loss+gain protective no 0.033 131 180 6.71E-04 0.66[0.52-0.84]

13 19,309,593 19,335,773 13q12.11(ZMYM5) NHAR gain protective no 0.024 15 38 7.91E-04 0.36[0.19-0.67]

X 148,575,477 148,580,720 Xq28(MAGEA11) NAHR gain protective no 0.044 12 36 1.06E-03 0.35[0.18-0.68]

15 20,350,000 20,640,000 15q11.2 NAHR loss risk no 0.044 98 50 1.34E-03 1.8[1.2-2.6]

9 831,690 959,090 9p24.3(DMRT1) NHEJ loss+gain risk no 0.049 13 1 1.35E-03 12.4[1.6-98.1]

8 100,094,670 100,958,984 8q22.2(VPS13B) NHEJ loss risk no 0.048 7 1 1.74E-03 14.5[1.7-122.2]

7 158,145,959 158,664,998

7p36.3

(VIPR2/WDR60) NAHR loss+gain risk no 0.046 20 6 5.79E-03 3.5[1.3-9.0]

Table1:SignificantCNVlocifromgene-basedassociationtest

AllseventeenlocilistedcontainatleastonegenewithBenjamini-Hochbergfalsediscoveryrate(BH-FDR)<0.05inthegene-based

test,witheightlocicontainingatleastonegenesurpassingthefamily-wiseerrorrate(FWER)<0.05.Genomicpositionslistedare

usinghg18coordinates.ForputativeCNVmechanisms,non-allelichomologousrecombination(NAHR)andnon-homologousend

joining(NHEJ)arelistedasthelikelygenomicfeaturedrivingCNVformationateachlocus.Regionalp-valuesandoddsratioslisted

arefromaregionaltestateachlocus,wherewecombineCNVoverlappingtheimplicatedregionandrunthesametestasusedfor

eachgene(logisticregressionwithcovariatesanddeviancetestp-value).

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

21

Odds Ratio (95% CI)

0.5 1.0 1.5 2.0 2.5 3.0

PGC

omni

A6.0

I550

A5.0

I600

I300

A500

21094

9795

7027

1572

1130

818

410

314

20227

8422

7980

1705

1001

623

285

201

2.2

2.1

2.4

1.9

1.7

1.3

2

1.7

6.6e−21

5.8e−11

1.6e−05

0.004

3.6e−06

0.009

0.461

0.039

1.21

1.27

1.14

1.31

1.47

1.47

1.16

1.75

Platform Cases Controls Genes pval OR

AffymetrixIllumina

A

singleton 2−5 6−10 11−20 21−40 41−80 81+

CNV count

Odd

s Ra

tio (9

5% C

I)

1.0

1.5

2.0

all CNVpreviously implicated CNV removed

B

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

22

Figure1.CNVBurden

(A)ForestplotofCNVburden(measuredhereasgenesaffectedbyCNV),partitionedby

genotypingplatform,withthefullPGCsampleatthebottom.CNVburdeniscalculatedby

combiningCNVgainsandlosses.Caseandcontrolcountsarelisted,and“genes”istherateof

genesaffectedbyCNVincontrols.BurdentestsusealogisticregressionmodelpredictingSCZ

case/controlstatusbyCNVburdenalongwithcovariates(seemethods).Theoddsratioisthe

exponentialofthelogisticregressioncoefficient,andoddsratiosaboveonepredictincreased

SCZrisk.(B)CNVburdenpartitionedbyCNVfrequency.Forreference,aCNVwithMAF0.1%in

thePGCsamplewouldhave~41CNVs.Usingthesamemodelasabove,eachCNVwasplaced

intoasingleCNVfrequencycategorybasedona50%reciprocaloverlapwithotherCNVs.CNV

burdenwithinclusionofallCNVsareshowningreen,whereasCNVburdenexcludingpreviously

implicatedCNVlociareshowninblue.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

23

Figure2:Gene-setBurden

Gene-setburdentestresultsforrarelosses(a,c)andgains(b,d);framesa-bdisplaygene-sets

forneuronalfunction,synapticcomponents,neurologicalandneurodevelopmentalphenotypes

inhuman;framesc-ddisplaygene-setsforhumanhomologsofmousegenesimplicatedin

abnormalphenotypes(organizedbyorgansystems);botharesortedby–log10ofthelogistic

regressiondeviancetestp-valuemultipliedbythebetacoefficientsign,obtainedforrarelosses

whenincludingknownloci.Gene-setspassingthe10%BH-FDRthresholdaremarkedwith“*”.

!"#$$%& !'()*%&

Synaptic GO (622)

Neurof. Str. (1424)

Neuron Proj. GO (1230)

ARC Kirov (28)

FMR1 Targ. Darnell (840)

Neurof. Incl. (2874)

Nerv. Transm. GO (716)

Nerv. Sys. Dev. GO (1874)

FMR1 Targ. Ascano (927)

PSD Bayes full (1407)

Axon Guid. Pathw. (388)

Synapt. Pathw. KEGG (407)

Mind Phen. ADX (153)

Nerv. Sys. Phen. Any (1590)

NMDAR Kirov (62)

CNS Dev. GO (774)

Nerv. Sys. Phen. ADX (651)

Neuron Body GO (309)

Mind Phen. Any (439)

GA

IN: N

eurof+Synaptic: S

ignificance: U-Log (Dev P-value) * sign (Coeff)

-2 0 2 4 6 8 10

*

Synaptic GO (622)

Neurof. Str. (1424)

Neuron Proj. GO (1230)

ARC Kirov (28)

FMR1 Targ. Darnell (840)

Neurof. Incl. (2874)

Nerv. Transm. GO (716)

Nerv. Sys. Dev. GO (1874)

FMR1 Targ. Ascano (927)

PSD Bayes full (1407)

Axon Guid. Pathw. (388)

Synapt. Pathw. KEGG (407)

Mind Phen. ADX (153)

Nerv. Sys. Phen. Any (1590)

NMDAR Kirov (62)

CNS Dev. GO (774)

Nerv. Sys. Phen. ADX (651)

Neuron Body GO (309)

Mind Phen. Any (439)

LOSS: N

eurof+Synaptic: Significance: U

-Log (Dev P-value) * sign (Coeff)

-2 0 2 4 6 8 10

*

*

*********

**

Neurol. Behav. (2123)

Neuro Union (3202)

Nerv. Sys. (2375)

Endocr. Exocr. Repr. (2026)

Integ. Adip. Pigm. (1624)

Hemat. Immune (2605)

Digest. Hepat. (1493)

Cardiov. Muscle (2059)

Sensory (1293)

Skel. Cran. Limbs (1588)

LOSS: M

ouse Phenotypes: Significance: U

-Log (Dev P-value) * sign (Coeff)

-2 0 2 4 6 8 10***

*

Neurol. Behav. (2123)

Neuro Union (3202)

Nerv. Sys. (2375)

Endocr. Exocr. Repr. (2026)

Integ. Adip. Pigm. (1624)

Hemat. Immune (2605)

Digest. Hepat. (1493)

Cardiov. Muscle (2059)

Sensory (1293)

Skel. Cran. Limbs (1588)

GA

IN: M

ouse Phenotypes: S

ignificance: U

-Log (Dev P-value) * sign (Coeff)

-2 0 2 4 6 8 10

$+,-./&01,&2345607.8& $+,-./&01,&2345607.8&

9+:;&+<30+=6:./&01=+&

9+:;17:&+<30+=6:./&01=+&

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

24

Gene-setsrepresentingbrainexpressionpatternswereomittedfromthefigurebecauseonlya

fewweresignificant(losses:1,gains:3).

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

25

Figure3:ProteinInteractionNetworkforSynapticGenes

SynapticandARC-complexgenesintersectedbyararelossinatleast4caseorcontrolsubjects

andwithgenicburdenBenjamini-HochbergFDR<=25%(reddiscs)wereusedtoquery

GeneMANIA32andretrieveadditionalproteininteractionneighbors,resultinginanetworkof

136synapticgenes.Genesaredepictedasdisks;diskcentersarecoloredbasedonrareloss

frequency(Freq.SZandFreq.CT)beingprevalentincasesorcontrols;diskbordersarecolored

tomark(i)geneimplicationinhumandominantorX-linkedneurologicalor

neurodevelopmentalphenotype,(ii)denovomutation(DeN)reportedbyFromeretal.28,split

betweenLOF(frameshift,stop-gain,coresplicesite)andmissenseoraminoacidinsertion/

deletion,(iii)implicationinmouseneurobehavioralabnormality.Pre-synapticadhesion

molecules(NRXN1,NRXN3),post-synapticscaffolds(DLG1,DLG2,DLGAP1,SHANK1,SHANK2)

SYT2

SYT7

SYT4

APBA1

EPS8

SYT1SYT6

RPH3A

BAIAP2

CASK

NRXN2

NRXN3

SYT5SYT9

WNT7B

NRXN1

F2R

WNT5A

FMR1BSN

CYFIP1GRIA4

HOMER2

LIN7C

NLGN2NLGN3 GRIK1

SIPA1L1 LIN7BNLGN1 AXIN1

SPTBN2

ABI1

DVL1

MINK1

CYFIP2

LRP6

ARRB2

LRRK2

LPHN1TANC1

SHANK1 GRIK2

LIN7A

GRIA1

DTNACADPS2

SNTB1CADPS

SYNC

DES

PTK2

PLXND1

ITSN1

DMD

ANK2

DAG1

SEMA3E

L1CAM

PFN2

DLGAP3

MPDZ

HTR2B

NOS1SHANK3

SNTA1

UTRN

SNTB2

NRG1

DLGAP2

DTNB

GRID2

GRIN2B

DLGAP1

DLG4

CACNG2GOPC

SHANK2

GRIN1GRIN2D

DLG1DLG3

GRIN2C

CNKSR2

GIPC1

ATP2B2

LRP8

DLG2

GRID1

CRIPT

NETO1

SEMA4C

SEPT2

SEPT11

SEPT7

VAMP7

SEPT5

SEPT6

SNAP29

UNC13A FBXO45

SNAP23

WASF1

AP3D1

SNCB

STX1A

SNPH

CPLX2

APBB1MYO5A

CAMK2AGRIN2A

CTNNB1MYO6

CTNND1

NEDD4

MAGI2

MDM2

APP

CHRNA7

CDH1

CTNND2

P2RX6

ARCCHRFAM7A

ERBB2

DLGAP4

FYN

PTEN

NFASC

CNTN2

RAPGEF2

Gene$associa*on$test$deviance$p/value$(loss)$

BH/FDR$<=$25%,$Freq.SZ$>$Freq.CT$

BH/FDR$>$25%,$Freq.SZ$>=$1.5$*$Freq.CT$$

Other$

Freq.CT$>=$1.5$*$Freq.SZ$$

Dominant$/$X/linked,$or$DeN$LOF$

DeN$Missense$or$aa$in/del$

No$evidence$

Mouse$Neurophenotype$

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

26

andglutamatergicionotropicreceptors(GRID1,GRID2,GRIN1,GRIA4)constituteahighly

connectedsubnetworkwithmorelossesincasesthancontrols.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

27

Figure4:GeneBasedManhattanPlot.

Manhattanplotdisplayingthe–log10deviancep-valuefor(A)CNVlossesand(B)CNVgainsthe

gene-basedtest.P-valuecutoffscorrespondingtoFWER<0.05andBH-FDR<0.05are

highlightedinredandblue,respectively.Locisignificantaftermultipletestcorrectionare

labeled.

05

1015

Chromosome

−lo

g 10(p)

● ●

●●●●●●●●●●●●●●●●

●●●●●●●●●

●●●●●●●●●●●●●●●●●

●●●●

●●

●●●●

●●●●●● ●

●●●●●

●●●

●●

● ●●●

●●

●●●●●●●●●●●●●●●●●●

●●●

●●

●●

●●●●●

●●

●●●●●●●●●● ●●●

●●

●●●

●●

●●●

●●

●●●●●●●●●●

● ●

●●

●●●●

●●

●●●●●

●●●●●●●●●●●●●

●●●

●●

●●●●●●●

●●●

●●●●● ●●●●

●●●●

●●●●●●●●●●●●●●●●●

●●

●●●●●●

●●●●●●

●●

●●●●●●●●●●●●●

●●●●●●●●●

●●

●●●

●●●●●●●●

●●●●●●●●●●●●●●

●●●●● ●●●

●●●●

●●●●●

●●●●●●●●●●

●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●

●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●

●●●●

●●●●●

●●●●●

●●

●●●●

●●●●●●●●●●●●●●●

●●●

●●

●●●

●●●

●●●●●●●●

●●●●●●●●●●●●●●●

●●●● ●●●

●●●●●●●●●●●●●●●●

●●●●●●

●●●●

●●●●●●●●●

●●●●

●●●●●

● ●

●●

●●●

●●

1 3 5 7 9 11 13 15 17 19 21 X2 4 6 8 10 12 14 16 18 20 22

CNV losses FWER cutoff = 1.33e−4CNV losses FDR cutoff = 0.0025

1q21.1

2p16.3

3q29

8q22.210q11.12

15q13.3

15q11.2

16p11.2 (distal)

22q11.2

A0

510

15

Chromosome

−lo

g 10(p)

●●●●●●●

●●●●●●●●●●●●●

●●●●●●●●

●●●●●

●●

●●●●●●●●●●●●●●●●●●●●●●

●●●

●●●●●●

●●

●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●

●●●●●●●●

●●●● ●●●●●●●●●●●●●●

●●●●●●●

●●

●●●●●●

●●

●●●

●●

●●●●

●●●●●●●●●

●●●●●

●●

●●●●

●●●●

●●●

●●●●●●●●●●

●●●●●●

●●●●●●

●●●●●●●

●●●●●●●●●●●

●●● ●

●●●●●●●●●●●●●●

●●●●●●

●●●●●●●●●

●●

●●●●●●●●●●

●●●

●●

●●

●●

●●

●●●●●●●

●●●●

●●

●●●●●●●●●●●●●●●

●●●●

●●●●●

●●●●●●●●●●●●●●●●●●●●

●●●

●●●●●●●●●●●●●●●

●●●●

●●●

●●●●●●●●●●●●●●

●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●

●●●●

●●●●●

●●●●●●●●

●●

●●●●●●●●●●●● ●●

●●●

●●●●●

●●●●●

●●

●●●●●●●

●●●●

●●●●●

●●●●● ●●●●●●●●●

●●●

●●●●●

●●

●●

●●●

●●●●

●●

●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●

●●●●●●●●

●●●●●●●●●●●

●●●●●●●●

●●

●●●●

●●●●

●●●●

●●●

●●

●●

●●●●●●●●

●●●

●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●

●●●●●●●●

●●

●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●

●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●

●●●●●●●●●

●●●●

●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●

●●

●●●●●●●●●●●●●●●

●●

●●●●●

●●●

●●●●●●●●●●●●●●●

●●

●●

●●●●●●●

●●

1 3 5 7 9 11 13 15 17 19 21 X2 4 6 8 10 12 14 16 18 20 22

CNV gains FWER cutoff = 4.33e−5CNV gains FDR cutoff = 0.001

1q21.17q11.23

13q12.11

16p11.2

22q11.2Xq28 (2 sites)

B

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

28

Figure5:Manhattanplotofprobe-levelassociationsacrosstheNeurexin-1locus

Empiricalp-valuesateachdeletionbreakpointrevealasaw-toothpatternofassociation.Predominantpeakscorrespondtoexons

andtranscriptionalstartsitesofNRXN1isoforms.

DEL resid-Z pval

UCSC Genes (RefSeq, GenBank, tRNAs & Comparative Genomics)

Deletions, cases (PLINK CNV track)

Deletions, controls (PLINK CNV track)

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

29

Locus CNVtype Geneorregionname

InitialSCZassociationreference(seelegend)

TestedinReesetal.2014

Reportedp-value

SCZCNVcarrier%

ControlCNVcarrier%

ReportedOddsRatio

22q11.2 deletion multigenic 1 yes 4.40E-40 0.29 0 Inf

16p11.2 duplication proximalduplication 2 yes 2.90E-24 0.35 0.03 11.52

1q21.1 deletion multigenic 3,4 yes 4.10E-13 0.17 0.021 8.35

2p16.3 deletion NRXN1exons 5,6 yes 1.30E-11 0.18 0.02 9.01

15q11.2 deletion multigenic 3 yes 2.50E-10 0.59 0.28 2.15

3q29 deletion multigenic 7,11 yes 1.50E-09 0.082 0.0014 57.65

15q13.2-13.3 deletion multigenic 3,4 yes 5.60E-06 0.14 0.019 7.52

15q11.2-13.1 duplication AS/PWS 8 yes 5.60E-06 0.083 0.0063 13.2

8q11.23 duplication RB1CC1 9 no 1.29E-05 0.106 0.014 8.58

16p13.11 duplication multigenic 8 yes 5.70E-05 0.31 0.13 2.3

7q11.23 duplication Williams-Beuren 10 yes 6.90E-05 0.066 0.0058 11.35

1q21.1 duplication multigenic 11 yes 9.90E-05 0.13 0.037 3.45

16p13.2 duplication C16orf72/USP7 11 no 1.00E-04 0.254 0.0197 12.9

1p36.33 duplication multigenic 12 no 5.00E-04 0.065 0.0075 8.66

22q11.2 duplication multigenic 13 no 8.60E-04 0.014 0.085 0.17

17p12 deletion HNPP 14 yes 1.20E-03 0.094 0.026 3.62

9q34.3 duplication intergenic 15 no 1.40E-03 1.47 0.43 3.38

16p12.1 deletion multigenic 12 no 1.60E-03 0.15 0.057 2.72

15q21.3 duplication CGNL1 12 no 1.90E-03 0.32 0.19 1.71

11q25 deletion GLB1L3/GLB1L2 11 no 3.00E-03 0.38 0.123 3

2q37.3 duplication AQP12A/KIF1A 12 no 3.00E-03 0.34 0.24 1.43

17q12 deletion RCAD 16 yes 0.0072 0.036 0.0054 6.64

9p24.2 deletion GLIS3 12 no 8.40E-03 0.033 0 Inf

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

30

9p24.2 deletion SLC1A1 12 no 9.80E-03 0.047 0.0075 6.19

16p11.2 deletion distaldeletion 17 yes 0.017 0.063 0.018 3.39

7q36.3 duplication WDR60/VIPR2 11,18 yes 0.27 0.11 0.069 1.54

ExtendedDataTable1:PreviouslyreportedCNVassociation

Weassembledalistof26reportedCNVassociationstoSCZ,whereanoddsratioandp-valuewereavailable.AteachCNVlocus,welisttheoddsratiosandp-valuesfromthelargestsamplecollectionavailableintheliterature.ResultsfromallCNVlocimeta-

analyzedinReesetal.(2014),whenavailable,wereused.Throughoutthisarticlewerefertothisentirelistas“previouslyreported”

loci.Reportedp-valuesforninelocishowninboldsurpassthemultipletestingthresholddrawnfromthecurrentdatasetusinga

Cochran-MantelHaenszelteststratifiedbygenotypingplatform.Throughoutthisarticlewerefertotheseninelocias“previously

implicated”.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

31

ExtendedDataTables2-4areseparate.xlsxsheetsavailableuponrequest.

EDTable2datasets.xlsx–datasetsandsamplesizesusedinthecurrentstudy

EDTable3NeuroGeneSsets.xlsx–Genesetsinvestigatedinthecurrentstudy

EDTable4stepdown_withLegend.xlsx–Uniquecontributionofsignificantgenesetsinstep-downregressionmodels

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

32

Locus CNVtype Geneorregion Reference(seeEDtable1)

Z-scorep-value

CMHp-value CNVcarriers SCZCase

countControlcount

CMHtestOddsRatio

N=41321 N=21094 N=20227

22q11.21 Deletion Multigenic 1 6.20E-13 4.40E-13 58 58 0 NA 16p11.2 Duplication Proximalduplication 2 2.60E-10 9.40E-12 67 63 4 13.8 15q13.2-13.3 Deletion Multigenic 3,4 6.10E-06 2.50E-06 33 30 3 10.55 3q29 Deletion Multigenic 7 6.20E-05 1.60E-04 16 16 0 NA 2p16.3 Deletion NRXN1 5,6 9.40E-05 3.00E-04 27 23 4 5.87 16p11.2 Deletion Distaldeletion 17 1.00E-04 5.20E-03 12 11 1 12.68 22q11.21 Duplication Multigenic 12 1.60E-04 6.30E-04 27 4 23 0.18 1q21.1 Deletion Multigenic 3,4 2.90E-04 2.70E-05 39 33 6 5.42 16p13.2 Duplication C16orf72/USP7 11 3.80E-04 1.10E-04 27 24 3 9.02 7q11.23 Duplication Williams-Beuren 10 4.90E-04 1.70E-03 13 13 0 NA 15q11.2-13.1 Duplication AS/PWS 8 6.60E-04 4.40E-04 15 15 0 NA 8q11.23 Duplication FAM150/RB1CC1 9 9.20E-04 5.10E-04 14 14 0 NA 15q11.2 Deletion Multigenic 3 1.70E-03 1.30E-03 142 95 47 1.8 1q21.1 Duplication Multigenic 11 2.00E-03 0.02 21 19 2 6.28 16q22.1 Duplication WWP2 19 0.003 0.08 5 5 0 NA 7q36.3 Duplication WDR60/VIPR2 11,18 4.10E-03 2.40E-03 14 13 1 12.12 17q12 Duplication RCADduplication 19 0.009 0.02 20 16 4 3.81 9q33.1 Deletion NA 19 0.02 0.09 11 9 2 4.02 22q11.23 Duplication Multigenic 19 0.02 0.03 20 15 5 3.28 5q21.2 Deletion NA 19 0.03 0.07 32 22 10 2.16 8p22 Duplication SGCZ 19 0.03 0.08 5 5 0 NA 9p24.2 Deletion SLC1A1 12 0.03 0.02 8 8 0 NA 16p12.1 Deletion Multigenic 12 0.03 0.006 33 26 7 3.22 15q21.3 Duplication CGNL1 12 0.04 1.30E-03 103 69 34 1.99 17q12 Deletion RCAD 16 0.04 0.13 4 4 0 NA 16p13.11 Del/Dup Multigenic 8 0.08 0.03 139 84 55 1.49 7q11.21 Duplication NA 19 0.09 0.35 64 26 38 0.76 12q23.1 Duplication ANKS1B/UHRF1BP1L 19 0.1 0.73 28 16 12 1.23 1p36.33 Duplication Multigenic 12 0.11 0.06 15 12 3 3.98 5q33.1 Deletion NA 12 0.11 0.1 11 9 2 4.19 9q21.33 Duplication AGTPBP1 11 0.2 0.21 22 15 7 1.94 9q34.3 Duplication C9orf62 15 0.23 0.03 409 190 219 0.8 6q24.2 Duplication PHACTR2 12 0.26 0.15 10 8 2 4.03 3q26.1 Deletion NA 11 0.27 0.43 5 4 1 3.5 4q35.2 Deletion TRIML1/TRIML2 12 0.35 0.21 26 17 9 1.82 18q21.31 Duplication NEDD4L 11 0.39 0.7 2 2 0 NA 11q25 Deletion GLB1L3/GLB1L2 11 0.42 0.22 58 34 24 1.44 9p24.2 Deletion GLIS3 12 0.43 0.99 10 5 5 0.99 18q23 Duplication GALR1 12 0.57 0.81 7 4 3 1.22

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

33

4q35.2 Duplication FAM149A/CYP4V2 12 0.69 0.77 12 5 7 0.71 2q37.2 Duplication AQP12A/KIF1A 12 0.72 0.14 125 72 53 1.34 17p12 Deletion HNPP 14 0.82 0.89 22 12 10 1.06 4q25 Duplication ELOVL6 12 0.9 0.99 13 7 6 1 10q11.21 Duplication LikelycommonCNV 19 NA NA NA NA NA NA

ExtendeddataTable5:CNVprobe-levelresults–PreviouslyreportedCNVs

Probe-levelassociationresultsforallpreviouslyreportedCNVsfromgenome-widescansofSCZ.Wereportassociationresultsfrom

theSCZresidualphenotypeandfromaCMHteststratifiedbygenotypingplatform.CNVlociinboldmakeuppreviouslyimplicated

loci,inwhichthemostrecentpublishedp-valuesurpassedgenome-widecorrection.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

34

Extendeddatatablereferences1. Shprintzen,R.J.,Goldberg,R.,Golding-Kushner,K.J.&Marion,R.W.Late-onsetpsychosis

inthevelo-cardio-facialsyndrome.AmJMedGenet42,141-2(1992).

2. McCarthy,S.E.etal.Microduplicationsof16p11.2areassociatedwithschizophrenia.NatGenet41,1223-7(2009).

3. Stefansson,H.etal.Largerecurrentmicrodeletionsassociatedwithschizophrenia.Nature(2008).

4. TheInternationalSchizophrenia,C.Rarechromosomaldeletionsandduplicationsincreaseriskofschizophrenia.Nature.455,237-241(2008).

5. Kirov,G.etal.ComparativegenomehybridizationsuggestsaroleforNRXN1andAPBA2inschizophrenia.HumMolGenet17,458-65(2008).

6. Rujescu,D.etal.Disruptionoftheneurexin1geneisassociatedwithschizophrenia.HumMolGenet(2008).

7. Mulle,J.G.etal.Microdeletionsof3q29conferhighriskforschizophrenia.AmJHum

Genet87,229-36(2010).

8. Ingason,A.etal.Maternallyderivedmicroduplicationsat15q11-q13:implicationofimprintedgenesinpsychoticillness.AmJPsychiatry168,408-17(2011).

9. Degenhardt,F.etal.DuplicationsinRB1CC1areassociatedwithschizophrenia;identificationinlargeEuropeansamplesets.TranslPsychiatry3,e326(2013).

10. Mulle,J.G.etal.Reciprocalduplicationofthewilliams-beurensyndromedeletiononchromosome7q11.23isassociatedwithschizophrenia.BiolPsychiatry75,371-7(2014).

11. Levinson,D.F.etal.Copynumbervariantsinschizophrenia:confirmationoffivepreviousfindingsandnewevidencefor3q29microdeletionsandVIPR2duplications.Am

JPsychiatry168,302-16(2011).

12. Rees,E.etal.Analysisofcopynumbervariationsat15schizophrenia-associatedloci.BrJPsychiatry204,108-14(2014).

13. Rees,E.etal.Evidencethatduplicationsof22q11.2protectagainstschizophrenia.Mol

Psychiatry19,37-40(2014).

14. Kirov,G.etal.Supportfortheinvolvementoflargecnvsinthepathogenesisofschizophrenia.HumMolGenet(2009).

15. Bergen,S.E.etal.Genome-wideassociationstudyinaSwedishpopulationyieldssupportforgreaterCNVandMHCinvolvementinschizophreniacomparedtobipolardisorder.MolecularPsychiatry17,880-6(2012).

16. Moreno-De-Luca,D.etal.Deletion17q12isarecurrentcopynumbervariantthatconfershighriskofautismandschizophrenia.AmJHumGenet87,618-30(2010).

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

35

17. Guha,S.etal.Implicationofararedeletionatdistal16p11.2inschizophrenia.JAMA

psychiatry70,253-60(2013).

18. Vacic,V.etal.DuplicationsoftheneuropeptidereceptorgeneVIPR2confersignificantriskforschizophrenia.Nature471,499-503(2011).

19. Szatkiewicz,J.etal.CopynumbervariationinschizophreniainSweden.Molecular

Psychiatry19,762-73(2014).

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

36

ExtendedDataFigure1:CNVpipelineworkflow

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

37

ExtendedDataFigure2:CNVburdenforlosses:Fig3A:ForestplotofCNVburden(genesaffected)partitionedbygenotypingplatform.Fig3B:CNVburdenpartitionedbyCNVfrequency.

Odds Ratio (95% CI)

1 2 3 4

PGC

omni

A6.0

I550

A5.0

I600

I300

A500

21094

9795

7027

1572

1130

818

410

314

20227

8422

7980

1705

1001

623

285

201

0.9

0.9

0.8

0.9

0.6

0.7

0.8

1.1

4.0e−16

2.1e−09

1.9e−04

0.073

1.9e−04

0.1

0.237

0.306

1.4

1.53

1.26

1.38

1.89

1.64

1.46

1.63

Platform Cases Controls Genes pval OR

AffymetrixIllumina

A

singleton 2−5 6−10 11−20 21−40 41−80 81+

CNV count

Odd

s Ra

tio (9

5% C

I)

12

34

all CNVpreviously implicated CNV removed

B

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

38

ExtendedDataFigure3:CNVburdenforgains:Fig4A:ForestplotofCNVburden(genesaffected)partitionedbygenotypingplatform.Fig4B:CNVburdenpartitionedbyCNVfrequency.

Odds Ratio (95% CI)

0.5 1.0 1.5 2.0 2.5 3.0

PGC

omni

A6.0

I550

A5.0

I600

I300

A500

21094

9795

7027

1572

1130

818

410

314

20227

8422

7980

1705

1001

623

285

201

2.1

2.1

2.2

1.9

1.8

1.4

2.4

1.6

2e−07

0.001

0.014

0.047

0.013

0.034

0.772

0.112

1.12

1.15

1.08

1.22

1.24

1.38

0.93

1.66

Platform Cases Controls Genes pval OR

AffymetrixIllumina

A

singleton 2−5 6−10 11−20 21−40 41−80 81+

CNV count

Odd

s Ra

tio (9

5% C

I)

0.8

1.0

1.2

1.4

1.6

1.8

all CNVpreviously implicated CNV removed

B

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

39

ExtendedDataFigure4:CNVprobelevelManhattanplot:Manhattanplotofprobe-

levelassociationresultsfromtheSCZresidualphenotype.Fig5A:CNVlossesFig5B:CNV

gains.Genome-widecorrectionwasdeterminedusingthefamily-wiseerrorrate(FWER)

drawnfrompermutation.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

40

ExtendedDataFigure5:PermutationofNAHR-mediatedCNVs:Permutationresults

fromdrawingfrequency-matchCNVlociandtestingforfractionofNAHR-mediated

CNVs.TotestfortheenrichmentofNAHR-mediatedlociinoursuggestiveresultsfrom

thegene-basedtest,eachpermutationselectedanequivalentnumberofindependent

CNVlociandtestedthefactionofNAHR-mediatedCNVs.

p=0.008

0

500

1000

0.00 0.25 0.50 0.75 1.00fraction of NAHR−mediated CNVs

frequ

ency

Gains + Losses

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

41

ExtendedDataFigure6:Xq28CNVhotspot:Fig6A:ProtectiveCNVgainassociation

peakaroundtheMAGEA11andTMEM185Agene,bothwithinanintronoftheHSFX1

gene.Fig6B:RiskCNVgainassociationpeakatthedistalendofXq28overlappingten

genes.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

42

ExtendedDataFigure7:SCZphenotyperesidualdistribution:X-axis:Distributionof

phenotyperesidualvaluesafterregressingcase/controlstatusonselectedcovariates.

PlottedagainstoverallCNVKbburden(Y-axis)tovisuallyinspectifindividualswithlarge

residualshaveanexcessofCNVburden,whichcanleadtohigherfalsepositive

associations.SCZcaseshavepositiveresidualvaluesandcontrolsnegativeresidual

values.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

43

ExtendedDataFigure8:DetectionpowerforCNVlosses:Poweristheproportionof

simulatedcausalCNVlocidetected(e.g.surpassinggenome-wideFWERcorrection)

usingprobe-levelassociation.EachgraphplotspoweracrossvariousMAF(x-axis)and

SNP−level detection power

MAF

surp

assi

ng g

enom

e−w

ide

cuto

ff

1.0e−04 0.0025 0.005 0.01

0.0

0.2

0.4

0.6

0.8

1.0

GRR1.5235102050100Inf

MAF

surp

assi

ng g

enom

e−w

ide

cuto

ff

1.0e−04 2.5e−04 5.0e−04 7.5e−04 0.001

0.0

0.2

0.4

0.6

0.8

1.0

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

44

genotyperelativerisk(GRR:coloredlines).SimulationsusethesamplesizeandFWER

cutofffromthecurrentstudy.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

45

Methods

Overview

WeassembledaCNVanalysisgroupwithmembersfromBroadInstitute,Children’s

HospitalofPhiladelphia,UniversityofChicago,UniversityofCaliforniaSanDiego,

UniversityofMichigan,UniversityofNorthCarolina,ColoradoUniversityBoulder,and

UniversityofToronto/SickKidsHospital.Ouraimwastoleveragetheextensiveexpertise

ofthegrouptodevelopafullyautomatedcentralizedpipelineforconsistentand

systematiccallingofCNVsforbothAffymetrixandIlluminaplatforms.Anoverviewof

theanalysispipelineisshowninExtendedDataFigure1.Afteraninitialdataformatting

stepweconstructedbatchesofsamplesforprocessingusingfourdifferentmethods,

PennCNV,iPattern,C-score(GADAandHMMSeg)andBirdsuiteforAffymetrix6.0.For

Affymetrix5.0dataweusedBirdsuiteandPennCNV,forAffymetrix500weused

PennCNVandC-score,andforallIlluminaarraysweusedPennCNVandiPattern.We

thenconstructedaconsensusCNVcalldatasetbymergingdataatthesampleleveland

furtherfilteredcallstomakeafinaldatasetExtendeddatatable2.Priortoanyfiltering,

weprocessedrawgenotypecallsforatotalof57,577individuals,including28,684SCZ

casesand28,893controls.

StudySample

Acompletelistofdatasetsthatwereincludedinthecurrentstudycanbefoundin

ExtendedDataTable2.Amoredetaileddescriptionoftheoriginalstudiescanbefound

inapreviouspublication1

CopyNumberVariantAnalysisPipelineArchitectureandSampleProcessing

AllaspectsoftheCNVanalysispipelinewerebuiltontheGeneticClusterComputer

(GCC)intheNetherlands.PGCmemberssentexternaldrivesofrawdatatothe

Netherlandsforuploadtotheserveraswellasthecorrespondingsamplemetadata

files.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

46

InputAcceptanceandPreprocessing:ForAffymetrixweusedthe*.CELfiles(all

convertedtothesameformat)asinput,whereasforIlluminawerequiredGenomeor

Beadstudioexported*.txtfileswiththefollowingvalues:SampleID,SNPName,Chr,

Position,Allele1–Forward,Allele2–Forward,X,Y,BAlleleFreqandLogRRatio.

Sampleswerethenpartitionedinto‘batches’toberunthrougheachpipeline.For

AffymetrixsampleswecreatedanalysisbatchesbasedontheplateID(ifavailable)or

genotypingdate.Eachbatchhadapproximately200sampleswithanequalmixofmale

andfemalesamples.AffymetrixPowerTools(APT-apt-copynumber-workflow)was

thenusedtocalculatesummarystatisticsaboutchipsanalyzed.Gendermismatches

identifiedandexcludedaswereexperimentswithMAPD>0.4.ForIlluminadata,we

firstdeterminedthegenomebuildandconvertedtohg18ifnecessaryandcreated

analysisbatchesbasedontheplateIDorgenotypingdate.Eachbatchhad

approximately200samples,andequalmixofmaleandfemalesamples.

CompositePipeline:ThecompositepipelinecomprisesCNVcallersPennCNV2,iPattern3,

Birdsuite4andC-Score5organizedintocomponentpipelines.Weusedallfourcallers

forAffymetrix6.0data,PennCNVandC-ScoreforAffymetrix500,Probeannotationfiles

werepreprocessedforeachplatform.Oncethearraydesignfilesandprobeannotation

fileswerepre-processed,eachindividualpipelinecomponentpipelinewasrunintwo

steps:1)processingtheintensitydatabythecorepipelineprocesstoproduceCNVcalls,

2)parsingthespecificoutputformatofthecorepipelineandconvertingthecallstoa

standardformdesignedtocaptureconfidencescores,copynumberstatesandother

informationcomputedbyeachpipeline

MergingofCNVdataandQualitycontrolfiltering

MergingofCNVdata:Afterstandardizationofoutputsfromeachalgorithm,CNVcalls

fromeachalgorithmweremergedatthesampleleveltoincreasespecificity3.ForCNVs

generatedfromAffymetrix6.0array,wetooktheintersectionofthefouroutputs

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

47

(Birdsuite,iPattern,C-Score,PennCNV)atthesampleleveltocreateaconsensusCNV.

FortheAffymetrix500,Affymetrix5.0,andIlluminaplatforms,CNVmergingwas

performedbytakingtheintersectionofthecallsmadebythetwoalgorithms(PennCNV

andC-ScoreforAffymetrix500,BirdsuiteandPennCNVforAffymetrix5.0,andiPattern

andPennCNVforIllumina)atthesamplelevel.CNVcallsthatweremadebyonlyoneof

thealgorithmwereexcluded.CallsdiscordantfortypeofCNV(gainorloss)werealso

excluded.

Qualitycontrolfiltering:Followingmergingweappliedfilteringcriteriaforremovalof

arrayswithexcessiveprobevarianceorGCbiasandremovalofsampleswith

mismatchesingenderorethnicityorchromosomalaneuploidies.ForAffymetrixdata,

weextractedtheMAPDandwaviness-sdfromtheAPTsummaryfile.Wealsocalculated

theproportionofeachchromosome(excludingchrY)taggedascopynumbervariable

andcomputedthenumberofCNVcallsmadeforeachsample.Wethenretained

experimentsifeachofthesemeasureswaswithin3SDofthemedian.ForIlluminadata,

weextractedLRRSD,BAFSD,GCWF(waviness)fromPennCNVlogfiles.Aswiththe

Affymetrixdata,wecalculatedtheproportionofeachchromosome(excludingchrY)

taggedascopynumbervariableandcomputedthenumberofCNVcallsmadeforeach

sample.Weretainedsamplesifeachoftheabovemeasureswaswithin3SDofthe

median.ForbothIlluminaandAffymetrixdatasets,largeCNVsthatappearedartificially

splitwerecombinedtogetherifoneofthemethodsdetectedaCNVspanningthegap.

However,sampleswhere>10%ofthechromosomewascopynumbervariablewere

excludedaspossibleaneuploidies.Further,weexcludedCNVsthat:1)spannedthe

centromereoroverlappedthetelomere(100kbfromtheendsofthechromosome);2)

had>50%ofitslengthoverlappingasegmentalduplication;3)had>50%overlapwith

immunoglobulinorTcellreceptor.ThefinalfilteredCNVdatasetwasannotatedwith

Refseqgenes(transcriptionsandexons).Afterthisstageofqualitycontrol(QC),wehad

atotalof52,511individuals,with27,034SCZcasesand25,448controls.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

48

FilteringforrareCNVs:TomakeourfinaldatasetofrareCNVsforallsubsequent

analysisweuniversallyfilteredoutvariantsthatpresentat>=1%(50%reciprocal

overlap)frequencyincasesandcontrolscombined.CNVsthatoverlapped>50%with

regionstaggedascopynumberpolymorphiconanyotherplatformwerealsoexcluded.

CNVs<20kborhavingfewerthan10probeswerealsoexcluded.

Post-CNVCallingQC

Overview:AnumberofstepswereundertakenafterCNVcallingandinitialfilteringQC

tominimizetheimpactoftechnicalartifactsandpotentialconfounds.Insummary,we

removedindividualsnotpresentinthePGC2GWASanalysis1,removeddatasetswith

non-matchingcaseorcontrolsamplesthatcouldnotbereconciledusingconsensus

platformprobes,andremovedanyadditionaloutlierswithrespecttooverallCNV

burden,CNVcallingmetrics,orSCZphenotyperesiduals.Allstepsaredescribedinmore

detailbelow.

MergingwithGWAScohort:Bymatchingtheuniquesampleidentifiers,weretained

onlyindividualsthatalsopassedQCfilteringfromthecompanionPGCGWASstudyin

Schizophrenia1.Thisstepfilteredoutsampleswithlow-qualitySNPgenotyping,related

individuals,andrepeatedsamplesacrosscohorts.AnadditionalbenefitofthePGC

analyticalframeworkistheabilitytoaccountforpopulationstratificationacrosscohorts

usingprincipalcomponentsderivedfromprobelevelanalysis.Afterthepost-CNVcalling

qualitycontrolstepsdescribedbelow,were-calculatedprincipalcomponentsusingthe

Eigenstratsoftwarepackage6.SampleinformationandsubsequentCNVandGWAS

filteredsamplesetsarepresentedinExtendeddatatable2.Intheprocessofmatching

totheGWAS-specificcohort,allindividualsofnon-Europeanancestrywereremoved

fromanalysis(~5.8%ofthepost-QCsamplecomprisingthreeseparatedatasets).We

alsoremoved42samplesthathaddiscordantphenotypedesignationsbetweenthe

GWASanalysisandCNVgenotypesubmission.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

49

Individualdatasetremoval:SomedatasetssubmittedtothePGCconsistedofonlycase

orcontrolsamples,affectedtrios,orrecruitedexternalsamplesascontrols.This

asymmetryincase-controlascertainmentandgenotypingcanpresentseriousbiasesfor

CNVanalysis,asthesensitivitytodetectCNVwillvaryconsiderablyacrossgenotyping

platforms,aswellaswithindatasetandgenotypingbatch.Unlikeimputationprotocols

commonlyusedforSNPgenotyping,thereisnoequivalentprocesstoinferunmeasured

probeintensityfromnearbymarkers.Wetookanumberofstepstoidentifyand

removedatasetsthatshowedstrongsignsofcase-controlascertainmentorgenotyping

asymmetry:

1)Identifygenotypingplatformswherecase-controlratiowasnotbetween40-60%

2)Wherepossible,mergesimilargenotypingplatformsusingconsensusprobespriorto

CNV-callingpipelineinordertoimprovecase-controlratio.

3)ExamineoverallCNVburdenandassociationpeaksforspuriousresults

4)RemovedatasetsthatremainproblematicduetounusualCNVburdenormultiple

spuriousCNVassociations.

ThegenotypingplatformsidentifiedandprocessedarelistedinExtendeddatatable2.

WewereabletocombinetheIlluminaOmniExpressandIlluminaOmniExpressplus

ExomeChipplatformswithsuccessbyremovingprobecontentspecifictotheExome

chipplatform.WeremovedthecawsAffymetrix500datasetsduetoanumberofstrong

CNVassociationpeaksnotseeninanyotherdataset.Wealsoremovethefii6dataset

duetoa2-foldCNVburdenincasesrelativetocontrols.Inordertoimprovecase-

controlbalance,wehadtoremovetheaffectedprobandtriodatasets(boco,lacw,and

lemu)intheIllumina610platform,andthecontrol-onlyuclodatasetintheAffymetrix

500platform.

Individualsampleremoval:Were-analyzedCNVburdenestimatesinthereduced

sampletoflaganylingeringoutliersmissedintheinitialQC.Weidentifiedoutliersfor

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

50

CNVcountandKbburdenintheautosome(>30CNVsor8Mb,respectively)andinthe

Xchromosome(>10CNVsor5Mb,respectively),removinganadditional15individuals.

Genome-wideCNVintensityandqualitymeasurementsproducedbyCNVcalling

algorithms(i.e.“CNVmetrics”)wereexaminedforadditionaloutliersandpotential

relationshipswithcase-controlstatus.EachCNVmetricwasre-examinedacrossstudies

toassessifanyadditionaloutlierswerepresent.Onlythreeoutlierswereremovedas

theirmeanBallele(orminorallele)frequencydeviatedsignificantlyfrom0.5.ManyCNV

metricsareauto-correlated,astheymeasuresimilarpatternsofvariationintheprobe

intensity.Thus,wefocusedonthemainintensitymetrics-medianabsolutepairwise

difference(MAPD)forprojectsgenotypedontheAffymetrix6.0platform,andLogR

Ratiostandarddeviation(LRRSD)inallothergenotypingplatforms.AmongAffymetrix

6.0datasets,MAPDdidnotdifferbetweenincasesandcontrols(t=1.14,p=0.25).

However,amongnon-Affymetrix6.0datasets,LRRSDshowedsignificantdifferences

betweencasesandcontrols(t=-35.3,p<2e-16),withcontrolshavingahigher

standardizedmeanLRRSD(0.227)thancases(-0.199).Tocontrolforanyspurious

associationsdrivenbyCNVcallingquality,weincludedLRRSD(MAPDforAffymetrix6.0

platforms)asacovariateindownstreamanalysis.CNVmetricswerenormalizedwith

theirgenotypingplatformpriortoinclusioninthecombineddataset.

Regressionofpotentialconfoundsoncase-controlascertainment

ThePGCcohortsareacombinationofmanydatasetsdrawnfromtheUSandEurope,

anditisimportanttoensurethatanybiasinsampleascertainmentdoesnotdrive

spuriousassociationtoSCZ.Inordertoensuretherobustnessoftheanalysis,we

controlledforanumberofcovariatesthatcouldpotentialconfoundresults.Burdenand

gene-setanalysesincludedcovariatesinalogisticregressionframework.Duetothe

numberoftestsrunatprobelevelassociation,weemployedastep-wiselogistic

regressionapproachtoallowfortheinclusionofcovariatesinourcase-control

association,whichwetermtheSCZresidualphenotype.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

51

Covariatesincludesex,genotypingplatform,CNVmetrics,andancestryprincipal

componentsderivedfromSNPgenotypesonthesamesamplesinapreviousstudy1.We

wereunabletocontrolfordatasetorgenotypingbatch,asasubsetofthecontributing

datasetsarefullyconfoundedwithcase/controlstatus.CNVmetricisnormalizedwithin

genotypingplatformpriortoinclusioninthelogisticmodel.Onlyprincipalcomponents

thatshowedasignificantassociationtosmallCNVburdenwereused(smallCNVbeing

definedasautosomalCNVburdenwithCNV<100kbinsize).Amongthetop20

principalcomponents,onlythe1st,2nd,3rd,4th,and8thprincipalcomponentshowed

associationwithsmallCNVburden(withp<0.01usedasthesignificancecutoff).To

calculatetheSCZresidualphenotype,wefirstfitalogisticregressionmodelof

covariatestoaffectionstatus,andthenextractedthePearsonresidualvaluesforusein

aquantitativeassociationdesignfordownstreamanalyses.Residualphenotypevalues

incasesareallabovezero,andcontrolsbelowzero,andaregraphedagainstoverallkb

burdeninExtendeddatafigure7.WeremovedthreeindividualswithanSCZresidual

phenotypegreaterthanthree(ornegativethreeincontrols).Afterthepost-processing

roundofQC,weretainedadatasetwithatotalof41,321individualscomprising21,094

SCZcasesand20,227controls.

IdentifyingpreviouslyimplicatedCNVlociintheliterature

TodelineateCNVburdeneffectscomingfromCNVlocithathavepreviouslybeen

reportedasputativeSCZriskfactorsfromCNVinremainderofthegenome,weflagged

CNVlociwithp<0.01thathaveeitherbeenreviewed7,8orotherwisereported8-10as

potentialSCZriskfactorsintheliterature.Previouslyreportedlocimeetinginclusionare

listedinExtendeddatatable1.WhileanumberofCNVlocihavebeenreportedin

multiplestudies,wesoughtthemostrecentreportsthatincorporatedthelargest

samplesizes.ToidentifyputativelyassociatedCNVlociwithSCZfromthefulllist,we

appliedthegenome-widep-valuecutoffof8e-5,derivedfromtheCochran-Mantel-

Haenzel(CMH)testinthecurrentprobe-levelanalysisasthep-valuecutoffforinclusion

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

52

asSCZimplicatedCNVloci.WhiletheCMHtestisnottheprimaryprobe-leveltestinthe

currentPGCanalysis,itcorrespondsmorecloselytothetestsusedinpublishedreports.

Inall,nineindependentCNVlocifrompublishedreportssurpassgenome-wide

correction.AllpublishedCNVloci,eventhoseexcludedasanSCZimplicatedregions,are

examinedintheprobe-levelassociationanalysis.

CNVburdenanalysis

WeanalyzedtheoverallCNVburdeninavarietyofwaystodiscernwhichgeneral

propertiesofCNVarecontributingtoSCZrisk.OverallindividualCNVburdenwas

measuredin3distinctways–1)KbburdenofCNVs,2)Numberofgenesaffectedby

CNVs,and3)NumberofCNVs.Inparticular,weonlycountedgeneasaffectedwhenthe

CNVoverlappedacodingexon.WealsopartitionedouranalysesbyCNVtype,size,and

frequency.CNVtypeisdefinedascopynumberlosses(ordeletions),copynumbergains

(orduplications),andbothcopynumberlossesandgains.Toassignaspecificallele

frequencytoaCNV,weusedthe--cnv-freq-method2commandinPLINK,wherebythe

frequencyisdeterminedasthetotalnumberofCNVoverlappingthetargetCNV

segmentbyatleast50%.ThismethoddiffersfromothermethodsthatassignCNV

frequenciesbygenomicregion,wherebyasingleCNVspanningmultipleregionsmaybe

includedinmultiplefrequencycategories.

ForFigure1,andExtendeddatafigures2and3,wepartitionedCNVburdenby

genotypingplatform,andtheabbreviationsforeachplatformareexpandedbelow:

A500:Affymetrix500

I300:Illumina300K

I600:Illumina610KandIllumina660W

A5.0:Affymetrix5.0

A6.0:Affymetrix6.0

omni:OmniExpressandOmniExpressplusExome

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

53

DuetothesmallsizeoftheOmni2.5array(28casesand10controls),theywere

excludedfrompresentationinthefigure,butareincludedinallburdenanalyseswith

thetotalPGCsample.Burdentestsusealogisticregressionframeworkwiththe

inclusionofcovariatesdetailedabove.Usingalogisticregressionframework,we

predictedSCZstatususingCNVburdenasanindependentpredictorvariable,thus

allowingustogetanaccurateestimateoftheuniquecontributionofCNVburdenina

multipleregressionframework.TogaininsightintotheproportionofCNVburdenrisk

comingfromlocioutsideofthepreviouslyimplicatedSCZregions,weranallburden

analysesafterremovingCNVthatoverlappedpreviouslyimplicatedCNVboundariesby

morethan10%.

CNVprobelevelassociation

Genome-wideinterrogationofCNVsignalswastestedateachrespectiveCNV.Probe

leveltestswereexaminedatthestart,end,andsinglebasepositionaftertheendofthe

calledCNV.ThreecategoriesofCNVweretested:CNVdeletions,CNVduplications,and

deletionsandduplicationstogether.AllanalyseswererunusingPLINKsoftware11.

WeranprobelevelassociationusingtheSCZresidualphenotypeasaquantitative

variable,withsignificancedeterminedthroughpermutationofphenotyperesidual

labels.Anadditionalz-scoringcorrection,explainedbelow,isusedtocontrolforany

extremevaluesintheSCZresidualphenotypeandefficientlyestimatetwo-sided

empiricalp-valuesforhighlysignificantloci.Toensureagainstthepotentiallossof

powerfromtheinclusionofcovariates,wealsoranasingledegreeoffreedomCochran-

Mantel-Haenzel(CMH)teststratifiedbygenotypingplatform,witha2(CNVcarrier

status)x2(phenotypestatus)xN(genotypingplatform)contingencymatrix.Whilethe

CMHtestdoesnotaccountformoresubtlebiasesthatcoulddrivefalsepositivesignals,

itisrobusttosignalsdrivenbyasingleplatformandallowsforeachCNVcarriertobe

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

54

treatedequally.Locithesurpassedgenome-widecorrectionineithertestwasfollowed

upforfurtherevaluation.

Z-scorerecalibrationofempiricaltesting:Probelevelassociationp-valuesfromtheSCZ

residualphenotypewereinitiallyobtainedbyperformingonemillionpermutationsat

eachCNVposition,wherebyeachpermutationshufflestheSCZresidualphenotype

amongallsamples,andretainstheSCZresidualmeanforCNVcarriersandnon-carriers.

ForextremelyrareCNV,however,CNVcarriersattheextremeendsoftheSCZresidual

phenotypecanproducehighlysignificantp-values.Whileweunderstandthatsuchrare

eventsareunabletosurpassstrictgenome-widecorrection,wewantedtoretainall

teststohelpdelineatethepotentialfine-scalearchitecturewithinasingleregionof

association.Toproperlyaccountfortheincreasedvariancewhenonlyafewindividuals

aretested,weappliedanempiricalZ-scorecorrectiontotheCNVcarriermean.Inorder

togetanempiricalestimateofthevarianceforeachtest,wecalculatedthestandard

deviationofresidualphenotypemeandifferencesinCNVcarriersandnon-carriersfrom

5,000permutations.Z-scoresarecalculatedastheobservedcase-controlmean

differencedividedbytheempiricalstandarddeviation,withcorrespondingp-values

calculatedfromthestandardnormaldistribution.Concordanceoftheinitialempirical

andz-scorep-valuesareclosetounityforassociationtestswithsixormoreCNV,

whereasZ-scorep-valuesaremoreconservativeamongtestswithlessthansixCNV.

Furthermore,theZ-scoremethodnaturallyprovidesanefficientmannertoestimate

highlysignificantempiricalp-valuesthatwouldinvolvehundredsofmillionsof

permutationstoachieve.

Genome-widecorrectionformultipletests

BeyondidentifyingsignificantCNVattheprobelevel,wealsoestimatedthegenome-

widetestingspaceforrareCNVanalysis.WiththelargePGCcohortbeingcalledthrough

aconsistentpipeline,wesawanopportunitytocharacterizethenullexpectationof

segregatingandrecurrentdenovorareCNVinpopulationsofEuropeanancestry.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

55

AcceptedthresholdsforsignificanceamongpublishedriskCNVhavebeenlimitedin

scope,asaccuratepopulationestimatesofrareCNVfrequencyanddistributionacross

thegenomerequirelargerepresentativesamples.

Genome-widesignificancethresholdswerecalculatedusingthe5%family-wiseerror

ratefrom5,000permutationsinboththeSCZresidualphenotypeandCMHtest.

Specifically,weselectedthe95thpercentileoftheminimump-valuesobtainedacross

permutations.Belowarethegenome-widecorrectionp-valuethresholdsdeterminedin

thismanner:

SCZresidualphenotypeFWERcorrection:

CNVlossesandgains:6.73e-6

CNVlosses:1.5e-5

CNVgains:1.35e-5

CMHtestFWERcorrection:

CNVlossesandgains:3.65e-5

CNVlosses:8.25e-5

CNVgains:7.8e-5

ThismethoddiffersslightlyfromthoseusedinLevinsonetal.9toestimatethemultiple

testcorrectionforrareCNV,howevertheirgenome-widecorrectionofp=1e-5

correspondsquitecloselytotheestimatesobservedusingtheSCZresidualphenotype.

Theobservedfamily-wisecorrectionservesasgoodapproximationoftheindependent

rareCNVsignalsfoundamongEuropeanancestrypopulationsforarray-basedCNV

capture,butassamplesizesincrease,sotoowilltheeffectivenumberoftests,

necessitatingfurtherevaluationofthemultipletestingburden.

Gene-setburdenenrichmentanalysis:gene-sets

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

56

Gene-setswithanaprioriexpectationofassociationtoneuropsychiatricdisorderswere

compiledbasedongeneannotations(GeneOntologyandcuratedpathwaydatabases,

downloadedJune2013)andpublishedarticlematerials(fordetails,seeExtendedData

Table3).Gene-setsbasedonbrainexpressionwerecompiledbyprocessingthe

BrainSpanRNA-seqgeneexpressiondata-set

(http://www.brainspan.org/static/download.html,downloadedSept2012).Four

roughlyequallysizedgene-sets(about4600geneseach)werederivedtorepresentfour

expressiontiers(veryhigh,medium-to-high,medium-to-low,veryloworabsent);genes

wereselectediftheypassedafixedexpressionthresholdinatleast5/508experimental

datapoints(correspondingtodifferentregionsofdonorbrains,differentdonorages

correspondingtodifferentdevelopmentalbrainstages,anddifferentdonorsexes).

Gene-setsbasedonmousephenotypeswereassembledbydownloadingMPO

(MammalianPhenotypeOntology)annotationsfromMGI(www.informatics.jax.org,

downloadedAugust2013),up-propagatingannotationsfollowingontologyrelations,

andmappingtohumanorthologsusingNCBIHomologene

(www.ncbi.nlm.nih.gov/homologene);finally,top-levelorgansystemswithfewergenes

wereaggregatedwhilestrivingtopreservebiologicalhomogeneity,sotohaveroughly

equal-sizedsets(2,600-1,300genes).Forallgene-sets,geneidentifiersintheprimary

sourceweremappedtoEntrez-geneidentifiersusingtheR/Bioconductorpackage

org.Hs.eg.db.

Gene-setburdenenrichmentanalysis:pre-processing

SubjectswererestrictedtotheoneswithatleastonerareCNV.Forcopynumbergains

andlosses,weseparatelycalculatedthefollowingsubject-leveltotals:variantnumber,

variantlengthandnumberofgenesimpacted;thesecovariatesarethenusedtomodel

globalburdenandcorrectgene-setburdentoensureitisspecific(i.e.notamere

reflectionofgenome-wideburdenwithsomestochasticdeviationduetosampling).The

subject-leveltotalnumberofgenesimpactedwasalsocalculatedforeachgene-set,

againseparatelyforgainsandlosses.Subjectswereflaggediftheycarriedatleastone

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

57

CNVmatchingalocuspreviouslyimplicatedinschizophrenia(seesection“Identifying

previouslyimplicatedCNVlociintheliterature”);thiswasthenusedtoanalyzedgene-

setburdenforallsubjects,orexcludingsubjectswithanalreadyimplicatedCNV.

Gene-setburdenenrichmentanalysis:statisticaltest

Foreachgene-set,wefitthefollowinglogisticregressionmodel(asimplementedbythe

Rfunctionglmofthestatspackage),wheresubjectsarestatisticalsamplingunits:

y~covariates+global+gene-set

Where:

• yisthedicotomicoutcomevariable(schizophrenia=1,control=0)

• covariatesisthesetofvariablesusedascovariatesalsointhegenome-wide

burdenandprobeassociationanalysis(sex,genotypingplatform,CNVmetric,

andCNVassociatedprincipalcomponents)

• globalisthemeasureofglobalburden;fortheresultsinthemaintext,weused

thetotalgenenumber(abbreviatedasUfromuniversegene-setcount);wealso

calculatedresultsfortotallength(abbreviatedasTL)andvariantnumberplus

variantmeanlength(abbreviatedasCNML)

• gene-setisthegene-setgenecount

Thegene-setburdenenrichmentwasassessedbyperformingachi-squaredeviancetest

(asimplementedbytheRfunctionanova.glmofthestatspackage)comparingthese

tworegressionmodels:

y~covariates+global

y~covariates+global+gene-set

Wereportedthefollowingstatistics:

• coefficientbetaestimate(abbreviatedasCoeff)

• t-studentdistribution-basedcoefficientsignificancep-value(asimplementedby

theRfunctionsummary.glmofthestatspackage,abbreviatedasPvalue_glm)

• deviancetestp-value(abbreviatedasPvalue_dev)

• gene-setsize(i.e.numberofgenesisthegene-set,regardlessofCNVdata)

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

58

• BH-FDR(Benjamini-HochbergFalseDiscoveryrate)

• percentageofschizophreniaandcontrolsubjectswithatleast1gene,2genes,

etc…impactedbyaCNVofthedesiredtype(lossorgain)inthegene-set

(abbreviatedasSZ_g1n,SZ_g2n,…CT_g1n,…)

Pleasenotethat,byperformingsimplesimulationanalyses,werealizedthatPvalue_glm

canbeextremelyover-conservativeinpresenceofveryfewgene-setcountsdifferent

than0,whilePvalue_devtendstobeslightlyunder-conservative.Whilethetwop-

valuestendtoagreewellforgene-setanalysis,Pvalue_glmissystematicallyover-

conservativeforgeneanalysissincesmallercountsaretypicallyavailableforsingle

genes.

Geneburdenanalysis:pre-processing

SubjectswererestrictedtotheoneswithatleastonerareCNV.Onlygeneswithatleast

aminimumnumberofsubjectsimpactedbyCNVweretested;thisthresholdwaspicked

bycomparingtheBH-FDRtothepermutation-basedFDRandensuringlimitedFDR

inflation(permutedFDR<1.65*BH-FDRatBH-FDRthreshold=5%)whilemaximizing

power.Forgainsthethresholdwassetto12counts,whileforlossesitwassetto8

counts.

Geneburdenanalysis:statisticaltest

Foreachgene,wefitthefollowinglogisticregressionmodel(asimplementedbytheR

functionglmofthestatspackage),wheresubjectsarestatisticalsamplingunits:

y~covariates+gene

Where:

• yisthedichotomousoutcomevariable(schizophrenia=1,control=0)

• covariatesisthesetofvariablesusedascovariatesalsointhegenome-wide

burdenandprobeassociationanalysis(sex,genotypingplatform,CNVmetric,

andCNVassociatedprincipalcomponents)

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

59

• geneisthebinaryindicatorforthesubjecthavingornothavingaCNVofthe

desiredtype(lossorgain)mappedtothegene

Thegeneburdenwasassessedbyperformingachi-squaredeviancetest(as

implementedbytheRfunctionanova.glmofthestatspackage)comparingthesetwo

regressionmodels:

• y~covariates

• y~covariates+gene

Geneburdenanalysis:multipletestcorrection

Multipletestcorrectionwasperformedforlociratherthanforgenes,toavoidthe

strongcorrelationbetweentestintroducedbymulti-genicCNVs;forthesamereason,it

ismoreusefultocountfalsepositivesaslociratherthangenes.Wefollowedagreedy

step-downprocedure:

• startfromgenewithmostsignificantdeviancep-valueG1,createlocusL1

• removefromthegenelistallgenesthatshareatleast50%oftheircarrier

subjectswithG1,andaddthemtolocusL1

• dothesameforthenextgenemostsignificantgeneinthelist(thuscreatinga

newlocusL2),andproceedrecursivelyuntilthereisnogeneleft

• definelocusp-valueasthesmallestdeviancep-valueofitsgenes

Wecomputedpermutation-basedFDRbypermutingsubjects’conditionlabels

(schizophrenia,control),butnotcovariates(asthoseareexpectedtocorrelatetoCNV

distribution),1,000times.TheFDRwasthendefinedastheratiobetweentheaverage

numberoftestspassingagivenp-valuethresholdacrossthe1,000permutationsand

thenumberoftestspassingthesamep-valuethresholdforrealdata.FDRswerealso

generatedcountingonlythesubsetofgeneswithpositiveandnegativeregression

coefficients(i.e.riskandpresumedprotective).Thep-valuethresholdforpermutation-

basedFDRcalculationwaspickedbychoosingthemaximumnominalp-value

correspondingtoagivenBH-FDRthreshold(e.g.5%).BH-FDRissupposedtobeslightly

inflatedbecause(i)thedeviancetestp-valueisslightlyunder-conservativeinpresence

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

60

ofveryfewgeneindicatorsdifferentthan0,(ii)weusethesmallestgenep-valueto

definethelocusp-value.

MethodsReferences

1. SchizophreniaWorkingGroupofthePsychiatricGenomics,C.Biologicalinsightsfrom108schizophrenia-associatedgeneticloci.Nature511,421-7(2014).

2. Wang,K.etal.PennCNV:anintegratedhiddenMarkovmodeldesignedforhigh-resolutioncopynumbervariationdetectioninwhole-genomeSNPgenotypingdata.GenomeRes17,1665-74(2007).

3. Pinto,D.etal.Functionalimpactofglobalrarecopynumbervariationinautismspectrumdisorders.Nature466,368-72(2010).

4. Korn,J.M.etal.IntegratedgenotypecallingandassociationanalysisofSNPs,commoncopynumberpolymorphismsandrareCNVs.Nat.Genet.40,1253-1260(2008).

5. McCarthy,S.E.etal.Microduplicationsof16p11.2areassociatedwithschizophrenia.NatGenet41,1223-7(2009).

6. Price,A.L.etal.Principalcomponentsanalysiscorrectsforstratificationingenome-wideassociationstudies.NatGenet38,904-909(2006).

7. Malhotra,D.&Sebat,J.CNVs:harbingersofararevariantrevolutioninpsychiatricgenetics.Cell148,1223-41(2012).

8. Rees,E.etal.Analysisofcopynumbervariationsat15schizophrenia-associatedloci.BrJPsychiatry204,108-14(2014).

9. Levinson,D.F.etal.Copynumbervariantsinschizophrenia:confirmationoffivepreviousfindingsandnewevidencefor3q29microdeletionsandVIPR2duplications.AmJPsychiatry168,302-16(2011).

10. Bergen,S.E.etal.Genome-wideassociationstudyinaSwedishpopulationyieldssupportforgreaterCNVandMHCinvolvementinschizophreniacomparedtobipolardisorder.MolecularPsychiatry17,880-6(2012).

11. Purcell,S.etal.PLINK:atoolsetforwhole-genomeassociationandpopulation-basedlinkageanalysis.AmericanJournalofHumanGenetics81,559-75(2007).

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

61

PGCSchizophreniaCNVanalysis–SupplementaryInformation

SupplementaryResults:

• CNVburdenbetweensexes

• Probelevelpoweranalysis

• Gene-basednetworkanalysis

• FollowupofsignificantCNVloci

• ProportionofvarianceinSCZexplainedbytopCNVloci

• NAHRenrichmentinsignificantnovelgeneloci

ConsortiumMembership

Acknowledgements

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

62

SupplementaryResults

CNVburdenbetweensexes

Followingrecentevidencethatostensiblyhealthyfemalescarryanincreasedburdenof

rareCNVs1,weexaminedwhetherthisincreasedfemaleburdenexistedinthecurrent

PGCdataset.WeusedalogisticregressionmodelpredictingsexusingCNVburdenand

controllingforstudycovariates,aswellastheWilcoxonrank-sumtestcomparingmale

tofemaleCNVburden1.Focusingonthesignificantfindingsinthepreviouspaper,we

examinedtheburdeninautosomalCNVcountandgenesaffectedamongPGCcontrols

(9856malesand10371females).WedoseeanelevatedCNVcountincontrolfemales

(1.90autosomalCNVrate)tomales(1.87autosomalCNVrate),howeverthisdifference

isnotsignificantineithertheregressionmodel(OR=1.004,p=0.66)ortheWilcoxon

rank-sumtest(p=0.1).Wedo,however,observeamarginallysignificantenrichment

whenfocusingonCNVlosscount,wherecontrolfemales(0.99autosomalCNVlossrate)

showahigherburdenthancontrolmales(0.94autosomalCNVlossrate;logistic

regressionOR=1.03,p=0.05;Wilcoxonrank-sumtestp=3e-3).Nosinglegenotyping

platformseemedtodrivetheenrichmentinfemales(datanotshown),andwedon’t

observeanydifferenceinCNVcountwhenlookingatCNVgains(logisticregressionOR=

0.98,p=0.18;Wilcoxonrank-sumtestp=0.56).Finally,nosignificantdifferences

betweensexeswerefoundusingeithertestwhenexaminingthenumberofgenes

affected,orwhenweincludeSCZcasesandcontrols(allp>.05).

Probelevelpoweranalysis

ByrestrictinganalysistorareCNVinthepopulation(MAF<0.01),manylocidonothave

enoughCNVtosurpassgenome-widecorrectionformultipletesting,prompting

pathwayandgenelevelanalysestoachievesufficientstatisticalpower.Touseaspecific

example,the3q29deletionisfullypenetrantinthecurrentsample,with16SCZcarriers

and0controls(MAF=3.8e-4)atthepeakofassociation.Assumingnoplatformbias,this

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

63

leadstoanuncorrectedchi-squarep-valueof8.9e-5,andapermutedp-valueof6.2e-5

testingassociationusingSCZphenotyperesiduals.Neitherp-value,however,surpasses

theirrespectivegenome-widesignificancecutoffforCNVdeletion.Whilepermutation

methodsusedtogenerategenome-widecutoffsaccuratelyreflectthetestingspace

amongobservedCNVs(veryrareCNVshavelittletonocontributiontothefamily-wise

errorrate),wewantedtoestimatetheproportionofCNVdetectableattheprobelevel.

Underourcurrentanalyticaldesignandsamplesize,wecalculatedthepowertodetect

associatedCNVacrossvariousMAFsandeffectsizesanddeterminetheproportionof

associationtestscapableofsurpassinggenome-widecorrection.

WesimulatedCNVswithinourdataset(21094casesand20227controls)andregressed

themusingthesameassociationdesignwithSCZresidualphenotypes.Wesimulated

variouseffectsizesbyrandomlysamplingcasesandcontrolsatdifferentprobabilitiesas

CNVcarriers,androundedtothenearestCNVcounttoreflecttheMAFofeachCNVin

thesample.ForeachcombinationofeffectsizeandMAF,weran1000simulations,

retrievingthet-testp-valueofCNVcarriersfromtheSCZresidualphenotype.Simulated

p-valuesbehavedinmuchthesamewayastheZ-scorecorrectiononpermutatedp-

valuesusedintheprimarytest(datanotshown).InExtendeddatafigure8,weshow

theproportionofsimulationsforCNVlossessurpassinggenome-widecorrectionateach

MAFandeffectsizeparameter(gainsperformsimilarly).

Wedefinestatisticalpowerastheproportionofsimulationssurpassinggenome-wide

significance.ForafullypenetrantriskCNV,werequireaMAFof~6e-4(orabout25CNV)

toachieve80%detectionpower.ForCNVwithagenotyperelativerisk(GRR)of10,we

requireaMAFof1e-3(oratleast41CNV)toachieve80%detectionpower.Looking

acrossthelandscapeofCNVstested,onthewholeabout10%ofdeletionorduplication

CNVbreakpointsreachafrequencygreaterthan1e-3inthesample.Ontheother

extreme,aCNVwithMAFof.005(oratleast206CNV)andaGRRof2willonlybe

detected58%ofthetime.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

64

Gene-basednetworkanalysis

Toidentifyagenenetworkenrichedinschizophreniariskgenes,wequeried

GeneMANIA2usingthe17geneswithdeletiongene-testBenjamini-HochbergFDR<=

25%andmemberofthe“GOsynaptic”or“ARCcomplex”sets.Wethuscreateda

synapticproteininteractionnetworkof136genes,withthemostdenselyconnected

networkcorecorrespondingtopost-synapticdensityorganizers(DLGs,DLGAPs,

SHANKs)andionotropicglutamatereceptors(GRIAs,GRIDs,GRINs).NRXN1isconnected

tothenetworkcoreviaadhesionpartners(NLGN1-3)andCASK.Wetestedthis

schizophreniagenenetwork,andfoundsignificantenrichmentingeneswithevidenceof

denovocodingvariantsinsequencingstudiesofschizophreniatrios3(forframeshift,

stop-gainandsplice-site:Fisher’sExactTestp-value0.0023;missenseandaminoacid

insertion/deletion:Fisher’sExactTestp-value0.0004);inaddition,wefoundagreater

enrichmentforthisnetwork,comparedtothelargersetcomposedofall“GOsynaptic”

and“ARCcomplex”genes.Nosignificantenrichmentwasfoundfordenovovariants

identifiedincontrols.

FollowupofsignificantCNVloci

Bothgeneandprobelevelassociationfollowauniformtestingframeworkacrossthe

genome,howeverrisklocimayexhibitamorenuancedCNVarchitectureacrossthe

entiretyoftheassociationpeak.AllassociatedlociwithFDR<.05inthegenebasedtest

werefollowedupforfurthertesting,alongwithasmallnumberofcandidateloci

showingsuggestiveassociationintheprobe-levelassociation.Wevisuallyinspected

eachassociationpeakanddeterminedthebpcoordinatesthatencapsulatethe

associatedregionanddeterminewhichCNVsegmentinclusion,beitcoveringexonsor

overlappingaminimumpercentageofthetotalregion,mostappropriatelyreflectthe

associationsignal.Tocomprehensivelyexaminetherobustnessandsourceof

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

65

association,wealsoranadditionaltestscontrollingforindividualdataset,splittingby

sex,andexaminingadosagemodel,wherebycopynumberismeasuredwithonecopy

fordeletion,twocopiesfornoCNV,andthreecopiesforduplication.Wealsoexamined

significantCNVlociinanunfilteredCNVcallset,usingCNVscalledpriortotheremoval

ofcommonCNVs(MAF>1%)andCNVoverlappingsegmentalduplications.

Wefurtherevaluatedtheassociatedregionsbydeterminingtheconcordanceofcalls

withinthecallsetwiththosedeterminedbyunsupervisedclustering.CallsetCNVswere

definedasCNVswithatleasta50%overlapwithregionsinTable1.Werestrictedthis

analysisto26,959samplesacrosssixcohorts(14,419Affymetrix6.0,12,540Illumina

platforms;1.1:1case:controlratio).FeaturesforclusteringincludedthemedianlogR

ratio(mLRR)andthemedianlogRratioofthechromosomeforwhichalocusresidesin,

controllingforlargechromosomalabnormalities.WeimplementedDensity-Based

SpatialClusteringofApplications(DBSCAN)foundinthepythonscikit-learnlibrary

(http://scikit-learn.org)becauseofhighsensitivitytodetectoutliersinclusters.Foreach

novelregionandwithineachcohort,genotypeswereassignedtoeverysamplebased

ontheDBSCANdefinedcluster.Theclusterwiththehighestnumberofsampleswas

designatedasreferenceandassumedtohaveacopynumberoftwo.Otherclusters

wereflaggedasgainorlossbasedontheaverageregionalmLRRanditsrelationtothe

referenceregionalmLRR.WeremovedclusterswithaveragechromosomalmLRR

outside3SDfromthereference.CNVswereconsideredconcordantiftheywereflagged

non-referencebyDBSCANandpresentinthe41kcallset,matchingonCNVtype.We

appliedalocusbasedcallsetconcordancefilterof>=70%;oneregion,NPY4R,failedto

meetthisrequirementwithaconcordanceof0.1%.Inaddition,bothproximalanddistal

lociofZNF600wereremovedduetobatcheffects,whichwedefinedasasignificant

deviationfromaPoissondistributionofcallsetcallsperplate.Regionsthatpassedboth

concordanceandbatcheffectfiltersarereportedinTable1.

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

66

ProportionofvarianceinSCZexplainedbytopCNVloci

TomeasuretheproportionofvarianceexplainedontheliabilityscaleofSCZ,we

estimatedtheoverallheritabilityofliability(orlogRRgeneticvariance)explainedbythe

eightCNVlocisurpassinggenome-widesignificance.Alleightlociwerecollapsedintoa

singlesignal.TwoSCZaffectedindividualswerefoundtocarrytwoCNVsintheseloci,

andtheircontributionwasonlycountedonce.Insum,weobserved298SCZpatients

withaCNVintheseregions(1.4%ofthetotalSCZaffectedsample),and29controls

(0.1%;CMHstratifiedOR=10.1).ToestimatethevarianceinSCZliabilityexplainedby

locisurpassinggenome-widecorrection,wecalculatedtheheritabilityofliabilityusing

theINDI-Vonlinetool(cnsgenomics.com/software)describedin4usinganoverall

diseaseriskof1%andasiblingrecurrenceriskof8.85.

NAHRenrichmentinsignificantnovelgeneloci

Totestifnovelsignificantloci(FDR<0.05;Table1)wereenrichedforNAHRevents,we

performedapermutationtest(n=10,000)simulatingthenulldistributionofNAHR-

mediatedCNVsforasetofrandomloci.Eachsimulationrandomlyselectednineloci

takenfromCNVsoverlappingatleast50%togenesinthegene-setburdenanalysis.

TheseninerandomlociwerematchedaccordingtoCNVcallfrequencytotheninenovel

significantlociinTable1.Wethencreatedwindowsforeachstartandendpositionfor

everyoverlappingCNVtoarandomlocus.Startpositionswereexpanded-50kband

+5kb,andendpositionswereexpanded-5kband+50kb.WeflaggedCNVsasNAHR-

mediatedwhenbothstartandendexpandedwindowsoverlappedto1kbsegmental

duplicationsobtainedfromthehg18buildoftheUCSCtablebrowser

(https://genome.ucsc.edu/cgi-bin/hgTables).Everyiterationreportedthefractionof

NAHR-mediatedCNVs;thatistheratioofCNVsflaggedasNAHRtothetotalnumberof

overlappingCNVs.WefoundandenrichmentofNAHRmediatedCNVsinsignificant

novellociwhencomparedtothenulldistribution(86%NAHR-mediated,6fold

enrichment,p=0.008).

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

67

ConsortiumMembership

WellcomeTrustCase-ControlConsortium2

ManagementCommittee:PeterDonnelly180,217,InesBarroso218,JeneferM

Blackwell219,220,ElviraBramon196,MatthewABrown221,JuanPCasas222,223,

AidenCorvin5,PanosDeloukas218,AudreyDuncanson224,JanuszJankowski225,

HughSMarkus226,ChristopherGMathew227,ColinNAPalmer228,RobertPlomin9,

AnnaRautanen180,StephenJSawcer229,RichardCTrembath227,AnanthC

Viswanathan230,231,NicholasWWood232.

DataandAnalysisGroup:ChrisCASpencer180,GavinBand180,CélineBellenguez180,

PeterDonnelly180,217,ColinFreeman180,EleniGiannoulatou180,GarrettHellenthal

180,RichardPearson180,MattiPirinen180,AmyStrange180,ZhanSu180,Damjan

Vukcevic180.

DNA,Genotyping,DataQC,andInformatics:CordeliaLangford218,InesBarroso218,

HannahBlackburn218,SuzannahJBumpstead218,PanosDeloukas218,SergeDronov

218,SarahEdkins218,MatthewGillman218,EmmaGray218,RhianGwilliam218,

NaomiHammond218,SarahEHunt218,AlagurevathiJayakumar218,JenniferLiddle

218,OwenTMcCann218,SimonCPotter218,RadhiRavindrarajah218,Michelle

Ricketts218,AvazehTashakkori-Ghanbaria218,MatthewWaller218,PaulWeston218,

PamelaWhittaker218,SaraWidaa218.PublicationsCommittee:ChristopherGMathew

227,JeneferMBlackwell219,220,MatthewABrown221,AidenCorvin5,MarkI

McCarthy233,ChrisCASpencer180.

PsychosisEndophenotypeInternationalConsortium

MariaJArranz156,234,StevenBakker101,StephanBender235,236,ElviraBramon

156,237,238,DavidACollier8,9,BenedictoCrespo-Facorro239,240,JeremyHall134,

ConradIyegbe156,AssenVJablensky241,RenéSKahn101,LubaKalaydjieva102,242,

StephenLawrie134,CathrynMLewis156,KuangLin156,DonHLinszen243,Ignacio

Mata239,240,AndrewMMcIntosh134,RobinMMurray142,RoelAOphoff80,Jim

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

68

VanOs143,156,JohnPowell156,DanRujescu81,83,MurielWalshe156,Matthias

Weisbrod236,DurkWiersma244.217

DepartmentofStatistics,UniversityofOxford,Oxford,UK.218WellcomeTrustSanger

Institute,WellcomeTrustGenomeCampus,Hinxton,Cambridge,UK.219Cambridge

InstituteforMedicalResearch,UniversityofCambridgeSchoolofClinicalMedicine,

Cambridge,UK.220TelethonInstituteforChildHealthResearch,CentreforChildHealth

Research,UniversityofWesternAustralia,Subiaco,WesternAustralia,Australia.221

DiamantinaInstituteofCancer,ImmunologyandMetabolicMedicine,Princess

AlexandraHospital,UniversityofQueensland,Brisbane,Queensland,Australia.222

DepartmentofEpidemiologyandPopulationHealth,LondonSchoolofHygieneand

TropicalMedicine,London,UK.223DepartmentofEpidemiologyandPublicHealth,

UniversityCollegeLondon,London,UK.224MolecularandPhysiologicalSciences,The

WellcomeTrust,London,UK.225PeninsulaSchoolofMedicineandDentistry,Plymouth

University,Plymouth,UK.226ClinicalNeurosciences,StGeorge'sUniversityofLondon,

London,UK.227DepartmentofMedicalandMolecularGenetics,SchoolofMedicine,

King'sCollegeLondon,Guy'sHospital,London,UK.228BiomedicalResearchCentre,

NinewellsHospitalandMedicalSchool,Dundee,UK.229DepartmentofClinical

Neurosciences,UniversityofCambridge,Addenbrooke'sHospital,Cambridge,UK.230

InstituteofOphthalmology,UniversityCollegeLondon,London,UK.231National

InstituteforHealthResearch,BiomedicalResearchCentreatMoorfieldsEyeHospital,

NationalHealthServiceFoundationTrust,London,UK.232DepartmentofMolecular

Neuroscience,InstituteofNeurology,London,UK.233OxfordCentreforDiabetes,

EndocrinologyandMetabolism,ChurchillHospital,Oxford,UK.234Fundacióde

DocènciaiRecercaMútuadeTerrassa,UniversitatdeBarcelona,Spain.235Childand

AdolescentPsychiatry,UniversityofTechnologyDresden,Dresden,Germany.236

SectionforExperimentalPsychopathology,GeneralPsychiatry,Heidelberg,Germany.

237InstituteofCognitiveNeuroscience,UniversityCollegeLondon,London,UK.238

MentalHealthSciencesUnit,UniversityCollegeLondon,London,UK.239Centro

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

69

InvestigaciónBiomédicaenRedSaludMental,Madrid,Spain.240UniversityHospital

MarquésdeValdecilla,InstitutodeFormacióneInvestigaciónMarquésdeValdecilla,

UniversityofCantabria,Santander,Spain.241CentreforClinicalResearchin

Neuropsychiatry,TheUniversityofWesternAustralia,Perth,WesternAustralia,

Australia.242WesternAustralianInstituteforMedicalResearch,TheUniversityof

WesternAustralia,Perth,WesternAustralia,Australia.243DepartmentofPsychiatry,

AcademicMedicalCenter,UniversityofAmsterdam,Amsterdam,TheNetherlands.244

DepartmentofPsychiatry,UniversityMedicalCenterGroningen,Universityof

Groningen,TheNetherlands.

Acknowledgements

DataProcessingandStatisticalanalyseswerecarriedoutontheGeneticCluster

Computer(http://www.geneticcluster.org)hostedbySURFsaraandfinancially

supportedbytheNetherlandsScientificOrganization(NWO480-05-003)alongwitha

supplementfromtheDutchBrainFoundationandtheVUUniversityAmsterdam.The

GRASdatacollectionwassupportedbytheMaxPlanckSociety,theMax-Planck-

Förderstiftung,andtheDFGCenterforNanoscaleMicroscopy&MolecularPhysiologyof

theBrain(CNMPB),Göttingen,Germany.TheBostonCIDARsubjectanddatacollection

wassupportedbytheNationalInstituteofMentalHealth(1P50MH080272,RWM;

U01MH081928,LJS;1R01MH092380,TLP)andtheMassachusettsGeneralHospital

ExecutiveCommitteeonResearch(TLP).ISC–Portugal:CNPandMTPareorhavebeen

supportedbygrantsfromtheNIMH(MH085548,MH085542,MH071681,MH061884,

MH58693,andMH52618)andtheNCRR(RR026075).CNP,MTP,andAHFareorhave

beensupportedbygrantsfromtheDepartmentofVeteransAffairsMeritReview

Program.TheDanishAarhusstudywassupportedbygrantsfromTheLundbeck

Foundation,TheDanishStrategicResearchCouncil,AarhusUniversity,andTheStanley

ResearchFoundation.WorkinCardiffwassupportedbyMRCCentre(G0800509)and

MRCProgramme(G0801418)Grants,theEuropeanCommunity'sSeventhFramework

Programme(HEALTH-F2-2010-241909(ProjectEU-GEI)),theEuropeanUnionSeventh

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

70

FrameworkProgramme(FP7/2007-2013)undergrantagreementn°279227,a

fellowshiptoJWfromtheMRC/WelshAssemblyGovernmentandtheMargaretTemple

AwardfromtheBritishMedicalAssociation.WethankNovartisfortheirinputin

obtainingCLOZUKsamples,andstaffatTheDoctor'sLaboratory(LisaLevett/Andrew

Levett)forhelpwithsampleacquisitionanddatalinkageandinCardiff(Kiran

Mantripragada/LucindaHopkins)forsamplemanagement.CLOZUKandsomeother

samplesweregenotypedattheBroadInstitute(whichhasaseparateacknowledgment)

orbytheWTCCCandWTCCC2(WT(083948/Z/07/Z).WeacknowledgeuseoftheBritish

1958BirthCohortDNA(MRC:G0000934)andtheWellcomeTrust(068545/Z/0/and

076113/C/04/Z),theUKBloodServicesCommonControls(UKBS-CCcollection),funded

bytheWT(076113/C/04/Z)andbyNIHRprogrammegranttoNHSBT(RP-PG-0310-

1002).VirginiaCommonwealthUniversity:BPRandKSKthankallthefacultyofthe

VirginiaInstituteforPsychiatricandBehavioralGeneticsforinvaluableinsightsand

discussionsovermanyyears.BSM,SAB,BTW,BW,KSKandBPRweresupportedby

NationalInstituteofMentalHealthgrantR01MH083094toBPR.Samplecollectionwas

supportedbypreviousfundingofNationalInstituteofMentalHealthgrantR01

MH041953toKSKandBPR.GenotypingwassupportedbyNationalInstituteofMental

HealthgrantR01MH083094toBPR,NationalInstituteofMentalHealthgrantR01

MH068881toBPRandWellcomeTrustCaseControlConsortium2grant.Wethank

NovartisfortheirinputinobtainingCLOZUKsamples,andstaffatTheDoctor's

Laboratory(LisaLevett/AndrewLevett)forhelpwithsampleacquisitionanddata

linkageandinCardiff(KiranMantripragada/LucindaHopkins)forsamplemanagement.

Ourworkwassupportedby:MedicalResearchCouncil(MRC)Centre(G0800509;

G0801418),theEuropeanCommunity'sSeventhFrameworkProgramme(HEALTH-F2-

2010-241909(ProjectEU-GEI)),theEuropeanUnionSeventhFrameworkProgramme

(FP7/2007-2013)undergrantagreementn°279227,afellowshiptoJWfromthe

MRC/WelshAssemblyGovernmentandtheMargaretTempleAwardfromtheBritish

MedicalAssociation.CLOZUKandsomeothersamplesweregenotypedattheBroad

Institute(whichhasaseparateacknowledgment)orbytheWTCCCandWTCCC2(WT

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

71

(083948/Z/07/Z).WeacknowledgeuseoftheBritish1958BirthCohortDNA(MRC:

G0000934)andtheWellcomeTrust(068545/Z/0/and076113/C/04/Z),theUKBlood

ServicesCommonControls(UKBS-CCcollection),fundedbytheWT(076113/C/04/Z)and

byNIHRprogrammegranttoNHSBT(RP-PG-0310-1002).Therecruitmentoffamiliesin

BulgariawasfundedbytheJanssenResearchFoundation,Beerse,Belgium.Weare

gratefultothestudyvolunteersforparticipatingintheJanssenresearchstudiesandto

thecliniciansandsupportstaffforenablingpatientrecruitmentandbloodsample

collection.Informedconsentwasobtainedfromallparticipantsortheirparentsor

guardians.WethankthestaffintheNeuroscienceBiomarkersGenomicLabledby

ReynaFavisatJanssenforsampleprocessingandthestaffatIlluminaforgenotyping

JanssenDNAsamples.WealsothankAnthonySantos,NicoleBottrel,Monique-Andree

Franc,WilliamCaffertyofJanssenResearch&Development)foroperationalsupport.

FundingfromtheNetherlandsOrganizationforHealthResearchandDevelopment

(ZonMw),withintheMentalHealthprogram(toGROUPconsortiumforcollecting

patientsandclinicaldata).High-DensityGenome-WideAssociationStudyOf

SchizophreniaInLargeDutchSample(R01MH078075NIH/NationalInstituteOfMental

HealthPI:RoelA.Ophoff).TheDanishCouncilforStrategicResearch(Journ.nr.09-

067048);TheDanishNationalAdvancedTechnologyFoundation(Journ.nr.001-2009-2);

TheLundbeckFoundation(Journ.nr.R24-A3243);EU7thFrameworkProgramme

(PsychGene;Grantagreementnr.218251);EU7thFrameworkProgramme(PsychDPC;

Grantagreementnr.286213).TheWellcomeTrustsupportedthisstudyaspartofthe

WellcomeTrustCaseControlConsortium2project.E.BramonholdsaMRCNew

InvestigatorAwardandaMRCCentenaryAward.TheTOPStudywassupportedbythe

ResearchCouncilofNorway(#213837,#217776,#223273),South-EastNorwayHealth

Authority(#2013-123)andK.G.JebsenFoundation.Thisworkwassupportedbythe

DonaldandBarbaraZuckerFoundation,theNorthShore–LongIslandJewishHealth

SystemFoundation,andgrantsfromtheStanleyFoundation(AKM),theNational

AllianceforResearchonSchizophreniaandDepression(AKM),andtheNIH(MH065580

toTL;MH001760toAKM).SynSys,EUFP7-242167,SigridJuseliusFoundation,The

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

72

AcademyofFinland,grantnumber:251704,SohlbergFoundation.TheSwedish

ResearchCouncil[grantnumbers2006-4472,2009-5269,2009-3413]andtheCounty

CouncilsofVästerbottenandNorrbotten,Swedensupportedthecollectionofthe

scz_umeb_eurandscz_umes_eursamples.TheBetulaStudy,fromwhichtheUmea

controlswererecruited,issupportedbygrantsfromtheSwedishResearchCouncil

[grantnumbers345-2003-3883,315-2004-6977]andtheBankofSwedenTercentenary

Foundation,theSwedishCouncilforPlanningandCoordinationofResearch,the

SwedishCouncilforResearchintheHumanitiesandSocialSciencesandtheSwedish

CouncilforSocialResearch.TheGRAS(GöttingenResearchAssociationfor

Schizophrenia)datacollectionhasbeensupportedbytheMaxPlanckSociety,theMax

PlanckFörderstiftung,andtheDFG(CNMPB).WethankallGRASpatientsfor

participatinginthestudy,andallthemanycolleagueswhohavecontributedoverthe

past10yearstotheGRASdatacollection.WeacknowledgesupportfromtheNorth

Shore–LIJHealthSystemFoundationandNIHgrantsRC2MH089964andR01

MH084098.WeacknowledgesupportfromNIMHK01MH085812(PIKeller)andNIMH

R01MH100141(PIKeller).EGCUTworkwassupportedbytheTargetedFinancingfrom

theEstonianMinistryofScienceandEducation[SF0180142s08];theUSNational

InstituteofHealth[R01DK075787];theDevelopmentFundoftheUniversityofTartu

(grantSP1GVARENG);theEuropeanRegionalDevelopmentFundtotheCentreof

ExcellenceinGenomics(EXCEGEN;grant3.2.0304.11-0312);andthroughFP7grant

313010.MilanMacekwassupportedbyCZ.2.16/3.1.00/24022OPPK,NT/13770–4and

00064203FNMotol.Forthescz_tcr1_asndatasetfundingfromtheNationalMedical

ResearchCouncil(Grant:NMRC/TCR/003/2008)andtheBiomedicalResearchCouncil,

A*STARisacknowledged.GenotypingoftheSwedishHubinsamplewasperformedby

theSNP&SEQTechnologyPlatforminUppsala,whichissupportedbyUppsala

University,UppsalaUniversityHospital,ScienceforLifeLaboratory-Uppsalaandthe

SwedishResearchCouncil(Contracts80576801and70374401).TheSwedishHubin

samplewassupportedbySwedishResearchCouncil(IA,EGJ)andtheregional

agreementonmedicaltrainingandclinicalresearchbetweenStockholmCountyCouncil

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

73

andtheKarolinskaInsititutet(EGJ).B.J.M.,V.J.C.,R.J.S.,S.V.C.,F.A.H.,A.V.J.,C.M.L.,

P.T.M.,C.P.,andU.S.weresupportedbytheAustralianSchizophreniaResearchBank,

whichissupportedbyanEnablingGrantfromtheNationalHealthandMedicalResearch

Council(Australia)[No.386500],thePrattFoundation,RamsayHealthCare,theViertel

CharitableFoundationandtheSchizophreniaResearchInstituteandtheNSW

DepartmentofHealth.C.P.issupportedbyaSeniorPrincipalResearchFellowshipfrom

theNationalHealthandMedicalResearchCouncil(Australia).Weacknowledgethehelp

of:JohannaBadcock,LindaBradbury,JasonBridge,DavidChandler,JanellCollins-

Langworthy,TrishCollinson,MilanDragovic,CherylFilippich,DavidHawkes,Danielle

Lowe,KathrynMcCabe,TamaraMacDonald,BarryMaher,BhartiMorarMarcSeal,

HeatherSmith,MelissaTooney,PaulTooney,andMelindaZiino.TheDanishAarhus

studywassupportedbygrantsfromTheLundbeckFoundation,TheDanishStrategic

ResearchCouncil,AarhusUniversity,andTheStanleyResearchFoundation.ThePerth

samplecollectionwasfundedbyAustralianNationalHealthandMedicalResearch

CouncilprojectgrantsandtheAustralianSchizophreniaResearchBank.The

Bonn/Mannheimsamplewasgenotypedwithinastudythatwassupportedbythe

GermanFederalMinistryofEducationandResearch(BMBF)throughtheIntegrated

GenomeResearchNetwork(IG)MooDS(SystematicInvestigationoftheMolecular

CausesofMajorMoodDisordersandSchizophrenia;grant01GS08144toM.M.N.and

S.C.,grant01GS08147toM.R.),undertheauspicesoftheNationalGenomeResearch

Networkplus(NGFNplus),andthroughtheIntegratedNetworkIntegraMent(Integrated

UnderstandingofCausesandMechanismsinMentalDisorders),undertheauspicesof

thee:MedProgramme.(GSKcontrolsample;Müller-Myhsok).Thisworkhasbeen

fundedbytheBavarianMinistryofCommerceandbytheFederalMinistryofEducation

andResearchintheframeworkoftheNationalGenomeResearchNetwork,

Förderkennzeichen01GS0481andtheBavarianMinistryofCommerce.M.M.N.isa

memberoftheDFG-fundedExcellence-ClusterImmunoSensation.M.M.N.alsoreceived

supportfromtheAlfriedKruppvonBohlenundHalbach-Stiftung.M.R.wasalso

supportedbythe7thFrameworkProgrammeoftheEuropeanUnion(ADAMSproject,

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

74

HEALTH-F4-2009-242257;CRESTARproject,HEALTH-2011-1.1-2)grant279227.Roche:

ThanksareexpressedtoOliviaSpleissforgreatsupportingeneticdatageneration,

DanielUmbrichtandDelphineLagardefortheirvaluablesupportinclinicalandgenetic

datasharing,andAnirvanGhoshforcontinuousencouragement.Authorsalsowishto

thankallinvestigatorsandpatientswhoparticipatedintheRocheclinicalstudies.Jo

KnightholdstheJoanneMurphyProfessorinBehaviouralScience.WethankMaria

Tampakerasforherworkonthesamples.TheStanleyCenterforPsychiatricResearchat

theBroadInstituteacknowledgesfundingfromtheStanleyMedicalResearchInstitute.

Swedishschizophreniastudy(PICMM,PFS,PS,SM):Wearedeeplygratefulforthe

participationofallsubjectscontributingtothisresearchandtothecollectionteamthat

workedtorecruitthem:E.Flordal-Thelander,A.-B.Holmgren,M.Hallin,M.Lundin,A.-K.

Sundberg,C.Pettersson,R.Satgunanthan-Dawoud,S.Hassellund,M.Rådstrom,B.

Ohlander,L.NyrénandI.Kizling.FundingsupportfortheSwedenSchizophreniaStudy

(PIsHultman,Sullivan,andSklar)wasprovidedbytheNIMH(R01MH077139toP.F.S.

andR01MH095034toP.S.),theStanleyCenterforPsychiatricResearch,theSylvan

HermanFoundation,theFriedmanBrainInstituteattheMountSinaiSchoolof

Medicine,theKarolinskaInstitutet,KarolinskaUniversityHospital,theSwedishResearch

Council,theSwedishCountyCouncil,theSöderströmKönigskaFoundation.We

acknowledgeuseofDNAfromTheUKBloodServicescollectionofCommonControls

(UKBScollection),fundedbytheWellcomeTrustgrant076113/CI04/Z,bytheJuvenile

DiabetesResearchFoundationgrantWT0618S8,andbytheNationalInstituteofHealth

ResearchofEngland.ThecollectionwasestablishedaspartoftheWellcomeTrustCase-

ControlConsortium.Wethankthestudyparticipants,andtheresearchstaffatthestudy

sites.ThisstudywassupportedbyNIMHgrantR01MH062276(toDFLevinson,C

Laurent,MOwenandDWildenauer),grantR01MH068922(toPVGejman),grant

R01MH068921(toAEPulver)andgrantR01MH068881(toBRiley).Theauthorsare

gratefultothemanyfamilymemberswhoparticipatedinthestudiesthatrecruited

thesesamples,tothemanyclinicianswhoassistedintheirrecruitment.Inadditionto

thesupportacknowledgedfortheMulticenterGeneticsStudiesofSchizophreniaand

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

75

MolecularGeneticsofSchizophreniastudies,Dr.DFLevinsonreceivedadditional

supportfromtheWalterE.Nichols,M.D.,ProfessorshipintheSchoolofMedicine,the

EleanorNicholsEndowment,theWalterF.&RachaelL.NicholsEndowmentandthe

WilliamandMaryMcIvorEndowment,StanfordUniversity.Thisstudywassupportedby

NIHR01grants(MH67257toN.G.B.,MH59588toB.J.M.,MH59571toP.V.G.,MH59565

toR.F.,MH59587toF.A.,MH60870toW.F.B.,MH59566toD.W.B.,MH59586toJ.M.S.,

MH61675toD.F.L.,MH60879toC.R.C.,andMH81800toP.V.G.),NIHU01grants

(MH46276toC.R.C.,MH46289toC.Kaufmann,MH46318toM.T.Tsuang,MH79469to

P.V.G.,andMH79470toD.F.L.),theGeneticAssociationInformationNetwork(GAIN),

andbyThePaulMichaelDonovanCharitableFoundation.Genotypingwascarriedoutby

theCenterforGenotypingandAnalysisattheBroadInstituteofHarvardandMIT(S.

GabrielandD.B.Mirel),whichissupportedbygrantU54RR020278fromtheNational

CenterforResearchResources.GenotypingofhalfoftheEAsampleandalmostallthe

AAsamplewascarriedoutwithsupportfromGAIN.TheGAINqualitycontrolteam(G.R.

AbecasisandJ.Paschall)madeimportantcontributionstotheproject.WethankS.

PurcellforassistancewithPLINK.We(DRW,RS)thankthestaffoftheLieberInstitute

andtheClinicalBrainDisordersBranchoftheIRP,NIMHfortheirassistanceindata

collectionandmanagement.WethankNingpingFengandBhaskarKolachanafor

IlluminagenotypingandformanagingDNAstocks.Theworkwassupportedbythe

LieberInstituteandbydirectNIMHIRPfundingoftheWeinbergerLab.Pfizerisvery

gratefultothestudyvolunteersforparticipatinginourresearchstudies.Wethankour

numerouscliniciansandsupportstaffforenablingpatientrecruitment,bloodsample

collection,andbiospecimenadministration.Informedconsentwasobtainedfromall

participants,theirparentsorguardians.EliLillyisgratefultotheparticipantsofclinical

trialsandresearchstudieswhogaveconsentforparticipationinthisstudy.Wearealso

gratefultoPhilipJEbertandJeffreySArnoldforfacilitatingourparticipationinthis

project.WeacknowledgetheIrishcontributiontotheInternationalSchizophrenia

Consortium(ISC)study,theWTCCC2SCZstudy&WTCCC2controlsfromthe1958BCand

UKNBS,theScienceFoundationIreland(08/IN.1/B1916).WethanktheTorontoCentre

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;

76

forAppliedGenomicsfortechnicalandcomputationalassistanceandfundingfromthe

UniversityofTorontoMcLaughlinCentreandGenomeCanada.S.W.S.holdsthe

GlaxoSmithKline-CIHRChairinGenomeSciencesattheHospitalforSickChildrenand

UniversityofToronto.WeacknowledgeuseoftheTrinityBiobanksamplefromtheIrish

BloodTransfusionService&theTrinityCentreforHighPerformanceComputing.

FundingforthisstudywasprovidedbytheWellcomeTrustCaseControlConsortium2

project(085475/B/08/Zand085475/Z/08/Z),theWellcomeTrust(072894/Z/03/Z,

090532/Z/09/Zand075491/Z/04/B),NIMHgrants(MH41953andMH083094)and

British1958BirthCohortDNAcollectionfundedbytheMedicalResearchCouncil(grant

G0000934)andtheWellcomeTrust(grant068545/Z/02)andoftheUKNationalBlood

ServicecontrolsfundedbytheWellcomeTrust.WeacknowledgeHongKongResearch

GrantsCouncilprojectgrantsGRF774707M,777511M,776412Mand776513M.

Supplementarydatareferences

1. Desachy,G.etal.Increasedfemaleautosomalburdenofrarecopynumber

variantsinhumanpopulationsandinautismfamilies.MolPsychiatry20,170-5(2015).

2. Zuberi,K.etal.GeneMANIApredictionserver2013update.NucleicAcidsRes41,W115-22(2013).

3. Fromer,M.etal.Denovomutationsinschizophreniaimplicatesynapticnetworks.Nature506,179-84(2014).

4. Witte,J.S.,Visscher,P.M.&Wray,N.R.Thecontributionofgeneticvariantstodiseasedependsontheruler.NatRevGenet15,765-76(2014).

5. Sullivan,P.F.,Kendler,K.S.&Neale,M.C.Schizophreniaasacomplextrait:evidencefromameta-analysisoftwinstudies.Arch.Gen.Psychiatry.60,1187-1192(2003).

.CC-BY-NC-ND 4.0 International licensepeer-reviewed) is the author/funder. It is made available under aThe copyright holder for this preprint (which was not. http://dx.doi.org/10.1101/040493doi: bioRxiv preprint first posted online Feb. 23, 2016;