14
Ecology and Evolution. 2018;1–14. | 1 www.ecolevol.org Received: 26 January 2018 | Revised: 14 March 2018 | Accepted: 29 March 2018 DOI: 10.1002/ece3.4147 ORIGINAL RESEARCH Double-digest RADseq loci using standard Illumina indexes improve deep and shallow phylogenetic resolution of Lophodermium, a widespread fungal endophyte of pine needles Rodolfo Salas-Lizana | Ryoko Oono This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2018 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and Marine Biology, University of California, Santa Barbara, California Correspondence Ryoko Oono, Department of Ecology, Evolution, and Marine Biology, University of California-Santa Barbara, Santa Barbara, CA. Email: [email protected] Present address Rodolfo Salas-Lizana, Departamento de Biología Comparada, Facultad de Ciencias, Universidad Nacional Autónoma de México, Mexico City, Mexico Funding information University of California Institute for Mexico and the United States, Grant/Award Number: 20150989 Abstract The phylogenetic and population genetic structure of symbiotic microorganisms may correlate with important ecological traits that can be difficult to directly measure, such as host preferences or dispersal rates. This study develops and tests a low-cost double-digest restriction site-associated DNA sequencing (ddRADseq) protocol to reveal among- and within-species genetic structure for Lophodermium, a genus of fungal endophytes whose evolutionary analyses have been limited by the scarcity of informative markers. The protocol avoids expensive barcoded adapters and incorpo- rates universal indexes for multiplexing. We tested for reproducibility and function- ality by comparing shared loci from sample replicates and assessed the effects of numbers of ambiguous sites and clustering thresholds on coverage depths, number of shared loci among samples, and phylogenetic reconstruction. Errors between technical replicates were minimal. Relaxing the quality-filtering criteria increased the mean coverage depth per locus and the number of loci recovered within a sample, but had little effect on the number of shared loci across samples. Increasing cluster- ing threshold decreased the mean coverage depth per cluster and increased the num- ber of loci recovered within a sample but also decreased the number of shared loci across samples, especially among distantly related species. The combination of low similarity clustering (70%) and relaxed quality-filtering (allowing up to 30 ambiguous sites per read) performed the best in phylogenetic analyses at both recent and deep genetic divergences. Hence, this method generated sufficient number of shared ho- mologous loci to investigate the evolutionary relationships among divergent fungal lineages with small haploid genomes. The greater genetic resolution also revealed new structure within species that correlated with ecological traits, providing valuable insights into their cryptic life histories. KEYWORDS ddRADseq, genotyping by sequencing, sequencing depth

Double‐digest RADseq loci using standard Illumina indexes ......Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and ... and closely related

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Double‐digest RADseq loci using standard Illumina indexes ......Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and ... and closely related

Ecology and Evolution. 2018;1–14.  | 1www.ecolevol.org

Received:26January2018  |  Revised:14March2018  |  Accepted:29March2018DOI: 10.1002/ece3.4147

O R I G I N A L R E S E A R C H

Double- digest RADseq loci using standard Illumina indexes improve deep and shallow phylogenetic resolution of Lophodermium, a widespread fungal endophyte of pine needles

Rodolfo Salas-Lizana  | Ryoko Oono

ThisisanopenaccessarticleunderthetermsoftheCreativeCommonsAttributionLicense,whichpermitsuse,distributionandreproductioninanymedium,providedtheoriginalworkisproperlycited.©2018TheAuthors.Ecology and EvolutionpublishedbyJohnWiley&SonsLtd.

DepartmentofEcology,Evolution,andMarineBiology,UniversityofCalifornia,SantaBarbara,California

CorrespondenceRyokoOono,DepartmentofEcology,Evolution,andMarineBiology,UniversityofCalifornia-SantaBarbara,Santa Barbara,CA.Email:[email protected]

Present addressRodolfoSalas-Lizana,DepartamentodeBiologíaComparada,FacultaddeCiencias,UniversidadNacionalAutónomadeMéxico,MexicoCity,Mexico

Funding informationUniversityofCaliforniaInstituteforMexicoandtheUnitedStates,Grant/AwardNumber:20150989

AbstractThephylogeneticandpopulationgeneticstructureofsymbioticmicroorganismsmaycorrelatewith importantecological traits thatcanbedifficult todirectlymeasure,suchashostpreferencesordispersalrates.Thisstudydevelopsandtestsalow-costdouble-digest restriction site-associatedDNAsequencing (ddRADseq)protocol toreveal among- andwithin-species genetic structure forLophodermium, a genusoffungalendophyteswhoseevolutionaryanalyseshavebeenlimitedbythescarcityofinformativemarkers.Theprotocolavoidsexpensivebarcodedadaptersandincorpo-ratesuniversalindexesformultiplexing.Wetestedforreproducibilityandfunction-alitybycomparingshared loci fromsample replicatesandassessed theeffectsofnumbersofambiguoussitesandclusteringthresholdsoncoveragedepths,numberof shared loci among samples, and phylogenetic reconstruction. Errors betweentechnicalreplicateswereminimal.Relaxingthequality-filteringcriteriaincreasedthemeancoveragedepthperlocusandthenumberoflocirecoveredwithinasample,buthadlittleeffectonthenumberofsharedlociacrosssamples.Increasingcluster-ingthresholddecreasedthemeancoveragedepthperclusterandincreasedthenum-beroflocirecoveredwithinasamplebutalsodecreasedthenumberofsharedlociacrosssamples,especiallyamongdistantlyrelatedspecies.Thecombinationoflowsimilarityclustering(70%)andrelaxedquality-filtering(allowingupto30ambiguoussitesperread)performedthebestinphylogeneticanalysesatbothrecentanddeepgeneticdivergences.Hence,thismethodgeneratedsufficientnumberofsharedho-mologouslocitoinvestigatetheevolutionaryrelationshipsamongdivergentfungallineageswith smallhaploidgenomes.Thegreatergenetic resolutionalso revealednewstructurewithinspeciesthatcorrelatedwithecologicaltraits,providingvaluableinsightsintotheircrypticlifehistories.

K E Y W O R D S

ddRADseq,genotypingbysequencing,sequencingdepth

Page 2: Double‐digest RADseq loci using standard Illumina indexes ......Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and ... and closely related

2  |     SALAS- LIZANA ANd OONO

1  | INTRODUC TION

The genetic diversity and structurewithin species in combinationwithgeographicalandecologicalmetadatacanuncover importantbiogeographicalstructuring,demographichistory,andadaptivetraitslinked to ecological functions (Loveless, 1984; Manel, Schwartz,Luikart,&Taberlet,2003;Neale&Savolainen,2004;Slatkin,1987).Thegeneticrelationshipamongcloselyrelatedspecies,orphyloge-neticassociations,canalsouncover important lifehistory, suchasecologicaleventsortraitsthatinfluencediversificationandspecia-tion(Huyse,Poulin,&Théron,2005).Microbialspecies,whoselifehistories and species delimitations are particularly challenging toobserveormeasure,wouldespeciallybenefitfromgeneticsequenc-ingmethods thatcanuncoverbothmicro-andmacroevolutionarydiversity.Inthisstudy,wetestalow-costdouble-digestrestrictionsite-associatedDNAsequencing(ddRADseq)protocolthat iscom-monlyappliedforpopulation-levelSNPdiscoveriesbutalsoincreas-ingly applied for phylogenetic analyses, on a fungal genuswhoselife cycle is understudied. Lophodermium (Chevall.) is a paraphy-leticgenuswithover100namedspecies thatassociatewithdeador livingplantsworldwide (Lantz,Johnston,Park,&Minter,2011).Thosespeciesassociatedwithpineneedlesconsistofasmaller(ca.30 putative species) and closely related group (Ortiz-García etal.,2003) thatoftendominateoverotherendophyticspecies (Ganley,Brunsfeld, &Newcombe, 2004;Oono, Lefèvre, Simha, & Lutzoni,2015).Theirecologicalsignificancewithpines isunknownbeyondtheir“endophyte”status,althoughafewareconsideredpathogenic(Hanso&Drenkhan,2011)andsomeareexplicitlystudiedassapro-phytes(Osono&Hirose,2011).Lophodermiumspeciesalsoappeartohavelargewithin-speciesgeneticvariation(Deckert,Hsiang,&LarryPeterson,2002;Salas-Lizana,Santini,Adán,&Piñero,2012)withtheoccurrence ofmany cryptic species (Oono etal., 2014; Reignoux,Green,&Ennos,2014).Whileahandfulofgeneticmarkersmaysuc-cessfullydelimitpopulationsorspeciestorevealdispersalratesordemographichistories,thepopulationstructureofsomewidespreadspecieshasyettobeidentifiedwithtraditionalmultilocussequence-typingapproaches(Oonoetal.,2014;Salas-Lizanaetal.,2012).

High-throughputsequencingallowstherapidgenerationofge-nomicdata forhundredsof individuals to addressdiverseecolog-ical and evolutionary questions. As a genotyping technique thatcansequencemoreindividualsforfewerloci,RADseqhasbecome

themostwidelyused,cost-effectivemethod,particularly fornon-model species without reference genomes (reviewed in Andrews,Good,Miller,Luikart,&Hohenlohe,2016;Daveyetal.,2011).Thedouble-digest RADseq (ddRADseq; Peterson, Weber, Kay, Fisher,&Hoekstra,2012), avariationof theoriginalmethod (Bairdetal.,2008), improvesondepthofcoverageper locusbyoptimizingse-quencing effort and reducing missing genotypes. The protocol isflexible,allowingforeasyoptimizationfordifferentorganisms,ge-nome sizes, genetic diversity, and scientific questions (Mastretta-Yanes etal., 2015; Nieto-Montes de Oca etal., 2017; Recknagel,Elmer,&Meyer,2013;Zhouetal.,2014).Consequently,numerousmodificationsorimprovementsofddRADseqarebeingcontinuallyproposed (Franchini, Parera, Kautt, & Meyer, 2017; Heffelfingeretal.,2014;Recknagel,Jacobs,Herzyk,&Elmer,2015).Inthisstudy,weevaluateanunderutilizedversionofthepopularddRADseqpro-tocolthataccommodatestheless-expensiveandstandardindexedprimersformultiplexing(Kess,Gross,Harper,&Boulding,2016).

Double-digest RADseq consists of digestion of the genomicDNA with two different restriction enzymes (typically one withhigh-frequency site and one with low-frequency site), ligation ofdigested fragments toadapters,aPCRstep for theenrichmentofligated fragmentsandattachmentof indexes, size selection,pool-ingofsamples,andfinally,sequencingofpooledfragmentsusingahigh-throughputsequencingplatform.MultiplexingforddRADseqistypicallyachievedbyligatingadapterswithuniquein-linebarcodestodigested genomicDNAand then addingunique indexesduringPCR enrichment (Figure1 & 2; Parchman etal., 2012; Petersonetal.,2012;Mastretta-Yanesetal.,2015).Barcodedligationadapt-ersneedtobesynthesizedinpairsandcanbepricey(e.g.,$140perpair),arespecifictoarestrictionenzyme,andcanthereforeonlybeappliedwithddRADseq.Forexample,preparinga96-sampleddRADlibrarywitheightbarcodedadaptersandtwelveindexesrequiresaninvestmentofover$1,000 inadaptersalone ([eight forward+onereverse]×2 complementary annealing oligos×$70=$1,260) thatcannot be used for non-ddRAD sequencing projects (unlike stan-dardindexes).Consequently,thecostofbarcodedadaptersmayim-pedethestartofnewprojectsforsmallerlaboratories.Inthisstudy,wemodifiedprotocol thatwaspreviouslyused forSNPdiscoveryin a single species by Kess etal. (2016) that lowers the upfrontcostsfrombarcodedadaptersandallowstheuseofcommonlyap-pliedcombinatorialdual-indexedbarcodesusingstandardadapters

F IGURE  1 Designcomparisonbetweenbarcodedadapters(a)andtheadaptersinthisstudycompatiblewithstandarddualindexes(b).Arrowsindicatesequencingprimerregions.Sequencingprimersforthetargetregionarecustomdesignedin(b)tooverlapwithenzymecuttingsites.P1andP2areadaptersligatedtotargetfragmentsthatmay(a)ormaynot(b)havebarcodes.P5andP7areflowcell-bindingregionsforIlluminaplatformsandarealwaysincorporatedusingPCRtotheligationadapters.SeeAppendixS2forsequencesofadaptersandprimersandcostcomparisonanalyses

Page 3: Double‐digest RADseq loci using standard Illumina indexes ......Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and ... and closely related

     |  3SALAS- LIZANA ANd OONO

compatiblewithcohesiverestrictionenzymesites,andsequencingusinglow-costcustomprimersthatincludetherestrictionsite.SeeAppendixS2fordetailedcostcomparisonstooriginalddRADproto-col.Weexploredthereproducibilityandgeneticdiversityrevealedbythelow-costddRADseqprotocolonmultiplespeciesofthegenusLophodermium.

At present, Illumina HiSeq is the most commonly used plat-form for genome resequencing albeit the short read lengths perloci (e.g.,<100bps;butMiSeq isusedbyDaviketal. (2015),Kessetal.(2016)andVivianandRn(2017)).Thisplatformhastheabilitytoproducesufficientcoveragedepthper locus,givenanappropri-atemultiplexing density, to overcome potential sequencing errorsandfalseSNPcallsaswellasproducehighnumbersofsharedlociamong samples.Thedeepcoverage canbeparticularly crucial foridentifyingheterozygosityinnonhaploidorganisms.WetestedtheprotocolontheMiSeqplatforminsteadbecausewebelievedfewersequencereadswouldstillproducesufficientnumbersofsharedlociamongsamplesforpopulationgenomicandphylogeneticanalyses,given the relatively small haploid genomes of these species (est.40–80Mbps).Wewere also interested in understanding how se-quencingdepthaffectedtheabilitytoidentifysharedlociorrecover

locirepeatedlyacrosssampleswiththelongerreads(e.g.,>100bps).Weassessedtherecoverablenumberofsharedlociamongtechnicalreplicatesamplesbydeterminingtherelationshipbetweenthenum-berofsharedlocibetweenreplicatesandtheirsequencingdepths.Furthermore,becausethecoverageperlocuswouldbecomparablylower than inHiSeq, potentially introducingmisidentified variantsintothefinaldatasetfromerrorsduringsequencing,PCR,orothertechnical modifications in the preparation of the library, we alsocomparedtheerrorratesbetweensharedlociintechnicalreplicatesamplesruneitherwithinorbetweenIlluminalanes.

The purpose of this study was twofold: (i) constructing a ro-bust phylogeny of a genetically diverse group of understudiedfungiusingarapidandlow-costddRADseqprotocolwithstandardindexesand(ii)assessingtherelationshipamongclusteringandfil-teringparameters,errorrates,andnumbersofhomologouslociformicro-andmacroevolutionaryanalyses.Wealsoevaluatedthere-coverablenumberofsharedlociwithincreasinggeneticdivergencewithin Lophodermium and how these loci might improve speciescircumscription compared to thewidely used internal transcribedspacer (ITS) rDNA locus.We demonstrate its utility on 50 fungalisolates representingpopulationsof six putative species basedon

F IGURE  2 AcomparisonoffourddRADseqprotocols.Optionalstepsareexcluded.DetailedprotocolforthisstudycanbefoundinAppendixS2

Page 4: Double‐digest RADseq loci using standard Illumina indexes ......Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and ... and closely related

4  |     SALAS- LIZANA ANd OONO

ITSphylogenetics.Weshowthatthislow-costddRADseqprotocolapplied on theMiSeq can be effectively implemented for circum-scribing more putative fungal species and finer population struc-turesthan ITSalone.Wedemonstrate itspotential tosignificantlyimprovephylogeneticresolutionfornonmodelorganismsandrevealpopulationstructurewithinwidespreadspecies,butclusteringandfilteringarestillkeybioinformaticstagesthatneedtobetestedfordifferentgroupsoforganisms.

2  | MATERIAL S AND METHODS

2.1 | Sampling, culturing, and DNA extraction

MultipleLophodermiumspeciesandindividualswerecollectedfromboth ascocarps on senescent needles and mycelial cultures fromhealthygreenneedlesofPinus spp. (seeTableS1A for geographicorigins, host species, and isolationmethods).Monosporic cultureswere obtained from ascocarps as described in Salas-Lizana etal.(2012)using2%maltextract(ME)agar.TheLophodermiumisolatesfromgreenneedlesused in this studycome fromanongoing sur-veyofendophytesofpinetrees.Greenneedleswerewashedandsurface-sterilizedasdescribedinOonoetal.(2015).Greenneedleswerecutwithasterilerazorbladeinto2-mm-longsections.Sectionswereplaced in2%MEagar slants in1.5mlmicrocentrifuge tubesinsterileconditions.Asubsetofemergingcultureswasgenotypedbysequencingtheinternaltranscribedspacer(ITS)andpartiallargesubunit(LSU)rDNAregion(hereafterITS-LSU)forpositiveidentifi-cationofaLophodermiumspp.Severalsamplescomefromthefungalcollection atNewZealandCrownResearch Institute (Scion).Withtheexceptionofthreespecies(L. australe,L. conigenum,andL. pinas-tri), the remaining specimenswere identified using a combinationofascocarpmorphology(datanotshown)andITS-LSUsequences.TheITS-LSUsequencesofL. australe and L. conigenumweresimilarenough(>97%sequencesimilarity)tobeconsideredasingleputativespeciesinthisstudy,whichisalsosuggestedbyothers(Minter,1981;Ortiz-García etal., 2003). Cultures were grown for 2–3weeks in50mlof2%MEmediaanddriedonfilterpaperundersterilecondi-tions.Between40and100mgofdriedmyceliumwasusedforDNAextractionfollowingamodifiedCTABmethod,seeAppendixS1fordetails.DNAqualitywas assessed by visualizing on 1.5% agarosegels, confirming the absence ofRNA/DNA smears.DNAquantifi-cationwasperformedonaQubit2.0 fluorometerusingtheQubitdsDNAHSAssayKit(Invitrogen,Carlsbad,CA,US).

2.2 | ITS- LSU sequencing and phylogenetic analyses

WeamplifiedtheITS-LSUrDNAregion(ITS1,5.8S,ITS2,andpartial28S)usingprimers ITS1F (Gardes&Bruns,1993) andLR3 (Hopple&Vilgalys,1994).PCRprotocolconsistedofan initialdenaturationstepat95°Cfor4min, followedby35cyclesof30sat95°C,30sat50°C,and90sat72°C,andafinalextensionat72°Cfor10min.PCRproductswerecleanedwithExoSAP-IT(USB-Affymetrix,SantaClara,CA)andsequencedusingSangertechnologyatUCBerkeley

DNA Sequencing Facility. The ITS-LSU sequences were alignedusingMAFFTv7 (Katoh&Standley,2013)withdefaultparametersfor the50sampledLophodermium individualsaloneaswellaswith55 additional reference ITS-LSU sequences from public databases(seeTableS1B for ITSaccessions).Wecompared the topologiesofunrooted trees constructed by ITS-LSU alone and all ddRAD loci.Inallcases,treeswereobtainedusingRAxMLv8.0.26(Stamatakis,2014)with1000bootstrapreplicatesundertheGTRGAMMAmodel(Tavaré,1986).

2.3 | ddRADseq library preparations

ThreetoninehundrednanogramsofDNApersample(6μltotal;50–150ng/μl)wasdouble-digestedusingtherare-cuttingEcoRI-HFandthefrequent-cuttingMseIenzymes(NewEnglandBioLabs,Ipswich,MA).IncaseswherestartingDNAconcentrationswerelow,dupli-catesamplesunderwentdigestionandligationstepsandthenwerepooled (see detailed workbench protocol for additional tips andsimilaritieswithKessetal.(2016)inAppendicesS1–S2).AllsampleswereplacedrandomlyinPCRplatesduringlibrarypreparation.TheDNAwasdigested for fourhoursat37°C,and theenzymeswereheat-killedat65°Cfor10min.DigestedDNAfragmentswereligatedtotheEcoRI-specificP1adapterandtheMseI-specificP2adapter(Figure1b,AppendixS2),whicharenotbarcodedasintheoriginalddRADseqprotocol(Petersonetal.,2012),withtheT4ligaseenzyme(NewEnglandBioLabs,Ipswich,MA).Theligationreactionconsistedofincubatingovernight(>12hs)atroomtemperature(approx.21°C)andheat killing the enzymeat 65°C for 10min. In order to elimi-nateunincorporatedadaptersandsmall(<300bps)DNAfragments,ligation reactionswerepurifiedusingAgencourtAMPureXPSPRImagneticbeads(BeckmanCoulter,Brea,CA)at0.8:1(beads:sample)volume ratio and resuspended in 35μl of 10mmol/L Tris buffer.Aunique combinationof the IlluminaNextera v2.0® dual-indexedbarcodes(P5andP7;Illumina,Inc.SanDiego,CA)wasattachedtopurifiedfragmentswith14cyclesofPCRfortworeplicatesofeachsample.IndexedPCRproductswerenormalizedandpooledinequi-molarproportions.Fragmentsizesbetween300and700bpswereselectedusingPippinPrep(SageScienceInc.,Beverly,MA)2%aga-rosecassettes.ThefinallibrarieswereanalyzedusingaTapeStationInstrument (AgilentTechnologies, SantaClara,CA) to confirm therecoveryoffragmentsontheselectedrange(seeAppendixS2forfragmentsizeprofiles).ThelibraryconcentrationswerequantifiedaspreviouslydescribedandsequencedusingIlluminaMiSeqplatformunder a251-cyclepaired-end readprotocol at the IGMGenomicsCenter,UCSanDiego.Intotal,fourlibraries,whichincludedsamplesfromotherstudies,weresequenced.SeeTableS1Cfor libraryde-scriptionandbarcodedsamples.

We tested reproducibility of the ddRAD data by sequencingtechnical replicatesof sixLophodermium strainswithinandamonglibraries.ReplicatesaresamplesfromthesamegenomicDNAthatwere processed independently (i.e., separate enzyme digestion)andtaggedwithdifferentcombinationsofP5andP7 indexes (seeTableS1Cfordetails).

Page 5: Double‐digest RADseq loci using standard Illumina indexes ......Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and ... and closely related

     |  5SALAS- LIZANA ANd OONO

With traditional Illumina sequencing primers, the sequencerdetectszerosequencediversityatthebeginningofthese librariesbecausetheyallhave identicalrestrictionsitepatterns.This initiallow-diversity impairs the MiSeq system to “maintain focus, reg-ister imagestotheclustermap,andmakeproperbasecallstode-liver high-quality data (Pub.No. 770-2013-013, Illumina technicalsupportnote).”Hence,customsequencingprimerswereusedthatincludetherestrictionsitebases(Figure1b;AppendixS2),whichal-lowsthesequencertodetecthighsequencediversityfromthebe-ginning.Theseprimersalsoallowtheusertomaximizesequencingthevariablelociandtoavoidsequencingthebarcodesorrestrictionsites(Figure1;Kessetal.,2016).

2.4 | Bioinformatics—filtering and clustering ddRADseq

Paired-endreadsweredemultiplexedbythesequencingfacilityandmergedusingPEAR(Zhang,Kobert,Flouri,&Stamatakis,2014)withdefaultparameters:aminimumoverlapof10bps,aminimumassem-bledsequencelengthof50bps,andamaximump-valueof.01fortheobserved-expectedalignmentscore,whichtestsstatisticalsig-nificanceofmergedreads.Weexcludedunassembledpairssince,onaverage,<1.58%ofreadswereunmerged(TableS1F,G),suggestingthatanalyzingmergedreadswithlowersequencingerrorswasmorevaluable tous thanretaining longer readswith lower likelihoodofidentifiablesharedhomologyacrosssamples.

Merged readswereprocessedusingpyRADv3.0.66pipeline(Eaton,2014)tofilterforqualityandpotentialparalogs(morede-tailbelow).NucleotidebaseswithPhredQ-scores<20(i.e.,accu-racyofthebasecall is<99%)werechangedtoan“N”characterand reads were excluded (i.e., quality-filtered) based on the al-lowablenumberofNsper read.Wetesteda rangeofallowablenumberofNsperreadfrom5to30(TableS1E),whichrepresentsapproximately1.8%–10.7%ofnucleotidesper readgivenanav-erage assembled sequence length of 281 bps (TableS1F). Thedefault of 4Ns (appropriate for shortHiSeq reads)would havebeentoostringentforthelongerMiSeqreads.Thequality-filteredreadswerethenclusteredwithinsamplesusing70%or85%se-quencing similarity thresholds. We analyzed the effect of thenumberofallowedNsduringthefilteringsteponthenumberofclusterspersample.

Filtered reads were clustered within samples using VSEARCHv1.11.1(Rognes,Flouri,Nichols,Quince,&Mahé,2016).Wetestedarangeofclusteringthresholdsbetween70and95%sequencesim-ilarityat5%incrementsforsamplesthatwerefilteredforreadswithamaximumof15Ns(hereafterdescribedas“15N”).Wethenana-lyzed the effect of clustering thresholds on the average coveragedepth (i.e.,numberof readsper cluster)per sample.Wealsoana-lyzedreadsfilteredwithamaximumcriteriaof30Ns (i.e., “30N”),rather than15N, and clustered these reads just at 70% sequencesimilarity(hereafterdescribedas“70/30N”)to includeinourcom-parativeanalysis.Putativelociorclustersrepresentedbyonlyoneread(i.e.,singletons)werediscarded.

Aconsensussequenceiscalledforeachputativelocusorclus-terusingeithertheestimatederrorrateorthemajorityrule.Whenacluster isrepresentedbymorethanthreereads,consensusbasecallsaremadeusingonlyA/T/G/CorNbasedonasequencingerrorratethatisestimatedbymaximumlikelihoodacrossallclusters(i.e.,putativeloci)withinasample,assumingzeroheterozygositybecausefungalDNAwashaploid.Anyambiguousbasesitethatoccursmoreoften than theexpected ratebasedon theestimatederror rate iscalled“N.”Ontheotherhand,whenaclusterisrepresentedbyonlytwo reads, consensusbase calls aremadeusing themajority rule,whichchoosestheloweralphabeticalbasetocallambiguousbases(i.e.,A/T=A;A/G=A;A/C=A;andG/C=C).Themajorityrulebasecall introducesbias inthedata (e.g.,more likelytoshareAsatthesame site thanTs), but allows retentionof lowcoverage loci. Lociwith consensus sequenceswithmore than fiveNs are discarded,which helps to remove potential paralogs. Decreasing clusteringsimilarity(i.e.,from85%to70%)increasescoveragepercluster,butcanalso increasepotentialparalogswithinclustersaswellasSNPerrorrates.

TheselociwerethenclusteredamongsamplesusingVSEARCHagainandalignedusingMUSCLEv3.5(Edgar,2004).Alignedlociwerekeptinthefinaldatasetiftherewerefewerthan100SNPsand 99 insertions/deletions (indels) across samples (default val-ues).Thesecriteriadiscardlongerreads(>330bps)whosecluster-ingthreshold(e.g.,70%)producehighlyvariableloci.Becauseourdatawerealreadydemultiplexedandpaired readsweremerged,the PyRAD pipelinewas started in step two, wherewe set thedatatype to ddrad, instead of pairddrad. Summary informationfor 85/15N and 70/30N pyRAD pipeline runs can be found inTablesS1F,G.

2.5 | Phylogenetic analyses with ddRAD loci

ToexploretheusefulnessofthelongerddRADreadsforinferencesatadeeptimescale (i.e.,phylogenetics),wecomparedthegeneticdistance,measuredusing ITS-LSU rDNA region, to thenumberofsharedlocibetweenpairsof50individualsfromsixputativespeciesof Lophodermium (L. australe-L. conigenum complex, L. baculiferum, L. molitoris, L. nitens, L. pinastri, and L. sp. nov.).Clusteringthresholdsbetween55and70%sequencesimilarityareconsideredtorecon-structthemostaccuratephylogenetictopologiesforrecentspeciesdivergences (e.g.,<60millionyears)but ishighly recommended tobetestedatvariousclusteringvalues(Rubin,Ree,&Moreau,2012).AsourreadsarelongerthantypicalddRADreadssequencedontheHiSeq,wetestedclusteringat85%threshold(DaCosta&Sorenson,2014;Heffelfingeretal.,2014)allowing15Nsaswellasatalower70% threshold allowing30Ns.Hereafter, the final datasets usingthesetwocriteriaarereferredtoas85/15Nand70/30Nmatrices,respectively.Locifoundinmorethan10sampleswerekept inthefinaldataset forphylogenetic reconstruction.Wechose10as theminimum sample number per locus because the maximum sam-ple size within a putative species was eight (with the exceptionofL. nitens and L. sp. nov.,whichare likely speciescomplexeswith

Page 6: Double‐digest RADseq loci using standard Illumina indexes ......Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and ... and closely related

6  |     SALAS- LIZANA ANd OONO

multiplecrypticspecies).Therefore,aminimumsampleof10wouldmostlikelyresultintheinclusionofmorethanonespeciesforeachlocus in the final alignment.Wecompared thephylogenetic reso-lutionbetweenthe70/30Nand85/15Nmatrices.AllphylogeneticanalyseswereperformedusingRaxMLv8.0.26 (Stamatakis,2014)as described above.We also evaluated the number of shared lociwithinspeciesusingthetwocriteriaforspeciesrepresentedbyatleasteightindividuals.

Pairwisegeneticdistancesbetweensamplesandputativespe-cieswereestimatedusingKimura2-parametermodel(Kimura,1980)consideringvariableratesamongsites(α=0.5)intheITS-LSUlocusafter testingseveralevolutionmodelsusingamaximum-likelihoodcriterion (lnL) inMEGA v7 (Kumar, Stecher, & Tamura, 2016). Theeffectofgeneticdistancesonthenumberofsharedlociamongpairswasanalyzedbybest-fitmodelswithminimumnumberofparame-tersinTableCurve2Dv3(SystatSoftwareInc.).

2.6 | Reproducibility analyses

Wetestedthereproducibilityofourmolecularbenchprotocolbyestimatingvariationinbasecallsbetweenninetechnicalreplicatepairs(usingsix individualsamples)withinorbetweensequencingruns(TableS1D).Wefilteredthereplicatepairsusingboth70/30Nand85/15Ncriteriaandcalculatedtheproportionofvariablesitesbetween shared loci (i.e., number of SNPs divided by the totallength of concatenated shared loci). The variable sites betweenshared lociof replicatepairsshould represent theerror rates in-troducedduringPCR,sequencing,filtering,andclustering.Wealsotestedhowsequencingdepths(i.e.,totalnumberofreadspersam-ple) related to the totalnumberofshared locibetweenreplicatepairsafterfiltration.Wewereinterestedinunderstandingwhetherincreasingsequencingdepthhassignificanteffectsonthenumberofsharedlocibetweenreplicatepairsorontheerrorrates.

3  | RESULTS

3.1 | ddRADseq data summary

Atotalof11,901,179sequencereadswereanalyzedforthisstudy(including replicate samples) which were produced in four sepa-rate librariesconsistingof44,48,64,and68samples (TableS1C).Demultiplexed, rawsequencereadsareavailable intheNCBISRAunderBioProjectaccessionPRJNA35677.Onaverage,98.42%(SD 0.89) of all readsweremergedwith aminimum of 94.41% and amaximumof99.41%persample(TableS1F).Theaveragenumberofmerged reads per samplewas 201,715with an average length of281bps.Themeanquality (Q)scoreofmergedreadsrangedfrom35.70 to39.60 for the first251nucleotidepositions, 33.60 to31uptoposition485,andthenfrom30.90to28.70at finalposition492.ThedropinQ-scoreistypicalforreversereads(Schirmeretal.,2015). Themeanpercentageof reads that passedquality-filteringwas 90.22% and 84.47% for 30N and 15N criteria, respectively.Within-sample clustering recovered an average of 48,659 and

52,658clusterspersample(nloci)forthe70/30Nand85/15Ncrite-ria,respectively.Themeandepthspercluster,excludingsingletons,were8.16 (SD11.30) for70/30Nand7.71 (SD11.20) for85/15N,andcorrelatedhighlywithsequencingdepthpersample(FigureS1).Theaveragepercentagesofnloci (i.e., representedby theconsen-sussequenceofagivencluster)remainingafterexcludingsingletonsandthosewithmorethanfiveNs in theconsensussequence (i.e.,f1loci)were28.26%and30.03%for70/30Nand85/15Ncriteria,re-spectively.Theaveragepercentagesoflociremainingafterexclud-ingpotentialparalogs(f2loci/nloci)tocompareacrosssampleswere28.00% for 70/30N criteria and 26.71% for 85/15N. The averagenumbersoftheseremainingloci(i.e.,f2loci)sharedbyaminimumoftensamplesforphylogeneticanalyseswere566(3.87%off2loci)for70/30criteriaand569(4.04%off2loci)for85/15N.SeeTablesS1F,Gforfurtherdetails.

3.2 | Allowable Ns at filtering stage and clustering similarity thresholds

IncreasingtheallowednumberofNsperreadfrom0to30Nsstead-ilyincreasedtheproportionoffilter-passedreads(Figure3).Onav-erage,43.2%,15.5%,and9.8%ofreadshadatleastone,15,and30Nsper sample, respectively.Themarginal increase in filter-passedreads decreasedwith each additional allowed number of Ns. SeeTableS1Eforadditionalsummary.

F IGURE  3 Percentfilter-passedreadsvs.No.ofallowedNsperread.EachpointrepresentsadifferentsamplefilteredwithdifferentnumbersofNs.NucleotidebaseswithPhredQ-scores<20werechangedtoan“N”character,andreadswereexcludedbasedontheallowablenumberofNsperread.Forexample,excludingreadswithmorethanfiveNsisequivalenttoexcludingreadswithmorethan1.8%nucleotidesperreadonaveragewithaPhredQ-scorelessthan20,assuminganaveragereadlengthof281bps.Excludingreadswithmorethan30Nsisequivalenttoexcludingreadswithmorethan10.7%nucleotidesperreadonaveragewithaPhredQ-scorelessthan20,assuminganaveragereadlengthof281bps

Page 7: Double‐digest RADseq loci using standard Illumina indexes ......Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and ... and closely related

     |  7SALAS- LIZANA ANd OONO

The number of clusters per sample increased on averagewithnumberofallowedNs,regardlessofclusteringthresholds(TableS1E).Whenreadswereclusteredat70%similarity,allowingzero,15,and30 Ns per read , these parameter combinations produced, on theaverage, 32,954, 45,999, and 48,659 clusters per sample, respec-tively.When readswere clustered at85% similarity, allowing zero,15,and30Nsperread,theseparametercombinationsproduced,ontheaverage,37,124,52,658,and56124clusterspersample,respec-tively(seeTableS1E,F,andGforsummary).IncreasingthenumberofallowedNsalso increasedtheaveragecoveragedepthper locus(Figure4a),althoughtheincreasewasminimal(i.e.,<1)between15and30Ns.Thefinalnumberoflocipersampletobecomparedacrosssamples(i.e.,f2loci)increasedwithincreasingnumberofallowedNs(Figure4c).However,thisincreasewasminimalfrom15to30Ns(e.g.,1.4%increasefor70%clusteringand1.6%increasefor85%cluster-ing;Figure4c).Thequality-filteringparameterhadlittleeffectontheaveragenumberoflocipersampleusedinthephylogeneticanalyseswhenlocisharedbyfewerthan10sampleswereexcluded(Figure4d).

Clustering at lower similarity thresholds decreases the totalnumberoflocipersample(Figure4c,Figure5,TableS1E,FandG)andincreasescoveragedepthpercluster(Figure5),buttheincreaseincoveragedepthperclusterwasminimalfrom85%to70%(i.e.,<1readpercluster).Clusteringthresholdvaluescan,however,signifi-cantlyaffectthenumberofsharedlociamonglineages(TableS1H).

Therelationshipbetweensequencingdepthandthenumberofloci recovered from each sample depended on the Lophodermium lineage (Figure6). For example, whereas the loci accumulationcurveforLophodermium sp. nov.appearedtobereachingaplateauwith greater sequencing depths, other species appeared to havea higher diversity of reads per sample. To test how sequencingdepthdifferentlyaffectedtheaccumulationofnew loci fordiffer-entLophodermium lineages,wemergedsharedlociperlineageandanalyzedthelociaccumulationcurveusingrandomsamplingoflociwithreplacement.L. nitensandtheL. australe–L. conigenumcomplexhadsignificantlygreaterdiversityoflocithanL. sp. nov. or L. molitoris (Figure6c,d).SeeTableS1F,Gfordetails.

Theaveragesequencelengthperlocusdecreased(i.e.,from262to249bps)assingletons,potentialparalogs,andlociwithhighratesofambiguousnucleotideswereexcludedthroughthebioinformaticpipeline. The average sequence length of loci incorporated in thephylogeneticanalyses(i.e.,beingsharedwithatleasttensamples),

however,didnotsignificantlydecrease(i.e.,from249to236bpsfor70/30Nandfrom246to242for85/15N).

3.3 | Phylogenetic analyses

The supported phylogenetic relationships, as revealed by ITS-LSUalone,of thesixputativespeciesused in thisstudy (Figure7)wereidenticaltothoseobtainedwith55additionalreferenceLophodermium sequences frompublicdatabases (FigureS2). Thedeeperbrancheshad lower support (bootstrap values <75%) based on the 85/15Nalignment compared to either the ITS-LSU or 70/30N alignments(Figure7). The 70/30N phylogeny had the highest bootstrap sup-port inalmostallbranches (i.e.,100%bootstrapsupport), includingbranchessupportingputativespecies.The tree topology recoveredbythe70/30NmatrixwasalsohighlysimilartotheoriginalITS-LSUtree.Themaindifferencebetweenthetwotopologieswasthatthe70/30Ntreewasbetterresolvedintheterminalbranches.Thebetter-supportedbrancheswithinputativespeciesoftenrevealedinterest-ingcorrelationsbetweengenetic structureandendophyteecology.ThenewLophodermiumspeciesandL. nitensbothclearlyconsistedoftwolineages,whichmostlycorrelatedwithgeography(i.e.,northvs. southforL. sp. nov.andeastvs.westforL. nitens;seeTableS1Afordetailsongeographicoriginsofsamples).L. molitorisalsoconsistedoftwolineages,whichcorrelatedtodifferentPinushostsubgenera(i.e.,Strobusvs.Pinus).ThetwolineagesofL. australe–conigenumcomplexcouldalsobeclearlydelineatedwiththe70/30Ndataset,butdidnotcorrelatewithanyknownecologicaldifferences.

Thesummaryoffinalalignmentmatricesfor85/15Nand70/30NddRADlociisfoundinTable1.The85/15Nmatrixwasnotaninclusivesubsetofthe70/30Nmatrix,butcontained36locithatwerenotfoundinthe70/30Nmatrix.Wefoundthatthese36lociwerediscardedfromthe 70/30Nmatrix during the alignment stage because they eithercontainedmore than100SNPsacrosssamples (11/36 loci)ormorethan99indelsacrosssamples(15/36loci).Theremainingloci(10/36)wereexcludedaspotentialparalogsbecausemultiplehaplotypeswerefoundwithin a cluster.However, evenwhen these36 lociwereex-cludedfromthe85/15Nmatrixsuchthatthe70/30Nmatrixwasin-clusiveofalllociinthe85/15Nmatrix,thebootstrapsupportvaluesdidnotsignificantlyimproveforthe85/15Nmatrix(datanotshown).

The number of loci shared within-species was greater in the70/30Ncriteria than in the85/15Ncriteria (15.71%more loci, on

TABLE  1 Summaryofthethreealignmentsusedforphylogeneticanalyses

Alignment No. of LociSum length of all loci Polymorphic positionsa

Ambiguous/Gap sitesb (%)

Missing locic (%)

Mean polymorphic positions per locus

ITS-LSU 1 1,143 303 24.68 0.00 303

85/15N 2,068 529,469 54,335 77.78 76.99 26

70/30N 2,388 607,736 84,747 77.59 76.35 35

aPolymorphicpositionsonlyincludenucleotidevariants.bAmbiguous/gapsitesareeither“N”sor“-”sinthealignmentmatrix.cMissinglociwascalculatedastheproportionofabsentlocioverthetotalnumberofpotentialloci(i.e.,thenumberofsamplesmultipliedbythetotalnumberoflociinadataset).

Page 8: Double‐digest RADseq loci using standard Illumina indexes ......Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and ... and closely related

8  |     SALAS- LIZANA ANd OONO

average;TableS1F,G).ThenumberofsharedlocibetweensamplepairsandspeciespairsdecreasesasgeneticdistancesbasedonITS-LSUsequencesincrease(Figure8).Thedecayrateofthenumberofshared lociwithgeneticdistance isgreater for the85/15Nmatrixthanforthe70/30Nmatrix.Thispatterndoesnotchangewhenthenumberofsharedlociisdividedbytheaveragenumberoftotallociforpairwisecomparisons(FigureS3).

3.4 | Reproducibility

Thenumberofsharedlocibetweentechnicalreplicatepairsvariedbetween1,254and12,340,mostlikelycorrelatedtotheirrangesinsequencingdepths(Figure9a;FigureS4).Thevariationinbasecalls between shared loci of technical replicate pairs (observederrorrates)rangedfrom0.006%to0.43%forthe70/30Ndatasetand0.004%to0.33%forthe85/15Ndataset (Figure9b,FigureS4,TableS1D).Theobservederrorratesbetweenreplicatesweresignificantlygreaterfor70/30Nthanfor85/15Nanalyses(pairedt test,p < .0001), albeit aminordifference (0.0012vs.0.0007).The average estimated error rates were 0.22% and 0.24% for70/30Nand85/15Ndatasets (TableS1F,G),respectively,basedonthemaximum-likelihoodequationofLynch (2008).Theaver-ageobservederrorratewaslowerthantheestimatederrorrateforbothdatasets(p < .05for70/30Nandp < .01for85/15N).Theestimatederrorratesdidnotsignificantlycorrelatewithsequenc-ing depths for the 59 samples (data not shown), but may havedifferedamongLophodermiumlineages(Figure6).ReplicatepairsoftheL. sp. nov.hadlowererrorratesoverallwithanaverageof0.01%(n=6pairs,2replicates)whereaspairsofL. nitenshadanaverageof0.24%error rate (n=5pairs,3 replicates),but there

werenotenoughreplicatepairsforeachlineagetotestthissta-tistically.Theerrorratesbetweensameanddifferentsequencinglibrariesforthreereplicatesampleswerecomparedwithapairedone-way t test andwerenot statistically significant (p = .11) al-thoughtherewasatendencyforerrorratestobehigherbetweenlibrariesthanwithin(TableS1D).

4  | DISCUSSION

The low-cost ddRADseq, which had been previously applied forSNPvariantdiscoveryinasinglespecies(Kessetal.,2016),gener-atedasufficientnumberofhomologouslocitoconstructastronglyresolvedphylogenyformultipleputativespeciesofthewidespreadLophodermium genus.Wealso found thatgenetic structurewithinputativespeciescanoftenbecorrelatedwithgeographyordiffer-enthostspecies,butitcanalsobeobservedwithinthesamehostsandlocations,suggestingthatecologicaltraitsotherthandispersallimitationorhostspecificitycanactasbarrierstogeneticintrogres-sion.ThephylogeneticresolutionwasimprovedoverITSalone,butdependedonfilteringandclusteringparameters.Theclusteringpa-rameterwasmarkedlymoreimportantthanthefilteringparameter.

4.1 | Filtering and clustering criteria

IncreasingtheallowednumberofNsfrom0to30increasedthepro-portionof filter-passed reads,with thegreatest increasebetween0and5Ns(Figure3).ThenumberofNswasalsopositivelycorre-latedwithnumberofclusters(includingsingletons),coveragedepth(Figure4a),andnumberofloci(excludingsingletons;Figure4c),but

F IGURE  4 EffectsofthenumberofallowedNsinafilter-passedreadonthe(a)coveragedepthofclusteredlociwithmorethanoneread,(b)percentloci(f2loci)outoftotalclusteredloci(nloci)afterfilteringsingletonsandpotentialparalogs,(c)numberoftotalloci(f2loci)persamplefortwoclusteringthresholdvalues(70%and85%),and(d)numberoflocishared(f2loci)byatleasttensamplesfor70%clusteringthresholdvaluesforthe59samples(lightgraylines).Dottedandsolidlinesindicateaverageforlociclusteredwith70%and85%similarity,respectively.Grayareasrepresentstandarderror(n=59).SeealsoTableS1Efordetails

Page 9: Double‐digest RADseq loci using standard Illumina indexes ......Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and ... and closely related

     |  9SALAS- LIZANA ANd OONO

the increases in the latter twoweremodest after 5Ns. Applyingaless-stringentfilteringcriteriaonlyledtoremovalofgreaterpro-portionsof clusters later in thepipeline (Figure4b) becausemoreclusters either were singletons, had consensus sequences withgreaterthanfiveNs,orwerepotentialparalogs(i.e.,hadmoream-biguousnucleotidepositionswithinclusters).Overall,thenumberof

allowedNshadlittleeffectonthenumberofsharedlociamongourLophodermiumspeciesthatwereusedinthephylogeneticanalyses(Figure4d).ThissuggeststhatallowingareasonablenumberofNsdoes not bias the final result of this study butmaymarginally in-crease the average coverage depths formore accurate consensussequencesandfewerpotentialparalogstobeexcludedduetoam-biguoussitesinreads.

Increasing clustering thresholds from 70% to 85% or 95% in-creasesnumberoflocipersample(Figure6)butdecreasesthenum-berofsharedlociamongdifferentLophodermiumlineages(Figure8,TableS1H).Interestingly,althoughwetypicallyconsiderdeepercov-eragetobeassociatedwithdecreasederrorrates,lowerclusteringthresholds had greater error rates (TableS1D) despite, albeitmar-ginally,deepercoverage (Figure5).Theerrorrateswere,however,bothminor.Therefore,wesuggestvariousclusteringthresholdstobe tested thatminimizes error rates (i.e., compare replicates) andmaximizessharedlociatthegeneticscaleandbreadthofeachstudy.

4.2 | Phylogenetic topologies with ddRAD loci

Thephylogeneticreconstructionbasedonthe70/30Nmatrixwassignificantly better resolved in the deeper branches than by the85/15Nmatrixandalso in the terminalbranches thanby the ITS-LSUmatrix (Figure7). Thewell-resolved terminal branches of the70/30Ntreecorrespondedtopreviously identified(e.g.,L. australe major vs. cryptic; Oono etal., 2014) or potential cryptic species

F IGURE  5 Effectofclusteringthresholdsonthemeandepthperclusterandnumberofloci(f2loci),excludingsingletons,persample.Solidlineindicatesmeancoveragedepth(leftaxis)anddottedlineindicatesnumberofloci(rightaxis).Inallcases,amaximumof15Nswasallowedinthequality-filteringstep.Shadedareasrepresentstandarderror(n=59).Meandepthandnumberoflocifor70/30Nareprovidedasasinglepointwithstandarderrorbarsforcomparison.SeeTableS1Efordata

F IGURE  6 Readsvs.numberoflociat70%clustering(a&c)or85%clustering(b&d).(a&b)Effectofsequencingdepth(numberofmergedreads)persampleonthenumberofloci(f2loci)persample.Fittedlogarithmiccurvesaredisplayedforeachputativespeciestorepresentaccumulationcurves.Grayregionsindicate95%confidenceintervalsoffittedcurves.(c&d)Theaccumulationofnewlociperspecies(uniquef2loci withinputativespecies)withincreasingsequences.TherewerenoenoughsamplesorreadsforL. baculiferum or L. pinastriforcomparison

Page 10: Double‐digest RADseq loci using standard Illumina indexes ......Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and ... and closely related

10  |     SALAS- LIZANA ANd OONO

(withinL. nitens and L. sp. nov.).Forinstance,thetwowell-supportedclades within L. sp. nov. (Figure7, Figure S2) correspond to sam-ples recovered from two distinct geographical regions, NorthernCaliforniaandOregon/Washington,whichmaycorrespondtotwo

species or a single highly structured species. The 70% thresholdallowed the inclusion of highly variable loci, which increased thenumberofphylogeneticallyinformativepositionsandsharedlocibe-tweenmoredistantlyrelatedspecies(Table1,Figure8,TableS1H).

F IGURE  7 Unrootedmaximum-likelihoodtreesfornrDNAITS-LSUregionandddRADlociobtainedwithtwodifferentcombinationsofsimilaritythresholdsandnumberofallowedNsinaread(seetextfordetails).Barsnexttothetreesmapisolatesthatbelongtothesameputativespeciesorspeciescomplex.L. australe and L. conigenumarelabeledaftertheiroriginalidentifiersalthoughtheylikelyrepresentasinglespeciesandarereferredastheL. australe–L. conigenumcomplexthroughoutthismanuscript.L. australeisolatesarelabeledwithM(major), C(cryptic),orH(hybrid)basedonapopulationstructureanalysisbyOonoetal.(2014).PutativeL. molitorisisolatesarelabeledwithP(Pinus)orshadedbarslabeledStrobusbasedonhostspecies.PutativeL. nitensisolatesarelabeledE(East)orW(West)basedongeographicorigins.PutativeL. sp. nov.isolatesarelabeledN(North)orS(South)basedongeographicorigins.Numbersabovebranchesrepresentbootstrapvalues,notallbootstrapvaluesareshownforaclearerview.SeeFigureS2forITS-LSUphylogenywithadditionalreferencesequences

F IGURE  8 Numberoflociclusteredat70%or85%similaritywithatleasttensamplesperlocusthatweresharedbetweenpairsofsamplesasafunctionofgeneticdistance,basedonnrDNAITS-LSUsequences.Theshadedarearepresentswithin-speciesvariation(>97%similarity).Inset:samedatabutmergedtocomparebetweenpairsofputativespeciesusingaminimumof2speciesperlocus.Thetworegressionsrepresentfittedcurvesforeachdataset,70/30N(black)and85/15N(gray).Theequationforthefittedcurvesisy=a+b/x2,wherea=37.842,b=0.404,andr2=.404for70/30Nmatrixanda=3.949,b=0.234,andr2=.675for85/15Nmatrix

Page 11: Double‐digest RADseq loci using standard Illumina indexes ......Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and ... and closely related

     |  11SALAS- LIZANA ANd OONO

Althoughclusteringreadsorlocibysimilarityisanimperfectsolu-tionto identifyinghomologous loci,ourresultsagreewithconclu-sionsfromRubinetal.(2012),showingthatclustersthatmayincludeamixtureoforthologousandparalogouslocistillcontainsubstantialphylogeneticsignalthatcanproducecorrecttopologieswithmoder-atetohighaccuracy.

Thephylogeneticreconstructionbasedonthe85/15Nmatrixstillimprovedover thatof the ITS-LSUmatrix forwithin-speciesstruc-ture(i.e.,recentdivergence)despitelowerbootstrapvaluesatdeeperbranches.Ahigher clustering thresholdof85%significantlydimin-ished the number of shared loci across species compared to 70%.Forinstance,whileL. nitensandtheL. australe–L. conigenumcomplex(thetwomostdistantlyrelatedspecies inoursample)hadonlytensharedlociinthe85/15Nmatrix,therewere73inthe70/30Nmatrix(TableS1H).Thissevenfolddifferencecontrastswiththecloselyre-latedspeciesL. baculiferum and L. molitoris,whosenumberofsharedlociin85/15Nand70/30Nwasdifferentbytwofold(30and63loci,respectively).Thisresultisconsistentwiththeobservationthatthenumberofshared locidecreasesabruptlywithgeneticdistanceforboth thresholds but ismorepronounced for85/15N (Figure8).Asa result of using a different number of allowed Ns and clusteringthresholds,the85/15Nmatrixincluded36locithatwerenotfoundinthe70/30Nmatrix.Theseloci,however,werenotsolelyresponsibleforthelowersupportofdeeperbranchesbythe85/15Nmatrix.Asexplainedbefore, it ismorelikelythatthismatrixperformedworsebecauseoffewersharedlociacrossdistantlyrelatedspecies.

Thenumberofhomologouslociindistantlyrelatedlineagescanlikelybeimprovedwithgreatersequencingdepthsornarrowerse-lectionof sequence lengths. In this study, as the genetic distancebetween species increased, the number of shared loci decreased,althoughnotata linearrate.Hence,thisddRADseqmethod ises-peciallyeffectiveinrevealingwithin-speciesfungal(<97%ITSsim-ilarity) diversity, but is also cost-effective for among-species (e.g.,97–70%ITSsimilarity)diversity.However,restrictionsitepolymor-phism will likely limit the efficacy of ddRADseq for phylogeneticreconstruction beyond the genus and family levels (DaCosta &Sorenson, 2016). Furthermore, the mutation risk hypothesis sug-gests that fewer homologous or shared lociwill be recovered forloci with greater risk of mutations. Hence, longer MiSeq ddRADreads,whichhavegreaterlikelihoodofmutationsthatcreatenovel

restrictionsiteswithinreads,mayhavebeenlesslikelytobesharedamongspeciesthanwithinspecies(Arnold,Corbett-Detig,Hartl,&Bomblies,2013;Gautieretal.,2013).Simplemutationsinrestrictionsitesmayalso increase the read lengthvariation for longerMiSeqreadscomparedtoshorterHiSeqreads,butwesawsomesequenceswith restriction sites incorporated in themiddle of the sequence,suggesting that length variation can also be caused by inefficientenzymedigestions. Lengthvariationcausedby inefficientenzymedigestionsormutationsatrestrictionsitesmaycauselowerratesofclustering,butshouldnotbiasphylogeneticanalysesbecausegapsaretreatedasmissingdata.Trimmingreadstoshorterlengthsortar-getingshorterreadswill likelyminimizelossofclustersorlocidueto lengthvariationamongandwithinsamplesofdifferentspecies.However,using longer readsproducesmore informativesites thatmay compensate for having a reduced number of shared loci andcanpotentiallyresolvedeeperdivergences(e.g.,>300millionyears)betterthanshorterreads(Rubinetal.,2012).Longerreadscanalsobemorehelpfulfordownstreampopulationgeneticanalyses,suchasgenomemapping,haplotypereconstruction,andlinkagetoadap-tive loci.Ourcasestudyshowsthatsmallportionsof thegenomesequencedwith theMiSeq platform can be sufficient to producecomparativehomologous lociwithinhighlydiversetaxawithsmallhaploid genomes (Figures6 and 7, TableS1H). Other ddRADseqdatasets that are not based on strict nucleotide polymorphisms,such as indel or presence–absence polymorphisms, may furtherhelp the resolution ofmore distantly related lineages (DaCosta&Sorenson,2016).

4.3 | Reproducibility analysis

Theobservederrorratesbetweenreplicatesamplesthatwererunwithin or between libraries were low, averaging about one errorevery1000basepairs(e.g.,0.0012for70/30Ncriteria,TableS1D)andlowerthanexpectederrorrates.Theobservederrorrateswerealsonotsignificantlydifferentwhensampleswereruninthesameordifferent libraries, but theerror rates tended tobehigherbe-tweenlibrariesthanwithin(TableS1D).Replicatesamplesindiffer-ent librariesrepresentdifferentperiodsof laboratorypreparation(i.e., PCR group, enzyme efficiency) whichmay present variationand increasing error rates than between replicate samples in the

F IGURE  9 Effectofsequencingdepth(totalnumberofmergedreadspersample)on(a)recoveryofsame(shared)loci(f2loci)and(b)nucleotidevariation(observederrorrate)betweenreplicatepairsamplesfilteredwiththe70/30Ncriteria.Replicatesamplesrunonsamesequencinglibraryarecirclesanddifferentlibrariesaretriangles.Markersaretransparent,showingsomeapparentoverlapindatavalueswheretheyappeardarker

0

5.0 × 103

1.0 × 104

1.5 × 104

2.0 × 104

0 5.0 × 103 1.0 × 104 1.5 × 104 2.0 × 104

No. of loci in replicate 1

No.

of l

oci i

n re

plic

ate

2

Shared # of loci20004000800012000

0

5.0 × 103

1.0 × 104

1.5 × 104

2.0 × 104

0 5.0 × 103 1.0 × 104 1.5 × 104 2.0 × 104

No. of loci in replicate 1

No.

of l

oci i

n re

plic

ate

2

% nucleotidevariation betweenreplicates

0.010.100.200.40

(a) (b)

Page 12: Double‐digest RADseq loci using standard Illumina indexes ......Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and ... and closely related

12  |     SALAS- LIZANA ANd OONO

samelibrary.Furthermore,themarginalincreaseincoveragedepthanddecrease inerror ratewith increasing sequencingdepthmayhavedependedontheLophodermiumspeciesiftheyhaddifferentgenomesizes.

Increasing sequencing depth per sample is likely the bestmethod to improve reproducibility. Sequencing depths correlatewith coverage depth per cluster (Figure S1), which affects theestimation of nucleotide identity when there are discrepanciesamongreadswithinacluster.Hence,increasingsequencingdepthincreases reproducibility among samples and between librar-ies bymore accurately distinguishing PCR or sequencing errorsfrompolymorphismswithgreatercoverage(Figure9).Sequencingdepth alsohas a strongpositive correlationwithnumberof locipersample(Figure6)andthereforeincreasestheprobabilityofse-quencingsharedlociamongsamples.Narrowingtherangeofreadlengthstobesequencedduringpreparationofthepooledlibrarywould also help improve coverage.However, aswe saw, deepersequencing or greater coverage per locus (i.e., lower error rateswithin locus)wasnotnecessarilyneededfor improvingphyloge-neticassessmentswithinthisgenus.Clusteringatalowerthresh-oldhadthegreatesteffectonimprovingphylogeneticresolutionbyidentifyingmoresharedlociacrossdistantlyrelatedspecies.Afine-scaleanalysisforpopulationstructureanddiversitymayre-quirebetterresolutionofnucleotidevariationwithinclusterswithgreatercoveragedepths.Fortheidentificationofheterozygosityin nonhaploid organisms or homologous loci of larger genomes(e.g., >100Mbps), deeper sequencing depths will be necessarywithadditionalmodifications, suchasadapters that include ran-dom degenerate sites for identifying PCR duplicates (Hoffbergetal., 2016), a narrower selection of read lengths, multiplexingfewersamplesperlibrary,orapplyingthisprotocolontheHiSeq.Thisstudysuggeststhatforhaploidfungithathaverelativelysmallgenome sizes (30-50Mbps; Tavares etal., 2014; Gregory etal.,2007),increasingsequencingdepthsbeyond200kpersamplewillsufficientlydecreasetheerrorratesper locusandwillberobustforfine-scalegeneticanalyses.

5  | CONCLUSIONS

Thelow-costddRADseqprotocolusingstandardindexes(Kessetal.,2016) produced sufficient numbers of loci to resolve the phyloge-neticrelationshipsofadiversegenusoffungalendophytesatlowercosts(seeAppendixS2forcostcomparisonanalysis)thantheoriginalddRADprotocols.Specialattention isneeded,however, to identifyappropriatefilteringandclusteringparameters.Althoughclusteringthresholdssignificantlyaffectedthephylogeneticresolution,quality-filteringhadlittleimpact.Reproducibilityandcoveragedepthswerelinked to sequencing depths, but high coverage depths (e.g., >10typicallyfoundinHiSeqdataanalyses)werenotessentialforstrongphylogeneticsupportinthistaxonomicgroupwitharelativelysmallhaploid genome. The use of longer reads may have reduced the

numberofsharedloci,butthemarginalincreaseinphylogeneticallyinformativesitesperreadmaycompensateforthisdisadvantage.

ACKNOWLEDG MENTS

The IlluminaMiSeq runswereconductedat the IGMGenomicsCenter, University of California, San Diego La Jolla, CA. Wewould especially like to thank Kristen Jepsen,Director of IGMGenomics Center, for consultations on designing primers andadapters. Some analyses were performed using the Pinaceaeserver at InstitutodeBiología,UNAM (CONACyTNo.221694),andCONABIO’scomputingcluster,whichwaspartiallyfundedbySecretaríadeMedioAmbienteyRecursosNaturales(SEMARNAT)through the grant “Contribución de la Biodiversidad para elCambioClimático”toComisiónNacionalparaelConocimientoyUsodelaBiodiversidad(CONABIO).Wealsothankthereviewersfortheircarefulassessmentofpreviousdrafts.Finally,wethanktheUCMEXUS-CONACyTPostdoctoralFellowshipsProgramforfundingtoRS-LandCollaborativeGrantNo.20150989toRO.

CONFLIC T OF INTERE S T

NoneDeclared.

AUTHORS’ CONTRIBUTIONS

RS-LandROconceivedtheideasanddesignedmethodology;RS-LandROcollectedthedata;RS-LandROanalyzedthedata;RS-LandROwrote themanuscript.All authors contributed critically to thedraftsandgavefinalapprovalforpublication.

DATA ACCE SSIBILIT Y

DataandRcodesusedforFigure6areavailableatDataOnehttps://doi.org/10.15146/r31m36. Phylogenetic trees and alignment filesareavailableatTreeBASES22555.

ORCID

Ryoko Oono http://orcid.org/0000-0003-4859-0030

Rodolfo Salas-Lizana http://orcid.org/0000-0002-0761-2910

R E FE R E N C E S

Andrews,K.R.,Good,J.M.,Miller,M.R.,Luikart,G.,&Hohenlohe,P.A.(2016).HarnessingthepowerofRADseqforecologicalandevolu-tionarygenomics.Nature Reviews Genetics,17(2),81–92.https://doi.org/10.1038/nrg.2015.28

Arnold,B.,Corbett-Detig,R.B.,Hartl,D.,&Bomblies,K.(2013).RADsequnderestimatesdiversityandintroducesgenealogicalbiasesduetononrandom haplotype sampling.Molecular Ecology, 22(11), 3179–3190.https://doi.org/10.1111/mec.12276

Baird,N.A.,Etter,P.D.,Atwood,T.S.,Currey,M.C.,Shiver,A.L.,Lewis,Z. A., … Johnson, E. A. (2008). Rapid SNP discovery and genetic

Page 13: Double‐digest RADseq loci using standard Illumina indexes ......Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and ... and closely related

     |  13SALAS- LIZANA ANd OONO

mapping using sequenced RAD markers. PLoS One, 3(10), e3376.https://doi.org/10.1371/journal.pone.0003376

DaCosta,J.M.,&Sorenson,M.D.(2014).Amplificationbiasesandcon-sistentrecoveryoflociinadouble-digestRAD-seqprotocol.PLoS One,9(9),e106713.https://doi.org/10.1371/journal.pone.0106713

DaCosta, J. M., & Sorenson, M. D. (2016). ddRAD-Seq phylogenet-ics based on nucleotide, indel, and presence-absence polymor-phisms: Analyses of two avian genera with contrasting histories.Molecular Phylogenetics and Evolution,94(Pt4),122–135.https://doi.org/10.1016/j.ympev.2015.07.026

Davey,J.W.,Hohenlohe,P.A.,Etter,P.D.,Boone,J.Q.,Catchen,J.M.,&Blaxter,M.L.(2011).Genome-widegeneticmarkerdiscoveryandge-notypingusingnext-generationsequencing.Nature Reviews Genetics,12(7),499–510.https://doi.org/10.1038/nrg3012

Davik,J.,Sargent,D.J.,Brurberg,M.B.,Lien,S.,Kent,M.,&Alsheikh,M. (2015). A ddRAD based linkage map of the cultivated straw-berry, Fragaria xananassa. PLoS One, 10(9), e0137746. https://doi.org/10.1371/journal.pone.0137746

Deckert,R.J.,Hsiang,T.,&LarryPeterson,R.(2002).Geneticrelation-ships of endophytic Lophodermium nitens isolates from needles ofPinus strobus. Mycological Research, 106(3), 305–313. https://doi.org/10.1017/S0953756201005494

Eaton,D.A. R. (2014). PyRAD:Assembly of de novoRADseq loci forphylogenetic analyses. Bioinformatics, 30(13), 1844–1849. https://doi.org/10.1093/bioinformatics/btu121

Edgar, R. C. (2004).MUSCLE:Multiple sequence alignment with highaccuracyandhighthroughput.Nucleic Acids Research,32(5),1792–1797.https://doi.org/10.1093/nar/gkh340

Franchini,P.,Parera,D.M.,Kautt,A.F.,&Meyer,A.(2017).quaddRAD:Anewhigh-multiplexingandPCRduplicateremovalDDRADproto-colproducesnovelevolutionaryinsightsinanonradiatingcichlidlin-eage. Molecular Ecology,26(10),2783–2795.https://doi.org/10.1111/mec.14077

Ganley,R.J.,Brunsfeld,S.J.,&Newcombe,G.(2004).Acommunityofunknown,endophyticfungiinwesternwhitepine.Proceedings of the National Academy of Sciences of the United States of America,101(27),10107–10112.https://doi.org/10.1073/pnas.0401513101

Gardes,M.,&Bruns,T.D.(1993).ITSprimerswithenhancedspecificityforbasidiomycetes-application to the identificationofmycorrhizaeandrusts.Molecular Ecology,2(2),113–118.https://doi.org/10.1111/j.1365-294X.1993.tb00005.x

Gautier,M., Gharbi, K., Cezard, T., Foucaud, J., Kerdelhué, C., Pudlo,P., … Estoup, A. (2013). The effect of RAD allele dropout on theestimation of genetic variation within and between populations.Molecular Ecology,22(11),3165–3178.https://doi.org/10.1111/mec. 12089

Gregory,T.R.,Nicol,J.A.,Tamm,H.,Kullman,B.,Kullman,K.,Leitch,I.J.,…Bennett,M.D.(2007).Eukaryoticgenomesizedatabases.Nucleic Acids Research,35,D332–D338.https://doi.org/10.1093/nar/gkl828

Hanso, M., & Drenkhan, R. (2011). Lophodermium needle cast, insectdefoliation and growth responsesof young scots pines in Estonia.Forest Pathology,42(2),124–135.

Heffelfinger, C., Fragoso, C. A., Moreno, M. A., Overton, J. D.,Mottinger, J. P., Zhao, H., … Dellaporta, S. L. (2014). Flexibleand scalable genotyping-by-sequencing strategies for popula-tion studies. BMC Genomics, 15 (November), 979. https://doi.org/10.1186/1471-2164-15-979

Hoffberg, S. L., Kieran, T. J., Catchen, J.M., Devault, A., Faircloth, B.C.,Mauricio,R.,&Glenn,T.C. (2016).RADcap:Sequencecaptureofdual-digestRADseqlibrarieswith identifiableduplicatesandre-ducedmissingdata.Molecular Ecology Resources,16(5),1264–1278.https://doi.org/10.1111/1755-0998.12566

Hopple, J. S., & Vilgalys, R. (1994). Phylogenetic relationships amongCoprinoid taxa and allies based on data from restriction site

mapping of nuclear rDNA. Mycologia, 86(1), 96. https://doi.org/10.2307/3760723

Huyse,T.,Poulin,R.,&Théron,A.(2005).Speciationinparasites:Apop-ulation genetics approach. Trends in Parasitology, 21(10), 469–475.https://doi.org/10.1016/j.pt.2005.08.009

Katoh, K., & Standley, D.M. (2013).MAFFTmultiple sequence align-ment software version 7: Improvements in performance and us-ability.Molecular Biology and Evolution,30(4), 772–780.https://doi.org/10.1093/molbev/mst010

Kess,T.,Gross,J.,Harper,F.,&Boulding,E.G.(2016).Low-costddRADmethodofSNPdiscoveryandgenotypingappliedtotheperiwinkleLittorina Saxatilis. The Journal of Molluscan Studies,82(1),104–109.

Kimura,M. (1980).A simplemethod for estimating evolutionary ratesofbasesubstitutionsthroughcomparativestudiesofnucleotidese-quences.Journal of Molecular Evolution,16(2),111–120.https://doi.org/10.1007/BF01731581

Kumar, S., Stecher,G.,&Tamura,K. (2016).MEGA7:Molecular evolu-tionarygenetics analysis version7.0 forbiggerdatasets.Molecular Biology and Evolution, 33(7), 1870–1874. https://doi.org/10.1093/molbev/msw054

Lantz,H., Johnston, P. R., Park,D.,&Minter,D.W. (2011).MolecularphylogenyrevealsacorecladeofRhytismatales.Mycologia,103(1),57–74.https://doi.org/10.3852/10-060

Loveless,M.(1984).Ecologicaldeterminantsofgeneticstructureinplantpopulations.Annual Review of Ecology and Systematics,15(1),65–95.https://doi.org/10.1146/annurev.es.15.110184.000433

Lynch,M.(2008).Estimationofnucleotidediversity,disequilibriumcoef-ficients,andmutationratesfromhigh-coveragegenome-sequencingprojects.Molecular Biology and Evolution,25(11),2409–2419.https://doi.org/10.1093/molbev/msn185

Manel,S.,Schwartz,M.K.,Luikart,G.,&Taberlet,P.(2003).Landscapegenetics: Combining landscape ecology and population genetics.Trends in Ecology & Evolution,18(4),189–197.https://doi.org/10.1016/S0169-5347(03)00008-9

Mastretta-Yanes,A.,Arrigo,N.,Alvarez,N.,Jorgensen,T.H.,Piñero,D.,&Emerson,B.C.(2015).Restrictionsite-associateddnasequencing,genotypingerrorestimationanddenovoassemblyoptimizationforpopulationgeneticinference.Molecular Ecology Resources,15(1),28–41.https://doi.org/10.1111/1755-0998.12291

Minter, D.W. (1981). Lophodermium on pines.Mycological Papers,147,1–54.

Neale,D.B.,&Savolainen,O. (2004).Associationgeneticsofcomplextraitsinconifers.Trends in Plant Science,9(7),325–330.https://doi.org/10.1016/j.tplants.2004.05.006

Nieto-Montes de Oca, A., Barley, A. J., Meza-Lázaro, R. N., García-Vázquez, U. O., Zamora-Abrego, J. G., Thomson, R. C., & Leaché,A.D. (2017).Phylogenomics and speciesdelimitation in theknob-scaled lizards of the genus xenosaurus (Squamata: Xenosauridae)usingddRADseqdatarevealasubstantialunderestimationofdiver-sity.Molecular Phylogenetics and Evolution,106,241–253.https://doi.org/10.1016/j.ympev.2016.09.001

Oono,R., François Lutzoni,A.,Arnold, E.,Kaye, L.,U’Ren, J.M.,May,G.,&Carbone, I. (2014).Geneticvariation inhorizontally transmit-tedfungalendophytesofpineneedlesrevealspopulationstructurein cryptic species. American Journal of Botany, 101(8), 1362–1374.https://doi.org/10.3732/ajb.1400141

Oono,R.,Lefèvre,E.,Simha,A.,&Lutzoni,F.(2015).Acomparisonofthecommunitydiversityof foliar fungalendophytesbetweenseedlingand adult loblolly pines (Pinus taeda).Fungal Biology,119(10), 917–928.https://doi.org/10.1016/j.funbio.2015.07.003

Ortiz-García,S.,Gernandt,D.S.,Stone,J.K.,Johnston,P.R.,Chapela,I.H.,Salas-Lizana,R.,&Alvarez-Buylla,E.R.(2003).PhylogeneticsofLophodermiumfromPine.Mycologia,95(5),846–859.https://doi.org/10.1080/15572536.2004.11833044

Page 14: Double‐digest RADseq loci using standard Illumina indexes ......Ecology and Evolution published by John Wiley & Sons Ltd. Department of Ecology, Evolution, and ... and closely related

14  |     SALAS- LIZANA ANd OONO

Osono,T.,&Hirose,D.(2011).Colonizationandlignindecompositionofpine needle litter by lophodermium Pinastri. Forest Pathology,41(2),156–162.https://doi.org/10.1111/j.1439-0329.2010.00648.x

Parchman,T.L.,Gompert,Z.,Mudge,J.,Schilkey,F.D.,Benkman,C.W.,&AlexBuerkle,C.(2012).Genome-wideassociationgeneticsofanadaptive trait in lodgepole pine.Molecular Ecology, 21(12), 2991–3005.https://doi.org/10.1111/j.1365-294X.2012.05513.x

Peterson,B.K.,Weber,J.N.,Kay,E.H.,Fisher,H.S.,&Hoekstra,H.E.(2012).DoubledigestRADseq:AninexpensivemethodfordenovoSNPdiscoveryandgenotypinginmodelandnon-modelspecies.PLoS One,7(5),e37135.https://doi.org/10.1371/journal.pone.0037135

Recknagel,H., Elmer,K.R.,&Meyer,A. (2013).A hybrid genetic link-agemap of two ecologically andmorphologically divergent midascichlidfishes(AmphilophusSpp.)obtainedbymassivelyparallelDNAsequencing (ddRADSeq). G3: Genes, Genomes, Genetics, 3, 65–74.https://doi.org/10.1534/g3.112.003897

Recknagel,H.,Jacobs,A.,Herzyk,P.,&Elmer,K.R.(2015).Double-digestRADsequencingusingionprotonsemiconductorplatform(ddRADseq-Ion) with nonmodel organisms.Molecular Ecology Resources, 15(6),1316–1329.https://doi.org/10.1111/1755-0998.12406

Reignoux, S.N.A.,Green, S.,&Ennos,R.A. (2014).Molecular identi-ficationandrelativeabundanceofcryptic lophodermium species innatural populationsof scotspine,Pinus Sylvestris L.Fungal Biology,118(9–10),835–845.https://doi.org/10.1016/j.funbio.2014.07.002

Rognes,T.,Flouri,T.,Nichols,B.,Quince,C.,&Mahé,F.(2016).VSEARCH:A versatile open source tool formetagenomics.PeerJ,4(October),e2584.https://doi.org/10.7717/peerj.2584

Rubin, B. E. R., Ree, R.H., &Moreau, C. S. (2012). Inferring phyloge-nies fromRADsequencedata.PLoS One,7(4),e33394.https://doi.org/10.1371/journal.pone.0033394

Salas-Lizana,R., Santini,N. S.,Adán,M.-P.,&Piñero,D. I. (2012). ThePleistoceneglacialcyclesshapedthehistoricaldemographyandphy-logeographyofapinefungalendophyte.Mycological Progress,11(2),569–581.https://doi.org/10.1007/s11557-011-0774-x

Schirmer,M., Ijaz,U.Z.,D’Amore,R.,Hall,N., Sloan,W.T.,&Quince,C.(2015).Insightintobiasesandsequencingerrorsforampliconse-quencingwith the illuminaMiSeqplatform.Nucleic Acids Research,43(6),e37.https://doi.org/10.1093/nar/gku1341

Slatkin,M. (1987).Gene flow and the geographic structure of naturalpopulations.Science,236(4803),787–792.https://doi.org/10.1126/science.3576198

Stamatakis,A.(2014).RAxMLversion8:Atoolforphylogeneticanalysisandpost-analysisof largephylogenies.Bioinformatics,30(9),1312–1313.https://doi.org/10.1093/bioinformatics/btu033

Tavaré,S.(1986).Someprobabilisticandstatisticalproblemsintheanalysisofdnasequences.Lectures on Mathematics in the Life Sciences.books.google.com. https://books.google.com/books?hl=en&lr=&id=8aI1ph-hOKhgC&oi=fnd&pg=PA57&dq=Some+probabilistic+statisti-cal+problems+analysis+DNA+sequences+Tavar%C3%A9&ots=rm-SE8LC8Ti&sig=YAsGLrZt9LcQQIa2_dyW8njcPGw.

Tavares,S.,Ramos,A.P.,Pires,A.S.,Azinheira,H.G.,Caldeirinha,P.,Link,T.,…Talhinhas,P.(2014).GenomesizeanalysesofPuccinialesrevealthelargestfungalgenomes.Frontiers in Plant Science,5 (August),422.

Vivian,A.,&Rn,J.(2017).AstreamlinedddRADtagprotocolforusewiththe iontorrentsequencer,asaversatileprobeforpopulations,ge-neticsandgenomicsv1(protocols.io.khuct6w).Protocols.io.https://doi.org/10.17504/protocols.io.khuct6w

Zhang, J., Kobert, K., Flouri, T.,& Stamatakis, A. (2014). PEAR:A fastandaccurateIlluminaPaired-EndreAdmergeR.Bioinformatics,30(5),614–620.https://doi.org/10.1093/bioinformatics/btt593

Zhou, X., Xia, Y., Ren, X., Chen, Y., Huang, L., Huang, S., … Jiang, H.(2014). Construction of a SNP-based genetic linkage map in cul-tivated peanut based on large scale marker development usingnext-generation double-digest restriction-site-associated DNA se-quencing (ddRADseq). BMC Genomics, 15(May), 351. https://doi.org/10.1186/1471-2164-15-351

SUPPORTING INFORMATION

Additional Supporting Information may be found online in thesupportinginformationtabforthisarticle.

How to cite this article:Salas-LizanaR,OonoR.Double-digestRADseqlociusingstandardIlluminaindexesimprovedeepandshallowphylogeneticresolutionofLophodermium,awidespreadfungalendophyteofpineneedles.Ecol Evol. 2018;00:1–14. https://doi.org/10.1002/ece3.4147