Upload
trinhkhue
View
216
Download
1
Embed Size (px)
Citation preview
1 of 13
Supplementary information
A history of expansion and anthropogenic collapse in a top marine
predator of the Black Sea estimated from genetic data.
Fontaine MC, Snirc A, Frantzis A, Koutrakis E, Öztürk B, Öztürk AA, Austerlitz F.
Table S1. Mt-DNA control region haplotype count for the harbor porpoises from the Black Sea and the Aegean Sea. S2
Table S2. Model specification, prior distributions for demographic parameters and locus-specific mutation model parameters. S3
Table S3. Model checking procedure. S4
Fig. S1. Median-joining network among mtDNA-CR haplotypes. The size of the circles is proportional to the total number of haplotypes observed. Sectors are proportional to the numbers of each haplotype observed in the Black Sea (white) and in the Aegean Sea (black). S5
Fig. S2 Mismatch distribution of the mitochondrial control region (mtDNA-CR) in the Black Sea and Aegean Sea harbor porpoises. The black line shows the observed frequencies of pairwise nucleotide differences and the dashed grey line shows the expected frequencies under a demographic expansion. S6
Fig. S3. Population structure estimated from STRUCTURE analysis. S7
Fig. S4. Composite parameters estimated from DIYABC. S8
Fig. S5. Marginal posterior probability densities for the mutation model parameters for the microsatellite loci and the mtDNA control region. S9
Fig. S6. Principal component analysis on the summary statistics when processing model checking for the 15 tested scenarios detailed in Fig. 2. S10
2 of 13
Table S1.
Mt-DNA control region haplotype count for the harbor porpoises from the Black Sea and the
Aegean Sea.
Haplotype code
Black Sea
Aegean Sea
Genbank accession n°
H1 14 5 JX105486 H2 0 1 JX105487 H3 2 1 JX105488 H4 0 1 JX105489 H5 1 1 JX105490 H6 0 1 JX105491 H7 1 0 JX105492 H8 1 0 JX105493 H9 6 0 JX105494
H10 1 0 JX105495 H11 2 0 JX105496 H12 3 0 JX105497 H13 1 0 JX105498 H14 1 0 JX105499 H15 2 0 JX105500 H16 1 0 JX105501 H17 3 0 JX105502 H18 1 0 JX105503 H19 1 0 JX105504 H20 1 0 JX105505 H21 1 0 JX105506 H22 1 0 JX105507 H23 1 0 JX105508 H24 1 0 JX105509 H25 1 0 JX105510 H26 1 0 JX105511 H27 1 0 JX105512 H28 1 0 JX105513 H29 1 0 JX105514 H30 1 0 JX105515 H31 1 0 JX105516 H32 1 0 JX105517 Total 54 10 64
The haplotype codes correspond to the code used in Fig. S1.
3 of 13
Table S2. Model specification, prior distributions for demographic parameters and locus-specific
mutation model parameters.
Priors for the demographic parameters N1 UN~[10, 1000] Na UN~[1000, 10000] Nb UN~[1, 100] Nc UN~[10, 1000] NBN UN~[1, 100] T1 UN~[100,1000] T2 UN~[1, 100] DE UN~[100, 1000] DB UN~[1, 10] Prior for the mutation model Autosomal microsatellites MEAN - !mic UN~[1 x 10-4, 1 x 10-3] GAM - !mic GA~[1 x 10-5, 1 x 10-2 , 2] MEAN – P UN~[0.10 , 0.30] GAM – P GA~[0.01, 0.9 , 2] MEAN - SNI LU~[1 x 10-8, 1 x 10-4] GAM – SNI GA~[1 x 10-9, 1 x 10-3, 2] Mt-DNA Control Region !seq UN~[1 x 10-7, 1 x 10-5] K1 UN~[0.05, 20.00] % invar. sites 82 Shape 0.9
Uniform distribution (UN) with 2 parameters: min and max; Gamma distribution (GA) with 3 parameters: min, max, shape; Log-Uniform (LU) distribution with 2 parameters: min and max. See Fig. 2 for the demographic parameters of each model tested. The mutation model parameters for the microsatellite loci were the mutation rate (!mic), the parameter determining the shape of the gamma distribution of individual loci mutation rate (P), and the Single Insertion Nucleotide rate (SNI). The Mt-DNA CR HKY mutation model had two variable parameters, the per site and generation mutation rate (!seq) and the transition/transversion ratio (K1) parameter, and two fixed parameters, the fraction of constant sites (% invar. sites), and the shape of the Gamma distribution of mutations among sites (Shape).
4 of 13
Table S3. Model checking procedure.
Summary statistics (S) A He V MGW NHA NSS MPD VPD D
Obs. S 6.9 0.49 4.36 0.64 32 32 2.20 1.87 -2.22
Prob. (Ssimul. < Sobs.)
SC1 0.842 0.195 0.644 0.037* 1.000*** 0.864 0.031* 0.018* 0.000*** SC2 0.591 0.025* 0.164 0.286 1.000*** 0.693 0.032* 0.016* 0.002** SC3 0.960* 0.730 0.940 0.016* 1.000*** 0.979* 0.168 0.121 0.001** SC4 0.772 0.148 0.555 0.051 1.000*** 0.936 0.107 0.075 0.002** SC5 0.663 0.173 0.261 0.284 1.000*** 0.771 0.104 0.078 0.001** SC6 0.952* 0.731 0.934 0.017* 1.000*** 0.986* 0.158 0.116 0.001** SC7 0.762 0.129 0.592 0.044* 1.000*** 0.779 0.044* 0.023* 0.001** SC8 0.855 0.277 0.740 0.030* 1.000*** 0.801 0.036* 0.017* 0.002** SC9 0.862 0.264 0.666 0.033* 0.996** 0.516 0.023* 0.010* 0.000*** SC10 0.998** 0.993** 0.972* 0.015* 1.000*** 1.000*** 0.855 0.768 0.009** SC11 0.806 0.106 0.5165 0.057 1.000*** 0.927 0.138 0.074 0.001** SC12 0.482 0.026* 0.207 0.167 1.000*** 0.406 0.042* 0.022* 0.003** SC13 1.000*** 0.997** 0.987* 0.027* 1.000*** 1.000*** 0.866 0.777 0.006** SC14 0.748 0.566 0.834 0.008** 0.867 0.681 0.175 0.149 0.018* SC15 0.737 0.639 0.891 0.016* 0.857 0.648 0.220 0.198 0.047*
Evolutionary scenario SC1 to SC15 are represented in Figure 2. The probability Prob. (Ssimul. < Sobs.) given for each summary statistics was computed from 1,000 virtual data sets simulated from the posterior distributions of parameters obtained under a given scenario. Corresponding tail-area probabilities (p-values) were obtained as Prob. (Ssimul. < Sobs.) and 1.0 - Prob. (Ssimul. < Sobs.) for Prob. (Ssimul. < Sobs.) ! 0.5 and > 0.5, respectively (1). These summary statistics were the ones used in the ABC analysis. For microsatellite loci, we used the mean number of alleles per locus (A), the mean expected heterozygosity (He), the mean allele size variance (V), and the mean MGW index across loci (2). For the mtDNA-CR, we used the number of haplotypes (NHA), the number of segregating sites (NSS), the mean pairwise differences (MPD) and their variance (VPD), and Tajima's D statistics. *, **, *** = tail-area probability < 0.05, < 0.01 and < 0.001, respectively.
1. Gelman A, Carlin JB, Stern HS, & Rubin DB (1995) Bayesian Data Analysis (Chapman & Hall,
London). 2. Garza JC & Williamson EG (2001) Detection of reduction in population size using data from
microsatellite loci. Mol Ecol 10:305-318.
5 of 13
Fig. S1. Median-joining network among mtDNA-CR haplotypes. Circle sizes are proportional
to the total number of haplotypes observed. Sectors are proportional to the numbers of each
haplotype observed in the Black Sea (white) and in the Aegean Sea (black).
6 of 13
Fig. S2. Mismatch distribution for the mitochondrial control region (mtDNA-CR) in the Black Sea and
Aegean Sea harbor porpoises. The black line shows the observed frequencies of pairwise differences
and the dashed grey line shows the expected pairwise frequencies under a demographic expansion.
7 of 13
Fig. S3. Population structure as estimated using STRUCTURE 2.3.3 (ref. 1). (A) The
probability of observing the data (X) under a given number of putative clusters (K) is shown
as a function of K, for two admixture models implemented in STRUCTURE: the standard
model and the Locprior model, the latter assuming a priori that porpoises from the Black Sea
(1) and the Aegean Sea (2) might come from two distinct populations (see ref. 1). The error
bars provide the standard deviation observed over 10 replicated runs for each value of K. The
individual admixture proportion estimated at K = 2 are shown for the standard model (B) and
for the Locprior model (C). Each individual is represented by a thin horizontal line divided
into two colored segments that represent its estimated admixture fraction in each cluster.
1. Hubisz MJ, Falush D, Stephens M, & Pritchard JK (2009) Inferring weak population structure with the
assistance of sample group information. Mol Ecol Res 9:1322-1332.
8 of 13
!"!"#$%&'(
#$%&'()
* + , - !
*.*
*.,
*.!
*./
*.0
!"!)#$%&'(
#$%&'()
* +* ,* -* !*
*.**
*.*,
*.*!
*.*/
*.*0
!"!'#$%&'(
#$%&'()
* + , - !
*.*
*.,
*.!
*./
*.0
+.*
!"$*+,
!"#$%&'
()((( ()((* ()((+ ()((, ()((- ()(.(
(.((
*((
/((
+((
0((
!)#$*+,
!"#$%&'
()(* ()(+ ()(, ()(- ().(
(0
.(.0
*(*0
/(
!'#$*+,
!"#$%&'
()((( ()((* ()((+ ()((, ()((- ()(.(
(.((
*((
/((
+((
0((
Fig. S4. Composite parameters estimated from DIYABC. Marginal posterior probability
density of the population genetic diversity estimated using " = 4.Ne.!micro for autosomal
microsatellite loci (left panels) and " = Ne.!seq for mtDNA-CR (right panels), where Ne is the
effective population size, and !micro and !seq are the microsatellite mutation rate and
mitochondrial mutation rate, respectively.
9 of 13
Microsatellate mutation rate
Den
sity
2e!04 4e!04 6e!04 8e!04 1e!03
050
010
0015
00
P parameter of the GSM
Den
sity
0.10 0.15 0.20 0.25 0.30
010
2030
4050
60
SNI rate
Den
sity
0e+00 2e!05 4e!05 6e!05 8e!05 1e!04
050
000
1000
0015
0000
MtDNA CR mutation rate
Den
sity
2e!06 4e!06 6e!06 8e!06 1e!05
050
000
1000
00
K1 parameter of the HKY mutation rate
Den
sity
0 5 10 15 20
0.00
0.02
0.04
Fig. S5. Marginal posterior probability densities for the mutation model parameters for the
microsatellite loci and the mtDNA control region. For the microsatellite loci, the General
Stepwise Mutation model included the mutation rate, the parameter P determining the shape
of the gamma distribution of individual loci mutation rate, and the rate of Single Nucleotide
Insertions (SNI). For the mtDNA control region, the HKY model included the mutation rate
and the K1 parameter determining the shape of the gamma distribution of the inter-sites
variation of the mutation rate. The dotted lines show the prior distributions.
12 of 13
Fig. S6. Principal component analysis on the summary statistics when processing model
checking for the 15 tested scenarios detailed in Fig. 2. PCA were processed on the summary
13 of 13
statistics used to discriminate among scenarios and compute the posterior distributions of
parameters. Principal components (PCs) were computed considering 15,000 data sets
simulated with parameter values drawn from the prior distributions of parameters obtained
under each of the 15 scenarios. Then the target (observed) data set as well as the 1,000 data
sets simulated from the posterior distributions of parameters were added to each plane of the
PCA. Only the planes defined by the PC1 - PC2 and PC2 - PC3 are provided on the left and
right panels, respectively. The scenarios labelled in red are those with the highest posterior
probability (see the text).