Supplementary Information for
Genome-wide association studies of 14 agronomic traits
in rice landraces
Xuehui Huang, Xinghua Wei, Tao Sang, Qiang Zhao, Qi Feng, Yan Zhao, Canyang Li,
Chuanrang Zhu, Tingting Lu, Zhiwu Zhang, Meng Li, Danlin Fan, Yunli Guo, Ahong
Wang, Lu Wang, Liuwei Deng, Wenjun Li, Yiqi Lu, Qijun Weng, Kunyan Liu, Tao Huang,
Taoying Zhou, Yufeng Jing, Wei Li, Zhang Lin, Edward S. Buckler, Qian Qian, Qi-Fa
Zhang, Jiayang Li & Bin Han
This PDF file includes:
Supplementary Note
Supplementary Tables 2-4 and 7-8
Supplementary Figures 1-25
(Supplementary Tables 1, 5, 6, and 9 are provided in the separate Excel files)
Nature Genetics: doi:10.1038/ng.695
Supplementary Note SNP identification and annotation & Estimation of specificity and
sensitivity & Data imputation algorithm & Association signals on known loci
Supplementary Table 1 The list of 517 landrace accessions sampled in this study.
Supplementary Table 2 The accuracy of the low-coverage consensus sequences
estimated against four sets of sequencing data.
Supplementary Table 3 Specificity and missing data rate of the genotype dataset before
and after missing genotypes are inferred.
Supplementary Table 4 The detailed list of predicted effects of annotated SNPs.
Supplementary Table 5 The list of genes over-represented for large-effect changes.
Supplementary Table 6 The list of genes that contained large-effect
complete-differentiation SNPs.
Supplementary Table 7 Genome-wide significant association signals of fourteen
agronomic traits using the simple model.
Supplementary Table 8 Genome-wide significant association signals of fourteen
agronomic traits from both the simple model and the compressed MLM.
Supplementary Table 9 The genotype dataset of indica landraces on the causal
polymorphic sites of three known genes.
Supplementary Figure 1 Geographic origins of 517 Chinese rice landraces sampled in
this study.
Supplementary Figure 2 Number and accuracy of SNPs called for 517 landraces plus 3
varieties as internal accuracy controls under various stringency rules.
Supplementary Figure 3 SNP density and distribution across the genome.
Supplementary Figure 4 Functional and evolutionary analyses of SNPs.
Supplementary Figure 5 Divergence and geographic origins of 517 rice landraces of
China.
Supplementary Figure 6 Sequence diversity and population genetic differentiation along
chromosomes in indica.
Supplementary Figure 7 Sequence diversity and population genetic differentiation along
chromosomes in japonica.
Supplementary Figure 8 LD decay rate across the genome.
Supplementary Figure 9 An example of missing genotype imputation.
Supplementary Figure 10 Missing data rate and sequencing accuracy as the function of
sequencing coverage.
Supplementary Figure 11 Frequency distribution of variation of fourteen traits in 373
indica landraces.
Supplementary Figure 12 Genome-wide association analysis of tiller number.
Supplementary Figure 13 Genome-wide association analysis of flag leaf angle.
Supplementary Figure 14 Genome-wide association analysis of grain length.
Supplementary Figure 15 Genome-wide association analysis of grain weight.
Supplementary Figure 16 Genome-wide association analysis of spikelet number per
panicle.
Supplementary Figure 17 Genome-wide association analysis of gelatinization
temperature.
Supplementary Figure 18 Genome-wide association analysis of amylose content.
Supplementary Figure 19 Genome-wide association analysis of apiculus color.
Supplementary Figure 20 Genome-wide association analysis of pericarp color.
Supplementary Figure 21 Genome-wide association analysis of hull color.
Supplementary Figure 22 Genome-wide association analysis of drought tolerance.
Supplementary Figure 23 Genome-wide association analysis of degree of seed
shattering.
Supplementary Figure 24 Regions of the genome showing association signals around
known genes controlling heading date.
Supplementary Figure 25 Steps of missing genotype imputation.
Supplementary Note
SNP identification and annotation
We integrated single–base pair genotypes of 520 individuals to screen for SNPs across
the genome. Discrepancies with the rice reference genome were called as candidate SNPs.
Unreliable sites were then filtered according to the following criteria: 1) candidate SNP
loci must be bi-allelic; 2) candidate SNP loci must be more than 10 bp away from each
other; and 3) all the singleton SNPs were excluded. Sites passing these criteria were
retained and called as common SNPs.
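The three filtering criteria above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the input layout (one list of base calls per candidate site, with `None` for a missing call), the function name `filter_candidate_snps`, and the example sites are all hypothetical. The sketch also assumes that alleles are counted across individuals only (the reference base is ignored), that a singleton is a site whose rarer allele is seen in one individual, and that both members of a pair lying within 10 bp of each other are discarded.

```python
from collections import Counter


def filter_candidate_snps(candidates, min_spacing=10):
    """Return positions of candidate SNPs that pass the three filters.

    candidates: list of (position, calls); calls holds one base call per
    individual, with None marking a missing call (hypothetical layout).
    """
    ordered = sorted(candidates, key=lambda c: c[0])
    kept = []
    for idx, (pos, calls) in enumerate(ordered):
        counts = Counter(c for c in calls if c is not None)
        # Criterion 1: the locus must be bi-allelic.
        if len(counts) != 2:
            continue
        # Criterion 3: exclude singletons (rarer allele seen only once).
        if min(counts.values()) < 2:
            continue
        # Criterion 2: the locus must be more than min_spacing bp away
        # from its neighbouring candidates (assumption: spacing is
        # measured against all candidates, not only those retained).
        prev_close = idx > 0 and pos - ordered[idx - 1][0] <= min_spacing
        next_close = (idx + 1 < len(ordered)
                      and ordered[idx + 1][0] - pos <= min_spacing)
        if prev_close or next_close:
            continue
        kept.append(pos)
    return kept


# Hypothetical candidate sites for five individuals.
candidates = [
    (100, ["A", "A", "G", "G", None]),  # within 10 bp of site 105
    (105, ["C", "C", "T", "T", "C"]),   # within 10 bp of site 100
    (200, ["A", "G", "T", "A", "G"]),   # three alleles
    (300, ["C", "C", "C", "T", None]),  # singleton
    (400, ["G", "G", "A", "A", "A"]),   # passes
    (500, ["T", "T", "C", "C", None]),  # passes
]
print(filter_candidate_snps(candidates))
```

Under these assumptions only the sites at positions 400 and 500 are retained as common SNPs.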
SNPs in coding regions were called coding SNPs on the basis of the gene models in
the Rice Annotation Project Database (release 2; http://rapdb.dna.affrc.go.jp/); only
gene models supported by full-length cDNAs or ESTs were used. The coding SNPs were
then annotated as synonymous or non-synonymous, and these annotations were used to calculate
the nonsynonymous-to-synonymous ratio for each gene. SNPs with large-effect changes
were annotated and partitioned into SNPs that introduce stop codons, SNPs that disrupt
stop codons, SNPs that disrupt initiation codons and SNPs that disrupt splice sites.
We used three steps to find gene families under relaxed selection: 1) Genes with the
same Pfam domain were grouped into a gene family. We then calculated the number of
coding SNPs within the genes of each family. Only families with 300 or more SNPs,
which permit sufficient statistical power, were retained. 2) To avoid the impact of
potential pseudogenes, genes with large-effect SNPs were removed from each family
before further statistical testing. 3) A chi-square test was then applied to each family. The
observed class is the nonsynonymous-to-synonymous ratio of the family, while the
expected one is the nonsynonymous-to-synonymous ratio across all families.
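As an illustration of step 3, the sketch below runs a one-degree-of-freedom chi-square goodness-of-fit test on the nonsynonymous and synonymous SNP counts of one family against the proportions expected from a background set. The exact form of the test used in this study is not specified beyond the description above, so this formulation is an assumption. The family counts (220 and 130) are hypothetical, and the genome-wide totals from Supplementary Table 4 are used here only as a stand-in for the background across all families.

```python
import math


def ns_ratio_chi_square(fam_nonsyn, fam_syn, all_nonsyn, all_syn):
    """Chi-square test (1 d.f.) of a family's nonsynonymous/synonymous
    counts against the proportions seen in the background set."""
    n = fam_nonsyn + fam_syn
    p_nonsyn = all_nonsyn / (all_nonsyn + all_syn)
    exp_nonsyn = n * p_nonsyn
    exp_syn = n - exp_nonsyn
    chi2 = ((fam_nonsyn - exp_nonsyn) ** 2 / exp_nonsyn
            + (fam_syn - exp_syn) ** 2 / exp_syn)
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical family with 350 coding SNPs (above the 300-SNP cut-off).
# Background: genome-wide counts from Supplementary Table 4 (stand-in).
chi2, p_value = ns_ratio_chi_square(220, 130, 92_665, 74_849)
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}")
```

A family whose ratio is significantly higher than the background would be a candidate for relaxed selection; correction for testing many families is not shown here.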
Estimation of specificity and sensitivity
We used four sets of sequencing data to assess genotyping accuracy. For indica cv.
Guangluai-4, the 21 Mb BAC-based sequences (ref. 36) and whole-genome sequences generated
from 20-fold coverage Illumina sequencing were used. The BAC-based sequences of
Guangluai-4 comprised 273 BACs covering 65.7% of chromosome 4; their
accession numbers are listed on our website
(http://www.ncgr.ac.cn/english/edatabasei.htm). The 20-fold Illumina sequences of
Guangluai-4 are available in the EBI European Nucleotide Archive (ftp://ftp.era.ebi.ac.uk)
with accession number ERP000235. For japonica, we used the BAC-based sequences of
japonica cv. Nipponbare (ref. 8) and genome sequences of japonica cv. Nongken-58 generated
from 14-fold coverage Illumina GA sequencing. The BAC-based sequences of japonica
cv. Nipponbare constitute the rice reference genome (IRGSP 4.0,
http://rgp.dna.affrc.go.jp/IRGSP/Build4/build4.html). The 14-fold Illumina sequences of
Nongken-58 are available in the EBI European Nucleotide Archive (ftp://ftp.era.ebi.ac.uk)
with accession number ERP000236.
Genotype calls of the consensus sequences at positions covered by reads were then
compared with each of the above four sets of sequencing data. The total number of
sites compared and the number of concordant sites were counted, which gave the
estimates of specificity (also called “accuracy” in the RESULTS; Supplementary Table
2). Against the four independent standards, the estimated specificities were all
above 99.9%, indicating that the quality control procedures were effective.
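The specificity estimate is a simple ratio of concordant sites to compared sites. The short sketch below recomputes it from the counts listed in Supplementary Table 2; the dictionary layout and the function name `specificity` are illustrative only.

```python
def specificity(total_sites, concordant_sites):
    """Fraction of compared sites at which the genotype calls agree."""
    return concordant_sites / total_sites


# (total bases compared, concordant bases) from Supplementary Table 2.
table2 = {
    "Guangluai-4 20x Illumina data": (111_725_302, 111_678_146),
    "Guangluai-4 BAC sequences": (7_324_676, 7_318_437),
    "Nongken-58 14x Illumina data": (137_144_714, 137_108_004),
    "Nipponbare reference genome": (104_871_563, 104_837_452),
}

for name, (total, concordant) in table2.items():
    print(f"{name}: {100 * specificity(total, concordant):.2f}%")
```

Each of the four ratios rounds to the accuracy reported in the table and lies above 99.9%.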
The estimate of sensitivity of the genotype calling (also referred to as "recall rate" in
the RESULTS) was based on BAC-based sequences of indica cv. Guangluai-4. Direct
comparison of the BAC-based sequences of Guangluai-4 with their corresponding
regions of the reference genome resulted in a total of 103,965 SNPs, of which 20,888
SNPs were detected by one-fold coverage Illumina sequencing of Guangluai-4, giving a
recall rate of 20.1%.
Data imputation algorithm
Beginning with the first window at the top of the chromosome, the local haplotype
similarity is exploited. A similarity score is calculated between all pairs of individuals in
the studied population of N individuals:
$$ S_{ij} = \sum_{z=1}^{w} s_{ij}(z) $$
where s_ij(z) is the similarity score between individuals i and j at the z-th SNP. The
overall similarity score for the window is the sum of similarity scores of all SNPs in the
window of size w (Supplementary Fig. 25a).
At each SNP, the major and minor alleles are denoted by “0” and “1” respectively,
while the missing genotype is denoted by “?”. The single-SNP similarity scores are: s_ij(z) =
1 if the genotypes of the two individuals are identical (0 vs. 0 or 1 vs. 1); s_ij(z) = 0.5 if one or both
genotypes are missing (0 vs. ?, 1 vs. ?, or ? vs. ?); s_ij(z) = p if the genotypes are different (0 vs.
1). The value of p is allowed to vary and normally takes negative values (Supplementary Fig. 25a).
This penalty is set to avoid recognizing different haplotypes as the nearest neighbors,
while the difference might be caused by sequencing errors.
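The scoring rules above translate directly into code. The following is a minimal sketch, assuming each individual's genotypes within a window are given as a string over the characters "0", "1" and "?"; the function names `snp_score` and `window_score` are illustrative and not part of any published software.

```python
MISSING = "?"


def snp_score(a, b, p=-7.0):
    """Single-SNP similarity score between two genotype calls."""
    if a == MISSING or b == MISSING:
        return 0.5            # one or both genotypes missing
    return 1.0 if a == b else p   # identical, or mismatch penalty p


def window_score(g_i, g_j, p=-7.0):
    """Window similarity S_ij: sum of single-SNP scores over the window."""
    return sum(snp_score(a, b, p) for a, b in zip(g_i, g_j))


# Two hypothetical individuals over a window of five SNPs:
# three matches, one mismatch and one missing pair.
print(window_score("0010?", "0011?", p=-7.0))  # 1 + 1 + 1 - 7 + 0.5 = -3.5
```

With a strongly negative p, a single mismatch outweighs several matches, which is why haplotypes that differ at even a few sites are unlikely to be selected as nearest neighbors.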
For the population with N individuals, a matrix of Sij is obtained (Supplementary
Fig. 25b). To infer the missing genotype at a SNP site of individual i, the nearest
neighbors of this individual are identified. The nearest neighbors of individual i are
defined as those individuals with the highest to k-th highest Sij scores (Supplementary
Fig. 25c). For the nearest neighbors, the major allele frequency at a SNP is calculated. At
a SNP site where the genotype is missing from individual i, the genotype of this
individual is determined to be the same as the major allele of the nearest neighbors if the
major allele frequency is higher than a threshold, f. This filling procedure is conducted for
the first SNP site in a window for all individuals with missing genotype and continues as
the window slides along the chromosome. All SNP sites of the last window of a
chromosome are filled at once.
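The filling rule for a single SNP site can be sketched as follows, given the window similarity scores between individual i and every other individual. This is a simplified illustration rather than the implementation used in this study. Two points are assumptions: neighbors whose own genotype is missing at the site are left out of the allele frequency, and "higher than" the threshold f is taken as a strict inequality. The function name `impute_site` and the example accessions are hypothetical.

```python
def impute_site(scores, site_calls, k=5, f=0.7):
    """Infer a missing genotype at one SNP for a target individual.

    scores:     {individual: S_ij relative to the target individual}
    site_calls: {individual: "0", "1" or "?" at the SNP being filled}
    Returns "0" or "1" if the neighbours agree strongly enough, else "?".
    """
    # The k nearest neighbours are those with the k highest S_ij scores.
    neighbours = sorted(scores, key=scores.get, reverse=True)[:k]
    # Assumption: neighbours with a missing call are not counted.
    calls = [site_calls[j] for j in neighbours if site_calls[j] != "?"]
    if not calls:
        return "?"
    n0 = calls.count("0")
    n1 = len(calls) - n0
    major, n_major = ("0", n0) if n0 >= n1 else ("1", n1)
    # Fill only if the neighbours' major allele frequency exceeds f.
    return major if n_major / len(calls) > f else "?"


# Hypothetical scores of six accessions against the target individual.
scores = {"L2": 40.0, "L3": 38.5, "L4": 12.0,
          "L5": -20.0, "L6": 35.0, "L7": 30.0}
site_calls = {"L2": "1", "L3": "1", "L4": "0",
              "L5": "0", "L6": "1", "L7": "?"}
print(impute_site(scores, site_calls, k=5, f=0.7))
```

Here the five nearest neighbors are L2, L3, L6, L7 and L4; three of the four with a genotype call carry allele "1" (frequency 0.75), so the missing genotype is filled with "1" at f = 0.7 but would be left missing at f = 0.8.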
A further step is to determine the values of the four variables, w, p, k, and f, that give the best
results of missing data imputation. Because testing a large number of data points of these
variables for individual windows is computationally impractical, we tested three values of
each variable and obtained the optimized value of each variable for the entire genome.
Based on the genomic and population properties of indica landraces and through
extensive cross validations, the following testing values were set for these variables: w
(50, 65, 80), p (-3, -5, -7), k (3, 5, 7), and f (0.7, 0.75, 0.8), which gave 81 combinations of
values to test for the 373 indica landraces plus the internal control, Guangluai-4.
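The parameter grid is the full cross of three values for each of the four variables, that is 3 × 3 × 3 × 3 = 81 combinations. A short sketch of how such a grid can be enumerated is given below; the dictionary layout is illustrative only.

```python
from itertools import product

# Testing values for the four imputation variables, as listed above.
grid = {
    "w": (50, 65, 80),      # window size (number of SNPs)
    "p": (-3, -5, -7),      # mismatch penalty
    "k": (3, 5, 7),         # number of nearest neighbours
    "f": (0.7, 0.75, 0.8),  # major allele frequency threshold
}

# One dictionary per combination of (w, p, k, f).
combinations = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(combinations))  # 81
```

Each combination would then be passed to the cross-validation procedure to obtain its mean imputation accuracy and filling rate.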
To optimize the variables for the indica landraces, 1200 chromosomal regions each
containing 300 consecutive SNPs were randomly selected from the genome for cross
validation. In these regions, we randomly masked 1% of the genotypes to make them missing
data. Two criteria were adopted for evaluating the performance of missing data
imputation. One is the imputation accuracy, A, defined as the percentage of correctly
inferred genotypes among the 1% masked genotypes. The other is the filling rate, F, defined as
the percentage of missing genotypes that are inferred and filled. For each of the 81
combinations of the four variables, A and F were calculated for each of the 1200
chromosomal regions, and the means of these 1200 A and F values were calculated. A total of 81
means for A and for F were obtained and plotted (Supplementary Fig. 25d). According
to the distribution of A and F, the combination of w = 80, p = -7, k = 5, and f = 0.7 was
judged to be the best because it yielded the highest F = 98.1% and the nearly highest A =
98.0%. After the missing genotypes were inferred, the imputation accuracy was
calculated by comparing inferred genotypes of Guangluai-4 with its accurate genome
sequences.
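The masking step and the two evaluation criteria can be sketched as below. This is an illustration under stated assumptions, not the code used in this study: genotypes are held as a list of rows of "0", "1" and "?", the filling rate F is computed over the masked genotypes only, and the accuracy A is computed over those masked genotypes that were actually filled. The function names are hypothetical, and the imputation step itself is not shown.

```python
import random


def mask_genotypes(genotypes, fraction=0.01, rng=None):
    """Hide a random fraction of the known genotypes.

    Returns the masked matrix and a dict {(row, col): true genotype}.
    """
    rng = rng or random.Random(0)
    known = [(i, z) for i, row in enumerate(genotypes)
             for z, g in enumerate(row) if g != "?"]
    n_mask = max(1, round(fraction * len(known)))
    masked = [list(row) for row in genotypes]
    truth = {}
    for i, z in rng.sample(known, n_mask):
        truth[(i, z)] = masked[i][z]
        masked[i][z] = "?"
    return masked, truth


def accuracy_and_filling(truth, imputed):
    """Imputation accuracy A and filling rate F on the masked genotypes."""
    filled = {pos: imputed[pos[0]][pos[1]] for pos in truth
              if imputed[pos[0]][pos[1]] != "?"}
    filling_rate = len(filled) / len(truth)
    if not filled:
        return float("nan"), filling_rate
    correct = sum(filled[pos] == truth[pos] for pos in filled)
    return correct / len(filled), filling_rate


# Hypothetical outcome for four masked genotypes: two filled correctly,
# one filled incorrectly and one left unfilled.
truth = {(0, 0): "0", (0, 1): "1", (1, 0): "1", (1, 2): "0"}
imputed = [["0", "0", "1"], ["1", "?", "?"]]
A, F = accuracy_and_filling(truth, imputed)
print(f"A = {A:.3f}, F = {F:.2f}")
```

In the procedure described above, these two values would be averaged over the 1200 regions for each of the 81 parameter combinations.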
Association signals on known loci
Among the association signals we identified, several were in close proximity to
known loci that had been identified previously via mutants or crosses. The strongest signal
for apiculus color was located within a known locus. The peak SNP was ~20 kb away from
OsC1, the gene previously identified to control the coloration (Fig. 5a; ref. 23). For pericarp
color, the strongest signal was ~26 kb away from Rc, the gene known to underlie red
pericarp of rice (Fig. 5b; ref. 24). For hull color, the strongest signal was near the ibf locus, but
the causal gene has not been identified and confirmed to date (ref. 25).
For grain quality, we studied two traits related to cooking and eating properties:
gelatinization temperature and amylose content. For gelatinization temperature, the
strongest association signal was ~21 kb away from ALK,
the previously identified starch synthase modifying cooking properties (Fig. 5c; ref. 26). For
amylose content, the strongest signal had its peak SNP ~1 kb away from waxy, the starch
synthase known to control amylose content in rice (Fig. 5d; refs. 27,28).
Grain width and length are two major characteristics of grain size. For grain width,
the peak SNP with the strongest signal was ~6 kb away from qSW5, the gene previously
identified for grain width variation (Fig. 5e; ref. 29). For grain length, one strong signal was
<1 kb away from GS3, the gene known to control grain length of rice (Fig. 5f; ref. 30).
Primers for amplification and sequencing across the regions of the causal
polymorphisms of three known genes were designed using Primer 3 (v 0.2). The
sequences of the PCR primers are provided in Supplementary Table 9.
Supplementary Table 2 The accuracy of the low-coverage consensus sequences estimated against four sets of sequencing data.
Sequences used for validation   Number of total bases   Number of concordant bases   Accuracy
Guangluai-4 20x Illumina data   111,725,302             111,678,146                  99.96%
Guangluai-4 BAC sequences       7,324,676               7,318,437                    99.91%
Nongken-58 14x Illumina data    137,144,714             137,108,004                  99.97%
Nipponbare reference genome     104,871,563             104,837,452                  99.97%
Supplementary Table 3 Specificity and missing data rate of the genotype dataset before and after missing genotypes are inferred.
Type                                    Before imputing   After imputing
Specificity
  Guangluai-4 20x Illumina data         98.7%             98.7%
  Guangluai-4 BAC sequences             98.4%             98.5%
  Nongken-58 14x Illumina data          99.7%             98.6%
  Nipponbare reference genome           99.6%             98.7%
Missing data rate
  The entire set of 517 landraces       61.7%             2.9%
  The subset of 373 indica landraces    61.4%             2.6%
  The subset of 131 japonica landraces  62.5%             3.8%
Supplementary Table 4 The detailed list of predicted effects of annotated SNPs.
Type of predicted effects               Number of SNPs   Number of genes
Large-effect                            3,625            3,039
  SNPs that introduced stop codons      2,005            1,709
  SNPs that disrupt stop codons         200              200
  SNPs that disrupt initiation codons   374              374
  SNPs that disrupt splice sites        1,046            994
Synonymous                              74,849           21,019
Non-Synonymous                          92,665           22,342
Supplementary Table 7 Genome-wide significant association signals of fourteen agronomic traits using the simple model.
Trait                       Chr.  Position (IRGSP4)  Major allele  Minor allele  Minor allele freq  -log10P (Simple model)
Amylose content             3     18,705,982         C             T             0.06               16.81
Amylose content             6     1,757,040 a        C             G             0.14               67.75
Amylose content             6     6,189,558          A             T             0.11               34.76
Amylose content             6     6,709,537          C             T             0.19               35.45
Amylose content             12    10,993,688         G             T             0.06               15.93
Apiculus Color              6     5,335,519 a        A             G             0.33               48.44
Apiculus Color              6     7,681,502          G             A             0.32               17.31
Apiculus Color              12    460,120            T             C             0.22               9.92
Drought tolerance           1     5,536,395          G             T             0.11               9.97
Drought tolerance           2     1,489,158          T             C             0.12               9.32
Drought tolerance           5     2,275,357          A             C             0.06               14.24
Drought tolerance           6     28,243,628         C             T             0.09               11.98
Drought tolerance           11    21,161,361         G             C             0.08               15.28
Gelatinization Temperature  2     31,298,733         T             C             0.24               8.08
Gelatinization Temperature  5     28,906,320         A             T             0.47               8.48
Gelatinization Temperature  6     6,726,587 a        C             T             0.19               14.68
Gelatinization Temperature  8     19,334,399         C             A             0.27               8.33
Gelatinization Temperature  11    22,564,279         A             C             0.24               9.01
Grain length                1     5,966,086          C             T             0.08               12.31
Grain length                3     17,379,260 a       A             C             0.07               17.64
Grain length                3     17,637,475         C             A             0.08               18.10
Grain length                3     23,349,781         A             C             0.13               14.47
Grain length                4     1,135,241          T             G             0.11               11.60
Spikelet number             1     26,933,074         A             G             0.36               8.53
Spikelet number             3     436,658            G             A             0.10               9.08
Grain weight                1     23,687,541         A             G             0.23               13.15
Grain weight                4     3,580,191          A             C             0.21               12.66
Grain weight                7     15,905,023         G             A             0.11               13.06
Grain weight                8     6,288,077          T             G             0.09               11.85
Grain weight                8     20,985,668         T             C             0.18               11.99
Grain width                 5     4,942,020          A             T             0.15               16.14
Grain width                 5     5,341,575 a        G             A             0.17               23.38
Grain width                 7     14,364,359         C             T             0.42               10.98
Grain width                 8     6,191,511          A             G             0.05               9.07
Grain width                 12    13,725,521         T             G             0.09               9.69
Heading date                1     23,908,112         T             C             0.23               44.14
Heading date                4     4,142,083          T             A             0.21               54.45
Heading date                4     17,728,075         A             G             0.06               46.56
Heading date                6     28,818,321         A             G             0.29               38.66
Heading date                7     24,952,910         C             T             0.33               37.49
Hull color                  6     10,378,142         T             C             0.06               9.81
Hull color                  8     20,947,967         T             C             0.18               8.39
Hull color                  9     7,366,211          T             C             0.20               19.77
Leaf angle                  1     5,605,379          A             T             0.08               11.39
Leaf angle                  1     25,367,568         C             T             0.22               11.51
Leaf angle                  1     35,465,422         G             A             0.09               12.52
Leaf angle                  4     24,729,628         A             G             0.08               8.43
Leaf angle                  5     21,842,625         A             G             0.40               11.27
Pericarp Color              2     34,824,975         A             C             0.07               9.59
Pericarp Color              3     31,599,995         T             A             0.18               9.54
Pericarp Color              7     6,127,089 a        G             T             0.33               60.96
Pericarp Color              7     17,543,383         G             T             0.23               8.13
Pericarp Color              8     12,483,076         T             G             0.21               16.79
Shattering degree           3     31,392,925         A             G             0.09               8.54
Shattering degree           5     898,443            C             T             0.05               10.24
Shattering degree           8     25,498,378         A             T             0.23               8.98
Shattering degree           11    2,231,172          C             T             0.05               9.18
Shattering degree           12    16,521,488         C             G             0.05               8.93
Tiller number               1     4,759,534          G             A             0.21               8.92
Tiller number               1     6,840,011          T             C             0.25               8.51
Tiller number               2     5,541,400          C             T             0.12               8.80
Tiller number               2     25,083,473         A             T             0.13               8.41
Tiller number               8     25,453,518         T             C             0.11               8.31
Chr. Chromosome. a Known loci with identified genes, which are reported in Supplementary Note.
Supplementary Table 8 Genome-wide significant association signals of fourteen agronomic traits from both the simple model and the compressed MLM.
Trait                       Chr.  Position (IRGSP4)  Major allele  Minor allele  Minor allele freq  -log10P (Simple model)  -log10P (Compressed MLM)
Amylose content             3     18,705,982         C             T             0.06               16.81                   1.83
Amylose content             6     1,757,040 b        C             G             0.14               67.75                   25.30 (1,770,929 a)
Amylose content             6     6,189,558          A             T             0.11               34.76                   7.52
Amylose content             6     6,709,537          C             T             0.19               35.45                   11.13
Amylose content             12    10,993,688         G             T             0.06               15.93                   1.78
Apiculus Color              6     5,335,519 b        A             G             0.33               48.44                   26.25 (5,335,519 a)
Apiculus Color              6     7,681,502          G             A             0.32               17.31                   8.48
Apiculus Color              12    460,120            T             C             0.22               9.92                    1.60
Drought tolerance           1     5,536,395          G             T             0.11               9.97                    6.39
Drought tolerance           2     1,489,158          T             C             0.12               9.32                    4.57
Drought tolerance           5     2,275,357          A             C             0.06               14.24                   7.61
Drought tolerance           6     28,243,628         C             T             0.09               11.98                   8.47
Drought tolerance           11    21,161,361         G             C             0.08               15.28                   11.07
Gelatinization Temperature  2     31,298,733         T             C             0.24               8.08                    1.66
Gelatinization Temperature  5     28,906,320         A             T             0.47               8.48                    0.73
Gelatinization Temperature  6     6,726,587 b        C             T             0.19               14.68                   8.15 (6,726,252 a)
Gelatinization Temperature  8     19,334,399         C             A             0.27               8.33                    2.99
Gelatinization Temperature  11    22,564,279         A             C             0.24               9.01                    0.61
Grain length                1     5,966,086          C             T             0.08               12.31                   0.65
Grain length                3     17,379,260 b       A             C             0.07               17.64                   9.88 (17,371,398 a)
Grain length                3     17,637,475         C             A             0.08               18.10                   10.57
Grain length                3     23,349,781         A             C             0.13               14.47                   6.48
Grain length                4     1,135,241          T             G             0.11               11.60                   0.40
Grain length                5     5,343,949          A             G             0.20               4.73                    6.76
Grain length                11    3,072,370          C             T             0.11               1.59                    6.43
Spikelet number             1     26,933,074         A             G             0.36               8.53                    3.77
Spikelet number             3     436,658            G             A             0.10               9.08                    5.54
Spikelet number             7     18,005,615         C             T             0.44               0.86                    7.15
Spikelet number             10    5,976,140          C             T             0.06               1.85                    6.90
Grain weight                1     23,687,541         A             G             0.23               13.15                   2.17
Grain weight                4     3,580,191          A             C             0.21               12.66                   1.91
Grain weight                7     15,905,023         G             A             0.11               13.06                   3.31
Grain weight                8     6,288,077          T             G             0.09               11.85                   4.26
Grain weight                8     20,985,668         T             C             0.18               11.99                   4.30
Grain width                 5     4,942,020          A             T             0.15               16.14                   8.58 (4,907,158 a)
Grain width                 5     5,341,575 b        G             A             0.17               23.38                   17.14
Grain width                 7     14,364,359         C             T             0.42               10.98                   3.92
Grain width                 8     6,191,511          A             G             0.05               9.07                    0.84
Grain width                 12    13,725,521         T             G             0.09               9.69                    2.51
Heading date                1     23,908,112         T             C             0.23               44.14                   1.94
Heading date                2     1,439,288          G             A             0.42               2.99                    6.41
Heading date                2     30,818,552         G             C             0.07               2.87                    6.42
Heading date                4     4,142,083          T             A             0.21               54.45                   0.79
Heading date                4     17,728,075         A             G             0.06               46.56                   NA
Heading date                4     18,773,995         A             T             0.25               41.19                   6.52
Heading date                6     11,083,237         G             A             0.05               12.91                   7.18
Heading date                6     28,818,321         A             G             0.29               38.66                   NA
Heading date                7     24,952,910         C             T             0.33               37.49                   0.23
Heading date                9     10,738,885         C             A             0.06               4.66                    9.56
Heading date                11    28,247,391         C             T             0.12               7.00                    8.38
Heading date                12    18,324,888         G             A             0.06               3.88                    6.85
Hull color                  6     10,378,142         T             C             0.06               9.81                    6.43
Hull color                  8     20,947,967         T             C             0.18               8.39                    5.36
Hull color                  9     7,366,211          T             C             0.20               19.77                   12.48
Leaf angle                  1     5,605,379          A             T             0.08               11.39                   1.89
Leaf angle                  1     25,367,568         C             T             0.22               11.51                   4.29
Leaf angle                  1     35,465,422         G             A             0.09               12.52                   5.41
Leaf angle                  4     24,729,628         A             G             0.08               8.43                    2.66
Leaf angle                  5     21,842,625         A             G             0.40               11.27                   4.77
Pericarp Color              2     27,066,598         A             G             0.24               3.49                    8.65
Pericarp Color              2     34,824,975         A             C             0.07               9.59                    0.66
Pericarp Color              3     31,599,995         T             A             0.18               9.54                    3.57
Pericarp Color              7     6,127,089 b        G             T             0.33               60.96                   51.68 (6,123,504 a)
Pericarp Color              7     17,543,383         G             T             0.23               8.13                    NA
Pericarp Color              8     12,483,076         T             G             0.21               16.79                   10.89
Shattering degree           2     25,025,325         C             T             0.16               1.80                    7.33
Shattering degree           3     31,392,925         A             G             0.09               8.54                    4.34
Shattering degree           5     898,443            C             T             0.05               10.24                   6.59 (948,266 a)
Shattering degree           8     25,498,378         A             T             0.23               8.98                    6.05
Shattering degree           10    2,319,249          T             G             0.06               3.08                    6.66
Shattering degree           11    2,231,172          C             T             0.05               9.18                    4.79
Shattering degree           12    16,521,488         C             G             0.05               8.93                    3.33
Tiller number               1     4,759,534          G             A             0.21               8.92                    4.11
Tiller number               1     6,840,011          T             C             0.25               8.51                    2.60
Tiller number               2     5,541,400          C             T             0.12               8.80                    2.35
Tiller number               2     25,083,473         A             T             0.13               8.41                    3.87
Tiller number               4     3,760,194          A             T             0.20               2.97                    6.50
Tiller number               8     25,453,518         T             C             0.11               8.31                    3.24
Tiller number               9     23,332,559         A             G             0.34               2.39                    6.82
Tiller number               10    15,239,407         T             A             0.10               2.16                    6.39
Chr. Chromosome; NA, Not available. a The positions of some peak SNPs from the compressed MLM were slightly different from those from the simple model. b Known loci with identified genes, which are reported in Supplementary Note.