

Quality Control Report for Genotypic Data

University of Washington

February 5, 2015

Project: Genetics of Orofacial Clefts and Related Phenotypes
Principal Investigator: Mary L. Marazita, Ph.D.
Support: CIDR Contract # HHSN268201200008I (Task Order #HHSN26800024)
NIH Institute: NIDCR

Contents

1 Summary and recommendations for dbGaP users
2 Project overview
3 Genotyping process
4 Quality control process and participants
5 Sample and participant number and composition
6 Annotated vs. genetic sex
7 Chromosomal anomalies
8 Relatedness
9 Population structure
10 Missing call rates
11 Batch effects
12 Duplicate sample discordance
13 Mendelian errors
14 Hardy-Weinberg equilibrium
15 Minor allele frequency
16 Duplicate SNP probes
17 Sample exclusion and filtering summary
18 SNP filter summary
19 Characteristics of SNP by array source
20 Preliminary association tests
A Project participants

List of Tables

1 Summary of recommended SNP filters
2 Summary of DNA samples and scans
3 Summary of subjects
4 HapMap samples
5 IBD kinship coefficient expected values
6 Summary of SNP missingness by chromosome
7 Duplicate sample discordance error rates and counts
8 SNP characteristics by array source
9 TDT over-transmission comparison

List of Figures

1 Sex discrepancies identified by normalized intensities
2 Sex-chromosome anomalies
3 Normal BAF scan sample A
4 Anomalous BAF scan (sample A)
5 Anomalous BAF scan (sample B)
6 Anomalous BAF scan (sample C)
7 Sex chromosome anomaly XX/XO
8 Sex chromosome anomaly XXX
9 Sex chromosome anomaly XXY
10 Relatedness plot for all study subjects
11 Initial PCA with HapMap
12 PCA scree plot
13 PC-SNP correlation
14 PCA parallel coordinates plot with self-identified race
15 PCA eigenvectors 1-2-4
16 PCA eigenvectors 1-2-6
17 Histogram of missing call rate per sample
18 Boxplot of missing call rate categorized by plate
19 Mean odds ratio from allele frequency batch test
20 Summary of concordance by SNP
21 Mendelian error rate by family
22 PCA selection of homogeneous subset of Caucasian subjects used for HWE
23 QQ plots of HWE p-values for PCA-defined homogeneous Caucasian
24 Distributions of estimated inbreeding coefficient
25 Minor allele frequency distribution
26 QQ plots of association test p-values for TDT
27 QQ plots of association test p-values for parenTDT
28 Manhattan plots of TDT test p-values
29 Manhattan plots of parenTDT test p-values
30 QQ plot TDT: chromosomes 14, 15, and 16

1 Summary and recommendations for dbGaP users

A total of 11,727 study subjects were genotyped on the HumanCoreExomePlusCustom Marazita 15050181 array. These subjects include 2,989 affected, 6,052 unaffected in families which included affected individuals, and 2,685 unaffected in control families (with no history of cleft). The median call rate is 99.96% and the error rate estimated from 264 pairs of study sample duplicates is 2.74e-06. Genotypic data are provided for all subjects and SNPs. Generally, we recommend selective filtering of genotypic data prior to analysis to remove large (> 5 Mb) chromosomal anomalies showing evidence of genotyping error and to remove whole samples with an overall missing call rate > 2%. In this study, there are 33 such anomalies and 59 samples have a missing call rate > 2%. Since this is a family study and the maximum missing call rate (0.0342) is still reasonable, the subjects with missing call rate > 2% were not filtered. Preliminary association test results are provided as an example of how to apply the filters. All SNPs are included in the association test results file, but we recommend that these be filtered according to the criteria specified in Table 1. A composite SNP filter is provided, along with each of the component criteria so that the user may vary thresholds. Additional specific recommendations are highlighted in the following document in italics.

2 Project overview

The purpose of this study is to investigate the genetics of orofacial clefts (OFCs) in a large study population, and importantly, to incorporate subclinical phenotypic features into these studies. Orofacial clefts (OFCs) comprise a significant fraction of human birth defects (about 1/700 live births [1]) and represent a major public health challenge, as individuals with these anomalies require surgical, nutritional, dental, speech, medical and behavioral interventions, thus imposing a substantial economic and personal burden [2]. The most common forms include OFCs of the lip alone (CL), CL plus cleft palate (CL+CP) or of the palate only (CP). Individuals born with OFC may have their first surgical repair at age 3 months, but this initial surgery is just the beginning of a lifetime of health burdens. An individual born with an OFC has a hospital use rate increased for most ages (up to 233% increase for children ages 0-10 years and 16% for middle aged adults [3]). Healthcare costs for children with OFCs are estimated to be 800% greater compared with their unaffected peers [4]. Data from Denmark show that people born with CL with or without CP (CL/P) have an increased mortality up to age 55, which may be attributed to an increased risk of suicide and/or certain cancers [5]. The focus of most OFC genetic research has been CL and/or CP. Furthermore, the majority of OFC (about 70% of CL/P and 50% of CP) is considered “nonsyndromic” [6], i.e. isolated anomalies with no other apparent cognitive or structural abnormalities.

The factors leading to the majority of nonsyndromic OFCs are still unclear, particularly at an individual family level. As is true for many complex traits, substantial progress in gene identification has occurred in the OFC field in the last two years [7, 8]. Genome Wide Association Studies (GWAS) and sequencing studies to date by our research team and others have focused on genetic risk factors for overt CL/P and CP and have been very successful. A major finding from this work is that OFCs exhibit significant genetic heterogeneity, i.e., multiple genetic regions have been implicated [9, 10]. Thus, approaches are needed to understand this genetic heterogeneity. Are there GxG interactions at work? Are there subsets of families, each due to a different gene? Our research group has shown that a promising approach to dissect the etiology of OFC is to focus on subclinical phenotypic features within entire cleft families (not just in affected cases, but also in their non-cleft relatives). These subtle features are believed to represent mild manifestations of the same underlying genetic susceptibility responsible for OFCs; as such, their inclusion in case-control and family-based genetic studies can help to clarify and refine the relationship between genotype and phenotype.

The study population comprises large numbers of families and individuals (about 12,000 individuals) from multiple populations worldwide (Caucasians from the US and Europe, Asians from China, India, and the Philippines, Mixed Native American/Caucasians from South America, and Africans from Nigeria and Ethiopia). There are cases, case families (nuclear families and extended kindreds), as well as controls with no history of OFC nor other developmental defects.

Note that, for the data cleaning, only case status was used in analyses (not making use of subclinical phenotypes), where a case was defined as a subject with any type of clefting. For preliminary association results, a random sample of one complete trio (with an affected offspring) per family was used (see Section 20).

3 Genotyping process

DNA study samples were derived from Saliva (52.1%), Blood (45.1%), Buccal Cells (0.858%), Mouthwash (0.791%) and 0.741% unspecified. Whole genome amplification (WGA) was used for 0.991% of the samples (1.02% of the Blood samples were WGA, 0.808% of the Saliva samples were WGA, and 12.6% of the Buccal Cell samples were WGA). There were a variety of DNA extraction methods used. There were 257 HapMap genotyping control samples.

The samples were genotyped in batches corresponding to 96-well plates with one batch per plate. Each batch on average contained 2 HapMap controls and 2 duplicate study samples. Duplicate samples were not placed in the same batch. All members of a given family were assigned to a randomly chosen plate, with the constraint that a sufficient number of wells were available on that plate. The order of assignment was by family size. We attempted to find a sufficiently-randomized plate design that balanced family case/control status, but we found it impossible to balance case status across plates due to the large number of families, the small number of control families, and the restriction that all family members must be placed on the same plate. Family Case Status, Sex, DNA Source, and Recruitment Site were not stratified, as families were randomly plated. These variables were unbalanced due to the restriction that family members must be on the same plate.

The genotyping was performed at the Center for Inherited Disease Research (CIDR) using the Illumina HumanCoreExomePlusCustom Marazita 15050181 array (BPM annotation version A, genome build 37) and using the calling algorithm GenomeStudio version 2011.1, Genotyping Module 1.9.4 and GenTrain version 1.0. This array is a semi-custom array that includes SNPs from the HumanCore array with additions of custom content SNPs and SNPs from the exome array. The array consisted of a total of 557,677 SNPs, of which 244,594 SNPs (43.9%) are from the exome array and 15,890 SNPs are from custom content. The custom content SNPs were chosen by the study investigators based on previous GWAS results and/or regions of interest suspected to be involved in the genetics of orofacial clefts.

Note that earlier versions of Illumina annotations mis-annotated chromosome information for numerous SNPs designated as X or Y rather than as XY. These SNPs occur in pseudo-autosomal (PAR1, PAR2) regions or in the X-translocated region (XTR). The annotation was corrected prior to genotype calling. Additionally, positions for indels have been updated to reflect the 1000 Genomes reference for use in imputation. SNP positions are given in genome build 37/hg19. Two hundred sixteen (216) SNPs were of Illumina build 36. SNP positions of 204 of these were successfully updated to build 37 with the remaining 12 positions set to unmapped. (See “position” and “position.ilm” in “SNP annotation.csv” for more details on updated positions.)

4 Quality control process and participants

Genotypic data that passed initial quality control at CIDR were released to the Quality Assurance/Quality Control (QA/QC) analysis team at the UWGCC (University of Washington Genetics Coordinating Center), the study investigator’s team and dbGaP. These data were analyzed by the analysis team at UWGCC and the results were discussed with all groups in periodic conference calls. Key participants in this process and their institutional affiliations are given in Appendix A. The results presented here were generated with the R packages GWASTools [11] and SNPRelate [12] unless indicated otherwise. The methods of QA/QC used here are described by Laurie et al. [13].


5 Sample and participant number and composition

In the following, the term “sample” refers to a DNA sample and, for brevity, “scan” refers to a genotyping instance (including genotyping chemistry, array scanning, genotype calls, etc.).

A total of 12,238 samples (including duplicates) from study subjects were put into genotyping production, of which 12,004 were successfully genotyped and passed CIDR’s QC process (Table 2). In the subsequent QA process thirteen (13) samples were found to have sample identity issues and were dropped. In addition, several unexpected duplicates were identified and verified. Refer to Section 8. Consequently, the set of scans to be posted include 11,991 study participants and 257 HapMap controls. The 11,991 study scans derive from 11,727 subjects, 264 of which have duplicate scans (Table 3). The 257 genotyping control scans derive from 128 HapMap subjects, of which 123 have one or two additional replicates. The study subjects occur as 1,170 singletons and 2,787 families with from 2 to 29 members in each. The family count here takes into account super-families that were identified during the analysis of relatedness (Section 8). Table 4 gives the family characteristics and distribution of the HapMap genotyping controls.
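The scan and subject counts in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the paragraph above; the variable names are illustrative, and it assumes that each subject with duplicate scans contributes exactly one extra scan.

```python
# Counts quoted in the text above (variable names are illustrative).
passed_cidr_qc = 12_004    # study scans that passed CIDR's QC
dropped_identity = 13      # scans dropped for sample identity issues
subjects = 11_727          # unique study subjects
hapmap_scans = 257         # HapMap genotyping control scans

study_scans = passed_cidr_qc - dropped_identity
# Assumption: one extra scan per subject with duplicate scans.
duplicate_scans = study_scans - subjects
total_posted_scans = study_scans + hapmap_scans

print(study_scans)         # 11991
print(duplicate_scans)     # 264
print(total_posted_scans)  # 12248
```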

6 Annotated vs. genetic sex

To check annotated vs. genetic sex, we look at both X chromosome heterozygosity and the means of the intensities of SNP probes on the X and Y chromosomes. The expectation is that male and female samples will fall into distinct clusters that differ markedly in X and Y intensities. Figure 1 shows two distinct clusters, as expected. There are 5 samples with unspecified annotated gender (black), 6 samples annotated as male but genotyping as female (blue points in red cluster) and 11 samples annotated as female but genotyping as male (red points in blue cluster). After examination of records by the study investigator and using results from the relatedness analysis (Section 8), all of the samples with unspecified gender and all but one of the samples with mis-annotated gender were matched with the appropriate subject with the correct genetic sex. One sample was dropped because the subject could not be conclusively identified.

A tail of high Y chromosome intensity females can be seen in Figure 1. These were examined further. DNA source (in particular, whole genome amplified (WGA)) was examined and plots of individual autosome vs Y chromosome intensity were also reviewed. The high Y intensity was not an artifact of WGA samples. There was no overall pattern observed between the autosome vs Y chromosome intensities for these females. Since analyses are not performed on the Y chromosome for females, this feature is unlikely to be problematic.

In addition, higher or lower than usual intensities or heterozygosities can be used to identify possible sex discrepancies or sex chromosome anomalies. Males with higher than usual chromosome X intensity have a possible XXY karyotype, females with higher than usual chromosome X intensity have a possible XXX karyotype, and females with lower than usual chromosome X intensity have a possible XX/XO karyotype (Figure 2). These samples were examined further by viewing BAF/LRR plots. It was confirmed that 2 female samples had XXX karyotype, 5 female samples had XX/XO karyotype, and 7 male samples had XXY karyotype (see Section 7).

7 Chromosomal anomalies

Large chromosomal anomalies, such as aneuploidy, copy number variations and mosaic uniparental disomy, can be detected using “Log R Ratio” (LRR) and “B Allele Frequency” (BAF) [14, 15]. LRR is a measure of relative signal intensity (log2 of the ratio of observed to expected intensity, where the expectation is based on other samples). BAF is an estimate of the frequency of the B allele of a given SNP in the population of cells from which the DNA was extracted. In a normal cell, the B allele frequency at any locus is either 0 (AA), 0.5 (AB) or 1 (BB) and the expected LRR is 0. Both copy number changes and copy-neutral changes from biparental to uniparental disomy (UPD) result in changes in BAF, while copy number changes also affect LRR.
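The two definitions above can be made concrete with a short sketch. The function names are illustrative and not part of GWASTools. The model is idealized: it assumes a non-mosaic region in which every cell carries the same whole number of copies, and it ignores measurement noise.

```python
import math

def log_r_ratio(observed, expected):
    """LRR: log2 of the ratio of observed to expected intensity."""
    return math.log2(observed / expected)

def expected_baf_bands(copy_number):
    """Idealized BAF bands for a non-mosaic region: the fraction of
    B alleles among `copy_number` copies, for 0..copy_number B alleles."""
    return [b / copy_number for b in range(copy_number + 1)]

# Normal diploid region: bands at 0 (AA), 0.5 (AB) and 1 (BB), LRR of 0.
print(expected_baf_bands(2))   # [0.0, 0.5, 1.0]
print(log_r_ratio(1.0, 1.0))   # 0.0

# Heterozygous deletion (one copy): the intermediate band disappears.
print(expected_baf_bands(1))   # [0.0, 1.0]

# Trisomy (three copies): intermediate bands at 1/3 and 2/3.
print(expected_baf_bands(3))
```

In a mosaic sample only some cells carry the anomaly, so the intermediate band splits by less than these idealized values suggest.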

To identify aneuploid or mosaic samples systematically, we used two methods. For anomalies that split the intermediate BAF band into two components, we used Circular Binary Segmentation (CBS) [16] on BAF values for SNPs not called as homozygotes. For heterozygous deletions (with loss of the intermediate BAF band), we identified runs of homozygosity accompanied by a decrease in LRR. See [17] for a full description and application of this process. All sample-chromosome combinations with anomalies greater than 5 Mb or sample-chromosome combinations with the sum of the lengths of the anomalies greater than 10 Mb were verified by manual review of BAF and LRR plots.
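The manual-review rule in the last sentence can be sketched as follows. This is only an illustration of the two length thresholds, not the detection code used for the report. The tuple layout (sample, chromosome, start, end) and the function name are assumptions made for the example.

```python
from collections import defaultdict

MB = 1_000_000

def combos_for_review(anomalies, single_mb=5, total_mb=10):
    """Return sample-chromosome pairs that have any anomaly longer than
    `single_mb` Mb or a summed anomaly length above `total_mb` Mb.

    `anomalies` is an iterable of (sample, chromosome, start_bp, end_bp);
    this layout is hypothetical.
    """
    lengths = defaultdict(list)
    for sample, chrom, start, end in anomalies:
        lengths[(sample, chrom)].append(end - start)
    return sorted(
        key for key, lens in lengths.items()
        if max(lens) > single_mb * MB or sum(lens) > total_mb * MB
    )

example = [
    ("S1", 2, 10 * MB, 17 * MB),   # single 7 Mb anomaly
    ("S2", 5, 0, 4 * MB),          # three anomalies: 4 + 4 + 3 = 11 Mb
    ("S2", 5, 20 * MB, 24 * MB),
    ("S2", 5, 50 * MB, 53 * MB),
    ("S3", 12, 0, 3 * MB),         # single 3 Mb anomaly, not flagged
]
print(combos_for_review(example))  # [('S1', 2), ('S2', 5)]
```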

Figure 3 shows BAF/LRR plots for chromosome 5 in Sample A. This chromosome shows a normal pattern, with LRR centered at 0 and all three BAF bands at 0, 0.5, and 1 (corresponding to AA, AB, and BB genotypes).

Figure 4 shows BAF/LRR plots for chromosome 2 in the same sample, which shows an abnormal pattern with a split in the heterozygous band and a corresponding decrease in LRR. The BAF/LRR pattern is consistent with a mosaic deletion. The split is wide enough to cause genotyping errors, with some heterozygotes evidently called as homozygotes. Because of the apparent genotyping errors, we recommend filtering out genotype calls in the anomaly region on this chromosome for this sample.

Figure 5 shows BAF/LRR plots for Sample B chromosome 12. This pattern is consistent with a heterozygous deletion. Because this is a very large deletion, it is most likely acquired (not constitutional). Such large deletions are recommended for filtering.

Figure 6 shows BAF/LRR plots for chromosome 2 in Sample C, which shows a split in the heterozygous band wide enough to cause genotyping errors. Since LRR remains centered at 0 (copy neutral), this pattern is consistent with mosaic uniparental disomy, possibly due to aneuploidy rescue. Because of the apparent genotyping errors, we recommend filtering out genotype calls in the anomaly region on this chromosome for this sample.

Recall from Section 6 and Figure 2 that several sex chromosome anomalies were suggested by outlier X and/or Y chromosome intensities or heterozygosities.

Figure 7 shows BAF/LRR plots for chromosome X for Female Sample D. A split in the heterozygous band and decreased LRR (relative to other females) is consistent with an XX/XO mosaic karyotype. (The mean LRR for females is greater than zero and, for males, less than zero.) In this chromosome, the BAF split is wide enough to indicate genotyping errors, so filtering is recommended. There are five (5) samples with XX/XO karyotypes in this study.

Figure 8 shows BAF/LRR plots for chromosome X for Female Sample E. A split in the heterozygous band at 1/3 and 2/3 and increased LRR (relative to other females) is consistent with trisomy, i.e. with an XXX karyotype. There are two (2) XXX karyotypes in this study. These anomalies are not recommended for filtering since there is no indication of genotyping error and they may be acquired.

In the discussion below, XY SNPs refer to SNPs that are on both the X and Y chromosomes in the pseudo-autosomal PAR1 and PAR2 regions on either end of the chromosome and the XTR region (X-translocated region) in the middle region of the chromosome. Figure 9 shows BAF/LRR plots for chromosome X in Male F. The pattern of the BAF/LRR plots is consistent with XXY karyotype. Note the presence of a middle heterozygous band. Splits in the heterozygous band in the pseudo-autosomal regions (PAR1, PAR2, and XTR) indicate three copies, which is consistent with XXY karyotype and distinguishes the plot from a typical X-chromosome BAF plot of a female. Seven (7) samples with X-chromosome BAF/LRR patterns consistent with XXY karyotype were identified in this study. The X and XY chromosome SNPs are filtered for this sample, not because of errors, but because of the unusual situation of X heterozygosity in males.

Four subjects were found to have trisomy of chromosome 21. All of these subjects were verified to have Down’s syndrome by study investigator records. Chromosome 21 SNPs for these samples were not filtered; however, depending upon the study goals, users may want to also filter these SNPs (refer to “chromosome anomalies.csv”). None of these subjects were used in subsequent analyses.

The breakpoints of all detected anomalies > 5 Mb in length are provided in the file “chromosome anomalies.csv”. Of the 82 anomalies listed in this file, 33 are recommended for filtering. Two PLINK files are provided: one with no filtering, and one with genotypes in filtered anomaly regions set to missing.

We also examine BAF/LRR plots for evidence of sample contamination (more than 3 BAF bands on all chromosomes) and other artifacts. For this we examine scans that are high or low outliers for heterozygosity, high outliers for BAF standard deviation (for non-homozygous genotypes), and high outliers for relatedness connectivity (the number of samples to which a sample appears to be related with kinship coefficient > 1/32). No samples with evidence of contamination or unusual genotyping artifacts were found in this study.

8 Relatedness

The relatedness between each pair of participants was evaluated by estimation of the kinship coefficient (KC):

$$KC = \frac{1}{2} k_2 + \frac{1}{4} k_1 \qquad (1)$$

where $k_2$ is the probability that two pairs of alleles are identical by descent (IBD) and $k_1$ is the probability that one pair of alleles is IBD. Table 5 shows the expected coefficients for some common relationships.
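Equation (1) can be evaluated directly for a few common relationships. The IBD-sharing probabilities below are the standard values for outbred pedigrees and are supplied here as an illustration; they are not copied from Table 5, which is not reproduced in this section.

```python
def kinship_coefficient(k1, k2):
    """Equation (1): KC = (1/2) * k2 + (1/4) * k1."""
    return 0.5 * k2 + 0.25 * k1

# (k1, k2) for outbred pedigrees; standard textbook values, listed here
# as assumptions rather than taken from the report's Table 5.
relationships = {
    "monozygotic twins / duplicates": (0.0, 1.0),
    "parent-offspring": (1.0, 0.0),
    "full siblings": (0.5, 0.25),
    "half siblings": (0.5, 0.0),
    "first cousins": (0.25, 0.0),
    "unrelated": (0.0, 0.0),
}

for name, (k1, k2) in relationships.items():
    print(f"{name}: KC = {kinship_coefficient(k1, k2)}")
```

Parent-offspring and full-sibling pairs share the same KC of 0.25, which is why a second statistic such as IBS0 is needed to tell them apart.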

Initial IBD coefficients were estimated using 67,371 autosomal SNPs and the KING-robust procedure [18], but implemented in R using the package SNPRelate [12]. The SNPs were selected by LD pruning from an initial pool consisting of all autosomal SNPs with a missing call rate < 5% and minor allele frequency (MAF) > 5% with all pairs of SNPs having r² < 0.1 in a sliding 10 Mb window. KING-robust provides estimates of the kinship coefficient and IBS0 (the fraction of SNPs that share no alleles), from which relationships can be inferred. KING-robust was used because it is robust to population structure, which is needed for this mixture of multiple ethnic and ancestral groups. This study has a rich relatedness structure (with over 2,500 families, some of which had inbreeding). The initial KING results showed numerous unexpected relationships, including unexpected duplicates.

Many of the populations represented in this study are admixed. KING-robust is robust to discrete population substructure but has been shown to provide biased estimates in admixed populations [19]. Therefore, to aid in pedigree resolutions, we also found kinship coefficients using a new method called PC-Relate developed by Matt Conomos, Bruce Weir, and Timothy Thornton at University of Washington, Department of Biostatistics [20]. PC-Relate is a principal component analysis (PCA) based method for genetic relatedness inference in samples with unspecified population structure. PC-Relate provides accurate estimates of kinship coefficients and identity by descent (IBD) sharing probabilities in structured samples without requiring additional reference population panels or prior specification of the number of ancestral sub-populations. (See also Section 9.)

Expected relationships were based on the pedigree. Observed relationships were inferred using estimated kinship coefficients (see Table 1 in [18]; full-sibling and parent-offspring relationships were distinguished by IBS0). Both KING and PC-Relate kinship coefficients were considered during the pedigree resolution.

Relatedness estimates were used to identify sample swaps and other pedigree issues. Most of the unexpected relationships could be resolved and the pedigree corrected. After pedigree resolution, however, an additional twelve (12) samples were dropped because the sample identity could not be verified (recall one subject was dropped due to unresolvable gender mis-annotation). In addition, twenty-four (24) super-families were formed, where at least one individual in each of two different families were related by a degree 2 (Deg2) or higher relationship but an unambiguous pedigree could not be formed. A pair of subjects was determined to be degree 2 or higher if the kinship coefficient KC satisfied KC > 0.08839 = 2^(-3.5), the lower boundary of inference for Degree 2 relationships (see Table 1 in [18]). Note that neither the KING nor the PC-Relate kinship coefficients are precise enough to reliably distinguish between unrelated and degree 3 (Deg3) relationships, so these were not used in making any pedigree change decisions where a difference between a degree 3 and an unrelated relationship was the only evidence.
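The threshold rule above can be sketched as a small classifier. Only the Deg2 lower boundary 2^(-3.5) is stated in this section, and Section 9 uses 2^(-9/2) as the Deg3 boundary. The other two cut points, 2^(-2.5) and 2^(-1.5), are filled in here on the assumption that they follow the same powers-of-two pattern as Table 1 of [18]. The function name and labels are illustrative.

```python
def infer_degree(kc):
    """Map a kinship coefficient to a relationship degree.

    Cut points: 2^(-3.5) is from the text above, 2^(-4.5) from Section 9;
    2^(-2.5) and 2^(-1.5) are assumed by analogy with Table 1 of [18].
    """
    if kc > 2 ** -1.5:
        return "duplicate or monozygotic twin"
    if kc > 2 ** -2.5:
        return "Deg1"
    if kc > 2 ** -3.5:
        return "Deg2"
    if kc > 2 ** -4.5:
        return "Deg3"
    return "unrelated"

print(round(2 ** -3.5, 5))   # 0.08839
print(infer_degree(0.25))    # Deg1 (parent-offspring or full sibling)
print(infer_degree(0.125))   # Deg2
print(infer_degree(0.01))    # unrelated
```

As the text notes, the boundary between Deg3 and unrelated is the least reliable one, so a label near that boundary should not be used on its own to change a pedigree.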

Figure 10 displays the KING-robust results for 15,918 pairs of subjects with kinship coefficient > 1/32 after revision of the pedigree. Note that this plot only shows pairs of subjects after duplicates were removed; an intermediate run of KING after revision of the pedigree verified all expected duplicates from the revised pedigree. (The corresponding plot of PC-Relate results would look similar; however, the offset cluster around IBS 0.06 would not be there as the PC-Relate kinship coefficient estimates would be lower (more accurately reflecting the relationship).) The plot appears to still indicate unresolved unexpected relationships. The apparent unexpected duplicates (red triangles) were due to 36 pairs of monozygotic twins and one set of monozygotic triplets. As well, the unexpected parent-offspring relationships which were expected to be Deg2 (blue triangles in cyan parent-offspring cluster) were between one of a pair of monozygotic twins and the children of the other twin. Some of the blue triangles above the full-sibling threshold are actually not unexpected as they are half-sib-first-cousins (same mother, different fathers who are brothers). Some of the other unexpected relationships reflect Deg2 relationships that are accounted for via super-families. Others reflect higher (or lower) kinship coefficient than expected due to inbreeding. Also note that, for some of the unexpected relationships as indicated by KING kinship coefficients, the PC-Relate kinship coefficients would yield an inferred relationship that matches the expected relationship.

The pedigree file “Pedigree.csv” contains a variable that identifies the monozygotic twins and triplets, a variable that indicates the presence of inbreeding or family loops (such as sisters marrying brothers or the same woman marrying brothers), and a variable that identifies super-families. In addition, a file of estimated kinship coefficients “kinship coefficient.csv” is provided. This file contains both the KING and PC-Relate kinship coefficient estimates and has variables indicating remaining differences between expected and estimated relationships. (See “README kinship coefficient.txt” for further details.)

9 Population structure

To investigate population structure, we use principal components analysis (PCA), essentially as described by Patterson et al. [21], but implemented in R (SNPRelate package). We and others [22] have shown that it is often necessary to perform linkage disequilibrium (LD)-based or other pruning of the SNPs to be used for PCA, in order to avoid having sample eigenvectors that are determined by small clusters of SNPs at specific locations, such as the LCT, HLA, or polymorphic inversion regions [22]. Therefore, the SNPs used for PCA were selected by LD pruning from an initial pool consisting of all autosomal SNPs with a missing call rate < 5% and minor allele frequency (MAF) > 5%. In addition, the 2q21 (LCT), HLA, 8p23, and 17q21.31 regions were excluded from the initial pool. The LD pruning process selects SNPs from the initial pool with all pairs having r² < 0.1 in a sliding 10 Mb window.
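The idea behind LD pruning can be illustrated with a minimal greedy sketch in plain Python. It is not the SNPRelate implementation: it assumes one chromosome, SNPs sorted by position, complete genotypes coded as 0/1/2 copies of one allele, and it measures LD as the squared Pearson correlation of those genotype codes. All names are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:      # monomorphic SNP: correlation undefined
        return 0.0
    return sxy * sxy / (sxx * syy)

def ld_prune(snps, r2_max=0.1, window_bp=10_000_000):
    """Greedy pruning: keep a SNP only if its r^2 with every already-kept
    SNP within `window_bp` is below `r2_max`.

    `snps` is a list of (position_bp, genotypes) sorted by position.
    """
    kept = []
    for pos, geno in snps:
        nearby = [g for p, g in kept if pos - p <= window_bp]
        if all(r_squared(geno, g) < r2_max for g in nearby):
            kept.append((pos, geno))
    return [p for p, _ in kept]

snps = [
    (1_000, [0, 1, 2, 0, 1, 2]),
    (2_000, [0, 1, 2, 0, 1, 2]),       # identical to the first SNP: dropped
    (3_000, [1, 0, 1, 1, 2, 1]),       # uncorrelated with the first: kept
    (20_000_000, [0, 1, 2, 0, 1, 2]),  # beyond the 10 Mb window: kept
]
print(ld_prune(snps))  # [1000, 3000, 20000000]
```

The defaults mirror the r² < 0.1 threshold and 10 Mb window quoted above. The MAF and missing-call-rate filters and the excluded regions would be applied before this step.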

The PC-Relate method discussed in Section 8 uses an iterative process of principal component analysis along with estimating IBD coefficients. The final iteration produces IBD coefficient estimates for all pairs of subjects and principal component eigenvectors for all study subjects. Starting with KING robust relatedness estimates, a maximal set of unrelated subjects is chosen based on the kinship coefficients. (We defined related for these purposes as having a kinship coefficient greater than 2^(−9/2) ≈ 0.0442, i.e. related at the Deg3 level or closer.) Principal component analysis is performed on this unrelated set using an LD-pruned set of SNPs (as described above). In order to obtain eigenvectors for all study subjects (related and unrelated), we implemented the approach described by Zhu et al. [23]. In this approach, the SNP eigenvectors from the PCA of the maximal set of unrelated samples are then used to calculate sample eigenvectors for the remaining subjects. PC-Relate then uses these eigenvectors to incorporate population structure in computing new IBD coefficient estimates. (How many eigenvectors to use depends on assessing how many eigenvectors appear to show population structure. Here we used 13 eigenvectors.) A new maximal unrelated set of samples is found and the process is repeated. Three iterations were performed, after which the top principal components from the next-to-last iteration were highly correlated with the top principal components from the last iteration. For the first PCA analysis, unrelated HapMap controls genotyped with the study subjects were included to establish continental ancestry. In subsequent steps, only study subjects were included.
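As a quick numerical check of the relatedness cutoff, the sketch below evaluates the threshold 2^(−9/2) for third-degree relatives. The general formula 2^(−(2d+3)/2) for degree d is the convention commonly used with KING estimates and is an assumption here; the report itself states only the Deg3 value. The function names are illustrative.

```python
def kinship_threshold(degree):
    """Lower kinship-coefficient bound for relatives of the given degree.

    Assumes the cutoff 2 ** (-(2 * degree + 3) / 2), the convention commonly
    used with KING; only the degree-3 value appears in the report.
    """
    return 2.0 ** (-(2 * degree + 3) / 2)


def is_related(kinship, max_degree=3):
    """True if a pair counts as related at max_degree or closer."""
    return kinship > kinship_threshold(max_degree)


print(round(kinship_threshold(3), 4))  # 0.0442
```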

For the initial analysis, PCA was performed on 5,993 unrelated study subjects plus 101 unrelated HapMap controls genotyped with the study subjects using 66,996 pruned SNPs. Figure 11 displays the results for eigenvector 1 vs eigenvector 2 with the HapMap controls and study subjects plotted separately. The study subjects are color-coded by self-identified race and the HapMap subjects are color-coded by population group. The study subjects come from many sites around the world with diverse self-identified race. The sites are Argentina, China, Colombia, Denmark, Ethiopia, Guatemala, Hungary, India, Spain, Nigeria, Philippines, Puerto Rico, Turkey, and several sites within the United States. The self-identified race categories included Caucasian (44.5%), Caucasian/African-American/Native-American denoted by Cauc/AfrAmr/NatAmr (21.2%), Asian (18.6%), Caucasian/Native-American denoted by Cauc/NatAmr (9.1%), African (4%), Native-American/Alaskan Native/Other-Native-People denoted by NatAmr/AlskNat/OthNat (1.7%), a small number of Caucasian/African-American (Cauc/AfrAmr) or Caucasian/Asian (Cauc/Asian) from the United States, and 0.5% Unspecified. The HapMap samples represent a variety of population groups, as can be seen in Table 4. Figure 11 shows that there are no distinct outliers where self-identified race is at odds with the PCA-determined ancestry. However, it also points out the diversity of some of the self-identified categories (e.g. the Caucasian group is quite diverse).

The final PCA was performed using 5,937 study subjects and 66,998 pruned SNPs. The scree plot (Figure 12) shows that the fraction of variance accounted for falls off dramatically after the first three components. To determine whether the LD-pruning effectively prevented the occurrence of small clusters of SNPs that are highly correlated with a specific eigenvector, we examine plots of the correlation of each SNP with each eigenvector. These plots are similar to GWAS “Manhattan” plots except that the Y-axis has the SNP-eigenvector correlation rather than an association test p-value. Figure 13 shows these plots for the first 8 eigenvectors. No large clusters of highly correlated SNPs are evident in these plots, indicating that each eigenvector is related to many SNPs distributed across all chromosomes.

Figures 14, 15, and 16 help visualize the population substructure revealed by the final PCA eigenvectors. Figure 14 is a parallel coordinates plot, color-coded by self-identified race; each vertical line represents an eigenvector and each piecewise line between the vertical lines traces eigenvector values for a given subject. Note that eigenvectors 3 and 4 separate out three subsets of Asian subjects (note the three separated bands colored green) and two subsets of African subjects (note the two separated bands colored red). The three Asian bands separate out subjects from China, from India, and from the Philippines, and the two African bands separate out subjects from Nigeria and from Ethiopia. Further note that eigenvector 6 appears to separate out the Caucasian/African-American/Native-American (cyan) group. Figure 15 and Figure 16 further illustrate how the eigenvectors separate out different groups, revealing population structure. In these figures, the race group designations are fine-tuned using site to reveal more specific population structure.

10 Missing call rates

Two missing call rates were calculated for each sample and for each SNP in the following way (and are provided in files “SNP analysis.csv” and “Sample analysis.csv” on dbGaP). (1) missing.n1 is the missing call rate per SNP over all samples (including HapMap controls). (2) missing.e1 is the missing call rate per sample for all SNPs with missing.n1 < 100%. (3) missing.n2 is the missing call rate per SNP over all samples with missing.e1 < 5%. In this project, all samples have missing.e1 < 5%, so missing.n1 = missing.n2. (4) missing.e2 is the missing call rate per sample over all SNPs with missing.n2 < 5%.
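The four-step definition above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the code used to produce the report: genotypes are held in a hypothetical SNP-by-sample list of lists with None marking a missing call, and every filtering step is assumed to leave at least one SNP and one sample.

```python
def missing_call_rates(geno, snp_cut=0.05, sample_cut=0.05):
    """Return (missing.n1, missing.e1, missing.n2, missing.e2).

    geno[i][j] is the genotype of SNP i in sample j, or None if missing.
    """
    all_snps = range(len(geno))
    all_samples = range(len(geno[0]))

    def snp_rate(i, samples):
        return sum(geno[i][j] is None for j in samples) / len(samples)

    def sample_rate(j, snps):
        return sum(geno[i][j] is None for i in snps) / len(snps)

    # (1) per-SNP rate over all samples
    n1 = [snp_rate(i, all_samples) for i in all_snps]
    # (2) per-sample rate over SNPs that did not fail completely
    snps1 = [i for i in all_snps if n1[i] < 1.0]
    e1 = [sample_rate(j, snps1) for j in all_samples]
    # (3) per-SNP rate over samples with missing.e1 below the cutoff
    samples2 = [j for j in all_samples if e1[j] < sample_cut]
    n2 = [snp_rate(i, samples2) for i in all_snps]
    # (4) per-sample rate over SNPs with missing.n2 below the cutoff
    snps2 = [i for i in all_snps if n2[i] < snp_cut]
    e2 = [sample_rate(j, snps2) for j in all_samples]
    return n1, e1, n2, e2


geno = [
    [0, 1, 2, 0],
    [None, None, None, None],  # completely failed SNP
    [0, None, 1, 1],
]
n1, e1, n2, e2 = missing_call_rates(geno)
print(n1, e1, n2, e2)
```

The second pass matters only when some samples or SNPs exceed the cutoffs; when none do, as reported for the samples in this project, missing.n1 and missing.n2 coincide.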

In this study, the two missing rates by sample are very similar, with median values of 0.000377 (missing.e1) and 0.000284 (missing.e2). Figure 17 shows the distribution of missing.e1. There are 59 samples with a missing call rate > 2%.

The two missing call rates by SNP are identical. Table 6 gives a summary of SNP genotyping failures and missingness by chromosome type. For SNPs that passed the genotyping center QC, the median value of missing.n1 is 0.000163 and 98.9% of SNPs have a missing call rate < 0.02. The Y chromosome has a high technical failure rate: this is not unusual, and it should be noted that all Y SNPs were manually reviewed by CIDR (the genotyping center) and, if problematic, were either re-called (to fix, if possible) or failed. We generally recommend filtering out samples with a missing call rate > 2%; however, since this is a family study and the maximum missing call rate (0.0342) is still reasonable, the subjects with missing call rate > 2% were not filtered out. We recommend filtering out SNPs with a missing call rate > 2%.

11 Batch effects

The samples were processed together in batches consisting of complete or partial 96-well plates with one batch per plate. As one way to identify any batch effects, we plotted log10 of the autosomal missing call rate (Figure 18) for each batch. There is highly significant variation among batches in log10 of the autosomal missing call rate (p = 3.2e-224), but all plates have a low mean missing call rate, so this variation should not be problematic.

Another way to detect genotyping plate effects is to assess the difference in allelic frequencies between each plate and a pool of the other plates. We calculated the odds ratio (OR) for each SNP and each plate and then averaged these statistics over SNPs, using only study samples. The quantity averaged was 1/min(OR, 1/OR), so that departures from 1 in either direction contribute equally to the mean odds ratio. This statistic is a measure of how different each plate is from the other plates. Figure 19 shows the mean odds ratio compared with the fraction of Caucasian samples on the plate; Caucasian is the largest self-identified race group. There are no outlier plates.
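The plate statistic can be illustrated with a short Python sketch. The allele-count layout, the function names and the rule of skipping SNPs with a zero cell (where the odds ratio is undefined) are assumptions for illustration; the report does not say how such SNPs were handled.

```python
def allelic_odds_ratio(a_plate, b_plate, a_rest, b_rest):
    """Odds ratio of allele A vs allele B on one plate vs all other plates."""
    return (a_plate * b_rest) / (b_plate * a_rest)


def mean_plate_odds_ratio(allele_counts):
    """Mean over SNPs of 1/min(OR, 1/OR) for one plate.

    allele_counts holds one (a_plate, b_plate, a_rest, b_rest) tuple of
    allele counts per SNP. SNPs with any zero count are skipped (assumption).
    """
    values = []
    for counts in allele_counts:
        if 0 in counts:
            continue
        odds_ratio = allelic_odds_ratio(*counts)
        values.append(1.0 / min(odds_ratio, 1.0 / odds_ratio))
    return sum(values) / len(values)


counts = [
    (10, 10, 100, 100),  # OR = 1   -> contributes 1
    (20, 10, 100, 100),  # OR = 2   -> contributes 2
    (10, 20, 100, 100),  # OR = 0.5 -> contributes 2
]
print(mean_plate_odds_ratio(counts))
```

A plate whose allele frequencies match the pool gives a value near 1; larger values indicate a plate that differs from the rest, regardless of which allele is over-represented.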

In addition, since the association tests performed here will use transmission disequilibrium (TDT) from parents to children and all members of a given family were assigned to the same plate, there should be no problematic plate effects.

12 Duplicate sample discordance

Genotyping error rates can be estimated from duplicate discordance rates. The genotype at any SNP may be called correctly, or miscalled as either of the other two genotypes. If α and β are the two error rates, the probability that duplicate genotyping instances of the same participant will give a discordant genotype is 2[(1 − α − β)(α + β) + αβ]. When α and β are very small, this is approximately 2(α + β), or twice the total error rate. Potentially, each true genotype has different error rates (i.e. three α and three β parameters), but here we assume they are the same. In this case, since the median discordance rate over all sample pairs is 5.49e-06, a rough estimate of the mean error rate is 2.74e-06 errors per SNP per sample, indicating a high level of reproducibility.
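The relation between discordance and error rate can be checked numerically. The sketch below is illustrative only, with hypothetical function names; it evaluates the exact discordance probability and the small-error approximation used to turn the median discordance into an error-rate estimate.

```python
def discordance_probability(alpha, beta):
    """Probability that two calls of the same true genotype disagree."""
    return 2.0 * ((1.0 - alpha - beta) * (alpha + beta) + alpha * beta)


def error_rate_from_discordance(discordance):
    """Total error rate (alpha + beta) under the approximation
    discordance ~ 2 * (alpha + beta), valid when both rates are small."""
    return discordance / 2.0


# Median discordance rate over all duplicate sample pairs, from the report.
print(error_rate_from_discordance(5.49e-06))
```

The approximation drops terms of second order in the error rates, which are negligible at the rates seen here.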

Duplicate discordance estimates for individual SNPs can be used as a SNP quality filter. The challenge here is to find a level of discordance that would eliminate a large fraction of SNPs with high error rates, while retaining a large fraction with low error rates. The probability of observing > x discordant genotypes in a total of n pairs of duplicates can be calculated using the binomial distribution. Table 7 shows these probabilities for x = 0 through x = 3 and n = 264. Here we chose n = 264 to correspond to the number of pairs of duplicate study samples. We recommend a filter threshold of > 2 discordant calls because this retains > 98% of SNPs with an error rate < 10⁻³, while removing 89.65% of SNPs with an error rate > 10⁻². This threshold eliminates 374 SNPs. Note that SNPs with > 0 discordances in HapMap subjects were failed by CIDR.
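The threshold reasoning can be reproduced with a short binomial calculation. The sketch below is a hedged illustration, not the code behind Table 7: it assumes the total error rate is split equally between the two miscall types (α = β = error rate / 2) when computing the per-pair discordance probability, and the function name is hypothetical.

```python
from math import comb


def prob_more_than_x_discordant(x, n_pairs, error_rate):
    """P(more than x discordant calls among n_pairs duplicate pairs).

    Assumes alpha = beta = error_rate / 2 (an assumption made here) and
    independent duplicate pairs, so the count is binomial.
    """
    alpha = beta = error_rate / 2.0
    p = 2.0 * ((1.0 - alpha - beta) * (alpha + beta) + alpha * beta)
    cdf = sum(
        comb(n_pairs, k) * p ** k * (1.0 - p) ** (n_pairs - k)
        for k in range(x + 1)
    )
    return 1.0 - cdf


for rate in (1e-3, 1e-2):
    print(rate, prob_more_than_x_discordant(2, 264, rate))
```

Under these assumptions a SNP with error rate 10⁻³ exceeds two discordances in fewer than 2% of cases, while a SNP with error rate 10⁻² does so in roughly 90% of cases, in line with the figures quoted above.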

Figure 20 summarizes the concordance by SNP, binned by MAF. Figure 20a shows the number of SNPs in each MAF bin. Figure 20b shows the correlation of allelic dosage (r²), which is greater for SNPs with higher MAF. Figure 20c shows the overall concordance, which is very high for all SNPs. For SNPs with low MAF, we expect high concordance because these SNPs are most likely to be called as homozygous for the major allele and thus be concordant by chance. Figure 20d shows the minor allele concordance, which is the concordance between genotypes in the members of sample pairs in which both genotypes have at least one copy of the minor allele (i.e. excluding pairs where both are the major homozygote). This concordance measure is more reflective of true genotyping concordance for low-MAF SNPs and the distribution is very similar to the correlation.

13 Mendelian errors

Mendelian errors were analyzed in the 5,288 trios or single-parent/offspring pairs of study and HapMap subjects. This included 3,247 study subject trios and 2,015 study subject single-parent/offspring pairs. The number of Mendelian errors per SNP is a useful quality filter, although establishing a suitable threshold for the filter is difficult. Mendelian errors can be due to copy number variation or another chromosomal anomaly. The threshold recommended is rather subjective but is based upon the number of trios investigated and viewing cluster plots for a random sample of SNPs with varying numbers of errors. For this study the threshold chosen was 20 Mendelian errors. Only 11.1% of SNPs have any Mendelian errors and 2,829 SNPs have more than 20 errors. We recommend filtering out SNPs with more than twenty (20) Mendelian errors. Filtering information for the Mendelian errors is provided in the file “SNP analysis.csv”.

We also investigated Mendelian error by family. Figure 21 displays a distribution of the Mendelian error rate per trio for 3,247 study trios (top plot) and a distribution of Mendelian error rate per parent-offspring pair for 2,015 study single-parent/offspring pairs (bottom plot). The Mendelian error rate = (number of Mendelian errors)/(number of SNPs examined). Outliers were examined: trios with error rate > 0.0015 and single-parent/offspring pairs with error rate > 0.00025. Errors were not focused in any particular part of the genome, there was no pattern in these families with regard to chromosome anomalies, and there was no pattern with respect to differences in ancestry. However, users may want to run association analyses with and without these outlier families for comparison.
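For readers who want to see what is being counted, the following sketch checks Mendelian consistency for autosomal SNPs and computes the per-family error rate defined above. It is a simplified illustration: genotypes are coded as the number of copies of one allele (0, 1 or 2), None marks a missing genotype or an absent parent, and a SNP is treated as "examined" when the child and at least one parent are genotyped. These conventions and the function names are assumptions, not the report's implementation.

```python
# Alleles (0 or 1) that a parent with a given genotype can transmit.
GAMETES = {0: {0}, 1: {0, 1}, 2: {1}}


def is_mendelian_error(father, mother, child):
    """True if the child's autosomal genotype cannot arise from the parents."""
    if child is None:
        return False
    from_father = GAMETES[father] if father is not None else {0, 1}
    from_mother = GAMETES[mother] if mother is not None else {0, 1}
    possible = {a + b for a in from_father for b in from_mother}
    return child not in possible


def mendelian_error_rate(family_genotypes):
    """Errors divided by SNPs examined for one trio or parent-offspring pair.

    family_genotypes holds one (father, mother, child) tuple per SNP.
    """
    examined = [
        g for g in family_genotypes
        if g[2] is not None and (g[0] is not None or g[1] is not None)
    ]
    errors = sum(is_mendelian_error(*g) for g in examined)
    return errors / len(examined)


trio = [(0, 0, 1), (0, 2, 1), (1, 1, 0), (2, 2, 2)]
print(mendelian_error_rate(trio))
```

With only one parent genotyped, an error is detectable only when parent and child are opposite homozygotes, which is consistent with the lower outlier threshold used above for single-parent/offspring pairs.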

The file “MendelError byFamily.csv” contains data for each trio and single-parent/offspring pair. Variables include the total count of Mendelian errors, the number of SNPs examined, the error rate, counts of Mendelian errors by chromosome, and a variable “type” indicating trio or single-parent/offspring pair.

14 Hardy-Weinberg equilibrium

Since the goal of the Hardy-Weinberg Equilibrium (HWE) testing is to use departure from HWE to identify poor quality SNPs and not the presence of population structure, we need to use genetically homogeneous sets of samples. Since affection status can also affect deviation from HWE, we use control samples. For this study, samples from subjects in control families are the best controls to use. Given that we are using control families only, the only racial/ethnic (potentially non-admixed) group that had adequate sample size is Caucasian. Since subjects self-identified as Caucasian are quite diverse (see Section 9), we needed to restrict to a more genetically homogeneous subset.

To determine a more genetically homogeneous subset of Caucasian subjects, a maximal unrelated set of Caucasian subjects was found using kinship coefficients. Then PCA was performed on this set of subjects. Figure 22 shows the plot of eigenvector 1 against eigenvector 2. To restrict this diverse group to a more genetically homogeneous subset, we selected subjects with eigenvector 1 > −0.007 and eigenvector 2 > −0.03, indicated in the figure as the points above the horizontal line and to the right of the vertical line. This subset of 2,167 Caucasian subjects is indicated by the logical variable “pca homog Caucasian” in “Sample analysis.csv”.

We calculated an exact test of HWE using study subjects who (1) are unrelated, (2) have a missing call rate < 2%, (3) are genetically homogeneous Caucasian as described above, and (4) are from control families. This set of 847 subjects is indicated by the logical variable “hwe” in “Sample analysis.csv”.

Figure 23 shows the quantile-quantile (QQ) plots for the HWE test. The autosomal SNPs deviate from expectation between p-values of 0.01 and 0.001, and the X chromosome SNPs have a small deviation beginning around 0.01. The X versus autosomal difference has been observed in many other studies. The reason(s) for it are not clear, but appear to be unrelated to sample size, since the difference generally is observed even when only females are analysed for autosomes.

Deviations from HWE due to population structure are expected to result in an excess of homozygotes or a positive inbreeding coefficient estimate, calculated as 1 − (number of observed heterozygotes)/(number of expected heterozygotes). A comparison of the observed distribution of the inbreeding coefficient estimates (for a random sample of autosomal SNPs) with a simulated distribution of inbreeding coefficient estimates for the same set of SNPs under the assumption of Hardy-Weinberg equilibrium was performed. Figure 24 shows that the observed and simulated distributions are similar. We conclude that most deviations from HWE result from genotyping artifacts, rather than population structure.
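The inbreeding coefficient estimate defined above can be written out for a single autosomal SNP as follows. This is a minimal sketch with a hypothetical function name; it takes genotype counts as input and uses the usual expected heterozygote count 2pqN under HWE.

```python
def inbreeding_coefficient(n_aa, n_ab, n_bb):
    """1 - (observed heterozygotes) / (expected heterozygotes under HWE)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    expected_het = 2.0 * p * (1.0 - p) * n   # 2pq * N
    if expected_het == 0:
        return float("nan")                  # monomorphic SNP: undefined
    return 1.0 - n_ab / expected_het


print(inbreeding_coefficient(25, 50, 25))  # genotype counts in exact HWE
print(inbreeding_coefficient(30, 40, 30))  # heterozygote deficit
```

A positive value indicates fewer heterozygotes than expected (an excess of homozygotes), and a negative value indicates an excess of heterozygotes.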

Although the QQ plots show deviation of observed from expected p-values for autosomal SNPs between 0.001 and 0.01, we suggest using a filter threshold of p = 0.0001 because examination of cluster plots reveals good plots for many assays with p-values > 0.0001. This threshold is rather subjective, but we are reluctant to recommend a higher threshold that would eliminate many good SNP assays. We recommend filtering SNPs that have p < 0.0001 from the HWE analysis.

15 Minor allele frequency

Figure 25 shows the distribution of minor allele frequency (MAF) for all study subjects. The percentage of all SNPs with MAF ≤ 0.01 is 44.3% for the autosomes and 37.2% for the X chromosome. The percentage of all SNPs with MAF > 0.01 and ≤ 0.05 is 4.44% for the autosomes and 6.84% for the X chromosome.

16 Duplicate SNP probes

The Illumina HumanCoreExomePlusCustom Marazita 15050181 array has 15,252 SNPs that occur as replicates, as indicated by identical genomic map positions within each set (“positional duplicates”). These occur as 7,593 pairs and 22 triplets (7,593 × 2 + 22 × 3 = 15,252). We checked concordance of genotype calls across study samples for each pair of SNPs with the same map position. A high level of concordance indicates that these SNPs assay the same variant. To determine a suitable cut-off for concordance, we calculated the probability of having > x discordant calls over 11,727 study samples, given assumed error rates. We chose 198 discordances, for which the probability is 4.2e-17 with an error rate of 0.001 and 0.99 with an error rate of 0.01. Pairs with ≤ 198 discordances are considered to assay the same SNP and one member of each pair (or two from triplets) is labeled as “redundant” in “SNP analysis.csv” (the one(s) with higher missing call rate). Pairs with > 198 discordances may be assaying different SNPs and are flagged as discordant by “dup.pos.disc” = TRUE in “SNP analysis.csv”. There were 7,532 redundant SNPs and 213 positions flagging 102 discordant duplicates and 3 triplets.

17 Sample exclusion and filtering summary

As discussed in Section 5, genotyping was attempted for a total of 12,495 samples, of which 12,261 passed CIDR’s QC process (Section 2). The subsequent data cleaning QA process identified 13 samples with unresolved identity issues. Therefore 12,248 scans will be posted on dbGaP.

In general, we recommend filtering out large chromosomal anomalies associated with error-prone genotypes and whole samples with missing call rate > 2%. In this study, there are 33 such anomalies and 59 samples have a missing call rate > 2%. Since this is a family study and the maximum missing call rate (0.0342) is still reasonable, the subjects with missing call rate > 2% were not filtered. We also recommend filters for specific types of analyses, such as PCA, HWE and association testing as indicated in those sections of this report, which are provided in “Sample analysis.csv”. These filters generally include just one scan per subject (unduplicated), and PCA and HWE use a maximal set of unrelated subjects per family.

18 SNP filter summary

Table 1 summarizes SNP failures applied by CIDR prior to data release and a set of additional filters suggested for removing assays of low quality or informativeness. The suggested quality filter and composite filter are provided as logical variables in the “SNP analysis.csv” file, which also has the individual quality metrics so that the user can apply alternative thresholds. The quality filters (rows 2–8) remove 3.26% of the 557,677 SNP assays attempted and the composite filter (rows 2–10), also excluding uninformative redundant SNPs and monomorphic SNPs, removes 18.33% of the SNP assays.

In addition to the composite filter, we suggest applying an allele frequency filter. For illustration, Table 1 provides figures for applying a filter of MAF < 0.01 among study subjects. The quality, informativeness, and MAF filters combined remove 47.35% of the SNP assays attempted. Regardless of what filters are applied to association test results, it is highly recommended to view SNP cluster plots for any SNPs of interest.

19 Characteristics of SNP by array source

The Illumina HumanCoreExomePlusCustom Marazita 15050181 array is a semi-custom array that includes SNPs from the HumanCore array with additions of 15,890 custom content SNPs and SNPs from the exome array. (The custom content SNPs were chosen by the study investigators based on previous GWAS results and/or regions of interest suspected to be involved in the genetics of orofacial clefts.) To compare the SNP characteristics from the two main array sources, we categorized the SNPs as “Exome array SNPs” (SNPs with “exm” prefix in their SNP identifiers) and “HumanCore SNPs” (SNPs without “exm” prefix in their SNP identifiers). Note that the category “HumanCore SNPs” includes the custom content SNPs (which are not distinguished). Table 8 summarizes the characteristics of SNPs from the two array sources. The exome array SNPs represent a large fraction of the SNPs on the HumanCoreExomePlusCustom Marazita 15050181 array (43.86% overall) (Table 8a). They have a higher rate of CIDR technical failures (2.33%) than the HumanCore array SNPs (1.25%). For a description of the criteria for CIDR technical failures, refer to the CIDR document “SNP Summary README.pdf”. On applying the quality filter, the percentage of exome array SNPs lost (3.33%) is comparable to the percentage of HumanCore array SNPs lost (3.21%). However, a greater percentage of SNPs from the exome array than SNPs from the HumanCore array were lost after applying the composite filter (23.71% exome array SNPs compared to 14.13% HumanCore array SNPs). The quality filter and composite filter are described in Section 18. It should be noted that SNPs on the HumanCore array can also map to exonic regions. We found that 1,819 SNPs that were not “exm” SNPs but were positional duplicates of “exm” SNPs passed the composite filter.

Table 8b gives the percentage of SNPs that passed the quality filter (described in Section 18) within the three MAF bins for each SNP array category. These fractions are high for all categories, with the fraction for exome array SNPs only slightly lower than for HumanCore array SNPs. Within each SNP type, the fraction passing is similar across MAF bins. However, this doesn’t necessarily mean that genotype calling accuracy is equally good or better for lower MAF, since it is more difficult to identify poor performance for low MAF SNPs. For example, there is less power to detect HWE deviations and many fewer opportunities to detect Mendelian errors.

The CIDR QC process for low MAF SNPs includes running zCall [24] to identify SNPs where possible heterozygous clusters were missed by GenCall (parameters T=21 and I=0.2). SNPs with 3 or more possible new heterozygotes were manually reviewed and edited or failed as appropriate. The variable “zCalled.flagged” in “SNP annotation.csv” provides the number of new heterozygous calls recommended by the zCall algorithm. Also provided in “SNP annotation.csv” is the variable “RS dbSNP137”, which provides “rs” identifiers from dbSNP137 for the SNPs.

20 Preliminary association tests

Preliminary association tests were performed using the tdt (transmission disequilibrium test) option in PLINK [25]. The tdt option performs basic TDT and a variant of this test that also incorporates parental phenotype information, the parenTDT. The TDT statistic is calculated simply as (b − c)² / (b + c), where b and c are the numbers of transmitted and untransmitted alleles; under the null, it is distributed as a chi-squared with 1 degree of freedom. The parental discordance test is based on counting the number of alleles in affected versus unaffected parents, treating each nuclear family parental pair as a matched pair. These counts can be combined with the T (transmitted minor allele) and U (untransmitted minor allele) counts of the basic TDT to give a combined test statistic, which also is distributed as a chi-squared with 1 degree of freedom under the null. (See PLINK documentation as well as reference [25] for further details.)
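As a worked illustration of the basic TDT statistic, the sketch below computes (b − c)²/(b + c) and the corresponding p-value from the chi-squared distribution with 1 degree of freedom. It is a stand-alone example with hypothetical function names and made-up counts, not PLINK's implementation; the p-value uses the identity P(χ²₁ > x) = erfc(√(x/2)).

```python
from math import erfc, sqrt


def tdt_statistic(b, c):
    """Basic TDT chi-squared statistic from transmitted (b) and
    untransmitted (c) allele counts."""
    if b + c == 0:
        raise ValueError("no informative transmissions")
    return (b - c) ** 2 / (b + c)


def chi2_1df_pvalue(statistic):
    """Upper-tail probability of a chi-squared variable with 1 df."""
    return erfc(sqrt(statistic / 2.0))


stat = tdt_statistic(60, 40)  # illustrative counts, not study data
print(stat, chi2_1df_pvalue(stat))
```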

A random sample of one complete trio (with affected offspring) was chosen from each case family. A subject is said to be affected if he/she has any type of cleft. Subjects with trisomy 21 were excluded. The selection of subjects is indicated in “Sample analysis.csv” by the logical variable “tdt.one” and the case status is given by the variable “tdt.one.case”. This resulted in 1,487 trios for a total of 4,461 subjects. There were 126 discordant parental pairs.

Figure 26 shows the QQ plots for the TDT results and Figure 27 shows the QQ plots for the parenTDT results. Results are given with no SNP filter, with the recommended composite (quality plus informativeness) filter, with the composite filter plus a minor allele frequency filter MAF > 0.01, and with composite filter but MAF ≤ 0.01. The minor allele frequency used here was calculated using only the subjects used in the tdt analysis. The results for the TDT and for the parenTDT are similar, so the discussion below will focus on the TDT results. The value lambda shown on the plots equals the genomic inflation factor. There is considerable inflation: 1.3 after the composite filter and 1.055 after restricting to SNPs with MAF > 0.01. The inflation factor for low minor allele frequency SNPs is very high (2.198). Potential causes and concerns relating to this inflation are discussed below.
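For reference, the genomic inflation factor is commonly computed as the median of the observed 1-degree-of-freedom chi-squared statistics divided by the median of the null distribution (about 0.4549). The report does not state which estimator was used for the plotted lambda, so the median-based definition in this sketch is an assumption, and the function name is hypothetical.

```python
from statistics import median

# Median of the chi-squared distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Median-based genomic inflation factor for 1-df chi-squared tests."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Illustrative statistics only; real input would be one value per SNP.
print(genomic_inflation([0.05, 0.2, 0.4549364, 1.3, 3.9]))
```

A value near 1 indicates that the bulk of the test statistics follow the null distribution; values well above 1, such as those reported for the low-MAF SNPs, indicate systematic inflation.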

The corresponding Manhattan plots are shown in Figure 28 (TDT) and Figure 29 (parenTDT). It is recommended to examine cluster plots for any SNPs of interest. Top hits after applying the composite filter showed good clustering.

Accompanying files are provided for use with the filtered PLINK file when using the tdt option: “assoc.keep.txt” for selection of association subjects and “assoc.altPheno.txt” for the alternate phenotype file. Refer to README assoc.txt for more information.

Inflation Issues

It is clear from the results above that the TDT results are highly inflated for low minor allele frequency SNPs. However, low minor allele frequency is not the only contributor to inflation. Applying a minor allele frequency filter MAF < 0.2 still showed high inflation (lambda = 1.064) and early departure from the x=y line.

A potential cause of inflation is bias caused by genotype-specific missingness that can lead to over-transmission of the common allele ([26], [27]). When Mendel inconsistencies are removed (as is done in PLINK tdt), the TDT has been shown to not maintain correct type I error ([28]). As one way to investigate whether these concerns are operational here, we compared the over-transmission of the major allele with over-transmission of the minor allele. Table 9 shows the results of this investigation. The differences do not appear to be strong enough to explain the inflation results. As another way to investigate, we restricted SNPs to those with no missing calls and with no Mendel inconsistencies (280,334 SNPs). The inflation factor (after minor allele frequency filter MAF < 0.01) is 1.048 (compared with 1.055), i.e. some improvement, but the behaviour of the QQ plots is similar. If we restrict SNPs to just chromosomes 14, 15, and 16 (where the Manhattan plots don’t show much activity), the inflation factor, after a minor allele frequency filter MAF < 0.01, is 1.021 and the QQ plot is well-behaved (Figure 30). The issues discussed here could contribute to the inflation but do not completely explain it; inflation decreases but is still high, and one would expect these issues to affect all chromosomes, including chromosomes 14, 15, and 16.

Other potential contributing factors are unequal allele frequencies, polygenic inheritance, and long-range linkage disequilibrium (LD) due to admixture. When allele frequencies are more divergent (as opposed to more equal), there are more trios in which both parents are homozygous. When errors are introduced into trios in which both parents are homozygous and the resultant trio is consistent, the observed (and incorrect) trio of genotypes is counted in the estimation of the parameter t (test statistic), introducing a bias in t away from the true value of 0.5 [28]. This could be more of a problem in admixed populations.

As mentioned in Section 2, orofacial clefts exhibit significant genetic heterogeneity, i.e., multiple genetic regions have been implicated [9, 10]. For case-control studies (but not specifically TDT), it has been shown that significant inflation of test statistics is expected under polygenic inheritance even when there is no population structure [29].

Admixture between genetically differentiated populations may lead to high levels of LD even at moderately far-apart loci. It is possible that this may contribute to higher overall test statistics. Even if this is not so, it is worthwhile to take this into consideration when interpreting results. Significant TDT results in admixed populations should not be taken as evidence of tight linkage between the markers and the underlying disease locus [30].

Appendix

A Project participants

The University of Pittsburgh
School of Dental Medicine, Department of Oral Biology:
Mary Marazita, Toby McHenry, Beth Emanuele, Elizabeth Leslie, Manika Govil, Carla Sanchez, Seth Weinberg, Jennifer Jacobs, Nandita Mukhopadhyay, and Alexandre Vieira
Graduate School of Public Health, Department of Human Genetics:
Eleanor Feingold and John Shaffer

University of Iowa
Dows Research Institute, Department of Oral Pathology, Radiology and Medicine:
Azeez Butali and Lina Moreno
Department of Pediatrics-Neonatology:
Jeff Murray and Lisa Harney
College of Public Health, Department of Health Management and Policy:
George Wehby

University of Puerto Rico
School of Dental Medicine, Office of the Assistant Dean of Research:
Carmen J. Buxo

Center for Inherited Disease Research, Johns Hopkins University
Kim Doheny, Jane Romm, Michelle Mawhinney, Marie Hurley, Hua Ling, and Elizabeth Pugh

Genetics Coordinating Center, Department of Biostatistics, University of Washington
Cecelia Laurie, Deepti Jain, Cathy Laurie, and Bruce Weir

dbGaP-NCBI, National Institutes of Health
Nataliya Sharopova

NIDCR, National Institutes of Health
Emily Harris

References

[1] F. Rahimov, A. Jugessur, and J. C. Murray. Genetics of nonsyndromic orofacial clefts. Cleft Palate Craniofac. J., 49(1):73–91, Jan 2012.

[2] N. W. Berk and M. L. Marazita. Costs of Cleft lip and Palate: Personal and Societal Implications. In D. F. Wyszynski, editor, Cleft Lip and Palate: From Origin to Treatment. Oxford University Press, Inc., New York, 2002.

[3] G. L. Wehby, D. A. Pedersen, J. C. Murray, and K. Christensen. The effects of oral clefts on hospital use throughout the lifespan. BMC Health Serv Res, 12:58, 2012.

[4] S. L. Boulet, S. D. Grosse, M. A. Honein, and A. Correa-Villasenor. Children with orofacial clefts: health-care use and costs among a privately insured population. Public Health Rep, 124(3):447–453, 2009.

[5] K. Christensen, K. Juel, A. M. Herskind, and J. C. Murray. Long term follow up study of survival associated with cleft lip and palate at birth. BMJ, 328(7453):1405, Jun 2004.

[6] M. C. Jones. Etiology of facial clefts: prospective evaluation of 428 patients. Cleft Palate J, 25(1):16–20, Jan 1988.

[7] M. J. Dixon, M. L. Marazita, T. H. Beaty, and J. C. Murray. Cleft lip and palate: understanding genetic and environmental influences. Nat. Rev. Genet., 12(3):167–178, Mar 2011.

[8] M. L. Marazita. The evolution of human genetic studies of cleft lip and cleft palate. Annu Rev Genomics Hum Genet, 13:263–283, 2012.

[9] T. H. Beaty, J. C. Murray, M. L. Marazita, R. G. Munger, I. Ruczinski, J. B. Hetmanski, K. Y. Liang, T. Wu, T. Murray, M. D. Fallin, R. A. Redett, G. Raymond, H. Schwender, S. C. Jin, M. E. Cooper, M. Dunnwald, M. A. Mansilla, E. Leslie, S. Bullard, A. C. Lidral, L. M. Moreno, R. Menezes, A. R. Vieira, A. Petrin, A. J. Wilcox, R. T. Lie, E. W. Jabs, Y. H. Wu-Chou, P. K. Chen, H. Wang, X. Ye, S. Huang, V. Yeow, S. S. Chong, S. H. Jee, B. Shi, K. Christensen, M. Melbye, K. F. Doheny, E. W. Pugh, H. Ling, E. E. Castilla, A. E. Czeizel, L. Ma, L. L. Field, L. Brody, F. Pangilinan, J. L. Mills, A. M. Molloy, P. N. Kirke, J. M. Scott, J. M. Scott, M. Arcos-Burgos, and A. F. Scott. A genome-wide association study of cleft lip with and without cleft palate identifies risk variants near MAFB and ABCA4. Nat. Genet., 42(6):525–529, Jun 2010.

[10] K. U. Ludwig, E. Mangold, S. Herms, S. Nowak, H. Reutter, A. Paul, J. Becker, R. Herberz, T. AlChawa, E. Nasser, A. C. Bohmer, M. Mattheisen, M. A. Alblas, S. Barth, N. Kluck, C. Lauster, B. Braumann, R. H. Reich, A. Hemprich, S. Potzsch, B. Blaumeiser, N. Daratsianos, T. Kreusch, J. C. Murray, M. L. Marazita, I. Ruczinski, A. F. Scott, T. H. Beaty, F. J. Kramer, T. F. Wienker, R. P. Steegers-Theunissen, M. Rubini, P. A. Mossey, P. Hoffmann, C. Lange, S. Cichon, P. Propping, M. Knapp, and M. M. Nothen. Genome-wide meta-analyses of nonsyndromic cleft lip with or without cleft palate identify six new risk loci. Nat. Genet., 44(9):968–971, Sep 2012.

[11] S. M. Gogarten, T. Bhangale, M. P. Conomos, C. A. Laurie, C. P. McHugh, I. Painter, X. Zheng, D. R. Crosslin, D. Levine, T. Lumley, S. C. Nelson, K. Rice, J. Shen, R. Swarnkar, B. S. Weir, and C. C. Laurie. GWASTools: an R/Bioconductor package for quality control and analysis of genome-wide association studies. Bioinformatics, 28(24):3329–3331, Dec 2012.

[12] X. Zheng, D. Levine, J. Shen, S. M. Gogarten, C. Laurie, and B. S. Weir. A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics, 28(24):3326–3328, Dec 2012.


[13] C. C. Laurie et al. Quality control and quality assurance in genotypic data for genome-wide association studies. Genetic Epidemiology, 34:591–602, 2010.

[14] D. A. Peiffer et al. High-resolution genomic profiling of chromosomal aberrations using Infinium whole-genome genotyping. Genome Research, 16:1136–1148, 2006.

[15] L. K. Conlin et al. Mechanisms of mosaicism, chimerism and uniparental disomy identified by single nucleotide polymorphism array analysis. Human Molecular Genetics, 19:1263–1275, 2009.

[16] E. S. Venkatraman and A. B. Olshen. A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics, 23:657–663, 2007.

[17] Cathy C. Laurie, Cecelia A. Laurie, et al. Detectable clonal mosaicism from birth to old age and its relationship to cancer. Nature Genetics, 44:642–650, 2012.

[18] Ani Manichaikul, Josyf C. Mychaleckyj, Stephen S. Rich, Kathy Daly, Michele Sale, and Wei-Min Chen. Robust relationship inference in genome-wide association studies. Bioinformatics, 26(22):2867–2873, 2010.

[19] T. Thornton, H. Tang, T. J. Hoffmann, H. M. Ochs-Balcom, B. J. Caan, and N. Risch. Estimating kinship in admixed populations. Am. J. Hum. Genet., 91(1):122–138, Jul 2012.

[20] M. P. Conomos, B. S. Weir, and T. A. Thornton. PCA-Based Relatedness Estimation for Admixed Populations with Unspecified Structure. In preparation, 2014.

[21] N. Patterson, A. L. Price, and D. Reich. Population structure and eigenanalysis. PLoS Genetics, 2:e190, 2006.

[22] J. Novembre et al. Genes mirror geography within Europe. Nature, 456:98–101, 2008.

[23] X. Zhu, S. Li, R. S. Cooper, and R. C. Elston. A unified association analysis approach for family and unrelated samples correcting for stratification. American Journal of Human Genetics, 82(2):352–365, Feb 2008.

[24] J. I. Goldstein, A. Crenshaw, J. Carey, G. B. Grant, J. Maguire, M. Fromer, C. O’Dushlaine, J. L. Moran, K. Chambert, C. Stevens, P. Sklar, C. M. Hultman, S. Purcell, S. A. McCarroll, P. F. Sullivan, M. J. Daly, and B. M. Neale. zCall: a rare variant caller for array-based genotyping: genetics and population analysis. Bioinformatics, 28(19):2543–2545, Oct 2012.

[25] S. Purcell et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics, 81:559–575, 2007.

[26] J. N. Hirschhorn and M. J. Daly. Genome-wide association studies for common diseases and complex traits. Nat. Rev. Genet., 6(2):95–108, Feb 2005.

[27] Z. Yu. Family-based association tests using genotype data with uncertainty. Biostatistics, 13(2):228–240, Apr 2012.

[28] D. Gordon, S. C. Heath, X. Liu, and J. Ott. A transmission/disequilibrium test that allows for genotyping errors in the analysis of single-nucleotide polymorphism data. Am. J. Hum. Genet., 69(2):371–380, Aug 2001.

[29] J. Yang, M. N. Weedon, S. Purcell, G. Lettre, K. Estrada, C. J. Willer, A. V. Smith, E. Ingelsson, J. R. O’Connell, M. Mangino, R. Magi, P. A. Madden, A. C. Heath, D. R. Nyholt, N. G. Martin, G. W. Montgomery, T. M. Frayling, J. N. Hirschhorn, M. I. McCarthy, M. E. Goddard, and P. M. Visscher. Genomic inflation factors under polygenic inheritance. Eur. J. Hum. Genet., 19(7):807–812, Jul 2011.

[30] J. C. Whittaker, M. C. Denham, and A. P. Morris. The problems of using the transmission/disequilibrium test to infer tight linkage. Am. J. Hum. Genet., 67(2):523–526, Aug 2000.


Table 1: Summary of recommended SNP filters. The number of SNPs lost is given for sequential application of the filters in the order given. For a description of the criteria for CIDR technical failures, refer to the CIDR document “SNP Summary README.pdf”. Rows 2 - 10 comprise the composite.filter with rows 2 - 8 being quality metrics and rows 9 and 10 being informativeness metrics. The sex difference metrics in lines 7 and 8 were computed on the homogeneous genetic ancestry sample set identified by “pca homog Caucasian” in “Sample analysis.csv” (see section 14).

Row  Filter                                                                SNPs.lost  SNPs.kept
1    none - all SNP probes                                                              557,677
2    CIDR technical filters                                                    9,625    548,052
3    Missing call rate >=2%                                                    5,891    542,161
4    >2 discordant calls in 264 study duplicates                                  42    542,119
5    >20 Mendelian errors in 5,288 trios or single-parent/offspring pairs      2,410    539,709
6    HWE p-value < 10^(-4) in pca homog Caucasian subjects                       194    539,515
7    Sex difference in allele freq >=0.2 for autosomes/XY                         42    539,473
8    Sex difference in heterozygosity >0.3 for autosomes/XY                        0    539,473
9    positional duplicates                                                     7,278    532,195
10   MAF = 0                                                                  76,746    455,449
11   MAF < 0.01                                                              161,816    293,633
12   Percent of SNPs lost due to quality filters (rows 2-8)                    3.26%
13   Percent of SNPs lost due to composite filter (rows 2-10)                 18.33%
14   Percent of SNPs lost due to composite filter and MAF (rows 2-11)         47.35%
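The sequential bookkeeping behind Table 1 can be checked with a short script. The following is only a sketch: the counts are copied from the table, while the names `TOTAL_PROBES`, `FILTERS`, `sequential_kept` and `percent_lost` are illustrative and do not come from the report or from any QC package.

```python
# Counts copied from Table 1; all helper names are illustrative assumptions.
TOTAL_PROBES = 557_677

# (filter description, SNPs lost when applied after all previous filters), rows 2-11
FILTERS = [
    ("CIDR technical filters", 9_625),
    ("Missing call rate >= 2%", 5_891),
    (">2 discordant calls in 264 study duplicates", 42),
    (">20 Mendelian errors", 2_410),
    ("HWE p-value < 1e-4", 194),
    ("Sex difference in allele freq >= 0.2", 42),
    ("Sex difference in heterozygosity > 0.3", 0),
    ("Positional duplicates", 7_278),
    ("MAF = 0", 76_746),
    ("MAF < 0.01", 161_816),
]


def sequential_kept(total, filters):
    """Return (description, lost, kept) after each filter is applied in order."""
    rows = []
    remaining = total
    for description, lost in filters:
        remaining -= lost
        rows.append((description, lost, remaining))
    return rows


def percent_lost(total, filters, last_row):
    """Percent of all probes lost by table rows 2..last_row (row 2 is filters[0])."""
    lost = sum(n for _, n in filters[: last_row - 1])
    return 100.0 * lost / total


if __name__ == "__main__":
    for description, lost, kept in sequential_kept(TOTAL_PROBES, FILTERS):
        print(f"{description:<45} {lost:>8,} {kept:>8,}")
    for last_row in (8, 10, 11):
        pct = percent_lost(TOTAL_PROBES, FILTERS, last_row)
        print(f"rows 2-{last_row}: {pct:.2f}% lost")
```

Because each count is the number lost after the earlier filters have already been applied, the percentages in rows 12 to 14 are cumulative sums of the losses divided by the total number of probes.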

Table 2: Summary of DNA samples and genotyping instances (scans).

                                         Study  HapMap    Both
DNA samples into genotyping production  12,238     257  12,495
Failed samples                            -234       0    -234
Scans released by genotyping center     12,004     257  12,261
Scans failing post-release QC                0       0       0
Scans with unresolved identity issues      -13       0     -13
Scans to post on dbGaP                  11,991     257  12,248

Table 3: Summary of numbers of scans, subjects and subject characteristics.

                         Study  HapMap    Both
Scans to post on dbGaP  11,991     257  12,248
Subjects                11,727     128  11,855
Replicated subjects        264     123     387
Families (N > 1)         2,787      26   2,813
Singletons               1,170      51   1,221


Table 4: HapMap genotyping control samples: abbreviation, source, number of samples, number of subjects, and number of trios, doubletons, and singletons among the subjects.

(a) HapMap samples: abbreviation and source

abbrev   source
CEU      Utah residents with Northern and Western European Ancestry
CHB      Han Chinese in Beijing, China
CHINESE  Human Variation Panel - Chinese (Version2)
CHS      Southern Han Chinese, China
GIH      Gujarati Indian in Houston, TX
JPT      Japanese in Tokyo, Japan
LWK      Luhya in Webuye, Kenya
MAYAN    Campeche State of the Yucatan, Amerindian Population
MXL      Mexican Ancestry in Los Angeles, California
PUR      Puerto Rican in Puerto Rico
SEAsian  Human Variation Panel - Southeast Asian
TSI      Toscani in Italia
YRI      Yoruban in Ibadan, Nigeria

(b) HapMap samples: Number of samples, subjects, trios, doubletons, and singletons

abbrev   samples  subjects  trios  doubletons  singletons
CEU           46        23      7           0           2
CHB           12         6      0           0           6
CHINESE        2         1      0           0           1
CHS           12         6      2           0           0
GIH           16         8      0           0           8
JPT           12         6      0           0           6
LWK           22        12      0           0          12
MAYAN          9         3      0           0           3
MXL           44        22      7           0           1
PUR            9         3      1           0           0
SEAsian        2         1      0           0           1
TSI           18        10      0           0          10
YRI           53        27      8           1           1

Table 5: Expected identity-by-descent coefficients for some common relationships.

k2    k1    k0    KC      Relationship
1.00  0.00  0.00  0.5     MZ twin or duplicate
0.00  1.00  0.00  0.25    parent-offspring
0.25  0.50  0.25  0.25    full siblings
0.00  0.50  0.50  0.125   half siblings/avuncular/grandparent-grandchild
0.00  0.25  0.75  0.0625  first cousins
0.00  0.00  1.00  0.0     unrelated
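The KC column in Table 5 is consistent with the standard relation between the kinship coefficient and the IBD sharing probabilities, KC = k2/2 + k1/4. The sketch below reproduces the tabulated values under that assumption; the function name `kinship_coefficient` is illustrative and is not taken from the report.

```python
def kinship_coefficient(k2, k1, k0):
    """Kinship coefficient from IBD sharing probabilities (k2, k1, k0).

    Assumes the usual relation KC = k2/2 + k1/4; k0 contributes nothing
    but is checked so that the three probabilities sum to 1.
    """
    if abs(k0 + k1 + k2 - 1.0) > 1e-9:
        raise ValueError("k0 + k1 + k2 must equal 1")
    return 0.5 * k2 + 0.25 * k1


if __name__ == "__main__":
    # (k2, k1, k0, relationship) rows copied from Table 5
    table5 = [
        (1.00, 0.00, 0.00, "MZ twin or duplicate"),
        (0.00, 1.00, 0.00, "parent-offspring"),
        (0.25, 0.50, 0.25, "full siblings"),
        (0.00, 0.50, 0.50, "half siblings/avuncular/grandparent-grandchild"),
        (0.00, 0.25, 0.75, "first cousins"),
        (0.00, 0.00, 1.00, "unrelated"),
    ]
    for k2, k1, k0, relationship in table5:
        print(relationship, kinship_coefficient(k2, k1, k0))
```

Parent-offspring and full-sibling pairs share the same KC of 0.25, so the individual coefficients (for example k0) are needed to tell them apart.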


Table 6: Summary of SNP genotyping failures and missingness by chromosome type. A=autosomes, M=mitochondrial, U=unknown position, X=X chromosome, XY=pseudoautosomal, Y=Y chromosome. The row “SNP technical failures” gives the fraction of SNPs that failed QC at the genotyping center. The row “missing> 0.05” gives the fraction of SNPs that passed QC at the genotyping center and that have a missing call rate (missing.n1) > 0.05.

                   A         M         U         X         XY        Y
number of probes   541,824   418       98        12,954    339       2,044
SNP tech failures  0.014948  0.033493  0.377551  0.055967  0.038348  0.360568
missing>0.05       0.001602  0.000000  0.000000  0.000082  0.000000  0.000000

Table 7: Probability of observing more than the given number of discordant calls in 264 pairs of duplicate samples, given an assumed error rate. The number of SNPs with a given number of discordant calls is shown in the final column. The recommended threshold for SNP filtering is > 2 discordant calls.

# discordant calls  error=1e-05  error=1e-04  error=0.001  error=0.01  # SNPs
> 0                 0.0053       0.0514       0.4103       0.9950      4,025
> 1                 0.0000       0.0013       0.0985       0.9681      955
> 2                 0.0000       0.0000       0.0164       0.8965      374
> 3                 0.0000       0.0000       0.0021       0.7700      165
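The probabilities in Table 7 can be approximated with a binomial model over the 264 duplicate pairs. The sketch below is an assumption, not a formula given in this section: it supposes that a genotype call is wrong with total probability `error_rate`, split evenly between the two possible wrong genotypes, so that a pair of calls is discordant with probability 2[(1 - a - b)(a + b) + ab] where a = b = error_rate/2. The function names are illustrative.

```python
from math import comb


def pair_discordance_probability(error_rate):
    """Probability that two calls of the same genotype disagree.

    Assumed model: each call is wrong with total probability error_rate,
    split evenly between the two possible wrong genotypes (a = b).
    """
    a = b = error_rate / 2.0
    return 2.0 * ((1.0 - a - b) * (a + b) + a * b)


def prob_more_than(x, n_pairs, error_rate):
    """P(more than x discordant calls among n_pairs duplicate pairs)."""
    p = pair_discordance_probability(error_rate)
    cdf = sum(
        comb(n_pairs, k) * p**k * (1.0 - p) ** (n_pairs - k)
        for k in range(x + 1)
    )
    return 1.0 - cdf


if __name__ == "__main__":
    for x in range(4):
        row = [prob_more_than(x, 264, e) for e in (1e-5, 1e-4, 1e-3, 1e-2)]
        print(f"> {x}", " ".join(f"{p:.4f}" for p in row))
```

Under this model, more than 2 discordant calls is unlikely for a SNP unless its error rate approaches 0.01, which is in line with the recommended threshold quoted in the caption.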

Table 8: SNP characteristics by array source.
The Illumina HumanCoreExomePlusCustom Marazita 15050181 array is a semi-custom array that includes SNPs from the HumanCore array with additions of 15,890 custom content SNPs and SNPs from the exome array. To compare the SNP characteristics from the two main array sources we categorized the SNPs as “Exome array SNPs” (SNPs with “exm” prefix in their SNP identifiers) and “HumanCore SNPs” (SNPs without “exm” prefix in their SNP identifiers). Note that the category “HumanCore SNPs” includes the custom content SNPs (which are not distinguished). For a description of the criteria for CIDR technical failures, refer to the CIDR document “SNP Summary README.pdf”. The quality and composite filters are described in Section 18.

(a) SNP characteristics by array source type

                                                     Exome array SNPs  HumanCore array SNPs
Number of SNP probes                                 244,594           313,083
Percent over array source type                       43.86%            56.14%
Percent Failed by CIDR                               2.33%             1.25%
Percent lost by quality filter                       3.33%             3.21%
Percent lost by composite filter                     23.71%            14.13%
Number of SNPs kept after applying quality filter    236,452           303,021
Number of SNPs kept after applying composite filter  186,597           268,852

(b) Percentage of SNPs passing quality filter by minor allele frequency bin

MAF bin      Exome array SNPs  HumanCore array SNPs
[0,0.01]     99.24%            99.43%
(0.01,0.05]  98.62%            98.91%
(0.05,0.5]   97.26%            97.73%


Table 9: Comparison of over-transmission of the major allele with over-transmission of the minor allele in the TDT for autosomal SNPs passing the composite filter, binned by minor allele frequency (“maf.bin”). Variables beginning with “minor” give the number and percent of SNPs with over-transmission of the minor allele (“minor.cnt” and “minor.pct”) and the median chi-squared statistic over these SNPs (“minor.CHISQ”). Similarly, variables beginning with “major” refer to SNPs with over-transmission of the major allele.

maf.bin      minor.CHISQ  major.CHISQ  minor.cnt  major.cnt  minor.pct  major.pct
[0,0.01]     1.00         1.00         52,791     54,931     42.2%      43.9%
(0.01,0.05]  0.51         0.49         11,151     10,667     49.3%      47.1%
(0.05,0.5]   0.50         0.50         128,681    130,018    49.1%      49.6%
all          0.67         0.67         192,623    195,616    47.0%      47.7%
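For readers unfamiliar with the statistic summarized in Table 9, the classical TDT chi-squared for one SNP is (b - c)^2 / (b + c), where b and c count transmissions of each allele from heterozygous parents. The sketch below uses that textbook form; whether the report's analysis used exactly this variant is an assumption, and the function names are illustrative.

```python
def tdt_chisq(minor_transmitted, major_transmitted):
    """Classical TDT statistic from transmission counts of heterozygous parents.

    Returns None when there are no informative transmissions.
    """
    total = minor_transmitted + major_transmitted
    if total == 0:
        return None
    return (minor_transmitted - major_transmitted) ** 2 / total


def over_transmitted_allele(minor_transmitted, major_transmitted):
    """Label a SNP by which allele is transmitted more often."""
    if minor_transmitted > major_transmitted:
        return "minor"
    if major_transmitted > minor_transmitted:
        return "major"
    return "equal"


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study data
    print(tdt_chisq(30, 20), over_transmitted_allele(30, 20))
    print(tdt_chisq(1, 0), over_transmitted_allele(1, 0))
```

With a single informative transmission the statistic equals 1, which may be why the median in the rarest bin is 1.00; this reading is an interpretation and is not stated in the table. SNPs with equal transmission counts fall into neither group, so the two percentage columns need not sum to 100%.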


Figure 1: The X and Y chromosome intensities are calculated for each sample as the mean of the sum of the normalized intensities of the two alleles for each probe on those chromosomes. Sample sizes are given in the axis labels. X heterozygosity is the fraction of heterozygous calls out of all non-missing genotype calls on the X chromosome for each sample. There are 5 samples with unspecified annotated gender (black), 6 samples annotated as male but genotyping as female (blue points in red cluster) and 11 samples annotated as female but genotyping as male (red points in blue cluster).
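The two per-sample quantities defined in the Figure 1 caption can be sketched as follows. The input layout is an assumption made for illustration: normalized allele intensities are given as two equal-length lists per chromosome, and genotypes are coded as the number of copies of one allele (0, 1 or 2) with `None` for a missing call. The function names are hypothetical.

```python
def mean_intensity(allele_a, allele_b):
    """Mean over probes of the summed normalized intensities of the two alleles."""
    if len(allele_a) != len(allele_b) or not allele_a:
        raise ValueError("need two non-empty lists of equal length")
    sums = [a + b for a, b in zip(allele_a, allele_b)]
    return sum(sums) / len(sums)


def heterozygosity(genotypes):
    """Fraction of heterozygous calls among non-missing genotype calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    return sum(1 for g in called if g == 1) / len(called)


if __name__ == "__main__":
    # Hypothetical values for one sample, three X-chromosome probes
    x_intensity = mean_intensity([0.5, 0.6, 0.1], [0.5, 0.2, 0.9])
    x_het = heterozygosity([0, 1, 2, None, 1])
    print(x_intensity, x_het)
```

A sample would typically be judged genetically male when its X intensity is low, its Y intensity is high and its X heterozygosity is near zero; the thresholds for such a call are not given in this section.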

[Figure 1 plot not recoverable from text extraction. Scatter plot of Y intensity (2044 probes) versus X intensity (12954 probes), one point per sample; legend: M, F, NA.]

● ●

●●

●●

● ●

● ●

●●● ●●

●●

●●●●

●●●●●

● ●

●●●

●●●

●●

●●●

●●

●●

●●

● ●●

●● ●

●●●

●●

● ●

● ●●

●●

● ●

●●●●

●●

●●●●● ●

●●

● ●●●

●●

●● ●

●●

●●

●●

●●

● ●●

● ●●

●●

●●●●●●

● ●●

●● ●

●●

●●

●●

●● ●●●●

●●

●● ●

● ●

●●

●●

●●

●●

●●

●●

●●●●

●●

●●●● ●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

● ●●●

●●

●●●

●●●

●●● ●

●●● ●

●●

●●●●●

●●●

●●

●●

●●

●●●

● ●

●●

●●

●●●

●●

●●

●●

●●

● ●

●●

●●

●●

●●●

●●

●●●

●●

●●

●●

●●●●●●

●●

●●

●●●

●●

●●

●●

●●

●●● ●

●●●

●●

●●●●

●●

●●

●●

●●●

●●

●●● ●● ●

●●

●●

●●●●

●●

●●●

●●

● ●●●●

●●

●●

●●

● ●●

● ●●●

●●

●●●●

●●

●● ●●● ●

●●●

●●

●●

●●

●●

● ●●

●●

●●●

●●

●●●●

●●

● ●●

●●

● ●

●●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●●

●●●

●●

●●

●●

●●●

●●

● ●●

●●

●●

●●

● ●●

●●

●●

●●●●

●●●

●●●●● ●

●●

●●

●● ●

● ●

●●●

● ●

●●

●●

●●

●●●

●●

●●

●●●

●●

●●●

●●● ●

●● ●

● ●

●●

●●●●● ●

●●●

●●

● ●

●●

●●

●●●

●●●● ●

●●●

●●●

●●

●●

●●

●●

●●

●●●

● ●

●●

●●●

●● ●

●●

● ●●

●●

●●●

●●

●●

●●

●● ●● ●●●

●●●

●●

●●

● ●

●●●

●●

●●

● ●●● ●●●

●●

●●

●●

● ●

●●

●●

●●● ●●●● ●

●●

●●

● ●●

● ●

● ●

●●●

●●

●●●● ●●●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●●

●●

●●

● ●

● ●●

●●

●●

●●● ●

● ● ●●

● ●

●●●

●●

●●●

●●●

●●

●●

●●

●●

●●

● ●

●● ●●

●● ●●

●●

● ● ●●

●●●

●●

●●●

●●●

●●

●●

●●●

●●

●● ●

●●

●●● ●

● ●●●● ●

●●●

●●

●●

●●● ●

● ●●●

●●●

●● ●

●●●

●●●

●●●●

●●●●●●●

●●

●●●●●

● ●●●● ●●

●●●

● ●

● ●

●●

●●

●● ●

●●

●●●

●●

●●●

●●

● ●

●●

●●●

●●●

●●●

●●

●●

●● ●●●

●●

● ●

●●●

●●

●●●● ● ●●

●●

●● ●●● ●●●

●●●

●●

●●

●●

●●●

●●

●●●

● ●●

●●● ●

●●●●

● ●

●●

●● ●

●●●

●●

●●

●●

●●●●●

●●

●●●

● ●●

●●

●●●●

●●●●●

●●

●●

●●

●●

●●●●●

●●

●●

●●

● ●

●●

● ●

●●

●●

●●

●●

●●●●

●●

●●

●●

●●

●●●

●●●

● ●●●● ●

● ●

●●●●

● ●●

●●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●●●

●●●●

●●●

●●

●●

●●

●●

● ●●

●●

●●

●●

●●

●●

●●

●● ●●

●●●●

●●

● ●

●●●

●●

●●

●●

●●

●●●

●●

● ●●● ●●

● ●●

●●●

●●

●●

●●

●●

●●●● ●●

●●

●●

●●

●●● ●

●●

●●●

●●

●●

● ●●●●●

●●●●

●●

●●●

●●●

●●●

●●

●● ●●●

●●

●●●

●●

● ●●

●●

●●● ●

●●

●●

●●

●●

●●

●●

●●

●●●●

●●

●●

●●

●●

●●●

●●●

● ●

●●

●●

●●

●●

●●●

● ●● ●

●●

●●

●●●

●●

●●●

● ●●● ●

●● ●

●●

● ●

●●●●

●●

●●●

●●

●●● ●

● ●

●●

●●●

●● ●●

●●●

● ●●●

● ●● ●●●

● ●

●●

●● ●

●●●

● ●

●●

●●●●●

● ●

●● ●●●

●●

●●

●●

● ●●

●●

●●●

●●

●●

●●

●●

●●

●●●

●●●●●

●●

●●

●●

●●

●●

●●

●●●● ●●

●●

●● ●

●●

●●

● ●●

●●

●●

●●●

●●

●●

●●

●●

●●●●

●●

●●●●

●●●

● ●●

●●●

●●

●●

●●● ●

●● ●

● ●●●●

●●

●●

●●

●● ●

●●

●●

● ●

●●

●●

● ●

●●

●●

●●

●●●

● ●

●●●●

●●●

●●● ●●

●●

●●

●●

●●

●●

●●●●

●●

●●●●●

●●●

●●● ●

●●

●●

●●

●●

● ●

●●

●●

●● ●●

●●

●●

● ●●

●●

●●

●●

●●●●

●●

●● ●● ●

●●

●●

●●

●●

●●●

●●

●●

●●●●

● ●

●●

●●●

●●●

● ●

● ●

●●

●● ●

●●●

●●

●●

●●

●● ●

●●

●● ●

●●

●●

●●

●●

●●

● ●●

●●

●●

●●●

●● ●●

●●●●

●●

●●

●●

●●●

●●●

●●

●●●

●●●●●

●●●●

●●

●●

● ●

●●●

●●

●●

● ●

●●

●●●

●●●●

●●

● ●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●●

●●

●● ●●●

●●

●●

●●

● ●

●●●

●● ●●●

●●

●●●●

●●

●●

●●

●●

●●

●●

●●

●●●●●●

●●

● ●

● ●●

●●●

●●●

●●

●●

● ●

●●

●●

●●●

●●●

●●●●

●●

●●●●

●●●

●●

●●●

●●

●●

●●

● ●●●●●

●●●● ●

●●●

●●

●●●

●●

● ●

● ●

●●●

●●

●●

●●

● ●●●

● ●

●●

●●

●●

●●●

●●●

● ●●

●●● ● ●●

●●

●●●●

●●

●●

●●

●●

●●

●●

●●

● ●

●●

● ●●

●●

●●●

●●●

●●

●●

●●

●●

●●

●●●●●●● ●●●

●●

●●●●●

●●

●●

● ●

●●●

●●

● ●

●●

●●

●●

●●

●●

●●

●●

●●●●

●●●

●●

●●

●● ●

●●

●●

●● ●

●●

●●●

●●●

●●

●●●

●●●

●●

●●

●●

●●

●●

●●

●●●●

●●●

●●●

●●

●●

●●●

●●●●●

●●

●●

●●

●●

●●

●●●●●●

●●

●● ●●

● ●● ●●●

●●

●●

●●●

●●

●●● ●●●

●●

●●

● ●●

● ●

●●●

●●● ●

●●●

●● ●

●●

●●

●●

●●●●

●●●

● ●

●●

● ●●

●●

●●

●●

●●●

●●

●●

● ●●●● ●●

●●

●●

●●

● ●

●●

●●●●●

● ●

●●

●●●

●●

●●●●●●

●● ●●

●●

●●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

● ●

●●

●●

●●

●●●

●●

●●●

●●

●●

●●● ●●●

●● ●

●●●●

●●

●●

● ●● ●

●●

● ●●●

●●

●●

●●

●● ●

●●

●●

●●

●●

●●

●●●

●●

●●●

●●

●●

● ●

●●

●●

●●

●●●

●●

●●

●●

●●●

●●

●●●

● ●●●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●●●

●●

●●

●●●

● ●

● ●

● ●●

●●●● ●●

●●

●● ●

●●

● ●●

●●●●

●●

●●

●●●

●●●

●●

●●●

●●

●●

●●●

●●●

● ●●

●●

●●

●●● ●

●●

●●

●●

●●

●●●

●●

●●

●●●●

●●

●●

●●

●●

●●

●●

●●

●●

●● ●●●

●●

●●●

●●●●●

●● ●●

●●

●●

●●●

●●

●●

●●●● ●●●

●●●

●●

●●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●●

● ●

●●

●●

●●

●●

●●

●●

●● ●●●

● ●●●●● ●

●●

●●●●

●●

●●

●●

●●●

●●

●●●●●●●

●●

●●

●●

● ●

●●

●●

●●●●

●●● ●●●●

●●●

●●●●●

●●

●●●

●●

●●●

●●

●●● ●

●●

●●

● ●●

●●

● ●

●●

●●

●●●

●●

●●

●●

● ●

●●

●●

●●

● ●

●●

● ●●●

● ●●

●●

●●●●●●

●●●

● ●●

●●

● ●●

●●

●●

● ●

0.7 0.8 0.9 1.0 1.1 1.2

0.00

0.05

0.10

0.15

0.20

0.25

X intensity (12954 probes)

X h

eter

ozyg

osity

[Figure panel: X heterozygosity (0.00 to 0.25) plotted against Y intensity (2044 probes; 0.4 to 1.8). Scatter plot not reproduced.]

[Figure panel: X heterozygosity (0.00 to 0.25) plotted against autosomal heterozygosity (0.12 to 0.22). Scatter plot not reproduced.]


Figure 2: The X and Y intensities are calculated for each sample as the mean of the sum of the normalized intensities of the two alleles for each probe on those chromosomes. Groups of samples with possible sex chromosome anomalies are labeled by the corresponding possible karyotype.
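The intensity summary described in the caption can be sketched in a few lines of Python. This is only an illustration of the stated definition (mean over probes of the sum of the two allele intensities), not the code used to produce the report. The function name, the argument names, the toy values, and the choice to skip probes with a missing intensity are all assumptions made for the example.

```python
from statistics import mean


def mean_chromosome_intensity(allele_a, allele_b):
    """Mean over probes of (allele A intensity + allele B intensity).

    allele_a, allele_b: hypothetical lists of normalized intensities for one
    sample, one value per probe on a single chromosome (for example X or Y).
    A probe is skipped when either value is None (an assumption; the report
    does not say how missing intensities are handled).
    """
    sums = [
        a + b
        for a, b in zip(allele_a, allele_b)
        if a is not None and b is not None
    ]
    if not sums:
        raise ValueError("no probes with both allele intensities present")
    return mean(sums)


# Toy values for three X-chromosome probes in one sample (not real data).
x_chrom_a = [0.5, 0.25, 0.75]
x_chrom_b = [0.5, 0.5, 0.5]
x_intensity = mean_chromosome_intensity(x_chrom_a, x_chrom_b)
print(x_intensity)
```

Applying the same function to the Y-chromosome probes of each sample would give the second axis value, so each sample contributes one (X intensity, Y intensity) point to a plot of this kind.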


● ●●

●●●●

●●

●●●●

●●●

●●

●●

●● ●

●●

●●

● ●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

● ●●

●●

●●

●●

●●●●

●●

●●●

●●

●●

●●●●

●●

●● ●

●●

●●●

●●●

●●

●●

●●

●●●

●●

●●

●●●

●●

●●

●●

●● ●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●●

●●

●●

●●●●

●●

●●●

●●

●●

●●●

●●

●●●

●●

●●●●

●●

●●

●●●

●●●

●●

●●

●●

●●●

●●

● ●

●●

●●●

●●

●●

●●

●●

●●●

●●

●●● ●

●●

●●

●●

●●

●●

● ●●

●●●

●●

●●

●●●

●●

●●●●

●●

●●

●●

● ●●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●●

●●

●●

●●

●●

●●

●●●●●●

●●

●●

●●

●●●

●● ●

●●●

●●

●●●

●●●

●●

●●●

●●

●●●

●●

●●

●●

●●

●●

●●●●

●●●

●●

●●

●●

● ●●

●●

●●

●●

●●●

●● ●●●

●●

●●

●●

●●

●●●

●●● ●

●●●

●●

●●

● ●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●●

●●

●●

●●

●●

●●

●●●●

●●

●●●●

●●

●●

●●

●●

●●●

●●●

●●

●●

●●

● ●

●●●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●

●●●●

●●●

●●●

● ●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●● ●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●

●●

●●

●●●

●●

●●

●●

● ●

●●

●●

●●

●●

●●

● ●

●●

●●

●●●●●

●●

●●●

●●

●●●

●●

●●

●●

●●●

●●

●●

●●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●●

● ●

● ●

●●

●●●

●●●

●●

●●

●●

●●

●●

●●

●● ●●

●●

●●

●●●

●●

●●

●●

●●

●●

● ●

●●

●●●●

●●

●● ●

●●

● ●

●●

●●

●●

●●●

●●

●●

●●

● ●

●●

●●

●●

●●

●●

●●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

● ●●

●●●

●●

●●

●●

●●

●●●

●●

● ●

●●

●●

●●

●●

●●●●

●●

●●

●●●

●●●●

●●●

●●

●●●

●●

●●

●●

●● ●

●●

●●

●●

●●

●●

●●

●●●

● ●

●●

●●

●● ●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●●

●●●

●●●

●●

●●●

●●

●●

●●

●●

●●

●●●●●

●●●●

●●

●●●

● ●

●●

●●

● ●

●●

●●

●●

●●

●●●

●●

●●

●●

● ●

●●

●●

●●

●●

●●●

●●

● ●●

●●

●●

●●

● ●●

●●

●●

●●

●●

●●

● ●

● ●

●●

● ●●

●●

● ●

●●●

●●●

●●●

●●

●●

●●

●●

●●

●●●

●●●●

● ●

●●

●●

●●

●●

●●●

●●

●●

●●

●●●

●●

●●

●●●

●●

●●

●●●

●●

●●●

●●●

●●

●●●

●● ●

●●

●●

●●

●●

●●

●●

●● ●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

● ●

●●

●●

●●●●●

● ●

●●

●●●

● ●●

●●●

●●

● ●

●●

●●

●●

●●●

●●

●●

●●●●

●●●

●●

●●

●●●●

●●●●●

●●

●●

●●

●●

●●

●●●●

●●

●●●

● ●

●●●

●●

●●

●●

●●

●●●

●●

●●

●●●●

●●●

●●

●●

●●●●●●

●●

●●

●●

● ●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●● ●

● ●●

●●

●●

●●

●●

●●

●●

●●●●●

●●

●● ●

●●

●●

●●●

●●●

●●

●●

● ●

●●●●●

●●

●●

●●

●●

●●●

●●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●●

●●●

●●

●●●

●●●

●●

●●

●●

●●●

●● ●●

●●●

●●

●●

●●●●

●●

●●

●●●

●●●

● ●●●●

●●

●●

●●

●●

● ●●

●●●

●●

●●●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●●

●●

●●

●●

●● ●

●●

● ●

●●

●●

●●

●●●

●●● ●

●●

●●

●●

●●

●●

●●

●●

●●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●● ●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

● ●●

●● ●●●

●●

●●

●●

●●●

●●

● ●

●●●

● ●

●●

●●

●●

● ●

●●

●●

●●

●●

●● ●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●●

●●

●●

●●

●●●

●●

●●●

●●

●●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●●

●●●

●●

●●●●

●●●●

●●

●●

●●

●●

●●

●●

●●

●● ●

●●

●●

●●

●●

●●

●●●

●●

●●

●●●

●●

●●

●●●●●●

●●●●●

●●

●● ●

●●

●●

●●

●●

●●

●●●●●

●●

●●●

●●

●●

●●

●●

●●

●●●

●●

●●●

●●

●●

●●●●

●●

●●

●●

●●●

●●

●●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●●●

●●●●

●●

●●

●●

●●

● ●

●●

●●●

●●

●●

●●●●

●●

●●●

●●

●●

●●●

●●●●

●●●

● ●●

●●

0.7 0.8 0.9 1.0 1.1 1.2

0.4

0.6

0.8

1.0

1.2

1.4

1.6

1.8

X intensity (12954 probes)

Y in

tens

ity (

2044

pro

bes)

●● ●

●●

●●

●●

●●

●●

XXX

XXY

XX/XO

MFNA


Figure 3: LRR and BAF plots for chromosome 5 in Sample A. This chromosome shows a normal pattern. Color-coding is for genotype calls (orange=AA, green=AB, fuchsia=BB, black=missing).


Figure 4: LRR and BAF plots for chromosome 2 in Sample A. This chromosome shows an abnormal pattern with a split in the heterozygous BAF band and a decrease in LRR, indicating a mosaic deletion. The split is wide enough to cause genotyping errors, with some heterozygotes evidently called as homozygotes. This anomaly is recommended for filtering. Color-coding is for genotype calls (orange=AA, green=AB, fuchsia=BB, black=missing). The horizontal solid red line in both plots is the median value of non-anomalous regions of the autosomes, while the horizontal dashed red line is the median value within the anomaly. The red box on the ideogram indicates the region shown in the BAF and LRR plots.


Figure 5: LRR and BAF plots for chromosome 12 in Sample B. This pattern is consistent with a heterozygous deletion. This deletion is recommended for filtering because its large size means it is almost certainly acquired. Color-coding is for genotype calls (orange=AA, green=AB, fuchsia=BB, black=missing). The horizontal solid red line in both plots is the median value of non-anomalous regions of the autosomes, while the horizontal dashed red line is the median value within the anomaly. The red box on the ideogram indicates the region shown in the BAF and LRR plots.


Figure 6: LRR and BAF plots for chromosome 2 in Sample C. This chromosome shows an abnormal BAF pattern with a split in the heterozygous band. Since LRR remains centered at 0 (copy neutral), this pattern is consistent with mosaic uniparental disomy. The split is wide enough to cause genotyping errors, with some heterozygotes evidently called as homozygotes. This anomaly is recommended for filtering. Color-coding is for genotype calls (orange=AA, green=AB, fuchsia=BB, black=missing). The horizontal solid red line in both plots is the median value of non-anomalous regions of the autosomes, while the horizontal dashed red line is the median value within the anomaly. The red box on the ideogram indicates the region shown in the BAF and LRR plots.


Figure 7: LRR and BAF plots for the X chromosome in Female Sample D. This chromosome shows an abnormal BAF pattern. A split in the heterozygous band and decreased LRR (relative to other females) is consistent with an XX/XO mosaic karyotype. It is apparent that some heterozygotes were called as homozygotes (note in particular the two bands of AA calls (orange)). Thus this anomaly is recommended for filtering. Color-coding is for genotype calls (orange=AA, green=AB, fuchsia=BB, black=missing). The horizontal solid red line in both plots is the median value of non-anomalous regions of the autosomes, while the horizontal dashed red line is the median value within the anomaly. The vertical light-blue rectangle represents the centromere. The red box on the ideogram indicates the region shown in the BAF and LRR plots.


Figure 8: LRR and BAF plots for chromosome X in Female Sample E. This chromosome shows a pattern consistent with trisomy, i.e., a split in the heterozygous band with positions at about 1/3 and 2/3, and elevated LRR, indicating an XXX genotype. This anomaly is not recommended for filtering since there is no indication of genotyping error. Color-coding is for genotype calls (orange=AA, green=AB, fuchsia=BB, black=missing). The horizontal solid red line in both plots is the median value of non-anomalous regions of the autosomes, while the horizontal dashed red line is the median value within the anomaly. The red box on the ideogram indicates the region shown in the BAF and LRR plots.
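The heterozygous band positions of about 1/3 and 2/3 cited for Figure 8 follow from simple allele counting. The sketch below is an idealized illustration, not the method used in this report: it assumes a non-mosaic sample with an integer copy number and no measurement noise, and the function name `expected_baf_bands` is hypothetical.

```python
from fractions import Fraction

def expected_baf_bands(copy_number):
    """Idealized BAF band positions for an integer copy number.

    With n copies, a SNP can carry k = 0..n copies of the B allele,
    so the B allele frequency is k/n. Assumes no mosaicism or noise.
    """
    if copy_number < 1:
        raise ValueError("copy_number must be at least 1")
    return [Fraction(k, copy_number) for k in range(copy_number + 1)]

def heterozygous_bands(copy_number):
    """Bands strictly between 0 and 1, i.e. genotypes carrying both alleles."""
    return [b for b in expected_baf_bands(copy_number) if 0 < b < 1]

# Two copies give a single heterozygous band at 1/2;
# three copies (trisomy) give bands at 1/3 and 2/3.
print(heterozygous_bands(2))
print(heterozygous_bands(3))
```

In a mosaic sample the split bands would lie between these idealized positions and 1/2, which this sketch does not model.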


Figure 9: LRR and BAF plots for the X chromosome in Male Sample F. The pattern of the BAF/LRR plots is consistent with an XXY karyotype. Note the presence of a middle heterozygous band. Splits in the heterozygous band in the pseudo-autosomal regions (PAR1, PAR2, and XTR) indicate increased copy number for X and distinguish the plot from a typical X-chromosome BAF plot for a female. X and XY SNPs are recommended to be filtered for this sample. Color-coding is green for SNPs in the pseudo-autosomal regions (PAR1 and PAR2, shown in gray rectangles, and XTR, shown in a yellow rectangle) and pink for other X chromosome SNPs.


Figure 10: IBD coefficients to estimate relatedness. Each point represents a pair of samples. This plot shows 15,918 pairs of study subjects with an estimated KC > 1/32, color-coded by expected relationships. Gray dashed horizontal lines show boundaries for KC values for inferring varying degrees of relatedness. The first and second (from the top) form a region for expected full siblings, the second and third form a region for expected second-degree relatives, the third and fourth for expected third-degree relatives, and below the fourth we expect unrelated or related at fourth degree or higher. (See Table 1 in [18].) The vertical dashed gray line represents a boundary for designating PO pairs or duplicates (whose IBS0 is theoretically 0). In the legend, “PO” = parent-offspring, “FS” = full siblings, “Deg2” = Degree 2 relationships (such as half-siblings/avuncular/grandparent/grandchild), “Deg3” = Degree 3 relationships (such as cousins or half-avuncular) and “Unrel” = unrelated. Unexpected relationships are indicated by triangles, which are color-coded by expected relationship. The red triangles in the upper left corner represent expected full sibling pairs that were monozygotic twins and hence have kinship coefficients of duplicates. See Section 8 for further interpretations.
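The banding described for Figure 10 can be sketched as a simple classifier. The caption does not give the numeric boundaries, so the cut points below (2^-3/2, 2^-5/2, 2^-7/2, 2^-9/2) are an assumption based on commonly used kinship inference criteria, and the IBS0 cutoff is a hypothetical placeholder; neither value is taken from this report.

```python
# Assumed kinship coefficient (KC) boundaries, top to bottom.
# These are illustrative values, not taken from the report.
KC_BOUNDS = (2 ** -1.5, 2 ** -2.5, 2 ** -3.5, 2 ** -4.5)

def classify_pair(kc, ibs0, ibs0_cutoff=0.005):
    """Assign a coarse relationship label to one sample pair.

    kc          -- estimated kinship coefficient
    ibs0        -- fraction of SNPs sharing zero alleles identical by state
    ibs0_cutoff -- hypothetical boundary separating PO from FS pairs
    """
    dup, first, second, third = KC_BOUNDS
    if kc > dup:
        return "Dup/MZ"  # duplicates or monozygotic twins
    if kc > first:
        # Parent-offspring pairs share at least one allele at every SNP,
        # so their IBS0 is theoretically 0; full siblings need not.
        return "PO" if ibs0 < ibs0_cutoff else "FS"
    if kc > second:
        return "Deg2"
    if kc > third:
        return "Deg3"
    return "Unrel"

print(classify_pair(0.25, 0.0))    # first-degree band, IBS0 near 0
print(classify_pair(0.125, 0.05))  # second-degree band
```

A pair whose label differs from its pedigree-based expectation would correspond to one of the triangles in the figure.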


Figure 11: Principal component analysis of 5,993 unrelated study subjects with 101 HapMap genotyping controls. Separate plots (on the same scale) of HapMap genotyping controls and study subjects are provided for ease of comparison. Color-coding is according to self-identified race for study subjects and population group for HapMap genotyping controls. Axis labels indicate the percentage of variance explained by each eigenvector. Refer to Table 4 for a description of HapMap abbreviations in the legend. For self-identified race abbreviations, see Section 9.

(a) Genotyping controls

(b) Study subjects


Figure 12: Scree plot for eigenvectors of final PCA.

[Scree plot: x-axis, Eigenvector; y-axis, Percent of variance accounted for.]


Figure 13: SNP position versus correlation between SNP genotype (0, 1 or 2) and each of the first 8 eigenvectors. These eigenvectors are from the final PCA of unrelated study subjects identified by the variable “pca unrel” in “Sample analysis.csv”.
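The quantity plotted in Figure 13 is a per-SNP correlation between genotype dosage and an eigenvector. A minimal sketch of that calculation follows, assuming a Pearson correlation and that samples with a missing genotype (represented here as `None`) are dropped; both choices are assumptions, and the function name is hypothetical.

```python
import math

def genotype_ev_correlation(genotypes, eigenvector):
    """Pearson correlation between genotype dosage (0, 1, 2) and eigenvector values.

    Samples with a missing genotype (None) are skipped. Returns None when the
    correlation is undefined (fewer than two samples or zero variance).
    """
    pairs = [(g, e) for g, e in zip(genotypes, eigenvector) if g is not None]
    n = len(pairs)
    if n < 2:
        return None
    mean_g = sum(g for g, _ in pairs) / n
    mean_e = sum(e for _, e in pairs) / n
    cov = sum((g - mean_g) * (e - mean_e) for g, e in pairs)
    var_g = sum((g - mean_g) ** 2 for g, _ in pairs)
    var_e = sum((e - mean_e) ** 2 for _, e in pairs)
    if var_g == 0 or var_e == 0:
        return None
    return cov / math.sqrt(var_g * var_e)

# Made-up example: dosage tracks the eigenvector perfectly.
r = genotype_ev_correlation([0, 1, 2, 1, None], [-1.0, 0.0, 1.0, 0.0, 0.3])
print(round(r, 6))
```

Plotting this value against SNP position for each eigenvector would show whether an eigenvector is driven by a localized cluster of SNPs rather than genome-wide structure.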


Figure 13: Continued.


Figure 14: Parallel coordinates plot for visualization of relationship of PCA eigenvector structure with self-identified race for the first 15 eigenvectors. Vertical lines represent eigenvectors and each piece-wise line between the vertical lines traces eigenvector values for a given subject. Color-coding is according to the self-reported race. As examples of interpretation: note that eigenvectors 3 and 4 separate out three subsets of Asian subjects (note the three separated bands colored green) and two subsets of African subjects (note the two separated bands colored red). Further note that eigenvector 6 appears to separate out the Caucasian/African-American/Native-American (cyan) group. For self-identified race abbreviations, see Section 9.


(a) (b)

(c)

Figure 15: 3D-views of population substructure from final PCA eigenvectors 1, 2, and 4. Color-coding is by self-identified race or by site. For self-identified race abbreviations, see Section 9.


(a) (b)

(c)

Figure 16: 3D-views of population substructure from final PCA eigenvectors 1, 2, and 6. Color-coding is by self-identified race or by site. For self-identified race abbreviations, see Section 9.


Figure 17: Histogram of the missing call rate per sample (missing.e1).

[Histogram: x-axis, Missing call rate by sample; y-axis, Frequency.]


Figure 18: Boxplot of missing call rate for study samples categorized by genotyping plate. Each box represents one of the 134 plates/batches. Red boxes indicate plates containing samples that failed in the first round of genotyping and were re-genotyped together. The width of each box is proportional to the square root of sample size. All batches have low missing call rate.

[Boxplot: "Autosomal missing call rate by plate"; y-axis, log10(autosomal missing call rate).]


Figure 19: Mean odds ratio (OR) plotted against the fraction of Caucasian samples on the plate; Caucasian is the largest self-identified race group. Red points indicate plates containing samples that failed in the first round of genotyping and were re-genotyped together. There are no outlier plates.

[Scatter plot: x-axis, fraction of Caucasian samples per batch; y-axis, mean Fisher’s OR; legend: redo, mean.]


(a) Distribution of minor allele frequency. [Histogram: x-axis, MAF; y-axis, Frequency.]

(b) Correlation of allelic dosage. [Plot: x-axis, MAF; y-axis, correlation of allelic dosage.]

(c) Overall concordance. [Plot: x-axis, MAF; y-axis, concordance.]

(d) Minor allele concordance. [Plot: x-axis, MAF; y-axis, minor allele concordance.]

Figure 20: Summary of concordance by SNP over 264 duplicate sample pairs, binned by minor allele frequency.


Figure 21: Sorted Mendelian error rates. Plot (a) is for 3,247 trios and plot (b) is for 2,015 single-parent/offspring pairs where only one parent was genotyped.

(a) Trios

(b) Parent-offspring pairs


Figure 22: Eigenvector 1 (EV1) vs Eigenvector 2 (EV2) of PCA of 2,167 unrelated self-identified Caucasian subjects using 62,987 pruned SNPs. The horizontal line is at -0.03 and the vertical line is at -0.007. A genetically homogeneous subset of these samples was chosen as EV1 > -0.007 (to the right of the vertical line) and EV2 > -0.03 (above the horizontal line). This selection is indicated by the logical variable “pca homog Caucasian” in “Samples analysis.csv”. Axis labels indicate the percentage of variance explained by each eigenvector.
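The selection rule stated for Figure 22 amounts to a two-threshold filter on the first two eigenvectors. The sketch below applies the two thresholds from the caption to made-up sample values; the sample identifiers, data layout, and function name are hypothetical.

```python
# Thresholds as stated in the Figure 22 caption.
EV1_MIN = -0.007
EV2_MIN = -0.03

def is_homogeneous(ev1, ev2):
    """True when a sample lies right of the vertical line and above the horizontal line."""
    return ev1 > EV1_MIN and ev2 > EV2_MIN

# Made-up (sample_id, EV1, EV2) values for illustration only.
samples = [
    ("S1", 0.002, 0.010),   # inside the selected region
    ("S2", -0.020, 0.010),  # fails the EV1 threshold
    ("S3", 0.002, -0.050),  # fails the EV2 threshold
]

selected = [sample_id for sample_id, ev1, ev2 in samples if is_homogeneous(ev1, ev2)]
print(selected)
```

Both comparisons are strict, matching the ">" in the caption, so a sample lying exactly on either line is excluded.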


Figure 23: Quantile-quantile plots for −log10(p) from exact test of Hardy-Weinberg equilibrium. Plots in the left column show all SNPs, whereas those in the right column have the y-axis truncated to show more clearly the point of deviation from expectation.
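The coordinates of a quantile-quantile plot of −log10(p), as in Figure 23, can be computed as sketched below. The plotting position (rank − 0.5)/n for the expected p-values is an assumption; the report does not state which convention was used, and the function name is hypothetical.

```python
import math

def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs, most significant first.

    Under the null hypothesis p-values are uniform on (0, 1), so the
    expected value for the i-th smallest of n p-values is taken as
    (i - 0.5) / n (an assumed plotting position). All p-values must be > 0.
    """
    ordered = sorted(p_values)  # smallest p-value first
    n = len(ordered)
    points = []
    for rank, p in enumerate(ordered, start=1):
        expected = -math.log10((rank - 0.5) / n)
        observed = -math.log10(p)
        points.append((expected, observed))
    return points

# Made-up p-values for illustration only.
for expected, observed in qq_points([0.5, 0.01, 0.1]):
    print(round(expected, 3), round(observed, 3))
```

Points that rise above the line observed = expected at the upper end mark the deviation from expectation that the truncated plots are meant to show.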


Figure 24: Distributions of estimated inbreeding coefficient for a random sample of 48,145 autosomal SNPs. The black trace represents observed values calculated from the data and the red trace represents values calculated from simulation assuming Hardy-Weinberg equilibrium. The potential values range from -1 to 1.

[Density plot: "Simulated F − 48145 SNPs − 847 Samples"; x-axis, F; y-axis, Density; legend: Data, Simulation.]
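The per-SNP inbreeding coefficient shown in Figure 24 can be illustrated with the usual estimator F = 1 − (observed heterozygosity) / (expected heterozygosity under Hardy-Weinberg equilibrium). This is a sketch under the assumption that the report uses this standard form; the function name and example counts are hypothetical.

```python
def inbreeding_coefficient(n_aa, n_ab, n_bb):
    """Estimate F for one SNP from genotype counts.

    F = 1 - H_obs / H_exp, where H_exp = 2 * p * (1 - p) and p is the
    A allele frequency. Returns None for monomorphic SNPs or empty input,
    where H_exp is zero and F is undefined.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        return None
    p = (2 * n_aa + n_ab) / (2 * n)
    h_exp = 2 * p * (1 - p)
    if h_exp == 0:
        return None
    h_obs = n_ab / n
    return 1 - h_obs / h_exp

# Made-up genotype counts for illustration only.
print(inbreeding_coefficient(25, 50, 25))  # genotype proportions match HWE
print(inbreeding_coefficient(50, 0, 50))   # no heterozygotes at all
print(inbreeding_coefficient(0, 100, 0))   # all heterozygotes
```

The last two calls reach the extremes of the range, consistent with the statement that potential values run from -1 to 1.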


Figure 25: Minor allele frequency distribution across all study subjects.

(a) Autosomes [Histogram: x-axis, MAF; y-axis, Frequency.]

(b) X chromosome [Histogram: x-axis, MAF; y-axis, Frequency.]


Figure 26: Quantile-quantile plots for preliminary association tests using TDT. QQ plots are provided after using no SNP filter, using the composite filter, using the composite filter plus the MAF filter (MAF < 0.01), and (bottom right plot) for SNPs that satisfy the composite filter but have MAF below the MAF filter threshold. Lambda is the genomic inflation factor.
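The genomic inflation factor (lambda) reported on these QQ plots is commonly computed as the median observed test statistic divided by the median of the chi-square distribution with 1 degree of freedom. The sketch below assumes that definition and the standard allelic TDT statistic (b − c)² / (b + c); the report does not state its exact computation, and the function names are hypothetical.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364231195724

def tdt_chisq(transmitted, untransmitted):
    """TDT statistic from counts of transmissions of one allele by heterozygous parents.

    Returns None when there are no informative transmissions.
    """
    total = transmitted + untransmitted
    if total == 0:
        return None
    return (transmitted - untransmitted) ** 2 / total

def genomic_inflation(chisq_stats):
    """Lambda: median observed statistic over the null median (1 df)."""
    return median(chisq_stats) / CHI2_1DF_MEDIAN

# Made-up transmission counts for illustration only.
stats = [tdt_chisq(b, c) for b, c in [(30, 20), (25, 25), (40, 38), (12, 15), (50, 47)]]
print(genomic_inflation(stats))
```

A lambda close to 1 indicates that the bulk of the test statistics follow the null distribution; values well above 1 suggest inflation from artifacts or unmodeled structure.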


Figure 27: Quantile-quantile plots for parenTDT tests. QQ plots are provided after using no SNP filter, using the composite filter, using the composite filter plus the MAF filter (MAF < 0.01), and (bottom right plot) for SNPs that satisfy the composite filter but have MAF below the MAF filter threshold. Lambda is the genomic inflation factor.


Figure 28: Manhattan plots for association TDT tests. The chromosome designation U refers to SNPs with unknown position.


Figure 29: Manhattan plots for association parenTDT tests. The chromosome designation M refers to mitochondrial SNPs and the chromosome designation U refers to SNPs with unknown position.


Figure 30: QQ plots for TDT results restricted to SNPs on chromosomes 14, 15, and 16.
