Upload
others
View
6
Download
0
Embed Size (px)
Citation preview
and W. BarendseS. Bolormaa, L. R. Porto Neto, Y. D. Zhang, R. J. Bunch, B. E. Harrison, M. E. Goddard
A genome-wide association study of meat and carcass traits in Australian cattle
doi: 10.2527/jas.2010-3138 originally published online March 18, 20112011, 89:2297-2309.J ANIM SCI
http://jas.fass.org/content/89/8/2297the World Wide Web at:
The online version of this article, along with updated information and services, is located on
www.asas.org
at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from
A genome-wide association study of meat and carcass traits in Australian cattle1
S. Bolormaa,*† L. R. Porto Neto,*‡ Y. D. Zhang,*§ R. J. Bunch,*‡ B. E. Harrison,*‡ M. E. Goddard,*† and W. Barendse*‡2
*Cooperative Research Centre for Beef Genetic Technologies, Armidale, New South Wales 2351, Australia; †Victorian Department of Primary Industries, Bundoora, Victoria 3083, Australia;
‡Commonwealth Scientific and Industrial Research Organisation (CSIRO) Livestock Industries, Queensland Bioscience Precinct, St. Lucia 4067, Queensland, Australia;
and §Animal Genetics and Breeding Unit (AGBU), NSW Department of Primary Industries and University of New England, Armidale, New South Wales 2351, Australia
ABSTRACT: Chromosomal regions containing DNA variation affecting the traits intramuscular fat percent-age (IMF), meat tenderness measured as peak force to shear the LM (LLPF), and rump fat measured at the sacro-iliac crest in the chiller (CHILLP8) were identi-fied using a set of 53,798 SNP genotyped on 940 taurine and indicine cattle sampled from a large progeny test experiment. Of these SNP, 87, 64, and 63 were sig-nificantly (P < 0.001) associated with the traits IMF, LLPF, and CHILLP8, respectively. A second, nonover-lapping sample of 1,338 taurine and indicine cattle from the same large progeny test experiment genotyped for 335 SNP, including as a positive control the calpastatin (CAST) c.2832A > G SNP, was used to confirm these locations. In total, 37 SNP were significantly (P < 0.05) associated with the same trait and with the same fa-vorable homozygote in both data sets, representing 27
chromosomal regions. For the trait IMF, the effect of SNP in the confirmation data set was predicted from the discovery set by multiplying the estimated allele ef-fect of each SNP in the discovery set by the number of copies of the reference allele of each SNP in the confir-mation set. These weighted effects were then summed over all SNP to generate a molecular breeding value (MBV) for each animal in the confirmation data set. Using a bivariate analysis of MBV and IMF phenotypes of animals in the confirmation set, a panel of 14 SNP explained 5.6 and 15.6% of the phenotypic and genetic variance of IMF, respectively, in the confirmation data set. The amount of variation did not increase as more SNP were added to the MBV and instead decreased to 1.2 and 3.8% of the phenotypic and genetic variance of IMF, respectively, when 329 SNP were included in the analysis.
Key words: cattle, DNA, fat, genome-wide association study, intramuscular, quantitative trait loci, selection, subcutaneous
©2011 American Society of Animal Science. All rights reserved. J. Anim. Sci. 2011. 89:2297–2309 doi:10.2527/jas.2010-3138
INTRODUCTION
Genetic selection based on pedigree and phenotypes is well established for traits that are easy to measure. In the last decade, DNA markers for harder to measure traits have begun to be implemented, but most of these tests were not integrated into genetic evaluation sys-tems based on pedigree and phenotypes. This is partly due to the small number of DNA markers reliably as-
sociated with traits. In cattle, for meat and carcass quality there are 2 genes that account for moderate ge-netic effects on meat tenderness, calpain 1 (CAPN1), and calpastatin (CAST; Barendse, 2002; Page et al., 2002), which have been confirmed many times (White et al., 2005; Casas et al., 2006; Schenkel et al., 2006; Barendse et al., 2007a; Cafe et al., 2010). Methods to incorporate smaller effects than these through genomic selection (Meuwissen et al., 2001) have been developed, and these methods should allow an integration of DNA marker and quantitative genetic approaches to animal improvement (Goddard and Hayes, 2007).
The main objectives of this study were to identify more SNP associated with meat and carcass quality us-ing the genome-wide association study (GWAS) meth-odology to confirm the significant SNP and to evaluate
1 We thank A. Reverter (CSIRO) and B. Tier (AGBU) for discus-sion of data and methods. We thank Meat and Livestock Australia P/L for financial support (W. Barendse).
2 Corresponding author: [email protected] May 6, 2010.Accepted February 15, 2011.
2297
at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from
the performance of panels of the SNP for prediction of a phenotype. Approximately 54,000 SNP spread across the bovine genome were tested for associations with 3 traits of interest in a discovery sample, and a selec-tion of SNP was evaluated in a confirmation sample. The traits of interest were intramuscular fat percent-age (IMF), which is of particular importance due to price premiums for beef with high levels of IMF, meat tenderness measured as peak force to shear the LM (LLPF), and rump fat thickness measured in the chill-er at the sacro-iliac crest of the third sacral vertebra (CHILLP8).
MATERIALS AND METHODS
Animal Care and Use Committee approval was not obtained for this study because no new animals were handled in this experiment. The experiment was per-formed on trait records and DNA samples that had been collected previously. The animals in this experi-ment were born between 1993 and 1999 as described below.
The experiment consisted of 2 parts, an initial GWAS of 53,798 SNP genotyped over 940 animals in a discovery data set sampled from a large progeny test experiment that had been measured for several carcass and meat traits. Subsequently, 335 SNP, of which 329 were from the GWAS, were genotyped in a confirma-tory data set of 1,338 animals of the same large progeny test experiment that had been measured for the same carcass and meat traits.
GWAS for Meat and Carcass Traits
The DNA of 940 beef cattle of the Genetic Corre-lations Experiment of the Cooperative Research Cen-ter for the Cattle and Beef Industry were used for the GWAS. The breeding of these animals was reported previously (Upton et al., 2001). The measurements of IMF, CHILLP8, and LLPF have been reported pre-viously (Perry et al., 2001) and genetic correlations between these traits reported (Reverter et al., 2003a). These animals form the bulk of the sample reported previously in studies of residual feed intake (Robinson and Oddy, 2004; Barendse et al., 2007b). The breed composition of the sample consisted of 220 Angus, 146 Hereford, 55 Murray Grey, 81 Shorthorn, 78 Brahman, 165 Belmont Red, 126 Santa Gertrudis, 25 Taurine-Brahman crossbred animals, and 44 Tropical Compos-ite-Brahman crossbred animals. All 4 grandparents of the animals in each of the 5 purebred or 2 composite breeds belonged to the particular breed. For the cross-bred animals, the dams were all Brahman and the sires were either of 1 of the 4 taurine breeds or 1 of the 2 composite breeds. These represent the offspring of 246 sires, representing 34 herds of origin, each breed con-sisting of several herds of origin, 2 sexes, and 50 kill groups. The average number of half-sibs per sire was 3.8 with a range of 1 to 15 offspring per sire.
The DNA samples were genotyped using the Bovine SNP50 array (Matukumalli et al., 2009) by Illumina Inc. (Hayward, CA). Genotypes were analyzed for GC10 (GenCall quality at the 10th percentile) scores, call rates, call frequency, cluster separation, and devia-tions from Hardy-Weinberg equilibrium using the Ge-nome Studio Software version 1.0. In addition, tests of repeated genotyping of the same animal were includ-ed, but the identity of the repeat was unknown to the genotyper and included individuals of known pedigree. These methods were used to analyze the raw data and arrive at a final data set of n = 940 animals genotyped for 53,798 SNP. The bp designations for each genotype were converted to number of copies of a reference allele so that for an A/G SNP the AA genotype was 0, the AG genotype was 1, and the GG genotype was 2. The codes for each SNP are shown in Supplemental Table 1 (http://jas.fass.org/content/vol89/issue8/).
Confirmation of Associations for Meat Quality Identified in the 50K GWAS
The DNA of an additional sample of 1,338 beef cattle samples from the same large progeny test experiment (Genetic Correlations Experiment of the Cooperative Research Center for the Cattle and Beef Industry) was chosen for confirmation of the associations. The breed composition of the animals in the confirmation sample consisted of 655 Angus, 343 Hereford, and 340 Brah-man animals, representing 3 of the 9 breeds or breed-composites that were present in the discovery sample. The animals represent the offspring of 198 sires, repre-senting 23 herds of origin, 2 sexes, and 142 kill groups. The average number of half-sibs per sire was 6.8 with a range of 1 to 31 offspring per sire. None of the animals used in the confirmation sample were used in the GWAS discovery sample. Of the 198 sires in the confirmation sample, 123 had half-sib offspring in the discovery sam-ple, 463 of the 940 (49.2%) discovery animals were half sibs of animals in the confirmation sample, and 957 of the 1,338 (71.5%) animals in the confirmation sample were half-sibs of animals in the discovery sample. The size of the half-sib group in the discovery sample was positively correlated to the size of the half-sibs group in the confirmation sample for offspring of the same sire, with r = 0.46. The main genetic difference between the samples is that the confirmation sample contained no stabilized composite animals of breeds such as the Santa Gertrudis or first generation crossbreds such as F1 Angus × Brahman animals.
The DNA samples were genotyped using a custom Illumina Golden Gate Assay (Supplemental Table 2; http://jas.fass.org/content/vol89/issue8/) using an Il-lumina iScan machine (Illumina, San Diego, CA) at CSIRO. This list consists of 384 SNP from a list of 443 SNP sent to Illumina and consists of all SNP signifi-cantly (P < 0.001) associated with IMF, CHILLP8, or LLPF in the GWAS for which assays could be generat-ed, with the remainder consisting of SNP significantly
2298 Bolormaa et al.
at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from
(P < 0.005) associated with IMF and that were part of the Bovine SNP50 array or SNP mapped to regions that had previously been associated with IMF or LLPF (Barendse, 2008). The significance thresholds of α = 0.001 and 0.005 (for IMF) were used in the GWAS to limit the number of SNP to be considered in the con-firmation round. In GWAS, due to the number of tests, the more relaxed the threshold in the discovery sample the more false positives will be discovered. The more stringent the significance threshold the more likely it is that real genetic effects of small size will be overlooked due to sampling effects. After genotyping the animals for the confirmation sample, the genotypes were ana-lyzed for GC10 scores, call rates, call frequency, clus-ter separation, and deviations from Hardy-Weinberg equilibrium using the Genome Studio Software version 1.0 to arrive at a final data set of 335 SNP for 1,338 animals (see Supplemental Table 2; http://jas.fass.org/content/vol89/issue8/). Of these 335 SNP, the panel contained 75 of the 87 SNP significantly (P < 0.001) associated with IMF, 31 of the 63 SNP significantly (P < 0.001) associated with CHILLP8, and 32 of the 64 SNP significant (P < 0.001) for LLPF. In addition, 154 SNP significant (P < 0.005) for IMF were included, and 42 of the SNP near candidate genes for IMF and LLPF were included, resulting in 292 SNP selected for their significance to 1 of the 3 traits. Of these 335 SNP, 329 were in the Illumina Bovine SNP50 Array, so the discovery and confirmation samples were genotyped for 329 SNP in common. The 6 SNP that were not in the Il-lumina Bovine SNP50 Array are in the candidate genes CAST (Barendse, 2002; Barendse et al., 2007a), stea-royl co-A desaturase 1 (SCD1; Taniguchi et al., 2004), and calpain 3 (CAPN3; Barendse et al., 2008). The specific SNP were CAST:c.2832A > G (CAST-2832), SCD:c.878C > T (SCD-878 GenBank AY241933.1), SCD-e638 (GenBank NW_003104582.1:g.7034766C > T), CAPN3:c.1882A > G, CAPN3:c.2082G > A, and CAPN3:c.2123A > T, which were respectively labeled CAPN3–2089, CAPN3–2289, and CAPN3–2330 in pre-vious research (Barendse, 2007), and the variant se-quences for the 6 SNP are contained in Supplemental Table 3 (http://jas.fass.org/content/vol89/issue8/). The CAST-2832 was inserted as a positive control, whereas SCD-878, CAPN3–2089, and CAPN3–2330 are known nonsynonymous mutations in candidate genes for either IMF composition or LLPF.
Analyses
The phenotypes and genotypes were fitted in a mixed model of the form trait ~mean + fixed effects + SNP + animal + error using the software ASReml (Gilm-our et al., 2002) where SNP was treated as a variable consisting of the number of copies of an allele, and ani-mal and error were random effects. The SNP associa-tions were evaluated by regression. For the carcass and meat quality traits, the fixed effects were breed, herd of origin, sex, and kill group of the animals (Reverter
et al., 2003b). Pedigree information for 5 generations was included as well as age in days at slaughter. Each breed consisted of several herds of origin, and herd of origin was nested within breed in case there were allele frequency differences between different herds within a breed as well as different management of herds on the properties from which the animals were obtained. Al-most all kill groups were of a single sex, so sex (s) and kill group (KG) were concatenated to form sKG. In the GWAS sample, the regression was evaluated as a 2-tailed test against the null hypothesis of a regres-sion coefficient of zero. The quantiles of the observed t-values for the GWAS experiment were compared with the quantiles of the standard normal distribution using a quantile-quantile plot (Q-Q plot) in the R-project software (Ihaka and Gentleman, 1996) downloaded from http://www.r-project.org to determine if there was an excess of extremely high observed values. In the analysis of the confirmation sample, the regression was evaluated as a 1-tailed test of the favorable homozygote of the discovery sample. The significance threshold was set at α = 0.05 because the SNP already had evidence for association from data in the experiment. If the dis-covery and confirmation data sets were combined, the association that had been found in the discovery sam-ple would not be materially degraded by the addition of significant new data of this kind from the confirmation sample. However, if data from the confirmation sample showed either that the alternative homozygote was sig-nificantly associated with the trait or that there was no association in the confirmation sample, then it would degrade the evidence for the association when added to the data from the discovery sample. The proportion of the residual variance explained by a SNP was esti-mated by comparing the ratio of the residual sums of squares of a model that did not include the SNP to the residual sums of squares of the model that included the SNP. The false positive rate (FPR) = Ep/Op where Ep is the expected number of SNP with p-values below a particular significance threshold, given the number of SNP in the panel and assuming that all tests are inde-pendent, and Op is the observed number of SNP with P-values below that same threshold.
To analyze genes, their locations, and the position of SNP on the map, the Btau 4.0 Bovine Genome As-sembly implemented at http://www.livestockgenomics.csiro.au/perl/gbrowse.cgi/bova4/ (Elsik et al., 2009) was used.
To calculate the molecular breeding value (MBV) for IMF for animals in the confirmation sample based on a panel of more than 1 SNP, first the allele substitu-tion effects of all the SNP in the panel were estimated simultaneously using the genotypes and phenotypes from the discovery sample. Then, for each animal of the confirmation sample, for each SNP in the panel, the number of copies of the reference allele was mul-tiplied by the allele effect of the SNP estimated from the discovery sample, and the resulting coefficients were summed across all SNP in the panel to generate
2299Genome-wide association study of meat and carcass traits
at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from
the MBV. For n loci in the panel, for bi allele effects for SNP i obtained from the discovery sample and gij genotypes (0, 1, and 2) in individual j at SNP i in the confirmation sample, the MBV mj for animals in the confirmation sample is
mj = Σni = 1bigij.
This method, on the additive scale for a quantitative trait, is similar to the method for predicting genetic risk of a complex trait applied to the multiplicative scale (Wray et al., 2007). A bivariate analysis of the MBV and the trait values of individuals of the confirmation sample was performed using ASReml to calculate ge-netic and phenotypic correlations, and the squares of these correlations were used to estimate the genetic and phenotypic variance, r2
G and r2P, respectively, explained
by a panel of SNP. The MBV were calculated from panels of different numbers of SNP. Simultaneously es-timated allele effects are affected by analytical method. We considered BayesA and BayesB (Meuwissen et al., 2001) and regression, which have substantially differ-ent assumptions about the underlying distribution of allele effects. The ASReml software used for regression has a limit of 280 SNP analyzed simultaneously on computers available to us, so we compared MBV from these methods using a set of 280 SNP that would fit in ASReml regression analysis. The MBV for IMF were calculated from 1) the most significant SNP in the dis-covery sample, 2) a panel of 190 out of the possible 229 SNP significantly (P < 0.05) associated with IMF iden-tified by forward regression in the discovery sample (see below), 3) the most significant SNP in the confirmation sample, 4) a panel of 14 SNP identified by forward re-gression in the confirmation sample (see below), and 5) as many SNP as possible genotyped in both discovery and confirmation samples irrespective of whether they were associated with IMF or not, of which there was a maximum of 329 SNP in common. The subset of 280 SNP for comparing regression (ASReml), BayesA and BayesB are contained in Supplemental Table 2 (http://jas.fass.org/content/vol89/issue8/). They were the first 280 SNP in alphabetical order, covered chromosomes BTA1–23, and included all the confirmed SNP for IMF (Table 2) but did not include 1 SNP (ARS-BFGL-NGS-40342) on BTA29 that formed part of the panel of 14 SNP (see Supplemental Table 5; http://jas.fass.org/content/vol89/issue8/).
To identify a panel of SNP associated with IMF by forward regression the following 3-step algorithm was used to identify the members of the panel. The idea behind this process is to progressively add to the panel the SNP that explain most of the variance in the trait after the effects of the previous SNP were removed. Step 0: select the best SNP (with the greatest F-value) from a single-SNP regression analysis, fitting the SNP as a fixed effect on uncorrected data. This SNP is add-ed to the pool of SNP. The defined critical threshold to enter SNP to the pool or model for multiple-SNP
regression analysis was set at P = 0.05. Step 1: perform the regression of the pool of SNP (consisting of 1 to as many SNP as are significant) fitted as a fixed effect on the original uncorrected data. This analysis results in a set of data corrected for the pool of SNP as well as other fixed effects, and these corrected values are sub-tracted from the original data. The resulting residual trait values are then used to identify new SNP to be added to the pool of SNP. Step 2: select the best SNP in the data set, excluding those already in the pool, by fitting the SNP as a fixed effect on the residual trait values. This SNP is added to the pool of SNP and the critical threshold is the same as step 0. Repeat steps 1 and 2 in sequence until there are no more SNP with F-values significantly (P < 0.05) associated with residual variance.
RESULTS
There were slightly more SNP that were significantly (P < 0.001) associated with IMF than to either LLPF or CHILLP8 in the 50K GWAS (Table 1). Given the size of the discovery sample the significance threshold of α = 0.001 generated a manageable number of targets for further study without increasing FPR too much. There were 87 SNP with significant (P < 0.001) associations with IMF, located across the bovine genome (Figure 1) and listed in Supplemental Table 4 (http://jas.fass.org/content/vol89/issue8/). Because of the number of SNP (53,798) tested, 54 of the SNP would be significant by chance at the α = 0.001 significance threshold, giving a FPR of 54/87 or 62%. There were 64 and 63 SNP with significant (P < 0.001) associations for LLPF and CHILLP8, respectively, listed in Supplemental Table 4 (http://jas.fass.org/content/vol89/issue8/), discovered at a FPR of 84% for both traits. Despite the increased FPR, the Q-Q plots for these traits showed an excess of greater t-values compared with the theoretical distribu-tion. For example, Figure 2 shows the Q-Q plot for the LLPF GWAS, which is known to include the CAPN1 mutations, and this excess usually indicates a likely real discovery of SNP associated with QTL. Although the majority of these associations have not been confirmed, the results for all SNP are provided in Supplemental Table 4 (http://jas.fass.org/content/vol89/issue8/) for use in meta-analyses and for confirmation studies per-formed by other researchers.
The chromosomes with large numbers of significant (P < 0.001) associations in the discovery sample are dif-ferent for the 3 traits (Table 1). There was no overlap in the list of chromosomes that had the largest number of SNP associated with the 3 traits. The 5 chromosomes with the largest number of associations with IMF ac-counted for 35 out of the 87 significant (P < 0.001) SNP, and the 87 SNP were located on 26 chromosomes. Although BTA1 had the most associations with IMF, the region with the greatest concentration of significant associations was BTA5 at 21 Mb. The 5 chromosomes with the largest number of significant associations with
2300 Bolormaa et al.
at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from
LLPF accounted for 52 out of the 64 significant (P < 0.001) SNP, a much greater proportion than in the other 2 traits, indicating greater clustering of SNP on particular chromosomes, and the 64 SNP were located on 22 chromosomes. Bovine chromosome 12 had the most associations with LLPF consisting of 9 SNP, but the greatest concentration was on BTA29 at 45 Mb surrounding the CAPN1 gene. The 5 chromosomes with the largest number of significant associations with CHILLP8 accounted for 27 out of the 63 significant (P < 0.001) SNP, and the 63 SNP were located on 21 chromosomes. Bovine chromosome 14 had the most as-sociations with CHILLP8 consisting of 8 SNP, and the greatest concentration was also on BTA14 at 23 Mb in a region containing the XK, Kell blood group complex subunit-related family, member 4 (XKR4) gene. The significant (P < 0.001) associations on BTA8 for LLPF and CHILLP8 did not overlap because none of the SNP associated with a single trait were within 1 Mb of SNP associated with the other trait. There were no SNP that were significantly (P < 0.001) associated for all 3 traits and only 1 SNP, BTA-105294-no-rs located at BTA3:91216203, was associated with both LLPF and CHILLP8. There were no SNP associated at the α = 0.001 significance threshold with effects on both IMF and LLPF or IMF and CHILLP8.
In the confirmation sample, a total of 13% (37/293) of the SNP that were chosen on their association with IMF, LLPF, or CHILLP8 in the GWAS were signifi-cantly (P < 0.05) associated with the same trait with the same favorable homozygote (Table 2). The signifi-cance threshold of α = 0.05 for the same favorable ho-mozygote was used because there was already informa-tion in the experiment that identified the SNP. The most significant SNP in the confirmation sample were not the most significant SNP in the discovery sample for IMF. Furthermore, of the SNP tested for IMF, 15% (11 of 75) of those that were significantly (P < 0.001) associated with IMF in the discovery sample were sig-nificantly (P < 0.05) associated with IMF in the con-firmation sample and 9% (14 of 155) of those that were significantly (P < 0.005) associated with IMF in the discovery sample were significantly (P < 0.05) associ-ated with IMF in the confirmation sample, a nonsignifi-
cant difference. Four regions, BTA4:18 Mb, BTA8:96 Mb, BTA10:85 Mb, and BTA11:80 Mb, had more than 1 significant (P < 0.05) SNP. In addition, BTA2, 8, 10, 14, and 16 had more than 1 significant (P < 0.05) SNP separated by >5 Mb on the same chromosome. In-cluded in Table 2 is a SNP that had been significantly (P < 0.001) associated with CHILLP8 in the GWAS but was significantly (P < 0.01) associated with IMF in the confirmation sample, where the same homozygote increased CHILLP8 or IMF, respectively. It was not significantly associated with CHILLP8 in the confirma-tion sample.
There were 7 SNP significantly (P < 0.05) associated with LLPF out of the group of 32 SNP tested that had been significantly (P < 0.001) associated with LLPF in the GWAS. The CAST-2832 SNP, which is not in the Illumina Bovine SNP50 Array, was also significantly (P < 0.001) associated with LLPF in the confirmation sample. These 8 SNP were located to 2 chromosomes, 2 genomic regions on BTA7 and 1 on BTA29. Two of these regions contained the CAST and CAPN1 genes, and although other SNP near CAPN1 had shown stronger associations with LLPF than CAPN1_1 in the GWAS discovery sample, in the confirmation sam-ple, the CAPN1_1 SNP was the most significant (P = 0.00068) of the SNP associated with LLPF on BTA29. The CAPN1_1 SNP is part of current genetic tests for meat tenderness. The CAST-2832 SNP, which is also part of current genetic tests, was not included in the Bovine SNP50 array, but it was the most significant (P = 0.00003) of all the SNP associated with LLPF in the confirmation sample. Interestingly, of the 4 SNP in the introns of or immediately 5′ or 3′ to the CALP gene, ARS-BFGL-NGS-43901, ARS-USMARC-670, ARS-USMARC-116, and BTA-111223-no-rs, which were in-cluded in the Bovine SNP50 array, none showed signifi-cant associations with LLPF in the discovery sample, even if a very relaxed (α = 0.05) significance threshold was applied. The third region, BTA7:50 Mb, which con-tained a SNP significantly (P < 0.05) associated with LLPF, was located more than 45 Mb away from CAST.
There were 5 SNP significantly (P < 0.05) associated with CHILLP8 in the confirmation sample out of the group of 31 SNP tested that had been significantly (P
Table 1. Summary of genome-wide associations for intramuscular fat percentage (IMF), rump fat measured at the sacro-iliac crest in the chiller (CHILLP8), and meat tenderness measured as peak force to shear the LM (LLPF) in the discovery sample
Trait n SNP1 FPR,2 % Top 5 Chr3 Sig %4 Best region5 Gene(s)6
IMF 87 62 1, 4, 5, 10, 13 40.2 BTA5:21 KITLG to ATP2B1CHILLP8 63 84 8, 11, 14, 19, 23 42.9 BTA14:22 XKR4LLPF 64 84 7, 8, 12, 22, 29 81.3 BTA29:45 EHD1 to OVOL1 includes CAPN1
1Number SNP significant at P < 0.001.2False positive rate.3Top 5 chromosomes with largest number of SNP significant at P < 0.001.4Percentage of SNP with P < 0.001 on the top 5 chromosomes compared with all chromosomes.5Genomic region with the largest number of SNP significant at P < 0.001.6Gene(s) located to this genomic region.
2301Genome-wide association study of meat and carcass traits
at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from
Tab
le 2
. Ass
ocia
tion
s be
twee
n SN
P a
nd in
tram
uscu
lar
fat
perc
enta
ge (
IMF),
rum
p fa
t m
easu
red
at t
he s
acro
-ilia
c cr
est
in t
he c
hille
r (C
HIL
LP
8), a
nd m
eat
tend
erne
ss m
easu
red
as p
eak
forc
e to
she
ar t
he L
M (
LLP
F)
in t
he c
onfir
mat
ion
sam
ple
of c
attle
SNP
Chr
omos
ome
Pos
itio
n, b
pT
rait
GW
A
P-v
alue
1b2
SEt
R2 , %
P-v
alue
3G
ene4
BT
B-0
1742
157
00
IMF
0.00
034
−0.
110.
061.
800.
60.
0360
8U
nkno
wn
BT
B-0
0019
769
144
1149
81IM
F0.
0036
00.
240.
082.
880.
60.
0020
3D
CB
LD
2B
TA
-120
552-
no-r
s2
5477
2881
IMF
0.00
382
0.13
0.06
2.07
0.7
0.01
935
GT
DC
1H
apm
ap55
796-
rs29
0111
722
6394
7670
IMF
0.00
162
−0.
220.
082.
681.
10.
0037
4nn
gA
RS-
BFG
L-N
GS-
1184
682
9358
6145
IMF
0.00
139
−0.
190.
063.
131.
20.
0009
0LO
C51
8393
AR
S-B
FG
L-N
GS-
7503
93
1548
0732
IMF
0.00
032
−0.
180.
072.
570.
70.
0051
6P
RC
CB
TA
-684
67-n
o-rs
418
4660
19IM
F0.
0004
5−
0.15
0.06
2.68
0.5
0.00
374
nng
Hap
map
4804
9-B
TA
-684
654
1851
9898
IMF
0.00
154
0.10
0.06
1.71
0.2
0.04
379
nng
AR
S-B
FG
L-N
GS-
6407
27
4997
5005
LLP
F0.
0002
9−
0.16
0.07
2.32
0.6
0.01
027
1 kb
dis
tal M
GC
1595
84C
AST
-283
27
9757
4679
LLP
FN
BSA
0.22
0.05
4.05
1.5
0.00
003
CA
STB
TB
-007
3317
88
2551
6628
IMF
0.00
350
−0.
100.
061.
660.
20.
0486
1K
IAA
1797
BT
B-0
0488
794
896
2815
83IM
F0.
0045
80.
250.
064.
041.
70.
0000
3nn
gA
RS-
BFG
L-N
GS-
8778
78
9636
2722
IMF
0.00
354
0.18
0.07
2.51
0.7
0.00
612
LO
C53
0677
AR
S-B
FG
L-N
GS-
1003
958
1096
4968
0C
HIL
LP
8 IM
F5
0.00
021
−0.
230.
082.
991.
20.
0014
3T
NC
BT
B-0
1229
331
939
7514
80IM
F0.
0034
5−
0.12
0.06
1.93
0.2
0.02
695
nng
AR
S-B
FG
L-N
GS-
1036
8510
6266
2238
IMF
0.00
327
−0.
130.
072.
060.
40.
0198
3D
FN
A5
BT
B-0
1804
753
1084
5731
01IM
F0.
0000
0−
0.25
0.12
2.04
0.7
0.02
081
LO
C52
3451
AR
S-B
FG
L-N
GS-
1062
0310
8592
7764
IMF
0.00
023
−0.
250.
131.
960.
60.
0251
4R
GS6
AR
S-B
FG
L-N
GS-
1075
4311
7961
0835
IMF
0.00
059
0.12
0.07
1.74
0.4
0.04
109
nng
AR
S-B
FG
L-N
GS-
1064
7911
8190
8400
IMF
0.00
165
0.11
0.06
1.80
0.2
0.03
609
nng
BT
B-0
0492
076
1247
1896
87IM
F0.
0006
60.
120.
061.
910.
50.
0282
1nn
gH
apm
ap31
968-
BT
C-0
5675
414
4399
112
IMF
0.00
424
0.09
0.05
1.78
0.7
0.03
770
nng
AR
S-B
FG
L-N
GS-
1042
6814
2226
0372
IMF C
HIL
LP
860.
0033
60.
390.
152.
640.
50.
0042
1nn
gB
TB
-015
3078
814
2272
0373
CH
ILLP
80.
0000
00.
390.
201.
940.
30.
0263
3X
KR
4B
TB
-015
3083
614
2276
8980
CH
ILLP
80.
0000
0−
0.49
0.20
2.44
0.4
0.00
743
XK
R4
BT
B-0
0557
585
1422
8033
66C
HIL
LP
80.
0000
00.
520.
202.
660.
50.
0039
7X
KR
4B
TB
-005
5753
214
2283
8801
CH
ILLP
80.
0000
60.
540.
202.
730.
60.
0032
232
kb
dist
al t
o X
KR
4B
TA
-283
03-n
o-rs
1437
7115
67C
HIL
LP
80.
0004
50.
240.
131.
800.
20.
0360
934
kb
prox
imal
to
LO
C78
2385
Hap
map
4113
5-B
TA
-104
993
1461
7832
14IM
F0.
0008
60.
150.
062.
530.
60.
0057
8LO
C78
1881
BT
B-0
0640
427
1637
0499
32IM
F0.
0016
4−
0.16
0.06
2.41
0.3
0.00
807
nng
AR
S-B
FG
L-N
GS-
1806
816
4969
6753
IMF
0.00
036
0.13
0.06
1.98
0.4
0.02
399
LO
C51
0718
AR
S-B
FG
L-N
GS-
3443
018
6291
7071
IMF
0.00
083
−0.
120.
062.
000.
40.
0228
9LA
IR1
BTA
-542
07-n
o-rs
2236
4588
36IM
F0.
0006
90.
120.
061.
960.
40.
0251
4LO
C78
3484
INR
A-4
4323
6228
806
IMF
0.00
287
0.15
0.06
2.36
0.4
0.00
923
nng
AR
S-B
FG
L-N
GS-
3744
129
4497
7945
LLP
F0.
0000
00.
120.
062.
160.
40.
0155
12
kb d
ista
l to
GP
HA
2C
AP
N1_
129
4522
1191
LLP
F0.
0000
10.
170.
053.
210.
90.
0006
8C
AP
N1
AR
S-B
FG
L-N
GS-
1109
3629
4578
1509
LLP
F0.
0000
00.
100.
051.
920.
20.
0275
82
kb d
ista
l to
OV
OL1
BT
B-0
1033
227
2946
5935
75LLP
F0.
0000
10.
090.
052.
010.
40.
0223
53
kb d
ista
l to
RB
M4
AR
S-B
FG
L-N
GS-
4034
229
4745
7316
LLP
F0.
0001
10.
080.
051.
710.
10.
0437
9LO
C50
8879
Hap
map
4105
0-B
TA
-661
3529
4806
5510
LLP
F0.
0002
3−
0.09
0.05
1.79
0.3
0.03
688
8 kb
pro
xim
al t
o M
RP
L21
1 P-v
alue
for
the
SN
P in
the
disc
over
y sa
mpl
e. N
BSA
= n
ot in
Bov
ine
SNP
50 a
rray
.2 R
egre
ssio
n co
effic
ient
due
to
the
regr
essi
on o
f th
e tr
ait
on n
umbe
r of
cop
ies
of a
n al
lele
.3 P
-val
ue for
the
SN
P in
the
conf
irm
atio
n sa
mpl
e ba
sed
on a
1-t
aile
d te
st o
f th
e sa
me
favo
rabl
e al
lele
for
the
tra
it.
4 Gen
e lo
cation
on
the
Bta
u4.0
ass
embl
y, g
ene
sym
bol as
in
Gen
Ban
k, S
NP
des
igna
ted
as in
a ge
ne if it o
ccur
s in
the
exo
ns, in
tron
s, o
r ot
her
untr
ansl
ated
reg
ions
of a
gene
. nn
g =
not
nea
r ge
ne.
5 CH
ILLP
8 in
the
gen
ome-
wid
e as
soci
atio
n st
udy
(GW
AS)
but
IM
F in
the
conf
irm
atio
n sa
mpl
e.6 I
MF in
the
GW
AS
but
CH
ILLP
8 in
the
con
firm
atio
n sa
mpl
e.
2302 Bolormaa et al.
at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from
Figure 1. The location of SNP associated with intramuscular fat percentage shown as a Manhattan plot. Odd-numbered chromosomes are shown in black; even-numbered chromosomes are shown in gray. Values above –logP > 3 are equivalent to P < 0.001.
Figure 2. The quantile-quantile plots of the observed distribution of the t-values for the genome-wide association study of meat tenderness measured as peak force to shear the LM in the discovery sample compared with the standard normal distribution. The plot represents at least 50,000 data points. Points at the extreme of the observed distribution show values that are greater than expected.
2303Genome-wide association study of meat and carcass traits
at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from
< 0.001) associated with CHILLP8 in the GWAS. The ARS-BFGL-NGS-104268 SNP on BTA14, which was significantly (P < 0.001) associated with IMF in the discovery sample, was significantly (P < 0.01) associ-ated with CHILLP8 in the confirmation sample. These 6 SNP were located to 2 chromosomal regions, both on BTA14 separated by 15 Mb. Only 1 of the SNP was located to BTA14:38 Mb, the other 5 SNP were located to a narrow region between BTA14:22–23 Mb that included only the XKR4 gene. The association be-tween these SNP and CHILLP8 had smaller P-values than P-values of any other SNP-trait combination in the GWAS, and had consisted of the cluster of 4 SNP all located to a region of less than 1 Mb in length on BTA14 as reported above. One of these SNP, BTB-01530836 located at BTA14:22768980, explained R2 = 2.3% of the residual variance in the GWAS. However, in the confirmation sample it explained R2 = 0.4%.
Initially, MBV were calculated for IMF because there were many more SNP associated with IMF than the other traits, and of these 3 traits it is the one of most economic interest. In the bivariate analysis that com-pared phenotypes with the MBV calculated from a panel of markers (Table 3), a panel of the 4 SNP with the largest effects in the confirmation sample explained r2
P = 2.2% and r2G = 6.1% of the variance in the con-
firmation sample. These 4 SNP were all of the SNP that each explained >1% of the residual variance for IMF. In comparison, a panel of the 4 most significant SNP from the discovery sample explained r2
P = 1.5% and r2
G = 3.8% of the variance in the confirmation sample. The performance of this latter panel was not materially different than panels of all of the 75 signifi-cant (P < 0.001) SNP from the discovery sample or all of the 190 significant (P < 0.05) SNP from the dis-covery sample identified by forward regression, which explained r2
P = 1.2 or 1.3% and r2G = 3.6 or 4.0% of
the variance, respectively. The panel that explained the most variance (Supplemental Table 5; http://jas.fass.org/content/vol89/issue8/) was obtained using forward regression in the confirmation sample and consisted of 14 SNP that explained r2
P = 5.6% and r2G = 15.6% of
the variance. When estimated as single markers, these SNP explained r2
P = 16.0% and r2G = 34.4% in the
GWAS sample and r2P = 12.4% and r2
G = 31.3% in the confirmation sample, respectively (Supplemental Table 5; http://jas.fass.org/content/vol89/issue8/), a deflation of more than 50% when analyzed as a panel. These 14 SNP were located on 12 chromosomes, and of the 2 pairs of SNP on the same chromosome none were closer together than 29 Mb. None of these 14 SNP showed significant linkage disequilibrium to another of the SNP in the panel. Expanding the panel to hundreds of SNP, irrespective of their level of significance to IMF in either the discovery or confirmation sample, result-ed in a reduction in the amount of variance explained by the panel. Several methods of analysis agreed that the amount of variance was low when several hundred SNP were used. Some of the correlations between MBV
and phenotypes were almost identical, such as for the estimates for 280 SNP using ASReml compared with the same 280 SNP using BayesB analyses, suggesting that the calculations are robust because the methods and their underlying assumptions are so different. Us-ing the BayesB method to estimate MBV for a panel of 329 SNP explained r2
P = 1.2% and r2G = 3.8% of
the variance for IMF, which is not materially different than that explained by the most significant SNP in the discovery sample. Note that in all these calculations the SNP effects that were used to derive the predic-tions were always obtained from the discovery sample irrespective of whether the SNP were chosen on their level of significance in the discovery or the confirmation sample.
The MBV for LLPF and CHILLP8 were then calcu-lated for differing numbers of SNP to determine wheth-er increasing the number of SNP also decreased the amount of variance explained for these traits (Table 4). Although a small panel of all the confirmed SNP explained r2
G = 3.0 and 13.5% for CHILLP8 and LLPF, respectively (SNP contained in Table 2), when all 329 SNP were used the amount of genetic variance that was explained was r2
G = 1.0 and 0.8% for CHILLP8 and LLPF, respectively, and panels of all the SNP signifi-cant (P < 0.001) in the discovery sample also did not explain large amounts of the genetic variance.
Although there are many confirmed SNP in anony-mous loci, those with names starting with LOC (un-known locus in the Entrez database, GenBank), KIAA (Kazusa unknown cDNA index), and MGC (Mamma-lian Gene Collection), these are not significantly dif-ferent than the distribution of named to anonymous genes in the bovine genome as a whole. There are 23,452 unique gene names annotated in the bovine genome to date (Gibbs et al., 2009), of which 9,270 had anonymous LOC names, 26 had anonymous KIAA names, and 4 had anonymous MGC names. Confirmed SNP (Table 2) were located within introns and cod-ing sequences of 9 named genes with assigned function, whereas confirmed SNP were located in 8 anonymous genes. These numbers increase to 12 named genes and 9 anonymous genes if SNP are counted that are in the intergenic sequences but are within 5 kb of the coding sequence. Both these distributions were not significant-ly (P > 0.05) different than the distribution of named to anonymous genes in the bovine genome as a whole. Apart from those SNP in CAPN1 and CAST, none of these SNP was in a gene that had previously been ex-plored for these traits, nor was there such a gene within 1 Mb of the most significant SNP in a region.
Furthermore, 12 of the confirmed SNP were more than 50 kb from a gene in either direction. The average distance between SNP in the panel is 51 kb. Each of these 12 SNP was evaluated to determine whether there were closer SNP to a gene. Of these 12, one of these was near another SNP that was in a gene and was signifi-cantly (P < 0.05) associated with the same trait of IMF (BTB-00488794). However, BTB-00488794 was signifi-
2304 Bolormaa et al.
at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from
cantly (P < 0.0001) associated with the trait, whereas the SNP nearer the gene, ARS-BFGL-NGS-87787 in the gene LOC530677, was not as significant (P < 0.01) and explained less than one-half as much of the residu-al variance. One other SNP, ARS-BFGL-NGS-104268, was significantly (P < 0.05) associated with CHILLP8
and was near several SNP in the gene XKR4 that were significantly (P < 0.01) associated with the same trait, but both the proximal and distal neighbors between it and the nearby genes were nonsignificant. That par-ticular SNP had been associated with IMF in the dis-covery sample. All the other 10 of these 12 confirmed
Table 3. Comparison of molecular breeding value (MBV) prediction equations for in-tramuscular fat percentage (IMF) using panels of different numbers of SNP
SNP set rP1 SE(rP)
2 rG3 SE(rG)4 r2
P5 r2
G6
MBV-top47 0.150 0.031 0.246 0.062 0.022 0.061MBV-top4 GWAS8 0.122 0.031 0.196 0.057 0.015 0.038MBV-FR9 0.236 0.032 0.395 0.074 0.056 0.156MBV-MS10 0.217 0.032 0.352 0.068 0.047 0.124MBV-75 GWAS11 0.112 0.031 0.190 0.061 0.012 0.036MBV-190 GWAS12 0.116 0.031 0.200 0.063 0.013 0.040MBV-28013 ASReml 0.129 0.031 0.230 0.067 0.017 0.053MBV-280 BayesA 0.122 0.031 0.216 0.066 0.015 0.046MBV-280 BayesB 0.129 0.031 0.232 0.067 0.017 0.054MBV-32914 BayesA 0.104 0.031 0.182 0.062 0.011 0.033MBV-329 BayesB 0.110 0.031 0.196 0.064 0.012 0.038
1Phenotypic correlation between the MBV and the trait.2SE of the phenotypic correlation.3Genetic correlation between the MBV and the trait.4SE of the genetic correlation.5Phenotypic variance explained by the SNP.6Genotypic variance explained by the SNP.7MBV based on the 4 SNP with the strongest associations with IMF in the confirmation sample.8MBV based on the 4 SNP with the strongest associations with IMF in the discovery sample. GWAS =
genome-wide association study.9MBV based on 14 SNP identified by forward regression in the confirmation sample.10MBV based on the 14 most significant (P < 0.05) SNP in the confirmation sample.11MBV based on the 75 significant (P < 0.001) SNP with associations with IMF in the discovery sample.12MBV based on the 190 significant (P < 0.05) SNP with associations with IMF in the discovery sample
identified by forward regression.13280 SNP in the panel.14329 SNP in the panel.
Table 4. Comparison of molecular breeding value (MBV) prediction equations for rump fat measured at the sacro-iliac crest in the chiller (CHILLP8) and meat tender-ness measured as peak force to shear the LM (LLPF) using panels of different numbers of SNP
Trait SNP set rP1 SE(rP)
2 rG3 SE(rG)4 r2
P5 r2
G6
CHILLP87 6 SNP 0.109 0.031 0.173 0.054 0.012 0.030CHILLP88 31 SNP 0.040 0.031 0.066 0.052 0.002 0.004CHILLP8 329 SNP 0.060 0.031 0.100 0.055 0.004 0.010LLPF9 7 SNP 0.132 0.031 0.319 0.103 0.017 0.102LLPF7 8 SNP 0.193 0.036 0.367 0.098 0.037 0.135LLPF10 32 SNP −0.017 0.031 −0.050 0.092 0.000 0.003LLPF 329 SNP 0.036 0.032 0.092 0.084 0.001 0.008
1Phenotypic correlation between the SNP and the trait.2SE of the phenotypic correlation.3Genetic correlation between the SNP and the trait.4SE of the genetic correlation.5Phenotypic variance explained by the SNP.6Genotypic variance explained by the SNP.7MBV based on the SNP with strongest associations with the trait in the confirmation sample.8MBV based on the 31 significant (P < 0.001) SNP with associations with CHILLP8 in the discovery sample.9Excluding calpastatin (CAST)-2832.10MBV based on the 32 significant (P < 0.001) SNP with associations with the trait in the discovery sample.
2305Genome-wide association study of meat and carcass traits
at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from
SNP were in gene deserts where none of the SNP in the distant gene was significantly associated with the trait in the GWAS sample.
DISCUSSION
The main objectives of this study were to identify more SNP associated with meat and carcass quality using the GWAS methodology, to test as many of these SNP as possible to determine how many could be con-firmed, and to evaluate the performance of panels of the SNP for prediction of a phenotype. Although many SNP were identified in the GWAS, and although more of these SNP were confirmed in a second sample than would be expected by chance, only 13% of the posi-tives evaluated from the GWAS were significantly (P < 0.05) associated with the same trait and had the same favorable allele in a confirmation sample. This confir-mation sample was genetically similar, so the reduced confirmation cannot be ascribed to genetic difference. Instead, it may represent the increased FPR found in GWAS studies. Three new genetic regions of interest for IMF were discovered, out of the 20 regions from which markers had been found significantly (P < 0.05) asso-ciated with IMF in the confirmation sample. These 3 regions each showed an association that explained R2 > 1% of the residual variance. This threshold was chosen because it roughly corresponds to the amount of the residual variance explained by the CAPN1 and CAST SNP in these samples, which have proved to be useful DNA markers for meat tenderness. These are also of a similar size to moderately sized QTL affecting other traits, such as those due to the GH receptor (GHR), the prolactin receptor (PRLR), and the ATP binding cassette subfamily G, member 2 (ABCG2) that affect milk yield and milk component yield (Blott et al., 2003; Cohen-Zinder et al., 2005; Viitala et al., 2006). These have been shown to explain 1% < R2 < 2.5% of the residual variance for milk component yield (Turner et al., 2010) in a similar-sized sample to that used in this study. Although the coverage of the bovine genome was not exhaustive in this study (i.e., the average distance between SNP was 51 kb), it is unlikely that there are genes of very large effect and of moderate to high mi-nor allele frequency affecting IMF, LLPF, or CHILLP8 segregating in these samples, because this study has been sufficiently powerful to identify and confirm genes of relatively small effect. In particular, there was no evidence of genes or genetic regions explaining as much variation for these traits as the DGAT1 or MSTN mu-tations explain for milk fat percentage or meat yield, respectively. For the traits of LLPF and CHILLP8, fewer markers were tested, and no new regions with confirmed QTL explaining R2 > 1% were found.
Some of the associations found in this study are con-sistent with the locations of QTL identified by linkage mapping in cattle. Only those examples are reported where the 2 studies roughly agree, bearing in mind that the confidence intervals on QTL locations are large
(Visscher et al., 1996). The confirmed association of a SNP at BTA7:50 Mb with LLPF is consistent with the location of a QTL affecting LLPF previously mapped to the central region of BTA7 and that did not include the CAST gene (Drinkwater et al., 2006). The associa-tion of a series of SNP with CHILLP8 at 2 locations of BTA14 is consistent with QTL mapping studies of fat depth mapped to BTA14 (Casas et al., 2000, 2003). In the central region of BTA2 there were 3 confirmed SNP for IMF in this study, each separated by more than 5 Mb from the next one, 2 of which have R2 > 1%. One study has reported a well supported, broad centrally located QTL for marbling score in cattle on BTA2 (Ca-sas et al., 2004). At BTA14:61.7 there was a confirmed association with IMF in this study, and several linkage mapping studies located a QTL to the central region of BTA14 (Casas et al., 2003; Mizoshita et al., 2004; Choi et al., 2006), in addition to the many reports of studies of possible candidate genes affecting marbling score or IMF located to this part of BTA14 (Barendse, 2008). Finally, there were 2 SNP with confirmed associations with IMF on the central region of BTA16 in this study, which may correspond to a marbling QTL located in the central region of BTA16 (Casas et al., 2004).
The lack of success that candidate gene approaches have when trying to identify QTL is highlighted by these results, which also suggest that the density of SNP coverage needs to be greater in those studies. The Bovine SNP50 array was designed to have evenly spaced markers across the bovine genome (Matukumalli et al., 2009) and there would have been SNP close to the can-didate and positional candidate genes for fatness traits. The decay of linkage disequilibrium (LD) with distance in cattle breeds that has been reported previously (Bar-endse et al., 2007b; McKay et al., 2007; Gibbs et al., 2009) should mean that these SNP are at a density high enough to detect LD between noncausative SNP in a candidate gene and the trait of interest, although the density is not great enough for comprehensive coverage of a gene. Yet, the markers that were either identified in the discovery sample or replicated in the confirmation sample were not generally in candidate genes for these fatness traits. The main exceptions to this observation on candidate genes are the CAPN1 and CAST genes affecting meat tenderness, the effects of which were dis-covered some time ago. The CAST-2832 was not in-cluded in the SNP chip, yet there was still a significant (P < 0.001) SNP 1 Mb distance from CAST, and if CAPN1_1 had not been included in the SNP chip the CAPN1 gene would still have come to attention be-cause there was a significant (P < 0.001) SNP close to it. Although these results would have raised interest in these 2 candidate genes, it should nonetheless be noted that all 4 of the SNP in and around CAST in the Bo-vine SNP50 array showed no association with LLPF. In the absence of other information this lack of association in the gene would have given a false signal for future research, particularly when one considers the high de-gree of replication of the association of the CAST-2832
2306 Bolormaa et al.
at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from
SNP with meat tenderness, the degree of LD between SNP demonstrated in other studies, and the associa-tion of several SNP in CAST to LLPF (Barendse, 2002; Schenkel et al., 2006; Barendse et al., 2007a). None of the candidate genes traditionally associated with fat-ness traits showed strong evidence of association when considering SNP within a 1-Mb window in the discov-ery population. None of the confirmed SNP was close to a well-studied, candidate, or positional candidate gene for fatness, nor were any of these genes identified as top candidates in gene expression studies (Wang et al., 2009). In our results, genes with substantial evidence for fatness traits were not candidate gene for fatness, whereas none of the previously studied candidate genes showed strong evidence for association. Our results cannot in general be used to rule out these candidate genes because the SNP density was not completely comprehensive and sufficient LD between the SNP in the Bovine 50K Array and potential causative alleles in the gene may not exist in our samples. The region near R3HDM1, which does have significant evidence for an association with IMF in both the discovery and confir-mation samples, was identified previously not because it was a candidate gene but because it contained evi-dence of population genetic selection of gene frequen-cies (Barendse et al., 2009).
The calculated allele substitution effect of a SNP will vary from one data set to another, whereas the associa-tions may be significant or not as the case may be and where the amount of variance explained by the SNP may differ substantially, particularly when the breeds or types of cattle used in one study differ from the breeds in another. In that sort of confirmatory analy-sis, the allele effects are unconstrained by data that have been collected previously. But in prediction and breeding, it is not enough to say that a marker was sta-tistically significant and that it explains a certain pro-portion of the variance; one must calculate a breeding value from current data that can be used to compare with the actual phenotype in another sample. In esti-mating the molecular breeding value from the regres-sion coefficients (allele substitution effects) of the panel of 14 SNP estimated simultaneously from the discovery sample data, this analysis explained r2
P = 5.6% and r2G
= 15.6% of the variance. This is a substantial propor-tion of the genetic variance, and it is encompassed by a relatively small number of SNP. Although this appears to be a useful amount of the variability, one needs to keep in mind that such a process will still show an up-ward bias in the amount of variability explained. The most unbiased method would be to genotype a third, large independent sample of cattle for these 14 SNP and then calculate the proportion of the variance that is explained by the 14 SNP. Because of a lack of such a sample, the breeding value in the confirmation sample was estimated using markers significantly associated with IMF in the confirmation sample but using allele substitution effect coefficients obtained from the dis-covery sample and then that breeding value was com-
pared with observed IMF in the confirmation sample. Further genotyping of additional cattle samples, by us or by other researchers, will provide more unbiased es-timates of the amount of variation explained by this panel of 14 SNP.
Although a forward regression method provided a panel of SNP that explained r2
G = 15.6%, adding more SNP that were not associated with the trait substan-tially degraded the amount of variance explained. Add-ing hundreds of additional SNP to the panel, many of which were significant at either the α = 0.005 or α = 0.001 significance threshold in the GWAS discovery sample but which had not been found significantly (P < 0.05) associated with IMF in the confirmation sam-ple, downgraded the prediction substantially, so that r2
G = 3.8% was explained using 329 SNP. Essentially the same amount of variance was explained whether the 4 most significant SNP (P < 0.001) or the 190 sig-nificant (P < 0.05) SNP for IMF identified by forward regression in the discovery sample were used, highlight-ing the importance of using confirmed SNP. The MBV for CHILLP8 and LLPF showed the same effect of de-creased variance explained when nonsignificant SNP were included in the MBV. The decreased similarity of the MBV and the phenotype when hundreds of SNP are used in the prediction panel is unlikely to be due to genetic differences between the samples. Although some of the evidence of association of a SNP in the discovery sample may have come from breeds that were not included in the confirmation study which may have resulted in a failure of confirmation of the SNP, the inclusion of SNP that have no effect on the trait should not have degraded the performance of the MBV. There are several possible explanations for this reduction in the amount of explained variance. First, putting SNP into a panel that are not associated with a trait could increase the error of prediction. Second, the analyti-cal methods for calculating MBV were not efficient at removing the error associated with SNP that were not associated with the trait. Third, although the sample size used in this study was large enough to characterize marker panels of a small number of elements, it was not large enough to characterize panels with hundreds of elements when estimates are obtained simultaneously. Larger experiments, based on many more thousands of animals and investigating a much larger pool of SNP may be able to distinguish between these explanations because larger samples will give more accurate esti-mates of allele substitution effects, particularly when the estimates are calculated simultaneously. However, as the SNP used in this study were the most significant from the GWAS, the conclusions from such a larger study would only differ in degree rather than in kind unless the larger study found a substantially different list of highly significant SNP compared with this study. These results have implications for genomic selection because all available loci are used in those methods, whether statistically significantly associated with the trait or not, and the same process is used to handle
2307Genome-wide association study of meat and carcass traits
at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from
nonsignificant SNP as was used in this study. These re-sults suggest that it may be necessary to use only SNP that are confirmed to be associated with the trait of interest and that a SNP list should be available as sup-porting information for a set of prediction equations.
LITERATURE CITED
Barendse, W., inventor; Commonwealth Scientific and Industrial Research Organization, Meat and Livestock Australia Limited, New South Wales Department of Agriculture, Queensland De-partment of Primary Industries, and University of New Eng-land, assignees. 2002. DNA markers for meat tenderness. Pat-ent Application WO02064820. p 1–88.
Barendse, W., inventor; Commonwealth Scientific and Industrial Research Organization, Meat and Livestock Australia Limited, New South Wales Department of Agriculture, Queensland De-partment of Primary Industries, and University of New Eng-land, assignees. 2007. Assessing the tenderness of meat from an animal by testing the animal for a genetic marker in the calpain3 (CAPN3) gene associated with Warner-Bratzler peak force variation or for a genetic marker located other than in CAPN3. Patent Application WO2007053891. p 1–62.
Barendse, W. 2008. Genetic based diagnostic tools for predicting meat quality. Pages 292–317 in Improving the sensory and nu-tritional quality of fresh meat. J. P. Kerry and D. Ledward, ed. Woodhead Publishing Limited and CRC Press LLC, Cam-bridge, UK.
Barendse, W., B. E. Harrison, R. J. Bunch, and M. B. Thomas. 2008. Variation at the calpain 3 gene is associated with meat tenderness in zebu and composite breeds of cattle. BMC Gen-et. 9:41.
Barendse, W., B. E. Harrison, R. J. Bunch, M. B. Thomas, and L. B. Turner. 2009. Genome wide signatures of positive selection: The comparison of independent samples and identification of regions associated to traits. BMC Genomics 10:178.
Barendse, W., B. E. Harrison, R. J. Hawken, D. M. Ferguson, J. M. Thompson, M. B. Thomas, and R. J. Bunch. 2007a. Epistasis between calpain 1 and its inhibitor calpastatin within breeds of cattle. Genetics 176:2601–2610.
Barendse, W., A. Reverter, R. J. Bunch, B. E. Harrison, W. Barris, and M. B. Thomas. 2007b. A validated whole genome associa-tion study of efficient food conversion. Genetics 176:1893–1905.
Blott, S., J. J. Kim, S. Moisio, A. Schmidt-Kuntzel, A. Cornet, P. Berzi, N. Cambisano, C. Ford, B. Grisart, D. Johnson, L. Karim, P. Simon, R. Snell, R. Spelman, J. Wong, J. Vilkki, M. Georges, F. Farnir, and W. Coppieters. 2003. Molecular dis-section of a quantitative trait locus: A phenylalanine-to-tyro-sine substitution in the transmembrane domain of the bovine growth hormone receptor is associated with a major effect on milk yield and composition. Genetics 163:253–266.
Cafe, L. M., B. L. McIntyre, D. L. Robinson, G. H. Geesink, W. Bar-endse, D. W. Pethick, J. M. Thompson, and P. L. Greenwood. 2010. Production and processing studies on calpain-system gene markers for tenderness in Brahman cattle: 2. Objective meat quality. J. Anim. Sci. 88:3059–3069.
Casas, E., J. W. Keele, S. D. Shackelford, M. Koohmaraie, and R. T. Stone. 2004. Identification of quantitative trait loci for growth and carcass composition in cattle. Anim. Genet. 35:2–6.
Casas, E., S. D. Shackelford, J. W. Keele, M. Koohmaraie, T. P. L. Smith, and R. T. Stone. 2003. Detection of quantitative trait loci for growth and carcass composition in cattle. J. Anim. Sci. 81:2976–2983.
Casas, E., S. D. Shackelford, J. W. Keele, R. T. Stone, S. M. Kappes, and M. Koohmaraie. 2000. Quantitative trait loci affecting growth and carcass composition of cattle segregating alternate forms of myostatin. J. Anim. Sci. 78:560–569.
Casas, E., S. N. White, T. L. Wheeler, S. D. Shackelford, M. Koohm-araie, D. G. Riley, C. C. Chase, D. D. Johnson, and T. P. L. Smith. 2006. Effects of calpastatin and mu-calpain markers in beef cattle on tenderness traits. J. Anim. Sci. 84:520–525.
Choi, I. S., H. S. Kong, J. D. Oh, D. H. Yoon, B. W. Cho, Y. H. Choi, K. S. Kim, K. D. Choi, H. K. Lee, and G. J. Jeon. 2006. Analysis of microsatellite markers on bovine chromosomes 1 and 14 for potential allelic association with carcass traits in Hanwoo (Korean cattle). Asian-australas. J. Anim. Sci. 19:927–930.
Cohen-Zinder, M., E. Seroussi, D. M. Larkin, J. J. Loor, A. Everts-van der Wind, J. H. Lee, J. K. Drackley, M. R. Band, A. G. Hernandez, M. Shani, H. A. Lewin, J. I. Weller, and M. Ron. 2005. Identification of a missense mutation in the bovine ABCG2 gene with a major effect on the QTL on chromosome 6 affecting milk yield and composition in Holstein cattle. Ge-nome Res. 15:936–944.
Drinkwater, R. D., Y. Li, I. Lenane, G. P. Davis, R. Shorthose, B. E. Harrison, K. Richardson, D. Ferguson, R. Stevenson, J. Ren-aud, I. Loxton, R. J. Hawken, M. B. Thomas, S. Newman, D. J. S. Hetzel, and W. Barendse. 2006. Detecting quantitative trait loci affecting beef tenderness on bovine chromosome 7 near cal-pastatin and lysyl oxidase. Aust. J. Exp. Agric. 46:159–164.
Elsik, C. G., R. L. Tellam, K. C. Worley, and Bovine Genome Se-quencing Analysis Consortium. 2009. The genome sequence of taurine cattle: A window to ruminant biology and evolution. Science 324:522–528.
Gibbs, R. A., J. F. Taylor, C. P. Van Tassell, W. Barendse, K. A. Eversole, C. A. Gill, R. D. Green, D. L. Hamernik, S. M. Kappes, S. Lien, L. K. Matukumalli, J. C. McEwan, L. V. Naz-areth, R. D. Schnabel, G. M. Weinstock, D. A. Wheeler, P. Ajmone-Marsan, P. J. Boettcher, A. R. Caetano, J. F. Garcia, O. Hanotte, P. Mariani, L. C. Skow, J. L. Williams, B. Di-allo, L. Hailemariam, M. L. Martinez, C. A. Morris, L. O. C. Silva, R. J. Spelman, W. Mulatu, K. Y. Zhao, C. A. Abbey, M. Agaba, F. R. Araujo, R. J. Bunch, J. Burton, C. Gorni, H. Ol-ivier, B. E. Harrison, B. Luff, M. A. Machado, J. Mwakaya, G. Plastow, W. Sim, T. Smith, T. S. Sonstegard, M. B. Thomas, A. Valentini, P. Williams, J. Womack, J. A. Wooliams, Y. Liu, X. Qin, K. C. Worley, C. Gao, H. Y. Jiang, S. S. Moore, Y. R. Ren, X. Z. Song, C. D. Bustamante, R. D. Hernandez, D. M. Muzny, S. Patil, A. S. Lucas, Q. Fu, M. P. Kent, R. Vega, A. Matukumalli, S. McWilliam, G. Sclep, K. Bryc, J. Choi, H. Gao, J. J. Grefenstette, B. Murdoch, A. Stella, R. Villa-Angulo, M. Wright, J. Aerts, O. Jann, R. Negrini, M. E. Goddard, B. J. Hayes, D. G. Bradley, M. B. da Silva, L. P. L. Lau, G. E. Liu, D. J. Lynn, F. Panzitta, and K. G. Dodds. 2009. Genome-wide survey of SNP variation uncovers the genetic structure of cattle breeds. Science 324:528–532.
Gilmour, A. R., B. J. Gogel, B. R. Cullis, S. J. Welham, and R. Thompson. 2002. ASReml User Guide Release 1.0. VSN Inter-national Ltd., Hemel Hempstead, UK.
Goddard, M. E., and B. J. Hayes. 2007. Genomic selection. J. Anim. Breed. Genet. 124:323–330.
Ihaka, R., and R. Gentleman. 1996. R: A language for data analysis and graphics. J. Comput. Graph. Statist. 5:299–314.
Matukumalli, L. K., C. T. Lawley, R. D. Schnabel, J. F. Taylor, M. F. Allan, M. P. Heaton, J. O’Connell, S. S. Moore, T. P. L. Smith, T. S. Sonstegard, and C. P. Van Tassell. 2009. Develop-ment and characterization of a high density SNP genotyping assay for cattle. PLoS ONE 4:e5350.
McKay, S. D., R. D. Schnabel, B. M. Murdoch, L. K. Matukumalli, J. Aerts, W. Coppieters, D. Crews, E. Dias, C. A. Gill, C. Gao, H. Mannen, P. Stothard, Z. Q. Wang, C. P. Van Tassell, J. L. Williams, J. F. Taylor, and S. S. Moore. 2007. Whole genome linkage disequilibrium maps in cattle. BMC Genet. 8:74.
Meuwissen, T. H. E., B. J. Hayes, and M. E. Goddard. 2001. Pre-diction of total genetic value using genome-wide dense marker maps. Genetics 157:1819–1829.
Mizoshita, K., T. Watanabe, H. Hayashi, C. Kubota, H. Yamaku-chi, J. Todoroki, and Y. Sugimoto. 2004. Quantitative trait
2308 Bolormaa et al.
at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from
loci analysis for growth and carcass traits in a half-sib fam-ily of purebred Japanese Black (Wagyu) cattle. J. Anim. Sci. 82:3415–3420.
Page, B. T., E. Casas, M. P. Heaton, N. G. Cullen, D. L. Hyndman, C. A. Morris, A. M. Crawford, T. L. Wheeler, M. Koohmaraie, J. W. Keele, and T. P. L. Smith. 2002. Evaluation of single-nucleotide polymorphisms in CAPN1 for association with meat tenderness in cattle. J. Anim. Sci. 80:3077–3085.
Perry, D., W. R. Shorthose, D. M. Ferguson, and J. M. Thompson. 2001. Methods used in the CRC program for the determina-tion of carcass yield and beef quality. Aust. J. Exp. Agric. 41:953–957.
Reverter, A., D. J. Johnston, D. M. Ferguson, D. Perry, M. E. God-dard, H. M. Burrow, V. H. Oddy, J. M. Thompson, and B. M. Bindon. 2003a. Genetic and phenotypic characterisation of ani-mal, carcass, and meat quality traits from temperate and tropi-cally adapted beef breeds. 4. Correlations among animal, car-cass, and meat quality traits. Aust. J. Agric. Res. 54:149–158.
Reverter, A., D. J. Johnston, D. Perry, M. E. Goddard, and H. M. Burrow. 2003b. Genetic and phenotypic characterisation of animal, carcass, and meat quality traits from temperate and tropically adapted beef breeds. 2. Abattoir carcass traits. Aust. J. Agric. Res. 54:119–134.
Robinson, D. L., and V. H. Oddy. 2004. Genetic parameters for feed efficiency, fatness, muscle area and feeding behaviour of feedlot finished beef cattle. Livest. Prod. Sci. 90:255–270.
Schenkel, F. S., J. R. Miller, Z. Jiang, I. B. Mandell, X. Ye, H. Li, and J. W. Wilton. 2006. Association of a single nucleotide poly-morphism in the calpastatin gene with carcass and meat quality traits of beef cattle. J. Anim. Sci. 84:291–299.
Taniguchi, M., T. Utsugi, K. Oyama, H. Mannen, M. Kobayashi, Y. Tanabe, A. Ogino, and S. Tsuji. 2004. Genotype of stearoyl-
CoA desaturase is associated with fatty acid composition in Japanese Black cattle. Mamm. Genome 15:142–148.
Turner, L. B., B. E. Harrison, R. J. Bunch, L. R. Porto Neto, Y. T. Li, and W. Barendse. 2010. A genome wide association study of tick burden and milk composition in cattle. Anim. Prod. Sci. 50:235–245.
Upton, W., H. M. Burrow, A. Dundon, D. L. Robinson, and E. B. Farrell. 2001. CRC breeding program design, measurements and database: Methods that underpin CRC research results. Aust. J. Exp. Agric. 41:943–952.
Viitala, S., J. Szyda, S. Blott, N. Schulman, M. Lidauer, A. Maki-Tanila, M. Georges, and J. Vilkki. 2006. The role of the bovine growth hormone receptor and prolactin receptor genes in milk, fat and protein production in Finnish Ayrshire dairy cattle. Genetics 173:2151–2164.
Visscher, P. M., R. Thompson, and C. S. Haley. 1996. Confidence in-tervals in QTL mapping by bootstrapping. Genetics 143:1013–1020.
Wang, Y. H., N. I. Bower, A. Reverter, S. H. Tan, N. De Jager, R. Wang, S. M. McWilliam, L. M. Cafe, P. L. Greenwood, and S. A. Lehnert. 2009. Gene expression patterns during intramuscu-lar fat development in cattle. J. Anim. Sci. 87:119–130.
White, S. N., E. Casas, T. L. Wheeler, S. D. Shackelford, M. Koohm-araie, D. G. Riley, C. C. Chase Jr., D. D. Johnson, J. W. Keele, and T. P. L. Smith. 2005. A new single nucleotide polymor-phism in CAPN1 extends the current tenderness marker test to include cattle of Bos indicus, Bos taurus, and crossbred descent. J. Anim. Sci. 83:2001–2008.
Wray, N. R., M. E. Goddard, and P. M. Visscher. 2007. Prediction of individual genetic risk to disease from genome-wide association studies. Genome Res. 17:1520–1528.
2309Genome-wide association study of meat and carcass traits
at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from
Referenceshttp://jas.fass.org/content/89/8/2297#BIBLThis article cites 34 articles, 18 of which you can access for free at:
at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from