A genome-wide association study of meat and carcass traits ... · A genome-wide association study of meat and carcass traits in Australian cattle doi: 10.2527/jas.2010-3138 originally

and W. BarendseS. Bolormaa, L. R. Porto Neto, Y. D. Zhang, R. J. Bunch, B. E. Harrison, M. E. Goddard

A genome-wide association study of meat and carcass traits in Australian cattle

doi: 10.2527/jas.2010-3138 originally published online March 18, 20112011, 89:2297-2309.J ANIM SCI

http://jas.fass.org/content/89/8/2297the World Wide Web at:

The online version of this article, along with updated information and services, is located on

www.asas.org

at Serials Acquisitions Dept on July 15, 2011jas.fass.orgDownloaded from

http://jas.fass.org/

A genome-wide association study of meat and carcass traits in Australian cattle1

S. Bolormaa,*† L. R. Porto Neto,*‡ Y. D. Zhang,*§ R. J. Bunch,*‡ B. E. Harrison,*‡ M. E. Goddard,*† and W. Barendse*‡2

*Cooperative Research Centre for Beef Genetic Technologies, Armidale, New South Wales 2351, Australia; †Victorian Department of Primary Industries, Bundoora, Victoria 3083, Australia;

‡Commonwealth Scientific and Industrial Research Organisation (CSIRO) Livestock Industries, Queensland Bioscience Precinct, St. Lucia 4067, Queensland, Australia;

and §Animal Genetics and Breeding Unit (AGBU), NSW Department of Primary Industries and University of New England, Armidale, New South Wales 2351, Australia

ABSTRACT: Chromosomal regions containing DNA variation affecting the traits intramuscular fat percent-age (IMF), meat tenderness measured as peak force to shear the LM (LLPF), and rump fat measured at the sacro-iliac crest in the chiller (CHILLP8) were identi-fied using a set of 53,798 SNP genotyped on 940 taurine and indicine cattle sampled from a large progeny test experiment. Of these SNP, 87, 64, and 63 were sig-nificantly (P < 0.001) associated with the traits IMF, LLPF, and CHILLP8, respectively. A second, nonover-lapping sample of 1,338 taurine and indicine cattle from the same large progeny test experiment genotyped for 335 SNP, including as a positive control the calpastatin (CAST) c.2832A > G SNP, was used to confirm these locations. In total, 37 SNP were significantly (P < 0.05) associated with the same trait and with the same fa-vorable homozygote in both data sets, representing 27

chromosomal regions. For the trait IMF, the effect of SNP in the confirmation data set was predicted from the discovery set by multiplying the estimated allele ef-fect of each SNP in the discovery set by the number of copies of the reference allele of each SNP in the confir-mation set. These weighted effects were then summed over all SNP to generate a molecular breeding value (MBV) for each animal in the confirmation data set. Using a bivariate analysis of MBV and IMF phenotypes of animals in the confirmation set, a panel of 14 SNP explained 5.6 and 15.6% of the phenotypic and genetic variance of IMF, respectively, in the confirmation data set. The amount of variation did not increase as more SNP were added to the MBV and instead decreased to 1.2 and 3.8% of the phenotypic and genetic variance of IMF, respectively, when 329 SNP were included in the analysis.

Key words: cattle, DNA, fat, genome-wide association study, intramuscular, quantitative trait loci, selection, subcutaneous

©2011 American Society of Animal Science. All rights reserved. J. Anim. Sci. 2011. 89:2297–2309 doi:10.2527/jas.2010-3138

INTRODUCTION

Genetic selection based on pedigree and phenotypes is well established for traits that are easy to measure. In the last decade, DNA markers for harder to measure traits have begun to be implemented, but most of these tests were not integrated into genetic evaluation sys-tems based on pedigree and phenotypes. This is partly due to the small number of DNA markers reliably as-

sociated with traits. In cattle, for meat and carcass quality there are 2 genes that account for moderate ge-netic effects on meat tenderness, calpain 1 (CAPN1), and calpastatin (CAST; Barendse, 2002; Page et al., 2002), which have been confirmed many times (White et al., 2005; Casas et al., 2006; Schenkel et al., 2006; Barendse et al., 2007a; Cafe et al., 2010). Methods to incorporate smaller effects than these through genomic selection (Meuwissen et al., 2001) have been developed, and these methods should allow an integration of DNA marker and quantitative genetic approaches to animal improvement (Goddard and Hayes, 2007).

The main objectives of this study were to identify more SNP associated with meat and carcass quality us-ing the genome-wide association study (GWAS) meth-odology to confirm the significant SNP and to evaluate

1 We thank A. Reverter (CSIRO) and B. Tier (AGBU) for discus-sion of data and methods. We thank Meat and Livestock Australia P/L for financial support (W. Barendse).

2 Corresponding author: [email protected] May 6, 2010.Accepted February 15, 2011.

2297



the performance of panels of the SNP for prediction of a phenotype. Approximately 54,000 SNP spread across the bovine genome were tested for associations with 3 traits of interest in a discovery sample, and a selec-tion of SNP was evaluated in a confirmation sample. The traits of interest were intramuscular fat percent-age (IMF), which is of particular importance due to price premiums for beef with high levels of IMF, meat tenderness measured as peak force to shear the LM (LLPF), and rump fat thickness measured in the chill-er at the sacro-iliac crest of the third sacral vertebra (CHILLP8).

MATERIALS AND METHODS

Animal Care and Use Committee approval was not obtained for this study because no new animals were handled in this experiment. The experiment was per-formed on trait records and DNA samples that had been collected previously. The animals in this experi-ment were born between 1993 and 1999 as described below.

The experiment consisted of 2 parts, an initial GWAS of 53,798 SNP genotyped over 940 animals in a discovery data set sampled from a large progeny test experiment that had been measured for several carcass and meat traits. Subsequently, 335 SNP, of which 329 were from the GWAS, were genotyped in a confirma-tory data set of 1,338 animals of the same large progeny test experiment that had been measured for the same carcass and meat traits.

GWAS for Meat and Carcass Traits

The DNA of 940 beef cattle of the Genetic Corre-lations Experiment of the Cooperative Research Cen-ter for the Cattle and Beef Industry were used for the GWAS. The breeding of these animals was reported previously (Upton et al., 2001). The measurements of IMF, CHILLP8, and LLPF have been reported pre-viously (Perry et al., 2001) and genetic correlations between these traits reported (Reverter et al., 2003a). These animals form the bulk of the sample reported previously in studies of residual feed intake (Robinson and Oddy, 2004; Barendse et al., 2007b). The breed composition of the sample consisted of 220 Angus, 146 Hereford, 55 Murray Grey, 81 Shorthorn, 78 Brahman, 165 Belmont Red, 126 Santa Gertrudis, 25 Taurine-Brahman crossbred animals, and 44 Tropical Compos-ite-Brahman crossbred animals. All 4 grandparents of the animals in each of the 5 purebred or 2 composite breeds belonged to the particular breed. For the cross-bred animals, the dams were all Brahman and the sires were either of 1 of the 4 taurine breeds or 1 of the 2 composite breeds. These represent the offspring of 246 sires, representing 34 herds of origin, each breed con-sisting of several herds of origin, 2 sexes, and 50 kill groups. The average number of half-sibs per sire was 3.8 with a range of 1 to 15 offspring per sire.

The DNA samples were genotyped using the Bovine SNP50 array (Matukumalli et al., 2009) by Illumina Inc. (Hayward, CA). Genotypes were analyzed for GC10 (GenCall quality at the 10th percentile) scores, call rates, call frequency, cluster separation, and devia-tions from Hardy-Weinberg equilibrium using the Ge-nome Studio Software version 1.0. In addition, tests of repeated genotyping of the same animal were includ-ed, but the identity of the repeat was unknown to the genotyper and included individuals of known pedigree. These methods were used to analyze the raw data and arrive at a final data set of n = 940 animals genotyped for 53,798 SNP. The bp designations for each genotype were converted to number of copies of a reference allele so that for an A/G SNP the AA genotype was 0, the AG genotype was 1, and the GG genotype was 2. The codes for each SNP are shown in Supplemental Table 1 (http://jas.fass.org/content/vol89/issue8/).

Confirmation of Associations for Meat Quality Identified in the 50K GWAS

The DNA of an additional sample of 1,338 beef cattle samples from the same large progeny test experiment (Genetic Correlations Experiment of the Cooperative Research Center for the Cattle and Beef Industry) was chosen for confirmation of the associations. The breed composition of the animals in the confirmation sample consisted of 655 Angus, 343 Hereford, and 340 Brah-man animals, representing 3 of the 9 breeds or breed-composites that were present in the discovery sample. The animals represent the offspring of 198 sires, repre-senting 23 herds of origin, 2 sexes, and 142 kill groups. The average number of half-sibs per sire was 6.8 with a range of 1 to 31 offspring per sire. None of the animals used in the confirmation sample were used in the GWAS discovery sample. Of the 198 sires in the confirmation sample, 123 had half-sib offspring in the discovery sam-ple, 463 of the 940 (49.2%) discovery animals were half sibs of animals in the confirmation sample, and 957 of the 1,338 (71.5%) animals in the confirmation sample were half-sibs of animals in the discovery sample. The size of the half-sib group in the discovery sample was positively correlated to the size of the half-sibs group in the confirmation sample for offspring of the same sire, with r = 0.46. The main genetic difference between the samples is that the confirmation sample contained no stabilized composite animals of breeds such as the Santa Gertrudis or first generation crossbreds such as F1 Angus × Brahman animals.

The DNA samples were genotyped using a custom Illumina Golden Gate Assay (Supplemental Table 2; http://jas.fass.org/content/vol89/issue8/) using an Il-lumina iScan machine (Illumina, San Diego, CA) at CSIRO. This list consists of 384 SNP from a list of 443 SNP sent to Illumina and consists of all SNP signifi-cantly (P < 0.001) associated with IMF, CHILLP8, or LLPF in the GWAS for which assays could be generat-ed, with the remainder consisting of SNP significantly

2298 Bolormaa et al.



(P < 0.005) associated with IMF and that were part of the Bovine SNP50 array or SNP mapped to regions that had previously been associated with IMF or LLPF (Barendse, 2008). The significance thresholds of α = 0.001 and 0.005 (for IMF) were used in the GWAS to limit the number of SNP to be considered in the con-firmation round. In GWAS, due to the number of tests, the more relaxed the threshold in the discovery sample the more false positives will be discovered. The more stringent the significance threshold the more likely it is that real genetic effects of small size will be overlooked due to sampling effects. After genotyping the animals for the confirmation sample, the genotypes were ana-lyzed for GC10 scores, call rates, call frequency, clus-ter separation, and deviations from Hardy-Weinberg equilibrium using the Genome Studio Software version 1.0 to arrive at a final data set of 335 SNP for 1,338 animals (see Supplemental Table 2; http://jas.fass.org/content/vol89/issue8/). Of these 335 SNP, the panel contained 75 of the 87 SNP significantly (P < 0.001) associated with IMF, 31 of the 63 SNP significantly (P < 0.001) associated with CHILLP8, and 32 of the 64 SNP significant (P < 0.001) for LLPF. In addition, 154 SNP significant (P < 0.005) for IMF were included, and 42 of the SNP near candidate genes for IMF and LLPF were included, resulting in 292 SNP selected for their significance to 1 of the 3 traits. Of these 335 SNP, 329 were in the Illumina Bovine SNP50 Array, so the discovery and confirmation samples were genotyped for 329 SNP in common. The 6 SNP that were not in the Il-lumina Bovine SNP50 Array are in the candidate genes CAST (Barendse, 2002; Barendse et al., 2007a), stea-royl co-A desaturase 1 (SCD1; Taniguchi et al., 2004), and calpain 3 (CAPN3; Barendse et al., 2008). The specific SNP were CAST:c.2832A > G (CAST-2832), SCD:c.878C > T (SCD-878 GenBank AY241933.1), SCD-e638 (GenBank NW_003104582.1:g.7034766C > T), CAPN3:c.1882A > G, CAPN3:c.2082G > A, and CAPN3:c.2123A > T, which were respectively labeled CAPN3–2089, CAPN3–2289, and CAPN3–2330 in pre-vious research (Barendse, 2007), and the variant se-quences for the 6 SNP are contained in Supplemental Table 3 (http://jas.fass.org/content/vol89/issue8/). The CAST-2832 was inserted as a positive control, whereas SCD-878, CAPN3–2089, and CAPN3–2330 are known nonsynonymous mutations in candidate genes for either IMF composition or LLPF.

Analyses

The phenotypes and genotypes were fitted in a mixed model of the form trait ~mean + fixed effects + SNP + animal + error using the software ASReml (Gilm-our et al., 2002) where SNP was treated as a variable consisting of the number of copies of an allele, and ani-mal and error were random effects. The SNP associa-tions were evaluated by regression. For the carcass and meat quality traits, the fixed effects were breed, herd of origin, sex, and kill group of the animals (Reverter

et al., 2003b). Pedigree information for 5 generations was included as well as age in days at slaughter. Each breed consisted of several herds of origin, and herd of origin was nested within breed in case there were allele frequency differences between different herds within a breed as well as different management of herds on the properties from which the animals were obtained. Al-most all kill groups were of a single sex, so sex (s) and kill group (KG) were concatenated to form sKG. In the GWAS sample, the regression was evaluated as a 2-tailed test against the null hypothesis of a regres-sion coefficient of zero. The quantiles of the observed t-values for the GWAS experiment were compared with the quantiles of the standard normal distribution using a quantile-quantile plot (Q-Q plot) in the R-project software (Ihaka and Gentleman, 1996) downloaded from http://www.r-project.org to determine if there was an excess of extremely high observed values. In the analysis of the confirmation sample, the regression was evaluated as a 1-tailed test of the favorable homozygote of the discovery sample. The significance threshold was set at α = 0.05 because the SNP already had evidence for association from data in the experiment. If the dis-covery and confirmation data sets were combined, the association that had been found in the discovery sam-ple would not be materially degraded by the addition of significant new data of this kind from the confirmation sample. However, if data from the confirmation sample showed either that the alternative homozygote was sig-nificantly associated with the trait or that there was no association in the confirmation sample, then it would degrade the evidence for the association when added to the data from the discovery sample. The proportion of the residual variance explained by a SNP was esti-mated by comparing the ratio of the residual sums of squares of a model that did not include the SNP to the residual sums of squares of the model that included the SNP. The false positive rate (FPR) = Ep/Op where Ep is the expected number of SNP with p-values below a particular significance threshold, given the number of SNP in the panel and assuming that all tests are inde-pendent, and Op is the observed number of SNP with P-values below that same threshold.

To analyze genes, their locations, and the position of SNP on the map, the Btau 4.0 Bovine Genome As-sembly implemented at http://www.livestockgenomics.csiro.au/perl/gbrowse.cgi/bova4/ (Elsik et al., 2009) was used.

To calculate the molecular breeding value (MBV) for IMF for animals in the confirmation sample based on a panel of more than 1 SNP, first the allele substitu-tion effects of all the SNP in the panel were estimated simultaneously using the genotypes and phenotypes from the discovery sample. Then, for each animal of the confirmation sample, for each SNP in the panel, the number of copies of the reference allele was mul-tiplied by the allele effect of the SNP estimated from the discovery sample, and the resulting coefficients were summed across all SNP in the panel to generate

2299Genome-wide association study of meat and carcass traits



the MBV. For n loci in the panel, for bi allele effects for SNP i obtained from the discovery sample and gij genotypes (0, 1, and 2) in individual j at SNP i in the confirmation sample, the MBV mj for animals in the confirmation sample is

mj = Σni = 1bigij.

This method, on the additive scale for a quantitative trait, is similar to the method for predicting genetic risk of a complex trait applied to the multiplicative scale (Wray et al., 2007). A bivariate analysis of the MBV and the trait values of individuals of the confirmation sample was performed using ASReml to calculate ge-netic and phenotypic correlations, and the squares of these correlations were used to estimate the genetic and phenotypic variance, r2

G and r2P, respectively, explained

by a panel of SNP. The MBV were calculated from panels of different numbers of SNP. Simultaneously es-timated allele effects are affected by analytical method. We considered BayesA and BayesB (Meuwissen et al., 2001) and regression, which have substantially differ-ent assumptions about the underlying distribution of allele effects. The ASReml software used for regression has a limit of 280 SNP analyzed simultaneously on computers available to us, so we compared MBV from these methods using a set of 280 SNP that would fit in ASReml regression analysis. The MBV for IMF were calculated from 1) the most significant SNP in the dis-covery sample, 2) a panel of 190 out of the possible 229 SNP significantly (P < 0.05) associated with IMF iden-tified by forward regression in the discovery sample (see below), 3) the most significant SNP in the confirmation sample, 4) a panel of 14 SNP identified by forward re-gression in the confirmation sample (see below), and 5) as many SNP as possible genotyped in both discovery and confirmation samples irrespective of whether they were associated with IMF or not, of which there was a maximum of 329 SNP in common. The subset of 280 SNP for comparing regression (ASReml), BayesA and BayesB are contained in Supplemental Table 2 (http://jas.fass.org/content/vol89/issue8/). They were the first 280 SNP in alphabetical order, covered chromosomes BTA1–23, and included all the confirmed SNP for IMF (Table 2) but did not include 1 SNP (ARS-BFGL-NGS-40342) on BTA29 that formed part of the panel of 14 SNP (see Supplemental Table 5; http://jas.fass.org/content/vol89/issue8/).

To identify a panel of SNP associated with IMF by forward regression the following 3-step algorithm was used to identify the members of the panel. The idea behind this process is to progressively add to the panel the SNP that explain most of the variance in the trait after the effects of the previous SNP were removed. Step 0: select the best SNP (with the greatest F-value) from a single-SNP regression analysis, fitting the SNP as a fixed effect on uncorrected data. This SNP is add-ed to the pool of SNP. The defined critical threshold to enter SNP to the pool or model for multiple-SNP

regression analysis was set at P = 0.05. Step 1: perform the regression of the pool of SNP (consisting of 1 to as many SNP as are significant) fitted as a fixed effect on the original uncorrected data. This analysis results in a set of data corrected for the pool of SNP as well as other fixed effects, and these corrected values are sub-tracted from the original data. The resulting residual trait values are then used to identify new SNP to be added to the pool of SNP. Step 2: select the best SNP in the data set, excluding those already in the pool, by fitting the SNP as a fixed effect on the residual trait values. This SNP is added to the pool of SNP and the critical threshold is the same as step 0. Repeat steps 1 and 2 in sequence until there are no more SNP with F-values significantly (P < 0.05) associated with residual variance.

RESULTS

There were slightly more SNP that were significantly (P < 0.001) associated with IMF than to either LLPF or CHILLP8 in the 50K GWAS (Table 1). Given the size of the discovery sample the significance threshold of α = 0.001 generated a manageable number of targets for further study without increasing FPR too much. There were 87 SNP with significant (P < 0.001) associations with IMF, located across the bovine genome (Figure 1) and listed in Supplemental Table 4 (http://jas.fass.org/content/vol89/issue8/). Because of the number of SNP (53,798) tested, 54 of the SNP would be significant by chance at the α = 0.001 significance threshold, giving a FPR of 54/87 or 62%. There were 64 and 63 SNP with significant (P < 0.001) associations for LLPF and CHILLP8, respectively, listed in Supplemental Table 4 (http://jas.fass.org/content/vol89/issue8/), discovered at a FPR of 84% for both traits. Despite the increased FPR, the Q-Q plots for these traits showed an excess of greater t-values compared with the theoretical distribu-tion. For example, Figure 2 shows the Q-Q plot for the LLPF GWAS, which is known to include the CAPN1 mutations, and this excess usually indicates a likely real discovery of SNP associated with QTL. Although the majority of these associations have not been confirmed, the results for all SNP are provided in Supplemental Table 4 (http://jas.fass.org/content/vol89/issue8/) for use in meta-analyses and for confirmation studies per-formed by other researchers.

The chromosomes with large numbers of significant (P < 0.001) associations in the discovery sample are dif-ferent for the 3 traits (Table 1). There was no overlap in the list of chromosomes that had the largest number of SNP associated with the 3 traits. The 5 chromosomes with the largest number of associations with IMF ac-counted for 35 out of the 87 significant (P < 0.001) SNP, and the 87 SNP were located on 26 chromosomes. Although BTA1 had the most associations with IMF, the region with the greatest concentration of significant associations was BTA5 at 21 Mb. The 5 chromosomes with the largest number of significant associations with




LLPF accounted for 52 out of the 64 significant (P < 0.001) SNP, a much greater proportion than in the other 2 traits, indicating greater clustering of SNP on particular chromosomes, and the 64 SNP were located on 22 chromosomes. Bovine chromosome 12 had the most associations with LLPF consisting of 9 SNP, but the greatest concentration was on BTA29 at 45 Mb surrounding the CAPN1 gene. The 5 chromosomes with the largest number of significant associations with CHILLP8 accounted for 27 out of the 63 significant (P < 0.001) SNP, and the 63 SNP were located on 21 chromosomes. Bovine chromosome 14 had the most as-sociations with CHILLP8 consisting of 8 SNP, and the greatest concentration was also on BTA14 at 23 Mb in a region containing the XK, Kell blood group complex subunit-related family, member 4 (XKR4) gene. The significant (P < 0.001) associations on BTA8 for LLPF and CHILLP8 did not overlap because none of the SNP associated with a single trait were within 1 Mb of SNP associated with the other trait. There were no SNP that were significantly (P < 0.001) associated for all 3 traits and only 1 SNP, BTA-105294-no-rs located at BTA3:91216203, was associated with both LLPF and CHILLP8. There were no SNP associated at the α = 0.001 significance threshold with effects on both IMF and LLPF or IMF and CHILLP8.

In the confirmation sample, a total of 13% (37/293) of the SNP that were chosen on their association with IMF, LLPF, or CHILLP8 in the GWAS were signifi-cantly (P < 0.05) associated with the same trait with the same favorable homozygote (Table 2). The signifi-cance threshold of α = 0.05 for the same favorable ho-mozygote was used because there was already informa-tion in the experiment that identified the SNP. The most significant SNP in the confirmation sample were not the most significant SNP in the discovery sample for IMF. Furthermore, of the SNP tested for IMF, 15% (11 of 75) of those that were significantly (P < 0.001) associated with IMF in the discovery sample were sig-nificantly (P < 0.05) associated with IMF in the con-firmation sample and 9% (14 of 155) of those that were significantly (P < 0.005) associated with IMF in the discovery sample were significantly (P < 0.05) associ-ated with IMF in the confirmation sample, a nonsignifi-

cant difference. Four regions, BTA4:18 Mb, BTA8:96 Mb, BTA10:85 Mb, and BTA11:80 Mb, had more than 1 significant (P < 0.05) SNP. In addition, BTA2, 8, 10, 14, and 16 had more than 1 significant (P < 0.05) SNP separated by >5 Mb on the same chromosome. In-cluded in Table 2 is a SNP that had been significantly (P < 0.001) associated with CHILLP8 in the GWAS but was significantly (P < 0.01) associated with IMF in the confirmation sample, where the same homozygote increased CHILLP8 or IMF, respectively. It was not significantly associated with CHILLP8 in the confirma-tion sample.

There were 7 SNP significantly (P < 0.05) associated with LLPF out of the group of 32 SNP tested that had been significantly (P < 0.001) associated with LLPF in the GWAS. The CAST-2832 SNP, which is not in the Illumina Bovine SNP50 Array, was also significantly (P < 0.001) associated with LLPF in the confirmation sample. These 8 SNP were located to 2 chromosomes, 2 genomic regions on BTA7 and 1 on BTA29. Two of these regions contained the CAST and CAPN1 genes, and although other SNP near CAPN1 had shown stronger associations with LLPF than CAPN1_1 in the GWAS discovery sample, in the confirmation sam-ple, the CAPN1_1 SNP was the most significant (P = 0.00068) of the SNP associated with LLPF on BTA29. The CAPN1_1 SNP is part of current genetic tests for meat tenderness. The CAST-2832 SNP, which is also part of current genetic tests, was not included in the Bovine SNP50 array, but it was the most significant (P = 0.00003) of all the SNP associated with LLPF in the confirmation sample. Interestingly, of the 4 SNP in the introns of or immediately 5′ or 3′ to the CALP gene, ARS-BFGL-NGS-43901, ARS-USMARC-670, ARS-USMARC-116, and BTA-111223-no-rs, which were in-cluded in the Bovine SNP50 array, none showed signifi-cant associations with LLPF in the discovery sample, even if a very relaxed (α = 0.05) significance threshold was applied. The third region, BTA7:50 Mb, which con-tained a SNP significantly (P < 0.05) associated with LLPF, was located more than 45 Mb away from CAST.

There were 5 SNP significantly (P < 0.05) associated with CHILLP8 in the confirmation sample out of the group of 31 SNP tested that had been significantly (P

Table 1. Summary of genome-wide associations for intramuscular fat percentage (IMF), rump fat measured at the sacro-iliac crest in the chiller (CHILLP8), and meat tenderness measured as peak force to shear the LM (LLPF) in the discovery sample

Trait n SNP1 FPR,2 % Top 5 Chr3 Sig %4 Best region5 Gene(s)6

IMF 87 62 1, 4, 5, 10, 13 40.2 BTA5:21 KITLG to ATP2B1CHILLP8 63 84 8, 11, 14, 19, 23 42.9 BTA14:22 XKR4LLPF 64 84 7, 8, 12, 22, 29 81.3 BTA29:45 EHD1 to OVOL1 includes CAPN1

1Number SNP significant at P < 0.001.2False positive rate.3Top 5 chromosomes with largest number of SNP significant at P < 0.001.4Percentage of SNP with P < 0.001 on the top 5 chromosomes compared with all chromosomes.5Genomic region with the largest number of SNP significant at P < 0.001.6Gene(s) located to this genomic region.




Tab

le 2

. Ass

ocia

tion

s be

twee

n SN

P a

nd in

tram

uscu

lar

fat

perc

enta

ge (

IMF),

rum

p fa

t m

easu

red

at t

he s

acro

-ilia

c cr

est

in t

he c

hille

r (C

HIL

LP

8), a

nd m

eat

tend

erne

ss m

easu

red

as p

eak

forc

e to

she

ar t

he L

M (

LLP

F)

in t

he c

onfir

mat

ion

sam

ple

of c

attle

SNP

Chr

omos

ome

Pos

itio

n, b

pT

rait

GW

A

P-v

alue

1b2

SEt

R2 , %

P-v

alue

3G

ene4

BT

B-0

1742

157

00

IMF

0.00

034

−0.

110.

061.

800.

60.

0360

8U

nkno

wn

BT

B-0

0019

769

144

1149

81IM

F0.

0036

00.

240.

082.

880.

60.

0020

3D

CB

LD

2B

TA

-120

552-

no-r

s2

5477

2881

IMF

0.00

382

0.13

0.06

2.07

0.7

0.01

935

GT

DC

1H

apm

ap55

796-

rs29

0111

722

6394

7670

IMF

0.00

162

−0.

220.

082.

681.

10.

0037

4nn

gA

RS-

BFG

L-N

GS-

1184

682

9358

6145

IMF

0.00

139

−0.

190.

063.

131.

20.

0009

0LO

C51

8393

AR

S-B

FG

L-N

GS-

7503

93

1548

0732

IMF

0.00

032

−0.

180.

072.

570.

70.

0051

6P

RC

CB

TA

-684

67-n

o-rs

418

4660

19IM

F0.

0004

5−

0.15

0.06

2.68

0.5

0.00

374

nng

Hap

map

4804

9-B

TA

-684

654

1851

9898

IMF

0.00

154

0.10

0.06

1.71

0.2

0.04

379

nng

AR

S-B

FG

L-N

GS-

6407

27

4997

5005

LLP

F0.

0002

9−

0.16

0.07

2.32

0.6

0.01

027

1 kb

dis

tal M

GC

1595

84C

AST

-283

27

9757

4679

LLP

FN

BSA

0.22

0.05

4.05

1.5

0.00

003

CA

STB

TB

-007

3317

88

2551

6628

IMF

0.00

350

−0.

100.

061.

660.

20.

0486

1K

IAA

1797

BT

B-0

0488

794

896

2815

83IM

F0.

0045

80.

250.

064.

041.

70.

0000

3nn

gA

RS-

BFG

L-N

GS-

8778

78

9636

2722

IMF

0.00

354

0.18

0.07

2.51

0.7

0.00

612

LO

C53

0677

AR

S-B

FG

L-N

GS-

1003

958

1096

4968

0C

HIL

LP

8 IM

F5

0.00

021

−0.

230.

082.

991.

20.

0014

3T

NC

BT

B-0

1229

331

939

7514

80IM

F0.

0034

5−

0.12

0.06

1.93

0.2

0.02

695

nng

AR

S-B

FG

L-N

GS-

1036

8510

6266

2238

IMF

0.00

327

−0.

130.

072.

060.

40.

0198

3D

FN

A5

BT

B-0

1804

753

1084

5731

01IM

F0.

0000

0−

0.25

0.12

2.04

0.7

0.02

081

LO

C52

3451

AR

S-B

FG

L-N

GS-

1062

0310

8592

7764

IMF

0.00

023

−0.

250.

131.

960.

60.

0251

4R

GS6

AR

S-B

FG

L-N

GS-

1075

4311

7961

0835

IMF

0.00

059

0.12

0.07

1.74

0.4

0.04

109

nng

AR

S-B

FG

L-N

GS-

1064

7911

8190

8400

IMF

0.00

165

0.11

0.06

1.80

0.2

0.03

609

nng

BT

B-0

0492

076

1247

1896

87IM

F0.

0006

60.

120.

061.

910.

50.

0282

1nn

gH

apm

ap31

968-

BT

C-0

5675

414

4399

112

IMF

0.00

424

0.09

0.05

1.78

0.7

0.03

770

nng

AR

S-B

FG

L-N

GS-

1042

6814

2226

0372

IMF C

HIL

LP

860.

0033

60.

390.

152.

640.

50.

0042

1nn

gB

TB

-015

3078

814

2272

0373

CH

ILLP

80.

0000

00.

390.

201.

940.

30.

0263

3X

KR

4B

TB

-015

3083

614

2276

8980

CH

ILLP

80.

0000

0−

0.49

0.20

2.44

0.4

0.00

743

XK

R4

BT

B-0

0557

585

1422

8033

66C

HIL

LP

80.

0000

00.

520.

202.

660.

50.

0039

7X

KR

4B

TB

-005

5753

214

2283

8801

CH

ILLP

80.

0000

60.

540.

202.

730.

60.

0032

232

kb

dist

al t

o X

KR

4B

TA

-283

03-n

o-rs

1437

7115

67C

HIL

LP

80.

0004

50.

240.

131.

800.

20.

0360

934

kb

prox

imal

to

LO

C78

2385

Hap

map

4113

5-B

TA

-104

993

1461

7832

14IM

F0.

0008

60.

150.

062.

530.

60.

0057

8LO

C78

1881

BT

B-0

0640

427

1637

0499

32IM

F0.

0016

4−

0.16

0.06

2.41

0.3

0.00

807

nng

AR

S-B

FG

L-N

GS-

1806

816

4969

6753

IMF

0.00

036

0.13

0.06

1.98

0.4

0.02

399

LO

C51

0718

AR

S-B

FG

L-N

GS-

3443

018

6291

7071

IMF

0.00

083

−0.

120.

062.

000.

40.

0228

9LA

IR1

BTA

-542

07-n

o-rs

2236

4588

36IM

F0.

0006

90.

120.

061.

960.

40.

0251

4LO

C78

3484

INR

A-4

4323

6228

806

IMF

0.00

287

0.15

0.06

2.36

0.4

0.00

923

nng

AR

S-B

FG

L-N

GS-

3744

129

4497

7945

LLP

F0.

0000

00.

120.

062.

160.

40.

0155

12

kb d

ista

l to

GP

HA

2C

AP

N1_

129

4522

1191

LLP

F0.

0000

10.

170.

053.

210.

90.

0006

8C

AP

N1

AR

S-B

FG

L-N

GS-

1109

3629

4578

1509

LLP

F0.

0000

00.

100.

051.

920.

20.

0275

82

kb d

ista

l to

OV

OL1

BT

B-0

1033

227

2946

5935

75LLP

F0.

0000

10.

090.

052.

010.

40.

0223

53

kb d

ista

l to

RB

M4

AR

S-B

FG

L-N

GS-

4034

229

4745

7316

LLP

F0.

0001

10.

080.

051.

710.

10.

0437

9LO

C50

8879

Hap

map

4105

0-B

TA

-661

3529

4806

5510

LLP

F0.

0002

3−

0.09

0.05

1.79

0.3

0.03

688

8 kb

pro

xim

al t

o M

RP

L21

1 P-v

alue

for

the

SN

P in

the

disc

over

y sa

mpl

e. N

BSA

= n

ot in

Bov

ine

SNP

50 a

rray

.2 R

egre

ssio

n co

effic

ient

due

to

the

regr

essi

on o

f th

e tr

ait

on n

umbe

r of

cop

ies

of a

n al

lele

.3 P

-val

ue for

the

SN

P in

the

conf

irm

atio

n sa

mpl

e ba

sed

on a

1-t

aile

d te

st o

f th

e sa

me

favo

rabl

e al

lele

for

the

tra

it.

4 Gen

e lo

cation

on

the

Bta

u4.0

ass

embl

y, g

ene

sym

bol as

in

Gen

Ban

k, S

NP

des

igna

ted

as in

a ge

ne if it o

ccur

s in

the

exo

ns, in

tron

s, o

r ot

her

untr

ansl

ated

reg

ions

of a

gene

. nn

g =

not

nea

r ge

ne.

5 CH

ILLP

8 in

the

gen

ome-

wid

e as

soci

atio

n st

udy

(GW

AS)

but

IM

F in

the

conf

irm

atio

n sa

mpl

e.6 I

MF in

the

GW

AS

but

CH

ILLP

8 in

the

con

firm

atio

n sa

mpl

e.




Figure 1. The location of SNP associated with intramuscular fat percentage shown as a Manhattan plot. Odd-numbered chromosomes are shown in black; even-numbered chromosomes are shown in gray. Values above –logP > 3 are equivalent to P < 0.001.

Figure 2. The quantile-quantile plots of the observed distribution of the t-values for the genome-wide association study of meat tenderness measured as peak force to shear the LM in the discovery sample compared with the standard normal distribution. The plot represents at least 50,000 data points. Points at the extreme of the observed distribution show values that are greater than expected.




< 0.001) associated with CHILLP8 in the GWAS. The ARS-BFGL-NGS-104268 SNP on BTA14, which was significantly (P < 0.001) associated with IMF in the discovery sample, was significantly (P < 0.01) associ-ated with CHILLP8 in the confirmation sample. These 6 SNP were located to 2 chromosomal regions, both on BTA14 separated by 15 Mb. Only 1 of the SNP was located to BTA14:38 Mb, the other 5 SNP were located to a narrow region between BTA14:22–23 Mb that included only the XKR4 gene. The association be-tween these SNP and CHILLP8 had smaller P-values than P-values of any other SNP-trait combination in the GWAS, and had consisted of the cluster of 4 SNP all located to a region of less than 1 Mb in length on BTA14 as reported above. One of these SNP, BTB-01530836 located at BTA14:22768980, explained R2 = 2.3% of the residual variance in the GWAS. However, in the confirmation sample it explained R2 = 0.4%.

Initially, MBV were calculated for IMF because there were many more SNP associated with IMF than the other traits, and of these 3 traits it is the one of most economic interest. In the bivariate analysis that com-pared phenotypes with the MBV calculated from a panel of markers (Table 3), a panel of the 4 SNP with the largest effects in the confirmation sample explained r2

P = 2.2% and r2G = 6.1% of the variance in the con-

firmation sample. These 4 SNP were all of the SNP that each explained >1% of the residual variance for IMF. In comparison, a panel of the 4 most significant SNP from the discovery sample explained r2

P = 1.5% and r2

G = 3.8% of the variance in the confirmation sample. The performance of this latter panel was not materially different than panels of all of the 75 signifi-cant (P < 0.001) SNP from the discovery sample or all of the 190 significant (P < 0.05) SNP from the dis-covery sample identified by forward regression, which explained r2

P = 1.2 or 1.3% and r2G = 3.6 or 4.0% of

the variance, respectively. The panel that explained the most variance (Supplemental Table 5; http://jas.fass.org/content/vol89/issue8/) was obtained using forward regression in the confirmation sample and consisted of 14 SNP that explained r2

P = 5.6% and r2G = 15.6% of

the variance. When estimated as single markers, these SNP explained r2

P = 16.0% and r2G = 34.4% in the

GWAS sample and r2P = 12.4% and r2

G = 31.3% in the confirmation sample, respectively (Supplemental Table 5; http://jas.fass.org/content/vol89/issue8/), a deflation of more than 50% when analyzed as a panel. These 14 SNP were located on 12 chromosomes, and of the 2 pairs of SNP on the same chromosome none were closer together than 29 Mb. None of these 14 SNP showed significant linkage disequilibrium to another of the SNP in the panel. Expanding the panel to hundreds of SNP, irrespective of their level of significance to IMF in either the discovery or confirmation sample, result-ed in a reduction in the amount of variance explained by the panel. Several methods of analysis agreed that the amount of variance was low when several hundred SNP were used. Some of the correlations between MBV

and phenotypes were almost identical, such as for the estimates for 280 SNP using ASReml compared with the same 280 SNP using BayesB analyses, suggesting that the calculations are robust because the methods and their underlying assumptions are so different. Us-ing the BayesB method to estimate MBV for a panel of 329 SNP explained r2

P = 1.2% and r2G = 3.8% of

the variance for IMF, which is not materially different than that explained by the most significant SNP in the discovery sample. Note that in all these calculations the SNP effects that were used to derive the predic-tions were always obtained from the discovery sample irrespective of whether the SNP were chosen on their level of significance in the discovery or the confirmation sample.

The MBV for LLPF and CHILLP8 were then calcu-lated for differing numbers of SNP to determine wheth-er increasing the number of SNP also decreased the amount of variance explained for these traits (Table 4). Although a small panel of all the confirmed SNP explained r2

G = 3.0 and 13.5% for CHILLP8 and LLPF, respectively (SNP contained in Table 2), when all 329 SNP were used the amount of genetic variance that was explained was r2

G = 1.0 and 0.8% for CHILLP8 and LLPF, respectively, and panels of all the SNP signifi-cant (P < 0.001) in the discovery sample also did not explain large amounts of the genetic variance.

Although there are many confirmed SNP in anony-mous loci, those with names starting with LOC (un-known locus in the Entrez database, GenBank), KIAA (Kazusa unknown cDNA index), and MGC (Mamma-lian Gene Collection), these are not significantly dif-ferent than the distribution of named to anonymous genes in the bovine genome as a whole. There are 23,452 unique gene names annotated in the bovine genome to date (Gibbs et al., 2009), of which 9,270 had anonymous LOC names, 26 had anonymous KIAA names, and 4 had anonymous MGC names. Confirmed SNP (Table 2) were located within introns and cod-ing sequences of 9 named genes with assigned function, whereas confirmed SNP were located in 8 anonymous genes. These numbers increase to 12 named genes and 9 anonymous genes if SNP are counted that are in the intergenic sequences but are within 5 kb of the coding sequence. Both these distributions were not significant-ly (P > 0.05) different than the distribution of named to anonymous genes in the bovine genome as a whole. Apart from those SNP in CAPN1 and CAST, none of these SNP was in a gene that had previously been ex-plored for these traits, nor was there such a gene within 1 Mb of the most significant SNP in a region.

Furthermore, 12 of the confirmed SNP were more than 50 kb from a gene in either direction. The average distance between SNP in the panel is 51 kb. Each of these 12 SNP was evaluated to determine whether there were closer SNP to a gene. Of these 12, one of these was near another SNP that was in a gene and was signifi-cantly (P < 0.05) associated with the same trait of IMF (BTB-00488794). However, BTB-00488794 was signifi-




cantly (P < 0.0001) associated with the trait, whereas the SNP nearer the gene, ARS-BFGL-NGS-87787 in the gene LOC530677, was not as significant (P < 0.01) and explained less than one-half as much of the residu-al variance. One other SNP, ARS-BFGL-NGS-104268, was significantly (P < 0.05) associated with CHILLP8

and was near several SNP in the gene XKR4 that were significantly (P < 0.01) associated with the same trait, but both the proximal and distal neighbors between it and the nearby genes were nonsignificant. That par-ticular SNP had been associated with IMF in the dis-covery sample. All the other 10 of these 12 confirmed

Table 3. Comparison of molecular breeding value (MBV) prediction equations for in-tramuscular fat percentage (IMF) using panels of different numbers of SNP

SNP set rP1 SE(rP)

2 rG3 SE(rG)4 r2

P5 r2

G6

MBV-top47 0.150 0.031 0.246 0.062 0.022 0.061MBV-top4 GWAS8 0.122 0.031 0.196 0.057 0.015 0.038MBV-FR9 0.236 0.032 0.395 0.074 0.056 0.156MBV-MS10 0.217 0.032 0.352 0.068 0.047 0.124MBV-75 GWAS11 0.112 0.031 0.190 0.061 0.012 0.036MBV-190 GWAS12 0.116 0.031 0.200 0.063 0.013 0.040MBV-28013 ASReml 0.129 0.031 0.230 0.067 0.017 0.053MBV-280 BayesA 0.122 0.031 0.216 0.066 0.015 0.046MBV-280 BayesB 0.129 0.031 0.232 0.067 0.017 0.054MBV-32914 BayesA 0.104 0.031 0.182 0.062 0.011 0.033MBV-329 BayesB 0.110 0.031 0.196 0.064 0.012 0.038

1Phenotypic correlation between the MBV and the trait.2SE of the phenotypic correlation.3Genetic correlation between the MBV and the trait.4SE of the genetic correlation.5Phenotypic variance explained by the SNP.6Genotypic variance explained by the SNP.7MBV based on the 4 SNP with the strongest associations with IMF in the confirmation sample.8MBV based on the 4 SNP with the strongest associations with IMF in the discovery sample. GWAS =

genome-wide association study.9MBV based on 14 SNP identified by forward regression in the confirmation sample.10MBV based on the 14 most significant (P < 0.05) SNP in the confirmation sample.11MBV based on the 75 significant (P < 0.001) SNP with associations with IMF in the discovery sample.12MBV based on the 190 significant (P < 0.05) SNP with associations with IMF in the discovery sample

identified by forward regression.13280 SNP in the panel.14329 SNP in the panel.

Table 4. Comparison of molecular breeding value (MBV) prediction equations for rump fat measured at the sacro-iliac crest in the chiller (CHILLP8) and meat tender-ness measured as peak force to shear the LM (LLPF) using panels of different numbers of SNP

Trait SNP set rP1 SE(rP)

2 rG3 SE(rG)4 r2

P5 r2

G6

CHILLP87 6 SNP 0.109 0.031 0.173 0.054 0.012 0.030CHILLP88 31 SNP 0.040 0.031 0.066 0.052 0.002 0.004CHILLP8 329 SNP 0.060 0.031 0.100 0.055 0.004 0.010LLPF9 7 SNP 0.132 0.031 0.319 0.103 0.017 0.102LLPF7 8 SNP 0.193 0.036 0.367 0.098 0.037 0.135LLPF10 32 SNP −0.017 0.031 −0.050 0.092 0.000 0.003LLPF 329 SNP 0.036 0.032 0.092 0.084 0.001 0.008

1Phenotypic correlation between the SNP and the trait.2SE of the phenotypic correlation.3Genetic correlation between the SNP and the trait.4SE of the genetic correlation.5Phenotypic variance explained by the SNP.6Genotypic variance explained by the SNP.7MBV based on the SNP with strongest associations with the trait in the confirmation sample.8MBV based on the 31 significant (P < 0.001) SNP with associations with CHILLP8 in the discovery sample.9Excluding calpastatin (CAST)-2832.10MBV based on the 32 significant (P < 0.001) SNP with associations with the trait in the discovery sample.




SNP were in gene deserts where none of the SNP in the distant gene was significantly associated with the trait in the GWAS sample.

DISCUSSION

The main objectives of this study were to identify more SNP associated with meat and carcass quality using the GWAS methodology, to test as many of these SNP as possible to determine how many could be con-firmed, and to evaluate the performance of panels of the SNP for prediction of a phenotype. Although many SNP were identified in the GWAS, and although more of these SNP were confirmed in a second sample than would be expected by chance, only 13% of the posi-tives evaluated from the GWAS were significantly (P < 0.05) associated with the same trait and had the same favorable allele in a confirmation sample. This confir-mation sample was genetically similar, so the reduced confirmation cannot be ascribed to genetic difference. Instead, it may represent the increased FPR found in GWAS studies. Three new genetic regions of interest for IMF were discovered, out of the 20 regions from which markers had been found significantly (P < 0.05) asso-ciated with IMF in the confirmation sample. These 3 regions each showed an association that explained R2 > 1% of the residual variance. This threshold was chosen because it roughly corresponds to the amount of the residual variance explained by the CAPN1 and CAST SNP in these samples, which have proved to be useful DNA markers for meat tenderness. These are also of a similar size to moderately sized QTL affecting other traits, such as those due to the GH receptor (GHR), the prolactin receptor (PRLR), and the ATP binding cassette subfamily G, member 2 (ABCG2) that affect milk yield and milk component yield (Blott et al., 2003; Cohen-Zinder et al., 2005; Viitala et al., 2006). These have been shown to explain 1% < R2 < 2.5% of the residual variance for milk component yield (Turner et al., 2010) in a similar-sized sample to that used in this study. Although the coverage of the bovine genome was not exhaustive in this study (i.e., the average distance between SNP was 51 kb), it is unlikely that there are genes of very large effect and of moderate to high mi-nor allele frequency affecting IMF, LLPF, or CHILLP8 segregating in these samples, because this study has been sufficiently powerful to identify and confirm genes of relatively small effect. In particular, there was no evidence of genes or genetic regions explaining as much variation for these traits as the DGAT1 or MSTN mu-tations explain for milk fat percentage or meat yield, respectively. For the traits of LLPF and CHILLP8, fewer markers were tested, and no new regions with confirmed QTL explaining R2 > 1% were found.

Some of the associations found in this study are con-sistent with the locations of QTL identified by linkage mapping in cattle. Only those examples are reported where the 2 studies roughly agree, bearing in mind that the confidence intervals on QTL locations are large

(Visscher et al., 1996). The confirmed association of a SNP at BTA7:50 Mb with LLPF is consistent with the location of a QTL affecting LLPF previously mapped to the central region of BTA7 and that did not include the CAST gene (Drinkwater et al., 2006). The associa-tion of a series of SNP with CHILLP8 at 2 locations of BTA14 is consistent with QTL mapping studies of fat depth mapped to BTA14 (Casas et al., 2000, 2003). In the central region of BTA2 there were 3 confirmed SNP for IMF in this study, each separated by more than 5 Mb from the next one, 2 of which have R2 > 1%. One study has reported a well supported, broad centrally located QTL for marbling score in cattle on BTA2 (Ca-sas et al., 2004). At BTA14:61.7 there was a confirmed association with IMF in this study, and several linkage mapping studies located a QTL to the central region of BTA14 (Casas et al., 2003; Mizoshita et al., 2004; Choi et al., 2006), in addition to the many reports of studies of possible candidate genes affecting marbling score or IMF located to this part of BTA14 (Barendse, 2008). Finally, there were 2 SNP with confirmed associations with IMF on the central region of BTA16 in this study, which may correspond to a marbling QTL located in the central region of BTA16 (Casas et al., 2004).

The lack of success that candidate gene approaches have when trying to identify QTL is highlighted by these results, which also suggest that the density of SNP coverage needs to be greater in those studies. The Bovine SNP50 array was designed to have evenly spaced markers across the bovine genome (Matukumalli et al., 2009) and there would have been SNP close to the can-didate and positional candidate genes for fatness traits. The decay of linkage disequilibrium (LD) with distance in cattle breeds that has been reported previously (Bar-endse et al., 2007b; McKay et al., 2007; Gibbs et al., 2009) should mean that these SNP are at a density high enough to detect LD between noncausative SNP in a candidate gene and the trait of interest, although the density is not great enough for comprehensive coverage of a gene. Yet, the markers that were either identified in the discovery sample or replicated in the confirmation sample were not generally in candidate genes for these fatness traits. The main exceptions to this observation on candidate genes are the CAPN1 and CAST genes affecting meat tenderness, the effects of which were dis-covered some time ago. The CAST-2832 was not in-cluded in the SNP chip, yet there was still a significant (P < 0.001) SNP 1 Mb distance from CAST, and if CAPN1_1 had not been included in the SNP chip the CAPN1 gene would still have come to attention be-cause there was a significant (P < 0.001) SNP close to it. Although these results would have raised interest in these 2 candidate genes, it should nonetheless be noted that all 4 of the SNP in and around CAST in the Bo-vine SNP50 array showed no association with LLPF. In the absence of other information this lack of association in the gene would have given a false signal for future research, particularly when one considers the high de-gree of replication of the association of the CAST-2832




SNP with meat tenderness, the degree of LD between SNP demonstrated in other studies, and the associa-tion of several SNP in CAST to LLPF (Barendse, 2002; Schenkel et al., 2006; Barendse et al., 2007a). None of the candidate genes traditionally associated with fat-ness traits showed strong evidence of association when considering SNP within a 1-Mb window in the discov-ery population. None of the confirmed SNP was close to a well-studied, candidate, or positional candidate gene for fatness, nor were any of these genes identified as top candidates in gene expression studies (Wang et al., 2009). In our results, genes with substantial evidence for fatness traits were not candidate gene for fatness, whereas none of the previously studied candidate genes showed strong evidence for association. Our results cannot in general be used to rule out these candidate genes because the SNP density was not completely comprehensive and sufficient LD between the SNP in the Bovine 50K Array and potential causative alleles in the gene may not exist in our samples. The region near R3HDM1, which does have significant evidence for an association with IMF in both the discovery and confir-mation samples, was identified previously not because it was a candidate gene but because it contained evi-dence of population genetic selection of gene frequen-cies (Barendse et al., 2009).

The calculated allele substitution effect of a SNP will vary from one data set to another, whereas the associa-tions may be significant or not as the case may be and where the amount of variance explained by the SNP may differ substantially, particularly when the breeds or types of cattle used in one study differ from the breeds in another. In that sort of confirmatory analy-sis, the allele effects are unconstrained by data that have been collected previously. But in prediction and breeding, it is not enough to say that a marker was sta-tistically significant and that it explains a certain pro-portion of the variance; one must calculate a breeding value from current data that can be used to compare with the actual phenotype in another sample. In esti-mating the molecular breeding value from the regres-sion coefficients (allele substitution effects) of the panel of 14 SNP estimated simultaneously from the discovery sample data, this analysis explained r2

P = 5.6% and r2G

= 15.6% of the variance. This is a substantial propor-tion of the genetic variance, and it is encompassed by a relatively small number of SNP. Although this appears to be a useful amount of the variability, one needs to keep in mind that such a process will still show an up-ward bias in the amount of variability explained. The most unbiased method would be to genotype a third, large independent sample of cattle for these 14 SNP and then calculate the proportion of the variance that is explained by the 14 SNP. Because of a lack of such a sample, the breeding value in the confirmation sample was estimated using markers significantly associated with IMF in the confirmation sample but using allele substitution effect coefficients obtained from the dis-covery sample and then that breeding value was com-

pared with observed IMF in the confirmation sample. Further genotyping of additional cattle samples, by us or by other researchers, will provide more unbiased es-timates of the amount of variation explained by this panel of 14 SNP.

Although a forward regression method provided a panel of SNP that explained r2

G = 15.6%, adding more SNP that were not associated with the trait substan-tially degraded the amount of variance explained. Add-ing hundreds of additional SNP to the panel, many of which were significant at either the α = 0.005 or α = 0.001 significance threshold in the GWAS discovery sample but which had not been found significantly (P < 0.05) associated with IMF in the confirmation sam-ple, downgraded the prediction substantially, so that r2

G = 3.8% was explained using 329 SNP. Essentially the same amount of variance was explained whether the 4 most significant SNP (P < 0.001) or the 190 sig-nificant (P < 0.05) SNP for IMF identified by forward regression in the discovery sample were used, highlight-ing the importance of using confirmed SNP. The MBV for CHILLP8 and LLPF showed the same effect of de-creased variance explained when nonsignificant SNP were included in the MBV. The decreased similarity of the MBV and the phenotype when hundreds of SNP are used in the prediction panel is unlikely to be due to genetic differences between the samples. Although some of the evidence of association of a SNP in the discovery sample may have come from breeds that were not included in the confirmation study which may have resulted in a failure of confirmation of the SNP, the inclusion of SNP that have no effect on the trait should not have degraded the performance of the MBV. There are several possible explanations for this reduction in the amount of explained variance. First, putting SNP into a panel that are not associated with a trait could increase the error of prediction. Second, the analyti-cal methods for calculating MBV were not efficient at removing the error associated with SNP that were not associated with the trait. Third, although the sample size used in this study was large enough to characterize marker panels of a small number of elements, it was not large enough to characterize panels with hundreds of elements when estimates are obtained simultaneously. Larger experiments, based on many more thousands of animals and investigating a much larger pool of SNP may be able to distinguish between these explanations because larger samples will give more accurate esti-mates of allele substitution effects, particularly when the estimates are calculated simultaneously. However, as the SNP used in this study were the most significant from the GWAS, the conclusions from such a larger study would only differ in degree rather than in kind unless the larger study found a substantially different list of highly significant SNP compared with this study. These results have implications for genomic selection because all available loci are used in those methods, whether statistically significantly associated with the trait or not, and the same process is used to handle




nonsignificant SNP as was used in this study. These re-sults suggest that it may be necessary to use only SNP that are confirmed to be associated with the trait of interest and that a SNP list should be available as sup-porting information for a set of prediction equations.

LITERATURE CITED

Barendse, W., inventor; Commonwealth Scientific and Industrial Research Organization, Meat and Livestock Australia Limited, New South Wales Department of Agriculture, Queensland De-partment of Primary Industries, and University of New Eng-land, assignees. 2002. DNA markers for meat tenderness. Pat-ent Application WO02064820. p 1–88.

Barendse, W., inventor; Commonwealth Scientific and Industrial Research Organization, Meat and Livestock Australia Limited, New South Wales Department of Agriculture, Queensland De-partment of Primary Industries, and University of New Eng-land, assignees. 2007. Assessing the tenderness of meat from an animal by testing the animal for a genetic marker in the calpain3 (CAPN3) gene associated with Warner-Bratzler peak force variation or for a genetic marker located other than in CAPN3. Patent Application WO2007053891. p 1–62.

Barendse, W. 2008. Genetic based diagnostic tools for predicting meat quality. Pages 292–317 in Improving the sensory and nu-tritional quality of fresh meat. J. P. Kerry and D. Ledward, ed. Woodhead Publishing Limited and CRC Press LLC, Cam-bridge, UK.

Barendse, W., B. E. Harrison, R. J. Bunch, and M. B. Thomas. 2008. Variation at the calpain 3 gene is associated with meat tenderness in zebu and composite breeds of cattle. BMC Gen-et. 9:41.

Barendse, W., B. E. Harrison, R. J. Bunch, M. B. Thomas, and L. B. Turner. 2009. Genome wide signatures of positive selection: The comparison of independent samples and identification of regions associated to traits. BMC Genomics 10:178.

Barendse, W., B. E. Harrison, R. J. Hawken, D. M. Ferguson, J. M. Thompson, M. B. Thomas, and R. J. Bunch. 2007a. Epistasis between calpain 1 and its inhibitor calpastatin within breeds of cattle. Genetics 176:2601–2610.

Barendse, W., A. Reverter, R. J. Bunch, B. E. Harrison, W. Barris, and M. B. Thomas. 2007b. A validated whole genome associa-tion study of efficient food conversion. Genetics 176:1893–1905.

Blott, S., J. J. Kim, S. Moisio, A. Schmidt-Kuntzel, A. Cornet, P. Berzi, N. Cambisano, C. Ford, B. Grisart, D. Johnson, L. Karim, P. Simon, R. Snell, R. Spelman, J. Wong, J. Vilkki, M. Georges, F. Farnir, and W. Coppieters. 2003. Molecular dis-section of a quantitative trait locus: A phenylalanine-to-tyro-sine substitution in the transmembrane domain of the bovine growth hormone receptor is associated with a major effect on milk yield and composition. Genetics 163:253–266.

Cafe, L. M., B. L. McIntyre, D. L. Robinson, G. H. Geesink, W. Bar-endse, D. W. Pethick, J. M. Thompson, and P. L. Greenwood. 2010. Production and processing studies on calpain-system gene markers for tenderness in Brahman cattle: 2. Objective meat quality. J. Anim. Sci. 88:3059–3069.

Casas, E., J. W. Keele, S. D. Shackelford, M. Koohmaraie, and R. T. Stone. 2004. Identification of quantitative trait loci for growth and carcass composition in cattle. Anim. Genet. 35:2–6.

Casas, E., S. D. Shackelford, J. W. Keele, M. Koohmaraie, T. P. L. Smith, and R. T. Stone. 2003. Detection of quantitative trait loci for growth and carcass composition in cattle. J. Anim. Sci. 81:2976–2983.

Casas, E., S. D. Shackelford, J. W. Keele, R. T. Stone, S. M. Kappes, and M. Koohmaraie. 2000. Quantitative trait loci affecting growth and carcass composition of cattle segregating alternate forms of myostatin. J. Anim. Sci. 78:560–569.

Casas, E., S. N. White, T. L. Wheeler, S. D. Shackelford, M. Koohm-araie, D. G. Riley, C. C. Chase, D. D. Johnson, and T. P. L. Smith. 2006. Effects of calpastatin and mu-calpain markers in beef cattle on tenderness traits. J. Anim. Sci. 84:520–525.

Choi, I. S., H. S. Kong, J. D. Oh, D. H. Yoon, B. W. Cho, Y. H. Choi, K. S. Kim, K. D. Choi, H. K. Lee, and G. J. Jeon. 2006. Analysis of microsatellite markers on bovine chromosomes 1 and 14 for potential allelic association with carcass traits in Hanwoo (Korean cattle). Asian-australas. J. Anim. Sci. 19:927–930.

Cohen-Zinder, M., E. Seroussi, D. M. Larkin, J. J. Loor, A. Everts-van der Wind, J. H. Lee, J. K. Drackley, M. R. Band, A. G. Hernandez, M. Shani, H. A. Lewin, J. I. Weller, and M. Ron. 2005. Identification of a missense mutation in the bovine ABCG2 gene with a major effect on the QTL on chromosome 6 affecting milk yield and composition in Holstein cattle. Ge-nome Res. 15:936–944.

Drinkwater, R. D., Y. Li, I. Lenane, G. P. Davis, R. Shorthose, B. E. Harrison, K. Richardson, D. Ferguson, R. Stevenson, J. Ren-aud, I. Loxton, R. J. Hawken, M. B. Thomas, S. Newman, D. J. S. Hetzel, and W. Barendse. 2006. Detecting quantitative trait loci affecting beef tenderness on bovine chromosome 7 near cal-pastatin and lysyl oxidase. Aust. J. Exp. Agric. 46:159–164.

Elsik, C. G., R. L. Tellam, K. C. Worley, and Bovine Genome Se-quencing Analysis Consortium. 2009. The genome sequence of taurine cattle: A window to ruminant biology and evolution. Science 324:522–528.

Gibbs, R. A., J. F. Taylor, C. P. Van Tassell, W. Barendse, K. A. Eversole, C. A. Gill, R. D. Green, D. L. Hamernik, S. M. Kappes, S. Lien, L. K. Matukumalli, J. C. McEwan, L. V. Naz-areth, R. D. Schnabel, G. M. Weinstock, D. A. Wheeler, P. Ajmone-Marsan, P. J. Boettcher, A. R. Caetano, J. F. Garcia, O. Hanotte, P. Mariani, L. C. Skow, J. L. Williams, B. Di-allo, L. Hailemariam, M. L. Martinez, C. A. Morris, L. O. C. Silva, R. J. Spelman, W. Mulatu, K. Y. Zhao, C. A. Abbey, M. Agaba, F. R. Araujo, R. J. Bunch, J. Burton, C. Gorni, H. Ol-ivier, B. E. Harrison, B. Luff, M. A. Machado, J. Mwakaya, G. Plastow, W. Sim, T. Smith, T. S. Sonstegard, M. B. Thomas, A. Valentini, P. Williams, J. Womack, J. A. Wooliams, Y. Liu, X. Qin, K. C. Worley, C. Gao, H. Y. Jiang, S. S. Moore, Y. R. Ren, X. Z. Song, C. D. Bustamante, R. D. Hernandez, D. M. Muzny, S. Patil, A. S. Lucas, Q. Fu, M. P. Kent, R. Vega, A. Matukumalli, S. McWilliam, G. Sclep, K. Bryc, J. Choi, H. Gao, J. J. Grefenstette, B. Murdoch, A. Stella, R. Villa-Angulo, M. Wright, J. Aerts, O. Jann, R. Negrini, M. E. Goddard, B. J. Hayes, D. G. Bradley, M. B. da Silva, L. P. L. Lau, G. E. Liu, D. J. Lynn, F. Panzitta, and K. G. Dodds. 2009. Genome-wide survey of SNP variation uncovers the genetic structure of cattle breeds. Science 324:528–532.

Gilmour, A. R., B. J. Gogel, B. R. Cullis, S. J. Welham, and R. Thompson. 2002. ASReml User Guide Release 1.0. VSN Inter-national Ltd., Hemel Hempstead, UK.

Goddard, M. E., and B. J. Hayes. 2007. Genomic selection. J. Anim. Breed. Genet. 124:323–330.

Ihaka, R., and R. Gentleman. 1996. R: A language for data analysis and graphics. J. Comput. Graph. Statist. 5:299–314.

Matukumalli, L. K., C. T. Lawley, R. D. Schnabel, J. F. Taylor, M. F. Allan, M. P. Heaton, J. O’Connell, S. S. Moore, T. P. L. Smith, T. S. Sonstegard, and C. P. Van Tassell. 2009. Develop-ment and characterization of a high density SNP genotyping assay for cattle. PLoS ONE 4:e5350.

McKay, S. D., R. D. Schnabel, B. M. Murdoch, L. K. Matukumalli, J. Aerts, W. Coppieters, D. Crews, E. Dias, C. A. Gill, C. Gao, H. Mannen, P. Stothard, Z. Q. Wang, C. P. Van Tassell, J. L. Williams, J. F. Taylor, and S. S. Moore. 2007. Whole genome linkage disequilibrium maps in cattle. BMC Genet. 8:74.

Meuwissen, T. H. E., B. J. Hayes, and M. E. Goddard. 2001. Pre-diction of total genetic value using genome-wide dense marker maps. Genetics 157:1819–1829.

Mizoshita, K., T. Watanabe, H. Hayashi, C. Kubota, H. Yamaku-chi, J. Todoroki, and Y. Sugimoto. 2004. Quantitative trait




loci analysis for growth and carcass traits in a half-sib fam-ily of purebred Japanese Black (Wagyu) cattle. J. Anim. Sci. 82:3415–3420.

Page, B. T., E. Casas, M. P. Heaton, N. G. Cullen, D. L. Hyndman, C. A. Morris, A. M. Crawford, T. L. Wheeler, M. Koohmaraie, J. W. Keele, and T. P. L. Smith. 2002. Evaluation of single-nucleotide polymorphisms in CAPN1 for association with meat tenderness in cattle. J. Anim. Sci. 80:3077–3085.

Perry, D., W. R. Shorthose, D. M. Ferguson, and J. M. Thompson. 2001. Methods used in the CRC program for the determina-tion of carcass yield and beef quality. Aust. J. Exp. Agric. 41:953–957.

Reverter, A., D. J. Johnston, D. M. Ferguson, D. Perry, M. E. God-dard, H. M. Burrow, V. H. Oddy, J. M. Thompson, and B. M. Bindon. 2003a. Genetic and phenotypic characterisation of ani-mal, carcass, and meat quality traits from temperate and tropi-cally adapted beef breeds. 4. Correlations among animal, car-cass, and meat quality traits. Aust. J. Agric. Res. 54:149–158.

Reverter, A., D. J. Johnston, D. Perry, M. E. Goddard, and H. M. Burrow. 2003b. Genetic and phenotypic characterisation of animal, carcass, and meat quality traits from temperate and tropically adapted beef breeds. 2. Abattoir carcass traits. Aust. J. Agric. Res. 54:119–134.

Robinson, D. L., and V. H. Oddy. 2004. Genetic parameters for feed efficiency, fatness, muscle area and feeding behaviour of feedlot finished beef cattle. Livest. Prod. Sci. 90:255–270.

Schenkel, F. S., J. R. Miller, Z. Jiang, I. B. Mandell, X. Ye, H. Li, and J. W. Wilton. 2006. Association of a single nucleotide poly-morphism in the calpastatin gene with carcass and meat quality traits of beef cattle. J. Anim. Sci. 84:291–299.

Taniguchi, M., T. Utsugi, K. Oyama, H. Mannen, M. Kobayashi, Y. Tanabe, A. Ogino, and S. Tsuji. 2004. Genotype of stearoyl-

CoA desaturase is associated with fatty acid composition in Japanese Black cattle. Mamm. Genome 15:142–148.

Turner, L. B., B. E. Harrison, R. J. Bunch, L. R. Porto Neto, Y. T. Li, and W. Barendse. 2010. A genome wide association study of tick burden and milk composition in cattle. Anim. Prod. Sci. 50:235–245.

Upton, W., H. M. Burrow, A. Dundon, D. L. Robinson, and E. B. Farrell. 2001. CRC breeding program design, measurements and database: Methods that underpin CRC research results. Aust. J. Exp. Agric. 41:943–952.

Viitala, S., J. Szyda, S. Blott, N. Schulman, M. Lidauer, A. Maki-Tanila, M. Georges, and J. Vilkki. 2006. The role of the bovine growth hormone receptor and prolactin receptor genes in milk, fat and protein production in Finnish Ayrshire dairy cattle. Genetics 173:2151–2164.

Visscher, P. M., R. Thompson, and C. S. Haley. 1996. Confidence in-tervals in QTL mapping by bootstrapping. Genetics 143:1013–1020.

Wang, Y. H., N. I. Bower, A. Reverter, S. H. Tan, N. De Jager, R. Wang, S. M. McWilliam, L. M. Cafe, P. L. Greenwood, and S. A. Lehnert. 2009. Gene expression patterns during intramuscu-lar fat development in cattle. J. Anim. Sci. 87:119–130.

White, S. N., E. Casas, T. L. Wheeler, S. D. Shackelford, M. Koohm-araie, D. G. Riley, C. C. Chase Jr., D. D. Johnson, J. W. Keele, and T. P. L. Smith. 2005. A new single nucleotide polymor-phism in CAPN1 extends the current tenderness marker test to include cattle of Bos indicus, Bos taurus, and crossbred descent. J. Anim. Sci. 83:2001–2008.

Wray, N. R., M. E. Goddard, and P. M. Visscher. 2007. Prediction of individual genetic risk to disease from genome-wide association studies. Genome Res. 17:1520–1528.




Referenceshttp://jas.fass.org/content/89/8/2297#BIBLThis article cites 34 articles, 18 of which you can access for free at:



Documents

A genome-wide association study of meat and carcass traits ... · A genome-wide association study of meat and carcass traits in Australian cattle doi: 10.2527/jas.2010-3138 originally