

Use of genotype probabilities in selection on PrP genotype in sheep

Gregor Gorjanc a,∗, Drago Kompan a

a University of Ljubljana, Biotechnical Faculty, Department of Animal Science, Groblje 3, 1230 Domžale, Slovenia

Abstract

PrP genotype probabilities for ungenotyped animals of the Jezersko-Solcava sheep breed were calculated. Data consisted of 10,429 animals, among which 3,669 had PrP genotype data. There were 2,673 live ungenotyped animals. Five PrP haplotypes were present with the following frequencies: ARR 0.174, AHQ 0.074, ARH 0.083, ARQ 0.632, and VRQ 0.037. All 15 possible PrP genotypes were found. Iterative allelic peeling with the incomplete penetrance model, as implemented in the GenoProb program, was used for the calculation of PrP genotype probabilities. There were only a few additional PrP genotype and NSP type identifications with high probability. The low number of additional identifications can be attributed to the large number of haplotypes with moderate frequencies, the incomplete penetrance model, the uniform prior, and the inherent pedigree and genotype data structure. The incomplete penetrance model caused inflation of small probabilities but proved very useful for field data, where conflicts arise due to pedigree or genotype errors. Novel parameters (maximal NSP type, average NSP type and its variance and accuracy) are proposed to make use of PrP genotype probabilities and facilitate selection for scrapie resistance. The parameters were derived with emphasis on the practical implementation of selection schemes based on NSP types. The maximal NSP type can be used to infer the maximal potential scrapie susceptibility of an individual ungenotyped animal as well as of the entire flock. The average NSP type encompasses all information contained in PrP genotype probabilities and is the most useful and practical parameter for selection on NSP type and therefore PrP genotype. In addition, the accuracy of the average NSP type should also be used as a selection criterion in order to assess the variability of an average NSP type.

Key words: genotype probabilities, PrP genotype, NSP type, scrapie, sheep

1. Introduction

Scrapie is a transmissible spongiform encephalopathy of sheep. Hunter et al. (1997) have shown that sheep can develop scrapie if they are genetically susceptible and exposed to an infecting agent. Polymorphisms at codons 136, 154, and 171 of the PrP gene are strongly associated with resistance or susceptibility to scrapie in sheep (Hunter, 1997). The following five PrP haplotypes (alleles) are the most frequent: A136R154R171 (ARR), A136H154Q171 (AHQ), A136R154H171 (ARH), A136R154Q171 (ARQ), and V136R154Q171 (VRQ).

PrP genotypes are classified into risk groups based on scrapie susceptibility (e.g. Dawson et al., 2008). Risk groups are also called NSP types or groups (after the National Scrapie Plan in the UK). The most resistant genotype, ARR/ARR, is of NSP type 1, while the most susceptible genotypes are of NSP type 5 (Table 1). This grouping also represents the organizational implementation of selection programs for scrapie resistance in sheep. Selection programs aim at the removal of the VRQ haplotype and an increase in the frequency of the ARR haplotype, especially in the ARR/ARR form. A recent review on scrapie in sheep has been given by Ulvund (2008).

∗ Corresponding author. Email address: [email protected] (Gregor Gorjanc).

Preprint submitted to Elsevier 8 May 2008


Table 1. PrP genotype grouping by NSP types (e.g. Dawson et al., 2008)

PrP genotype                                             NSP type
ARR/ARR                                                  1
ARR/AHQ, ARR/ARH, ARR/ARQ                                2
AHQ/AHQ, AHQ/ARH, AHQ/ARQ, ARH/ARH, ARH/ARQ, ARQ/ARQ     3
ARR/VRQ                                                  4
AHQ/VRQ, ARH/VRQ, ARQ/VRQ, VRQ/VRQ                       5

The infecting agent causing scrapie is still not known. The PrP gene is therefore the main factor that can be used for the regulation of scrapie susceptibility/resistance in populations of sheep. PrP genotype information on selection candidates is essential for the successful application of a selection program for scrapie resistance. The number of sheep involved in genotyping is massive: Dawson et al. (2008) reported that 700,000 rams have been genotyped in Great Britain since the start of the National Scrapie Plan in 2001. The costs of genotyping have decreased in recent years. However, the collection of tissue samples and the genotyping of a large number of individuals still carry a considerable cost. Therefore, methods for the calculation of genotype probabilities for ungenotyped animals could be exploited to reduce costs and increase selection intensity. Elston and Stewart (1971) proposed a general method for the calculation of genotype probabilities in human genetics. For livestock pedigrees, modified or different methods (e.g. Van Arendonk et al., 1989; Kerr and Kinghorn, 1996; Fernandez et al., 2001; Thallman et al., 2001b; Henshall and Tier, 2003; Gengler et al., 2007) must be used due to the larger number of individuals and the greater pedigree complexity caused by inbreeding loops. The aim of this paper is to present the results of the calculation of PrP genotype probabilities and their use in selection for scrapie resistance.

Table 2. Data structure

                                      Animals   Sires   Dams
All                                   10,429    397     3,530
Unknown sire or dam                   2,403     76      1,184
With PrP genotype data                3,669     114     1,443
Live and without PrP genotype data    2,673     28      508

2. Material

PrP genotype and pedigree data of the Slovenian autochthonous Jezersko-Solcava sheep were used. This is a meat-type breed. In Slovenia, pedigree is monitored only in performance-recorded flocks. All animals from the Jezersko-Solcava herdbook were taken into consideration, except those that were not informative for the calculation of PrP genotype probabilities and were therefore pruned from the pedigree. Pruning was applied in the direction from ascendants to descendants. All of the following criteria had to be met for the removal of an animal: known date of death or culling, only one descendant, and no PrP genotype data. Live animals were retained since the aim of this work was to calculate genotype probabilities for live animals, the potential selection candidates.
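The pruning rule can be illustrated with a minimal sketch. The text does not give the exact procedure, so several points here are assumptions: the `Animal` record and `prune` function are hypothetical names, the pedigree is assumed to be sorted with parents before progeny, only animals that are (or have become) founders are considered for removal, and "only one descendant" is read as "at most one progeny".

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class Animal:
    id: str
    sire: Optional[str]
    dam: Optional[str]
    alive: bool
    genotyped: bool


def prune(pedigree: list[Animal]) -> list[Animal]:
    """Drop uninformative animals, working from ascendants to descendants.

    ASSUMPTION: `pedigree` is ordered so that parents precede their progeny.
    """
    kept = {a.id: a for a in pedigree}

    # Count progeny of every animal in the unpruned pedigree.
    n_progeny = {a.id: 0 for a in pedigree}
    for a in pedigree:
        for parent in (a.sire, a.dam):
            if parent in n_progeny:
                n_progeny[parent] += 1

    for a in pedigree:
        # A parent that is unknown or already pruned no longer links upwards.
        is_founder = a.sire not in kept and a.dam not in kept
        uninformative = (
            not a.alive and not a.genotyped and n_progeny[a.id] <= 1
        )
        if is_founder and uninformative:
            del kept[a.id]

    # Parents that were pruned become unknown for the remaining animals.
    return [
        replace(
            a,
            sire=a.sire if a.sire in kept else None,
            dam=a.dam if a.dam in kept else None,
        )
        for a in pedigree
        if a.id in kept
    ]
```

Live animals are never removed in this sketch, which matches the stated aim of computing probabilities for live selection candidates.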

Altogether 10,429 animals were used in the analysis (Table 2). The number of Jezersko-Solcava sheep in performance recording increased over the years (Fig. 1). Among all animals in the study, 397 rams and 3,530 ewes had at least one progeny (Table 2). The percentage of unknown sires or dams was 23.0 % for all animals, 19.1 % for sires and 33.5 % for dams. PrP genotype data were available for 3,669 animals, of which 114 were sires and 1,443 were dams. PrP genotype data first became available in 2005 for performance-tested rams at the national central test station. Later, PrP genotype data also became available for rams and ewes of all ages in flocks all around the country. The number of rams and ewes with PrP genotype data increased with the year of birth (Fig. 1), but there was large variability between herds (data not shown). The reduction in 2008 was due to records in progress. There were 2,673 live animals, including 28 sires and 508 dams (Table 2), that were not yet genotyped at the time of writing. Haplotype and genotype frequencies for the dataset used are presented in the results.

Fig. 1. Number of all animals, and of ewes and rams with PrP genotype data, by year of birth (x-axis: year of birth; y-axis: number of animals)

3. Methods

PrP genotype probabilities were calculated with iterative allelic peeling (Thallman et al., 2001a,b) using the incomplete penetrance model as implemented in the GenoProb program (Thallman, 2002). Input data were PrP genotypes and pedigree. In the GenoProb program, the prior probability for haplotypes (alleles) in ungenotyped founders is uniform, i.e. Pr(ARR, AHQ, ARH, ARQ, VRQ) = 1/5. The incomplete penetrance model allowed the use of data with potential errors in either genotype or pedigree data. The error rate was set to 0.1. For each animal a vector g [1] with 15 genotype probabilities was obtained. Only probabilities higher than 0.0001 were retained. From the obtained genotype probabilities a vector n [2] with 5 NSP type probabilities was calculated according to the PrP genotype grouping (Table 1).

$$\mathbf{g}' = [\Pr(\mathrm{ARR/ARR}), \ldots, \Pr(\mathrm{VRQ/VRQ})] \tag{1}$$

$$\mathbf{n}' = [\Pr(\mathrm{NSP}_1), \ldots, \Pr(\mathrm{NSP}_5)] \tag{2}$$
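The step from g to n amounts to summing genotype probabilities within each NSP type of Table 1. A minimal sketch follows; the function names and the ordering of genotypes inside g are assumptions made for illustration and are not taken from GenoProb.

```python
from itertools import combinations_with_replacement

HAPLOTYPES = ["ARR", "AHQ", "ARH", "ARQ", "VRQ"]

# The 15 unordered genotypes: ARR/ARR, ARR/AHQ, ..., VRQ/VRQ.
# ASSUMPTION: g is stored in this order.
GENOTYPES = list(combinations_with_replacement(HAPLOTYPES, 2))


def nsp_type(h1, h2):
    """NSP type of genotype h1/h2 according to the grouping in Table 1."""
    pair = {h1, h2}
    if "VRQ" in pair:
        return 4 if "ARR" in pair else 5
    if pair == {"ARR"}:
        return 1
    if "ARR" in pair:
        return 2
    return 3


def nsp_probabilities(g):
    """Collapse 15 genotype probabilities into 5 NSP type probabilities."""
    n = [0.0] * 5
    for (h1, h2), prob in zip(GENOTYPES, g):
        n[nsp_type(h1, h2) - 1] += prob
    return n


# Equally likely genotypes give n' = [1/15, 3/15, 6/15, 1/15, 4/15].
print(nsp_probabilities([1 / 15] * 15))
```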

In order to assess potential scrapie susceptibility, the maximal NSP type [3] for each animal was defined as:

$$\max(\mathrm{NSP}, l) = \max\left(I(\mathbf{n} > l) \odot \mathbf{a}\right), \tag{3}$$

where I() is an indicator function returning the value 0 when the condition is false, or 1 when the condition is true; l is an arbitrary NSP type probability threshold value such as 0.05; a′ = [1, 2, 3, 4, 5] is a vector of NSP type values; and $\odot$ is the elementwise multiplication operator.
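Definition [3] translates almost directly into code. The sketch below uses a hypothetical function name, `max_nsp_type`, and returns 0 when no NSP type exceeds the threshold, a case the text does not discuss.

```python
def max_nsp_type(n, l=0.05):
    """Largest NSP type whose probability exceeds the threshold l.

    n holds the five NSP type probabilities for types 1..5.
    """
    a = [1, 2, 3, 4, 5]
    # (prob > l) acts as the indicator I(): True counts as 1, False as 0.
    return max(value * (prob > l) for value, prob in zip(a, n))


# NSP type probabilities of individual A from Table 6 (columns IP1 and IP2).
print(max_nsp_type([0.00883, 0.90552, 0.04715, 0.00476, 0.03373]))
print(max_nsp_type([0.00450, 0.49164, 0.23781, 0.03273, 0.23330]))
```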

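Additionally, an average NSP type [4] was calculated for each animal as a weighted average of the arbitrary values a, weighted with the NSP type probabilities (n):

$$\overline{\mathrm{NSP}} = \mathbf{a}'\mathbf{n}. \tag{4}$$

The variance of an individual average NSP type [5] was derived as:

$$\sigma^2_{\overline{\mathrm{NSP}}} = \left(\mathbf{a}' - \overline{\mathrm{NSP}}\right)^2 \mathbf{n}, \tag{5}$$

where the square is taken elementwise. If all 15 PrP genotypes are equally likely, all values of g are equal to 1/15, n′ = [1/15, 3/15, 6/15, 1/15, 4/15], $\overline{\mathrm{NSP}} = 3.27$, and $\sigma^2_{\overline{\mathrm{NSP}}} = 1.53$. These results were used to derive the accuracy of an average NSP type:

$$r_{\overline{\mathrm{NSP}}} = 1 - \sigma^2_{\overline{\mathrm{NSP}}} / k, \tag{6}$$

where $k = \sigma^2_{\overline{\mathrm{NSP}}}$ under the uniform distribution of PrP genotypes. In the case of uniform probabilities for NSP types, $\overline{\mathrm{NSP}} = 3$ and $\sigma^2_{\overline{\mathrm{NSP}}} = 2$.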
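Formulas [4] to [6] can be checked numerically with a short sketch. The function names are chosen here for illustration; the constant k is computed from the case of equally likely genotypes described in the text.

```python
A = (1, 2, 3, 4, 5)  # NSP type values a


def average_nsp(n):
    """Average NSP type [4]: values a weighted by NSP type probabilities n."""
    return sum(a * p for a, p in zip(A, n))


def variance_nsp(n):
    """Variance of the average NSP type [5]."""
    mean = average_nsp(n)
    return sum((a - mean) ** 2 * p for a, p in zip(A, n))


# k: variance when all 15 PrP genotypes are equally likely.
UNIFORM_GENOTYPES = [1 / 15, 3 / 15, 6 / 15, 1 / 15, 4 / 15]
K = variance_nsp(UNIFORM_GENOTYPES)


def accuracy_nsp(n):
    """Accuracy of the average NSP type [6]."""
    return 1 - variance_nsp(n) / K


print(round(average_nsp(UNIFORM_GENOTYPES), 2), round(K, 2))
print(average_nsp([0.2] * 5), variance_nsp([0.2] * 5))
```

The first line printed corresponds to the case of equally likely genotypes and the second to uniform NSP type probabilities.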

4. Results with Discussion

4.1. Frequencies

Frequencies of PrP haplotypes and genotypes were calculated as simple proportions without accounting for relationship dependencies in the data (e.g. Bohenke, 1991). The frequency of the favourable ARR haplotype was 0.174, while the frequency of the VRQ haplotype was 0.037 (Table 3). The most frequent haplotype was ARQ with a frequency of 0.632. All 15 possible PrP genotypes were found in this breed, with the highest frequencies for ARQ/ARQ (0.4) and ARR/ARQ (0.212). Other genotypes had frequencies lower than 0.12. Heterozygosity was equal to 0.56, while polymorphism information content was equal to 0.52. In comparison with the PrP haplotype and genotype frequency estimates of 56 sheep breeds compiled by Luhken et al. (2001a), the Jezersko-Solcava breed had a low frequency of the ARR haplotype, high frequencies of the ARQ, AHQ, and ARH haplotypes, and high frequencies of heterozygotes.
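The text does not state which formulas were used for heterozygosity and polymorphism information content (PIC). The sketch below assumes the usual definitions, expected heterozygosity 1 − Σp² and the PIC of Botstein et al. (1980); with the haplotype frequencies of Table 3 these reproduce the reported values after rounding.

```python
def heterozygosity(p):
    """Expected heterozygosity: 1 - sum of squared allele frequencies."""
    return 1 - sum(x * x for x in p)


def pic(p):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    s2 = sum(x * x for x in p)
    s4 = sum(x ** 4 for x in p)
    # Identity: sum over i<j of 2 * p_i^2 * p_j^2 = s2^2 - s4
    return 1 - s2 - (s2 * s2 - s4)


# Haplotype frequencies: ARR, AHQ, ARH, ARQ, VRQ (Table 3).
freq = [0.174, 0.074, 0.083, 0.632, 0.037]
print(round(heterozygosity(freq), 2), round(pic(freq), 2))
```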

Frequencies of NSP types 1, 2, 3, 4, and 5 for the Jezersko-Solcava breed were 0.035, 0.267, 0.628, 0.011, and 0.061, respectively. These frequencies correspond to an average NSP type [4] of 2.8.
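As a check, the population value follows from applying [4] to the NSP type frequencies above, written out here as a worked calculation and rounded to one decimal place:

```latex
\overline{\mathrm{NSP}}
  = 1(0.035) + 2(0.267) + 3(0.628) + 4(0.011) + 5(0.061)
  = 2.802 \approx 2.8
```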

4.2. Additional identifications

Ideally, the calculation of genotype probabilities is performed in order to acquire additional identification or exclusion of genotype(s) for some animals and partial information on genotype(s) for other animals. Unfortunately, there were no additional identifications of PrP genotype with certainty via the calculated genotype probabilities (Table 4), not even for ungenotyped sires or dams. When the probability of identification was lowered to 0.99, 66 animals had an additionally identified PrP genotype. However, this was only 2.5 % of all live ungenotyped animals. There were more additional identifications (as measured by the percentage of animals in a group) for sires, but not for dams. This is clearly due to the higher number of progeny per sire than per dam. Lowering the probability of genotype identification down to 0.80 increased the number of additional identifications, but not substantially.

Table 3. PrP haplotype, PrP genotype and NSP type frequencies (n=3,669)

PrP haplotype   ARR     AHQ     ARH     ARQ     VRQ
Frequency       0.174   0.074   0.083   0.632   0.037

PrP genotype   Frequency   NSP type   NSP type frequency
ARR/ARR        0.035       1          0.035
ARR/AHQ        0.026       2          0.267
ARR/ARH        0.029       2
ARR/ARQ        0.212       2
AHQ/AHQ        0.008       3          0.628
AHQ/ARH        0.010       3
AHQ/ARQ        0.092       3
ARH/ARH        0.007       3
ARH/ARQ        0.111       3
ARQ/ARQ        0.400       3
ARR/VRQ        0.011       4          0.011
AHQ/VRQ        0.004       5          0.061
ARH/VRQ        0.005       5
ARQ/VRQ        0.050       5
VRQ/VRQ        0.002       5

There were also no additional identifications of NSP type with certainty (Table 4). However, 209 (7.8 %) animals had an additionally identified NSP type with a probability of identification equal to 0.99. This number more than doubled when the probability of identification was lowered to 0.8. The higher number of additional identifications for NSP type in comparison to PrP genotype is due to the fact that 15 genotypes are grouped into 5 NSP types. There were also more additional NSP type identifications than genotype identifications for ungenotyped sires and dams, although the increase was proportionally higher in dams.

Additional exclusion of PrP genotype and NSP type was also assessed but is not presented. Those results were not very informative, since there are 15 possible PrP genotypes and 5 NSP types. Similar results for additional identification and exclusion of PrP genotype and NSP type were also obtained for other sheep breeds in Slovenia (Gorjanc et al., 2008), where data structure, haplotype and genotype frequencies, and percentage of genotyped animals varied.

Tier and Henshall (2005) have evaluated the limits of additional genotype identification for single loci in common livestock pedigrees. They concluded that an increased number of alleles, intermediate frequencies of alleles, and smaller families decreased success, as measured by the number of genotypes additionally inferred with certainty. In the case of sheep the problem is small family size, unless insemination is in use. Besides small families, the PrP genotype in sheep has several known haplotypes (alleles) with moderate frequencies. Additional reasons for the low number of additional identifications of PrP genotype and NSP type are discussed further on, where we also propose new parameters (maximal NSP type and average NSP type) that can be used as selection criteria.

4.3. Genotype probabilities

For each animal a vector (g) of PrP genotype probabilities was calculated. These are posterior probabilities given pedigree, PrP genotype, and haplotype prior frequencies for founders. The distribution of all calculated probabilities is shown in Fig. 2, separately for animals without and for animals with PrP genotype data. Genotype probabilities for the latter group of animals were obtained due to the use of the incomplete penetrance model.

There was a considerable number of small genotype probabilities (Fig. 2). This shows why there were not many additional genotype identifications with certainty or at least high probability (Table 4). Only some genotype probabilities were moderate or high. Since there are 15 possible PrP genotypes, there is an inflation of small probabilities and the distribution is not very informative.

A considerable number of very small genotype probabilities was also calculated for animals with PrP genotype data. This is mainly due to the incomplete penetrance model, as we discuss further on. Some moderate probabilities indicated data errors. The comparison of the percentage of high and low probabilities is again not very informative, due to the inflation caused by the large number of possible genotypes.

Table 4. Additional identifications of PrP genotype and NSP type for live animals (n=2,673), sires (n=28), and dams (n=508)

PrP genotype
              Animals        Sires          Dams
Probability   No.    %       No.    %       No.    %
1.00          0      0.0     0      0.0     0      0.0
0.99          66     2.5     6      21.4    9      1.8
0.95          101    3.8     8      28.6    15     3.0
0.90          116    4.3     9      32.1    19     3.7
0.85          134    5.0     9      32.1    22     4.3
0.80          144    5.4     10     35.7    27     5.3

NSP type
              Animals        Sires          Dams
Probability   No.    %       No.    %       No.    %
1.00          0      0.0     0      0.0     0      0.0
0.99          209    7.8     7      25.0    28     5.5
0.95          283    10.6    10     35.7    47     9.2
0.90          329    12.3    12     42.8    54     10.6
0.85          382    14.3    12     42.8    65     12.8
0.80          440    16.5    15     53.6    81     15.9

With the incomplete penetrance model the collected PrP genotypes are treated as “phenotype” data (Lincoln and Lander, 1992). This allows the analysis of data that are not consistent with Mendelian laws, i.e. a “phenotype” of an individual does not match the genotypes of its relatives. Although Mendelian laws are simple in principle, it is hard to determine which individual has an erroneous “phenotype” in large and complex pedigrees with incomplete genotype data for only one locus. The presence of multiple alleles/haplotypes increases the complexity. Additionally, errors that are Mendelian consistent are even harder to detect (Sobel et al., 2002). Sobel et al. (2002) give a general treatment of genotype errors in statistical genetics, while de Givry et al. (2006) and Manolios et al. (2007) describe novel methods for checking genotype consistency in complex pedigrees.

Fig. 2. Distribution of all PrP genotype probabilities for all animals without (above) and with (below) PrP genotype data (x-axis: genotype probability; y-axis: distribution)

The incomplete penetrance model is very useful for field data in animal breeding (e.g. Thallman et al., 2001b; Gengler et al., 2007), since errors might occur due to pedigree errors and typing errors. After the calculation of genotype probabilities with the incomplete penetrance model, the probability of an error can be computed as the sum of probabilities for genotypes that are inconsistent with a “phenotype” (Thallman et al., 2001b). In the GenoProb program (Thallman, 2002) the relationship between a “phenotype” and a genotype for co-dominant loci is described via a simple penetrance (error) function (Lincoln and Lander, 1992; Ehm et al., 1996). A probability of 1 − ε is assigned to the genotype that is consistent with the “phenotype”, while ε is uniformly assigned to the other genotypes. In this analysis ε was set to 0.1, leading to a probability of 1 − ε = 0.9 for the consistent phenotype-genotype pair and ε/(5(5 + 1)/2 − 1) = 0.007 for each of the inconsistent phenotype-genotype pairs.
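The uniform error function and the resulting error probability can be sketched as follows. This illustrates the description in the text and is not GenoProb code; the function names and the dictionary layout of the posterior are assumptions.

```python
def penetrance(observed, true, eps=0.1, n_genotypes=15):
    """Pr(recorded "phenotype" | true genotype) under a uniform error model."""
    if observed == true:
        return 1.0 - eps
    # eps is spread evenly over the remaining n_genotypes - 1 genotypes.
    return eps / (n_genotypes - 1)


def error_probability(posterior, observed):
    """Sum of posterior probabilities of genotypes that are inconsistent
    with the recorded "phenotype".

    posterior: dict mapping genotype label -> posterior probability.
    """
    return sum(p for genotype, p in posterior.items() if genotype != observed)


print(penetrance("ARR/ARQ", "ARR/ARQ"))            # consistent pair
print(round(penetrance("ARR/ARQ", "ARQ/ARQ"), 3))  # inconsistent pair
```

With five haplotypes there are 5(5 + 1)/2 = 15 genotypes, so each of the 14 inconsistent genotypes receives 0.1/14.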

A simple example in Fig. 3 can be used to show the effect of the incomplete penetrance model on the inflation of small genotype probabilities. There is a parent (A) that has six progeny (from B to G). The genotype of the parent is not known, while the genotypes of the progeny are known. There are five alleles (haplotypes), as for the PrP gene in the Jezersko-Solcava sheep breed. For simplicity, the alleles are marked with 1, 2, 3, 4, and 5. Upon inspection it can be deduced that the genotype of the parent is likely 1/2 and that there is likely an error in the pedigree or genotype data for individual G.

Fig. 3. Pedigree example

If we calculate genotype probabilities for the parent with the complete penetrance model, we cannot use individual G. We have to either correct or omit the genotype of conflicting individuals. This seems easy in the presented example, but it is not in general (e.g. Sobel et al., 2002; de Givry et al., 2006; Manolios et al., 2007). First, individual G was removed and the complete penetrance model was used to calculate genotype probabilities for individual (parent) A. A single genotype probability of 1.0 was obtained, for genotype 1/2 (Table 5). On the same dataset (without individual G), the incomplete penetrance model returned 15 genotype probabilities for parent A even though there were no conflicting genotypes in the dataset. The probability of genotype 1/2 was ∼0.87. When individual G was added, the incomplete penetrance model had to be used; the number of calculated genotype probabilities was again 15, with the probability of genotype 1/2 for parent A equal to ∼0.44. The probability of genotype 4/5 for individual G was 0.49, while the sum of the other probabilities, i.e. the probability of a typing error or pedigree error, was 0.51 (data not shown). This example clearly shows that the incomplete penetrance model inflates the number of calculated genotype probabilities, but on the other hand enables the calculation when errors are present. Sobel et al. (2002) have stated that results from the incomplete penetrance model are less indicative, but more secure.

Table 5. Genotype probabilities for individual A (see Fig. 3) with complete penetrance (CP) and incomplete penetrance (IP) model

Genotype   CP (1)   IP (1)    IP (2)    IP (3)
1/1                 0.00883   0.00450   0.01801
1/2        1.00     0.86740   0.44191   0.26052
1/3                 0.03336   0.01700   0.01638
1/4                 0.00476   0.03273   0.09195
1/5                 0.00476   0.03273   0.21957
2/2                 0.00883   0.00450   0.00193
2/3                 0.00476   0.00242   0.00119
2/4                 0.03336   0.22945   0.19890
2/5                 0.03336   0.22945   0.14236
3/3                 0.00001   0.00001   0.00001
3/4                 0.00018   0.00126   0.00311
3/5                 0.00018   0.00126   0.00222
4/4                 0.00001   0.00017   0.00679
4/5                 0.00018   0.00242   0.03672
5/5                 0.00001   0.00017   0.00033

(1) individual G excluded; (2) individual G included; (3) individual G included, haplotype prior probabilities taken from Table 3

In order to remove phenotype-genotype conflicts, a two-stage process can be used. First, the incomplete penetrance model is applied to enable the use of all data and the calculation of error probabilities. Then the suspicious data can be checked and repaired if possible. Once all inconsistencies have been removed, the complete penetrance model can be used. Although this two-stage process seems simple, there might be practical issues with repeated genotyping of conflicting animals. A substantial cost associated with the additional collection of tissue samples can be expected, and some animals might already be dead. As such, the incomplete penetrance model is of great importance in applied work.

The error rate used (ε = 0.1) can be considered high. For example, Thallman et al. (2001b) used a smaller error rate (ε = 0.01), but also noted (Thallman, 2002) that the estimation of error rates is not an easy task. Sobel et al. (2002) reported that their default error rate for a uniform model in the field of human genetics is 0.025. We think that an error rate larger than 0.01 or 0.025 is warranted in an animal breeding scenario for loci with multiple alleles, especially in meat-type sheep where pedigree tracking is more error prone. On the other hand, the incomplete penetrance model with a high error rate could lower the number of additional genotype identifications. This was assessed with error rates equal to 0.01, 0.025, 0.05, 0.075, and 0.1. There were only minor differences in the percentage of additional PrP genotype and NSP type identifications between the error rates used, for probabilities of identification higher than 0.8 (Fig. 4). Similar results were also obtained for sires and dams (data not shown).

Fig. 4. Percentage of additional PrP genotype (above) and NSP type (below) identifications for all live ungenotyped animals by probability of identification and error rate (x-axis: probability of identification, 0.80 to 1.00; lines: error rates ε = 0.100, 0.075, 0.050, 0.025, and 0.010)

An additional source of non-informativeness was the uniform prior for haplotype frequencies in founders, i.e. Pr(ARR, AHQ, ARH, ARQ, VRQ) = 1/5. Results by Luhken et al. (2001a), as well as many others, clearly show that some PrP haplotypes are more frequent than others. The effect of a different prior for the example pedigree (Fig. 3) is shown in the last column of Table 5, where the population haplotype frequency estimates (Table 3) were used as a prior. In general, haplotype frequencies in founders (e.g. Bohenke, 1991) should be used as a prior. For this particular example, the more informative prior resulted in a larger redistribution of probability among genotypes, due to the pedigree error or typing error. Kerr and Kinghorn (1996) have shown that a uniform prior reduces the number of erroneous genotype exclusions. This is very important for the PrP genotype case, where erroneous statements about VRQ/∗ genotypes are not wanted. Therefore, a uniform prior for haplotype frequencies in founders is warranted as a conservative choice.

Fig. 5. Distribution of live animals by maximal NSP type: bars - animals with PrP genotype data, vertical lines - animals without PrP genotype data (x-axis: maximal NSP type, 1 to 5; y-axis: distribution)

4.4. Maximal NSP type

Selection on PrP genotype is mainly driven by the grouping of genotypes into so-called NSP types or groups, named after the National Scrapie Plan in the UK. Instead of PrP genotype and NSP type identification via calculated genotype probabilities, the maximal NSP type [3] can be quantified from the same source. However, with the incomplete penetrance model there is the practical problem of setting the NSP type probability threshold (i.e. the value of l in [3]) in order to skip NSP types with negligible probability. We have used 0.05, but other values could also be used. NSP type probabilities and the maximal NSP type for the example pedigree (Fig. 3) are shown in Table 6. For this particular example, the effect of the incomplete penetrance model on erroneous data is clearly shown, i.e. the maximal NSP type of individual A is 5 when the conflicting individual G is used in the analysis. The distribution of NSP type probabilities of genotyped animals and of the maximal NSP type of ungenotyped animals for the Jezersko-Solcava dataset is shown in Fig. 5. The majority of ungenotyped animals had a maximal NSP type of 5, but a considerable number of animals had a lower (better) NSP type. The maximal NSP type of ungenotyped animals and the NSP type of genotyped animals can be used jointly in determining the NSP type status of entire flocks.

Table 6. NSP type probabilities, maximal NSP type, average NSP type ($\overline{\mathrm{NSP}}$) and its variance ($\sigma^2_{\overline{\mathrm{NSP}}}$) and accuracy ($r_{\overline{\mathrm{NSP}}}$) for individual A (see Fig. 3) with complete penetrance (CP) and incomplete penetrance (IP) model

NSP type           CP (1)   IP (1)    IP (2)    IP (3)
1                           0.00883   0.00450   0.01801
2                  1.00     0.90552   0.49164   0.36885
3                           0.04715   0.23781   0.21193
4                           0.00476   0.03273   0.21957
5                           0.03373   0.23330   0.18163
max(NSP, 0.05)     2        2         5         5
Average NSP type   2        2.15      3.00      3.18
Variance           0        0.36      1.48      1.36
Accuracy           1        0.77      0.04      0.11

(1) individual G excluded; (2) individual G included; (3) individual G included, haplotype prior probabilities taken from Table 3
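The IP columns of Table 6 can be re-derived from the genotype probabilities in Table 5. The sketch below does this under two assumptions that are not stated in the text: alleles 1 to 5 of the example stand for ARR, AHQ, ARH, ARQ, and VRQ in that order, and k is taken as the rounded value 1.53. The function names are illustrative.

```python
from itertools import combinations_with_replacement

# Genotypes 1/1, 1/2, ..., 5/5 in the row order of Table 5.
GENOTYPES = list(combinations_with_replacement(range(1, 6), 2))

# Columns IP1, IP2 and IP3 of Table 5 (individual A).
TABLE5 = {
    "IP1": [0.00883, 0.86740, 0.03336, 0.00476, 0.00476, 0.00883, 0.00476,
            0.03336, 0.03336, 0.00001, 0.00018, 0.00018, 0.00001, 0.00018,
            0.00001],
    "IP2": [0.00450, 0.44191, 0.01700, 0.03273, 0.03273, 0.00450, 0.00242,
            0.22945, 0.22945, 0.00001, 0.00126, 0.00126, 0.00017, 0.00242,
            0.00017],
    "IP3": [0.01801, 0.26052, 0.01638, 0.09195, 0.21957, 0.00193, 0.00119,
            0.19890, 0.14236, 0.00001, 0.00311, 0.00222, 0.00679, 0.03672,
            0.00033],
}

K = 1.53  # variance under equally likely genotypes, rounded as in the text


def nsp_type(i, j):
    """NSP type of genotype i/j (i <= j), ASSUMING 1 = ARR, ..., 5 = VRQ."""
    if j == 5:
        return 4 if i == 1 else 5
    if i == 1:
        return 1 if j == 1 else 2
    return 3


def summarise(g, l=0.05):
    """NSP type probabilities and the derived parameters [3] to [6]."""
    n = [0.0] * 5
    for (i, j), prob in zip(GENOTYPES, g):
        n[nsp_type(i, j) - 1] += prob
    pairs = list(enumerate(n, start=1))  # (NSP type value, probability)
    mean = sum(a * p for a, p in pairs)
    var = sum((a - mean) ** 2 * p for a, p in pairs)
    return {
        "n": n,
        "max": max((a for a, p in pairs if p > l), default=0),
        "mean": mean,
        "var": var,
        "acc": 1 - var / K,
    }


for column, g in TABLE5.items():
    s = summarise(g)
    print(column, s["max"], round(s["mean"], 2), round(s["var"], 2),
          round(s["acc"], 2))
```

Under these assumptions the NSP type probabilities match Table 6 to five decimals, which supports the assumed allele coding.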

4.5. Average NSP type

The maximal NSP type parameter does not encompass all information contained in PrP genotype probabilities. Therefore, the average NSP type [4] was derived as a weighted average of NSP type values, where the weights are NSP type probabilities. This parameter uses all information contained in PrP genotype probabilities and is the most useful and practical parameter for selection on PrP genotype. The average NSP type for the example pedigree is shown in Table 6, while Fig. 6 shows the distribution of the average NSP type for live ungenotyped animals of the Jezersko-Solcava breed. The full use of genotype information is manifested as good agreement between the distributions of average NSP type for the genotyped and ungenotyped animals. Although this parameter is continuous, peaks were observed at values around 1.5, 2, 2.5, 3, 3.5, 4, and 5. Similar results were also obtained for other sheep breeds in Slovenia (Gorjanc et al., 2008).

Fig. 6. Distribution of live animals by average NSP type ($\overline{\mathrm{NSP}}$): bars - animals with PrP genotype data, superimposed curve - animals without PrP genotype data; vertical dashed line - population $\overline{\mathrm{NSP}}$ (x-axis: average NSP type, 1 to 5; y-axis: distribution)

The average NSP type is in essence similar to a breeding value, but with emphasis on the practical use of PrP genotype in the selection process. NSP type values (a) could reflect scrapie susceptibility in a more precise way, but in that case the calculation of a breeding value would be preferred. The breeding value would be calculated as a weighted average of PrP breeding values, i.e. sums of PrP haplotype average effects, weighted with PrP genotype probabilities. Unfortunately, the average effects of PrP haplotypes are not known. As is usually the case with breeding values, the average NSP type can be represented as a deviation from the NSP type population average.
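The weighted average described in this paragraph could be written as follows. This is only a sketch of the idea: the symbols $\alpha_i$ (average effect of haplotype $i$) and $h_i/h_j$ (genotype made up of haplotypes $i$ and $j$) are notation assumed here and do not appear in the original.

```latex
\mathrm{BV} = \sum_{i \le j} \Pr(h_i/h_j)\,(\alpha_i + \alpha_j)
```

Replacing $(\alpha_i + \alpha_j)$ with the NSP type value of genotype $h_i/h_j$ gives back the average NSP type [4], which is the sense in which the two quantities are similar.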

The average NSP type can be used as a selection criterion for the ungenotyped animals, but this parameter should not be used blindly. When there is no genotype information from relatives, all PrP genotypes are equally likely and NSP is equal to 3.27. In this case the average NSP type is indistinguishable from the same average NSP type for an individual with genotype information from relatives. The variance of the average NSP type [5] or its accuracy [6] can be used to distinguish such cases. Therefore, both the average NSP type and its accuracy should be used in selection for scrapie resistance.
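The point that equal averages can hide very different amounts of information can be illustrated with a small Python sketch. The two animals and their probabilities are hypothetical and do not reproduce the value of 3.27 reported above. The variance is computed here as the usual variance of a discrete distribution, which is an assumption and may differ in detail from formula [5].

```python
def nsp_mean_and_variance(nsp_probs):
    """Return the mean and variance of NSP type for one animal.

    nsp_probs maps an NSP type value (1..5) to its probability.
    """
    mean = sum(k * p for k, p in nsp_probs.items())
    var = sum(p * (k - mean) ** 2 for k, p in nsp_probs.items())
    return mean, var


# Hypothetical animal A: relatives make NSP type 3 nearly certain
animal_a = {1: 0.00, 2: 0.05, 3: 0.90, 4: 0.05, 5: 0.00}
# Hypothetical animal B: probability spread evenly over all NSP types
animal_b = {1: 0.20, 2: 0.20, 3: 0.20, 4: 0.20, 5: 0.20}

mean_a, var_a = nsp_mean_and_variance(animal_a)
mean_b, var_b = nsp_mean_and_variance(animal_b)

# Both averages are (numerically) 3, but the variances differ widely
print(round(mean_a, 2), round(var_a, 2))
print(round(mean_b, 2), round(var_b, 2))
```

Ranking these two animals on the average alone would treat them as equal, while the variance shows that the average of animal A is far better supported by data.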

The relationship between the average NSP type and its accuracy is presented in Fig. 7. Lower accuracies were more common for intermediate values of the average NSP type. Accuracies were also computed for the animals with genotype data. As expected, accuracies are much higher for these animals, but some low values indicated families with conflicting pedigree or genotype data. Negative accuracies indicate that the derived formula for accuracy [6] is not perfect. This is due to the fact that PrP genotypes are not “linearly” grouped into NSP types. However, there were only a few cases with negative accuracies, and these could be set to 0 in applied work.

[Figure 7: two panels, "Without PrP genotype data" (above) and "With PrP genotype data" (below); x-axis - Average NSP type (1 to 5); y-axis - Accuracy (−0.5 to 1.0)]

Fig. 7. Relationship between the average NSP type and its accuracy for live animals without (above) and with (below) PrP genotype data

5. Conclusion

PrP genotype probabilities for ungenotyped animals of the Jezersko-Solcava breed were calculated with iterative allelic peeling using the incomplete penetrance model. Only a few additional PrP genotype and NSP type identifications had high probability. This can be attributed to the large number of haplotypes (alleles) with moderate frequencies, the incomplete penetrance model, the uniform prior, and the inherent pedigree and genotype data structure. Nevertheless, PrP genotype probabilities can be used to calculate the maximal NSP type and, more importantly, the average NSP type and its accuracy. These parameters can be used as criteria for selection against scrapie susceptibility in ungenotyped animals.

6. Acknowledgment

Pedigree and PrP genotype data were kindly provided by the Slovenian breeding organisation for small ruminants. We are also grateful to Richard Mark Thallman for sharing the GenoProb program and valuable suggestions on its use.

References

Bohenke, M. 1991. Allele frequency estimation from data on relatives. Am. J. Hum. Genet. (48) 22–25. http://www.pubmedcentral.nih.gov/articlerender.fcgi?arti (31. mar. 2008)

Dawson, M., Moore, R.C., Bishop, S.C. 2008. Progress and limits of PrP gene selection policy. Vet. Res. (39) 25–37. http://dx.doi.org/10.1051/vetres:2007064 (31. mar. 2008)

Ehm, M.G., Kimmel, M., Cottingham, R.W.Jr. 1996. Error detection for genetic data, using likelihood methods. Am. J. Hum. Genet. (58) 225–234. http://www.pubmedcentral.nih.gov/articlerender.fcgi?pubm (31. mar. 2008)

Elston, R.C., Stewart, J.A. 1971. General model for the genetic analysis of pedigree data. Hum. Hered. (21) 523–542.

Fernandez, S.A., Fernando, R.L., Guldbrandtsen, B., Totir, L.R., Carriquiry, A.L. 2001. Sampling genotypes in large pedigrees with loops. Genet. Sel. Evol. (33) 337–367. http://dx.doi.org/10.1051/gse:2001122 (31. mar. 2008)

Gengler, N., Mayeres, P., Szydlowski, P. 2007. A simple method to approximate gene content in large pedigree populations: application to the myostatin gene in dual-purpose Belgian Blue cattle. Animal (1) 21–28. http://dx.doi.org/10.1017/S1751731107392628 (31. mar. 2008)

de Givry, S., Vitezica, Z.G., Palhiere, I., Schiex, T. 2006. MendelSoft: Mendelian error detection in complex pedigree using weighted constraint satisfaction techniques. In: Proceeding of the 8th World Congress on Genetics Applied to Livestock Production 2002-08-19/23, INRA, Montpellier, France.

Gorjanc, G., Plohl, M., Kompan, D. 2008. Calculation of PrP genotype and NSP type probabilities in Slovenian sheep. Acta Agric. Slov. (submitted).

Henshall, J.M., Tier, B. 2003. An algorithm for sampling descent graphs in large complex pedigrees efficiently. Genet. Res. (81) 205–212. http://dx.doi.org/10.1017/S0016672303006232 (31. mar. 2008)

Hunter, N. 1997. PrP genetics in sheep and the applications for scrapie and BSE. Trends Microbiol. (5) 331–334. http://dx.doi.org/10.1016/S0966-842X(97)01081-0 (31. mar. 2008)

Hunter, N., Cairns, D., Foster, J., Smith, G., Goldman, W., Donnelly, K. 1997. Is scrapie a genetic disease? Evidence from scrapie-free countries. Nature (386) 137–137. http://dx.doi.org/10.1038/386137a0 (31. mar. 2008)

Kerr, R.J., Kinghorn, B.P. 1996. An efficient algorithm for segregation analysis in large populations. J. Anim. Breed. Genet. (113) 457–469.

Lincoln, S.E., Lander, E.S. 1992. Systematic detection of errors in genetic linkage data. Genomics (14) 604–610. http://dx.doi.org/10.1016/S0888-7543(05)80158-2 (10. mar. 2008)

Luhken, G., Lipsky, S., Peter, C., Erhardt, G. 2008. Prion protein polymorphisms in autochthonous European sheep breeds in respect to scrapie eradication in affected flocks. Small Rumin. Res. (75) 43–47. http://dx.doi.org/10.1016/j.smallrumres.2007.07.010 (10. mar. 2008)

Manolios, P., Oms, M., Valls, S. 2007. Checking pedigree consistency with PCS. In: Grumberg, O., Huth, M. (Eds.), Tools and Algorithms for the Construction and Analysis of Systems (4424) 339–342. http://dx.doi.org/10.1007/978-3-540-71209-1_26 (31. mar. 2008)

Sobel, E., Papp, J.C., Lange, K. 2002. Detection and integration of genotyping errors in statistical genetics. Am. J. Hum. Genet. (70) 496–508. http://dx.doi.org/10.1086/338920 (31. mar. 2008)

Thallman, R.M., Bennet, G.L., Keele, J.W., Kappes, S.M. 2001a. Efficient computation of genotype probabilities for loci with many alleles: I. Allelic peeling. J. Anim. Sci. (79) 26–33. http://jas.fass.org/cgi/content/abstract/79/1/26 (31. mar. 2008)

Thallman, R.M., Bennet, G.L., Keele, J.W., Kappes, S.M. 2001b. Efficient computation of genotype probabilities for loci with many alleles: II. Iterative method for large complex pedigrees. J. Anim. Sci. (79) 34–44. http://jas.fass.org/cgi/content/abstract/79/1/34 (31. mar. 2008)

Thallman, R.M. 2002. User’s Manual for GenoProb Version 2.0. Computation of genotype and phase probabilities in complex pedigrees by iterative allelic peeling. 18 pp.

Tier, B., Henshall, J.M. 2005. Limits to genotypic probabilities for single loci. In: Application of new genetic technologies to animal breeding: Proceedings of the 16th Conference of the Association for the Advancement of Animal Breeding and Genetics 2005-09-25/2, CSIRO, Noosa Lakes, Queensland, Australia 366–369.

Ulvund, M.J. 2008. Ovine scrapie disease: Do we have to live with it? Small Rumin. Res. (in press) http://dx.doi.org/10.1016/j.smallrumres.2007.12.004 (31. mar. 2008)

Van Arendonk, J.A.M., Smith, C., Kennedy, B.W. 1989. Method to estimate genotype probabilities at individual loci in farm livestock. Theor. Appl. Genet. (78) 735–740. http://dx.doi.org/10.1007/BF00262571 (31. mar. 2008)