Module 7: Estimating Genetic Variances

– Why estimate genetic variances?

– Single factor mating designs

PBG 650 Advanced Plant Breeding

Why estimate genetic variances?

• New crop species

– ensure adequate genetic variance for selection

– determine appropriate type of cultivar

• pure lines, hybrids, open-pollinated varieties

• Predict response to short and long-term selection

• Determine optimum number and location of testing environments

• Use in selection indices

• Predict single-cross performance

Do Breeders Need to Estimate Genetic Variances?

• For breeders working with elite germplasm, it is often more useful to develop breeding populations for selection than to estimate genetic variances

– use parents with high means

– make crosses between unrelated individuals to maintain high genetic variation (or assess diversity at molecular level)

– single-cross performance can be predicted from data routinely generated in breeding programs

– recurrent selection is not widely used in breeding programs for major crop species

Bernardo, 2010, Chapt. 7

What about newer crops, less developed germplasm?

• Options in mating designs for self-pollinated crops are limited

• Potential of purelines or open-pollinated varieties vs hybrids can be assessed by comparing means of these types of cultivars and by considering costs of hybrid seed production

• Precision of genetic variance estimates is often low

• Selection indices can be constructed that do not require input of genetic variances

Do Breeders Need to Estimate Genetic Variances?


• Provides valuable baseline information for breeding initiatives for minor crops, new traits

• For many crops and situations, recurrent selection is more efficient than pedigree selection

• Need to distinguish between genetic and environmental correlations among traits

• Better understanding of environmental influences and GXE is essential for effective, well-targeted breeding efforts

Obtain estimates of genetic variances as an integral part of the breeding program

– progeny trials, mapping populations

– realized selection response, correlated selection response

– monitor changes in genetic variances over time

– accumulate information about inheritance of important traits

Classic approach for estimating genetic variances

• Develop one or more types of progeny

– half sibs, full-sibs, testcrosses, recombinant inbreds

• Evaluate progeny in a set of environments

– representative of potential environments in target region

• Estimate variance components from mean squares in ANOVA (or directly using mixed models)

• Equate variance components with expectation based on covariances among relatives

Number of variance components that can be estimated = number of covariances among relatives in the design

Assumptions

• Relatives are noninbred and belong to a particular random-mating reference population

– estimates apply to that population alone

– relatives must represent a random sample from the population

• parents cannot be selected from the population, or chosen from different populations

• parents can be inbred, as long as their progeny (relatives) are not inbred (use of inbred parents can increase precision)

• The usual assumptions for equilibrium also apply

– diploid inheritance

– no linkage or linkage disequilibrium

• using fully inbred parents may reduce effects of linkage

Fixed vs Random effects

• Fixed effects

– interested in the effects of the treatments per se

– $\sum_i \tau_i = 0$

• Random effects

– treatments are a random sample from a larger reference population that has a mean of 0 and variance $\sigma^2_t$

– objectives are to extend conclusions to all members of the population

– interested in estimating magnitude of variance among and within groups

– $\sum_i t_i \neq 0$ for any given experiment
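The contrast between the two sum constraints can be illustrated with a small simulation. This is only a sketch: the seed, the number of families, and the value of $\sigma_t$ are assumptions chosen for illustration, not values from the course.

```python
import random

random.seed(650)  # arbitrary seed, used only to make the sketch repeatable

# Random effects: five hypothetical families sampled from a reference
# population with mean 0 and standard deviation sigma_t (assumed value).
sigma_t = 2.0
random_effects = [random.gauss(0.0, sigma_t) for _ in range(5)]

# Fixed effects: the same values re-expressed as deviations from their
# own mean, so the constraint sum(tau_i) = 0 holds by construction.
mean_effect = sum(random_effects) / len(random_effects)
fixed_effects = [t - mean_effect for t in random_effects]

print(f"sum of sampled random effects: {sum(random_effects):.3f}")
print(f"sum of fixed effects:          {sum(fixed_effects):.3f}")
```

The sampled effects have an expected value of 0 in the population, but their sum in any one sample generally differs from 0, whereas the fixed effects sum to 0 within the experiment.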

Single-factor analysis, one location

Source      df            MS     Expected Mean Square
Blocks      r-1           MSR    $\sigma^2_e + f\sigma^2_R$
Families    f-1           MSF    $\sigma^2_e + r\sigma^2_F$
Error       (r-1)(f-1)    MSE    $\sigma^2_e$

• Families and blocks are considered to be random effects

$\sigma^2_F = (\mathrm{MSF} - \mathrm{MSE})/r$

$\sigma^2_F = \mathrm{Cov}_{\text{Family}}$

However, the estimate of additive genetic variance will be biased upward if there is GXE or epistasis

Single-factor analysis, multiple environments

• An environment could be a location or a different year or season at the same location

• Environments are generally considered to be random, because we want to make inferences about the performance that could be expected at other potential sites in the target production environment

• Specific environments, such as irrigation, fertilizer levels, temperature or daylength regimes, would be fixed effects

• Note that aspects of the experimental design (blocks, locations) are often treated as fixed effects in molecular studies where the objective is to make associations between markers and phenotypes.

Source              df              MS      Expected Mean Square
Years               y-1
Blocks/Years        y(r-1)
Families            f-1             MSF     $\sigma^2_e + r\sigma^2_{FY} + ry\sigma^2_F$
Families x Years    (f-1)(y-1)      MSFY    $\sigma^2_e + r\sigma^2_{FY}$
Error               y(r-1)(f-1)     MSE     $\sigma^2_e$

$\sigma^2_F = (\mathrm{MSF} - \mathrm{MSFY})/(ry)$

$\sigma^2_F = \mathrm{Cov}_{\text{Family}}$

Not biased by GXE
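As a minimal sketch of how the variance components follow from the mean squares, assume the expected mean squares are $\sigma^2_e + r\sigma^2_{FY} + ry\sigma^2_F$ for families, $\sigma^2_e + r\sigma^2_{FY}$ for families x years, and $\sigma^2_e$ for error. The function name and the input mean squares below are hypothetical.

```python
def variance_components(msf, msfy, mse, r, y):
    """Method-of-moments estimates from the expected mean squares.

    msf, msfy, mse: mean squares for families, families x years, error
    r: number of blocks per year; y: number of years
    """
    sigma2_e = mse                      # E(MSE)  = sigma2_e
    sigma2_fy = (msfy - mse) / r        # E(MSFY) = sigma2_e + r*sigma2_FY
    sigma2_f = (msf - msfy) / (r * y)   # E(MSF)  = E(MSFY) + r*y*sigma2_F
    return {"sigma2_F": sigma2_f, "sigma2_FY": sigma2_fy, "sigma2_e": sigma2_e}


# Hypothetical mean squares, chosen only to keep the arithmetic simple
est = variance_components(msf=50.0, msfy=20.0, mse=10.0, r=2, y=3)
print(est)
```

Subtracting MSFY rather than MSE from MSF removes the $r\sigma^2_{FY}$ term, which is why the estimate of $\sigma^2_F$ is not inflated by the family x year interaction.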

Additive genetic variance from single-factor design

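$\sigma^2_F = \mathrm{Cov}_{\text{Family}}$, which is then expressed in terms of $\sigma^2_A$ according to the type of family (for half-sib families from parents with inbreeding coefficient $F$, $\mathrm{Cov}_{\text{Family}} = \frac{1+F}{4}\sigma^2_A$)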

Genotypes divided into sets

Source                   df               MS      Expected Mean Square
Years                    y-1
Sets                     s-1
Years x Sets             (y-1)(s-1)
Blocks/(Years x Sets)    (r-1)ys
Families/Sets            (f-1)s           MSF     $\sigma^2_e + r\sigma^2_{FY} + ry\sigma^2_F$
Years x Families/Sets    (y-1)(f-1)s      MSFY    $\sigma^2_e + r\sigma^2_{FY}$
Error                    (r-1)(f-1)ys     MSE     $\sigma^2_e$

Calculation of $\sigma^2_A$ is the same as before

• Large numbers of families can be divided into sets, and variances can be pooled across sets.

Example – single-factor analysis

• 60 maize S2 lines are allowed to open pollinate; bulked to form half-sib families

• 2 randomized complete blocks, 3 locations

Bernardo, pg 155

Source                  df     MS      Value    Expected Mean Square
Location                2
Blocks/Locations        3
Families                59     MSF     14.36    $\sigma^2_e + r\sigma^2_{FL} + rl\sigma^2_F$
Families x Locations    118    MSFL    6.18     $\sigma^2_e + r\sigma^2_{FL}$
Error                   177    MSE     4.00     $\sigma^2_e$

Are there significant differences among families?

F test: MSF/MSFL = 14.36/6.18 = 2.32

Compare to F critical with 59 and 118 df; Pr > F is < 0.0001
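The degrees of freedom and the F ratio for this example can be checked with a few lines of arithmetic. This is a sketch using the design stated for the example (60 families, 2 blocks, 3 locations); the variable names are illustrative, and the p-value is not computed here.

```python
f, r, l = 60, 2, 3  # families, blocks per location, locations

# Degrees of freedom for each source of variation
df = {
    "Location": l - 1,
    "Blocks/Locations": l * (r - 1),
    "Families": f - 1,
    "Families x Locations": (f - 1) * (l - 1),
    "Error": l * (r - 1) * (f - 1),
}

msf, msfl = 14.36, 6.18
f_ratio = msf / msfl  # families are tested against families x locations

for source, value in df.items():
    print(f"{source}: {value}")
print(f"F = {f_ratio:.2f} on {df['Families']} and "
      f"{df['Families x Locations']} df")
```

Families are tested against the families x locations mean square because, with locations random, the two expected mean squares differ only by the $rl\sigma^2_F$ term.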

What is the level of inbreeding in the S2 parents?

Expected frequency of heterozygotes: $P_{12} = 2pq(1-F)$

Plants        Families        P12                     F
F2 or S0      F3 or S1        2pq                     0
F3 or S1      F4 or S2        (0.5)(2pq)              0.5
F4 or S2      F5 or S3        (0.25)(2pq)             0.75
F5 or S3      F6 or S4        (0.125)(2pq)            0.875
Fn or Sn-2    Fn+1 or Sn-1    $(1/2)^{n-2}(2pq)$      $1-(1/2)^{n-2}$

• A family represents the alleles of its parents

– Collectively, an S1 family has the same distribution of alleles as the S0 plant from which it was derived

• The distinction between plants and families decreases as F approaches 1
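The pattern in the table can be sketched as a pair of functions, assuming selfing starts from a non-inbred S0 plant (F = 0). The function names and the allele frequency p = 0.5 are illustrative choices, not part of the course material.

```python
def inbreeding_after_selfing(generations):
    """F of plants after the given number of selfing generations from S0."""
    return 1 - 0.5 ** generations


def heterozygote_frequency(p, generations):
    """Expected heterozygote frequency, P12 = 2pq(1 - F)."""
    q = 1 - p
    return 2 * p * q * (1 - inbreeding_after_selfing(generations))


# p = 0.5 is an assumed allele frequency used only for illustration
for s in range(4):
    print(f"S{s} plants: F = {inbreeding_after_selfing(s)}, "
          f"P12 = {heterozygote_frequency(0.5, s)}")
```

Each generation of selfing halves the expected heterozygosity, so F moves halfway toward 1 at every step.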

$\sigma^2_F = (\mathrm{MSF} - \mathrm{MSFL})/(rl) = (14.36 - 6.18)/(2 \times 3) = 1.36$


Estimate additive genetic variance

$\sigma^2_A = \frac{4}{1+F}\,\sigma^2_F = \frac{4}{1+\frac{1}{2}}(1.36) = 3.63$
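The two steps of this example, estimating $\sigma^2_F$ and converting it to $\sigma^2_A$, can be reproduced as follows. This is a sketch that takes F = 1/2, the value used in the slide's calculation, and the mean squares from the example; the variable names are illustrative.

```python
msf, msfl = 14.36, 6.18   # mean squares for families and families x locations
r, l = 2, 3               # blocks per location, locations
F = 0.5                   # inbreeding coefficient used in the slide's calculation

sigma2_f = (msf - msfl) / (r * l)     # variance among half-sib families
sigma2_a = 4 / (1 + F) * sigma2_f     # Cov(half sibs) = (1 + F)/4 * sigma2_A

print(f"sigma2_F = {sigma2_f:.2f}")
print(f"sigma2_A, full precision carried through = {sigma2_a:.2f}")
print(f"sigma2_A, sigma2_F rounded to 1.36 first = "
      f"{4 / (1 + F) * round(sigma2_f, 2):.2f}")
```

Carrying full precision gives about 3.64; the value 3.63 results from rounding $\sigma^2_F$ to 1.36 before multiplying.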

Heritability based on family means

• For animals, a family consists of multiple progeny from an individual

– each of the progeny is a replicate

– usually measure variance among progeny within each family

• For plants, we usually take collective measurements of multiple plants in a plot, and replicate the plots across reps and environments

• Heritabilities in plants are usually expressed on the basis of family means. Their meaning varies with plot size, number of replications, and number of environments

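$h^2 = \frac{\mathrm{Cov}(G,P)}{\sigma^2_P} = \frac{\sigma^2_G}{\sigma^2_P} = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_{\bar X}} = \frac{\sigma^2_G}{\sigma^2_G + \frac{\sigma^2_{GL}}{l} + \frac{\sigma^2_e}{rl}}$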

Variance of family means

$\sigma^2_{\bar X} = \frac{\mathrm{MS}_{\text{error}}}{rl}$

– $\mathrm{MS}_{\text{error}}$ is the appropriate error term for families

– $rl$ is the number of observations on each family


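$\sigma^2_{\bar X} = \frac{\mathrm{MSFL}}{rl} = \frac{6.18}{2 \times 3} = 1.03$ (think of this as the square of the standard error of a family mean)

$\sigma^2_P = \frac{\mathrm{MSF}}{rl} = \frac{\sigma^2_e + r\sigma^2_{FL} + rl\sigma^2_F}{rl} = \frac{\sigma^2_e}{rl} + \frac{\sigma^2_{FL}}{l} + \sigma^2_F = \frac{14.36}{2 \times 3} = 2.39$

$\sigma^2_P = \sigma^2_F + \sigma^2_{\bar X} = 1.36 + 1.03 = 2.39$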

Heritability on a family mean basis

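$h^2 = \frac{\mathrm{Cov}(G,P)}{\sigma^2_P} = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_{\bar X}} = \frac{\sigma^2_G}{\sigma^2_G + \frac{\sigma^2_{GL}}{l} + \frac{\sigma^2_e}{rl}}$

$h^2 = \frac{\sigma^2_F}{\sigma^2_F + \frac{\sigma^2_{FL}}{l} + \frac{\sigma^2_e}{rl}} = \frac{1.36}{1.36 + 1.03} = 0.57$

where $\sigma^2_F = \frac{1+F}{4}\sigma^2_A$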
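The heritability calculation can be traced from the three mean squares of the example. This is a sketch: the variable names are illustrative, and the last line uses an algebraic consequence of the formulas (both numerator and denominator share the divisor rl), not a separate method from the course.

```python
msf, msfl, mse = 14.36, 6.18, 4.00   # mean squares from the example
r, l = 2, 3                          # blocks per location, locations

# Variance components from the expected mean squares
sigma2_e = mse
sigma2_fl = (msfl - mse) / r
sigma2_f = (msf - msfl) / (r * l)

# Error variance of a family mean; algebraically equal to MSFL / (r*l)
var_family_mean = sigma2_fl / l + sigma2_e / (r * l)

# Phenotypic variance of family means; algebraically equal to MSF / (r*l)
sigma2_p = sigma2_f + var_family_mean

h2 = sigma2_f / sigma2_p

print(f"sigma2_FL = {sigma2_fl:.2f}, sigma2_e = {sigma2_e:.2f}")
print(f"variance of a family mean = {var_family_mean:.2f}")
print(f"sigma2_P = {sigma2_p:.2f}")
print(f"h2 (family-mean basis) = {h2:.2f}")
print(f"1 - MSFL/MSF = {1 - msfl / msf:.2f}")
```

Because $\sigma^2_F = (\mathrm{MSF} - \mathrm{MSFL})/(rl)$ and $\sigma^2_P = \mathrm{MSF}/(rl)$, the ratio simplifies to $1 - \mathrm{MSFL}/\mathrm{MSF}$, which gives a quick check on the result.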