Genetics of barley grain size Arnis Druka IAEA Regional Training Course Phenotypical Characterization of Mutants 19-23 May 2013 Amman, Jordan



The outline

1) Introduction – modelling grain yield from a geneticist's point of view

2) Generation and analysis of the phenotype data matrix

3) Generation and genotyping of a Recombinant Inbred Line (RIL) population

4) Linkage mapping using J/QTL

5) Known rice grain size genes – how good are they as candidates?

6) What are Bowman backcross-derived lines good for?

Today

Enhancing gene prediction using Bowman backcross-derived lines and barley-rice synteny

Thursday, May 23

Germplasm identification

STW trait

Changing gene action

Gene identification

How does it work?

Genetic linkage mapping

Molecular genetics golden promise

Germplasm – what’s available?

• ‘Immortal’ bi-parental populations (e.g. GPMx)
• Multi-parental crosses
• Random collections
• Association panels
• Original ‘off-type’ lines (induced and spontaneous variants)
• Backcross-derived lines
• F2 populations
• F1 seeds

Udda Lundqvist

• Generated a large collection of induced barley mutant lines (~10,000)

• Performed extensive allelism testing

Mutagenesis experiment

[Figure: WT plant and mutant lines 7, 196 and 20927]

Mutant line 7, mutant line 196 and mutant line 20927 have the same phenotype, ph.

What happened?

• Three independent mutations in one gene?
• Or do three different genes determine phenotype ph?

To know which is which:

1. Cross each mutant line to every other

2. Plant F1 seed

3. Observe F1 plant ph phenotype

4. Make sure that the crosses worked by genotyping F1 plants

Essential, but may not be straightforward

[Diagram: the three pairwise crosses among mutant lines 7, 196 and 20927. F1 phenotypes: one cross gives ph, the other two give WT]

• The ph phenotype can be induced by mutations in two genes
• Two of the mutations are different alleles of one of these genes
• This is called an allelism test
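The logic of the allelism test can be sketched in a few lines of Python. This is a minimal illustration, not a record of the actual experiment: the slide does not say which of the three crosses gave the ph F1, so the assignment in `f1_phenotype` below is a hypothetical example, and the function name `complementation_groups` is made up for this sketch. Lines whose F1 still shows ph (no complementation) are placed in the same group, i.e. treated as alleles of one gene.

```python
from itertools import combinations


def complementation_groups(lines, f1_phenotype):
    """Group mutant lines whose pairwise F1 shows the mutant phenotype.

    Lines in the same group fail to complement each other and are
    therefore treated as carrying mutations in the same gene.
    """
    parent = {line: line for line in lines}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b in combinations(lines, 2):
        if f1_phenotype[frozenset((a, b))] == "ph":
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for line in lines:
        groups.setdefault(find(line), []).append(line)
    return sorted(sorted(g) for g in groups.values())


lines = [7, 196, 20927]

# HYPOTHETICAL outcome: one cross gives ph, two give WT (as on the slide),
# but which pair gave ph is an assumption made for this example.
f1_phenotype = {
    frozenset((7, 196)): "ph",
    frozenset((7, 20927)): "WT",
    frozenset((196, 20927)): "WT",
}

groups = complementation_groups(lines, f1_phenotype)
print(f"{len(groups)} gene(s): {groups}")
```

With this assumed outcome, two groups are returned: one gene with two alleles and a second gene with one.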

• Udda Lundqvist performed such allelism tests for many phenotypes, in many cases identifying 20-30 alleles (called an allelic series)

• Induced allelic series are a critical resource for barley gene validation because:
    • The probability of multiple deleterious mutations occurring in the same gene under natural conditions is extremely low
    • Phenotypes are generally stronger

Barley Genetics Newsletter

V26

Graingenes

http://wheat.pw.usda.gov/

Bowman backcross-derived lines

Jerome Franckowiak

• Collected and generated a large number of mutant lines, including many of Udda’s mutants
• Author of the cultivar Bowman
• Performed a monumental backcrossing program, generating ~1000 Bowman backcross-derived lines

Jerry Franckowiak, North Dakota State University, Fargo, USA

Jerry Franckowiak, Hermitage Research Station (QAAFI), Warwick, Queensland, Australia

Jerry Franckowiak, Bowman & BCLs

Phenotypes

OREGON WOLFE BARLEYS DHL population

Pat Hayes, Oregon State University, Corvallis, USA

Mutagens

Original cultivars

[Diagram: backcrossing scheme. A homozygous mutant line is crossed to cultivar Bowman; the F1 generation is selfed and recombinant mutant plants are selected by phenotype; the selected plants are crossed to cultivar Bowman again, and the cycle of crossing to Bowman and selection by phenotype is repeated]

Bowman population structure

How do we know that after > 4 rounds of BC there is a single, small introgression left?

Example line: BW999
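As background to this question, the textbook expectation for how fast donor genome is lost can be sketched as below. This is an illustration under stated assumptions, not part of the original analysis: it applies only to regions unlinked to the selected mutation, it counts the crosses to Bowman made after the initial mutant x Bowman cross, and the function name is made up for this sketch. Individual lines vary around this average, and the segment linked to the mutation is retained by selection, which is why the lines are checked by genotyping.

```python
def expected_donor_fraction(n_backcrosses):
    """Expected donor share of the genome at loci unlinked to the mutation.

    The F1 of mutant x Bowman is 50% donor; each further cross to Bowman
    halves that share on average. Selfing between crosses does not change
    the expectation for unselected loci.
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be zero or positive")
    return 0.5 ** (n_backcrosses + 1)


for n in range(1, 7):
    print(f"BC{n}: {expected_donor_fraction(n):.2%} donor genome expected")
```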

Beadarray technology and the data sets

[Figure: scatter plots of BW999 signal against Bowman signal, one panel for the Cy3 channel and one for the Cy5 channel]

About 6,000 data points (1536 x 4 = 6,144)

9 polymorphic SNPs

How are they spread across the genome?

BW999: SNP positions

SNP        position (Mb)
SNP 47     541
SNP 169    542
SNP 589    544
SNP 1675   544
SNP 469    546
SNP 800    547
SNP BEE    547
SNP 504    547
SNP 9      548

Introgression size

548 - 541 = 7 Mb

chr     cM     Mb
1H      133    581
2H      149    628
3H      155    565
4H      115    543
5H      169    560
6H      127    539
7H      140    601
total   988    4,016

number of genes     30,400
genome size (Mb)    5,300
genes/cM            30.76
genes/Mb            5.7
Mb/cM               5.4


Number of genes

7 Mb x 5.7 genes/Mb ≈ 40 genes
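The arithmetic above can be reproduced with a short Python sketch. The SNP positions, gene number and genome size are copied from the slides; the function names are made up for this example, and the gene estimate assumes genes are spread evenly along the genome, which is a simplification. The size is also a lower bound, since the introgression may extend beyond the outermost polymorphic SNPs.

```python
# SNP positions (Mb) for the BW999 introgression, copied from the slide.
snp_positions_mb = {
    "SNP 47": 541, "SNP 169": 542, "SNP 589": 544,
    "SNP 1675": 544, "SNP 469": 546, "SNP 800": 547,
    "SNP BEE": 547, "SNP 504": 547, "SNP 9": 548,
}

# Genome-wide figures from the slide's summary table.
NUMBER_OF_GENES = 30_400
GENOME_SIZE_MB = 5_300


def introgression_size_mb(positions):
    """Distance between the outermost polymorphic SNPs, in Mb."""
    return max(positions.values()) - min(positions.values())


def estimated_gene_count(size_mb, genes=NUMBER_OF_GENES, genome_mb=GENOME_SIZE_MB):
    """Gene estimate assuming an even gene density (a simplification)."""
    return size_mb * genes / genome_mb


size_mb = introgression_size_mb(snp_positions_mb)
gene_estimate = estimated_gene_count(size_mb)
print(f"{size_mb} Mb, about {gene_estimate:.0f} genes")
```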

Bowman backcross-derived line genotyping experiment

976 Bowman lines x 1536 SNPs each x 2 signals per SNP = 2,998,272 data points

[Figure: scatter plot of Cy3 against Cy5 signal for all 3 million data points, drawn in DataDesk]

[Figures: distributions of the number of polymorphic SNPs, the introgression size and the number of introgressions, based on genotyping of 976 Bowman backcross-derived lines with 1536 known SNPs]

Barley – rice synteny

Why can synteny-based gene predictions work?

• Nucleotide sequences around SNPs are known
• SNPs come from barley gene sequences
• Rice genome sequence and identities of many genes are known
• Barley and rice proteins have similar amino acid sequences
• Multiple chromosomal regions have similar gene order in barley and rice

Data: Thiel et al. 2009; Circos: Krzywinski 2009

HARVEST: www.harvest-web.org

Tim Close, University of California Riverside, USA

SNP positions (SNP 47 to SNP 9, 541-548 Mb)

• SNPs come from barley gene sequences
• Use these sequences to identify rice gene homologs
• See if synteny can be established
• If yes, see what is interesting in rice

The National Center for Biotechnology Information (NCBI)

Set up by the US Government

http://www.ncbi.nlm.nih.gov/

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215: 403–410

BLAST - Basic Local Alignment Search Tool

David Lipman

http://rice.plantbiology.msu.edu

Kawahara, Y., de la Bastide, M., Hamilton J. P., Kanamori, H., McCombie, W. R., Ouyang, S., Schwartz, D. C., Tanaka, T., Wu, J., Zhou, S., Childs, K. L., Davidson, R. M., Lin, H., Quesada-Ocampo, L., Vaillancourt, B., Sakai, H., Lee, S. S., Kim, J., Numa, H., Itoh, T., Buell, C. R., Matsumoto, T. 2013 Improvement of the Oryza sativa Nipponbare reference genome using next generation sequence and optical map data. Rice 6:4.

Robin Buell

Considerations when using the synteny approach

• Not all genes are in syntenic positions
• Phenotype equivalency problem
• Missing and/or diverged genes

(Educated) guessing approach to cloning the gene underlying the mat-a phenotype

Mats Hansson & Team Barley, Carlsberg Laboratory

Application: cloning eam8 (or mat-a) gene

difference in heading date (5-10 days)

mat phenotype in barley

[Image: wild type and mutant plants]

Image from: Nat Genet. 2012 Dec;44(12):1388-92.

Cloning eam8 (or mat-a) gene using backcross-derived lines

original species  gene name                      gene symbol  accession     rice homolog/ortholog
arabidopsis       agamous like 19                agl19        NM_118424     LOC_Os03g03100
arabidopsis       agamous like 20                agl20        NM_130128     LOC_Os10g39130
arabidopsis       agamous like 24                agl24        NM_118587     LOC_Os06g11330
arabidopsis       apetala1                       ap1          NM_105581     LOC_Os03g54160
arabidopsis       constans                       co           NM_001036810  LOC_Os06g16370
arabidopsis       constitutive morphogenesis1    cop1         NM_128855     LOC_Os02g53140
arabidopsis       cryptochrome1                  cry1         NM_116961     LOC_Os04g37920
arabidopsis       curly leaf                     clf          NM_127902     LOC_Os06g16390
arabidopsis       cycling dof factor1            cdf1         NM_125637     LOC_Os07g48570
arabidopsis       early flowering in short days  efs          NM_001084367  LOC_Os02g34850
arabidopsis       early flowering3               elf3         NM_128153     LOC_Os01g38530
arabidopsis       early flowering5               elf5         NM_125659     LOC_Os07g09510
arabidopsis       early flowering6               elf6         NM_120506     LOC_Os03g05680
arabidopsis       early flowering7               elf7         NM_106622     LOC_Os08g06070
arabidopsis       early flowering8               elf8         NM_201703     LOC_Os07g29360
rice              early heading date1            ehd1         AB092506      LOC_Os10g32600
rice              early heading date2            ehd2         AB359198      LOC_Os10g28330
arabidopsis       early in short days 4          esd4         NM_117680     LOC_Os03g22400
arabidopsis       fca                            fca          NM_179211     LOC_Os09g03610
arabidopsis       fd                             fd           NM_119756     LOC_Os02g52780

Cross-referencing known flowering time genes

marker id           chr  pos (cM)  rice locus      rice chr  rice 5'-end (bp)  rice annotation
ABC11085-1-1-168    1H   132.5     LOC_Os05g50360  Os05      28,771,533        anaphase-promoting complex subunit 10
3_0803              1H   132.5     LOC_Os07g04220  Os07      1,849,699         wound and phytochrome signaling receptor like kinase
6473-811            1H   134.0     LOC_Os05g50480  Os05      28,856,869        expressed protein
ConsensusGBS0554-4  1H   135.6     LOC_Os01g14590  Os01      8,174,691         pathogen-related protein
11454-414           1H   135.6     LOC_Os01g42960  Os01      24,787,885        electron transporter
ConsensusGBS0450-1  1H   135.6     LOC_Os01g69970  Os01      40,794,608        periodic tryptophan protein 1
9022-543            1H   135.6     LOC_Os05g50800  Os05      29,054,191        protein ABIL1
ABC05061-1-1-159    1H   135.6     LOC_Os05g50840  Os05      29,078,396        Grave disease carrier
3_0277              1H   135.6     LOC_Os05g50930  Os05      29,141,766        RNA polymerase sigma factor rpoD
3_0517              1H   135.6     LOC_Os07g08430  Os07      4,331,942         indole-3-glycerol phosphate lyase
3653-519            1H   135.6     LOC_Os09g36680  Os09      21,157,464        ribonuclease 3 precursor
3671-59             1H   135.6     LOC_Os09g36700  Os09      21,167,169        extracellular ribonuclease LE precursor
3639-969            1H   136.3     LOC_Os05g51180  Os05      29,272,704        plasminogen activator inhibitor 1 RNA-binding protein
4927-1340           1H   137.8     LOC_Os05g51450  Os05      29,416,207        endopeptidase Clp
5316-739            1H   138.3     LOC_Os05g51470  Os05      29,423,748        cupin, RmlC-type
4057-2114           1H   138.3     LOC_Os05g51480  Os05      29,428,583        DNA damage binding protein 1a
5222-919            1H   138.3     LOC_Os05g51500  Os05      29,451,410        eukaryotic translation initiation factor 5B
3_0231              1H   138.3     LOC_Os05g51540  Os05      29,470,870        expressed protein
3246-1135           1H   138.9     LOC_Os05g51530  Os05      29,468,418        vacuolar ATP synthase subunit C
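One simple way to see whether synteny can be established from such a table is to ask which rice chromosome most of the barley markers hit, and what interval those hits span. The sketch below does this for the 19 markers listed above (values copied from the table; rice positions are in base pairs). The variable names and the "most common chromosome" criterion are assumptions made for this illustration, not necessarily the procedure used in the original analysis.

```python
from collections import Counter

# (marker id, barley 1H position in cM, rice chromosome, rice 5'-end in bp)
hits = [
    ("ABC11085-1-1-168", 132.5, "Os05", 28_771_533),
    ("3_0803", 132.5, "Os07", 1_849_699),
    ("6473-811", 134.0, "Os05", 28_856_869),
    ("ConsensusGBS0554-4", 135.6, "Os01", 8_174_691),
    ("11454-414", 135.6, "Os01", 24_787_885),
    ("ConsensusGBS0450-1", 135.6, "Os01", 40_794_608),
    ("9022-543", 135.6, "Os05", 29_054_191),
    ("ABC05061-1-1-159", 135.6, "Os05", 29_078_396),
    ("3_0277", 135.6, "Os05", 29_141_766),
    ("3_0517", 135.6, "Os07", 4_331_942),
    ("3653-519", 135.6, "Os09", 21_157_464),
    ("3671-59", 135.6, "Os09", 21_167_169),
    ("3639-969", 136.3, "Os05", 29_272_704),
    ("4927-1340", 137.8, "Os05", 29_416_207),
    ("5316-739", 138.3, "Os05", 29_423_748),
    ("4057-2114", 138.3, "Os05", 29_428_583),
    ("5222-919", 138.3, "Os05", 29_451_410),
    ("3_0231", 138.3, "Os05", 29_470_870),
    ("3246-1135", 138.9, "Os05", 29_468_418),
]

# Which rice chromosome do most markers in this barley interval hit?
chrom_counts = Counter(chrom for _, _, chrom, _ in hits)
best_chrom, best_count = chrom_counts.most_common(1)[0]

# Rice interval spanned by the hits on that chromosome.
positions = [pos for _, _, chrom, pos in hits if chrom == best_chrom]
interval = (min(positions), max(positions))

print(f"{best_count} of {len(hits)} markers hit {best_chrom}")
print(f"rice interval: {interval[0]:,} to {interval[1]:,} bp")
```

Genes annotated inside (or close to) the resulting rice interval are then the ones worth inspecting as candidates.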

Candidate genes (rice homologs and their 5'-end positions):

gene   rice locus        rice 5'-end (bp)
elf3   LOC_Os01g38530    21,638,883
spa1   LOC_Os05g49590    28,382,352
ft     LOC_Os01g11940    6,493,516

Identifying eam8 (or mat-a) candidate genes using barley-rice synteny

Re-sequence each candidate gene from some of the 87 available mat-a mutant lines.

Causal gene: the one with multiple deleterious sequence variants.

In this case it turned out to be elf3.
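The selection rule (pick the candidate with deleterious variants in several independent mutant lines) can be sketched as a simple tally. Everything in the example data is hypothetical: the line names, the variant calls and the function name are invented for illustration and are not the real re-sequencing results. Only the three candidate gene symbols come from the slides.

```python
from collections import defaultdict

CANDIDATES = ["elf3", "spa1", "ft"]  # candidate genes named on the slides


def rank_candidates(variant_calls, candidates=CANDIDATES):
    """Rank genes by the number of mutant lines with a deleterious variant.

    variant_calls: iterable of (mutant_line, gene, is_deleterious) tuples.
    Returns (gene, line_count) pairs, highest count first.
    """
    lines_per_gene = defaultdict(set)
    for line, gene, is_deleterious in variant_calls:
        if is_deleterious:
            lines_per_gene[gene].add(line)
    counts = [(gene, len(lines_per_gene[gene])) for gene in candidates]
    return sorted(counts, key=lambda item: (-item[1], item[0]))


# HYPOTHETICAL variant calls, for illustration only.
example_calls = [
    ("line_A", "elf3", True),
    ("line_B", "elf3", True),
    ("line_C", "elf3", True),
    ("line_A", "spa1", False),
    ("line_B", "ft", False),
    ("line_C", "spa1", True),
]

ranking = rank_candidates(example_calls)
for gene, n_lines in ranking:
    print(f"{gene}: deleterious variant in {n_lines} line(s)")
```

A gene hit in many independent lines stands out because, as noted earlier, multiple deleterious mutations in one gene are very unlikely to arise by chance.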

Validation of elf3 as the eam8 (or mat-a) gene using Udda’s allelic series

BARCODE crosses

David Harrap, KWS UK Ltd

F1 seeds & F2 families

Reference

THANK YOU!

All presentations and data sets are available for downloading from here:

http://www.ilze-arnis.net/jordan_2013