Presentation 6 col nucleo_figs_r

Page 1: Presentation 6 col nucleo_figs_r

Mauricio Parra Quijano

FAO consultant, International Treaty on Plant Genetic Resources for Food and Agriculture

CAPFITOGEN Program Coordinator

http://www.capfitogen.net

INTRODUCTION TO the tools ColNucleo & FIGS_R

Page 2: Presentation 6 col nucleo_figs_r

ColNucleo

Obtaining ecogeographical core collections based on ELC maps

Page 3: Presentation 6 col nucleo_figs_r

Again about genetic representativeness

[Figure: three accessions A, B and C with sequences accggtccc, accggtcgc and accggtctc, followed by example groups of accessions labelled A, B and C]

Page 4: Presentation 6 col nucleo_figs_r

When collections are very large (>1000)…

[Figure: a large collection of accessions labelled A, B and C, with subsets selected under three approaches: Random, By genotype, By phenotype]

But not real

Page 5: Presentation 6 col nucleo_figs_r

What information should we use to select?

Characterization

Morphological

Biochemical/Molecular

Agronomic / Physiological / Phytopathology / Entomology

Page 6: Presentation 6 col nucleo_figs_r

Types of core collections according to data

Random

Political / Administrative

Phenotypic (morphological)

Phenotypic (quantitative traits of agronomic interest)

Genotypic (molecular markers - neutral)

Ecogeographical (adaptation to the abiotic environment)

Mixed / Cumulative

Page 7: Presentation 6 col nucleo_figs_r

Ecogeographical core collections

The first ideas about building core collections (CC) from adaptation data date back to 1995

Only in 2000-2010 did the use of GIS become popular in plant genetic resources (PGR)

In 2005 the first ELC map was created

In 2009, two ecogeographical core collections were obtained and validated

Page 8: Presentation 6 col nucleo_figs_r

Ecogeographical core collections

Page 9: Presentation 6 col nucleo_figs_r

Determination of representativeness

Mean, variance, matching ranges, coefficient of variation
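The four statistics listed above can be compared between the whole collection and a candidate core collection. The following is a minimal sketch, assuming one quantitative trait with a non-zero mean. The function name, the percentage formulas and the sample values are illustrative assumptions, not the procedure used in the slides.

```python
from statistics import mean, variance

def representativeness(whole, core):
    """Compare one quantitative trait between a whole collection and a core subset.

    All formulas here are illustrative assumptions; the trait mean must be non-zero.
    """
    m_w, m_c = mean(whole), mean(core)
    v_w, v_c = variance(whole), variance(core)
    range_w = max(whole) - min(whole)
    range_c = max(core) - min(core)
    cv_w = v_w ** 0.5 / m_w  # coefficient of variation, whole collection
    cv_c = v_c ** 0.5 / m_c  # coefficient of variation, core subset
    return {
        "mean_diff_pct": abs(m_c - m_w) / abs(m_w) * 100,
        "variance_diff_pct": abs(v_c - v_w) / v_w * 100,
        "range_coincidence_pct": range_c / range_w * 100,
        "cv_ratio_pct": cv_c / cv_w * 100,
    }

# Hypothetical trait values for 10 accessions and a 3-accession core
whole = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
core = [10, 18, 28]
result = representativeness(whole, core)
print(result)
```

A core that keeps both extremes of the trait, as in this example, retains the full range even though it holds only 30% of the accessions.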

Page 10: Presentation 6 col nucleo_figs_r

Ecogeographical CC vs Phenotypic CC

Page 11: Presentation 6 col nucleo_figs_r

Determination of representativeness

Page 12: Presentation 6 col nucleo_figs_r

What does ColNucleo offer?

Starting with an ELC map (from the ELC mapas tool)

Sampling intensity: 10%, 15%, 20%

[Figure labels: P, C; 1000; 100]
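As an illustration of sampling by ELC category at a given intensity, here is a minimal sketch. It assumes proportional allocation with random selection inside each category; the function name, the accession identifiers and the four categories are hypothetical, and ColNucleo's actual allocation rules may differ.

```python
import random
from collections import defaultdict

def ecogeographic_core(accessions, intensity, seed=0):
    """Select a core from (accession_id, elc_category) pairs.

    Assumption: proportional allocation, at least one accession per category.
    """
    rng = random.Random(seed)
    by_category = defaultdict(list)
    for acc_id, category in accessions:
        by_category[category].append(acc_id)
    selected = []
    for category in sorted(by_category):
        members = by_category[category]
        n = max(1, round(len(members) * intensity))
        selected.extend(rng.sample(members, n))
    return selected

# Hypothetical collection: 1000 accessions spread evenly over 4 ELC categories
collection = [(f"ACC{i:04d}", i % 4) for i in range(1000)]
core = ecogeographic_core(collection, intensity=0.10)
print(len(core))
```

With a 10% sampling intensity, the 1000 hypothetical accessions yield a core of 100, with every ELC category represented.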

Page 13: Presentation 6 col nucleo_figs_r

What does ColNucleo offer?

Seed availability?

Ecogeographical core collection

In addition…

Phenotypic/Genotypic validation is advisable

A further stepwise strategy can be performed by selecting other types of variables (descriptors)

Selecting by pheno/genotypic representativeness, not randomly

Page 14: Presentation 6 col nucleo_figs_r

One or more core collections?

Page 15: Presentation 6 col nucleo_figs_r

FIGS_R

Determination of subsets focused on traits of interest for breeders (Focused Identification of Germplasm Strategy)

Page 16: Presentation 6 col nucleo_figs_r

Why is it so difficult to use germplasm?

Poor visibility of the germplasm collections

Lack of information on the preserved material

The available information is not very useful in practice

Limited accessibility to information

Inaccessibility to germplasm

Limited interest among breeders in using germplasm collections

Page 17: Presentation 6 col nucleo_figs_r

Conflict of interests…

Curators: Representativeness

Breeders: Traits

Page 18: Presentation 6 col nucleo_figs_r

The paradox of the use of PGR

Breeders frequently find collections of 1000 entries or more

They have limited capacity for testing

Breeders use 100 or 150 entries at the most to evaluate a trait of particular interest, as part of their routine activity

Breeders need information (characterization / evaluation data) on the preserved germplasm to make use of it.

PGR curators prioritize efforts to preserve and, only when enough funds are available, to characterize

There are very few evaluation data (or at least very few available)... which consequently leads to almost random selections by breeders…

Funds to characterize and evaluate the germplasm are always scarce or insufficient

Low level of use, reduced interest

Gradual reduction of funds for characterizing/evaluating

Page 19: Presentation 6 col nucleo_figs_r

Focused Identification of Germplasm Strategy

Original idea from Michael Mackay (1986, 1990, 1995)

Phenotype = Genotype + Environment + (GxE)

Identifies germplasm with high probability of containing genetic diversity for the trait of interest

Uses ecogeographical information for the prediction of traits occurrence as a preliminary step to field trials, where breeders ultimately confirm the existence of the trait

No previous efforts on characterization/field evaluation are required and the number of entries that are delivered to the breeders to be evaluated is reduced

Resistance/Tolerance = Genotype + Environment + (GxE)

Generating FIGS subcollections (≠ core collections)

Enhancing the use

Page 20: Presentation 6 col nucleo_figs_r

First approach…

[Figure: FOCUSED IDENTIFICATION OF GERMPLASM STRATEGY. Data layers sieve accessions based on latitude & longitude. Source: Figure from Mackay (1995)]

GIS layers / Ecogeographical variables

Germplasm FILTERED!!!

We use expert knowledge: species experts, breeders, entomologists, phytopathologists

Page 21: Presentation 6 col nucleo_figs_r

Second approach… modeling

Classification method                   AUC    Kappa   Field validation
Principal Component Regression (PCR)    0.69   0.40    ?
Partial Least Squares (PLS)             0.69   0.41    ?
Random Forest (RF)                      0.70   0.42    ?
Support Vector Machines (SVM)           0.71   0.44    ?
Artificial Neural Networks (ANN)        0.71   0.44    ?

Y = b + X1 + X2 + X3 (Y: Resistance/Tolerance; X1, X2, X3: Ecogeographical variables)

Genebank: ICARDA wheat collection. Trait: stem rust (Puccinia graminis).

Source: Bari et al., 2012. Focused identification of germplasm strategy (FIGS) detects wheat stem rust resistance linked to environmental variables. Genet Resour Crop Evol 59(7):1465-1481

Evaluated/characterized germplasm -> Pattern -> Predict on non-evaluated/characterized germplasm
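The AUC and Kappa columns summarize how well a model separates resistant from susceptible accessions. As a hedged illustration of what those two metrics measure, the sketch below computes them in plain Python on made-up labels and scores. The data, the 0.5 threshold and the function names are assumptions and do not reproduce the models or results of Bari et al.

```python
def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def cohen_kappa(labels, predictions):
    """Agreement between observed and predicted classes, corrected for chance."""
    n = len(labels)
    observed = sum(y == p for y, p in zip(labels, predictions)) / n
    expected = sum(
        (labels.count(c) / n) * (predictions.count(c) / n) for c in (0, 1)
    )
    return (observed - expected) / (1 - expected)

# Hypothetical evaluated accessions: 1 = resistant, 0 = susceptible
labels = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical model scores predicted from ecogeographical variables
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
predictions = [int(s >= 0.5) for s in scores]  # assumed 0.5 threshold

print(auc(labels, scores))               # 0.9375
print(cohen_kappa(labels, predictions))  # 0.5
```

An AUC of 0.5 and a Kappa of 0 both correspond to chance-level prediction, which gives a reference point for reading the values in the table.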

Page 22: Presentation 6 col nucleo_figs_r

What does FIGS_R offer?

It generates FIGS subsets via filtering

Passport data table (Y, X)

ECOGEO

Ecogeographical characterization matrix: Elevation, Average Annual Temperature, Edaphic Organic Carbon, Topsoil pH, …

FIGS_R characterizes the collection ecogeographically using the selected variables

Page 23: Presentation 6 col nucleo_figs_r

What does FIGS_R offer?

It uses up to three ecogeographical variables and performs a stepwise selection

Annual Precipitation (primary variable)

Edaphic clay (secondary variable)

Slope (tertiary variable)

Selection intensity (figure labels: 40, 4)

Page 24: Presentation 6 col nucleo_figs_r

What does FIGS_R offer?

It selects entries from a range of values for each variable or a proportion of the distribution of values (e.g. lower 30%), in separate processes for each variable.

PROPORTION OF THE DISTRIBUTION: 40% lower, 35% higher

RANGE: lower value, upper value
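The two selection modes (a range of values, or a proportion of the distribution) applied to up to three variables can be sketched as follows. This is a minimal illustration that assumes the steps are applied one after another; the function names, the variable names and the data are hypothetical and do not come from FIGS_R itself.

```python
def select_by_range(rows, variable, lower, upper):
    """Keep accessions whose value lies within [lower, upper]."""
    return [r for r in rows if lower <= r[variable] <= upper]

def select_by_proportion(rows, variable, fraction, tail="lower"):
    """Keep the lower or higher fraction of the distribution of one variable."""
    ordered = sorted(rows, key=lambda r: r[variable], reverse=(tail == "higher"))
    n = round(len(ordered) * fraction)
    return ordered[:n]

# Hypothetical ecogeographical characterization of 20 accessions
rows = [
    {"id": f"ACC{i:02d}", "annual_precip": 200 + 50 * i,
     "clay": (i * 7) % 40, "slope": i % 5}
    for i in range(20)
]

# Primary variable: the 40% of accessions with the lowest annual precipitation
step1 = select_by_proportion(rows, "annual_precip", 0.40, tail="lower")
# Secondary variable: edaphic clay within an assumed range of 10 to 35
step2 = select_by_range(step1, "clay", 10, 35)
# Tertiary variable: the 50% of the remaining accessions with the lowest slope
step3 = select_by_proportion(step2, "slope", 0.50, tail="lower")

print([r["id"] for r in step3])  # ['ACC05', 'ACC02']
```

Each step narrows the subset passed on to the next one, so the order in which the variables are applied affects the final subset.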

Page 25: Presentation 6 col nucleo_figs_r

What does FIGS_R offer?

Optionally (at the user's choice), it can use an ELC map to balance the selection of accessions, taking the fraction of the distribution from each category
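To illustrate the balancing idea, the sketch below takes the lower fraction of a variable separately within each ELC category instead of over the whole collection. It is a sketch under stated assumptions: the function name, the `elc` field and the data are hypothetical, and the rule of keeping at least one accession per category is an assumption.

```python
from collections import defaultdict

def select_lower_fraction_by_category(rows, variable, fraction):
    """Take the lowest `fraction` of `variable` inside each ELC category."""
    groups = defaultdict(list)
    for r in rows:
        groups[r["elc"]].append(r)
    selected = []
    for category in sorted(groups):
        ordered = sorted(groups[category], key=lambda r: r[variable])
        n = max(1, round(len(ordered) * fraction))  # assumed minimum of one
        selected.extend(ordered[:n])
    return selected

# Hypothetical data: 12 accessions in 3 ELC categories; precipitation falls with i
rows = [
    {"id": f"ACC{i:02d}", "elc": i // 4, "annual_precip": 1000 - 40 * i}
    for i in range(12)
]
balanced = select_lower_fraction_by_category(rows, "annual_precip", 0.25)
print([r["id"] for r in balanced])  # ['ACC03', 'ACC07', 'ACC11']
```

Without the ELC map, the lowest 25% of this hypothetical collection would all come from category 2; taking the fraction per category spreads the selection across the three environments.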

Page 26: Presentation 6 col nucleo_figs_r

What does FIGS_R offer?

Like ColNucleo, it can take into account the availability of the germplasm indicated by the curator.

Page 27: Presentation 6 col nucleo_figs_r

One or more FIGS subsets?

Page 28: Presentation 6 col nucleo_figs_r