CAPFITOGEN English presentations for workshops
Mauricio Parra Quijano
FAO consultant, International Treaty on Plant Genetic Resources for Food and Agriculture
CAPFITOGEN Program Coordinator
GEOQUAL
Evaluates the quality of the geo-referencing at a given collecting site indicated in passport data
Geo-referencing and passport data
40° 20’ 33.4’’ N, 03° 11’ 52.1’’ W
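As a small illustration of how the sexagesimal coordinates on this slide relate to decimal degrees, the sketch below converts degrees, minutes and seconds into a signed decimal value. The function name `dms_to_decimal` is hypothetical and is not part of GEOQUAL.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert sexagesimal coordinates to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative by convention.
    return -value if hemisphere in ("S", "W") else value


# The coordinates shown on the slide.
lat = dms_to_decimal(40, 20, 33.4, "N")
lon = dms_to_decimal(3, 11, 52.1, "W")
print(round(lat, 6), round(lon, 6))
```

A coordinate recorded only to the degree or the minute carries less positional detail than one recorded to the second, which is the kind of difference a precision check can pick up.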
Why should we evaluate the geo-referencing quality?
[Figure: the recorded coordinates and the true site, separated by x km]
Potential effects of poor geo-referencing
[Figure: temperature bands of 5 °C, 6 °C, 7 °C, 8 °C and 9 °C, with scale marks of 10 km and 1 km]
Level Value
ORIGCTY CRI
ADM1 Punta Arenas
ADM2 Buenos Aires
ADM3 NA
ADM4 NA
Level Value
ORIGCTY CRI
ADM1 Punta Arenas
ADM2 Pérez Zeledón
ADM3 NA
ADM4 NA
Description of the collecting site
Error describing the collecting sites
GEOQUAL features
• GEOQUAL is a tool that assigns a quality value to the passport data of a germplasm collection that include coordinates.
• The user enters the passport data in FAO-Bioversity 2012 format.
• GEOQUAL calculates three parameters, COORQUAL, LOCALQUAL and SUITQUAL, along with other sub-parameters.
• The parameters are summed to generate both TOTALQUAL (0-60 range) and TOTALQUAL100 (0-100 range).
COORQUAL
Parameter that determines the intrinsic quality of the coordinates included in the passport data. Values from 0 to 20. Sub-parameters:
• ERRORES: Coordinate values that fall outside the coordinate frame
• PRECIS: Precision level, i.e. whether the coordinates are recorded to degrees, minutes or seconds (sexagesimal)
• GEORBLE: Probability of obtaining correct coordinates from the site description
• INTERTEMP: Quality of the coordinates according to collection year
• * GEOREFMETH: System by which the coordinates were assigned
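A check for values beyond the coordinate frame can be sketched as follows. This is only an illustration under the assumption that coordinates are held in decimal degrees. The function name and the sample records are hypothetical, and the sketch does not reproduce how GEOQUAL scores ERRORES.

```python
def coordinates_in_frame(lat, lon):
    """Return True when decimal-degree coordinates lie inside the valid frame."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


# Hypothetical passport records: (ACCENUMB, latitude, longitude).
records = [
    ("a", 40.3426, -3.1978),  # inside the frame
    ("b", 95.0, 10.0),        # latitude beyond +90
    ("c", 12.5, -190.2),      # longitude beyond -180
]

flagged = [acc for acc, lat, lon in records if not coordinates_in_frame(lat, lon)]
print(flagged)
```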
SUITQUAL
Parameter that assigns a quality value to the coordinates according to the appropriateness of the collection site for plant growth. Values from 0 to 20.
• It distinguishes between cultivated and wild plants (SAMPSTAT)
• It uses land-use information from the Global Land Cover map (1 km resolution)
[Figure: SUITQUAL scale from 0 to 20 against distance from the coastline; classes shown: > 30 km, 10-20 km, 5-10 km, 0-1 km, ground level]
[Figure: SUITQUAL maps at lower and higher resolution (zoomed in)]
LOCALQUAL
Parameter derived from comparing the site (locality) description with the administrative data obtained from the coordinates, both taken from the user’s passport data.
• The administrative geo-referenced information is extracted from the GADM database
• Character strings are compared by computing a Levenshtein distance: the insertions, deletions and substitutions needed to turn one string into the other are counted, and the two strings are treated as equal when that count is small enough. This is done with the function "agrep" in R
• According to the number of correct matches, a value ranging from 0 to 20 is assigned.
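The string comparison described above can be sketched with a plain Levenshtein distance. This is a minimal illustration, not the "agrep" call that GEOQUAL uses. The helper `approx_equal`, its 0.2 tolerance and the spelling "Puntarenas" used as the GADM-side name are assumptions made for the example.

```python
def levenshtein(a, b):
    """Number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution (0 if equal)
            ))
        previous = current
    return previous[-1]


def approx_equal(a, b, max_fraction=0.2):
    """Treat two names as equal when few edits separate them (assumed rule)."""
    a, b = a.strip().lower(), b.strip().lower()
    return levenshtein(a, b) <= max_fraction * max(len(a), len(b))


print(levenshtein("Punta Arenas", "Puntarenas"))
print(approx_equal("Punta Arenas", "Puntarenas"))
print(approx_equal("Buenos Aires", "Pérez Zeledón"))
```

Under this assumed rule a spelling variant of the same administrative unit counts as a match, while two different unit names do not.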
Passport description   GADM from coordinates   GADM (second option)
ORIGCTY                ISO
ADM1                   NAME1                   VARNAME1
ADM2                   NAME2                   VARNAME2
ADM3                   NAME3                   VARNAME3
ADM4                   NAME4                   VARNAME4
LOCALQUAL
TOTALQUAL = COORQUAL + SUITQUAL + LOCALQUAL
VALUES FROM 0 TO 60
TOTALQUAL100 = (TOTALQUAL*100) / 60
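As a worked example of the summary scores: TOTALQUAL is the sum of three parameters that each range from 0 to 20, and rescaling a 0-60 score to a 0-100 range means multiplying by 100 and dividing by 60. The function names and the sample scores below are hypothetical.

```python
def totalqual(coorqual, suitqual, localqual):
    """Sum the three GEOQUAL parameters, each expected in the 0-20 range."""
    for name, value in (("COORQUAL", coorqual),
                        ("SUITQUAL", suitqual),
                        ("LOCALQUAL", localqual)):
        if not 0 <= value <= 20:
            raise ValueError(f"{name} must be between 0 and 20, got {value}")
    return coorqual + suitqual + localqual


def totalqual100(total):
    """Rescale a 0-60 TOTALQUAL score to the 0-100 range."""
    return total * 100 / 60


score = totalqual(15, 20, 10)
print(score, totalqual100(score))
```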
LOCALQUAL and TOTALQUAL100
TOTALQUAL100: Unified value of the geo-referencing quality
[Maps: legend ranges 0-98, 0-100 and 80-90]
Use of GEOQUAL
ELC maps
It allows the user to create ecogeographical land characterization (ELC) maps that reflect adaptive scenarios for a given species (or group of species) in a specific country or region
How is an ELC map developed?
1. Characterization of a territory
2. Variable selection for each component: bioclimatic, geophysical and edaphic variables
3. For each component: cluster analysis and determination of the optimal number of groups
4. Combination (N bioclimatic * N geophysical * N edaphic) to obtain the categories
5. MAP
6. Description of categories using original variables
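The combination step can be illustrated as follows: each map cell carries one cluster label per component, and each distinct triple of labels becomes one category, giving at most N bioclimatic * N geophysical * N edaphic categories. The numbering scheme in `elc_category` is an assumption for this sketch and is not necessarily the one used by the ELC maps tool.

```python
from itertools import product


def elc_category(bio, geo, eda, n_geo, n_eda):
    """Map three 1-based cluster labels to a single 1-based category id."""
    return (bio - 1) * n_geo * n_eda + (geo - 1) * n_eda + (eda - 1) + 1


# Hypothetical numbers of groups found by the three cluster analyses.
n_bio, n_geo, n_eda = 3, 2, 2

ids = [
    elc_category(b, g, e, n_geo, n_eda)
    for b, g, e in product(range(1, n_bio + 1),
                           range(1, n_geo + 1),
                           range(1, n_eda + 1))
]
print(len(ids), ids)
```

With 3, 2 and 2 groups the sketch yields 12 possible categories. Combinations that never occur in the territory would simply be absent from the resulting map.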
Variable selection I
Expert opinion / knowledge
• Experts on the target species are a valuable source of information
• Surveys (by internet/email, at meetings, workshops, etc.) are an efficient way to gather expert knowledge.
• Variable lists are drawn up by component, with details on the nature of each variable (explanation of codes, units, source, etc.). Each variable is then assigned a value according to its importance for the adaptation of the species.
Literature search on the major factors in the adaptation of the target species
Variable selection II
Screening the variables:
• Redundancy? Correlation? Collinearity?
• Bivariate correlation analysis, PCA, variance inflation factor (VIF; compares linear relationships between variables, only in regression)
• Significance: assessed through a multiple regression analysis with a dependent variable that gives a measurement of adaptation.
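The bivariate correlation check listed above can be sketched in a few lines. The data values are invented for illustration, the 0.9 threshold is an assumption, and the function names are hypothetical. Pairs flagged this way are candidates for dropping one of the two variables.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def redundant_pairs(variables, threshold=0.9):
    """List variable pairs whose absolute correlation exceeds the threshold."""
    return [
        (a, b)
        for a, b in combinations(variables, 2)
        if abs(pearson(variables[a], variables[b])) > threshold
    ]


# Invented values for five sites; the variable names follow the slides.
variables = {
    "alt": [100, 400, 800, 1200, 1500],
    "bio_1": [18.0, 16.1, 13.9, 11.8, 10.2],
    "slope": [2, 9, 3, 12, 5],
}
pairs = redundant_pairs(variables)
print(pairs)
```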
What type of map do you need?
Depending on the approach of the analysis, the ELC map can be:
1. Generalist map: defines the major environments for a large number of species (related or not). For most of these species, the ELC map should discriminate different adaptive scenarios in a given target area. A poor fit between the resulting map and the adaptive characteristics of some smaller groups of species is to be expected (see Parra-Quijano et al., 2012).
2. Map by species / gene pool / group of related species (specific map): defines in more detail the key environments for a particular species or a limited set of genetically related species. A good fit between the map and the adaptive characteristics of the target species is expected.
ELC maps tool results
• Maps (which can be opened with DIVA-GIS) and tables describing each category.
ECOGEO
It allows the user to perform an ecogeographical characterization of geo-referenced collecting sites
[Figure: examples of germplasm characterization: internode length measured on a 0-10 cm scale (= 5.56 cm) and a presence/absence matrix (present = 1, absent = 0)]
ECOGEO is a characterization NOT of the germplasm but of the collecting site
Process of ecogeographical characterization
Passport data (including coordinates X, Y)
GIS layers: Elevation, Average Annual Temp, Soil Organic Carbon, Soil pH, …
Characterization matrix: Rows: Germplasm identifier. Columns: Ecogeographical descriptors
Point or radial extraction?
[Figure: raster grid of ecogeographical variable X; cells hold the values 1 to 4 or NA]
ACCENUMB VARIABLE
a NA
b NA
c 2
[Figure: the raster grid with accessions a, b and c plotted on it]
Distribution of passport data entries
[Figure: GIS overlap of the passport data entries on the raster grid]
Extraction results
ACCENUMB VARIABLE
a NA (1)
b 1
c 3
[Figure: accessions a, b and c with their true locations; labels: GEOQUAL uncertainty (a=68, b=65, c=50), radius]
Radial extraction
ACCENUMB   CAPTURED VALUES   AVERAGE
a          NA,1,1            1
b          NA,1,1            1
c          3,2,1,3,2,3       2.333
GIS overlap
Results of radial extraction
ACCENUMB   Correct extraction   Point extraction   Radial extraction
a          1                    NA                 1
b          1                    NA                 1
c          3                    2                  2.333
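The averages in the radial extraction table (NA,1,1 gives 1; 3,2,1,3,2,3 gives 2.333) can be reproduced by averaging the captured cell values and skipping NA cells. This is a minimal sketch: NA is represented by `None`, and the function name `radial_mean` is hypothetical.

```python
def radial_mean(captured):
    """Average the cell values captured within the radius, skipping NA (None)."""
    valid = [v for v in captured if v is not None]
    if not valid:
        return None  # every captured cell was NA
    return sum(valid) / len(valid)


# Captured values per accession, as listed on the slide.
captured = {
    "a": [None, 1, 1],
    "b": [None, 1, 1],
    "c": [3, 2, 1, 3, 2, 3],
}
averages = {acc: radial_mean(values) for acc, values in captured.items()}
print({acc: round(avg, 3) for acc, avg in averages.items()})
```

Because neighbouring cells contribute to the result, an accession whose recorded point falls on an NA cell can still receive a value.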
Characterization matrix
[Table: characterization matrix of the accessions]
Cluster analysis - Ecogeographic characterization
[Figure: dendrogram produced by hclust (*, "average") on the distance matrix ecogeodist; axis: Height; accessions numbered up to 204]
[Figure: ordination plot (d = 1) with an eigenvalues inset; variables: DECLATITUDE, alt, northness, slope, bio_18, bio_1, t_clay, t_sand, t_oc, t_silt, t_ph_h2o]
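The dendrogram label hclust (*, "average") refers to agglomerative clustering with average linkage: at each step the two clusters with the smallest mean pairwise distance are merged. The sketch below is a plain-Python illustration of that rule on an invented 4 x 4 distance matrix. It is not the R routine itself, and the function name is hypothetical.

```python
def average_linkage(dist):
    """Agglomerative clustering with average linkage.

    dist: symmetric matrix (list of lists) of pairwise distances.
    Returns the merges in order as (members_a, members_b, height) tuples.
    """
    clusters = {i: [i] for i in range(len(dist))}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for x in range(len(keys)):
            for y in range(x + 1, len(keys)):
                a, b = clusters[keys[x]], clusters[keys[y]]
                # Mean of all pairwise distances between the two clusters.
                d = sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))
                if best is None or d < best[0]:
                    best = (d, keys[x], keys[y])
        d, ka, kb = best
        merges.append((clusters[ka], clusters[kb], d))
        # Build a new member list so the recorded merge stays unchanged.
        clusters[ka] = clusters[ka] + clusters.pop(kb)
    return merges


# Invented distances between four collecting sites.
dist = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]
merges = average_linkage(dist)
for members_a, members_b, height in merges:
    print(members_a, members_b, height)
```

The merge heights are the values plotted on the Height axis of a dendrogram.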
Data analysis