Legendre Scale Talk

What are the important spatial scales in an ecosystem?

Pierre Legendre Département de sciences biologiques

Université de Montréal

[email protected] http://www.bio.umontreal.ca/legendre/

“Meet the experts” seminar Université catholique de Louvain, Louvain-La-Neuve, April 19, 2012

Outline of the talk

1. Motivation

2. Variation partitioning

3. Multi-scale analysis

• The dbMEM method

• Simulation study (summary)

4. Several applications to real ecological and data

5. Recent developments: MEM (general forms), AEM

Setting the stage Ecologists want to understand and model spatial [or temporal] community structures through the analysis of species assemblages. • Species assemblages are the best response variable available to estimate the impact of [anthropogenic] changes in ecosystems.

• Difficulty: species assemblages form multivariate data tables (sites x species).

Setting the stage Beta diversity is the variation in species composition among sites.

Beta diversity is organized in communities. It displays spatial structures.

Ile Callot, Finistère. Photo P. Legendre

Stonehenge, Wiltshire, southern England. Photo P. Legendre

Setting the stage Spatial structures in communities indicate that some process has been at work to create them. Two families of mechanisms can generate spatial structures in communities.

Google Maps

Setting the stage Spatial structures in communities indicate that some process has been at work to create them. Two families of mechanisms can generate spatial structures in communities:

• Induced spatial dependence: forcing (explanatory) variables are responsible for the spatial structures found in the species assemblage. They represent environmental or biotic control of the species assemblages, or historical dynamics. Generally broad-scaled. • Community dynamics: the spatial structures are generated by the species assemblage themselves, creating autocorrelation1 in the response variables (species). Mechanisms: neutral processes such as ecological drift and limited dispersal, interactions among species. Spatial structures are generally fine-scaled.

1 Spatial autocorrelation (SA) is technically defined as the dependence, due to geographic proximity, present in the residuals of a [regression-type] model of a response variable y which takes into account all deterministic effects due to forcing variables. Model: yi = f(Xi) + SAi + εi .

Multivariate variation partitioning Borcard & Legendre 1992 [1722 citations]

Borcard & Legendre 1994

and many published application papers

Figure – Venn diagram illustrating a partition of the variation of a response matrix Y (e.g., community composition data) between environmental (matrix X) and spatial (matrix W) explanatory variables. The rectangle represents 100% of the variation of Y.

[a] [b] [c]

[d] = Residuals

Environmental datamatrix X

Spatial datamatrix W

Communitycompositiondata table Y

=

Method described in: Borcard, Legendre & Drapeau, 1992; Borcard & Legendre, 1994; Legendre & Legendre, 2012.

How to combine environmental and spatial variables in modeling community composition data?

• A single response variable: partial multiple regression.

Partial multiple regression is computed as follows: 1. Compute the residuals Xres of the regression of X on W: Xres = X – [W [W'W]–1 W' X] 2. Regress y on Xres

How to combine environmental and spatial variables in modelling community composition data?

• Multivariate data: partial canonical analysis (RDA or CCA).

Response table Explanatory table Explanatory table

XEnvironmental

variables

WSpatial

base functions

YCommunitycomposition

data

Partial canonical analysis is computed as follows: 1. Compute the residuals Xres of the regression of X on W: Xres = X – [W [W'W]–1 W' X] 2. Regress Y on Xres to obtain Yfit . Compute PCA of Yfit .

Geographic base functions

First (simple) representation: Polynomial function of geographic coordinates (polynomial trend-surface analysis).

Example 1: 20 sampling sites in the Thau lagoon, southern France.

z = f(X,Y) = b0 + b1X + b2Y + b3X2 + b4XY + b5Y2 + b6X3 + b7X2Y + b8XY2 + b9Y3^

Small textbook example:

20 sampling sites in the Thau lagoon, southern France.

Response (Y): 2 types of aquatic heterotrophic bacteria (log-transf.)

Environmental (X): NH4, phaeopigments, bacterial production

Spatial (W): selected geographic monomials X2, X3, X2Y, XY2, Y3

In real-life studies, partitioning is carried out on larger data sets.

Spatial eigenfunctions

Second representation: Distance-based Moran’s eigenvector maps

(dbMEM) (formerly called Principal Coordinates of Neighbor Matrices, PCNM)

leading to multiscale analysis in variation partitioning

Borcard & Legendre 2002, 2004

Dray, Legendre & Peres-Neto 2006

Figure – Graphs of ten of the 49 dbMEM eigenfunctions that represent the spatial variation along a transect with 50 equally-spaced points. Abscissa, from left to right: sites 1 to 50. Ordinates: values along the dbMEM eigenfunctions.

+

Principal coordinateanalysis

Eigenvectors

–0

Euclidean distances... n–1

1 2 3 4 5 ...1 2 3 4 5 ...

1 2 3 4 5 ...1 2 3 4 5 ...

1 2 3 4 5

1 2 3 4 5 ...

1 2 3 41 2 3

1 21

Truncated matrix of Euclideandistances = neighbor matrix

1...max

Max x 4

1...max1...max

1...max1...max

1...max1...max

1...max1...max

1...max1

XY

Multiple regressionor canonical analysis

Eigenvectors withpositive eigenvalues= PCNM variables

(+)

x(spatial coordinates)

Data

1 2 3 4 5 6 7 8 9 10 11

Observedvariable

y

Figure – Schematic description of dbMEM analysis. The descriptors of spatial relationships (dbMEM eigenfunctions) are obtained by principal coordinate analysis of a truncated matrix of Euclidean (geographic) distances among the sampling sites.

MEM eigenfunctions display spatial autocorrelation

A spatial correlation coefficient (Moran’s I) can be computed for each MEM. The eigenvalues are actually proportional to Moran’s I. Example: 50-point transect, 49 MEMs.

How to find the truncation distance in 2-dimensional problems?

Compute a minimum spanning tree

Truncation distance ≥ length of the longest link. Longest link here: D(7, 8) = 3.0414

Technical notes on MEM eigenfunctions

MEM variables represent a spectral decomposition of the spatial relationships among the study sites. They can be computed for regular or irregular sets of points in space or time.

MEM eigenfunctions are orthogonal. If the sampling design is regular, they look like sine waves. This is a property of the eigen-decomposition of the centred form of a distance matrix.

⇒ Consider a matrix of 0/1 connections among points, with • 1 between connected points • 0 between unconnected points and on the diagonal Double-centre the matrix to have the sums of rows and columns equal to 0. The eigenvectors of that matrix, plotted on a map of the geographic coordinates of the points, are sine-shaped.

Simulation study

Type I error study Simulations showed that the procedure is honest. It does not generate more significant results that it should for a given significance level α. Power study Simulations showed that dbMEM (PCNM) analysis is capable of detecting spatial structures of many kinds: • random autocorrelated data, • bumps and sine waves of various sizes, without or with random noise, representing deterministic structures, as long as the structures are larger than the truncation value used to create the dbMEM (PCNM) eigenfunctions. Detailed results are found in Borcard & Legendre 2002.

A difficult test case



.

−6−4−2

02468

1012

g) Data (100%)

0 10 20 30 40 50 60 70 80 90 100

0,00,51,01,5 j) Broad-scale submodel (R2 = 0.058)

PCNM #2

−2−101234

k) Intermediate-scale submodel (R2 = 0.246) PCNM #6, 8, 14

−3−2−1

0123 l) Fine-scale submodel (R2 = 0.128)

PCNM #28, 33, 35, 41

−4−20246 i) Spatial model with 8 PCNM base functions

(R2 = 0.433)

−1,0−0,5

−8

−40

4

8 h) Detrended data

Simulateddata

Step 1:detrending

PCNM analysisof detrended

data

Dep

ende

nt v

aria

ble


Selection of explanatory variables? Significance of the adjustment of a MEM model can be tested using the full set of MEM variables modelling positive spatial correlation, without selection of any kind.

The adjusted R2 gives a correct estimate of the variation explained by the MEM by correcting for the number of explanatory variables in the model.1

Before constructing submodels, forward selection of MEMs can be done by combining two criteria during model selection: the alpha significance level and the adjusted R2 of the model containing all MEM eigenfunctions2.

1 Peres-Neto, P. R., P. Legendre, S. Dray and D. Borcard. 2006. Variation partitioning of species data matrices: estimation and comparison of fractions. Ecology 87: 2614-2625. 2 Blanchet F. G., P. Legendre and D. Borcard. 2008. Forward selection of explanatory variables. Ecology 89: 2623-2632.

Example 1 Regular one-dimensional transect in upper Amazonia1

Data: abundance of the fern Adiantum tomentosum in quadrats. Sampling design: 260 adjacent, square (5 m x 5 m) subplots forming a transect in the region of Nauta, Peru. Questions • At what spatial scales is the abundance of this species structured? • Are these scales related to those of the environmental variables? Pre-treatment • The abundances were square-root transformed • and detrended (significant linear trend: R2 = 0.102, p = 0.001)

1 Data from Tuomisto & Poulsen 2000, reanalysed in Borcard, Legendre, Avois-Jacquet & Tuomisto 2004.

–2

–1

0

1

2

3(a) R2 = 0.815

1 21 41 61 81 101 121 141 161 181 201 221 241 260

-2

-1,5-1

-0,5

0

0,51

1,5

2

1 21 41 61 81 101 121 141 161 181 201 221 241 260

(b) Very-broad-scale submodel, 10 PCNMs, R2 = 0.333

-1,5

-1

-0,5

0

0,5

1

1,5

1 21 41 61 81 101 121 141 161 181 201 221 241 260

(c) Medium-scale submodel, 12 PCNMs, R2 = 0.126

-2

-1,5

-1

-0,5

0

0,5

1

1,5

1 21 41 61 81 101 121 141 161 181 201 221 241 260

(d) Fine-scale submodel, 20 PCNMs, R2 = 0.117

Data

PCNM model (50 significant PCNMs out of 176)

Broad-scale submodel, 8 PCNMs, R2 = 0.239

Forward selection

50 dbMEM eigenfunctions (PCNM) were selected (permutation test, 999 permutations).

The dbMEM were arbitrarily divided into 4 submodels. The submodels are orthogonal to one another.

Significant wavelengths (periodogram analysis):

V-br-scale: 250, 355-440 m Broad-scale: 180 m Medium-scale: 90 m Fine-scale: 50, 65 m

Interpretation: regression on the environmental variables

Use dbMEM in variation partitioning: Adiantum tomentosum at Nauta, Peru (R2

a).

Scalogram of the fern Adiantum tomentosum multiscale structure along another transect called Huanta (Peru). Abscissa: the 129 dbMEM eigenfunctions with positive Moran’s I. Ordinate: absolute values of the t-statistics. The 26 eigenfunctions selected by forward selection (p ≤ 0.05) are identified by black squares.

Example 2 Regular two-dimensional sampling grid

Chlorophyll a in a brackish lagoon1 Data: Chlorophyll a concentrations at 63 sites on a geographic surface. Sampling design: 63 sites forming a regular grid (1-km mesh) in the Thau marine lagoon (19 km x 5 km). Questions • At what spatial scales is chlorophyll a structured? • Are these scales related to the environmental variables? Pre-treatment • None. Forward selection • 12 dbMEM (PCNM) eigenfunctions were selected out of 45. 1 Data first analyzed by Legendre & Troussellier 1988; reanalyzed in Borcard et al. 1992 and

Borcard, Legendre, Avois-Jacquet & Tuomisto 2004.

Example 3 Gutianshan forest plot in China1

• Evergreen forest in Gutianshan Forest Reserve, Zhejiang Province. • Fully-surveyed 24-ha forest plot in subtropical forest, 29º15'N. • Plot divided into 600 cells of 20 m × 20 m. • 159 tree species. Richness: 19 to 54 species per cell. • Data collection: 2005

Legendre, P., X. Mi, H. Ren, K. Ma, M. Yu, I. F. Sun, and F. He. 2009. Partitioning beta diversity in a subtropical broad-leaved forest of China. Ecology 90: 663-674.

1 The Gutianshan forest plot is a member of the Center for Tropical Forest Science (CTFS). Details on the plot available at http://www.ctfs.si.edu/site/Gutianshan/.

Example 3 Gutianshan forest plot in China

Questions • How much of the variation in species composition among sites (beta diversity) is spatially structured? • Of that, how much is related to the environmental variables?

⇒ Four environmental variables developed in cubic polynomial form: Altitude: altitude, altitude2, altitude3 Convexity: convexity, convexity2, convexity3 Slope: slope, slope2, slope3 Aspect (circular variable): sin(aspect), cos(aspect) (Soil cores collected in 2007. Soil chemistry data not available yet.)

⇒  599 dbMEM eigenfunctions. 200 model positive spatial correlation. ⇒  Nearly all of them are significant: spatial variation at all scales.

Example of a regular grid with 8 × 12 = 96 points. Maps showing ten of the 48 dbMEM eigenfunctions that display positive spatial correlation. Shades of grey: values in each eigenvector, from white (largest negative value) to black (largest positive value).

• 63% of the among-cell variation (Ra2) of the community composition

(159 species) is spatially structured and explained by the 339 PCNM.

• Nearly half of that 63% is also explained by the four environmental variables. Soil chemistry to be added to the model when available.

• Scales of spatial variation: the dominant structure is broad-scaled.

⇒ Balance between neutral processes and environmental control.

[a] =0.029

[b] =0.278

[c] =0.348

Variationexplained byEnvironment= 0.307

Variationexplainedby PCNM

= 0.626

Unexplained (residual) = [d] = 0.344

Variation inspecies data Y(beta diversity)

=

Applications of dbMEM eigenfunctions a) We proceeded as follows in the first three examples: • dbMEM analysis of the response table Y; • Division of the significant dbMEM eigenfunctions into submodels; • Interpretation of the submodels using explanatory variables. The objective was to divide the variation of Y into submodels and relate those to explanatory environmental variables.

b) dbMEM eigenfunctions can also be used in the framework of variation partitioning, as in Example #4. The variation of Y is then partitioned with respect to a table of explanatory variables X and (for example) several tables W1, W2, W3, containing dbMEM submodels.

Applications of dbMEM eigenfunctions c) dbMEM can be used to model spatial and temporal variation in the study of spatio-temporal data, and test for the space-time interaction. Refer to the space-time interaction talk, Wednesday.

d) dbMEM can efficiently model spatial structures in data. They can be used to control for spatial autocorrelation in tests of significance of the species-environment relationship (fraction [a]).1

1 Peres-Neto, P. R. and P. Legendre. 2010. Estimating and controlling for spatial structure in the study of ecological communities. Global Ecology and Biogeography 19: 174-184.

The Moran’s eigenvector maps (MEM) method is a generalization of dbMEM (PCNM) to different types of spatial weights. The result is a set of spatial eigenfunctions, as in dbMEM analysis.

B = 0/1 c o n n e c t i v i t y

m a t r i x a m o n g s i t e s * H a d a m a r d

p r o d u c t Stéphane Dray: MEM analysis

Dray, Legendre and Peres-Neto (2006); Legendre and Legendre (2012, Chapter 14).

Eigen-decomposition of a spatial weighting matrix W

A = edge weighting

matrix W =

A site is connected to itself (D = 0)

A site is not connected to itself (D = 4 × threshold)

Difference between classical PCNM and dbMEM

The eigenvalues are different but the eigenvectors (i.e. the spatial eigenfunctions) are the same.

Other forms of Moran’s Eigenvector Maps (generalized MEM) can be constructed (Dray et al. 2006):

• Binary MEM: double-centre matrix B, then compute its eigenvalues and eigenvectors.

• Replace matrix A by some function of the distances.

• Replace A by some other weights, e.g. resistance of the landscape.

B = 0/1 c o n n e c t i v i t y

m a t r i x a m o n g s i t e s * H a d a m a r d

p r o d u c t A = edge weighting

matrix W =

Asymmetric eigenvector maps (AEM) is a spatial eigenfunction method developed to model species spatial distributions generated by an asymmetric, directional physical process.1

Guillaume Blanchet: AEM analysis

1 Blanchet, F.G., P. Legendre and D. Borcard. 2008. Modelling directional spatial processes in ecological data. Ecological Modelling 215: 325-336.

The following slides were provided by F. G. Blanchet from his talk: Modelling directional spatial processes in ecological data. Spatial Ecological Data Analysis with R (SEDAR) Workshop, Université Lyon I, May 26, 2008.

Constructing asymmetric eigenvector maps (AEM eigenfunctions)

Spat

ial a

sym

met

ry

# # Link number

Site number

1 = presence of a link 0 = absence of a link

Link

# # Link number

Site number


Spat

ial a

sym

met

ry

Link


# # Link number

Site number


1 2 3 4 5 6 7 8 9 10 11 12 13 14 … 57

Site 8 0 1 1 1 0 0 0 0 0 1 0 1 0 1 0 0

Spat

ial a

sym

met

ry

Link


Matrix of edges E

Matrix of edges E

Weights can be put on the edges

Compute the AEM eigenfunctions

Eigenfunctions (explanatory spatial variables)

or SVD of matrix E centred by columns

Either the object scores from PCA of matrix E

Maps of some of the AEM eigenfunctions AEM 1 AEM 2 AEM 3

AEM 22 AEM 23 AEM 24

Negative values

Positive values

[...]

Three applications of AEM analysis in the following paper –

Blanchet, F. G., P. Legendre, R. Maranger, D. Monti, and P. Pepin. 2011. Modelling the effect of directional spatial ecological processes at different scales. Oecologia 166: 357-368.

Example 1 – Roxane Maranger, U. de Montréal Bacterial production in Lake St. Pierre

Sampling was done on the morning of August 18th 2005

Frenette et al. - Limnology & Oceanography (2006)

Current direction

AEM model: R2adj = 51.4% using 4 selected AEMs

94 sites were sampled in Rivière Capesterre

Cur

rent

dire

ctio

n Atya innocous

Example 2 – Dominique Monti, UAG

93 AEM variables were constructed based on this connection diagram; 38 measured positive autocorrelation. 12 were selected.

Atya innocous

AEM model: R2adj = 59.8% using 12 selected AEMs

Example 3: 6 larval stages of Calanus finmarchicus on the Newfoundland and Labrador oceanic shelf

– Pierre Pepin, DFO, St. John’s, NL AEM model: R2

adj = 38.4% using 2 selected AEMs

Computer programs in the R statistical language

On R-Forge: http://r-forge.r-project.org/R/?group_id=195 PCNM package (P. Legendre) AEM package (F. G. Blanchet) SPACEMAKER: an R package to compute PCNM and MEM (D. Dray) PACKFOR: R package for selection of explanatory variables (S. Dray) On the CRAN page: http://cran.r-project.org VEGAN package (Oksanen et al. 2011):

function varpart() for multivariate variation partitioning function ordistep() for selection of explanatory variables

A new package, ADESPATIAL, is in preparation, that will contain all functions for spatial eigenfunction analysis presently found on the R-Forge page, and more.

To appear in 2012 (August or before)

References Available in PDF at http://numericalecology.com/reprints/

Blanchet, F. G., P. Legendre, and D. Borcard. 2008a. Modelling directional spatial processes in ecological data. Ecological Modelling 215: 325-336.

Blanchet, F. G., P. Legendre, R. Maranger, D. Monti, and P. Pepin. 2011. Modelling the effect of directional spatial ecological processes at different scales. Oecologia 166: 357-368.

Borcard, D., P. Legendre & P. Drapeau. 1992. Partialling out the spatial component of ecological variation. Ecology 73: 1045-1055.

Borcard, D. & P. Legendre. 1994. Environmental control and spatial structure in ecological communities: an example using Oribatid mites (Acari, Oribatei). Environmental and Ecological Statistics 1: 37-61.

Borcard, D. & P. Legendre. 2002. All-scale spatial analysis of ecological data by means of principal coordinates of neighbour matrices. Ecological Modelling 153: 51-68.

Borcard, D., P. Legendre, C. Avois-Jacquet & H. Tuomisto. 2004. Dissecting the spatial structure of ecological data at multiple scales. Ecology 85: 1826-1832.

References (continued) Available in PDF at http://numericalecology.com/reprints/

Dray, S., P. Legendre & P. Peres-Neto. 2006. Spatial modelling: a comprehensive framework for principal coordinate analysis of neighbour matrices (PCNM). Ecological Modelling 196: 483-493.

Guénard, G., P. Legendre, D. Boisclair, and M. Bilodeau. 2010. Multiscale codependence analysis: an integrated approach to analyze relationships across scales. Ecology 91: 2952-2964.

Oksanen, J., G. Blanchet, R. Kindt, P. Legendre, P. R. Minchin, R. B. O’Hara, G. L. Simpson, P. Solymos, M. H. H. Stevens, and H. Wagner. 2011. vegan: Community Ecology Package. R package version 2.0-0. http://cran.r-project.org/package=vegan.

Peres-Neto, P. R., P. Legendre, S. Dray and D. Borcard. 2006. Variation partitioning of species data matrices: estimation and comparison of fractions. Ecology 87: 2614-2625.

Peres-Neto, P. R. and P. Legendre. 2010. Estimating and controlling for spatial structure in the study of ecological communities. Global Ecology and Biogeography 19: 174-184.

The End

Documents

Legendre Scale Talk