ORIGINAL ARTICLE
doi:10.1111/j.1558-5646.2012.01823.x
TESTING FOR PHYLOGENETIC SIGNAL IN BIOLOGICAL TRAITS: THE UBIQUITY OF CROSS-PRODUCT STATISTICS
Sandrine Pavoine1,2,3 and Carlo Ricotta4
1Muséum national d’Histoire naturelle, Département Ecologie et Gestion de la Biodiversité, UMR 7204 CNRS UPMC, 55–61 rue Buffon, 75005 Paris, France
2Mathematical Ecology Research Group, Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3PS, United Kingdom
3E-mail: [email protected]
4Department of Environmental Biology, University of Rome ‘La Sapienza’, Piazzale Aldo Moro 5, 00185 Rome, Italy
Received May 8, 2012
Accepted September 18, 2012
To evaluate rates of evolution, to establish tests of correlation between two traits, or to investigate to what degree the phylogeny
of a species assemblage is predictive of a trait value, so-called tests for phylogenetic signal are used. Being based on different
approaches, these tests are generally thought to possess quite different statistical performances. In this article, we show that the
Blomberg et al. K and K∗, the Abouheif index, the Moran’s I, and the Mantel correlation are all based on a cross-product statistic,
and are thus all related to each other when they are associated with a permutation test of phylogenetic signal. What changes is only
the way phylogenetic and trait similarities (or dissimilarities) among the tips of a phylogeny are computed. The definitions of the
phylogenetic and trait-based (dis)similarities among tips thus determine the performance of the tests. We briefly discuss the
biological and statistical consequences (in terms of power and type I error of the tests) of the observed relatedness among the statistics
that allow tests for phylogenetic signal. Blomberg et al.’s K∗ statistic appears to be one of the most efficient approaches to test for
phylogenetic signal. When branch lengths are not available or not accurate, Abouheif’s Cmean statistic is a powerful alternative to K∗.
KEY WORDS: Abouheif test, Blomberg et al. K and K∗, equivalent test statistic, Mantel test, Moran’s I, permutation.
Phylogenetic signal is obtained when phylogenetically related
species tend to have more similar trait values than more distantly
related species. It is tested with different aims: (1) to find models
and rates of evolution for explaining extant species’ traits (e.g.,
Blomberg et al. 2003); (2) to find which test approach should be
used to compare two traits in cross-species analyses and whether
phylogenetic information should be included in these tests (e.g.,
Abouheif 1999; but see Rohlf 2006 and Revell 2010); (3) to
elucidate the processes that underpin patterns in phylogenetic
diversity in ecological studies of communities, species interaction
networks, and ecosystem services (e.g., Mouquet et al. 2012).
Variations in trait states may have various levels of associa-
tion with the species phylogenetic history (Hansen and Martins
1996) and a number of different statistics are widely used to test
for phylogenetic signal (see Revell et al. 2008). Some of these tests
are flexible and might be model-dependent or model-free depend-
ing on how phylogenetic proximities/distances among species are
defined. These include the Mantel (1967) test developed to com-
pare, via a correlation, any kind of dissimilarity matrices (see also
Mantel and Valand 1970), and the Moran (1948) test originally
developed to detect spatial signal in environmental variables and
introduced in a phylogenetic context by Gittleman and Kot (1990).
© 2012 The Author(s). Evolution © 2012 The Society for the Study of Evolution. Evolution 67-3: 828–840
Abouheif (1999) proposed a model-free test of phylogenetic
signal for a continuous character adapting a diagnostic test for
serial independence originally developed by von Neumann et al.
(1941) in a nonphylogenetic context. Recently, Pavoine et al.
(2008) provided an exact analytic formulation of the Abouheif
test showing that it turns out to be an application of the Moran test
(1948), with a particular definition of the pairwise phylogenetic
proximities between species. In contrast, Blomberg et al. (2003)
proposed two statistics K and K∗ to compare the evolution of
a trait to that expected under a Brownian motion model of trait
evolution.
Being based on different approaches, all these methods are
thought to possess different statistical performances in terms of
power and type I error (see, e.g., Harmon and Glor 2010; Hardy
and Pavoine 2012). In this article, we will show that Blomberg
et al. K and K∗, the Abouheif index, the Moran’s I and its gen-
eralizations, and the Mantel correlation are all based on a cross-
product statistic, such that whenever the significance of the tests
is evaluated via permutation procedures, the test procedures are
identical to each other. What changes is the way the phyloge-
netic and trait similarities (or dissimilarities) are computed. Ac-
cordingly, the observed differences in the statistical performances
among the tests are related to differences in how the similar-
ity/dissimilarity matrices are constructed, rather than to the math-
ematical formulation of the tests. We thus compared the ways
phylogenetic (dis)similarity and trait-based (dis)similarity are de-
fined in these statistics of phylogenetic signal and evaluated the
consequences on the performance of the associated tests of
phylogenetic signal. We end with a discussion and recommendations on
which statistic is preferable in which circumstances.
Matrices of Phylogenetic (Dis)similarity
Many tests of phylogenetic signal require the definition of a matrix
of phylogenetic similarity or dissimilarity. Hereafter, we will
consider two complementary matrices of phylogenetic similarities
among tips: A (Pavoine et al. 2008) and the Brownian covariance
matrix C. In the next section, we will demonstrate that
the values of C−1, the inverse of C used for instance in Blomberg
et al.’s (2003) statistics of phylogenetic signal, can be considered
as measures of phylogenetic differences among species.
The matrix A = (aij) was discovered by Pavoine et al. (2008)
when providing an analytical solution to the test of phylogenetic
signal developed by Abouheif (1999). For a tip i of a phylogenetic
tree, aii is the inverse of the product of the number of branches
descending from each ancestral node of the tip and, for a pair
of tips (i, j), aij is the inverse of the product of the number of
branches descending from the ancestral nodes unshared by tips i
and j and from their most recent common ancestor only (that is, the
nodes located on the shortest path that connects the two tips). The values aii
have been interpreted as measures of how isolated a tip is in the
phylogenetic tree. A tip is isolated if it descends from lineages
that embed few tips. Extreme isolation is obtained when the tip is
the sole descendant of a branch directly connected to the root.
One of the main characteristics of matrix A is that it avoids the
cost associated with assuming that branch lengths and a model of
evolutionary change are known and accurate (Abouheif 1999).
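The verbal definition of A can be made concrete with a small sketch. The Python code below is illustrative only: the tree encoding (a `parent` dictionary mapping each non-root node to its parent), the three-tip example tree, and the function name `abouheif_proximity` are assumptions made here and are not part of the original analysis.

```python
from fractions import Fraction

def abouheif_proximity(parent, tips):
    """Return matrix A as a list of rows of exact fractions.

    `parent` maps every non-root node to its parent (hypothetical encoding).
    """
    # Number of branches descending from each interior node.
    n_children = {}
    for child, par in parent.items():
        n_children[par] = n_children.get(par, 0) + 1

    def ancestors(tip):
        # Ancestral nodes of a tip, ordered from its parent up to the root.
        path, node = [], tip
        while node in parent:
            node = parent[node]
            path.append(node)
        return path

    A = []
    for i in tips:
        anc_i = ancestors(i)
        row = []
        for j in tips:
            if i == j:
                nodes = anc_i  # every ancestral node of the tip
            else:
                anc_j = ancestors(j)
                shared = set(anc_i) & set(anc_j)
                # Most recent common ancestor: first shared node going up from i.
                mrca = next(v for v in anc_i if v in shared)
                unshared = [v for v in anc_i + anc_j if v not in shared]
                nodes = unshared + [mrca]
            product = 1
            for v in nodes:
                product *= n_children[v]
            row.append(Fraction(1, product))
        A.append(row)
    return A

# Hypothetical tree: the root has two branches, one to tip "c" and one to
# node "N", which itself has two branches to tips "a" and "b".
parent = {"a": "N", "b": "N", "N": "root", "c": "root"}
A = abouheif_proximity(parent, ["a", "b", "c"])
```

In this example tip "c" is the most isolated tip and receives the largest diagonal value, and every row of A sums to 1.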
The matrix C = (cij) is connected with Brownian evolution.
The diagonal value cii is defined as the sum of branch lengths
between tip i and the root of the phylogenetic tree. The off-diagonal
values cij are defined as the sum of branch lengths between the most
recent common ancestor of tips i and j and the root of the tree (i.e., the
height above the root of the most recent common ancestor of the
pair of tips). C is the basis of the variance–covariance matrix
V = σ2C (where σ2 is the rate of evolution) among tips’ traits
according to a Brownian motion model. Each diagonal value of V is
the variance of the trait value at a tip under Brownian
evolution from the root of the tree, and each off-diagonal value is
the covariance between the trait values at two tips under the
same Brownian model. Examples of calculation of the matrix C
of the Brownian motion model and of the phylogenetic proximity
matrix A are given for a simple theoretical tree in Figure 1.
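As a complement to Figure 1, the definition of C can be sketched in a few lines of Python. The encoding below is an assumption chosen for illustration: `parent` maps each non-root node to its parent, `length` gives the length of the branch above each non-root node, and the tree and its branch lengths are hypothetical.

```python
def brownian_covariance(parent, length, tips):
    """C[i][j] = summed branch length from the root to the most recent
    common ancestor of tips i and j (C[i][i] = root-to-tip distance)."""
    def branches_to_root(tip):
        # Each non-root node stands for the branch directly above it.
        path, node = [], tip
        while node in parent:
            path.append(node)
            node = parent[node]
        return path

    paths = {t: set(branches_to_root(t)) for t in tips}
    C = []
    for i in tips:
        row = []
        for j in tips:
            shared = paths[i] & paths[j]  # branches common to both lineages
            row.append(sum(length[v] for v in shared))
        C.append(row)
    return C

# Hypothetical tree: tips "a" and "b" share interior node "N"; tip "c"
# descends directly from the root.
parent = {"a": "N", "b": "N", "N": "root", "c": "root"}
length = {"a": 1, "b": 1, "N": 2, "c": 3}
C = brownian_covariance(parent, length, ["a", "b", "c"])
```

Tips "a" and "b" share a branch of length 2, which becomes their covariance, whereas tip "c" is separated from them by the root and has zero covariance with both.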
Special Measures of Phylogenetic Differences: Matrix C−1
Matrices A and C of phylogenetic similarities have been defined
in the previous section. We show below that C−1 is a particular
measure of phylogenetic differences among tips (we use the word
“difference” here instead of “dissimilarity” because the term “dis-
similarity” has usually been associated with non-negative matrices
whereas these values of difference are allowed to be negative).
DEFINITION OF THE VOLUME OF A TREE
Matrix C can be associated with two graphical representations:
a tree and a parallelepiped. Hereafter, we will use the expression
“the volume of a tree” to designate the volume in n dimensions
of the parallelepiped associated with a tree of n tips,
with the following definition. Let (e1 . . . en) be the standard basis
of Rn, where e1 = (1, 0, 0, . . . , 0)t is associated with the first
tip, e2 = (0, 1, 0, . . . , 0)t is associated with the second tip, . . . and
en = (0, 0, 0, . . . , 1)t is associated with the last tip. Let us denote by
(c1 . . . cn) the columns of C. The coordinates of the vector ck in
the standard basis are the variance of the kth tip on axis ek and
the covariance between the kth tip and each of the other tips on
the other axes. The n vectors c1, . . . , cn define a parallelepiped
as illustrated in Figure 2 (the set of points whose coordinates
EVOLUTION MARCH 2013 8 2 9
Figure 1. Matrices C and A. (a) Theoretical tree on which Figures 1b, 1c, and 3 are based. The triangle indicates the root node. Squares
indicate interior nodes and circles indicate the tips. (b) Matrices C (Brownian model used by Blomberg et al. 2003) and (c) A (Abouheif
matrix defined by Pavoine et al. 2008) associated to the theoretical tree given in (a).
in the standard basis are in {∑i ti ci |0 ≤ ti ≤ 1}). In the extreme
situations where the matrix C has zero off-diagonal values, the
associated tree is a star phylogeny with all branch lengths de-
scending from the root node and the associated parallelepiped has
right angles and a volume equal to the product of its edge lengths
(i.e., the variances or diagonal values of C; Fig. 2a).
If some of the covariances (off-diagonal values of C) are
positive, then the phylogenetic tree has bifurcating interior nodes
between the tips and the root node and some of the angles in the
parallelepiped are acute. In that case, if three tips are considered,
the three-dimensional parallelepiped looks like a flattened
cardboard box (examples are given in Fig. 2). The volume of a tree
thus increases with the variances and decreases with the
covariances among tips. For a given number of tips (n), the volume of a
tree is thus a measure of phylogenetic diversity in Rn and it can
be measured by the absolute value of the determinant of matrix C (see File S1 for details).
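The effect of covariances on the volume of a tree can be checked numerically. The sketch below uses a hand-written determinant routine and two hypothetical three-tip matrices C with identical variances: a star phylogeny and a tree in which two tips share a branch of length 2. The function name and the example matrices are assumptions made for illustration.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [[float(v) for v in row] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap changes the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n):
                m[r][k] -= factor * m[col][k]
    return det

star = [[3, 0, 0], [0, 3, 0], [0, 0, 3]]    # no shared branches
nested = [[3, 2, 0], [2, 3, 0], [0, 0, 3]]  # two tips share a branch

volume_star = abs(determinant(star))
volume_nested = abs(determinant(nested))
```

With equal variances, the star phylogeny has the larger volume (the product of its diagonal values), and the positive covariance in the second matrix reduces it.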
C−1: MATRIX OF NEGATIVE PHYLOGENETIC
DIFFERENCES
Consider a reference tree from which matrix C is calculated and
C−1 = (δij) is the inverse matrix of C. It can be shown (File
S1) that the diagonal values of C−1 (δii) correspond to the ratio
of the volume in n − 1 dimensions of the new, degraded trees
obtained by removing one tip at a time in the reference tree to
the volume of the reference tree. It can also be shown (File S1)
that the off-diagonal value of C−1, δij, depends on a reduced tree
obtained by dismantling the structure of the tree as follows: (1)
first remove the path that connects the ith to the jth tip; this leads to
several disconnected subtrees; (2) if none of the subtrees contain
the root node of the main tree, δij = 0; otherwise, reconnect the
subtrees by their root, and δij is equivalent to minus the ratio of the
volume of this new reduced tree to the volume of the reference tree
(Fig. 3).
The diagonal value δii of C−1 is thus positive, and it is high
if removing the ith tip hardly changes the volume associated with the
reference phylogenetic tree. As the influence of a tip depends on
its variance and covariance with the other tips, this means that δii
is high if the ith tip is confined in the phylogenetic tree, with many
relatives given its distance to the root node. Values on the
diagonal of C−1 are thus related to the concept of phylogenetic
originality or distinctiveness of a tip (May 1990; Vane-Wright
et al. 1991; Pavoine et al. 2005, 2008): they increase with
decreasing phylogenetic distinctiveness.
The value of the off-diagonal entry δij of C−1 is high if
removing the path that connects the ith to the jth tip breaks the
tree into large subtrees. It thus measures some kind of
phylogenetic difference between tips i and j, which is high if the two tips
compared are far from each other and far from other tips. These
values of difference have the particularity of being negative. A
value close to zero thus means high phylogenetic difference and a
strongly negative value means very low phylogenetic difference.
A value of zero in C is conserved as a zero in C−1. Tips separated
by the root node are thus considered unrelated by this approach
both in C (where zero is the lowest possible phylogenetic similar-
ity) and in C−1 (where zero is the highest possible phylogenetic
difference).
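These properties of C−1 can be verified on a small example. The sketch below uses exact rational arithmetic and a hypothetical three-tip matrix C (two tips sharing a branch of length 2, a third tip separated from them by the root). It relies on the standard cofactor identity, under which a diagonal entry of the inverse equals the determinant of C with the corresponding row and column removed, divided by the determinant of C. The helper names are assumptions made here.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant by Gaussian elimination on fractions."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n, result = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return result

def inverse(matrix):
    """Exact Gauss-Jordan inverse (assumes a nonsingular matrix)."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def without_tip(matrix, k):
    """Matrix C of the tree obtained by removing tip k."""
    return [[v for j, v in enumerate(row) if j != k]
            for i, row in enumerate(matrix) if i != k]

C = [[3, 2, 0], [2, 3, 0], [0, 0, 3]]
C_inv = inverse(C)
# Ratio of the volume of each degraded tree to the volume of the reference tree.
volume_ratios = [det(without_tip(C, k)) / det(C) for k in range(3)]
```

Here the diagonal of `C_inv` equals `volume_ratios`, the off-diagonal entry for the two related tips is negative, and the zeros of C (tips separated by the root) remain zeros in `C_inv`.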
A Comparison between Matrices A, C, and C−1
The main difference between matrices A, C, and C−1 is that matri-
ces A and C measure phylogenetic similarity whereas matrix C−1
Figure 2. Five examples of association between phylogenetic trees and parallelepipeds. The matrix C (of the Brownian model used by
Blomberg et al. 2003) is given in each case. The black vectors of the standard basis give the scale. The red vectors correspond to the
columns of C. Three faces of the parallelograms are colored to help to visualize their shape.
Figure 3. Matrix C−1 (inverse of matrix C given in Fig. 1b) associated with the tree given in Figure 1a designated here by t. V is the
volume associated to a tree (volume of the parallelepiped, see File S1 for a full definition of the parallelepiped associated to a tree).
Dismantled trees are given below the matrix.
measures phylogenetic difference. However, a more subtle prop-
erty is shared by matrices A and C−1; they contain relative (instead
of absolute) measures of phylogenetic (dis)similarity. Usual matrices
of pairwise phylogenetic proximities/differences between
tips (e.g., patristic distances) have off-diagonal values that depend
only on the two tips concerned: evolution on each branch is
independent of evolution on all other branches. They are
absolute phylogenetic proximities/differences (e.g., Gittleman and
Kot 1990). In contrast, matrices A and C−1 contain relative
measures of phylogenetic proximities and differences, respectively,
that depend on the pool of taxa considered.
In the previous section, we have shown that C−1 measures
pairwise phylogenetic differences among tips influenced by how
confined the pairs of tips are (off-diagonal) and also by pure
phylogenetic confinement of individual tips (diagonal). Confine-
ment means presence in nested species-rich clades (in opposition
to originality that is associated with species-poor clades). This
should be connected to the matrix A. The diagonal elements of
A measure the phylogenetic originality of the tips and the off-
diagonal elements of A measure pairwise phylogenetic proxim-
ities influenced by how confined the pairs of tips are (Fig. 1).
Contrary to classical matrices of phylogenetic distance or prox-
imities (such as patristic distances), values in C−1 and in A thus
depend on the shape of the phylogenetic tree, and not only on the
path that connects the pair of tips.
Unification: Widely Used Tests of Phylogenetic Signal are Based on the Cross-Product Statistic
The general form of a cross-product statistic is given by (e.g.,
Getis 1991):
$$\Gamma = c \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} y_{ij},$$
where wij and yij are the elements of two pairwise dissimilarity or
similarity matrices W = (wij), Y = (yij) for objects i and j (i, j = 1, 2, . . . , n), and c is a constant that is invariant by permutation.
We will consider hereafter that wij represents some measure of
phylogenetic (dis)similarity between tips i and j of a phylogenetic
tree, whereas yij is a measure of (dis)similarity between trait values
at tips i and j. The cross-product statistic is flexible as wij and yij
can be freely defined. We have suggested above two potential
matrices of phylogenetic similarities (A and C) and a matrix of
phylogenetic differences (C−1).
Because the elements of a (dis)similarity matrix are not
independent, the significance of a cross-product statistic (i.e., the
association between W and Y) is usually tested by randomly
permuting the order of the elements within one matrix (rows
and columns are permuted in tandem) while keeping the other matrix
unchanged (Rosenberg and Anderson 2011). P-values are then
computed as the proportion of permutation-derived values that
are as extreme as or more extreme than the observed Γ value.
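The permutation procedure described above can be sketched as follows. This is a minimal illustration, not the implementation used for the analyses: the function names, the one-sided alternative (large values of the statistic count as extreme), and the convention of counting the observed value among the permutations are all assumptions made here.

```python
import random

def cross_product(W, Y, c=1.0):
    """Cross-product statistic c * sum_ij w_ij * y_ij."""
    n = len(W)
    return c * sum(W[i][j] * Y[i][j] for i in range(n) for j in range(n))

def permutation_test(W, Y, c=1.0, n_perm=999, seed=0):
    """Return the observed statistic and a one-sided permutation P-value."""
    rng = random.Random(seed)
    n = len(W)
    observed = cross_product(W, Y, c)
    count = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        perm = list(range(n))
        rng.shuffle(perm)
        # Rows and columns of Y are permuted in tandem; W is left unchanged.
        Y_perm = [[Y[perm[i]][perm[j]] for j in range(n)] for i in range(n)]
        if cross_product(W, Y_perm, c) >= observed:
            count += 1
    return observed, count / (n_perm + 1)

# Hypothetical similarity matrices for three tips.
W = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
Y = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
gamma_obs, p_value = permutation_test(W, Y, n_perm=199)
```

Because c is invariant by permutation, it rescales the observed and permuted values alike and has no effect on the P-value. For statistics that decrease with phylogenetic signal, the inequality in the loop would be reversed.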
We demonstrate in Table 1 that several widely used statistics
of phylogenetic signal are applications of the cross-product statis-
tic. As a consequence, the differences between the permutation
tests using these statistics are due to the choice of the matrices
W and Y. We will show in the next section that this choice is
critical to the performance of the test in terms of power and type I
error. We illustrate below that the different cross-product statistics
presented in Table 1 have had very different justifications when
they were first developed and have different levels of flexibility
in the definitions of W and Y.
The first statistic related to the cross-product is Mantel’s
cross-product (1967, p. 213), where, in a phylogenetic context, Y is
a matrix of pairwise dissimilarity between trait values at tips
and W is a corresponding matrix of phylogenetic dissimilarity.
The Mantel test was developed to analyze the correlation between any
two matrices of dissimilarities (first in the context of spatial and
temporal aggregation in disease expansions, Mantel 1967). The
main advantage of this index is its flexibility, because the choice
of how to compute the trait- and phylogeny-based dissimilarities
among tips is left entirely to the user of the
test. Note that using dissimilarities among species in the Mantel test
implies that a tip will never be compared to itself (the dissimilarity
between a species and itself is zero: wii = 0 and yii = 0 for all i).
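The correlation form of the Mantel statistic can be sketched as the Pearson correlation between the off-diagonal elements of the two matrices. The code below is an illustration under stated assumptions: both matrices are taken to be symmetric, so only the upper triangle is used, and the function name and example values are hypothetical.

```python
from math import sqrt

def mantel_correlation(W, Y):
    """Pearson correlation between the upper-triangle entries of W and Y."""
    n = len(W)
    pairs = [(W[i][j], Y[i][j]) for i in range(n) for j in range(i + 1, n)]
    w = [p[0] for p in pairs]
    y = [p[1] for p in pairs]
    mean_w = sum(w) / len(w)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_w) * (b - mean_y) for a, b in pairs)
    spread_w = sum((a - mean_w) ** 2 for a in w)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(spread_w * spread_y)

# Hypothetical phylogenetic (W) and trait-based (Y) dissimilarities.
W = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]
Y = [[0, 2, 8], [2, 0, 4], [8, 4, 0]]
r_mantel = mantel_correlation(W, Y)
```

When rows and columns of one matrix are permuted in tandem, the means and spreads of the off-diagonal elements do not change, so the correlation is a linear function of the cross-product of the off-diagonal elements and leads to the same permutation test.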
The statistics IR, IN, and IW in Table 1 are all rooted in the
analysis of autocorrelation in time series and spatial data (Cliff and
Ord 1973; Rohlf 2001). They are generalized versions of Moran’s
(1948) I autocorrelation index. They take the classic form of any
autocorrelation coefficient: the numerator term is a measure of
covariance among the trait values at tips and the denominator
term is a measure of variance. The general formula is
$$I = \frac{n \sum_{i=1}^{n} \sum_{j=1, j \neq i}^{n} w_{ij} \left(x_i - \sum_{k=1}^{n} r_k x_k\right)\left(x_j - \sum_{k=1}^{n} r_k x_k\right)}{\left(\sum_{i=1}^{n} \sum_{j=1, j \neq i}^{n} w_{ij}\right)\left(\sum_{i=1}^{n} r_i \left(x_i - \sum_{k=1}^{n} r_k x_k\right)^2\right)}.$$
Originally, the diagonal values of W = (wij) were set to zero.
Here, we use a more general formula where they are allowed to
be positive. The value xk is the trait value at tip k. In IR and IN,
rk = 1/n for all k. The difference between IR and IN is that in
IN, $\sum_{j=1}^{n} w_{ij} = 1$ for all i (see Gittleman and Kot 1990 for an
application of IN in a phylogenetic context). With this constraint,
the numerator of IN can be seen as a covariance between the
observed value at tip i (xi) and the average value of the other tips,
$\sum_{j=1}^{n} w_{ij} x_j$, where each tip is weighted by how closely related
(as measured by wij) it is to tip i (Cliff and Ord 1981; see
also File S3 where the equations of all statistics are detailed). The
covariance is expected to be high if the value at tip i is close to
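Table 1. A comparison of the generalized Moran statistics, Blomberg et al.’s K, K∗, and the new KW. xi is the trait value for tip i, $\bar{x} = \sum_{i=1}^{n} \frac{1}{n} x_i$, ri is the phylogenetic weight for tip i (see main text), R is the diagonal matrix with values ri for all i on the diagonal, 1n is the unit vector of length n, σ2 is the rate of evolution in the Brownian model.

Statistic | Phylogenetic matrix (W) | Trait-based matrix (Y) | Constant (c)
IR | Any non-negative square matrix | $\left[\frac{(x_i - \bar{x})(x_j - \bar{x})}{\sum_{k=1}^{n} \frac{1}{n}(x_k - \bar{x})^2}\right]$ | $1/\mathbf{1}_n^t W \mathbf{1}_n$
IN | Any non-negative square row-normalized matrix | $\left[\frac{(x_i - \bar{x})(x_j - \bar{x})}{\sum_{k=1}^{n} \frac{1}{n}(x_k - \bar{x})^2}\right]$ | $1/n$
IW | Any non-negative square matrix | $\left[\frac{\left(x_i - \sum_{k=1}^{n} r_k x_k\right)\left(x_j - \sum_{k=1}^{n} r_k x_k\right)}{\sum_{l=1}^{n} r_l \left(x_l - \sum_{k=1}^{n} r_k x_k\right)^2}\right]$ | $1/\mathbf{1}_n^t W \mathbf{1}_n$
1/K∗ | C−1 | $\left[\frac{\left(x_i - \sum_{k=1}^{n} r_k x_k\right)\left(x_j - \sum_{k=1}^{n} r_k x_k\right)}{\sum_{l=1}^{n} (x_l - \bar{x})^2}\right]$ | $\left(\frac{\sigma^2}{n-1}\right)\left(\mathrm{tr}[C] - \frac{\mathbf{1}_n^t C \mathbf{1}_n}{n}\right)$
1/K | C−1 | $\left[\frac{\left(x_i - \sum_{k=1}^{n} r_k x_k\right)\left(x_j - \sum_{k=1}^{n} r_k x_k\right)}{\sum_{l=1}^{n} \left(x_l - \sum_{k=1}^{n} r_k x_k\right)^2}\right]$ | $\left(\frac{\sigma^2}{n-1}\right)\left(\mathrm{tr}[C] - \frac{n}{\mathbf{1}_n^t C^{-1} \mathbf{1}_n}\right)$
1/KW | C−1 | $\left[\frac{\left(x_i - \sum_{k=1}^{n} r_k x_k\right)\left(x_j - \sum_{k=1}^{n} r_k x_k\right)}{\sum_{l=1}^{n} r_l \left(x_l - \sum_{k=1}^{n} r_k x_k\right)^2}\right]$ | $\left(\frac{\sigma^2}{n-1}\right)\left(\mathrm{tr}[RC] - \frac{1}{\mathbf{1}_n^t C^{-1} \mathbf{1}_n}\right)$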
the values at its most related tips. The index IW uses different ri
values: $r_i = \sum_{j=1}^{n} w_{ij} / \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}$. This weighting grants a
higher importance to tips having many closely related tips. It was
suggested (Thioulouse et al. 1995) as a way to unify several points of view
on how autocorrelation (spatial autocorrelation in the paper of Thioulouse et
al., phylogenetic correlation here) should be measured and
analyzed, including Moran’s (1948) index, Geary’s (1954) index,
the local variance (Lebart 1969), and the local principal component
analysis (Le Foll 1982; see Thioulouse et al. 1995 for details).
According to Pavoine et al. (2008, p. 83), Abouheif’s (1999)
Cmean test of phylogenetic signal turns out to be equal to IR(A).
Given that for A = (aij), $\sum_{j=1}^{n} a_{ij} = 1$, the weights defined in IW
are simply ri = 1/n, so that Cmean = IR(A) = IW(A) = IN(A).
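The equality of the generalized Moran statistics on a matrix whose rows sum to 1 can be checked numerically. The sketch below follows the (W, Y, c) decomposition of Table 1, with the diagonal of W included in the sums; the function name, the `weights` argument, the matrix A (computed by hand for a hypothetical three-tip tree), and the trait values are assumptions made for illustration.

```python
def generalized_moran(W, x, weights="uniform"):
    """Generalized Moran statistic written as c * sum_ij w_ij * y_ij.

    weights="uniform"  : r_i = 1/n (I_R; also I_N if rows of W sum to 1)
    weights="row_sums" : r_i = row sum of W / grand total of W (I_W)
    """
    n = len(x)
    total = sum(sum(row) for row in W)  # 1' W 1
    if weights == "uniform":
        r = [1.0 / n] * n
    else:
        r = [sum(row) / total for row in W]
    centre = sum(ri * xi for ri, xi in zip(r, x))        # weighted mean
    d = [xi - centre for xi in x]
    variance = sum(ri * di * di for ri, di in zip(r, d)) # weighted variance
    cross = sum(W[i][j] * d[i] * d[j] for i in range(n) for j in range(n))
    return cross / (total * variance)

# Matrix A of a hypothetical tree with two sister tips and one isolated tip.
A = [[0.25, 0.5, 0.25],
     [0.5, 0.25, 0.25],
     [0.25, 0.25, 0.5]]
x = [1.0, 1.2, 3.0]  # hypothetical trait values

i_r = generalized_moran(A, x, weights="uniform")
i_w = generalized_moran(A, x, weights="row_sums")
```

Because every row of A sums to 1, the row-sum weights reduce to 1/n and the two calls return the same value.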
Finally, the formulas for K, K∗, and KW statistics in Table 1
are connected. The statistic K∗ (Blomberg et al. 2003) was defined
as:
$$K^{*} = \frac{\mathrm{MSE}^{*}}{\mathrm{MSE}} \bigg/ \left(\frac{\mathrm{MSE}^{*}}{\mathrm{MSE}}\right)_{\text{expected with Brownian model}}.$$
The denominator of this ratio is a scaling factor that does not
depend on the traits and that is thus unchanged by permutation. It
is the expected value the numerator would have if the traits were
distributed according to a Brownian model of evolution. K and
K∗ were defined by Blomberg et al. (2003). Consider the model
x = μ1n + ε, where x is the vector of observed trait values at the
tips of a phylogeny, μ is a mean (scalar), 1n is the unit vector of
length n, and ε the vector of residuals. In ordinary least squares
(OLS) the elements of ε are assumed to be independent and
identically distributed according to a normal distribution of mean zero
and variance σ2. The mean square error of the model in OLS is
$\mathrm{MSE}^{*} = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$ where $\bar{x} = \sum_{i=1}^{n} \frac{1}{n} x_i$. In generalized
least squares (GLS), ε is assumed to follow a multivariate normal
distribution of mean 0n (null vector of length n) and of
covariance matrix σ2C (of size n × n). The mean square error of the
model in GLS is $\mathrm{MSE} = \frac{1}{n-1}\sum_{i=1}^{n}\sum_{j=1}^{n} (C^{-1})_{ij}(x_i - \hat{x})(x_j - \hat{x})$,
where $\hat{x} = [\mathbf{1}^t C^{-1} \mathbf{1}]^{-1} (C^{-1}\mathbf{1})^t \mathbf{x}$. In K∗ (see Table 1 for the whole
formula), MSE∗/MSE is thus the ratio of the mean square error
of the model in OLS to the mean square error of the model in
GLS.
K is similar to K∗ except that $\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$ is replaced
with $\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \hat{x})^2$, with $\hat{x}$ defined above. The justification
for this replacement was that $\hat{x}$ is the “phylogenetically correct
mean” (Blomberg et al. 2003): the estimated value at the root
node of the phylogenetic tree. Rohlf (2006) considered K∗ but not
K and wrote for the component $\frac{1}{n-1}\sum (x_i - \hat{x})^2$ of K, with $\hat{x}$ the GLS mean (Appendix
in Rohlf 2006), that “it is somewhat inconsistent to use deviations
from a GLS mean when computing a MSE representing the
results of using OLS.” Even though K is more widely used in practice
and readily available in statistical software (e.g., functions
“Kcalc” and “phylosignal” in picante, Kembel et al. 2010, in R,
R Development Core Team 2012), K∗ seems to be preferred in
theoretical, statistical studies (e.g., Ives et al. 2007). In contrast,
Blomberg et al. (2003) wrote: “we only present results for K,
which we feel has greater heuristic value.” Given that K does not
have a strong theoretical justification, the evaluation of its
performance in the permutation test, in comparison with K∗, is critical
to determine any recommendation about its future use.
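To make the difference between K and K∗ concrete, the sketch below computes both from a matrix C and a vector of trait values, following the definitions given above: the ratio of an OLS-type mean square error to the GLS mean square error, divided by its expectation under a Brownian model. The expected ratios are taken from the constant column of Table 1 with σ2 set to 1, which is an assumption of this sketch, as are the function names and the example data.

```python
def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (assumes nonsingular input)."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def blomberg_k(C, x):
    """Return (K, K_star) for a Brownian covariance matrix C and traits x."""
    n = len(x)
    C_inv = invert(C)
    row_sums = [sum(row) for row in C_inv]
    total = sum(row_sums)                                      # 1' C^-1 1
    x_hat = sum(s * xi for s, xi in zip(row_sums, x)) / total  # GLS mean
    x_bar = sum(x) / n                                         # OLS mean
    d = [xi - x_hat for xi in x]
    mse = sum(C_inv[i][j] * d[i] * d[j]
              for i in range(n) for j in range(n)) / (n - 1)
    mse_0 = sum(di * di for di in d) / (n - 1)                 # used in K
    mse_star = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)    # used in K*
    trace = sum(C[i][i] for i in range(n))
    expected_k = (trace - n / total) / (n - 1)
    expected_k_star = (trace - sum(sum(row) for row in C) / n) / (n - 1)
    return (mse_0 / mse) / expected_k, (mse_star / mse) / expected_k_star

# Hypothetical star phylogeny (all branches of length 2) and trait values.
C_star = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
k_value, k_star_value = blomberg_k(C_star, [1.0, 2.0, 4.0])
```

On a star phylogeny with equal branch lengths the GLS and OLS means coincide, and both statistics equal 1 whatever the trait values. The two statistics differ only when the phylogenetically correct mean departs from the arithmetic mean.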
Here, we would like to introduce the new statistic KW, which
could reconcile Blomberg et al.’s (2003) advice of using only the
phylogenetically correct mean in a statistic for phylogenetic signal
with the general idea of having a strong theoretical basis for
any statistic. KW (see Table 1 for the formula) is similar to K
except that it replaces $\sum_{i=1}^{n} (x_i - \hat{x})^2$ with $\sum_{i=1}^{n} r_i (x_i - \hat{x})^2$, where
$\hat{x}$ is the phylogenetically correct mean and, with
the notation C−1 = (θij), $r_i = \sum_{j=1}^{n} \theta_{ij} / \sum_{i=1}^{n}\sum_{j=1}^{n} \theta_{ij}$. With
this definition, it can be shown that KW = λ/IW(C−1) (File S3),
where λ is a scalar (i.e., real value) invariant by permutation. A
permutation test based on KW is thus equivalent to a permutation
test based on IW, which is rooted in a strong statistical framework
that unifies several points of view on the measures and analyses
of autocorrelation as indicated above.
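The link between the weights ri and the phylogenetically correct mean can be written out explicitly. The short derivation below is a sketch that uses only the definitions given in the text; the hat notation for the GLS mean is chosen here for illustration. It expands the products with the unit vector into row sums and grand sums of C−1 = (θij).

```latex
\hat{x}
  = \frac{(C^{-1}\mathbf{1}_n)^t \mathbf{x}}{\mathbf{1}_n^t C^{-1} \mathbf{1}_n}
  = \frac{\sum_{i=1}^{n} \left(\sum_{j=1}^{n} \theta_{ij}\right) x_i}
         {\sum_{i=1}^{n} \sum_{j=1}^{n} \theta_{ij}}
  = \sum_{i=1}^{n} r_i x_i ,
\qquad
r_i = \frac{\sum_{j=1}^{n} \theta_{ij}}{\sum_{i=1}^{n} \sum_{j=1}^{n} \theta_{ij}} .
```

The phylogenetically correct mean is therefore the mean of the trait values weighted by ri, which is the centering used by IW when W = C−1.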
Mantel’s statistic and Moran’s generalized statistics thus
belong to the same statistical framework, the cross-product statistic,
as Blomberg et al.’s statistics. Specifically, the observation that
K∗, K, and KW can be considered in the context of cross-product
statistics and are closely related to Moran’s generalized statistics,
is based on our demonstration that C−1 can be interpreted as a
matrix of phylogenetic differences. The characterization of matrix
C−1 was thus critical to unify all these indices. The Mantel test
compares phylogenetic dissimilarities with trait-based dissimilar-
ities and Moran’s tests, when applied to phylogenetic similarities,
compare phylogenetic similarities with trait-based similarities. In
contrast, Blomberg et al.’s K, K∗ and the new KW statistics all
mix phylogenetic differences in C−1 and trait-based similarities.
However, as functions of the inverse of a cross-product, K, K∗,
and KW also increase with phylogenetic signal. If applied to C−1,
any generalized Moran statistic (IR, IN, IW ) decreases with phylo-
genetic signal but can still be used to test for phylogenetic signal
provided one considers that low values of IR, IN, IW in that case
correspond to high phylogenetic signal. To evaluate the perfor-
mance of these statistics in testing for phylogenetic signal, we
provide in the next sections a comparison between the general-
ized Moran statistics, K, K∗, and KW statistics and between the
definitions of phylogenetic (dis)similarity in terms of power and
type I error of tests for phylogenetic signal.
Table 2. Statistics used in the simulations. Note that when the diagonal values of A are not set to zero, Abouheif’s (1999) Cmean = IR(A) = IN(A) = IW(A), but when the diagonal values of A are set to zero, IR(A) ≠ IN(A) ≠ IW(A). In addition, when the diagonal values of C−1 are not set to zero, the statistical performances of KW and IW(C−1) are equal.

Matrix W | Statistics compared, which have an intrinsic definition of Y (see Table 1)
W = A | IR(A), IN(A), IW(A)
W = C | IR(C), IN(C), IW(C)
W = C−1 | IR(C−1), IN(C−1), IW(C−1), K, K∗
Type I Error and Power
Recommendations that increase the power of the Mantel test,
and a comparison between the power of the Mantel test and that of
Blomberg et al.’s (2003) test, can be found in Hardy and Pavoine
(2012). We focus here only on the different generalizations of
the Moran test (IR, IN , IW ), and the statistics K and K∗. The
performances of KW are equivalent to those of IW (C−1) and the
performances of Abouheif’s test are equivalent to those of IW (A)
(knowing that IW (A) = IR(A) = IN(A) see section Unification:
Widely Used Tests of Phylogenetic Signal are Based on the Cross-
Product Statistic). We have simulated data to evaluate type I error
and power (1-type II error) associated with each coefficient of
phylogenetic signal (IR, IN , IW , K, and K∗) when permutation
tests are used. The indices IR, IN , IW were applied with matrices
A, C, C−1 first with the diagonal values given by their definitions
and then by artificially setting zero on their diagonal to evaluate
the role of diagonal values in power and type I error (see also
File S4). We have shown in the previous sections that the indices
IR, IN , IW , K, and K∗ are all part of the same statistical framework,
the cross-product statistic. The differences between these indices
concern mostly the way the trait-based similarities among tips are
computed (column Y in Table 1), but also some restrictions on
the way phylogenetic (dis)similarities are computed (column W in Table 1) such as the row normalization in IN. The restriction
on W in K and K∗ is stronger as they were developed only to
be used with matrix C−1 of phylogenetic differences among tips.
We summarize in Table 2 the statistics of phylogenetic signal
compared.
The coefficients were calculated and the randomizations
performed with function “gearymoran” of package ade4 (Dray and
Dufour 2007) of R, function “Kcalc” of picante (Kembel et al. 2010),
and personal R scripts. Matrix A was computed with function
“proxTips” of package adephylo of R (Jombart and Dray
2010), matrix C with function “vcv” of package ape (Paradis
et al. 2004), and the inverse of C was obtained with function “ginv”
of package MASS (Venables and Ripley 2002) and checked for
congruence with function “solve” of base R.
We simulated a series of trees as follows. We first simulated
pure birth trees (with birth rate of 0.1 as in Harmon and Glor 2010)
leading to relatively well-balanced trees (function “sim.bd.taxa”
in the package TreeSim, Stadler 2011, of R, R Development Core
Team 2012). Next, we analyzed the effect of the strength of
covariance among tips by transforming the previous trees, first moving
most speciation events back toward the root (low expected
covariance among tips) and then, conversely, moving most
speciation events forward toward the tips (high expected covariance among tips)
using package geiger of R (function “deltaTree” with δ = 10 and
0.1, respectively, Harmon et al. 2009; Hardy and Pavoine 2012).
We then obtained asymmetric trees by applying UPGMA on val-
ues randomly drawn from a log-normal distribution (Euclidean
distance; mean and standard deviation of the distribution on the
log scale equal 0 and 1, respectively). Finally, we also analyzed
the power of the tests on nonultrametric trees (where the distance
from tips to root is not a constant) by simulating trees where the
topology is generated by splitting randomly the edges (function
“rtree” of package ape of R, Paradis et al. 2004). We generated
the branch lengths with a uniform distribution (bound between
0 and 1), next with a log-normal distribution (mean and stan-
dard deviation of the distribution on the log scale equal 0 and 1,
respectively). Nonultrametric trees can represent unequal
evolutionary rates in different parts of the phylogeny. Power analyses
were based on 1000 trees per model and type I error on 10,000
trees to have a better precision on the deviation from the nominal
α = 5% level. We simulated trees with 2³ = 8, 2⁵ = 32, and 2⁷ = 128 tips.
For power analyses, trait values were simulated per tree based
on Brownian motion (BM) and Ornstein–Uhlenbeck (OU) models with
σ² = 1, θ = 0, and α = 2, 4, 6, 8, or 10 (scaling the maximum
distance from a tip to the root of the tree to unity; see
Pavoine et al. 2008 for details). We analyzed type I error with four
models: (1) trait values drawn from a normal distribution with
mean θ = 0 and variance σ² = 1; (2) trait values drawn from
a log-normal distribution with mean θ = 0 and variance σ² = 1
on the log scale; (3) for n values simulated, n−1 were drawn
from the normal distribution and an extreme value was added as
max(n−1 values) + range(n−1 values) × 10; (4) for n values
simulated, n−1 were drawn from the normal distribution and
an extreme value was added as max(n−1 values) + range(n−1
values) × 100.
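The four null models used for the type I error analyses can be sketched as follows. This is an illustration written for this text, not the authors' script (their R scripts are in File S7): the model labels, the function name, and the random placement of the extreme value among the tips are our assumptions.

```python
import random

def null_trait_values(n, model, rng):
    """Draw n trait values with no phylogenetic signal under one of four models.

    model is one of "normal", "lognormal", "outlier10", "outlier100"
    (labels chosen here for illustration only).
    """
    if model == "normal":
        return [rng.gauss(0.0, 1.0) for _ in range(n)]
    if model == "lognormal":
        # Mean 0 and standard deviation 1 on the log scale.
        return [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
    if model in ("outlier10", "outlier100"):
        factor = 10.0 if model == "outlier10" else 100.0
        base = [rng.gauss(0.0, 1.0) for _ in range(n - 1)]
        # Extreme value: max(n-1 values) + range(n-1 values) * factor.
        extreme = max(base) + (max(base) - min(base)) * factor
        values = base + [extreme]
        rng.shuffle(values)  # assumption: the outlier is assigned to a random tip
        return values
    raise ValueError(f"unknown model: {model}")

rng = random.Random(1)
values = null_trait_values(8, "outlier10", rng)
print(sorted(values))
```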
With the random normal and the log-normal distributions of
trait values, all type I errors were correctly close to 5% (results
of the type I and power analyses are detailed in File S5). The dis-
tribution of the 10,000 simulated P-values per model was always
uniform (even) from 0 to 1. When the trees were ultrametric and
relatively well balanced (i.e., here with the pure birth model), the
distribution of P-values was still close to even from 0 to 1 when an
extreme value was added to the trait dataset. When the trees were
ultrametric but unbalanced and the number of tips was high (32 or
128) the type I error was inflated by an extreme trait value in which
case the distribution of P-values was right-skewed with many low
P-values and a few large P-values (typically when the diagonal
values of the phylogenetic proximity/differences matrices were
included) or of U-shape with high number of P-values close to 0
and 1 and low number of intermediate values (typically when the
diagonal values of the phylogenetic proximity/differences matri-
ces were not included). In contrast, when the number of tips was
low (eight tips only), the type I error decreased below 5%, leading
to overly conservative tests.
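The diagnostics described above (rejection rate at the 5% level and shape of the P-value distribution) can be computed from a vector of simulated P-values as in the following sketch. The function names and the choice of ten equal-width bins are our assumptions; the example P-values are an artificial evenly spaced grid, not simulation results.

```python
def type_one_error(p_values, alpha=0.05):
    """Proportion of null simulations rejected at level alpha."""
    return sum(p <= alpha for p in p_values) / len(p_values)

def p_value_histogram(p_values, n_bins=10):
    """Counts of P-values in n_bins equal-width bins covering [0, 1]."""
    counts = [0] * n_bins
    for p in p_values:
        # P-values equal to 1 fall into the last bin.
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return counts

# Artificial, perfectly even P-values used only to exercise the functions.
even_p = [(i + 0.5) / 1000 for i in range(1000)]
print(type_one_error(even_p))
print(p_value_histogram(even_p))
```

With real simulations, an excess of counts in the first bin would correspond to the right-skewed case, and an excess in both the first and last bins to the U-shaped case.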
When the trees were nonultrametric and whatever the branch-
length model, the type I error was correct for IR(C−1) with diag-
onal values and always near to correct (except with eight tips) for
K∗. It was inflated in all other cases except two: (1) with IR(C)
with diagonal values the type I error was lower than 5% and the
distribution of P-values was skewed to the left with 32 or 128 tips
and correctly close to 5% with eight tips; (2) with IN(C−1) and
the most extreme value we considered the type I error was lower
than 5% and the distribution of P-values was skewed to the left.
Powers of tests increased with the number of tips in the
phylogenetic trees as previously shown for instance by Pavoine
et al. (2008) and Hardy and Pavoine (2012). We present the results
of the power analyses for 128 tips in Figure 4. Results for eight
and 32 tips can be found in File S5. Whatever the number of tips,
we obtained the following main results (Fig. 4):
Result 1: Power is impacted by the shape of the phylogenetic
tree, the matrix of phylogenetic proximity/differences and the
way trait-based proximities are computed in the different statistics
based on Moran (1948) and Blomberg et al. (2003) and summa-
rized in Table 1.
Result 2: When A is used to describe phylogenetic prox-
imities, considering the positive diagonal values of matrix A as
defined in Pavoine et al. (2008) instead of arbitrarily setting them
equal to zeros can increase the power of the generalized Moran
tests.
Result 3: When C is used to describe phylogenetic proximi-
ties, the row normalization used in index IN decreases the power
of the test for phylogenetic signal.
Result 4: When C−1 is used to describe phylogenetic dif-
ferences, the differences in power among the generalized Moran
tests are strongly dependent on whether the phylogenetic tree is
ultrametric. With ultrametric trees, indices IR and IN used with
zero values on the diagonal of C−1 decrease power; the index IW
slightly decreases power in comparison with K, K∗, and IR with
positive diagonal values for C−1. With nonultrametric trees, K∗
and IR with positive diagonal values for C−1 have the highest
powers.
Result 5: The highest powers are associated with coefficients
that use C−1.
EVOLUTION MARCH 2013 8 3 5
S. PAVOINE AND C. RICOTTA
Figure 4. Power analyses for trees with 128 tips. Results are given per tree shape and per phylogenetic matrix used (A, C, or C−1). A
symbol legend is given at the bottom of the panel. All numerical values can be found in File S5. Results for trees with eight and 32 tips
can be found in File S5.
Result 6: The use of C to describe phylogenetic proximities
generally decreases power in comparison with A and C−1.
Because Pavoine et al. (2008) suggested that contrasting val-
ues in phylogenetic (dis)similarity matrices could increase the
power of the tests, we calculated the coefficient of variation (CV)
of matrices A, C, and C−1. Detailed results are given in File S6.
On average over all simulated trees, the CV increased from C
(1.645 for the off-diagonal values and 0.145 for the diagonal values),
through A (3.152 for the off-diagonal values and 1.637 for
the diagonal values), to C−1 (12.304 for the off-diagonal values,
in absolute value, and 1.342 for the diagonal values). The CV in-
creased with the number of tips for A, C−1, and the off-diagonal
values of C but not for the diagonal values of C.
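The coefficient of variation of a matrix, computed separately for its off-diagonal and diagonal values, can be sketched as below. This is a hedged illustration: the use of the population standard deviation, the function names, and the small example matrix (a hypothetical three-tip ultrametric tree) are our assumptions and are not taken from File S6.

```python
import statistics

def coefficient_of_variation(values, absolute=False):
    """Standard deviation divided by the mean; the mean must be nonzero."""
    cv = statistics.pstdev(values) / statistics.mean(values)
    # The absolute value is useful when all entries are negative.
    return abs(cv) if absolute else cv

def matrix_cvs(matrix, absolute=False):
    """Return (CV of off-diagonal values, CV of diagonal values)."""
    n = len(matrix)
    diagonal = [matrix[i][i] for i in range(n)]
    off_diagonal = [matrix[i][j] for i in range(n) for j in range(n) if i != j]
    return (coefficient_of_variation(off_diagonal, absolute),
            coefficient_of_variation(diagonal, absolute))

# Hypothetical matrix C for the tree ((t1,t2),t3) of height 1.
C = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
cv_off, cv_diag = matrix_cvs(C)
print(cv_off, cv_diag)
```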
Discussion
In this article, we showed that all tests for phylogenetic signal we
considered (Mantel, Moran, Abouheif, and Blomberg et al.) are
connected to each other being related to a cross-product statistic.
This means that the observed differences in their statistical perfor-
mances cannot depend on the tests themselves; the only relevant
change among the different approaches is the way the pairwise
phylogenetic and trait (dis)similarities are computed. A posteriori,
this seemingly counterintuitive result is not completely surpris-
ing. For instance, all tests for phylogenetic signal are aimed at
comparing two datasets that are usually produced in two different
formats: a phylogenetic tree and a vector (or a matrix) of traits.
Accordingly, all these methods explicitly or implicitly convert
trait values and phylogeny into pairwise (dis)similarities to calculate
the test statistic, which shifts the attention from the mathematical
formulation of the tests to how the (dis)similarity matrices are obtained.
This observation has a number of valuable consequences.
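The shared structure of these tests can be illustrated with a minimal permutation test built on a cross-product of a phylogenetic matrix and a trait matrix. The sketch below assumes that the phylogenetic matrix holds proximities (so that large cross-products indicate signal) and uses products of centred trait values as one possible trait-based proximity; it omits the normalizations that distinguish the individual indices, and all names and the toy data are hypothetical.

```python
import random

def cross_product(w, y):
    """Sum over all pairs (i, j) of w[i][j] * y[i][j]."""
    n = len(w)
    return sum(w[i][j] * y[i][j] for i in range(n) for j in range(n))

def trait_products(x):
    """One possible trait proximity: products of centred trait values."""
    m = sum(x) / len(x)
    return [[(a - m) * (b - m) for b in x] for a in x]

def permutation_test(w, x, n_perm=999, rng=None):
    """One-sided permutation P-value; trait values are shuffled among tips."""
    rng = rng or random.Random()
    observed = cross_product(w, trait_products(x))
    count = 1  # the observed arrangement counts as one permutation
    values = list(x)
    for _ in range(n_perm):
        rng.shuffle(values)
        # Small tolerance so that equivalent arrangements count as ties.
        if cross_product(w, trait_products(values)) >= observed - 1e-12:
            count += 1
    return observed, count / (n_perm + 1)

# Toy example: two pairs of sister tips with similar trait values.
w = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
x = [1.0, 1.1, 5.0, 5.2]
observed, p_value = permutation_test(w, x, rng=random.Random(42))
print(observed, p_value)
```

Replacing `w` or `trait_products` changes which (dis)similarities enter the statistic while the permutation scheme stays the same, which is the sense in which the tests are related.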
Here we analyzed three matrices of phylogenetic
(dis)similarities. We discovered C−1 as a matrix of phylogenetic
differences among tips where the phylogenetic difference depends
on volumes of subtrees degraded by the loss of one or two tips
in a phylogenetic tree. This definition of C−1 allowed us to unify
Mantel’s and Moran’s statistics with Blomberg et al. statistics and
thus to compare their performance within this unique statistical
framework of the cross-product. Power analyses for phylogenetic
signal tests were performed in Pavoine et al. (2008) and Hardy
and Pavoine (2012). In Pavoine et al. (2008) only IW was ana-
lyzed and it was shown that there may be a strong effect of the
choice of the phylogenetic similarity matrix in Moran’s permutation
test. In Hardy and Pavoine (2012), the focus was put on the Mantel
test and K, and it was shown that the power of the Mantel test
might in certain circumstances exceed the power associated with
K. Compared to these previous studies here we analyzed several
generalizations of Moran’s index, K, K∗, and KW, which altogether
encompassed a range of different ways of measuring phylogenetic
(dis)similarities and trait-based (dis)similarities among the tips of
a phylogenetic tree. We also analyzed the impact of the positive
diagonal of matrices A, C, C−1 of phylogenetic (dis)similarities
among tips on the performance of the tests.
Harmon and Glor (2010) stated that converting raw data into
matrices of pairwise differences among species is an inefficient
process that reduces power of tests. They also stated that, unlike
the Mantel test, data are not converted into pairwise distances
to calculate K. To the contrary, although this is not immedi-
ately evident, we have seen that data are actually converted into
(negative) measures of phylogenetic differences between tips. In
fact, Blomberg et al.’s K uses a particular phylogenetic matrix
of differences among tips that are positive on the diagonal and
negative elsewhere. In our simulations, this matrix C−1 of phylo-
genetic differences was often associated with the highest powers
of test for phylogenetic signal. When the trees were ultramet-
ric, K and IW (C−1) gave close but slightly lower powers than K∗
and IR(C−1). Pavoine et al. (2008) found that IW (A) had a higher
power than IW (C) whatever the tree shape except for comb-like
trees with which IW (C) had approximately the same but slightly
higher power than IW (A). A similar result was obtained here when
speciation events were driven toward the tips of the tree (see also
Hardy and Pavoine 2012 for a similar result with Mantel test).
However even in that case, our simulations showed that K∗ and
IR(C−1) had higher power than all statistics applied to matrices
A and C. When the trees were not ultrametric, K∗ and IR(C−1)
clearly reached much higher powers and correct type I errors
even in presence of outliers in trait values. IR(C−1) is very closely
related to Mantel tests, which confirms that Mantel test power
might be high with appropriate definitions of phylogenetic and
trait differences among tips (Hardy and Pavoine 2012). K∗ dif-
fers from the scheme of the Mantel test by the use of both a
phylogenetically weighted mean for the tips’ trait values and an
unweighted mean. Despite these differences we obtained similar
results in both type I and power analyses for K∗ and IR(C−1). More
generally, in our simulations, the diagonal values of phylogenetic
proximity/difference matrices tended to increase the power of the
tests. Contrary to what is done in spatial analysis, we thus recommend
their use in phylogenetic analysis. Finally, despite not being based
on branch lengths, matrix A provided surprisingly high powers, as
observed already by Pavoine et al. (2008). Matrix A with positive
diagonal values satisfies IW (A) = IR(A) = IN(A). A with any gen-
eralized I index is thus particularly recommended in all situations
when branch lengths are unavailable.
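The sign pattern of the matrix underlying Blomberg et al.'s K (positive on the diagonal, negative or zero elsewhere) can be checked on a small example. The sketch below assumes that C holds, for each pair of tips, the branch length shared from the root, and uses a hypothetical three-tip ultrametric tree; the inversion routine is a generic Gauss-Jordan elimination written for this illustration.

```python
from fractions import Fraction

def invert(matrix):
    """Invert a nonsingular square matrix by Gauss-Jordan elimination (exact fractions)."""
    n = len(matrix)
    # Augment the matrix with the identity.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Hypothetical tree ((t1,t2),t3) of height 1; t1 and t2 share a branch of length 1/2.
C = [[Fraction(1), Fraction(1, 2), Fraction(0)],
     [Fraction(1, 2), Fraction(1), Fraction(0)],
     [Fraction(0), Fraction(0), Fraction(1)]]
C_inv = invert(C)
for row in C_inv:
    print([str(v) for v in row])
```

In this toy case the sister tips t1 and t2 receive a negative off-diagonal value in the inverse, whereas t3, which is connected to the others only at the root, receives zeros.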
In addition to power analyses, we have provided a thorough
analysis of type I error. The type I error of tests for phyloge-
netic signals has generally been studied with normal traits and
the sensitivity to asymmetric distributions and to outliers was
neglected. Asymmetric distributions of trait values might result
from independent evolution along a phylogenetic tree where a tip
is greatly separated from the other tips. However, real data might also
contain outliers or might have intrinsic asymmetric distributions
even if this asymmetry is not the result of an evolution along a
long branch in a phylogenetic tree. Our simulations show that
most generalized Moran tests are impacted by outliers but not by
asymmetric distributions (here log-normal distributions). Although
K∗ and IR(C−1) were robust to outliers with nonultrametric
phylogenetic trees, they were, like all other statistics, associated
with inflated type I error in the presence of outliers with ultrametric
phylogenetic trees. We thus advocate that transformations be used on
data to limit the impact of outliers in all tests for phylogenetic
signal. For instance, body weight, one of the most studied traits,
can be transformed by cube root and logarithm to reduce the
effect of potential outliers.
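The effect of such transformations on an extreme value can be illustrated with a small sketch. The body masses below are invented for illustration, and the "outlier gap" (distance of the largest value from the second largest, in units of the range of the remaining values) is a diagnostic we define here by analogy with the way extreme values were constructed in the simulations.

```python
import math

def cube_root(values):
    """Cube-root transformation; assumes positive values."""
    return [v ** (1.0 / 3.0) for v in values]

def log_transform(values):
    """Natural-log transformation; assumes positive values."""
    return [math.log(v) for v in values]

def outlier_gap(values):
    """(largest - second largest) / range of all values except the largest."""
    s = sorted(values)
    return (s[-1] - s[-2]) / (s[-2] - s[0])

# Hypothetical body masses in grams, with one very heavy species.
masses = [20, 35, 50, 80, 120, 5000]
print(outlier_gap(masses))
print(outlier_gap(cube_root(masses)))
print(outlier_gap(log_transform(masses)))
```

For these invented values the gap is smallest after the logarithm and largest on the raw scale.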
Given the lack of theoretical justification for K and the lower
power and less stable type I error compared to K∗, we suggest
abandoning this statistic K. Compared to KW , K∗ was found to
be more powerful with correct type I error in our simulations and
should thus be preferred when used in permutation tests at least in
the situations covered by our simulations. Overall, based on our
results, we recommend the use of K∗ or IR(C−1) (with positive
diagonal values for C−1) to test for phylogenetic signal whatever
the tree shape when branch lengths are available and accurate.
Nevertheless, although branch lengths were assumed to be known
in our simulations, in many real datasets they are expected to be
estimated. Incorrect branch-length estimates could decrease the
power of K∗ and IR(C−1) to detect phylogenetic signal. To avoid
the unknown cost associated with assuming that the branch lengths
are known, Abouheif’s Cmean statistic can be recommended as it
provided nearly as high power as K∗ and IR(C−1) whatever the
tree shape. It should be recalled that all statistics assume that the
topology is known and accurate.
To explain differences in power of the statistics of
phylogenetic signal when different matrices of phylogenetic
(dis)similarities were used, Pavoine et al. (2008) suggested that
the high power associated with A could be due to more contrast-
ing values in A than in C. To demonstrate that, they calculated the
average CV of the off-diagonal values of A and C, and obtained
a higher CV with A. With our data, the CV of the three matrices
of phylogenetic (dis)similarity increased from C, through A, to
C−1. We also found that in the only situation where higher power
is associated with C than with A (when most speciation events
are moved back near the root of the phylogenetic tree), the CV
of C is higher than that of A. The suggestion of Pavoine et al. (2008) is
thus supported by our results. Pavoine et al. (2008) also suggested
that the higher power associated with A could be due to the fact
that A never considers that tips connected only at the root node
of the tree are not related. However, this hypothesis is refuted by
our present article: although C−1 considers that tips connected
only at the root node of the tree are not related, it is associated with
the highest power.
Future research now needs to take into account, in all these coefficients
of phylogenetic signal (Mantel r, generalized Moran
IR, IN, and IW, Blomberg et al. K, K∗, and the new KW), (1)
within-species variations or measurement errors; (2) groups of
traits; (3) traits of different statistical types. Within-species vari-
ations and measurement errors have already been considered for
K∗ (Ives et al. 2007). This consideration increased the power
of the permutation test based on K∗ (Hardy and Pavoine 2012).
In addition, when approaches that incorporate measurement er-
rors were used, phylogenetic signal could be detected even with
small phylogenies with low number of tips (Zheng et al. 2009).
Adapting these developments for the other indices is needed to
complete the evaluation of their performance. Regarding the use
of groups of traits, tests of phylogenetic signal based on combined
traits have been developed by Zheng et al. (2009). Jombart et al.
(2010) used IN with several traits at the same time to describe
lineage-dependent phylogenetic signals in combinations of traits.
These developments are particularly needed in ecology as many
ecological processes involve a combination of traits rather than a
single trait. From a biological viewpoint, the species ecological
behavior, for example, is expected to be driven by complex in-
teractions among functional traits that are not fully independent
from each other (see for instance Milla et al. 2009). Therefore,
in some cases the researcher may be more interested in testing
for phylogenetic signal in a combination of traits, rather than in a
single trait only, and this is easily done if single trait differences
between species are combined into a multivariate pairwise dis-
similarity matrix (Pavoine et al. 2009, Ricotta and Moretti 2010).
Regarding the statistical type of traits, while tests of phyloge-
netic signal are easily performed on quantitative variables or on
ordinal variables transformed to ranks, it is unclear how to deal
with nominal traits. The usual approach consists in coding the
information in nominal variables by as many independent binary
variables as the number of categories. Then, each binary variable
is tested separately for phylogenetic signal. However, the cate-
gories of a nominal variable may often be nonexclusive. That is,
a species may be characterized by the simultaneous presence of
two or more character states. These nonexclusive states may be
binary coded as is done for the exclusive states, or they may be
fuzzy coded (Chevenet et al. 1994). In this case, each character
state receives a positive score in the range of 0–1 that describes
the affinity of a species for that state. For example, diet habits
in animals or Grime’s (1979) CSR (Competitor-Stress tolerator-
Ruderal) strategies in plants are typically coded as fuzzy variables.
Nonetheless, irrespective of how nonexclusive nominal variables
are coded, a multivariate measure of dissimilarity or disagreement
between pairs of species seems the most straightforward solution
for summarizing interspecies functional differences (Podani and
Schmera 2007). The cross-product can thus be adapted easily to
deal with any type and number of traits.
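As an illustration of how nominal traits could enter the cross-product, the sketch below codes exclusive states as binary vectors and nonexclusive states as fuzzy affinities, then summarizes interspecies differences in a pairwise dissimilarity matrix. The Euclidean distance, the diet categories, and the species profiles are assumptions made for this example; the cited references discuss more appropriate dissimilarity coefficients.

```python
import math

def dummy_code(state, categories):
    """Binary coding of an exclusive nominal state."""
    return [1.0 if c == state else 0.0 for c in categories]

def euclidean(p, q):
    """Euclidean distance between two coded profiles of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def dissimilarity_matrix(profiles):
    """Pairwise dissimilarities between species given a dict of coded profiles."""
    names = list(profiles)
    return names, [[euclidean(profiles[a], profiles[b]) for b in names] for a in names]

categories = ("plants", "insects", "fish")
profiles = {
    "sp1": dummy_code("plants", categories),  # exclusive state
    "sp2": [0.5, 0.5, 0.0],                   # fuzzy coding: two diets
    "sp3": dummy_code("fish", categories),
}
names, D = dissimilarity_matrix(profiles)
for name, row in zip(names, D):
    print(name, [round(d, 3) for d in row])
```

A matrix such as `D` could then play the role of the trait-based (dis)similarities in a permutation test, whatever the number and type of traits it summarizes.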
We analyzed here a limited number of indices that can be
derived from the cross-product statistic. Overall the power of the
test is influenced by the choice of the phylogenetic and functional
interspecies (dis)similarities wij and yij. Although the choice of an
appropriate functional (dis)similarity depends on the number and
type (i.e., quantitative, ordinal, nominal, etc.) of the selected traits
(Legendre and Legendre 1998; Pavoine et al. 2009), the choice
of an appropriate phylogenetic (dis)similarity matrix primarily
depends on whether branch lengths are known and whether a par-
ticular model of evolution is assumed (e.g., Brownian model). In
this view, the general formulation of equation (1), which is at the
core of the tests for phylogenetic signal analyzed in this article
offers very high flexibility in the calculation of the phylogenetic
distance matrix W = [wij], depending on the available phylogeny
and on the problem under scrutiny, that is to say the reasons why
the test of phylogenetic signal is performed. In Abouheif’s test,
the matrix of phylogenetic proximity is obtained solely from the
tree topology assuming all branch lengths to be equal. However,
as stressed by Crozier (1997, p. 243): “[Phylogenetic] measures
using branch-lengths are better than procedures relying solely
on topology.” Despite that, we showed that matrix A maintains
high power of tests for phylogenetic signal especially when it
is used with its positive diagonal and matrix A was found often
highly associated with matrix C−1. These two matrices have the
particularity of being relative measures of phylogenetic proximi-
ties/differences, as the value of proximity/difference between two
tips is not only dependent on the shared history of the taxa at these
tips but also on the level of shared history with the other
tips of the tree. Such relative, integrative measures of phylogenetic
proximities/differences appear to enhance the power of tests.
To conclude, methods for testing for phylogenetic signal
in functional traits should, where possible, be designed to deal with the
particular question asked. The cross-product statistic offers a
common statistical framework for widely used tests for phylo-
genetic signal. It also offers large flexibility into well-known
tests for phylogenetic signal, which provides the opportunity for
adapting tests to the taxa at hand, the quality of the phylogeny and
trait measurements and to the particular question asked. By this
it opens new directions for analyzing multivariate phylogenetic
signals in biological traits ensuring correct type I error and high
power.
ACKNOWLEDGMENTS
The authors would like to thank the editor and anonymous referees for their useful comments on our article.
LITERATURE CITED
Abouheif, E. 1999. A method for testing the assumption of phylogenetic independence in comparative data. Evol. Ecol. Res. 1:895–909.
Blomberg, S. P., T. Garland, and A. R. Ives. 2003. Testing for phylogenetic signal in comparative data: behavioral traits are more labile. Evolution 57:717–745.
Chevenet, F., S. Doledec, and D. Chessel. 1994. A fuzzy coding approach for the analysis of long-term ecological data. Freshw. Biol. 31:295–309.
Cliff, A. D., and J. K. Ord. 1973. Spatial autocorrelation. Pion, London, U.K.
———. 1981. Spatial processes: models and applications. Pion, London, U.K.
Crozier, R. H. 1997. Preserving the information content of species: genetic diversity, phylogeny and conservation worth. Annu. Rev. Ecol. Evol. Syst. 24:243–268.
Dray, S., and A. B. Dufour. 2007. The ade4 package: implementing the duality diagram for ecologists. J. Stat. Softw. 22:1–20.
Geary, R. C. 1954. The contiguity ratio and statistical mapping. Inc. Stat. 5:115–145.
Getis, A. 1991. Spatial interaction and spatial autocorrelation: a cross-product approach. Environ. Plann. A 23:1269–1277.
Gittleman, J. L., and M. Kot. 1990. Adaptation: statistics and a null model for estimating phylogenetic effects. Syst. Zool. 39:227–241.
Grime, J. P. 1979. Plant strategies and vegetation processes. John Wiley and Sons, Chichester, MA.
Hansen, T. F., and E. P. Martins. 1996. Translating between microevolutionary process and macroevolutionary patterns: the correlation structure of interspecific data. Evolution 50:1404–1417.
Hardy, O., and S. Pavoine. 2012. Assessing phylogenetic signal with measurement error: a comparison of Mantel tests, Blomberg et al.’s K and phylogenetic distograms. Evolution 66:2614–2621.
Harmon, L. J., and R. E. Glor. 2010. Poor statistical performance of the Mantel test in phylogenetic comparative analyses. Evolution 64:2173–2178.
Harmon, L., J. Weir, C. Brock, R. Glor, W. Challenger, and G. Hunt. 2009. geiger: analysis of evolutionary diversification. R package version 1.3–1. Available at http://CRAN.R-project.org/package=geiger
Ives, A. R., P. E. Midford, and T. Garland, Jr. 2007. Within-species variation and measurement error in phylogenetic comparative methods. Syst. Biol. 56:252–270.
Jombart, T., and S. Dray. 2010. adephylo: exploratory analyses for the phylogenetic comparative method. Bioinformatics 26:1907–1909.
Jombart, T., S. Pavoine, S. Devillard, and D. Pontier. 2010. Putting phylogeny into the analysis of biological traits: a methodological approach. J. Theor. Biol. 264:693–701.
Kembel, S. W., P. D. Cowan, M. R. Helmus, W. K. Cornwell, H. Morlon, D. D. Ackerly, S. P. Blomberg, and C. O. Webb. 2010. Picante: R tools for integrating phylogenies and ecology. Bioinformatics 26:1463–1464.
Lebart, L. 1969. Analyse statistique de la contiguïté. Publication de l’Institut de Statistiques de l’Universite de Paris 28:81–112.
Legendre, P., and L. Legendre. 1998. Numerical ecology. Elsevier, Amsterdam, NL.
Le Foll, Y. 1982. Ponderation des distances en analyse factorielle. Statistiques et Analyse des Donnees 7:13–31.
Mantel, N. 1967. The detection of disease clustering and a generalized regression approach. Cancer Res. 27:209–220.
Mantel, N., and R. S. Valand. 1970. A technique of nonparametric multivariate analysis. Biometrics 26:547–558.
May, R. M. 1990. Taxonomy as destiny. Nature 347:129–130.
Milla, R., A. Escudero, and J. M. Iriondo. 2009. Inherited variability in multiple traits determines fitness in populations of an annual legume from contrasting latitudinal origins. Ann. Bot. 103:1279–1289.
Moran, P. A. P. 1948. The interpretation of statistical maps. J. R. Stat. Soc. Ser. B Stat. Methodol. 10:243–251.
Mouquet, N., V. Devictor, C. N. Meynard, F. Munoz, L.-F. Bersier, J. Chave, P. Couteron, A. Dalecky, C. Fontaine, D. Gravel, et al. 2012. Ecophylogenetics: advances and perspectives. Biol. Rev. 87:769–785.
Paradis, E., J. Claude, and K. Strimmer. 2004. APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20:289–290.
Pavoine, S., S. Ollier, and A. B. Dufour. 2005. Is the originality of a species measurable? Ecol. Lett. 8:579–586.
Pavoine, S., S. Ollier, D. Pontier, and D. Chessel. 2008. Testing for phylogenetic signal in phenotypic traits: new matrices of phylogenetic proximities. Theor. Popul. Biol. 73:79–91.
Pavoine, S., J. Vallet, A. B. Dufour, S. Gachet, and H. Daniel. 2009. On the challenge of treating various types of variables: application for improving the measurement of functional diversity. Oikos 118:391–402.
Podani, J., and D. Schmera. 2007. How should a dendrogram-based measure of functional diversity function. A rejoinder to Petchey and Gaston. Oikos 116:1427–1430.
R Development Core Team. 2012. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3–900051–07–0, URL http://www.R-project.org/
Revell, L. J. 2010. Phylogenetic signal and linear regression on species data. Methods Ecol. Evol. 1:319–329.
Revell, L. J., L. J. Harmon, and D. C. Collar. 2008. Phylogenetic signal, evolutionary process, and rate. Syst. Biol. 57:591–601.
Ricotta, C., and M. Moretti. 2010. Assessing the functional turnover of species assemblages with tailored dissimilarity matrices. Oikos 119:1089–1098.
Rohlf, F. J. 2001. Comparative methods for the analysis of continuous variables: geometric interpretations. Evolution 55:2143–2160.
———. 2006. A comment on phylogenetic correction. Evolution 60:1509–1515.
Rosenberg, M. S., and C. D. Anderson. 2011. PASSaGE: pattern analysis, spatial statistics and geographic exegesis. Version 2. Methods Ecol. Evol. 2:229–232.
Stadler, T. 2011. TreeSim: simulating trees under the birth-death model. R package version 1.5. Available at http://CRAN.R-project.org/package=TreeSim
Thioulouse, J., D. Chessel, and S. Champely. 1995. Multivariate analysis of spatial patterns: a unified approach to local and global structures. Environ. Ecol. Stat. 2:1–14.
Vane-Wright, R. I., C. J. Humphries, and P. H. Williams. 1991. What to protect? Systematics and the agony of choice. Biol. Conserv. 55:235–254.
Venables, W. N., and B. D. Ripley. 2002. Modern applied statistics with S. 4th ed. Springer, New York.
Von Neumann, J., R. H. Kent, H. R. Bellinson, and B. I. Hart. 1941. The mean square successive difference. Ann. Math. Stat. 12:153–162.
Zheng, L., A. R. Ives, T. Garland, Jr., B. R. Larget, Y. Yu, and K. Cao. 2009. New multivariate tests for phylogenetic signal and trait correlations applied to ecophysiological phenotypes of nine Manglietia species. Funct. Ecol. 23:1059–1069.
Associate Editor: E. Abouheif
Supporting Information
Additional Supporting Information may be found in the online version of this article at the publisher’s website:
File S1. Blomberg et al. (2003) K and K∗ are based on a cross-product.
File S2. Resemblance between the matrix C−1 used in Blomberg et al. (2003) and matrix A implicitly used in Abouheif (1999)
and defined by Pavoine et al. (2008).
File S3. Details on the statistics of phylogenetic signal.
File S4. Positive diagonal values in matrices of phylogenetic proximities/differences.
File S5. Power and type I error analyses—all values.
File S6. Coefficient of variation of matrices A, C, and C−1—all values.
File S7. R scripts.
File S8. R functions (see File S7 for explanations).