STATISTICAL MODELLING OF ARTEFACT
COMPOSITIONAL DATA*
M. J. BAXTER
Department of Mathematics, Statistics and Operational Research, Nottingham Trent University, Clifton Campus,
Nottingham NG11 8NS, UK
Model-based methods for clustering artefacts, given their chemical composition, often assume
sampling from a mixture of multivariate normal distributions and/or make explicit assumptions
about the way in which a composition is formed. It is argued that, analysed within a
modelling framework, several important and apparently competing methodologies are more
similar than would initially appear. The opportunity is taken to note that models for
populations are often not compatible with models for compositions, and that dilution
correction, which can be accomplished in a variety of ways, can be interpreted as an
attempt to resolve this problem.
KEYWORDS: CLUSTER ANALYSIS, DATA TRANSFORMATION, COMPOSITION, DILUTION,
MAHALANOBIS DISTANCE, MODEL, MULTIVARIATE NORMALITY
INTRODUCTION
This paper has arisen from a project that has, as one of its aims, the investigation of statistical
model-based approaches to the analysis of artefact compositional data. The particular focus is on
the use of model-based methods for grouping or clustering data. A model-based analysis is taken
to be an approach that incorporates some, or all, of the following features:
(1) A model is formulated for the composition of an artefact or case.
(2) A model is formulated that explains why artefacts (cases) of the same kind (and possibly
from the same source) differ in their composition. This may include assumptions about the
nature of the population from which a sample of artefacts is drawn.
(3) The data analysis is influenced by the modelling assumptions of (1) and/or (2).
Model-based approaches may be contrasted with exploratory methods, which include methods
of cluster analysis (e.g., average linkage, complete linkage) commonly used in the archaeometric
literature. Model-based methods typically involve the exploitation of statistical assumptions that
are absent from exploratory methods. They may be more demanding of data and computational
resources, but are also potentially more rewarding in ways to be discussed later.
The focus in this paper is on model-based approaches that have been used in the archaeometric
literature for the analysis of artefact compositional data. A main contention is that several
apparently competing, and superficially distinct, methodologies that have been proposed are
much more similar than is apparent at first sight. For convenience of exposition, there is an
emphasis on the analysis of ceramic compositional data, though the arguments advanced hold
more generally. A separate paper will investigate model-based clustering approaches that have
been proposed in the statistical literature, but which have yet to be exploited by archaeologists
(Papageorgiou et al. 2000).
© University of Oxford, 2001.
Archaeometry 43, 1 (2001) 131–147. Printed in Great Britain
* Received 13 March 2000; accepted 5 July 2000.
It is noted that models that have been proposed for ceramic paste compositions and models
that have been used to analyse samples of ceramics are contradictory, in the sense that
assumptions of the latter are incompatible with the implications of the former. What has
sometimes been referred to as `dilution correction' can be viewed as an attempt to resolve this
problem, and this is also studied from a modelling perspective.
The next two sections establish notation and develop the models for ceramic pastes and
samples of ceramics that underpin discussion in the main section of the paper. In the main
section methodologies developed in laboratories at Brookhaven, Missouri, Bonn and Barcelona
(Bieber et al. 1976; Glascock 1992; Beier and Mommsen 1994; Buxeda 1999) are examined, and
compared within an explicit modelling framework. In all but the last case, the methodologies
depend (at least in part) on assumptions about the normality of compositional data within groups,
and use probability calculations based on Mahalanobis distance calculations to assess whether
cases belong to a group (e.g., Sayre 1975; Sayre et al. 1992; Glascock 1992; Beier and
Mommsen 1994; Leese and Main 1994). Such calculations, which exploit the modelling
assumption of normality, are not possible in exploratory methodologies, and constitute one of
the major uses of modelling assumptions in statistical analyses of archaeometric data.
NOTATION AND TERMINOLOGY
A single artefact will be referred to as a case; a collection of n cases, to be subjected to chemical
and statistical analysis, will be called a sample. Models may be proposed for both case
composition and sample distribution, as will be seen in the next section.
The observed composition for the $i$th case will be denoted by the $p \times 1$ vector $x_i$, with $x_{ij}$ the
value for case $i$ of the $j$th variable, $X_j$. Let $\bar{x}_j$ and $s_j^2$ be the estimated mean and variance of the $j$th
variable, with $\bar{x}$ the $p \times 1$ vector of means.
The notation $x_{ij}$ will also be used for transformed data, as follows and as will be clear from the
context:
(1) $x_{ij} \leftarrow x_{ij}$, untransformed data;
(2) $x_{ij} \leftarrow (x_{ij} - \bar{x}_j)$, centred data;
(3) $x_{ij} \leftarrow (x_{ij} - \bar{x}_j)/s_j$, standardized data;
(4) $x_{ij} \leftarrow \log x_{ij}$, logged data;
(5) $x_{ij} \leftarrow \log(x_{ij}/x_{ip})$ for $j = 1, 2, \ldots, p - 1$, log-ratio transformed data.
The data matrix $X$ with typical element $x_{ij}$ will be called the untransformed, centred,
standardized, logged or log-ratio data matrix, according to which transform is used; and other
possibilities, such as standardized logged data, exist.
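The five transformations listed above can be sketched in a few lines of Python. This is only an illustration using the standard library: the function names are hypothetical, natural logarithms are assumed (the choice of base only rescales the logged data), and the last variable is taken as the log-ratio divisor, as in item (5).

```python
import math
import statistics

def centre(column):
    """Item (2): subtract the column mean from each value."""
    m = statistics.mean(column)
    return [x - m for x in column]

def standardize(column):
    """Item (3): centre, then divide by the standard deviation (n - 1 denominator)."""
    m = statistics.mean(column)
    s = statistics.stdev(column)
    return [(x - m) / s for x in column]

def logged(column):
    """Item (4): natural logarithm of each (strictly positive) value."""
    return [math.log(x) for x in column]

def log_ratio(row):
    """Item (5): log(x_ij / x_ip) for j = 1..p-1, last variable as divisor."""
    return [math.log(x / row[-1]) for x in row[:-1]]
```

Note that the first three functions act on a column (one variable across cases), whereas `log_ratio` acts on a row (one case across variables), so a case with p variables yields p - 1 log-ratios.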
The multivariate normal distribution (MVN) plays a fundamental role in most modelling
approaches to the analysis of artefact compositional data. If $x_i$ is sampled from a $p$-variate
multivariate normal distribution with mean vector $\mu$ and covariance matrix $\Sigma$, write
$$x_i \sim \mathrm{MVN}(\mu, \Sigma).$$
If all $n$ cases are sampled from the same MVN, then $\bar{x}$ is an unbiased estimate of $\mu$ and
$S = X'X/(n - 1)$, with $X$ the centred data matrix, is an unbiased estimate of $\Sigma$.
In many applications, data are analysed with the expectation that there is more than one group
or cluster in the data, re¯ecting the underlying population structure. In this situation assume that
the data are sampled from G populations, where G may be unknown in advance of analysis. Let
$n_g$ be the number of cases sampled from the $g$th population, so that $\sum_{g=1}^{G} n_g = n$. Let
$\gamma = (\gamma_1, \ldots, \gamma_n)'$ be a labelling vector such that $\gamma_i = g$ if case $i$ is a sample from the $g$th
population. Then $X_g$ is a data matrix whose rows consist of those $x_i'$ for which $\gamma_i = g$. That is, $X_g$
is an $n_g \times p$ (or $n_g \times (p - 1)$ for log-ratios) data matrix. It is typically assumed that
$$x_i \sim \mathrm{MVN}(\mu_g, \Sigma_g)$$
where $\mu_g$ and $\Sigma_g$ are the mean vector and covariance matrix for population $g$ and $x_i'$ is a row of
$X_g$.
MODELS FOR CERAMIC COMPOSITIONAL DATA
Models for cases
To fix ideas, consider a model for a ceramic made from two components, a clay and a temper.
The composition of the ceramic paste may be modelled as
$$y_i = p_{1i} z_{1i} + p_{2i} z_{2i} \qquad (1)$$
where the $p \times 1$ vector $z_{ci}$, $c = 1, 2$, is the composition of component $c$ and the $p_{ci}$, $c = 1, 2$, are
mixing proportions with $p_{1i} + p_{2i} = 1$. This is a weighted sum of the two components, and
expresses, in mathematical language, a natural idea about how pastes are formed.
To explain why two pastes formed from components from the same clay and temper vary in
composition, statistical assumptions must be invoked. One obvious reason (e.g., Neff et al.
1988, 1989) is that mixing proportions, $p_{ci}$, may vary. Another is that $z_{ci}$ is randomly distributed
within a source. If this distribution is modelled as MVN we have a model for a case of the
form
$$y_i \sim p_{1i}\,\mathrm{MVN}(\nu_1, \Theta_1) + p_{2i}\,\mathrm{MVN}(\nu_2, \Theta_2) \qquad (2)$$
where $\nu_c$ and $\Theta_c$ are the mean vector and covariance matrix that characterize variation within the
source for component $c$.
Equation (2) generalizes directly to $C > 2$ components, as
$$y_i \sim \sum_{c=1}^{C} p_{ci}\,\mathrm{MVN}(\nu_c, \Theta_c) \qquad (3)$$
and to other materials, where $\sum_{c=1}^{C} p_{ci} = 1$.
The paste composition $y_i$ may be modified by a variety of processes, so that the observed
composition $x_i$ may differ from that of the paste. Formally, this can be written as
$$x_i \leftarrow y_i \leftarrow z_{1i}, z_{2i}, \ldots, z_{Ci}$$
to indicate the transformation from components to paste to measured composition. To simplify
presentation it is assumed, unless otherwise stated, that there is no significant difference between
the paste and observed composition, so $x_i$ and $y_i$ can be used interchangeably, and equation (3) is
the model for $x_i$.
In equation (3), $y_i$ (or $x_i$) is a weighted sum of normal distributions, and is thus itself normal.
The precise form of the distribution depends on the mixing proportions, $p_{ci}$, so that, in general,
two cases $x_i$ and $x_j$ will be samples from different normal distributions. A sample of $n$ cases (with
the same component sources) can thus be regarded as a mixture of $n$ MVNs (assuming that the $p_{ci}$
differ). Since a mixture of $n$ MVNs is not, in general, MVN, an immediate consequence is that the
sample is not itself MVN.
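As a worked illustration of this argument, suppose (an added assumption, not stated in the text) that the component compositions $z_{ci}$ are sampled independently of one another and that the mixing proportions are treated as fixed for case $i$. Standard results for weighted sums of independent normal vectors then give the distribution of a single case explicitly:

$$y_i \mid p_{1i}, \ldots, p_{Ci} \;\sim\; \mathrm{MVN}\Big(\sum_{c=1}^{C} p_{ci}\,\nu_c,\; \sum_{c=1}^{C} p_{ci}^2\,\Theta_c\Big).$$

Both the mean vector and the covariance matrix depend on the $p_{ci}$, so two cases with different mixing proportions are draws from different normal distributions, and a sample pooled over such cases is a mixture rather than a single MVN.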
The model for case compositions in equation (3) is superficially similar to, and inspired by,
models used in the simulation studies of Neff et al. (1988, 1989) and Bishop and Neff (1989) to
investigate the effects of tempering in the statistical analysis of compositional data. However, it
is not identical. They model the $z_{ci}$ as log-normal rather than normal, so that for a single variable
a multiplicative model for the paste composition of the form
$$y_{ij} = \prod_{k=1}^{C} z_{kij}^{p_{ki}}$$
is obtained. This is less natural than either the additive model discussed above or the alternative
multiplicative model introduced later in equation (5). If individual components are log-normal
the paste composition will not, of course, be normal under the additive model.
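The link between the multiplicative form and log-normality can be made explicit by taking logarithms of the product above; this is a restatement of the displayed model rather than an additional assumption:

$$\log y_{ij} = \sum_{k=1}^{C} p_{ki} \log z_{kij}.$$

If each $z_{kij}$ is log-normal, each $\log z_{kij}$ is normal, so for fixed proportions $\log y_{ij}$ is a weighted sum of normal variates. The weighted-sum structure of the additive model therefore reappears on the log scale.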
Models for samples
Neff (1998, 116), in a discussion of chemistry-based ceramic provenance studies, argues that the
fundamental challenge is to `align geographic co-ordinate units with the multi-dimensional
space defined by measured elemental concentration units'. This is based on the provenance
postulate of Weigand et al. (1977, 24) `that there exist differences in chemical composition
between different natural sources that exceed, in some recognizable way, the differences
observed within a given source'. Neff (1998, 116) equates `source' with a `point or zone of
origin' from which the materials used to make a pot originate.
This can be viewed as a `model' of the kind of variation to be expected in samples of cases
obtained from different sources. Ideally, cases will form separated clusters in a high-dimensional
space determined by the number of variables measured. These clusters may then be viewed as
samples from distinct populations that may be equated with `source' and located in geographical
space.
These ideas can be fleshed out in the form of statistical models, one possibility being a mixture
model of the form
$$x_i \sim \sum_{g=1}^{G} p_g\,\mathrm{MVN}(\mu_g, \Sigma_g) \qquad (4)$$
where the $p_g$ are the mixing proportions, and $\sum_{g=1}^{G} p_g = 1$. It is important to emphasize that what
we now have is a model for the population from which the sample of cases $x_i$ is drawn, rather
than a model for the composition of a single case, as previously.
The model states that the observed sample is drawn from G separate populations, which
have distinct MVN distributions. The MVN assumption is not essential, but is that almost
invariably used in model-based approaches. Many provenance studies are not based on
models such as equation (4), but rely on exploratory methods of cluster analysis to determine
the value of G, and associate cases with the population from which they are sampled. This
approach, or any sensible grouping method, is likely to work well if the component
populations of the mixture are well separated, and does not require the MVN assumption.
Problems can arise if components are not well separated and/or exhibit similar and high
variable correlations within different populations.
The main point to make here is that the models for cases in equation (3) and for samples in
equation (4) are incompatible. In particular, if the former model is valid, it cannot be assumed
that a sample of cases from a population have an MVN distribution, and hence the assumption
that populations in the latter model are MVN must be invalid. The former model is the more
fundamental and it then follows that any modelling approach based on the latter model must be
wrong, unless corrective action is taken. Some practitioners are clearly aware of this difficulty
and attempts to avoid or minimize the problem are discussed after first looking at how models
have been used in practice.
MODELS IN PRACTICE
Mahalanobis distance
Mahalanobis distance plays an important role in several approaches that make use of models that
can be related to equation (4). Its properties are discussed at some length in Baxter (1999) and
Baxter and Buck (2000), and only the most salient features are reviewed here.
The squared Mahalanobis distance between a case and the centroid of a group is defined as
$$d_i^2 = (x_i - \bar{x})' S^{-1} (x_i - \bar{x}).$$
It can be viewed as a generalization of Euclidean distance that allows for the correlation
structure of a group. The smaller $d_i^2$ is, the closer a case is to a group centroid. For sufficiently
`small' $d_i^2$ a case may plausibly be regarded as a member of the group.
The idea of `small' may be made precise by introducing the modelling assumption that the
group is a sample from a $p$-variate MVN population. This allows $d_i^2$ to be transformed to a
probability (p-value). If the p-value is too small (conventionally less than 0.05 or 0.01) it may
be doubted that the case belongs to the group against which it is being tested.
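The calculation just described can be sketched as follows. This is a minimal illustration, not any laboratory's implementation: all function names are hypothetical, only the Python standard library is used, and the p-value is based on the large-sample assumption that the squared distance is approximately chi-square distributed with $p$ degrees of freedom. Small-sample refinements are not attempted.

```python
import math

def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance_matrix(rows):
    """Sample covariance matrix S with the n - 1 denominator."""
    n, m = len(rows), mean_vector(rows)
    centred = [[v - mj for v, mj in zip(r, m)] for r in rows]
    p = len(m)
    return [[sum(c[j] * c[k] for c in centred) / (n - 1) for k in range(p)]
            for j in range(p)]

def solve(a, b):
    """Solve a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix (is n > p?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][k] * w[k] for k in range(i + 1, n))
        w[i] = (aug[i][n] - s) / aug[i][i]
    return w

def mahalanobis_sq(x, group):
    """d^2 = (x - xbar)' S^{-1} (x - xbar), with xbar and S taken from the group."""
    diff = [xj - mj for xj, mj in zip(x, mean_vector(group))]
    w = solve(covariance_matrix(group), diff)
    return sum(d * wj for d, wj in zip(diff, w))

def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df (closed-form series)."""
    if x <= 0:
        return 1.0
    chi = math.sqrt(x)
    z = math.exp(-x / 2) / math.sqrt(2 * math.pi)  # normal density at chi
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return math.sqrt(2 * math.pi) * z * total
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2)) + 2 * z * total

# Toy bivariate group (made-up values) and a case tested against it.
group = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
d2 = mahalanobis_sq((3.0, 1.0), group)
p_value = chi2_sf(d2, df=2)
```

For the toy data the squared distance is 3 and the approximate p-value is exp(-1.5), about 0.22, so the case would not be rejected at the conventional 0.05 level. With only four cases the chi-square approximation is of course not to be trusted; the example is meant to show the arithmetic only.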
Details of implementation of this idea vary, with some researchers basing the probability
calculations on large sample approximations (e.g., Beier and Mommsen 1994) and others on
small sample approximations (e.g., Glascock 1992). The more sophisticated uses of the idea take
into account whether or not $x_i$ belongs to the group against which its membership is being tested
(e.g., Leese and Main 1994; Slane et al. 1994).
In ceramic studies this usage of Mahalanobis distance can be traced to work at the Brookhaven
National Laboratory (BNL) in the 1970s (Sayre 1975; Bieber et al. 1976; Harbottle 1976). It has
subsequently been developed by researchers, some now at the Missouri University Research
Laboratory (MURR) (Neff 1992; Glascock 1992) and will be referred to as the BNL/MURR
approach. In certain approaches to the statistical analysis of lead isotope ratio data, Mahalanobis
distance is also important (Sayre et al. 1992). Lead isotope fields for an ore source are assumed
to have a trivariate normal distribution. Samples are used to estimate the distribution of a field,
and cases too distant from the centroid of the sample are excluded from this estimation process.
Once a field has been delineated in this way, Mahalanobis distance and probability calculations,
based on the lead isotope signatures of artefacts, may be used to assess whether an ore source is a
possible provenance for an artefact.
A major practical problem in using Mahalanobis distance that has long been recognized is the
sample size requirement. For a single group, as a minimum, n > p is necessary in order to be able
to estimate S. For stable estimation much larger values of n are needed, with n > 3p and
preferably n > 5p having been suggested (Harbottle 1976). The sample size implications, when
there are several groups in the data and p is large, as is typical with many modern analytical
techniques such as Neutron Activation Analysis (NAA), are obvious.
Even for problems where the dimensionality is small, as in lead isotope ratio analysis where
$p = 3$, the requirements are not trivial. Pollard and Heron (1996, 328) detected an emerging
consensus that $n = 20$ was an `agreeable minimum' sample size. This has been challenged in
Baxter et al. (2000) who argue that 20 may be adequate if lead isotope fields are MVN
distributed, but will usually be inadequate if one wishes to test this modelling assumption.
Two themes emerge here, that recur in the study of model-based methods. One is that sample
sizes may preclude the use of model-based methods that one would like to use. The second is
that, even if such use is possible, it will often not be feasible to test the assumptions of the model,
so that the model's validity is essentially an act of faith.
The BNL/MURR approach
The BNL/MURR approach is not prescriptive, but rather may be viewed as comprising a set of
tools to be deployed as appropriate (e.g., Slane et al. 1994; Lizee et al. 1995; Steponaitis et al.
1996; Hegmon et al. 1997; Triadan et al. 1997; Herrera et al. 1999). The discussion that follows
concentrates on aspects relevant to model-based approaches to statistical analysis.
Typically data are logged (to base 10) before analysis. One reason for this is an assumption
that the variables (often trace elements measured in ppm) are more likely to be MVN within
groups on the log scale, and the MVN assumption is necessary for other procedures that are used.
The merits, or otherwise, of transformation are discussed in Sayre (1975), Bieber et al. (1976),
Pollard (1986), Beier and Mommsen (1994), Baxter (1995) and below. It is often assumed that
measurement error is unimportant in relation to `natural' variation, so that the former can be
ignored, and an explicit justification for this assumption is given in Bieber et al. (1976).
A recurring concern is the problems posed by high correlations among variables within
populations (Sayre 1975; Harbottle 1976, 1991). If present, these give rise to populations, and
samples from them, that have an elliptical shape in p-dimensional space. Exploratory methods of
cluster analysis are the multivariate workhorse in the analysis of compositional data, and often
the only method presented in publications (Baxter 1994). These often produce spherical clusters
of roughly equal size and can be misleading if the true structure is ellipsoidal. This can be
demonstrated explicitly for Ward's method of cluster analysis, but may also be the case for other
supposedly `model-free' methods (Gordon 1999, 65–8). Where this is a concern a solution is to
model the correlation or covariance structure explicitly (Krzanowski and Marriott 1995, 89).
The BNL/MURR approach relegates cluster analysis to a minor role and applies group
evaluation methodology to groups provisionally defined using cluster analysis or on the basis of
archaeological criteria. Using the assumption that the groups being sought are samples from
MVN distributions, probability calculations are undertaken using Mahalanobis distance to
evaluate whether or not a case could belong to a group. This allows for the elliptical shape of
clusters, provided that sample sizes permit the calculation. Where this is not immediately
possible, recourse may be had to a subset of the principal components of the data (e.g., Slane et
al. 1994; Herrera et al. 1999) or to subsets of variables that appear to discriminate between
groups (Bieber et al. 1976, 67–8). Cases may be added to, or deleted from, a group and the
process repeated until a stable clustering is found. It is worth noting that, where initial groups are
defined using multivariate methods, this often involves the use of unstandardized logged data.
Subsequent group evaluation does, however, introduce standardization through the use of
Mahalanobis distance.
It may be remarked that, in assuming an MVN distribution, cases that do not conform with the
assumption are likely to be rejected from groups, so that what finally remains will tend to be
MVN. In this sense the MVN assumption might be regarded as a `self-fulfilling prophecy' that
tends to impose MVN structure on the results obtained.
The Bonn approach
The approach developed at Bonn University (e.g., Mommsen et al. 1988; Beier and Mommsen
1994) challenges several aspects of the BNL/MURR methodology, including the need for using
logged data, the lack of importance of measurement error, and the importance of high
correlations. The argument to follow is that the methodologies are more similar than might
appear at first sight.
The central ideas in Beier and Mommsen (1994) are readily explained. Starting from a single
case, or a small number of similar cases, a group is `grown' by adding to it cases that are `close'
to the starting set. This gives a new group, and cases close to this are further added to the group.
This process proceeds iteratively until no cases are close to the group. A second group is then
`grown' from a different starting point and so on until all cases are assigned to a group or
regarded as outliers. The fundamental modelling assumption is that groups are sampled from an
MVN distribution. Closeness to a group is decided on the basis of distance and probability
calculations, underpinned by the MVN assumption, using either weighted Euclidean or
Mahalanobis distance.
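The iterative growth of a single group can be sketched as below. This is a simplified illustration and not the Bonn software: the names are hypothetical, closeness is judged by a weighted Euclidean distance (squared standardized deviations summed over variables, so correlations are ignored), and a fixed cutoff stands in for the probability calculation. The cutoff of 5.991 used in the example is the 5% critical value of chi-square with 2 degrees of freedom, which is an assumption appropriate only for two roughly uncorrelated variables.

```python
import statistics

def weighted_sq_distance(x, group):
    """Sum over variables of squared standardized deviations from the group mean."""
    columns = list(zip(*group))
    return sum(((xj - statistics.mean(col)) / statistics.stdev(col)) ** 2
               for xj, col in zip(x, columns))

def grow_group(cases, seed, cutoff, max_iter=100):
    """Grow a group from the seed indices until no further case is close enough.

    Assumes the seed holds at least two cases and that every variable
    varies within the current group (otherwise the standard deviation is zero).
    """
    members = set(seed)
    for _ in range(max_iter):
        group = [cases[i] for i in sorted(members)]
        close = {i for i, x in enumerate(cases)
                 if weighted_sq_distance(x, group) <= cutoff}
        updated = members | close
        if updated == members:  # stable: nothing new was added
            break
        members = updated
    return sorted(members)

# Toy data (made-up values): four similar cases and three clearly different ones.
cases = [(1.0, 10.0), (1.2, 10.5), (0.9, 9.8), (1.1, 10.2),
         (5.0, 20.0), (5.2, 20.5), (4.9, 19.7)]
first_group = grow_group(cases, seed=[0, 1, 2], cutoff=5.991)
```

Starting from the first three cases, the fourth is absorbed on the first pass and the remaining three stay outside, so a second group would then be grown from a different starting point.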
Groups are defined sequentially and iteratively, whereas the BNL/MURR method involves a
simultaneous determination of groups that are subsequently refined in an iterative fashion.
Arguably, the spirit underlying both methods is very similar.
The Bonn methodology allows for measurement uncertainty. In assessing whether a case $x_i$
could belong to a group, $S$ in the definition of Mahalanobis distance is replaced by $\hat{S} = S_x + S$,
where $S_x$ is a diagonal matrix of the measurement `uncertainties' (i.e., the variance of the
analytical error). The $j$th diagonal element of $\hat{S}$ may be written as $s_{xj}^2 + s_j^2$, where the former
term is the analytical error variance, and the latter term the estimated variance of the variable
within the group, which incorporates both analytical and natural error. This generalization of
other methodologies may have little effect in many practical situations if either natural variation
dominates analytical variation, or if there are only a few variables for which analytical variation
is the major component of variability within a group.
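The effect of adding the analytical error variance can be seen in a small sketch. It is restricted, as an explicit simplification, to the diagonal terms only, so that it corresponds to a weighted Euclidean distance rather than the full Mahalanobis distance; the function name and the numbers are hypothetical.

```python
def uncertainty_weighted_sq_distance(x, group_mean, group_var, meas_var):
    """Squared distance with variance s_xj^2 + s_j^2 for variable j.

    group_var holds the within-group variances s_j^2 and meas_var the
    analytical error variances s_xj^2; off-diagonal terms are ignored.
    """
    return sum((xj - mj) ** 2 / (sx2 + s2)
               for xj, mj, s2, sx2 in zip(x, group_mean, group_var, meas_var))

# Made-up values: the case differs from the group mean in the first variable only.
with_error = uncertainty_weighted_sq_distance(
    (12.0, 5.0), (10.0, 5.0), (3.0, 1.0), (1.0, 0.5))
without_error = uncertainty_weighted_sq_distance(
    (12.0, 5.0), (10.0, 5.0), (3.0, 1.0), (0.0, 0.0))
```

Allowing for analytical error can only inflate the variance in the denominator, so the distance shrinks (here from 4/3 to 1) and a case becomes somewhat easier to accept into a group. When the analytical variances are small relative to the within-group variances the two distances are nearly equal, which is the point made in the text.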
It is argued, in Beier and Mommsen (1994), that there is no need to transform data
logarithmically, as this gives similar results to the use of untransformed data. Some of the
detailed evidence for this is given in Beier and Mommsen (1991). Evaluating the general
validity of this claim raises a number of issues that have wider implications. One reason for a
lack of distinction between results based on untransformed and logged data appears to be the
relatively small spread of variable values within groups. This may well be a function of the
particular type of ®ne ware studied, and does not necessarily generalize. It is unlikely that it is
the high precision of results that allows the formation of groups with quite small spreads (Beier
and Mommsen 1991), since precision of measurement and the natural spread of a variable in a
group are logically unrelated. It is possible that the modelling methodology used gives rise to
results of the kind reported. In particular, the groups examined for normality are de®ned by an
algorithm that assumes normality, and which may impose that kind of structure on the groups
found. This relation between modelling assumptions and structure found has already been noted
for the BNL/MURR methodology.
Another possible reason for the lack of a distinction between results for untransformed and
logged data is the role played by standardization. Beier and Mommsen (1994) argue against the
use of principal component analysis, on the basis that this usually involves the use of
standardized data, and that this changes as cases are added to a data base. However, their
approach is also dependent on standardization. The main difference is that standardization takes
place within groups rather than across the sample as a whole (and will change as these groups are
iteratively modified). Baxter (1995) found little difference in results obtained in studies using
standardized data and standardized logged data, except in the presence of clear outliers. Since
the Bonn methodology does depend on standardization, and outliers are excluded in group
formation, it follows that results may not be strongly dependent on whether or not data are
logged.
The gist of the argument so far is that the Bonn and BNL/MURR approaches are more similar
than would seem at first sight. The refinement of allowing for analytical error in the grouping
procedure will make little difference when natural variation dominates. When high-precision
data are used, and analytical variation dominates natural variation, groups are likely to be tightly
defined and found by any reasonable method. When analytical variation dominates and precision
is low it is questionable whether the elements so affected should be used in clustering. It has also
been argued that the use of untransformed as opposed to logged data is not a critical difference.
The treatment of highly correlated data is considered in the subsection on modelling dilution
effects.
Multiplicative models
An alternative multiplicative model for case compositions to that introduced earlier is that of
Buxeda (1999). This has statistical implications in the sense that a particular log-ratio
transformation of the data, not widely employed in archaeometric studies, is indicated for
data analysis. Given this transformation, any of the available methods of statistical analysis
might then be used. It is argued in what follows that Buxeda's (1999) approach will often be well
approximated by the simpler use of logged data, bringing it into the mainstream of methodologies
that have been proposed for artefact compositional analysis.
Buxeda's (1999) model views the composition as a perturbation of the original clay
composition, $z_{1i}$ say. If there are $A$ separate perturbation processes, a multiplicative model,
for a single element, of the form
$$\tilde{x}_{ij} = z_{1ij} u_{ij} \qquad (5)$$
is obtained, where
$$u_{ij} = \prod_{a=1}^{A} u_{aij}$$
and $u_{aij} > 0$ represents the effect of the perturbation at the $a$th stage, on the composition that
exists at that point.
If all naturally occurring elements ($D$ say) in the periodic table are measured, the $x_{ij}$ must sum
to 100%, so that the observed composition is of the form
$$x_{ij} = 100\,\tilde{x}_{ij} \Big/ \sum_{j=1}^{D} \tilde{x}_{ij}$$
and $\sum_{j=1}^{D} x_{ij} = 100$. This gives rise to (fully) compositional data in the sense of Aitchison
(1986).
The compositional constraint presents problems for `standard' statistical analysis, documented
in Aitchison (1982, 1986). One way of avoiding these problems, and that advocated in
Buxeda (1999), is to base analyses on log-ratios of the form
$$y_{ij} = \log(x_{ij}/x_{iD}) \qquad (6)$$
for $j = 1, 2, \ldots, D - 1$ and a suitable choice of the $D$th element for the divisor. Part of the
reasoning behind this choice is that, whereas $x_{ij}$ is constrained to lie between 0 and 100%, $y_{ij}$ is
unrestricted and more amenable to analysis by standard statistical methods. In practice, of
course, only $p \ll D$ elements are used. Provided that these are such that $\sum_{j=1}^{p} x_{ij} < 100$, similar
considerations apply with $y_{ij} = \log(x_{ij}/x_{ip})$ and $j = 1, 2, \ldots, p - 1$. If the $p$ measured elements
are such that $\sum_{j=1}^{p} x_{ij} \ll 100$, there is less of a case for using the log-ratio transformation (see the
discussion of Aitchison 1982), but this debate is not pursued here.
Equation (5) can be viewed as a mathematical model of ceramic compositional data, the form
of which suggests that a particular data transformation be applied before statistical analysis. That
the transformation depends on the choice of divisor may be viewed as a potential weakness of
the methodology, and is the focus of some attention in Buxeda (1999), following Aitchison
(1986, 1990). The choice is intimately linked to the concept of the variation matrix defined
(Aitchison 1990) as the $p \times p$ matrix with typical element
$$\tau_{ij} = \mathrm{var}\{\log(X_i/X_j)\} = v_i^2 + v_j^2 - 2 r_{ij} v_i v_j$$
where $\mathrm{var}\{\cdot\}$ is the variance; $v_i^2$ is the variance of the logarithm of variable $i$, $\mathrm{var}\{\log(X_i)\}$; and
$r_{ij}$ is the correlation between $\log(X_i)$ and $\log(X_j)$. The total variation in the data is then defined as
$$v_t = \sum_{ij} \tau_{ij}/2p$$
and
$$\tau_{.s} = \sum_{i} \tau_{is}$$
is the total variance in the log-ratio covariance matrix when variable $s$ is used as a divisor. It can
be shown that $\tau_{.s} > v_t$, and the excess is interpretable as variability imposed by the choice of
variable $s$ as the divisor in equation (6). Buxeda's (1999) strategy is to choose as a divisor that
variable for which $v_t/\tau_{.s}$ is a maximum. In other words, variable $s$ is chosen to impose the least
variability.
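The divisor selection rule can be sketched as follows, using only the definitions given above. The function names and the toy data are hypothetical, natural logarithms are assumed, and variances are computed with the n - 1 denominator; the sketch is an illustration of the rule as described here, not a reproduction of Buxeda's (1999) calculations.

```python
import math
import statistics

def variation_matrix(rows):
    """tau_ij = var{log(X_i / X_j)}, estimated from the cases in rows."""
    p = len(rows[0])
    logs = [[math.log(v) for v in row] for row in rows]
    tau = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i != j:
                tau[i][j] = statistics.variance([lr[i] - lr[j] for lr in logs])
    return tau

def choose_divisor(rows):
    """Return the index s maximizing vt / tau_.s, and the list of ratios."""
    tau = variation_matrix(rows)
    p = len(tau)
    vt = sum(sum(row) for row in tau) / (2 * p)        # total variation
    column_totals = [sum(tau[i][s] for i in range(p)) for s in range(p)]
    ratios = [vt / total for total in column_totals]  # assumes no total is zero
    best = max(range(p), key=lambda s: ratios[s])
    return best, ratios

# Toy data (made-up values): the third variable is constant across cases and
# the logs of the first two are uncorrelated.
rows = [(1.0, 2.0, 5.0), (2.0, 8.0, 5.0), (4.0, 8.0, 5.0), (8.0, 2.0, 5.0)]
best, ratios = choose_divisor(rows)
```

In this constructed example the constant third variable is selected, since dividing by it adds no variability of its own.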
The choice of divisor requires the minimization of
$$\tau_{.s} = \sum_{i=1}^{p} v_i^2 + p v_s^2 - 2 v_s \sum_{i=1}^{p} r_{is} v_i.$$
The first term on the right-hand side is constant for all $s$, so it is the last two terms that must be
minimized. If, on the log-scale, a variable is approximately constant, these last two terms will
be approximately zero, and analysis is then effectively based on the log-transformed data of the
remaining variables, a standard procedure (e.g., Bieber et al. 1976; Glascock 1992).
This ideal is unlikely to arise in practice, but the thrust of Buxeda's (1999) strategy is to
choose the variable for which this state is most closely approximated. It can thus be conjectured
that the use of log-ratios proposed by Buxeda (1999) will often be closely approximated by the
simpler procedure of using logarithms of variables. The argument above is heuristic, but
empirical evidence suggests that it is reasonable. In Buxeda's (1999) analysis of ceramic data
from Abella results are largely determined by just six of 14 ratios used. These account for most
of the variance in the data on the unstandardized log-ratio scale. It is straightforward to
determine, empirically, that the same six variables dominate an analysis based on unstandardized
log-transformed data, and lead to virtually identical results. Essentially this happens
because the transformations (log-ratio or log) differentially weight the variables in the absence
of subsequent standardization, leading to implicit variable selection of the same variables. In the
context of analyses of glass compositional data that sum to 100%, Baxter (1993) noted a
tendency for a small number of minor oxides with high variances on the log-ratio scale to
dominate analysis. Re-analysis of several glass data sets of the kind referred to has confirmed
that virtually identical results are obtained if unstandardized log-transformed data are used.
Finally, some previous debate on the relative merits of using log-transformed and log-ratio data
has demonstrated that they produce similar results (Church 1995; Hoard et al. 1995). The
analysis given above may help explain why.
Likelihood and Bayesian clustering
Other model-based approaches to clustering that have received limited use in archaeometry are
only noted briefly. These include the Bayesian methodology of Buck and Litton (1996) and Buck
et al. (1996), which assumes that within a particular provenance a sample of cases (possibly after
transformation) is drawn from an MVN distribution. The total sample is assumed to be drawn
from a mixture of G such distributions, where G is unknown. The procedure is illustrated in Buck
and Litton (1996) for a 150 × 15 data set, in which there are three fairly clear groups.
Similar assumptions underpin classification and mixture maximum likelihood models that
have also had little archaeometric application. They are investigated in Papageorgiou et al.
(2000). One potential attraction of both methodologies is that tests of the numbers of groups in
the sample are possible. A second potential attraction is the ability to model elliptical groups of
the kind to be expected with correlated data. Krzanowski and Marriott (1995) give a concise
account of the mathematics of the methodologies.
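The mixture assumption shared by these approaches can be written compactly. The notation below (mixing proportions $\pi_g$, means $\mu_g$, covariance matrices $\Sigma_g$ and the MVN density $\phi$) is introduced here for illustration and is not taken from the cited works.

```latex
f(y_i) = \sum_{g=1}^{G} \pi_g \, \phi(y_i; \mu_g, \Sigma_g),
\qquad \pi_g \ge 0, \quad \sum_{g=1}^{G} \pi_g = 1
```

Allowing each $\Sigma_g$ to be a full covariance matrix is what permits elliptical groups, and comparing fits for different $G$ is what makes tests of the number of groups possible.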
Modelling `dilution' effects
The term `dilution' has been used in various ways in the literature and is introduced here through
an idealized example. Suppose that for a two-component paste, modelled as in equation (1),
repeated below in slightly modified form,
$$y_i = p_{1i} z_{1i} + (1 - p_{1i}) z_{2i}$$
the clay, $z_{1i}$, involves $(p - 1)$ non-zero elements and is identical for each case in a sample of $n$.
Suppose, also, that the temper consists of a single element, different from those in the clay. The
composition for a case is then
$$(p_{1i} z_{1i1},\; p_{1i} z_{1i2},\; \ldots,\; p_{1i} z_{1i(p-1)},\; 100(1 - p_{1i}))$$
since the $p$th variable comprises 100% of the temper. It is clear in this formulation that the
composition of cases from the same clay source differs only because of differences in $p_{1i}$ or,
equivalently, the proportion of temper in the paste. This effect, in which the variable addition of
temper to a paste can obscure the similarity of the clay compositions, has sometimes been
referred to as a dilution effect.
What we have here is a simple model of dilution. If interest centres on identifying cases for
which the clay source is the same, or similar, the model can be used to remove or understand the
effect of tempering, in order better to identify cases with similar sources. Bishop and Neff's
(1989, 83) emphasis on the importance of modelling encompasses this kind of modelling, as
distinct from the more purely statistical models discussed elsewhere in this paper.
In the present simple example, a dilution correction can be accomplished in several equivalent
ways. One is to identify and remove the tempering element from the composition and rescale
remaining elements to sum to 100%. A second possibility is to remove the tempering element
and work with ratios of the form
$$y_{ij}/y_{ik} = p_{1i} z_{1ij} / (p_{1i} z_{1ik}) = z_{1ij}/z_{1ik}$$
for $j \neq k$ and some choice of $k$, in which the effect of tempering is `cancelled out'. A third
possibility is to note that, for distinct cases $i$ and $j$,
$$y_{ik}/y_{jk} = p_{1i} z_{1ik} / (p_{1j} z_{1jk}) = p_{1i}/p_{1j} = a_{ij}$$
where $a_{ij}$ is constant for all $k$, and estimate $a_{ij}$ using any $k$. Once this is done, the values of any
case may be adjusted to match, as closely as possible, that of any other case or, as is more
commonly done, a group mean. These ideas, while simple and based on an idealized model,
form the basis of much that has been done in practice to deal with dilution effects.
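The three corrections can be checked on the idealized model with a minimal sketch. The clay composition and clay proportions below are invented for illustration, and the function names are hypothetical; the temper is taken to be a single element stored last, as in the example above.

```python
# Hypothetical clay composition over p - 1 = 3 elements (percent, sums to 100).
clay = [60.0, 30.0, 10.0]

def paste(p1, clay):
    """Composition of a paste with clay proportion p1 and a one-element temper."""
    return [p1 * z for z in clay] + [100.0 * (1.0 - p1)]

def rescale_without_temper(y):
    """Correction 1: drop the temper element (last) and rescale to 100%."""
    kept = y[:-1]
    total = sum(kept)
    return [100.0 * v / total for v in kept]

def ratios(y, k=0):
    """Correction 2: ratios y_j / y_k over the non-temper elements, j != k."""
    kept = y[:-1]
    return [v / kept[k] for j, v in enumerate(kept) if j != k]

def scale_factor(y_i, y_j, k=0):
    """Correction 3: a_ij = y_ik / y_jk for a non-temper variable k."""
    return y_i[k] / y_j[k]

case_a = paste(0.9, clay)  # 10% temper
case_b = paste(0.6, clay)  # 40% temper

# Corrections 1 and 2 give the same values for both cases.
print(rescale_without_temper(case_a), rescale_without_temper(case_b))
print(ratios(case_a), ratios(case_b))

# Correction 3: dividing case_a by a_ab matches the clay part of case_b.
a_ab = scale_factor(case_a, case_b)
matched = [v / a_ab for v in case_a[:-1]]
print(a_ab, matched, case_b[:-1])
```

The equivalence holds only because the clay is identical across cases and the temper shares no element with it; with real data the three routes will generally differ.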
For $i = 1, \ldots, n$ and any pair of variables, $j, k$, not involved in the temper, a bivariate plot of $y_{ij}$
against $y_{ik}$ will show a scatter of points lying on a straight line, or a vector passing through the
origin. This can be seen by noting that the plot is of $p_{1i} z_{1ij}$ against $p_{1i} z_{1ik}$ and, by assumption,
$z_{1ik} = b z_{1ij}$ for some constant $b$ and for all $i$. Another way of stating this is that the effect of
dilution, of the kind being modelled, will be to induce high positive correlations among the
variables. In the present case the effect of a dilution correction (based on the centroid of the point
scatter, for example) will be to `shrink' all observations to that point and remove all correlation from the
data.
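A small numerical sketch of this point follows. The clay composition and proportions are hypothetical, and estimating the scale factor from the first variable is an arbitrary choice that is harmless only because the model is exact here.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)

# Hypothetical clay composition and clay proportions; not real data.
clay = [60.0, 30.0, 10.0]
proportions = [0.95, 0.90, 0.80, 0.70, 0.60]
cases = [[p * z for z in clay] for p in proportions]

# Any two non-temper variables are exactly proportional across cases.
r = pearson([c[0] for c in cases], [c[1] for c in cases])

# Correct each case towards the centroid, estimating a_i from variable 0.
centroid = [sum(col) / len(cases) for col in zip(*cases)]
corrected = []
for c in cases:
    a_i = c[0] / centroid[0]
    corrected.append([v / a_i for v in c])

# Largest remaining deviation of any corrected value from the centroid.
spread = max(abs(v - m) for c in corrected for v, m in zip(c, centroid))
print(r, spread)
```

Before correction the correlation is one up to rounding; afterwards every case coincides with the centroid, so no variation is left for a correlation to describe.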
It is, in fact, possible to view high positive correlations among variables as potentially
indicative of dilution, where dilution is now interpreted in a much more general sense than an
effect due to tempering. This is the view taken in Beier and Mommsen (1994, 295–6) who note a
variety of technical effects that can give rise to `dilution', which they define to `include both
shifts due to different additional components in the clay and due to technical effects'. From this
perspective dilution can give rise to elliptical clouds of points, in p-dimensional hyperspace, in
which the major axis of the ellipse passes through the origin. It may be remarked that this
definition of dilution encompasses data that are naturally and highly correlated, a point that is
considered further below.
In practice, of course, matters are more complicated. Clays from the same source will vary;
tempers will consist of more than one element; and the elemental composition of clays and
tempers will overlap. Nevertheless, several researchers have considered the tempering model
used above to be sufficiently close to what may sometimes occur in practice to devote time to
developing methods to correct for it.
More realistically, and expressed somewhat informally, `dilution' may be occurring if, for two
cases, i and j,
$$y_{ik} \approx a_{ij} y_{jk}$$
for some constant $a_{ij}$ and for a majority, $p'$, of the $p$ variables. In our idealized example $a_{ij}$ could
be determined from the ratio $y_{ik}/y_{jk}$ for any variable, $k$, not involved in the temper. In practice,
different $k$ will give rise to different values, so that $a_{ij}$ must be estimated in some way.
In the best relative fit method developed at BNL (Harbottle 1976), if it is thought that a case $y_i$
is related to a group with mean $\bar{y}$ and a dilution model of the form $y_{ij} = a_i \bar{y}_j$ is postulated, $a_i$ is
estimated as
$$\hat{a}_i = \left[ \prod_{j=1}^{p'} (y_{ij}/\bar{y}_j) \right]^{1/p'}$$
the geometric mean of the correction factor determined separately for each element considered
to be affected by dilution. From this, adjusted values of the form $y_{ij}/\hat{a}_i$ can be calculated. An
arithmetic mean might also be used (e.g., Mommsen et al. 1988, 50).
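A minimal sketch of the geometric-mean estimate and the resulting adjustment is given below. The function names and the numerical values are hypothetical, and the lists are assumed to contain only the elements judged to be affected by dilution.

```python
import math

def best_relative_fit(y, ybar):
    """Geometric mean of y_j / ybar_j over the elements supplied."""
    log_ratios = [math.log(a / b) for a, b in zip(y, ybar)]
    return math.exp(sum(log_ratios) / len(log_ratios))

def arithmetic_fit(y, ybar):
    """Arithmetic-mean alternative to the geometric mean."""
    return sum(a / b for a, b in zip(y, ybar)) / len(y)

# Hypothetical group mean and case; not values from any published data set.
group_mean = [5.0, 2.0, 1.0, 3.0]
case = [3.9, 1.5, 0.78, 2.2]

a_hat = best_relative_fit(case, group_mean)
adjusted = [v / a_hat for v in case]  # adjusted values y_ij / a_hat
print(a_hat, arithmetic_fit(case, group_mean))
print(adjusted)
```

After adjustment the geometric mean of the ratios to the group mean is one, so the case is matched to the mean in overall level while its profile across elements is left unchanged.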
The most general of the procedures proposed in Beier and Mommsen (1994) is numerically
more complex. For matching a case to a mean a modified Mahalanobis `distance' of the
form
$$d_i^2 = (v_i x_i - \bar{x})' \hat{S}_{v_i}^{-1} (v_i x_i - \bar{x})$$
is used, where $v_i$ is a parameter, used to model the dilution effect, that is estimated to minimize
$d_i^2$, and
$$\hat{S}_{v_i} = v_i^2 S_x + S.$$
In general, $v_i$ must be determined numerically, although simplification is possible if the
measurement uncertainty is ignored.
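One reading of this simplification is sketched below; it is a derivation under stated assumptions rather than a quotation from Beier and Mommsen (1994). If the measurement-uncertainty term is dropped, so that the covariance matrix in the distance reduces to $S$ and no longer depends on $v_i$, and if $S$ is symmetric and positive definite, the distance is quadratic in $v_i$ and the minimizer is available in closed form.

```latex
\begin{aligned}
d_i^2(v_i) &= v_i^2\, x_i' S^{-1} x_i - 2 v_i\, x_i' S^{-1} \bar{x} + \bar{x}' S^{-1} \bar{x}, \\
\frac{\partial d_i^2}{\partial v_i} &= 2 v_i\, x_i' S^{-1} x_i - 2\, x_i' S^{-1} \bar{x} = 0
\;\Longrightarrow\;
\hat{v}_i = \frac{x_i' S^{-1} \bar{x}}{x_i' S^{-1} x_i}.
\end{aligned}
```

When the uncertainty term is retained the covariance matrix itself depends on $v_i$, which is why a numerical search is then needed.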
Buxeda's (1999) methodology, being based on log-ratios, will deal with dilution effects of
the kind under consideration. Similarly, ratios in the form
$$y_{ij} = \log\{x_{ij}/g(x_i)\}$$
where $g(x_i)$ is the geometric mean of the elements of $x_i$, have been used explicitly to deal with
dilution effects arising from tempering in Leese et al. (1989) and, less transparently, to deal with
dilution arising for technical reasons, in Pike and Fulford (1983).
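The reason this transformation deals with dilution of the kind modelled is that it is unchanged when a case is multiplied by a constant. A minimal sketch follows, with a hypothetical function name and invented values; it assumes that any tempering element has already been excluded from the case.

```python
import math

def centred_log_ratio(x):
    """Return log(x_j / g(x)) for each element, g(x) being the geometric mean."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [l - mean_log for l in logs]

# Hypothetical clay part of a composition and a diluted version of it.
x = [54.0, 27.0, 9.0]
diluted = [0.6 * v for v in x]

print(centred_log_ratio(x))
print(centred_log_ratio(diluted))
```

The two printed vectors agree up to rounding, and each sums to zero, since the scale factor cancels between $x_{ij}$ and $g(x_i)$.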
To illustrate some of the foregoing ideas, a data set published in Tubb et al. (1980) on the
chemical composition of Romano-British pottery, measured by atomic absorption spectrometry,
will be used. In its original form this is a 48 × 9 data set. The pottery comes from five kiln sites in
three regions and previous multivariate analyses suggest the three regions are chemically
distinct. This is shown in the upper plot of Figure 1 based on a principal component analysis of
scaled data. Four oxides (Fe2O3, MgO, CaO and K2O), identified in Tubb et al. as the only ones
necessary for discrimination, have been used, and one outlier with an unusually low value of K2O
has been omitted (C14 in the original publication).
The separation of the three regional groups is evident. Groups 1 and 2 are dispersed in
comparison to group 3 and, to the centre right of the plot, there is a single specimen of group 1
that is outlying with respect to the rest of the group. It has previously been noted that this is a
multivariate outlier, possibly attributable to dilution effects (Baxter 1999). This can be examined
and corrected for in a variety of ways. For example, if the model $y_{ij} = a_i \bar{y}_j$ is postulated this
gives rise to a model of the form $\log(y_{ij}) = \log(a_i) + \log(\bar{y}_j)$. In Figure 2 $\log(y_{ij})$ is plotted
against $\log(\bar{y}_j)$ (using base 10 logarithms) along with the associated regression line. The solid
line is that which would be obtained in the absence of dilution (i.e., $\log(a_i) = 0$). It is
approximately parallel to the regression line and the difference between the two lines, of
about $-0.14$, gives an informal estimate of $\log(a_i)$. This suggests that $a_i$ is about 0.73 or, in other
words, that values for the specimen be multiplied by about 1.37 ($= 1/0.73$) to `correct' for
dilution. Using exact calculations, the best relative fit method leads to $a_i = 0.76$ and a
multiplying factor of 1.32. Use of the Beier–Mommsen approach, ignoring measurement
error, gives a multiplying factor of 1.34. Even more simply, averaging $\bar{y}_j/y_{ij}$ leads to a
multiplying factor of 1.33. In this instance, the different methods of correcting for dilution
lead to very similar results.
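The link between the log-scale plot and the other estimates can be sketched as follows. The values are hypothetical stand-ins, not the Tubb et al. (1980) data, and the line is constrained to unit slope, which is an assumption made here; the regression line in Figure 2 is not so constrained.

```python
import math

# Hypothetical group means and single case for four variables.
group_mean = [5.0, 2.0, 1.0, 3.0]
case = [3.9, 1.5, 0.78, 2.2]

# Intercept of a unit-slope line through (log10 mean, log10 case) points:
# the average vertical offset, which estimates log10(a_i).
offsets = [math.log10(y) - math.log10(m) for y, m in zip(case, group_mean)]
log_a = sum(offsets) / len(offsets)
a_hat = 10 ** log_a

# Multiplying factors from the log-scale estimate and from averaging ybar_j / y_ij.
factor_from_logs = 1.0 / a_hat
factor_from_mean_ratio = sum(m / y for y, m in zip(case, group_mean)) / len(case)
print(log_a, a_hat, factor_from_logs, factor_from_mean_ratio)
```

Under the unit-slope constraint the back-transformed intercept is exactly the geometric mean of the ratios of case values to group means, which is one way of seeing why the informal graphical estimate and the best relative fit value are close.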
Using this last approach to obtaining a correction factor, and adjusting all cases in group 1 to
the mean of that group, and similarly for group 2, leads to the principal components plot in the
lower part of Figure 1. It can be seen that, with the exception of four cases in group 1, the groups
are more concentrated and spherical than in the original analysis.
Figure 1 Principal component plots of scaled data using a subset of variables and cases from Tubb et al. (1980) (see text
for details). The upper figure uses the original data, and the lower after `correcting' for dilution in two of three regional
groups.
It is interesting to view these attempts to deal with dilution in the context of the statistical
models for compositional data that have been used in this paper. It has been argued that the MVN
assumption used in all the modelling applications discussed is at odds with the more
fundamental model of case composition presented in equation (3). One way to reconcile the
two models is to transform case compositions in such a way that the MVN assumption for a
sample from a component population is more likely to be true. Dilution correction procedures
can be viewed as an attempt to do precisely this, and the variety of approaches noted above all
stem from the same simple model of dilution.
When sample sizes preclude the use of Mahalanobis distance, recourse must be had to
Euclidean distance, and this is less than ideal when dealing with highly elliptical clusters.
Dilution corrections will, if successful, have the effect of reducing correlations within
groups, possibly quite considerably (Beier and Mommsen 1994), so that Euclidean
methods are more satisfactory. Beier and Mommsen (1994) interpret their results as showing
that the prevalence of highly correlated data, and the problems it causes, has been
exaggerated, but their methodology does not distinguish between naturally and artificially
correlated data. The foregoing argument suggests that their methodology can be interpreted
as an approach to data transformation that will generate approximately spherical and
normally distributed groups if successful.
Thus, from a statistical standpoint, dilution correction can be viewed as a methodology for
making model assumptions more valid and easing the computational burden.
Figure 2 Using four variables, the logged (to base 10) values for a single case are plotted against the logged means for
each of the variables and a regression (dotted) line is shown. The good linear fit is approximately parallel to the solid line
that would be obtained if the case had values equal to the means and is indicative of a dilution effect. The vertical
distance between the lines provides an estimate of $\log(a_i)$, where $a_i$ defines the proportionate relation between the case
and `variable' means in the presence of dilution.
DISCUSSION
This paper has examined a number of competing approaches to the statistical analysis of artefact
compositional data within a model-based framework. It has been argued that, despite their
apparent differences, the methods examined have strong similarities and might often be
expected to produce similar results in practice.
Using a modelling framework makes quite explicit the fact that models that have been
entertained for artefact compositions are at variance with models that are used for grouping
cases. Whether this is a serious problem will depend on the separation of groups and departure
from multivariate normality of a sample from a population. Methods that assume normality will
tend to impose normal structure on the groups found, regardless of the `true' situation, and may
mislead about their effectiveness. Different approaches to dilution correction can be viewed as
attempts to transform compositions to normality that satisfy the assumptions of grouping
procedures, although this is not the primary reason for the development of such approaches. In
the absence of dilution and presence of non-normality, simpler methods, such as the use of log
transformation, exist.
Finally, the creators of the methods discussed here might not necessarily view what they do as
`model-based', although adopting such a view helps in understanding how methods compare.
More thorough-going model-based Bayesian and likelihood methods of grouping data,
developed in the statistical literature and little used in archaeometry, have been noted. They
have the potential attraction that clusters of highly correlated variables can be dealt with in a
natural fashion, and tests of the number of clusters in the data are possible. Some of these
methods are investigated in more detail in Papageorgiou et al. (2000).
ACKNOWLEDGEMENTS
I am particularly grateful to Thomas Beier, Caitlin Buck, Jaume Buxeda i Garrigós, Hans Mommsen and Hector Neff for
discussing their approaches to data analysis with me. There is no implication that they necessarily agree with my
interpretations, and any misunderstandings and errors are entirely my responsibility. Jaume Buxeda i Garrigós is thanked,
additionally, for providing and allowing use of his data from Abella. My colleagues Christian Beardah and Ioulia
Papageorgiou are thanked for contributing to my understanding of the practicalities of implementing some of the methods
discussed. This work forms part of the GEOPRO Research Network funded by the DGXII of the European Commission,
under the TMR Network Programme (Contract Number ERBFMRX-CT98-0165).
REFERENCES
Aitchison, J., 1982, The statistical analysis of compositional data (with discussion), Journal of the Royal Statistical
Society, B44, 139–77.
Aitchison, J., 1986, The statistical analysis of compositional data, Chapman and Hall, London.
Aitchison, J., 1990, Relative variation diagrams for describing patterns of compositional variability, Mathematical
Geology, 22, 487–511.
Baxter, M. J., 1993, Comment on D. Tangri and R. V. S. Wright, `Multivariate analysis of compositional data ...',
Archaeometry, 35, 112–15.
Baxter, M. J., 1994, Exploratory multivariate analysis in archaeology, Edinburgh University Press, Edinburgh.
Baxter, M. J., 1995, Standardization and transformation in principal component analysis, with applications to
archaeometry, Applied Statistics, 44, 513–27.
Baxter, M. J., 1999, Detecting multivariate outliers in artefact compositional data, Archaeometry, 41, 321–38.
Baxter, M. J., and Buck, C. E., 2000, Data handling and statistical analysis, in (eds. E. Ciliberto and G. Spoto) Modern
analytical methods in art and archaeology, 681–746, John Wiley, New York.
Baxter, M. J., Beardah, C. C., and Westwood, S., 2000, Sample size and related issues in the analysis of lead isotope data,
Journal of Archaeological Science, 27, 973–80.
Beier, T., and Mommsen, H., 1991, On the distribution function of elements within groups of pottery and some
consequences for multivariate analysis, unpublished conference paper.
Beier, T., and Mommsen, H., 1994, Modified Mahalanobis filters for grouping pottery by chemical composition,
Archaeometry, 36, 287–306.
Bieber, A. M., Brooks, D. W., Harbottle, G., and Sayre, E. V., 1976, Application of multivariate techniques to analytical
data on Aegean ceramics, Archaeometry, 18, 59–74.
Bishop, R. L., and Neff, H., 1989, Compositional data analysis in archaeology, Archaeological chemistry IV in (ed. R. O.
Allen), American Chemical Society Advances in Chemistry Series 220, 57–86, Washington, DC.
Buck, C. E., and Litton, C. D., 1996, Mixtures, Bayes and archaeology, in Bayesian statistics 5 (eds. J. M. Bernardo, J. O.
Berger, A. P. Dawid and A. F. M. Smith), 499–506, Clarendon Press, Oxford.
Buck, C. E., Cavanagh, W. G., and Litton, C. D., 1996, Bayesian approach to interpreting archaeological data, John
Wiley, New York.
Buxeda i Garrigós, J., 1999, Alteration and contamination of archaeological ceramics: the perturbation problem, Journal
of Archaeological Science, 26, 295–313.
Church, T., 1995, Comment on `Neutron-activation analysis of stone from the Chadron formation and a Clovis site on the
Great Plains' by Hoard et al. (1992), Journal of Archaeological Science, 22, 1–5.
Glascock, M. D., 1992, Characterization of archeological ceramics at MURR by neutron activation analysis and
multivariate statistics, in Chemical characterization of ceramic pastes in archaeology (ed. H. Neff), 11–26,
Prehistory Press, Madison, Wisconsin.
Gordon, A. D., 1999, Classification, 2nd edn, Chapman and Hall/CRC, London.
Harbottle, G., 1976, Activation analysis in archaeology, in Radiochemistry 3 (ed. G. W. A. Newton), 33–72, Chemical
Society, London.
Harbottle, G., 1991, The efficiencies and error-rates in Euclidean Mahalanobis searches in hypergeometries of
archaeological ceramic compositions, in Archaeometry 90 (eds. E. Pernicka and G. A. Wagner), 413–24, Birkhäuser
Verlag, Basel.
Hegmon, M., Allison, J. R., Neff, H., and Glascock, M. D., 1997, Production of San Juan red ware in the Northern
Southwest: insights into regional interaction in early Puebloan prehistory, American Antiquity, 62, 449–63.
Herrera, R. S., Neff, H., Glascock, M. D., and Elam, J. M., 1999, Ceramic patterns, social interaction, and the Olmec:
neutron activation analysis of Early formative Pottery in the Oaxaca Highlands of Mexico, Journal of Archaeological
Science, 26, 967–87.
Hoard, R. J., Holen, S. R., Glascock, M. D., and Neff, H., 1995, Additional comments on neutron-activation analysis of
stone from the Great Plains: reply, Journal of Archaeological Science, 22, 7–10.
Krzanowski, W. J., and Marriott, F. H. C., 1995, Multivariate analysis: part 2, Edward Arnold, London.
Leese, M. N., and Main, P. L., 1994, The efficient computation of unbiased Mahalanobis distances and their
interpretation in archaeology, Archaeometry, 36, 307–16.
Leese, M. N., Hughes, M. J., and Stopford, J., 1989, The chemical composition of tiles from Bordesley, in Computer
applications and quantitative methods in archaeology 1989 (eds. S. Rahtz and J. Richards), 241–9, BAR
International Series 548, British Archaeological Reports, Oxford.
Lizee, J. M., Neff, H., and Glascock, M. D., 1995, Clay acquisition and vessel distribution patterns: neutron activation
analysis of Late Windsor and Shantok tradition ceramics from southern New England, American Antiquity, 60,
515–30.
Mommsen, H., Kreuser, A., and Weber, J., 1988, A method for grouping pottery by chemical composition,
Archaeometry, 30, 47–57.
Neff, H. (ed.), 1992, Chemical characterization of ceramic pastes in archaeology, Prehistory Press, Madison, Wisconsin.
Neff, H., 1998, Units in chemistry-based provenance investigations of ceramics, in Measuring time, space and material:
unit issues in archaeology (eds. A. F. Ramenofsky and A. Steffen), 115–27, University of Utah Press, Provo, UT.
Neff, H., Bishop, R. L., and Sayre, E. V., 1988, A simulation approach to the problem of tempering in compositional
studies of archaeological ceramics, Journal of Archaeological Science, 15, 159–72.
Neff, H., Bishop, R. L., and Sayre, E. V., 1989, More observations on the problem of tempering in compositional studies
of archaeological ceramics, Journal of Archaeological Science, 16, 57–69.
Papageorgiou, I., Baxter, M. J., and Cau, M. A., 2000, Model-based cluster analysis of artefact compositional data
(submitted for publication).
Pike, H. H. M., and Fulford, M. G., 1983, Neutron activation analysis of black-glazed pottery from Carthage,
Archaeometry, 25, 77–86.
Pollard, A. M., 1986, Multivariate methods of data analysis, in Greek and Cypriot pottery: a review of scientific studies
(ed. R. E. Jones), 56–83, Occasional Paper 1, British School at Athens Fitch Laboratory, Athens.
Pollard, A. M., and Heron, C., 1996, Archaeological chemistry, Royal Society of Chemistry, Cambridge.
Sayre, E. V., 1975, Brookhaven procedures for statistical analyses of multivariate archaeometric data, Unpublished
manuscript.
Sayre, E. V., Yener, K. A., Joel, E. C., and Barnes, I. L., 1992, Statistical evaluation of the presently accumulated lead
isotope data from Anatolia and surrounding regions, Archaeometry, 34, 73–105.
Slane, K. W., Elam, J. M., Glascock, M. D., and Neff, H., 1994, Compositional analysis of eastern sigillata A and related
wares from Tel-Anafa (Israel), Journal of Archaeological Science, 21, 51–64.
Steponaitis, V. P., Blackman, M. J., and Neff, H., 1996, Large scale patterns in the chemical composition of
Mississippian pottery, American Antiquity, 61, 555–72.
Triadan, D., Neff, H., and Glascock, M. D., 1997, An evaluation of the archaeological relevance of weak-acid extraction
ICP: White Mountain redware as a case study, Journal of Archaeological Science, 24, 997–1002.
Tubb, A., Parker, A. J., and Nickless, G., 1980, The analysis of Romano-British pottery by atomic absorption
spectrophotometry, Archaeometry, 22, 153–71.
Weigand, P. C., Harbottle, G., and Sayre, E. V., 1977, Turquoise sources and source analysis: Mesoamerica and the
southwestern U.S.A., in Exchange systems in prehistory (eds. T. K. Earle and J. E. Ericson), 15–34, Academic Press,
New York.