Advanced Review

Minimum covariance determinant

Mia Hubert1∗ and Michiel Debruyne2

1Department of Mathematics-LStat, Katholieke Universiteit Leuven, Celestijnenlaan 200B, B-3001 Leuven, Belgium
2Department of Mathematics and Computer Science, University of Antwerp, Middelheimlaan 1, B-2020 Antwerp, Belgium
∗Correspondence to: [email protected]

DOI: 10.1002/wics.61

The minimum covariance determinant (MCD) estimator is a highly robust estimator of multivariate location and scatter. It can be computed efficiently with the FAST-MCD algorithm of Rousseeuw and Van Driessen. Since estimating the covariance matrix is the cornerstone of many multivariate statistical methods, the MCD has also been used to develop robust and computationally efficient multivariate techniques.

In this paper, we review the MCD estimator, along with its main properties such as affine equivariance, breakdown value, and influence function. We discuss its computation, and list applications and extensions of the MCD in theoretical and applied multivariate statistics. © 2009 John Wiley & Sons, Inc. WIREs Comp Stat 2010, 2:36–43.

The minimum covariance determinant (MCD) estimator is one of the first affine equivariant and highly robust estimators of multivariate location and scatter.1,2 Being resistant to outlying observations makes the MCD very helpful in outlier detection. Although already introduced in 1984, its main use only started after the introduction of the computationally efficient FAST-MCD algorithm of Rousseeuw and Van Driessen.3 Since then, the MCD has been applied in numerous fields such as medicine, finance, image analysis, and chemistry. Moreover, the MCD has also been used to develop many robust multivariate techniques, such as principal component analysis, factor analysis, and multiple regression.

DESCRIPTION OF THE MCD ESTIMATOR

Motivation

In the multivariate location and scatter setting we assume that the data are stored in an $n \times p$ data matrix $\mathbf{X} = (\mathbf{x}_1, \ldots, \mathbf{x}_n)^t$ with $\mathbf{x}_i = (x_{i1}, \ldots, x_{ip})^t$ the ith observation. Hence n stands for the number of objects and p for the number of variables.

To illustrate, we first consider a bivariate data set, hence p = 2. We consider the wine data set, available in Ref 4 and also analyzed in Ref 5. The data set contains the quantities of 13 constituents found in three types of Italian wines. We consider the first group containing 59 wines, and focus on the constituents 'malic acid' and 'proline'. A scatter plot of the data is shown in Figure 1, together with the classical and the robust 97.5% tolerance ellipse.

The classical tolerance ellipse is defined as the set of p-dimensional points $\mathbf{x}$ whose Mahalanobis distance

$$MD(\mathbf{x}) = \sqrt{(\mathbf{x} - \bar{\mathbf{x}})^t\,\mathbf{S}^{-1}(\mathbf{x} - \bar{\mathbf{x}})} \qquad (1)$$

equals $\sqrt{\chi^2_{p,0.975}}$. We denote by $\chi^2_{p,\alpha}$ the α-quantile of the $\chi^2_p$ distribution. The Mahalanobis distance $MD(\mathbf{x}_i)$ should tell us how far away $\mathbf{x}_i$ is from the center of the cloud, relative to the size of the cloud. Here $\bar{\mathbf{x}}$ is the sample mean and $\mathbf{S}$ the sample covariance matrix. We see that this tolerance ellipse tries to encompass all observations. Consequently, none of the Mahalanobis distances, shown in Figure 2(a), is exceptionally large, and only three observations would be considered as mild outliers. On the other hand, the robust tolerance ellipse in Figure 1, which is based on the robust distances

$$RD(\mathbf{x}) = \sqrt{(\mathbf{x} - \hat{\mu}_{\text{MCD}})^t\,\hat{\Sigma}_{\text{MCD}}^{-1}(\mathbf{x} - \hat{\mu}_{\text{MCD}})} \qquad (2)$$

is much smaller and encloses the regular data points. Here, $\hat{\mu}_{\text{MCD}}$ is the MCD estimate of location, and $\hat{\Sigma}_{\text{MCD}}$ the MCD covariance estimate. The robust distances shown in Figure 2(b) now clearly spot eight outliers and one mild outlier (observation 3).
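As a concrete illustration of Eqs. (1) and (2), the following Python sketch computes both distance types on simulated bivariate data with planted outliers (the actual wine data are not bundled here, so the numbers are illustrative only); scikit-learn's MinCovDet implements the FAST-MCD algorithm discussed later in this review, and support_fraction plays the role of h/n.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import EmpiricalCovariance, MinCovDet

rng = np.random.RandomState(42)
X = rng.multivariate_normal([0, 0], [[1.0, 0.4], [0.4, 1.0]], size=59)
X[:8] += [4.0, 6.0]                          # plant eight outliers

p = X.shape[1]
md = np.sqrt(EmpiricalCovariance().fit(X).mahalanobis(X))      # Eq. (1)
rd = np.sqrt(MinCovDet(support_fraction=0.75,
                       random_state=0).fit(X).mahalanobis(X))  # Eq. (2)

cutoff = np.sqrt(chi2.ppf(0.975, df=p))
print("flagged by MD:", np.where(md > cutoff)[0])
print("flagged by RD:", np.where(rd > cutoff)[0])
```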

[FIGURE 1 | Bivariate wine data with classical and robust 97.5% tolerance ellipse; x-axis: malic acid, y-axis: proline.]

This illustrates the masking effect: classical estimates can be so highly affected by outlying values that diagnostic tools such as the Mahalanobis distances can no longer detect the outliers. To get a reliable analysis of these data, robust estimators are required that can resist possible outliers. The MCD estimator of location and scatter is such a robust estimator.

Definition

Denote $[\cdot]$ the floor function. The raw MCD estimator with parameter $[(n + p + 1)/2] \leq h \leq n$ defines the following location and dispersion estimates:

1. $\hat{\mu}_0$ is the mean of the h observations for which the determinant of the sample covariance matrix is minimal.

2. $\hat{\Sigma}_0$ is the corresponding covariance matrix, multiplied by a consistency factor $c_0$.

Note that the MCD estimator can only be computed when h > p; otherwise the covariance matrix of any h-subset will be singular. Since $h \geq [(n + 2)/2]$, this condition is certainly satisfied when $n \geq 2p$. To avoid the curse of dimensionality it is however recommended that n > 5p. To obtain consistency at the normal distribution, the consistency factor $c_0$ equals $\alpha / F_{\chi^2_{p+2}}(\chi^2_{p,\alpha})$ with $\alpha = \lim_{n\to\infty} h(n)/n$ (Ref 6). A finite-sample correction factor can also be added (Ref 7).
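To make the definition concrete, here is a minimal brute-force sketch of the raw MCD (my own illustration, not the FAST-MCD algorithm of Ref 3): it enumerates all h-subsets, so it is only feasible for very small n, and the consistency factor $c_0$ is omitted.

```python
import numpy as np
from itertools import combinations

def exact_raw_mcd(X, h):
    """Raw MCD by exhaustive search: the h-subset whose sample
    covariance matrix has minimal determinant (no consistency factor)."""
    n, p = X.shape
    assert h > p, "need h > p, else every h-subset covariance is singular"
    best_det, best_H = np.inf, None
    for H in combinations(range(n), h):
        det = np.linalg.det(np.cov(X[list(H)], rowvar=False))
        if det < best_det:
            best_det, best_H = det, list(H)
    mu0 = X[best_H].mean(axis=0)
    sigma0 = np.cov(X[best_H], rowvar=False)   # multiply by c0 in practice
    return mu0, sigma0
```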

The MCD estimator is designed for elliptically symmetric unimodal distributions. A multivariate distribution with parameters $\mu \in \mathbb{R}^p$ and $\Sigma$ a positive definite matrix of size p is called elliptically symmetric and unimodal if there exists a strictly decreasing function g such that the density can be written in the form

$$f(\mathbf{x}) = \frac{1}{\sqrt{|\Sigma|}}\; g\big((\mathbf{x} - \mu)^t \Sigma^{-1}(\mathbf{x} - \mu)\big). \qquad (3)$$

Consistency of the raw MCD estimator of location and scatter at elliptical models, as well as asymptotic normality of the MCD location estimator, has been proved in Ref 8.

The MCD estimator is most robust by taking $h = [(n + p + 1)/2]$. This corresponds at the population level with $\alpha = 0.5$. But unfortunately the MCD then suffers from low efficiency at the normal model. For example, if $\alpha = 0.5$, the asymptotic relative efficiency of the diagonal elements of the MCD scatter matrix with respect to the sample covariance matrix is only 6% when p = 2, and 20.5% when p = 10. This efficiency can be increased by considering a larger $\alpha$ such as $\alpha = 0.75$. This yields relative efficiencies of 26.2% for p = 2 and 45.9% for p = 10 (Ref 6). On the other hand, this choice of $\alpha$ decreases the robustness toward possible outliers.

[FIGURE 2 | (a) Mahalanobis distances and (b) robust distances for the bivariate wine data, plotted against the observation index.]

In order to increase the efficiency while retaining high robustness, one can apply reweighted estimators.9,10 For the MCD this yields the estimates:

$$\hat{\mu}_{\text{MCD}} = \frac{\sum_{i=1}^{n} W(d_i^2)\,\mathbf{x}_i}{\sum_{i=1}^{n} W(d_i^2)} \qquad (4)$$

$$\hat{\Sigma}_{\text{MCD}} = c_1\,\frac{1}{n}\sum_{i=1}^{n} W(d_i^2)\,(\mathbf{x}_i - \hat{\mu}_{\text{MCD}})(\mathbf{x}_i - \hat{\mu}_{\text{MCD}})^t \qquad (5)$$

with $d_i = \sqrt{(\mathbf{x}_i - \hat{\mu}_0)^t \hat{\Sigma}_0^{-1}(\mathbf{x}_i - \hat{\mu}_0)}$ and W an appropriate weight function. The constant $c_1$ is again a consistency factor. A simple yet effective choice for W is $W(d^2) = I(d^2 \leq \chi^2_{p,0.975})$. This is the default choice in current implementations in S-PLUS, R, SAS, and Matlab. If we take $\alpha = 0.5$, this reweighting step increases the efficiency up to 45.5% for p = 2 and 82% for p = 10. The example of the wine data also uses the reweighted MCD estimator with $\alpha = 0.75$, but the results were similar for smaller values of $\alpha$.
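A minimal sketch of this reweighting step, assuming the raw estimates mu0 and sigma0 are already available (e.g., from the brute-force sketch above) and omitting the consistency factor $c_1$:

```python
import numpy as np
from scipy.stats import chi2

def reweighted_mcd(X, mu0, sigma0):
    """Reweighting step of Eqs. (4)-(5) with the hard-rejection weights
    W(d^2) = I(d^2 <= chi2_{p,0.975}); c1 is omitted here."""
    n, p = X.shape
    diff = X - mu0
    d2 = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(sigma0), diff)
    w = (d2 <= chi2.ppf(0.975, df=p)).astype(float)
    mu = (w[:, None] * X).sum(axis=0) / w.sum()          # Eq. (4)
    centered = X - mu
    sigma = (w[:, None] * centered).T @ centered / n     # Eq. (5), sans c1
    return mu, sigma
```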

Remark that based on the MCD covariance matrix a robust correlation matrix can also be constructed. For all $1 \leq i \neq j \leq p$, the robust correlation between variables $X_i$ and $X_j$ can be estimated by

$$r_{ij} = \frac{s_{ij}}{\sqrt{s_{ii}\,s_{jj}}} \qquad (6)$$

with $s_{ij}$ the (i, j)th element of the MCD covariance estimate.
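Eq. (6) in code, assuming S holds an MCD covariance estimate:

```python
import numpy as np

def cov_to_corr(S):
    """Robust correlation matrix (Eq. (6)) from a covariance estimate S."""
    d = np.sqrt(np.diag(S))
    return S / np.outer(d, d)
```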

Outlier detection

As illustrated in Figure 3, the robust MCD estimator is very helpful for detecting outliers in multivariate data. As the robust distances (Eq. (2)) are not sensitive to the masking effect, they can be used to flag the outliers.11 This becomes even more useful for data sets in more than two (or three) dimensions, which are difficult to visualize.

We illustrate the outlier detection potential of the MCD on the full wine data set, with p = 13 variables. The distance–distance plot of Figure 3 plots the robust distances based on the MCD versus the Mahalanobis distances.3 From the robust analysis we see that seven observations clearly stand out, whereas the classical analysis does not flag any of the wines.

[FIGURE 3 | Distance–distance plot of the full wine data; x-axis: Mahalanobis distance, y-axis: robust distance.]

Note that the cutoff value $\sqrt{\chi^2_{p,0.975}}$ is based on the asymptotic distribution of the robust distances, and often flags too many observations as outlying. The true distribution of the robust distances can be better approximated by an F-distribution; see Ref 12.
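A distance–distance plot in the style of Figure 3 can be sketched as follows, reusing the (hypothetical) md, rd, and p computed in the earlier snippet:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import chi2

cutoff = np.sqrt(chi2.ppf(0.975, df=p))   # the asymptotic cutoff from above
plt.scatter(md, rd, s=15)
plt.axhline(cutoff, linestyle='--')
plt.axvline(cutoff, linestyle='--')
plt.xlabel('Mahalanobis distance')
plt.ylabel('Robust distance')
plt.show()
```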

PROPERTIES

Affine equivariance

The MCD estimator of location and scatter is affine equivariant. This means that for any nonsingular matrix $\mathbf{A}$ and constant vector $\mathbf{b} \in \mathbb{R}^p$:

$$\hat{\mu}_{\text{MCD}}(\mathbf{A}\mathbf{X} + \mathbf{b}) = \mathbf{A}\,\hat{\mu}_{\text{MCD}}(\mathbf{X}) + \mathbf{b} \qquad (7)$$

$$\hat{\Sigma}_{\text{MCD}}(\mathbf{A}\mathbf{X} + \mathbf{b}) = \mathbf{A}\,\hat{\Sigma}_{\text{MCD}}(\mathbf{X})\,\mathbf{A}^t. \qquad (8)$$

This property follows from the fact that for each subset of size h, denoted as $\mathbf{X}_h$, the determinant of the covariance matrix of the transformed data equals

$$|\mathbf{S}(\mathbf{A}\mathbf{X}_h)| = |\mathbf{A}\,\mathbf{S}(\mathbf{X}_h)\,\mathbf{A}^t| = |\mathbf{A}|^2\,|\mathbf{S}(\mathbf{X}_h)|. \qquad (9)$$

Hence, the optimal h-subset (which minimizes $|\mathbf{S}(\mathbf{A}\mathbf{X}_h)|$) remains the same as for the original data (which minimizes $|\mathbf{S}(\mathbf{X}_h)|$), and its covariance matrix is appropriately transformed. Similarly, the affine equivariance of the raw MCD location estimator follows from the equivariance of the sample mean. Finally, we note that the robust distances $d_i = \sqrt{(\mathbf{x}_i - \hat{\mu}_0)^t \hat{\Sigma}_0^{-1}(\mathbf{x}_i - \hat{\mu}_0)}$ are affine invariant, which implies that the reweighted estimator is again equivariant.
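Eqs. (7) and (8) can be checked numerically. With a fixed random_state the stochastic algorithm draws the same initial index subsets for both fits, and since C-steps are equivariant the two results should agree up to numerical noise; a sketch:

```python
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.RandomState(1)
X = rng.standard_normal((200, 3))
A = rng.standard_normal((3, 3))       # nonsingular with probability 1
b = np.array([1.0, -2.0, 0.5])

m1 = MinCovDet(random_state=0).fit(X)
m2 = MinCovDet(random_state=0).fit(X @ A.T + b)

print(np.abs(m2.location_ - (A @ m1.location_ + b)).max())      # Eq. (7)
print(np.abs(m2.covariance_ - A @ m1.covariance_ @ A.T).max())  # Eq. (8)
```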


Affine equivariance implies that the estimator transforms well under any nonsingular reparametrization of the space of the $\mathbf{x}_i$. Consequently, the data might be rotated, translated, or rescaled (for example through a change of the measurement units) without affecting the outlier detection diagnostics.

The MCD is one of the first high-breakdown affine equivariant estimators of location and scatter, and was only preceded by the Stahel-Donoho estimator.13,14 Together with the MCD, the minimum volume ellipsoid estimator was introduced,1,2 but this estimator is not asymptotically normal and it is more difficult to compute than the MCD.

Breakdown value

The breakdown value of an estimator measures the smallest fraction of observations that need to be replaced by arbitrary values to carry the estimate beyond all bounds. Denote $\mathbf{X}_{n,m}$ the set obtained by replacing m data points $\mathbf{x}_{i_1}, \ldots, \mathbf{x}_{i_m}$ of $\mathbf{X}_n$ by arbitrary values. For a multivariate location estimator $T_n$ the breakdown value is then defined as:

$$\varepsilon_n^*(T_n; \mathbf{X}_n) = \frac{1}{n}\,\min\Big\{ m \in \{1, \ldots, n\} : \sup_{\mathbf{X}_{n,m}} \|T_n(\mathbf{X}_n) - T_n(\mathbf{X}_{n,m})\| = +\infty \Big\}. \qquad (10)$$

For a multivariate estimator of scatter we have

$$\varepsilon_n^*(C_n; \mathbf{X}_n) = \frac{1}{n}\,\min\Big\{ m \in \{1, \ldots, n\} : \sup_{\mathbf{X}_{n,m}} \max_i \big|\log(\lambda_i(C_n(\mathbf{X}_n))) - \log(\lambda_i(C_n(\mathbf{X}_{n,m})))\big| = +\infty \Big\}, \qquad (11)$$

with $0 < \lambda_p(C_n) \leq \ldots \leq \lambda_1(C_n)$ the eigenvalues of $C_n$. This means that we consider a scatter estimator to be broken whenever any of the eigenvalues can become arbitrarily large or arbitrarily close to 0.

Let $k(\mathbf{X}_n)$ denote the maximum number of observations in the data set lying on a hyperplane of $\mathbb{R}^p$. Assume $k(\mathbf{X}_n) < h$; then for the raw MCD estimator of location and scatter we have that15

$$\varepsilon_n^*(\hat{\mu}_0; \mathbf{X}_n) = \varepsilon_n^*(\hat{\Sigma}_0; \mathbf{X}_n) = \frac{\min(n - h + 1,\; h - k(\mathbf{X}_n))}{n}. \qquad (12)$$

If the data are sampled from a continuous distribution, then almost surely $k(\mathbf{X}_n) = p$, which yields $\varepsilon_n^*(\hat{\mu}_0; \mathbf{X}_n) = \varepsilon_n^*(\hat{\Sigma}_0; \mathbf{X}_n) = \min(n - h + 1, h - p)/n$, and consequently any $[(n + p)/2] \leq h \leq [(n + p + 1)/2]$ gives the maximal breakdown value $[(n - p + 2)/2]/n$. This is also the highest possible breakdown value for affine equivariant scatter estimators16 at data sets that satisfy $k(\mathbf{X}_n) = p$ (this is known as general position). Also for affine equivariant location estimators the upper bound on the breakdown value is $[(n - p + 2)/2]/n$ under natural regularity conditions.17

Note that in the limit $\lim_{n\to\infty} \varepsilon_n^* = \min(1 - \alpha, \alpha)$, which is maximal for $\alpha = 0.5$.

Finally, we remark that the breakdown value of the reweighted MCD estimators $\hat{\mu}_{\text{MCD}}$ and $\hat{\Sigma}_{\text{MCD}}$ is not lower than the breakdown value of the raw MCD estimator, as long as the weight function W used in Eq. (4) is bounded and becomes zero for large $d_i$.9
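Eq. (12) is easy to evaluate; a small helper (hypothetical function name), assuming general position so that k defaults to p:

```python
def mcd_breakdown(n, p, h, k=None):
    """Finite-sample breakdown value of the raw MCD, Eq. (12),
    assuming k(X_n) < h; k defaults to p (general position)."""
    k = p if k is None else k
    return min(n - h + 1, h - k) / n

# the maximal choice h = [(n + p + 1)/2] gives roughly 50% breakdown:
print(mcd_breakdown(n=100, p=5, h=(100 + 5 + 1) // 2))   # 0.48
```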

Influence function

The influence function of an estimator measures the infinitesimal effect of point contamination on the estimator.18 It is defined at the population level; hence it requires the functional form of the estimator T, which maps any distribution F onto a value T(F) in the parameter space. For multivariate location, this parameter space is $\mathbb{R}^p$, whereas for multivariate scatter estimators the parameter space corresponds with all positive definite matrices of size p. The influence function of the estimator T at the distribution F in a point $\mathbf{x}$ is then defined as:

$$IF(\mathbf{x}, T, F) = \lim_{\varepsilon \to 0} \frac{T(F_\varepsilon) - T(F)}{\varepsilon} \qquad (13)$$

with $F_\varepsilon = (1 - \varepsilon)F + \varepsilon \Delta_{\mathbf{x}}$ a contaminated distribution with point mass in $\mathbf{x}$.

The influence function of the raw and the reweighted MCD has been studied in Ref 6 and appears to be bounded. This is a desired property for robust estimators: it reflects the robustness of the estimator toward point contamination. At the standard Gaussian distribution, the influence function of the MCD location estimator becomes zero for all $\mathbf{x}$ with $\|\mathbf{x}\|^2 > \chi^2_{p,\alpha}$; hence large outliers do not influence the estimates. The same happens at the off-diagonal elements of the MCD scatter estimator. At the diagonal elements, on the other hand, the influence function remains constant (different from zero) when $\|\mathbf{x}\|^2$ is sufficiently large. This reflects that the outliers still have a bounded influence on the estimator. Moreover, the influence functions are smooth, except at those $\mathbf{x}$ with $\|\mathbf{x}\|^2 = \chi^2_{p,\alpha}$. The reweighted MCD estimator has an additional jump at $\|\mathbf{x}\|^2 = \chi^2_{p,0.975}$ because of the discontinuity of the weight function.
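The boundedness can also be seen empirically with a finite-sample sensitivity curve: add a single contaminating point at increasing distance and watch the scaled shift of the MCD location estimate level off instead of diverging. A sketch:

```python
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.RandomState(0)
X = rng.standard_normal((500, 2))
base = MinCovDet(random_state=0).fit(X).location_

for r in [1, 2, 4, 8, 16, 32]:
    Xc = np.vstack([X, [[r, 0.0]]])       # one outlier at distance r
    loc = MinCovDet(random_state=0).fit(Xc).location_
    print(r, len(Xc) * np.linalg.norm(loc - base))   # stays bounded in r
```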


Univariate MCD

For univariate data, the MCD estimates reduce to the mean and the variance of the h-subset with smallest variance. They can be computed in $O(n \log n)$ time by considering contiguous h-subsets and computing their mean and variance recursively.19 Their consistency and asymptotic normality are proved in Ref 2,20. For $h = [n/2] + 1$ the MCD location estimator has breakdown value $[(n + 1)/2]/n$, and the MCD scale estimator $[n/2]/n$. These are the maximal values that can be attained by affine equivariant estimators.21 The univariate MCD influence functions are also bounded; see Ref 6 for details. The maxbias curve (which measures the maximal asymptotic bias at a certain fraction of contamination) is studied in Ref 22,23.
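A sketch of the univariate raw MCD along these lines: sort once, then scan all contiguous h-subsets using prefix sums (consistency factors again omitted).

```python
import numpy as np

def univariate_mcd(x, h):
    """Mean and variance of the contiguous h-subset of the sorted
    sample with the smallest variance (raw univariate MCD)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    s1 = np.concatenate(([0.0], np.cumsum(x)))        # prefix sums
    s2 = np.concatenate(([0.0], np.cumsum(x ** 2)))
    best_var, best_mean = np.inf, None
    for i in range(n - h + 1):
        sum1 = s1[i + h] - s1[i]
        sum2 = s2[i + h] - s2[i]
        var = (sum2 - sum1 ** 2 / h) / (h - 1)
        if var < best_var:
            best_var, best_mean = var, sum1 / h
    return best_mean, best_var
```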

Note that the univariate MCD location estimator corresponds with the univariate least trimmed squares (LTS) estimator,1 defined by

$$\min_{\mu} \sum_{i=1}^{h} (r_\mu^2)_{i:n} \qquad (14)$$

where $(r_\mu^2)_{1:n} \leq (r_\mu^2)_{2:n} \leq \ldots \leq (r_\mu^2)_{n:n}$ are the ordered squared residuals $(x_i - \mu)^2$.

COMPUTATION

The exact MCD estimator is very hard to compute, as it requires the evaluation of all $\binom{n}{h}$ subsets of size h.

In Ref 3 the FAST-MCD algorithm is developed to compute the MCD efficiently. The key component of the algorithm is the C-step.

Theorem. Take $\mathbf{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_n\}$ and let $H_1 \subset \{1, \ldots, n\}$ be an h-subset, that is $|H_1| = h$. Put $\hat{\mu}_1$ and $\hat{\Sigma}_1$ the empirical mean and covariance matrix of the data in $H_1$. If $\det(\hat{\Sigma}_1) \neq 0$, define the relative distances

$$d_1(i) := \sqrt{(\mathbf{x}_i - \hat{\mu}_1)^t \hat{\Sigma}_1^{-1}(\mathbf{x}_i - \hat{\mu}_1)} \quad \text{for } i = 1, \ldots, n. \qquad (15)$$

Now take $H_2$ such that $\{d_1(i);\, i \in H_2\} := \{(d_1)_{1:n}, \ldots, (d_1)_{h:n}\}$, where $(d_1)_{1:n} \leq (d_1)_{2:n} \leq \cdots \leq (d_1)_{n:n}$ are the ordered distances, and compute $\hat{\mu}_2$ and $\hat{\Sigma}_2$ based on $H_2$. Then

$$\det(\hat{\Sigma}_2) \leq \det(\hat{\Sigma}_1) \qquad (16)$$

with equality if and only if $\hat{\mu}_2 = \hat{\mu}_1$ and $\hat{\Sigma}_2 = \hat{\Sigma}_1$.

If $\det(\hat{\Sigma}_1) > 0$, the C-step thus yields a new h-subset with lower covariance determinant very easily; a minimal sketch is given below. Note that the C stands for 'concentration' since $\hat{\Sigma}_2$ is more concentrated (has a lower determinant) than $\hat{\Sigma}_1$. The condition $\det(\hat{\Sigma}_1) \neq 0$ in the C-step theorem is not a real restriction, because if $\det(\hat{\Sigma}_1) = 0$ the minimal objective value is already reached.
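A minimal implementation of one C-step (Eq. (15)):

```python
import numpy as np

def c_step(X, H):
    """One concentration step: given indices H of an h-subset, return
    the h indices with smallest distance relative to mean/cov of X[H]."""
    h = len(H)
    mu = X[H].mean(axis=0)
    sigma = np.cov(X[H], rowvar=False)
    diff = X - mu
    d2 = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(sigma), diff)
    return np.argsort(d2)[:h]
```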

C-steps can be iterated until $\det(\hat{\Sigma}_{\text{new}}) = 0$ or $\det(\hat{\Sigma}_{\text{new}}) = \det(\hat{\Sigma}_{\text{old}})$. The sequence of determinants obtained in this way must converge in a finite number of steps because there are only finitely many h-subsets. However, there is no guarantee that the final value $\det(\hat{\Sigma}_{\text{new}})$ of the iteration process is the global minimum of the MCD objective function. Therefore, an approximate MCD solution can be obtained by taking many initial choices of $H_1$, applying C-steps to each, and keeping the solution with lowest determinant.

To construct an initial subset $H_1$, a random (p + 1)-subset J is drawn and $\hat{\mu}_0 := \operatorname{ave}(J)$ and $\hat{\Sigma}_0 := \operatorname{cov}(J)$ are computed [if $\det(\hat{\Sigma}_0) = 0$ then J can be extended by adding observations until $\det(\hat{\Sigma}_0) > 0$]. Then, for $i = 1, \ldots, n$ the distances $d_0^2(i) := (\mathbf{x}_i - \hat{\mu}_0)^t \hat{\Sigma}_0^{-1}(\mathbf{x}_i - \hat{\mu}_0)$ are computed and sorted. The initial subset $H_1$ then consists of the h observations with smallest distance $d_0$. This method yields better initial subsets than drawing random h-subsets directly, because the probability of drawing an outlier-free subset is much higher for (p + 1)-subsets than for h-subsets.

The FAST-MCD algorithm contains several computational improvements. Since each C-step involves the calculation of a covariance matrix, its determinant, and the corresponding distances, using fewer C-steps considerably improves the speed of the algorithm. It turns out that after two C-steps, many runs that will lead to the global minimum already have a considerably smaller determinant. Therefore, the number of C-steps is reduced by applying only two C-steps to each initial subset and selecting the 10 subsets with lowest determinants. Only for these 10 subsets are further C-steps taken until convergence; a toy version of this scheme is sketched below.
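Putting the pieces together, a toy version of this multi-start scheme (reusing the c_step sketch from above; no data partitioning or finite-sample corrections, and all parameter names are my own):

```python
import numpy as np

def approx_raw_mcd(X, h, n_starts=500, n_keep=10, seed=0):
    """Illustrative FAST-MCD-style search: random (p+1)-subset starts,
    two C-steps each, full C-step iteration for the best candidates."""
    rng = np.random.default_rng(seed)
    n, p = X.shape

    def det_of(H):
        return np.linalg.det(np.cov(X[H], rowvar=False))

    candidates = []
    for _ in range(n_starts):
        J = rng.choice(n, size=p + 1, replace=False)
        while np.linalg.det(np.cov(X[J], rowvar=False)) <= 0:
            J = np.append(J, rng.choice(np.setdiff1d(np.arange(n), J)))
        mu = X[J].mean(axis=0)
        sigma = np.cov(X[J], rowvar=False)
        diff = X - mu
        d2 = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(sigma), diff)
        H = np.argsort(d2)[:h]
        for _ in range(2):                    # two cheap C-steps per start
            H = c_step(X, H)
        candidates.append((det_of(H), H))

    candidates.sort(key=lambda c: c[0])
    best_det, best_H = candidates[0]
    for det0, H in candidates[:n_keep]:       # iterate the best candidates
        while True:
            H_new = c_step(X, H)
            det_new = det_of(H_new)
            if det_new >= det0 - 1e-12:       # converged (Eq. (16))
                break
            H, det0 = H_new, det_new
        if det0 < best_det:
            best_det, best_H = det0, H
    return X[best_H].mean(axis=0), np.cov(X[best_H], rowvar=False)
```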

This procedure is very fast for small sample sizes n, but when n grows the computation time increases because of the n distances that need to be calculated in each C-step. For large n, FAST-MCD uses a partitioning of the data set, which avoids doing all the calculations in the entire data set.

Note that the FAST-MCD algorithm itself is affine equivariant. Consistency and breakdown of the approximate algorithm are discussed in Ref 24.

Implementations of the FAST-MCD algorithm are available in the package S-PLUS (as the built-in function cov.mcd), in R (as part of the packages rrcov, robust and robustbase), and in SAS/IML Version 7 and SAS Version 9 (in PROC ROBUSTREG). A stand-alone program can be downloaded from the website www.agoras.ua.ac.be, as well as a Matlab version. A Matlab function is also part of LIBRA, a Matlab Library for Robust Analysis,25 which can be downloaded from wis.kuleuven.be/stat/robust. Moreover, it is available in the PLS Toolbox of Eigenvector Research (www.eigenvector.com). Note that some functions use α = 0.5 as the default value, yielding a breakdown value of 50%, whereas other implementations use α = 0.75.

APPLICATIONS

Many multivariate statistical methods rely on covariance estimation; hence the MCD estimator is well suited to construct robust multivariate techniques. Moreover, the trimming idea of the MCD has been generalized toward many new estimators. Here, we enumerate some applications and extensions.

The MCD analog in regression is the least trimmed squares (LTS) regression estimator,1 which minimizes the sum of the h smallest squared residuals. Equivalently, the LTS estimate corresponds with the least squares fit of the h-subset with the smallest sum of squared residuals. The FAST-LTS algorithm uses techniques similar to FAST-MCD.26 The diagnostic plot introduced in Ref 11 displays the regression residuals versus the robust distances of the predictors, and is very useful for outlier classification. These robust distances are, e.g., also useful for robust linear regression,27,28 regression with continuous and categorical regressors,29 and for logistic regression.30,31

In the multivariate regression setting (with several response variables) the MCD can be directly used to obtain MCD regression,32 whereas MCD applied to the residuals leads to multivariate LTS estimation.33

Covariance estimation is also important in principal component analysis and related methods. For low-dimensional data (with n > 5p) the principal components can be obtained as the eigenvectors of the MCD covariance matrix,34 whereas robust factor analysis based on the MCD has been studied in Ref 35. Robust canonical correlation is proposed in Ref 36. For high-dimensional data, projection pursuit ideas combined with the MCD result in the so-called ROBPCA method37,38 for robust PCA. This method has led to the construction of robust principal component regression,39 and robust partial least squares regression,40,41 together with appropriate outlier maps. The LTS-subspace estimator42 generalizes LTS regression to subspace estimation and orthogonal regression.

An MCD-based alternative to the Hotelling test was provided in Ref 43. A robust bootstrap for the MCD is proposed in Ref 44. The computation of the MCD with missing values is explored in Ref 45–47. Classification (or discriminant analysis) based on the MCD is studied in Ref 48,49, whereas an alternative for high-dimensional data is developed in Ref 50. Robust clustering is handled in Ref 51–53.

The trimming procedure of the MCD has inspired the construction of maximum trimmed likelihood estimators,54–57 trimmed k-means,58–60 least weighted squares regression,61 and minimum weighted covariance determinant estimation.15 The idea of the C-step in the MCD algorithm has been extended to S-estimators.62

Applications of the MCD are numerous. We mention recent applications in finance and econometrics,63,64 medicine,65 quality control,66 geophysics,67 image analysis,68,69 and chemistry,70 but this list is far from complete.

CONCLUSION

In this paper we have reviewed the MCD estimator of multivariate location and scatter. We have illustrated its resistance to outliers on an example of real data. Its main properties concerning robustness, efficiency, and equivariance were enumerated, and computational aspects were described. Finally, we have provided a detailed reference list with applications and generalizations of the MCD in theoretical and applied research.

REFERENCES

1. Rousseeuw PJ. Least median of squares regression. J Am Stat Assoc 1984, 79:871–880.
2. Rousseeuw PJ. Multivariate estimation with high breakdown point. In: Grossmann W, Pflug G, Vincze I, Wertz W, eds. Mathematical Statistics and Applications, Vol. B. Dordrecht: Reidel Publishing Company; 1985, 283–297.
3. Rousseeuw PJ, Van Driessen K. A fast algorithm for the Minimum Covariance Determinant estimator. Technometrics 1999, 41:212–223.


4. Hettich S, Bay SD. The UCI KDD Archive. Irvine, CA: University of California, Department of Information and Computer Science; 1999.
5. Maronna RA, Martin DR, Yohai VJ. Robust Statistics: Theory and Methods. New York: Wiley; 2006.
6. Croux C, Haesbroeck G. Influence function and efficiency of the Minimum Covariance Determinant scatter matrix estimator. J Multivariate Anal 1999, 71:161–190.
7. Pison G, Van Aelst S, Willems G. Small sample corrections for LTS and MCD. Metrika 2002, 55:111–123.
8. Butler RW, Davies PL, Jhun M. Asymptotics for the Minimum Covariance Determinant estimator. Ann Stat 1993, 21:1385–1400.
9. Lopuhaä HP, Rousseeuw PJ. Breakdown points of affine equivariant estimators of multivariate location and covariance matrices. Ann Stat 1991, 19:229–248.
10. Lopuhaä HP. Asymptotics of reweighted estimators of multivariate location and scatter. Ann Stat 1999, 27:1638–1665.
11. Rousseeuw PJ, van Zomeren BC. Unmasking multivariate outliers and leverage points. J Am Stat Assoc 1990, 85:633–651.
12. Hardin J, Rocke DM. The distribution of robust distances. J Comput Graph Stat 2005, 14:928–946.
13. Stahel WA. Robuste Schätzungen: infinitesimale Optimalität und Schätzungen von Kovarianzmatrizen. PhD thesis, ETH Zürich, 1981.
14. Donoho DL, Gasko M. Breakdown properties of location estimates based on halfspace depth and projected outlyingness. Ann Stat 1992, 20:1803–1827.
15. Roelant E, Van Aelst S, Willems G. The minimum weighted covariance determinant estimator. Metrika 2009, 70:177–204.
16. Davies L. Asymptotic behavior of S-estimators of multivariate location parameters and dispersion matrices. Ann Stat 1987, 15:1269–1292.
17. Rousseeuw PJ. Discussion on 'Breakdown and groups'. Ann Stat 2005, 33:1004–1009.
18. Hampel FR, Ronchetti EM, Rousseeuw PJ, Stahel WA. Robust Statistics: The Approach Based on Influence Functions. New York: Wiley; 1986.
19. Rousseeuw PJ, Leroy AM. Robust Regression and Outlier Detection. New York: Wiley-Interscience; 1987.
20. Butler RW. Nonparametric interval and point prediction using data trimmed by a Grubbs-type outlier rule. Ann Stat 1982, 10:197–204.
21. Croux C, Rousseeuw PJ. A class of high-breakdown scale estimators based on subranges. Commun Stat Theory Methods 1992, 21:1935–1951.
22. Croux C, Haesbroeck G. Maxbias curves of robust scale estimators based on subranges. Metrika 2001, 53:101–122.
23. Croux C, Haesbroeck G. Maxbias curves of location estimators based on subranges. J Nonparametr Stat 2002, 14:295–306.
24. Hawkins D, Olive D. Inconsistency of resampling algorithms for high breakdown regression estimators and a new algorithm. J Am Stat Assoc 2002, 97:136–148.
25. Verboven S, Hubert M. LIBRA: a Matlab library for robust analysis. Chemometr Intell Lab Syst 2005, 75:127–136.
26. Rousseeuw PJ, Van Driessen K. Computing LTS regression for large data sets. Data Min Knowl Discov 2006, 12:29–45.
27. Simpson DG, Ruppert D, Carroll RJ. On one-step GM-estimates and stability of inferences in linear regression. J Am Stat Assoc 1992, 87:439–450.
28. Coakley CW, Hettmansperger TP. A bounded influence, high breakdown, efficient regression estimator. J Am Stat Assoc 1993, 88:872–880.
29. Hubert M, Rousseeuw PJ. Robust regression with both continuous and binary regressors. J Stat Plann Infer 1996, 57:153–163.
30. Rousseeuw PJ, Christmann A. Robustness against separation and outliers in logistic regression. Comput Stat Data Anal 2003, 43:315–332.
31. Croux C, Haesbroeck G. Implementing the Bianco and Yohai estimator for logistic regression. Comput Stat Data Anal 2003, 44:273–295.
32. Rousseeuw PJ, Van Aelst S, Van Driessen K, Agulló J. Robust multivariate regression. Technometrics 2004, 46:293–305.
33. Agulló J, Croux C, Van Aelst S. The multivariate least trimmed squares estimator. J Multivariate Anal 2008, 99:311–318.
34. Croux C, Haesbroeck G. Principal components analysis based on robust estimators of the covariance or correlation matrix: influence functions and efficiencies. Biometrika 2000, 87:603–618.
35. Pison G, Rousseeuw PJ, Filzmoser P, Croux C. Robust factor analysis. J Multivariate Anal 2003, 84:145–172.
36. Croux C, Dehon C. Analyse canonique basée sur des estimateurs robustes de la matrice de covariance. Rev Stat Appl 2002, 2:5–26.
37. Hubert M, Rousseeuw PJ, Vanden Branden K. ROBPCA: a new approach to robust principal components analysis. Technometrics 2005, 47:64–79.
38. Debruyne M, Hubert M. The influence function of the Stahel-Donoho covariance estimator of smallest outlyingness. Stat Probab Lett 2009, 79:275–282.
39. Hubert M, Verboven S. A robust PCR method for high-dimensional regressors. J Chemometr 2003, 17:438–452.
40. Hubert M, Vanden Branden K. Robust methods for Partial Least Squares Regression. J Chemometr 2003, 17:537–549.


41. Vanden Branden K, Hubert M. Robustness properties of a robust PLS regression method. Anal Chim Acta 2004, 515:229–241.
42. Maronna RA. Principal components and orthogonal regression based on robust scales. Technometrics 2005, 47:264–273.
43. Willems G, Pison G, Rousseeuw PJ, Van Aelst S. A robust Hotelling test. Metrika 2002, 55:125–138.
44. Willems G, Van Aelst S. A fast bootstrap method for the MCD estimator. In: Antoch J, ed. Proceedings in Computational Statistics. Heidelberg: Springer-Verlag; 2004, 1979–1986.
45. Cheng T-C, Victoria-Feser M. High breakdown estimation of multivariate location and scale with missing observations. Br J Math Stat Psychol 2002, 55:317–335.
46. Copt S, Victoria-Feser M-P. Fast algorithms for computing high breakdown covariance matrices with missing data. In: Hubert M, Pison G, Struyf A, Van Aelst S, eds. Theory and Applications of Recent Robust Methods. Statistics for Industry and Technology. Basel: Birkhäuser; 2004, 71–82.
47. Serneels S, Verdonck T. Principal component analysis for data containing outliers and missing elements. Comput Stat Data Anal 2008, 52:1712–1727.
48. Hawkins DM, McLachlan GJ. High-breakdown linear discriminant analysis. J Am Stat Assoc 1997, 92:136–143.
49. Hubert M, Van Driessen K. Fast and robust discriminant analysis. Comput Stat Data Anal 2004, 45:301–320.
50. Vanden Branden K, Hubert M. Robust classification in high dimensions based on the SIMCA method. Chemometr Intell Lab Syst 2005, 79:10–21.
51. Rocke DM, Woodruff DL. A synthesis of outlier detection and cluster identification. Technical report, 1999.
52. Hardin J, Rocke DM. Outlier detection in the multiple cluster setting using the minimum covariance determinant estimator. Comput Stat Data Anal 2004, 44:625–638.
53. Gallegos MT, Ritter G. A robust method for cluster analysis. Ann Stat 2005, 33:347–380.
54. Vandev DL, Neykov NM. About regression estimators with high breakdown point. Statistics 1998, 32:111–129.
55. Hadi AS, Luceño A. Maximum trimmed likelihood estimators: a unified approach, examples and algorithms. Comput Stat Data Anal 1997, 25:251–272.
56. Müller CH, Neykov N. Breakdown points of trimmed likelihood estimators and related estimators in generalized linear models. J Stat Plann Infer 2003, 116:503–519.
57. Čížek P. Robust and efficient adaptive estimation of binary-choice regression models. J Am Stat Assoc 2008, 103:687–696.
58. Cuesta-Albertos JA, Gordaliza A, Matrán C. Trimmed k-means: an attempt to robustify quantizers. Ann Stat 1997, 25:553–576.
59. Cuesta-Albertos JA, Matrán C, Mayo-Iscar A. Robust estimation in the normal mixture model based on robust clustering. J R Stat Soc Ser B 2008, 70:779–802.
60. García-Escudero LA, Gordaliza A, San Martín R, Van Aelst S, Zamar RH. Robust linear clustering. J R Stat Soc B 2009, 71:1–18.
61. Víšek JÁ. The least weighted squares I. The asymptotic linearity of normal equations. Bull Czech Econ Soc 2002, 9:31–58.
62. Salibian-Barrera M, Yohai VJ. A fast algorithm for S-regression estimates. J Comput Graph Stat 2006, 15:414–427.
63. Zaman A, Rousseeuw PJ, Orhan M. Econometric applications of high-breakdown robust regression techniques. Econ Lett 2001, 71:1–8.
64. Welsh R, Zhou X. Application of robust statistics to asset allocation models. Revstat 2007, 5:97–114.
65. Prastawa M, Bullitt E, Ho S, Gerig G. A brain tumor segmentation framework based on outlier detection. Med Image Anal 2004, 8:275–283.
66. Jensen WA, Birch JB, Woodall WH. High breakdown estimation methods for phase I multivariate control charts. Qual Reliab Eng Int 2007, 23:615–629.
67. Neykov NM, Neytchev PN, Van Gelder PHAJM, Todorov VK. Robust detection of discordant sites in regional frequency analysis. Water Resour Res 2007, 43.
68. Vogler C, Goldenstein S, Stolfi J, Pavlovic V, Metaxas D. Outlier rejection in high-dimensional deformable models. Image Vis Comput 2007, 25:274–284.
69. Lu Y, Wang J, Kong J, Zhang B, Zhang J. An integrated algorithm for MRI brain images segmentation. Comput Vis Approaches Med Image Anal 2006, 4241:132–1342.
70. van Helvoort PJ, Filzmoser P, van Gaans PFM. Sequential Factor Analysis as a new approach to multivariate analysis of heterogeneous geochemical datasets: an application to a bulk chemical characterization of fluvial deposits (Rhine-Meuse delta, The Netherlands). Appl Geochem 2005, 20:2233–2251.
