15
Extreme value theory in analysis of differential expression in microarrays where either only up- or down-regulated genes are relevant or expected RENATA IVANEK 1 *, YRJO ¨ T. GRO ¨ HN 1 , MARTIN T. WELLS 2 , SARITA RAENGPRADUB 3 , MARK J. KAZMIERCZAK 4 AND MARTIN WIEDMANN 3 1 Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853, USA 2 Department of Biological Statistics and Computational Biology, 301 Malott Hall, Cornell University, Ithaca, NY 14853, USA 3 Department of Food Science, 412 Stocking Hall, Cornell University, Ithaca, NY 14853, USA 4 Channing Laboratory, Harvard Medical School, 181 Longwood Avenue, Boston, MA 02115, USA (Received 24 January 2008 and in revised form 30 May 2008 ) Summary We propose an empirical Bayes method based on the extreme value theory (EVT) (BE) for the analysis of data from spotted microarrays where the interest of the investigator (e.g. to identify up-regulated gene markers of a disease) or the design of the experiment (e.g. in certain ‘ wild-type versus mutant ’ experiments) limits identification of differentially expressed genes to those regulated in a single direction (either up or down). In such experiments, unlike in genome-wide microarrays, analysis is restricted to the tail of the distribution (extremes) of all the genes in the genome. The EVT provides a platform to account for this extreme behaviour, and is therefore a natural candidate for inference about differential expression. We compared the performance of the developed BE method with two other empirical Bayes methods on two real ‘wild-type versus mutant’ datasets where a single direction of regulation was expected due to experimental design, and in a simulation study. The BE method appears to have a better fit to the real data. In the analysis of simulated data, the BE method showed better accuracy and precision while being robust to different characteristics of microarray experiments. The BE method, therefore, seems promising and useful for inference about differential expression in microarrays where either only up- or down-regulated genes are relevant or expected. 1. Introduction A common task in microarray studies is to determine which genes are differentially expressed (DE) between two cell samples. Different rules for identifying DE genes have been adopted, many based on the gene- specific mean and the standard deviation (SD). Their main limitation originates from the low number of replicates. More recently, as an alternative to these methods, a number of empirical Bayes methods have been proposed (e.g. Efron et al., 2001; Lonnstedt & Speed, 2002; Gottardo et al., 2003 ; Newton et al., 2004). Empirical Bayes methods seem natural in the context of microarray experiments, because they summarize information from the whole dataset into prior parameters to be combined with means and SDs at the gene level (Lonnstedt & Britton, 2005). An empirical Bayes approach to differential ex- pression is a normal mixture model proposed by Lonnstedt & Speed (2002). This approach was later extended by Smyth (2004) to general linear models and modified into an empirical Bayes normal model (and not mixture model) for variance regularization. Smyth’s (2004) model represents a posterior odds stat- istic formulated in terms of a moderated t-statistic, in which posterior residual SDs are used in place of ordinary SDs ; the model can also account for corre- lation among technical replicates (Smyth et al., 2005). It is implemented in Limma (Smyth, 2005), a popular software package for gene expression analy- sis. Throughout this paper, we will refer to the empirical Bayes normal method offered in Limma * Corresponding author. Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853, USA. Tel: +1(607) 2533052. Fax. +1 (607) 2533083. e-mail : [email protected] Genet. Res., Camb. (2008), 90, pp. 347–361. f 2008 Cambridge University Press 347 doi:10.1017/S0016672308009427 Printed in the United Kingdom

Extreme value theory in analysis of differential expression in microarrays where either only up- or down-regulated genes are relevant or expected

Embed Size (px)

Citation preview

Extreme value theory in analysis of differential expressionin microarrays where either only up- or down-regulatedgenes are relevant or expected

RENATA IVANEK1*, YRJO T. GROHN1, MARTIN T. WELLS2,SARITA RAENGPRADUB3, MARK J. KAZMIERCZAK4

AND MARTIN WIEDMANN3

1Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853, USA2Department of Biological Statistics and Computational Biology, 301 Malott Hall, Cornell University, Ithaca, NY 14853, USA3Department of Food Science, 412 Stocking Hall, Cornell University, Ithaca, NY 14853, USA4Channing Laboratory, Harvard Medical School, 181 Longwood Avenue, Boston, MA 02115, USA

(Received 24 January 2008 and in revised form 30 May 2008 )

Summary

We propose an empirical Bayes method based on the extreme value theory (EVT) (BE) for theanalysis of data from spotted microarrays where the interest of the investigator (e.g. to identifyup-regulated gene markers of a disease) or the design of the experiment (e.g. in certain ‘wild-typeversus mutant ’ experiments) limits identification of differentially expressed genes to those regulatedin a single direction (either up or down). In such experiments, unlike in genome-wide microarrays,analysis is restricted to the tail of the distribution (extremes) of all the genes in the genome. TheEVT provides a platform to account for this extreme behaviour, and is therefore a natural candidatefor inference about differential expression. We compared the performance of the developed BEmethod with two other empirical Bayes methods on two real ‘wild-type versus mutant ’ datasetswhere a single direction of regulation was expected due to experimental design, and in a simulationstudy. The BE method appears to have a better fit to the real data. In the analysis of simulated data,the BE method showed better accuracy and precision while being robust to different characteristicsof microarray experiments. The BE method, therefore, seems promising and useful for inferenceabout differential expression in microarrays where either only up- or down-regulated genes arerelevant or expected.

1. Introduction

A common task in microarray studies is to determinewhich genes are differentially expressed (DE) betweentwo cell samples. Different rules for identifying DEgenes have been adopted, many based on the gene-specific mean and the standard deviation (SD). Theirmain limitation originates from the low number ofreplicates. More recently, as an alternative to thesemethods, a number of empirical Bayes methods havebeen proposed (e.g. Efron et al., 2001; Lonnstedt &Speed, 2002; Gottardo et al., 2003; Newton et al.,2004). Empirical Bayes methods seem natural in thecontext of microarray experiments, because they

summarize information from the whole dataset intoprior parameters to be combined with means and SDsat the gene level (Lonnstedt & Britton, 2005).

An empirical Bayes approach to differential ex-pression is a normal mixture model proposed byLonnstedt & Speed (2002). This approach was laterextended by Smyth (2004) to general linear modelsand modified into an empirical Bayes normal model(and not mixture model) for variance regularization.Smyth’s (2004) model represents a posterior odds stat-istic formulated in terms of a moderated t-statistic,in which posterior residual SDs are used in place ofordinary SDs; the model can also account for corre-lation among technical replicates (Smyth et al.,2005). It is implemented in Limma (Smyth, 2005), apopular software package for gene expression analy-sis. Throughout this paper, we will refer to theempirical Bayes normal method offered in Limma

* Corresponding author. Department of Population Medicine andDiagnostic Sciences, College of Veterinary Medicine, CornellUniversity, Ithaca, NY 14853, USA. Tel: +1(607) 2533052.Fax. +1 (607) 2533083. e-mail : [email protected]

Genet. Res., Camb. (2008), 90, pp. 347–361. f 2008 Cambridge University Press 347doi:10.1017/S0016672308009427 Printed in the United Kingdom

(Lonnstedt & Speed, 2002; Smyth, 2004; Smythet al., 2005) as the ‘BN’ method.

The underlying assumption of the BN method isnormal distribution of mean ratios of expressions ofDE genes, i.e. that the DE genes are symmetricallyup- and down-regulated. This assumption is valid forgenome-wide microarray experiments, where interestlies in detection of all DE genes between two samples,and typically, out of all the tested genes (the wholegenome), only a fraction is expected to be DE, withsimilar numbers of up- and down-regulated genes.However, in a number of microarray experiments,only up- or down-regulated genes are of interest.Examples are certain experiments designed to identifynovel markers for molecular diagnosis and therapy ofdiseases (Suzuki et al., 2002; Kobayashi et al., 2004).Furthermore, in some experiments, regulation is ex-pected in one direction only (either up or down) dueto the design of the experiments, such as in certain‘wild-type versus mutant ’ microarrays (Kazmierczaket al., 2003; van Schaik et al., 2007) and those de-signed to select DE genes among a group of poten-tially up- or down-regulated genes that have beenpre-selected either as part of some genome-widemicroarray or other (e.g. Hidden Markov Chainpromoter search) screening methods (Kazmierczaket al., 2003). In such experimental designs, where onlyup- or down-regulated genes are relevant or expected,the distributional assumption of normality of the BNmethod is obviously violated.

As a long-tailed alternative to the normal mixturemodel of Lonnstedt & Speed (2002), which assumesthat the DE genes are symmetrically up- and down-regulated, Bhowmick et al. (2006) proposed a Laplacemixture model (hereinafter referred to as the ‘BL’method). Bhowmick et al. (2006) showed that theasymmetric gene expression data fit better under theasymmetric BL method. However, while the perform-ance of BL was similar to the Smyth (2004) andLonnstedt & Speed (2002) methods, BL depends on alarge number of replicates for acceptable parameterestimates. Hence, microarrays with a single relevantor expected direction of regulation lack an appropri-ate analytical tool.

Extreme value theory (EVT) has been traditionallyused for risk and financial analysis, and studies ofextreme events, such as intense rainfall and floods.An important feature of EVT is that it models theextreme behaviour (the tail of the distribution) ratherthan the average behaviour of the systems as classicalstatistics do. By focusing on the values located in thetail of the distribution that diverge extremely fromthe mean value of a dataset, EVT provides a naturalframework for detection of DE genes in an exper-iment where regulation is relevant or expected in asingle direction. Therefore, our objective was to de-velop a new statistical method, based on EVT, for

analysis of differential expression in experiments in-terested in or expecting either up- or down-regulatedgenes. Application to two experimental datasets andsimulation studies indicated comparatively betterperformance of the developed empirical Bayes ex-treme value distribution (EVD) mixture model (BE)as compared with the two existing empirical Bayesmethods (BN and BL).

2. Methods

(i) Model setup in the context of linear models

Consider a microarray experiment comparing theexpression (i.e. mRNA transcript) levels of wild-typeand mutant cells, where mutant cells lack a geneencoding a positive regulator. We wish to identifygenes that differ in their transcript levels betweenwild-type and mutant cells exposed to the sametreatment. For the jth replicate of gene g on array i,we use the log ratio of the expressions:

Ygij= log2

(expression level in wild type)gij(expression level in mutant)gij

:

Let us assume that we have n arrays, where each arrayhas G number of genes spotted on it, and each geneis replicated m times. Therefore, the complete setof data from the experiment consists of Ygij, g=1, … ,G, i=1, … ,n, and j=1, … ,m, and so Yg canbe viewed as a vector of mn log ratios observed for agene g. We regard the Ygij as random variables froma normal distribution with mean mg and variance sg

2,which, although not completely true, is convenientand found empirically to be roughly the case(Lonnstedt & Speed, 2002), i.e.

Ygij � N(mg,s2g): (1)

A general microarray experiment can be rep-resented by a linear model (Smyth, 2004), i.e.E(Yg)=Xbg, where X is an nmrk dimensional designmatrix specifying experimental design and bg is avector of k regression coefficients. As we have m rep-licates of each gene on each array, our design matrixhas m repeated rows corresponding to each set ofm replicate spots. To simplify E(Yg)=Xbg, let �YYg bethe n-vector of array means �YYgi and let �XX be the re-duced nrk dimensional design matrix in which thereis only one row instead of m rows for each arraycombination (Smyth et al., 2005). Then, E(�YYg)=�XXbg.Now, let agq=cTbg, where c is a vector of q knownconstants defining contrasts of biological interest andagq is a particular contrast or linear combination ofthe regression coefficients and suppose that interestlies in testing H0 : agq=0 (Smyth et al., 2005). For therest of the paper, the subscript q will be suppressed fornotational simplicity under the assumption that our

R. Ivanek et al. 348

hypothetical experiment has only one contrast ofinterest. We suppose that measurements from genesspotted on different arrays are independent. However,we acknowledge that within an array technicalreplicates of a gene are correlated. To account forcorrelation among replicates, we used a commoncorrelation factor (Smyth et al., 2005). Let bbg be theestimator of bg from fitting a linear model and letcontrast estimator aag

� �aag=cTbbg. Here, aag is as-

sumed to be a random variable from a normal distri-bution (as in (1)) with mean mg and variance ng

2 sg2,

where ng2 is an unscaled variance of aag that also ac-

counts for a common correlation factor among rep-licates (Smyth et al., 2005), i.e.

aagjmg, s2g � N(mg, n

2gs

2g): (2)

The residual variance from fitting a linear model(sg

2) was assumed to follow a gamma distribution (G;equivalent to the scaled chi-square distribution as-sumed by Smyth, 2004) :

s2gjs2g � G

dg2,dg2s2

g

!, (3)

where dg is the residual degrees of freedom for thelinear model for gene g.

(ii) Empirical Bayes EVD mixture model (BE )

The EVD has three parameters : a location parameter,a ; a scale parameter, b ; and a shape parameter, c. Itencompasses three classes of distributions with typesI, II and III widely known as the Gumbel, Frechet andWeibull families, respectively (Coles, 2001). These arecombined into a single-family model known as thegeneralized extreme value (GEV) family of distri-butions, where the shape parameter (c) determines thetype; the Frechet and Weibull families have c>0 andc<0, respectively, while the shape parameter of theGumbel family has c=0. Throughout the presentpaper, we will use EVD and GEV interchangeably,both indicating an unspecified family of EVD. Incontrast, when we refer to a specific family we willuse its name (i.e. Gumbel, Frechet or Weibull distri-bution).

For an appropriately background corrected andnormalized gene to be DE, its mg value should bestatistically different from zero. Let Ig indicatewhether the gene is DE (mgl0) or not (mg=0), suchthat

Ig=0, if mg=0,1, if mgl0:

We assume that

mgjIg=0 � d(0), (4)

where d(0) denotes the distribution which places unitmass at mg=0. Because we expect DE genes to beeither up- or down-regulated, we assumed an EVDprior on mgl0, i.e.

mgjIg=1 � EVD(a, b, c): (5)

An inverse gamma (IG) prior distribution is assumedon sg

2, with sg2 equivalent to a prior estimator s0

2 withd0 degrees of freedom (Smyth, 2004), i.e.

s2g � IG

d0

2,d0s

20

2

� �: (6)

We wish to know whether the gene is DE, i.e. whatis Pr(Ig=1jaag, s

2g) or what are the odds of differential

expression (more specifically, the natural logarithm ofthe odds), BEg :

BEg=logPr [Ig=1j(aag, s

2g)]

Pr [Ig=0j(aag, s2g)]:

By the Bayes theorem,

Pr[Ig=1j(aag, s2g)]=

Pr(Ig=1, aag, s2g)

Pr(aag, s2g)

=Pr(Ig=1)Pr(aag, s

2gjIg=1)

Pr(Ig=1)Pr(aag, s2gjIg=1)+Pr(Ig=0)Pr(aag, s2gjIg=0),

Pr[Ig=0j(aag, s2g)]=

Pr(Ig=0, aag, s2g)

Pr(aag, s2g)

=Pr(Ig=0)Pr(aag, s

2gjIg=0)

Pr(Ig=1)Pr(aag, s2gjIg=1)+Pr(Ig=0)Pr(aag, s2gjIg=0):

Then, the log Bayes’ factor is

BEg= logPr(Ig=1)

[1xPr(Ig=1)]

Pr[(aag, s2g)jIg=1]

Pr[(aag, s2g)jIg=0]

" #, (7)

where Pr(Ig=1) is the proportion of DE genes (pDE)in the experiment. The joint densities Pr(aag, s

2gjIg=1)

and Pr(aag, sg2jIg=0) are:

Pr(aag, s2gjIg=1)=fIg=1(aag, s

2g)

=ZZ

f(aagjmg, s2g)f(mg)f(s

2gjs2

g)f(s2g) dmgds

2g

Pr(aag, s2gjIg=0)=fIg=0(aag, s

2g)

=Z

f(aagjmg=0, s2g)f(s

2gjs2

g)f(s2g)ds

2g:

To evaluate the probabilities Pr(aag, s2gjIg=1) and

Pr(aag, s2gjIg=0), and the BE statistic in (7), we used the

distributions defined in equations (2)–(6), i.e. theirrespective likelihood or probability density functions :

f (aagjmg, s2g)= 2pn2

gs2g

� �x1=2exp x

1

2n2gs

2g

aagxmg

� �2( ),

(8)

Extreme value theory in differential expression analysis 349

where x1<aag<1, x1<mg<1, ngsg>0;

f s2gjs2g

� �=

dg=2s2g

� �dg=2s2g

� �(dg=2x1)

C dg=2� � exp xdgs

2g=2s

2g

� �,

(9)

where dg/2>0, dg/2sg2>0, sg

2>0;

fIg=0(mg)=d(0); (10)

fIg=1(mgja, b, c)=1

b1+c

mgxa

b

� �x(1=c)x1

rexp x 1+cmgxa

b

� �x1=c� �

(11)

defined on {mg :1+(mgxa)/b>0}, where x‘<mg<‘,b>0 and x‘<c<‘ (Coles, 2001; Panjer, 2006) ;

f s2gjs20, d0

� �

=d0s

20=2

� �d0=2 s2g

� � x1xd0=2ð Þ

C d0=2ð Þ exp xd0s20=2s

2g

� �,

(12)

where d0/2>0, d0s02/2>0, sg

2>0.It is not possible to integrate the integral

fIg=1(aag, s2g). Therefore, numerical approximation

through the Monte Carlo (MC) integration method(Tanner, 1996) was applied to obtain the posteriorexpectations of fIg=1(aag, s

2g) and fIg=0(aag, s

2g) (denoted

as E( fIg=1) and E( fIg=0), respectively) :

E( fIg=1)=1

rgr

i=1f(aagjmgi, s

2gi) f (s

2gjs2

gi)

� �, (13)

E( fIg=0)=1

rgr

i=1f(aagjmgi=0, s2

gi) f (s2gjs2

gi)

� �, (14)

where r denotes the number of iterations, while mgiand sgi

2 denote iid draws from prior distributionsof gene-specific means and variances, respectively.Thereafter, the expected value of the BE statistic is :

BEg=logpDE

1xpDEð ÞE( fIg=1)

E( fIg=0)

: (15)

For an estimator to be useful, a measure of esti-mation error is required. The estimated Monte Carlostandard errors (MCSEs) of E( fIg=1) and E( fIg=0)(denoted as SE( fIg=1) and SE( fIg=0), respectively)were:

SE( fIg=1)=1ffiffir

p

r

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffigr

i=1[ f (aagjmgi, s2gi) f (s

2gjs2

gi)xE( fIg=1)]2

rx1

s, (16)

SE( fIg=0)=1ffiffir

p

r

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffigr

i=1[ f (aagjmgi=0, s2gi) f (s

2gjs2

gi)xE( fIg=0)]2

rx1

s: (17)

Thereafter, by accounting for error propagation, theMCSE of the BEg (MCSEg) was estimated by addinguncertainties in E( fIg=1) and E( fIg=0), i.e. SE( fIg=1)and SE( fIg=0), in quadrature (Taylor, 1982) :

MCSEg=

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiSE( fIg=1)

E( fIg=1)

2+

SE( fIg=0)

E( fIg=0)

2s: (18)

BE statistics were evaluated on a PC running onWindows XP Pro SP2 with an AMD Athlon 64 X2Dual Core Processor 4200+(2200 MHz) and2048 Mb of RAM. For a dataset with 200 genesspotted in triplicate on four arrays (similar to datasetsdescribed in Section 3(i)), the computing time for theBE method involving 50 000 iterations was roughly15 s, while the BN and BL methods took roughly 1 seach. For a considerably larger dataset, with 10 000genes also spotted in triplicate on four arrays, thecomputing time for the BE method with 50 000 iter-ations was roughly 16 min, while it was roughly 1 minfor each of the BN and BL methods. The R codeimplementing the proposed methodology is availableon request from the first author.

(iii) Estimation of hyperparameters

The BE statistic defined in Section 2(ii) uses hyper-parameters estimated from the data. The parameterss02 and d0 are estimated from sg

2 following Smyth’s(2004) procedure. The only exception to this rule waswhen d0 could not be estimated as in Smyth (2004),indicating that there is no evidence that the under-lying gene-specific variances sg

2 vary between genes,and so in Smyth (2004) d0 was set to positive infinity.In evaluation of BE statistics, d0 equal to positive in-finity precludes generation of a prior on sg

2. Therefore,in such situations, instead of positive infinity we set d0to 10300, which is close to the largest positive decimalnumber on a typical R platform. In terms of pDE, theBE method uses the same approach as the BN meth-od; it fixes pDE and then estimates the remaininghyperparameters (a, b and c). Hyperparameters a, band c are estimated simultaneously from aag valuescorresponding to a pDE fraction of genes with thehighest moderated t-statistics. The method involvesmaximum likelihood estimation (MLE) fitting for theGEV distribution, with the Nelder–Mead optimiz-ation method (Nelder & Mead, 1965). The MLEprocedure, subject to the limitations discussed inSection 5, provides the means and standard errors(SEs) for the parameters a, b and c. The means are

R. Ivanek et al. 350

used to estimate BE statistics, and the correspondingSEs provide an indication of the uncertainty aroundthe estimated mean values. Genes with the top mod-erated t-statistics (estimated according to Smyth,2004) were used instead of the top genes ranked bytheir ordinary t-statistics as they provided a morestable estimation of a, b and c.

3. Application to experimental data

(i) Data description

To analyse the performance of BE as compared withBN and BL, we used two datasets generated in a pre-vious two-color cDNA microarray study conductedto identify genes regulated by the sigma factor sB inthe bacterium Listeria monocytogenes (Kazmierczaket al., 2003). In that study, an L. monocytogenes sigBnull mutant (which lacks the sB protein) and a parentstrain with intact sigB gene (wild-type) were exposedto two stress conditions, namely osmotic stress andstationary phase, to identify genes with transcriptlevels affected by the sigB deletion under these twoconditions. For each stress condition, two indepen-dent RNA isolates (biological replicates) for bothwild-type and sigB mutant cells were dye swapped fora total of four arrays per stress condition. Each arrayincluded 211 test genes, and a number of non-hybridizing and normalization controls (for detailssee Kazmierczak et al., 2003) spotted in triplicate.Most (166) genes included on the array were identifiedby HiddenMarkovModel promoter searches as beingpreceded by a putative sB-dependent promoter, whilesome genes (36) were included because of previousreports of their involvement in virulence or stressresponse. As sB is a positive regulator of gene ex-pression with particular importance for regulatingstress response and virulence genes, most genes inthese two experiments are expected to show highertranscript levels in the wild-type strain as comparedwith the sigB deletion strain.

In their analysis, Kazmierczak et al. (2003) con-sidered all individual spots as repetitions, generating24 data points for each gene (3 spots per gener4arraysr2 channels per array), i.e. correlation amongtechnical replicates was not considered. They reportedfindings for 208 of the 211 test genes as three geneswere spotted twice. Prior to analysis, cross-slide meannormalization (without background correction) andflooring were performed. The analysis by the Signifi-cance Analysis of Microarrays (SAM) program(Tusher et al., 2001) identified 51 (25%) and 41 (20%)genes with at least 1.5-fold different statisticallysignificant expressions under osmotic stress andstationary-phase conditions, respectively.

Prior to our analysis of the two datasets of 211genes, we performed the background correction and

normalization. The median background fluorescenceintensities are usually recommended for correction ofthe background noise because of their robustness tothe outliers. We, however, used the mean backgroundintensities because the distribution of median back-ground intensities had a bimodal distribution withsome spots having zero background while the otherswere in the higher range of intensities (above 28)(possibly due to the setup or limitations of the laserscanner used).

Two background correction procedures seemedappropriate for the data. The first, the normal-exponential convolution background correctionmodel (NeBC) (performed with an offset of 100),involves fitting of the convolution of normal andexponential distributions to the foreground inten-sities using the background intensities as a covariate(also referred to as the normexp method in Smyth,2005). The second procedure used was multiplicativebackground correction (MBC). This is a novel ap-proach that involves logarithmic transformationof the intensity readings before the backgroundcorrection and is found (via a series of examples) tobe superior to the additive background correctionand no background correction (Zhang et al., 2006).Because MBC reportedly gives fewer false positivesthan conventional additive background correction(Zhang et al., 2006) and because its performancehas never been contrasted with NeBC, we used (andcompared) both background correction models inour study.

The normalization appropriate for the data was theLowess normalization (Cleveland & Devlin, 1988),with up-weighting of the background and normal-ization control spots, known to be non-DE (http://bioconductor.org/packages/1.8/bioc/vignettes/limma/inst/doc/usersguide.pdf). Application of the twobackground correction procedures (NeBC and MBC)to each of the two stress-condition datasets (osmoticstress and stationary phase) provided a total of fourreal model-datasets used in our analyses.

(ii) Results

In all four model-datasets, normalized and back-ground corrected log2 ratios between genes’ expressionvalues in wild-type and mutant cells (Ygij) were dis-tributed asymmetrically around zero and heavilyskewed to the right. This was expected because up-regulation was anticipated in most of the tested genes.It was therefore reasonable to assume that the dis-tribution of mean expressions of DE genes followsEVD. Hence, the BE method could be applied forinference about differential expression.

A critical issue in the MC integration methodology,underlying the BE method, is to determine the num-ber of iterations that can be safely used as a basis for

Extreme value theory in differential expression analysis 351

inference. We used 50 000 iterations as they providedreasonable accuracy of the approximated BE stat-istics. The achieved MCSEs varied for different genesand model-datasets. The medians, followed by theranges in parentheses, of the achieved MCSEs were0.05 (0.01–0.42) and 0.03 (0.01–0.82) for the osmoticstress datasets corrected with the NeBC and MBCmethods, respectively, and 0.38 (0.02–0.92) and 0.18(0.02–0.52) for the stationary-phase datasets cor-rected with the NeBC and MBC methods, respect-ively. In all four model-datasets, MCSEs were thelowest (<0.1) for the genes with the value of BEstatistic around 0.

For each model-dataset, the gene-specific BN, BLand BE statistics were approximated. The biologicalmeaning of the identified DE genes is important.Hence, for each of the four model-datasets in Fig. 1,we show the values of the BN, BL and BE statistics,

plotted against contrast estimators from linear models(aag) (also translated into fold changes, 2aag , for moreintuitive interpretation) and against previous resultsof Kazmierczak et al. (2003). In each model-dataset,genes that ranked very low with the BE statistic have afold change below 1. At the same time, the BN stat-istic ranked high some of the genes with very low foldchange, incorrectly suggesting down-regulation. TheBL statistic gave ambiguous results with high valuesfor most genes, particularly in the stationary-phasedata. It should be noted that for approximation of theBN and BE statistics, we fixed the pDEs to thosereported in Kazmierczak et al. (2003). Fixing pDEs todifferent values would change the BN and BE vs. foldchange plots. Decreasing pDE would shift the plots tothe right and down, whereas increasing pDE wouldshift the plots left and up on the x- and y-axes, re-spectively.

Fig. 1. The BN (Lonnstedt & Speed, 2002; Smyth, 2004), BL (Bhowmick et al., 2006) and empirical Bayes EVD mixturemodel (BE) statistics plotted against the contrast estimators from fitting linear models at the gene level, ‘alpha_g’ (denotedas aag in the text), also translated into fold changes (FC), and against results reported by Kazmierczak et al. (2003). ‘K’and the associated right y-axis indicate whether Kazmierczak et al. (2003) reported a gene as DE (‘yes’) or not (‘no’).‘NeBC’=normal-exponential convolution background correction method. ‘MBC’=multiplicative background correctionmethod. Two horizontal dashed lines (enclosing a shaded area) indicate the 5th and 95th percentiles of OT of the BEstatistic estimated for the FDR fixed to 0. ‘FNR=(,) ’ denotes false negative rate (5th and 95th percentiles) associated withthe OT.

R. Ivanek et al. 352

Table 1 shows the characteristics of the data andthe values of the hyperparameters estimated for eachof the four model-datasets. The ambiguous resultsof the BL method are probably due, at least in part, toa very high estimated probability that a gene is DE(w=1; Table 1). The prior variance distributions seemquite stable among the BN, BL and BE methods,except for the roughly double value of the scale par-ameter estimated for the BL method as comparedwith that estimated for the BN and BE methods.Contrary to that, the prior variances differ substan-tially between background correction methods, beingnarrower for the data corrected with MBC, whichmay explain the smoother plots of the BN, BL and BEstatistics following MBC. Also, interestingly, corre-lation among technical replicates tends to be higherfollowing the NeBC than MBC, demonstrating thedifference between these two procedures.

In the BE statistic, a natural choice of the optimalthreshold (OT) above which a gene could be con-sidered DE is 0. However, the actual OT depends onthe imposed criteria, such as the cost of a false positiveand false negative. A typical approach in choosing arule for interpretation of a statistical test is to controlthe type I error probability while maintaining a cer-tain power. A sensible, powerful and easy to interpret(Verhoeven et al., 2005) method to control type Ierror when multiple statistical tests are performedis the false discovery rate (FDR) (Benjamini &Hochberg, 1995). FDR is the expected proportion oferrors among the genes selected to be DE. As a lowFDR often comes at the cost of low sensitivity orpower (i.e. a high false negative rate (FNR)), theseshould be controlled jointly (Pawitan et al., 2005).Because Kazmierczak et al. (2003) considered genesthat had been pre-selected for their expected differ-ential expression, we chose an FDR=0, i.e. no falsepositives were acceptable. The OT for BE (its 5th and95th percentiles) was determined through simulationanalysis for each of the four model-datasets (as-suming that the pDEs reported in Kazmierczak et al.(2003) are true), and is shown in Fig. 1, together withthe associated FNR. Genes whose BE statistic wasabove the 95th percentile of the OT could be con-sidered DE with a high certainty. Genes with a BEstatistic between the 5th and 95th percentiles of theOT are likely to be DE. BE did rank high (abovethe 95th percentile of the OT) some of the genes pre-viously unidentified by Kazmierczak et al. (2003),whereas a few of the genes previously reported as DEby Kazmierczak et al. (2003) were ranked low (belowthe 5th percentile of the OT). However, the findingsof the BE method have been validated by other inde-pendent studies for most of the genes, for which theresult of the BE method differed from those reportedby Kazmierczak et al. (2003) (elaborated in theAppendix).T

able1.Definitionsofmodelparametersandhyperparametersin

theem

piricalBayes

EVD

mixture

model(B

E),andmodelsofLonnstedt&

Speed(2002)

modified

bySmyth

(2004)(B

N)andBhowmicket

al.(2006)(B

L)

Notation

Parameter

definition

Osm

oticstress,

NeB

Ca

Osm

oticstress,

MBCb

Stationary

phase,NeB

CStationary

phase,MBC

GNumber

ofgenes

intheexperim

ent

211

211

211

211

MNumber

oftechnicalreplicates

33

33

NNumber

ofbiologicalreplicates(arrays)

44

44

Correlationamongtechnicalreplicates

0. 65

0. 43

0. 65

0. 42

pDE

ProportionofDEcgenes

0. 25

0. 25

0. 2

0. 2

mg

PriordistributionofDEcgenemeansin

BE

EVD

d(1. 27,0. 60,0. 15)

EVD

(1. 01,0. 47,0. 17)

EVD

(1. 61,0. 68,0. 06)

EVD

(1. 35,0. 53,0. 10)

sg2

Priordistributionofgenevariance

inBE

IGe(1. 61,4. 79)

IG(7. 91,0. 47)

IG(1. 41,14. 23)

IG(2. 90,3. 83)

BNmg

PriordistributionofDEgenemeansin

BN

Nf(0,11. 95rBNsg2)

N(0,7. 06r

BNsg2)

N(0,30. 31rBNsg2)

N(0,15. 65rBNsg2)

BNsg2

Priordistributionofgenevariance

inBN

IG(1. 61,4. 79)

IG(7. 91,0. 47)

IG(1. 41,14. 23)

IG(2. 90,3. 83)

BLmg

PriordistributionofDEgenemeansin

BL

Lg(0. 31,2. 43rBLsg2)

L(0. 44,2. 54rBLsg2)

L(0. 18,5. 98rBLsg2)

L(0. 21,4. 65rBLsg2)

BLsg2

Priordistributionofgenevariance

inBL

IG(1. 9,4. 83)

IG(7. 08,0. 75)

IG(1. 12,23. 72)

IG(2. 19,7. 94)

wh

Mixingprobability

0. 93

0. 58

11

aNeB

C=norm

al-exponential

convolution

background

correction

method;

bMBC=multiplicative

background

correction

method;

cDE=differentially

expressed

;dEVD=extrem

evaluedistribution;eIG

=inverse

gammadistribution;fN=norm

aldistribution;gL=Laplace

distribution;hw=theprobabilitythatageneisDEestimated

aspart

oftheBLmethod(note

thattheBN

andBEstatisticsuse

afixed,user-defined

pDE).

Extreme value theory in differential expression analysis 353

4. Simulation studies

Based on the hyperparameters estimated from thedata and shown in Table 1, 100 datasets were simu-lated for each of the four model-datasets. To easecomputational burden, we reduced the number ofiterations in the MC integration model underlying theBE statistics from 50 000 to 5000. This reduction waswarranted by our empirical observation that geneswith BE around zero (a natural cutoff value for sep-aration of DE and non-DE genes) have the lowestMCSEs. With 5000 iterations, the achieved MCSEsof these genes were below 0.2. So, the reduction didnot jeopardize the ability of the BE to differentiatebetween DE and non-DE genes.

The performance of the BN, BL and BE statisticswas tested under the Normal, Laplace and EVDmodels. The BE statistic did not do well under theasymmetric Laplace model and did even worse underthe Normal model. That is expected because theNormal, and to a lesser extent the asymmetricLaplace, model permit both up- and down-regulatedgenes, while the EVD model allows one direction of

expression (either up- or down-regulation) only.Because the Normal and Laplace models do not fitto data from experiments where only one direction ofexpression is expected, we only present results of thesimulation under the EVD model.

Simulated datasets (100 of them) were generatedunder different scenarios. These were analysed withBN, BL and BE and the performance of the threestatistics was evaluated based on the FDR vs. FNRplots, where the lower left corner indicates perfectperformance, with zero false positives and falsenegatives. Figure 2 shows FDR vs. FNR plots of 100simulated datasets overlaid by the horizontal averagecurve and box plots showing horizontal spread ofthe performance of the three statistics in the foursimulated model-datasets. Averaging over simulateddatasets horizontally mimics the situation when theexperimenter chooses a tolerable FDR and tries tomaximize power. Results for vertical averaging,corresponding to situations when an experimenterchooses an acceptable power while minimizing FDR,are not shown as they were very similar to the resultsfor horizontal averaging. Figure 2 shows that the

Fig. 2. FDR vs. FNR plots of 100 simulated datasets overlaid by the horizontal average curve and box plots showinghorizontal spread of the performance of the BN (Lonnstedt & Speed, 2002; Smyth, 2004), BL (Bhowmick et al., 2006) andempirical Bayes EVD mixture model (BE) statistics in the four simulated model-datasets. ‘NeBC’=normal-exponentialconvolution background correction method; ‘MBC’=multiplicative background correction method; ‘ne’=not estimated.

R. Ivanek et al. 354

three statistics do very well in all four model-datasets.The only exception was BL: analysis failed for severalsimulated datasets reproduced from the osmoticmodel-dataset background corrected with MBC. Toassure fair comparison, the performance of the BLmethod in analysis of this model-dataset is not plottedin Fig. 2. A curve showing average FNR for differentvalues of FDR (horizontal average curve) in Fig. 2indicates that BE, on average, has better accuracy(correct classification of genes as DE and non-DE) inthe simulated data than BN and BL. The horizontalbox plots of the achieved FNR for various FDR shownarrower spread of BE compared with the other twostatistics, indicating better precision (repeatability ofclassification success) in the simulated data.

Robustness of the BE statistic, as compared withthe BN and BL statistics, was tested by varying sev-eral key characteristics of a microarray experiment(pDE, n, m and G). The pDE was increased and

decreased by +10% and x10%, respectively.Figure 3 shows, not surprisingly, that as the pDE de-creases all three B statistics perform worse, while theydo better as the pDE increases. However, overall, BEconsistently performed the best in these simulateddatasets. In the osmotic stress data background cor-rected with MBC, BL failed under all three simulationscenarios (pDEx0.10, pDE and pDE+0.10) in sev-eral of 100 simulated datasets and thus is not shown.

To test the robustness of the model to the numberof arrays (n), usually representing the number ofbiological replicates, we simulated the four model-datasets with n=2, 4 and 8. Figure 4 shows that BEconsistently performed better than BN and BL, butthe margin of BE superiority became smaller as thenumber of arrays increased. Interestingly, the im-provement in the performance gained with the largernumber of arrays was more apparent in the model-datasets corrected with MBC. The wiggly plot of

Fig. 3. FDR vs. FNR plots showing horizontal average of the BN (Lonnstedt & Speed, 2002; Smyth, 2004), BL(Bhowmick et al., 2006) and empirical Bayes EVD mixture model (BE) statistics from 100 simulated datasets in the foursimulated model-datasets with the proportion of DE genes (pDE) reported in Kazmierczak et al. (2003) (25 and 20%,in osmotic stress and stationary-phase data, respectively) and simulating deviations from pDE of x10% and +10%.‘NeBC’=normal-exponential convolution background correction method; ‘MBC’=multiplicative background correctionmethod.

Extreme value theory in differential expression analysis 355

BL for n=2 in osmotic stress and stationary-phasedata corrected with NeBC is a consequence of out-liers ; in a few simulated datasets, there was a highFNR estimated for different values of FDR. Whenosmotic stress and stationary-phase data correctedwith MBC were used, BL failed in the analysis ofseveral simulated datasets with n=2 and 4 and n=2,respectively.

We also tested the sensitivity of the model to thenumber of technical replicates (m) ; we simulatedm=1, 3 and 6 (Fig. 5). BE did better in all foursimulated model-datasets, and under all simulatednumbers of technical replicates. BL failed in theanalysis of several simulations of the MBC osmoticstress dataset when simulated with all three testedvalues of m (1, 3 and 6). Interestingly, in the station-ary-phase data (and to a lesser extent in the osmoticstress data) corrected with NeBC, the average per-formance of all three statistics was better underthe simulation with m=3 than m=6. This may be a

consequence of overfitting to the empirical data inestimation of hyperparameters. The performance ofall three statistics, particularly BN and BL, was re-markably good in the simulated stationary-phase datacorrected with MBC.

Finally, we also tested the performance of the BEmethod in experiments characterized with a highnumber of tested genes but with only a few DE genes(Fig. 6). We ran the models with combinations(G=200, pDE=0.05) and (G=1000, pDE=0.01)(note equal number (10) of DE genes in both thesettings). BE did better than the other two statisticsalthough the margin was not great, particularly inthe stationary-phase data. That is expected becauseas the number of genes increases and only a fewgenes are DE, the distribution of gene-specific meansbecomes closer to the normal distribution. BL failedin the analysis of several realizations of the osmoticstress data corrected with MBC under the scenariowith G=200 and pDE=0.05.

Fig. 4. FDR vs. FNR plots showing horizontal average of the BN (Lonnstedt & Speed, 2002; Smyth, 2004), BL(Bhowmick et al., 2006) and empirical Bayes EVD mixture model (BE) statistics from 100 simulated datasets in the foursimulated model-datasets simulating the number of biological replicates (n) equal to 2, 4 and 8. ‘NeBC’=normal-exponential convolution background correction method; ‘MBC’=multiplicative background correction method.

R. Ivanek et al. 356

As illustrated above, in all simulated settings, theperformance of the BE model was at least as good asthe performance of the other two models. BL failed inseveral simulated datasets under several simulationsettings, mostly due to errors in integration and/orhyperparameter estimation. We would like to addhere that we performed simulation analysis on thedatasets generated under the modified Normal modelthat allows only mean ratios of expressions largerthan 1 (i.e. higher gene expression in the wild-type ascompared with the mutant). The results were similarto those obtained from simulation under the EVDmodel, so they are not shown here.

5. Discussion

We proposed a novel method (BE) for analysis ofmicroarray data from experiments where only up- ordown-regulated genes were expected or relevant.

Its merit originates from an empirical Bayes frame-work and foundation in the EVT, as well as its goodaccuracy, precision and stability demonstrated inthe simulated datasets. The main limitations of the BEmethod pertain to its sensitivity to the assumed pDEand the higher computational cost of its underlyingnumerical approximation through the MC inte-gration method.

The main asset of empirical Bayes methods inanalysis of differential expression is their use ofinformation from hundreds or thousands of simul-taneously tested genes to support testing at thegene level. However, empirical Bayes methods domake distributional assumptions, particularly whenthe choice of priors for the parameters is limited toconjugate priors (Lonnstedt, 2001). We stepped out ofthe conjugate prior class, and in experiments whereonly up- or only down-regulated genes are expected orrelevant, assumed that the mean ratios of expressions

Fig. 5. FDR vs. FNR plots showing horizontal average of the BN (Lonnstedt & Speed, 2002; Smyth, 2004), BL(Bhowmick et al., 2006) and empirical Bayes EVD mixture model (BE) statistics from 100 simulated datasets in the foursimulated model-datasets simulating the number of technical replicates (m) equal to 1, 3 and 6. ‘NeBC’=normal-exponential convolution background correction method; ‘MBC’=multiplicative background correction method.

Extreme value theory in differential expression analysis 357

of DE genes follow the EVD. In microarrays expect-ing a single direction of regulation, the validity of thisassumption is based on the fact that the distributionof the true genes’ mean ratios of expressions (thelog of it) cannot be symmetric around zero. In theexperiments where only a certain direction of regu-lation is relevant, it is advantageous to use the BEover the BN method even if the gene expression dataare more or less symmetric around zero. That is be-cause, being restricted to one direction of expression,due to the design of the experiment or the interestof an investigator, the genes characterized by suchexpressions are actually ‘selected’ beyond a thresholdexpression, and so, they show extreme behaviour.

The three families in EVD (Gumbel, Frechet andWeibull) have distinct forms of tail behaviour, withfinite limit distributions in its upper end-point forthe Weibull distribution, and infinite ones for the

Gumbel and Frechet distributions (Coles, 2001).Furthermore, the density in the upper end-point de-cays exponentially for the Gumbel distribution andpolynomially for the Frechet distribution (Coles,2001). While, consequently, the three families givequite different representations of the extreme valuebehaviour, it may not be obvious which best rep-resents the data and the problem at hand. It is thusconvenient to use a single family GEV rather than‘guessing’ which of the three original EVD familiesis proper, because the data themselves (throughinference about the shape parameter) determine themost appropriate type of tail behaviour (Coles, 2001).

To estimate the hyperparameters of the EVD, weapplied the MLE procedure. While many techniqueshave been proposed for estimation of hyperpar-ameters in extreme value models (including graphicaltechniques and moment-based and likelihood-based

Fig. 6. FDR vs. FNR plots showing horizontal average of the BN (Lonnstedt & Speed, 2002; Smyth, 2004), BL(Bhowmick et al., 2006) and empirical Bayes EVD mixture model (BE) statistics from 100 simulated datasets in the foursimulated model-datasets simulating an experiment with a low number of DE genes (10). ‘NeBC’=normal-exponentialconvolution background correction method; ‘MBC’=multiplicative background correction method.

R. Ivanek et al. 358

methods), the MLE approach is particularly appeal-ing due to its all-around utility and adaptability(Coles, 2001). Indeed, our analyses showed that BEworks well even when hyperparameters a, b and c areestimated from a small sample of genes (y10) relatedto smaller pDE and/or a smaller number of testedgenes G. However, caution is needed when MLE isused to estimate means and SEs of EVD hyperpar-ameters. Because the end-points of the GEV distri-butions are functions of the parameter values,regularity conditions required for the usual asymp-totic properties associated with the maximum likeli-hood (ML) estimator are not satisfied by the GEVmodel (Coles, 2001). Particularly, in the experimentsinvolving detection of down-regulated genes, if thevalue of the shape parameter (c) lies between x1 andx0.5, ML estimators are generally obtainable, but donot have the standard asymptotic properties, whilewhen c<x1, ML estimators are unlikely to be ob-tainable (Smith, 1985). Conversely, when c>x0.5,ML estimators are regular (have the usual asymptoticproperties) (Smith, 1985). Therefore, in experimentsdesigned to detect up-regulated genes, where c>0, theusual asymptotic properties immediately apply. Fordetection of down-regulated genes, it is reasonableto negate the contrast estimators from gene-specificlinear models before estimation of the MLE hyper-parameters and subsequent approximation of BEstatistics, i.e. to switch from modelling minima tomodelling maxima.

While only one direction of differential expressionis often expected in the wild-type vs. mutant exper-iments, it is possible to observe a few genes with theirexpression in an opposite direction. For example,while the sigma factor sB is a positive regulator ofgene expression, some genes show higher transcriptlevels in a sigB null mutant, likely representing in-direct effects, which may, however, be biologicallyrelevant. For example, in the bacterium Staphylo-coccus aureus, genes encoding a number of exo-enzymes and toxins show higher transcript levels in asigB null mutant strain as compared with the parentstrain with an intact sigB gene, suggesting indirectnegative regulation of these genes by sB (Bischoffet al., 2004). To test differential expression of asuspected biologically relevant gene regulated inthe direction opposite to the direction of regulationof the majority of the genes in the experiment, itscontrast estimator should be negated prior to apply-ing the BE method. Otherwise, the BE method wouldrank this gene low, indicating non-differential ex-pression.

Because there is no conjugate structure betweenthe normal likelihood of a gene being DE and theEVD prior, it is impossible to estimate BE analyti-cally. Therefore, the MC integration approximationtechnique was applied. While this allowed us to base

the choice of the prior on the nature of the problem,rather than on the available conjugate priors, it cameat the price of a higher, but still reasonable (e.g.16 min for 50 000 iterations in a dataset with 10 000genes), computational cost related to a large numberof tested genes and (usually) a large number of iter-ations required to achieve a reasonable accuracy ofthe BE statistics (because the accuracy increases onlyas the square root of the number of iterations). TheMCSEs of the estimated BE statistics differed forvarious genes, being the lowest for genes with a BEstatistic around zero. That is as expected becausegenes with a high mean expression ratio, correspond-ing to the upper tail of the EVD prior, will likelyhave a high BE statistic. Similarly, genes with a highvariance, corresponding to the upper tail of the vari-ance prior, will likely have a low BE statistic. At eachiteration, the probability of drawing a sample from atail is lower than from a body of the prior distri-bution, so these genes require more iterations toachieve sufficiently low MCSEs. If the BE statistic isused to identify DE genes, as in the present study, aneven lower number of iterations may be sufficient.However, when BE is used with the purpose of preciseranking of genes by their strength of differentialexpression, a larger number of iterations may benecessary.

As discussed by Smyth (2004), it is possible to esti-mate pDE from the data, for example, from posteriorodds through MLE or based on the P-values fromt-statistics (moderated or ordinary) (such as inLangaas et al., 2005 and Wu et al., 2006). Never-theless, estimation of pDE is unstable (related tocollapse in estimation of posterior odds caused byboundary values of pDE=1 and pDE=0 havingpositive probability), and may be sensitive to theparticular prior distribution assumed for the contrastestimators from linear models and to dependencebetween the genes (Smyth, 2004). To bypass theseproblems, Smyth (2004) and Lonnstedt & Speed,(2002) fixed pDE and then estimated the remaininghyperparameter. This same approach has been adop-ted in the BE method. A wrong choice of pDE wouldnot change the shape of the BE statistic vs. foldchange plot (i.e. ranking of the genes would stayintact), but it would move the BE up and down onthe y-axis (as in Lonnstedt & Speed, 2002). Also,through influence of the pDE on the hyperparametersof EVD, it would vary fold change correspondingto BE=0. Accepting that BE=0 is a natural cutoffseparating DE from non-DE, the fold change corre-sponding to this point represents the ‘meaningfulfold change cutoff’ above which a gene could beconsidered meaningfully DE. This ‘meaningful foldchange cutoff’ could give some indication of whetherthe chosen pDE is wrong. For example, in exper-iments expecting only up-regulated genes, the

Extreme value theory in differential expression analysis 359

‘meaningful fold change cutoff’ f1 would indicatethat the selected pDE is too high.

The accuracy and precision of BN, BL and BE wereassessed based on the achieved FDR and FNR over100 simulated datasets generated under the EVDmodel. In an experiment, a researcher could choosea tolerable FDR and try to minimize FNR (i.e.maximize power) or choose a tolerable FNR and tryto minimize FDR. In the simulated datasets, wemimicked both situations, and in both, the accuracyof the BE method was at least as good as or betterthan the other two tested methods. In addition tobeing more accurate, BE was also more precise thanBN and BL in the simulated datasets. We also testedthe sensitivity of BE to several key characteristics of amicroarray experiment, namely, the pDE, the numberof arrays (usually representing the number of bio-logical replicates) and the number of technical rep-licates, as well as its performance in experimentswith a high number of tested genes but with only afew DE genes. Overall, BE performed the best in allfour simulated model-datasets and under all the testedsimulated scenarios. When applied to real datasets,BE performed well, with most of the findings beingvalidated by other independent studies (elaboratedin the Appendix). This ‘reality check’ is a valuableaddition to simulation studies because datasets usedin simulation analyses have been replicated from thesame model as the BE is built upon, thus limitingability of the simulation studies to give a completelyunbiased evaluation of BE performance.

6. Conclusions

In microarray experiments where only one directionof expression is expected or relevant (only up- ordown-regulation), such as in certain experiments in-volving wild-type vs. knockout design (Kazmierczaket al., 2003; van Schaik et al., 2007) and some re-stricted coverage arrays, including those performedwith the purpose of identifying novel gene markersand drug targets (Suzuki et al., 2002; Kobayashiet al., 2004), the nature of genes’ true mean ratios ofexpressions is extreme. EVT was designed specificallyto study extreme behaviour, and is therefore an ex-cellent candidate for use in analysis of differentialexpression in such experiments. In this paper, weproposed a new empirical Bayes method, BE, foranalysis of differential expression that is based on theEVT, and we compared its performance with twoother empirical Bayes methods reportedly used inanalysis of such data. The first of the two methods,BN, is a very popular method that is, however, basedon a distributional assumption invalid for exper-iments where only one direction of expression is an-ticipated or of interest. The distributional assumptionof the other method, BL, is valid, but the method is

unstable and its performance is still comparable withBN. Based on the series of simulation analyses andanalyses of two real datasets, we believe that the BEmethod has greater accuracy, precision and stabilitythan the BN and BL methods. It thus seems promis-ing and useful for analysis of differential expression inexperiments where only up- or down-regulated genesare relevant or expected. Because custom, restricted-coverage microarray experiments, including thosedescribed here, are likely to become much morecommon in the future due to their possible use intherapeutic and diagnostic applications (Liu-Strattonet al., 2004), development and use of appropriatebioinformatics tools will become vital, and the ad-vantages of the BE method may become even moreevident.

This work was funded by the National Institutes of Health(RO1-AI052151-01A1 to Kathryn J. Boor and M.W.) andthe United States Department of Agriculture (USDA)National Needs Fellowship grant (2002-38420-11738Sto S.R.).

Appendix

The appendix for this paper is available online at thejournal’s website (http://journals.cambridge.org/grh).

References

Benjamini, Y. & Hochberg, Y. (1995). Controlling the falsediscovery rate – a practical and powerful approach tomultiple testing. Journal of the Royal Statistical Society,Series B 57, 289–300.

Bhowmick, D., Davison, A. C., Goldstein, D. R. &Ruffieux, Y. (2006). A Laplace mixture model for identi-fication of differential expression in experiments.Biostatistics 7, 630–641.

Bischoff, M., Dunman, P., Kormanec, J., Macapagal, D.,Murphy, E., Mounts, W., Berger-Bachi, B. & Projan, S.(2004). Microarray-based analysis of the Staphylococcusaureus sB regulon. Journal of Bacteriology 186,4085–4099.

Cleveland, W. S. & Devlin, S. J. (1988). Locally weightedregression: an approach to regression analysis by localfitting. Journal of the American Statistical Association 83,596–610.

Coles, S. (2001). An Introduction to Statistical Modelingof Extreme Values. Springer Series in Statistics. Berlin:Springer.

Efron, B., Tibshirani, R., Storey, J. D. & Tusher, V. (2001).Empirical Bayes analysis of a microarray experiment.Journal of the American Statistical Association 96,1151–1160.

Gottardo, R., Pannucci, J. A., Kuske, C. R. & Brettin, T.(2003). Statistical analysis of microarray data: a Bayesianapproach. Biostatistics 4, 597–620.

Kazmierczak, M. J., Mithoe, S. C., Boor, K. J. &Wiedmann, M. (2003). Listeria monocytogenes sB regu-lates stress response and virulence functions. Journal ofBacteriology 185, 5722–5734.

Kobayashi, K., Nishioka, M., Kohno, T., Nakamoto, M.,Maeshima, A., Aoyagi, K., Sasaki, H., Takenoshita, S.,

R. Ivanek et al. 360

Sugimura, H. & Yokota, J. (2004). Identification of geneswhose expression is upregulated in lung adenocarcinomacells in comparison with type II alveolar cells andbronchiolar epithelial cells in vivo. Oncogene 23, 3089–3096.

Langaas, M., Lindqvist, B. H. & Ferkingstad, E. (2005).Estimating the proportion of true null hypotheses,with application to DNA microarray data. Journal of theRoyal Statistical Society, Series B 67, 555–572.

Liu-Stratton, Y., Roy, S. & Sen, C. K. (2004). DNAmicroarray technology in nutraceutical and food safety.Toxicology Letters 150, 29–42.

Lonnstedt, I. (2001). Replicated microarray data. Li-centiate Thesis. Department of Mathematics, UppsalaUniversity.

Lonnstedt, I. & Britton, T. (2005). Hierarchical Bayesmodels for cDNA microarray gene expression. Biostat-istics 6, 279–291.

Lonnstedt, I. & Speed, T. P. (2002). Replicated microarraydata. Statistica Sinica 12, 31–46.

Nelder, J. A. & Mead, R. (1965). A simplex algorithm forfunction minimization. Computer Journal 7, 308–313.

Newton, M. A., Noueiry, A., Sarkar, D. & Ahlquist, P.(2004). Detecting differential gene expression with asemiparametric hierarchical mixture method. Biostatistics5, 155–176.

Panjer, H. H. (2006). Fitting extreme value models. InOperational Risk: Modeling Analytics (ed. H. H. Panjar),pp. 383–393. New York: John Wiley and Sons.

Pawitan, Y., Michiels, S., Koscielny, S., Gusnanto, A. &Ploner, A. (2005). False discovery rate, sensitivity andsample size for microarray studies. Bioinformatics 21,3017–3024.

Smith, R. L. (1985). Maximum likelihood estimation in aclass of non-regular cases. Biometrika 72, 67–90.

Smyth, G. K. (2004). Linear models and empirical Bayesmethods for assessing differential expression in micro-array experiments. Statistical Applications in Genetics andMolecular Biology 3, Article 3.

Smyth, G. K. (2005). Limma: linear models for microarraydata. In Bioinformatics and Computational Biology Sol-utions Using R and Bioconductor (ed. R. Gentleman,V. Carey, S. Dudoit, R. Irizarry & W. Huber), pp. 397–420. New York: Springer.

Smyth, G. K., Michaud, J. & Scott, H. (2005). The useof within-array replicate spots for assessing differentialexpression in microarray experiments. Bioinformatics 21,2067–2075.

Suzuki, H., Gabrielson, E., Chen, W., Anbazhagan, R., vanEngeland, M., Weijenberg, M. P., Herman, J. G. &Baylin, S. B. (2002). A genomic screen for genes upregu-lated by demethylation and histone deacetylase inhibitionin human colorectal cancer. Nature Genetics 31, 141–149.

Tanner, M. A. (1996). Tools for Statistical Inference.Methods for the Exploration of Posterior Distributions andLikelihood Functions. New York: Springer-Verlag.

Taylor, J. R. (1982). An Introduction to Error Analysis.The Study of Uncertainties in Physical Measurements.Herndon, VA: University Science Books.

Tusher, V. G., Tibshirani, R. & Chu, G. (2001). Significanceanalysis of microarrays applied to the ionizing radiationresponse. Proceedings of the National Academy ofSciences of the USA 98, 5116–5121.

van Schaik, W., van der Voort, M., Molenaar, D.,Moezelaar, R., de Vos, W. M. & Abee, T. (2007).Identification of the sB regulon of Bacillus cereus andconservation of sB-regulated genes in low-GC-contentGram-positive bacteria. Journal of Bacteriology 18,4384–4390.

Verhoeven, K. J. F., Simonsen, K. L. & McIntyre, L. M.(2005). Implementing false discovery rate control : in-creasing your power. Oikos 108, 643–647.

Wu, B., Guan, Z. & Zhao, H. (2006). Parametric andnonparametric FDR estimation revisited. Biometrics 62,735–744.

Zhang, D., Zhang, M. &Wells, M. T. (2006). Multiplicativebackground correction for spotted microarrays to im-prove reproducibility. Genetical Research 87, 195–206.

Extreme value theory in differential expression analysis 361