DNA Microarrays Paper 2010

Chemometrics and Intelligent Laboratory Systems 104 (2010) 28–52


An introduction to DNA microarrays for gene expression analysis

Tobias K. Karakach a, Robert M. Flight b,c, Susan E. Douglas a, Peter D. Wentzell b,⁎

a Institute of Marine Biosciences, National Research Council of Canada, 1411 Oxford Street, Halifax, Nova Scotia, Canada B3H 3Z1
b Department of Chemistry, Dalhousie University, Halifax, Nova Scotia, Canada B3H 4J3
c Department of Neuroscience Training, University of Louisville, Louisville, Kentucky, 40203, USA

⁎ Corresponding author. E-mail address: [email protected] (P.D. Wentzell).

0169-7439/$ – see front matter. doi:10.1016/j.chemolab.2010.04.003

Article info

Article history: Received 24 November 2009; Received in revised form 5 April 2010; Accepted 6 April 2010; Available online 29 April 2010

Keywords: DNA microarray; GeneChip; Gene expression; Experimental design

Abstract

This tutorial presents a basic introduction to DNA microarrays as employed for gene expression analysis, approaching the subject from a chemometrics perspective. The emphasis is on describing the nature of the measurement process, from the platforms used to a few of the standard higher-level data analysis tools employed. Topics include experimental design, detection, image processing, measurement errors, ratio calculation, background correction, normalization, and higher-level data processing. The objective is to present the chemometrician with as clear a picture as possible of an evolving technology so that the strengths and limitations of DNA microarrays are appreciated. Although the focus is primarily on spotted, two-color microarrays, a significant discussion of single-channel, lithographic arrays is also included.


© 2010 Elsevier B.V. All rights reserved.

1. Introduction

The rise of chemometrics as an important sub-discipline of analytical measurement science paralleled the rapid growth of analytical instrumentation capable of providing higher orders of multivariate data and the associated demand for new kinds of information. Some twenty years later, the biological sciences are undergoing a similar revolution resulting from new measurement technologies, and the need for effective data analysis tools is just as pressing. Since the beginning of the 1990s, molecular biology has moved toward high throughput measurements and data similar to the transition in the analytical chemistry field in the early 1970s. The move toward high throughput technologies in molecular biology is concomitant with the advent of the huge amounts of genome information and the need to utilize it in understanding complex molecular interactions in biological systems. This is a consequence of the recognition that even simple cellular activities are the result of well-orchestrated molecular networks that control the cell and that these cannot be fully understood by studying one component at a time, but only through a comprehensive integration of the entire molecular machinery controlling the cell. Predictably, analysis of the data generated by high throughput measurements has necessitated more complex mathematical approaches that had not previously been available to molecular biologists. Chemometrics has an important role to play in this regard, since at their core these are analytical measurements and amenable to the tools that have been developed by chemometricians over many years. The application of those tools, however, requires a clear understanding of the nature of these new measurements and the challenges they pose.

There are many different high throughput measurement technologies currently employed by molecular biologists, including DNA sequencing and LC–MS (and derivatives), but one of the more ubiquitous tools in use is the DNA microarray. DNA microarrays are popular due to their unique ability to query the mRNA expression levels of thousands of genes (potentially all of the genes in an organism) simultaneously with relatively high specificity, providing a snapshot in time of the overall gene expression of the system under study. However, there are some important considerations to take into account when one is using DNA microarrays or analyzing DNA microarray data. Although this topic has been previously reviewed in other fields [1–5], this tutorial provides an introduction, to an analytical chemistry audience, of this technology and various issues related to the analysis of the resultant data. It begins by providing a brief biological background necessary to appreciate the experimental underpinnings of the technology and an overview of the methods used in manufacturing DNA microarrays. Later sections provide a detailed introduction to the measurement process of DNA microarrays in the context of the DNA microarray experiment workflow, starting with the experimental design and following through data acquisition and processing. In addition, the pre-processing applied to the data before final analysis is discussed. Finally, the methods used to analyze the resultant data are briefly considered.

The primary technological platform treated in this paper is the spotted DNA microarray, with a secondary focus on Affymetrix® arrays (see Section 3 for a description of the microarray types). This is largely due to the fact that the authors have more extensive experience working with data from the former, and that much of the research available in the literature has been published on spotted microarrays. It is also important to note from the outset that the emphasis of this tutorial is on the nature of microarray measurements and the experimental procedures used to obtain them, rather than on the data analysis techniques applied to the final data sets. Chemometricians are well-versed in the tools of the trade, but less familiar with the strengths, limitations, and peculiarities of high throughput biological measurements. Readers looking for a primer on higher-level analysis of transcriptomics data are likely to be disappointed (they should visit [6] for a listing of papers describing DNA microarray analysis methods), but it is hoped that those who wish to gain a fundamental understanding of the measurement workflow will find what they need to venture into the field of microarray analysis with confidence.

2. Biological background and motivation

A simplified view of the flow of information in a cell would show information traversing from the genes (DNA) to messenger RNA (mRNA) to proteins, which can subsequently act on DNA, mRNA, metabolites, or other proteins. To produce the required proteins, the gene must be transcribed into mRNA by RNA polymerases, and the mRNA can then be translated by ribosomes into protein (see Fig. 1).

Depending on the cell type and its biological state, specific proteins will be expressed at different levels. Therefore, if one can measure the complement of all expressed proteins then this will provide information about the current state of the cell. Given the explicit relationship between gene expression (transcription) and protein translation, knowledge of mRNA levels may provide an indirect route to this knowledge. For example, comparing the gene expression between diseased and healthy cells could allow the determination of the molecular basis of disease. Alternatively, measuring gene expression as a function of a serial process would allow the determination of molecular changes over time (cell cycle) or with changing dosage (drugs/metabolite response).

Consequently, three options are available for investigating molecular dynamics of the cell, analyzing the variations of (1) the complete set of proteins in the cell (proteomics), or (2) the complete set of mRNA transcripts that leads to the production of these proteins (transcriptomics), or (3) the complete set of metabolites generated by the proteins (metabolomics). Although research in proteomics and metabolomics has been ongoing for many years, both fields still suffer from a lack of standardized methodologies and poor reproducibility. This is partly a result of the heterogeneous properties of the molecules being measured. In the case of proteomics, different amino acid sequences lead to a wide variety of protein types, making it difficult to design standard protocols for performing measurements on the entire protein complement. Metabolomics likewise suffers from the wide diversity of chemical properties of different metabolites. The relatively homogeneous nature of mRNA, and the development of capture methods based on complementary base pairing, has led to the very mature field of transcriptomics using DNA microarrays. In addition, in many cases mRNA levels are a reasonable proxy for protein amounts, allowing one to make a rational inference regarding the level of protein expression based on the levels of mRNA expression. There are, however, exceptions where protein expression is controlled post-transcriptionally by other factors.

Fig. 1. Overview of the process of transcribing DNA to mRNA, which is translated into proteins that are then able to act on metabolites. It should be noted that this simple model ignores many of the complexities in the process, such as alternative splicing of mRNA, miRNA silencing and the effect of post-translational modifications on proteins.

Transcriptomics generally utilizes DNA microarrays, small slides to which are attached hundreds to tens of thousands of molecules of DNA [7]. The DNA is able to bind complementary sequences created from mRNA transcripts, facilitating the quantitation of various mRNA transcripts in the cell. This process is illustrated schematically in Fig. 2. DNA microarrays allow molecular biologists to monitor the levels of mRNA transcripts for tens of thousands of genes simultaneously, thereby giving them a window into the inner workings of the genome at the transcriptional level. Microarrays have impacted the study of numerous diseases, the regulation of many biological mechanisms, as well as the cell cycle of various organisms [1]. The methods by which DNA microarrays are constructed and used, however, can take various forms.

3. DNA microarrays

A microarray consists of a series of miniaturized chemical recognition sites onto which binding reagents, capable of distinguishing complementary molecules, have been attached. Pirrung defined a microarray as a flat solid support that bears multiple probe sites containing distinct chemical reagents with the capacity to recognize matching molecules unambiguously [8]. Thus, in principle, if complementary molecules in a complex mixture were modified with fluorophores, for instance, and allowed to interact with the probes, the molecules could be interrogated simultaneously to determine their respective concentrations. This definition is similar to the classic definition of multianalyte chemical sensors, notwithstanding the different measurement environments and detection systems. It also restricts a microarray to be a miniaturized assay without specifying, explicitly, the chemical reagents that constitute the probes. In the case of DNA microarrays, the probes are DNA oligomers that are allowed to interact with labeled complementary DNA strands. This has led to a wide variety of DNA microarray types, although there are two general classes. The first category encompasses microarrays on which a single stranded DNA (ssDNA) oligomer probe is synthesized directly on the substrate (in situ synthesis). The second category encompasses microarrays on which a ssDNA oligomer or dsDNA (double-stranded DNA) amplicon probe is deposited on the substrate, and these are commonly referred to as spotted arrays. These different methods of generating DNA microarrays lead to some important considerations in the data analysis, and so a basic primer on the synthesis and detection of target binding for both is provided in the next sections.

Fig. 2. (a) Spotted microarray experimental set-up. mRNA extracts (targets) from cells under two distinct physiological conditions are reverse transcribed to cDNA and then labeled with different fluorescent dyes, e.g. Cy3 and Cy5. Equal amounts of the dye-labeled targets are combined and applied to a glass substrate onto which cDNA amplicons or oligomers (probes) are immobilized. (b) Scanned image of an Atlantic salmon cDNA microarray [7].

3.1. In situ synthesis

Among the most popular arrays where the DNA oligomers are synthesized in situ are the Affymetrix® arrays, known as GeneChips®. In a GeneChip, a photolithographic mask is used to determine the probe position on the array at which photo-induced deprotection of a previously deposited functionalized nucleotide occurs, in order to attach the subsequent nucleotide to the growing oligomer [9]. Due to possible failure of photo-induced deprotection at each step of the synthesis, GeneChips contain short probes (25 nucleotides long), with multiple probe sequences for each target of interest. These make up what are known as “probe sets”, and contain both perfect matches (PM) for the sequence of interest, and also probes that contain a single base mis-match (MM) at the middle position to allow determination of non-specific target binding. The use of photolithographic techniques to produce the arrays leads to very reproducible, extremely regular probe regions on the array surface. However, this same strategy makes it more expensive to produce custom arrays, and Affymetrix has concentrated on producing arrays for widely used organisms, although the selection of organisms for which arrays exist has expanded considerably in recent years.
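The role of the mis-match probes in a probe set can be illustrated with a small sketch. The intensities below are invented for illustration, and the summary used (a plain mean of the PM minus MM differences) is only one simple choice assumed here; it is not presented as the algorithm used by any particular software package.

```python
def average_difference(pm, mm):
    """Mean of (PM - MM) over the probe pairs of a single probe set.

    Each MM value is treated as an estimate of the non-specific binding
    experienced by its paired PM probe (an illustrative assumption).
    """
    if len(pm) != len(mm) or not pm:
        raise ValueError("PM and MM need one value per probe pair")
    differences = [p - m for p, m in zip(pm, mm)]
    return sum(differences) / len(differences)


# Hypothetical intensities for one probe set with four probe pairs
pm_intensities = [1200.0, 950.0, 1430.0, 1100.0]
mm_intensities = [300.0, 280.0, 410.0, 350.0]

signal = average_difference(pm_intensities, mm_intensities)
print(f"probe set signal estimate: {signal:.1f}")
```

A probe set in which the MM intensities approach the PM intensities would give a value near zero under this summary, suggesting that most of the observed signal is non-specific.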

Nimblegen® arrays are similar to GeneChips in that they use a photo-induced deprotection of previously deposited functionalized nucleotides to subsequently add to the growing oligomer. However, in the case of the Nimblegen arrays, a digital micromirror device (DMD) is used to direct the light to cause photo-induced deprotection [10]. This has the advantage of not requiring the fabrication of new photolithographic masks for new array designs, as in the case of the GeneChips. Another important difference in the Nimblegen technology is the use of longer oligonucleotides, 60mers in contrast to the 25mers used by Affymetrix. In theory, this allows for greater specificity of hybridization of the targets to the probes on the slide, with less chance of cross-hybridization between the target sequences. The use of the DMD allows Nimblegen to achieve high densities, while easily allowing one to create customized arrays.

Another method of in situ oligomer synthesis uses addressable electrodes to cause deprotection of the previously deposited nucleotides via electrochemical methods. Each chemical reaction is confined to the activated electrode through the use of a buffering solution [11]. The number of probe sequences that can be synthesized on the array is limited by the lower limit size of the fabricated electrodes on the chip.

The last method of note is the use of extremely accurate ink-jet systems to control the delivery of various reagents used for oligomer synthesis [12]. In this situation, the growing oligomer is synthesized by changing the base added at each location via the ink-jet system. Agilent® uses this method to produce their arrays.

Although the Affymetrix system using photolithographic masks is the least flexible method of in situ oligomer array fabrication, it has the advantage of having been commercialized the longest, and has become a de facto standard in the industry. However, with the rise of requirements for experiments with non-model organisms or non-standard applications, the arrays generated using the other methods outlined above are becoming more popular.

3.2. Spotted arrays

In contrast to the in situ arrays mentioned above, spotted arrays are generated through the mechanical deposition of synthesized oligomer (generally 50–70mers) or cDNA amplicons on a functionalized substrate. cDNA amplicons can be generated via reverse transcription of all mRNAs from an organism under study to generate so-called expressed sequence tags (ESTs), and then amplified by polymerase chain reaction (PCR), purified, and spotted on microarray slides. In the early days of microarrays, cDNA libraries were produced for many different organisms, and generating more cDNA is easily accomplished, providing a readily accessible source of probe material. This was an inherent advantage of spotted arrays over GeneChips at that time, but as genomic information becomes available for a wider range of organisms and the selection of commercial arrays becomes larger, the use of cDNA arrays is declining.

Long oligomer arrays, in contrast, use probes consisting of oligomers synthesized via traditional solid-phase methods. Like in situ synthesized arrays, they require knowledge of the genomic sequence for the organism under study, and, in theory, they provide more specificity than cDNA arrays. This is possible because 50–70 nucleotides allow discriminatory binding between several different but closely related sequences (in contrast to the 25mers used for GeneChips), and cDNA amplicons would likely cross-hybridize to closely related sequences. Both long oligomer and cDNA arrays are used currently; however, the consensus appears to be that long oligo arrays are more advantageous due to the ability to have more control over the actual sequence in the microarray probe.

The first reported instance of the modern spotted microarray used the relatively simple method of spotting the cDNA by a robotic arrayer, with a single pin picking up a solution of cDNA and depositing it on an appropriately functionalized glass slide [13]. This method of array construction is still in use today with the modification of using multiple pins simultaneously to pick up and deposit the DNA (see Fig. 3 for a schematic of a robotic arrayer system). Since the initial report, there has been a large body of literature generated on the best types of array surfaces, surface chemistries, different types of pins, and different DNA solution compositions. A discussion of these various issues is beyond the scope of this review, and the interested reader will be able to easily find information in the literature. Most importantly, the materials required to build the robotic arrayer are relatively cheap, and allowed almost anyone with the time and technical know-how to assemble their own arrayer and begin generating microarray slides to perform experiments. This led to a very large body of literature on the manufacture and use of spotted microarrays in the academic community. Although it is still possible to build one's own robotic microarrayer, due to the many sources of potential error in the manufacture of microarrays, printing is now generally performed by commercial suppliers or specialty academic microarray centers.

The primary method of attaching the probe DNA (cDNA amplicon or synthesized oligomer), as mentioned above, uses print-tips whereby the DNA of interest is picked up by a solid or capillary metal tip, and then placed on the array with or without contact of the print tip to the surface. There are many other methods of depositing the probe on the array surface, including ink-jet [14] and electrophoretically driven [15] deposition. The use of spotted microarrays does lead to lower probe densities on the array in comparison to synthesized arrays, especially the Affymetrix and Nimblegen arrays.

Fig. 3. Schematic of microarrayer with the arrows pointing to the direction of movement of components during printing. The print-head moves in the x- and z-directions while the base, on which the glass slides and microtitre plates sit, moves in the y-direction (courtesy of Gisli Sigtryggsson). Inset: photograph of actual microarrayer (courtesy of M. Werner-Washburne).

3.3. Impact of array type

Although the intricacies of detection of hybridized targets to the probes on the microarray are discussed in a later section, a brief comment regarding the influence of the method of array fabrication on the approach used for probe detection and the influence on array design is appropriate, especially in the context of comparing spotted microarrays with GeneChips. The highly reproducible manufacture of GeneChips leads to a very high technical reproducibility between two arrays measuring the same sample. This has resulted in the use of a single GeneChip for each sample of interest, while still being confident of making comparisons between two different samples hybridized to two different arrays.

Spotted microarrays, in contrast, tend to have large variations in spot size and morphology between spots and between arrays, making it more difficult to make a valid comparison between two samples hybridized to two different arrays. Therefore, spotted arrays almost always hybridize two samples with different labels to the same array, thereby enabling comparison of the two samples. It is thus important to understand the limitations of the two different formats when undertaking data analysis of microarray data from different types of arrays.

The next section examines the physical manipulations necessary to actually perform a microarray experiment, namely labeling of the sample, and hybridization to the microarray.

4. Measurement and analysis

The process of acquiring and analyzing DNA microarray data can be regarded as consisting of a workflow of discrete steps, starting with the design of the experiment, following through extraction of appropriate samples, labeling, hybridization, scanning of the microarray, image processing, normalization, ratio calculation, statistical analysis, and ending with the extraction of information and generation of knowledge from the results. Although this review is intended to highlight the steps of the workflow that result in the actual data generation and the analysis of the data, all of the steps are included here with at least a cursory overview to give the reader an appreciation of the complexities involved in developing the actual data that are analyzed. Most of the subjects addressed in this review are applicable to all types of DNA microarray experiments (e.g. experimental design and transformations). However, in some of the areas there may be experimental considerations specific to either spotted microarrays or GeneChips. These will be addressed separately where appropriate.

4.1. Experimental design issues

One of the unfortunate consequences of the technical and conceptual simplicity of microarray technology is its capacity to yield data sets that are biased by inadequate design considerations. In the absence of well-established experimental designs for microarrays, poorly designed experiments continue to yield multiply-confounded data with which one is unable to answer the question for which the experiment was conducted. The general objective of experimental design is to curtail the effects of confounding factors by generating data that span rich and diverse sample spaces, minimize the effects of unwanted variation, and provide the potential for maximum efficiency for probing the hypotheses under investigation. Yet, with microarray experiments, there is often the false hope that due to the volume of data generated per experiment, confounding factors and unwanted variation will be somewhat mitigated.

Although many different types of experiments may be conducted using DNA microarrays, such as comparative genome hybridization (CGH) [16], single nucleotide polymorphism (SNP) analysis [17], alternative splicing [18], and microRNAs [19], the focus of interest in the majority of microarray studies is typically to discover genes that are differentially expressed in different subjects, different tissues, cells exposed to varying physical/biochemical conditions, or those undergoing growth, development, or degeneration. Some of the common reasons for evaluating these variables are to discover the roles of genes in an organism, to group genes according to common functions, to understand the relationships among genes in a biological system (systems biology), to classify biological specimens (e.g. tumor cells) on the basis of gene expression, and to identify important biomarkers in disease progression. Thus, analysis of these experiments involves identification of genes that display uncharacteristic tendencies of increased or decreased expression, and achieving this goal must involve careful experimental design to avoid spurious observations confounded by unrelated experimental variables at multiple levels. Microarray experiments can be regarded as multilayered in the sense that they involve several nested levels at which variability may be introduced. Churchill [20] and Simon et al. [21] categorized the levels at which microarray experiments must be designed into three layers: (1) the selection of experimental units, (2) the design of mRNA extraction, labeling and hybridization, and (3) the arrangement of probes on the glass slides. Whereas the first layer controls the span of the biological design space, the second and third layers account for the analytical (technical) variability at the lower levels of the experimental process and will be the focus of this section.

4.1.1. Types of experiments

In a broad sense, most microarray experiments can be classified as either comparator experiments or serial experiments based on the nature and objectives of the procedures employed. In a comparator experiment, the objective is to compare gene expression among cells under several distinct conditions (e.g. different drug treatments, different tissue types and different tumor types) to identify differentially expressed genes. These experiments can be further categorized according to their objectives as class comparison, class prediction, and class discovery experiments [22,23]. In contrast, serial experiments are designed to follow the evolution of gene expression as a function of some ordinal variable in order to better understand the biological system under study [24–26]. Most often, the ordinal variable is time and the experiment is referred to as a time-course, but it is also possible to examine other variables such as the dosage level of a drug or toxin. Serial experiments are less widely employed than comparator experiments, probably because they demand more resources, require synchronization, and are not as amenable to conventional cluster analysis and other techniques that are easy to implement and widely used. These two experimental categories are discussed in greater detail below.

Comparator experiments can be carried out using controlled or uncontrolled design strategies. The former are controlled in the sense that cell populations are selected and partitioned as reference and test samples, after which the test cells may be treated in some way that differentiates them from the reference, such as by exposure to a toxin [27], a drug [28] or environmental stress [29]. RNAs are extracted from the two cell populations, labeled with different dyes, and hybridized to the same array for direct comparison of relative expression (see Fig. 2). Uncontrolled comparator experiments involve identification of subjects that may exhibit the conditions of interest (e.g. patients suffering from different forms of a cancer), extracting RNA from these candidates, and comparing their abundance to reference mRNA extracted from separate normal individuals. A comprehensive comparison of comparator designs has been reported elsewhere [30]. The section that follows briefly describes some of the common designs currently used.
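The notion of relative expression on a two-color array can be made concrete with a minimal sketch. The intensities and gene names below are hypothetical, and the use of a base-2 logarithm of the test-to-reference ratio is an assumption made for illustration (it is a common convention, but the text above does not prescribe it); background correction and normalization are ignored here.

```python
import math


def log_ratio(test_intensity, reference_intensity):
    """log2(test / reference) for one spot; positive means higher in the test sample."""
    if test_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(test_intensity / reference_intensity)


# Hypothetical (test, reference) channel intensities for three spots
spots = {
    "gene_1": (8000.0, 2000.0),
    "gene_2": (1500.0, 1500.0),
    "gene_3": (500.0, 4000.0),
}

ratios = {gene: log_ratio(test, ref) for gene, (test, ref) in spots.items()}
for gene, value in ratios.items():
    print(f"{gene}: log2 ratio = {value:+.2f}")
```

On this scale, equal expression in the two samples maps to zero, and a fold increase and the corresponding fold decrease are symmetric about zero.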

Time-course experiments include those that profile gene expression in response to cell cycle [25], development [31–33], and external stresses over time [29,34]. In these experiments, RNA is extracted from candidate cells at specified time intervals and co-hybridized with RNA extracted from a common reference. For instance, the genetic profiles of yeast cells exiting from stationary phase have been obtained using, as reference, mRNA derived from cells in the exponential growth phase [35]. Other approaches have been discussed in reference [36]. Experimental design issues associated with time-course experiments have been discussed in detail elsewhere [24,37] but a brief mention of the most significant aspects will be made here. These include the frequency at which experimental mRNA samples are extracted (i.e. the number of samples per given time interval) and the synchronicity of the units in view of the homogeneity of cell populations. For example, in cell development and growth experiments, the sampling rate during exponential growth phase is maximized in order to minimize temporal aggregation. The synchronization of the initial population is also important in these experiments, since it is impossible to follow changes for a mixed population for which the distribution of cellular states does not change. For dose experiments, it is important to ensure that all cells to which a chemical dose is administered have had similar prior treatments and exhibit the same population distribution.

It should be noted that the design of experiments utilizing GeneChips will be very similar to those employing two-color spotted arrays, with the exception that individual samples are hybridized to separate arrays. This mitigates some of the concerns inherent in two-color experiments, especially the amount of sample required, but it does introduce novel complications for other types of experiments.

4.1.2. Experimental designs

The most widely used and easily interpreted experimental design employed in two-color microarray experiments is referred to as the reference design. In this design, the test samples, labeled with one dye, are hybridized against a relevant reference which has been labeled with the other dye. For purposes of illustration in this section, we will consider a hypothetical example in which we are interested in the gene expression levels of three different types of tumors, A, B and C, extracted from test subjects. If the principal interest in this experiment is to examine the differences in gene expression between normal tissue and cancerous tissue of various types, a reference design could be used in which normal tissue serves as the reference, R. Even with this specification, however, there are sub-classifications of designs based on how the reference is obtained. In a common reference design, the same reference material is used for all of the test samples; that is, the reference material is extracted from one source, or from multiple sources and homogenized. In our example, this would correspond to extracting the mRNA from the healthy tissue of one individual (Fig. 4a). An alternative, however, would be to extract both healthy and diseased tissue from the same subject and use these pairs for comparison (Fig. 4b). This is referred to as a direct comparison, and would be expected to reduce the variance in differential expression arising from different individuals. This approach relies on the availability of a natural biological internal standard, however, and may not always be possible. For example, if the goal of the experiment is to determine the effect of a drug treatment, direct comparison will not generally be possible. As an alternative to the common reference in such circumstances, an indirect comparison can be used, where individual (unrelated) reference samples are obtained for each test sample (Fig. 4c). Although this would be more robust than using a common reference, which may increase the likelihood of a false positive due to a few anomalous genes, it would also be expected to increase the variance in the observations.
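The three variants of the reference design can be written out as lists of hybridizations, one (test, reference) pair per array. The sketch below is illustrative only: the function name and the `kind` labels are hypothetical, and the sample names follow the A1, RA1 and R1 notation used in the caption of Fig. 4 for three tumor types with two replicates each.

```python
def reference_design(test_samples, kind="common"):
    """Return one (test, reference) pair per array for a reference design.

    kind = "common":   every test sample shares the same reference R
    kind = "direct":   each test sample has a matched reference (e.g. RA1 for A1)
    kind = "indirect": each test sample has its own unrelated reference (R1, R2, ...)
    """
    pairs = []
    for index, sample in enumerate(test_samples, start=1):
        if kind == "common":
            reference = "R"
        elif kind == "direct":
            reference = "R" + sample
        elif kind == "indirect":
            reference = "R" + str(index)
        else:
            raise ValueError("unknown design kind: " + kind)
        pairs.append((sample, reference))
    return pairs


# Three tumor types (A, B, C) with two replicates each
samples = [tumor + str(rep) for tumor in "ABC" for rep in (1, 2)]

common = reference_design(samples, "common")
direct = reference_design(samples, "direct")
indirect = reference_design(samples, "indirect")

for name, design in (("common", common), ("direct", direct), ("indirect", indirect)):
    print(name, design)
```

All three variants use the same number of arrays (six in this example); they differ only in how many distinct reference preparations are needed and in how the reference relates to each test sample.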

Fig. 4. Some possible experimental designs illustrated for a microarray experiment consisting of three treatments (A, B and C) plus a reference and two replicates: (a) common reference design, where R is the common reference; (b) reference design with direct comparison, where RA1 is a reference matched to A1; (c) reference design with indirect comparison, where R1 is not related to A1; (d) common reference design with dye-swap; (e) loop design (including a reference); (f) balanced incomplete block design (note that three replicates are used in this design).

Another issue related to reference designs is the use of “dye-reversal” (also known as “dye-swap” or “fluoro-flip”) experiments. Although it is natural to expect that there will be differences in the scale of intensities from the red and green channels due to factors such as dye labeling efficiency and laser power, these are usually compensated for through a process known as normalization (see Section 4.8). However, if there is preferential incorporation of one dye or preferential hybridization of one dye-labeled transcript over another and this varies across genes, a gene-specific bias is introduced. To compensate for this, the use of dye-swap experiments, in which an experiment is repeated with the red and green labels reversed, has been advocated (Fig. 4d). The use of these experiments is popular, although there are arguments that they may be unnecessary [21]. In addition, there are more efficient designs now available to account for these biases should they exist (see below).
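The way a dye-swap pair separates differential expression from a gene-specific dye bias can be illustrated with a minimal sketch. It assumes, for illustration only, that the bias is additive on the log-ratio scale and identical on both arrays; the function names and the numerical values are hypothetical.

```python
def dye_swap_estimate(m_forward, m_reverse):
    """Combine log2(Cy5/Cy3) ratios from a forward array and its dye-swap.

    m_forward: log-ratio with the test sample labeled with Cy5.
    m_reverse: log-ratio with the test sample labeled with Cy3.
    Assumes an additive dye bias that is the same on both arrays.
    """
    expression = (m_forward - m_reverse) / 2.0  # bias cancels in the difference
    dye_bias = (m_forward + m_reverse) / 2.0    # expression cancels in the sum
    return expression, dye_bias


# Hypothetical gene: true test/reference log2 ratio of 1.0, dye bias of 0.3.
true_ratio, bias = 1.0, 0.3
m_fwd = true_ratio + bias    # test in Cy5: bias adds to the true ratio
m_rev = -true_ratio + bias   # test in Cy3: ratio is inverted, bias is not
expression, estimated_bias = dye_swap_estimate(m_fwd, m_rev)
```

In the absence of measurement noise the two quantities are recovered exactly; with real data the same averaging reduces, but does not eliminate, the influence of the bias.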

The reference design is appealing because of its simplicity and its compatibility with data analysis techniques such as cluster analysis, but it is not the most efficient design. Often, the choice of design strategies in microarray experiments is determined by factors such as the specific biological question, the availability of resources, and the proposed methods for validation of the results [36]. Reference designs have been argued to be inefficient when resources are limited since the reference is hybridized multiple times. Common alternatives to the reference design are the loop design [38] and the balanced block design.

In the loop design, sample 1 is co-hybridized with sample 2, sample 2 with sample 3, sample 3 with sample 4, and so on until the last sample is co-hybridized with sample 1. Successive hybridizations are set up so that a dye-swap occurs for the common sample in consecutive experiments, resulting in a design that is able to include dye–gene interactions. This design is illustrated in Fig. 4e for the example presented earlier, including the reference (normal tissue) as one of the samples. The design for the example requires eight arrays, and would be equivalent to a reference design with dye-swaps, requiring 12 arrays. If one were only interested in comparing the test samples, only six experiments would be required. Thus, the loop design can be regarded as more efficient, but it suffers from a number of drawbacks. Since each sample connects to the next as a reference, one bad sample or array can disrupt the continuity, making the design sensitive to experimental problems. The indirect method of comparison also makes the method prone to inflated variance when two samples far apart are contrasted. Finally, data analysis is not as straightforward as for the reference design and standard methods for data clustering cannot be directly applied.
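The array counts quoted above can be checked with a short sketch that enumerates the hybridizations. The ordering of samples around the loop and the function names are assumptions made for illustration; the actual layout in Fig. 4e may differ.

```python
def loop_design(samples):
    """Return one (cy3_sample, cy5_sample) pair per array for a loop design."""
    n = len(samples)
    # Array i pairs sample i (Cy3) with sample i+1 (Cy5); the last wraps to the first,
    # so every sample is labeled once with each dye on consecutive arrays.
    return [(samples[i], samples[(i + 1) % n]) for i in range(n)]


def reference_design_with_swaps(test_samples, reference="R"):
    """Return two arrays per test sample: one forward and one dye-swapped."""
    arrays = []
    for sample in test_samples:
        arrays.append((reference, sample))  # reference in Cy3, test in Cy5
        arrays.append((sample, reference))  # dyes reversed
    return arrays


# Hypothetical sample labels: three treatments plus reference, two replicates each.
tests = ["A1", "B1", "C1", "A2", "B2", "C2"]
with_reference = ["A1", "B1", "C1", "R1", "A2", "B2", "C2", "R2"]

loop_all = loop_design(with_reference)            # loop including the reference
loop_tests_only = loop_design(tests)              # loop over test samples only
ref_swapped = reference_design_with_swaps(tests)  # reference design with dye-swaps

print(len(loop_all), len(loop_tests_only), len(ref_swapped))  # 8 6 12
```

Removing any single pair from `loop_all` leaves a chain rather than a closed loop, which is one way of seeing why the design is sensitive to a failed array.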

The balanced block design is similar to the loop design in that it attempts to improve efficiency through co-hybridization of test samples. The requirement of this design is that each pair of sample classes appears together the same number of times. This is illustrated in Fig. 4f for the earlier example, where there are four sample classes or treatments (A, B, C and R), each with three replicates. Note that each treatment should appear labeled with each dye, ideally an equal number of times, although this may not be possible with an odd number of appearances, as in the current example. The minimum number of arrays required is equal to the number of combinations of treatments taken two at a time, which in this case is six (if using one replicate from each sample). Multiples of this minimum can also be employed. Also note that the complete utilization of biological replicates requires that the number of replicates for each treatment be an integer multiple of the number of treatments minus one, which is why the number of replicates was expanded to three in this example. If two replicates had been used, as in previous designs, it would be necessary to replace the third replicate with technical replicates of the biological samples. The design in this example is a balanced incomplete block design, since all treatments do not appear in all blocks. A complete block design is only possible when there are only two treatments. The balanced block design is very efficient, but suffers from some of the same drawbacks as the loop design in terms of interpretation and data analysis.
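The counting rules for the balanced block design can be sketched as follows. The function names are hypothetical, and dye assignment within each array is deliberately left out; only the pairing of treatments is enumerated.

```python
from itertools import combinations
from math import comb


def balanced_block_pairs(treatments):
    """Return every pair of treatments once (one array per pair)."""
    return list(combinations(treatments, 2))


def replicates_fully_used(n_treatments, n_replicates):
    """True if replicates are an integer multiple of (treatments - 1)."""
    return n_replicates % (n_treatments - 1) == 0


treatments = ["A", "B", "C", "R"]
pairs = balanced_block_pairs(treatments)

# Each treatment appears on (number of treatments - 1) arrays in the minimal design.
appearances = {t: sum(t in pair for pair in pairs) for t in treatments}

print(len(pairs), comb(len(treatments), 2))  # 6 6
print(appearances)                           # {'A': 3, 'B': 3, 'C': 3, 'R': 3}
print(replicates_fully_used(4, 3))           # True
print(replicates_fully_used(4, 2))           # False
```

With four treatments, each appears on three arrays, so three biological replicates per treatment are used exactly once each, whereas two replicates would leave one appearance to be filled by a technical replicate.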

The discussion above has been focused primarily on designs for comparator experiments, but some comments on serial experiments should also be made. By far the most common design used for time-course experiments is the reference design, but other designs are possible and some of these have been described by Yang and Speed [36]. Perhaps the most natural alternative design is the loop design, since there is an obvious relationship between sequential time points. In practice, however, this can lead to significant problems if there is one bad sample or array that breaks the chain of measurements. Another important consideration in serial experiments is the choice of reference. In comparator experiments, it is generally expected that the differential expression will occur in relatively few genes, but in serial experiments the changes in gene expression can be much more dramatic over the course of an experiment, making the choice of a natural reference difficult. Moreover, the reference in a serial experiment is primarily used as an internal standard for measurements and not so much as a measure of differential expression, since it is the change in expression from one time point to the next, as opposed to the absolute ratio, that is of greatest interest. Because of this, the reference for serial experiments should be chosen to represent as many genes as possible. The absence of a gene transcript in the reference will lead to an undefined ratio and the inability to measure changes in the expression of that gene, which may be important even if it is initially absent in the early samples.

4.1.3. Replication

One of the key questions posed by microarray researchers concerning experimental design relates to the number of times hybridizations must be repeated in order to gain accuracy in the estimates of variables of interest. Statistical inference of the significance of measured variables (usually log-ratios) is determined by the magnitude of the residual variance which, in microarrays, has contributions from the inherent biological variability in samples plus the analytical variability (referred to as technical variability). These sources of variability can occur on multiple levels and within both the test and reference samples. As with any designed experiment, the goal in microarray experiments is to either control sources of variability or include them as part of the model. Accurate determination of the residual variance can only be achieved through objective replication of experiments both at the biological and technical levels. In this section, some strategies for replication of microarray experiments are discussed and some models in current use are described.

It has been reported that technical variability accounts for as little as 5–10% of the standard error [39], yet many microarray experiments place an emphasis on this source of variability, probably because it is easier to generate technical replicates than biological replicates. Within the category of technical replication, there are different levels of contribution that can be investigated [40]. At the lowest level is the spot-to-spot variability on a given slide. In principle, this can be estimated by multiple spotting of the same probe at different locations on the array. In practice, it is common for logistical reasons to place probe replicates side-by-side, which limits their utility in estimating this source of variance since they do not model effects associated with spatial or temporal distribution, or different pins. However, this is only one component of the technical variance and, while it may be important in assessing the overall error structure, it is perhaps more useful to estimate the total variance from this source. The ideal technical replicate would begin with the replicate extractions of mRNA from the same biological source and carry these through the labeling and hybridization procedures. For practical reasons, this is not commonly done and technical replication is likely to be carried out at some downstream step, such as before or after labeling.

From a classical statistical standpoint, it is the variance that is introduced by biological replication (which also incorporates technical variance) that is of greatest importance in a comparison experiment. Some aspects of intrinsic biological variability are relatively simple to control, while others are impractical or impossible to eliminate, depending on the nature of the study. Variation resulting from gender, age, genotype and the interactions of these factors has been reported to account for upwards of 60% of the standard error [41]. It has also been argued that, if the biological samples are drawn from cell lines, biological variability will be smaller. However, even within these populations, some diversity is expected (e.g. with cell passage number) and biological replication is advisable to avoid the detection of false positives that arise from the anomalous behavior of a few genes within an individual sample of the population. To mitigate the high cost or difficulties (e.g. small sample size) associated with independent sample replication, the pooling of mRNA extracts has also been considered as a possible alternative to capturing variation due to the transcriptional diversity of samples [42], although cautious sentiments have been expressed [21,39,40]. Finally, it should be emphasized that randomization of the procedures used in the microarray trials (e.g. order of experiments, operators, etc.) is critical to yield meaningful results.
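The relative value of biological and technical replicates can be illustrated with the standard variance expression for a nested design, in which each of n_bio biological samples is measured n_tech times. This is a minimal sketch assuming independent, additive variance components; the variance values below are hypothetical and are not taken from the cited studies.

```python
def variance_of_mean(var_bio, var_tech, n_bio, n_tech):
    """Variance of the mean log-ratio for a nested replication scheme.

    Biological variance is averaged over n_bio samples only, whereas
    technical variance is averaged over all n_bio * n_tech measurements.
    """
    return var_bio / n_bio + var_tech / (n_bio * n_tech)


# Hypothetical variance components (biological variance dominant).
var_bio, var_tech = 0.9, 0.1

# Two ways of spending eight arrays.
few_biological = variance_of_mean(var_bio, var_tech, n_bio=2, n_tech=4)
many_biological = variance_of_mean(var_bio, var_tech, n_bio=4, n_tech=2)
```

Under these assumptions the scheme with more biological replicates gives roughly half the variance for the same number of arrays, because additional technical replicates reduce only the smaller term.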

Perhaps the most comprehensive work addressing factors that influence the overall standard error of measuring the fluorescence intensity of a spot on a microarray is reported by Kerr et al. [43]. Here, fluorescence intensity is modeled as a function of sample (V), array (A), dye (D), and gene (G) effects together with interactions between a gene and an array, and a gene and sample. This model is reproduced here as Eq. (1) (as it appears in the reference), where $\mu$ is the overall mean, $y_{ijkg}$ is the measured intensity for the $g$th gene, corresponding to the $k$th variety (which is equivalent to treatment), labeled with the $j$th dye and hybridized to the $i$th array, and $\varepsilon_{ijkg}$ is the random error component.

$$y_{ijkg} = \mu + A_i + D_j + V_k + G_g + (VG)_{kg} + \varepsilon_{ijkg} \tag{1}$$

Practically, this model calls for replication of arrays (ideally with independent biological samples), together with all the main effects in order to capture the relative fluorescence intensity that represents the unbiased differential gene expression. In essence, variability due to the interaction term (VG) measures the quantity of interest, while variability due to the main effects must be controlled. This model was later updated to include (array×gene) and (dye×gene) interactions [44], as shown in Eq. (2) where AG and DG are the two additional terms.

$$y_{ijkg} = \mu + A_i + D_j + V_k + G_g + (VG)_{kg} + (AG)_{ig} + (DG)_{jg} + \varepsilon_{ijkg} \tag{2}$$

By including AG and DG interaction terms, this updated model ensures that technical replication accounts for spot-to-spot and gene-specific dye effects. It is perhaps due to this model that replication in most microarray experiments has focused on multiple spotting and the so-called “dye-swap” experiments. The purpose of multiple spotting is to estimate spatial variability in the measurements that may result from a variety of factors, such as variations in the amount of probe deposited on specific sites. As already noted, multiple spotting is mainly done via side-by-side deposition of probes on the microarray for operational simplicity. The extent to which this accounts for the desired variability has not been established. Other methods of multiple spotting have been reported. In reference [7] for instance, microarray glass slides were divided into two and probe material deposited as side-by-side duplicates on each of the two halves, whereas in reference [45], probes were spotted multiple times in various locations on the array.


One aspect of Eq. (2) that has received widespread attention in the microarray literature on experimental design is the gene-specific dye bias modeled by the (DG) term, which is mostly captured by the dye-swap experiments discussed in Section 4.1.2. Kerr et al. [38,43] argued that, in some instances, certain transcripts incorporate one dye better than another and that this effect will be confounded with the differential expression arising from true biological factors. Statistical analyses have been carried out to confirm the importance of this argument [46], but there is some skepticism regarding the overall contribution of this factor to the standard error [47].
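The confounding can be made explicit by writing out the within-array difference implied by Eq. (2). The following is a sketch only, assuming that $y$ is a log intensity and that, on array $i$, variety $k$ carries dye 1 and variety $k'$ carries dye 2; the primed indices are notation introduced here for illustration.

```latex
\begin{align*}
y_{i1kg} - y_{i2k'g}
  &= (D_1 - D_2) + (V_k - V_{k'})
   + \bigl[(VG)_{kg} - (VG)_{k'g}\bigr]
   + \bigl[(DG)_{1g} - (DG)_{2g}\bigr]
   + (\varepsilon_{i1kg} - \varepsilon_{i2k'g})
\end{align*}
```

The terms $\mu$, $A_i$, $G_g$ and $(AG)_{ig}$ are common to both channels of a spot and cancel, and the first two differences do not depend on the gene. The gene-specific quantity of interest, $(VG)_{kg} - (VG)_{k'g}$, appears together with $(DG)_{1g} - (DG)_{2g}$ and cannot be separated from it using this array alone. On a second array with the dye assignments reversed, the (DG) difference enters with the opposite sign, so averaging the two log-ratios removes it under these assumptions.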

General experimental design strategies encourage replication in order to estimate random variability in the measurement process and other strategies, such as dye-swaps, capture systematic variability as well. Systematic variations in microarray measurements can be hard to control in view of the layers at which these experiments must be designed. In principle, it is anticipated that the standard method of evaluating two-color microarrays using ratios will correct for systematic uncertainty related to the technical level of the experiment. In practice, systematic artifacts persist and sometimes threaten to obscure the true biological variability which an experiment is designed to investigate. One of these systematic artifacts arises from the fact that the ratio calculated for a given spot is a function of a variety of technical variables that are unrelated to gene expression, such as the PMT response. A number of methods have been developed to mathematically correct for these effects and are collectively referred to as normalization. These are considered in a later section.

4.2. Labeling

The rapid, simultaneous, and highly sensitive quantitative detection of transcripts from whole genomes remains the main objective of DNA microarray technology. Fluorescent labeling of cDNA target molecules allows rapid detection while providing the high sensitivity desired without the inherent problems associated with radioisotopic labeling [48]. Increased detection sensitivity has allowed the technology to be applied to investigations where the quantitative amounts of starting material would otherwise be considered undetectable. For instance, most mRNA extracts yield less than 1 µg/g of tissue and even after amplification, one usually has only between 10 and 20 µg of cDNA, which is quite difficult to quantitatively detect for ordinary hybridization experiments. Although fluorescence is inherently sensitive as an analytical technique, signals emitted by the dye-labeled target often require enhancement through the incorporation of multiple fluorophores. The fluorescence intensity observed from the hybridized target will depend on a number of factors, including labeling density, fluorophore charge and linker length [49]. Trends in DNA microarray technology show continued advancements in the methodology for labeling cDNA [50,51], thus improving the detection of probe–target interactions.

An understanding of the influence of labeling methods is important in the context of comparison of results from different experiments. The variability introduced by the labeling technique could be systematic and has been shown to influence expression patterns obtained in microarray experiments [51]. Labeling methods will influence the data in terms of sensitivity, reproducibility and dynamic range of the signal. For instance, the efficiency of incorporation of fluor-tagged bases into a target cDNA is argued to be less than that of incorporating functional nucleotides. Thus, as noted earlier, the variability in the measured intensity may reflect this dye incorporation efficiency and not differential expression, although this is still a subject of some debate.

Early methods of fluorescent labeling of cDNA involved either attachment of single fluorophores to the 5′ ends of DNA targets [51] or enzymatic incorporation of approximately 4% fluor-tagged bases [48,52] into DNA targets. In recent years, the demand for higher sensitivity in high throughput analyses has required increased incorporation efficiency of fluorophores. Although such an increase comes with enhanced detection sensitivity [53], reports have now appeared in the literature demonstrating elevated fluorescence quenching and dwindling probe–target duplex stability resulting from bulky dyes [49]. Nonetheless, continued improvements in methodologies for preparation of fluorescent targets have given rise to a number of approaches for dye labeling, most of which address these issues. Several of these methods involve chemically coupling fluorophores to nucleotide substrates, the most common methods being the so-called direct and indirect labeling schemes. Whereas direct labeling methods incorporate nucleotides with covalently attached fluorescent tags into the targets, indirect methods affix the tags to incorporated modified bases via chemical coupling. Comparisons of these methods have been carried out in recent years, with mixed results concerning reproducibility, sensitivity and accuracy [50,51,54,55].

The conceptual simplicity of the direct labeling approach is perhaps its main advantage, in addition to the strength of signals obtained when nucleic acids are labeled by this approach. Dye-labeled nucleotides are synthesized by simple nucleophilic reactions between a succinimidyl ester on a fluorophore and a primary alkyl-amine modified nucleotide, usually deoxycytidine triphosphate (dCTP) [49,51]. Such a reaction scheme is illustrated in Fig. 5 using the most common dyes employed in two-channel microarray platforms: the cyanine dyes, so-called Cy3 and Cy5.

These commercially available dye-modified bases are incorporated into the nucleotide sequences of cDNA targets during the reverse transcription of mRNA to cDNA. Direct labeling approaches carry the risk of unequal incorporation of dye-labeled nucleotides (Cy3 and Cy5) into the sequences of cDNA targets, perhaps due to the slight differences in the size of the two fluorophores, hence introducing a dye-bias. This bias can give artefactual results that necessitate dye-swap experiments (see Section 4.1.2).

There are two main indirect labeling approaches. In the first, nucleotides, usually deoxyuridine triphosphate (dUTP), are modified with a functional group such as aminoallyl [51,54], and these precursors are incorporated into a target cDNA sequence. This sequence is subsequently reacted with fluorophores to form covalent bonds between the modified bases and the fluorophores. The main benefit of this approach is increased efficiency of incorporation of the aminoallyl-modified nucleotides into a cDNA sequence owing to their relatively small sizes compared to dye-labeled nucleotides. This also eliminates signal bias resulting from differential incorporation efficiency of Cy3- and Cy5-labeled nucleotides. Another common indirect labeling approach is the so-called dendrimer indirect 3DNA labeling approach, originally described by Wang et al. [56] and later Stears et al. [57] in collaboration with Genisphere® (Hatfield, PA), who currently market the technique. This approach is described in detail in references [55,57–59].

Although indirect labeling approaches are not as widely used, it is argued that they provide up to 300 times brighter signals than direct labeling methods and require much less mRNA for labeling [58]. Dendrimer labeling has an additional advantage in that it yields targets with relatively high solubility in hybridization buffers, leading to low background fluorescence. Furthermore, since as little as 1–3 µg of total RNA can be labeled using indirect approaches, the need for amplification of starting material is diminished. (Amplification of starting material, if not performed carefully, may introduce experimental artifacts that can confound the results.)

In contrast to the direct labeling approaches employed by two-color spotted arrays, GeneChips have generally employed an indirect labeling method. This uses nucleotides functionalized with a biotin moiety to generate the cRNA/cDNA. Following hybridization to the array, streptavidin with a linked fluorescent dye is added to the array. The streptavidin binds extremely tightly to the biotin on the cRNA/cDNA, and excess dye can be washed away. More recently, indirect labeling using Genisphere's dendrimer technology has been used with GeneChips; however, in contrast to the method described above, biotin molecules are attached to the dendrimer, and then the avidin-dye is added, thereby increasing the number of binding sites for the fluorescent dye.

Fig. 5. A typical nucleophilic reaction between a succinimidyl ester on a fluorophore (Cy3) and a primary alkyl-amine-modified deoxycytidine triphosphate (dCTP).

A further point should be mentioned with regard to labeling, and that is the effect of ozone on signal intensities. Exposure of DNA microarray slides to ozone has been shown to affect signal quality, with a much greater effect on Cy5 (red) compared to Cy3 [60]. These effects have been observed with relatively low amounts of ozone exposure (5–10 ppb) and care should be taken to avoid exposing microarray slides labeled with Cy5 to environmental ozone, especially when wet. The Brown group has released plans for an enclosure to aid in eliminating ozone from a room or in a small enclosed environment [61], while Genisphere has released a product for coating arrays to prevent Cy5 degradation [62]. In 2008, GE Healthcare® reported the development of an ozone-stable dye for DNA microarray applications [63].

4.3. Hybridization

A critical part of any microarray experiment is the hybridization of dye-labeled targets to surface-immobilized probes. The hybridization of complementary DNA strands on glass supports is relatively well-established in molecular biology [9,64,65] and is important to the quality of microarrays given that the specificity and affinity of probe–target interaction largely determine the quality of a microarray [58]. In this regard, factors that influence the efficiency and stability of hybridization will have a direct influence on the quality and amount of information that can be derived from a microarray study. These factors include hybridization time, the length and composition of probes and targets used for hybridization, and the hybridization temperature, as well as the pH, ionic strength and viscosity of the hybridization solution.

The influence of these factors on the stability of the probe–target duplex has been discussed extensively in the literature [58,66,67] and will not be covered here. However, suffice it to say that several protocols have been developed to ensure that the experimental parameters that influence hybridization efficiency are optimized [55,67,68]. The signal obtained from a microarray is measured without reference to these hybridization conditions except where obvious aberrations are apparent, which often leaves only the option of repeating the experiments.

4.4. Detection

The use of DNA microarrays for monitoring transcriptional states of biological samples is generally accomplished by comparing the relative abundance of transcripts from two samples via hybridization to either a single array of DNA probes (spotted two-color arrays) or two different arrays (GeneChip). The simplicity of this concept is deceptive; complexities in the measurement process are often ignored in spite of their importance. In principle, differential gene expression is measured by determining the ratio of fluorescence intensities of the two dye-labeled targets emitting signals proportional to their concentration. In practice, however, acquisition of the data involves prior steps, including the scanning of the microarray with lasers set at different excitation wavelengths for Cy3 (green) and Cy5 (red) labeled targets and identification of spot locations on the microarray (a process referred to as gridding). The purpose of scanning the microarray is to excite the fluorophores tagged to the hybridized probes as well as to collect the emitted fluorescence and generate an image (for each wavelength) in which pixel intensities correspond to the level of localized fluorescence. For spotted arrays, these images are typically stored as pairs of unsigned 16-bit tiff files. To evaluate the fluorescence ratio, the location of the fluorescing spot is carefully determined to accurately relate the pixel intensity to fluorescence of a hybridized transcript. Due to some specific considerations when using GeneChips, the remainder of this section will focus on analysis using spotted microarrays, and specific points relevant to GeneChips will be considered separately.

Higher-level analyses of DNA microarray data generally pay little or no attention to the measurement processes mentioned above and dubious assumptions are often made regarding the variability in the data. Yet, even when all experimental aspects are held constant (the solid support system, the spotting procedure, the probe types, the labeling and so on), the process of acquiring data can still influence the variability of gene expression patterns observed in a microarray experiment. Conventional approaches for capturing uncertainty in microarrays focus on variance associated with the spatial position of the microarray spots, the sample preparation, and the biological sampling. In most cases, these sources of variability are captured by replicating the deposition of DNA probes in varied locations on the microarray, as well as by replication of hybridization procedures and biological samples (see Section 4.1.3). Whereas these approaches may control uncertainties from extrinsic sources, inherent ambiguities that arise during the measurement process persist. These can arise from two main sources, the scanning of the microarray and primary level processing of the images.

It has been reported that uncertainty in microarray results may be associated, in part, with the fluctuations observed when independent scans of the same microarray are conducted [69]. Thus, when this source of variability is ignored, the observed gene expression profiles are likely to be confounded with scanner instabilities. This is especially true for cases where the microarray is scanned by running a single laser pass for each dye. Furthermore, primary level processing of the acquired images introduces a potential for severe aberration in cases where the processing methods are not robust. In particular, this processing partitions the spots into regions called foreground and background and this could lead to severe uncertainties if the shapes of the spots are not well-defined (see Section 4.5). Although the individual sources of uncertainty in microarray measurement processes may be negligible, an understanding of their incremental contribution to the overall variability is essential for microarray technology to reach its full potential. This section describes the microarray data acquisition process, focusing on the scanning procedures and primary level processing in the context of measurement quality.

4.4.1. Scanning the microarray

Generally, the acquisition of fluorescence signals emitted by dye-labeled molecules on the microarray occurs by laser scanning confocal microscopy. In conjunction with photomultiplier tubes (PMT) or charge coupled device (CCD) cameras, microarray scanners detect and record the emitted fluorescence signals, which are stored as 16-bit tiff images for further analysis. Although the PMT is the most ubiquitous detector employed in microarray scanners due to its cost-effectiveness, portability and high sensitivity, CCDs play an important, albeit peripheral, role in microarray technology. This is because CCDs have excessive operational demands despite the desired high sensitivities they exhibit [70]. The general architecture of a microarray scanner consists of a light source(s), optical components (mirrors and lenses), a detector, and a data acquisition system. A basic optical architecture representing the general configuration of confocal laser scanning microscopes employed in most microarray scanners is shown in Fig. 6. Typically, the light sources consist of gas or solid-state lasers. Xenon lamps, which supply white light, are also a viable option, although their size and heat dissipation make their use less common.

In the detection process, laser light is directed through the dichromatic (dichroic) mirror that allows light of a desired frequency to excite the sample after passing through a set of microscope objective lenses. Excited fluorophores emit light of a different wavelength, which is transmitted back through the objective lenses to the detector via the dichroic mirror. The pinhole, conjugated to the focal point of the objective lenses, prevents out-of-focus fluorescence from reaching the PMT, where the true signal is amplified and detected (conjugation of the pinhole to the focal point of the objective lens is the key to confocal microscopy). Finally, the analogue signal from the PMT is digitized (by A/D converters) and recorded to depict a map of pixel intensities, which is stored as a 16-bit tiff image. It should be noted that the development of confocal microscopic measurements was a key component in enabling modern microarray technology, since it allowed fluorescent interferences outside the focal plane to be greatly reduced.

Scanning the microarray is usually executed pixel-wise, at resolutions ranging from 5 µm to 10 µm, in a mechanism that involves x–y Cartesian translation of either the substrate or optical components. Most microarray scanners employ the former due to the advantages associated with a stationary optical path and increased durability of delicate scanner components [70]. Common strategies for exciting samples during the scanning process include simultaneous and sequential scanning mechanisms. The first approach employs two laser light sources in parallel, and yields two images (corresponding to the two dyes) in a single pass. This approach allows faster scanning rates but may exhibit a lower signal-to-noise ratio (S/N) and is prone to increased cross-talk [70]. Sequential scanning is designed to minimize cross-talk since independent scans are run for each dye. Regardless of the mechanism used, the laser power and the PMT gain normally need to be optimized independently for each channel before image acquisition. This process, often referred to as a “pre-scan”, ensures that each channel has adequate sensitivity to represent low-level signals without excessive saturation of high level signals. In principle, either the laser power or PMT voltage could be used to adjust the signal amplitude, but factors such as photobleaching and S/N need to be considered.
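The trade-off addressed by a pre-scan can be sketched as a simple check on the pixel values of one channel. An unsigned 16-bit image saturates at 65535; the thresholds and function names below are hypothetical choices for illustration and do not correspond to any particular scanner software.

```python
MAX_16BIT = 2**16 - 1  # largest value representable in an unsigned 16-bit image


def saturated_fraction(pixels, max_value=MAX_16BIT):
    """Return the fraction of pixels at or above the saturation value."""
    pixels = list(pixels)
    if not pixels:
        raise ValueError("no pixels supplied")
    return sum(1 for p in pixels if p >= max_value) / len(pixels)


def gain_advice(pixels, max_saturated=0.001, min_peak=0.5):
    """Suggest how to adjust the gain for one channel after a pre-scan.

    max_saturated and min_peak are illustrative thresholds only.
    """
    pixels = list(pixels)
    if saturated_fraction(pixels) > max_saturated:
        return "decrease"  # too many high-level signals are clipped
    if max(pixels) < min_peak * MAX_16BIT:
        return "increase"  # dynamic range is under-used
    return "keep"
```

For example, `gain_advice([100, 2000, 30000])` returns `"increase"` because the brightest pixel uses less than half of the available range, whereas a channel with 5% of its pixels at 65535 returns `"decrease"`.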

The most common errors affecting microarray scanner signals can be categorized according to their origin: instrumental components, the substrate, or various contaminants. For instance, the quantized arrival of photons at the detector is governed by Poisson statistics, which leads to a measurement standard deviation equal to the square-root of the signal. Thus, this type of uncertainty (shot noise) is tied to the instrument detector and, although impossible to eliminate, it can be estimated by appropriate error models if it is suspected to be the dominant source of uncertainty in the measurement. Other examples of instrument noise include laser and PMT noise. Laser noise (source flicker noise or drift noise) arises due to intensity fluctuations over time and is typically characterized by a multiplicative effect on the signal. PMT noise (detector noise) may result from fluctuations in the amplification of the signal or the presence of dark current. Conceivably, one of the most important considerations in PMT noise is the effect introduced by increasing the voltage gain. Whereas this is intended to yield stronger signals, escalating PMT voltages enhances the background noise as well. For a given instrumental setup, it is difficult to predict which of these sources will dominate the instrumental noise.
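How these noise sources scale with signal can be illustrated with a simple error model. This is a sketch under stated assumptions: the signal is expressed in detected photon counts (not digitized intensity units), the sources are independent so that their variances add, and the flicker and dark-noise parameters are hypothetical.

```python
import math


def shot_noise_sd(counts):
    """Poisson (shot) noise: standard deviation equals sqrt(signal)."""
    return math.sqrt(counts)


def total_sd(counts, flicker_rsd=0.0, dark_sd=0.0):
    """Combine shot, multiplicative (flicker) and additive (dark) noise.

    flicker_rsd: relative standard deviation of the source intensity.
    dark_sd: signal-independent detector noise, in counts.
    """
    shot_var = counts                         # variance proportional to signal
    flicker_var = (flicker_rsd * counts) ** 2  # variance proportional to signal squared
    return math.sqrt(shot_var + flicker_var + dark_sd ** 2)


# Shot-noise-limited S/N grows only as the square root of the signal.
snr_shot_limited = 10000 / shot_noise_sd(10000)

# With 2% flicker noise (hypothetical), the multiplicative term dominates here.
sd_with_flicker = total_sd(10000, flicker_rsd=0.02)
```

Under this model the shot and flicker variances are equal when the signal is 1/flicker_rsd² counts (2500 for the value used here), which is one way to judge which source is likely to dominate at a given intensity.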

Noise arising from the substrate is predominantly due to the non-uniformity of the surface. One of the fundamental properties of laser scanning confocal microscopy is that the focal points are pre-set to depths of ∼2.5 µm in order to restrict the collected signals to those that originate only from the desired sample. In extreme cases where the surface is spatially heterogeneous, it is likely that undesired signals will be propagated to the pinhole. Other sources of sample uncertainty include dust smudges on the slide and back-reflection of the laser light.

In theory, the only reagents on the slide at the time of scanning should be cDNA and the dyes used to label the DNA. In practice, although great care is taken to remove as much of the excess reagents as possible, there remain traces of the various chemicals on the slide. These chemicals often have spectral profiles that overlap with those of the dyes, contributing to noise in the background intensities. This type of noise can very rarely be removed, as discussed in Section 4.5.2.

Fig. 6. General optical architecture of a confocal microscope.

4.5. Image processing: spotted microarrays

The key aspects of microarray image processing are identification of the spotted probe locations after scanning and the quantitation of signal intensities/ratios at the probe sites. We refer to this as primary level processing since it is the first step in microarray image analysis. Most software for processing microarray images includes routines for gridding, segmenting the spot pixels into foreground and background (for spotted arrays), and measurement of signals corresponding to these regions. In this section, a brief mention will be made of these aspects of image analysis in order to define and outline the goals of each approach, and their possible limitations. Yang and coworkers [71] have reviewed many software packages that are designed to perform the segmentation and background adjustment aspects of these processes.

One of the basic purposes of imaging microarray slides is to permit a global visualization and interpretation of the relative concentrations of hybridized transcripts when corresponding Cy3 and Cy5 images are overlaid. This rudimentary analysis of the data is designed to provide the investigator with a general overview of the hybridization success of the experiment based on an interpretation of the color codes. When the images are overlaid, an assessment can be made regarding the concentration of labeled transcripts from one sample relative to the other by examining the predominance of either Cy5 (red) or Cy3 (green) on the spot. Conventionally, a red spot is interpreted as resulting from preferential hybridization of the Cy5-labeled targets to the probe relative to the Cy3-labeled targets, and vice-versa. Preferential hybridization to a spot is assumed to be influenced only by higher concentrations of the particular target. If the concentrations are in equal proportion then the spot is expected to be yellow and, if no target hybridized, the spot is expected to be black. An example of this type of image is shown in Fig. 7. In addition to providing a general overview of the hybridization, such images are also very useful in evaluating the intensities of external controls. These allow calibration of scanner settings during the scanning of the array. If external controls are spotted on the array in incremental concentrations, it is possible to calibrate a scanner's dynamic range by adjusting its PMT gain and laser power until a desired brightness is obtained from pre-scans.

It is important to recognize that microarray images, although coded in terms of red and green contributions, are rendered through software that is not intended to reflect the fluorescent spectra actually obtained. The images are false color representations of the intensities measured on the two fluorescence excitation channels and the viewer's perception will be the convolution of several transformations, including the color mapping of the software, the representation of colors by the output device, and the processing of visual information by the eye and the brain. It would seem natural for software to represent the intensities of the two channels directly as the red and green components of the red–green–blue (RGB) triple used to encode colors on most video displays, but this turns out not to be visually satisfying and is limited by the fact that pixel intensities are restricted to 8-bit (0–255) while the fluorescence intensities are encoded as 16-bit values (0–65,535). Consequently, most commercial software applications use a technique known as color mapping in which combinations of the two dye intensities define a particular RGB triple of pixel intensities. Although this makes the representation of the image more subjective, since it relies on the design of the mapping, it also allows for the use of more subtle hues and shades that can be more appealing to the viewer.

In addition to color mapping, commercial software also tends to apply transformations to the measured intensities to make the images more visually informative, if less quantitative. The wide range of intensities on a microarray often results in an image that is dominated by a few spots of high magnitude, while the remaining spots are too faint to be seen. Not only is this relatively uninformative, but it also leads to difficulties when trying to grid the spots (see below) since the spot is indistinguishable from the background. A common solution to this problem is to apply a square-root transform to the data to be displayed, which has the result of suppressing large signals and amplifying small ones, as well as reducing the range from 16 bits to 8 bits. This gives a more complete picture, although it may present a distorted view of relative intensities. Because of the combination of data transformation and color mapping, care should be taken not to over-interpret visual images provided by microarrays. It should be noted that at least one instrument has been produced that allows scanning of fluorescence spectra at each pixel (hyperspectral imaging) [73], but such an instrument is currently impractical for routine use, so we must continue to rely on the quantitative information available from two channels.
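The effect of the square-root display transform can be illustrated with a short sketch. It assumes integer 16-bit intensities and uses an integer square root so that the result always falls in the 8-bit range; the linear rescaling is included only for comparison, and both function names are hypothetical.

```python
import math


def sqrt_to_8bit(intensity_16bit):
    """Square-root transform of a 16-bit intensity onto the 0-255 range."""
    if not 0 <= intensity_16bit <= 65535:
        raise ValueError("expected a 16-bit intensity (0-65535)")
    # isqrt(65535) is 255, so the result always fits in 8 bits.
    return math.isqrt(intensity_16bit)


def linear_to_8bit(intensity_16bit):
    """Plain linear rescaling (drop the low byte), shown for comparison only."""
    return intensity_16bit >> 8


# Hypothetical faint, moderate and saturated pixel intensities.
for value in (100, 2_500, 65_535):
    print(value, linear_to_8bit(value), sqrt_to_8bit(value))
```

With linear rescaling the faint pixel maps to zero and disappears from the display, whereas the square-root transform keeps it visible at the cost of compressing the differences among bright pixels.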

Fig. 7. Image of a sub-array from a typical microarray. This image is part of the CAMDA [72] data set depicting red, green and yellow spots as well as “black holes”. The interpretation of the color codes is as described in the text.

In order to quantify the results in the microarray image, specific pixel intensities of the fluorescing spot must be evaluated to provide a measure of the relative concentration of dye-labeled targets that have hybridized. To achieve this, the Cartesian coordinates of the spot on the image must be identified and separated from spurious signals outside of the probe site. This is typically referred to as gridding or addressing. This process could, in principle, be automated since the basic structure of the microarray is determined by the arrayer. The number and arrangement of pins on the arrayer print-head provide the fundamental structure of rows and columns of grids (also referred to as sub-arrays or sub-grids). In addition, the number of rows and columns of spots printed in each sub-grid is pre-set. As a result, various gridding software applications use the known x–y displacements of spots per grid and the x–y separation for the pins, together with the initial location of the first spot, to automatically determine the address of each spot on the microarray. Unfortunately, this simplified approach to automatic gridding usually requires manual intervention to optimize the separation between grid rows and columns and to accommodate slight variations in individual spots resulting from shifts in print-tip positions and, sometimes, shifts in rows or columns of spots in a sub-array, including rotation of grid axes relative to the image. An example of a grid is shown in Fig. 8. One major disadvantage of manual gridding is the time required and the associated monotony, which has the potential for introducing user bias and inaccuracy.
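The nominal addressing step described above can be sketched as follows. This is a minimal illustration, assuming a perfectly regular layout with no rotation or print-tip shifts; the function name, argument names and the example pitches are hypothetical and are not taken from any particular gridding package.

```python
def nominal_spot_addresses(origin, grid_shape, grid_pitch, spot_shape, spot_pitch):
    """Return {(grid_row, grid_col, spot_row, spot_col): (x, y)}.

    origin     -- (x, y) of the first spot in the first sub-grid
    grid_shape -- (rows, cols) of sub-grids, set by the pin arrangement
    grid_pitch -- (dx, dy) separation between pins (sub-grid origins)
    spot_shape -- (rows, cols) of spots printed in each sub-grid
    spot_pitch -- (dx, dy) displacement between neighbouring spots
    """
    x0, y0 = origin
    addresses = {}
    for grid_row in range(grid_shape[0]):
        for grid_col in range(grid_shape[1]):
            for spot_row in range(spot_shape[0]):
                for spot_col in range(spot_shape[1]):
                    x = x0 + grid_col * grid_pitch[0] + spot_col * spot_pitch[0]
                    y = y0 + grid_row * grid_pitch[1] + spot_row * spot_pitch[1]
                    addresses[(grid_row, grid_col, spot_row, spot_col)] = (x, y)
    return addresses


# Hypothetical layout: 2 x 2 pins, 3 x 3 spots per sub-grid, units of µm.
grid = nominal_spot_addresses((100, 200), (2, 2), (4500, 4500), (3, 3), (200, 200))
print(len(grid), grid[(1, 1, 2, 2)])
```

These nominal addresses are only a starting point; the manual or automated adjustments discussed in the text are what correct for the deviations of real spots from this ideal lattice.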

The importance of accurate gridding of spots stems from the reliance of most of the higher-level analysis methods for microarrays on reliable measurements of pixel intensities comprising the spot. In this regard, addressed spots are classified into foreground and background regions through a process referred to as segmentation. Pixels within the foreground region are believed to represent the true signal corresponding to fluorescing dye-labeled target that hybridized to the spot. On the other hand, pixels in the background region correspond to spurious signals from the substrate that are unrelated to the hybridized targets. Thus, for each spot, segmentation will lead to identification of a region around the spot, referred to as a spot mask, which is comprised of pixels from either the foreground or the background. It is therefore argued that after addressing, segmentation is the most important step in microarray image processing [74].

The most common segmentation methods can be classified depending on whether they place spot-shape restrictions on the estimation of the spot masks. Fixed circle and adaptive circle segmentation methods assume that the spots are circular, while the histogram segmentation and adaptive shape segmentation methods [74,75] place no restrictions on the shapes of the spots. The central difference between fixed circle and adaptive circle segmentation methods is that the former fits a circle with a fixed radius for all spots in the image, while the latter allows for estimation of different radii for each spot. In principle, if all the spots are of similar size, then the fixed circle segmentation method provides estimates of background and foreground regions similar to the adaptive circle approach. Unfortunately, spot sizes within a microarray vary due to unequal deposition of material on the spots by pins and thus the procedure is prone to inadequate segmentation of spots. On the other hand, it has been argued that the adaptive circle method can be overly time-consuming for an array with thousands of spots since it requires the user to adjust spot sizes. Furthermore, when the signal strength is low, it is hard to distinguish a transition between the foreground and background.

Fig. 8. Section of typical grids on a microarray image. The circles describe an area segmented to consist of the spot.

Several automated software applications have been developed to address the drawbacks of the adaptive circle segmentation approach. Chen et al. partition the pixels into background and foreground by setting up a nonparametric test statistic that enables one to distinguish foreground pixels from background in the proximity of the probe site [76]. In particular, the Mann–Whitney test statistic is used to test a hypothesis that the intensity of a set of pixels chosen from outside the probe site is equal to that of a similar set of pixels chosen from the probe site. When the null hypothesis is rejected, the set of pixels causing the hypothesis to be rejected is assumed to correspond to the signal from a hybridized target. Thus, all the pixels in the spot target mask that have intensities higher than the set that led to the rejection of the null hypothesis are classified as foreground. Unfortunately, the computational demands of this approach may have limited its utility since microarray data contain several thousand probe sites. To alleviate the time constraints associated with user intervention in adaptive segmentation approaches, Buhler et al. developed Dapple [77], a spot finding approach that places candidate spots (identified using the provided x–y displacements of spots per grid) into vignettes. Within the vignettes, spots are identified by looking for characteristic sharp edges (given that spot morphology profiles exhibit rising intensities at the edges from where they meet the background) typified by high negative second derivatives of pixel intensities with respect to the x and y displacements. Thus, the brightest ring in the vignette is identified and used in segmenting the spot. Another approach, Matarray [78], amalgamates the signal intensity and spatial information to determine spot locations and appropriate segmentation. Similar to Dapple, spots are identified from the initial estimates of spot location provided by the user. For each spot, patches (similar to the vignettes in Dapple) are defined and a circle is drawn around a tentative spot centre to provide foreground pixels, while pixel intensities outside the circle enclosed in the patch are segmented as background. Grid locations are adjusted after identifying pixels within the circle that have intensities greater than the sum of the mean background intensity and twice its standard deviation. The locations of such pixels are determined, their centre re-calculated, and new patches defined. The process is repeated until some convergence criterion is satisfied.
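The recentring idea described for Matarray can be sketched in a few lines. This is an illustrative reading of the description in the text and not the Matarray implementation: the patch representation, the function name, and the use of the population standard deviation are all assumptions.

```python
from statistics import mean, pstdev


def recentre_spot(patch, centre, radius):
    """One recentring pass over a patch given as {(x, y): intensity}."""
    cx, cy = centre
    inside = {xy: v for xy, v in patch.items()
              if (xy[0] - cx) ** 2 + (xy[1] - cy) ** 2 <= radius ** 2}
    background = [v for xy, v in patch.items() if xy not in inside]
    # Threshold: mean background plus twice its standard deviation.
    # Using the population standard deviation is an assumption of this sketch.
    threshold = mean(background) + 2 * pstdev(background)
    bright = [xy for xy, v in inside.items() if v > threshold]
    if not bright:
        return centre  # nothing above threshold; keep the tentative centre
    return (mean([x for x, _ in bright]), mean([y for _, y in bright]))


# Hypothetical 5 x 5 patch: a textured background with a bright 2 x 2 spot
# that is offset from the tentative centre at (2, 2).
patch = {(x, y): (10 if (x + y) % 2 == 0 else 14)
         for x in range(5) for y in range(5)}
for xy in [(2, 2), (3, 2), (2, 3), (3, 3)]:
    patch[xy] = 100

centre = recentre_spot(patch, (2, 2), 1.5)
print(centre, recentre_spot(patch, centre, 1.5))
```

Repeating the call until the centre stops moving gives a simple convergence criterion of the kind mentioned in the text.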

Although in principle most spot shapes are expected to be circular, in practice spots printed in-house rarely exhibit the perfect shapes anticipated, and instead descriptors such as comet tails, craters and donuts have been associated with aberrant microarray spot morphologies. Fig. 9 shows a sampling of the variety of spot shapes that can be obtained from a typical microarray. Accordingly, restricting spots to particular shapes could provide poor estimates of fluorescence intensities for hybridized targets when the spotted probes exhibit morphologies different from the prescribed ones. Advanced approaches, referred to as adaptive segmentation methods, such as watershed and seeded region growing, continue to be used for microarrays, albeit with mixed success [79]. The most widely used method for segmenting spots, without restricting them to particular shapes, is the histogram method [75]. This method defines a target spot mask whose size is thought to be bigger than any spot and evaluates a histogram of the pixels within this mask. Subsequently, from the histogram, background intensities are calculated as the mean of the pixels between the 5th and 20th percentile while foreground is the mean intensity of pixels between the 80th and 95th percentile.
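A minimal sketch of the histogram method follows, assuming the mask pixels are supplied as a flat list of intensities. The way the percentile bands are converted to index ranges is one simple convention among several, and the function names are hypothetical.

```python
from statistics import mean


def percentile_band_mean(sorted_pixels, lower_pct, upper_pct):
    """Mean of the sorted pixels lying between two integer percentiles."""
    n = len(sorted_pixels)
    band = sorted_pixels[n * lower_pct // 100: n * upper_pct // 100]
    if not band:
        raise ValueError("mask has too few pixels for this percentile band")
    return mean(band)


def histogram_segment(mask_pixels):
    """Return (foreground, background) estimates for one target mask."""
    ordered = sorted(mask_pixels)
    background = percentile_band_mean(ordered, 5, 20)
    foreground = percentile_band_mean(ordered, 80, 95)
    return foreground, background


# Hypothetical mask of 100 pixels with intensities 0..99.
print(histogram_segment(list(range(100))))
```

Excluding the extreme 5% at each end gives some protection against dust and dead pixels, but the estimates still depend on the mask being large enough to contain a reasonable share of true background pixels.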

The segmentation methods discussed in this section are implemented in most software applications to perform primary level processing of microarray images. Table 1 reports the methods employed by some of these applications. Several recent developments [82–84] introduce more complex methods of spot segmentation in order to improve the determination of fluorescence ratio through minimized misidentification of spot masks. It should be noted that even with automated methods for spot addressing and segmentation, the resultant grids are often checked manually, and if necessary adjusted to better segment the spots.

4.5.1. Ratio calculation

Statistical analysis of microarrays is based on the evaluation of relative fluorescence intensities of two differentially labeled targets that are hybridized to a probe. Ratiometric methods [76,85] of analysis are preferred because absolute fluorescence intensities do not correspond directly to the absolute concentration of the mRNA obtained from each of the samples. Instead, the observed fluorescence is a function of the efficiency of dye incorporation and DNA hybridization, the length and amount of probe attached to the surface, the relative content of dye-modified bases in a transcript, and the scanning parameters (laser intensity, PMT gain, etc.). For single-channel microarrays (GeneChip), the test and reference scans are made on separate arrays (the analysis of GeneChips will be discussed in a later section). For spotted microarrays, this is not possible because direct comparison of intensities from separate arrays would be greatly affected by variations in spot morphology. The use of two-channel microarrays overcomes this limitation since the spot morphology for the test and the reference channels will be the same for a given spot. For these arrays, the morphology will influence the calculation of the ratio and several methods have been developed to estimate summary statistics for the pixels in the spot masks. These methods include ratio of medians, ratio of means, median of ratios, mean of ratios, and regression ratios, which are discussed in more detail below.

Fig. 9. A three-dimensional view of typical spot morphologies alongside the spot images drawn from the CAMDA data set [72]. Each spot is indexed with an identifying spot number, grid (block) number and column and row numbers as well as a corresponding gene ID. The morphologies of the spots are identified as follows: (a) comet tail, (b) normal high intensity, (c) crescent, (d) donut, (e) pointed and (f) normal low intensity.

One of the most widely used methods for ratio calculation is the ratio of medians. This is a method whereby differential expression is measured as a ratio of the median of pixel intensities within a spot mask for both channels. The median is intended to represent the centre for the distribution of pixel intensities in the spot mask. Perhaps one of the major advantages of this approach is that the measured ratios are robust to influence from a few pixels with extreme values at either end of the distribution. Unfortunately, when spots are characterized by substantial regions (>50%) of low-intensity pixels, as in the case of “donuts” as shown in Fig. 9d, it is anticipated that the low-intensity pixels will dominate the spot mask and result in ratios with a high uncertainty.

Another common measure of differential expression involves evaluating the ratio of the mean of pixel intensities within the spot mask. Calculation of mean values is straightforward and less affected by extended regions of low-intensity fluorescence, but they are more susceptible to the influence of extreme values, i.e. outliers in the pixel population. For this reason, the ratio of means is generally less robust.

A less frequently used approach to measuring the relative fluorescence is to calculate pixel-by-pixel ratios of intensities across the spot and then report the differential expression as the arithmetic mean or median of the ratios. This is referred to as the “mean of ratios” or “median of ratios”, respectively. A major drawback of this approach, especially when using means, is the high sensitivity of the summary statistic to pixels near the background level, since the ratio measurements can become very erratic in these cases. A potential advantage, however, is that the individual ratios across the spot provide a population from which dispersion measures can be used to estimate the uncertainty in the ratio, as has been described in the literature [86]. Such uncertainty estimates are often unreliable, however, because of inhomogeneity in the variances.

Table 1
Image segmentation methods used in some commercially and publicly available software packages.

Software                        Segmentation method
QuantArray (GSI Luminomics)     Histogram, fixed circle
ScanAlyze [80]                  Fixed circle
GenePix Pro (Axon)              Adaptive circle
UCSF Spot [79]                  Histogram
TIGR Spotfinder [81]            Histogram
Dapple [77]                     Adaptive circle
Matarray [78]                   Adaptive circle

Another infrequently used approach to ratio measurements is the regression ratio method. This method determines the ratio directly from the slope of a plot of Cy5 vs. Cy3 (or vice-versa) pixel intensities across a spot [87]. Although the regression can be influenced by outliers in a manner similar to the ratio of means, one of its potential advantages is that, for a range of pixel intensities across the spot, it should allow some compensation for the contribution of background fluorescence through the use of an intercept. In this way, segmentation of the image into foreground and background pixels is not critical. For the regression ratio to be calculated properly, however, orthogonal regression, rather than conventional regression, should be employed.
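The five summary statistics discussed in this section can be compared side by side with a short sketch. It assumes strictly positive pixel intensities for both channels of a single spot mask, and it uses the closed-form slope for orthogonal regression with equal error variances in the two channels, which is an assumption; the function name and the example data are hypothetical.

```python
import math
from statistics import mean, median


def ratio_summaries(cy5, cy3):
    """Cy5/Cy3 ratio estimates from paired pixel intensities of one spot."""
    if len(cy5) != len(cy3) or len(cy5) < 2:
        raise ValueError("need two equal-length channels with >= 2 pixels")
    pixel_ratios = [r / g for r, g in zip(cy5, cy3)]
    mean5, mean3 = mean(cy5), mean(cy3)
    sxx = sum((g - mean3) ** 2 for g in cy3)
    syy = sum((r - mean5) ** 2 for r in cy5)
    sxy = sum((g - mean3) * (r - mean5) for r, g in zip(cy5, cy3))
    if sxy == 0:
        raise ValueError("channels are uncorrelated; slope is undefined")
    # Orthogonal regression slope, assuming equal error variance per channel.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return {
        "ratio_of_medians": median(cy5) / median(cy3),
        "ratio_of_means": mean5 / mean3,
        "median_of_ratios": median(pixel_ratios),
        "mean_of_ratios": mean(pixel_ratios),
        "regression_ratio": slope,
        "regression_intercept": mean5 - slope * mean3,
    }


# Hypothetical spot: true ratio 2 plus a constant Cy5 background of 50.
cy3 = [100, 200, 300, 400]
cy5 = [2 * g + 50 for g in cy3]
for name, value in ratio_summaries(cy5, cy3).items():
    print(name, round(value, 4))
```

In this contrived, noise-free example the additive background inflates every estimate except the regression slope, which illustrates the compensation through the intercept mentioned in the text; with real, noisy pixels the regression is also subject to the outlier sensitivity noted above.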

4.5.2. Background measurements

Typical microarray expression ratios are adjusted to eliminate the influence of the background signal, which can be the result of non-specific hybridization and auto-fluorescence from the glass slides. The ratio is adjusted by subtracting a measure of the estimated background signal from the foreground signal for the spot. Most microarray software applications estimate the background signal by measuring the intensity of pixels in the proximity of a spot mask, i.e. from the pixels segmented by the image analysis software as being “background”. However, there are some differences among the methods of background estimation used. GenePix Pro 3.0® (Axon) estimates the background from a circular region three times the diameter of the foreground spot away from the spot centre, excluding regions that define spots and maintaining a two pixel buffer from the spot masks. The appropriate measure (median, mean) of the pixels drawn from this region is computed and subtracted from the corresponding measure of the pixel intensities computed from the spot mask. QuantArray® (Packard BioScience) evaluates the background by calculating the mean or median of pixels (in a histogram of all pixels) that lie below a given percentile. Matarray [78] draws background pixels from patches that define a neighborhood region for each spot, and evaluates their mean, which is reported as the background intensity for any chosen spot.

The presence of background fluorescence in the calculation of intensity ratios is undesirable because it introduces bias into the result, especially for low-intensity spots that are near the background level. However, errors in the estimation of background intensities can be just as damaging to the quality of the measurements. For example, one of the immediate potential effects of subtracting background intensity from the foreground is negative intensities, which are not meaningful from a physical perspective and indicate aberrant measurements. Spots exhibiting such characteristics are normally excluded from further analysis. Whether the common methods of background calculation truly meet their objectives has been a subject of debate [88]. Arguments have emerged that naïve background subtraction is sometimes more detrimental to the final analysis of microarray data than not correcting for background [73,89]. In fact, fundamental questions have been raised in the literature recently regarding the legitimacy of background estimated this way. A general assumption in calculating background signals in microarrays, as outlined above, is that the background signal around a spot reflects the background signal at the spot location. This is not necessarily the case, however, since the surface chemistry at the spot is fundamentally different from that away from the spot. Moreover, differences in the spot-localized background are suggested by the presence of “black holes” on microarrays, where the spot appears darker than the surrounding background. These differences were confirmed by Timlin et al. [73] who used hyperspectral imaging of spot fluorescence combined with multivariate curve resolution to separate the fluorescence spectra of the dyes from glass and contaminating fluorescence. It was shown that background fluorescence in microarrays is variable, spot-localized, and channel-dependent. The likely ramification of this is a further deterioration in the quality of the data when standard background correction methods are used, especially for the low-intensity signals. Unfortunately, true background at the spot cannot be calculated using current dual wavelength scanners, as they are incapable of distinguishing fluorescence due to a contaminant from the true fluorescence due to the dye-labeled target. There have been suggestions that spot-localized background fluorescence could be estimated from negative controls or blank spots (in the absence of the former) [75,90], but this is also problematic, especially since spatial variations in the background are commonly observed across a microarray. It may also be possible to mitigate the effects of the spot-localized background by using the regression ratio method, which includes an intercept term, but this has not been demonstrated. In addition to leading to erroneous ratio estimates, background issues in microarrays can confound normalization methods that assume a linear relationship between the background-corrected intensities of the two channels, as will be discussed in Section 4.8.
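The conventional local background subtraction and the exclusion of non-physical results can be sketched as follows. The function name and example intensities are hypothetical, and treating a corrected intensity of exactly zero as unusable is an assumption of this sketch.

```python
def background_corrected_ratio(fg_cy5, bg_cy5, fg_cy3, bg_cy3):
    """Cy5/Cy3 ratio after local background subtraction.

    Returns None when either corrected intensity is not positive, mirroring
    the common practice of excluding such spots from further analysis.
    """
    cy5 = fg_cy5 - bg_cy5
    cy3 = fg_cy3 - bg_cy3
    if cy5 <= 0 or cy3 <= 0:  # zero is also rejected here (an assumption)
        return None
    return cy5 / cy3


# Hypothetical spot summaries (foreground, background) for each channel.
print(background_corrected_ratio(1200, 200, 700, 200))  # well above background
print(1200 / 700)                                       # uncorrected ratio
print(background_corrected_ratio(150, 200, 700, 200))   # foreground below background
```

The comparison between the corrected and uncorrected values shows the bias that an additive background introduces, while the last case shows how an overestimated background removes a spot from the analysis altogether.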

4.6. Image processing: GeneChips

In contrast to spotted microarrays, the extremely precise and repeatable fabrication process used to manufacture GeneChips (discussed in Section 3.1) in conjunction with the large number of probes used per gene substantially changes the methods employed to convert GeneChip images into intensities on a per gene basis. Representing the spot intensities is greatly simplified due to the use of only one fluorophore on each array, requiring only a simple square-root transform to allow the full range of intensities to be interpreted by the user when viewing the raw image. Gridding the image is also simplified through the use of regular patterns of control probes at the corners of the array for grid placement, as well as in particular sections of the array to help correct for grid misalignment. The extremely precise method of manufacture also means that deviations from regularity are very rare, thereby simplifying the process of addressing the spots over the image. In addition, following the gridding process, spot segmentation into foreground and background regions is not required, as each spot almost completely fills the location on the array. However, due to the possibility of overlapped pixel intensities and the tendency for decreased signal intensities at the edges of the spot, the outermost pixels are discarded, and the 75th percentile of the remaining pixel intensities is reported [91]. These probe level intensities and their associated standard deviations are stored in a file referred to as the CEL file. In contrast to the two-color microarrays, GeneChips use a set of multiple probes to interrogate the expression level of each gene. Transforming the intensities from all the probes in a particular set into a single value for use in downstream analysis is not a trivial process, and many different methods have been developed. The details of many of the methods used are beyond the scope of this review; however, a general description of four widely used methods is presented in the next section so that the reader can appreciate the nature of the procedure.

4.6.1. Probe summarization

The goal of probe summarization is to transform the intensities from a set of perfect match (PM) and mis-match (MM) probes for a particular gene into a single value that can be used in subsequent analyses. As previously mentioned, each gene has a set of 11 to 20 probe pairs of PM and MM probes, each 25 bases long, and the 13th base of the MM differs from the PM sequence. These MM probes are used to provide an estimate of the signal of the PM resulting from non-specific hybridization. The earliest methods to summarize the probe signals employed a simple average difference, whereby the average of the differences $PM_{ij} - MM_{ij}$ ($j = 1, \ldots, J$, for each array $i$) was calculated. However, these techniques suffered from a number of weaknesses: (1) unaccounted-for variations in background signal, (2) MM intensities that are higher than the corresponding PM intensity, (3) multiplicative error in the measured signal intensities (see Section 4.7 for more details on errors), and (4) differential responses of probes to the same gene. Various solutions over time have attempted to correct different sets of these problems. The most widely used methods and how they address these weaknesses are described below.

Although the MM probes are designed to correct for background signal and signal in the PM due to non-specific hybridization, for many samples there is a large proportion of MM probes with higher signal than the corresponding PM probe. The MAS 5.0 algorithm developed by Affymetrix [92] calculates the signal as the anti-log of a robust average (Tukey biweight) of the values $\log(PM_{ij} - IM_{ij})$, $j = 1, \ldots, J$. To avoid taking the log of negative numbers, IM is defined as a quantity equal to MM when MM < PM, but adjusted to be less than PM when MM ≥ PM. For more specifics on the adjustments used to define IM, the reader is referred to the Affymetrix documentation [92].

A different approach was used by Li and Wong [93] in their dChip software package. They theorized that the different probes in a set may have different affinities for the same gene; however, their behavior across arrays should remain constant (after normalization), and therefore the affinities can be modeled using a set of arrays. They employ the model

$$PM_{ij} - MM_{ij} = \theta_i \phi_j + \varepsilon_{ij} \qquad (3)$$

This assumes that the probe affinities ($\phi_j$) influence the final signal in a multiplicative manner, and are the same across all of the arrays. Therefore, fitting this model using multiple arrays allows calculation of $\theta_i$ for each array $i$, giving a summary statistic for the probe set. The authors claim that fitting of this model also allows detection of defective probes by discovery of probes that do not have a good fit to the model.
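A minimal sketch of how a multiplicative model of the form of Eq. (3) could be fitted is given below. It uses plain alternating least squares on a matrix of PM minus MM differences (rows are arrays, columns are probes) and rescales the affinities so that their squares sum to the number of probes; both the fitting scheme and the scaling constraint are assumptions of this illustration, and none of the outlier handling of dChip is reproduced.

```python
import math


def fit_multiplicative_model(diffs, n_iter=50):
    """Alternating least-squares fit of diffs[i][j] ~ theta[i] * phi[j]."""
    n_arrays, n_probes = len(diffs), len(diffs[0])
    phi = [1.0] * n_probes
    theta = [0.0] * n_arrays
    for _ in range(n_iter):
        # Update array-level expression indices for fixed affinities.
        phi_ss = sum(p * p for p in phi)
        theta = [sum(d * p for d, p in zip(row, phi)) / phi_ss for row in diffs]
        # Update probe affinities for fixed expression indices.
        theta_ss = sum(t * t for t in theta)
        phi = [sum(diffs[i][j] * theta[i] for i in range(n_arrays)) / theta_ss
               for j in range(n_probes)]
        # Resolve the scale ambiguity: sum of squared affinities = n_probes
        # (an assumed identifiability constraint).
        scale = math.sqrt(n_probes / sum(p * p for p in phi))
        phi = [p * scale for p in phi]
        theta = [t / scale for t in theta]
    return theta, phi


# Hypothetical noise-free probe set: 3 arrays, 3 probes.
true_theta, true_phi = [100.0, 200.0, 400.0], [0.5, 1.0, 1.5]
diffs = [[t * p for p in true_phi] for t in true_theta]
theta, phi = fit_multiplicative_model(diffs)
print([round(t, 2) for t in theta], [round(p, 3) for p in phi])
```

Only the products of the two sets of parameters are determined by the data, so the fitted values differ from the generating values by a common scale factor while the ratios between arrays are preserved. Probes whose residuals remain large after fitting would be candidates for the defective probes mentioned in the text.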

The second probe affinity modeling approach is the robust multi-chip average (RMA) [94–96]. This method has been implemented in the Bioconductor [97] package affy, and has over 1000 citations. In contrast to the methods discussed thus far, RMA only uses the PM probe intensities on each array, due to the high proportion of MM probes with higher intensities than the corresponding PM probe. In comparison to the dChip method, RMA uses the log2 transformed PM values, leading to

$$Y_{ij} = \mu_i + \alpha_j + \varepsilon_{ij} \qquad (4)$$

where $Y_{ij}$ is the normalized, background-corrected, log2-transformed PM value for probe $j$ on array $i$. Fitting the equation using the $Y_{ij}$ values allows one to estimate the summarization value for the probe set on array $i$, $\mu_i$.
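An additive model such as Eq. (4) is often fitted robustly with Tukey's median polish, and RMA is commonly described as using this procedure; the text here does not name the fitting method, so that choice is an assumption of the following sketch. Rows are arrays and columns are probes, the input is assumed to be already background-corrected, normalized and log2-transformed, and the function names are hypothetical.

```python
from statistics import median


def median_polish(y, n_iter=10):
    """Fit y[i][j] ~ overall + row[i] + col[j] by Tukey's median polish."""
    resid = [list(row) for row in y]
    n_rows, n_cols = len(resid), len(resid[0])
    overall, row_eff, col_eff = 0.0, [0.0] * n_rows, [0.0] * n_cols
    for _ in range(n_iter):
        for i in range(n_rows):  # sweep out row medians
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        for j in range(n_cols):  # sweep out column medians
            m = median([resid[i][j] for i in range(n_rows)])
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
    return overall, row_eff, col_eff


def additive_summary(log2_pm):
    """Return one summary value (an estimate of mu_i) per array."""
    overall, row_eff, _ = median_polish(log2_pm)
    return [overall + r for r in row_eff]


# Hypothetical noise-free probe set: 3 arrays, 3 probes, log2 scale.
mu, alpha = [8.0, 9.0, 11.0], [-1.0, 0.0, 1.0]
y = [[m + a for a in alpha] for m in mu]
print(additive_summary(y))
```

Because medians rather than means are swept out, a single aberrant probe on one array has limited influence on the array-level summaries, which is the motivation for a robust fit.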

Finally, Affymetrix also developed a very similar method to dChip and RMA, probe logarithmic intensity error (PLIER) estimation [98]. Like the others, it fits the probe intensities across a series of multiple chips; however, it also applies a penalty function for those probes that are less informative. All of the model-based methods use probe intensities from multiple arrays, and only arrays that are expected to behave similarly (i.e. most of the genes are not undergoing differential changes) between samples should be processed in the same batch. In addition, the model-based methods all use probe intensities that have been background-corrected (Section 4.6.2) and normalized (Section 4.8) prior to summarization.

It should be noted that the above summarizations are only the ones that the authors have encountered most often in the literature, and that new methods are continuously being developed. Frequently, the raw data made available for analysis consists of the raw intensities for each probe, in addition to the results of the summarized values, although this is not required for submission to many of the microarray databases.

4.6.2. Background correction for GeneChips

The MM probes on the arrays are designed to account for non-specific hybridization, with the signal from the PM probe comprised of both specific and non-specific signal, and that from the MM probe of non-specific signal only [99]. This assumption that the MM probes would account for only non-specific signal led to the original practice of correcting the PM intensities by a subtraction of the corresponding MM intensities. This does not account for all of the background signals, however, nor does it account for the many instances of MM probes with higher intensities than the corresponding PM probe.

Methods to account for the global background (common to the whole chip) when using PM/MM measures were introduced in the MAS 5.0 algorithm by Affymetrix. The array is divided up into zones, the probe values sorted within each zone, and the lowest 2% is chosen as the background for that zone. In order to avoid discontinuities between zones, the background value to subtract from each probe is calculated as a sum of the background values from each zone, with each background value weighted by the distance of the probe from the centre of each zone. To avoid creating negative probe intensities using this process, a lower threshold is also computed based on the noise in the lowest 2% of values in each zone and the same weighting scheme as used for the background value.

Unfortunately, the MAS 5.0 method uses a fairly arbitrary method of substituting values for MM when the MM intensity is higher than the PM intensity (see Section 4.6.1). A range of solutions has been proposed to fix this. One is to ignore the MM intensities completely, and use only the PM probes and associated intensities. RMA uses this method, assuming that the PM intensities are a mixture of background and true signal, and that the background intensity is normally distributed and the true signal follows an exponential distribution [100]. Ignoring the MM probes has become more common, especially following the release of the actual sequences of the PM and MM probes on the array for many GeneChip designs. This is because the probe annotations supplied by Affymetrix have often changed due to new gene sequence information. With the probe sequence information, it then becomes possible to reassign the probes to new genes [101]; however, the pairing of PM and MM probes is often lost in this process, necessitating that only the PM probes are used to perform the analysis.

An alternative to both approaches was put forward by Wu et al. in 2004 [102] that used the sequence information in the PM and MM pairs to allow the calculation of an adjusted value for each MM based on its actual sequence and calculated affinity for the sequence that is bound by the PM. The normalization and summarization procedures were based on the RMA method, and the method is known as GC-RMA.


4.7. Measurement quality and transformations

In principle, most ratio calculation methods will be expected to yield similar measures of the relative concentration of mRNA in the two samples if spots have comparable amounts of sample hybridized at every pixel [103]. Unfortunately, it is hard to find such truly ideal spots in a population of up to several thousand spots and, therefore, the methods employed will only provide estimates of the ‘true’, albeit unknown, relative intensities of the pixels in the spot masks. Further, these methods are differentially affected by spot characteristics such as morphology and the shape of the pixel distribution. Thus, for a given microarray, some methods may work better than others. Depending on the symmetry and distribution of the pixels, it is quite possible to obtain ratios that are widely varied and, given that the “true” intensities are unknown, it is hard to assess which approach is the closest estimate. Consider, for example, the spots drawn from the image shown in Fig. 9, which is part of the challenge data set (P. falciparum) of the 2004 Critical Assessment of Microarray Data Analysis (CAMDA) conference available from reference [72]. Cy5/Cy3 ratios calculated using various methods in MATLAB® (MathWorks) are presented in Table 2, which demonstrates the apparent differences among the methods.
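To make the distinction among the ratio estimates concrete, the following minimal sketch computes five of them from the paired pixel intensities of a single spot. The function name and the pixel values are hypothetical, the intensities are assumed to be background-corrected already, and the regression ratio is taken here to be the least-squares slope of the red channel on the green channel with a free intercept, which is only one plausible reading of that term.

```python
import statistics


def spot_ratios(red, green):
    """Return five ratio estimates for one spot from paired pixel values."""
    if len(red) != len(green) or len(red) < 2:
        raise ValueError("need at least two paired pixels")
    pixel_ratios = [r / g for r, g in zip(red, green)]
    mean_r, mean_g = statistics.fmean(red), statistics.fmean(green)
    # Least-squares slope of red on green (assumed form of the regression ratio).
    sxy = sum((g - mean_g) * (r - mean_r) for r, g in zip(red, green))
    sxx = sum((g - mean_g) ** 2 for g in green)
    return {
        "ratio_of_medians": statistics.median(red) / statistics.median(green),
        "ratio_of_means": mean_r / mean_g,
        "median_of_ratios": statistics.median(pixel_ratios),
        "mean_of_ratios": statistics.fmean(pixel_ratios),
        "regression_ratio": sxy / sxx,
    }


# Hypothetical Cy5 (red) and Cy3 (green) pixel intensities for one spot.
red = [120.0, 150.0, 90.0, 200.0, 110.0]
green = [100.0, 160.0, 100.0, 180.0, 130.0]
for name, value in spot_ratios(red, green).items():
    print(f"{name}: {value:.3f}")
```

For an ideal spot in which every red pixel is the same multiple of its green partner, all five estimates coincide; they diverge as the pixel distribution becomes skewed or noisy.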

Even when working with the same ratio calculation method, uncertainties in the ratios determined can vary greatly from spot to spot. Unlike some analytical measurements where the uncertainty can be inferred from the magnitude of the signal, microarray expression ratios do not provide an implicit indication of their uncertainty, since the ratio gives no clue as to the magnitude of the signals defining it. In practice, observed ratio measurements are found to exhibit a high level of heteroscedasticity [104,105]. This can be interpreted as arising from a combination of two limiting error contributions, one which is related to spot morphology/intensity and the other which is characterized by a multiplicative component.

For spots where one or both channels are close to the background level, or for spots that are substantially distorted, errors will be dominated by the uncertainties in the ratio calculation itself. In the former case, for example, even a small absolute uncertainty in pixel intensities near the background level can lead to large relative uncertainties in a ratio calculation, most likely tied to the estimation of the background level. These errors are typically quite large, with relative uncertainties often in excess of 100%, and the spots may be considered to be outliers. Unfortunately, there are no clearly defined quality measures to assess these spots so the most widely used practice is to rely on quality control employed during primary processing to remove spots that demonstrate a potential for introducing fluctuations in the data, especially those exhibiting low overall intensity. These spots are excluded from further analysis through a subjective process referred to as flagging. This all too common solution of ignoring potentially aberrant spots assumes that in such a large population it is possible to assign the quality of the spots to a binary classification. Although statistical approaches of flagging weak spots have been reported [106,107], they rely on setting thresholds for a cut-off. In reality, spots exhibit a continuum of quality and the imposition of a cut-off based on their aesthetic appeal (or some other parameter) must be carefully evaluated as this can be quite deceptive and potentially lead to a loss of valuable data.

Table 2
Spot ratios calculated using various methods: ratio of medians, median of ratios, ratio of means, mean of ratios and regression ratio. Raw ratio (non-log-transformed) values are shown.

| Spot no./location | Gene name | Ratio of medians | Ratio of means | Median of ratios | Mean of ratios | Regression ratio |
|---|---|---|---|---|---|---|
| 3857/8-7-22 | oPFrRNA0001 | 0.931 | 0.991 | 0.843 | 1.047 | 1.010 |
| 3313/7-13-19 | oPFH0009 | 1.002 | 1.080 | 1.039 | 1.113 | 1.108 |
| 7647/16-13-18 | Prps11 | 0.551 | 0.590 | 0.640 | 0.703 | 0.573 |
| 2398/5-22-21 | Empty | 0.610 | 0.680 | 0.620 | 0.663 | 0.712 |
| 7609/16-19-16 | N155_35 | 0.753 | 0.908 | 0.730 | 0.858 | 0.957 |
| 80/1-14-4 | D16785_6 | 0.746 | 0.851 | 0.771 | 0.809 | 1.065 |

For spots that are well-defined and exhibit sufficient intensity, the uncertainty introduced by the ratio calculation is small and the error structure appears to be dominated by a multiplicative (proportional) component. At this limit, the relative standard deviation in the ratio measurement appears to be fixed, typically in the range of 10–40%, depending on the microarray under study [108]. Although several researchers have reported such behavior [76,105,109,110], the physical origin has not been clearly elucidated, but it does not seem to be limited by the optical measurement [104]. Such an error structure is also observed in the absolute fluorescence intensities, and it is easily shown that this would propagate directly to the ratios. The presence of this multiplicative error is one of the principal reasons that log-ratios are used to represent expression data as opposed to ratios: multiplicative errors revert to a uniform variance under such conditions. For example, if the ratio X has associated standard deviation σX = αX, where α is the constant relative standard deviation (RSD), propagation of error shows that the uncertainty associated with Y = log2 X is σY = α / ln 2 (note that base two logarithms are typically used with expression data to measure fold-changes). Another benefit of this transformation is that it will suppress the range of outliers somewhat for a more informative display. For these reasons, log-transformed data are widely used for presentation and normalization (see Section 4.8).
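The propagation-of-error result can be checked numerically. The sketch below is an illustrative simulation only: the function name, the choice of Gaussian multiplicative noise, and the value of α are assumptions. It compares the standard deviation of simulated ratios with that of their base two logarithms.

```python
import math
import random
import statistics


def simulate_sd(true_ratio, alpha, n=20000, seed=1):
    """Return (sd of X, sd of log2 X) for X with relative standard deviation alpha."""
    rng = random.Random(seed)
    # Multiplicative error: sigma_X = alpha * X (Gaussian noise is an assumption).
    xs = [true_ratio * (1.0 + alpha * rng.gauss(0.0, 1.0)) for _ in range(n)]
    xs = [x for x in xs if x > 0.0]  # the logarithm needs positive values
    return statistics.stdev(xs), statistics.stdev([math.log2(x) for x in xs])


alpha = 0.1  # assumed RSD of 10%
for ratio in (0.5, 1.0, 4.0):
    sd_x, sd_log = simulate_sd(ratio, alpha)
    print(f"X = {ratio}: sd(X) = {sd_x:.3f}, sd(log2 X) = {sd_log:.3f}")
print(f"predicted sd(log2 X) = alpha / ln 2 = {alpha / math.log(2):.3f}")
```

The standard deviation of X scales with the ratio itself, whereas that of log2 X stays close to α / ln 2 regardless of the ratio, which is the uniform variance referred to above.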

An alternative to log-transformation is the so-called variance stabilizing transformation [111,112]. Variance stabilization has been argued to be important from the point of view that probabilistic processes sometimes lead to intensity-dependent variance [85]. Rocke et al. [109] and Durbin et al. [111] proposed a method for dealing with this problem by expressing the measured dye intensity as a function of two components such that:

$$y = \alpha + \mu e^{\eta} + \varepsilon \tag{5}$$

where y is the measured dye intensity for each spot, α is the background intensity, and µ is the true expression level of the gene, while η and ε are normally distributed random variables centered at zero, with $\sigma_\eta^2$ and $\sigma_\varepsilon^2$ as their respective variances. This model was first developed for purposes of evaluating measurement errors in analytical chemistry instrument responses [113], and its application to microarrays has several implications. First, at very low expression levels, i.e. if µ is approximately zero, a measured spot intensity is expected to be dominated by the first term, meaning that y is normally distributed in the limit of many such measurements, i.e. $y \sim N(\alpha, \sigma_\varepsilon^2)$. Second, when µ is very large, the measured intensity is dominated by the second term, with an approximate log-normal distribution for y. Under such circumstances the model given for the variance is $\mu^2 S_\eta^2$, where $S_\eta^2 = e^{\sigma_\eta^2}\left(e^{\sigma_\eta^2} - 1\right)$, which implies that the variance of y is linearly related to µ², i.e. multiplicative noise. Third, at moderate expression levels, measured spot intensities are expected to lie between the two extremes mentioned above and the distribution of y is anticipated to exhibit the characteristics of both normal and log-normal distributions. The approach provided for dealing with the implications mentioned above is to transform the data using a variant of the logarithmic transformation that stabilizes the variance in all the intensity regions, referred to as the generalized log-transformation (glog) [111], given as:

$$g(y) = \ln\left( (y - \alpha) + \sqrt{(y - \alpha)^2 + \frac{\sigma_\varepsilon^2}{S_\eta^2}} \right) \tag{6}$$

where α is the background intensity measured from the intensity of unexpressed genes on a microarray. This transformation ensures that the variance is constant over the dynamic range of measured intensities.
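As an illustration of how the generalized log-transformation behaves at the two intensity limits, a minimal sketch is given below. The function name and the parameter values are hypothetical; in practice α, σε and Sη would have to be estimated from the data.

```python
import math


def glog(y, alpha, sigma_eps, s_eta):
    """Generalized log transform of Eq. (6) for a single intensity y."""
    z = y - alpha                   # background-corrected intensity
    c = (sigma_eps / s_eta) ** 2    # sigma_eps^2 / S_eta^2
    return math.log(z + math.sqrt(z * z + c))


# Hypothetical parameters: background 100, additive sd 20, S_eta 0.2.
alpha, sigma_eps, s_eta = 100.0, 20.0, 0.2
for y in (50.0, 100.0, 500.0, 50000.0):
    print(f"y = {y:>8.1f}  glog = {glog(y, alpha, sigma_eps, s_eta):.4f}")
```

Unlike the ordinary logarithm, the transform remains defined when the measured intensity falls below the background, and for intensities far above the background it approaches ln[2(y − α)], so that it differs from a conventional log transform only by a constant offset.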

In the literature, although sparse, several quality measures that provide somewhat objective methods for associating quality to the measured ratios have been reported. The reader is directed to Karakach and Wentzell [114] for a more in-depth review of ratio-quality measures. Regrettably, most downstream microarray data analysis methods do not use these quality measures in the analysis of the data. For instance, as early as 1997, Chen et al. [85] introduced an approach for associating confidence levels to calculated ratios, yet, unfortunately, most microarray data analysis methods known to the authors do not employ this information. In recent years, there has been continued interest in the issues related to the quality of spot ratios with the goal of associating confidence to the measured ratios. Brown et al. [104] proposed an approach based on the relative standard deviation (RSD) of ratios of spots to calculate variability due to spot morphology and used this measure, spot ratio variability (SRV), to assign a significance value to the ratios. Similarly, Wang et al. [78] developed a quality score depicting the overall quality of all the spots on a microarray based on the spot size, signal-to-noise ratio, and excessive and variable local background as well as spot pixel intensity saturation. As noted by Newton et al. [105], expression measurement procedures that rely solely on the raw intensity ratios are unlikely to be efficient, as high errors accompany the reported ratios especially at low signal intensities. The output from most microarray image analysis software includes a measure of the spread for the pixels in the spot mask. Whether or not this provides a good estimate of the uncertainty associated with the ratios, especially for low-intensity spots, is not clear [106]. More recently, Karakach and coworkers developed techniques to estimate both the additive and multiplicative components of the ratio uncertainties [87], thereby providing a numerical value associated with the quality of the spot ratio. However, it is evident that more research on estimating and incorporating measurement uncertainty in microarray data analysis needs to be carried out.

4.8. Normalization

The goals of higher-level analyses of microarray data include the identification of genes whose expression strongly depends on the biological state of the cell and partitioning such genes, as well as the corresponding samples, together based on similarity of their expression profiles. Microarray data analysis entails identification of genes that exhibit uncharacteristic patterns of expression, i.e. genes that are up-regulated or down-regulated with respect to some reference state. Unfortunately, the setup for these experiments exacerbates the potential for wide-ranging variability; hence access to the biological information that may be mirrored in the expression profiles is often impeded. This variability may be of a random or systematic nature. Some random variability, the sources of which have been discussed in preceding sections, is inevitable and can be addressed through proper statistical analysis. This section discusses the methods for removal of systematic biases in microarray data (normalization), and begins by providing a general summary of some of the sources of experimental uncertainty. This is particularly important since normalization is the final pre-processing step for microarray data.

The origins of the need for normalization are relatively simple to understand from an analytical perspective. In an ideal situation, a ratio of the intensity measurements on the test and reference channels would give a direct indication of up- or down-regulation, i.e. a ratio of unity would indicate no change. In practice, of course, the absolute intensities are a function of many variables, obvious ones including the response of the PMT and optical system, the laser power, the laser wavelength, the absorption spectrum of the dyes and their quantum yields, and the efficiency with which each dye is incorporated into its respective sample. While the use of two-color arrays can solve the problem of variable spot morphologies, it cannot compensate for these other effects. An analytical solution to these differences might be to use some sort of internal standard whose concentration was the same in both the test and reference, and in fact this is one approach. However, the efficacy of this strategy is limited by another source of variability, which is the amount of mRNA extracted from each sample. Normally, the amounts of total RNA in the test and reference sample are adjusted to be the same by a spectrophotometric measurement, but the mRNA is only a few percent of the total RNA and this can vary, leading to another source of variability.

It is expected that the correction for these effects would be multiplicative in nature such that it would involve simply scaling either the test or reference intensities by an appropriate constant, but the possibility of spatially dependent scaling factors or nonlinear behavior cannot be excluded. There are different approaches to normalization of microarray data, most of which have been developed based on sound distributional assumptions about the data. The most common assumption made for comparator experiments is that the vast majority of genes in the test sample exhibit no change in expression from the reference. Although most of the normalization techniques have been developed for comparator experiments, their application to time-course experiments is not uncommon. Some of the most widely used normalization methods include total intensity normalization [115], local regression [116] or local scatter smoothing (such as locally weighted scatter plot smoothing (Lowess) normalization [117,118]) and normalization by housekeeping genes [119] or external control genes. Only a brief description of some of these methods is presented in this section, since Quackenbush [115] has reviewed the most common normalization strategies and Park et al. [120] have also compared various methods of normalization of microarray data.

4.8.1. Total intensity normalization

Total intensity normalization is the simplest strategy for normalization of microarrays and is based on some of the most straightforward assumptions. In this approach, it is assumed that the average amount of transcripts representing each target is approximately constant for the two samples. In addition, it is assumed that the probes are randomly sampled from the population of genes in the genome (or constitute a complete genome). This implies that approximately equal amounts of target (from both samples) should hybridize to each spot, producing equal intensities in both channels when integrated. The rationale of these assumptions is that in a given living system, basic cellular maintenance must continue and perturbations to the current state are addressed by adjustments in the expression of only a few genes. Thus, normalization is performed by scaling the total intensity of one of the channels with a factor calculated as the ratio of the total fluorescence intensity of channel-one to channel-two such that:

$$G'_i = \gamma G_i \quad \text{and} \quad R'_i = R_i \tag{7}$$

where $G'_i$ is the normalized intensity of the ith probe hybridized to the green labeled target, $G_i$ is the respective raw intensity, while $R'_i$ is the normalized intensity of the ith probe hybridized to the red labeled target and $R_i$ is the respective raw intensity for the same probe. γ is the normalization factor given as:

$$\gamma = \frac{\sum_{i=1}^{g} R_i}{\sum_{i=1}^{g} G_i} \tag{8}$$

where the summation is over all the g probes on an array. This adjusts the mean of the relative expression level for all the spots to unity. Alternatively, to provide better stabilization of the error variance, the geometric mean is often used and calculated using the log-ratios. This leads to:

$$\log \gamma = \frac{\sum_{i=1}^{g} \log R_i}{g} - \frac{\sum_{i=1}^{g} \log G_i}{g} \tag{9}$$

Note that this approach, as well as most other normalization strategies, can be applied to an entire array or to sub-grids of the array, where the terms global normalization and sub-grid normalization refer to the two approaches, respectively [115]. However, caution must be exercised in the use of these terms since others [120,121] have used the term global normalization to refer to the total intensity normalization method while approaches such as Lowess are referred to as intensity-dependent normalization methods.
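The two scaling factors of Eqs. (8) and (9), and the scaling of Eq. (7), can be sketched as follows. The function names and the intensity values are hypothetical, and the example treats the whole array as a single group (global rather than sub-grid normalization).

```python
import math


def total_intensity_factor(red, green):
    """Eq. (8): gamma as the ratio of summed red to summed green intensities."""
    return sum(red) / sum(green)


def geometric_mean_factor(red, green):
    """Eq. (9): log(gamma) as mean(log R) minus mean(log G)."""
    g = len(red)
    log_gamma = sum(math.log(r) for r in red) / g - sum(math.log(x) for x in green) / g
    return math.exp(log_gamma)


def normalize(red, green, gamma):
    """Eq. (7): scale the green channel by gamma, leave the red channel as is."""
    return list(red), [gamma * x for x in green]


# Hypothetical raw intensities for four probes.
red = [1200.0, 300.0, 5000.0, 800.0]
green = [700.0, 150.0, 2400.0, 450.0]
gamma = total_intensity_factor(red, green)
red_n, green_n = normalize(red, green, gamma)
print(f"gamma = {gamma:.3f}, summed R = {sum(red_n):.1f}, summed G' = {sum(green_n):.1f}")
print(f"geometric-mean gamma = {geometric_mean_factor(red, green):.3f}")
```

After scaling with the factor from Eq. (8) the two channels have equal total intensity, whereas the factor from Eq. (9) centres the mean log-ratio at zero; the two factors generally differ slightly.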

4.8.2. Lowess normalization

In view of the assumptions made in the total intensity normalization method, it is anticipated that a plot of the red channel versus green channel intensities will yield a unity slope and a zero intercept when properly normalized, since, typically, only a small fraction of genes exhibit differential expression. Therefore, another approach to normalization might be to make such a plot, as shown in Fig. 10a, and use the slope as the normalization factor. However, such an approach is problematic for a number of reasons. First, the typical distribution of measurements generates significantly fewer points at high intensities, which would not in itself be a problem except for the proportional error structure of the intensity measurements discussed earlier. This will tend to give excessive weight to the high-intensity points. In addition, there are often a considerable number of outliers in the data. A logarithmic transformation is therefore used to transform the multiplicative errors to uniform errors and reduce the range of outliers. It is expected that a plot of the log-transformed measurements (Fig. 10b), i.e. log R vs. log G, will yield a unity slope and an intercept equal to the logarithm of the normalization factor in the untransformed space:

$$\log_2 R = \log_2 \gamma + \log_2 G \tag{10}$$

Alternatively, the normalization can be performed by adding the intercept to the log G values.

Fig. 10. Red vs. green channel intensity plots from Atlantic salmon microarrays [7]: (a) raw intensities, (b) log2 R vs. log2 G plots showing ‘banana shaped’ curvature, (c) M vs. A plots showing deviation from zero for low intensities, and (d) Lowess-corrected data.

It is now standard practice to visualize microarray intensity measurements on the log scale in this way. Dudoit et al. [118] introduced a variation in this approach by incorporating a 45° clockwise rotation of the (log R vs. log G) coordinate system for ease of visualization. Such a rotation involves plotting the log-ratio (log(R/G)) of the intensities, designated M, versus the mean of their logarithmic intensities (log[(R×G)^(1/2)]), designated A. It is then anticipated that properly normalized plots of M vs. A will have zero slope centered on the zero horizontal. This is shown in Fig. 10c. These so-called MA plots are considered useful by some since the horizontal axis can be viewed as being related to a kind of average intensity, allowing intensity-dependent patterns to be observed.

Unfortunately, both log R vs. log G plots and MA plots commonly depict nonlinear characteristics that likely arise due to differential background on the two channels. Such a nonlinear structure is shown in Fig. 10b and c. The curvature depicted in these figures introduces a complication to a problem that could otherwise be solved by simple linear regression. Yang et al. [121] observed that this nonlinearity is an intensity-dependent systematic bias in the log-ratio values, and manifests itself as deviations from zero by the low-intensity signals, as seen in the M vs. A plots. This intensity-dependent bias renders the log R vs. log G plots ‘banana shaped’. Thus, to correct this intensity-dependence, they introduced Lowess [122] normalization and, since its inception, it has become the de facto standard normalization technique. This approach corrects nonlinearity in data via an nth order locally weighted regression of every response variable on local predictor variables (for DNA microarrays, a 1st order fit is generally used). Accordingly, M is regressed on A, point-by-point, through a weighted scheme such that for every point Mi in M, a local subset of points, Msub, which are closest to Mi are identified and regressed on Asub, after being weighted by their respective Euclidean distances from Mi. Thus, in typical microarray data, the data are normalized such that M = k(A) and M′ = M − k(A), where k(A) is the Lowess fit to the M vs. A plot. The size of Msub is based on a “span” chosen on the basis of the range of points from which the smoothest fit can be obtained. This is similar to the window in a moving average filter; a small window will compromise the smoothness of the filter while a large window might escalate computation time without improvements to the filter smoothness.
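The MA transformation and the correction M′ = M − k(A) can be sketched in a few lines. This is a simplified illustration rather than a reference Lowess implementation: the function names are hypothetical, tricube distance weights are assumed, the local window is defined by the nearest fraction `span` of the points along A, and the robustness iterations of the full Lowess procedure are omitted.

```python
import math


def ma_transform(red, green):
    """Return (M, A) with M = log2(R/G) and A = log2 of sqrt(R*G)."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * math.log2(r * g) for r, g in zip(red, green)]
    return m, a


def lowess_fit(a, m, span=0.4):
    """First-order locally weighted fit k(A), evaluated at every point."""
    n = len(a)
    k = min(n, max(2, math.ceil(span * n)))  # points in each local window
    fitted = []
    for a0 in a:
        # Bandwidth: distance from a0 to its k-th nearest neighbour along A.
        h = max(sorted(abs(ai - a0) for ai in a)[k - 1], 1e-12)
        sw = swx = swy = swxx = swxy = 0.0
        for ai, mi in zip(a, m):
            x = ai - a0
            u = abs(x) / h
            if u >= 1.0:
                continue
            w = (1.0 - u ** 3) ** 3  # tricube weight (assumed weighting scheme)
            sw += w
            swx += w * x
            swy += w * mi
            swxx += w * x * x
            swxy += w * x * mi
        denom = sw * swxx - swx * swx
        if denom < 1e-12:
            fitted.append(swy / sw)  # too few distinct points: use local mean
        else:
            slope = (sw * swxy - swx * swy) / denom
            fitted.append((swy - slope * swx) / sw)  # intercept is the fit at a0
    return fitted


def lowess_normalize(red, green, span=0.4):
    """Return A and the corrected log-ratios M' = M - k(A)."""
    m, a = ma_transform(red, green)
    k_a = lowess_fit(a, m, span)
    return a, [mi - ki for mi, ki in zip(m, k_a)]
```

The `span` argument plays the role of the window discussed above. Applying the same function separately to the spots of each print-tip group would give a local rather than a global normalization.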

The application of Lowess to data normalization can either be global or local, where the former implies that normalization is applied to the entire data set and the latter entails dividing the data into physical subsets such as sub-arrays, where the elements of the sub-array consist of spots printed with one pin (these are often called “print-tip groups”). Local Lowess normalization is said to correct for systematic spatial biases in the array, possibly related to discrepancies in the print-tips used to make the microarray [121].

Although the application of the Lowess methodology is widespread, it is largely empirical and, to the authors' knowledge, no physical explanation of the observed curvature has been provided in the literature. There is some evidence to suggest that the characteristic arises from spot-localized background effects for low-intensity spots, however.

4.8.3. External controls

Normalization in early microarray experiments was performed using external controls. For instance, in the pioneering work reported by Schena et al. [13], Arabidopsis thaliana mRNA was spiked with human acetylcholine receptor (AChR) mRNA controls, which were used for normalization. Current microarray experimental designs encourage inclusion of control probes, generally derived from a non-homologous organism, deposited at every sub-array to control for the systematic variability within the print-tip group. These external controls will not be expected to exhibit any differential expression since they are not subjected to the same biological stimulus as the experimental mRNA. Naturally, these act like internal standards against which signals from experimental samples are calibrated. While such controls serve a useful purpose, there are drawbacks in their use for normalization. Although they can account for differences in instrumental response parameters, they are unable to account for differences in the amount of mRNA extracted for the test and the reference, as already noted. Moreover, reliance on a limited number of spots can be dangerous if the quality turns out to be poor or insufficient target is added.

4.8.4. Housekeeping genes

This approach to normalization assumes that, in a large number of genes, the expression level of a relatively large subset will remain unaltered under most biological stimuli, except death. Thus, if these genes, referred to as “housekeeping genes”, are identified on a microarray, they could be employed to find the normalization factor to be applied to the entire array. The Harvard University HUGE 451 Index [123] is a list of housekeeping genes that are ubiquitously expressed in all cell types and conditions. In addition, DeRisi et al. [124] identified a set of 90 housekeeping genes whose intensities were used to normalize over 1000 spots on the microarray. Nonetheless, this approach does not take into account effects of nonlinearity in the data, and is viewed rather skeptically for this reason and the fact that it is hard to establish whether the extracted amount of mRNA corresponding to these genes is constant. In addition, recent reports [125] have suggested that certain “housekeeping genes” may be affected by the treatment to which a test organism is subjected. Since this has been suspected in the past, robust methods for choosing a self-consistent set of genes whose expression levels remain unchanged have been introduced [116,126].

4.8.5. Other approaches

Perhaps due to the significance of pre-processing of microarray data, several other normalization strategies have been developed. These include quantile normalization [121], ANOVA [43], and mixed model methods (MMM) [110], which are statistical approaches that adjust the means of the log-ratio of spot intensities to reflect expected distributional similarities between multiple arrays or, sometimes, within a single array given a ‘mock’ array. The abundance of normalization methods and literature in this area is a testament to its importance in the broader picture of microarray data processing.

4.8.6. Normalization of GeneChip data

In the case of GeneChip data, many of the same considerations as were discussed for spotted two-color microarrays still hold, and the methods used for normalization are very similar. Commonly used methods include a linear scaling akin to total intensity normalization, which may be performed before or after probe summarization, and may or may not use an invariant set of probes across a set of arrays. Lowess is also used; however, the implementation for GeneChip data differs from that for two-color arrays due to the hybridization of a single sample to each array. This necessitates that Lowess be carried out using two arrays, with one arbitrarily designated as R and the other as G (following the convention used for the two-color arrays). If more than two arrays have been used in the study, then cyclic Lowess may be used, whereby each array is normalized against all of the other arrays in turn. Instead of assuming that the distributions of probe intensities among the various arrays are the same, quantile normalization forces the distribution of intensities to be the same. For this method the probe intensities for each array are first sorted, and the actual value for each array is replaced by the average sorted value across all the arrays.
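The quantile normalization step described above can be sketched as follows. The function name and the input layout (one list of probe intensities per array, all of equal length) are assumptions, and tied values are simply ranked in their order of appearance rather than being averaged.

```python
def quantile_normalize(arrays):
    """Force every array to share the same intensity distribution.

    arrays is a list of equal-length lists, one list of probe intensities
    per array. Each value is replaced by the mean, across arrays, of the
    values holding the same rank.
    """
    n = len(arrays[0])
    if any(len(arr) != n for arr in arrays):
        raise ValueError("all arrays must contain the same number of probes")
    sorted_arrays = [sorted(arr) for arr in arrays]
    # Reference distribution: average of the k-th smallest value on each array.
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]
    normalized = []
    for arr in arrays:
        order = sorted(range(n), key=lambda i: arr[i])  # probe indices by rank
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


# Two hypothetical arrays with three probes each.
print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
# → [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After the transformation each array contains exactly the same set of values, and only their assignment to probes differs, which is the sense in which the distributions are forced to be the same.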

4.9. Missing values

In contrast to many other analytical measurements, microarray data are often characterized by a significant proportion of missing values. These arise as a consequence of several factors, including (1) the large dynamic range of the measured fluorescent signal and removal of spots based on signal measures, (2) spot artifacts such as non-uniform background, smudges and scratches that compromise the quality of a given spot, and (3) negative intensities or ratios arising from abnormally high background. Factors (1) and (2) tend to be random for a particular array, whereas (3) can be more systematic, depending on the experimental design. In the authors' experience, these factors lead to approximately 5% of measurements being considered missing on any given array. With few exceptions (see [127] and [128] where missing values are included by weighting them appropriately), downstream analysis methods require complete data sets with no missing values. To allow the use of these downstream methods without discarding potentially important genes, missing value imputation (MVI) methods have been regularly used with DNA microarray data.

In 2001, Troyanskaya et al. introduced the now standard K-nearest neighbor (KNN) approach, evaluating it and two other imputation methods [129]. Different imputation methods have been developed specifically for DNA microarray experiments (see [130–140] for examples). In recent work, there has been controversy over the impact of the imputation method on the outcome of subsequent data analyses. Brock and coworkers [141] investigated the accuracy of eight different MVI techniques as well as methods to evaluate and select the most appropriate MVI approach. In this work, they concluded that in many cases, there was very little difference in the performance of the best algorithms, as evaluated by their ability to reconstruct the original data. In contrast, Celton et al. examined 12 different imputation methods and their effect on the results of exploratory data analysis via clustering [142], demonstrating that the choice of imputation method and the number of imputed values can cause instabilities in the clustering results.

Fig. 11. Volcano plot depicting statistical significance against fold change in Atlantic salmon microarray data [7]. The solid horizontal line depicts the nominal p-value of 0.05 while the solid vertical lines depict a 1.4 fold change. The region labeled “X” corresponds to the region of differential expression.

What is of particular concern for the chemometrician performing downstream analysis of the data is whether or not MVI is warranted, and which method to employ. Although KNN is the most commonly used MVI method, this appears to be primarily by virtue of having been introduced first, as both studies above demonstrate that for many datasets it is not the best method. However, the best method is often dataset- or experimental design-specific, requiring specialized knowledge of the various methods available.

4.10. Higher-level processing

At the primary level of data analysis, which might be considered as data pre-processing from a chemometrics perspective, the steps are largely the same from one application to another: gridding and segmentation, flagging, image processing, background subtraction, ratio calculation, transformation and normalization. Although the details of these steps may differ, in the end the usual result is a vector of ratios or, more typically, log-ratios and their associated gene identifiers for a series of samples, forming a two-way data matrix for further analysis. At this stage, a variety of methods can be used to coax the desired information from the data, depending on the nature of the experiment. Typical goals include: (1) the identification of genes exhibiting differential expression (up- or down-regulation) relative to some reference state, (2) the clustering or classification of samples based on their gene expression profiles, (3) the clustering or classification of genes based on their expression across multiple samples, (4) the identification of genes that may be used as biological markers (e.g. for a mutation, a disease, or resistance to some medication), and (5) elucidation of gene function and mechanisms of interaction, i.e. gene networks. In these studies, the term “expression profile” is generally used to describe the normalized ratio (test/reference) or log-ratio of signals across all genes for a sample represented on a particular microarray. From a chemometrics point of view, it could be considered a kind of “genetic spectrum” except that there is no naturally contiguous ordering of channels. In other contexts, “expression profile” may also refer to changes in the expression of a particular gene across multiple samples, especially in a serial experiment.

The application of higher-level data analysis methods to microarray measurements generally assumes that the data have been adequately pre-processed such that poor quality spots have been eliminated or flagged, background signals subtracted, and systematic variability accounted for through proper experimental design and normalization. Often these assumptions were not valid in the early days of microarrays, but the situation has improved somewhat in recent years. The earliest data analysis was performed by assigning differential expression cut-off values to genes based on “fold-changes” in their expression levels between two samples. For instance, Schena et al. [13] declared a gene to be differentially expressed if its expression level in the two samples differed by a factor of 5, while DeRisi et al. [124] chose a cut-off value of ±3 fold up- or down-regulation. The standard approach at this stage was to compute log-ratios of measured expression levels of genes in the test and reference samples, and to assign an ad hoc threshold for differential expression. A convention emerged, soon after the technology was developed, that a two-fold relative induction or repression of the measured dye intensities indicated a significant change in gene expression. It is not clear, however, how this convention was conceived, and over time it has received a lot of criticism, mainly because such fold-changes did not take into account the reliability of the measurements. Moreover, most of the published data were quite vague about the measurement reproducibility, and hence it was difficult to assess the confidence levels of the reported fold-changes. In addition, some genes, such as those encoding transcription factors, may exhibit relatively small changes in expression yet have dramatic impacts on the cellular machinery.
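As a minimal sketch of the fold-change approach described above, the following Python fragment converts test and reference intensities into log2 ratios and applies a symmetric two-fold cut-off. The gene names, intensity values and function names are hypothetical and introduced only for illustration. The sketch deliberately takes no account of measurement reliability, which is the weakness of ad hoc thresholds noted in the text.

```python
import math

def log2_ratio(test, reference):
    """Return log2(test/reference) for positive, background-corrected intensities."""
    return math.log2(test / reference)

def call_by_fold_change(log_ratios, fold=2.0):
    """Label each gene 'up', 'down' or 'unchanged' using a symmetric fold-change cut-off."""
    cutoff = math.log2(fold)  # a two-fold change corresponds to +/-1 on the log2 scale
    calls = {}
    for gene, m in log_ratios.items():
        if m >= cutoff:
            calls[gene] = "up"
        elif m <= -cutoff:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls

# Hypothetical (test, reference) intensities for three genes.
intensities = {
    "geneA": (4000.0, 1000.0),
    "geneB": (500.0, 1500.0),
    "geneC": (1200.0, 1000.0),
}
log_ratios = {g: log2_ratio(t, r) for g, (t, r) in intensities.items()}
calls = call_by_fold_change(log_ratios, fold=2.0)
print(calls)
```

On the log2 scale, induction and repression by the same factor are symmetric about zero, which is one practical reason log-ratios are preferred over raw ratios for thresholding.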

As time went on, more replication was performed in microarray experiments and formal statistical testing was employed. Initially, simple t-tests were used, where a t-statistic could be calculated and evaluated for each gene, accounting for differences in variability among the genes. One method of displaying these results is in the form of a “volcano plot”, as shown in Fig. 11, where the −log10 of the p-value calculated for each gene is plotted against the log2 of its fold change. The plot clearly shows that, while there is a correlation between significance and fold change, it is not very strong. One of the problems with carrying out a t-test in this way is defining a p-value for significance. If a typical cut-off of p = 0.05 were used (−log10(0.05) = 1.3 in the figure), clearly a very large number of genes would be discovered. However, part of the difficulty here is the problem of multiple testing: with ca. 4000 genes, one would expect the cut-off to be exceeded 4000 × 0.05 = 200 times just by chance. Typically, in these cases, a Bonferroni correction would be applied, which would adjust the p-value cut-off to 0.05/4000 = 0.0000125 (or −log10 p = 4.9). While this reduces the number of genes discovered (often to zero), it has been argued that this correction is too conservative because it is based on an assumption of independence among the genes, which is not likely to be the case.
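The multiple-testing arithmetic in this paragraph can be reproduced with a few lines of Python. This is only an illustrative sketch: the gene names, fold changes and p-values are hypothetical, and the p-values are taken as given rather than computed from replicate data.

```python
import math

def volcano_point(fold_change, p_value):
    """Return the volcano-plot coordinates (log2 fold change, -log10 p) for one gene."""
    return math.log2(fold_change), -math.log10(p_value)

n_genes = 4000   # approximate number of genes, as in the text
alpha = 0.05     # conventional per-test significance level

# Number of genes expected to pass p < alpha by chance alone.
expected_by_chance = n_genes * alpha

# Bonferroni correction: divide the significance level by the number of tests.
bonferroni_alpha = alpha / n_genes

# Positions of the two horizontal cut-off lines on the volcano plot.
unadjusted_line = -math.log10(alpha)
bonferroni_line = -math.log10(bonferroni_alpha)

print(f"expected by chance: {expected_by_chance:.0f}")
print(f"cut-off lines: {unadjusted_line:.1f} (unadjusted), {bonferroni_line:.1f} (Bonferroni)")

# Hypothetical (fold change, p-value) pairs, for illustration only.
genes = {"geneA": (4.0, 1e-6), "geneB": (1.5, 0.01), "geneC": (0.5, 0.2)}
for name, (fold_change, p_value) in genes.items():
    x, y = volcano_point(fold_change, p_value)
    print(f"{name}: log2 FC = {x:+.2f}, -log10 p = {y:.2f}, "
          f"p < alpha: {p_value < alpha}, Bonferroni: {p_value < bonferroni_alpha}")
```

A gene such as the hypothetical geneB passes the unadjusted cut-off but not the Bonferroni cut-off, which illustrates how sharply the corrected threshold reduces the number of discoveries.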

The issue of significance levels of differential expression has been widely addressed in the literature, culminating in the development of methods designed specifically to address some of the shortfalls of techniques based on p-values, such as significance analysis of microarrays (SAM) [143]. SAM, which has now become a widely used tool for microarray data analysis, is essentially based on a standard t-statistic for replicate experiments, although a small additive term is used in the denominator to correct for the anomalous behavior of low-level signals. An important difference, however, is that SAM evaluates the false discovery rate (FDR), which is the estimated fraction of false positives among the genes declared significant, through a resampling method that uses random permutations of the samples. By adjusting the critical values accordingly, a biologist is able to control the FDR to an acceptable level and identify an appropriate number of genes for further investigation. More algorithmic details on this approach can be found in the original reference [143], as well as a recent review of the validity of the SAM methodology [144]. For routine data analysis, the software is freely available on the internet for academic users [145].
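The two ideas described above, a t-like statistic with a small additive term in the denominator and an FDR estimated from random permutations of the samples, can be sketched as follows. This is a simplified illustration and not the SAM algorithm of [143]. The constant `s0` is simply fixed by hand, a single symmetric threshold on |d| is used, and the data are simulated; all of these, along with the function names, are assumptions made for the example.

```python
import random
import statistics

def d_statistic(group1, group2, s0):
    """Two-sample t-like statistic with a small constant s0 added to the denominator."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    std_err = (pooled_var * (1.0 / n1 + 1.0 / n2)) ** 0.5
    return (statistics.mean(group1) - statistics.mean(group2)) / (std_err + s0)

def all_d(data, labels, s0):
    """Compute d for every gene; data is genes x samples, labels holds 0 or 1 per sample."""
    result = []
    for row in data:
        group1 = [v for v, lab in zip(row, labels) if lab == 1]
        group2 = [v for v, lab in zip(row, labels) if lab == 0]
        result.append(d_statistic(group1, group2, s0))
    return result

def estimate_fdr(data, labels, s0, threshold, n_perm=200, seed=0):
    """Return (genes called, permutation estimate of the FDR) for |d| >= threshold."""
    rng = random.Random(seed)
    observed = all_d(data, labels, s0)
    n_called = sum(1 for d in observed if abs(d) >= threshold)
    if n_called == 0:
        return 0, 0.0
    false_counts = []
    for _ in range(n_perm):
        shuffled = list(labels)
        rng.shuffle(shuffled)  # break any real association between samples and classes
        permuted = all_d(data, shuffled, s0)
        false_counts.append(sum(1 for d in permuted if abs(d) >= threshold))
    expected_false = statistics.mean(false_counts)
    return n_called, min(1.0, expected_false / n_called)

# Simulated log-ratio data: 50 genes x 8 samples, the first 5 genes shifted in class 1.
sim = random.Random(1)
labels = [1, 1, 1, 1, 0, 0, 0, 0]
data = []
for gene in range(50):
    shift = 3.0 if gene < 5 else 0.0
    data.append([sim.gauss(0.0, 0.5) + shift * lab for lab in labels])

n_called, fdr = estimate_fdr(data, labels, s0=0.1, threshold=4.0)
print(f"genes called: {n_called}, estimated FDR: {fdr:.3f}")
```

Raising or lowering `threshold` trades the number of genes called against the estimated FDR, which is the adjustment of critical values referred to in the text.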

Beyond basic hypothesis testing, a wide range of other multivariate statistical analysis tools have been applied to microarray data sets. These consist of both conventional methods and more novel techniques specifically designed for microarrays. Although a comprehensive review of all of these methods is beyond the scope of this article, a number of analysis strategies are becoming somewhat standard in the field and deserve mention. These fall into a variety of categories that include analysis of variance (ANOVA), cluster analysis, exploratory data analysis, classification and time series analysis. Where appropriately designed experiments with proper blocking and randomization have been performed, ANOVA presents a more powerful alternative to simple t-tests, but suffers from some of the same drawbacks associated with multiple testing. Where sufficient samples are available, clustering is a popular approach, with the advantage that it can be applied to observational studies as well as designed experiments. Methods such as hierarchical clustering and k-means clustering are commonly used. Dendrograms are typically presented in a manner peculiar to this area, where the clustering results for both the samples and the genes are presented on the same figure. The sample dendrogram is normally shown at the top of the page, with the gene dendrogram rotated by 90° and shown along the side of the page. In the rectangular region defined by the base of the two dendrograms, color-coded expression profiles are presented for each gene, with green indicating up-regulation and red indicating down-regulation (see Fig. 12 for an example). Alternatively, the ordering provided by the sample dendrogram on the horizontal axis can be replaced by some other grouping, such as time or sample class [146]. This permits a rapid visual assessment of the expression patterns characteristic for each gene and each sample that is more intuitive to biologists.

Fig. 12. Heat map resulting from a two-way hierarchical cluster analysis. Red indicates genes that are underexpressed, and green indicates genes that are overexpressed, in the test sample relative to the reference sample. The clustering according to genes is shown on the left, and the clustering according to sample is shown on the top. Figure modified from [147].
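The row and column ordering that underlies such a two-way heat map can be sketched in plain Python. The example below is an assumption-laden illustration rather than the procedure used for Fig. 12: it uses average linkage with a correlation-based distance (one common choice among several), a tiny hypothetical matrix of log-ratios, and function names invented for the purpose.

```python
import math

def pearson_distance(a, b):
    """Return 1 - Pearson correlation between two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # a flat profile is treated as uncorrelated with everything
    return 1.0 - cov / math.sqrt(var_a * var_b)

def leaf_order(profiles, distance=pearson_distance):
    """Average-linkage agglomerative clustering; returns the order of the dendrogram leaves."""
    clusters = [[i] for i in range(len(profiles))]

    def linkage(c1, c2):
        total = sum(distance(profiles[i], profiles[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = linkage(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        merged = clusters[p] + clusters[q]  # members of merged clusters stay adjacent
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)] + [merged]
    return clusters[0]

def two_way_order(log_ratios):
    """Reorder a genes x samples matrix by clustering its rows and its columns separately."""
    row_order = leaf_order(log_ratios)
    col_order = leaf_order([list(col) for col in zip(*log_ratios)])
    reordered = [[log_ratios[r][c] for c in col_order] for r in row_order]
    return reordered, row_order, col_order

# Hypothetical log2 ratios: 4 genes (rows) x 4 samples (columns).
log_ratios = [
    [2.0, 1.8, -1.9, -2.1],
    [-2.0, -1.7, 2.1, 1.9],
    [1.9, 2.1, -2.0, -1.8],
    [-1.8, -2.0, 1.8, 2.2],
]
heat_map, row_order, col_order = two_way_order(log_ratios)
print("gene order:", row_order)
print("sample order:", col_order)
```

The reordered matrix `heat_map` is what would then be color-coded, with the two leaf orders supplying the gene and sample dendrogram axes.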

For exploratory data analysis, principal components analysis and nonlinear mapping methods (a.k.a. multidimensional scaling) are also widely used so that the data can be visualized in a lower dimensional space. A variety of methods have also been used for classification purposes, including discriminant analysis, support vector machines, and artificial neural networks, although in early studies of microarrays some traditional classifiers were re-invented by those unfamiliar with existing techniques. Techniques for the analysis of serial microarray data are not yet as well-established as those for comparator experiments, but some rudimentary time series approaches have been used, as well as other techniques, such as independent component analysis and hidden Markov models. Recently, an application of multivariate curve resolution has also been reported for spotted microarrays [128].
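To illustrate how principal components analysis projects expression profiles into a lower dimensional space, the sketch below extracts the first principal component by power iteration on the column-centered data. It is a bare-bones illustration under stated assumptions: samples are rows and genes are columns, the four-sample data set is hypothetical, and only one component is computed. In practice a library routine based on the singular value decomposition would normally be used.

```python
import math

def center_columns(X):
    """Subtract the column mean from every column (rows are samples, columns are genes)."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]

def first_principal_component(X, n_iter=200):
    """Return (loadings, scores) of the first principal component, found by power iteration."""
    Xc = center_columns(X)
    n_vars = len(Xc[0])
    v = [1.0 / math.sqrt(n_vars)] * n_vars  # arbitrary unit-length starting vector
    for _ in range(n_iter):
        # One multiplication by Xc^T Xc, done in two steps to avoid forming the matrix.
        t = [sum(x * w for x, w in zip(row, v)) for row in Xc]
        w_new = [sum(row[j] * tj for row, tj in zip(Xc, t)) for j in range(n_vars)]
        norm = math.sqrt(sum(w * w for w in w_new))
        if norm == 0.0:
            break  # the centered data contain no variance
        v = [w / norm for w in w_new]
    scores = [sum(x * w for x, w in zip(row, v)) for row in Xc]
    return v, scores

# Hypothetical log-ratios: two samples from class A and two from class B, three genes each.
X = [
    [2.0, 2.0, 0.0],
    [2.1, 1.9, 0.1],
    [-2.0, -2.0, 0.0],
    [-1.9, -2.1, -0.1],
]
loadings, scores = first_principal_component(X)
print("loadings:", [round(w, 3) for w in loadings])
print("scores:", [round(s, 3) for s in scores])
```

Plotting the scores of the first two or three components against one another gives the low-dimensional view of the samples referred to in the text; the sign of a component is arbitrary.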

4.11. Caveats for chemometrics

As previously noted, the range of techniques that have been applied to microarray data is extensive and beyond the scope of this review. Rather, the objective of this work has been to provide insight into the nature of the microarray measurements themselves, as such a description of the various measurement aspects has been lacking, especially in the chemometrics literature. From a chemometrics perspective, a number of aspects should be emphasized for anyone undertaking an analysis of this type of data.

First, experimental designs, especially for early work, are replete with examples of confounded variables. Therefore, caution should be used to avoid over-interpreting the results of any multivariate analysis performed on these data. Second, most microarray data analyzed in the literature use log-transformed ratios rather than ratios. This is done as a variance stabilization technique and is so commonplace that the transformation is often not even mentioned. Depending on the model being applied, this may have implications with respect to areas such as linearity, noise distribution, and scaling. Third, missing data and heteroscedasticity are common issues that need to be addressed in any multivariate analysis. Unlike conventional analytical instruments, microarray measurements can exhibit extreme non-uniformity in measurement variances with no particular structure.
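The variance-stabilizing effect of the log transformation can be seen in a small simulation. The sketch assumes purely multiplicative (log-normal) noise with a 20% relative standard deviation, which is an idealization: error models such as those in [109,113] also include an additive component that dominates at low intensities, where the log transform stabilizes the variance less well. The signal levels and function name are hypothetical.

```python
import math
import random
import statistics

def simulate_intensities(true_level, rel_sd, n, rng):
    """Draw n intensities with purely multiplicative (log-normal) noise around true_level."""
    return [true_level * math.exp(rng.gauss(0.0, rel_sd)) for _ in range(n)]

rng = random.Random(42)
low = simulate_intensities(100.0, 0.2, 5000, rng)      # weak spot
high = simulate_intensities(10000.0, 0.2, 5000, rng)   # strong spot

# On the raw scale the standard deviation grows in proportion to the signal.
sd_raw_low = statistics.stdev(low)
sd_raw_high = statistics.stdev(high)

# On the log2 scale the standard deviation is roughly the same at both levels.
sd_log_low = statistics.stdev([math.log2(x) for x in low])
sd_log_high = statistics.stdev([math.log2(x) for x in high])

print(f"raw sd:  {sd_raw_low:.1f} (low) vs {sd_raw_high:.1f} (high)")
print(f"log2 sd: {sd_log_low:.3f} (low) vs {sd_log_high:.3f} (high)")
```

Because the transformation changes the noise structure in this way, methods that assume uniform, additive errors may behave quite differently on ratios than on log-ratios.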

Although some may regard these aspects as insurmountable, the authors believe that these are in fact areas where chemometrics can make the most contribution, and represent great opportunities for those interested in this area of research.

5. Summary

Transcriptomics in general, and DNA microarrays in particular, provide a window into the inner workings of the cell. However, from an analytical measurement perspective, DNA microarray measurements present numerous challenges to researchers. Although the past two decades have seen an incredible amount of work performed to reduce the many sources of variance in the resultant measurements, the technology platform itself constrains any proposed solution to mitigate variances of the final measurement, whether they are intensities or ratios.

Due to the conceptual simplicity of microarrays, their acceptance as a standard molecular biology technique, the rise of biological studies performed from a system-wide perspective, and the need for further multivariate analyses, we foresee chemometrics playing a larger role in the analysis of DNA microarray experiments. Therefore, we believe it is vitally important that researchers are aware of the nature and the inherent limitations of the measurements with which they are working. It is hoped that this article has contributed to achieving that end.

Declaration

This is NRCC publication number 51755.

References

[1] R.B. Stoughton, Applications of DNA microarrays in biology, Annu. Rev. Biochem. 74 (2005) 53–82.

[2] J. Quackenbush, Computational analysis of microarray data, Nat. Rev. Genet. 2 (2001) 418–427.

[3] F. Katagiri, J. Glazebrook, Overview of mRNA expression profiling using DNA microarrays, Curr. Protoc. Mol. Biol. 85 (2009) 22.4.1–22.4.13.

[4] D.V. Nguyen, A.B. Arpat, N. Wang, R.J. Carroll, DNA microarray experiments: biological and technological aspects, Biometrics 58 (2002) 701–717.

[5] J.S. Verducci, V.F. Melfi, S. Lin, Z. Wang, S. Roy, C.K. Sen, Microarray analysis of gene expression: considerations in data mining and statistical treatment, Physiol. Genomics 25 (2006) 355–363.

[6] Microarray data analysis, [http://www.nslij-genetics.org/microarray/], Date last accessed: March 29, 2010.

[7] K.V. Ewart, J.C. Belanger, J. Williams, T. Karakach, S. Penny, S.C.M. Tsoi, R.C. Richards, S.E. Douglas, Identification of genes differentially expressed in Atlantic salmon (Salmo salar) in response to infection by Aeromonas salmonicida using cDNA microarray technology, Dev. Comp. Immunol. 29 (2005) 333–347.

[8] M.C. Pirrung, How to make a DNA chip, Angew. Chem. Int. Ed. Engl. 41 (2002) 1276–1289.

[9] S.P. Fodor, J.L. Read, M.C. Pirrung, L. Stryer, A.T. Lu, D. Solas, Light-directed, spatially addressable parallel chemical synthesis, Science 251 (1991) 767–773.

[10] S. Singh-Gasson, R.D. Green, Y. Yue, C. Nelson, F. Blattner, M.R. Sussman, F. Cerrina, Maskless fabrication of light-directed oligonucleotide microarrays using a digital micromirror array, Nat. Biotechnol. 17 (1999) 974–978.

[11] K. Maurer, J. Cooper, M. Caraballo, J. Crye, D. Suciu, A. Ghindilis, J.A. Leonetti, W. Wang, F.M. Rossi, A.G. Stöver, C. Larson, H. Gao, K. Dill, A. McShea, Electrochemically generated acid and its containment to 100 micron reaction areas for the production of DNA microarrays, PLoS ONE 1 (2006) e34.

[12] A.P. Blanchard, R.J. Kaiser, L.E. Hood, High-density oligonucleotide arrays, Biosens. Bioelectron. 11 (1996) 687–690.

[13] M. Schena, D. Shalon, R.W. Davis, P.O. Brown, Quantitative monitoring of gene expression patterns with a complementary DNA microarray, Science 270 (1995) 467–470.

[14] D.J. Hall, Inkjet technology for precise, high throughput manufacture of protein microarrays, Advances in Microarray Technology, [http://mms.technologynet-works.net/hall/player.html] (2005).

[15] R.G. Sosnowski, E. Tu, W.F. Butler, J.P. O'Connell, M.J. Heller, Rapid determination of single base mismatch mutations in DNA hybrids by direct field control, Proc. Natl. Acad. Sci. U. S. A. 94 (1997) 1119–1123.

[16] J.R. Pollack, C.M. Perou, A.A. Alizadeh, M.B. Eisen, A. Pergamenschikov, C.F. Williams, S.S. Jeffrey, D. Botstein, P.O. Brown, Genome-wide analysis of DNA copy-number changes using cDNA microarrays, Nat. Genet. 23 (1999) 41–46.

[17] N. Patil, N. Nouri, L. McAllister, H. Matsukaki, T. Ryder, Single-nucleotide polymorphism genotyping using microarrays, Curr. Protoc. Hum. Genet. (2001) Chapter 2, Unit 2.9.

[18] G.K. Hu, S.J. Madore, B. Moldover, T. Jatkoe, D. Balaban, J. Thomas, Y. Wang, Predicting splice variant from DNA chip expression data, Genome Res. 11 (2001) 1237–1245.

[19] J.M. Thomson, J. Parker, C.M. Perou, S.M. Hammond, A custom microarray platform for analysis of microRNA gene expression, Nat. Meth. 1 (2004) 47–53.

[20] G.A. Churchill, Fundamentals of experimental design for cDNA microarrays, Nat. Genet. 32 (Suppl.) (2002) 490–495.

[21] R.M. Simon, K. Dobbin, Experimental design of DNA microarray experiments, BioTechniques Suppl. (2003) 16–21.

[22] A.A. Alizadeh, M.B. Eisen, R.E. Davis, C. Ma, I.S. Lossos, A. Rosenwald, J.C. Boldrick, H. Sabet, T. Tran, X. Yu, J.I. Powell, L. Yang, G.E. Marti, T. Moore, J. Hudson, L. Lu, D.B. Lewis, R. Tibshirani, G. Sherlock, W.C. Chan, T.C. Greiner, D.D. Weisenburger, J.O. Armitage, R. Warnke, R. Levy, W. Wilson, M.R. Grever, J.C. Byrd, D. Botstein, P.O. Brown, L.M. Staudt, Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling, Nature 403 (2000) 503–511.

[23] T.R. Golub, D.K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J.P. Mesirov, H. Coller, M.L. Loh, J.R. Downing, M.A. Caligiuri, C.D. Bloomfield, E.S. Lander, Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, Science 286 (1999) 531–537.

[24] Z. Bar-Joseph, Analyzing time series gene expression data, Bioinformatics 20 (2004) 2493–2503.

[25] P.T. Spellman, G. Sherlock, M.Q. Zhang, V.R. Iyer, K. Anders, M.B. Eisen, P.O. Brown, D. Botstein, B. Futcher, Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization, Mol. Biol. Cell 9 (1998) 3273–3297.

[26] M. Werner-Washburne, B. Wylie, K. Boyack, E. Fuge, J. Galbraith, J. Weber, G. Davidson, Comparative analysis of multiple genome-scale data sets, Genome Res. 12 (2002) 1564–1573.

[27] E.F. Nuwaysir, M. Bittner, J. Trent, J.C. Barrett, C.A. Afshari, Microarrays and toxicology: the advent of toxicogenomics, Mol. Carcinog. 24 (1999) 153–159.

[28] M.J. Marton, J.L. DeRisi, H.A. Bennett, V.R. Iyer, M.R. Meyer, C.J. Roberts, R. Stoughton, J. Burchard, D. Slade, H. Dai, D.E. Bassett, L.H. Hartwell, P.O. Brown, S.H. Friend, Drug target validation and identification of secondary drug target effects using DNA microarrays, Nat. Med. 4 (1998) 1293–1301.

[29] M. Shapira, E. Segal, D. Botstein, Disruption of yeast forkhead-associated cell cycle transcription by oxidative stress, Mol. Biol. Cell 15 (2004) 5659–5669.

[30] K. Dobbin, R. Simon, Comparison of microarray designs for class comparison and class discovery, Bioinformatics 18 (2002) 1438–1445.

[31] Z. Bozdech, M. Llinas, B.L. Pulliam, E.D. Wong, J.C. Zhu, J.L. DeRisi, The transcriptome of the intraerythrocytic developmental cycle of Plasmodium falciparum, PLoS Biol. 1 (2003) 85–100.

[32] M. Llinás, Z. Bozdech, E.D. Wong, A.T. Adai, J.L. DeRisi, Comparative whole genome transcriptome analysis of three Plasmodium falciparum strains, Nucleic Acids Res. 34 (2006) 1166–1173.

[33] M.N. Arbeitman, E.E.M. Furlong, F. Imam, E. Johnson, B.H. Null, B.S. Baker, M.A. Krasnow, M.P. Scott, R.W. Davis, K.P. White, Gene expression during the life cycle of Drosophila melanogaster, Science 297 (2002) 2270–2275.

[34] A.P. Gasch, P.T. Spellman, C.M. Kao, O. Carmel-Harel, M.B. Eisen, G. Storz, D. Botstein, P.O. Brown, Genomic expression programs in the response of yeast cells to environmental changes, Mol. Biol. Cell 11 (2000) 4241–4257.

[35] M.J. Martinez, S. Roy, A.B. Archuletta, P.D. Wentzell, S.S. Anna-Arriola, A.L. Rodriguez, A.D. Aragon, G. Quinones, C. Allen, M. Werner-Washburne, Genomic analysis of stationary phase and exit in Saccharomyces cerevisiae: gene expression and identification of novel essential genes, Mol. Biol. Cell 15 (2004) 5295–5305.

[36] Y.H. Yang, T. Speed, Design issues for cDNA microarray experiments, Nat. Rev. Genet. 3 (2002) 579–588.

[37] G.F.V. Glonek, P.J. Solomon, Factorial and time course designs for cDNA microarray experiments, Biostatistics 5 (2004) 89–111.

[38] M.K. Kerr, G.A. Churchill, Experimental design for gene expression microarrays, Biostatistics 2 (2001) 183–201.

[39] D.M. Rocke, Design and analysis of experiments with high throughput biological assay data, Semin. Cell Dev. Biol. 15 (2004) 703–713.

[40] M.K. Kerr, Design considerations for efficient and effective microarray studies, Biometrics 59 (2003) 822–828.


[41] W. Jin, R.M. Riley, R.D. Wolfinger, K.P. White, G. Passador-Gurgel, G. Gibson, The contributions of sex, genotype and age to transcriptional variance in Drosophila melanogaster, Nat. Genet. 29 (2001) 389–395.

[42] X. Peng, C. Wood, E. Blalock, K. Chen, P. Landfield, A. Stromberg, Statistical implications of pooling RNA samples for microarray experiments, BMC Bioinform. 4 (2003) 26.

[43] M.K. Kerr, M. Martin, G.A. Churchill, Analysis of variance for gene expression microarray data, J. Comput. Biol. 7 (2000) 819–837.

[44] M.K. Kerr, G.A. Churchill, Statistical design and the analysis of gene expression microarray data, Genet. Res. 77 (2001) 123–128.

[45] E.V. Thomas, K.H. Phillippy, B. Brahamsha, D.M. Haaland, J.A. Timlin, L.D.H. Elbourne, B. Palenik, I.T. Paulsen, Statistical analysis of microarray data with replicated spots: a case study with Synechococcus WH8102, Comp. Funct. Genomics 2009 (2009) 950171.

[46] M. Liang, A.G. Briggs, E. Rute, A.S. Greene, J. Cowley, Quantitative assessment of the importance of dye switching and biological replication in cDNA microarray studies, Physiol. Genomics 14 (2003) 199–207.

[47] K. Dobbin, J.H. Shih, R. Simon, Statistical design of reverse dye microarrays, Bioinformatics 19 (2003) 803–810.

[48] V. Folsom, M.J. Hunkeler, A. Haces, J.D. Harding, Detection of DNA targets with biotinylated and fluoresceinated RNA probes. Effects of the extent of derivitization on detection sensitivity, Anal. Biochem. 182 (1989) 309–314.

[49] J.B. Randolph, A.S. Waggoner, Stability, specificity and fluorescence brightness of multiply-labeled fluorescent DNA probes, Nucleic Acids Res. 25 (1997) 2923–2929.

[50] A. Badiee, H.G. Eiken, V.M. Steen, R. Løvlie, Evaluation of five different cDNA labeling methods for microarrays using spike controls, BMC Biotechnol. 3 (2003) 23.

[51] A. Richter, C. Schwager, S. Hentze, W. Ansorge, M.W. Hentze, M. Muckenthaler, Comparison of fluorescent tag DNA labeling methods used for expression analysis by DNA microarrays, Biotechniques 33 (2002) 620–628, 630.

[52] J. Haralambidis, M. Chai, G.W. Tregear, Preparation of base-modified nucleosides suitable for non-radioactive label attachment and their incorporation into synthetic oligodeoxyribonucleotides, Nucleic Acids Res. 15 (1987) 4857–4876.

[53] G. Wallner, R. Amann, W. Beisker, Optimizing fluorescent in situ hybridization with rRNA-targeted oligonucleotide probes for flow cytometric identification of microorganisms, Cytometry 14 (1993) 136–143.

[54] E. Manduchi, L.M. Scearce, J.E. Brestelli, G.R. Grant, K.H. Kaestner, C.J. Stoeckert, Comparison of different labeling methods for two-channel high-density microarray experiments, Physiol. Genomics 10 (2002) 169–179.

[55] J. Yu, M.I. Othman, R. Farjo, S. Zareparsi, S.P. MacNee, S. Yoshida, A. Swaroop, Evaluation and optimization of procedures for target labeling and hybridization of cDNA microarrays, Mol. Vis. 8 (2002) 130–137.

[56] J. Wang, M. Jiang, T.W. Nilsen, R.C. Getts, Dendritic nucleic acid probes for DNA biosensors, J. Am. Chem. Soc. 120 (1998) 8281–8282.

[57] R.L. Stears, R.C. Getts, S.R. Gullans, A novel, sensitive detection system for high-density microarrays using dendrimer technology, Physiol. Genomics 3 (2000) 93–99.

[58] M. Schena, Microarray Analysis, Wiley, Hoboken, NJ, 2003.

[59] S. Capaldi, R.C. Getts, S.D. Jayasena, Signal amplification through nucleotide extension and excision on a dendritic DNA platform, Nucleic Acids Res. 28 (2000) e21.

[60] T.L. Fare, E.M. Coffey, H. Dai, Y.D. He, D.A. Kessler, K.A. Kilian, J.E. Koch, E. LeProust, M.J. Marton, M.R. Meyer, R.B. Stoughton, G.Y. Tokiwa, Y. Wang, Effects of atmospheric ozone on microarray data quality, Anal. Chem. 75 (2003) 4672–4675.

[61] Ozone Prevention, [http://cmgm.stanford.edu/pbrown/protocols/Ozone_Pre-vention.pdf], Date last accessed: April 1, 2010.

[62] Genisphere—3DNA Array Detection DyeSaver™2, [http://www.genisphere.com/array_detection_dyesaver.html] , Date last accessed: April 1, 2010.

[63] M. Dar, T. Giesler, R. Richardson, C. Cai, M. Cooper, S. Lavasani, P. Kille, T. Voet, J. Vermeesch, Development of a novel ozone- and photo-stable HyPer5 red fluorescent dye for array CGH and microarray gene expression analysis with consistent performance irrespective of environmental conditions, BMC Biotechnol. 8 (2008) 86.

[64] U. Maskos, E.M. Southern, Oligonucleotide hybridizations on glass supports: a novel linker for oligonucleotide synthesis and hybridization properties of oligonucleotides synthesised in situ, Nucleic Acids Res. 20 (1992) 1679–1684.

[65] K.R. Khrapko, Yu.P. Lysov, A.A. Khorlyn, V.V. Shick, V.L. Florentiev, A.D. Mirzabekov, An oligonucleotide hybridization approach to DNA sequencing, FEBS Lett. 256 (1989) 118–122.

[66] V.G. Cheung, M. Morley, F. Aguilar, A. Massimi, R. Kucherlapati, G. Childs, Making and reading microarrays, Nat. Genet. 21 (1999) 15–19.

[67] T.D. Shalon, DNA micro arrays: a new tool for genetic analysis, Ph.D. dissertation, Stanford University, 1996.

[68] P. Hegde, R. Qi, K. Abernathy, C. Gay, S. Dharap, R. Gaspard, J.E. Hughes, E. Snesrud, N. Lee, J. Quackenbush, A concise guide to cDNA microarray analysis, BioTechniques 29 (2000) 548–550, 552–554, 556 passim.

[69] C. Romualdi, S. Trevisan, B. Celegato, G. Costa, G. Lanfranchi, Improved detection of differentially expressed genes in microarray experiments through multiple scanning and image integration, Nucleic Acids Res. 31 (2003) e149.

[70] G. Kamberova, S. Shah, DNA Array Image Analysis: Nuts and Bolts, DNA Press, New York, 2002.

[71] Y.H. Yang, M.J. Buckley, S. Dudoit, T.P. Speed, Comparison of methods for image analysis on cDNA microarray data, J. Comput. Graph. Statist. 11 (2002) 108–136.

[72] CAMDA 2004 Conference Contest Datasets, [http://www.camda.duke.edu/camda04/datasets/index.html], Date last accessed: April 1, 2010.

[73] J.A. Timlin, D.M. Haaland, M.B. Sinclair, A.D. Aragon, M.J. Martinez, M. Werner-Washburne, Hyperspectral microarray scanning: impact on the accuracy and reliability of gene expression data, BMC Genomics 6 (2005) 72.

[74] R.S.H. Istepanian, Microarray image processing: current status and future directions, IEEE Trans. Nanobioscience 2 (2003) 173–175.

[75] Y.H. Yang, M.J. Buckley, T.P. Speed, Analysis of cDNA microarray images, Brief. Bioinform. 2 (2001) 341–349.

[76] Y. Chen, V. Kamat, E.R. Dougherty, M.L. Bittner, P.S. Meltzer, J.M. Trent, Ratio statistics of gene expression levels and applications to microarray data analysis, Bioinformatics 18 (2002) 1207–1215.

[77] J. Buhler, T. Ideker, D. Haynor, Dapple: Improved techniques for finding spots on DNA microarrays, University of Washington [http://www.cs.wustl.edu/∼j-buhler/dapple/dapple-tr.pdf] (2000), Date last accessed: April 1, 2010.

[78] X. Wang, S. Ghosh, S.W. Guo, Quantitative quality control in microarray image processing and data acquisition, Nucleic Acids Res. 29 (2001) E75-5.

[79] A.N. Jain, T.A. Tokuyasu, A.M. Snijders, R. Segraves, D.G. Albertson, D. Pinkel, Fully automatic quantification of microarray image data, Genome Res. 12 (2002) 325–332.

[80] M.B. Eisen, ScanAlyze user manual, [http://rana.lbl.gov/manuals/ScanAlyzeDoc.pdf], Date last accessed: April 1, 2010.

[81] TM4: Spotfinder, [http://www.tm4.org/spotfinder.html], Date last accessed: April 1, 2010.

[82] C.A. Glasbey, P. Ghazal, Combinatorial image analysis of DNA microarray features, Bioinformatics 19 (2003) 194–203.

[83] E.E. Schadt, C. Li, B. Ellis, W.H. Wong, Feature extraction and normalization algorithms for high-density oligonucleotide gene expression array data, J. Cell. Biochem. Suppl. 37 (2001) 120–125.

[84] Ø. Gjerstad, Å. Aakra, L. Snipen, U. Indahl, Probabilistically assisted spot segmentation – with application to DNA microarray images, Chemometr. Intell. Lab. Syst. 98 (2009) 1–9.

[85] Y. Chen, E.R. Dougherty, M.L. Bittner, Ratio-based decisions on the quantitative analysis of cDNA microarray images, J. Biomed. Opt. 2 (1997) 364–374.

[86] J.P. Brody, B.A. Williams, B.J. Wold, S.R. Quake, Significance and statistical errors in the analysis of DNA microarray data, Proc. Natl. Acad. Sci. U. S. A. 99 (2002) 12975–12978.

[87] T.K. Karakach, R.M. Flight, P.D. Wentzell, Bootstrap method for the estimation of measurement uncertainty in spotted dual-color DNA microarrays, Anal. Bioanal. Chem. 389 (2007) 2125–2141.

[88] L. Qin, K.F. Kerr, Empirical evaluation of data transformations and ranking statistics for microarray analysis, Nucleic Acids Res. 32 (2004) 5471–5479.

[89] Y. Fang, A. Brass, D.C. Hoyle, A. Hayes, A. Bashein, S.G. Oliver, D. Waddington, M. Rattray, A model-based analysis of microarray experimental error and normalisation, Nucleic Acids Res. 31 (2003) e96.

[90] P. Brzoska, Background analysis and cross hybridization, Agilent Technologies [http://www.chem.agilent.com/Library/technicaloverviews/Public/5988-2363%20Bknd%20Analys.pdf] (2001) Publication 5988-236EN, Date last accessed: May 11, 2010.

[91] J.M. Arteaga-Salas, H. Zuzan, W.B. Langdon, G.J.G. Upton, A.P. Harrison, An overview of image-processing methods for Affymetrix GeneChips, Brief. Bioinform. 9 (2008) 25–33.

[92] Statistical algorithms description document, Affymetrix [http://www.affymetrix.com/support/technical/whitepapers/sadd_whitepaper.pdf] (2002), Date last accessed: April 1, 2010.

[93] C. Li, W.H. Wong, Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection, Proc. Natl. Acad. Sci. U. S. A. 98 (2001) 31–36.

[94] R.A. Irizarry, B.M. Bolstad, F. Collin, L.M. Cope, B. Hobbs, T.P. Speed, Summaries of Affymetrix GeneChip probe level data, Nucleic Acids Res. 31 (2003) e15.

[95] R.A. Irizarry, B. Hobbs, F. Collin, Y.D. Beazer-Barclay, K.J. Antonellis, U. Scherf, T.P. Speed, Exploration, normalization, and summaries of high density oligonucleotide array probe level data, Biostatistics 4 (2003) 249–264.

[96] B. Bolstad, R. Irizarry, M. Astrand, T. Speed, A comparison of normalization methods for high density oligonucleotide array data based on variance and bias, Bioinformatics 19 (2003) 185–193.

[97] R. Gentleman, V. Carey, D. Bates, B. Bolstad, M. Dettling, S. Dudoit, B. Ellis, L. Gautier, Y. Ge, J. Gentry, K. Hornik, T. Hothorn, W. Huber, S. Iacus, R. Irizarry, F. Leisch, C. Li, M. Maechler, A. Rossini, G. Sawitzki, C. Smith, G. Smyth, L. Tierney, J. Yang, J. Zhang, Bioconductor: open software development for computational biology and bioinformatics, Genome Biol. 5 (2004) R80.

[98] Guide to probe logarithmic intensity error (PLIER) estimation, Affymetrix [http://www.affymetrix.com/support/technical/technotes/plier_technote.pdf] (2005), Date last accessed: April 1, 2010.

[99] D.J. Lockhart, H. Dong, M.C. Byrne, M.T. Follettie, M.V. Gallo, M.S. Chee, M. Mittmann, C. Wang, M. Kobayashi, H. Norton, E.L. Brown, Expression monitoring by hybridization to high-density oligonucleotide arrays, Nat. Biotechnol. 14 (1996) 1675–1680.

[100] B. Bolstad, Low level analysis of high-density oligonucleotide array data: background, normalization and summarization, Ph.D. Dissertation, University of California, Berkeley, 2004.

[101] L. Gautier, M. Moller, L. Friis-Hansen, S. Knudsen, Alternative mapping of probes to genes for Affymetrix chips, BMC Bioinform. 5 (2004) 111.

[102] Z. Wu, R. Irizarry, R. Gentleman, F.M. Murillo, F. Spencer, A model based background adjustment for oligonucleotide expression arrays, Johns Hopkins University [http://www.bepress.com/jhubiostat/paper1/] (2004), Date last accessed: April 1, 2010.


[103] J. Nuñez-Garcia, V. Mersinias, K. Cho, C.P. Smith, O. Wolkenhauer, The statistical distribution of the intensity of pixels within spots of DNA microarrays: what is the appropriate single-value representative? Appl. Bioinform. 2 (2003) 229–239.

[104] C.S. Brown, P.C. Goodwin, P.K. Sorger, Image metrics in the statistical analysis of DNA microarray data, Proc. Natl. Acad. Sci. U. S. A. 98 (2001) 8944–8949.

[105] M.A. Newton, C.M. Kendziorski, C.S. Richmond, F.R. Blattner, K.W. Tsui, On differential variability of expression ratios: improving statistical inference about gene expression changes from microarray data, J. Comput. Biol. 8 (2001) 37–52.

[106] P.H. Tran, D.A. Peiffer, Y. Shin, L.M. Meek, J.P. Brody, K.W.Y. Cho, Microarray optimizations: increasing spot accuracy and automated identification of true microarray signals, Nucleic Acids Res. 30 (2002) e54.

[107] M.C.K. Yang, Q.G. Ruan, J.J. Yang, S. Eckenrode, S. Wu, R.A. McIndoe, J.X. She, A statistical method for flagging weak spots improves normalization and ratio estimates in microarrays, Physiol. Genomics 7 (2001) 45–53.

[108] R.L. Stears, T. Martinsky, M. Schena, Trends in microarray analysis, Nat. Med. 9 (2003) 140–145.

[109] D.M. Rocke, B. Durbin, A model for measurement error for gene expression arrays, J. Comput. Biol. 8 (2001) 557–569.

[110] R.D. Wolfinger, G. Gibson, E.D. Wolfinger, L. Bennett, H. Hamadeh, P. Bushel, C. Afshari, R.S. Paules, Assessing gene significance from cDNA microarray expression data via mixed models, J. Comput. Biol. 8 (2001) 625–637.

[111] B.P. Durbin, J.S. Hardin, D.M. Hawkins, D.M. Rocke, A variance-stabilizing transformation for gene-expression microarray data, Bioinformatics 18 (Suppl 1) (2002) S105–S110.

[112] W. Huber, A.V. Heydebreck, H. Sültmann, A. Poustka, M. Vingron, Variance stabilization applied to microarray data calibration and to quantification of differential expression, Bioinformatics 18 (2002) S96–S104.

[113] D.M. Rocke, S. Lorenzato, A two component model for measurement error in analytical chemistry, Technometrics 37 (1995) 176–184.

[114] T.K. Karakach, P.D. Wentzell, Methods for estimating and mitigating errors in spotted, dual-color DNA microarrays, OMICS 11 (2007) 186–199.

[115] J. Quackenbush, Microarray data normalization and transformation, Nat. Genet. 32 (2002) 496–501.

[116] T. Kepler, L. Crosby, K. Morgan, Normalization and analysis of DNA microarray data by self-consistency and local regression, Genome Biol. 3 (2002) research0037.1-research0037.12.

[117] G.K. Smyth, T. Speed, Normalization of cDNA microarray data, Methods 31 (2003) 265–273.

[118] S. Dudoit, Y.H. Yang, M.J. Callow, T.P. Speed, Statistical methods for identifyingdifferentially expressed genes in replicated cDNA microarray experiments, Stat.Sinica 12 (2002) 111–139.

[119] T. Suzuki, P.J. Higgins, D.R. Crawford, Control selection for RNA quantitation, BioTechniques 29 (2000) 332–337.

[120] T. Park, S. Yi, S. Kang, S. Lee, Y. Lee, R. Simon, Evaluation of normalization methods for microarray data, BMC Bioinform. 4 (2003) 33.

[121] Y.H. Yang, S. Dudoit, P. Luu, D.M. Lin, V. Peng, J. Ngai, T.P. Speed, Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation, Nucleic Acids Res. 30 (2002) e15.

[122] W.S. Cleveland, Robust locally weighted regression and smoothing scatterplots, J. Am. Stat. Assoc. 74 (1979) 829–836.

[123] Human Genome Expression Index, [http://www.biotechnologycenter.org/hio/databases/index.html], Date last accessed: April 1, 2010.

[124] J. DeRisi, L. Penland, P.O. Brown, M.L. Bittner, P.S. Meltzer, M. Ray, Y. Chen, Y.A. Su, J.M. Trent, Use of a cDNA microarray to analyse gene expression patterns in human cancer, Nat. Genet. 14 (1996) 457–460.

[125] A.H. Khimani, A.M. Mhashilkar, A. Mikulskis, M. O'Malley, J. Liao, E.E. Golenko, P. Mayer, S. Chada, J.B. Killian, S.T. Lott, Housekeeping genes in cancer: normalization of array data, BioTechniques 38 (2005) 739–745.

[126] T.T. Ni, W.J. Lemon, Y. Shyr, T.P. Zhong, Use of normalization methods for analysis of microarrays containing a high degree of gene effects, BMC Bioinform. 9 (2008) 505.

[127] G. Wang, A.V. Kossenkov, M.F. Ochs, LS-NMF: a modified non-negative matrix factorization algorithm utilizing uncertainty estimates, BMC Bioinform. 7 (2006) 175.

[128] P.D. Wentzell, T.K. Karakach, S. Roy, M.J. Martinez, C.P. Allen, M. Werner-Washburne, Multivariate curve resolution of time course microarray data, BMC Bioinform. 7 (2006) 343.

[129] O. Troyanskaya, M. Cantor, G. Sherlock, P. Brown, T. Hastie, R. Tibshirani, D. Botstein, R.B. Altman, Missing value estimation methods for DNA microarrays, Bioinformatics 17 (2001) 520–525.

[130] S. Oba, M. Sato, I. Takemasa, M. Monden, K. Matsubara, S. Ishii, A Bayesian missing value estimation method for gene expression profile data, Bioinformatics 19 (2003) 2088–2096.

[131] Z. Bar-Joseph, G.K. Gerber, D.K. Gifford, T.S. Jaakkola, I. Simon, Continuous representations of time-series gene expression data, J. Comput. Biol. 10 (2003) 341–356.

[132] X. Zhou, X. Wang, E.R. Dougherty, Missing-value estimation using linear and non-linear regression with Bayesian gene selection, Bioinformatics 19 (2003) 2302–2307.

[133] M.S.B. Sehgal, I. Gondal, L.S. Dooley, Collateral missing value imputation: a new robust missing value estimation algorithm for microarray data, Bioinformatics 21 (2005) 2417–2423.

[134] H. Kim, G.H. Golub, H. Park, Missing value estimation for DNA microarray gene expression data: local least squares imputation, Bioinformatics 21 (2005) 187–198.

[135] R. Jörnsten, H. Wang, W.J. Welsh, M. Ouyang, DNA microarray data imputation and significance analysis of differential expression, Bioinformatics 21 (2005) 4155–4161.

[136] G. Feten, T. Almøy, A.H. Aastveit, Prediction of missing values in microarray and use of mixed models to evaluate the predictors, Stat. Appl. Genet. Mol. Biol. 4 (2005) Article 10.

[137] X. Gan, A.W. Liew, H. Yan, Microarray missing data imputation based on a set theoretic framework and biological knowledge, Nucleic Acids Res. 34 (2006) 1608–1619.

[138] J. Tuikkala, L. Elo, O.S. Nevalainen, T. Aittokallio, Improving missing value estimation in microarray data with gene ontology, Bioinformatics 22 (2006) 566–572.

[139] X. Wang, A. Li, Z. Jiang, H. Feng, Missing value estimation for DNA microarray gene expression data by Support Vector Regression imputation and orthogonal coding scheme, BMC Bioinform. 7 (2006) 32.

[140] P. Johansson, J. Häkkinen, Improving missing value imputation of microarray data by using spot quality weights, BMC Bioinform. 7 (2006) 306.

[141] G.N. Brock, J.R. Shaffer, R.E. Blakesley, M.J. Lotz, G.C. Tseng, Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes, BMC Bioinform. 9 (2008) 12.

[142] M. Celton, A. Malpertuy, G. Lelandais, A. de Brevern, Comparative analysis of missing value imputation methods to improve clustering and interpretation of microarray experiments, BMC Genomics 11 (2010) 15.

[143] V.G. Tusher, R. Tibshirani, G. Chu, Significance analysis of microarrays applied to the ionizing radiation response, Proc. Natl. Acad. Sci. U. S. A. 98 (2001) 5116–5121.

[144] S. Zhang, A comprehensive evaluation of SAM, the SAM R-package and a simple modification to improve its performance, BMC Bioinform. 8 (2007) 230.

[145] Significance Analysis of Microarrays, [http://www-stat.stanford.edu/~tibs/SAM/].

[146] M.B. Eisen, P.T. Spellman, P.O. Brown, D. Botstein, Cluster analysis and display of genome-wide expression patterns, Proc. Natl. Acad. Sci. U. S. A. 95 (1998) 14863–14868.

[147] C. Desert, M. Duclos, P. Blavy, F. Lecerf, F. Moreews, C. Klopp, M. Aubry, F. Herault, P. Le Roy, C. Berri, M. Douaire, C. Diot, S. Lagarrigue, Transcriptome profiling of the feeding-to-fasting transition in chicken liver, BMC Genomics 9 (2008) 611.
