

PRECLINICAL STUDY

Evidence for a transcriptional signature of breast cancer

Yumei Feng · Xiaoqing Li · Baocun Sun · Yuli Wang · Lina Zhang · Xiuhua Pan · Xiaohui Chen · Xiaoyan Wang · Jinfeng Wang · Xishan Hao

Received: 5 April 2009 / Accepted: 6 August 2009 / Published online: 2 September 2009

© Springer Science+Business Media, LLC. 2009

Abstract Cancer arises from a step-wise accumulation of genetic and epigenetic changes in oncogenes and tumor suppressor genes, followed by changes in transcription and protein profiles. To identify the intrinsic transcriptional features of breast cancer and to explore in more detail the molecular basis of breast carcinogenesis, genes differentially expressed between cancers and their paired normal breast samples in nine breast cancer patients were screened by microarray. Nine normal breast tissues and 49 breast cancer tissue samples were then clustered based on the set of differentially expressed genes. A transcriptional signature of breast cancer consisting of 188 differentially expressed genes was identified. This signature allowed the normal breast tissues to be distinguished from all of the breast cancer samples, and primary breast cancers could be classified into two phenotype-associated subgroups with different ER status and clinical outcome. Furthermore, the classification accuracy of the set of differentially expressed genes was validated in publicly available breast microarray data. Moreover, the differentially expressed genes could be grouped into five subclusters involved in different biological processes of carcinogenesis. Most genes in a given subcluster interacted within an independent subnetwork, and subnetworks could cross-talk through a set of signal molecules. Thus, the transcriptional signature identified here may be an intrinsic feature of breast cancer, and it may contribute to the molecular basis of breast carcinogenesis and the different phenotypes of breast cancer.

Keywords Breast cancer · Gene expression profiling · Diagnosis

Introduction

Cancer arises from a step-wise accumulation of genetic and epigenetic changes in oncogenes and tumor suppressor genes, followed by changes in transcription and protein profiles. Sporadic breast cancers show a wide variety of genomic alterations, and no specific genetic mutation profile has been reported in sporadic breast cancer. However, cancer cells seem to share a common set of molecular pathways that may govern the genesis of most types of human cancers [1]. Profiles of transcription and translation have been shown to change in specific ways among different kinds of cancer as a result of sequential mutation and signal amplification, and these differences can distinguish cancers from normal tissues [2–4]. Moreover, the different gene expression profiles are likely to reflect distinct tumor subtypes involving different phenotypes and clinical features

Electronic supplementary material The online version of this article (doi:10.1007/s10549-009-0505-z) contains supplementary material, which is available to authorized users.

Y. Feng · J. Wang (✉)
School of Pharmaceutical Science and Technology, Tianjin University, 300072 Tianjin, China
e-mail: [email protected]

Y. Feng · X. Li · Y. Wang · L. Zhang · X. Pan · X. Chen · X. Wang
Department of Biochemistry and Molecular Biology, Tianjin Medical University Cancer Institute and Hospital, Tianjin, China

Y. Feng · X. Li · B. Sun · X. Hao (✉)
Breast Cancer Prevention and Treatment Key Laboratory of the Ministry of Education, Tianjin Medical University Cancer Institute and Hospital, 300060 Tianjin, China
e-mail: [email protected]

B. Sun
Department of Pathology, Tianjin Medical University Cancer Institute and Hospital, Tianjin, China


Breast Cancer Res Treat (2010) 122:65–75

DOI 10.1007/s10549-009-0505-z


[5]. Changes in the expression level of cancer-related genes occur much earlier than morphological changes, and these expression changes lead to different degrees of differentiation of tissue cells. A characteristic expression profile may help pathologists and oncologists diagnose cancer before morphologic changes can be observed.

Transcriptional profiling involves differential mRNA expression of genes related to different aspects of carcinogenesis, such as cell growth and metabolism, cell differentiation, signal transduction and transcription regulation, cell adhesion and migration, and immune surveillance. The high-throughput microarray provides a powerful tool for detecting the expression of thousands of genes simultaneously and for gaining insights into the underlying molecular mechanism of carcinogenesis. New molecular tumor markers can potentially be used for more accurate diagnosis and as drug targets for effective individualized therapy.

To explore the molecular basis of breast carcinogenesis, we selected nine breast cancer patients, each with a tumor smaller than 5 cm in diameter and without lymph node metastasis. Differences in expression between the cancers and their paired normal breast samples were analyzed using a high-density microarray, and the nine normal breast samples and 49 breast cancer samples were clustered based on the set of differentially expressed genes. Public microarray data were used to validate the classification accuracy of the set of differentially expressed genes. To confirm the genes obtained by microarray, kallikrein 5 (KLK5), kallikrein 7 (KLK7), small inducible cytokine subfamily B (Cys-X-Cys) member 10 (CXCL10), collagen type XI alpha 1 (COL11A1), matrix metalloproteinase 3 (MMP3), and osteoglycin (OGN) were measured by real-time RT-PCR to estimate expression differences among 30 normal breast samples, 30 benign tumor samples, and 30 invasive ductal carcinoma samples.

Materials and methods

Clinical samples

All samples of normal breast tissues, benign breast tumors, and invasive breast cancers were collected from patients who underwent complete dissection of the breast and axillary lymph nodes (breast cancer patients) or local tumorectomy (patients with benign breast disease) at Tianjin Cancer Hospital, China, between January 2002 and June 2003. Forty-nine cases with a tumor diameter below 5 cm were selected for microarray analysis; this group consisted of 29 node-negative cases and 20 node-positive cases. Among these 49 cases, primary cancers and paired normal breast tissues were collected in nine lymph node-negative cases; only primary cancer samples were obtained from the other 40 cases. In addition, RNA was extracted and pooled equally from normal breast tissue taken from 32 patients with benign or malignant breast disease as control RNA. Thirty breast cancer patients (15 node-negative and 15 node-positive cases) and 30 patients with a benign adenoma were selected for real-time RT-PCR. Twelve cases were used in both microarray and real-time RT-PCR analyses.

All of the breast cancers were confirmed as invasive ductal carcinoma by hematoxylin–eosin (H&E) staining. Normal breast tissues were defined as breast tissues lying more than 5 cm away from the edge of tumors and were confirmed normal by pathologic analysis. Tissue samples were snap-frozen in liquid nitrogen and stored at -80 °C. ER and PR expression were scored as positive when more than 15% of the nuclei showed immunohistochemical staining. Her2 was defined as positive when more than 10% of the membrane showed staining by immunohistochemical assay. The use of these tissues was approved by the Institutional Review Board and the Research Ethics Committee of Tianjin Medical University.

RNA extraction

RNA was extracted with TRIZOL reagent (Invitrogen,

Gaithersburg, MD, USA) and purified with the RNeasy

mini kit (Qiagen, Valencia, CA, USA) according to the

manufacturer’s instructions. RNA quality was assessed by

formaldehyde agarose gel electrophoresis and quantified

spectrophotometrically.

Preparation of fluorescent dye-labeled DNA and hybridization

DNA labeled with fluorescent dye (Cy5-dCTP and Cy3-dCTP) was produced by Eberwine’s linear RNA amplification method [6] and subsequent enzymatic reaction as previously described [7]. In detail, double-stranded (ds) cDNA containing the T7 RNA polymerase promoter sequence (5′-AAACGACGGCCAGTGAATTGTAATACGACTCACTATAGGCGC-3′) was synthesized from 10 µg of total RNA using the cDNA Synthesis System Kit according to the manufacturer’s protocol (TaKaRa, Dalian, China). A T7-OligodT primer (5′-AAACGACGGCCAGTGAATTGTAATACGACTCACTATAGGCGCTTTTTTTTTTTTTTTTTV-3′) was used. Half of the eluted dscDNA product was subjected to an in vitro transcription reaction using the T7 RiboMAX Express Large Scale RNA Production System (Promega, Madison, WI, USA) [8]. The amplified RNA (aRNA) was purified with the RNeasy Mini kit (Qiagen) and labeled with a random primer labeling kit (TaKaRa) [9]. All samples were labeled with Cy5, and the control, pooled from 32 normal breast tissues, was labeled with Cy3. The labeled DNA was


purified, resuspended in Elution Buffer, and quantified. Labeled control and test samples were quantitatively adjusted based on the efficiency of Cy-dye incorporation and mixed with 30 µl of hybridization solution (3× SSC, 0.2% SDS, 25% formamide, and 5× Denhardt’s). DNA in the hybridization solution was denatured at 95 °C for 3 min prior to loading on a microarray. The human long oligonucleotide microarray was constructed by CapitalBio Corporation (Beijing, China). The microarray consists of 5′-amino-modified 70-mer probes representing 21,329 well-characterized human genes purchased from the Operon Company (www.operon.com). The array was hybridized at 42 °C overnight and washed with two consecutive washing solutions: 0.2% SDS, 2× SSC at 42 °C for 5 min, followed by 0.2% SSC for 5 min at room temperature.

Imaging and data analysis

Arrays were scanned with a ScanArray Express Scanner

(Packard Bioscience, Kanata, OT, USA), and the images

obtained were analyzed with GenePix Pro 4.0 (Axon

Instruments, Foster City, CA, USA). Normalization was

performed using the LOWESS program [10]. Genes with

Cy3 and/or Cy5 intensity values higher than 100 were

considered to be expressed genes. The Cy5 intensity value

of each gene was divided by the Cy3 intensity value (the

control), in order to determine a relative expression level

for the gene in the sample tissues relative to the control.
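The ratio calculation described above can be sketched as follows. This is a minimal illustration, not the GenePix or LOWESS processing used in the study: the function name `relative_expression`, the example intensities, and the choice to return `None` for spots that are not expressed are all assumptions made for the example, and the intensities are assumed to be already normalized.

```python
from typing import Optional

EXPRESSION_THRESHOLD = 100.0  # intensity cutoff stated in the text


def relative_expression(cy5: float, cy3: float) -> Optional[float]:
    """Return the Cy5/Cy3 ratio, or None if the gene is not considered expressed.

    A gene counts as expressed when its Cy3 and/or Cy5 intensity exceeds the
    threshold. Returning None otherwise is an assumption of this sketch.
    """
    if cy5 <= EXPRESSION_THRESHOLD and cy3 <= EXPRESSION_THRESHOLD:
        return None
    if cy3 <= 0:
        return None  # ratio is undefined without a positive control signal
    return cy5 / cy3


# Hypothetical normalized intensities: (Cy5 test sample, Cy3 pooled control)
spots = {"geneA": (800.0, 200.0), "geneB": (50.0, 60.0), "geneC": (120.0, 480.0)}
ratios = {gene: relative_expression(cy5, cy3) for gene, (cy5, cy3) in spots.items()}
print(ratios)
```

A ratio above 1 indicates higher signal in the test sample than in the pooled control, and a ratio below 1 indicates lower signal.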

Definition of differentially expressed genes and classification of samples

A gene was considered to be significantly differentially expressed (over-expressed or under-expressed) if the ratio of its expression level in the cancer sample to its expression level in normal breast tissue was higher than 4.0-fold (or lower than 0.25) in more than six of nine cases, with these cases showing the same trend in expression (up-regulation or down-regulation). In addition, a paired-samples two-sided t-test was performed for each of these genes. Genes with no significant P value (P > 0.05) were excluded. All of the differentially expressed genes were rank-ordered on the basis of the P value of the paired t-test between breast cancers and their paired normal breast samples. Then, the set of differentially expressed genes was optimized by sequential backward selection from the bottom of this rank-ordered list, evaluating its power for correct classification using leave-one-out cross validation. The optimal differentially expressed gene set was the one that reached the minimal error rate. The leave-one-out cross validation procedure was carried out using GeneCluster 2.0 software (http://www.broad.mit.edu/cancer/software/genecluster2/gc2.html). The functions of these differentially expressed genes were retrieved using GoMiner software (http://discover.nci.nih.gov/gominer/) [11]. Interactions between molecules were retrieved using String 8.0 (http://string.embl.de/). Cluster 3.0 and Treeview software (Stanford University) were used to carry out average linkage clustering. Samples were clustered based on the genes differentially expressed between breast cancers and normal breast tissues.
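The two screening criteria above (a consistent fold change and a paired t-test) can be sketched as follows. This is an illustrative reading of the rule, not the authors' code. The function names and expression values are hypothetical, "more than six of nine cases" is read as at least seven pairs, the t-test is assumed to be run on log2 ratios, and significance is checked against the two-sided 5% critical value of Student's t for 8 degrees of freedom (2.306), which applies only when exactly nine pairs are supplied.

```python
import math
from statistics import mean, stdev

T_CRIT_DF8 = 2.306  # two-sided 5% critical value of Student's t, 8 degrees of freedom


def fold_change_direction(ratios, up=4.0, down=0.25, min_cases=7):
    """Return 'up', 'down', or None for one gene's cancer/normal ratios."""
    n_up = sum(r > up for r in ratios)
    n_down = sum(r < down for r in ratios)
    if n_up >= min_cases:
        return "up"
    if n_down >= min_cases:
        return "down"
    return None


def paired_t_statistic(diffs):
    """Paired-samples t statistic computed from per-patient differences."""
    m, sd = mean(diffs), stdev(diffs)
    if sd == 0:
        return 0.0 if m == 0 else math.copysign(math.inf, m)
    return m / (sd / math.sqrt(len(diffs)))


def is_differentially_expressed(cancer, normal):
    """Apply both criteria to nine paired expression values for one gene."""
    ratios = [c / n for c, n in zip(cancer, normal)]
    if fold_change_direction(ratios) is None:
        return False
    log_ratios = [math.log2(r) for r in ratios]  # assumption: test on log2 scale
    return abs(paired_t_statistic(log_ratios)) > T_CRIT_DF8


# Hypothetical expression levels for one gene in nine cancer/normal pairs
cancer_levels = [900.0, 850.0, 1200.0, 400.0, 1000.0, 950.0, 700.0, 1100.0, 300.0]
normal_levels = [100.0, 120.0, 150.0, 200.0, 110.0, 100.0, 90.0, 130.0, 250.0]
print(is_differentially_expressed(cancer_levels, normal_levels))
```

The backward selection and leave-one-out cross validation steps are not shown here; in the study they were carried out in GeneCluster 2.0.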

Validation of classification potential of differentially expressed genes in public microarray data sets

Turashvili’s data [12], from Affymetrix Human Genome U133 Plus 2.0 Arrays (more than 50,000 genes) hybridized with laser capture microdissected samples (10 normal breast ductal samples, 10 normal breast lobular samples, 5 breast ductal tumor samples, and 5 breast lobular tumor samples), were retrieved from the NCBI Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) under GEO Series accession number GSE5764. Sørlie’s primary data [13] included 4 normal breast samples, 3 fibroadenoma samples, and 78 breast carcinoma samples on six different batches of microarrays with 8,000–23,000 genes, and were accessible in the Stanford Microarray Database (http://genome-www.stanford.edu/breast_cancer/mopo_clinical/). To avoid classification errors caused by differences between microarray batches, only the 44 samples that used the svcc batch microarray and CRA as the common reference sample, comprising 3 normal breast samples, 1 fibroadenoma sample, and 40 breast cancer samples, were chosen to validate the classification accuracy of the differentially expressed gene set. van de Vijver’s data [14], from the Hu25K microarray consisting of 25,000 genes, together with clinical factors for 295 breast cancer samples, were obtained from the website of Rosetta Inpharmatics LLC (http://www.rii.com/publications/2002/nejm.html). Samples in each of these three data sets were clustered separately, based on the differentially expressed genes and on the genes related to different biological processes identified by our experiments.

Confirmation of microarray data using real-time RT-PCR

mRNA expression levels of KLK5, KLK7, CXCL10, COL11A1, MMP3, and OGN in 30 normal breast tissue samples, 30 benign tumor samples, and 30 invasive ductal carcinoma samples were measured by real-time RT-PCR. For reverse transcription, RNA was denatured at 65 °C for 5 min, incubated on ice for 5 min, and then incubated at 42 °C for 50 min to synthesize cDNA. Real-time RT-PCR analysis was performed using the Platinum® Quantitative PCR System (Invitrogen) according to the manufacturer’s instructions. Assays were performed with


the ABI 7500 TaqMan system (Applied Biosystems, Foster City, CA, USA). We quantified the transcripts of the housekeeping gene glyceraldehyde 3-phosphate dehydrogenase (GAPDH) as control, as previously described [7]. Primers and TaqMan probes for KLK5, KLK7, CXCL10, COL11A1, MMP3, and OGN (Table 1) were designed using Oligo 6.0 software (Molecular Biology Insights, West Cascade, USA). PCR was carried out with incubation at 50 °C for 2 min and pre-denaturation at 95 °C for 3 min, followed by 45 cycles of 95 °C for 30 s and 62 °C for 1 min. Target gene expression in each sample was quantified by measuring the fractional cycle number at which the amount of expression reached a fixed threshold (CT). CT values were determined in triplicate reactions for both the target gene and the GAPDH reference in each sample. Triplicate CT values were averaged, and the GAPDH CT was subtracted from the target gene CT of the test sample to obtain ΔCT. The relative expression level of the target gene was determined as 2^(-ΔCT).
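The relative quantification described above can be written as a short calculation. This is a minimal sketch: the function name `relative_expression_level` and the triplicate CT values are hypothetical, and only the arithmetic (average the triplicates, subtract the GAPDH CT, raise 2 to the negative difference) comes from the text.

```python
from statistics import mean


def relative_expression_level(target_ct, gapdh_ct):
    """Return 2^(-dCT), where dCT = mean target CT - mean GAPDH CT."""
    delta_ct = mean(target_ct) - mean(gapdh_ct)
    return 2.0 ** (-delta_ct)


# Hypothetical triplicate CT values for one sample
target_triplicate = [26.0, 26.2, 25.8]
gapdh_triplicate = [20.0, 20.1, 19.9]
level = relative_expression_level(target_triplicate, gapdh_triplicate)
print(level)
```

Because each PCR cycle roughly doubles the product, a target that crosses the threshold six cycles after GAPDH is estimated at about 1/64 of the GAPDH level.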

Statistical analysis

The chi-square test or Fisher’s exact test was used to analyze the relationship between the molecular classification of breast tissues based on the set of differentially expressed genes and ER status or relapse/metastasis status. The paired t test was used to analyze differences in mRNA expression between primary breast cancers and paired normal breast tissue. One-way analysis of variance (ANOVA) was used to compare mRNA expression among normal breast tissue samples, benign tumor samples, and invasive ductal carcinoma samples. The correlation coefficients between interacting genes were calculated with Pearson correlation analysis. Survival analysis was carried out according to the method of Kaplan and Meier. All calculations were

Table 1 Primer and TaqMan probe sequences used in real-time RT-PCR

Genes     Primers and probe sequences                                  Product size (bp)
KLK5      Upper  5′-GCAAGACCCCCCTGGATGTG-3′                            127
          Lower  5′-TCCCAGAGGGCACGGTGTTA-3′
          Probe  5′(FAM)-GTTGGCGAGAACATGCTCTGTGACCC-(TAMRA)3′
KLK7      Upper  5′-AGGCGTCCTGGTCAATGAG-3′                             138
          Lower  5′-GGGTGGCGGAATGACTT-3′
          Probe  5′(FAM)-CCACTGCAAGATGAATGAGTACACCG-(TAMRA)3′
CXCL10    Upper  5′-CTTTCTGACTCTAAGTGGCATTC-3′                         176
          Lower  5′-CACCCTTCTTTTTCATTGTAGCAA-3′
          Probe  5′(FAM)-ACAGCGTACGGTTCTAGAGAGAGGT-(TAMRA)3′
COL11A1   Upper  5′-TCGCATTGACCTTCCTCTTC-3′                            113
          Lower  5′-TCCCGTTGTTTTTGATATTC-3′
          Probe  5′(FAM)-CAGAGGAGCTGCTCCAGTTGATGT-(TAMRA)3′
MMP3      Upper  5′-TGCCCACTTTGATGATGATG-3′                            122
          Lower  5′-GTTGGCTGAGTGAAAGAGACC-3′
          Probe  5′(FAM)-GACAAAGGATACAACAGGGACCAAT-(TAMRA)3′
OGN       Upper  5′-ACACCATTACCTCCCAAGAAAG-3′                          111
          Lower  5′-GGGTGGTACAGCATCAATGTCAA-3′
          Probe  5′(FAM)-AGCAGACACGTGGGCATTTCATCAT-(TAMRA)3′
GAPDH     Upper  5′-GAAGGTGAAGGTCGGAGTC-3′                             226
          Lower  5′-GAAGATGGTGATGGGATTTC-3′
          Probe  5′(FAM)-CAAGCTTCCCGTTCTCAGCC-(TAMRA)3′


performed using the SPSS for Windows statistical software

package (SPSS Inc, Chicago, IL, USA).

Results

Differentially expressed genes and molecular classification of breast cancer

A total of 188 differentially expressed genes (Supplementary Table 1) between breast cancers and paired normal breast tissues were identified, consisting of 128 down-regulated and 60 up-regulated genes. These differentially expressed genes were enriched in chromosomal regions 1p21-36 (nine genes, 4.76%), 1q21-32 (11 genes, 5.82%), 2p12-25 (seven genes, 3.70%), 2q12-36 (10 genes, 5.29%), 3q12-28 (eight genes, 4.23%), 4q11-26 (six genes, 3.17%), 5q22-35 (seven genes, 3.70%), 7q21-31 (five genes, 2.65%), 8q21-24 (seven genes, 3.70%), 9q21-22 (five genes, 2.65%), 11p11-15 (seven genes, 3.70%), 11q11-23 (14 genes, 7.41%), 12q12-24 (six genes, 3.17%), 13q11-34 (five genes, 2.65%), 17q11-23 (five genes, 2.65%), 19q12-13 (six genes, 3.17%), and X (10 genes, 5.29%). The functions of these genes are related to the cell cycle, apoptosis, signal transduction and transcriptional regulation, cell adhesion, the cytoskeleton, and the extracellular matrix, all of which are involved in the biological process of breast carcinogenesis.

Based on this transcriptional signature of the 188 differentially expressed genes, eight of nine normal breast tissues were distinguishable from all of the breast cancers, and most of the 49 primary breast cancers could be classified into two different biological groups by average linkage clustering: the ‘ER-negative group’ (‘Basal-like’ group) and the ‘ER-positive group’ (‘Luminal-like’ group). The proportion of ER-negative cases was higher in the ‘ER-negative group’ (9/13) than in the ‘ER-positive group’ (6/34), and this difference was statistically significant (P = 0.002) (Fig. 1). Furthermore, the ‘ER-positive group’ was clustered into two subgroups. Patients in ‘Subgroup II’ had a better clinical outcome than patients in ‘Subgroup I’ and the ‘ER-negative group’, although the difference was not statistically significant (P = 0.127; Fig. 5a).
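The association between cluster membership and ER status reported above (9 of 13 versus 6 of 34 ER-negative cases) is the kind of 2×2 comparison that Fisher's exact test addresses. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the text does not say whether the reported P value came from the chi-square test or from Fisher's exact test, so the result need not match the published figure exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, row1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell with fixed margins
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    k_min = max(0, row1 - (total - col1))
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(p for p in (prob(k) for k in range(k_min, k_max + 1))
               if p <= p_obs * (1 + 1e-9))


# Rows: 'ER-negative group', 'ER-positive group'; columns: ER-negative, ER-positive cases
p_value = fisher_exact_two_sided(9, 4, 6, 28)
print(f"P = {p_value:.4f}")
```

The margins are held fixed, so the test asks how likely a split at least this uneven would be if cluster membership were unrelated to ER status.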

Besides the classification of breast cancer samples, the 188 differentially expressed genes themselves could be clustered into five groups (Fig. 1). Most of the genes in ‘Cluster A’ are involved in signal transduction and transcription regulation; genes in ‘Cluster B’ are components of the cytoskeleton; genes in ‘Cluster C’ are related to cell adhesion and migration; genes in ‘Cluster D’ are concerned with the cell cycle; and most of the genes in ‘Cluster E’ are involved in signal transduction and immune response. However, when these gene clusters related to different biological processes were used individually to classify breast samples, none of them could group the samples successfully.

Interactions among the differentially expressed genes

All of the 188 differentially expressed genes were input into a protein interaction database to find any interactions among the proteins encoded by these genes. Forty-four genes were shown to interact with at least one other gene based on the results of the active prediction methods, including neighborhood, gene fusion, co-occurrence, databases, homology, text mining, and experiments (Fig. 2). Most of these genes were grouped in four subnetworks functioning in different biological processes (Fig. 1): ‘Subnetwork A’ consisted of genes in ‘Cluster A’, with functions related to signal transduction and transcriptional regulation; ‘Subnetwork C’ was enriched with adhesion molecules in ‘Cluster C’; ‘Subnetwork D’ consisted of genes involved in cell cycle and proliferation in ‘Cluster D’; and ‘Subnetwork E’ consisted of genes concerned with signal transduction and immune response in ‘Cluster E’. More than 80% of the interacting genes showed correlated expression, with correlation coefficients higher than 0.3 (Fig. 2), and most of these correlations were also observed in van de Vijver’s data (data not shown).
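The correlation check mentioned above can be illustrated with a plain Pearson coefficient. This is a sketch only: the function name, the two expression vectors, and their labels are hypothetical, and the 0.3 cutoff is the one quoted in the text.

```python
import math

CORRELATION_CUTOFF = 0.3  # threshold quoted in the text


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log expression values of two interacting genes across five samples
gene_x = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_y = [2.0, 4.0, 5.0, 4.0, 5.0]
r = pearson_r(gene_x, gene_y)
print(round(r, 3), r > CORRELATION_CUTOFF)
```

In practice the vectors would hold one value per tissue sample for each gene of an interacting pair.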

Validation of classification potential of differentially expressed genes in public microarray data sets

The mRNA expression levels of 183 of the 188 differentially expressed genes were measured in Turashvili’s microarray experiments. Median-centered average linkage clustering based on the expression profiles of these 183 genes classified most of the tumors (9/10) and normal breast samples (14/20) correctly, regardless of whether the tissue was ductal or lobular (Fig. 3a and Supplementary Fig. 1), and the separation was statistically significant (P = 0.005).

Only 79 of the 188 differentially expressed genes were included in the data of Sørlie et al. Of these 79 genes, 19 overlapped with the 456 intrinsic genes identified by Sørlie et al., including myxovirus resistance 1 (MX1), keratin 5 (KRT5), keratin 17 (KRT17), fibromodulin (FMOD), signal transducer and activator of transcription 1 (STAT1), estrogen receptor 1 (ESR1), dual specificity phosphatase 6 (DUSP6), bullous pemphigoid antigen 1 (BPAG1/DST), fatty acid binding protein 4 (FABP4), lipoprotein lipase (LPL), small inducible cytokine subfamily D (Cys-X3-Cys), member 1 (SCYD1/CX3CL1), immunoglobulin J polypeptide (IGJ), cysteine dioxygenase, type I (CDO1), collagen type XI alpha 1 (COL11A1), small inducible cytokine subfamily A (Cys-Cys), member 18 (SCYA18/CCL18), apolipoprotein D (APOD), fibronectin 1 (FN1), interferon-stimulated protein (ISG15/G1P2), sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3C (SEMA3C), and endothelin receptor type B (EDNRB). When 44 breast samples using


svcc batch microarray and CRA as common reference were clustered based on these 79 genes, all three normal breast tissues and the one fibroadenoma sample could be distinguished from the 40 breast cancers (Fig. 3b and Supplementary Fig. 2). However, the breast cancers could not be classified into subgroups related to different ER status and outcome, because too few of the genes (only 42%) were available for clustering and there were fewer ER-negative cases.

Furthermore, 142 genes in the set of differentially

expressed genes were available in van de Vijver’s data.

[Fig. 1 heat map; image not reproduced. Column annotation bars: Met, ER. Sample groups: Normal, ER-Neg, ER-Pos subgroup I, ER-Pos subgroup II. Row label columns: GB.accession, NAME, Chr, Net. Gene clusters: A, Signal Transduction; B, Cytoskeleton; C, Cell Adhesion; D, Cell Cycle; E, Signal Transduction and Immune Response.]

Fig. 1 Hierarchical clustering of 49 breast cancer samples and nine normal breast tissue samples based on a set of 188 differentially expressed genes. Each column represents an experimental sample, and each row a single gene. The detailed ER status and relapse/metastasis status are shown on the dendrogram. Black bars represent negative ER and positive relapse/metastasis. White bars represent positive ER and disease-free survival. Gray bars represent missing data. Cluster A, highlighted in red, contains genes involved in signal transduction and transcriptional regulation. Genes in Cluster B, highlighted in purple, are components of the cytoskeleton. Cluster C in blue includes genes related to cell adhesion and migration. Genes in Cluster D, highlighted in pink, are involved in cell cycle and proliferation. Cluster E in green contains genes related to signal transduction and immune response. Colored bars in the ‘Net’ column signify genes involved in the homonymic subnetworks in Fig. 3


Based on the transcriptional signature of the 142 genes, all of the 295 cases were clustered into two groups (Fig. 4): an ESR1-negative group and an ESR1-positive group. ESR1 mRNA expression was low in 100% (43/43) of the samples in the ESR1-negative group but high in 89.68% (226/252) of the samples in the ESR1-positive group (P < 0.001). Moreover, cases in one subgroup of the ESR1-positive group (subgroup II) had a better clinical outcome than cases in the other ESR1-positive subgroup (subgroup I) and in the ESR1-negative group (Figs. 4, 5b).
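The significance test behind this P value is not named in the text. As a minimal sketch, assuming a two-sided Fisher's exact test on the 2 × 2 table implied by the counts above (43 low and 0 high in the ESR1-negative group; 26 low and 226 high in the ESR1-positive group), the association between group and ESR1 expression could be checked as follows. The function name `fisher_exact_two_sided` is illustrative and is not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Counts taken from the text: rows are groups, columns are low/high ESR1 mRNA
p_value = fisher_exact_two_sided(43, 0, 26, 226)
share_high = round(100 * 226 / 252, 2)
print(f"P = {p_value:.3g}; ESR1 high in {share_high}% of the ESR1-positive group")
```

Under this assumed test, the resulting probability is far below 0.001, which is why a rounded value of 0.000 is better reported as P < 0.001.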

[Fig. 2 network diagram: gene symbols are nodes grouped into subnetworks A, C, D, and E; each link is labeled with a correlation coefficient.]

Fig. 2 The interaction network of genes differentially expressed between breast cancer tissue and normal tissue. Correlation coefficients between the mRNA expression levels of two interacting genes are shown on their link lines

Validation of differentially expressed genes using real-time RT-PCR

To confirm the microarray results, the expression of kallikrein 5 (KLK5), kallikrein 7 (KLK7), small inducible cytokine subfamily B (Cys-X-Cys) member 10 (CXCL10), collagen XI, alpha 1 (COL11A1), matrix metalloproteinase 3 (MMP3), and osteoglycin (OGN) was measured by real-time RT-PCR in 30 benign breast disease samples, 30 breast cancer samples, and the paired normal breast samples. GAPDH mRNA, used as the control gene, was stably expressed across all breast samples, with CT values from 19 to 21. Consistent with the microarray data, all six genes were differentially expressed, with changes greater than twofold in at least half of the 30 breast cancers compared with their paired normal breast tissues, and a paired t-test showed a statistically significant difference (P < 0.05). Figure 6 shows the mRNA expression levels of KLK5, KLK7, CXCL10, COL11A1, MMP3, and OGN in all of the samples. The mRNA levels of KLK5, KLK7, and CXCL10 in benign breast disease tissues did not differ from their levels in normal breast tissues, but they were higher (KLK5 and KLK7) or lower (CXCL10) in breast cancers. Levels of COL11A1 mRNA were higher in breast cancers than in benign breast disease tissues, and higher in benign breast disease tissues than in normal breast tissues. In contrast, the mRNA levels of MMP3 and OGN in benign breast disease tissues were higher than in both normal breast tissues and breast cancers.
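The quantification formula is not given in this section. The sketch below assumes the widely used comparative CT approach, in which expression relative to GAPDH is 2^-(CT target - CT GAPDH), and shows how a per-patient fold change, the share of pairs changed more than twofold, and a paired t statistic could be derived. All CT values and function names are hypothetical and serve only as illustration.

```python
from math import sqrt
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference):
    """Target expression relative to the reference gene: 2^-(Ct_target - Ct_ref)."""
    return 2.0 ** -(ct_target - ct_reference)


def fold_changes(tumor, normal):
    """Tumor/normal fold change per patient from (Ct_target, Ct_GAPDH) pairs."""
    return [
        relative_expression(*t) / relative_expression(*n)
        for t, n in zip(tumor, normal)
    ]


def fraction_beyond_twofold(folds):
    """Share of pairs changed more than twofold in either direction."""
    return sum(1 for f in folds if f > 2.0 or f < 0.5) / len(folds)


def paired_t_statistic(x, y):
    """t statistic of a paired t-test (n - 1 degrees of freedom)."""
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical data for four tumor/normal pairs: (Ct of target gene, Ct of GAPDH)
tumor_ct = [(24.0, 20.0), (25.0, 20.0), (23.5, 19.5), (26.0, 20.0)]
normal_ct = [(27.0, 20.0), (27.5, 20.0), (24.0, 19.5), (29.0, 20.0)]

folds = fold_changes(tumor_ct, normal_ct)
share = fraction_beyond_twofold(folds)

# The paired test is run on delta-Ct values (log2 scale) rather than raw ratios
delta_ct_tumor = [target - gapdh for target, gapdh in tumor_ct]
delta_ct_normal = [target - gapdh for target, gapdh in normal_ct]
t_stat = paired_t_statistic(delta_ct_tumor, delta_ct_normal)
```

Running the test on the CT differences keeps the comparison on a log scale, where fold changes in opposite directions are treated symmetrically. The t statistic would then be compared with the critical value for n - 1 degrees of freedom.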

Discussion

In this study, the transcriptional profile of breast cancer tissue samples was found to differ from that of normal breast samples based on the set of 188 differentially expressed genes; moreover, the profile always differed from the control in the same way, regardless of the characteristics of the breast cancer sample. This suggests that this transcriptional profile is a transcriptional signature specific to breast cancer. Only one pathologically ‘‘normal’’ sample had a transcriptional signature similar to that of breast cancer, leading our classification algorithm to group it erroneously with the cancer samples. It may be that this ‘‘normal’’ sample is in fact cancerous tissue that underwent transcriptional changes before the breast tissue showed any morphological changes. Thus, this transcriptional signature could be used for the molecular diagnosis of surgical margins. It could alert clinical oncologists before any morphological changes appear and discourage breast-conserving surgery in such patients, who would have a high risk of recurrence following such a procedure.

Microarray analysis has been used to screen for phenotype-associated gene signatures. Several research groups have explored the relationship between gene expression profiles and the phenotype of breast cancer. The research groups of Perou [2] and Sørlie [13] have classified breast cancers into four groups: ‘normal-like subgroup’, ‘basal-like subgroup’, ‘luminal-like subgroup’, and ‘Her2+ subgroup’. ‘Luminal-like’ tumors are ER-positive and have a keratin (KRT) expression profile similar to that of the epithelial cells lining the lumen of the breast ducts [5, 13, 15]. ER-negative tumors can be divided into two main subtypes: the ‘Her2+ subgroup’, with amplified Her2 DNA, and ‘basal-like’ tumors, which have an expression profile similar to that of basal epithelium and express KRT5, KRT6B, KRT14, and KRT17. Several other studies have identified characteristic signatures in breast cancers with different lymph node status [16], nuclear grade [17], and clinical outcome [18, 19]. In this study, the set of 188 differentially expressed genes could classify most of the breast cancers into two different biological groups by average clustering: an ‘ER-negative group’ and an ‘ER-positive group’. The ‘ER-positive group’ was clustered further into two subgroups. Patients in ‘Subgroup II’ had a better clinical outcome than patients in ‘Subgroup I’ and in the ‘ER-negative group’. Both estrogen receptor 1 (ESR1) and progesterone receptor (PGR) were included in these 188 differentially expressed genes. Their mRNA expression levels, in accordance with the ER and PR status detected by immunohistochemical staining, differed between subgroups. Thus, breast cancer samples shared a common transcriptional signature compared with normal breast tissues; however, this signature varied among breast tumors, and this variation determined the different phenotypes of breast cancer.
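The ‘average clustering’ mentioned above is not specified further here. As a rough sketch, assuming agglomerative clustering with average linkage and a distance of 1 minus the Pearson correlation between sample profiles, the grouping of samples could proceed as follows. The sample names, expression values, and function names are hypothetical and do not come from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_linkage(profiles, n_clusters):
    """Merge samples bottom-up until n_clusters groups remain."""
    names = list(profiles)
    # Pairwise sample distances: 1 - Pearson correlation (an assumed metric)
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Average of all between-cluster sample distances
        pairs = [dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical log-ratio profiles of four tumors over four genes
profiles = {
    "T1": [2.0, 1.5, -1.0, -0.5],
    "T2": [1.8, 1.2, -0.8, -0.7],
    "T3": [-1.5, -1.0, 1.2, 0.9],
    "T4": [-1.7, -1.3, 1.0, 1.1],
}
groups = average_linkage(profiles, n_clusters=2)
```

In this toy input, T1 and T2 share one expression pattern and T3 and T4 the opposite one, so stopping at two clusters separates them, in the same way that the signature separates ER-negative from ER-positive tumors.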

Changes in the expression profile of cancerous tissue relative to normal tissue may result from genetic and epigenetic alterations, followed by changes in signal transduction. Nearly 70% of the differentially expressed genes identified in this study mapped to chromosomal regions 1p21-36, 1q21-32, 2p12-25, 2q12-36, 3q12-28, 4q11-26, 5q22-35, 7q21-31, 8q21-24, 9q21-22, 11p11-15, 11q11-23, 12q12-24, 13q11-34, 17q11-23, 19q12-13, and X. Therefore, genetic changes in these chromosomal regions, including deletion and amplification, may play important roles in the process of carcinogenesis. Although genetic changes may be the initiating event in carcinogenesis, a genetic mutation profile can uncover alterations in only some genes, and it cannot identify the downstream genes that are regulated by these mutated genes. Expression profiling, on the other hand, can reflect changes in the expression of both the initially mutated genes and the genes they subsequently regulate during breast carcinogenesis.

Fig. 3 a Classification of Turashvili’s 30 microarray data sets based on 183 differentially expressed genes. b Classification of Sørlie’s 44 svcc batch microarray data sets on the basis of the only 79 differentially expressed genes detected in their microarray experiments. Black bars represent negative ER and positive relapse/metastasis. White bars represent positive ER and disease-free survival

The changes in the transcriptional profile identified here involve many genes related to different biological processes in carcinogenesis, such as cell growth and metabolism, cell differentiation, signal transduction and transcriptional regulation, cell adhesion and migration, and immune surveillance. A biological process occurs through the coordinated regulation of the expression of a set of functionally related genes, and each process interacts with other processes through signal crosstalk. In this study, most of the differentially expressed genes were grouped into four subnetworks. Genes in ‘Subnetwork A’, whose functions are related to signal transduction, play a central role in the network by connecting the other three subnetworks, whose genes are involved in cell adhesion and migration (‘Subnetwork C’), cell cycle and proliferation (‘Subnetwork D’), and immune response (‘Subnetwork E’). Furthermore, hierarchical clustering tended to group the genes of one subnetwork into their own cluster, indicating that their expression is coordinately regulated and that their functions in carcinogenesis are linked.

[Fig. 4 heat map. Annotation rows: ESR1, Met. Sample groups: ESR1-Negative, ESR1-Positive Subgroup I, ESR1-Positive Subgroup II.]

Fig. 4 Classification of van de Vijver’s 295 microarray data sets based on the 142 differentially expressed genes that were detected in their microarrays. Black bars represent negative ESR1 and positive relapse/metastasis (Met). White bars represent positive ESR1 and disease-free survival

The six genes KLK5, KLK7, CXCL10, COL11A1, MMP3, and OGN were selected for real-time RT-PCR, and the results confirmed that microarray analysis is a reliable method for identifying the characteristic gene expression signature of breast cancer. However, these differentially expressed genes appear to change in different ways during the step-wise process of carcinogenesis. Some genes, such as MMP3 and OGN, which this study found to be expressed at higher levels in benign breast tissues than in normal or cancerous tissues, may change during an early stage of carcinogenesis and serve as trigger events. Genes such as KLK5, KLK7, and CXCL10 were expressed to similar extents in benign and normal tissues, but their expression level changed in cancer; thus, these genes may play a role during a late stage of carcinogenesis. The expression level of still other genes, such as COL11A1, was found to increase in the order normal < benign < cancer tissue, and such genes may play a role throughout the entire process of breast carcinogenesis.

In conclusion, this study identified a transcriptional signature consisting of 188 genes differentially expressed between breast cancers and their paired normal breast tissues. Based on this set of differentially expressed genes, normal breast tissues could be distinguished from all of the breast cancers, and most of the primary breast cancers could be classified into different phenotype-associated subgroups. This signature may be a specific feature of breast cancer, and it may represent the molecular basis of breast carcinogenesis and its distinct phenotypes.

Acknowledgments This research was supported by the Tianjin Major Program of Science and Technology (013182311), the National High-Tech Research Development Plan of China (2002AA2Z2011), the Program for Changjiang Scholars and Innovative Research Team in University (URT0743), the Applied Basic Research Programs of Science and Technology Commission Foundation of Tianjin (06YFJMJC1290) and a donation from TaiJi Co., China.

[Fig. 5 plots: cumulative survival versus time (months in one panel, years in the other). Panel a groups: ER-negative (n=14), ER-positive-I (n=7), ER-positive-II (n=28). Panel b groups: ER-negative (n=43), ER-positive-I (n=28), ER-positive-II (n=206). P values shown on the plots: 0.003 and 0.127.]

Fig. 5 Results of Kaplan-Meier survival analysis. Patients in ER-positive subgroup II have a better outcome than those in ER-positive subgroup I and the ER-negative group. a Survival analysis of our 49 breast cancer samples classified by the set of 188 differentially expressed genes. b Survival analysis of van de Vijver’s 295 breast cancers classified by 142 differentially expressed genes

[Fig. 6 plot: mRNA expression levels on a logarithmic axis from 1.00E-07 to 1.00E+00 for KLK5, KLK7, CXCL10, COL11A1, MMP3, and OGN, each shown for groups N, B, and C.]

Fig. 6 The mRNA expression levels of KLK5, KLK7, CXCL10, COL11A1, MMP3, and OGN in normal breast tissue samples (N), benign breast samples (B), and breast cancer samples (C), measured by real-time RT-PCR. The black bars represent the median expression levels of these six genes in each sample group

References

1. Hahn WC, Weinberg RA (2002) Rules for making human tumor cells. N Engl J Med 347:1593–1603

2. Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, Pollack JR, Ross DT, Johnsen H, Akslen LA, Fluge O, Pergamenschikov A, Williams C, Zhu SX, Lonning PE, Borresen-Dale AL, Brown PO, Botstein D (2000) Molecular portraits of human breast tumours. Nature 406:747–752

3. Nindl I, Dang C, Forschner T, Kuban RJ, Meyer T, Sterry W, Stockfleth E (2006) Identification of differentially expressed genes in cutaneous squamous cell carcinoma by microarray expression profiling. Mol Cancer 5:30

4. Lee EJ, Kong G, Lee SH, Rho SB, Park CS, Kim BG, Bae DS, Kavanagh JJ, Lee JH (2005) Profiling of differentially expressed genes in human uterine leiomyomas. Int J Gynecol Cancer 15:146–154

5. Sorlie T, Tibshirani R, Parker J, Hastie T, Marron JS, Nobel A, Deng S, Johnsen H, Pesich R, Geisler S, Demeter J, Perou CM, Lonning PE, Brown PO, Borresen-Dale AL, Botstein D (2003) Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci USA 100:8418–8423

6. Guo Y, Guo H, Zhang L, Xie H, Zhao X, Wang F, Li Z, Wang Y, Ma S, Tao J, Wang W, Zhou Y, Yang W, Cheng J (2005) Genomic analysis of anti-hepatitis B virus (HBV) activity by small interfering RNA and lamivudine in stable HBV-producing cells. J Virol 79:14392–14403

7. Feng Y, Sun B, Li X, Zhang L, Niu Y, Xiao C, Ning L, Fang Z, Wang Y, Zhang L, Cheng J, Zhang W, Hao X (2007) Differentially expressed genes between primary cancer and paired lymph node metastases predict clinical outcome of node-positive breast cancer patients. Breast Cancer Res Treat 103:319–329

8. Gomes LI, Silva RL, Stolf BS, Cristo EB, Hirata R, Soares FA, Reis LF, Neves EJ, Carvalho AF (2003) Comparative analysis of amplified and nonamplified RNA for hybridization in cDNA microarray. Anal Biochem 321:244–251

9. Smith L, Underhill P, Pritchard C, Tymowska-Lalanne Z, Abdul-Hussein S, Hilton H, Winchester L, Williams D, Freeman T, Webb S, Greenfield A (2003) Single primer amplification (SPA) of cDNA for microarray expression analysis. Nucleic Acids Res 31:e9

10. Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J, Speed TP (2002) Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res 30:e15

11. Zeeberg BR, Feng W, Wang G, Wang MD, Fojo AT, Sunshine M, Narasimhan S, Kane DW, Reinhold WC, Lababidi S, Bussey KJ, Riss J, Barrett JC, Weinstein JN (2003) GoMiner: a resource for biological interpretation of genomic and proteomic data. Genome Biol 4:R28

12. Turashvili G, Bouchal J, Baumforth K, Wei W, Dziechciarkova M, Ehrmann J, Klein J, Fridman E, Skarda J, Srovnal J, Hajduch M, Murray P, Kolar Z (2007) Novel markers for differentiation of lobular and ductal invasive breast carcinomas by laser microdissection and microarray analysis. BMC Cancer 7:55

13. Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, Thorsen T, Quist H, Matese JC, Brown PO, Botstein D, Eystein Lonning P, Borresen-Dale AL (2001) Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci USA 98:10869–10874

14. van de Vijver MJ, He YD, Van’t Veer LJ, Dai H, Hart AA, Voskuil DW, Schreiber GJ, Peterse JL, Roberts C, Marton MJ, Parrish M, Atsma D, Witteveen A, Glas A, Delahaye L, van der Velde T, Bartelink H, Rodenhuis S, Rutgers ET, Friend SH, Bernards R (2002) A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med 347:1999–2009

15. Taylor-Papadimitriou J, Stampfer M, Bartek J, Lewis A, Boshell M, Lane EB, Leigh IM (1989) Keratin expression in human mammary epithelial cells cultured from normal and malignant tissue: relation to in vivo phenotypes and influence of medium. J Cell Sci 94(Pt 3):403–413

16. Nishidate T, Katagiri T, Lin ML, Mano Y, Miki Y, Kasumi F, Yoshimoto M, Tsunoda T, Hirata K, Nakamura Y (2004) Genome-wide gene-expression profiles of breast-cancer cells purified with laser microbeam microdissection: identification of genes associated with progression and metastasis. Int J Oncol 25:797–819

17. Sotiriou C, Wirapati P, Loi S, Harris A, Fox S, Smeds J, Nordgren H, Farmer P, Praz V, Haibe-Kains B, Desmedt C, Larsimont D, Cardoso F, Peterse H, Nuyten D, Buyse M, Van de Vijver MJ, Bergh J, Piccart M, Delorenzi M (2006) Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. J Natl Cancer Inst 98:262–272

18. Van’t Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, Schreiber GJ, Kerkhoven RM, Roberts C, Linsley PS, Bernards R, Friend SH (2002) Gene expression profiling predicts clinical outcome of breast cancer. Nature 415:530–536

19. Wang Y, Klijn JG, Zhang Y, Sieuwerts AM, Look MP, Yang F, Talantov D, Timmermans M, Meijer-van Gelder ME, Yu J, Jatkoe T, Berns EM, Atkins D, Foekens JA (2005) Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet 365:671–679