INTEGRATIVE PROTEOMIC ANALYSIS OF CELL LINE CONDITIONED MEDIA AND PANCREATIC JUICE FOR THE IDENTIFICATION OF CANDIDATE PANCREATIC CANCER BIOMARKERS by Shalini Makawita A thesis submitted in conformity with the requirements for the degree of Master of Science Laboratory Medicine and Pathobiology University of Toronto © Copyright by Shalini Makawita 2011

INTEGRATIVE PROTEOMIC ANALYSIS OF CELL

LINE CONDITIONED MEDIA AND PANCREATIC

JUICE FOR THE IDENTIFICATION OF CANDIDATE

PANCREATIC CANCER BIOMARKERS

by

Shalini Makawita

A thesis submitted in conformity with the requirements

for the degree of Master of Science

Laboratory Medicine and Pathobiology

University of Toronto

© Copyright by Shalini Makawita 2011

INTEGRATIVE PROTEOMIC ANALYSIS OF CELL LINE

CONDITIONED MEDIA AND PANCREATIC JUICE FOR THE

IDENTIFICATION OF CANDIDATE PANCREATIC CANCER

BIOMARKERS

Shalini Makawita

Master of Science

Department of Laboratory Medicine and Pathobiology

University of Toronto

2011

ABSTRACT

Novel serological biomarkers to aid in the detection and clinical management of

pancreatic cancer patients are urgently needed. In the present study, we performed in-depth

proteomic analysis of conditioned media from six pancreatic cancer cell lines (MIA-PaCa2,

PANC1, BxPc3, CAPAN1, CFPAC1 and SU.86.86), the normal pancreatic ductal epithelial cell

line HPDE, and pancreatic juice samples from cancer patients for identification of novel

biomarker candidates. Using 2D-LC-MS/MS, a total of 3479 non-redundant proteins were

identified with ≥2 peptides. Subsequent label-free protein quantification and integrative analysis

of the biological fluids resulted in the generation of candidate biomarkers, of which five proteins

were shown to be significantly elevated in plasma from pancreatic cancer patients in a

preliminary assessment. Further verification of two of the proteins in ~200 serum samples

demonstrated the ability of these proteins to significantly improve the area under the receiver

operating characteristic curve of CA19.9 from 0.84 to 0.91.

DEDICATION

My parents are the hardest working, kindest hearted and most persevering individuals I know.

Their love, support and encouragement have given me the courage to always dream big and

strive for my goals. For this I will be forever grateful.

- I dedicate this thesis to my parents Ananda and Dorathy Makawita.

ACKNOWLEDGEMENTS

First and foremost, I would like to thank Dr. Eleftherios P. Diamandis, my supervisor,

mentor and teacher who has encouraged and guided me throughout my M.Sc. journey. You have

helped to mould the somewhat wandering interests I had in research at the start of my journey

into a strong passion for science that I will carry forward with me in my future endeavours. I

will always be forever grateful for the opportunities, kindness and trust you have bestowed upon

me and for the doors you have helped me to open. I look forward to your continued friendship

and mentorship in the years to come.

I would also like to acknowledge Dr. H. Elsholtz and the members of my advisory

committee, Dr. S. Asa and Dr. A. Romaschin for their advice and direction, as well as Dr. Irwin,

the Chair of the examination committee. Likewise, I would like to acknowledge the Department

of Laboratory Medicine and Pathobiology at the University of Toronto and funding I received

from the Ontario Graduate Scholarship.

It is said that “a good friend is hard to find, hard to lose, and impossible to forget”. This

is certainly the case for all of the friends I have made over the past two years in the ACDC lab.

You are all uniquely brilliant in so many different facets of life, and you have all been an

integral part of my M.Sc. journey. Thank you for the countless memories, jokes and the

scientific advice as well! As I sit here at the crossroads of graduate school and the next step in

my academic career, I am excited to see what the future has in store; however I cannot help but

feel somewhat melancholy that my M.Sc. journey is coming to its end. The knowledge that all of

you have played a role in some way, shape or form both in my personal journey over the past

two years and in the research presented in this thesis is overwhelming and to you I owe my

sincerest debt of gratitude.

I would like to also particularly acknowledge Antoninus Soosaipillai, Chris Smith and

Ihor Batruch for all of their invaluable technical advice and help over the past two years, as well

as Dr. Irv Bromberg at Mount Sinai Hospital, Toronto for software assistance, and Dr. Yingye

Zheng (Fred Hutchinson Cancer Centre, Seattle, Washington), Elissa Brown (Fred Hutchinson

Cancer Centre) and Apostolos Dimitromanolakis (University of Toronto) for assistance with

statistical analyses. Thank you also to our clinical collaborators who have provided samples

used in this study (Dr. Steven Gallinger at the University Health Network, Toronto and Dr.

Randy Haun at the University of Arkansas Cancer Research Center for plasma/serum samples,

Dr. Felix Rueckert, Dresden, Germany for pancreatic juice samples and Dr. Alice Newman at

Princess Margaret Hospital for assistance with collection of ascites fluid).

Lastly, I would like to thank my parents and family once again for instilling in me the

importance of hard work, providing for me the means for a good education and for always

supporting me in my life's goals.

TABLE OF CONTENTS

ABSTRACT ii

DEDICATION iii

ACKNOWLEDGEMENTS iv

TABLE OF CONTENTS vi

LIST OF TABLES ix

LIST OF FIGURES x

LIST OF APPENDICES xii

LIST OF ABBREVIATIONS xiii

CHAPTER 1: INTRODUCTION 1

1.1 The Pancreatic Cancer Problem 2

1.1.1 The Human Pancreas 2

1.1.2 Pancreatic Cancer 2

1.1.2.1 Precursor Lesions and Cell of Origin 3

1.1.2.2 Symptoms 4

1.1.2.3 Risk Factors 4

1.1.2.4 Prevention 5

1.1.2.5 Treatment 6

1.1.3 Pancreatic Cancer Statistics 8

1.1.4 Current Methods for Pancreatic Cancer Detection and Their Limitations 9

1.2 Serological Cancer Biomarkers 10

1.2.1 Criteria for Detection and Biomarker Applications in Pancreatic Cancer 10

1.2.2 Current State of Pancreatic Cancer Serum Biomarkers 11

1.2.2.1 A General Introduction to Biomarkers 11

1.2.2.2 Mechanisms for Biomarker Elevation in Serum 12

1.2.2.3 CA19.9 and Other Putative Pancreatic Cancer Markers 13

1.3 Mass Spectrometry-Based Methods for Serum Biomarker Discovery 14

1.3.1 Principles of Mass Spectrometry 14

1.3.2 Proteomics Discovery Pipeline – Discovery, Verification and Validation 17

1.3.3 Pancreatic Cancer Serum Proteomics 20

1.3.4 Tissue Proteomics 22

1.3.5 Proteomics of Proximal Biological Fluid and Cell Line Conditioned Media 22

1.3.6 Integrated Strategies 24

1.4 Rationale, Hypothesis, Objectives 24

1.4.1 Rationale 24

1.4.2 Hypothesis 26

1.4.3 Objectives 26

CHAPTER 2: MASS SPECTROMETRY ANALYSIS OF CELL LINE CONDITIONED MEDIA AND PANCREATIC JUICE FOR IDENTIFICATION OF PANCREATIC CANCER BIOMARKERS 28

2.1 Introduction 29

2.2 Materials and Methods 32

2.3 Results 40

2.4 Discussion 60

CHAPTER 3: ENHANCED PERFORMANCE OF CA19.9 WITH ADDITION OF SYNCOLLIN AND ANTERIOR GRADIENT HOMOLOG 2 IN PANEL 70

3.1 Introduction 71

3.2 Materials and Methods 73

3.3 Results 76

3.4 Discussion 80

CHAPTER 4: SUMMARY AND FUTURE DIRECTIONS 87

4.1 Summary 88

4.2 Future Directions 90

REFERENCES 92

APPENDICES 112

LIST OF TABLES

Table Title Page

2.1 Increasing the number of identified proteins by optimizing cation exchange chromatography fraction pooling 43

2.2 Total number of proteins identified in triplicate analysis of cell line conditioned media and pancreatic juice 45

2.3 Protein overlap between cell line conditioned media 47

2.4 List of 15 pancreas-specific proteins (≥3 databases) identified in conditioned media and pancreatic juice 55

3.1 Stage and grade of 111 pancreatic cancer serum samples 75

3.2 Distribution of serum SYCN, AGR2, CA19.9 and age in pancreatic cancer and control serum 76

LIST OF FIGURES

Figure Title Page

1.1 Biomarker discovery pipeline 19

2.1 Schematic outline of proteomic analysis 42

2.2 Total non-redundant proteins identified 46

2.3 Cellular localization and comparison of GO categories between cell line conditioned media and pancreatic juice proteomes 49

2.4 Hierarchical clustering analysis based on normalized emPAI values for 3479 total non-redundant proteins 51

2.5 Preliminary verification of AGR2, OLFM4, SYCN, COL6A1 and PIGR in plasma 58

2.6 Receiver operating characteristic curve analysis for CA19.9 and panel of five candidates 59

3.1 Distribution of serum CA19.9, SYCN and AGR2 in normal controls, early-stage and all pancreatic cancer patients 78

3.2 Correlation between CA19.9 and SYCN, CA19.9 and AGR2 and SYCN and AGR2 79

3.3 ROC curves of SYCN, AGR2 and CA19.9 for all pancreatic cancer and controls 81

3.4 ROC curves of SYCN, AGR2 and CA19.9 for stage II pancreatic cancer and controls 82

LIST OF APPENDICES

Appendix Title Page

1 Table of overrepresented KEGG pathways in the pancreatic juice proteome in comparison to the cell line conditioned media proteome. 113

2 Pearson correlation coefficient values comparing normalized spectral counts of the triplicate cell line analysis. 114

3 Extracellular and cell surface annotated proteins with over 5-fold increase in at least three pancreatic cancer cell lines. 115

4 Forty-three proteins common to cancer cell lines, pancreatic juice and ascites. 118

LIST OF ABBREVIATIONS

2D-LC-MS/MS two dimensional liquid chromatography tandem mass spectrometry

AGR2 anterior gradient homolog 2

ANOVA analysis of variance

AUC area under curve

CA125 carbohydrate antigen 125

CA19.9 carbohydrate antigen 19.9

CDCHO Chinese hamster ovary serum-free medium

CEA Carcinoembryonic antigen

CEACAM5 Carcinoembryonic antigen-related cell adhesion molecule 5

CM conditioned media

COL6A1 Collagen alpha-1(VI) chain

CT computed tomography

CV coefficient of variation

ELISA enzyme-linked immunosorbent assay

emPAI exponentially modified protein abundance index

ERCP endoscopic retrograde cholangiopancreatography

ESI electrospray ionization

EUS endoscopic ultrasound

FDR false discovery rate

GO gene ontology

HCA hierarchical clustering analysis

hCG human chorionic gonadotropin

HE4 WAP four-disulfide core domain protein 2 precursor

HPA Human Protein Atlas

HPDE human pancreatic ductal epithelial

HPLC high pressure liquid chromatography

IPA Ingenuity Pathway Analysis

IPI international protein index

IPMN intraductal papillary mucinous neoplasms

KEGG Kyoto Encyclopedia of Genes and Genomes

KLK Kallikrein

LTQ linear ion trap

MALDI matrix-assisted laser desorption ionization

MMP matrix metalloproteinase

MRM multiple reaction monitoring

MS mass spectrometry

MUC mucin

OLFM4 Olfactomedin-4

PanIN pancreatic intraepithelial neoplasia

PDAC pancreatic ductal adenocarcinoma

PIGR Polymeric immunoglobulin receptor

PLAT tissue-type plasminogen activator

PSA prostate specific antigen

ROC receiver operating characteristic

SCX strong cation exchange

SELDI surface-enhanced laser desorption ionization

SYCN Syncollin

TiGER Tissue-specific Gene Expression and Regulation

TiSGeD Tissue-Specific Genes Database

CHAPTER 1:

Introduction

1.1 The Pancreatic Cancer Problem

1.1.1 The Human Pancreas

Extending from the C-shaped curve of the duodenum towards the hilum of the spleen,

the pancreas is a glandular organ comprised of both exocrine and endocrine functional units [1].

The exocrine pancreatic component is a serous gland consisting of two major cell types: (1)

acinar cells responsible for the synthesis of digestive enzymes in their inactive form (zymogens)

and (2) ductal cells responsible for the transport of zymogens to the duodenum through a

complex network of branching ducts and the main and accessory pancreatic ducts [1]. The

ductal cells also secrete an alkaline fluid (bicarbonate and water) which acts to neutralize

stomach contents as they enter the duodenum to ensure an optimal pH for digestive enzyme

activity. Another, less well characterized cell type of the exocrine pancreas is the centroacinar

cell, a duct cell located within the acinus [2]. The endocrine component of the

pancreas, which comprises ~1-2% of the total volume of the pancreas, is composed of islets of

Langerhans. Islets contain primarily alpha (15-20% of the islet cell population), beta (70% of

cell population) and delta (5-10% of cell population) cells that secrete glucagon, insulin and

somatostatin, respectively, and are fundamental for regulating blood-glucose levels and sugar

metabolism [1,2]. Minor islet cell types include PP (pancreatic polypeptide) cells, EC

(enterochromaffin) cells and D-1 cells, each with either an inhibitory or stimulatory effect

on exocrine pancreatic secretions and gastro-intestinal motility and secretion [1].

1.1.2 Pancreatic Cancer

Pancreatic malignancies are a heterogeneous group of tumors classified largely based

on the pancreatic cell-type they recapitulate [3]. The great majority of pancreatic cancers (~85-

90%) arise from the exocrine pancreas and are pancreatic ductal adenocarcinomas (PDAC). In

their well-differentiated state, PDACs display a glandular morphology akin to that of benign ducts and

are also characterized by large areas of desmoplastic stroma and invasion of vascular and

perineurial structures [3,4]. Other, rarer types of pancreatic cancer include: undifferentiated

carcinomas, which lack ductal-like structures and are more aggressive than PDAC;

colloid carcinomas, characterized by large mucin deposits; medullary carcinomas, characterized

by large, undifferentiated epithelioid-like cells; acinar cell carcinomas, which comprise ~1-2% of

pancreatic cancers and recapitulate acinar cell-like properties with zymogen granules; serous

cystadenomas, which show cystic growths and ductal morphology; and endocrine tumors such as

insulinomas, typically characterized by improper production of pancreatic endocrine

hormones [3,4]. The present study focuses on PDAC as it accounts for the majority of

pancreatic cancers.

1.1.2.1 Precursor Lesions and Cell of Origin

The ductal-like phenotype of PDAC is supported by a genetic progression model of

pancreatic cancer in which ductal cells, as they acquire sequential mutations in KRAS, INK4A,

TP53, SMAD4/DPC4, telomere shortening, etc. progress from normal ductal epithelia to low

grade pancreatic intraepithelial neoplasia (PanIN) to high grade PanIN (with increasing nuclear

abnormalities, abnormal mitosis and cytological atypia) and then to invasive ductal

adenocarcinoma [2,4]. However, the cell of origin of pancreatic cancer remains elusive and

other reports support a model which entails greater developmental plasticity during

tumorigenesis, whereby other cells within the pancreas, such as acinar or centroacinar cells,

may give rise to ductal adenocarcinomas through transdifferentiation or acinar-to-ductal

metaplasia. This is primarily supported by findings in murine models [2,5-7].

Other lesions that can lead to invasive carcinoma of the pancreas include mucinous

cystic neoplasms, intraductal papillary mucinous neoplasms (IPMNs) and intraductal oncocytic

papillary neoplasms [3,8]. These lesions are ductal-like tumors with cyst formation and it is

believed that ~2% of pancreatic cancers arise through these means. In some institutions they

may account for >10% of pancreatic resections [3,8].

1.1.2.2 Symptoms

Pancreatic cancer is asymptomatic in the early stages of tumor development and most

patients present with non-specific abdominal complaints or back pain [3,4,9]. Back pain is

thought to be caused by perineurial invasion, or invasion into bundles of nerve fibres. Patients

with tumors developing in the head and neck regions of the pancreas may develop jaundice due

to blockage of the main bile duct by the growing tumor [10]. More advanced disease may be

characterized by ascites, which is the build-up of fluid in the abdominal cavity, anorexia and

cachexia [11,12]. Cachexia is the unintentional loss of weight (≥10% of body weight)

over the course of several months due to an increase in protein degradation and reduction in the

synthesis of muscle. It is present in ~80% of pancreatic cancer patients and accelerated weight

loss can result in decreased survival [12]. Other complications of pancreatic cancer may include

pancreatitis or diabetes mellitus [13]. Due to the lack of highly specific symptoms and the late

onset of symptoms, pancreatic cancer is an elusive disease and is often dubbed a 'silent killer'.

1.1.2.3 Risk Factors

Pancreatic cancer has been associated with old age, smoking, family history of the

disease, hereditary syndromes, and diabetes [2,4,14]. The median age of pancreatic cancer

patients is in the seventh decade of life and the disease is rare in individuals below the age of 40.

Smoking is one of the most preventable causal factors of pancreatic cancer and has been linked

to as many as one in four cases of pancreatic cancer [15]. In general, smoking has been shown to

increase the relative risk of pancreatic cancer by two-fold. Other risk factors include family

history and hereditary syndromes. Studies have shown that ~8-10% of pancreatic cancer

patients may have a familial link [16-18], with relative risks of 18- and 57-fold in comparison to

sporadic disease if two or three first-degree relatives, respectively, have pancreatic cancer

[19,20].

Several hereditary diseases have also been associated with increased pancreatic cancer

risk, most notably Peutz-Jeghers Syndrome and hereditary pancreatitis [20]. Germline mutations

in the STK11/LKB1 gene result in Peutz-Jeghers Syndrome, a disease characterized by macules

on the lips, noncancerous (hamartomatous) gastrointestinal polyps and a relative risk >100 times

that of the general population for pancreatic cancer development [21,22]. Pancreatitis is an

inflammatory condition of the pancreas and mutations in the PRSS1 gene, which encodes the

protease serine 1 (cationic trypsinogen) protein, cause hereditary pancreatitis. Hereditary

pancreatitis increases pancreatic cancer risk by 50-80 times [23]. Familial pancreatic cancer has

been linked to germline mutations in BRCA2, LKB1, CDKN2A and MLH1.

In an interesting recent genome-wide association study by the Pancreatic Cancer Cohort

Consortium (PanScan), blood type was associated with risk for pancreatic cancer [24]. In a

group of 1,534 pancreatic cancer patients and 1,583 controls, individuals with a non-O blood

type showed increasing risk for pancreatic cancer with each additional non-O allele (odds ratio

(OR) of 1.33, 1.61, 1.45 and 2.42 were obtained for individuals with type AO, AA, BO and BB

genotypes respectively, when compared to the OO genotype) [24].
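As a minimal sketch of how an odds ratio of this kind is obtained, the Python snippet below computes an unadjusted OR and an approximate 95% confidence interval (Woolf's log method) from a 2x2 table of cases and controls. The counts are hypothetical and chosen only for illustration; they are not the PanScan data, and the function name `odds_ratio` is an assumption. Published estimates such as those in [24] are typically adjusted for covariates, which this sketch does not attempt.

```python
import math


def odds_ratio(case_exposed, case_unexposed, control_exposed, control_unexposed):
    """Return the unadjusted odds ratio and an approximate 95% CI (Woolf's log method)."""
    estimate = (case_exposed * control_unexposed) / (case_unexposed * control_exposed)
    # Standard error of log(OR), computed from the four cell counts.
    se_log = math.sqrt(
        1 / case_exposed + 1 / case_unexposed
        + 1 / control_exposed + 1 / control_unexposed
    )
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper


# Hypothetical counts, NOT taken from the PanScan study:
# 120 cases and 90 controls carry the genotype of interest,
# 80 cases and 110 controls carry the reference (OO) genotype.
estimate, lower, upper = odds_ratio(120, 80, 90, 110)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An OR above 1 whose confidence interval excludes 1 would indicate that the genotype is more common among cases than among controls.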

1.1.2.4 Prevention

Various dietary and chemical agents have been described in literature with potential

preventative properties for pancreatic cancer. For instance, a compound in soy products

(genistein), and curcumin, a component of the spice turmeric have been shown to have

inhibitory effects on pancreatic cancer through inhibition of the NF-κB pathway [25]. Similarly,

various compounds in vegetables and fruits, such as vitamins C and D and indole-3-carbinol (I3C,

found in cabbages, broccoli and cauliflower), have been associated with reduced risk of pancreatic cancer or

inhibition of cell proliferation and, as is the case with I3C, the induction of apoptosis [25].

Metformin, a drug given to patients with type II diabetes to lower glucose, has been shown in

epidemiological studies to confer a significant decrease in development of pancreatic cancer

[26]. As well, aspirin and the angiotensin-I-converting enzyme inhibitor enalapril were shown in a

recent study to delay progression of precursor pancreatic intraepithelial neoplasias to invasive

PDACs, where occurrence of pancreatic cancer was reduced from 60% to 17.6-31.2% in

untreated versus treated groups of a mouse model of pancreatic cancer treated for 3 and 5

months with the agents [27]. The leading preventative measure associated with pancreatic

cancer is abstinence from cigarette smoking [25].

1.1.2.5 Treatment

Treatment of pancreatic cancer varies depending on the extent of the cancer in patients.

For patients that present with early stage disease (<2cm lesions localized to the pancreas), which

is approximately 10% of pancreatic cancer patients, surgical resection is the main treatment

course [28]. Often called the “Whipple Procedure” after Dr. Allen Oldfather Whipple who

described at first a two-stage procedure in the late 1930s, and then a one-stage procedure in the

1940s, pancreaticoduodenectomy is a common procedure for resection of cancerous pancreatic

lesions. It is a procedure which involves removal of the duodenum, head of the pancreas, bile

duct and gallbladder (with or without preservation of the pylorus – the connective region of the

stomach and small intestines) [28]. Following surgery, adjuvant therapy in the form of radiation

and chemotherapy is prescribed in an attempt to reduce recurrence of disease [29]. Gemcitabine,

which is an inhibitor of DNA synthesis, and 5-fluorouracil are common chemotherapeutic agents

for pancreatic cancer [29].

Locally advanced pancreatic cancer is unresectable disease that often involves invasion

into the superior mesenteric artery and other important vascular structures [30]. Approximately

30% of pancreatic cancer patients present with locally advanced disease. There is no real

consensus as to the optimal treatment strategy for these patients; however, most receive a

combination of radiation and chemotherapy. Pancreatic cancer is highly resistant to available

therapies and although significant efforts have been taken towards development of new therapies

over the past decade, they have been met with little success [13]. Several clinical trials

comparing inhibitors (individually or in combination with gemcitabine) targeting various

overexpressed pathways seen in pancreatic cancer such as the Kras inhibitor Tipifarnib, the

MMP (matrix metalloproteinase) inhibitor Marimastat, the anti-VEGF (vascular endothelial

growth factor) treatment Bevacizumab and the EGFR (epidermal growth factor receptor)

inhibitor Erlotinib to gemcitabine alone have shown little or no survival benefit [13]. Erlotinib

has shown a statistically significant increase in survival; unfortunately the median overall

improvement was quite small (from 5.91 months to 6.24 months; p=0.038) [31]. Metastatic

disease, which is present in ~50% of individuals at the time of diagnosis, is treated by systemic

chemotherapies or through palliative measures to reduce pain and discomfort. Without

treatment, the median survival time for metastatic pancreatic cancer patients is ~2-3 months

[13].

The improved understanding gained over the past several years of the complex signaling pathways

gone awry in pancreatic cancer, of the role of the tumor stroma (which in malignant

pancreatic tumors often encompasses a greater volume than the epithelial cells) [32] and of other

microenvironmental effects is expected to aid in the development of

future therapies with greater efficacy for pancreatic cancer. According to a recent review in

Nature Reviews Clinical Oncology [13], such treatment strategies will likely include targeting

multiple pathways, cancer stem cells and pathways involved in development such as Wnt, Notch

and Hedgehog signaling.

1.1.3 Pancreatic Cancer Statistics

Pancreatic cancer is the 10th most common cancer type in North America; however, it

is the 4th leading cause of cancer-related death [33]. Worldwide, this cancer afflicts

approximately 232,000 individuals annually [34]. As described above, prognosis is extremely

poor for patients diagnosed with pancreatic cancer and the majority of patients succumb to the

disease within several months to one year after diagnosis. Fewer than 5% of pancreatic cancer

patients survive to five years post-diagnosis [3,33].

At present, surgical resection, made possible through the early diagnosis of cancerous

lesions while they are small (<2cm) and localized, offers the best treatment option and the only

potentially curative option for pancreatic cancer patients [35]. Unfortunately, due to the

asymptomatic nature of the early-stages of this disease and a lack of adequate screening methods

for its early detection, the majority of patients present with locally advanced (~30% of patients)

or metastatic disease (~50% of patients) at the time of diagnosis. At these advanced stages,

chemotherapy, in combination with radiation therapy, or palliative care options is the usual

treatment course [36,37]; however such treatment options are largely anecdotal due to the high

rate of metastatic spread of pancreatic cancer, especially to vital organs such as the liver. As a

result, there is great interest among clinicians and researchers alike in the development of novel

methods with high sensitivity and specificity for detection of pancreatic cancer in its

asymptomatic or early stages.

Currently, with early detection, five-year survival rates have been shown to improve for

pancreatic cancer patients from <5% to ~20-40%. Compared to other cancer sites such as breast,

prostate, colon and ovarian where 5-year survival rates improve from 23% to 98%, 31% to

100%, 11% to 91% and 28% to 94%, respectively with early detection [33], the improvement

seen in pancreatic cancer patients seems modest at best. However, optimism remains that the

combination of parallel advancements in therapeutic strategies, preventative measures and early

detection will further improve the outcome of pancreatic cancer patients. The focus of the

present study is towards the identification of biomarkers and biomarker candidates that may aid

in the improvement of existing detection strategies.

1.1.4 Current Methods for Pancreatic Cancer Detection and Their Limitations

Current methods for pancreatic cancer detection are based primarily on imaging

techniques for the detection of pancreatic masses or suspected cancerous lesions, in individuals

who either present with nonspecific abdominal complaints, or symptoms suggestive of

pancreatic cancer such as painless jaundice and weight loss [38]. High-resolution, contrast-

enhanced cross sectional computed tomography (CT), in particular, which enables the

acquisition of thin image slices (5mm) from the base of the lungs to the pelvis, is a widely used

technique for pancreatic cancer detection [39]. By examining contour abnormalities within the

pancreas and surrounding ducts and arteries, CT can also facilitate assessment of staging, tumor

resectability, and post-operative follow-up in patients with established pancreatic cancer [38,39].

Endoscopic ultrasound (EUS) has also emerged as a sensitive means for the detection of

pancreatic tumor masses [40,41]. Through a combination of real-time endoscopy and high-

frequency ultrasound, EUS is used to image the pancreas through the gastric and duodenal walls.

The close proximity at which images are obtained has enabled EUS to overcome confounding

effects caused by gaseous features overlying the pancreas. In this respect, EUS has been shown

to be useful for the detection and evaluation of small (2-3cm) focal lesions [40-42].

Other techniques for detection and assessment of pancreatic cancer include magnetic

resonance imaging (MRI) and positron emission tomography (PET), the latter of which is used

largely for the detection of metastasis [43]. There are conflicting reports as to which imaging

method shows superiority for the clinical assessment of pancreatic cancer and a combination of

techniques may be utilized based on the clinical question and practice preferences [38-43].

Certain definitive diagnoses of pancreatic cancer may require more invasive means such as

endoscopic retrograde cholangiopancreatography (ERCP) which enables tissue sampling,

acquisition of a computed tomography-guided biopsy or endoscopic ultrasound-guided fine

needle aspiration (EUS-FNA) [20,42].

The major drawback of all of these methods for the optimal management of pancreatic

cancer patients is that they are primarily utilized after the onset of symptoms, which, in

pancreatic cancer patients, occurs predominantly after the onset of late-stage disease. At some

institutions, imaging methods have been implemented for screening asymptomatic individuals

with a known familial or genetic predisposition for pancreatic cancer; however, due to their

high operating costs and their invasive (albeit some more than others) and relatively time-consuming

nature, imaging methods are ineffective for screening the general population or at-risk

groups (such as the elderly) for detection of the asymptomatic and resectable early stages of

pancreatic cancer [42].

1.2 Serological Cancer Biomarkers

1.2.1 Criteria for Detection and Biomarker Applications in Pancreatic Cancer

Standard criteria for an effective method for early detection call for a test that is

cost-effective, non-invasive and easily performed by staff with minimal training [44]. Most

importantly, it should possess a high degree of sensitivity and specificity to enable the accurate

identification of disease from non-disease without overdiagnoses and false negatives. In this

regard, the test should provide a clear clinical benefit to patients [43-45]. While the fulfillment

of all of these criteria may be challenging for any one technique, serum biomarkers, either

individually or as a multi-parametric panel, have the potential to meet many of the above

criteria. In addition, the development of serological markers is ideal given the present set-up of

many clinical laboratories, which centers around the use of blood testing.
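The sensitivity and specificity criteria described above can be illustrated with a short Python sketch. It assumes a hypothetical evaluation in which a marker is measured in known cancer patients and known controls; the counts and the function name `sensitivity_specificity` are illustrative assumptions, not results from this thesis.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from the four outcome counts."""
    # Sensitivity: fraction of diseased individuals the test correctly flags.
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: fraction of non-diseased individuals the test correctly clears.
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts: 100 cancer patients (80 test positive)
# and 100 controls (90 test negative).
sens, spec = sensitivity_specificity(true_pos=80, false_neg=20, true_neg=90, false_pos=10)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Low sensitivity corresponds to missed cancers (false negatives), while low specificity corresponds to healthy individuals flagged for unnecessary follow-up.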

For pancreatic cancer specifically, given the low prevalence of the disease and low

efficacy of current treatments, a population-based screening test is somewhat unrealistic and

screening would likely be in individuals in high-risk groups such as those with familial

pancreatic cancer or syndromes such as Peutz-Jeghers (see section 1.1.2.3 above) [42,45]. In the

sections above, emphasis was placed on early detection; however biomarkers have a wide range

of other important clinical purposes. For instance, local relapse in pancreatic cancer patients

who undergo surgical resection has been reported as 50-85% [46] and as a result, markers that

can be used for monitoring response to surgery/treatment and pancreatic cancer progression can

also greatly aid in the optimal management of pancreatic cancer patients. As well, at present, in up

to 25% of patients who undergo surgery, the inability to resect is discovered only

during the surgery, owing to micrometastasis or invasion that was not identified through current

CT-based methods [47]. In this regard, biomarkers to aid in enhanced prediction of resectability

and staging of tumors can also be of clinical use for pancreatic cancer [47].
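The point made at the start of this section, that low disease prevalence makes population-wide screening unrealistic, can be made concrete with Bayes' rule. The sketch below computes the positive predictive value (the probability that a positive test reflects true disease) for the same hypothetical test in a low-prevalence and a higher-prevalence setting. The sensitivity, specificity and prevalence figures are assumptions chosen for illustration only and are not taken from this thesis or its references.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability of disease given a positive test, by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical test: 90% sensitivity, 95% specificity.
# Assumed prevalences: 0.01% (general population) and 5% (high-risk group).
ppv_general = positive_predictive_value(0.90, 0.95, 0.0001)
ppv_high_risk = positive_predictive_value(0.90, 0.95, 0.05)
print(f"PPV, general population: {ppv_general:.4f}")
print(f"PPV, high-risk group:    {ppv_high_risk:.4f}")
```

Under these assumed figures, nearly all positive results in the general population would be false positives, whereas in the high-risk group close to half would be true positives, which is why screening is more plausible in high-risk groups.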

1.2.2 Current State of Pancreatic Cancer Serum Biomarkers

1.2.2.1 A General Introduction to Biomarkers

Biomarkers are molecules or processes which, when measured, are indicative of a

particular biological state or condition [44]. Many molecules and cellular processes have been

studied for the detection and management of cancer and for pancreatic cancer. These include,

but are not limited to, molecules and processes such as DNA, mRNA, miRNA, proteins,

circulating tumor cells and angiogenesis [44]. For instance a recent comprehensive genomic

analysis of pancreatic cancer tumors published in Science [32], revealed an average of 63

mutations (mostly point mutations) in pancreatic cancer and 12 core pathways deregulated in the

majority of cases. This study also revealed several hundred (541) genes overexpressed more than 10-fold

in 90% of cases that may serve as potential biomarkers. Other more targeted studies have shown

utility of analyzing mutations in genes such as K-ras and p53, and promoter methylation of

p16(INK4a) in high-risk groups to further stratify risk of developing pancreatic cancer [48].

Similarly, miRNA profiling has shown an association between miR-196a and pancreatic cancer

in several studies [49], where increased miR-196 has shown discriminatory potential between

PDAC and benign pancreatitis and healthy controls [49]. Increased intratumoral microvessel

density has also been noted in pancreatic cancer, as well as increased expression of angiogenic

factors in serum such as VEGF (vascular endothelial growth factor) [50,51]. Circulating tumor

cells (CTCs) have been a growing area of research in recent years and with respect to pancreatic

cancer CTC research, alpha-1,4-N-acetylglucosaminyltransferase (alpha4GnT) mRNA has shown

diagnostic potential for PDAC upon extraction from peripheral blood mononuclear cells [51,52].

However, given that most molecular and genetic alterations (mRNA, miRNA, DNA,

etc.) tend to ultimately culminate in the altered expression of protein products, and with the

recent advancements made in high-throughput proteomic technologies, the study of cancer

proteomics represents a potentially fruitful means for identification of novel biomarkers [44,53].

1.2.2.2 Mechanisms for Biomarker Elevation in Serum

Human plasma is described as the most complex of all human proteomes containing

proteomes from all other tissues as subsets [54]. There are several mechanisms by which

proteins can enter into circulation and serve as cancer biomarkers [44]. Primarily these include

the increased secretion and shedding of proteins from tumor cells, angiogenesis, and

leakage/release of proteins from tissues as tumors invade and cause destruction of the local

tumor microenvironment [44]. Approximately 20-25% of all human proteins are secreted and

recent analysis of single nucleotide polymorphisms in proteins containing signal peptides has

shown aberrant secretion of certain proteins in disease states [55]. Similar events likely

occur in cancer [56]. Additionally, aberrant production of extracellular proteases may result in

the increased cleavage of extracellular domains of membrane-bound proteins resulting in their

elevation in circulation [44]. Most currently used biomarkers such as AFP (alpha-fetoprotein)

and hCG (human chorionic gonadotropin) are secreted, and HER2 (a member of the epidermal

growth factor (EGF) receptor family) is a membrane-bound protein, the extracellular domain of

which can be detected in serum [44,57]. An example of increased levels of a protein marker in

serum due to leakage caused by local tissue destruction is prostate specific antigen (PSA) in

prostate cancer [44,58].

1.2.2.3 CA19.9 and Other Putative Pancreatic Cancer Markers

Several tumor markers with good sensitivity and specificity are currently in routine

clinical use for the detection of various cancer sites, such as PSA for prostate cancer and hCG

for testicular cancer. Unfortunately, a marker of high diagnostic sensitivity and specificity is

lacking for pancreatic cancer. Currently, the most widely used clinical marker for pancreatic

cancer is CA19.9, a sialylated Lewis A antigen found on the surface of proteins [59,60]. CA19.9

has reported sensitivity values ranging from 70%-90% (median ~79%) and specificity values

ranging from 68%-91% (median ~82%) for diagnosis of pancreatic cancer [59]. While elevated

CA19.9 levels have been associated with the advanced stages of the disease, they have also been

associated with benign and inflammatory diseases such as obstructive jaundice, pancreatitis, as

well as other malignancies of the gastrointestinal system [60-63]. For early-stage pancreatic

cancer detection, CA19.9 has a reported sensitivity of ~55% and it is often undetectable in

asymptomatic individuals [43,59]. In addition, CA19.9 is associated with Lewis antigen status

and is absent in individuals with Lewis antigen negative blood group (~10% of the general

population) [64]. Taken together, CA19.9 lacks the necessary sensitivity and specificity for early

pancreatic cancer detection and is most widely used as a biomarker to monitor response to

treatment in patients who had elevated levels prior to treatment.
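To illustrate why the reported sensitivity and specificity of CA19.9 are insufficient for screening a low-prevalence disease, the following minimal Python sketch applies Bayes' rule to estimate the positive predictive value (PPV) of a test. The sensitivity and specificity are the median values quoted above [59]; the two prevalence values are hypothetical and were chosen only for illustration, not taken from the cited literature.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of positive test results that are true positives (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Median CA19.9 values quoted in the text [59].
SENSITIVITY = 0.79
SPECIFICITY = 0.82

# Hypothetical prevalences (assumptions for illustration only).
ppv_general = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.0001)
ppv_high_risk = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.05)

print(f"PPV, general population (assumed prevalence 0.01%): {ppv_general:.4%}")
print(f"PPV, high-risk group (assumed prevalence 5%):       {ppv_high_risk:.2%}")
```

Under these assumed prevalences, nearly all positive results in the general population would be false positives, whereas the PPV is considerably higher in an enriched high-risk group.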

Other tumor markers such as members of the carcinoembryonic antigen (CEA) [65,66]

and mucin (MUC) [67] families have also been associated with pancreatic cancer. Similarly,

many other proteins have been described as putative candidates in the literature. When used in

combination, with or without CA19.9, some of these markers have shown enhanced sensitivity

and specificity; however, none have been able to successfully displace or supplement CA19.9 in

the clinic.

1.3 Mass Spectrometry-Based Methods for Serum Biomarker Discovery

1.3.1 Principles of Mass Spectrometry

In proteomics, mass spectrometry is a mainstay and is crucial to the design of large-scale

proteomics-based discovery studies. Mass spectrometry (MS) enables the simultaneous

identification of many proteins in a biological sample [68]. When configured to monitor specific

peptides and peptide products, MS analysis also permits targeted quantification of analytes [69].

If samples are prepared through digestion of proteins using enzymes, followed by MS analysis

of the peptide products and identification of proteins through subsequent database searching, the

MS analysis is referred to as a “bottom-up” or shotgun proteomic analysis [68]. Through these

means, mass spectra generated through MS analyses are analyzed using computer-based

algorithms such as MASCOT, X!Tandem and SEQUEST and compared to databases containing

all known and predicted protein sequences to confer protein identifications [70,71]. A decoy

database containing the amino acid sequences of all proteins in reverse is also commonly searched to

estimate the false positive rate of MS protein identifications, based on the number of

proteins/peptides that match the reversed database component.
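The reversed-database estimate described above can be sketched as follows. This is a minimal illustration only: the peptide names and scores are invented, and the estimator used (number of accepted reverse matches divided by number of accepted forward matches) is one common convention assumed here, not necessarily the calculation performed by any particular search engine.

```python
def estimate_false_positive_rate(matches, score_threshold):
    """Estimate the false positive rate among matches at or above a score threshold.

    Each match is a tuple (peptide, score, is_reverse), where is_reverse marks
    a hit to the reversed (decoy) component of the database.
    """
    accepted = [m for m in matches if m[1] >= score_threshold]
    reverse_hits = sum(1 for _, _, is_reverse in accepted if is_reverse)
    forward_hits = len(accepted) - reverse_hits
    if forward_hits == 0:
        return 0.0
    return reverse_hits / forward_hits


# Hypothetical search results (invented for illustration).
matches = [
    ("PEPTIDE_A", 65.0, False),
    ("PEPTIDE_B", 58.2, False),
    ("REV_PEPTIDE_C", 41.0, True),
    ("PEPTIDE_D", 44.5, False),
    ("PEPTIDE_E", 39.9, False),
    ("REV_PEPTIDE_F", 22.3, True),
    ("PEPTIDE_G", 51.0, False),
]

for threshold in (20.0, 40.0, 50.0):
    rate = estimate_false_positive_rate(matches, threshold)
    print(f"score >= {threshold}: estimated false positive rate = {rate:.2f}")
```

Raising the score threshold lowers the estimated false positive rate at the cost of accepting fewer identifications.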

Conversely, if intact proteins are analyzed without prior enzymatic digestion to

peptides, the approach is referred to as “top-down” analysis [68]. While top-down analysis can

result in greater sequence coverage of proteins, including increased information on

post-translational modifications present in proteins, the fractionation of intact proteins prior to MS

analysis is more challenging than fractionation of peptide mixtures, and as a result, top-down

approaches are typically used for single proteins or simple protein mixtures [68]. In the present

study, a bottom-up proteomic approach was utilized for identification of proteins.

Mass spectrometers are composed of three main components: an ionization source, a

mass analyzer and a detector. An important development that paved the way for mass

spectrometry analysis of proteins was the introduction of ionization methods through which

proteins and peptides could be stably transferred into the gas-phase. Two such ionization

techniques widely used in proteomic analyses are MALDI (matrix-assisted laser desorption

ionization) and ESI (electrospray ionization) [68,72]. In MALDI, small amounts of sample

(~1 µL) are applied onto a plate along with a light-absorbing crystallized substance (matrix). The

solvent containing the sample vaporizes, leaving the matrix and sample mixture of proteins co-

crystallized. Subsequent short (nanosecond) pulses of a laser cause vaporization of the protein

mixture and ionization of the protein molecules through energy transfer from the matrix [72].

Addition of proteins onto a coated resin that can select/enrich for proteins based on properties

such as hydrophobicity, charge, specific antibody affinity, etc. prior to ionization is referred to

as SELDI (surface-enhanced laser desorption ionization) [73].

ESI enables ionization of proteins/peptides directly from solution (the solvent typically

contains water and a volatile compound such as acetonitrile). The mass spectrometer is usually

coupled directly to a high-pressure liquid chromatography system, and in ESI, high voltages

(~2-6 kV) are applied at the interface where peptides elute from the chromatography system,

prior to entering the inlet of the mass spectrometer [68]. The high voltage causes the

analyte solution to form a Taylor cone that emits a jet or spray of charged droplets, resulting in the eventual

formation of gaseous peptide ions [74]. ESI was used in the present study.

Once peptides are ionized, electric fields are typically used to guide them to the mass analyzers,

which function to store and separate ions based on their mass-to-charge (m/z) ratio. Two main

categories of mass analyzers include scanning and ion-beam analyzers such as those found in

time of flight (TOF) mass spectrometers [72], and analyzers that trap ions such as the linear ion

trap (LTQ) and LTQ-Orbitrap instruments [75]. Trapping instruments were utilized in this study,

specifically the LTQ-Orbitrap (Thermo). In the Orbitrap analyzer, ions are trapped

in electrostatic fields where they orbit and oscillate around a central electrode. The frequency of

their oscillation can be related back to the m/z of the ion [68,75,76]. Important considerations in

mass analyzers are their resolution (ability to separate/distinguish between two peaks), mass

accuracy, mass range (the range of m/z that an instrument is capable of

analyzing) and scan rate. Two mass analyzers can be coupled in tandem,

whereby ions analyzed in the first mass analyzer can be further fragmented through collision

with a neutral gas (collision induced dissociation) and analyzed in a second mass analyzer

[68,77]. Such tandem MS analysis can occur simultaneously as is the case in the LTQ-Orbitrap

instrument used in this study, where the Orbitrap analyzer performs the first scan, while the

LTQ carries out fragmentation and the second scan in parallel. Coupling of two analyzers such

as this can combine the advantages of both such as the speed and sensitivity of the LTQ

component with increased mass accuracy and resolution of the Orbitrap analyzer [68,77].
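The quantities discussed above (m/z, mass accuracy and resolution) can be made concrete with a short Python sketch. The definitions used are the conventional ones (m/z of a protonated ion, mass error in parts per million, and resolving power as m/z divided by peak width at half maximum); the peptide mass, charge and peak width are hypothetical values chosen only for illustration and do not describe the instrument settings used in this study.

```python
PROTON_MASS = 1.007276  # Da; mass added per positive charge in ESI


def mz_from_neutral_mass(neutral_mass, charge):
    """m/z of a peptide ion carrying `charge` protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def ppm_error(observed_mz, theoretical_mz):
    """Mass accuracy expressed in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def resolving_power(mz, peak_width_fwhm):
    """Resolving power, defined here as m/z divided by peak width at half maximum."""
    return mz / peak_width_fwhm


# Hypothetical doubly charged peptide with a neutral monoisotopic mass of 1000 Da.
theoretical = mz_from_neutral_mass(1000.0, 2)
observed = theoretical + 0.0025  # hypothetical measurement

print(f"theoretical m/z: {theoretical:.6f}")
print(f"mass error:      {ppm_error(observed, theoretical):.2f} ppm")
print(f"resolving power: {resolving_power(500.0, 0.01):.0f}")
```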

The final component of a mass spectrometer is the detector, which records the charge or

current produced by the ions, resulting in the generation of mass spectra. As described above, the

spectra are then searched using computer algorithms against databases containing sequences of

all known and predicted proteins to confer protein identifications (in typical bottom-up

approaches).

1.3.2 Proteomics Discovery Pipeline – Discovery, Verification and Validation

A standardized protocol for the discovery of biomarkers through mass spectrometry-based

proteomic approaches does not exist in the field; however, there are several theoretical

3- to 5-phase models described in the literature which can serve as useful platforms for the identification

of novel protein biomarkers [78,79]. The primary phases of such models are discovery,

verification and validation. In the discovery phase, mass spectrometry analysis is undertaken for

the identification of proteins in various biological sources. Due to the complex nature of serum,

as described in greater detail in the sections below, samples used in this phase are predominantly

tissues, cell lines and biofluids [44,79]. The discovery phase typically results in the

identification of thousands of proteins and may involve semi-quantitative or relative

quantification analysis between cancer and non-cancerous samples. Next, the identified proteins

are mined through application of bioinformatics to generate a manageable list of putative

candidates (~50-100) for testing. The bioinformatics criteria used to filter candidates are usually

arbitrary and defined by each individual study group [80]. Common criteria involve study of

differential expression of proteins, Gene Ontology analysis, pathway analysis of proteins,

study of protein tissue specificity through publicly available databases, comparisons to

literature and mRNA, miRNA, DNA database sources, etc. Subsequently, generated candidates

are verified in a moderate number of serum samples to preliminarily assess the efficacy of the

candidates to discriminate between cases and controls [78-80]. Verification and validation in the

final clinically used biological source (i.e. in this case serum) is pertinent given that other

biological sources are used in the high throughput discovery-phase studies for initial

identification of candidates and the somewhat arbitrary nature of candidate prioritization. Many

candidates will be rejected during verification phases; however a small handful of proteins (~5-

10) will likely emerge as promising candidates that can distinguish cancer from controls,

warranting their further validation in larger sample sets (several hundred to thousands of samples per

study group) [78,79]. A schematic of this pipeline is presented in Figure 1.1.

Whereas the technology used in the discovery phase is mass spectrometry, the “gold

standard” methods for verification and validation are immunoassays used to measure the concentration of

specific proteins, particularly enzyme-linked immunosorbent assays (ELISAs), due to their high

sensitivity and specificity for targeted protein quantification [79]. A challenge or bottleneck in

the described pipeline is the inability to verify all generated candidates due to a lack of

commercially available ELISAs for the majority of proteins, coupled with the high costs of

producing ELISAs for proteins that currently lack reagents (~$50,000 - $100,000 for a research

grade ELISA and higher for a clinical grade ELISA) [69].

Figure 1.1 Biomarker Discovery Pipeline. Depicted are the discovery, verification and validation phases of proteomics-based

biomarker discovery. The phases are described in the “input” and “output” text. The shape of the pipeline depicts a bottleneck to

portray the existing inability to verify/validate all candidates due to a lack of commercially available assays/reagents for verification.

*, candidate selection through bioinformatics; garbage cans depict candidates that are rejected at each phase due to poor

discriminatory ability.

Mass spectrometry-based targeted protein quantification methods such as multiple

reaction monitoring (MRM) are emerging to fill this gap. At present, MRM-based approaches

are limited by their sensitivity for analysis of low abundance serum proteins (without prior

enrichment of proteins, MRM can detect serum proteins only down to the mg/L range in direct serum

digests); however, with advancements in the sensitivity of mass spectrometers, MRM will likely

emerge in the near future as a means to alleviate this bottleneck [69,81].

During the verification and validation phases, various statistical criteria are examined such as

sensitivity, specificity and receiver operating characteristic (ROC) curve analysis [44].

Sensitivity: The true positive rate of a test (i.e. the proportion of individuals that

have the disease who correctly tested positive) [44]

Specificity: The true negative rate of a test (i.e. the proportion of individuals that do

not have the disease (controls) who correctly tested negative) [44]

Receiver operating characteristic (ROC) curve: ROC curves graphically represent the

true positive rate (sensitivity) versus false positive rate (1-specificity). ROC curves

enable determination of the effectiveness of a biomarker at various cut-off points for

sensitivity or specificity. An ideal marker would be one in which the area under the

ROC curve is maximum (i.e. AUC = 1.0; the test can correctly classify all true positives

as such and all true negatives as such). An advantage of ROC curve analysis is the

ability to evaluate multiple biomarkers or candidates on the same plot as well as to

assess panels of biomarkers in combination [44,82].
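The three criteria defined above can be computed as in the following minimal Python sketch. The marker concentrations are hypothetical values invented for illustration, the cut-off is arbitrary, and the AUC is calculated here as the proportion of case-control pairs in which the case has the higher marker value (ties counted as one half), which is one standard way of obtaining the area under the ROC curve.

```python
def sensitivity_specificity(case_values, control_values, cutoff):
    """Sensitivity and specificity when values at or above `cutoff` are called positive."""
    true_positives = sum(v >= cutoff for v in case_values)
    true_negatives = sum(v < cutoff for v in control_values)
    return true_positives / len(case_values), true_negatives / len(control_values)


def roc_auc(case_values, control_values):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    score = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                score += 1.0
            elif case == control:
                score += 0.5
    return score / (len(case_values) * len(control_values))


# Hypothetical serum marker concentrations (arbitrary units).
cases = [12.0, 35.0, 48.0, 60.0, 75.0]
controls = [5.0, 8.0, 15.0, 20.0, 40.0]

sens, spec = sensitivity_specificity(cases, controls, cutoff=30.0)
auc = roc_auc(cases, controls)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}, AUC = {auc:.2f}")
```

Repeating the sensitivity/specificity calculation over a range of cut-offs traces out the ROC curve itself.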

1.3.3 Pancreatic Cancer Serum Proteomics

The majority of pancreatic cancer proteomic-based discovery studies have focused on

serum [83-90] or tissue proteomics [91-95], with an increasing number of proximal biological

fluid studies [96-104] over the past several years. Serum proteomics studies for pancreatic

cancer have utilized SELDI approaches to generate characteristic peak patterns or pancreatic

cancer “signatures”. For instance, the use of four mass-to-charge (m/z) values in combination

with CA19.9 was shown recently to improve the diagnostic accuracy of CA19.9 (improved area

under the curve from 0.883 to 0.935) [83]. Several years earlier, the use of four different peaks

with CA19.9 was shown to detect 29 of 29 pancreatic cancer patients accurately [105]. A

limitation of such approaches is the difficulty in translating cancer signatures into the clinical

laboratory setting. In addition, unlike the method of MS described in the “Principles of Mass

Spectrometry” section above, certain SELDI-based approaches are limited by the inability to

identify corresponding proteins for the characteristic mass to charge (m/z) patterns or signatures

identified. Other approaches to pancreatic cancer serum proteomics have been MALDI based

[65,83,84,88], as well as a proteomic approach analyzing murine serum in a progression model

of pancreatic cancer followed by verification of candidates in human serum [106].

Generally, while analysis of serum seems a practical choice when mining for

serological biomarkers, MS-based serum proteomics is hindered by the large dynamic range of

protein concentrations in serum (approximately 11 orders of magnitude) compared with the analytical

range of mass spectrometers (approximately 5 orders of magnitude). In addition, serum contains approximately twenty-two

proteins of high abundance which are of little diagnostic value that comprise ~99% of the total

protein mass of serum [44]. The remaining 1% contains potentially thousands of proteins of

interest for biomarker studies. These proteins are primarily in the ng/L to µg/L range and are

difficult to examine through mass spectrometry due to the masking effects posed by proteins of

high abundance. Furthermore, serum is largely heterogeneous and its inherent features can vary

from individual to individual based on hormonal status, age, sex, diet etc. [107]. While great

strides have been made in serum proteomics-based discovery studies, due to the above stated

reasons, at present, analysis of serum is likely best left for candidate verification stages.

1.3.4 Tissue Proteomics

Tissue is another source that is often analyzed in discovery studies. Tissue proteomics

offers the ability to analyze cancerous versus normal adjacent tissue for the detection of changes

between the two [91,108]. While this is an appealing prospect for the discovery of tissue-based

biomarkers, not many candidates discovered through tissue proteomics have been verified in

serum. Additionally, almost all of the cancer biomarkers currently in use are proteins that are

secreted or shed from tumor cells and proteomic analysis of whole tissue fails to enrich for

secreted and/or shed components. Pancreatic cancer, in particular, is characterized by a high

stromal reaction and in some instances, tumor sections may contain more stroma than cancerous

cells [109]. While these stromal cells are a part of the pancreatic cancer tumor

microenvironment and may contribute to the production of biomarkers, analysis of tissue lysates

is not optimal for the identification of serological biomarkers. Instead, proximal biological fluids,

which bathe tumor cells and into which tumor cells and their microenvironment contribute their

secretions, are likely a more valuable biological source for discovery studies.

1.3.5 Proteomics of Proximal Biological Fluid and Cell Line Conditioned Media

The concentration of many biomarkers is expected to be in the ng/L to µg/L range in

circulation; however, closer to the tumor, their concentration is greater. For instance, in a study

looking at levels of CA125 (carbohydrate antigen 125), a widely used ovarian cancer

biomarker, in serum, ascites and cystic fluid of ovarian cancer patients (the latter two

being more proximal to the tumor site), median CA125 levels were found to

be 696 U/mL, 18,563 U/mL and 44,850 U/mL, respectively [110]. As a result, many groups have

taken to proteomic analysis of biological fluids more proximal to the tumor, as they represent

sources more enriched in potential biomarkers.

In terms of pancreatic cancer, several recent studies detailing proteomic analysis of

pancreatic juice and pancreatic cystic fluid for the discovery of novel biomarkers have been

published [98-104]. Pancreatic juice is an alkaline fluid secreted by the exocrine cells of the

pancreas into the duodenum. It contains a large number of inactive enzymes which become

activated once in the intestine to aid in digestion. In pancreatic cancer, it is likely that

pancreatic juice will also contain the secretions of tumor cells. Between 22 and 170 proteins

per study have been identified in six pancreatic juice studies using a variety of different MS-based

approaches [98-103]. Of these, two studies performed verification in serum samples using

ELISAs of the proteins hepatocarcinoma-intestine-pancreas/pancreatitis-associated-protein

(HIP/PAP-I) [103] and matrix metalloproteinase-9 (MMP-9) [102]. Ascites is another fluid that

has been shown recently to be a good medium to mine for biomarkers [111,112]. Ascites fluid acts as

a local microenvironment containing secretions from cancer cells and other malignant processes;

however the proteome of pancreatic cancer-derived ascites has, to our knowledge, not yet been

profiled.

The use of cell culture supernatants or conditioned media (CM) for biomarker

discovery is another approach and has been gaining popularity in recent years [80]. Although

significant differences have been noted in the literature between cell lines and primary tumors,

their genomic and transcriptional characteristics, as well as biological heterogeneity, have, in

general, been shown to recapitulate those of primary tumors [113-116]. Comparison of features

such as morphology, aneuploidy, and expression of important genes such as K-ras and p53, have

also shown good concordance between cancer cell lines and primary tumors for other cancer

sites [115,116] and for pancreatic cancer as well [113,114]. Additionally, the identification of

known biomarkers, such as PSA in prostate cancer cell lines and CA125 in ovarian cancer cell

lines, makes cell lines a viable source to mine.

1.3.6 Integrated Strategies

A way in which to improve current strategies for biomarker discovery and produce

markers with clinical utility may be to incorporate and integrate multiple biological fluids. Most

studies to date have utilized only serum/plasma, tissue or a proximal biological fluid for their

respective analyses. However, given that cancer is a highly heterogeneous disease, integration

and comparison of proteomes from multiple sources may yield “stronger” or more promising

candidates for verification. For instance, in a recent study by our laboratory which compared the

proteins/genes identified in six publications chosen arbitrarily to represent various biological

sources and both proteomic and genomic data pertaining to ovarian cancer (2 cell line CM

studies, 2 ascites, 1 tissue proteomics study and 1 microarray study), no proteins were found

common to all 6; however two proteins were found common to four of the studies [117]. The

proteins identified were WAP four-disulfide core domain protein 2 precursor (HE4) and GRN

(granulin). Both have been implicated in ovarian cancer and HE4 is a known ovarian cancer

biomarker. In this regard, the combining of information from multiple biological sources may

yield stronger candidates for verification.
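The type of cross-study comparison described above can be sketched in Python as follows. The study labels and protein identifiers are placeholders invented for illustration (they are not the proteins or datasets from [117]); the sketch simply counts in how many sources each protein was reported and lists those shared by a minimum number of sources.

```python
from collections import Counter


def count_sources_per_protein(studies):
    """Count, for each protein, the number of studies in which it was identified."""
    counts = Counter()
    for proteins in studies.values():
        counts.update(set(proteins))
    return counts


def shared_by_at_least(studies, min_studies):
    """Proteins identified in at least `min_studies` of the studies, sorted by name."""
    counts = count_sources_per_protein(studies)
    return sorted(p for p, n in counts.items() if n >= min_studies)


# Placeholder data: six hypothetical sources and made-up protein identifiers.
studies = {
    "cell_line_cm_1": {"P1", "P2", "P3"},
    "cell_line_cm_2": {"P1", "P2", "P4"},
    "ascites_1": {"P1", "P5"},
    "ascites_2": {"P2", "P5", "P6"},
    "tissue": {"P1", "P6"},
    "microarray": {"P2", "P7"},
}

print("common to all six:", shared_by_at_least(studies, 6))
print("common to at least four:", shared_by_at_least(studies, 4))
```

In this made-up example no protein is common to all six sources, while two are common to four, which is the kind of overlap that would prioritize a candidate for verification.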

1.4 Rationale, Hypothesis, Objectives

1.4.1 Rationale

Pancreatic cancer is one of the most lethal of all solid malignancies. Non-invasive,

highly specific and sensitive methods that shift diagnosis to the early stages

of tumor development could improve patient survival and enable optimal care for

patients. Deregulated molecular pathways and physiological processes are a hallmark of

tumorigenesis, and many molecular and pathophysiological changes in cells and the tumor

microenvironment can ultimately culminate in the aberrant expression of protein products

[118,119]. The identification of such proteins, especially those which when measured in serum

reveal clinically useful information about the disease state of individuals, can greatly aid in the

detection and clinical management of patients with pancreatic cancer.

Serum is a highly complex fluid that is believed to contain subsets of proteomes

representative of each tissue [54]. However, ~22 proteins of high abundance constitute ~99% of

the total protein mass in serum. It is in the remaining ~1%, which contains thousands of

proteins, that potentially useful information regarding the presence and growth of tumors is

believed to lie. Given the limitations of MS-based serum analysis due to interference from

proteins of high abundance, many researchers have turned to analysis of fluids in closer

proximity to tumor cells [117]. Protein-based biomarkers are believed to be proteins that are

secreted, shed, cleaved or leaked from tumor tissues and their local microenvironment, and in

this regard, proximal biological fluids represent enriched sources to mine for potential

biomarkers prior to proteins entering the circulation and becoming vastly diluted [44,117].

Cell culture systems and analysis of conditioned media from cells grown in serum-free

media lack the presence of high abundance serum proteins that can interfere with MS analysis

[80]; however one of the possible limitations to tissue culture-based work is perhaps the inability

to adequately capture salient aspects of the tumor microenvironment and protein biomarkers that

may be produced as a result of aberrant interactions at the tumor-host interface [117]. Biological

fluids from patients with pancreatic cancer, such as pancreatic juice, on the other hand,

may contain high abundance serum proteins (due to the method of extraction during

surgery) but likely also contain pertinent contributions of tumor cells and their surrounding

microenvironment. Previous studies have focused on one biological source for analysis;

however integration of multiple biological sources (cell line conditioned media and pancreatic

juice) should enable us to better capture salient aspects relevant for biomarker discovery as the

advantages of one biological source may compensate for the shortcomings of another.

Mass spectrometry has also been in a constant state of evolution, becoming increasingly

sensitive and sophisticated. In parallel, software for data mining and other bioinformatic tools

have been garnering momentum [68,71,120]. Despite the inability of high-throughput methods

to provide an enhanced biomarker that can displace or supplement CA19.9 in the clinic for

pancreatic cancer to date, optimism remains that the experience gained in the field in the last

few years, and implementation of more integrated approaches to biomarker discovery, should

prove fruitful for pancreatic cancer biomarker discovery in the near future.

1.4.2 Hypothesis

Proteins which can serve as biomarkers become elevated in serum through secretion,

shedding, cleavage and leakage from tumor cells and their microenvironment. To this end, we

hypothesize that novel candidate biomarkers for pancreatic cancer can be identified through

extensive proteomic analysis of supernatants of human pancreatic cancer cell lines grown in

vitro, in conjunction with pancreatic juice collected from pancreatic cancer patients. Through

subsequent application of bioinformatics-based filtering criteria, which will include label-free

protein quantification between cancer and normal cell lines and integration of the multiple

biological fluids, followed by verification of candidates in serum, we hope to identify a small

handful of proteins that show promise as potential pancreatic cancer biomarkers, warranting

their further and extended validation.

1.4.3 Objectives

1. Perform mass spectrometry analysis of cell line conditioned media in triplicate from six

pancreatic cancer cell lines (MIA-PaCa2, BxPc3, PANC1, CAPAN1, CFPAC1 and

SU.86.86) and one normal pancreatic ductal epithelial cell line (HPDE) using two-dimensional

liquid chromatography tandem mass spectrometry (2D-LC-MS/MS).

2. Perform mass spectrometry analysis of pancreatic juice samples in triplicate using 2D-

LC-MS/MS.

3. Identify candidates for verification in serum/plasma using bioinformatics-based analysis

such as the following:

a. Label-free protein quantification comparing average normalized spectral counts

between triplicate analysis of cancer cell lines and the HPDE cell line to

determine differentially expressed proteins.

b. Gene Ontology analysis, focusing on extracellular and cell surface annotated

proteins.

c. Integrated analysis of cell lines with pancreatic juice, focusing on proteins

common to multiple biological fluids.

d. Tissue specificity analysis focusing on proteins specific to or highly expressed in

the pancreas based on publicly available databases.

e. Hierarchical clustering analysis.

4. Perform initial verification studies in serum/plasma from patients with pancreatic cancer

and healthy controls of similar age and sex (n=40) using ELISAs to preliminarily assess

the ability of candidates to discriminate between cancer and controls.

5. Perform further verification/validation of promising candidates in a larger number of

serum samples (n~200)
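As an illustration of the label-free quantification outlined in Objective 3a, the following Python sketch normalizes spectral counts within each replicate, averages the triplicates and computes a cancer-to-normal ratio. The normalization scheme (scaling each run to a common total), the pseudocount and all protein names and counts are assumptions made for illustration only; they are not the parameters or data of this study.

```python
def normalize_run(counts, target_total):
    """Scale the spectral counts of one run so that they sum to `target_total`."""
    total = sum(counts.values())
    return {protein: c * target_total / total for protein, c in counts.items()}


def average_normalized_counts(replicates, target_total):
    """Average normalized spectral counts per protein across replicate runs."""
    normalized = [normalize_run(run, target_total) for run in replicates]
    proteins = set().union(*normalized)
    return {
        p: sum(run.get(p, 0.0) for run in normalized) / len(normalized)
        for p in proteins
    }


def fold_change(cancer_avg, normal_avg, pseudocount=1.0):
    """Cancer/normal ratio; the pseudocount avoids division by zero."""
    proteins = set(cancer_avg) | set(normal_avg)
    return {
        p: (cancer_avg.get(p, 0.0) + pseudocount)
        / (normal_avg.get(p, 0.0) + pseudocount)
        for p in proteins
    }


# Hypothetical triplicate spectral counts for one cancer line and the normal line.
cancer_runs = [
    {"PROT_A": 30, "PROT_B": 10},
    {"PROT_A": 60, "PROT_B": 20},
    {"PROT_A": 15, "PROT_B": 5},
]
normal_runs = [
    {"PROT_A": 10, "PROT_B": 30},
    {"PROT_A": 20, "PROT_B": 60},
    {"PROT_A": 5, "PROT_B": 15},
]

cancer_avg = average_normalized_counts(cancer_runs, target_total=100.0)
normal_avg = average_normalized_counts(normal_runs, target_total=100.0)
ratios = fold_change(cancer_avg, normal_avg)

for protein in sorted(ratios):
    print(f"{protein}: cancer/normal ratio = {ratios[protein]:.2f}")
```

Proteins with ratios well above 1 in such a comparison would be considered over-represented in the cancer cell line conditioned media and carried forward to the remaining filtering criteria.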

CHAPTER 2

Mass Spectrometry Analysis of Cell Line Conditioned Media and

Pancreatic Juice for Identification of Candidate Pancreatic Cancer

Biomarkers

A modified version of the work presented in this chapter will be submitted to the journal

Molecular and Cellular Proteomics

2.1 Introduction

Pancreatic cancer is the 4th leading cause of cancer-related death and one of the most

highly aggressive and lethal of all solid malignancies [121]. Worldwide, over 200,000

individuals are diagnosed with pancreatic cancer each year, and due to the asymptomatic nature

of its early stages, coupled with inadequate methods for early detection, the majority of patients

(>75%) present with locally advanced and inoperable forms of the cancer at the time of

diagnosis [121]. At these advanced stages, available chemotherapy, radiation and combinatorial

therapies are largely anecdotal, and fewer than 5% of patients survive up to five years after

diagnosis [4,121].

One way to aid in the clinical management of cancer patients is through the use of serum

biomarkers. Biomarkers are measurable indicators of a biological state or condition, and in the

context of cancer, serum biomarkers present a non-invasive and relatively cost effective means

to aid in detection, monitor tumor progression and response to therapy, and for other measurable

outcomes of disease [44]. The most widely used biomarker in the clinic for pancreatic cancer is

CA19.9, a sialylated Lewis A antigen found on the surface of proteins [59,60]. While CA19.9 is

elevated in late stage disease, it is also elevated in benign and inflammatory diseases of the

pancreas and in other malignancies of the gastrointestinal tract [61,63]. As well, for early-stage

pancreatic cancer detection, CA19.9 has a reported sensitivity of ~55% and is often undetectable

in many asymptomatic individuals [43,59]. Other tumor markers such as members of the

carcinoembryonic antigen [65,66] and mucin [67,122] families have also been associated with

pancreatic cancer. When used in combination, with or without CA19.9, some of these markers

have shown enhanced sensitivity and specificity; however, none have become a constant fixture

in the clinic. The lack of a single highly specific and sensitive marker has led to a growing

consensus in the field towards the development of multiparametric panels of biomarkers,

whereby the combinatorial assessment of multiple molecules can likely achieve increased sensitivity

and specificity for disease detection and management [51,123-125].

In the present study, we performed in-depth shotgun proteomic analyses, integrating and

comparing the proteomes of conditioned media from pancreatic cancer cell lines, as well as

pancreatic juice samples, for the identification of novel pancreatic cancer biomarkers. Protein-

based biomarkers that can be detected in circulation are typically proteins that are secreted, shed

or cleaved from tumor cells, or ones that may leak out due to local tissue destruction during

disease progression [44]. As such, biological fluids in close proximity to tumor cells likely serve

as enriched sources of potential biomarkers before they enter circulation and become vastly

diluted and potentially masked by proteins of high abundance [79,110,117,125]. With respect to

pancreatic cancer, proteomic analyses of biological fluids such as pancreatic juice, cyst fluid

and bile have been conducted [98-104,126]. Between 22 and 170 proteins were identified

in six pancreatic juice studies using a variety of MS-based approaches [98-103], over 460

proteins in a recent cyst fluid study [104], and 127 proteins in the bile proteome of patients

with bile duct stenosis [126]; however, subsequent verification of candidate biomarkers in

serum or plasma has been minimal.

Tissue culture supernatant, or conditioned medium (CM), is another relevant fluid, the

utility of which for the identification of novel biomarkers has been demonstrated in multiple

cancer sites by our group [80,127-130] and others [131-133]. For pancreatic cancer, Gronborg

et al. analyzed differential protein secretion between the CM of a pancreatic cancer cell line in

comparison to a normal ductal epithelial cell line and identified 195 proteins, of which 145

showed a >1.5-fold change [96], and Mauri et al. identified 46 proteins in the supernatant of

a pancreatic cancer cell line (SUIT2) [97]. In a more recent study, Wu et al. performed

proteomic analysis of 23 cell lines from 11 cancer sites, of which two cell lines were of

pancreatic cancer origin [131]. This group took an interesting approach, utilizing the Human

Protein Atlas database and the absence of proteins in other cancer cell lines to delineate

candidate biomarkers for the various cancer sites. More recently, another group analyzed the

conditioned media of 5 pancreatic cell lines to identify deregulated pathways [134]. What is

lacking in the field is integrative analysis and mining of the proteomes from different biological

sources pertaining to a disease type for biomarker discovery. The utility of using an integrative

approach to biomarker discovery has been described recently [117,125]. Given that cancer is a

highly heterogeneous disease, through integration and comparison of proteomes from multiple

biological sample types, the advantages of one source may account for the shortcomings of

others, resulting in more relevant and stronger candidates for verification in plasma.

As such, in this study, we performed proteomic analysis of cell line conditioned media

and pancreatic juice. Seven cell lines and six pancreatic juice samples in two pools were

analyzed in triplicate for a total of 27 experiments using 2D-LC-MS/MS. Through label-free

protein quantification between the cancer and normal cell lines and integration of the pancreatic

juice proteome with that of the cell lines, candidate biomarkers were delineated for verification

in plasma. Within our list of candidates were numerous proteins known to be upregulated in

pancreatic cancer, and proteins previously studied in serum as pancreatic cancer biomarkers,

which helps to provide credence to our approach. Of the derived candidates, initial verification

in plasma samples from patients with established pancreatic cancer and controls identified five

proteins – AGR2, PIGR, OLFM4, SYCN and COL6A1 – which showed a significant increase in

plasma from pancreatic cancer patients in comparison to controls. This demonstrates the utility

of our approach for identifying proteins elevated in the plasma of pancreatic cancer patients. Further

validation of these proteins in a larger number of plasma samples is warranted, as is the

investigation of the remaining group of candidates.

2.2 Materials and Methods

Cell Lines

Six pancreatic cancer cell lines (MIA-PaCa2 (CRL-1420), PANC1 (CRL-1469), BxPc3

(CRL-1687), CAPAN1 (HTB-79), CFPAC-1 (CRL-1918) and SU.86.86 (CRL-1837)) were

obtained from the American Type Culture Collection (ATCC, Manassas, VA). The cell lines

were derived from pancreatic ductal adenocarcinomas, which account for approximately 85-

90% of all pancreatic cancers. The cell lines originated from primary tumors of the head or body

of the pancreas (MIA-PaCa2, PANC1, BxPc3), or from metastatic sites (CAPAN1, CFPAC-1,

SU.86.86) [113,114]. The cell lines were derived from individuals of similar ethnic background

and age group (with the exception of CFPAC-1), and all of the cancer cell lines, except for

BxPc3, are positive for K-ras mutations, which are found in 85-90% of pancreatic cancers. An

HPV-transfected 'normal' human pancreatic ductal epithelial cell line (HPDE) [135], provided

by Dr. Ming-Sound Tsao at Princess Margaret Hospital, Toronto, Ontario, Canada, was also

analyzed. Apart from a slightly aberrant expression of p53, molecular profiling of this cell line

has shown that expression of other proto-oncogenes and tumour suppressor genes is normal

[135].

Cell culture media specified by ATCC for each of the six pancreatic cancer cell lines

were used and are as follows: DMEM (Catalog No. 30-2002 from ATCC) with 10% fetal bovine

serum (Catalog No.10091-148; Invitrogen) was used for MIA-PaCa2 and Panc1; RPMI–1640

medium modified to contain 2mM L-glutamine, 10mM HEPES, 1mM sodium pyruvate, 4500

mg/L glucose, 1500 mg/L sodium bicarbonate (ATCC Catalog No. 30-2001) with 10% FBS was

used for SU.86.86 and BxPc3; IMDM (Catalog No. 30-2005) with 10% and 20% FBS was used

for the CFPAC-1 and Capan1 cell lines, respectively. The HPDE cell line was grown in

keratinocyte serum free media (Catalog No.17005-042; Invitrogen) supplemented with bovine

pituitary extract and recombinant epidermal growth factor. All cells were cultured in an

atmosphere of 5% CO2 in air in a humidified incubator at 37°C.

Cell Culture

An optimal seeding density and incubation period which supported maximal protein

secretion with minimal cell death was selected for each of the cell lines, as described previously

[127]. Cells were cultured in T-175 cm² flasks at determined optimal seeding densities of

~10 × 10⁶ for MIA-PaCa2, Panc1 and Capan1, 14 × 10⁶ for BxPc3, 3 × 10⁶ for HPDE, 13 × 10⁶ for

CFPAC1 and 4 × 10⁶ for SU.86.86 in three replicates per cell line. Cells were first cultured for

48 hours in 40mL of their respective growth media to obtain adherence to culture flasks. The

medium was then removed and the cells/flasks were subjected to two gentle washes with 30mL

of PBS (Invitrogen). Forty milliliters of chemically defined Chinese hamster ovary (CDCHO)

serum-free medium (Invitrogen) supplemented with 8mM glutamine (Invitrogen) was then

added and the cells were left to culture for determined optimal incubation periods of 72 hours

for Capan1, CFPAC1 and SU.86.86, 96 hours for BxPc3 and HPDE and 144 hours for MIA-

PaCa2. The CDCHO media that the cells were grown in were subsequently collected and

centrifuged at 1500 rpm for 10 minutes to remove cellular debris. Total protein concentration was

measured in each of the three replicates using a Coomassie (Bradford) total protein assay [136],

and a volume corresponding to 1 mg of total protein from each of the

replicates was subjected to the sample preparation protocol below.

Pancreatic Juice

Pancreatic juice samples were provided by Dr. Felix Rueckert, Dresden, Germany.

Approximately 50-500µL of pancreatic juice was collected from the main pancreatic duct of

patients undergoing pancreatic surgery. Upon collection, the samples were stored at -80°C until

further use. Samples from patients with clinically confirmed cases of pancreatic ductal

adenocarcinoma that contained no visible signs of blood were selected for analysis. Six

pancreatic juice samples met these criteria. The samples were centrifuged at 16,000 rpm for 10

minutes at 4°C to remove tissue debris. Total protein concentration of each sample was

measured using the Biuret method [137]. Keeping in line with the cell line conditioned media

analysis, it was desirable to use a total protein amount of 1mg for analysis of each of the three

replicates per sample. As a result, two pools of pancreatic juice (pool A and B) were made,

containing three samples each, with total protein concentrations of 2.65 mg/mL and 2.32 mg/mL

for pool A and B, respectively. A volume corresponding to 1mg of total protein was retrieved

from each pool, in triplicate, and subjected to the standardized sample preparation protocol

below (with the exception of dialysis).
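
The volume drawn from each pool follows directly from the reported concentrations. As a minimal sketch (the helper name is illustrative and not part of the original protocol), the calculation for a 1 mg replicate is:

```python
# Volume of pooled pancreatic juice that contains a target protein mass.
# Pool concentrations are those reported in the text; the function name
# is an illustrative assumption.

def volume_for_mass_ul(target_mg: float, conc_mg_per_ml: float) -> float:
    """Return the volume (in microlitres) that contains target_mg of protein."""
    if conc_mg_per_ml <= 0:
        raise ValueError("concentration must be positive")
    return target_mg / conc_mg_per_ml * 1000.0  # convert mL to uL


pool_conc = {"A": 2.65, "B": 2.32}  # mg/mL
volumes = {pool: volume_for_mass_ul(1.0, conc) for pool, conc in pool_conc.items()}

for pool, vol in volumes.items():
    print(f"Pool {pool}: {vol:.0f} uL per 1 mg replicate")
```

Under these concentrations, each replicate corresponds to roughly 0.38 mL of pool A and 0.43 mL of pool B.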

Sample Preparation

Samples were processed as described previously [127]. Briefly, samples were dialyzed

using a 3.5kDa molecular weight cut-off membrane (Spectrum Laboratories, Inc., Compton,

CA) in 5L of 1 mM NH4HCO3 buffer solution at 4°C overnight and subsequently frozen and

lyophilized to dryness to concentrate proteins using a ModulyoD Freeze Dryer (Thermo

Electron Corporation). Proteins in each lyophilized replicate were denatured using 8M urea and

reduced with the addition of 200mM dithiothreitol (final concentration of 13mM) in 1M

NH4HCO3 at 50°C for 30 minutes. Samples were then alkylated with the addition of 500mM

iodoacetamide and incubated in the dark, at room temperature, for 1 hour. Each replicate was

then desalted using a NAP5 column (GE Healthcare), frozen and lyophilized. Lastly, samples

were trypsin-digested (Promega, sequencing grade modified porcine trypsin) through an

overnight incubation at 37°C using a trypsin-to-protein ratio of 1:50. Tryptic

peptides were frozen in solution at -80°C to inhibit trypsin function and lyophilized.

Strong Cation Exchange (SCX) on a High Pressure Liquid Chromatography (HPLC) System

The tryptic peptides were resuspended in 510µL of mobile phase A (0.26 M formic acid

in 10% acetonitrile; pH 2-3) and loaded directly onto a 500 µL loop connected to a

PolySULFOETHYL A™ column (The Nest Group, Inc.). The column has a silica-based

hydrophilic, anionic polymer (poly-2-sulfoethyl aspartamide) with a pore size of 200 Å and a

diameter of 5 µm. The SCX chromatography and fractionation was performed on an HPLC

system (Agilent 1100) using a 1-hour procedure with a linear gradient of mobile phase A. For

elution of peptides, an elution buffer which contained all components of mobile phase A with

the addition of 1 M ammonium formate was introduced at 20 min in the 60 min method. The

eluent was monitored at a wavelength of 280 nm and fractions were collected every minute from

the 20 minute time point onwards. This resulted in the collection of 40 one-minute fractions.

Collected fractions were left unpooled or subsequently combined into 2, 3 or 5 min pools,

according to the elution profile of the resulting SCX chromatogram. As a general strategy,

where the absorbance reading of the elution profile was greater (typically the first 10-15 min of

elution), fractions were left unpooled or pooled every two minutes to keep sample complexity at

a minimum. Where the absorbance readings were lower (towards the end of the method),

fractions were pooled in 3 or 5 min pools. The same pooling method was utilized for all three

replicates of the CM from each cell line and for the pancreatic juice pools.
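
The pooling logic described above can be sketched as follows. This is an illustration only: pooling in the study was decided by inspecting the SCX chromatogram, and the absorbance cut-offs and the synthetic trace below are hypothetical values chosen to show the idea.

```python
# Sketch of absorbance-guided pooling of one-minute SCX fractions.
# All cut-offs and the synthetic absorbance trace are hypothetical.

def pool_size(absorbance_au: float) -> int:
    """Map an absorbance reading to a pool width in minutes (hypothetical cut-offs)."""
    if absorbance_au >= 0.50:
        return 1   # most complex region: leave the fraction unpooled
    if absorbance_au >= 0.20:
        return 2
    if absorbance_au >= 0.05:
        return 3
    return 5       # low-absorbance tail of the method


def pool_fractions(absorbance, first_minute=20):
    """Greedily group consecutive one-minute fractions into pools."""
    pools, i = [], 0
    while i < len(absorbance):
        size = pool_size(absorbance[i])
        end = min(i + size, len(absorbance))
        pools.append(list(range(first_minute + i, first_minute + end)))
        i += size
    return pools


# Synthetic trace for the 40 one-minute fractions collected from 20 min onwards.
trace = [0.8] * 6 + [0.3] * 10 + [0.1] * 9 + [0.02] * 15
pools = pool_fractions(trace)
print(len(pools), "pools; first:", pools[0], "last:", pools[-1])
```

Applying one fixed rule to every replicate mirrors the requirement that the same pooling method be used for all replicates of a sample.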

Mass spectrometry (LC-MS/MS)

The SCX fractions/pools were purified through OMIX Pipette Tips C18 (Varian Inc.) to

further remove impurities and salts and eluted in 4 µL of 70% MS Buffer B (90% ACN, 0.1%

formic acid, 10% water, 0.02% TFA) and 30% MS Buffer A (95% water, 0.1% formic acid, 5%

ACN, 0.02% TFA). Eighty microlitres of MS Buffer A was added to the eluent, and 40 µL of

sample was loaded onto a 3 cm C18 trap column (with an inner diameter of 150 µm; New

Objective), packed in-house with 5 µm Pursuit C18 (Varian Inc.). A 96-well microplate

autosampler was utilized for sample loading. Eluted peptides from the trap column were

subsequently loaded onto a resolving analytical PicoTip Emitter column, 5cm in length (with an

inner diameter of 75 µm and 8 µm tip, New Objective) and packed in-house with 3 µm Pursuit

C18 (Varian Inc.). The trap and analytical columns were operated on the EASY-nLC system

(Proxeon Biosystems, Odense, Denmark), and this liquid chromatography setup was coupled

online to an LTQ-Orbitrap XL hybrid mass spectrometer (Thermo Fisher Scientific, San Jose,

California) using a nano-ESI source (Proxeon Biosystems, Odense, Denmark). Samples were

analyzed using a gradient of either 54 or 90 minutes (for 5 min pools, a 90 minute gradient was

used, and for 2 min, 3 min and non-pooled samples, a 54 minute gradient was used). Samples

were analyzed in data-dependent mode: full MS1 scan acquisition from 450-1450 m/z

occurred in the Orbitrap mass analyzer (resolution 60,000), while MS2 scan acquisition of the top six

parent ions occurred in the linear ion trap (LTQ) mass analyzer. The following parameters were

enabled: monoisotopic precursor selection, charge state screening and dynamic exclusion. In

addition, charge states of +1, >4 and unassigned charge states were not subjected to MS2

fragmentation.

Protein Identification

XCalibur software was utilized to generate RAW files of each MS run. The RAW files

were subsequently used to generate Mascot Generic Files (MGF) through extract_msn on

Mascot Daemon (version 2.2). Once generated, MGFs were searched with two search engines,

Mascot (Matrix Science, London, UK; version 2.2) and X!Tandem (Global Proteome Machine

Manager; version 2006.06.01), to confer protein identifications. Searches were conducted

against the non-redundant Human IPI database (v.3.62) which contains a total of 167,894

forward and reverse protein sequences and using the following parameters: fully tryptic

cleavages, 7ppm precursor ion mass tolerance, 0.4Da fragment ion mass tolerance, allowance of

one missed cleavage, fixed modifications of carbamidomethylation of cysteines, and variable

modification of oxidation of methionines. The files generated from MASCOT (DAT files) and

X!Tandem (XML files) for the three replicates of each biological source were then integrated

through Scaffold 2 software (version 2.06; Proteome Software Inc., Portland, Oregon) resulting

in a non-redundant list of identified proteins per sample. Results were filtered using the

X!Tandem LogE filter and Mascot ion-score filters on Scaffold to achieve a protein false

discovery rate (FDR) <1.0%.
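
To give a sense of scale for the 7 ppm precursor tolerance, the following sketch converts a relative tolerance into an absolute window. The function names and the example m/z values are illustrative assumptions; the search engines apply this tolerance internally.

```python
# Convert a parts-per-million tolerance into an absolute m/z window.
# Function names and example values are illustrative only.

def ppm_window(mz: float, tol_ppm: float = 7.0):
    """Return the (low, high) m/z bounds for a given ppm tolerance."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Return the signed mass error of an observation in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


low, high = ppm_window(800.0)
print(f"7 ppm window at m/z 800: {low:.4f} to {high:.4f}")
print(f"error of 800.004 vs 800.000: {ppm_error(800.004, 800.0):.1f} ppm")
```

At m/z 800 the window is only about 0.011 wide, which is far narrower than the 0.4 Da tolerance used for fragment ions measured in the linear ion trap.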

Data Analysis

Scaffold prot-XML reports were generated and uploaded onto Protein Center (Proxeon)

to facilitate comparisons between cell line CM and pancreatic juice proteomes, and to obtain

gene ontology information. Cellular localization, function and process annotations were

extracted by Protein Center from the Gene Ontology (GO) Consortium

(http://www.geneontology.org/GO.tools.shtml). Due to the large number of different GO

annotations per localization, function and process, Protein Center reduces terms to

approximately 20 high-level terms that are used for filtering. Details can be found at

http://tgh.proteincenter.proxeon.com/ProXweb/Help/Manual/apd.html. A Microsoft Excel

Macro developed in-house by Dr. Irv Bromberg, Mount Sinai Hospital, was also utilized for

comparison of protein lists based on accession number or gene name. Hierarchical clustering

analysis of proteomic data was performed using PermutMatrix, available freely online at

http://www.lirmm.fr/~caraux/PermutMatrix/EN/index.html. PermutMatrix is software

originally developed for gene expression analysis [138]. More recently it has been utilized and

validated for proteomics [139]. For clustering analysis, average emPAI values from the triplicate

analysis of the samples were exported from Protein Center into a space delimited Microsoft

Excel file. For visualization, comparison and data analysis purposes, cell line or pancreatic juice

samples with missing emPAI values for a particular protein were assigned half the minimum

emPAI value for that protein in the data set. The emPAI values were imported into

PermutMatrix and transformed to Z score values for normalization. Two-way hierarchical

clustering analysis was performed using the Pearson and Ward's minimum variance methods for

distance and aggregation, respectively. Resultant dendrograms with cell lines and pancreatic

juice samples on the x-axis and gene names on the y-axis were exported.
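
The missing-value handling and Z-score transformation described above can be sketched as follows. This is an illustration under two assumptions that are not stated in the text: that scores are computed per protein (row-wise) and that the population standard deviation is used. The example emPAI values are invented.

```python
# Half-minimum imputation followed by a row-wise Z-score transformation.
# Row-wise scaling and the use of the population standard deviation are
# assumptions; the example values are invented.
from statistics import mean, pstdev


def impute_half_min(values):
    """Replace missing (None) emPAI values with half the protein's minimum."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("protein has no observed emPAI values")
    fill = min(observed) / 2.0
    return [fill if v is None else v for v in values]


def z_scores(values):
    """Centre and scale one protein's values across samples."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


row = [0.8, None, 0.4, 1.2]      # one protein across four samples
imputed = impute_half_min(row)   # the missing value becomes 0.4 / 2
z = z_scores(imputed)
print(imputed, [round(x, 2) for x in z])
```

Imputing a small non-zero value rather than zero keeps proteins that were not detected in a sample in the clustering without letting them dominate the distance calculation.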

Label-free Protein Quantification

Semi-quantitative analysis was conducted between the cancer cell lines and the HPDE

normal pancreatic ductal epithelial cell line to ascertain proteins over or under-expressed in the

cancer cell lines based on spectral counting. The 'Quantitative Value' function of Scaffold 2.06

software, which provides normalized spectral counts based on the total number of spectra

identified in each sample was utilized. One file containing all of the normalized spectral counts

of each of the three replicates from the 7 cell lines was generated for proteins identified with 2

or more peptides. One-way ANOVA was conducted to identify proteins that showed a

significant difference amongst the seven cell lines (p<0.05). For proteins that showed a p value

<0.05, the average spectral count for the three replicates was calculated and fold-change was

determined by dividing the average counts from each of the cancer cell lines with that of the

normal HPDE cell line and vice versa. Not all proteins were identified in all of the cell lines;

however all proteins had to have been identified by ten or more spectra in at least one biological

sample to be included. Proteins with ambiguous peptides were searched individually to ensure

normalization of spectral counts did not significantly alter values. Unidentified proteins or

missing values in a particular biological sample were assigned a normalized spectral count of 1

to keep from dividing by zero and to prevent overestimation of fold-changes.
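
As a minimal sketch of the fold-change step, the following assumes that the substitution of 1 is applied to each missing or zero replicate value before averaging (the text does not specify whether it is applied before or after averaging). The spectral counts shown are invented.

```python
# Fold change between a cancer cell line and HPDE from normalized spectral
# counts. Applying the floor per replicate is an assumption; the counts
# in the example are invented.
from statistics import mean

MISSING_COUNT = 1.0  # value assigned to unidentified proteins, as stated above


def filled(replicates):
    """Replace missing (None) or zero counts with MISSING_COUNT."""
    return [MISSING_COUNT if not c else float(c) for c in replicates]


def fold_changes(cancer_reps, normal_reps):
    """Return (cancer/normal, normal/cancer) ratios of mean spectral counts."""
    cancer = mean(filled(cancer_reps))
    normal = mean(filled(normal_reps))
    return cancer / normal, normal / cancer


# A protein seen in all three cancer replicates but never in HPDE.
over, under = fold_changes([30, 28, 32], [None, None, None])
print(f"cancer/HPDE = {over:.1f}, HPDE/cancer = {under:.3f}")
```

With the floor in place, a protein absent from HPDE yields a finite 30-fold change here instead of a division by zero.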

Plasma Samples

Blood samples were collected from pancreatic cancer patients at the Princess Margaret

Hospital GI Clinic in Toronto, Canada, or from kits sent directly to consented patients recruited

from the Ontario Pancreas Cancer Study at Mount Sinai Hospital following a standardized

protocol (age range 55-86; median age 68; 10 female and 10 male). Samples were collected with

informed consent, and with the approval of the institutional ethics board. Samples from healthy

controls were obtained from the Familial Gastrointestinal Cancer Registry (FGICR). The

controls are non-blood relatives of patients in FGICR studies (age range 46-84; median age 60;

9 female and 11 male). Blood was collected in ACD (anticoagulant) vacutainer tubes and

plasma samples were processed within 24 hours of blood draw. To pellet the cells, blood

samples were centrifuged at room temperature for 10 minutes at 913 × g. Immediately after

centrifugation, the plasma samples were aliquoted into 250 µL cryotubes and stored at -80°C or

liquid nitrogen until further use.

ELISAs and Immunoassays

Enzyme-linked immunosorbent assays for AGR2, SYCN, OLFM4, COL6A1, PIGR,

PLAT, TFF2 and NUCB2 were purchased commercially and performed according to the

manufacturer's instructions. Six of the ELISA kits were purchased from USCN LifeSciences

(AGR2: Catalogue # E2285Hu, SYCN: Catalogue # E93879Hu, OLFM4: Catalogue #

E90162Hu, COL6A1: Catalogue # E92150Hu; PIGR: Catalogue # E91074Hu; TFF2: Catalogue

# E0748Hu). The PLAT (tPA) immunoassay was purchased from American Diagnostica Inc.

(Catalogue # 860) and the NUCB2 ELISA was purchased from Phoenix Pharmaceuticals

(Catalogue # EK-003-26). The ELECSYS CA 19-9 immunoassay by Roche was utilized to

measure CA19.9 levels in plasma, and kallikrein 6 and 10 internal control proteins were

measured in CM using in-house developed ELISA assays, as described previously [140,141].

Statistical Analysis

Mann-Whitney U-tests were applied to verification experiments in plasma to determine

if differences in the medians were significant between cancer and control groups using

GraphPad Prism 4 Software. The five candidates that showed a statistically significant difference

(p<0.05) were then assessed in combination in comparison to CA19.9 through ROC curve

analysis. The area under the curve (AUC) values were calculated using ROCR software and the

corresponding variances were calculated with a bootstrap method.
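
The relationship between the Mann-Whitney U statistic and the area under the ROC curve can be sketched for a single marker as follows. This is a plain-Python illustration, not the GraphPad Prism or ROCR procedures used in the study; it does not combine markers or compute p-values, and the concentrations are invented.

```python
# AUC for one marker via the Mann-Whitney U statistic, with a bootstrap
# estimate of its variance. This is an illustrative re-implementation;
# the example concentrations are invented.
import random


def mann_whitney_u(cases, controls):
    """Count (case, control) pairs with case > control; ties count 0.5."""
    u = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def auc(cases, controls):
    """AUC equals U divided by the number of case-control pairs."""
    return mann_whitney_u(cases, controls) / (len(cases) * len(controls))


def bootstrap_auc_variance(cases, controls, n_boot=1000, seed=0):
    """Resample each group with replacement and return the AUC variance."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_boot):
        c = [rng.choice(cases) for _ in cases]
        k = [rng.choice(controls) for _ in controls]
        aucs.append(auc(c, k))
    m = sum(aucs) / n_boot
    return sum((a - m) ** 2 for a in aucs) / (n_boot - 1)


cancer = [5.1, 7.3, 6.8, 9.0]
control = [3.2, 4.8, 5.5, 2.9]
print(auc(cancer, control), bootstrap_auc_variance(cancer, control))
```

Resampling cases and controls separately keeps the group sizes fixed in every bootstrap replicate.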

2.3 Results

Increasing Protein Yield

Once the cell lines were grown and CM collected, the samples were subjected to a 2D-

LC-MS/MS analysis which combined SCX liquid chromatography on an HPLC system,

followed by LC-MS/MS. A schematic of the workflow is provided in Figure 2.1. Guided by our

previous experience [127-130], SCX fractions were initially collected at 5 minute intervals

during peptide elution, resulting in approximately 8 fractions that were analyzed using a ~2 hour

reverse-phase method on the LTQ-Orbitrap mass spectrometer. This resulted in the

identification of 1305, 1468, and 1749 proteins (≥1 peptide) in the triplicate analysis of the

BxPc3, HPDE6 and MIA-PaCa2 cell lines, respectively (Table 2.1). In some of the individual 5

min fractions analyzed (specifically the fractions that contained the highest absorbance readings

during SCX peptide elution) >700 proteins were identified per fraction (data not shown). Based

on previous experience, this was a very large number of proteins to have been identified in

individual fractions. Consequently, we opted to employ a different fraction collection and

pooling strategy. By collecting fractions every minute from SCX and pooling fractions based on

the intensity of peaks eluting on the SCX chromatogram (as described in 'Materials and

Methods'; Fig 2.1), we identified 2017 proteins for the BxPc3 cell line, 2297 for HPDE, and

2756 for the MIA-PaCa2 cell line subjected to the same growth and sample processing

conditions, in triplicate. In order to ensure that this increase in protein yield was not due to

variation in cell growth/sample collection, an additional replicate using MIA-PaCa2 CM left

over from the initial analysis, which had been stored at -80°C, was also run and 2348 proteins

were identified (a 52-56% increase from the individual replicates of the first run of MIA-PaCa2)

(Table 2.1).

Figure 2.1 Schematic Outline of Proteomic Analysis. The top panel (a) details the two pooling methods used for pooling of SCX-

generated fractions. Through application of 1, 2, 3 and 5 min pools (pooling method 2) an increase of ~50-60% in the number of

proteins identified through mass spectrometry was observed. The lower panel (b) details the methodology (sample preparation, pre-

fractionation, mass spectrometry and data analysis) followed in the cell line and pancreatic juice proteomic analyses. CM, conditioned

media; SCX, strong cation exchange; LC-MS/MS, liquid chromatography tandem mass spectrometry.

Table 2.1. Increasing the Number of Identified Proteins by Optimizing Cation Exchange Chromatography Fraction Pooling

Cell line   Proteins identified     Proteins identified with   Number (and %) of         % increase in total
            with 5 min fractions    1, 2, 3 and 5 min pools    method 1 proteins also    proteins identified
            (pooling method 1)(a)   (pooling method 2)(a)      identified by method 2    between methods 1 and 2
BxPc3       1305 [777]              2017 [1261]                1171 (90%) [705 (91%)]    54% [62%]
HPDE        1468 [876]              2297 [1474]                1326 (90%) [802 (92%)]    56% [68%]
MIA-PaCa2   1755 [1096]             2756 [1862] (c)            1598 (91%) [1030 (94%)]   58% [70%]

MIA-PaCa2 replicates   Rep1          Rep2          Rep3          Rep4 (b)      Total
Pooling method 1       1242 [885]    1447 [929]    1450 [908]    -             1755 [1096]
Pooling method 2       2502 [1837]   2501 [1837]   2424 [1823]   2348 [1615]   2756 [1862] (c)

(a) These numbers include proteins identified by one or more peptides and with a false discovery rate <1.0%. Proteins identified with ≥2

peptides are in brackets.

(b) Additional replicate using MIA-PaCa2 CM left over from the pooling strategy 1 analysis and then run using pooling strategy 2.

(c) The indicated total excludes Rep4 values.

Over 90% of the proteins from the first analysis were re-identified and the new pooling

strategy resulted in approximately a 54-58% increase in protein yield across the cell lines. This

improved strategy was utilized for proteomic analysis of the remaining cell lines and the

pancreatic juice samples.

Protein Identification through LC-MS/MS

Six human pancreatic cancer cell lines, one 'near normal' human pancreatic ductal

epithelial cell line (HPDE) and six pancreatic juice samples from ductal adenocarcinoma

patients (in two pools) were analyzed in triplicate in this study (Figure 2.2). Using both

MASCOT and X!Tandem search engines, between 2017 and 3250 proteins were identified in the

7 cell lines and 1014 and 956 proteins were identified from pool A and B of pancreatic juice,

respectively (Table 2.2). These numbers represent proteins identified in the three replicates

combined, with 1 or more peptides and with protein false discovery rates (FDR) of <1.0%. For

protein identifications, the human forward and reverse IPI3.62 database, which contains 167,894

forward and reverse protein sequences was used, and FDR was calculated as

[2 × FP/(TP + FP)] × 100, where FP (false positive) is the number of proteins that were identified

based on sequences in the reverse database component and TP (true positive) is the number of

proteins that were identified based on sequences in the forward database component [142-144].

For increased stringency and assurance of protein identification, only proteins identified with

two or more peptides were included in the remainder of the analysis, resulting in between 1261

and 2171 proteins for each of the cell lines and a total of 648 non-redundant proteins from the

pancreatic juice analysis. This data is summarized in Table 2.2.
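
The FDR expression given above can be written as a short function. This is a minimal sketch: the function name is illustrative and the hit counts in the example are invented rather than taken from the study.

```python
# Protein false discovery rate from a concatenated forward/reverse search:
# FDR (%) = [2 x FP / (TP + FP)] x 100, as defined in the text.
# The function name and the example counts are illustrative.

def protein_fdr_percent(forward_hits: int, reverse_hits: int) -> float:
    """Return the protein FDR in percent.

    forward_hits: proteins matched to forward sequences (TP).
    reverse_hits: proteins matched to reversed sequences (FP).
    """
    total = forward_hits + reverse_hits
    if total == 0:
        raise ValueError("no protein identifications")
    return 2.0 * reverse_hits * 100.0 / total


fdr = protein_fdr_percent(forward_hits=1990, reverse_hits=10)
print(f"FDR = {fdr:.2f}%")
```

The factor of 2 reflects the usual assumption that each reverse hit implies, on average, one undetected false hit among the forward identifications.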

Protein Overlap Between Samples

From our combined analysis, a total of 3479 non-redundant proteins (3324 in the cell

Table 2.2. Total Number of Proteins Identified in Triplicate Analysis of Cell Line Conditioned Media and Pancreatic Juice

                                      Cell Lines                                                            Pancreatic Juice
                                      BxPc3     CAPAN1    CFPAC1    HPDE      MIA-PaCa2 PANC1     SU.86.86  Pool A    Pool B
Total non-redundant proteins (a)      2017      2182      2427      2297      2756      3250      3010      1018      957
  [with ≥2 peptides]                  [1261]    [1420]    [1573]    [1474]    [1862]    [2171]    [2002]    [546]     [496]
Number of proteins identified with:
  Only 1 peptide                      756       762       854       823       894       1079      1008      472       461
  Only 2 peptides                     336       374       394       400       464       491       491       172       150
  Only 3 peptides                     226       230       290       252       281       347       322       96        79
  Only 4 peptides                     144       171       175       166       217       248       224       43        64
  ≥5 peptides                         555       645       714       656       900       1085      965       235       203
Protein false discovery rate (b)      0.69%     0.82%     0.66%     0.87%     0.80%     0.62%     0.73%     0.79%     1.0%
Extracellular and cell surface proteins with ≥2 peptides:
  Number                              511       605       665       592       635       757       749       314       281
  [%]                                 [40.5%]   [42.6%]   [42.3%]   [40.2%]   [34.1%]   [34.9%]   [37.4%]   [57.5%]   [56.7%]

(a) All non-redundant proteins identified with ≥1 peptide; the number of total proteins identified with ≥2 peptides is enclosed in

brackets.

(b) Pertains to total proteins identified with ≥1 peptide; false discovery rate = 0.0% for ≥2 peptide identifications.

Figure 2.2. Total non-redundant proteins identified. Venn diagrams depicting total proteins

identified with ≥2 peptides in the three replicates of each cell line CM and pancreatic juice

sample (a). Overlap of 3479 total non-redundant proteins identified in the conditioned media

and pancreatic juice analysis is also depicted (b). HPDE, (normal) human pancreatic ductal

epithelial cell line.

lines and 648 in the pancreatic juice analysis) were identified with ≥2 peptides. Six-hundred and

forty-four proteins (of 3324; 19.4%) were common to all cell lines and an average of 143

proteins were unique to each (Table 2.3). From our preliminary studies of the three cell lines

described in the 'Increasing Protein Yield' section above, 83 additional non-redundant proteins

(≥2 peptides) were identified; however, these were not included in the remainder of the analyses.

Significant overlap was noted between the pancreatic juice and CM proteins.

Approximately 76% (493 of 648) of proteins identified in the pancreatic juice samples were also

identified in the cell line analysis (Figure 2.2b), which indicates much similarity in the

proteomes between these biological fluids; however many proteins that are largely associated

with exocrine pancreatic function were unique to the pancreatic juice and not identified in the

cell lines. Analysis of overrepresented KEGG pathways through Protein Center software further

revealed the KEGG pancreatic secretion pathway (hsa04972) to be one of three pathways

overrepresented in the pancreatic juice proteome in comparison to the combined cell line

proteome (p=3.611E-5) (Appendix 1).
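
The overlap figures reported above reduce to set operations on protein identifiers. The sketch below uses placeholder accessions (the real lists are not reproduced here) sized to match the reported counts of 648 pancreatic juice proteins, 3324 cell line proteins and 493 shared proteins; the function name is illustrative.

```python
# Overlap between two protein lists using set operations.
# Accessions are synthetic placeholders sized to match the reported counts.

def overlap_summary(juice_ids, cm_ids):
    """Summarize shared and unique proteins between two identifier lists."""
    juice, cm = set(juice_ids), set(cm_ids)
    shared = juice & cm
    return {
        "shared": len(shared),
        "juice_only": len(juice - cm),
        "cm_only": len(cm - juice),
        "union": len(juice | cm),
        "pct_juice_in_cm": 100.0 * len(shared) / len(juice),
    }


shared_ids = [f"SHARED{i}" for i in range(493)]
juice_ids = shared_ids + [f"JUICE{i}" for i in range(648 - 493)]
cm_ids = shared_ids + [f"CM{i}" for i in range(3324 - 493)]

summary = overlap_summary(juice_ids, cm_ids)
print(summary)
```

With these counts the union is 3479 proteins and about 76% of the pancreatic juice proteins are also present in the cell line CM, consistent with the totals given in the text.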

Table 2.3. Protein Overlap Between Cell Line Conditioned Media

Number of     Number of       % of the CM proteins identified
cell lines    proteins (a)    in the pancreatic juice (c)
7             644             42%
6             285             14%
5             295             14%
4             268             14%
3             336             10%
2             494             5%
1             1001 (b)        4%

(a) Indicates the number of proteins with two or more peptides that were commonly identified in

the reported number of cell lines.

(b) One thousand and one proteins is the total number of proteins that were unique to only one of

the seven cell lines.

(c) Indicates the percentage of proteins common to the multiple cell lines that were also identified

in the pancreatic juice.

Gene Ontology – Function, Process and Cell Localization Classifications

Gene ontology classifications, which include function, process and cell localization,

were obtained for all identified proteins. Proteins that are secreted into the extracellular milieu

or cleaved from the plasma membrane of cells have the highest chance of entering the

circulation and serving as serological biomarkers. Between 34.1%-42.6% of proteins in each of

the cell lines and 57% of proteins in the pancreatic juice samples were annotated as belonging to

the extracellular or cell surface compartments (Table 2.2). In total, 1376 (40%) of 3479 proteins

contained these two annotations. The cytoplasm received the greatest number of annotations in

both biological fluids, and approximately 2.9% of all identified proteins lacked

cell localization information and were unannotated (Figure 2.3a). It is important to note that

proteins can be classified as belonging to multiple cellular localizations, processes and functions

and as a result, the categories for each are non-exclusive and the sum of the percentages can be

>100%.

The top three molecular functions for the cell lines and the pancreatic juice were the

same: protein binding (~80.2%, 79.9%), catalytic activity (69.4%, 70.2%) and metal ion binding

(45.7%, 49.7%), respectively. Both fluids also shared the top two biological processes –

metabolic process (81.2%, 83.5%) and regulation of biological process (61.9%, 63.1%),

respectively. In a comparison between the cell line and pancreatic juice proteomes as a whole,

extracellular proteins and several molecular functions related to enzyme activity were found

overrepresented in the pancreatic juice proteome (Figure 2.3b). The only GO category

overrepresented in the cell line proteome was the biological process 'macromolecule metabolic

process' (GO:0043170; p=5.338E-7; FDR p=3.308E-3). No GO terms were over- or

underrepresented in a comparison between the cancer cell lines and HPDE.
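
As an illustration of how an overrepresentation p-value of this kind can be computed, the following minimal Python sketch evaluates the upper tail of the hypergeometric distribution. It is not the Protein Center implementation: the function name and the counts are hypothetical, and the FDR correction applied in the analysis is not shown.

```python
from math import comb

def hypergeom_enrichment_p(total, annotated, sample, sample_annotated):
    """Upper-tail hypergeometric probability P(X >= sample_annotated).

    total            -- proteins in the background set
    annotated        -- background proteins carrying the term of interest
    sample           -- proteins in the set being tested
    sample_annotated -- tested proteins carrying the term
    """
    upper = min(annotated, sample)
    numerator = sum(
        comb(annotated, k) * comb(total - annotated, sample - k)
        for k in range(sample_annotated, upper + 1)
    )
    return numerator / comb(total, sample)

# Hypothetical counts for illustration only (not taken from the study)
p_value = hypergeom_enrichment_p(total=1000, annotated=50,
                                 sample=100, sample_annotated=15)
print(f"{p_value:.3e}")
```

A small p-value indicates that the term occurs in the tested set more often than expected from random sampling of the background.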

Figure 2.3 Cellular localization and comparison of GO categories between cell line

conditioned media and pancreatic juice proteomes. Cellular localization of proteins

annotated using gene ontology (GO) consortium annotations (a). Depicted in dark grey is the

percentage of proteins from the cell line CM analysis and light grey is percentage of proteins

from the pancreatic juice analysis for each cell localization category. Proteins can contain

multiple GO annotations resulting in a sum of percentages >100%. Top three significantly

overrepresented GO categories in pancreatic juice proteins in comparison to the cell line

conditioned media proteome for cellular localization, molecular function and biological process

are also depicted (b). Blue represents pancreatic juice proteins and red represents cell line

proteins. FDR (false discovery rate) p-values are based on application of a hypergeometric test at

an FDR of 1%.

Hierarchical Clustering

One of the difficulties in dealing with large datasets is visualizing the proteomes as a

whole and identifying subsets of proteins that may be of importance within certain biological

contexts. In an initial attempt to mine and explore the CM and pancreatic juice proteomes,

unsupervised two-way hierarchical clustering analysis (HCA) was performed using average

emPAI values of the three replicates of each sample, normalized through Z-scores. Through

these means, proteins were clustered based on abundance within each sample. The

concentrations of two proteins (KLK6 and KLK10) were assessed in the CM through ELISA to

determine if Z-scores of emPAI values are a suitable indicator of protein abundance. Good

correlation was seen between Z-scores of ELISA concentrations and Z-scores of emPAIs

(Figure 2.4a). Additionally, the lowest ELISA concentration measured was 0.80 ug/L for

KLK10 in CAPAN1 CM, which indicates that the sensitivity of our mass spectrometry analysis is,

in general, at least in the low ug/L range for the CM analysis.
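
The normalization step can be sketched in Python as follows. emPAI is defined as 10^(N_observed / N_observable) - 1 [181]. The sketch assumes, for illustration only, that Z-scores are computed for each protein across samples using the sample standard deviation; the function names and input values are hypothetical.

```python
from statistics import mean, stdev

def empai(n_observed, n_observable):
    """Exponentially modified protein abundance index: 10**PAI - 1,
    where PAI = observed peptides / observable peptides."""
    return 10 ** (n_observed / n_observable) - 1

def z_scores(values):
    """Standardize values to mean 0 and unit sample standard deviation."""
    mu = mean(values)
    sd = stdev(values)
    return [(v - mu) / sd for v in values]

# Hypothetical replicate-averaged emPAI values for one protein across samples
average_empai = [0.5, 1.2, 0.1, 3.4]
print(z_scores(average_empai))
```

Because Z-scores are unitless, the same transformation can be applied to ELISA concentrations before comparing the two measurements.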

HCA was performed on the entire dataset of 3479 proteins and based on normalized

emPAI values, the pancreatic juice samples clustered distinctly and separately from the cell

lines, and within the cell lines, the three derived from metastatic sites (SU.86.86, CFPAC1 and

CAPAN1) were clustered together. MIA-PaCa2, PANC1 and BxPc3 are cell lines derived from

the primary tumor site of three patients [114]. The MIA-PaCa2 and PANC1 proteomes were

clustered together, as were the BxPc3 and HPDE cell lines. Heat-map visualization facilitated a

first exploration of the dataset and the identification of several regions or protein clusters of

interest. Among them, two clusters containing 34 proteins were shown to be highly expressed in

multiple cancer cell lines and the pancreatic juice samples, all with minimal expression in HPDE

(Figure 2.4b). This included proteins such as MUC1 (Mucin-1) [67] and RNASE1 (pancreatic

ribonuclease) [145] which have been shown to be elevated in pancreatic cancer and studied

Figure 2.4 Hierarchical clustering analysis based on normalized emPAI values for the 3479 total non-redundant proteins

identified. Good correlation (R-square = 0.7362) of z-scores between KLK6 and KLK10 emPAI values and ELISA concentrations

was noted (a). Clustering analysis depicting the seven cell lines and two pancreatic juice pools on the x-axis and proteins on the y-axis

(b). Shown is a segment of the resulting dendrogram depicting two clusters of proteins found highly expressed in the cancer cell lines

and pancreatic juice (low expression in the normal HPDE cell line). KLK6, kallikrein 6; KLK10, kallikrein 10; emPAI, exponentially

modified protein abundance index; EC, extracellular; CS, cell surface.

previously as pancreatic cancer biomarkers in serum. This prompted us to further examine

proteins that are differentially expressed between the cancer cell lines and the HPDE cell line.

Differential Expression of Proteins in Cancer vs. Normal Cell Lines

Normalized spectral counts of the cancer cell lines were compared with those of the

normal HPDE cell line as described in the 'Materials and Methods' section. The Pearson

correlation coefficient was evaluated for all pairs of the 21 replicates from the 7 cell lines using

normalized spectral count values (Appendix 2). With the exception of replicate 2 from CFPAC1,

which showed correlations of 0.727 and 0.851 with CFPAC1 replicates 1 and 3, respectively, good correlation

(ranging from 0.944 to 0.993) was seen between replicates of each cell line (including CFPAC1

replicates 1 and 3) indicating good reproducibility (Appendix 2).
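
A pairwise replicate comparison of this kind can be sketched in Python as below. The replicate names and count vectors are hypothetical and serve only to show the calculation; with 21 replicates the same loop would yield 210 pairs.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical normalized spectral counts (one value per protein)
replicates = {
    "rep1": [10.0, 4.0, 0.0, 25.0, 7.0],
    "rep2": [12.0, 3.0, 1.0, 22.0, 8.0],
    "rep3": [9.0, 5.0, 0.0, 27.0, 6.0],
}

# Correlate every unordered pair of replicates
pairwise_r = {
    (a, b): pearson_r(replicates[a], replicates[b])
    for a, b in combinations(replicates, 2)
}
for pair, r in pairwise_r.items():
    print(pair, round(r, 3))
```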

Analysis of variance (ANOVA) testing identified 1293 proteins (each with a minimum

number of 10 spectra in at least one cell line), with a statistically significant difference amongst

the seven cell lines (p<0.05). Based on the criteria described in the 'Materials and Methods',

491 of these proteins showed ≥ 5-fold increase in at least one cancer cell line in comparison to

HPDE. One-hundred and nineteen proteins further demonstrated ≥ 5-fold increase in at least

three cancer cell lines in comparison to HPDE, of which 53 proteins showed over 10-fold

increase and 18 showed over 20-fold increase in at least three cancer cell lines. Examination of

underexpressed proteins revealed 19 proteins consistently decreased at least 5-fold in all six

cancer cell lines and 18 consistently decreased in five cancer cell lines in comparison to HPDE.
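
The fold-change filter can be illustrated with a short Python sketch. The exact criteria are those given in the 'Materials and Methods' and are not reproduced here; the pseudocount used to avoid division by zero, the function name and all count values are assumptions made for this example.

```python
def lines_meeting_fold(cancer_counts, normal_count, fold=5.0, pseudocount=1.0):
    """Count cancer cell lines whose count is >= `fold` times the normal count.

    A pseudocount is added to numerator and denominator to avoid division
    by zero; this is an assumption of the sketch, not the thesis's method.
    """
    denominator = normal_count + pseudocount
    return sum(
        1 for count in cancer_counts.values()
        if (count + pseudocount) / denominator >= fold
    )

# Hypothetical normalized spectral counts: (cancer cell lines, HPDE)
proteins = {
    "PROT_A": ({"MIA-PaCa2": 40, "PANC1": 55, "BxPc3": 2,
                "CAPAN1": 60, "CFPAC1": 3, "SU.86.86": 1}, 4),
    "PROT_B": ({"MIA-PaCa2": 12, "PANC1": 60, "BxPc3": 9,
                "CAPAN1": 8, "CFPAC1": 11, "SU.86.86": 10}, 10),
}

# Keep proteins elevated at least 5-fold in three or more cancer cell lines
candidates = [
    name for name, (cancer, hpde) in proteins.items()
    if lines_meeting_fold(cancer, hpde) >= 3
]
print(candidates)  # prints ['PROT_A']
```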

Sixty-three of the 119 proteins were extracellular and cell surface-annotated and are

listed in Appendix 3. Additionally, 17 of these proteins have been previously shown to be

upregulated in pancreatic cancer in at least four studies [146], and 10 have been shown to be

elevated in pancreatic cancer serum in comparison to controls [145, 147-156] (Appendix 3). The

unstudied proteins may yield promising new candidate biomarkers for pancreatic cancer.

Many of the 491 total overexpressed proteins were also identified in a comprehensive

database of human plasma proteins [157] and five proteins, COPS4 (COP9 signalosome

complex subunit 4), PXN (Paxillin), MYO1C (Myosin-1c), GBA (protein similar to

Glucosylceramidase) and LMAN2 (Vesicular integral-membrane protein VIP36), were also

identified in a recent global genomic analysis of pancreatic cancer [32] as overexpressed in the

large majority of pancreatic cancer cases studied.

Further Prioritization of Candidates through Integration of Biofluids and Tissue Specificity

Recent evidence suggests that integrating different biological fluids

may also yield strong candidates for verification phases of biomarker discovery [117,125]. As

such, we applied a set of filtering criteria based on overlap of proteins between different

biological sources, the cellular localization of proteins and tissue specificity for the generation of

further candidates. Of the 488 proteins common to the pancreatic juice and cell lines (Figure

2.2b), 235 had been annotated as belonging to the extracellular and cell surface compartments.

One-hundred and nine of these proteins were also identified in the proteome of ascites fluid

from 3 patients with pancreatic adenocarcinoma (Makawita et al. unpublished)1, and of these, 43

were not identified in the HPDE normal cell line (Appendix 4).
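
The successive filters described above amount to a series of set operations, which can be sketched in Python as follows. The protein identifiers are placeholders and the function name is hypothetical; only the order of the filters follows the text.

```python
def integrate_candidates(juice, cm, extracellular_or_surface, ascites, hpde):
    """Apply the filters in order and return the set left after each step."""
    common = juice & cm                            # in both juice and CM
    annotated = common & extracellular_or_surface  # extracellular / cell surface
    in_ascites = annotated & ascites               # also found in ascites
    not_in_hpde = in_ascites - hpde                # absent from normal HPDE CM
    return {
        "common": common,
        "annotated": annotated,
        "in_ascites": in_ascites,
        "candidates": not_in_hpde,
    }

# Placeholder protein identifiers for illustration only
steps = integrate_candidates(
    juice={"P1", "P2", "P3", "P4", "P5"},
    cm={"P2", "P3", "P4", "P5", "P6", "P7"},
    extracellular_or_surface={"P3", "P4", "P5", "P7"},
    ascites={"P4", "P5", "P8"},
    hpde={"P5", "P6"},
)
for name, proteins in steps.items():
    print(name, sorted(proteins))
```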

Because there may be pertinent proteins in the pancreatic juice that may not be identified

in the CM and vice versa, proteins identified in either proteome that were shown to be highly

specific to/expressed in the pancreas were also included. To examine tissue specificity, we

compared the proteins identified in our CM and pancreatic juice datasets to proteins shown

highly specific to the pancreas based on microarray, EST and immunohistochemistry data using

1 Makawita S., Kosanam H., Diamandis E.P. Proteomic analysis of ascites fluid from pancreatic

cancer patients. (unpublished data).

TiSGeD [158], TiGER [159], Unigene [160], and the Human Protein Atlas [161], respectively.

These are publicly available databases that have been described in detail previously [158-161].

Specifically, we compared our lists to 150 proteins reported as specific to pancreas tissue using

TiSGeD specificity measure >0.90 [158], 55 pancreas-specific proteins from Unigene, 205

proteins preferentially expressed in the pancreas based on the TiGER database and 198 proteins

showing 'strong' pancreatic exocrine cell staining and annotated in the Human Protein Atlas.

Twenty proteins were common to three or more of the databases, of which 2 proteins,

PRSS1 and SPINK1, were identified in the cell line CM as meeting these criteria and 15

proteins from the pancreatic juice proteome (including PRSS1 and SPINK1) met the same

criteria (Table 2.4). Twelve of these proteins have been previously shown to be elevated in

serum/plasma of patients with pancreatitis or pancreatic cancer [162-172], leaving CTRC

(chymotrypsin C), SYCN (syncollin) and REG1B (Lithostathine-1-beta) as the three proteins without such prior reports (Table 2.4).
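
The cross-database criterion (support from at least three of the four resources) can be expressed as a simple vote count. The sketch below is illustrative: the gene names are placeholders, the lists do not reflect actual database content, and the function name is hypothetical.

```python
from collections import Counter

def genes_in_at_least(database_lists, min_databases=3):
    """Return genes reported as tissue-specific by >= min_databases sources."""
    votes = Counter(
        gene for genes in database_lists.values() for gene in set(genes)
    )
    return {gene for gene, n in votes.items() if n >= min_databases}

# Placeholder gene lists; real lists would come from each database
databases = {
    "TiSGeD": ["GENE_A", "GENE_B", "GENE_C"],
    "TiGER": ["GENE_A", "GENE_B", "GENE_D"],
    "UniGene": ["GENE_A", "GENE_C", "GENE_D"],
    "HPA": ["GENE_A", "GENE_B"],
}

pancreas_specific = genes_in_at_least(databases)
identified_in_juice = {"GENE_B", "GENE_D", "GENE_E"}  # hypothetical proteome
print(sorted(pancreas_specific & identified_in_juice))  # prints ['GENE_B']
```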

Candidate Verification in Plasma

Based on availability of enzyme-linked immunosorbent assays, eight candidates were

selected for verification in plasma. Of these, five proteins - Anterior Gradient Homolog 2

(AGR2), Olfactomedin-4 (OLFM4), Syncollin (SYCN), Collagen alpha-1(VI) chain (COL6A1),

Polymeric Immunoglobulin Receptor (PIGR) – showed a significant increase in pancreatic

cancer plasma (Fig 2.5a-e). Tissue-type plasminogen activator (PLAT), Trefoil factor 2 (TFF2)

and Nucleobindin-2 (NUCB2) did not show a significant increase in plasma samples (data not

shown).

Table 2.4 List of 15 Pancreas Specific Proteins (≥3 databases) Identified in CM and Pancreatic Juice

Column headings: Gene; Protein Name; HPA [162]; UniGene [161]; TiGER [160]; TiSGeD [159]; Identified in Pancreatic Juice Proteome; Identified in CM Proteome; Previously Shown Elevated in Pancreatic Cancer or Pancreatitis Serum/Plasma [ref]

Gene      Protein Name                                                                        Serum/Plasma [ref]
CPA1      Carboxypeptidase A1                                                                 162
PRSS1     Trypsin-1                                                                           163,164
CPA1      cDNA FLJ53709, highly similar to Carboxypeptidase A1                                162
CPA2      Carboxypeptidase A2                                                                 162
GP2       Isoform Alpha of Pancreatic secretory granule membrane major glycoprotein GP2       165
REG1A     Lithostathine-1-alpha                                                               166
CTRC      Chymotrypsin-C
CPB1      Carboxypeptidase B                                                                  167
GP2       Isoform 1 of Pancreatic secretory granule membrane major glycoprotein GP2           165
PNLIP     Pancreatic triacylglycerol lipase                                                   168,169
SYCN      Syncollin
REG1B     Lithostathine-1-beta
CLPS      Colipase                                                                            170
SPINK1    Pancreatic secretory trypsin inhibitor                                              171
PLA2G1B   Phospholipase A2                                                                    172

HPA, Human Protein Atlas; TiSGeD, Tissue-Specific Genes Database; TiGER, Tissue-specific Gene Expression and Regulation

In the CM analysis, AGR2 showed over 10-fold increase in the BxPc3, CAPAN1,

CFPAC1 and SU.86.86 cell lines compared to the near normal HPDE cell line (Appendix 3). In

addition, AGR2 was common to the CM and pancreatic juice proteomes and was identified in the

cluster of proteins highly expressed in many cancer cell lines and pancreatic juice in comparison

to HPDE (Figure 2.4b). In plasma, AGR2 levels were significantly increased in pancreatic

cancer patients (p<0.0001) in comparison to controls (Fig. 2.5a). Mean and median plasma

levels in the pancreatic cancer patients were 8.8 ug/L and 2.1 ug/L, respectively, while mean and median levels

in controls were 0.33 ug/L and 0.28 ug/L, respectively.

OLFM4 was identified through the integrated method (Appendix 4), and

it was also identified in the cluster shown in Fig 2.4b. In the plasma samples, OLFM4 also

showed a significant elevation (p<0.0001) in cancer (mean = 161 ug/L, median = 90 ug/L) in

comparison to controls (mean = 51 ug/L, median = 38 ug/L) (Fig. 2.5b). SYCN was a protein

identified solely in the pancreatic juice samples. It is monospecific to the pancreas based on

TiGER, TiSGeD and Unigene databases (data was unavailable in the Human Protein Atlas)

(Table 2.4). This protein is a part of the secretory granule membranes of the exocrine pancreas,

and due to its tissue specificity, it was selected for the verification phases. In the plasma

samples, SYCN also showed a significant increase in pancreatic cancer patients (p=0.0011;

mean cancer = 18.2 ug/L, median cancer = 13.5 ug/L; mean controls = 5.1 ug/L, median controls

= 2.9 ug/L) (Fig. 2.5c).

COL6A1 was overexpressed more than 20-fold in all of the cancer cell lines except for the BxPc3

cell line in comparison to the HPDE cell line. Similarly, PIGR was overexpressed more than 20-fold in

three of the cancer cell lines (Appendix 3). Both proteins showed a significant increase in

pancreatic cancer plasma in our preliminary analysis (p=0.0098; mean cancer = 3.3 mg/L,

median cancer = 2.1 mg/L; mean controls = 1.5 mg/L, median controls = 0.73 mg/L for

COL6A1 and p<0.0001; mean cancer = 16.8 mg/L, median cancer = 12.3 mg/L; mean controls =

9.2 mg/L, median controls = 8.96 mg/L for PIGR) (Fig. 2.5d,e).

At present, CA19.9 is the most widely used pancreatic cancer biomarker and CA19.9

levels were also assessed in our screening set of plasma samples (Fig 2.5f). While none of the

proteins verified here individually outperforms CA19.9, preliminary

assessment of all proteins as a panel using ROC curve analysis shows a slight increase in AUC

over CA19.9 alone (AUC for CA19.9 = 0.97; AUC for AGR2, OLFM4, SYCN, COL6A1 and PIGR combined = 0.98; Fig 2.6).
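
For a single marker, the area under the ROC curve equals the probability that a randomly chosen case has a higher value than a randomly chosen control, with ties counted as one half. The Python sketch below computes the AUC this way. The concentrations are hypothetical, the function name is an assumption, and the way the five markers were combined into a panel is not reproduced here.

```python
def roc_auc(cases, controls):
    """AUC as the fraction of case-control pairs in which the case value
    is higher than the control value (ties count as 0.5)."""
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))

# Hypothetical plasma concentrations (ug/L) for illustration only
cancer = [9.0, 2.0, 5.0, 0.3]
control = [0.35, 0.28, 0.4, 0.25]
print(roc_auc(cancer, control))  # prints 0.875
```

An AUC of 0.5 corresponds to no discrimination between groups and 1.0 to complete separation.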

Figure 2.5. Preliminary verification of AGR2 (a), OLFM4 (b), SYCN (c), COL6A1 (d) and

PIGR (e) in plasma from pancreatic cancer patients and healthy controls of similar age

and sex. Plasma concentrations of the proteins were measured through ELISA. Mean values are

indicated by a horizontal line and p-values were calculated using the Mann-Whitney U-test.

CA19.9 levels in the plasma samples were also tested (f).

n, number of subjects

Figure 2.6. Receiver Operating Characteristic curve analysis for CA19.9 and panel of 5

candidates (AGR2, OLFM4, SYCN, COL6A1 and PIGR). AUC (area under the curve) is given

with 95% confidence intervals. The AUC of the 5-candidate panel shows a slight improvement over the AUC

of CA19.9 alone.

2.4 Discussion

Deregulated molecular pathways are a hallmark of cancer and the resultant secretion,

shedding and aberrant cleavage of proteins by tumor cells and their microenvironment present a

way in which to detect and track tumor development and progression [44,173]. With the advent

of high throughput protein profiling techniques, at the centre of which lies mass spectrometry

analysis, various approaches have been taken for the identification of novel protein biomarkers

and novel biomarker candidates. Serum or plasma is the desired diagnostic fluid in the clinic;

however, initial discovery studies in serum are hampered by the high complexity of the fluid and

its large dynamic range [54]. To overcome these limitations and others posed by MS analysis of

serum, researchers have turned to the characterization of proteomes of less complex biological

fluids that are 'upstream' of plasma. Due to the proximity of selected biological fluids to tumor

cells and tissues, their proteomes likely represent a reservoir of proteins enriched in potential

biomarkers prior to dilution upon entering the circulation [110,117,125].

Although many notable differences exist, the genomic and transcriptional make-up of

cancer cell lines has been shown to recapitulate salient aspects of primary tumors [113-116]. In

addition, the identification of known biomarkers in the conditioned media of cancer cell lines for

numerous cancer sites makes CM a viable source to mine [80, 174]. Previously, our group has

characterized the CM of breast, ovarian, prostate and lung cancer-related cell lines using 3-4 cell

lines per cancer site [127-130]. Using an LTQ mass spectrometer, 1139, 1830, 2124 and 2039

proteins were identified with at least one peptide in the breast [127], lung [130], prostate [128]

and ovarian cancer [129] analyses, respectively. Given the vast heterogeneity of the disease,

from our previous work it was concluded that a larger number of cell lines per cancer site, as

well as the incorporation and integration of proximal biological fluids from patients may provide

a more complete picture of disease heterogeneity and the tumor-host interface, thereby

facilitating the identification of stronger candidates for verification.

In the present study, we applied such an approach to pancreatic cancer. By utilizing 2D

LC-MS/MS, we characterized the proteomes of conditioned media from six pancreatic cancer

cell lines, one near normal pancreatic ductal epithelial cell line and six pancreatic juice samples

in two pools. All experiments were performed in triplicate and multiple search engines

(MASCOT and X!Tandem), which employ different search algorithms, were utilized for protein

identification. Previously it has been reported that use of multiple search engines results in

increased confidence in the proteins identified [70,71]. Additionally, only proteins identified

with multiple peptides (≥ 2 peptides) were used in the analysis. Through these means we

identified 3324 non-redundant proteins in the CM of the seven cell lines and 648 proteins in the

pancreatic juice. In total, 3479 non-redundant proteins were identified. This, to our knowledge,

is one of the largest and most comprehensive proteomes to date for pancreatic cancer-related

biological fluids in a single study.

In the first part of the study, an increase in protein yield of ~50% was achieved by

applying a pre-fractionation strategy that was tailored to the SCX elution profile. SCX was the

first dimension of fractionation in our multidimensional approach. Different modes of

fractionation from isoelectric focusing (IEF) to SDS-PAGE fractionation and SCX have been

previously compared, with different studies reporting different methods as the most effective

when coupled with MS analysis [175-178]. Fractionation of complex samples prior to MS

analysis is a technique used to minimize sample complexity and penetrate deeper into the

proteome, thereby achieving increased coverage of proteins. Indeed, more proteins (including

those known to be of low abundance such as various interleukins) were identified through these

means in our analysis. A corollary of increased fractionation is typically decreased throughput.

In the present study, reduced gradient times during the second dimension of separation

(reverse-phase) helped to keep any increase in analysis time to a minimum.

Not all proteins identified in shotgun proteomics-driven discovery approaches will be

suitable for study as serological biomarkers, and one of the challenges in the field is in the

selection of the most promising candidates for further investigation. In the present study, we

utilized two strategies: (1) semi-quantitative analysis through label-free protein quantification

between the cancer and normal cell lines and (2) integrative analysis of cell line CM and

pancreatic juice. Label-free approaches typically employ chromatographic ion intensity-based

methods or spectral count-based means to obtain relative quantification of proteins between

LC-MS/MS run samples [179,180]. Further approximations of absolute protein abundance can be

obtained through reported indices such as emPAI and absolute protein expression (APEX)

[181,182]. Normalized spectral counts have been reported previously to be reliable indicators of

protein abundance in studies comparing different label-free methods, and strong correlation

between spectral counts and protein abundance has been shown [183]. When restricting

analysis to proteins identified with five or more spectra, results comparable to label-based

approaches have been shown to be obtainable [184]. In the present study, this method was

utilized for relative quantification between the cancer cell lines and the HPDE cell line.

Using the criteria outlined in the 'Materials and Methods', 119 proteins were found to be

consistently overexpressed at least 5-fold in at least three cancer cell lines. Included in this list were

many proteins previously shown to be upregulated in pancreatic cancer. For instance, the protein

GDF15, also known as macrophage inhibitory cytokine 1 (MIC1), showed >10-fold increase in

the CAPAN1, CFPAC1, PANC1 and SU.86.86 cell lines. Increased GDF15 mRNA and protein

levels have been shown previously in pancreatic tissue in comparison to adjacent normal

controls [154] and evaluation of this protein in serum has also shown it to have diagnostic

potential [155]. Similarly, neutrophil gelatinase-associated lipocalin (LCN2) [156], matrix

metalloproteinase 7 (MMP7) [152], complement component 3 (C3) [150,151] and leucine-rich

alpha-2-glycoprotein (LRG1) [185] have been reported to be elevated in serum of pancreatic

cancer patients, while mesothelin (MSLN) [186], tissue-type plasminogen activator (PLAT)

[187], C-X-C motif chemokine 5 (CXCL5) [188] and other proteins highlighted in Appendix 3

have been shown to be upregulated in pancreatic cancer or pancreatic neoplasia at the level of

tissue and/or mRNA.

Identification of these proteins provides some credence to our label-free discovery

approach; however, proteomic comparisons between non-malignant and malignant biological

sources are limited by the possibility that the observed differences may be due to many factors,

not solely to differences in tumorigenic potential. This was demonstrated as three of

the eight proteins verified did not show a significant increase in plasma from pancreatic cancer

patients. These proteins, PLAT, NUCB2 and TFF2, were expressed over 5-fold in three, one and

one of the pancreatic cancer cell lines, respectively; however, this failed to translate into our

plasma analysis.

We further investigated three other proteins, AGR2, PIGR and COL6A1 which showed

over 5-fold increase in four, three and five cancer cell lines, respectively (Appendix 3). Our

analysis of these proteins in human plasma also showed a significant increase in pancreatic

cancer patients. Except for AGR2, to the best of our knowledge, neither of these proteins has

previously been studied in sera/plasma of pancreatic cancer patients. AGR2 is an orthologue of

the Xenopus laevis protein XAG-2, which is a protein shown to play a role in ectodermal

patterning [189]. The function of AGR2 in normal human states is largely unknown; however, in

human cancers, AGR2 has been associated with several cancer types [190-192] and recently,

increased AGR2 levels were reported in pancreatic juice [153]. In this latter study, Chen et al.

utilized quantitative proteomics to profile pancreatic juice samples from pancreatic

intraepithelial neoplasia (PanIN) patients in comparison to controls and AGR2 was one of the

proteins this group found to show over 2-fold increase in PanIN-stage III. While Chen et al.

found diagnostic relevance for AGR2 in pancreatic juice, their analysis in 6 paired serum and

pancreatic juice samples from PanIN patients found no correlation between serum and

pancreatic juice AGR2 levels. Further analysis by this group in serum of 9 pancreatic cancer patients and

9 cancer-free controls showed no significant difference in AGR2 levels as well [153]. Despite

this, given that AGR2 was highly elevated in the majority of cancer cell line CM based on

spectral counting in this study, as well as its identification in pancreatic juice, we tested its levels

in our screening set of plasma samples and found a significant elevation in AGR2 levels in

pancreatic cancer plasma versus controls (Fig 2.5a). AGR2 has been previously shown to play a

role in invasion and metastasis [193,194], and it may be that elevated levels of this protein occur

in blood in the later stages of pancreatic cancer; however our initial results warrant further

evaluation of this protein in plasma/sera in larger sample sets.

PIGR has been shown previously through MRM to be increased in endometrial cancer

tissue homogenates [195]; however it has not been studied in clinical samples from many other

cancer sites. In the present study we demonstrate its significant increase in pancreatic cancer

plasma. COL6A1 is an important component of microfibrillar network formation, associating

closely with basement membranes in many tissues. It is an extracellular matrix protein and also

found in stromal tissue [196]. Mutations in this gene play a role in muscular disorders and

differential COL6A1 gene expression has been associated with astrocytomas [197,198];

however it has not been studied in pancreatic cancer and was found to be significantly increased

in our preliminary assessment in plasma. Taken together, the increased levels of these proteins

in pancreatic cancer plasma demonstrate the utility of our label-free differential protein

quantification approach to identify proteins relevant for study as potential serological

biomarkers of pancreatic cancer.

The identification of cancer-derived protein alterations through integration of different

biological sources is also an area of interest in cancer proteomics and the integrative mining of

multiple biological fluids may result in the identification of relevant candidates [125]. For

instance, in a recent analysis done in our laboratory [117], which compared the proteins/genes

identified in six publications chosen arbitrarily to represent various biological sources and both

proteomic and genomic data pertaining to ovarian cancer (2 cell line CM studies, 2 ascites, 1

tissue proteomics study and 1 microarray study), no proteins were found common to all 6

studies; however two proteins were found common to four of the studies. The proteins identified

were WAP four-disulfide core domain protein 2 precursor (HE4) and GRN (granulin). Both

have been implicated in ovarian cancer and HE4 is a recently FDA-approved ovarian cancer

biomarker [199]. In this respect, we looked at proteins common to the cancer CM and pancreatic

juice for identification of further candidates. These proteins were also compared to a pancreatic

cancer ascites proteome (Makawita et al. unpublished) for additional filtering. Most, if not all,

current biomarkers, such as PSA for prostate cancer, CA125 for ovarian cancer and hCG for

testicular cancer, are secreted or shed proteins; focus was therefore given to extracellular and cell

surface proteins, as they have the highest likelihood of entering the circulation [44,200].

Focus was also given to proteins highly or specifically expressed in the pancreas. If a protein is

only expressed in one tissue in healthy individuals, that tissue is likely the only contributor to

endogenous serum levels of that protein. As such, increasing serum contributions of such a

protein due to the presence of a growing tumor may be more easily detected. Furthermore, many

current biomarkers, such as PSA mentioned above, are highly expressed in one tissue [201].

Interestingly, the great majority of pancreas-specific proteins (as denoted by several

databases) were unique to the pancreatic juice and not identified in the cell lines (Table 2.4).

Similarly, the KEGG pancreatic secretion pathway was overrepresented in the pancreatic juice

proteome (Appendix 1). In the exocrine pancreas, acinar cells are responsible for secretion of

enzymes (zymogens) while ductal cells secrete primarily an alkaline fluid [202,203]. While the

majority of pancreatic cancers are ductal adenocarcinomas with pancreatic ductal cell-like

properties, the cell of origin of these cancers is still unclear [5,7]. Previously it has been shown

that acinar cells, once having undergone a transformation to duct-like cells show a reduced

secretion of zymogens [204]. The lack of pancreas specific proteins (enzymes, zymogens, etc.)

in the cell line CM likely reflects the ductal-like nature of the cell lines, while the presence

of such proteins in the pancreatic juice may reflect its acinar cell contributions.

Among the proteins common to all three biological fluids (Appendix 4) were several

proteins shown previously to be increased in the serum of pancreatic cancer patients and studied

as pancreatic cancer biomarkers, such as MUC1 and CEACAM5 (CEA) [65-67]. Two proteins

not previously assessed in the serum/plasma of pancreatic cancer patients, OLFM4 and SYCN,

were selected for verification. OLFM4 has been shown to promote proliferation in the PANC1

cell line by Kobayashi et al. [205], and its mRNA levels were shown to be elevated in 5

cancerous versus non-cancerous pancreatic tissue samples in the same study. OLFM4 serum

protein levels have shown potential diagnostic utility for gastric cancer [206]; however this

protein has not been studied in serum/plasma of pancreatic cancer patients. In this study,

OLFM4 showed an over 5-fold increase in expression in the CAPAN1 cell line in comparison to the HPDE

cell line. It was also identified by us in the pancreatic juice and ascites fluid proteomes and our

preliminary assessment shows that it is significantly increased in plasma from pancreatic cancer

patients (Fig 2.5b). Syncollin is a zymogen granule protein specific to the pancreas and is believed

to play a role in the concentration and/or efficient maturation of zymogens [207]. Syncollin has

been previously identified through mass spectrometry in human pancreatic juice and in the

proteomic analysis of plasma from a murine pancreatic cancer model [98,106]; however little is

known about the role of this pancreas-specific protein in pancreatic cancer and other

pathologies. Our data show that it is significantly elevated in human pancreatic cancer plasma

through ELISA (Fig 2.5c).

The growing consensus in this field is towards the development of panels of biomarkers,

as the combined assessment of multiple molecules can result in increased sensitivity and

specificity, in comparison to the assessment of molecules individually. In the present study, this

was demonstrated preliminarily to be true as the combined assessment of AGR2, OLFM4,

PIGR, SYCN and COL6A1 showed improved AUC, compared to CA19.9 alone. CA19.9 has

reported sensitivity and specificity values between 70%-90% (median ~79%) and 68%-91%

(median ~82%), respectively, for detection of pancreatic cancer (note: sensitivity decreases to

~55% in early-stage disease and CA19.9 is often undetectable in many asymptomatic

individuals; specificity decreases with benign disease) [59]. In the present study, CA19.9

showed a very high AUC (0.97) likely because the cancer plasma samples utilized were from

patients with established (primarily late-stage) pancreatic cancer. We used such samples since

the goal of the present study was to determine the utility of our approach to identify proteins that

are increased in serum/plasma of pancreatic cancer patients. Our marker panel requires further
validation in samples with low/normal CA19.9 values, including samples from patients with
early-stage disease and from those with benign abdominal pathologies.

Three of the five proteins verified in plasma and which showed a significant increase in

pancreatic cancer plasma samples in comparison to controls (AGR2, PIGR and OLFM4) were

also identified in relevant clusters through hierarchical clustering analysis. emPAI is another
means of label-free protein quantification [181], so it is not unexpected that these three proteins,
along with several others identified through spectral counting, were also identified through
emPAI-based quantification. This agreement further corroborates our results.
Recently, Wu et al. [131] utilized emPAI values of proteins normalized through z-scores for

pathway-based biomarker discovery as a part of their study of 23 human cancer cell lines. In the

present study, we utilized normalized emPAI values of proteins to gain a preliminary

understanding of the dataset through hierarchical clustering analysis. The six cancer cell lines

chosen for analysis in the study are well characterized and highly studied cell lines in literature.

They contain many of the major genetic aberrations present in pancreatic cancer such as

mutations in Kras, SMAD4, p16 (CDKN2A) and TP53 [113,114]. Interestingly, the cancer cell lines

derived from metastatic sites (SU.86.86, CFPAC1 and CAPAN1) were clustered together, while

MIA-PaCa2 and PANC1, which are cell lines derived from a primary tumor site were clustered

together, as were the BxPc3 and HPDE cell lines. BxPc3 is a cancer cell line derived from a

primary tumor site and HPDE is a widely used surrogate for normal pancreatic ductal epithelial

cells. Incidentally these two cell lines were the only ones with wild-type Kras expression [114],

a gene that is mutated in the vast majority (>90%) of pancreatic cancers; however firm
conclusions cannot be drawn regarding the clustering without further investigation. Nonetheless,
the identification in relevant clusters of three of the five proteins that showed a significant
increase in plasma from pancreatic cancer patients in this study suggests that clustering of
normalized emPAI values is a potentially viable means for the generation of biologically
relevant leads.
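
The emPAI-based step described above can be sketched as follows. This is a minimal illustration and not the pipeline used in this study: it assumes the commonly used definition emPAI = 10^(observed peptides / observable peptides) - 1, and it assumes that z-scores are computed for each protein across samples, which the text does not specify. The function names, sample names and peptide counts are hypothetical.

```python
import statistics


def empai(observed_peptides, observable_peptides):
    """Return emPAI = 10**(observed / observable) - 1 for one protein."""
    if observable_peptides <= 0:
        raise ValueError("observable_peptides must be positive")
    return 10 ** (observed_peptides / observable_peptides) - 1


def zscores(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("values are constant; z-scores are undefined")
    return [(v - mean) / sd for v in values]


# Hypothetical (observed, observable) peptide counts for one protein
# in three samples; these are not data from the study.
counts = {"sample_A": (2, 20), "sample_B": (10, 20), "sample_C": (18, 20)}

empai_values = {name: empai(obs, total) for name, (obs, total) in counts.items()}
normalized = dict(zip(empai_values, zscores(list(empai_values.values()))))

for name in counts:
    print(name, round(empai_values[name], 3), round(normalized[name], 3))
```

The normalized values for all proteins would then form the matrix passed to a hierarchical clustering routine.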

Pancreatic cancer has one of the lowest five-year survival rates (<5%) of all cancer

types [121]. This is largely associated with the existence of locally advanced or metastatic

disease in the majority of patients at the time of diagnosis. Genomic sequencing studies reveal

that a broad window may exist for the detection of pancreatic cancer between the initial stages

of tumour development and dissemination to secondary sites [208]. Here, we present the


proteomic analysis of pancreatic cancer-related cell lines and pancreatic juice for the

identification of novel diagnostic leads. Label-free protein quantification methods revealed a

group of proteins differentially expressed in pancreatic cancer. Contained within this group were

numerous proteins previously studied as pancreatic cancer biomarkers and associated with

pancreatic cancer pathology. Further candidates were generated through integrative analysis of

multiple biological fluids and examination of tissue specificity. Through a preliminary

assessment, five proteins (AGR2, OLFM4, PIGR, SYCN and COL6A1) were shown to be

significantly increased in plasma from pancreatic cancer patients. Appropriate validation of

potential biomarkers requires the use of clearly defined clinical specimens, appropriate controls

and a large number of samples (preclinical, early and late-stage, benign disease, healthy

controls) [209]. Our preliminary assessment warrants further validation of these five proteins in

larger cohorts of samples (early and late-stage pancreatic cancer, benign disease and healthy

controls) as well as consideration of these proteins in the development of biomarker panels for

pancreatic cancer.

The current state of cancer proteomics boasts a large number of discovery studies

resulting in the generation of many potential diagnostic and therapeutic leads; however due in

part to a lack of parallel high-throughput/multiplexed technologies, subsequent verification and

validation of these leads is lagging. In this regard, the proteomic data-set presented in this study,

when combined with existing repositories or compendiums [146,210], may also aid in further

prioritizing candidates for future diagnostic and therapeutic applications.


CHAPTER 3:

Enhanced Performance of CA19.9 with Addition of Syncollin and

Anterior Gradient Homolog 2 in Panel


3.1 Introduction

New serum biomarkers to aid in the detection and clinical management of patients with

pancreatic cancer are urgently needed. With an estimated 43,140 new cases in the United States

in 2010 and an estimated 36,800 deaths, pancreatic cancer is one of the most aggressive of all
cancer types and the 4th leading cause of cancer-related death [33]. Currently five-year survival

rates are less than 5%; however detection of pancreatic cancer in its early stages can increase

five-year survival rates to 20-40% [33]. Until better therapeutic measures are developed, or even

with the advent of novel therapeutics, earlier detection of pancreatic cancer is key to

improving patient survival. Currently, CA19.9 is the most widely used biomarker in the clinic

for pancreatic cancer with a median sensitivity of ~79% and median specificity of ~82%;

however its efficacy is low for detection of early-stage disease (~55% sensitivity) [59]. It is also

elevated in benign conditions and is not produced in Lewis genotype negative individuals [64]

(~10% of the general population). As a result, it is mostly used in the clinic for monitoring

response to therapy in patients with established disease. Thus the need remains for the

identification of new tumor markers with high sensitivity and specificity for optimal

management of pancreatic cancer patients.

We previously performed extensive proteomic analysis of conditioned media (CM) from

six pancreatic cancer cell lines, one normal pancreatic ductal epithelial cell line and six

pancreatic juice samples using two dimensional LC-MS/MS. Specifically, our triplicate analysis

of the BxPc3, MIA-PaCa2, PANC1, CAPAN1, CFPAC1, SU.86.86 and HPDE cell line CM,

and pancreatic juice samples resulted in the identification of 3479 non-redundant proteins with

two or more peptides. Through subsequent examination of differential protein expression

between the cancer and normal cell lines using relative label-free protein quantification and

integrative analysis, focusing on the overlap of proteins between the multiple biological fluids,


cellular localization and tissue specificity, candidate biomarkers for verification were elucidated.

Preliminary verification of 5 proteins, AGR2, OLFM4, SYCN, PIGR and COL6A1 in 20 plasma

samples from pancreatic cancer patients and 20 healthy individuals of similar age/sex using

ELISA showed a significant elevation of these proteins in plasma from pancreatic cancer

patients. Assessment of the combination of the 5 proteins showed an improved area under the

curve compared with CA19.9 alone. In the present study, we further assessed two of the proteins, SYCN and

AGR2, in a larger number of samples (n=198).

Syncollin (SYCN) is a 14 kDa protein that shows high/preferential expression in the

pancreas based on several publicly available databases such as TiSGeD (Tissue-Specific
Genes Database) which provides microarray-based tissue expression of proteins, and TiGER
(Tissue-specific Gene Expression and Regulation) and Unigene which provide EST

(expressed sequence tag) tissue expression data. Data is not yet available for SYCN on the

Human Protein Atlas database which provides immunohistochemistry tissue expression profiles

of proteins. In literature, SYCN has been identified in rat pancreatic tissues as a protein found on

the membranes of zymogen granules in pancreatic acinar cells which functions as a regulator of

exocytosis in a Ca2+-dependent manner, and has been shown to play a role in the maturation of

zymogens [207,211].

More recently, SYCN was identified in human neutrophilic granulocytes by Bach et al.

[212]. This group has postulated a possible role for syncollin in host defense; however at present

empirical support for such a claim is lacking [212]. The role of SYCN, if any, in cancer is

unknown. AGR2 is a more widely studied protein, especially in the context of cancer. AGR2 is

elevated in several cancer sites including breast, prostate and lung cancers, and has been

implicated in invasion and metastasis [190-192]. In pancreatic cancer, elevated levels of AGR2

mRNA and protein have been shown in tissue sections of pancreatic ductal adenocarcinoma.


Silencing with siRNA has shown decreased cell proliferation and reduced invasion in pancreatic

cancer cell lines, and silencing through shRNA has shown reduced tumor growth in vivo in an

orthotopic mouse model of pancreatic cancer [193]. This latter study showed that silencing of

AGR2 can increase the efficacy of treatment with gemcitabine, significantly reducing metastatic

loci in the liver and lung [193].

In the present study, we investigated the levels of syncollin and AGR2 in serum of 111

patients with pancreatic cancer and 87 normal controls. Significantly elevated levels of both

proteins were observed in pancreatic cancer patients, although SYCN performed better than

AGR2. Receiver operating characteristic (ROC) curve analysis was conducted on all cases and

controls, and on the subset of confirmed Stage II cases and all controls. Individually, only

SYCN showed a slight (but not statistically significant) improvement in AUC in comparison to

that of CA19.9 in the Stage II and controls analysis. In both cases however, the combination of

SYCN, AGR2 and CA19.9 showed a statistically significant improvement in AUC over CA19.9

alone. These results support the view that syncollin and AGR2 are promising candidate

serological biomarkers for pancreatic cancer which can enhance the performance of CA19.9.

The clinical utility of this novel panel needs to be further studied in larger cohorts of samples

from patients with pancreatic cancer, individuals with benign and preclinical disease, and

normal controls.

3.2 Materials and Methods

Patients and clinical specimens

One-hundred and eleven serum samples from patients diagnosed with pancreatic cancer

and 87 serum samples from healthy controls were used in this study. Median age of controls was

51 years (age range 19-84 years; 3 samples have unreported age) and median age of cancer

patients was 65 years (age range 32-85 years; 1 sample with unreported age). Of the 111


pancreatic cancer patients, 3 had stage I disease [30], 50 had stage II disease, 1 had been
reported as stage III and 3 patients were reported as having stage IV disease, with 4 more
reported as 'unresectable' [30]. Stage was unknown in 50 patients. Tumor grade was reported in
37 cases, of which three were grade I, two were grade I-II or I-III, 19 were grade II and 13
were grade III (Table 3.1). All serum samples were provided by Dr. Randy Haun at the
University of Arkansas Cancer Research Center and collected with informed consent in
accordance with the Institutional Ethics Board. Samples were stored at -80 °C upon collection
and shipped on dry ice. Samples were not thawed until use in this study.

Measurement of CA19.9, SYCN and AGR2 in serum

CA19.9 levels were measured using a commercially available immunoassay (ELECSYS
by Roche), performed according to the manufacturer's instructions. Enzyme-linked
immunosorbent assay (ELISA) kits for SYCN and AGR2 were purchased from USCN LifeSciences
(SYCN: Catalogue # E93879Hu; AGR2: Catalogue # E2285Hu). ELISAs were performed
according to the manufacturer's instructions with slight modifications. Briefly, 100 µL of sample
was incubated in pre-coated 96-well plates for 2 hours at 37 °C, along with standards. Samples
were diluted in phosphate buffered saline as instructed, using a 1:5 dilution for both proteins.
Plates were washed 2 times using the wash buffer provided in the kits (whereas the manufacturer's
instructions indicate that no washing is needed at this stage). A biotin-conjugated polyclonal
secondary antibody specific to SYCN or AGR2 (detection reagent A from the USCN kit) was
prepared and incubated for 1 hour at 37 °C. Following 4 washes, horseradish peroxidase (HRP)
conjugated to avidin (detection reagent B from the USCN kit) was prepared and incubated for
30 min at 37 °C. The plates were washed 4 times and 90 µL of tetramethylbenzidine (TMB)
substrate was added to each well. Wells were protected from light and incubated at 37 °C for
10-15 min, ensuring that the two highest standards did not become saturated (based on visual
examination of colour change). Fifty microlitres of stop solution (sulphuric acid solution provided
in the USCN kit) was added and the colour change was measured spectrophotometrically using
the Perkin-Elmer Envision 2103 multilabel reader at a wavelength of 450 nm (540 nm
measurements were used to determine background).
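
The plate readout described above (background subtraction using the 540 nm reading, comparison with standards, and correction for the 1:5 dilution) can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: linear interpolation between adjacent standards is chosen here for brevity, and the curve fit recommended for the kit may differ. The standard concentrations, absorbance values and function names are hypothetical.

```python
from bisect import bisect_left

DILUTION_FACTOR = 5  # samples were diluted 1:5 before measurement


def corrected_absorbance(a450, a540):
    """Subtract the 540 nm background reading from the 450 nm signal."""
    return a450 - a540


def interpolate_concentration(signal, standards):
    """Linearly interpolate a concentration from a standard curve.

    standards: list of (concentration, corrected absorbance) pairs with
    strictly increasing absorbance. Returns None when the signal lies
    outside the range covered by the standards.
    """
    signals = [s for _, s in standards]
    if signal < signals[0] or signal > signals[-1]:
        return None
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return standards[i][0]
    (c0, s0), (c1, s1) = standards[i - 1], standards[i]
    return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)


def sample_concentration(a450, a540, standards, dilution=DILUTION_FACTOR):
    """Concentration in the undiluted sample, or None if out of range."""
    conc = interpolate_concentration(corrected_absorbance(a450, a540), standards)
    return None if conc is None else conc * dilution


# Hypothetical standard curve: (concentration in ug/L, corrected absorbance).
EXAMPLE_STANDARDS = [(0.0, 0.05), (1.0, 0.25), (2.0, 0.45), (4.0, 0.85)]

print(sample_concentration(0.40, 0.05, EXAMPLE_STANDARDS))
```

Returning None for out-of-range signals keeps values below the lowest standard or above the highest standard from being silently extrapolated.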

Table 3.1 Stage and Grade of 111 Pancreatic Cancer Serum Samples

Stage; Number of Samples; Number of Samples with Grade:
I, I-II, I-III, II, III, Unknown

I 3 2 1

II 50 2 1 1 19 10 17

III 1 1

IV 3 1 2

"Unresectable"a 4 4

Unknown 50 1 49

Total 111 4 1 1 22 13 70

a Four cancer cases were reported as unresectable which, according to the 6th edition of the
American Joint Committee on Cancer (AJCC) staging system, implies stage III or greater, whereby
patients have locally advanced disease involving the celiac axis or superior mesenteric artery;
stage IV is the presence of distant metastasis [30].

Data Analysis and Statistics

The Mann-Whitney U-test was applied to assess the statistical significance of differences in
medians between cases and controls at a 95% confidence level. Spearman's rank correlation
coefficient was calculated to evaluate correlation between CA19.9 and SYCN, CA19.9 and AGR2,
and SYCN and AGR2. The diagnostic value of the proteins was assessed using ROC curve analysis,
and AUC calculations were carried out using ROCR and pROC software (Swiss Institute of
Bioinformatics) with variances calculated using a bootstrap method for multiple biomarker
modeling. Statistical differences between AUCs were assessed as described previously using
DeLong's method [82].
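
The AUC obtained from ROC curve analysis is closely related to the Mann-Whitney U statistic: it equals the probability that a randomly chosen case has a higher marker value than a randomly chosen control, with ties counted as one half. The following is a minimal sketch of that relationship, written independently of the ROCR and pROC packages used in this study. The function names and marker values are hypothetical.

```python
def mann_whitney_u(cases, controls):
    """U statistic for cases: count of (case, control) pairs where the
    case value is higher, with ties contributing one half."""
    u = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def auc_from_u(cases, controls):
    """AUC = U / (n_cases * n_controls)."""
    if not cases or not controls:
        raise ValueError("both groups must contain at least one value")
    return mann_whitney_u(cases, controls) / (len(cases) * len(controls))


# Hypothetical marker concentrations; not data from the study.
example_cases = [12.0, 30.5, 8.1, 55.0]
example_controls = [3.2, 9.0, 2.8, 6.4]

print(auc_from_u(example_cases, example_controls))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for cohorts of a few hundred subjects; a rank-based formulation would be preferable for larger data sets.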


3.3 Results

Table 3.2 Distribution of serum SYCN, AGR2, CA19.9 and age in pancreatic cancer and
control serum.

Comparison                     Marker   Disease State   N^a   Min         Median   Max      Mean      p-value^b
Normal versus All Cancer       Age^c    Normal          87    19          51       84       52        p<0.0001
                                        All Cancer      111   32          65       85       64
                               CA19.9   Normal          87    5.75        14.9     109.3    20.38     p<0.0001
                                        All Cancer      111   3           137.7    23700    1319.23
                               SYCN     Normal          87    Below LOD   2.84     76.36    6.54      p<0.0001
                                        All Cancer      111   0.56        10.93    110.8    23.9
                               AGR2     Normal          87    2.525       8.035    61       12.06     p<0.0001
                                        All Cancer      111   3.515       12.965   265.7    34.09
Normal versus Early Stage^d    Age^c    Normal          87    19          51       84       52        p<0.0001
                                        Early Stage     53    42          68       85       66
                               CA19.9   Normal          87    5.75        14.9     109.3    20.38     p<0.0001
                                        Early Stage     53    3           120.3    2184.5   343.23
                               SYCN     Normal          87    Below LOD   2.84     76.36    6.54      p<0.0001
                                        Early Stage     53    0.71        17.61    110.8    31.53
                               AGR2     Normal          87    2.525       8.035    61       12.06     p=0.0938
                                        Early Stage     53    3.515       8.89     215.9    23.95

a Number of samples
b Mann-Whitney U-test
c Age is in years; concentration of AGR2 and SYCN are in ug/L; CA19.9 levels are in Units/mL
d Early Stage = stage I and II

Table 3.1 shows the stage and grade of the 111 cancer serum samples. Table 3.2 shows the
distribution of age, CA19.9, SYCN and AGR2 in the cancer patients and normal controls.
Clinical information was not available for all patients; however 50 patients were reported as
having stage II disease when serum was collected. According to the American Joint Committee
on Cancer Staging System for Pancreatic Adenocarcinoma, this means that the patients have
potentially resectable disease with possible involvement of adjacent organs or venous structures,
but no involvement of the celiac axis or superior mesenteric artery [30]. Three patients had
confirmed stage I disease (potentially resectable pancreatic cancer confined to the pancreas).


Where possible, levels of markers were also assessed in these patients as a separate category

denoted as “early stage”. Clinical/pathological stage and history was not obtainable for a number

of samples and an adequate sample group with clinically/pathologically confirmed late stage

(stage III/IV) was not available to include early versus late-stage comparisons.

Figure 3.1 shows the distribution of CA19.9, SYCN and AGR2 in normal, early-stage

samples and all cancer samples. A clear elevation can be seen for CA19.9 for early stage and all

cancer samples (Fig 3.1a,b) and for SYCN (Fig 3.1c). While AGR2 appears elevated when all

cancer samples are considered, elevation in early-stage samples is less clear. This was shown
statistically through comparison of medians between the groups (Table 3.2). Comparisons
between all cancer and normal control samples were found to be statistically significant for all
three proteins (p<0.0001). Comparisons between early stage cancer and normal controls were
significant for CA19.9 and SYCN (p<0.0001 for both). AGR2 did not show a significant
increase in levels in the early stage versus control comparison (p=0.0938). The median ages of
the cancer and control groups were also found to be statistically different (p<0.0001) (Table 3.2).

Spearman's rank correlation coefficient was evaluated to assess correlation between the
three molecules for all cancer and controls (Figure 3.2). All combinations assessed (CA19.9 and
SYCN, CA19.9 and AGR2, and SYCN and AGR2) showed a significant correlation; however
ROC curve analysis showed that SYCN and AGR2 significantly enhanced the
performance of CA19.9 alone (Figures 3.3 and 3.4).
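
The rank correlation used above can be sketched as the Pearson correlation of the ranked values, with tied values given their average rank. This is a generic illustration and not the software used in this study. The function names and marker values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values starting at i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rank correlation coefficient."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired marker values for five subjects.
marker_one = [14.9, 137.7, 20.4, 343.2, 5.8]
marker_two = [2.8, 10.9, 6.5, 31.5, 0.6]
print(spearman_rho(marker_one, marker_two))
```

Because only ranks are used, the coefficient is insensitive to the very wide concentration ranges seen for markers such as CA19.9.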


Figure 3.1 Distribution of serum CA19.9, SYCN and AGR2 in normal controls, early stage

(stage I and II, n = 53) and all pancreatic cancer patients (n=111). Plot a and b both show the

same data for CA19.9; however plot b is a magnified view showing values only from 0 units/mL

to 2,500 units/mL. The horizontal line indicates the median values for each group. Statistical

significance was calculated using Mann-Whitney U-test for all groups of early stage versus

normals and all cancer samples versus normals. All comparisons showed a statistically

significant difference between medians (p<0.0001), except for the comparison of AGR2 early

stage versus normals. See also Table 3.2.


Figure 3.2 Correlation between CA19.9 and SYCN (a), CA19.9 and AGR2 (b) and SYCN and AGR2 (c). The Spearman

correlation coefficient was significant for all comparisons.


To evaluate the diagnostic value of SYCN and AGR2, AUC was calculated using ROC

curve analysis (Figure 3.3). Individually, neither protein showed improved AUC compared with CA19.9

alone (Fig 3.3a). Further biomarker model comparisons were performed using a bootstrap

method of analysis for modeling and the combination of SYCN, AGR2 and CA19.9 performed

the best with a statistically significant improvement in AUC when compared to CA19.9 alone

(AUC_{SYCN+AGR2+CA19.9} = 0.91, AUC_{CA19.9} = 0.84; p = 0.0007937).

We further performed ROC curve analysis on the clinically confirmed stage II cancer

samples with normal controls (Figure 3.4). In this analysis, SYCN performed the best (in terms

of AUC) of the three markers individually, with a slight improvement over CA19.9 although the

improvement was not significant (AUC_{SYCN} = 0.84; AUC_{CA19.9} = 0.82; p-value = 0.7846). When

assessed in combination, the combination of SYCN, AGR2 and CA19.9 performed the best as

seen with all cancer cases, showing a statistically significant improvement in AUC over CA19.9
alone (AUC_{SYCN+AGR2+CA19.9} = 0.93, AUC_{CA19.9} = 0.82; p = 0.002718) (Fig 3.4b). The

combination of SYCN and AGR2 in the analysis with stage II patients and controls (Fig 3.4b)

had an AUC of 0.83 which was lower than the AUC of SYCN alone (0.84; Fig 3.4a). In the

assessment with all cancer patients however, the combination of SYCN and AGR2 resulted in a

higher AUC than either of these proteins alone.

Taken together, SYCN and AGR2 significantly enhanced the performance of CA19.9 in

this sample set, warranting further investigation of this novel panel in larger sample sets of

pancreatic cancer, benign and normal controls.
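
A comparison of two AUCs measured on the same subjects, such as a panel score versus CA19.9 alone, can be sketched with a stratified bootstrap. This is only an illustration: the study reports DeLong's method for the AUC comparisons, and the way the panel score was constructed is not reproduced here. The sketch assumes each subject already has one panel score and one single-marker score. All names and values are hypothetical.

```python
import random


def auc(cases, controls):
    """Probability that a case scores higher than a control (ties = 0.5)."""
    wins = 0.0
    for x in cases:
        for y in controls:
            wins += 1.0 if x > y else 0.5 if x == y else 0.0
    return wins / (len(cases) * len(controls))


def auc_difference(cases, controls):
    """AUC of the first score minus AUC of the second score.

    cases, controls: lists of (score_a, score_b) pairs, one per subject.
    """
    auc_a = auc([a for a, _ in cases], [a for a, _ in controls])
    auc_b = auc([b for _, b in cases], [b for _, b in controls])
    return auc_a - auc_b


def bootstrap_auc_difference(cases, controls, n_boot=1000, seed=0):
    """Observed AUC difference and a 95% percentile bootstrap interval.

    Cases and controls are resampled separately (with replacement) so
    that group sizes stay fixed in every replicate.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        boot_cases = [rng.choice(cases) for _ in cases]
        boot_controls = [rng.choice(controls) for _ in controls]
        diffs.append(auc_difference(boot_cases, boot_controls))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[max(int(0.975 * n_boot) - 1, 0)]
    return auc_difference(cases, controls), (lower, upper)


# Hypothetical (panel score, single-marker score) pairs per subject.
toy_cases = [(5, 1), (6, 2), (7, 1), (8, 2)]
toy_controls = [(1, 1), (2, 2), (3, 1), (4, 2)]

observed, (low, high) = bootstrap_auc_difference(toy_cases, toy_controls, n_boot=500)
print(observed, low, high)
```

Resampling subjects as pairs preserves the correlation between the two scores, which matters when both are measured in the same serum samples.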

Figure 3.3 ROC curves for SYCN, AGR2 and CA19.9 individually (a) and in combination (b) for all cancer (n=111) and all
controls (n=87) with estimated AUC (95% confidence interval (CI)). Individually, CA19.9 performed best (AUC = 0.84) (a). In
combination, addition of SYCN and AGR2 significantly improved AUC of CA19.9 (p=0.0007937) (b).
2 candidates = SYCN and AGR2

Figure 3.4 ROC curves for SYCN, AGR2 and CA19.9 individually (a) and in combination (b) for stage II pancreatic cancer
(n=50) and all controls (n=87) with estimated AUC (95% confidence interval (CI)). Individually, SYCN performed best (AUC =
0.84) (a). In combination, addition of SYCN and AGR2 significantly improved AUC of CA19.9 (p=0.002718) (b). The combination
of SYCN and AGR2 also had higher AUC than CA19.9 alone; however the difference was not significant (b).
2 candidates = SYCN and AGR2

3.4 Discussion

The identification of novel biomarkers or biomarker panels with a high degree of
sensitivity and specificity can aid in the clinical management of pancreatic cancer patients and in
the detection of disease in the early stages of tumor development when patients can be most

optimally treated. CA19.9 is currently the most widely used clinical biomarker for pancreatic

cancer. It was discovered over 20 years ago using monoclonal antibody technology, where

tumor extracts and cell lines were used as immunogens, followed by selection for hybridoma

clones that recognized these tumor antigens [213]. This was also the case for several other

currently used cancer markers such as CA125 and CA15.3. Although CA19.9 lacks utility as a

marker for detection, new markers have not entered the clinic to replace or sufficiently

supplement CA19.9. The advent of new high throughput proteomic technologies has led to a

new wave of enthusiasm, and the emergence of oncoproteomics and proteomics-based discovery

studies [44,71,117,125]. In this regard, we previously characterized pancreatic cancer-related

biological fluids for the generation of candidate pancreatic cancer biomarkers. Our initial

verification studies led to five promising candidates – AGR2, SYCN, PIGR, OLFM4 and

COL6A1. Two of those five proteins, AGR2 and SYCN, were further investigated in this study.

Syncollin is a protein that is highly expressed in pancreatic acinar granules, with a few

recent studies describing its presence and function in granules in other tissue-types [212].

Digestive enzymes secreted by the pancreas are synthesized in their inactive form by ribosomes

on the endoplasmic reticulum (ER) [214]. Following insertion into the lumen of the ER and

transport to the Golgi, condensing vacuoles form which mature into zymogen granules that are

stored apically in acinar cells. Ca2+ causes fusion of granules with the plasma membrane and

exocytosis. Zymogen granule membrane proteins are key to the packaging of zymogens and

granule movement and fusion with the plasma membrane [214]. Syncollin is a zymogen granule

protein found on the inner surface of the granule membrane with a role in the concentration and

maturation of zymogens, as well as in the regulation of exocytosis [207, 211].

Syncollin has been identified in a qualitative proteomic analysis of pancreatic juice from


patients with pancreatic cancer and in serum from a murine model of pancreatic cancer [98,106];

however to our knowledge, this is the first report of its study in human serum. In the present

study we found it to be significantly elevated in patients with pancreatic cancer. Given that the

majority of pancreatic cancers are believed to arise from ductal cells (or acinar cells that undergo

acinar-to-ductal metaplasia losing acinar cell properties such as zymogen granules) [5,204],

elevation of SYCN in circulation may be a secondary effect of the growing tumor through local

tissue destruction. Release or leakage of proteins during invasion of pancreatic cancer has been

recently studied for the protein transthyretin (TTR), an islet cell protein that is elevated in

pancreatic juice from pancreatic cancer patients through destruction of islet cell architecture in

the presence of invasive cancer [215]. A similar phenomenon may be occurring with SYCN

release. Another key zymogen granule membrane protein is GP2 (glycoprotein 2). Elevated

levels of GP2 have also been reported in plasma from pancreatic cancer patients in comparison

to normal controls [165]. In the same study, GP2 levels showed better diagnostic utility for acute

pancreatitis [165]. This may also be the case for syncollin and our initial findings warrant further

investigation of this protein in larger sample sets which include benign diseases of the pancreas

such as pancreatitis.

AGR2 is a protein initially identified in Xenopus laevis that was shown to be crucial

during ectoderm developmental stages of embryogenesis for formation of anterior structures

[189]. Its role in normal human structures is still unclear; however it has been implicated in

many cancer types, including pancreatic cancer. In a recent proteomic analysis by Chen et al.

[153], AGR2 was found to be overexpressed in pancreatic juice from patients with pancreatic

intraepithelial neoplasia – III (PanIN3) and ELISA results showed this protein to have potential

diagnostic utility for pancreatic cancer in pancreatic juice; however these findings did not

translate into their serum analysis. AGR2 was highly overexpressed in our proteomic cell line


analysis and it was identified in the pancreatic juice. In our preliminary verification studies in 20

cancer and 20 normal plasma samples (described in the previous chapter), this protein also

showed a significant elevation in pancreatic cancer patients (p<0.0001; AUC = 0.95). In the

present study, AGR2 was significantly elevated when all pancreatic cancer cases were compared

to controls; however comparison of only early stage (stage I and II; n=53) cases did not show a

significant difference between cancer and controls. Literature findings support a role for AGR2

in the invasion and metastasis of cancer [192-194], and it may be that elevated levels of this

protein are seen in circulation in the later stages of pancreatic cancer. Given that confirmed stage

was not known in many samples in this study, firm conclusions cannot be made regarding this

without analysis in larger sample sizes from patients with early and late-stage disease.

Due to the lack of one single highly sensitive and specific marker for many diseases,

including for various measurable outcomes of pancreatic cancer, research has shifted to the

identification of panels of markers to achieve enhanced performance [216-218]. In the present

study, we demonstrate the ability of SYCN and AGR2 to significantly enhance the performance

of CA19.9 alone (AUC_{SYCN+AGR2+CA19.9} = 0.91, AUC_{CA19.9} = 0.84; p = 0.0007937). The median
ages of the cancer and control samples used in this study also differed significantly,
and further statistical analyses, taking age into account, are warranted; however it is
highly likely that with the addition of other candidates derived from our proteomic analysis of cell
line conditioned media and pancreatic juice (described in the previous chapter), further
improvements in the AUC can be made.

Validation of biomarkers is a rigorous process [209]. The proteins preliminarily

validated in this study represent promising novel biomarkers for pancreatic cancer, with the

ability to significantly enhance performance of CA19.9 in discriminating cancer from controls.

In this respect, it is crucial that the proteins presented in this study be further investigated in


independent sample sets of serum from early and late-stage pancreatic cancer and other cancer

patients, benign controls (acute pancreatitis, chronic pancreatitis, benign lesions), preclinical

samples and healthy controls to further determine their clinical utility. As well, inclusion or

consideration of these proteins in biomarker panels for pancreatic cancer currently under

development is also warranted.


CHAPTER 4:

Summary and Future Directions


4.1 Summary

This thesis presents a study aimed at the identification of novel pancreatic cancer serum

biomarker candidates through proteomic technologies, followed by verification of selected

candidates using enzyme-linked immunosorbent assays. Mass spectrometry analysis of

conditioned media from six pancreatic cancer cell lines, one normal human pancreatic ductal

epithelial cell line and six pancreatic juice samples (in two pools) in triplicate, resulted in 3479

total proteins identified with high confidence. Subsequent application of bioinformatics-based

criteria centered on label-free protein quantification between the cancer and normal cell lines

and integrative analysis of the multiple biological fluids facilitated the generation of candidate

pancreatic cancer biomarkers. Preliminary verification of candidates in a small subset of plasma

samples resulted in the identification of five promising leads. Further verification of two of the

five proteins in serum demonstrated the ability of these proteins, when used in combination with

CA19.9, to significantly enhance the area under the curve of CA19.9 alone, warranting their

further and extended validation, as well as validation of the remaining candidates. Finally, this

thesis demonstrates the utility of mining and integrating multiple biological fluids for the

identification of putative cancer biomarkers.

Key Findings:

1. Mass Spectrometry Analysis:

a. Strong cation exchange (SCX) liquid chromatography was the first dimension of

fractionation used to minimize sample complexity and it was demonstrated that

pooling fractions based on their SCX elution profile could increase protein yield by

>50%.


b. Using two dimensional liquid chromatography-tandem mass spectrometry (2D-LC-

MS/MS), 3324 non-redundant proteins were identified with ≥ 2 peptides in the

triplicate analysis of the 7 cell lines (MIA-PaCa2, PANC1, BxPc3, CAPAN1,

CFPAC1, SU.86.86 and HPDE).

c. Using 2D-LC-MS/MS, 648 non-redundant proteins were identified with ≥ 2 peptides

in the triplicate analysis of two pancreatic juice pools containing six samples from

pancreatic ductal adenocarcinoma patients.

d. A combined total of 3479 proteins were identified from all cell lines and pancreatic

juice samples, of which ~40% were found to be extracellular or cell surface

annotated through Gene Ontology analysis.

e. Label-free protein quantification, comparing the average normalized spectral counts

from the three replicates of each cancer cell line to that of the HPDE normal cell line,

resulted in 63 extracellular and cell surface proteins that showed over 5-fold increase

in at least three cancer cell lines.

f. Integrative analysis of proteins common to the cancer cell line and pancreatic juice

proteomes (with further filtering using a pancreatic cancer ascites proteome;

unpublished) and study of tissue specificity, focusing on pancreas specific proteins,

resulted in the generation of further candidates.

g. Many proteins identified through the label-free protein quantification approach as

overexpressed in pancreatic cancer and through the integrated analysis have been

previously shown to be increased in pancreatic cancer serum, which helps to

internally validate our approach.

2. Preliminary Verification

a. Five proteins, AGR2, SYCN, OLFM4, PIGR and COL6A1, showed a significant

increase in plasma from 20 patients with pancreatic cancer in comparison to 20

controls using ELISA.

b. ROC curve analysis of these five proteins as a panel showed a slight improvement over

the AUC of CA19.9 alone.

3. Extended Verification

a. SYCN and AGR2 were shown to significantly improve the AUC of CA19.9 when all

three molecules were assessed in combination in 198 serum samples (111 samples

from pancreatic cancer patients and 87 from normal controls).
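
As a rough illustration of the filtering step summarized in finding 1e, the sketch below averages normalized spectral counts over replicates, computes the fold change of each cancer cell line relative to the normal cell line, and keeps proteins that exceed 5-fold in at least three lines. The normalization shown (dividing by each replicate's total count), the pseudocount, and all function and parameter names are assumptions made for illustration; they are not taken from the software or settings used in this thesis, and the restriction to extracellular and cell surface proteins is not modelled here.

```python
from statistics import mean


def average_normalized_counts(replicates):
    """Average per-protein spectral counts over replicates.

    Each replicate is a dict {protein: spectral count}. Counts are scaled by
    the replicate's total count before averaging (an assumed normalization).
    Replicates are assumed to be non-empty.
    """
    proteins = set().union(*replicates)
    return {
        protein: mean(rep.get(protein, 0) / sum(rep.values()) for rep in replicates)
        for protein in proteins
    }


def select_candidates(cancer_lines, normal_replicates,
                      min_fold=5.0, min_lines=3, pseudocount=1e-4):
    """Return proteins with fold change > min_fold in >= min_lines cancer lines.

    cancer_lines maps a cell line name to its list of replicate dicts.
    The pseudocount (hypothetical value) avoids division by zero for proteins
    that were not detected in the normal cell line.
    """
    normal = average_normalized_counts(normal_replicates)
    hits = {}
    for line, replicates in cancer_lines.items():
        cancer = average_normalized_counts(replicates)
        for protein, value in cancer.items():
            fold = (value + pseudocount) / (normal.get(protein, 0.0) + pseudocount)
            if fold > min_fold:
                hits.setdefault(protein, set()).add(line)
    return sorted(p for p, lines in hits.items() if len(lines) >= min_lines)
```

In practice the thresholds (`min_fold`, `min_lines`) would be set to match the criteria stated above, and the resulting list would then be intersected with the Gene Ontology annotations.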

4.2 Future Directions

The experimental data presented in this thesis have led to the identification of several

novel candidate serological biomarkers for pancreatic cancer, which require further verification

and validation to determine their diagnostic utility, and potential use in other areas of pancreatic

cancer management. In our extended verification, adding SYCN and AGR2 to a panel with

CA19.9 significantly improved the AUC relative to CA19.9 alone. Adding the

other proteins that were increased in our preliminary verification phase

may further enhance the panel's performance. It is also important to assess levels of these proteins

in benign disease, preclinical samples, and clearly defined early and late-stage samples in

independent sample sets to more thoroughly assess their potential as biomarkers.
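
The kind of panel comparison described above can be sketched as follows. The example computes the AUC as the probability that a randomly chosen case scores higher than a randomly chosen control, first for a single marker and then for a combined score. Summing per-marker z-scores is only one simple, assumed way of combining markers and is not the model used in this thesis; the function names are hypothetical, and testing whether two AUCs differ significantly is outside the scope of the sketch.

```python
from statistics import mean, pstdev


def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def zscore_sum(samples):
    """Combine markers by summing per-marker z-scores (illustrative only).

    samples is a list of tuples with one value per marker. Each marker is
    standardized over all samples; a marker with zero spread would raise
    ZeroDivisionError and should be dropped beforehand.
    """
    columns = list(zip(*samples))
    centers = [mean(col) for col in columns]
    spreads = [pstdev(col) for col in columns]
    return [
        sum((value - c) / s for value, c, s in zip(sample, centers, spreads))
        for sample in samples
    ]
```

A typical use would standardize cases and controls together, split the combined scores back into the two groups, and compare the resulting AUC with that of CA19.9 alone in an independent sample set.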

In addition to detection, it is possible that these candidates may be useful in other areas

of clinical care, such as in monitoring response to treatment or detecting recurrence of disease.

In this respect, it would be useful to examine their levels in matched serum sets from pancreatic

cancer patients before and after treatment or before and after surgical resection, although

acquisition of such samples may be difficult. Similarly, detailed analysis of post-translational

modifications (PTMs), such as glycosylation, of the candidates verified here may further

enhance their utility as biomarkers by identification of unique disease-specific PTM patterns.

Not all of the candidates generated from the proteomic analysis presented in this thesis

were verified, partly due to a lack of assays and reagents for immunoassay-based verification. In

this regard, future directions also include verification of the remaining candidates that have not

previously been studied. This can be done through immunoassay-based technologies, if reagents

exist, or through development of mass spectrometry-based targeted protein quantification assays

such as multiple reaction monitoring (MRM). For such MRM-based approaches, given the limited

sensitivity of mass spectrometry for low-abundance proteins in serum, prefractionation or immunoextraction techniques to

enrich for candidates will likely be required prior to MRM.

It is highly possible that the proteins identified as deregulated in this study play a role in

the pathogenesis of pancreatic cancer. A detailed analysis of the mechanisms through which

they are increased or decreased and their role in pancreatic cancer may shed light on

tumorigenesis and cancer progression. Additionally, the comprehensive proteome of 3479

proteins presented in this thesis and the proteins identified as differentially expressed through

label-free protein quantification may aid other researchers in further prioritizing candidates in

future cancer therapeutic and diagnostic applications.

REFERENCES

1. Ross MH, Pawlina W. (2006) Histology A Text and Atlas. Baltimore, MD: Lippincott

Williams and Wilkins:594 – 602.

2. Bardeesy N, DePinho RA. (2002) Pancreatic cancer biology and genetics. Nat Rev Cancer.

2, 897-909.

3. Adsay N.V., Thirabanjasak D., Altinel D. (2008) Spectrum of human pancreatic neoplasia.

In: Lowy AM, Leach SD, Philip AP, eds. Pancreatic cancer – M.D. Anderson solid tumor

oncology series. New York: Springer Science+Business Media,LLC:3-26.

4. Maitra A., Hruban R. (2008) Pancreatic Cancer. Annu Rev Pathol Mech Dis. 3,157-188.

5. Stanger B.Z., Dor Y. (2006) Dissecting the cellular origins of pancreatic cancer. Cell Cycle.

5, 43-46

6. Zhu L, Shi G, Schmidt CM, Hruban RH, Konieczny SF. (2007) Acinar cells contribute to the

molecular heterogeneity of pancreatic intraepithelial neoplasia. Am J Pathol. 263-73.

7. Schmid RM. (2002) Acinar-to-ductal metaplasia in pancreatic cancer development. J Clin

Invest. 109, 1403-1404.

8. Tanaka M, Chari S, Adsay V, Fernandez-del Castillo C, Falconi M, Shimizu M, Yamaguchi

K, Yamao K, Matsuno S; International Association of Pancreatology. (2006) International

consensus guidelines for management of intraductal papillary mucinous neoplasms and

mucinous cystic neoplasms of the pancreas. Pancreatology. 6, 17-32.

9. Dunphy E.P. (2008) Pancreatic cancer: a review and update. Clin J Oncol Nurs. 12, 735-741.

10. Krishna NB, Mehra M, Reddy AV, Agarwal B. (2009) EUS/EUS-FNA for suspected

pancreatic cancer: influence of chronic pancreatitis and clinical presentation with or without

obstructive jaundice on performance characteristics. Gastrointest Endosc. 70, 70-79.

11. Zervos EE, Osborne D, Boe BA, Luzardo G, Goldin SB, Rosemurgy AS (2006) Prognostic

significance of new onset ascites in patients with pancreatic cancer. World J Surg Oncol. 4,

16.

12. Bachmann J, Ketterer K, Marsch C, Fechtner K, Krakowski-Roosen H, Büchler MW, Friess

H, Martignoni ME. (2009) Pancreatic cancer related cachexia: influence on metabolism and

correlation to weight loss and pulmonary function. BMC Cancer. 9, 255.

13. Stathis A, Moore MJ. (2010) Advanced pancreatic carcinoma: current treatment and future

challenges. Nat Rev Clin Oncol. 7, 163-172

14. Ghadirian P, Lynch HT, Krewski D. (2003) Epidemiology of pancreatic cancer: an

overview. Cancer Detect Prev. 27, 87-93.

15. Lowenfels AB, Maisonneuve P. (2006) Epidemiology and risk factors for pancreatic cancer.

Best Pract. Res. Clin. Gastroenterol. 20:197–209.

16. Brand RE, Lynch HT. (2000) Hereditary pancreatic adenocarcinoma. A clinical perspective.

Med Clin North Am. 84, 665-675.

17. Lynch HT, Smyrk T, Kern SE, Hruban RH, Lightdale CJ, Lemon SJ, Lynch JF, Fusaro LR,

Fusaro RM, Ghadirian P. (1996) Familial pancreatic cancer: a review. Semin Oncol. 23, 251-

275.

18. Ghadirian P, Boyle P, Simard A, Baillargeon J, Maisonneuve P, Perret C. (1991) Reported

family aggregation of pancreatic cancer within a population-based case-control study in the

Francophone community in Montreal, Canada. Int J Pancreatol 10, 183-196.

19. Tersmette AC, Petersen GM, Offerhaus GJ, Falatko FC, Brune KA, Goggins M, Rozenblum

E, Wilentz RE, Yeo CJ, Cameron JL, Kern SE, Hruban RH. (2001) Increased risk of incident

pancreatic cancer among first-degree relatives of patients with familial pancreatic cancer.

Clin Cancer Res. 7, 738-744.

20. Klapman J, Malafa MP. (2008) Early detection of pancreatic cancer: why, who, and how to

screen. Cancer Control. 15, 280-287.

21. Gruber SB, Entius MM, Petersen GM, Laken SJ, Longo PA, Boyer R, Levin AM, Mujumdar

UJ, Trent JM, Kinzler KW, Vogelstein B, Hamilton SR, Polymeropoulos MH, Offerhaus GJ,

Giardiello FM. (1998) Pathogenesis of adenocarcinoma in Peutz-Jeghers syndrome. Cancer

Res. 58, 5267-5270.

22. Giardiello FM, Brensinger JD, Tersmette AC, Goodman SN, Petersen GM, Booker SV,

Cruz-Correa M, Offerhaus JA (2000) Very high risk of cancer in familial Peutz-Jeghers

syndrome. Gastroenterology. 119, 1447-1453.

23. Lowenfels AB, Maisonneuve P, DiMagno EP, Elitsur Y, Gates LK Jr, Perrault J, Whitcomb

DC. (1997) Hereditary pancreatitis and the risk of pancreatic cancer. International Hereditary

Pancreatitis Study Group. J. Natl. Cancer Inst. 89, 442–446

24. Wolpin BM, Kraft P, Gross M, Helzlsouer K, Bueno-de-Mesquita HB, Steplowski E,

Stolzenberg-Solomon RZ, Arslan AA, Jacobs EJ, Lacroix A, Petersen G, Zheng W, Albanes

D, Allen NE, Amundadottir L, Anderson G, Boutron-Ruault MC, Buring JE, Canzian F,

Chanock SJ, Clipp S, Gaziano JM, Giovannucci EL, Hallmans G, Hankinson SE, Hoover

RN, Hunter DJ, Hutchinson A, Jacobs K, Kooperberg C, Lynch SM, Mendelsohn JB,

Michaud DS, Overvad K, Patel AV, Rajkovic A, Sánchez MJ, Shu XO, Slimani N, Thomas

G, Tobias GS, Trichopoulos D, Vineis P, Virtamo J, Wactawski-Wende J, Yu K, Zeleniuch-

Jacquotte A, Hartge P, Fuchs CS. (2010) Pancreatic cancer risk and ABO blood group alleles:

results from the pancreatic cancer cohort consortium. Cancer Res. 70, 1015-1023.

25. Sarkar FH, Banerjee S, Li Y. (2007) Pancreatic cancer: pathogenesis, prevention and

treatment. Toxicol Appl Pharmacol. 224, 326-336.

26. Li D, Abbruzzese JL. (2010) New strategies in pancreatic cancer: emerging epidemiologic

and therapeutic concepts. Clin Cancer Res. 16, 4313-4318.

27. Fendrich V, Chen NM, Neef M, Waldmann J, Buchholz M, Feldmann G, Slater EP, Maitra

A, Bartsch DK. (2010) The angiotensin-I-converting enzyme inhibitor enalapril and aspirin

delay progression of pancreatic intraepithelial neoplasia and cancer formation in a genetically

engineered mouse model of pancreatic cancer. Gut. 59, 630-637.

28. Shirley A, Yeo CJ. Pancreaticoduodenectomy: Past and Present. (2008) In: Lowy AM,

Leach SD, Philip AP, eds. Pancreatic cancer – M.D. Anderson solid tumor oncology series.

New York: Springer Science+Business Media,LLC:313-327

29. Neoptolemos J, Büchler M, Stocken DD, Ghaneh P, Smith D, Bassi C, Moore M,

Cunningham D, Dervenis C, Goldstein D. (2009). ESPAC‑3(v2): A multicenter,

international, open‑label, randomized, controlled phase III trial of adjuvant

5‑fluorouracil/folinic acid (5‑FU/FA) versus gemcitabine (GEM) in patients with resected

pancreatic ductal adenocarcinoma [abstract]. J. Clin. Oncol. 27 (Suppl. 18), a4505.

30. Katz MH, Hwang R, Fleming JB, Evans DB. (2008) Tumor-node-metastasis staging of

pancreatic adenocarcinoma. CA Cancer J Clin. 58, 111-125.

31. Moore MJ, Goldstein D, Hamm J, Figer A, Hecht JR, Gallinger S, Au HJ, Murawa P, Walde

D, Wolff RA, Campos D, Lim R, Ding K, Clark G, Voskoglou-Nomikos T, Ptasynski M,

Parulekar W; National Cancer Institute of Canada Clinical Trials Group. (2007) Erlotinib

plus gemcitabine compared with gemcitabine alone in patients with advanced pancreatic

cancer: a phase III trial of the National Cancer Institute of Canada Clinical Trials Group. J.

Clin. Oncol. 25, 1960–1966.

32. Jones S., Zhang X., Parsons D.W., Lin J.C., Leary R.J., Angenendt P., Mankoo P., Carter H.,

Kamiyama H., Jimeno A., Hong S.M., Fu B., Lin M.T., Calhoun E.S., Kamiyama M., Walter

K., Nikolskaya T., Nikolsky Y., Hartigan J., Smith D.R., Hidalgo M., Leach S.D., Klein A.P.,

Jaffee E.M., Goggins M., Maitra A., Iacobuzio-Donahue C., Eshleman J.R., Kern S.E.,

Hruban R.H., Karchin R., Papadopoulos N., Parmigiani G., Vogelstein B., Velculescu V.E.,

Kinzler K.W. (2008) Core signaling pathways in human pancreatic cancers revealed by

global genomic analyses. Science. 321, 1801-1806.

33. Jemal A, Siegel R, Xu J, Ward E. (2010) Cancer statistics, 2010. CA Cancer J Clin. 60, 277-

300.

34. Boyle P, Levin B. World cancer report 2008. International Agency for Research on Cancer.

http://www.iarc.fr/en/publications/pdfs-online/wcr/ (Accessed November 2010).

35. Wray CJ, Ahmad SA. (2008) Controversies in the surgical management of pancreatic

cancer. In: Lowy AM, Leach SD, Philip AP, eds. Pancreatic cancer – M.D. Anderson solid

tumor oncology series. New York: Springer Science+Business Media,LLC:385-400.

36. Blackstock AW, Wentworth S. (2008) The evolution of chemoradiation strategies for locally

advanced pancreatic cancer. In: Lowy AM, Leach SD, Philip AP, eds. Pancreatic cancer –

M.D. Anderson solid tumor oncology series. New York: Springer Science+Business

Media,LLC:497-510.

37. Garofalo MC, Regine WF. Adjuvant chemoradiation for pancreatic cancer: past, present and

future. (2008) In: Lowy AM, Leach SD, Philip AP, eds. Pancreatic cancer – M.D. Anderson

solid tumor oncology series. New York: Springer Science+Business Media,LLC:535-547.

38. Ojeda-Fournier H, Choe KA. Imaging of pancreatic adenocarcinoma. (2008) In: Lowy AM,

Leach SD, Philip AP, eds. Pancreatic cancer – M.D. Anderson solid tumor oncology series.

New York: Springer Science+Business Media,LLC:255-270.

39. Soriano A, Castells A, Ayuso C, Ayuso JR, de Caralt MT, Ginès MA, Real MI, Gilabert R,

Quintó L, Trilla A, Feu F, Montanyà X, Fernández-Cruz L, Navarro S. (2004) Preoperative

staging and tumor resectability assessment of pancreatic cancer: prospective study

comparing endoscopic ultrasonography, helical computed tomography, magnetic resonance

imaging, and angiography. Am J Gastroenterol. 99, 492-501.

40. Ho JM, Eysselein VE, Stabile BE. (2008) The value of endoscopic ultrasonography in

predicting resectability and margins of resection for periampullary tumors. Am Surg.

74,1026-1029.

41. Irisawa A, Sato A, Sato M, Ikeda T, Suzuki R, Ohira H. (2009) Early diagnosis of small

pancreatic cancer: Role of endoscopic ultrasonography. Digestive Endoscopy. 21, S92-S96.

42. Zeron HM, Flores JRG, Prieto MLR. (2009) Limitations in improving detection of

pancreatic adenocarcinoma. Future Oncol. 5, 657-668.

43. Rosty C, Goggins M. (2002) Early detection of pancreatic carcinoma. Hematol Oncol Clin N

Am. 16, 37-52.

44. Kulasingam V, Diamandis EP. (2008) Strategies for discovering novel cancer biomarkers

through utilization of emerging technologies. Nat Clin Pract Oncol. 5, 588-599.

45. Rulyak SJ, Kimmey MB, Veenstra DL, Brentnall TA. (2003) Cost-effectiveness of

pancreatic cancer screening in familial pancreatic cancer kindreds. Gastrointest Endosc. 57,

23-29.

46. Evans DB, Rich TA. Cancer of the pancreas. (1997) In: DeVita HS, Rosenberg SA eds.

Cancer: principles and practice of oncology. Philadelphia: Lippincott-Raven:1059-1060.

47. Kim YC, Kim HJ, Park JH, Park DI, Cho YK, Sohn CI, Jeon WK, Kim BI, Shin JH. (2009)

Can preoperative CA19-9 and CEA levels predict the resectability of patients with pancreatic

adenocarcinoma? J Gastroenterol Hepatol. 24, 1869-1875.

48. Yan L, McFaul C, Howes N, Leslie J, Lancaster G, Wong T, Threadgold J, Evans J, Gilmore

I, Smart H, Lombard M, Neoptolemos J, Greenhalf W. (2005) Molecular analysis to detect

pancreatic ductal adenocarcinoma in high-risk groups. Gastroenterology. 128, 2124–2130.

49. Bartels CL, Tsongalis GJ. (2009) MicroRNAs: novel biomarkers for human cancer. Clin

Chem. 55, 623-631.

50. Zhang X., Galardi E., Duquette M., Lawler J., Parangi S. (2005) Antiangiogenic treatment

with three thrombospondin-1 type 1 repeats versus gemcitabine in an orthotopic human

pancreatic cancer model. Clin. Cancer Res. 11, 5622–5630.

51. Tanase CP, Neagu M, Albulescu R, Hinescu ME. (2010) Advances in pancreatic cancer

detection. Adv Clin Chem. 51, 145-80.

52. Ishizone S, Yamauchi K, Kawa S, Suzuki T, Shimizu F, Harada O, Sugiyama A, Miyagawa

S, Fukuda M, Nakayama. (2006) Clinical utility of quantitative RT-PCR targeted to alpha1,

4-N-acetylglucosaminyltransferase mRNA for detection of pancreatic cancer. Cancer Sci. 97,

119–126.

53. Kohn EC, Azad N, Annunziata C, Dhamoon AS, Whiteley G. (2007) Proteomics as a tool

for biomarker discovery. Dis Markers. 23, 411-417.

54. Anderson NL, Anderson NG. (2002) The human plasma proteome: history, character, and

diagnostic prospects. Mol Cell Proteomics. 1, 845-867.

55. Jarjanazi H, Savas S, Pabalan N, Dennis JW, Ozcelik H. (2008) Biological implications of

SNPs in signal peptide domains of human proteins. Proteins. 70, 394-403.

56. Hon LS, Zhang Y, Kaminker JS, Zhang Z. (2009) Computational prediction of the

functional effects of amino acid substitutions in signal peptides using a model-based

approach. Hum Mutat. 30, 99-106.

57. Molina R, Jo J, Filella X, Zanon G, Pahisa J, Muñoz M, Farrus B, Latre ML, Gimenez N,

Hage M, Estape J, Ballesta AM. (1996) C-erbB-2 oncoprotein in the sera and tissue of

patients with breast cancer. Utility in prognosis. Anticancer Res. 16, 2295-2300.

58. Stacker SA, Achen MG, Jussila L, Baldwin ME, Alitalo K. (2002) Lymphangiogenesis and

cancer metastasis. Nat Rev Cancer. 2, 573-583.

59. Goonetilleke KS, Siriwardena AK. (2007) Systematic review of carbohydrate antigen (CA

19-9) as a biochemical marker in the diagnosis of pancreatic cancer. EJSO. 33, 266-270.

60. Magnani JL, Steplewski Z, Koprowski H, Ginsburg V. (1983) Identification of the

gastrointestinal and pancreatic cancer-associated antigen detected by monoclonal antibody

19-9 in the sera of patients as a mucin. Cancer Res. 43, 5489-5492.

61. Marrelli D, Caruso S, Pedrazzani C, Neri A, Fernandes E, Marini M, Pinto E, Roviello F.

(2009) CA19-9 serum levels in obstructive jaundice: clinical value in benign and malignant

conditions. Am J Surg. 198, 333-339.

62. Ventrucci M, Pozzato P, Cipolla A, Uomo G. (2009) Persistent elevation of serum CA 19-9

with no evidence of malignant disease. Dig Liver Dis. 41, 357-363.

63. Hatate K, Yamashita K, Hirai K, Kumamoto H, Sato T, Ozawa H, Nakamura T, Onozato W,

Kokuba Y, Ihara A, Watanabe M. (2008) Liver metastasis of colorectal cancer by protein-

tyrosine phosphatase type 4A, 3 (PRL-3) is mediated through lymph node metastasis and

elevated serum tumor markers such as CEA and CA19-9. Oncol Rep. 20, 737-743.

64. Rosen A, Linder S, Harmenberg U, Pegert S. (1993) Serum levels of CA 19-9 and CA50 in

relation to Lewis blood cell status in patients with malignant and benign pancreatic disease.

Pancreas. 8, 160-165.

65. Nazli O, Bozdag A, Tansug T, Kir R, Kaymak E. (2000) The diagnostic importance of CEA

and CA19-9 for the early diagnosis of pancreatic carcinoma. Hepatogastroenterology. 47,

1750 –1752.

66. Tsavaris N, Kosmas C, Papadoniou N, Kopteridis P, Tsigritis K, Dokou A, Sarantonis J,

Skopelitis H, Tzivras M, Gennatas K, Polyzos A, Papastratis G, Karatzas G, Papalambros A.

(2009) CEA and CA-19.9 serum tumor markers as prognostic factors in patients with locally

advanced (unresectable) or metastatic pancreatic adenocarcinoma: a retrospective analysis. J

Chemother. 21, 673-80.

67. Gold DV, Modrak DE, Ying Z, Cardillo TM, Sharkey RM, Goldenberg DM. (2006) New

MUC1 serum immunoassay differentiates pancreatic cancer from pancreatitis. J Clin Oncol.

24, 252-258.

68. Yates JR, Ruse CI, Nakorchevsky A. (2009) Proteomics by mass spectrometry: approaches,

advances, and applications. Annu Rev Biomed Eng. 11, 49-79.

69. Makawita S, Diamandis EP. (2010) The bottleneck in the cancer biomarker pipeline and

protein quantification through mass spectrometry-based approaches: current strategies for

candidate verification. Clin Chem. 56, 212-222.

70. Kapp, E. A., Schütz, F., Connolly, L. M., Chakel, J. A., Meza, J. E., Miller, C. A., Fenyo,

D., Eng, J. K., Adkins, J. N., Omenn, G. S., and Simpson, R. J. (2005) An evaluation,

comparison, and accurate benchmarking of several publicly available MS/MS search

algorithms: sensitivity and specificity analysis. Proteomics. 5, 3475–3490.

71. Domon, B., and Aebersold, R. (2006) Challenges and opportunities in proteomics data

analysis. Mol. Cell. Proteomics. 5, 1921–1926.

72. Hortin GL. (2006) The MALDI-TOF mass spectrometric view of the plasma proteome and

peptidome. Clin Chem. 52, 1223-1237.

73. Whelan LC, Power KA, McDowell DT, Kennedy J, Gallagher WM. (2008) Applications of

SELDI-MS technology in oncology. J Cell Mol Med. 12, 1535-1547.

74. Taylor GI. (1964) Disintegration of water drops in an electric field. Proc. Royal Soc. Lond.

280, 383–397.

75. Perry RH, Cooks RG, Noll RJ. (2008) Orbitrap mass spectrometry: instrumentation, ion

motion and applications. Mass Spectrom. Rev. 27, 661–699.

76. Hu Q, Noll RJ, Li H, Makarov A, Hardman M, Cooks RG. (2005) The Orbitrap: a

new mass spectrometer. J. Mass Spectrom. 40, 430–443.

77. Makarov A, Denisov E, Lange O, Horning S. (2006) Dynamic range of mass accuracy in

LTQ Orbitrap hybrid mass spectrometer. J. Am. Soc. Mass Spectrom. 17, 977–982.

78. Pepe MS, Etzioni R, Feng Z, Potter JD, Thompson ML, Thornquist M, Winget M, Yasui Y.

(2001) Phases of biomarker development for early detection of cancer. J Natl Cancer Inst.

93, 1054-1061.

79. Rifai N, Gillette MA, Carr SA. (2006) Protein biomarker discovery and validation: the long

and uncertain path to clinical utility. Nat Biotechnol. 24, 971-983.

80. Kulasingam V., Diamandis E.P. (2008) Tissue culture-based breast cancer biomarker

discovery platform. Int J Cancer. 123, 2007-2012.

81. Kitteringham NR, Jenkins RE, Lane CS, Elliott VL, Park BK. (2009) Multiple reaction

monitoring for quantitative biomarker analysis in proteomics and metabolomics. J

Chromatogr B Analyt Technol Biomed Life Sci. 877, 1229-1239.

82. DeLong ER, DeLong DM, Clarke-Pearson DL. (1988) Comparing the areas under two or

more correlated receiver operating characteristic curves: a nonparametric approach.

Biometrics. 44, 837-45.

83. Navaglia F, Fogar P, Basso D, Greco E, Padoan A, Tonidandel L, Fadi E, Zambon C,

Bozzato D, Moz S, Seraglia R, Pedrazzoli S, Plebani M. (2009) Pancreatic cancer biomarkers

discovery by surface-enhanced laser desorption and ionization time-of-flight mass

spectrometry. Clin Chem Lab Med. 47, 713-723.

84. Fiedler GM, Leichtle AB, Kase J, Baumann S, Ceglarek U, Felix K, Conrad T, Witzigmann

H, Weimann A, Schutte C, Hauss J, Buchler M, Thiery J. (2009) Serum peptidome profiling

revealed platelet factor 4 as a potential discriminating peptide associated with pancreatic

cancer. Clin Cancer Res. 15, 3812-3819.

85. Kojima K, Asmellash S, Klug CA, Grizzle WE, Mobley JA, Christein JD. (2008) Applying

proteomic-based biomarker tools for the accurate diagnosis of pancreatic cancer. J

Gastrointest Surg. 12, 1683-1690.

86. Sun Z, Zhu Y, Wang F, Chen R, Peng T, Fan Z, Xu Z, Miao Y. (2007) Serum proteomic-

based analysis of pancreatic carcinoma for the identification of potential cancer biomarkers.

Biochimica et Biophysica Acta. 1774, 765-771.

87. Deng R, Lu Z, Chen Y, Zhou L, Lu X. (2007) Plasma proteomic analysis of pancreatic

cancer by 2-dimensional gel electrophoresis. Pancreas. 34, 310-317.

88. Lin Y, Goedegebuure P, Tan M, Gross J, Malone J, Feng S, Larson J, Phommaly C,

Trinkaus K, Townsend R, Linehan D. (2006) Proteins associated with disease and clinical

course in pancreas cancer: A proteomic analysis of plasma in surgical patients. J Proteome

Res. 5, 2169-2176.

89. Bloomston M, Zhou J, Rosemurgy A, Frankel W, Muro-Cacho C, Yeatman TJ. (2006)

Fibrinogen: overexpression in pancreatic cancer identified by large-scale proteomic analysis

of serum samples. Cancer Res. 66, 2592-2599.

90. Yu K, Rustgi AK, Blair I. (2005) Characterization of proteins in human pancreatic cancer

serum using differential gel electrophoresis and tandem mass spectrometry. J Proteome Res.

4, 1742-1751.

91. Chen J, Ni R, Xiao M, Guo J, Zhou J. (2009) Comparative proteomic analysis of

differentially expressed proteins in human pancreatic cancer tissue. Hepatobiliary Pancreat

Dis Int. 8, 193-200.

92. Chung J, Oh M, Choi S, Bae C. (2008) Proteomic analysis to identify biomarker proteins in

pancreatic ductal adenocarcinoma. ANZ J Surg. 78, 245-251.

93. Chen R, Brentnall T, Pan S, Cooke K, Moyes K, Lane Z, Crispin D, Goodlett DR, Aebersold

R, Bronner M. (2007) Quantitative proteomics analysis reveals that proteins differentially

expressed in chronic pancreatitis are also frequently involved in pancreatic cancer. Mol Cell

Proteomics. 6, 1331-1342.

94. Qi T, Han J, Cui Y, Zong M, Liu X, Zhu B. (2008) Comparative proteomic analysis for the

detection of biomarkers in pancreatic ductal adenocarcinoma. J Clin Pathol. 61, 49–58.

95. Scarlett CJ, Smith R, Saxby A, Nielsen A, Sarma J, Wilson S, Baxter R. (2006) Proteomic

classification of pancreatic adenocarcinoma tissue using protein chip technology.

Gastroenterology. 130, 1670-1678.

96. Gronborg M, Kristiansen T, Iwahori A, Chang R, Reddy R, Sato N, Molina H, Jensen O,

Hruban R, Goggins M, Maitra A, Pandey A. (2006) Biomarker discovery from pancreatic

cancer secretome using a differential proteomic approach. Mol Cell Proteomics. 5, 157-171.

97. Mauri P, Scarpa A, Nascimbeni A, Benazzi L, Parmagnani E, Mafficini A, Peruta M, Bassi

C, Miyazaki K, Sorio C. (2005) Identification of proteins released by pancreatic cancer cells

by multidimensional protein identification technology: a strategy for identification of novel

cancer markers. FASEB J. 19, 1125-1127.

98. Grønborg M, Bunkenborg J, Kristiansen TZ, Jensen ON, Yeo CJ, Hruban RH, Maitra A,

Goggins MG, Pandey A. Comprehensive proteomic analysis of human pancreatic juice. J

Proteome Res. 3, 1042-55.

99. Chen R, Pan S, Yi E, Donohoe S, Bronner M, Potter J, Goodlett D, Aebersold R, Brentnall

T. (2006) Quantitative proteomic profiling of pancreatic cancer juice. Proteomics. 6, 3871-

3879.

100. Chen R, Pan S, Cooke K, Moyes K, Bronner M, Goodlett D, Aebersold R, Brentnall T.

(2007) Comparison of pancreas juice proteins from cancer versus pancreatitis using

quantitative proteomic analysis. Pancreas. 34, 70-79.

101. Zhou L, Lu Z, Yang A, Deng R, Mai C, Sang X, Faber K, Lu X. (2007) Comparative

proteomic analysis of human pancreatic juice: Methodological study. Proteomics. 1345-

1355.

102. Tian M, Cui Y, Song G, Zong M, Zhou X, Chen Y, Han J. (2008) Proteomic analysis

identifies MMP-9, DJ-1 and A1BG as overexpressed proteins in pancreatic juice from

pancreatic ductal adenocarcinoma patients. BMC Cancer. 8, 241-251.

103. Rosty C, Christa L, Kuzdzal S, Baldwin W, Zahurak M, Carnot F, Chan D, Canto M,

Lillemoe K, Cameron J, Yeo CJ, Hruban R, Goggins M. (2002) Identification of

hepatocarcinoma-intestine-pancreas/pancreatitis-associated protein I as a biomarker for

pancreatic ductal adenocarcinoma by protein biochip technology. Cancer Res. 62, 1868-

1875.

104. Ke E, Patel BB, Liu T, Li XM, Haluszka O, Hoffman JP, Ehya H, Young NA, Watson

JC, Weinberg DS, Nguyen MT, Cohen SJ, Meropol NJ, Litwin S, Tokar JL, Yeung AT.

(2009) Proteomic analyses of pancreatic cyst fluids. Pancreas. 38, e33-42.

105. Honda K, Hayashida Y, Umaki T, Okusaka T, Kosuge T, Kikuchi S, Endo M, Tsuchida A,

Aoki T, Itoi T, Moriyasu F, Hirohashi S, Yamada T. (2005) Possible detection of pancreatic

cancer by plasma protein profiling. Cancer Res. 65, 10613-10622.

106. Faca VM, Song KS, Wang H, Zhang Q, Krasnoselsky AL, Newcomb LF, Plentz RR,

Gurumurthy S, Redston MS, Pitteri SJ, Pereira-Faca SR, Ireton RC, Katayama H, Glukhova

V, Phanstiel D, Brenner DE, Anderson MA, Misek D, Scholler N, Urban ND, Barnett MJ,

Edelstein C, Goodman GE, Thornquist MD, McIntosh MW, DePinho RA, Bardeesy N,

Hanash SM. (2008) A mouse to human search for plasma proteome changes associated with

pancreatic tumor development. PLoS Med. 5, e123.

107. Diamandis EP. (2004) Mass spectrometry as a diagnostic and a cancer biomarker discovery

tool: opportunities and potential limitations. Mol Cell Proteomics. 3, 367-378.

108. Kondo,T. (2008) Tissue proteomics for cancer biomarker development: laser

microdissection and 2D-DIGE. BMB. Rep. 41, 626-634.

109. Korc M. (2007) Pancreatic cancer-associated stroma production. Am J Surg. 194, 84-86.

110. Sedlaczek P, Frydecka I, Gabryś M, Van Dalen A, Einarsson R, Harłozińska A. (2002)

Comparative analysis of CA125, tissue polypeptide specific antigen, and soluble interleukin-

2 receptor alpha levels in sera, cyst, and ascitic fluids from patients with ovarian carcinoma.

Cancer. 95, 1886-1893.

111. Kuk,C., Kulasingam,V., Gunawardana,C.G., Smith,C.R., Batruch,I, Diamandis,E.P. (2009)

Mining the ovarian cancer ascites proteome for potential ovarian cancer biomarkers. Mol.

Cell Proteomics. 8, 661-669.

112. Gortzak-Uzan,L., Ignatchenko,A., Evangelou,A.I., Agochiya,M., Brown,K.A., St Onge,P.,

Kireeva,I., Schmitt-Ulms,G., Brown,T.J., Murphy,J., Rosen,B., Shaw,P., Jurisica,I,

Kislinger,T. (2009) A proteome resource of ovarian cancer ascites: integrated proteomic and

bioinformatic analyses to identify putative biomarkers. J. Proteome Res. 7, 339-351.

113. Sipos B., Möser S., Kalthoff H., Török V., Löhr M., Klöppel G. (2003) A comprehensive

characterization of pancreatic ductal carcinoma cell lines: towards the establishment of an in

vitro research platform. Virchows Arch. 442, 444-452.

114. Deer E.L., González-Hernández J., Coursen J.D., Shea J.E., Ngatia J., Scaife C.L., Firpo

M.A., Mulvihill S.J. (2010) Phenotype and genotype of pancreatic cancer cell lines.

Pancreas. 39, 425-435.

115. Wistuba II, Behrens C, Milchgrub S, Syed S, Ahmadian M, Virmani AK, Kurvari V,

Cunningham TH, Ashfaq R, Minna JD, Gazdar AF. (1998) Comparison of features of human

breast cancer cell lines and their corresponding tumors. Clin Cancer Res 4, 2931-2938.

116. Wistuba II, Bryant D, Behrens C, Milchgrub S, Virmani AK, Ashfaq R, Minna JD, Gazdar

AF. (1999) Comparison of features of human lung cancer cell lines and their corresponding

tumors. Clin Cancer Res. 5, 991-1000.

117. Kulasingam V., Pavlou M.P., Diamandis E.P. (2010) Integrating high-throughput

technologies in the quest for effective biomarkers for ovarian cancer. Nat Rev Cancer. 10,

371-378.

118. Koliopanos A, Avgerinos C, Paraskeva C, Touloumis Z, Kelgiorgi D, Dervenis C. (2008)

Molecular aspects of carcinogenesis in pancreatic cancer. Hepatobiliary Pancreat Dis Int.

7, 345-356.

119. Whiteside TL. (2008) The tumor microenvironment and its role in promoting tumor

growth. Oncogene. 27, 5904–5912.

120. Domon B, Aebersold R. (2006) Mass spectrometry and protein analysis. Science. 312, 212-

7.

121. Jemal A., Siegel R., Ward E., Hao Y., Xu J., Thun MJ. (2009) Cancer statistics, 2009. CA

Cancer J Clin. 59, 22-49.

122. Ringel J., Lohr M. (2003) The MUC gene family: their role in diagnosis and early detection

of pancreatic cancer. Mol Cancer. 2, 9.

123. Robin X., Turck N., Hainard A., Lisacek F., Sanchez J.C., Müller M. (2009)

Bioinformatics for protein biomarker panel classification: what is needed to bring biomarker

panels into in vitro diagnostics? Expert Rev Proteomics. 6, 675-689.

124. Yurkovetsky Z.R., Linkov F.Y., Malehorn D.E., Lokshin A.E. (2006) Multiple biomarker

panels for early detection of ovarian cancer. Future Oncol. 2, 733-741.

125. Hanash S., Taguchi A. (2010) The grand challenge to decipher the cancer proteome. Nat

Rev Cancer. 10, 652-660.

126. Farina A, Dumonceau JM, Frossard JL, Hadengue A, Hochstrasser DF, Lescuyer P. (2009)

Proteomic analysis of human bile from malignant biliary stenosis induced by pancreatic

cancer. J Proteome Res. 8, 159-69.

127. Kulasingam V., Diamandis E.P. (2007) Proteomics analysis of conditioned media from

three breast cancer cell lines. Mol Cell Proteomics. 6, 1997-2011.

128. Sardana G., Jung K., Stephan C., Diamandis E.P. (2008) Proteomic analysis of conditioned

media from the PC3, LNCaP, and 22Rv1 prostate cancer cell lines: discovery and validation

of candidate prostate cancer biomarkers. J Proteome Res. 7, 3329-3338.

129. Gunawardana C.G., Kuk C., Smith C.R., Batruch I., Soosaipillai A., Diamandis E.P. (2009)

Comprehensive analysis of conditioned media from ovarian cancer cell lines identifies novel

candidate markers of epithelial ovarian cancer. J Proteome Res. 8, 4705-4713.

130. Planque C., Kulasingam V., Smith C.R., Reckamp K., Goodglick L., Diamandis E.P.

(2009) Identification of five candidate lung cancer biomarkers by proteomics analysis of

conditioned media of four lung cancer cell lines. Mol Cell Proteomics. 8, 2746-2758.

131. Wu C.C., Hsu C.W., Chen C.D., Yu C.J., Chang K.P., Tai D.I., Liu H.P., Su W.H., Chang

Y.S., Yu J.S. (2010) Candidate serological biomarkers for cancer identified from the

secretomes of 23 cancer cell lines and the human protein atlas. Mol Cell Proteomics. 9, 1100-

1117.

132. Xue H., Lü B., Zhang J., Wu M., Huang Q., Wu Q., Sheng H., Wu D., Hu J., Lai M. (2010)

Identification of serum biomarkers for colorectal cancer metastasis using a differential

secretome approach. J Proteome Res. 9, 545-555.

133. Feng X.P., Yi H., Li M.Y., Li X.H., Yi B., Zhang P.F., Li C., Peng F., Tang C.E., Li J.L.,

Chen Z.C., Xiao Z.Q. (2010) Identification of biomarkers for predicting nasopharyngeal

carcinoma response to radiotherapy by proteomics. Cancer Res. 70, 3450-3462.

134. Schiarea S., Solinas G., Allavena P., Scigliuolo G.M., Bagnati R., Fanelli R., Chiabrando

C. (2010) Secretome analysis of multiple pancreatic cancer cell lines reveals perturbations of

key functional networks. J Proteome Res. 9, 4376-4392.

135. Furukawa T., Duguid W.P., Rosenberg L., Viallet J., Galloway D.A., Tsao M.S. (1996)

Long-term culture and immortalization of epithelial cells from normal adult human

pancreatic ducts transfected by the E6E7 gene of human papilloma virus 16. Am J Pathol.

148, 1763-1770.

136. Sedmak J.J., Grossberg S.E. (1977) A rapid, sensitive, and versatile assay for protein using

Coomassie brilliant blue G250. Anal Biochem. 79, 544-552.

137. Itzhaki R.F., Gill D.M. (1964) A micro-biuret method for estimating proteins. Anal Biochem. 9,

401-410.

138. Caraux G., Pinloche S. (2005) PermutMatrix: a graphical environment to arrange gene

expression profiles in optimal linear order. Bioinformatics. 21, 1280-1281.

139. Meunier B., Dumas E., Piec I., Béchet D., Hébraud M., Hocquette J.F. (2007) Assessment

of hierarchical clustering methodologies for proteomic data mining. J Proteome Res. 6, 358-

366.

140. Luo LY, Soosaipillai A, Grass L, Diamandis EP. (2006) Characterization of human

kallikreins 6 and 10 in ascites fluid from ovarian cancer patients. Tumour Biol. 27, 227-234.

141. Shaw J.L., Diamandis E.P. (2007) Distribution of 15 human kallikreins in tissues

and biological fluids. Clin Chem. 53, 1423-1432.

142. Higdon R., Hogan J.M., Van Belle G., Kolker E. (2005) Randomized sequence databases

for tandem mass spectrometry peptide and protein identification. OMICS. 9, 364-379.

143. Elias J.E., Gygi S.P. (2007) Target-decoy search strategy for increased confidence in

large-scale protein identifications by mass spectrometry. Nat Methods. 4, 207-214.

144. Choi H., Nesvizhskii A.I. (2008) False discovery rates and related statistical

concepts in mass spectrometry-based proteomics. J Proteome Res. 7, 47-50.

145. Reddi K.K., Holland J.F. (1976) Elevated serum ribonuclease in patients with pancreatic

cancer. Proc Natl Acad Sci U S A. 73, 2308-2310.

146. Harsha H.C., Kandasamy K., Ranganathan P., Rani S., Ramabadran S., Gollapudi S.,

Balakrishnan L., Dwivedi S.B., Telikicherla D., Selvan L.D., Goel R., Mathivanan S.,

Marimuthu A., Kashyap M., Vizza R.F., Mayer R.J., Decaprio J.A., Srivastava S., Hanash

S.M., Hruban R.H., Pandey A. (2009) A compendium of potential biomarkers of pancreatic

cancer. PLoS Med. 6, e1000046.

147. Maker AV, Katabi N, Gonen M, Dematteo RP, D'Angelica MI, Fong Y, Jarnagin WR,

Brennan MF, Allen PJ. (2010) Pancreatic Cyst Fluid and Serum Mucin Levels Predict

Dysplasia in Intraductal Papillary Mucinous Neoplasms of the Pancreas. Ann Surg Oncol.

Aug 18. [Epub ahead of print].

148. Itkonen O., Koivunen E., Hurme M., Alfthan H., Schröder T., Stenman U.H. (1990) Time-

resolved immunofluorometric assays for trypsinogen-1 and 2 in serum reveal preferential

elevation of trypsinogen-2 in pancreatitis. J Lab Clin Med. 115, 712-718.

149. Hanas J.S., Hocker J.R., Cheung J.Y., Larabee J.L., Lerner M.R., Lightfoot S.A., Morgan

D.L., Denson K.D., Prejeant K.C., Gusev Y., Smith B.J., Hanas R.J., Postier R.G., Brackett

D.J. (2008) Biomarker identification in human pancreatic cancer sera. Pancreas. 36, 61-69.

150. Irigoyen Oyarzabal A.M., Amiguet García J.A., López Vivanco G., Genollá Subirats J.,

Muñoz Villafranca M.C., Ojembarrena Martínez E., Liso Irurzun P. (2003) Tumoral markers

and acute-phase reactants in the diagnosis of pancreatic cancer. Gastroenterol Hepatol. 26,

624-629.

151. Märten A., Büchler M.W., Werft W., Wente M.N., Kirschfink M., Schmidt J. (2010)

Soluble iC3b as an early marker for pancreatic adenocarcinoma is superior to CA19.9 and

radiology. J Immunother. 33, 219-224.

152. Kuhlmann K.F., van Till J.W., Boermeester M.A., de Reuver P.R., Tzvetanova I.D.,

Offerhaus G.J., Ten Kate F.J., Busch O.R., van Gulik T.M., Gouma D.J., Crawford H.C.

(2007) Evaluation of matrix metalloproteinase 7 in plasma and pancreatic juice as a

biomarker for pancreatic cancer. Cancer Epidemiol Biomarkers Prev. 16, 886-891.

153. Chen R., Pan S., Duan X., Nelson B.H., Sahota R.A., de Rham S., Kozarek R.A., McIntosh

M., Brentnall T.A. (2010) Elevated level of anterior gradient-2 in pancreatic juice from

patients with pre-malignant pancreatic neoplasia. Mol Cancer. 9, 149.

154. Koopmann J., Buckhaults P., Brown D.A., Zahurak M.L., Sato N., Fukushima N., Sokoll

L.J., Chan D.W., Yeo C.J., Hruban R.H., Breit S.N., Kinzler K.W., Vogelstein B., Goggins

M. (2004) Serum macrophage inhibitory cytokine 1 as a marker of pancreatic and other

periampullary cancers. Clin Cancer Res. 10, 2386-2392.

155. Koopmann J., Rosenzweig C.N., Zhang Z., Canto M.I., Brown D.A., Hunter M., Yeo C.,

Chan D.W., Breit S.N., Goggins M. (2006) Serum markers in patients with resectable

pancreatic adenocarcinoma: macrophage inhibitory cytokine 1 versus CA19-9. Clin Cancer

Res. 12, 442-446.

156. Moniaux N., Chakraborty S., Yalniz M., Gonzalez J., Shostrom V.K., Standop J., Lele

S.M., Ouellette M., Pour P.M., Sasson A.R., Brand R.E., Hollingsworth M.A., Jain M., Batra

S.K. (2008) Early diagnosis of pancreatic cancer: neutrophil gelatinase-associated lipocalin as

a marker of pancreatic intraepithelial neoplasia. Br J Cancer. 98, 1540-1547.

157. Saha S., Harrison S.H., Shen C., Tang H., Radivojac P., Arnold R.J., Zhang X., Chen J.Y.

(2008) HIP2: an online database of human plasma proteins from healthy individuals. BMC

Med Genomics. 1, 12.

158. Xiao S.J., Zhang C., Zou Q., Ji Z.L. (2010) TiSGeD: a database for tissue-specific genes.

Bioinformatics. 26, 1273-1275.

159. Liu X., Yu X., Zack D.J., Zhu H., Qian J. (2008) TiGER: a database for tissue-specific

gene expression and regulation. BMC Bioinformatics. 9, 271.

160. Pontius J.U., Wagner L., Schuler G.D. (2003) UniGene: a unified view of the

transcriptome. In: The NCBI Handbook. Bethesda (MD): National Center for

Biotechnology Information: pp. 21.1-21.12.

161. Pontén F., Jirström K., Uhlen M. (2008) The Human Protein Atlas--a tool for pathology. J

Pathol. 216, 387-393.

162. Matsugi S., Hamada T., Shioi N., Tanaka T., Kumada T., Satomura S. (2007) Serum

carboxypeptidase A activity as a biomarker for early-stage pancreatic carcinoma. Clin Chim

Acta. 378, 147-153.

163. Adrian T.E., Besterman H.S., Mallinson C.N., Pera A., Redshaw M.R., Wood T.P., Bloom

S.R. (1979) Plasma trypsin in chronic pancreatitis and pancreatic adenocarcinoma. Clin Chim

Acta. 97, 205-212.

164. Artigas JM, Garcia ME, Faure MR, Gimeno AM. (1981) Serum trypsin levels in acute

pancreatic and non-pancreatic abdominal conditions. Postgrad Med J. 57, 219-222.

165. Hao Y, Wang J, Feng N, Lowe AW. (2004) Determination of plasma glycoprotein 2 levels

in patients with pancreatic disease. Arch Pathol Lab Med. 128, 668-674.

166. Hayakawa T, Kondo T, Shibata T, Kitagawa M, Sakai Y, Sobajima H, Tanikawa M, Nakae

Y, Hayakawa S, Katsuzaki T. (1993) Serum pancreatic stone protein in pancreatic

diseases. Int J Pancreatol. 13, 97-103.

167. Borgström A, Regnér S. (2005) Active carboxypeptidase B is present in free form in serum

from patients with acute pancreatitis. Pancreatology. 5, 530-536.

168. Hayakawa T, Kondo T, Shibata T, Kitagawa M, Ono H, Sakai Y, Kiriyama S. (1989)

Enzyme immunoassay for serum pancreatic lipase in the diagnosis of pancreatic diseases.

Gastroenterol Jpn. 24, 556-60.

169. Smith RC, Southwell-Keely J, Chesher D. (2005) Should serum pancreatic lipase replace

serum amylase as a biomarker of acute pancreatitis? ANZ J Surg. 75, 399-404.

170. Junge W, Leybold K. (1982) Detection of colipase in serum and urine of pancreatitis

patients. Clin Chim Acta. 123, 293-302.

171. Pasanen PA, Eskelinen M, Partanen K, Pikkarainen P, Penttilä I, Alhava E. (1994)

Tumour-associated trypsin inhibitor in the diagnosis of pancreatic carcinoma. J Cancer Res

Clin Oncol. 120, 494-497.

172. Funakoshi A, Yamada Y, Ito T, Ishikawa H, Yokota M, Shinozaki H, Wakasugi H, Misaki

A, Kono M. (1991) Clinical usefulness of serum phospholipase A2 determination in patients

with pancreatic diseases. Pancreas. 6, 588-594.

173. Hanahan D., Weinberg R.A. (2000) The hallmarks of cancer. Cell. 100, 57-70.

174. Sobel RE, Sadar MD. (2005) Cell lines used in prostate cancer research: a compendium of

old and new lines--part 1. J Urol. 173, 342-359.

175. Barnea E, Sorkin R, Ziv T, Beer I, Admon A. (2005) Evaluation of prefractionation

methods as a preparatory step for multidimensional based chromatography of serum proteins.

Proteomics. 5, 3367-3375.

176. Slebos RJ, Brock JW, Winters NF, Stuart SR, Martinez MA, Li M, Chambers MC,

Zimmerman LJ, Ham AJ, Tabb DL, Liebler DC. (2008) Evaluation of strong cation exchange

versus isoelectric focusing of peptides for multidimensional liquid chromatography-tandem

mass spectrometry. J Proteome Res. 7, 5286-5294.

177. Das S, Bosley AD, Ye X, Chan KC, Chu I, Green JE, Issaq HJ, Veenstra TD, Andresson T.

(2010) Comparison of Strong Cation Exchange and SDS-PAGE Fractionation for Analysis of

Multiprotein Complexes. J Proteome Res. 9, 6696-6704.

178. Fang Y, Robinson DP, Foster LJ. (2010) Quantitative analysis of proteome coverage and

recovery rates for upstream fractionation methods in proteomics. J Proteome Res. 9, 1902-12.

179. Zhu W, Smith JW, Huang CM. (2010) Mass spectrometry-based label-free quantitative

proteomics. J Biomed Biotechnol. Epub 2009 Nov 10.

180. Bachi A, Bonaldi T. (2008) Quantitative proteomics as a new piece of the systems biology

puzzle. J Proteomics. 71, 357-367.

181. Ishihama Y, Oda Y, Tabata T, Sato T, Nagasu T, Rappsilber J, Mann M. (2005)

Exponentially modified protein abundance index (emPAI) for estimation of absolute protein

amount in proteomics by the number of sequenced peptides per protein. Mol Cell Proteomics.

4, 1265-1272.

182. Lu P., Vogel C., Wang R., Yao X., Marcotte E.M. (2007) Absolute protein expression

profiling estimates the relative contributions of transcriptional and translational regulation.

Nat Biotechnol. 25, 117–124.

183. Liu H., Sadygov R.G., Yates III J.R. (2004) A model for random sampling and estimation

of relative protein abundance in shotgun proteomics. Anal Chem. 76, 4193–4201.

184. Collier TS, Sarkar P, Franck WL, Rao BM, Dean RA, Muddiman DC. (2010) Direct

Comparison of Stable Isotope Labeling by Amino Acids in Cell Culture and Spectral

Counting for Quantitative Proteomics. Anal Chem. [Epub ahead of print]

185. Kakisaka T, Kondo T, Okano T, Fujii K, Honda K, Endo M, Tsuchida A, Aoki T, Itoi T,

Moriyasu F, Yamada T, Kato H, Nishimura T, Todo S, Hirohashi S. (2007) Plasma

proteomics of pancreatic cancer patients by multi-dimensional liquid chromatography and

two-dimensional difference gel electrophoresis (2D-DIGE): up-regulation of leucine-rich

alpha-2-glycoprotein in pancreatic cancer. J Chromatogr B Analyt Technol Biomed Life Sci.

852, 257-267.

186. Inami K, Kajino K, Abe M, Hagiwara Y, Maeda M, Suyama M, Watanabe S, Hino O.

(2008) Secretion of N-ERC/mesothelin and expression of C-ERC/mesothelin in human

pancreatic ductal carcinoma. Oncol Rep. 20, 1375-1380.

187. Paciucci R, Torà M, Díaz VM, Real FX. (1998) The plasminogen activator system in

pancreas cancer: role of t-PA in the invasive potential in vitro. Oncogene. 16, 625-633.

188. Frick VO, Rubie C, Wagner M, Graeber S, Grimm H, Kopp B, Rau BM, Schilling MK.

(2008) Enhanced ENA-78 and IL-8 expression in patients with malignant pancreatic diseases.

Pancreatology. 8, 488-497.

189. Aberger F, Weidinger G, Grunz H, Richter K. (1998) Anterior specification of embryonic

ectoderm: the role of the Xenopus cement gland-specific gene XAG-2. Mech Dev. 72, 115-

130.

190. Zhang Y, Forootan SS, Liu D, Barraclough R, Foster CS, Rudland PS, Ke Y (2007)

Increased expression of anterior gradient-2 is significantly associated with poor survival of

prostate cancer patients. Prostate Cancer Prostatic Dis. 10, 293-300.

191. Fritzsche FR, Dahl E, Dankof A, Burkhardt M, Pahl S, Petersen I, Dietel M, Kristiansen G

(2007) Expression of AGR2 in non-small cell lung cancer. Histol Histopathol. 22, 703–708.

192. Barraclough DL, Platt-Higgins A, de Silva Rudland S, Barraclough R, Winstanley J, West

CR, Rudland PS. (2009) The metastasis-associated anterior gradient 2 protein is correlated with poor

survival of breast cancer patients. Am J Pathol. 175, 1848-1857.

193. Ramachandran V, Arumugam T, Wang H, Logsdon CD. (2008) Anterior gradient 2 is

expressed and secreted during the development of pancreatic cancer and promotes cancer cell

survival. Cancer Res. 68, 7811-7818.

194. Zhang Y, Ali TZ, Zhou H, D'Souza DR, Lu Y, Jaffe J, Liu Z, Passaniti A, Hamburger AW.

(2010) ErbB3 binding protein 1 represses metastasis-promoting gene anterior gradient protein

2 in prostate cancer. Cancer Res. 70, 240-248.

195. DeSouza LV, Romaschin AD, Colgan TJ, Siu KW. (2009) Absolute quantification of

potential cancer markers in clinical tissue homogenates using multiple reaction monitoring on

a hybrid triple quadrupole/linear ion trap tandem mass spectrometer. Anal Chem. 81, 3462-

3470.

196. Hessle H, Engvall E. (1984) Type VI collagen. Studies on its localization, structure, and

biosynthetic form with monoclonal antibodies. J Biol Chem. 259, 3955–3961.

197. Lampe AK, Bushby KM. (2005) Collagen VI related muscle disorders. J Med Genet. 42,

673-685.

198. Fujita A, Sato JR, Festa F, Gomes LR, Oba-Shinjo SM, Marie SK, Ferreira CE, Sogayar

MC. (2008) Identification of COL6A1 as a differentially expressed gene in human

astrocytomas. Genet Mol Res. 7, 371-378.

199. Li J, Dowdy S, Tipton T, Podratz K, Lu WG, Xie X, Jiang SW. (2009) HE4 as a biomarker

for ovarian and endometrial cancer management. Expert Rev Mol Diagn. 9, 555-566.

200. Welsh JB, Sapinoso LM, Kern SG, Brown DA, Liu T, Bauskin AR, Ward RL, Hawkins

NJ, Quinn DI, Russell PJ, Sutherland RL, Breit SN, Moskaluk CA, Frierson HF Jr, Hampton

GM. (2003) Large-scale delineation of secreted protein biomarkers overexpressed in cancer

tissue and serum. Proc Natl Acad Sci U S A. 100, 3410-3415.

201. Bjorling E, Lindskog C, Oksvold P, Linne J, Kampf C, Hober S, Uhlen M, Ponten F (2008)

A web-based tool for in silico biomarker discovery based on tissue-specific protein profiles in

normal and cancer tissues. Mol Cell Proteomics. 7, 825–844.

202. Grapin-Botton A. (2005) Ductal cells of the pancreas. Int J Biochem Cell Biol. 37, 504-510.

203. Rovira M, Delaspre F, Massumi M, Serra SA, Valverde MA, Lloreta J, Dufresne M, Payré

B, Konieczny SF, Savatier P, Real FX, Skoudy A. (2008) Murine embryonic stem cell-

derived pancreatic acinar cells recapitulate features of early pancreatic differentiation.

Gastroenterology. 135, 1301-1310.

204. Schmid RM, Klöppel G, Adler G, Wagner M. (1999) Acinar-ductal-carcinoma sequence in

transforming growth factor-alpha transgenic mice. Ann N Y Acad Sci. 880, 219-230.

205. Kobayashi D, Koshida S, Moriai R, Tsuji N, Watanabe N. (2007) Olfactomedin 4 promotes

S-phase transition in proliferation of pancreatic cancer cells. Cancer Sci. 98, 334-40.

206. Oue N, Sentani K, Noguchi T, Ohara S, Sakamoto N, Hayashi T, Anami K, Motoshita J, Ito

M, Tanaka S, Yoshida K, Yasui W. (2009) Serum olfactomedin 4 (GW112, hGC-1) in

combination with Reg IV is a highly sensitive biomarker for gastric cancer patients. Int J

Cancer. 125, 2383-2392.

207. Antonin W, Wagner M, Riedel D, Brose N, Jahn R. (2002) Loss of the zymogen granule

protein syncollin affects pancreatic protein synthesis and transport but not secretion. Mol Cell

Biol. 22, 1545-1554.

208. Yachida S, Jones S, Bozic I, Antal T, Leary R, Fu B, Kamiyama M, Hruban RH, Eshleman

JR, Nowak MA, Velculescu VE, Kinzler KW, Vogelstein B, Iacobuzio-Donahue CA. (2010)

Distant metastasis occurs late during the genetic evolution of pancreatic cancer. Nature. 467,

1114-1117.

209. Mischak H, Allmaier G, Apweiler R, Attwood T, Baumann M, Benigni A, Bennett SE,

Bischoff R, Bongcam-Rudloff E, Capasso G, Coon JJ, D'Haese P, Dominiczak AF, Dakna M,

Dihazi H, Ehrich JH, Fernandez-Llama P, Fliser D, Frokiaer J, Garin J, Girolami M, Hancock

WS, Haubitz M, Hochstrasser D, Holman RR, Ioannidis JP, Jankowski J, Julian BA, Klein

JB, Kolch W, Luider T, Massy Z, Mattes WB, Molina F, Monsarrat B, Novak J, Peter K,

Rossing P, Sánchez-Carbayo M, Schanstra JP, Semmes OJ, Spasovski G, Theodorescu D,

Thongboonkerd V, Vanholder R, Veenstra TD, Weissinger E, Yamamoto T, Vlahou A.

(2010) Recommendations for biomarker identification and qualification in clinical

proteomics. Sci Transl Med. 2, 46ps42.

210. Cutts RJ, Gadaleta E, Hahn SA, Crnogorac-Jurcevic T, Lemoine NR, Chelala C. (2010)

The Pancreatic Expression database: 2011 update. Nucleic Acids Res. [Epub ahead of print].

211. Edwardson JM, An S, Jahn R (1997) The secretory granule protein syncollin binds to

syntaxin in a Ca2(+)-sensitive manner. Cell. 90, 325-333.

212. Bach JP, Borta H, Ackermann W, Faust F, Borchers O, Schrader M. (2006) The secretory

granule protein syncollin localizes to HL-60 cells and neutrophils. J Histochem Cytochem.

54, 877-888.

213. Koprowski H, Herlyn M, Steplewski Z, Sears HF. (1981) Specific Antigen in Serum of

Patients with Colon Carcinoma. Science. 212, 53-55.

214. Williams JA. (2006) Regulation of pancreatic acinar cell function. Curr Opin

Gastroenterol. 22, 498-504.

215. Lv S, Gao J, Zhu F, Li Z, Gong Y, Xu G, Ma L (2010) Transthyretin, identified by

proteomics, is overabundant in pancreatic juice from pancreatic carcinoma and originates

from pancreatic islets. Diagn Cytopathol. [Epub ahead of print].

216. Killary AM, Balasenthil S, Chen N, Lott ST, Chen J, Carter J, Grizzle WE, Frazier ML,

Sen S. (2010) A Migration Signature and Plasma Biomarker Panel for Pancreatic

Adenocarcinoma. Cancer Prev Res (Phila). [Epub ahead of print].

217. Xue A, Scarlett CJ, Chung L, Butturini G, Scarpa A, Gandy R, Wilson SR, Baxter RC,

Smith RC. (2010) Discovery of serum biomarkers for pancreatic adenocarcinoma using

proteomic analysis. Br J Cancer. 103, 391-400.

218. Takayama R, Nakagawa H, Sawaki A, Mizuno N, Kawai H, Tajika M, Yatabe Y, Matsuo

K, Uehara R, Ono K, Nakamura Y, Yamao K. (2010) Serum tumor antigen REG4 as a

diagnostic biomarker in pancreatic ductal adenocarcinoma. J Gastroenterol. 45, 52-59.

219. McKinney KQ, Lee YY, Choi HS, Groseclose G, Iannitti DA, Martinie JB, Russo MW,

Lundgren DH, Han DK, Bonkovsky HL, Hwang SI. (2011) Discovery of putative

pancreatic cancer biomarkers using subcellular proteomics. J Proteomics. 74, 79-88.

APPENDICES

Appendix 1. Table of overrepresented KEGG pathways in the pancreatic juice proteome

in comparison to the cell line conditioned media proteome

Overrepresented Pathway Description | % of Pathway Proteins in Pancreatic Juice | % of Pathway Proteins in Cell Line Proteome | Number of Pathway Proteins in Pancreatic Juice | Number of Pathway Proteins in Cell Line Proteome | Raw p-value | FDR p-value
Complement and coagulation cascades (hsa04610) | 3.24 | 0.48 | 21 | 16 | 4.78E-07 | 4.49E-05
Pancreatic secretion (hsa04972) | 2.62 | 0.48 | 17 | 16 | 3.61E-05 | 1.70E-03
Systemic lupus erythematosus (hsa05322) | 2.78 | 0.6 | 18 | 20 | 8.92E-05 | 2.80E-03

Protein centre software uses a hypergeometric test to determine whether KEGG (Kyoto

Encyclopedia of Genes and Genomes; http://www.genome.jp/kegg/) categories are

disproportionately represented in comparisons between two datasets. In a comparison between all

of the proteins identified in the pancreatic juice and all of the proteins identified in the cell line

conditioned media, the three KEGG pathways presented in the table were overrepresented in

the pancreatic juice dataset. Provided are the percentage and number of proteins from the

pancreatic juice and cell line datasets that were mapped to the three KEGG pathways. Raw

p-values are from hypergeometric tests; small values indicate that the observed protein counts

are unlikely to be due to random sampling. False discovery rate (FDR) p-values are raw

p-values corrected for a false discovery rate of 1.0%.
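
The hypergeometric test described above can be illustrated with a minimal Python sketch. It is not the software's own implementation. Two assumptions are made for illustration only: the dataset sizes are back-calculated from the percentages in the table (about 648 pancreatic juice proteins and about 3333 cell line proteins), and the background population is taken as the two datasets pooled without overlap. The function name `hypergeom_upper_tail` is hypothetical, and the resulting p-value is not expected to match the tabulated value exactly.

```python
from math import comb


def hypergeom_upper_tail(k, n_draw, n_success, n_total):
    """Return P(X >= k) for a hypergeometric variable X.

    n_draw proteins are sampled without replacement from n_total
    proteins, of which n_success are annotated to the pathway.
    """
    upper = min(n_draw, n_success)
    numerator = sum(
        comb(n_success, i) * comb(n_total - n_success, n_draw - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(n_total, n_draw)


# Complement and coagulation cascades row (hsa04610) of the table.
juice_hits, cell_hits = 21, 16

# ASSUMPTION: dataset sizes back-calculated from the table percentages.
juice_total = round(juice_hits / 0.0324)  # about 648 proteins
cell_total = round(cell_hits / 0.0048)    # about 3333 proteins

# ASSUMPTION: background = both datasets pooled, with no overlap.
p_value = hypergeom_upper_tail(
    k=juice_hits,
    n_draw=juice_total,
    n_success=juice_hits + cell_hits,
    n_total=juice_total + cell_total,
)
print(f"raw p-value (sketch): {p_value:.2e}")
```

Exact integer arithmetic via `math.comb` is used so that the tail probability is not affected by floating-point underflow for large protein counts.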

Appendix 2. Pearson correlation coefficient values comparing normalized spectral counts of the triplicate cell line analysis.

Replicate BxPc3-1 BxPc3-2 BxPc3-3 CAPAN1-1 CAPAN1-2 CAPAN1-3 CFPAC1-1 CFPAC1-2 CFPAC1-3 HPDE-1 HPDE-2 HPDE-3 MIA-1 MIA-2 MIA-3 PANC1-1 PANC1-2 PANC1-3 SU.86.86-1 SU.86.86-2 SU.86.86-3

BxPc3-1 1.00 0.99 0.99 0.62 0.62 0.65 0.65 0.48 0.62 0.69 0.68 0.68 0.53 0.53 0.55 0.64 0.68 0.64 0.63 0.62 0.55

BxPc3-2 0.99 1.00 0.99 0.62 0.62 0.65 0.64 0.48 0.62 0.70 0.69 0.69 0.53 0.53 0.55 0.64 0.68 0.64 0.64 0.63 0.56

BxPc3-3 0.99 0.99 1.00 0.62 0.62 0.64 0.65 0.48 0.62 0.71 0.70 0.70 0.53 0.53 0.56 0.65 0.68 0.66 0.64 0.62 0.56

CAPAN1-1 0.62 0.62 0.62 1.00 0.99 0.97 0.67 0.51 0.67 0.54 0.53 0.54 0.49 0.48 0.46 0.59 0.58 0.57 0.71 0.71 0.68

CAPAN1-2 0.62 0.62 0.62 0.99 1.00 0.97 0.68 0.51 0.67 0.53 0.53 0.54 0.49 0.48 0.46 0.58 0.58 0.57 0.71 0.71 0.67

CAPAN1-3 0.65 0.65 0.64 0.97 0.97 1.00 0.67 0.50 0.66 0.56 0.55 0.56 0.53 0.52 0.50 0.62 0.61 0.60 0.70 0.70 0.67

CFPAC1-1 0.65 0.64 0.65 0.67 0.68 0.67 1.00 0.73 0.95 0.52 0.53 0.54 0.46 0.44 0.44 0.67 0.68 0.68 0.74 0.73 0.68

CFPAC1-2 0.48 0.48 0.48 0.51 0.51 0.50 0.73 1.00 0.85 0.39 0.39 0.40 0.35 0.34 0.33 0.50 0.50 0.51 0.57 0.58 0.56

CFPAC1-3 0.62 0.62 0.62 0.67 0.67 0.66 0.95 0.85 1.00 0.51 0.51 0.51 0.46 0.44 0.43 0.64 0.65 0.66 0.74 0.74 0.70

HPDE-1 0.69 0.70 0.71 0.54 0.53 0.56 0.52 0.39 0.51 1.00 0.98 0.97 0.49 0.49 0.47 0.54 0.53 0.55 0.56 0.55 0.52

HPDE-2 0.68 0.69 0.70 0.53 0.53 0.55 0.53 0.39 0.51 0.98 1.00 0.99 0.45 0.45 0.43 0.53 0.52 0.55 0.56 0.56 0.51

HPDE-3 0.68 0.69 0.70 0.54 0.54 0.56 0.54 0.40 0.51 0.97 0.99 1.00 0.45 0.45 0.43 0.54 0.53 0.55 0.56 0.56 0.51

MIA-1 0.53 0.53 0.53 0.49 0.49 0.53 0.46 0.35 0.46 0.49 0.45 0.45 1.00 0.99 0.98 0.70 0.67 0.71 0.43 0.41 0.38

MIA-2 0.53 0.53 0.53 0.48 0.48 0.52 0.44 0.34 0.44 0.49 0.45 0.45 0.99 1.00 0.98 0.69 0.65 0.69 0.42 0.40 0.38

MIA-3 0.55 0.55 0.56 0.46 0.46 0.50 0.44 0.33 0.43 0.47 0.43 0.43 0.98 0.98 1.00 0.67 0.66 0.68 0.43 0.41 0.38

PANC1-1 0.64 0.64 0.65 0.59 0.58 0.62 0.67 0.50 0.64 0.54 0.53 0.54 0.70 0.69 0.67 1.00 0.97 0.97 0.58 0.57 0.50

PANC1-2 0.68 0.68 0.68 0.58 0.58 0.61 0.68 0.50 0.65 0.53 0.52 0.53 0.67 0.65 0.66 0.97 1.00 0.95 0.61 0.59 0.53

PANC1-3 0.64 0.64 0.66 0.57 0.57 0.60 0.68 0.51 0.66 0.55 0.55 0.55 0.71 0.69 0.68 0.97 0.95 1.00 0.58 0.56 0.49

SU.86.86-1 0.63 0.64 0.64 0.71 0.71 0.70 0.74 0.57 0.74 0.56 0.56 0.56 0.43 0.42 0.43 0.58 0.61 0.58 1.00 0.99 0.94

SU.86.86-2 0.62 0.63 0.62 0.71 0.71 0.70 0.73 0.58 0.74 0.55 0.56 0.56 0.41 0.40 0.41 0.57 0.59 0.56 0.99 1.00 0.95

SU.86.86-3 0.55 0.56 0.56 0.68 0.67 0.67 0.68 0.56 0.70 0.52 0.51 0.51 0.38 0.38 0.38 0.50 0.53 0.49 0.94 0.95 1.00

Each cell line was analyzed in triplicate for a total of 21 replicates. Each replicate was compared pair-wise and Pearson correlation

coefficients are reported. With the exception of CFPAC1-rep2, good correlation (0.944-0.993) was seen for replicates of the same cell

line indicating good reproducibility between cell line replicates. MIA-1, MIA-2, MIA-3 are replicates 1, 2 and 3 of the MIA-PaCa2

cell line.
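
As an illustration of how one entry of the correlation matrix above could be computed, the following Python sketch calculates a Pearson correlation coefficient between two replicates. The spectral count values are hypothetical and the function name `pearson_r` is introduced here only for illustration; the thesis does not state which software produced the coefficients.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL normalized spectral counts for five proteins measured
# in two replicates of the same cell line (not data from the thesis).
rep1 = [12.0, 0.0, 45.5, 3.2, 8.1]
rep2 = [11.4, 0.5, 47.0, 2.9, 7.6]

r = pearson_r(rep1, rep2)
print(f"Pearson r between replicates: {r:.3f}")
```

Repeating the calculation for every pair of the 21 replicates would fill a symmetric matrix with 1.00 on the diagonal, as in the table.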

Appendix 3. Extracellular and cell surface annotated proteins with over 5-fold increase in at least three pancreatic cancer cell

lines.

Columns: Gene; ANOVA P-value; Accession; %CV and FC for BxPc3, MIA-PaCa2, PANC1, CAPAN1, CFPAC1 and SU.86.86; Identified in/as... (Pancreatic Juice, Ascites a, Human Plasma Proteome b, Overexpressed in Pancreatic Cancer in at Least 4 Studies [146]); Previously Studied as Pancreatic Cancer Serum Biomarker

RNASE1 1.1E-16 IPI00014048 2 127 34 9 18 39 [145]

PIGR 3.3E-16 IPI00004573 7 396 18 31 1 53

LOXL2;ENTPD4 7.8E-16 IPI00294839 7 125 7 35 20 8

MUC5AC 1.4E-15 IPI00103397 4 358 10 153 10 170 [147]

PRSS2 8.1E-15 IPI00011695 4 45 19 19 5 42 [148]

MUC5B 1.2E-14 IPI00902941 0 130 9 49 12 105

FCGBP 1.4E-14 IPI00242956 27 12 33 30 6 166

MMP13 3.0E-14 IPI00021738 8 94 33 12 31 6

VWA1 4.2E-14 IPI00396383 20 6 7 74 15 29

CP 5.2E-14 IPI00017601 12 7 9 184 10 23 [149,150]

C3 9.2E-14 IPI00783987 12 44 5 35 13 161 6 236 [149,151]

SEMA3A 9.3E-14 IPI00031510 47 5 5 86 25 13 21 19 22 16 24 5

SEMA3C 1.3E-13 IPI00019209 14 21 3 51 24 12

SPOCK1 4.3E-13 IPI00005292 26 8 2 26 29 6

MMP7 4.4E-13 IPI00013400 12 13 42 40 11 481 24 9 [152]

SERPINA1 6.0E-13 IPI00553177 22 37 11 304 11 10

PLAT 3.1E-12 IPI00019590 12 130 9 40 9 49

NRP1 1.8E-11 IPI00299594 8 21 32 6 8 38 29 14

ST14 2.5E-11 IPI00001922 13 27 13 19 14 10

LYZ 2.7E-11 IPI00019038 13 111 28 31 17 10

TGM2 6.8E-11 IPI00294578 7 32 7 15 11 13 10 49 12 122 15 55

EPHA2 7.5E-11 IPI00021267 8 15 15 6 9 5

MFI2 1.4E-10 IPI00029275 30 13 13 12 7 37 19 18

CTSH 1.9E-10 IPI00297487 19 18 5 12 14 48

NAGLU 2.1E-10 IPI00008787 6 24 19 10 16 33 4 10

AGR2 2.4E-10 IPI00007427 25 31 13 101 19 56 9 76 [153]

CSF1 6.4E-10 IPI00015881 3 73 25 63 23 6 37 8

CFB;C2 7.2E-10 IPI00019591 23 7 8 10 13 29 7 26 27 14

RNASE4 7.3E-10 IPI00029699 8 9 21 6 13 17 9 12 16 6

PLBD1 7.8E-10 IPI00016255 42 5 14 26 21 6 21 7

COL6A1 8.2E-10 IPI00291136 14 121 14 47 8 27 15 55 21 43

FUCA1 1.1E-09 IPI00843910 15 43 15 7 12 16 9 20 13 35 48 5

LRG1 1.8E-09 IPI00022417 14 18 19 39 11 34 19 11 [185]

LTBP3 1.9E-09 IPI00073196 32 10 14 33 26 6 18 18 5 22

PLBD2 1.9E-09 IPI00169285 31 7 4 20 8 15 19 16

GDF15 2.3E-09 IPI00306543 29 32 17 137 11 20 22 43 [154,155]

SIAE 6.4E-09 IPI00010949 16 9 20 60 42 10 1 24 22 8

DPP7 8.4E-09 IPI00296141 16 18 11 28 20 14

TGFB2 9.0E-09 IPI00235354 7 23 22 35 15 34 12 16

B3GNT3 9.6E-09 IPI00031983 40 8 15 19 5 11

ITGA2 2.8E-08 IPI00013744 10 21 9 24 12 9 32 17

CTSS 3.4E-08 IPI00299150 22 22 20 15 14 9

LTBP1 5.7E-08 IPI00302679 43 7 23 9 13 16

BSG 1.1E-07 IPI00019906 27 11 13 24 36 10 34 6 24 6

ACE 1.6E-07 IPI00437751 14 15 22 7 24 5 36 6

LCN2 1.9E-07 IPI00299547 13 28 9 31 31 51 [156]

CDH2 1.9E-07 IPI00290085 28 53 17 28 18 15

ITGB1 4.4E-07 IPI00217563 14 8 19 6 22 34 4 18 19 15 30 12

MSLN 5.0E-07 IPI00025110 12 38 30 15 49 22

SDCBP 6.2E-07 IPI00299086 38 5 24 8 14 12 11 18 37 8 11 13

CXCL1 1.8E-06 IPI00013874 14 26 19 48 17 14 14 16 49 22

AHSG 5.0E-06 IPI00022431 25 14 38 24 22 39 36 13

DNASE2 5.2E-06 IPI00010348 23 8 28 6 45 6 5 10

CXCL5 5.3E-06 IPI00292936 7 37 14 90 24 17 38 226

WFDC2 8.1E-06 IPI00291488 17 13 18 26 43 45 12 41

NEU1 8.5E-06 IPI00029817 17 9 29 9 12 12

SERPINB9 1.2E-05 IPI00032139 19 15 30 20 20 10 22 14

RARRES1 1.4E-05 IPI00410240 32 22 49 8 38 9

PLA2G15 2.7E-05 IPI00301459 9 12 6 6 42 6

CTBS 2.8E-05 IPI00007778 31 5 14 9 35 12 24 6

HS3ST1 9.3E-05 IPI00021377 18 12 49 9 41 29

LFNG 1.1E-04 IPI00455739 20 19 28 23 34 29 12 5 47 20

ENO2 2.0E-02 IPI00216171 6 235 27 213 10 175 36 152

a Proteome of ascites samples from pancreatic cancer patients (Makawita et al., unpublished).

b Identification in the 12,787-protein plasma proteome database [157].

FC, fold change between cancer cell line and HPDE; %CV, percent coefficient of variation in normalized spectral counts for

triplicates of cell line; PJ, pancreatic juice
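
The FC and %CV quantities defined in the footnote can be sketched in Python as follows. This is an illustrative reading, not the thesis's own calculation. Several details are assumptions: the fold change is taken as the ratio of mean normalized spectral counts, %CV uses the sample standard deviation, the optional `pseudocount` for proteins with zero counts in HPDE is hypothetical, and all count values below are invented.

```python
from statistics import mean, stdev


def percent_cv(values):
    """Percent coefficient of variation: 100 * sample SD / mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("%CV is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


def fold_change(cancer_counts, hpde_counts, pseudocount=0.0):
    """Ratio of mean normalized spectral counts, cancer line vs HPDE.

    ASSUMPTION: the pseudocount is a hypothetical way to avoid division
    by zero; the thesis does not state how zero counts were handled.
    """
    denominator = mean(hpde_counts) + pseudocount
    if denominator == 0:
        raise ValueError("HPDE mean is zero; supply a pseudocount")
    return (mean(cancer_counts) + pseudocount) / denominator


# HYPOTHETICAL triplicate counts for one protein (not thesis data).
hpde = [3.0, 4.0, 5.0]
cell_lines = {
    "BxPc3": [38.0, 41.0, 44.0],
    "PANC1": [25.0, 22.0, 28.0],
    "CAPAN1": [21.0, 20.0, 22.0],
    "MIA-PaCa2": [4.0, 5.0, 3.0],
}

fold_changes = {name: fold_change(c, hpde) for name, c in cell_lines.items()}
# Selection rule from the appendix title: over 5-fold in at least three lines.
passes = sum(fc > 5 for fc in fold_changes.values()) >= 3
print(fold_changes, passes)
```

With these invented counts, three of the four cell lines exceed the 5-fold threshold, so the protein would be retained under the selection rule in the appendix title.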

Appendix 4. Forty-three proteins common to cancer cell lines, pancreatic juice and ascites

Gene | Protein Name | Accession | Identified in/as... | Tissue Specificity
Identified in/as... columns: Ascites a; ≥ 5-fold in at least one cancer cell line vs HPDE; Human Plasma Proteome b; Overexpressed in Pancreatic Cancer in at Least 4 Other Studies (Harsha et al. [146]); Overexpressed in Core Gene Expression Study (Jones et al. [32]); Overexpressed in Pancreatic Cancer Tissue (McKinney et al. [219]); Cyst Fluid of PDAC Patient (Ke et al. [104])
Tissue Specificity columns: HPA [161]; UniGene [160]; TiGER [159]; TiSGeD [158]

PRSS1 | Trypsin-1 | IPI00011694
PRSS2 | Protease serine 2 isoform B | IPI00011695
MUC5AC | Mucin-5AC (Fragment) | IPI00103397
RNASE1 | Ribonuclease pancreatic | IPI00014048
LUM | Lumican | IPI00020986
COL1A1 | collagen alpha-1(I) chain preproprotein | IPI00297646
CEACAM5 | Carcinoembryonic antigen-related cell adhesion molecule 5 | IPI00027486
MUC1 | Mucin | IPI00013955
PIGR | Polymeric immunoglobulin receptor | IPI00004573
OLFM4 | Olfactomedin-4 | IPI00022255
SPP1 | Isoform A of Osteopontin | IPI00021000
SERPINF1 | Pigment epithelium-derived factor | IPI00006114
LRG1 | Leucine-rich alpha-2-glycoprotein | IPI00022417
RBP4 | Retinol-binding protein 4 | IPI00022420
CFI | Complement factor I | IPI00291867
DMBT1 | Isoform 1 of Deleted in malignant brain tumors 1 protein | IPI00099110
F5 | 252 kDa protein | IPI00022937
C4B | complement component 4B preproprotein | IPI00418163
MXRA5 | Matrix-remodeling-associated protein 5 | IPI00012347
LYZ | Lysozyme C | IPI00019038
AGT | Angiotensinogen | IPI00032220
CP | Ceruloplasmin | IPI00017601
FCGBP | IgGFc-binding protein | IPI00242956
SERPINA3 | cDNA FLJ35730 fis, clone TESTI2003131, highly similar to ALPHA-1-ANTICHYMOTRYPSIN | IPI00550991
VTN | Vitronectin | IPI00298971
ACE | Isoform Somatic-1 of Angiotensin-converting enzyme | IPI00437751
TCN1 | Transcobalamin-1 | IPI00299729
SERPINA4 | Kallistatin | IPI00328609
ITIH2 | Inter-alpha (Globulin) inhibitor H2, isoform CRA_a | IPI00305461
APOA1 | Apolipoprotein A-I | IPI00021841
APOC1 | Apolipoprotein C-I | IPI00021855
APOL1 | Isoform 2 of Apolipoprotein L1 | IPI00186903
SERPINC1 | Antithrombin-III | IPI00032179
SERPING1 | Plasma protease C1 inhibitor | IPI00291866
HPX | Hemopexin | IPI00022488
SOD3 | Extracellular superoxide dismutase [Cu-Zn] | IPI00027827
GC | Vitamin D-binding protein | IPI00555812
F2 | Prothrombin (Fragment) | IPI00019568
CD14 | Monocyte differentiation antigen CD14 | IPI00029260
C4BPA | C4b-binding protein alpha chain | IPI00021727
A2M | alpha-2-macroglobulin precursor | IPI00478003
HBA1 | Hemoglobin subunit alpha | IPI00410714
MUC6 | mucin-6 | IPI00401776

a Ascites fluid proteome from 3 pancreatic cancer patients (Makawita et al., unpublished)

b Identification in the 12,787-protein plasma proteome database [157].

PDAC, pancreatic ductal adenocarcinoma