INTEGRATIVE PROTEOMIC ANALYSIS OF CELL
LINE CONDITIONED MEDIA AND PANCREATIC
JUICE FOR THE IDENTIFICATION OF CANDIDATE
PANCREATIC CANCER BIOMARKERS
by
Shalini Makawita
A thesis submitted in conformity with the requirements
for the degree of Master of Science
Laboratory Medicine and Pathobiology
University of Toronto
© Copyright by Shalini Makawita 2011
INTEGRATIVE PROTEOMIC ANALYSIS OF CELL LINE
CONDITIONED MEDIA AND PANCREATIC JUICE FOR THE
IDENTIFICATION OF CANDIDATE PANCREATIC CANCER
BIOMARKERS
Shalini Makawita
Master of Science
Department of Laboratory Medicine and Pathobiology
University of Toronto
2011
ABSTRACT
Novel serological biomarkers to aid in the detection and clinical management of
pancreatic cancer patients are urgently needed. In the present study, we performed in-depth
proteomic analysis of conditioned media from six pancreatic cancer cell lines (MIA-PaCa2,
PANC1, BxPc3, CAPAN1, CFPAC1 and SU.86.86), the normal pancreatic ductal epithelial cell
line HPDE, and pancreatic juice samples from cancer patients for identification of novel
biomarker candidates. Using 2D-LC-MS/MS, a total of 3479 non-redundant proteins were
identified with ≥2 peptides. Subsequent label-free protein quantification and integrative analysis
of the biological fluids resulted in the generation of candidate biomarkers, of which five proteins
were shown to be significantly elevated in plasma from pancreatic cancer patients in a
preliminary assessment. Further verification of two of the proteins in ~200 serum samples
demonstrated the ability of these proteins to significantly improve the area under the receiver
operating characteristic curve of CA19.9 from 0.84 to 0.91.
DEDICATION
My parents are the hardest working, kindest hearted and most persevering individuals I know.
Their love, support and encouragement have given me the courage to always dream big and
strive for my goals. For this I will be forever grateful.
- I dedicate this thesis to my parents Ananda and Dorathy Makawita.
ACKNOWLEDGEMENTS
First and foremost, I would like to thank Dr. Eleftherios P. Diamandis, my supervisor,
mentor and teacher who has encouraged and guided me throughout my M.Sc. journey. You have
helped to mould the somewhat wandering interests I had in research at the start of my journey
into a strong passion for science that I will carry forward with me in my future endeavours. I
will always be forever grateful for the opportunities, kindness and trust you have bestowed upon
me and for the doors you have helped me to open. I look forward to your continued friendship
and mentorship in the years to come.
I would also like to acknowledge Dr. H. Elsholtz and the members of my advisory
committee, Dr. S. Asa and Dr. A. Romaschin for their advice and direction, as well as Dr. Irwin,
the Chair of the examination committee. Likewise, I would like to acknowledge the Department
of Laboratory Medicine and Pathobiology at the University of Toronto and funding I received
from the Ontario Graduate Scholarship.
It is said that “a good friend is hard to find, hard to lose, and impossible to forget”. This
is certainly the case for all of the friends I have made over the past two years in the ACDC lab.
You are all uniquely brilliant in so many different facets of life, and you have all been an
integral part of my M.Sc. journey. Thank you for the countless memories, jokes and the
scientific advice as well! As I sit here at the crossroads of graduate school and the next step in
my academic career, I am excited to see what the future has in store; however I cannot help but
feel somewhat melancholy that my M.Sc journey is coming to its end. The knowledge that all of
you have played a role in some way, shape or form both in my personal journey over the past
two years and in the research presented in this thesis is overwhelming and to you I owe my
sincerest debt of gratitude.
I would like to also particularly acknowledge Antoninus Soosaipillai, Chris Smith and
Ihor Batruch for all of their invaluable technical advice and help over the past two years, as well
as Dr. Irv Bromberg at Mount Sinai Hospital, Toronto for software assistance, and Dr. Yingye
Zheng (Fred Hutchinson Cancer Centre, Seattle, Washington), Elissa Brown (Fred Hutchinson
Cancer Centre) and Apostolos Dimitromanolakis (University of Toronto) for assistance with
statistical analyses. Thank you also to our clinical collaborators who have provided samples
used in this study (Dr. Steven Gallinger at the University Health Network, Toronto and Dr.
Randy Haun at the University of Arkansas Cancer Research Center for plasma/serum samples,
Dr. Felix Rueckert, Dresden, Germany for pancreatic juice samples and Dr. Alice Newman at
Princess Margaret Hospital for assistance with collection of ascites fluid).
Lastly, I would like to thank my parents and family once again for instilling in me the
importance of hard work, providing for me the means for a good education and for always
supporting me in my life's goals.
TABLE OF CONTENTS
ABSTRACT ii
DEDICATION iii
ACKNOWLEDGEMENTS iv
TABLE OF CONTENTS vi
LIST OF TABLES ix
LIST OF FIGURES x
LIST OF APPENDICES xii
LIST OF ABBREVIATIONS xiii
CHAPTER 1: INTRODUCTION 1
1.1 The Pancreatic Cancer Problem 2
1.1.1 The Human Pancreas 2
1.1.2 Pancreatic Cancer 2
1.1.2.1 Precursor Lesions and Cell of Origin 3
1.1.2.2 Symptoms 4
1.1.2.3 Risk Factors 4
1.1.2.4 Prevention 5
1.1.2.5 Treatment 6
1.1.3 Pancreatic Cancer Statistics 8
1.1.4 Current Methods for Pancreatic Cancer Detection and Their Limitations 9
1.2 Serological Cancer Biomarkers 10
1.2.1 Criteria for Detection and Biomarker Applications in Pancreatic Cancer 10
1.2.2 Current State of Pancreatic Cancer Serum Biomarkers 11
1.2.2.1 A General Introduction to Biomarkers 11
1.2.2.2 Mechanisms for Biomarker Elevation in Serum 12
1.2.2.3 CA19.9 and Other Putative Pancreatic Cancer Markers 13
1.3 Mass Spectrometry-Based Methods for Serum Biomarker Discovery 14
1.3.1 Principles of Mass Spectrometry 14
1.3.2 Proteomics Discovery Pipeline – Discovery, Verification and Validation 17
1.3.3 Pancreatic Cancer Serum Proteomics 20
1.3.4 Tissue Proteomics 22
1.3.5 Proteomics of Proximal Biological Fluid and Cell Line Conditioned
Media 22
1.3.6 Integrated Strategies 24
1.4 Rationale, Hypothesis, Objectives 24
1.4.1 Rationale 24
1.4.2 Hypothesis 26
1.4.3 Objectives 26
CHAPTER 2: MASS SPECTROMETRY ANALYSIS OF CELL LINE
CONDITIONED MEDIA AND PANCREATIC JUICE FOR IDENTIFICATION
OF PANCREATIC CANCER BIOMARKERS 28
2.1 Introduction 29
2.2 Materials and Methods 32
2.3 Results 40
2.4 Discussion 60
CHAPTER 3: ENHANCED PERFORMANCE OF CA19.9 WITH ADDITION OF
SYNCOLLIN AND ANTERIOR GRADIENT HOMOLOG 2 IN PANEL 70
3.1 Introduction 71
3.2 Materials and Methods 73
3.3 Results 76
3.4 Discussion 80
CHAPTER 4: SUMMARY AND FUTURE DIRECTIONS 87
4.1 Summary 88
LIST OF TABLES
Table Title Page
2.1 Increasing the number of identified proteins by optimizing cation exchange chromatography fraction pooling 43
2.2 Total number of proteins identified in triplicate analysis of cell line conditioned media and pancreatic juice 45
2.3 Protein overlap between cell line conditioned media 47
2.4 List of 15 pancreas-specific proteins (≥3 databases) identified in conditioned media and pancreatic juice 55
3.1 Stage and grade of 111 pancreatic cancer serum samples 75
3.2 Distribution of serum SYCN, AGR2, CA19.9 and age in pancreatic cancer and control serum 76
LIST OF FIGURES
Figure Title Page
1.1 Biomarker discovery pipeline 19
2.1 Schematic outline of proteomic analysis 42
2.2 Total non-redundant proteins identified 46
2.3 Cellular localization and comparison of GO categories between cell line conditioned media and pancreatic juice proteomes 49
2.4 Hierarchical clustering analysis based on normalized emPAI values for 3479 total non-redundant proteins 51
2.5 Preliminary verification of AGR2, OLFM4, SYCN, COL6A1 and PIGR in plasma 58
2.6 Receiver operating characteristic curve analysis for CA19.9 and panel of five candidates 59
3.1 Distribution of serum CA19.9, SYCN and AGR2 in normal controls, early-stage and all pancreatic cancer patients 78
3.2 Correlation between CA19.9 and SYCN, CA19.9 and AGR2 and SYCN and AGR2 79
3.3 ROC curves of SYCN, AGR2 and CA19.9 for all pancreatic cancer and controls 81
LIST OF APPENDICES
Appendix Title Page
1 Table of overrepresented KEGG pathways in the pancreatic juice proteome in comparison to the cell line conditioned media proteome. 113
2 Pearson correlation coefficient values comparing normalized spectral counts of the triplicate cell line analysis. 114
3 Extracellular and cell surface annotated proteins with over 5-fold increase in at least three pancreatic cancer cell lines. 115
4 Forty-three proteins common to cancer cell lines, pancreatic juice and ascites. 118
LIST OF ABBREVIATIONS
2D-LC-MS/MS two dimensional liquid chromatography tandem mass spectrometry
AGR2 anterior gradient homolog 2
ANOVA analysis of variance
AUC area under curve
CA125 carbohydrate antigen 125
CA19.9 carbohydrate antigen 19.9
CDCHO Chinese hamster ovary serum-free medium
CEA Carcinoembryonic antigen
CEACAM5 Carcinoembryonic antigen-related cell adhesion molecule 5
CM conditioned media
COL6A1 Collagen alpha-1(VI) chain
CT computed tomography
CV coefficient of variation
ELISA enzyme-linked immunosorbent assay
emPAI exponentially modified protein abundance index
ERCP endoscopic retrograde cholangiopancreatography
ESI electrospray ionization
EUS endoscopic ultrasound
FDR false discovery rate
GO gene ontology
HCA hierarchical clustering analysis
hCG human chorionic gonadotropin
HE4 WAP four-disulfide core domain protein 2 precursor
HPA Human Protein Atlas
HPDE human pancreatic ductal epithelial
HPLC high pressure liquid chromatography
IPA Ingenuity Pathway Analysis
IPI international protein index
IPMN intraductal papillary mucinous neoplasms
KEGG Kyoto Encyclopedia of Genes and Genomes
KLK Kallikrein
LTQ linear ion trap
MALDI matrix-assisted laser desorption ionization
MMP matrix metalloproteinase
MRM multiple reaction monitoring
MS mass spectrometry
MUC mucin
OLFM4 Olfactomedin-4
PanIN pancreatic intraepithelial neoplasia
PDAC pancreatic ductal adenocarcinoma
PIGR Polymeric immunoglobulin receptor
PLAT tissue-type plasminogen activator
PSA prostate specific antigen
ROC receiver operating characteristic
SCX strong cation exchange
SELDI surface-enhanced laser desorption ionization
SYCN Syncollin
TiGER Tissue-specific Gene Expression and Regulation
TiSGeD Tissue-Specific Genes Database
1.1 The Pancreatic Cancer Problem
1.1.1 The Human Pancreas
Extending from the C-shaped curve of the duodenum towards the hilum of the spleen,
the pancreas is a glandular organ comprised of both exocrine and endocrine functional units [1].
The exocrine pancreatic component is a serous gland consisting of two major cell types: (1)
acinar cells responsible for the synthesis of digestive enzymes in their inactive form (zymogens)
and (2) ductal cells responsible for the transport of zymogens to the duodenum through a
complex network of branching ducts and the main and accessory pancreatic ducts [1]. The
ductal cells also secrete an alkaline fluid (bicarbonate and water) which acts to neutralize
stomach contents as they enter the duodenum to ensure an optimal pH for digestive enzyme
activity. Another, less well characterized cell type of the exocrine pancreas is the centroacinar
cell, a duct cell located within the acinus [2]. The endocrine component of the
pancreas, which comprises ~ 1-2% of the total volume of the pancreas, is composed of islets of
Langerhans. Islets contain primarily alpha (15-20% of the islet cell population), beta (70% of
cell population) and delta (5-10% of cell population) cells that secrete glucagon, insulin and
somatostatin, respectively, and are fundamental for regulating blood-glucose levels and sugar
metabolism [1,2]. Minor islet cell types include PP (pancreatic polypeptide) cells, EC
(enterochromaffin) cells and D-1 cells, each with either an inhibitory or stimulatory effect
on exocrine pancreatic secretions and gastro-intestinal motility and secretion [1].
1.1.2 Pancreatic Cancer
Pancreatic malignancies are a heterogeneous group of tumors classified largely based
on the pancreatic cell-type they recapitulate [3]. The great majority of pancreatic cancers (~85-
90%) arise from the exocrine pancreas and are pancreatic ductal adenocarcinomas (PDAC). In
their well differentiated state, PDACs resemble glandular morphology akin to benign ducts and
are also characterized by large areas of desmoplastic stroma, and invasion of vascular and
perineurial structures [3,4]. Other, rarer types of pancreatic cancer include: undifferentiated
carcinomas, which lack ductal-like structures and are more aggressive than PDAC;
colloid carcinomas, characterized by large mucin deposits; medullary carcinomas, characterized
by large, undifferentiated epithelioid-like cells; acinar cell carcinomas, which comprise ~1-2% of
pancreatic cancers and recapitulate acinar cell-like properties with zymogen granules; serous
cystadenomas, which show cystic growths and ductal morphology; and endocrine tumors, such as
insulinomas, typically characterized by improper production of pancreatic endocrine hormones
[3,4]. The present study focuses on PDAC as it accounts for the majority of
pancreatic cancers.
1.1.2.1 Precursor Lesions and Cell of Origin
The ductal-like phenotype of PDAC is supported by a genetic progression model of
pancreatic cancer in which ductal cells, as they acquire sequential mutations in KRAS, INK4A,
TP53, SMAD4/DPC4, telomere shortening, etc. progress from normal ductal epithelia to low
grade pancreatic intraepithelial neoplasia (PanIN) to high grade PanIN (with increasing nuclear
abnormalities, abnormal mitosis and cytological atypia) and then to invasive ductal
adenocarcinoma [2,4]. However, the cell of origin of pancreatic cancer remains elusive and
other reports support a model which entails greater developmental plasticity during
tumorigenesis, whereby other cells within the pancreas, such as acinar or centroacinar cells,
may give rise to ductal adenocarcinomas through transdifferentiation or acinar-to-ductal
metaplasia. This is primarily supported by findings in murine models [2,5-7].
Other lesions that can lead to invasive carcinoma of the pancreas include mucinous
cystic neoplasms, intraductal papillary mucinous neoplasms (IPMNs) and intraductal oncocytic
papillary neoplasms [3,8]. These lesions are ductal-like tumors with cyst formation and it is
believed that ~2% of pancreatic cancers arise through these means. In some institutions they
may account for >10% of pancreatic resections [3,8].
1.1.2.2 Symptoms
Pancreatic cancer is asymptomatic in the early stages of tumor development and most
patients present with non-specific abdominal complaints or back pain [3,4,9]. Back pain is
thought to be caused by perineurial invasion, or invasion into bundles of nerve fibres. Patients
with tumors developing in the head and neck regions of the pancreas may develop jaundice due
to blockage of the main bile duct by the growing tumor [10]. More advanced disease may be
characterized by ascites, which is the build-up of fluid in the abdominal cavity, anorexia and
cachexia [11,12]. Cachexia is the unintentional loss of weight (≥10% of body weight)
over the course of several months due to an increase in protein degradation and a reduction in
muscle synthesis. It is present in ~80% of pancreatic cancer patients, and accelerated weight
loss can result in decreased survival [12]. Other complications of pancreatic cancer may include
pancreatitis or diabetes mellitus [13]. Due to the lack of highly specific symptoms and the late
onset of symptoms, pancreatic cancer is an elusive disease and often dubbed a ‘silent killer’.
1.1.2.3 Risk Factors
Pancreatic cancer has been associated with old age, smoking, family history of the
disease, hereditary syndromes, and diabetes [2,4,14]. The median age of pancreatic cancer
patients is in the 7th decade of life and the disease is rare in individuals below the age of 40.
Smoking is one of the most preventable causal factors of pancreatic cancer and has been linked
to as many as one in four cases of pancreatic cancer [15]. In general, smoking has been shown to
increase the relative risk of pancreatic cancer by two-fold. Other risk factors include family
history and hereditary syndromes. Studies have shown that ~ 8-10% of pancreatic cancer
patients may have a familial link [16-18], with a relative risk of 18 and 57-fold in comparison to
sporadic disease if two or three first-degree relatives have pancreatic cancer, respectively
[19,20].
Several hereditary diseases have also been associated with increased pancreatic cancer
risk, most notably Peutz-Jeghers Syndrome and hereditary pancreatitis [20]. Germline mutations
in the STK11/LKB1 gene result in Peutz-Jeghers Syndrome, a disease characterized by macules
on the lips, noncancerous (hamartomatous) gastrointestinal polyps and a relative risk >100 times
that of the general population for pancreatic cancer development [21,22]. Pancreatitis is an
inflammatory condition of the pancreas and mutations in the PRSS1 gene, which encodes the
protease serine 1 (cationic trypsinogen) protein, cause hereditary pancreatitis. Hereditary
pancreatitis increases pancreatic cancer risk by 50-80 times [23]. Familial pancreatic cancer has
been linked to germline mutations in BRCA2, LKB1, CDKN2A and MLH1.
In an interesting recent genome-wide association study by the Pancreatic Cancer Cohort
Consortium (PanScan), blood type was associated with risk for pancreatic cancer [24]. In a
group of 1,534 pancreatic cancer patients and 1,583 controls, individuals with a non-O blood
type showed increasing risk for pancreatic cancer with each additional non-O allele (odds ratios
(OR) of 1.33, 1.61, 1.45 and 2.42 were obtained for individuals with AO, AA, BO and BB
genotypes, respectively, when compared to the OO genotype) [24].
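As a reminder of how an odds ratio of this kind is obtained, the following minimal Python sketch computes an unadjusted odds ratio from a 2x2 case-control table. The function name and the counts are hypothetical and chosen purely for illustration; they are not the PanScan data, and published estimates may reflect statistical adjustments that are not shown here.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Unadjusted odds ratio from a 2x2 case-control table.

    OR = (odds of exposure among cases) / (odds of exposure among controls).
    """
    if unexposed_cases == 0 or exposed_controls == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical counts for illustration only (NOT taken from reference [24]):
# 60 of 100 cases and 50 of 100 controls carry the exposure of interest.
example_or = odds_ratio(exposed_cases=60, unexposed_cases=40,
                        exposed_controls=50, unexposed_controls=50)
print(f"odds ratio = {example_or:.2f}")
```

An odds ratio above 1 indicates that the exposure (here, a given non-O genotype) is more common among cases than among controls, relative to the reference group.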
1.1.2.4 Prevention
Various dietary and chemical agents with potential preventative properties for pancreatic
cancer have been described in the literature. For instance, genistein, a compound in soy products,
and curcumin, a component of the spice turmeric, have been shown to have
inhibitory effects on pancreatic cancer through inhibition of the NF-κB pathway [25]. Similarly,
various compounds in vegetables and fruits, such as vitamins C and D and indole-3-carbinol (I3C,
found in cabbages, broccoli and cauliflower), have been associated with reduced risk of pancreatic cancer or
inhibition of cell proliferation and induction of apoptosis, as is the case with I3C [25].
Metformin, a drug given to patients with type II diabetes to lower glucose, has been shown in
epidemiological studies to confer a significant decrease in development of pancreatic cancer
[26]. As well, aspirin and the angiotensin-I-converting enzyme inhibitor enalapril were shown in a
recent study to delay progression of precursor pancreatic intraepithelial neoplasias to invasive
PDACs: in a mouse model of pancreatic cancer, occurrence of the disease was reduced from 60% in
untreated animals to 17.6-31.2% in groups treated with the agents for 3 and 5
months [27]. The leading preventative measure associated with pancreatic
cancer is abstinence from cigarette smoking [25].
1.1.2.5 Treatment
Treatment of pancreatic cancer varies depending on the extent of the cancer in patients.
For patients that present with early stage disease (<2cm lesions localized to the pancreas), which
is approximately 10% of pancreatic cancer patients, surgical resection is the main treatment
course [28]. Often called the “Whipple Procedure” after Dr. Allen Oldfather Whipple who
described at first a two-stage procedure in the late 1930s, and then a one-stage procedure in the
1940s, pancreaticoduodenectomy is a common procedure for resection of cancerous pancreatic
lesions. It is a procedure which involves removal of the duodenum, head of the pancreas, bile
duct and gallbladder (with or without preservation of the pylorus – the connective region of the
stomach and small intestine) [28]. Following surgery, adjuvant therapy in the form of radiation
and chemotherapy is prescribed in an attempt to reduce recurrence of disease [29]. Gemcitabine,
an inhibitor of DNA synthesis, and 5-fluorouracil are common chemotherapeutic agents
for pancreatic cancer [29].
Locally advanced pancreatic cancer is unresectable disease that often involves invasion
into the superior mesenteric artery and other important vascular structures [30]. Approximately
30% of pancreatic cancer patients present with locally advanced disease. There is no
consensus on the optimal treatment strategy for these patients; however, most receive a
combination of radiation and chemotherapy. Pancreatic cancer is highly resistant to available
therapies and, although significant efforts have been made towards development of new therapies
over the past decade, they have met with little success [13]. Several clinical trials have
compared gemcitabine alone with inhibitors (given individually or in combination with
gemcitabine) targeting various pathways overexpressed in pancreatic cancer, such as the Kras
inhibitor Tipifarnib, the MMP (matrix metalloproteinase) inhibitor Marimastat, the anti-VEGF
(vascular endothelial growth factor) treatment Bevacizumab and the EGFR (epidermal growth factor
receptor) inhibitor Erlotinib; these trials have shown little or no survival benefit [13]. Erlotinib
has shown a statistically significant increase in survival; unfortunately, the median overall
improvement was quite small (from 5.91 months to 6.24 months; p=0.038) [31]. Metastatic
disease, which is present in ~50% of individuals at the time of diagnosis, is treated by systemic
chemotherapies or through palliative measures to reduce pain and discomfort. Without
treatment, the median survival time for metastatic pancreatic cancer patients is ~2-3 months
[13].
An increased understanding, gained over the past several years, of the complex signaling
pathways gone awry in pancreatic cancer, of the important implications of the tumor stroma
(which in malignant pancreatic tumors often encompasses a greater volume than epithelial cells)
[32] and of other microenvironmental effects is expected to aid in the development of
future therapies with greater efficacy for pancreatic cancer. According to a recent review in
Nature Reviews Clinical Oncology [13], such treatment strategies will likely include targeting
multiple pathways, cancer stem cells and pathways involved in development such as Wnt, Notch
and Hedgehog signaling.
1.1.3 Pancreatic Cancer Statistics
Pancreatic cancer is the 10th most common cancer type in North America; however, it
is the 4th leading cause of cancer-related death [33]. Worldwide, this cancer afflicts
approximately 232,000 individuals annually [34]. As described above, prognosis is extremely
poor for patients diagnosed with pancreatic cancer and the majority of patients succumb to the
disease within several months to one year after diagnosis. Fewer than 5% of pancreatic cancer
patients survive to five years post-diagnosis [3,33].
At present, surgical resection, made possible through the early diagnosis of cancerous
lesions while they are small (<2cm) and localized, offers the best treatment option and the only
potentially curative option for pancreatic cancer patients [35]. Unfortunately, due to the
asymptomatic nature of the early-stages of this disease and a lack of adequate screening methods
for its early detection, the majority of patients present with locally advanced (~30% of patients)
or metastatic disease (~50% of patients) at the time of diagnosis. At these advanced stages,
chemotherapy, in combination with radiation therapy, or palliative care options is the usual
treatment course [36,37]; however such treatment options are largely anecdotal due to the high
rate of metastatic spread of pancreatic cancer, especially to vital organs such as the liver. As a
result, there is great interest by clinicians and researchers alike for the development of novel
methods with high sensitivity and specificity for detection of pancreatic cancer in its
asymptomatic or early stages.
Currently, with early detection, five-year survival rates have been shown to improve for
pancreatic cancer patients from <5% to ~20-40%. Compared to other cancer sites such as breast,
prostate, colon and ovarian where 5-year survival rates improve from 23% to 98%, 31% to
100%, 11% to 91% and 28% to 94%, respectively with early detection [33], the improvement
seen in pancreatic cancer patients seems modest at best. However, optimism remains that the
combination of parallel advancements in therapeutic strategies, preventative measures and early
detection will further improve the outcome of pancreatic cancer patients. The focus of the
present study is towards the identification of biomarkers and biomarker candidates that may aid
in the improvement of existing detection strategies.
1.1.4 Current Methods for Pancreatic Cancer Detection and Their Limitations
Current methods for pancreatic cancer detection are based primarily on imaging
techniques for the detection of pancreatic masses or suspected cancerous lesions, in individuals
who either present with nonspecific abdominal complaints, or symptoms suggestive of
pancreatic cancer such as painless jaundice and weight loss [38]. High-resolution, contrast-
enhanced cross sectional computed tomography (CT), in particular, which enables the
acquisition of thin image slices (5mm) from the base of the lungs to the pelvis, is a widely used
technique for pancreatic cancer detection [39]. By examining contour abnormalities within the
pancreas and surrounding ducts and arteries, CT can also facilitate assessment of staging, tumor
resectability, and post-operative follow-up in patients with established pancreatic cancer [38,39].
Endoscopic ultrasound (EUS) has also emerged as a sensitive means for the detection of
pancreatic tumor masses [40,41]. Through a combination of real-time endoscopy and high-
frequency ultrasound, EUS is used to image the pancreas through the gastric and duodenal walls.
The close proximity at which images are obtained has enabled EUS to overcome confounding
effects caused by gaseous features overlying the pancreas. In this respect, EUS has been shown
useful for the detection and evaluation of small (2-3cm) focal lesions [40-42].
Other techniques for detection and assessment of pancreatic cancer include magnetic
resonance imaging (MRI) and positron emission tomography (PET), the latter of which is used
largely for the detection of metastasis [43]. There are conflicting reports as to which imaging
method shows superiority for the clinical assessment of pancreatic cancer and a combination of
techniques may be utilized based on the clinical question and practice preferences [38-43].
Certain definitive diagnoses of pancreatic cancer may require more invasive means such as
endoscopic retrograde cholangiopancreatography (ERCP) which enables tissue sampling,
acquisition of a computed tomography-guided biopsy or endoscopic ultrasound-guided fine
needle aspiration (EUS-FNA) [20,42].
The major drawback of all of these methods for the optimal management of pancreatic
cancer patients is that they are primarily utilized after the onset of symptoms, which, in
pancreatic cancer patients, occurs predominantly after the onset of late-stage disease. At some
institutions, imaging methods have been implemented for screening asymptomatic individuals
with a known familial or genetic predisposition for pancreatic cancer; however, owing to their
high operating costs and their invasive (some more than others) and relatively time-consuming
nature, imaging methods are ineffective for screening the general population or at-risk
groups (such as the elderly) for detection of the asymptomatic and resectable early stages of
pancreatic cancer [42].
1.2 Serological Cancer Biomarkers
1.2.1 Criteria for Detection and Biomarker Applications in Pancreatic Cancer
Standard criteria for an effective method of early detection call for a test that is cost
effective, non-invasive and easily performed by staff with minimal training [44]. Most
importantly, it should possess a high degree of sensitivity and specificity to enable the accurate
discrimination of disease from non-disease without overdiagnosis or false negatives. In this
regard, the test should provide a clear clinical benefit to patients [43-45]. While the fulfillment
of all of these criteria may be challenging for any one technique, serum biomarkers, either
individually or as a multi-parametric panel, have the potential to meet many of the above
criteria. In addition, the development of serological markers is ideal due to the present set-up in
many clinical laboratories which centers around the use of blood testing.
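To make the two performance criteria named above concrete, the following minimal Python sketch computes sensitivity and specificity from the counts of a test evaluated against known disease status. The function names and all counts are hypothetical, chosen only for illustration, and do not come from this study.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased individuals that the test correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-diseased individuals that the test correctly clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (not data from this study):
# 100 patients with disease (80 test positive) and 100 controls (90 test negative).
sens = sensitivity(true_pos=80, false_neg=20)
spec = specificity(true_neg=90, false_pos=10)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

False negatives lower sensitivity, whereas false positives (one route to overdiagnosis) lower specificity, which is why both values are reported together.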
For pancreatic cancer specifically, given the low prevalence of the disease and low
efficacy of current treatments, a population-based screening test is somewhat unrealistic and
screening would likely be in individuals in high-risk groups such as those with familial
pancreatic cancer or syndromes such as Peutz-Jeghers (see section 1.1.2.3 above) [42,45]. In the
sections above, emphasis was placed on early detection; however biomarkers have a wide range
of other important clinical purposes. For instance, local relapse in pancreatic cancer patients
who undergo surgical resection has been reported as 50-85% [46] and as a result, markers that
can be used for monitoring response to surgery/treatment and pancreatic cancer progression can
also greatly aid in the optimal management of pancreatic cancer patients. As well, at present, in up
to 25% of patients who undergo surgery, the inability to resect is discovered only
during the surgery, owing to micrometastasis or invasion that was not identified through current
CT-based methods [47]. In this regard, biomarkers to aid in enhanced prediction of resectability
and staging of tumors can also be of clinical use for pancreatic cancer [47].
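The earlier point that low disease prevalence makes population-based screening unrealistic can be illustrated with the positive predictive value (PPV), obtained from Bayes' rule. The Python sketch below is only an illustration: the function name, the test characteristics and both prevalence figures are hypothetical assumptions, not values reported in this thesis or in the cited literature.

```python
def positive_predictive_value(sens, spec, prevalence):
    """Probability that a positive test result reflects true disease (Bayes' rule)."""
    true_pos_rate = sens * prevalence                 # diseased and test-positive
    false_pos_rate = (1.0 - spec) * (1.0 - prevalence)  # non-diseased and test-positive
    return true_pos_rate / (true_pos_rate + false_pos_rate)


# Hypothetical test with 90% sensitivity and 90% specificity (assumed values).
ppv_general = positive_predictive_value(0.90, 0.90, prevalence=0.0001)  # assumed general population
ppv_high_risk = positive_predictive_value(0.90, 0.90, prevalence=0.05)  # assumed high-risk group
print(f"PPV, general population: {ppv_general:.4f}")
print(f"PPV, high-risk group:    {ppv_high_risk:.4f}")
```

Under these assumed figures the same test yields a far higher PPV in the high-risk group, which is consistent with restricting screening to individuals with familial or syndromic predisposition.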
1.2.2 Current State of Pancreatic Cancer Serum Biomarkers
1.2.2.1 A General Introduction to Biomarkers
Biomarkers are molecules or processes which, when measured, are indicative of a
particular biological state or condition [44]. Many molecules and cellular processes have been
studied for the detection and management of cancer and for pancreatic cancer. These include,
but are not limited to, molecules and processes such as DNA, mRNA, miRNA, proteins,
circulating tumor cells and angiogenesis [44]. For instance, a recent comprehensive genomic
analysis of pancreatic cancer tumors published in Science [32] revealed an average of 63
mutations (mostly point mutations) in pancreatic cancer and 12 core pathways deregulated in the
majority of cases. This study also revealed several hundred (541) genes overexpressed more than 10-fold
in 90% of cases that may serve as potential biomarkers. Other more targeted studies have shown
utility of analyzing mutations in genes such as K-ras and p53, and promoter methylation of
p16(INK4a) in high-risk groups to further stratify risk of developing pancreatic cancer [48].
Similarly, miRNA profiling has shown an association between miR-196a and pancreatic cancer
in several studies [49], where increased miR-196a has shown discriminatory potential between
PDAC and both benign pancreatitis and healthy controls [49]. Increased intratumoral microvessel
density has also been noted in pancreatic cancer, as well as increased expression of angiogenic
factors in serum such as VEGF (vascular endothelial growth factor) [50,51]. Circulating tumor
cells (CTCs) have been a growing area of research in recent years. With respect to pancreatic
cancer CTC research, alpha-1,4-acetyl-glucosaminyltransferase (alpha4GnT) mRNA has shown
diagnostic potential for PDAC upon extraction from peripheral blood mononuclear cells [51,52].
However, given that most molecular and genetic alterations (mRNA, miRNA, DNA,
etc.) tend to ultimately culminate in the altered expression of protein products, and with the
recent advancements made in high-throughput proteomic technologies, the study of cancer
proteomics represents a potentially fruitful means for identification of novel biomarkers [44,53].
1.2.2.2 Mechanisms for Biomarker Elevation in Serum
Human plasma is described as the most complex of all human proteomes, containing
proteomes from all other tissues as subsets [54]. There are several mechanisms by which
proteins can enter into circulation and serve as cancer biomarkers [44]. Primarily these include
the increased secretion and shedding of proteins from tumor cells, angiogenesis, and
leakage/release of proteins from tissues as tumors invade and cause destruction of the local
tumor microenvironment [44]. Approximately 20-25% of all human proteins are secreted, and
a recent analysis of single nucleotide polymorphisms in proteins containing signal peptides has
shown aberrant secretion of certain proteins in disease states [55]. Similar events likely
occur in cancer [56]. Additionally, aberrant production of extracellular proteases may result in
the increased cleavage of extracellular domains of membrane-bound proteins resulting in their
elevation in circulation [44]. Most currently used biomarkers such as AFP (alpha-fetoprotein)
and hCG (human chorionic gonadotropin) are secreted, and HER2 (a member of the epidermal
growth factor (EGF) receptor family) is a membrane-bound protein, the extracellular domain of
which can be detected in serum [44,57]. An example of increased levels of a protein marker in
serum due to leakage caused by local tissue destruction is prostate specific antigen (PSA) in
prostate cancer [44,58].
1.2.2.3 CA19.9 and Other Putative Pancreatic Cancer Markers
Several tumor markers with good sensitivity and specificity are currently in routine
clinical use for the detection of cancers at various sites, such as PSA for prostate cancer and hCG
for testicular cancer. Unfortunately, a marker of high diagnostic sensitivity and specificity is
lacking for pancreatic cancer. Currently, the most widely used clinical marker for pancreatic
cancer is CA19.9, a sialylated Lewis A antigen found on the surface of proteins [59,60]. CA19.9
has reported sensitivity values ranging from 70%-90% (median ~79%) and specificity values
ranging from 68%-91% (median ~82%) for diagnosis of pancreatic cancer [59]. While elevated
CA19.9 levels have been associated with the advanced stages of the disease, they have also been
associated with benign and inflammatory conditions such as obstructive jaundice and pancreatitis, as
well as with other malignancies of the gastrointestinal system [60-63]. For early-stage pancreatic
cancer detection, CA19.9 has a reported sensitivity of ~55% and it is often undetectable in many
asymptomatic individuals [59,43]. In addition, CA19.9 is associated with Lewis antigen status
and is absent in individuals with Lewis antigen negative blood group (~10% of the general
population) [64]. Taken together, CA19.9 lacks the necessary sensitivity and specificity for early
pancreatic cancer detection and is most widely used as a biomarker to monitor response to
treatment in patients who had elevated levels prior to treatment.
Other tumor markers such as members of the carcinoembryonic antigen (CEA) [65,66]
and mucin (MUC) [67] families have also been associated with pancreatic cancer. Similarly,
many other proteins have been described as putative candidates in the literature. When used in
combination, with or without CA19.9, some of these markers have shown enhanced sensitivity
and specificity; however none have been able to successfully displace or supplement CA19.9 in
the clinic.
1.3 Mass Spectrometry-Based Methods for Serum Biomarker Discovery
1.3.1 Principles of Mass Spectrometry
Mass spectrometry (MS) is a mainstay of proteomics and is crucial to the design of large-scale
proteomics-based discovery studies. MS enables the simultaneous
identification of many proteins in a biological sample [68]. When configured to monitor specific
peptides and peptide products, MS analysis also permits targeted quantification of analytes [69].
If samples are prepared through digestion of proteins using enzymes, followed by MS analysis
of the peptide products and identification of proteins through subsequent database searching, the
MS analysis is referred to as a ‘bottom-up’ or shotgun proteomic analysis [68]. In this approach,
the mass spectra generated are analyzed using computer-based
algorithms such as MASCOT, X!Tandem and SEQUEST and compared to databases containing
all known and predicted protein sequences to confer protein identifications [70,71]. A
database containing the amino acid sequences of all proteins in reverse is also commonly searched to
estimate the false positive rate of MS protein identifications, based on the number of
proteins/peptides that match the reverse database component.
Conversely, if intact proteins are analyzed without prior enzymatic digestion to
peptides, the approach is referred to as ‘top-down’ analysis [68]. While top-down analysis can
result in greater sequence coverage of proteins, including increased information on
post-translational modifications present in proteins, the fractionation of intact proteins prior to MS
analysis is more challenging than fractionation of peptide mixtures, and as a result, top-down
approaches are typically used for single proteins or simple protein mixtures [68]. In the present
study, a bottom-up proteomic approach was utilized for identification of proteins.
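The reverse-database estimate of false positives described above can be illustrated with a minimal sketch. The scores, the cut-off values and the ratio used (reverse hits divided by forward hits) are assumptions made for illustration only; the exact formula and thresholds used by any given search engine or study may differ.

```python
def estimate_false_positive_rate(matches, score_cutoff):
    """Estimate the false positive rate among matches scoring >= score_cutoff.

    Each match is a (score, is_reverse) pair, where is_reverse is True when the
    best hit came from the reversed-sequence part of the database.
    Assumed convention: rate = reverse hits / forward hits.
    """
    accepted = [is_reverse for score, is_reverse in matches if score >= score_cutoff]
    reverse_hits = sum(accepted)
    forward_hits = len(accepted) - reverse_hits
    if forward_hits == 0:
        raise ValueError("no forward-database matches pass the cut-off")
    return reverse_hits / forward_hits


# Hypothetical peptide-spectrum matches: (search score, matched reverse entry?)
example_matches = [
    (72.1, False), (65.4, False), (58.9, False), (51.0, True),
    (47.3, False), (44.8, False), (39.2, True), (35.5, False),
]

rate_at_40 = estimate_false_positive_rate(example_matches, 40.0)
rate_at_60 = estimate_false_positive_rate(example_matches, 60.0)
print(f"cut-off 40: {rate_at_40:.2f}; cut-off 60: {rate_at_60:.2f}")
```

Raising the score cut-off in this toy data set removes the single reverse hit that passed at the lower threshold, which is the usual trade-off between the number of identifications and their estimated error rate.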
Mass spectrometers are composed of three main components: an ionization source, a
mass analyzer and a detector. An important development that paved the way for mass
spectrometry analysis of proteins was the introduction of ionization methods through which
proteins and peptides could be stably transferred into the gas-phase. Two such ionization
techniques widely used in proteomic analyses are MALDI (matrix-assisted laser desorption
ionization) and ESI (electrospray ionization) [68,72]. In MALDI, small amounts of sample
(~1 µL) are applied onto a plate along with a light-absorbing crystalline substance (the matrix). The
solvent evaporates, leaving the matrix and the sample proteins co-crystallized.
Subsequent short (nanosecond) laser pulses cause vaporization of the protein
mixture and ionization of the protein molecules through energy transfer from the matrix [72].
Addition of proteins onto a coated resin that can select/enrich for proteins based on properties
such as hydrophobicity, charge, specific antibody affinity, etc. prior to ionization is referred to
as SELDI (surface-enhanced laser desorption ionization) [73].
ESI enables ionization of proteins/peptides directly from solution (the solvent typically
contains water and a volatile compound such as acetonitrile). The mass spectrometer is usually
coupled directly to a high-pressure liquid chromatography system, and in ESI, high voltages (~2-6 kV)
are applied at the interface where peptides elute from the chromatography system,
before they enter the inlet of the mass spectrometer [68]. The high voltage causes the
eluting solution to form a Taylor cone, from which a jet or spray of charged droplets is emitted, resulting in the eventual
formation of gaseous peptide ions [74]. ESI was used in the present study.
Once peptides are ionized, electric fields are typically used to guide them to the mass analyzers,
which function to store and separate ions based on their mass-to-charge (m/z) ratio. Two main
categories of mass analyzers include scanning and ion-beam analyzers such as those found in
time of flight (TOF) mass spectrometers [72], and analyzers that trap ions such as the linear ion
trap (LTQ) and LTQ-Orbitrap instruments [75]. Trapping instruments were utilized in this study,
specifically the LTQ-Orbitrap (Thermo). For instance, in the Orbitrap analyzer, ions are trapped
in electrostatic fields where they orbit and oscillate around a central electrode. The frequency of
their oscillation can be related back to the m/z of the ion [68,75,76]. Important considerations for
mass analyzers are resolution (the ability to distinguish between two adjacent peaks), mass
accuracy, mass range (the range of m/z values that an instrument is capable of
analyzing) and scan rate. Two mass analyzers can be coupled in tandem,
whereby ions analyzed in the first mass analyzer can be further fragmented through collision
with a neutral gas (collision induced dissociation) and analyzed in a second mass analyzer
[68,77]. Such tandem MS analysis can occur simultaneously as is the case in the LTQ-Orbitrap
instrument used in this study, where the Orbitrap analyzer performs the first scan, while the
LTQ carries out fragmentation and the second scan in parallel. Coupling two analyzers in
this way combines the advantages of both: the speed and sensitivity of the LTQ
component and the increased mass accuracy and resolution of the Orbitrap analyzer [68,77].
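The quantities named above (m/z, mass accuracy and resolution) can be made concrete with a short sketch. It uses the standard relationships for positive-mode ions: m/z = (M + z x proton mass) / z, mass error in parts per million, and resolving power as m/z divided by peak width at half maximum. The neutral mass and peak values are hypothetical and do not come from the present study.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton (rounded)


def mz_from_neutral_mass(neutral_mass, charge):
    """m/z of a peptide carrying `charge` protons (positive-mode ESI)."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def ppm_error(observed_mz, theoretical_mz):
    """Mass accuracy expressed in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def resolving_power(mz, peak_width_fwhm):
    """Resolution defined as m/z divided by peak width at half maximum."""
    return mz / peak_width_fwhm


# Hypothetical peptide with a neutral monoisotopic mass of 1500.0 Da
for charge in (1, 2, 3):
    print(charge, round(mz_from_neutral_mass(1500.0, charge), 4))

print(round(ppm_error(1000.005, 1000.0), 2))   # an error of 0.005 at m/z 1000
print(round(resolving_power(400.0, 0.004)))    # a 0.004-wide peak at m/z 400
```

The same peptide appears at a different m/z for each charge state, which is why search algorithms must consider charge when matching observed peaks to candidate sequences.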
The final component of a mass spectrometer is the detector, which records the charge or
current produced by the ions, resulting in the generation of mass spectra. As described above, the
spectra are then searched using computer algorithms against databases containing sequences of
all known and predicted proteins to confer protein identifications (in typical bottom-up
approaches).
1.3.2 Proteomics Discovery Pipeline – Discovery, Verification and Validation
A standardized protocol for the discovery of biomarkers through mass spectrometry-based
proteomic approaches does not exist in the field; however there are several theoretical three- to
five-phase models described in the literature which can serve as useful platforms for the identification
of novel protein biomarkers [78,79]. The primary phases of such models are discovery,
verification and validation. In the discovery phase, mass spectrometry analysis is undertaken for
the identification of proteins in various biological sources. Due to the complex nature of serum,
as described in greater detail in the sections below, samples used in this phase are predominantly
tissues, cell lines and biofluids [44,79]. The discovery phase typically results in the
identification of thousands of proteins and may involve semi-quantitative or relative
quantification analysis between cancer and non-cancerous samples. Next, the identified proteins
are mined through application of bioinformatics to generate a manageable list of putative
candidates (~50-100) for testing. The bioinformatics criteria used to filter candidates are usually
arbitrary and defined by each individual study group [80]. Common criteria include
differential expression of proteins, Gene Ontology analysis, pathway analysis,
protein tissue specificity based on publicly available databases, and comparisons to the
literature and to mRNA, miRNA and DNA database sources. Subsequently, the generated candidates
are verified in a moderate number of serum samples to preliminarily assess the efficacy of the
candidates to discriminate between cases and controls [78-80]. Verification and validation in the
final clinically used biological source (i.e. in this case serum) is pertinent given that other
biological sources are used in the high throughput discovery-phase studies for initial
identification of candidates and the somewhat arbitrary nature of candidate prioritization. Many
candidates will be rejected during verification phases; however a small handful of proteins (~5-10)
will likely emerge as promising candidates that can distinguish cancer from controls,
warranting their further validation in larger sample sets (several hundred to thousands of samples per
study group) [78,79]. A schematic of this pipeline is presented in Figure 1.1.
Whereas the technology used in the discovery phase is mass spectrometry, the ‘gold
standard’ methods for verification and validation are immunoassays used to measure the concentration of
specific proteins, particularly enzyme-linked immunosorbent assays (ELISAs) due to their high
sensitivity and specificity for targeted protein quantification [79]. A challenge or bottleneck in
the described pipeline is the inability to verify all generated candidates due to a lack of
commercially available ELISAs for the majority of proteins, coupled with the high costs of
producing ELISAs for proteins that currently lack reagents (~$50,000 - $100,000 for a research
grade ELISA and higher for a clinical grade ELISA) [69].
Figure 1.1 Biomarker Discovery Pipeline. Depicted are the discovery, verification and validation phases of proteomics-based
biomarker discovery. The phases are described in the “input” and “output” text. The shape of the pipeline depicts a bottleneck to
portray the existing inability to verify/validate all candidates due to a lack of commercially available assays/reagents for verification.
*, candidate selection through bioinformatics; garbage cans depict candidates that are rejected at each phase due to poor
discriminatory ability.
Mass spectrometry-based targeted protein quantification methods such as multiple
reaction monitoring (MRM) are emerging to fill this gap. At present, MRM-based approaches
are limited by their sensitivity for analysis of low abundance serum proteins (without prior
enrichment of proteins, MRM can detect serum proteins only down to the mg/L range in direct serum
digests); however with advancements in the sensitivity of mass spectrometers, MRM will likely
emerge in the near future as a means to alleviate this bottleneck [69,81].
During the verification and validation phases, various statistical criteria are examined such as
sensitivity, specificity and receiver operating characteristic (ROC) curve analysis [44].
Sensitivity: The proportion of true positives identified by a test (i.e. the proportion of
individuals with the disease who correctly test positive) [44]
Specificity: The proportion of true negatives identified by a test (i.e. the proportion of
individuals without the disease (controls) who correctly test negative) [44]
Receiver operating characteristic (ROC) curve: ROC curves graphically represent the
true positive rate (sensitivity) versus false positive rate (1-specificity). ROC curves
enable determination of the effectiveness of a biomarker at various cut-off points for
sensitivity or specificity. An ideal marker would be one in which the area under the
ROC curve is maximum (i.e. AUC = 1.0; the test can correctly classify all true positives
as such and all true negatives as such). An advantage of ROC curve analysis is the
ability to evaluate multiple biomarkers or candidates on the same plot, as well as to
assess biomarkers in combination (panels of biomarkers) [44,82].
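The three criteria defined above can be computed directly from marker measurements, as in the following minimal sketch. The marker values and the cut-off are hypothetical, and the AUC is calculated here through its rank interpretation (the probability that a randomly chosen case has a higher value than a randomly chosen control, with ties counted as one half), which is one of several equivalent ways to obtain it.

```python
def sensitivity_specificity(case_values, control_values, cutoff):
    """Call a sample positive when its marker value is >= cutoff."""
    true_positives = sum(v >= cutoff for v in case_values)
    true_negatives = sum(v < cutoff for v in control_values)
    return true_positives / len(case_values), true_negatives / len(control_values)


def roc_auc(case_values, control_values):
    """Area under the ROC curve via pairwise case-versus-control comparison."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical serum marker concentrations (arbitrary units)
cases = [12, 45, 60, 88, 150]
controls = [5, 9, 14, 20, 70]

sens, spec = sensitivity_specificity(cases, controls, cutoff=37)
auc = roc_auc(cases, controls)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.2f}")
```

Moving the cut-off trades sensitivity against specificity, whereas the AUC summarizes performance across all possible cut-offs, which is why it is convenient for comparing single markers with panels.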
1.3.3 Pancreatic Cancer Serum Proteomics
The majority of pancreatic cancer proteomic-based discovery studies have focused on
serum [83-90] or tissue proteomics [91-95], with an increasing number of proximal biological
fluid studies [96-104] over the past several years. Serum proteomics studies for pancreatic
cancer have utilized SELDI approaches to generate characteristic peak patterns or pancreatic
cancer ‘signatures’. For instance, the use of four mass to charge (m/z) values in combination
with CA19.9 was shown recently to improve the diagnostic accuracy of CA19.9 (improved area
under the curve from 0.883 to 0.935) [83]. Several years earlier, the use of four different peaks
with CA19.9 was shown to detect 29 of 29 pancreatic cancer patients accurately [105]. A
limitation of such approaches is the difficulty in translating cancer signatures into the clinical
laboratory setting. In addition, unlike the method of MS described in the “Principles of Mass
Spectrometry” section above, certain SELDI-based approaches are limited by the inability to
identify corresponding proteins for the characteristic mass to charge (m/z) patterns or signatures
identified. Other approaches to pancreatic cancer serum proteomics have been MALDI-based
[65,83,84,88], as well as a proteomic approach analyzing murine serum in a progression model
of pancreatic cancer followed by verification of candidates in human serum [106].
Generally, while analysis of serum seems a practical choice when mining for
serological biomarkers, MS-based serum proteomics is hindered by the large dynamic range of
protein concentrations in serum (approximately 10^11-fold, i.e. 11 orders of magnitude) compared with the analytical
range of ~10^5 of mass spectrometers. In addition, serum contains approximately twenty-two
high abundance proteins of little diagnostic value, which comprise ~99% of the total
protein mass of serum [44]. The remaining 1% contains potentially thousands of proteins of
interest for biomarker studies. These proteins are primarily in the ng/L to µg/L range and are
difficult to examine through mass spectrometry due to the masking effects posed by proteins of
high abundance. Furthermore, serum is largely heterogeneous and its inherent features can vary
from individual to individual based on hormonal status, age, sex, diet etc. [107]. While great
strides have been made in serum proteomics-based discovery studies, due to the above stated
reasons, at present, analysis of serum is likely best left for candidate verification stages.
1.3.4 Tissue Proteomics
Tissue is another source that is often analyzed in discovery studies. Tissue proteomics
offers the ability to analyze cancerous versus normal adjacent tissue for the detection of changes
between the two [91,108]. While this is an appealing prospect for the discovery of tissue-based
biomarkers, not many candidates discovered through tissue proteomics have been verified in
serum. Additionally, almost all of the cancer biomarkers currently in use are proteins that are
secreted or shed from tumor cells and proteomic analysis of whole tissue fails to enrich for
secreted and/or shed components. Pancreatic cancer, in particular, is characterized by a high
stromal reaction and in some instances, tumor sections may contain more stroma than cancerous
cells [109]. While these stromal cells are a part of the pancreatic cancer tumor
microenvironment and may contribute to the production of biomarkers, analysis of tissue lysates
is not optimal for the identification of serological biomarkers. Instead, proximal biological fluids,
which bathe tumor cells and into which tumor cells and their microenvironment contribute their
secretions, are likely a more valuable biological source for discovery studies.
1.3.5 Proteomics of Proximal Biological Fluid and Cell Line Conditioned Media
The concentration of many biomarkers is expected to be in the ng/L to µg/L range in
circulation; however closer to the tumor, their concentration is greater. For instance, in a study
of CA125 (carbohydrate antigen 125), a widely used ovarian cancer
biomarker, in serum, ascites and cystic fluid of ovarian cancer patients (the latter two
being more proximal to the tumor site), median CA125 levels were found to
be 696 U/mL, 18,563 U/mL and 44,850 U/mL respectively [110]. As a result, many groups have
taken to proteomic analysis of biological fluids more proximal to the tumor, as they represent
sources more enriched in potential biomarkers.
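The degree of enrichment implied by the CA125 medians quoted above [110] can be worked out with simple arithmetic, as sketched below. Only the three median values come from the text; the variable names are illustrative.

```python
# Median CA125 levels (U/mL) reported for ovarian cancer patients [110]
median_ca125 = {"serum": 696, "ascites": 18563, "cystic fluid": 44850}

# Fold enrichment of each fluid relative to serum
fold_over_serum = {
    fluid: level / median_ca125["serum"] for fluid, level in median_ca125.items()
}

for fluid, fold in fold_over_serum.items():
    print(f"{fluid}: {fold:.1f}-fold relative to serum")
```

On these figures, the median CA125 level is roughly 27-fold higher in ascites and roughly 64-fold higher in cystic fluid than in serum.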
In terms of pancreatic cancer, several recent studies detailing proteomic analysis of
pancreatic juice and pancreatic cystic fluid for the discovery of novel biomarkers have been
published [98-104]. Pancreatic juice is an alkaline fluid secreted by the exocrine cells of the
pancreas into the duodenum. It contains a large number of inactive enzymes which become
activated once in the intestine to aid in digestion. In pancreatic cancer, it is likely that
pancreatic juice will also contain the secretions of tumor cells. Protein numbers ranging from 22
to 170 have been identified in six pancreatic juice studies using a variety of different MS-based
approaches [98-103]. Of these, two studies performed verification in serum samples using
ELISAs of the proteins hepatocarcinoma-intestine-pancreas/pancreatitis-associated-protein
(HIP/PAP-I) [103] and matrix metalloproteinase-9 (MMP-9) [102]. Ascites is another fluid that
has recently been shown to be a good medium to mine for biomarkers [111,112]. Ascites fluid acts as
a local microenvironment containing secretions from cancer cells and other malignant processes;
however the proteome of pancreatic cancer-derived ascites has, to our knowledge, not yet been
profiled.
The use of cell culture supernatants or conditioned media (CM) for biomarker
discovery is another approach that has been gaining popularity in recent years [80]. Although
significant differences have been noted in the literature between cell lines and primary tumors,
the genomic and transcriptional characteristics of cell lines, as well as their biological heterogeneity, have, in
general, been shown to recapitulate those of primary tumors [113-116]. Comparisons of features
such as morphology, aneuploidy, and expression of important genes such as K-ras and p53 have
also shown good concordance between cancer cell lines and primary tumors for other cancer
sites [115,116] and for pancreatic cancer as well [113,114]. Additionally, the identification of
known biomarkers, such as PSA in prostate cancer cell lines and CA125 in ovarian cancer cell
lines, makes cell lines a viable source to mine.
1.3.6 Integrated Strategies
One way to improve current strategies for biomarker discovery and produce
markers with clinical utility may be to incorporate and integrate multiple biological fluids. Most
studies to date have utilized only serum/plasma, tissue or a proximal biological fluid for their
respective analyses. However, given that cancer is a highly heterogeneous disease, integration
and comparison of proteomes from multiple sources may yield ‘stronger’ or more promising
candidates for verification. For instance, in a recent study by our laboratory which compared the
proteins/genes identified in six publications chosen arbitrarily to represent various biological
sources and both proteomic and genomic data pertaining to ovarian cancer (2 cell line CM
studies, 2 ascites studies, 1 tissue proteomics study and 1 microarray study), no proteins were found
common to all 6; however two proteins were found common to four of the studies [117]. The
proteins identified were WAP four-disulfide core domain protein 2 precursor (HE4) and GRN
(granulin). Both have been implicated in ovarian cancer and HE4 is a known ovarian cancer
biomarker. In this regard, combining information from multiple biological sources may
yield stronger candidates for verification.
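The kind of cross-source comparison described above can be sketched as a count of how many sources each protein appears in. The source names and protein identifiers below are hypothetical placeholders and do not reproduce the data of the cited study [117].

```python
from collections import Counter


def count_source_support(protein_sets):
    """Count, for each protein, the number of sources in which it was identified."""
    counts = Counter()
    for proteins in protein_sets.values():
        counts.update(set(proteins))
    return counts


def proteins_in_at_least(protein_sets, min_sources):
    """Proteins identified in at least `min_sources` of the supplied sources."""
    counts = count_source_support(protein_sets)
    return sorted(p for p, n in counts.items() if n >= min_sources)


# Hypothetical protein lists from three biological sources
sources = {
    "cell_line_cm": {"PROT_A", "PROT_B", "PROT_C"},
    "proximal_fluid": {"PROT_A", "PROT_C", "PROT_D"},
    "tissue": {"PROT_A", "PROT_E"},
}

shared_by_two = proteins_in_at_least(sources, 2)
shared_by_all = proteins_in_at_least(sources, len(sources))
print(shared_by_two, shared_by_all)
```

Requiring presence in every source is often too strict, as the ovarian cancer example shows, so a threshold below the total number of sources may be more practical when prioritizing candidates.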
1.4 Rationale, Hypothesis, Objectives
1.4.1 Rationale
Pancreatic cancer is one of the most lethal of all solid malignancies. Non-invasive,
highly specific and sensitive methods that shift diagnosis to the early stages
of tumor development could improve patient survival and enable optimal care for
patients. Deregulated molecular pathways and physiological processes are a hallmark of
tumorigenesis, and many molecular and pathophysiological changes in cells and the tumor
microenvironment can ultimately culminate in the aberrant expression of protein products
[118,119]. The identification of such proteins, especially those which when measured in serum
reveal clinically useful information about the disease state of individuals, can greatly aid in the
detection and clinical management of patients with pancreatic cancer.
Serum is a highly complex fluid that is believed to contain subsets of proteomes
representative of each tissue [54]. However, ~22 proteins of high abundance constitute ~99% of
the total protein mass in serum. It is in the remaining ~1%, which contains thousands of
proteins, that potentially useful information regarding the presence and growth of tumors is
believed to lie. Given the limitations of MS-based serum analysis due to interference from
proteins of high abundance, many researchers have turned to analysis of fluids in closer
proximity to tumor cells [117]. Protein-based biomarkers are believed to be proteins that are
secreted, shed, cleaved or leaked from tumor tissues and their local microenvironment, and in
this regard, proximal biological fluids represent enriched sources to mine for potential
biomarkers prior to proteins entering the circulation and becoming vastly diluted [44,117].
Conditioned media from cells grown in serum-free
media lack the high abundance serum proteins that can interfere with MS analysis
[80]; however a possible limitation of tissue culture-based work is the inability
to adequately capture salient aspects of the tumor microenvironment and protein biomarkers that
may be produced as a result of aberrant interactions at the tumor-host interface [117]. Biological
fluids from patients with pancreatic cancer, such as pancreatic juice, on the other hand,
may contain high abundance serum proteins (due to the method of extraction during
surgery), but likely also contain pertinent contributions of tumor cells and their surrounding
microenvironment. Previous studies have focused on one biological source for analysis;
however integration of multiple biological sources (cell line conditioned media and pancreatic
juice) should enable us to better capture salient aspects relevant for biomarker discovery as the
advantages of one biological source may compensate for the shortcomings of another.
Mass spectrometry has also been in a constant state of evolution, becoming increasingly
sensitive and sophisticated. In parallel, software for data mining and other bioinformatic tools
have been gaining momentum [68,71,120]. Despite the inability of high-throughput methods
to provide an enhanced biomarker that can displace or supplement CA19.9 in the clinic for
pancreatic cancer to date, optimism remains that the experience gained in the field in the last
few years, and implementation of more integrated approaches to biomarker discovery, should
prove fruitful for pancreatic cancer biomarker discovery in the near future.
1.4.2 Hypothesis
Proteins which can serve as biomarkers become elevated in serum through secretion,
shedding, cleavage and leakage from tumor cells and their microenvironment. To this end, we
hypothesize that novel candidate biomarkers for pancreatic cancer can be identified through
extensive proteomic analysis of supernatants of human pancreatic cancer cell lines grown in
vitro, in conjunction with pancreatic juice collected from pancreatic cancer patients. Through
subsequent application of bioinformatics-based filtering criteria, which will include label-free
protein quantification between cancer and normal cell lines and integration of the multiple
biological fluids, followed by verification of candidates in serum, we hope to identify a small
handful of proteins that show promise as potential pancreatic cancer biomarkers, warranting
their further and extended validation.
1.4.3 Objectives
1. Perform mass spectrometry analysis of cell line conditioned media in triplicate from six
pancreatic cancer cell lines (MIA-PaCa2, BxPc3, PANC1, CAPAN1, CFPAC1 and
SU.86.86) and one normal pancreatic ductal epithelial cell line (HPDE) using two-dimensional
liquid chromatography tandem mass spectrometry (2D-LC-MS/MS).
2. Perform mass spectrometry analysis of pancreatic juice samples in triplicate using 2D-
LC-MS/MS.
3. Identify candidates for verification in serum/plasma using bioinformatics-based analysis
such as the following:
a. Label-free protein quantification comparing average normalized spectral counts
between triplicate analysis of cancer cell lines and the HPDE cell line to
determine differentially expressed proteins.
b. Gene Ontology analysis, focusing on extracellular and cell surface annotated
proteins.
c. Integrated analysis of cell lines with pancreatic juice, focusing on proteins
common to multiple biological fluids.
d. Tissue specificity analysis focusing on proteins specific to or highly expressed in
the pancreas based on publicly available databases.
e. Hierarchical clustering analysis.
4. Perform initial verification studies in serum/plasma from patients with pancreatic cancer
and healthy controls of similar age and sex (n=40) using ELISAs to preliminarily assess
the ability of candidates to discriminate between cancer and controls.
5. Perform further verification/validation of promising candidates in a larger number of
serum samples (n~200).
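Objective 3a, the comparison of average normalized spectral counts between triplicate analyses, can be sketched as follows. The normalization shown (scaling each run to the mean total spectral count across runs), the pseudocount and all counts and protein names are assumptions made for illustration; they are not necessarily the procedure or data used in this work.

```python
def normalize_spectral_counts(run_counts):
    """Scale each run so its total spectral count equals the mean total across runs.

    run_counts maps run name -> {protein: raw spectral count}.
    Total-count scaling is an assumed normalization, chosen for illustration.
    """
    totals = {run: sum(counts.values()) for run, counts in run_counts.items()}
    mean_total = sum(totals.values()) / len(totals)
    return {
        run: {protein: n * mean_total / totals[run] for protein, n in counts.items()}
        for run, counts in run_counts.items()
    }


def average_count(normalized, runs, protein):
    """Average normalized count of one protein over replicate runs (missing = 0)."""
    return sum(normalized[run].get(protein, 0.0) for run in runs) / len(runs)


def fold_change(normalized, cancer_runs, normal_runs, protein, pseudocount=1.0):
    """Cancer/normal ratio of average counts; the pseudocount avoids division by zero."""
    cancer = average_count(normalized, cancer_runs, protein)
    normal = average_count(normalized, normal_runs, protein)
    return (cancer + pseudocount) / (normal + pseudocount)


# Hypothetical triplicate data for one cancer cell line and the HPDE cell line
raw = {
    "cancer_1": {"PROT_X": 30, "PROT_Y": 70},
    "cancer_2": {"PROT_X": 40, "PROT_Y": 60},
    "cancer_3": {"PROT_X": 50, "PROT_Y": 50},
    "hpde_1": {"PROT_X": 10, "PROT_Y": 90},
    "hpde_2": {"PROT_X": 9, "PROT_Y": 91},
    "hpde_3": {"PROT_X": 11, "PROT_Y": 89},
}
normalized = normalize_spectral_counts(raw)
cancer_runs = ["cancer_1", "cancer_2", "cancer_3"]
hpde_runs = ["hpde_1", "hpde_2", "hpde_3"]
prot_x_fold = fold_change(normalized, cancer_runs, hpde_runs, "PROT_X")
print(f"PROT_X fold change (cancer vs HPDE): {prot_x_fold:.2f}")
```

Proteins whose fold change exceeds a chosen threshold would then pass to the remaining filters in objective 3; the threshold itself is a study-specific choice.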
CHAPTER 2
Mass Spectrometry Analysis of Cell Line Conditioned Media and
Pancreatic Juice for Identification of Candidate Pancreatic Cancer
Biomarkers
A modified version of the work presented in this chapter will be submitted to the journal
Molecular and Cellular Proteomics
2.1 Introduction
Pancreatic cancer is the 4th leading cause of cancer-related death and one of the most
highly aggressive and lethal of all solid malignancies [121]. Worldwide, over 200,000
individuals are diagnosed with pancreatic cancer each year, and due to the asymptomatic nature
of its early stages, coupled with inadequate methods for early detection, the majority of patients
(>75%) present with locally advanced and inoperable forms of the cancer at the time of
diagnosis [121]. At these advanced stages, available chemotherapy, radiation and combinatorial
therapies are largely anecdotal, and fewer than 5% of patients survive to five years
post-diagnosis [4,121].
One way to aid in the clinical management of cancer patients is through the use of serum
biomarkers. Biomarkers are measurable indicators of a biological state or condition, and in the
context of cancer, serum biomarkers present a non-invasive and relatively cost-effective means
to aid in detection, to monitor tumor progression and response to therapy, and to assess other measurable
outcomes of disease [44]. The most widely used biomarker in the clinic for pancreatic cancer is
CA19.9, a sialylated Lewis A antigen found on the surface of proteins [59,60]. While CA19.9 is
elevated in late stage disease, it is also elevated in benign and inflammatory diseases of the
pancreas and in other malignancies of the gastrointestinal tract [61,63]. As well, for early-stage
pancreatic cancer detection, CA19.9 has a reported sensitivity of ~55% and is often undetectable
in many asymptomatic individuals [43,59]. Other tumor markers such as members of the
carcinoembryonic antigen [65,66] and mucin [67,122] families have also been associated with
pancreatic cancer. When used in combination, with or without CA19.9, some of these markers
have shown enhanced sensitivity and specificity; however none have become a constant fixture
in the clinic. The lack of a single highly specific and sensitive marker has led to a growing
consensus in the field towards the development of multiparametric panels of biomarkers, whereby
the combinatorial assessment of multiple molecules can likely achieve increased sensitivity
and specificity for disease detection and management [51,123-125].
In the present study, we performed in-depth shotgun proteomic analyses, integrating and
comparing the proteomes of conditioned media from pancreatic cancer cell lines, as well as
pancreatic juice samples, for the identification of novel pancreatic cancer biomarkers. Protein-
based biomarkers that can be detected in circulation are typically proteins that are secreted, shed
or cleaved from tumor cells, or ones that may leak out due to local tissue destruction during
disease progression [44]. As such, biological fluids in close proximity to tumor cells likely serve
as enriched sources of potential biomarkers before they enter circulation and become vastly
diluted and potentially masked by proteins of high abundance [79,110,117,125]. With respect to
pancreatic cancer, proteomic analyses of biological fluids such as pancreatic juice, cyst fluid
and bile have been conducted [98-104,126]. Between 22 and 170 proteins have been
identified in six pancreatic juice studies using a variety of MS-based approaches [98-103],
over 460 proteins in a recent cyst fluid study [104], and 127 proteins
in the bile proteome of patients with bile duct stenosis [126]; however, subsequent verification of
candidate biomarkers in serum or plasma has been minimal.
Tissue culture supernatant, or conditioned media (CM), is another relevant fluid, the
utility of which for the identification of novel biomarkers has been demonstrated in multiple
cancer sites by our group [80,127-130] and others [131-133]. For pancreatic cancer, Gronborg
et al. analyzed differential protein secretion between the CM of a pancreatic cancer cell line and
that of a normal ductal epithelial cell line and identified 195 proteins, of which 145
showed >1.5 fold-change [96], and Mauri et al. identified 46 proteins from the supernatant of
a pancreatic cancer cell line (SUIT2) [97]. In a more recent study, Wu et al. performed
proteomic analysis of 23 cell lines from 11 cancer sites, of which two cell lines were of
pancreatic cancer origin [131]. This group took an interesting approach, utilizing the Human
Protein Atlas database and the absence of proteins in other cancer cell lines to delineate
candidate biomarkers for the various cancer sites. More recently, another group analyzed the
conditioned media of 5 pancreatic cell lines to identify deregulated pathways [134]. What is
lacking in the field is integrative analysis and mining of the proteomes from different biological
sources pertaining to a disease type for biomarker discovery. The utility of using an integrative
approach to biomarker discovery has been described recently [117,125]. Given that cancer is a
highly heterogeneous disease, through integration and comparison of proteomes from multiple
biological sample types, the advantages of one source may compensate for the shortcomings of
others, resulting in more relevant and stronger candidates for verification in plasma.
As such, in this study, we performed proteomic analysis of cell line conditioned media
and pancreatic juice. Seven cell lines and six pancreatic juice samples in two pools were
analyzed in triplicate for a total of 27 experiments using 2D-LC-MS/MS. Through label-free
protein quantification between the cancer and normal cell lines and integration of the pancreatic
juice proteome with that of the cell lines, candidate biomarkers were delineated for verification
in plasma. Within our list of candidates were numerous proteins known to be upregulated in
pancreatic cancer, and proteins previously studied in serum as pancreatic cancer biomarkers,
which helps to provide credence to our approach. Of the derived candidates, initial verification
in plasma samples from patients with established pancreatic cancer and controls identified five
proteins – AGR2, PIGR, OLFM4, SYCN and COL6A1 – which showed a significant increase in
plasma from pancreatic cancer patients in comparison to controls. This demonstrates the utility
of our approach to identify proteins elevated in serum of pancreatic cancer patients. Further
validation of these proteins in a larger number of plasma samples is warranted, as is the
investigation of the remaining group of candidates.
2.2 Materials and Methods
Cell Lines
Six pancreatic cancer cell lines (MIA-PaCa2 (CRL-1420), PANC1 (CRL-1469), BxPc3
(CRL-1687), CAPAN1 (HTB-79), CFPAC-1 (CRL-1918) and SU.86.86 (CRL-1837)) were
obtained from the American Type Culture Collection (ATCC, Manassas, VA). The cell lines
were derived from pancreatic ductal adenocarcinomas, which account for approximately 85-
90% of all pancreatic cancers. The cell lines originated from primary tumors of the head or body
of the pancreas (MIA-PaCa2, PANC1, BxPc3), or from metastatic sites (CAPAN1, CFPAC-1,
SU.86.86) [113,114]. The cell lines were derived from individuals of similar ethnic background
and age group (with the exception of CFPAC-1), and all of the cancer cell lines, except for
BxPc3, are positive for K-ras mutations, which are found in 85-90% of pancreatic cancers. An
HPV-transfected 'normal' human pancreatic ductal epithelial cell line (HPDE) [135], provided
by Dr. Ming-Sound Tsao at Princess Margaret Hospital, Toronto, Ontario, Canada, was also
analyzed. Apart from a slightly aberrant expression of p53, molecular profiling of this cell line
has shown that expression of other proto-oncogenes and tumour suppressor genes is normal
[135].
Cell culture media specified by ATCC for each of the six pancreatic cancer cell lines
were used and are as follows: DMEM (Catalog No. 30-2002 from ATCC) with 10% fetal bovine
serum (Catalog No. 10091-148; Invitrogen) was used for MIA-PaCa2 and PANC1; RPMI-1640
medium modified to contain 2mM L-glutamine, 10mM HEPES, 1mM sodium pyruvate, 4500
mg/L glucose, 1500 mg/L sodium bicarbonate (ATCC Catalog No. 30-2001) with 10% FBS was
used for SU.86.86 and BxPc3; IMDM (Catalog No. 30-2005) with 10% and 20% FBS was used
for the CFPAC-1 and CAPAN1 cell lines, respectively. The HPDE cell line was grown in
keratinocyte serum-free media (Catalog No. 17005-042; Invitrogen) supplemented with bovine
pituitary extract and recombinant epidermal growth factor. All cells were cultured in an
atmosphere of 5% CO2 in air in a humidified incubator at 37°C.
Cell Culture
An optimal seeding density and incubation period which supported maximal protein
secretion with minimal cell death was selected for each of the cell lines, as described previously
[127]. Cells were cultured in T-175 cm² flasks at determined optimal seeding densities of ~10 × 10⁶
for MIA-PaCa2, PANC1 and CAPAN1, 14 × 10⁶ for BxPc3, 3 × 10⁶ for HPDE, 13 × 10⁶ for
CFPAC1 and 4 × 10⁶ for SU.86.86 in three replicates per cell line. Cells were first cultured for
48 hours in 40mL of their respective growth media to obtain adherence to culture flasks. The
medium was then removed and the cells/flasks were subjected to two gentle washes with 30mL
of PBS (Invitrogen). Forty milliliters of chemically defined Chinese hamster ovary (CDCHO)
serum-free medium (Invitrogen) supplemented with 8mM glutamine (Invitrogen) was then
added and the cells were left to culture for determined optimal incubation periods of 72 hours
for Capan1, CFPAC1 and SU.86.86, 96 hours for BxPc3 and HPDE and 144 hours for MIA-
PaCa2. The CDCHO media that the cells were grown in were subsequently collected and
centrifuged at 1500 rpm for 10 minutes to remove cellular debris. Total protein concentration (as
determined through a Coomassie (Bradford) total protein assay, [136]) was measured in each of
the three replicates and a volume corresponding to 1mg of total protein from each of the
replicates was subjected to the sample preparation protocol below.
Pancreatic Juice
Pancreatic juice samples were provided by Dr. Felix Rueckert, Dresden, Germany.
Approximately 50-500µL of pancreatic juice was collected from the main pancreatic duct of
patients undergoing pancreatic surgery. Upon collection, the samples were stored at -80°C until
further use. Samples from patients with clinically confirmed cases of pancreatic ductal
adenocarcinoma that contained no visible signs of blood were selected for analysis. Six
pancreatic juice samples met these criteria. The samples were centrifuged at 16,000 rpm for 10
minutes at 4°C to remove tissue debris. Total protein concentration of each sample was
measured using the Biuret method [137]. Keeping in line with the cell line conditioned media
analysis, it was desirable to use a total protein amount of 1mg for analysis of each of the three
replicates per sample. As a result, two pools of pancreatic juice (pool A and B) were made,
containing three samples each, with total protein concentrations of 2.65 mg/mL and 2.32 mg/mL
for pool A and B, respectively. A volume corresponding to 1mg of total protein was retrieved
from each pool, in triplicate, and subjected to the standardized sample preparation protocol
below (with the exception of dialysis).
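The volume drawn from each pool follows directly from the reported concentrations. A minimal sketch of that arithmetic is given here; the function name is illustrative and not part of the original protocol.

```python
def volume_for_protein_mass(target_mg: float, concentration_mg_per_ml: float) -> float:
    """Return the volume (mL) that contains target_mg of total protein."""
    if concentration_mg_per_ml <= 0:
        raise ValueError("concentration must be positive")
    return target_mg / concentration_mg_per_ml


# Total protein concentrations of the two pancreatic juice pools (mg/mL), as reported above.
pools = {"A": 2.65, "B": 2.32}

for name, conc in pools.items():
    volume_ul = volume_for_protein_mass(1.0, conc) * 1000  # mL -> µL
    print(f"Pool {name}: about {volume_ul:.0f} µL per 1 mg replicate")
```

Under these figures, each 1 mg replicate corresponds to roughly 377 µL of pool A and 431 µL of pool B.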
Sample Preparation
Samples were processed as described previously [127]. Briefly, samples were dialyzed
using a 3.5kDa molecular weight cut-off membrane (Spectrum Laboratories, Inc., Compton,
CA) in 5L of 1 mM NH4HCO3 buffer solution at 4°C overnight and subsequently frozen and
lyophilized to dryness to concentrate proteins using a ModulyoD Freeze Dryer (Thermo
Electron Corporation). Proteins in each lyophilized replicate were denatured using 8M urea and
reduced with the addition of 200mM dithiothreitol (final concentration of 13mM) in 1M
NH4HCO3 at 50°C for 30 minutes. Samples were then alkylated with the addition of 500mM
iodoacetamide and incubated in the dark, at room temperature, for 1 hour. Each replicate was
then desalted using a NAP5 column (GE Healthcare), frozen and lyophilized. Lastly, samples
were trypsin-digested (Promega, sequencing grade modified porcine trypsin) through an
overnight incubation at 37°C using a ratio of 1:50 trypsin to protein concentration. Tryptic
peptides were frozen in solution at -80°C to inhibit trypsin function and lyophilized.
Strong Cation Exchange (SCX) on a High Pressure Liquid Chromatography (HPLC) System
The tryptic peptides were resuspended in 510µL of mobile phase A (0.26 M formic acid
in 10% acetonitrile; pH 2-3) and loaded directly onto a 500µL loop connected to a
PolySULFOETHYL A™ column (The Nest Group, Inc.). The column has a silica-based
hydrophilic, anionic polymer (poly-2-sulfoethyl aspartamide) with a pore size of 200 Å and a
diameter of 5 µm. SCX chromatography and fractionation were performed on an HPLC
system (Agilent 1100) using a 1-hour procedure with a linear gradient of mobile phase A. For
elution of peptides, an elution buffer which contained all components of mobile phase A with
the addition of 1 M ammonium formate was introduced at 20 min in the 60 min method. The
eluent was monitored at a wavelength of 280 nm and fractions were collected every minute from
the 20 minute time point onwards. This resulted in the collection of 40 one-minute fractions.
Collected fractions were left unpooled or subsequently combined into 2, 3 or 5 min pools,
according to the elution profile of the resulting SCX chromatogram. As a general strategy,
where the absorbance reading of the elution profile was greater (typically the first 10-15 min of
elution), fractions were left unpooled or pooled every two minutes to keep sample complexity at
a minimum. Where the absorbance readings were lower (towards the end of the method),
fractions were pooled in 3 or 5 min pools. The same pooling method was utilized for all three
replicates of the CM from each cell line and for the pancreatic juice pools.
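The pooling strategy above can be illustrated with a short sketch. The absorbance thresholds, the rule of sizing each pool from its first fraction, and the example elution profile are all assumptions made for illustration; the text states only that pooling followed the SCX elution profile and does not give numerical cut-offs.

```python
def plan_pools(absorbance, start_minute=20, high=0.8, mid=0.5, low=0.2):
    """Group consecutive one-minute fractions into pools of 1, 2, 3 or 5 fractions.

    absorbance: one reading per one-minute fraction, in elution order.
    The thresholds (high, mid, low) are hypothetical; the pool size is chosen
    from the reading of the first fraction in each pool.
    Returns a list of pools, each a list of collection minutes.
    """
    pools, i = [], 0
    while i < len(absorbance):
        reading = absorbance[i]
        if reading >= high:
            size = 1          # strongest signal: leave unpooled
        elif reading >= mid:
            size = 2
        elif reading >= low:
            size = 3
        else:
            size = 5          # weakest signal: largest pools
        end = min(i + size, len(absorbance))
        pools.append([start_minute + j for j in range(i, end)])
        i = end
    return pools


# Hypothetical elution profile for the 40 one-minute fractions (minutes 20-59).
profile = [0.9] * 5 + [0.6] * 10 + [0.3] * 9 + [0.05] * 16
pools = plan_pools(profile)
print(len(pools), "pools; first:", pools[0], "last:", pools[-1])
```

Applying one fixed plan to every replicate, as in this sketch, mirrors the statement that the same pooling method was used for all replicates.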
Mass spectrometry (LC-MS/MS)
The SCX fractions/pools were purified through OMIX Pipette Tips C18 (Varian Inc.) to
further remove impurities and salts and eluted in 4µL of 70% MS Buffer B (90% ACN, 0.1%
formic acid, 10% water, 0.02% TFA) and 30% MS Buffer A (95% water, 0.1% formic acid, 5%
ACN, 0.02% TFA). Eighty microlitres of MS Buffer A was added to the eluent, and 40µL of
sample was loaded onto a 3 cm C18 trap column (with an inner diameter of 150 µm; New
Objective), packed in-house with 5 µm Pursuit C18 (Varian Inc.). A 96-well microplate
autosampler was utilized for sample loading. Eluted peptides from the trap column were
subsequently loaded onto a resolving analytical PicoTip Emitter column, 5cm in length (with an
inner diameter of 75 µm and 8 µm tip, New Objective) and packed in-house with 3 µm Pursuit
C18 (Varian Inc.). The trap and analytical columns were operated on the EASY-nLC system
(Proxeon Biosystems, Odense, Denmark), and this liquid chromatography setup was coupled
online to an LTQ-Orbitrap XL hybrid mass spectrometer (Thermo Fisher Scientific, San Jose,
California) using a nano-ESI source (Proxeon Biosystems, Odense, Denmark). Samples were
analyzed using a gradient of either 54 or 90 minutes (for 5 min pools, a 90 minute gradient was
used, and for 2 min, 3 min and non-pooled samples, a 54 minute gradient was used). Samples
were analyzed in data-dependent mode: full MS1 scan acquisition from 450-1450 m/z
occurred in the Orbitrap mass analyzer (resolution 60,000), while MS2 scan acquisition of the top six
parent ions occurred in the linear ion trap (LTQ) mass analyzer. The following parameters were
enabled: monoisotopic precursor selection, charge state screening and dynamic exclusion. In
addition, charge states of +1, >4 and unassigned charge states were not subjected to MS2
fragmentation.
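The precursor selection rules can be pictured as a simple filter. The sketch below is a simplified illustration only: it assumes that "top six" means the six most intense eligible ions in a scan, it does not model dynamic exclusion or monoisotopic precursor selection, and the `Precursor` type is hypothetical rather than part of the instrument software.

```python
from typing import NamedTuple, Optional


class Precursor(NamedTuple):
    mz: float
    intensity: float
    charge: Optional[int]  # None represents an unassigned charge state


def select_for_ms2(precursors, top_n=6):
    """Return up to top_n precursors eligible for MS2, most intense first.

    Ions with charge +1, charge >4, or an unassigned charge are rejected.
    """
    eligible = [p for p in precursors if p.charge is not None and 2 <= p.charge <= 4]
    return sorted(eligible, key=lambda p: p.intensity, reverse=True)[:top_n]


# Hypothetical MS1 scan.
scan = [
    Precursor(500.3, 1e5, 2),
    Precursor(620.8, 9e5, 1),     # rejected: singly charged
    Precursor(710.4, 5e5, None),  # rejected: unassigned charge
    Precursor(455.2, 3e5, 3),
    Precursor(900.1, 8e5, 5),     # rejected: charge > 4
    Precursor(830.6, 2e5, 4),
]
for p in select_for_ms2(scan):
    print(p)
```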
Protein Identification
XCalibur software was utilized to generate RAW files of each MS run. The RAW files
were subsequently used to generate Mascot Generic Files (MGF) through extract_msn on
Mascot Daemon (version 2.2). Once generated, MGFs were searched with two search engines,
Mascot (Matrix Science, London, UK; version 2.2) and X!Tandem (Global Proteome Machine
Manager; version 2006.06.01), to confer protein identifications. Searches were conducted
against the non-redundant Human IPI database (v.3.62) which contains a total of 167,894
forward and reverse protein sequences and using the following parameters: fully tryptic
cleavages, 7ppm precursor ion mass tolerance, 0.4Da fragment ion mass tolerance, allowance of
one missed cleavage, fixed modifications of carbamidomethylation of cysteines, and variable
modification of oxidation of methionines. The files generated from Mascot (DAT files) and
X!Tandem (XML files) for the three replicates of each biological source were then integrated
through Scaffold 2 software (version 2.06; Proteome Software Inc., Portland, Oregon) resulting
in a non-redundant list of identified proteins per sample. Results were filtered using the
X!Tandem LogE filter and Mascot ion-score filters on Scaffold to achieve a protein false
discovery rate (FDR) <1.0%.
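To make the 7 ppm precursor tolerance concrete, the following sketch converts a mass difference to parts per million and checks it against the tolerance. The m/z values are invented for illustration, and the helper names are not taken from Mascot or X!Tandem.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float, tol_ppm: float = 7.0) -> bool:
    """True if the observed precursor lies within +/- tol_ppm of the theoretical value."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Hypothetical precursors compared with a theoretical m/z of 800.0000.
print(within_tolerance(800.0040, 800.0))  # about 5 ppm
print(within_tolerance(800.0080, 800.0))  # about 10 ppm
```

At m/z 1000, a 7 ppm window corresponds to about ±0.007, which is far narrower than the 0.4 Da fragment ion tolerance used for the ion trap MS2 spectra.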
Data Analysis
Scaffold prot-XML reports were generated and uploaded onto Protein Center (Proxeon)
to facilitate comparisons between cell line CM and pancreatic juice proteomes, and to obtain
gene ontology information. Cellular localization, function and process annotations were
extracted by Protein Center from the Gene Ontology (GO) Consortium
(http://www.geneontology.org/GO.tools.shtml). Due to the large number of different GO
annotations per localization, function and process, Protein Center reduces terms to
approximately 20 high-level terms that are used for filtering. Details can be found at
http://tgh.proteincenter.proxeon.com/ProXweb/Help/Manual/apd.html. A Microsoft Excel
Macro developed in-house by Dr. Irv Bromberg, Mount Sinai Hospital, was also utilized for
comparison of protein lists based on accession number or gene name. Hierarchical clustering
analysis of proteomic data was performed using PermutMatrix, available freely online at
http://www.lirmm.fr/~caraux/PermutMatrix/EN/index.html. PermutMatrix is software
originally developed for gene expression analysis [138]. More recently it has been utilized and
validated for proteomics [139]. For clustering analysis, average emPAI values from the triplicate
analysis of the samples were exported from Protein Center into a space delimited Microsoft
Excel file. For visualization, comparison and data analysis purposes, cell line or pancreatic juice
samples with missing emPAI values for a particular protein were assigned half the minimum
emPAI value for that protein in the data set. The emPAI values were imported into
PermutMatrix and transformed to Z score values for normalization. Two-way hierarchical
clustering analysis was performed using the Pearson and Ward's minimum variance methods for
distance and aggregation, respectively. Resultant dendrograms with cell lines and pancreatic
juice samples on the x-axis and gene name on the y-axis were exported.
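The pre-processing steps that precede clustering can be sketched as follows. This is an illustration rather than the PermutMatrix implementation: it assumes that Z-scores are computed per protein across samples using the population standard deviation, and that the Pearson distance is 1 minus the correlation coefficient. Ward's aggregation is not reproduced here, and the emPAI values are invented.

```python
from math import sqrt
from statistics import mean, pstdev


def impute_half_min(values):
    """Replace missing values (None) with half the smallest observed value for that protein."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("protein has no observed emPAI values")
    fill = min(observed) / 2
    return [fill if v is None else v for v in values]


def z_scores(values):
    """Z-score transform; returns zeros when all values are identical."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def pearson_distance(x, y):
    """1 - Pearson correlation (assumes neither profile is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1 - cov / (sx * sy)


# Hypothetical average emPAI values for two proteins across four samples.
empai = {"PROT_A": [0.8, None, 0.4, 1.2], "PROT_B": [0.2, 0.1, None, 0.3]}
normalized = {name: z_scores(impute_half_min(vals)) for name, vals in empai.items()}
print(pearson_distance(normalized["PROT_A"], normalized["PROT_B"]))
```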
Label-free Protein Quantification
Semi-quantitative analysis was conducted between the cancer cell lines and the HPDE
normal pancreatic ductal epithelial cell line to ascertain proteins over or under-expressed in the
cancer cell lines based on spectral counting. The 'Quantitative Value' function of Scaffold 2.06
software, which provides normalized spectral counts based on the total number of spectra
identified in each sample, was utilized. One file containing all of the normalized spectral counts
of each of the three replicates from the 7 cell lines was generated for proteins identified with 2
or more peptides. One-way ANOVA was conducted to determine proteins that show a
significant difference amongst the seven cell lines (p<0.05). For proteins that showed a p value
<0.05, the average spectral count for the three replicates was calculated and fold-change was
determined by dividing the average counts from each of the cancer cell lines by that of the
normal HPDE cell line and vice versa. Not all proteins were identified in all of the cell lines;
however all proteins had to have been identified by ten or more spectra in at least one biological
sample to be included. Proteins with ambiguous peptides were searched individually to ensure
normalization of spectral counts did not significantly alter values. Unidentified proteins or
missing values in a particular biological sample were assigned a normalized spectral count of 1
to keep from dividing by zero and to prevent overestimation of fold-changes.
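The fold-change calculation described above can be sketched as follows. Two interpretive assumptions are made and should be read as such: the ten-spectra filter is applied to the replicate average of each cell line, and the substitute count of 1 is applied when a protein's average count in a cell line is zero. The ANOVA step is not reproduced, and the spectral counts are invented.

```python
from statistics import mean

MISSING_COUNT = 1.0  # normalized spectral count assigned to unidentified proteins


def fold_changes(counts, reference="HPDE", min_spectra=10.0):
    """Fold-change of each cell line relative to the reference, for one protein.

    counts: {cell_line: [normalized spectral count per replicate]}.
    Returns None when no cell line reaches min_spectra on average.
    """
    averages = {line: (mean(reps) if reps else 0.0) for line, reps in counts.items()}
    if max(averages.values()) < min_spectra:
        return None
    # Substitute 1 for missing values so that ratios stay finite and conservative.
    floored = {line: (avg if avg > 0 else MISSING_COUNT) for line, avg in averages.items()}
    ref = floored[reference]
    return {line: avg / ref for line, avg in floored.items() if line != reference}


# Hypothetical counts for a protein absent from HPDE.
example = {"HPDE": [0, 0, 0], "PANC1": [30, 36, 24], "BxPc3": [3, 3, 3]}
print(fold_changes(example))
```

Taking the reciprocal of each ratio gives the "vice versa" comparison used to find proteins under-expressed in the cancer cell lines.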
Plasma Samples
Blood samples were collected from pancreatic cancer patients at the Princess Margaret
Hospital GI Clinic in Toronto, Canada, or from kits sent directly to consented patients recruited
from the Ontario Pancreas Cancer Study at Mount Sinai Hospital following a standardized
protocol (age range 55-86; median age 68; 10 female and 10 male). Samples were collected with
informed consent, and with the approval of the institutional ethics board. Samples from healthy
controls were obtained from the Familial Gastrointestinal Cancer Registry (FGICR). The
controls are non-blood relatives of patients in FGICR studies (age range 46-84; median age 60;
9 female and 11 male). Blood was collected in ACD (anticoagulant) vacutainer tubes and
plasma samples were processed within 24 hours of blood draw. To pellet the cells, blood
samples were centrifuged at room temperature for 10 minutes at 913 × g. Immediately after
centrifugation, the plasma samples were aliquoted into 250µL cryotubes and stored at -80°C or
in liquid nitrogen until further use.
ELISAs and Immunoassays
Enzyme-linked immunosorbent assays for AGR2, SYCN, OLFM4, COL6A1, PIGR,
PLAT, TFF2 and NUCB2 were purchased commercially and performed according to the
manufacturer's instructions. Six of the ELISA kits were purchased from USCN LifeSciences
(AGR2: Catalogue # E2285Hu, SYCN: Catalogue # E93879Hu, OLFM4: Catalogue #
E90162Hu, COL6A1: Catalogue # E92150Hu; PIGR: Catalogue # E91074Hu; TFF2: Catalogue
# E0748Hu). The PLAT (tPA) immunoassay was purchased from American Diagnostica Inc.
(Catalogue # 860) and the NUCB2 ELISA was purchased from Phoenix Pharmaceuticals
(Catalogue # EK-003-26). The ELECSYS CA 19-9 immunoassay by Roche was utilized to
measure CA19.9 levels in plasma, and kallikrein 6 and 10 internal control proteins were
measured in CM using in-house developed ELISA assays, as described previously [140,141].
Statistical Analysis
Mann-Whitney U-tests were applied to verification experiments in plasma to determine
if differences in the medians were significant between cancer and control groups using GraphPad
Prism 4 Software. The five candidates that showed a statistically significant difference
(p<0.05) were then assessed in combination in comparison to CA19.9 through ROC curve
analysis. The area under the curve (AUC) values were calculated using ROCR software and the
corresponding variances were calculated with a bootstrap method.
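As an illustration of the AUC and bootstrap steps, the sketch below computes the AUC as the proportion of case-control pairs in which the case has the higher value (equivalent to the Mann-Whitney U statistic divided by the number of pairs) and estimates its variance by resampling. It is a stand-in for the ROCR analysis, not a reproduction of it; the marker values are invented, and the combination of several markers into one score is not modelled.

```python
import random
from statistics import variance


def auc(cases, controls):
    """Probability that a random case scores higher than a random control (ties count 0.5)."""
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def bootstrap_auc_variance(cases, controls, n_boot=1000, seed=0):
    """Variance of the AUC over bootstrap resamples drawn within each group."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resampled_cases = [rng.choice(cases) for _ in cases]
        resampled_controls = [rng.choice(controls) for _ in controls]
        estimates.append(auc(resampled_cases, resampled_controls))
    return variance(estimates)


# Hypothetical plasma concentrations for one marker.
cancer = [5.1, 7.3, 2.2, 9.0, 6.4]
control = [1.0, 2.5, 3.1, 0.8, 4.0]
print(auc(cancer, control), bootstrap_auc_variance(cancer, control))
```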
2.3 Results
Increasing Protein Yield
Once the cell lines were grown and CM collected, the samples were subjected to a 2D-
LC-MS/MS analysis which combined SCX liquid chromatography on an HPLC system,
followed by LC-MS/MS. A schematic of the workflow is provided in Figure 2.1. Guided by our
previous experience [127-130], SCX fractions were initially collected at 5 minute intervals
during peptide elution, resulting in approximately 8 fractions that were analyzed using a ~2 hour
reverse-phase method on the LTQ-Orbitrap mass spectrometer. This resulted in the
identification of 1305, 1468, and 1749 proteins (≥1 peptide) in the triplicate analysis of the
BxPc3, HPDE6 and MIA-PaCa2 cell lines, respectively (Table 2.1). In some of the individual 5
min fractions analyzed (specifically the fractions that contained the highest absorbance readings
during SCX peptide elution) >700 proteins were identified per fraction (data not shown). Based
on previous experience, this was a very large number of proteins to have been identified in
individual fractions. Consequently, we opted to employ a different fraction collection and
pooling strategy. By collecting fractions every minute from SCX and pooling fractions based on
the intensity of peaks eluting on the SCX chromatogram (as described in the 'Materials and
Methods'; Fig 2.1), we identified 2017 proteins for the BxPc3 cell line, 2297 for HPDE, and
2756 for the MIA-PaCa2 cell line subjected to the same growth and sample processing
conditions, in triplicate. In order to ensure that this increase in protein yield was not due to
variation in cell growth/sample collection, an additional replicate using MIA-PaCa2 CM left
over from the initial analysis, which had been stored at -80°C, was also run and 2348 proteins
were identified (a 52-56% increase from the individual replicates of the first run of MIA-PaCa2)
(Table 2.1).
Figure 2.1 Schematic Outline of Proteomic Analysis. The top panel (a) details the two pooling methods used for pooling of SCX-
generated fractions. Through application of 1,2, 3 and 5 min pools (pooling method 2) an increase of ~50-60% in the number of
proteins identified through mass spectrometry was observed. The lower panel (b) details the methodology (sample preparation, pre-
fractionation, mass spectrometry and data analysis) followed in the cell line and pancreatic juice proteomic analyses. CM, conditioned
media; SCX, strong cation exchange; LC-MS/MS, liquid chromatography tandem mass spectrometry.
Table 2.1. Increasing the Number of Identified Proteins by Optimizing Cation Exchange Chromatography Fraction Pooling

| Cell line | Number of proteins identified with 5 min fractions^a (pooling method 1) | Number of proteins identified through combination of 1, 2, 3 and 5 min pools^a (pooling method 2) | Number (and %) of method 1 proteins also identified by method 2 | % increase in total proteins identified between methods 1 and 2 |
|---|---|---|---|---|
| BxPc3 | 1305 [777] | 2017 [1261] | 1171 (90%) [705 (91%)] | 54% [62%] |
| HPDE | 1468 [876] | 2297 [1474] | 1326 (90%) [802 (92%)] | 56% [68%] |
| MIA-PaCa2, Rep1 | 1242 [885] | 2502 [1837] | - | - |
| MIA-PaCa2, Rep2 | 1447 [929] | 2501 [1837] | - | - |
| MIA-PaCa2, Rep3 | 1450 [908] | 2424 [1823] | - | - |
| MIA-PaCa2, Rep4^b | - | 2348 [1615] | - | - |
| MIA-PaCa2, Total^c | 1755 [1096] | 2756 [1862] | 1598 (91%) [1030 (94%)] | 58% [70%] |

^a These numbers include proteins identified by one or more peptide and with false discovery rate <1.0%. Proteins identified with ≥2
peptides are in brackets.
^b Additional replicate using MIA-PaCa2 CM left over from the pooling strategy 1 analysis and then run using pooling strategy 2.
^c The indicated total excludes Rep4 values.
Over 90% of the proteins from the first analysis were re-identified and the new pooling
strategy resulted in approximately a 54-58% increase in protein yield across the cell lines. This
improved strategy was utilized for proteomic analysis of the remaining cell lines and the
pancreatic juice samples.
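The percentage increases can be recomputed from the protein totals quoted in the text for the two pooling methods (≥1 peptide). A minimal sketch follows, with an illustrative helper name.

```python
def percent_increase(before: int, after: int) -> float:
    """Relative increase from before to after, in percent."""
    return (after - before) / before * 100


# (pooling method 1, pooling method 2) protein totals as quoted in the text.
totals = {"BxPc3": (1305, 2017), "HPDE": (1468, 2297), "MIA-PaCa2": (1749, 2756)}

for cell_line, (method1, method2) in totals.items():
    print(f"{cell_line}: {percent_increase(method1, method2):.1f}% increase")
```

These values come to roughly 54.6%, 56.5% and 57.6%, in line with the approximate 54-58% range stated above.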
Protein Identification through LC-MS/MS
Six human pancreatic cancer cell lines, one 'near normal' human pancreatic ductal
epithelial cell line (HPDE) and six pancreatic juice samples from ductal adenocarcinoma
patients (in two pools) were analyzed in triplicate in this study (Figure 2.2). Using both
Mascot and X!Tandem search engines, between 2017 and 3250 proteins were identified in the
7 cell lines and 1014 and 956 proteins were identified from pool A and B of pancreatic juice,
respectively (Table 2.2). These numbers represent proteins identified in the three replicates
combined, with 1 or more peptides and with protein false discovery rates (FDR) of <1.0%. For
protein identifications, the human forward and reverse IPI 3.62 database, which contains 167,894
forward and reverse protein sequences, was used, and FDR was calculated as
[2 × FP/(TP + FP)] × 100, where FP (false positive) is the number of proteins that were identified
based on sequences in the reverse database component and TP (true positive) is the number of
proteins that were identified based on sequences in the forward database component [142-144].
For increased stringency and assurance of protein identification, only proteins identified with
two or more peptides were included in the remainder of the analysis, resulting in between 1261
and 2171 proteins for each of the cell lines and a total of 648 non-redundant proteins from the
pancreatic juice analysis. This data is summarized in Table 2.2.
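The FDR expression given above translates directly into code. In the sketch below the hit counts are hypothetical, chosen only to show a result under the 1.0% threshold; they are not values from this study.

```python
def protein_fdr_percent(forward_hits: int, reverse_hits: int) -> float:
    """FDR (%) = [2 x FP / (TP + FP)] x 100.

    forward_hits (TP): proteins matched to forward database sequences.
    reverse_hits (FP): proteins matched to reverse database sequences.
    """
    total = forward_hits + reverse_hits
    if total == 0:
        raise ValueError("no protein identifications")
    return 2 * reverse_hits / total * 100


# Hypothetical search result: 2000 forward and 7 reverse identifications.
fdr = protein_fdr_percent(2000, 7)
print(f"FDR = {fdr:.2f}%  (passes <1.0% filter: {fdr < 1.0})")
```

The factor of 2 reflects the assumption that, in a combined forward and reverse search, each reverse hit implies about one undetected false hit among the forward identifications.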
Table 2.2. Total Number of Proteins Identified in Triplicate Analysis of Cell Line Conditioned Media and Pancreatic Juice

| | BxPc3 | CAPAN1 | CFPAC1 | HPDE | MIA-PaCa2 | PANC1 | SU.86.86 | Pancreatic juice pool A | Pancreatic juice pool B |
|---|---|---|---|---|---|---|---|---|---|
| Total non-redundant proteins^a [with ≥2 peptides] | 2017 [1261] | 2182 [1420] | 2427 [1573] | 2297 [1474] | 2756 [1862] | 3250 [2171] | 3010 [2002] | 1018 [546] | 957 [496] |
| Proteins identified with only 1 peptide | 756 | 762 | 854 | 823 | 894 | 1079 | 1008 | 472 | 461 |
| Proteins identified with only 2 peptides | 336 | 374 | 394 | 400 | 464 | 491 | 491 | 172 | 150 |
| Proteins identified with only 3 peptides | 226 | 230 | 290 | 252 | 281 | 347 | 322 | 96 | 79 |
| Proteins identified with only 4 peptides | 144 | 171 | 175 | 166 | 217 | 248 | 224 | 43 | 64 |
| Proteins identified with ≥5 peptides | 555 | 645 | 714 | 656 | 900 | 1085 | 965 | 235 | 203 |
| Protein false discovery rate^b | 0.69% | 0.82% | 0.66% | 0.87% | 0.80% | 0.62% | 0.73% | 0.79% | 1.0% |
| Number [%] of extracellular and cell surface proteins with ≥2 peptides | 511 [40.5%] | 605 [42.6%] | 665 [42.3%] | 592 [40.2%] | 635 [34.1%] | 757 [34.9%] | 749 [37.4%] | 314 [57.5%] | 281 [56.7%] |

^a All non-redundant proteins identified with ≥1 peptide; the number of total proteins identified with ≥2 peptides is enclosed in
brackets.
^b Pertains to total proteins identified with ≥1 peptide; False discovery rate = 0.0% for ≥2 peptide identifications.

Figure 2.2. Total non-redundant proteins identified. Venn diagrams depicting total proteins
identified with ≥2 peptides in the three replicates of each cell line CM and pancreatic juice
sample (a). Overlap of 3479 total non-redundant proteins identified in the conditioned media
and pancreatic juice analysis is also depicted (b). HPDE, (normal) human pancreatic ductal
epithelial cell line.

Protein Overlap Between Samples
From our combined analysis, a total of 3479 non-redundant proteins (3324 in the cell
lines and 648 in the pancreatic juice analysis) were identified with ≥2 peptides. Six-hundred and
forty-four proteins (of 3324; 19.4%) were common to all cell lines and an average of 143
proteins were unique to each (Table 2.3). From our preliminary studies of the three cell lines
described in the 'increasing protein yield' section above, 83 additional non-redundant proteins
(≥2 peptides) were identified; however these were not included in the remainder of the analyses.
Significant overlap was noted between the pancreatic juice and CM proteins.
Approximately 76% (493 of 648) of proteins identified in the pancreatic juice samples were also
identified in the cell line analysis (Figure 2.2b), which indicates much similarity in the
proteomes between these biological fluids; however many proteins that are largely associated
with exocrine pancreatic function were unique to the pancreatic juice and not identified in the
cell lines. Analysis of overrepresented KEGG pathways through Protein Center software further
revealed the KEGG pancreatic secretion pathway (hsa04972) to be one of three pathways
overrepresented in the pancreatic juice proteome in comparison to the combined cell line
proteome (p=3.611E-5) (Appendix 1).
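The overlap figures quoted in this section can be cross-checked with simple set arithmetic. In the sketch below the accession identifiers are placeholders invented for illustration; only the protein counts (3324, 648, 493 and 3479) are taken from the text.

```python
def overlap_summary(juice: set, cell_lines: set) -> dict:
    """Summarize the overlap between two protein identifier sets."""
    shared = juice & cell_lines
    return {
        "shared": len(shared),
        "juice_only": len(juice - cell_lines),
        "combined": len(juice | cell_lines),
        "percent_of_juice": 100 * len(shared) / len(juice),
    }


# Placeholder identifiers sized to match the reported counts.
cell_line_ids = {f"CM{i}" for i in range(3324 - 493)} | {f"SHARED{i}" for i in range(493)}
juice_ids = {f"PJ{i}" for i in range(648 - 493)} | {f"SHARED{i}" for i in range(493)}

print(overlap_summary(juice_ids, cell_line_ids))
```

With these counts, 155 proteins are found only in pancreatic juice, the union contains 3479 proteins, and the shared fraction is about 76% of the pancreatic juice proteome, which agrees with the totals reported above.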
Table 2.3. Protein Overlap Between Cell Line Conditioned Media

| Number of cell lines | Number of proteins^a | % of the CM proteins identified in the pancreatic juice^c |
|---|---|---|
| 7 | 644 | 42% |
| 6 | 285 | 14% |
| 5 | 295 | 14% |
| 4 | 268 | 14% |
| 3 | 336 | 10% |
| 2 | 494 | 5% |
| 1 | 1001^b | 4% |

^a Indicates the number of proteins with two or more peptides that were commonly identified in
the reported number of cell lines.
^b One thousand and one proteins is the total number of proteins that were unique to only one of
the seven cell lines.
^c Indicates the percentage of proteins common to the multiple cell lines that were also identified
in the pancreatic juice.
Gene Ontology – Function, Process and Cell Localization Classifications
Gene ontology classifications, which include function, process and cell localization,
were obtained for all identified proteins. Proteins that are secreted into the extracellular milieu
or cleaved from the plasma membrane of cells have the highest chance of entering the
circulation and serving as serological biomarkers. Between 34.1%-42.6% of proteins in each of
the cell lines and 57% of proteins in the pancreatic juice samples were annotated as belonging to
the extracellular or cell surface compartments (Table 2.2). In total, 1376 (40%) of 3479 proteins
contained these two annotations. The cytoplasm received the greatest number of annotations in
both biological fluids and approximately 2.9% of the total contingent of proteins did not contain
cell localization information and are unannotated (Figure 2.3a). It is important to note that
proteins can be classified as belonging to multiple cellular localizations, processes and functions
and as a result, the categories for each are non-exclusive and the sum of the percentages can be
>100%.
The top three molecular functions for the cell lines and the pancreatic juice were the
same: protein binding (~80.2%, 79.9%), catalytic activity (69.4%, 70.2%) and metal ion binding
(45.7%, 49.7%), respectively. Both fluids also shared the top two biological processes –
metabolic process (81.2%, 83.5%) and regulation of biological process (61.9%, 63.1%),
respectively. In a comparison between the cell line and pancreatic juice proteomes as a whole,
extracellular proteins and several molecular functions related to enzyme activity were found
overrepresented in the pancreatic juice proteome (Figure 2.3b). The only GO category
overrepresented in the cell line proteome was the biological process 'macromolecule metabolic
process' (GO:0043170; p=5.338E-7; FDR p=3.308E-3). No GO terms were over or
underrepresented in a comparison between the cancer cell lines and HPDE.
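The caption of Figure 2.3 notes that overrepresentation p-values are based on a hypergeometric test. The sketch below shows the general form of such a test (the upper tail of the hypergeometric distribution); how Protein Center applies it internally, including its multiple-testing correction, is not described in the text and is not reproduced here. The counts in the example are invented.

```python
from math import comb


def hypergeom_upper_tail(k: int, K: int, n: int, N: int) -> float:
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: proteins in the reference set
    K: reference proteins carrying the annotation
    n: proteins in the set being tested
    k: tested proteins carrying the annotation
    """
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)
    )
    return favourable / comb(N, n)


# Hypothetical example: 3 of 6 tested proteins carry an annotation
# that is present on 5 of 20 reference proteins.
print(hypergeom_upper_tail(3, 5, 6, 20))
```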
Figure 2.3 Cellular localization and comparison of GO categories between cell line
conditioned media and pancreatic juice proteomes. Cellular localization of proteins
annotated using gene ontology (GO) consortium annotations (a). Depicted in dark grey is the
percentage of proteins from the cell line CM analysis and light grey is percentage of proteins
from the pancreatic juice analysis for each cell localization category. Proteins can contain
multiple GO annotations resulting in a sum of percentages >100%. Top three significantly
overrepresented GO categories in pancreatic juice proteins in comparison to the cell line
conditioned media proteome for cellular localization, molecular function and biological process
are also depicted (b). Blue represents pancreatic juice proteins and red represents cell line
proteins. FDR (false discovery rate) p-values are based on application of hypergeometric test at
a FDR of 1%.
Hierarchical Clustering
One of the difficulties in dealing with large datasets is visualizing the proteomes as a
whole and identifying subsets of proteins that may be of importance within certain biological
contexts. In an initial attempt to mine and explore the CM and pancreatic juice proteomes,
unsupervised two-way hierarchical clustering analysis (HCA) was performed using average
emPAI values of the three replicates of each sample, normalized through Z-scores. Through
these means, proteins were clustered based on abundance within each sample. The
concentrations of two proteins (KLK6 and KLK10) were assessed in the CM through ELISA to
determine if Z-scores of emPAI values are a suitable indicator of protein abundance. Good
correlation was seen between Z-scores of ELISA concentrations and Z-scores of emPAIs
(Figure 2.4a). Additionally, the lowest ELISA concentration measured was 0.80 ug/L for
KLK10 in CAPAN1 CM, which indicates that the sensitivity of our mass spectrometry analysis of
the CM was, in general, at least in the low ug/L range.
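As an illustration of the normalization described above, the following minimal sketch converts average emPAI values for one protein into Z-scores across samples. The emPAI values are hypothetical, and both the per-protein direction of normalization and the use of the population standard deviation are assumptions made for illustration; they are not taken from this study.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return (value - mean) / standard deviation for each value."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD; the sample SD is equally plausible
    if sigma == 0:
        return [0.0 for _ in values]  # a flat profile carries no relative information
    return [(v - mu) / sigma for v in values]

# Hypothetical average emPAI values (mean of three replicates) for one protein.
empai = {"HPDE": 0.10, "BxPc3": 0.30, "CAPAN1": 0.90, "PANC1": 0.70}
z = dict(zip(empai, z_scores(list(empai.values()))))
```

After this step each protein has a mean of zero and unit variance across samples, so proteins of very different absolute abundance can be compared on one heat-map scale.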
HCA was performed on the entire dataset of 3479 proteins. Based on normalized
emPAI values, the pancreatic juice samples clustered separately from the cell
lines, and within the cell lines, the three derived from metastatic sites (SU.86.86, CFPAC1 and
CAPAN1) clustered together. MIA-PaCa2, PANC1 and BxPc3 are cell lines derived from
the primary tumor site of three patients [114]. The MIA-PaCa2 and PANC1 proteomes were
clustered together, as were the BxPc3 and HPDE cell lines. Heat-map visualization facilitated a
first exploration of the dataset and the identification of several regions or protein clusters of
interest. Among them, two clusters containing 34 proteins were shown to be highly expressed in
multiple cancer cell lines and the pancreatic juice samples, all with minimal expression in HPDE
(Figure 2.4b). This included proteins such as MUC1 (Mucin-1) [67] and RNASE1 (pancreatic
ribonuclease) [145] which have been shown to be elevated in pancreatic cancer and studied
Figure 2.4 Hierarchical clustering analysis based on normalized emPAI values for the 3479 total non-redundant proteins
identified. Good correlation (R-square = 0.7362) of z-scores between KLK6 and KLK10 emPAI values and ELISA concentrations
was noted (a). Clustering analysis depicting the seven cell lines and two pancreatic juice pools on the x-axis and proteins on the y-axis
(b). Shown is a segment of the resulting dendrogram depicting two clusters of proteins found highly expressed in the cancer cell lines
and pancreatic juice (low expression in the normal HPDE cell line). KLK6, kallikrein 6; KLK10, kallikrein 10; emPAI, exponentially
modified protein abundance index; EC, extracellular; CS, cell surface.
previously as pancreatic cancer biomarkers in serum. This prompted us to further examine
proteins that are differentially expressed between the cancer cell lines and the HPDE cell line.
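The sample-wise half of a two-way hierarchical clustering can be sketched as follows. The Euclidean distance, the average linkage and the Z-score profiles below are assumptions chosen for illustration; the clustering parameters used in this study are not restated in this section, and a full two-way analysis would apply the same procedure to the protein axis as well.

```python
from itertools import combinations
from math import dist

def linkage_distance(profiles, a, b):
    """Average Euclidean distance between all members of clusters a and b."""
    total = sum(dist(profiles[x], profiles[y]) for x in a for y in b)
    return total / (len(a) * len(b))

def average_linkage(profiles):
    """Merge the two closest clusters until one remains; return the merge order."""
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: linkage_distance(profiles, *pair))
        merges.append((a, b))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical Z-score profiles (three proteins per sample).
profiles = {
    "juice_pool": (2.0, -1.0, -1.0),
    "CAPAN1": (-0.5, 1.0, 0.8),
    "CFPAC1": (-0.4, 1.1, 0.7),
    "HPDE": (-1.0, -0.9, 0.2),
}
merges = average_linkage(profiles)
```

With these made-up profiles the two most similar samples merge first and the most distinct sample joins last, which is the pattern a dendrogram displays graphically.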
Differential Expression of Proteins in Cancer vs. Normal Cell Lines
Normalized spectral counts of the cancer cell lines were compared with those of the
normal HPDE cell line as described in the 'Materials and Methods' section. The Pearson
correlation coefficient was evaluated for all pairs of the 21 replicates from the 7 cell lines using
normalized spectral count values (Appendix 2). With the exception of replicate 2 from CFPAC1,
which showed 0.727 and 0.851 correlation with CFPAC1 replicates 1 and 3, good correlation
(ranging from 0.944-0.993) was seen between replicates of each cell line (including CFPAC1
replicates 1 and 3) indicating good reproducibility (Appendix 2).
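The replicate comparison above rests on the Pearson correlation coefficient, which can be computed as in the short sketch below. The spectral counts shown are hypothetical and serve only to illustrate the calculation.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical normalized spectral counts for four proteins in two replicates.
replicate_1 = [10, 20, 30, 40]
replicate_2 = [12, 19, 33, 41]
r = pearson(replicate_1, replicate_2)
```

Applied to every pair of the 21 replicates, this yields a symmetric matrix of the kind summarized in Appendix 2.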
Analysis of variance (ANOVA) testing identified 1293 proteins (each with a minimum
number of 10 spectra in at least one cell line), with a statistically significant difference amongst
the seven cell lines (p<0.05). Based on the criteria described in the 'Materials and Methods',
491 of these proteins showed ≥ 5-fold increase in at least one cancer cell line in comparison to
HPDE. One-hundred and nineteen proteins further demonstrated ≥ 5-fold increase in at least
three cancer cell lines in comparison to HPDE, of which 53 proteins showed over 10-fold
increase and 18 showed over 20-fold increase in at least three cancer cell lines. Examination of
underexpressed proteins revealed 19 proteins consistently decreased at least 5-fold in all six
cancer cell lines and 18 consistently decreased in five cancer cell lines in comparison to HPDE.
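The fold-change filter described above can be sketched as follows. The spectral counts are hypothetical, and the pseudocount added to avoid division by zero is an assumption; the exact criteria are those given in the 'Materials and Methods' and are not reproduced here.

```python
CANCER_LINES = ["MIA-PaCa2", "PANC1", "BxPc3", "CAPAN1", "CFPAC1", "SU.86.86"]

def lines_over_threshold(counts, fold=5.0, pseudocount=1.0):
    """Cancer cell lines whose count is at least `fold` times the HPDE count."""
    baseline = counts["HPDE"] + pseudocount  # assumption: pseudocount guards against zero counts
    return [line for line in CANCER_LINES
            if (counts[line] + pseudocount) / baseline >= fold]

def is_candidate(counts, fold=5.0, min_lines=3):
    """True if the protein passes the fold threshold in at least `min_lines` lines."""
    return len(lines_over_threshold(counts, fold)) >= min_lines

# Hypothetical normalized spectral counts for two proteins.
protein_a = {"HPDE": 1, "MIA-PaCa2": 2, "PANC1": 3, "BxPc3": 12,
             "CAPAN1": 40, "CFPAC1": 25, "SU.86.86": 30}
protein_b = {"HPDE": 10, "MIA-PaCa2": 12, "PANC1": 60, "BxPc3": 9,
             "CAPAN1": 11, "CFPAC1": 14, "SU.86.86": 10}
```

Raising `fold` to 10 or 20 with the same `min_lines` reproduces the progressively stricter tiers reported above.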
Sixty-three of the 119 proteins were extracellular and cell surface-annotated and are
listed in Appendix 3. Additionally, 17 of these proteins have been previously shown to be
upregulated in pancreatic cancer in at least four studies [146], and 10 have been shown to be
elevated in pancreatic cancer serum in comparison to controls [145, 147-156] (Appendix 3). The
unstudied proteins may yield promising new candidate biomarkers for pancreatic cancer.
Many of the 491 total overexpressed proteins were also identified in a comprehensive
database of human plasma proteins [157] and five proteins, COPS4 (COP9 signalosome
complex subunit 4), PXN (Paxillin), MYO1C (Myosin-1c), GBA (protein similar to
Glucosylceramidase) and LMAN2 (Vesicular integral-membrane protein VIP36), were also
identified in a recent global genomic analysis of pancreatic cancer [32] as overexpressed in the
large majority of pancreatic cancer cases studied.
Further Prioritization of Candidates through Integration of Biofluids and Tissue Specificity
Recent evidence suggests that integrating different biological fluids
may also yield strong candidates for verification phases of biomarker discovery [117,125]. As
such, we applied a set of filtering criteria based on overlap of proteins between different
biological sources, the cellular localization of proteins and tissue specificity for the generation of
further candidates. Of the 488 proteins common to the pancreatic juice and cell lines (Figure
2.2b), 235 had been annotated as belonging to the extracellular and cell surface compartments.
One-hundred and nine of these proteins were also identified in the proteome of ascites fluid
from 3 patients with pancreatic adenocarcinoma (Makawita et al. unpublished)1, and of these, 43
were not identified in the HPDE normal cell line (Appendix 4).
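The sequential filtering described above amounts to a chain of set operations, sketched below. The protein identifiers and set memberships are placeholders, not data from this study.

```python
def prioritize(juice, cell_line_cm, ec_or_cs, ascites, hpde):
    """Apply the overlap, localization, ascites and HPDE filters in order."""
    common = juice & cell_line_cm      # found in both pancreatic juice and cell line CM
    secreted = common & ec_or_cs       # annotated as extracellular or cell surface
    in_ascites = secreted & ascites    # also present in the ascites proteome
    not_in_normal = in_ascites - hpde  # absent from the HPDE cell line
    return [common, secreted, in_ascites, not_in_normal]

# Placeholder protein identifiers (hypothetical).
stages = prioritize(
    juice={"P1", "P2", "P3", "P4"},
    cell_line_cm={"P2", "P3", "P4", "P5"},
    ec_or_cs={"P3", "P4", "P5"},
    ascites={"P1", "P3", "P4"},
    hpde={"P4"},
)
```

Returning every intermediate set makes it straightforward to report the count surviving each filter, as is done in the text (488, 235, 109 and 43).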
Because there may be pertinent proteins in the pancreatic juice that may not be identified
in the CM and vice versa, proteins identified in either proteome that were shown to be highly
specific to/expressed in the pancreas were also included. To examine tissue specificity, we
compared the proteins identified in our CM and pancreatic juice datasets to proteins shown
highly specific to the pancreas based on microarray, EST and immunohistochemistry data using
1 Makawita S., Kosanam H., Diamandis E.P. Proteomic analysis of ascites fluid from pancreatic
cancer patients. (unpublished data).
TiSGeD [158], TiGER [159], Unigene [160], and the Human Protein Atlas [161], respectively.
These are publicly available databases that have been described in detail previously [158-161].
Specifically, we compared our lists to 150 proteins reported as specific to pancreas tissue using
TiSGeD specificity measure >0.90 [158], 55 pancreas-specific proteins from Unigene, 205
proteins preferentially expressed in the pancreas based on the TiGER database and 198 proteins
showing 'strong' pancreatic exocrine cell staining and annotated on the Human Protein Atlas.
Twenty proteins were common to at least three of the databases. Of these, 2 proteins,
PRSS1 and SPINK1, were identified in the cell line CM, and 15 proteins (including PRSS1 and
SPINK1) were identified in the pancreatic juice proteome (Table 2.4). Twelve of these proteins
have been previously shown to be elevated in serum/plasma of patients with pancreatitis or
pancreatic cancer [162-172], leaving CTRC (chymotrypsin C), SYCN (syncollin) and REG1B
(Lithostathine-1-beta) as the three without such prior reports (Table 2.4).
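The requirement that a protein be listed in at least three of the four tissue-specificity databases can be expressed as a simple count, as in the sketch below. The database contents are placeholders and do not reflect the actual entries of TiSGeD, TiGER, UniGene or the Human Protein Atlas.

```python
from collections import Counter

def in_at_least(databases, k=3):
    """Proteins listed in at least k of the given databases."""
    counts = Counter(p for members in databases.values() for p in set(members))
    return {p for p, n in counts.items() if n >= k}

# Placeholder database contents (hypothetical).
databases = {
    "TiSGeD": {"A", "B", "C"},
    "TiGER": {"A", "B"},
    "UniGene": {"A"},
    "HPA": {"B", "D"},
}
pancreas_specific = in_at_least(databases, k=3)

# Intersect with a (hypothetical) identified proteome to obtain the final list.
juice_proteome = {"A", "C", "D"}
specific_and_identified = pancreas_specific & juice_proteome
```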
Candidate Verification in Plasma
Based on availability of enzyme-linked immunosorbent assays, eight candidates were
selected for verification in plasma. Of these, five proteins - Anterior Gradient Homolog 2
(AGR2), Olfactomedin-4 (OLFM4), Syncollin (SYCN), Collagen alpha-1(VI) chain (COL6A1),
Polymeric Immunoglobulin Receptor (PIGR) – showed a significant increase in pancreatic
cancer plasma (Fig 2.5a-e). Tissue-type plasminogen activator (PLAT), Trefoil factor 2 (TFF2)
and Nucleobindin-2 (NUCB2) did not show a significant increase in plasma samples (data not
shown).
Table 2.4 List of 15 Pancreas-Specific Proteins (≥3 databases) Identified in CM and Pancreatic Juice
Column headings: Gene; Protein Name; HPA [162]; UniGene [161]; TiGER [160]; TiSGeD [159]; Identified in Pancreatic Juice Proteome; Identified in CM Proteome; Previously Shown Elevated in Pancreatic Cancer or Pancreatitis Serum/Plasma [ref]
(Per-protein marks in the database and proteome columns are not recoverable here; the gene, protein name and reference columns follow.)

Gene      Protein Name                                                                    [ref]
CPA1      Carboxypeptidase A1                                                             162
PRSS1     Trypsin-1                                                                       163,164
CPA1      cDNA FLJ53709, highly similar to Carboxypeptidase A1                            162
CPA2      Carboxypeptidase A2                                                             162
GP2       Isoform Alpha of Pancreatic secretory granule membrane major glycoprotein GP2   165
REG1A     Lithostathine-1-alpha                                                           166
CTRC      Chymotrypsin-C
CPB1      Carboxypeptidase B                                                              167
GP2       Isoform 1 of Pancreatic secretory granule membrane major glycoprotein GP2       165
PNLIP     Pancreatic triacylglycerol lipase                                               168,169
SYCN      Syncollin
REG1B     Lithostathine-1-beta
CLPS      Colipase                                                                        170
SPINK1    Pancreatic secretory trypsin inhibitor                                          171
PLA2G1B   Phospholipase A2                                                                172

HPA, Human Protein Atlas; TiSGeD, Tissue-Specific Genes Database; TiGER, Tissue-specific and Gene Expression and Regulation
In the CM analysis, AGR2 showed over 10-fold increase in the BxPc3, CAPAN1,
CFPAC1 and SU.86.86 cell lines compared to the near-normal HPDE cell line (Appendix 3). As
well, AGR2 was common to the CM and pancreatic juice proteomes and was identified in the
cluster of proteins highly expressed in many cancer cell lines and pancreatic juice in comparison
to HPDE (Figure 2.4b). In plasma, AGR2 levels were significantly increased in pancreatic
cancer patients (p<0.0001) in comparison to controls (Fig. 2.5a). Mean and median plasma
levels in the pancreatic cancer patients were 8.8 ug/L and 2.1 ug/L, while mean and median levels
in controls were 0.33 ug/L and 0.28 ug/L.
OLFM4 was a protein identified based on the integrated method (Appendix 4), and it
was also identified in the cluster shown in Fig 2.4b. In the plasma samples, OLFM4 also
showed a significant elevation (p<0.0001) in cancer (mean = 161 ug/L, median = 90 ug/L) in
comparison to controls (mean = 51 ug/L, median = 38 ug/L) (Fig. 2.5b). SYCN was a protein
identified solely in the pancreatic juice samples. It is monospecific to the pancreas based on
TiGER, TiSGeD and Unigene databases (data was unavailable in the Human Protein Atlas)
(Table 2.4). This protein is a part of the secretory granule membranes of the exocrine pancreas,
and due to its tissue specificity, it was selected for the verification phases. In the plasma
samples, SYCN also showed a significant increase in pancreatic cancer patients (p=0.0011;
mean cancer = 18.2 ug/L, median cancer = 13.5 ug/L; mean controls = 5.1 ug/L, median controls
= 2.9 ug/L) (Fig. 2.5c).
COL6A1 showed over 20-fold higher expression in all of the cancer cell lines except BxPc3
in comparison to the HPDE cell line. Similarly, PIGR showed over 20-fold higher expression in
three of the cancer cell lines (Appendix 3). Both proteins showed a significant increase in
pancreatic cancer plasma in our preliminary analysis (COL6A1: p=0.0098; mean cancer = 3.3 mg/L,
median cancer = 2.1 mg/L; mean controls = 1.5 mg/L, median controls = 0.73 mg/L. PIGR:
p<0.0001; mean cancer = 16.8 mg/L, median cancer = 12.3 mg/L; mean controls =
9.2 mg/L, median controls = 8.96 mg/L) (Fig. 2.5d,e).
At present, CA19.9 is the most widely used pancreatic cancer biomarker and CA19.9
levels were also assessed in our screening set of plasma samples (Fig 2.5f). While none of the
proteins verified here outperforms CA19.9 individually, preliminary assessment of all five
proteins as a panel, using ROC curve analysis, shows a slight increase in AUC over CA19.9
alone (AUC for CA19.9 = 0.97; AUC for AGR2, OLFM4, SYCN, COL6A1 and PIGR = 0.98; Fig 2.6).
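For a single marker, the area under the ROC curve equals the probability that a randomly chosen cancer sample scores higher than a randomly chosen control, which is also the Mann-Whitney U statistic divided by the number of case-control pairs. The sketch below computes it directly; the concentrations are hypothetical, and the way the five markers were combined into one panel score is not described in this section and is therefore not modelled here.

```python
def auc(cases, controls):
    """Fraction of case-control pairs in which the case is higher (ties count 0.5)."""
    wins = sum((c > h) + 0.5 * (c == h) for c in cases for h in controls)
    return wins / (len(cases) * len(controls))

# Hypothetical marker concentrations.
cancer = [3.0, 5.0, 1.0, 4.0]
healthy = [0.5, 1.0, 2.0, 0.8]
marker_auc = auc(cancer, healthy)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.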
Figure 2.5. Preliminary verification of AGR2 (a), OLFM4 (b), SYCN (c), COL6A1 (d) and
PIGR (e) in plasma from pancreatic cancer patients and healthy controls of similar age
and sex. Plasma concentrations of the proteins were measured through ELISA. Mean values are
indicated by a horizontal line and p-values were calculated using the Mann-Whitney U-test.
CA19.9 levels in the plasma samples were also tested (f).
n, number of subjects
Figure 2.6. Receiver Operating Characteristic curve analysis for CA19.9 and panel of 5
candidates (AGR2, OLFM4, SYCN, COL6A1 and PIGR). AUC (area under the curve) is given
with 95% confidence intervals. The AUC of the 5 candidates as a panel shows a slight improvement
over the AUC of CA19.9 alone.
2.4 Discussion
Deregulated molecular pathways are a hallmark of cancer and the resultant secretion,
shedding and aberrant cleavage of proteins by tumor cells and their microenvironment present a
way in which to detect and track tumor development and progression [44,173]. With the advent
of high throughput protein profiling techniques, at the centre of which lies mass spectrometry
analysis, various approaches have been taken for the identification of novel protein biomarkers
and novel biomarker candidates. Serum or plasma is the desired diagnostic fluid in the clinic;
however initial discovery studies in serum are hampered by the high complexity of the fluid and
its large dynamic range [54]. To overcome these limitations and others posed by MS analysis of
serum, researchers have turned to the characterization of proteomes of less complex biological
fluids that are 'upstream' of plasma. Due to the proximity of selected biological fluids to tumor
cells and tissues, their proteomes likely represent a reservoir of proteins enriched in potential
biomarkers prior to dilution upon entering the circulation [110,117,125].
Although many notable differences exist, the genomic and transcriptional make-up of
cancer cell lines has been shown to recapitulate salient aspects of primary tumors [113-116]. In
addition, the identification of known biomarkers in the conditioned media of cancer cell lines for
numerous cancer sites makes CM a viable source to mine [80, 174]. Previously, our group has
characterized the CM of breast, ovarian, prostate and lung cancer-related cell lines using 3-4 cell
lines per cancer site [127-130]. Using an LTQ mass spectrometer, 1139, 1830, 2124 and 2039
proteins were identified with at least one peptide in the breast [127], lung [130], prostate [128]
and ovarian cancer [129] analyses, respectively. Given the vast heterogeneity of the disease,
we concluded from our previous work that a larger number of cell lines per cancer site, as
well as the incorporation and integration of proximal biological fluids from patients, may provide
a more complete picture of disease heterogeneity and the tumor-host interface, thereby
facilitating the identification of stronger candidates for verification.
In the present study, we applied such an approach to pancreatic cancer. By utilizing 2D
LC-MS/MS, we characterized the proteomes of conditioned media from six pancreatic cancer
cell lines, one near-normal pancreatic ductal epithelial cell line and six pancreatic juice samples
in two pools. All experiments were performed in triplicate and multiple search engines
(MASCOT and X!Tandem), which employ different search algorithms, were utilized for protein
identification. Previously it has been reported that use of multiple search engines results in
increased confidence in the proteins identified [70,71]. Additionally, only proteins identified
with multiple peptides (≥ 2 peptides) were used in the analysis. Through these means we
identified 3324 non-redundant proteins in the CM of the seven cell lines and 648 proteins in the
pancreatic juice. In total, 3479 non-redundant proteins were identified. This, to our knowledge,
is one of the largest and most comprehensive proteomes to date for pancreatic cancer-related
biological fluids in a single study.
In the first part of the study, an increase in protein yield of ~50% was achieved by
applying a pre-fractionation strategy that was tailored to the SCX elution profile. SCX was the
first dimension of fractionation in our multidimensional approach. Different modes of
fractionation from isoelectric focusing (IEF) to SDS-PAGE fractionation and SCX have been
previously compared, with different studies reporting different methods as the most effective
when coupled with MS analysis [175-178]. Fractionation of complex samples prior to MS
analysis is a technique used to minimize sample complexity and penetrate deeper into the
proteome, thereby achieving increased coverage of proteins. Indeed, more proteins (including
those known to be of low abundance such as various interleukins) were identified through these
means in our analysis. A corollary of increased fractionation is typically decreased throughput.
In the present study, reduced gradient times during the second dimension of separation
(reverse-phase) helped to keep any increase in analysis time to a minimum.
Not all proteins identified in shotgun proteomics-driven discovery approaches will be
suitable for study as serological biomarkers, and one of the challenges in the field is in the
selection of the most promising candidates for further investigation. In the present study, we
utilized two strategies: (1) semi-quantitative analysis through label-free protein quantification
between the cancer and normal cell lines and (2) integrative analysis of cell line CM and
pancreatic juice. Label-free approaches typically employ chromatographic ion intensity-based
methods or spectral count-based means to obtain relative quantification of proteins between
LC-MS/MS runs [179,180]. Further approximations of absolute protein abundance can be
obtained through reported indices such as emPAI and absolute protein expression (APEX)
[181,182]. Normalized spectral counts have been reported previously to be reliable indicators of
protein abundance in studies comparing different label-free methods, and strong correlation
between spectral counts and protein abundance has been shown [183]. When restricting
analysis to proteins identified with five or more spectra, results comparable to label-based
approaches have been shown to be obtainable [184]. In the present study, this method was
utilized for relative quantification between the cancer cell lines and the HPDE cell line.
Using the criteria outlined in the 'Materials and Methods', 119 proteins were found to be
overexpressed more than 5-fold in at least three cancer cell lines. Included in this list were
many proteins previously shown to be upregulated in pancreatic cancer. For instance, the protein
GDF15, also known as macrophage inhibitory cytokine 1 (MIC1), showed >10-fold increase in
the CAPAN1, CFPAC1, PANC1 and SU.86.86 cell lines. Increased GDF15 mRNA and protein
levels have been shown previously in pancreatic tissue in comparison to adjacent normal
controls [154] and evaluation of this protein in serum has also shown it to have diagnostic
potential [155]. Similarly, neutrophil gelatinase-associated lipocalin (LCN2) [156], matrix
metalloproteinase 7 (MMP7) [152], complement component 3 (C3) [150,151] and leucine-rich
alpha-2-glycoprotein (LRG1) [185] have been reported to be elevated in serum of pancreatic
cancer patients, while mesothelin (MSLN) [186], tissue-type plasminogen activator (PLAT)
[187], C-X-C motif chemokine 5 (CXCL5) [188] and other proteins highlighted in Appendix 3
have been shown to be upregulated in pancreatic cancer or pancreatic neoplasia at the level of
tissue and/or mRNA.
Identification of these proteins lends some credence to our label-free discovery
approach; however, proteomic comparisons between non-malignant and malignant biological
sources are limited by the possibility that the observed differences may be due to many factors,
not solely to differences in tumorigenic potential. This was demonstrated by the fact that three of
the eight proteins tested did not show a significant increase in plasma from pancreatic cancer
patients. These proteins, PLAT, NUCB2 and TFF2, were overexpressed more than 5-fold in three,
one and one of the pancreatic cancer cell lines, respectively; however, this overexpression did
not translate into elevated plasma levels.
We further investigated three other proteins, AGR2, PIGR and COL6A1 which showed
over 5-fold increase in four, three and five cancer cell lines, respectively (Appendix 3). Our
analysis of these proteins in human plasma also showed a significant increase in pancreatic
cancer patients. With the exception of AGR2, to the best of our knowledge, none of these proteins has
previously been studied in sera/plasma of pancreatic cancer patients. AGR2 is an orthologue of
the Xenopus laevis protein XAG-2, which has been shown to play a role in ectodermal
patterning [189]. The function of AGR2 in normal human states is largely unknown; however, in
human cancers, AGR2 has been associated with several cancer types [190-192] and recently,
increased AGR2 levels were reported in pancreatic juice [153]. In this latter study, Chen et al.
utilized quantitative proteomics to profile pancreatic juice samples from pancreatic
intraepithelial neoplasia (PanIN) patients in comparison to controls, and AGR2 was one of the
proteins this group found to show over 2-fold increase in PanIN-stage III. While Chen et al.
found diagnostic relevance for AGR2 in pancreatic juice, their analysis of 6 paired serum and
pancreatic juice samples from PanIN patients found no correlation between serum and
pancreatic juice AGR2 levels. Further analysis by this group in serum of 9 pancreatic cancer
patients and 9 cancer-free controls also showed no significant difference in AGR2 levels [153]. Despite
this, given that AGR2 was highly elevated in the majority of cancer cell line CM based on
spectral counting in this study, as well as its identification in pancreatic juice, we tested its levels
in our screening set of plasma samples and found a significant elevation in AGR2 levels in
pancreatic cancer plasma versus controls (Fig 2.5a). AGR2 has been previously shown to play a
role in invasion and metastasis [193,194], and it may be that elevated levels of this protein occur
in blood in the later stages of pancreatic cancer; however our initial results warrant further
evaluation of this protein in plasma/sera in larger sample sets.
PIGR has been shown previously through MRM to be increased in endometrial cancer
tissue homogenates [195]; however it has not been studied in clinical samples from many other
cancer sites. In the present study we demonstrate its significant increase in pancreatic cancer
plasma. COL6A1 is an important component of microfibrillar network formation, associating
closely with basement membranes in many tissues. It is an extracellular matrix protein and also
found in stromal tissue [196]. Mutations in this gene play a role in muscular disorders and
differential COL6A1 gene expression has been associated with astrocytomas [197,198];
however it has not been studied in pancreatic cancer and was found to be significantly increased
in our preliminary assessment in plasma. Taken together, the increased levels of these proteins
in pancreatic cancer plasma demonstrate the utility of our label-free differential protein
quantification approach to identify proteins relevant for study as potential serological
biomarkers of pancreatic cancer.
The identification of cancer-derived protein alterations through integration of different
biological sources is also an area of interest in cancer proteomics and the integrative mining of
multiple biological fluids may result in the identification of relevant candidates [125]. For
instance, in a recent analysis done in our laboratory [117], which compared the proteins/genes
identified in six publications chosen arbitrarily to represent various biological sources and both
proteomic and genomic data pertaining to ovarian cancer (2 cell line CM studies, 2 ascites, 1
tissue proteomics study and 1 microarray study), no proteins were found common to all 6
studies; however two proteins were found common to four of the studies. The proteins identified
were WAP four-disulfide core domain protein 2 precursor (HE4) and GRN (granulin). Both
have been implicated in ovarian cancer and HE4 is a recently FDA-approved ovarian cancer
biomarker [199]. In this respect, we looked at proteins common to the cancer CM and pancreatic
juice for identification of further candidates. These proteins were also compared to a pancreatic
cancer ascites proteome (Makawita et al. unpublished) for additional filtering. Most, if not all,
current biomarkers, such as PSA for prostate cancer, CA125 for ovarian cancer, hCG for
testicular cancer, etc. are secreted and shed proteins and focus was given to extracellular and cell
surface proteins as they have the highest likelihood of entering into the circulation [44,200].
Focus was also given to proteins highly or specifically expressed in the pancreas. If a protein is
only expressed in one tissue in healthy individuals, that tissue is likely the only contributor to
endogenous serum levels of that protein. As such, increasing serum contributions of such a
protein due to the presence of a growing tumor may be more easily detected. Furthermore, many
current biomarkers, such as PSA mentioned above, are highly expressed in one tissue [201].
Interestingly, the great majority of pancreas-specific proteins (as denoted by several
databases) were unique to the pancreatic juice and not identified in the cell lines (Table 2.4).
Similarly, the KEGG pancreatic secretion pathway was overrepresented in the pancreatic juice
proteome (Appendix 1). In the exocrine pancreas, acinar cells are responsible for secretion of
enzymes (zymogens) while ductal cells secrete primarily an alkaline fluid [202,203]. While the
majority of pancreatic cancers are ductal adenocarcinomas with pancreatic ductal cell-like
properties, the cell of origin of these cancers is still unclear [5,7]. Previously it has been shown
that acinar cells, once having undergone a transformation to duct-like cells show a reduced
secretion of zymogens [204]. The lack of pancreas-specific proteins (enzymes, zymogens, etc.)
in the cell line CM likely reflects the ductal-like nature of the cell lines, while the presence
of such proteins in the pancreatic juice may reflect its acinar cell contributions.
Among the proteins common to all three biological fluids (Appendix 4), were several
proteins shown previously to be increased in the serum of pancreatic cancer patients and studied
as pancreatic cancer biomarkers, such as MUC1 and CEACAM5 (CEA) [65-67]. Two proteins
not previously assessed in the serum/plasma of pancreatic cancer patients, OLFM4 and SYCN,
were selected for verification. OLFM4 has been shown to promote proliferation in the PANC1
cell line by Kobayashi et al. [205], and its mRNA levels were shown to be elevated in 5
cancerous versus non-cancerous pancreatic tissue samples in the same study. OLFM4 serum
protein levels have shown potential diagnostic utility for gastric cancer [206]; however this
protein has not been studied in serum/plasma of pancreatic cancer patients. In this study,
OLFM4 showed over 5-fold higher expression in the CAPAN1 cell line in comparison to the HPDE
cell line. It was also identified by us in the pancreatic juice and ascites fluid proteomes and our
preliminary assessment shows that it is significantly increased in plasma from pancreatic cancer
patients (Fig 2.5b). Syncollin is a zymogen granule protein specific to the pancreas and is believed
to play a role in the concentration and/or efficient maturation of zymogens [207]. Syncollin has
been previously identified through mass spectrometry in human pancreatic juice and in the
proteomic analysis of plasma from a murine pancreatic cancer model [98,106]; however little is
known about the role of this pancreas-specific protein in pancreatic cancer and other
pathologies. Our data show that it is significantly elevated in human pancreatic cancer plasma
through ELISA (Fig 2.5c).
The growing consensus in this field is towards the development of panels of biomarkers,
as the combined assessment of multiple molecules can result in increased sensitivity and
specificity, in comparison to the assessment of molecules individually. In the present study, this
was demonstrated preliminarily to be true as the combined assessment of AGR2, OLFM4,
PIGR, SYCN and COL6A1 showed improved AUC, compared to CA19.9 alone. CA19.9 has
reported sensitivity and specificity values between 70%-90% (median ~79%) and 68%-91%
(median ~82%), respectively, for detection of pancreatic cancer (note: sensitivity decreases to
~55% in early-stage disease and CA19.9 is often undetectable in many asymptomatic
individuals; specificity decreases with benign disease) [59]. In the present study, CA19.9
showed a very high AUC (0.97) likely because the cancer plasma samples utilized were from
patients with established (primarily late-stage) pancreatic cancer. We used such samples since
the goal of the present study was to determine the utility of our approach to identify proteins that
are increased in serum/plasma of pancreatic cancer patients. Our marker panel requires further
validation with samples that have low/normal CA19.9 values and that include patients with
early-stage disease, as well as those with benign abdominal pathologies.
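Sensitivity and specificity, as quoted above for CA19.9, depend on the cutoff chosen for a marker. The sketch below computes both at a single cutoff; the marker values and the cutoff are hypothetical and are not drawn from the samples analysed in this study.

```python
def sensitivity_specificity(cases, controls, cutoff):
    """Sensitivity and specificity when values >= cutoff are called positive."""
    true_positives = sum(v >= cutoff for v in cases)
    true_negatives = sum(v < cutoff for v in controls)
    return true_positives / len(cases), true_negatives / len(controls)

# Hypothetical marker values and cutoff.
cancer_values = [12, 45, 80, 150, 30]
control_values = [5, 10, 20, 40]
sens, spec = sensitivity_specificity(cancer_values, control_values, cutoff=37)
```

Sweeping the cutoff across all observed values and plotting sensitivity against (1 - specificity) traces out the ROC curve.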
Three of the five proteins verified in plasma and which showed a significant increase in
pancreatic cancer plasma samples in comparison to controls (AGR2, PIGR and OLFM4) were
also identified in relevant clusters through hierarchical clustering analysis. emPAI is another
means of label-free protein quantification [181], so the identification of these three proteins,
and several others, through both emPAI-based quantification and spectral counting
is not unexpected, and it further corroborates our results.
Recently, Wu et al. [131] utilized emPAI values of proteins normalized through z-scores for
pathway-based biomarker discovery as a part of their study of 23 human cancer cell lines. In the
present study, we utilized normalized emPAI values of proteins to gain a preliminary
understanding of the dataset through hierarchical clustering analysis. The six cancer cell lines
chosen for analysis in the study are well characterized and highly studied cell lines in the literature.
They contain many of the major genetic aberrations present in pancreatic cancer such as
mutations in Kras, SMAD4, CDKN2A (p16) and TP53 [113,114]. Interestingly, the cancer cell lines
derived from metastatic sites (SU.86.86, CFPAC1 and CAPAN1) were clustered together, while
MIA-PaCa2 and PANC1, which are cell lines derived from a primary tumor site were clustered
together, as were the BxPc3 and HPDE cell lines. BxPc3 is a cancer cell line derived from a
primary tumor site and HPDE is a widely used surrogate for normal pancreatic ductal epithelial
cells. Incidentally, these two cell lines were the only ones with wild-type KRAS expression [114],
a gene that is mutated in the vast majority (>90%) of pancreatic cancers; however firm
conclusions cannot be drawn regarding the clustering without further investigation.
Nonetheless, the identification in relevant clusters of three of the five proteins that showed a
significant increase in plasma from pancreatic cancer patients in this study suggests that
clustering on normalized emPAI values is a potentially viable means for the generation of
biologically relevant leads.
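As an illustration of the quantities discussed above, the following minimal sketch computes emPAI values and standardizes them as z-scores. It assumes the standard emPAI definition (10 raised to the ratio of observed to observable peptides, minus 1) and z-scoring within a single sample; the protein names and peptide counts are hypothetical and are not taken from this dataset, and the normalization actually applied in this study may differ.

```python
from statistics import mean, pstdev


def empai(n_observed: int, n_observable: int) -> float:
    """emPAI = 10 ** (observed peptides / observable peptides) - 1."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1


def z_scores(values):
    """Standardize values to mean 0 and population standard deviation 1."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # All values identical: no spread to standardize against.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Hypothetical (observed, observable) peptide counts for one sample.
peptide_counts = {
    "PROTEIN_A": (6, 12),
    "PROTEIN_B": (2, 20),
    "PROTEIN_C": (15, 15),
}

empai_values = {
    name: empai(obs, total) for name, (obs, total) in peptide_counts.items()
}
normalized = dict(zip(empai_values, z_scores(list(empai_values.values()))))
```

The normalized values from each sample could then be assembled into a protein-by-sample matrix and passed to any hierarchical clustering routine.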
Pancreatic cancer has one of the lowest five-year survival rates (<5%) of all cancer
types [121]. This is largely associated with the existence of locally advanced or metastatic
disease in the majority of patients at the time of diagnosis. Genomic sequencing studies reveal
that a broad window may exist for the detection of pancreatic cancer between the initial stages
of tumour development and dissemination to secondary sites [208]. Here, we present the
proteomic analysis of pancreatic cancer-related cell lines and pancreatic juice for the
identification of novel diagnostic leads. Label-free protein quantification methods revealed a
group of proteins differentially expressed in pancreatic cancer. Contained within this group were
numerous proteins previously studied as pancreatic cancer biomarkers and associated with
pancreatic cancer pathology. Further candidates were generated through integrative analysis of
multiple biological fluids and examination of tissue specificity. Through a preliminary
assessment, five proteins (AGR2, OLFM4, PIGR, SYCN and COL6A1) were shown to be
significantly increased in plasma from pancreatic cancer patients. Appropriate validation of
potential biomarkers requires the use of clearly defined clinical specimens, appropriate controls
and a large number of samples (preclinical, early and late-stage, benign disease, healthy
controls) [209]. Our preliminary assessment warrants further validation of these five proteins in
larger cohorts of samples (early and late-stage pancreatic cancer, benign disease and healthy
controls) as well as consideration of these proteins in the development of biomarker panels for
pancreatic cancer.
The current state of cancer proteomics boasts a large number of discovery studies
resulting in the generation of many potential diagnostic and therapeutic leads; however due in
part to a lack of parallel high-throughput/multiplexed technologies, subsequent verification and
validation of these leads is lagging. In this regard, the proteomic data-set presented in this study,
when combined with existing repositories or compendiums [146,210], may also aid in further
prioritizing candidates for future diagnostic and therapeutic applications.
CHAPTER 3:
Enhanced Performance of CA19.9 with Addition of Syncollin and
Anterior Gradient Homolog 2 in Panel
3.1 Introduction
New serum biomarkers to aid in the detection and clinical management of patients with
pancreatic cancer are urgently needed. With an estimated 43,140 new cases in the United States
in 2010 and an estimated 36,800 deaths, pancreatic cancer is one of the most aggressive of all
cancer types and the 4th leading cause of cancer related death [33]. Currently five-year survival
rates are less than 5%; however detection of pancreatic cancer in its early stages can increase
five-year survival rates to 20-40% [33]. Until better therapeutic measures are developed, and even
with the advent of novel therapeutics, earlier detection of pancreatic cancer is key to
improving patient survival. Currently, CA19.9 is the most widely used biomarker in the clinic
for pancreatic cancer with a median sensitivity of ~79% and median specificity of ~82%;
however its efficacy is low for detection of early-stage disease (~55% sensitivity) [59]. It is also
elevated in benign conditions and is not produced in Lewis genotype negative individuals [64]
(~10% of the general population). As a result, it is mostly used in the clinic for monitoring
response to therapy in patients with established disease. Thus the need remains for the
identification of new tumor markers with high sensitivity and specificity for optimal
management of pancreatic cancer patients.
We previously performed extensive proteomic analysis of conditioned media (CM) from
six pancreatic cancer cell lines, one normal pancreatic ductal epithelial cell line and six
pancreatic juice samples using two dimensional LC-MS/MS. Specifically, our triplicate analysis
of the BxPc3, MIA-PaCa2, PANC1, CAPAN1, CFPAC1, SU.86.86 and HPDE cell line CM,
and pancreatic juice samples resulted in the identification of 3479 non-redundant proteins with
two or more peptides. Through subsequent examination of differential protein expression
between the cancer and normal cell lines using relative label-free protein quantification and
integrative analysis, focusing on the overlap of proteins between the multiple biological fluids,
cellular localization and tissue specificity, candidate biomarkers for verification were elucidated.
Preliminary verification of 5 proteins, AGR2, OLFM4, SYCN, PIGR and COL6A1 in 20 plasma
samples from pancreatic cancer patients and 20 healthy individuals of similar age/sex using
ELISA showed a significant elevation of these proteins in plasma from pancreatic cancer
patients. Assessment of the combination of the 5 proteins showed an improved area under the
curve relative to CA19.9 alone. In the present study, we further assessed two of the proteins, SYCN and
AGR2, in a larger number of samples (n=198).
Syncollin (SYCN) is a 14 kDa protein that shows high/preferential expression in the
pancreas based on several publicly available databases such as TiSGeD (Tissue-Specific
Genes Database), which provides microarray-based tissue expression of proteins, and TiGER
(Tissue-specific Gene Expression and Regulation) and Unigene, which provide EST
(expressed sequence tag) tissue expression data. Data is not yet available for SYCN on the
Human Protein Atlas database which provides immunohistochemistry tissue expression profiles
of proteins. In literature, SYCN has been identified in rat pancreatic tissues as a protein found on
the membranes of zymogen granules in pancreatic acinar cells which functions as a regulator of
exocytosis in a Ca2+-dependent manner, and has been shown to play a role in the maturation of
zymogens [207,211].
More recently, SYCN was identified in human neutrophilic granulocytes by Bach et al.
[212]. This group has postulated a possible role for syncollin in host defense; however at present
empirical support for such a claim is lacking [212]. The role of SYCN, if any, in cancer is
unknown. AGR2 is a more widely studied protein, especially in the context of cancer. AGR2 is
elevated in several cancer sites including breast, prostate and lung cancers, and has been
implicated in invasion and metastasis [190-192]. In pancreatic cancer, elevated levels of AGR2
mRNA and protein have been shown in tissue sections of pancreatic ductal adenocarcinoma.
Silencing with siRNA has shown decreased cell proliferation and reduced invasion in pancreatic
cancer cell lines, and silencing through shRNA has shown reduced tumor growth in vivo in an
orthotopic mouse model of pancreatic cancer [193]. This latter study showed that silencing of
AGR2 can increase the efficacy of treatment with gemcitabine, significantly reducing metastatic
loci in the liver and lung [193].
In the present study, we investigated the levels of syncollin and AGR2 in serum of 111
patients with pancreatic cancer and 87 normal controls. Significantly elevated levels of both
proteins were observed in pancreatic cancer patients, although SYCN performed better than
AGR2. Receiver operating characteristic (ROC) curve analysis was conducted on all cases and
controls, and on the subset of confirmed Stage II cases and all controls. Individually, only
SYCN showed a slight (but not statistically significant) improvement in AUC in comparison to
that of CA19.9 in the Stage II and controls analysis. In both cases however, the combination of
SYCN, AGR2 and CA19.9 showed a statistically significant improvement in AUC over CA19.9
alone. These results support the view that syncollin and AGR2 are promising candidate
serological biomarkers for pancreatic cancer which can enhance the performance of CA19.9.
The clinical utility of this novel panel needs to be further studied in larger cohorts of samples
from patients with pancreatic cancer, individuals with benign and preclinical disease, and
normal controls.
3.2 Materials and Methods
Patients and clinical specimens
One-hundred and eleven serum samples from patients diagnosed with pancreatic cancer
and 87 serum samples from healthy controls were used in this study. Median age of controls was
51 years (age range 19-84 years; 3 samples have unreported age) and median age of cancer
patients was 65 years (age range 32-85 years; 1 sample with unreported age). Of the 111
pancreatic cancer patients, 3 had stage I disease [30], 50 had stage II disease, 1 had been
reported as stage III and 3 patients were reported as having stage IV disease, with 4 more
reported as "unresectable" [30]. Stage was unknown in 50 patients. Tumor grade was reported in
37 cases, of which three were grade I, two were grade I-II and I-III, 19 were grade II and 13
were grade III (Table 3.1). All serum samples were provided by Dr. Randy Haun at the
University of Arkansas Cancer Research Center and collected with informed consent in
accordance with the Institutional Ethics Board. Samples were stored at -80 °C upon collection
and shipped on dry ice. Samples were not thawed until use in this study.
Measurement of CA19.9, SYCN and AGR2 in serum
CA19.9 levels were measured using a commercially available immunoassay (ELECSYS
by Roche) and performed according to the manufacturer's instructions. Enzyme-linked
immunosorbent assay (ELISA) kits for SYCN and AGR2 were purchased from USCN LifeSciences
(SYCN: Catalogue # E93879Hu; AGR2: Catalogue # E2285Hu). ELISAs were performed
according to the manufacturer's instructions with slight modifications. Briefly, 100 µL of sample
was incubated in pre-coated 96-well plates for 2 hours at 37 °C, along with standards. Samples
were diluted in phosphate buffered saline as instructed, using a 1-in-5 dilution for both proteins.
Plates were washed 2 times using the wash buffer provided in the kits (whereas the manufacturer's
instructions indicate that no washing is needed at this stage). A biotin-conjugated polyclonal secondary
antibody specific to SYCN or AGR2 (detection reagent A from USCN kit) was prepared and
incubated for 1 hour at 37 °C. Following 4 washes, horseradish peroxidase (HRP) conjugated to
avidin (detection reagent B from USCN kit) was prepared and incubated for 30 min at 37 °C.
The plates were washed 4 times and 90 µL of tetramethylbenzidine (TMB) substrate was added
to each well. Wells were protected from light and incubated at 37 °C for 10-15 min, ensuring that the
two highest standards did not become saturated (based on visual examination of colour change). Fifty
microlitres of stop solution (sulphuric acid solution provided in USCN kit) was added and the
colour change was measured spectrophotometrically using the Perkin-Elmer Envision 2103
multilabel reader at a wavelength of 450 nm (540nm measurements were used to determine
background).
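The conversion from absorbance readings to serum concentrations can be sketched as follows. This is a minimal illustration only: it assumes piecewise-linear interpolation between standards (kit protocols often recommend a four-parameter logistic fit instead), and the standard curve and absorbance values are hypothetical. Only the 450 nm signal, the 540 nm background and the 1-in-5 dilution come from the protocol described above.

```python
DILUTION_FACTOR = 5  # samples were diluted 1-in-5 before measurement


def background_corrected(a450: float, a540: float) -> float:
    """Subtract the 540 nm background reading from the 450 nm signal."""
    return a450 - a540


def interpolate_concentration(signal, standards):
    """Piecewise-linear interpolation on a standard curve.

    standards: list of (concentration, corrected absorbance) pairs.
    """
    pts = sorted(standards, key=lambda p: p[1])
    if len(pts) < 2:
        raise ValueError("at least two standards are required")
    if not pts[0][1] <= signal <= pts[-1][1]:
        raise ValueError("signal outside the standard curve range")
    for (c_lo, s_lo), (c_hi, s_hi) in zip(pts, pts[1:]):
        if s_lo <= signal <= s_hi:
            if s_hi == s_lo:
                return c_lo
            frac = (signal - s_lo) / (s_hi - s_lo)
            return c_lo + frac * (c_hi - c_lo)
    raise ValueError("signal could not be placed on the standard curve")


# Hypothetical standard curve: (concentration in ug/L, corrected absorbance).
standards = [(0.0, 0.05), (1.0, 0.25), (5.0, 1.05), (10.0, 1.85)]

corrected = background_corrected(a450=0.70, a540=0.05)
diluted_conc = interpolate_concentration(corrected, standards)
serum_conc = diluted_conc * DILUTION_FACTOR  # back-calculate to undiluted serum
```

Readings above the highest standard raise an error here rather than being extrapolated, which mirrors the need to keep the top standards unsaturated.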
Table 3.1 Stage and Grade of 111 Pancreatic Cancer Serum Samples
Stage    Number of Samples    Number of Samples with Grade: I    I-II    I-III    II    III    Unknown
I 3 2 1
II 50 2 1 1 19 10 17
III 1 1
IV 3 1 2
"Unresectable"a 4 4
Unknown 50 1 49
Total 111 4 1 1 22 13 70
a Four cancer cases were reported as unresectable which, according to the 6th edition of the
American Joint Committee on Cancer (AJCC) staging system, implies stage III or greater, whereby patients
have locally advanced disease involving the celiac axis or superior mesenteric artery; stage IV is
the presence of distant metastasis [30].
Data Analysis and Statistics
The Mann-Whitney U-test was applied to assess the statistical significance of differences between
the medians of cases and controls at a 95% confidence level. Spearman's rank correlation coefficient was
calculated to evaluate correlation between CA19.9 and SYCN, CA19.9 and AGR2, and SYCN
and AGR2. The diagnostic value of the proteins was assessed using ROC curve analysis, and
AUC calculations were carried out using ROCR and pROC software (Swiss Institute of
Bioinformatics), with variances calculated using a bootstrap method for multiple biomarker
modeling. Statistical differences between AUCs were assessed as described previously using
DeLong's method [82].
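The relationship between the Mann-Whitney U statistic and the area under the ROC curve can be sketched as follows: the AUC of a single marker equals U divided by the number of case-control pairs. The marker values below are hypothetical, and the sketch does not reproduce the ROCR or pROC implementations used in this study.

```python
def mann_whitney_u(cases, controls):
    """Count case-control pairs in which the case value is higher.

    Tied pairs contribute 0.5 each.
    """
    u = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def auc(cases, controls):
    """AUC as the fraction of case-control pairs ranked correctly."""
    if not cases or not controls:
        raise ValueError("both groups must be non-empty")
    return mann_whitney_u(cases, controls) / (len(cases) * len(controls))


# Hypothetical marker concentrations.
cases = [12.0, 35.5, 8.1, 140.2, 22.4]
controls = [5.0, 9.3, 7.7, 14.9]

marker_auc = auc(cases, controls)
```

An AUC of 0.5 corresponds to a marker that ranks cases above controls no more often than chance, and 1.0 to perfect separation.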
3.3 Results
Table 3.1 shows the demographics of the 111 cancer serum samples. Table 3.2 shows the
distribution of age, CA19.9, SYCN and AGR2 in the cancer patients and normal controls.
Table 3.2 Distribution of serum SYCN, AGR2, CA19.9 and age in pancreatic cancer and
control serum.

Normal versus All Cancer
Marker     Disease State    N (a)    Min          Median    Max       Mean       p-value (b)
Age (c)    Normal           87       19           51        84        52         p<0.0001
           All Cancer       111      32           65        85        64
CA19.9     Normal           87       5.75         14.9      109.3     20.38      p<0.0001
           All Cancer       111      3            137.7     23700     1319.23
SYCN       Normal           87       Below LOD    2.84      76.36     6.54       p<0.0001
           All Cancer       111      0.56         10.93     110.8     23.9
AGR2       Normal           87       2.525        8.035     61        12.06      p<0.0001
           All Cancer       111      3.515        12.965    265.7     34.09

Normal versus Early Stage (d)
Marker     Disease State    N (a)    Min          Median    Max       Mean       p-value (b)
Age (c)    Normal           87       19           51        84        52         p<0.0001
           Early Stage      53       42           68        85        66
CA19.9     Normal           87       5.75         14.9      109.3     20.38      p<0.0001
           Early Stage      53       3            120.3     2184.5    343.23
SYCN       Normal           87       Below LOD    2.84      76.36     6.54       p<0.0001
           Early Stage      53       0.71         17.61     110.8     31.53
AGR2       Normal           87       2.525        8.035     61        12.06      p=0.0938
           Early Stage      53       3.515        8.89      215.9     23.95

(a) Number of samples
(b) Mann-Whitney U-test
(c) Age is in years; concentrations of AGR2 and SYCN are in µg/L; CA19.9 levels are in Units/mL
(d) Early Stage = stage I and II

Clinical information was not available for all patients; however 50 patients were reported as
having stage II disease when serum was collected. According to the American Joint Committee
on Cancer Staging System for Pancreatic Adenocarcinoma, this means that the patients have
potentially resectable disease with possible involvement of adjacent organs or venous structures;
however no involvement of the celiac axis or superior mesenteric artery [30]. Three patients had
confirmed stage I disease (potentially resectable pancreatic cancer confined to the pancreas).
Where possible, levels of markers were also assessed in these patients as a separate category
denoted as “early stage”. Clinical/pathological stage and history was not obtainable for a number
of samples and an adequate sample group with clinically/pathologically confirmed late stage
(stage III/IV) was not available to include early versus late-stage comparisons.
Figure 3.1 shows the distribution of CA19.9, SYCN and AGR2 in normal, early-stage
samples and all cancer samples. A clear elevation can be seen for CA19.9 for early stage and all
cancer samples (Fig 3.1a,b) and for SYCN (Fig 3.1c). While AGR2 appears elevated when all
cancer samples are considered, elevation in early-stage samples is less clear. This was shown
statistically through comparison of medians between the groups (Table 3.2). Comparisons
between all cancer and normal control samples were found to be statistically significant for all
three proteins (p<0.0001). Comparisons between early stage cancer and normal controls were
significant for CA19.9 and SYCN (p<0.0001 for both). AGR2 did not show a significant
increase in levels in the early stage versus control comparison (p=0.0938). The median ages of
the cancer and control groups were also found to be statistically different (p<0.0001) (Table 3.2).
Spearman's rank correlation coefficient was evaluated to assess correlation between the
three molecules for all cancer and controls (Figure 3.2). All combinations assessed (CA19.9 and
SYCN, CA19.9 and AGR2, and SYCN and AGR2) showed a significant correlation; however
ROC curve analysis showed the ability of SYCN and AGR2 to statistically enhance the
performance of CA19.9 alone (Figures 3.3 and 3.4).
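For reference, Spearman's rank correlation coefficient is the Pearson correlation of the rank-transformed values, with tied observations given their average rank. The sketch below is a minimal illustration under that definition; the function names are hypothetical, and it does not compute the associated p-value.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks are used, the coefficient is insensitive to the very wide concentration range seen for markers such as CA19.9.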
Figure 3.1 Distribution of serum CA19.9, SYCN and AGR2 in normal controls, early stage
(stage I and II, n = 53) and all pancreatic cancer patients (n=111). Plot a and b both show the
same data for CA19.9; however plot b is a magnified view showing values only from 0 units/mL
to 2,500 units/mL. The horizontal line indicates the median values for each group. Statistical
significance was calculated using Mann-Whitney U-test for all groups of early stage versus
normals and all cancer samples versus normals. All comparisons showed a statistically
significant difference between medians (p<0.0001), except for the comparison of AGR2 early
stage versus normals. See also Table 3.2.
Figure 3.2 Correlation between CA19.9 and SYCN (a), CA19.9 and AGR2 (b) and SYCN and AGR2 (c). The Spearman
correlation coefficient was significant for all comparisons.
To evaluate the diagnostic value of SYCN and AGR2, AUC was calculated using ROC
curve analysis (Figure 3.3). Individually, neither protein showed an improved AUC over CA19.9
alone (Fig 3.3a). Further biomarker model comparisons were performed using a bootstrap
method of analysis for modeling, and the combination of SYCN, AGR2 and CA19.9 performed
the best, with a statistically significant improvement in AUC when compared to CA19.9 alone
(AUC_{SYCN+AGR2+CA19.9} = 0.91, AUC_{CA19.9} = 0.84; p = 0.0007937).
We further performed ROC curve analysis on the clinically confirmed stage II cancer
samples with normal controls (Figure 3.4). In this analysis, SYCN performed the best (in terms
of AUC) of the three markers individually, with a slight improvement over CA19.9, although the
improvement was not significant (AUC_{SYCN} = 0.84; AUC_{CA19.9} = 0.82; p-value = 0.7846). When
assessed in combination, the combination of SYCN, AGR2 and CA19.9 performed the best, as
seen with all cancer cases, showing a statistically significant improvement in AUC over CA19.9
alone (AUC_{SYCN+AGR2+CA19.9} = 0.93, AUC_{CA19.9} = 0.82; p = 0.002718) (Fig 3.4b). The
combination of SYCN and AGR2 in the analysis with stage II patients and controls (Fig 3.4b)
had an AUC of 0.83 which was lower than the AUC of SYCN alone (0.84; Fig 3.4a). In the
assessment with all cancer patients however, the combination of SYCN and AGR2 resulted in a
higher AUC than either of these proteins alone.
Taken together, SYCN and AGR2 significantly enhanced the performance of CA19.9 in
this sample set, warranting further investigation of this novel panel in larger sample sets of
pancreatic cancer, benign and normal controls.
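A comparison of panel and single-marker AUCs of the kind reported above can be sketched with a paired, stratified percentile bootstrap. This is a swapped-in illustration and not the procedure used in this study, where statistical differences between AUCs were assessed with DeLong's method; the panel scores are assumed to come from some previously fitted combination model, and all values below are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the fraction of case-control pairs ranked correctly."""
    cases = [s for s, lab in zip(scores, labels) if lab == 1]
    controls = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return wins / (len(cases) * len(controls))


def bootstrap_auc_difference(panel, single, labels, n_boot=2000, seed=0):
    """Observed AUC difference (panel - single) with a 95% percentile interval.

    Cases and controls are resampled separately so that every bootstrap
    replicate keeps the original group sizes.
    """
    rng = random.Random(seed)
    case_idx = [i for i, lab in enumerate(labels) if lab == 1]
    ctrl_idx = [i for i, lab in enumerate(labels) if lab == 0]
    diffs = []
    for _ in range(n_boot):
        idx = [rng.choice(case_idx) for _ in case_idx]
        idx += [rng.choice(ctrl_idx) for _ in ctrl_idx]
        lab = [labels[i] for i in idx]
        diffs.append(
            auc([panel[i] for i in idx], lab) - auc([single[i] for i in idx], lab)
        )
    diffs.sort()
    lo = diffs[int(0.025 * n_boot)]
    hi = diffs[min(n_boot - 1, int(0.975 * n_boot))]
    observed = auc(panel, labels) - auc(single, labels)
    return observed, lo, hi


# Hypothetical data: 1 = cancer, 0 = control.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
single = [30, 12, 150, 9, 40, 10, 15, 8, 20, 11]             # single marker
panel = [0.9, 0.6, 0.95, 0.55, 0.8, 0.2, 0.5, 0.1, 0.4, 0.3]  # combined score

observed, ci_low, ci_high = bootstrap_auc_difference(panel, single, labels)
```

Resampling the same subjects for both scores preserves the correlation between the panel and the single marker, which is why a paired design is preferred when both are measured on the same samples.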
Figure 3.3 ROC curves for SYCN, AGR2 and CA19.9 individually (a) and in combination (b) for all cancer (n=111) and all
controls (n=87) with estimated AUC (95% confidence interval (CI)). Individually, CA19.9 performed best (AUC = 0.84) (a). In
combination, addition of SYCN and AGR2 significantly improved AUC of CA19.9 (p=0.0007937) (b).
2 candidates = SYCN and AGR2
Figure 3.4 ROC curves for SYCN, AGR2 and CA19.9 individually (a) and in combination (b) for stage II pancreatic cancer
(n=50) and all controls (n=87) with estimated AUC (95% confidence interval (CI)). Individually, SYCN performed best (AUC =
0.84) (a). In combination, addition of SYCN and AGR2 significantly improved AUC of CA19.9 (p=0.002718) (b). The combination
of SYCN and AGR2 also had higher AUC than CA19.9 alone; however the difference was not significant (b).
2 candidates = SYCN and AGR2
3.4 Discussion
The identification of novel biomarkers or biomarker panels with a high degree of
sensitivity and specificity can aid in the clinical management of pancreatic cancer patients and in
the detection of disease in the early stages of tumor development when patients can be most
optimally treated. CA19.9 is currently the most widely used clinical biomarker for pancreatic
cancer. It was discovered over 20 years ago using monoclonal antibody technology, where
tumor extracts and cell lines were used as immunogens, followed by selection for hybridoma
clones that recognized these tumor antigens [213]. This was also the case for several other
currently used cancer markers such as CA125 and CA15.3. Although CA19.9 lacks utility as a
marker for detection, new markers have not entered the clinic to replace or sufficiently
supplement CA19.9. The advent of new high throughput proteomic technologies has led to a
new wave of enthusiasm, and the emergence of oncoproteomics and proteomics-based discovery
studies [44,71,117,125]. In this regard, we previously characterized pancreatic cancer-related
biological fluids for the generation of candidate pancreatic cancer biomarkers. Our initial
verification studies led to five promising candidates – AGR2, SYCN, PIGR, OLFM4 and
COL6A1. Two of those five proteins, AGR2 and SYCN, were further investigated in this study.
Syncollin is a protein that is highly expressed in pancreatic acinar granules, with a few
recent studies describing its presence and function in granules in other tissue-types [212].
Digestive enzymes secreted by the pancreas are synthesized in their inactive form by ribosomes
on the endoplasmic reticulum (ER) [214]. Following insertion into the lumen of the ER and
transport to the Golgi, condensing vacuoles form which mature into zymogen granules that are
stored apically in acinar cells. Ca2+ causes fusion of granules with the plasma membrane and
exocytosis. Zymogen granule membrane proteins are key to the packaging of zymogens and
granule movement and fusion with the plasma membrane [214]. Syncollin is a zymogen granule
protein found on the inner surface of the granule membrane with a role in the concentration and
maturation of zymogens, as well as in the regulation of exocytosis [207, 211].
Syncollin has been identified in a qualitative proteomic analysis of pancreatic juice from
patients with pancreatic cancer and in serum from a murine model of pancreatic cancer [98,106];
however to our knowledge, this is the first report of its study in human serum. In the present
study we found it to be significantly elevated in patients with pancreatic cancer. Given that the
majority of pancreatic cancers are believed to arise from ductal cells (or acinar cells that undergo
acinar-to-ductal metaplasia losing acinar cell properties such as zymogen granules) [5,204],
elevation of SYCN in circulation may be a secondary effect of the growing tumor through local
tissue destruction. Release or leakage of proteins during invasion of pancreatic cancer has been
recently studied for the protein transthyretin (TTR), an islet cell protein that is elevated in
pancreatic juice from pancreatic cancer patients through destruction of islet cell architecture in
the presence of invasive cancer [215]. A similar phenomenon may be occurring with SYCN
release. Another key zymogen granule membrane protein is GP2 (glycoprotein 2). Elevated
levels of GP2 have also been reported in plasma from pancreatic cancer patients in comparison
to normal controls [165]. In the same study, GP2 levels showed better diagnostic utility for acute
pancreatitis [165]. This may also be the case for syncollin and our initial findings warrant further
investigation of this protein in larger sample sets which include benign diseases of the pancreas
such as pancreatitis.
AGR2 is a protein initially identified in Xenopus laevis that was shown to be crucial
during ectoderm developmental stages of embryogenesis for formation of anterior structures
[189]. Its role in normal human structures is still unclear; however it has been implicated in
many cancer types, including pancreatic cancer. In a recent proteomic analysis by Chen et al.
[153], AGR2 was found to be overexpressed in pancreatic juice from patients with pancreatic
intraepithelial neoplasia – III (PanIN3) and ELISA results showed this protein to have potential
diagnostic utility for pancreatic cancer in pancreatic juice; however these findings did not
translate into their serum analysis. AGR2 was highly overexpressed in our proteomic cell line
analysis and it was identified in the pancreatic juice. In our preliminary verification studies in 20
cancer and 20 normal plasma samples (described in the previous chapter), this protein also
showed a significant elevation in pancreatic cancer patients (p<0.0001; AUC = 0.95). In the
present study, AGR2 was significantly elevated when all pancreatic cancer cases were compared
to controls; however comparison of only early stage (stage I and II; n=53) cases did not show a
significant difference between cancer and controls. Literature findings support a role for AGR2
in the invasion and metastasis of cancer [192-194], and it may be that elevated levels of this
protein are seen in circulation in the later stages of pancreatic cancer. Given that confirmed stage
was not known in many samples in this study, firm conclusions cannot be made regarding this
without analysis in larger sample sizes from patients with early and late-stage disease.
Due to the lack of one single highly sensitive and specific marker for many diseases,
including for various measurable outcomes of pancreatic cancer, research has shifted to the
identification of panels of markers to achieve enhanced performance [216-218]. In the present
study, we demonstrate the ability of SYCN and AGR2 to significantly enhance the performance
of CA19.9 alone (AUC_{SYCN+AGR2+CA19.9} = 0.91, AUC_{CA19.9} = 0.84; p = 0.0007937). The median
ages of the cancer and control samples used in this study also differed significantly,
and further statistical analyses that take age into account are warranted. It is nevertheless
likely that the addition of other candidates derived from our proteomic analysis of cell
line conditioned media and pancreatic juice (described in the previous chapter) can further
improve the AUC.
Validation of biomarkers is a rigorous process [209]. The proteins preliminarily
validated in this study represent promising novel biomarkers for pancreatic cancer, with the
ability to significantly enhance performance of CA19.9 in discriminating cancer from controls.
In this respect, it is crucial that the proteins presented in this study be further investigated in
independent sample sets of serum from early and late-stage pancreatic cancer and other cancer
patients, benign controls (acute pancreatitis, chronic pancreatitis, benign lesions), preclinical
samples and healthy controls to further determine their clinical utility. As well, inclusion or
consideration of these proteins in biomarker panels for pancreatic cancer currently under
development is also warranted.
4.1 Summary
This thesis presents a study aimed at the identification of novel pancreatic cancer serum
biomarker candidates through proteomic technologies, followed by verification of selected
candidates using enzyme-linked immunosorbent assays. Mass spectrometry analysis of
conditioned media from six pancreatic cancer cell lines, one normal human pancreatic ductal
epithelial cell line and six pancreatic juice samples (in two pools) in triplicate, resulted in 3479
total proteins identified with high confidence. Subsequent application of bioinformatics-based
criteria centered on label-free protein quantification between the cancer and normal cell lines
and integrative analysis of the multiple biological fluids facilitated the generation of candidate
pancreatic cancer biomarkers. Preliminary verification of candidates in a small subset of plasma
samples resulted in the identification of five promising leads. Further verification of two of the
five proteins in serum demonstrated the ability of these proteins, when used in combination with
CA19.9, to significantly enhance the area under the curve of CA19.9 alone, warranting their
further and extended validation, as well as validation of the remaining candidates. Finally, this
thesis demonstrates the utility of mining and integrating multiple biological fluids for the
identification of putative cancer biomarkers.
Key Findings:
1. Mass Spectrometry Analysis:
a. Strong cation exchange (SCX) liquid chromatography was the first dimension of
fractionation used to minimize sample complexity and it was demonstrated that
pooling fractions based on their SCX elution profile could increase protein yield by
>50%.
b. Using two dimensional liquid chromatography-tandem mass spectrometry (2D-LC-
MS/MS), 3324 non-redundant proteins were identified with ≥ 2 peptides in the
triplicate analysis of the 7 cell lines (MIA-PaCa2, PANC1, BxPc3, CAPAN1,
CFPAC1, SU.86.86 and HPDE).
c. Using 2D-LC-MS/MS, 648 non-redundant proteins were identified with ≥ 2 peptides
in the triplicate analysis of two pancreatic juice pools containing six samples from
pancreatic ductal adenocarcinoma patients.
d. A combined total of 3479 proteins were identified from all cell lines and pancreatic
juice samples, of which ~40% were found to be extracellular or cell surface
annotated through Gene Ontology analysis.
e. Label-free protein quantification, comparing the average normalized spectral counts
from the three replicates of each cancer cell line to that of the HPDE normal cell line,
resulted in 63 extracellular and cell surface proteins that showed over 5-fold increase
in at least three cancer cell lines.
f. Integrative analysis of proteins common to the cancer cell line and pancreatic juice
proteomes (with further filtering using a pancreatic cancer ascites proteome;
unpublished) and study of tissue specificity, focusing on pancreas specific proteins,
resulted in the generation of further candidates.
g. Many proteins identified through the label-free protein quantification approach as
overexpressed in pancreatic cancer and through the integrated analysis have been
previously shown to be increased in pancreatic cancer serum, which helps to
internally validate our approach.
2. Preliminary Verification
a. Five proteins, AGR2, SYCN, OLFM4, PIGR and COL6A1 showed a significant
increase in plasma from 20 patients with pancreatic cancer in comparison to 20
controls using ELISA.
b. ROC curve analysis of these five proteins in panel showed a slight improvement to
the AUC of CA19.9 alone.
3. Extended Verification
a. SYCN and AGR2 were shown to significantly improve AUC of CA19.9 when all
three molecules were assessed in combination in 198 serum samples (111 samples
from pancreatic cancer patients and 87 from normal controls).
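The fold-change filter summarized in finding 1e can be sketched as follows. The thresholds (over 5-fold, at least three cancer cell lines, HPDE as the normal reference) come from the text; the pseudocount used to avoid division by zero, the function name and the example spectral counts are assumptions made for illustration only.

```python
def fold_change_candidates(counts, normal="HPDE", min_fold=5.0, min_lines=3,
                           pseudocount=1.0):
    """Select proteins elevated over the normal line in enough cancer lines.

    counts: {protein: {cell_line: [normalized spectral count per replicate]}}
    A protein is kept when the ratio of average counts (cancer / normal)
    exceeds min_fold in at least min_lines cancer cell lines.
    """
    selected = []
    for protein, by_line in counts.items():
        normal_mean = sum(by_line[normal]) / len(by_line[normal])
        hits = 0
        for line, replicates in by_line.items():
            if line == normal:
                continue
            cancer_mean = sum(replicates) / len(replicates)
            # Pseudocount (an assumption) keeps the ratio finite when a
            # protein is absent from the normal cell line.
            fold = (cancer_mean + pseudocount) / (normal_mean + pseudocount)
            if fold > min_fold:
                hits += 1
        if hits >= min_lines:
            selected.append(protein)
    return selected


# Hypothetical normalized spectral counts (three replicates per cell line).
counts = {
    "PROTEIN_X": {
        "HPDE": [1, 1, 1],
        "BxPc3": [20, 22, 18],
        "PANC1": [15, 14, 16],
        "CAPAN1": [30, 28, 32],
        "MIA-PaCa2": [2, 1, 3],
    },
    "PROTEIN_Y": {
        "HPDE": [10, 12, 11],
        "BxPc3": [12, 11, 13],
        "PANC1": [60, 70, 65],
        "CAPAN1": [9, 10, 8],
        "MIA-PaCa2": [11, 10, 12],
    },
}

candidates = fold_change_candidates(counts)
```

In this example only the first protein passes, since the second exceeds the 5-fold threshold in a single cancer cell line.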
4.2 Future Directions
The experimental data presented in this thesis has led to the identification of several
novel candidate serological biomarkers for pancreatic cancer, which require further verification
and validation to determine their diagnostic utility, and potential use in other areas of pancreatic
cancer management. In our extended verification, addition of SYCN and AGR2 in panel with
CA19.9 demonstrated a significant improvement in the AUC of CA19.9 alone. The addition of
other proteins to this panel that were shown to be increased in our preliminary verification phase
may further enhance its performance. It is also important to assess levels of these proteins
in benign disease, preclinical samples, and clearly defined early and late-stage samples in
independent sample sets to more thoroughly assess their potential as biomarkers.
In addition to detection, it is possible that these candidates may be useful in other areas
of clinical care, such as in monitoring response to treatment or detecting recurrence of disease.
In this respect, it would be useful to examine their levels in matched serum sets from pancreatic
cancer patients before and after treatment or before and after surgical resection, although
acquisition of such samples may be difficult. Similarly, detailed analysis of post translational
modifications (PTMs) such as glycosylation of the candidates verified here may further
enhance their utility as biomarkers by identification of unique disease-specific PTM patterns.
Not all of the candidates generated from the proteomic analysis presented in this thesis
were verified, partly due to a lack of assays and reagents for immunoassay-based verification. In
this regard, future directions also include verification of the remaining candidates that have not
previously been studied. This can be done through immunoassay-based technologies, if reagents
exist, or through development of mass spectrometry-based targeted protein quantification assays
such as multiple reaction monitoring (MRM). For such MRM-based approaches, given the limited
sensitivity of mass spectrometry for low-abundance proteins in serum, prefractionation or immunoextraction techniques to
enrich for candidates will likely be required prior to MRM.
It is quite possible that the proteins identified as deregulated in this study play a role in
the pathogenesis of pancreatic cancer. A detailed analysis of the mechanisms through which
they are increased or decreased and their role in pancreatic cancer may shed light on
tumorigenesis and cancer progression. Additionally, the comprehensive proteome of 3479
proteins presented in this thesis and the proteins identified as differentially expressed through
label-free protein quantification may aid other researchers in further prioritizing candidates in
future cancer therapeutic and diagnostic applications.
REFERENCES
1. Ross MH, Pawlina W. (2006) Histology A Text and Atlas. Baltimore, MD: Lippincott
Williams and Wilkins:594 – 602.
2. Bardeesy N, DePinho RA. (2002) Pancreatic cancer biology and genetics. Nat Rev Cancer.
2, 897-909.
3. Adsay N.V., Thirabanjasak D., Altinel D. (2008) Spectrum of human pancreatic neoplasia.
In: Lowy AM, Leach SD, Philip AP, eds. Pancreatic cancer – M.D. Anderson solid tumor
oncology series. New York: Springer Science+Business Media,LLC:3-26.
4. Maitra A., Hruban R. (2008) Pancreatic Cancer. Annu Rev Pathol Mech Dis. 3,157-188.
5. Stanger B.Z., Dor Y. (2006) Dissecting the cellular origins of pancreatic cancer. Cell Cycle.
5, 43-46
6. Zhu L, Shi G, Schmidt CM, Hruban RH, Konieczny SF. (2007) Acinar cells contribute to the
molecular heterogeneity of pancreatic intraepithelial neoplasia. Am J Pathol. 263-73.
7. Schmid RM. (2002) Acinar-to-ductal metaplasia in pancreatic cancer development. J Clin
Invest. 109, 1403-1404.
8. Tanaka M, Chari S, Adsay V, Fernandez-del Castillo C, Falconi M, Shimizu M, Yamaguchi
K, Yamao K, Matsuno S; International Association of Pancreatology. (2006) International
consensus guidelines for management of intraductal papillary mucinous neoplasms and
mucinous cystic neoplasms of the pancreas. Pancreatology. 6, 17-32.
9. Dunphy E.P. (2008) Pancreatic cancer: a review and update. Clin J Oncol Nurs. 12, 735-741.
10. Krishna NB, Mehra M, Reddy AV, Agarwal B. (2009) EUS/EUS-FNA for suspected
pancreatic cancer: influence of chronic pancreatitis and clinical presentation with or without
obstructive jaundice on performance characteristics. Gastrointest Endosc. 70, 70-79.
11. Zervos EE, Osborne D, Boe BA, Luzardo G, Goldin SB, Rosemurgy AS (2006) Prognostic
significance of new onset ascites in patients with pancreatic cancer. World J Surg Oncol. 4,
16.
12. Bachmann J, Ketterer K, Marsch C, Fechtner K, Krakowski-Roosen H, Büchler MW, Friess
H, Martignoni ME. (2009) Pancreatic cancer related cachexia: influence on metabolism and
correlation to weight loss and pulmonary function. BMC Cancer. 9, 255.
13. Stathis A, Moore MJ. (2010) Advanced pancreatic carcinoma: current treatment and future
challenges. Nat Rev Clin Oncol. 7, 163-172
14. Ghadirian P, Lynch HT, Krewski D. (2003) Epidemiology of pancreatic cancer: an
overview. Cancer Detect Prev. 27, 87-93.
15. Lowenfels AB, Maisonneuve P. (2006) Epidemiology and risk factors for pancreatic cancer.
Best Pract. Res. Clin. Gastroenterol. 20:197–209.
16. Brand RE, Lynch HT. (2000) Hereditary pancreatic adenocarcinoma. A clinical perspective.
Med Clin North Am. 84, 665-675.
17. Lynch HT, Smyrk T, Kern SE, Hruban RH, Lightdale CJ, Lemon SJ, Lynch JF, Fusaro LR,
Fusaro RM, Ghadirian P. (1996) Familial pancreatic cancer: a review. Semin Oncol. 23, 251-
275.
18. Ghadirian P, Boyle P, Simard A, Baillargeon J, Maisonneuve P, Perret C. (1991) Reported
family aggregation of pancreatic cancer within a population-based case-control study in the
Francophone community in Montreal, Canada. Int J Pancreatol 10, 183-196.
19. Tersmette AC, Petersen GM, Offerhaus GJ, Falatko FC, Brune KA, Goggins M, Rozenblum
E, Wilentz RE, Yeo CJ, Cameron JL, Kern SE, Hruban RH. (2001) Increased risk of incident
pancreatic cancer among first-degree relatives of patients with familial pancreatic cancer.
Clin Cancer Res. 7, 738-744.
20. Klapman J, Malafa MP. (2008) Early detection of pancreatic cancer: why, who, and how to
screen. Cancer Control. 15, 280-287.
21. Gruber SB, Entius MM, Petersen GM, Laken SJ, Longo PA, Boyer R, Levin AM, Mujumdar
UJ, Trent JM, Kinzler KW, Vogelstein B, Hamilton SR, Polymeropoulos MH, Offerhaus GJ,
Giardiello FM. (1998) Pathogenesis of adenocarcinoma in Peutz-Jeghers syndrome. Cancer
Res. 58, 5267-5270.
22. Giardiello FM, Brensinger JD, Tersmette AC, Goodman SN, Petersen GM, Booker SV,
Cruz-Correa M, Offerhaus JA (2000) Very high risk of cancer in familial Peutz-Jeghers
syndrome. Gastroenterology. 119, 1447-1453.
23. Lowenfels AB, Maisonneuve P, DiMagno EP, Elitsur Y, Gates LK Jr, Perrault J, Whitcomb
DC. (1997) Hereditary pancreatitis and the risk of pancreatic cancer. International Hereditary
Pancreatitis Study Group. J. Natl. Cancer Inst. 89, 442–446
24. Wolpin BM, Kraft P, Gross M, Helzlsouer K, Bueno-de-Mesquita HB, Steplowski E,
Stolzenberg-Solomon RZ, Arslan AA, Jacobs EJ, Lacroix A, Petersen G, Zheng W, Albanes
D, Allen NE, Amundadottir L, Anderson G, Boutron-Ruault MC, Buring JE, Canzian F,
Chanock SJ, Clipp S, Gaziano JM, Giovannucci EL, Hallmans G, Hankinson SE, Hoover
RN, Hunter DJ, Hutchinson A, Jacobs K, Kooperberg C, Lynch SM, Mendelsohn JB,
Michaud DS, Overvad K, Patel AV, Rajkovic A, Sanchéz MJ, Shu XO, Slimani N, Thomas
G, Tobias GS, Trichopoulos D, Vineis P, Virtamo J, Wactawski-Wende J, Yu K, Zeleniuch-
Jacquotte A, Hartge P, Fuchs CS. (2010) Pancreatic cancer risk and ABO blood group alleles:
results from the pancreatic cancer cohort consortium. Cancer Res. 70, 1015-1023.
25. Sarkar FH, Banerjee S, Li Y. (2007) Pancreatic cancer: pathogenesis, prevention and
treatment. Toxicol Appl Pharmacol. 224, 326-336.
26. Li D, Abbruzzese JL. (2010) New strategies in pancreatic cancer: emerging epidemiologic
and therapeutic concepts. Clin Cancer Res. 16, 4313-4318.
27. Fendrich V, Chen NM, Neef M, Waldmann J, Buchholz M, Feldmann G, Slater EP, Maitra
A, Bartsch DK. (2010) The angiotensin-I-converting enzyme inhibitor enalapril and aspirin
delay progression of pancreatic intraepithelial neoplasia and cancer formation in a genetically
engineered mouse model of pancreatic cancer. Gut. 59, 630-637.
28. Shirley A, Yeo CJ. Pancreaticoduodenectomy: Past and Present. (2008) In: Lowy AM,
Leach SD, Philip AP, eds. Pancreatic cancer – M.D. Anderson solid tumor oncology series.
New York: Springer Science+Business Media,LLC:313-327
29. Neoptolemos J, Büchler M, Stocken DD, Ghaneh P, Smith D, Bassi C, Moore M,
Cunningham D, Dervenis C, Goldstein D. (2009). ESPAC‑3(v2): A multicenter,
international, open‑label, randomized, controlled phase III trial of adjuvant
5‑fluorouracil/folinic acid (5‑FU/FA) versus gemcitabine (GEM) in patients with resected
pancreatic ductal adenocarcinoma [abstract]. J. Clin. Oncol. 27 (Suppl. 18), a4505.
30. Katz MH, Hwang R, Fleming JB, Evans DB. (2008) Tumor-node-metastasis staging of
pancreatic adenocarcinoma. CA Cancer J Clin. 58, 111-125.
31. Moore MJ, Goldstein D, Hamm J, Figer A, Hecht JR, Gallinger S, Au HJ, Murawa P, Walde
D, Wolff RA, Campos D, Lim R, Ding K, Clark G, Voskoglou-Nomikos T, Ptasynski M,
Parulekar W; National Cancer Institute of Canada Clinical Trials Group. (2007) Erlotinib
plus gemcitabine compared with gemcitabine alone in patients with advanced pancreatic
cancer: a phase III trial of the National Cancer Institute of Canada Clinical Trials Group. J.
Clin. Oncol. 25, 1960–1966.
32. Jones S., Zhang X., Parsons D.W., Lin J.C., Leary R.J., Angenendt P., Mankoo P., Carter H.,
Kamiyama H., Jimeno A., Hong S.M., Fu B., Lin M.T., Calhoun E.S., Kamiyama M., Walter
K., Nikolskaya T., Nikolsky Y., Hartigan J., Smith D.R., Hidalgo M., Leach S.D., Klein A.P.,
Jaffee E.M., Goggins M., Maitra A., Iacobuzio-Donahue C., Eshleman J.R., Kern S.E.,
Hruban R.H., Karchin R., Papadopoulos N., Parmigiani G., Vogelstein B., Velculescu V.E.,
Kinzler K.W. (2008) Core signaling pathways in human pancreatic cancers revealed by
global genomic analyses. Science. 321, 1801-1806.
33. Jemal A, Siegel R, Xu J, Ward E. (2010) Cancer statistics, 2010. CA Cancer J Clin. 60, 277-
300.
34. Boyle P, Levin B. World cancer report 2008. International Agency for Research on Cancer.
http://www.iarc.fr/en/publications/pdfs-online/wcr/ (Accessed November 2010).
35. Wray CJ, Ahmad SA. (2008) Controversies in the surgical management of pancreatic
cancer. In: Lowy AM, Leach SD, Philip AP, eds. Pancreatic cancer – M.D. Anderson solid
tumor oncology series. New York: Springer Science+Business Media,LLC:385-400.
36. Blackstock AW, Wentworth S. (2008) The evolution of chemoradiation strategies for locally
advanced pancreatic cancer. In: Lowy AM, Leach SD, Philip AP, eds. Pancreatic cancer –
M.D. Anderson solid tumor oncology series. New York: Springer Science+Business
Media,LLC:497-510.
37. Garofalo MC, Regine WF. Adjuvant chemoradiation for pancreatic cancer: past, present and
future. (2008) In: Lowy AM, Leach SD, Philip AP, eds. Pancreatic cancer – M.D. Anderson
solid tumor oncology series. New York: Springer Science+Business Media,LLC:535-547.
38. Ojeda-Fournier H, Choe KA. Imaging of pancreatic adenocarcinoma. (2008) In: Lowy AM,
Leach SD, Philip AP, eds. Pancreatic cancer – M.D. Anderson solid tumor oncology series.
New York: Springer Science+Business Media,LLC:255-270.
39. Soriano A, Castells A, Ayuso C, Ayuso JR, de Caralt MT, Ginès MA, Real MI, Gilabert R,
Quintó L, Trilla A, Feu F, Montanyà X, Fernández-Cruz L, Navarro S. (2004) Preoperative
staging and tumor resectability assessment of pancreatic cancer: prospective study
comparing endoscopic ultrasonography, helical computed tomography, magnetic resonance
imaging, and angiography. Am J Gastroenterol. 99, 492-501.
40. Ho JM, Eysselein VE, Stabile BE. (2008) The value of endoscopic ultrasonography in
predicting resectability and margins of resection for periampullary tumors. Am Surg.
74,1026-1029.
41. Irisawa A, Sato A, Sato M, Ikeda T, Suzuki R, Ohira H. (2009) Early diagnosis of small
pancreatic cancer: Role of endoscopic ultrasonography. Digestive Endoscopy. 21, S92-S96.
42. Zeron HM, Flores JRG, Prieto MLR. (2009) Limitations in improving detection of
pancreatic adenocarcinoma. Future Oncol. 5, 657-668.
43. Rosty C, Goggins M. (2002) Early detection of pancreatic carcinoma. Hematol Oncol Clin N
Am. 16, 37-52.
44. Kulasingam V, Diamandis EP. (2008) Strategies for discovering novel cancer biomarkers
through utilization of emerging technologies. Nat Clin Pract Oncol. 5, 588-599.
45. Rulyak SJ, Kimmey MB, Veenstra DL, Brentnall TA. (2003) Cost-effectiveness of
pancreatic cancer screening in familial pancreatic cancer kindreds. Gastrointest Endosc. 57,
23-29.
46. Evans DB, Rich TA. Cancer of the pancreas. (1997) In: DeVita HS, Rosenberg SA eds.
Cancer: principles and practice of oncology. Philadelphia: Lippincott-Raven.:1059-1060.
47. Kim YC, Kim HJ, Park JH, Park DI, Cho YK, Sohn CI, Jeon WK, Kim BI, Shin JH. (2009)
Can preoperative CA19-9 and CEA levels predict the resectability of patients with pancreatic
adenocarcinoma? J Gastroenterol Hepatol. 24, 1869-1875.
48. Yan L, McFaul C, Howes N, Leslie J, Lancaster G, Wong T, Threadgold J, Evans J, Gilmore
I, Smart H, Lombard M, Neoptolemos J, Greenhalf W. (2005) Molecular analysis to detect
pancreatic ductal adenocarcinoma in high-risk groups, Gastroenterology. 128, 2124–2130.
49. Bartels CL, Tsongalis GJ. (2009) MicroRNAs: novel biomarkers for human cancer. Clin
Chem. 55, 623-631.
50. Zhang X., Galardi E., Duquette M., Lawler J., Parangi S., (2005) Antiangiogenic treatment
with three thrombospondin-1 type 1 repeats versus gemcitabine in an orthotopic human
pancreatic cancer model. Clin. Cancer Res. 11, 5622–5630.
51. Tanase CP, Neagu M, Albulescu R, Hinescu ME. (2010) Advances in pancreatic cancer
detection. Adv Clin Chem. 51, 145-80.
52. Ishizone S, Yamauchi K, Kawa S, Suzuki T, Shimizu F, Harada O, Sugiyama A, Miyagawa
S, Fukuda M, Nakayama. (2006) Clinical utility of quantitative RT-PCR targeted to alpha1,
4-N-acetylglucosaminyltransferase mRNA for detection of pancreatic cancer. Cancer Sci. 97,
119–126.
53. Kohn EC, Azad N, Annunziata C, Dhamoon AS, Whiteley G. (2007) Proteomics as a tool
for biomarker discovery. Dis Markers. 23, 411-417.
54. Anderson NL, Anderson NG. (2002) The human plasma proteome: history, character, and
diagnostic prospects. Mol Cell Proteomics. 1, 845-867.
55. Jarjanazi H, Savas S, Pabalan N, Dennis JW, Ozcelik H. (2008) Biological implications of
SNPs in signal peptide domains of human proteins. Proteins. 70, 394-403.
56. Hon LS, Zhang Y, Kaminker JS, Zhang Z. (2009) Computational prediction of the
functional effects of amino acid substitutions in signal peptides using a model-based
approach. Hum Mutat. 30, 99-106.
57. Molina R, Jo J, Filella X, Zanon G, Pahisa J, Muñoz M, Farrus B, Latre ML, Gimenez N,
Hage M, Estape J, Ballesta AM. (1996) C-erbB-2 oncoprotein in the sera and tissue of
patients with breast cancer. Utility in prognosis. Anticancer Res. 16, 2295-2300.
58. Stacker SA, Achen MG, Jussila L, Baldwin ME, Alitalo K. (2002) Lymphangiogenesis and
cancer metastasis. Nat Rev Cancer. 2, 573-583.
59. Goonetilleke KS, Siriwardena AK. (2007) Systematic review of carbohydrate antigen (CA
19-9) as a biochemical marker in the diagnosis of pancreatic cancer. EJSO. 33, 266-270.
60. Magnani JL, Steplewski Z, Koprowski H, Ginsburg V. (1983) Identification of the
gastrointestinal and pancreatic cancer-associated antigen detected by monoclonal antibody
19-9 in the sera of patients as a mucin. Cancer Res. 43, 5489-5492.
61. Marrelli D, Caruso S, Pedrazzani C, Neri A, Fernandes E, Marini M, Pinto E, Roviello F.
(2009) CA19-9 serum levels in obstructive jaundice: clinical value in benign and malignant
conditions. Am J Surg. 198, 333-339.
62. Ventrucci M, Pozzato P, Cipolla A, Uomo G. (2009) Persistent elevation of serum CA 19-9
with no evidence of malignant disease. Dig Liver Dis. 41, 357-363.
63. Hatate K, Yamashita K, Hirai K, Kumamoto H, Sato T, Ozawa H, Nakamura T, Onozato W,
Kokuba Y, Ihara A, Watanabe M. (2008) Liver metastasis of colorectal cancer by protein-
tyrosine phosphatase type 4A, 3 (PRL-3) is mediated through lymph node metastasis and
elevated serum tumor markers such as CEA and CA19-9. Oncol Rep. 20, 737-743.
64. Rosen A, Linder S, Harmenberg U, Pegert S. (1993) Serum levels of CA 19-9 and CA50 in
relation to Lewis blood cell status in patients with malignant and benign pancreatic disease.
Pancreas. 8, 160-165.
65. Nazli O, Bozdag A, Tansug T, Kir R, Kaymak E. (2000) The diagnostic importance of CEA
and CA19-9 for the early diagnosis of pancreatic carcinoma. Hepatogastroenterology. 47,
1750 –1752.
66. Tsavaris N, Kosmas C, Papadoniou N, Kopteridis P, Tsigritis K, Dokou A, Sarantonis J,
Skopelitis H, Tzivras M, Gennatas K, Polyzos A, Papastratis G, Karatzas G, Papalambros A.
(2009) CEA and CA-19.9 serum tumor markers as prognostic factors in patients with locally
advanced (unresectable) or metastatic pancreatic adenocarcinoma: a retrospective analysis. J
Chemother. 21, 673-80.
67. Gold DV, Modrak DE, Ying Z, Cardillo TM, Sharkey RM, Goldenberg DM. (2006) New
MUC1 serum immunoassay differentiates pancreatic cancer from pancreatitis. J Clin Oncol.
24, 252-258.
68. Yates JR, Ruse CI, Nakorchevsky A. (2009) Proteomics by mass spectrometry: approaches,
advances, and applications. Annu Rev Biomed Eng. 11, 49-79.
69. Makawita S, Diamandis EP. (2010) The bottleneck in the cancer biomarker pipeline and
protein quantification through mass spectrometry-based approaches: current strategies for
candidate verification. Clin Chem. 56, 212-222.
70. Kapp, E. A., Schütz, F., Connolly, L. M., Chakel, J. A., Meza, J. E., Miller, C. A., Fenyo,
D., Eng, J. K., Adkins, J. N., Omenn, G. S., and Simpson, R. J. (2005) An evaluation,
comparison, and accurate benchmarking of several publicly available MS/MS search
algorithms: sensitivity and specificity analysis. Proteomics. 5, 3475–3490.
71. Domon, B., and Aebersold, R. (2006) Challenges and opportunities in proteomics data
analysis. Mol. Cell. Proteomics. 5, 1921–1926.
72. Hortin GL. (2006) The MALDI-TOF mass spectrometric view of the plasma proteome and
peptidome. Clin Chem. 52, 1223-1237.
73. Whelan LC, Power KA, McDowell DT, Kennedy J, Gallagher WM. (2008) Applications of
SELDI-MS technology in oncology. J Cell Mol Med. 12, 1535-1547.
74. Taylor GI. (1964) Disintegration of water drops in an electric field. Proc. Royal Soc. Lond.
280, 383–397.
75. Perry RH, Cooks RG, Noll RJ. (2008) Orbitrap mass spectrometry: instrumentation, ion
motion and applications. Mass Spectrom. Rev. 27, 661–699.
76. Hu Q, Noll RJ, Li H, Makarov A, Hardman M, Graham Cooks R. (2005) The Orbitrap: a
new mass spectrometer. J. Mass Spectrom. 40, 430–443.
77. Makarov A, Denisov E, Lange O, Horning S. (2006) Dynamic range of mass accuracy in
LTQ Orbitrap hybrid mass spectrometer. J. Am. Soc. Mass Spectrom. 17, 977–982.
78. Pepe MS, Etzioni R, Feng Z, Potter JD, Thompson ML, Thornquist M, Winget M, Yasui Y.
(2001) Phases of biomarker development for early detection of cancer. J Natl Cancer Inst.
93, 1054-1061.
79. Rifai N, Gillette MA, Carr SA. (2006) Protein biomarker discovery and validation: the long
and uncertain path to clinical utility. Nat Biotechnol. 24, 971-983.
80. Kulasingam V., Diamandis E.P. (2008) Tissue culture-based breast cancer biomarker
discovery platform. Int J Cancer. 123, 2007-2012.
81. Kitteringham NR, Jenkins RE, Lane CS, Elliott VL, Park BK. (2009) Multiple reaction
monitoring for quantitative biomarker analysis in proteomics and metabolomics. J
Chromatogr B Analyt Technol Biomed Life Sci. 877, 1229-1239.
82. DeLong ER, DeLong DM, Clarke-Pearson DL. (1988) Comparing the areas under two or
more correlated receiver operating characteristic curves: a nonparametric approach.
Biometrics. 44, 837-45.
83. Navaglia F, Fogar P, Basso D, Greco E, Padoan A, Tonidandel L, Fadi E, Zambon C,
Bozzato D, Moz S, Seraglia R, Pedrazzoli S, Plebani M. (2009) Pancreatic cancer biomarkers
discovery by surface-enhanced laser desorption and ionization time-of-flight mass
spectrometry. Clin Chem Lab Med. 47, 713-723.
84. Fiedler GM, Leichtle AB, Kase J, Baumann S, Ceglarek U, Felix K, Conrad T, Witzigmann
H, Weimann A, Schutte C, Hauss J, Buchler M, Thiery J. (2009) Serum peptidome profiling
revealed platelet factor 4 as a potential discriminating peptide associated with pancreatic
cancer. Clin Cancer Res. 15, 3812-3819.
85. Kojima K, Asmellash S, Klug CA, Grizzle WE, Mobley JA, Christein JD. (2008) Applying
proteomic-based biomarker tools for the accurate diagnosis of pancreatic cancer. J
Gastrointest Surg. 12, 1683-1690.
86. Sun Z, Zhu Y, Wang F, Chen R, Peng T, Fan Z, Xu Z, Miao Y. (2007) Serum proteomic-
based analysis of pancreatic carcinoma for the identification of potential cancer biomarkers.
Biochimica et Biophysica Acta. 1774, 765-771.
87. Deng R, Lu Z, Chen Y, Zhou L, Lu X. (2007) Plasma proteomic analysis of pancreatic
cancer by 2-dimensional gel electrophoresis. Pancreas. 34, 310-317.
88. Lin Y, Goedegebuure P, Tan M, Gross J, Malone J, Feng S, Larson J, Phommaly C,
Trinkaus K, Townsend R, Linehan D. (2006) Proteins associated with disease and clinical
course in pancreas cancer: A proteomic analysis of plasma in surgical patients. J Proteome
Res. 5, 2169-2176.
89. Bloomston M, Zhou J, Rosemurgy A, Frankel W, Muro-Cacho C, Yeatman TJ. (2006)
Fibrinogen: overexpression in pancreatic cancer identified by large-scale proteomic analysis
of serum samples. Cancer Res. 66, 2592-2599.
90. Yu K, Rustgi AK, Blair I. (2005) Characterization of proteins in human pancreatic cancer
serum using differential gel electrophoresis and tandem mass spectrometry. J Proteome Res.
4, 1742-1751.
91. Chen J, Ni R, Xiao M, Guo J, Zhou J. (2009) Comparative proteomic analysis of
differentially expressed proteins in human pancreatic cancer tissue. Hepatobiliary Pancreat
Dis Int. 8, 193-200.
92. Chung J, Oh M, Choi S, Bae C. (2008) Proteomic analysis to identify biomarker proteins in
pancreatic ductal adenocarcinoma. ANZ J Surg. 78, 245-251.
93. Chen R, Brentnall T, Pan S, Cooke K, Moyes K, Lane Z, Crispin D, Goodlett DR, Aebersold
R, Bronner M. (2007) Quantitative proteomics analysis reveals that proteins differentially
expressed in chronic pancreatitis are also frequently involved in pancreatic cancer. Mol Cell
Proteomics. 6, 1331-1342.
94. Qi T, Han J, Cui Y, Zong M, Liu X, Zhu B. (2008) Comparative proteomic analysis for the
detection of biomarkers in pancreatic ductal adenocarcinoma. J Clin Pathol. 61, 49–58.
95. Scarlett CJ, Smith R, Saxby A, Nielsen A, Sarma J, Wilson S, Baxter R. (2006) Proteomic
classification of pancreatic adenocarcinoma tissue using protein chip technology.
Gastroenterology. 130, 1670-1678.
96. Gronborg M, Kristiansen T, Iwahori A, Chang R, Reddy R, Sato N, Molina H, Jensen O,
Hruban R, Goggins M, Maitra A, Pandey A. (2006) Biomarker discovery from pancreatic
cancer secretome using a differential proteomic approach. Mol Cell Proteomics. 5, 157-
171.
97. Mauri P, Scarpa A, Nascimbeni A, Benazzi L, Parmagnani E, Mafficini A, Peruta M, Bassi
C, Miyazaki K, Sorio C. (2005) Identification of proteins released by pancreatic cancer cells
by multidimensional protein identification technology: a strategy for identification of novel
cancer markers. FASEB J. 19, 1125-1127.
98. Grønborg M, Bunkenborg J, Kristiansen TZ, Jensen ON, Yeo CJ, Hruban RH, Maitra A,
Goggins MG, Pandey A. (2004) Comprehensive proteomic analysis of human pancreatic juice. J
Proteome Res. 3, 1042-55.
99. Chen R, Pan S, Yi E, Donohoe S, Bronner M, Potter J, Goodlett D, Aebersold R, Brentnall
T. (2006) Quantitative proteomic profiling of pancreatic cancer juice. Proteomics. 6, 3871-
3879.
100. Chen R, Pan S, Cooke K, Moyes K, Bronner M, Goodlett D, Aebersold R, Brentnall T.
(2007) Comparison of pancreas juice proteins from cancer versus pancreatitis using
quantitative proteomic analysis. Pancreas. 34, 70-79.
101. Zhou L, Lu Z, Yang A, Deng R, Mai C, Sang X, Faber K, Lu X. (2007) Comparative
proteomic analysis of human pancreatic juice: Methodological study. Proteomics. 1345-
1355.
102. Tian M, Cui Y, Song G, Zong M, Zhou X, Chen Y, Han J. (2008) Proteomic analysis
identifies MMP-9, DJ-1 and A1BG as overexpressed proteins in pancreatic juice from
pancreatic ductal adenocarcinoma patients. BMC Cancer. 8, 241-251.
103. Rosty C, Christa L, Kuzdzal S, Baldwin W, Zahurak M, Carnot F, Chan D, Canto M,
Lillemoe K, Cameron J, Yeo CJ, Hruban R, Goggins M. (2002) Identification of
hepatocarcinoma-intestine-pancreas/pancreatitis-associated protein I as a biomarker for
pancreatic ductal adenocarcinoma by protein biochip technology. Cancer Res. 62, 1868-
1875.
104. Ke E, Patel BB, Liu T, Li XM, Haluszka O, Hoffman JP, Ehya H, Young NA, Watson
JC, Weinberg DS, Nguyen MT, Cohen SJ, Meropol NJ, Litwin S, Tokar JL, Yeung AT.
(2009) Proteomic analyses of pancreatic cyst fluids. Pancreas. 38, e33-42.
105. Honda K, Hayashida Y, Umaki T, Okusaka T, Kosuge T, Kikuchi S, Endo M, Tsuchida A,
Aoki T, Itoi T, Moriyasu F, Hirohashi S, Yamada T. (2005) Possible detection of pancreatic
cancer by plasma protein profiling. Cancer Res. 65, 10613-10622.
106. Faca VM, Song KS, Wang H, Zhang Q, Krasnoselsky AL, Newcomb LF, Plentz RR,
Gurumurthy S, Redston MS, Pitteri SJ, Pereira-Faca SR, Ireton RC, Katayama H, Glukhova
V, Phanstiel D, Brenner DE, Anderson MA, Misek D, Scholler N, Urban ND, Barnett MJ,
Edelstein C, Goodman GE, Thornquist MD, McIntosh MW, DePinho RA, Bardeesy N,
Hanash SM. (2008) A mouse to human search for plasma proteome changes associated with
pancreatic tumor development. PLoS Med. 5, e123.
107. Diamandis EP. (2004) Mass spectrometry as a diagnostic and a cancer biomarker discovery
tool: opportunities and potential limitations. Mol Cell Proteomics. 3, 367-378.
108. Kondo,T. (2008) Tissue proteomics for cancer biomarker development: laser
microdissection and 2D-DIGE. BMB. Rep. 41, 626-634.
109. Korc M. (2007) Pancreatic cancer-associated stroma production. Am J Surg. 194, 84-86.
110. Sedlaczek P, Frydecka I, Gabryś M, Van Dalen A, Einarsson R, Harłozińska A. (2002)
Comparative analysis of CA125, tissue polypeptide specific antigen, and soluble interleukin-
2 receptor alpha levels in sera, cyst, and ascitic fluids from patients with ovarian carcinoma.
Cancer. 95, 1886-1893.
111. Kuk,C., Kulasingam,V., Gunawardana,C.G., Smith,C.R., Batruch,I, Diamandis,E.P. (2009)
Mining the ovarian cancer ascites proteome for potential ovarian cancer biomarkers. Mol.
Cell Proteomics. 8, 661-669.
112. Gortzak-Uzan,L., Ignatchenko,A., Evangelou,A.I., Agochiya,M., Brown,K.A., St Onge,P.,
Kireeva,I., Schmitt-Ulms,G., Brown,T.J., Murphy,J., Rosen,B., Shaw,P., Jurisica,I,
Kislinger,T. (2009) A proteome resource of ovarian cancer ascites: integrated proteomic and
bioinformatic analyses to identify putative biomarkers. J. Proteome Res. 7, 339-351.
113. Sipos B., Möser S., Kalthoff H., Török V., Löhr M., Klöppel G. (2003) A comprehensive
characterization of pancreatic ductal carcinoma cell lines: towards the establishment of an in
vitro research platform. Virchows Arch. 442, 444-452.
114. Deer E.L., González-Hernández J., Coursen J.D., Shea J.E., Ngatia J., Scaife C.L., Firpo
M.A., Mulvihill S.J. (2010) Phenotype and genotype of pancreatic cancer cell lines.
Pancreas. 39, 425-435.
115. Wistuba II, Behrens C, Milchgrub S, Syed S, Ahmadian M, Virmani AK, Kurvari V,
Cunningham TH, Ashfaq R, Minna JD, Gazdar AF. (1998) Comparison of features of human
breast cancer cell lines and their corresponding tumors. Clin Cancer Res 4, 2931-2938.
116. Wistuba II, Bryant D, Behrens C, Milchgrub S, Virmani AK, Ashfaq R, Minna JD, Gazdar
AF. (1999) Comparison of features of human lung cancer cell lines and their corresponding
tumors. Clin Cancer Res. 5, 991-1000.
117. Kulasingam V., Pavlou M.P., Diamandis E.P. (2010) Integrating high-throughput
technologies in the quest for effective biomarkers for ovarian cancer. Nat Rev Cancer.10,
371-378.
118. Koliopanos A, Avgerinos C, Paraskeva C, Touloumis Z, Kelgiorgi D, Dervenis C. (2008)
Molecular aspects of carcinogenesis in pancreatic cancer. Hepatobiliary Pancreat Dis Int.
7, 345-356.
119. Whiteside TL. (2008) The tumor microenvironment and its role in promoting tumor
growth. Oncogene. 27, 5904–5912.
120. Domon B, Aebersold R. (2006) Mass spectrometry and protein analysis. Science. 312, 212-
7.
121. Jemal A., Siegel R., Ward E., Hao Y., Xu J., Thun MJ. (2009) Cancer statistics, 2009. CA
Cancer J Clin. 59, 22-49.
122. Ringel J., Lohr M. (2003) The MUC gene family: their role in diagnosis and early detection
of pancreatic cancer. Mol Cancer. 2, 9.
123. Robin X., Turck N., Hainard A., Lisacek F., Sanchez J.C., Müller M. (2009)
Bioinformatics for protein biomarker panel classification: what is needed to bring biomarker
panels into in vitro diagnostics? Expert Rev Proteomics. 6, 675-689.
124. Yurkovetsky Z.R., Linkov F.Y., Malehorn D.E., Lokshin A.E. (2006) Multiple biomarker
panels for early detection of ovarian cancer. Future Oncol. 2, 733-741.
125. Hanash S., Taguchi A. (2010) The grand challenge to decipher the cancer proteome. Nat
Rev Cancer. 10, 652-660.
126. Farina A, Dumonceau JM, Frossard JL, Hadengue A, Hochstrasser DF, Lescuyer P. (2009)
Proteomic analysis of human bile from malignant biliary stenosis induced by pancreatic
cancer. J Proteome Res. 8, 159-69.
127. Kulasingam V., Diamandis E.P. (2007) Proteomics analysis of conditioned media from
three breast cancer cell lines. Mol Cell Proteomics. 6, 1997-2011.
128. Sardana G., Jung K., Stephan C., Diamandis E.P. (2008) Proteomic analysis of conditioned
media from the PC3, LNCaP, and 22Rv1 prostate cancer cell lines: discovery and validation
of candidate prostate cancer biomarkers. J Proteome Res. 7, 3329-3338.
129. Gunawardana C.G., Kuk C., Smith C.R., Batruch I., Soosaipillai A., Diamandis E.P. (2009)
Comprehensive analysis of conditioned media from ovarian cancer cell lines identifies novel
candidate markers of epithelial ovarian cancer. J Proteome Res. 8, 4705-4713.
130. Planque C., Kulasingam V., Smith C.R., Reckamp K., Goodglick L., Diamandis E.P.
(2009) Identification of five candidate lung cancer biomarkers by proteomics analysis of
conditioned media of four lung cancer cell lines. Mol Cell Proteomics. 8, 2746-2758.
131. Wu C.C., Hsu C.W., Chen C.D., Yu C.J., Chang K.P., Tai D.I., Liu H.P., Su W.H., Chang
Y.S., Yu J.S. (2010) Candidate serological biomarkers for cancer identified from the
secretomes of 23 cancer cell lines and the human protein atlas. Mol Cell Proteomics. 9, 1100-
1117.
132. Xue H., Lü B., Zhang J., Wu M., Huang Q., Wu Q., Sheng H., Wu D., Hu J., Lai M. (2010)
Identification of serum biomarkers for colorectal cancer metastasis using a differential
secretome approach. J Proteome Res. 9, 545-555.
133. Feng X.P., Yi H., Li M.Y., Li X.H., Yi B., Zhang P.F., Li C., Peng F., Tang C.E., Li J.L.,
Chen Z.C., Xiao Z.Q. (2010) Identification of biomarkers for predicting nasopharyngeal
carcinoma response to radiotherapy by proteomics. Cancer Res. 70, 3450-3462.
134. Schiarea S., Solinas G., Allavena P., Scigliuolo G.M., Bagnati R., Fanelli R., Chiabrando
C. (2010) Secretome analysis of multiple pancreatic cancer cell lines reveals perturbations of
key functional networks. J Proteome Res. 9, 4376-4392.
135. Furukawa T., Duguid W.P., Rosenberg L., Viallet J., Galloway D.A., Tsao M.S. (1996)
Long-term culture and immortalization of epithelial cells from normal adult human
pancreatic ducts transfected by the E6E7 gene of human papilloma virus 16. Am J Pathol.
148, 1763-1770.
136. Sedmak J.J., Grossberg S.E. (1977) A rapid, sensitive, and versatile assay for protein using
Coomassie brilliant blue G250. Anal Biochem. 79, 544-552.
137. Itzhaki R.F., Gill D.M. (1964) A micro-biuret method for estimating proteins. Anal Biochem. 9,
401-410.
138. Caraux G., Pinloche S. (2005) PermutMatrix: a graphical environment to arrange gene
expression profiles in optimal linear order. Bioinformatics. 21, 1280-1281.
139. Meunier B., Dumas E., Piec I., Béchet D., Hébraud M., Hocquette J.F. (2007) Assessment
of hierarchical clustering methodologies for proteomic data mining. J Proteome Res. 6, 358-
366.
140. Luo LY, Soosaipillai A, Grass L, Diamandis EP. (2006) Characterization of human
kallikreins 6 and 10 in ascites fluid from ovarian cancer patients. Tumour Biol. 27, 227-234.
141. Shaw, J. L., and Diamandis, E. P. (2007) Distribution of 15 human kallikreins in tissues
and biological fluids. Clin. Chem. 53, 1423–1432.
142. Higdon R., Hogan J.M., Van Belle G., Kolker E. (2005) Randomized sequence databases
for tandem mass spectrometry peptide and protein identification. OMICS. 9, 364-379.
143. Elias, J. E., Gygi, S. P. (2007) Target-decoy search strategy for increased confidence in
large-scale protein identifications by mass spectrometry. Nat. Methods. 4, 207-214.
144. Choi, H., Nesvizhskii, A. I. (2008) False discovery rates and related statistical
concepts in mass spectrometry-based proteomics. J. Proteome. Res. 7, 47-50.
145. Reddi K.K., Holland J.F. (1976) Elevated serum ribonuclease in patients with pancreatic
cancer. Proc Natl Acad Sci U S A. 73, 2308-2310.
146. Harsha H.C., Kandasamy K., Ranganathan P., Rani S., Ramabadran S., Gollapudi S.,
Balakrishnan L., Dwivedi S.B., Telikicherla D., Selvan L.D., Goel R., Mathivanan S.,
Marimuthu A., Kashyap M., Vizza R.F., Mayer R.J., Decaprio J.A., Srivastava S., Hanash
S.M., Hruban R.H., Pandey A. (2009) A compendium of potential biomarkers of pancreatic
cancer. PLoS Med. 6, e1000046.
147. Maker AV, Katabi N, Gonen M, Dematteo RP, D'Angelica MI, Fong Y, Jarnagin WR,
Brennan MF, Allen PJ. (2010) Pancreatic Cyst Fluid and Serum Mucin Levels Predict
Dysplasia in Intraductal Papillary Mucinous Neoplasms of the Pancreas. Ann Surg Oncol.
Aug 18. [Epub ahead of print].
148. Itkonen O., Koivunen E., Hurme M., Alfthan H., Schröder T., Stenman U.H. (1990) Time-
resolved immunofluorometric assays for trypsinogen-1 and 2 in serum reveal preferential
elevation of trypsinogen-2 in pancreatitis. J Lab Clin Med. 115, 712-718.
149. Hanas J.S., Hocker J.R., Cheung J.Y., Larabee J.L., Lerner M.R., Lightfoot S.A., Morgan
D.L., Denson K.D., Prejeant K.C., Gusev Y., Smith B.J., Hanas R.J., Postier R.G., Brackett
D.J. (2008) Biomarker identification in human pancreatic cancer sera. Pancreas. 36, 61-69.
150. Irigoyen Oyarzabal A.M., Amiguet García J.A., López Vivanco G., Genollá Subirats J.,
Muñoz Villafranca M.C., Ojembarrena Martínez E., Liso Irurzun P. (2003) Tumoral markers
and acute-phase reactants in the diagnosis of pancreatic cancer. Gastroenterol Hepatol. 26,
624-629.
151. Märten A., Büchler M.W., Werft W., Wente M.N., Kirschfink M., Schmidt J. (2010)
Soluble iC3b as an early marker for pancreatic adenocarcinoma is superior to CA19.9 and
radiology. J Immunother. 33, 219-224.
152. Kuhlmann K.F., van Till J.W., Boermeester M.A., de Reuver P.R., Tzvetanova I.D.,
Offerhaus G.J., Ten Kate F.J., Busch O.R., van Gulik T.M., Gouma D.J., Crawford H.C.
(2007) Evaluation of matrix metalloproteinase 7 in plasma and pancreatic juice as a
biomarker for pancreatic cancer. Cancer Epidemiol Biomarkers Prev. 16, 886-891.
153. Chen R., Pan S., Duan X., Nelson B.H., Sahota R.A., de Rham S., Kozarek R.A., McIntosh
M., Brentnall T.A. (2010) Elevated level of anterior gradient-2 in pancreatic juice from
patients with pre-malignant pancreatic neoplasia. Mol Cancer. 9, 149.
154. Koopmann J., Buckhaults P., Brown D.A., Zahurak M.L., Sato N., Fukushima N., Sokoll
L.J., Chan D.W., Yeo C.J., Hruban R.H., Breit S.N., Kinzler K.W., Vogelstein B., Goggins
M. (2004) Serum macrophage inhibitory cytokine 1 as a marker of pancreatic and other
periampullary cancers. Clin Cancer Res. 10, 2386-2392.
155. Koopmann J., Rosenzweig C.N., Zhang Z., Canto M.I., Brown D.A., Hunter M., Yeo C.,
Chan D.W., Breit S.N., Goggins M. (2006) Serum markers in patients with resectable
pancreatic adenocarcinoma: macrophage inhibitory cytokine 1 versus CA19-9. Clin Cancer
Res. 12, 442-446.
156. Moniaux N., Chakraborty S., Yalniz M., Gonzalez J., Shostrom V.K., Standop J., Lele
S.M., Ouellette M., Pour P.M., Sasson A.R., Brand R.E., Hollingsworth M.A., Jain M., Batra
S.K. (2008) Early diagnosis of pancreatic cancer: neutrophil gelatinase-associated lipocalin as
a marker of pancreatic intraepithelial neoplasia. Br J Cancer. 98, 1540-1547.
157. Saha S., Harrison S.H., Shen C., Tang H., Radivojac P., Arnold R.J., Zhang X., Chen J.Y.
(2008) HIP2: an online database of human plasma proteins from healthy individuals. BMC
Med Genomics. 1, 12.
158. Xiao S.J., Zhang C., Zou Q., Ji Z.L. (2010) TiSGeD: a database for tissue-specific genes.
Bioinformatics. 26, 1273-1275.
159. Liu X., Yu X., Zack D.J., Zhu H., Qian J. (2008) TiGER: a database for tissue-specific
gene expression and regulation. BMC Bioinformatics. 9, 271.
160. Pontius J.U., Wagner L., Schuler G.D. (2003) UniGene: a unified view of the
transcriptome. In: The NCBI Handbook. Bethesda (MD): National Center for
Biotechnology Information: pp. 21.1-21.12.
161. Pontén F., Jirström K., Uhlen M. (2008) The Human Protein Atlas--a tool for pathology. J
Pathol. 216, 387-393.
162. Matsugi S., Hamada T., Shioi N., Tanaka T., Kumada T., Satomura S. (2007) Serum
carboxypeptidase A activity as a biomarker for early-stage pancreatic carcinoma. Clin Chim
Acta. 378, 147-153.
163. Adrian T.E., Besterman H.S., Mallinson C.N., Pera A., Redshaw M.R., Wood T.P., Bloom
S.R. (1979) Plasma trypsin in chronic pancreatitis and pancreatic adenocarcinoma. Clin Chim
Acta. 97, 205-212.
164. Artigas JM, Garcia ME, Faure MR, Gimeno AM. (1981) Serum trypsin levels in acute
pancreatic and non-pancreatic abdominal conditions. Postgrad Med J. 57, 219-222.
165. Hao Y, Wang J, Feng N, Lowe AW. (2004) Determination of plasma glycoprotein 2 levels
in patients with pancreatic disease. Arch Pathol Lab Med. 128, 668-674.
166. Hayakawa T, Kondo T, Shibata T, Kitagawa M, Sakai Y, Sobajima H, Tanikawa M, Nakae
Y, Hayakawa S, Katsuzaki T. (1993) Serum pancreatic stone protein in pancreatic
diseases. Int J Pancreatol. 13, 97-103.
167. Borgström A, Regnér S. (2005) Active carboxypeptidase B is present in free form in serum
from patients with acute pancreatitis. Pancreatology. 5, 530-536.
168. Hayakawa T, Kondo T, Shibata T, Kitagawa M, Ono H, Sakai Y, Kiriyama S. (1989)
Enzyme immunoassay for serum pancreatic lipase in the diagnosis of pancreatic diseases.
Gastroenterol Jpn. 24, 556-60.
169. Smith RC, Southwell-Keely J, Chesher D. (2005) Should serum pancreatic lipase replace
serum amylase as a biomarker of acute pancreatitis? ANZ J Surg. 75, 399-404.
170. Junge W, Leybold K. (1982) Detection of colipase in serum and urine of pancreatitis
patients. Clin Chim Acta. 123, 293-302.
171. Pasanen PA, Eskelinen M, Partanen K, Pikkarainen P, Penttilä I, Alhava E. (1994)
Tumour-associated trypsin inhibitor in the diagnosis of pancreatic carcinoma. J Cancer Res
Clin Oncol. 120, 494-497.
172. Funakoshi A, Yamada Y, Ito T, Ishikawa H, Yokota M, Shinozaki H, Wakasugi H, Misaki
A, Kono M. (1991) Clinical usefulness of serum phospholipase A2 determination in patients
with pancreatic diseases. Pancreas. 6, 588-594.
173. Hanahan D., Weinberg R.A. (2000) The hallmarks of cancer. Cell. 100, 57-70.
174. Sobel RE, Sadar MD. (2005) Cell lines used in prostate cancer research: a compendium of
old and new lines--part 1. J Urol. 173, 342-359.
175. Barnea E, Sorkin R, Ziv T, Beer I, Admon A. (2005) Evaluation of prefractionation
methods as a preparatory step for multidimensional based chromatography of serum proteins.
Proteomics. 5, 3367-3375.
176. Slebos RJ, Brock JW, Winters NF, Stuart SR, Martinez MA, Li M, Chambers MC,
Zimmerman LJ, Ham AJ, Tabb DL, Liebler DC. (2008) Evaluation of strong cation exchange
versus isoelectric focusing of peptides for multidimensional liquid chromatography-tandem
mass spectrometry. J Proteome Res. 7, 5286-5294.
177. Das S, Bosley AD, Ye X, Chan KC, Chu I, Green JE, Issaq HJ, Veenstra TD, Andresson T.
(2010) Comparison of Strong Cation Exchange and SDS-PAGE Fractionation for Analysis of
Multiprotein Complexes. J Proteome Res. 9, 6696-6704.
178. Fang Y, Robinson DP, Foster LJ. (2010) Quantitative analysis of proteome coverage and
recovery rates for upstream fractionation methods in proteomics. J Proteome Res. 9, 1902-12.
179. Zhu W, Smith JW, Huang CM. (2010) Mass spectrometry-based label-free quantitative
proteomics. J Biomed Biotechnol. Epub 2009 Nov 10.
180. Bachi A, Bonaldi T. (2008) Quantitative proteomics as a new piece of the systems biology
puzzle. J Proteomics. 71, 357-367.
181. Ishihama Y, Oda Y, Tabata T, Sato T, Nagasu T, Rappsilber J, Mann M. (2005)
Exponentially modified protein abundance index (emPAI) for estimation of absolute protein
amount in proteomics by the number of sequenced peptides per protein. Mol Cell Proteomics.
4, 1265-1272.
182. Lu P., Vogel C., Wang R., Yao X., Marcotte E.M. (2007) Absolute protein expression
profiling estimates the relative contributions of transcriptional and translational regulation.
Nat Biotechnol. 25, 117–124.
183. Liu H., Sadygov R.G., Yates III J.R. (2004) A model for random sampling and estimation
of relative protein abundance in shotgun proteomics. Anal Chem. 76, 4193–4201.
184. Collier TS, Sarkar P, Franck WL, Rao BM, Dean RA, Muddiman DC. (2010) Direct
Comparison of Stable Isotope Labeling by Amino Acids in Cell Culture and Spectral
Counting for Quantitative Proteomics. Anal Chem. [Epub ahead of print]
185. Kakisaka T, Kondo T, Okano T, Fujii K, Honda K, Endo M, Tsuchida A, Aoki T, Itoi T,
Moriyasu F, Yamada T, Kato H, Nishimura T, Todo S, Hirohashi S. (2007) Plasma
proteomics of pancreatic cancer patients by multi-dimensional liquid chromatography and
two-dimensional difference gel electrophoresis (2D-DIGE): up-regulation of leucine-rich
alpha-2-glycoprotein in pancreatic cancer. J Chromatogr B Analyt Technol Biomed Life Sci.
852, 257-267.
186. Inami K, Kajino K, Abe M, Hagiwara Y, Maeda M, Suyama M, Watanabe S, Hino O.
(2008) Secretion of N-ERC/mesothelin and expression of C-ERC/mesothelin in human
pancreatic ductal carcinoma. Oncol Rep. 20, 1375-1380.
187. Paciucci R, Torà M, Díaz VM, Real FX. (1998) The plasminogen activator system in
pancreas cancer: role of t-PA in the invasive potential in vitro. Oncogene. 16, 625-633.
188. Frick VO, Rubie C, Wagner M, Graeber S, Grimm H, Kopp B, Rau BM, Schilling MK.
(2008) Enhanced ENA-78 and IL-8 expression in patients with malignant pancreatic diseases.
Pancreatology. 8, 488-497.
189. Aberger F, Weidinger G, Grunz H, Richter K. (1998) Anterior specification of embryonic
ectoderm: the role of the Xenopus cement gland-specific gene XAG-2. Mech Dev. 72, 115-
130.
190. Zhang Y, Forootan SS, Liu D, Barraclough R, Foster CS, Rudland PS, Ke Y (2007)
Increased expression of anterior gradient-2 is significantly associated with poor survival of
prostate cancer patients. Prostate Cancer Prostatic Dis. 10, 293-300.
191. Fritzsche FR, Dahl E, Dankof A, Burkhardt M, Pahl S, Petersen I, Dietel M, Kristiansen G
(2007) Expression of AGR2 in non-small cell lung cancer. Histol Histopathol. 22, 703–708.
192. Barraclough DL, Platt-Higgins A, de Silva Rudland S, Barraclough R, Winstanley J, West
CR, Rudland PS. (2009) The metastasis-associated anterior gradient 2 protein is correlated with poor
survival of breast cancer patients. Am J Pathol. 175, 1848-1857.
193. Ramachandran V, Arumugam T, Wang H, Logsdon CD. (2008) Anterior gradient 2 is
expressed and secreted during the development of pancreatic cancer and promotes cancer cell
survival. Cancer Res. 68, 7811-7818.
194. Zhang Y, Ali TZ, Zhou H, D'Souza DR, Lu Y, Jaffe J, Liu Z, Passaniti A, Hamburger AW.
(2010) ErbB3 binding protein 1 represses metastasis-promoting gene anterior gradient protein
2 in prostate cancer. Cancer Res. 70, 240-248.
195. DeSouza LV, Romaschin AD, Colgan TJ, Siu KW. (2009) Absolute quantification of
potential cancer markers in clinical tissue homogenates using multiple reaction monitoring on
a hybrid triple quadrupole/linear ion trap tandem mass spectrometer. Anal Chem. 81, 3462-
3470.
196. Hessle H, Engvall E. (1984) Type VI collagen. Studies on its localization, structure, and
biosynthetic form with monoclonal antibodies. J Biol Chem. 259, 3955–3961.
197. Lampe AK, Bushby KM. (2005) Collagen VI related muscle disorders. J Med Genet. 42,
673-685.
198. Fujita A, Sato JR, Festa F, Gomes LR, Oba-Shinjo SM, Marie SK, Ferreira CE, Sogayar
MC. (2008) Identification of COL6A1 as a differentially expressed gene in human
astrocytomas. Genet Mol Res. 7, 371-378.
199. Li J, Dowdy S, Tipton T, Podratz K, Lu WG, Xie X, Jiang SW. (2009) HE4 as a biomarker
for ovarian and endometrial cancer management. Expert Rev Mol Diagn. 9, 555-566.
200. Welsh JB, Sapinoso LM, Kern SG, Brown DA, Liu T, Bauskin AR, Ward RL, Hawkins
NJ, Quinn DI, Russell PJ, Sutherland RL, Breit SN, Moskaluk CA, Frierson HF Jr, Hampton
GM. (2003) Large-scale delineation of secreted protein biomarkers overexpressed in cancer
tissue and serum. Proc Natl Acad Sci U S A. 100, 3410-3415.
201. Bjorling E, Lindskog C, Oksvold P, Linne J, Kampf C, Hober S, Uhlen M, Ponten F (2008)
A web-based tool for in silico biomarker discovery based on tissue-specific protein profiles in
normal and cancer tissues. Mol Cell Proteomics. 7, 825–844.
202. Grapin-Botton A. (2005) Ductal cells of the pancreas. Int J Biochem Cell Biol. 37, 504-510.
203. Rovira M, Delaspre F, Massumi M, Serra SA, Valverde MA, Lloreta J, Dufresne M, Payré
B, Konieczny SF, Savatier P, Real FX, Skoudy A. (2008) Murine embryonic stem cell-
derived pancreatic acinar cells recapitulate features of early pancreatic differentiation.
Gastroenterology. 135, 1301-1310.
204. Schmid RM, Klöppel G, Adler G, Wagner M. (1999) Acinar-ductal-carcinoma sequence in
transforming growth factor-alpha transgenic mice. Ann N Y Acad Sci. 880, 219-230.
205. Kobayashi D, Koshida S, Moriai R, Tsuji N, Watanabe N. (2007) Olfactomedin 4 promotes
S-phase transition in proliferation of pancreatic cancer cells. Cancer Sci. 98, 334-40.
206. Oue N, Sentani K, Noguchi T, Ohara S, Sakamoto N, Hayashi T, Anami K, Motoshita J, Ito
M, Tanaka S, Yoshida K, Yasui W. (2009) Serum olfactomedin 4 (GW112, hGC-1) in
combination with Reg IV is a highly sensitive biomarker for gastric cancer patients. Int J
Cancer. 125, 2383-2392.
207. Antonin W, Wagner M, Riedel D, Brose N, Jahn R. (2002) Loss of the zymogen granule
protein syncollin affects pancreatic protein synthesis and transport but not secretion. Mol Cell
Biol. 22, 1545-1554.
208. Yachida S, Jones S, Bozic I, Antal T, Leary R, Fu B, Kamiyama M, Hruban RH, Eshleman
JR, Nowak MA, Velculescu VE, Kinzler KW, Vogelstein B, Iacobuzio-Donahue CA. (2010)
Distant metastasis occurs late during the genetic evolution of pancreatic cancer. Nature. 467,
1114-1117.
209. Mischak H, Allmaier G, Apweiler R, Attwood T, Baumann M, Benigni A, Bennett SE,
Bischoff R, Bongcam-Rudloff E, Capasso G, Coon JJ, D'Haese P, Dominiczak AF, Dakna M,
Dihazi H, Ehrich JH, Fernandez-Llama P, Fliser D, Frokiaer J, Garin J, Girolami M, Hancock
WS, Haubitz M, Hochstrasser D, Holman RR, Ioannidis JP, Jankowski J, Julian BA, Klein
JB, Kolch W, Luider T, Massy Z, Mattes WB, Molina F, Monsarrat B, Novak J, Peter K,
Rossing P, Sánchez-Carbayo M, Schanstra JP, Semmes OJ, Spasovski G, Theodorescu D,
Thongboonkerd V, Vanholder R, Veenstra TD, Weissinger E, Yamamoto T, Vlahou A.
(2010) Recommendations for biomarker identification and qualification in clinical
proteomics. Sci Transl Med. 2, 46ps42.
210. Cutts RJ, Gadaleta E, Hahn SA, Crnogorac-Jurcevic T, Lemoine NR, Chelala C. (2010)
The Pancreatic Expression database: 2011 update. Nucleic Acids Res. [Epub ahead of print].
211. Edwardson JM, An S, Jahn R (1997) The secretory granule protein syncollin binds to
syntaxin in a Ca2(+)-sensitive manner. Cell. 90, 325-333.
212. Bach JP, Borta H, Ackermann W, Faust F, Borchers O, Schrader M. (2006) The secretory
granule protein syncollin localizes to HL-60 cells and neutrophils. J Histochem Cytochem.
54, 877-888.
213. Koprowski H, Herlyn M, Steplewski Z, Sears HF. (1981) Specific Antigen in Serum of
Patients with Colon Carcinoma. Science. 212, 53-55.
214. Williams JA. (2006) Regulation of pancreatic acinar cell function. Curr Opin
Gastroenterol. 22, 498-504.
215. Lv S, Gao J, Zhu F, Li Z, Gong Y, Xu G, Ma L (2010) Transthyretin, identified by
proteomics, is overabundant in pancreatic juice from pancreatic carcinoma and originates
from pancreatic islets. Diagn Cytopathol. [Epub ahead of print].
216. Killary AM, Balasenthil S, Chen N, Lott ST, Chen J, Carter J, Grizzle WE, Frazier ML,
Sen S. (2010) A Migration Signature and Plasma Biomarker Panel for Pancreatic
Adenocarcinoma. Cancer Prev Res (Phila). [Epub ahead of print].
217. Xue A, Scarlett CJ, Chung L, Butturini G, Scarpa A, Gandy R, Wilson SR, Baxter RC,
Smith RC. (2010) Discovery of serum biomarkers for pancreatic adenocarcinoma using
proteomic analysis. Br J Cancer. 103, 391-400.
218. Takayama R, Nakagawa H, Sawaki A, Mizuno N, Kawai H, Tajika M, Yatabe Y, Matsuo
K, Uehara R, Ono K, Nakamura Y, Yamao K. (2010) Serum tumor antigen REG4 as a
diagnostic biomarker in pancreatic ductal adenocarcinoma. J Gastroenterol. 45, 52-59.
219. McKinney KQ, Lee YY, Choi HS, Groseclose G, Iannitti DA, Martinie JB, Russo MW,
Lundgren DH, Han DK, Bonkovsky HL, Hwang SI. (2011) Discovery of putative
pancreatic cancer biomarkers using subcellular proteomics. J Proteomics. 74, 79-88.
Appendix 1. Table of overrepresented KEGG pathways in the pancreatic juice proteome
in comparison to the cell line conditioned media proteome
Overrepresented Pathway Description  % of Pathway Proteins in Pancreatic Juice  % of Pathway Proteins in Cell Line Proteome  Number of Pathway Proteins in Pancreatic Juice  Number of Pathway Proteins in Cell Line Proteome  Raw p-value  FDR p-value
Complement and coagulation cascades (hsa04610)  3.24  0.48  21  16  4.78E-07  4.49E-05
Pancreatic secretion (hsa04972)  2.62  0.48  17  16  3.61E-05  1.70E-03
Systemic lupus erythematosus (hsa05322)  2.78  0.6  18  20  8.92E-05  2.80E-03
Protein centre software uses statistical hypergeometric test analysis to determine if KEGG
(Kyoto Encyclopedia of Genes and Genomes; http://www.genome.jp/kegg/) categories are
disproportionally represented in comparisons between two datasets. In a comparison between all
of the proteins identified in the pancreatic juice and all of the proteins identified in the cell line
conditioned media, the three KEGG pathways presented in the table were shown as
overrepresented in the pancreatic juice dataset. Provided are the percentage and number of
proteins from the pancreatic juice and cell line datasets that were mapped to the three KEGG
pathways. Raw p-values are from hypergeometric tests and give the probability that protein counts
at least this high would arise from random sampling. False discovery rate (FDR) p-values are raw
p-values corrected for a false discovery rate of 1.0%.
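
As an illustration of the kind of hypergeometric overrepresentation test described for Appendix 1, the following is a minimal sketch and not the software's actual procedure. The dataset totals (648 and 3333 proteins) are assumptions back-calculated from the percentages and counts in the table, and treating the two datasets as disjoint pools that together form the background is likewise an assumption made only for this example, so the printed value is not expected to reproduce the tabulated p-values.

```python
from math import comb


def hypergeom_upper_tail(k: int, N: int, K: int, n: int) -> float:
    """Return P(X >= k) for a hypergeometric variable X.

    N: size of the background population
    K: number of pathway proteins in the background
    n: number of proteins drawn (the dataset of interest)
    k: number of pathway proteins observed in that dataset
    """
    upper = min(K, n)
    # Sum the numerators as exact integers, then divide once at the end.
    numer = sum(comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), upper + 1))
    return numer / comb(N, n)


# Illustrative inputs for the complement and coagulation cascades row.
# Totals are ASSUMED: 21 / 3.24% ~ 648 and 16 / 0.48% ~ 3333.
pj_total, cl_total = 648, 3333      # assumed dataset sizes
pj_hits, cl_hits = 21, 16           # pathway protein counts from the table

background = pj_total + cl_total    # assumption: datasets pooled as disjoint sets
pathway_in_background = pj_hits + cl_hits

p = hypergeom_upper_tail(pj_hits, background, pathway_in_background, pj_total)
print(f"illustrative raw p-value: {p:.3g}")
```

Exact integer arithmetic with `math.comb` avoids floating-point overflow for populations of this size; a statistics library would normally be used instead when one is available.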
Appendix 2. Pearson correlation coefficient values comparing normalized spectral counts of the triplicate cell line analysis.
Replicate BxPc3-1 BxPc3-2 BxPc3-3 CAPAN1-1 CAPAN1-2 CAPAN1-3 CFPAC1-1 CFPAC1-2 CFPAC1-3 HPDE-1 HPDE-2 HPDE-3 MIA-1 MIA-2 MIA-3 PANC1-1 PANC1-2 PANC1-3 SU.86.86-1 SU.86.86-2 SU.86.86-3
BxPc3-1 1.00 0.99 0.99 0.62 0.62 0.65 0.65 0.48 0.62 0.69 0.68 0.68 0.53 0.53 0.55 0.64 0.68 0.64 0.63 0.62 0.55
BxPc3-2 0.99 1.00 0.99 0.62 0.62 0.65 0.64 0.48 0.62 0.70 0.69 0.69 0.53 0.53 0.55 0.64 0.68 0.64 0.64 0.63 0.56
BxPc3-3 0.99 0.99 1.00 0.62 0.62 0.64 0.65 0.48 0.62 0.71 0.70 0.70 0.53 0.53 0.56 0.65 0.68 0.66 0.64 0.62 0.56
CAPAN1-1 0.62 0.62 0.62 1.00 0.99 0.97 0.67 0.51 0.67 0.54 0.53 0.54 0.49 0.48 0.46 0.59 0.58 0.57 0.71 0.71 0.68
CAPAN1-2 0.62 0.62 0.62 0.99 1.00 0.97 0.68 0.51 0.67 0.53 0.53 0.54 0.49 0.48 0.46 0.58 0.58 0.57 0.71 0.71 0.67
CAPAN1-3 0.65 0.65 0.64 0.97 0.97 1.00 0.67 0.50 0.66 0.56 0.55 0.56 0.53 0.52 0.50 0.62 0.61 0.60 0.70 0.70 0.67
CFPAC1-1 0.65 0.64 0.65 0.67 0.68 0.67 1.00 0.73 0.95 0.52 0.53 0.54 0.46 0.44 0.44 0.67 0.68 0.68 0.74 0.73 0.68
CFPAC1-2 0.48 0.48 0.48 0.51 0.51 0.50 0.73 1.00 0.85 0.39 0.39 0.40 0.35 0.34 0.33 0.50 0.50 0.51 0.57 0.58 0.56
CFPAC1-3 0.62 0.62 0.62 0.67 0.67 0.66 0.95 0.85 1.00 0.51 0.51 0.51 0.46 0.44 0.43 0.64 0.65 0.66 0.74 0.74 0.70
HPDE-1 0.69 0.70 0.71 0.54 0.53 0.56 0.52 0.39 0.51 1.00 0.98 0.97 0.49 0.49 0.47 0.54 0.53 0.55 0.56 0.55 0.52
HPDE-2 0.68 0.69 0.70 0.53 0.53 0.55 0.53 0.39 0.51 0.98 1.00 0.99 0.45 0.45 0.43 0.53 0.52 0.55 0.56 0.56 0.51
HPDE-3 0.68 0.69 0.70 0.54 0.54 0.56 0.54 0.40 0.51 0.97 0.99 1.00 0.45 0.45 0.43 0.54 0.53 0.55 0.56 0.56 0.51
MIA-1 0.53 0.53 0.53 0.49 0.49 0.53 0.46 0.35 0.46 0.49 0.45 0.45 1.00 0.99 0.98 0.70 0.67 0.71 0.43 0.41 0.38
MIA-2 0.53 0.53 0.53 0.48 0.48 0.52 0.44 0.34 0.44 0.49 0.45 0.45 0.99 1.00 0.98 0.69 0.65 0.69 0.42 0.40 0.38
MIA-3 0.55 0.55 0.56 0.46 0.46 0.50 0.44 0.33 0.43 0.47 0.43 0.43 0.98 0.98 1.00 0.67 0.66 0.68 0.43 0.41 0.38
PANC1-1 0.64 0.64 0.65 0.59 0.58 0.62 0.67 0.50 0.64 0.54 0.53 0.54 0.70 0.69 0.67 1.00 0.97 0.97 0.58 0.57 0.50
PANC1-2 0.68 0.68 0.68 0.58 0.58 0.61 0.68 0.50 0.65 0.53 0.52 0.53 0.67 0.65 0.66 0.97 1.00 0.95 0.61 0.59 0.53
PANC1-3 0.64 0.64 0.66 0.57 0.57 0.60 0.68 0.51 0.66 0.55 0.55 0.55 0.71 0.69 0.68 0.97 0.95 1.00 0.58 0.56 0.49
SU.86.86-1 0.63 0.64 0.64 0.71 0.71 0.70 0.74 0.57 0.74 0.56 0.56 0.56 0.43 0.42 0.43 0.58 0.61 0.58 1.00 0.99 0.94
SU.86.86-2 0.62 0.63 0.62 0.71 0.71 0.70 0.73 0.58 0.74 0.55 0.56 0.56 0.41 0.40 0.41 0.57 0.59 0.56 0.99 1.00 0.95
SU.86.86-3 0.55 0.56 0.56 0.68 0.67 0.67 0.68 0.56 0.70 0.52 0.51 0.51 0.38 0.38 0.38 0.50 0.53 0.49 0.94 0.95 1.00
Each cell line was analyzed in triplicate for a total of 21 replicates. Each replicate was compared pair-wise and Pearson correlation
coefficients are reported. With the exception of CFPAC1-rep2, good correlation (0.944-0.993) was seen for replicates of the same cell
line indicating good reproducibility between cell line replicates. MIA-1, MIA-2, MIA-3 are replicates 1, 2 and 3 of the MIA-PaCa2
cell line.
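
A pairwise comparison of this kind can be sketched as follows. This is a minimal illustration only: the replicate names and spectral count vectors are hypothetical, and the thesis does not state which software produced the coefficients in Appendix 2.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def correlation_matrix(replicates):
    """Compare every replicate with every other replicate."""
    names = list(replicates)
    return {a: {b: pearson(replicates[a], replicates[b]) for b in names} for a in names}


# HYPOTHETICAL normalized spectral counts: one value per protein, same protein order.
replicates = {
    "lineA-1": [12.0, 0.0, 5.0, 30.0, 2.0],
    "lineA-2": [11.0, 1.0, 6.0, 28.0, 2.0],
    "lineB-1": [0.0, 9.0, 4.0, 3.0, 20.0],
}

matrix = correlation_matrix(replicates)
for name, row in matrix.items():
    print(name, " ".join(f"{row[other]:.2f}" for other in replicates))
```

Each replicate must list the same proteins in the same order, with a count of zero where a protein was not detected, otherwise the coefficients compare mismatched proteins.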
Appendix 3. Extracellular and cell surface annotated proteins with over 5-fold increase in at least three pancreatic cancer cell
lines.
Gene  Anova P-Value  Accession  BxPc3 (%CV, FC)  MIA-PaCa2 (%CV, FC)  PANC1 (%CV, FC)  CAPAN1 (%CV, FC)  CFPAC1 (%CV, FC)  SU.86.86 (%CV, FC)  Identified in/as: Pancreatic Juice; Ascites a; Human Plasma Proteome b; Overexpressed in Pancreatic Cancer in At Least 4 or More Studies [146]  Previously Studied as Pancreatic Cancer Serum Biomarker
RNASE1 1.1E-16 IPI00014048 2 127 34 9 18 39 [145]
PIGR 3.3E-16 IPI00004573 7 396 18 31 1 53
LOXL2;ENTPD4 7.8E-16 IPI00294839 7 125 7 35 20 8
MUC5AC 1.4E-15 IPI00103397 4 358 10 153 10 170 [147]
PRSS2 8.1E-15 IPI00011695 4 45 19 19 5 42 [148]
MUC5B 1.2E-14 IPI00902941 0 130 9 49 12 105
FCGBP 1.4E-14 IPI00242956 27 12 33 30 6 166
MMP13 3.0E-14 IPI00021738 8 94 33 12 31 6
VWA1 4.2E-14 IPI00396383 20 6 7 74 15 29
CP 5.2E-14 IPI00017601 12 7 9 184 10 23 [149,150]
C3 9.2E-14 IPI00783987 12 44 5 35 13 161 6 236 [149,151]
SEMA3A 9.3E-14 IPI00031510 47 5 5 86 25 13 21 19 22 16 24 5
SEMA3C 1.3E-13 IPI00019209 14 21 3 51 24 12
SPOCK1 4.3E-13 IPI00005292 26 8 2 26 29 6
MMP7 4.4E-13 IPI00013400 12 13 42 40 11 481 24 9 [152]
SERPINA1 6.0E-13 IPI00553177 22 37 11 304 11 10
PLAT 3.1E-12 IPI00019590 12 130 9 40 9 49
NRP1 1.8E-11 IPI00299594 8 21 32 6 8 38 29 14
ST14 2.5E-11 IPI00001922 13 27 13 19 14 10
LYZ 2.7E-11 IPI00019038 13 111 28 31 17 10
TGM2 6.8E-11 IPI00294578 7 32 7 15 11 13 10 49 12 122 15 55
EPHA2 7.5E-11 IPI00021267 8 15 15 6 9 5
MFI2 1.4E-10 IPI00029275 30 13 13 12 7 37 19 18
CTSH 1.9E-10 IPI00297487 19 18 5 12 14 48
NAGLU 2.1E-10 IPI00008787 6 24 19 10 16 33 4 10
AGR2 2.4E-10 IPI00007427 25 31 13 101 19 56 9 76 [153]
CSF1 6.4E-10 IPI00015881 3 73 25 63 23 6 37 8
CFB;C2 7.2E-10 IPI00019591 23 7 8 10 13 29 7 26 27 14
RNASE4 7.3E-10 IPI00029699 8 9 21 6 13 17 9 12 16 6
PLBD1 7.8E-10 IPI00016255 42 5 14 26 21 6 21 7
COL6A1 8.2E-10 IPI00291136 14 121 14 47 8 27 15 55 21 43
FUCA1 1.1E-09 IPI00843910 15 43 15 7 12 16 9 20 13 35 48 5
LRG1 1.8E-09 IPI00022417 14 18 19 39 11 34 19 11 [185]
LTBP3 1.9E-09 IPI00073196 32 10 14 33 26 6 18 18 5 22
PLBD2 1.9E-09 IPI00169285 31 7 4 20 8 15 19 16
GDF15 2.3E-09 IPI00306543 29 32 17 137 11 20 22 43 [154,155]
SIAE 6.4E-09 IPI00010949 16 9 20 60 42 10 1 24 22 8
DPP7 8.4E-09 IPI00296141 16 18 11 28 20 14
TGFB2 9.0E-09 IPI00235354 7 23 22 35 15 34 12 16
B3GNT3 9.6E-09 IPI00031983 40 8 15 19 5 11
ITGA2 2.8E-08 IPI00013744 10 21 9 24 12 9 32 17
CTSS 3.4E-08 IPI00299150 22 22 20 15 14 9
LTBP1 5.7E-08 IPI00302679 43 7 23 9 13 16
BSG 1.1E-07 IPI00019906 27 11 13 24 36 10 34 6 24 6
ACE 1.6E-07 IPI00437751 14 15 22 7 24 5 36 6
LCN2 1.9E-07 IPI00299547 13 28 9 31 31 51 [156]
CDH2 1.9E-07 IPI00290085 28 53 17 28 18 15
ITGB1 4.4E-07 IPI00217563 14 8 19 6 22 34 4 18 19 15 30 12
MSLN 5.0E-07 IPI00025110 12 38 30 15 49 22
SDCBP 6.2E-07 IPI00299086 38 5 24 8 14 12 11 18 37 8 11 13
CXCL1 1.8E-06 IPI00013874 14 26 19 48 17 14 14 16 49 22
AHSG 5.0E-06 IPI00022431 25 14 38 24 22 39 36 13
DNASE2 5.2E-06 IPI00010348 23 8 28 6 45 6 5 10
CXCL5 5.3E-06 IPI00292936 7 37 14 90 24 17 38 226
WFDC2 8.1E-06 IPI00291488 17 13 18 26 43 45 12 41
NEU1 8.5E-06 IPI00029817 17 9 29 9 12 12
SERPINB9 1.2E-05 IPI00032139 19 15 30 20 20 10 22 14
RARRES1 1.4E-05 IPI00410240 32 22 49 8 38 9
PLA2G15 2.7E-05 IPI00301459 9 12 6 6 42 6
CTBS 2.8E-05 IPI00007778 31 5 14 9 35 12 24 6
HS3ST1 9.3E-05 IPI00021377 18 12 49 9 41 29
LFNG 1.1E-04 IPI00455739 20 19 28 23 34 29 12 5 47 20
ENO2 2.0E-02 IPI00216171 6 235 27 213 10 175 36 152
a Proteome of ascites samples from pancreatic cancer patients (Makawita et al., unpublished).
b Identification in 12,787 protein containing plasma proteome database [157].
FC, fold change between cancer cell line and HPDE; %CV, percent coefficient of variation in normalized spectral counts for
triplicates of cell line; PJ, pancreatic juice
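
The %CV and FC quantities defined above, and the selection rule in the title of Appendix 3, can be sketched as follows. This is a minimal illustration under stated assumptions: the use of the sample standard deviation, the ratio of triplicate means for fold change, the handling of a zero HPDE mean, and all example counts are assumptions and are not taken from the thesis.

```python
from statistics import mean, stdev


def percent_cv(counts):
    """Percent coefficient of variation of replicate counts (sample SD, assumed)."""
    return 100.0 * stdev(counts) / mean(counts)


def fold_change(cancer_counts, hpde_counts):
    """Ratio of the mean cancer cell line count to the mean HPDE count."""
    reference = mean(hpde_counts)
    if reference == 0:
        # Assumption: a protein absent from HPDE has no finite fold change.
        return float("inf")
    return mean(cancer_counts) / reference


def passes_filter(fold_changes, threshold=5.0, min_lines=3):
    """True when at least min_lines cell lines reach the fold change threshold."""
    return sum(fc >= threshold for fc in fold_changes) >= min_lines


# HYPOTHETICAL normalized spectral counts for one protein, three replicates each.
hpde = [10.0, 12.0, 8.0]
cancer_lines = {
    "lineA": [50.0, 60.0, 70.0],
    "lineB": [55.0, 52.0, 58.0],
    "lineC": [9.0, 11.0, 10.0],
    "lineD": [80.0, 90.0, 100.0],
}

fcs = {name: fold_change(counts, hpde) for name, counts in cancer_lines.items()}
for name, counts in cancer_lines.items():
    print(f"{name}: %CV={percent_cv(counts):.0f} FC={fcs[name]:.1f}")
print("selected:", passes_filter(fcs.values()))
```

The threshold and the minimum number of cell lines are parameters so that the same function covers the "at least three cell lines" rule here and a looser "at least one cell line" rule.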
Appendix 4. Forty-three proteins common to cancer cell lines, pancreatic juice and ascites
Gene  Protein Name  Accession  Identified in/as: Ascites a; ≥ 5-fold in at least one cancer cell line vs HPDE; Human Plasma Proteome b; Overexpressed in Pancreatic Cancer in at Least 4 Other Studies (Harsha et al. [146]); Overexpressed in Core Gene Expression Study (Jones et al. [32]); Overexpressed in Pancreatic Cancer Tissue (McKinney et al. [219]); Cyst Fluid of PDAC Patient (Ke et al. [104])  Tissue Specificity: HPA [161]; UniGene [160]; TiGER [159]; TiSGeD [158]
PRSS1 Trypsin-1 IPI00011694
PRSS2 Protease serine 2 isoform B IPI00011695
MUC5AC Mucin-5AC (Fragment) IPI00103397
RNASE1 Ribonuclease pancreatic IPI00014048
LUM Lumican IPI00020986
COL1A1 collagen alpha-1(I) chain preproprotein IPI00297646
CEACAM5 Carcinoembryonic antigen-related cell adhesion molecule 5 IPI00027486
MUC1 Mucin IPI00013955
PIGR Polymeric immunoglobulin receptor IPI00004573
OLFM4 Olfactomedin-4 IPI00022255
SPP1 Isoform A of Osteopontin IPI00021000
SERPINF1 Pigment epithelium-derived factor IPI00006114
LRG1 Leucine-rich alpha-2-glycoprotein IPI00022417
RBP4 Retinol-binding protein 4 IPI00022420
CFI Complement factor I IPI00291867
DMBT1 Isoform 1 of Deleted in malignant brain tumors 1 protein IPI00099110
F5 252 kDa protein IPI00022937
C4B complement component 4B preproprotein IPI00418163
MXRA5 Matrix-remodeling-associated protein 5 IPI00012347
LYZ Lysozyme C IPI00019038
AGT Angiotensinogen IPI00032220
CP Ceruloplasmin IPI00017601
FCGBP IgGFc-binding protein IPI00242956
SERPINA3 cDNA FLJ35730 fis, clone TESTI2003131, highly similar to ALPHA-1-ANTICHYMOTRYPSIN IPI00550991
VTN Vitronectin IPI00298971
ACE Isoform Somatic-1 of Angiotensin-converting enzyme IPI00437751
TCN1 Transcobalamin-1 IPI00299729
SERPINA4 Kallistatin IPI00328609
ITIH2 Inter-alpha (Globulin) inhibitor H2, isoform CRA_a IPI00305461
APOA1 Apolipoprotein A-I IPI00021841
APOC1 Apolipoprotein C-I IPI00021855
APOL1 Isoform 2 of Apolipoprotein L1 IPI00186903
SERPINC1 Antithrombin-III IPI00032179
SERPING1 Plasma protease C1 inhibitor IPI00291866
HPX Hemopexin IPI00022488
SOD3 Extracellular superoxide dismutase [Cu-Zn] IPI00027827
GC Vitamin D-binding protein IPI00555812
F2 Prothrombin (Fragment) IPI00019568
CD14 Monocyte differentiation antigen CD14 IPI00029260
C4BPA C4b-binding protein alpha chain IPI00021727
A2M alpha-2-macroglobulin precursor IPI00478003
HBA1 Hemoglobin subunit alpha IPI00410714
MUC6 mucin-6 IPI00401776
a Ascites fluid proteome from 3 pancreatic cancer patients (Makawita et al., unpublished)
b Identification in 12,787 protein containing plasma proteome database [157].
PDAC, pancreatic ductal adenocarcinoma