Modeling precision treatment of breast cancer

Daemen A, Griffith OL, Heiser LM, et al. Modeling precision treatment of breast cancer. Genome Biol. 2013;14(10):R110. doi:10.1186/gb-2013-14-10-r110.

Exome breast cancer-edu-tk-sb



Page 2: Exome breast cancer-edu-tk-sb

Biological Background

• Breast cancer has six previously identified, well-documented subtypes, which respond differently to chemotherapy and hormone treatments. Currently, doctors test for only a handful of molecular signatures, and the cancers of over 40% of patients do not fit those categories. This study tests over 70 breast cancer cell lines against over 90 therapeutic agents. It combines SNP array, RNA-seq, exome-seq (exome capture), genome-wide methylation, and RPPA protein abundance data, and uses least squares support vector machine (LS-SVM) and random forest algorithms to identify molecular features.

• Cell lines are often used in research because they mirror many of the molecular characteristics found in tumors in clinical studies. They can therefore serve as pre-clinical models for predictive marker development. This work developed predictive drug response signatures, and future clinical models can build on it. One limitation of the study is that a cell line panel does not capture features such as the tumor microenvironment, which is critical to understanding tumors.

• Types of cell lines used:
  • Luminal: often chemotherapy responsive and endocrine responsive
  • Basal: often chemotherapy responsive and endocrine nonresponsive
  • Claudin-low: intermediate response to chemotherapy
  • Normal and normal-like: cell lines that arose after chemical exposure (such as 184A1) and primary cell lines
  • Unknown: includes cell lines that are contaminated (such as MT3) and lymphoblastoid cell lines (such as HCC1007)

Page 3: Exome breast cancer-edu-tk-sb

Total List of Cell Lines (84 in the study)

Basal 21NT | Basal* CAL120 | Luminal BT474 | Luminal ZR7530
Basal BT20 | Basal* CAL148 | Luminal BT483 | Luminal ZR75B
Basal HCC1143 | Basal* CAL851 | Luminal CAMA1 | Luminal MDAMB175VII
Basal HCC1187 | Basal* CAL51 | Luminal HCC1419 | Luminal SUM225CWN
Basal HCC1569 | Basal* HDQP1 | Luminal HCC1428 | Luminal SUM44PE
Basal HCC1806 | Basal& COLO824 | Luminal HCC202 | Luminal UACC893
Basal HCC1937 | Claudin-low BT549 | Luminal HCC2185 | Luminal EFM192A
Basal HCC1954 | Claudin-low HCC1395 | Luminal LY2 | Luminal EFM192B
Basal HCC3153 | Claudin-low HCC38 | Luminal MCF7 | Luminal EFM192C
Basal HCC70 | Claudin-low HS578T | Luminal MDAMB134VI | Luminal HCC2218
Basal MX1 | Claudin-low MDAMB157 | Luminal MDAMB361 | Matched normal HCC1143BL
Basal SUM149PT | Claudin-low MDAMB231 | Luminal MDAMB415 | Matched normal HCC38BL
Basal SUM229PE | Claudin-low SUM1315MO2 | Luminal MDAMB453 | Normal-like 184A1
Basal 21MT1 | Claudin-low SUM159PT | Luminal ERBB2-amp | Normal-like 184B5
Basal MDAMB468 | Claudin-low HBL100 | Luminal SKBR3 | Normal-like MCF10A
Basal 21PT | Claudin-low MDAMB436 | Luminal SUM185PE | Normal-like MCF12A
Basal JIMT1 | Luminal* EFM19 | Luminal SUM52PE | Normal-like MCF10F
Basal SUM102PT | Luminal* EVSAT | Luminal T47D | Normal-like S1
Basal 21MT2 | Luminal* MFM223 | Luminal T47D_KBluc | Normal-like^ PMC42
Basal HCC1599 | Luminal 600MPE | Luminal UACC812 | Unknown# T4
Basal MB157 | Luminal AU565 | Luminal ZR751 | Unknown# HCC1008
Unknown# MT3

The study used a mix of basal, basal-like, claudin-low, luminal, normal, normal-like, and unknown cell lines.

Page 4: Exome breast cancer-edu-tk-sb

Drug List Used for the Breast Cancer Study, with Associated Mean GI50

17-AAG 7.035; BIBW2992 6.396; Doxorubicin 6.616; GSK1120212 5.815; Geldanamycin 7.594; Lestaurtinib (CEP-701) 6.226; Oxaliplatin 5.108; L-779450 4.745; Topotecan 6.865
ZM447439 5.110; Baicalein 4.292; ERKi II (FR180304) 4.443; GSK1059868 4.885; Gemcitabine 6.652; MG-132 6.738; Oxamflatin 6.053; Rapamycin 6.697; Tamoxifen 4.387
5-FU 3.972; Bortezomib 7.854; Epirubicin 6.525; GSK1838705 5.246; Glycyl H1152 4.894; MLN4924 6.414; PD98059 4.432; Vorinostat 4.123; Temsirolimus 6.013
5-FdUR 3.970; CGC-11047 3.964; Erlotinib 4.695; GSK461364 7.076; ICRF-193 4.965; Mebendazole 6.064; PF-2341066 5.543; SB-3CT 4.169; Trichostatin A 5.071
AG1478 4.526; CGC-11144 6.2563; Etoposide 5.39; GSK2119563 6.08; IKK 16 5.483; Methotrexate 4.668; PF-3084014 4.646; Ispinesib 7.154; Tykerb:IGF1R (1:1) 6.209
Sigma AKT1-2 inhibitor 5.460; CPT-11 5.086; Everolimus 6.404; GSK2126458 7.933; Ibandronate sodium salt 4.242; NSC663284 5.645; PF-3814735 5.695; Bosutinib 5.631; VX-680 5.445
Triciribine 5.593; Carboplatin 4.320; FTase inhibitor I 4.411; GSK2141795 6.584; Imatinib 4.713; NU6102 4.743; PF-4691502 6.889; Sorafenib 4.287; Valproic acid 2.768
AS-252424 4.813; Cisplatin 5.061; Fascaplysin 6.743; GSK1059615 6.311; Gefitinib 5.148; Nelfinavir 4.989; Paclitaxel 7.908; Sunitinib Malate 5.210; Velcade 7.962
AZD6244 4.705; Disulfiram 5.700; GSK923295 7.044; GSK650394 4.379; Ixabepilone 7.917; Nutlin 3a 4.687; Pemetrexed 3.222; TCS PIM-11 4.090; Vinorelbine 7.549
BEZ235 5.811; Docetaxel 8.250; GSK1070916 5.771; Lapatinib 5.164; LBH589 6.948; Olomoucine II 5.294; Purvalanol A 4.128; TCS2312 dihydrochloride 6.248; XRP44X 5.706

GI50 is the concentration that gives 50% of maximal inhibition of cell proliferation, and should be used for cytostatic (as opposed to cytotoxic) agents. The value listed for each compound is its GI50 dichotomization threshold, taken as the mean GI50 across the 48 core cell lines.

https://www.dropbox.com/s/kjim8g5szr8fwa6/gi50_threshold_48.xlsx?dl=0
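As a rough illustration of how a dichotomization threshold might be applied, the sketch below labels cell lines as sensitive or resistant to one compound. The function name, the cell line names, and the per-line values are hypothetical. It also assumes the listed values are on a -log10(GI50) scale, so that a higher value means greater sensitivity. The slides do not state the scale.

```python
# Hypothetical helper: compare each cell line's GI50 value with the
# compound's dichotomization threshold (mean GI50 over the core lines).
# Assumption: values are -log10(GI50), so higher means more sensitive.

def dichotomize(gi50_by_line, threshold):
    """Return {cell_line: 'sensitive' | 'resistant'} for one compound."""
    return {
        line: "sensitive" if value >= threshold else "resistant"
        for line, value in gi50_by_line.items()
    }

# Threshold for 17-AAG taken from the drug list; the per-line values are
# placeholders for illustration, not measurements from the study.
threshold_17aag = 7.035
example_gi50 = {"line_A": 7.40, "line_B": 6.10}

labels = dichotomize(example_gi50, threshold_17aag)
print(labels)
```

If the values were raw concentrations instead, the comparison would need to be reversed.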

Page 5: Exome breast cancer-edu-tk-sb

Data Information

• Exome-seq (GSE48215)
  • 75 breast cancer cell lines underwent exome-seq to identify mutations
  • Samples by type: Basal 17, Claudin-low 8, Luminal 29, Non-malignant 5, Unknown 16
  • Only sorted BAM files were uploaded, after alignment to hg19 with BWA (aln and sampe)

• RNA-seq (GSE48213)
  • 56 cell lines were profiled in their baseline, unperturbed state
  • Samples by type: Basal 15, Claudin-low 7, Luminal 32, Non-malignant 6, Unknown 4
  • Agilent Bioanalyzer High Sensitivity chip
  • Pipeline: http://use.t-bioinfo.com:3000/pipelines/38717146

• Methylation by array (GSE42944)
  • DNA methylation in 55 breast cancer samples
  • Extraction protocol: DNA was extracted with TNES/PK (Tris/NaCl/EDTA/SDS/proteinase K) lysis buffer prior to sodium bisulfite conversion using the Zymo Research EZ DNA Methylation Kit.
  • Hybridization protocol: bisulfite-converted DNA was amplified, fragmented, and hybridized to Illumina Infinium HumanMethylation27 BeadChips using the standard Illumina protocol.

• Affymetrix array
  • DNA copy number array
  • EGAS00000000059 + EGAS00001000585
  • https://www.ebi.ac.uk/ega/search/site/EGAS00000000059

This study developed candidate response signatures by analyzing associations between biological responses to therapy and pretreatment omics signatures.

Pretreatment measurements:
1. mRNA expression (Affymetrix) (56 cell lines)
2. Genome copy number (SNP6) (74 cell lines)
3. Protein expression (RPPA) (49 cell lines)
4. Gene mutation (exome-seq) (75 cell lines)
5. Transcriptome sequencing (RNA-seq) (56 cell lines)
6. Methylation assay (47 cell lines)

The software applies signatures of response developed in vitro to measurements of expression, copy number, and/or methylation for individual samples. It produces a list of recommended treatments ranked according to predicted probability of response and in vitro GI50 dynamic range.
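The ranking step described above can be sketched as follows. This is only an illustration: the slides do not say how the software combines the two criteria, so the sketch assumes predicted probability of response is the primary key and GI50 dynamic range breaks ties. The `Prediction` class, the compound names, and all numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    compound: str
    prob_response: float  # predicted probability of response, 0..1
    gi50_range: float     # in vitro GI50 dynamic range for the compound

def rank_treatments(predictions):
    """Sort by probability of response, then by GI50 dynamic range (both descending)."""
    return sorted(
        predictions,
        key=lambda p: (p.prob_response, p.gi50_range),
        reverse=True,
    )

# Placeholder predictions for one sample; not results from the study.
predictions = [
    Prediction("compound_A", 0.82, 2.1),
    Prediction("compound_B", 0.82, 3.4),
    Prediction("compound_C", 0.35, 4.0),
]

ranked = rank_treatments(predictions)
for rank, p in enumerate(ranked, start=1):
    print(rank, p.compound, p.prob_response, p.gi50_range)
```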

Page 6: Exome breast cancer-edu-tk-sb

Breast Cancer Cell Lines

• A total of 84 breast cancer cell lines were assembled, and 70 of them were tested for their response to compounds by growth inhibition assays. A total of 56 cell lines underwent RNA sequencing and 75 samples underwent exome sequencing. A total of 33 cell lines were included in all data sets. Cell lines and compounds that showed little variation in response across the panel appear to have been excluded.

RNA-seq Breast Cell Lines (56)

184A1, 184B5, 21NT, 600MPE, AU565, BT474, BT483, BT549, CAMA1, HCC1143, HCC1395, HCC1419, HCC1428, HCC1569, HCC1806, HCC1937, HCC1954, HCC202, HCC3153, HCC38, HCC70, HS578T, LY2, MCF10A, MCF12A, HCC1599, HCC2218, MB157, MCF7, MDAMB134VI, MDAMB231, MDAMB361, MDAMB453, MX1, SKBR3, SUM1315MO2, SUM149PT, SUM229PE, SUM52PE, T47D, T47D_KBluc, UACC812, ZR751, ZR7530, ZR75B, 21MT1, MCF10F, MDAMB175VII, SUM225CWN, UACC893, 21PT, JIMT1, EFM192A, EFM192B, EFM192C, 21MT2

Exome-seq Breast Cell Lines (75)

184A1, 184B5, 21NT, 600MPE, AU565, BT20, BT474, BT483, BT549, CAMA1, HCC1143, HCC1187, HCC1395, HCC1428, HCC1569, HCC1806, HCC1937, HCC1954, HCC202, HCC2185, HCC3153, HCC38, HCC70, LY2, MCF10A, MCF12A, MDAMB134VI, MDAMB157, SKBR3, SUM1315MO2, SUM149PT, SUM159PT, SUM185PE, SUM229PE, SUM52PE, T47D, T47D_KBluc, UACC812, ZR751, ZR7530, ZR75B, 21MT1, MCF10F, MDAMB175VII, SUM225CWN, SUM44PE, MDAMB436, MDAMB468, UACC893, 21PT, JIMT1, SUM102PT, T4, CAL120, CAL148, CAL851, CAL51, EVSAT, HCC1143BL, HCC2218, HCC38BL, HDQP1, MFM223, MT3, PMC42, EFM192A, EFM192B, EFM192C, EFM19, 21MT2, MDAMB231, MDAMB361, MDAMB415, MDAMB453, MX1

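Legend (the color coding is not preserved in this transcript): cell lines shown in red were not included in the drug analysis; other markings flag cell lines not included in exome sequencing and cell lines not included in RNA sequencing.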

Cell lines with all datasets (RNA-seq, exome-seq, exon array, methylation, drug analysis)

600MPE, AU565, BT474, BT483, BT549, CAMA1, HCC1143, HCC1428, HCC1569, HCC1937, HCC1954, HCC202, HCC3153, HCC38, HCC70, LY2, MCF10A, MCF12A, MDAMB134VI, MDAMB231, MDAMB361, MDAMB453, SKBR3, SUM1315MO2, SUM149PT, SUM52PE, T47D, UACC812, ZR751, ZR7530, ZR75B, MDAMB175VII, SUM225CWN

* no RPPA availability
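Finding the cell lines shared by several assays amounts to a set intersection. The sketch below shows the idea on a few names copied from the RNA-seq and exome-seq lists above. Only a handful of lines are included for brevity, and the dictionary layout is an assumption made for illustration.

```python
# A few cell lines per assay, copied from the lists above (not the full sets).
assays = {
    "RNA-seq": {"MCF7", "BT474", "HCC1143", "MB157"},
    "exome-seq": {"BT20", "BT474", "HCC1143", "CAL51"},
}

# Cell lines present in every assay.
shared = set.intersection(*assays.values())
print(sorted(shared))  # ['BT474', 'HCC1143']

# Cell lines missing from at least one assay, with the assays they lack.
all_lines = set.union(*assays.values())
missing = {
    line: sorted(name for name, members in assays.items() if line not in members)
    for line in sorted(all_lines - shared)
}
print(missing)
```

Adding more assays (methylation, drug response, and so on) only requires adding more entries to the dictionary.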

Page 7: Exome breast cancer-edu-tk-sb

Reverse phase protein array (RPPA) is a protein array designed as a micro- or nano-scaled dot-blot platform that allows measurement of protein expression levels in a large number of biological samples. It can characterize basal protein expression and modification levels as well as growth factor or ligand-induced effects, and it can be used to validate therapeutic targets and evaluate drug pharmacodynamics. RPPA assays generally require protein lysate in the picogram to nanogram range, and hundreds of proteins can be analyzed simultaneously under identical conditions.

Measuring Protein Abundance: Reverse Phase Protein Lysate

1. Lysis and Printing
2. Staining and Measuring
3. Analysis

Page 8: Exome breast cancer-edu-tk-sb

Study Highlights

• The researchers found predictive signatures of response across all levels of the genome.

• The current system for determining treatment uses ER and ERBB2 status, but this study suggests that more significant features should be included in the treatment decision.

• Using the Patient Response toolbox in R, each patient would get a list of 22 therapeutic compounds ranked according to the patient’s likelihood of response and in vitro GI50 dynamic range.

• Building upon this work, the long-term goal is to select the therapeutic compounds most likely to be effective in an individual patient.

Page 9: Exome breast cancer-edu-tk-sb

Application

• Building upon the work in this study, more comprehensive genome-wide platforms could be used for discovery, and once identified, significant features could be migrated to alternative platforms for a lab diagnostic.

Page 10: Exome breast cancer-edu-tk-sb

RNA-seq (56) Breast Cancer Samples

Page 11: Exome breast cancer-edu-tk-sb

Expected results:
• Gene expression
• Isoform expression
• Exon expression

Quantile normalization, PCA plots

• Cell lines are expected to appear as “clouds” on PCA plots if the lines are different enough.

• Some cell lines will lie close to each other on the PCA plot and others far apart. This may be concordant with transcriptional subtype and ERBB2 status, and more similar cell lines probably also respond to treatment similarly.

• Gene expression is expected to be less informative than isoform and exon expression. A batch effect may be found in exon expression but not in gene expression.

• Upregulated and downregulated genes and isoforms can point to meaningful pathways in DAVID, both pathways already known in breast cancer and ones not yet reported. It would be an interesting result if cell lines show different pathways, as different types of cancer do.
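For orientation, quantile normalization followed by PCA can be sketched with NumPy as below. This is not the T-BioInfo pipeline. It is a minimal version under stated assumptions: the expression matrix is features by samples, ties are ignored in the normalization, a log2(x + 1) transform is applied before PCA, and the input data are synthetic.

```python
import numpy as np

def quantile_normalize(matrix):
    """Give every column (sample) the same distribution: the mean of the sorted columns."""
    order = np.argsort(matrix, axis=0)          # per-sample rank order of features
    mean_quantiles = np.sort(matrix, axis=0).mean(axis=1)
    normalized = np.empty_like(matrix, dtype=float)
    for j in range(matrix.shape[1]):
        # The i-th smallest value in sample j is replaced by the i-th mean quantile.
        normalized[order[:, j], j] = mean_quantiles
    return normalized

def pca_scores(matrix, n_components=2):
    """Project samples (columns) onto the top principal components via SVD."""
    centered = matrix - matrix.mean(axis=1, keepdims=True)  # center each feature
    u, s, _ = np.linalg.svd(centered.T, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Synthetic expression values: 200 features (rows) by 6 samples (columns).
rng = np.random.default_rng(0)
expr = rng.lognormal(size=(200, 6))

qn = quantile_normalize(expr)
scores = pca_scores(np.log2(qn + 1.0))
print(scores.shape)  # one (PC1, PC2) point per sample
```

Each row of `scores` is one sample; plotting the first column against the second gives the kind of PCA picture referred to on the following slides.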

Page 12: Exome breast cancer-edu-tk-sb

PCA Genes After QN

Page 13: Exome breast cancer-edu-tk-sb

PCA Isoforms After QN

Page 14: Exome breast cancer-edu-tk-sb

PCA Exons After QN

Page 15: Exome breast cancer-edu-tk-sb

Identifying Subtypes: Luminal vs. Basal


Page 16: Exome breast cancer-edu-tk-sb

Identifying Subtypes: Luminal vs. Claudin-Low

Page 17: Exome breast cancer-edu-tk-sb

Junk RNA

Reads that were not mapped to the genome (RSEM output NotMapped reads) will be mapped to an ncRNA database and the RepBase database.

Page 18: Exome breast cancer-edu-tk-sb

Putative RE (kchains)

• Reads that were not mapped to ncRNA + RepBase will be analyzed for putative ncRNA and/or repeats using a BiClustering procedure.

• Kchain extension and annotation.

Page 19: Exome breast cancer-edu-tk-sb

Expected results:
• RE abundance
• Kchain abundance

Quantile normalization

PCA plots (points and clouds)

• RE and kchain abundances on a PCA plot can reveal similarities and differences between cell lines.
• Cell lines may separate on PCA plots more specifically by their RE and kchain abundances than by gene, isoform, and exon expression, especially by kchain abundances.
• TEs/REs (rows of the table) can also be analyzed by RE classification: some types of RE may be more abundant than others in specific cell lines.
• Kchain extension and annotation can reveal additional genes that are up- or down-regulated in different cell lines.

Page 20: Exome breast cancer-edu-tk-sb

Exome (75 breast cancer samples)

Page 21: Exome breast cancer-edu-tk-sb

Expected results:
• List of prospective mutations (chromosome/position)
• Analysis of the probability of every mutation
• Known and new markers for every cell line
• List of cancer markers (positions of mutations)
• Known and new mutations
• Genome regions with the highest mutation frequency
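One simple way to look for genome regions with the highest mutation frequency is to count mutation calls in fixed-size windows. The sketch below assumes mutations are available as (chromosome, position) pairs. The window size, the function name, and the coordinates are placeholders, not calls from this data set.

```python
from collections import Counter

def mutation_counts_by_window(mutations, window=1_000_000):
    """Count mutations per (chromosome, window index) bin."""
    counts = Counter()
    for chrom, pos in mutations:
        counts[(chrom, pos // window)] += 1
    return counts

# Placeholder mutation calls as (chromosome, position) pairs.
calls = [
    ("chr17", 7_577_120),
    ("chr17", 7_578_406),
    ("chr3", 178_936_091),
]

counts = mutation_counts_by_window(calls)
top = counts.most_common(1)
print(top)  # [(('chr17', 7), 2)]
```

In practice the counts would usually be normalized by the captured exome length in each window, since exome capture does not cover the genome uniformly.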

Page 22: Exome breast cancer-edu-tk-sb

BiAssociation

Cell lines vs. traits (treatment reaction and mutations)

Trait        HCC1143  HCC1806
17-AAG       6.86     3.76
5-FU…        7.05     4.61
Chr..Pos..   1        0
Chr..Pos..   0        1

Cell lines vs. expression and abundance (with line-specificity)

Feature      HCC1143  HCC1806
Genes…       0
Isoforms…
Exons…
REs…
Kchains..

The first table holds the traits: GI50 values (drug response) and mutations, with mutations coded as 0, 0.5, or 1 for every cell line (sample).

The second table holds gene, isoform, and exon expression and RE and kchain abundances (for features that have their maximum in one of the cell lines).
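To make the two-table layout concrete, the sketch below pairs every trait row with every feature row and scores each pair with a plain Pearson correlation across cell lines. This is a stand-in chosen for illustration, not the BiAssociation algorithm itself, which the slides do not describe. All names and values are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Columns are cell lines; all values below are placeholders.
cell_lines = ["line_A", "line_B", "line_C", "line_D"]

# Table 1: traits (drug response and mutation status per cell line).
traits = {
    "drug_X_GI50": [6.9, 3.8, 6.5, 4.1],
    "mutation_1": [1, 0, 1, 0],
}

# Table 2: expression or abundance features per cell line.
features = {
    "gene_1": [10.0, 2.0, 9.0, 3.0],
    "gene_2": [5.0, 5.5, 4.8, 5.2],
}

# Score every trait/feature pair.
assoc = {
    (t, f): pearson(t_vals, f_vals)
    for t, t_vals in traits.items()
    for f, f_vals in features.items()
}

for pair, r in sorted(assoc.items(), key=lambda kv: -abs(kv[1])):
    print(pair, round(r, 3))
```

With real data, a rank-based statistic and multiple-testing correction would likely be needed, because the number of feature rows is far larger than the number of cell lines.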

Page 23: Exome breast cancer-edu-tk-sb

Expected results of BiAssociation and P-clustering

• We expect to find associations similar to those found in the original paper (between breast cancer markers in different cell lines and their drug response).
• We will probably find more markers among isoforms and exons, and also among REs (known and putative), and these may also be associated with specific drug responses.
• P-clustering can give modules of co-associated features (drug response, expression, mutations, etc.).

Page 24: Exome breast cancer-edu-tk-sb

• Methylation by Array ?
• Affymetrix Array ?

Page 25: Exome breast cancer-edu-tk-sb

Educational tasks:
1) Cell-line (or species, or data-set) specificity by gene, isoform, and exon expression and by RE (known and putative) abundances. Defining the data: PCA visualization, batch effect, kchain extension and annotation, artefacts (probably).
2) Mutations: how to find and annotate them?
3) BiAssociation of data from different kinds of sources.