Larry Lam, Southern California Bioinformatics Summer Institute 2009, Graeber Lab – Crump Institute for Molecular Imaging, UCLA: A Data Management and Analysis Software Platform for Phospho-Proteomics Data


Page 1: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Larry Lam
Southern California Bioinformatics Summer Institute 2009
Graeber Lab – Crump Institute for Molecular Imaging, UCLA

A Data Management and Analysis Software Platform for Phospho-Proteomics Data

Page 2: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Outline

• Graeber Lab Background
• Project Objective
• My Experimental Project (Example Dataset)
• Software Design
• Software Demo
• Conclusion / Future Work
• Acknowledgements

Page 3: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Systems Biology of Cancer Signaling

• Lab Goals
  – Understand Cancer Signaling Through Systems Biology Approaches
  – [long term] Improve Cancer Treatment
• Signaling Pathway Modeling Through
  – Kinetics
  – Phospho-Profiling
  – Adaptor Complex Analysis

Page 4: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Project Objective

Develop a Software Platform for Convenient Storage and Analysis of Large-Scale Data Sets

- Design Database to Collect and Store Large-Scale Proteomic Data Sets
- Allow for Comprehensive Meta Information
- Simplify Access to Multiple Data Sets
- Simplify The Use of Common Tools of Analysis

Page 5: Larry Lam  Southern California Bioinformatics Summer Institute 2009

BCR/Abl Leukemia

• BCR/Abl fusion protein found in
  - 90% - 95% of chronic myeloid leukemia
  - 20% of adult acute lymphoblastic leukemia
  - 5% of childhood acute lymphoblastic leukemia
• Analyze the adaptor proteins in BCR/Abl signaling
  - Adaptor proteins mediate protein interactions

http://www.annals.org/cgi/content/full/138/10/819

[Figure: bait/prey complex capture. Labels: Bait; Prey; Complex; Capture Protein; Interacting Protein.]

Page 6: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Experimental Workflow

[Figure: current workflow diagram. Labels: Experimental Protocol; Complex Purification; Phospho Profiling; Mass Spectrometry; Quantitation Pipeline; IPI Proteomics Database; Quantitation Output File; [Complex] NS Filter/Consolidation; Manual Organization/Analysis.]

Page 7: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Identifying Interactions of the Crk Adaptor Proteins

1. Genetic modification of pro-B-lymphocytes (Baf3)
   • Express adaptor + streptavidin binding peptide (SBP)
2. Culture
3. Lyse each culture for protein complex purification

[Figure: four lysates. Labels: Crk I Lysate; Crk L Lysate; Crk II Lysate; NTAP Lysate.]

Page 8: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Protein Complex Purification

1. Separation of protein complex with streptavidin beads
2. Trypsin digestion of proteins into peptides
3. Separation of phosphorylated peptides with Fe(III)-NTA beads
4. Liquid Chromatography + Mass Spectrometry
5. Quantitation Pipeline

Page 9: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Quantitation Output File

Consolidation of quantified peptides and associated proteins per sample

• All peptides identified
• All adaptor proteins used
• Phosphorylation position within the peptide [optional]

Peptide Sequence    Description/IPI Accession     Crk I        Crk L     Crk II      NTAP
K.ADAAEFWR.K        CBL IPI00027269               16568291     1802395   2821019     0
R.QEAVALLQGQR.H     Isoform Crk-II IPI00307991    2859381909   924541    466328113   0

Page 10: Larry Lam  Southern California Bioinformatics Summer Institute 2009

NS Filter/Consolidation

[Figure: NS Filter/Consolidation steps. Labels: Quantitation Output File; Collapse Peptides To Protein Quantity; Remove Insignificant Proteins; Remove Known Contaminants; Heatmap Analysis.]

Peptide Sequence       Description/IPI Accession     Crk I         Crk L      Crk II       NTAP
K.ADAAEFWR.K           CBL IPI00027269               170685828     10127151   0            0
K.ALVIAHNNIEMAK.N      CBL IPI00027269               134461139     897107     51499        0
R.QEAVALLQGQR.H        Isoform Crk-II IPI00307991    793418583     145308     139088540    85523
K.IHYLDTTTLIEPVAR.S    Isoform Crk-II IPI00307991    16062973521   349417     3223960034   4617083

• Quantity Is Normalized For Each Row
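The two operations named on this slide, collapsing peptides to a protein quantity and normalizing each row, can be sketched in a few lines. The platform itself is a C# application; Python is used here only for brevity. Summing peptide quantities per IPI accession and dividing each row by its maximum are both assumptions, since the slide does not state the collapse or normalization rule, and the function names are hypothetical.

```python
from collections import defaultdict

SAMPLES = ["Crk I", "Crk L", "Crk II", "NTAP"]

# (peptide sequence, IPI accession, quantity per sample) from this slide
PEPTIDES = [
    ("K.ADAAEFWR.K", "IPI00027269", [170685828, 10127151, 0, 0]),
    ("K.ALVIAHNNIEMAK.N", "IPI00027269", [134461139, 897107, 51499, 0]),
    ("R.QEAVALLQGQR.H", "IPI00307991", [793418583, 145308, 139088540, 85523]),
    ("K.IHYLDTTTLIEPVAR.S", "IPI00307991", [16062973521, 349417, 3223960034, 4617083]),
]


def collapse_to_proteins(peptides):
    """Sum peptide quantities per accession (assumed collapse rule)."""
    totals = defaultdict(lambda: [0] * len(SAMPLES))
    for _sequence, accession, quantities in peptides:
        for i, quantity in enumerate(quantities):
            totals[accession][i] += quantity
    return dict(totals)


def normalize_row(row):
    """Scale a row so its largest value is 1.0 (assumed normalization)."""
    peak = max(row)
    return [value / peak if peak else 0.0 for value in row]
```

Normalizing per row makes proteins with very different absolute intensities comparable across the four samples, which is what a heatmap of relative abundance needs.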

Page 11: Larry Lam  Southern California Bioinformatics Summer Institute 2009

NS Filter/Consolidation

[Figure: NS Filter/Consolidation steps. Labels: Quantitation Output File; Collapse Peptides To Protein Quantity; Remove Insignificant Proteins; Remove Known Contaminants; Heatmap Analysis.]

Page 12: Larry Lam  Southern California Bioinformatics Summer Institute 2009

NS Filter/Consolidation

[Figure: NS Filter/Consolidation steps. Labels: Quantitation Output File; Collapse Peptides To Protein Quantity; Remove Insignificant Proteins; Remove Known Contaminants; Heatmap Analysis.]

$$\text{Protein Enrichment Factor} = \frac{\text{Median} - \text{NTAP Median}}{s_{\text{Protein}} + s_{\text{NTAP}}}$$
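One possible reading of the enrichment factor is sketched below in Python (the platform itself is C#). It assumes that "Median" is the median of a protein's peptide quantities in a bait sample, that "NTAP Median" is the same statistic in the NTAP control, and that s denotes a sample standard deviation. The slide defines none of these, and the function name is hypothetical.

```python
from statistics import median, stdev


def enrichment_factor(bait_quantities, ntap_quantities):
    """(Median - NTAP Median) / (s_Protein + s_NTAP).

    s is read as a sample standard deviation, which is an assumption;
    each list therefore needs at least two values.
    """
    spread = stdev(bait_quantities) + stdev(ntap_quantities)
    if spread == 0:
        raise ValueError("enrichment factor is undefined when both spreads are zero")
    return (median(bait_quantities) - median(ntap_quantities)) / spread
```

Under this reading the factor behaves like a signal-to-noise ratio: it grows when the bait sample exceeds the NTAP control by more than the combined spread of the two. The cutoff used to call a protein insignificant is not given on the slide.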

Page 13: Larry Lam  Southern California Bioinformatics Summer Institute 2009

NS Filter/Consolidation

[Figure: NS Filter/Consolidation steps. Labels: Quantitation Output File; Collapse Peptides To Protein Quantity; Remove Insignificant Proteins; Remove Known Contaminants; Heatmap Analysis.]

• Configuration File of Known Contaminants
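A contaminant filter driven by a configuration file might look like the Python sketch below. The slide does not show the file format, so the layout assumed here (one IPI accession per line, with `#` starting a comment) and both function names are hypothetical.

```python
def parse_contaminants(text):
    """Return the set of accessions listed in the configuration text."""
    accessions = set()
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if entry:
            accessions.add(entry)
    return accessions


def remove_contaminants(proteins, contaminants):
    """Keep only proteins whose accession is not a known contaminant."""
    return {
        accession: row
        for accession, row in proteins.items()
        if accession not in contaminants
    }
```

Keeping the list in a file rather than in code lets the lab extend it without rebuilding the application.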

Page 14: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Statistical Analysis: Peptide Quantity Heatmap

[Figure: peptide quantity heatmap shown in Java TreeView. Columns: Crk I; Crk L; Crk II; NTAP. Color scale: Low Quantity to High Quantity.]

• Cbl Peptides
• Crk I Peptides

Page 15: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Experimental Workflow

[Figure: current workflow diagram. Labels: Experimental Protocol; Complex Purification; Phospho Profiling; Mass Spectrometry; Quantitation Pipeline; IPI Proteomics Database; Quantitation Output File; [Complex] NS Filter/Consolidation; Manual Organization/Analysis.]

[Figure: new workflow diagram. Labels: Quantitation Import; Local DB; Statistical Analysis; External Sources.]

Page 16: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Program Design

[Figure: program design diagram. Labels: C# GUI Application; Quantitation Output File; DATA IMPORT; MySQL Database; DATA QUERY; Quantitation Data Set; R Statistical Function.]

• Programming Language: C#
• Database: MySQL
  – Free
• Statistical Computing: R
  – Free, Accessible from C#

Page 17: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Data Import Methodology

1. Define Meta Data (Descriptors) And Relationships About The Quantitation Values
2. Create The Tables In MySQL
3. Access Using MySQL Connector/Net

http://dev.mysql.com/downloads/connector/

Page 18: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Statistical Analysis Methodology

• R Language and Environment for Statistical Computing and Graphics
  - Modeling
  - Statistical Tests
  - Clustering
  - Heatmaps
• Develop a Graphical User Interface To R Functions
  - Access R Functions Through R-(D)COM Interface

http://cran.r-project.org/contrib/extra/dcom/

Page 19: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Software Demo

Page 20: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Conclusion

• Management Software
  – Standardized approach in maintaining lab data
• Analyze Data Sets
  – Analysis tools highly accessible to biologists of various technical levels
• Combine Data Sets
  – Potentially lead to new discoveries

Page 21: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Future Work

• Add More Links To External Databases
• Enhance Data Query
• Include More Analysis Functions

Page 22: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Acknowledgments

• Graeber Lab Members
  – Dr. Thomas Graeber
  – Dr. Björn Titz
• SoCalBSI Faculty and Members
  – Dr. Jamil Momand
  – Dr. Sandy Sharp
  – Dr. Nancy Warter-Perez
  – Dr. Wendie Johnston
  – Dr. Beverly Krilowicz
  – Ronnie Cheng
• Funding

Page 23: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Main Window

Page 24: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Main Window: Options

Page 25: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Batch Import

Page 26: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Batch Information

Page 27: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Sample Information

Page 28: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Sample Information: Technical Replicates

Page 29: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Feature Type

Page 30: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Features

Page 31: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Project Assignment

Page 32: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Batch Protocol Assignment

Page 33: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Biological System Assignment

Page 34: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Import

Page 35: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Batch Query

Page 36: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Feature Type Selection

Page 37: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Matrix/Heatmap Dialog

Page 38: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Heatmap Options

Page 39: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Heatmap Options

Page 40: Larry Lam  Southern California Bioinformatics Summer Institute 2009

Data Import Design Methodology

1. Define Meta Data (Descriptors) About The Quantitation Values
   - Define Relationships
2. Create The Tables In MySQL
3. Develop Support for MySQL Access
   - MySQL Connector

[Figure: database design diagram. Entities and attributes:
  Batch: Label, Description, Experimenter, Date
  Sample: Label, Description, Quality
  Feature: Label, Description, Feature Type
  Feature Value: Value, Value Type]
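The entities on this slide can be turned into tables roughly as follows. This sketch uses Python's built-in `sqlite3` module as a stand-in for MySQL and MySQL Connector, which the platform actually uses from C#. Table and column names follow the diagram labels, but the keys and the links between tables are assumptions, because the diagram's relationship lines did not survive extraction. The helper names are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE batch (
    id INTEGER PRIMARY KEY,
    label TEXT NOT NULL,
    description TEXT,
    experimenter TEXT,
    date TEXT
);
CREATE TABLE sample (
    id INTEGER PRIMARY KEY,
    batch_id INTEGER NOT NULL REFERENCES batch(id),  -- assumed link
    label TEXT NOT NULL,
    description TEXT,
    quality TEXT
);
CREATE TABLE feature (
    id INTEGER PRIMARY KEY,
    label TEXT NOT NULL,
    description TEXT,
    feature_type TEXT
);
CREATE TABLE feature_value (
    sample_id INTEGER NOT NULL REFERENCES sample(id),    -- assumed link
    feature_id INTEGER NOT NULL REFERENCES feature(id),  -- assumed link
    value REAL,
    value_type TEXT
);
"""


def create_database(path=":memory:"):
    """Open a database and create the four tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def import_value(conn, sample_id, feature_id, value, value_type="quantity"):
    """Store one quantitation value using a parameterized insert."""
    conn.execute(
        "INSERT INTO feature_value (sample_id, feature_id, value, value_type) "
        "VALUES (?, ?, ?, ?)",
        (sample_id, feature_id, value, value_type),
    )
```

Storing each quantity as a (sample, feature, value) row keeps the schema independent of how many samples or feature types a batch contains, which fits the goal of combining multiple data sets.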