Spectral Libraries: Productivity Enhancers for Cancer Proteomics NCI Board of Scientific Advisors June 22, 2009 Christopher R. Kinsinger Ph.D. National Cancer Institute Clinical Proteomic Technologies for Cancer


Page 1: Spectral Libraries - National Institutes of Health

Spectral Libraries: Productivity Enhancers for Cancer Proteomics

NCI Board of Scientific Advisors
June 22, 2009

Christopher R. Kinsinger, Ph.D.
National Cancer Institute

Clinical Proteomic Technologies for Cancer

Page 2

Today’s presentation

Outline:

• State of cancer proteomics

• Critical role of Spectral Libraries
    • In discovery proteomics
    • In targeted proteomics

• Proposed concept

• Questions

Page 3

Current investment in proteomics

[Figure: NCI/NIH funding in proteomics (dollars, $ millions) and number of new FDA-approved protein analytes]

What if a small project could increase the efficiency of these dollars by 50%?

Page 4

Enhancing discovery- and verification (targeted)-stage components in a biomarker development pipeline

Stages shown in the pipeline diagram:

• Bio-Specimens: plasma, tissue, proximal fluids
• Discovery: tissue, proximal fluids
• Verification: blood, population
• Clinical Validation: blood, population

Labels shown between stages:

• “Hypotheses”: untargeted proteomics, genomics
• Biomarker candidates
• Found in blood? Higher in cancer?
• Biomarkers worth evaluating

Page 5

Current state of peptide identification

Proteomics experiment:
Protein mixture → Physical digest → Mass spec analysis → MS/MS spectrum (m/z)

Human genome sequence library:
Theoretical digest (DNA → RNA → Proteins → Peptides) → Fragmentation model → Theoretical MS/MS spectra (m/z)

Match product ion spectra → Score: ?
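The theoretical branch of this workflow can be illustrated with a minimal sketch. It assumes a simple trypsin rule (cleave after K or R unless the next residue is P) and a basic fragmentation model that predicts only singly charged b- and y-ion positions; the function names `tryptic_digest` and `fragment_ions` are hypothetical and are not taken from any search engine named in this talk.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O (Da)
PROTON = 1.007276   # mass of a proton (Da)


def tryptic_digest(protein):
    """Theoretical digest: cleave after K or R unless followed by P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return peptides


def fragment_ions(peptide):
    """Return (b_ions, y_ions): singly charged m/z values for one peptide."""
    masses = [RESIDUE_MASS[residue] for residue in peptide]
    b_ions, y_ions = [], []
    running = 0.0
    for mass in masses[:-1]:            # N-terminal fragments
        running += mass
        b_ions.append(running + PROTON)
    running = 0.0
    for mass in reversed(masses[1:]):   # C-terminal fragments
        running += mass
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions


print(tryptic_digest("MKRPAGKLL"))  # ['MK', 'RPAGK', 'LL']
b_ions, y_ions = fragment_ions("GAK")
print([round(mz, 3) for mz in b_ions], [round(mz, 3) for mz in y_ions])
```

This sketch yields peak positions only. It carries no intensity information, so every predicted peak is treated as equally likely to appear.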

Page 6

The Spectral Library solution

Used in chemical, drug, forensics industry

Catalog of the highest quality MS spectra

Proven method to identify unknown spectra

Maintained by NIST

THE GOLD STANDARD

What is a Spectral Library?

Page 7

The Spectral Library Solution

Spectral quality filter by NIST metrics (CPTAC)

MS/MS data

Clean data into Spectral Library

Consensus peptide IDs (multiple algorithms)

X! Tandem

What is a Spectral Library?

How is a Spectral Library built?
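The "consensus peptide IDs (multiple algorithms)" step can be sketched as a simple vote across search engines. This is an illustrative assumption about how agreement might be checked, not the actual NIST or CPTAC procedure; the function name `consensus_ids`, the `min_agree` threshold, the spectrum IDs, and the peptide strings are all hypothetical.

```python
from collections import Counter


def consensus_ids(assignments, min_agree=2):
    """Keep a spectrum only if enough search engines agree on its peptide.

    assignments: {spectrum_id: {engine_name: peptide or None}}
    Returns {spectrum_id: agreed_peptide}.
    """
    accepted = {}
    for spectrum_id, by_engine in assignments.items():
        # Count votes per peptide, ignoring engines that reported nothing.
        votes = Counter(pep for pep in by_engine.values() if pep)
        if not votes:
            continue
        peptide, count = votes.most_common(1)[0]
        if count >= min_agree:
            accepted[spectrum_id] = peptide
    return accepted


# Made-up identifications from three engines.
example = {
    "scan_001": {"X! Tandem": "AAAK", "OMSSA": "AAAK", "Comet": "AAAK"},
    "scan_002": {"X! Tandem": "CCCR", "OMSSA": "DDDK", "Comet": None},
    "scan_003": {"X! Tandem": "EEER", "OMSSA": None, "Comet": "EEER"},
}
print(consensus_ids(example))
```

Spectra that survive such a vote would still need to pass the spectral quality filter before entering the library.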

Page 8

The Spectral Library Solution

How is a Spectral Library built?

What are the advantages?

• Adds a 2nd dimension of search
    • Intensity of peaks
    • Peptide sequence

• Improves speed and reliability in peptide ID

• Compilation of all observable peptides

What is a Spectral Library?

Page 9

How spectral libraries work

[Figure, panel 1: physical spectrum plotted against theoretical spectrum; m/z axis from 390 to 1690, relative intensity from 0 to 100, peaks labeled by m/z]

[Figure, panel 2: physical spectrum plotted against library spectrum; m/z axis from 180 to 840, relative intensity from 0 to 100, peaks labeled by m/z]
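One common way to compare a physical spectrum with a library spectrum is a normalized dot product (cosine similarity) over binned peaks, which rewards agreement in peak intensity as well as peak position. The sketch below is a generic illustration under that assumption, not the scoring function of any specific library search tool. The function names and all peak lists are made up for the example.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(query, reference, bin_width=1.0):
    """Normalized dot product between two peak lists (0 = no shared bins)."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    dot = sum(q[k] * r[k] for k in q.keys() & r.keys())
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in r.values())))
    return dot / norm if norm else 0.0


# Illustrative peak lists: same m/z positions, different intensity patterns.
physical = [(242.0, 90.0), (370.0, 65.0), (500.0, 25.0)]
library = [(242.0, 100.0), (370.0, 60.0), (500.0, 30.0)]       # measured intensities
theoretical = [(242.0, 100.0), (370.0, 100.0), (500.0, 100.0)]  # positions only

print(cosine_similarity(physical, library))
print(cosine_similarity(physical, theoretical))
```

In this toy case the library entry scores higher than the flat theoretical spectrum because its intensity pattern resembles the measurement.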

Page 10

Enhancing discovery-stage: recombinant proteins

Enhancing identification of low abundant proteins

• 48 human proteins spiked into yeast

• High-end MS platforms

• 500%+ increase in identification!

Spectral library enables identification of low-abundant proteins

[Figure: proteins identified (axis 0 to 50) versus concentration (20, 6.7, 2.2, 0.74, 0.25 fmol/µL) for Spectral Library, Comet, X! Tandem, and OMSSA]

Page 11

Enhancing discovery-stage: clinical tissue

Discovery stage: colon tissue data

Over 35% additional proteins identified when library contains one additional dataset

[Figure: proteins identified in colon tissue data by non-library methods, the 2008 library, and the 2008 library + Vanderbilt tissue data]

Enriching library enriches discovery

Page 12

Spectral libraries for verification (targeted) stage proteomics

From Steve Carr

• Quantitative mass spectrometry (Multiple Reaction Monitoring)

Page 13

Enhancing verification-stage

Steps to developing a quantitative protein assay:

• Select 3-5 target peptides
    • Representative of parent protein
    • Detectable by mass spectrometer
• Synthesize labeled peptides
• Develop anti-peptide antibodies
• Analyze on robust, affordable instrument platform

Anderson, et al. Mol. Cell. Prot. 2009 in press

Provided by Spectral Library
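A spectral library could support the peptide-selection step roughly as sketched below. The sketch makes two assumptions that are not stated on the slide: "representative of parent protein" is read as "observed for only that protein in the library", and "detectable" is approximated by how often the peptide has been observed. The function name, the tuple layout, and the example entries are hypothetical.

```python
from collections import defaultdict


def select_target_peptides(library, protein, max_peptides=5):
    """Pick candidate assay peptides for one protein.

    library: iterable of (protein_id, peptide, times_observed) tuples.
    Returns up to max_peptides peptides seen only for `protein`,
    most frequently observed first.
    """
    owners = defaultdict(set)   # peptide -> proteins it was observed in
    counts = defaultdict(int)   # (protein, peptide) -> observation count
    for protein_id, peptide, times_observed in library:
        owners[peptide].add(protein_id)
        counts[(protein_id, peptide)] += times_observed

    candidates = [
        (counts[(protein, peptide)], peptide)
        for peptide, proteins in owners.items()
        if proteins == {protein}          # unique to the target protein
    ]
    # Highest observation count first; ties broken alphabetically.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [peptide for _, peptide in candidates[:max_peptides]]


# Made-up library entries.
entries = [
    ("PROT_A", "AAAK", 12),
    ("PROT_A", "CCCR", 40),
    ("PROT_A", "DDDK", 3),
    ("PROT_B", "DDDK", 8),   # shared peptide, so excluded for PROT_A
    ("PROT_B", "EEER", 20),
]
print(select_target_peptides(entries, "PROT_A", max_peptides=3))
```

The peptides returned would then go on to the remaining steps: labeled synthesis, antibody development, and analysis.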

Page 14

Overview of proposed concept: Adding value to biomarker development

Advantages of Spectral Library

• Accelerates and improves ID of low abundant proteins for discovery
• Provides increasing registry of known peptides
• Becomes an index of assay design
• Creates a community-wide resource and shared interest to foster interactions among diverse research groups

Human Spectral Library

• High quality human biological samples
    • Tissue
    • Recombinant proteins
• High quality peptide spectra
• Coordination among data generators, library developers and data integrators

Goal: Develop public library that anchors proteomic analysis to the physical properties of a peptide through its MS/MS spectrum

Strengthens the first stages of the biomarker development pipeline

Page 15

Representation of cancer tissue in library is dismal

• All other tissue types are <1% of current library

• Project will catalog proteins from 15 tissue types

• Increase total number of peptides in library by 50%

Sample source      % of files in current library
Colon              2.31
CSF                1.07
Kidney             1.59
Liver              2.40
Lymph              6.15
Plasma             77.77
Red Blood Cells    2.95

Page 16

Further expanding a library with non-native proteins

• Begin with tissue samples

• Identify key proteins missing from library (TCGA, SPOREs, ICBP, etc.)

• Fill gaps with recombinant proteins or synthetic peptides

• At least 70% of peptides are unmodified

• Complete coverage of protein

• Aids identification of high-priority, low-abundant proteins

Page 17

Evaluation criteria

• Increase number of peptide spectra in spectral library

• Increase number of proteins represented in library

• Provide a sustainable, caBIG-compatible data repository for proteomics data

Page 18

Components of concept: Leveraging NCI resources and partners

NCI resources

1. Sample source (tissue, protein/peptide production)
2. Protein analysis (data generators)
    • Partners: NCI-F
    • Generate high quality spectra to more extensively represent all human protein sequences
3. Data coordinating center
    • Partner: CBIIT/caBIG®
    • Maintain CPAS database of experiments, peptides, proteins, and raw spectra; ensure quality and completeness of annotation; leverage caBIG® data portal capabilities and Cancer Center network

Leveraged activities

4. Spectral Library development
    • Partner: NIST
    • Receive peptide spectra from CBIIT and incorporate into human spectral library
5. Data integration with other resources
    • Partner: NCBI
    • Acquire data submitted to the NCI for incorporation into NIH Peptidome database

Page 19

Proposed Spectral Library (timeline & budget)

Columns: Initiative title, FY10, FY11, FY12, $

1) Biospecimen tissue acquisition (OBBR): $666,667, $433,333; 1.1 million
2) Recombinant protein production (RFP)
3) Protein analysis (Data generators) (RFP)
4) Data coordinating center (CBIIT)
5) Spectral Library development (NIST): Leveraged activity
6) Data integration with other NIH resources (NCBI Peptidome): Leveraged activity

Remaining yearly amounts and totals shown on the slide:

$800,000, $800,000, $800,000; 2.4 million
$300,000, $300,000; 600,000
$233,333, $233,333, $233,333; 700,000

Yearly totals: 1.7 million, 1.7 million, 1.3 million

Total: 4.8 million
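As a quick arithmetic check, the yearly amounts on this slide can be summed and compared with the stated totals. The grouping of yearly amounts with totals below is inferred from the arithmetic alone; which initiative each group belongs to is not assumed. The function name `check_budget` is hypothetical.

```python
def check_budget(line_items, tolerance=1):
    """Verify each group of yearly amounts against its stated total.

    line_items: list of (stated_total, [yearly amounts]) in dollars.
    tolerance: allowed rounding difference in dollars.
    Returns the grand total of all yearly amounts.
    """
    grand_total = 0
    for stated_total, yearly in line_items:
        subtotal = sum(yearly)
        if abs(subtotal - stated_total) > tolerance:
            raise ValueError(f"{yearly} sums to {subtotal}, not {stated_total}")
        grand_total += subtotal
    return grand_total


# Figures as they appear on the slide, in dollars.
items = [
    (1_100_000, [666_667, 433_333]),
    (2_400_000, [800_000, 800_000, 800_000]),
    (600_000, [300_000, 300_000]),
    (700_000, [233_333, 233_333, 233_333]),
]
total = check_budget(items)
print(f"{total / 1e6:.1f} million")
```

The one-dollar tolerance absorbs the rounding in the $233,333 entries.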

Page 20

Summary

• Registry of high quality, assigned peptide spectra
• Enhance biomarker development in both the discovery and verification phases
• Augment existing spectral libraries by 50% with spectra from cancer-relevant proteins
• Spectral libraries will increase efficiency of NCI’s investment in proteomics