Transcript
Page 2: BITS - Protein inference from mass spectrometry data

BITS MS Data Processing – Protein Inference
UGent, Gent, Belgium – 16 December 2011

Kenny [email protected]

Lennart [email protected]

Proteomics Services Group
European Bioinformatics Institute

Hinxton, Cambridge
United Kingdom
www.ebi.ac.uk

Kenny Helsens

[email protected]

Computational Omics and Systems Biology Group

Department of Medical Protein Research, VIB
Department of Biochemistry, Ghent University

Ghent, Belgium

peptide validation and protein inference

Page 3: BITS - Protein inference from mass spectrometry data

Raw data

Peaklists

Peptide sequences

Protein accession numbers

[Figure arrows: data size, ambiguity]

See: Martens and Hermjakob, Molecular BioSystems, 2007

Data processing and information ambiguity

Page 4: BITS - Protein inference from mass spectrometry data

PEPTIDE IDENTIFICATION VALIDATION

Page 5: BITS - Protein inference from mass spectrometry data

Populations and individuals

10,000 peptide-to-spectrum matches

5% decoy hits
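
The slide's numbers can be turned into a rough error estimate. The following is a minimal sketch, assuming a concatenated target-decoy search in which each decoy hit stands in for one expected false hit among the targets; the function name `estimate_fdr` is illustrative and does not come from the slides.

```python
def estimate_fdr(total_psms: int, decoy_fraction: float) -> float:
    """Estimate the false discovery rate among the target hits.

    Assumption: concatenated target-decoy search, so the number of
    decoy hits approximates the number of false target hits.
    """
    decoys = round(total_psms * decoy_fraction)
    targets = total_psms - decoys
    if targets <= 0:
        raise ValueError("no target hits to estimate an FDR on")
    return decoys / targets


# 10,000 peptide-to-spectrum matches with 5% decoy hits:
# 500 decoys against 9,500 target hits.
fdr = estimate_fdr(10_000, 0.05)
print(f"Estimated FDR among target hits: {fdr:.2%}")
```

The estimate describes the population of matches as a whole; it does not say which individual identifications are the wrong ones.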

Page 6: BITS - Protein inference from mass spectrometry data

Suspect peptide identifications happen.

The problem is that finding them requires detailed analysis of a single spectrum and its identifications, amongst thousands of other spectra…

Eliminating false positives

Page 7: BITS - Protein inference from mass spectrometry data

Automated interpretation

The Netherlands??

Page 8: BITS - Protein inference from mass spectrometry data

Manual interpretation

Tyrosine phosphorylation

See: Ghesquière and Helsens, Proteomics, 2010

Page 9: BITS - Protein inference from mass spectrometry data

Peptizer expert system

Aggregation of the votes

Agent a

Agent b

Agent c

Agent d

Agent e

Votes cast: +1, +1, 0, -1, +1

Trusted subset

Suspicious subset

Confident Peptide Identifications

See: Helsens et al, Molecular and Cellular Proteomics, 2008
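
The voting scheme on this slide can be sketched in a few lines. This is only an illustration of the idea, not Peptizer's actual API: the agents, the record fields (`score`, `ppm_error`, `length`) and the threshold are hypothetical, and the sketch assumes that a positive vote means "this identification looks suspicious".

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical record for one identification; the field names are assumptions.
Identification = Dict[str, float]
Agent = Callable[[Identification], int]  # each agent votes +1, 0 or -1


def partition(
    identifications: List[Identification],
    agents: List[Agent],
    threshold: int,
) -> Tuple[List[Identification], List[Identification]]:
    """Split identifications into (trusted, suspicious) subsets.

    Assumption: votes are summed, and an identification is flagged as
    suspicious when the sum reaches the threshold.
    """
    trusted: List[Identification] = []
    suspicious: List[Identification] = []
    for ident in identifications:
        total = sum(agent(ident) for agent in agents)
        if total >= threshold:
            suspicious.append(ident)
        else:
            trusted.append(ident)
    return trusted, suspicious


# Illustrative agents only; real agents would encode expert knowledge.
def low_score(ident: Identification) -> int:
    return 1 if ident["score"] < 30 else 0


def large_mass_error(ident: Identification) -> int:
    return 1 if abs(ident["ppm_error"]) > 10 else 0


def long_peptide(ident: Identification) -> int:
    return -1 if ident["length"] >= 12 else 0


hits = [
    {"score": 25, "ppm_error": 15, "length": 8},   # votes: +1, +1, 0
    {"score": 60, "ppm_error": 2, "length": 14},   # votes: 0, 0, -1
]
trusted, suspicious = partition(hits, [low_score, large_mass_error, long_peptide], threshold=2)
```

Only the suspicious subset then needs manual inspection, which keeps the expert's attention on a small fraction of the spectra.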

Page 10: BITS - Protein inference from mass spectrometry data

Peptizer expert system

See: Helsens et al, Molecular and Cellular Proteomics, 2008

Page 11: BITS - Protein inference from mass spectrometry data

Peptizer expert system

See: Helsens et al, Molecular and Cellular Proteomics, 2008

Page 12: BITS - Protein inference from mass spectrometry data

PROTEIN INFERENCE

Page 13: BITS - Protein inference from mass spectrometry data

[Figure: one gene, its transcripts, and their translations, drawn as exon structures built from exons 1a, 1b, 2, 3, 4, 5, 6a and 6b. Legend: Intron, Exon UTR, Exon CDS, Peptide. Peptide classes: matching all transcripts, matching a transcript subset, matching exactly 1 translation. Additional label: redundant.]

Not all peptides are created equal
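
The three peptide classes in the figure legend can be expressed as a small classification step. A minimal sketch, assuming the peptide-to-translation matches are already known; the peptide and translation names below are made up for illustration.

```python
from typing import Dict, Set


def classify_peptides(
    matches: Dict[str, Set[str]], all_translations: Set[str]
) -> Dict[str, str]:
    """Label each peptide by how many translations it matches.

    Labels: "all" (matches every translation), "unique" (matches
    exactly one), "subset" (anything in between). If there is only one
    translation in total, "all" takes precedence over "unique".
    """
    labels: Dict[str, str] = {}
    for peptide, translations in matches.items():
        if not translations:
            raise ValueError(f"peptide {peptide!r} matches no translation")
        if translations == all_translations:
            labels[peptide] = "all"
        elif len(translations) == 1:
            labels[peptide] = "unique"
        else:
            labels[peptide] = "subset"
    return labels


# Hypothetical gene with three translations T1..T3.
translations = {"T1", "T2", "T3"}
matches = {
    "PEPTIDEA": {"T1", "T2", "T3"},  # shared by every translation
    "PEPTIDEB": {"T1", "T3"},        # shared by a subset
    "PEPTIDEC": {"T2"},              # identifies exactly one translation
}
labels = classify_peptides(matches, translations)
```

Only the "unique" peptides distinguish one translation from the others, which is why the peptides are not equally informative for protein inference.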

Page 14: BITS - Protein inference from mass spectrometry data

Sample preparation consequences

See: Nesvizhskii AI et al, Molecular and Cellular Proteomics, 2005

Page 15: BITS - Protein inference from mass spectrometry data

See: Nesvizhskii AI et al, Molecular and Cellular Proteomics, 2005

Sample preparation consequences

Page 16: BITS - Protein inference from mass spectrometry data

Minimal set (Occam):

peptides a b c d
proteins
prot X x x
prot Y x
prot Z x x x

Maximal set (anti-Occam):

peptides a b c d
proteins
prot X x x
prot Y x
prot Z x x x

Minimal set with maximal annotation (true Occam?):

peptides a b c d
proteins
prot X (-) x x
prot Y (+) x
prot Z (0) x x x

See: Martens and Hermjakob, Molecular BioSystems, 2007

Protein inference: a question of conviction
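
The difference between the maximal and the minimal set can be made concrete with a small sketch. The peptide-to-protein mapping below is hypothetical (the column positions of the slide's table are not recoverable from this transcript), and the minimal set is found by brute force, which is only practical for tiny examples.

```python
from itertools import combinations
from typing import Dict, List, Set


def maximal_set(protein_peptides: Dict[str, Set[str]], observed: Set[str]) -> List[str]:
    """Anti-Occam: report every protein that matches any observed peptide."""
    return sorted(p for p, peps in protein_peptides.items() if peps & observed)


def minimal_set(protein_peptides: Dict[str, Set[str]], observed: Set[str]) -> List[str]:
    """Occam: report a smallest protein set that explains all observed peptides."""
    if not observed:
        return []
    candidates = maximal_set(protein_peptides, observed)
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            covered = set().union(*(protein_peptides[p] for p in combo))
            if observed <= covered:
                return list(combo)
    raise ValueError("the observed peptides cannot be explained by these proteins")


# Hypothetical mapping, chosen for illustration only.
mapping = {
    "prot X": {"a", "b"},
    "prot Y": {"b"},
    "prot Z": {"b", "c", "d"},
}
observed = {"a", "b", "c", "d"}
print(maximal_set(mapping, observed))  # ['prot X', 'prot Y', 'prot Z']
print(minimal_set(mapping, observed))  # ['prot X', 'prot Z']
```

In this made-up case prot Y is dropped from the minimal set because its only peptide is already explained by the other proteins; whether it is truly absent from the sample is exactly the question of conviction the slide raises.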

Page 17: BITS - Protein inference from mass spectrometry data

ALGORITHMS FOR THE PROTEIN INFERENCE PROBLEM

Page 18: BITS - Protein inference from mass spectrometry data

• IDPicker: Zhang et al, Journal of Proteome Research, 2007

• ProteinProphet: Nesvizhskii AI et al, Analytical Chemistry, 2003

• DBToolkit: Martens et al, Bioinformatics, 2005, http://genesis.UGent.be/dbtoolkit

A few algorithms for protein inference

Page 19: BITS - Protein inference from mass spectrometry data

IDPicker parsimonious protein assembly

(I) Initialize

See: Zhang et al, Journal of Proteome Research, 2007

Page 20: BITS - Protein inference from mass spectrometry data

IDPicker parsimonious protein assembly

(II) Collapse

See: Zhang et al, Journal of Proteome Research, 2007

Page 21: BITS - Protein inference from mass spectrometry data

IDPicker parsimonious protein assembly

(III) Separate

See: Zhang et al, Journal of Proteome Research, 2007

Page 22: BITS - Protein inference from mass spectrometry data

IDPicker parsimonious protein assembly

(IV) Reduce

See: Zhang et al, Journal of Proteome Research, 2007
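
The four steps named on these slides (Initialize, Collapse, Separate, Reduce) can be sketched as follows. This is a simplified illustration and not IDPicker's implementation: it collapses only proteins (not peptides), uses a plain greedy set cover for the Reduce step, and breaks ties alphabetically, all of which are assumptions made here. The protein and peptide names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set, Tuple


def assemble(protein_peptides: Dict[str, Set[str]]) -> List[Tuple[str, ...]]:
    """Return a parsimonious list of protein groups explaining all peptides."""
    # (I) Initialize: the input dict is the bipartite protein-peptide graph.

    # (II) Collapse: proteins with identical peptide sets form one group.
    groups: Dict[FrozenSet[str], List[str]] = defaultdict(list)
    for protein, peptides in protein_peptides.items():
        if peptides:
            groups[frozenset(peptides)].append(protein)

    # (III) Separate: split the groups into connected components,
    # where two groups are linked if they share at least one peptide.
    remaining = set(groups)
    components: List[List[FrozenSet[str]]] = []
    while remaining:
        seed = remaining.pop()
        component, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            linked = {g for g in remaining if g & current}
            remaining -= linked
            component.extend(linked)
            frontier.extend(linked)
        components.append(component)

    # (IV) Reduce: greedy set cover inside each component.
    selected: List[Tuple[str, ...]] = []
    for component in components:
        uncovered = set().union(*component)
        while uncovered:
            # Most newly covered peptides first; ties broken by protein name.
            best = min(component, key=lambda g: (-len(g & uncovered), sorted(groups[g])))
            selected.append(tuple(sorted(groups[best])))
            uncovered -= best
    return sorted(selected)


example = {
    "P1": {"a", "b"},
    "P2": {"a", "b"},   # indistinguishable from P1
    "P3": {"b", "c"},
    "P4": {"c"},        # fully explained by P3
    "P5": {"x"},        # its own component
}
result = assemble(example)
print(result)  # [('P1', 'P2'), ('P3',), ('P5',)]
```

Proteins that cannot be told apart are reported together as one group, and P4 is dropped because every peptide it could explain is already covered.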

Page 23: BITS - Protein inference from mass spectrometry data

peptide probability

peptide weight

protein probability

In iteration 1, all weights w start off as 1/n, with n the degeneracy count for the peptide

peptide probability

See: Nesvizhskii AI et al., Analytical Chemistry, 2003

ProteinProphet: the simplified view
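
The interplay of peptide probability, peptide weight and protein probability can be sketched as an iteration. This is a rough illustration under stated assumptions rather than ProteinProphet itself: it assumes the protein probability is 1 minus the product of (1 - weight × peptide probability) over the protein's peptides, and that each weight is re-estimated as the protein's share of the summed probabilities of all proteins containing that peptide. Function and variable names are hypothetical.

```python
from math import prod
from typing import Dict, List, Set, Tuple


def protein_probabilities(
    peptide_prob: Dict[str, float],
    protein_peptides: Dict[str, Set[str]],
    iterations: int = 20,
) -> Dict[str, float]:
    """Iteratively estimate protein probabilities from peptide probabilities."""
    # Proteins containing each peptide; the list length is the degeneracy n.
    owners: Dict[str, List[str]] = {
        pep: [prot for prot, peps in protein_peptides.items() if pep in peps]
        for pep in peptide_prob
    }
    # Iteration 1: every weight starts at 1/n.
    weights: Dict[Tuple[str, str], float] = {
        (pep, prot): 1.0 / len(prots)
        for pep, prots in owners.items()
        for prot in prots
    }
    probs: Dict[str, float] = {}
    for _ in range(iterations):
        # Protein probability from the weighted peptide probabilities.
        probs = {
            prot: 1.0 - prod(
                1.0 - weights[(pep, prot)] * peptide_prob[pep]
                for pep in peps
                if pep in peptide_prob
            )
            for prot, peps in protein_peptides.items()
        }
        # Redistribute each shared peptide towards the more likely proteins.
        for pep, prots in owners.items():
            total = sum(probs[prot] for prot in prots)
            for prot in prots:
                weights[(pep, prot)] = probs[prot] / total if total > 0 else 1.0 / len(prots)
    return probs


# Hypothetical example: peptide "s" is shared, peptide "u" is unique to A.
probs = protein_probabilities(
    {"u": 0.9, "s": 0.8},
    {"A": {"u", "s"}, "B": {"s"}},
)
```

In this made-up example the shared peptide is gradually apportioned to protein A, which also has unique evidence, so protein B's probability shrinks with each iteration.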

Page 24: BITS - Protein inference from mass spectrometry data

Minimal set with maximal annotation:

peptides a b c d
proteins
prot X (-) x x
prot Y (+) x
prot Z (0) x x x

DBToolkit protein inference

Page 25: BITS - Protein inference from mass spectrometry data

peptides a b c d
proteins
prot X (-) x x
prot Y (+) x
prot Z (0) x x x

Some indications from the HUPO BPP

Page 26: BITS - Protein inference from mass spectrometry data

PROTEIN INFERENCE AND QUANTIFICATION

Page 27: BITS - Protein inference from mass spectrometry data

Some inference examples (i)

See: Colaert et al, Proteomics, 2010

http://genesis.ugent.be/rover/

Nice and easy, 1/1, only unique peptides (blue) and a narrow distribution

Page 28: BITS - Protein inference from mass spectrometry data

Some inference examples (ii)

See: Colaert et al, Proteomics, 2010

Nice and easy, down-regulated

http://genesis.ugent.be/rover/

Page 29: BITS - Protein inference from mass spectrometry data

Some inference examples (iii)

See: Colaert et al, Proteomics, 2010

A little less easy, up-regulated

http://genesis.ugent.be/rover/

Page 30: BITS - Protein inference from mass spectrometry data

Some inference examples (iv)

See: Colaert et al, Proteomics, 2010

A nice example of the mess of degenerate peptides

http://genesis.ugent.be/rover/

Page 31: BITS - Protein inference from mass spectrometry data

Some inference examples (v)

See: Colaert et al, Proteomics, 2010

A bit of chaos, but a defined core distribution

http://genesis.ugent.be/rover/
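
The examples above distinguish unique from degenerate peptides and narrow from chaotic ratio distributions. One way to summarise a protein in that spirit is sketched below; it is an assumption made for illustration and not Rover's method. It takes the median log2 ratio over unique peptides only and reports the median absolute deviation as a measure of how narrow the distribution is.

```python
from math import log2
from statistics import median
from typing import List, Tuple


def summarize_ratios(peptide_ratios: List[Tuple[float, bool]]) -> Tuple[float, float]:
    """Return (median log2 ratio, median absolute deviation).

    Each entry is (ratio, is_unique). Degenerate peptides are ignored,
    since their ratio may reflect several proteins at once.
    """
    logs = [log2(ratio) for ratio, is_unique in peptide_ratios if is_unique and ratio > 0]
    if not logs:
        raise ValueError("no unique peptide ratios available")
    centre = median(logs)
    spread = median([abs(value - centre) for value in logs])
    return centre, spread


# Hypothetical protein: three unique peptides and one degenerate outlier.
centre, spread = summarize_ratios(
    [(1.0, True), (2.0, True), (0.5, True), (8.0, False)]
)
```

A centre near 0 corresponds to a 1/1 ratio, and a small spread corresponds to the narrow distribution seen in the easy cases.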

Page 32: BITS - Protein inference from mass spectrometry data

Thank you!

Questions?

