http://www.bits.vib.be/training

BITS - Protein inference from mass spectrometry data

DESCRIPTION

This is the fifth presentation of the BITS training on 'Mass spec data processing'. It reviews the problems of determining protein sequences from mass spec data and how to deal with them, and gives an overview of useful tools. Thanks to the Compomics Lab of the VIB for their contribution.

Page 2: BITS - Protein inference from mass spectrometry data

BITS MS Data Processing – Protein Inference
UGent, Gent, Belgium – 16 December 2011

Kenny [email protected]

Lennart [email protected]

Proteomics Services Group
European Bioinformatics Institute

Hinxton, Cambridge
United Kingdom
www.ebi.ac.uk

kenny helsens

[email protected]

Computational Omics and Systems Biology Group

Department of Medical Protein Research, VIB
Department of Biochemistry, Ghent University

Ghent, Belgium

peptide validation and protein inference

Page 3: BITS - Protein inference from mass spectrometry data

Raw data

Peaklists

Peptide sequences

Protein accession numbers

[Figure axes: data size, ambiguity]

See: Martens and Hermjakob, Molecular BioSystems, 2007

Data processing and information ambiguity

Page 4: BITS - Protein inference from mass spectrometry data

PEPTIDE IDENTIFICATION VALIDATION

Page 5: BITS - Protein inference from mass spectrometry data

Populations and individuals

10,000 peptide-to-spectrum matches

5% decoy hits
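The arithmetic behind this slide can be sketched as follows. This is a minimal illustration that assumes the 5% decoy rate is read as an estimate of the fraction of false matches in the whole population (a common target-decoy convention, not something the slide states). The function name `expected_false_matches` is hypothetical.

```python
def expected_false_matches(n_matches: int, decoy_percent: int) -> int:
    """Expected number of wrong matches in a population of PSMs.

    Assumption: the decoy hit percentage approximates the share of
    false positives among all peptide-to-spectrum matches.
    """
    if n_matches < 0 or not 0 <= decoy_percent <= 100:
        raise ValueError("n_matches must be >= 0 and decoy_percent in 0..100")
    return n_matches * decoy_percent // 100


# Figures from the slide: 10,000 matches, 5% decoy hits.
n_false = expected_false_matches(10_000, 5)
print(n_false)  # 500
```

The estimate describes the population only: it says roughly how many matches are wrong, not which individual matches they are.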

Page 6: BITS - Protein inference from mass spectrometry data

Suspect peptide identifications happen.

The problem is that finding them requires detailed analysis of a single spectrum and its identifications, amongst thousands of other spectra…

Eliminating false positives

Page 7: BITS - Protein inference from mass spectrometry data

Automated interpretation

The Netherlands??

Page 8: BITS - Protein inference from mass spectrometry data

Manual interpretation

Tyrosine phosphorylation

See: Ghesquière and Helsens, Proteomics, 2010

Page 9: BITS - Protein inference from mass spectrometry data

Peptizer expert system

Aggregation of the votes

Votes cast:
Agent a: +1
Agent b: +1
Agent c: 0
Agent d: -1
Agent e: +1

Trusted subset

Suspicious subset

Confident Peptide Identifications

See: Helsens et al, Molecular and Cellular Proteomics, 2008
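The vote aggregation on this slide can be sketched as below. This is an illustration only, not Peptizer's actual code: the function names, the threshold value, and the convention that a high summed score marks an identification as suspicious are all assumptions made for the example.

```python
from typing import Dict


def aggregate_votes(votes: Dict[str, int]) -> int:
    """Sum the votes (+1, 0 or -1) cast by all agents for one identification."""
    return sum(votes.values())


def classify(votes: Dict[str, int], threshold: int = 2) -> str:
    """Assign an identification to the suspicious or the trusted subset.

    Assumption: a summed score at or above `threshold` means enough agents
    flagged the identification to warrant manual inspection.
    """
    return "suspicious" if aggregate_votes(votes) >= threshold else "trusted"


# Votes from the slide: agents a to e cast +1, +1, 0, -1, +1.
votes = {"a": 1, "b": 1, "c": 0, "d": -1, "e": 1}
print(aggregate_votes(votes))  # 2
print(classify(votes))         # suspicious
```

Raising the threshold moves fewer identifications into the suspicious subset, which reduces the manual validation load at the cost of letting more doubtful identifications through.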

Page 10: BITS - Protein inference from mass spectrometry data

Peptizer expert system

See: Helsens et al, Molecular and Cellular Proteomics, 2008

Page 11: BITS - Protein inference from mass spectrometry data

Peptizer expert system

See: Helsens et al, Molecular and Cellular Proteomics, 2008

Page 12: BITS - Protein inference from mass spectrometry data

PROTEIN INFERENCE

Page 13: BITS - Protein inference from mass spectrometry data

[Figure: a gene with exons 1a, 1b, 2, 3, 4, 5, 6a and 6b, shown with its transcripts and translations. Legend: intron, exon UTR, exon CDS, peptide. Peptide classes: matching all transcripts, matching a transcript subset, matching exactly 1 translation, redundant.]

Not all peptides are created equal

Page 14: BITS - Protein inference from mass spectrometry data

Sample preparation consequences

See: Nesvizhskii AI et al, Molecular and Cellular Proteomics, 2005

Page 15: BITS - Protein inference from mass spectrometry data

See: Nesvizhskii AI et al, Molecular and Cellular Proteomics, 2005

Sample preparation consequences

Page 16: BITS - Protein inference from mass spectrometry data

peptides: a b c d
proteins:
prot X: x x
prot Y: x
prot Z: x x x
Minimal set (Occam)

peptides: a b c d
proteins:
prot X: x x
prot Y: x
prot Z: x x x
Maximal set (anti-Occam)

peptides: a b c d
proteins:
prot X (-): x x
prot Y (+): x
prot Z (0): x x x
Minimal set with maximal annotation (true Occam?)

See: Martens and Hermjakob, Molecular BioSystems, 2007

Protein inference: a question of conviction
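The difference between the minimal (Occam) and maximal (anti-Occam) sets can be sketched in a few lines. The peptide-to-protein assignments below are invented for illustration and only respect the number of peptides per protein; the greedy cover is one common heuristic for the minimal set, not necessarily the one used by any specific tool.

```python
from typing import Dict, List, Set


def maximal_protein_set(protein_to_peptides: Dict[str, Set[str]]) -> List[str]:
    """Anti-Occam: report every protein matched by at least one peptide."""
    return sorted(p for p, peps in protein_to_peptides.items() if peps)


def minimal_protein_set(protein_to_peptides: Dict[str, Set[str]]) -> List[str]:
    """Occam: greedily pick proteins until every peptide is explained."""
    uncovered: Set[str] = set().union(*protein_to_peptides.values())
    chosen: List[str] = []
    while uncovered:
        # Take the protein explaining the most still-unexplained peptides;
        # ties are broken alphabetically because max keeps the first maximum.
        best = max(sorted(protein_to_peptides),
                   key=lambda p: len(protein_to_peptides[p] & uncovered))
        chosen.append(best)
        uncovered -= protein_to_peptides[best]
    return sorted(chosen)


# Hypothetical assignments (2, 1 and 3 peptides for X, Y and Z).
example = {
    "prot X": {"a", "c"},
    "prot Y": {"b"},
    "prot Z": {"b", "c", "d"},
}
print(minimal_protein_set(example))  # ['prot X', 'prot Z']
print(maximal_protein_set(example))  # ['prot X', 'prot Y', 'prot Z']
```

In this example prot Y is dropped from the minimal set because its only peptide is also explained by prot Z. The third option on the slide would keep the minimal set but still record such proteins as annotation.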

Page 17: BITS - Protein inference from mass spectrometry data

ALGORITHMS FOR THE PROTEIN INFERENCE PROBLEM

Page 18: BITS - Protein inference from mass spectrometry data

• IDPicker: Zhang et al, Journal of Proteome Research, 2007

• ProteinProphet: Nesvizhskii AI et al, Analytical Chemistry, 2003

• DBToolkit: Martens et al, Bioinformatics, 2005, http://genesis.UGent.be/dbtoolkit

A few algorithms for protein inference

Page 19: BITS - Protein inference from mass spectrometry data

IDPicker parsimonious protein assembly

(I) Initialize

See: Zhang et al, Journal of Proteome Research, 2007

Page 20: BITS - Protein inference from mass spectrometry data

IDPicker parsimonious protein assembly

(II) Collapse

See: Zhang et al, Journal of Proteome Research, 2007

Page 21: BITS - Protein inference from mass spectrometry data

IDPicker parsimonious protein assembly

(III) Separate

See: Zhang et al, Journal of Proteome Research, 2007

Page 22: BITS - Protein inference from mass spectrometry data

IDPicker parsimonious protein assembly

(IV) Reduce

See: Zhang et al, Journal of Proteome Research, 2007
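The four steps named on these slides (initialize, collapse, separate, reduce) can be sketched on a small peptide-protein graph. This is a simplified illustration and not IDPicker's code: all function names and the example data are hypothetical, only proteins (not peptides) are collapsed, and the reduce step uses a plain greedy set cover.

```python
from collections import defaultdict


def initialize(psms):
    """(I) Build the bipartite graph as protein -> set of matched peptides."""
    graph = defaultdict(set)
    for peptide, protein in psms:
        graph[protein].add(peptide)
    return dict(graph)


def collapse(graph):
    """(II) Merge proteins with identical peptide sets into one group."""
    by_peptides = defaultdict(list)
    for protein, peptides in graph.items():
        by_peptides[frozenset(peptides)].append(protein)
    return {tuple(sorted(prots)): set(peps) for peps, prots in by_peptides.items()}


def separate(groups):
    """(III) Split the graph into connected components."""
    remaining = dict(groups)
    components = []
    while remaining:
        group, peptides = remaining.popitem()
        component, seen = {group: peptides}, set(peptides)
        grew = True
        while grew:  # keep absorbing groups that share a peptide
            grew = False
            for other in list(remaining):
                if remaining[other] & seen:
                    seen |= remaining[other]
                    component[other] = remaining.pop(other)
                    grew = True
        components.append(component)
    return components


def reduce_component(component):
    """(IV) Greedy set cover: keep groups until all peptides are explained."""
    uncovered = set().union(*component.values())
    kept = []
    while uncovered:
        best = max(sorted(component), key=lambda g: len(component[g] & uncovered))
        kept.append(best)
        uncovered -= component[best]
    return kept


def assemble(psms):
    """Run steps I to IV and return the retained protein groups."""
    kept = []
    for component in separate(collapse(initialize(psms))):
        kept.extend(reduce_component(component))
    return sorted(kept)


# Hypothetical (peptide, protein) matches.
psms = [("pepA", "P1"), ("pepB", "P1"), ("pepA", "P2"), ("pepB", "P2"),
        ("pepB", "P3"), ("pepC", "P4"), ("pepD", "P4"), ("pepD", "P5"),
        ("pepE", "P6")]
print(assemble(psms))  # [('P1', 'P2'), ('P4',), ('P6',)]
```

P1 and P2 are indistinguishable and end up in one group, while P3 and P5 are dropped because their peptides are already explained by larger groups.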

Page 23: BITS - Protein inference from mass spectrometry data

[Equation figure with labels: protein probability, peptide weight, peptide probability]

In iteration 1, all weights w start off as 1/n, with n the degeneracy count for the peptide.

See: Nesvizhskii AI et al., Analytical Chemistry, 2003

ProteinProphet: the simplified view
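The simplified view can be sketched as an iteration. The sketch assumes the commonly cited form in which a protein's probability is 1 minus the product of (1 - w·p) over its peptides, and each weight is updated to the protein's share of the summed probabilities of all proteins sharing that peptide. The real ProteinProphet also adjusts peptide probabilities (for example by the number of sibling peptides), which is left out here. Function and variable names are hypothetical.

```python
from math import prod


def protein_probabilities(peptide_probs, peptide_to_proteins, n_iter=10):
    """Iteratively estimate protein probabilities from peptide probabilities.

    peptide_probs:       peptide -> probability that the peptide ID is correct
    peptide_to_proteins: peptide -> list of proteins containing that peptide
    """
    proteins = sorted({p for prots in peptide_to_proteins.values() for p in prots})
    # Iteration 1: every weight starts at 1/n, n being the degeneracy count.
    weights = {(pep, prot): 1.0 / len(prots)
               for pep, prots in peptide_to_proteins.items() for prot in prots}
    probs = {}
    for _ in range(n_iter):
        # Protein probability: at least one weighted peptide is correct.
        for prot in proteins:
            probs[prot] = 1.0 - prod(
                1.0 - weights[(pep, prot)] * peptide_probs[pep]
                for pep, prots in peptide_to_proteins.items() if prot in prots)
        # Re-apportion each peptide among its proteins.
        for pep, prots in peptide_to_proteins.items():
            total = sum(probs[p] for p in prots)
            if total > 0:
                for p in prots:
                    weights[(pep, p)] = probs[p] / total
    return probs


# Hypothetical example: pep1 is unique to A, pep2 is shared by A and B.
peptide_probs = {"pep1": 0.9, "pep2": 0.8}
peptide_to_proteins = {"pep1": ["A"], "pep2": ["A", "B"]}
first = protein_probabilities(peptide_probs, peptide_to_proteins, n_iter=1)
final = protein_probabilities(peptide_probs, peptide_to_proteins, n_iter=10)
print(first)
print(final)
```

In the first iteration the shared peptide is split evenly. Over later iterations its weight shifts toward protein A, which has independent evidence from the unique peptide.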

Page 24: BITS - Protein inference from mass spectrometry data

peptides: a b c d
proteins:
prot X (-): x x
prot Y (+): x
prot Z (0): x x x
Minimal set with maximal annotation

DBToolkit protein inference

Page 25: BITS - Protein inference from mass spectrometry data

peptides: a b c d
proteins:
prot X (-): x x
prot Y (+): x
prot Z (0): x x x

Some indications from the HUPO BPP

Page 26: BITS - Protein inference from mass spectrometry data

PROTEIN INFERENCE AND QUANTIFICATION

Page 27: BITS - Protein inference from mass spectrometry data

Some inference examples (i)

See: Colaert et al, Proteomics, 2010

http://genesis.ugent.be/rover/

Nice and easy, 1/1, only unique peptides (blue) and a narrow distribution

Page 28: BITS - Protein inference from mass spectrometry data

Some inference examples (ii)

See: Colaert et al, Proteomics, 2010

Nice and easy, down-regulated

http://genesis.ugent.be/rover/

Page 29: BITS - Protein inference from mass spectrometry data

Some inference examples (iii)

See: Colaert et al, Proteomics, 2010

A little less easy, up-regulated

http://genesis.ugent.be/rover/

Page 30: BITS - Protein inference from mass spectrometry data

Some inference examples (iv)

See: Colaert et al, Proteomics, 2010

A nice example of the mess of degenerate peptides

http://genesis.ugent.be/rover/

Page 31: BITS - Protein inference from mass spectrometry data

Some inference examples (v)

See: Colaert et al, Proteomics, 2010

A bit of chaos, but a defined core distribution

http://genesis.ugent.be/rover/
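The examples above contrast proteins quantified from unique peptides with proteins whose ratios are blurred by degenerate peptides. A minimal sketch of that idea is given below. It is not Rover's algorithm: the function name, the use of log2 ratios, and the choice of median and median absolute deviation are assumptions made for illustration.

```python
from statistics import median


def summarize_protein_ratio(log2_ratios, is_unique):
    """Return (center, spread) of the peptide log2 ratios for one protein.

    Unique peptides are preferred; degenerate (shared) peptides are only
    used when the protein has no unique peptide at all.
    """
    if len(log2_ratios) != len(is_unique) or not log2_ratios:
        raise ValueError("need equally long, non-empty inputs")
    unique = [r for r, u in zip(log2_ratios, is_unique) if u]
    used = unique if unique else list(log2_ratios)
    center = median(used)
    # Median absolute deviation: a narrow distribution gives a small value.
    spread = median(abs(r - center) for r in used)
    return center, spread


# Hypothetical protein: three unique peptides near 1/1 (log2 ratio 0)
# and one degenerate peptide with a very different ratio.
center, spread = summarize_protein_ratio([0.0, 0.1, -0.1, 2.0],
                                         [True, True, True, False])
print(center, spread)
```

Here the degenerate peptide is ignored, so the protein is reported close to 1/1 with a narrow spread. Including it would widen the distribution, as in the less tidy examples.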

Page 32: BITS - Protein inference from mass spectrometry data

Thank you!

Questions?