FDR Thresholding



Caleb J. Emmons

Slide: 1

What is FDR?

Slide: 2

If decoy proteins are present:

Protein FDR = (# decoy proteins identified) / (# target proteins identified)

Peptide FDR = (# spectra from decoy proteins) / (# spectra from target proteins)
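
The two ratios above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the software's implementation; the function name `fdr` and the example counts are hypothetical.

```python
def fdr(decoy_count: int, target_count: int) -> float:
    """Estimate the false discovery rate as decoy hits divided by target hits."""
    if target_count <= 0:
        raise ValueError("target_count must be positive")
    return decoy_count / target_count


# Hypothetical counts, for illustration only.
protein_fdr = fdr(decoy_count=5, target_count=500)      # proteins identified
peptide_fdr = fdr(decoy_count=30, target_count=12000)   # spectra matched

print(f"Protein FDR: {protein_fdr:.2%}")
print(f"Peptide FDR: {peptide_fdr:.2%}")
```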

The FDR Browser

How does FDR Thresholding work?

Slide: 4

[Figure: The “FDR Landscape”, with callouts numbered 1 to 5. Axis label: Minimum number of peptides]

How does FDR Thresholding work?

Slide: 5

The “FDR Landscape”

Some Fine Points

Slide: 6

Confusing!

Protein Clustering

Poster 509, Tuesday 10:30-1:00

Informatics: Quantification/Validation

Caleb J. Emmons

Slide: 7

What is a Cluster?

Slide: 8

Total Peptide Evidence

Slide: 9

PEtot(A) = sum of peptide probabilities over all peptides matching A

Protein   PEtot
K1C10     1481%
K1C14     1061%
K1C16      852%
K1C17      503%

Joint Peptide Evidence

Slide: 10

PEjoint(A, B) = sum of peptide probabilities over all peptides matching A and B

PEjoint   K1C10   K1C14   K1C16
K1C14     184%
K1C16     184%    661%
K1C17      84%    375%    175%
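
The definitions of PEtot and PEjoint can be sketched as follows. The peptide list, the protein names A, B, C, and the function names `pe_total` and `pe_joint` are hypothetical and chosen only for illustration; they are not the data behind the tables on these slides.

```python
from itertools import combinations

# Hypothetical peptides: (probability, set of proteins the peptide matches).
peptides = [
    (0.95, {"A"}),
    (0.90, {"A", "B"}),
    (0.80, {"A", "B"}),
    (0.60, {"B", "C"}),
]


def pe_total(protein, peptides):
    """Sum of peptide probabilities over all peptides matching `protein`."""
    return sum(prob for prob, prots in peptides if protein in prots)


def pe_joint(a, b, peptides):
    """Sum of peptide probabilities over all peptides matching both a and b."""
    return sum(prob for prob, prots in peptides if a in prots and b in prots)


proteins = sorted(set().union(*(prots for _, prots in peptides)))

for protein in proteins:
    print(f"PEtot({protein}) = {pe_total(protein, peptides):.0%}")

for a, b in combinations(proteins, 2):
    print(f"PEjoint({a}, {b}) = {pe_joint(a, b, peptides):.0%}")
```

Because the values are sums of probabilities, they can exceed 100%, as in the tables above.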

Proteins A and B are in the same cluster if they are directly similar, or if they can be connected with a sequence of proteins that are directly similar.

Cluster Formation

Slide: 11

A ≈ B if
1) their joint evidence is at least 95%, and
2) their joint evidence is at least half of the total evidence for A or B

Directly similar

Clusters

Cluster Formation

Slide: 12

Protein   PEtot
K1C10     1481%
K1C14     1061%
K1C16      852%
K1C17      503%

PEjoint   K1C10   K1C14   K1C16
K1C14     184%
K1C16     184%    661%
K1C17      84%    375%    175%

A ≈ B ?   K1C10   K1C14   K1C16
K1C14     no
K1C16     no      yes
K1C17     no      yes     no
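
The similarity rule and the resulting clusters can be checked with a short sketch using the PEtot and PEjoint values from this slide. Two things here are assumptions: condition 2 is read as "at least half of the total evidence of A, or at least half of the total evidence of B" (the reading that reproduces the yes/no table above), and the union-find grouping is just one simple way to connect directly similar proteins, not necessarily what the software does.

```python
# Values taken from the slide, in percent.
pe_tot = {"K1C10": 1481, "K1C14": 1061, "K1C16": 852, "K1C17": 503}
pe_joint = {
    ("K1C10", "K1C14"): 184,
    ("K1C10", "K1C16"): 184,
    ("K1C14", "K1C16"): 661,
    ("K1C10", "K1C17"): 84,
    ("K1C14", "K1C17"): 375,
    ("K1C16", "K1C17"): 175,
}


def directly_similar(a, b):
    """A ≈ B: joint evidence >= 95% and >= half of PEtot(A) or of PEtot(B)."""
    joint = pe_joint[(a, b)]
    return joint >= 95 and (2 * joint >= pe_tot[a] or 2 * joint >= pe_tot[b])


# Union-find: proteins linked by a chain of direct similarities share a root.
parent = {p: p for p in pe_tot}


def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path compression
        x = parent[x]
    return x


for a, b in pe_joint:
    if directly_similar(a, b):
        parent[find(a)] = find(b)

clusters = {}
for p in pe_tot:
    clusters.setdefault(find(p), set()).add(p)

cluster_list = sorted(sorted(members) for members in clusters.values())
print(cluster_list)
```

Under these assumptions K1C16 and K1C17 are not directly similar, but both are directly similar to K1C14, so all three end up in one cluster, while K1C10 stays on its own.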

Peptide-Protein Weights

Slide: 13

A B C

PEexcl(C) = sum of peptide probabilities over all peptides exclusively matching C

W(p, C) =

Spectrum Counting

Slide: 14

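[Figure: spectra, labeled by peptide sequence and charge state, assigned to proteins A, B, and C]

Spectra: SEQ1, +3; SEQ1, +2; SEQ2, +2; SEQ3, +2; SEQ7, +3; SEQ7, +3; SEQ4, +2; SEQ5, +2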

Protein A        3   2   3   1
Protein B        4   2   4   2
Protein C        4   2   3   1
Cluster of B&C   6   5   5   4

Exclusive peptide/spectrum: associated only with this single cluster/protein

Unique peptides: only consider amino acid sequences

Unique spectra: only consider amino acid sequence, modifications, & charge state
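
The difference between unique peptides and unique spectra can be illustrated with the eight spectrum labels shown on this slide. This sketch pools all eight spectra together, because the assignment of spectra to proteins A, B, and C is not recoverable from the transcript, so it does not reproduce the per-protein table. Modifications are assumed absent (empty string), which is an assumption made only for the example.

```python
# Each spectrum: (amino acid sequence, modifications, charge state).
# Sequences and charges are the labels from the slide; modifications are
# assumed to be absent.
spectra = [
    ("SEQ1", "", 3),
    ("SEQ1", "", 2),
    ("SEQ2", "", 2),
    ("SEQ3", "", 2),
    ("SEQ7", "", 3),
    ("SEQ7", "", 3),
    ("SEQ4", "", 2),
    ("SEQ5", "", 2),
]

# Total spectrum count: every spectrum counts once.
total_spectra = len(spectra)

# Unique peptides: only the amino acid sequence is considered.
unique_peptides = len({seq for seq, _mods, _charge in spectra})

# Unique spectra: sequence, modifications, and charge state are considered.
unique_spectra = len(set(spectra))

print(total_spectra, unique_peptides, unique_spectra)
```

The two SEQ1 spectra differ in charge state, so they count as one unique peptide but two unique spectra; the two SEQ7 spectra are identical in all three fields and count once under both definitions.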

Quantitative Values

Slide: 15

Total and Weighted Spectrum Counts run over all spectra in the cluster.

Total Ion Current (TIC) and Precursor Intensity may be computed, treating the cluster as a collection of spectra.

The Normalized Spectral Abundance Factor (NSAF) is roughly a ratio of an exclusive spectrum count to protein length, so it does not make direct sense at the cluster level (clusters do not have a ‘length’). However, the average NSAF over the member proteins gives an interpretable value. Similarly, we compute the Exponentially Modified Protein Abundance Index (emPAI) as an average over the member proteins in the cluster.
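
The cluster-level NSAF described above can be sketched as a plain average over member proteins. The slide only says that NSAF is "roughly" a ratio of exclusive spectrum count to protein length; the normalization used here (dividing each ratio by the sum of the ratios over all proteins in the sample) follows the commonly published NSAF definition and is an assumption about what the software computes. All counts, lengths, and protein names are hypothetical.

```python
from statistics import mean


def nsaf(exclusive_counts, lengths):
    """Per-protein NSAF: (count / length), normalized to sum to 1 over all proteins."""
    saf = {p: exclusive_counts[p] / lengths[p] for p in exclusive_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


# Hypothetical exclusive spectrum counts and protein lengths (in residues).
exclusive_counts = {"A": 2, "B": 2, "C": 1, "D": 10}
lengths = {"A": 400, "B": 450, "C": 430, "D": 600}

values = nsaf(exclusive_counts, lengths)

# A cluster has no length of its own, so average NSAF over its member proteins.
cluster_members = ["B", "C"]
cluster_nsaf = mean(values[p] for p in cluster_members)

print(f"Cluster NSAF (mean over {cluster_members}): {cluster_nsaf:.4f}")
```

The same averaging step would apply to emPAI once per-protein emPAI values are available.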
