Interaction prediction with STRING - Principles and examples. Lars Juhl Jensen, EMBL Heidelberg



EMBL Lab Day, European Molecular Biology Laboratory, Heidelberg, Germany, June 15, 2004


Page 1: Interaction prediction with STRING - Principles and examples

Interaction prediction with STRING - Principles and examples

Lars Juhl Jensen, EMBL Heidelberg

Page 2: Interaction prediction with STRING - Principles and examples

Too much information – too little knowledge

• Biology is now in the age of large-scale data collection
– Explosive increase in data from genome sequencing, microarray expression studies, screening for protein interactions etc.
– The data types are highly heterogeneous
– Much data is not being deposited in standardized repositories
– Most data sets are error-prone and suffer from systematic biases

• STRING is a web resource that integrates many different types of information across 100+ species

• We do not intend STRING to be
– a primary repository for experimental data
– a curated database of complexes or pathways
– a substitute for expert annotation

Page 3: Interaction prediction with STRING - Principles and examples

STRING provides a protein network by integrating diverse types of evidence

Genomic Neighborhood

Species Co-occurrence

Gene Fusions

Database Imports

Exp. Interaction Data

Co-expression

Literature Co-mentioning
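
To make the idea of integrating several evidence types concrete, here is a minimal sketch. The slide does not state how the evidence is combined, so the rule used here is an assumption: each evidence type is taken to yield an independent, probability-like score, and the combined score is the probability that at least one of them is correct. The function name `combine_scores` and the example values are hypothetical.

```python
from math import prod


def combine_scores(scores):
    """Combine per-evidence scores, ASSUMING they are independent probabilities.

    Returns 1 - product(1 - s_i), i.e. the probability that at least one
    evidence type supports the association.
    """
    scores = list(scores)
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range [0, 1]: {s}")
    return 1.0 - prod(1.0 - s for s in scores)


# Hypothetical scores for a single protein pair (illustrative values only).
evidence = {
    "genomic_neighborhood": 0.40,
    "co_expression": 0.30,
    "literature_co_mentioning": 0.50,
}
combined = combine_scores(evidence.values())
```

Under this assumption, several weak but independent lines of evidence add up to a stronger combined score than any single one of them.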

Page 4: Interaction prediction with STRING - Principles and examples

Inferring functional modules from gene presence/absence patterns

Figure: The “Cellulosome”. Labels: cell, cell wall, anchoring proteins, cellulosomes, cellulose, resting protuberances, protracted protuberance. © Trends Microbiol, 1999
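
The presence/absence idea on this slide can be sketched as a comparison of per-gene profiles across species: genes that are present and absent in the same genomes are candidates for the same functional module. The slide does not say which similarity measure is used, so the Jaccard index below is a simple stand-in, and the gene names and profiles are made up for illustration.

```python
def profile_similarity(a, b):
    """Jaccard similarity of two presence/absence profiles (sequences of 0/1).

    Each position is one species; 1 means the gene is present in that genome.
    """
    if len(a) != len(b):
        raise ValueError("profiles must cover the same set of species")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


# Hypothetical profiles over six species (illustrative data only).
profiles = {
    "geneA": (1, 1, 0, 0, 1, 0),
    "geneB": (1, 1, 0, 0, 1, 0),
    "geneC": (1, 0, 1, 1, 0, 1),
}

sim_ab = profile_similarity(profiles["geneA"], profiles["geneB"])
sim_ac = profile_similarity(profiles["geneA"], profiles["geneC"])
```

In this toy data, geneA and geneB share an identical pattern and would be grouped together, while geneC overlaps with geneA in only one species.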

Page 5: Interaction prediction with STRING - Principles and examples

Multiple evidence types from several species

Page 6: Interaction prediction with STRING - Principles and examples

Score calibration against a common reference

• Many diverse types of evidence
– The quality of each is judged by very different raw scores
– These are all calibrated against the same reference set

• Requirements for a reference
– Must represent a compromise of all the types of evidence
– Broad species coverage

• Both a strength and a weakness
– Scores for all evidence types are directly comparable
– The type of interaction is currently not predicted
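
Calibration against a common reference can be illustrated with a simple binning scheme: group pairs by raw score, then use the fraction of pairs in each bin that the reference set confirms as the calibrated score. This is a sketch under assumptions. The slides name neither the reference set nor the fitting procedure, and the function names, bin edges, and toy data below are hypothetical.

```python
from bisect import bisect_right


def calibrate(raw_scores, in_reference, bin_edges):
    """Return, for each bin, the fraction of pairs confirmed by the reference.

    bin_edges are sorted inner boundaries, giving len(bin_edges) + 1 bins.
    Empty bins are reported as None.
    """
    n_bins = len(bin_edges) + 1
    hits = [0] * n_bins
    totals = [0] * n_bins
    for score, confirmed in zip(raw_scores, in_reference):
        i = bisect_right(bin_edges, score)
        totals[i] += 1
        hits[i] += int(confirmed)
    return [h / t if t else None for h, t in zip(hits, totals)]


def calibrated_score(raw, bin_edges, table):
    """Look up the calibrated score for a new raw score."""
    return table[bisect_right(bin_edges, raw)]


# Toy data: raw scores for six pairs and whether the reference confirms each.
raw = [0.1, 0.2, 0.6, 0.7, 0.8, 0.9]
ref = [False, False, True, False, True, True]
edges = [0.5, 0.75]

table = calibrate(raw, ref, edges)
mid_score = calibrated_score(0.65, edges, table)
```

Running the same procedure per evidence type, each with its own raw score scale, would put all of them on one shared scale, which is what makes the scores directly comparable.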

Page 7: Interaction prediction with STRING - Principles and examples

Getting more specific – generally speaking

Page 8: Interaction prediction with STRING - Principles and examples

Acknowledgments

• The STRING team
– Christian von Mering
– Berend Snel
– Martijn Huynen
– Daniel Jaeggi
– Steffen Schmidt
– Mathilde Foglierini
– Peer Bork

• ArrayProspector web service
– Julien Lagarde
– Chris Workman

• NetView visualization tool
– Sean Hooper

• Analysis of yeast cell cycle
– Ulrik de Lichtenberg
– Thomas Skøt
– Anders Fausbøll
– Søren Brunak

• Web resources
– http://string.embl.de
– http://www.bork.embl.de/ArrayProspector
– http://www.bork.embl.de/synonyms

Page 9: Interaction prediction with STRING - Principles and examples

Thank you!