Integration of Heterogeneous Informations Sources for Proteomics and Transcriptomics Steffen Möller University of Rostock Proteome Center

Page 1

Integration of Heterogeneous Informations Sources for Proteomics and Transcriptomics

Steffen Möller

University of Rostock

Proteome Center

Page 2

Data Flow and Motivation

• List of gene products with changed expression levels

• Description of variants of genes

[Diagram: data flow with the stages Question, Sample Selection, Preparation, Analysis, Measurements and Interpretation; samples are organised in groups, Group A (Sample A.1 to Sample A.a) through Group Z (Sample Z.1 to Sample Z.z)]

Page 3

Data available online

• Grouping of samples in homogeneous groups

• Portioning and preparation of samples

• Data derived from a preparation

– DNA/RNA sequencing

– Affymetrix Microarrays

– 2DE Gels

– (Tandem) mass spectrometry

• External bioinformatics databases

• Internal extensions to the above

– Communication of ideas between researchers

[Slide annotations grouping the items above: Lab-internal information; Measurements; Aids for Interpretation of Data]

Page 4

Organisation of Samples

Page 5

Access of MS Spectra

• MASCOT peptide identification

• MS/MS fragment sequencing

Page 6

Addition to external data sources

• Genes discussed among researchers

Page 7

Overview of Identified Spots on Gel

• Integration of Protein expression levels

– Spot Volume

– Spot Area

– Spot Peak intensity

• with RNA expression levels

– from Affymetrix chips
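The integration named on this slide can be pictured as a join on a shared gene identifier. The following is a minimal sketch under stated assumptions, not the system's actual implementation: the class names, the field names and the choice of a plain dictionary for the chip data are all hypothetical. It allows several spots per gene, since one protein can appear in more than one spot on a 2DE gel.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SpotMeasurement:
    """One identified gel spot; the field names are illustrative."""
    gene_id: str
    volume: float
    area: float
    peak_intensity: float


@dataclass
class IntegratedRecord:
    """Protein-level and RNA-level data for a single gene."""
    gene_id: str
    spots: List[SpotMeasurement]
    rna_level: Optional[float]  # None if the chip data has no value for the gene


def integrate(spots: List[SpotMeasurement],
              rna_levels: Dict[str, float]) -> Dict[str, IntegratedRecord]:
    """Group spots by gene and attach the RNA expression level, if any."""
    records: Dict[str, IntegratedRecord] = {}
    for spot in spots:
        record = records.setdefault(
            spot.gene_id,
            IntegratedRecord(spot.gene_id, [], rna_levels.get(spot.gene_id)),
        )
        record.spots.append(spot)
    return records
```

Genes without a chip value keep `rna_level = None` rather than being dropped, so that missing RNA data stays visible during interpretation.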

Page 8

Application of Agent Technology

• Automated retrieval and integration of presumed relevant in-house data

• Assistance in interpretation

– Heuristics to extend/shrink list of genes presumed relevant

• Integration with external online data

– Pathways

– Known relevance of genes in other diseases

Page 9

Data Flow

Adapted for Agents:

• Input: List of Gene IDs

• Output: List of (Gene ID, Agent ID, Evaluation, Explanation, History)

[Diagram: Seed of Genes, Heuristic, Modified List of Genes]
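The agent interface on this slide (a list of gene IDs in, a list of tuples out) can be sketched as a small data structure. This is an illustration only: the slide gives no types, so the numeric `evaluation` score, the reading of `history` as a list of earlier processing steps, the two toy heuristics and the rule for deriving the modified list are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class GeneAssessment:
    """One output tuple of an agent, with the fields named on the slide."""
    gene_id: str
    agent_id: str
    evaluation: float   # assumption: signed score, negative means "drop"
    explanation: str
    history: List[str] = field(default_factory=list)  # assumption: earlier steps


# A heuristic maps the current gene list to a list of assessments.
Heuristic = Callable[[List[str]], List[GeneAssessment]]


def flag_low_abundance(gene_ids: List[str]) -> List[GeneAssessment]:
    low = {"GENE_B"}  # hypothetical lookup table
    return [GeneAssessment(g, "low-abundance", -1.0,
                           "presumed low abundance", ["seed"])
            for g in gene_ids if g in low]


def propose_neighbours(gene_ids: List[str]) -> List[GeneAssessment]:
    neighbours = {"GENE_A": ["GENE_X"]}  # hypothetical lookup table
    return [GeneAssessment(n, "neighbourhood", 1.0,
                           f"chromosomal neighbour of {g}", ["seed"])
            for g in gene_ids for n in neighbours.get(g, [])]


def apply_heuristics(seed: List[str],
                     heuristics: List[Heuristic]) -> List[GeneAssessment]:
    """Collect the assessments of every heuristic for the seed list."""
    assessments: List[GeneAssessment] = []
    for heuristic in heuristics:
        assessments.extend(heuristic(seed))
    return assessments


def modified_gene_list(seed: List[str],
                       assessments: List[GeneAssessment]) -> List[str]:
    """Keep seed genes with a non-negative total; add new genes with a positive one."""
    totals: Dict[str, float] = {}
    for a in assessments:
        totals[a.gene_id] = totals.get(a.gene_id, 0.0) + a.evaluation
    kept = [g for g in seed if totals.get(g, 0.0) >= 0.0]
    added = sorted(g for g, t in totals.items() if g not in seed and t > 0.0)
    return kept + added
```

Because every assessment carries the agent ID and an explanation, a researcher can trace why a gene entered or left the list.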

Page 10

Examples for Heuristics

• Towards extension/shrinking of list of genes under investigation

– Gene lies within chromosomal locus linked to disease

– Chromosomal neighbourhood to other genes of investigation

– Gene is of presumed low abundance

• Guidance of further wet-lab analysis

– Comparison of ratio RNA/protein levels

• Search for pre- or post-transcriptional control
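Two of the heuristics listed above lend themselves to a short illustration: the locus test and the RNA/protein ratio comparison. The sketch below is hedged throughout. The `Interval` type, the inclusive coordinates and the use of a log2 ratio are assumptions made for the example; the slides do not specify how either heuristic is computed.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Interval:
    """A chromosomal region; coordinates are assumed to be inclusive."""
    chromosome: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """True if both regions lie on the same chromosome and share a position."""
    return a.chromosome == b.chromosome and a.start <= b.end and b.start <= a.end


def within_linked_locus(gene: Interval, loci: List[Interval]) -> bool:
    """Heuristic: the gene overlaps at least one disease-linked locus."""
    return any(overlaps(gene, locus) for locus in loci)


def log2_rna_protein_ratio(rna: float, protein: float) -> Optional[float]:
    """log2(RNA level / protein level); None if either value is not positive."""
    if rna <= 0 or protein <= 0:
        return None
    return math.log2(rna / protein)


def ratio_shift(rna_a: float, protein_a: float,
                rna_z: float, protein_z: float) -> Optional[float]:
    """Difference of the log2 ratios between two sample groups."""
    ratio_a = log2_rna_protein_ratio(rna_a, protein_a)
    ratio_z = log2_rna_protein_ratio(rna_z, protein_z)
    if ratio_a is None or ratio_z is None:
        return None
    return ratio_a - ratio_z
```

Under this reading, a large shift for a gene marks it as a candidate for closer wet-lab inspection; where to set the threshold would be a matter for the researchers.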

Page 11

Example: Interaction with EnsEMBL

• Visualisation of QTLs with expression data (G. Fischer et al. 2002, submitted)

Page 12

Transfer from Automated Sequence Annotation

• EDITtoTrEMBL (Möller et al. 1998)

– Introduction of intermediate level for data integration

– Hierarchical organisation of agents

[Diagram: TrEMBL, two Program boxes, Integration]

Page 13

EDITtoTrEMBL: Self-introducing Agents

• Dispatchers provide automated planning of annotation path of entries

• Sequence-analysing agents describe their input and their output to dispatching agents

• SWISS-PROT syntax and controlled vocabulary

• Regular expressions as constraints
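The idea of self-describing agents and a planning dispatcher can be sketched as simple forward chaining. This is not the EDITtoTrEMBL implementation: the `AgentDescription` type, the agent names and the example lines are hypothetical, and only the general idea (input constraints written as regular expressions over SWISS-PROT-style lines) is taken from the slide.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class AgentDescription:
    """What an agent tells the dispatcher about itself (illustrative)."""
    name: str
    requires: List[str]  # regexes; each must match at least one entry line
    produces: List[str]  # example lines the agent would add to the entry


def plan(entry_lines: List[str], agents: List[AgentDescription]) -> List[str]:
    """Return an order in which agents can run, given what each requires."""
    lines = list(entry_lines)
    pending = list(agents)
    path: List[str] = []
    progress = True
    while pending and progress:
        progress = False
        for agent in list(pending):
            satisfied = all(
                any(re.search(pattern, line) for line in lines)
                for pattern in agent.requires
            )
            if satisfied:
                path.append(agent.name)
                lines.extend(agent.produces)  # later agents may depend on this
                pending.remove(agent)
                progress = True
    return path  # agents whose constraints are never met are left out


# Hypothetical agents: the topology agent needs a TRANSMEM feature line,
# which only exists after the transmembrane predictor has run.
AGENTS = [
    AgentDescription("topology", [r"^FT\s+TRANSMEM"], ["FT   TOPO_DOM"]),
    AgentDescription("tm-predictor", [r"^SQ\s"], ["FT   TRANSMEM"]),
]
```

With an entry that contains only a sequence line, `plan` schedules the predictor first and the topology agent second, regardless of the order in which the agents registered.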

Page 14

Application in sequence annotation of transmembrane proteins

• A variety of programs exist to predict

– membrane spanning regions

– direction of insertion into the membrane

[Figure: membrane with the two sides labelled Out and In]

Page 15

Conflict resolution

• Implemented with REVISE (C. V. Damasio; 1997), application described in (S. Möller, M. Schroeder; 2000)

Page 16

Problems with the transfer of these techniques to the wet-lab

• Analysers cannot describe themselves or their results

– No ontology for methods of expression data analysis has been defined (yet)

– The motivation of an analyser to include a gene cannot be formally expressed

• No rules for conflict resolution applicable

– Conflicts point to the unexpected, not to artefacts

Page 17

Discussion

• Should I implement the best possible agent system or rather ASAP hunt for the causative agents of autoimmune diseases?

• New agents are recruited from Perl scripts that are implemented to provide a quick answer to requests of biological researchers.

• Integration on a pragmatic level

• The system is accepted by wet-lab researchers.

• The system has a PHP-based web-frontend,

– communication between agents is implemented via SOAP

– adaptations and extensions to the system are easily implemented.

Page 18

Acknowledgements

University of Rostock: Michael Kreutzer, Gertrud Fischer, Bernd Scheidt, Ines Weber, Angelika Allenberg, Björn Damm, Michael Glocker, Hans-Jürgen Thiesen

City University, London: Michael Schroeder

EMBL-EBI, Cambridge: Rolf Apweiler

Funded by the BMBF Leitprojekt „Proteom-Analyse des Menschen“ and the Landesforschungsschwerpunkt „Genomorientierte Biotechnologie“