HUPO Brain Proteome Project Pilot Studies: Bioinformatics at Work

Proteomics 2005, 5, 2716–2717

REPORT

Christian Stephan, Kai Reidegeld, Helmut E. Meyer, Michael Hamacher

Medical Proteom-Center, Ruhr-University Bochum, Germany

The data acquisition phase of initial pilot studies (human and mouse brain samples) of the Human Proteome Organisation (HUPO) Brain Proteome Project (BPP) is now complete and the data generated by the participating laboratories has been submitted to the central Data Collection Center. The BPP Bioinformatics Group met on 8th April 2005 at the European Bioinformatics Institute (Hinxton, UK) to discuss strategies for the reanalysis of the pooled data from all the participating laboratories. A summary of the results of the data reprocessing will be presented at the 4th HUPO World Congress that will be held in August/September 2005.

Received: April 14, 2005
Accepted: April 18, 2005

Keywords:

Bioinformatics / Brain / Human Brain Proteome Project / Human Proteome Organisation / Mass spectrometry / Mouse brain / ProteinScape

A lack of standards remains a prominent problem in proteomics and related fields. The HUPO Brain Proteome Project (HUPO BPP), organized by Helmut E. Meyer, Medical Proteom-Center Bochum, and Joachim Klose, Charité Berlin, both Germany, started two pilot studies to circumvent the pitfalls of proteomics approaches: a standardized differential proteome and transcriptome analysis of mouse brain (three age stages) as well as of biopsy versus autopsy human brain tissue (temporal front lobe). The practical work has been finished and the participating laboratories have submitted their acquired raw data to the so-called Data Collection Center (DCC) located at the Medical Proteom-Center.

The whole data collection system is set up as a three-tier client/server architecture. The DCC is the global server, while every participating laboratory runs a local server to which the working clients are connected via intranet. The ProteinScape software (Bruker Daltonics, Bremen and Protagen, Dortmund, both Germany; free licences by Bruker Daltonics) acts as both the global and the local server. It was chosen for handling the heterogeneous data because it can import the different data types produced in proteomics research (1-D and 2-D gel electrophoresis, DIGE, multidimensional LC-MS/MS etc.). The data of every participant is exported from her/his own computer by an additional SQL stored procedure that allows the export of chosen projects via an HTML-based menu and the submission via an FTP server or CD. The import is an SQL script stored procedure that takes the source directory as input. First, the database schema is updated to the newest version. Additionally, a distinct range of identity field (ID) values is allocated to every laboratory.
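The allocation of a distinct ID range to every laboratory can be illustrated with a minimal Python sketch. It is not the ProteinScape implementation, which the report does not describe. The block size, the laboratory names and all function names are hypothetical, and the sketch assumes that the purpose of the ranges is to keep records from different local servers from colliding once they are merged on the global server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdRange:
    """Half-open block [start, stop) of ID values reserved for one laboratory."""
    start: int
    stop: int

    def contains(self, record_id: int) -> bool:
        return self.start <= record_id < self.stop


def allocate_id_ranges(labs, block_size=10_000_000):
    """Assign each laboratory its own non-overlapping block of IDs.

    block_size is a hypothetical value, not one taken from the report.
    """
    if len(set(labs)) != len(labs):
        raise ValueError("laboratory names must be unique")
    return {
        lab: IdRange(i * block_size, (i + 1) * block_size)
        for i, lab in enumerate(labs)
    }


def owning_lab(ranges, record_id):
    """Return the laboratory whose block contains record_id, or None."""
    for lab, id_range in ranges.items():
        if id_range.contains(record_id):
            return lab
    return None


# Hypothetical laboratory names, for illustration only.
ranges = allocate_id_ranges(["lab_a", "lab_b", "lab_c"])
```

With disjoint blocks of this kind, an ID on the global server identifies its originating laboratory without any extra lookup column, which is one plausible reason for allocating the ranges before import.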

In the pilot studies, several analysis strategies were applied, ranging from mRNA profiling to peptidomics and proteomics (1-D- and 2-D-based or LC followed by MS). Dozens of gels and more than 1 million spectra were generated and imported into the DCC. Data will be reprocessed according to a defined set of stringency criteria (Reprocessing Guideline, http://www.hbpp.org) and interpreted by invited re-analysts (free temporary MASCOT licences by Matrix Science). After the analysis phase, all collected data will be exported by a newly designed exchange tool, using an mzData-based format, into the PRIDE database located at the European Bioinformatics Institute (EBI) for world-wide access.

All steps of the bioinformatics workflow were elaborated in close discussion with the HUPO Proteomics Standards Initiative (HUPO PSI) and the European Bioinformatics Institute (EBI) in Hinxton, UK. Throughout the pilot studies, the HUPO BPP Bioinformatics Committee met regularly to discuss and decide on the next steps.

Correspondence: Dr. Michael Hamacher, Medical Proteom-Center, Ruhr-University Bochum, ZKF E.143, Universitaetsstrasse 150, D-44801 Bochum, Germany
E-Mail: [email protected]
Fax: +49-234-32-14554

2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

DOI 10.1002/pmic.200500426

Hosted by Rolf Apweiler, the Bioinformatics Group came together for the 4th time at the EBI on April 8th, 2005 to summarize the status quo of the data reprocessing and to set up several task forces. The task force “2-D gel image analysis” will be coordinated by Mike Dunn, Conway Institute of Biomolecular and Biomedical Research, Dublin, Ireland, who will collect the uncompressed image files together with the corresponding meta data (kind of gel, concentration, pH range, dimension, amount of sample, staining, software used, technical replicates etc.). The files will be submitted to Andrew Dowsey and Guang-Zhong Yang, Imperial College London, UK, for further analysis as well as for reprojection to the original gels. The results will be correlated back to the mass spectrometry data and to the differential expression data generated by the participants. The task “Reprocessing of the protein lists” will start with the re-analysis of the raw data according to the Reprocessing Guidelines, headed by the Medical Proteom-Center and Eugene Kapp, Ludwig Institute for Cancer Research, Melbourne, Australia. The results of the pilot study participants will be mapped to the new April IPI version as well as to the correct species. The stringency parameters mentioned in the guidelines were confirmed, with the comment that the gel-based approaches used are per se relatively stringent. The re-analysts (see www.hbpp.org) will be provided with the protein/peptide lists and additional information according to their own specification. Concerning the mRNA profiling, the correlation of the mapped differentially expressed gene products with the corresponding proteins will be done by Claus Hultschig, MPI Molecular Genetics, Berlin, Germany, and the EBI, while the peptidomics interpretation will be linked by Michael Schrader, Fachhochschule, Weihenstephan, and BioVisioN, Hannover, both Germany. The task force “Text Mining” is organized by Rolf Apweiler’s group at the EBI, starting with a proof of concept by analyzing the annotated results of some participants.
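The meta data to be collected alongside each uncompressed gel image could be captured in a simple record, sketched below in Python. The field names follow the items listed for the “2-D gel image analysis” task force. The types, the example values and the completeness check are assumptions made for illustration and do not represent a format defined by the HUPO BPP.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass
class GelImageMetadata:
    """Meta data accompanying one uncompressed gel image (hypothetical layout)."""
    image_file: str
    gel_kind: Optional[str] = None
    concentration: Optional[str] = None
    ph_range: Optional[Tuple[float, float]] = None
    dimensions: Optional[str] = None
    sample_amount: Optional[str] = None
    staining: Optional[str] = None
    software_used: Optional[str] = None
    technical_replicates: Optional[int] = None


def missing_fields(record: GelImageMetadata):
    """List the names of meta data fields that have not been filled in yet."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


# Illustrative values only; they are not taken from the pilot studies.
example = GelImageMetadata(
    image_file="gel_001.tif",
    gel_kind="2-D",
    ph_range=(3.0, 10.0),
    staining="silver",
)
incomplete = missing_fields(example)
```

A check such as `missing_fields` would let a coordinator flag incomplete submissions before the images are passed on for further analysis.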

Planning the timeline, the attendees of the bioinformatics meeting suggested that a summary of the reprocessing results be presented at the 4th HUPO World Congress in August/September 2005 (http://hupo2005.com). The jamboree of the re-analysts will take place afterwards, presumably in October, accompanied by the preparation of joint publications. The publications should be available at the latest by the 5th HUPO BPP workshop, which is planned for Dublin in February 2006.

More information, e.g. protocols and guidelines, is available at www.hbpp.org and on the HUPO PSI homepage (http://psi-dev.sourceforge.net/). The members of the HUPO BPP Bioinformatics Group are: Rolf Apweiler, Martin Blüggel, Andrew Dowsey, Mike Dunn, David Fenyo, Michael Hamacher, Claus Hultschig, Philip Jones, Eugene Kapp, Andrew Lyall, Katrin Marcus, Lennart Martens, Helmut E. Meyer, Michael Müller, David Parkinson, Kai Reidegeld, Dietrich Rebholz-Schuhmann, Michael Schrader, Christian Stephan, Chris Taylor, Herbert Thiele, Guang-Zhong Yang, Chenggang Zhang and others.

Parts of the German HUPO BPP activities are funded by the German Federal Ministry of Education and Research (BMBF).
