Support for Systems Biology Data in IRD/ViPR - Proteomics
Richard H. Scheuermann, Ph.D.
November 5, 2012
Projects with Host Factor Data
• Four systems biology groups funded by NIAID, including:
  – Systems Virology (Michael Katze group, Univ. Washington)
    • Influenza H1N1 and H5N1 and SARS Coronavirus
    • Statistical models, algorithms and software, raw and processed gene expression data, and proteomics data
  – Systems Influenza (Alan Aderem group, Institute for Systems Biology/Seattle Biomed)
    • Various influenza viruses
    • Microarray, mass spectrometry, and lipidomics data
• ViPR Driving Biological Projects
  – Abraham Brass, Mass. General Hospital
    • Dengue virus host factor database from RNAi screen
  – Lynn Enquist / Moriah Szpara, Princeton University
    • Deep sequencing and neuronal microarrays for functional genomic analysis of Herpes Simplex Virus
  – Richard Kuhn, Purdue University
    • Metabolomics data of Dengue virus infection of human cells and mosquitoes
  – Mike Diamond, Washington University
    • Identification of inhibitory interferon-stimulated genes (ISGs) against flaviviruses and noroviruses using shRNA knockdown
    • Determine the mechanism of action of individual inhibitory ISGs
Strategy for Handling “Omics” Data
• “Omics” data management (MIBBI vs MIBBI-DB)
  – Project metadata (1 template)
    • Title, PI, abstract, publications
  – Experiment metadata (~6 templates)
    • Biosamples, treatments, reagents, protocols, subjects
  – Primary results data
    • Raw expression values
  – Data processing metadata (1 template)
    • Normalization and summarization methods
  – Processed data
    • Data matrix of fold changes and p-values
  – Data interpretation metadata (1 template)
    • Fold change and p-value cutoffs used
  – Interpreted results (Host factor biosets)
    • Interesting gene, protein and metabolite lists
• Visualize biosets in context of biological pathways and networks
• Statistical analysis of pathway/sub-network overrepresentation
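The step from a processed data matrix to a host factor bioset can be sketched as a filter on fold change and p-value. This is a minimal illustration, not the IRD/ViPR implementation: the `Row` type, the `derive_bioset` function, the gene identifiers, and the default cutoffs (absolute fold change of 2, p-value of 0.05) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """One row of a processed data matrix (hypothetical layout)."""
    gene_id: str
    fold_change: float  # signed fold change, e.g. -2.5 means 2.5-fold down
    p_value: float


def derive_bioset(rows, min_abs_fold_change=2.0, max_p_value=0.05):
    """Return IDs of rows that pass both cutoffs (assumed example cutoffs)."""
    return [
        r.gene_id
        for r in rows
        if abs(r.fold_change) >= min_abs_fold_change and r.p_value <= max_p_value
    ]


# Made-up rows for illustration only.
rows = [
    Row("GENE_A", 3.1, 0.001),   # passes both cutoffs
    Row("GENE_B", 1.2, 0.0004),  # fold change too small
    Row("GENE_C", -2.5, 0.03),   # passes both cutoffs (down-regulated)
    Row("GENE_D", 4.0, 0.2),     # p-value too large
]
bioset = derive_bioset(rows)
print(bioset)
```

Keeping the cutoffs as explicit parameters mirrors the idea of recording them as data interpretation metadata alongside the resulting list.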
Data Submission Workflows
[Diagram labels: Study metadata; Experiment metadata; Primary results; Analysis metadata; Processed data matrix; Host factor bioset; Free text metadata; GEO/PRIDE/PNNL/SRA/MetaboLights; ViPR/IRD/PATRIC; Systems Biology sites; arrows labeled "submission" and "pointer"]
Metadata Submission Template Examples
Host Factor Data
8 Studies To Date
Host Factor Bioset
Transcriptomics => Proteomics
• Metadata fields are largely re-usable, with some exceptions
  – Exp_sample_template (protein).xls
• Results data differences
  – Peptide-level and protein-level
    • IM005_Peptide_normalization_matrix.V2.xlsx
    • IM005_Protein Normalization matrix.xlsx
  – Statistical measures
    • Results_matrix_ IM005_sig Protein_RM.xlsx
Metadata Field Changes
• GEO GSM ID => Primary Data Archive + Primary Data Archive ID
• Semi-structured Experiment Variable to Structured Experiment Variable
  – Free text (1 day) => value unit pairs in separate fields (1/day; 10^4/plaque forming units)
• Multiple processed data matrix files
  – Concatenated IDs separated by (; |)
• Reagents and protocols are different but should not require submission template changes
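The two field changes above (free text to value/unit pairs, and concatenated file IDs) can be sketched as small parsing helpers. This is only an illustration under stated assumptions: the function names `split_value_unit` and `split_ids` are hypothetical, the value is assumed to be a number optionally written as a power such as 10^4, and the ID separator is assumed to be either ";" or "|".

```python
import re

# Assumed value syntax: digits, optional decimal part, optional "^exponent".
_VALUE_UNIT = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?(?:\^[0-9]+)?)\s*(.+?)\s*$")


def split_value_unit(text):
    """Split free text such as '1 day' into separate (value, unit) fields."""
    match = _VALUE_UNIT.match(text)
    if match is None:
        raise ValueError(f"cannot split into value and unit: {text!r}")
    return match.group(1), match.group(2)


def split_ids(field):
    """Split concatenated IDs on ';' or '|' (assumed separators)."""
    return [part.strip() for part in re.split(r"[;|]", field) if part.strip()]


print(split_value_unit("1 day"))
print(split_value_unit("10^4 plaque forming units"))
print(split_ids("FILE_1; FILE_2|FILE_3"))  # hypothetical file IDs
```

The value is kept as a string so that notations like 10^4 survive unchanged; converting to a number would be a separate, explicit step.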
Normalized Data
• Archive at BRC (standard format?)
  – Peptide normalized data
  – Protein normalized data
  – Results matrix of significant proteins
• BRCs derive bioset lists from results matrix
  – Handling different significance measures
    • t-test flag, t-test p-value, g-test flag, g-test p-value, log10 ratio
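One way a bioset list might be derived from a results matrix that mixes significance measures is sketched below. Everything here is an assumption made for illustration: the column names, the convention that a nonzero flag marks a significant protein, the rule that a flag takes precedence over a p-value when both are present, and the 0.05 cutoff. The actual conventions of the submitted files would need to be confirmed with the data providers.

```python
def is_significant(row, max_p_value=0.05):
    """Decide significance from whichever measures a row provides.

    Assumptions (not from the source): a nonzero flag means significant,
    and a flag, when present, is used instead of the matching p-value.
    """
    for test in ("t_test", "g_test"):
        flag = row.get(f"{test}_flag")
        if flag is not None:
            if flag != 0:
                return True
            continue  # flag present and zero: do not fall back to the p-value
        p_value = row.get(f"{test}_p_value")
        if p_value is not None and p_value <= max_p_value:
            return True
    return False


# Made-up rows with hypothetical column names.
results_matrix = [
    {"protein_id": "PROT_1", "t_test_flag": 1, "t_test_p_value": 0.01},
    {"protein_id": "PROT_2", "t_test_flag": 0, "g_test_flag": -1},
    {"protein_id": "PROT_3", "t_test_p_value": 0.2, "g_test_p_value": 0.5},
    {"protein_id": "PROT_4", "g_test_p_value": 0.003},
]
protein_bioset = [r["protein_id"] for r in results_matrix if is_significant(r)]
print(protein_bioset)
```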
Host Factor Bioset
On Deck
• Metabolomics and lipidomics data
• Integration of RNA expression, protein abundance and metabolite abundance
• Pathway/network visualization and analysis
Acknowledgement
• Lynn Law, U. Washington
• Richard Green, U. Washington
• Peter Askovich, Seattle Biomed
• Brett Pickett, U.T. Southwestern/JCVI
• Jyothi Noronha, U.T. Southwestern
• Eva Sadat, U.T. Southwestern
• Entire Systems Biology Data Dissemination Task Force, especially Jeremy Zucker
• NIAID (Alison Yao and Valentina DiFrancesco)
Future Development Plans
• GO enrichment
• Network visualization
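GO enrichment of a bioset is commonly framed as an overrepresentation test. The sketch below uses a one-sided hypergeometric test, which is a standard choice but not necessarily the method planned for IRD/ViPR; the function name `hypergeom_upper_tail` and the counts in the usage example are made up for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """Return P(X >= k) for a hypergeometric draw.

    N: genes in the background set
    K: background genes annotated with the GO term
    n: genes in the bioset
    k: bioset genes annotated with the GO term
    """
    total = comb(N, n)
    upper = min(K, n)
    # comb(a, b) is 0 when b > a, so impossible overlaps contribute nothing.
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


# Hypothetical counts: 1000 background genes, 40 annotated with the term,
# a 50-gene bioset of which 8 carry the annotation.
p_value = hypergeom_upper_tail(k=8, K=40, n=50, N=1000)
print(f"overrepresentation p-value: {p_value:.3g}")
```

When many GO terms are tested against the same bioset, the resulting p-values would normally need a multiple-testing correction before being interpreted.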