Introduction to caArray
caBIG® Molecular Analysis Tools
Knowledge Center
April 3, 2011
caArray Overview
• More than a simple repository for microarray data.
• Supports data management throughout the life of an experiment.
• Allows collaborative sharing of pre-publication data with partners.
• Provides data to other biomedical/clinical tools to form a comprehensive solution for array data management, search, and analysis.
Why use caArray?
• Target Users:
  • Bench scientists performing microarray data collection and annotation
  • Microarray core facility scientists and technicians
  • Bioinformatics and data management coordinators
  • Multi-institutional data coordinating center informaticians
• Addressing Critical Needs:
  • Manage all aspects of array data: raw data, derived data, sample annotation, experimental design
  • Ensure data are private (in a local instance) until published
  • Support array data sharing using a federated model
  • Find what you are looking for fast: query annotated data within, and across, datasets
  • Facilitate data integration: provide annotated data to other analytical caBIG® tools
Key Functions of caArray
• Query annotated data within and across datasets with search and navigate features
• Uploading of array files from industry formats (e.g., Affymetrix, GenePix, Illumina, Agilent)
• Annotation of data to harmonize datasets and reduce time to aggregate data
• MAGE-TAB import and export functionality
• GEO-SOFT export functionality
• Security and authentication features that include group-based permissions
• Provide annotated data to other caBIG® tools that support analysis
• Rich programmatic APIs that allow analytical tools (on and off the Grid) to pull data from caArray and visualize/analyze it.
Web Interface: Find Things Fast
• User-friendly web interface for browse and search
Platform Support: Grow Towards All Inclusive
• caArray includes most available Affymetrix, Illumina, and Agilent array platforms/designs, so most native data files can be stored, parsed, and associated with samples.
Parsed Data Formats: the More, the Better for Users
• MAGE-TAB format
• Agilent raw TXT for aCGH, expression and miRNA assays
• Agilent GEML/XML array designs
• Nimblegen pair Report TXT (raw and normalized)
• Nimblegen NDF array designs
• Illumina CSV
• Illumina Sample Probe Profile TXT
• Illumina genotyping processed data matrix TXT
• Illumina BGX/TXT array designs
• Affymetrix CEL and CHP in AGCC/Calvin formats in addition to the GCOS formats
• Affymetrix CNCHP copy number data (CN4 and CN5)
• Copy Number data in a prescribed MAGE-TAB Data Matrix format.
MAGE-TAB: Save Time on Sample Annotation
• IDF
• SDRF
• Excel-like format, controlled vocabulary
• http://www.mged.org/mage-tab/
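Because the SDRF is a tab-delimited, spreadsheet-like table, it can be read with ordinary tools. The sketch below is illustrative only and is not part of caArray: it assumes a small SDRF with the standard MAGE-TAB column headers "Source Name" and "Array Data File", and the sample names, file names, and the helper function files_by_source are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical SDRF content; the sample and file names are made up for illustration.
SDRF_TEXT = (
    "Source Name\tCharacteristics[OrganismPart]\tHybridization Name\tArray Data File\n"
    "sample_1\tliver\thyb_1\tsample_1.CEL\n"
    "sample_2\tkidney\thyb_2\tsample_2.CEL\n"
    "sample_2\tkidney\thyb_3\tsample_2_rep.CEL\n"
)


def files_by_source(sdrf_text):
    """Map each Source Name to the Array Data File values found on its rows."""
    reader = csv.DictReader(io.StringIO(sdrf_text), delimiter="\t")
    grouped = defaultdict(list)
    for row in reader:
        grouped[row["Source Name"]].append(row["Array Data File"])
    return dict(grouped)


if __name__ == "__main__":
    for source, files in files_by_source(SDRF_TEXT).items():
        print(source, files)
```

Real SDRF files may repeat column headers (for example, several "Protocol REF" columns); csv.DictReader keeps only the last value for a repeated header, so a production parser would need to handle columns by position instead.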
Data Management: Loading Data
Data Management: Sample Annotation and Datasets
Data Export: Zip, MAGE-TAB, or GEO Soft
Collaboration and Data Sharing
• Investigators define collaboration groups for sharing of pre-publication data with a set of partners.
• Access control at the experiment level or at individual samples.
• Data is private until made public by the Data Owner.
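The three rules above can be pictured with a toy model. This is a minimal sketch under stated assumptions, not caArray's actual permission code: the Experiment class, its fields, and the can_read function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    """Toy experiment record (hypothetical; not the caArray data model)."""
    owner: str
    public: bool = False  # private until the owner makes it public
    group_access: set = field(default_factory=set)     # groups that may read the whole experiment
    sample_access: dict = field(default_factory=dict)  # sample id -> groups that may read that sample


def can_read(exp, user, user_groups, sample_id):
    """Return True if the user may read the given sample of the experiment."""
    if exp.public or user == exp.owner:
        return True
    if exp.group_access & user_groups:
        return True  # access granted at the experiment level
    # Otherwise fall back to access granted on the individual sample.
    return bool(exp.sample_access.get(sample_id, set()) & user_groups)


if __name__ == "__main__":
    exp = Experiment(owner="alice", group_access={"lab_a"},
                     sample_access={"s2": {"lab_b"}})
    print(can_read(exp, "bob", {"lab_a"}, "s1"))
    print(can_read(exp, "carol", {"lab_b"}, "s1"))
```

In this sketch a member of a group with experiment-level access can read every sample, while a member of a group listed only on one sample can read just that sample.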
Data Analysis: Tool Integration
• gene expression data
• gene expression data and SNP data
• Cross-query over many caArray instances
• gene expression data and copy number data
A Glance at the Technology
• Tool Platform:
  • Enterprise web-based system that works within a Firefox or Internet Explorer browser
• CBIIT-Hosted Installation of caArray:
  • Limited computer skills are required to use the application; directed at laboratory researchers
• Local Installation of caArray:
  • Moderate technical expertise is required to install the tool
• Upgrade Availability:
  • To make upgrades as seamless as possible, an upgrade installer, available in both GUI and command-line formats, upgrades an installed caArray instance while maintaining data integrity.
The Next Step: Accessing Online Resources for caArray
Molecular Analysis Tools Knowledge Center https://wiki.nci.nih.gov/x/R5GNAg
caArray User Forum https://cabig-kc.nci.nih.gov/Molecular/forums/viewforum.php?f=6
Tool Landing Page https://cabig.nci.nih.gov/tools/caArray
Access to Demo caArray Instance https://array-train.nci.nih.gov/caarray/home.action (Register from that site for a training account)
Application Support Email: [email protected]
Phone: 301-451-4384
Toll-free: 888-478-4423