
CENTER FOR GENOMICS AND BIOINFORMATICS


OVERVIEW

The Center for Genomics and Bioinformatics (CGB) is a multidisciplinary research center serving IUB, with growing interactions with IUPUI and other institutions. The center:

•carries out independent research in genomics and bioinformatics;
•collaborates with faculty to add or enhance a genomic and/or bioinformatic component in their research projects;
•promotes interdepartmental and interdisciplinary interactions to enhance genomics and bioinformatics at IUB;
•provides access to Roche GS-FLX high-throughput sequencing and NimbleGen high-density arrays.

LIST OF SERVICES

High-throughput sequencing with GS-FLX: We offer fee-for-service sequencing as well as opportunities for collaboration using high-throughput sequencing techniques.

NimbleGen high-density microarrays: We offer a variety of options for high-density microarray experiments, including any combination of consultation, experimental design, and the microarray experiments themselves.

RESEARCH CONTRIBUTION HIGHLIGHTS

Characterization of the Drosophila Transcriptome
PROJECT DESCRIPTION: This project is an important component of the model organism ENCODE (modENCODE) project, which aims to identify all of the sequence-based functional elements in the Drosophila melanogaster genome. The project is run as a Research Network. Our role is to produce RNA samples that include all D. melanogaster transcripts. RNA samples are prepared from a variety of life cycle stages, from dissected tissues, from cell lines and from tissue regions. Many of these RNA samples are fractionated to enrich small RNAs, nuclear RNAs and poly-A+ RNAs. Sample types are chosen to maximize the diversity of transcriptomes. These samples are then used in experimental and computational work to generate transcript models and to provide a complete baseline catalog of developmental (stage- and tissue-specific) transcription in Drosophila.
PROJECT LEADER: Peter Cherbas
FUNDING: National Human Genome Research Institute (NHGRI)

Daphnia pulex Genome Sequencing & Empirical Annotation
PROJECT DESCRIPTION: The globally distributed zooplankton Daphnia (commonly called the water flea) is the first crustacean to have its genome sequenced, helping to create a new model system for ecological genomics. The sequence is produced by the Joint Genome Institute in collaboration with the Daphnia Genomics Consortium. Investigations of the data are uncovering how the genome's structure, gene inventory and regulation are products of the many challenges common in aquatic environments. A surprising result is the genome's impressive catalog of genes; only half the predicted loci have sequence similarity to other characterized eukaryotic proteomes. The large number of orphan genes is due to the phylogenetic distances between Crustacea and insect model species, variable rates of evolutionary change, and gene family expansions specific to the Daphnia lineage. Experimental annotations are required to understand how the genome's organization is coupled to the animal's biology. Our characterization of gene functions is based on genome-wide expression profiling using ESTs and microarrays, generated by exposing Daphnia to ecologically relevant conditions. We further study transcriptional profiles under more modern conditions that threaten zooplankton populations: environmental pollutants and the depletion of essential elements and nutrients. Overall, these data suggest that many of the novel components of the genome reflect physiological and adaptive responses of Daphnia to its complex environments.
PROJECT LEADER: John Colbourne
FUNDING: NSF, NIH, US DOE

Microbial Systems Biology Pipeline
PROJECT DESCRIPTION: Our current understanding of bacterial cell function is based on a few very powerful model systems. However, studies of alternative, non-model microbial systems in recent years clearly indicate that restricting our focus to a few model systems is insufficient, and in some cases misleading, for understanding the amazing diversity of mechanisms and properties of microbial cells. Furthermore, recent comparisons of true natural isolates with "wild-type" laboratory strains have revealed that many domesticated laboratory models have in fact evolved towards optimal laboratory growth and often differ considerably from their environmental counterparts. The study of natural isolates is often hampered by the concurrent loss of genetic tractability. A paradigm shift is required to launch a new era in the study of the microbial world, in which high-throughput technologies combined with powerful bioinformatics analysis allow the study of non-model systems at a level of detail previously unimaginable.
PROJECT LEADERS: Qunfeng Dong (CGB, Bioinformatics), Jeong-Hyeon Choi, Keithanne Mockaitis, Zhao Lai, in collaboration with Y. Brun and all IUB microbiologists
FUNDING: The Indiana Metabolomics and Cytomics Initiative (METACyt)

CONTACT INFORMATION

Director: Peter Cherbas (812-855-6273)

Deputy Director: Jennifer Steinbachs (812-856-1858)

Genomics Director: John Colbourne (812-856-0099)

Bioinformatics Director: Qunfeng Dong (812-855-3373)

Computing Director: Phillip Steinbachs (812-856-5081)

QUALITY CONTROL AND ASSURANCES

NimbleGen High-Density Arrays: During a microarray experiment, we perform QC at every step, as recommended in the NimbleGen manual. Measures include NanoDrop readings for checking the quantity of ds-cDNA synthesis, Agilent Bioanalyzer readings for checking the quality and quantity of ds-cDNA synthesis, and NanoDrop readings for checking the efficiency of cDNA labeling. We also perform sample tracking control analysis for data analysis assessment.

High-throughput Sequencing:
•Sample assessment: DNA quality and quantity are assessed by spectrophotometry, fluorometry, and gel electrophoresis. These assays reveal potential contaminants (such as RNA, protein, and other small molecules) that would interfere with sample sequencing, and also reveal the extent to which the sample DNA is intact or degraded from handling.
•Library assessment: A sequencing library prepared from the sample DNA is composed of fragments that are smaller than the sample DNA and carefully size-selected. To assess the concentration, size range and quality of the library, we use fluorometry and an Agilent Bioanalyzer LabChip, designed for highly sensitive detection.
•Library titration: We test a library by performing a trial run of emulsion PCR, carried out in the same manner as if the library were to be sequenced. For the high throughput of pyrosequencing to yield successful results, we must ensure that the vast majority of templates are captured and amplified individually, with one template per bead. If more than one template is amplified on a bead, sequence reads imaged off that bead will be "mixed" and therefore uninterpretable, and will be discarded in the data analysis.
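
The one-template-per-bead requirement can be illustrated with a simple occupancy model. The sketch below assumes, purely for illustration, that the number of templates captured per bead follows a Poisson distribution whose mean is the template-to-bead ratio. This model and the function name `bead_fractions` are assumptions made here, not the CGB's titration protocol.

```python
import math

def bead_fractions(ratio):
    """Return (empty, single, mixed) bead fractions for a mean template:bead ratio.

    Assumes Poisson-distributed templates per bead (an illustrative model only).
    """
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    empty = math.exp(-ratio)            # beads that captured no template
    single = ratio * math.exp(-ratio)   # beads with exactly one template
    mixed = 1.0 - empty - single        # beads with two or more templates
    return empty, single, mixed

if __name__ == "__main__":
    for ratio in (0.1, 0.5, 1.0, 2.0):
        empty, single, mixed = bead_fractions(ratio)
        print(f"ratio={ratio:.1f}  empty={empty:.3f}  "
              f"single={single:.3f}  mixed={mixed:.3f}")
```

Under this assumed model, lowering the ratio reduces the share of mixed beads but leaves more beads empty, which suggests why the ratio is settled empirically with a trial run rather than fixed in advance.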

RESOURCES

The CGB employs 30 full-time scientists at various levels of expertise who engage faculty in their genomics and bioinformatics projects. Our genomics staff have access to a variety of cutting-edge equipment in the lab. Highlights include:

•Roche GS-FLX, capable of generating 125-150 million bases of sequence in a single run;
•GeneMachines Hydroshear DNA Shearing Device;
•Veritas Microdissection Instrument model 704 with IR capture laser and UV cutting laser with epifluorescence;
•Beckman Coulter Biomek FX, an automated liquid handling robot;
•2 MJ Research Tetrad thermal cyclers and 10 Eppendorf Mastercyclers, with 18 96-well blocks and 1 384-well block capable of incubating 2112 thermal cycle reactions simultaneously;
•GeneMachines Omnigrid microarray printer, capable of printing 300 microarrays simultaneously with over 14,000 elements per array;
•Axon Instruments GenePix 4000A microarray scanner (10 µm resolution) and GenePix 4200A microarray scanner (5 µm resolution, 3 channels);
•NanoDrop ND-1000 Spectrophotometer, designed with high absorbance capability, 50 times that of traditional spectrophotometers.
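
The thermal cycler capacity quoted in the list follows directly from the block counts. As a quick check, using only the numbers given above (the variable names are illustrative):

```python
# Wells per block -> number of blocks, as listed for the thermal cyclers.
blocks = {96: 18, 384: 1}

# One reaction per well, so total capacity is the sum over all blocks.
total_wells = sum(wells * count for wells, count in blocks.items())
print(total_wells)  # 18 * 96 + 1 * 384 = 2112
```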

The CGB designed and maintains its own dedicated Core Computing Facility (CCF) to support day-to-day operations and a growing number of computationally and storage-intensive research projects. The CCF consists of about 54 enterprise-class systems from Sun Microsystems. Dedicated research systems include two 8-core and four 16-core systems for interactive serial jobs, sequencing/microarray analysis, and relational database services, and 14 dual-CPU cluster nodes for batch serial and parallel jobs. Another 26 single- and dual-processor dual-core systems provide development and production web, database, authentication, collaboration, and Unix/Windows desktop services. Production storage is provided by multiple X4500 "Thumper" systems, each providing 24-48TB of raw capacity, which is combined into a single large pool on high-availability head nodes and shared out via NFS and iSCSI. All filesystems built on top of this use Sun's Zettabyte File System (ZFS) and a dual-parity RAID scheme (raidz2) for redundancy. Backup storage is provided by a SAN consisting of several 8-24TB Fibre Channel/SATA JBOD disk arrays. Altogether, the CGB has approximately 232TB of total raw disk capacity.
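
As a rough illustration of what dual parity costs in capacity, the sketch below estimates the usable share of a raidz2 group, in which two disks' worth of space per group goes to parity. The group widths are hypothetical, since the pool layout is not stated here, and the estimate ignores hot spares, filesystem overhead, and unit conversions. The function name `raidz2_usable_tb` is likewise an assumption for this example.

```python
def raidz2_usable_tb(raw_tb, disks_per_group):
    """Estimate usable capacity when each raidz2 group gives up two disks to parity."""
    if disks_per_group < 3:
        raise ValueError("raidz2 needs at least 3 disks per group")
    return raw_tb * (disks_per_group - 2) / disks_per_group

if __name__ == "__main__":
    for raw_tb in (24, 48):        # raw capacity range quoted for one X4500
        for width in (6, 8, 12):   # hypothetical raidz2 group widths
            usable = raidz2_usable_tb(raw_tb, width)
            print(f"raw={raw_tb}TB  width={width}  usable~{usable:.1f}TB")
```

Wider groups lose a smaller share to parity, at the price of more disks being involved in each rebuild.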
