
Page 1: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Future Challenges in Bioinformatics

RRXPharma

Page 2: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Introduction

• Introduction: How RRX got involved …

• Life sciences context: How bioinformatics came to be important…

• The past half century: How bioinformatics has “evolved”…

Page 3: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Introduction

• Categories of Bioinformatics Tools

• Why We Need Supercomputers

• Software Development Issues

• Future Challenges

• Tools for Biotech Projects

• Summary

Page 4: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

How RRX got involved …

• Submitted a Canadian Foundation for Innovation (CFI) proposal for Advanced Bioinformatics Collaborative Computing (ABioCC)

Page 5: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

How RRX got involved …

• Developed an SVG based visualization front end

• Paper will be presented at SVG Open 2003 in Vancouver on July 17th

Page 6: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

How bioinformatics came to be important…

• After the structure of DNA was reverse engineered with X-ray diffraction in 1953, the focus shifted to nucleic acid sequence analysis

• DNA/RNA/protein sequence data accumulated, with computer programs used for storage and analysis

Page 7: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

How bioinformatics came to be important…

• Bioinformatics algorithms in development for the last half century came into widespread use by researchers

• The ability to compare sequences created a homology context for unknown sequences of interest leading to advances…
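One common way to put a number on "comparing sequences" is a global alignment score. The sketch below is a minimal Needleman-Wunsch scorer with linear gap penalties; the function name and the scoring values (match +1, mismatch -1, gap -2) are illustrative assumptions, not taken from the slides, and production tools such as BLAST use faster heuristics and substitution matrices instead.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Return the Needleman-Wunsch global alignment score of a and b.

    Scoring values are illustrative assumptions, not tuned parameters.
    """
    # Row 0 of the dynamic programming table: an empty prefix of `a`
    # aligned against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: a[:i] aligned against an empty prefix of `b`.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    unknown = "GATTACA"
    # Hypothetical "database" of known sequences, for illustration only.
    known = {"seq_a": "GATTACA", "seq_b": "GATTAC", "seq_c": "CCCCCCC"}
    ranked = sorted(known, key=lambda name: global_alignment_score(unknown, known[name]),
                    reverse=True)
    print(ranked)
```

Ranking known sequences by score against an unknown one is the simplest form of the "homology context" described above: the best-scoring neighbours suggest what the unknown sequence might be.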

Page 8: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

How bioinformatics came to be important…

• Improved sequencing technology enabled the complete deciphering of the human genome >>> 1999

• About 3.18 billion base pairs

• Celera used 300 PE Biosystems ABI Prism 3700 DNA Analysers
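To give a feel for the data volume behind the base-pair figure above, the following back-of-the-envelope sketch estimates raw storage for a single copy of the sequence. The 2-bit packing and the one-byte-per-base text encoding are assumptions made for illustration; real sequencing projects store far more (traces, quality values, overlapping reads).

```python
BASE_PAIRS = 3_180_000_000  # figure quoted on the slide
BITS_PER_BASE = 2           # assumption: A, C, G, T packed into 2 bits each


def packed_size_bytes(n_bases: int, bits_per_base: int = BITS_PER_BASE) -> int:
    """Bytes needed to store n_bases at bits_per_base, rounded up."""
    return (n_bases * bits_per_base + 7) // 8


if __name__ == "__main__":
    packed = packed_size_bytes(BASE_PAIRS)
    as_text = BASE_PAIRS  # assumption: one ASCII byte per base, no line breaks
    print(f"2-bit packed: {packed / 1e6:.0f} MB")
    print(f"plain text:   {as_text / 1e9:.2f} GB")
```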

Page 9: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

How bioinformatics has “evolved”…

• Central dogma of molecular biology

– DNA sequences are transcribed into mRNA sequences, mRNA sequences are translated into protein sequences, which fold in 3D, creating structures with functions that are statistically selected for survival >>> affecting the prevalence of the underlying DNA sequences in a population
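The first two steps of this flow (DNA to mRNA, mRNA to protein) can be sketched in a few lines. This is a simplified illustration, not part of the original slides: the function names are hypothetical, the input is assumed to be the coding strand read 5' to 3' starting in frame, and the standard genetic code is used. Folding and selection are not modelled.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """Coding-strand DNA to mRNA: same sequence with T replaced by U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTAA")
    print(mrna, translate(mrna))
```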

Page 10: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

How bioinformatics has “evolved”…

• This created a supporting information flow

– Organization and control of genes in the DNA sequence

– Identification of transcriptional units in the DNA sequence

– Prediction of protein structure from sequence

– Analysis of molecular function

Page 11: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

How bioinformatics has “evolved”…

• Another covariant information flow was created based on the scientific method

– Create hypothesis with respect to biological activity

– Design experiments to test the hypothesis

– Evaluate resulting data for compatibility with the hypothesis

– Extend/modify hypothesis in response

Page 12: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

How bioinformatics has “evolved”…

• IT used to handle explosion of data from high-throughput techniques, too complex for manual analysis

– X-ray diffraction

Page 13: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

How bioinformatics has “evolved”…

– Automated DNA sequencing

• Amersham Biosciences

• Applied Biosystems

• Beckman Coulter

• LI-COR

• SpectruMedix Corp.

• Visible Genetics Corp.

Page 14: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

How bioinformatics has “evolved”…

– Microarray expression analysis

Page 15: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

How bioinformatics has “evolved”…

• Rapid emergence of 3D macromolecular structure databases

– New sub discipline: structural bioinformatics

• Atomic and sub cellular spatial scales

– Representation/physics

– Storage/retrieval/source data correlation/interpretation

– Analysis/simulation

– Display/visualization

Page 16: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

How bioinformatics has “evolved”…

Page 17: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Categories of Bioinformatics Tools…

• Databases >>> search/compare

• Sequence Analysis - Clusters

• Genomics

• Phylogenetics

• Structure Prediction

• Molecular Modelling

• Microarrays

• Packages, Misc Apps, Graphics, Scripts

Page 18: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Categories of Bioinformatics Tools…

Databases >>> search/compare

• aceperl

• BLAST

• Blastall

• Blastpgp

• BLAT

• Blimps

• Entrez

• FASTA

• fastacmd

• formatdb

• getz

• HMMER

• IMPALA

• InterProScan

• PHI-BLAST

• ProSearch

• PSI-BLAST

• PSI-BLASTN

• Sequin

• Swat

• tace

• xace

Page 19: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Sequence Analysis

• Artemis

• Bl2seq

• BLAST

• Clustal W, X

• consed/autofinish

• Cross_match

• Dotter

• EMBOSS

• FASTA

• Glimmer

• HMMER

• InterProScan

• MEME

• View

• Paracel Transcript Assem

• Phrap

• Phred

• Primers

• ProSearch

• Readseq2

• Rnabob

• RRTree

• SAPS

• seals

• Seqsblast

• STADEN

• Swat

• T-Coffee

Page 20: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Genomics

• Calc_primers

• Cross_match

• FPC

• GENSCAN

• Glimmer

• Image

• Mzef

• Phrap

• Phred

• STADEN

• Swat

• tace

• tace_celegans

• tRNAscan-SE

• xace

• xace_celegans

Page 21: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Phylogenetics

• Clustal W

• Clustal X

• MOLPHY

• MrBayes

• PHYLIP

• RRTree

• T-Coffee

• TREE-PUZZLE

• TreeViewX

Page 22: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Structure Prediction

• EMBOSS

• MEME

• Modeller

• Mzef

• PHI-BLAST

Page 23: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Molecular Modelling

• Modeller

– homology modeling from an alignment of a sequence to be modeled with known related structures

• RasMol

– a molecular graphics program intended for 3D visualisation of proteins and nucleic acids

• Raster3D (publishing images)

• X3DNA

– analyzing and rebuilding 3D structures

Page 24: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Microarrays

• Dapple

– a program for quantitating spots on a two-colour DNA microarray image.

• OligoArray

– a program that computes gene specific oligonucleotides that are free of secondary structure for genome-scale oligonucleotide microarray construction.

Page 25: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Packages, Useful Scripts/Source Code, Graphics, PERL

• BioPerl

• BioJava

• boxshade

• mvscf

• seg

• Split_fasta

• POV-Ray

• Raster3D

• MOLPHY

Page 26: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Why We Need Supercomputers…

• Some commercial packages run on “supercomputers”

– Accelrys: modeling and simulation

• Materials Studio

• Cerius2 (SGI Unix only)

– Homology modeling to catalyst design

• Insight II (SGI Unix only)

– 3D graphical environment for physics based molecular modeling

• Catalyst (high end Unix servers)

– database management valuable in drug discovery research

• QUANTA (high end Unix servers)

– crystallographic 2D/3D protein structure solution

• Discovery Studio

Page 27: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Why We Need Supercomputers…

• Supercomputer advantages

– Multiple processors

– Large shared memory

– Handle very large files

– Large/fast RAID arrays

– Terabyte tape backup systems

– Power backup systems

– High performance networks

Page 28: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Why We Need Supercomputers…

• Common bioinformatics requirements

– Computationally intensive tasks

– Large memory models

– Intensive/complex database searches

– Large experimental database sets

– Large derived database sets

– Large persistent intermediate data structures

– Teamwork data sharing and visualization

Page 29: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Why We Need Supercomputers…

• Network requirements

– Driving gigE/10gigE NICs

• Moving large files/data sets rapidly

• Visualization streams/Access GRID

• Coordinating Cluster/GRID computing

• Dynamic provisioning of light paths
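A rough idea of why link speed matters for "moving large files/data sets rapidly" comes from simple arithmetic. The sketch below computes an idealised transfer time; the 1 TB data set size and the `efficiency` parameter are assumptions for illustration, and real throughput is usually lower because of protocol overhead, disk speed, and contention.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float,
                     efficiency: float = 1.0) -> float:
    """Idealised time to move size_bytes over a link.

    efficiency is the assumed fraction of raw link rate that carries
    useful data (1.0 means a perfect, uncontended link).
    """
    return size_bytes * 8 / (link_bits_per_second * efficiency)


if __name__ == "__main__":
    one_terabyte = 1e12  # assumed data set size, decimal terabyte
    for name, rate in (("gigE", 1e9), ("10gigE", 10e9)):
        minutes = transfer_seconds(one_terabyte, rate) / 60
        print(f"{name}: {minutes:.1f} minutes")
```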

Page 30: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Why We Need Supercomputers…

[Figure: layered architecture diagram. Recoverable labels: Collaboration Object Models (Research Lab, Research Objects, Configuration); GRID Cluster Network Object Model; GRID Resource Coordination Model; Object Messaging System; Java Server Page Handler; Java Servlet Handler; HTTP Session, Operator Session, Memory Model Beans; Host Java VM(s); Database Server(s), DAO; IRR Server Wrapper with Request Analyzer, Server Bean Cache, Research Handler, RtConfig Handler, Whois Handler; N x N OXC, Router OS, OXC MIB(s)/Session(s), Device Config File(s); network layer: Multicast, OBGP, MPLS, IPv6, GRID]

Page 31: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Why We Need Supercomputers…

Page 32: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Software Development Issues…

• Collaboration contexts/barriers

– Team work … collaboration spaces

– Standards development … DTDs

– Integration issues…

• experimental data to homology to 3D model

• platform issues…

• network issues – 9k MTU - jumbo frames

– Licensing issues – public vs. private
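The "9k MTU - jumbo frames" item can be made concrete with a small calculation of how much of each Ethernet frame is payload. The sketch assumes TCP over IPv4 with no header options and no VLAN tag, and counts 38 bytes of per-frame Ethernet overhead on the wire (preamble and start delimiter 8, header 14, frame check sequence 4, inter-frame gap 12); these assumptions are mine, not the slide's.

```python
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12  # preamble/SFD, header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20             # assumption: IPv4 and TCP without options


def payload_efficiency(mtu: int) -> float:
    """Fraction of on-wire bytes that carry application payload."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETHERNET_OVERHEAD)


def frames_needed(size_bytes: int, mtu: int) -> int:
    """Number of full-size frames needed to carry size_bytes of payload."""
    payload = mtu - IP_TCP_HEADERS
    return -(-size_bytes // payload)  # ceiling division


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {payload_efficiency(mtu):.2%} payload, "
              f"{frames_needed(10**9, mtu)} frames per GB")
```

Under these assumptions the efficiency gain is modest (a few percent), while the frame count drops by roughly a factor of six, so the main benefit is fewer frames for hosts and NICs to process.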

Page 33: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Future Challenges…

• Creating developer infrastructure for building up structural models from component parts …

– components from macromolecule libraries ported to object models

• Understanding the design principles of systems of macromolecules and harnessing them to create new functions …

– specialized molecular machines

Page 34: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Future Challenges…

• Learning to design drugs efficiently and cost effectively based on knowledge of the target …

– target generation automation

– validation automation

• Development of enhanced simulation models that give insight into context based function from knowledge of structure …

– possible use of artificial intelligence to limit scope of search

Page 35: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

How Tools might be used for Industry Biotech Projects

Page 36: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Summary

• Bioinformatics

– well positioned to assist with application development

– exploring novel bioinformatics software development

– proceeding with supporting Access GRID and optical switching technology

Page 37: RRX Pharma-Bioinformatics Future Challenges in Bioinformatics

Questions/Comments… ? ;-)