Dina El-Khishin

The design, construction and use of software tools to generate, store, annotate, access and analyse data and information

Dina El-Khishin (Ph.D.)

Deputy Director of AGERI & Head of the Genomics, Proteomics & Bioinformatics Research Facility

Agricultural Genetic Engineering Research Institute (AGERI)

Giza, EGYPT

Bioinformatics

Bibliotheca Alexandrina

December 2007

Assumptions

• Use of internet browsers and computers
• PC with Microsoft Windows
• Internet connection
• Background in molecular biology

What’s in the name?

BioInformatics:

• Sequence Analysis
• Database Homology Searching
• Multiple Sequence Alignment
• Homology Modeling, Docking
• Protein Analysis
• Proteomics
• 3D Modeling
• Sample Registration & Tracking
• Integrated Data Repositories
• Common Visual Interfaces
• Intellectual Property Auditing
• Genome Mapping

What is Bioinformatics?

• Computerized annotation of genomic and biological information and data (databases).
• Transformation and manipulation of these data (software tools).
• Computational analysis of biological data.

The design, construction and use of software tools to generate, store, annotate, access and analyse data and information relating to Molecular Biology

Here we will consider the use of Bioinformatics tools rather than their design and construction

Here we will consider the access and analysis of data and information items rather than their generation, storage or annotation

Typical Bioinformatics: Multi-Disciplinary

• Scientists
  - Experimental Design & Interpretation
  - Laboratory Protocols & Standards/Controls
• Mathematicians
  - Analysis & Correlation of Data
  - Validation methodologies
• Computer Scientists
  - Information Storage / Control Vocabulary
  - Data Mining

Bioinformatics

USERS
• of Information
• of Tools
• of Instrumentation
• In-Silico Modeling

INTERPRETERS
• of Information

DEVELOPERS
• of Information
• of Tools
• of Instrumentation
• of Architecture/Storage
• Algorithms
• Modeling Strategies
• Visualization

Overall Aim of Bioinformatics:

To provide biologically important predictions from annotated data and from the transformation and manipulation of these data.

Bioinformatics

• SCOPE: Biological information
  - Acquisition
  - Processing
  - Storage
  - Distribution
  - Analysis
  - Interpretation
• TECHNIQUES:
  - Mathematics
  - Computer science
  - Biology
• OBJECTIVE: Understanding the biological significance of biological data

Bioinformatics Databases

• Nucleotide and protein sequences.

• Protein structures.

• All sorts of functional data related to genes, proteins and their regulation, interactions etc.

• Curated and non-curated databases.

Software tools (computer programs):

• Software tools for:
  - sequence analysis
  - database construction and management
  - evolutionary relations
  - structural analyses
  - pathways
  - microarray analysis
  - proteomic analysis
• Software tools integrated into databases.

The Need for Bioinformatics:

• Whole Genome Analyses and Sequences.
• Experimental analyses for thousands of genes simultaneously.
• DNA Chips and Array Analyses.
  - Expression Arrays.
  - Comparative Analyses between Species and Strains.
• Proteomics: 'Proteome' of an Organism ... 2D gels, Mass Spec.
• Medical applications: Genetic Disease ... SNPs.
  - Pharmaceutical and Biotech Industry.
• Forensic applications.
• Agricultural applications.

Main Bioinformatics Applications

• Comparison and analysis of nucleotide and protein sequences.

• Comparison and analysis of molecular structures (especially proteins).

– Results of comparisons can be used in evolutionary and phylogenetic studies.

• Data mining of genomic data (large-scale gene expression results etc.).

Main Goals

• Introduction to biological databases.
  - Good knowledge of major databases.
  - An overview of the large variety of minor databases.
• Learning to use tools at major database sites.
  - GenBank/NCBI, ExPASy/SwissProt, PDB.
• Introduction to sequence searching and sequence alignments.
  - The tools with most practical "everyday" usability.

DNA
• Sequence Submission
• Sequence Alignments (Pairwise and Multiple)
• Scoring Matrices
• Motifs and Patterns
• Genes, Exons, and Introns
• Promoters, Transcription-factor-binding Sites
• Other Regulatory Sites

RNA
• Secondary Structure
• RNA-specifying Genes, Motifs

Protein
• Sequence Alignment
• Motifs, Patterns, and Profiles

II. Sequence Databases and Their Use:

A. Primary Sequence Databases:

Nucleic Acid Databases

• NCBI (National Center for Biotechnology Information) - GenBank

• http://www.ncbi.nlm.nih.gov/

• EBI (European Bioinformatics Institute) - EMBL

• http://www.ebi.ac.uk/

• DISC - DNA Information and Stock Center, Japan

• http://www.dna.affrc.go.jp/

Protein Databases

• NCBI - GenPept

• http://www.ncbi.nlm.nih.gov/

• ExPASy - SwissProt and TrEMBL

• http://www.expasy.ch/

• EBI (European Bioinformatics Institute)

• SwissProt, TrEMBL, PIR

• http://www.ebi.ac.uk/

• DISC - DNA Information and Stock Center, Japan

• http://www.dna.affrc.go.jp/

B. Uses of Sequence Databases:

• Information Retrieval
• Analysis: "given a new DNA sequence, what's in it?"
  - Finding Homologues
  - Finding Genes
  - Finding Motifs - DNA Binding Sites

C. Retrieve Info from Sequence Databases:

• NCBI - Entrez
  - http://www.ncbi.nlm.nih.gov/Entrez/
  - Types of Databases Available
  - Entrez Help
  - Retrieve Large Data Sets
• ExPASy - SwissProt and TrEMBL
  - http://www.expasy.ch/
  - SwissProt - Bairoch's well-annotated, non-redundant protein DB
  - TrEMBL - Translation of EMBL DNA coding sequences
• EBI - SwissProt, TrEMBL, PIR
  - http://www.ebi.ac.uk/
  - SRS - Sequence Retrieval System
  - Software Tools - FASTA, WU-Blast2, ClustalW
  - EBI2 - second server at EBI
• DISC - DNA Information and Stock Center, Japan - DDBJ
  - http://www.dna.affrc.go.jp/
  - SRS - Sequence Retrieval System
  - Software Tools - FASTA, BLAST, MpSrch
• PDB - Protein DataBank
  - http://www.rcsb.org/pdb/
  - Protein 3D Structure database

D. Sequence Analysis: finding Homologues

• Homologues - sequences descending from a common ancestor
• Comparison of Sequences using Distance Matrix approach
• DOT PLOTS - 2D graph of alignment of two sequences
• FASTA - fast, global database search tool of Pearson and Lipman
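A dot plot can be sketched in a few lines of Python. The sketch below is only illustrative: the function names `dot_plot` and `render` and the `window` parameter are assumptions made for this example, not part of any tool named on the slide. A cell is marked when the windows starting at those two positions are identical, so similar regions show up as diagonal runs.

```python
def dot_plot(seq_a, seq_b, window=1):
    """Return rows of booleans: True where the windows of both sequences match."""
    rows = len(seq_a) - window + 1
    cols = len(seq_b) - window + 1
    return [
        [seq_a[i:i + window] == seq_b[j:j + window] for j in range(cols)]
        for i in range(rows)
    ]


def render(matrix):
    """Draw the matrix as text: '*' for a match, '.' otherwise."""
    return "\n".join("".join("*" if hit else "." for hit in row) for row in matrix)


if __name__ == "__main__":
    # Rows follow the first sequence, columns the second.
    print(render(dot_plot("GATTACA", "GATCACA", window=2)))
```

A larger `window` suppresses chance single-letter matches at the cost of hiding short similarities.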

BLAST - Basic Local Alignment Search Tool

http://www.ncbi.nlm.nih.gov/BLAST/

• BLASTN - nucleic acid (NA) query vs NA database
• BLASTP - protein query vs protein database
• BLASTX - NA query (translated) vs protein database
• TBLASTN - protein query vs NA database (translated)
• TBLASTX - NA query (translated) vs NA database (translated)
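"Translated" in the list above means conceptual translation of the nucleic acid sequence in all six reading frames. The following Python sketch shows only that translation step, under the assumption of the standard genetic code; the names `six_frames`, `translate` and `reverse_complement` are illustrative and do not correspond to BLAST's own code.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return the six conceptual translations keyed '+1'..'+3' and '-1'..'-3'."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

For example, `six_frames("ATGGCCTAA")["+1"]` gives `"MA*"`, which is the kind of protein string a translated search would compare against a protein database.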

E. Sequence Analysis: finding Genes in DNA

Methods:

• Gene Search by Signal
  - Look for Signals - Promoter Sites, Splice Sites, ...
• Gene Search by Content
  - Open Reading Frame
  - Use of Statistical Properties of Protein Coding Regions
    · Unequal use of amino acids
    · Unequal numbers of codons per amino acid
    · Codons available not equally used - Codon Usage
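Gene search by content can be illustrated with a very small open-reading-frame scanner plus a codon count. This is a minimal sketch under simplifying assumptions (forward strand only, ATG as the only start codon, no introns); the function names and the `min_codons` parameter are made up for the example.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return forward-strand ORFs as (start, end, sequence), stop codon included."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG in this frame opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # look for the next ORF in the same frame
    return sorted(orfs)


def codon_usage(orf):
    """Count how often each codon occurs in an in-frame coding sequence."""
    return Counter(orf[i:i + 3] for i in range(0, len(orf) - 2, 3))
```

Comparing the `codon_usage` counts of a candidate ORF with those of known genes from the same organism is one simple way to use the "codons not equally used" property listed above.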

F. Sequence Analysis: finding Motifs

Motifs:
• Motif - a recurrent thematic element
• Structural motifs - pieces of folded 3D structure
• Sequence motifs - conserved "blocks" of sequences

DNA Motifs:
• Protein binding sites ... regulatory elements
• Relatively short
• Statistically difficult
• Cooperative binding often important
• Structural elements may be important - bends, kinks

Protein Motifs:

• Secondary structure - alpha helices, beta sheets
• Super secondary structure - 4 helix bundle, etc.

Basic Methods:

• Consensus sequence - single, best sequence
• Regular Expression - multiple characters per site
• Weight Matrix (Profile) - any character per site, with a score
• Hidden Markov Model
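The difference between a consensus sequence and a weight matrix can be sketched as follows. The example builds a log-odds matrix from a handful of aligned sites, assuming a uniform background and a pseudocount of 1; those choices, the function names and the toy sites are assumptions for illustration only.

```python
import math
from collections import Counter


def build_pwm(sites, alphabet="ACGT", pseudocount=1.0):
    """Log-odds weight matrix from equal-length aligned sites (uniform background)."""
    background = 1.0 / len(alphabet)
    total = len(sites) + pseudocount * len(alphabet)
    pwm = []
    for pos in range(len(sites[0])):
        counts = Counter(site[pos] for site in sites)
        pwm.append({
            ch: math.log2(((counts[ch] + pseudocount) / total) / background)
            for ch in alphabet
        })
    return pwm


def consensus(pwm):
    """The single best character at each position."""
    return "".join(max(column, key=column.get) for column in pwm)


def score(pwm, window):
    """Sum of per-position scores; window must use only alphabet characters."""
    return sum(column[ch] for column, ch in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, position) of the highest-scoring window in the sequence."""
    width = len(pwm)
    return max(
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )
```

With the sites `["TATAAT", "TATAAT", "TATGAT", "TACAAT"]` the consensus is `TATAAT`, but unlike the consensus the matrix still gives `TATGAT` a positive score, which is the practical advantage of scoring every character per site.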

Protein Family Classifications

Prosite:

• Database of protein families and domains - at ExPASy and elsewhere
• Regular expressions (Patterns) and Profiles
• Programs
  - Search Prosite for Pattern or Profile
  - ScanProsite - scan a sequence against Prosite, or a pattern against SwissProt
  - ProfileScan - scan a sequence against Profile Database
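Prosite patterns are regular expressions in a notation of their own, so scanning a sequence against a pattern can be sketched by converting the pattern to a Python regular expression. The converter below is a simplified illustration and not the ScanProsite implementation; it handles the common elements (`x`, `[...]`, `{...}`, repeat counts, and the `<` / `>` anchors outside brackets), and the function names are hypothetical.

```python
import re


def prosite_to_regex(pattern):
    """Convert a Prosite pattern such as 'N-{P}-[ST]-{P}' to a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # {ABC} means "any residue except A, B or C".
        element = element.replace("{", "[^").replace("}", "]")
        # (n) or (n,m) is a repeat count.
        element = element.replace("(", "{").replace(")", "}")
        # x is any residue; < and > anchor the sequence ends.
        element = element.replace("x", ".").replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)


def scan(pattern, protein):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    regex = prosite_to_regex(pattern)
    return [m.start() for m in re.finditer(f"(?=({regex}))", protein)]
```

For example, the N-glycosylation site pattern `N-{P}-[ST]-{P}` becomes `N[^P][ST][^P]`, and `scan("N-{P}-[ST]-{P}", "MNGSANPTA")` reports the single site starting at position 1.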

G. Multiple Sequence Analysis

Basics

• Progressive Sequence Alignment
  - Pairwise alignment of most similar, then next most similar, etc.
• Steps
  - Do pairwise alignment for all sequences
  - Get Matrix of approximate Distances between each pair
  - Create an approximate phylogenetic tree - Guide Tree
  - Use this to determine order of addition of sequences to alignment
  - Align: two sequences; seq. to sub-alignment; two sub-alignments
  - Keep GAPS that appear early - 'Once a gap, always a gap'
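The first two steps above (all pairwise alignments, then a distance matrix) can be sketched in Python. This is a toy illustration, not Clustal W: it uses a plain Needleman-Wunsch global alignment with assumed scores (match 1, mismatch -1, linear gap -2), defines distance as the fraction of differing alignment columns, and simply sorts pairs by distance instead of building a real guide tree.

```python
from itertools import combinations


def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def distance(a, b):
    """Fraction of alignment columns that differ (gap columns count as differences)."""
    x, y = global_align(a, b)
    return sum(p != q for p, q in zip(x, y)) / len(x)


def distance_matrix(seqs):
    """seqs: {name: sequence}. Returns {(name_a, name_b): distance}."""
    return {(p, q): distance(seqs[p], seqs[q])
            for p, q in combinations(sorted(seqs), 2)}


def joining_order(seqs):
    """Pairs from most to least similar; a crude stand-in for the guide tree."""
    matrix = distance_matrix(seqs)
    return sorted(matrix, key=matrix.get)
```

In a full progressive aligner the distance matrix would feed a tree-building step, and the resulting guide tree would fix the order in which sequences and sub-alignments are merged.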

• Web sites for Multiple Sequence Alignment

Clustal W:

• Weighting - different weights given to unequally sampled sequences
• Position Dependent Weights
  - Position-Specific Gap Penalties (Opening vs Extension)
  - Sequence Weighting
  - Weights for Adding New Sequences to existing Alignment - extra weight to sequences most similar to alignment
• Clustal W Servers

Other Web Programs:

• MAP, PIMA, MSA
• Many others available

Web Databases of Multiple Sequence Alignments

• FSSP - Fold classification based on Structure-Structure alignment of Proteins
• HSSP - Homology-derived Secondary Structure of Proteins
• DSSP - Dictionary of Secondary Structure of Proteins (database of secondary structure assignments)

H. Phylogenetics

Basics:

• Trees - Rooted vs Unrooted
  - Rooted Tree - position of Ancestor is known
  - Unrooted Tree - no Ancestral Node
• Topology - Branching Pattern of the Tree

Terminology

• 1, 2, 3, 4, 5: Taxa or External Nodes
• X, Y, Z: Internal Nodes
• R1: Root
• a, b, c, d, e: External Branches
• f, g: Internal Branches
• h: Internal Branch ONLY IF tree is Rooted; else h is part of the Outgroup: Taxon 5 ... used to "root trees"

Methods:

• Distance Matrix methods
  - UPGMA - Unweighted Pair Group Method with Arithmetic Mean; fixed 'clock', averages used to get distances
  - Fitch & Margoliash - 3 branches calculated at a time
  - Neighbor Joining - pairs of taxa, finding closest pair; tree with smallest sum of Branch Lengths
  - Other methods also available
• Parsimony methods
  - Find tree with fewest inferred mutations
  - Programs: PHYLIP package; PAUP
• Maximum Likelihood methods
  - Use a mathematical model of the process of evolution
  - Model parameters are chosen to maximize the likelihood that the observed changes took place
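Of the distance matrix methods, UPGMA is the simplest to sketch: repeatedly join the two closest clusters and set the distance from the new cluster to every other cluster to the size-weighted average of the old distances. The Python below is a minimal illustration that returns only the topology as nested tuples; branch lengths are omitted, and the input format (a dict keyed by taxon pairs) is an assumption of this example.

```python
from itertools import combinations


def upgma(distances):
    """distances: {(taxon_a, taxon_b): d} for every pair. Returns nested tuples."""
    d = {frozenset(pair): value for pair, value in distances.items()}
    size = {}  # cluster -> number of taxa it contains
    for pair in distances:
        for taxon in pair:
            size[taxon] = 1
    while len(size) > 1:
        # Join the closest pair of current clusters.
        a, b = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        for other in size:
            if other not in (a, b):
                # Average over all member taxa: weight by cluster size.
                d[frozenset((merged, other))] = (
                    size[a] * d[frozenset((a, other))]
                    + size[b] * d[frozenset((b, other))]
                ) / (size[a] + size[b])
        size[merged] = size.pop(a) + size.pop(b)
    (tree,) = size
    return tree
```

For four taxa with d(A,B) = 2, d(A,C) = d(B,C) = 6 and all distances to D equal to 10, this joins A with B first, then C, then D, giving `("D", ("C", ("A", "B")))`.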

Confidence - "How good is the Tree?"

• Bootstrap - resampling of the aligned sequence positions (columns) with replacement
  - How robust is the tree to such resampling? Always the same tree?
• How much better is this "best" tree than other trees?
  - Use set of "User defined" Trees ... how good is each?
  - PHYLIP programs
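The bootstrap idea can be sketched independently of any tree-building program. In the standard phylogenetic bootstrap, alignment columns are resampled with replacement and the tree is rebuilt for each pseudo-replicate. In the Python below, `build_tree` is a placeholder for whatever method is used; the toy `closest_pair` function only stands in for a real tree builder, and all names are assumptions of this example rather than PHYLIP's interface.

```python
import random
from itertools import combinations


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (same columns for every taxon)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


def bootstrap_support(alignment, build_tree, replicates=100, seed=0):
    """Fraction of replicates whose result equals the result on the original data."""
    rng = random.Random(seed)
    reference = build_tree(alignment)
    same = sum(
        build_tree(bootstrap_alignment(alignment, rng)) == reference
        for _ in range(replicates)
    )
    return same / replicates


def closest_pair(alignment):
    """Toy stand-in for a tree builder: the two taxa with the fewest differences."""
    def differences(pair):
        return sum(x != y for x, y in zip(alignment[pair[0]], alignment[pair[1]]))
    return frozenset(min(combinations(sorted(alignment), 2), key=differences))
```

A support value near 1 means the same grouping is recovered from almost every resampled data set; real programs report this per branch rather than for the whole tree.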

III. Whole Genomes

A. Implications

TOTAL information on Heritable Properties of an Organism

What an Organism CAN do ... and CAN NOT do ...

Major step toward Understanding an Organism and toward making Biology a PREDICTIVE SCIENCE

Current: identify Genes, predict Function

Next:

Deduce Life Style of the Organism

Predict Metabolic and Genetic Pathways

Predict Adaptive Responses, Developmental Pathways

ORGANISM DATABASES

B. TIGR

The Institute for Genomic Research

•First to Sequence whole Genome of Free-living Organism

•Sequenced the first Three Eubacteria and First Two Archaea

•TIGR Database (TDB) - links to specific organisms

IV. Organisms and Other Databases

A. Need for Organism Databases

• Direct result of Genome Physical Mapping efforts

• Need for Maps, Genes, Sequences, References

• Incomplete Genome Information plus other Information

• NOW: Complete Genome Information

B. Web Organism Databases

ACeDB - A C. elegans Data Base

• Created by Durbin and Thierry-Mieg for Sulston R.Mapping Program

• Over 40 organisms represented in ACeDB databases

• Highly variable Types of Information in each

• Examples: C. elegans, yeast, fly, grains, Arabidopsis, human chromosomes

Saccharomyces Genome Database (SGD)

• Basic database is Web enhancement over ACeDB SacchDB
• Excellent interface to yeast genome maps
• Many resources including analysis tools
  - BLAST and FASTA facilities
• SacchDB extended to include
  - Genome Deletion Project
  - Yeast Evolution Project
  - Sacch3D - protein 3D structure information
  - Worm and Mammalian Homology to Yeast
  - Yeast SAGE data

The Arabidopsis Information Resource (TAIR): Arabidopsis thaliana

• Database based on Oracle relational database system
• Much underlying information from ACeDB AatDB
• Analysis tools and Viewers, including BLAST and FASTA
• Arabidopsis Genome Initiative (AGI)

PlantsP: Plant Phosphorylation Proteins (kinases, phosphatases)

• Underlying MySQL database
• Display and usage is Web based
• Many other resources, links, downloads, etc.

Berkeley Drosophila Genome Project (BDGP)

• Outgrowth of Encyclopedia of Drosophila (EofD)
• Excellent Map Viewers - largely Java applets
  - Example: CytoView
• Includes FlyBase, ACeDB database of Drosophila

Mouse Genome Informatics (MGI)

• Integrated access to mouse genetics and biology
• Mouse Genome Database (MGD)
• Mouse Gene Expression Database (GXD)
• Encyclopedia of the Mouse Genome
• Links to
  - Mouse Tumor Biology database
  - Rat Data resource

Human Genome Resources at NCBI

•Information and links to Human Genome Project

• Human Genes
• OMIM - Online Mendelian Inheritance in Man
  - McKusick catalog of human genes and disorders
  - Over 10,000 entries
• LocusLink - single interface to all human locus info
• Human/Mouse Homology Relationships
• Examples of Info on Candidate Human Genes for Hypertension

VI. Problems ... Directions to Go

A. Problems:

• Sequence DBs and others are Flat File Databases
  - one piece of information at a time
• Analysis Tools are largely Single Task oriented
  - from Task to Task, User must make Decisions
• Automate Basic Analytical Tasks for new DNA Sequences
  - This is currently done in some facilities and in some expensive commercial packages
  - Examples: Pangea, Incyte

B. Need: "smart" Analysis Packages

• Need "smart" Analysis Packages that can "learn" from DB info.

• "predict" next best options for User.

• Analysis: DNA seq --> gene --> protein --> motifs --> 3D structure.

Basic Problem with Biology becoming a "Predictive Science"

1-Large number of Different Molecules, eg Proteins

-Large Variety per Cell

-Variety Changes with Type of Cell in Organism

2-Often a Small number of Each Molecule

Thus: Statistical Analysis is often not Appropriate

The Potential of Bioinformatics

The new paradigm is that all the genes will be known ("resident in databases available electronically").

Biological investigations will be theoretical.

Scientists will start with a theoretical conjecture and only then turn to experiment to follow or test the hypothesis.

Bioinformatics scientists have developed new techniques to analyze genes on an industrial scale, resulting in a new area of science known as 'Genomics'.

Genomics is revolutionizing our entire approach to science.

Gene Discovery Informatics

[Figure: workflow diagram. Process steps: Tissues & Cell Lines; Microdissection; Create DNA Libraries; Signature Hybridization; Clustering by Signature; Expression Profiles; Differential Expression; DNA Sequencing; Gene Assignments; Functional Predictions; MicroArrays; In situ Hybridization; Functional Assays; Small Molecule Drugs. Supporting databases: Clones; DNA Libraries; Annotated Sequence; Assays & Validation; Clustering; Tissue & Cell Lines; Small Molecule; Micro Array; In Situ Hybridization.]

Thank you

References

• Martti Tolvanen & Bairong Shen, University of Tampere

• Bioinformatics for Dummies (Wiley 2003)

• Internet Bioinformatics