1. Introduction to Bioinformatics

  • 8/19/2019 1. Introduction to Bioinformatics

    Bioinformatics

    The intersection of information technology and biology

    Ahmed A. Zayed

    How much information does our body hold?

    The smallest amount of information is the answer to a YES / NO question, which can be recorded as 1 / 0, i.e. 1 bit of information.

    Our genetic code is made of 4 nucleotides, each of which can be represented by 2 bits of information (2^2 = 4 possible values).
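
A minimal sketch of the 2-bits-per-nucleotide idea. The particular bit assignment (A=00, C=01, G=10, T=11) and the helper names `pack` and `unpack` are assumptions made for illustration, not something the slides specify.

```python
# Hypothetical 2-bit code for each nucleotide; the mapping order is an
# arbitrary choice made for this sketch.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = "ACGT"  # index in this string equals the 2-bit code


def pack(seq: str) -> int:
    """Pack a DNA string into one integer, 2 bits per nucleotide."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def unpack(value: int, length: int) -> str:
    """Recover a DNA string of the given length from its packed form."""
    bases = []
    for _ in range(length):
        bases.append(BASES[value & 0b11])  # read the lowest 2 bits
        value >>= 2
    return "".join(reversed(bases))


seq = "GATTACA"
packed = pack(seq)
print(f"{seq} needs {2 * len(seq)} bits: {packed:0{2 * len(seq)}b}")
print(unpack(packed, len(seq)))
```

The sequence length has to be stored separately, because leading A's (code 00) are otherwise indistinguishable from padding.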

    Our entire genetic code can be stored on a single DVD!

    40,000,000,000,000 cells in the body × 1.5 GB per cell = 60 × 10^21 bytes = 60 zettabytes (ZB)

    All the information our civilization stores will reach only 40 ZB by 2020.

    It can be stored in less than 100 g of DNA!
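
The arithmetic behind these figures can be checked with a short sketch. The genome size used here (roughly 6.2 billion base pairs per diploid cell) is an assumption that is not stated on the slides. The cell count and the 1.5 GB per cell are the slide's own numbers.

```python
# Assumption (not from the slides): a diploid human cell carries roughly
# 6.2 billion base pairs; at 2 bits per base pair that is about 1.5 GB.
BASE_PAIRS_PER_CELL = 6.2e9
BITS_PER_BASE = 2
CELLS_IN_BODY = 40e12  # figure quoted on the slide

# Bytes needed for one cell's genome at 2 bits per base pair.
bytes_per_cell = BASE_PAIRS_PER_CELL * BITS_PER_BASE / 8

# The slide rounds the per-cell size down to 1.5 GB before multiplying.
total_bytes = CELLS_IN_BODY * 1.5e9

print(f"per cell:   {bytes_per_cell / 1e9:.2f} GB")
print(f"whole body: {total_bytes / 1e21:.0f} ZB")
```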

    Ahmed A. Zayed

    Part 1: Introduction to Bioinformatics
    Part 2: Molecules of life

    Introduction to Bioinformatics

    What is Bioinformatics?

    It is about:
    • searching biological databases,
    • comparing sequences,
    • looking at protein structures,
    • and asking biological questions with a computer.

    Bioinformatics is growing rapidly!

    In short, bioinformatics is the storage, retrieval, analysis, and modeling of large-scale, complex biological data using computers:
    • Storage and retrieval: papers, sequences, and structures (DNA, proteins)
    • Analysis: converting sequences into genes
    • Modeling: protein structure predictions

    We, as end users, can perform biological experiments:
    • in vivo: within a living organism;
    • in vitro: "in glass", i.e. in an artificial environment;
    • in silico: through silicon chips (bioinformatics).

    Theory of molecular evolution (Linus Pauling)
    • Phylogenies based on sequence comparison
    • Differences between homologous sequences used as a molecular clock to estimate the time since the last common ancestor
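
The molecular clock idea can be sketched as follows, assuming the simplest possible model: the fraction of differing sites (p-distance) between two aligned homologous sequences accumulates at a constant rate along both lineages, so the time since the common ancestor is distance / (2 × rate). The sequences and the substitution rate below are made-up illustrative values, and the function names are hypothetical.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(x != y for x, y in zip(a, b))
    return differences / len(a)


def divergence_time(distance: float, rate_per_site_per_year: float) -> float:
    """Years since the last common ancestor under a strict clock.

    The factor 2 accounts for changes accumulating on both lineages.
    """
    return distance / (2 * rate_per_site_per_year)


# Made-up aligned sequences and a hypothetical substitution rate.
seq_a = "ACGTACGTAC"
seq_b = "ACGTTCGTAA"
rate = 1e-9  # substitutions per site per year (illustrative only)

d = p_distance(seq_a, seq_b)
t = divergence_time(d, rate)
print(f"distance = {d:.2f}, estimated divergence = {t:.3g} years")
```

Real analyses correct the raw distance for multiple substitutions at the same site and calibrate the rate against dated events. This sketch does neither.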

    Atlas of Protein Sequence and Structure (Margaret Oakley Dayhoff)

    The first comprehensive, computerized, and publicly available collection of protein sequences.

    It became a model for many subsequent sequence databases, including GenBank.

    Needleman-Wunsch algorithm: global sequence alignment
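
A minimal sketch of global alignment by dynamic programming in the Needleman-Wunsch style. The scoring values (match +1, mismatch -1, linear gap penalty -1) and the function name are assumptions chosen for illustration. Real tools use substitution matrices and more elaborate gap models.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment."""

    def sub(x, y):
        return match if x == y else mismatch

    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Each cell is the best of: align two residues, or open a gap in b or a.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and score[i][j] == score[i - 1][j - 1] + sub(a[i - 1], b[j - 1])):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


best, top, bottom = needleman_wunsch("GATTACA", "GCATGCG")
print(best)
print(top)
print(bottom)
```

Because the alignment is global, every residue of both sequences appears in the output, padded with "-" where a gap was introduced.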

    DNA sequencing and software to analyze it (Staden software)

    Modern version of Staden software

    Most bioinformatics software tools include:

    1- All basic sequence alignment programs.
    2- Phylogenetic and classification methods.
    3- Various display tools adapted to relatively small sequence objects (such as protein sequences of, at most, a few thousand characters).

    Smith-Waterman algorithm (Smith and Waterman): local sequence alignment
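
A minimal sketch of local alignment in the Smith-Waterman style. Compared with a global alignment, scores are never allowed to drop below zero and the traceback starts at the highest-scoring cell, so only the best-matching region is reported. The scoring values (match +2, mismatch -1, gap -1) and the function name are assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The extra 0 lets an alignment restart anywhere.
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the best cell until the score falls to zero.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# The two made-up sequences share only the short region "ACG".
print(smith_waterman("TTTACGTTT", "GGACGGG"))
```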

    Sequence alignment

    Local sequence alignment

    Global sequence alignment

    The concept of a sequence motif

    A nucleotide or amino-acid sequence pattern that is widespread and has, or is conjectured to have, a biological significance.

    Figure: a DNA sequence motif represented as a sequence logo, which graphically represents the observed probability of each nucleotide at each position.
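
The numbers behind a sequence logo can be sketched as follows, assuming a set of aligned, equal-length DNA sites. The example sites are made up, and the information content is computed as 2 bits minus the Shannon entropy of each column, ignoring the small-sample correction that logo tools usually apply.

```python
import math
from collections import Counter


def position_frequencies(sites):
    """Observed frequency of A, C, G, T at each position of aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    frequencies = []
    for column in zip(*sites):
        counts = Counter(column)
        frequencies.append({base: counts[base] / len(sites) for base in "ACGT"})
    return frequencies


def information_content(column):
    """Bits of information at one position: 2 minus the Shannon entropy."""
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    return 2.0 - entropy


# Hypothetical aligned instances of a motif.
sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
freqs = position_frequencies(sites)
for position, column in enumerate(freqs, start=1):
    print(position, column, f"{information_content(column):.2f} bits")
```

In a logo, the total height of the letter stack at a position corresponds to its information content, and each letter's share of the stack corresponds to its frequency.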

    GenBank Release 3 made public

    An open-access, annotated collection of all publicly available nucleotide sequences and their protein translations.

    http://www.ncbi.nlm.nih.gov/genbank/

    Phage lambda genome sequenced

    Bacteriophage lambda provided useful tools in molecular genetics, such as its use as a vector for the cloning of recombinant DNA.

    Sequence database searching algorithm (David J. Lipman)

    Allowed searching of the fast-growing, huge databases.

    FASTP / FASTN: fast sequence similarity searching

    FASTA format for an amino-acid sequence:

    >gi|5524211|gb|AAD44166.1| cytochrome b
    LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGY

    FASTA format for a nucleotide sequence:

    >gi|1045243| cytochrome b
    ACTGATCATAGTACATGACATAGATATCAGATACATAGAC
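
A minimal sketch of reading FASTA-formatted text: a line starting with ">" opens a new record, and the following lines are joined into its sequence. The function names and the crude alphabet check are assumptions made for illustration. The two records are the ones shown on the slide.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def guess_alphabet(sequence):
    """Crude guess: only A, C, G, T, U, N means nucleotide, else protein."""
    return "nucleotide" if set(sequence.upper()) <= set("ACGTUN") else "protein"


example = """\
>gi|5524211|gb|AAD44166.1| cytochrome b
LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGY
>gi|1045243| cytochrome b
ACTGATCATAGTACATGACATAGATATCAGATACATAGAC
"""

records = parse_fasta(example)
for header, sequence in records:
    print(header, len(sequence), guess_alphabet(sequence))
```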

    Activity 1: Searching NCBI’s Literature Databases

    http://www.ncbi.nlm.nih.gov/gquery
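
The same kind of literature search can also be scripted. The sketch below only builds a query URL for NCBI's E-utilities `esearch` endpoint and does not send it. The slide itself points to the web search page, so using E-utilities here is an assumption of this example, as are the search term and the helper name.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities search service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_search_url(term, max_results=5):
    """Build (but do not send) a PubMed search URL for the given term."""
    params = {
        "db": "pubmed",         # literature database to search
        "term": term,           # query, in the same syntax as the web form
        "retmax": max_results,  # number of record IDs to return
        "retmode": "json",
    }
    return f"{ESEARCH}?{urlencode(params)}"


# Hypothetical query: articles with "bioinformatics" in the title.
url = pubmed_search_url("bioinformatics[Title]")
print(url)
```

Opening the printed URL in a browser, or fetching it with `urllib.request.urlopen`, should return the matching PubMed record IDs.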

    EMBnet network for database distribution

    Synchronizing the data between databases every night!

    BLAST (Basic Local Alignment Search Tool): fast sequence similarity searching

    http://blast.ncbi.nlm.nih.gov/Blast.cgi

    EST: expressed sequence tag sequencing

    An EST is a short sub-sequence of a cDNA sequence. ESTs may be used to identify gene transcripts, and are instrumental in gene discovery.

    Sanger Centre, Hinxton, UK: a charitably funded genomic research centre and a leader in the Human Genome Project

    EMBL European Bioinformatics Institute

    http://www.ebi.ac.uk/
    http://www.embl.de/

    • First bacterial genomes completely sequenced
    • Yeast genome completely sequenced
    • Worm (multicellular) genome completely sequenced
    • Fly genome completely sequenced
    • Human Genome Project is complete

    http://www.sanger.ac.uk/Projects/C_elegans/

    Thank You

    Questions?