Sequence Alignment Using Dot matrix

Dot matrix Analysis Tools (Bioinformatics)

Published By

Safa Khalid, BS-Bioinformatics (6th Semester), University of Agriculture, Faisalabad

Sequence Alignment

A way of arranging DNA, RNA, or protein sequences to identify regions of similarity

Helps in inferring functional, structural, or evolutionary relationships between the sequences

Sequence alignment methods are used to find the best-matching sequences

It can be used to find genes: segments of DNA that code for a specific protein or phenotype

If a region of DNA has been sequenced, it can be screened for characteristic features of genes.

Alignment

Alignment is the task of locating “equivalent” regions of two or more sequences to maximize their similarity

Example:

COMPUTATIONAL BIOLOGY
CAMPUTATIONAL BIO----

Mismatch: position 2 (O vs. A). Gaps: the four trailing "-" characters.

Alignments of related sequences are expected to give good scores compared with alignments of randomly chosen sequences

In practice, the correct alignment does not necessarily have the best score, since no “perfect” scoring scheme has been devised
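To make the idea of an alignment score concrete, here is a minimal sketch that counts matches, mismatches, and gap columns in the example above and combines them into a score. The scoring values (+1, -1, -2) and the function names are illustrative assumptions, not a scheme taken from the slides; the space between the two words is dropped for simplicity.

```python
def alignment_summary(top, bottom, gap="-"):
    """Count matches, mismatches, and gap columns in a pairwise alignment."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    matches = mismatches = gaps = 0
    for x, y in zip(top, bottom):
        if x == gap or y == gap:
            gaps += 1
        elif x == y:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, gaps


def alignment_score(top, bottom, match=1, mismatch=-1, gap_cost=-2):
    """Score an alignment with a simple scheme (the values are assumptions)."""
    m, x, g = alignment_summary(top, bottom)
    return m * match + x * mismatch + g * gap_cost


# The example alignment, with the space between the words removed.
top = "COMPUTATIONALBIOLOGY"
bottom = "CAMPUTATIONALBIO----"
print(alignment_summary(top, bottom))  # (matches, mismatches, gaps)
print(alignment_score(top, bottom))
```

Changing the three scoring values changes which alignment ranks best, which is one reason the best-scoring alignment is not always the correct one.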

If two sequences are > 25% identical, they are likely related

If two sequences are 15-25% identical, they may be related, but more tests are needed

If two sequences are < 15% identical, they are probably not related
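The identity thresholds can be applied to an existing alignment roughly as follows. This is a sketch under one assumption worth stating: percent identity is computed over aligned columns that contain no gap, which is only one of several conventions in use. The function names are hypothetical.

```python
def percent_identity(a, b, gap="-"):
    """Percent of gap-free aligned columns that hold identical residues."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns where neither row has a gap character.
    columns = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def relatedness(identity):
    """Apply the rule-of-thumb thresholds listed above."""
    if identity > 25:
        return "likely related"
    if identity >= 15:
        return "may be related, more tests needed"
    return "probably not related"


identity = percent_identity("ACGT", "ACGA")
print(identity, relatedness(identity))
```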

Types of alignment

Based on completeness:
Global
Local

Based on number of sequences:
Pairwise alignment
Multiple sequence alignment

Local and Global Alignment

Pairwise Alignment

Pairwise Sequence Alignment is used to identify regions of similarity that may indicate functional, structural and/or evolutionary relationships between two biological sequences (protein or nucleic acid).

Dot Matrix

A dot plot is a visual representation of the similarities between two sequences.

One sequence (A) is listed across the top of the matrix and the other (B) is listed down the left side

Starting from the first character in B, one moves across the first row, placing a dot in every column where the character in A is the same

The process is repeated for each subsequent character in B until all possible comparisons between A and B are made

Any region of similarity is revealed by a diagonal row of dots

Isolated dots not on a diagonal represent random matches
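The procedure described above can be sketched in a few lines. This is a minimal illustration, not the method used by any particular tool: every character of B is compared with every character of A, with no window or threshold filtering. The function names and the two short test sequences are made up for the example; "*" marks a dot and "." an empty cell.

```python
def dot_matrix(seq_a, seq_b):
    """Rows follow seq_b (left side); columns follow seq_a (top)."""
    return [[a == b for a in seq_a] for b in seq_b]


def render(seq_a, seq_b):
    """Return the dot matrix as text, with seq_a as the header row."""
    lines = ["  " + " ".join(seq_a)]
    for b, row in zip(seq_b, dot_matrix(seq_a, seq_b)):
        cells = " ".join("*" if hit else "." for hit in row)
        lines.append(b + " " + cells)
    return "\n".join(lines)


print(render("GATTACA", "GATCACA"))
```

Real tools usually compare short windows rather than single characters, which suppresses much of the background of isolated dots.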

Example
Seq 1: TWILIGHTZONE
Seq 2: MIDNIGHTZONE
Matrix = 12 × 12
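For this pair, the region of similarity can also be located programmatically by finding the longest unbroken diagonal run of dots, which is the same as the longest common substring. The sketch below is one simple way to do that; the function name is hypothetical and this is not how any of the tools named in this deck work internally.

```python
def longest_diagonal(seq_a, seq_b):
    """Return (length, start_in_a, start_in_b) of the longest diagonal run."""
    best = (0, 0, 0)
    # prev[j + 1] holds the run length ending at the previous row, column j.
    prev = [0] * (len(seq_a) + 1)
    for i, b in enumerate(seq_b):
        cur = [0] * (len(seq_a) + 1)
        for j, a in enumerate(seq_a):
            if a == b:
                # Extend the run that ended one step up and to the left.
                cur[j + 1] = prev[j] + 1
                if cur[j + 1] > best[0]:
                    n = cur[j + 1]
                    best = (n, j - n + 1, i - n + 1)
        prev = cur
    return best


seq1, seq2 = "TWILIGHTZONE", "MIDNIGHTZONE"
length, start1, start2 = longest_diagonal(seq1, seq2)
print(length, seq1[start1:start1 + length])  # run length and shared segment
```

Here the run is the shared suffix IGHTZONE, which lies on the main diagonal because it starts at the same position in both sequences.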

Page 12: Dot matrix Analysis Tools (Bioinformatics)

Dot plot interpretation

Seq1: ATGATAT

Seq2: ATGATAT
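When a sequence is compared with itself, the main diagonal is completely filled, and any dots off the main diagonal point to repeated segments. The sketch below counts dots per diagonal for this example; the function name and the offset convention (column index minus row index) are assumptions made for illustration.

```python
from collections import Counter


def diagonal_counts(seq_a, seq_b):
    """Map each diagonal offset (column - row) to the number of dots on it."""
    counts = Counter()
    for i, b in enumerate(seq_b):
        for j, a in enumerate(seq_a):
            if a == b:
                counts[j - i] += 1
    return counts


counts = diagonal_counts("ATGATAT", "ATGATAT")
for offset in sorted(counts):
    print(offset, counts[offset])
```

Offset 0 is the main diagonal and holds 7 dots, one per position. The plot is symmetric about it, and the dots at offsets such as 2, 3, and 5 come from the repeated "AT" at positions 1-2, 4-5, and 6-7.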

Page 13: Dot matrix Analysis Tools (Bioinformatics)

Bioinformatics Software for Dot Plot Analysis

LALIGN
DOTLET
DOTMATCHER
SIM

Page 14: Dot matrix Analysis Tools (Bioinformatics)

PARAMETERS USED BY THE SOFTWARE

Gap open penalty

Pairwise alignment score for the first residue in a gap.

Default value is: -12

Gap Extend Penalty

Pairwise alignment score for each additional residue in a gap

Default value is: -2

Expectation Threshold

Limits the number of scores and alignments reported based on the expectation value. This is the maximum number of times the match is expected to occur by chance.
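As a worked illustration of the two gap parameters, the sketch below computes the score of a single gap using the wording above: the open penalty applies to the first residue and the extend penalty to each additional one. The function name is hypothetical, and individual tools may define the open penalty slightly differently, so this is an assumption rather than any tool's exact formula.

```python
GAP_OPEN = -12    # score for the first residue in a gap (default quoted above)
GAP_EXTEND = -2   # score for each additional residue in the same gap


def gap_score(length, gap_open=GAP_OPEN, gap_extend=GAP_EXTEND):
    """Score of one contiguous gap of the given length."""
    if length < 0:
        raise ValueError("gap length cannot be negative")
    if length == 0:
        return 0
    return gap_open + (length - 1) * gap_extend


for n in (1, 2, 4):
    print(n, gap_score(n))
```

With these defaults one gap of four residues scores -18, while four separate one-residue gaps would score -48, so the scheme favours a few long gaps over many short ones.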

SIM

LALIGN

DOTLET

DotMatcher

Results Interpretation:

Inverted repeat

An inverted repeat is a sequence of nucleotides followed downstream by its reverse complement.

Inverted repeat: abcdeedcbafghijklmno
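For real nucleotide sequences, the definition can be checked directly by searching downstream for the reverse complement of each candidate arm. This is a minimal sketch for DNA written with A, C, G, T only; the function names, the fixed arm length, and the test sequence are all assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def inverted_repeats(seq, arm):
    """List (left_start, right_start) pairs whose arms are reverse complements."""
    seq = seq.upper()
    hits = []
    for left in range(len(seq) - 2 * arm + 1):
        target = reverse_complement(seq[left:left + arm])
        # Search only downstream of the left arm, so the arms never overlap.
        right = seq.find(target, left + arm)
        while right != -1:
            hits.append((left, right))
            right = seq.find(target, right + 1)
    return hits


# Made-up sequence: TTACG, a 4-base spacer, then CGTAA.
print(inverted_repeats("TTACGAAAACGTAA", arm=5))
```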

Palindromic Sequence

A palindromic sequence is a nucleic acid sequence (DNA or RNA) that is the same whether read 5' to 3' on one strand or 5' to 3' on the complementary strand with which it forms a double helix.
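In code, this definition reduces to checking whether a sequence equals its own reverse complement. The sketch below assumes DNA written with A, C, G, T only (RNA would need U in place of T); the function name is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_palindromic(seq):
    """True if the DNA sequence equals its own reverse complement."""
    seq = seq.upper()
    return seq == seq.translate(COMPLEMENT)[::-1]


print(is_palindromic("GAATTC"))   # the EcoRI recognition site
print(is_palindromic("GATTACA"))
```

Note that a palindromic sequence in this sense is not a text palindrome: GAATTC reads differently backwards on a single strand, but the complementary strand read 5' to 3' is also GAATTC.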
