WSSP-10 Chapter 7 BLASTN: DNA vs DNA searches
4-3
DSAP: BLASTn Page
p. 7-1
p. 7-1
NCBI BLAST Home Page
p. 7-2
NCBI BLASTN search page
p. 7-2
Copy sequence from DSAP or wave form program
p. 7-3
Choose a database (nr/nt or est)
p. 7-4
Search options (Use defaults)
p. 7-5
BLASTN progress report (search may take a few minutes)
p. 7-5
Format options (use defaults)
p. 7-6
EX1.10 BLASTN nr/nt database
Graphic report of EX2.09
p. 7-7
p. 7-7
BLASTN list of matches for EX1.10
EX2.09 BLASTN
p. 7-9
Best match to EX1.10
p. 7-9
>gi|226493893|ref|NM_001157047.1| Zea mays dynein light chain LC6, flagellar outer arm (LOC100284150), mRNA
Length=606
Score = 221 bits (244), Expect = 5e-54
Identities = 218/282 (77%), Gaps = 0/282 (0%)
Strand=Plus/Plus
Query  11     ATGTTGGAAGGGAGGGCGAGAGTAGAAGACACCGACATGCCGAGGAAGATGCAGGCGGAG  70
              ||||||||||| | ||||   || || |||||||||||||||  ||||||||| ||  ||
Sbjct  104    ATGTTGGAAGGAAAGGCGGTGGTGGAGGACACCGACATGCCGGCGAAGATGCAAGCCCAG  163
Query  71     GCCATGAACGCCGCCTCTCACGCGCTCGATCTGTTCGACGTCGCGGACTGCAAGAGCCTC  130
              || |||   || || ||    || || ||||  |||||||||   ||||||  |||| ||
Sbjct  164    GCGATGTCGGCGGCGTCCAGGGCCCTGGATCGCTTCGACGTCCTCGACTGCCGGAGCATC  223
Query  131    GCCGCGCATATCAAGAAGGAATTTGATAAGATCTACGGTCCGGGATGGCAGTGCGTCGTC  190
              ||  | || ||||||||||| |||||   |||| | || || |||||||| ||||| ||
Sbjct  224    GCGTCCCACATCAAGAAGGAGTTTGACGCGATCCATGGCCCCGGATGGCAATGCGTGGTT  283
Query  191    GGCTCCAGCTTCGGCTGTTTCTTCACTCACAAGAAAGGCAGCTTCATCTACTTCCGCCTG  250
              |||||| |||||||||| | | |||| ||||  || || |||||||||||||||||||||
Sbjct  284    GGCTCCGGCTTCGGCTGCTACATCACGCACAGCAAGGGGAGCTTCATCTACTTCCGCCTG  343
Query  251    GAGACGCTCCACTTCCTCATCTTCAAAGGCGCGGCCGCTTGA  292
              ||| |||||   |||||| |||||||||| ||||| || |||
Sbjct  344    GAGTCGCTCAGGTTCCTCGTCTTCAAAGGGGCGGCAGCATGA  385
Our Seq.
Database Seq.
Length of sequence
Mismatch
Match
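The statistics line of a hit can be read programmatically. The sketch below is a minimal illustration, not part of the DSAP workflow: the function name `parse_hit_stats` and the regular expressions are assumptions that fit the header text shown above. It recomputes the percent identity from the match count and the aligned length.

```python
import re

def parse_hit_stats(text):
    """Extract bit score, E-value, and identity counts from a BLAST hit header.

    Hypothetical helper: the patterns are written for the header layout
    shown on this slide and may need adjusting for other report formats.
    """
    bits = float(re.search(r"Score\s*=\s*([\d.]+)\s*bits", text).group(1))
    expect = float(re.search(r"Expect\s*=\s*([\d.eE+-]+)", text).group(1))
    matches, aligned = map(
        int, re.search(r"Identities\s*=\s*(\d+)/(\d+)", text).groups()
    )
    return {
        "bits": bits,
        "expect": expect,
        "matches": matches,
        "aligned_length": aligned,
        "percent_identity": 100.0 * matches / aligned,
    }

# Header of the best match to EX1.10, copied from the report above.
header = ("Score = 221 bits (244), Expect = 5e-54 "
          "Identities = 218/282 (77%), Gaps = 0/282 (0%)")
stats = parse_hit_stats(header)
print(stats)
```

The identity percentage is simply matches divided by aligned length; the report rounds it to a whole number.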
Perfect, but short, matches are not usually meaningful
>gi|14250883|emb|AL583809.3|CNS07EFY Human chromosome 14 DNA sequence BAC R-736L22 of library RPCI-11 from chromosome 14 of Homo sapiens (Human), complete sequence
Score = 40.1 bits (20), Expect = 4.6
Identities = 20/20 (100%)
Query:  189    ttttctgaatattcataata  208
               ||||||||||||||||||||
Sbjct:  60645  ttttctgaatattcataata  60626
p. 7-11
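Why a perfect 20-base match can still be unconvincing follows from how the Expect value scales with the bit score. The sketch below uses the standard relation E = (search space) x 2^(-bit score). The search-space size is not given in the report, so it is back-calculated from the short hit (an assumption made only for illustration), and the 221-bit score from the best EX1.10 match is then plugged into the same space for comparison, even though the two hits come from different searches.

```python
def expect_value(bit_score, search_space):
    """Expected number of chance hits with at least this bit score."""
    return search_space * 2.0 ** (-bit_score)

def implied_search_space(bit_score, expect):
    """Search-space size that would give this Expect for this bit score."""
    return expect * 2.0 ** bit_score

# Short perfect match from the slide: 40.1 bits, Expect = 4.6.
space = implied_search_space(40.1, 4.6)   # assumption: inferred, not reported

short_hit = expect_value(40.1, space)     # recovers the reported Expect
long_hit = expect_value(221.0, space)     # a 221-bit hit in the same space

print(f"20-base perfect match: E = {short_hit:.2g}")
print(f"221-bit alignment:     E = {long_hit:.2g}")
```

Each additional bit of score halves the Expect value, so a long imperfect alignment can be far more significant than a short perfect one.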
Examine the best alignments: Are they significant?
p. 7-9
       C   R   E   L   L   I   L   D   A
Query  TGT CGT GAA CTC CTA ATT CTC GAC GCC
       ||| ||| ||| ||  ||  ||  ||  ||  ||
Sbjct  TGT CGT GAA CTT CTG ATC CTT GAT GCA
       C   R   E   L   L   I   L   D   A
Mismatches
p. 7-12
Query  69     ATGAACAAGGAGAAGATTCTGAAGCTGGCGAAGGGCTTCCGGGGGAGGGCGAAGAACTGC  128
              ||||||||||  |||||| | ||||| || ||||| ||| | || |||||||| || |||
Sbjct  242    ATGAACAAGGGAAAGATTTTTAAGCTAGCTAAGGGATTCAGAGGAAGGGCGAAAAATTGC  301
Query  129    ATCCGGATCGCGAGGGAGCGGGTGGAGAAGGCGCTCCAGTACTCGTACCGCGATCGCCGC  188
              ||  |||| || |||||| ||||||| ||||| || || || || ||| | ||||| |||
Sbjct  302    ATAAGGATAGCAAGGGAGAGGGTGGAAAAGGCACTGCAATATTCATACAGGGATCGACGC  361
Bad sequence on our part
Bad sequence on their part
Differences in the sequence of the two organisms
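The codon-by-codon alignment shown earlier illustrates the third cause: two organisms can differ at the DNA level yet encode the same protein. The sketch below translates both sequences from that slide with the standard genetic code. The compact codon-table construction and the function name `translate` are illustrative choices, not part of the course software.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate DNA in frame 1; any trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# Query and Sbjct from the codon alignment slide, spaces removed.
query = "TGTCGTGAACTCCTAATTCTCGACGCC"
sbjct = "TGTCGTGAACTTCTGATCCTTGATGCA"

identical = sum(q == s for q, s in zip(query, sbjct))
print(f"DNA identity: {identical}/{len(query)}")
print("Query protein:", translate(query))
print("Sbjct protein:", translate(sbjct))
```

All six mismatches fall in the third codon position, so the two proteins are identical even though the DNA is not.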
       C R R T P D P *
Query  TGTCGT-CGAACTCCTGATCCTTGA
       |||||| ||||||||||||||||||
Sbjct  TGTCGTCCGAACTCCTGATCCTTGA
       C R E L L I L D
p. 7-13
Small gaps alter the reading frame of the protein
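A one-base difference is enough to change every downstream amino acid. The sketch below translates the Query from the gap slide (with the gap character removed) and then the same sequence with a single base dropped. Which base to drop is a hypothetical choice made for illustration; it is picked so that the second translation begins with the C R E L L I L reading shown on the slide.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate DNA in frame 1; any trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# Query from the slide with the "-" removed.
query = "TGTCGTCGAACTCCTGATCCTTGA"

# Hypothetical edit: remove the seventh base to shift the frame by one.
shifted = query[:6] + query[7:]

print("As sequenced:    ", translate(query))
print("One base removed:", translate(shifted))
```

After the second codon the two translations share nothing, and the first one runs into a stop codon.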
Query:  179    TTCGAGCTACCAGATGATC-GATTGGAACAT-T-C--TGTCATTG-AC-CTTC-AGGTAA  230
               |||||||  || | | ||  ||||  || || | |  | | |||  |  |||| |||| |
Sbjct:  4684   TTCGAGCG-CC-GTTAATATGATTACAATATCTACAATATTATTATATGCTTCCAGGTGA  4741
Query:  231    TCAACCATGACCGTGTCAACCGAAACGACGTTATCGGCCGTGCACTATTGAACATGGAGG  290
               |||| ||||||||||| ||||| || || || || |||||||| ||  | || ||||| |
Sbjct:  4742   TCAATCATGACCGTGTTAACCGTAATGATGTAATTGGCCGTGCCCTTCTTAATATGGAAG  4801
An example of a match with and without gaps.
p. 7-13
>gi|241990611|dbj|AK330768.1| Triticum aestivum cDNA, clone: SET5_E05, cultivar: Chinese Spring
Length=650
Score = 219 bits (242), Expect = 2e-53
Identities = 211/271 (77%), Gaps = 0/271 (0%)
Query  10     GATGTTGGAAGGGAGGGCGAGAGTAGAAGACACCGACATGCCGAGGAAGATGCAGGCGGA  69
              |||| ||||||||| |||||  || || |||||||||||||||   |||||||||  | |
Sbjct  78     GATGCTGGAAGGGAAGGCGACGGTGGAGGACACCGACATGCCGGCCAAGATGCAGCTGCA  137
Query  70     GGCCATGAACGCCGCCTCTCACGCGCTCGATCTGTTCGACGTCGCGGACTGCAAGAGCCT  129
              |||||     || || ||    |||||||| |  |||||||||   ||||||  |||| |
Sbjct  138    GGCCACCTCGGCGGCGTCCAGGGCGCTCGAACGCTTCGACGTCCTCGACTGCCGGAGCAT  197
Query  130    CGCCGCGCATATCAAGAAGGAATTTGATAAGATCTACGGTCCGGGATGGCAGTGCGTCGT  189
              ||| ||||| ||||||||||| || || | |||| |||| ||||| ||||||||||| ||
Sbjct  198    CGCGGCGCACATCAAGAAGGAGTTCGACACGATCCACGGCCCGGGGTGGCAGTGCGTGGT  257
Query  190    CGGCTCCAGCTTCGGCTGTTTCTTCACTCACAAGAAAGGCAGCTTCATCTACTTCCGCCT  249
               |||| |||||||||||| | |||||| ||||  || || |||||||| ||||||   ||
Sbjct  258    GGGCTGCAGCTTCGGCTGCTACTTCACGCACAGCAAGGGGAGCTTCATATACTTCAAGCT  317
Query  250    GGAGACGCTCCACTTCCTCATCTTCAAAGGC  280
               ||| ||||||  |||||| |||||||||||
Sbjct  318    CGAGTCGCTCCGGTTCCTCGTCTTCAAAGGC  348
Alignment of the second best match to EX1.10
p. 7-14
p. 7-14
Alignments near the end of EX1.10
>gi|254826767|ref|NG_012498.1| Homo sapiens glypican 4 (GPC4), RefSeqGene on chromosome X
Length=121142
Score = 71.6 bits (78), Expect = 6e-09
Identities = 42/44 (95%), Gaps = 0/44 (0%)
Query  665    CTAGCTTTTCTTAACaaaaaaaaaaaaaaaaaaaaaaaaaaaaa  708
              || ||||||||||| |||||||||||||||||||||||||||||
Sbjct  72886  CTTGCTTTTCTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA  72929
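Most of this 44-base hit is a run of A residues at the 3' end of the query, which can pair with any A-rich stretch in the database. One way to see how little gene-specific sequence is involved is to strip the trailing run before looking at what is left. The helper below is a hypothetical sketch: `trim_poly_a` and its 10-base threshold are assumptions, not a DSAP or NCBI feature.

```python
import re

def trim_poly_a(seq, min_run=10):
    """Remove a trailing run of at least `min_run` A/a residues.

    Hypothetical helper; the threshold of 10 is an arbitrary choice.
    """
    return re.sub(r"[Aa]{%d,}$" % min_run, "", seq)

# End of the EX1.10 query, as shown in the alignment above.
query_end = "CTAGCTTTTCTTAACaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

trimmed = trim_poly_a(query_end)
print(trimmed, f"({len(trimmed)} of {len(query_end)} bases remain)")
```

Only 15 bases of non-repetitive sequence contribute to the match, which is worth keeping in mind when judging hits at the very end of a clone.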
p. 7-15
Fill in the table listing the best matches from three different organisms.
List Wolffia if there is a match
Use the clone report to obtain more information about the gene
p. 7-15
3) Perform a BLASTn of the est database
Change the database
p. 7-17
p. 7-17
BLASTn report of the EX1.10 search of the est database
>gi|198335694|gb|GD004539.1| CCHY28888.g1 CCHY Panicum virgatum callus (N) Panicum virgatum cDNA clone CCHY28888 3', mRNA sequence.
Length=624
Score = 246 bits (272), Expect = 1e-61
Identities = 226/286 (79%), Gaps = 0/286 (0%)
Strand=Plus/Minus
Query  3      GAGAGAAGATGTTGGAAGGGAGGGCGAGAGTAGAAGACACCGACATGCCGAGGAAGATGC  62
              |||| |  ||| ||||||||| |||||  || || ||||| |||||||||  ||||||||
Sbjct  527    GAGACACCATGCTGGAAGGGAAGGCGATGGTGGAGGACACGGACATGCCGGCGAAGATGC  468
Query  63     AGGCGGAGGCCATGAACGCCGCCTCTCACGCGCTCGATCTGTTCGACGTCGCGGACTGCA  122
              ||||| |||| |||   || || ||    || ||||| |  |||||||||   ||||||
Sbjct  467    AGGCGCAGGCGATGGCGGCGGCGTCCAGGGCCCTCGACCGCTTCGACGTCCTCGACTGCC  408
Query  123    AGAGCCTCGCCGCGCATATCAAGAAGGAATTTGATAAGATCTACGGTCCGGGATGGCAGT  182
               |||| |||| ||||| ||||||||||| ||||| | |||| |||| || || ||||| |
Sbjct  407    GGAGCATCGCGGCGCACATCAAGAAGGAGTTTGACACGATCCACGGCCCCGGGTGGCAAT  348
Query  183    GCGTCGTCGGCTCCAGCTTCGGCTGTTTCTTCACTCACAAGAAAGGCAGCTTCATCTACT  242
              |||| || ||||||||||||||||| | |||||| ||||  || || |||||||||||||
Sbjct  347    GCGTGGTGGGCTCCAGCTTCGGCTGCTACTTCACGCACAGCAAGGGGAGCTTCATCTACT  288
Query  243    TCCGCCTGGAGACGCTCCACTTCCTCATCTTCAAAGGCGCGGCCGC  288
              |||| || ||| |||||   ||||||||||||||||| ||||| ||
Sbjct  287    TCCGGCTCGAGTCGCTCAGGTTCCTCATCTTCAAAGGGGCGGCAGC  242
Alignment of the best match to EX1.09 from the est search
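This hit is reported as Strand=Plus/Minus, and the Sbjct coordinates count down (527 to 242): the query matches the reverse complement of the database entry, and the Sbjct line is printed in that reverse-complemented form. The sketch below shows the conversion for the first 12 bases of the Sbjct line; the function name `reverse_complement` is an illustrative choice.

```python
# Map each base to its complement, preserving case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

# First 12 bases of the Sbjct line as displayed (database positions 527 down to 516).
displayed = "GAGACACCATGC"

# The same stretch as stored in the database entry (positions 516 up to 527).
stored = reverse_complement(displayed)
print(displayed, "->", stored)
```

Applying the function twice returns the original sequence, which is a quick sanity check when converting coordinates between strands.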
p. 7-17
Fill out the DSAP table of the BLASTn search of the est database
p. 7-18
Query  61     CAAGGTCTAAGTACTGAAAAGGAAAGTCTACTAATTACAAAGAAGTTATTGTTTGTACCT  120
              |||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
Sbjct  13166  CAAGGTCTAAGTACTGAAAAGGAAAGTCCACTAATTACAAAGAAGTTATTGTTTGTACCT  13107
Query  121    TTTGTATCAGGGTTTATTAAATTTCAATCTTTATTGCTGAATCCCGAAACAAGGTGATCT  180
              |||||||||||||||||||||||| |||||| ||||||||||||||||||||||||||||
Sbjct  13106  TTTGTATCAGGGTTTATTAAATTTTAATCTTCATTGCTGAATCCCGAAACAAGGTGATCT  13047
Open Question: Why are there differences in the sequences?