Summer 2005

Show Me the CpG Islands!

Alicia Laughton (Mathematics ‘06)

Jessica Minnier (Mathematics ‘07)

Guided by Yung-Pin Chen (Mathematics/Statistics)

(With Statistical Significance)

This work is funded by the John S. Rogers Science Research Program

Outline

• DNA Overview

• CpG Islands

• Methods

– Traditional Method

– Our Method

• Future plans

DNA

• Deoxyribonucleic acid

• Double helix

• Chain of nucleotide subunits

• Contains genetic information

Nucleotides

• Made up of a sugar, a phosphate, and a base

• Four bases

– Adenine (A)

– Cytosine (C)

– Guanine (G)

– Thymine (T)

• CpG represents a C directly followed by a G in the DNA sequence

Methylation

• Methylated C is prone to mutating into T

• Accounts for the low occurrence of CpG dinucleotides in vertebrates

– Expected at random: 6.25%

– Actually 1% of total sequence (Bird 1986)
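The 6.25% figure can be checked with a short sketch. It assumes the simplest random model, in which all four bases are equally likely and independent of their neighbours; the function name is illustrative and not from the slides.

```python
BASES = "ACGT"

def expected_dinucleotide_frequency(base_freqs, first, second):
    """Expected frequency of `first` followed by `second` if bases were independent."""
    return base_freqs[first] * base_freqs[second]

# Assumption: each base has probability 1/4, independent of its neighbours.
uniform = {base: 0.25 for base in BASES}
expected_cpg = expected_dinucleotide_frequency(uniform, "C", "G")
print(f"Expected CpG frequency: {expected_cpg:.2%}")
```

Under this model every one of the 16 possible dinucleotides has the same expected frequency, so the observed CpG rate of about 1% stands out as a clear deficit.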

Sequence AL031723

• Human DNA sequence on chromosome 16

• 3 known CpG Islands

• Percentage of Content:

– A - 22.7%

– C - 29.5%

– G - 28.3%

– T - 19.5%

– CpG - 3.1%
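As a rough illustration of how such percentages might be computed, the sketch below counts bases and CpG dinucleotides in a string, then compares the slide's CpG percentage with what independent C and G frequencies would predict. The toy sequence is made up, and dividing the CpG count by the sequence length is an assumption; the slide does not say how the 3.1% was normalised.

```python
from collections import Counter

def composition(sequence):
    """Return the fraction of each base and of CpG dinucleotides in a sequence."""
    seq = sequence.upper()
    counts = Counter(seq)
    fractions = {base: counts[base] / len(seq) for base in "ACGT"}
    # str.count is safe here because "CG" cannot overlap with itself.
    fractions["CpG"] = seq.count("CG") / len(seq)
    return fractions

# Toy sequence for illustration only (not AL031723).
toy = "ACGTCGCGAT"
toy_fractions = composition(toy)

# Values read off the slide for AL031723.
slide_c, slide_g, slide_cpg = 0.295, 0.283, 0.031
expected_cpg = slide_c * slide_g  # expected if C and G occurred independently
ratio = slide_cpg / expected_cpg
print(f"expected CpG: {expected_cpg:.3f}, observed/expected: {ratio:.2f}")
```

Even in this relatively C- and G-rich sequence, the overall CpG frequency falls well below what the base composition alone would suggest.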

CpG Islands

• “regions of DNA with a high G + C content and a high frequency of CpG dinucleotides relative to the bulk genome” -- Gardiner-Garden and Frommer (1987)

CpG islands & Genes

[Figure: positions of CpG islands (CpGi) relative to a gene: promoter CpG islands at the 5’ end, CpG islands in the gene body, and CpG islands at the 3’ end]

What is important about CpG Islands?

• Useful in identifying protein-coding regions (Yoon and Vaidyanathan, 2004)

– Associated with “housekeeping genes” and 40% of tissue-specific genes

• Aberrant methylation of CpG sites may cause silencing of tumor-suppressor genes (Deng, Zhou et al, 2002)

aggcccgcccgcctagatcacttgaggtcacccgttcactcagtggctgacagcatcccctaaatcagcccttcaccaattattgacagtgtgtcctcaaccaaaagtagtcctccctgctccctccctcccctgatgtaattacatctcttcccatctttatttattttttgagacggagtcttgctctgtcacccaggctggagtgtagtggtgcaatctcggctcactgcacctctgcctcccgggttcaagcgattttcctgcctcagcccccggagtagctgggattacaggtgcccgccaccacacccagctaatttttgtatttttagtagagacggggtttcgccatgttggccaggctggtctcgaactcctgacctcaggtgatccgcctgcctcagcctcccaaagtgctgggattacagacgtgagccactgcggctggcctctctccccgtctttaactgtagccctgtgaattctcatcagcctgggcctggactcagcaggccaaaaagttaccagcagagcccagcacatgtgaggaaagtcggagacgtggcggcgccggccggaggatccttcccaagaccctgggccgctgtggccccctagatcttgcaggttgccagggtgccaggccagggagggggcctttctgagattctcctcattctgacacaggagaggagggcactgacccagtcccaaggtcccgggggaatcagccgaccacagcccaggactgtcccacctgggcagagagcccattctgggtgcccagcccgggcaggcccaggcacccccagcagtgccccgggcagcacctgccagccaggtagtgcagggtgaggttgggcagggcagggcgtggtaggtcagctgagcaaacagctcggagggagagctggggagggctgggaactaggtcgatagaaacacagggactgtgttagggaggggatgccttgccagtcacgcccagccctgactcctgccctctgagggggcttcccccacccctgctgacagccccaggaccggcccctgccaggaggctgacctgccaggagtgaccgccccagacttgagcccttgggaggcaggttctgagtccccttttcctgctcagacccccagggaaacgcaggctgggccagaggcagctgcacagacccctgcagtggggtgctcggtggagagcgctggaggtgggagggaggatgtgtgaggcagcgggagagaatccaggcttcccccacaacacccaccatgagcggtgcagagtaggggtgggcggcacgggagccttcccaccccgcagaaccaggccctgggcagagctggcctacagacgataccggacaagtcctcctccgtcttggtgacagagggagctgggactccctccacccacccactgccacttcagaagcagccacagggagactgggaggggcaggggtgctggggatgagcgtggggctcagccctccctcttcccaccctggagggctgcctccttccagcccacctggaagggtggtgtcagtcccagagcccctgcactccccgccccacctcctgcagctggaacccgcgtgggagccgcacccagcgtcccagggacaaacacagaggccttgggtggtggcggtaccaaggtctgaggcctggcagctcaggggcacccccgtccctgagagaggtcaagaaggggaggcaccaccccccaccacgggacctcgctgacgatgcccatagagagaaaccaggccagtgctgggaggggaaagaccccaggcctcatgagaagtcactgcctgcttttcccctcggccaggaaggaagccccaggcccttccctcccgtctcgggcatactgaccccaggcaccaagcgagaccaggagcccacccctttcctttcccagatggcacaccagtgactctgaatatcggagcgcacccctgctccctgggaggcaggatatc
gtgccgctgctccctggggcgcacgataccctccccaggaaggcgccggtcagggcggacgggccagggtgctcaccggtaccaggcgaggccgcgctcgtagcacctgtcgaagaagtggggctcagagcccagcgcgcggacgtcggggtgcagccgcagaaactccagcagggcgcgcgtgccgcccttcttcacgccaacgatgagcgcttgcgggaagcgccggcggccgggaccgctggccaaaggcaggccgggtgctcccgggcggtggacggagctggacggctcggagggcgcgggggccggcgcgggggcgcgggcggccggcgggcagcggccggggagggcgcagaggcagtaggcgccgagcaccagggccacgagcagcatcggcgcgcgggacgcccgcagagcggccccttgcccggcccctgcgccctggccgcccccggccccgccgcccaggccgccgctacctgccatggggtcgcgccgctccaggcccgggagcgggggcagcaggcgggcgcgcatctcggcccgcgcgccgctcagtccgtgggtgcccggcttgtgctctgcgcccggcggtcccgcagcctgggagcgggcgcggggcgggaccgggggcggggtctggacgccctcccccctccccctcccccgcccactccgcctccgaggccactgcctgggctggacccgccggcagccgccaccacccgggcgcgactcgagctgccgggaccaccaggacgctcctgctccgagatcccaggccctggctcgcttgactccggcatcttcacctctgcgcggggaggatgcggcggcggtggccgttcgggacgcagggcagggacagggcggcgcgcgggcctcgggaccctctgtttgaagaccgatccccttccccccccaccccactccgggacgtgcgcggcaggtgcataggccaagccttggcctgcaggagcgggagcctcatcgccaggccaaggggacccaggaaaagcgtcgatccgggcactcggcctgccaagggagaaagaggccgggacagcaccctagtgtgcagagagggatcccagaacgtgtggggggagtctgcggccgggaatggcgtgcgcctcctcttcctgcctgctggagggaccagcaccaaaacaggaaagttcaccctgccaggccttctctccaaagagtcagagggagctccgtagggggatggggttcccggaccccctgccgtggaaggggagtgggaacacagacaggcggcaagggctttcgaggccccctcttgcacaaaccagctcagagatcggagatctttgggatcaattactttccctccccaggcatccgaagcctatcctagcccaggtgtggatgagggtgggagagacgggggaggagggagaggagcaggactggacccccgtgtgacaaacatctgacaagttgctctgaggactgcccccctccttgtggagcccacctcatctggtgtgcatttccctgcggctttcatccagccctgggcgaccctccctcctccatctcagcctccctcctcctgccccacacctcaggcctgggactcgcagatgccaaaagggcctggcagatgccaaagccagaaagtgcagggggactgcatcccccacaggagaccgggttcttccccactacatactcagaccccactccctgcacccactgctcttgcaaaccaggaactaaggggttcccctacccaccccgctccttgcctcctcttgcttttcttttgttttgtttgtttttgagacagagctgcactccagctgactcttgtcgcccaggctggagtgcagtggcacaatctcagctcactacaacctctgcctcccgggttcaagcgattctcctgcctcagcctcccaagtagctgggaatacaggcacccatcaccacgcctggctaatttttgtatttttagtagagatggg
gtttcaccatgttagtcaggctggtctcaaactcctgacctcaggtaatctgcccacctcagcctcccaaagggctgggattacaggcgtgagccactgtgccccaccctcctcttgcttttctaaaagatgatggtcaaagtacagcccccatttgcccccagacagggcacccttcccagatcgagaccttggggagtctgcgtgacccccacacctggcagacacaggtgcttcactagtgggggaacggctgagcatgtgctgagctcgggggcactagtgggctacagtccccaagtgggaggcccctcaagagcctggatgagctgactgacggtggagaggagggaaggagggcctatggccaaagtcaatccaggacccaactgccgaggccacaggaaggccgggtcaccgcctggaactaggtcggtcacagcccagtgggagccgtggcccggagactcaactgggggccctggttactctgctcgcctccccgcgtcggcacccagaacagagcttgcaggcactgggggcccagtccagggtctcaagagcagacaatgctgccttgcagttggggaaactgagacagggtgagaactttcagaggctcattgcaggctcctagcaggctgaaaggacggaggcacaggcacctaggagcacaccagccccacgtggccacggcccctcggagagcatgaggacacttgcaatgcggaagctcagcaggcccagctctactggctctgcaccgcccagtgaggggtcagcacagttggtccaagggacaataccagattaatgaggcagaagccacgggactgaccccttggaattctccacacccacactgtgcatccttaacccaaagcttctagcttggtagcccctcctaccctcctccctgcagcagggattagggatgcattctgacccctgcctgccgtcaggggagtgaggtctctccctggagcctgagctgaggatgcccaattcagccaggtgagccccgggatggactccatgtcccctagccaccacctgacttccccagcaccccacactggcaccagcccttcagatctcagaagcgagccaccctattctcacggagccccttcctgcctgccctccaaacccaagagtagttttagtacaaaaggcaaagttaacaaataggggtaggcgtcagggaaggaagaggatcagaggatcgggaacggagaaactggagcacctggagaagcgtctgggtcctgccacccccactgactccccaactggccttgggcagggtcctctctgcaggcgctgggtccaagcttggggatgagcagccaccagcgcgggctgcttcagctgaggctgccgcacccccacgtccatcctgggtagaggcaggacagccacagagccccatgcacggggctggactcaccctgggcactcacctaaaggcagtctcctcctttccaaagcccagactttctccggactcccaggaccaccaacaagggttcctgtgcgcagactcgggggtcttggggaggaaggacgctttctaggtggctgcctggaacctggaggcccctttctacagtacctggccagcggtcggtcacacctgagtgcccagagtgagcgggcggcagaggcatttctgacgctgccaggtaatcccacgggctggaaacgacctctgggctgggaagccaccgcctcccccagtcctgctgggtccctcagcagagagaacggaaccggggctttccccacagttttcaaagtttcagggaatcctagccaagtatcattccttcttccggagccgggaccccaggtcaagcctggggcccccacagggcggtcccaaccccactgcccggagcgcacccctgctccctgggaggcaggatatcgtgccgctgctccctggggcgcacgataccctccccaggaaggcgccggtcagggcggacgggccaggg
tgctcaccggtaccaggcgaggccgcgctcgtagcacctgtcgaagaagtggggctcagagcccagcgcgcggacgtcggggtgcagccgcagaaactccagcagggcgcgcgtgccgcccttcttcacgccaacgatgagcgcttgcgggaagcgccggcggccgggaccgctggcgtttccctcccaggggcccagtggtgaactgaattcaggcctgagacatactctgtctactaagtcaccccatctgcccagccttggtccacctggcactgcccagagacatcagtgatgcatttcggaagctggcaaagtggaccccactggagtacaaaggactcagggacccctgtgctggggaagagaaggagcccaggacctcccccaggggctgcctctgaggggcgtgagattcaggggcctctcgggtgggacctgcgggggccgctagacactgcgggaacttcacatccccaacgcccagcagcagcctgcagggaaggcaggggaggcgagccgggctcagagagggcgagcaacttgccccatccgaaggcaaaggtggtatgagacccgggtcctctctccacctctgccccagccttcctggccacagggctggcgccaggcaggcacggcacaggctcccggcagaggccacggtctcagccatccccacggtctcaggagtccccacggtctcagccgtccccacggtctgagtccccacggtctcagctgttcccacggtctcaggagtccccacaggttcagcagtccccacggtctcagccatccccacggtctcagccgtccccacagtctcagccatccccacggtctcagcagtccctactcaggacttgaaattccagcactggttccgtgatggctcctccagccccctgcccagcccagcatggtcatttccatctcctggcctttccgctgccgtctctctgctggatgctttatccttagtccccgctgagggcagaaggactttccaggaggaattgaccagaacgcagaacagcaggatgtggaatggactggggacagggagagagagatgcagggaccaggagtcggctcggagggttctcctggaagctgacccctccctccatcaggcactcggctgacggtggctacacacctcggggcgcccaggatggcagcactggggctgttcattcaccagtggatccccagcacctaacagagcctggcacgcagtggacattccattaatgtcgctcagtggaagggtatacgtgggaggagaggtcgggaaggctttctggaggtgacggccaggtgaagacgaggagaacagcattccaggccaaggaaccgtgtgggtgaaggctcagcagcagagagcccgggcagtagaggatggggtggagcttaaggccctgcgggaacaggggcggggcttagagtctggcctgaggctggtccagccccgcctcctcctcaggctcccaccaactctgagccaccagaccctcctttgtaaaatgaagacctcagtcatgactcgcatgagtctctgaagagtaacagctttattgtgatgtaattcacacaccactcaatccagccatttgtcgcatgcaaatcaatggttttcagtatattcatagtcgtgcaatcacaatcaattttagaacatttctatcaccccaaaaagaaatcctgtgtccattagcaatgacgccctcttctccccttcccacagcccctggcaaccacgaatctactttctgtctctatgggtttgcctattctggacatttcacaaaaagagaatcattgcttgaagccaggagttcaagaccaacctgggcaacaaagcgagaaccccgtctgtacaaaatattttaaatttagccaggcacagtggcgcacaccagtagtcccagcactttggaagtctgaggcaggaggttcacttgaggcggggaattcaaaaccagcctgggcaacat
agggagtaccagtctctacaaaaaatttcaaaatttgccaagcgtgatggtatgcacctatagtcctagcttactcaggaggctgaggtgggaggatcgcttgagcccaggagtacgaggctgcagtgagccatgatcataccactgcattccagcctgggcgacagagtgagagcccatctctaaaacagaaagaaagaaagaaagaaatatggccagtcacagtggctcatgcctgtaatcccagcattttgggaggccaaggcaggtggatcacttgaggtcaggagttcgagaccagcctggccaacatggtgaaaccctgtctctaccaaaaatacataaattagccaggtgtgggccaggcgccatggcttacacttgtaatcccagcactttgggaggccgaggtgggcagatcacctgaggttgagagttcgagaccagcctgaccaacatgaagaaaccctgtctctactaaaaatacaaaaaattagctgggtgtggtggtgcatgcctgtaatctcagctacttgggaggctgaggaaggagaatggcttgaacccgggaggcagaggttgtggtgagccgagatcgcgcgattgcactccagcctgggcaacaacagcaaaactccatctcaaataataataataataaattagccaggtgtggtggtgcacgcctgtagtcccagctactcgggaggctgaggcacaagaaacccttgaacccgggaggcagaggttgcagtgaagctgaaattgcaccattccactccagcctgggagacagagtgagacaccatctctaaaatgaaaaaaaaaaaagagaatcatacaatgttcgtccttttgtgtctgggtctcttactcagcatgttctccaggttcatcaacactgtggcatgtgccagtacctccttcctcttcctgactgagtaatactccatcgtatggatggaccaccttttgttgattccctcattcgttgatggacatctaggttgtttccactgcggggttcttagtaacggtattacagggaaccatagattaccaggtatt

How do you locate the CpG island in a DNA sequence?

agaattgcttgaaccgggaggcggaggttgcaatgagctgagatcacaccactgcactccagcatggtgacagagcaagactccatctcaaatcgagtaaaaaaaaaaaaatagctgggtgcggtggctcacgcctgtaatcccagcactttgggaggctgaggcgggtagatcacgaggtcaggagatcgtagccatcctggctaacacggtgaaaccccgtctctactaaaaatacaaaaagaaattagctgggcgtggtggtgggcgcctgtagtcccagctactcgggaggctgaggcaggagaatggcgtgaacccgggaggtggagcttgcagtgagtcgagatcacgccactgcactccagcctgggcgacagagcgagactcgatctcaaaaaaaaaaaaaaaaaaaaagtgcgacacgaggcacacagtcagtgcccagtggagttcgctgatatggttaccacatccctggggacagcgcctccaccctccaacctcgaggtttgtggaaaaatctgggtccaagctttatttcttaaatattcctctctgcccagcatgtgcacgcagcccgctctggccaggcgagcgggtgtcaatcaaggtgctgagcatccccagggtgccgctcagccccagccgaagtcctggcccgtcatctggtagaacctgcggttgaagggccggtagaactcctgcaggcgccggaccagggcctggggcacgcgtgggtgtggccggcccttggacttgcccaggcagcggggacggctgccgccctgggccttcttgaggcaggggaagcccttggtggcgttgaagtagaagtgcttgtccgtgacgacccgtttcaggcccaggaagtcctgcacgcggccgacctctccggccgggtcgctgaccagacgctccccgctgacgaacaggaagtgggacagggggaagtagcgcagccagtggtccaggtgctgggcgtacaggccgatgcggacggcgctccaggctgtgtccacggggcccaggccgtggcggaaggccagggcgcggaagctgggcaggcccggggtcttggagagcgtctgggcgtagtcggagatggcccgggtcacggggttccgcaccaccacgatcagcttcgtgtccggggacatggcgtggatgcggcggggggcctctcgcgtcacgaagtagctgggggtcttctccatggtgatctgcccatccagggttcggggcatcagactcctgcgggacgggtgcaaggagagggggcctgagcctccccagccctagaccggcccccaggggcccgggaccaaggcccccttatgcccgggaagcccaggcctccagggcgagcaagtcttcctccctgctcgggcccacccctgctagcgtgcgcggctgggcagcctggaacatggactgtgagggtgcccagcccggcacctgcctgcagcccggcctgttccgccggcctgccccgcctgctgctgcactgaggattagggtgacggtcgctggtcgggaggcccaaatgctcctcaccacccacatatcttccctgtgcaatccctgccgtcctcgcttccagagccagctccctcccaccggacccacactttcctggaactaggctgcccccagctcctttctcatcccagaccaagtaccccgaggcccgcccgcctagatcacttgaggtcacccgttcactcagtggctgacagcatcccctaaatcagcccttcaccaattattgacagtgtgtcctcaaccaaaagtagtcctccctgctccctccctcccctgatgtaattacatctcttcccatctttatttattttttg

C+G Content: 0.492 Observed/Expected: 0.548

C+G Content: 0.501 Observed/Expected: 0.568

C+G Content: 0.500 Observed/Expected: 0.560

200 steps later… C+G Content: 0.712 Observed/Expected: 0.604

600 steps later… C+G Content: 0.598 Observed/Expected: 0.421

Just a couple formulas…

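$$\text{G+C content} = \frac{(\#\text{ of C's}) + (\#\text{ of G's})}{\text{length of window}}$$

$$\text{Obs/Exp ratio} = \frac{\text{Observed \# of CpGs}}{\text{Expected \# of CpGs}} = \frac{\#\text{ of CpGs in window}}{\bigl((\#\text{ of C's}) \times (\#\text{ of G's})\bigr) / \text{length}}$$

The counts of C's and G's and the length in the expected term are taken from the window.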
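The two window formulas can be written as small functions. This is a minimal sketch: the function names are illustrative, the window is a made-up toy string rather than part of AL031723, and returning 0.0 when the expected count is zero is a convention chosen here, not something the slides specify.

```python
def gc_content(window):
    """(# of C's + # of G's) divided by the window length."""
    window = window.upper()
    return (window.count("C") + window.count("G")) / len(window)

def obs_exp_ratio(window):
    """Observed CpG count divided by (# of C's x # of G's) / window length."""
    window = window.upper()
    observed = window.count("CG")
    expected = window.count("C") * window.count("G") / len(window)
    # Assumed convention: a window with no C or no G gets a ratio of 0.
    return observed / expected if expected > 0 else 0.0

toy_window = "ACGTCGCGAT"  # hypothetical 10 bp window
print(gc_content(toy_window), obs_exp_ratio(toy_window))
```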

Traditional Methods

• Gardiner-Garden and Frommer (1987)

– Window size 100 bp and shift size 1 bp

– Criteria

  • At least 200 base pairs
  • G + C content greater than 50%
  • Expected portion of the Obs/Exp ratio calculated over the window
  • Obs/Exp ratio greater than 0.6

• Takai and Jones (2002)

– Window size 200 bp and shift size 1 bp

– Criteria

  • At least 500 base pairs
  • At least 7 CpG dinucleotides in 200 base pair sequence
  • G + C content greater than 55%
  • Obs/Exp ratio calculated in same fashion as above method
  • Obs/Exp ratio greater than 0.65
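A sliding-window scan using the Gardiner-Garden and Frommer thresholds might look like the sketch below. The slides do not say how passing windows are combined into islands, so merging consecutive passing windows into one region and then applying the minimum length is an assumption, as are the function names and the 0-based, end-exclusive coordinates.

```python
def window_stats(window):
    """Return (G+C content, Obs/Exp ratio) for one window."""
    c, g = window.count("C"), window.count("G")
    expected = c * g / len(window)
    obs_exp = window.count("CG") / expected if expected > 0 else 0.0
    return (c + g) / len(window), obs_exp

def find_candidate_islands(sequence, window_size=100, min_gc=0.5,
                           min_obs_exp=0.6, min_length=200):
    """Merge runs of passing windows and keep runs of at least min_length bp."""
    seq = sequence.upper()
    islands = []
    start = end = None  # bounds of the current run of passing windows
    for i in range(len(seq) - window_size + 1):  # shift size of 1 bp
        gc, obs_exp = window_stats(seq[i:i + window_size])
        if gc > min_gc and obs_exp > min_obs_exp:
            if start is None:
                start = i
            end = i + window_size  # exclusive end of the last passing window
        elif start is not None:
            if end - start >= min_length:
                islands.append((start, end))
            start = None
    # Close a run that reaches the end of the sequence.
    if start is not None and end - start >= min_length:
        islands.append((start, end))
    return islands

# Made-up test sequence: a CpG-rich block flanked by AT repeats.
toy_sequence = "AT" * 150 + "CG" * 150 + "AT" * 150
print(find_candidate_islands(toy_sequence))
```

The Takai and Jones variant could be approximated by passing `window_size=200, min_gc=0.55, min_obs_exp=0.65, min_length=500`; their additional requirement of at least 7 CpGs per 200 bp is not covered by this sketch.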

The Traditional Method

[Figure: C+G content and Obs/Exp ratio plotted against base position for sequence AL031723]

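Our Method

• Modifying the traditional methods

– Window size 200 bp and shift size 1 bp

– Expected portion of the Obs/Exp ratio is based on the whole sequence

• And…

$$\text{Obs/Exp ratio} = \frac{\text{Observed \# of CpGs}}{\text{Expected \# of CpGs}} = \frac{\#\text{ of CpGs in window}}{\bigl((\#\text{ of C's}) \times (\#\text{ of G's})\bigr) / \text{length}}$$

The counts of C's and G's and the length in the expected term are taken from the entire sequence.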
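The modified ratio can be sketched as follows: the observed CpG count still comes from each 200 bp window, but the expected term is computed once from the whole sequence. The function names and the toy input are illustrative assumptions, not taken from the slides.

```python
def whole_sequence_expected(sequence):
    """(# of C's x # of G's) / length, computed over the entire sequence."""
    seq = sequence.upper()
    return seq.count("C") * seq.count("G") / len(seq)

def modified_obs_exp(sequence, window_size=200):
    """Obs/Exp ratio per window, with the expected term from the whole sequence."""
    seq = sequence.upper()
    expected = whole_sequence_expected(seq)
    n_windows = max(len(seq) - window_size + 1, 0)
    if expected == 0:
        # Assumed convention: no C or no G anywhere gives a ratio of 0.
        return [0.0] * n_windows
    return [seq[i:i + window_size].count("CG") / expected
            for i in range(n_windows)]  # shift size of 1 bp

toy_sequence = "ACGT" * 100  # made-up 400 bp sequence
ratios = modified_obs_exp(toy_sequence)
print(len(ratios), ratios[0])
```

Because the denominator grows with the length of the whole sequence, ratios for a long sequence are much smaller than the traditional ones, which is consistent with the small Obs/Exp values reported on the following slides.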

• Cutoffs greater than 97th percentile of observed sequence

Obs/Exp Ratio

– Mean: 0.0018

– Standard Deviation: 0.0014

– 97th percentile: 0.0058

G+C Content

– Mean: 0.5815

– Standard Deviation: 0.0818

– 97th percentile: 0.7350

[Figure: histograms of the Obs/Exp ratio and the G+C content (number of observations) for sequence AL031723]
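A percentile cutoff of this kind can be sketched with the standard library. The slides do not say which interpolation rule was used or how the cutoffs are combined, so the "inclusive" quantile method and the requirement that a window exceed both cutoffs are assumptions, and the function names are illustrative.

```python
import statistics

def percentile_cutoff(values, percentile=97):
    """Value below which roughly `percentile` percent of the observations fall."""
    # quantiles(n=100) returns 99 cut points; index 96 is the 97th percentile.
    cut_points = statistics.quantiles(values, n=100, method="inclusive")
    return cut_points[percentile - 1]

def windows_above_cutoffs(obs_exp, gc, percentile=97):
    """Indices of windows whose Obs/Exp ratio and G+C content both exceed their cutoffs."""
    oe_cut = percentile_cutoff(obs_exp, percentile)
    gc_cut = percentile_cutoff(gc, percentile)
    return [i for i, (oe, g) in enumerate(zip(obs_exp, gc))
            if oe > oe_cut and g > gc_cut]

# Made-up per-window values, for illustration only.
toy_values = list(range(1, 101))
print(percentile_cutoff(toy_values))
print(windows_above_cutoffs(toy_values, toy_values))
```

Deriving the cutoffs from the sequence's own distribution means the thresholds adapt to each sequence instead of being fixed in advance, as the 0.6 and 50% thresholds of the traditional method are.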

Kullback-Leibler Divergence

p ln (p/0.03) + (1-p) ln ((1-p)/(1-0.03))

[Figure: KL divergence plotted against p]
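The expression above is the Kullback-Leibler divergence between two Bernoulli distributions with success probabilities p and 0.03. A minimal sketch follows; reading p as the proportion of CpG dinucleotides in a window is an assumption, since the slide does not define it, and the function name is illustrative.

```python
import math

def kl_divergence_bernoulli(p, q=0.03):
    """p ln(p/q) + (1-p) ln((1-p)/(1-q)), with 0 ln 0 taken as 0."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    if p > 0:
        total += p * math.log(p / q)
    if p < 1:
        total += (1 - p) * math.log((1 - p) / (1 - q))
    return total

for p in (0.0, 0.03, 0.10):
    print(p, kl_divergence_bernoulli(p))
```

The divergence is zero when p equals the reference value 0.03 and grows as p moves away from it in either direction.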

Kullback-Leibler: 0.508 Our Obs/Exp: 0.0029 C+G: 0.492

Kullback-Leibler: 0.509 Our Obs/Exp: 0.0030 C+G: 0.501

Kullback-Leibler: 0.507 Our Obs/Exp: 0.0029 C+G: 0.500

200 steps later… Kullback-Leibler: 0.520 Our Obs/Exp: 0.0033 C+G: 0.712

600 steps later… Kullback-Leibler: 0.510 Our Obs/Exp: 0.0030 C+G: 0.598

Our Method

[Figure: Kullback-Leibler divergence (×12), Observed/Expected ratio (×160), and C+G content plotted against base position for sequence AL031723]

Comparison of AL031723

[Figure: results for the Traditional Method and Our Method]

Comparison of AL031723

Traditional Method

Possible CpG Islands

3878-4534

5849-6136

6541-6820

8479-8698

10745-11049

18435-19580

25131-26359

35182-35441

36245-36576

36827-37606

Actual CpG Islands

18928-19547

25201-26371

36997-37693

Comparison of AL031723

Our Method

Possible CpG Islands

19227-19435

25197-26147

36982-37420

Actual CpG Islands

18928-19547

25201-26371

36997-37693
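One way to quantify how closely these predictions match the actual islands is to count shared base positions. The sketch below uses the intervals listed above; treating them as 1-based inclusive coordinates is an assumption, and the helper name is illustrative.

```python
def overlap_length(predicted, actual):
    """Number of positions shared by two inclusive (start, end) intervals."""
    start = max(predicted[0], actual[0])
    end = min(predicted[1], actual[1])
    return max(0, end - start + 1)

# Intervals copied from the slide; 1-based inclusive coordinates are assumed.
our_method = [(19227, 19435), (25197, 26147), (36982, 37420)]
actual = [(18928, 19547), (25201, 26371), (36997, 37693)]

for pred, act in zip(our_method, actual):
    shared = overlap_length(pred, act)
    actual_len = act[1] - act[0] + 1
    print(f"{pred} vs {act}: {shared} of {actual_len} bp recovered")
```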

Cons

• Traditional Method

– Criteria not stringent enough

– If the expected part of the Obs/Exp ratio is unusually high, then even a high CpG count may not bring the ratio above the cutoff

• Our Method

– Criteria sometimes too stringent

Future Plans

• CpG Islands

• Linkage Disequilibrium and SNPs

– Statistical analysis of the linkage disequilibrium coefficient

– Kullback-Leibler Divergence II

Thank you!

• Former researchers

– Andrew Dittmore

– Yasuhiro Goda

– Nick Heppenstall

– Michal Dvir

• Deborah Lycan

• John S. Rogers Program