Mapping mobile DNA elements: Sources of human genetic diversity and disease.
Kathleen H. Burns, M.D.
Johns Hopkins Department of Pathology
Overview
I. The Mobile Genome: Transposable elements in human diversity & disease.
II. Finding a needle in a haystack - one new mobile element insertion in a million.
III. Novel microarray-based transposon mapping technology and data analysis
IV. Future directions
The (junk) Genome
Data from Jasinska & Krzyzosiak, FEBS Letters (2004).
• 55% of the human genome is repetitive; the rest is non-repetitive.
• 45% of the human genome is composed of interspersed repeats (as distinct from tandem repeats).
• 21% of the human genome is composed of long interspersed elements (LINEs); the other interspersed classes shown are LTRs (HERVs) and SINEs (Alu).
LINEs are dynamic genetic sequences. Multiplication by “copy-and-paste”
Figure: LINE structure, in order: 5’ UTR (with ASP), ORF1, ORF2, 3’ UTR, poly(A) tail (AAAAA).
Major epigenetic remodeling takes place in germ cell and embryo development.
Figure: total genome methylation across development (PGCs, fertilization, zygote, blastocyst), shown for epiblast and extraembryonic tissue.
From Burns & Matzuk, “Rewriting and its Risks” in Preimplantation Embryo Development, Knobil and Neill’s Physiology of Reproduction, 3rd Ed., 2006.
Retrotransposons can act as insertional mutagens
“Accidental Discoveries” of LINEs as Agents of Disease
• compromised transcriptional elongation
• splicing disruption
• “gene breaking”/premature polyA
• regulatory/epigenetic effects
• disrupted ORF
Disrupted Gene   Disorder        Insertion Site   Inserted Element   Reference
CYBB             CGD             Exon             L1 Ta              Meischl et al. (2000)
                                 Intron                              Brouha et al. (2002)
F8               Haemophilia A   Exon             L1 Ta              Kazazian et al. (1988)
                                                  L1 preTa           Kazazian et al. (1988)
F9               Haemophilia B   Exon             L1 Ta              Li et al. (2001)
                                                                     Mukherjee et al. (2004)
HBB              thalassemia     Exon             L1 Ta              Divoky et al. (1996)
                                 Intron                              Kimberland et al. (1999)
Tumor suppressors are down-regulated in neoplasia in the context of broad genome hypomethylation
Figure (modified from Melki & Clark, 2002): normal cell versus cancer cell, showing hypermethylation of tumor suppressors against overall hypomethylation.
Tumor: evidence for LINE hypomethylation (reference)
• breast cancer: 5’ flanking sequences of hypomethylated L1Hs elements isolated by MSP iPCR (Alves, et al. 1996).
• chronic myeloid leukemia (blast phase): methylation-specific PCR of primary samples; hypomethylation associated with ↑ BCR-ABL mRNA, resistance to tyrosine kinase inhibitors (Roman-Gomez, et al. 2006).
• chronic lymphoid leukemia: primary specimens analyzed by HpaII digest and Southern blot (Dante, et al. 1992).
• colorectal adenocarcinoma: compared to neighboring normal colon; alternate MSI progression pathway (Estecio, et al. 2007).
• hepatocellular carcinoma: hepatocellular carcinomas compared to surrounding, cirrhotic liver; HpaII restriction enzyme digest (Takai, et al. 2000).
• pancreatic endocrine and carcinoid tumors: compared to surrounding tissue; LINE hypomethylation correlates with lymph node metastasis, cytogenetic aberrations (Choi, et al. 2007).
• prostate cancer: compared to surrounding tissue; hypomethylation associated with Gleason grade, clinical stage, and cytogenetic abnormalities (Santourlidis, et al. 1999; Schultz, et al. 2002; Cho, et al. 2007).
• urothelial carcinoma: appreciated by Southern blot or MSP-PCR in most specimens (Neuhausen, et al. 2006).
A known case of colon cancer associated with APC mutation by a somatic LINE insertion
Miki, et al. Cancer Research 1992.
Figure: restriction map (BglII, PstI, MspI, HindIII, EcoRI) around Chr5:112203758-112203784; annotation: chr4; human genome sequencing project.
Reducing the haystack: Seeing the transcriptionally active T(a)LINE subset
Figure (Boissinot & Furano, 2005): L1 3’ end labeled ACA with poly(A) tail (AAAAAA); fossil L1 elements versus polymorphic L1 elements.
Where are the transposons? TIP-chip strategy
Figure labels: L1 T(a); tiling array (masked feature); microarray slide.
LINE mapping strategy: Vectorette PCR
Figure: vectorette PCR schematic with sites labeled R1, R2, and R3; scale 2 kb.
LINE mapping strategy: Vectorette PCR
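Covering 3 billion base pairs of the human genome
Total Length = 29
Enzyme   Additional coverage per candidate enzyme   Total Coverage
1         9   7   8   6   8   6   7   5   5          9
2        --   7   3   3   5   4   5   5   4         16
3        --  --   1   3   4   5   1   3   1         21
4        --  --   1   3   2  --   0   0   4         25
5        --  --   1   3   1  --   0   0  --         28
6        --  --   0  --   1  --   0   0  --         29
(-- marks a candidate already chosen; each step adds the largest remaining value to the total.)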
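The running totals in this table, which end at the full length of 29, are consistent with a greedy selection: at each step the enzyme that adds the most uncovered sequence is chosen. A minimal sketch of that idea follows. It assumes each candidate enzyme can be represented by the set of genome segments it makes mappable; the enzyme names and sets are hypothetical toy data, not the values on the slide, and the slide does not confirm that this exact procedure was used.

```python
def greedy_cover(candidates, universe):
    """Pick enzymes greedily until no candidate adds coverage.

    candidates: dict mapping a (hypothetical) enzyme name to the set of
    segment indices it covers. Returns a list of (enzyme, gain, total).
    """
    covered = set()
    remaining = dict(candidates)
    picks = []
    while remaining and covered != universe:
        # Marginal gain = number of segments this enzyme would newly cover.
        best = max(remaining, key=lambda e: len(remaining[e] - covered))
        gain = len(remaining[best] - covered)
        if gain == 0:
            break  # no remaining enzyme helps
        covered |= remaining.pop(best)
        picks.append((best, gain, len(covered)))
    return picks


# Toy universe of 10 segments and three made-up enzymes.
universe = set(range(10))
candidates = {
    "EnzA": {0, 1, 2, 3, 4},
    "EnzB": {3, 4, 5, 6},
    "EnzC": {6, 7, 8, 9},
}
picks = greedy_cover(candidates, universe)
for enzyme, gain, total in picks:
    print(enzyme, gain, total)
```

Each printed row mirrors a table row: the enzyme chosen, the coverage it added, and the cumulative total.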
Comparison to the reference genome
location of a T(a)LINE in the reference
Partial overlap of mapped T(a)LINEs with the human genome reference
Figure: mapped positions (base-pair coordinates), genotyped in three sets of samples labeled P, M, S, D.
6,932,527; 11,641,350; 11,863,056; 33,335,616; 34,746,985; 43,049,918; 45,334,456; 49,615,819; 54,160,944; 56,745,095; 63,148,056; 65,272,709; 67,180,024; 72,523,645; 73,829,561; 75,459,361; 75,866,879; 76,328,405; 76,340,938;
77,583,298; 80,989,514; 85,496,585; 94,744,098; 98,150,743;
111,115,626; 118,453,435; 119,776,693; 120,225,018; 120,498,092; 124,905,894; 129,910,188; 140,342,653; 141,399,674; 143,237,654; 150,255,511; 154,398,622; 154,583,236
Legend: insertion carrier; inferred heterozygote; no insertion.
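Comparing mapped signals with reference T(a)LINE locations comes down to matching positions within some distance. The sketch below is one way to do that, assuming a fixed tolerance window. The function names, the 2,000 bp tolerance, and the example coordinates are all hypothetical and are not taken from the slides.

```python
from bisect import bisect_left


def has_neighbor(sorted_positions, pos, tolerance):
    """True if any entry of sorted_positions lies within tolerance of pos."""
    i = bisect_left(sorted_positions, pos)
    # Only the entries just left and right of the insertion point can be nearest.
    neighbors = sorted_positions[max(i - 1, 0):i + 1]
    return any(abs(pos - n) <= tolerance for n in neighbors)


def classify_mapped(mapped, reference, tolerance=2000):
    """Label each mapped position 'reference' or 'novel' (assumed window)."""
    ref = sorted(reference)
    return {
        pos: "reference" if has_neighbor(ref, pos, tolerance) else "novel"
        for pos in mapped
    }


def absent_reference(mapped, reference, tolerance=2000):
    """Reference insertions with no mapped signal nearby."""
    hits = sorted(mapped)
    return [r for r in reference if not has_neighbor(hits, r, tolerance)]


# Made-up coordinates for illustration only.
reference = [100_000, 500_000, 900_000]
mapped = [100_800, 300_000, 899_500]
labels = classify_mapped(mapped, reference)
absent = absent_reference(mapped, reference)
print(labels)
print(absent)
```

Reference insertions without a nearby signal are candidates for the "partial overlap" noted above: elements present in the reference genome but absent from the sample.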
The T(a)LINE TIP-Chip is a robust genotyping tool
Figure: genotyping results for samples P, M, S, D.
Gaussian Distribution of Background and Foreground Values
Figure: frequency histograms (log2 scale and LN scale) marking the reference peak, other peaks, and the noise estimate.
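If background probe values are roughly Gaussian on a log scale, a noise estimate can be taken from the bulk of the distribution, and probes far above it treated as foreground. The sketch below uses the median and the scaled median absolute deviation (MAD) as a robust stand-in for the Gaussian mean and standard deviation. The estimator, the cutoff of 3, and the intensities are assumptions made for illustration; the slides do not state how the noise estimate was computed.

```python
import math
import statistics


def noise_estimate(intensities):
    """Estimate background center and spread on a log2 scale.

    The factor 1.4826 rescales the MAD so that it approximates the
    standard deviation when the data are Gaussian.
    """
    logs = [math.log2(x) for x in intensities]
    center = statistics.median(logs)
    mad = statistics.median([abs(v - center) for v in logs])
    return center, 1.4826 * mad


def foreground_probes(intensities, z_cutoff=3.0):
    """Indices of probes more than z_cutoff noise units above background."""
    center, sigma = noise_estimate(intensities)
    if sigma == 0:
        return []
    return [
        i for i, x in enumerate(intensities)
        if (math.log2(x) - center) / sigma > z_cutoff
    ]


# Made-up probe intensities: six background-like values and one strong signal.
intensities = [100, 110, 90, 105, 95, 100, 3200]
flagged = foreground_probes(intensities)
print(flagged)
```

The median and MAD are used here because a few strong peaks would inflate an ordinary mean and standard deviation.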
Modeling Peak Shape
Figure: peak model showing strong, weak, and absent signal against distance in kilobases (axis ticks 0 to 13, then 32), divided into states 1 to 21.
21 states x 2 peak directions = 42 peak states + 1 background state = 43 states
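The state count can be checked by enumerating the states directly. In the sketch below, the direction labels "up" and "dn" echo the Up/Dn labels used elsewhere in the slides. The transition rule (background to the first peak state, then left to right through the peak, then back to background) is an assumed topology included only to show how such a model might be wired; the source gives the state count, not the transitions.

```python
from itertools import product

N_POSITIONS = 21            # peak positions per direction, from the slide
DIRECTIONS = ("up", "dn")   # two peak directions

BACKGROUND = ("background", None)
peak_states = list(product(DIRECTIONS, range(1, N_POSITIONS + 1)))
states = [BACKGROUND] + peak_states


def allowed_next(state):
    """Assumed left-to-right topology; not taken from the source."""
    if state == BACKGROUND:
        # Stay in background or enter either peak at its first position.
        return [BACKGROUND] + [(d, 1) for d in DIRECTIONS]
    direction, k = state
    if k < N_POSITIONS:
        return [(direction, k + 1)]
    return [BACKGROUND]  # peak finished, return to background


print(len(peak_states), len(states))  # 42 43
```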
LISA: LINE Insertion Signal Analysis
Ranking of 34 known L1 insertion signals on the X chromosome
Reference (Ta)L1 are polymorphic
Figure: signals scored Absent or Present in two sets of samples labeled P, M, S, D.
Finding novel T(a)LINE insertions
Figure: signal at the mapped positions listed under “Comparison to the reference genome” (6,932,527 through 154,583,236), with peak directions marked Up or Dn and samples labeled H H A P P P P P A P A P P.
Summary
LINE methylation is relaxed in germ cells and preimplantation embryos as well as in some types of cancer.
• LINE insertions arising in the former can cause heritable disease.
• Relaxed LINE silencing may have an underappreciated role in tumor progression.
We have developed a comprehensive mapping method for the most active LINEs in the modern genome. This is a robust method for genotyping known insertions and identifying novel insertions.
• This has immediate pertinence to understanding common polymorphisms and heritable disease.
• The approach is expected to have special utility in comparing tumor and normal DNA.
Future Directions
Applications of total genome transposon profiling in cancer research.
• Comprehensive T(a)LINE mapping to compare tumor and normal tissue.
• Correlate findings with markers of epigenetic T(a)LINE silencing.
• Development of HERV-K mapping strategies.
• Transposon mapping for identifying causes of familial cancer susceptibility syndromes.
Nimblegen HD2 platform
• Total probes: 2.1 million
• Probe length: 50 - 75mer (ChIP-chip whole genome tiling)
• Feature size: 13μm x 13μm
• Array size: 62mm x 14mm
• Slide size: 1” x 3” (25mm x 76mm) glass
Acknowledgements
Center for High Throughput Biology
• Jef Boeke, Ph.D. & the Boeke Lab
• Cheng Ran Huang
• Tejas Niranjan
Department of Oncology
• Curt Civin, M.D. & the Civin Lab
Institute of Genetic Medicine
• Dave Valle, M.D.
Funding:
• Burroughs Wellcome Foundation
• National Cancer Institute
• Sol Goldman Pancreatic Cancer Research Center
• Goldhirsh Brain Tumor Research Foundation