eileen-jackson
Genome Organization & Evolution
Chromosomes
• Genes are always in genomic structures (chromosomes) – never ‘free floating’
• Bacterial genomes are circular
• Eukaryotic chromosomes are linear, oriented strands
• Question: why do chromosomes exist?
Size of genomes (base pairs)
Epstein-Barr virus   0.172 x 10^6
E. coli              4.6 x 10^6
S. cerevisiae        12.1 x 10^6
C. elegans           95.5 x 10^6
A. thaliana          117 x 10^6
D. melanogaster      180 x 10^6
H. sapiens           3200 x 10^6
Genomic structures
• Chromosomes
• Plasmids
• Mitochondria
• Chloroplasts
Competition & cooperation
• Are genes ‘selfish’? Examples?
• Are genes ‘cooperative’? Examples?
• Which came first, cooperation or competition?
• How do cooperation and competition evolve?
Gene structure
• Exons (coding regions)
• Introns
– Who has ’em?
– What size?
– Which is the original form?
• Computational challenges & clues
– Find the exon/intron structure
– Use the gene's function to facilitate locating it
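The exon/intron challenge can be illustrated with a very rough sketch. It assumes only the canonical rule that an intron starts with GT and ends with AG, and it lists every span that satisfies that rule. The function name `candidate_introns` and the length limits are illustrative choices, not part of the lecture. Real gene finders add much more evidence, because most GT...AG spans are not introns.

```python
import re

def candidate_introns(dna, min_len=20, max_len=2000):
    """Return (start, end) spans that start with GT and end with AG.

    Coordinates are 0-based, end-exclusive. This is a naive enumeration
    under the GT...AG assumption, not a real splice-site predictor.
    """
    dna = dna.upper()
    # Lookahead patterns find overlapping occurrences as well.
    donors = [m.start() for m in re.finditer("(?=GT)", dna)]
    acceptor_ends = [m.start() + 2 for m in re.finditer("(?=AG)", dna)]
    spans = []
    for d in donors:
        for a in acceptor_ends:
            if min_len <= a - d <= max_len:
                spans.append((d, a))
    return spans

# Toy usage with a short sequence and relaxed length limits.
toy = "CCGTCCCCCCAGCC"
for start, end in candidate_introns(toy, min_len=4, max_len=50):
    print(start, end, toy[start:end])
```

The number of candidates grows quickly with sequence length, which is why the slide pairs this task with additional clues.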
Regulatory mechanisms
• ‘organize expression of genes’ (function calls)
• Promoter region (binding site), usually near coding region
• Binding can block (inhibit) expression
• Computational challenges
– Identify binding sites
– Correlate sequence to expression
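One simple way to approach "identify binding sites" is to slide a consensus sequence along the DNA and report windows that differ from it in at most a few positions. The sketch below assumes a TATA-like consensus of `TATAAA`. The function name `find_sites` and the mismatch threshold are illustrative assumptions. Practical tools usually use position weight matrices instead of a single consensus string.

```python
def find_sites(dna, consensus="TATAAA", max_mismatches=1):
    """Return (position, window, mismatches) for near-consensus windows.

    Positions are 0-based. Only the given strand is scanned.
    """
    dna = dna.upper()
    k = len(consensus)
    hits = []
    for i in range(len(dna) - k + 1):
        window = dna[i:i + k]
        # Count positions where the window disagrees with the consensus.
        mismatches = sum(1 for a, b in zip(window, consensus) if a != b)
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits

# Toy usage: report every window within one mismatch of the consensus.
for pos, window, mm in find_sites("GGCTATAAAGGCC"):
    print(pos, window, mm)
```

Raising `max_mismatches` finds more candidate sites but also more false positives, which is the trade-off behind the "correlate sequence to expression" challenge.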
Proteins
• Most protein sequences (today) are inferred from nucleic acid sequence
• What’s wrong with this?
• Proteins (and nucleic acids) are modified
• ‘Mature’ RNA
• Computational challenges
– Identify (possible) aspects of molecular life cycle
– Identify protein-protein and protein-nucleic acid interactions
Genetic variation
• Variable number tandem repeats (minisatellites). 10-100 bp. Forensic applications.
• Short tandem repeat polymorphisms (microsatellites). 2-5 bp, 10-30 consecutive copies.
• Single nucleotide polymorphisms
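Short tandem repeats of the kind described above can be located with a regular-expression backreference. The sketch below uses the slide's figures (unit of 2-5 bp, at least 10 consecutive copies) as defaults. The function name `find_microsatellites` is an illustrative assumption, and the pattern only finds perfect repeats.

```python
import re

def find_microsatellites(dna, min_unit=2, max_unit=5, min_copies=10):
    """Return (start, unit, copies) for perfect tandem repeats.

    The unit is matched lazily so the shortest repeating unit is preferred.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    results = []
    for m in pattern.finditer(dna.upper()):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        results.append((m.start(), unit, copies))
    return results

# Toy usage: twelve copies of CA flanked by unrelated bases.
print(find_microsatellites("GGG" + "CA" * 12 + "TTT"))
```

Because individuals differ in copy number at such loci, comparing the `copies` value across samples is the basic idea behind the forensic use mentioned above.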
Single nucleotide polymorphisms
• About 1 per 2,000 bp.
• Types
– Silent
– Truncating
– Shifting
• Significance: much of individual variation.
• Challenge: correlation to disease
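The effect of a single-base substitution on a protein can be checked by translating the codon before and after the change. The sketch below assumes the standard genetic code. It labels a substitution "silent" if the amino acid is unchanged and "truncating" if it creates a stop codon. The label "missense" (amino acid changed) is an extra category added here for completeness, and frame shifts are not covered because they arise from insertions or deletions, not substitutions. The function name `classify_snp` is an illustrative assumption.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def classify_snp(codon, pos, new_base):
    """Classify a substitution at 0-based position pos within one codon."""
    codon = codon.upper()
    mutant = codon[:pos] + new_base.upper() + codon[pos + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if after == before:
        return "silent"
    if after == "*":          # '*' marks a stop codon
        return "truncating"
    return "missense"

# Toy usage on three single-codon substitutions.
print(classify_snp("CTT", 2, "C"))
print(classify_snp("TAT", 2, "A"))
print(classify_snp("GAG", 1, "T"))
```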
Anatomy of a gene
• ORF. From start (ATG) to stop (TGA, TAA, TAG)
• Upstream region with binding site. (e.g. TATA box).
• Poly-A ‘tail’
• Splice sites. Introns begin with GT and end with AG.
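The ORF definition above (ATG through the first in-frame TGA, TAA, or TAG) translates directly into a scan over the three forward reading frames. This is a minimal sketch: the function name `find_orfs` and the `min_codons` filter are illustrative assumptions, and the reverse strand and introns are ignored.

```python
STOP_CODONS = {"TGA", "TAA", "TAG"}

def find_orfs(dna, min_codons=2):
    """Return (start, end) for forward-strand ORFs, 0-based, end-exclusive.

    Each ORF runs from the first ATG in a frame to the next in-frame stop
    codon, and the stop codon is included in the reported span.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs

# Toy usage: one ORF in the third reading frame.
toy = "CCATGAAATTTTAGGG"
for start, end in find_orfs(toy):
    print(start, end, toy[start:end])
```

In a real genome, short ORFs occur by chance, so a larger `min_codons` is normally used to reduce false positives.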
E. coli genome
• 4.6 x 10^6 bp. One chromosome. Published 1997.
• 4,285 protein-coding genes
• 122 structural RNA genes
• Repeats. Regulatory elements. Transposons.
• Lateral transfers.
E. coli protein functions
Function               Genes   % of total
Regulatory                45     1.05
Cell structure           182     4.24
Transposons, etc.         87     2.03
Transport & binding      281     6.55
Putative transport       146     3.40
Replication, repair      115     2.68
Transcription             55     1.28
Translation              182     4.24
Enzymes                  251     5.85
Unknown                 1632    38.06
Eukaryotic genome
• Moderately repetitive
– Functional (protein coding, tRNA coding)
– Unknown function
• SINEs (short interspersed elements)
– 200-300 bp
– 100,000 copies
• LINEs (long interspersed elements)
– 1-5 kb
– 10-10,000 copies
Eukaryotic genome
• Highly repetitive
– Minisatellites
  • Repeats of 14-500 bp
  • 1-5 kb long
  • Scattered throughout genome
– Microsatellites
  • Repeats up to 13 bp
  • 100s of kb long, 10^6 copies
  • Around centromere
– Telomeres
  • Short repeats (6 bp)
  • 250-1,000 at ends of chromosomes
HW 3
• Due March 5
• Weblem 2.2, p 112
• Weblem 2.14, p 113