1. DNA and Gene Expression
2. Deoxyribonucleic Acid (DNA)
- Two sugar-phosphate strands held apart by pairs of four bases
  - Adenine (A), thymine (T), guanine (G), cytosine (C)
  - A pairs with T, G pairs with C
- Self-replicating molecule
- Directs synthesis of proteins for body
3. DNA Structure
4. DNA Replication
- Results in two complete double helices of DNA
- How nucleotides are added in DNA replication (animation)
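The pairing rule (A with T, G with C) is enough to sketch why replication yields two complete double helices: each separated strand dictates its missing partner. The Python below is a minimal illustration, not a model of the real enzymatic process. The names `complement` and `replicate` are made up for this example, and strand direction is ignored for simplicity.

```python
# Pairing rule from the slides: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base partner of a DNA strand."""
    return "".join(PAIRS[base] for base in strand)


def replicate(helix: tuple[str, str]) -> list[tuple[str, str]]:
    """Separate the two strands and rebuild the partner of each one."""
    first, second = helix
    return [(first, complement(first)), (complement(second), second)]


# An arbitrary example sequence (not taken from the slides).
original = ("ATGCCG", complement("ATGCCG"))
copies = replicate(original)
print(copies)
```

Both entries of `copies` equal `original`, because each old strand fully determines its new partner.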
5. Genome
- Maybe 30,000 genes on human genome
- Genes range from 1,000 to 2 million base pairs
6. Protein Synthesis
- 20 amino acids, despite 64 possible three-base combinations of the 4 bases; several combinations code for the same amino acid
  - Various sequences of three bases (codons)
  - Each codes for an amino acid (or stop signal)
- Amino acids assembled into proteins
- Interestingly, only about 2% of genome involved in protein synthesis
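The figure of 64 comes from choosing one of 4 bases at each of 3 positions. A short Python check, assuming nothing beyond the four bases and the 20 amino acids plus a stop signal mentioned above:

```python
from itertools import product

BASES = "ACGT"

# Every ordered triplet of bases: 4 * 4 * 4 possibilities.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
n_codons = len(codons)

# 20 amino acids plus the stop signal are the meanings to be encoded.
n_meanings = 20 + 1
average_codons_per_meaning = n_codons / n_meanings

print(n_codons, round(average_codons_per_meaning, 2))
```

With more codons than meanings, most amino acids are necessarily specified by more than one codon.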
7. Genetic Code
Amino acid       Codons
Alanine          CGA, CGG, CGT, CGC
Arginine         GCA, GCG, GCT, GCC, TCT, TCC
Asparagine       TTA, TTG
Aspartic acid    CTA, CTG
Cysteine         ACA, ACG
Glutamic acid    CTT, CTC
Glutamine        GTT, GTC
Glycine          CCA, CCG, CCT, CCC
Histidine        GTA, GTG
Isoleucine       TAA, TAG, TAT
Leucine          AAT, AAC, GAA, GAG, GAT, GAC
Lysine           TTT, TTC
Methionine       TAC
Phenylalanine    AAA, AAG
Proline          GGA, GGG, GGT, GGC
Serine           AGA, AGG, AGT, AGC, TCA, TCG
Threonine        TGA, TGG, TGT, TGC
Tryptophan       ACC
Tyrosine         ATA, ATG
Valine           CAA, CAG, CAT, CAC
(Stop signals)   ATT, ATC, ACT
8. Mutations
- Mistakes made in copying DNA
- Produces different alleles (called polymorphisms)
- Mutations occurring in gametes will be transmitted faithfully
unless natural selection intervenes
9. Single-Base Mutations
- Can either change or remove a base from a codon
- Changing one base for another
  - Generally less likely to have an effect
- Removing a base
  - More problematic, as it shifts the reading of the triplet code
  - CGA-CTA-TGA --> CAC-TAT-GA
  - Alanine-aspartic acid-threonine --> valine-isoleucine
- No, small, or large effect on protein production
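The difference between changing and removing a base can be reproduced in a few lines of Python. This is only a sketch: `CODON_TABLE` holds just the six DNA codons needed here (taken from the genetic code slide), and the helper names are invented for the example.

```python
# Subset of the DNA codon table from the Genetic Code slide.
CODON_TABLE = {
    "CGA": "alanine",
    "CGG": "alanine",
    "CTA": "aspartic acid",
    "TGA": "threonine",
    "CAC": "valine",
    "TAT": "isoleucine",
}


def read_codons(dna: str) -> list[str]:
    """Split a sequence into complete triplets, dropping leftover bases."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def translate(dna: str) -> list[str]:
    """Look up the amino acid for each complete codon."""
    return [CODON_TABLE[codon] for codon in read_codons(dna)]


original = "CGACTATGA"
substituted = "CGG" + original[3:]        # third base changed: A -> G
deleted = original[:1] + original[2:]     # second base (G) removed

print(translate(original))     # alanine, aspartic acid, threonine
print(translate(substituted))  # same three amino acids
print(translate(deleted))      # valine, isoleucine
```

The substitution happens to leave the protein unchanged because CGA and CGG both code for alanine, whereas the deletion shifts every later triplet.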
10. Multi-base Mutations
- Some genes can have multiple mutations at different
locations
- Complicates matters enormously
- Both in terms of functionality and for identification of
effects by behavioural geneticists
11. RNA
- Generally a single-stranded molecule, and shorter than DNA
- Contains ribose, not deoxyribose, making RNA less stable
- Complementary nucleotide to adenine is not thymine (as in DNA), but uracil (U)
- Various forms: mRNA, tRNA, rRNA, non-coding RNA
12. RNA and DNA
- RNA was actually the original genetic code
  - Still seen in most viruses
- But single strand vulnerable to predatory enzymes; double-stranded DNA gained selective advantage
- RNA degrades quickly, is tissue-, age-, and state-specific
13. Gene Expression
- Transcription
  - Production of mRNA in nucleus from DNA template
- Translation
  - Assembly of amino acids into peptide chains on basis of information encoded in mRNA
  - Occurs in ribosomes in cell cytoplasm
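The two stages can be sketched in Python as a pair of lookups. This is a simplified illustration under stated assumptions: the template sequence is an arbitrary example, `MRNA_CODONS` lists only the four mRNA codons it produces, strand direction is ignored, and the function names are hypothetical.

```python
# Transcription pairs each template DNA base with its RNA partner
# (uracil replaces thymine opposite adenine).
TRANSCRIBE = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Tiny subset of the mRNA codon table; None marks a stop signal.
MRNA_CODONS = {"AUG": "Met", "GCU": "Ala", "GAU": "Asp", "UAA": None}


def transcribe(template_dna: str) -> str:
    """Build the mRNA that is complementary to the DNA template."""
    return "".join(TRANSCRIBE[base] for base in template_dna)


def translate(mrna: str) -> list[str]:
    """Read codons in order and stop at the first stop signal."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = MRNA_CODONS[mrna[i:i + 3]]
        if amino_acid is None:
            break
        peptide.append(amino_acid)
    return peptide


template = "TACCGACTAATT"   # TAC, CGA, CTA, ATT on the DNA template
mrna = transcribe(template)
print(mrna, translate(mrna))
```

The template codons TAC, CGA, CTA, and ATT correspond to methionine, alanine, aspartic acid, and a stop signal in the DNA codon table, which matches the peptide read from the mRNA.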
14. mRNA
- mRNA exists only for a few minutes
  - Amount of protein produced depends on amount of mRNA available for translation
  - Protein production regulation
- mRNA carries information about a protein sequence to the ribosomes
  - About 100 amino acids added to protein per second
  - Proteins 100-1000 amino acids long
15. Transcription
16. Translation
17. Non-Coding RNA
- Most DNA transcribed into RNA that is not mRNA, so is called
non-coding RNA
- At least 50% of human genome is responsible for non-coding RNA
  - Much of this is involved in directly or indirectly regulating protein-coding genes
18. Introns
- One type of non-coding RNA
  - DNA sequences embedded in protein-coding genes
  - Transcribed into RNA, but spliced out before RNA leaves nucleus
  - From 50 to 20,000 base pairs long
- About 25% of human genome
19. Introns
- Used to be called junk DNA
- Introns can regulate transcription of genes in which they
reside
- In some cases can also regulate other genes
20. Exons
- What's left (and spliced back together) after introns are removed
- Usually only a few hundred base pairs long
21. MicroRNA
- Another class of non-coding RNA
- Usually only 21 base pairs long
  - DNA coding for them is about 80 base pairs
- Especially important for regulation of genes involved in primate nervous system
- Bind to, and thus silence, mRNA
- About 500 microRNAs identified, which regulate expression of over 30% of all coding mRNA
22. Gene Regulation
- Can be short-term or long-term
- Responsive to both environmental factors and expression of other genes
  - i.e., genes can turn each other on and off
23. Polymorphisms
- Genome is about 3 billion base pairs
- Millions of base pairs differ among individuals
- However, about 2 million base pairs differ among at least 1
percent of the population
- These are the DNA polymorphisms useful for behavioural
geneticists
24. Detecting Polymorphisms
- Traditionally, single genes were identified by their phenotypic protein outcome
- Based on the actual polymorphisms in the DNA
- Millions of DNA base sequences are polymorphic and can be used in genome-wide DNA studies
- Identify single-gene disorders
25. DNA Microarrays
- Surfaces the size of a postage stamp
- Hundreds of thousands of DNA sequences
- Serve as probes to detect gene expression or single-nucleotide
polymorphisms
26. Genetic Screens
- Test to identify individuals with a phenotype of interest
  - Find the genetic basis of a phenotype or trait
  - Identify mutant alleles in genes that are already known
  - Find the possible phenotypes that may derive from specific DNA sequences
- Traditionally, with non-humans, expose individuals to mutagens
to cause mutations, thereby increasing the frequency of unusual
alleles
27. Different Types of Screens
- Basic screens look for a phenotype of interest in the mutated
population
- Enhancer/suppressor screens used when an allele of a gene leads
to a weak mutant phenotype
  - A weak effect might be a damaged or abnormal limb, organ, or behaviour trait
  - A strong effect might be the total absence of said limb, organ, or behaviour
28. Classical Genetic Approach
- Map mutants by locating a gene on its chromosome through
crossbreeding studies
- Statistics on frequency of traits that co-occur are
utilized
- Now, SNPs (single nucleotide polymorphisms) are used for
mapping
29. Reverse Genetic Screens
- Find the effect of a gene sequence on phenotype
- Produce disruption in DNA, then look for effect on whole
organism
  - Random or directed deletions, insertions, and point mutations produce a mutagenized population
  - Screen population for specific change at the gene of interest
30. Directed Deletions and Point Mutations
- Individuals engineered to carry genes made inoperative (knocked out)
- Gene silencing (gene knockdown)
  - Uses double-stranded RNA to temporarily disrupt gene expression
  - Produces specific knockout effect without mutating the DNA of interest
- Overexpress normal gene, for example
31. Single Nucleotide Polymorphisms
- A variation in DNA sequence when a single nucleotide (A, T, C, G) in the genome differs between individuals or between paired chromosomes of an individual
  - Two alleles here: C and T
- Almost all common SNPs have only two alleles
- For a variation to be called an SNP it must occur in at least
1% of the population
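The 1% criterion can be written as a simple frequency check. The sketch below is an illustration only: it treats "occurs in at least 1% of the population" as the frequency of the less common allele among sampled chromosomes, and the function names and sample counts are invented.

```python
from collections import Counter


def minor_allele_frequency(alleles: list[str]) -> float:
    """Frequency of the second most common allele at one position."""
    counts = Counter(alleles).most_common()
    if len(counts) < 2:
        return 0.0  # no variation at this position
    return counts[1][1] / len(alleles)


def is_snp(alleles: list[str], threshold: float = 0.01) -> bool:
    """Apply the 'at least 1% of the population' rule from the slide."""
    return minor_allele_frequency(alleles) >= threshold


# Hypothetical samples of 1,000 chromosomes at a single position.
common_variant = ["C"] * 985 + ["T"] * 15
rare_variant = ["C"] * 998 + ["T"] * 2

print(is_snp(common_variant), is_snp(rare_variant))
```

Under this reading, the first sample (T in 1.5% of chromosomes) counts as a SNP and the second (0.2%) does not.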
32. Amino Acid Sequence
- SNPs won't necessarily change the amino acid sequence of a protein
  - Because several codons code for the same amino acid, a different SNP allele might still give the same amino acid
  - Synonymous: both forms produce the same polypeptide sequence
  - Nonsynonymous: different polypeptide sequences are produced
33. Coding Regions
- SNPs can exist in both protein coding and non-coding regions of
genome
- Even non-protein coding region SNPs can have effects
  - Transcription factor binding
  - Sequence of non-coding RNA
34. Example
- SNP in coding region with subtle/harmless protein change
- Change the GAU codon to GAG
  - Changes amino acid from aspartic acid to glutamic acid
  - Similar chemical properties, but glutamic acid is a bit bigger
- This change to a protein is unlikely to be crucial to its
function
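Whether a coding-region SNP alters the protein depends only on whether the old and new codons map to the same amino acid. A minimal Python sketch, assuming a four-entry excerpt of the mRNA codon table and a hypothetical helper `classify`:

```python
# Excerpt of the mRNA codon table covering the codons in this example.
MRNA_CODONS = {
    "GAU": "aspartic acid",
    "GAC": "aspartic acid",
    "GAA": "glutamic acid",
    "GAG": "glutamic acid",
}


def classify(reference: str, variant: str) -> str:
    """Label a codon change by its effect on the amino acid."""
    if MRNA_CODONS[reference] == MRNA_CODONS[variant]:
        return "synonymous"
    return "nonsynonymous"


print(classify("GAU", "GAG"))  # the change described on this slide
print(classify("GAU", "GAC"))  # a change that keeps aspartic acid
```

GAU to GAG swaps aspartic acid for glutamic acid, while GAU to GAC would leave the protein sequence untouched.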
35. Example
- SNP in coding region with harmful effect
- Changes one nucleotide base in coding region of hemoglobin beta
gene
  - Glutamic acid replaced by valine
  - Hemoglobin molecule no longer carrying oxygen as efficiently due to drastic change in protein shape
36. Latent Effects
- SNP in coding region only switching gene on under certain
conditions
- Under normal conditions, gene is switched off (is latent)
- Can activate under specific environmental conditions
  - E.g., exposure to precarcinogens or carcinogens
37. SNPs and Cancer
- SNP changes to genes for proteins regulating rate of absorbing,
binding, metabolizing, excreting precarcinogens or carcinogens
- Small changes can alter an individual's risk for cancer
- SNP does no harm itself under normal circumstances, only having an effect when person is exposed to a particular environmental agent
  - E.g., two people with different SNPs could both smoke, but only one develops cancer, responds to therapy, etc.
38. Smoking and Susceptibility
- Precarcinogens from tobacco enter lungs
  - Lodge in fat-soluble areas of cells
  - Bind to proteins converting precarcinogens to carcinogens
- Reactive molecules quickly eliminated
  - Detoxifying proteins make carcinogens water-soluble
  - Excreted in urine before (hopefully) damaging cell
39. SNP Variability
- Different SNPs may express hyperactive or lazy activator (or something in between)
  - The carcinogen-making protein
  - E.g., hyperactive ones could grab and convert more precarcinogens than usual or do it more rapidly
  - E.g., influence effectiveness of detoxifying enzymes
  - If more carcinogens build up in lungs, more damage to cells' DNA is caused
- Different SNPs could alter an individual's risk of lung cancer
40. Bladder Cancer
- Workers in dye industry exposed to arylamines
  - Have increased risk of bladder cancer
- In liver, an acetylator enzyme acts on arylamines, deactivating them for excretion
- SNPs produce several different slow forms of acetylator enzyme, keeping arylamines in liver for longer
  - More are converted to precarcinogens, increasing risk for cancer
41. Polygenetic Effect
- SNPs don't entirely explain this
- Not all individuals with slow acetylators exposed to arylamines are at increased risk of bladder cancer
  - About half of North American population has slow acetylators
  - Only 1 in 500 develop bladder cancer
- Other yet undiscovered genes and proteins involved
42. Drug Therapies
- SNPs could also explain different patient reactions to the same
drug treatment
- Many proteins interact with a drug
  - Transportation through body, absorption into tissues, metabolism into more active or toxic by-products, excretion
- Having SNPs in one or more of the proteins involved may alter the time the body is exposed to the active form of the drug
  - Could be applied to why individuals with behaviourally similar forms of schizophrenia can react very differently to the same drug therapy
43. SNPs and Gene Mapping
- SNPs are very common variations throughout the genome
- Relatively easy to measure
- Very stable across generations
- Contribute to understanding of complex gene interactions in
behaviours and behavioural disorders
44. By Association
- If SNP located close to gene of interest
- If gene passed from parent to child, SNP is likely passed
too
- Can infer that when the same SNP is found in a group of individuals' genomes, the associated gene is also present
45. Sequencing SNPs
- Sequence the genome of large numbers of people
- Compare base sequences to discover SNPs
- Goal is to generate a single map of human genome containing all
possible SNPs
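Comparing base sequences to find candidate SNPs amounts to looking for positions where aligned sequences disagree. The Python below is a toy sketch of that comparison, assuming short, already aligned sequences of equal length; the sample data and the function name `variable_sites` are invented, and real pipelines involve alignment and error handling not shown here.

```python
def variable_sites(sequences: list[str]) -> dict[int, set[str]]:
    """Map each position where the sequences differ to the bases seen there."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    sites = {}
    for position in range(length):
        bases = {seq[position] for seq in sequences}
        if len(bases) > 1:
            sites[position] = bases
    return sites


# Hypothetical aligned fragments from four people.
people = ["AAGCCTA", "AAGCTTA", "AAGCCTA", "AAGCTTA"]
print(variable_sites(people))
```

Only position 4 (counting from 0) varies in this sample, with alleles C and T; whether it qualifies as a SNP would still depend on its frequency in the wider population.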
46. SNP Profile
- Each individual has his or her own pattern of SNPs
- By studying SNP profiles in populations, correlations will emerge between specific SNP profiles and specific behaviour traits
  - E.g., specific responses to cancer treatments
47. What is a Gene?
- Gene from pangenesis (Darwin's mechanism of heredity)
- Greek: genesis (birth) or genos (origin)
- First coined by Wilhelm Johannsen in 1909
48. Central Dogma
- Information travels from DNA through RNA to protein
- Gene = DNA region expressed as mRNA, then translated into
polypeptide
49. Extended Dogma
- Transcribed mRNA produces single polypeptide chain (folds into
functional protein)
- This molecule performs discrete, discernible cellular
function
- Gene regulated by promoter and transcription-factor binding
sites on nearby DNA
50. Simplified Extended Dogma
- From: Seringhaus & Gerstein, 2008
51. Implications
- Gene named and classified by basic function
- Traditional classification systems
  - Broad functional categories (e.g., genes whose products catalyze a hydrolysis reaction) to specific functions (e.g., amylase describing specific break-down of starch)
- In 1950s: International Commission on Enzymes Classification,
Munich Information Center for Protein Sequences
52.
- One gene, one protein, one function
- Straightforward view of subcellular life
- Allows conception of single protein as indivisible unit in
larger cellular network
- When mapping genes across species, can assume a protein is
either fully preserved in organisms or entirely absent
- Allows easy grouping of related proteins in different
species
- Extended dogma includes regulation, function, and
conservation
53. Current View
- High-throughput experiments
  - Probe activity of millions of bases in genome simultaneously
- Much more complex than extended dogma
54. Creating RNA Transcript
- Genes only small fraction of human genome
- Genome pervasively transcribed (ENCODE Project, 2007)
- Non-genic (i.e., genome outside known gene boundaries)
transcription very widespread (even including pseudogenes)
- Function of non-gene transcribed material as yet unclear
55. Pseudogenes
- DNA sequences that look similar to functional genes, but
contain genetic lesions (e.g., truncations, premature stop codons),
disrupting ability to encode proteins or structural RNA
  - Long considered fossils of past genes
- However, recent work estimates that 5-20% of human pseudogenes
can be transcriptionally active (Zheng & Gerstein, 2007)
- Might achieve functionality via: fusing with mRNAs from nearby
functional genes to form chimeric RNAs, having RNA transcript that
has regulatory role, combining with new DNA to generate a new
gene
56. Introns/Exons
- Long understood that eukaryote genes are composed of short exons (coding regions of DNA) separated by long introns
- Introns transcribed to RNA that is spliced out before proteins
produced
- But, now known that splicing for a gene-containing locus can be
done in multiple ways
  - Individual exons left out of final product
  - Only portions of the sequence in an exon are preserved
  - Sequences from outside gene can be spliced in
- Result is many variants of a single gene
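The first of these mechanisms, leaving individual exons out, already multiplies the number of possible products. The Python sketch below enumerates them under a simplifying assumption made only for this example: the first and last exons are always kept and any internal exon may be skipped. Exon labels and the function name are placeholders.

```python
from itertools import combinations


def splice_variants(exons: list[str]) -> list[str]:
    """List transcripts formed by skipping any subset of internal exons."""
    first, *internal, last = exons
    variants = []
    # Keep k internal exons at a time, preserving their original order.
    for k in range(len(internal), -1, -1):
        for kept in combinations(internal, k):
            variants.append("-".join([first, *kept, last]))
    return variants


variants = splice_variants(["E1", "E2", "E3", "E4"])
print(variants)
```

Two skippable exons give 4 variants, and each additional skippable exon doubles the count; the other two mechanisms in the list (partial exons and sequences spliced in from outside the gene) would add further variants not modelled here.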
57. Example of Current View
- From: Seringhaus & Gerstein, 2008
58. Gene Regulation
- Protein-coding portion of gene and regulatory sequence in close proximity on chromosome
- Doesn't apply well to mammalian and other higher eukaryote systems
- Gene activity influenced by epigenetic modifications (changes
to DNA itself or to support structures of DNA)
- Genes can be regulated over 50,000 base pairs away, even beyond
adjacent genes
- Looping and folding of DNA can bring distant spans into close
proximity
59. DNA Folding
- From: Seringhaus & Gerstein, 2008
60. Implications
- Defining gene functionality much more difficult now
- Traditionally done by phenotypic effect
- Doesn't capture function on molecular level, though
- Also, pathways a gene product engages in within a cell
significant for understanding functionality
61. Classification
- Non-trivial problem in deciding which qualities of a gene and
its products to use
- Earlier approaches assumed simple hierarchical scheme
- Recent computer technologies offering solutions
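One way to see why a simple hierarchical scheme is limiting is to compare it with a graph in which a gene product can sit under several categories at once. The Python sketch below stores each scheme as a child-to-parents mapping. Amylase and hydrolysis come from the earlier classification slide; the other category names ("enzyme", "starch metabolism", "metabolic process") are hypothetical labels chosen for illustration.

```python
# Simple hierarchy: every term has at most one parent.
tree_parents = {
    "amylase": ["hydrolase"],
    "hydrolase": ["enzyme"],
    "enzyme": [],
}

# Directed acyclic graph: a term may have several parents.
dag_parents = {
    "amylase": ["hydrolase", "starch metabolism"],
    "hydrolase": ["enzyme"],
    "starch metabolism": ["metabolic process"],
    "enzyme": [],
    "metabolic process": [],
}


def is_simple_hierarchy(parents: dict[str, list[str]]) -> bool:
    """True when no term has more than one parent."""
    return all(len(p) <= 1 for p in parents.values())


def ancestors(term: str, parents: dict[str, list[str]]) -> set[str]:
    """Collect every category reachable by following parent links."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen


print(sorted(ancestors("amylase", tree_parents)))
print(sorted(ancestors("amylase", dag_parents)))
```

In the tree, amylase can only be found under one line of categories; in the graph, the same term is reachable from both the functional and the process branches.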
62. Directed Acyclic Graphs (DAGs)
- Simple hierarchy vs. DAG hierarchy (from: Seringhaus & Gerstein, 2008)
- In a simple hierarchy, each node has only one parent
- In the DAG approach, each node can have multiple parents
- Genes can be classified within multiple groups
63. Naming
- Cross-species gene identification difficult
- Often, traditionally, have different names for functionally
similar (or same) gene in different species
- Recent increases in computing power and genome sequencing
making homology mapping of similar genes across species
feasible
64. Example: Notch Pathway
- Highly conserved among species
- Notch encodes a receptor protein in fruit flies; a defective Notch gene produces a notched wing shape
- Traditional views of Notch pathway quite limited
- High throughput experiments in humans identifying many more
proteins involved in pathway
- Hypertext software now makes identifying connections
easier
65. Notch Pathway
- Traditional vs. current view (from: Seringhaus & Gerstein, 2008)