DNA and Gene Expression

PowerPoint slides

  • Upload
    pammy98

Page 1: PowerPoint slides

DNA and Gene Expression

Page 2: PowerPoint slides

Deoxyribonucleic Acid (DNA)

• Two phosphoric acid sugar strands held apart by pairs of four bases

– Adenine (A), thymine (T), guanine (G), cytosine (C)

– A pairs with T, G pairs with C

• Self-replicating molecule

• Directs synthesis of proteins for body
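
The pairing rule on this slide (A with T, G with C) is enough to derive one strand from the other. A minimal Python sketch, with function names chosen here purely for illustration:

```python
# Pairing rules from the slide: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complementary DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Complement read in the opposite direction (the strands are antiparallel)."""
    return complement(strand)[::-1]


print(complement("AAGCCTA"))          # TTCGGAT
print(reverse_complement("AAGCCTA"))  # TAGGCTT
```

Because each base has exactly one partner, applying `complement` twice returns the original strand, which is the property that makes self-replication possible.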

Page 3: PowerPoint slides

DNA Structure

<static.howstuffworks.com/gif/dna-2.jpg>

<static.howstuffworks.com/gif/dna-base-pairings.gif>

Page 4: PowerPoint slides

DNA Replication

• Results in two complete double helixes of DNA

• How nucleotides are added in DNA replication (animation)

Page 5: PowerPoint slides

Genome

• Maybe 30,000 genes on human genome

• Genes range from 1,000 to 2 million base pairs

Page 6: PowerPoint slides

Protein Synthesis

• 20 amino acids, despite 64 possible three-base combinations of the 4 bases (4 × 4 × 4 = 64); the code is redundant

• Codons

– Various sequences of three base pairs

– Each codes for an amino acid (or “stop” signal)

• Amino acids assembled into proteins

• Interestingly, only about 2% of genome involved in protein synthesis
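
The figure of 64 follows from choosing one of four bases at each of three codon positions. A short Python sketch that enumerates them (the helper name `all_codons` is illustrative only):

```python
from itertools import product

BASES = "ACGT"


def all_codons() -> list[str]:
    """Enumerate every three-base combination of the four DNA bases."""
    return ["".join(triplet) for triplet in product(BASES, repeat=3)]


codons = all_codons()
print(len(codons))  # 64
```

With 64 codons and only 20 amino acids plus stop signals to encode, most amino acids must be specified by more than one codon.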

Page 7: PowerPoint slides

Genetic Code

Amino Acid        Codons (DNA triplets)
Alanine           CGA, CGG, CGT, CGC
Arginine          GCA, GCG, GCT, GCC, TCT, TCC
Asparagine        TTA, TTG
Aspartic acid     CTA, CTG
Cysteine          ACA, ACG
Glutamic acid     CTT, CTC
Glutamine         GTT, GTC
Glycine           CCA, CCG, CCT, CCC
Histidine         GTA, GTG
Isoleucine        TAA, TAG, TAT
Leucine           AAT, AAC, GAA, GAG, GAT, GAC
Lysine            TTT, TTC
Methionine        TAC
Phenylalanine     AAA, AAG
Proline           GGA, GGG, GGT, GGC
Serine            AGA, AGG, AGT, AGC, TCA, TCG
Threonine         TGA, TGG, TGT, TGC
Tryptophan        ACC
Tyrosine          ATA, ATG
Valine            CAA, CAG, CAT, CAC
(Stop signals)    ATT, ATC, ACT

Page 8: PowerPoint slides

Mutations

• Mistakes made in copying DNA

• Produce different alleles (called polymorphisms)

• Mutations occurring in gametes will be transmitted faithfully unless natural selection intervenes

Page 9: PowerPoint slides

Single-Base Mutations

• Can either change or remove a base from a codon

• Changing one base for another

– Generally less likely to have an effect

• Removal of base

– More problematic, as it shifts the reading of the triplet code

– CGA-CTA-TGA --> CAC-TAT-GA…

– Alanine-aspartic acid-threonine --> valine-isoleucine…

• Changing amino acid

– No, small, or large effect on protein production
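
The frameshift on this slide can be reproduced in a few lines of Python. This is only a sketch: the table below holds just the five DNA triplets the example needs (taken from the Genetic Code slide), and the function names are made up for illustration.

```python
# Excerpt of the deck's DNA-triplet code table (only the codons used below).
CODON_TABLE = {
    "CGA": "alanine",
    "CTA": "aspartic acid",
    "TGA": "threonine",
    "CAC": "valine",
    "TAT": "isoleucine",
}


def translate(dna: str) -> list[str]:
    """Read complete triplets left to right; trailing partial codons are dropped."""
    usable = len(dna) - len(dna) % 3
    return [CODON_TABLE.get(dna[i:i + 3], "?") for i in range(0, usable, 3)]


def delete_base(dna: str, index: int) -> str:
    """Remove the base at the given 0-based position."""
    return dna[:index] + dna[index + 1:]


original = "CGACTATGA"
shifted = delete_base(original, 1)  # drop the G in the first codon

print(translate(original))  # ['alanine', 'aspartic acid', 'threonine']
print(shifted)              # CACTATGA
print(translate(shifted))   # ['valine', 'isoleucine']
```

Every codon downstream of the deletion is read in a new frame, which is why removing a base is usually more disruptive than swapping one.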

Page 10: PowerPoint slides

Multi-base Mutations

• Some genes can have multiple mutations at different locations

• Complicates matters enormously

• Both in terms of functionality and for identification of effects by behavioural geneticists

Page 11: PowerPoint slides

RNA

• Ribonucleic acid

• Differs from DNA

– Single-stranded molecule (generally) and is shorter

– Contains ribose, not deoxyribose, making RNA less stable

– Complementary nucleotide to adenine is not thymine (as in DNA), but uracil (U)

• Various forms: mRNA, tRNA, rRNA, non-coding RNA

Page 12: PowerPoint slides

RNA and DNA

• Actually, the original genetic code

– Still seen in most viruses

• But single strand vulnerable to predatory enzymes; double stranded DNA gained selective advantage

• RNA degrades quickly, is tissue-, age-, and state-specific

Page 13: PowerPoint slides

Gene Expression

• Transcription of gene

– Production of mRNA in nucleus from DNA template

• Translation

– Assembly of amino acids into peptide chains on basis of information encoded in mRNA

– Occurs in ribosomes in cell cytoplasm

– mRNA and tRNA
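
The two steps above can be sketched in Python. The sketch assumes the DNA triplets are written in the same orientation as the deck's code table, so the mRNA is the direct base-by-base complement with uracil in place of thymine. The three mRNA codons come from the standard genetic code, and all names are illustrative.

```python
# Template DNA base -> mRNA base (uracil replaces thymine in RNA).
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Tiny excerpt of the standard mRNA codon table (only what the example needs).
MRNA_CODONS = {"GCU": "alanine", "GAU": "aspartic acid", "ACU": "threonine"}


def transcribe(template_dna: str) -> str:
    """Transcription: build the mRNA complementary to the DNA template."""
    return "".join(DNA_TO_MRNA[base] for base in template_dna)


def translate(mrna: str) -> list[str]:
    """Translation: read the mRNA three bases at a time."""
    usable = len(mrna) - len(mrna) % 3
    return [MRNA_CODONS.get(mrna[i:i + 3], "?") for i in range(0, usable, 3)]


mrna = transcribe("CGACTATGA")
print(mrna)             # GCUGAUACU
print(translate(mrna))  # ['alanine', 'aspartic acid', 'threonine']
```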

Page 14: PowerPoint slides

mRNA

• mRNA exists only for a few minutes

– Amount of protein produced depends on amount of mRNA available for translation

– Protein production regulation

• mRNA carries information about a protein sequence to the ribosomes

– About 100 amino acids added to protein per second

– Proteins 100-1000 amino acids long

Page 15: PowerPoint slides

Transcription

• Transcription animation

Page 16: PowerPoint slides

Translation

• Translation video

Page 17: PowerPoint slides

Non-Coding RNA

• Most DNA transcribed into RNA that is not mRNA, so is called non-coding RNA

• At least 50% of human genome is responsible for non-coding RNA

– Much of this is involved in directly or indirectly regulating protein-coding genes

Page 18: PowerPoint slides

Introns

• One type of non-coding RNA

– DNA sequences embedded in protein-coding genes

– Transcribed into RNA, but spliced out before RNA leaves nucleus

– From 50 to 20,000 base pairs long

• About 25% of human genome

Page 19: PowerPoint slides

Introns

• Used to be called “junk” DNA

• Not the case at all

• Introns can regulate transcription of genes in which they reside

• In some cases can also regulate other genes

Page 20: PowerPoint slides

Exons

• What’s left (and spliced back together) after introns are removed

• Usually only a few hundred base pairs long

Page 21: PowerPoint slides

MicroRNA

• Another class of non-coding RNA

• Usually only 21 base pairs long

– DNA coding for them is about 80 base pairs

• Especially important for regulation of genes involved in primate nervous system

• Bind to, and thus silence, mRNA

• About 500 microRNA identified which regulate expression of over 30% of all coding mRNA

Page 22: PowerPoint slides

Gene Regulation

• Can be short-term or long-term

• Responsive to both environmental factors and expression of other genes

– i.e., genes can turn each other on and off

Page 23: PowerPoint slides

Polymorphisms

• Genome is about 3 billion base pairs

• Millions of base pairs differ among individuals

• However, about 2 million base pairs differ among at least 1 percent of the population

• These are the DNA polymorphisms useful for behavioural geneticists
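
To put the two figures on this slide in proportion, a quick calculation using only the numbers given above (the variable names are illustrative):

```python
GENOME_BP = 3_000_000_000         # about 3 billion base pairs
COMMON_POLYMORPHISMS = 2_000_000  # differ in at least 1% of the population

fraction = COMMON_POLYMORPHISMS / GENOME_BP
spacing = GENOME_BP // COMMON_POLYMORPHISMS

print(f"{fraction:.4%}")  # 0.0667%
print(spacing)            # 1500
```

On these figures, a common polymorphism turns up on average about once every 1,500 base pairs, assuming for simplicity that they are spread evenly.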

Page 24: PowerPoint slides

Detecting Polymorphisms

• Genetic markers

– Traditionally, single genes were identified by their phenotypic protein outcome

• DNA markers

– Based on the actual polymorphisms in the DNA

– Millions of DNA base sequences are polymorphic and can be used in genome-wide DNA studies

– Identify single-gene disorders

Page 25: PowerPoint slides

DNA Microarrays

• Gene chips

• Surfaces the size of a postage stamp

• Hundreds of thousands of DNA sequences

• Serve as probes to detect gene expression or single-nucleotide polymorphisms

• Fodor's gene chip

<http://www.bio.davidson.edu/Courses/genomics/chip/chipreal.html>

<http://learn.genetics.utah.edu/units/biotech/microarray/>

Page 26: PowerPoint slides

Genetic Screens

• Test to identify individuals with a phenotype of interest

• Forward genetic screens

– Find the genetic basis of a phenotype or trait

• Reverse genetic screens

– Identify mutant alleles in genes that are already known

– Find the possible phenotypes that may derive from specific DNA sequences

• Traditionally, with non-humans, expose individuals to mutagens to cause mutations, thereby increasing the frequency of unusual alleles

Page 27: PowerPoint slides

Different Types of Screens

• Basic screens look for a phenotype of interest in the mutated population

• Enhancer/suppressor screens used when an allele of a gene leads to a weak mutant phenotype

– A weak effect might be a damaged or abnormal limb, organ, behaviour trait

– A strong effect might be the total absence of said limb, organ, behaviour

Page 28: PowerPoint slides

Classical Genetic Approach

• Map mutants by locating a gene on its chromosome through crossbreeding studies

• Statistics on frequency of traits that co-occur are utilized

• Now, SNPs (single nucleotide polymorphisms) are used for mapping

Page 29: PowerPoint slides

Reverse Genetic Screens

• Find the effect of a gene sequence on phenotype

• Produce disruption in DNA, then look for effect on whole organism

– Random or directed deletions, insertions, and point mutations produce a mutagenized population

– Screen population for specific change at the gene of interest

Page 30: PowerPoint slides

Directed Deletions and Point Mutations

• Gene knockouts

– Used in yeast and mice

– Individuals engineered to carry genes made inoperative (“knocked out”)

• Gene silencing (“gene knockdown”)

– Uses double-stranded RNA to temporarily disrupt gene expression

– Produces specific knockout effect without mutating the DNA of interest

• Transgenic organisms

– Overexpress normal gene, for example

Page 31: PowerPoint slides

Single Nucleotide Polymorphisms

• SNPs

– A variation in DNA sequence when a single nucleotide (A, T, C, G) in the genome differs between individuals or between paired chromosomes of an individual

• AAGCCTA to AAGCTTA

– Two alleles here: C and T

• Almost all common SNPs have only two alleles

• For a variation to be called an SNP it must occur in at least 1% of the population
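
The AAGCCTA / AAGCTTA example can be checked mechanically. A minimal Python sketch that compares two aligned sequences of equal length; the function name is illustrative, and real data would first need alignment:

```python
def find_differences(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """Return (0-based position, base in seq_a, base in seq_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


print(find_differences("AAGCCTA", "AAGCTTA"))  # [(4, 'C', 'T')]
```

A single mismatch like this only counts as an SNP if the rarer allele reaches the 1% population threshold given above.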

Page 32: PowerPoint slides

Amino Acid Sequence

• SNPs won’t necessarily change the amino acid sequence of a protein

– Because the codon code is redundant, different SNP alleles can yield the same amino acid

• Synonymous SNPs

– Both forms produce same polypeptide sequence

– “Silent mutation”

• Non-synonymous SNPs

– Different polypeptide sequences are produced

Page 33: PowerPoint slides

Coding Regions

• SNPs can exist in both protein coding and non-coding regions of genome

• Even non-protein coding region SNPs can have effects

– Gene splicing

– Transcription factor binding

– Sequence of non-coding RNA

Page 34: PowerPoint slides

Example

• SNP in coding region with subtle/harmless protein change

• Change the GAU codon to GAG

– Changes amino acid from aspartic acid to glutamic acid

– Similar chemical properties, but glutamic acid is a bit bigger

• This change to a protein is unlikely to be crucial to its function
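
Whether an SNP is synonymous or non-synonymous comes down to a codon-table lookup. A Python sketch using four mRNA codons from the standard genetic code; the table is deliberately partial and the function name is illustrative:

```python
# Excerpt of the standard mRNA codon table (only the codons used below).
MRNA_CODONS = {
    "GAU": "aspartic acid",
    "GAC": "aspartic acid",
    "GAA": "glutamic acid",
    "GAG": "glutamic acid",
}


def classify_snp(codon_a: str, codon_b: str) -> str:
    """Label a codon change by whether the encoded amino acid changes."""
    same = MRNA_CODONS[codon_a] == MRNA_CODONS[codon_b]
    return "synonymous" if same else "non-synonymous"


print(classify_snp("GAU", "GAG"))  # non-synonymous (aspartic acid -> glutamic acid)
print(classify_snp("GAU", "GAC"))  # synonymous (both aspartic acid)
```

The GAU to GAG change on this slide is non-synonymous but mild, because the two amino acids are chemically similar. The lookup alone cannot tell how much a substitution matters.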

Page 35: PowerPoint slides

Example

• SNP in coding region with harmful effect

• Sickle-cell anemia

• Changes one nucleotide base in coding region of hemoglobin beta gene

– Glutamic acid replaced by valine

– Hemoglobin molecule no longer carrying oxygen as efficiently due to drastic change in protein shape

Page 36: PowerPoint slides

Latent Effects

• SNP in coding region only switching gene on under certain conditions

• Under normal conditions, gene is switched off (is latent)

• Can activate under specific environmental conditions

– E.g., exposure to precarcinogens or carcinogens

Page 37: PowerPoint slides

SNPs and Cancer

• SNP changes to genes for proteins regulating rate of absorbing, binding, metabolizing, excreting precarcinogens or carcinogens

• Small changes can alter an individual’s risk for cancer

• SNP does no harm itself under normal circumstances, only having an effect when person is exposed to a particular environmental agent

– E.g., two people with different SNPs could both smoke, but only one develops cancer, responds to therapy, etc.

Page 38: PowerPoint slides

Smoking and Susceptibility

• Precarcinogens from tobacco enter lungs

– Lodge in fat-soluble areas of cells

– Bind to proteins converting precarcinogens to carcinogens

• Reactive molecules quickly eliminated

– Detoxifying proteins make carcinogens water-soluble

– Excreted in urine before (hopefully) damaging cell

Page 39: PowerPoint slides

SNP Variability

• Different SNPs may express hyperactive or lazy activator (or something in between)

– The carcinogen-making protein

– E.g., Hyperactive ones could “grab” and convert more precarcinogens than usual or do it more rapidly

– E.g., Influence effectiveness of detoxifying enzymes

– If more carcinogens build up in lungs, more damage to cells’ DNA is caused

• Different SNPs could alter individuals’ risk of lung cancer

Page 40: PowerPoint slides

Bladder Cancer

• Workers in dye industry exposed to arylamines

– Have increased risk of bladder cancer

• SNPs may be involved

• In liver, an acetylator enzyme acts on arylamines, deactivating them for excretion

• SNPs produce several different slow forms of acetylator enzyme, keeping arylamines in liver for longer

– More are converted to precarcinogens, increasing risk for cancer

Page 41: PowerPoint slides

Polygenetic Effect

• SNPs don’t entirely explain this

• Not all individuals with slow acetylators exposed to arylamines are at increased risk of bladder cancer

– About half of North American population has slow acetylators

– Only 1 in 500 develop bladder cancer

• Other yet undiscovered genes and proteins involved

Page 42: PowerPoint slides

Drug Therapies

• SNPs could also explain different patient reactions to the same drug treatment

• Many proteins interact with a drug

– Transportation through body, absorption into tissues, metabolism into more active or toxic by-products, excretion

• Having SNPs in one or more of the proteins involved may alter the time the body is exposed to the active form of the drug

– Could be applied to why individuals with behaviourally similar forms of schizophrenia can react very differently to the same drug therapy

Page 43: PowerPoint slides

SNPs and Gene Mapping

• SNPs are very common variations throughout the genome

• Relatively easy to measure

• Very stable across generations

• Useful as gene markers

• Contribute to understanding of complex gene interactions in behaviours and behavioural disorders

Page 44: PowerPoint slides

By Association

• Works when SNP is located close to gene of interest

• If gene is passed from parent to child, SNP is likely passed too

• Can infer that when the same SNP is found in a group of individuals’ genomes, the associated gene is also present

Page 45: PowerPoint slides

Sequencing SNPs

• Sequence the genome of large numbers of people

• Compare base sequences to discover SNPs

• Goal is to generate a single map of human genome containing all possible SNPs
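
The comparison step can be sketched as a per-position allele count across many aligned sequences, keeping positions where the second most common base reaches the 1% threshold. This toy Python example assumes pre-aligned sequences of equal length. The function name and sample data are illustrative, not a real pipeline.

```python
from collections import Counter


def find_snps(sequences: list[str], min_freq: float = 0.01) -> dict[int, float]:
    """Map each variable position to its minor-allele frequency (if >= min_freq)."""
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    total = len(sequences)
    snps = {}
    for pos in range(len(sequences[0])):
        # Alleles at this position, most common first.
        counts = Counter(seq[pos] for seq in sequences).most_common()
        if len(counts) < 2:
            continue  # everyone carries the same base here
        minor_freq = counts[1][1] / total
        if minor_freq >= min_freq:
            snps[pos] = minor_freq
    return snps


sample = ["AAGCCTA"] * 3 + ["AAGCTTA"]
print(find_snps(sample))  # {4: 0.25}
```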

Page 46: PowerPoint slides

SNP Profile

• Each individual has his or her own pattern of SNPs

– “SNP profile”

• By studying SNP profiles in populations, correlations will emerge between specific SNP profiles and specific behaviour traits

– E.g., specific responses to cancer treatments

Page 47: PowerPoint slides

What is a Gene?

• “Gene” from “pangenesis” (Darwin’s mechanism of heredity)

• Greek: genesis (“birth”) or genos (“origin”)

• First coined by Wilhelm Johannsen in 1909

Page 48: PowerPoint slides

Central Dogma

• One gene, one protein

• Information travels from DNA through RNA to protein

• Gene = DNA region expressed as mRNA, then translated into polypeptide

• View held through 1960s

Page 49: PowerPoint slides

Extended Dogma

• Transcribed mRNA produces single polypeptide chain (folds into functional protein)

• This molecule performs discrete, discernible cellular function

• Gene regulated by promoter and transcription-factor binding sites on nearby DNA

Page 50: PowerPoint slides

Simplified Extended Dogma

From: Seringhaus & Gerstein, 2008

Page 51: PowerPoint slides

Implications

• Nomenclature

– Gene named and classified by basic function

• Traditional classification systems

– Vertically hierarchical

– Broad functional categories (e.g., genes whose products catalyze a hydrolysis reaction) to specific functions (e.g., “amylase” describing specific break-down of starch)

• In 1950s: International Commission on Enzymes Classification, Munich Information Center for Protein Sequences

Page 52: PowerPoint slides

• One gene, one protein, one function

• Straightforward view of subcellular life

• Allows conception of single protein as indivisible unit in larger cellular network

• When mapping genes across species, can assume a protein is either fully preserved in organisms or entirely absent

• Allows easy grouping of related proteins in different species

• Extended dogma includes regulation, function, and conservation

Page 53: PowerPoint slides

Current View

• High-throughput experiments

– Probe activity of millions of bases in genome simultaneously

• Much more complex than extended dogma

Page 54: PowerPoint slides

Creating RNA Transcript

• Genes only small fraction of human genome

• Genome pervasively transcribed (ENCODE Project, 2007)

• Non-genic (i.e., genome outside known gene boundaries) transcription very widespread (even including “pseudogenes”)

• Function of non-gene transcribed material as yet unclear

Page 55: PowerPoint slides

Pseudogenes

• DNA sequences that look similar to functional genes, but contain genetic lesions (e.g., truncations, premature stop codons), disrupting ability to encode proteins or structural RNA

– Long considered “fossils” of past genes

• However, recent work estimates that 5-20% of human pseudogenes can be transcriptionally active (Zheng & Gerstein, 2007)

• Might achieve functionality via: fusing with mRNAs from nearby functional genes to form chimeric RNAs, having RNA transcript that has regulatory role, combining with new DNA to generate a new gene

Page 56: PowerPoint slides

Introns/Exons

• Long understood that eukaryote genes composed of short exons, coding regions of DNA separated by long introns

• Introns transcribed to RNA that is spliced out before proteins produced

• But, now known that splicing for a gene-containing locus can be done in multiple ways

– Individual exons left out of final product

– Only portions of the sequence in an exon are preserved

– Sequences from outside gene can be spliced in

• Result is many variants of a single gene
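
To see how quickly variants multiply, the Python sketch below models only the first mechanism (whole exons left out) and lists every ordered, non-empty combination of exons. The exon sequences are invented placeholders, and real splicing is far more constrained than this enumeration suggests.

```python
from itertools import combinations


def exon_skipping_variants(exons: list[str]) -> list[str]:
    """List every transcript formed by keeping a non-empty subset of exons, in order."""
    variants = []
    for keep in range(1, len(exons) + 1):
        for subset in combinations(exons, keep):
            variants.append("".join(subset))
    return variants


# Hypothetical exons, for illustration only.
exons = ["AUGGCU", "GAU", "ACUUAA"]
variants = exon_skipping_variants(exons)

print(len(variants))  # 7
print(variants[-1])   # AUGGCUGAUACUUAA (all three exons kept)
```

With n exons this simple model gives 2^n - 1 possible products, so even a modest gene could in principle yield many distinct transcripts.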

Page 57: PowerPoint slides

Example of Current View

From: Seringhaus & Gerstein, 2008

Page 58: PowerPoint slides

Gene Regulation

• Traditional view

– Protein-coding portion of gene and regulatory sequence in close proximity on chromosome

• Doesn’t apply well to mammalian and other higher eukaryote systems

• Gene activity influenced by epigenetic modifications (changes to DNA itself or to support structures of DNA)

• Genes can be regulated over 50,000 base pairs away, even beyond adjacent genes

• Looping and folding of DNA can bring distant spans into close proximity

Page 59: PowerPoint slides

DNA Folding

From: Seringhaus & Gerstein, 2008

Page 60: PowerPoint slides

Implications

• Defining gene functionality much more difficult now

• Traditionally done by phenotypic effect

• Doesn’t capture function on molecular level, though

• Also, pathways a gene product engages in within a cell significant for understanding functionality

Page 61: PowerPoint slides

Classification

• Non-trivial problem in deciding which qualities of a gene and its products to use

• Earlier approaches assumed simple hierarchical scheme

• No longer so simple

• Recent computer technologies offering solutions

Page 62: PowerPoint slides

Directed Acyclic Graphs (DAGs)

Figure panels: Simple hierarchy, DAG hierarchy

In a simple hierarchy each node has only one “parent”.

In the DAG approach each node can have multiple “parents”. Genes can be classified within multiple groups.

From: Seringhaus & Gerstein, 2008
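
The difference between the two schemes is easy to show as a data structure. In this Python sketch each node maps to a list of its parents. The category labels are hypothetical, loosely based on the amylase example from the earlier Implications slide, and the function names are illustrative.

```python
def is_simple_hierarchy(parents: dict[str, list[str]]) -> bool:
    """True when no node has more than one parent (a tree-like scheme)."""
    return all(len(ps) <= 1 for ps in parents.values())


def ancestors(node: str, parents: dict[str, list[str]]) -> set[str]:
    """Collect every category reachable by following parent links upward."""
    seen: set[str] = set()
    stack = list(parents.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


# Hypothetical classifications, for illustration only.
tree = {"amylase": ["hydrolase"], "hydrolase": ["enzyme"]}
dag = {
    "amylase": ["hydrolase", "starch metabolism"],
    "hydrolase": ["enzyme"],
    "starch metabolism": ["metabolic process"],
}

print(is_simple_hierarchy(tree))        # True
print(is_simple_hierarchy(dag))         # False
print(sorted(ancestors("amylase", dag)))
```

In the DAG, the same gene product belongs to both a molecular-function branch and a process branch, which a single-parent hierarchy cannot express.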

Page 63: PowerPoint slides

Naming

• Cross-species gene identification difficult

• Naming inconsistent

• Often, traditionally, have different names for functionally similar (or same) gene in different species

• Recent increases in computing power and genome sequencing making homology mapping of similar genes across species feasible

Page 64: PowerPoint slides

Example: Notch Pathway

• Highly conserved among species

• Notch encodes a receptor protein in fruit flies; a defective copy produces a notched wing shape

• Traditional views of Notch pathway quite limited

• High-throughput experiments in humans identifying many more proteins involved in pathway

• Hypertext software now makes identifying connections easier

Page 65: PowerPoint slides

Notch Pathway

Figure panels: Traditional, Current

From: Seringhaus & Gerstein, 2008