BioSci 145B lecture 3 page 1 ©copyright Bruce Blumberg 2004. All rights reserved

BioSci 145B Lecture #3 4/19/2005

• Bruce Blumberg
  – 2113E McGaugh Hall - office hours Wed 10-11 AM (or by appointment)
  – phone 824-8573
  – [email protected]

• TA – Suman Verma
  – 2113 McGaugh Hall, 924-6873, 3116

• lectures will be posted on web pages
  – http://eee.uci.edu/05s/05705/ - link only here
  – http://blumberg-serv.bio.uci.edu/bio145b-sp2005
  – http://blumberg.bio.uci.edu/bio145b-sp2005

• Many people have not spoken to me about term paper topics
  – By the end of the day today, I need to have topics for everyone
  – Please e-mail me your topic

Genome mapping (contd)

• Radiation hybrid mapping
  – Old but very useful technique
    • Lethally irradiate cells with X-rays
    • Fuse with cells of another species, e.g., blast human cells then fuse with hamster cells
      – Chunks of human DNA will remain in the hamster cells
    • Expand colonies of cells to get a collection of cell lines, each containing a single chunk of human DNA
    • Collection = RH panel
  – Now map markers onto these RH panels
    • Can identify which of any type of markers map together
      – STS, EST (very commonly used), etc
    • Can then map others by linkage to the ones you have mapped
  – Compare RH panel with other maps
    • Utility – great for cloning gaps in other maps
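
The idea that markers "map together" on an RH panel can be sketched in a few lines. This is a minimal illustration, not a mapping program: the panel data and marker names below are hypothetical, and real RH mapping uses statistical linkage measures (e.g., LOD scores) rather than the simple agreement count shown here.

```python
from itertools import combinations

def coretention(a, b):
    """Fraction of hybrid cell lines in which two markers agree
    (both detected or both absent)."""
    if len(a) != len(b):
        raise ValueError("retention vectors must cover the same panel")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def rank_pairs(panel):
    """Return (score, marker1, marker2) tuples, best-agreeing pair first."""
    pairs = [(coretention(panel[m], panel[n]), m, n)
             for m, n in combinations(sorted(panel), 2)]
    return sorted(pairs, reverse=True)

# Hypothetical 8-line RH panel: 1 = marker detected in that line, 0 = not detected.
panel = {
    "STS_A": [1, 0, 1, 1, 0, 0, 1, 0],
    "EST_B": [1, 0, 1, 1, 0, 0, 0, 0],
    "STS_C": [0, 1, 0, 0, 1, 1, 0, 1],
}

for score, m, n in rank_pairs(panel):
    print(f"{m} / {n}: {score:.3f}")
```

Markers that are retained in the same cell lines (here STS_A and EST_B) probably sit on the same chunk of human DNA, so they are candidates for being close together.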

Genome mapping (contd)

• How should maps be made with current knowledge?
  – All methods have strengths and weaknesses – must integrate data for useful map
    • e.g., RH panel, BAC maps, STS, ESTs
  – Size and complexity of genome is important
    • More complex genomes require more markers and time mapping
  – Breakpoints and markers are mapped relative to each other
  – Maps need to be defined by markers (cities, lakes, roads in analogy)
  – Key part of making a finely detailed map is construction of genomic libraries and cell lines for common use
    • Efforts by many groups increase resolution and utility of maps

• Current strategies
  – BAC end sequencing
  – Whole genome shotgun sequencing
  – EST sequencing
  – Mapping of above to RH panels

Sequence stability in E. coli

• How can we be sure that our genomic and cDNA libraries do not change during growth and screening?
  – Use the correct bacterial strain
  – Check for errors (look at multiple colonies)

• What are the sorts of factors that might modulate whether a sequence can be stably propagated in E. coli?
  – 1 toxicity
  – 2 restriction
  – 3 recombination

Sequence stability in E. coli

• toxicity
  – sequence may lead to the production of a toxic product or toxic levels of an otherwise innocuous product
  – more problematic with cDNA than genomic clones

• restriction - Raleigh 1987 Meth. Enzymol. 152, 130-141
  – virtually all microorganisms have systems to destroy non-endogenous DNA (host range restriction)
    • four classes of restriction endonucleases
  – very important for cloning purposes are recently discovered systems that degrade DNA containing 5-methyl cytosine or 6-methyl adenine.
  – If you are cloning genomic DNA, or hemimethylated cDNA these are very important!
    • virtually all eukaryotic DNA contains 5-methyl cytosine and/or 6-methyl adenine
      – mcrA,B,C - methylcytosine
      – mrr - methyl adenine

Sequence stability in E. coli (contd)

• Restriction (contd)
  – foreign DNA escapes restriction at a frequency of ~1/10^5 for EcoK and EcoB, 1/10 for mcrA.
  – one needs to be conscious of the mcr and mrr restriction status of strains and packaging extracts to be used.

• Recombination - Wyman and Wertman (1987) Meth Enzymol 152, 173-180
  – genomic DNA contains lots of repeated sequences
    • direct repeats
    • inverted repeats
    • interspersed repeats (e.g. Alu)
  – repeated sequences unstable in recombination proficient E. coli if in:
    • lambda
    • plasmid
    • cosmid
  – seems not to apply to single copy vectors such as BAC, PAC, fosmid
    • What does this imply?
      – Recombination is an intermolecular process
  – ~30% of the human genome is unstable in plasmid or phage clones
    • phages with such sequences don’t grow or get shorter with time

Sequence stability in E. coli (contd)

• Recombination (contd)
  – E. coli has a variety of recombination pathways. These are the major players in causing sequence underrepresentation
    • recA required for all pathways
    • recBCD - major recombination pathway
    • sbcB,C - suppressor of B,C
    • minor pathways
      – recE
      – recF
      – recJ
    • rule of thumb - the more recombination pathways mutated, the sicker the cells and the slower they grow
  – major players for inverted repeats are recBCD and sbc
  – recA is most important for stabilizing direct repeats and preventing plasmid concatamerization

Sequence stability in E. coli (contd)

• Plating a genomic library
  – whenever possible, select a cell type that is recA, recD, sbcB and deficient in all restriction systems.
    • Conveniently, EcoK, mcrB,C and mrr are all linked and often deleted together in strains
    • can get more than 100 fold difference in numbers of phage between wild type and recombination deficient
  – recD is preferred over recB,C because the recD mutation promotes rolling circle replication in lambda, which improves yields

What do I need to know about E. coli genetics?

• You look in a supplier’s catalog and see lots of E. coli with different genotypes of the following general form:
  – F’{lacIq Tn10 (TetR)} mcrA, Δ(mrr-hsdRMS-mcrBC), Φ80lacZΔM15, ΔlacX74, deoR, recA1, araD139, Δ(ara-leu)7697, galU, galK, rpsL(StrR), endA1, nupG

• Does this make any difference for your experiments?
  – Or should you simply follow the supplier’s instructions?
  – Or just use whatever people in the next lab are using without thinking about it?
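
One way to make the question concrete is to check a catalog genotype against the markers recommended for plating a genomic library (recA, recD, sbcB, and deficiency in the restriction systems). The sketch below is only an illustration: the list of wanted loci is my reading of the recommendation, and the plain substring match assumes the usual convention that a locus named in a genotype is mutant. It would be fooled by wild-type notations such as `recA+`.

```python
# Genotype string copied from the slide (straight apostrophe used for F').
GENOTYPE = ("F'{lacIq Tn10 (TetR)} mcrA, Δ(mrr-hsdRMS-mcrBC), Φ80lacZΔM15, "
            "ΔlacX74, deoR, recA1, araD139, Δ(ara-leu)7697, galU, galK, "
            "rpsL(StrR), endA1, nupG")

# Loci we would like to see mutated or deleted (illustrative list, an assumption).
WANTED = ["recA", "recD", "sbcB", "mcrA", "mcrBC", "mrr", "hsdR"]

def missing_markers(genotype, wanted=WANTED):
    """Return the wanted loci that are not mentioned in the genotype string."""
    return [locus for locus in wanted if locus not in genotype]

print(missing_markers(GENOTYPE))
```

For this strain the check reports that recD and sbcB are not listed, so it is restriction deficient and recA but not the ideal host for unstable genomic inserts.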

What do I need to know about E. coli genetics?

• F’{lacIq Tn10 (TetR)} mcrA, Δ(mrr-hsdRMS-mcrBC), Φ80lacZΔM15, ΔlacX74, deoR, recA1, araD139, Δ(ara-leu)7697, galU, galK, rpsL(StrR), endA1, nupG

• restriction systems
  – mcrA - cuts Cm5CGG
  – mcrB,C - complex cuts at Gm5C
  – mrr - restricts 6-methyl adenine containing DNA
  – Why are these important?
    • Most eukaryotic DNA is methylated!
  – hsdRMS - EcoK restriction system
    • R cuts 5'-AAC(N)6GTGC-3'
    • M/S methylates A residues in this sequence
    • Why methylate the DNA?
      – Protects own DNA from digestion

• for stability of long repeated sequences
  – recA1 - deficient in general recombination
  – recD - deficiency in Exonuclease V
  – sbcB,C - Exonuclease I
  – deoR - allows uptake of large DNA
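
To see whether an insert carries an EcoK target, one can scan it for the recognition sequence 5'-AAC(N)6GTGC-3' given above. This is a minimal sketch: the function name is hypothetical, and the second pattern is simply the reverse complement of the first so that sites on the other strand are also reported.

```python
import re

# Lookaheads so that overlapping sites are all reported.
ECOK_TOP = re.compile(r"(?=AAC[ACGT]{6}GTGC)")      # 5'-AAC(N)6GTGC-3'
ECOK_BOTTOM = re.compile(r"(?=GCAC[ACGT]{6}GTT)")   # same site on the opposite strand

def ecok_sites(seq):
    """Return sorted 0-based start positions of EcoK recognition sites in seq."""
    seq = seq.upper()
    hits = [m.start() for m in ECOK_TOP.finditer(seq)]
    hits += [m.start() for m in ECOK_BOTTOM.finditer(seq)]
    return sorted(hits)

print(ecok_sites("GGGAACTTTTTTGTGCCCC"))
```

An insert with such sites, grown in an hsdR+ host without the protective methylation, would be a target for restriction.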

What do I need to know about E. coli genetics? (contd)

• for lac color selection
  – lacZ ΔM15 either on F’ or on Φ80 prophage
  – lacIq - overexpression of lac repressor. Prevents leaky expression of promoters containing lac operator

• for high quality DNA preps
  – recA1 - deficient in general recombination
  – endA1 - deficient in endonuclease I

• if you buy ESTs from Research Genetics (Invitrogen) or Open Biosystems
  – tonA - resistant to bacteriophage T1

• for recombinant protein expression
  – lon - protease deficiency
  – OmpT - protease found in periplasmic space
  – most important protease inhibitor for E. coli protein preps is pepstatin A

What do I need to know about E. coli genetics? (contd)

• suppressors
  – supE - inserts glutamine at UAG (amber) codons
  – supF - inserts tyrosine at UAG (amber) codons
    • many older phages have S100am which can only be suppressed by supF
      – λZAP, λgt11, λZipLOX

Construction of cDNA libraries

• What is a cDNA library?
  – Collection of DNA copies representing the expressed mRNA population of a cell, tissue, organ or embryo

• What are they good for?
  – Identifying and isolating expressed mRNAs
  – functional identification of gene products
  – cataloging expression patterns for a particular tissue
    • EST sequencing and microarray analysis
  – Mapping gene boundaries
    • Promoters
    • Alternative splicing

Determinants of library quality

• What constitutes a full-length cDNA?
  – Strictly, it is an exact copy of the mRNA
  – full-length protein coding sequence considered acceptable for most purposes

• mRNA
  – full-length, capped mRNAs are critical to making full-length libraries
  – cytoplasmic mRNAs are best – WHY?
    • They are processed, i.e., introns removed and pA+ added

• 1st strand synthesis
  – complete first strand needs to be synthesized
  – issues about enzymes

• 2nd strand synthesis
  – thought to be less difficult than 1st strand (probably not)

• choice of vector
  – plasmids are best for EST sequencing
  – phages are best for manual screening

• how will library quality be evaluated
  – test with 2, 4, 6, 8 kb probes to ensure that these are well represented

cDNA synthesis

• Scheme
  – mRNA is isolated from source of interest
  – 1-10 μg are denatured and annealed to primer containing d(T)nV
    • To minimize length of poly A tail in libraries for sequencing
  – reverse transcriptase copies mRNA into cDNA
  – DNA polymerase I and RNase H convert remaining mRNA into DNA
  – cDNA is rendered blunt ended
  – linkers or adapters are added for cloning
  – cDNA is ligated into a suitable vector
  – vector is introduced into bacteria

• Caveats
  – there is lots of bad information out there
    • much is derived from vendors who want to increase sales of their enzymes or kits
  – not all manufacturers make equal quality enzymes
  – most kits are optimized for speed at the expense of quality
  – small points can make a big difference in the final outcome
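
The d(T)nV primer in the scheme above is a mixture, not a single oligo. In IUPAC notation V stands for A, C or G (anything but T), so the last base can only pair at the junction between the 3' UTR and the poly A tail, which keeps the cloned tail short. A minimal sketch that expands the mixture; the default n = 20 is an assumption, since the slide does not give a length:

```python
def anchored_oligo_dt(n=20):
    """Expand d(T)nV: n thymidines followed by one non-T anchor base
    (IUPAC V = A, C or G)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return ["T" * n + anchor for anchor in "ACG"]

for primer in anchored_oligo_dt(20):
    print(primer)
```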

cDNA synthesis (contd)

• Preparation of mRNA
  – want minimum of non poly A+ mRNAs
  – affinity chromatography on oligo d(T) or (U)
    • Oligo d(T)30 latex (Nippon Roche) works best overall (a.k.a. OligoTex Qiagen)
    • 2 successive runs gives ~90% pure A+ mRNA

• denaturation of mRNA
  – critical step
  – most protocols use heat denaturation
    • Heat RNA in the presence of metal ions = chemical cleavage!
  – CH3HgOH is method of choice for best libraries
    • Potent, reversible denaturant
    • But VERY TOXIC!

cDNA synthesis (contd)

• First strand synthesis - lots of misinformation about enzymes
  – reverse transcriptase contains 2 subunits
    • polymerase
    • RNase H - critical for processivity of the enzyme!
      – What is processivity?
        » How many nt are polymerized before falling off
  – Manufacturers prefer to sell MMLV RNase H- RT – cloned and cheap
  – best enzyme for 1st strand synthesis is AMV RT from Seikagaku America
    • But not best overall
  – thought that 1st strand is main failure point in cDNA synthesis - NOT
  – addition of 0.6M trehalose to AMV reactions increases yield
    • allows rxns to run at ~60° C
  – Betaine is very big help for MMLV RT
  – A recent publication claimed that betaine plus trehalose was the way to go for library making
    • Is it?

cDNA synthesis (contd)

• Example of comparisons between enzymes and buffers
  – Mfg supplied buffers NOT optimal
  – Literature references not optimal either
  – Enzymes vary a lot
    • Between AMV and MMLV
    • And AMV between suppliers

[Figure: comparison of AMV, Superscript and Sigma AMV reverse transcriptases; lanes labeled R, B, T, both]

cDNA synthesis (contd)

• 2nd strand
  – must remove mRNA
  – best way is with RNase H so that fragments serve as primers for DNA pol I
    • Gubler and Hoffman (1983) Gene 25, 263
  – in my experience, 2nd strand synthesis is the point of failure in cDNA synthesis
    • virtually all kits shortcut this step (1-2 hrs)
    • should be overnight
    • recent improvement is to use thermostable RNase H, DNA ligase and DNA polymerase to maximize production of 2nd strand.

[Figure: mRNA:cDNA hybrid]

cDNA synthesis (contd)

• Cloning
  – after 2nd strand is made, the ends must be blunted and linkers or adapters added
    • usually T4 DNA polymerase WHY?
      – Very strong proofreading activity
  – perfect cDNAs will retain 2-20 bp of RNA at the 5’ end.
    • Linkers can not be added to this by any DNA ligase!
    • T4 RNA ligase ligates DNA-RNA and stimulates blunt end ligation 10x
    • no commercial products use T4 RNA ligase so it is no wonder that full-length cDNAs are lost
  – if internal restriction sites have not been protected, they need to be methylated now before linkers are added.
    • Most methylase preps are not clean

[Figure: mRNA, 1st strand cDNA, and 2nd strand cDNA with residual mRNA]

Full-length mRNA isolation and cDNA synthesis

• Ways to capture cap structures and presumably full-length mRNAs
  – affinity chromatography with eIF-4E (cap binding protein, a.k.a. Capture)
  – selection with antibody to cap structure
  – oligo capping
  – biotinylated cap trapper

• 5’ oligo capping - Maruyama, K., and Sugano, S. (1994). Gene 138, 171-4.
  – uncapped mRNAs are dephosphorylated so that they cannot be ligated
  – cap structure is removed, only previously capped mRNAs have 5’ PO4
  – RNA ligase can ligate a 5’-OH oligo to the 5’ end of the mRNA
  – This can be used to prime 2nd strand synthesis

[Figure: oligo capping scheme
  Classes of starting RNA: 5' ends Gppp (capped), ppp, pp, p, HO
  BAP treatment: capped end stays Gppp; all other ends become HO
  TAP treatment: after TAP, only previously capped mRNAs carry phosphate (p)
  RNA ligase + (r-oligo)OH: only previously capped mRNAs will accept a linker ligation]
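
The logic of oligo capping can be followed as a small state change on the 5' end of each RNA class. This is only a sketch of the scheme described above, with a hypothetical function name; the labels match the 5' ends shown in the figure (Gppp, ppp, pp, p, HO).

```python
def oligo_cap(five_prime_end):
    """Follow one RNA 5' end through BAP, TAP and RNA ligase.
    Returns True if the RNA ends up carrying the ligated oligo."""
    end = five_prime_end
    # BAP (phosphatase) strips exposed 5' phosphates; the cap protects capped mRNA.
    if end in ("ppp", "pp", "p"):
        end = "HO"
    # TAP (pyrophosphatase) removes the cap, leaving a 5' monophosphate.
    if end == "Gppp":
        end = "p"
    # RNA ligase needs a 5' phosphate on the mRNA to join the oligo.
    return end == "p"

for start in ["Gppp", "ppp", "pp", "p", "HO"]:
    print(start, "->", "oligo added" if oligo_cap(start) else "no ligation")
```

Only the RNA that started with a cap receives the oligo, which is the whole point of doing the BAP step before the TAP step.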

Full-length mRNA isolation and cDNA synthesis (contd)

• 5’ oligo capping (contd)
  – advantages
    • very simple
    • no homopolymeric regions to worry about
    • can put arbitrary sequence at 5’ end.
      – Enables custom vector construction
      – also enables PCR to make driver for normalization
  – disadvantages
    • cap trapper paper claims this method only gives 70% full-length cDNAs
    • high quality TAP is not easy to find
    • original paper used PCR between 5’ and 3’ primer to make cDNAs – PCR => bias!

Full-length mRNA isolation and cDNA synthesis (contd)

• Cap trapping - Carninci, P. et al. (1996) Genomics 37: 327-336.
  – biotin residue is chemically added to the cap structure
  – approach
    • 1st strand cDNA is synthesized
    • treatment with RNase I cuts any cDNA:mRNA duplexes which are not absolutely complete
    • complete cDNAs are isolated by streptavidin chromatography
    • RNA is hydrolyzed
    • cDNA is tailed with dG
      – What are pitfalls of this?
        » Homopolymers troublesome
    • 2nd strand synthesis is primed with dC
    • adapter added
    • cloned

Full-length mRNA isolation and cDNA synthesis (contd)

• Cap trapping (contd)
  – advantages
    • claimed to give 90% recovery of full-length cDNAs
    • lots of history at RIKEN
  – disadvantages
    • homopolymeric region
    • many steps -> points of failure

Full-length mRNA isolation and cDNA synthesis (contd)

• Cloning of cDNAs
  – most methods require linker or adapter addition followed by restriction digestion
  – relies on methylation to protect internal sites or use of rare cutters
  – A new alternative is ExoIII-mediated subcloning
    • no methylation
    • no restriction digestion
    • no ligation
    • no multimerization of vector or inserts
    • 100% oriented

Vectors for cDNA cloning

• Plasmids vs phage
  – phage preferred for high density manual screening
  – plasmids are better for functional screening
    • microinjection
    • transfection
    • panning
  – phage packaging and infection more efficient than electroporation
    • 10-100x better than best transformation frequency

• what will the library be used for?
  – Consider the intended use as well as other contemplated uses
    • will the library go to an EST project?
      – Plasmid
    • will it be screened manually
      – phage
    • or arrayed and screened on high density filters
      – plasmid
    • will we normalize it?
      – Probably plasmid

Vectors for cDNA cloning (contd)

• Analysis of cDNAs obtained
  – rate limiting step in clone analysis is getting them into a usable form
    • usually a plasmid
  – cloning is tedious, particularly if one has many positives
    • some tricks can be used but this is still the bottleneck

• in about 1985 or so, Stratagene introduced lambda ZAP
  – phage with an embedded plasmid and M13 packaging signals
  – plasmid can be automatically excised by adding a helper phage
    • gene II protein replicates plasmid into ss phagemid which is secreted
  – this was a major advance and many phage libraries today are made in ZAP or its derivatives
  – early protocols had helper phage problems - solved

• later, others developed a Cre-lox based system
  – instead of M13 used loxP sites.
  – When Cre recombinase is added, recombination between the loxP sites excises a plasmid

• both methods work very well and make analysis of many clones very straightforward

mRNA frequency and cloning

• mRNA frequency classes – classic references
  – Bishop et al., 1974 Nature 250, 199-204
  – Davidson and Britten, 1979 Science 204, 1052-1059
  – abundant
    • 10-15 mRNAs that together represent 10-20% of the total RNA mass
    • > 0.2% abundance each
  – intermediate
    • 1,000-2,000 mRNAs together comprising 40-45% of the total
    • 0.05-0.2% abundance
  – rare
    • 15,000-20,000 mRNAs comprising 40-45% of the total
    • abundance of each is less than 0.05% of the total
    • some of these might only occur at a few copies per cell

• How does one go about identifying genes that might only occur at a few copies per cell?
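
A quick calculation shows why rare mRNAs are a problem. The sketch below uses the standard sampling formula N = ln(1 - P) / ln(1 - f), which is not on the slide: N is the number of independent clones to screen, f is the fraction of the library made up by the mRNA of interest, and P is the desired probability of seeing it at least once. It assumes an unbiased library in which clone frequency equals mRNA frequency.

```python
import math

def clones_to_screen(frequency, probability=0.99):
    """Number of clones to screen to find a cDNA present at `frequency`
    (fraction of clones) at least once with the given probability."""
    if not 0 < frequency < 1 or not 0 < probability < 1:
        raise ValueError("frequency and probability must be between 0 and 1")
    return math.ceil(math.log(1 - probability) / math.log1p(-frequency))

# 0.05% is the upper bound of the rare class; 0.0001% is a hypothetical very rare message.
for f in (0.0005, 0.000001):
    print(f"{f:.6%} abundance: {clones_to_screen(f):,} clones")
```

A message at 0.05% abundance needs roughly ten thousand clones, while one at 0.0001% needs several million, which motivates altering the representation of the library.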

Normalization and subtraction

• How to identify genes that might only occur at a few copies per cell?
  – alter the representation of the cDNAs in a library or probe
  – Normalization - process of reducing the frequency of abundant and increasing the frequency of rare mRNAs
    • Bonaldo et al., 1996 Genome Research 6, 791-806
  – Subtraction - removing cDNAs (mRNAs) expressed in both of two populations, leaving only the differentially expressed ones
    • Sagerström et al. (1997) Ann Rev. Biochem 66, 751-783

Normalization and subtraction

• Normalization - reducing abundant, increasing rare mRNAs
  – normalization should bring cDNA abundance to within 10x
    • rarely works this well
    • Typically, abundant genes reduced 10x, rare ones increased 3-10x
    • Intermediate class genes do not change much at all
  – Approach
    • make a population of cDNAs single stranded - tester
    • hybridize with a large excess of cDNA or mRNA to Cot = 5.5
      – driver
    • Cot value is critical for success of normalization
      – 5-10 optimal, higher values NOT better
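
Why hybridizing to a moderate Cot evens out abundance can be seen with an idealized model. The sketch below assumes simple second-order reassociation, in which the fraction of a species still single stranded is 1 / (1 + k·f·Cot), where f is that species' share of the population. The rate constant k = 500 is purely illustrative and not a measured value, and the two-species library is hypothetical. Only the single stranded material is kept.

```python
def normalize(freqs, cot=5.5, k=500.0):
    """Return relative abundances of the single stranded fraction after
    reassociation to the given Cot (idealized second-order kinetics)."""
    remaining = {name: f / (1.0 + k * f * cot) for name, f in freqs.items()}
    total = sum(remaining.values())
    return {name: amount / total for name, amount in remaining.items()}

# Hypothetical library: one abundant and one rare species, 1000-fold apart.
before = {"abundant": 0.01, "rare": 0.00001}
after = normalize(before)

print("ratio before:", before["abundant"] / before["rare"])
print("ratio after: ", round(after["abundant"] / after["rare"], 1))
```

In this toy model the abundant species reanneals and is removed much faster than the rare one, so the spread shrinks from 1000-fold to a few tens of fold. The model says nothing about yield or the losses that make very high Cot values a poor choice in practice.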

Normalization and subtraction (contd)

– Approach (contd)
  • various approaches to make driver
    – use mRNA - may not be easy to get
    – make ssRNA by transcribing library
    – ssDNA from gene II/ExoIII treating inserts from plasmid library
    – PCR amplification of library
  • best approach is to use driver derived from the same library by PCR
    – rapid, simple and effective
    – other approaches each have various technical difficulties
    – see the Bonaldo review for details.

Normalization and subtraction (contd)

– What are normalized libraries good for?
  • EST sequencing
  • gene identification
    – biggest use is to reduce the number of cDNAs that must be screened
    – good general purpose target to screen
      » subtracted libraries are useful but limited in utility
– Drawbacks
  • Not trivial to make
  • Size distribution of library changes
    – Longer cDNAs lost

Normalization and subtraction (contd)

• Subtractive screening - Sargent and Dawid (1983) Science 222, 135-139.
  – Make 1st strand cDNA from a tissue and then hybridize it to excess mRNA from another
    • larger Cot is best, >20 at least
  – remove double stranded materials -> common seqs
  – make a probe or library from the remaining single stranded cDNA

Normalization and subtraction (contd)

• Subtractive screening (contd)
  – benefits
    • sensitive
    • can simultaneously identify all cDNAs that are differentially present in a population
    • good choice for identifying unknown, tissue specific genes
  – drawbacks
    • easy to have abundant housekeeping genes slip through
      – multistage subtraction is best
      – in effect normalize first, then subtract
    • libraries have limited applications
      – may not be useful for multiple purposes

Normalization and subtraction (contd)

– rule of thumb
  • make a high quality representative library from a tissue of interest
  • save subtraction and other fancy manipulations for making probes to screen such libraries with
    – unlimited screening
    – easy to use libraries for different purposes, e.g. the liver library
      » hepatocarcinoma
      » cirrhosis
      » regeneration specific genes

How to identify your gene of interest

• Screening methods depend on what type of information you have in hand.
  – Related gene from another species?
    • Low stringency hybridization
  – A piece of genomic DNA?
    • Hybridization
  – A mutant
    • Complementation
    • Positional cloning
  – A functional assay?
    • Expression screening
  – An antibody?
    • Expression library screening
  – A partial amino acid sequence?
    • Oligonucleotide screening
  – A DNA element required for expression of an interesting gene?
    • Various binding protein strategies
  – An interacting protein?
    • Interaction screening
  – A specific tissue or embryonic stage?
    • Subtracted screening

How to identify your gene of interest (contd)

• What is the most important piece of information you need to clone a cDNA?
  – Information on where the mRNA is expressed
    • either what tissue or
    • what time during development
  – such information is indispensable!!

• First step in any hybridization based method (high or low stringency) is to get information on expression
  – high stringency homologous screening - Northern analysis
  – cross species screening requires more care
    • perform a genomic Southern to identify hybridization and washing conditions that identify a small number of hybridizing fragments
      – standard conditions - 1 M Na+, 43% formamide, 37° C
      – begin washing at RT in 2 x SSC and expose
      – increase stringency until signal/noise ratio is acceptable
      – use these conditions for Northern.
    • If Northern is unsuccessful - obtain a genomic clone and repeat the screening at high stringency
      – this approach will never fail to identify a homologous gene
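
How permissive the standard cross species conditions are (1 M Na+, 43% formamide, 37° C) can be estimated with a commonly quoted empirical approximation for the melting temperature of long DNA:DNA hybrids: Tm = 81.5 + 16.6·log10[Na+] + 0.41·(%GC) - 0.61·(%formamide) - 500/L. This formula is not from the lecture, published coefficients vary somewhat between sources, and the 50% GC and 500 bp probe used below are hypothetical, so treat the result as a rough guide only.

```python
import math

def dna_tm(na_molar, gc_percent, formamide_percent, length_bp):
    """Approximate Tm (deg C) of a long, perfectly matched DNA:DNA hybrid
    (empirical formula; coefficients are approximate)."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)

tm = dna_tm(na_molar=1.0, gc_percent=50, formamide_percent=43, length_bp=500)
print(f"estimated Tm: {tm:.1f} C; hybridizing at 37 C is {tm - 37:.1f} C below Tm")
```

Hybridizing far below the estimated Tm tolerates substantial mismatch, which is what a cross species screen needs; raising the wash stringency then removes the weakest hybrids.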