

THE JOURNAL OF BIOLOGICAL CHEMISTRY Vol. 256, No. 1, Issue of January 10, pp. 533-538, 1981. Printed in U.S.A.

Restriction Enzyme Mapping and Heteroduplex Analysis of the Rat Milk Protein cDNA Clones*

(Received for publication, July 7, 1980, and in revised form, August 6, 1980)

Donald A. Richards, Douglas E. Blackburn, and Jeffrey M. Rosen‡

From the Department of Cell Biology, Baylor College of Medicine, Houston, Texas 77030

Detailed restriction enzyme maps have been determined for the three major rat casein and the fourth principal milk protein, α-lactalbumin, cDNA clones. Each of the milk protein genes exhibited unique and characteristic restriction enzyme sites. A comparison of the restriction enzyme maps of the three rat caseins revealed no apparent sequence homology among their gene sequences. The orientation of each cDNA gene sequence within the parent plasmid, pBR322, was determined by hybridization with a 3' specific cDNA probe synthesized from a partially hydrolyzed total poly(A) mRNA preparation following isolation by chromatography on oligo(dT)-cellulose. This technique provided a rapid procedure for determining the 5'-3' orientation of the cloned DNA sequences. Three casein clones were selected, which were in the same orientation, and were employed for a heteroduplex analysis to determine whether minor regions of homology existed within the α-, β-, and γ-casein genes. No heteroduplex formation was observed among these genes even under the low stringency conditions of hybridization employed, suggesting that considerable sequence divergence has occurred within the rat casein gene family.

In the previous publication, the construction and characterization of the cDNA clones for the four major milk proteins isolated from a rat double-stranded cDNA library were described (1). Three of these genes code for the caseins, which comprise 80% of total protein in rat milk. These proteins are highly acidic and are generally classified as any of the heterologous calcium-binding phosphoproteins, which are precipitated from skim milk between pH 4 and 5 (2). The three caseins also appear to be under similar, if not identical, hormonal regulation (3).

The degree to which these three genes and their protein products are related in the rat is not clear. The bovine α- and β-casein proteins contain striking similarities (4). Both proteins lack sulfhydryl groups and are quite hydrophobic except for a highly negatively charged segment. The bovine α- and β-caseins also contain an almost identical octapeptide sequence in the phosphoserine cluster of each protein. Each of the rat casein proteins is extremely rich in glutamic acid, serine, and proline and low in any cysteine residues (5). However, the remainder of their amino acid contents displays clear dissimilarities, indicating that if some homology does exist among the caseins, it is most probably of a limited nature. This is confirmed by the observation that during the initial characterization of their cDNA clones no cross-hybridization was observed under the stringent conditions employed in Northern hybridization or hybrid-arrested cell-free translation analysis. The only amino acid sequence information presently available for the three caseins is confined to the precasein signal peptide sequences (6). Based on the 13 to 16 amino acids whose identity is known, there appears to be a significant conservation of this amino acid sequence in the range of 66 to 92% (6). The significance of such homologies in the signal peptide sequences is unclear since there are limited homologies within leader sequences among clearly unrelated proteins (7). However, it may indicate that functional areas of these proteins have been conserved.

* This work was supported by National Institutes of Health Grant CA-16303. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

‡ Recipient of a Research Career Development Award (CA-0154) from the National Institutes of Health. To whom reprint requests should be addressed. Reprints will be provided only upon receipt of a stamped, self-addressed envelope.

In an effort to determine whether areas of homology exist in their nucleotide sequence, we have utilized restriction enzyme mapping and heteroduplex analysis to compare the three caseins. These techniques have previously been utilized to examine sequence homology within a family of chorion structural gene clones (8, 9) and two globin genomic gene clones (10). The construction of detailed restriction enzyme maps for each of the caseins as well as α-lactalbumin, the fourth major milk protein, will also permit interspecies comparison with the restriction mapping of other caseins and α-lactalbumins, including the mouse milk protein genes, which currently is in progress in several laboratories. The availability of these restriction maps will also allow a more detailed sequence analysis of the milk protein structural genes. This information is required for the characterization of the sequence and genomic organization of these genes in rat DNA.

EXPERIMENTAL PROCEDURES

Plasmid Isolation-Recombinant plasmid was prepared by the procedures of Katz et al. (11, 12) with the modifications described previously (1).

Restriction Enzyme Mapping-Restriction enzyme digestions were performed according to the procedures of the manufacturer of the respective enzymes (Bethesda Research Laboratories). Digests routinely contained 2 to 3 units of the restriction enzyme per μg of DNA and were incubated for 4 h at 37°C. Electrophoretic analysis was carried out using both agarose (13) and polyacrylamide gels (14) as described. DNA fragments were visualized by ethidium bromide staining (5 μg/ml in 50 mM Tris-HCl, pH 7.6, for 30 min at room temperature) or by autoradiography if the fragments had been end labeled with [γ-32P]dATP. Isolation of restriction enzyme fragments for further enzyme digestion or end labeling was carried out by a modification of the technique of Maxam and Gilbert (14) as previously described (1). End labeling of restriction enzyme fragments with [γ-32P]dATP was performed according to the procedure of Maxam and Gilbert (14).

Restriction enzyme mapping was also carried out using the partial digestion technique first described by Smith and Birnstiel (15). DNA fragments isolated from 5% polyacrylamide gels were end-labeled by polynucleotide kinase using [γ-32P]dATP and separated from free nucleotides by Sephadex G-50 chromatography and ethanol precipitation. Labeled fragments were then digested with a restriction enzyme which could cleave the DNA fragment only once near one of its ends. After phenol extraction and ethanol precipitation, these fragments were digested with a second enzyme in the presence of 2 to 4 μg of unlabeled competitor plasmid DNA. Aliquots of the reaction mixture were removed every few minutes and the reaction was stopped by adding 25 mM Na2EDTA. Partial digestion products were analyzed by agarose and polyacrylamide gel electrophoresis as described above.

Determination of Cloned DNA Orientation-The orientation of each of the cDNA clones was determined by hybridizing a 3' specific mRNA probe generated from the four major milk protein mRNAs to a Southern gel blot (16) of restriction enzyme digests obtained from each of the clones. Poly(A)-containing mRNA isolated from 8-day lactating rat mammary tissue was hydrolyzed in the presence of 0.2 N NaOH and 10 mM Na2EDTA for 15 min at 4°C. The 3' poly(A)-containing fragment of these mRNAs was then isolated by oligo(dT)-cellulose chromatography after complete neutralization of the hydrolysis mixture. 32P-labeled cDNA was synthesized from the 3' mRNA template as previously described (1). This 3' specific probe was hybridized to a Southern gel blot of a restriction digest of each of the clones as detailed in the text. For comparison, [32P]cDNA synthesized from unhydrolyzed, intact poly(A) RNA was hybridized to a duplicate Southern gel blot.

Heteroduplex Analysis-Each of the casein clones was digested with Eco RI to linearize the plasmid. After phenol extraction and ethanol precipitation, each cloned DNA was dissolved in 70% formamide (chelexed), 0.1 M Tris-HCl, pH 8.0, 0.3 M NaCl, and 0.01 M Na2EDTA and mixed at a concentration of 40 μg/ml with equimolar amounts of one of the other plasmids. Each heteroduplex mixture, after being sealed in a 100-μl capillary tube, was denatured for 2 min at 80°C and hybridization was carried out at 40°C for 1 h. After slow cooling to room temperature, the mixture was prepared for electron microscopic analysis as described (17).

RESULTS AND DISCUSSION

Several different, but complementary approaches have been employed in order to generate a detailed restriction enzyme map for each of the four milk protein cDNA clones. The initial analysis of their restriction enzyme sites was carried out using single and double restriction enzyme digestions of total plasmid DNA. Resolution of the DNA fragments was accomplished using both agarose and polyacrylamide gel electrophoresis and they were visualized by ethidium bromide staining (data not shown). Using this technique, we were able to construct partial maps of each of the four structural gene sequences. However, due to co-migration of some of the cloned, inserted DNA fragments with the plasmid pBR322-derived sequences, we were unable by this method to generate complete restriction maps. Any discrepancies which may have arisen using this approach were resolved by employing two procedures which both rely on the prior end labeling of the terminal phosphate of each DNA fragment with [γ-32P]dATP.

In the first technique, each cloned cDNA sequence was isolated from the parent plasmid sequences by preparative gel electrophoresis following either Pst I or Hpa II digestion. The cloned, insert DNA was then digested by a second restriction enzyme and end labeled with [γ-32P]dATP using the enzyme polynucleotide kinase. The end-labeled DNA fragments were resolved by electrophoresis on a 5% polyacrylamide gel and each fragment isolated. Individual fragments were then redigested with a series of restriction enzymes and the resulting fragments analyzed by polyacrylamide gel electrophoresis and autoradiography. An example of this technique is shown in Fig. 1. The β-casein insert was isolated following Pst I digestion of the parent plasmid, pCβ23, and preparative gel electrophoresis. This sequence, which is cleaved once by Pst I, was digested with a second enzyme, HinfI, and the six digestion products were end-labeled with [γ-32P]dATP. Following isolation of each of the end-labeled fragments by preparative gel electrophoresis, they were each digested with a series of four restriction enzymes. The HinfI-Pst I β-casein gene fragments II and III were approximately of the same size (215 and 230 nucleotides, respectively), and they were, therefore, not resolved prior to redigestion. The analysis of each of the isolated end-labeled fragments and their respective digestion products is shown in Fig. 1. Restriction enzyme digestions were performed with Hpa II, Alu I, Hae III, and HincII as shown in lanes A through D, respectively. For example, HinfI fragment I, which is 290 nucleotides long, was found to contain an internal Hpa II site, but no Alu I, Hae III, or HincII sites. The other fragments have been analyzed in a similar manner using a Hae III SV40 digest to establish the individual fragment sizes. Although somewhat laborious, this method provided considerable data concerning the location of the individual restriction enzyme sites. One problem with this technique is the possible omission of internal DNA fragments when multiple restriction enzyme sites were present.

FIG. 1. Restriction enzyme analysis of six end-labeled pCβ23 insert restriction enzyme fragments. These fragments were generated by Pst I and HinfI digestion of the cloned β-casein insert, end labeled with [γ-32P]dATP, and then isolated by preparative gel electrophoresis. The six fragments were then individually digested with a series of restriction enzymes and analyzed on a 5% polyacrylamide gel. Fragments II and III of approximately the same size (230 and 215 nucleotides) were not resolved and were analyzed together. The following restriction enzymes were used for this analysis: A, Hpa II; B, Alu I; C, Hae III; and D, HincII. Gel panels (lanes A through D in each): Fragment I; Fragments II and III; Fragment IV; Fragment V; Fragment VI.
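The redigestion logic used for each isolated fragment can be sketched in a few lines of Python. This is only an illustration: the recognition sequences and cut positions are the standard ones for these four enzymes, but the function name `digest` and the 30-nucleotide test sequence are hypothetical and are not taken from the β-casein clone.

```python
import re

# Recognition sequence (as a regex) and cut offset on the top strand for the
# enzymes of lanes A through D in Fig. 1.
ENZYMES = {
    "Hpa II": ("CCGG", 1),          # C^CGG
    "Alu I": ("AGCT", 2),           # AG^CT
    "Hae III": ("GGCC", 2),         # GG^CC
    "HincII": ("GT[CT][AG]AC", 3),  # GTY^RAC
}

def digest(seq, enzyme):
    """Return fragment lengths after complete digestion of a linear sequence."""
    pattern, offset = ENZYMES[enzyme]
    # A lookahead lets overlapping recognition sites be found as well.
    cuts = [m.start() + offset for m in re.finditer(f"(?={pattern})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Hypothetical fragment carrying one Hpa II site and no other sites.
toy_fragment = "TTATTAATCCGGATTAATTATTAATTATTA"
for name in ENZYMES:
    print(name, digest(toy_fragment, name))
```

A fragment that returns a single length equal to its own size for a given enzyme has no internal site for that enzyme, which is the pattern reported for Alu I, Hae III, and HincII on HinfI fragment I.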

The second technique used is the "partial" restriction enzyme digestion method described by Smith and Birnstiel (15). This technique is illustrated in Fig. 2. The γ-casein sequence from plasmid pCγ31 was excised by digestion with the enzyme Hpa II. Hpa II does not cut the γ-casein structural gene sequence but cleaves outside the inserted sequence to yield the intact γ-casein sequence plus 63 nucleotides of pBR322 at one end and 47 nucleotides at the other end (18). After preparative gel electrophoresis, the isolated fragment was end-labeled with [γ-32P]dATP and digested with Pst I. The γ-casein sequence does not contain an internal Pst I site and only one of the Pst I sites was regenerated in this recombinant clone at the insertion sites in pBR322 using the G-C tailing procedure described (1). Thus, Pst I digestion yields two fragments, a large fragment greater than 900 nucleotides in length which contains the γ-casein sequence plus 47 nucleotides of pBR322 and a 63-nucleotide fragment also derived from pBR322. Both fragments remain end labeled; however, now only one end of each carries the 32P label. The fragments are then partially digested with a second restriction enzyme in the presence of excess unlabeled plasmid which acts as a competitive substrate with the end-labeled insert. The analysis of partial HinfI digests performed at progressively longer times is shown in Fig. 2. Visualization of the unlabeled plasmid by ethidium bromide staining indicated that a partial digestion was, in fact, obtained at each time point (Fig. 2A). The partial digestion of the 32P-end-labeled γ-casein sequence is shown in the autoradiogram of the same gel (Fig. 2B). The lowest band of 63 nucleotides is the fragment generated by the Pst I digestion and is not cleaved by HinfI. The remaining bands represent the γ-casein partial digestion products. Since only one of the ends of this sequence is labeled, the location of each HinfI site can be read off directly by progression up the gel. Thus, the smallest band of 140 nucleotides, which contains 47 nucleotides derived from pBR322, contains the first HinfI site. This site, therefore, is 93 nucleotides from the Pst I insertion site within the γ-casein sequence. The next site is 66 nucleotides from the first HinfI site. Accordingly, each band was analyzed in a similar manner. The autoradiogram of a complete HinfI digestion of the γ-casein sequence, which was not predigested with Pst I, is shown in Fig. 2B, last lane. This was included in order to determine the relative length of the HinfI fragments generated at each end of the inserted DNA. This technique proved extremely useful in ordering the restriction enzyme sites in both the γ-casein clone, pCγ31, and the α-casein clone, pCα16, which had regenerated only a single Pst I at one end of the insert sequence. However, this technique was not suitable for mapping the other two clones, where both Pst I sites were regenerated and no other enzyme was known which had a single site of cleavage at one end of the insert DNA.
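The arithmetic behind reading site positions off a partial digest can be written out as a short Python sketch. The function name is hypothetical. The 140-nucleotide band and the 47-nucleotide pBR322 flank are quoted in the text; the 206-nucleotide band is not read from the gel but is simply the length implied by the statement that the second site lies 66 nucleotides beyond the first.

```python
def sites_from_partial_bands(band_sizes, vector_flank):
    """Convert end-labeled partial-digest band lengths into site positions.

    Each labeled band runs from the labeled end to one cut site, so its length
    minus the vector-derived flank is the distance of that site from the
    insertion site. Spacings between neighbouring sites follow by subtraction.
    """
    positions = sorted(size - vector_flank for size in band_sizes)
    spacings = [b - a for a, b in zip(positions, positions[1:])]
    return positions, spacings

# 140 is quoted in the text; 206 = 140 + 66 is implied by it.
positions, spacings = sites_from_partial_bands([140, 206], vector_flank=47)
print(positions)  # [93, 159]
print(spacings)   # [66]
```

Because only one end carries the label, no ordering step is needed: sorting the band lengths already gives the order of the sites along the insert.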

The results obtained from these three mapping techniques have been summarized in Figs. 3 through 6. In each of the figures, the asymmetric Hha I site, which is 22 nucleotides from the Pst I insertion site of the plasmid, pBR322 (18), is indicated to orient the restriction map of the insert DNA within the parent plasmid. The detailed map of the α-casein gene obtained from clone pCα16 is shown in Fig. 3. The cloned insert is approximately 1016 nucleotides long, which represents 77% of the entire α-casein mRNA sequence once deductions are made for the length of the G-C tail (1). The two HindIII sites, which are approximately 400 nucleotides apart, are characteristic for the α-casein sequence as is the single Hha I site. These two restriction enzyme sites were not present in any of the other isolated casein clones analyzed or in the cloned α-lactalbumin sequence. The α-casein sequence was found not to contain recognition sequences for any of the following enzymes: Ava II, Bam HI, Eco RI, Sst I, Pst I, and Hpa II. The restriction enzymes Alu I, Hae III, and HinfI were each found to cut the α-casein sequences at multiple sites, which are detailed in Fig. 3.

The β-casein clone, pCβ23, was found to contain 95% of its respective mRNA sequence with an insert size of 1117 nucleotides. Its restriction enzyme map, shown in Fig. 4, reveals the presence of an internal Pst I site near the middle of the β-casein gene, which is unique among the milk protein clones. The β-casein sequence also contains two Hpa II sites, which are 480 nucleotides apart, and these are diagnostic for the β-casein cDNA clone. Several restriction enzyme recognition sites not found in this clone included: Ava II, Bam HI, Eco RI, HindIII, Sst I, Hha I, and Taq I.

The restriction enzyme map of the γ-casein clone, pCγ31, is shown in Fig. 5. The γ-casein sequence is distinguished by a unique Sst I site. The clone pCγ31, which is approximately 906 nucleotides long, contains essentially the entire γ-casein mRNA sequence. This sequence contains no recognition sites for the following enzymes: Ava II, Bam HI, Eco RI, HindIII, Hpa II, Hha I, Pst I, and Taq I.

pLA32, an α-lactalbumin clone, is the shortest of the milk protein gene sequences isolated. However, at a length of 473 nucleotides, this clone still represents the majority (71%) of the α-lactalbumin mRNA sequence. The α-lactalbumin restriction enzyme map (Fig. 6) is characterized by the presence of two Bam HI restriction sites which are approximately 60 nucleotides apart. Three Ava II restriction sites are also present. Both of these restriction enzyme sites are absent in the other milk protein gene clones. The restriction enzyme, Alu I, which was found to cleave all three casein sequences at multiple sites, did not have a recognition site within the pLA32 insert sequence. The α-lactalbumin sequence was also found not to contain sites for the following restriction enzymes: Eco RI, HindIII, Hha I, Pst I, Hpa II, and Sst I. Further restriction enzyme mapping of these four genes was not performed because they are now amenable to direct DNA sequence analysis. Direct confirmation of the restriction map of the pCβ23 clone has been obtained recently by sequence analysis in our laboratory.¹ This clone contains the sequence specifying the last five amino acids of the rat β-precasein signal peptide and an extensive homology with the ovine and bovine β-casein proteins was evident (19), further confirming its identity.

¹ D. E. Blackburn and J. M. Rosen, unpublished observations.

FIG. 2. Partial digestion mapping technique with pCγ31. A depicts a time course of a HinfI partial digest of 32P end-labeled pCγ31 containing an excess of unlabeled pCγ31 stained with ethidium bromide. B contains an autoradiogram of the same gel demonstrating the partial digestion products of the pCγ31 Hpa II excised insert, which has been end-labeled with [γ-32P]dATP at one end. The far right column displays the result of a Hpa II-generated pCγ31 insert, end labeled at both ends and digested with HinfI to completion.

FIG. 3. Restriction enzyme map of the α-casein clone, pCα16. Schematic diagram of the α-casein clone demonstrating characteristic restriction enzyme sites. Abbreviations of the restriction enzymes are as follows: A, Alu I; Ha, Hae III; Hd, HindIII; Hf, HinfI; Hh, Hha I; and P, Pst I.

FIG. 4. Restriction enzyme map of the β-casein clone, pCβ23. This schematic illustrates the restriction enzyme sites determined within the β-casein insert. The Pst I site and the Hpa II (Hp) sites are unique to the β-casein clone. The abbreviations are identical with those shown in Fig. 3.

FIG. 5. Restriction enzyme map of the γ-casein clone, pCγ31. The diagram represents the full length γ-casein cloned cDNA sequence. The Sst I site (S) is unique for the γ-casein clone among the milk protein cDNA clones studied.

FIG. 6. Restriction enzyme map of the α-lactalbumin clone, pLA32. The cDNA clone of α-lactalbumin, approximately 71% of the full length mRNA sequence, demonstrates multiple characteristic Ava II (Av) sites and two Bam HI (B) sites which are also unique for α-lactalbumin.

Following the completion of the restriction enzyme analysis of the four major milk protein clones, the orientation of each sequence was determined. This was accomplished by preparing a probe specific for the 3' ends of total poly(A) RNA isolated from 8-day lactating rats. The 3' probe was prepared by the partial hydrolysis of poly(A) RNA isolated from 8-day lactating rat mammary glands by treatment with 0.2 N NaOH for 15 min at 4°C. The 3' ends of the hydrolyzed poly(A) RNA were then isolated by oligo(dT)-cellulose chromatography and a 32P-labeled cDNA probe synthesized using oligo(dT) as a primer. This cDNA probe was hybridized to a "Southern" gel blot of the restriction enzyme digests shown in Fig. 7A. An identical Southern blot was hybridized with [32P]cDNA synthesized from unhydrolyzed, intact poly(A) RNA for comparison with the 3' probes.

The α-casein clone, pCα16, was digested with the restriction enzyme HindIII which yields three fragments, each of which contains a portion of the cDNA sequence (Fig. 7A). As expected, the total poly(A) RNA generated cDNA probe hybridized to each of these three bands (Fig. 7B, lane A). Despite the high background in these two tracks (due to partial digestion and/or DNA breakdown, which was not evident in the ethidium bromide-stained gel), there was a preferential hybridization of the 3' cDNA probe. The 944-nucleotide fragment was demonstrated to contain the 5' end of the structural gene sequence since it did not hybridize to the 3' probe (Fig. 7B, lane B). The 4027-nucleotide fragment which contains approximately 180 nucleotides of the α-casein sequence derived from the opposite end of the cloned sequence hybridizes equally well to the 3' cDNA probe and to the total cDNA probe. The 390-nucleotide sequence derived from the center of the α-casein cDNA clone hybridizes with the 3' probe but not as well as the 4027-nucleotide fragment.

The β-casein clone, pCβ23, was digested with Pst I, which cleaved the sequence into two fragments for orientation analysis, as seen in Fig. 7A. The smaller fragment of 520 nucleotides hybridized preferentially to the 3' cDNA probe (Fig. 7B, lane B). The 5' fragment of 597 nucleotides displayed almost no hybridization to the 3' cDNA probe. The γ-casein clone, pCγ41, was digested with two restriction enzymes, Pst I, which cuts out the cloned cDNA sequence as a single piece, and Sst I which cleaves this sequence approximately in half (Fig. 7A). There was also a small amount of undigested DNA, which was only evident after hybridization. The 908-nucleotide fragment observed in both lanes A and B of Fig. 7B represents the small amount of the residual Pst I fragment which remained undigested by Sst I. In Fig. 7B, lane B, the preferential hybridization of the 3' cDNA probe to the 908-nucleotide fragment, containing the entire sequence, and the 398-nucleotide fragment, which, therefore, contains the 3' end of the mRNA sequence, is shown.
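As a quick consistency check on the β-casein numbers quoted in the text, the two Pst I fragments should account for the whole insert, and the 5' fragment length then gives the position of the internal Pst I site. The following Python sketch does only that arithmetic; the function name is hypothetical, and the sizes (597, 520, and 1117 nucleotides) are the ones reported in the text.

```python
def locate_internal_site(insert_length, five_prime_fragment, three_prime_fragment):
    """Check that two fragments account for the insert and locate the cut."""
    total = five_prime_fragment + three_prime_fragment
    if total != insert_length:
        raise ValueError(f"fragments sum to {total}, expected {insert_length}")
    # Position of the internal site, measured from the 5' end of the insert.
    return five_prime_fragment

site = locate_internal_site(1117, five_prime_fragment=597, three_prime_fragment=520)
print(site, round(site / 1117, 2))  # 597 0.53
```

The result places the internal Pst I site slightly past the midpoint of the insert, in line with the description of a site near the middle of the β-casein sequence.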

Ava II was utilized to digest the α-lactalbumin clone, pLA32, and cleaved the cloned cDNA sequence into four fragments (Fig. 7A). Two of these fragments which are both less than 140 nucleotides were not detected in the Southern blot hybridization even using the total poly(A)-derived cDNA probe. This was probably due to the inefficient binding of these small DNA fragments to the nitrocellulose filter during transfer. The other two fragments both greater than 200 nucleotides in length are clearly evident following hybridization to the total poly(A)-derived cDNA probe. The larger fragment of 265 nucleotides clearly represents the 3' end of the mRNA sequence (Fig. 7B, lane B).

With the orientation of the α-lactalbumin clone determined, it was possible to compare the restriction enzyme map with the recently determined amino acid sequence of the rat α-lactalbumin protein.¹ This was accomplished by determining all of the possible amino acid codons which could be generated by the predicted restriction enzyme recognition sequences and then determining whether these amino acid sequences were present in the known sequence and were spaced properly according to the restriction enzyme map. The five restriction sites determined at the 5' end of the pLA32 clone were found to be consistent with the amino acid sequence, and these sites were found within 5 to 10 nucleotides of the positions predicted by the restriction enzyme map. This is within the normal degree of error observed with these mapping techniques. The remaining four restriction enzyme sites appear to be outside the coding region of the α-lactalbumin mRNA, in its 3' noncoding region. This result was entirely consistent with the known size of the α-lactalbumin mRNA and the size of the protein encoded by this mRNA. In addition to confirming the accuracy of our restriction enzyme map of the α-lactalbumin clone, this comparison also helped confirm the identity of this clone.

In each of the restriction enzyme maps presented in Figs. 3 through 6, the 5' and 3' ends of the DNA-coding sequence have been indicated. In order to compare the three caseins and determine whether they exhibited any apparent homology, as reflected by their restriction enzyme maps, the three restriction maps are shown together in Fig. 8. Since two of the sequences are slightly less than full length, the maps in this figure were aligned in a 3' to 5' direction. In the preparation of double-stranded DNA for cloning, the length of the insert sequence usually reflects the length of the initial cDNA since the synthesis of the second strand continues until a complete copy of the cDNA is made. Therefore, a portion of the 5' end of the structural gene sequences for the α- and β-caseins is most probably not represented in these restriction enzyme maps. As shown in Fig. 8, except for a few similar restriction enzyme sites, which appear randomly located, there is no apparent homology among the three restriction enzyme maps. This is consistent with results obtained during Northern blot hybridizations and hybrid-arrested translations in which no major cross-hybridization among the casein genes was detected (1).

¹ R. Prasad, J. Hamilton, R. Butkowski, and K. E. Ebner, personal communication.

FIG. 7. Orientation analysis of the milk protein structural gene clones. A contains an ethidium bromide-stained profile on a 2.5% agarose gel of restriction enzyme digestions of the four milk protein clones used for the orientation analysis. The digestions are as follows: lane 1, HindIII digest of pCα16; lane 2, Pst I digest of pCβ23; lane 3, Pst I and Sst I digest of pCγ41 (identical with pCγ31 but Pst I sites were regenerated at both ends of the insert); and lane 4, Ava II digest of pLA32. B depicts Southern hybridizations to similar restriction digests for each clone. The hybridization probe used in the A columns is 32P-labeled cDNA synthesized from total poly(A) RNA containing each of the milk protein sequences. Columns labeled B show the results of hybridizations performed with 3' specific cDNA probes prepared as described in the text. [Fragment sizes marked on the figure: 4027, 944, 908, 398, 265, and 203 nucleotides.]

FIG. 8. Comparison of the restriction enzyme maps of the three major rat casein cDNA clones. The restriction map of each casein clone is presented here in the 3' to 5' direction for comparative purposes. This orientation was selected because both the α-casein and β-casein clones shown lack a portion of the 5' ends of their mRNA sequences. [Maps of pCα16, pCβ23, and pCγ31 are drawn from 3' to 5' on a scale of 0 to 1000 nucleotides.]
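The comparison of restriction maps summarized in Fig. 8 can be thought of as a search for sites of the same enzyme that fall at the same distance from the 3' end in two clones. The following Python sketch illustrates that idea only. The site positions and enzyme names (`EnzA` through `EnzD`) are hypothetical, because the individual site coordinates of Fig. 8 are not given in the text, and the 10-nucleotide tolerance is an assumption borrowed from the mapping error quoted above for the α-lactalbumin clone.

```python
def shared_sites(map_a, map_b, tolerance=10):
    """Return (enzyme, pos_a, pos_b) for sites of the same enzyme whose
    distances from the 3' end differ by no more than `tolerance` nucleotides.

    Each map is a list of (position, enzyme) pairs, with positions measured
    from the 3' end of the cloned sequence.
    """
    matches = []
    for pos_a, enzyme_a in map_a:
        for pos_b, enzyme_b in map_b:
            if enzyme_a == enzyme_b and abs(pos_a - pos_b) <= tolerance:
                matches.append((enzyme_a, pos_a, pos_b))
    return matches


# Hypothetical maps for illustration; these are NOT the sites of Fig. 8.
clone_x = [(120, "EnzA"), (340, "EnzB"), (610, "EnzC")]
clone_y = [(125, "EnzA"), (500, "EnzB"), (900, "EnzD")]

print(shared_sites(clone_x, clone_y))  # [('EnzA', 120, 125)]
```

An isolated coincidence of this kind is not evidence of homology, since recognition sequences of four to six nucleotides recur by chance; only a run of shared sites in the same order and with the same spacing would suggest a conserved region.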

This restriction enzyme analysis suggested the absence of significant homology among the three casein clones. However, it did not rule out the existence of small regions of sequence homology. A more precise technique to determine whether such regions of homology exist is to visualize directly by electron microscopy heteroduplex molecules formed between the three clones to ascertain whether hybridization will occur between any homologous portions of these genes. This analysis required the prior determination that the three casein clones are in the same orientation in the parent plasmid, pBR322. Appropriate clones were, therefore, selected and digested with Eco RI, which cleaves each plasmid sequence only once at a site outside of the structural gene sequences. Heteroduplex analysis was then carried out under conditions of low stringency as described under "Experimental Procedures." Typical results of this analysis are shown in Fig. 9. The heteroduplex formed between pCα16 and pCβ23 is shown in Fig. 9A with each single-stranded region indicating the respective cloned cDNA sequences. No heteroduplex structures were observed under these conditions between these two gene sequences. Similar results were obtained during the heteroduplex analysis of pCβ23 with pCγ31 shown in Fig. 9B and of pCα16 with pCγ31 in Fig. 9C. Thus, even using this technique, no apparent homology was detected within the casein gene family. This does not, however, rule out the existence of very short regions of nucleotide sequence homology within these genes or the presence of conserved regions in the amino acid sequence of these proteins. For example, small regions of homology (<50 nucleotide pairs) would not have been detected at the 5' and 3' ends of these heteroduplex structures within the limit of resolution of this analysis. Direct nucleotide sequencing will also be required in order to determine the frequency of third base position changes and any small deletions or insertions. This study does suggest, however, that if homology does exist, it is quite limited, indicating that the casein genes have diverged quite extensively during their evolution. We are currently sequencing each of the rat casein gene sequences to determine whether they may have evolved from a common ancestral gene (19).

FIG. 9. Heteroduplex analysis of the rat casein cDNA clones. The electron micrographs represent typical heteroduplexes obtained with the three Eco RI-cleaved casein clones. The hybridizing sequences represent the flanking sequences of the parent plasmid, pBR322. No cross-hybridization was obtained between any of the casein structural gene sequences. A pCα16 and pCβ23 heteroduplex is shown in A; a pCβ23 and pCγ31 heteroduplex is shown in B; and a pCα16 and pCγ31 heteroduplex is shown in C.


In summary, several restriction enzyme mapping strategies were utilized to determine the restriction enzyme maps of the four major rat milk protein cDNA clones. These studies were designed to permit a limited comparison of the sequence of these genes and to subdivide them into 200-nucleotide fragments suitable for future DNA sequencing and for the characterization of their genomic organization. Confirmation of the α-lactalbumin restriction map was obtained by comparison with the known amino acid sequence of rat α-lactalbumin. Five predicted restriction enzyme sites found in the coding region were within 10 bases of an amino acid codon which was consistent with each of these restriction enzyme recognition sequences. After determination of the 5' and 3' ends of the coding sequence of each clone, a comparison of the restriction maps of the three caseins indicated that no apparent homology exists among the three casein structural genes. In an attempt to define small regions of possible homology, heteroduplex analysis was also employed. The results of the heteroduplex analysis among all three casein clones indicate that within the limit of resolution of this analysis no sequence homology was detected. The casein genes, therefore, represent a family of related genes, which may have undergone significant divergence during their evolution. In the near future, it should be possible to compare the evolution of the casein genes among several different species as studies are currently under way in other laboratories, as well as our own, to clone and characterize the casein genes from the mouse, rabbit, guinea pig, and bovine species.
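The check of restriction sites against an amino acid sequence, summarized above for the α-lactalbumin clone, can be illustrated with a short Python sketch. It enumerates the peptides that a recognition sequence could encode in each of the three reading frames, locates those peptides in a protein, and asks whether any implied site position lies within 10 nucleotides of the mapped position. This is a sketch under stated assumptions rather than the procedure actually used: the function names are hypothetical, the protein `TOY_PROTEIN` is invented and is not the rat α-lactalbumin sequence, and the Eco RI recognition sequence GAATTC serves only as a familiar example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def peptides_for_site(site):
    """Map each frame offset (0, 1, 2) to the set of stop-free peptides the
    site could encode, trying every possible flanking base."""
    result = {}
    for offset in range(3):
        pad = -(offset + len(site)) % 3  # bases needed to finish the last codon
        peptides = set()
        for prefix in product(BASES, repeat=offset):
            for suffix in product(BASES, repeat=pad):
                peptide = translate("".join(prefix) + site + "".join(suffix))
                if "*" not in peptide:
                    peptides.add(peptide)
        result[offset] = peptides
    return result


def candidate_positions(site, protein):
    """Nucleotide positions (0-based, within the coding sequence) at which
    the site could start, given the protein sequence."""
    positions = set()
    for offset, peptides in peptides_for_site(site).items():
        for peptide in peptides:
            start = protein.find(peptide)
            while start != -1:
                positions.add(3 * start + offset)
                start = protein.find(peptide, start + 1)
    return sorted(positions)


def site_is_consistent(site, protein, mapped_position, tolerance=10):
    """True if some candidate position lies within `tolerance` nucleotides
    of the position assigned by restriction mapping."""
    return any(abs(p - mapped_position) <= tolerance
               for p in candidate_positions(site, protein))


TOY_PROTEIN = "MKEFLTQ"   # hypothetical peptide, for illustration only
ECO_RI = "GAATTC"         # reads as Glu-Phe when in frame

print(candidate_positions(ECO_RI, TOY_PROTEIN))                        # [6]
print(site_is_consistent(ECO_RI, TOY_PROTEIN, mapped_position=12))     # True
```

Enumerating the flanking bases is what allows a site that straddles codon boundaries to be matched, since in two of the three frames the recognition sequence constrains three residues only partially.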

Acknowledgments-We wish to express our gratitude to Dr. M. L. Mace for his excellent electron micrographs and to Ms. Grace Huang for her expert technical assistance.

REFERENCES

1. Richards, D. A., Rodgers, J. R., Supowit, S. C., and Rosen, J. M. (1981) J. Biol. Chem. 256, 526-532
2. Thompson, M. P. (1971) in Milk Proteins-Chemistry and Molecular Biology (McKenzie, H. A., ed), Vol. II, pp. 117-173, Academic Press, New York
3. Rosen, J. M., and Barker, S. W. (1976) Biochemistry 15, 5272-5280
4. Taborsky, G. (1974) Adv. Protein Chem. 28, 91-99
5. Nardacci, N. J., Lee, J. W. C., and McGuire, W. G. (1978) Cancer Res. 38, 2694-266
6. Rosen, J. M., and Shields, D. (1980) in Testicular Development, Structure and Function (Steinberger, A., and Steinberger, S., eds), pp. 343-349, Raven Press, New York
7. Blobel, G., Walter, P., Chang, C. N., Goldman, P. M., Erickson, A. H., and Lingappa, V. R. (1979) Symp. Soc. Exp. Biol. (Great Britain) 33, 9-36
8. Jones, C. W., Rosenthal, N., Rodakis, G. C., and Kafatos, F. C. (1979) Cell 18, 1317-1332
9. Griffin-Shea, R., Thireos, G., Kafatos, F. C., Petri, W. H., and Villa-Komaroff, L. (1980) Cell 19, 915-922
10. Tiemeier, D. C., Tilghman, S. M., Polsky, F. I., Seidman, J. G., Leder, A., Edgell, M. H., and Leder, P. (1978) Cell 14, 237-245
11. Katz, L., Kingsbury, D. T., and Helinski, D. R. (1973) J. Bacteriol. 114, 557-591
12. Katz, L., Williams, P. H., Sato, S., Leavitt, R. W., and Helinski, D. R. (1977) Biochemistry 16, 1677-1683
13. Helling, R. B., Goodman, H. M., and Boyer, H. W. (1974) J. Virol. 14, 1235-1242
14. Maxam, A. M., and Gilbert, W. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 560-564
15. Smith, H. O., and Birnstiel, M. L. (1976) Nucleic Acids Res. 3, 2387-2399
16. Southern, E. M. (1975) J. Mol. Biol. 98, 503-517
17. Lai, E. C., Stein, J. P., Catterall, J. F., Woo, S. L. C., Mace, M. L., Means, A. R., and O'Malley, B. W. (1979) Cell 18, 829-842
18. Sutcliffe, J. G. (1978) Nucleic Acids Res. 5, 2721-2728
19. Richardson, B. C., and Mercier, J.-C. (1979) Eur. J. Biochem. 99, 285-297