

Eur. J. Biochem. 27,65-86 (1972)

The Complete Amino-Acid Sequence of Human α-Lactalbumin

John B. C. FINDLAY and Keith BREW

Department of Biochemistry, University of Leeds

(Received November 10, 1971/February 7, 1972)

α-Lactalbumin was isolated from human milk in a yield of 1.8 mg/ml of milk. The purification procedure involved ammonium sulphate fractionation (30% to 80% saturation) and pH-4.0 precipitation, followed by gel filtration with Sephadex G-100. A final purification stage using DEAE-cellulose was necessary in some preparations.

Peptides derived from the reduced, S-aminoethylated protein by treatment with cyanogen bromide and digestion of these CNBr fragments with trypsin, chymotrypsin or thermolysin were purified by gel filtration, ion-exchange chromatography and high-voltage paper electrophoresis.

From the sequences of these peptides it has proved possible to deduce unambiguously the complete primary structure of the protein. Comparison with bovine α-lactalbumin shows an identity in 72% of the residues, with a further 6% being chemically similar amino acids. The corresponding figures for the human α-lactalbumin/human lysozyme comparison are 39% and 12%, respectively. The significance of some of the amino acid replacements is discussed.

Information about the structural restraints placed upon the variability of a protein by the necessity of maintaining its biological function can frequently be obtained by comparative structural studies on the protein from different species. In the case of α-lactalbumin, these investigations are of further relevance on account of its homology with lysozyme [1]. This similarity in structure is not, however, reflected in the enzymic properties of the two proteins, for lysozyme is thought to act as an anti-bacterial agent whilst α-lactalbumin is involved, along with a membrane-bound enzyme, in the biosynthesis of lactose in the mammary gland [2,3]. Although there is a superficial resemblance in the specificity of these activities, lysozyme catalysing the hydrolysis of a β 1-4 glycosidic linkage in the bacterial cell wall and α-lactalbumin participating in the synthesis of a similar bond in lactose, further investigation has revealed a wide divergence in the actual roles of the two proteins in the expression of their respective activities [4].

Thus, it is to be hoped that information on the sequences of α-lactalbumin and lysozyme may tell us something about the relationship of structural change to the evolutionary divergence of function between the two groups. As it was known that the primary structure of human lysozyme was under investigation [5], human α-lactalbumin has been chosen for comparative sequence studies.

Abbreviations. CNBr, cyanogen bromide; dansyl, 1-dimethylaminonaphthalene-5-sulphonyl.

Enzymes. Aminopeptidase M (EC 3.4.1.2); prolidase (EC 3.4.3.7); carboxypeptidase A (EC 3.4.2.1); carboxypeptidase B (EC 3.4.2.2); trypsin (EC 3.4.4.4); chymotrypsin (EC 3.4.4.5); thermolysin (EC 3.3.4.-); pronase (EC 3.4.4.-).

Peptides obtained by enzyme-catalysed hydrolysis of the cyanogen bromide fragments from human α-lactalbumin have been isolated and characterized, and we report here the amino acid sequences of these peptides and the way in which they were aligned to yield the complete primary structure of the protein.

This result is compared to the known primary structures of bovine α-lactalbumin [6] and human and hen egg-white lysozymes [5,7]. The degree of variability of certain parts of the α-lactalbumin molecule revealed by the comparison is discussed in relation to their possible role in the lactose synthetase specifier activity and to the antigenic activity of the protein.
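The identity figures quoted for such comparisons amount to counting matching positions in two aligned sequences. The sketch below is only an illustration of that arithmetic: the function name, the one-letter toy strings and the choice to skip gapped positions are assumptions, not the procedure used by the authors, who do not state how gaps entered their percentages.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of aligned, ungapped positions that carry the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # assumption: gapped positions are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared


# Hypothetical ten-residue toy alignment, for illustration only.
toy_a = "KQFTKCELSQ"
toy_b = "EQLTKCEVFR"
print(round(percent_identity(toy_a, toy_b), 1))
```

A count of chemically similar residues would need an explicit similarity grouping, which the text does not define, so it is left out of the sketch.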

EXPERIMENTAL PROCEDURE

MATERIALS

The human milk used in these studies was kindly supplied by the Maternity Hospital at Leeds (United Leeds Teaching Hospital Group). Since only small quantities of milk could be obtained from any single individual, no attempt was made to treat these samples separately.


Sephadex was purchased from Pharmacia (Great Britain), Bio-Gel P4 from Bio-Rad Laboratories, DEAE-cellulose from Whatman and M-72 from Beckman Instruments Ltd.

Chymotrypsin, carboxypeptidase A and trypsin were all obtained from Sigma Chemical Company Ltd, thermolysin and pronase from Calbiochem. Ltd, aminopeptidase M from Rohm and Haas (Darmstadt) and prolidase from Miles-Servac (PTY) Ltd. The cyanogen bromide, ethyleneimine, phenylisothiocyanate and 2-mercaptoethanol used in these studies were all produced by Koch-Light Laboratories Ltd; B.D.H. Ltd supplied 1-dimethylaminonaphthalene-5-sulphonyl chloride and trifluoroacetic acid. The polyamide layer sheets were obtained from The Cheng Chin Trading Company (Taiwan).

All other chemicals were reagent grade or better. Pyridine and phenylisothiocyanate were redistilled prior to use, the former from ninhydrin-containing solutions. Reagents and solutions prepared from these materials were stored at -16 °C.

METHODS

Purification of Human a-Lactalbumin

Whole human milk (pH 6.6) was adjusted to pH 4.6 with 1 M acetic acid, and centrifuged at 10000 rev./min for 30 min. Some of the caseins were sedimented whilst the fat formed a layer at the top of the vessel. The clear supernatant material was retained and, after adjusting to pH 7.0 with 1 M NaOH, solid ammonium sulphate was added to 30% saturation. After centrifugation as above, the precipitated material was discarded and the supernatant fraction taken to 80% saturation with ammonium sulphate. The precipitate so obtained was redissolved in a tenth of the original milk volume of water, and the pH adjusted to 4.0 with 1 M acetic acid. The resulting precipitate was collected as before, redissolved in as small a volume of water as possible and the pH taken up to 7.0.

Following extensive dialysis (3 days) against distilled water, the preparation was freeze-dried, dissolved in 0.05 M ammonium bicarbonate and applied to Sephadex G-100 (4 × 140 cm) equilibrated in the same buffer. Gel filtration was carried out in this buffer, at room temperature with a flow rate of 50 ml/h. The column eluant was monitored spectrophotometrically at 280 nm.

On occasion, a further column separation using the ion-exchange resin DEAE-cellulose in 0.02 M Tris-HCl pH 7.8, with a sodium chloride gradient of 0 to 0.3 M, was employed to ensure purity.

The resulting material gave a single band on polyacrylamide gel disc electrophoresis at pH 8.6 and consistent amino acid composition data. A single precipitin line was also obtained using the Ouchterlony double-diffusion procedure with rabbit antiserum to human α-lactalbumin.

Reduction, S-Aminoethylation and Cleavage with Cyanogen Bromide

The method used in the reduction and S-aminoethylation of human α-lactalbumin is described elsewhere [8]. The modified protein was separated from the reaction mixture by dialysis in acetylated tubing against distilled water for 3 or 4 days.

Cleavage of the resulting material with cyanogen bromide was subsequently carried out in 70% formic acid at a protein concentration of 10 mg/ml and a CNBr concentration of 20 mg/ml. After 24 h at room temperature, excess reagents were removed by rotary evaporation, the residual material washed with distilled water and then freeze-dried. The resulting peptides were separated by gel filtration with Sephadex G-75 (4 × 140 cm) equilibrated in 1% formic acid or 1 M acetic acid.

Polyacrylamide-Gel Electrophoresis

Disc electrophoresis with 7% polyacrylamide gels was performed as described by Davis [9]. For electrophoresis of the cyanogen bromide fragments the gels and buffers were prepared containing 4 M urea.

Amino-Acid Analysis

Amino acid analyses were performed with a Beckman Unichrome amino acid analyser fitted with high-sensitivity flow cells, following acid hydrolysis of the protein and peptides with 6 N HCl at 110 °C in evacuated sealed Pyrex tubes. The composition of human α-lactalbumin was obtained from 24, 48 and 96-h hydrolysates, extrapolating back to zero time for the serine, threonine and tyrosine values. It was assumed that the release of isoleucine and valine was complete after 96 h. The cysteine content was obtained upon analysis of the performic acid-oxidised protein [10]. Peptides were hydrolysed for 24 h but no correction for the destruction of serine, threonine and tyrosine, or for slow liberation of isoleucine and valine, was made. Contaminating amino acids less than 0.1 of a residue have been omitted from the composition tables.
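The extrapolation back to zero time for labile residues can be sketched as a straight-line fit of recovery against hydrolysis time, with the intercept taken as the corrected value. The text does not say whether a linear or a first-order decay model was used, so the linear fit here is an assumption, and the serine recoveries in the example are invented for illustration.

```python
def extrapolate_to_zero(times_h, recoveries):
    """Least-squares line through (time, recovery); returns (intercept, slope)."""
    n = len(times_h)
    if n < 2 or n != len(recoveries):
        raise ValueError("need at least two paired observations")
    mean_t = sum(times_h) / n
    mean_r = sum(recoveries) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(times_h, recoveries))
    slope = sxy / sxx
    intercept = mean_r - slope * mean_t  # estimated recovery at zero time
    return intercept, slope


# Hypothetical serine recoveries (residues per molecule) at 24, 48 and 96 h.
hours = [24, 48, 96]
serine = [7.76, 7.52, 7.04]
zero_time_value, loss_per_hour = extrapolate_to_zero(hours, serine)
print(round(zero_time_value, 2), round(loss_per_hour, 4))
```

For isoleucine and valine no such fit is needed under the stated assumption, since the 96-h value is simply taken as complete release.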

N-Terminal Determination

The N-terminal amino acids of the whole protein and the CNBr fragments were determined using a modification (personal communication, J. R. Wootton) of the dansyl chloride procedure [12]. In this method dansylation was carried out in the presence of dodecylsulphate and the N-terminal dansyl amino acids extracted from acid hydrolysates by ethyl acetate or 50% aqueous pyridine. These derivatives were subsequently identified by two-dimensional ascending chromatography on 7.5 or 5-cm square double-sided polyamide layers [13], using the solvent systems detailed below. Solvent 1 was invariably used in the first dimension and solvents 2 to 5 in the second dimension.

Solvent 1 = 1.5% (v/v) aqueous formic acid [13]; solvent 2 = benzene-acetic acid (9:1, v/v) [13]; solvent 3 = ethyl acetate-methanol-acetic acid (20:1:1, v/v) [14]; solvent 4 = pyridine-acetate pH 4.4-ethanol (3:1, v/v) [14]; solvent 5 = 0.05 M trisodium phosphate-ethanol (3:1, v/v) [14].

Systems 1 and 2 were sufficient for the identification of the dansyl derivatives of all amino acids with the exception of serine and threonine, aspartic and glutamic acids (both pairs resolved with solvent 3), arginine and lysine (separated in solvent 4) and histidine (resolved from ε-dansyl-lysine using solvent 5). Removal of the N-terminal residues was achieved by a modified Edman procedure in which the coupling step was performed in the presence of dodecylsulphate; the phenylthiocarbamyl-peptide was recovered by precipitation with acetone prior to cyclisation [11]. This allowed the N-terminal sequences of the whole protein and the CNBr fragments to be determined.

C-Terminal Determination

The C-terminal residue of the intact protein was identified using carboxypeptidase A. 10 mg of the native protein was dissolved in 3 ml of 0.5 M phosphate buffer pH 7.6 and 4% (w/w) carboxypeptidase A in 10% lithium chloride was added. The mixture was incubated at 37 °C and 0.5-ml aliquots were withdrawn into 0.6 ml of 0.1 M HCl after 0, 30, 75, 180 and 300 min. These samples were stored frozen and analysed without further treatment.

Enzyme Digestions

Trypsin. 5-10 µmol of the CNBr fragments were dissolved in 2.0 ml H2O and the pH adjusted to 8.6 with 0.1 M NaOH. Trypsin, equivalent to 1% (w/w) of the protein, was added and the hydrolysis allowed to proceed for 1 h at 37 °C. At this time a second 1% aliquot was added. The pH of the solution was constantly maintained at 8.6 with 0.1 N NaOH. After a second 1-h period the reaction was terminated by the addition of glacial acetic acid to a final pH of 2.5. The material was subsequently freeze-dried.
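As a rough guide to which peptides such a digest is expected to give, the sketch below cuts a residue list on the carboxyl side of lysine, arginine and S-aminoethyl-cysteine, the last being the usual consequence of aminoethylating cysteine before tryptic digestion. The function name, the residue labels and the toy peptide are illustrative assumptions, and the rule that a bond preceding proline is left intact is a general property of trypsin rather than something stated in the text.

```python
def tryptic_peptides(residues, cleave_after=("Lys", "Arg", "AECys")):
    """Split a residue list after each cleavage site, except before proline."""
    peptides = []
    current = []
    for i, res in enumerate(residues):
        current.append(res)
        nxt = residues[i + 1] if i + 1 < len(residues) else None
        if res in cleave_after and nxt is not None and nxt != "Pro":
            peptides.append(current)
            current = []
    if current:
        peptides.append(current)  # C-terminal peptide
    return peptides


# Hypothetical toy peptide; "AECys" stands for S-aminoethyl-cysteine.
toy = ["Leu", "Trp", "AECys", "Lys", "Ser", "Ser", "Gln", "Val",
       "Pro", "Gln", "Ser", "Arg", "Asn", "Ile"]
for peptide in tryptic_peptides(toy):
    print("-".join(peptide))
```

Real digests also contain overlapping products of incomplete cleavage, which this idealised rule does not model.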

Chymotrypsin. Digestion with chymotrypsin was carried out as with trypsin, except that a third aliquot of the enzyme was added after 2 h and the hydrolysis allowed to proceed for a total of 3 h.

Thermolysin. 5-10 µmol of the material to be digested was dissolved in 2 ml of 2.5 mM CaCl2 and the pH adjusted to 7.5 with 0.1 M NaOH. 1% thermolysin (w/w) was added at zero time and after 1 and 2 h, thereby giving a total of 3% and a hydrolysis time of 3 h. The pH was maintained as above and the reaction terminated with glacial acetic acid as previously described.

Separation of Peptides

Peptides were routinely separated using a column (0.9 × 55 cm) of Beckman M-72 resin, a spherical bead sulphonated polystyrene equivalent to Dowex 50-X8. Elution was performed with pyridine acetate buffers [6], using on most occasions a linear gradient from 0.05 M to 2 M pyridine. All buffers were deaerated before use.

The freeze-dried peptide mixtures were dissolved in 2 ml of 0.05 M pyridine acetate and any insoluble material removed by centrifugation. These precipitates were washed (2 × 0.5 ml) with the same buffer and the washings also applied to the column. Chromatography was carried out at 55 °C with flow rates of 50 to 60 ml/h. The columns were finally eluted with 200 ml of 2 M pyridine-acetate at the end of each gradient. The legends to each figure give the actual details in each case. The column eluate was monitored spectrophotometrically at 280 nm and by the alkaline hydrolysis-ninhydrin colorimetric assay on aliquots from each fraction [15].
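For an ideal linear gradient the pyridine concentration rises in proportion to the volume delivered. The sketch below applies that relation to a 0.05 M to 2 M gradient; the default total of 1000 ml corresponds to the 500 ml of each buffer quoted in the legend to Fig. 3, while the function name and the assumption of perfect mixing are illustrative only.

```python
def gradient_concentration(volume_ml, start_m=0.05, limit_m=2.0, total_ml=1000.0):
    """Ideal pyridine molarity once `volume_ml` of the gradient has been delivered."""
    if volume_ml < 0:
        raise ValueError("volume must be non-negative")
    # Past the end of the gradient the column is held at the limit buffer.
    fraction = min(volume_ml / total_ml, 1.0)
    return start_m + (limit_m - start_m) * fraction


for v in (0, 250, 500, 1000):
    print(v, "ml:", round(gradient_concentration(v), 3), "M")
```

The concentration actually reaching a given fraction lags this value by the volume of the column and tubing, which is not modelled here.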

Peptides were pooled as shown, dried by rotary evaporation and redissolved in water, 1% acetic acid or 0.05 M ammonium bicarbonate, depending on their solubility characteristics.

Impure pools were further fractionated by gel filtration on Bio-Gel P4 (2.2 × 140 cm) in 1% formic acid, at room temperature and with flow rates of 12 to 15 ml/h. Alternatively, high-voltage paper electrophoresis at 3 kV for 40 min in pyridine-acetate buffer pH 6.5 proved an effective procedure. The peptides thus separated were eluted from the strips of Whatman No. 3 paper with 0.1 M ammonium bicarbonate.

Sequence Analysis of Peptides

Purified peptides were subjected to sequence analysis using the combined Edman degradation-dansylation procedure [26]. The N-terminal dansyl amino acids so formed were separated and identified by two-dimensional ascending chromatography on double-sided polyamide layers as described.

Several observations arising from our experience with this technique should be mentioned. Where S-aminoethyl-cysteinyl residues were present in non-terminal positions in peptides, substantial modification or destruction of the amino acid occurred during the degradation process. As a result, the yield of bis-dansyl S-aminoethyl-cysteine was markedly reduced. In contrast, good yields of the derivative were obtained when the residue occupied an N-terminal position in the peptide. On chromatography, the dansyl S-aminoethyl-cysteine derivative ran slightly faster than dansyl cysteine in solvents 1 and 2.

A similar situation was observed with internal lysine residues. In this instance, three spots were observed, the N- and bis-dansyl derivatives, together with a component running in a position almost identical to that characteristic of leucine. A possible explanation for these effects would lie in the formation of phenylthiocarbamyl derivatives of the side chain amino groups of lysine and S-aminoethyl-cysteine during the Edman degradation procedure.

Acid hydrolysis of the dansyl peptide, which is a necessary step for the liberation of the N-terminal dansyl amino acid, caused the destruction of dansyl tryptophan. As a consequence, no spot was observed and the position of the tryptophanyl residue could only be inferred from the absence of such a spot at this stage, allied with the location of other residues in the peptide. Less ambiguous indication of the position of this amino acid was obtained from a timed course of enzymic hydrolysis.
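The logic of the combined procedure, including the blank cycle produced by tryptophan, can be mimicked in a few lines: at each cycle the current N-terminal residue is read as its dansyl derivative and is then removed by one Edman step. This is a simplified sketch with hypothetical names and a toy peptide; it ignores yields, carry-over and the side-chain effects described for lysine and S-aminoethyl-cysteine.

```python
def dansyl_edman(peptide, cycles=None):
    """Return the residue identified at each cycle; None marks a cycle with no spot."""
    remaining = list(peptide)
    n_cycles = len(remaining) if cycles is None else min(cycles, len(remaining))
    observed = []
    for _ in range(n_cycles):
        n_terminal = remaining[0]
        # Dansyl tryptophan is destroyed by the acid hydrolysis, so nothing is seen.
        observed.append(None if n_terminal == "Trp" else n_terminal)
        remaining.pop(0)  # Edman step exposes the next residue
    return observed


# Hypothetical three-residue peptide.
print(dansyl_edman(["Leu", "Trp", "AECys"]))
```

A blank cycle in this output is only a hint of tryptophan, which is why the text relies on enzymic hydrolysis for confirmation.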


Complete Enzymatic Hydrolysis of Peptides

The peptide (0.1 to 0.2 µmol) was dissolved in 2 ml of 5 mM MgCl2 and the resultant solution adjusted to pH 7.8 with 0.01 M NaOH. Aminopeptidase M (0.1 mg) and pronase (0.1 mg), also suspended in 5 mM MgCl2 pH 7.8, were added and the digestion allowed to proceed for 18 h at 37 °C. Where proline was present, prolidase was substituted for pronase. The mixture was subsequently acidified with 0.1 M HCl and analysed directly on the amino acid analyser without prior removal of the enzymes.

This technique was employed as an adjunct to the electrophoretic method (see below) used in the determination of the amide content of various peptides. Glutamine and asparagine emerged with serine on amino acid analysis. Used in conjunction with the results from acid hydrolysis of the same peptides, in which glutamine and asparagine are converted to the respective acids, an unambiguous estimate of the number and identity of the amidated residues could usually be obtained.
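The comparison of the two hydrolysates reduces to a subtraction: acid hydrolysis reports asparagine as aspartic acid and glutamine as glutamic acid, whereas enzymic hydrolysis leaves the amides intact, so the difference in each acid estimates the number of amidated residues. The sketch below assumes molar ratios keyed by residue name; the function name and the example values are hypothetical.

```python
def amide_counts(acid_hydrolysate, enzymic_hydrolysate):
    """Estimate Asn and Gln residues from acid minus enzymic Asp and Glu values."""
    counts = {}
    for acid_name, amide_name in (("Asp", "Asn"), ("Glu", "Gln")):
        difference = acid_hydrolysate.get(acid_name, 0.0) - enzymic_hydrolysate.get(acid_name, 0.0)
        if difference < -0.5:
            raise ValueError(f"enzymic {acid_name} exceeds acid-hydrolysis value")
        counts[amide_name] = max(0, round(difference))
    return counts


# Hypothetical molar ratios for one peptide.
acid = {"Asp": 2.05, "Glu": 1.98}
enzymic = {"Asp": 1.02, "Glu": 0.05}
print(amide_counts(acid, enzymic))
```

Because the amides co-elute with serine, the serine peak of the enzymic digest offers an independent check on the total, though not on which acid is amidated.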

The second important use of complete enzymatic hydrolysis was in the verification of the presence of tryptophan in peptides, following previous indication by absorption at 280 nm and the presence of degradation products of tryptophan on analysis of acid hydrolysates of the peptide.

Time-Course Digestion of Peptides

Carboxypeptidase A. 10 µl of the enzyme suspension (20 mg/ml) was dissolved in 50 µl of 0.1 N NaOH and the solution neutralised with an equivalent amount of 0.1 N HCl. 0.1 M NH4HCO3 pH 7.8 was added to bring the volume to 1 ml and 0.4-ml aliquots of the enzyme solution were added to 0.1 µmol of dried peptide and a control tube, respectively. Incubation was carried out at 25 °C, aliquots withdrawn at suitable intervals, acidified with glacial acetic acid and analysed directly without prior removal of enzyme or peptide.

Aminopeptidase M. A solution of aminopeptidase M (0.1 mg/ml) in 0.1 M NH4HCO3 pH 7.8 containing 2.5 mM MgCl2 was prepared. 0.2 µmol of peptide was dissolved in 0.4 ml of the enzyme preparation and incubation carried out at 25 °C. Aliquots were withdrawn and treated as above.

Determination of Amide Content

The paper electrophoretic procedure of Offord [17] was used.

RESULTS

Fig.1 shows the elution pattern obtained from Sephadex G-100 in the purification of human α-lactalbumin. It is possible to obtain a similarly shaped profile by applying the material obtained by simply dialysing and freeze-drying the pH-4.6 supernatant. However, the procedure detailed in Methods had two distinct advantages. The first was that, due to the removal of the bulk of the milk caseins, much larger amounts of α-lactalbumin could be purified with a single column separation. Secondly, the method also resulted in the removal of a small soluble casein of low absorption coefficient which eluted from Sephadex G-100 in the leading edge of the α-lactalbumin peak. Although present in small amounts, this material had a profound effect on the subsequent amino acid analysis of the α-lactalbumin, as a result of the very high proline and glutamic acid content of the contaminant.

Fig. 1. Purification of human α-lactalbumin on Sephadex G-100. 2.0 g of the pH-4-precipitated fraction of human milk was applied to a column (4 × 150 cm) of Sephadex G-100, equilibrated in 0.05 M ammonium bicarbonate. The column was developed with this buffer at 50 ml/h at room temperature. Horizontal axis: volume (ml).


Table 1. The amino-acid compositions of human α-lactalbumin and its cyanogen-bromide fragments
The results are given as number of residues per protein molecule as determined by sequence and amino acid composition

                     Whole protein      CNBr E N-terminal   CNBr D             CNBr E C-terminal
                                        fragment                               fragment
Amino acid           Sequence  Compn.   Sequence  Compn.    Sequence  Compn.   Sequence  Compn.

Aspartic acid        16        15.5     2         2.0       12        11.6     2         2.2
Threonine            7         6.8      2         1.6       4         4.2      1         1.1
Serine               8         7.8      1         0.8       7         7.0      -         -
Glutamic acid        15        14.8     4         3.9       7         6.8      4         4.4
Proline              2         2.4      1         0.9       1         1.0      -         -
Glycine              6         6.0      3         2.7       2         2.0      1         1.4
Alanine              5         5.2      1         1.1       1         1.2      3         2.7
1/2 Cystine          8         7.8      2         1.9       3         2.7      3         2.6
Valine               2         2.0      -         -         2         1.6      -         -
Methionine a         2         2.0      (1)       (1)       (1)       (1)      -         -
Isoleucine           12        11.6     3         2.8       6         5.7      3         3.0
Leucine              14        13.7     5         4.6       3         3.3      6         5.1
Tyrosine             4         4.1      1         0.7       2         1.9      1         0.8
Phenylalanine        4         4.1      1         0.8       3         2.8      -         -
Tryptophan           3                  -         -         1         +        2         +
Lysine               12        12.0     3         2.9       3         3.2      6         5.6
Histidine            2         2.0      -         -         1         1.0      1         0.9
Arginine             1         1.1      -         -         1         1.0      -         -

Total                123                30                  60                 33

N-Terminal           Lys                Lys                 Phe                Cys
                     Glu                Glu                 His                Ala
                     Phe                Phe                 -                  Lys

a In CNBr fragments methionine occurred as homoserine lactone on analysis of acid-hydrolysed peptides.
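The sequence-derived counts in Table 1 can be checked for internal consistency: for every amino acid the three fragments should add up to the whole protein, and the column totals should be 123, 30, 60 and 33. The sketch below encodes the sequence columns as read from the table, taking cells marked "-" as zero, which is an assumption of this illustration; the three-letter keys and function name are likewise illustrative.

```python
# (whole protein, CNBr E N-terminal fragment, CNBr D, CNBr E C-terminal fragment)
TABLE1_SEQUENCE_COUNTS = {
    "Asp": (16, 2, 12, 2), "Thr": (7, 2, 4, 1), "Ser": (8, 1, 7, 0),
    "Glu": (15, 4, 7, 4), "Pro": (2, 1, 1, 0), "Gly": (6, 3, 2, 1),
    "Ala": (5, 1, 1, 3), "Cys": (8, 2, 3, 3), "Val": (2, 0, 2, 0),
    "Met": (2, 1, 1, 0), "Ile": (12, 3, 6, 3), "Leu": (14, 5, 3, 6),
    "Tyr": (4, 1, 2, 1), "Phe": (4, 1, 3, 0), "Trp": (3, 0, 1, 2),
    "Lys": (12, 3, 3, 6), "His": (2, 0, 1, 1), "Arg": (1, 0, 1, 0),
}


def check_table(counts):
    """Return the column totals and the residues whose fragments do not sum to the whole."""
    totals = [sum(row[i] for row in counts.values()) for i in range(4)]
    mismatches = [name for name, (whole, *parts) in counts.items() if whole != sum(parts)]
    return totals, mismatches


totals, mismatches = check_table(TABLE1_SEQUENCE_COUNTS)
print(totals, mismatches)
```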

The yield of human α-lactalbumin obtained by this procedure was 1.7 to 1.8 mg/ml of the original milk. All preparations gave single bands on polyacrylamide disc electrophoresis and were active in the lactose synthetase reaction. The absorption coefficient A(1%, 1 cm) measured by the dry weight method was found to be 18.8 and the amino acid composition is given in Table 1. The results of polyacrylamide gel electrophoresis and chromatography with DEAE-cellulose indicated an absence of genetic variants in the milk samples examined.

SEPARATION OF CNBr FRAGMENTS

Also shown in Table 1 are the amino acid composition data for the three fragments obtained upon cleavage (with cyanogen bromide) of the S-aminoethylated protein at the two methionyl residues.
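Cyanogen bromide cleaves on the carboxyl side of methionine, so two methionyl residues give three fragments. The sketch below shows that bookkeeping on a placeholder chain of 123 residues with methionine at positions 30 and 90, the positions implied by the fragment sizes of 30, 60 and 33 residues in Table 1. The placeholder string is not the real sequence, and the function name is hypothetical.

```python
def cnbr_fragments(sequence: str):
    """Split a one-letter sequence after every methionine (M)."""
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue == "M":
            fragments.append(sequence[start:i + 1])  # Met ends the fragment
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal fragment without Met
    return fragments


# Placeholder chain: 'A' stands for any non-methionine residue.
placeholder = "A" * 29 + "M" + "A" * 59 + "M" + "A" * 33
print([len(f) for f in cnbr_fragments(placeholder)])
```

In the analysed fragments the C-terminal methionine appears as homoserine lactone, as noted under Table 1.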

The separation profile of these fragments is shown in Fig.2. Examination of peaks D and E by polyacrylamide gel disc electrophoresis in urea revealed that although peak D was pure, peak E contained two components. It has not proved possible to separate these two components on a preparative scale by any of the techniques of gel filtration, ion-exchange chromatography, high-voltage electrophoresis or paper chromatography so far tried. They were resolved, however, on a small scale by high-voltage electrophoresis at pH 2.5 (2 h, 100 mA and 3 kV) on silica gel thin-layer plates (20 × 20 cm). Accordingly, although enough material could be obtained for amino acid analysis and N-terminal sequence determination, proteolytic digestions were carried out on the mixture of fragments.

Fig. 2. Separation of CNBr fragments of S-aminoethylated human α-lactalbumin. 120 mg of CNBr-cleaved material was separated with Sephadex G-75 (2.5 × 150 cm), developed at room temperature with 1% formic acid and at a flow rate of 12 ml/h. Horizontal axis: volume (ml).


Fig. 3. Separation of tryptic peptides from CNBr E. A tryptic digest of 40 mg of CNBr E was chromatographed on a column (0.9 × 55 cm) of Beckman M-72 using linear gradient elution at 50 ml/h with 500 ml each of 0.05 M pyridine-acetate pH 2.5 as starting buffer and 2 M pyridine-acetate pH 5.0 as limit buffer. 3.5% of each fraction was sampled for ninhydrin estimation (solid line), the absorbance being measured at 570 nm; dotted line, absorbance at 280 nm. Horizontal axis: volume (ml).

Of the remaining peaks in the elution pattern, peak A was devoid of protein, peak B was identical in composition to the whole protein and peak C appeared to consist, judging by its amino acid composition, of a combination of two partial cleavage products. Calculation of yields based on the material in these pools indicated that nearly 90% of the S-aminoethylated protein had been at least partially cleaved by cyanogen bromide, nearly 70% appearing as the three CNBr fragments.

N- and C-Terminal Amino Acids

Also shown in Table 1 are the N-terminal amino acids of the whole protein and the cyanogen bromide fragments determined by the dansyl chloride procedure. On digestion with carboxypeptidase A, leucine was the only amino acid released in significant amounts. The percentage recovery of the residue was 95%, the maximum contaminant being tyrosine, present in quantities less than 10% of leucine after the 25.5-h digestion period. From this result it was suspected that the pre-C-terminal residue of the protein was not susceptible to cleavage with carboxypeptidase A, indicating the possible presence of a basic residue or proline. In summary, human α-lactalbumin is a protein 123 residues in length, the C-terminal amino acid being leucine and the N-terminal residue lysine. In this last-mentioned respect, it differs from bovine α-lactalbumin [6], which has an N-terminal glutamyl residue.

ISOLATION AND CHARACTERIZATION OF PEPTIDES AFTER DIGESTION OF CNBr FRAGMENTS WITH HYDROLYTIC ENZYMES

An example of the chromatographic separation of peptides from the CNBr pools is shown in Fig.3.

Fig. 4. Purification of T16, T17. 4.5 µmol T16, T17 obtained from the chromatographic separation illustrated in Fig. 3 were applied to a column (2.0 × 150 cm) of Bio-Gel P4 equilibrated with acetic acid. The column was developed in this buffer at a flow rate of 12 ml/h. 5% of every second fraction was subjected to the ninhydrin assay. Solid line, the absorbance being measured at 570 nm; dotted line, absorbance at 280 nm. Horizontal axis: volume (ml).

For the further purification of pools containing more than one species, the single technique which proved most successful was gel filtration with Bio-Gel P4. Fig.4 illustrates such a re-separation, in this case of T16 and T17 from CNBr E. On one occasion, however, this method proved unsuccessful and purification was achieved by high-voltage electrophoresis at pH 6.5.

Peptides were checked for purity by high-voltage electrophoresis at pH 6.5 and 3.5. Where further purification was necessary the method employed

Table 2. Amino-acid compositions, yields and N-terminal residues of tryptic peptides from CNBr D
The numbers in parentheses after the molar ratio figures represent the assumed number of residues. Tryptophan content was confirmed by complete enzymic hydrolysis of the respective peptides

Peptide columns: T5, T6, T6,7, T7, T8, T9, T10, T11, T12, T13. The final two columns give the whole CNBr D values from the peptide total and from composition. For each amino acid only the non-empty cells are listed, in column order.

Aspartic acid: 3.99 (4), 1.14 (1), 0.97 (1), 0.99 (1), 1.01 (1), 1.96 (2), 4.83 (5); peptide total 12, compn. 12
Threonine: 2.37 (3), 0.75 (1); peptide total 4, compn. 4
Serine: 2.70 (3), 0.86 (1), 0.70 (1), 2.39 (3), 0.88 (1), 0.98 (1); peptide total 7, compn. 7
Glutamic acid: 4.76 (5), 1.06 (1), 0.81 (1), 2.05 (2); peptide total 7, compn. 7
Proline: 0.87 (1); peptide total 1, compn. 1
Glycine: 1.97 (2), 1.10 (1), 0.86 (1); peptide total 2, compn. 2
Alanine: 1.06 (1); peptide total 1, compn. 1
Valine: 0.95 (1), 0.85 (1); peptide total 2, compn. 2
Tyrosine: 1.66 (2); peptide total 2, compn. 2
Isoleucine: 1.65 (2), 0.86 (1), 0.90 (1), 0.96 (1), 0.84 (1), 0.81 (1), 2.10 (2); peptide total 6, compn. 6
Leucine: 0.96 (1), 0.92 (1), 0.86 (1), 1.05 (1), 0.89 (1); peptide total 3, compn. 3
Phenylalanine: 1.58 (2), 0.81 (1), 0.73 (1), 0.91 (1); peptide total 3, compn. 3
Tryptophan: + (1); peptide total 1, compn. 1
Lysine: 0.73 (1), 0.81 (1), 0.98 (1), 0.93 (1), 1.10 (1); peptide total 3, compn. 3
S-Aminoethyl-cysteine: 0.92 (1), 0.93 (1), 0.66 (1), 0.77 (1); peptide total 3, compn. 3
Histidine: 0.65 (1); peptide total 1, compn. 1
Arginine: 0.93 (1); peptide total 1, compn. 1
Homoserine lactone: + (1), + (1), + (1)

No. of residues: 28, 3, 8, 6, 4, 8, 3, 4, 6, 11; peptide total 60, compn. 60
N-terminal residue: Phe, Gly, Gly, Glx, Leu, Ser, Asx, Asx, Asx, Phe, Lys
Yield (%): 28, 24, 14, 6, 36, 56, 66, 24, 42, 45

Table 3. Amino-acid compositions, yields and N-terminal residues of tryptic peptides from CNBr E
The numbers in parentheses after the molar ratio figures represent the assumed number of residues. Tryptophan content was confirmed by complete enzymic hydrolysis of the respective peptides. Peptides T2, T16, T17 and T19 were reseparated on Bio-Gel P4

Peptide columns: T1, T2, T2A, T3, T4, T3,4, T14, T15, T14,15, T16, T17, T18, T19, T19A. The final two columns give the whole CNBr E values from the peptides total and from composition. For each amino acid only the non-empty cells are listed, in column order.

Aspartic acid: 2.05 (2), 2.06 (2), 0.99 (1), 0.99 (1), 1.02 (1); peptides total 4, compn. 4
Threonine: 0.9 (1), 1.00 (1), 0.80 (1), 0.84 (1); peptides total 3, compn. 3
Serine: 0.96 (1), 0.75 (1); peptides total 1, compn. 1
Glutamic acid: 1.0 (1), 2.00 (2), 2.02 (2), 1.00 (1), 1.00 (1), 0.99 (1), 2.08 (2), 0.98 (1), 1.02 (1); peptides total 8, compn. 8
Proline: 1.11 (1), 1.12 (1); peptides total 1, compn. 1
Glycine: 2.75 (3), 2.76 (3), 1.07 (1); peptides total 4, compn. 4
Alanine: 0.99 (1), 0.97 (1), 1.04 (1), 0.95 (1); peptides total 3, compn. 4
Valine: no entries
Isoleucine: 2.72 (3), 2.79 (3), 1.84 (2), 1.73 (2), 0.98 (1); peptides total 6, compn. 6
Leucine: 2.72 (3), 2.83 (3), 2.00 (2), 2.00 (2), 0.95 (1), 1.10 (1), 0.99 (1), 0.92 (1), 2.01 (2), 1.01 (1); peptides total 11, compn. 11
Tyrosine: 0.82 (1), 0.97 (1), 0.84 (1); peptides total 2, compn. 2
Phenylalanine: 0.9 (1); peptides total 1, compn. 1
Tryptophan: peptides total 2, compn. 2
Lysine: 2.0 (2), 1.04 (1), 1.00 (1), 1.00 (1), 0.93 (1), 2.03 (2), 0.93 (1), 1.06 (1), 0.98 (1), 0.98 (1); peptides total 8, compn. 9
S-Aminoethyl-cysteine: 0.90 (1), 0.81 (1), 0.88 (1), 0.71 (1), 0.91 (1); peptides total 4, compn. 5
Histidine: 0.84 (1); peptides total 1, compn. 1
Arginine: no entries
Homoserine lactone: +, +; peptides total 1, compn. 1

No. of residues: 5, 8, 7, 15, 2, 17, 1, 5, 6, 9, 6, 6, 3, 2; peptides total 60, compn. 63
Yield (%): 90, 38, 60, 14, 20, 58, 70, 90, 10, 66, 66, 62, 40, 16


Table 4. Amino-acid compositions, yields and N-terminal residues of chymotryptic peptides from CNBr D
The numbers in parentheses after the molar ratio figures represent the assumed number of residues. Tryptophan content was confirmed by complete enzymic hydrolysis of the respective peptides

Amino acid; peptides C7, C8, C9, C10, C11, C12, C13, C12,13, C14, C15

Aspartic acid 3.02(3) 1.09(1) 7.78(8) 7.87(8)
Threonine 1.04(1) 1.65(2) 0.93(1) 0.96(1)
Serine 1.15(1) 0.92(1) 0.76(1) 2.41(3) 2.69(3) 1.09(1) 2.06(2)
Glutamic acid 3.72(4) 1.01(1) 2.02(2) 2.04(2)
Proline 0.97(1) 0.97(1)
Glycine 1.01(1) 0.94(1)
Alanine 0.95(1)
Valine 0.95(1) 0.95(1) 1.14(1)
Isoleucine 0.96(1) 0.81(1) 3.81(4) 3.74(4)
Leucine 1.02(1) 1.06(1) 1.08(1) 1.11(1)
Tyrosine 0.88(1) 0.51(1)
Phenylalanine 0.99(1) 0.61(1) 0.84(1) 0.93(1)
Tryptophan +(1)
Lysine 0.98(1) 0.96(1) 1.06(1) 1.08(1) 1.04(1)
S-Aminoethyl-cysteine 0.90(1) 0.97(1) 1.51(2) 1.41(2)
Histidine 0.96(1)
Arginine 1.00 0.80(1) 0.66(1)
Homoserine lactone + +
No. of residues 6 14 3 4 3 9 1 10 20 22
N-terminal residues Phe Asx Gly Glx Lys Cys Arg Cys Asx Ser
Yield (%) 78 19 100 26 50 20 30 28 15 15

is indicated. The compositions, yields and N-terminal residues of these peptides are shown in Tables 2 to 7. The nomenclature used is as follows: tryptic peptides are denoted by the letter T, chymotryptic by C and thermolysin by Th. The numbering of these peptides denotes their position in the complete sequence of the protein, beginning at the NH2-terminus. Where one peptide also occurs as two smaller fragments, the larger peptide is denoted by the numbers assigned to both the smaller pieces, e.g. T3,4 is a fragment containing the same residues as T3 and T4. Yields varied from 10% to 100%, and were generally greater from hydrolysates of the fragments contained in the CNBr E pool. This was foreshadowed by the presence of small amounts of insoluble material in all the digests of CNBr D.

Sequence analysis of the above peptides has permitted deduction of the complete primary structure of the α-lactalbumin. The proof of this sequence is detailed below. For ease of discussion the largest of the three CNBr pieces has been further subdivided into two regions: 31 to 58 and 59 to 90.

Residues 1 to 30

This segment represents the N-terminal fragment contained in peak CNBr E. Its amino acid sequence was unambiguously derived from tryptic peptides T1, T2, T3, T3,4 and T4 together with overlapping chymotryptic peptides C1, C2, C3, C3,5, C5 and C6. Further verification came from thermolysin peptides Th1, Th2, Th3, Th4,4A, Th5, Th6A, Th6, Th7 and Th7A.

Table 5. Amino-acid compositions, yields and N-terminal residues of chymotryptic peptides from CNBr E
The numbers in parentheses after the molar ratio figures represent the assumed number of residues. Tryptophan content was confirmed by complete enzymic hydrolysis of the respective peptides. Peptides C1, C5 and C18 were reseparated on Bio-Gel P4

Amino acid; peptides C1, C2, C3, C2,3, C4, C5, C3,5, C6, C16, C16A, C17, C17A, C18, C17,18, C19, C20

Aspartic acid 1.79(2) 1.88(2) 2.01(2) 1.82(2)
Threonine 0.75(1) 0.96(1) 0.98(1) 0.84(1)
Serine 0.80(1) 0.93(1) 0.73(1) 0.82(1) 2.81(3) 0.98(1)
Glutamic acid 1.02(1) 1.00(1) 0.97(1) 1.82(2) 1.04(1) 1.07(1) 1.04(1)
Proline 1.12(1)
Glycine 1.04(1) 0.97(1) 2.07(2) 1.03(1) 1.00(1)
Alanine 1.01(1) 0.96(1) 1.01(1) 0.94(1) 1.82(2)
Valine
Isoleucine 1.14(1) 1.16(1) 1.93(2) 1.74(2) 1.81(2)
Leucine 0.94(1) 1.02(1) 1.80(2) 1.99(2) 0.98(1) 2.04(2) 2.03(2) 1.00(1) 1.03(1) 1.08(1) 1.96(2) 1.04(1) 2.05(2)
Tyrosine 0.59(1) 0.70(1) 0.57(1) 0.54(1)
Phenylalanine 0.81(1)
Tryptophan +(1) +(1)
Lysine 0.96(1) 0.92(1) 0.99(1) 0.90(1) 0.89(1) 0.99(1) 0.83(1) 0.98(1) 1.11(1) 0.88(1) 1.15(1)
S-Aminoethyl-cysteine 0.66(1) 1.03(1) 0.65(1) 0.76(1) 0.91(1)
Histidine 0.96(1) 0.77(1) 0.56(1)
Arginine
Homoserine lactone +
No. of residues 3 5 3 8 4 7 10 12 8 7 4 3 2 6 8 5
N-terminal residue Lys Thr Ser Thr Ser Ser Ser Gly Asx Asx Leu Leu Ala Leu Cys Leu
Yield (%) 77 20 18 21 17 44 24 54 66 11 17 16 31 7 42 30

Table 6. Amino-acid compositions, yields and N-terminal residues of thermolysin peptides from CNBr D
The numbers in parentheses after the molar ratio figures represent the assumed number of residues. Tryptophan content was confirmed by complete enzymic hydrolysis of the respective peptides. Peptides Th10 and Th17A were reseparated by high-voltage electrophoresis on paper, Th11 and Th14 by Bio-Gel P4

Amino acid; peptides Th8, Th9, Th8,9, Th10, Th11, Th11A, Th12, Th13, Th14A, Th14, Th15, Th16, Th16A, Th17A, Th17

Aspartic acid 1.1(1) 0.98(1) 1.96(2) 1.04(1) 1.06(1) 1.08(1) 1.16(1) 1.02(1) 1.12(1) 1.01(1) 4.83(5) 4.61(5)
Threonine 0.86(1) 0.82(1) 1.80(2) 1.13(1) 0.74(1) 0.95(1)
Serine 0.92(1) 0.96(1) 0.93(1) 0.80(1) 2.36(3) 1.61(2) 2.60(3) 0.92(1) 0.87(1) 1.09(1) 0.97(1) 2.98(3) 0.96(1) 1.06(1) 2.05(2) 1.95(2) 1.92(2)
Glutamic acid
Proline 1.02(1) 1.10(1) 0.90(1)
Glycine 0.94(1) 0.99(1) 0.98(1)
Alanine 1.01(1) 1.00(1)
Valine 0.66(1) 1.02(1) 0.84(1) 0.99(1)
Isoleucine 0.95(1) 0.94(1) 0.90(1) 0.94(1) 0.86(1) 2.10(2) 1.11(1)
Leucine 1.00(1) 1.13(1) 0.88(1) 1.05(1)
Tyrosine 0.82(1) 0.97(1) 0.60(1)
Phenylalanine 0.93(1) 1.07(1) 1.09(1) 0.95(1) 0.88(1) 0.91(1)
Tryptophan +(1)
Lysine 0.98(1) 0.97(1) 0.97(1) 1.12(1)
S-Aminoethyl-cysteine 0.78(1) 1.00(1) 1.00(1) 1.14(1)
Histidine 1.00(1) 0.88(1)
Arginine 0.66(1) 1.00(1) 0.98(1)
Homoserine lactone +
No. of residues 3 7 10 11 3 2 4 13 8 9 3 5 6 11 8
N-terminal residue Phe Ser Phe Ile Leu Phe Ile Leu Ser Ser Ile Ile Ile Phe Leu
Yield (%) 16 14 13 20 43 17 62 18 10 14 35 23 10 15 50


Table 7. Amino-acid compositions, yields and N-terminal The numbers in parentheses after the molar ratio figures represent the assumed number of residues.

Amino acid; peptides Th1, Th2, Th2Aᵃ, Th3, Th4ᵃ, Th4A, Th4,4Aᵃ, Th5ᵃ, Th6

Aspartic acid 1.06(1) 1.00(1) 1.01(1)
Threonine 0.81(1) 0.88(1)
Serine 0.81(1)
Glutamic acid 1.08(1) 1.02(1) 1.04(1) 1.06(1) 1.05(1)
Proline 1.01(1)
Glycine 2.97(3)
Alanine 0.96(1)
Valine
Isoleucine 0.98(1) 0.97(1)
Leucine 0.97(1) 1.00(1) 0.92(1) 1.97(2) 0.98(1)
Tyrosine 0.84(1)
Phenylalanine 0.92(1)
Tryptophan
Lysine 0.95(1) 1.04(1) 1.01(1) 1.06(1) 1.00(1)
S-Aminoethyl-cysteine 0.97(1) 0.97(1)
Histidine
Arginine
Homoserine lactone
No. of residues 2 5 4 3 1 3 4 6 5
N-terminal residue Lys Phe Thr Leu Leu Leu Leu Ile Ile
Yield (%) 82 55 25 58 60 54 24 100 57

ᵃ Reseparated on Bio-Gel P4.

A summary of the way in which these pieces were uniquely aligned to give the whole segment is given in Fig.5.

T1 (Residues 1 to 5). Sequence: Lys-Glx-Phe-Thr-Lys. Since the first three residues of this peptide are identical to those obtained from sequence analysis of the whole protein, it was judged to represent the first five amino acids in the primary structure of α-lactalbumin. Peptides Th1 and C1 therefore represent the first 2 and first 3 residues respectively, at the N-terminal end of the protein.

T2 (Residues 6 to 13). Sequence: S-AE-Cys-Glx-Leu-Ser-Glx-Leu-Leu-Lys. Overlaps which determined the position of T2 in the protein are provided by C2, C2,3 and Th2 as shown in the table. A peptide was also obtained in which cleavage had occurred after the S-aminoethyl-cysteinyl residue (S-AE-Cys).

T3,4 (Residues 14 to 30). Sequence: Asx-Ile-Asx-Gly-Tyr-Gly-Gly-Ile-Ala-Leu-Pro-Glx-(Leu,Ile,S-AE-Cys,Thr,Hms-lactone), where Hms = homoserine. A certain proportion of this segment also occurred as two further cleavage products T3 and T4, hydrolysis taking place after the S-aminoethyl-cysteinyl residue. It proved impossible to obtain unambiguous sequence data for this peptide after residue 25 but residues 26 to 30 could be easily identified from Th6, Th7, Th7A, T4 and C6. Overlaps between T2 and T3,4 were provided by peptides Th4,4A, Th4A, C5 and C3,5.

In this way a unique sequence was obtained for residues 1 to 30. The whole segment was recovered as peptides from each of the three hydrolytic digests employed.

(Table 7, continued) residues of thermolysin peptides from CNBr E. Tryptophan content was confirmed by complete enzymic hydrolysis of the respective peptides

Peptides Th6A, Th7, Th7Aᵃ, Th18ᵃ, Th19, Th20ᵃ, Th20Aᵃ, Th21ᵃ, Th4,21A, Th21A, Th22ᵃ, Th23, Th24ᵃ

0.95(1) 1.04(1) 1.00(1)
1.03(1) 0.83(1) 0.83(1)
1.08(1) 0.83(1) 2.07(2) 1.04(1)
1.03(1)
1.04(1)
0.98(1) 1.00(1) 2.09(1) 1.99(2)
0.92(1) 0.92(1) 1.04(1) 0.91(1) 0.94(1) 0.91(1) 0.97(1)
1.81(2) 1.14(1) 1.01(1) 1.09(1) 0.93(1) 1.86(2)
0.85(1) 0.89(1)
+(1)
0.96(1) 0.92(1) 1.00(1) 1.07(1) 1.04(1) 1.01(1)
0.80(1) 0.90(1) 0.90(1) 0.97(1)
0.88(1) 0.90(1) 0.91(1)
+ +
6 3 4 3 3 3 4 3 5 4 5 4 5
Ile Ile Ile Ile Ile Ile Ile Ala Leu Ala Leu Leu Leu
20 12 40 71 58 12 30 13 16 20 52 31 20

Residues 31 to 58

This entire segment of the protein was isolated from the mixture of peptides obtained after digestion of CNBr D with trypsin. Some hydrolysis also occurred after tyrosine-50. Direct analysis of this peptide (T5) gave the first 9 residues of the region: Phe-His-Thr-Ser-Gly-Tyr-Asx-Thr-Glx. As the terminal dipeptide sequence is uniquely identical to that of the whole fragment, T5 can be unambiguously aligned at the N-terminus of CNBr D. The complete sequence of the fragment was, however, deduced from the shorter peptides obtained upon digestion with thermolysin and chymotrypsin (see Fig. 6).

Residues 59 to 90

The link between the two segments of CNBr D was provided by C11 (sequence: Lys-Leu-Trp). This N-terminal lysyl residue also occurred in T7, Th12 and T6,7 and the sequence Leu-Trp in T8 and Th13.

The frequency of basic residues in this section of the molecule gave rise to a series of smaller tryptic peptides, which provided us with the entire sequence. Their alignment was ascertained from suitable overlapping chymotryptic and thermolysin peptides.

T8 (Residues 59 to 62). Sequence: Leu-Trp-S-AE-Cys-Lys. The presence of tryptophan was confirmed by the absorbance of the peptide at 280 nm and by complete enzymatic hydrolysis: Leu (1.02), Trp (0.61), S-AE-Cys (0.76), Lys (0.98). The position of this residue in the peptide was ascertained using

Residue numbers: 1, 10, 20, 30

Sequence: Lys-Gln-Phe-Thr-Lys-Cys-Glu-Leu-Ser-Gln-Leu-Leu-Lys-Asp-Ile-Asp-Gly-Tyr-Gly-Gly-Ile-Ala-Leu-Pro-Glu-Leu-Ile-Cys-Thr-(Homoserine lactone)

Peptide No.
Th1 Lys-Glx-
C1 Lys-Gln-Phe
T1 Lys-Glx-Phe-Thr-Lys
Th2 Phe-Thr-Lys-Cys-Glu
C2 Thr-Lys-Cys-Glx-Leu
Th3 Leu-Ser-Gln
C3 Ser-Glx-Leu
C4 Ser-Glx-Leu-Leu
T2 Cys-Glx-Leu-Ser-Glx-Leu-Leu-Lys
C2,3 Thr-Lys-Cys-Glx-Leu-Ser-Glx-Leu
Th4,4A Leu-Leu-Lys-Asx
C5 Leu-Lys-Asp-Ile-Asp-Gly-Tyr
Th4A Leu-Lys-Asp
C3,5 Ser-Glx-Leu-Leu-Lys-Asp-Ile-Asp-Gly-Tyr
T3 Asx-Ile-Asx-Gly,Tyr,Gly,Gly,Ile,Ala,Leu,Pro,Glx,Leu,Ile,Cys
T3,4 Asx-Ile-Asx-Gly-Tyr-Gly-Gly-Ile-Ala-Leu-Pro-Glx-Leu,Ile,Cys,Thr,(homoserine lactone).
Th5 Ile-Asx-Gly-Tyr-Gly-Gly-
Ile-Cys,Thr,(homoserine lactone)
Th6 Ile-Ala-Leu-Pro-Glu-
Th6A Ile-Ala-Leu-Pro-Glx-Leu
Th7 Ile-Cys-Thr
Th7A Ile-Cys-Thr-homoserine lactone
T4 Thr-homoserine lactone
C6 Gly-Gly-Ile-Ala-Leu-Pro-Glx-Leu-Ile-Cys-Thr-homoserine lactone

Fig. 5. Summary of proof of the sequence of residues 1 to 30 from CNBr E. The sequences have been derived from tryptic, chymotryptic and thermolysin peptides of CNBr E. Residues shown in parentheses are present according to the amino-acid composition but were not sequenced

Residue numbers: 31, 40, 50

Sequence: Phe-His-Thr-Ser-Gly-Tyr-Asp-Thr-Gln-Ala-Ile-Val-Glu-Asn-Asp-Gln-Ser-Thr-Glu-Tyr-Gly-Leu-Phe-Gln-Ile-Ser-Asn-Lys

Peptide No.
Th8 Phe-His-Thr
C7 Phe-His-Thr-Ser-Gly-Tyr
Th9 Ser-Gly-Tyr-Asp-Thr-Gln-Ala
Th8,9 Phe-His-Thr-Ser-Gly-Tyr-Asx-Thr-Glx-Ala
C8 Asx-Thr-Glx-Ala-Ile-Val(Glx,Asx,Asx,Glx,Ser,Thr,Glx,Tyr)
Th10 Ile-Val-Glx-Asx-Asx-Glx-Ser-Thr-Glx-Tyr-Gly
C9 & T6 Gly-Leu-Phe
Th11 Leu-Phe-Glx
Th11A Phe-Gln
C10 Glx-Ile-Ser-Asx
T7 Glx-Ile-Ser-Asx-Lys
Th12 Ile-Ser-Asx-Lys
T6,7 Gly-Leu-Phe-Glx-Ile-Ser-Asx-Lys
Lys-Leu-Trp

Fig. 6. Summary of the proof of the sequence of residues 31 to 58 from CNBr D. For details see legend of Fig. 5


the subtractive Edman procedure, rather than the dansyl Edman technique. In this case, the disappearance of the tryptophan degradation products was noted after two rounds of Edman degradation.

T9 (Residues 63 to 70). Sequence: Ser-Ser-Glx-Val-Pro-Glx-Ser-Arg. This peptide was recovered in high yield. Evidence for overlap with T8 was provided by the larger fragment Th13.

T10 (Residues 71 to 73). Sequence: Asx-Ile-S-AE-Cys.

T11 (Residues 74 to 77). Sequence: Asx-Ile-Ser-S-AE-Cys. An extended version of this peptide was also isolated (T12) and has the sequence Asx-Ile-Ser-S-AE-Cys-Asx-Lys. A time course incubation with aminopeptidase M at 25 °C yielded the following results: 10 min: Asp (0.47), Ile (0.20), Ser (0.13); 3 h: Asp (1.40), Ile (0.62), Ser (0.46), S-AE-Cys (0.30), Lys (0.18); 18 h: Asp (2.14), Ile (0.98), Ser (0.71), S-AE-Cys (0.78), Lys (0.98). The necessary overlap between T10 and T11 was given by Th15 (Ile-S-AE-Cys-Asx).
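The reasoning behind such a time course can be sketched in a few lines of Python: residues nearer the N-terminus are released earlier by an aminopeptidase, so ranking the amounts at an intermediate time point suggests the order in the chain. This is only an illustration using the molar ratios quoted above; the function name `release_order` is hypothetical, and the sketch assumes that aspartic acid from positions 1 and 5 is pooled in the analysis, as acid and amide forms of Asx cannot be told apart by position here.

```python
# Amounts released by aminopeptidase M from T12, as quoted in the text
# (mol residue per mol peptide). The dictionary layout is an assumption.
time_course = {
    "10 min": {"Asp": 0.47, "Ile": 0.20, "Ser": 0.13},
    "3 h": {"Asp": 1.40, "Ile": 0.62, "Ser": 0.46, "S-AE-Cys": 0.30, "Lys": 0.18},
    "18 h": {"Asp": 2.14, "Ile": 0.98, "Ser": 0.71, "S-AE-Cys": 0.78, "Lys": 0.98},
}


def release_order(amounts):
    """Return residue names sorted from most to least released."""
    return sorted(amounts, key=amounts.get, reverse=True)


order_10min = release_order(time_course["10 min"])
order_3h = release_order(time_course["3 h"])
print(order_10min)
print(order_3h)
```

The ranking at 3 h follows the proposed sequence Asx-Ile-Ser-S-AE-Cys-Asx-Lys, with the two Asx residues counted together under Asp.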

T13 (Residues 80 to 90). Sequence: Phe-Leu-Asx-Asx-Asx-Ile-Thr-Asx-Asx-Ile-Hms-lactone. This segment was clearly identified as the C-terminal peptide of CNBr D by the presence of homoserine lactone. The link between T12 and T13 was provided by Th16A, sequence Ile-Ser-S-AE-Cys-Asx-Lys-Phe, and by the large peptides C14 and C15.

The chymotryptic peptides which constitute this segment of the molecule are also shown in Fig. 7. Cleavage occurred after tryptophan-60, but not, surprisingly, after phenylalanine-80. On the other hand, hydrolysis of the serine-69-arginine-70 and arginine-70-asparagine-71 bonds took place in at least a proportion of the molecules. Neither of these latter cleavage points conforms to the strict specificity of the enzyme.

Residues 91 to 123

This final section of the human α-lactalbumin molecule was contained in CNBr E. The evidence for its sequence is summarised in Fig. 8 and was obtained by an analysis of the peptides discussed below.

No peptide representing residues 91 to 93 was recovered from any of the enzymatic digests of this fragment, despite elution of the Beckman M-72 column used in the separation, with 0.2 M sodium hydroxide. However, the three N-terminal residues were sequenced from the intact purified fragment, the information so obtained proving consistent with the amino acid composition data from both the whole protein and the purified fragment. Further evidence for the sequence in this region and the alignment of the CNBr fragments was provided by peptides T17 and T117 isolated from a tryptic digest of the intact S-aminoethylated protein. These peptides

Residue numbers: 91, 100, 110, 120

Sequence: Cys-Ala-Lys-Lys-Ile-Leu-Asp-Ile-Lys-Gly-Ile-Asn-Tyr-Trp-Leu-Ala-His-Lys-Ala-Leu-Cys-Thr-Glu-Lys-Leu-Glu-Gln-Trp-Leu-Cys-Glu-Lys-Leu

Peptide No.
T14 Lys
Th18 Lys-Ile-Leu-Asx
T14,15 Lys-Ile-Leu-Asx-Ile-Lys
T15 Ile-Leu-Asp-Ile-Lys
C16A Asx-Ile-Lys-Gly-Ile-Asx-Tyr
C16 Asp-Ile-Lys-Gly-Ile-Asn-Tyr-Trp
Th19 Ile-Lys-Gly
T16 Gly-Ile-Asx-Tyr-Trp-Leu-Ala-His-Lys
Th20 Ile-Asn-Tyr
Th20A Ile-Asx-Tyr-Trp
C17A Leu-Ala-His
C17 Leu-Ala-His-Lys
Th21 Ala-His-Lys
Th4,21A Leu-Ala-His-Lys-Ala
Th21A Ala-His-Lys-Ala
C18 Ala-Leu
C17,18 Leu-Ala-His-Lys-Ala-Leu
T17 Ala-Leu-Cys-Thr-Glu-Lys
Th22 Leu-Cys-Thr-Glx-Lys
C19 Cys-Thr-Glx-Lys-Leu-Glx-Glx-Trp
Th23 Leu-Glu-Gln-Trp
T18 Leu-Glx-Glx-Trp-Leu-Cys
Th24 & C20 Leu-Cys-Glx-Lys-Leu
T19 Glu-Lys-Leu
T19A Glu-Lys

Fig. 8. Summary of the proof of the sequence of residues 91 to 123 from CNBr E. For details see legend of Fig. 5


were purified by ion-exchange chromatography as before and characterised as follows.

T17. Composition: Asp (5.00), Thr (0.99), Met (0.79), Ile (1.84), Leu (1.04), Phe (1.09), and S-AE-Cys (1.01). N-terminal residue: Phe. Carboxypeptidase B (1 h): S-AE-Cys (0.61); A & B (1 h): Met (0.51), S-AE-Cys (0.93); (2 h): Ile (>0.1), Met (0.72), S-AE-Cys (0.94). Sequence: Phe-(Leu,Asx,Asx,Asx,Ile,Thr,Asx,Asx)-Ile-Met-S-AE-Cys.

T117. Composition: Ala (1.00), Lys (1.19). Sequence: Ala-Lys.

T14,15 (Residues 94 to 99). Sequence: Lys-Ile-Leu-Asx-Ile-Lys. Two further cleavage products, lysine (T14) and Ile-Leu-Asx-Ile-Lys (T15), were also isolated. Relying on the specificity of trypsin and the accuracy of the amino acid compositions, it was considered that only two lysines could be accommodated in the region subsequently denoted residues 93 and 94.
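The tryptic cleavage pattern invoked here can be illustrated with a small Python sketch. It assumes an idealised rule in which trypsin cleaves on the carboxyl side of lysine, arginine and S-aminoethyl-cysteine; the function name `tryptic_fragments` and the residue-list representation are hypothetical, and real digests (as the text notes elsewhere) show both missed and unexpected cleavages.

```python
# Idealised trypsin rule (an assumption for illustration): cleave after
# Lys, Arg and S-aminoethyl-cysteine, with no exceptions.
TRYPSIN_SITES = {"Lys", "Arg", "S-AE-Cys"}


def tryptic_fragments(residues, sites=TRYPSIN_SITES):
    """Split a list of residue names after every residue found in `sites`."""
    fragments, current = [], []
    for residue in residues:
        current.append(residue)
        if residue in sites:
            fragments.append(current)
            current = []
    if current:  # trailing piece that does not end in a cleavage site
        fragments.append(current)
    return fragments


t14_15 = ["Lys", "Ile", "Leu", "Asx", "Ile", "Lys"]
fragments = tryptic_fragments(t14_15)
print(["-".join(fragment) for fragment in fragments])
```

Applied to T14,15, the rule gives free lysine and Ile-Leu-Asx-Ile-Lys, the two cleavage products T14 and T15 described above.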

T16 (Residues 100 to 108). Sequence: Gly-Ile-Asn-Tyr-Trp-Leu-Ala-His-Lys. The overlap between peptides T14,15 and T16 was provided by C16, C16A and Th19. The presence of a tryptophan residue was indicated by the absorbance at 280 nm and by carboxypeptidase A digestion of Th20A (Ile-Asx-Tyr-Trp): 1 h: Asn (0.1), Tyr (0.4), Trp (0.70); 4 h: Asn (0.24), Tyr (0.61), Trp (0.75).

T17 (Residues 109 to 114). Sequence: Ala-Leu-S-AE-Cys-Thr-Glx-Lys. No evidence was obtained to indicate any cleavage by trypsin of the peptide bond between S-aminoethyl-cysteine-111 and threonine-112. Peptides Th4,21A and C17,18 provided the necessary overlaps for the positioning of this peptide.

T18 (Residues 115 to 120). Sequence: Leu-Glu-Gln-Trp-Leu-S-AE-Cys. In common with the other tryptophan-containing peptides, T18 showed characteristic absorption at 280 nm. The overlapping peptide C19 (sequence S-AE-Cys-Thr-Glx-Lys-Leu-Glx-Glx-Trp) also contained a tryptophanyl residue, a fact which facilitated the alignment of T18 as the penultimate peptide in the protein. The presence of this aromatic amino acid was confirmed by complete digestion of the peptide with aminopeptidase M: Gln (0.94), Glu (1.05), Leu (1.78), Trp (0.76), S-AE-Cys (0.88). Its position was confirmed by a time-course digestion of Th23 with carboxypeptidase A: 1 h: Glu (0.1), Gln (0.3), Trp (0.66); 5 h: Glu (0.28), Gln (0.62), Trp (0.72).

T19 (Residues 121 to 123). Sequence: Glx-Lys-Leu. The C-terminal position of this peptide was suggested by the identity of its C-terminal amino acid, leucine, with that of the intact protein. Further proof was provided by overlapping peptides Th24 and C20, both of sequence Leu-S-AE-Cys-Glx-Lys-Leu.

AMIDE DISTRIBUTION

In order to determine whether the aspartyl and glutamyl residues which appeared on analysis of the acid hydrolysates of peptides actually represented these residues or whether they resulted from deamidation of asparagine and glutamine, respectively, a series of suitable peptides containing wherever possible only one of the residues was subjected to high-voltage electrophoresis at pH 6.5. In certain cases, this evidence was supplemented with amino acid composition data following complete enzymatic hydrolysis of the peptide.

Table 8. The identification of the amidated residues in human α-lactalbumin as deduced from the electrophoretic mobilities of peptides shown

Residue position; Peptide code; Peptide sequence; Charge; Conclusion

Glx 2; CNBr E-C1; Lys-Glx-Phe; +1; Gln
Glx 7; CNBr E-Th2; Phe-Thr-Lys-Cys-Glx; +1; Glu
Glx 10; CNBr E-Th3; Leu-Ser-Glx; 0; Gln
Asx 14; CNBr E-Th4A; Leu-Lys-Asx; 0; Asp
Asx 16; CNBr E-C1; Leu-Lys-Asx-Ile-Asx-Gly-Tyr; -1; Asp
Glx 25; CNBr E-Th6A; Ile-Ala-Leu-Pro-Glx; -1; Glu
Asx 37; CNBr D-Th9; Ser-Gly-Tyr-Asx-Thr-Glx-Alaᵃ; -1; Asp
Glx 39; After 4 Edmans; Thr-Glx-Ala; 0; Gln
Glx 43, Asx 44, Asx 45, Glx 46, Glx 49; CNBr D-Th10 purified on paper; Ile-Val-Glx-Asx-Asx-Glx-Ser-Thr-Glx-Tyr-Glyᵃ; -3; 3 Acids, 2 Amid.
Glx 54; CNBr D-Th11A; Phe-Glx; 0; Gln
Asx 58; CNBr D-Th12; Ile-Ser-Asx-Lys; +1; Asn
Glx 65, Glx 68; CNBr D-T9; Ser-Ser-Glx-Val-Pro-Glx-Ser-Arg; +1; Glu, Gln
Asx 71; CNBr D-T10; Asx-Ile-Cys; +1; Asn
Asx 74; CNBr D-T11; Asx-Ile-Ser-Cys; 0; Asp
Asx 78; CNBr D-T12; Asx-Ile-Ser-Cys-Asx-Lysᵃ; 0; Asp
Asx 82, Asx 83, Asx 84, Asx 87, Asx 88; CNBr D-Th17A and T13; Phe-Leu-Asx-Asx-Asx-Ile-Thr-Asx-Asx-Ile-Hms.Lacᵃ; 0; 1 Acid, 4 Amid.
Asx 97; CNBr E-T14; Ile-Leu-Asx-Ile-Cys; 0; Asp
Asx 102; CNBr E-Th20; Ile-Asx-Tyrᵃ; 0; Asn
Glx 113; CNBr E-T16; Ala-Leu-Cys-Thr-Glx-Lys; +1; Glu
Glx 116; CNBr E-Th23; Leu-Glx-Glx-Trpᵃ; -1; Glu
Glx 117; After 2 Edmans; Glx-Trp; 0; Gln
Glx 121; CNBr E-T18; Glx-Lys-Leu; 0; Glu

ᵃ Peptide also subjected to enzymic hydrolysis.


Table 9. Evidence for the positioning of the acid and amide residues in peptides CNBr Th10 and Th17A

Columns: Peptide; Amino acid; Composition (Acid, Enzymic); Aminopeptidase M digestion (0.5 h, 1 h, 4 h); Carboxypeptidase A digestion (1 h, 5 h, 18 h)

Th10
Amino acids: Aspartic acid; Threonine; Ser, Asn, Gln; Glutamic acid; Glycine; Valine; Isoleucine; Tyrosine
1.98 1.00 1.12 0.85 1.01 2.70 2.94 1.87 0.99 1.02 0.73 1.04 1.04 1.14 0.56 0.66
0.21 0.20 0.32 0.51
0.51 0.62 0.36 0.76 0.87 0.36 0.61 0.76
0.76 1.00 1.05 0.52 0.80 0.89 0.72 0.82 0.99
0.48 0.70 0.75

Th17A
Aspartic acid 4.61 1.00
Threonine 0.95 0.88
0.50 0.70
Ser, Asn, Gln 3.52 0.62 0.76 0.91
Leucine 1.05 1.02 0.92 1.04 1.09
Isoleucine 1.11 1.21 No cleavage obtained
Phenylalanine 0.91 0.63 0.68 0.73 0.80

The results are given in Table 8. Of the 31 possible acid and amide residues, 17 were directly identified by these methods. A further four were determined by high-voltage electrophoresis following Edman degradation. This latter procedure is best illustrated by reference to peptide CNBr E Th23. Its mobility indicated the presence of one acid and one amidated residue. After two rounds of Edman degradation, the electrophoretic mobility of the remaining fragment suggested that a glutaminyl residue had been removed. By inference, glutam(ate/ide)-116 could then be identified as glutamic acid. In a similar way, with CNBr D Th9, aspar(agine/tate)-37 was identified as the acid and glutam(ate/ide)-39 as the amide component of the peptide. This result was confirmed by a time course digestion of the whole peptide with carboxypeptidase A, which indicated the position of the amidated residue in the sequence: 0.5 h: Gln (0.65), Ala (0.81); 3 h: Thr (0.15), Gln (0.84), Ala (1.00); 9 h: Thr (0.35), Gln (1.00), Ala (1.01). Since complications and hence ambiguities may arise using the electrophoretic technique due to modifications of side-chain amino-groups by phenylisothiocyanate, peptides containing lysyl or S-aminoethyl-cysteinyl residues were not subjected to the procedure.
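The charge argument used for these assignments can be made concrete with a short Python sketch. It is an illustration under simplifying assumptions rather than the authors' procedure: at pH 6.5 the free α-amino group counts as +1, the free α-carboxyl as -1, Asp and Glu as -1, Lys, Arg and S-aminoethyl-cysteine as +1, and every other residue (including histidine) as neutral. The names `net_charge` and `candidate_assignments` are hypothetical.

```python
from itertools import product

ACIDIC = {"Asp", "Glu"}
BASIC = {"Lys", "Arg", "S-AE-Cys"}
# Each ambiguous residue from acid hydrolysis may be the acid or the amide.
OPTIONS = {"Asx": ("Asp", "Asn"), "Glx": ("Glu", "Gln")}


def net_charge(residues, c_terminal_lactone=False):
    """Approximate integer net charge of a peptide at pH 6.5."""
    charge = 1  # free alpha-amino group
    if not c_terminal_lactone:
        charge -= 1  # free alpha-carboxyl group (absent in a lactone)
    charge -= sum(residue in ACIDIC for residue in residues)
    charge += sum(residue in BASIC for residue in residues)
    return charge


def candidate_assignments(template, observed_charge, c_terminal_lactone=False):
    """List every acid/amide assignment consistent with the observed charge."""
    slots = [OPTIONS.get(residue, (residue,)) for residue in template]
    return [
        list(choice)
        for choice in product(*slots)
        if net_charge(choice, c_terminal_lactone) == observed_charge
    ]


th3 = candidate_assignments(["Leu", "Ser", "Glx"], 0)
th23 = candidate_assignments(["Leu", "Glx", "Glx", "Trp"], -1)
print(th3)
print(th23)
```

With the charges listed in Table 8, Leu-Ser-Glx (charge 0) has a single consistent assignment, whereas Leu-Glx-Glx-Trp (charge -1) has two, which is why an additional step such as Edman degradation was needed to place the acid and the amide in Th23.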

The most intransigent residues were located in two groups situated in peptides CNBr D Th10 and CNBr D T13. The data from both high-voltage electrophoresis and enzymatic hydrolyses indicated that the former contained two amides (one asparagine and one glutamine) and three acids (two glutamic and one aspartic) and the latter a single aspartyl and four asparaginyl residues. Since both peptides contained more than one residue which could account for either the acid or amide content, the above information did not allow us to identify unambiguously the location of these residues in the two peptides. This ambiguity was resolved by time-course incubation with aminopeptidase M and carboxypeptidase A. The results are illustrated in Table 9.

They indicate that the glutamyl residues present in peptide Th10 occupy positions 43 and 49 in the molecule and the asparagine position 44. The composition of a complete enzymic digest of this peptide, together with its sequence, allowed us to assign the aspartyl residue to position 45, leaving the glutamine for position 46.

In peptide Th17A (identical with T13), the only aspartyl residue present was liberated after the first asparagine, on a time-course incubation with aminopeptidase M. Consequently, asparagine moieties must occupy positions 82, 84 and 88 (see Table 9).

DISCUSSION

The results obtained with the purification procedure adopted for human α-lactalbumin confirm the absence of β-lactoglobulin in the milk of this species [16], a trait in common with the guinea pig [17] and camel [18].

The rationale behind our approach to the sequence determination of human α-lactalbumin was firstly to cleave the polypeptide chain into large fragments with cyanogen bromide and subsequently to treat these isolated sections individually with respect to sequence analysis. Since the N- and C-terminal CNBr fragments could not be conveniently separated in quantities sufficient to permit this approach, they were subjected to enzymic digestion as a mixture. However, even this partial fractionation resulted in the production of peptide mixtures far less complex than would be obtained from the whole protein. The advantages in this approach were two-fold: firstly, it facilitated the isolation of these peptides, most of which were obtained pure after fractionation by ion-exchange chromatography. Secondly, by locating peptides in defined sections of the polypeptide chain, the problem of alignment was greatly simplified.

Amino acid analysis of human α-lactalbumin revealed the presence of two methionines in the protein. Subsequent cleavage with cyanogen bromide gave rise to the expected three peptide fragments whose relative positions in the polypeptide chain could be uniquely determined. Obviously, with more than three fragments, the procedure detailed below would not yield an unambiguous result and recourse would have to be made to overlapping peptides containing the relevant methionyl residues.

The amino acid composition of one fragment isolated from the CNBr E peak indicated the absence of homoserine lactone, the degradation product resulting from the action of cyanogen bromide on methionine [21]. In contrast, both of the other CNBr fragments contained this amino acid derivative. This 33-residue section must, therefore, be derived from the C-terminal end of the intact polypeptide. As the second component present in CNBr E possessed an identical N-terminal tripeptide sequence to that of the whole protein, it was considered to constitute the first 30 residues from the NH2 terminus of the protein. The material in the CNBr D pool must, therefore, comprise the central portion of the polypeptide chain, stretching from residues 31 to 90.
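The fragment logic in the preceding paragraphs can be sketched as follows. This is a minimal Python illustration, not the authors' method: it assumes a one-letter chain in which 'M' marks methionine and 'X' is a placeholder for any other residue, with methionine placed at positions 30 and 90 of a 123-residue chain as implied by the fragment boundaries given above. The function name `cnbr_fragments` is hypothetical.

```python
def cnbr_fragments(sequence):
    """Split a one-letter sequence after every methionine (M).

    Cyanogen bromide converts each methionine into C-terminal homoserine
    (lactone), so every fragment except the C-terminal one ends in 'M' here.
    """
    fragments, start = [], 0
    for index, residue in enumerate(sequence):
        if residue == "M":
            fragments.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


# Placeholder chain: only the methionine positions are meaningful.
chain = "X" * 29 + "M" + "X" * 59 + "M" + "X" * 33
fragments = cnbr_fragments(chain)
lengths = [len(fragment) for fragment in fragments]
# The fragment lacking a terminal Met (homoserine lactone) is C-terminal.
c_terminal = [fragment for fragment in fragments if not fragment.endswith("M")]
print(lengths, len(c_terminal))
```

Two methionines give three fragments of 30, 60 and 33 residues, and only the 33-residue piece lacks the homoserine lactone marker, in line with the argument used to order the fragments.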

The sum of the amino acid compositions of the three CNBr fragments is in very close agreement with the composition obtained for the whole protein. In the same way, peptides isolated from the tryptic, chymotryptic and thermolysin digests of these pieces accounted in each case (with the exception of a tripeptide in CNBr E) for the total number of residues contained in the CNBr fragment. The missing tripeptide mentioned above was not recovered even after treatment of the Beckman M-72 column with 0.2 M sodium hydroxide. It can only be supposed that the high proportion of basic amino acids relative to the overall size of the peptide caused it to be very strongly bound to the column resin.
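The bookkeeping behind this comparison can be sketched in Python. The residue counts below are the whole-pool composition figures as they can be read from the final columns of Tables 2 and 3 (CNBr D and CNBr E, the latter containing both terminal fragments); the column reading, the variable names and the use of "Hms" for homoserine lactone are assumptions made for this illustration.

```python
from collections import Counter

# Integral residue numbers for the CNBr D pool (residues 31 to 90).
cnbr_d = Counter({
    "Asp": 12, "Thr": 4, "Ser": 7, "Glu": 7, "Pro": 1, "Gly": 2, "Ala": 1,
    "Val": 2, "Tyr": 2, "Ile": 6, "Leu": 3, "Phe": 3, "Trp": 1, "Lys": 3,
    "S-AE-Cys": 3, "His": 1, "Arg": 1, "Hms": 1,
})
# Integral residue numbers for the CNBr E pool (residues 1 to 30 and 91 to 123).
cnbr_e = Counter({
    "Asp": 4, "Thr": 3, "Ser": 1, "Glu": 8, "Pro": 1, "Gly": 4, "Ala": 4,
    "Ile": 6, "Leu": 11, "Tyr": 2, "Phe": 1, "Trp": 2, "Lys": 9,
    "S-AE-Cys": 5, "His": 1, "Hms": 1,
})

whole = cnbr_d + cnbr_e  # Counter addition sums the counts residue by residue
total_residues = sum(whole.values())
print(total_residues, whole["Hms"])
```

On this reading the two pools account for 60 and 63 residues, 123 in all, with two homoserine residues corresponding to the two methionines of the intact protein.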

Although the peptide bonds hydrolysed to give these peptides did in general conform to the published specificity of the proteases, certain unusual cleavage points were observed, some of which have already been mentioned. The most obvious was the cleavage after tyrosine-50 on digestion of CNBr D with trypsin. It is not clear whether hydrolysis at this position is a consequence of contamination of the enzyme preparation with a small amount of chymotrypsin or whether it represents an inherent low level of chymotryptic-type activity in trypsin itself. Certainly, the lack of further chymotryptic cleavage products suggests that some structural feature confers the property of high susceptibility to proteolysis on this Tyr-Gly bond.

To obtain maximum homology between human α-lactalbumin, bovine α-lactalbumin [6], human leukaemic lysozyme [5] and hen egg-white lysozyme [7], the primary structures of the four proteins are aligned as indicated in Fig. 9 (upper numbering on the basis of the α-lactalbumin structure, lower on lysozyme). Those deletions which were previously postulated in the bovine sequence for maximum homology with hen egg-white lysozyme seem justified on the basis of the similarity between the two α-lactalbumins and the alignment of human α-lactalbumin and human lysozyme. Moreover, no new gaps need be inserted into the human α-lactalbumin sequence to augment its resemblance to human leukaemic lysozyme. However, to take account of the extra amino acid (glycine) at position 48 of human leukaemic lysozyme, the gap in this region of the human α-lactalbumin primary structure must be widened by a further residue.

Based on these alignments, it is possible to tabulate (Table 10) the numbers of positions at which human α-lactalbumin possesses amino acids identical, similar or different to bovine α-lactalbumin and human leukaemic lysozyme. A comparison between the two α-lactalbumins reveals that 72% of the residues are identical, 6% are chemically similar and 22% different. These values indicate greater differences than have been found for the α-chains of haemoglobin (88%, 2% and 10%) [22] and cytochromes c (89%, 3%, 8%) [22] from the same species, but correspond roughly to the variations seen in the structures of the bovine and rat ribonucleases (68%, 6%, 26%) [22]. This quite rapid rate of evolutionary change is consistent with the proposed relatively late emergence of α-lactalbumin activity [1].
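The percentages quoted above follow directly from the residue counts given in Table 10. The short sketch below repeats that arithmetic; the function name `percentages` is illustrative only, and the total of 123 positions is simply the sum of the tabulated counts.

```python
def percentages(counts):
    """Convert position counts into whole-number percentages of their total."""
    total = sum(counts)
    return [round(100 * c / total) for c in counts]


# (identical, similar, different) counts as tabulated in Table 10
bovine_vs_human_lactalbumin = (89, 7, 27)
lysozyme_vs_human_lactalbumin = (48, 15, 60)

print(sum(bovine_vs_human_lactalbumin))            # 123
print(percentages(bovine_vs_human_lactalbumin))    # [72, 6, 22]
print(percentages(lysozyme_vs_human_lactalbumin))  # [39, 12, 49]
```

Adding the identical and similar counts for the lysozyme comparison (48 + 15 = 63) gives the figure of 63 positions, or 51%, used in the text.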

In the comparison between human α-lactalbumin and leukaemic lysozyme, 63 (51%) of the positions in the protein contain identical or closely related amino acids, a value very similar to that of 60 (49%) for the bovine α-lactalbumin and hen egg-white lysozyme comparison. Included in Table 10 is the same comparison quantified in terms of the minimal base change in the amino acid codons. This analysis supports the inference of homology already deduced from the similarities between bovine α-lactalbumin and hen egg-white lysozyme and also confirms the approximate equality in total amino acid replacements when the sequence of human α-lactalbumin is compared with hen egg-white lysozyme and with human leukaemic lysozyme.
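The text does not describe how the minimal base change was computed. One common reading, assumed here, is the smallest number of nucleotide differences between any codon for one amino acid and any codon for the other, taken over the standard genetic code. The sketch below uses one-letter amino acid codes and DNA bases; the names `CODONS` and `min_base_changes` are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

# Map each amino acid (one-letter code, '*' = stop) to its list of codons.
CODONS = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append("".join(codon))


def min_base_changes(aa1, aa2):
    """Smallest number of base differences between any pair of codons
    coding for aa1 and aa2 (0 when the residues are identical)."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in CODONS[aa1]
        for c2 in CODONS[aa2]
    )


print(min_base_changes("W", "L"))  # Trp -> Leu: 1
print(min_base_changes("D", "E"))  # Asp -> Glu: 1
print(min_base_changes("M", "Y"))  # Met -> Tyr: 3
```

Under this assumption, both the Trp to Leu replacement at position 28 and the Asp to Glu replacement at position 53 discussed in the text require only a single base change.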


Residues 1-11 (α-lactalbumin numbering)
A  Lys-Gln-Phe-Thr-Lys-Cys-Glu-Leu-Ser-Gln-Leu-
B  Glu-Gln-Leu-Thr-Lys-Cys-Glu-Val-Phe-Arg-Glu-
C  Lys-Val-Phe-Glu-Arg-Cys-Glu-Leu-Ala-Arg-Thr-
D  Lys-Val-Phe-Gly-Arg-Cys-Glu-Leu-Ala-Ala-Ala-

Residues 12-22
A  Leu-Lys-[gap]-Asp-Ile-Asp-Gly-Tyr-Gly-Gly-Ile-Ala-
B  Leu-Lys-[gap]-Asp-Leu-Lys-Gly-Tyr-Gly-Gly-Val-Ser-
C  Leu-Lys-Arg-Leu-Gly-Met-Asp-Gly-Tyr-Arg-Gly-Ile-Ser-
D  Met-Lys-Arg-His-Gly-Leu-Asp-Asn-Tyr-Arg-Gly-Tyr-Ser-

Residues 23-35
A  Leu-Pro-Glu-Leu-Ile-Cys-Thr-Met-Phe-His-Thr-Ser-Gly-
B  Leu-Pro-Glu-Trp-Val-Cys-Thr-Thr-Phe-His-Thr-Ser-Gly-
C  Leu-Ala-Asn-Trp-Met-Cys-Leu-Ala-Lys-Trp-Glu-Ser-Gly-
D  Leu-Gly-Asn-Trp-Val-Cys-Ala-Ala-Lys-Phe-Glu-Ser-Asn-

Residues 36-46
A  Tyr-Asp-Thr-Gln-Ala-Ile-Val-Glu-Asn-[gap]-Asp-Gln-
B  Tyr-Asp-Thr-Glu-Ala-Ile-Val-Glu-Asn-[gap]-Asn-Gln-
C  Tyr-Asn-Thr-Arg-Ala-Thr-Asn-Tyr-Asn-Ala-Gly-Asp-Arg-
D  Phe-Asn-Thr-Gln-Ala-Thr-Asn-Arg-Asn-Tyr-[gap]-Asp-Gly-

Residues 47-59
A  Ser-Thr-Glu-Tyr-Gly-Leu-Phe-Gln-Ile-Ser-Asn-Lys-Leu-
B  Ser-Thr-Asp-Tyr-Gly-Leu-Phe-Gln-Ile-Asn-Asn-Lys-Ile-
C  Ser-Thr-Asp-Tyr-Gly-Ile-Phe-Gln-Ile-Asn-Ser-Arg-Tyr-
D  Ser-Thr-Asp-Tyr-Gly-Ile-Leu-Gln-Ile-Asn-Ser-Arg-Trp-

Residues 60-71
A  Trp-Cys-Lys-Ser-Ser-Gln-Val-Pro-Gln-Ser-Arg-Asn-
B  Trp-Cys-Lys-Asn-Asp-Gln-Asp-Pro-His-Ser-Ser-Asn-
C  Trp-Cys-Asn-Asp-Gly-Lys-Thr-Pro-Gly-Ala-Val-Asn-
D  Trp-Cys-Asn-Asp-Gly-Arg-Thr-Pro-Gly-Ser-Arg-Asn-

Residues 72-84
A  Ile-Cys-Asp-Ile-Ser-Cys-Asp-Lys-Phe-Leu-Asn-Asp-Asn-
B  Ile-Cys-Asn-Ile-Ser-Cys-Asp-Lys-Phe-Leu-Asn-Asn-Asp-
C  Ala-Cys-His-Leu-Ser-Cys-Ser-Ala-Leu-Leu-Gln-Asp-Asn-
D  Leu-Cys-Asn-Ile-Pro-Cys-Ser-Ala-Leu-Leu-Ser-Ser-Asp-

Residues 85-96
A  Ile-Thr-Asn-Asn-Ile-Met-Cys-Ala-Lys-Lys-Ile-Leu-[gap]-
B  Leu-Thr-Asn-Asn-Ile-Met-Cys-Val-Lys-Lys-Ile-Leu-[gap]-
C  Ile-Ala-Asp-Ala-Val-Ala-Cys-Ala-Lys-Arg-Val-Arg-[gap]-
D  Ile-Thr-Ala-Ser-Val-Asn-Cys-Ala-Lys-Lys-Ile-Val-Ser-

Residues 97-109
A  Asp-Ile-Lys-Gly-Ile-Asn-Tyr-Trp-Leu-Ala-His-Lys-Ala-
B  Asp-Lys-Val-Gly-Ile-Asn-Tyr-Trp-Leu-Ala-His-Lys-Ala-
C  Asp-Pro-Gln-Gly-Ile-Arg-Ala-Trp-Val-Ala-Trp-Arg-Asn-
D  Asn-Gly-Asp-Gly-Met-Asn-Ala-Trp-Val-Ala-Trp-Arg-Asn-

Residues 110-119
A  Leu-Cys-Thr-Glu-Lys-Leu-Glu-Gln-Trp-Leu-
B  Leu-Cys-Ser-Glu-Lys-Leu-Asp-Gln-Trp-Leu-
C  Arg-Cys-Gln-Asn-Arg-Asp-Val-Arg-Gln-Tyr-Val-Gln-
D  Arg-Cys-Lys-Gly-Thr-Asp-Val-Gln-Ala-Trp-Ile-Arg-

Residues 120-123
A  [gap]-Cys-Glu-Lys-Leu
B  [gap]-Cys-Glu-Lys-Leu
C  Gly-Cys-[gap]-Gly-Val
D  Gly-Cys-[gap]-Arg-Leu

Fig. 9. Comparison of the amino-acid sequences of human (A) and bovine (B) α-lactalbumins and human leukaemic (C) and hen egg-white (D) lysozymes. Differences in amino acid residues are indicated in bold-face type


Table 10. Comparison between the sequences of human α-lactalbumin and those of bovine α-lactalbumin and human lysozyme
Similar amino acids are taken as serine/threonine, leucine/isoleucine, alanine/valine/glycine, arginine/lysine, aspartic acid/glutamic acid and asparagine/glutamine

Proteins compared                               Residues                             Codons differing by
                                                identical   similar    different    1 base      2 bases

Bovine α-lactalbumin/human α-lactalbumin        89 (72%)    7 (6%)     27 (22%)     31 (25%)    3
Human α-lactalbumin/human leukaemic lysozyme    48 (39%)    15 (12%)   60 (49%)     49 (41%)    26 (20%)
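The classification behind Table 10 can be illustrated with a small sketch that uses the similarity groups from the table legend. It is applied here only to a gap-free stretch of Fig. 9 (residues 47-59 of the human and bovine α-lactalbumins); how positions opposite a gap were scored is not stated in the text, so gaps are not handled. The names `SIMILAR_GROUPS`, `classify` and `tally` are illustrative.

```python
# Similarity groups as listed in the legend to Table 10.
SIMILAR_GROUPS = [
    {"Ser", "Thr"},
    {"Leu", "Ile"},
    {"Ala", "Val", "Gly"},
    {"Arg", "Lys"},
    {"Asp", "Glu"},
    {"Asn", "Gln"},
]


def classify(res_a, res_b):
    """Label one aligned pair of residues (three-letter codes)."""
    if res_a == res_b:
        return "identical"
    if any(res_a in group and res_b in group for group in SIMILAR_GROUPS):
        return "similar"
    return "different"


def tally(seq_a, seq_b):
    """Count identical, similar and different positions in two aligned,
    gap-free, hyphen-separated sequences of equal length."""
    counts = {"identical": 0, "similar": 0, "different": 0}
    for a, b in zip(seq_a.split("-"), seq_b.split("-")):
        counts[classify(a, b)] += 1
    return counts


# Residues 47-59 of human (A) and bovine (B) alpha-lactalbumin, from Fig. 9.
human = "Ser-Thr-Glu-Tyr-Gly-Leu-Phe-Gln-Ile-Ser-Asn-Lys-Leu"
bovine = "Ser-Thr-Asp-Tyr-Gly-Leu-Phe-Gln-Ile-Asn-Asn-Lys-Ile"
print(tally(human, bovine))
# {'identical': 10, 'similar': 2, 'different': 1}
```

In this stretch the two similar positions are Glu/Asp and Leu/Ile, and the single different position is Ser/Asn.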

Inspection of all the sequences reveals the structurally conservative nature of a great proportion of these substitutions. Both hydrophobic and hydrophilic residues are in general replaced by amino acids of similar chemical character. An analysis of the sequences of the two human proteins based on the helix-forming or helix-breaking nature of the amino acids [23] also supports this conclusion. At only 12 positions, most of them occurring in clusters or in regions of low helical content, do replacements of differing character take place. One possibly instructive variation from this general trend occurs in the short segment from position 100 to 104 (residues numbered according to the lysozyme sequence, Fig. 9). At position 100, human leukaemic lysozyme possesses an arginyl residue in comparison with the leucyl residue of α-lactalbumin. At position 103, the respective amino acids are the helix-initiating or helix-breaking proline and isoleucine, whilst position 104 contains two residues of opposite character, glutamine in human leukaemic lysozyme and lysine in human α-lactalbumin. Inspection of the three-dimensional structure of hen egg-white lysozyme [24] indicates that this region of the molecule is flexible enough to accommodate residues of markedly different character without causing significant alterations in the surrounding configuration. It therefore provides an interesting test as regards a possible conformational homology between the two proteins.

One further interesting feature of human α-lactalbumin is the replacement of the tryptophanyl residue at position 28 of the sequence by leucine. Kronman [25] believes that this aromatic chromophore is buried in the molecule but Barman [26] considers it to be exposed. On the basis of the hydrophobic nature of the substituted residue in human α-lactalbumin, however, the former hypothesis appears the more probable. Whichever postulate is correct, the substitution of this amino acid apparently does not deleteriously affect the participation of the protein in the biosynthesis of lactose. Consequently, the direct involvement of the tryptophanyl residue in the biochemical reactions of α-lactalbumin can be discounted.

A three-dimensional structure has been proposed for bovine α-lactalbumin based on the hypothesis that the sequence homology with hen egg-white lysozyme is reflected by a similarity in their conformation [27]. Although this postulate has been questioned by some authors [28,29], it is supported by precedent (e.g. with the serine proteases and in the haemoglobin-myoglobin system) [30] and by a considerable amount of physico-chemical evidence (see Discussion of [31]). Consequently, it is interesting to note that the sequence of human α-lactalbumin apparently supports a conformational similarity with hen egg-white lysozyme [7] and human leukaemic lysozyme [5]. For example, if one considers the hydrophobic core surrounding residue 28, which in human α-lactalbumin is a leucyl residue replacing the tryptophan of bovine α-lactalbumin and hen egg-white lysozyme, other compensatory changes in the surrounding residues (17, 20, 56, 99, 106, 109) result in a constancy in the summed molar volumes. This conservative feature is therefore consistent with a retention of the overall conformation of the molecule.

Similarly, many of the residues involved in hydrogen bonding in hen egg-white lysozyme are retained in human α-lactalbumin. Where substitutions have occurred, the potential for bonding is maintained by the nature of the substituted amino acid, e.g. asparagine-27 is replaced by glutamate and serine-85 by asparagine. It is noteworthy that of the 33 residues common to the two α-lactalbumins and two lysozymes, 11 are in some way involved in hydrogen bonding.

Another region of conformational interest in human α-lactalbumin is to be found near the amino-terminal end of the molecule. In bovine α-lactalbumin the substitution of glutamate-1 for the N-terminal lysine of hen egg-white lysozyme removes the possibility of salt-bridge formation between the side chain of this residue and that of glutamate-7. However, the retention of a lysine in this position in human α-lactalbumin, and the greater similarity of the residues surrounding phenylalanine-3 to those of the egg-white lysozyme (as compared with bovine α-lactalbumin), suggest that the structure of this region in human α-lactalbumin may be more similar to that of hen egg-white lysozyme than is that of the bovine protein. Other salt bridges, between lysine-13 and the C-terminal carboxyl group and between lysine-97 and aspartate-102, also appear to have remained unchanged. In addition, several potential charge pairs, the disruption of which produces changes likened by Kronman [31] to those induced by acid denaturation, are still present in the molecule. However, only the elucidation of the three-dimensional structure of α-lactalbumin will clarify the relevance of the conservation of these potential interactions to the conformation of the protein.

The possible effects of the amino acid substitutions in bovine α-lactalbumin (when compared with hen egg-white lysozyme) on the conformation of the active-site cleft region of lysozyme have been previously discussed in detail [27]. Many of these features have been retained in the equivalent positions of human α-lactalbumin. For example, the replacement of alanine-108 by a tyrosyl residue, and with it the introduction of the potential for blocking off a section of the cleft region, is still in evidence, as is the absence of tryptophan-63 and the substitution of the catalytically functional glutamate-35 by a threonyl residue. This modification of the active site of lysozyme is extended in human α-lactalbumin by the replacement of the catalytically important aspartate-53 (still conserved in bovine α-lactalbumin) by glutamic acid. Although the side-chain carboxyl of this latter residue could still function in a manner similar to that of the corresponding aspartic acid of lysozyme, the freedom of variability in this position indicates a less critical role for the residue in the biological activity of α-lactalbumin. These modifications are, therefore, consistent with the loss of lysozyme activity and a reduction in the dimensions of the substrate molecules, both of which are characteristic of the α-lactalbumins. It seems possible that the role of α-lactalbumin in the lactose synthetase system [2,3] may lie more in the provision of parts of the binding site for monosaccharides than directly in the synthesis of the β 1-4 glycosidic linkage.

One unusual feature of the α-lactalbumin-lysozyme group is the lack of immunological cross-reactivity not only between α-lactalbumins and lysozymes [32] but also between many α-lactalbumins from different species [33]. This phenomenon can perhaps be related to the significant differences found in certain parts of their respective sequences. For example, one major antigenic site of hen egg-white lysozyme [34] is the disulphide loop region (residues 63 to 82), which is exposed on the surface both of lysozyme and in the bovine α-lactalbumin model. It is readily noticeable that at least part of this section (residues 66 to 73) is a region of high mutability in the primary structures of the α-lactalbumins. Assuming that differences in sequence as well as conformation are capable of generating variations in antigenic response, the correspondence of the principal antigenic site with a region of great variability distant from the active site could account for this observed lack of cross-reactivity. Clearly, should this suggestion be justified by current studies, the hypothesis of major conformational differences between the two proteins, which has been advanced by some workers to account for this property, is both unnecessary and improbable.

We wish to thank the Agricultural Research Council for the research grant in support of this work, which also included a research assistantship for one of us (J.B.C.F.).

REFERENCES
1. Brew, K., Vanaman, T. C. & Hill, R. L. (1967) J. Biol. Chem. 242, 3747.
2. Brodbeck, U. & Ebner, K. E. (1966) J. Biol. Chem. 241, 762.
3. Brew, K., Vanaman, T. C. & Hill, R. L. (1968) Proc. Nat. Acad. Sci. U. S. A. 59, 491.
4. Brew, K. (1969) Nature (London) 223, 671.
5. Canfield, R. E., Kammerman, S., Sobel, J. M. & Morgan, F. J. (1971) Nature New Biol. 232, 16.
6. Brew, K., Castellino, F. J., Vanaman, T. C. & Hill, R. L. (1970) J. Biol. Chem. 245, 4570.
7a. Canfield, R. E. & Liu, A. K. (1965) J. Biol. Chem. 240, 1997.
7b. Jollès, P. (1967) Proc. Roy. Soc. London B. Biol. Sci. 167, 350.
8. Brew, K. & Hill, R. L. (1970) J. Biol. Chem. 245, 4559.
9. Davis, B. S. (1965) Ann. N. Y. Acad. Sci. 121, 404.
10. Hirs, C. H. W. (1956) J. Biol. Chem. 219, 611.
12. Gray, W. R. (1967) Methods Enzymol. 11, 469.
13. Woods, K. R. & Wang, K. T. (1967) Biochim. Biophys. Acta, 133, 369.
14. Hartley, B. S. (1970) Biochem. J. 119, 805.
15. Hirs, C. H. W. (1967) Methods Enzymol. 11, 328.
16. Gray, W. R. (1967) Methods Enzymol. 11, 409.
17. Offord, R. E. (1966) Nature (London) 211, 591.
18. Johansson, B. (1958) Nature (London) 181, 996.
19. Brew, K. & Campbell, P. N. (1967) Biochem. J. 102, 258.
20. Kessler, E. & Brew, K. (1970) Biochim. Biophys. Acta, 200, 449.
21. Steers, E. Jr., Craven, G. R., Anfinsen, C. B. & Bethune, J. L. (1965) J. Biol. Chem. 240, 2478.
22. Dayhoff, M. O. & Eck, R. V., eds (1968) in Atlas of Protein Sequence and Structure.
23. Lewis, P. N. & Scheraga, H. A. (1971) Arch. Biochem. Biophys. 144, 584.
24. Blake, C. C. F., Mair, G. A., North, A. C. T., Phillips, D. C. & Sarma, V. R. (1967) Proc. Roy. Soc. London B. Biol. Sci. 167, 365.
25. Kronman, M. J., Holmes, L. G. & Robbins, F. M. (1971) J. Biol. Chem. 246, 1909.
26. Barman, T. E. (1970) J. Mol. Biol. 52, 391.
27. Browne, W. J., North, A. C. T., Phillips, D. C., Brew, K., Vanaman, T. C. & Hill, R. L. (1969) J. Mol. Biol. 42, 65.
28. Krigbaum, W. R. & Kugler, F. R. (1970) Biochemistry, 9, 1216.
29. Habeeb, A. F. S. A. & Atassi, M. Z. (1971) Biochim. Biophys. Acta, 236, 131.
30. Perutz, M. F., Muirhead, M., Cox, J. M. & Goaman, L. C. G. (1968) Nature (London) 219, 131.
31. Kronman, M. J., Holmes, L. G. & Robbins, F. M. (1971) J. Biol. Chem. 246, 1909.
32. Atassi, M. Z., Habeeb, A. F. S. A. & Rydstedt, L. (1970) Biochim. Biophys. Acta, 200, 184.
33. Tanahashi, N., Brodbeck, U. & Ebner, K. E. (1968) Biochim. Biophys. Acta, 154, 247.
34. Arnon, R. & Sela, M. (1969) Proc. Nat. Acad. Sci. U. S. A. 62, 163.

J. B. C. Findlay and K. Brew University Department of Biochemistry 9 Hyde Terrace, Leeds, LS2 9LS, Great Britain