Eur. J. Biochem. 27, 65-86 (1972)
The Complete Amino-Acid Sequence of Human α-Lactalbumin
John B. C. FINDLAY and Keith BREW
Department of Biochemistry, University of Leeds
(Received November 10, 1971/February 7, 1972)
α-Lactalbumin was isolated from human milk in a yield of 1.8 mg/ml of milk. The purification procedure involved ammonium sulphate fractionation (30% to 80% saturation) and pH-4.0 precipitation, followed by gel filtration with Sephadex G-100. A final purification stage using DEAE-cellulose was necessary in some preparations.
Peptides derived from the reduced, S-aminoethylated protein by treatment with cyanogen bromide and digestion of these CNBr fragments with trypsin, chymotrypsin or thermolysin, were purified by gel filtration, ion exchange chromatography and high-voltage paper electrophoresis.
From the sequences of these peptides it has proved possible to deduce unambiguously the complete primary structure of the protein. Comparison with bovine α-lactalbumin shows an identity in 72% of the residues with a further 6% being chemically similar amino acids. The corresponding figures for the human α-lactalbumin/human lysozyme comparison are 39% and 12%, respectively. The significance of some of the amino acid replacements is discussed.
Information about the structural restraints placed upon the variability of a protein by the necessity of maintaining its biological function can frequently be obtained by comparative structural studies on the protein from different species. In the case of α-lactalbumin, these investigations are of further relevance on account of its homology with lysozyme [1]. This similarity in structure is not, however, reflected in the enzymic properties of the two proteins, for lysozyme is thought to act as an anti-bacterial agent whilst α-lactalbumin is involved, along with a membrane-bound enzyme, in the biosynthesis of lactose in the mammary gland [2,3]. Although there is a superficial resemblance in the specificity of these activities, lysozyme catalysing the hydrolysis of a β 1-4 glycosidic linkage in the bacterial cell wall and α-lactalbumin participating in the synthesis of a similar bond in lactose, further investigation has revealed a wide divergence in the actual roles of the two proteins in the expression of their respective activities [4].
Thus, it is to be hoped that information on the sequences of α-lactalbumin and lysozyme may tell us something about the relationship of structural change to the evolutionary divergence of function between the two groups. As it was known that the primary structure of human lysozyme was under investigation [5], human α-lactalbumin has been chosen for comparative sequence studies.
Abbreviations. CNBr, cyanogen bromide; dansyl, 1-dimethylaminonaphthalene-5-sulphonyl.
Enzymes. Aminopeptidase M (EC 3.4.1.2); prolidase (EC 3.4.3.7); carboxypeptidase A (EC 3.4.2.1); carboxypeptidase B (EC 3.4.2.2); trypsin (EC 3.4.4.4); chymotrypsin (EC 3.4.4.5); thermolysin (EC 3.3.4.-); pronase (EC 3.4.4.-).
Peptides obtained by enzyme-catalysed hydrolysis of the cyanogen bromide fragments from human α-lactalbumin have been isolated and characterized and we report here the amino acid sequences of these peptides and the way in which they were aligned to yield the complete primary structure of the protein.
This result is compared to the known primary structures of bovine α-lactalbumin [6] and human and hen egg-white lysozymes [5,7]. The degree of variability of certain parts of the α-lactalbumin molecule revealed by the comparison is discussed in relation to their possible role in the lactose synthetase specifier activity and to the antigenic activity of the protein.
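The percentage figures quoted for sequence comparisons can be illustrated with a minimal sketch, assuming the two sequences are already aligned and that gapped positions are left out of the count (the paper does not state the denominator it used). The function name `percent_identity` and the toy strings are hypothetical and are not taken from either protein.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of compared positions that hold the same residue.

    Assumes seq_a and seq_b are already aligned (equal length, gaps as '-').
    Columns with a gap in either sequence are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical toy alignment in one-letter code, for illustration only.
toy_a = "ACDEFGHI-K"
toy_b = "ACDQFGHLMK"
print(round(percent_identity(toy_a, toy_b), 1))
```

Counting chemically similar replacements, as the comparison above also does, would need an explicit table of which residue pairs are treated as similar; none is assumed here.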
EXPERIMENTAL PROCEDURE
MATERIALS
The human milk used in these studies was kindly supplied by the Maternity Hospital at Leeds (United Leeds Teaching Hospital Group). Since only small quantities of milk could be obtained from any single individual, no attempt was made to treat these samples separately.
Sephadex was purchased from Pharmacia (Great Britain), Bio-Gel P4 from Bio-Rad Laboratories, DEAE-cellulose from Whatman and M-72 from Beckman Instruments Ltd.
Chymotrypsin, carboxypeptidase A and trypsin were all obtained from Sigma Chemical Company Ltd, thermolysin and pronase from Calbiochem. Ltd, aminopeptidase M from Rohm and Haas (Darmstadt) and prolidase from Miles-Servac (PTY) Ltd. The cyanogen bromide, ethyleneimine, phenylisothiocyanate and 2-mercaptoethanol used in these studies were all produced by Koch-Light Laboratories Ltd; B.D.H. Ltd supplied 1-dimethylaminonaphthalene-5-sulphonyl chloride and trifluoroacetic acid. The polyamide layer sheets were obtained from The Cheng Chin Trading Company (Taiwan).
All other chemicals were reagent grade or better. Pyridine and phenylisothiocyanate were redistilled prior to use, the former from ninhydrin-containing solutions. Reagents and solutions prepared from these materials were stored at -16 °C.
METHODS
Purification of Human α-Lactalbumin
Whole human milk (pH 6.6) was adjusted to pH 4.6 with 1 M acetic acid, and centrifuged at 10000 rev./min for 30 min. Some of the caseins were sedimented whilst the fat formed a layer at the top of the vessel. The clear supernatant material was retained and, after adjusting to pH 7.0 with 1 M NaOH, solid ammonium sulphate was added to 30% saturation. After centrifugation as above, the precipitated material was discarded and the supernatant fraction taken to 80% saturation with ammonium sulphate. The precipitate so obtained was redissolved in a tenth of the original milk volume of water, and the pH adjusted to 4.0 with 1 M acetic acid. The resulting precipitate was collected as before, redissolved in as small a volume of water as possible and the pH taken up to 7.0.
Following extensive dialysis (3 days) against distilled water, the preparation was freeze-dried, dissolved in 0.05 M ammonium bicarbonate and applied to Sephadex G-100 (4 x 140 cm) equilibrated in the same buffer. Gel filtration was carried out in this buffer, at room temperature with a flow rate of 50 ml/h. The column eluant was monitored spectrophotometrically at 280 nm.
On occasion, a further column separation using the ion-exchange resin DEAE-cellulose in 0.02 M Tris-HCl pH 7.8 with a sodium chloride gradient of 0 to 0.3 M, was employed to ensure purity.
The resulting material gave a single band on polyacrylamide gel disc electrophoresis at pH 8.6 and consistent amino acid composition data. A single precipitin line was also obtained using the Ouchterlony double-diffusion procedure with rabbit antiserum to human α-lactalbumin.
Reduction, S-Aminoethylation and Cleavage with Cyanogen Bromide
The method used in the reduction and S-aminoethylation of human α-lactalbumin is described elsewhere [8]. The modified protein was separated from the reaction mixture by dialysis in acetylated tubing against distilled water for 3 or 4 days.
Cleavage of the resulting material with cyanogen bromide was subsequently carried out in 70% formic acid at a protein concentration of 10 mg/ml and a CNBr concentration of 20 mg/ml. After 24 h at room temperature, excess reagents were removed by rotary evaporation, the residual material washed with distilled water and then freeze-dried. The resulting peptides were separated by gel filtration with Sephadex G-75 (4 x 140 cm) equilibrated in 1% formic acid or 1 M acetic acid.
Polyacrylamide-Gel Electrophoresis
Disc electrophoresis with 7% polyacrylamide gels was performed as described by Davis [9]. For electrophoresis of the cyanogen bromide fragments the gels and buffers were prepared containing 4 M urea.
Amino-Acid Analysis
Amino acid analyses were performed with a Beckman Unichrome amino acid analyser fitted with high-sensitivity flow cells, following acid hydrolysis of the protein and peptides with 6 N HCl at 110 °C in evacuated sealed Pyrex tubes. The composition of human α-lactalbumin was obtained from 24, 48 and 96-h hydrolysates, extrapolating back to zero time for the serine, threonine and tyrosine values. It was assumed that the release of isoleucine and valine was complete after 96 h. The cysteine content was obtained upon analysis of the performic acid-oxidised protein [10]. Peptides were hydrolysed for 24 h but no correction for the destruction of serine, threonine and tyrosine, or for slow liberation of isoleucine and valine, was made. Contaminating amino acids less than 0.1 of a residue have been omitted from the composition tables.
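The extrapolation of labile residues back to zero hydrolysis time can be sketched as a least-squares straight line through the 24, 48 and 96-h values. The linear form is an assumption (the paper does not say which fitting procedure was used), the function name `extrapolate_to_zero` is hypothetical, and the serine recoveries below are invented purely for illustration.

```python
def extrapolate_to_zero(times_h, values):
    """Fit value = intercept + slope * time by least squares; return the intercept.

    The intercept is the estimated residue count at zero hydrolysis time.
    A straight-line loss with time is assumed here, not taken from the paper.
    """
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times_h, values))
    slope = sxy / sxx
    return mean_v - slope * mean_t


# Hypothetical serine recoveries (residues per molecule), not measured data.
times = [24, 48, 96]
serine = [7.4, 7.0, 6.2]
print(round(extrapolate_to_zero(times, serine), 2))
```

With these invented values the recovery falls by 0.4 residue per 24 h, so the fitted intercept lies 0.4 above the 24-h value.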
N-Terminal Determination
The N-terminal amino acids of the whole protein and the CNBr fragments were determined using a modification (personal communication, J. R. Wootton) of the dansyl chloride procedure [12]. In this method dansylation was carried out in the presence of dodecylsulphate and the N-terminal dansyl amino acids extracted from acid hydrolysates by ethyl acetate or 50% aqueous pyridine. These derivatives were subsequently identified by two-dimensional ascending chromatography on 7.5 or 5-cm square double-sided polyamide layers [13], using the solvent systems detailed below. Solvent 1 was invariably used in the first dimension and solvents 2 to 5 in the second dimension.
Solvent 1 = 1.5% (v/v) aqueous formic acid [13]; solvent 2 = benzene-acetic acid (9:1, v/v) [13]; solvent 3 = ethyl acetate-methanol-acetic acid (20:1:1, v/v) [14]; solvent 4 = pyridine-acetate pH 4.4-ethanol (3:1, v/v) [14]; solvent 5 = 0.05 M trisodium phosphate-ethanol (3:1, v/v) [14].
Systems 1 and 2 were sufficient for the identification of the dansyl derivatives of all amino acids with the exception of serine and threonine, aspartic and glutamic acids (both pairs resolved with solvent 3), arginine and lysine (separated in solvent 4) and histidine (resolved from ε-dansyl-lysine using solvent 5). Removal of the N-terminal residues was achieved by a modified Edman procedure in which the coupling step was performed in the presence of dodecylsulphate; the phenylthiocarbamyl-peptide was recovered by precipitation with acetone prior to cyclisation [11]. This allowed the N-terminal sequences of the whole protein and the CNBr fragments to be determined.
C-Terminal Determination
The C-terminal residue of the intact protein was identified using carboxypeptidase A. 10 mg of the native protein was dissolved in 3 ml of 0.5 M phosphate buffer pH 7.6 and 4% (w/w) carboxypeptidase A in 10% lithium chloride added. The mixture was incubated at 37 °C and 0.5-ml aliquots withdrawn into 0.6 ml 0.1 M HCl after 0, 30, 75, 180 and 300 min. These samples were stored frozen and analysed without further treatment.
Enzyme Digestions
Trypsin. 5-10 µmol of the CNBr fragments were dissolved in 2.0 ml H2O and the pH adjusted to 8.6 with 0.1 M NaOH. Trypsin, equivalent to 1% (w/w) of the protein, was added and the hydrolysis allowed to proceed for 1 h at 37 °C. At this time a second 1% aliquot was added. The pH of the solution was constantly maintained at 8.6 with 0.1 N NaOH. After a second 1-h period the reaction was terminated by the addition of glacial acetic acid to a final pH of 2.5. The material was subsequently freeze-dried.
Chymotrypsin. Digestion with chymotrypsin was carried out as with trypsin, except that a third aliquot of the enzyme was added after 2 h and the hydrolysis allowed to proceed for a total of 3 h.
Thermolysin. 5-10 µmol of the material to be digested was dissolved in 2 ml 2.5 mM CaCl2 and the pH adjusted to 7.5 with 0.1 M NaOH. 1% thermolysin (w/w) was added at zero time and after 1 and 2 h, thereby giving a total of 3% and a hydrolysis time of 3 h. The pH was maintained as above and the reaction terminated with glacial acetic acid as previously described.
Separation of Peptides
Peptides were routinely separated using a column (0.9 x 55 cm) of Beckman M-72 resin, a spherical bead sulphonated polystyrene equivalent to Dowex 50X8. Elution was performed with pyridine acetate buffers [6], using on most occasions a linear gradient from 0.05 M to 2 M pyridine. All buffers were deaerated before use.
The freeze-dried peptide mixtures were dissolved in 2 ml of 0.05 M pyridine acetate and any insoluble material removed by centrifugation. These precipitates were washed (2 x 0.5 ml) with the same buffer and the washings also applied to the column. Chromatography was carried out at 55 °C with flow rates of 50 to 60 ml/h. The columns were finally eluted with 200 ml 2 M pyridine-acetate at the end of each gradient. The legends to each figure give the actual details in each case. The column eluate was monitored spectrophotometrically at 280 nm and by the alkaline hydrolysis-ninhydrin colorimetric assay on aliquots from each fraction [15].
Peptides were pooled as shown, dried by rotary evaporation and redissolved in water, 1% acetic acid or 0.05 M ammonium bicarbonate, depending on their solubility characteristics.
Impure pools were further fractionated by gel filtration on Bio-Gel P4 (2.2 x 140 cm) in 1% formic acid, at room temperature and with flow rates of 12 to 15 ml/h. Alternatively, high-voltage paper electrophoresis at 3 kV for 40 min in pyridine-acetate buffer pH 6.5 proved an effective procedure. The peptides thus separated were eluted from the strips of Whatman No. 3 paper with 0.1 M ammonium bicarbonate.
Sequence Analysis of Peptides
Purified peptides were subjected to sequence analysis using the combined Edman degradation-dansylation procedure [26]. The N-terminal dansyl amino acids so formed were separated and identified by two-dimensional ascending chromatography on double-sided polyamide layers as described.
Several observations arising from our experience with this technique should be mentioned. Where S-aminoethyl-cysteinyl residues were present in non-terminal positions in peptides, substantial modification or destruction of the amino acid occurred during the degradation process. As a result, the yield of bis-dansyl S-aminoethyl-cysteine was markedly reduced. In contrast, good yields of the derivative were obtained when the residue occupied an N-terminal position in the peptide. On chromatography, the dansyl S-aminoethyl-cysteine derivative ran slightly faster than dansyl cysteine in solvents 1 and 2.
A similar situation was observed with internal lysine residues. In this instance, three spots were observed, the N- and bis-dansyl derivatives, together with a component running in a position almost identical to that characteristic of leucine. A possible explanation for these effects would lie in the formation of phenylthiocarbamyl derivatives of the side chain amino groups of lysine and S-aminoethyl-cysteine, during the Edman degradation procedure.
Acid hydrolysis of the dansyl peptide, which is a necessary step for the liberation of the N-terminal dansyl amino acid, caused the destruction of dansyl tryptophan. As a consequence, no spot was observed and the position of the tryptophanyl residue could only be inferred from the absence of such a spot at this stage, allied with the location of other residues in the peptide. Less ambiguous indication of the position of this amino acid was obtained from a timed course of enzymic hydrolysis.
Complete Enzymatic Hydrolysis of Peptides
The peptide (0.1 to 0.2 µmol) was dissolved in 2 ml 5 mM MgCl2 and the resultant solution adjusted to pH 7.8 with 0.01 M NaOH. Aminopeptidase M (0.1 mg) and pronase (0.1 mg), also suspended in 5 mM MgCl2 pH 7.8, were added and the digestion allowed to proceed for 18 h at 37 °C. Where proline was present, prolidase was substituted for pronase. The mixture was subsequently acidified with 0.1 M HCl and analysed directly on the amino acid analyser without prior removal of the enzymes.
This technique was employed as an adjunct to the electrophoretic method (see below) used in the determination of the amide content of various peptides. Glutamine and asparagine emerged with serine on amino acid analysis. Used in conjunction with the results from acid hydrolysis of the same peptides, in which glutamine and asparagine are converted to the respective acids, an unambiguous estimate of the number and identity of the amidated residues could usually be obtained.
The second important use of complete enzymatic hydrolysis was in the verification of the presence of tryptophan in peptides, following previous indication by absorption at 280 nm and the presence of degradation products of tryptophan on analysis of acid hydrolysates of the peptide.
Time-Course Digestion of Peptides
Carboxypeptidase A. 10 µl of the enzyme suspension (20 mg/ml) was dissolved in 50 µl 0.1 N NaOH and the solution neutralised with an equivalent amount of 0.1 N HCl. 0.1 M NH4HCO3 pH 7.8 was added to bring the volume to 1 ml and 0.4-ml aliquots of the enzyme solution were added to 0.1 µmol of dried peptide and a control tube, respectively. Incubation was carried out at 25 °C, aliquots withdrawn at suitable intervals, acidified with glacial acetic acid and analysed directly without prior removal of enzyme or peptide.
Aminopeptidase M. A solution of aminopeptidase M (0.1 mg/ml) in 0.1 M NH4HCO3 pH 7.8 containing 2.5 mM MgCl2 was prepared. 0.2 µmol of peptide was dissolved in 0.4 ml of the enzyme preparation and incubation carried out at 25 °C. Aliquots were withdrawn and treated as above.
Determination of Amide Content
The paper electrophoretic procedure of Offord [17] was used.
RESULTS
Fig.1 shows the elution pattern obtained from Sephadex G-100 in the purification of human α-lactalbumin. It is possible to obtain a similarly shaped profile by applying the material obtained by simply dialysing and freeze-drying the pH-4.6 supernatant. However, the procedure detailed in Methods had two distinct advantages. The first was that, due to the removal of the bulk of the milk caseins, much larger amounts of α-lactalbumin could be purified with a single column separation. Secondly, the method also resulted in the removal of a small soluble casein of low absorption coefficient which eluted from Sephadex G-100 in the leading edge of the α-lactalbumin peak. Although present in small amounts, this material had a profound effect on the subsequent amino acid analysis of the α-lactalbumin, as a result of the very high proline and glutamic acid content of the contaminant.
Fig. 1. Purification of human α-lactalbumin on Sephadex G-100. 2.0 g of the pH-4-precipitated fraction of human milk was applied to a column (4 x 150 cm) of Sephadex G-100, equilibrated in 0.05 M ammonium bicarbonate. The column was developed with this buffer at 50 ml/h at room temperature
Table 1. The amino-acid compositions of human α-lactalbumin and its cyanogen-bromide fragments
The results are given as number of residues per protein molecule as determined by sequence and amino acid composition

Amino acid       Whole protein      CNBr E N-terminal   CNBr D              CNBr E C-terminal
                                    fragment                                fragment
                 Sequence  Compn.   Sequence  Compn.    Sequence  Compn.    Sequence  Compn.
Aspartic acid    16        15.5     2         2.0       12        11.6      2         2.2
Threonine        7         6.8      2         1.6       4         4.2       1         1.1
Serine           8         7.8      1         0.8       7         7.0       -         -
Glutamic acid    15        14.8     4         3.9       7         6.8       4         4.4
Proline          2         2.4      1         0.9       1         1.0       -         -
Glycine          6         6.0      3         2.7       2         2.0       1         1.4
Alanine          5         5.2      1         1.1       1         1.2       3         2.7
1/2 Cystine      8         7.8      2         1.9       3         2.7       3         2.6
Valine           2         2.0      -         -         2         1.6       -         -
Methionine (a)   2         2.0      (1)       (1)       (1)       (1)       -         -
Isoleucine       12        11.6     3         3.0       6         5.7       3         2.8
Leucine          14        13.7     5         4.6       3         3.3       6         5.1
Tyrosine         4         4.1      1         0.7       2         1.9       1         0.8
Phenylalanine    4         4.1      1         0.8       3         2.8       -         -
Tryptophan       3         -        -         -         1         +         2         +
Lysine           12        12.0     3         2.9       3         3.2       6         5.6
Histidine        2         2.0      -         -         1         1.0       1         0.9
Arginine         1         1.1      -         -         1         1.0       -         -
Total            123                30                  60                  33
N-Terminal       Lys-Glu-Phe        Lys-Glu-Phe         Phe-His             Cys-Ala-Lys

(a) In CNBr fragments methionine occurred as homoserine lactone on analysis of acid-hydrolysed peptides.
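As a consistency check on the sequence columns of Table 1, the residue counts of the three cyanogen bromide fragments should add up to those of the whole protein. The sketch below performs that arithmetic; the counts are read from the table as interpreted here, methionine is counted once in each fragment that ends in homoserine lactone, and the dictionary layout and function name are illustrative choices rather than anything in the paper.

```python
# Sequence-derived residue counts: (whole protein, CNBr E N-terminal fragment,
# CNBr D, CNBr E C-terminal fragment). Values as read from Table 1.
COUNTS = {
    "Asp": (16, 2, 12, 2), "Thr": (7, 2, 4, 1), "Ser": (8, 1, 7, 0),
    "Glu": (15, 4, 7, 4), "Pro": (2, 1, 1, 0), "Gly": (6, 3, 2, 1),
    "Ala": (5, 1, 1, 3), "Cys": (8, 2, 3, 3), "Val": (2, 0, 2, 0),
    "Met": (2, 1, 1, 0), "Ile": (12, 3, 6, 3), "Leu": (14, 5, 3, 6),
    "Tyr": (4, 1, 2, 1), "Phe": (4, 1, 3, 0), "Trp": (3, 0, 1, 2),
    "Lys": (12, 3, 3, 6), "His": (2, 0, 1, 1), "Arg": (1, 0, 1, 0),
}


def fragment_mismatches(counts):
    """Return the residues whose fragment counts do not sum to the whole-protein count."""
    return [name for name, (whole, n_frag, d_frag, c_frag) in counts.items()
            if n_frag + d_frag + c_frag != whole]


totals = [sum(row[i] for row in COUNTS.values()) for i in range(4)]
print(totals)                       # column totals: whole, N-terminal, CNBr D, C-terminal
print(fragment_mismatches(COUNTS))  # an empty list means every residue balances
```

The column totals reproduce the 123, 30, 60 and 33 residues given in the table's Total row.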
The yield of human α-lactalbumin obtained by this procedure was 1.7 to 1.8 mg/ml of the original milk. All preparations gave single bands on polyacrylamide disc electrophoresis and were active in the lactose synthetase reaction. The absorption coefficient A(1%, 1 cm) measured by the dry weight method was found to be 18.8 and the amino acid composition is given in Table 1. The results of polyacrylamide gel electrophoresis and chromatography with DEAE-cellulose indicated an absence of genetic variants in the milk samples examined.
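An absorption coefficient of this kind is normally used to convert an absorbance reading into a protein concentration. The following is a minimal sketch, assuming the value 18.8 refers to a 1% (w/v, i.e. 10 mg/ml) solution in a 1-cm cell at 280 nm; that reading of the units, the function name and the example absorbance are assumptions, not statements from the paper.

```python
A_1PCT_1CM = 18.8  # absorbance of a 1% (10 mg/ml) solution, 1-cm path (assumed units)


def concentration_mg_per_ml(absorbance_280, path_cm=1.0):
    """Protein concentration in mg/ml from an A280 reading (Beer-Lambert, linear range)."""
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    # A 1% solution is 10 mg/ml, so scale the ratio of absorbances by 10.
    return 10.0 * absorbance_280 / (A_1PCT_1CM * path_cm)


# Hypothetical reading, for illustration only.
print(round(concentration_mg_per_ml(0.94), 3))
```

The relation only holds while the reading is within the linear range of the instrument, so concentrated samples would need dilution first.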
SEPARATION OF CNBr FRAGMENTS
Also shown in Table 1 are the amino acid composition data for the three fragments obtained upon cleavage (with cyanogen bromide) of the S-aminoethylated protein at the two methionyl residues.
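Cleavage at methionyl residues can be mimicked in a few lines: cyanogen bromide cuts on the C-terminal side of each methionine, so two methionines give three fragments. The sketch below uses a placeholder string of 123 residues with methionine at positions 30 and 90 (positions consistent with the fragment boundaries used in this paper); the placeholder is not the real sequence, and the function name is hypothetical.

```python
def cnbr_fragments(sequence):
    """Split a one-letter sequence after every methionine (M).

    Each fragment except the last ends in the residue that was methionine
    (recovered experimentally as homoserine lactone).
    """
    fragments = []
    start = 0
    for index, residue in enumerate(sequence):
        if residue == "M":
            fragments.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


# Placeholder sequence: 123 residues, Met at positions 30 and 90 (not real data).
placeholder = "A" * 29 + "M" + "G" * 59 + "M" + "L" * 33
print([len(fragment) for fragment in cnbr_fragments(placeholder)])
```

The fragment lengths of the placeholder match the 30, 60 and 33 residues listed for the three fragments in Table 1.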
The separation profile of these fragments is shown in Fig.2. Examination of peaks D and E by polyacrylamide gel disc electrophoresis in urea revealed that although peak D was pure, peak E contained two components. It has not proved possible to separate these two components on a preparative scale by any of the techniques of gel filtration, ion-exchange chromatography, high-voltage electrophoresis or paper chromatography so far tried. They were resolved, however, on a small scale by high-voltage electrophoresis at pH 2.5 (2 h, 100 mA and 3 kV) on silica gel thin-layer plates (20 x 20 cm). Accordingly, although enough material could be obtained for amino acid analysis and N-terminal sequence determination, proteolytic digestions were carried out on the mixture of fragments.
Fig. 2. Separation of CNBr fragments of S-aminoethylated human α-lactalbumin. 120 mg of CNBr-cleaved material was separated with Sephadex G-75 (2.5 x 150 cm), developed at room temperature with 1% formic acid and at a flow rate of 12 ml/h
Fig.3. Separation of tryptic peptides from CNBr E. A tryptic digest of 40 mg of CNBr E was chromatographed on a column (0.9 x 55 cm) of Beckman M-72 using linear gradient elution at 50 ml/h with 500 ml each of 0.05 M pyridine-acetate pH 2.5 as starting buffer and 2 M pyridine-acetate pH 5.0 as limit buffer. 3.5% of each fraction was sampled for ninhydrin estimation (-), the absorbance being measured at 570 nm; . . . . ., absorbance at 280 nm
Of the remaining peaks in the elution pattern, peak A was devoid of protein, peak B was identical in composition to the whole protein and peak C appeared to consist, judging by its amino acid composition, of a combination of two partial cleavage products. Calculation of yields based on the material in these pools indicated that nearly 90% of the S-aminoethylated protein had been at least partially cleaved by cyanogen bromide, nearly 70% appearing as the three CNBr fragments.
N- and C-Terminal Amino Acids
Also shown in Table 1 are the N-terminal amino acids of the whole protein and the cyanogen bromide fragments determined by the dansyl chloride procedure. On digestion with carboxypeptidase A, leucine was the only amino acid released in significant amounts. The percentage recovery of the residue was 95%, the maximum contaminant being tyrosine, present in quantities less than 10% of leucine after25.5-h digestion period. From this result it was suspected that the pre-C-terminal residue of the protein was not susceptible to cleavage with carboxypeptidase A, indicating the possible presence of a basic residue or proline. In summary, human α-lactalbumin is a protein 123 residues in length, the C-terminal amino acid being leucine and the N-terminal residue, lysine. In this last mentioned respect, it differs from bovine α-lactalbumin [6] which has an N-terminal glutamyl residue.
ISOLATION AND CHARACTERIZATION OF PEPTIDES AFTER DIGESTION OF CNBr FRAGMENTS WITH HYDROLYTIC ENZYMES
An example of the chromatographic separation of peptides from the CNBr pools is shown in Fig.3.
Fig.4. Purification of T16, T17. 4.5 µmol T16, T17 obtained from the chromatographic separation illustrated in Fig. 3 were applied to a column (2.0 x 150 cm) of Bio-Gel P4 equilibrated with acetic acid. The column was developed in this buffer at a flow rate of 12 ml/h. 5% of every second fraction was subjected to the ninhydrin assay. -, the absorbance being measured at 570 nm; ....., absorbance at 280 nm
For the further purification of pools containing more than one species, the single technique which proved most successful was gel filtration with Bio-Gel P4. Fig.4 illustrates such a re-separation, in this case of T16 and T17 from CNBr E. On one occasion, however, this method proved unsuccessful and purification was achieved by high-voltage electrophoresis at pH 6.5.
Peptides were checked for purity by high-voltage electrophoresis at pH 6.5 and 3.5. Where further purification was necessary the method employed
Table 2. Amino-acid compositions, yields and N-terminal residues of tryptic peptides from CNBr D
The numbers in parentheses after the molar ratio figures represent the assumed number of residues. Tryptophan content was confirmed by complete enzymic hydrolysis of the respective peptides
Peptide: T5; T6; T6,7; T7; T8; T9; T10; T11; T12; T13 | Whole CNBr D from: Peptide total; Compn.
Aspartic acid: 3.99 (4); 1.14 (1); 0.97 (1); 0.99 (1); 1.01 (1); 1.96 (2); 4.83 (5) | 12; 12
Threonine: 2.37 (3); 0.75 (1) | 4; 4
Serine: 2.70 (3); 0.86 (1); 0.70 (1); 2.39 (3); 0.88 (1); 0.98 (1) | 7; 7
Glutamic acid: 4.76 (5); 1.06 (1); 0.81 (1); 2.05 (2) | 7; 7
Proline: 0.87 (1) | 1; 1
Glycine: 1.97 (2); 1.10 (1); 0.86 (1) | 2; 2
Alanine: 1.06 (1) | 1; 1
Valine: 0.95 (1); 0.85 (1) | 2; 2
Tyrosine: 1.66 (2) | 2; 2
Isoleucine: 1.65 (2); 0.86 (1); 0.90 (1); 0.96 (1); 0.84 (1); 0.81 (1); 2.10 (2) | 6; 6
Leucine: 0.96 (1); 0.92 (1); 0.86 (1); 1.05 (1); 0.89 (1) | 3; 3
Phenylalanine: 1.58 (2); 0.81 (1); 0.73 (1); 0.91 (1) | 3; 3
Tryptophan: +(1) | 1; 1
Lysine: 0.73 (1); 0.81 (1); 0.98 (1); 0.93 (1); 1.10 (1) | 3; 3
S-Aminoethyl-cysteine: 0.92 (1); 0.93 (1); 0.66 (1); 0.77 (1) | 3; 3
Histidine: 0.65 (1) | 1; 1
Arginine: 0.93 (1) | 1; 1
Homoserine lactone: +(1); +(1); +(1)
No. of residues: 28; 3; 8; 6; 4; 8; 3; 4; 6; 11 | 60; 60
N-terminal residue: Phe; Gly; Gly; Glx; Leu; Ser; Asx; Asx; Asx; Phe; Lys
Yield (%): 28; 24; 14; 6; 36; 56; 66; 24; 42; 45
Table 3. Amino-acid compositions, yields and N-terminal residues of tryptic peptides from CNBr E
The numbers in parentheses after the molar ratio figures represent the assumed number of residues. Tryptophan content was confirmed by complete enzymic hydrolysis of the respective peptides. Peptides T2, T16, T17 and T19 were reseparated on Bio-Gel P4
Peptide: T1; T2; T2A; T3; T4; T3,4; T14; T15; T14,15; T16; T17; T18; T19; T19A | Whole CNBr E from: Peptides total; Compn.
Aspartic acid: 2.05 (2); 2.06 (2); 0.99 (1); 0.99 (1); 1.02 (1) | 4; 4
Threonine: 0.9 (1); 1.00 (1); 0.80 (1); 0.84 (1) | 3; 3
Serine: 0.96 (1); 0.75 (1) | 1; 1
Glutamic acid: 1.0 (1); 2.00 (2); 2.02 (2); 1.00 (1); 1.00 (1); 0.99 (1); 2.08 (2); 0.98 (1); 1.02 (1) | 8; 8
Proline: 1.11 (1); 1.12 (1) | 1; 1
Glycine: 2.75 (3); 2.76 (3); 1.07 (1) | 4; 4
Alanine: 0.99 (1); 0.97 (1); 1.04 (1); 0.95 (1) | 3; 4
Valine:
Isoleucine: 2.72 (3); 2.79 (3); 1.84 (2); 1.73 (2); 0.98 (1) | 6; 6
Leucine: 2.72 (3); 2.83 (3); 2.00 (2); 2.00 (2); 0.95 (1); 1.10 (1); 0.99 (1); 0.92 (1); 2.01 (2); 1.01 (1) | 11; 11
Tyrosine: 0.82 (1); 0.97 (1); 0.84 (1) | 2; 2
Phenylalanine: 0.9 (1) | 1; 1
Tryptophan: | 2; 2
Lysine: 2.0 (2); 1.04 (1); 1.00 (1); 1.00 (1); 0.93 (1); 2.03 (2); 0.93 (1); 1.06 (1); 0.98 (1); 0.98 (1) | 8; 9
S-Aminoethyl-cysteine: 0.90 (1); 0.81 (1); 0.88 (1); 0.71 (1); 0.91 (1) | 4; 5
Histidine: 0.84 (1) | 1; 1
Arginine:
Homoserine lactone: +; + | 1; 1
No. of residues: 5; 8; 7; 15; 2; 17; 1; 5; 6; 9; 6; 6; 3; 2 | 60; 63
Yield (%): 90; 38; 60; 14; 20; 58; 70; 90; 10; 66; 66; 62; 40; 16
Table 4. Amino-acid compositions, yields and N-terminal residues of chymotryptic peptides from CNBr D
The numbers in parentheses after the molar ratio figures represent the assumed number of residues. Tryptophan content was confirmed by complete enzymic hydrolysis of the respective peptides
Peptide: C7; C8; C9; C10; C11; C12; C13; C12,13; C14; C15
Aspartic acid: 3.02 (3); 1.09 (1); 7.78 (8); 7.87 (8)
Threonine: 1.04 (1); 1.65 (2); 0.93 (1); 0.96 (1)
Serine: 1.15 (1); 0.92 (1); 0.76 (1); 2.41 (3); 2.69 (3); 1.09 (1); 2.06 (2)
Glutamic acid: 3.72 (4); 1.01 (1); 2.02 (2); 2.04 (2)
Proline: 0.97 (1); 0.97 (1)
Glycine: 1.01 (1); 0.94 (1)
Alanine: 0.95 (1)
Valine: 0.95 (1); 0.95 (1); 1.14 (1)
Isoleucine: 0.96 (1); 0.81 (1); 3.81 (4); 3.74 (4)
Leucine: 1.02 (1); 1.06 (1); 1.08 (1); 1.11 (1)
Tyrosine: 0.88 (1); 0.51 (1)
Phenylalanine: 0.99 (1); 0.61 (1); 0.84 (1); 0.93 (1)
Tryptophan: +(1)
Lysine: 0.98 (1); 0.96 (1); 1.06 (1); 1.08 (1); 1.04 (1)
S-Aminoethyl-cysteine: 0.90 (1); 0.97 (1); 1.51 (2); 1.41 (2)
Histidine: 0.96 (1)
Arginine: 1.00; 0.80 (1); 0.66 (1)
Homoserine lactone: +; +
No. of residues: 6; 14; 3; 4; 3; 9; 1; 10; 20; 22
N-terminal residues: Phe; Asx; Gly; Glx; Lys; Cys; Arg; Cys; Asx; Ser
Yield (%): 78; 19; 100; 26; 50; 20; 30; 28; 15; 15
is indicated. The compositions, yields and N-terminal residues of these peptides are shown in Tables 2 to 7. The nomenclature used is as follows: tryptic peptides are denoted by the letter T, chymotryptic by C and thermolysin by Th. The numbering of these peptides denotes their position in the complete sequence of the protein, beginning at the NH2-terminus. Where one peptide also occurs as two smaller fragments, the larger peptide is denoted by the numbers assigned to both the smaller pieces, e.g. T3,4 is a fragment containing the same residues as T3 and T4. Yields varied from 10% to 100%, and were generally greater from hydrolysates of the fragments contained in the CNBr E pool. This was foreshadowed by the presence of small amounts of insoluble material in all the digests of CNBr D.
Sequence analysis of the above peptides has permitted deduction of the complete primary structure of the α-lactalbumin. The proof of this sequence is detailed below. For ease of discussion the largest of the three CNBr pieces has been further subdivided into two regions: 31 to 58 and 59 to 90.
Residues 1 to 30
This segment represents the N-terminal fragment contained in peak CNBr E. Its amino acid sequence was unambiguously derived from tryptic peptides T1, T2, T3, T3,4 and T4 together with overlapping chymotryptic peptides C1, C2, C3, C3,5, C5 and C6. Further verification came from thermolysin peptides Th1, Th2, Th3, Th4, 4A, Th5, Th6A, Th6, Th7, and Th7A.
Table 5. Amino-acid compositions, yields and N-terminal residues of chymotryptic peptides from CNBr E
The numbers in parentheses after the molar ratio figures represent the assumed number of residues. Tryptophan content was confirmed by complete enzymic hydrolysis of the respective peptides. Peptides C1, C5 and C18 were reseparated on Bio-Gel P4
Peptide: C1; C2; C3; C2,3; C4; C5; C3,5; C6; C16; C16A; C17; C17A; C18; C17,18; C19; C20
Aspartic acid: 1.79 (2); 1.88 (2); 2.01 (2); 1.82 (2)
Threonine: 0.75 (1); 0.96 (1); 0.98 (1); 0.84 (1)
Serine: 0.80 (1); 0.93 (1); 0.73 (1); 0.82 (1); 2.81 (3); 0.98 (1)
Glutamic acid: 1.02 (1); 1.00 (1); 0.97 (1); 1.82 (2); 1.04 (1); 1.07 (1); 1.04 (1)
Proline: 1.12 (1)
Glycine: 1.04 (1); 0.97 (1); 2.07 (2); 1.03 (1); 1.00 (1)
Alanine: 1.01 (1); 0.96 (1); 1.01 (1); 0.94 (1); 1.82 (2)
Valine:
Isoleucine: 1.14 (1); 1.16 (1); 1.93 (2); 1.74 (2); 1.81 (2)
Leucine: 0.94 (1); 1.02 (1); 1.80 (2); 1.99 (2); 0.98 (1); 2.04 (2); 2.03 (2); 1.00 (1); 1.03 (1); 1.08 (1); 1.96 (2); 1.04 (1); 2.05 (2)
Tyrosine: 0.59 (1); 0.70 (1); 0.57 (1); 0.54 (1)
Phenylalanine: 0.81 (1)
Tryptophan: +(1); +(1)
Lysine: 0.96 (1); 0.92 (1); 0.99 (1); 0.90 (1); 0.89 (1); 0.99 (1); 0.83 (1); 0.98 (1); 1.11 (1); 0.88 (1); 1.15 (1)
S-Aminoethyl-cysteine: 0.66 (1); 1.03 (1); 0.65 (1); 0.76 (1); 0.91 (1)
Histidine: 0.96 (1); 0.77 (1); 0.56 (1)
Arginine:
Homoserine lactone: +
No. of residues: 3; 5; 3; 8; 4; 7; 10; 12; 8; 7; 4; 3; 2; 6; 8; 5
N-terminal residue: Lys; Thr; Ser; Thr; Ser; Ser; Ser; Gly; Asx; Asx; Leu; Leu; Ala; Leu; Cys; Leu
Yield (%): 77; 20; 18; 21; 17; 44; 24; 54; 66; 11; 17; 16; 31; 7; 42; 30
Table 6. Amino-acid compositions, yields and N-terminal residues of thermolysin peptides from CNBr D
The numbers in parentheses after the molar ratio figures represent the assumed number of residues. Tryptophan content was confirmed by complete enzymic hydrolysis of the respective peptides. Peptides Th10 and Th17A were reseparated by high-voltage electrophoresis on paper, Th11 and Th14 by Bio-Gel P4.

Peptide  Residues  N-terminal  Yield (%)  Composition (molar ratio)
Th8      3         Phe         16         Thr 0.86(1), Phe 0.93(1), His 1.00(1)
Th9      7         Ser         14         Asp 1.1(1), Thr 0.82(1), Ser 0.92(1), Glu 1.09(1), Gly 0.94(1), Ala 1.01(1), Tyr 0.82(1)
Th8,9    10        Phe         13         Asp 0.98(1), Thr 1.80(2), Ser 0.96(1), Glu 0.97(1), Gly 0.99(1), Ala 1.00(1), Tyr 0.97(1), Phe 1.07(1), His 0.88(1)
Th10     11        Ile         20         Asp 1.96(2), Thr 1.13(1), Ser 0.93(1), Glu 2.98(3), Gly 0.98(1), Val 0.66(1), Ile 0.95(1), Tyr 0.60(1)
Th11     3         Leu         43         Glu 0.96(1), Leu 1.00(1), Phe 1.09(1)
Th11A    2         Phe         17         Glu 1.06(1), Phe 0.95(1)
Th12     4         Ile         62         Asp 1.04(1), Ser 0.80(1), Ile 0.94(1), Lys 0.98(1)
Th13     13        Leu         18         Asp 1.06(1), Ser 2.36(3), Glu 2.05(2), Pro 1.02(1), Val 1.02(1), Leu 1.13(1), Trp +(1), Lys 0.97(1), S-AE-Cys 0.78(1), Arg 0.66(1)
Th14A    8         Ser         10         Asp 1.08(1), Ser 1.61(2), Glu 1.95(2), Pro 1.10(1), Val 0.84(1), Arg 1.00(1)
Th14     9         Ser         14         Asp 1.16(1), Ser 2.60(3), Glu 1.92(2), Pro 0.90(1), Val 0.99(1), Arg 0.98(1)
Th15     3         Ile         35         Asp 1.02(1), Ile 0.90(1), S-AE-Cys 1.00(1)
Th16     5         Ile         23         Asp 1.12(1), Ser 0.92(1), Ile 0.94(1), Lys 0.97(1), S-AE-Cys 1.00(1)
Th16A    6         Ile         10         Asp 1.01(1), Ser 0.87(1), Ile 0.86(1), Phe 0.88(1), Lys 1.12(1), S-AE-Cys 1.14(1)
Th17A    11        Phe         15         Asp 4.83(5), Thr 0.74(1), Ile 2.10(2), Leu 0.88(1), Phe 0.91(1), Hms lactone +
Th17     8         Leu         50         Asp 4.61(5), Thr 0.95(1), Ile 1.11(1), Leu 1.05(1)
76 The Amino-Acid Sequence of Human α-Lactalbumin Eur. J. Biochem.
Table 7. Amino-acid compositions, yields and N-terminal residues of thermolysin peptides from CNBr E
The numbers in parentheses after the molar ratio figures represent the assumed number of residues. Tryptophan content was confirmed by complete enzymic hydrolysis of the respective peptides.

Peptide    Residues  N-terminal  Yield (%)  Composition (molar ratio)
Th1        2         Lys         82         Glu 1.08(1), Lys 0.95(1)
Th2        5         Phe         55         Thr 0.81(1), Glu 1.02(1), Phe 0.92(1), Lys 1.04(1), S-AE-Cys 0.97(1)
Th2A a     4         Thr         25         Thr 0.88(1), Glu 1.04(1), Lys 1.01(1), S-AE-Cys 0.97(1)
Th3        3         Leu         58         Ser 0.81(1), Glu 1.06(1), Leu 0.97(1)
Th4 a      1         Leu         60         Leu 1.00(1)
Th4A       3         Leu         54         Asp 1.06(1), Leu 0.92(1), Lys 1.06(1)
Th4,4A a   4         Leu         24         Asp 1.00(1), Leu 1.97(2), Lys 1.00(1)
Th5 a      6         Ile         100        Asp 1.01(1), Gly 2.97(3), Ile 0.98(1), Tyr 0.84(1)
Th6        5         Ile         57         Glu 1.05(1), Pro 1.01(1), Ala 0.96(1), Ile 0.97(1), Leu 0.98(1)
Th6A       6         Ile         20         Glu 1.08(1), Pro 1.03(1), Ala 0.98(1), Ile 0.92(1), Leu 1.81(2)
Th7        3         Ile         12         Thr 1.03(1), Ile 0.92(1), S-AE-Cys 0.80(1)
Th7A a     4         Ile         40         Thr 0.83(1), Ile 1.04(1), S-AE-Cys 0.90(1), Hms lactone +
Th18 a     3         Ile         71         Asp 0.95(1), Ile 0.91(1), Leu 1.14(1)
Th19       3         Ile         58         Gly 1.04(1), Ile 0.94(1), Lys 0.96(1)
Th20 a     3         Ile         12         Asp 1.04(1), Ile 0.91(1), Tyr 0.85(1)
Th20A a    4         Ile         30         Asp 1.00(1), Ile 0.97(1), Tyr 0.89(1), Trp +(1)
Th21 a     3         Ala         13         Ala 1.00(1), Lys 0.92(1), His 0.88(1)
Th4,21A    5         Leu         16         Ala 2.09(1), Leu 1.01(1), Lys 1.00(1), His 0.90(1)
Th21A      4         Ala         20         Ala 1.99(2), Lys 1.07(1), His 0.91(1)
Th22 a     5         Leu         52         Thr 0.83(1), Glu 0.83(1), Leu 1.09(1), Lys 1.04(1), S-AE-Cys 0.90(1)
Th23       4         Leu         31         Glu 2.07(2), Leu 0.93(1), Trp +
Th24 a     5         Leu         20         Glu 1.04(1), Leu 1.86(2), Lys 1.01(1), S-AE-Cys 0.97(1)

a Reseparated on Bio-Gel P4.

A summary of the way in which these pieces were uniquely aligned to give the whole segment is given in Fig. 5.

T1 (Residues 1 to 5). Sequence: Lys-Glx-Phe-Thr-Lys. Since the first three residues of this peptide are identical to those obtained from sequence analysis of the whole protein, it was judged to represent the first five amino acids in the primary structure of α-lactalbumin. Peptides Th1 and C1 therefore represent the first 2 and first 3 residues respectively, at the N-terminal end of the protein.

T2 (Residues 6 to 13). Sequence: S-AE-Cys-Glx-Leu-Ser-Glx-Leu-Leu-Lys. Overlaps which determined the position of T2 in the protein are provided by C2, C2,3 and Th2 as shown in the table. A peptide was also obtained in which cleavage had occurred after the S-aminoethyl-cysteinyl residue (S-AE-Cys).

T3,4 (Residues 14 to 30). Sequence: Asx-Ile-Asx-Gly-Tyr-Gly-Gly-Ile-Ala-Leu-Pro-Glx-(Leu,Ile,S-AE-Cys,Thr,Hms-lactone), where Hms = homoserine. A certain proportion of this segment also occurred as two further cleavage products T3 and T4, hydrolysis taking place after the S-aminoethyl-cysteinyl residue. It proved impossible to obtain unambiguous sequence data for this peptide after residue 25 but residues 26 to 30 could be easily identified from Th6, Th7, Th7A, T4 and C6. Overlaps between T2 and T3,4 were provided by peptides Th4,4A, Th4A, C5 and C3,5.

In this way a unique sequence was obtained for residues 1 to 30. The whole segment was recovered as peptides from each of the three hydrolytic digests employed.
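The tryptic peptides described for residues 1 to 30 can be reproduced with a short in-silico digest. The Python sketch below is illustrative only: the one-letter sequence is transcribed from Fig. 5 (the C-terminal homoserine lactone is written as M), the function name `cleave_after` is hypothetical, and leaving the bond after the N-terminal lysine intact is an assumption made to match the recovery of T1 as an intact pentapeptide.

```python
def cleave_after(seq, residues):
    """Cut seq on the C-terminal side of every residue listed in `residues`.

    The first and last residues are never used as cut points (an assumption),
    so no single N-terminal residue or empty C-terminal piece is produced.
    """
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        if aa in residues and 0 < i < len(seq) - 1:
            peptides.append(seq[start:i + 1])
            start = i + 1
    peptides.append(seq[start:])
    return peptides


# Residues 1 to 30, transcribed from Fig. 5 (C = S-aminoethyl-cysteine).
SEGMENT_1_30 = "KQFTKCELSQLLKDIDGYGGIALPELICTM"

# Cleavage after Lys/Arg only: T1, T2 and T3,4.
TRYPTIC = cleave_after(SEGMENT_1_30, "KR")

# Additional cleavage after S-aminoethyl-cysteine: T3 and T4 appear.
TRYPTIC_AECYS = cleave_after(SEGMENT_1_30, "KRC")

print(TRYPTIC)
print(TRYPTIC_AECYS)
```

With lysine and arginine as the only cut points the sketch returns the three pieces corresponding to T1, T2 and T3,4; adding S-aminoethyl-cysteine as a cut point splits T3,4 into the 15-residue T3 and the dipeptide T4.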
Residues 31 to 58
This entire segment of the protein was isolated from the mixture of peptides obtained after digestion of CNBr D with trypsin. Some hydrolysis also occurred after tyrosine-50. Direct analysis of this peptide (T5) gave the first 9 residues of the region: Phe-His-Thr-Ser-Gly-Tyr-Asx-Thr-Glx. As the terminal dipeptide sequence is uniquely identical to that of the whole fragment, T5 can be unambiguously aligned at the N-terminus of CNBr D. The complete sequence of the fragment was, however, deduced from the shorter peptides obtained upon digestion with thermolysin and chymotrypsin (see Fig. 6).
Fig. 5. Summary of proof of the sequence of residues 1 to 30 from CNBr E. The sequences have been derived from tryptic, chymotryptic and thermolysin peptides of CNBr E. Residues shown in parentheses are present according to the amino-acid composition but were not sequenced

Sequence (residues 1 to 30):
Lys-Gln-Phe-Thr-Lys-Cys-Glu-Leu-Ser-Gln-Leu-Leu-Lys-Asp-Ile-Asp-Gly-Tyr-Gly-Gly-Ile-Ala-Leu-Pro-Glu-Leu-Ile-Cys-Thr-(Homoserine lactone)

Peptide  Sequence
Th1      Lys-Glx-
C1       Lys-Gln-Phe
T1       Lys-Glx-Phe-Thr-Lys
Th2      Phe-Thr-Lys-Cys-Glu
C2       Thr-Lys-Cys-Glx-Leu
Th3      Leu-Ser-Gln
C3       Ser-Glx-Leu
C4       Ser-Glx-Leu-Leu
T2       Cys-Glx-Leu-Ser-Glx-Leu-Leu-Lys
C2,3     Thr-Lys-Cys-Glx-Leu-Ser-Glx-Leu
Th4,4A   Leu-Leu-Lys-Asx
C5       Leu-Lys-Asp-Ile-Asp-Gly-Tyr
Th4A     Leu-Lys-Asp
C3,5     Ser-Glx-Leu-Leu-Lys-Asp-Ile-Asp-Gly-Tyr
T3       Asx-Ile-Asx-Gly,Tyr,Gly,Gly,Ile,Ala,Leu,Pro,Glx,Leu,Ile,Cys
T3,4     Asx-Ile-Asx-Gly-Tyr-Gly-Gly-Ile-Ala-Leu-Pro-Glx-Leu,Ile,Cys,Thr,(homoserine lactone)
Th5      Ile-Asx-Gly-Tyr-Gly-Gly-
         Ile-Cys,Thr,(homoserine lactone)
Th6      Ile-Ala-Leu-Pro-Glu-
Th6A     Ile-Ala-Leu-Pro-Glx-Leu
Th7      Ile-Cys-Thr
Th7A     Ile-Cys-Thr-homoserine lactone
T4       Thr-homoserine lactone
C6       Gly-Gly-Ile-Ala-Leu-Pro-Glx-Leu-Ile-Cys-Thr-homoserine lactone

Fig. 6. Summary of the proof of the sequence of residues 31 to 58 from CNBr D. For details see legend of Fig. 5

Sequence (residues 31 to 58):
Phe-His-Thr-Ser-Gly-Tyr-Asp-Thr-Gln-Ala-Ile-Val-Glu-Asn-Asp-Gln-Ser-Thr-Glu-Tyr-Gly-Leu-Phe-Gln-Ile-Ser-Asn-Lys

Peptide  Sequence
Th8      Phe-His-Thr
C7       Phe-His-Thr-Ser-Gly-Tyr
Th9      Ser-Gly-Tyr-Asp-Thr-Gln-Ala
Th8,9    Phe-His-Thr-Ser-Gly-Tyr-Asx-Thr-Glx-Ala
C8       Asx-Thr-Glx-Ala-Ile-Val(Glx,Asx,Asx,Glx,Ser,Thr,Glx,Tyr)
Th10     Ile-Val-Glx-Asx-Asx-Glx-Ser-Thr-Glx-Tyr-Gly
C9 & T6  Gly-Leu-Phe
Th11     Leu-Phe-Glx
Th11A    Phe-Gln
C10      Glx-Ile-Ser-Asx
T7       Glx-Ile-Ser-Asx-Lys
Th12     Ile-Ser-Asx-Lys
T6,7     Gly-Leu-Phe-Glx-Ile-Ser-Asx-Lys
C11      Lys-Leu-Trp

Residues 59 to 90
The link between the two segments of CNBr D was provided by C11 (sequence: Lys-Leu-Trp). This N-terminal lysyl residue also occurred in T7, Th12 and T6,7 and the sequence Leu-Trp in T8 and Th13.

The frequency of basic residues in this section of the molecule gave rise to a series of smaller tryptic peptides, which provided us with the entire sequence. Their alignment was ascertained from suitable overlapping chymotryptic and thermolysin peptides.

T8 (Residues 59 to 62). Sequence: Leu-Trp-S-AE-Cys-Lys. The presence of tryptophan was confirmed by the absorbance of the peptide at 280 nm and by complete enzymatic hydrolysis: Leu (1.02), Trp (0.61), S-AE-Cys (0.76), Lys (0.98). The position of this residue in the peptide was ascertained using the subtractive Edman procedure, rather than the dansyl Edman technique. In this case, the disappearance of the tryptophan degradation products was noted after two rounds of Edman degradation.
T9 (Residues 63 to 70). Sequence: Ser-Ser-Glx-Val-Pro-Glx-Ser-Arg. This peptide was recovered in high yield. Evidence for overlap with T8 was provided by the larger fragment Th13.

T10 (Residues 71 to 73). Sequence: Asx-Ile-S-AE-Cys.

T11 (Residues 74 to 77). Sequence: Asx-Ile-Ser-S-AE-Cys. An extended version of this peptide was also isolated (T12) and has the sequence Asx-Ile-Ser-S-AE-Cys-Asx-Lys. A time course incubation with aminopeptidase M at 25 °C yielded the following results: 10 min: Asp (0.47), Ile (0.20), Ser (0.13); 3 h: Asp (1.40), Ile (0.62), Ser (0.46), S-AE-Cys (0.30), Lys (0.18); 18 h: Asp (2.14), Ile (0.98), Ser (0.71), S-AE-Cys (0.78), Lys (0.98). The necessary overlap between T10 and T11 was given by Th15 (Ile-S-AE-Cys-Asx).

T13 (Residues 80 to 90). Sequence: Phe-Leu-Asx-Asx-Asx-Ile-Thr-Asx-Asx-Ile-Hms-lactone. This segment was clearly identified as the C-terminal peptide of CNBr D by the presence of homoserine lactone. The link between T12 and T13 was provided by Th16A, sequence Ile-Ser-S-AE-Cys-Asx-Lys-Phe, and by the large peptides C14 and C15.

The chymotryptic peptides which constitute this segment of the molecule are also shown in Fig. 7. Cleavage occurred after tryptophan-60, but not, surprisingly, after phenylalanine-80. On the other hand, hydrolysis of the serine-69-arginine-70 and arginine-70-asparagine-71 bonds took place in at least a proportion of the molecules. Neither of these latter cleavage points conforms to the strict specificity of the enzyme.
Residues 91 to 123
This final section of the human α-lactalbumin molecule was contained in CNBr E. The evidence for its sequence is summarised in Fig. 8 and was obtained by an analysis of the peptides discussed below.

Fig. 8. Summary of the proof of the sequence of residues 91 to 123 from CNBr E. For details see legend of Fig. 5

Sequence (residues 91 to 123):
Cys-Ala-Lys-Lys-Ile-Leu-Asp-Ile-Lys-Gly-Ile-Asn-Tyr-Trp-Leu-Ala-His-Lys-Ala-Leu-Cys-Thr-Glu-Lys-Leu-Glu-Gln-Trp-Leu-Cys-Glu-Lys-Leu

Peptide     Sequence
T14         Lys
Th18        Ile-Leu-Asx
T14,15      Lys-Ile-Leu-Asx-Ile-Lys
T15         Ile-Leu-Asp-Ile-Lys
C16A        Asx-Ile-Lys-Gly-Ile-Asx-Tyr
C16         Asp-Ile-Lys-Gly-Ile-Asn-Tyr-Trp
Th19        Ile-Lys-Gly
T16         Gly-Ile-Asx-Tyr-Trp-Leu-Ala-His-Lys
Th20        Ile-Asn-Tyr
Th20A       Ile-Asx-Tyr-Trp
C17A        Leu-Ala-His
C17         Leu-Ala-His-Lys
Th21        Ala-His-Lys
Th4,21A     Leu-Ala-His-Lys-Ala
Th21A       Ala-His-Lys-Ala
C18         Ala-Leu
C17,18      Leu-Ala-His-Lys-Ala-Leu
T17         Ala-Leu-Cys-Thr-Glu-Lys
Th22        Leu-Cys-Thr-Glx-Lys
C19         Cys-Thr-Glx-Lys-Leu-Glx-Glx-Trp
Th23        Leu-Glu-Gln-Trp
T18         Leu-Glx-Glx-Trp-Leu-Cys
Th24 & C20  Leu-Cys-Glx-Lys-Leu
T19         Glu-Lys-Leu
T19A        Glu-Lys

No peptide representing residues 91 to 93 was recovered from any of the enzymatic digests of this fragment, despite elution of the Beckman M-72 column used in the separation with 0.2 M sodium hydroxide. However, the three N-terminal residues were sequenced from the intact purified fragment, the information so obtained proving consistent with the amino acid composition data from both the whole protein and the purified fragment. Further evidence for the sequence in this region and the alignment of the CNBr fragments was provided by peptides T17 and T117 isolated from a tryptic digest of the intact S-aminoethylated protein. These peptides were purified by ion-exchange chromatography as before and characterised as follows.
T17. Composition: Asp (5.00), Thr (0.99), Met (0.79), Ile (1.84), Leu (1.04), Phe (1.09), and S-AE-Cys (1.01). N-terminal residue: Phe. Carboxypeptidase B (1 h): S-AE-Cys (0.61); A & B (1 h): Met (0.51), S-AE-Cys (0.93); (2 h): Ile (>0.1), Met (0.72), S-AE-Cys (0.94). Sequence: Phe-(Leu,Asx,Asx,Asx,Ile,Thr,Asx,Asx)Ile-Met-S-AE-Cys.

T117. Composition: Ala (1.00), Lys (1.19). Sequence: Ala-Lys.

T14,15 (Residues 94 to 99). Sequence: Lys-Ile-Leu-Asx-Ile-Lys. Two further cleavage products, lysine (T14) and Ile-Leu-Asx-Ile-Lys (T15), were also isolated. Relying on the specificity of trypsin and the accuracy of the amino acid compositions, it was considered that only two lysines could be accommodated in the region subsequently denoted residues 93 and 94.

T16 (Residues 100 to 108). Sequence: Gly-Ile-Asn-Tyr-Trp-Leu-Ala-His-Lys. The overlap between peptides T14,15 and T16 was provided by C16, C16A and Th19. The presence of a tryptophan residue was indicated by the absorbance at 280 nm and by carboxypeptidase A digestion of Th20A (Ile-Asx-Tyr-Trp): 1 h: Asn (0.1), Tyr (0.4), Trp (0.70); 4 h: Asn (0.24), Tyr (0.61), Trp (0.75).

T17 (Residues 109 to 114). Sequence: Ala-Leu-S-AE-Cys-Thr-Glx-Lys. No evidence was obtained to indicate any cleavage by trypsin of the peptide bond between S-aminoethyl-cysteine-111 and threonine-112. Peptides Th4,21A and C17,18 provided the necessary overlaps for the positioning of this peptide.

T18 (Residues 115 to 120). Sequence: Leu-Glu-Gln-Trp-Leu-S-AE-Cys. In common with the other tryptophan-containing peptides, T18 showed characteristic absorption at 280 nm. The overlapping peptide C19 (sequence S-AE-Cys-Thr-Glx-Lys-Leu-Glx-Glx-Trp) also contained a tryptophanyl residue, a fact which facilitated the alignment of T18 as the penultimate peptide in the protein. The presence of this aromatic amino acid was confirmed by complete digestion of the peptide with aminopeptidase M: Gln (0.94), Glu (1.05), Leu (1.78), Trp (0.76), S-AE-Cys (0.88). Its position was confirmed by a time-course digestion of Th23 with carboxypeptidase A: 1 h: Glu (0.1), Gln (0.3), Trp (0.66); 5 h: Glu (0.28), Gln (0.62), Trp (0.72).

T19 (Residues 121 to 123). Sequence: Glx-Lys-Leu. The C-terminal position of this peptide was suggested by the identity of its C-terminal amino acid, leucine, with that of the intact protein. Further proof was provided by overlapping peptides Th24 and C20, both of sequence Leu-S-AE-Cys-Glx-Lys-Leu.
AMIDE DISTRIBUTION
In order to determine whether the aspartyl and glutamyl residues which appeared on analysis of the acid hydrolysates of peptides actually represented these residues or whether they resulted from deamidation of asparagine and glutamine, respectively, a series of suitable peptides containing wherever possible only one of the residues was subjected to high-voltage electrophoresis at pH 6.5. In certain cases, this evidence was supplemented with amino acid composition data following complete enzymatic hydrolysis of the peptide.

Table 8. The identification of the amidated residues in human α-lactalbumin as deduced from the electrophoretic mobilities of peptides shown

Residue + position            Peptide code                     Peptide sequence                                   Charge  Conclusion
Glx 2                         CNBr E-C1                        Lys-Glx-Phe                                        +1      Gln
Glx 7                         CNBr E-Th2                       Phe-Thr-Lys-Cys-Glx                                +1      Glu
Glx 10                        CNBr E-Th3                       Leu-Ser-Glx                                        0       Gln
Asx 14                        CNBr E-Th4A                      Leu-Lys-Asx                                        0       Asp
Asx 16                        CNBr E-C5                        Leu-Lys-Asx-Ile-Asx-Gly-Tyr                        -1      Asp
Glx 25                        CNBr E-Th6A                      Ile-Ala-Leu-Pro-Glx                                -1      Glu
Asx 37                        CNBr D-Th9                       Ser-Gly-Tyr-Asx-Thr-Glx-Ala a                      -1      Asp
Glx 39                        After 4 Edmans                   Thr-Glx-Ala                                        0       Gln
Glx 43, Asx 44, Asx 45,       CNBr D-Th10, purified on paper   Ile-Val-Glx-Asx-Asx-Glx-Ser-Thr-Glx-Tyr-Gly a      -3      3 Acids, 2 Amid.
  Glx 46, Glx 49
Glx 54                        CNBr D-Th11A                     Phe-Glx                                            0       Gln
Asx 58                        CNBr D-Th12                      Ile-Ser-Asx-Lys                                    +1      Asn
Glx 65, Glx 68                CNBr D-T9                        Ser-Ser-Glx-Val-Pro-Glx-Ser-Arg                    +1      Gln, Gln
Asx 71                        CNBr D-T10                       Asx-Ile-Cys                                        +1      Asn
Asx 74                        CNBr D-T11                       Asx-Ile-Ser-Cys                                    0       Asp
Asx 78                        CNBr D-T12                       Asx-Ile-Ser-Cys-Asx-Lys a                          0       Asp
Asx 82, Asx 83, Asx 84,       CNBr D-Th17A and T13             Phe-Leu-Asx-Asx-Asx-Ile-Thr-Asx-Asx-Ile-Hms.Lac a  0       1 Acid, 4 Amid.
  Asx 87, Asx 88
Asx 97                        CNBr E-T14                       Ile-Leu-Asx-Ile-Cys                                0       Asp
Asx 102                       CNBr E-Th20                      Ile-Asx-Tyr a                                      0       Asn
Glx 113                       CNBr E-T16                       Ala-Leu-Cys-Thr-Glx-Lys                            +1      Glu
Glx 116                       CNBr E-Th23                      Leu-Glx-Glx-Trp a                                  -1      Glu
Glx 117                       After 2 Edmans                   Glx-Trp                                            0       Gln
Glx 121                       CNBr E-T18                       Glx-Lys-Leu                                        0       Glu

a Peptide also subjected to enzymic hydrolysis.
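The electrophoretic argument can be made explicit with a small charge calculation. The Python sketch below is a simplified illustration, not the authors' procedure: it assumes integer charges at pH 6.5, counts lysine, arginine and S-aminoethyl-cysteine as +1 and aspartic and glutamic acid as -1, treats histidine as uncharged, and takes the terminal amino and carboxyl groups to cancel. The function names are hypothetical; the four peptides and their observed charges are taken from Table 8.

```python
BASIC = {"Lys", "Arg", "AECys"}   # AECys = S-aminoethyl-cysteine
ACIDIC = {"Asp", "Glu"}
OPTIONS = {"Glx": ("Glu", "Gln"), "Asx": ("Asp", "Asn")}


def net_charge(residues):
    """Approximate net charge at pH 6.5 under the stated assumptions."""
    basic = sum(r in BASIC for r in residues)
    acidic = sum(r in ACIDIC for r in residues)
    return basic - acidic


def resolve(residues, observed_charge):
    """Decide acid versus amide for a peptide with exactly one Glx or Asx."""
    positions = [i for i, r in enumerate(residues) if r in OPTIONS]
    if len(positions) != 1:
        raise ValueError("expected exactly one ambiguous residue")
    i = positions[0]
    for candidate in OPTIONS[residues[i]]:
        trial = residues[:i] + [candidate] + residues[i + 1:]
        if net_charge(trial) == observed_charge:
            return candidate
    return None


# Peptides and observed charges as listed in Table 8.
C1 = resolve(["Lys", "Glx", "Phe"], +1)
TH2 = resolve(["Phe", "Thr", "Lys", "AECys", "Glx"], +1)
TH3 = resolve(["Leu", "Ser", "Glx"], 0)
TH4A = resolve(["Leu", "Lys", "Asx"], 0)
print(C1, TH2, TH3, TH4A)
```

Peptides with more than one ambiguous residue, such as Th10 and Th17A, cannot be resolved by charge alone, which is why the time-course digestions were needed for them.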
Table 9. Evidence for the positioning of the acid and amide residues in peptides CNBr D Th10 and Th17A
Composition is given for the acid hydrolysate and for the complete enzymic digest. Aminopeptidase M digestion was followed at 0.5 h, 1 h and 4 h, carboxypeptidase A digestion at 1 h, 5 h and 18 h.

Peptide  Amino acid      Acid   Enzymic  Aminopeptidase M   Carboxypeptidase A
Th10     Aspartic acid   1.98   1.00     0.21
         Threonine       1.12   0.85                        0.20, 0.32, 0.51
         Ser, Asn, Gln   1.01   2.70     0.51, 0.62
         Glutamic acid   2.94   1.87     0.36, 0.76, 0.87   0.36, 0.61, 0.76
         Glycine         0.99   1.02                        0.76, 1.00, 1.05
         Valine          0.73   1.04     0.52, 0.80, 0.89
         Isoleucine      1.04   1.14     0.72, 0.82, 0.99
         Tyrosine        0.56   0.66                        0.48, 0.70, 0.75
Th17A    Aspartic acid   4.61   1.00     0.50, 0.70         No cleavage obtained
         Threonine       0.95   0.88
         Ser, Asn, Gln          3.52     0.62, 0.76, 0.91
         Leucine         1.05   1.02     0.92, 1.04, 1.09
         Isoleucine      1.11   1.21
         Phenylalanine   0.91   0.63     0.68, 0.73, 0.80
The results are given in Table 8. Of the 31 possible acid and amide residues, 17 were directly identified by these methods. A further four were determined by high-voltage electrophoresis following Edman degradation. This latter procedure is best illustrated by reference to peptide CNBr E Th23. Its mobility indicated the presence of one acid and one amidated residue. After two rounds of Edman degradation, the electrophoretic mobility of the remaining fragment suggested that a glutaminyl residue had been removed. By inference, glutam(ate/ide)-116 could then be identified as glutamic acid. In a similar way, with CNBr D Th9, aspar(agine/tate)-37 was identified as the acid and glutam(ate/ide)-39 as the amide component of the peptide. This result was confirmed by a time course digestion of the whole peptide with carboxypeptidase A, which indicated the position of the amidated residue in the sequence: 0.5 h: Gln (0.65), Ala (0.81); 3 h: Thr (0.15), Gln (0.84), Ala (1.00); 9 h: Thr (0.35), Gln (1.00), Ala (1.01). Since complications and hence ambiguities may arise using the electrophoretic technique due to modifications of side-chain amino-groups by phenylisothiocyanate, peptides containing lysyl or S-aminoethyl-cysteinyl residues were not subjected to the procedure.
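The way a carboxypeptidase time course is read can be sketched in a few lines of Python. The example is illustrative and rests on a simplifying assumption: residues nearer the C-terminus are released earlier and in higher amount at every time point, and no residue occurs twice among those released. The function name `release_order` is hypothetical; the figures are those quoted above for the carboxypeptidase A digestion of CNBr D Th9.

```python
# Residues released (mol per mol of peptide) at each time in hours,
# as quoted in the text for CNBr D Th9.
TIME_COURSE = {
    0.5: {"Gln": 0.65, "Ala": 0.81},
    3.0: {"Thr": 0.15, "Gln": 0.84, "Ala": 1.00},
    9.0: {"Thr": 0.35, "Gln": 1.00, "Ala": 1.01},
}


def release_order(time_course):
    """Rank residues from first released to last released."""
    times = sorted(time_course)
    residues = {r for point in time_course.values() for r in point}

    def profile(residue):
        # Amount released at each time; a residue not yet detected counts as 0.
        return tuple(time_course[t].get(residue, 0.0) for t in times)

    return sorted(residues, key=profile, reverse=True)


ORDER = release_order(TIME_COURSE)
# The first residue released is the C-terminal one, so the sequence
# is read by reversing the release order.
C_TERMINAL_SEQUENCE = "-".join(reversed(ORDER))
print(ORDER, C_TERMINAL_SEQUENCE)
```

The release of glutamine rather than glutamic acid in the second position from the C-terminus is what places the amide at residue 39.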
The most intransigent residues were located in two groups situated in peptides CNBr D Th10 and CNBr D T13. The data from both high-voltage electrophoresis and enzymatic hydrolyses indicated that the former contained two amides (one asparagine and one glutamine) and three acids (two glutamic and one aspartic) and the latter a single aspartyl and four asparaginyl residues. Since both peptides contained more than one residue which could account for either the acid or amide content, the above information did not allow us to identify unambiguously the location of these residues in the two peptides. This ambiguity was resolved by time-course incubation with aminopeptidase M and carboxypeptidase A. The results are illustrated in Table 9.
They indicate that the glutamyl residues present in peptide Th10 occupy positions 43 and 49 in the molecule and the asparagine position 44. The composition of a complete enzymic digest of this peptide, together with its sequence, allowed us to assign the aspartyl residue to position 45, leaving the glutamine for position 46.
In peptide Th17A (identical with T13), the only aspartyl residue present was liberated after the first asparagine, on a time-course incubation with aminopeptidase M. Consequently, asparagine moieties must occupy positions 82, 84 and 88 (see Table 9).
DISCUSSION
The results obtained with the purification procedure adopted for human α-lactalbumin confirm the absence of β-lactoglobulin in the milk of this species [16], a trait in common with the guinea pig [17] and camel [18].
The rationale behind our approach to the sequence determination of human α-lactalbumin was firstly to cleave the polypeptide chain into large fragments with cyanogen bromide and subsequently to treat these isolated sections individually with respect to sequence analysis. Since the N and C-terminal CNBr fragments could not be conveniently separated in quantities sufficient to permit this approach, they were subjected to enzymic digestion as a mixture. However, even this partial fractionation resulted in the production of peptide mixtures far less complex than would be obtained from the whole protein. The advantages in this approach were two-fold: firstly, it facilitated the isolation of these peptides, most of which were obtained pure after fractionation by ion-exchange chromatography. Secondly, by locating peptides in defined sections of the polypeptide chain, the problem of alignment was greatly simplified.
Amino acid analysis of human α-lactalbumin revealed the presence of two methionines in the protein. Subsequent cleavage with cyanogen bromide gave rise to the expected three peptide fragments whose relative positions in the polypeptide chain could be uniquely determined. Obviously, with more than three fragments, the procedure detailed below would not yield an unambiguous result and recourse would have to be made to overlapping peptides containing the relevant methionyl residues.
The amino acid composition of one fragment isolated from the CNBr E peak indicated the absence of homoserine lactone, the degradation product resulting from the action of cyanogen bromide on methionine [21]. In contrast, both of the other CNBr fragments contained this amino acid derivative. This 33-residue section must, therefore, be derived from the C-terminal end of the intact polypeptide. As the second component present in CNBr E possessed an identical N-terminal tripeptide sequence to that of the whole protein, it was considered to constitute the first 30 residues from the NH2 terminus of the protein. The material in the CNBr D pool must, therefore, comprise the central portion of the polypeptide chain, stretching from residues 31 to 90.
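The ordering argument can be checked against the final sequence with a short Python sketch. It is an illustration only: the one-letter sequence is transcribed from the sequence lines of Figs 5, 6 and 8 and row A of Fig. 9, the function name `cnbr_fragments` is hypothetical, and cleavage is assumed to be complete on the C-terminal side of every methionine.

```python
# Human alpha-lactalbumin, one-letter code, transcribed from Figs 5, 6, 8 and 9.
SEQ = (
    "KQFTKCELSQLLKDIDGYGGIALPELICTM"      # residues 1-30
    "FHTSGYDTQAIVENDQSTEYGLFQISNK"        # residues 31-58
    "LWCKSSQVPQSRNICDISCDKFLNDNITNNIM"    # residues 59-90
    "CAKKILDIKGINYWLAHKALCTEKLEQWLCEKL"   # residues 91-123
)


def cnbr_fragments(seq):
    """Split after every Met; return (first, last, peptide), 1-based positions."""
    fragments, start = [], 0
    for i, aa in enumerate(seq):
        if aa == "M":
            fragments.append((start + 1, i + 1, seq[start:i + 1]))
            start = i + 1
    if start < len(seq):
        fragments.append((start + 1, len(seq), seq[start:]))
    return fragments


FRAGMENTS = cnbr_fragments(SEQ)
for first, last, peptide in FRAGMENTS:
    # A fragment ending in Met would carry homoserine lactone after cleavage.
    marker = "homoserine lactone" if peptide.endswith("M") else "no homoserine"
    print(first, last, len(peptide), marker)
```

Only the fragment without a C-terminal methionine lacks homoserine lactone, which is the property used in the text to identify the 33-residue piece as the C-terminal fragment.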
The sum of the amino acid compositions of the three CNBr fragments is in very close agreement with the composition obtained for the whole protein. In the same way, peptides isolated from the tryptic, chymotryptic and thermolysin digests of these pieces accounted in each case (with the exception of a tripeptide in CNBr E) for the total number of residues contained in the CNBr fragment. The missing tripeptide mentioned above was not recovered even after treatment of the Beckman M-72 column with 0.2 M sodium hydroxide. It can only be supposed that the high proportion of basic amino acids relative to the overall size of the peptide caused it to be very strongly bound to the column resin.
Although the peptide bonds hydrolysed to give these peptides did in general conform to the published specificity of the proteases, certain unusual cleavage points were observed, some of which have already been mentioned. The most obvious was the cleavage after tyrosine-50 on digestion of CNBr D with trypsin. It is not clear whether hydrolysis at this position is a consequence of contamination of the enzyme preparation with a small amount of chymotrypsin or whether it represents an inherent low level of chymotryptic-type activity in trypsin itself. Certainly, the lack of further chymotryptic cleavage products suggests that some structural feature confers the property of high susceptibility to proteolysis on this Tyr-Gly bond.
To obtain maximum homology between human α-lactalbumin, bovine α-lactalbumin [6], human leukaemic lysozyme [5] and hen egg-white lysozyme [7] the primary structures of the four proteins are aligned as indicated in Fig. 9. (Upper numbering on the basis of α-lactalbumin structure, lower on lysozyme). Those deletions which were previously postulated in the bovine sequence for maximum homology with hen egg-white lysozyme seem justified on the basis of the similarity between the two α-lactalbumins and the alignment of human α-lactalbumin and human lysozyme. Moreover, no new gaps need be inserted into the human α-lactalbumin sequence to augment its resemblance to human leukaemic lysozyme. However, to take account of the extra amino acid (glycine) at position 48 of human leukaemic lysozyme, the gap in this region of the human α-lactalbumin primary structure must be widened by a further residue.
Based on these alignments, it is possible to tabulate (Table 10) the numbers of positions at which human α-lactalbumin possesses amino acids identical, similar or different to bovine α-lactalbumin and human leukaemic lysozyme. A comparison between the two α-lactalbumins reveals that 72% of the residues are identical, 6% are chemically similar and 22% different. These values indicate greater differences than have been found for the α-chains of haemoglobin (88%, 2% and 10%) [22] and cytochromes c (89%, 3%, 8%) [22] from the same species, but correspond roughly to the variations seen in the structures of the bovine and rat ribonucleases (68%, 6%, 26%) [22]. This quite rapid rate of evolutionary change is consistent with the proposed relatively late emergence of α-lactalbumin activity [1].
In the comparison between human α-lactalbumin and leukaemic lysozyme, 63 (51%) of the positions in the protein contain identical or closely related amino acids, a value very similar to that of 60 (49%) for the bovine α-lactalbumin and hen egg-white lysozyme comparison. Included in Table 10 is the same comparison quantified in terms of the minimal base change in the amino acid codons. This analysis supports the inference of homology already deduced from the similarities between bovine α-lactalbumin and hen egg-white lysozyme and also confirms the approximate equality in total amino acid replacements when the sequence of human α-lactalbumin is compared with hen egg-white lysozyme and with human leukaemic lysozyme.
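The percentages quoted in the two preceding paragraphs follow directly from the residue counts in Table 10. The Python sketch below simply redoes that arithmetic; the dictionary layout and function name are illustrative choices, and the total of 123 is the number of residues in human α-lactalbumin, which is also the sum of each row of counts.

```python
TOTAL_POSITIONS = 123  # residues in human alpha-lactalbumin

# Residue counts from Table 10.
COUNTS = {
    "bovine a-LA / human a-LA": {"identical": 89, "similar": 7, "different": 27},
    "human a-LA / human leukaemic lysozyme": {"identical": 48, "similar": 15, "different": 60},
}


def percentages(counts, total=TOTAL_POSITIONS):
    """Convert residue counts to whole-number percentages of all positions."""
    return {label: round(100 * n / total) for label, n in counts.items()}


PERCENT = {pair: percentages(c) for pair, c in COUNTS.items()}

# Identical plus similar positions in the lysozyme comparison.
LYSOZYME = COUNTS["human a-LA / human leukaemic lysozyme"]
RELATED = LYSOZYME["identical"] + LYSOZYME["similar"]
RELATED_PERCENT = round(100 * RELATED / TOTAL_POSITIONS)

for pair, values in PERCENT.items():
    print(pair, values)
print(RELATED, RELATED_PERCENT)
```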
   1 10
A  Lys-Gln-Phe-Thr-Lys-Cys-Glu-Leu-Ser-Gln-Leu-
B  Glu-Gln-Leu-Thr-Lys-Cys-Glu-Val-Phe-Arg-Glu-
C  Lys-Val-Phe-Glu-Arg-Cys-Glu-Leu-Ala-Arg-Thr-
D  Lys-Val-Phe-Gly-Arg-Cys-Glu-Leu-Ala-Ala-Ala-
   1 10

   20
A  Leu-Lys- -Asp-Ile-Asp-Gly-Tyr-Gly-Gly-Ile-Ala-
B  Leu-Lys- -Asp-Leu-Lys-Gly-Tyr-Gly-Gly-Val-Ser-
C  Leu-Lys-Arg-Leu-Gly-Met-Asp-Gly-Tyr-Arg-Gly-Ile-Ser-
D  Met-Lys-Arg-His-Gly-Leu-Asp-Asn-Tyr-Arg-Gly-Tyr-Ser-
   20 30

A  Leu-Pro-Glu-Leu-Ile-Cys-Thr-Met-Phe-His-Thr-Ser-Gly-
B  Leu-Pro-Glu-Trp-Val-Cys-Thr-Thr-Phe-His-Thr-Ser-Gly-
C  Leu-Ala-Asn-Trp-Met-Cys-Leu-Ala-Lys-Trp-Glu-Ser-Gly-
D  Leu-Gly-Asn-Trp-Val-Cys-Ala-Ala-Lys-Phe-Glu-Ser-Asn-
   30

   40
A  Tyr-Asp-Thr-Gln-Ala-Ile-Val-Glu-Asn- -Asp-Gln-
B  Tyr-Asp-Thr-Glu-Ala-Ile-Val-Glu-Asn- -Asn-Gln-
C  Tyr-Asn-Thr-Arg-Ala-Thr-Asn-Tyr-Asn-Ala-Gly-Asp-Arg-
D  Phe-Asn-Thr-Gln-Ala-Thr-Asn-Arg-Asn-Tyr- -Asp-Gly-
   40 50

   50
A  Ser-Thr-Glu-Tyr-Gly-Leu-Phe-Gln-Ile-Ser-Asn-Lys-Leu-
B  Ser-Thr-Asp-Tyr-Gly-Leu-Phe-Gln-Ile-Asn-Asn-Lys-Ile-
C  Ser-Thr-Asp-Tyr-Gly-Ile-Phe-Gln-Ile-Asn-Ser-Arg-Tyr-
D  Ser-Thr-Asp-Tyr-Gly-Ile-Leu-Gln-Ile-Asn-Ser-Arg-Trp-

   60
A  Trp-Cys-Lys-Ser-Ser-Gln-Val-Pro-Gln-Ser-Arg-Asn-
B  Trp-Cys-Lys-Asn-Asp-Gln-Asp-Pro-His-Ser-Ser-Asn-
C  Trp-Cys-Asn-Asp-Gly-Lys-Thr-Pro-Gly-Ala-Val-Asn-
D  Trp-Cys-Asn-Asp-Gly-Arg-Thr-Pro-Gly-Ser-Arg-Asn-
   60
   70

A  Ile-Cys-Asp-Ile-Ser-Cys-Asp-Lys-Phe-Leu-Asn-Asp-Asn-
B  Ile-Cys-Asn-Ile-Ser-Cys-Asp-Lys-Phe-Leu-Asn-Asn-Asp-
C  Ala-Cys-His-Leu-Ser-Cys-Ser-Ala-Leu-Leu-Gln-Asp-Asn-
D  Leu-Cys-Asn-Ile-Pro-Cys-Ser-Ala-Leu-Leu-Ser-Ser-Asp-
   80
   80

   90
A  Ile-Thr-Asn-Asn-Ile-Met-Cys-Ala-Lys-Lys-Ile-Leu- -
B  Leu-Thr-Asn-Asn-Ile-Met-Cys-Val-Lys-Lys-Ile-Leu- -
C  Ile-Ala-Asp-Ala-Val-Ala-Cys-Ala-Lys-Arg-Val-Arg- -
D  Ile-Thr-Ala-Ser-Val-Asn-Cys-Ala-Lys-Lys-Ile-Val-Ser-
   90 100

A  Asp-Ile-Lys-Gly-Ile-Asn-Tyr-Trp-Leu-Ala-His-Lys-Ala-
B  Asp-Lys-Val-Gly-Ile-Asn-Tyr-Trp-Leu-Ala-His-Lys-Ala-
C  Asp-Pro-Gln-Gly-Ile-Arg-Ala-Trp-Val-Ala-Trp-Arg-Asn-
D  Asn-Gly-Asp-Gly-Met-Asn-Ala-Trp-Val-Ala-Trp-Arg-Asn-

   110
A  Leu-Cys-Thr-Glu-Lys-Leu-Glu-Gln-Trp-Leu-
B  Leu-Cys-Ser-Glu-Lys-Leu-Asp-Gln-Trp-Leu-
C  Arg-Cys-Gln-Asn-Arg-Asp-Val-Arg-Gln-Tyr-Val-Gln-
D  Arg-Cys-Lys-Gly-Thr-Asp-Val-Gln-Ala-Trp-Ile-Arg-
   100
   110

   120 120
A  -Cys-Glu-Lys-Leu
B  -Cys-Glu-Lys-Leu
C  Gly-Cys- -Gly-Val
D  Gly-Cys- -Arg-Leu
   130

Fig. 9. Comparison of the amino-acid sequences of human (A) and bovine (B) α-lactalbumins and human leukaemic (C) and hen egg-white (D) lysozymes. Differences in amino acid residues are indicated in bold-face type
Table 10. Comparison between the sequences of human α-lactalbumin and those of bovine α-lactalbumin and human lysozyme
Similar amino-acids are taken as serine/threonine, leucine/isoleucine, alanine/valine/glycine, arginine/lysine, aspartic acid/glutamic acid and asparagine/glutamine

                                                     Residues                           Codons differing by
Proteins compared                                    identical  similar   different     1 base    2 bases
Bovine α-lactalbumin/human α-lactalbumin             89 (72%)   7 (6%)    27 (22%)      31 (25%)  3
Human α-lactalbumin/human leukaemic lysozyme         48 (39%)   15 (12%)  60 (49%)      49 (41%)  26 (20%)
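The three categories used in Table 10 can be expressed as a small classification rule. The Python sketch below is an illustration under stated assumptions: the similarity groups are those listed in the legend of Table 10, the function name `classify` is hypothetical, and the example compares only the first eleven aligned residues of the human (A) and bovine (B) α-lactalbumin sequences as transcribed from Fig. 9, so its tally is not the whole-protein result.

```python
from collections import Counter

# Similarity groups from the legend of Table 10.
SIMILAR_GROUPS = [
    {"Ser", "Thr"},
    {"Leu", "Ile"},
    {"Ala", "Val", "Gly"},
    {"Arg", "Lys"},
    {"Asp", "Glu"},
    {"Asn", "Gln"},
]


def classify(a, b):
    """Return 'identical', 'similar' or 'different' for one aligned position."""
    if a == b:
        return "identical"
    if any(a in group and b in group for group in SIMILAR_GROUPS):
        return "similar"
    return "different"


# Residues 1 to 11 of rows A (human) and B (bovine) of Fig. 9.
HUMAN = "Lys-Gln-Phe-Thr-Lys-Cys-Glu-Leu-Ser-Gln-Leu".split("-")
BOVINE = "Glu-Gln-Leu-Thr-Lys-Cys-Glu-Val-Phe-Arg-Glu".split("-")

TALLY = Counter(classify(h, b) for h, b in zip(HUMAN, BOVINE))
print(dict(TALLY))
```

Note that leucine/valine counts as different under these groups, since valine is grouped with alanine and glycine rather than with leucine and isoleucine.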
Inspection of all the sequences reveals the structurally conservative nature of a great proportion of these substitutions. Both hydrophobic and hydrophilic residues are in general replaced by amino acids of similar chemical character. An analysis of the sequences of the two human proteins based on the helix-forming or helix-breaking nature of the amino acid [23] also supports this conclusion. At only 12 positions, most of them occurring in clusters or regions of low helical content, do replacements of differing character take place. One possibly instructive variation from this general trend occurs in the short segment from position 100 to 104 (residues numbered according to the lysozyme sequence, Fig. 9). At position 100 human leukaemic lysozyme possesses an arginyl residue in comparison with the leucyl residue of α-lactalbumin. At position 103, the respective amino acids are helix-initiating or helix-breaking proline and isoleucine whilst position 104 contains two residues of opposite character, glutamine in human leukaemic lysozyme and lysine in human α-lactalbumin. Inspection of the three dimensional structure of hen egg-white lysozyme [24] indicates that this region of the molecule is flexible enough to accommodate residues of markedly different character without causing significant alterations in the surrounding configuration. It therefore provides an interesting test as regards a possible conformational homology between the two proteins.
One further interesting feature of human α-lactalbumin is the replacement of the tryptophanyl residue at position 28 of the sequence by leucine. Kronman [25] believes that this aromatic chromophore is buried in the molecule but Barman [26] considers it to be exposed. On the basis of the hydrophobic nature of the substituted residue in human α-lactalbumin, however, the former hypothesis appears the more probable. Whichever postulate is correct, the substitution of this amino acid apparently does not deleteriously affect the participation of the protein in the biosynthesis of lactose. Consequently the direct involvement of the tryptophanyl residue in the biochemical reactions of α-lactalbumin can be discounted.
A three-dimensional structure has been proposed for bovine α-lactalbumin based on the hypothesis that the sequence homology with hen egg-white lysozyme is reflected by a similarity in their conformation [27]. Although this postulate has been questioned by some authors [28,29] it is supported by precedent (e.g. with the serine proteases and in the haemoglobin-myoglobin system) [30] and by a considerable amount of physico-chemical evidence (see Discussion of [31]). Consequently, it is interesting to note that the sequence of human α-lactalbumin apparently supports a conformational similarity with hen egg-white lysozyme [7] and human leukaemic lysozyme [5]. For example, if one considers the hydrophobic core surrounding residue 28, which in human α-lactalbumin is a leucyl residue replacing the tryptophan of bovine α-lactalbumin and hen egg-white lysozyme, other compensatory changes in the surrounding residues (17, 20, 56, 99, 106, 109) result in a constancy in the summed molar volumes. This conservative feature is therefore consistent with a retention of the overall conformation of the molecule.
Similarly, many of the residues involved in hydrogen bonding in hen egg-white lysozyme are retained in human α-lactalbumin. Where substitutions have occurred, the potential for bonding is maintained by the nature of the substituted amino acid, e.g. asparagine-27 is replaced by glutamate and serine-85 by asparagine. It is noteworthy that of the 33 residues common to the two α-lactalbumins and two lysozymes, 11 are in some way involved in hydrogen bonding.
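A count of residues common to several aligned sequences, such as the 33 mentioned above, can be obtained along the following lines. This is only a sketch under stated assumptions: the sequences are taken to be already aligned to equal length with "-" marking gaps, and the function name and the four short fragments are hypothetical, not the real α-lactalbumin or lysozyme sequences.

```python
def invariant_positions(aligned):
    """Return 1-based positions at which every aligned sequence has the same residue."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return [
        i
        for i, column in enumerate(zip(*aligned), start=1)
        # A column counts only if all residues agree and it is not a gap column.
        if len(set(column)) == 1 and column[0] != "-"
    ]


# Hypothetical aligned fragments; "-" marks a gap.
fragments = ["KCELSQ", "KCEFAQ", "KCELA-", "RCELAQ"]
common = invariant_positions(fragments)
print(common)
```

The resulting positions could then be checked against a list of residues known to take part in hydrogen bonding in the lysozyme structure.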
Another region of conformational interest in human α-lactalbumin is to be found near the amino terminal end of the molecule. In bovine α-lactalbumin the substitution of glutamate-1 for the N-terminal lysine of hen egg-white lysozyme removes the possibility of salt-bridge formation between the side chain of this residue and that of glutamate-7. However, the retention of a lysine in this position in human α-lactalbumin, and the greater similarity of the residues surrounding phenylalanine-3 to those of the egg-white lysozyme (as compared with bovine α-lactalbumin), suggest that the structure of this region in human α-lactalbumin may be more similar to that of hen egg-white lysozyme than is that of the bovine protein. Other salt bridges, between lysine-13 and the C-terminal carboxyl group and between lysine-97 and aspartate-102, also appear to have remained unchanged. In addition, several potential charge pairs, the disruption of which produces changes likened by Kronman [31] to those induced by acid denaturation, are still present in the molecule. However, only the elucidation of the three-dimensional structure of α-lactalbumin will clarify the relevance of the conservation of these potential interactions to the conformation of the protein.
The possible effects of the amino acid substitutions in bovine α-lactalbumin (when compared with hen egg-white lysozyme) on the conformation of the active-site cleft region of lysozyme have been previously discussed in detail [27]. Many of these features have been retained in the equivalent positions of human α-lactalbumin. For example, the replacement of alanine-108 by a tyrosyl residue, and with it the introduction of the potential for blocking off a section of the cleft region, is still in evidence, as is the absence of tryptophan-63 and the substitution of the catalytically functional glutamate-35 by a threonyl residue. This modification of the active site of lysozyme is extended in human α-lactalbumin by the replacement of the catalytically important aspartate-53 (still conserved in bovine α-lactalbumin) by glutamic acid. Although the side-chain carboxyl of this latter residue could still function in a manner similar to that of the corresponding aspartic acid of lysozyme, the freedom of variability in this position indicates a less critical role for the residue in the biological activity of α-lactalbumin. These modifications are, therefore, consistent with the loss of lysozyme activity and a reduction in the dimensions of the substrate molecules, both of which are characteristic of the α-lactalbumins. It seems possible that the role of α-lactalbumin in the lactose synthetase system [2,3] may lie more in the provision of parts of the binding site for monosaccharides than directly in the synthesis of the β 1-4 glycosidic linkage.
One unusual feature of the α-lactalbumin-lysozyme group is the lack of immunological cross-reactivity not only between α-lactalbumins and lysozymes [32] but also between many α-lactalbumins from different species [33]. This phenomenon can perhaps be related to the significant differences found in certain parts of their respective sequences. For example, one major antigenic site of hen egg-white lysozyme [34] is the disulphide loop region (residues 63 to 82), which is exposed on the surface both in lysozyme and in the bovine α-lactalbumin model. It is readily noticeable that at least part of this section (residues 66 to 73) is a region of high mutability in the primary structures of the α-lactalbumins. Assuming that differences in sequence as well as conformation are capable of generating variations in antigenic response, the correspondence of the principal antigenic site with a region of great variability distant from the active site could account for this observed lack of cross-reactivity. Clearly, should this suggestion be justified by current studies, the hypothesis of major conformational differences between the two proteins, which has been advanced by some workers to account for this property, is both unnecessary and improbable.
We wish to thank the Agricultural Research Council for the research grant in support of this work, which also included a research assistantship for one of us (J.B.C.F.).
REFERENCES
1. Brew, K., Vanaman, T. C. & Hill, R. L. (1967) J. Biol. Chem. 242, 3747.
2. Brodbeck, U. & Ebner, K. E. (1966) J. Biol. Chem. 241, 762.
3. Brew, K., Vanaman, T. C. & Hill, R. L. (1968) Proc. Nat. Acad. Sci. U. S. A. 59, 491.
4. Brew, K. (1969) Nature (London) 223, 671.
5. Canfield, R. E., Kammerman, S., Sobel, J. M. & Morgan, F. J. (1971) Nature New Biol. 232, 16.
6. Brew, K., Castellino, F. J., Vanaman, T. C. & Hill, R. L. (1970) J. Biol. Chem. 245, 4570.
7a. Canfield, R. E. & Liu, A. K. (1965) J. Biol. Chem. 240, 1997.
7b. Jollès, P. (1967) Proc. Roy. Soc. London, B. Biol. Sci. 167, 350.
8. Brew, K. & Hill, R. L. (1970) J. Biol. Chem. 245, 4559.
9. Davis, B. S. (1965) Ann. N. Y. Acad. Sci. 121, 404.
10. Hirs, C. H. W. (1956) J. Biol. Chem. 219, 611.
12. Gray, W. R. (1967) Methods Enzymol. 11, 469.
13. Woods, K. R. & Wang, K. T. (1967) Biochim. Biophys. Acta, 133, 369.
14. Hartley, B. S. (1970) Biochem. J. 119, 805.
15. Hirs, C. H. W. (1967) Methods Enzymol. 11, 328.
16. Gray, W. R. (1967) Methods Enzymol. 11, 409.
17. Offord, R. E. (1966) Nature (London) 211, 591.
18. Johansson, B. (1958) Nature (London) 181, 996.
19. Brew, K. & Campbell, P. N. (1967) Biochem. J. 102, 258.
20. Kessler, E. & Brew, K. (1970) Biochim. Biophys. Acta, 200, 449.
21. Steers, E. Jr., Craven, G. R., Anfinsen, C. B. & Bethune, J. L. (1965) J. Biol. Chem. 240, 2478.
22. Dayhoff, M. O. & Eck, R. V., eds (1968) in Atlas of Protein Sequence and Structure.
23. Lewis, P. N. & Scheraga, H. A. (1971) Arch. Biochem. Biophys. 144, 584.
24. Blake, C. C. F., Mair, G. A., North, A. C. T., Phillips, D. C. & Sarma, V. R. (1967) Proc. Roy. Soc. London, B. Biol. Sci. 167, 365.
25. Kronman, M. J., Holmes, L. G. & Robbins, F. M. (1971) J. Biol. Chem. 246, 1909.
26. Barman, T. E. (1970) J. Mol. Biol. 52, 391.
27. Browne, W. J., North, A. C. T., Phillips, D. C., Brew, K., Vanaman, T. C. & Hill, R. L. (1969) J. Mol. Biol. 42, 65.
28. Krigbaum, W. R. & Kugler, F. R. (1970) Biochemistry, 9, 1216.
29. Habeeb, A. F. S. A. & Atassi, M. Z. (1971) Biochim. Biophys. Acta, 236, 131.
30. Perutz, M. F., Muirhead, M., Cox, J. M. & Goaman, L. C. G. (1968) Nature (London) 219, 131.
31. Kronman, M. J., Holmes, L. G. & Robbins, F. M. (1971) J. Biol. Chem. 246, 1909.
32. Atassi, M. Z., Habeeb, A. F. S. A. & Rydstedt, L. (1970) Biochim. Biophys. Acta, 200, 184.
33. Tanahashi, N., Brodbeck, U. & Ebner, K. E. (1968) Biochim. Biophys. Acta, 154, 247.
34. Arnon, R. & Sela, M. (1969) Proc. Nat. Acad. Sci. U. S. A. 62, 163.
J. B. C. Findlay and K. Brew, University Department of Biochemistry, 9 Hyde Terrace, Leeds, LS2 9LS, Great Britain