
Cytokeratin expression in simple epithelia: II. cDNA cloning and sequence characteristics of bovine cytokeratin A (no. 8)


Differentiation (1986) 30:254-264

Differentiation © Springer-Verlag 1986

Thomas M. Magin¹, José L. Jorcano¹,², and Werner W. Franke¹,*

¹ Division of Membrane Biology and Biochemistry, Institute of Cell and Tumor Biology, German Cancer Research Center, Im Neuenheimer Feld 280, D-6900 Heidelberg, Federal Republic of Germany
² Center of Molecular Biology, University of Heidelberg, D-6900 Heidelberg, Federal Republic of Germany

Abstract. Cytokeratin A (no. 8) is a cytoskeletal protein (Mr, approximately 53,000 in bovine cells) which is typical of all simple epithelia, is widespread in all cultured epithelial cells, and, together with its partner cytokeratin D, is the first cytokeratin expressed during embryogenesis (synonyms for this protein are Endo A and TROMA-1 antigen). We isolated a clone (pKB8') from a pUC8 cDNA library prepared from poly(A)+-RNA of bovine bladder urothelium which contains the 3' nontranslated portion and the sequence coding for the carboxyterminal tail and almost the whole of the α-helical rod (369 amino acids). Northern-blot analysis showed that the mRNA coding for this cytokeratin is specifically synthesized in various epithelial tissues and in epithelial cell culture lines. The amino acid sequence of this cytokeratin, when compared with the sequences of other intermediate filament (IF) proteins, exhibits a high and specific homology with other cytokeratins of the basic (type II) subfamily; this homology is, however, restricted to the rod portion. The tail region, which is rich in hydroxyamino acids (~35%), is unique among the type-II cytokeratins in that it does not exhibit subdivision in three domains, specifically lacking the glycine-rich middle domain. Sequence comparison with a partial sequence of the corresponding cytokeratin of the amphibian species, Xenopus laevis, indicated high evolutionary conservation. The high sequence homology of bovine cytokeratin A with published sequences of human tissue polypeptide antigen (TPA), a soluble serum component used as tumor marker in clinical oncology, supports the view that TPA is a proteolytically solubilized fragment containing the rod portion of human cytokeratin no. 8. Our analysis of clone pKB8' made possible the first comparison of a simple epithelial cytokeratin with epidermal keratins and other IF proteins. This showed that, in some important molecular features, cytokeratin A (no. 8) differs drastically from the epidermal members of the same cytokeratin subfamily, probably reflecting different cellular functions of the tail region in stratified and simple epithelia.

Introduction

Of the five major types of the large multigene family of intermediate-sized filament (IF) proteins, i.e., cytokeratins, vimentin, desmin, glial filament protein, and neurofilament proteins (for a review, see [48]), the cytokeratins are the most complex class. Approximately 20 cytokeratin polypeptides have been identified in the various epithelia of species as diverse as humans, cows and rodents [4, 12, 16, 47, 58]. These cytokeratins can be grouped into two subfamilies, i.e., the acidic (type I) cytokeratins and the more basic (type II) cytokeratins [4, 5, 19, 24, 58, 65, 66]. While the nonkeratinous IF polypeptides can form homotypic tetrameric coiled-coil α-helical subunits and IFs [21, 28, 71], two polypeptide chains of either cytokeratin subfamily are required for the formation of the heterotypic tetramer complexes constituting cytokeratin IFs [17, 25, 52, 70, 73]. The reason for this requirement of heterotypic combinations of acidic and basic cytokeratins is not yet understood [61].

* To whom offprint requests should be sent

The IFs of different types of epithelia and epithelial tumors are made up of different combinations of type-I and -II cytokeratins (e.g. [12, 16, 47, 65, 66, 74]). While stratified epithelia are usually characterized by a relatively complex pattern of cytokeratin polypeptides, including a subset of type-II cytokeratins with isoelectric values of above pH 7.0 (e.g., nos. 1-6 of the human catalog [47]), most simple (i.e., one-layered) epithelia produce IFs with a rather simple cytokeratin composition. For example, hepatocytes and hepatocyte-derived tumor cells express only one cytokeratin polypeptide of either subfamily [10, 11, 16, 47].

Recently, cDNA and genomic clones of a number of cytokeratin polypeptides from various species have been described, yielding important amino-acid-sequence information from which some general principles of cytokeratin organization and conformation have been deduced [23, 24, 26, 27, 31-33, 35, 37, 43, 54, 60-63, 67]. However, in biological terms, the available information is rather limited, because it is restricted to cytokeratins of one type of stratified epithelium, i.e., epidermis.

In the present study, we report the first sequence of a basic cytokeratin that is typical of simple epithelia, i.e., bovine cytokeratin A (Mr, 53,000), which is the polypeptide corresponding to human cytokeratin no. 8 (Mr, 52,500) and cytokeratin A of rodents (Mr, 55,000). This protein has a very wide distribution. It was first noted as being a component of cytokeratin IFs of human HeLa cells ([9]; 'component 2' of [12]), as well as being present in intestinal cells of various species [13], and in hepatocytes and hepatoma cells of humans, cows, rats, mice, and monkeys [7, 8, 10, 11, 16, 47, 58, 66]. It has also been described in human mesothelial [74] and bovine kidney [15] cells and in the transitional epithelium (urothelium) of the bladder and ureter of mice, cows, and humans [1, 41, 58, 64]. In fact, comparisons of microdissected tissue samples have shown this cytokeratin to be a major cytoskeletal protein in all simple epithelia and their carcinomas, as well as in most epithelial cell culture lines [47]. In addition, this cytokeratin is of particular interest because it is the only basic (type II) cytokeratin expressed in the epithelia of pre- and postimplantation mouse embryos, in which it was originally designated as 'cytokeratin Y' ([29, 30]; for identification, see [14, 16]; other synonyms of this protein are Endo A [49] and TROMA-1 antigen [2]). A corresponding cytokeratin has been described in the oocytes and early embryonal stages as well as in the differentiated intestinal cells of the amphibian species, Xenopus laevis [18].

In the present study, we describe a cDNA clone, synthesized from mRNA of bovine bladder urothelium, which codes for this protein. The amino acid sequence deduced therefrom exhibits the typical features of a basic (type II) cytokeratin, but differs in several respects from all known epidermal polypeptides of this subfamily, most notably with regard to the absence of a glycine-rich segment in the carboxyterminal tail region. We also show that the sequence of this protein is very similar to the published sequences of human 'tissue polypeptide antigen' (TPA), a tumor marker that is widely used in clinical diagnoses.

Methods

Tissues and cells

RNA was extracted from tissues (bladder, liver, snout, esophagus, heart, and retina) collected from cows within 1 h of slaughter. Cultured bovine cells of the kidney epithelial line MDBK and the mammary-gland epithelial line BMGE-H, as well as bovine fibroblasts of the line B1, were grown until almost confluent, as described elsewhere [15, 59].

Isolation of bladder urothelium

For the preparation of bladder urothelium, the bladders were opened by longitudinal sectioning, turned inside out, and then fixed on a piece of wood. The urothelial surface was rinsed with ice-cold phosphate-buffered saline (PBS), and the epithelial cells were gently scraped off using a scalpel, with care being taken to avoid including material from the underlying connective tissue. The cells were placed directly into guanidine buffer in a Dounce homogenizer and used for isolation of RNA.

Isolation and in vitro translation of RNA

Total cellular RNA from the various bovine tissues and cultured cells was prepared essentially as described elsewhere [34]. In some cases, guanidinium-thiocyanate was used instead of guanidinium-HCl for the first precipitation step (cf. [3]). The preparation of poly(A)+-RNA and the in vitro translation of total and poly(A)+-RNA were performed as previously described [33, 41].

DNA cloning procedures

Double-stranded cDNA was synthesized, starting with 6 µg urothelial poly(A)+-RNA, essentially according to the protocol of Maniatis et al. [42]. After nuclease S1 treatment, 500 ng cDNA was tailed with oligo-dC (average tail length, 15 nucleotides) using terminal transferase, according to the method of Deng and Wu [6]. The tailed cDNA was size fractionated on a Sepharose-2B column equilibrated with 10 mM Tris-HCl (pH 8.0), 1 mM ethylenediaminetetraacetate (EDTA), and 0.1 M NaCl.

cDNA fractions larger than 500 base pairs (bp) were pooled and mixed with PstI-digested, oligo-dG-tailed pUC8 ([69]; average dG-tail length, 15 nucleotides). Annealing was performed as described by Maniatis et al. [42] using 100 ng cDNA and 600 ng oligo-dG-tailed plasmid, and this mixture was then used to transform competent Escherichia coli CK600 cells (DMSO treated, essentially following [22]). This ratio of plasmid to cDNA was found to give the highest number of recombinants (500-800 per nanogram of cDNA; transformation efficiency of CK600 cells with pUC8 ≈ 2-4 × 10^7 colonies/µg DNA).
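As a rough plausibility check, the recombinant yield quoted above can be scaled to the 100 ng of cDNA used in the annealing step. The following Python sketch is illustrative arithmetic only: it assumes that the yield reads as 500-800 recombinants per nanogram of cDNA, and the function name `expected_recombinants` is hypothetical.

```python
def expected_recombinants(cdna_ng, yield_per_ng):
    """Scale a (low, high) yield per nanogram of cDNA to a given cDNA input.

    Assumption: the yield is linear in the amount of cDNA annealed.
    """
    low, high = yield_per_ng
    return cdna_ng * low, cdna_ng * high

# 100 ng cDNA annealed; assumed yield of 500-800 recombinants per ng.
low, high = expected_recombinants(100, (500, 800))
print(f"expected library size: {low}-{high} recombinants")
```

Under this assumption, the expected range would bracket the 60,000 independent clones reported for the library in the Results.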

To amplify the cDNA library, cells were grown, after transformation, for 4 h at 37 °C in a ten-fold volume of medium (LB [42]) without selection pressure. Ampicillin (100 µg/ml final concentration) and glycerol (15% final concentration) were then added, and the cells were stored at -80 °C in small aliquots. All recombinant experiments were conducted under the conditions specified by the Zentrale Kommission für Biologische Sicherheit of the Federal Republic of Germany.

Screening of recombinant clones

Bacteria carrying recombinant plasmids were plated at a density of approximately 5,000 colonies per 14-cm Petri dish. The transfer and binding of DNA to 137-mm nitrocellulose filters (BA85, 0.45 µm; Schleicher and Schüll, Dassel, FRG) as well as the washing and prehybridization were performed as described by Maniatis et al. [42]. A gel-purified insert of a cDNA clone to bovine epidermal keratin III (pKBIII' [34]) was nick-translated and used as the hybridization probe. Hybridization was performed overnight at 37 °C in 50% formamide, 5 × SSPE (1 × SSPE is 0.18 M NaCl, 10 mM NaH2PO4, 1 mM EDTA, pH 7.4), 5 × Denhardt's solution, 0.2% sodium dodecyl sulfate (SDS), and 100 µg/ml yeast tRNA. The filters were washed four times at room temperature with 2 × SSC (1 × SSC is 0.15 M NaCl and 0.015 M Na-citrate) and 0.2% SDS, and twice at 45 °C (45 min each) with 0.1 × SSC and 0.2% SDS. As described below, these conditions allow cross-hybridization between the epidermal probe and the cytokeratin-A mRNA.
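The working concentrations implied by the "n ×" notation follow by simple scaling of the 1 × definitions given above. A minimal Python sketch of this arithmetic is shown here; the dictionary and function names are illustrative and not part of the original protocol.

```python
# 1x compositions as defined in the text, in mol/l.
SSPE_1X = {"NaCl": 0.18, "NaH2PO4": 0.010, "EDTA": 0.001}
SSC_1X = {"NaCl": 0.15, "Na-citrate": 0.015}

def dilute(stock_1x, factor):
    """Scale every component of a 1x buffer definition by `factor`."""
    return {name: round(conc * factor, 6) for name, conc in stock_1x.items()}

hybridization = dilute(SSPE_1X, 5)   # 5x SSPE used for hybridization
final_wash = dilute(SSC_1X, 0.1)     # 0.1x SSC used for the washes at 45 degrees C

for label, buffer in (("5x SSPE", hybridization), ("0.1x SSC", final_wash)):
    # Report each component in millimolar.
    print(label, {name: f"{conc * 1000:g} mM" for name, conc in buffer.items()})
```

The final wash therefore contains only 15 mM NaCl, one tenth of the salt in 1 × SSC.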

Positive colonies were selected and replated at low density (300-500 colonies per 90-mm Petri dish). The screening was repeated, and the positive colonies were replated and screened once more. Finally, single colonies were selected for further analysis.

Identification of cDNA clones

Clones hybridizing strongly in the colony hybridization assay were identified by a positive hybridization-selection assay under stringent hybridization conditions using 200 µg/ml urothelial poly(A)+-RNA [34]. The in vitro translation products were identified by one- and two-dimensional gel electrophoresis [34]. To find conditions allowing cross-hybridization between mRNAs coding for different members of the same cytokeratin subfamily, hybridization-selection experiments were performed under low-stringency conditions. For this purpose, hybridizations were performed at 42 °C in 50% formamide, 5 × SSC, 20 mM Pipes (pH 7.4), 0.2% SDS, 250 µg/ml yeast tRNA, and 20 µg/ml poly A, and washes were performed at 42 °C in 0.1 × SSC and 0.2% SDS. The clone pKBIII' selected the mRNA coding for cytokeratin A, and so these conditions were chosen for the colony hybridization.

Northern blotting

RNA was denatured with glyoxal, separated on 1.1% agarose gels, transferred to nitrocellulose paper, and probed with nick-translated, 32P-labeled plasmid DNA (specific activity, 3-5 × 10^8 cpm/µg), as previously described for epidermal keratins [34].

Subcloning and DNA sequencing

The cDNA insert of pKB8' was sequenced mainly using the dideoxy-chain-termination method [56]. Regions where the reading was ambiguous were resequenced using the chemical-cleavage method [44]. Dideoxy sequencing was performed on both strands after subcloning overlapping restriction fragments in the M13-mp8 and -mp9 vectors [69]. The methods used for the DNA preparations, the nucleotide mixes, and the reaction conditions were those recommended by Amersham-Buchler (Braunschweig, FRG). Radiolabeled α-dATP was used as the label. The reaction products were separated on 40-cm-long, 0.2-mm-thick sequencing gels. After the run, the gels were dried on Whatman-3MM paper and exposed for 1-4 days at room temperature. The amino-acid-sequence translation, sequence comparisons, and probabilistic conformational analyses were performed as specified previously [54].

Results

Selection of clones from a bladder-urothelium cDNA library

A cDNA library (60,000 independent clones) was made from bovine bladder urothelium and then amplified; 20,000 colonies were screened for the presence of clones representing nonepidermal, simple epithelium-type cytokeratins (bovine components designated A, D, and Mr 40,000, corresponding to human cytokeratins nos. 8, 18, and 19). Initially, size-fractionated urothelial poly(A)+-RNA (12-23 S) was used to screen the library. However, this proved unsuccessful, probably because, in this tissue, cytokeratin mRNAs are far less abundant than in epidermis and other multistratified epithelia (cf. [34]). Therefore, cDNA clones to bovine epidermal type-II cytokeratins [34] were used as screening probes under conditions allowing cross-hybridization with other members of the same subfamily [36]. To optimize this approach, we initially performed hybridization-selection experiments using epidermal cDNA clones and bovine urothelial poly(A)+-RNA. Epidermal clone pKBIII' was shown to select specifically for RNA coding for nonepidermal cytokeratin A (data not shown) and was therefore used to screen the library for cytokeratin A under relaxed conditions. To avoid nonspecific hybridization through the dC:dG-tail region of the cDNAs, a fragment of the cDNA insert of pKBIII' was purified by polyacrylamide gel electrophoresis, labeled with 32P by nick-translation, and used to screen filter replicas of the urothelial cDNA library. Fifteen positive colonies were selected and further purified in two additional screening cycles.

Fig. 1a-d. Identification of a cDNA clone coding for bovine cytokeratin A (no. 8) by in vitro translation of an mRNA hybrid selected by clone pKB8' (a, b) and by Northern-blot analysis (c, d). a Coomassie-Blue-stained gel containing cytoskeletal proteins from bovine bladder urothelium. A, D, and 40K denote cytokeratins A (no. 8), D (no. 21 of the bovine catalog of [58], corresponding to cytokeratin no. 18 of the human catalog of [47]), and the Mr-40,000 cytokeratin (no. 22 of the bovine and no. 19 of the human catalogs), respectively. B, AC, and P are bovine serum albumin, muscle α-actin, and yeast phosphoglycerokinase, respectively, which were used as markers in the co-electrophoresis. The arrow denotes the position of a major degradation product of cytokeratin A (for identification see [57]). b Autoradiograph of the gel shown in a, revealing 35S-methionine-labeled products of the in vitro translation of mRNA selected by the hybridization of urothelial poly(A)+-RNA to clone pKB8'. T, an endogenous product of the reticulocyte lysate; arrow, a degradation product of cytokeratin A (no. 8). c, d Autoradiographs showing Northern-blot analyses of poly(A)+-RNA from bovine bladder urothelium hybridized with clone pKB8'; moderate (12 h; c) and prolonged (4 days; d) exposure times demonstrated the high specificity of the cDNA clone. Using rRNAs as markers, the estimated size of the mRNA coding for cytokeratin A is 1.85 kb

Plasmid DNA was bound to nitrocellulose filters and hybridized to urothelial poly(A)+-RNA under stringent conditions. Specifically bound RNA was recovered and translated in vitro, and the polypeptides synthesized were analyzed using one- and two-dimensional gel electrophoresis. Of the 15 selected clones, 5 were strongly positive for cytokeratin A. Figure 1a, b presents, as an example, the specific selection of mRNA coding for cytokeratin A by clone pKB8'. This specificity was also demonstrated by Northern-blot analysis of total bladder urothelial poly(A)+-RNA (Fig. 1c, d). The size of the mRNA coding for cytokeratin A was estimated to be 1.85 kb (Fig. 1c), this being considerably larger than that recently reported for the mRNA of the corresponding mouse cytokeratin A (1.64 kb [68]), although the murine polypeptide appears to be somewhat larger (Mr, 55,000) than the bovine one (Mr, 53,000). The mRNA for bovine cytokeratin A is considerably smaller than any of the mRNAs coding for epidermal keratins of the same type-II subfamily [19, 31, 34].

      TTT GCC TCC TTC ATC GAC AAG GTG CGG CAC CTG GAG CAG CAG AAC AAG GTT CTG GAG ACC AAA TGG AAC CTC CTG
      Phe Ala Ser Phe Ile Asp Lys Val Arg His Leu Glu Gln Gln Asn Lys Val Leu Glu Thr Lys Trp Asn Leu Leu

  75  CAG CAG CAG AAG ACT GCC CGG AGC AAC ATA GAC AAC ATG TTT GAG AGC TAC ATT AAC AAC CTC CGT CGG CAG CTG
      Gln Gln Gln Lys Thr Ala Arg Ser Asn Ile Asp Asn Met Phe Glu Ser Tyr Ile Asn Asn Leu Arg Arg Gln Leu

 150  GAA ACT CTG GCC CAG GAG AAG CTG AAG CTG GAA GTG GAG CTT GGC AAC ATG CAG GGG CTG GTG GAG GAC TTC AAG
      Glu Thr Leu Ala Gln Glu Lys Leu Lys Leu Glu Val Glu Leu Gly Asn Met Gln Gly Leu Val Glu Asp Phe Lys

 225  ACC AAG TAT GAG GAT GAA ATC CAA AAG CGC ACA GAC ATG GAG AAT GAA TTT GTC ATC ATC AAG AAG GAT GTG GAT
      Thr Lys Tyr Glu Asp Glu Ile Gln Lys Arg Thr Asp Met Glu Asn Glu Phe Val Ile Ile Lys Lys Asp Val Asp

 300  GAA GCT TAC ATG AAC AAG GTA GAG CTG GAG TCC CGC CTG GAA GGG CTG ACT GAT GAG ATC AAC TTC TAC AGG CAA
      Glu Ala Tyr Met Asn Lys Val Glu Leu Glu Ser Arg Leu Glu Gly Leu Thr Asp Glu Ile Asn Phe Tyr Arg Gln

 375  CTG TAT GAA GAG GAG ATC CGT GAG ATG CAG TCT CAG ATT TCT GAC ACG TCC GTG GTC CTG TCC ATG GAC AAC AAC
      Leu Tyr Glu Glu Glu Ile Arg Glu Met Gln Ser Gln Ile Ser Asp Thr Ser Val Val Leu Ser Met Asp Asn Asn

 450  CGC AAC CTG GAC CTG GAT GGC ATC ATC GCT GAG GTC AAG GCC CAG TAT GAG GAG ATC GCC AAC CGC AGC CGG GCT
      Arg Asn Leu Asp Leu Asp Gly Ile Ile Ala Glu Val Lys Ala Gln Tyr Glu Glu Ile Ala Asn Arg Ser Arg Ala

 525  GAG GCC GAG GCC ATG TAC CAA ATC AAG TAT GAG GAG CTG CAG ACA CTG GCT GGG AAG CAC GGG GAT GAC CTT CGT
      Glu Ala Glu Ala Met Tyr Gln Ile Lys Tyr Glu Glu Leu Gln Thr Leu Ala Gly Lys His Gly Asp Asp Leu Arg

 600  CGC ACG AAG ACG GAG ATT TCT GAG ATG AAC CGG AAC ATC AAC CGT CTC CAG GCT GAG ATC GAG GGT CTC AAA GGC
      Arg Thr Lys Thr Glu Ile Ser Glu Met Asn Arg Asn Ile Asn Arg Leu Gln Ala Glu Ile Glu Gly Leu Lys Gly

 675  CAG AGG GCT TCC CTG GAG GCT GCC ATC GCT GAC GCT GAG CAG CGT GGT GAG ATG GCT GTT AAG GAT GCT CAG GCC
      Gln Arg Ala Ser Leu Glu Ala Ala Ile Ala Asp Ala Glu Gln Arg Gly Glu Met Ala Val Lys Asp Ala Gln Ala

 750  AAG CTG GCG AGG CTG GAG GCC GCT CTG AGG AAC GCC AAG CAG GAC ATG GCG CGG CAG CTG CGC GAG TAC CAG GAG
      Lys Leu Ala Arg Leu Glu Ala Ala Leu Arg Asn Ala Lys Gln Asp Met Ala Arg Gln Leu Arg Glu Tyr Gln Glu

 825  CTC ATG AAT GTC AAG CTG GCC CTG GAC GTG GAG ATT GCC ACC TAC AGG AAG CTG CTG GAG GGC GAG ▲ GAG AGC CGG
      Leu Met Asn Val Lys Leu Ala Leu Asp Val Glu Ile Ala Thr Tyr Arg Lys Leu Leu Glu Gly Glu ▲ Glu Ser Arg

 900  CTG GAG TCT GGG ATG CAG AAC ATG AGT ATC CAC ACC AAG ACC ACC AGT GGC TAC GCA GGT GGA CTG ACT TCG TCC
      Leu Glu Ser Gly Met Gln Asn Met Ser Ile His Thr Lys Thr Thr Ser Gly Tyr Ala Gly Gly Leu Thr Ser Ser

 975  TAC GGG ACC CCT GGC TTC AAC TAC AGC CTG AGC CCC GGC TCC TTC AGC CGC ACC AGT TCC AAG CCT GTG GTT GTG
      Tyr Gly Thr Pro Gly Phe Asn Tyr Ser Leu Ser Pro Gly Ser Phe Ser Arg Thr Ser Ser Lys Pro Val Val Val

1050  AAG AAG ATT GAG ACC CGC GAT GGG AAG CTG GTG TCC GAG TCC TCT GAT GTC CTG TCC AAG TGA AAGGCTCTTG
      Lys Lys Ile Glu Thr Arg Asp Gly Lys Leu Val Ser Glu Ser Ser Asp Val Leu Ser Lys ***

1123  CGGTCCCTCC CCAGTCTCCG GCTCATTGGC TCCTGCAGAT GGGAGCTGTG CAGGGGAGCA TCTGCACAGG AGACCTGAGG TTTCGCCCCT

1213  GTCCTCCGCC CACACCTGGG GGGAGTCGAC TGCCTGGGGT TGCCCCTTTC GCCCATGACC CCACCTAAAA GCCAATGTAA GCGTCTTTTT

1303  CAGAATAAAT CCAATTCGAG TATCTTTTTT TTTCAA[AATA AA]GCTTCAGT TGGCTCTGCA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA

Fig. 2. Nucleotide sequence of clone pKB8' and the deduced amino acid sequence of cytokeratin A (no. 8). The cDNA sequence extends from coil 1A of the α-helical rod to the poly(A)-tail region. The arrowhead indicates the end of the α-helical rod after the TYR(X)LLEGE-consensus sequence. Asterisks denote the translation termination codon. The polyadenylation signal is shown in brackets. Thirty nucleotides of the poly(A)-tail are shown; a total of 86 are contained in the cDNA clone

For further experiments, the cDNA clone pKB8', carrying an insert with a length of approximately 1.45 kb, was chosen.

Sequence analysis

Restriction mapping of clone pKB8' revealed three internal PstI sites and two HindIII sites, which permitted overlapping subclones to be constructed in the M13-mp8 and -mp9 vectors in both orientations. Sequencing by the dideoxy-chain-termination method resulted in a few ambiguities, and such regions were therefore also sequenced using the chemical-cleavage method.

Figure 2 presents the DNA sequence and the deduced amino acid sequence. The cDNA insert is ~1.43 kb long, including a poly(A) stretch of 86 nucleotides, and encodes a total of 369 amino acids, of which 296 residues are located in the α-helical rod region, while 73 residues can be assigned to the carboxyterminal tail (nomenclature of [21]). The aminoterminal head domain and approximately 10 amino acids from the start of the α-helical rod domain are not present in clone pKB8'. The 3' nontranslated region comprises 250 bp and has a polyadenylation signal 17 nucleotides before the poly(A) stretch.
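The spacing between the polyadenylation signal and the poly(A) stretch can be checked directly against the last sequence line of Fig. 2. The following Python sketch is illustrative only: the string `three_prime_end` is transcribed from that line of the figure (including the 30 displayed A residues of the tail), and the helper name `signal_to_polya_distance` is hypothetical and not part of the original analysis.

```python
import re

# Last sequence line of Fig. 2 (clone pKB8'), written without spaces.
three_prime_end = (
    "CAGAATAAAT" "CCAATTCGAG" "TATCTTTTTT" "TTTCAAAATA" "AAGCTTCAGT"
    "TGGCTCTGCA" + "A" * 30
)

def signal_to_polya_distance(seq, signal="AATAAA"):
    """Count the nucleotides between the last `signal` and the terminal A run."""
    # Start of the uninterrupted run of A residues at the 3' end.
    polya_start = re.search(r"A+$", seq).start()
    # Last occurrence of the signal lying entirely upstream of that run.
    signal_start = seq.rfind(signal, 0, polya_start)
    if signal_start < 0:
        raise ValueError("no polyadenylation signal upstream of the poly(A) run")
    return polya_start - (signal_start + len(signal))

distance = signal_to_polya_distance(three_prime_end)
print(distance)
```

For the transcribed line, the function returns 17, in agreement with the spacing stated in the text; the more upstream AATAAA in the same line is ignored because only the last occurrence before the poly(A) run is considered.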

A comparison of cytokeratin A with amino acid se- quences from other IF proteins (Figs. 3 and 4 show some examples; cf. [61, 62, 711) revealed several common princi- ples, but also some distinctive features. The a-helical rod portion was identified by conformation analysis (Fig. 5) and exhibited extended regions characterized by the conven- tional heptades of typical coiled-coil arrangements. The se-

Page 5: Cytokeratin expression in simple epithelia: II. cDNA cloning and sequence characteristics of bovine cytokeratin A (no. 8)

258

[Fig. 3 alignment: sequences of HUM 56B, BOV A, BOV VI, and HAM VIM, positions -230 to +90 relative to the end of the α-helical rod (position 0). The residue-level alignment is not recoverable from the extracted text.]

Fig. 3. Amino acid comparison between bovine cytokeratin A (BOV A), human cytokeratin no. 6 (HUM 56B [67]), bovine epidermal cytokeratin VIb (BOV VI [54]), and hamster vimentin (HAM VIM [50]). The amino acids identical in cytokeratin A and at least one of the other IF proteins are shown in boldface. The sequences are aligned, taking the end of the α-helical rod as the reference point (arrow; position 0), in this and all subsequent sequence figures. The underlined sequence DGKLVSE in bovine cytokeratin A identifies a motif found in similar positions in some type-I cytokeratin sequences. It may also be related to the DGKVINET motif found in hamster vimentin in positions 44-51. Note the high homology of the two representatives of the type-II cytokeratin subfamily, i.e., human no. 6 and bovine A

quence of this region identifies cytokeratin A as being a typical member of the basic (type II) subfamily, as illustrated by its high sequence homology with human epidermal cytokeratin no. 6 (Fig. 3; 83% sequence identity; this polypeptide has been presented as component Mr 56,000 in [24, 67]) and all four known bovine epidermal keratin sequences (Fig. 4). Some regions of the α-helix exhibit a higher degree of homology than others; in particular, the end of the α-helical coil 1A (positions -300 to -279), the interruption between coils 1 and 2 (positions -160 to -140), and the start (positions -139 to -119) and the end (positions -35 to 0) of coil 2B. Dot-matrix comparisons (Fig. 6) of the sequence of bovine cytokeratin A with those of representatives of the different major types of IF proteins revealed that the α-helical region of bovine cytokeratin A is highly homologous to the same domain of basic (type II) cytokeratins from the epidermis of the same (not shown [33]) and other species, ranging from humans (Fig. 6a) to the frog, X. laevis (Fig. 6b). The same analysis showed that cytokeratin A is much less closely related to acidic (type I) cytokeratins (Fig. 6c) and the various nonepithelial IF proteins (Fig. 6d presents the example of vimentin), the only significant homology recognized being at the end of the α-helical domain (arrows in Figs. 6a-d). At the position (-117 of Fig. 3) which is occupied by a common tryptophan residue in all known nonepithelial IF proteins as well as in many type-I and -II cytokeratins, cytokeratin A presents a methionine residue. Interestingly, a leucine residue has recently been reported in this position in three epidermal type-II cytokeratins [61, 62]. As for all other IF proteins [71], the predicted conformation of cytokeratin A shows an abrupt change from α-helicity to nonhelical arrangements after the consensus sequence TYR(X)LLEGE (position 0 in Figs. 3 and 4). This position also marks an abrupt end to the homology between cytokeratin A and all other members of the IF protein family (Fig. 6a), including the basic (type II) epidermal cytokeratins of the same species (Figs. 4, 6).
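The dot-matrix comparison described here can be illustrated with a minimal sketch. It is not the program of S. Suhai used for Fig. 6, whose parameters are not given beyond the 60% identity threshold; the window length, the function name `dot_matrix`, and the two toy sequences are assumptions made only for illustration.

```python
def dot_matrix(seq_a, seq_b, window=10, min_identity=0.6):
    """Return (i, j) window start positions where the two windows share
    more than `min_identity` identical residues (one dot per hit)."""
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            # Count identical residues at matching offsets in the two windows.
            matches = sum(a == b for a, b in zip(seq_a[i:i + window],
                                                 seq_b[j:j + window]))
            if matches / window > min_identity:
                hits.append((i, j))
    return hits


# Hypothetical toy sequences around a TYR(X)LLEGE-like consensus;
# they are NOT the sequences analysed in the paper.
toy_a = "LDVEIATYRKLLEGEE"
toy_b = "LENEIQTYRSLLEGEG"
for i, j in dot_matrix(toy_a, toy_b, window=8):
    print(i, j)
```

Runs of hits along a diagonal correspond to the homologous stretches that appear as lines in a dot-matrix plot; the end of the rod would show up as the point where such a diagonal stops.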


[Fig. 4 alignment: rod-tail regions of bovine cytokeratin A (B A) and bovine epidermal cytokeratins B Ia, B Ib, B III, and B IV, positions -40 to +130, with the tail domains C1, C2, and C3 marked. The residue-level alignment is not recoverable from the extracted text.]

Fig. 4. Amino acid comparison of rod-tail regions of bovine cytokeratin A (no. 8) and several bovine epidermal type-II cytokeratins (for details, see [33]). The symbols are as in Fig. 3. The oligopeptides underlined are found in similar positions in cytokeratin A, III, and IV and, in the epidermal components, signal the beginning of domain C2. C1, C2, and C3 indicate domains of the carboxyterminal tail region present in epidermal cytokeratins. Note the absence of domain subdivision and glycine-rich repeats in the tail region of cytokeratin A

[Fig. 5 plot: panels a-d; ordinate 0 to 300 (relative probability units); abscissa residue positions -280 to +80.]

Fig. 5. Analysis of α-helix probability for the amino acid sequence of cytokeratin A using the program of Garnier et al. [20] with the specifications mentioned by Rieger et al. [54]. The values on the ordinate present relative units of probability of α-helical conformation. The numbers on the abscissa are as in Fig. 3. The lines in a-d indicate the probability for α-helical (a), extended (b), β-turn (c), and random-coil (d) conformation. Note regions of α-helical potential in the rod portion and the abrupt change at the rod-tail transition (position 0). The significance of the α-helical 'spot' in the tail region (positions 45-62) is not clear

According to a hypothesis of Steinert et al. [61, 62], in the terminal third of the rod, i.e., coil 2B, all type-II cytokeratins should have a higher ratio of basic to acidic amino acids than type-I cytokeratins. In this region, bovine cytokeratin A has 19 basic and 23 acidic residues, i.e., it is, in this respect, considerably less basic than several other type-II cytokeratins analyzed.
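The residue count behind this comparison is simple to reproduce for any sequence segment. The following is a minimal sketch under stated assumptions: lysine and arginine are taken as basic and aspartate and glutamate as acidic (whether histidine was counted in the paper is not stated), and `charge_counts` and the toy segment are hypothetical names and data, not the coil 2B sequence itself.

```python
from collections import Counter

BASIC = "KR"   # assumption: histidine is not counted as basic
ACIDIC = "DE"


def charge_counts(segment):
    """Return (number of basic residues, number of acidic residues)."""
    counts = Counter(segment)
    basic = sum(counts[r] for r in BASIC)
    acidic = sum(counts[r] for r in ACIDIC)
    return basic, acidic


# Hypothetical toy segment, used only to show the call.
basic, acidic = charge_counts("KRDEAKLE")
print(basic, acidic)

# Ratio for the counts reported in the text for coil 2B of cytokeratin A.
print(round(19 / 23, 2))  # -> 0.83
```

A basic:acidic ratio below 1, as obtained from the reported counts, corresponds to the statement that this region of cytokeratin A carries more acidic than basic residues.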

The tail of cytokeratin A is considerably shorter than that of any of the epidermal type-II cytokeratins and is characterized by a high content of hydroxyamino acids (26 of the 73 residues) that occur over the entire tail portion. In contrast to all known epidermal type-II cytokeratins of bovine, murine, human, and amphibian origin [24, 27, 33, 61, 62, 67], cytokeratin A does not exhibit the subdivision of the tail into three domains. In particular, it lacks the glycine-rich middle domain (designated C2 in [33] and V2 in [61-63]) and does not exhibit the oligoglycine repeats characteristic of the C2 domains of all epidermal type-II and some type-I cytokeratins (Figs. 3, 4). Interestingly, however, cytokeratin A presents a SGYAGG motif which is similar to the SG(G)YGGi sequence found at the border of domains C1 and C2 in several epidermal cytokeratins (Figs. 3, 4; see also [61]). The tail of bovine cytokeratin A also contains the sequence DGKLVSE, which is very similar to the sequence DGKVV,', found in a similar position in several type-I cytokeratins of a broad range of species [26, 32, 35], and also, in a modified version, in nonepithelial IF proteins (cf. [71]); however, this sequence is not found in epidermal keratins containing glycine-rich tails. It is also noteworthy that, although the number of basic amino acid residues exceeds that of acidic ones in the tail of cytokeratin A, the end portion, in contrast to


Fig. 6a-d. Dot-matrix comparison of the amino acid sequences of bovine cytokeratin A (no. 8) and representatives of three other types of IF proteins. a Human basic (type II) epidermal cytokeratin no. 6 (H56B [67]); b basic (type II) epidermal cytokeratin of Mr 64,000 of X. laevis [27]; c bovine acidic (type I) cytokeratin no. 14 (Bov VIb [54]); d hamster vimentin. Comparisons were performed using the computer program of S. Suhai (this center), presenting those sequence stretches with a degree of sequence identity higher than 60%. In each comparison, bovine cytokeratin A (no. 8) is represented by the ordinate, the exception being a; the arrows denote the end of the α-helical domain. The numbers of the amino acid residues in the specific sequences are given in the abscissa and the ordinate

all other type-II cytokeratins and some type-I cytokeratins, is not positively charged; the last 10 amino acids include 1 basic and 2 acidic residues.
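The tail features discussed above (hydroxyamino acid content and short sequence motifs) can be checked on any sequence with a few lines of code. This is only a sketch under stated assumptions: serine and threonine are counted as hydroxyamino acids by default (whether tyrosine was included in the paper's count is not stated), the regular expressions are loosened illustrations of the SG(G)YGG-like and DGK-type motifs, and `hydroxy_fraction`, `find_motifs`, and the toy fragment are hypothetical.

```python
import re


def hydroxy_fraction(seq, residues="ST"):
    """Fraction of residues in `seq` that belong to `residues`."""
    return sum(r in residues for r in seq) / len(seq)


def find_motifs(seq, pattern):
    """Return (start index, matched text) for every motif occurrence."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, seq)]


# Hypothetical toy fragment containing the two motifs named in the text;
# it is NOT the complete tail sequence of cytokeratin A.
toy_tail = "KTTSGYAGGLTSSKIETRDGKLVSESS"

print(find_motifs(toy_tail, r"SGG?YA?GG"))    # SG(G)YGG-like pattern, assumed
print(find_motifs(toy_tail, r"DGK[LV][VI]"))  # DGK-type pattern, assumed

# Hydroxyamino acid content reported in the text: 26 of 73 residues.
print(round(100 * 26 / 73, 1))  # -> 35.6
```

The percentage computed from the reported counts agrees with the approximately 35% stated in the abstract.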

We also compared the sequence of bovine cytokeratin A with that of a segment of the corresponding amphibian cytokeratin contained in clone KXL8' selected from an X. laevis oocyte mRNA λgt11 library by cross-hybridization (J. Franz and W.W. Franke, unpublished data; cf. [18]). The amino acid sequence of this cytokeratin has been well conserved during evolution, not only in the α-helical rod but also in the adjacent tail portion (data not shown).

On the basis of immunological reactions produced using antibodies to human TPA, which is widely utilized as a tumor marker in clinical diagnoses, it has recently been suggested that TPA is a proteolytic soluble fragment(s) of the simple epithelial cytokeratins nos. 8, 18, and/or 19 [72].

When we compared the sequence of bovine cytokeratin A with the sequences of TPA fragments published by Luening and coworkers [39, 40, 53], it became obvious (Fig. 7) that at least one component of TPA is highly homologous to bovine cytokeratin A and is probably identical to human cytokeratin no. 8 (both TPA sequences could be assigned to the α-helical rod). From comparisons with other published cytokeratin sequences as well as with partial sequences of the rod portions of cytokeratins nos. 18 and 19 ([55]; B. Bader, T. Magin, and W.W. Franke, unpublished data), it also became clear that regions homologous to fragments El and C of TPA do not occur in type-I cytokeratins. In contrast, fragment B of TPA is located at the start of coil 1A of a type-I cytokeratin (probably no. 18) which is known to occur in vivo in complexes with cytokeratin no. 8 [17, 25].


[Fig. 7 alignment: rod segment of bovine cytokeratin A (BA), positions -170 to +1, aligned with TPA fragments; asterisks mark identical residues and dashes mark undetermined positions. The residue-level alignment is not recoverable from the extracted text.]

Fig. 7. Amino-acid-sequence comparison of the rod segment of bovine cytokeratin A (BA) with partial sequences from cyanogen bromide cleavage products of human serum component, TPA [39]. Asterisks indicate identical amino acids; dashed lines denote positions not determined for the TPA fragments. Note the high sequence homology, showing that TPA is closely related to bovine cytokeratin A, and is probably identical to the corresponding human polypeptide, i.e., cytokeratin no. 8. The arrowhead denotes the end of the α-helical rod. Sequences of fragments TPA-El and TPA-C were aligned to maximize homology. The positions differing between fragment TPA-C and bovine cytokeratin A may represent true amino acid exchanges in the two species, but some may also be due to errors in the amino acid sequencing of the protein fragments
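The asterisk notation used in this kind of comparison can be generated from two pre-aligned sequences. The sketch below is illustrative only: it assumes the sequences are already aligned to equal length, treats "-" as an undetermined position that is excluded from the identity count, and uses a hypothetical function name `identity_line` and toy strings rather than the published TPA fragments.

```python
def identity_line(ref, frag, gap="-"):
    """Return (marker string, percent identity) for two aligned sequences.

    '*' marks identical residues; positions where either sequence holds
    the gap symbol are left blank and excluded from the percentage.
    """
    if len(ref) != len(frag):
        raise ValueError("sequences must be aligned to the same length")
    marks = []
    identical = compared = 0
    for r, f in zip(ref, frag):
        if r == gap or f == gap:
            marks.append(" ")
            continue
        compared += 1
        if r == f:
            identical += 1
            marks.append("*")
        else:
            marks.append(" ")
    percent = 100.0 * identical / compared if compared else 0.0
    return "".join(marks), percent


# Hypothetical toy alignment (not taken from Fig. 7).
reference = "LQAEIEGLKG"
fragment = "LQAEVEG--G"
marks, percent = identity_line(reference, fragment)
print(reference)
print(fragment)
print(marks)
print(round(percent, 1))
```

Excluding undetermined positions keeps the percentage from being lowered by residues that were simply not sequenced in the protein fragment.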

Fig. 8. Expression of bovine cytokeratin A in different bovine tissues and cultured cells as examined by Northern-blot analysis. The lanes contain poly(A)+-RNA from bladder urothelium (a), liver (b), retina (c), esophagus (d), snout (e), heart (f), cultures of MDBK cells (g), BMGE-H cells (h), and bovine fibroblasts; for comparison, SV40-transformed human fibroblasts (k) and B 1 (i) are included. As a positive control, RNA extracted from bovine and human fibroblasts was also probed using a vimentin cDNA clone kindly provided by W. Quax ([51]; l, m). The radioactive smear below the mRNA band in lane h probably represents mRNA degradation products

Cell-type-specific expression of cytokeratin A (no. 8)

To determine the presence and the relative amounts of mRNA coding for cytokeratin A in different cell types, total RNA and poly(A)+-RNA were prepared from diverse bovine tissues and cultured cells, and were then examined by hybridization using the Northern-blot technique. In these experiments, nick-translation-labeled plasmid pKB8' DNA and DNA of a subclone representing only a segment of the 3' nontranslated region of the mRNA were used, the latter in order to minimize cross-hybridization with other members of the basic (type II) cytokeratin subfamily (for problems of cross-hybridization of different cytokeratin mRNAs, see also [34, 36]). As shown in Fig. 8, we detected cytokeratin-A mRNA in bladder, liver, and (in very low amounts) esophagus, but not, for example, in retina, snout epidermis, or heart tissue (Fig. 8, lanes a-f), this being in agreement with biochemical data and results obtained using immunofluorescence microscopy (e.g. [46, 49, 58]). The detection of this mRNA in esophagus epithelium is especially interesting, as this cytokeratin is not readily detected among the cytoskeletal proteins of total human or bovine esophagus [12, 45, 47], but has recently been shown, using cytokeratin-type-specific antibodies and microdissected tissue samples (R. Moll and W.W. Franke, unpublished data), to occur specifically in the basal layer of this stratified epithelium. Of the various cultured cell lines tested, cytokeratin-A mRNA occurred in MDBK and BMGE-H cells, but not in bovine and human fibroblasts (Fig. 8, lanes g-k; for controls of the negative results, see lanes l and m).

Discussion

The simple epithelial cytokeratins A (no. 8 of the human catalog) and D (no. 18) are of considerable importance, as they are the only cytokeratin pair expressed in a variety of simple epithelial cells, including early embryonal epithelia. The widespread occurrence of these cytokeratins in various kinds of epithelia, carcinomas, and cultured epithelial cells makes the cloning of nucleic acid probes and sequence information about these proteins highly desirable for the study of cell and tissue differentiation, including carcinogenesis. The clone pKB8' described in our study detects mRNA coding for cytokeratin A (no. 8) in simple epithelia, in urothelium, and in some stratified tissues, e.g., esophagus (this study) and exocervix (data not shown), which contain some simple epithelial cytokeratins in their most basal layer. Remarkably, epidermis, which has a very high keratin content, is negative for this cDNA probe, confirming and extending previous analyses at the polypeptide level [12, 58]. The mRNA coding for cytokeratin A can also be detected in cultured epithelial cells such as the BMGE-H and MDBK cell lines (for polypeptides in these cells, see [15, 59]). We have been unable to detect hybridization of this cytokeratin cDNA or any of the six epidermal cDNA probes described previously [34] with mRNAs of nonepithelial tissues such as heart, retina (this study), and brain (data not shown), or of nonepithelial cultured cells such as fibroblasts (this study) and lens-forming cells (not shown) even under conditions of lowered stringency. This absence of mRNAs capable of cross-hybridizing with any of the diverse cytokeratin mRNAs is in agreement with the immunolocalization data and biochemical analyses of various investigators ([16, 46]; for human tissues, see [48]), and strongly speaks against the existence of 'keratin-like proteins' in various fibroblastic cell lines [75].

Our present knowledge concerning cytokeratins is limited virtually to epidermal and sheep-wool keratins. The availability of the sequence of clone pKB8' now makes possible a comparison between a simple epithelial cytokeratin and the epidermal keratins. Although cytokeratin A is not a truly positively charged protein (isoelectric pH value in 9.5 M urea, approximately 6.4 [12, 15, 16]), it is related in its amino acid sequence to the group of basic epidermal cytokeratin polypeptides of subfamily II. This is evident from the high amino-acid-sequence homology of the rod portion of both cytokeratin A and the epidermal members of this subfamily.

Steinert et al. [61, 62] have concluded that the more basic character of type-II cytokeratins, as compared to type-I cytokeratins, is generally due to a higher content of basic residues in coil 2B of the α-helical rod. However, the simple epithelial cytokeratin A is remarkable for its ratio of basic to acidic amino acids in coil 2B (20 residues out of 43 are basic), although it is still less acidic than its 'complex partner' cytokeratin D (no. 18) that, in humans, exhibits an acidic:basic residue ratio of 21:14 (cf. [55]).

Amino-acid-sequence comparisons of all type-II cytokeratins studied so far [24, 27, 33, 61, 62, 67] have revealed a subdivision of the tail region into three subdomains. Cytokeratin A differs dramatically from these cytokeratins by its lack of this feature; specifically, it does not contain the typical glycine-rich middle domain (C2 or V2). Preliminary data on the porcine Mr-52,000 cytokeratin, which may be related to bovine cytokeratin A, suggest that this too may lack glycine-rich tracts [71]. It may be that the hexapeptide SGYAGG, which is located in cytokeratin A in a position corresponding to the C1-C2 domain transition in the other type-II cytokeratins, represents a rudimentary or ancestral motif related to the extended C2 domains of the epidermal members of this subfamily. It remains to be seen whether the absence of a glycine-rich C2 domain is a systematic difference between simple epithelial type-II cytokeratin(s) and type-II cytokeratins whose expression is usually associated with stratification.

It may be that the extent of the glycine-rich C2 domain is correlated with the specific stability of a cytokeratin filament type against treatment with solubilizing agents [62] and/or higher resistance to the melting of heterotypic pairs by hydrogen-bond-weakening agents. In this context, it is of interest that the simple epithelial cytokeratin pair A:D (nos. 8 and 18 of [47]) is characterized by a relatively 'low melting point' in urea, whereas all epidermal cytokeratin pairs examined so far are much more stable [17, 25]. It is widely assumed that the tail regions of IF proteins project laterally from the protofilament backbone formed by linearly arranged rods (for reviews, see [63, 73]) and would therefore be good candidates for the specific interaction of given types of IF protofilaments with each other or with other cell components. In this respect, the absence of glycine-rich tails in both cytokeratin A (no. 8) and D (no. 18; see ref. 55) indicates that stratified tissues exhibit a type of IF interaction that is absent in certain simple epithelial cells.

All type-II cytokeratin genes examined so far display a very similar exon-intron pattern [31, 38, 67]. Whether the simple epithelial cytokeratin A also displays the typical pattern of 8 introns characteristic of the epidermal type-II cytokeratins cannot be said at present. Vasseur et al. [68] have described a genomic clone reportedly coding for murine cytokeratin A that contains, surprisingly, only 7 introns. We have found the genomic analysis of bovine cytokeratin A to be exceptionally difficult because of the existence of several intronless pseudogenes (at least one is an almost precise cDNA copy of the cytokeratin mRNA; V. Romano and T. Magin, unpublished data). This feature has not been encountered in any of the genes for epidermal cytokeratins. Therefore, it might be speculated that the early expression of cytokeratin A during embryogenesis, i.e., before germ cells begin to develop from cytokeratin-containing embryonal epithelia, favors the integration of reverse copies of cytokeratin mRNAs into germ line DNA.

TPA is a component of human serum which can be immunologically recognized and is widely used in clinical oncology as a marker of severe tissue lesions, notably in association with cancer. Weber et al. [72] have reported that, in immunoblot experiments, TPA antibodies react specifically with human cytokeratins nos. 8, 18, and 19. The remarkable sequence homology between bovine cytokeratin A and two sequences of TPA fragments strongly suggests that TPA contains cytokeratin no. 8. In addition, a comparison of TPA fragment B with partial sequences of cytokeratin no. 18 and no. 19 (data not shown) suggests that this fragment is derived from cytokeratin no. 18. This supports the hypothesis of Weber et al. [72] that TPA as it appears in the circulation represents mixtures, or heterotypic complexes, of polypeptide fragments that contain the rod regions of cytokeratins nos. 8 and 18 and have been proteolytically released during cell lysis and degradation.

Acknowledgements. We are grateful to Dr. Pamela Cowin (this Institute) for correcting the manuscript, Inga Benz (Max Planck Institute for Medical Research, Heidelberg) for a gift of high transformation efficiency E. coli cells, and Irmgard Purkert (this Institute) for carefully typing the manuscript. The results of this study were presented at the Gordon Research Conference on Epithelial Differentiation and Keratinization, August 5-9, 1985, Tilton, NH, USA.

References

1. Achtstaetter T, Moll R, Moore B, Franke WW (1985) Cytokeratin polypeptide patterns of different epithelia of human male urogenital tract: Immunofluorescence and gel electrophoresis studies. J Histochem Cytochem 33:415-426

2. Brûlet P, Babinet C, Kemler R, Jacob F (1980) Molecular cloning of a cDNA sequence encoding a trophectoderm-specific marker during mouse blastocyst formation. Proc Natl Acad Sci USA 77:4113-4117

3. Chirgwin JM, Przybyla AE, MacDonald RJ, Rutter WJ (1979) Isolation of biologically native ribonucleic acid from sources enriched in ribonuclease. Biochemistry 18:5294-5299

4. Cooper D, Schermer A, Sun T-T (1985) Biology of disease. Classification of human epithelia and their neoplasms using monoclonal antibodies to keratins: strategies, applications, and limitations. Lab Invest 52:243-256

5. Crewther WG, Dowling LM, Steinert PM, Parry DAD (1983) Structure of intermediate filaments. Int J Biol Macromol 5:267-274

6. Deng G, Wu R (1981) An improved procedure for utilizing terminal transferase to add homopolymers to the 3' termini of DNA. Nucleic Acids Res 9:4173-4188

7. Denk H, Franke WW, Dragosics B, Zeiler I (1981) Pathology of cytoskeleton of liver cells: Demonstration of Mallory bodies (alcoholic hyalin) in murine and human hepatocytes by immunofluorescence microscopy using antibodies to cytokeratin polypeptides from hepatocytes. Hepatology 1:9-20

8. Denk H, Krepler R, Lackinger E, Artlieb U, Franke WW (1982) Biochemical and immunological analysis of the intermediate filament cytoskeleton in human hepatocellular carcinomas and in hepatic neoplastic nodules of mice. Lab Invest 46:584-596

9. Franke WW, Schmid E, Osborn M (1979) HeLa cells contain intermediate-sized filaments of the prekeratin type. Exp Cell Res 118:95-109

10. Franke WW, Denk H, Kalt R, Schmid E (1981) Biochemical and immunological identification of cytokeratin proteins in hepatocytes and hepatoma cells. Exp Cell Res 131:299-318

11. Franke WW, Mayer D, Schmid E, Denk H, Borenfreund E (1981) Differences of expression of cytoskeletal proteins in cultured rat hepatocytes and hepatoma cells. Exp Cell Res 134:345-365

12. Franke WW, Schiller DL, Moll R, Winter S, Schmid E, Engelbrecht I, Denk H, Krepler R, Platzer B (1981) Diversity of cytokeratins: Differentiation-specific expression of cytokeratin polypeptides in epithelial cells and tissues. J Mol Biol 153:933-959

13. Franke WW, Winter S, Grund C, Schmid E, Schiller DL, Jarasch E-D (1981) Isolation and characterization of desmosome-associated tonofilaments from rat intestinal brush border. J Cell Biol 90:116-127

14. Franke WW, Grund C, Kuhn C, Jackson BW, Illmensee K (1982) Formation of cytoskeletal elements during mouse embryogenesis. III. Primary mesenchymal cells and the first appearance of vimentin filaments. Differentiation 23:43-49

15. Franke WW, Schmid E, Grund C, Geiger B (1982) Intermediate filament proteins in nonfilamentous structures: Transient disintegration and inclusion of subunit proteins in granular aggregates. Cell 30:103-113

16. Franke WW, Schmid E, Schiller DL, Winter S, Jarasch E-D, Moll R, Denk H, Jackson B, Illmensee K (1982) Differentiation-related patterns of expression of proteins of intermediate-sized filaments in tissues and cultured cells. Cold Spring Harbor Symp Quant Biol 46:431-453

17. Franke WW, Schiller DL, Hatzfeld M, Winter S (1983) Protein complexes of intermediate-sized filaments: Melting of cytokeratin complexes in urea reveals different polypeptide separation characteristics. Proc Natl Acad Sci USA 80:7113-7117

18. Franz JK, Gall L, Williams MA, Picheral B, Franke WW (1983) Intermediate-size filaments in a germ cell: Expression of cytokeratins in oocytes and eggs of the frog Xenopus. Proc Natl Acad Sci USA 80:6254-6258

19. Fuchs EV, Coppock SM, Green H, Cleveland DW (1981) Two distinct classes of keratin genes and their evolutionary significance. Cell 27:75-84

20. Garnier J, Osguthorpe DJ, Robson B (1978) Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. J Mol Biol 120:97-120

21. Geisler N, Weber K (1982) The amino acid sequence of chicken-muscle desmin provides a common structural model for intermediate-filament protein. EMBO J 1:1649-1656

22. Hanahan D (1983) Studies on transformation of Escherichia coli cells with plasmids. J Mol Biol 166:557-580

23. Hanukoglu I, Fuchs E (1982) The cDNA of a human epidermal keratin: divergence of sequence but conservation of structure among intermediate-filament proteins. Cell 31:243-252

24. Hanukoglu I, Fuchs E (1983) The cDNA sequence of a type-II cytoskeletal keratin reveals constant and variable structural domains among keratins. Cell 33:915-924

25. Hatzfeld M, Franke WW (1985) Pair formation and promiscuity of cytokeratins: Formation in vitro of heterotypic complexes and intermediate-sized filaments by homologous and heterologous recombinations of purified polypeptides. J Cell Biol 101:1826-1841

26. Hoffmann W, Franz JK (1984) Amino acid sequence of the carboxy-terminal part of an acidic type-I cytokeratin of molecular weight 51,000 from Xenopus laevis epidermis as predicted from the cDNA sequence. EMBO J 3:1301-1306

27. Hoffmann W, Franz JK, Franke WW (1985) Amino acid sequence microheterogeneities of basic (type II) cytokeratins of Xenopus laevis epidermis and evolutionary conservativity of helical and non-helical domains. J Mol Biol 184:713-724

28. Ip W, Hartzer MK, Pang Y-YS, Robson RM (1985) Assembly of vimentin in vitro and its implications concerning the structure of intermediate filaments. J Mol Biol 183:365-375

29. Jackson BW, Grund C, Schmid E, Bürki K, Franke WW, Illmensee K (1980) Formation of cytoskeletal elements during mouse embryogenesis. Differentiation 17:161-179

30. Jackson BW, Grund C, Winter S, Franke WW, Illmensee K (1981) Formation of cytoskeletal elements during mouse embryogenesis. II. Epithelial differentiation and intermediate-sized filaments in early postimplantation embryos. Differentiation 20:203-216

31. Johnson LD, Idler WW, Zhou X-M, Roop DR, Steinert PM (1985) Structure of a gene for the human epidermal 67-kDa keratin. Proc Natl Acad Sci USA 82 : 1896-1900

32. Jonas E, Sargent TD, Dawid IB (1985) Epidermal keratin gene expressed in embryos of Xenopus laevis. Proc Natl Acad Sci USA 82:5413-5417

33. Jorcano JL, Franz JK, Franke WW (1984) Amino acid sequence diversity between bovine epidermal cytokeratin polypeptides of the basic (type II) subfamily as determined from cDNA clones. Differentiation 28:155-163

34. Jorcano JL, Magin TM, Franke WW (1984) Cell type-specific expression of bovine keratin genes as demonstrated by the use of complementary DNA clones. J Mol Biol 176:21-37

35. Jorcano JL, Rieger M, Franz JK, Schiller DL, Moll R, Franke WW (1984) Identification of two types of keratin polypeptides within the acidic cytokeratin subfamily I. J Mol Biol 179:257-281

36. Kim KH, Rheinwald JG, Fuchs EV (1983) Tissue specificity of epithelial keratins: differential expression of mRNAs from two multigene families. Mol Cell Biol 3:495-502

37. Krieg TM, Schafer MP, Cheng CK, Filpula D, Flaherty P, Steinert PM, Roop DR (1985) Organization of a type I keratin gene: Evidence for evolution of intermediate filaments from a common ancestral gene. J Biol Chem 260:5867-5870

38. Lehnert ME, Jorcano JL, Zentgraf H, Blessing M, Franz JK, Franke WW (1984) Characterization of bovine keratin genes: similarities of exon patterns in genes coding for different keratins. EMBO J 3:3279-3287

39. Luening B, Nilsson U (1983) Sequence homology between tissue polypeptide antigen (TPA) and intermediate filament (IF) proteins. Acta Chem Scand [B] 37:731-733

40. Luening B, Wiklund B, Redelius P, Bjorklund B (1980) Biochemical properties of tissue polypeptide antigen. Biochim Biophys Acta 624:90-101

41. Magin TM, Jorcano JL, Franke WW (1983) Translational products of mRNAs coding for non-epidermal cytokeratins. EMBO J 2:1387-1392

42. Maniatis T, Fritsch EF, Sambrook J (1982) Molecular cloning. A laboratory manual. Cold Spring Harbor Laboratory, New York

43. Marchuk D, McCrohon S, Fuchs E (1984) Complete sequence of a gene encoding a human type I keratin: Sequences homologous to enhancer elements in the regulatory region of the gene. Cell 39:491-498

44. Maxam AM, Gilbert W (1980) Sequencing end-labeled DNA with base-specific chemical cleavages. Methods Enzymol 65:499-560

45. Milstone LM (1981) Isolation and characterization of two polypeptides that form intermediate filaments in bovine esophageal epithelium. J Cell Biol 88:317-322

46. Moll R, Franke WW (1982) Intermediate filaments and their interaction with membranes. The desmosome-cytokeratin filament complex and epithelial differentiation. Pathol Res Pract 175:146-161

47. Moll R, Franke WW, Schiller DL, Geiger B, Krepler R (1982) The catalog of human cytokeratin polypeptides: patterns of expression of specific cytokeratins in normal epithelia, tumors, and cultured cells. Cell 31:11-24

48. Osborn M, Weber K (1983) Tumor diagnosis by intermediate filament typing: a novel tool for surgical pathology. Lab Invest 48:372-394

49. Oshima RG (1981) Identification and immunoprecipitation of cytoskeletal proteins from murine extra-embryonic endodermal cells. J Biol Chem 256:8124-8133

50. Quax W, Egberts VW, Hendriks W, Quax-Jeuken Y, Bloemendal H (1983) The structure of the vimentin gene. Cell 36:215-223

51. Quax-Jeuken Y, Quax W, Bloemendal H (1983) Primary and secondary structures of hamster vimentin predicted from the nucleotide sequence. Proc Natl Acad Sci USA 80:3548-3552

52. Quinlan RA, Cohlberg JA, Schiller DL, Hatzfeld M, Franke WW (1984) Heterotypic tetramer (A2D2) complexes of non-epidermal keratins isolated from cytoskeletons of rat hepatocytes and hepatoma cells. J Mol Biol 178:365-388

53. Redelius P, Luening B, Bjoerklund B (1980) Chemical studies of tissue polypeptide antigen (TPA). II. Partial amino acid sequences of cyanogen bromide fragments of TPA subunit B1. Acta Chem Scand [B] 34:265-273

54. Rieger M, Jorcano JL, Franke WW (1985) Complete sequence of a bovine type I cytokeratin gene: conserved and variable intron positions in genes of polypeptides of the same cytokeratin subfamily. EMBO J 4:2261-2267

55. Romano V, Hatzfeld M, Magin TM, Franke WW, Maier G, Ponstingl H (1986) Cytokeratin expression in simple epithelia. I. Identification of mRNA coding for human cytokeratin no. 18 by a cDNA clone. Differentiation (in press)

56. Sanger F, Nicklen S, Coulson AR (1977) DNA sequencing with chain terminating inhibitors. Proc Natl Acad Sci USA 74:5463-5467

57. Schiller DL, Franke WW (1983) Limited proteolysis of cytokeratin A by an endogenous protease: removal of positively charged terminal sequences. Cell Biol Int Rep 7:3

58. Schiller DL, Franke WW, Geiger B (1982) A subfamily of relatively large and basic cytokeratin polypeptides as defined by peptide mapping is represented by one or several polypeptides in epithelial cells. EMBO J 1:761-769

59. Schmid E, Schiller DL, Grund C, Stadler J, Franke WW (1983) Tissue type specific expression of intermediate filament proteins in a cultured epithelial cell line from bovine mammary gland. J Cell Biol 96:37-50

60. Steinert PM, Rice RH, Roop DR, Trus BL, Steven AC (1983) Complete amino acid sequence of a mouse epidermal keratin subunit and implications for the structure of intermediate filaments. Nature 302:794-800

61. Steinert PM, Parry DAD, Racoosin EL, Idler WW, Steven AC, Trus BL, Roop DR (1984) The complete cDNA and deduced amino acid sequence of a type II mouse epidermal keratin of 60,000 Da: Analysis of sequence differences between type I and type II keratins. Proc Natl Acad Sci USA 81:5709-5713

62. Steinert PM, Parry DAD, Idler WW, Johnson LD, Steven AC, Roop DR (1985) Amino acid sequences of mouse and human epidermal type II keratins of Mr 67,000 provide a systematic basis for the structural and functional diversity of the end domains of keratin intermediate filament subunits. J Biol Chem 260:7142-7149

63. Steinert PM, Steven AC, Roop DR (1985) The molecular biology of intermediate filaments. Cell 42:411-419

64. Summerhayes IC, Chen LB (1982) Localization of a Mr 52,000 keratin in basal epithelial cells of the mouse bladder and expression throughout neoplastic progression. Cancer Res 42:4098-4109

65. Sun T-T, Eichner R, Schermer A, Cooper D, Nelson WG, Weiss RA (1984) Classification, expression and possible mechanisms of evolution of mammalian epithelial keratins: a unifying model. In: Levine A, VandeWoude GF, Topp WC, Watson JD (eds). The transformed phenotype, cancer cells, vol 1. Cold Spring Harbor Laboratory, Cold Spring Harbor, pp 169-176

66. Tseng SCG, Jarvinen MJ, Nelson WG, Huang J-W, Woodcock-Mitchell J, Sun T-T (1982) Correlation of specific keratins with different types of epithelial differentiation: monoclonal antibody studies. Cell 30:361-372

67. Tyner AL, Eichman MJ, Fuchs E (1985) The sequence of a type II keratin gene expressed in human skin: Conservation of structure among all intermediate filament genes. Proc Natl Acad Sci USA 82:4683-4687

68. Vasseur M, Duprey P, Brûlet P, Jacob F (1985) One gene and one pseudogene for the cytokeratin endo A. Proc Natl Acad Sci USA 82:1155-1159

69. Vieira J, Messing J (1982) The pUC plasmids, an M13mp7-derived system for insertion mutagenesis and sequencing with synthetic universal primers. Gene 19:259-268

70. Ward WS, Schmidt WN, Schmidt CA, Hnilica LS (1985) Cross-linking of Novikoff ascites hepatoma cytokeratin filaments. Biochemistry 24:4429-4434

71. Weber K, Geisler N (1984) Intermediate filaments from wool α-keratins to neurofilaments: a structural overview. In: Levine A, VandeWoude GF, Topp WC, Watson JD (eds). The transformed phenotype, cancer cells, vol 1. Cold Spring Harbor Laboratory, Cold Spring Harbor, pp 153-159

72. Weber K, Osborn M, Moll R, Wiklund B, Luening B (1984) Tissue polypeptide antigen (TPA) is related to the non-epidermal keratins 8, 18 and 19 typical of simple and non-squamous epithelia: Re-evaluation of a human tumor marker. EMBO J 3:2707-2714

73. Woods EF (1983) The number of polypeptide chains in the rod domain of bovine epidermal keratin. Biochemistry Int 7:769-774

74. Wu Y-J, Parker LM, Binder NE, Beckett MA, Sinard JH, Griffiths CT, Rheinwald JG (1982) The mesothelial keratins: a new family of cytoskeletal proteins identified in cultured mesothelial cells and nonkeratinizing epithelia. Cell 31:693-703

75. Zackroff RV, Goldman AE, Jones JCR, Steinert PM, Goldman RD (1984) The isolation and characterization of keratin-like proteins from cultured cells with fibroblastic morphology. J Cell Biol 98:1231-1237

Received November 1985 / Accepted November 30, 1985