

Contents lists available at ScienceDirect

Insect Biochemistry and Molecular Biology

journal homepage: www.elsevier.com/locate/ibmb

Insect Biochemistry and Molecular Biology 38 (2008) 1138–1146

Genome-wide identification of cuticular protein genes in the silkworm, Bombyx mori

Ryo Futahashi a,b, Shun Okamoto a, Hideki Kawasaki c, Yang-Sheng Zhong c,1, Masashi Iwanaga c, Kazuei Mita b, Haruhiko Fujiwara a,*

a Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Bioscience Building 501, Kashiwa, Chiba 277-8562, Japan
b National Institute of Agrobiological Sciences, Owashi 1-2, Tsukuba 305-8643, Japan
c Faculty of Agriculture, Utsunomiya University, 350 Mine, Utsunomiya, Tochigi 321-8505, Japan

Article info

Article history:
Received 31 October 2007
Received in revised form 1 April 2008
Accepted 6 May 2008

Keywords:
Bombyx mori
Cuticle
Cuticular protein
R&R consensus
EST library
Whole genome sequence

* Corresponding author. Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Bioscience Building 501, Kashiwa, Chiba 277-8562, Japan. Tel.: +81 47136 3659; fax: +81 47136 3660.

E-mail address: [email protected] (H. Fujiwara).

1 Present address: Department of Sericulture, South China Agricultural University, China.

0965-1748/$ – see front matter © 2008 Elsevier Ltd. All rights reserved.
doi:10.1016/j.ibmb.2008.05.007

Abstract

Many kinds of cuticular proteins are found in a single insect species, and their numbers and features are diversified among insects. Because there are so many cuticular proteins and so much sequence variation among them, an overview of cuticular protein genes is needed. Recently, a complete silkworm genome sequence was obtained through the integration of data from two whole genome sequence projects performed independently in 2004. To identify cuticular protein genes in the silkworm Bombyx mori exhaustively, we searched both the Bombyx whole genome sequence and various EST libraries, and found 220 putative cuticular protein genes. We also revised the annotation of the gene model, and named each identified cuticular protein based on its motif. The phylogenetic tree of cuticular protein genes among B. mori, Drosophila melanogaster, and Apis mellifera revealed that duplicate cuticular protein clusters have evolved independently among insects. Comparison of EST libraries and northern blot analyses showed that the tissue- and stage-specific expression of each gene was intricately regulated, even between adjacent genes in the same gene cluster. This study reveals many novel cuticular protein genes as well as insights into cuticular protein gene regulation.

© 2008 Elsevier Ltd. All rights reserved.

1. Introduction

Insect cuticle is composed of many kinds of cuticular proteins together with chitin. The numbers and features of cuticular proteins are diversified among insects, whereas chitin is a uniform polymer of N-acetylglucosamine. Because a majority of excreted cuticle components are cross-linked and therefore inextractable (Andersen et al., 1995), and because the amino acid sequences of cuticular protein genes are not well conserved among insects, the overall picture of cuticular protein genes remains largely unknown. The majority of cuticular proteins have the Rebers and Riddiford Consensus (R&R Consensus), which in an extended form is known to bind chitin (Rebers and Willis, 2001; Togawa et al., 2004; Willis et al., 2005). Proteins with the R&R Consensus can be split into three groups, RR-1, RR-2, and RR-3, with some correlation to the type or region of the cuticle. Recently, other motifs of cuticular proteins have been reported. In Drosophila melanogaster, the Tweedle motif was found by identification of a body shape mutant (Guan et al., 2006). Because Tweedle proteins are predicted to form β-strands, and because a barrel structure formed by multiple β-strands provides an interface for aromatic residues to stack with and bind to chitin (Iconomidou et al., 1999; Hamodrakas et al., 2002), studies have postulated that Tweedle proteins interact directly with chitin (Guan et al., 2006). A motif of 51 amino acids was described (Andersen et al., 1997), but more recently Togawa et al. (2007) found that when more species were examined, the conserved motif was no more than 44 amino acids (cuticular protein with a 44 amino acid motif, CPF). Two proteins with this motif did not bind chitin in their assay. They also reported CPF-like proteins (CPFL), which lack the conserved 44 aa residues but have similar C-terminal regions. It is also known that some cuticular proteins do not possess these motifs, and cuticular proteins devoid of the above-mentioned motifs were also isolated and sequenced directly from cuticles (Andersen et al., 1995; Willis et al., 2005; He et al., 2007).

Recently, comprehensive identifications of cuticular proteins with an R&R motif have been attempted in Drosophila melanogaster (Karouzou et al., 2007) and Apis mellifera (Honeybee Genome Sequencing Consortium, 2006) based on their genome sequences. In the studied genomes, 101 and 28 genes with the R&R motif were found, respectively. These studies imply that the composition of cuticular protein genes varies among insect taxa. In 2004, Japanese and Chinese groups independently published the Bombyx mori genome draft sequences (Mita et al., 2004; Xia et al., 2004). Recently, these two data sets were merged and assembled through collaboration between China and Japan (The International Silkworm Genome Sequencing Consortium, in preparation), which has helped us to screen for genes of interest on a genome-wide scale. The consortium also constructed a gene model, which consists of all the genes predicted by the gene finder BGF (Li et al., 2005) with classifiable transposable elements pre-filtered. Furthermore, Bombyx is suitable for the study of tissue specificity because its tissues are relatively large, and it is easy to construct a tissue-specific cDNA library. Many EST libraries from various tissues have now been constructed (Mita et al., 2003; Kawasaki et al., 2004; Kinjoh et al., 2007; Okamoto et al., in press).

In the silkworm, B. mori, 28 cuticular protein genes had already been reported before the two genome data sets were merged (Supplementary Table 1; Nakato et al., 1990, 1997; Takeda et al., 2001; Suzuki et al., 2002; Sawada et al., 2003; Noji et al., 2003; Zhong et al., 2006; Togawa et al., 2007). Here we performed a genome-wide screen for cuticular protein genes in the silkworm, B. mori. To identify cuticular protein genes exhaustively, we searched both the Bombyx whole genome sequence and EST libraries using not only the known motifs described above, but also the following criteria: (1) N-terminal signal peptide, (2) simple repeat sequence (GGX or AAP(A/V)), and (3) sequence similarity to known cuticular proteins. We found 220 putative cuticular protein genes (RR-1 56, RR-2 89, RR-3 3, Tweedle 4, CPF 1, CPFL 4, glycine-rich 29, and 34 other genes). We revised the annotation of the gene model in The International Silkworm Genome Sequencing Consortium (in preparation) and named each identified cuticular protein based on its motif. Phylogenetic analysis of RR-1 and RR-2 proteins among B. mori, D. melanogaster, and A. mellifera suggested that duplicate cuticular protein clusters have evolved independently among insect taxa. Comparison of EST libraries revealed that motif differences correlate with tissue specificity. However, the expression of each cuticular protein gene did not correlate with chromosomal location and often differed between adjacent genes. Through northern blot analysis we also found that stage-specific expression of cuticular protein genes in the wing disc varied between adjacent genes. These results indicate that expression of each cuticular protein is regulated intricately, even within the same cluster.

2. Materials and methods

2.1. Prediction of cuticular protein genes on a genome-wide scale

First, we identified genes that might code for cuticular proteins with the following known motifs: the R&R Consensus (Rebers and Riddiford, 1988; Iconomidou et al., 1999), the Tweedle motif (Guan et al., 2006), and the 44 aa residues (Togawa et al., 2007). We ran tBLASTn searches against both the whole genome sequence (The International Silkworm Genome Sequencing Consortium, in preparation) and various EST libraries (Mita et al., 2003; Kinjoh et al., 2007; Okamoto et al., in press). The R&R Consensus was also confirmed using a tool based on profile hidden Markov models (HMMs) on the cuticleDB website <http://bioinformatics2.biol.uoa.gr/cuticleDB/index.jsp> (Karouzou et al., 2007), which is capable of discriminating between RR-1 and RR-2 cuticular proteins. In addition to these known motifs, we also predicted putative cuticular protein genes using the following criteria: (1) N-terminal signal peptide, (2) simple repeat sequence (GGX or AAP(A/V)), and (3) sequence similarity to known cuticular proteins.
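The simple-repeat criterion (2) can be illustrated with a short pattern scan. The following is a minimal sketch, not the procedure used in this study: the function name, the reading of "X" as any residue, and the minimum of three consecutive GGX units are assumptions made for illustration. Signal peptide prediction and similarity searches (criteria 1 and 3) need dedicated tools and are not covered.

```python
import re

# Hypothetical patterns for the two simple repeats named in the text.
# "X" in GGX is read as any amino acid; requiring at least three
# consecutive GGX units is an assumption of this sketch.
GGX_REPEAT = re.compile(r"(?:GG[A-Z]){3,}")
AAPAV_MOTIF = re.compile(r"AAP[AV]")


def simple_repeat_hits(protein: str) -> dict:
    """Report which simple-repeat criteria a protein sequence meets."""
    seq = protein.upper()
    return {
        # True if a run of three or more GGX units is present.
        "ggx_repeat": GGX_REPEAT.search(seq) is not None,
        # Number of non-overlapping AAPA / AAPV occurrences.
        "aapav_count": len(AAPAV_MOTIF.findall(seq)),
    }
```

A sequence flagged by such a scan would still need to pass the other criteria before being treated as a putative cuticular protein.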

2.2. Naming protocol

We adopted a simple nomenclature based on motif or characteristics, such as BmorCPT1 (B. mori cuticular protein Tweedle motif 1). Naming followed that used for other insect species such as D. melanogaster (Karouzou et al., 2007), Anopheles gambiae (He et al., 2007), and A. mellifera [cuticleDB website <http://bioinformatics2.biol.uoa.gr/cuticleDB/index.jsp>; Magkrioti et al., 2004]. For each type of cuticular protein except CPG, we numbered the genes in the order in which they appear on the chromosome. As for CPG, because CPG1 has already been reported (Suzuki et al., 2002), we first numbered the CPG genes in the gene cluster that includes CPG1 (see Supplementary Table 1). Twenty-eight Bombyx cuticular protein genes have been published previously or are available in the NCBI database. (Most of these proteins are found on the cuticleDB website described above.) We searched for the genomic location of these 28 previously identified genes. In 26 cases (all except BmLCP18 and BMWCP11), the genomic sequence was not identical to the previously published sequence (see Supplementary Table 4). For example, the amino acid sequences are not identical between CPFLa1 and BmorCPFL3, or between CPFLb and BmorCPFL4. The Bombyx genome project used the p50 (Dazao) strain (Mita et al., 2004; Xia et al., 2004); however, the p50 strain was not used to identify the cuticular protein genes in the previous studies (see Supplementary Table 4). Because cuticular protein genes are often very similar and may code for proteins with identical amino acid sequences (Takeda et al., 2001; He et al., 2007; Karouzou et al., 2007), we could not conclude that the previously reported genes are the same as the genes identified in this study using the whole genome sequence. Takeda et al. (2001) reported two sets of very similar genes, (BMWCP1a and BMWCP1b) and (BMWCP7a and BMWCP7b). We found that two different genomic regions corresponded to BMWCP1a and BMWCP1b, but the same genomic region corresponded to BMWCP7a and BMWCP7b. Moreover, the amino acid sequences deduced from the genome sequence are not identical to those of BMWCP1a, BMWCP1b, BMWCP7a, and BMWCP7b. In D. melanogaster, the copy number of cuticular protein genes at band 65A varied even among strains (Charles et al., 1997), suggesting that gene duplication may occur even among strains. Therefore, we named all cuticular protein genes found in the genome sequence to avoid confusion arising from strain differences. The corresponding genome regions (most similar genes) of known genes are denoted in Supplementary Table 1. The Bombyx strains used in the previous studies identifying Bombyx cuticular proteins and in each EST library are shown in Supplementary Tables 4 and 5.
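The numbering rule (genes of each type numbered in chromosomal order) can be sketched as follows. This is an illustration under assumptions: the tuple layout, the gene identifiers, and the coordinates are hypothetical, and the special handling of CPG described above is not implemented.

```python
from collections import defaultdict


def assign_names(genes):
    """Name genes per family in chromosomal order.

    genes: list of (gene_id, family_code, chromosome, start) tuples,
    where family_code is e.g. "R" (R&R) or "T" (Tweedle). The layout is
    hypothetical and chosen only for this sketch.
    """
    counters = defaultdict(int)
    names = {}
    # Walk the genome in order: chromosome first, then start position.
    for gene_id, family, _chrom, _start in sorted(genes, key=lambda g: (g[2], g[3])):
        counters[family] += 1
        names[gene_id] = f"BmorCP{family}{counters[family]}"
    return names
```

With made-up coordinates, three "R" genes on chromosomes 17, 17, and 22 would receive BmorCPR1 to BmorCPR3 in that order, independently of any "T" genes interleaved among them.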

2.3. Phylogenetic analysis

Sequences were aligned using Clustal_X (Thompson et al., 1997). Phylogenetic trees were constructed by the neighbor-joining method with the MEGA4 program (Tamura et al., 2007). The confidence of the various phylogenetic lineages was assessed by bootstrap analysis. Amino acid sequences of cuticular protein genes of D. melanogaster and A. mellifera were obtained from the cuticleDB website described above.
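For readers unfamiliar with bootstrap analysis, the idea is to resample alignment columns with replacement, recompute distances, and rebuild the tree many times. The sketch below shows only the resampling step and a simple p-distance; it is not MEGA4's implementation, and the function names and the gap-handling rule are assumptions for illustration.

```python
import random


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing residues, skipping positions with a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(a != b for a, b in pairs) / len(pairs)


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Return one bootstrap replicate of an alignment.

    alignment maps sequence names to aligned sequences of equal length.
    Columns are drawn with replacement; the same columns are used for
    every sequence so that the alignment stays consistent.
    """
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}
```

Each replicate would then be passed to a distance calculation and a neighbor-joining step; bootstrap support for a clade is the fraction of replicate trees that contain it.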

2.4. Comparison of EST libraries

In B. mori, various EST libraries have been constructed (Mita et al., 2003; Kawasaki et al., 2004; Kinjoh et al., 2007; Okamoto et al., in press). We searched for the 220 cuticular protein genes in these libraries and counted the total clone numbers. Expression of several genes was also confirmed by reverse transcription-polymerase chain reaction (Okamoto et al., in press). Some libraries were constructed from another strain, and their nucleic acid sequences differ slightly (see Supplementary Table 5). In these cases, we adopted the most similar gene in the genome sequence.
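The clone-count comparison amounts to tallying EST hits per gene and per library and converting the tallies to percentages. A minimal sketch, assuming each EST has already been assigned to its most similar gene; the input layout and the names used are hypothetical.

```python
from collections import Counter


def clone_counts(est_hits):
    """Tally EST clones per gene and per library.

    est_hits: list of (library, gene) pairs, one pair per EST clone.
    """
    per_gene = Counter(gene for _library, gene in est_hits)
    per_library = Counter(library for library, _gene in est_hits)
    return per_gene, per_library


def library_percentages(per_library):
    """Convert per-library clone counts to percentages of the total."""
    total = sum(per_library.values())
    return {library: 100.0 * n / total for library, n in per_library.items()}
```

The same tally, grouped by motif instead of by gene, gives the composition of cuticular protein transcripts within a tissue.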

2.5. Northern blot analysis

Northern hybridization was performed as described in Noji et al. (2003). Hybrids of the N124 and C124 B. mori strains were used. The day of the fourth larval ecdysis (the start of the fifth instar) and the day of wandering were designated V0 and W0, respectively.

3. Results

3.1. Genome-wide screening for cuticular protein genes

First, we searched for putative cuticular protein genes with known motifs such as the R&R Consensus (Rebers and Riddiford, 1988; Iconomidou et al., 1999), the Tweedle motif (Guan et al., 2006), and the 44 aa residues (Togawa et al., 2007). We found 148 genes for cuticular proteins with the R&R Consensus in B. mori, more than in D. melanogaster (101 R&R proteins, Karouzou et al., 2007) or in A. mellifera (28 R&R proteins, Honeybee Genome Sequencing Consortium, 2006). By manual annotation, 56 genes are RR-1 protein genes (BmorCPR1–BmorCPR56; GenBank accession numbers BR000502–BR000557), 89 are RR-2 protein genes (BmorCPR57–BmorCPR145; BR000558–BR000646), and three are classified as RR-3 protein genes (BmorCPR146–BmorCPR148; BR000647–BR000649). Eighty-two of these are found in at least one EST library (Table 1, Supplementary Table 1), 39 of which are not correctly annotated in the gene model (Supplementary Table 1; The International Silkworm Genome Sequencing Consortium, in preparation). We confirmed the classification with an HMM tool on the cuticleDB website (Karouzou et al., 2007). Of these, 53 (all except BmorCPR22, BmorCPR47, and BmorCPR55) and 87 (all except BmorCPR139 and BmorCPR141) are classified as RR-1 and RR-2 proteins, respectively (Supplementary Table 2). BmorCPR22 and BmorCPR55 are similar to Dmell(3)mbn and DmelCpr97Eb, respectively, which also could not be classified using an HMM tool (Karouzou et al., 2007). The other three genes not classified by the HMM tool are also similar to known cuticular protein genes with an R&R Consensus. Through a tBLASTn search, we also found another five genomic regions that corresponded to the R&R Consensus, two of which were classified as RR-2 proteins by the HMM tool in cuticleDB, but these regions could not be annotated (Supplementary Table 2, *1–*5). We did not find the 5′ ends of 11 genes (BmorCPR14, BmorCPR22, BmorCPR35, BmorCPR50, BmorCPR66, BmorCPR80, BmorCPR137, BmorCPR141, BmorCPR142, BmorCPR144, and BmorCPR145), although they were found in an EST library or were predicted by the gene model. These genes should be revised because they lack a signal peptide sequence (Supplementary Table 1).

Table 1
Numbers of cuticle protein genes and total ESTs.

Motif                    Number of genes a    Total ESTs
RR-1                     56 (34)              2859
RR-2                     89 (46)              320
RR-3                     3 (2)                88
Tweedle                  4 (4)                433
44 aa Residues (CPF)     1 (0)                0
CPF-like (CPFL)          4 (4)                73
Glycine-rich             29 (29)              2246
Others                   34 (32)              527
Total                    220 (151)            6546

a Numbers of genes that are found in at least one EST library are shown in parentheses.
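The totals in Table 1 can be checked by summing the rows. The short sketch below transcribes the table values as printed; the dictionary layout and function name are chosen only for this illustration.

```python
# Table 1 rows: motif -> (genes, genes found in >= 1 EST library, total ESTs).
TABLE_1 = {
    "RR-1": (56, 34, 2859),
    "RR-2": (89, 46, 320),
    "RR-3": (3, 2, 88),
    "Tweedle": (4, 4, 433),
    "CPF": (1, 0, 0),
    "CPFL": (4, 4, 73),
    "Glycine-rich": (29, 29, 2246),
    "Others": (34, 32, 527),
}


def column_totals(table):
    """Sum each column of the table across all motif rows."""
    return tuple(sum(column) for column in zip(*table.values()))


# The three R&R rows together give the number of R&R Consensus genes.
rr_genes = sum(TABLE_1[motif][0] for motif in ("RR-1", "RR-2", "RR-3"))
```

The column sums reproduce the Total row (220 genes, 151 with EST support, 6546 ESTs), and the three R&R rows sum to the 148 R&R Consensus genes reported in the text.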

We have identified four genes with a Tweedle motif (BmorCPT1–BmorCPT4; BR000650–BR000653), all of which are found in at least one EST library. They do not form any cluster (Supplementary Table 1, The International Silkworm Genome Sequencing Consortium, in preparation). Two of these (BmorCPT1 and BmorCPT4) are not correctly annotated in the gene model (Supplementary Table 1; The International Silkworm Genome Sequencing Consortium, in preparation).

We have identified only one gene with the diagnostic 44 aa consensus (BmorCPF; BR000417), and it is not found in any EST library. We found four genes that code for CPFL proteins (BmorCPFL1–BmorCPFL4; BR000418–BR000421), each of which was found in at least one EST library. BmorCPFL1 is not predicted and BmorCPFL3 is not correctly annotated in the gene model (Supplementary Table 1; The International Silkworm Genome Sequencing Consortium, in preparation). BmorCPFL1 formed a cluster with three other cuticular protein genes (BmorCPH2–BmorCPH4), and BmorCPFL2–BmorCPFL4 formed a cluster with two other cuticular protein genes (BmorCPG22–BmorCPG23). The amino acid sequences of BmorCPH4, BmorCPH12, BmorCPH13, BmorCPH16, BmorCPH25, BmorCPH34, and BmCPG4 also have some similarity to CPF or CPFL; however, we could not conclude that these genes belong to CPFL.

Some cuticular proteins do not have the motifs described above. Andersen et al. (1995) predicted that GG repeats would form turns, and some cuticular proteins contain glycine-rich regions, including GG repeats (Charles et al., 1992; Suzuki et al., 2002; Krogh et al., 1995; Zhong et al., 2006). We found 29 putative cuticular protein genes that included GGYGG/GGXGG repeats and labeled these genes as CPG (cuticular protein glycine-rich, BmorCPG1–BmorCPG29; BR000422–BR000450). Some glycine-rich cuticular proteins also have the R&R Consensus or the Tweedle motif (e.g., BmorCPR140 and BmorCPT2). We gave preference to these known motifs over glycine-rich characteristics, and termed these proteins "CPR" or "CPT", respectively.

To find the other types of cuticle proteins described above, we searched the genes found in various EST libraries or predicted by the gene model (The International Silkworm Genome Sequencing Consortium, in preparation). We also focused on genes that have no similarity to known proteins. Among these, several genes have both simple repeat sequences such as AAP(A/V) and an N-terminal signal peptide, and most of these genes form gene clusters. The AAP(A/V) motif is one of the characteristics of cuticle proteins (Magkrioti et al., 2004), and this motif often coexists with another motif such as the R&R Consensus or the 44 aa residues. However, some genes lack this motif even within the same gene cluster (e.g., BmorCPH16–BmorCPH19 and BmorCPH21–BmorCPH23 have the AAP(A/V) motif but BmorCPH20 does not have AAP(A/V) repeats, although BmorCPH20 has sequence similarity to the known cuticular protein AgamCPR124 of A. gambiae). We judged 34 genes to be putative cuticle protein genes (Supplementary Table 3). All of these genes have a signal peptide, and have either sequence similarity to a known cuticle protein gene or the AAP(A/V) motif, or both. We labeled these other 34 genes as CPH (cuticular protein hypothetical, BmorCPH1–BmorCPH34; BR000451–BR000454, BR000457, BR000461–BR000478, BR000486–BR000488, BR000490, BR000493–BR000497, BR000500–BR000501). BmorCPH14, BmorCPH15, BmorCPH16, BmorCPH30, and BmorCPH31 have 18-residue motifs (Supplementary Table 6), another motif found in cuticular proteins (Andersen et al., 1995; Willis et al., 2005). Although an 18-residue motif is also found in RR-3 proteins (Andersen, 2000; Willis et al., 2005), we did not find this motif in Bombyx RR proteins. Several CPH proteins have sequence similarity to known cuticular proteins with the R&R Consensus (Supplementary Table 3), although they were not judged to be CPR by the tool based on profile hidden Markov models (HMMs) on the cuticleDB website. Further study is needed to categorize the CPH genes.


Overall, we predicted 220 putative cuticular protein genes, 151 of which are found in at least one EST library (Table 1). The genomic locations of these genes are denoted in Supplementary Table 1.

3.2. Phylogenetic analysis with other insect species

To compare the Bombyx cuticular proteins with the Drosophila and Apis cuticular proteins, a neighbor-joining tree was drawn using the MEGA4 program (Tamura et al., 2007). Because almost all RR-1 and RR-2 proteins formed a distinct clade in D. melanogaster (Karouzou et al., 2007), we drew the neighbor-joining trees of RR-1 and RR-2 proteins separately (Figs. 1 and 2). Although the numbers of RR-1 proteins are similar between Drosophila and Bombyx (55 and 56, respectively), gene clades were found within each species and did not correspond directly (Fig. 1). Many genes formed clades within a single species, especially among RR-2 proteins. For the RR-1 protein genes, BmorCPR7–BmorCPR9, BmorCPR13, 17, 19, and BmorCPR27–BmorCPR29 formed Bombyx-specific clades (Fig. 1). All of these genes belong to the same gene clusters on chromosomes 17, 19, and 22, respectively (Supplementary Table 1). For the RR-2 protein genes, many genes formed Bombyx-specific clades (Fig. 2), and most of them belong to the largest cluster on chromosome 22 (BmorCPR79–BmorCPR130). Gene clades were also found in A. mellifera (e.g., AmelCPR9–AmelCPR11 in Fig. 2), which has few R&R proteins (Honeybee Genome Sequencing Consortium, 2006). All RR-1 and RR-2 protein genes except BmorCPR10 have at least one intron, and the exon/intron structures of genes belonging to the same gene cluster are similar to each other (Supplementary Table 1). These results suggest that gene clusters of cuticular proteins evolve independently among insect taxa.

Fig. 1. Neighbor-joining tree of annotated RR-1 proteins among B. mori, D. melanogaster, and A. mellifera. Only the extended consensus region was used, as in Karouzou et al. (2007). Three genes not classified by an HMM tool (BmorCPR22, BmorCPR47, and BmorCPR55) are omitted. The tree was generated with MEGA4 (Tamura et al., 2007). Bootstrap support is based on 1000 resampled data sets. The tree is condensed to show only branches with 20% bootstrap support or higher. Symbols after gene names indicate species: open circles = D. melanogaster; gray circles = A. mellifera; and solid circles = B. mori. Brackets indicate species-specific clades (made from three or more genes), and the chromosomal location of Bombyx genes is shown in parentheses.

Fig. 2. Neighbor-joining tree of annotated RR-2 proteins among B. mori, D. melanogaster, and A. mellifera (DmelCpr73D omitted, as in Karouzou et al., 2007). Only the extended consensus region was used, as in Karouzou et al. (2007). Two genes not classified by an HMM tool (BmorCPR139 and BmorCPR141) are omitted. The tree was generated with MEGA4 (Tamura et al., 2007). Bootstrap support is based on 1000 resampled data sets. The tree is condensed to show only branches with 20% bootstrap support or higher. Symbols after gene names indicate species: open circles = D. melanogaster; gray circles = A. mellifera; and solid circles = B. mori. Brackets indicate species-specific clades (made from three or more genes), and the chromosomal location of Bombyx genes is shown in parentheses.

3.3. Tissue specificity of cuticular proteins, comparing EST libraries

Next, we investigated the tissue specificity of each cuticular protein by comparing the EST libraries. A total of 6546 ESTs were found, corresponding to 151 distinct genes (Table 1; Supplementary Table 1). Overall, most cuticular protein transcripts were found in the epidermis and wing disc EST libraries (Fig. 3A, B), especially in the EST library of epidermis at the fourth molt (Okamoto et al., in press). The tissue specificity was very different for each motif (Fig. 3A, B). Transcripts corresponding to RR-1 proteins were found mainly in the epidermis and pheromone gland, and only partially in the wing disc (Fig. 3A, B). In contrast, transcripts corresponding to RR-3 proteins were found mainly in the pheromone gland, Tweedle transcripts were mainly in the wing disc, and CPFL transcripts were mainly in the compound eyes (Fig. 3A, B). Some cuticular protein transcripts were also found in internal organs such as the ovary, brain, and posterior silk gland. Transcripts corresponding to RR-2 proteins were widely distributed and were also highly expressed in the ovary (Fig. 3A). Gu and Willis (2003) reported that the type of cuticular protein transcript was altered at the spinning stage. They revealed that imaginal discs from feeding larvae had abundant mRNAs for RR-1 cuticular proteins, and only discs from spinning larvae had mRNAs that coded for RR-2 proteins. We therefore compared the percentages of total clone numbers of each motif in wing discs both before and after spinning. We found that transcripts corresponding to Tweedle and CPFL proteins as well as RR-2 proteins were found only after spinning, whereas most transcripts in young larval discs corresponded to RR-1 and glycine-rich proteins, which is similar to the larval epidermis (Fig. 3B). We also analyzed the tissue specificity of each cuticular protein transcript (Supplementary Table 1). Tissue specificity did not correlate with chromosomal location. The expression patterns of cuticle genes within clusters differed between adjacent genes (e.g., BmorCPR90–BmorCPR92). These results suggest that the components of cuticular protein differ between tissues, and that tissue-specific expression of each cuticular protein is regulated intricately.

3.4. Stage specificity of the cuticular protein in the wing disc

So far, we have reported four groups of cuticular protein genes having different developmental expression profiles and hormonal responses from the wing discs of the fifth larval instar of B. mori (Takeda et al., 2001; Noji et al., 2003; Zhong et al., 2006). Recently, we isolated a novel cuticular protein gene from the EST database of N124 and C124 B. mori strains, with reference to the arrangement of Gu and Willis (2003), and named it BMWCP11 (GenBank accession number AB236661). BMWCP11 is identical to BmorCPR46 in the genome sequence of the p50 strain.

Transcripts of BmorCPR46 were identified by Northern blot analysis (Fig. 4A). The size of the transcripts was estimated to be 0.6 kb. The larger transcripts are suggested to be precursors. Transcripts were observed from the beginning of the fifth larval stage. They then increased and remained present during the feeding stage, and strong signals were detected from 1 to 4 days after the fourth ecdysis. Weak signals were observed after wandering; they then disappeared at 2 days after the wandering stage. The developmental profile of BmorCPR46 was quite different from that of other cuticular protein genes reported previously. The comparison is shown in Fig. 4B. The genes are grouped into five divisions, with different expression profiles during the fifth larval instar as well as the pupal stage. Among them, BmorCPR45 and BmorCPR46 are adjacent genes in the same cluster on chromosome 22, but the expression profiles of these two genes were quite different. BmorCPR95 and BmorCPR99 are located between BmorCPR93 and BmorCPR103 in the same cluster on chromosome 22, yet the expression patterns of BmorCPR93 and BmorCPR103 were very similar to each other but different from those of BmorCPR95 and BmorCPR99. These results suggest that the stage specificity of each cuticular protein is also intricately regulated within a cluster.

Fig. 3. (A) Tissue specificity of each cuticle protein family. EST data sets derived from cDNA libraries of various Bombyx tissues are compared (Mita et al., 2003; Kawasaki et al., 2004; Kinjoh et al., 2007; Okamoto et al., in press). The percentages of each library are shown. The numbers of total ESTs are shown in parentheses. (B) Composition of cuticle protein genes in each tissue. The percentages of total clone numbers of each motif are shown. The numbers of total ESTs are shown in parentheses.

4. Discussion

4.1. Genome-wide identification of Bombyx cuticular protein genes

Fig. 4. (A) Expression profile of BmorCPR46 (BMWCP11) by Northern blot analysis. The numbers and arrows indicate the size of the transcripts (kb). Total RNA (15 μg) was separated by 1% agarose-formaldehyde gel electrophoresis. The bottom panel shows ethidium bromide-stained ribosomal RNA as a control for RNA loading. Expression profile of BmorCPR46 in wing discs during the fourth molting and wandering stage by Northern blot analysis. The top panel indicates the stage of wing discs (V0–W3). The RNAs of each stage are from the same source and showed similar electrophoretic patterns. (B) Schematic comparison of the developmental profiles of five groups of cuticular protein genes isolated in wing discs in B. mori. The relative expression patterns of five genes are shown with the hemolymph ecdysteroid titer (Kiguchi et al., 1985; Kawasaki, 1998). V0, day of the fourth ecdysis; W0, day of wandering; P0, day of pupation; and A0, day of adult eclosion. The synonym of each cuticular protein gene is also shown in parentheses.

In this study, we identified 220 putative cuticular protein genes in Bombyx genome sequences (Table 1). Of these, 10 genes were not identified by the gene model (The International Silkworm Genome Sequencing Consortium, in preparation), and 57 genes were not correctly annotated. We revised these gene structures and named the genes (Supplementary Table 1). The expression of 151 genes was compared among various tissues by searching EST libraries (Fig. 3; Supplementary Table 1). We used several criteria to identify a gene as encoding a cuticular protein: (1) known cuticular protein consensus (e.g., R&R Consensus), (2) N-terminal signal peptide, (3) simple repeat sequence (GGX or AAP(A/V)), and (4) sequence similarity to known cuticular proteins. We also found 29 putative cuticular proteins with glycine-rich repeats (CPG) and 34 other hypothetical cuticular proteins (CPH).
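Criterion (3) in the list of identification criteria, the simple repeat sequences GGX and AAP(A/V), lends itself to a short illustration. The following Python sketch is only an illustration and not the procedure used in this study: the function names and the threshold of three repeat units (`min_count`) are assumptions introduced here, and X in GGX is taken to mean any residue.

```python
import re

# Simple repeat units named in the text; X in GGX is assumed to be any residue.
GGX = re.compile(r"GG[A-Z]")
AAP_AV = re.compile(r"AAP[AV]")


def count_repeats(seq):
    """Count non-overlapping GGX and AAP(A/V) units in a protein sequence."""
    seq = seq.upper()
    return {
        "GGX": len(GGX.findall(seq)),
        "AAP(A/V)": len(AAP_AV.findall(seq)),
    }


def has_simple_repeat(seq, min_count=3):
    """Flag a sequence carrying at least min_count units of either repeat.

    min_count is a hypothetical threshold chosen for illustration only.
    """
    return any(n >= min_count for n in count_repeats(seq).values())
```

A screen of this kind would only be one filter among the four criteria; a candidate would still need to be checked for a signal peptide and for similarity to known cuticular proteins.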

One of the conspicuous traits of the cuticular protein family is that 80% of cuticular proteins form gene clusters (Supplementary Table 1; The International Silkworm Genome Sequencing Consortium, in preparation). We found gene clusters of RR-1, RR-2, CPFL, CPG, and CPH proteins. The exception was the Tweedle protein, which was in contrast with Drosophila, where Tweedle proteins form large clusters (Guan et al., 2006). In the Bombyx genome, there was only one CPF gene, which was also in contrast to Drosophila, Anopheles, and Apis, which have three or four CPF genes (Togawa et al., 2007). These observations indicate that the cluster structures of cuticular protein genes with each motif vary among insect species.
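The proportion of genes that lie in clusters can be computed from gene coordinates once a clustering rule is fixed. The sketch below is a minimal illustration under stated assumptions: the text does not define the rule used, so the 50 kb maximum gap between neighbouring genes (`max_gap`), the tuple layout, and the function names are all hypothetical.

```python
from itertools import groupby


def find_clusters(genes, max_gap=50_000):
    """Group genes on the same chromosome whose neighbours lie within max_gap bp.

    genes: iterable of (name, chromosome, start) tuples.
    max_gap is an assumed threshold, not a value taken from the paper.
    Returns a list of clusters, each a list of two or more gene tuples.
    """
    clusters = []
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    for _, chrom_genes in groupby(ordered, key=lambda g: g[1]):
        current = []
        for gene in chrom_genes:
            # Close the running group when the gap to the previous gene is too large.
            if current and gene[2] - current[-1][2] > max_gap:
                if len(current) > 1:
                    clusters.append(current)
                current = []
            current.append(gene)
        if len(current) > 1:
            clusters.append(current)
    return clusters


def clustered_fraction(genes, max_gap=50_000):
    """Fraction of genes that belong to any cluster."""
    genes = list(genes)
    if not genes:
        return 0.0
    clustered = sum(len(c) for c in find_clusters(genes, max_gap))
    return clustered / len(genes)
```

With real annotation data, the resulting fraction would depend strongly on the chosen gap, so any comparison across species would need the same rule applied to each genome.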

4.2. Species-specific duplication of cuticular protein genes

Neighbor-joining trees show that large numbers of cuticular proteins form gene clades within species (Figs. 1 and 2), including A. mellifera, which has a small number of genes compared with Drosophila and Bombyx. In Bombyx, the genes in these species-specific clades also form physical gene clusters on the chromosomes. These results suggest that the duplication of cuticular protein genes has occurred independently among insect taxa.
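The notion of a species-specific clade can be made concrete with a small tree traversal. The following Python sketch is illustrative only: it assumes a tree already built by some other tool and represented as nested tuples, and it assumes that each leaf name begins with a four-letter species prefix such as "Bmor"; both conventions and the function names are hypothetical.

```python
def species_of(leaf):
    """Assumed convention: the first four letters of a leaf name give the species."""
    return leaf[:4]


def species_specific_clades(tree):
    """Return maximal subtrees (as leaf lists) whose leaves all share one species.

    tree: nested tuples with string leaves, e.g. (("BmorA", "BmorB"), "DmelC").
    Only clades with two or more leaves are reported.
    """
    clades = []

    def walk(node):
        # Returns (leaves under node, whether they all belong to one species).
        if isinstance(node, str):
            return [node], True
        results = [walk(child) for child in node]
        leaves = [leaf for child_leaves, _ in results for leaf in child_leaves]
        single = len({species_of(leaf) for leaf in leaves}) == 1
        if not single:
            # A single-species child of a mixed node is a maximal clade.
            for child_leaves, child_single in results:
                if child_single and len(child_leaves) > 1:
                    clades.append(child_leaves)
        return leaves, single

    leaves, single = walk(tree)
    if single and len(leaves) > 1:
        clades.append(leaves)
    return clades
```

Large clades returned by such a traversal would correspond to the within-species expansions described above, although branch support is ignored in this simplified form.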

Snyder et al. (1982) found four cuticular protein genes in a 9 kb region of the Drosophila genome and speculated on their duplication and evolution. Charles et al. (1997) also reported a cluster of cuticular protein genes in D. melanogaster, where a 36 kb genomic DNA segment contained 12 clustered cuticular protein genes; five genes in this cluster are intronless, and four of these five have arisen by retroposition. They also reported intraspecific variation in the cuticular protein gene number. Karouzou et al. (2007) observed an absence of large clusters of highly similar genes in Drosophila, which suggested that duplications occurred long ago. In contrast, A. gambiae has large clusters of closely related genes (He et al., 2007). We found 39 clusters of RR-2 proteins in A. gambiae genome sequences on the cuticleDB website, and several RR-2 proteins are intronless. Rondot et al. (1998) reported the possibility of duplication of cuticular protein genes in Tenebrio molitor. Notably, there are more introns and larger introns in RR-2 proteins in Bombyx than in other species (Supplementary Table 1), suggesting that the origin of gene duplication differs among insects. Thus, multigene families of cuticular protein genes have been reported in various insects, suggesting that species-specific duplication of cuticular proteins may be associated with taxa-specific exoskeletal characteristics.

Compared with other insects, Bombyx has very large clusters on chromosome 22 (BmorCPR22–BmorCPR46 for RR-1 and BmorCPR79–BmorCPR130 for RR-2 proteins, respectively; Supplementary Table 1; The International Silkworm Genome Sequencing Consortium, in preparation). It is noted that many Bombyx-specific RR-2 proteins belong to this large cluster (Fig. 2). Because many lepidopteran larvae have a relatively large size and spend their lives in the open field, they have developed various exoskeletal shapes, such as spines and tubercles. In the giant silkworm, Hyalophora cecropia, the cuticular protein composition is different between the tubercles and the dorsal sclerites of the larval epidermis (Cox and Willis, 1985; Lampe and Willis, 1994; Binger and Willis, 1994). Recently, we found that expression patterns of several RR-2 cuticular proteins coincided with tubercle structures in the swallowtail butterfly, Papilio xuthus (Futahashi and Fujiwara, 2008). One possible function of Bombyx-specific RR-2 proteins is construction of the complex larval structures. Another conspicuous characteristic of lepidopteran species is scales on the adult wings. It is reported that scale structures strongly correlate with scale colors (Janssen et al., 2001). Both the larval tubercles and the adult wing scales are rigid structures, and RR-2 proteins are generally associated with a rigid cuticle. The large copy numbers of RR-2 proteins in Bombyx may allow for the possibility of divergence of body surface and scale structures to adapt to the environment in lepidopteran species.

4.3. Expression analysis of each cuticular protein gene

We found a correlation between tissue specificity and specific motif (Fig. 3). Cox and Willis (1985) found that the composition of cuticular proteins correlated with the flexibility of the mature cuticle. We found that transcripts corresponding to RR-1 proteins were abundant in the epidermis EST libraries, which is consistent with the general claim that RR-1 proteins are associated with flexible cuticles (Fig. 1). In contrast to RR-1 and RR-2 proteins, RR-3 proteins have been poorly understood. We found the largest number of clones for RR-3 proteins in the pheromone gland (Fig. 3). The outer layer of the pheromone gland is composed mainly of thick cuticle (Fonagy et al., 2000). Our results suggest that RR-3 proteins may function in specific cuticle structures. In D. melanogaster, Tweedle proteins were expressed in the epidermis, tracheal tree, foregut, and embryo (Guan et al., 2006). In Bombyx, Tweedle transcripts were found mainly in the epidermis and wing disc (Fig. 3). In the wing disc, RR-2, Tweedle, and CPFL transcripts were found only after spinning (Fig. 3B). Tweedle and CPFL proteins may be associated with the formation of hard cuticle in the upper surface of the pupal forewing, similar to RR-2 proteins. Gu and Willis (2003) concluded that the young larval discs were secreting a larval-type cuticle. Our results on the composition of cuticle protein genes also support this hypothesis because the cuticular composition of the wing disc prior to spinning is similar to that of the larval epidermis (Fig. 3B). Togawa et al. (2007) reported that four CPFs and one CPFL were associated with the outer layer of pupal and adult cuticles, whereas six CPFLs appeared to contribute to larval cuticles. We found transcripts corresponding to CPFLs especially in compound eyes, which was similar to RR-2 protein transcripts (Fig. 3), suggesting that CPFLs are also associated with rigid structures.
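The per-tissue composition compared here (Fig. 3B) is a percentage of cuticular protein clones assigned to each motif within one EST library. A minimal Python sketch of that arithmetic follows; the function name and the example counts in the usage note are hypothetical and are not data from this study.

```python
def motif_composition(clone_counts):
    """Convert clone counts per motif in one EST library into percentages.

    clone_counts: mapping of motif name to number of EST clones.
    Returns an empty dict when the library contains no cuticular protein clones.
    """
    total = sum(clone_counts.values())
    if total == 0:
        return {}
    return {motif: 100.0 * n / total for motif, n in clone_counts.items()}
```

For a hypothetical library with 30 RR-1, 10 RR-2, and 10 CPG clones, the function returns 60%, 20%, and 20%, respectively. Two libraries can then be compared motif by motif, as is done for the wing disc before spinning and the larval epidermis.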

An unexpected result was that transcripts of several cuticular protein genes were also observed in internal tissues such as ovary, brain, and posterior silk gland (Fig. 3, Supplementary Table 1). Most of these transcripts were found in internal tissues as well as epidermis or wing disc (Supplementary Table 1). One possibility is that these proteins are synthesized in the internal organs and then later exported to the cuticle. Another possibility is that these transcripts were derived from attached trachea, because tracheae are cuticular structures (Willis et al., 2005). A few transcripts were found only in internal organs (i.e., BmorCPR51 and BmorCPR90; Supplementary Table 1). Rebers and Willis (2001) reported that the R&R Consensus binds to chitin, and that exchange of just two amino acid residues was sufficient to prevent such binding. A further possibility is therefore that several gene products containing the R&R Consensus are not able to bind to chitin due to small differences in their consensus sequence, and they may accordingly serve some other functions. Because only a few genes are known at present to be expressed only in internal organs, we could not identify internal organ-specific amino acid substitutions. Further study is needed to learn about the cuticular protein genes expressed in internal organs.

Riddiford (1981) described two types of pupal cuticular protein mRNAs. One was transcribed after the beginning of wandering, and the other was observed after a prepupal ecdysteroid surge. Insect cuticle layers are constructed by different types of cuticles, which are produced under different hormonal conditions. The hormonal regulation differs between the epicuticle and the procuticle (Fristrom et al., 1986; Apple and Fristrom, 1991). We have found five different groups of stage specificity in the wing disc (Fig. 4B). It is suggested that the expression of these five groups is induced by different types of ecdysone responsiveness, although some are adjacent genes.

We first thought that the expression pattern of each cuticular protein gene should be similar within the same cluster. However, the tissue specificity and stage specificity of each cuticular protein differed even between adjacent genes in the same clusters (Fig. 4, Supplementary Table 1). These results are in contrast with chorion gene clusters, the expression profile of which strongly correlates with chromosomal location (Kafatos et al., 1995). Our findings imply that the structure and regulation of each cuticular protein gene in the clusters have been rearranged or altered intricately during evolution.

Acknowledgements

This work was supported in part by grants in aid for scientific research from the Ministry of Education, Science, and Culture of Japan (to HF and HK), the Insect Technology Program of MAFF (to KM), PROBRAIN, the Basic Research Program (to KM and HF), and Research Fellowship of Japan Society for the Promotion of Science for Young Scientists (to RF). We are very grateful to Michihiko Shimomura and Kazutoshi Yoshitake for data analysis. We are also grateful to the referee for many helpful comments on the manuscript.

Appendix. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.ibmb.2008.05.007.

References

Andersen, S.O., 2000. Studies on proteins in post-ecdysial nymphal cuticle of locust, Locusta migratoria, and cockroach, Blaberus craniifer. Insect Biochem. Mol. Biol. 30, 569–577.

Andersen, S.O., Hojrup, P., Roepstorff, P., 1995. Insect cuticular proteins. Insect Biochem. Mol. Biol. 25, 153–176.

Andersen, S.O., Rafn, K., Roepstorff, P., 1997. Sequence studies of proteins from larval and pupal cuticle of the yellow meal worm, Tenebrio molitor. Insect Biochem. Mol. Biol. 27, 121–131.

Apple, R.T., Fristrom, J.W., 1991. 20-Hydroxyecdysone is required for, and negatively regulates, transcription of Drosophila pupal cuticle protein genes. Dev. Biol. 146, 569–582.

Binger, L.C., Willis, J.H., 1994. Identification of the cDNA, gene, and promoter for a major protein from flexible cuticles of the giant silkworm, Hyalophora cecropia. Insect Biochem. Mol. Biol. 24, 989–1000.

Charles, J.P., Bouhin, H., Quennedey, B., Courrent, A., Delachambre, J., 1992. cDNA cloning and deduced amino acid sequence of a major, glycine-rich cuticular protein from the coleopteran Tenebrio molitor. Temporal and spatial distribution of the transcript during metamorphosis. Eur. J. Biochem. 206, 813–819.

Charles, J.P., Chihara, C., Nejad, S., Riddiford, L.M., 1997. A cluster of cuticle protein genes of Drosophila melanogaster at 65A: sequence, structure and evolution. Genetics 147, 1213–1224.

Cox, D.C., Willis, J.H., 1985. The cuticular proteins of Hyalophora cecropia from different anatomical regions and metamorphic stages. Insect Biochem. 15, 349–362.

Fonagy, A., Yokoyama, N., Okano, K., Tatsuki, S., Maeda, S., Matsumoto, S., 2000. Pheromone-producing cells in the silkmoth, Bombyx mori: identification and their morphological changes in response to pheromonotropic stimuli. J. Insect Physiol. 46, 735–744.

Fristrom, D., Doctor, J., Fristrom, J.W., 1986. Procuticle proteins and chitin-like material in the inner epicuticle of the Drosophila pupal cuticle. Tissue Cell 18, 531–543.

Futahashi, R., Fujiwara, H., 2008. Identification of stage-specific larval camouflage associated genes in the swallowtail butterfly, Papilio xuthus. Dev. Genes Evol. 218, 491–504.

Gu, S., Willis, J.H., 2003. Distribution of cuticular protein mRNAs in silk moth integument and imaginal discs. Insect Biochem. Mol. Biol. 33, 1177–1188.


Guan, X., Middlebrooks, B.W., Alexander, S., Wasserman, S.A., 2006. Mutation of TweedleD, a member of an unconventional cuticle protein family, alters body shape in Drosophila. Proc. Natl. Acad. Sci. U.S.A. 103, 16794–16799.

Hamodrakas, S.J., Willis, J.H., Iconomidou, V.A., 2002. A structural model of the chitin-binding domain of cuticle proteins. Insect Biochem. Mol. Biol. 32, 1577–1583.

He, N., Botelho, J.M., McNall, R.J., Belozerov, V., Dunn, W.A., Mize, T., Orlando, R., Willis, J.H., 2007. Proteomic analysis of cast cuticles from Anopheles gambiae by tandem mass spectrometry. Insect Biochem. Mol. Biol. 37, 135–146.

Honeybee Genome Sequencing Consortium, 2006. Insights into social insects from the genome of the honeybee Apis mellifera. Nature 443, 931–949.

Iconomidou, V.A., Willis, J.H., Hamodrakas, S.J., 1999. Is beta-pleated sheet the molecular conformation which dictates formation of helicoidal cuticle? Insect Biochem. Mol. Biol. 29, 285–292.

Janssen, J.M., Monteiro, A., Brakefield, P.M., 2001. Correlations between scale structure and pigmentation in butterfly wings. Evol. Dev. 3, 415–423.

Kafatos, F.C., Tzertzinis, G., Spoerel, N.A., Nguyen, T., 1995. Chorion genes: an overview of their structure, function, and transcriptional regulation. In: Goldsmith, M.R., Gilbert, K., Wilkins, A.S. (Eds.), Molecular Model Systems in the Lepidoptera. Cambridge University Press, Cambridge, NY, pp. 181–215.

Karouzou, M.V., Spyropoulos, Y., Iconomidou, V.A., Cornman, R.S., Hamodrakas, S.J., Willis, J.H., 2007. Drosophila cuticular proteins with the R&R Consensus: annotation and classification with a new tool for discriminating RR-1 and RR-2 sequences. Insect Biochem. Mol. Biol. 37, 754–760.

Kawasaki, H., 1998. Transition from larva to pupa: morphogenesis, cell proliferation, and protein synthesis in Bombyx wing disc. Inver. Rep. Dev. 34, 101–108.

Kawasaki, H., Ote, M., Okano, H., Shimada, T., Quan, G.-X., Mita, K., 2004. Change in the expressed gene patterns of the wing disc during the metamorphosis of Bombyx mori. Gene 343, 133–142.

Kiguchi, K., Agui, N., Kawasaki, H., Kobayashi, M., 1985. Developmental timetable for the last larval and pharate pupal stages of the silkworm, Bombyx mori, with special reference to the relation between the developmental events and hemolymph ecdysteroid levels. Bull. Seric. Exp. Sta. 30, 83–100 (in Japanese with English summary).

Kinjoh, T., Kaneko, Y., Itoyama, K., Mita, K., Hiruma, K., Shinoda, T., 2007. Control of juvenile hormone biosynthesis in Bombyx mori: cloning of the enzymes in the mevalonate pathway and assessment of their developmental expression in the corpora allata. Insect Biochem. Mol. Biol. 37, 808–818.

Krogh, T.N., Skou, L., Roepstorff, P., Andersen, S.O., Hojrup, P., 1995. Primary structure of proteins from the wing cuticle of the migratory locust, Locusta migratoria. Insect Biochem. Mol. Biol. 25, 319–329.

Lampe, D.J., Willis, J.H., 1994. Characterization of a cDNA and gene encoding a cuticular protein from rigid cuticles of the giant silkmoth, Hyalophora cecropia. Insect Biochem. Mol. Biol. 24, 419–435.

Li, H., Liu, J., Xu, Z., 2005. Test data sets and evaluation of gene prediction programs on the rice genome. J. Comp. Sci. Tech. 10, 446–453.

Magkrioti, C.K., Spyropoulos, I.C., Iconomidou, V.A., Willis, J.H., Hamodrakas, S.J., 2004. cuticleDB: a relational database of Arthropod cuticular proteins. BMC Bioinformatics 5, 138.

Mita, K., Kasahara, M., Sasaki, S., Nagayasu, Y., Yamada, T., Kanamori, H., Namiki, N., Kitagawa, M., Yamashita, H., Yasukochi, Y., Kadono-Okuda, K., Yamamoto, K., Ajimura, M., Ravikumar, G., Shimomura, M., Nagamura, Y., Shin-I, T., Abe, H., Shimada, T., Morishita, S., Sasaki, T., 2004. The genome sequence of silkworm, Bombyx mori. DNA Res. 11, 27–35.

Mita, K., Morimyo, M., Okano, K., Koike, Y., Nohata, J., Kawasaki, H., Kadono-Okuda, K., Yamamoto, K., Suzuki, M.G., Shimada, T., Goldsmith, M.R., Maeda, S., 2003. The construction of an EST database for Bombyx mori and its application. Proc. Natl. Acad. Sci. U.S.A. 100, 14121–14126.

Nakato, H., Tomiyama, M., Izumi, S., Tomino, S., 1990. Structure and expression of mRNA for a pupal cuticle protein of the silkworm, Bombyx mori. Insect Biochem. 20, 667–678.

Nakato, H., Takekoshi, M., Togawa, T., Izumi, S., Tomino, S., 1997. Purification and cDNA cloning of evolutionally conserved larval cuticle proteins of the silkworm, Bombyx mori. Insect Biochem. Mol. Biol. 27, 701–709.

Noji, T., Ote, M., Takeda, M., Mita, K., Shimada, T., Kawasaki, H., 2003. Isolation and comparison of different ecdysone-responsive cuticle protein genes in wing discs of Bombyx mori. Insect Biochem. Mol. Biol. 33, 671–679.

Okamoto, S., Futahashi, R., Kojima, T., Mita, K., Fujiwara, H. A catalogue of epidermal genes: genes expressed in the epidermis during larval molt of the silkworm Bombyx mori. BMC Genomics, in press.

Rebers, J.E., Riddiford, L.M., 1988. Structure and expression of a Manduca sexta larval cuticle gene homologous to Drosophila cuticle genes. J. Mol. Biol. 203, 411–423.

Rebers, J.E., Willis, J.H., 2001. A conserved domain in arthropod cuticular proteins binds chitin. Insect Biochem. Mol. Biol. 31, 1083–1093.

Riddiford, L.M., 1981. Hormonal control of epidermal cell development. Amer. Zool. 21, 751–762.

Rondot, I., Quennedey, B., Delachambre, J., 1998. Structure, organization and expression of two clustered cuticle protein genes during the metamorphosis of an insect, Tenebrio molitor. Eur. J. Biochem. 254, 304–312.

Sawada, H., Nakato, H., Togawa, T., Nakagoshi, M., Takikawa, S., Dohke, K., Iino, T., Mase, K., Yamamoto, T., Izumi, S., 2003. Molecular cloning and characterization of a cDNA encoding a novel cuticle protein in the silkworm, Bombyx mori. Comp. Biochem. Physiol. B: Biochem. Mol. Biol. 134, 519–527.

Snyder, M., Hunkapiller, M., Yuen, D., Silvert, D., Fristrom, J., Davidson, N., 1982. Cuticle protein genes of Drosophila: structure, organization, and evolution of four clustered genes. Cell 29, 1027–1040.

Suzuki, Y., Matsuoka, T., Iimura, Y., Fujiwara, H., 2002. Ecdysteroid dependent expression of a novel cuticle protein gene BMCPG1 in the silkworm, Bombyx mori. Insect Biochem. Mol. Biol. 32, 599–607.

Takeda, M., Mita, K., Quan, G.-X., Shimada, T., Okano, K., Kanke, E., Kawasaki, H., 2001. Mass isolation of cuticle protein cDNAs from wing discs of Bombyx mori and their characterization. Insect Biochem. Mol. Biol. 31, 1019–1028.

Tamura, K., Dudley, J., Nei, M., Kumar, S., 2007. MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24, 1596–1599.

The International Silkworm Genome Sequencing Consortium. Silkworm genome sequence reveals biology underlying silk production, phytophagy, and metamorphosis. In preparation.

Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G., 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.

Togawa, T., Augustine Dunn, W., Emmons, A.C., Willis, J.H., 2007. CPF and CPFL, two related gene families encoding cuticular proteins of Anopheles gambiae and other insects. Insect Biochem. Mol. Biol. 37, 675–688.

Togawa, T., Nakato, H., Izumi, S., 2004. Analysis for the chitin recognition mechanism of cuticle proteins from the soft cuticle of the silkworm, Bombyx mori. Insect Biochem. Mol. Biol. 34, 1059–1067.

Willis, J.H., Iconomidou, V.A., Smith, R.F., Hamodrakas, S.J., 2005. Cuticular proteins. In: Gilbert, L.I., Iatrou, K., Gill, S.S. (Eds.), Comprehensive Molecular Insect Science, vol. 4. Elsevier, Oxford, UK, pp. 79–110.

Xia, Q., Zhou, Z., Lu, C., Cheng, D., Dai, F., Li, B., Zhao, P., Zha, X., Cheng, T., Chai, C., Pan, G., Xu, J., Liu, C., Lin, Y., Qian, J., Hou, Y., Wu, Z., Li, G., Pan, M., Li, C., Shen, Y., Lan, X., Yuan, L., Li, T., Xu, H., Yang, G., Wan, Y., Zhu, Y., Yu, M., Shen, W., Wu, D., Xiang, Z., Yu, J., Wang, J., Li, R., Shi, J., Li, H., Li, G., Su, J., Wang, X., Li, G., Zhang, Z., Wu, Q., Li, J., Zhang, Q., Wei, N., Xu, J., Sun, H., Dong, L., Liu, D., Zhao, S., Zhao, X., Meng, Q., Lan, F., Huang, X., Li, Y., Fang, L., Li, C., Li, D., Sun, Y., Zhang, Z., Yang, Z., Huang, Y., Xi, Y., Qi, Q., He, D., Huang, H., Zhang, X., Wang, Z., Li, W., Cao, Y., Yu, Y., Yu, H., Li, J., Ye, J., Chen, H., Zhou, Y., Liu, B., Wang, J., Ye, J., Ji, H., Li, S., Ni, P., Zhang, J., Zhang, Y., Zheng, H., Mao, B., Wang, W., Ye, C., Li, S., Wang, J., Wong, G.K., Yang, H., Biology Analysis Group, 2004. A draft sequence for the genome of the domesticated silkworm (Bombyx mori). Science 306, 1937–1940.

Zhong, Y.-S., Mita, K., Shimada, T., Kawasaki, H., 2006. Glycine-rich protein genes, which encode a major component of the cuticle, have different developmental profiles from other cuticle protein genes in Bombyx mori. Insect Biochem. Mol. Biol. 36, 99–110.