RESEARCH ARTICLE Open Access

A holistic phylogeny of the coronin gene family reveals an ancient origin of the tandem-coronin, defines a new subfamily, and predicts protein function

Christian Eckert, Björn Hammesfahr and Martin Kollmar*

Abstract

Background: Coronins belong to the superfamily of the eukaryotic-specific WD40-repeat proteins and play a role in several actin-dependent processes like cytokinesis, cell motility, phagocytosis, and vesicular trafficking. Two major types of coronins are known: First, the short coronins consisting of an N-terminal coronin domain, a unique region and a short coiled-coil region, and secondly the tandem coronins comprising two coronin domains.

Results: 723 coronin proteins from 358 species have been identified by analyzing the whole-genome assemblies of all available sequenced eukaryotes (March 2011). The organisms analyzed represent most eukaryotic kingdoms but also cover every taxon several times to provide a better statistical sampling. The phylogenetic tree of the coronin domains based on the Bayesian method is in accordance with the most recent grouping of the major kingdoms of the eukaryotes and also with the grouping of more recently separated branches. Based on this “holistic” approach the coronins group into four classes: class-1 (Type I) and class-2 (Type II) are metazoan/choanoflagellate specific classes, class-3 contains the tandem-coronins (Type III), and the new class-4 represents the coronins fused to villin (Type IV). Short coronins from non-metazoans are equally related to class-1 and class-2 coronins and thus remain unclassified.

Conclusions: The coronin class distribution suggests that the last common eukaryotic ancestor possessed a single and a tandem-coronin, and most probably a class-4 coronin of which homologs have been identified in Excavata and Opisthokonts although most of these species subsequently lost the class-4 homolog. The most ancient short coronin already contained the trimerization motif in the coiled-coil domain.

* Correspondence: [email protected]
Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Goettingen, Germany

Eckert et al. BMC Evolutionary Biology 2011, 11:268
http://www.biomedcentral.com/1471-2148/11/268

© 2011 Eckert et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

The coronin proteins, which were originally isolated as a major co-purifying protein from an actin-myosin complex of the slime mold Dictyostelium discoideum [1], have since been identified in other protists [2,3], fungi [4], and animals [5], but are absent in plants. Coronins are a conserved family of actin-binding proteins [6-8], and the first family member was named coronin based on its strong immunolocalization to the actin-rich, crown-like structures of the cell cortex in Dictyostelium discoideum [1]. Coronins belong to the superfamily of the eukaryotic-specific WD40-repeat proteins [9,10] and play a role in several actin-dependent processes like cytokinesis [11], cell motility [11,12], phagocytosis [13,14], and vesicular trafficking [15].

WD-repeat motifs are minimally conserved regions of approximately 40-60 amino acids typically starting with Gly-His (GH) dipeptides 11-24 residues away from the N-terminus and ending with a Trp-Asp (WD) dipeptide at the C-terminus. WD40-repeat proteins, which are characterized by the presence of at least four consecutive WD repeats in the middle of the molecule, fold into beta-propeller structures and serve as stable platforms for protein-protein interactions [9].

The coronin proteins have five canonical WD-repeat motifs located centrally. Since the region encoding the WD repeats is similar to the sequence of the beta-subunit of trimeric G-proteins, the formation of a five-bladed beta-propeller was assumed for coronins [16]. However, the determination of the structure of murine coronin-1 (MmCoro1A [17]) demonstrated that the protein, analogous to the trimeric G-proteins, forms a seven-bladed beta-propeller carrying two potential F-actin binding sites. Apart from the central WD-repeats, almost all coronin proteins have a C-terminal coiled-coil sequence that mediates homo-oligomerization [18-20], and a short N-terminal motif that contains an important regulatory phosphorylation site in coronin-1B [12]. In addition, each coronin protein has a unique region of variable length and composition following the conserved extension to the C-terminus of the beta-propeller.

Based on their domain composition, coronins were originally divided into two subfamilies, namely short and long coronins [21]. Short coronins consist of 450-650 amino acids and contain one seven-bladed beta-propeller and a C-terminal coiled-coil region. Furthermore, the N-terminal region of most known short coronins contains 12 basic amino acids. Since this motif is only present in coronin molecules, it has been suggested as a novel coronin signature [21]. The longer types of coronin, also called POD or Coronin 7, possess two complete core domains in tandem but lack a coiled-coil motif. In the longer coronins, the sequence of the basic N-terminal motif is reduced to 5 amino acids. Based on phylogenetic relationships among the coronins, the Human Genome Organization nomenclature committee (HGNC) proposed a system in 2001 that grouped the short coronins into two classes, resulting in a total of three subtypes [8]. Very recently, a new nomenclature has been suggested dividing the coronins into twelve subclasses based on the analysis of about 250 coronins from most taxa [22]. In contrast to previous systems, every mammalian coronin (and the corresponding vertebrate homologs) was assigned its own class, resulting in seven vertebrate classes. Invertebrates were grouped into two classes, the fungi received their own class, coronins from alveolates were grouped with those from Parabasalids (class 10), and the remaining coronins from Amoeba, Heterolobosea, and Euglenozoa were combined into the twelfth class. This study constituted the first major phylogenetic analysis of the coronin family. However, this classification was not consistent with the latest phylogeny of the eukaryotes, and homologs of some major branches like the stramenopiles were missing.

Here, we present the analysis of the complete coronin repertoires of all eukaryotic organisms sequenced and assembled so far. The distribution of all coronin homologs is in accordance with the latest taxonomy of the eukaryotes and reveals the origin of the tandem-coronin and another newly defined class in the last common ancestor of the eukaryotes.
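The WD-repeat description in the Background (a repeat of roughly 40-60 residues, a GH dipeptide 11-24 residues from its start, and a terminal WD dipeptide) can be illustrated with a crude pattern scan. The Python sketch below is an illustration only and not a method used in this study. The function name `find_wd_repeat_candidates` is hypothetical, and the gap of 12-45 residues between GH and WD is an assumption derived from the numbers above (40 - 24 - 4 = 12 and 60 - 11 - 4 = 45). Because WD repeats are only minimally conserved, such a scan would likely miss many genuine repeats.

```python
import re

# Heuristic only: a GH dipeptide, 12-45 arbitrary residues, then a WD dipeptide.
# The gap bounds are an interpretation of the 40-60 residue repeat length and
# the 11-24 residue GH offset; they are not a published profile.
# The lookahead lets candidates starting at different GH positions overlap.
WD_REPEAT = re.compile(r"(?=(GH[A-Z]{12,45}?WD))")


def find_wd_repeat_candidates(seq):
    """Return (start, end) spans of GH...WD candidates.

    Coordinates are 0-based with an exclusive end, and each span runs from
    the G of the GH dipeptide to the D of the closest qualifying WD dipeptide.
    """
    seq = seq.upper()
    return [(m.start(1), m.end(1)) for m in WD_REPEAT.finditer(seq)]


if __name__ == "__main__":
    # Synthetic toy sequence, not a real coronin.
    toy = "M" * 15 + "GH" + "A" * 30 + "WD" + "K" * 5
    for start, end in find_wd_repeat_candidates(toy):
        print(start, end, toy[start:end])
```

The lazy quantifier reports the shortest candidate for each GH position; a real analysis would rather rely on profile-based domain searches than on a fixed pattern.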

Results

Identification and annotation of the coronin proteins

The coronin protein genes were identified by TBLASTN searches against the corresponding genome data of the different species. The list of sequenced eukaryotic species as well as access information to the corresponding genome data has been obtained from diArk [23]. Species in which certain orthologs were not found at first were later searched again using the putative orthologs of closely related species. In this iterative process all coronin family proteins have been identified or their loss in certain species or taxa was confirmed. Because verified cDNA sequences and protein predictions are not available for most of the sequenced species, and because existing predictions often contain mispredicted exons and introns even in the “annotated” genomes, the protein sequences were assembled and assigned by manual inspection of the genomic DNA sequences. Exons have been confirmed by the identification of flanking consensus intron-exon splice junction donor and acceptor sequences [24]. In addition, the gene structures of all coronin genes were reconstructed using WebScipio [25,26]. Through comparison of the intron positions and splice-site phases in relation to the protein multiple-sequence alignment, several suspicious exon border predictions could be resolved and the protein sequences subsequently corrected. The genomic sequences of many species contain several gaps due to the low coverage of the sequencing or problems in the assembly process. Only some of the gaps could be closed at the amino-acid level by analyzing EST data.

The coronin dataset contains 723 sequences from 358 organisms (Table 1). 614 sequences are complete, and an additional 44 sequences are partially complete. Sequences for which a small part is missing (up to 5%) were termed “Partials”, while sequences for which a considerable part is missing were termed “Fragments”. This difference has been introduced because Partials are not expected to considerably influence the phylogenetic analysis. Several of the genes were termed pseudogenes because they contain too many frame shifts, in-frame stop codons, and missing sequences to be attributed to sequencing or assembly errors.
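Two of the annotation checks described above, confirming exons by their flanking splice sites and labelling sequences by completeness, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline. The function names are hypothetical, and the splice check assumes the common GT...AG intron boundaries, whereas the consensus actually used is the one cited as [24]. The completeness check assumes that the missing fraction is measured against an expected full length, which the text does not specify.

```python
def has_canonical_splice_sites(genome, exon_start, exon_end):
    """Check an internal exon for a flanking AG acceptor and GT donor.

    Coordinates are 0-based with an exclusive end (an assumed convention).
    Only the common GT...AG rule is tested; rarer intron types are ignored.
    """
    if exon_start < 2 or exon_end + 2 > len(genome):
        return False  # not enough flanking sequence to judge
    genome = genome.upper()
    acceptor = genome[exon_start - 2:exon_start]  # last two bases of the upstream intron
    donor = genome[exon_end:exon_end + 2]         # first two bases of the downstream intron
    return acceptor == "AG" and donor == "GT"


def completeness_label(expected_length, missing_residues, partial_cutoff=0.05):
    """Label a sequence as Complete, Partial (up to 5% missing) or Fragment."""
    if expected_length <= 0 or missing_residues < 0:
        raise ValueError("expected_length must be positive, missing_residues non-negative")
    if missing_residues == 0:
        return "Complete"
    if missing_residues / expected_length <= partial_cutoff:
        return "Partial"
    return "Fragment"


if __name__ == "__main__":
    # Synthetic toy locus: intron end (..AG), a six-base exon, intron start (GT..).
    toy_genome = "CCCAG" + "ATGGCC" + "GTAAGT"
    print(has_canonical_splice_sites(toy_genome, 5, 11))
    print(completeness_label(expected_length=500, missing_residues=20))
```

The 5% cutoff is taken from the definition of Partials given above; everything else, including the coordinate convention, is a choice made for the example.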

Multiple sequence alignment, phylogenetic analysis, and classification

A multiple sequence alignment of all coronin family members has been created and extensively manually improved (Additional file 1). The basis of the alignment was the conserved coronin domain that consists of the beta-propeller region and a subsequent conserved extension, which packs against the “bottom” surface of the propeller [17]. This entire domain is conserved in all coronin homologs and we would therefore suggest naming it coronin-domain. The unique regions following the coronin-domain could only be aligned for homologs of closely related species. The C-terminal predicted coiled-coil regions were aligned again for all corresponding sequences to analyze potential oligomerization patterns (see below). The second coronin-domains of the tandem-coronins were also aligned to the coronin-domains for the phylogenetic analysis. One part of the coronin-domain in coronin-1D is encoded by a cluster of mutually exclusive exons (see below) and therefore the exon with the higher sequence identity to related homologs has been included in the alignment. The phylogenetic tree of the coronin family was calculated for 764 coronin-domains, including both coronin-domains of the tandem-coronins separately, using the Bayesian (Additional file 2) and the maximum-likelihood method (Additional file 3). The resulting trees were almost identical. However, the relations of the innermost nodes representing the most ancient relationships were best resolved using the Bayesian approach (Figure 1). The resulting phylogenetic tree is in accordance with the latest phylogenetic grouping of the six kingdoms of the eukaryotes [27-29], of which five are covered by the data analyzed here. Thus, coronins of phylogenetically related species group together in the coronin family tree. In the coronin tree, not only the grouping is retained but also the evolutionary history of the branches. For example, the fungi separate as a monophyletic group before the metazoans, and after the Amoeba.

The classification into subfamilies should ideally include both the phylogenetic grouping of the protein family members and the domain organisation of the respective homologs. However, because most coronins contain a unique region between the coronin-domain and the C-terminal coiled-coil regions, several sub-branch specific domain organisation patterns evolved. To keep the coronin classification as simple as possible and to provide the highest consistency with previous classification schemes, the following classification is proposed: The classification should solely be based on the phylogenetic tree of the coronin-domains because it is in accordance with the phylogeny of the eukaryotes and contains the conserved part of the proteins that is the basis of the protein family. Metazoan species encode two phylogenetically distinct groups of coronins that have historically been named class-1 and class-2 coronins. Further variants of these classes should be named alphabetically, e.g. class-1A, class-1B, etc. However, due to the independent whole-genome, genomic region, and single gene duplication events of certain phylogenetic branches, these variant designations do not always refer to orthologs. For the mammalian coronins, which are the best analyzed coronins, the suggested classification is almost entirely consistent with previous classifications [8] and the HGNC nomenclature except for “CORO6” and “CORO7”, which are here classified as coronin-1D and coronin-3, respectively. Class-3 comprises the tandem coronins. All members of this class group together in the phylogenetic tree, and only single homologs have been found in all species analyzed. Class-4 is a newly defined class that contains coronins with variable numbers of C-terminal PH, gelsolin, and VHP domains, but also coronins with only very short sequences outside the coronin-domain. The other coronins group in accordance with the latest taxonomy of the species (Figure 1). In our opinion it does not add information or help the scientific community if those coronins were classified separately. In contrast to the metazoans, gene duplications in the branches of Amoeba, Excavata, and SAR are species-specific and do not warrant further subclassification at the moment. For example, instead of talking about a “class-11 coronin” and giving lengthy explanations of which coronins would belong to such a class, it would be easier, shorter, and less confusing to just say a “Naegleria coronin”, an “apicomplexan coronin” or a “yeast coronin”. The distribution of the coronins analyzed here is summarized for some example species in Figure 2 including previously used names and classification schemes. The distribution of all coronins is found in Additional file 4. Coronin homologs are absent in Rhodophyta (Cyanidioschyzon, Galdieria), Viridiplantae, Microsporidia, Fornicata (Giardia), and Haptophyceae (Emiliania).
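The relation between the HGNC symbols and the classification proposed here can be written down as a small lookup table. The Python sketch below is illustrative only. Just the assignments of CORO6 to coronin-1D and of CORO7 to coronin-3 are stated explicitly in the text; the remaining entries assume the one-to-one correspondence implied by the classification being "almost entirely consistent" with the HGNC nomenclature. The names `HGNC_TO_PROPOSED`, `proposed_name`, and `coronin_class` are hypothetical.

```python
import re

# CORO6 and CORO7 follow the text directly; the other entries are an
# assumption based on the stated consistency with the HGNC nomenclature.
HGNC_TO_PROPOSED = {
    "CORO1A": "coronin-1A",
    "CORO1B": "coronin-1B",
    "CORO1C": "coronin-1C",
    "CORO2A": "coronin-2A",
    "CORO2B": "coronin-2B",
    "CORO6": "coronin-1D",
    "CORO7": "coronin-3",
}

# Class number followed by an optional variant letter, e.g. "coronin-1D".
_NAME = re.compile(r"^coronin-(\d+)([A-Z]?)$")


def proposed_name(hgnc_symbol):
    """Translate an HGNC symbol into the name used in this classification."""
    key = hgnc_symbol.strip().upper()
    if key not in HGNC_TO_PROPOSED:
        raise KeyError(f"no mapping known for {hgnc_symbol!r}")
    return HGNC_TO_PROPOSED[key]


def coronin_class(name):
    """Return the class number of a name such as 'coronin-1D'."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a classified coronin name: {name!r}")
    return int(match.group(1))


if __name__ == "__main__":
    for symbol in ("CORO6", "CORO7"):
        name = proposed_name(symbol)
        print(symbol, name, coronin_class(name))
```

Unclassified coronins, such as those from Amoeba or Excavata, deliberately have no entry, in line with the proposal to refer to them by taxon rather than by class number.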

Short coronins (class-1, class-2, and unclassified coronins)

The domain organisations of most short coronins (class-1, class-2, and unclassified coronins) are similar. They consist of the 390 amino-acid long coronin-domain followed by a short unique domain and a C-terminal short coiled-coil region (about 30-40 amino acids, Figure 3). The unique regions are conserved in branches (e.g. the vertebrates have similar regions, as do the arthropods, the nematodes, etc.), but are not conserved for major taxa (e.g. fungi, Metazoa, stramenopiles).

Table 1 Data statistics

                                  coronin
Sequence
  Total                           723
  From WGS                        700
  Domains                         7
  Amino acids
  Total pseudogenes               7
  Pseudogenes without sequence    3
Completeness
  Complete                        614
  Partials                        44
  Fragments                       62
Species
  Total                           358
  WGS-projects                    323
  EST-projects                    112
  WGS- and EST-projects           152

The Saccharomyces cerevisiae coronin, ScCoro (CRN1), is known to bind to microtubules via its unique region between the beta-barrel domain and the coiled-coil oligomerisation region (Figure 3, [30]). Two short regions showing homology to the microtubule-binding regions of MAP1B mediate this interaction. However, the MAP1B sequence motif is very short (about ten residues) and not very specific, comprising mainly glutamate and lysine residues [30]. If the corresponding motifs in ScCoro are responsible for microtubule-binding, then all yeast and Schizosaccharomyces coronins should be able to bind to microtubules because they contain motifs with similar amino acid compositions. A similar motif or region could not be identified in the Pezizomycotina coronins. While these supposed microtubule-binding regions mainly consist of glutamate, lysine, proline, serine, and threonine and are not even conserved in very closely related yeast species, the Saccharomyces cerevisiae coronin, ScCoro, has very recently been described to contain a CA domain (C: central; A: acidic; [31]). This domain, with which ScCoro activates and inhibits the ARP2/3 complex depending on concentration [31], is similar to CA domains in WASP family proteins [32]. The CA domain is well conserved but distinct within the Saccharomyceta clade (Pezizomycotina and Saccharomycotina, Figure 4).

[Figure 1, tree not reproduced. Labelled groups: Metazoa Class-1 (Vertebrata Class-1A, 1B, 1C, and 1D, Fish Class-1E, Invertebrate Class-1), Metazoa Class-2 (Vertebrata Class-2A and 2B, Invertebrate Class-2), Class-3 N-Term, Class-3 C-Term, Class-4, Fungi (Ascomycota, Basidiomycota), Capsaspora owczarzaki (Fungi/Metazoa incertae sedis), Amoebae, Excavata, Rhizaria, and SAR (Stramenopiles, Alveolata).]

Figure 1 Phylogenetic tree of the coronin family. The phylogenetic tree of the coronin family was calculated from the multiple sequence alignment of the conserved coronin domain using the Bayesian method. The unrooted tree was drawn with iTOL [73] and branches were coloured according to class and taxonomic distributions. For an extended representation of the tree including all posterior probability values see Additional file 2.

Surprisingly, the coronins of the Tremellomycetes (e.g. Filobasidiella/Cryptococcus species) that belong to the Basidiomycota encode a C-terminal dUTPase domain (deoxyuridine triphosphatase domain) instead of the coiled-coil region (Figure 3). These coronin sequences are supported by many EST/cDNA clones for several of the Filobasidiella species extending from the coronin domain to the stop-codon. In addition to this dUTPase domain, the Filobasidiella species contain a further dUTPase in the genome that is conserved in the other Basidiomycotes, and also the other fungi. The dUTPase domains of the Tremellomycetes coronins contain all characteristic dUTPase domain motifs [33] and are therefore supposed to constitute enzymatically active domains. dUTPases typically form homotrimer active site architectures with all monomers contributing conserved residues to each of the three active sites [33]. This allows the prediction that these coronins trimerize, possibly via the dUTPase domains instead of the coiled-coil domains found in the other coronins. Beyond this prediction, experimental data are needed to link the function of actin filament structure remodelling by coronins to dUTP nucleotide hydrolysis in DNA repair by dUTPases.

[Figure 2, matrix not reproduced. Rows list representative species from Heterolobosea, Kinetoplastida, Parabasalia, Stramenopiles, Alveolata, Apusozoa, Amoeba, Fungi, Capsaspora owczarzaki (Fungi/Metazoa incertae sedis), Choanoflagellida, and Metazoa. Columns are Class-1 (1A-1E), Class-2 (2A-2D), Class-3, Class-4, and unclassified coronins. The header gives the HGNC names (CORO1A, CORO1B, CORO1C, CORO2A, CORO2B, CORO6, CORO7), the classes 1-12 of Morgan & Fernandez, and older names such as p57, TACO, Clipin, IR10, POD-1, CRN2, and Villidin.]

Figure 2 Coronin repertoire of selected species of major taxa and branches. The coronins of several representative species for most eukaryotic taxa and branches are listed (for the list of all species see Additional file 4). On top, alternatively used names and classification schemes are given for better comparison and orientation.

Class-3 coroninsClass-3 coronins (Type III coronins) comprise homologsthat encode two coronin domains arranged in tandem[8]. These two coronin domains are separated by uniqueregions, and class-3 coronins do not encode coiled-coildomains. As recently reported [31] the class-3 coronins

also encode a CA domain similar to the CA domain ofthe WASP family proteins at their C-termini (Figure 3).Based on the multiple sequence alignment of 112 class-3 coronins from all major branches of the eukaryotesthe position of the C-region has slightly been adjustedin comparison with a previous analysis (Figure 4; [31]).Although the C-region of the class-3 coronins is not asconserved as similar regions in the yeast short coroninsor in WASP family proteins, the characteristic patternof hydrophobic residues concluded by a basic residue isvisible in the homologs of all species (Figure 4). In con-trast to the short coronins, the unique region betweenthe C-terminal coronin-domain and the conserved CA-domain is short (20-30 amino acids).Like for the short coronins the Filobasidiella species have

surprising and species-specific tandem-coronins. The Filo-basidiella class-3 coronins have a D-glycerate 3-kinasedomain between the two coronin-domains (Figure 3). Onlythe termini of the Filobasidiella class-3 coronins are sup-ported by EST/cDNA data, but long exons bridge the N-terminal coronin-domain and the glycerate 3-kinasedomain, as well as the glycerate 3-kinase domain and theC-terminal coronin-domain. As found for the dUTPasedomain of the short coronins, the Filobasidiella speciescontain an additional D-glycerate 3-kinase that has

Figure 3 Domain organisation of representative coronins. A colour key to the domain names and symbols is given on the right, except for the coronin domain, which is coloured in orange. The abbreviations for the domains are: WD, WD repeat; PH, pleckstrin-homology domain; LZ, leucine zipper; VHP, villin headpiece domain.

homologs in the other fungi and also in plants. Why it is advantageous to connect an actin-filament binding function to a glycerate 3-kinase needs experimental evaluation. The glycerate 3-kinase domain is not found in the class-3 coronins of the other Basidiomycotes. Except for the Filobasidiella species, only the insects have long insertions between the two coronin-domains of their class-3 coronins. These insertions are highly conserved, about 300 residues long, and do not show any homology to known domains, sequence motifs, or other proteins.

In contrast to the related species Rhizopus arrhizus and Phycomyces blakesleeanus, the coronin-3 of Mucor circinelloides consists of only the second coronin-domain of the tandem. We can exclude the possibility of this being an artefact of the genome assembly for three reasons. First, the genome sequence is continuous

Figure 4 Sequence conservation in the CA domains. The sequence logos illustrate the sequence conservation within the multiple sequence alignments of the CA domains of the Saccharomycotina, the Pezizomycotina, and the class-3 coronins. The CA domains of the Saccharomycotina and the Pezizomycotina are located within the unique regions of the short coronins, while the CA domain of the class-3 coronins is at the C-termini of the proteins, as in WASP family proteins. The regions between the C and the A domains are of variable length.

around MucCoro3. Secondly, there is no homology to any part of the N-terminal coronin-domain of RhaCoro3 or PhbCoro3 in the genome, although the sequence identity of the C-terminal coronin-domains is about 65%. And finally, there is a TATA-box shortly upstream of the MucCoro3 gene. Because a coronin-3 was already present in the most ancient eukaryote, the loss of the N-terminal coronin-domain must be specific to Mucor circinelloides.

Class-4 coronins

Based on the phylogenetic tree (Figure 1) and the domain composition of the protein homologs, another coronin class can be defined for which the Dictyostelium discoideum homolog, also called villidin [34], would be a representative (Figure 3). We suggest naming members of this class class-4 coronins. Most class-4 coronins consist of an N-terminal coronin-domain followed by three to four PH domains, four to five gelsolin domains, and a C-terminal villin headpiece domain (VHP). Class-4 coronins were identified in two of the major kingdoms of the eukaryotes, in excavates and opisthokonts. Furthermore, they are found in several of the sub-branches of the opisthokonts, in amoebae, fungi, and the fungi/metazoa incertae sedis branch. Because class-4 coronins from different species often contain different numbers of PH and gelsolin domains, domain gain and loss events must have happened in the respective branches or single species. However, not enough coronin-4 homologs have been identified yet to reconstruct the evolution of these regions. In addition to these multi-domain class-4 coronins there is a group of class-4 coronins that consists of just the conserved coronin domain and is so far restricted to some Amoebae species.
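The domain-based part of this definition can be sketched in code. The following Python snippet is a minimal illustration and not the procedure used in this study, where classes were assigned from the phylogenetic tree. The function name `classify_by_domains` and the domain labels are hypothetical, and coronins that consist of a single coronin domain, such as the Amoebae class-4 homologs, cannot be classified this way.

```python
from collections import Counter


def classify_by_domains(domains):
    """Guess a coronin class from a list of domain labels (hypothetical labels).

    Heuristic only: class assignment in the text rests on the phylogenetic
    tree, and single-domain class-3/class-4 homologs are misfiled as 'short'.
    """
    counts = Counter(domains)
    if counts["coronin"] == 0:
        return "not a coronin"
    if counts["coronin"] >= 2:
        return "class-3"  # two coronin domains arranged in tandem
    if counts["gelsolin"] > 0 or counts["VHP"] > 0:
        return "class-4"  # coronin domain fused to villin-like domains
    return "short"  # class-1, class-2, or unclassified short coronin


# Illustrative architectures; domain counts follow the ranges given in the text.
villidin_like = ["coronin", "PH", "PH", "PH", "gelsolin", "gelsolin",
                 "gelsolin", "gelsolin", "VHP"]
tandem = ["coronin", "coronin", "CA"]
short = ["coronin", "coiled-coil"]

for name, arch in [("villidin-like", villidin_like),
                   ("tandem", tandem),
                   ("short", short)]:
    print(name, "->", classify_by_domains(arch))
```

Such a rule is only a first-pass filter; homologs with unusual domain gains or losses still need to be placed by phylogeny.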

Alternatively spliced coronins

Alternative splice forms have been reported for two coronin homologs: five variants of coronin from Caenorhabditis elegans [35], CeCoro1 (Figure 5), and three variants of coronin-1C from human [36], HsCoro1C. The described splice variants do not concern the beta-barrel domain but the structurally low-complexity region prior to the coiled-coil region in CeCoro1 and elongations of the N-terminus of HsCoro1C, respectively. In the reported analysis of CeCoro1 [35] two splice sites (the alternative 3’-splice site of exon7 and the alternative 5’-splice site of exon8) do not obey the conventional splicing rules. Alternative 5’-splicing of exon8 would lead to a premature stop-codon. In the four additional Caenorhabditis species analyzed here, C. briggsae, C. japonica, C. remanei, and C. brenneri, alternative 5’-splicing of exon8 would not lead to a premature stop-codon at the same position as in C. elegans but to transcripts of various lengths. The same holds for several of the other available nematode coronin-1 genes. Given the high conservation of the nematode coronin-1 genes, especially the Caenorhabditis genes, and the completely uncommon nature of the potential splice sites, the reported alternative 3’-splice site of exon7 and 5’-splice site of exon8 are most probably artefacts. An alternative 3’-splice site has been reported for exon8 of CeCoro1 comprising two amino acids [35]. Similar splice sites were identified in the genes of the other analyzed Caenorhabditis species but not in other nematodes. This splice site is thus also either an artefact or specific to the Caenorhabditis branch. In addition, skipping of exon8 has also been reported to lead to an alternative transcript [35]. The intron position and reading frame of exon8 of CeCoro1 are conserved in all analyzed nematode coronin-1 genes except for the Strongyloides rattii coronin-1, which consists of only one exon, and the Pristionchus pacificus coronin-1, which has introns at different positions. Compared to the full-length transcript, the other alternative splice forms of CeCoro1 are of low abundance (see Figure 2 in [35]). Because the integrity of exon8 of CeCoro1 (intron positions around the conserved coding sequence of exon8) is not conserved in nematodes, whereas the corresponding amino-acid sequence is, alternative splicing of nematode coronin-1 is either restricted to some sub-branches or an artefact of the CeCoro1 analysis.

Alternative splicing of human coronin-1C results in two additional transcripts derived from alternative transcription start sites encoded by an additional upstream exon, compared to the normal start site as found and conserved in all other coronin proteins [36]. These alternative splice forms seem to be restricted to modern primates (human, chimpanzee, gorilla, orangutan, and gibbon) and have been discussed in detail elsewhere [36].

We have identified alternative splice variants for coronin-1D (Figure 5), a coronin subfamily restricted to vertebrates. A cluster of two mutually exclusively spliced exons, exon5a and exon5b, was identified in all tetrapods. The amino-acid sequences corresponding to exon5 of the fish genes are more similar to exon5b than to exon5a. Thus, exon5a is the result of an exon duplication event that occurred either after the separation of tetrapods from fishes or at the onset of the vertebrates, in which case exon5a has been lost in the ancestor of the fishes. Exon5 represents the sequence of almost the entire fourth WD repeat (fifth blade in the β-propeller) starting in the middle of the fourth β-strand of blade four. By exchanging the fourth WD repeat the vertebrates could fine-tune the function of the coronin-1D beta-barrel domain. Vertebrate coronin-1D (CORO6) has not been analyzed experimentally yet and its specific function is unknown.

Further alternative transcripts are derived from mammalian coronin-1D genes by alternative 5’-splicing of the last exon, exon10. This alternative splicing results in one additional glutamine residue and is conserved in all 22 analyzed mammalian coronin-1D genes except for those of Ailuropoda melanoleuca (giant panda), Loxodonta africana (elephant), Myotis lucifugus (little brown bat), and Bos taurus (cow).
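The statement that the fish exon5 sequences are more similar to exon5b than to exon5a rests on a pairwise comparison of aligned sequences. A minimal Python sketch of such a comparison is given below. The function name and the three short peptide strings are invented for illustration and are not taken from the coronin-1D alignment.

```python
def percent_identity(a, b):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Hypothetical aligned toy peptides (not real coronin-1D sequences).
fish_exon5 = "GFTRMSQRDL"
exon5a = "GFSRMSERQL"
exon5b = "GFTRMSQREL"

id_a = percent_identity(fish_exon5, exon5a)
id_b = percent_identity(fish_exon5, exon5b)
print(f"identity to exon5a: {id_a:.0f}%, to exon5b: {id_b:.0f}%")
print("closer to:", "exon5b" if id_b > id_a else "exon5a")
```

Percent identity is the simplest possible measure; a substitution-matrix score or a small phylogenetic tree of the exon sequences would be more robust for short exons.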

Oligomerization

Most of the short coronins have predicted coiled-coil domains at the C-terminus that are the basis for their supposed oligomerization. Initially, coronins were proposed to form dimers [16], the most common form

Figure 5 Gene structures of alternatively spliced coronins. The cartoons outline the gene structures of the alternatively spliced coronin-1 gene from Caenorhabditis elegans, CeCoro1, and the coronin-1D gene from Homo sapiens, HsCoro1D. The alternatively spliced CeCoro1 gene contains a differentially included exon8, which has an additional alternative 3’-splice site, leading to three transcripts. The other two described splice sites, an alternative 3’-splice site of exon7 and an alternative 5’-splice site of exon8 [35], are most probably artificial. The HsCoro1D gene contains a cluster of two mutually exclusive spliced exons, exon5a and exon5b, and an alternative 5’-splice site of exon10. Dark grey bars and light grey bars mark exons and introns, respectively, and alternative exons and splice sites are coloured. For clarity, introns have been scaled down by a factor of 2.24 (CeCoro1) and 2.97 (HsCoro1D).

of coiled-coil multimerization. In the last decade, a few coronin homologs were biochemically purified and analyzed. Accordingly, the Xenopus laevis coronin-1C (XcoroninA) has been shown to form a dimer [37], while an oligomeric state has been found for human coronin-1C (coronin 3; [18]), and the Saccharomyces cerevisiae coronin (CRN1) trimerizes [31]. Parallel trimer formation has also been shown in a crystal structure of the coiled-coil domain of mouse coronin-1A [20], revealing a conserved motif determining the trimeric structure: R1-[ILVM]2-X3-X4-[ILV]5-E6. In this motif arginine forms a salt-bridge with glutamate at the surface of the coiled-coil structure, and the aliphatic side chain moieties of arginine and glutamate pack against the hydrophobic residues at positions 2 and 5 of the motif, shielding them from solvent. Mutation of the arginine to lysine leads to a concentration-dependent equilibrium between trimers and tetramers with tetramers forming at high concentration, while mutation to alanine or norleucine leads to tetramers [20]. Mutation of the invariant arginine to glutamine in the trimerization motif of human matrilin-1 leads to tetramers [38]. Unfortunately, the switching of arginine and glutamate in the respective positions has not been analyzed yet. We would expect that such a switch should be as stable as the original motif. Thus, to predict the oligomerization state we have analyzed all coronin coiled-coil regions for the presence of the trimerization motif. Accordingly, all 233 class-1 coronins have the classical motif, except for DpCoro1B and DrpCoro1B (Drosophila pseudoobscura and persimilis; Lys at position 1), and NvCoro1 (Nematostella; Cys at position 2), and are thus predicted to form trimers. This would include the Xenopus Coro1C that has, however, been shown to exist as a dimer [37]. The situation is more diverse for the class-2 coronins. The invertebrate coronins contain the trimerization motif, except for AmqCoro2 (Amphimedon; Ser at P1), HerCoro2A (Helobdella; Lys at P1), HerCoro2B (Phe at P1), MydCoro2 (Mayetiola; Gln at P6), and the nematode class-2 coronins (Cys at P2). Almost all fish class-2 coronins contain the trimerization motif, but the other vertebrate class-2 coronins have conserved mutations. The tetrapod class-2A coronins encode a glutamine instead of the invariant arginine, which would turn them into tetramers in analogy to matrilin-1 [38]. The tetrapod class-2B coronins contain glutamine instead of the glutamate at position 6 of the motif, a substitution whose effect has not been analyzed yet.

About half of the analyzed fungal coronins have the classical trimerization motif. The most common substitutions, which are found in all Schizosaccharomyces and most Basidiomycota coronins, are lysines or glutamines instead of the arginine at position 1. While the coiled-coil region is conserved in general, substitutions happened in specific species but not in whole branches (except for the Schizosaccharomyces). Therefore, we would expect all fungal coronins to form trimers. All Amoeba coronins, the Stramenopiles coronins (exceptions: FrcCoro, a His at P1; BhCoro_B, an Asn at P6; AuaCoro_B, a Lys at P1), and the Trichomonas and Naegleria coronins contain the classical trimerization sequence motif in the coiled-coil region. Interestingly, the kinetoplastid coronins have the salt-bridge switched in the motif and should thus also be able to form trimers. From the Alveolata, only the Ciliophora (e.g. Tetrahymena) and Coccidia (e.g. Toxoplasma gondii) coronins contain coiled-coil domains, and only the Coccidia contain the trimerization motif.

These are, however, predictions based on the existence of the proposed trimerization motif. The motif has been identified in 86% of all short, autonomous, and parallel three-stranded coiled-coils, while it is also observed in 9% of the antiparallel trimers and in 5% of the parallel and antiparallel dimers [20]. Thus, although most short coronins are predicted to form trimers, some might nevertheless function in other oligomeric states in the cell. The oligomerization state can ultimately only be shown in experiments, which have, however, been done for just a few of the coronins so far.
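The motif search described in this section can be illustrated with a regular expression. The Python sketch below is an illustration under stated assumptions rather than the procedure used for this analysis: the first pattern encodes R1-[ILVM]2-X3-X4-[ILV]5-E6 literally, the second pattern (Glu at P1, Arg at P6) is one possible reading of the switched salt-bridge in the kinetoplastid coronins, and the test sequences are invented. The search ignores the heptad register of the coiled coil, so hits outside a predicted coiled-coil region would need to be filtered separately.

```python
import re

# Classical motif R1-[ILVM]2-X3-X4-[ILV]5-E6; the lookahead allows overlapping hits.
TRIMER_MOTIF = re.compile(r"(?=(R[ILVM]..[ILV]E))")
# Assumed form of the switched salt-bridge variant: Glu at P1, Arg at P6.
SWITCHED_MOTIF = re.compile(r"(?=(E[ILVM]..[ILV]R))")


def find_motifs(sequence, pattern=TRIMER_MOTIF):
    """Return (1-based start, matched hexapeptide) for every motif occurrence."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


# Invented toy coiled-coil fragments (not real coronin sequences).
classical = "AAARLQELEAA"    # carries the classical motif
lys_variant = "AAAKLQELEAA"  # Lys at P1, as described for some class-1 coronins
switched = "AAAELQELRAA"     # Glu and Arg swapped between P1 and P6

print("classical:", find_motifs(classical))
print("Lys at P1:", find_motifs(lys_variant))
print("switched:", find_motifs(switched, SWITCHED_MOTIF))
```

A hit only indicates that the sequence is compatible with trimer formation; as noted above, the motif also occurs in a minority of antiparallel trimers and dimers.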

F-actin binding

F-actin binding is one of the common properties of coronin proteins. The extended multiple sequence alignment presented here together with the recently determined crystal structure of murine coronin-1A [17] now allows a reevaluation of previous mutagenesis studies. Truncation studies have shown that the coronin domain, including the β-propeller and its C-terminal extension, is necessary for F-actin binding [30,39]. Mapping the sequence conservation within 13 short coronin members onto the surface of the crystal structure revealed two regions, one formed by blades 1, 6, and 7 and one formed by blades 6 and 7 and a portion of the C-terminal extension, that represent possible actin binding sites [17]. Subsequently, several surface-exposed charged amino acids have been mutated to alanine or substituted by reversed charges in human coronin-1B, and the F-actin binding affinity of the mutants has been analyzed ([40], Figure 6 red dots, see also Additional file 5). Only the R30D mutation abolished actin binding in vitro. Although an arginine is the most prevalent amino acid at this position, it is often substituted by a lysine or a proline (Figure 6). The multiple sequence alignment of the coronins also does not show a trend towards a class-specific substitution. For example, while a proline is found at this position in all vertebrate class-2 coronins, arginines, lysines, asparagines, prolines, threonines, and tyrosines are found in invertebrate class-2 coronins. At least, negatively charged amino acids are not found in any of the coronin domains at this position.
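The per-position comparison described for R30 amounts to tabulating the residues in one alignment column. A minimal Python sketch follows. The function names, the five-sequence toy alignment, and the column index are hypothetical and only illustrate the bookkeeping.

```python
from collections import Counter


def column_counts(alignment, column):
    """Count residues in a 1-based alignment column, ignoring gaps ('-')."""
    return Counter(seq[column - 1] for seq in alignment if seq[column - 1] != "-")


def has_acidic(counts):
    """True if the column contains a negatively charged residue (Asp or Glu)."""
    return counts["D"] > 0 or counts["E"] > 0


# Hypothetical toy alignment; column 3 stands in for the position of interest.
toy_alignment = [
    "KYRHVFG",
    "KYRHVFA",
    "KFKHVFG",
    "KYPNVYG",
    "KY-HVFG",
]

counts = column_counts(toy_alignment, 3)
print("residues at column 3:", dict(counts))
print("most prevalent:", counts.most_common(1)[0][0])
print("acidic residue present:", has_acidic(counts))
```

Applied per class, the same tabulation would show whether a substitution is class-specific or scattered across the family.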

Figure 6 Sequence conservation within the actin binding region. The sequence logos illustrate the sequence conservation within the multiple sequence alignments of the coronin domains. Here, only the N- and C-termini of the coronin domains are shown because most of the residues implicated in actin binding map to these regions. For the representation of the entire coronin domain see Additional file 5. For better orientation, the sequences of three representative coronins are shown: the yeast coronin as the main target of mutagenesis experiments, the Dictyostelium coronin as the founding member of the protein family, and the murine coronin-1A of which the crystal structure is known. Secondary structural elements as determined from the crystal structure are drawn as yellow arrows (β-strands) and red boxes (α-helices). Green dots point to amino acids of ScCoro that have been mutated to alanine [41] and red dots highlight mutagenesis studies in HsCoro1B [40]. Light-blue boxes highlight mutations that abolished actin binding, dark-grey boxes represent mutations that did not influence actin binding, and light-grey boxes point to mutations in yeast coronins that could not be expressed and tested.
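The heights in these sequence logos are measured in bits. As a rough guide to how such values arise, the Python sketch below computes the standard Shannon information content of one alignment column for a 20-letter amino-acid alphabet. It is a simplified illustration: the function name is hypothetical, gaps are simply ignored, and no small-sample correction is applied, so values from logo-drawing software may differ.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content in bits: log2(alphabet size) minus Shannon entropy."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    entropy = -sum((k / n) * math.log2(k / n) for k in Counter(residues).values())
    return math.log2(alphabet_size) - entropy


# Toy columns: fully conserved, two residues at equal frequency, and mixed.
for label, column in [("invariant", "WWWWWWWW"),
                      ("two residues", "RRRRKKKK"),
                      ("mixed", "RKPNTYRK")]:
    print(f"{label}: {column_information(column):.2f} bits")
```

An invariant column reaches the maximum of log2(20), about 4.3 bits, which is why the logo axes run from 0 to 4 bits.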


Recently, systematic mutagenesis of charged surface-exposed residues of yeast coronin revealed a patch of residues extending over the top and one side of the β-propeller that abolished actin binding when mutated to alanine (Figure 6, green dots [41]). The analysis of the conservation within the coronin proteins shows that many of the substitutions in both studies have been performed on marginally conserved residues (e.g. E215A/K, K216A/E, 1-11, 1-15, 1-16). Thus, it is not surprising that coronins with mutations of these residues are able to bind F-actin. As actin binding is one of the common functions of coronins and actins belong to the most highly conserved protein families, the actin binding surface of the coronins is also expected to be highly conserved. Most of the residues that were found to abolish actin binding when mutated to alanine are strongly conserved (Figure 6). The few residues that are highly conserved but do not influence actin binding might be interaction sites for other proteins like cofilin.

Discussion

Here, we have analyzed 723 coronins from 358 species. For 323 species whole genome sequence data was available allowing a “holistic” analysis of the coronin protein family. In addition, the whole genome assemblies of 69 species have been analyzed that in the end did not contain any coronin homolog. These species include Rhodophyta (Cyanidioschyzon, Galdieria), Viridiplantae, Microsporidia, Fornicata (Giardia), and Haptophyceae (Emiliania). A sequence alignment of the coronin proteins was created and extensively improved manually. The phylogenetic analysis of the conserved coronin domain, which is also included in the crystal structure [17], using the Bayesian method showed that the grouping of the coronins is completely in accordance with the latest phylogeny of the eukaryotic species (Figure 1, [27-29]). Subsequently, we analyzed the coronin tree with respect to established and proposed classifications defining subfamilies. Two major schemes are currently in use, the old one established by the HGNC [8] and a more recent one expanding the number of classes from three to twelve [22]. Essentially, the latter classification re-defines subclasses of the HGNC scheme as separate classes, e.g. 1A and 1B become class-4 and class-1, respectively, and groups some branches into new classes. However, some coronins still remained unclassified and several classes have been proposed, like the invertebrate metazoan classes 8 and 9, although the contributing members did not form monophyletic branches in the underlying protein family tree. The proposed classes 10 and 12 contain members of unrelated taxonomic branches, probably because these coronins were adjacent in the tree figure. In addition, the Entamoeba tandem-coronin did not group with the other tandem-coronins. Thus, this classification is not consistent with the taxonomy of the eukaryotes. In addition, homologs of major branches, like those from stramenopiles, were missing in the analysis. We do not intend to add confusion to the classification of the coronin family but want to suggest a reliable and, considering future genome sequencing projects, expandable scheme. Two major reasons support the future use of the HGNC scheme although it needs some minor adjustments. The classification by Morgan and Fernandez [22] of coronins outside the metazoans is not consistent with the latest taxonomy of the eukaryotes and therefore not adaptable to our more comprehensive coronin tree. In addition, it is well known that two whole-genome-duplications are the reason for the expansion of gene homologs at the origin of the vertebrates [42], while another whole-genome-duplication happened at the origin of the Actinopterygii [43,44]. Thus, retaining the orthology between non-vertebrate and vertebrate coronins in class-numbers would be desirable but has also been abandoned by Morgan and Fernandez [22]. Here, we adapted the HGNC classification except for renaming CORO6 and CORO7 (HGNC) to coronin-1D and coronin-3, respectively, numbering additional fish coronins as coronin-1E, coronin-2C, and coronin-2D, and defining the new coronin class-4. The term “class” is equivalent to the term “Type” used by the Bear group in recent reviews [8,45]. However, we prefer the term “class” to be consistent with the terminology used for other protein families (e.g. the myosin family [46,47]) and therefore to facilitate the work with databases and search engines in the future.

Class-4 coronins represent a new type of coronins that are present in Excavata (Naegleria gruberi), Amoebae, fungi (Spizellomyces punctatus), and the Fungi/Metazoa incertae sedis branch. Most class-4 coronins consist of the N-terminal coronin domain followed by two to three PH, four to five gelsolin, and a C-terminal VHP domain. The first representative of this subfamily has been identified in Dictyostelium discoideum and called villidin because of the homology of its gelsolin and VHP domains to villin [34]. The homology of villidin’s WD-repeat region to coronin was recognized later [48,49], suggesting villidin’s origin through a fusion of the coronin domain with villin. Villin is the founding member of a superfamily of proteins containing three to six gelsolin domains (reviewed in [48,50]). Like villidin (class-4 coronins), villin, supervillin, and protovillin also contain a C-terminal VHP domain. Alignment of villin to the gelsolin domains of the class-4 coronins shows that the class-4 coronins have lost the first gelsolin domain of villin. The first gelsolin domain of villin is associated with dimerization, actin filament capping, nucleation, and bundling, and G-actin binding [50]. Thus, class-4 coronins do not play a role in these activities via their gelsolin domains [34]. However, villin contains three phospholipid-binding domains, two preceding the second gelsolin domain and one overlapping with the VHP domain.


These phospholipid-binding domains are conserved in class-4 coronins and are most probably responsible for their association with internal membranes like Golgi-structures and ER-membranes [34].

To reveal the evolution of the coronin family and to determine the coronin repertoire of the last common ancestor of the eukaryotes, we plotted the coronin inventory of several representative species, whose genome sequences are available and whose coronin inventories are therefore complete, on the most widely agreed tree of the eukaryotes (Figure 7). However, especially the grouping of taxa that emerged close to the origin of the eukaryotes remains highly debated. Therefore, alternative branchings are also indicated in the tree. The phylogeny of the supposed supergroup Excavata is the least understood because only a few species of this branch have been completely sequenced so far. While the grouping of the Heterolobosea, Trichomonada, and Euglenozoa into the Excavata is found in most analyses, the grouping of the Diplomonadida as separate phylum or as part of the Excavata is still debated (arrow 1 [51]). Also, some analyses group the red algae of the Rhodophyta branch with the Viridiplantae [52-54] and others support their independence (arrow 2; [28,55]). According to most of the recent phylogenetic analyses, the Alveolata, Rhizaria, and Stramenopiles form the supergroup SAR [27,54]. The placement of the Haptophyceae and Cryptophyta within the SAR is still highly debated. Although several analyses are in favour of this grouping (arrow 3; [55-57]), most analyses contradict it [27-29,53,54]. Short coronins containing the N-terminal coronin domain and the C-terminal oligomerization domain have been found in all branches except Diplomonadida, Haptophyceae, and Viridiplantae/Rhodophyta. The phylogenetic grouping of the species based on the phylogenetic tree of the coronin domains showed that the coronins with different domain compositions (containing dUTPase domains, ARP2/3 binding domains, no coiled-coil regions) are species-specific developments based on domain loss and gain events, while the corresponding species correctly group together inside the respective branches. Class-3 coronins are also found in all major eukaryotic superkingdoms that contain coronins. We did not identify any species that contains exclusively a class-3 coronin, suggesting that encoding a class-3 coronin is a plus for the species but not a necessity. Class-4 coronins were found in two of the four coronin-containing superkingdoms, the Excavata and the Opisthokonts. Several major sub-branches

[Figure 7 graphic not reproduced: schematic tree of the Eukaryota with coronin class numbers plotted per taxon. Taxon labels: Haptophyceae, Cryptophyta, SAR, Alveolata, Ciliophora, Apicomplexa, Coccidia, Aconoidasida, Stramenopiles, Bacillariophyta, Oomycetes, Blastocystis, Pelagophyceae, Phaeophyceae, Rhizaria, Diplomonadida, Excavata, Euglenozoa, Trichomonada, Heterolobosea, Opisthokonts, Metazoa, Choanoflagellida, Fungi, Fungi/Metazoa incertae sedis, Amoebozoa, Rhodophyta, Viridiplantae, Chlorophyta, Streptophyta. Species labels: Emiliania huxleyi, Guillardia theta, Tetrahymena thermophila, Toxoplasma gondii, Plasmodium falciparum, Aureococcus anophagefferens, Phytophthora ramorum, Thalassiosira pseudonana, Ectocarpus siliculosus, Blastocystis hominis, Bigelowiella natans, Giardia lamblia, Leishmania major, Trichomonas vaginalis, Naegleria gruberi, Capsaspora owczarzaki, Monosiga brevicollis, Galdieria sulphuraria, Cyanidioschyzon merolae.]

Figure 7 Evolution of the coronin protein family with respect to the species evolution. Schematic representation of the most widely accepted eukaryotic tree of life. Branch lengths are arbitrary. The coronin inventories of certain taxa and specific species have been plotted to the tree with class numbers given in colour-coded boxes. “O” stands for “Orphan”, the unclassified short coronins. The numbers on the arrows refer to alternative placing of the respective taxa: 1: The independence of the Diplomonadida (instead of grouping them to the superkingdom Excavata) is supported by [51]. 2: The monophyly of the Rhodophyta is supported by [28,55]. 3: Grouping the Haptophyceae and Cryptophyta to the SAR is supported by [55-57].


of the Opisthokonts contain class-4 coronins: the Amoebozoa, the Fungi, and the Fungi/Metazoa incertae sedis branch. However, the evolution of the class-4 coronins seems to be determined rather by gene-loss events. This distribution of the coronin classes demonstrates that the last common ancestor of the eukaryotes must have contained a short coronin as well as a tandem coronin (class-3), and most probably even a class-4 coronin. In the coronin-family tree (Figure 1) the C-terminal coronin domains of the class-3 coronins group closer to the short coronins than the N-terminal coronin domains. This suggests a three-step invention of the class-3 coronin (Figure 8): First, a gene duplication of the short coronin happened (1). The new copy was subsequently copied twice, but the order of these events could not be determined (2). One copy was inserted into a different genomic region, resulting in the class-4 coronin after fusion to a copy of the villin gene (2B). The other copy resulted in a tandem gene duplicate in which the new copy was placed at the 5’ side of the original gene (2A). The tandem gene duplicate subsequently fused to build the class-3 coronin prototype (3). It is also possible that the coronin domain copy that led to the class-4 coronin was produced as a copy of the 3’ coronin of the then already existing tandem gene duplicate (4).

At the origin of the Metazoa and Choanoflagellida branches another gene duplication event led to two distinct classes, class-1 coronins and class-2 coronins (Figure 7). The further evolution of the short coronins in the invertebrate branches is determined by species-specific gene-loss and gene-duplication events (Figure 2). This view is, however, based on the species whose genomes are available today and might change as soon as sequencing of more related species reveals subtypes of the class-1 and class-2 coronins in major invertebrate branches. At the origin of the vertebrates the two well-known whole-genome duplications (2R, [42]) resulted in several subtypes of both the class-1 and class-2 coronins. The subsequent third whole-genome duplication in the fish lineage [43,44] led to even more gene duplicates. Subsequent to this boost of coronin homologs at the onset of the vertebrates, branch-specific gene deletions happened, like the loss of the class-1B variants in fishes and the class-1A loss in birds (Figure 2).

The short coiled-coil region including the trimerization motif R-[VILM]-X-X-[VIL]-E is an accomplishment of the most ancient short coronin because it is found in coronins of all branches of the eukaryotic tree. It has been retained without major mutations for a long evolutionary time. This is exemplified by the fact that changes, which might lead to other oligomerization states, are species-specific or have been introduced in very recently separated branches.
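The trimerization motif R-[VILM]-X-X-[VIL]-E maps directly onto a regular expression, which makes it straightforward to scan a coiled-coil sequence for candidate sites. The following is a minimal sketch, assuming that X stands for any amino acid and that sequences are given in one-letter code; the function name and the test sequence are invented for illustration and are not taken from the coronin dataset.

```python
import re

# R-[VILM]-X-X-[VIL]-E with X read as "any residue" (an assumption).
# The lookahead makes overlapping matches visible as well.
TRIMERIZATION_MOTIF = re.compile(r"(?=(R[VILM]..[VIL]E))")


def find_trimerization_motifs(sequence):
    """Return (1-based start, matched hexapeptide) for every motif hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in TRIMERIZATION_MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Invented toy sequence, not a real coronin coiled coil.
    toy = "MSKLEERVAKLEEQLRLQAVEKA"
    print(find_trimerization_motifs(toy))
```

A hit only flags a candidate site; whether a given coiled coil actually trimerizes cannot be concluded from the pattern match alone.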

Conclusions
The phylogenetic tree based on the coronin domains of 723 homologs from 358 species allowed grouping the coronin proteins into four classes: Class-1 (Type I) and class-2 (Type II) comprise short coronins and resulted from a gene duplication of a short coronin at the onset of the holozoans. Short coronins are characterized by an N-terminal coronin domain followed by a unique

[Figure 8 graphic not reproduced. Labels in the cartoon: short coronin, class-3 coronin, class-4 coronin; step 1, gene duplication; step 2, A: tandem gene duplication, B: gene duplication (order unknown); step 3, fusion of the tandem gene duplicates; step 4, no label.]

Figure 8 Evolution of the coronin classes. The cartoon shows the different gene duplication and fusion events that led to the formation of the short coronins, the class-3 coronins, and the class-4 coronins.


domain and a C-terminal short coiled-coil region. The coiled-coil domain of almost all short coronins contains a trimerization motif that must therefore have already existed in the last common ancestor of the eukaryotes. Class-3 (Type III) coronins comprise coronins with two coronin domains arranged in tandem and have been found in species of all eukaryotic kingdoms that contain coronins. Class-4 (Type IV) coronins are fusions of the coronin domain to villin and have been identified in Excavata and Opisthokonts, although most of these species subsequently lost the class-4 homolog. Hence, the last common ancestor of the eukaryotes must have contained a short coronin and a class-3 coronin, and most probably a class-4 coronin.

Methods
Identification and annotation of the coronin family proteins
The coronin genes have been identified by TBLASTN searches against the sequenced eukaryotic genomes, which have been obtained via lists available from the diArk database [23,58]. All hits were manually analyzed at the genomic DNA level. Datasets of predicted proteins produced by the sequencing consortia often miss homologs, and predicted proteins contain mispredicted exons and introns in many cases, necessitating manual assembly and annotation. The correct coding sequences were identified with the help of the multiple sequence alignments of all coronin proteins. As the number of protein sequences increased (especially the number of sequences in taxa with few representatives), many of the initially predicted sequences were reanalyzed to correctly identify all exon borders. Where possible, EST data available from the NCBI EST database have been analyzed to help in the annotation process. In addition, coronin homologs from cDNA projects or single-gene analyses have been obtained by TBLASTN searches against the NCBI nr database [59]. Gene structures have been reconstructed using WebScipio [25] as far as genomic sequence data were available. All sequence-related data (names, corresponding species, GenBank IDs, alternative names, corresponding publications, domain predictions, gene structure reconstructions, and sequences) and references to genome sequencing centres are available through CyMoBase [60,61].
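A first pass over TBLASTN results can be partly automated before the manual inspection described above. The sketch below assumes that the search was run with BLAST+ tabular output (-outfmt 6, whose twelve default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, and bitscore). The E-value cutoff, the function name, and the example rows are illustrative assumptions and do not reproduce the authors' workflow.

```python
from collections import defaultdict

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ("qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore")


def group_hits_by_contig(tabular_text, max_evalue=1e-5):
    """Group TBLASTN hits by genomic contig, keeping hits below max_evalue.

    Returns {sseqid: [(start, end, strand, evalue), ...]} sorted by start.
    The cutoff is an arbitrary example value, not one used in the paper.
    """
    hits = defaultdict(list)
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        row = dict(zip(COLUMNS, line.split("\t")))
        evalue = float(row["evalue"])
        if evalue > max_evalue:
            continue
        sstart, send = int(row["sstart"]), int(row["send"])
        # In TBLASTN output, sstart > send marks a hit on the reverse strand.
        strand = "+" if sstart <= send else "-"
        hits[row["sseqid"]].append((min(sstart, send), max(sstart, send), strand, evalue))
    return {contig: sorted(found) for contig, found in hits.items()}


if __name__ == "__main__":
    # Invented rows for illustration only.
    example = (
        "HsCoro1A\tcontig_7\t62.5\t120\t45\t0\t1\t120\t5400\t5041\t3e-40\t150\n"
        "HsCoro1A\tcontig_7\t58.0\t100\t42\t0\t130\t229\t4900\t4601\t1e-30\t120\n"
        "HsCoro1A\tcontig_9\t30.0\t40\t28\t0\t10\t49\t100\t219\t0.5\t25\n"
    )
    for contig, found in group_hits_by_contig(example).items():
        print(contig, found)
```

Grouping by contig and sorting by coordinate places neighbouring exon-sized hits next to each other, which is a convenient starting point for assembling a gene model by hand.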

Generating the multiple sequence alignment
The multiple sequence alignment of the coronin family has been built and extended during the process of annotating and assembling new sequences. The initial alignment has been generated from approximately the first 50 non-validated sequences obtained from NCBI using the ClustalW software with standard settings [62]. During the following correction of the sequences (removing wrongly annotated sequences and filling gaps) the alignment has been adjusted manually. Subsequently, every newly predicted sequence has been preliminarily aligned to its supposed closest relative using ClustalW, the aligned sequence added to the multiple sequence alignment of the coronins, and the coronin alignment adjusted manually during the subsequent sequence validation process. We have also retained the integrity of the primary sequence within the secondary structural elements that have been determined from the crystal structure (e.g. sequence gaps have only been introduced in known loop regions). Still, many gaps in sequences derived from low-coverage genomes remained. In those cases, the integrity of the exons surrounding the gaps has been maintained (gaps in the genomic sequence are reflected as gaps in the multiple sequence alignment). The unique and coiled-coil regions are completely divergent in sequence and length and were therefore aligned manually. The domain compositions of the short coronin, the class-3, and the class-4 coronins are different, and regions outside the N-terminal coronin domain were only aligned within these groups. The C-terminal coronin domains of the class-3 coronins were separately included in the multiple sequence alignment of the coronins, in addition to being aligned as part of their class.

Building trees
For calculating phylogenetic trees only full-length and partial sequences were included in the alignment. The phylogenetic trees were generated based on the conserved coronin domains (corresponding to amino acids 1-386 of HsCoro1A) using two different methods: 1. Maximum likelihood (ML) using the LG model with estimated proportion of invariable sites and bootstrapping (1,000 replicates) using RAxML [63]. 2. Posterior probabilities were generated using MrBayes v3.1.2 [64] with the MPI option [65]. Two independent runs with 15,000,000 generations, four chains, and a random starting tree were computed using the mixed amino-acid option. From the 32,000th generation onward, MrBayes used the WAG model [66]. Using ProtTest [67], the LG model [68], which is, however, not implemented in MrBayes, was determined to provide a slightly better fit to the data than the WAG model. Trees were sampled every 1,000th generation and the first 25% of the trees were discarded as “burn-in” before generating a consensus tree.
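The sampling scheme translates into concrete tree counts. As a rough worked example, assuming that samples are taken only at multiples of the sampling interval (MrBayes also records the starting state, which would add one tree per run) and that the burn-in fraction is applied per run, the numbers can be derived as follows. The function name is illustrative.

```python
def retained_samples(generations, sample_every, burnin_fraction, runs=1):
    """Count sampled and retained trees for an MCMC sampling scheme.

    Assumes one sample per `sample_every` generations and ignores the
    initial state, so the result approximates the actual file contents.
    """
    sampled_per_run = generations // sample_every
    discarded_per_run = int(sampled_per_run * burnin_fraction)
    kept_per_run = sampled_per_run - discarded_per_run
    return {
        "sampled_per_run": sampled_per_run,
        "discarded_per_run": discarded_per_run,
        "kept_per_run": kept_per_run,
        "kept_total": kept_per_run * runs,
    }


if __name__ == "__main__":
    # Values quoted in the text: 15,000,000 generations, sampling every
    # 1,000th generation, 25% burn-in, two independent runs.
    print(retained_samples(15_000_000, 1_000, 0.25, runs=2))
```

Under these assumptions each run samples 15,000 trees, discards 3,750, and contributes 11,250 trees to the consensus, 22,500 across both runs.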

Domain and motif prediction
Protein domains were predicted using the SMART [69] and Pfam [70] web servers. The leucine zipper motifs have been identified using the Prosite database [71]. The CA domains have been identified by visual inspection of the manual sequence alignment of the coronins and motif comparisons with CA domains of WASP family proteins


available at CyMoBase (unpublished data, [61]). Graphical representations of the sequence patterns have been generated with WebLogo [72].
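PROSITE-style patterns such as the one used for the leucine zipper search can be turned into regular expressions for local scanning. The sketch below is a simplified, hypothetical converter that handles only a subset of the pattern syntax (x, repetition counts, allowed and forbidden residue sets, and the N- and C-terminal anchors). The leucine zipper pattern L-x(6)-L-x(6)-L-x(6)-L is the one listed in PROSITE entry PS00029; the toy sequence is invented, and this is not the scanning procedure used by the Prosite server.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a Python regex string.

    Supported elements: residues, x, [ABC], {ABC}, counts such as x(6)
    or x(2,4), and the anchors '<' and '>'. Other syntax is not handled.
    """
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):
            prefix, element = "^", element[1:]
        if element.endswith(">"):
            suffix, element = "$", element[:-1]
        # Split off an optional repetition count, e.g. "x(2,4)".
        count = ""
        match = re.fullmatch(r"(.+?)\((\d+)(?:,(\d+))?\)", element)
        if match:
            element = match.group(1)
            low, high = match.group(2), match.group(3)
            count = "{%s}" % low if high is None else "{%s,%s}" % (low, high)
        if element in ("x", "X"):
            core = "."
        elif element.startswith("{") and element.endswith("}"):
            core = "[^%s]" % element[1:-1]
        else:
            core = element  # single residue or [ABC] set
        parts.append(prefix + core + count + suffix)
    return "".join(parts)


if __name__ == "__main__":
    zipper = prosite_to_regex("L-x(6)-L-x(6)-L-x(6)-L")
    print(zipper)  # L.{6}L.{6}L.{6}L
    # Invented toy sequence with leucines spaced seven residues apart.
    toy = "AAL" + "AAAAAA" + "L" + "KKKKKK" + "L" + "EEEEEE" + "LAA"
    print(bool(re.search(zipper, toy)))
```

The same helper accepts the trimerization motif written in the paper's notation, R-[VILM]-X-X-[VIL]-E, and returns R[VILM]..[VIL]E.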

Additional material

Additional file 1: Sequence alignment of the coronins. The file contains the alignment of the full-length sequences of the coronins in FASTA format. The data can also be downloaded from CyMoBase [61].

Additional file 2: MrBayes tree of the coronin family. This file contains the phylogenetic tree calculated with MrBayes including posterior probability values that has been the basis for Figure 1. Here, the tree is plotted in an extended way so that every coronin can be found and compared easily.

Additional file 3: RAxML tree of the coronin family. This file contains the phylogenetic tree calculated with RAxML including bootstrap values. The tree is plotted in an extended way so that every coronin can be found and compared easily.

Additional file 4: Coronin repertoire of all eukaryotes analyzed. Complete table of the coronin inventories of 358 eukaryotes.

Additional file 5: Conserved residues in the coronin domain. This figure contains the sequence conservation of the entire coronin domain including all mutagenesis experiments as described in Cai et al. [40] and Gandhi et al. [41].

Acknowledgements
This work has been funded by grants KO 2251/3-1 and KO 2251/3-2 of the Deutsche Forschungsgemeinschaft.

Authors’ contributions
CE and MK assembled coronin sequences, performed data analysis and wrote the manuscript. BH performed the phylogenetic analysis. All authors read and approved the final manuscript.

Received: 28 June 2011 Accepted: 25 September 2011 Published: 25 September 2011

References
1. de Hostos EL, Bradtke B, Lottspeich F, Guggenheim R, Gerisch G: Coronin, an actin binding protein of Dictyostelium discoideum localized to cell surface projections, has sequence similarities to G protein beta subunits. EMBO J 1991, 10:4097-4104.
2. Tardieux I, Liu X, Poupel O, Parzy D, Dehoux P, Langsley G: A Plasmodium falciparum novel gene encoding a coronin-like protein which associates with actin filaments. FEBS Lett 1998, 441:251-256.
3. Figueroa JV, Precigout E, Carcy B, Gorenflot A: Identification of a coronin-like protein in Babesia species. Ann N Y Acad Sci 2004, 1026:125-138.
4. Heil-Chapdelaine RA, Tran NK, Cooper JA: The role of Saccharomyces cerevisiae coronin in the actin and microtubule cytoskeletons. Curr Biol 1998, 8:1281-1284.
5. Suzuki K, Nishihata J, Arai Y, Honma N, Yamamoto K, Irimura T, Toyoshima S: Molecular cloning of a novel actin-binding protein, p57, with a WD repeat and a leucine zipper motif. FEBS Lett 1995, 364:283-288.
6. de Hostos EL: A brief history of the coronin family. Subcell Biochem 2008, 48:31-40.
7. Clemen CS, Rybakin V, Eichinger L: The coronin family of proteins. Subcell Biochem 2008, 48:1-5.
8. Uetrecht AC, Bear JE: Coronins: the return of the crown. Trends Cell Biol 2006, 16:421-426.
9. Smith TF: Diversity of WD-repeat proteins. Subcell Biochem 2008, 48:20-30.
10. Neer EJ, Schmidt CJ, Nambudripad R, Smith TF: The ancient regulatory-protein family of WD-repeat proteins. Nature 1994, 371:297-300.
11. de Hostos EL, Rehfuess C, Bradtke B, Waddell DR, Albrecht R, Murphy J, Gerisch G: Dictyostelium mutants lacking the cytoskeletal protein coronin are defective in cytokinesis and cell motility. J Cell Biol 1993, 120:163-173.
12. Cai L, Holoweckyj N, Schaller MD, Bear JE: Phosphorylation of coronin 1B by protein kinase C regulates interaction with Arp2/3 and cell motility. J Biol Chem 2005, 280:31913-31923.
13. Maniak M, Rauchenberger R, Albrecht R, Murphy J, Gerisch G: Coronin involved in phagocytosis: dynamics of particle-induced relocalization visualized by a green fluorescent protein Tag. Cell 1995, 83:915-924.
14. Ferrari G, Langen H, Naito M, Pieters J: A coat protein on phagosomes involved in the intracellular survival of mycobacteria. Cell 1999, 97:435-447.
15. Rybakin V, Stumpf M, Schulze A, Majoul IV, Noegel AA, Hasse A: Coronin 7, the mammalian POD-1 homologue, localizes to the Golgi apparatus. FEBS Lett 2004, 573:161-167.
16. de Hostos EL: The coronin family of actin-associated proteins. Trends Cell Biol 1999, 9:345-350.
17. Appleton BA, Wu P, Wiesmann C: The crystal structure of murine coronin-1: a regulator of actin cytoskeletal dynamics in lymphocytes. Structure 2006, 14:87-96.
18. Spoerl Z, Stumpf M, Noegel AA, Hasse A: Oligomerization, F-actin interaction, and membrane association of the ubiquitous mammalian coronin 3 are mediated by its carboxyl terminus. J Biol Chem 2002, 277:48858-48867.
19. Oku T, Itoh S, Ishii R, Suzuki K, Nauseef WM, Toyoshima S, Tsuji T: Homotypic dimerization of the actin-binding protein p57/coronin-1 mediated by a leucine zipper motif in the C-terminal region. Biochem J 2005, 387:325-331.
20. Kammerer RA, Kostrewa D, Progias P, Honnappa S, Avila D, Lustig A, Winkler FK, Pieters J, Steinmetz MO: A conserved trimerization motif controls the topology of short coiled coils. Proc Natl Acad Sci USA 2005, 102:13891-13896.
21. Rybakin V, Clemen CS: Coronin proteins as multifunctional regulators of the cytoskeleton and membrane trafficking. Bioessays 2005, 27:625-632.
22. Morgan RO, Fernandez MP: Molecular phylogeny and evolution of the coronin gene family. Subcell Biochem 2008, 48:41-55.
23. Odronitz F, Hellkamp M, Kollmar M: diArk - a resource for eukaryotic genome research. BMC Genomics 2007, 8:103.
24. Breathnach R, Chambon P: Organization and expression of eucaryotic split genes coding for proteins. Annu Rev Biochem 1981, 50:349-383.
25. Odronitz F, Pillmann H, Keller O, Waack S, Kollmar M: WebScipio: an online tool for the determination of gene structures using protein sequences. BMC Genomics 2008, 9:422.
26. Keller O, Odronitz F, Stanke M, Kollmar M, Waack S: Scipio: using protein sequences to determine the precise exon/intron structures of genes and their orthologs in closely related species. BMC Bioinformatics 2008, 9:278.
27. Parfrey LW, Grant J, Tekle YI, Lasek-Nesselquist E, Morrison HG, Sogin ML, Patterson DJ, Katz LA: Broadly sampled multigene analyses yield a well-resolved eukaryotic tree of life. Syst Biol 2010, 59:518-533.
28. Reeb VC, Peglar MT, Yoon HS, Bai JR, Wu M, Shiu P, Grafenberg JL, Reyes-Prieto A, Rummele SE, Gross J, Bhattacharya D: Interrelationships of chromalveolates within a broadly sampled tree of photosynthetic protists. Mol Phylogenet Evol 2009, 53:202-211.
29. Hampl V, Hug L, Leigh JW, Dacks JB, Lang BF, Simpson AG, Roger AJ: Phylogenomic analyses support the monophyly of Excavata and resolve relationships among eukaryotic “supergroups”. Proc Natl Acad Sci USA 2009, 106:3859-3864.
30. Goode BL, Wong JJ, Butty AC, Peter M, McCormack AL, Yates JR, Drubin DG, Barnes G: Coronin promotes the rapid assembly and cross-linking of actin filaments and may link the actin and microtubule cytoskeletons in yeast. J Cell Biol 1999, 144:83-98.
31. Liu SL, Needham KM, May JR, Nolen BJ: Mechanism of a Concentration-dependent Switch between Activation and Inhibition of Arp2/3 Complex by Coronin. J Biol Chem 2011, 286:17039-17046.
32. Veltman DM, Insall RH: WASP family proteins: their evolution and its physiological implications. Mol Biol Cell 2010, 21:2880-2893.
33. Vertessy BG, Toth J: Keeping uracil out of DNA: physiological role, structure and catalytic mechanism of dUTPases. Acc Chem Res 2009, 42:97-106.

34. Gloss A, Rivero F, Khaire N, Muller R, Loomis WF, Schleicher M, Noegel AA: Villidin, a novel WD-repeat and villin-related protein from Dictyostelium, is associated with membranes and the cytoskeleton. Mol Biol Cell 2003, 14:2716-2727.
35. Yonemura I, Mabuchi I: Heterogeneity of mRNA coding for Caenorhabditis elegans coronin-like protein. Gene 2001, 271:255-259.
36. Xavier CP, Rastetter RH, Stumpf M, Rosentreter A, Muller R, Reimann J, Cornfine S, Linder S, van Vliet V, Hofmann A, Morgan RO, Fernandez MP, Schroder R, Noegel AA, Clemen CS: Structural and functional diversity of novel coronin 1C (CRN2) isoforms in muscle. J Mol Biol 2009, 393:287-299.
37. Asano S, Mishima M, Nishida E: Coronin forms a stable dimer through its C-terminal coiled coil region: an implicated role in its localization to cell periphery. Genes Cells 2001, 6:225-235.
38. Beck K, Gambee JE, Kamawal A, Bachinger HP: A single amino acid can switch the oligomerization state of the alpha-helical coiled-coil domain of cartilage matrix protein. EMBO J 1997, 16:3767-3777.
39. Oku T, Itoh S, Okano M, Suzuki A, Suzuki K, Nakajin S, Tsuji T, Nauseef WM, Toyoshima S: Two regions responsible for the actin binding of p57, a mammalian coronin family actin-binding protein. Biol Pharm Bull 2003, 26:409-416.
40. Cai L, Makhov AM, Bear JE: F-actin binding is essential for coronin 1B function in vivo. J Cell Sci 2007, 120:1779-1790.
41. Gandhi M, Jangi M, Goode BL: Functional surfaces on the actin-binding protein coronin revealed by systematic mutagenesis. J Biol Chem 2010, 285:34899-34908.
42. Van de Peer Y, Maere S, Meyer A: 2R or not 2R is not the question anymore. Nat Rev Genet 2010, 11:166.
43. Steinke D, Hoegg S, Brinkmann H, Meyer A: Three rounds (1R/2R/3R) of genome duplications and the evolution of the glycolytic pathway in vertebrates. BMC Biol 2006, 4:16.
44. Jaillon O, Aury JM, Brunet F, Petit JL, Stange-Thomann N, Mauceli E, Bouneau L, Fischer C, Ozouf-Costaz C, Bernot A, Nicaud S, Jaffe D, Fisher S, Lutfalla G, Dossat C, Segurens B, Dasilva C, Salanoubat M, Levy M, Boudet N, Castellano S, Anthouard V, Jubin C, Castelli V, Katinka M, Vacherie B, Biemont C, Skalli Z, Cattolico L, Poulain J, et al: Genome duplication in the teleost fish Tetraodon nigroviridis reveals the early vertebrate proto-karyotype. Nature 2004, 431:946-957.
45. Chan KT, Creed SJ, Bear JE: Unraveling the enigma: progress towards understanding the coronin family of actin regulators. Trends Cell Biol 2011, 21:481-488.
46. Odronitz F, Kollmar M: Drawing the tree of eukaryotic life based on the analysis of 2,269 manually annotated myosins from 328 species. Genome Biol 2007, 8:R196.
47. Berg JS, Powell BC, Cheney RE: A millennial myosin census. Mol Biol Cell 2001, 12:780-794.
48. Archer SK, Claudianos C, Campbell HD: Evolution of the gelsolin family of actin-binding proteins as novel transcriptional coactivators. Bioessays 2005, 27:388-396.
49. Xavier CP, Eichinger L, Fernandez MP, Morgan RO, Clemen CS: Evolutionary and functional diversity of coronin proteins. Subcell Biochem 2008, 48:98-109.
50. Khurana S, George SP: Regulation of cell structure and function by actin-binding proteins: villin’s perspective. FEBS Lett 2008, 582:2128-2139.
51. Simpson AG, Inagaki Y, Roger AJ: Comprehensive multigene phylogenies of excavate protists reveal the evolutionary positions of “primitive” eukaryotes. Mol Biol Evol 2006, 23:615-625.
52. Keeling PJ: The endosymbiotic origin, diversification and fate of plastids. Philos Trans R Soc Lond B Biol Sci 2010, 365:729-748.
53. Burki F, Shalchian-Tabrizi K, Pawlowski J: Phylogenomics reveals a new ‘megagroup’ including most photosynthetic eukaryotes. Biol Lett 2008, 4:366-369.
54. Burki F, Shalchian-Tabrizi K, Minge M, Skjaeveland A, Nikolaev SI, Jakobsen KS, Pawlowski J: Phylogenomics reshuffles the eukaryotic supergroups. PLoS One 2007, 2:e790.
55. Nozaki H, Maruyama S, Matsuzaki M, Nakada T, Kato S, Misawa K: Phylogenetic positions of Glaucophyta, green plants (Archaeplastida) and Haptophyta (Chromalveolata) as deduced from slowly evolving nuclear genes. Mol Phylogenet Evol 2009, 53:872-880.
56. Keeling PJ: Chromalveolates and the evolution of plastids by secondary endosymbiosis. J Eukaryot Microbiol 2009, 56:1-8.
57. Hackett JD, Yoon HS, Li S, Reyes-Prieto A, Rummele SE, Bhattacharya D: Phylogenomic analysis supports the monophyly of cryptophytes and haptophytes and the association of rhizaria with chromalveolates. Mol Biol Evol 2007, 24:1702-1713.
58. diArk - a resource for eukaryotic genome research. [http://www.diark.org].
59. Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, Madden TL: NCBI BLAST: a better web interface. Nucleic Acids Res 2008, 36:W5-9.
60. Odronitz F, Kollmar M: Pfarao: a web application for protein family analysis customized for cytoskeletal and motor proteins (CyMoBase). BMC Genomics 2006, 7:300.
61. CyMoBase - a database for cytoskeletal and motor proteins. [http://www.cymobase.org].
62. Thompson JD, Gibson TJ, Higgins DG: Multiple sequence alignment using ClustalW and ClustalX. Curr Protoc Bioinformatics 2002, Chapter 2:Unit 2.3.
63. Stamatakis A, Hoover P, Rougemont J: A rapid bootstrap algorithm for the RAxML Web servers. Syst Biol 2008, 57:758-771.
64. Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 2003, 19:1572-1574.
65. Altekar G, Dwarkadas S, Huelsenbeck JP, Ronquist F: Parallel Metropolis coupled Markov chain Monte Carlo for Bayesian phylogenetic inference. Bioinformatics 2004, 20:407-415.
66. Whelan S, Goldman N: A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol 2001, 18:691-699.
67. Abascal F, Zardoya R, Posada D: ProtTest: selection of best-fit models of protein evolution. Bioinformatics 2005, 21:2104-2105.
68. Le SQ, Gascuel O: An improved general amino acid replacement matrix. Mol Biol Evol 2008, 25:1307-1320.
69. Letunic I, Doerks T, Bork P: SMART 6: recent updates and new developments. Nucleic Acids Res 2009, 37:D229-232.
70. Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P, Ceric G, Forslund K, Holm L, Sonnhammer EL, Eddy SR, Bateman A: The Pfam protein families database. Nucleic Acids Res 2010, 38:D211-222.
71. Sigrist CJ, Cerutti L, de Castro E, Langendijk-Genevaux PS, Bulliard V, Bairoch A, Hulo N: PROSITE, a protein domain database for functional characterization and annotation. Nucleic Acids Res 2010, 38:D161-166.
72. Crooks GE, Hon G, Chandonia JM, Brenner SE: WebLogo: a sequence logo generator. Genome Res 2004, 14:1188-1190.
73. Letunic I, Bork P: Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 2007, 23:127-128.

doi:10.1186/1471-2148-11-268
Cite this article as: Eckert et al.: A holistic phylogeny of the coronin gene family reveals an ancient origin of the tandem-coronin, defines a new subfamily, and predicts protein function. BMC Evolutionary Biology 2011 11:268.
