Identification and Characterization of Novel Conserved Domains in Metazoan Zic Proteins
Takahide Tohmonda,1 Akiko Kamiya,1 Akira Ishiguro,1 Takashi Iwaki,2 Takahiko J. Fujimi,1
Minoru Hatayama,3 and Jun Aruga*,1,3
1Laboratory for Behavioral and Developmental Disorders, RIKEN Brain Science Institute, Wako-Shi, Saitama, Japan
2Meguro Parasitological Museum, Meguro-Ku, Tokyo, Japan
3Department of Medical Pharmacology, Nagasaki University Institute of Biomedical Sciences, Nagasaki, Japan
*Corresponding author: E-mail: aruga@nagasaki-u.ac.jp.
Associate editor: John R. True
Abstract
Zic family genes encode C2H2-type zinc finger proteins that act as critical toolkit proteins in the establishment of the metazoan body plan. In this study, we searched for evolutionarily conserved domains (CDs) among 121 Zic protein sequences from 22 animal phyla and 40 classes, and addressed their evolutionary significance. The collected sequences included those from poriferans and orthonectids. We discovered seven new CDs, CD0–CD6 (in order from the N- to the C-terminus), using the most conserved Zic protein sequences from Deuterostomia (Hemichordata and Cephalochordata), Lophotrochozoa (Cephalopoda and Brachiopoda), and Ecdysozoa (Chelicerata and Priapulida). Subsequently, we analyzed the evolutionary history of Zic CDs, including the known CDs (ZOC, ZFD, ZFNC, and ZFCC). All Zic CDs are predicted to have existed in a bilaterian ancestor. During evolution, they have degenerated in a taxon-selective manner with significant correlations among CDs. The N-terminal CD (CD0) was largely lost, but was observed in Brachiopoda, Priapulida, Hemichordata, Echinodermata, and Cephalochordata. The C-terminal CD (CD6) was highly conserved in taxa possessing conserved-type-Zic, but was truncated in vertebrate Zic gene paralogues (Zic1/2/3), generating a vertebrate-specific C-terminus critical for transcriptional regulation. ZOC was preferentially conserved in insects and in an anthozoan paralogue, and it bound the homeodomain transcription factor Msx in a phylogenetically conserved manner. Accordingly, the extent of divergence of Msx and Zic CDs from their respective bilaterian ancestors is strongly correlated. These results suggest that coordinated divergence among the toolkit CDs and among toolkit proteins is involved in the divergence of metazoan body plans.
Key words: Zic, transcription factors, evolution, development, metazoans.
Introduction
Body plans of metazoans have diverged and converged during evolution, providing a basis for adaptation strategies to changing environments. In the current metazoan phylogeny, nonbilaterian animals include poriferans, ctenophores, cnidarians, and placozoans, and most bilaterian animals are divided into three major taxa, Lophotrochozoa, Ecdysozoa, and Deuterostomia (Ruppert et al. 2004; Telford et al. 2015; Brusca et al. 2016). Structural alterations of the genome largely account for body plan evolution. Recent evolutionary developmental studies have clarified key proteins that play roles in establishing animal body plans. The genes for such proteins are often called toolkit genes, which include those encoding transcription factors and components of intra- or intercellular signaling (Meyerowitz 1999; True and Carroll 2002).
As part of the toolkit genes, Zic family genes encode C2H2-type zinc finger proteins, and they are harbored in bilaterian, cnidarian, and placozoan genomes, but have not been detected in the genomes of poriferans and ctenophores. They are essential for a variety of developmental processes. In vertebrates and ascidians, they play roles both in the ectodermal and mesodermal lineages: for example, neural differentiation, neural plate border specification, and node/notochord/somite development (references in Aruga 2004; Houtmeyers et al. 2013). In ecdysozoans, they participate in embryonic segmentation, visceral mesoderm differentiation of arthropods, and neural cell specification and epidermal differentiation of nematodes (Alper and Kenyon 2002; Bertrand and Hobert 2009). In the lophotrochozoan planaria, they are required for head regeneration, including eye and CNS regeneration (Vasquez-Doorman and Petersen 2014; Vogg et al. 2014). Furthermore, expression profiles suggest that cnidarian Zic genes are involved in the development of the ectoderm, gastrodermis (a bifunctional endomesoderm) (Layden et al. 2010), and nematocytes (Lindgens et al. 2004).
Accumulating evidence regarding the involvement of Zic genes in a wide range of metazoan developmental processes has raised the question of how Zic genes have been involved in the change of animal body plans throughout the course of evolution. Several studies have addressed this question. Experiments using animal models have revealed their roles in the establishment of binocular vision in vertebrates (Herrera et al. 2003) and dorsoventral patterning of somites in teleost fish (Moriyama et al. 2012). Conversely, molecular phylogenetic analyses have provided starting-point hypotheses about the evolutionary history of these genes.
Article
© The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
Mol. Biol. Evol. 35(9):2205–2229 doi:10.1093/molbev/msy122 Advance Access publication June 14, 2018
Downloaded from https://academic.oup.com/mbe/article/35/9/2205/5037825 by guest on 06 August 2022
Firstly, concerning their origins, Zic genes are derived from a common ancestor of the Gli-Glis-Zic superfamily proteins that share similar five-C2H2-type zinc finger domains (ZFDs) (reviewed in Aruga and Hatayama 2018). In Gli-Glis-Zic proteins, the two N-terminal C2H2 zinc finger motifs conform to the tandem-CWCH2 motif that characterizes structurally unified zinc fingers (Hatayama and Aruga 2010), and their ZFDs show monophyly in the molecular phylogenetic analysis of eukaryotic zinc finger proteins (Hatayama and Aruga 2010; Layden et al. 2010). The common ancestor of Gli-Glis-Zic existed in a metazoan ancestor (Aruga et al. 2006; Hatayama and Aruga 2010; Layden et al. 2010). Secondly, a prototypal Zic gene existed in a bilaterian ancestor (urbilaterian). This is based on the presence of an absolutely conserved intron (A-intron) in bilaterian Zic genes, and the distribution of all previously known conserved domains (CDs), that is, Zic-Opa-conserved (ZOC), ZFD, and zinc finger N-terminally conserved (ZFNC), in each of the three major bilaterian taxa (supplementary fig. S1, Supplementary Material online) (Aruga et al. 2006; Layden et al. 2010). Thirdly, the CDs in urbilaterian prototypal Zic genes were selectively degenerated in several animal taxa, including Tunicata (phylum Chordata in Deuterostomia), Platyhelminthes (Lophotrochozoa), Dicyemida (Lophotrochozoa), and Nematoda (Ecdysozoa) (Aruga et al. 2006). Lastly, vertebrate paralogues in Amphibia, Reptilia, Aves, and Mammalia were generated through the following sequence of events: tandem gene duplication, C-terminal truncation of one of the two genes, quadruplication of the whole genome, and loss of three genes (supplementary fig. S1, Supplementary Material online) (Aruga et al. 2006).
However, these hypotheses may not have been fully verified, presumably because Zic sequences were available from only a limited number of animal species. Previous comparative genomic studies did not compare features other than the three known CDs and exon–intron boundaries, and minor phyla have been excluded from the analyses. Moreover, the functions of the proteins or their domains have not been addressed in any of the comparative analyses. Zic proteins, specifically chordate and fly Zic proteins, possess transcription regulatory activities (Mizugishi et al. 2001; Yagi et al. 2004; Sen et al. 2010) that are yet to be comparatively investigated. In addition, we recently found Msx protein-binding activities of Zic proteins, which we report in this article.
Msx proteins contain a homeodomain (HD) and control ontogeny in many animal species. Msx genes are toolkit genes that are widely distributed in Metazoa (Takahashi et al. 2008). Phylogenetic analysis showed that Msx proteins lost their CDs selectively in some animal taxa (Takahashi et al. 2008). Zic and Msx control cell fate specification at the lateral CNS in both vertebrates and nematodes (Li et al. 2017; Aruga and Hatayama 2018). These findings raise the question of how the evolutionary processes of Zic and Msx are correlated.
In this study, we discovered seven additional CDs that existed in the urbilaterian Zic protein by optimizing the taxa for sampling. We then examined the distribution of all known CDs in Zic sequences from an extended metazoan animal list including those from 10 new phyla and 19 classes, and analyzed the extent and correlation of conservation among these CDs. Finally, we addressed the protein functions in a comparative manner. These analyses revealed Msx as a novel binding partner for ZOC in Zic proteins. Msx conservation showed a strong correlation with that of Zic CDs, indicating molecular coevolution between Zic and Msx. We also present new hypotheses concerning the origin of Zic family proteins, generation of cnidarian paralogues, and presence of the A-intron in a placozoan Zic. These findings reveal not only the evolutionary history of Zic proteins but also the coordinated nature of the protein domain evolution, which may underlie the generation of metazoan body plan diversity.
Results and Discussion
Collection of Novel Zic Protein Sequences and Origin of Zic Gene
To update the molecular phylogenetic analysis of Zic genes, we conducted a homology search against the current genome databases and collected Zic amino acid (AA) sequences from animal taxa that were not included in previous studies. The search identified Zic genes in new phyla including Porifera, Brachiopoda, Bryozoa, Priapulida, Onychophora, Tardigrada, and Orthonectida. In addition, we newly cloned Zic cDNA fragments from Spinochordodes tellinii (a parasitic worm of mantises, belonging to phylum Nematomorpha), Brachionus plicatilis (a planktonic species belonging to phylum Rotifera), and Echinorhynchus gadi (a parasite of teleost fish, belonging to phylum Acanthocephala). In Chordata, we added Zic AA sequences from two new chordate classes: Callorhinchus milii (shark, belonging to class Chondrichthyes) and Petromyzon marinus (sea lamprey, belonging to class Cephalaspidomorphi). We also updated the annotations of Zic sequences in several species including a placozoan, Trichoplax adhaerens, and a centipede, Strigamia maritima. In total, we collected 121 Zic AA sequences from 22 animal phyla and 40 classes (table 1). The extent of animal taxa coverage was larger than that in the latest molecular phylogenetic study on Zic (12 phyla, 21 classes) (Layden et al. 2010).
Poriferan Zic sequences were identified in class Calcarea (three genes in Sycon ciliatum and two genes in Leucosolenia complicata) and in class Homoscleromorpha (one gene in Oscarella carmela) (table 1; supplementary figs. S2 and S3, Supplementary Material online). This finding contrasted with the absence of Zic sequences in any sponges belonging to class Demospongiae (whole genome sequence of Amphimedon queenslandica; Layden et al. 2010; Srivastava et al. 2010) and the transcriptomes of Ephydatia muelleri, Haliclona amboinensis, Haliclona tubifera, Stylissa carteri, and Xestospongia testudinaria (tblastn search against Compagen database; Hemmrich and Bosch 2008). We did not detect any Zic sequences in ctenophores (whole genome sequence of Mnemiopsis leidyi; Ryan et al. 2013 and Pleurobrachia pileus; Moroz et al. 2014, the transcriptomes of Beroe abyssicola, Coeloplana astericola, Euplokamis
Table 1. List of Sequences and the Derived Animal Species in This Study.
Columns: Super-Phylum; Phylum; Subphylum or Class; Species; Species Abbreviation; Zic Paralogue ID; Introns in ZFD; Zic Gene Name or Locus Tag (sequence ID); Gli, Glis, and Msx Gene Name ([locus tag], sequence ID); 18S rRNA Sequence ID.
a Used to define new CDs. A–I, positions of introns in ZFD (supplementary fig. S7, Supplementary Material online); A′, A-intron conforming to GC-AG rule; A″, A-intron remnant; A+1, A-intron +1 site; trC, C-terminus truncated; trN, N-terminus truncated; n.d., not determined.
dunlapae, Pukia falcata, and Vallicula multiformis; Neurobase, https://neurobase.rc.ufl.edu/pleurobrachia/browse) or in choanoflagellates (whole genome sequence of Monosiga brevicollis, Salpingoeca rosetta, and Capsaspora owczarzaki; NCBI and Ensembl databases). In addition, no Zic sequences were identified in fungal species or other living organisms (Aruga et al. 2006; Hatayama and Aruga 2010).
Based on the above facts, we summarized the Zic distribution in the metazoan phylogenetic tree and updated the hypothesis concerning the origin of Zic genes (fig. 1). Because poriferans are monophyletic, with Calcarea + Homoscleromorpha forming the sister group (Nosenko et al. 2013; Riesgo et al. 2014), Zic absence in demosponges may indicate the loss of Zic in the demosponge clade. The phylogenetic positions of poriferans and ctenophores are still controversial (Telford et al. 2016). However, recent studies support the "poriferans as sister to all other animals" model (Feuda et al. 2017; Simion et al. 2017). Based on this model, the absence of Zic in the ctenophores is considered to reflect the loss of Zic in the ctenophore clade. In this case, the direct ancestor of Zic is predicted to have been gained by the metazoan ancestor (fig. 1). Alternatively, if we adopt the "ctenophores as sister to all other animals" model, the direct ancestor of Zic would have been gained by the common ancestor of metazoans excluding ctenophores, after the divergence of ctenophores.
Distribution of Gli and Glis in the phylogenetic tree is also limited to metazoan clades (Hatayama and Aruga 2010). Gli and Glis have been identified in a demosponge species (Amphimedon queenslandica, Layden et al. 2010). We found Gli and Glis1/3 orthologues in another demosponge species (Ephydatia muelleri) (supplementary figs. S2 and S3, Supplementary Material online). It is also known that A. queenslandica contains another Gli/Glis sister/ancestral sequence (Amqgli2/3b, Layden et al. 2010). In our molecular phylogenetic analysis, Amqgli2/3b was grouped with sequences from other demosponge species (E. muelleri, Haliclona amboinensis, and Xestospongia testudinaria) and a calcarean species (S. ciliatum), suggesting the presence of a Gli/Glis gene unique to poriferans (supplementary figs. S2 and S3, Supplementary Material online). These results, collectively, indicated that the Gli-Glis-Zic superfamily common ancestral gene may have already differentiated into Gli, Glis, and Zic in the last common ancestor of the metazoans.
Distribution of a Strongly Conserved Intron (A-Intron) in Zic Genes
We revised the annotation of the Tad-Zic gene in the placozoan Trichoplax adhaerens (Srivastava et al. 2008) as follows. The current DDBJ/EMBL/GenBank database contains an mRNA sequence (XM_002108437.1) that starts from the middle of ZF1, and a whole genome sequence contig (ABGP01000034.1) that includes two presumptive exons with Zic ZF1-3 and ZF4-5 as the splicing donor and acceptor, respectively (supplementary fig. S4, Supplementary Material online). However, the first methionine is located in the middle of ZF1 of the ZF1-3-containing exon, and the splicing acceptor sequence does not coincide between the mRNA and the contig sequence. Therefore, we searched for an upstream exon and hypothesized a new Tad-Zic protein. The new Tad-Zic gene contained an A-intron that conforms to the GC-AG rule, a consensus sequence that is frequently involved in alternative splicing in humans (Thanaraj and Clark 2001).
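The distinction between the canonical GT-AG introns and the GC-AG class mentioned above can be illustrated with a minimal sketch that inspects the terminal dinucleotides of an intron. The function name and the toy sequences are hypothetical illustrations and are not Tad-Zic sequence data.

```python
def classify_intron(intron: str) -> str:
    """Classify an intron by its 5' donor and 3' acceptor dinucleotides."""
    # Accept DNA or RNA, in either case.
    seq = intron.upper().replace("U", "T")
    if len(seq) < 4:
        raise ValueError("intron too short to carry donor and acceptor dinucleotides")
    donor, acceptor = seq[:2], seq[-2:]
    if acceptor == "AG" and donor == "GT":
        return "GT-AG"   # canonical class
    if acceptor == "AG" and donor == "GC":
        return "GC-AG"   # the class discussed in the text
    return "other"


# Hypothetical toy introns, chosen only to exercise each branch.
print(classify_intron("GTAAGTTTTCTTTTCAG"))
print(classify_intron("GCAAGTTTTCTTTTCAG"))
```

Only the four terminal bases are examined here; a real annotation would also consider the surrounding consensus and the exon context.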
As the A-intron was only found in bilaterian species in a previous study (Aruga et al. 2006), we investigated the distribution of introns in the ZFD where genomic sequences were available (table 1). As a result, all bilaterian Zic genes except those of the bdelloid rotifer Adineta vaga (Ava) were found to possess A-introns (table 1). In the Ava-Zic sequences, the introns were inserted one base 3′ of the A-intron position (i.e., the A-intron is in HTG*[phase-1]EKP, whereas the Ava_1 and Ava_2 introns are in HTG*[phase-2]EKP, where * denotes the codon with the intron; hereafter, the Ava introns are referred to as A+1 introns). Interestingly, although Ava_1 and Ava_2 each possessed two additional introns in the ZFD, their positions are slightly different between the paralogues (table 1). It is known that bdelloid rotifers including Ava are tetraploid species without meiosis (ameiotic), and possess unusual genomic features (Flot et al. 2013). We speculate that the positions of introns are easily changed in this species and that the A+1 introns may be variants of the A-intron.
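The phase notation used above can be made concrete with a short sketch. Intron phase is the number of bases of the interrupted codon that lie upstream of the intron, so moving an insertion site one base in the 3′ direction changes phase 1 into phase 2 within the same codon. The offsets below are hypothetical coordinates counted within the six-codon window encoding HTGEKP, under the assumption that * marks the G codon; they are not coordinates from any Zic gene.

```python
def intron_phase(cds_offset: int) -> int:
    """Phase of an intron inserted after `cds_offset` coding bases (0, 1, or 2)."""
    if cds_offset < 0:
        raise ValueError("cds_offset must be non-negative")
    return cds_offset % 3


def interrupted_residue(peptide: str, cds_offset: int) -> str:
    """Residue whose codon is split by the intron, or '' for a phase-0 intron."""
    if intron_phase(cds_offset) == 0:
        return ""  # the intron lies between two codons
    return peptide[cds_offset // 3]


PEPTIDE = "HTGEKP"                     # window from the text; G codon spans bases 6-8
A_INTRON_OFFSET = 7                    # assumed: after the first base of the G codon
A_PLUS1_OFFSET = A_INTRON_OFFSET + 1   # one base further 3'

for name, offset in (("A-intron", A_INTRON_OFFSET), ("A+1 intron", A_PLUS1_OFFSET)):
    print(name, "phase", intron_phase(offset),
          "in codon for", interrupted_residue(PEPTIDE, offset))
```

Under these assumptions both sites fall in the codon for G, with phase 1 for the A-intron and phase 2 for the A+1 intron.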
A-introns are also found in highly simplified bilaterians. We previously showed that a dicyemid, a lophotrochozoan parasitic worm without any specialized gut, nervous system, or muscle, possessed Zic genes with A-introns (Aruga et al. 2007). In a recent study, whole genome sequencing analysis of an orthonectid, Intoshia linei, indicated that orthonectids are highly simplified lophotrochozoans with a muscular and nervous system (Mikhailov et al. 2016). The orthonectid Ili-Zic was found to have two introns in the ZFD, one of which was an A-intron. Orthonectids are rare parasites of marine invertebrates having simple body plans without digestive, circulatory, and excretory systems, and have been described as "mesozoan" animals showing an uncertain affinity with placozoans and dicyemids (Brusca et al. 2016). Another intriguing animal group is the Acoela, which lacks an anus, nephridia, and a circulatory system, and is proposed to be a bilaterian and a sister to the Nephrozoa (= Protostomia + Deuterostomia) based on the 11 newly reported xenacoelomorph transcriptomes (Cannon et al. 2016). Using transcript shotgun assembly in the acoel Symsagittifera roscoffensis, we identified two Sro-Zic transcripts. One was an A-intron-spliced-out type (mature form) and the other was an A-intron-retained type; a deposited Convolutriloba longifissura transcript (ADN43077) was also an A-intron-retained type (table 1), indicating that xenacoelomorph Zic genes possess the A-intron. As a consequence, the A-intron is likely to have been present in the "bilaterian ancestor," defined as the common ancestor of Xenacoelomorpha and Nephrozoa.
FIG. 1. A hypothetical model to explain the Zic distribution in metazoans. In taxa framed with broken lines, the absence of Zic gene was confirmed in multiple species. Numbers in parentheses indicate the paralogue numbers. Tree labels: Fungi; Choanoflagellatea; metazoan ancestor (Zic gain); Ctenophora; Porifera [Demospongiae, Calcarea (2,3), Homoscleromorpha (1)]; Placozoa (1); Cnidaria (3-6); Bilateria [Lophotrochozoa (1-3), Ecdysozoa (1,2), Deuterostomia (1,2), Vertebrata (5,6)].
We also examined the introns in cnidarian Zic sequences. There were no introns in the ZFDs from 17 anthozoan sequences. In Hydrozoa, one of the four Hydra vulgaris Zic paralogues (Hvu_3) possessed an A-intron variant (A′-intron), Hvu_1 and Scolionema suvaense (Ssu) carried a D-intron (Aruga et al. 2006), and the others (Hvu_2 and Hvu_4) did not contain any introns. In the A′-intron, a new splicing donor sequence in the A-intron was used for splicing, resulting in the insertion of four AA at the position of the A-intron (supplementary fig. S4, Supplementary Material online). We consider that the A′-intron was formed in the hydrozoan clade after the generation of the four paralogues.
Collectively, the A-intron or its variants existed in bilaterians, placozoans, and hydrozoans. Although the placozoan Trichoplax was classically included in “mesozoan” species, recent molecular phylogenetic analyses indicate that placozoans are closely related to (Cnidaria + Bilateria) (Dohrmann and Worheide 2013). Based on recent molecular phylogenetic analyses (Srivastava et al. 2008; Mallatt et al. 2010; Pick et al. 2010; Telford et al. 2015), we prepared four models to hypothesize the gain and loss of the A-intron (fig. 2). Among the four models, model (D) is discordant with the result of the tree topology test (Srivastava et al. 2008). Considering the absolute conservation of the A-intron in bilaterians, we speculate that there has been a strong negative selection pressure against A-intron loss for unknown reasons. Focusing on the parsimony of A-intron gain and loss, we would favor model (C), in which the A-intron is gained twice: once in the Bilateria + Placozoa common ancestor and once in the hydrozoan clade after paralogue generation. Naturally, the conclusion awaits the resolution of metazoan phylogeny.
Identification of Novel CDs in Bilaterian Zic Proteins
In a previous study, we showed that the extent of conservation varies among the metazoan Zic AA sequences. They are strongly conserved in Vertebrata (Chordata), Cephalochordata (Chordata), Echinodermata, Mollusca, and Arthropoda in comparison to Platyhelminthes, Cnidaria, Nematoda, and Tunicata (Chordata) (Aruga et al. 2006). Zic proteins belonging to the former and latter groups are called conserved-type-Zic and diverged-type-Zic, respectively. To newly define the evolutionarily conserved domains, we selected 21 conserved-type-Zic sequences (table 1). The set of sequences was optimized to comparably represent the three major bilaterian taxa (Deuterostomia, Lophotrochozoa, and Ecdysozoa), where each taxon contained five to seven sequences from three animal phyla. After multiple sequence alignment, we inferred the ancestral sequence, defined by a maximum
likelihood-based prediction in MEGA7 (Kumar et al. 2016). The analysis revealed highly conserved sequence elements throughout the proteins. We defined new CDs according to the following criteria: (1) the sequence element is conserved across the three taxa, (2) the length of the sequence element is >8 AA, and (3) the sequence element is not divided by intervening AA residues in most species. A zinc finger C-terminal flanking region (ZFCC, eight AA) was exceptionally included because of its inclusion in a previous molecular phylogenetic analysis (Aruga et al. 2006). As a result, we obtained new CDs that are well conserved among the selected Zic proteins (fig. 3 and supplementary fig. S6, Supplementary Material online).
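The length and contiguity criteria can be illustrated by a simple scan over alignment columns. The following Python sketch is only an illustration and not the procedure used in this study: the function name `conserved_runs`, the 80% identity threshold, and the toy alignment are assumptions, and criterion (1) is approximated by requiring a majority residue across all aligned sequences rather than within each taxon.

```python
from collections import Counter


def conserved_runs(alignment, min_identity=0.8, min_len=9):
    """Return (start, end) column ranges (0-based, end exclusive) of
    contiguous conserved columns that are at least min_len long.

    alignment    : list of equal-length aligned sequences ('-' marks a gap)
    min_identity : assumed fraction of sequences that must share the
                   majority residue for a column to count as conserved
    min_len      : minimum run length; 9 corresponds to "length > 8"
    """
    n_cols = len(alignment[0])
    flags = []
    for col in range(n_cols):
        residues = [seq[col] for seq in alignment]
        top, count = Counter(residues).most_common(1)[0]
        # A column dominated by gaps is treated as not conserved, so a
        # run is broken wherever most sequences carry an interruption.
        flags.append(top != "-" and count / len(residues) >= min_identity)

    runs, start = [], None
    # The trailing False is a sentinel that closes a run ending at the
    # last alignment column.
    for i, ok in enumerate(flags + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


# Hypothetical three-sequence toy alignment (not real Zic sequences).
toy = ["ACDEFGHIKLMNPQ", "GCDEFGHIKLMNAA", "ACDEFGHIKLMNPQ"]
candidate_elements = conserved_runs(toy)
```

In practice the threshold and the per-taxon treatment would need to be chosen to match the alignment and the taxon sampling actually used.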
We next examined the distribution of CD0–CD6 and ZFCC together with the known domains (ZOC, ZFNC, and ZFD) across all eumetazoan Zic proteins (figs. 4 and 5; supplementary fig. S5, Supplementary Material online). CD0 was located
[Figure 2: four tree diagrams, panels (A) to (D), each placing Placozoa, Cnidaria (Anthozoa and Hydrozoa), and Bilateria, with A-intron gain and A-intron loss events marked on the branches.]
FIG. 2. Hypothetical models to explain the A-intron gain and loss during evolution. (A) and (B) depend on the tree according to Srivastava et al. (2008). (C) Tree supported by Pick et al. (2010). (D) Trichoplax sister to Hydrozoa is less likely (P < 0.01) in tree topology tests in Srivastava et al. (2008).
at the N terminus of Zic AA sequences from limited taxa including Priapulida, Chelicerata (Arthropoda), Brachiopoda, Cephalopoda (Mollusca), Hemichordata/Echinodermata, and Cephalochordata. In Vertebrata, a partially conserved sequence could be identified in Zic4/5 paralogues. In the remaining taxa, Zic proteins existed as N-terminally truncated proteins containing CD1 as the N-terminal CD. CD0 is predicted to have been present in the bilaterian ancestor but has been extensively lost during evolution. Although its function is unknown, we noticed the distribution of CD0 in so-called “living fossil” species such as Limulus polyphemus (Smith and Berkson 2005) and Lingula anatina (Emig 2008). In this regard, it would be tempting to call CD0 a “living fossil domain.”
On the other hand, the C-terminal conserved domain, CD6, was strongly conserved in most conserved-type-Zic proteins and was weakly conserved in nematode, cnidarian, and placozoan Zic proteins. Interestingly, an “EWYV” sequence motif that was strongly conserved at the C-termini of the vertebrate Zic1/2/3 paralogues could be found in the N-terminal region of CD6, and its N-terminal flanking region showed similarity to CD6 in the multiple alignment. The result indicates that the vertebrate Zic1/2/3-type C-termini are truncated variants of CD6. Truncation of CD6 at the same position has not been observed in any Zic proteins besides the vertebrate Zic1/2/3 and can be regarded as a unique innovation in early vertebrates. Because the functional importance of the C-termini of Zic1/2/3 has been described, we hypothesized that this structural change may have a role in the establishment of the vertebrate nervous system. The C-termini of Zic1 and Zic2 have been shown to have transcriptional regulatory activities (Kuo et al. 1998; Mizugishi et al. 2004; Twigg et al. 2015), and Zic1 and Zic2 play major roles in vertebrate CNS development: their loss of function results in dysgenesis of the central nervous system, grossly characterized by hypoplastic changes in the dorsal neural tube along the entire rostrocaudal axis including the forebrain, cerebellum, and spinal cord (reviewed by Aruga 2004). In the case of human ZIC1, loss of the C-terminal CD is associated with calvaria deformity and learning disability (Twigg et al. 2015).
The remaining new CDs (CD1–CD5) are located N-terminally to the ZFD (fig. 5). ZOC is located between CD1 and CD2. These domains were variably degenerated across the taxa. CD3 and ZOC are conserved as highly as CD6 and are retained in most conserved-type-Zic genes. However, ZOC is more clearly conserved in insects (Arthropoda), in accordance with its derivation (Zic-Opa Conserved, named from a comparison between mouse and fly homologues). We also observed clear conservation in a subset (Nve_A and Epa_4) of sea anemone (order Actiniaria, class Anthozoa, phylum Cnidaria) Zic sequences.
ZFNC, ZFD, and ZFCC are placed adjacently, forming the largest compound CD. ZFNC was strongly conserved, as also shown in previous studies (Aruga et al. 2006; Layden et al.
FIG. 3. Multiple sequence alignment of Zic CDs. URB, predicted urbilaterian sequence; dot, identical to the URB residue above. Full alignments are shown in supplementary fig. S6, Supplementary Material online.
2010), whereas the extent of conservation was much lower in ZFCC (fig. 4).
Absolute conservation was observed in 51 AA residues in the ZFD (supplementary fig. S7, Supplementary Material online), including cysteine, histidine, and tryptophan residues of the C2H2/tCWCH2 motif. The tCWCH2 motif is proposed to be a hallmark of the structurally unified zinc finger unit (Hatayama and Aruga 2010). ZF1 and ZF2 in all Zic proteins conformed to tCWCH2. However, the C2H2 motif was not completely conserved. For instance, ZF5 is missing in Pmi (Echinodermata) (PmZic; Yankura et al. 2010) and in Hvu_3 and Hvu_4 (Cnidaria) (Zic3 and Zic2, respectively; Hemmrich et al. 2012), in addition to the C-terminally truncated partial sequence record (Clo). These results indicate that Zic isoforms lacking ZF5 can occur in the course of metazoan evolution. This study also revealed the presence of intervening sequences: a particularly long one in the Hrob ZF1 C-C region (61 AA), as well as ones in the Ste ZF1-ZF2 linker region (42 AA) and the novel Tcan ZF2 C-H region (9 AA). This finding is in agreement with the increased frequency of intervening sequences in tCWCH2 ZFs (Hatayama and Aruga 2010).
To further clarify the evolutionary traits of Zic CDs, we examined the correlation of the extent of evolutionary conservation among known CDs. The matrix of r values of Spearman’s rank correlation coefficient (fig. 6) indicates moderate to strong (r = 0.35–0.82) positive correlations among the Zic CDs. The strongest correlations (r = 0.79–0.82) were observed in the ZFNC-ZFD, CD3–CD5, CD1–CD5, CD1–CD3, and ZFD-CD6 pairs. These results indicate that degeneration of the Zic CDs has occurred coordinately during evolution, suggesting the presence of intramolecular functional or structural associations in ancient Zic proteins.
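As a reminder of what the rank-order coefficient measures, the sketch below computes Spearman's r for two lists of conservation percentages by ranking each list (ties receive the mean rank) and taking the Pearson correlation of the ranks. It is a minimal illustration under stated assumptions: the function names and the example percentages are hypothetical and are not values from fig. 4 or fig. 6.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical conservation percentages of two CDs in five species.
cd_a = [80, 60, 100, 20, 40]
cd_b = [70, 90, 100, 10, 30]
r_ab = spearman(cd_a, cd_b)
```

Applying `spearman` to every pair of CD columns would give a matrix of the kind summarized in fig. 6; a library routine such as `scipy.stats.spearmanr` could be used instead where SciPy is available.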
In a previous study, we observed coevolution of Zic ZF1 and ZF2, which are structurally fused to form a single globular unit (Hatayama et al. 2008; Hatayama and Aruga 2010). The ZF1/2 unit frequently possessed insertions, and the extent of evolutionary conservation was lower in ZF1/2 than in ZF3/4/5 (Hatayama and Aruga 2010) (supplementary fig. S7, Supplementary Material online). It is known that protein repeats in an open structure and in independently folding units are more volatile, and that volatile CDs are often shaped by concerted evolution, likely by recombination (Schuler and Bornberg-Bauer 2016). However, ZF3, ZF4, and ZF5 are placed under strong evolutionary constraints even though they are predicted to form independent globular structures (Hatayama et al. 2008). This may be explained by the fact that ZF3, ZF4, and ZF5 are essential for DNA and cofactor binding (Hatayama and Aruga 2018).
FIG. 4. Preservation of conserved domains (CDs) in metazoan Zic proteins. Red color gradient in the boxes indicates the percentage of conservation, where maximal matching to the urbilaterian CD sequence was defined as 100% and the minimal BLAST score as 1%, as shown in the inset scale. Blank box indicates the absence of a sequence element with minimally detectable homology (BLAST score > 25) to the urbilaterian CD sequence. In the ZOC core column, gray and blank boxes indicate the presence and absence of the ZOC core sequence consensus, defined as (R/S/N)(D/E)(F/L)(V/L/I)(L/F)(R/K)(R/N/S). Arrowheads, parasitic animals; hyphen (–), not applicable; rank, ranking order of overall CD conservation extent, defined as a summation of the percentages of each conserved domain among the 116 full-length sequences. Mouse Zic proteins (Mmu_1/2/3/4/5) show the same profiles as Hsa_1/2/3/4/5, respectively.
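The scaling and ranking described in the legend can be sketched as follows. This is an illustration only: the legend does not state the interpolation between the 1% and 100% anchors, so a linear mapping is assumed here, the detection threshold of 25 is taken as the minimal score, and the function names and example numbers are hypothetical.

```python
def conservation_percent(score, max_score, min_score=25.0):
    """Map a BLAST score to a conservation percentage.

    Returns None (a blank box) when the score does not exceed the
    detection threshold. Otherwise the score is placed on an assumed
    linear scale from 1% (at min_score) to 100% (at max_score, the
    self-match of the urbilaterian CD sequence).
    """
    if max_score <= min_score:
        raise ValueError("max_score must exceed min_score")
    if score <= min_score:
        return None
    return 1.0 + 99.0 * (score - min_score) / (max_score - min_score)


def overall_rank(profiles):
    """Order sequence names by the sum of their CD percentages.

    profiles maps a sequence name to a list of per-CD percentages,
    with None for CDs that were not detected.
    """
    totals = {
        name: sum(p for p in percents if p is not None)
        for name, percents in profiles.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


# Hypothetical example: two sequences scored for two CDs.
example = {
    "seq_a": [conservation_percent(100, 100), conservation_percent(20, 100)],
    "seq_b": [conservation_percent(62.5, 100), conservation_percent(70, 100)],
}
ranking = overall_rank(example)
```

A different interpolation (for example, on a logarithmic score scale) would change the individual percentages but not the idea of summing them to obtain the rank.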
Diversification of Zic Genes in Each Taxon
Molecular phylogenetic tree analysis using the full set of metazoan Zic ZFNC-ZFD-ZFCC region AA sequences revealed novel evolutionary processes of Zic genes in each taxon. First, anthozoan Zic paralogues (Nve_D, Epa_1, Epa_2, Adi_1, and Ofa_1), (Nve_C, Epa_3, Adi_2, and Ofa_2), (Nve_A and Epa_4), and (Nve_E and Epa_6) were grouped with moderate to strong statistical support (fig. 7A and supplementary fig. S8, Supplementary Material online). On the other hand, hydrozoan Zic sequences were not grouped with their anthozoan
[Figure 5: domain diagrams (0, 1, Z, 2, 3, 4, 5, N, ZF, C, 6) arranged on a tree. Labels shown: Metazoa ancestor; Porifera; Placozoa; Hydrozoa (Cnidaria); Anthozoa (Cnidaria); Bilateria ancestor; Deuterostomia; Hemichordata/Echinodermata; Cephalochordata; Tunicata; Vertebrata 1/2/3; Vertebrata 4/5; Protostomia; Lophotrochozoa; Ecdysozoa; Brachiopoda; Bivalvia, Cephalopoda, and Gastropoda (Mollusca); Annelida; Platyhelminthes; Rotifera/Acanthocephala, Dicyemida; Priapulida; Chelicerata, Insecta, and Crustacea (Arthropoda); Onychophora; Tardigrada; Nematoda.]
FIG. 5. Zic CDs during evolutionary processes. Gray gradient, the extent of conservation (higher-darker); 0–6, CD0–CD6; 6′, carboxy-terminally truncated CD6 in vertebrates; Z, ZOC; N, ZFNC; ZF, ZFD; C, ZFCC.
FIG. 6. Correlation of the conservation extent among Zic CDs. Values indicate Spearman’s correlation coefficient obtained by a rank-order correlation test. Gray background indicates strong correlation (r > 0.7).
counterparts, but Hvu_1 and Ssu were grouped within the taxon. Together with the taxonomic information that the four anthozoan species belong to either Order Actiniaria (anemones) (Nve and Epa) or Order Scleractinia (corals) (Adi and Ofa) (Shinzato et al. 2011), we hypothesize a process of cnidarian Zic paralogue generation (fig. 7B) in which the anthozoan ancestor had three Zic paralogues, two additional Zic genes were generated by duplication in the Actiniaria ancestor, and one further duplication occurred in the Epa ancestor after its divergence from Nve. ZOC was predicted to have existed in the cnidarian ancestor but was retained in only one type of paralogue in the Actiniaria.
Concerning other paralogues, we found conservation of each paralogue between mammals (Hsa) and cartilaginous fish (Cmi) (supplementary fig. S8, Supplementary Material online), indicating that vertebrate paralogues were generated in a vertebrate ancestor before the Chondrichthyes diverged. Rotifera and Acanthocephala are reported to have a close relationship in molecular phylogeny (Garey et al. 1996; Sielaff et al. 2016). However, we did not see a clear affinity between the rotifer (Bpl and Ava) and acanthocephalan (Ega) Zic sequences (supplementary fig. S8, Supplementary Material online).
Sequence-Dependent Transcriptional Activation Is an Evolutionarily Conserved Function of Metazoan Zic Proteins
We next addressed the function of Zic proteins from a phylogenetic perspective. Chordate (vertebrate and tunicate) and fly Zic proteins have been shown to have transcriptional regulatory activities (Mizugishi et al. 2001; Yagi et al. 2004; Sawada et al. 2005; Sen et al. 2010). To examine the possibility of transcriptional regulatory activity, we constructed N-terminal FLAG epitope-tagged expression vectors for Nve_A/B/C/D/E, Hbl, Ttu, Pim, Afr, Dme, Cel, Ppe, Ci_a/b, Bfl, and mouse (Mmu_1/2/3/4/5) proteins. Mouse Zic proteins were chosen because they are well characterized and highly similar to human ZIC proteins (Aruga 2004; Houtmeyers et al. 2013). These were transfected into cultured mammalian cells along with the Tgif1 promoter, which contains high-affinity Zic-binding sites (Ishiguro et al. 2017). The results indicated that most Zic proteins showed binding sequence-dependent transcriptional regulatory activities (fig. 8A and B). Among the vertebrate paralogues, Zic2 showed the strongest activation. When compared with mouse Zic levels, several invertebrate Zic proteins (Bfl, Ppe, Pim, Cel, Nve_A, Nve_C, and Nve_E) showed more than half of the transcriptional activation of
FIG. 7. Evolutionary processes in cnidarian Zic. (A) A part of the molecular phylogenetic tree by Maximum Likelihood analysis using Zic ZFNC-ZFD-ZFCC AA sequences. Branches with bootstrap values below 0.5 were condensed. The internal values indicate the bootstrap values in Maximum Likelihood analysis (above branches) and posterior probabilities in percentages in Bayesian inference analysis (below branches). The complete tree is presented in supplementary fig. S8, Supplementary Material online. (B) Predicted evolutionary processes of cnidarian Zic genes. +1 to +3, acquired gene numbers by gene duplication.
[Figure 8: (A) bar graph of relative luciferase activity (Mmu-2 = 100%) for Tgif-luc and TgifΔZBS-luc; samples: empty, Nve-A to Nve-E, Ssu, Ttu, Hbl, Pim, Afr, Dme, Cel, Ppe, Cin-a, Cin-b, Bfl, and Mmu-1 to Mmu-5, grouped as Cnidaria, Protostomia, and Deuterostomia. (B) and (C) immunoblot panels; (C) includes empty, Nve-A, Ttu, Pim, Dme, Cel, Ppe, and Mmu-2 lanes with Input.]
FIG. 8. Transcriptional regulatory activity of Zic proteins from various animals. (A) N-terminally FLAG-tagged Zic protein expression plasmids or the empty vector were cotransfected with Tgif promoter-driven luciferase (Tgif-luc) or Tgif promoter lacking three high-affinity Zic binding sites (ZBS) (TgifΔZBS-luc) in NIH3T3 cells. Bar graphs show averages of triplicate experiments where each value was normalized by that of the internal control. Error bar, SD; *P < 0.05; **P < 0.01 in t-test between Tgif-luc and TgifΔZBS-luc. (B) Immunoblot with anti-FLAG tag antibody. The cell extracts in (A) were subjected to immunoblotting. (C) Protein binding abilities of Zic proteins to the transcription regulatory molecular complexes. FLAG-tagged Zic expression plasmids were transfected into 293T cells. The cell lysates were immunoprecipitated (IP) with anti-FLAG epitope antibody and immunoblotted (IB) for Zic binding proteins (Ishiguro et al. 2007).
mouse Zic2. Most Zic proteins except Afr showed significant activation, and those except Nve_B, Afr, and Cin_b showed significant sequence dependence. These results indicate that the transcriptional regulatory activities are maintained despite the apparent absence of CDs other than ZFD (e.g., Ssu and Cin_a).
The above results led us to examine the binding of Zic proteins to proteins that are proposed to be associated with their transcriptional regulation (Ishiguro et al. 2007). For this purpose, we expressed Nve_A, Ttu_B, Pim, Dme, Cel, Ppe, and mouse Zic2 (Mmu_2) in HEK293T cells and immunoprecipitated them using an anti-FLAG antibody; the coprecipitated proteins were then analyzed by immunoblotting (fig. 8C). All the Zic-binding proteins (DNA-PKcs, Ku70, Ku80, poly ADP-ribose polymerase, and RNA helicase A) were found to be coimmunoprecipitated with Zic proteins, at least in part. It is known that these proteins associate with mouse Zic2 through either the CD2-ZFNC region (DNA-PKcs) or ZF3 in the ZFD (Ku70, Ku80, poly ADP-ribose polymerase, and RNA helicase A) (Ishiguro et al. 2007). We consider that interaction with these proteins underlies the highly conserved transcriptional regulatory activities among Zic proteins.
ZOC Is Essential for Zic–Msx Interaction
Recent studies have shown that Zic and a homeodomain (HD) transcription factor, Msx, are involved in an evolutionarily conserved gene regulatory cascade to specify the lateral border region of the bilaterian central nervous system (CNS) (Simoes-Costa and Bronner 2015; Li et al. 2017). We also found that mouse Zic1 and Msx are colocalized in the cell nuclei of the dorsal spinal cord and the progress zone beneath the apical ectodermal ridge of developing limbs (Aruga et al. 2002) (fig. 9A). These facts led us to examine whether Zic protein could physically interact with Msx protein. We transfected HA-tagged mouse Zic2 and FLAG-tagged Msx2 expression vectors into mammalian cells and performed a coimmunoprecipitation assay. The result showed that Zic2 physically interacted with Msx2 (fig. 9B). We next mapped the Msx-binding site in Zic2 protein using a series of N-terminally or C-terminally deleted mutants and found that the Zic2 AA 100–140 region, including ZOC, was necessary for Msx2 binding (fig. 9C). We also mapped the Zic2-binding domain in Msx2 using Msx2 deletion mutants and found that the HD is essential for interaction with Zic2 (fig. 9D). The purified HD-GST fusion protein bound HA-Zic2 protein in a GST-pulldown assay (fig. 9E), suggesting a direct interaction between Zic2 and Msx2.
In a previous study, ZOC was shown to be bound by a transcriptional repressor, I-mfa (Inhibitor of MyoD family, also called Mdfi) protein (Mizugishi et al. 2004). Therefore, it is likely that ZOC serves as a regulatory hub to control Zic protein function.
Msx-Binding Abilities Are Widely Conserved in Metazoan Zic Proteins
Msx proteins are also widely conserved in metazoa (Takahashi et al. 2008) and play critical roles in neural and skeletal development (Alappat et al. 2003; Ramos and Robert 2005). We next asked how widely the Msx-binding abilities are retained in metazoan Zic proteins. We cloned Msx cDNA from mouse and sea anemone and constructed FLAG-tagged Mmu-Msx1 and Nve-Msx expression vectors. These vectors were cotransfected with MYC-tagged Zic proteins from Mmu, Nve, Ttu, Pim, Dme, Cel, and Ppe, and the Msx proteins were precipitated using anti-FLAG antibodies. The result indicated that all the tested Zic proteins were coprecipitated with both Mmu- and Nve-Msx proteins (fig. 10A and B). However, Dme-Zic (Opa) and Cel-Zic (ref-2) coprecipitated less efficiently than the other Zic proteins in both cases.
Although Cel-Zic does not contain an apparent ZOC in the CD homology search, it contains a (KDKMMKS) sequence instead of the typical ZOC (RDFL[L/F]RR) (Layden et al. 2010). The similarity in the positions of the charged and hydrophobic residues might be sufficient for binding. Therefore, the results suggest that a detailed structure-function analysis based on these experiments is required to understand the evolutionarily conserved protein–protein interaction. In addition, we cannot exclude the involvement of other CDs in the binding of Zic to Msx.
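The difference between the typical ZOC and the Cel-Zic sequence can be checked against the ZOC core consensus given in the fig. 4 legend, (R/S/N)(D/E)(F/L)(V/L/I)(L/F)(R/K)(R/N/S). The short Python sketch below encodes that consensus as a regular expression. The function name `find_zoc_core` and the flanking residues in the second example are illustrative assumptions; only the motifs quoted in the text are taken from the source.

```python
import re

# ZOC core consensus as written in the fig. 4 legend:
# (R/S/N)(D/E)(F/L)(V/L/I)(L/F)(R/K)(R/N/S)
ZOC_CORE = re.compile(r"[RSN][DE][FL][VLI][LF][RK][RNS]")


def find_zoc_core(protein_seq):
    """Return (start, matched_text) for the first consensus hit, or None.

    start is the 0-based position of the hit in the amino acid sequence.
    """
    match = ZOC_CORE.search(protein_seq.upper())
    return (match.start(), match.group()) if match else None


typical_l = find_zoc_core("RDFLLRR")        # typical ZOC, L variant
typical_f = find_zoc_core("AAARDFLFRRGG")   # F variant with made-up flanks
cel_like = find_zoc_core("KDKMMKS")         # Cel-Zic sequence from the text
```

Under this strict consensus the Cel-Zic sequence is not a hit, which is consistent with the absence of an apparent ZOC in the homology search; a looser pattern based on residue classes (charged versus hydrophobic) would be needed to capture the similarity noted above.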
Msx Conservation Extent Was Strongly Correlated with That of Zic CDs
Having obtained results suggesting evolutionary conservation of the Zic–Msx interaction, we examined how the evolutionary processes of the two genes are correlated. In a previous study, the evolutionary process of metazoan Msx genes was described from the viewpoint of conservation and diversification from a bilaterian ancestor (Takahashi et al. 2008). We then compared the conservation extent of Msx and Zic CDs. As an Msx CD, the HD with 8 AA of N-flanking and 17 AA of C-flanking sequence was chosen because this domain is sufficiently large for conservation extent analysis and because the HD can mediate physical interaction with Zic proteins. We calculated the HD homology score between the predicted bilaterian ancestral Msx AA sequence and Msx sequences from 38 species in which Zic sequences are known (table 1). Correlation coefficients were determined between each Zic CD and the Msx HD (fig. 11A). As a reference, we used the evolutionary distances of 18S ribosomal RNA (18S) sequences of the corresponding species. The result indicated that both the Msx HD and 18S showed moderate to strong correlations with the Zic CDs. However, the coefficient values for Msx-Zic CDs were higher than those for 18S-Zic CDs across all CDs (P < 0.01 in a paired t-test), and the values for CD1 and CD3 were particularly well correlated with those of the Msx HD, compared with those of 18S. The value for ZFD was high but was comparable to that of 18S. Scatter plots of the Zic ZFNC-ZFD-ZFCC score against the Msx HD or 18S score are shown in figure 11B and C to illustrate the detailed correlation profile. The graphs indicate that both Zic and Msx are strongly conserved in animals belonging to the Echinodermata/Hemichordata, Mollusca, Cephalochordata, and Vertebrata groups, but are poorly conserved in Nematoda, Platyhelminthes, and Urochordata. There was disparity concerning the cnidarian sequences, where Msx sequences showed high conservation in anthozoa and low conservation
FIG. 9. ZOC domain is necessary for binding to Msx protein. (A) (a, b) Zic2 (a) and Msx2 (b) mRNA distribution in embryonic day 9.5 mouse embryos. Blue signals indicate the presence of mRNA. AER, apical ectodermal ridge of limb bud epithelium; LB, limb buds; PZ, progress zone of limb bud mesenchymal cells; SC, prospective spinal cord. (c–h) Double immunostaining indicating Zic2 (red) and Msx2 (green) protein distribution in tissue sections. Overlapping distribution is shown in yellow. (c and d) Limb bud. (d) An enlarged view of the tip of the limb bud. (e, f, g) Cross section through the dorsal trunk including the spinal cord. These three images are derived from a single section. In (g), the signals in (e) and (f) have been merged. (h) An enlarged view of the dorsal part of the spinal cord and the surrounding mesenchymal cells. (B) Physical interaction between Zic2 and Msx2. HA-tagged Zic2 and (FLAG-tagged Msx2 or empty) expression plasmids were cotransfected into HEK293T cells. Cell lysates of the transfected cells were immunoprecipitated (IP) with anti-FLAG antibody and immunoblotted (IB) with anti-HA antibody. (C) ZOC is necessary for Msx2 binding. HA-tagged deletion mutants of Zic2 (top) and FLAG-Msx2 expression plasmids were cotransfected and immunoprecipitated with anti-FLAG antibody and immunoblotted with anti-HA antibody. Asterisk indicates the nonspecific bands generated by immunoglobulins after immunoprecipitation. (D) Msx2 homeodomain is necessary for Zic2 binding. FLAG-tagged deletion mutants of Msx2 (top) and HA-Zic2 expression plasmids were cotransfected and immunoprecipitated with anti-FLAG antibody and immunoblotted with anti-HA antibody. (E) GST-pulldown (PD) assay. FLAG-Zic2 transfected cell lysate was incubated with GST or GST-Msx2-HD fusion protein. Coprecipitated FLAG-Zic2 was detected with the anti-FLAG antibody. Cell lysates (10%) were electrophoresed in the Input (10%) lane.
in hydrozoa, consistent with a previous report (Takahashi et al. 2008).
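The paired comparison of Msx-Zic and 18S-Zic coefficient values described in this section can be sketched as a paired t statistic over the per-CD differences. The code below is a minimal illustration, not the analysis pipeline of this study: the function name `paired_t` and the three example coefficient pairs are hypothetical, and the conversion of t to a P value (which requires the t distribution with n - 1 degrees of freedom) is left out.

```python
import math
import statistics


def paired_t(a, b):
    """Return (t, degrees_of_freedom) for a paired t-test of a versus b.

    a and b hold one value per CD (for example, the correlation
    coefficient with the Msx HD and with 18S for the same Zic CD).
    Raises ZeroDivisionError if all paired differences are identical.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_d / (sd_d / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical coefficients for three CDs (not values from fig. 11A).
msx_vs_zic = [0.8, 0.7, 0.9]
s18_vs_zic = [0.6, 0.6, 0.7]
t_stat, dof = paired_t(msx_vs_zic, s18_vs_zic)
```

Pairing by CD is what makes the test sensitive to a consistent shift between the two sets of coefficients even when the coefficients themselves vary widely among CDs.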
Significance of Zic-Msx Interaction during Evolution
The above results suggest that the evolutionary processes of the Zic and Msx gene families are closely related (summarized in fig. 12). Both Zic and Msx proteins are distributed widely in bilaterians (Aruga et al. 2006; Takahashi et al. 2008; Layden et al. 2010). The role of Msx genes in neuroectodermal patterning has been suggested by their expression in the lateral longitudinal columns in the fruit fly and in vertebrates (Isshiki et al. 1997; Arendt and Nubler-Jung 1999), and this feature is now extended to the nematode nervous system (Li et al. 2017). Expression in the lateral longitudinal columns has also been described for the vertebrate Zic family (Nagai et al. 1997; Fujimi et al. 2006) and in nematodes (Li et al. 2017). Their common roles in cell fate specification have been indicated for both Protostomia and Deuterostomia (Simoes-Costa and Bronner 2015; Li et al. 2017). This study showed Zic/Msx coexpression in the limb buds (fig. 9A), where both genes have a role in limb patterning (Satokata and Maas 1994; Nagai et al. 2000; Satokata et al. 2000; Quinn et al. 2012). In addition, a genetic interaction could be predicted in the cranial bones, as both Zic and Msx are associated with craniosynostosis (ZIC1 and MSX2, developmental abnormality of calvaria bones) (Wilkie et al. 2000; Twigg et al. 2015) and jaw development (Inoue et al. 2004; Cerny et al. 2010). Although the functional link is limited to the nematodes at present, the conserved binding between Zic and Msx may predict additional links in unexplored species.
The cooperation between Zic and Msx raises the possibility that toolkit genes that are often coopted are placed under similar evolutionary constraints to preserve their functional domains. Such a result is logically expected but may not yet have been sufficiently demonstrated. In a previous study, we showed that the paired domain and homeodomain of Pax6 showed taxon-dependent differential degeneration similar to that of Zic in comparison to housekeeping genes (Aruga et al. 2007). Further molecular phylogenetic analyses considering the developmental context or protein function would reveal novel aspects of evolution.
However, we should note that Zic and Msx are not distributed identically in metazoans. Zic genes are absent in demosponges and are present in animals belonging to phylum Placozoa and phylum Platyhelminthes class Cestoda. In contrast, Msx genes are retained in demosponges and are not detected in Placozoa and Cestoda. ParaHox genes were shown to be lost in Cestoda species, presumably due to adaptations to parasitism (Tsai et al. 2013). This result is in contrast with the preservation of Msx genes in parasitic nematodes (Nam, Tcan, Asu, Tzi, Ttri, and Tps) and in a free-living, highly simplified orthonectid (Ili). These results are thought to reflect differential evolutionary constraints on Zic and Msx genes.
Significance of CDs in Zic Protein Evolution
This study provided several novel ideas and facts about Zic protein evolution (fig. 12). Identification of poriferan Zic genes suggests the presence of an ancestral Zic gene in the metazoan ancestor. Novel CDs were present in the Zic proteins of the bilaterian ancestor and were lost in a taxon-selective manner during bilaterian evolution. Transcriptional activation and Msx binding are phylogenetically conserved functions of Zic proteins. Some Zic CDs and Msx HDs share similar degeneration profiles during evolution. Beyond the evolutionary history of Zic proteins, some results also bear on protein domain evolution in general.
First, the discovery of novel CDs became feasible by selection of slowly evolving genes that preserve the ancestral traits. This idea was based on the awareness that the extent of CD degeneration in some toolkit proteins such as Zic, Pax, and Msx varies strongly among the animal taxa (phyla or classes) (Aruga et al. 2006, 2007; Takahashi et al. 2008). Because sufficient numbers of conserved sequences were identified in the Lophotrochozoa, Ecdysozoa, and Deuterostomia, the reported CDs may properly represent those in the bilaterian ancestor. By adding these seven CDs (CD0–CD6) to the two known ones (ZOC and the ZFNC + ZFD + ZFCC cluster), we can predict that at least nine CDs existed in the bilaterian ancestor.
Protein domains can be described as compact, spatially distinct units, which can be defined from both functional and structural viewpoints (Miklos and Campbell 1992). The
[Figure 10 image: immunoblot panels A (FLAG-Mmu-Msx1) and B (FLAG-Nve-Msx). Input and IP: FLAG samples, immunoblotted (IB) with FLAG and MYC. MYC-Zic lanes: Mmu-2, empty, Nve-A, Ttu, Pim, Dme, Cel, Ppe, Mmu-2.]
FIG. 10. Binding abilities to Msx are phylogenetically conserved among metazoan Zic proteins. FLAG-tagged mouse Msx1 (A) or sea anemone Msx (B) expression plasmids were cotransfected with MYC-tagged Zic protein expression plasmids in 293T cells. The cell lysates were immunoprecipitated (IP) with anti-FLAG epitope antibody and immunoblotted (IB) with anti-MYC epitope antibody.
“CDs” in this study are based on sequence similarity and are considered from evolutionary perspectives. Among the nine Zic CDs, the only CD found in other proteins upon searching protein CD databases (NCBI-CDD, Pfam, Prosite, SMART) is the C2H2 ZF in ZFD, suggesting that CDs other than ZFD are distributed only in the Zic family. Furthermore, there are no traces of CD duplication or shuffling in the collected metazoan Zic sequences. Together with the strong correlation among the CDs (fig. 6), this suggests that Zic CDs are coordinated to exert Zic protein function. In other words, the protein structure of the urbilaterian Zic was self-contained, and a radical change in protein structure may not have been allowed in the course of bilaterian evolution.
Even though the domains defined by structure (sequence) and by function are not identical, there might be a correlation between protein function and the extent of CD degeneration. Based on this idea, we performed a functional assay of transcriptional activation using both conserved- and diverged-type Zic expression vectors. However, the analysis demonstrated that the sequence-dependent transcription regulatory functions of the Zic proteins are not clearly correlated with CD maintenance and that interactions between Zic and transcription regulatory proteins are mostly conserved (fig. 8). These results suggest that the basic regulatory activity of Zic proteins is not simply predicted by the presence or absence of CDs.
Finally, while whole genome sequencing of key species in evolution readily improves our understanding of the phylogenetic relationships among animals and provides us with an indispensable framework, the key-gene, many-species approach taken in this study can elucidate different aspects of evolution, such as function-oriented protein domain evolution. A combination of these two approaches would be beneficial for a better understanding of the evolutionary process.
[Figure 11 image: panel A, correlation table; panel B, scatter plot with axes Msx-HD and Zic-NCZFCC, points labeled by species abbreviation and grouped as Cnidaria, Deuterostomia, Ecdysozoa, Lophotrochozoa, Placozoa, and Porifera; panel C, scatter plots with axes -18S and Zic-NCZFCC (left) and -18S and Msx-HD (right).]
FIG. 11. Correlation of evolutionary conservation. (A) Correlation between Zic CDs and Msx HD or 18S rRNA. About 39 animal species were used for the analysis. Strong correlation (r > 0.7) is shaded. (B and C) Scatter plots showing the relationship between the conservation extents of (B) Msx HD and Zic NCZFCC (ZFNC-ZFD-ZFCC) and (C) 18S rRNA and Zic NCZFCC (left) and 18S rRNA and Msx HD (right). Values are BLAST scores (Zic and Msx) or the additive inverse of the evolutionary distance score (18S rRNA). Labels in (B) are the abbreviations in table 1.
Materials and Methods
Animals
Brachionus plicatilis were purchased from Nikkai Center Co. (Tokyo, Japan). Echinorhynchus gadi were collected from an Alaska pollock captured off Hokkaido and purchased from a local fish dealer. Spinochordodes tellinii were collected from a wild mantis (Acromantis sp.) captured at Fukaya City in Saitama Prefecture, Japan. Species validation was done using 18S ribosomal RNA sequences.
PCR Cloning of Zic cDNA
RNA was isolated using TRIzol reagent (Invitrogen, CA) as per the manufacturer’s recommendation. cDNA was synthesized using a 3′-Full RACE Core Set (Takara Bio, Shiga, Japan). Zic homologs were initially identified by nested PCR on cDNA or genomic templates using degenerate primers corresponding to the ZF2 to ZF3 region (Aruga et al. 2006). cDNAs corresponding to ZF at their 3′ and 5′ ends were cloned using a 3′-Full RACE Core Set and 5′-Full RACE Core Set (Takara Bio), respectively. The entire open reading frame region of the cDNAs was again cloned using primers located outside the target regions. Amino acid sequences were acquired after nucleotide sequencing of multiple PCR fragments.
Database Search
A keyword- or homology-based search was conducted against the NCBI database (https://www.ncbi.nlm.nih.gov/; 2018.3.21), ENSEMBL database (http://www.ensembl.org/index.html; 2018.3.21), Compagen (http://www.compagen.org/index.html; 2018.3.16), and SILVA rRNA database (https://www.arb-silva.de/; 2018.3.11). For the database query, we used the sequence information from previous studies (True and Carroll 2002; Aruga et al. 2006, 2007; Takahashi et al. 2008; Hatayama and Aruga 2010; Layden et al. 2010). A TBlastN search was done for whole genome shotgun contigs (wgs) or transcript shotgun assembly (TSA) of target organisms with the following key sequences, and the validity of the target was checked by reciprocal BLAST. Ttu-Zic and Cfl-Msx were used to identify the hypothetical Hrob-Zic and Gastropoda Msx (Bgl and Lgi), respectively. Hypothetical Cte-Zic and Tad-Zic sequences were obtained with a BLAST search against ADN43078 and XP_002108473, respectively. The criteria for their identification as members of Zic and Msx families were described previously (True and Carroll 2002; Aruga et al. 2006; Takahashi et al. 2008). The newly defined sequences were deposited at DDBJ/NCBI/EMBL databases with accession numbers shown in table 1.
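The reciprocal BLAST check mentioned above can be illustrated with a minimal sketch. It assumes the common "reciprocal best hit" reading of the check, which the text does not spell out, and the hit tables and contig names below are hypothetical placeholders rather than search output from this study.

```python
def reciprocal_best_hits(forward_hits, reverse_hits):
    """Return (query, target) pairs in which each sequence is the other's best hit.

    forward_hits: key sequence -> best hit found in the target organism.
    reverse_hits: that hit -> best hit found when searched back against
                  the reference sequence set.
    """
    confirmed = []
    for query, target in forward_hits.items():
        if reverse_hits.get(target) == query:
            confirmed.append((query, target))
    return confirmed


# Hypothetical hit tables (placeholder identifiers, not real results).
forward = {"Ttu-Zic": "contig_17", "Cfl-Msx": "contig_42"}
reverse = {"contig_17": "Ttu-Zic", "contig_42": "Cfl-Dlx"}

print(reciprocal_best_hits(forward, reverse))
```

In this toy case only the first pair would be retained, because the second candidate points back to a different reference sequence.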
We omitted sequences that were thought to be immaturely curated; for instance, the Schistosoma mansoni (short form) and Hymenolepis microstoma sequences were edited to obtain the entire ORF by adjoining the predicted Exon1 and Exon2 sequences manually.
The presence or absence of introns in ZFD, and their positions where present, were examined using the ENSEMBL database or by aligning genomic and mRNA sequences.
Molecular Phylogenetic Analysis
The AA and nucleotide sequences were aligned using MUSCLE (Edgar 2004), MSAPROBS (Liu et al. 2010), and MAFFT (Katoh et al. 2017). Some of the aligned sequences were corrected by visual inspection.
To define the Zic CDs except CD0 in the bilaterian ancestor, we first selected and aligned Zic sequences from Deuterostomia (Bfl, Sko, Sca, Apl, Ppe, and Spu), Lophotrochozoa (Cte, Lan, Ttra, Cfl, Sso, Hbl, Obi, and Lgi), and Ecdysozoa (Pca, Lpo_1, Pim, Pte, Afr, Haz, and Eka) with a cnidarian sequence (Nve_A) as an outgroup. The ancestral sequences were calculated using the Maximum Likelihood method under a JTT matrix-based model (Jones et al. 1992) and a defined tree as follows: ((Bfl,(((Apl, Ppe), Spu),(Sca, Sko))),((Cte,(Lan, Ttr),(Lga,(Hbl, Obi),(Sso,Cfl))),(Pca,(Eka,((Pim, Lpo_1, Pte),(Afr, Haz))))), Nve_A).
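Because the guide tree is given as a bracketed (Newick-style) string, its leaf labels can be checked mechanically against the taxon list. The following is a minimal sketch of such a check; the tree string and abbreviations are copied verbatim from the text, while the helper names are illustrative.

```python
import re

# Guide tree and taxon list copied verbatim from the paragraph above.
TREE = ("((Bfl,(((Apl, Ppe), Spu),(Sca, Sko))),"
        "((Cte,(Lan, Ttr),(Lga,(Hbl, Obi),(Sso,Cfl))),"
        "(Pca,(Eka,((Pim, Lpo_1, Pte),(Afr, Haz))))), Nve_A)")

LISTED = {
    "Deuterostomia": ["Bfl", "Sko", "Sca", "Apl", "Ppe", "Spu"],
    "Lophotrochozoa": ["Cte", "Lan", "Ttra", "Cfl", "Sso", "Hbl", "Obi", "Lgi"],
    "Ecdysozoa": ["Pca", "Lpo_1", "Pim", "Pte", "Afr", "Haz", "Eka"],
    "outgroup": ["Nve_A"],
}


def newick_leaves(newick):
    """Extract leaf labels (letters, digits, underscore) in order of appearance."""
    return re.findall(r"[A-Za-z0-9_]+", newick)


def parentheses_balanced(newick):
    """Check that every opening parenthesis has a matching close."""
    depth = 0
    for ch in newick:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


leaves = newick_leaves(TREE)
listed = {name for group in LISTED.values() for name in group}
only_in_tree = sorted(set(leaves) - listed)
only_in_list = sorted(listed - set(leaves))
print(len(leaves), parentheses_balanced(TREE), only_in_tree, only_in_list)
```

As printed, the tree uses the labels Ttr and Lga where the taxon list uses Ttra and Lgi, so a check of this kind flags those two pairs for reconciliation before the tree is passed to an ancestral reconstruction program.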
[Figure 12 image: schematic of Msx (EH1, HD; NK homeobox family) and Zic (CDs labeled 1, Z, 2, 3, 4, 5, N, ZF, C, 6; Gli-Glis-Zic ZF family) shown for the metazoan ancestor, the bilaterian ancestor, and existing animals.]
FIG. 12. Coevolution of Zic and Msx. Bottom, Zic family and Msx family existed in the metazoan ancestor (fig. 1). Middle, Bilaterian ancestor contained Zic and Msx CDs that are differentially conserved in existing animals (fig. 5). EH1 (Engrailed homology 1 motif) binds the transcriptional corepressor Groucho. Phylogenetically conserved Zic CD (CD1-ZF)-Msx HD interaction (fig. 10) suggests the presence of this interaction in the bilaterian ancestor (gray arrows). Top, Curved lines connecting Zic CDs indicate the correlations of the conserved extent among Zic CDs (fig. 6). Lines connecting Msx and Zic CDs indicate the correlation between Zic CDs and Msx HD (fig. 11). Thick lines, r > 0.8; thin lines, r > 0.7. Evolution of Msx family is based on Takahashi et al. (2008).
All positions with <70% site coverage were eliminated (423 positions in the final data set). The CDs were defined as shown in supplementary figure S6, Supplementary Material online. Selection of CDs was done under the above criteria by excluding clusters with lower maximal probability. To define CD0, we carried out the same analysis using Bfl, Apl, Ppe, Sca, Sko, Lan, Pca, Lpo_1 sequences and a tree as follows: (Bfl, (((Apl, Ppe), Spu),(Sca, Sko)),(Lan,(Lpo_1, Pca))), and defined it as shown in supplementary figure S6, Supplementary Material online.
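The site coverage filter can be sketched as follows. This is an illustration of the rule "drop alignment columns in which fewer than 70% of sequences have a residue", not the implementation used by the phylogenetics software; treating "-" and "?" as missing is an assumption, and the toy alignment is hypothetical.

```python
def filter_by_site_coverage(alignment, min_coverage=0.70, missing="-?"):
    """Keep only columns where at least min_coverage of sequences have a residue.

    alignment: dict of name -> aligned sequence (all the same length).
    missing:   characters counted as gaps or missing data (assumed here).
    """
    names = list(alignment)
    rows = [alignment[name] for name in names]
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned sequences must have the same length")

    kept = []
    for i in range(length):
        covered = sum(1 for row in rows if row[i] not in missing)
        if covered / len(rows) >= min_coverage:
            kept.append(i)
    return {name: "".join(alignment[name][i] for i in kept) for name in names}


# Hypothetical four-sequence alignment: column 3 has 25% coverage and is dropped,
# column 2 has 75% coverage and is kept.
toy = {"seq1": "MK-LC", "seq2": "MR-LC", "seq3": "M--LC", "seq4": "MKALC"}
filtered = filter_by_site_coverage(toy)
print(filtered)
```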
A homology search against the local Zic sequence database was performed using the BLAST program (Altschul et al. 1990) implemented in the NCBI Genome Workbench (https://www.ncbi.nlm.nih.gov/tools/gbench/; 2018.3.18) with the following parameters: word size, 3; e-value, 10; and threshold, 11. The conservation of CDs was measured by the BLAST score based on a BLOSUM62 matrix (Henikoff and Henikoff 1992). We defined the presence of a CD if the target Zic AA sequence contained a homologous sequence element with a score >24. If multiple patterns of sequence alignments were given for a sequence in the homology search, the optimal alignment with the lowest E-value was retained and the others were omitted. The E-values of the omitted sequences were marginal, ranging from 1.4 to 7.6. To calculate the conservation extent as a percentage among the CD-containing sequences, the minimal and maximal scores were defined as rank values 1 and 100, respectively, and the remaining sequence scores were placed proportionally within this range. Sequences that were not shown by the CD homology search were defined as having a rank value of 0. The correlation analysis for conservation extent was performed using the percentages defined above. Because the rank values were discontinuous between 0 and 1, we calculated the coefficient r in Spearman’s rank-order correlation using the rank values to consider both presence–absence and conservation extent information.
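The rank-value scaling and the Spearman correlation described above can be sketched in a few lines. This is one reading of the procedure, with linear scaling of scores onto 1 to 100 and a rank value of 0 for sequences without a hit above the score cutoff; the function names and the example scores are hypothetical.

```python
def rank_values(scores, min_score=24.0):
    """Scale BLAST scores to rank values: min -> 1, max -> 100, no hit -> 0.

    scores: dict of name -> BLAST score, or None when no hit was reported.
    Scores not exceeding min_score are treated as absence of the CD.
    """
    present = [s for s in scores.values() if s is not None and s > min_score]
    if not present:
        return {name: 0.0 for name in scores}
    lo, hi = min(present), max(present)
    out = {}
    for name, s in scores.items():
        if s is None or s <= min_score:
            out[name] = 0.0
        elif hi == lo:
            out[name] = 100.0
        else:
            out[name] = 1.0 + 99.0 * (s - lo) / (hi - lo)
    return out


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's r as the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Hypothetical scores for two CDs across four species (None = no hit).
cd_a = {"sp1": 60.0, "sp2": 30.0, "sp3": None, "sp4": 45.0}
cd_b = {"sp1": 80.0, "sp2": 40.0, "sp3": None, "sp4": 50.0}
ra, rb = rank_values(cd_a), rank_values(cd_b)
species = sorted(cd_a)
r = spearman_r([ra[s] for s in species], [rb[s] for s in species])
print(ra, rb, r)
```

Using a rank-based coefficient means the jump between 0 (absent) and 1 (weakest detected hit) affects only the ordering, which is the stated reason for choosing Spearman's r here.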
Phylogenetic tree analysis was performed with MEGA7 (Kumar et al. 2016) and MrBayes 3.2 (Ronquist et al. 2012). The Maximum Likelihood-based tree was based on distance calculation using the JTT matrix (Jones et al. 1992) after removing all positions with <70% site coverage. In the Maximum Likelihood tree, tree reliability was estimated by a bootstrap test with 500 repetitions. In the Bayesian inference analysis we used an empirical model (WAG distances, Whelan and Goldman 2001) with gamma, alpha shape parameter, and AA frequencies estimated from the data. We ran 1,000,000 generations with one cold and three incrementally heated Markov chains, random starting trees for each chain, and trees sampled every 100 generations. We constructed a 50% majority-rule consensus tree from the last 2500 trees that were saved (burnin = 2500). The tree was edited using TreeGraph 2 (Stover and Muller 2010).
Reconstruction of the Msx sequence in the bilaterian ancestor was performed as described above for the ancestral Zic sequence construction. The resultant sequence was identical to that constructed in a previous study (Takahashi et al. 2008) except that the sequence was extended by including 17 AA of the N-terminal flanking region of HD. The conservation extent was defined as the BLOSUM62-based BLAST score between the ancestral sequence and the target sequence. The evolutionary distances of 18S RNA were calculated as the number of base substitutions per site between the ancestral 18S RNA sequence and the target species-derived ones. Analyses were performed using the Tamura–Nei model (Tamura and Nei 1993). Measurements were taken after removal of any alignment gap-containing sites, assuming different evolutionary rates among sites (gamma distribution, α = 0.4). The correlation between the conservation extents of (18S or Msx) and Zic was analyzed using the coefficient r in Spearman’s rank-order correlation. The analysis was done for species in which Zic, Msx, and 18S RNA sequences were all available, after removing any paralogue that showed lower conservation values than another paralogue of the same group. Therefore, the representative sequences were the most strongly conserved in each group.
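The selection of representative sequences described above (species with all three data types, then the most conserved paralogue per species) can be sketched as below. The data layout, function names, and all numeric values are hypothetical and serve only to show the selection logic.

```python
def most_conserved_paralogue(scores):
    """Return {species: (paralogue, score)}, keeping the highest score per species."""
    best = {}
    for species, paralogues in scores.items():
        if paralogues:
            name = max(paralogues, key=paralogues.get)
            best[species] = (name, paralogues[name])
    return best


def species_with_all(*tables):
    """Species present in every table (for example Zic, Msx, and 18S)."""
    shared = set(tables[0])
    for table in tables[1:]:
        shared &= set(table)
    return sorted(shared)


# Hypothetical conservation values; not taken from the study.
zic = {"Mmu": {"Mmu-1": 310.0, "Mmu-2": 295.0},
       "Nve": {"Nve-A": 280.0},
       "Dme": {"Dme": 250.0}}
msx = {"Mmu": {"Mmu-1": 120.0}, "Nve": {"Nve": 115.0}}
rna18s = {"Mmu": -0.12, "Nve": -0.20, "Dme": -0.31}

shared = species_with_all(zic, msx, rna18s)
zic_best = most_conserved_paralogue({sp: zic[sp] for sp in shared})
print(shared, zic_best)
```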
Plasmids and Mutagenesis
To construct expression vectors for Zic proteins (Nve-A/B/C/D/E, Ssu, Hbl, Pim, Afr, Dme, Cel, Ppe, Cin-a/b, Bfl, Mmu-1/2/3/4/5, Xla-1/2/3/4/5) and Msx proteins (Nve, Mmu-1), entire protein coding regions were first amplified by PCR and cloned into the pGEM-T Easy vector (Promega). For Nve, BAC clones (Nve-A, CH314-49A19; Nve-B/C/D/E, CH314-55K22) or cDNA (Nve-Msx) served as templates. After verification by sequencing, the correct ORFs were excised from the plasmid using NotI or EcoRI and inserted into modified pcDNA3.1 vectors (Invitrogen), in which a Myc or FLAG tag was introduced N-terminally followed by the initiation codon for methionine.
The HA-tagged Zic2 deletion series was described previously (Mizugishi et al. 2004). The pcDNA3-Flag-Msx2 vector was a gift from Dr. Ken Watanabe (Masuda et al. 2001). A series of truncated Msx2 (N+HD; 1–612 bp, N; 1–423 bp) was amplified by PCR and subcloned into the BamHI-SalI site of pCMV-tag2A (N-terminal Flag expression vector; Stratagene). To construct the GST fusion plasmid, the sequence of the Msx2 homeodomain (424–612 bp) was amplified by PCR and subcloned into pGEX4T3 vector (Promega). Mutations were introduced into mouse and frog Zic2 using the Takara in vitro mutagenesis kit (Takara).
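The nucleotide coordinates of the Msx2 constructs can be translated into codon positions with simple arithmetic. The sketch below assumes, as is not stated in the text, that bp 1 is the first base of the start codon, so that every block of three bases corresponds to one residue.

```python
def bp_range_to_codons(start_bp, end_bp):
    """Convert a 1-based inclusive bp range to a 1-based inclusive codon range.

    Assumes bp 1 is the first base of the coding sequence (an assumption
    made here for illustration).
    """
    length = end_bp - start_bp + 1
    if (start_bp - 1) % 3 != 0 or length % 3 != 0:
        raise ValueError("range is not codon-aligned")
    return (start_bp - 1) // 3 + 1, end_bp // 3


constructs = {"N+HD": (1, 612), "N": (1, 423), "HD (GST fusion)": (424, 612)}
codon_ranges = {name: bp_range_to_codons(*rng) for name, rng in constructs.items()}
for name, (first, last) in codon_ranges.items():
    print(name, first, last, last - first + 1)
```

Under that assumption the N fragment spans codons 1 to 141, and the homeodomain fragment spans codons 142 to 204 (63 residues), which together make up the N+HD construct.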
Luciferase Reporter Assay
NIH3T3 cells in 24-well dishes were transfected with pTgif-luc or pTgifDZBS (200 ng) (Ishiguro et al. 2017), pcDNA3.1-FLAG or pcDNA3.1-FLAG-Zic (200 ng), and pEF-Renilla luciferase (4 ng) using TransIT-LT1 (Mirus Bio). Luciferase activity was measured using a Dual Luciferase Assay System (Promega) and a Minilumat LB 9506 luminometer (Berthold).
Immunoprecipitation and GST-Pull Down Assay
For the Zic-Msx binding assay (figs. 9 and 10), 293T cells were cotransfected with appropriate expression vectors using Effectene (Qiagen). At 24 h after transfection, the cells were lysed in an immunoprecipitation buffer (25 mM Hepes, pH 7.2, 0.5% NP-40, 150 mM NaCl, 50 mM NaF, 2 mM Na3VO4, 1 mM PMSF, 20 μg/ml aprotinin) at 4°C. Immunoprecipitation was performed using an anti-FLAG
M2 monoclonal antibody (Sigma). The bound material was detected by immunoblotting with an anti-HA polyclonal antibody H-6908 (Sigma). For the Zic2-DNAPK/RHA complexes binding assay, 293T cells were transfected with the appropriate expression vectors using Lipofectamine 2000 (Invitrogen). Transfected cells were washed and harvested in PBS(-) containing 1 mM PMSF, and total cell extracts were prepared with lysis wash buffer 150 (20 mM HEPES-KOH [pH 7.8], 10% glycerol, 150 mM NaCl, 0.5 mM DTT, 0.1 mM EDTA, 0.5% Nonidet P-40, and 1 mM PMSF). The extracts were incubated with anti-HA or anti-FLAG affinity beads at 4°C for 6 h. Immunoblotting was carried out as described previously (Ishiguro et al. 2007).
GST fusion proteins were expressed in Escherichia coli and affinity-purified with Glutathione Sepharose 4B (Pharmacia). For the GST pull-down assay, GST fusion proteins were incubated for 2 h at 4°C with protein extracts from 293T cells or purified Zic2 protein in the immunoprecipitation buffer. After washing five times, the bound proteins were separated by SDS–PAGE, and then immunoblotted with an anti-FLAG antibody.
Immunostaining and In Situ Hybridization
Immunostaining for mouse embryo sections was carried out as described previously (Aruga et al. 2002). Whole mount staining of the mouse embryo was carried out as described previously (Nagai et al. 1997).
Data Availability
The sequences newly defined in this study were deposited at the DDBJ/GenBank/EMBL database under the following accession numbers (Bpl-Zic, LC328942; Ega-Zic, LC328943; Ste-Zic, LC328944; Hrob-Zic, BR001475; Cte-Zic, BR001476; Ttri-Zic, BR001480; Tad-Zic, BR001479; Lgi-Msx, BR001477; Bgl-Msx, BR001478; Ava-Zic1, BR001481; Ava-Zic2, BR001482; Smar-Zic1, BR001483; Smar-Zic2, BR001484; Ava-Msx, BR001485; and Smar-Msx, BR001486).
Supplementary Material
Supplementary data are available at Molecular Biology and Evolution online.
Acknowledgments
We thank Akiko Shimazaki and Yayoi Nozaki for technical assistance in molecular biology experiments, Yoshifumi Matsumoto for comments on the article, and anonymous reviewers for valuable advice including that about poriferan Zic sequences. This work was supported by RIKEN BSI funds and MEXT grants (grant numbers 16390086, 15K15019).
References
Alappat S, Zhang ZY, Chen YP. 2003. Msx homeobox gene family and craniofacial development. Cell Res. 13(6):429–442.
Alper S, Kenyon C. 2002. The zinc finger protein REF-2 functions with the Hox genes to inhibit cell fusion in the ventral epidermis of C. elegans. Development 129(14):3335–3348.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol. 215(3):403–410.
Arendt D, Nubler-Jung K. 1999. Comparison of early nerve cord development in insects and vertebrates. Development 126(11):2309–2325.
Aruga J. 2004. The role of Zic genes in neural development. Mol Cell Neurosci. 26(2):205–221.
Aruga J, Hatayama M. 2018. Comparative genomics of the Zic family genes. Adv Exp Med Biol. 1046:3–26.
Aruga J, Kamiya A, Takahashi H, Fujimi TJ, Shimizu Y, Ohkawa K, Yazawa S, Umesono Y, Noguchi H, Shimizu T, et al. 2006. A wide-range phylogenetic analysis of Zic proteins: implications for correlations between protein structure conservation and body plan complexity. Genomics 87(6):783–792.
Aruga J, Odaka YS, Kamiya A, Furuya H. 2007. Dicyema Pax6 and Zic: tool-kit genes in a highly simplified bilaterian. BMC Evol Biol. 7:201.
Aruga J, Tohmonda T, Homma S, Mikoshiba K. 2002. Zic1 promotes the expansion of dorsal neural progenitors in spinal cord by inhibiting neuronal differentiation. Dev Biol. 244(2):329–341.
Bertrand V, Hobert O. 2009. Linking asymmetric cell division to the terminal differentiation program of postmitotic neurons in C. elegans. Dev Cell 16(4):563–575.
Brusca RC, Moore W, Shuster SM. 2016. Invertebrates. Sunderland, MA: Sinauer Associates.
Cannon JT, Vellutini BC, Smith J, 3rd, Ronquist F, Jondelius U, Hejnol A. 2016. Xenacoelomorpha is the sister group to Nephrozoa. Nature 530(7588):89–93.
Cerny R, Cattell M, Sauka-Spengler T, Bronner-Fraser M, Yu F, Medeiros DM. 2010. Evidence for the prepattern/cooption model of vertebrate jaw evolution. Proc Natl Acad Sci U S A. 107(40):17262–17267.
Dohrmann M, Worheide G. 2013. Novel scenarios of early animal evolution–is it time to rewrite textbooks? Integr Comp Biol. 53(3):503–511.
Edgar RC. 2004. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5:113.
Emig CC. 2008. On the history of the names Lingula, anatina, and on the confusion of the forms assigned them among the Brachiopoda. Carnets de Géologie/Notebooks on Geology CG2008_A08. Carnets, France.
Feuda R, Dohrmann M, Pett W, Philippe H, Rota-Stabelli O, Lartillot N, Worheide G, Pisani D. 2017. Improved modeling of compositional heterogeneity supports sponges as sister to all other animals. Curr Biol. 27(24):3864–3870 e3864.
Flot JF, Hespeels B, Li X, Noel B, Arkhipova I, Danchin EG, Hejnol A, Henrissat B, Koszul R, Aury JM, et al. 2013. Genomic evidence for ameiotic evolution in the bdelloid rotifer Adineta vaga. Nature 500(7463):453–457.
Fujimi TJ, Mikoshiba K, Aruga J. 2006. Xenopus Zic4: conservation and diversification of expression profiles and protein function among the Xenopus Zic family. Dev Dyn. 235(12):3379–3386.
Garey JR, Near TJ, Nonnemacher MR, Nadler SA. 1996. Molecular evidence for Acanthocephala as a subtaxon of Rotifera. J Mol Evol. 43(3):287–292.
Hatayama M, Aruga J. 2010. Characterization of the tandem CWCH2 sequence motif: a hallmark of inter-zinc finger interactions. BMC Evol Biol. 10:53.
Hatayama M, Aruga J. 2018. Role of Zic family proteins in transcriptional regulation and chromatin remodeling. Adv Exp Med Biol. 1046:353–380.
Hatayama M, Tomizawa T, Sakai-Kato K, Bouvagnet P, Kose S, Imamoto N, Yokoyama S, Utsunomiya-Tate N, Mikoshiba K, Kigawa T, et al. 2008. Functional and structural basis of the nuclear localization signal in the ZIC3 zinc finger domain. Hum Mol Genet. 17(22):3459–3473.
Hemmrich G, Bosch TC. 2008. Compagen, a comparative genomics platform for early branching metazoan animals, reveals early origins of genes regulating stem-cell differentiation. Bioessays 30(10):1010–1018.
Hemmrich G, Khalturin K, Boehm AM, Puchert M, Anton-Erxleben F, Wittlieb J, Klostermeier UC, Rosenstiel P, Oberg HH, Domazet-Loso T, et al. 2012. Molecular signatures of the three stem cell lineages in
hydra and the emergence of stem cell function at the base of multicellularity. Mol Biol Evol. 29(11):3267–3280.
Henikoff S, Henikoff JG. 1992. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A. 89(22):10915–10919.
Herrera E, Brown L, Aruga J, Rachel RA, Dolen G, Mikoshiba K, Brown S, Mason CA. 2003. Zic2 patterns binocular vision by specifying the uncrossed retinal projection. Cell 114(5):545–557.
Houtmeyers R, Souopgui J, Tejpar S, Arkell R. 2013. The ZIC gene family encodes multi-functional proteins essential for patterning and morphogenesis. Cell Mol Life Sci. 70(20):3791–3811.
Inoue T, Hatayama M, Tohmonda T, Itohara S, Aruga J, Mikoshiba K. 2004. Mouse Zic5 deficiency results in neural tube defects and hypoplasia of cephalic neural crest derivatives. Dev Biol. 270(1):146–162.
Ishiguro A, Hatayama M, Otsuka MI, Aruga J. 2017. Link between the causative genes of holoprosencephaly, Zic2 directly regulates Tgif1 expression. Sci Rep. 8(1):2140.
Ishiguro A, Ideta M, Mikoshiba K, Chen DJ, Aruga J. 2007. ZIC2-dependent transcriptional regulation is mediated by DNA-dependent protein kinase, poly(ADP-ribose) polymerase, and RNA helicase A. J Biol Chem. 282(13):9983–9995.
Isshiki T, Takeichi M, Nose A. 1997. The role of the msh homeobox gene during Drosophila neurogenesis: implication for the dorsoventral specification of the neuroectoderm. Development 124(16):3099–3109.
Jones DT, Taylor WR, Thornton JM. 1992. The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci. 8(3):275–282.
Katoh K, Rozewicki J, Yamada KD. 2017. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief Bioinform. doi: 10.1093/bib/bbx108.
Kumar S, Stecher G, Tamura K. 2016. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 33(7):1870–1874.
Kuo JS, Patel M, Gamse J, Merzdorf C, Liu X, Apekin V, Sive H. 1998. Opl: a zinc finger protein that regulates neural determination and patterning in Xenopus. Development 125(15):2867–2882.
Layden MJ, Meyer NP, Pang K, Seaver EC, Martindale MQ. 2010. Expression and phylogenetic analysis of the zic gene family in the evolution and development of metazoans. Evodevo 1(1):12.
Li Y, Zhao D, Horie T, Chen G, Bao H, Chen S, Liu W, Horie R, Liang T, Dong B, et al. 2017. Conserved gene regulatory module specifies lateral neural borders across bilaterians. Proc Natl Acad Sci U S A. 114(31):E6352–E6360.
Lindgens D, Holstein TW, Technau U. 2004. Hyzic, the Hydra homolog of the zic/odd-paired gene, is involved in the early specification of the sensory nematocytes. Development 131(1):191–201.
Liu Y, Schmidt B, Maskell DL. 2010. MSAProbs: multiple sequence alignment based on pair hidden Markov models and partition function posterior probabilities. Bioinformatics 26(16):1958–1964.
Mallatt J, Craig CW, Yoder MJ. 2010. Nearly complete rRNA genes assembled from across the metazoan animals: effects of more taxa, a structure-based alignment, and paired-sites evolutionary models on phylogeny reconstruction. Mol Phylogenet Evol. 55(1):1–17.
Masuda Y, Sasaki A, Shibuya H, Ueno N, Ikeda K, Watanabe K. 2001. Dlxin-1, a novel protein that binds Dlx5 and regulates its transcriptional function. J Biol Chem. 276(7):5331–5338.
Meyerowitz EM. 1999. Plants, animals and the logic of development. Trends Cell Biol. 9(12):M65–M68.
Mikhailov KV, Slyusarev GS, Nikitin MA, Logacheva MD, Penin AA, Aleoshin VV, Panchin YV. 2016. The genome of Intoshia linei affirms orthonectids as highly simplified spiralians. Curr Biol. 26(13):1768–1774.
Miklos GL, Campbell HD. 1992. The evolution of protein domains and the organizational complexities of metazoans. Curr Opin Genet Dev. 2(6):902–906.
Mizugishi K, Aruga J, Nakata K, Mikoshiba K. 2001. Molecular properties of Zic proteins as transcriptional regulators and their relationship to GLI proteins. J Biol Chem. 276(3):2180–2188.
Mizugishi K, Hatayama M, Tohmonda T, Ogawa M, Inoue T, Mikoshiba K, Aruga J. 2004. Myogenic repressor I-mfa interferes with the function of Zic family proteins. Biochem Biophys Res Commun. 320(1):233–240.
Moriyama Y, Kawanishi T, Nakamura R, Tsukahara T, Sumiyama K, Suster ML, Kawakami K, Toyoda A, Fujiyama A, Yasuoka Y, et al. 2012. The medaka zic1/zic4 mutant provides molecular insights into teleost caudal fin evolution. Curr Biol. 22(7):601–607.
Moroz LL, Kocot KM, Citarella MR, Dosung S, Norekian TP, Povolotskaya IS, Grigorenko AP, Dailey C, Berezikov E, Buckley KM, et al. 2014. The ctenophore genome and the evolutionary origins of neural systems. Nature 510(7503):109–114.
Nagai T, Aruga J, Minowa O, Sugimoto T, Ohno Y, Noda T, Mikoshiba K. 2000. Zic2 regulates the kinetics of neurulation. Proc Natl Acad Sci U S A. 97(4):1618–1623.
Nagai T, Aruga J, Takada S, Gunther T, Sporle R, Schughart K, Mikoshiba K. 1997. The expression of the mouse Zic1, Zic2, and Zic3 gene suggests an essential role for Zic genes in body pattern formation. Dev Biol. 182(2):299–313.
Nosenko T, Schreiber F, Adamska M, Adamski M, Eitel M, Hammel J, Maldonado M, Muller WE, Nickel M, Schierwater B, et al. 2013. Deep metazoan phylogeny: when different genes tell different stories. Mol Phylogenet Evol. 67(1):223–233.
Pick KS, Philippe H, Schreiber F, Erpenbeck D, Jackson DJ, Wrede P, Wiens M, Alie A, Morgenstern B, Manuel M, et al. 2010. Improved phylogenomic taxon sampling noticeably affects nonbilaterian relationships. Mol Biol Evol. 27(9):1983–1987.
Quinn ME, Haaning A, Ware SM. 2012. Preaxial polydactyly caused by Gli3 haploinsufficiency is rescued by Zic3 loss of function in mice. Hum Mol Genet. 21(8):1888–1896.
Ramos C, Robert B. 2005. msh/Msx gene family in neural development. Trends Genet. 21(11):624–632.
Riesgo A, Farrar N, Windsor PJ, Giribet G, Leys SP. 2014. The analysis of eight transcriptomes from all poriferan classes reveals surprising genetic complexity in sponges. Mol Biol Evol. 31(5):1102–1120.
Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, Hohna S, Larget B, Liu L, Suchard MA, Huelsenbeck JP. 2012. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol. 61(3):539–542.
Ruppert EE, Fox RS, Barns RD. 2004. Invertebrate zoology. Belmont, CA: Thomson-Brooks/Cole.
Ryan JF, Pang K, Schnitzler CE, Nguyen A-D, Moreland RT, Simmons DK, Koch BJ, Francis WR, Havlak P, Smith SA, et al. 2013. The genome of the ctenophore Mnemiopsis leidyi and its implications for cell type evolution. Science 342(6164):1242592.
Satokata I, Ma L, Ohshima H, Bei M, Woo I, Nishizawa K, Maeda T, Takano Y, Uchiyama M, Heaney S, et al. 2000. Msx2 deficiency in mice causes pleiotropic defects in bone growth and ectodermal organ formation. Nat Genet. 24(4):391–395.
Satokata I, Maas R. 1994. Msx1 deficient mice exhibit cleft palate and abnormalities of craniofacial and tooth development. Nat Genet. 6(4):348–356.
Sawada K, Fukushima Y, Nishida H. 2005. Macho-1 functions as transcriptional activator for muscle formation in embryos of the ascidian Halocynthia roretzi. Gene Expr Patterns 5(3):429–437.
Schuler A, Bornberg-Bauer E. 2016. Evolution of protein domain repeats in metazoa. Mol Biol Evol. 33(12):3170–3182.
Sen A, Stultz BG, Lee H, Hursh DA. 2010. Odd paired transcriptional activation of decapentaplegic in the Drosophila eye/antennal disc is cell autonomous but indirect. Dev Biol. 343(1–2):167–177.
Shinzato C, Shoguchi E, Kawashima T, Hamada M, Hisata K, Tanaka M, Fujie M, Fujiwara M, Koyanagi R, Ikuta T, et al. 2011. Using the Acropora digitifera genome to understand coral responses to environmental change. Nature 476(7360):320–323.
Sielaff M, Schmidt H, Struck TH, Rosenkranz D, Mark Welch DB, Hankeln T, Herlyn H. 2016. Phylogeny of syndermata (syn. Rotifera): mitochondrial gene order verifies epizoic Seisonidea as sister to endoparasitic Acanthocephala within monophyletic Hemirotifera. Mol Phylogenet Evol. 96:79–92.
Simion P, Philippe H, Baurain D, Jager M, Richter DJ, Di Franco A, Roure B, Satoh N, Queinnec E, Ereskovsky A, et al. 2017. A large and consistent phylogenomic dataset supports sponges as the sister group to all other animals. Curr Biol. 27(7):958–967.
Simoes-Costa M, Bronner ME. 2015. Establishing neural crest identity: a gene regulatory recipe. Development 142(2):242–257.
Smith SA, Berkson J. 2005. Laboratory culture and maintenance of the horseshoe crab (Limulus polyphemus). Lab Anim (NY) 34(7):27–34.
Srivastava M, Begovic E, Chapman J, Putnam NH, Hellsten U, Kawashima T, Kuo A, Mitros T, Salamov A, Carpenter ML, et al. 2008. The Trichoplax genome and the nature of placozoans. Nature 454(7207):955–960.
Srivastava M, Simakov O, Chapman J, Fahey B, Gauthier ME, Mitros T, Richards GS, Conaco C, Dacre M, Hellsten U, et al. 2010. The Amphimedon queenslandica genome and the evolution of animal complexity. Nature 466(7307):720–726.
Stover BC, Muller KF. 2010. TreeGraph 2: combining and visualizing evidence from different phylogenetic analyses. BMC Bioinformatics 11(1):7.
Takahashi H, Kamiya A, Ishiguro A, Suzuki AC, Saitou N, Toyoda A, Aruga J. 2008. Conservation and diversification of Msx protein in metazoan evolution. Mol Biol Evol. 25(1):69–82.
Tamura K, Nei M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol. 10(3):512–526.
Telford MJ, Budd GE, Philippe H. 2015. Phylogenomic insights into animal evolution. Curr Biol. 25(19):R876–R887.
Telford MJ, Moroz LL, Halanych KM. 2016. Evolution: a sisterly dispute. Nature 529(7586):286–287.
Thanaraj TA, Clark F. 2001. Human GC-AG alternative intron isoforms with weak donor sites show enhanced consensus at acceptor exon positions. Nucleic Acids Res. 29(12):2581–2593.
True JR, Carroll SB. 2002. Gene co-option in physiological and morphological evolution. Annu Rev Cell Dev Biol. 18:53–80.
Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496(7443):57–63.
Twigg SR, Forecki J, Goos JA, Richardson IC, Hoogeboom AJ, van den Ouweland AM, Swagemakers SM, Lequin MH, Van Antwerp D, McGowan SJ, et al. 2015. Gain-of-function mutations in ZIC1 are associated with coronal craniosynostosis and learning disability. Am J Hum Genet. 97(3):378–388.
Vasquez-Doorman C, Petersen CP. 2014. zic-1 Expression in Planarian neoblasts after injury controls anterior pole regeneration. PLoS Genet. 10(7):e1004452.
Vogg MC, Owlarn S, Perez Rico YA, Xie J, Suzuki Y, Gentile L, Wu W, Bartscherer K. 2014. Stem cell-dependent formation of a functional anterior regeneration pole in planarians requires Zic and Forkhead transcription factors. Dev Biol. 390(2):136–148.
Whelan S, Goldman N. 2001. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol. 18(5):691–699.
Wilkie AO, Tang Z, Elanko N, Walsh S, Twigg SR, Hurst JA, Wall SA, Chrzanowska KH, Maxson RE Jr. 2000. Functional haploinsufficiency of the human homeobox gene MSX2 causes defects in skull ossification. Nat Genet. 24(4):387–390.
Yagi K, Satou Y, Satoh N. 2004. A zinc finger transcription factor, ZicL, is a direct activator of Brachyury in the notochord specification of Ciona intestinalis. Development 131(6):1279–1288.
Yankura KA, Martik ML, Jennings CK, Hinman VF. 2010. Uncoupling of complex regulatory patterning during evolution of larval development in echinoderms. BMC Biol. 8:143.