

A New Approach to Molecular Phylogeny of H5N1 Avian Influenza Viruses in Asia

HUILING WANG,1 YUSEN ZHANG2

1 School of Information and Engineering, Shandong University at Weihai, Weihai 264209, China
2 School of Mathematics and Statistics, Shandong University at Weihai, Weihai 264209, China

Received 31 March 2009; accepted 6 May 2009
Published online 13 October 2009 in Wiley InterScience (www.interscience.wiley.com).
DOI 10.1002/qua.22359

ABSTRACT: In this article, we introduce a new method for analyzing avian influenza virus (AIV) of subtype H5N1 and study the similarity of its sequences. We compare nucleic acid sequences of H5N1 AIV in Asia using 2D and 3D graphical representations. From these comparisons, we construct a phylogenetic tree and discuss the evolutionary relationships among these viruses. The sequence analysis shows clear traits that depend on area, period, and host. © 2009 Wiley Periodicals, Inc. Int J Quantum Chem 110: 1964–1971, 2010

Key words: H5N1 AIV; 3DD-curve; 2DD-curve; phylogenetic tree

1. Introduction

Avian influenza A virus is a negative-stranded RNA virus with eight genomic segments encoding the RNA polymerases (PB2, PB1, and PA), hemagglutinin (HA), nucleoprotein (NP), neuraminidase (NA), matrix protein (M), and nonstructural protein (NS). A total of 16 HA and nine NA subtypes have been reported [1]. All of these subtypes come from avian species [2]. New influenza viruses and genotypes emerge continuously through frequent evolutionary events, including genetic reassortment, recombination, and mutation. Furthermore, H5N1 is a highly pathogenic avian influenza virus (AIV), and its analysis is critical for preparing a strategy to prevent and control influenza epidemics and pandemics.

Introduction of an avian influenza A virus with a novel HA gene into a population that lacks immunity to this HA has the potential to cause a pandemic when the virus is able to spread efficiently among humans. During the 20th century, this happened three times, in 1918, 1957, and 1968, killing millions of people worldwide. In all three pandemics, the viruses originated from AIVs [3].

Correspondence to: Y. Zhang; e-mail: [email protected]
Contract grant sponsor: Shandong Natural Science Foundation.
Contract grant number: Y2006A14.

International Journal of Quantum Chemistry, Vol 110, 1964–1971 (2010)
© 2009 Wiley Periodicals, Inc.


Highly pathogenic AIV of subtype H5N1 can infect humans. Although there is no conclusive evidence that avian influenza can spread from person to person, its strong virulence and high variability pose a great threat to humanity [2]. H5N1 avian influenza is characterized by continuous antigenic variation, which is mainly caused by the HA and NA proteins; of these, the HA protein has the highest mutation rate. The HA protein plays a critical role in recognizing and binding the host cell receptor during infection, and it is the decisive factor in host specificity. We therefore choose HA, out of the eight gene segments of H5N1 subtype AIV, as the object of study.

In this article, we introduce a novel 3D graphical representation of H5N1 virus sequences, which allows visual observation of nucleotide composition, base pair patterns, and sequence evolution, and can provide important information beyond that of alignment-based methods [5]. This graphical representation provides useful insights into local and global characteristics and into the occurrences, variations, and repetitions of the nucleotides along an H5N1 virus sequence that are not as easily obtainable by other methods.

We compare the HA gene sequences of 95 different H5N1 avian and human influenza viruses from Asia, construct a phylogenetic tree, and discuss the phylogenetic relationships of these viruses.

2. Materials and Methods

Our dataset includes the HA genome segment of avian influenza H5N1 viruses isolated in Asian countries over the past 10 years. In addition, some viruses isolated from humans are included to assist our analysis. Table I lists the 95 H5N1 avian and human influenza viruses.

We now introduce our method in the following steps.

2.1. FORMAT OF 3DD-CURVE

In [4,5], the authors provided a 3D graphical representation of DNA sequences. Consider a DNA sequence with N bases read from the 5′- to the 3′-end. Inspect the sequence one base at a time, and let the number of steps be denoted by i, i.e., i = 1, 2, ..., N. In the ith step, count the cumulative numbers of the bases A, C, G, and T occurring in the subsequence from the first to the ith base of the DNA sequence; denote these four non-negative integers by $A_i$, $C_i$, $G_i$, and $T_i$, respectively. The 3DD-curve consists of a series of nodes $P_i$ (i = 1, 2, ..., N) whose coordinates are denoted by $x_i$, $y_i$, and $z_i$. The bases of DNA can be classified into the groups purine (A, G)/pyrimidine (C, T), amino (A, C)/keto (G, T), and weak H bond (A, T)/strong H bond (G, C). We use three 3DD-curves, corresponding to these three classifications, as follows:

1. The 3DD-curve of DNA sequences based on pattern GCT is

$$
\begin{cases}
x_i = \sqrt{u}\,A_i - \sqrt{v}\,G_i \\
y_i = \sqrt{u}\,A_i - \sqrt{v}\,C_i \\
z_i = \sqrt{u}\,A_i - \sqrt{v}\,T_i
\end{cases}
\tag{1}
$$

2. The 3DD-curve of DNA sequences based on pattern CGT is

$$
\begin{cases}
x_i = \sqrt{u}\,A_i - \sqrt{v}\,C_i \\
y_i = \sqrt{u}\,A_i - \sqrt{v}\,G_i \\
z_i = \sqrt{u}\,A_i - \sqrt{v}\,T_i
\end{cases}
\tag{2}
$$

3. The 3DD-curve of DNA sequences based on pattern TGC is

$$
\begin{cases}
x_i = \sqrt{u}\,A_i - \sqrt{v}\,G_i \\
y_i = \sqrt{u}\,A_i - \sqrt{v}\,T_i \\
z_i = \sqrt{u}\,A_i - \sqrt{v}\,C_i
\end{cases}
\tag{3}
$$

where u and v are different positive real numbers that are not perfect squares. We define $A_0 = C_0 = G_0 = T_0 = 0$ and thus $x_0 = y_0 = z_0 = 0$. In this way, a DNA sequence is reduced to a series of nodes $P_0, P_1, P_2, \ldots, P_N$ with coordinates $(x_i, y_i, z_i)$, i = 0, 1, 2, ..., N, where N is the length of the DNA sequence.
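The construction of the nodes $P_0, \ldots, P_N$ can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `cumulative_counts` and `curve_3dd` are hypothetical, the defaults u = 57 and v = 29 are the values quoted in Section 3, and the sign of the second term (parameter `sign`, default -1) is an assumption that should be checked against [4,5].

```python
import math


def cumulative_counts(seq):
    """Return [(A_i, C_i, G_i, T_i)] for i = 0..N; all counts are zero at i = 0."""
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    rows = [(0, 0, 0, 0)]
    for base in seq.upper():
        if base not in counts:
            raise ValueError(f"unexpected base: {base!r}")
        counts[base] += 1
        rows.append((counts["A"], counts["C"], counts["G"], counts["T"]))
    return rows


def curve_3dd(seq, u=57, v=29, pattern="GCT", sign=-1):
    """Nodes P_0..P_N of a 3DD-curve.

    pattern lists the bases paired with A on the x, y and z axes
    ("GCT", "CGT" or "TGC"). sign is the sign of the sqrt(v) term;
    the default -1 is an assumption, not taken from the source.
    """
    column = {"A": 0, "C": 1, "G": 2, "T": 3}
    su, sv = math.sqrt(u), math.sqrt(v)
    nodes = []
    for row in cumulative_counts(seq):
        a_i = row[column["A"]]
        # Each coordinate combines the count of A with the count of one other base.
        nodes.append(tuple(su * a_i + sign * sv * row[column[b]] for b in pattern))
    return nodes
```

Calling `curve_3dd(sequence, pattern="CGT")` gives the nodes of the second curve, and so on for the third pattern.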

2.2. 2DD-CURVE

In [6], the authors introduced a general nondegenerate 2D graphical representation of DNA sequences. That is, we construct a map between the bases of a DNA sequence and points in 2D space, which yields a 2D representation of the sequence. In 2D space, a point or vector has two components. The four bases are assigned the following elementary directions:

$(0, 1) \to A, \quad (-1, 0) \to G,$


TABLE I. H5N1 avian and human influenza viruses, their accessions, and lengths.

| Number | Species | Accession | Length (bp) |
|---|---|---|---|
| 1 | A/goose/Guangdong/1/96 | AF144305 | 1,760 |
| 2 | A/goose/Guangdong/3/1997 | AF364334 | 1,762 |
| 3 | A/Dk/HN/5806/2003 | AY651363 | 1,693 |
| 4 | A/Dk/HN/303/2004 | AY651364 | 1,666 |
| 5 | A/Dk/ST/4003/2003 | AY651367 | 1,697 |
| 6 | A/Dk/YN/6255/2003 | AY651369 | 1,693 |
| 7 | A/Ck/YN/374/2004 | AY651371 | 1,666 |
| 8 | A/chicken/Jilin/9/2004 | AY653200 | 1,779 |
| 9 | A/chicken/Hubei/327/2004 | AY684706 | 1,779 |
| 10 | A/chicken/Guangdong/191/04 | AY737289 | 1,776 |
| 11 | A/duck/Shantou/195/2001 | CY028924 | 1,680 |
| 12 | A/chicken/Shantou/904/2001 | CY028925 | 1,683 |
| 13 | A/chicken/Shantou/3744/2003 | CY028959 | 1,716 |
| 14 | A/duck/Hunan/533/2004 | CY028969 | 1,704 |
| 15 | A/duck/Hunan/114/05 | DQ095630 | 1,692 |
| 16 | A/duck/Hunan/344/2006 | DQ992791 | 1,647 |
| 17 | A/chicken/Yunnan/564/2003 | CY028976 | 1,688 |
| 18 | A/duck/Yunnan/4072/2003 | CY028982 | 1,713 |
| 19 | A/bar-headed goose/Qinghai/59/05 | DQ095612 | 1,701 |
| 20 | A/great black-headed gull/Qinghai/2/05 | DQ095614 | 1,653 |
| 21 | A/migratory duck/Jiangxi/1653/2005 | DQ320916 | 1,698 |
| 22 | A/chicken/Shantou/810/05 | DQ095626 | 1,693 |
| 23 | A/duck/Fujian/897/2005 | DQ320875 | 1,695 |
| 24 | A/chicken/Fujian/1042/2005 | DQ320876 | 1,698 |
| 25 | A/duck/Guangxi/3085/2005 | DQ992717 | 1,695 |
| 26 | A/goose/Guangxi/52/2006 | DQ992740 | 1,665 |
| 27 | A/chicken/Guiyang/29/2006 | DQ992763 | 1,674 |
| 28 | A/goose/Guiyang/337/2006 | DQ992765 | 1,668 |
| 29 | A/duck/Guiyang/497/2006 | DQ992767 | 1,695 |
| 30 | A/goose/Shantou/2086/2006 | DQ992781 | 1,671 |
| 31 | A/chicken/Hong Kong/220/97 | AF046080 | 1,741 |
| 32 | A/chicken/Hong Kong/728/97 | AF082034 | 1,726 |
| 33 | A/goose/Hong Kong/385.5/2000 | AF398418 | 1,704 |
| 34 | A/chicken/Hong Kong/FY77/01 | AF509016 | 1,650 |
| 35 | A/chicken/Hong Kong/822.1/01 | AF509026 | 1,706 |
| 36 | A/duck/Hong Kong/ww382/2000 | AY059477 | 1,704 |
| 37 | A/Gs/HK/739.2/02 | AY575871 | 1,689 |
| 38 | A/Dk/HK/821/02 | AY575874 | 1,686 |
| 39 | A/gray heron/Hong Kong/837/2004 | DQ320924 | 1,698 |
| 40 | A/chicken/Hong Kong/282/2006 | DQ992836 | 1,677 |
| 41 | A/robin/Hong Kong/366/2006 | DQ992837 | 1,500 |
| 42 | A/crested myna/Hong Kong/540/2006 | DQ992838 | 1,671 |
| 43 | A/common magpie/Hong Kong/645/2006 | DQ992839 | 1,662 |
| 44 | A/little egret/Hong Kong/718/2006 | DQ992840 | 1,692 |
| 45 | A/Japanese white-eye/Hong Kong/1038/2006 | DQ992842 | 1,692 |
| 46 | A/munia/Hong Kong/2454/2006 | DQ992845 | 1,692 |
| 47 | A/large-billed crow/Hong Kong/2512/2006 | DQ992847 | 1,695 |
| 48 | A/Hong Kong/213/03 | AB212054 | 1,779 |
| 49 | A/Hong Kong/156/97 | AF028709 | 1,741 |
| 50 | A/Hong Kong/483/1997 | AF046097 | 1,741 |

(continued)

$(\sqrt{m}, 0) \to C, \quad (0, -\sqrt{n}) \to T,$

where m and n are different positive integers that are not perfect squares. A sequence is thus reduced to a series of nodes $P_0, P_1, P_2, \ldots, P_N$ whose coordinates $(x_i, y_i)$, i = 0, 1, 2, ..., N, where N is the length of the sequence being studied, satisfy

$$
\begin{cases}
x_i = -G_i + \sqrt{m}\,C_i \\
y_i = A_i - \sqrt{n}\,T_i
\end{cases}
\tag{4}
$$

TABLE I. (Continued)

| Number | Species | Accession | Length (bp) |
|---|---|---|---|
| 51 | A/HongKong/97/98 | AF102676 | 1,656 |
| 52 | A/chicken/Vietnam/P41/05 | AM183672 | 1,776 |
| 53 | A/Ck/Vietnam/35/2004 | AY651338 | 1,697 |
| 54 | A/Dk/Vietnam/11/2004 | AY651344 | 1,696 |
| 55 | A/duck/Vietnam/12/2005 | CY016883 | 1,731 |
| 56 | A/duck/Vietnam/5/2007 | CY029583 | 1,698 |
| 57 | A/goose/Vietnam/113/2001 | EF541399 | 1,637 |
| 58 | A/Vietnam/1194/2004 | AY651333 | 1,696 |
| 59 | A/Vietnam/3062/2004 | AY651336 | 1,684 |
| 60 | A/Vietnam/CL115/2005 | DQ497727 | 1,689 |
| 61 | A/Vietnam/HN31242/2007 | EU294369 | 1,704 |
| 62 | A/quail/Thailand/KTHF/2004 | AY534913 | 308 |
| 63 | A/chicken/Thailand/Nakornsawan-01/2004 | AY552000 | 1,098 |
| 64 | A/duck/Thailand/Kamphaengphet-01/2004 | AY553797 | 1,093 |
| 65 | A/Ck/Thailand/1/2004 | AY651326 | 1,701 |
| 66 | A/Ck/Thailand/73/2004 | AY651327 | 1,697 |
| 67 | A/chicken/Thailand/ICRC-V143/2007 | EU233416 | 1,745 |
| 68 | A/chicken/Thailand/ICRC-V586/2008 | EU497919 | 1,756 |
| 69 | A/quail/Thailand/CU-330/06 | EU616851 | 1,715 |
| 70 | A/Thailand/2(SP-33)/2004 | AY555153 | 1,740 |
| 71 | A/Thailand/4(SP-528)/2004 | AY626143 | 1,723 |
| 72 | A/Thailand/NKNP/2005 | DQ885612 | 1,670 |
| 73 | A/Thailand/16/2004 | EF541408 | 1,741 |
| 74 | A/chicken/Indonesia/R134/03 | AM183669 | 1,482 |
| 75 | A/Ck/Indonesia/PA/2003 | AYU651320 | 1,696 |
| 76 | A/Dk/Indonesia/MS/2004 | AY651322 | 1,697 |
| 77 | A/chicken/Indonesia/CDC25/2005 | CY014185 | 1,659 |
| 78 | A/chicken/Indonesia/Belitung Timor1631-18/2006 | EU124201 | 1,739 |
| 79 | A/muscovy duck/Indonesia/Kedri1631-24/2006 | EU124206 | 1,736 |
| 80 | A/Indonesia/CDC194P/2005 | CY014177 | 1,659 |
| 81 | A/Indonesia/CDC594/2006 | CY014272 | 1,707 |
| 82 | A/Indonesia/CDC1046/2007 | CY019408 | 1,707 |
| 83 | A/chicken/Laos/7191/2004 | EF541413 | 1,741 |
| 84 | A/duck/Laos/3295/2006 | DQ845348 | 1,719 |
| 85 | A/chicken/Yamaguchi/7/2004 | AB166862 | 1,704 |
| 86 | A/duck/Yokohama/aq10/2003 | AB212280 | 1,707 |
| 87 | A/chicken/Korea/ES/03 | AY676035 | 1,704 |
| 88 | A/chicken/Korea/IS/2006 | EU233675 | 1,730 |
| 89 | A/chicken/Korea/IS2/2006 | EU233683 | 1,751 |
| 90 | A/chicken/Korea/CA7/2006 | EU233691 | 1,779 |
| 91 | A/chicken/Korea/IS3/2006 | EU233699 | 1,738 |
| 92 | A/duck/Korea/Asan5/2006 | EU233707 | 1,779 |
| 93 | A/duck/Korea/Asan6/2006 | EU233715 | 1,773 |
| 94 | A/quail/Korea/KJ4/2006 | EU233723 | 1,743 |
| 95 | A/chicken/Korea/es/2003 | EF541412 | 1,673 |


The definitions of $A_i$, $C_i$, $G_i$, and $T_i$ are the same as above. We call the corresponding set of points the characteristic plot set. The curve connecting all points of the characteristic plot set in order is called the 2DD-curve. Different parameters m and n give different visual cues for a sequence.
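A short Python sketch of the 2DD-curve may make the mapping concrete. It follows the direction assignment (0, 1) for A, (-1, 0) for G, ($\sqrt{m}$, 0) for C, and (0, $-\sqrt{n}$) for T; the function name `curve_2dd` and the defaults m = 2 and n = 3 are illustrative assumptions, since the values used for the figures are not stated here.

```python
import math


def curve_2dd(seq, m=2, n=3):
    """Nodes P_0..P_N of the 2DD-curve.

    x_i = -G_i + sqrt(m) * C_i and y_i = A_i - sqrt(n) * T_i,
    where A_i, C_i, G_i, T_i are cumulative base counts.
    The defaults m = 2 and n = 3 are assumed for illustration only.
    """
    sm, sn = math.sqrt(m), math.sqrt(n)
    a = c = g = t = 0
    nodes = [(0.0, 0.0)]  # P_0 is the origin
    for base in seq.upper():
        if base == "A":
            a += 1
        elif base == "C":
            c += 1
        elif base == "G":
            g += 1
        elif base == "T":
            t += 1
        else:
            raise ValueError(f"unexpected base: {base!r}")
        # Coordinates are computed from the counts rather than by summing
        # steps, which avoids accumulating floating-point error.
        nodes.append((-g + sm * c, a - sn * t))
    return nodes
```

Plotting the returned points in order with any plotting tool gives the curve; restricting `seq` to a window, such as the last 60 bases, gives an enlarged local view.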

Many visually important properties of a sequence are preserved in the 3DD-curve and 2DD-curve. The curves are useful for visualizing the local and global features of long or short DNA sequences and can facilitate the visual discovery of interesting features in a DNA sequence. These properties make it feasible to study H5N1 AIVs with our method. In the next subsection, we describe a particular graphical representation of gene sequences based on the 3D and 2D curves of the 95 HA gene sequences. The 2D curve is used to further understand the evolutionary trends of the H5N1 genome and the effects of point mutations on pathogenesis.

2.3. DISTANCE MATRICES

We now define the quotient matrix E/G. The (i, j) element of E/G is defined as $[E/G]_{ij} = [ED]_{ij} / |i - j|$ for $i \neq j$, where $[ED]_{ij}$ is the Euclidean distance between vertices i and j of a 3DD-curve:

$$
[ED]_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2}. \tag{5}
$$

We choose the leading eigenvalue of the quotient matrix E/G as the mathematical descriptor of a DNA sequence. The three different patterns give three leading eigenvalues, which form a three-component vector. We thus obtain a one-to-one correspondence between DNA sequences and the three-component vectors (u1, u2, u3) of their E/G matrices, so (u1, u2, u3) can characterize a DNA sequence. Comparison between sequences becomes comparison between these three-component vectors.
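The descriptor can be sketched as follows, assuming the curve nodes are given as a list of coordinate tuples. The names `eg_matrix` and `leading_eigenvalue` are hypothetical, the zero diagonal is an assumption (the definition only covers distinct i and j), and power iteration is used here simply because it needs nothing beyond the standard library; the text does not say how the eigenvalues were computed.

```python
import math


def eg_matrix(nodes):
    """Quotient matrix E/G: Euclidean distance between nodes i and j over |i - j|.

    Diagonal entries are set to zero, which is an assumption.
    """
    n = len(nodes)
    mat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            mat[i][j] = mat[j][i] = math.dist(nodes[i], nodes[j]) / (j - i)
    return mat


def leading_eigenvalue(mat, max_iter=10000, tol=1e-12):
    """Largest eigenvalue of a symmetric non-negative matrix by power iteration."""
    n = len(mat)
    if n == 0:
        return 0.0
    vec = [1.0 / math.sqrt(n)] * n  # unit-length starting vector
    value = 0.0
    for _ in range(max_iter):
        prod = [sum(m_ik * v_k for m_ik, v_k in zip(row, vec)) for row in mat]
        new_value = sum(p * v for p, v in zip(prod, vec))  # Rayleigh quotient
        norm = math.sqrt(sum(p * p for p in prod))
        if norm == 0.0:
            return 0.0
        vec = [p / norm for p in prod]
        converged = abs(new_value - value) <= tol * max(1.0, abs(new_value))
        value = new_value
        if converged:
            break
    return value
```

For sequences of about 1,700 bases the matrix has roughly three million entries, so a numerical linear algebra library would be the practical choice; the pure-Python version is only meant to show the steps.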

2.4. SIMILARITY ANALYSIS

Let $a_i = (a_{i1}, a_{i2}, a_{i3})$, i = 1, 2, ..., s, denote the three-component vectors of the E/G matrices obtained from the 3DD-curves of s DNA sequences. The analysis of similarity/dissimilarity among these DNA sequences is based on the assumption that two DNA sequences are similar if the corresponding three-component vectors in 3D space have similar magnitudes.

The similarity/dissimilarity matrix can be formulated as the symmetric matrix $M_e$ whose (i, j) element is defined as

$$
[M_e]_{ij} = \sqrt{(a_{i1} - a_{j1})^2 + (a_{i2} - a_{j2})^2 + (a_{i3} - a_{j3})^2}, \tag{6}
$$

where i, j = 1, 2, ..., s. The E/G matrices of different sequences have similar but not identical sizes, because the size depends on the length of the sequence. To avoid the influence of length, we use the regularized leading eigenvalues instead of the leading eigenvalues. Sequence similarity can then be characterized by the regularized leading eigenvalues, which do not depend on the length of the sequences.
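The similarity/dissimilarity matrix is a plain pairwise Euclidean distance between descriptor vectors, as the sketch below shows. The names `regularized` and `similarity_matrix` are hypothetical. The regularization is not defined in the text, so dividing the leading eigenvalue by the order of the matrix is an assumption made here for illustration.

```python
import math


def regularized(eigenvalue, size):
    """Leading eigenvalue divided by the matrix order.

    This normalization is an assumption; the text does not define it.
    """
    return eigenvalue / size


def similarity_matrix(vectors):
    """Matrix M_e of Euclidean distances between three-component descriptor vectors."""
    s = len(vectors)
    return [[math.dist(vectors[i], vectors[j]) for j in range(s)] for i in range(s)]
```

A smaller entry of the matrix means that the two sequences have closer descriptors and are therefore treated as more similar.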

3. Results and Discussion

In this section, we construct a phylogenetic tree with the neighbor-joining method by comparing the sequences listed in Table I, and determine their phylogenetic relationships. To make the phylogenetic tree clearer, we use u = 57 and v = 29 for the 3DD-curves of these DNA sequences and compute the similarity/dissimilarity matrix $M_e$ of the E/G matrices for these DNA sequences.
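For readers unfamiliar with neighbor joining, the following is a compact sketch of the textbook algorithm applied to a distance matrix such as $M_e$. It is not the software used for Figure 1, which is not named in the text; the function name `neighbor_joining` is hypothetical, and branch lengths are printed with limited precision.

```python
def neighbor_joining(names, dist):
    """Return an unrooted neighbor-joining tree as a Newick string.

    names: at least three distinct labels without Newick punctuation.
    dist:  symmetric matrix (list of lists) with a zero diagonal.
    """
    if len(names) < 3:
        raise ValueError("need at least three taxa")
    labels = list(names)
    # d[a][b] is the current distance between active clusters a and b.
    d = {a: {b: dist[i][j] for j, b in enumerate(labels)}
         for i, a in enumerate(labels)}
    while len(labels) > 3:
        n = len(labels)
        r = {a: sum(d[a][b] for b in labels) for a in labels}
        # Join the pair minimizing Q(a, b) = (n - 2) d(a, b) - r(a) - r(b).
        pairs = ((x, y) for i, x in enumerate(labels) for y in labels[i + 1:])
        a, b = min(pairs, key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        len_a = d[a][b] / 2 + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        node = f"({a}:{len_a:g},{b}:{len_b:g})"
        rest = [c for c in labels if c not in (a, b)]
        # Distances from the new internal node to every remaining cluster.
        d[node] = {c: (d[a][c] + d[b][c] - d[a][b]) / 2 for c in rest}
        d[node][node] = 0.0
        for c in rest:
            d[c][node] = d[node][c]
        labels = rest + [node]
    # The last three clusters meet at a single internal node.
    x, y, z = labels
    len_x = (d[x][y] + d[x][z] - d[y][z]) / 2
    len_y = (d[x][y] + d[y][z] - d[x][z]) / 2
    len_z = (d[x][z] + d[y][z] - d[x][y]) / 2
    return f"({x}:{len_x:g},{y}:{len_y:g},{z}:{len_z:g});"
```

Because strain names contain spaces and parentheses, simple labels such as the sequence numbers from Table I are the safer choice for `names`.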

The phylogenetic analysis of the 95 H5N1 influenza viruses is based on the 3D graphical representation. The viruses were chosen from different areas, periods, and hosts, so they are broadly representative. Below we analyze them by area, period, and host.

First, with regard to regional distribution, one of the most distinct clades is composed of viruses from Indonesia and Hong Kong in the Southeast Asia region. In addition, the viruses from Korea are also highly concentrated. The first clade contains the viruses from Indonesia, Hong Kong, and some areas of south China such as Guiyang and Hunan. The second clade contains the viruses from Korea, whereas the third clade is composed of the viruses from Yunnan, in southwest China, and its neighboring countries such as Vietnam, Thailand, and Laos.

From the phylogenetic tree shown in Figures 1(a) and 1(b), we find that the H5N1 viruses form branches with regional characteristics, which means that a human vaccine developed from only one branch of the virus is likely to be ineffective.

From the analysis of the H5N1 viruses from Asia, comparing the viruses from Guangdong and Hong Kong with those from the other regions of southeastern Asia, we found that almost all of these viruses are derived from the original H5N1 virus that appeared in Guangdong in 1996. Since then, viruses from different regions and different birds have mutated and divided into several branches with regional characteristics. For instance, this kind of epidemic virus underwent cross-variation three times in Vietnam in the past 10 years.

The fact that H5N1 viruses quickly split into multiple branches shows that, in guarding against human and avian influenza, it is very risky to develop human vaccines based on only one branch. We must take the diversity of the virus into account. In addition, we must track the variation of the virus in order to update the vaccine quickly, which is very important for protecting humans from influenza virus.

Next, with regard to time distribution, comparing the 95 virus strains shows that they can be divided into three series. One series is the avian influenza that broke out in Guangdong and Hong Kong from 1996 to 1997; the nucleotide sequences from Guangdong in 1996 and 1997 are almost the same. Another series is composed of the virus strains from Hong Kong in 1997. The remaining virus strains, from 2000 to 2007 in Southeast Asia and East Asia, form the third series. After 1998, avian influenza had a 4-year silent period. The avian viruses that broke out in 2003 differ greatly from those of 1996; in addition, the virus strains from the Chinese Mainland also differ to a certain extent from those from other areas.

FIGURE 1. (a) Neighbor-joining tree for the 95 H5N1 AIV sequences. (b) Neighbor-joining tree for the 95 H5N1 AIV sequences (continued).

In addition, we found a rather unexpected sequence (62), which is from a quail in Thailand. It differs from the sequences of other hosts and lies at a great distance from all other sequences. We infer that this may be related to the host: the virus may have mutated in the course of infecting quails. Overall, almost every branch contains viruses from different hosts, and the evolution shows no significant host-specific features.

The phylogenetic tree provides further information. Although there are great differences between the viruses from 2000 to 2007 and those from 1996 to 1997 in Guangdong and Hong Kong, we still cannot deny that the former are variants of the latter, because H5N1 avian viruses have a high mutation rate in nature. We also found that almost every branch has sequences from Hong Kong. We can speculate that the avian viruses from the other regions may have evolved from those from Hong Kong. The H5N1 avian virus strains from Fujian in 2005 (23/24) have a high degree of similarity with those isolated earlier from Jiangxi (21) and Hunan (96). In other words, the virus strains from Fujian may not be new viruses. Here, we use the 2D graphical representation to carry out a detailed analysis of these three virus strains.

From the 2D graphical representations (see Fig. 2), we can almost conclude that the H5N1 viruses isolated from Fujian and Jiangxi may have evolved from the same virus. We can clearly observe the overall characteristics and, when needed, enlarge the graphics for a detailed comparison, which is one of the great advantages of our method. Figure 3 gives the 2D graphical representations of the last 60 bases of the three sequences. From Figure 3, we can observe tiny differences among them and easily find the sites that vary between the sequences. The first bases of the three sequences are different, which is the main reason for the different curves. From Figure 2, however, we can also see that a few aberrant bases do not affect the overall trend of the curves.

4. Conclusions

On the basis of 3D and 2D graphical representations, we translate a gene sequence into a 3D curve or 2D curve, with a one-to-one correspondence between sequences and graphs. Although some information may be lost during the transformation, we can focus our attention on the information of interest. First, we generalize from sequences to graphical representations. Second, we give a 3D graphical representation of the HA sequence and then construct a three-component vector, whose components are the leading eigenvalues extracted from the 3D graphs via the E/G matrices, to characterize the HA primary sequence. Comparison of HA sequences is thus transformed into a simpler comparison of vectors, which does not require multiple alignment.

The virus strains implicated in the 20th century's influenza pandemics originated directly from AIVs, either through genetic reassortment between human and avian influenza strains (1957 and 1968) or possibly through adaptation of purely avian strains to humans (1918). Occurrences of direct bird-to-human transmission of AIVs have increasingly been reported in recent years, culminating in the ongoing outbreak of avian influenza A virus (H5N1) among poultry in several Asian countries with associated human infections. These unprecedented developments have resulted in increasing global concerns about the pandemic potential of these viruses [7,8].

FIGURE 2. 2D graphical representations of A/migratory duck/Jiangxi/1653/2005, A/duck/Fujian/897/2005, and A/chicken/Fujian/1042/2005. [Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com.]

Gaining insight into the phylogenetic relationships among AIVs would be helpful for treatment and prevention of the virus. The phylogenetic analysis shows that the direction of evolution has significant characteristics with respect to time and area. These results are consistent with those of previous analyses [9,12].

Great attention has recently been paid to the highly pathogenic AIV of subtype H5N1, not only by scientists and medical workers but also by people all over the world. The global concern about H5N1 is due to the outbreaks of avian influenza in domestic poultry, the spread of the H5N1 virus by migratory birds, and the constant reports of human cases of infection with H5N1 [10]. So far, most reported human H5N1 infections were acquired directly from birds, with a couple of suspicious cases that might have been caused by human-to-human transmission [11]. However, H5N1 HPAIV of Asian lineage is not confined to birds, and a slowly but steadily increasing cumulative number of confirmed human infections leads to growing concerns about an imminent pandemic caused by this strain. Important questions to be answered include whether H5N1 will mutate to cause worldwide pandemics by acquiring the ability to spread among humans and how to treat such a pandemic if it happens. During transmission through migrating birds, each of these viruses reflects the geographic area along the flyways of the migratory birds that transmit and spread them. Different environmental factors in different geographic areas result in the formation of these emerging influenza viruses.

ACKNOWLEDGMENT

The authors thank the anonymous referees and editor for their corrections and valuable comments.

References

1. Fouchier, R. A.; Munster, V.; Wallensten, A.; Bestebroer, T. M.; Herfst, S.; Smith, D.; Rimmelzwaan, A. F.; Olsen, B.; Osterhaus, A. D. M. E. J Virol 2005, 79, 2814.
2. Kilbourne, E. D. J Infect Dis 1997, 176, 29.
3. Webby, R. J.; Webster, R. G. Science 2003, 302, 1519.
4. Zhang, Y.; Tan, M. J Math Chem 2008, 44, 2066.
5. Zhang, Y.; Liao, B.; Ding, K. Mol Simul 2006, 32, 29.
6. Zhang, Y.; Liao, B.; Ding, K. Chem Phys Lett 2005, 411, 28.
7. Jing, L.; Zhi, L.; Hua L. J Clin Exp Med 2006, 5, 1482.
8. Li, J.; Zhang, Z.; Tian, X. Chin J Bioinformatics 2006, 4, 109.
9. Zhao, H.; Li, Y.; Shi, J.; Tian, A.; Deng, A.; Wang, X.; Wei, P.; Chen, H. Chin J Prev Vet Med 2007, 29, 760.
10. Zhang, J.; Kangzhen, Yu. Chin J Prev Vet Med, 2000, 22, 1364.
11. World Health Organization. Wkly Epidemiol Rev 2004, 79, 65.
12. Pereira, H. G.; Tumova, B.; Law, V. G. Bull World Health Organ 1965, 32, 855.

FIGURE 3. 2D graphical representations of 60 bases of A/migratory duck/Jiangxi/1653/2005, A/duck/Fujian/897/2005, and A/chicken/Fujian/1042/2005. [Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com.]