PLEASE SCROLL DOWN FOR ARTICLE This article was downloaded by: [Kim, Jong-Min] On: 1 March 2010 Access details: Access Details: [subscription number 912582068] Publisher Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37- 41 Mortimer Street, London W1T 3JH, UK Communications in Statistics - Simulation and Computation Publication details, including instructions for authors and subscription information: http://www.informaworld.com/smpp/title~content=t713597237 Directional Dependence of Genes Using Survival Truncated FGM Type Modification Copulas Jong-Min Kim a ; Yoon-Sung Jung b ; Tim Soderberg a a Division of Science and Mathematics, University of Minnesota-Morris, Morris, Minnesota, USA b Department of Statistics, Kansas State University, Manhattan, Kansas, USA To cite this Article Kim, Jong-Min, Jung, Yoon-Sung and Soderberg, Tim(2009) 'Directional Dependence of Genes Using Survival Truncated FGM Type Modification Copulas', Communications in Statistics - Simulation and Computation, 38: 7, 1470 — 1484 To link to this Article: DOI: 10.1080/03610910903009336 URL: http://dx.doi.org/10.1080/03610910903009336 Full terms and conditions of use: http://www.informaworld.com/terms-and-conditions-of-access.pdf This article may be used for research, teaching and private study purposes. Any substantial or systematic reproduction, re-distribution, re-selling, loan or sub-licensing, systematic supply or distribution in any form to anyone is expressly forbidden. The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae and drug doses should be independently verified with primary sources. 
The publisher shall not be liable for any loss, actions, claims, proceedings, demand or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.

Directional Dependence of Genes Using Survival Truncated FGM Type Modification Copulas



Communications in Statistics - Simulation and Computation®, 38: 1470–1484, 2009
Copyright © Taylor & Francis Group, LLC
ISSN: 0361-0918 print/1532-4141 online
DOI: 10.1080/03610910903009336

Directional Dependence of Genes Using Survival Truncated FGM Type Modification Copulas

JONG-MIN KIM (1), YOON-SUNG JUNG (2), AND TIM SODERBERG (1)

(1) Division of Science and Mathematics, University of Minnesota-Morris, Morris, Minnesota, USA
(2) Department of Statistics, Kansas State University, Manhattan, Kansas, USA

A multivariate distribution can be represented in terms of its underlying margins by binding these margins together using a copula function (Sklar, 1959). Here, we propose a new class of survival FGM-type modification truncated copulas which quantify dependency and incorporate directional dependence. In addition, we apply our proposed methods to the analysis of directional dependence relationships between genes. Finally, we employ the Akaike Information Criterion (AIC) to check the goodness of fit for our proposed copula models.

Keywords Directional dependence; Farlie-Gumbel-Morgenstern copula; Survival copula.

Mathematics Subject Classification 62H11; 62H20.

1. Introduction

With the availability of increasingly large sets of gene expression data, there is a need for new methods to analyze these data for gene-gene dependence relationships. The ability to reconstruct gene networks from large sets of microarray data will facilitate analysis of cellular function at the molecular level, and will have a profound impact on many areas of biomedical research.

In statistics, there are two approaches to describing dependence structure: (i) setting up a functional relationship between the variables; and (ii) specifying the joint distribution of the variables. The second approach eliminates the effect of the univariate marginals, which have nothing to do with the dependence structure, and is much more general than approach (i). Copulas, first developed by Sklar (1959), are devices which give a representation of a multivariate distribution function in terms of its univariate marginal distributions. Recently, Rodríguez-Lallena and Úbeda-Flores (2004) proposed a new class of bivariate copulas depending on two univariate functions. This class is a generalization of the Farlie–Gumbel–Morgenstern (FGM) families of copulas. In this article, we describe survival truncated FGM type modification copulas and their use in the analysis of directional dependence among genes. We investigate several properties of the new class of survival truncated FGM type modification copulas, including dependence properties and measures of association between two random variables given one truncated random variable.

Received October 15, 2008; Accepted April 26, 2009
Address correspondence to Jong-Min Kim, Statistic Discipline, Division of Science and Mathematics, University of Minnesota-Morris, Morris, MN 56267, USA; E-mail: [email protected]

Downloaded By: [Kim, Jong-Min] At: 17:54 1 March 2010

This article is organized as follows. Section 2 contains a description of copulas and survival copulas. The new class of survival truncated copulas with directional dependence is explored in Sec. 3, while in Sec. 4 the gene-gene directional dependence application is introduced. In Sec. 5, we summarize our results and suggest future research directions.

2. Definitions and Preliminaries

In this section, we review the basic concepts of copulas, focusing on some preliminary properties of copulas. In addition, we present some standard bivariate copula families. A copula is a multivariate distribution function defined on the unit cube $[0,1]^n$, with uniformly distributed marginals. In this article, we focus on a bivariate (two-dimensional) copula, where $n = 2$.

Sklar (1973) showed that any bivariate distribution function $F_{XY}$ can be represented as a function of its marginal distributions of $X$ and $Y$ ($F_X$ and $F_Y$) by using a two-dimensional copula $C(\cdot,\cdot)$. More specifically, the copula may be written as

$$F_{XY}(x,y) = C(F_X(x), F_Y(y)) = C(u,v).$$

Therefore, the copula function represents how the multivariate function $F_{XY}(x,y)$ is coupled with its marginal distribution functions, $F_X(x)$ and $F_Y(y)$. Also, it describes the dependence mechanism between two random variables by eliminating the influence of the marginals or any monotone transformation on the marginals.
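As a small illustration of this representation, the following Python sketch binds two marginals together with a copula. The exponential marginals, their rate values, the choice of the FGM copula, and all function names are assumptions made for this example only; none of them come from the data analysis in this article.

```python
import math

def fgm_copula(u, v, theta):
    """FGM copula C(u, v) = uv[1 + theta(1-u)(1-v)], with -1 <= theta <= 1."""
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))

def exp_cdf(x, rate):
    """Exponential marginal distribution function (illustrative choice)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def joint_cdf(x, y, theta, rate_x=1.0, rate_y=2.0):
    """F_XY(x, y) = C(F_X(x), F_Y(y)): the copula binds the two marginals."""
    return fgm_copula(exp_cdf(x, rate_x), exp_cdf(y, rate_y), theta)

if __name__ == "__main__":
    # theta = 0 gives independence: the joint CDF is the product of marginals.
    print(joint_cdf(0.5, 0.3, theta=0.0))
    # A positive theta raises the joint probability at the same point.
    print(joint_cdf(0.5, 0.3, theta=0.8))
```

Changing only `theta` changes the dependence while leaving both marginals untouched, which is the separation the copula representation is meant to provide.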

Two additional properties of copulas are the continuity property and the differentiability property. For the continuity property, let $C$ be a copula. Then,

$$|C(u_2,v_2) - C(u_1,v_1)| \le |u_2 - u_1| + |v_2 - v_1|, \quad \forall\, u_1, u_2, v_1, v_2 \in [0,1],\ u_1 < u_2 \text{ and } v_1 < v_2;$$

hence, every copula $C$ is uniformly continuous on its domain (Nelsen, 1999).

Let $X, Y$ be random variables with continuous distribution functions $F_X$ and $F_Y$, respectively. Then Spearman's $\rho$ and Kendall's $\tau$ are given, respectively, by

$$\rho_C = 12\int_0^1\!\!\int_0^1 \left[C(u,v) - uv\right] du\,dv, \tag{1}$$

and

$$\tau_C = 4\int_0^1\!\!\int_0^1 C(u,v)\,dC(u,v) - 1, \tag{2}$$

(Nelsen, 1999).

Cherubini et al. (2004) described the survival copula as follows.

Definition 2.1 (Cherubini et al., 2004). The survival copula associated with the copula $C$ is

$$\bar{C}(v,z) = v + z - 1 + C(1-v, 1-z). \tag{3}$$

It is easy to verify that $\bar{C}$ has copula properties. Once computed in $(1-v, 1-z)$, it represents the probability that two standard uniform variates with copula $C$ are greater than $v$, $z$, respectively, since

$$
\begin{aligned}
\bar{C}(1-v, 1-z) &= 1 - v + 1 - z - 1 + C(v,z) \\
&= 1 - P(U_1 \le v) + 1 - P(U_2 \le z) - 1 + P(U_1 \le v, U_2 \le z) \\
&= P(U_1 > v) + P(U_2 > z) - 1 + P(U_1 \le v, U_2 \le z) \\
&= P(U_1 > v, U_2 > z).
\end{aligned} \tag{4}
$$

It is also possible to express, via the survival copula, the conditional probability

$$P(U_1 > v \mid U_2 > z) = \frac{1 - v - z + C(v,z)}{1 - z} = \frac{\bar{C}(1-v, 1-z)}{1 - z}.$$
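The quantities in (1)–(4) can be checked numerically on a simple case. The sketch below is only an illustration: it assumes the classical FGM copula $C(u,v) = uv[1+\theta(1-u)(1-v)]$, uses a plain midpoint rule, and all function names are hypothetical. For this copula the integrals in (1) and (2) have the known closed forms $\rho_C = \theta/3$ and $\tau_C = 2\theta/9$, and the survival copula (3) coincides with $C$ itself.

```python
def fgm(u, v, theta):
    """Classical FGM copula."""
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))

def fgm_density(u, v, theta):
    """c(u, v) = d^2 C / (du dv) for the FGM copula."""
    return 1.0 + theta * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)

def midpoints(n):
    """Midpoints of n equal cells of [0, 1]."""
    return [(i + 0.5) / n for i in range(n)]

def spearman_rho(copula, n=200):
    """Equation (1): 12 times the double integral of C(u, v) - uv."""
    pts = midpoints(n)
    total = sum(copula(u, v) - u * v for u in pts for v in pts)
    return 12.0 * total / (n * n)

def kendall_tau(copula, density, n=200):
    """Equation (2): 4 times the integral of C dC, minus 1 (dC = c du dv)."""
    pts = midpoints(n)
    total = sum(copula(u, v) * density(u, v) for u in pts for v in pts)
    return 4.0 * total / (n * n) - 1.0

def survival(copula):
    """Equation (3): C_bar(v, z) = v + z - 1 + C(1 - v, 1 - z)."""
    return lambda v, z: v + z - 1.0 + copula(1.0 - v, 1.0 - z)

if __name__ == "__main__":
    theta = 0.6
    C = lambda u, v: fgm(u, v, theta)
    c = lambda u, v: fgm_density(u, v, theta)
    print(spearman_rho(C), theta / 3.0)
    print(kendall_tau(C, c), 2.0 * theta / 9.0)
    # Conditional probability P(U1 > v | U2 > z) via the survival copula.
    v, z = 0.3, 0.7
    print(survival(C)(1.0 - v, 1.0 - z) / (1.0 - z))
```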

3. Survival Truncated Copulas and Directional Dependence

Rodríguez-Lallena and Úbeda-Flores (2004) developed a wide class of bivariate copulas which depend on two univariate functions, and described the dependency of the copulas in different ways. In this section, we generalize the class of bivariate copulas suggested by Rodríguez-Lallena and Úbeda-Flores (2004).

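Lemma 3.1. Let $(X, Y)$ be a continuous random pair whose associated copula $C$ is given by $C(u,v) = uv + \theta u^a v^b (1-u)^c (1-v)^d$ for every $(u,v)$ in $[0,1]^2$ with $a, b, c, d \ge 1$. Rodríguez-Lallena and Úbeda-Flores (2004) proved that $C$ is a copula if and only if

$$-\frac{1}{\max\{\nu\gamma,\ \omega\varepsilon\}} \le \theta \le -\frac{1}{\min\{\nu\varepsilon,\ \omega\gamma\}},$$

where $\omega = -\nu = 1$ if $a = c = 1$, $\varepsilon = -\gamma = 1$ if $b = d = 1$, and

$$\nu = -\left(\frac{a}{a+c}\right)^{a-1}\left(1+\sqrt{\frac{c}{a(a+c-1)}}\right)^{a-1}\left(\frac{c}{a+c}\right)^{c-1}\left(1-\sqrt{\frac{a}{c(a+c-1)}}\right)^{c-1}\sqrt{\frac{ac}{a+c-1}},$$

$$\omega = \left(\frac{a}{a+c}\right)^{a-1}\left(1-\sqrt{\frac{c}{a(a+c-1)}}\right)^{a-1}\left(\frac{c}{a+c}\right)^{c-1}\left(1+\sqrt{\frac{a}{c(a+c-1)}}\right)^{c-1}\sqrt{\frac{ac}{a+c-1}},$$

$$\gamma = -\left(\frac{b}{b+d}\right)^{b-1}\left(1+\sqrt{\frac{d}{b(b+d-1)}}\right)^{b-1}\left(\frac{d}{b+d}\right)^{d-1}\left(1-\sqrt{\frac{b}{d(b+d-1)}}\right)^{d-1}\sqrt{\frac{bd}{b+d-1}},$$

$$\varepsilon = \left(\frac{b}{b+d}\right)^{b-1}\left(1-\sqrt{\frac{d}{b(b+d-1)}}\right)^{b-1}\left(\frac{d}{b+d}\right)^{d-1}\left(1+\sqrt{\frac{b}{d(b+d-1)}}\right)^{d-1}\sqrt{\frac{bd}{b+d-1}},$$

otherwise. Moreover, the range for $\theta$ contains the interval $[-1, 1]$ for all $a, b, c, d \ge 1$. The case $a = b = c = d = 1$ produces the Farlie-Gumbel-Morgenstern family of copulas (and the smallest range for $\theta$, i.e., the interval $[-1, 1]$). In general, the bigger the parameters $a, b, c, d$ are, the bigger the range for $\theta$ is (for instance: if $a = b = c = d = 2$, then $\theta \in [-27, 27]$; if $a = 2$, $c = 3$, $b = 4$, $d = 5$, so that the factor in $u$ is $u^2(1-u)^3$ and the factor in $v$ is $v^4(1-v)^5$, then $\theta \in [-840.445, 939.403]$, etc.).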
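The bounds of Lemma 3.1 are easy to evaluate numerically. The Python sketch below is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, and the four constants of the lemma are computed as the infimum and supremum of the derivatives of $f(u) = u^a(1-u)^c$ and $g(v) = v^b(1-v)^d$ on $[0,1]$, which is what the closed-form expressions evaluate to. With these conventions the interval $[-27, 27]$ is recovered for $a=b=c=d=2$, and the interval $[-840.445, 939.403]$ is recovered when the exponents of $u$ and $1-u$ are 2 and 3 and those of $v$ and $1-v$ are 4 and 5.

```python
import math

def derivative_bounds(p, q):
    """(inf, sup) over [0, 1] of d/dt [t**p * (1 - t)**q], for p, q >= 1,
    evaluated with the closed-form expressions of Lemma 3.1."""
    s = p + q
    scale = math.sqrt(p * q / (s - 1.0))
    r_p = math.sqrt(q / (p * (s - 1.0)))
    r_q = math.sqrt(p / (q * (s - 1.0)))
    base = (p / s) ** (p - 1) * (q / s) ** (q - 1) * scale
    lower = -base * (1.0 + r_p) ** (p - 1) * (1.0 - r_q) ** (q - 1)
    upper = base * (1.0 - r_p) ** (p - 1) * (1.0 + r_q) ** (q - 1)
    return lower, upper

def theta_range(a, b, c, d):
    """Admissible theta for C(u,v) = uv + theta u^a v^b (1-u)^c (1-v)^d."""
    f_inf, f_sup = derivative_bounds(a, c)   # derivative of u^a (1-u)^c
    g_inf, g_sup = derivative_bounds(b, d)   # derivative of v^b (1-v)^d
    lower = -1.0 / max(f_inf * g_inf, f_sup * g_sup)
    upper = -1.0 / min(f_inf * g_sup, f_sup * g_inf)
    return lower, upper

if __name__ == "__main__":
    print(theta_range(1, 1, 1, 1))   # classical FGM case
    print(theta_range(2, 2, 2, 2))
    print(theta_range(2, 4, 3, 5))   # u^2 (1-u)^3 and v^4 (1-v)^5
```

Passing $c = d = 1$ or $a = b = 1$ gives the ranges for the Type II and Type III forms introduced next.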


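Table 1
Forms of f(u) and g(v) for each type

Type    f(u)                                  g(v)
I       $\sqrt{\theta}\,u(1-u)$               $\sqrt{\theta}\,v(1-v)$
II      $\sqrt{\theta}\,u^{\alpha}(1-u)$      $\sqrt{\theta}\,v^{\beta}(1-v)$
III     $\sqrt{\theta}\,u(1-u)^{\alpha}$      $\sqrt{\theta}\,v(1-v)^{\beta}$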

Based on Rodríguez-Lallena and Úbeda-Flores (2004), Jung et al. (2007) introduced three different types of FGM distributions as follows:

• Type I: $C_I(u,v) = uv + \theta uv(1-u)(1-v)$, where $0 \le u, v \le 1$
• Type II: $C_{II}(u,v) = uv + \theta u^{\alpha} v^{\beta}(1-u)(1-v)$, where $\alpha \ge 1$, $\beta \ge 1$, $0 \le u, v \le 1$
• Type III: $C_{III}(u,v) = uv + \theta uv(1-u)^{\alpha}(1-v)^{\beta}$, where $\alpha \ge 1$, $\beta \ge 1$, $0 \le u, v \le 1$

Table 1 shows some special forms of $f(u)$ and $g(v)$ for each type of FGM function considered in this article.
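The three types translate directly into code. The following sketch is illustrative only (the function names are hypothetical); it evaluates each form and checks two elementary facts that follow from the definitions: the margins are uniform, and Types II and III reduce to Type I when $\alpha = \beta = 1$.

```python
def copula_type1(u, v, theta):
    """Type I: uv + theta * u v (1-u)(1-v)."""
    return u * v + theta * u * v * (1.0 - u) * (1.0 - v)

def copula_type2(u, v, theta, alpha, beta):
    """Type II: uv + theta * u^alpha v^beta (1-u)(1-v)."""
    return u * v + theta * u**alpha * v**beta * (1.0 - u) * (1.0 - v)

def copula_type3(u, v, theta, alpha, beta):
    """Type III: uv + theta * u v (1-u)^alpha (1-v)^beta."""
    return u * v + theta * u * v * (1.0 - u)**alpha * (1.0 - v)**beta

if __name__ == "__main__":
    u, v, theta = 0.2, 0.7, 0.5
    print(copula_type1(u, v, theta))
    # Swapping the arguments changes the value when alpha != beta,
    # i.e. the Type II form is then not symmetric in (u, v).
    print(copula_type2(u, v, theta, 2, 3), copula_type2(v, u, theta, 2, 3))
    print(copula_type3(u, v, theta, 2, 3), copula_type3(v, u, theta, 2, 3))
```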

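The ranges of $\theta$ for each Type in Table 1 are

$$
\begin{aligned}
\text{Type I}: &\quad -1 \le \theta \le 1, && a = b = c = d = 1, \\
\text{Type II}: &\quad -\frac{1}{\max\{\nu\gamma,\ \omega\varepsilon\}} \le \theta \le -\frac{1}{\min\{\nu\varepsilon,\ \omega\gamma\}}, && a = \alpha,\ b = \beta,\ c = d = 1, \\
\text{Type III}: &\quad -\frac{1}{\max\{\nu\gamma,\ \omega\varepsilon\}} \le \theta \le -\frac{1}{\min\{\nu\varepsilon,\ \omega\gamma\}}, && a = b = 1,\ c = \alpha,\ d = \beta.
\end{aligned} \tag{5}
$$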

By using Definition 2.1 and Lemma 3.1, the survival FGM type modification copulas are as follows:

• Survival Type I: $\bar{C}_I(u,v) = uv + \theta uv(1-u)(1-v) = C_I(u,v)$, where $0 \le u, v \le 1$
• Survival Type II: $\bar{C}_{II}(u,v) = uv + \theta u^{\alpha} v^{\beta}(1-u)(1-v) = C_{III}(u,v)$, where $\alpha \ge 1$, $\beta \ge 1$, $0 \le u, v \le 1$
• Survival Type III: $\bar{C}_{III}(u,v) = uv + \theta uv(1-u)^{\alpha}(1-v)^{\beta} = C_{II}(u,v)$, where $\alpha \ge 1$, $\beta \ge 1$, $0 \le u, v \le 1$

The truncation invariant dependence structure of a copula is defined as follows (Sungur, 1999).

Definition 3.1. If a three-dimensional copula can be represented as

$$
\begin{aligned}
C(u_1,u_2,u_3) &= C_{12}\!\left(\frac{C_{13}(u_1,u_3)}{u_3}, \frac{C_{23}(u_2,u_3)}{u_3}\right) u_3 \\
&= C_{13}\!\left(\frac{C_{12}(u_1,u_2)}{u_2}, \frac{C_{23}(u_2,u_3)}{u_2}\right) u_2 \\
&= C_{23}\!\left(\frac{C_{12}(u_1,u_2)}{u_1}, \frac{C_{13}(u_1,u_3)}{u_1}\right) u_1,
\end{aligned}
$$

then it will be called a truncation invariant copula.

We consider several different types of the Farlie–Gumbel–Morgenstern (FGM) distribution which have the specific form of the Rodríguez-Lallena and Úbeda-Flores (2004) copula family, $C(u,v) = uv + f(u)g(v)$. Let $C_{12}$, $C_{13}$, and $C_{23}$ be members of the Farlie–Gumbel–Morgenstern class of copulas:

$$\{C_{ij} : C_{ij}(u_i,u_j) = u_i u_j[1 + \theta_{ij}(1-u_i)(1-u_j)]\}.$$

Also, let $\theta_{12}$, $\theta_{13}$, and $\theta_{23}$ be the dependence parameters of $C_{12}$, $C_{13}$, and $C_{23}$, respectively. Provided that $\theta_{12}$, $\theta_{13}$, and $\theta_{23}$ lead to compatible two-dimensional copulas, the partially truncated invariant 3-dimensional copula with respect to $U_3$ is

$$
\begin{aligned}
C(u_1,u_2,u_3) = {} & u_1u_2u_3[1+\theta_{13}(1-u_1)(1-u_3)][1+\theta_{23}(1-u_2)(1-u_3)] \\
& \times \{1+\theta_{12}(1-u_1)(1-u_2)[1+\theta_{13}u_1(1-u_3)][1+\theta_{23}u_2(1-u_3)]\}.
\end{aligned}
$$

If $\theta_{12} = \theta_{23} = \theta_{13} = \theta$, then

$$
\begin{aligned}
C(u_1,u_2,u_3) = {} & u_1u_2u_3[1+\theta(1-u_1)(1-u_3)][1+\theta(1-u_2)(1-u_3)] \\
& \times \{1+\theta(1-u_1)(1-u_2)[1+\theta u_1(1-u_3)][1+\theta u_2(1-u_3)]\},
\end{aligned} \tag{6}
$$

which will be referred to as the equi-dependence structure. It can be easily shown that such generated copulas are partially truncated invariants with respect to all possible truncation regions. The equi-dependence structure, employed by Cook and Johnson (1981) to describe data which are not elliptically symmetric, has been extensively discussed in Sungur (1999).
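The first representation in Definition 3.1 can be sketched in code by composing the bivariate members directly. The example below is a minimal sketch under stated assumptions: it uses FGM members, the function names are hypothetical, and it deliberately evaluates the composition $C_{12}(C_{13}/u_3, C_{23}/u_3)\,u_3$ numerically rather than an expanded closed form. It then checks that setting one argument to 1 recovers the corresponding bivariate member.

```python
def fgm(u, v, theta):
    """Bivariate FGM member C_ij(u_i, u_j) = u_i u_j [1 + theta (1-u_i)(1-u_j)]."""
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))

def trivariate_wrt_u3(u1, u2, u3, t12, t13, t23):
    """C(u1, u2, u3) = C12(C13(u1, u3)/u3, C23(u2, u3)/u3) * u3."""
    if u3 == 0.0:
        return 0.0  # a copula vanishes when any argument is 0
    a = fgm(u1, u3, t13) / u3
    b = fgm(u2, u3, t23) / u3
    return fgm(a, b, t12) * u3

if __name__ == "__main__":
    theta = 0.4  # equi-dependence: the same parameter for all three pairs
    print(trivariate_wrt_u3(0.3, 0.6, 0.8, theta, theta, theta))
    # Bivariate margins: set the remaining argument to 1.
    print(trivariate_wrt_u3(0.3, 0.6, 1.0, theta, theta, theta), fgm(0.3, 0.6, theta))
    print(trivariate_wrt_u3(0.3, 1.0, 0.8, theta, theta, theta), fgm(0.3, 0.8, theta))
```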

Under the assumption that each trivariate copula has the same parameter, we let $\theta_{12}$, $\theta_{13}$, $\theta_{23}$, and $\theta_{123}$ be the dependence parameters of $C_{12}$, $C_{13}$, $C_{23}$, and $C_{123}$, respectively; then $\theta_{12} = \theta_{23} = \theta_{13} = \theta_{123} = \theta$. We can show the survival FGM-type modification copulas as follows:

• Survival FGM-Type Modification Copula Type I:

$$
\begin{aligned}
\bar{C}_I(u_1,u_2,u_3)
&= \bar{C}_{I,12}\!\left(\frac{\bar{C}_{I,13}(u_1,u_3)}{1-u_3}, \frac{\bar{C}_{I,23}(u_2,u_3)}{1-u_3}\right)\cdot(1-u_3) \\
&= C_{I,12}\!\left(\frac{C_{I,13}(u_1,u_3)}{1-u_3}, \frac{C_{I,23}(u_2,u_3)}{1-u_3}\right)\cdot(1-u_3) \\
&= C_{I,12}\!\left(\frac{u_1u_3+\theta u_1u_3(1-u_1)(1-u_3)}{1-u_3}, \frac{u_2u_3+\theta u_2u_3(1-u_2)(1-u_3)}{1-u_3}\right)\cdot(1-u_3) \\
&= u_1u_2u_3^2\left\{\frac{[1+\theta(1-u_1)(1-u_3)][1+\theta(1-u_2)(1-u_3)]}{1-u_3}\right\} \\
&\quad\times\left\{1+\theta\left(1-\frac{u_1u_3[1+\theta(1-u_1)(1-u_3)]}{1-u_3}\right)\left(1-\frac{u_2u_3[1+\theta(1-u_2)(1-u_3)]}{1-u_3}\right)\right\},
\end{aligned}
$$

where $0 \le u_1, u_2, u_3 \le 1$ and $-1 \le \theta \le 1$.

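• Survival FGM-Type Modification Copula Type II:

$$
\begin{aligned}
\bar{C}_{II}(u_1,u_2,u_3)
&= \bar{C}_{II,12}\!\left(\frac{\bar{C}_{II,13}(u_1,u_3)}{1-u_3}, \frac{\bar{C}_{II,23}(u_2,u_3)}{1-u_3}\right)\cdot(1-u_3) \\
&= C_{III,12}\!\left(\frac{C_{III,13}(u_1,u_3)}{1-u_3}, \frac{C_{III,23}(u_2,u_3)}{1-u_3}\right)\cdot(1-u_3) \\
&= C_{III,12}\!\left(\frac{u_1u_3+\theta u_1^{\alpha}u_3^{\beta}(1-u_1)(1-u_3)}{1-u_3}, \frac{u_2u_3+\theta u_2^{\alpha}u_3^{\beta}(1-u_2)(1-u_3)}{1-u_3}\right)\cdot(1-u_3) \\
&= u_1u_2u_3^2\Bigg[\left\{\frac{[1+\theta u_1^{\alpha-1}u_3^{\beta-1}(1-u_1)(1-u_3)][1+\theta u_2^{\alpha-1}u_3^{\beta-1}(1-u_2)(1-u_3)]}{1-u_3}\right\} \\
&\quad+\theta u_1^{\alpha-1}u_2^{\beta-1}u_3^{\alpha+\beta-2}\left\{\frac{1+\theta u_1^{\alpha-1}u_3^{\beta-1}(1-u_1)(1-u_3)}{1-u_3}\right\}^{\alpha}\left\{\frac{1+\theta u_2^{\alpha-1}u_3^{\beta-1}(1-u_2)(1-u_3)}{1-u_3}\right\}^{\beta} \\
&\quad\times\frac{[1-u_3-u_1u_3-\theta u_1^{\alpha}u_3^{\beta}(1-u_1)(1-u_3)]\,[1-u_3-u_2u_3-\theta u_2^{\alpha}u_3^{\beta}(1-u_2)(1-u_3)]}{1-u_3}\Bigg],
\end{aligned}
$$

where $\alpha \ge 1$, $\beta \ge 1$, $0 \le u_1, u_2, u_3 \le 1$, and the admissible range of $\theta$ can be derived from Type II in (5).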

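• Survival FGM-Type Modification Copula Type III:

$$
\begin{aligned}
\bar{C}_{III}(u_1,u_2,u_3)
&= \bar{C}_{III,12}\!\left(\frac{\bar{C}_{III,13}(u_1,u_3)}{1-u_3}, \frac{\bar{C}_{III,23}(u_2,u_3)}{1-u_3}\right)\cdot(1-u_3) \\
&= C_{II,12}\!\left(\frac{C_{II,13}(u_1,u_3)}{1-u_3}, \frac{C_{II,23}(u_2,u_3)}{1-u_3}\right)\cdot(1-u_3) \\
&= C_{II,12}\!\left(\frac{u_1u_3+\theta u_1u_3(1-u_1)^{\alpha}(1-u_3)^{\beta}}{1-u_3}, \frac{u_2u_3+\theta u_2u_3(1-u_2)^{\alpha}(1-u_3)^{\beta}}{1-u_3}\right)\cdot(1-u_3) \\
&= u_1u_2u_3^2\left\{\frac{[1+\theta(1-u_1)^{\alpha}(1-u_3)^{\beta}][1+\theta(1-u_2)^{\alpha}(1-u_3)^{\beta}]}{1-u_3}\right\} \\
&\quad\times\left\{1+\theta\left(1-\frac{u_1u_3[1+\theta(1-u_1)^{\alpha}(1-u_3)^{\beta}]}{1-u_3}\right)^{\alpha}\left(1-\frac{u_2u_3[1+\theta(1-u_2)^{\alpha}(1-u_3)^{\beta}]}{1-u_3}\right)^{\beta}\right\},
\end{aligned}
$$

where $\alpha \ge 1$, $\beta \ge 1$, $0 \le u_1, u_2, u_3 \le 1$, and the admissible range of $\theta$ can be derived from Type III in (5).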

For more details on the admissible range of $\theta$ for the generalized Farlie–Gumbel–Morgenstern distributions, the reader is referred to the articles of Bairamov and Eryilmaz (2004), Bairamov et al. (2000), Bairamov and Kotz (2002, 2003), Kim et al. (2008), and Lai and Xie (2000).

The conditional joint distribution of $U_1$ and $U_2$ under the condition $U_3 \ge a$ is

$$
\bar{C}(U_1,U_2 \mid U_3 \ge a) = P(U_1 \ge u_1, U_2 \ge u_2 \mid U_3 \ge a) = \frac{P(U_1 \ge u_1, U_2 \ge u_2, U_3 \ge a)}{P(U_3 \ge a)} = \frac{\bar{C}(u_1,u_2,a)}{1-a}.
$$

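By the equi-dependence structure, $\bar{C}(U_1,U_2 \mid U_3 \ge a)$ for each survival truncated FGM-type modification copula type is as follows:

• Survival Truncated FGM-Type Modification Copula Type I:

$$
\begin{aligned}
\bar{C}_I(u_1,u_2 \mid a) = \frac{1}{1-a}\Bigg[& u_1u_2a^2\left\{\frac{[1+\theta(1-u_1)(1-a)][1+\theta(1-u_2)(1-a)]}{1-a}\right\} \\
&\times\left\{1+\theta\left(1-\frac{u_1a[1+\theta(1-u_1)(1-a)]}{1-a}\right)\left(1-\frac{u_2a[1+\theta(1-u_2)(1-a)]}{1-a}\right)\right\}\Bigg],
\end{aligned}
$$

where $0 \le u_1, u_2, u_3 \le 1$.

• Survival Truncated FGM-Type Modification Copula Type II:

$$
\begin{aligned}
\bar{C}_{II}(u_1,u_2 \mid a) = \frac{1}{1-a}\,u_1u_2a^2\Bigg[& \frac{[1+\theta u_1^{\alpha-1}a^{\beta-1}(1-u_1)(1-a)][1+\theta u_2^{\alpha-1}a^{\beta-1}(1-u_2)(1-a)]}{1-a} \\
&+\theta u_1^{\alpha-1}u_2^{\beta-1}a^{\alpha+\beta-2}\left\{\frac{1+\theta u_1^{\alpha-1}a^{\beta-1}(1-u_1)(1-a)}{1-a}\right\}^{\alpha}\left\{\frac{1+\theta u_2^{\alpha-1}a^{\beta-1}(1-u_2)(1-a)}{1-a}\right\}^{\beta} \\
&\times\frac{[1-a-u_1a-\theta u_1^{\alpha}a^{\beta}(1-u_1)(1-a)]\,[1-a-u_2a-\theta u_2^{\alpha}a^{\beta}(1-u_2)(1-a)]}{1-a}\Bigg],
\end{aligned}
$$

where $\alpha \ge 1$, $\beta \ge 1$, $0 \le u_1, u_2, u_3 \le 1$.

• Survival Truncated FGM-Type Modification Copula Type III:

$$
\begin{aligned}
\bar{C}_{III}(u_1,u_2 \mid a) = \frac{1}{1-a}\Bigg[& u_1u_2a^2\left\{\frac{[1+\theta(1-u_1)^{\alpha}(1-a)^{\beta}][1+\theta(1-u_2)^{\alpha}(1-a)^{\beta}]}{1-a}\right\} \\
&\times\left\{1+\theta\left(1-\frac{u_1a[1+\theta(1-u_1)^{\alpha}(1-a)^{\beta}]}{1-a}\right)^{\alpha}\left(1-\frac{u_2a[1+\theta(1-u_2)^{\alpha}(1-a)^{\beta}]}{1-a}\right)^{\beta}\right\}\Bigg],
\end{aligned}
$$

where $\alpha \ge 1$, $\beta \ge 1$, $0 \le u_1, u_2, u_3 \le 1$.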

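The directional dependencies proposed by Sungur (2005) for the direction $U_1$ to $U_2$ and for the direction $U_2$ to $U_1$ under truncation $U_3 \ge a$ are defined as

$$
r_{U_1|U_2,a}(u_2) = E[U_1 \mid U_2 = u_2, U_3 \ge a] = 1 - \int_0^1 \frac{\partial \bar{C}(u_1,u_2,a)}{\partial u_2}\,du_1 = 1 - \frac{\partial}{\partial u_2}\int_0^1 \bar{C}(u_1,u_2,a)\,du_1, \tag{7}
$$

and

$$
r_{U_2|U_1,a}(u_1) = E[U_2 \mid U_1 = u_1, U_3 \ge a] = 1 - \int_0^1 \frac{\partial \bar{C}(u_1,u_2,a)}{\partial u_1}\,du_2 = 1 - \frac{\partial}{\partial u_1}\int_0^1 \bar{C}(u_1,u_2,a)\,du_2. \tag{8}
$$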
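The structure of (7) and (8), one minus the derivative of an integrated copula, can be illustrated numerically. The sketch below is a simplification and not the computation behind Table 3: it assumes a plain bivariate copula without truncation (the Type II form with $\theta = 0.9$, $\alpha = 2$, $\beta = 3$, chosen only for illustration), a midpoint rule for the integral, and a central difference for the derivative. All function names are hypothetical. For this form the two regression functions have the closed forms $\tfrac12 - \theta(2u - 3u^2)/20$ and $\tfrac12 - \theta(3v^2 - 4v^3)/12$, so the two directions differ.

```python
def copula_type2(u, v, theta, alpha, beta):
    """Type II form: uv + theta u^alpha v^beta (1-u)(1-v)."""
    return u * v + theta * u**alpha * v**beta * (1.0 - u) * (1.0 - v)

def integrate01(func, n=2000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(func((i + 0.5) * h) for i in range(n))

def r_v_given_u(copula, u, eps=1e-4):
    """E[V | U = u] = 1 - d/du of the integral of C(u, v) over v."""
    inner = lambda x: integrate01(lambda v: copula(x, v))
    return 1.0 - (inner(u + eps) - inner(u - eps)) / (2.0 * eps)

def r_u_given_v(copula, v, eps=1e-4):
    """E[U | V = v] = 1 - d/dv of the integral of C(u, v) over u."""
    inner = lambda y: integrate01(lambda u: copula(u, y))
    return 1.0 - (inner(v + eps) - inner(v - eps)) / (2.0 * eps)

if __name__ == "__main__":
    C = lambda u, v: copula_type2(u, v, theta=0.9, alpha=2, beta=3)
    for t in (0.25, 0.5, 0.75):
        # The two columns differ: the dependence is directional.
        print(t, r_v_given_u(C, t), r_u_given_v(C, t))
```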

Table 3 shows the measures of dependence of two variables $u_1$ and $u_2$ under the truncated variable $u_3 \ge a$, and Fig. 1 shows the plots of directional dependence for the three different survival truncated FGM-type modification copulas (Type I, Type II, and Type III) under the truncated value $a = 0.3$ using (7) and (8).

4. Using Survival Truncated FGM-Type Modification Copulas to Analyze Directional Dependence in Gene Expression Datasets

The ability to recognize and quantify directionality in gene dependence relationships will enhance the ability of researchers working in the fields of genomics and proteomics to extract information from large sets of gene expression data, and thereby to better understand the complex mechanisms by which genes and proteins interact. The microarray dataset used in this analysis is from a previous study on yeast cell-cycle regulation (Spellman et al., 1998) and is publicly available (http://cellcycle-www.stanford.edu/). The dataset is composed of measurements on 6221 genes observed at 80 time points. Eight hundred genes were identified. We selected one group of genes with known interaction patterns (note that known interactions are still incomplete at present). The group includes eight histone genes: HHT1, HHT2, HHF1, HHF2, HTA1, HTA2, HTB1, and HTB2. These eight genes encode four histones: H2A, H2B, H3, and H4. Histones are proteins which bind tightly to DNA, helping to 'package' the genetic material in chromosomes.


Figure 1. Plots of directional dependence by type under truncation 0.3.


Table 2
Basic statistics of empirical transformed data

Gene   Mean       Median   Minimum   Maximum   St.D      Skewness     Kurtosis
HHT1   −0.01450    0.03    −2.25     1.82      0.88255   −5.903e-47   9.862e-62
HHT2   −0.15556   −0.17    −2.32     1.69      0.82572    6.009e-47   1.009e-61
HHF2   −0.24212   −0.12    −2.84     1.99      0.99243    6.078e-48   4.759e-63
HHF1   −0.24303   −0.15    −3.18     2.00      1.02157   −7.982e-47   1.474e-61
HTB1   −0.18278   −0.10    −2.00     1.83      0.89974   −8.163e-49   3.273e-64
HTA2   −0.30008   −0.22    −3.06     1.76      0.93989    1.118e-49   2.311e-65
HTA1   −0.32810   −0.34    −2.84     1.86      0.98060    1.702e-47   1.879e-62
HTB2   −0.20398   −0.07    −2.47     1.83      0.89841   −3.456e-48   2.242e-63

Because chromosomes are replicated during cell division, expression of histone genes must be tightly regulated in order for replication to proceed. Table 2 provides basic descriptive statistics of the expression data for the eight histone genes used in this study. From Table 2, we can see that the skewness and the kurtosis of each histone gene are almost zero. We also notice that the maximum of HHT2 is much lower than the other maxima. The maxima of HHF1 and HHF2 are close to each other, but the minima of HHF1 and HHF2 are significantly different.

Next, we estimate the parameter $\theta$ in each survival truncated FGM-type modification distribution. We define $U_i := F_X(X_i)$ and $V_i := F_Y(Y_i)$ for the continuous empirical marginal distribution functions $F_X$ and $F_Y$. We assume that $U_i$ and $V_i$ have the uniform distribution $U(0,1)$. Hence, we can reduce our empirical likelihood function to

$$L(\theta; U, V) = \prod_{i=1}^{n} c(U_i, V_i), \tag{9}$$

where $c$ denotes the copula density, which depends on $\theta$. The estimate of the parameter $\theta$ is determined by maximizing the likelihood (9) for the real data set. For computational convenience in finding the MLE of $\theta$, the logarithmic form of (9) is as follows:

$$\hat{\theta} = \underset{\theta \in R}{\arg\max} \sum_{i=1}^{n} \log L(\theta; U_i, V_i), \tag{10}$$

where $R$ is the set of all possible $\theta$'s.

We employed Akaike's Information Criterion (AIC) to evaluate our copula model. Akaike's Information Criterion is defined as follows:

$$\mathrm{AIC} = -2\log\big(L(\theta; U, V)\big) + 2t,$$

where $t$ is the number of parameters of the model. The value of AIC is an indicator of which estimator fits better: the lower the AIC, the better the model. Table 4 shows the values of the AIC for the survival truncated FGM type modification copula models (Types I–III). In this article, to estimate the parameter $\theta$, we consider equation (9) because it is a simpler form of the likelihood function than the logarithmic form. But we found that there is a difficulty in estimating $\theta$ by using

Table 3
Measurement of directional dependence at Type I, II, III with truncation 0.3

                      Type I                          Type II                         Type III
U vs V                r_{U|V,0.3}    r_{V|U,0.3}      r_{U|V,0.3}    r_{V|U,0.3}      r_{U|V,0.3}    r_{V|U,0.3}
1. HHT1 vs HHT2       0.91118675493  0.91116357356    0.91830094799  0.91828728055    1.23361636406  0.91524300810
2. HHT1 vs HHF2       0.91095599554  0.91093218469    0.91826322366  0.91828728055    1.24335198839  0.91486604955
3. HHT1 vs HHF1       0.91117488905  0.91116357356    0.91827828813  0.91828728055    1.23354869747  0.91524300810
4. HHT1 vs HTB1       0.91117518882  0.91116357356    0.91830960404  0.91828728055    1.24304825264  0.91486604955
5. HHT1 vs HTA2       0.91119595238  0.91116357356    0.91828573507  0.91828728055    1.22572998267  0.91555693422
6. HHT1 vs HTA1       0.91094335034  0.91093218469    0.91829158429  0.91828728055    1.24156671420  0.91492889451
7. HHT1 vs HTB2       0.91117923737  0.91116357356    0.91826222762  0.91828728055    1.21945072483  0.91580793826
8. HHT2 vs HHF2       0.91118709267  0.91118675493    0.91826322366  0.91830094799    1.23364022910  0.91526875320
9. HHT2 vs HHF1       0.91140593970  0.91141769097    0.91827828813  0.91830094799    1.21156652305  0.91614594240
10. HHT2 vs HTB1      0.91117518882  0.91118675493    0.91830960404  0.91830094799    1.22856000141  0.91545684638
11. HHT2 vs HTA2      0.91119595238  0.91118675493    0.91828573507  0.91830094799    1.22731286822  0.91551952922
12. HHT2 vs HTA1      0.91117461806  0.91118675493    0.91829158429  0.91830094799    1.22869991210  0.91545684638
13. HHT2 vs HTB2      0.91141021778  0.91141769097    0.91826222762  0.91830094799    1.20407581409  0.91645886138
14. HHF2 vs HHF1      0.91094363770  0.91095599554    0.91827828813  0.91826322366    1.24325880090  0.91492068138
15. HHF2 vs HTB1      0.91094390643  0.91095599554    0.91830960404  0.91826322366    1.24304825264  0.91492068138
16. HHF2 vs HTA2      0.91119595238  0.91118709267    0.91828573507  0.91826322366    1.22572998267  0.91560940556
17. HHF2 vs HTA1      0.91094335034  0.91095599554    0.91829158429  0.91826322366    1.24319687074  0.91492068138
18. HHF2 vs HTB2      0.91117923737  0.91118709267    0.91826222762  0.91826322366    1.22101447600  0.91579708515
19. HHF1 vs HTB1      0.91117518882  0.91117488905    0.91830960404  0.91827828813    1.23174579691  0.91532898860
20. HHF1 vs HTA2      0.91119595238  0.91117488905    0.91828573507  0.91827828813    1.21944649096  0.91583045984
21. HHF1 vs HTA1      0.91117461806  0.91117488905    0.91829158429  0.91827828813    1.23509461493  0.91520354578
22. HHF1 vs HTB2      0.91117923737  0.91117488905    0.91826222762  0.91827828813    1.22731748314  0.91551709685
23. HTB1 vs HTA2      0.91119595238  0.91117518882    0.91828573507  0.91830960404    1.22731286822  0.91548538926
24. HTB1 vs HTA1      0.91094335034  0.91094390643    0.91829158429  0.91830960404    1.24647177789  0.91473082586
25. HTB1 vs HTB2      0.91117923737  0.91117518882    0.91826222762  0.91830960404    1.21478819056  0.91598780701
26. HTA2 vs HTA1      0.91117461806  0.91119595238    0.91829158429  0.91828573507    1.23029134987  0.91541952691
27. HTA2 vs HTB2      0.91117923737  0.91119595238    0.91826222762  0.91828573507    1.21324356844  0.91610766862
28. HTA1 vs HTB2      0.91117923737  0.91117461806    0.91826222762  0.91829158429    1.21633758724  0.91594366975

Table 4
Akaike's information criterion (AIC) by type

Truncation: a = 0.3

U vs V                Type I            Type II           Type III
1. HHT1 vs HHT2       −70.8521157312    306.7020817934    307.9021568838
2. HHT1 vs HHF2       −70.9700265966    306.4619532701    307.2731127496
3. HHT1 vs HHF1       −70.8164804306    306.6831562838    308.0322387680
4. HHT1 vs HTB1       −70.9415181960    306.6310312178    307.2802644766
5. HHT1 vs HTA2       −70.8291759075    306.9770682957    308.3025855647
6. HHT1 vs HTA1       −70.9632626266    306.6072269198    307.2985869113
7. HHT1 vs HTB2       −70.6739916220    306.8164099732    308.9009928565
8. HHT2 vs HHF2       −70.8544692475    306.6585005289    307.9102452034
9. HHT2 vs HHF1       −70.4637443736    307.0405391701    309.6790153094
10. HHT2 vs HTB1      −70.7707748251    306.9070982775    308.1946917129
11. HHT2 vs HTA2      −70.6622613158    306.7766716690    308.4105863390
12. HHT2 vs HTA1      −70.7521009098    306.8107610485    308.2462265023
13. HHT2 vs HTB2      −70.3385426103    306.9931918476    310.1698264737
14. HHF2 vs HHF1      −70.9302011373    306.3746847371    307.3203687860
15. HHF2 vs HTB1      −70.9998160819    306.5793843911    307.1255020018
16. HHF2 vs HTA2      −70.7133726549    306.7510930792    308.4622706962
17. HHF2 vs HTA1      −70.9374070629    306.4528045756    307.2797123517
18. HHF2 vs HTB2      −70.6311723271    306.6846689390    308.9294755331
19. HHF1 vs HTB1      −70.7835084043    306.8059150562    308.1215389426
20. HHF1 vs HTA2      −70.6480622539    306.9413047371    308.9076500977
21. HHF1 vs HTA1      −70.7953054175    306.6641396677    307.9774620809
22. HHF1 vs HTB2      −70.6726502437    306.5861748581    308.5997668282
23. HTB1 vs HTA2      −70.7230954133    306.9193072923    308.3459880729
24. HTB1 vs HTA1      −70.9668295265    306.6117003327    307.0611720721
25. HTB1 vs HTB2      −70.5924158111    306.9426720800    309.1881762701
26. HTA2 vs HTA1      −70.8385432282    306.8058250092    307.9749384984
27. HTA2 vs HTB2      −70.5788268299    306.8644176372    309.2637324120
28. HTA1 vs HTB2      −70.5859402194    306.9014941480    309.1650709023

the method of maximum likelihood estimation. The problem is $\partial L(\theta; U, V)/\partial\theta$ (or $\partial \log L(\theta; U, V)/\partial\theta$), which is not a function of $\theta$. Therefore, we cannot estimate the parameter $\theta$ using the likelihood function. Instead, we used a numerical method to find a value of $\theta$ to maximize the copula function. The procedures for finding an estimate of $\theta$ are as follows:

Step 0. Select a model which is Type I, II, or III.

Step 1. Find the range of $\theta$ from (5) under $a = 0.3$, $\alpha = 2$, and $\beta = 2$ for Type II, and under $a = 0.3$, $\alpha = 2$, and $\beta = 3$ for Type III.

Step 2. Define a grid unit and move the grid from the minimum to the maximum in the range of $\theta$ for Type I, II, or III (grid unit in the data analysis: 0.001 for Type I; 0.005 for Types II and III).

Step 3. Calculate values of $\theta$ from (10) as the grid increases.

Step 4. Select the value of $\theta$ maximizing (10).

Figure 2. Plots of Akaike's Information Criteria (AIC) by Type under Truncation 0.3.
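Steps 0–4 can be sketched as a plain grid search. The Python example below is a minimal sketch under several assumptions that are not from the article: it uses the bivariate FGM (Type I) density $c(u,v) = 1+\theta(1-2u)(1-2v)$ as a stand-in for the truncated survival densities, it uses synthetic data drawn from that copula instead of the yeast expression data, and every function name is hypothetical. The grid unit 0.001 is the one quoted for Type I in Step 2, and the AIC is computed with $t = 1$ parameter.

```python
import math
import random

def fgm_density(u, v, theta):
    """Density of the bivariate FGM copula (stand-in model for this sketch)."""
    return 1.0 + theta * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)

def pseudo_observations(xs):
    """Empirical-CDF transform rank/(n+1), so values stay strictly in (0, 1)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0] * len(xs)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return [r / (len(xs) + 1.0) for r in ranks]

def log_likelihood(theta, us, vs):
    """Logarithm of the likelihood (9) for the stand-in density."""
    return sum(math.log(fgm_density(u, v, theta)) for u, v in zip(us, vs))

def grid_search(us, vs, lower=-1.0, upper=1.0, step=0.001):
    """Steps 2-4: scan theta over [lower, upper] and keep the maximizer."""
    n_steps = int(round((upper - lower) / step))
    best_theta, best_ll = None, -math.inf
    for k in range(n_steps + 1):
        theta = min(upper, lower + k * step)
        ll = log_likelihood(theta, us, vs)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta, best_ll

def aic(log_lik, n_params=1):
    """AIC = -2 log L + 2t."""
    return -2.0 * log_lik + 2.0 * n_params

def sample_fgm(n, theta, rng):
    """Synthetic FGM pairs by conditional inversion (illustration only)."""
    xs, ys = [], []
    for _ in range(n):
        u, w = rng.random(), rng.random()
        a = theta * (1.0 - 2.0 * u)
        if abs(a) < 1e-12:
            v = w
        else:
            v = ((1.0 + a) - math.sqrt((1.0 + a) ** 2 - 4.0 * a * w)) / (2.0 * a)
        xs.append(u)
        ys.append(v)
    return xs, ys

if __name__ == "__main__":
    xs, ys = sample_fgm(500, 0.5, random.Random(0))
    us, vs = pseudo_observations(xs), pseudo_observations(ys)
    theta_hat, ll = grid_search(us, vs)
    print("theta_hat:", theta_hat, "AIC:", aic(ll))
```

A grid search is slow compared with a derivative-based optimizer, but it needs only likelihood evaluations and respects the admissible range of $\theta$ by construction, which matches the procedure described in the steps.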

Table 3 shows the measures of dependence of two variables $u_1$ and $u_2$ under the truncated variable $u_3 \ge a$ for three different survival truncated FGM-type modification copulas. In Fig. 1, we see that the mean plots of $r_{U|V,a=0.3}$ and $r_{V|U,a=0.3}$ for Type I are almost identical, whereas the mean plots of $r_{U|V,a=0.3}$ and $r_{V|U,a=0.3}$ for Type II ($\alpha = 2$ and $\beta = 2$) are not identical even though we set Type II ($\alpha = 2$ and $\beta = 2$) to be symmetric. This indicates that Type II has more directional dependence response, depending on the truncated value $a$ and the parameter $\theta$. Also, the mean plots of $r_{U|V,a=0.3}$ and $r_{V|U,a=0.3}$ for Type III ($\alpha = 2$ and $\beta = 3$) differ more than those of Type I because we set Type III ($\alpha = 2$ and $\beta = 3$) to be nonsymmetric.

The values of $r_{U|V,a}$ and $r_{V|U,a}$ for Type I, Type II ($\alpha = 2$ and $\beta = 2$), and Type III ($\alpha = 2$ and $\beta = 3$) are shown in Table 3. The values of $r_{U|V,a}$ and $r_{V|U,a}$ for Type I are almost identical. Therefore, we can conclude that Type I has no directional dependence property and that Type II and Type III have directional dependence, unlike Type I. In addition, Fig. 1 provides evidence that Type II and Type III are able to recognize directional dependence between any two histone genes. In terms of the goodness of fit using gene data, the AIC values of Type I, Type II ($\alpha = 2$ and $\beta = 2$), and Type III ($\alpha = 2$ and $\beta = 3$) in Table 4 and Fig. 2 indicate that the Type I model is better than the Type II and Type III models because the values of AIC using the survival truncated FGM-type modification copula Type I are much smaller than those using the survival truncated FGM-type modification copulas Type II and Type III. In addition, the Type II ($\alpha = 2$ and $\beta = 2$) model is better than the Type III ($\alpha = 2$ and $\beta = 3$) model because the AIC values of Type II ($\alpha = 2$ and $\beta = 2$) are slightly smaller than those of Type III ($\alpha = 2$ and $\beta = 3$).

5. Conclusions

Dependence properties and measures of association between two or more variables can be investigated in terms of various copulas. In this article, we have presented a flexible new class of survival truncated FGM type modification copulas. Using the described methods, we showed that the survival truncated FGM type modification copulas (Type II and Type III) have directional dependence. Therefore, the survival truncated FGM type modification copula models (Type II and Type III) can help to determine directional dependence among three different genes. Our methods were evaluated using the AIC. More work is needed, however, to refine our ability to analyze directional dependence within a three-variable framework, and to test these new tools with datasets where there is clearer evidence of unidirectional gene dependence. In addition, we need to develop other goodness of fit criteria to select the best multivariate copula model for a given gene dataset.

Acknowledgments

The authors are thankful to the Editor, Associate Editor, and the two referees for valuable comments on the original version of this manuscript, which led to substantial improvement.

References

Bairamov, I., Kotz, S. (2002). Dependence structure and symmetry of Huang–Kotz distributions and their extensions. Metrika 56(1):55–72.

Bairamov, I., Kotz, S. (2003). On a new family of positive quadrant dependent bivariate distributions. International Mathematical Journal 3(11):1247–1254.

Bairamov, I., Eryilmaz, S. (2004). Characterization of symmetry and exceedance models in multivariate FGM distributions. Journal of Applied Statistical Science 13(2):87–99.

Bairamov, I., Kotz, S., Bekci, M. (2000). New generalized F-G-M distributions and concomitants of order statistics. Journal of Applied Statistics 28:521–536.

Cherubini, U., Luciano, E., Vecchiato, W. (2004). Copula Methods in Finance. Wiley Finance Series. Chichester: John Wiley & Sons, Ltd.

Cook, R. D., Johnson, M. E. (1981). A family of distributions for modelling non-elliptically symmetric multivariate data. Journal of the Royal Statistical Society Series B 43:210–219.

Jung, Y.-S., Kim, J.-M., Sungur, E. A. (2007). Directional dependence of truncation invariant FGM copula functions: application to foreign exchange currency data. Unpublished.

Kim, J.-M., Jung, Y., Sungur, E. A., Han, K., Park, C., Sohn, I. (2008). A copula method for modeling directional dependence of genes. BMC Bioinformatics 9:225.

Lai, C. D., Xie, M. (2000). A new family of positive quadrant dependent bivariate distributions. Statistics and Probability Letters 46:359–364.

Nelsen, R. B. (1999). An Introduction to Copulas. New York: Springer-Verlag.

Rodríguez-Lallena, J. A., Úbeda-Flores, M. (2004). A new class of bivariate copulas. Statistics and Probability Letters 66:315–325.

Sklar, A. (1959). Fonctions de répartition à n dimensions et leurs marges. (French) Publ. Inst. Statist. Univ. Paris 8:229–231.

Sklar, A. (1973). Random variables, joint distribution functions, and copulas. Kybernetika (Prague) 9:449–460.

Spellman, P. T., Sherlock, G., Zhang, M. Q., Iyer, V. R., Anders, K., Eisen, M. B., Brown, P. O., Botstein, D., Futcher, B. (1998). Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell 9:3273–3297.

Sungur, E. A. (1999). Truncation invariant dependence structures. Communications in Statistics - Theory and Methods 28(11):2553–2568.

Sungur, E. A. (2005). A note on directional dependence in regression setting. Communications in Statistics - Theory and Methods 34:1957–1965.