Download pdf - Analysis of dose-response microarray data using Bayesian ...Analysis of dose-response microarray data using Bayesian Variable Selection (BVS) methods: Modeling and multiplicity adjustments

Analysis of dose-response microarray data using Bayesian Variable Selection (BVS) methods:

Modeling and multiplicity adjustments

Adetayo Kasim

Durham University UK

Outline

• Introduction to dose-response modeling in microarray experiments.

• Bayesian estimation in the presence of equality constraints

• Inference for monotone genes

• Multiplicity adjustment

• Discussion

• Current work: Bayesian isotonic regression

Dose-response Microarray Experiments

Dose-response Microarray studies

Biological information from gene expression data create

new opportunities for developing effective therapies:

To understand mechanism of action of a new treatment.

To explore the desired properties (efficacy/toxicity…).

Explore functions of genes/pathways in a dose-dependency

manner .

Case Study: Human epidermal squamous carcinoma celllines

– 4 dose levels.

– 12 arrays.

– 16,998 genes measured on each

array.

EGF (ng/ml)

Dose 0 1 10 100

# of arrays

3 3 3 3

Examples: Dose-response relationships with gene expression

Bayesian Estimation in the Presence of equality constraint between parameters

Objective

Primary interest:

• Discovery of genes with monotone relationship with respect to dose.

Order restricted inference.

• Simple order (monotone) alternatives.

Model Formulation

2,~ iij NY

Estimation under strict inequality constraints

• Order constraints of priors (Gelfand et al., 1992).

Kg ,...,: 10

2,~ iij NY

),(,~ 11

2

iii IN

otherwise

NP iii

0

,,| 11

2

Specification of the prior :

unconstrained prior.

Likelihood:

2, N

Model Formulation

1

1

0

)|()|(K

K

S

SPyP

The posterior distribution, given the order constraints, is the

same as the unconstrained distribution defined on the

constraints set.

The constraints set

Model Formulation

K

K

H

H

,...,:

,...,:

101

100

? What happen if there is equality constraints between parameters.

Model Formulation

•The null model

32107

32106

32105

32104

32103

32102

32101

32100

:

:

:

:

:

:

:

:

g

g

g

g

g

g

g

g

32101 : H

• We decompose the simple order alternative to all sub alternatives.

• All possible monotone models

Model Formulation

• Monotone models

32107

32106

32105

32104

32103

32102

32101

32100

:

:

:

:

:

:

:

:

g

g

g

g

g

g

g

g

μμμμg

μμμg

μμμg

μμμg

μμg

μμg

μμg

μg

3210

'

7

2310

'

6

3201

'

5

3120

1

4

1230

'

3

2301

'

2

3012

'

1

0123

'

0

:

:

:

:

:

:

:

:

Model Formulation

32105 : g

32107 : g

1.0 1.5 2.0 2.5 3.0 3.5 4.0

8.2

8.4

8.6

8.8

9.0

dose

ge

ne

exp

ressio

n

32107 : g

32105 : g

•We fitted two monotone models:

Equality constraints are replaced with a single parameter.

Model Formulation

Bayesian Variable Selection Method

ijKjKjjjij xxxxY 1322110

Alternative approach,

Where X is a design matrix with ordered columns, reflecting the direction of the monotone constraints

0l

2,0~ Nij

2,~ iij NY

01

0

i

i

),0(,~ 2 IN

32103

2102

101

0

d

d

d

c

dose mean

2

0 ,~ N

•Alternative approach •For a dose-response experiment with 4 dose levels (control + 3 doses):

Ki ,...,0 10


•Simple order alternative.

32107

32106

32105

32104

32103

32102

32101

32100

:

:

:

:

:

:

:

:

g

g

g

g

g

g

g

g

0 ;0 ;0:

0 ;0 ;0:

0 ;0 ;0:

0 ;0 ;0:

0 ;0 ;0:

0 ;0 ;0:

0 ;0 ;0:

0 ;0 ;0:

3217

3216

3215

3214

3213

3212

3211

3210

g

g

g

g

g

g

g

g


i

i

iz

0

1

•The mean structure:

included in the model

not Included in the model

i

i

1

0

•Bayesian Variable Selection: a procedure of deciding which of the model parameters is equal to zero. •Define an indicator variable:


K

i

iii z1

0

K

K

r Sg ,...,,: 10

1

),0(,~ 2 INi

)(~ ii Bz

)1,0(~ Ui

2

0 ,~ N

•The mean structure for a candidate model:

Order restrictions Variable selection


1.0 1.5 2.0 2.5 3.0 3.5 4.0

8.2

8.4

8.6

8.8

9.0

dose

ge

ne

exp

ressio

n

g_7

g_5

BVS

g7

g5

BVS


Inference for Monotone Genes

101

100

:

:

H

H

• suppose we want to identify genes with differential

expression between the control dose and the first dose.

Comparisons Between Two Doses/Groups

);( );( );( 22

00

2

- NNNm

• Inference is based on the posterior probabilities for each to belong to the non-differential components

• Down regulated component

• Non differential component

• Up regulated component

0

00

0


• Inference is based on the posterior probabilities for each gene to belong to the non-differential component

• The posterior probabilities could be treated like p-values

• The differentially expressed would be expected to have very

low probability if belonging the null component

),|( 10 HHpp mm


• Inference is based on the posterior probabilities for each gene to belong to the non-differential component

• The posterior probabilities could be treated like p-values

• The differentially expressed would be expected to have very

low probability if belonging the null component

),|( 10 HHpp mm


• We assumed a gene specific model for the present

approach


K

K

H

H

,...,:

,...,:

101

100

R

r

rgij gNy0

2 );(

• Inference is based on the posterior probabilities for each to belong to the non-differential components

• Where R=7 and

32107

32101

32100

g

g

g


• We need the posterior probabilities for each gene to belong to the null model

),,|( 2

0 rij gygpp


• Which is equivalent to the posterior probabilities of flat

profile in the Bayesian variable selection approach.

),,|( 2

0 rij gygpp


),,|0,,( 2

0321 ijyδδδp

K

i

iii z1

0

K

K

r Sg ,...,,: 10

1

),0(,~ 2 INi

)(~ ii Bz

)1,0(~ Ui

2

0 ,~ N

•The mean structure for a candidate model:

Order restrictions Variable selection


32107

32106

32105

32104

32103

32102

32101

32100

:

:

:

:

:

:

:

:

g

g

g

g

g

g

g

g

)1,1,1(

)0,1,1(

)1,1,0(

)1,0,1(

)0,1,0(

)0,0,1(

)1,0,0(

)0,0,0(

z

z

z

z

z

z

z

z

•4 dose levels. •The triplet defines uniquely all candidate models: ),,( 321 zzzz

The set of off possible monotone models for an experiment with 4 dose levels


),|(),|)0,0,0(( 0 RdatagpRdatazp

3210

1

0

K

i

iii z

),,|)0,0,0(( 2

0321 ijyzzzzp

•The posterior probability that the triplet equal to zero: )0,0,0(z


514.0),|( 0 Rdatagp

g0 g3 g2 g6 g1 g4 g5 g7

0.0

0.1

0.2

0.3

0.4

0.5

•The highest posterior probability is obtained for the null model

(0.514).


4186.0),|( 5 Rdatagp

001.0),|( 0 Rdatagp

g0 g3 g2 g6 g1 g4 g5 g7

0.0

0.1

0.2

0.3

0.4

4059.0),|( 1 Rdatagp

•The highest posterior probability is obtained for model g5.

•Data do not support the null model.


Multiplicity Adjustment

)(

)()(

N

cFDcFDR

•Choose τ such that


),|( 0 Rdatagpmgene m is included in the discovery list

),|( 0 Rdatagpmthe posterior probability of the null model = the probability that we make a mistake when we include the gene in the discovery list.

)(

)()(

N

cFDcFDR

the false discovery rate for a discovery list in which the g’th gene and all other genes with smallest posterior probabilities of the null model are included (Newton 2004,2007).


),|(0

),|(1

0

0

Rdatagp

RdatagpI

m

m

m

)(N

gene m is included in the discovery list

gene m is not included in the discovery list

The number of genes in the discovery list.

M

m

mIN1

)(

•Primary interest: discovery of subset of genes with monotone relationship with respect to dose.


)(),|()(1

0 cFDIRdatagpFDEM

m

mm

)(

)()(

N

cFDcFDR

•The conditional (on the data) expected number of false discoveries (in the discovery list):

•The conditional false discovery rate:

•Choose τ such that .)( cFDR


%5

3295

,,|

)102.0(

0

Rdatazgp

cFDR

g

The expected error rate for the list with all genes for which the posterior probability of the null model < 0.102 are included.

τ


0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

Cut-off

FD

R

TRUE FDR

Estmated FDR


• From Simulation Study

• BVS methods: estimation and inference.

• Multiplicity adjustment is based on the posterior probability of the null model.

• Connection between BVS and MCT.

• Connection between BVS and Bayesian model averaging.

• BVS for order restricted but non-monotone alternatives (umbrella alternatives/partial order alternatives).

• Posterior probabilities for the number of levels and the level probabilities for isotonic regressions.

Discussion

Current Work: Bayesian Isotonic Transformation

• Motivated by Dunson and Neelon (2003)

• Generate posterior samples from unconstrained full conditional distributions

for the model parameters

• Obtain constrained samples through isotonic transformation of the

unconstrained samples.

Current Work: Bayesian Isotonic Regression

• Examples


• Bayes factor for model selection

• The Bayes factor account for the fact that the model not equally like under the

null model


• Probability under the null model

• The probability under the null model is equivalent to to level probability in

ORIC (Anraku, 1999)


• The Bayesian isotonic transformation approach provide good estimates of dose-

specific means under simple ordered constraints

• However, adjusting for multiplicity is less straight forward for this approach.

Research Team

• Ziv Shkedy.

• Luc Bijnens.

• Willem Talloen.

• Hinrich Gohlmann.

• Dhammika Amaratunga

Hasselt University, Belgium Johnson & Johnson Pharmaceutical

Durham University, UK

• Adetayo Kasim.

Imperial College, UK

• Bernet Kato.

pfizer, Belgium

• Dan Lin