Rationalizing CMap Gene Expression Readouts via Target Prediction

Nolen Joy Perualila
Non-Clinical Statistics Conference 2014
9 October 2014

RESEARCH GROUP

Hasselt University, Belgium: Nolen Joy Perualila, Ziv Shkedy

Durham University, UK: Adetayo Kasim

Cambridge University, UK: Aakash Chavan Ravindranath, Andreas Bender

Janssen Pharmaceutica NV, Belgium: Luc Bijnens, Willem Talloen, Hinrich W.H. Gohlmann

QSTAR http://qstar-consortium.org

N. J. Perualila NCS2014 2/18

OUTLINE

1 Background
   Mechanism of Action (MoA) of Compounds

2 Data Sources
   Target prediction Data
   Gene expression Data

3 Analysis Flow

4 Results
   Associating Protein Targets to compounds
   Associating Genes with compounds
   Gene-set Enrichment
   Using Pathways to understand MoA
   Biclustering of CMap Gene expression data

5 Discussion

MOA OF COMPOUNDS

Aim: To find subsets of compounds that share similar target prediction and gene expression profiles.

[Figure labels: Connectivity Map; In silico]

EARLY DRUG DISCOVERY

The development of every drug begins with the search for a target on which the drug can act.

Lead compounds must be able to bind well to the target protein, like a key into a lock.

If a drug binds to one protein (its drug target), it may also bind to another one or to many others (non-selective ligands).

Which drugs will bind to which protein?

[Figure: compound and protein target. Image source: http://vds.cm.utexas.edu/]

OVERVIEW: TARGET PREDICTION TOOL

Calculates the likelihood of binding for every protein target (see Koutsoukas, 2011).

COMPOUNDS, TARGETS, AND GENES

Ligand binding modifies the biological functions of the protein target; a series of target-related downstream genes is then influenced.

Drugs sharing common targets result in similar gene-expression profiles.

image source: http://cc.scu.edu.cn/G2S/eWebEditor/uploadfile/20120810155023582.jpg

DATA SOURCES

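1 The target prediction score matrix (binary; J targets x I compounds)

$$
T = \begin{pmatrix}
T_{11} & T_{12} & \cdots & T_{1I} \\
T_{21} & T_{22} & \cdots & T_{2I} \\
\vdots & \vdots & \ddots & \vdots \\
T_{J1} & T_{J2} & \cdots & T_{JI}
\end{pmatrix},
\qquad
T_{ji} = \begin{cases}
1 & \text{if compound } i \text{ hits target } j, \\
0 & \text{otherwise.}
\end{cases}
$$

2 The gene expression matrix (G genes x I compounds)

$$
X = \begin{pmatrix}
X_{11} & X_{12} & \cdots & X_{1I} \\
X_{21} & X_{22} & \cdots & X_{2I} \\
\vdots & \vdots & \ddots & \vdots \\
X_{G1} & X_{G2} & \cdots & X_{GI}
\end{pmatrix},
\qquad
X_{gi} = \text{expression level of gene } g \text{ for compound } i.
$$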
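As a rough illustration of the binary target matrix defined on this slide, the sketch below builds T from per-compound hit lists. The compound names, target names, and the `predicted_hits` dictionary are hypothetical toy values, not data from the talk, and the helper `build_target_matrix` is only one possible way to lay the matrix out.

```python
# Hypothetical toy data: names and values are illustrative only.
compounds = ["cpd_A", "cpd_B", "cpd_C"]
targets = ["target_1", "target_2"]

# Assumed output of a target prediction tool: the set of targets hit per compound.
predicted_hits = {
    "cpd_A": {"target_1"},
    "cpd_B": {"target_1", "target_2"},
    "cpd_C": set(),
}


def build_target_matrix(targets, compounds, hits):
    """Return T as a J x I list of lists; T[j][i] is 1 if compound i hits target j."""
    return [
        [1 if target in hits.get(compound, set()) else 0 for compound in compounds]
        for target in targets
    ]


T = build_target_matrix(targets, compounds, predicted_hits)
```

Rows index targets and columns index compounds, matching the J x I orientation used on the slide; the expression matrix X can be stored the same way with genes as rows.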

APPLICATION: CONNECTIVITY MAP

I = 35 compounds; MCF7 cell line, 6 hours after exposure, dose of 10 µM.

G ≈ 2400 genes after pre-processing.

J = 477 protein targets.

ANALYSIS FLOW

Step I: Target prediction-based clustering of compounds, yielding a cluster of compounds.

Step II: For every target-driven compound cluster:

What are the shared targets? Fisher's exact test: top K targets, mapped to pathways.

What genes are differentially expressed? LIMMA: top N differentially expressed genes, mapped to pathways (MLP).

Which biological pathways are affected? Overlap of the target-based and gene-based pathways.

[Figure: 2 x 2 table for Fisher's exact test on target j: hit status (1 / 0) by cluster membership (in / out).]
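The 2 x 2 table above can be tested per target roughly as follows. This is a minimal sketch that assumes a one-sided (enrichment) alternative, which the slides do not specify; the function names `fisher_exact_greater` and `top_k_targets` are illustrative, and only the Python standard library is used.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for the 2 x 2 table [[a, b], [c, d]].

    a: compounds in the cluster that hit target j
    b: compounds in the cluster that do not hit target j
    c: compounds outside the cluster that hit target j
    d: compounds outside the cluster that do not hit target j
    Returns the hypergeometric probability of at least `a` in-cluster hits.
    """
    n_in = a + b
    n_hit = a + c
    n_total = a + b + c + d
    upper = min(n_in, n_hit)
    favourable = sum(
        comb(n_hit, k) * comb(n_total - n_hit, n_in - k) for k in range(a, upper + 1)
    )
    return favourable / comb(n_total, n_in)


def top_k_targets(T, in_cluster, k):
    """Rank targets (rows of T) by p-value; in_cluster[i] is True for cluster members."""
    scored = []
    for j, row in enumerate(T):
        a = sum(1 for hit, member in zip(row, in_cluster) if hit and member)
        b = sum(1 for hit, member in zip(row, in_cluster) if not hit and member)
        c = sum(1 for hit, member in zip(row, in_cluster) if hit and not member)
        d = sum(1 for hit, member in zip(row, in_cluster) if not hit and not member)
        scored.append((fisher_exact_greater(a, b, c, d), j))
    return sorted(scored)[:k]
```

With 35 compounds and a cluster of 6, a target hit by all 6 cluster members and by no other compound gives `fisher_exact_greater(6, 0, 0, 29)`, the smallest p-value attainable for that cluster size.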

TARGET PREDICTION-BASED CLUSTERING

Similarity matrix based on Tanimoto scores.
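For binary target profiles, the Tanimoto score between two compounds is the number of shared predicted targets divided by the number of targets predicted for either compound. The sketch below computes the compound-by-compound similarity matrix from T; treating two compounds with no predicted targets as having similarity 0 is an assumption made here, and the function names are illustrative.

```python
def tanimoto(u, v):
    """Tanimoto similarity between two binary vectors of equal length."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    # Assumption: two all-zero profiles are given similarity 0.
    return both / either if either else 0.0


def similarity_matrix(T):
    """T is J x I (targets by compounds); returns the I x I compound similarity matrix."""
    columns = list(zip(*T))  # one binary target profile per compound
    return [[tanimoto(ci, cj) for cj in columns] for ci in columns]
```

A matrix of this kind (or one minus it, as a distance) is what a clustering routine would take as input in Step I; the specific clustering algorithm is not stated on the slide.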

[Figure: heatmap of the Tanimoto similarity matrix between compounds (color key: 0 to 0.6). Compounds, in display order: thioridazine, chlorpromazine, prochlorperazine, clozapine, trifluoperazine, fluphenazine, haloperidol, verapamil, dexverapamil, felodipine, nifedipine, nitrendipine, arachidonyltrifluoromethane, 15-delta prostaglandin J2, arachidonic acid, celecoxib, W-13, metformin, tetraethylenepentamine, phenformin, phenyl biguanide, rofecoxib, LM-1685, SC-58125, diclofenac, 4,5-dianilinophthalimide, flufenamic acid, N-phenylanthranilic acid, probucol, butein, benserazide, LY-294002, tioguanine, fasudil, imatinib.]

ASSOCIATING TARGETS TO COMPOUNDS

Identify the top predicted protein targets of compounds in cluster 1.

[Figure: heatmap of the top predicted targets (rows) against compounds (columns). Target labels as printed: X5.hydroxyr_6, D.1B._dopator, Muscarinic_M3, Muscarinic_M1, Cytochrome2D6, NADPH_oxide_1, Histamine_tor, D.2._dopamtor.]

ASSOCIATING GENES WITH COMPOUNDS

Identify the most differentially expressed genes for compounds in cluster 1.
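The talk uses LIMMA (an R/Bioconductor package) for this step. As a simplified stand-in, and not LIMMA's moderated t-statistic, the sketch below ranks genes by an ordinary Welch t statistic comparing cluster members with the remaining compounds. The function names are illustrative; each group needs at least two compounds and a non-zero pooled standard error.

```python
from statistics import mean, variance


def welch_t(in_values, out_values):
    """Welch's t statistic: cluster members versus the remaining compounds."""
    se2 = variance(in_values) / len(in_values) + variance(out_values) / len(out_values)
    return (mean(in_values) - mean(out_values)) / se2 ** 0.5


def top_n_genes(X, gene_names, in_cluster, n):
    """Rank genes (rows of X) by |t| for cluster versus non-cluster compounds."""
    scored = []
    for name, row in zip(gene_names, X):
        inside = [x for x, member in zip(row, in_cluster) if member]
        outside = [x for x, member in zip(row, in_cluster) if not member]
        scored.append((abs(welch_t(inside, outside)), name))
    return [name for _, name in sorted(scored, reverse=True)[:n]]
```

LIMMA additionally shrinks the per-gene variances towards a common value, which matters with few compounds per cluster; the plain statistic here only illustrates the cluster-versus-rest contrast.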

[Figure: volcano plot of -log p-value against log fold change. Labeled genes: IDI1, SQLE, MSMO1, INSIG1, MNT, SRSF7, HMGCS1, CCR1, CCNG2, KLHL24, PPIF, SLC38A2, NPC2, SGCE, PNO1, BARD1, LPIN1, HMGCR, TGS1, LDLR.]

[Figure: profiles of the top genes across the 35 compounds (y-axis label as printed: log2 concentration). Genes shown: IDI1, INSIG1, MSMO1, LPIN1, SQLE, HMGCS1, NPC2, BHLHE40.]

BIOLOGICAL PATHWAYS: CLUSTER 1

Compounds: clozapine, thioridazine, chlorpromazine, trifluoperazine, prochlorperazine, fluphenazine
Pathway: Steroid metabolic process
Target: Cytochrome P450 2D6
Genes: INSIG1, LDLR

GO pathways containing the top gene sets with MLP analysis.

GO:0006695 cholesterol biosynthetic
GO:0016126 sterol biosynthetic
GO:0008203 cholesterol metabolic
GO:0016125 sterol metabolic
GO:0006694 steroid biosynthetic
GO:0006066 alcohol metabolic
GO:0008202 steroid metabolic
GO:0008610 lipid biosynthetic
GO:0046165 alcohol biosynthetic

Top genes contributing to cholesterol biosynthetic process.

DHCR24: 24-dehydrocholesterol reductase
G6PD: glucose-6-phosphate dehydrogenase
HMGCR: 3-hydroxy-3-methylglutaryl-CoA reductas
HMGCS1: 3-hydroxy-3-methylglutaryl-CoA synthas
IDI1: isopentenyl-diphosphate delta isomerase
INSIG1: insulin induced gene 1
INSIG2: insulin induced gene 2
PEX2: peroxisomal biogenesis factor 2
MSMO1: methylsterol monooxygenase 1
SOD1: superoxide dismutase 1, soluble
SQLE: squalene epoxidase
CNBP: CCHC-type zinc finger, nucleic acid bind

[Figure: significance of tested genes involved in gene set GO:0006695 (x-axis: Significance, 0 to 5).]
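MLP scores a gene set by the mean of the minus log10 p-values of its genes. The sketch below computes that statistic and a simple permutation p-value from a dictionary of per-gene p-values. It is a simplified illustration, not the MLP package's implementation: the function names, the permutation scheme, and the default `n_perm` and `seed` values are assumptions.

```python
import math
import random


def mlp_statistic(p_values, gene_set):
    """Mean of -log10(p) over the genes of a gene set that were actually tested."""
    tested = [g for g in gene_set if g in p_values]
    return sum(-math.log10(p_values[g]) for g in tested) / len(tested)


def permutation_p_value(p_values, gene_set, n_perm=1000, seed=0):
    """Share of random gene sets of the same size scoring at least as high."""
    rng = random.Random(seed)
    genes = list(p_values)
    size = sum(1 for g in gene_set if g in p_values)
    observed = mlp_statistic(p_values, gene_set)
    exceed = sum(
        1
        for _ in range(n_perm)
        if mlp_statistic(p_values, rng.sample(genes, size)) >= observed
    )
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

The per-gene p-values would come from the differential expression step; a gene set with no tested genes, or a p-value of exactly zero, is not handled in this sketch.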

[Figure: clozapine, thioridazine, chlorpromazine, trifluoperazine, prochlorperazine and fluphenazine linked to the steroid metabolic process through the target CYP450 2D6 and the genes INSIG1 and LDLR.]

GENE EXPRESSION DATA ANALYSIS

$$
X = \begin{pmatrix}
X_{11} & X_{12} & \cdots & X_{1I} \\
X_{21} & X_{22} & \cdots & X_{2I} \\
\vdots & \vdots & \ddots & \vdots \\
X_{G1} & X_{G2} & \cdots & X_{GI}
\end{pmatrix}
$$

[Figure: Heatmap of the Expression Profiles, genes (rows) by compounds (columns).]

Biclustering of gene expression data provides a simultaneous local search for a subset of genes with similar expression profiles across a subset of compounds.

[Figure: heatmap of the expression profiles (genes by compounds), with rows and columns reordered.]
BICLUSTERING OF GENE EXPRESSION DATA

Target prediction-based clustering of compounds + identification of differentially expressed genes for a compound cluster of interest
⇒ gives a subset of genes exhibiting consistent patterns over a subset of compounds
⇒ a bicluster.

Biclustering of gene expression data using FABIA identifies a similar cluster of compounds and subset of genes.

[Figure: FABIA Bicluster 1. Gene profiles across compounds (y-axis label as printed: log2 concentration; legend: Genes, HC, Fabia). The first compounds on the axis are thioridazine, chlorpromazine, prochlorperazine, clozapine, trifluoperazine, fluphenazine, haloperidol, fasudil and imatinib.]
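For readers unfamiliar with FABIA, its model can be written in the notation of this talk as below. This is stated from the FABIA literature rather than from the slides, and the symbols K, λ_k, z_k and Υ are introduced here for illustration.

```latex
X \;=\; \sum_{k=1}^{K} \lambda_k \, z_k^{\top} \;+\; \Upsilon,
\qquad
\lambda_k \in \mathbb{R}^{G}, \quad z_k \in \mathbb{R}^{I}
```

Each of the K biclusters is a rank-one term: λ_k holds sparse gene loadings and z_k holds sparse compound factors, with Υ as additive noise. The genes and compounds with large absolute entries in λ_k and z_k form bicluster k.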

DISCUSSION

The similarity of the biclustering results to those of the integrated approach shows that accounting for another source of information in the analysis of gene expression data gives a more refined search for ‘biclusters’ containing a subset of ‘mechanistically’ related compounds regulating a subset of functionally related genes.

Combining two sources of data provides a better understanding of the mechanism of action of a compound cluster.

The approach is not limited to gene expression and target prediction data.

THANK YOU!
