26
Drug-disease Graph: Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data Heeyoung Kwak, Minwoo Lee, Seunghyun Yoon, Jooyoung Chang, Sangmin Park, Kyomin Jung Seoul National University, Korea PAKDD 2020

Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

  • Upload
    others

  • View
    9

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Drug-disease Graph: Predicting Adverse Drug Reaction Signals via

Graph Neural Network with Clinical Data

Heeyoung Kwak, Minwoo Lee, Seunghyun Yoon, Jooyoung Chang, Sangmin Park, Kyomin Jung

Seoul National University, Korea

BS01‰�”…‚¶̄ '

…›¿�·º̇ —–‡�‰�”…‚¶̄ '· �́University�Identity�System � �̇–�”»¿�…�• …̨›�·º̇ —��„�̀ �‚ƒ�·º‡»¿��ß�‚• �̨˙¥̂ �̇ ·̌ �́‚����‰ �̂¢�˜¿„ ·́ˇ̃ ���… �̇ �̇

˙�‰����� ·̇ �́·º̇ ¥�ß�»�́ ¡„���·�.�‰ �̂¢�ß���„�̀ �� �̇̄ º� …̌”�»�̈ fi”‚̇ –̌���§̇ ���‡„�� �̈‰�”…‚¶̄ '· �́·º̇ —‡»� �̇–‚…”¿Ý¿¡����ˇ̂ …�¤�»�̀ ¶…”̇ �̌��

·º‡»¿��ß�‚• �̨� �̌�� �̈��„�̀ �‚ƒ���·�̇ –̌���§̇ ��� �̋�„ •̇ �̨»�¿º¿¡�����…›�̆ fl”�̇ ��̀ �� ‚̇ƒ�¿�̇ �·�.�‰�”…‚¶̄ '� �̇̂ �…�»�¿º�–�̀ ⁄�”�‰�”…‚¶̄ '� �̇

��̇ �…”�»���•`̇ ¿̌'�̀ ⁄̇ ��� �̋‚• �̨̂ �…��»���̀ ����̇ �̌ �̇»�¿º�”�–�̇ �·�.�‰�”…‚¶̄ '� �̇��»��”�CD-Rom ¿¡�…�• �̌ �̈�¥��̄ ‚‚ƒ�»�¿º̇ �·�.

‰�”…‚¶̄ '

ˆ�…�»�¿º�–�̀ ⁄ 10m m

PAKDD 2020

Page 2: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Biomedical Graph Embeddings

2

Biomedical Graphs

Graph EmbeddingLearning

Downstream Tasks

In many biomedical applications, learning graph embeddings with a high-quality graph has a great impact on the performance.

Drug-Disease Association (DDA) Network

Protein-Protein Interaction (PPI) Network

Drug-Drug Interaction (DDI) Network

Drug-Target Interaction (DTI) Network

Page 3: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Task Definition

3

• Adverse drug reactions (ADRs)

: Unintended, harmful reactions caused by th

e use of medicines.

(ADR Signal Detection) We aim to predict the potential drug-ADR pairs.

• Signal detection in drug safety

: To detect information on a new potential causal relationship

between a drug exposure and an adverse reaction.

Page 4: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Signal Detection in Pre-/Post-Marketing Phases

4

• Detection after the drug is released

• Traditionally relied on voluntary

reporting but recently ML approaches

have been attempted with clinical DB.

• Fewer studies using graph-based appr

oaches with clinical database.

• Detection during the clinical trial

• Early detection in the drug discovery

process

• Graph-based approaches with

biomedical information sources are

widely used.

Pre-Marketing Phase

Post-Marketing Phase

We explore graph-based approach with clinical database for predicting ADR signals in the post-marketing phase.

Page 5: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

ADRs in ADR Database

5

Drugs

• ADR database• SIDER (Side Effect Resource)

• Drug-ADR pairs extracted from

public documents.

• ADR terms describe the drug-

induced diseases or the signs

and symptoms indicative of

ADRs.Signs & Symptoms

Drug-induced disease

: Causality

UMLS concept ID

Side effect name

STITCH compound ID

Page 6: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

ADR Signals in Clinical Database

6

• Clinical database• EHRs, Healthcare claims data

• Longitudinal observations of

medical codes representing

diagnosis, prescription, and

procedures.

• ADR signals can appear in the

form of (prescription code –

diagnosis code) pair.Signs & Symptoms

Diagnosis(Disease or condition)

ICD-10 code

ATC code

Prescription

explain

Page 7: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

We build Drug-disease graph,a data-driven graph that extracts relationships between drugs and diseasesfrom a large-scale clinical database.

Page 8: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Relationships between drugs and diseases

8

Drug A

DisY

Hypothesis: Drug A induces Disease Y.

!ATC code

ICD-10 code

Page 9: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Relationships between drugs and diseases

9

Drug A

DisY

DisX

Hypothesis: Drug A induces disease Y.

• There are drugs most likely to be prescribed for a given disease.

→ Drug A is a prescription for some disease X.

Page 10: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Relationships between drugs and diseases

10

Drug A

DisY

DisX

Drug B

Hypothesis: Drug A induces Disease Y.

• There are drugs most likely to be prescribed for a given disease.

→ Given disease Y, some drug B will be prescribed.

Page 11: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Relationships between drugs and diseases

11

Hypothesis: Drug A induces Disease Y.

Drug A

DisY

DisX

Drug B

• Assumption: Related drugs and diseases will be positioned in close proximity to each other along the patients’ records.

Page 12: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Relationships between drugs and diseases

Drug A

DisY

DisX

Drug B

• Drug-disease graphis a graph constructed to encode these relationships.

Graph structure can provide insights to potential ADRs, which may not otherwise be revealed using disconnected structures.

12

Page 13: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Drug-disease Graph

13

: Drug node: Disease node

: Homogeneous edge

: Heterogeneous edge

• Each node represents the prescription/diagnosis code.

• Node attributes are given as themedical code embeddings associated with categorical information.

• Homogeneous edges encode the temporal proximity between two codes.

• Heterogeneous edges representdrugs and diseases that are highly related in prescriptions.

Page 14: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Healthcare Claims Data (NHIS-NSC)

14

• Sequencesconsisting ofdiagnosis and prescription codes.

The NHIS-NSC* database is a nationwide population-based cohort in Korea providing information on medical insurance claims and health examinations of a representative Korean population (1M).

* NHIS-NSC: National Health Insurance Service-National Sample Cohort

Page 15: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Medical Code Embedding Learning

15

(Proximity-based code embedding learning) We build a corpus consisting of code sequences extracted from the patients’ records and implement the Skip-gram learning.

Drug sequences

Disease sequences

• The context is composed of the codes that are in close proximity to the given code.

Page 16: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Node Attributes

16

Node attributes in Drug-disease graph contain both proximity and categorical information of medical codes.

• Hierarchical Structure of Medical Codes

Drug category

Disease category

Drug skip-gram

Disease skip-gram

Page 17: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Homogeneous Edges

17

Homogeneous edges in Drug-disease graph captures temporal proximity between medical codes on large-scale clinical databases.

• (Drug-drug), (disease-disease) edges

• Compute l2-distance between two medical code embeddings to

estimate the temporal proximity.

• Compute edge weight using Gaussian kernel.

6 H. Kwak et al.

representation along with category-based node representation. Proximity-basednode representation is obtained by initial Skip-gram code embedding as in sec-tion 3.2. We denote a proximity-based drug node as v0

drug and a disease node asv0dis. Category-based node representation is designed to represent the categorical

information of medical codes. We utilize the hierarchical structure of categoricalcodes (i.e. ATC and ICD-10 codes) by adopting the one-hot vector format. Sincethere are multiple categories for each code, the category-based node represen-tation is shown as a concatenation of one-hot vectors, thus, a multi-hot vector.Finally, the initial node representation of the Drug-disease graph are representedas the concatenation of the proximity-based node embeddings and the category-based node embeddings. Following are the definitions for the drug and diseasenode representations.

Definition 3 (Node Representations)

v00drug = {v1

drug||v2drug||v3

drug||v4drug||v5

drug},v00dis = {v1

dis||v2dis},

vdrug = {v0drug ||v00

drug},vdis = {v0

dis ||v00dis},

(3)

where v00drug is a category-based drug node, v00

dis is a category-based disease node,vdrug is an initial drug node, vdis is an initial disease node, and || is a vectorconcatenation function. Each v

idrug represents the each level in the ATC code

structure and v00drug 2R104. Because the ATC code structure is represented in 5

levels, a drug node vector is also represented as the concatenation of 5 one-hotvectors. Similarly, each v

idis represents each of the first two levels in the ICD-

10 code structure and v00dis 2R126. We only use two classification levels of the

ICD-10 code structure, therefore, the disease node vector is represented as theconcatenation of 2 one-hot vectors.

For homogeneous edges like Edrug and Edis, we view the relationships be-tween homogeneous nodes as the temporal proximity of two entities, meaningthat two nodes are likely to be close together in the records. Therefore, us-ing the proximity-based node embeddings, we compute l2 -distance between twonode embeddings to estimate the temporal proximity. For heterogeneous edges,which are the edges connecting drug nodes and disease nodes, are given as theconditional probability of drug prescription given the diagnosed disease. Thedefinitions of the two types of edges are given as follows:

Definition 4 (Homogeneous Edges) For any node i, j 2 Vdrug (or Vdis),the edge weight wij between two nodes are defined using Gaussian weightingfunction as follows:

wij =

(exp(�kv0

i�v0jk

2

2✓2 ) if kv0i � v

0jk threshold,

0 otherwise,(4)

Page 18: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Heterogeneous Edges

18

• (Drug-disease) edges

• Use the empirical conditional probability to represent the drugs th

at are the most likely to be prescribed for a given disease.

Heterogeneous edges in Drug-disease graph encodes the conditional probability of the prescription given the diagnosis.

Drug-disease Graph: Predicting ADR Signals via GNN with Clinical Data 7

for some parameters threshold and ✓. v0i and v

0j are the proximity-based node

embeddings of two nodes i and j. Later, we additionally use edge-forming thresh-olds to control the sparsity of the graph.

Definition 5 (Heterogeneous Edges) For any drug node i 2 Vdrug and anydisease node j 2 Vdis, the edge weight wij between two nodes are given as:

wij =nij

nj, (5)

where nij is number of patients’ histories in the NHIS-NSC database that isrecorded with a diagnosis j and a prescription i in tandem. nj is the number ofpatients’ histories with the diagnosis j.

3.4 A GNN-based Method for Learning Graph Structure

We aggregate neighborhood information of each drug/disease node from the con-structed graph using the Graph Neural Network (GNN) framework. In each layerof GNN, the weighted sum of neighboring node features in the previous layer iscomputed to serve as the node features (after applying a RELU nonlinearity �)as follows:

zi(l+1) = �(

X

j2N (i)

↵(l)ij Wzj

(l)), (6)

where N (i) denotes the set of neighbors of ith node, zi(l) denotes feature vector

of ith node at lth layer, W denotes a learnable weight matrix and ↵(l)ij denotes

the normalized edge weight between ith and jth nodes at the lth layer. In thefirst layer, the initial drug/disease node representations are each passed througha nonlinear projection function to match their dimensions.

We use two weighting schemes for ↵(l)ij . The first variant follows the definition

in [7], and the weight is defined as follows:

↵ij =wijpdidj

, (7)

where di and dj are the degree of nodes i and j respectively, and wij are theedge weights defined in section 3.3. The weights are fixed for all layers. Thesecond weighting scheme instead learns the weighting scheme using attentionmechanism [19] as follows:

↵(l)ij =

exp(g(zi(l), zj(l)))Pk2N (i) exp(g(zi

(l), zk(l))), (8)

where g is a single fully-connected layer with LeakyReLU nonlinearity that takesa pair of node features as input. In the rest of this paper, we call the network withthe first weighting scheme as GCN and the network with the second scheme asGAT.

𝑛!" : # of patients’ records with a diagnosis 𝑗and a prescription 𝑖 in tandem.

𝑛" : # of patients’ records with a diagnosis 𝑗

Page 19: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

ADR Link Prediction

19

We predict the ADR signal of a drug-disease pair using the learned embeddings from the GNN model with a single bilinear layer.

Page 20: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Experiments show thatusing graph structure improvesthe prediction performances.

Page 21: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

Experimental Results

21

Drug-disease Graph: Predicting ADR Signals via GNN with Clinical Data 9

Table 2. Model AUROC and AUPRC performances (including 95% CI)

Model AUROC AUPRC

LR 0.631 ± 0.006 0.585 ± 0.007NN 0.739 ± 0.005 0.701 ± 0.006

DW 0.728 ± 0.004 0.709 ± 0.004DW + NN 0.772 ± 0.005 0.758 ± 0.005

DW + GCNlow 0.794 ± 0.003 0.768 ± 0.003GCNlow 0.795 ± 0.006 0.775 ± 0.006GATlow 0.732 ± 0.005 0.686 ± 0.009adrGCNlow 0.755 ± 0.008 0.726 ± 0.009GCNhigh 0.784 ± 0.006 0.761 ± 0.008GAThigh 0.733 ± 0.008 0.692 ± 0.009adrGCNhigh 0.756 ± 0.004 0.732 ± 0.006

our dataset, and therefore there could be a data leakage if the same disease classexists in both training and validation. The class of disease means the classifica-tion up to the third digit of ICD-10 codes. Note that we make the inference verydi�cult by not letting the model know which classes of diseases are linked withdrugs as ADRs. We use 80% of data for training, 10% for validation, and theremaining 10% for testing.To control the sparsity of a graph, we build two types of graphs where the edge-forming threshold is either low or high. When the edge-forming threshold is low,the graph has more edges, having more information as a result. We examinewhether it is beneficial or detrimental to have more edge information. We dis-tinguish two graphs by setting the thresholds for Edrug di↵erently. The summarystatistics of the constructed graphs and datasets are provided in Table 1.

4.3 Evaluations

To verify the performance of the GNN approach, we compare GNN modelswith DeepWalk approach and non-graph-based ML techniques. DeepWalk isa unsupervised graph embedding method that uses random walks on graphs.In DeepWalk approach, we use the DeepWalk embeddings, pre-trained by theDeepWalk model. We apply vanilla GCN and its variants to examine the e↵ectof considering the edge types. The followings are the models used for the graphembedding learnings. All the neural-network-based models use two layers witha hidden dimension of 300.

– LR is a logistic regression (LR) approach with information of the graphtopology. The vector composed of initial node representations of the nodeitself and its neighbor nodes are input to the LR model. The number ofneighbors is limited to 10.

– NN is a 2-layer fully-connected neural network which is solely based on theinitial node representations.

with graph

structure

withoutgraph

structure

The best performance when GCN is applied with the low edge-forming threshold.

Results on 𝐷#$!%&'(without overlapping disease classes)

Page 22: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

22

Experimental Results

Learning curves on 𝐷('$')*+(with overlapping disease classes)

DeepWalk embeddings are not sufficient to learn the graph structure.

Page 23: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

The prediction results of drug-ADR pairs can assist ADR database to be more comprehensive.

Page 24: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

24

Newly-Described Drug-ADR PairsDrug-disease Graph: Predicting ADR Signals via GNN with Clinical Data 11

Table 3. Newly-described drug-ADR pairs which are predicted, by the proposedmethod, to be positive with high probability.

Drug name ADR symptom Probability

Dasatinib Cardiac murmur 0.985Hydroxycarbamide Abnormal reflex 0.981Alendronic acid Tetany 0.978Ibandronic acid Unspecified edema 0.976Etidronic acid Abnormal reflex 0.972

4.4 Newly-Described ADR Candidates

To verify the power of the graph-based approach to discover ADR candidateswhich are unseen in the dataset, we extract the drug-disease pairs which arepredicted to be positive with high probability — over 0.97 but labeled as negative(false positive). To demonstrate the genuine power of graph-based methods, weexclude the candidates that are also positively predicted by the baseline neuralnetwork, which does not use relational information. As a result, clinical experts(M.D.) confirm that there exist pairs that are clearly considered to be real ADRs.The pairs are listed in Table 3.

Many of the discovered pairs, including umbrella terms like edema, are rathersymptoms and signs than diseases. This can be explained by the fact that theSIDER database is less comprehensive to cover all the specific symptoms, thatcan be induced by taking medicine. Especially, cardiac murmur and abnormalreflex are frequent symptoms, but it is reasonable to say that the suggestedpairs are ADRs. For example, Dasatinib is used to treat leukemia and can havesignificant cardiotoxicity, which can lead to cardiac murmurs. Hydroxycarbamideis a cytotoxic drug used for certain types of cancer, and it is known that cytotoxicmedications can cause electrolyte imbalance leading to abnormal reflex.

There are also significant pairs such as alendronic acid and tetany in the thirdrow. Severe and transient hypocalcemia is a well-known side-e↵ect of bisphos-phonates, which can lead to symptoms of tetany. Alendronic acid is classified asbisphosphonates, and therefore, tetany can be described as ADR of alendronicacid. Ibandronic acid and etidronic acid in the last two rows are also bisphospho-nates, and the paired symptoms are relevant to the usage of bisphosphonates.Unspecified edema may signify bone marrow edema caused by bisphosphonateuse, and electrolyte imbalance, which can lead to abnormal reflex, can be causedby etidronic acid use. All these explanations show that the ADR pairs we extractare based on various relations among drugs and diseases.

5 Conclusion

In this study, we propose a novel graph-based approach for ADR detection byconstructing a graph from the large-scale healthcare claims data. Our model

Our model predicts drug-ADR pairs that do not exist in the established ADR database showing its capability to supplement the ADR database.

Drug-disease pairs which are predicted to be positive with high probabilitybut labeled as negative (false positive)

Page 25: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

25

ConclusionØ We propose a novel graph-based framework for ADR signal detection.

Ø Drug-disease graph is built to captures the relationships between drugs and diseases in healthcare claims data.

Ø ADR signal detection using graph neural network shows superior performances and its capability to supplement the ADR database.

Ø Future work Ø Augment attributes using pharmacological knowledge.Ø Combine with other biomedical networks such as drug-disease associati

on(DDA) network.Ø Design more suitable graph-learning algorithm for ADR prediction.

Page 26: Drug-disease Graph: Predicting Adverse Drug Reaction ...€¦ · Predicting Adverse Drug Reaction Signals via Graph Neural Network with Clinical Data HeeyoungKwak, ... Healthcare

26

Thank you for listening!

Extended version of our paperprovides more details:

https://arxiv.org/abs/2004.00407