Outline
Overview
Controlled Natural Language
Answer Set Programming
Transforming Queries into Programs
Producing Answers
Related Works
Conclusion
Overview
People need to write queries to extract information from biomedical ontologies.
Problem: Formal query languages are not suitable for many of these users.
Solution: Natural language queries?
Further problem: Natural language is ambiguous and complex.
Solution: Controlled natural language!
How will it work?
Controlled natural language is unambiguous.
Therefore, a query can be easily and unambiguously translated into a logical form.
Then one can do reasoning with it!
Query: Which drug cures Asthma?
Logical form: which_drug(A) ← drug_cure_disease(A,asthma).
Answer: Formoterol
Controlled Natural Language
A subset of a natural language.
It has a restricted grammar and vocabulary.
It overcomes the ambiguity and complexity of natural language.
Example: Attempto Controlled English (University of Zurich)
Attempto Controlled English (ACE)
A subset of standard English with a restricted syntax and restricted semantics, described by a small set of:
Construction rules: define the grammar
Interpretation rules: remove ambiguities
A customer who enters a card manually types a code.
APE (the Attempto Parsing Engine) translates ACE text into a Discourse Representation Structure (DRS).
Discourse Representation Structure (DRS)
The DRS derived from an ACE text is returned as: drs(Domain, Conditions)
It uses a fixed number of predefined predicates: object, predicate, property, relation, modifier_pp, modifier_adv, query.
An example:
A man is mortal.
DRS Contd.
Query: What are the symptoms of the diseases that are related to ADRB1 or that are treated by Epinephrine?
Answer Set Programming (ASP)
ASP is a form of declarative programming oriented towards difficult search problems.
In ASP, a problem is posed as a logic program, and its solutions are given by the program's models, called answer sets.
It allows automated reasoning with incomplete information.
Answer set solvers: Smodels, Clasp, DLV, etc.
An Example
ide_drive :- hard_drive, not scsi_drive.
scsi_drive :- hard_drive, not ide_drive.
scsi_controller :- scsi_drive.
hard_drive.
M1 = {hard_drive, ide_drive}
M2 = {hard_drive, scsi_drive, scsi_controller}
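The two answer sets can be checked mechanically. The following is a minimal Python sketch, not a description of how Smodels or Clasp work internally: it enumerates every candidate set of atoms, builds the Gelfond-Lifschitz reduct for that candidate, and keeps the candidates that equal the least model of their reduct. The rule encoding (head, positive body, negative body) and the function names are assumptions made for this illustration.

```python
from itertools import combinations

# Each rule is (head, positive_body, negative_body); this encoding is an
# assumption of the sketch, chosen to mirror the four rules on the slide.
RULES = [
    ("ide_drive", ["hard_drive"], ["scsi_drive"]),
    ("scsi_drive", ["hard_drive"], ["ide_drive"]),
    ("scsi_controller", ["scsi_drive"], []),
    ("hard_drive", [], []),
]


def least_model(definite_rules):
    """Return the least model of a negation-free program by fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, positive in definite_rules:
            if head not in model and all(atom in model for atom in positive):
                model.add(head)
                changed = True
    return model


def answer_sets(rules):
    """Brute-force search: test every subset of atoms as a candidate answer set."""
    atoms = sorted({r[0] for r in rules} | {a for r in rules for a in r[1] + r[2]})
    found = []
    for size in range(len(atoms) + 1):
        for chosen in combinations(atoms, size):
            candidate = set(chosen)
            # Reduct: drop rules whose "not" literals are contradicted by the
            # candidate, then drop the remaining "not" literals entirely.
            reduct = [
                (head, positive)
                for head, positive, negative in rules
                if not any(atom in candidate for atom in negative)
            ]
            if least_model(reduct) == candidate:
                found.append(candidate)
    return found


if __name__ == "__main__":
    for model in answer_sets(RULES):
        print(sorted(model))
```

Exhaustive enumeration is exponential in the number of atoms, so it is only suitable for toy programs such as this one.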
Converting Query to Program
A three-step process:
Obtaining the DRS produced by APE
Parsing the DRS
Generating the answer set program
Parsing DRS
Grammar for DRS:
DRS → drs( Domain , Conditions )
Domain → [] | [ Referent {,Referent}* ]
Conditions → [ Condition {,Condition}* ]
Condition → Predicate | ComplexStructure
Predicate → Object | Property | Relation | Predicate | Modifier_pp | Modifier_adv | Query
ComplexStructure → Question | Negation | Disjunction .....
A Recursive Descent Parser
DRS → Parser → Internal Structure
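To make the idea of a recursive descent parser concrete, here is a minimal Python sketch for a simplified version of the DRS syntax. It is an illustration under assumptions, not the parser described in the talk: it handles only plain terms and lists (no complex structures such as negation or disjunction, and none of the sentence or token indices that real APE output may carry), and the nested list and tuple representation it returns is a hypothetical internal structure.

```python
import re

# A token is either a word (atom, variable, or number) or one punctuation mark.
TOKEN = re.compile(r"\s*([A-Za-z0-9_]+|[\[\](),])")
PUNCTUATION = {"[", "]", "(", ")", ","}


def tokenize(text):
    """Split the input into words and punctuation, rejecting anything else."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """One method per grammar rule; each consumes tokens from left to right."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.index = 0

    def peek(self):
        return self.tokens[self.index] if self.index < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise ValueError(f"expected {token!r}, got {self.peek()!r}")
        self.index += 1

    def parse_term(self):
        # term -> list | name | name "(" term {"," term} ")"
        if self.peek() == "[":
            self.expect("[")
            return self.parse_sequence("]")
        name = self.peek()
        if name is None or name in PUNCTUATION:
            raise ValueError(f"expected a name, got {name!r}")
        self.index += 1
        if self.peek() == "(":
            self.expect("(")
            return (name, self.parse_sequence(")"))
        return name

    def parse_sequence(self, closer):
        # Comma-separated terms up to the closing bracket; may be empty.
        items = []
        if self.peek() != closer:
            items.append(self.parse_term())
            while self.peek() == ",":
                self.index += 1
                items.append(self.parse_term())
        self.expect(closer)
        return items


def parse_drs(text):
    """Parse 'drs(Domain, Conditions)' and return (domain, conditions)."""
    parser = Parser(tokenize(text))
    term = parser.parse_term()
    if parser.peek() is not None:
        raise ValueError("trailing input after DRS")
    if not (isinstance(term, tuple) and term[0] == "drs" and len(term[1]) == 2):
        raise ValueError("input is not of the form drs(Domain, Conditions)")
    domain, conditions = term[1]
    return domain, conditions
```

For example, `parse_drs("drs([A],[query(A,which),object(A,drug,countable,na,eq,1)])")` returns the domain `["A"]` and a list of two conditions, each represented as a `(name, arguments)` tuple.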
Generating Answer Set Programs
A program consists of rules.
A rule has two parts: Head :- Body
Generating rules:
Constructing the head atom
Constructing the bodies
Constructing Head Atom
Which query:
Which drug cures Asthma?
query(A,which) object(A,drug,countable,na,eq,1)
which_drug(A)
What query:
What is the drug that cures Asthma?
query(A,what) predicate(B,be,A,C) object(C,drug,countable,na,eq,1)
what_be_drug(C)
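The two cases above suggest a simple naming scheme for the head atom. The Python sketch below is a guess inferred from these two examples only, not the authors' documented algorithm: if the queried referent is itself an object, the head is the query word plus the noun; otherwise the head follows the linking predicate to the object it refers to. Conditions are represented as flat tuples, which is an assumption of this sketch, and `build_head` is a hypothetical name.

```python
def build_head(conditions):
    """Build a head atom such as 'which_drug(A)' from flat DRS condition tuples."""
    # query(Referent, QueryWord)
    query_var, query_word = next(
        (c[1], c[2]) for c in conditions if c[0] == "query"
    )
    # object(Referent, Noun, ...): map each referent to its noun
    nouns = {c[1]: c[2] for c in conditions if c[0] == "object"}

    # "Which" case: the queried referent is the object itself.
    if query_var in nouns:
        return f"{query_word}_{nouns[query_var]}({query_var})"

    # "What" case: predicate(Ref, Verb, Subject, Object) links the queried
    # referent to another referent that carries the noun.
    for c in conditions:
        if c[0] == "predicate" and len(c) == 5 and c[3] == query_var and c[4] in nouns:
            return f"{query_word}_{c[2]}_{nouns[c[4]]}({c[4]})"

    raise ValueError("no object found for the queried referent")


which_query = [
    ("query", "A", "which"),
    ("object", "A", "drug", "countable", "na", "eq", "1"),
]
what_query = [
    ("query", "A", "what"),
    ("predicate", "B", "be", "A", "C"),
    ("object", "C", "drug", "countable", "na", "eq", "1"),
]

if __name__ == "__main__":
    print(build_head(which_query))
    print(build_head(what_query))
```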
Generating Bodies
What are the symptoms of the diseases THAT are related to ADRB1 OR that are treated by Epinephrine?
Generating Bodies Contd.
Depth-first traversal of the internal DRS representation
Generate a new body for each leaf
Add an atom to the body for:
Each predicate-predicate: predicate(D,cure,C,named(Asthma)) → drug_cure_disease(C, asthma)
Each relation-predicate: relation(A,of,B) → symptom_of_disease(A, B)
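The two mappings above can be sketched in Python as follows. This is an illustration under assumptions: the atom name is formed as subject concept, linking word, object concept, and the concept of each referent or proper name is looked up in a `types` table. How the real system determines that "Asthma" is a disease is not stated in the slides, so that table, the tuple encoding, and the names `normalise` and `body_atom` are all hypothetical.

```python
def normalise(argument):
    """Turn ('named', 'Asthma') into the constant 'asthma'; leave referents as-is."""
    if isinstance(argument, tuple) and argument[0] == "named":
        return argument[1].lower()
    return argument


def body_atom(condition, types):
    """Translate one DRS condition into a body atom, or None if not applicable.

    `types` maps a referent (e.g. 'C') or a lowercased proper name
    (e.g. 'asthma') to its concept (e.g. 'drug', 'disease').
    """
    kind = condition[0]
    if kind == "predicate":
        # predicate(Ref, Verb, Subject, Object)
        _, _, link, subject, obj = condition
    elif kind == "relation":
        # relation(Subject, Preposition, Object)
        _, subject, link, obj = condition
    else:
        return None
    subject, obj = normalise(subject), normalise(obj)
    return f"{types[subject]}_{link}_{types[obj]}({subject},{obj})"


if __name__ == "__main__":
    cure = ("predicate", "D", "cure", "C", ("named", "Asthma"))
    print(body_atom(cure, {"C": "drug", "asthma": "disease"}))

    of = ("relation", "A", "of", "B")
    print(body_atom(of, {"A": "symptom", "B": "disease"}))
```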
Examples
What are the symptoms of the diseases that are related to ADRB1 or that are treated by Epinephrine?
what_be_symptom(C) :- symptom_of_disease(C,D), disease_be_related_to_gene(D,adrb1).
what_be_symptom(C) :- symptom_of_disease(C,D), drug_treat_disease(epinephrine,D).
Which gene is related to a disease that causes Insomnia?
which_gene(A) :- disease_cause_symptom(B,insomnia), gene_be_related_to_disease(A,B).
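The first query yields two rules because of its disjunction: both rules share the atom symptom_of_disease(C,D) and differ in the branch they follow. A minimal Python sketch of "one body per leaf" is given below. The tree encoding (a pair of atoms and child nodes) is a hypothetical stand-in for the internal DRS representation, and the atoms are supplied as ready-made strings.

```python
def bodies(node, prefix=()):
    """Depth-first traversal: return one list of atoms per root-to-leaf path."""
    atoms, children = node
    path = prefix + tuple(atoms)
    if not children:
        # A leaf closes one alternative, so it contributes one complete body.
        return [list(path)]
    result = []
    for child in children:
        result.extend(bodies(child, path))
    return result


def rules(head, tree):
    """Combine the head with each body into a rule string."""
    return [f"{head} :- {', '.join(body)}." for body in bodies(tree)]


# Shared atom at the root, one child per disjunct.
symptom_tree = (
    ["symptom_of_disease(C,D)"],
    [
        (["disease_be_related_to_gene(D,adrb1)"], []),
        (["drug_treat_disease(epinephrine,D)"], []),
    ],
)

if __name__ == "__main__":
    for rule in rules("what_be_symptom(C)", symptom_tree):
        print(rule)
```

A query without a disjunction is a tree with a single path, so it produces exactly one rule.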
Producing Answers
We need biomedical knowledge
Knowledge must be encoded
Answer set solver: Clasp
We need an interface.
Biomedical Knowledge
Concepts: Gene, Drug, Disease, Symptom
PharmGKB database: Relationships between gene, drug and disease.
MedicineNet.com: Disease and symptom database.
Encoding Knowledge
Facts:
disease_symptom(asthma,cough).
gene_disease(adra1b,asthma).
gene_drug(adra1b,norepinephrine).
drug_disease(norepinephrine,hypertension).
…..
Rules:
drug_symptom(X,Y) :- drug_disease(X,Z), disease_symptom(Z,Y).
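The rule above is a join on the shared variable Z. The Python sketch below mimics that derivation over sets of pairs; it only illustrates what the rule means and is not how Clasp evaluates it. Note that the facts listed on the slide do not share a disease between drug_disease and disease_symptom, so the sketch adds one clearly hypothetical fact with the placeholder `symptom_x` to show a non-empty result.

```python
def derive_drug_symptom(drug_disease, disease_symptom):
    """drug_symptom(X,Y) :- drug_disease(X,Z), disease_symptom(Z,Y)."""
    return {
        (drug, symptom)
        for (drug, disease) in drug_disease
        for (other_disease, symptom) in disease_symptom
        if disease == other_disease  # the shared variable Z
    }


# Facts taken from the slide.
DRUG_DISEASE = {("norepinephrine", "hypertension")}
DISEASE_SYMPTOM = {("asthma", "cough")}

# Hypothetical placeholder fact, added only to make the join fire;
# "symptom_x" is not a real entry from the knowledge base.
EXTENDED_DISEASE_SYMPTOM = DISEASE_SYMPTOM | {("hypertension", "symptom_x")}

if __name__ == "__main__":
    print(derive_drug_symptom(DRUG_DISEASE, DISEASE_SYMPTOM))
    print(derive_drug_symptom(DRUG_DISEASE, EXTENDED_DISEASE_SYMPTOM))
```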
System Architecture
User Interface → Query Pre-processor → APE → DRS Parser → Translator → Post Processing → Clasp Interface → Clasp → User Interface
Post Processing
Before → After
disease_be_related_to_gene(D,adrb1) → gene_disease(adrb1,D)
drug_treat_disease(epinephrine,D) → drug_disease(epinephrine,D)
disease_cause_symptom(B,insomnia) → disease_symptom(B,insomnia)
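Post-processing rewrites the atoms generated from the query into the predicates used by the encoded knowledge, reordering arguments where needed. A minimal Python sketch follows. The lookup table contains only the three pairs shown on this slide; the table format, the regular expression, and the name `post_process` are assumptions of this illustration, and atoms without a table entry are left unchanged.

```python
import re

# generated predicate -> (knowledge-base predicate, swap the two arguments?)
# Only the three pairs from the slide are listed.
MAPPING = {
    "disease_be_related_to_gene": ("gene_disease", True),
    "drug_treat_disease": ("drug_disease", False),
    "disease_cause_symptom": ("disease_symptom", False),
}

# Matches binary atoms such as name(Arg1,Arg2).
BINARY_ATOM = re.compile(r"([a-z_]+)\(([^(),]+),([^(),]+)\)")


def post_process(rule):
    """Rewrite every known binary atom in a rule string; keep the rest as-is."""

    def rewrite(match):
        name = match.group(1)
        first, second = match.group(2).strip(), match.group(3).strip()
        if name not in MAPPING:
            return match.group(0)
        new_name, swap = MAPPING[name]
        if swap:
            first, second = second, first
        return f"{new_name}({first},{second})"

    return BINARY_ATOM.sub(rewrite, rule)


if __name__ == "__main__":
    print(post_process(
        "what_be_symptom(C) :- symptom_of_disease(C,D), "
        "disease_be_related_to_gene(D,adrb1)."
    ))
```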
Related Works
A preliminary report on answering complex queries related to drug discovery using answer set programming by Olivier Bodenreider et al.
Transforming Controlled Natural Language Biomedical Queries into Answer Set Programs by Esra Erdem et al.