Web3.0 and Language Resources
Knowledge Media Institute (KMi), The Open University
Semantic Technologies @ KMi
Outline
• Library of generic problem solving methods
  – To act as active components
• Methods for dealing with heterogeneous knowledge sources
  – Knowledge Fusion (KnoFuss)
  – Ontology Matching (Scarlet)
• Infrastructures for storing large scale semantics (Watson)
• Tools for exploiting distributed semantics
  – Open Domain, Multi-Ontology Question Answering (PowerAqua)
PSM
• Library of generic problem solving methods
  – To act as active components
I include both generic slides and a detailed example. These are mainly to help you understand what is going on; I think most of the detailed slides can be skipped in the talk.
Knowledge-level Architectures for Sharing and Reuse
Application of the modelling paradigm to the specification and use of libraries of reusable components for knowledge systems
Modelling Frameworks (1)
• A modelling framework identifies the generic types of knowledge which occur in knowledge systems, thus providing a generic epistemological organization for knowledge systems
• Several exist
  – KADS/Common KADS - Univ. of Amsterdam
  – Components of Expertise - Steels
  – Generic Tasks - Chandrasekaran
  – Role-limiting Methods - McDermott
  – Protégé - Musen, Stanford
  – TMDA - Motta
  – Ibrow - Fensel, Motta et al.
Modelling Frameworks (2)
• Much in common
  – Emphasis on reusable models
  – Typology of generic tasks
  – Constructivist paradigm
• Some differences
  – Different degrees of coupling between domain-specific and domain-independent knowledge
  – Different degrees of flexibility
  – Different typologies of knowledge categories
A Constructive Approach...
Let’s define our own framework...
Generic Tasks
• Informal definition
  – A generic class of applications, e.g., planning, design, diagnosis, scheduling
• More precise definition
  – A knowledge-level, application-independent description of the goal to be attained by a problem solver.
• Several typologies exist
  – e.g., Breuker, 1994
• Viewpoints over applications
  – No ‘natural categories’
  – Different viewpoints can be imposed on a particular application
Example: Parametric Design
Generic Task: Parametric Design
Inputs: Parameters, Constraints, Requirements, Cost-Function, Preferences
Output: Design-Model
Goal: “To produce a complete and consistent design model, which satisfies the given requirements”
Preconditions: “At least one requirement and one parameter are provided”
Example: Classification
Generic Task: Classification
Inputs: Candidate-classes, Observables, Match-criterion, Solution-criterion
Output: Best-Matching-Classes
Preconditions: “At least one candidate class exists”
Goal: “To find the class that best explains the observables”
Generic Component 2: Reusable PSMs
• A domain-independent, knowledge-level specification of problem solving behaviour, which can be used to solve a class of tasks.
• PSM specifications may be partial
• PSM can be task-specific
  – E.g., heuristic classification
• PSM can be task-independent
  – E.g., search methods, such as hill-climbing, A*, etc.
Functional Specification of a PSM
Problem solving method search
  ontology import state-space-terminology
  competence
    roles
      input input: State
      output output: State
    preconditions input ≠ 0
    postconditions solution_state (output)
    assumptions ∃ ?s . solution_state (?s) & successor (input, ?s)
Operational Description
Begin
  states := one x. initialize (input input)
  repeat
    state := one x. select_state (states states)
    if solution_state (state)
    then return state
    else
      succ_states := one x. derive_successor_states (state state)
      states := one x. update_state_space (input1 states input2 succ_states)
    end if
  end repeat
end
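The repeat loop of the search method can be sketched in Python roughly as follows. This is only an illustration, not the library's implementation: the function and parameter names are hypothetical, and I assume that update_state_space drops the selected state and adds its successors.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

S = TypeVar("S")

def search(
    initial: S,
    is_solution: Callable[[S], bool],
    successors: Callable[[S], Iterable[S]],
    select: Callable[[List[S]], S],
    max_steps: int = 10_000,
) -> Optional[S]:
    """Generic state-space search mirroring the operational description."""
    states: List[S] = [initial]            # initialize (input)
    for _ in range(max_steps):
        if not states:
            return None                    # state space exhausted, no solution
        state = select(states)             # select_state (states)
        if is_solution(state):             # solution_state (state)
            return state
        # Assumed behaviour of update_state_space: drop the expanded
        # state and add the successors derived from it.
        states.remove(state)
        states.extend(successors(state))   # derive_successor_states (state)
    return None

# Usage: reach 10 from 1 with the moves +1 and *2, always expanding
# the largest state that does not exceed 10.
result = search(
    initial=1,
    is_solution=lambda n: n == 10,
    successors=lambda n: [m for m in (n + 1, n * 2) if m <= 10],
    select=max,
)
```

Passing a different `select` function changes the search strategy (for example, hill-climbing versus breadth-first) without touching the loop, which is the sense in which such a method is task-independent.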
Task-Method Structures
[Figure: task-method structure; labels: Problem Type, Primitive PSM]
Multi-Functional Domain Models
• Domain-specific models, which are not committed to a specific PSM or task.
• Examples
  – A database of cars
  – The CYC knowledge base, etc.
Picture so far...
[Figure: an Application Model combining a Generic Task (Classification), a Problem Solving Method (Simple Classifier), and a Multi-Functional Domain (Lunar rocks)]
Issue
How to link different reusable components?
[Figure: the Application Model with a Task-Domain Mapping and a PSM-Domain Mapping added between the Generic Task (Classification), the Problem Solving Method (Simple Classifier), and the Multi-Functional Domain (Lunar rocks)]
Solution: Mappings
• Mappings model explicitly the relationship between different components in an application model
[Figure label: Task-PSM Mapping]
Example
• Scenario: Office Allocation Application
• Generic Task: Parametric Design
• Domain: KB about employees and offices
Task Level -> Domain Level
  Parameter -> Employee
  Design Model -> Pairs<Employee, Room>
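A task-domain mapping for the office allocation scenario might look like the sketch below. The employee and room names, and the two helper functions, are hypothetical; the only point is that parameters are filled from employees and a design model is read back as <Employee, Room> pairs.

```python
from typing import Dict, List, Tuple

# Domain level: a toy KB of employees and offices (hypothetical data).
employees: List[str] = ["ann", "bob"]
rooms: List[str] = ["r101", "r102"]

def map_parameters(emps: List[str], rms: List[str]) -> Dict[str, List[str]]:
    """Task-domain mapping: each employee becomes a design parameter
    whose candidate values are the available rooms."""
    return {e: list(rms) for e in emps}

def to_domain(design_model: Dict[str, str]) -> List[Tuple[str, str]]:
    """Read a task-level design model back as <Employee, Room> pairs."""
    return sorted(design_model.items())

parameters = map_parameters(employees, rooms)
design_model = {"ann": "r101", "bob": "r102"}   # one complete assignment
pairs = to_domain(design_model)
```

Neither the parametric design task nor the employee KB needs to know about the other; only the two mapping functions are application-specific.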
Mappings are an example of application-specific knowledge. Are there others?
Application-specific knowledge
Yes: Application-specific heuristic problem solving knowledge
Elevator Design Example
• A configuration designer only considers two positions for the counterweight
  – Half way between platform and U-bracket
  – A position such that the distance between the counterweight and the platform is at least 0.75 inches
Complete Picture
[Figure: Application Configuration. The Application Model combines the Generic Task, the Problem Solving Method, and the Multi-Functional Domain with Mapping Knowledge and Application-specific Problem-Solving Knowledge]
Detailed Example: A Library of Components for Classification
[Figure: Classification takes Observables, Candidate Sols. and a Criterion as inputs and produces a Solution]
Classification
• Classification can be seen as the problem of finding the solution (class), which best explains a set of known facts (observables), according to some criterion
Example
Observables: {background=green; area=china...}
Candidate Sols.: {chinese-granny, dutch-granny, etc.}
Criterion: Complete-coverage-criterion (every observable has to be explained)
Solution: {chinese-granny}
Observables
Observables = set_of (Observable);
Observable = {feature, value}.
Well defined Observables (obs):
  ({f1, v1} ∈ obs ∧ {f1, v2} ∈ obs) -> v1 = v2
  ({f1, v1} ∈ obs) -> legal_feature_value (f1, v1)
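The two well-definedness conditions (at most one value per feature, and every value legal for its feature) can be checked as in this sketch. The `LEGAL` table and the `legal` predicate are hypothetical stand-ins for legal_feature_value.

```python
from typing import Callable, Dict, Iterable, Tuple

Observable = Tuple[str, object]   # (feature, value)

def well_defined(
    obs: Iterable[Observable],
    legal_feature_value: Callable[[str, object], bool],
) -> bool:
    """True iff no feature has two different values and every value is legal."""
    seen: Dict[str, object] = {}
    for feature, value in obs:
        if not legal_feature_value(feature, value):
            return False                       # illegal value for this feature
        if feature in seen and seen[feature] != value:
            return False                       # two values for one feature
        seen[feature] = value
    return True

# Hypothetical legality table used only for this illustration.
LEGAL = {"background": {"green", "red"}, "area": {"china", "holland"}}

def legal(feature: str, value: object) -> bool:
    return value in LEGAL.get(feature, set())

ok = well_defined([("background", "green"), ("area", "china")], legal)
clash = well_defined([("background", "green"), ("background", "red")], legal)
```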
Solutions
Solution = set_of (Feature_Spec);
Feature_Spec = {Feature, Feature_value_spec}
Feature_value_spec = Unary_Relation
Well defined Solution (sol):
  ({f1, s1} ∈ sol ∧ holds (s1, v1)) -> legal_feature_value (f1, v1)
Matching
Observable={f1, v1} matches Solution=sol iff:
  {f1, c} ∈ sol ∧ holds (c, v1)
Matching Sets of Obs to a Solution
Sol: {{fsol1, c1}...{fsolm, cm}}; Obs: {{fob1, v1}...{fobn, vn}}
Four possible cases:
  {fj, cj} ∈ sol ∧ {fj, vj} ∈ obs ∧ holds (cj, vj) -> Explained (fj)
  {fj, cj} ∈ sol ∧ {fj, vj} ∈ obs ∧ not holds (cj, vj) -> Inconsistent (fj)
  {fj, vj} ∈ obs ∧ {fj, cj} ∉ sol -> Unexplained (fj)
  {fj, vj} ∉ obs ∧ {fj, cj} ∈ sol -> Missing (fj)
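As a rough illustration of the four cases, the sketch below sorts features into inconsistent, explained, unexplained and missing. Here I read "unexplained" as an observed feature the solution says nothing about, and "missing" as a solution feature with no observation. The feature names and the `chinese_granny` solution are invented for the example.

```python
from typing import Callable, Dict, List, NamedTuple

class MatchScore(NamedTuple):
    inconsistent: List[str]
    explained: List[str]
    unexplained: List[str]
    missing: List[str]

def match(
    obs: Dict[str, object],
    sol: Dict[str, Callable[[object], bool]],
) -> MatchScore:
    """Classify every feature of obs and sol into one of the four cases."""
    inconsistent, explained, unexplained, missing = [], [], [], []
    for feature, value in obs.items():
        if feature not in sol:
            unexplained.append(feature)     # observed, not covered by sol
        elif sol[feature](value):
            explained.append(feature)       # holds (c, v)
        else:
            inconsistent.append(feature)    # not holds (c, v)
    for feature in sol:
        if feature not in obs:
            missing.append(feature)         # specified by sol, not observed
    return MatchScore(inconsistent, explained, unexplained, missing)

# Hypothetical solution: each feature maps to a unary relation on values.
chinese_granny = {
    "background": lambda v: v == "green",
    "area": lambda v: v == "china",
    "age": lambda v: True,
}
score = match({"background": "green", "area": "china", "shape": "round"},
              chinese_granny)
```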
Default Match Criterion
Match Score: Vector <I, E, U, M>
Match Comparison Relation
S1 = (i1, e1, u1, m1); S2 = (i2, e2, u2, m2)
S1 better_score than S2 iff:
  (i1 < i2) ∨
  (i2 = i1 ∧ e2 < e1) ∨
  (i2 = i1 ∧ e2 = e1 ∧ u1 < u2) ∨
  (i2 = i1 ∧ e2 = e1 ∧ u2 = u1 ∧ m1 < m2)
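The comparison relation is a lexicographic ordering: fewer inconsistent features first, then more explained, then fewer unexplained, then fewer missing. A minimal sketch, assuming a score is a 4-tuple of counts (i, e, u, m):

```python
from typing import Tuple

Score = Tuple[int, int, int, int]   # (inconsistent, explained, unexplained, missing)

def sort_key(score: Score) -> Tuple[int, int, int, int]:
    """Negate the explained count so that a smaller key is always better."""
    i, e, u, m = score
    return (i, -e, u, m)

def better_score(s1: Score, s2: Score) -> bool:
    """True iff s1 is strictly better than s2 under the default criterion."""
    return sort_key(s1) < sort_key(s2)

candidates = {"chinese-granny": (0, 2, 0, 1), "dutch-granny": (1, 1, 0, 1)}
best = min(candidates, key=lambda name: sort_key(candidates[name]))
```

Python compares tuples lexicographically, so `sort_key(s1) < sort_key(s2)` unfolds into exactly the four disjuncts of the relation.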
Possible Solution Criteria
• Positive Coverage
  – Some feature is explained and none is inconsistent
• Complete Coverage
  – All features are explained and none is inconsistent
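Both criteria can be written as predicates over a match score. The sketch assumes a score is a tuple of counts (inconsistent, explained, unexplained, missing), and reads "all features are explained" as "no observable is left unexplained".

```python
from typing import Tuple

Score = Tuple[int, int, int, int]   # (inconsistent, explained, unexplained, missing)

def positive_coverage(score: Score) -> bool:
    """Some feature is explained and none is inconsistent."""
    inconsistent, explained, _, _ = score
    return explained > 0 and inconsistent == 0

def complete_coverage(score: Score) -> bool:
    """No observable is unexplained and none is inconsistent."""
    inconsistent, _, unexplained, _ = score
    return inconsistent == 0 and unexplained == 0

partial = (0, 1, 2, 0)    # explains one feature, leaves two unexplained
full = (0, 3, 0, 1)       # explains everything observed
```

Under this reading a solution that covers every observable also satisfies positive coverage as soon as it explains at least one feature, so the second criterion is the stricter one.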
Hierarchy of Criteria
[Figure: hierarchy of criteria; labels: Solution Criterion, Match Criterion, Match Score Comparison Rel, Match Score Mechanism, Macro Score Mechanism, Feature Score Mechanism]
Observables
(def-class observables (set) ?obs
  "This is simply a set of observables. An important constraint is that there cannot be two values for the same feature in a set of observables"
  :iff-def (every ?obs observable)
  :constraint (not (exists (?ob1 ?ob2)
                     (and (member ?ob1 ?obs)
                          (member ?ob2 ?obs)
                          (has-observable-feature ?ob1 ?f)
                          (has-observable-feature ?ob2 ?f)
                          (has-observable-value ?ob1 ?v1)
                          (has-observable-value ?ob2 ?v2)
                          (not (= ?v1 ?v2))))))
Solutions
(def-class solution () ?x
  "A solution is a set of feature definitions"
  :iff-def (every ?x feature-definition))

(def-class feature-definition () ?x
  ((has-feature-name :type feature)
   (has-feature-value-spec :type unary-relation))
  :constraint (=> (and (has-feature-name ?x ?f)
                       (has-feature-value-spec ?x ?spec))
                  (=> (holds ?spec ?v)
                      (legal-feature-value ?f ?v))))
Solution Criterion
(def-class solution-admissibility-criterion () ?c
  ((applies-to-match-score-type :type match-score-type)
   (has-solution-admissibility-relation :type unary-relation))
  :constraint (=> (and (solution-admissibility-criterion ?c)
                       (has-solution-admissibility-relation ?c ?r)
                       (domain ?r ?d))
                  (subclass-of ?d match-score)))
Monotonicity of Admissible Solutions
(def-axiom admissibility-is-monotonic
  "This axiom states that the admissibility criterion is monotonic. That is, if a solution, ?sol, is admissible, then any solution which is better than ?sol will also be admissible"
  (forall (?sol1 ?sol2 ?obs ?criterion)
    (=> (and (admissible-solution ?sol1
                                  (apply-match-criterion ?criterion ?obs ?sol1)
                                  ?criterion)
             (better-match-than ?sol2 ?sol1 ?obs ?criterion))
        (admissible-solution ?sol2
                             (apply-match-criterion ?criterion ?obs ?sol2)
                             ?criterion))))
Complete Coverage
(def-instance complete-coverage-admissibility-criterion solution-admissibility-criterion
  ((applies-to-match-score-type default-match-score)
   (has-solution-admissibility-relation complete-coverage-admissibility-relation)))

(def-relation complete-coverage-admissibility-relation (?score)
  "a solution should be consistent and explain all features"
  :constraint (default-match-score ?score)
  :iff-def (and (= (length (first ?score)) 0)    ;; no inconsistency
                (= (length (third ?score)) 0)))  ;; no unexplained
Classification Task Ontology
• 42 Definitions
• Provides both a theory of classification and a vocabulary to describe classification problems
• Ontology is separated from task specifications
Generic Classification Task
• Input roles
  – Candidate Solutions, Match Criterion, Solution Criterion, Observables
• Precondition
  – Both observables and candidate solutions have to be provided
• Goal
  – To find a solution from the candidate solutions which is admissible with respect to the given observables, solution criterion and match criterion
Specific Classification Tasks
• Single-Solution Classification Task
  – Single-solution assumption
• Optimal Classification Tasks
  – Goal requires optimality
Problem Solving Library
• Based on heuristic classification model
• Supports both data-directed and solution-directed classification
• Based on search paradigm
• Supported by a method ontology
Method Ontology: Main Concepts
• Abstractors
  – Mechanism for performing abstraction on observables
  – Abstractor: Obs* -> Obs
• Refiners
  – Mechanism for specialising a solution
  – Refiner: Sol -> Sol*
• Candidate Exclusion Criterion
  – A criterion which is used to decide when a search path is a dead-end
  – Default criterion rules out inconsistent solutions
Monotonicity of Exclusion Criterion
(def-axiom exclusion-is-monotonic
  (forall (?sol1 ?sol2 ?obs ?criterion)
    (=> (and (ruled-out-solution ?sol1 (the-match-score ?sol1) ?criterion)
             (not (better-match-than ?sol2 ?sol1 ?obs ?criterion)))
        (ruled-out-solution ?sol2 (the-match-score ?sol2) ?criterion))))
Axiom of Congruence
(def-axiom congruent-admissibility-and-exclusion-criteria
  (forall (?sol ?task)
    (=> (member ?sol (the-solution-space ?task))
        (not (and (admissible-solution ?sol (the-match-score ?sol)
                                       (role-value ?task 'has-solution-admissibility-criterion))
                  (ruled-out-solution ?sol (the-match-score ?sol)
                                      (role-value ?psm 'has-solution-exclusion-criterion)))))))
Three Heuristic Classification PSMs
• Two Data-directed
  – Admissible Solution Classifier
    • Finds one admissible solution according to the given criteria
    • Uses backtracking hill climbing
  – Optimal Classifier
    • Performs complete search looking for optimal solution
    • Uses best-first strategy
    • Uses candidate exclusion criterion to prune search space
• One Solution-directed
  – Goes down the solution hierarchy, acquiring observables as needed
  – Asks for observables with max discrimination power
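To make the Optimal Classifier idea concrete, here is a small best-first sketch with pruning. It is an illustration under assumptions, not the library's PSM: the hierarchy, the scores, and all function names are invented, and a score is taken to be a tuple of counts (inconsistent, explained, unexplained, missing).

```python
import heapq
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Score = Tuple[int, int, int, int]   # (inconsistent, explained, unexplained, missing)

def sort_key(score: Score) -> Tuple[int, int, int, int]:
    i, e, u, m = score
    return (i, -e, u, m)            # smaller key means better match

def optimal_classifier(
    roots: Iterable[str],
    refine: Callable[[str], List[str]],
    score_of: Callable[[str], Score],
    ruled_out: Callable[[Score], bool],
    admissible: Callable[[Score], bool],
) -> Optional[str]:
    """Complete best-first search; returns the best admissible solution."""
    frontier: List[Tuple[Tuple[int, int, int, int], int, str]] = []
    counter = 0                      # tie-breaker so names are never compared
    for sol in roots:
        heapq.heappush(frontier, (sort_key(score_of(sol)), counter, sol))
        counter += 1
    best: Optional[Tuple[Tuple[int, int, int, int], str]] = None
    while frontier:
        key, _, sol = heapq.heappop(frontier)   # most promising candidate
        score = score_of(sol)
        if ruled_out(score):
            continue                 # dead end: do not refine this path
        if admissible(score) and (best is None or key < best[0]):
            best = (key, sol)
        for child in refine(sol):    # specialise the solution
            heapq.heappush(frontier, (sort_key(score_of(child)), counter, child))
            counter += 1
    return None if best is None else best[1]

# Hypothetical solution hierarchy and precomputed scores.
HIERARCHY: Dict[str, List[str]] = {"granny": ["chinese-granny", "dutch-granny"]}
SCORES: Dict[str, Score] = {
    "granny": (0, 1, 1, 0),
    "chinese-granny": (0, 2, 0, 0),
    "dutch-granny": (1, 1, 0, 0),
}
answer = optimal_classifier(
    roots=["granny"],
    refine=lambda s: HIERARCHY.get(s, []),
    score_of=SCORES.__getitem__,
    ruled_out=lambda sc: sc[0] > 0,                    # default: inconsistent
    admissible=lambda sc: sc[0] == 0 and sc[2] == 0,   # complete coverage
)
```

Pruning a ruled-out candidate without refining it is only safe if refinements of a dead end are also dead ends, which is an assumption of this sketch about how the exclusion criterion behaves along a search path.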
Task-Method Hierarchy
[Figure: task-method hierarchy; labels: classification, heuristic-classification-psm, abstraction, refinement, rank-solutions, abstraction-psm, refinement-psm, rank-solutions-psm, basic-heuristic-match, select-abstractor, one-step-abstraction, collect-refiners, apply-refiners]
KnoFuss
• Methods for dealing with heterogeneous knowledge sources
  – Knowledge Fusion (KnoFuss)
• The story here is that smart products will contain a lot of instance-level semantic data encoded in terms of different ontologies, so it will be important to be able to compare and merge (or fuse) similar instance data.
Knowledge fusion scenario
[Figure: knowledge fusion scenario; labels: Text, Images, Other data, Annotation, Fusion, RDF, Internal corporate reports (Intranet), Pre-defined public sources (WWW), Domain ontology, KnoFuss, Knowledge base]
Fusion workflow
[Figure: fusion workflow; labels: Knowledge fusion, Ontology integration, Knowledge base integration, Ontology matching, Instance transformation, Coreference resolution, Dependency processing, Source KB, Target KB, SPARQL query translation]
KnoFuss architecture
[Figure: KnoFuss architecture; labels: New data, Fusion module, Fusion KB, Intermediate data, Main KB, Fusion ontology, Method library (Object Identification Method, Conflict Detection Method, Conflict Resolution Method)]
• Method library
  – Contains implementation of each specific algorithm
• Fusion ontology
  – Describes method capabilities
  – Defines intermediate structures (mappings, conflict sets, etc.)
Steps
• Coreference resolution
  – Attribute similarity algorithms
• Dependency processing
  – Employing additional information:
    • Schema restrictions
    • Links between instances
    • Provenance
  – Using formal uncertainty reasoning
    • Dempster-Shafer belief networks
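For readers unfamiliar with Dempster-Shafer reasoning, the sketch below shows Dempster's rule of combination for two mass functions over a small frame. It is a textbook illustration only, not KnoFuss code and not a belief network; the two evidence sources and their masses are invented.

```python
from itertools import product
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]

def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule: multiply masses, intersect focal sets,
    discard the conflicting mass and renormalise the rest."""
    combined: Mass = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            conflict += wa * wb          # evidence that cannot both hold
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

SAME = frozenset({"same"})
DIFF = frozenset({"different"})
EITHER = SAME | DIFF                     # ignorance: could be either

# Two hypothetical pieces of evidence about whether two instances corefer.
label_evidence: Mass = {SAME: 0.8, EITHER: 0.2}
schema_evidence: Mass = {SAME: 0.6, DIFF: 0.1, EITHER: 0.3}
fused = combine(label_evidence, schema_evidence)
```

Mass assigned to `EITHER` expresses ignorance rather than evidence against a match, which is the feature that distinguishes this formalism from a plain probability over {same, different}.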
Scarlet
• Methods for dealing with heterogeneous knowledge sources
  – Ontology Matching (Scarlet)
• The motivation is that we will need to match between the different ontologies used by different products.
– Label similarity methods
  • e.g., Full_Professor = FullProfessor
– Structure similarity methods
  • Using taxonomic/property related information
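A label similarity method in its simplest form normalises both labels and compares them, as in the Full_Professor = FullProfessor case. The sketch below is one possible normalisation (splitting camelCase, underscores, hyphens and spaces); the function names are hypothetical.

```python
import re

def normalize_label(label: str) -> str:
    """Split camelCase and separator characters, then lowercase the tokens."""
    # Insert a space at each lowercase/digit -> uppercase boundary.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    tokens = re.split(r"[\s_\-]+", spaced.strip())
    return " ".join(t.lower() for t in tokens if t)

def labels_match(a: str, b: str) -> bool:
    """Two labels match when their normalised forms are identical."""
    return normalize_label(a) == normalize_label(b)

same = labels_match("Full_Professor", "FullProfessor")
```

Such a matcher finds nothing for pairs like Beef and Food, whose labels share no tokens, which is the limitation the background-knowledge approach addresses.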
Ontology Matching
– Most ontology matching techniques work only in cases when:
  – There is a sufficient syntactic overlap between the labels of the concepts in the matched ontologies
  – The structure of the matched ontologies is rich enough to allow meaningful comparisons
– However, this might not be the case for smart products
  – Smart products from different domains will use very different terminology, thus excluding syntactic comparison
  – Smart product ontologies will probably be small and structurally shallow due to their resource limitations, thus excluding the use of structure based techniques
– Therefore we propose a new matching paradigm which relies on the use of external knowledge
Ontology Matching
New paradigm: use of background knowledge
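[Figure: concepts A and B are anchored to A’ and B’ in the Background Knowledge (external source); the relation R found between A’ and B’ is carried over as the relation R between A and B]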
Proposal:
• rely on online ontologies (Semantic Web) to derive mappings
• ontologies are dynamically discovered (using Watson) and combined
[Figure: the relation rel between A and B is derived from the Semantic Web]
External Source = Semantic Web
Does not rely on any pre-selected knowledge sources.
Sabou, M., d'Aquin, M., and Motta, E. (2008) Exploring the Semantic Web as Background Knowledge for Ontology Matching, Journal on Data Semantics XI.
The Question is …
How to combine online ontologies to derive mappings?
Strategy 1 - Definition
Find ontologies that contain equivalent classes for A and B and use their relationship in the ontologies to derive the mapping.
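[Figure: the relation rel between A and B is derived from Semantic Web ontologies O1, O2, ..., On, where each Oi contains the equivalent classes Ai’ and Bi’]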
For each ontology use these rules:
  A’ ≡ B’ ⇒ A ≡ B
  A’ ⊆ B’ ⇒ A ⊆ B
  A’ ⊇ B’ ⇒ A ⊇ B
  A’ ⊥ B’ ⇒ A ⊥ B
These rules can be extended to take into account indirect relations between A’ and B’, e.g., between parents of A’ and B’:
  A’ ⊆ C’ ∧ C’ ⊥ B’ ⇒ A ⊥ B
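Strategy 1 amounts to looking for a single ontology that relates the anchors A’ and B’ directly, and copying that relation to A and B. The sketch below assumes each online ontology is reduced to a dictionary from concept pairs to a relation symbol and that anchoring is exact label equality; the ontology contents are hypothetical, and the indirect (parent-based) extension is left out.

```python
from typing import Dict, List, Tuple

Ontology = Dict[Tuple[str, str], str]   # (concept, concept) -> relation symbol

# Reading a stored relation in the opposite direction.
INVERSE = {"≡": "≡", "⊆": "⊇", "⊇": "⊆", "⊥": "⊥"}

def strategy1(a: str, b: str, ontologies: List[Ontology]) -> List[str]:
    """Return one derived relation per ontology that relates a and b directly."""
    derived: List[str] = []
    for onto in ontologies:
        if (a, b) in onto:
            derived.append(onto[(a, b)])            # A' rel B'  =>  A rel B
        elif (b, a) in onto:
            derived.append(INVERSE[onto[(b, a)]])   # stored the other way round
    return derived

# Hypothetical fragments of two online ontologies.
iswc: Ontology = {("Researcher", "AcademicStaff"): "⊆"}
swrc: Ontology = {("AcademicStaff", "Researcher"): "⊇"}
relations = strategy1("Researcher", "AcademicStaff", [iswc, swrc])
```

When several ontologies agree, as here, the repeated relation can serve as a crude confidence signal; when none contains both anchors the list is empty.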
Strategy 1- Examples
[Figure: Researcher vs. AcademicStaff. Ontologies found on the Semantic Web (labels: ka2.rdf, ISWC, SWRC) state Researcher ⊆ AcademicStaff]
But what if there exists no ontology that contains both A and B?
[Figure: Beef vs. Food. Labels: Tap, SR-16, FAO_Agrovoc; the chain Beef ⊆ RedMeat ⊆ MeatOrPoultry ⊆ Food yields Beef ⊆ Food]
Strategy 2 - Definition
(r1) A’ ⊆ C ∧ C ⊆ B ⇒ A ⊆ B
(r2) A’ ⊆ C ∧ C ≡ B ⇒ A ⊆ B
(r3) A’ ⊆ C ∧ C ⊥ B ⇒ A ⊥ B
(r4) A’ ⊇ C ∧ C ⊇ B ⇒ A ⊇ B
(r5) A’ ⊇ C ∧ C ≡ B ⇒ A ⊇ B
Principle: If no ontologies are found that contain the two terms then combine information from multiple ontologies to find a mapping.
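[Figure: A is anchored to A’ in one Semantic Web ontology, where A’ is related to C; C is anchored to C’ in another ontology, where C’ is related (rel) to B’; combining the two gives the relation rel between A and B]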
Details:
(1) Select all ontologies containing A’ equiv. with A
(2) For each ontology containing A’:
  (a) if A’ ⊆ C, find relation between C and B.
  (b) if A’ ⊇ C, find relation between C and B.
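Strategy 2 can be sketched as a two-step lookup plus a composition table. The table below is my reading of the five rules: ⊆ followed by ⊆ or ≡ gives ⊆, ⊆ followed by ⊥ gives ⊥, and ⊇ followed by ⊇ or ≡ gives ⊇. Ontologies are again reduced to dictionaries from concept pairs to relation symbols, anchoring is exact label equality, and the ontology fragments are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Ontology = Dict[Tuple[str, str], str]   # (concept, concept) -> relation symbol

# (relation A'-C, relation C-B) -> derived relation A-B
COMPOSE: Dict[Tuple[str, str], str] = {
    ("⊆", "⊆"): "⊆",   # r1
    ("⊆", "≡"): "⊆",   # r2
    ("⊆", "⊥"): "⊥",   # r3
    ("⊇", "⊇"): "⊇",   # r4
    ("⊇", "≡"): "⊇",   # r5
}

def strategy2(a: str, b: str, ontologies: List[Ontology]) -> Set[str]:
    """Combine a relation A'-C from one ontology with C-B from another."""
    derived: Set[str] = set()
    for first in ontologies:                      # (1) ontologies containing A'
        for (x, c), rel_ac in first.items():
            if x != a:
                continue
            for second in ontologies:             # (2) relation between C and B
                rel_cb = second.get((c, b))
                if rel_cb is None:
                    continue
                rel_ab = COMPOSE.get((rel_ac, rel_cb))
                if rel_ab is not None:
                    derived.add(rel_ab)
    return derived

# Hypothetical ontology fragments.
midlevel: Ontology = {("Chicken", "Poultry"): "⊆"}
tap: Ontology = {("Poultry", "Food"): "⊆"}
pizza: Ontology = {("Ham", "Meat"): "⊆"}
wine: Ontology = {("Meat", "Seafood"): "⊥"}
sources = [midlevel, tap, pizza, wine]

chicken_food = strategy2("Chicken", "Food", sources)
ham_seafood = strategy2("Ham", "Seafood", sources)
```

In this sketch the relation between C and B is a direct lookup; in the approach described here that step is itself a relation-discovery problem over online ontologies.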
Strategy 2 - Examples
Ex1: Chicken vs. Food
  Chicken ⊆ Poultry (midlevel-onto)
  Poultry ⊆ Food (Tap)
  (r1) ⇒ Chicken ⊆ Food
  (Same results for Duck, Goose, Turkey)
Ex2: Ham vs. Food
  Ham ⊆ Meat (pizza-to-go)
  Meat ⊆ Food (SUMO)
  (r1) ⇒ Ham ⊆ Food
Ex3: Ham vs. Seafood
  Ham ⊆ Meat (pizza-to-go)
  Meat ⊥ Seafood (wine.owl)
  (r3) ⇒ Ham ⊥ Seafood
Large Scale Evaluation
• Ontology Alignment Evaluation Initiative - food
• AGROVOC: UN FAO’s agricultural thesaurus
  – ±16,000 terms
  – multilingual
• NALT: United States National Agricultural Library Thesaurus
  – ±41,000 terms
• Precision obtained: 70%
[Figure: Scarlet takes Concept_A (e.g., Supermarket) and Concept_B (e.g., Building), accesses the Semantic Web, and deduces the semantic relation between them (⊆)]
- SCARLET - relation discovery on the SW
- http://scarlet.open.ac.uk/
- Automatically selects and combines multiple online ontologies to derive a relation
Basic functionality used: Relation Discovery
Watson
• Infrastructures for storing large scale semantics (Watson)
• The story here is that Watson could be used as a starting point for building a storage infrastructure for the distributed semantic information in SmartProducts.
Watson is a Search Engine for the Semantic Web
Gateway
Architecture
Web Interface
Web Interface: Advanced Keyword Search
Web Interface: Ontology Exploration
Web Interface: Ontology Metadata
Web Interface: Querying
APIs
• SOAP and REST APIs that provide the infrastructure to:
  – Find SW documents and retrieve metadata about them
  – Find entities (classes, properties, individuals) and explore their semantic description
  – Apply SPARQL queries to Semantic Web documents
Next Generation Semantic Web Applications
WATSON enables a new generation of Semantic Web applications that need to access and reuse semantic information distributed on the entire Web.
Examples of NGSW
IEEE Intelligent Systems 23(3), pp. 20-28, May/June 2008
• Key aspects of the paradigm
• Tech. Infrastructure
• Concrete Applications
PowerAqua
• Tools for exploiting distributed semantics
  – Open Domain, Multi-Ontology Question Answering (PowerAqua)
PowerAqua
– Cross ontology question answering
– Selects and combines relevant information from multiple ontologies
PowerAqua
Natural language question
Answers from online semantic data
Open domain QA by exploring distributed semantic data.
PowerAqua: Architecture
• Steps 2 and 3 implement a run time knowledge matcher that efficiently produces mappings across ontologies and domains
• Performs concept, relation, instance and literal mapping
• No assumptions on the user input
• No assumptions on the domain, structure or complexity of ontologies