Thesis Proposal: Mini-Ontology GeneratOr (MOGO)
Mini-Ontology Generation from Canonicalized Tables
Stephen Lynn
Data Extraction Research Group
Department of Computer Science
Brigham Young University
Supported by the
TANGO Overview
TANGO: Table ANalysis for Generating Ontologies
Project consists of the following three components:
1. Transform tables into a canonicalized form
2. Generate mini-ontologies
3. Merge into a growing ontology
Thesis Statement
Proposed Solution
    Develop a tool to accurately generate mini-ontologies from canonicalized tables of data automatically, semi-automatically, or manually.
Evaluation
    Evaluate accuracy of tool with respect to: concept/value recognition, relationship discovery, and constraint discovery.
Sample Input
Region and State Information

Location        Population (2000)   Latitude   Longitude
Northeast           2,122,869
  Delaware            817,376          45         -90
  Maine             1,305,493          44         -93
Northwest           9,690,665
  Oregon            3,559,547          45        -120
  Washington        6,131,118          43        -120
[Figure: canonicalized form of the table]
Title: Region and State Information
Location
    Northeast: Delaware, Maine
    Northwest: Oregon, Washington
[Dimension2]
    Population (2000), Latitude, Longitude
Data cells shown: 2,122,869; 817,376; -120
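One way to hold the canonicalized form in memory is a small tree per dimension. The following is a minimal sketch, not MOGO's actual data model: the `DimensionNode` class and its `leaves` method are hypothetical names introduced only for illustration, and the tree contents are taken from the sample table above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DimensionNode:
    """Hypothetical node in a dimension tree: a label plus child labels."""
    label: str
    children: List["DimensionNode"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        """Return the labels of all leaf nodes, left to right."""
        if not self.children:
            return [self.label]
        return [leaf for child in self.children for leaf in child.leaves()]


# The Location dimension of the sample table.
location = DimensionNode("Location", [
    DimensionNode("Northeast", [DimensionNode("Delaware"), DimensionNode("Maine")]),
    DimensionNode("Northwest", [DimensionNode("Oregon"), DimensionNode("Washington")]),
])

# The second dimension has only a placeholder label in the sample.
dimension2 = DimensionNode("[Dimension2]", [
    DimensionNode("Population"),
    DimensionNode("Latitude"),
    DimensionNode("Longitude"),
])

print(location.leaves())    # ['Delaware', 'Maine', 'Oregon', 'Washington']
print(dimension2.leaves())  # ['Population', 'Latitude', 'Longitude']
```

A data cell could then be addressed by one leaf from each tree, for example (Delaware, Population).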
Sample Output
Mini-Ontology GeneratOr (MOGO)
1. Concept/Value Recognition
2. Relationship Discovery
3. Constraint Discovery
NOTE: MOGO implements a base set of algorithms for each step of the process and allows for runtime integration of new algorithms.
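The idea of a base set of algorithms per step, with new ones added at runtime, can be sketched as a small registry. This is only an illustration under stated assumptions: `MiniOntologyPipeline`, `register`, `run`, the step names, and the dict-based table and ontology are all hypothetical and are not taken from MOGO itself.

```python
from typing import Callable, Dict, List

# Hypothetical signature: an algorithm reads the table and updates the ontology.
Algorithm = Callable[[dict, dict], None]

# The three steps, in the order they are applied.
STEPS = ("concept_value_recognition", "relationship_discovery", "constraint_discovery")


class MiniOntologyPipeline:
    """Runs every registered algorithm for each step, step by step."""

    def __init__(self) -> None:
        self._algorithms: Dict[str, List[Algorithm]] = {step: [] for step in STEPS}

    def register(self, step: str, algorithm: Algorithm) -> None:
        """Add an algorithm to a step; can be called at any time before run()."""
        if step not in self._algorithms:
            raise ValueError(f"unknown step: {step}")
        self._algorithms[step].append(algorithm)

    def run(self, table: dict) -> dict:
        ontology = {"concepts": [], "relationships": [], "constraints": []}
        for step in STEPS:
            for algorithm in self._algorithms[step]:
                algorithm(table, ontology)
        return ontology


def dimension_roots_as_concepts(table: dict, ontology: dict) -> None:
    """Toy algorithm for illustration: every dimension root label becomes a concept."""
    ontology["concepts"].extend(table["dimensions"])


pipeline = MiniOntologyPipeline()
pipeline.register("concept_value_recognition", dimension_roots_as_concepts)
result = pipeline.run({
    "title": "Region and State Information",
    "dimensions": ["Location", "[Dimension2]"],
})
print(result["concepts"])  # ['Location', '[Dimension2]']
```

Keeping each algorithm as a plain callable means a new one can be registered without touching the pipeline class.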
Concept/Value Recognition
    Lexical Clues
    Data value assignment
    Labels as data values
    Default: Classifies any unclassified elements according to simple heuristic.
Concepts and Value Assignments
    Region: Northeast, Northwest
    State: Delaware, Maine, Oregon, Washington
    Population: 2,122,869; 817,376; 1,305,493; 9,690,665; 3,559,547; 6,131,118
    Latitude: 45; 44; 45; 43
    Longitude: -90; -93; -120; -120
Relationship Discovery
    Dimension Tree Mappings
    Lexical Clues
    Generalization/Specialization
    Aggregation
    Data Frames
    Ontology Fragment Merge
Constraint Discovery
    Generalization/Specialization
    Computed Values
    Functional Relationships
    Optional Participation
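Three of these constraint types can be checked directly against the sample table. The sketch below is an assumption about how such checks might look, not MOGO's algorithms: the row layout and the helper names `is_functional`, `region_totals_are_sums`, and `is_optional` are hypothetical, and the values are copied from the sample table.

```python
from collections import defaultdict

# Rows of the sample table: (region, state or None, population, latitude, longitude).
ROWS = [
    ("Northeast", None, 2122869, None, None),
    ("Northeast", "Delaware", 817376, 45, -90),
    ("Northeast", "Maine", 1305493, 44, -93),
    ("Northwest", None, 9690665, None, None),
    ("Northwest", "Oregon", 3559547, 45, -120),
    ("Northwest", "Washington", 6131118, 43, -120),
]


def is_functional(pairs):
    """Functional relationship: every left value maps to exactly one right value."""
    seen = defaultdict(set)
    for left, right in pairs:
        seen[left].add(right)
    return all(len(rights) == 1 for rights in seen.values())


def region_totals_are_sums(rows):
    """Computed value: each region's population equals the sum over its states."""
    totals = {}
    sums = defaultdict(int)
    for region, state, population, _lat, _lon in rows:
        if state is None:
            totals[region] = population
        else:
            sums[region] += population
    return all(sums[region] == total for region, total in totals.items())


def is_optional(rows, index):
    """Optional participation: at least one row has no value in this column."""
    return any(row[index] is None for row in rows)


state_rows = [row for row in ROWS if row[1] is not None]
print(is_functional((r[1], r[2]) for r in state_rows))  # State -> Population: True
print(is_functional((r[3], r[1]) for r in state_rows))  # Latitude -> State: False
print(region_totals_are_sums(ROWS))                     # True
print(is_optional(ROWS, 3))                             # Latitude is optional: True
```

In the sample, latitude 45 appears for both Delaware and Oregon, which is why the reverse direction is not functional, and the region rows carry no latitude or longitude, which is why those columns come out optional.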
Validation
Concept/Value Recognition
    Correctly identified concepts
    Missed concepts
    False positives
    Data values assignment
Relationship Discovery
    Valid relationship sets
    Invalid relationship sets
    Missed relationship sets
Constraint Discovery
    Valid constraints
    Invalid constraints
    Missed constraints
                          Precision   Recall
Concept Recognition
Relationship Discovery
Constraint Discovery

precision = Total_Correct_Found / (Actual_Correct + Total_Incorrect_Found)
recall = Total_Correct_Found / Actual_Correct
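The valid, invalid, and missed tallies listed for each step are enough to compute precision and recall. The sketch below assumes the conventional definitions (precision over everything found, recall over everything that should have been found), which may differ in detail from the formulas on the slide. The function name and the counts in the usage line are hypothetical and purely illustrative, not results.

```python
def precision_recall(valid, invalid, missed):
    """Conventional precision and recall from three tallies.

    valid   -- items the tool found that are correct
    invalid -- items the tool found that are incorrect
    missed  -- correct items the tool did not find
    """
    found = valid + invalid    # everything the tool reported
    actual = valid + missed    # everything it should have reported
    precision = valid / found if found else 0.0
    recall = valid / actual if actual else 0.0
    return precision, recall


# Illustrative counts only: 8 valid, 2 invalid, 2 missed.
precision, recall = precision_recall(valid=8, invalid=2, missed=2)
print(precision, recall)  # 0.8 0.8
```

The same function applies unchanged to concepts, relationship sets, and constraints, since each is scored with the same three tallies.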
Contribution
    Tool to generate mini-ontologies
    Assessment of accuracy of automatic generation