Mobile Ontology Cloud- Semantic Post-IT -
IT Life and Ontology
Key-Sun Choi ([email protected]) http://kschoi.kaist.ac.kr/
CILab & Semantic Web Research
1st day: what we will learn
• What is Semantic Post-it? (15 min)
• Demo and Downloadable (5 min)
• Enabling Technologies (15 min)
• APIs for Technologies (5 min)
o ontocore.org (what you can do)
o Protégé API
• Remaining in your home
o References to read and to use
What is Semantic Post-it?: Contents
• As Mobile App
• Personal Ontology Editors
• Benefits when interpreting the input messages
What is the Semantic Post-It?
• A system that maps personal randomized messages into a well-organized personal information space based on collective intelligence.
• Personal randomized message
• Organizing by interpreting messages
o Table information extraction from text
o Relevant table information grouping
• Personal information space
o Usage of ontology that user can edit
• Collective intelligence
o Usage of pivot ontology based on Wikipedia (web-based encyclopedia that anyone can edit)
Introduction
A working flow of Semantic Post-It
Contents Space -> Message Space -> Triple Message Space (Table information) -> Linked Triple Message Space

Messages (Message Space)
• Windows Mobile is a compact mobile operating system developed by Microsoft.
• Flash memory is a non-volatile computer storage that can be electrically erased and reprogrammed.
• Omnia 2 is a multimedia smartphone announced by Samsung. Omnia 2 runs Windows Mobile and comes with flash memory.

Triples (Triple Message Space)
• Windows Mobile isDevelopedBy Microsoft
• Flash memory ISA computer storage
• Omnia 2 ISA smartphone
• Omnia 2 hasOS Windows Mobile
• Omnia 2 hasMemory Flash memory

Links (Linked Triple Message Space)
• The Omnia 2 message is linked to the Windows Mobile message by hasOS and to the Flash memory message by hasMemory.
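The last step of the flow, turning per-message triples into a linked space, can be sketched in a few lines of Python. This is only an illustration: the message ids and the rule "link A to B when an object of A is a subject of B" are assumptions made for the sketch, not the system's actual algorithm.

```python
# Triples copied from the slide, grouped by the message they came from.
# The message ids are hypothetical names used only in this sketch.
MESSAGE_TRIPLES = {
    "msg-windows-mobile": [("Windows Mobile", "isDevelopedBy", "Microsoft")],
    "msg-flash-memory": [("Flash memory", "ISA", "computer storage")],
    "msg-omnia-2": [
        ("Omnia 2", "ISA", "smartphone"),
        ("Omnia 2", "hasOS", "Windows Mobile"),
        ("Omnia 2", "hasMemory", "Flash memory"),
    ],
}


def link_messages(message_triples):
    """Link message A to message B when an object in A is a subject in B."""
    # Remember which message describes each subject.
    subject_owner = {}
    for msg_id, triples in message_triples.items():
        for subject, _, _ in triples:
            subject_owner[subject] = msg_id

    links = []
    for msg_id, triples in message_triples.items():
        for _, prop, obj in triples:
            target = subject_owner.get(obj)
            if target is not None and target != msg_id:
                links.append((msg_id, prop, target))
    return links


links = link_messages(MESSAGE_TRIPLES)
for source, prop, target in links:
    print(f"{source} --{prop}--> {target}")
```

The two resulting links carry the labels hasOS and hasMemory, which are the edge labels shown in the linked space on the slide.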
Introduction
Motivating Scenario
Another similar Smartphone?
More details on OS
What is the recent trend of it?
Company in competition
What should we do?
Motivation
Reading an article on “Omnia 2”
Motivating Scenario
Another similar Smartphone?
More details on OS
What is the recent trend of it?
Company in competition
We have to think about what types of information are involved
Reading an article on “Omnia 2”
CPU clock
Products of the company
Manufacturer, design
OS, platform
Motivation
Motivating Scenario
Where is he from?
?
?
?
If we are new to philosophers, we are unlikely to know which information is relevant
Reading an article on “Immanuel Kant”
nationality
Motivation
What is the solution?
• We need a system that retrieves relevant information
• A data set that specifies attributes for each concept is needed
o Smartphone: manufacturer, OS, memory, …
o Philosophers: nationality, follower, teacher, …
• However, no single person can describe every concept
• We can obtain the data set from collective intelligence
Motivation
Artist
engineer scientist
Philosophers
politician author
Wikipedia
Wikipedia documents (2010/01/29)
3,175,836 (ENG) - 11,527,437 users
125,801 (KOR) - 100,498 users
Motivation
① Inter-page link
② Inter-language link
③ Category
④ Infobox: table information
Established February 16, 1971
Type Government-run
President Nam-Pyo Suh
… …
New paradigm
• A few years have passed since a new paradigm was introduced.
• Semantic Web
o A machine-readable web
• Ontology
o A formal specification of knowledge
Background Technologies
Semantic Web
• An evolving development of the World Wide Web
• The meaning (semantics) of information and services on the web is defined
• For the web to "understand" and satisfy the requests of people and machines to use the web content
Our focus
Adapted from Wikipedia(http://en.wikipedia.org/wiki/Semantic_Web)
Background Technologies
RDF
• Resource Description Framework
Adapted from Wikipedia(http://en.wikipedia.org/wiki/Resource_Description_Framework)
<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/title> "Tony Benn" .
<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/publisher> "Wikipedia" .
A Wikipedia article about Tony Benn
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
  </rdf:Description>
</rdf:RDF>
An expression of “triple”
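To see that the RDF/XML form and the triple form say the same thing, the short Python sketch below reads the RDF/XML for the Tony Benn example and prints one triple per property. It uses only the standard library and handles just this simple shape (one rdf:Description with literal values); a real application would use a full RDF parser instead.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
  </rdf:Description>
</rdf:RDF>"""


def rdfxml_to_triples(xml_text):
    """Return (subject, predicate, literal) tuples for simple rdf:Description blocks."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for child in desc:
            # ElementTree tags look like "{namespace}local"; join them into one IRI.
            namespace, local = child.tag[1:].split("}")
            triples.append((subject, namespace + local, child.text))
    return triples


triples = rdfxml_to_triples(RDF_XML)
for s, p, o in triples:
    print(f'<{s}> <{p}> "{o}" .')
```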
Background Technologies
Ontology
[Figure: example ontology with a schema layer and an instance layer]
Schema
• Classes: Mobile Phone, PDA, Cellular Phone, Smart Phone, Company, Software, OS
• rdfs:subClassOf links among the phone classes
• Properties: releaseDate, cameraPixelOf, hasMemorySize, isManufacturedBy, supportSoftware, supportOnlineSoftware, hasWebsite, runsOn
• supportOnlineSoftware rdfs:subPropertyOf supportSoftware
Instance
• Individuals: Samsung i900 Omnia, Samsung, Skype, Windows Mobile 6.1
• Values: releaseDate 2008, cameraPixelOf 5 megapixels, hasMemorySize 128 MB, hasWebsite www.skype.com
Background Technologies
A formal specification of knowledge to be interpreted by computers
Content Space -> Message Space
Scrap
KAIST is located in Daejeon, South Korea. KAIST was established by Korean government in 1971.
Typical Web Browser
Semantic Post-It(Message List)
Related Problems : Mash-Up
How to extract text from heterogeneous contents (in a context, not a scientific issue)
Illustrative Example
External Contents
KAIST is located in Daejeon, South Korea. KAIST was established by Korean government in 1971
Message Space -> Triple Message Space (1/2)
Semantic Post-It(Message List)
KAIST is located in Daejeon, South Korea. KAIST was established by Korean government in 1971.… The current KAIST President Nam Pyo Suh taught for…
Semantic Post-It(Detail View)
Person
Related Problems : ISA relation recognition
Smartphone’s UI is limited. Information should be shown with one click.
Illustrative Example
Message Space -> Triple Message Space(2/2)
Semantic Post-It(Message List)
Semantic Post-It(Message View)
Summarization
Semantic Post-It(Table View)
Established 1971
Province Daejeon
Country South Korea
… …
KAIST
Related Problems : Triple extraction from text
Display size is too small to do full browsing.
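As a rough idea of what "triple extraction from text" produces, here is a minimal Python sketch that turns the KAIST message into the table shown above. The two hand-written patterns are assumptions made for this example only; they are not how TABLEGEN works and they would not generalize to other sentences.

```python
import re

MESSAGE = ("KAIST is located in Daejeon, South Korea. "
           "KAIST was established by Korean government in 1971.")

# Hypothetical patterns: each maps one sentence shape to table properties.
PATTERNS = [
    (re.compile(r"(?P<s>\w+) is located in (?P<city>[\w ]+), (?P<country>[\w ]+)\."),
     [("Province", "city"), ("Country", "country")]),
    (re.compile(r"(?P<s>\w+) was established by .*? in (?P<year>\d{4})"),
     [("Established", "year")]),
]


def extract_triples(text):
    """Return (subject, property, value) triples found by the patterns."""
    triples = []
    for pattern, fields in PATTERNS:
        for match in pattern.finditer(text):
            for prop, group in fields:
                triples.append((match.group("s"), prop, match.group(group)))
    return triples


def to_table(triples, subject):
    """Collect the triples of one subject into a property -> value table."""
    return {prop: value for s, prop, value in triples if s == subject}


triples = extract_triples(MESSAGE)
table = to_table(triples, "KAIST")
for prop, value in table.items():
    print(f"{prop:12} {value}")
```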
Illustrative Example
Triple Message Space ->Linked Triple Message Space
Semantic Post-It(Message List)
Semantic Post-It(Message View)
Relevant messages
Semantic Post-It(Graph View)
Related Problems : Relevant keyword search by traversing Ontology
Display size is too small to show text
KAIST is located in Daedeok…
Suh was born in Korea on April 22, 1936, and immigrated to the U.S. in 1954….
Daejeon is a center of transportation in South Korea, where two major,
province
president
Illustrative Example
Linked Triple Message Space
Semantic Post-It(Using Ontology 1)
Related Problems : Personal ontology editing, logical consistency checking
Semantic Post-It(Using Ontology 2)
University
Person
Settlement
president
province
Ontology 1
University
Person
Country
president
locatedAt
Ontology 2
Illustrative Example
KAIST is located in Daedeok…
Suh was born in Korea on April 22, 1936, and immigrated to the U.S. in 1954….
Daejeon is a center of transportation in South Korea, where two major,
province
president
KAIST is located in Daedeok…
Suh was born in Korea on April 22, 1936, and immigrated to the U.S. in 1954….
South Korea is a presidential republic consisting of 16 administrative…
province
president
Personal Ontology Editor
• Rename the property name
o If you wish to see another label in the link
o Ex) isManufacturedBy -> manufacturer
• Modify constraints
o If you wish to see the country name rather than the city name
o Ex)
- Remove : University-province-Settlement
- Add : University-locatedAt-Country
• Use the modified ontology in your Semantic Post-It
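The two editing operations can be pictured with a very small data model. The sketch below is an assumption made for illustration (constraints as domain-property-range tuples, labels as a dictionary); it is not the data model of the Semantic Post-It editor or of Protégé.

```python
# Ontology 1 from the example, as (domain, property, range) constraints.
ontology_1 = {
    ("University", "president", "Person"),
    ("University", "province", "Settlement"),
}


def replace_constraint(ontology, remove, add):
    """Return a new ontology with one constraint removed and another added."""
    if remove not in ontology:
        raise KeyError(f"constraint not found: {remove}")
    return (ontology - {remove}) | {add}


def rename_property(labels, prop, new_label):
    """Return a copy of the display labels with a new label for one property."""
    updated = dict(labels)
    updated[prop] = new_label
    return updated


# Modify constraints: show the country instead of the city.
ontology_2 = replace_constraint(
    ontology_1,
    remove=("University", "province", "Settlement"),
    add=("University", "locatedAt", "Country"),
)

# Rename: show "manufacturer" on links instead of "isManufacturedBy".
labels = rename_property({}, "isManufacturedBy", "manufacturer")

print(sorted(ontology_2))
print(labels)
```

Both functions return new objects, so the original ontology stays available if the user wants to switch back.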
How to embed this complex UI into Smartphone?
http://protege.stanford.edu/
Illustrative Example
System architecture (1/2)
Local Message DB
Semantic Post-IT client
Semantic Post-IT Server(HTTP server)
System Message DB
HTTP request
HTTP response
TABLEGEN CAT2ISA
Ontology Access
DBpedia Access
Message Interpretation Services
Personal Ontology
External Message Service
Twitter, Blog, Email, Calendar, …
System architecture (2/2)
Local Message DB
Semantic Post-IT client
Personal Ontology
• Local Message DB controller
• Message input interface
• Message list viewer
• HTTP service controller
o Semantic Post-IT server
o External message service
• Message relation graph viewer
• Personal ontology editor
Semantic Post-IT client
Demo and Downloadable
• http://swrc.kaist.ac.kr/SemanticToolkits/
What is Semantic Post-It?
Memo Admin Service
Semantic Service Mash-Up
Evernote, quickies, etc.
Semantic Service Mash-up
• Definition of 3 types of applications
– Type 1 Application: Information zooming on specific ‘word’ of a memo
– Type 2 Application: Memo Contents Analysis
– Type 3 Application: Information zooming on whole context of a memo
Type 1 Application: Example
DEMO: Semantic Post-It
Type 2 Application: Demo
DEMO: Semantic Post-It
Type 3 Application: Demo
DEMO: Semantic Post-It
Structure of Semantic Post-It
Post-It Client
• Add new memo
• Delete memo
• Change memo
• Tag memo
• Attach ontology to memo
• Local File System
• Request for new application
• Execute application
• Find Related Memo
• Synchronization
• Request Ontology
• Request Shared Memo
Post-It Server
• Service Repository: communication between Server and Client
o Provide application list
o Application install
• Synchronization Module (Personal Memos)
o Synchronization between Server & Client
• Shared Memo Request Module (Shared Memo)
o Return shared memos which the client has requested
o Can download shared memo to local database
• Ontology Request Module (Ontology Repository)
• Wikipedia Documents
PURE PART
Enterprise Part: Add-on of Semantic Applications
Support for Semantic Post-It: OntoCloud
• Ontology derived from Wikipedia infoboxes
• Official Website: http://swrc.kaist.ac.kr/ontocloud/
Support for Type 2 Application: Semantic Annotation
• One possible type 2 application: table-form summary generator
• Semantic Annotation: mark on the documents which part could be transformed into a table
Semantic Annotation Toolkit: COAT
DEMO: COAT
From annotated data to Application: Machine Learning Feature
• Support Vector Machine(SVM)
Ontology Feature
Modern GSM-based BlackBerry handhelds incorporate an ARM 7 or 9 processor, while older BlackBerry 950 and 957 handhelds used Intel 80386 processors.
CPU
Intel 80386
IT Ontology Package
useCPU
Gathering semantic info using Ontology
Data Authority Policy
• Annotators can check ONLY their own documents!
– To prevent cheating
• Simple annotation data viewer is available
– For administrators
DEMO: COAT Viewer
Support for Type 3 Application: 3M Wikipedia articles into Database
• Provide baseline for shared memo
– For type 3 application
• Build shared memo database with 3M Wikipedia articles as its part
Screenshots
1) User inputs message 2) Ontology recommendation
3) Table information extraction
4) Relevant message grouping
Enabling Technologies
• CAT2ISA
• Table Generator
Ontology expression
OWL (Web Ontology Language)
<!-- rdf:ID values must be XML names, so the spaces in the class names are dropped -->
<owl:Class rdf:ID="MobilePhone"/>
<owl:Class rdf:ID="PDA">
  <rdfs:subClassOf rdf:resource="#MobilePhone"/>
</owl:Class>
<owl:Class rdf:ID="SmartPhone">
  <rdfs:subClassOf rdf:resource="#MobilePhone"/>
</owl:Class>
<owl:Class rdf:ID="CellularPhone">
  <rdfs:subClassOf rdf:resource="#MobilePhone"/>
</owl:Class>
<owl:Class rdf:ID="MobilePhoneSoftware"/>
<owl:ObjectProperty rdf:ID="hasSoftware">
  <rdfs:domain rdf:resource="#MobilePhone"/>
  <rdfs:range rdf:resource="#MobilePhoneSoftware"/>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:ID="hasOnlineSoftware">
  <rdfs:subPropertyOf rdf:resource="#hasSoftware"/>
</owl:ObjectProperty>
Technologies
Ontology inference
[Figure: ontology inference example]
Text messages: "IPTV service is launched", "Apple releases iPhone", "Samsung releases Omnia", "Apple supports Green technologies"
Classes: Company, Product, Software, Device, Smartphone, Service, TV Service, Technology, Environmental Technology
Instances: Samsung, Apple, Omnia, iPhone, IPTV, HDTV, Green Technology
Relations: ISA, instanceOf, manufacture, use, support, beginService
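The simplest kind of ontology inference is following ISA links upward from an instance. The Python sketch below shows this for the phone example. The exact edges are assumptions read off the figure labels (for instance, that Device ISA Product), so treat the data as illustrative rather than as the real ontology.

```python
# Assumed ISA (subclass) and instanceOf edges, loosely based on the figure labels.
SUBCLASS_OF = {
    "Smartphone": "Device",
    "Device": "Product",
    "TV Service": "Service",
}
INSTANCE_OF = {
    "Omnia": "Smartphone",
    "iPhone": "Smartphone",
    "Samsung": "Company",
    "Apple": "Company",
    "IPTV": "TV Service",
}


def ancestors(cls):
    """Follow ISA links upward and return every superclass, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def inferred_types(instance):
    """Return the asserted class of an instance plus all inferred superclasses."""
    direct = INSTANCE_OF[instance]
    return [direct] + ancestors(direct)


omnia_types = inferred_types("Omnia")
print("Omnia is a:", ", ".join(omnia_types))
```

With these edges, a message such as "Samsung releases Omnia" can be matched against a schema-level relation between Company and Product even though the text never says that Omnia is a product.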
Technologies
Ontology construction from Wikipedia Infobox
Technologies
instance
properties
class
university
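The mapping on this slide (article as instance, infobox template as class, infobox rows as properties) can be sketched as follows. The rows are the ones shown in the infobox example earlier in the deck. The article title "KAIST", the template name "university" and the function name are assumptions made for this illustration.

```python
# Infobox rows from the earlier example; the article is assumed to be KAIST.
INFOBOX = {
    "Established": "February 16, 1971",
    "Type": "Government-run",
    "President": "Nam-Pyo Suh",
}


def infobox_to_triples(article, template, infobox):
    """Map article -> instance, template -> class, infobox rows -> properties."""
    triples = [(article, "instanceOf", template)]
    triples += [(article, key, value) for key, value in infobox.items()]
    return triples


def class_properties(template, infobox):
    """Collect the properties that the class gains from this infobox."""
    return {template: sorted(infobox)}


triples = infobox_to_triples("KAIST", "university", INFOBOX)
schema = class_properties("university", INFOBOX)
for triple in triples:
    print(triple)
print(schema)
```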
Ontology construction from text
Technologies
1. Term extraction and conceptualization
2. Taxonomy Construction
3. Relation Addition
4. Integration
5. Verification
Relation labels in the figure: is-a, not is-a, Part-of, equipment-of, The other
Ontologies in the figure: Existing Ontology, Final Ontology
COAT (CoreOnto Annotation Toolkit)
• Term and relation annotation
Technologies
Ontology construction cost reduction
Cost reduction
• Manual annotation cost reduction by using COAT
• Further reduction could be possible if we can automate the process
Ontology extension cost reduction by automation
Web-scale annotation by ontology extension tech.
1. Devise ontology extension tech.
2. Improve ontology extension tech. and automation
Stages in the figure: Before COAT, COAT, AutoCOAT
Technologies
• Technology for expanding semantic infrastructure
• Extract semantic information from anonymous category system
CAT2ISA ([email protected])
• Extract isa/instanceOf relation
o A instanceOf B: A is a member of set B
- A is called 'instance', B is called 'concept'
- A and B must share 'essential properties': properties that make something what it is
- Example:
  <Key-Sun Choi, instanceOf, Professor>: X
  <Key-Sun Choi, instanceOf, Human>: O
o B isa C: B is a subset of C
• isa/instanceOf relation: vital component in many semantic applications (e.g. semantic search, Q&A system, etc.)
CAT2ISA
• Summarize a text into table format based on its semantic tag
Table Generator (cdh)
• Information extraction using "Ontology"
o Ontology: Formal representation of a set of concepts within a domain and the relationships between those concepts
o Ontology-based information extraction:
Table Generator
Remaining for your home: references
• History of the World Wide Web
o Berners-Lee, Tim; Fischetti, Mark (1999). Weaving the Web. HarperSanFrancisco.
• The Semantic Web
o Berners-Lee, Tim; James Hendler and Ora Lassila (May 17, 2001). "The Semantic Web". Scientific American Magazine.
o Grigoris Antoniou, Frank van Harmelen (March 31, 2008). A Semantic Web Primer, 2nd Edition
• Ontology
o Dean Allemang, James Hendler (May 9, 2008). Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL. Morgan Kaufmann
Remaining for your home: Use experiences
• [1] P. Mistry, P. Maes. Quickies: Intelligent Sticky Notes. In the Proceedings of 4th International Conference on Intelligent Environments (IE08). Seattle, USA. 2008
• [2] Max Van Kleek, Michael Bernstein, Katrina Panovich, Greg Vargas, David Karger, and mc schraefel, Note-to-Self: Examining Personal Information Keeping in a Lightweight Note-Taking Tool.. CHI, 2009
• [3] The Tabulator, http://www.w3.org/2005/ajar/tab
• Read [1,2,3] and use the system [2,3]
• Try also the following system
o http://www.evernote.com/
o Smartphone version is available
2nd day
• Deep story about semantic technology (20 min)
o Wikipedia
o DBpedia
o OntoCloud ([email protected])
• What are the upsides?
o Email 3.0
o Information Zooming
o Mobile hyperlink
o Personal Preference Ontology and its use
o Collective semantic intelligence of LOD + ontology cloud
• Another demo (5 min)
• What you can do immediately (review)
• What you can contribute (review)
• Big picture
o Function, society
o Technology to study
IT-Life Ontology
IT Campus Domain Ontology (Partial)
Wikipedia (http://en.wikipedia.org)
• What is Wikipedia?
o An online, collaboratively edited encyclopedia
o Articles are available in over 250 languages
o Freely available and freely distributable
o Inter-language (interwiki) page links
DBpedia (http://dbpedia.org)
• What is DBpedia?
o A community effort to extract structured information from Wikipedia
o Available on the Web
o Different types of structured information
- Infobox templates: summaries of the most relevant facts contained in an article
- Categorization information
- Images
- Geo-coordinates
- Links to external Web pages
OntoCloud
• Our own constructed Ontology
• Goals
o Making more intelligent IT systems focusing on devices and resources
• Key classes
o Device, Product, Resource, Technology, Person and Company
Structure of OntoCloud
• Template Ontology
o Constructing the Pivot dataset
o The infobox dataset from DBpedia 3.4 (semi-automated)
• IT CUO (IT Core Upper Ontology)
o A middle level ontology for integration
• Ontologies under IT domains
o IT Service Ontology
o IT Device Ontology
o IT Core Ontology
Mobile 3.0 and its Requirements (full picture: jha)
• Email 3.0
• Information Zooming
• Mobile hyperlink
• Personal Preference Ontology and its use
• Collective semantic intelligence of LOD + ontology cloud
E-mail 3.0(email categorization)
Automatically map into a class in ontology
Related Problems
• Topic detection
Current Status
• Categorization of long and well-formed text (e.g. Wikipedia documents)
Challenges
• Short message interpretation
• Personal writing styles
E-mail 3.0(Recipient recommendation)
[email protected] Automatically recommend person to whom the message should be sent
Challenges• Task Ontology modeling
E-mail 3.0(Relevant information attachment)
[email protected] Automatically attach pictures
Challenges
• Semantic tags on multimedia data
• Local file indexing
The Samsung Group is composed of numerous international affiliated businesses, most of them united under the Samsung brand including Samsung Electronics, the world's largest electronics company,
Automatically attach files in local disk
E-mail 3.0(Mash-up Services)
The following list organizes classic and ongoing topics from the field of text-based IR for which contributions are welcome:
- Theory. Retrieval models, language models, similarity measures, formal analysis
- Mining and Classification. Category formation, clustering, entity resolution, document classification
Important Dates:
Mar 30, 2010 Deadline for paper submission
Apr 20, 2010 Notification to authors
May 17, 2010 Camera-ready copy due
Aug 30, 2010 Workshop opens
Workshop Organization:
Benno Stein, Bauhaus University Weimar
Michael Granitzer, Know-Center Graz & Graz University of Technology
Contact: [email protected]
Information about the workshop can be found at http://www.tir.webis.de
Automatically create to-do list
Topic Information retrieval
Deadline Mar 30, 2010
Organizer Benno Stein
Related Problems:
• Table information extraction
• Mash-up
Current Status
• Table information generation from text
Challenges
• Table information generation from semi-structured text
A message in inbox
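For the semi-structured part of such a message, a to-do list can be approximated with a date pattern. The Python sketch below handles only the "Important Dates" block of the sample e-mail. The line format it expects ("Mon DD, YYYY description") is an assumption taken from that one message, and the date parsing assumes English month abbreviations.

```python
import re
from datetime import datetime

MESSAGE = """Important Dates:
Mar 30, 2010 Deadline for paper submission
Apr 20, 2010 Notification to authors
May 17, 2010 Camera-ready copy due
Aug 30, 2010 Workshop opens
"""

# One to-do item per line that starts with a date such as "Mar 30, 2010".
DATE_LINE = re.compile(
    r"^(?P<date>[A-Z][a-z]{2} \d{1,2}, \d{4})\s+(?P<what>.+)$",
    re.MULTILINE,
)


def extract_todo(text):
    """Return (due date, description) pairs sorted by due date."""
    items = []
    for match in DATE_LINE.finditer(text):
        due = datetime.strptime(match.group("date"), "%b %d, %Y").date()
        items.append((due, match.group("what").strip()))
    return sorted(items)


todo = extract_todo(MESSAGE)
for due, what in todo:
    print(due.isoformat(), what)
```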
Information Zooming
• What is information zooming?
o Show a small amount of information first
o When the user requires more information about one part, show more detailed information about that part
• Why is it necessary?
o Mobile environment: small display
- We cannot show all the necessary information at once (lack of space)
• Information zooming for one word
• Information zooming for whole memo
Information Zooming in Semantic Post-It
Mobile hyperlink
• What is mobile hyperlink?
o Represent a URL as a barcode
o Take a picture of the barcode with the cellphone camera to open that URL
• Why is it necessary?
o Mobile environment: small interface
- Hard to type a whole URL
• Example of mobile hyperlink
o QR code
Personal Preference Ontology and its use
• Task of packaging one or several significant sub-parts from a potentially large ontology
o Knowledge sharing and re-use are crucial research issues
• On-demand Extraction Service
o Takes a concept and extracts its relations
• Interactive Service
o The user has to select the classes and relations to consider
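The on-demand extraction idea (take a concept, return its relations) can be sketched as a bounded walk over triples. The Python code below is a minimal illustration: the triples come from the Omnia 2 example used elsewhere in the deck, and the `depth` parameter is a hypothetical knob, not part of any described service.

```python
TRIPLES = [
    ("Omnia 2", "ISA", "smartphone"),
    ("Omnia 2", "hasOS", "Windows Mobile"),
    ("Omnia 2", "hasMemory", "Flash memory"),
    ("Windows Mobile", "isDevelopedBy", "Microsoft"),
    ("Flash memory", "ISA", "computer storage"),
]


def extract_subontology(triples, concept, depth=1):
    """Collect the relations reachable from a concept within `depth` steps."""
    selected = []
    frontier = {concept}
    visited = set()
    for _ in range(depth):
        visited |= frontier
        # Keep every triple whose subject is in the current frontier.
        step = [t for t in triples if t[0] in frontier]
        selected.extend(step)
        # Continue from the objects that have not been expanded yet.
        frontier = {obj for _, _, obj in step} - visited
    return selected


package_small = extract_subontology(TRIPLES, "Omnia 2", depth=1)
package_large = extract_subontology(TRIPLES, "Omnia 2", depth=2)
print(len(package_small), "relations at depth 1")
print(len(package_large), "relations at depth 2")
```

A smaller depth gives a smaller package, which matters on a device with limited memory.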
Collective semantic intelligence of LOD + ontology cloud
The Linked Open Data Cloud
What you can do immediately
• review• discussion
What you can contribute
• Data Synchronization for Mobile applications
o Synchronization is a data transfer between a computer and a mobile device that aims to keep both components in a coherent state
• Knowledge-driven Security Handling for Mobile Applications
o Several attacks on mobile applications have been recently reported
- Device and environment
• Ontology Packaging for Mobile field
o Because of its physical aspect, a mobile device has limited processing and computing capabilities
Big picture
• Function, society and Technology to study
A working flow of Semantic Post-It
Contents Space -> Message Space -> Triple Message Space (Table information) -> Linked Triple Message Space
Big Picture
What is the next step?
Message generation
Linked Triple Message Space
• Flash memory ISA computer storage
• Omnia 2 ISA smartphone
• Omnia 2 hasOS Windows Mobile
• Omnia 2 hasMemory Flash memory
• Windows Mobile isDevelopedBy Microsoft
• Links between messages: hasMemory, hasOS
Big Picture
Personalized Message Space
How to do so? Do you have an idea how to utilize personalized ontology to generate sentences?
Omnia 2 is a multimedia smartphone announced by Samsung. Omnia 2 runs Windows Mobile, developed by Microsoft, and comes with flash memory, which is a computer storage.
Personalized ontology
Functions (1/2)
• From text to presentation file
• Challenges
o Semantic Tagging to Image
o Refer to http://www.image-net.org/
KAIST is located in Daejeon, South Korea. KAIST was established by Korean government in 1971
Established 1971
Province Daejeon
Country South Korea
… …
KAIST
Table information Table information + images
established 1971
province Daejeon
Country South Korea
Big Picture
Functions (2/2)
• From table to text
o Generate NL text by traversing table
- KAIST-Province-Daejeon, Daejeon-Districts-fifth
- KAIST is located in Daejeon. Daejeon is the fifth largest city in the country.
• Challenges
o Transform a predicate into verb phrases
- Ex) Province -> is located in
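A template per predicate is the simplest way to picture this step. The Python sketch below reproduces the KAIST example. The "Province" template follows the slide; the "Districts" template is an assumption invented so that the second sentence comes out as on the slide, and choosing such verb phrases automatically is exactly the open challenge.

```python
# Predicate -> sentence template. Only "Province" comes from the slide;
# the "Districts" template is assumed for this example.
TEMPLATES = {
    "Province": "{s} is located in {o}.",
    "Districts": "{s} is the {o} largest city in the country.",
}

TRIPLES = [
    ("KAIST", "Province", "Daejeon"),
    ("Daejeon", "Districts", "fifth"),
]


def generate_text(triples, start):
    """Traverse the table from `start` and emit one sentence per known predicate."""
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((p, o))

    sentences, queue, seen = [], [start], set()
    while queue:
        subject = queue.pop(0)
        if subject in seen:
            continue
        seen.add(subject)
        for predicate, obj in by_subject.get(subject, []):
            template = TEMPLATES.get(predicate)
            if template:
                sentences.append(template.format(s=subject, o=obj))
            queue.append(obj)  # follow the link to the next table
    return " ".join(sentences)


text = generate_text(TRIPLES, "KAIST")
print(text)
```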
Big Picture
Society
Local Message DB
Semantic Post-IT client
Semantic Post-IT Server(HTTP server)
HTTP request
HTTP response
Message Interpretation Services
Personal Ontology
Make your own message interpretation modules and upload it.
OpenAPI generator will make it available as an OpenAPI service.
OpenAPI generator
TABLEGEN CAT2ISA
Ontology Access
DBpedia Access
Big Picture
Technologies to study(interdisciplinary)
Handle huge amount of messages
Ex) manipulating Wikipedia documents
Plug-in architecture
Ex) Collect personal documents by using Google Desktop APIs
Find person of my interests
Ex) References in papers
Write message anywhere and anytime
Ex) RFID-equipped notes
How to extract table data from memo?
Ex) information extraction from document
Sociology
Design
Cognitive Science
Architecture/Urban design
HCI
Software Engineering
Internet of Things
ConvergenceNetworks
Cloud Computing
Graphics
Middleware
IR, AI, Machine Learning
DB & Data Mining
Which layout is suitable for the display?
Ex) Table-memo for a tiny display
Which type of memo? Writing style analysis
Ex) To-do list, contact, documents
Big Picture
Deep story about semantic technology
• discussion!
Credits
• Dong-Hyun Choi, [email protected]
• Eun-Kyung Kim, [email protected]
• Jinhyun Ahn, [email protected]
• Key-Sun Choi, [email protected]
• http://swrc.kaist.ac.kr/ontocloud
• http://swrc.kaist.ac.kr/SemanticToolkits/