Semantic Analysis in Language Technology http://stp.lingfil.uu.se/~santinim/sais/2014/sais_2014.htm
The Semantic Web & Ontologies
Marina Santini
Department of Linguistics and Philology, Uppsala University, Uppsala, Sweden
Autumn 2014
Acknowledgements
• Slides inspired by Ian Horrocks.
Summary: QA (i)
• Google, Yahoo, Bing…
• "Traditional" Question Answering (START…):
– http://start.csail.mit.edu/publications/FLAIRS0601KatzB.pdf (2006)
– Other publications: http://start.csail.mit.edu/publications.php
Katz et al. (2006) http://start.csail.mit.edu/publications/FLAIRS0601KatzB.pdf
• START answers natural language questions by presenting components of text and multi-media information drawn from a set of information resources that are hosted locally or accessed remotely through the Internet.
• START targets high precision in its question answering.
• The START system analyzes English text and produces a knowledge base which incorporates, in the form of nested ternary expressions (= triples), the information found in the text.
Siri http://en.wikipedia.org/wiki/Siri
• Siri /ˈsɪri/ is an intelligent personal assistant and knowledge navigator which works as an application for Apple Inc.'s iOS.
• The application uses a natural language user interface to answer questions, make recommendations, and perform actions by delegating requests to a set of Web services.
• The software, both in its original version and as an iOS application, adapts to the user's individual language usage and individual searches (preferences) with continuing use, and returns results that are individualized.
• The name Siri is Scandinavian, a short form of the Norse name Sigrid meaning "beauty" and "victory", and comes from the intended name for the original developer's first child.
Summary (ii)
• Siri… conversational "safety net".
• Conversational agents (chat bots and personal assistants) → customer care, customer analytics (replacing/integrating FAQs and help desk)
Avatar: a picture of a person or animal that represents you on a computer screen, for example in some chat rooms or when you are playing games over the Internet
Eliza http://en.wikipedia.org/wiki/ELIZA
ELIZA was written at MIT by Joseph Weizenbaum between 1964 and 1966.
Spoken Language
• Spoken language: incorrect syntax, incorrect morphology, spoken forms…
• syntactic mechanisms like dislocation, anaphora, and gapping;
• morphological mechanisms like specialized focus or topic-marking affixes;
• specialized discourse particles.
• Ex:
– 'this man, that I have not yet seen' (left dislocation)
– 'It is a strange bloke, that man' (right dislocation)
– 'this man, I have not yet seen him' (hanging topic)
Jakobson's functions of language
• The Phatic Function is language for the sake of interaction.
• The Phatic Function can be observed in greetings and casual discussions of the weather, particularly with strangers.
• It also provides the keys to open, maintain, verify or close the communication channel: "Hello?", "Ok?", "Hummm", "Bye"...
• Academic interaction…
Visionary People (i)
• Roberto Busa and Thomas Watson
• QA: IBM Watson → Jeopardy! (quizzes)
• Interesseklubben (difficult quizzes) → Bill Gates?, Kamprad?, etc. :-)
• Philanthropist Billionaires
– As of 2007, Bill and Melinda Gates were the second-most generous philanthropists in America, having given over US$28 billion to charity; the couple plan to eventually donate 95 percent of their wealth to charity (why not to research too?)
– Bill and Melinda Gates have taken the No. 1 spot on Forbes' list of the 50 top givers in America. (Oct 2014)
• (One of Gates's acquisitions is the Codex Leicester, a collection of writings by Leonardo da Vinci, which Gates bought for $30.8 million at an auction in 1994.)
Funding sources in Sweden
• Vetenskapsrådet
• Riksbanken
• Vinnova
• Crowdfunding
Outline
• The Semantic Web
• Ontologies
Chronology http://en.wikipedia.org/wiki/History_of_the_World_Wide_Web
• On August 6, 1991, Berners-Lee posted a short summary of the World Wide Web project on the alt.hypertext newsgroup, inviting collaborators. This date also marked the debut of the Web as a publicly available service on the Internet, although new users could only access it after August 23.
• Beginning in 2002, new ideas for sharing and exchanging content ad hoc, such as Weblogs and RSS, rapidly gained acceptance on the Web. This new model for information exchange, primarily featuring user-generated and user-edited websites, was dubbed Web 2.0.
• Popularized by Berners-Lee's book Weaving the Web (2000) and a Scientific American article by Berners-Lee, James Hendler, and Ora Lassila, the term Semantic Web describes an evolution of the existing Web in which the network of hyperlinked human-readable web pages is extended by machine-readable metadata about documents and how they are related to each other, enabling automated agents to access the Web more intelligently and perform tasks on behalf of users. This has yet to happen. In 2006, Berners-Lee and colleagues stated that the idea "remains largely unrealized".
Visionary people (ii)
Web 1.0
• Web 1.0 is a retronym referring to an early stage of the World Wide Web's evolution.
• Some design elements of a Web 1.0 site include:
– Personal web pages were common, consisting mainly of static pages
– Static pages instead of dynamic HTML
– The use of HTML 3.2-era elements such as frames and tables to position and align elements on a page (now we use CSS, and frames are deprecated)
– GIF buttons...
Web 2.0
• Web 2.0 describes World Wide Web sites that use technology beyond the static pages of earlier Web sites.
• The key features of Web 2.0 include:
– Tagging - allows users to collectively classify and find information
– Rich User Experience - dynamic content; responsive to user input
– User Participation - information flows two ways between site owner and site user by means of evaluation, review, and commenting. Site users add content for others to see
– Long tail - services offered on an on-demand basis; profit is realized through monthly service subscriptions more than one-time purchases of goods over the network
– Software as a service - Web 2.0 sites developed APIs to allow automated usage, such as by an app or mashup
– Mass Participation - universal web access leads to differentiation of concerns from the traditional internet userbase.
Web 3.0
• "Web 3.0, a phrase coined by John Markoff of the New York Times in 2006, refers to a supposed third generation of Internet-based services that collectively comprise what might be called 'the intelligent Web' — such as those using semantic web, microformats, natural language search, data-mining, machine learning, recommendation agents, and artificial intelligence technologies — which emphasize machine-facilitated understanding of information in order to provide a more productive and intuitive user experience."
• Web 3.0 will be more connected, open, and intelligent, with semantic Web technologies, distributed databases, natural language processing, machine learning, machine reasoning, and autonomous agents.
– http://lifeboat.com/ex/web.3.0
• "The Web was designed as an information space, with the goal that it should be useful not only for human-human communication, but also that machines would be able to participate and help.
• One of the major obstacles to this has been the fact that most information on the Web is designed for human consumption, and even if it was derived from a database with well defined meanings (in at least some terms) for its columns, that the structure of the data is not evident to a robot browsing the Web.
• Leaving aside the artificial intelligence problem of training machines to behave like people, the Semantic Web approach instead develops languages for expressing information in a machine processable form."
– Tim Berners-Lee, The Semantic Web Roadmap, 1998
– http://www.w3.org/DesignIssues/Semantic.html
The web: present and future
Today…
• The web is relatively simple:
– Hypertexts and hypermedia
– Access is engineered via a combination of keyword-based search and link navigation.
This simplicity has been one of the great strengths of the web, and has been an important factor in its popularity and in users' ability to create their own content.
Shortcomings
Examples:
• Finding information about people with very common names can be a frustrating experience.
• Answering more complex queries, along with more general information retrieval, integration, sharing and processing, can be difficult or even impossible.
– List all the heads of state of EU countries
– Who destroyed the Beatles?
Some solutions
• Software glue: Mashups
– location information from one source might be combined with map information from another source in order to show the location of and provide directions to points of interest such as hotels and restaurants.
• Tagging via social networks (Web 2.0)
– harness the power of user communities in order to share and annotate information.
• Examples include image and video sharing sites such as Flickr and YouTube, and auction sites such as eBay.
– In these applications, annotations usually take the form of simple tags, such as "beach", "birthday", "family" and "friends". The meaning of tags is, however, typically not well defined, and may be impenetrable even to human users: typical examples (from Flickr) include "sasquatchmusicfestival", "celebritylookalikes", and "twab08".
The "travel agent"
• The classic example of a semantic web application is an automated travel agent that, given various constraints and preferences, would offer the user suitable travel or vacation suggestions.
• A key feature of such a "software agent" is that it would not simply exploit a predetermined set of information sources, but would search the web for relevant information in much the same way that a human user might do when planning a vacation.
The goal
• The goal of the Semantic Web is to allow web information and services to be more effectively exploited by humans and automated tools.
Semantic Web
• The focus of the semantic web is to share data instead of documents.
• In other words, it is a project that should provide a common framework that allows data to be shared and reused across application, enterprise, and community boundaries.
• It is a collaborative effort led by the World Wide Web Consortium (W3C).
Semantic Web & Ontologies
• How are we going to represent meaning and knowledge on the web?
• A key idea behind the semantic web is to address this problem by giving web content machine-accessible semantics via annotation.
• Knowledge is represented in the form of rich conceptual schemas called ontologies.
• Ontologies are the backbone of the Semantic Web.
• Ontologies give formally defined meanings to the terms used in annotations, transforming them into semantic annotations.
• They provide the knowledge that is required for semantic applications of all kinds.
Main Difficulty
• Current web content is intended for humans (HTML markup with layout, images and other presentational features).
• Humans understand this content, but machines can't.
Basically...
• Ontologies provide a shared understanding of a domain.
• They provide background knowledge that lets systems automate certain tasks.
• By the process of annotation, knowledge can be linked to ontologies.
– Example: "Angelina Jolie" (text) linked to the concept Actress
– In our ontology we also know that an actress is always female and a person.
• Ontologies allow the creation of annotations → machine-readable and machine-understandable content.
• If machines can understand content, they can also perform more meaningful and intelligent queries.
– Distinction between Jaguar the animal and Jaguar the car.
– Combination of information that is distributed on the Web.
Old and New Issues
Old ones:
• Knowledge representation
• Reasoning
• Linguistics
• …
New ones:
• Integrating different ontologies may prove to be at least as hard as integrating the resources that they describe
• Creation of suitable annotations and ontologies
• …
Notwithstanding these issues…
• … considerable progress has been made in the development of the infrastructure needed to support the semantic web.
• In particular, there has been impressive progress in the development of languages and tools for content annotation and for the design and deployment of ontologies.
Semantic Annotation
• To facilitate the process of semantic annotation, RDF and OWL have been developed as standard formats for the sharing and integration of data and knowledge.
• RDF and OWL are standards:
– RDF (Resource Description Framework)
– OWL (Web Ontology Language)
Ontologies (Metaphysics)
• Ontology, in its original philosophical sense, is a fundamental branch of metaphysics focusing on the study of existence.
• Its objective is to determine what entities and types of entities actually exist, and thus to study the structure of the world.
• The study of ontology can be traced back to the work of Plato and Aristotle, and includes the development of hierarchical categorisations of different kinds of entities and the features that distinguish them.
Tree of Porphyry
[Figure: Tree of Porphyry, 3rd century AD]
• The Porphyrian tree, Tree of Porphyry or Arbor Porphyriana is a classic device for illustrating what is also called a "scale of being". It was suggested by the 3rd century AD Greek neoplatonist philosopher and logician Porphyry.
Ontology (Computer Science, AI, LT, IR…)
• Engineering artefact, usually a model of some aspect of the world.
• It introduces vocabulary describing various aspects of the domain being modelled, and provides an explicit specification of the intended meaning of the vocabulary.
• This specification often includes classification-based information, not unlike that in Porphyry's tree.
What is an ontology (i)?
"An ontology is a formal, explicit specification of a shared conceptualization"
Studer, Benjamins, Fensel. Knowledge Engineering: Principles and Methods. Data and Knowledge Engineering. 25 (1998) 161-197
"An ontology is an explicit specification of a conceptualization"
Gruber, T. A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition. Vol. 5. 1993. 199-220
• Conceptualization: abstract model and simplified view of some phenomenon in the world that we want to represent
• Formal: machine-readable
• Explicit: concepts, properties, relations, functions, constraints and axioms are explicitly defined
• Shared: consensual knowledge
What is an ontology (ii)?
• An ontology is a hierarchically structured set of terms for describing a domain that can be used as a skeletal foundation for a knowledge base
B. Swartout; R. Patil; K. Knight; T. Russ. Toward Distributed Use of Large-Scale Ontologies. Ontological Engineering. AAAI-97 Spring Symposium Series. 1997. 138-148
• An ontology defines the basic terms and relations comprising the vocabulary of a topic area, as well as the rules for combining terms and relations to define extensions to the vocabulary
Neches, R.; Fikes, R.; Finin, T.; Gruber, T.; Patil, R.; Senator, T.; Swartout, W.R. Enabling Technology for Knowledge Sharing. AI Magazine. Winter 1991. 36-56
• An ontology provides the means for describing explicitly the conceptualization behind the knowledge represented in a knowledge base
A. Bernaras; I. Laresgoiti; J. Correra. Building and Reusing Ontologies for Electrical Network Applications. ECAI96. 12th European Conference on Artificial Intelligence. Ed. John Wiley & Sons, Ltd. 298-302
Examples
• Top-level ontology: Standard Upper Ontology
– In information science, an upper ontology (also known as a top-level ontology or foundation ontology) is an ontology which describes very general concepts that are the same across all knowledge domains.
• Linguistic ontology: WordNet
• General ontology: Cyc, UNSPSC, eCl@ss
• Domain ontology: MeSH (Medical Subject Headings), CHEMICALS, UMLS
• Research ontology: KA2 (Knowledge Acquisition Community Ontology)
Resource Description Framework (i)
• A language that has been developed in order to provide an extensible mechanism for describing web resources and relationships between them.
• A key feature of RDF is the use of Internationalized Resource Identifiers (IRIs), a generalisation of Uniform Resource Locators (URLs), to refer to resources.
• RDF is a very simple language: its underlying data structure is a labelled directed graph, and its only syntactic construct is the triple.
• A triple consists of three components, referred to as the subject, predicate and object.
A directed graph is a set of nodes connected by edges, where the edges have a direction associated with them.
/ˈaɪˌɑːˌraɪ/
RDF (ii)
• More formally, a triple represents a single edge (labelled with the predicate) connecting two nodes (labelled with the subject and object); it describes a binary relationship between the subject and object via the predicate.
• The predicate of a triple is always an IRI, and an IRI that is used in the predicate position of a triple is called a property.
• A set of triples is called an RDF graph.
• In order to facilitate the sharing and exchanging of graphs on the web, an XML serialisation has also been defined.
"Harry Potter has a pet called Hedwig…"
[Figure: the statement shown as an RDF graph and in RDF/XML]
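The triple model described on the RDF slides can be sketched in a few lines of Python. This is only an illustration: the namespace `http://example.org/hp#`, the property names `hasPet` and `type`, and the helper functions are assumptions made for the example, not identifiers taken from the slides or from any RDF library.

```python
# Hypothetical namespace; the slides do not give concrete IRIs.
EX = "http://example.org/hp#"

# An RDF graph is a set of (subject, predicate, object) triples.
triples = {
    (EX + "HarryPotter", EX + "hasPet", EX + "Hedwig"),
    (EX + "Hedwig", EX + "type", EX + "Owl"),
}

def objects(graph, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in the graph."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}

def properties(graph):
    """IRIs used in the predicate position, i.e. the properties of the graph."""
    return {p for (_, p, _) in graph}

pets = objects(triples, EX + "HarryPotter", EX + "hasPet")
print(pets)
```

Each triple is one labelled edge of the directed graph, so following an edge is just a filter on subject and predicate.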
Lect 07: Relation Extraction: Relation databases that draw from Wikipedia
• Resource Description Framework (RDF) triples: subject, predicate, object
– Golden Gate Park, location, San Francisco
– dbpedia:Golden_Gate_Park dbpedia-owl:location dbpedia:San_Francisco
• DBpedia: 1 billion RDF triples, 385 million from English Wikipedia
• Frequent Freebase relations:
– people/person/nationality
– location/location/contains
– people/person/profession
– people/person/place-of-birth
– biology/organism_higher_classification
– film/film/genre
Lect 08: Relation Extraction
• Answers: Databases of Relations
– born-in("Emma Goldman", "June 27 1869")
– author-of("Cao Xue Qin", "Dream of the Red Chamber")
– Draw from Wikipedia infoboxes, DBpedia, Freebase, etc.
• Questions: Extracting Relations in Questions
– Whose granddaughter starred in E.T.?
  (acted-in ?x "E.T.")
  (granddaughter-of ?x ?y)
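A minimal sketch of how the two relations extracted from the question could be joined against a relation database. The facts below are illustrative entries chosen to fit the question, and the function name is hypothetical; a real system would draw such facts from DBpedia or Freebase.

```python
# Illustrative relation database (assumed entries, not from the slides).
acted_in = {("Drew Barrymore", "E.T."), ("Henry Thomas", "E.T.")}
granddaughter_of = {("Drew Barrymore", "John Barrymore")}

def whose_granddaughter_starred_in(film):
    """Join (acted-in ?x film) with (granddaughter-of ?x ?y) and return the ?y values."""
    actors = {x for (x, f) in acted_in if f == film}
    return {y for (x, y) in granddaughter_of if x in actors}

answer = whose_granddaughter_starred_in("E.T.")
print(answer)
```

The shared variable ?x is what links the two relations: only actors who also appear as someone's granddaughter contribute an answer.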
RDF Schema (RDF Vocabulary Description Language)
• Extends RDF by giving additional meaning to special resources…
… but still not enough…
• Capabilities of RDF as an ontology language are limited
– No cardinality
– Not possible to describe conjunctions of classes
– …
RDF is a very simple language
The cardinality of a set is a measure of the "number of elements of the set". For example, the set A = {2, 4, 6} contains 3 elements, and therefore A has a cardinality of 3.
Need for a more expressive ontology language: OWL (Web Ontology Language)
• Since the architecture of the web depends on agreed standards, the World Wide Web Consortium (W3C) set up a standardisation working group to develop a standard for a web ontology language.
• The result of this activity was the OWL ontology language standard.
• The integration of OWL with RDF has the advantage of making OWL ontologies directly accessible to web-based applications.
Back Story: http://ileriseviye.wordpress.com/2011/11/01/why-web-ontology-language-is-abbreviated-as-owl-and-not-wol/
Description Logics (DLs)
• A key feature of OWL is its basis in Description Logics, a family of logic-based knowledge representation formalisms that have a formal semantics based on first-order logic (FOL).
Lect 02: Description Logics
• DLs refer to a family of logical approaches that correspond to different subsets of FOL.
• We can use DLs to model an application domain. The focus is then on:
– Representation of knowledge about categories
– The set of categories in an application domain is called a terminology
– The terminology is arranged in a hierarchical organization called an ontology, which captures superset and subset relations among categories/concepts.
– In order to specify a hierarchical structure, we can use subsumption relations between the appropriate concepts in a terminology
– Subsumption is a form of inference. It determines whether a superset/subset relation (based on the facts asserted in a terminology) exists between two concepts.
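Subsumption over a simple hierarchy can be sketched as reachability in the graph of asserted superconcept links. The terminology below (Owl, Phoenix, Bird, Animal) is a made-up example, and this sketch covers only told hierarchies; actual DL reasoners also handle complex class descriptions.

```python
# Hypothetical terminology: each concept maps to its directly asserted superconcepts.
told_superconcepts = {
    "Owl": {"Bird"},
    "Phoenix": {"Bird"},
    "Bird": {"Animal"},
    "Animal": set(),
}

def subsumes(general, specific, told=told_superconcepts):
    """True if `general` is a (possibly indirect) superconcept of `specific`."""
    seen, stack = set(), [specific]
    while stack:
        concept = stack.pop()
        if concept == general:
            return True
        if concept not in seen:
            seen.add(concept)
            stack.extend(told.get(concept, ()))
    return False

print(subsumes("Animal", "Owl"))
```

The fact that Animal subsumes Owl is never asserted directly; it is inferred by chaining the two asserted links.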
Lect 02: OWL and the Semantic Web
• Roughly speaking, a Description Logic underlies the Web Ontology Language (OWL).
• OWL is a language used for the development of ontologies that should encapsulate the knowledge needed for the development of the Semantic Web
• The Semantic Web is the effort to formally specify the semantics of the contents of the web.
DLs
• These formalisms adopt an object-oriented model, in which the domain is described in terms of individuals (instances), concepts (classes), and roles (properties/predicates):
– individuals, e.g., "Hedwig", are the basic elements of the domain;
– concepts, e.g., "Owl", describe sets of individuals having similar characteristics;
– roles, e.g., "hasPet", describe relationships between pairs of individuals, such as "HarryPotter hasPet Hedwig".
Axioms
• An OWL ontology consists of a set of axioms
• Example:
– given the axiom C equivalentClass D, an individual is an instance of C if and only if it is an instance of D.
– i.e. combining axioms with class descriptions allows for easy extension of the vocabulary by introducing new names as abbreviations for descriptions.
See the following axiom:
    Class: HogwartsStudent
        EquivalentTo: Student and attendsSchool value Hogwarts
It introduces the class name HogwartsStudent, and asserts that its instances are just those Students who attend Hogwarts.
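The effect of the HogwartsStudent axiom can be sketched with plain Python sets. The individuals and the `attends_school` pairs are invented sample data, and the function is a hypothetical stand-in for what a reasoner would compute from the EquivalentTo description.

```python
# Assumed sample data: instances of Student and the attendsSchool role.
students = {"Harry", "Hermione", "Dudley"}
attends_school = {
    ("Harry", "Hogwarts"),
    ("Hermione", "Hogwarts"),
    ("Dudley", "Smeltings"),
}

def hogwarts_students():
    """Instances of 'Student and attendsSchool value Hogwarts'."""
    return {x for x in students if (x, "Hogwarts") in attends_school}

print(sorted(hogwarts_students()))
```

Because the axiom is an equivalence, the new class name adds no new facts; it only names the set that the description already picks out.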
TBox & ABox
• Axioms describe constraints on the structure of the domain:
– in DLs such a set of axioms is called a TBox (Terminology Box).
• OWL also allows for axioms asserting facts about some concrete situation, similar to data in a database setting:
– in DLs such a set of axioms is called an ABox (Assertion Box).
Decidability (i)
• Description Logics are fully-fledged logics and so have a formal semantics.
• DLs can be seen as decidable subsets of FOL with:
– individuals being equivalent to constants,
– concepts to unary predicates,
– roles to binary predicates.
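As an illustration of this correspondence, the HogwartsStudent axiom from the Axioms slide can be written in DL notation and then as a first-order formula. The use of the nominal {Hogwarts} for "value Hogwarts" is one conventional reading and is an assumption of this sketch, not something the slides spell out.

```latex
\[
\mathit{HogwartsStudent} \equiv \mathit{Student} \sqcap \exists\, \mathit{attendsSchool}.\{\mathit{Hogwarts}\}
\]
\[
\forall x\, \bigl( \mathit{HogwartsStudent}(x) \leftrightarrow
  \mathit{Student}(x) \land \mathit{attendsSchool}(x, \mathit{Hogwarts}) \bigr)
\]
```

Here Hogwarts is a constant, HogwartsStudent and Student are unary predicates, and attendsSchool is a binary predicate.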
Lect 02: But… undecidable (sometimes)
• Church and Turing proved in 1936 that first-order logic is in general undecidable (Gödel's Incompleteness Theorem of 1931 is a related but distinct result).
• That means no algorithm can decide, for every statement in this logic, whether it is valid.
• Ex: can't solve the Halting Problem
Lect 02: Halting Problem
• In 1936 Alan Turing proved that it is not possible to decide whether an arbitrary program will eventually halt or run forever.
• The problem is to write a program (actually, a Turing machine*) that accepts as input another program and that program's input, and decides, in finite time, whether that program will ever halt when run on that input.
• The halting problem is a cornerstone problem in computer science. It is used mainly as a way to prove that a given task is impossible, by showing that solving that task would allow one to solve the halting problem.
*A Turing machine is a hypothetical device that manipulates symbols according to a table of rules. Despite its simplicity, a Turing machine can be adapted to simulate the logic of any computer algorithm.
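The core of Turing's argument can be sketched in Python: for any claimed halting decider, one can build a program that does the opposite of what the decider predicts for it. The names `make_contrarian` and `always_says_loops` are hypothetical, and the sketch simplifies the problem to programs without input.

```python
def make_contrarian(claimed_halts):
    """Build a program that does the opposite of what `claimed_halts` predicts for it."""
    def contrarian():
        if claimed_halts(contrarian):
            while True:      # predicted to halt, so loop forever
                pass
        return "halted"      # predicted to loop, so halt immediately
    return contrarian

def always_says_loops(program):
    """A (wrong) candidate decider that predicts every program runs forever."""
    return False

contrarian = make_contrarian(always_says_loops)
# Safe to call: this candidate predicts "loops", so the contrarian halts at once.
result = contrarian()
print(result)
```

The candidate is refuted because the program it says runs forever actually halts. A candidate that answers "halts" would be refuted the other way round, by a contrarian that never returns, which is why that case is argued rather than executed.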
Decidability (ii)
• DLs give a precise and unambiguous meaning to descriptions of the domain
• This also allows for the development of reasoning algorithms that can provide correct answers to arbitrarily complex queries about the domain.
Reasoning: OWL vs Databases
• Ex: OWL axioms behave like inference rules rather than database constraints.
    Class: Phoenix
        SubClassOf: isPetOf only Wizard
    Individual: Fawkes
        Types: Phoenix
        Facts: isPetOf Dumbledore
• Fawkes is said to be a Phoenix and to be the pet of Dumbledore, and it is also stated that only a Wizard can have a pet Phoenix.
• In OWL, this leads to the implication that Dumbledore is a Wizard. That is, if we were to query the ontology for instances of Wizard, then Dumbledore would be part of the answer.
• In a database setting the schema could include a similar statement about the Phoenix class, but in this case it would be interpreted as a constraint on the data: adding the fact that Fawkes isPetOf Dumbledore without Dumbledore being already known to be a Wizard would lead to an invalid database state, and such an update would therefore be rejected by a database management system as a constraint violation.
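The contrast between the two readings of the Phoenix statement can be sketched as two small functions over the same data. Both functions and the `ConstraintViolation` exception are hypothetical names for this illustration; neither is how an OWL reasoner or a database system is actually implemented.

```python
class ConstraintViolation(Exception):
    """Raised when data breaks the schema under the database-style reading."""

def owl_style(wizards, phoenixes, is_pet_of):
    """Read 'Phoenix SubClassOf isPetOf only Wizard' as an inference rule."""
    inferred = set(wizards)
    for pet, owner in is_pet_of:
        if pet in phoenixes:
            inferred.add(owner)   # the owner of a Phoenix must be a Wizard
    return inferred

def database_style(wizards, phoenixes, is_pet_of):
    """Read the same statement as an integrity constraint on the data."""
    for pet, owner in is_pet_of:
        if pet in phoenixes and owner not in wizards:
            raise ConstraintViolation(f"{owner} is not a known Wizard")
    return set(wizards)

facts = {("Fawkes", "Dumbledore")}
wizards_owl = owl_style(set(), {"Fawkes"}, facts)
try:
    database_style(set(), {"Fawkes"}, facts)
    rejected = False
except ConstraintViolation:
    rejected = True
print(wizards_owl, rejected)
```

With no wizards asserted up front, the inference reading adds Dumbledore to the answer, while the constraint reading refuses the update.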
Ontology Development Tools
• State-of-the-art ontology development tools, such as SWOOP, Protégé, and TopBraid Composer, use DL reasoners to provide feedback to the user about the logical implications of their design:
– e.g. warnings about inconsistencies and synonyms.
WebProtégé http://protege.stanford.edu/products.php#web-protege
VOWL: Visual Notation for OWL Ontologies http://vowl.visualdataweb.org/v2/
WebVOWL http://vowl.visualdataweb.org
• A web-based implementation of VOWL.
• There is also a VOWL plugin for the ontology editor Protégé that implements the VOWL specifications.
Domain-specific ontologies
• The availability of tools has contributed to the increasingly widespread use of OWL, and it has become the de facto standard for ontology development in fields as diverse as
– Biology
– Medicine
– Geography
– Geology
– Agriculture
– Defence
– etc.
Complex Queries
• The use of DL reasoners allows OWL ontology applications to answer complex queries and to provide guarantees about the correctness of the result.
• Reliability and correctness are clearly important features of any information system.
• They are particularly important if ontology-based systems are to be used in safety-critical applications such as medicine, where incorrect reasoning could adversely impact patient care.
Standard Query Language
• It has long been recognised that the semantic web, and semantic web knowledge representation languages such as RDF and OWL, would also benefit from the availability of a standardised query language, as relational databases have in SQL.
• A W3C standardisation working group was set up, and has completed its work on the SPARQL query language standard.
SPARQL Protocol and RDF Query Language…
• … is an RDF query language, i.e. a query language for databases, able to retrieve and manipulate data stored in RDF format
• SPARQL allows for a query to consist of triple patterns, conjunctions, disjunctions, and optional patterns
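How a conjunction of triple patterns is matched against a graph can be sketched as follows. This is a toy matcher, not a SPARQL engine: the `match` and `query` helpers, the unprefixed names, and the sample graph are all assumptions for illustration, and disjunctions and optional patterns are left out.

```python
def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):                      # a variable such as ?pet
            if binding.setdefault(term, value) != value:
                return None                           # clashes with an earlier binding
        elif term != value:
            return None                               # constant does not match
    return binding

def query(graph, patterns):
    """Return all variable bindings that satisfy every pattern (a conjunction)."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in graph:
                result = match(pattern, triple, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return bindings

graph = [
    ("HarryPotter", "hasPet", "Hedwig"),
    ("Hedwig", "type", "Owl"),
    ("Dumbledore", "hasPet", "Fawkes"),
    ("Fawkes", "type", "Phoenix"),
]
# Roughly: SELECT ?who ?pet WHERE { ?who hasPet ?pet . ?pet type Owl . }
owl_owners = query(graph, [("?who", "hasPet", "?pet"), ("?pet", "type", "Owl")])
print(owl_owners)
```

The shared variable ?pet is what turns two independent patterns into a join: a binding survives only if both patterns agree on its value.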
Tags & Ontologies
• Tagging facilities within Web 2.0 applications have shown how it might be possible for user communities to collaboratively annotate web content, and create simple forms of ontology via the development of hierarchically organised sets of tags, often called folksonomies…
Challenges
• Currently hard to combine:
– Increased expressive power (by using more sophisticated logics) with scalability (large ontologies)
Ontology Learning
• Ontology learning (ontology extraction, ontology generation, or ontology acquisition) is the automatic or semi-automatic creation of ontologies, including extracting the corresponding domain's terms and the relationships between those concepts from a corpus of natural language text, and encoding them with an ontology language for easy retrieval.
• As building ontologies manually is extremely labor-intensive and time-consuming, there is great motivation to automate the process.
• Typically, the process starts by extracting terms and concepts or noun phrases from plain text using linguistic processors such as part-of-speech tagging and phrase chunking. Then statistical techniques are used to extract relations, often based on pattern-based or definition-based hypernym extraction techniques.
– http://en.wikipedia.org/wiki/Ontology_learning
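Pattern-based hypernym extraction can be sketched with a single lexical pattern of the "X such as A, B and C" kind. The regular expression, the function name, and the sample sentence are assumptions made for this example; real ontology-learning pipelines work on chunked noun phrases and use many more patterns.

```python
import re

# One "such as" pattern: <hypernym> such as <hyponym>(, <hyponym>)* (and|or <hyponym>)
PATTERN = re.compile(r"(\w+) such as (\w+(?:(?:, | and | or )\w+)*)")

def hypernym_pairs(text):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for found in PATTERN.finditer(text):
        hypernym = found.group(1)
        for hyponym in re.split(r", | and | or ", found.group(2)):
            pairs.append((hyponym, hypernym))
    return pairs

pairs = hypernym_pairs("The school keeps birds such as owls, ravens and phoenixes.")
print(pairs)
```

Each extracted pair is a candidate subclass link (for instance, owls under birds) that could then be encoded in an ontology language.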
Ontology Mining
• At the intersection of computational linguistics and the semantic web, there is a community working on ontology learning/mining
• Paul Buitelaar and Georgeta Bordea in Ireland:
– http://nlp.insight-centre.org/
So, how are you, Semantic Web?
Dead or Alive?
• Yahoo Researcher Declares Semantic Web Dead (2007)
– http://searchenginewatch.com/sew/news/2056255/yahoo-researcher-declares-semantic-web-dead
• Three reasons why the Semantic Web has failed (2013)
– https://gigaom.com/2013/11/03/three-reasons-why-the-semantic-web-has-failed/
• Semantic Web Is Dead, Long Live Structured Web! (2014)
– http://www.techweekeurope.co.uk/workspace/import-io-structured-web-141768
• etc.
W3C (World Wide Web Consortium): Semantic Web Official web page (2014)
• http://www.w3.org/standards/semanticweb/
Further Reading: Morgan & Claypool Series on The Semantic Web
Present and Future
The end