Mike Bergman offers his take on which approaches to the semantic Web are working, which are not, and what all of this might say about the semantic Web moving forward. Informed by Structured Dynamics' open source frameworks and client experiences, the main thesis is that the pragmatic contribution of semantic technologies resides more in mindsets, information models and architectures than in 'linked data' as currently practiced.
Michael K. Bergman
Pragmatic Approaches to the Semantic Web
or, Why Aren’t We in Hyperland Yet?
Outline
Intro to SD and Me
Summary of Main Thesis
A Wee Bit of History
What is Not Working?
Problems with Linked Data
What is Working?
Some Pragmatic Lessons
SD’s Pragmatic Approach
Conclusion and Q & A
Structured Dynamics
Founded 2008; predecessor Zitgist LLC; two principals
Privately held, revenue funded
Boutique semantic technology shop
Services and consulting: semantic enterprise adoption; ontology development and mapping; tech transfer and training
Development and software: Open source OSF stack
Data conversion and migration
Client-specific development
Current Products and OSF Stack
structWSF: the pivotal product; Web services middleware that provides distributed data access and federation
conStruct: Drupal-based structured data linkage to structWSF
irON: spreadsheet, JSON and XML authoring and conversion framework
UMBEL: reference set of linking subjects and basis for domain vocabularies
scones: an ontology- and entity-driven information extraction and tagging system
SD Locations
Michael Bergman
Summary of Main Thesis
Main Arguments
Not against linked data; proponent and explicator since 2006
But linked data is burdensome and not pivotal to interoperability
Interoperability requires: structured data (from any source); a canonical data model (RDF); (relatively simple) ontologies for world views and schema; curation
A Wee Bit of History
Key Historical Milestones
1945: Memex
1963: Hypertext
1990: Hyperland
2001: Semantic Web (lack of uptake)
2006: Linked Data
2010: Revisionist Linked Data
Hyperland
Linked Data
“Linked Data is a set of best practices for publishing
and deploying instance and class data using the RDF
data model, naming the data objects using uniform
resource identifiers (URIs), thereby exposing the data
for access via the HTTP protocol, while emphasizing
data interconnections, interrelationships and context
useful to both humans and machine agents.”
What is Not Working?
Some Disappointments to Date
Full semantic Web vision
Widescale adoption of the semantic Web, linked data
Lack of intelligent agents
Many aspects of the practice of linked data
Problems with Linked Data
Burdensome on publishers
Naïve linkages: overuse of sameAs; lack of accurate alignments
(Often) poor data quality
Wrong focus
Some Conditions for Interoperability
<Interoperability> <needsMapping> <Predicates>
<Interoperability> <needsReference> <Nouns>
Many Mappings Should be Approximate
skos:broadMatch, skos:related, ore:similarTo, umbel:isAbout, vmf:isInVocabulary, skos:closeMatch, lvont:nearlySameAs, umbel:isLike, umbel:hasCharacteristic, lvont:somewhatSameAs, rdfs:seeAlso, ore:describes, map:narrowerThan, skos:narrower, map:broaderThan, skos:broader, dc:subject, link:uri, foaf:isPrimaryTopicOf
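The contrast between an identity claim (sameAs) and an approximate mapping can be sketched in a few lines. This is only an illustration: the `ex:` and `dbpedia:` identifiers are hypothetical, and the split of predicates into "exact" and "approximate" sets is an assumption made for the example, not a classification given in the slides.

```python
# Illustrative sketch only. The subject/object identifiers are hypothetical,
# and which predicates count as "approximate" is an assumption for this example.
EXACT = {"owl:sameAs"}
APPROXIMATE = {
    "skos:closeMatch", "skos:broadMatch", "skos:related",
    "umbel:isAbout", "umbel:isLike", "rdfs:seeAlso",
}

# Each mapping is a (subject, predicate, object) triple.
mappings = [
    ("ex:Paris_city", "owl:sameAs", "dbpedia:Paris"),
    ("ex:Paris_metro_area", "skos:closeMatch", "dbpedia:Paris"),
    ("ex:Paris_tourism", "umbel:isAbout", "dbpedia:Paris"),
]

def mergeable(triples):
    """Return (subject, object) pairs asserted to be the very same thing."""
    return [(s, o) for s, p, o in triples if p in EXACT]

def related(triples):
    """Return triples that link two things without claiming identity."""
    return [(s, p, o) for s, p, o in triples if p in APPROXIMATE]

print(mergeable(mappings))
print(related(mappings))
```

Only the first mapping licenses merging the two records; the other two keep the records distinct while still connecting them, which is the behavior an approximate predicate is meant to preserve.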
What is Working?
Successes
Siri
Bing (Powerset)
Google + schema.org
(Some) linked data
Google + schema.org
Statistical NLP
Structured results
Initial schema (Metaweb)
schema.org (with Yahoo, Bing and Yandex)
Some Linked Data
Some selected knowledge bases: DBpedia; GeoNames; Freebase (Google)
Biomedical community
LOD-LAM community
Some Pragmatic Lessons
Some Lessons Learned
Structure is good in any form
Keep semantic technology in the background
Open Web (“follow your nose”, FYN) likely to be disappointing
Ontologies essential for alignments
NLP an essential contributor to structure
Metadata an essential contributor to characterization, use
Linked data is a burden to publishers and places the semantic emphasis on the wrong part of the chain
Seven Pillars
Preserving Existing Assets
Relational databases (RDBMSs)
Distributed structured assets: spreadsheets; lightweight datastores
Web pages and Web sites
Existing documents and text
Web databases and APIs
Other databases (RDF, OO, etc.)
irON Dataset Exchange Framework
Simple authoring and dataset creation
irON includes an abstract notation and vocabulary for instance records
Notations for: Instance records
Schema
Datasets and metadata
Linkages to other schema
Serializations available for: XML (irXML)
JSON (irJSON)
CSV/spreadsheets (commON)
Three irON Serializations
irXML, irJSON, commON
Spreadsheet Correspondence to Triples
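One common way to read this correspondence, assumed here rather than taken from the slide, is that each row is a record (subject), each column header is an attribute (predicate), and each cell is a value (object). The sketch below uses an ordinary CSV with invented data; it is not actual commON syntax.

```python
import csv
import io

# Hypothetical spreadsheet content with invented records.
# This is a plain CSV, not real commON notation.
SHEET = """id,name,kind
ex:a,Alpha,Town
ex:b,Beta,
"""

def rows_to_triples(text, id_column="id"):
    """Turn each non-empty cell into a (subject, predicate, object) triple.

    The value in `id_column` names the record (subject); every other
    column header is treated as a predicate and its cell as the object.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = row[id_column]
        for column, value in row.items():
            if column == id_column or not value:
                continue  # skip the identifier itself and empty cells
            triples.append((subject, column, value))
    return triples

for triple in rows_to_triples(SHEET):
    print(triple)
```

Empty cells simply produce no triple, which is one reason a sparse spreadsheet maps to triples without needing placeholder values.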
More-or-less Interchangeable Formats
SD’s Pragmatic Approach
A Layered Approach
OSF Stack
Conclusion
Summary
If you can, do linked data; it is a GOOD THING
In any event, expose your data: structured (use NLP for unstructured); metadata; definitions; relations (simple); “semsets” (synonyms, acronyms, spelling variants)
Build vocabulary and ontology consortia
Build trust and curation communities
Semantics essential at the interoperability level, not necessarily publication or data transfer
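The "semset" idea from the list above can be sketched as a lookup from alternative labels to a single concept. Everything in this example is hypothetical: the `ex:` concept identifiers, the labels, and the normalization rule (lowercase, drop periods, collapse whitespace) are assumptions chosen to keep the sketch short.

```python
# Hypothetical semsets: concept identifiers and labels are invented.
SEMSETS = {
    "ex:UnitedStates": ["United States", "USA", "U.S.", "United States of America"],
    "ex:Organization": ["organization", "organisation", "org"],
}

def normalize(label):
    """Lowercase, drop periods and collapse whitespace (an assumed rule)."""
    return " ".join(label.lower().replace(".", "").split())

def build_index(semsets):
    """Map every normalized label to the concept it stands for."""
    index = {}
    for concept, labels in semsets.items():
        for label in labels:
            index[normalize(label)] = concept
    return index

INDEX = build_index(SEMSETS)

def resolve(term):
    """Return the concept for a term, or None if no semset contains it."""
    return INDEX.get(normalize(term))

print(resolve("U.S.A."))
print(resolve("Organisation"))
print(resolve("France"))
```

Exposing such label sets alongside the data lets a consumer match records on meaning rather than on exact strings, without requiring the publisher to produce full linked data.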
Take Aways
James Hendler:
“A little bit of semantics goes a long way”
Leverage linked data, but broaden focus
Consider adopting the semantic enterprise as the broader focus
Further Information
More Info and Links
Open Semantic Framework (OSF) stack: http://openstructs.org
TechWiki (400 detailed OSF how-to articles): http://techwiki.openstructs.org
Key ontologies: UMBEL: http://umbel.org
BIBO: http://bibliontology.org
Blogs: Mike Bergman: http://mkbergman.com
Fred Giasson: http://fgiasson.com/blog
Structured Dynamics: http://structureddynamics.com
Citizen Dan (community indicator systems): http://citizen-dan.org