Toward Next Generation of Gazetteer: Utilizing GeoSPARQL For Developing Linked Geoname Data
Dongpo Deng, Geospatial Information Specialist, Institute of Information Science, Academia Sinica, [email protected]
2014 Pacific Neighborhoods Consortium Annual Conference and Joint Meeting

DESCRIPTION

Presentation for the session on spatiotemporal linked data at PNC 2014.

Page 1: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Toward Next Generation of Gazetteer: Utilizing GeoSPARQL For Developing Linked Geoname Data

Dongpo Deng

Geospatial Information Specialist, Institute of Information Science, Academia Sinica

[email protected]

2014 Pacific Neighborhoods Consortium Annual Conference and Joint Meeting

Page 2: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Place and Name

• A place is a location that is meaningful to people.

• To identify places, people give them names, which separates them from undifferentiated space.

• A place is a concept in geography.

• The naming of places is the subject of toponymy.

Page 3: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Gazetteer

• Hill (2000) defines a gazetteer as a geospatial dictionary of geographic names whose entries have three core components:

• A name (possibly with variant names);

• A location (coordinates representing a point, line, or areal location);

• A type (selected from a type scheme of categories for places/features).
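The three core components can be pictured as a small record type. The following is only a sketch of that idea: the class name, field names, and the sample coordinate are illustrative assumptions, not taken from any of the gazetteers discussed here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class GazetteerEntry:
    """One gazetteer record: a name, a location, and a type (hypothetical layout)."""
    name: str
    footprint: List[Tuple[float, float]]  # (longitude, latitude) pairs
    feature_type: str                     # drawn from some type scheme
    variant_names: List[str] = field(default_factory=list)

    def all_names(self) -> List[str]:
        """Primary name followed by any variant names."""
        return [self.name] + self.variant_names

    def footprint_kind(self) -> str:
        """A single coordinate pair is a point; more pairs form a line or area."""
        return "point" if len(self.footprint) == 1 else "line or area"


# Illustrative values only; the coordinate is a rough placeholder.
entry = GazetteerEntry(
    name="Taipei",
    footprint=[(121.5, 25.0)],
    feature_type="city",
    variant_names=["Taipei City"],
)
print(entry.all_names(), entry.footprint_kind())
```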

Page 4: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

ADL Gazetteer

Page 5: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Geonames.org

Page 6: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Ordnance Survey 50k Gazetteer

Page 7: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Getty TGN

Page 8: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Research data management (CKAN) (taijiang.tw)

Page 9: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Metadata management

Page 10: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data
Page 11: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Place Names as Controlled Vocabulary

Page 12: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Place Names as Controlled Vocabulary

Page 13: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Ambiguity of Place Names

• Many places share the same name

• A place can have many names

• Long place names are often shortened

• Spatial footprints of place names are often difficult to define

Page 14: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Linked Data

• Tim Berners-Lee (2006) proposed four principles:

1. Use URIs as names for things.

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).

4. Include links to other URIs, so that they can discover more things.

Page 15: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Why Linked Data?

• To create a web of data

• To integrate data semantically

• To facilitate data reuse

• The more linked data there is, the more knowledge can be discovered

Page 16: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Linked Open Data Cloud (9/2008)

Page 17: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Linked Open Data Cloud (7/2009)

Page 18: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Linked Open Data Cloud (9/2010)

Page 19: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Linked Open Data Cloud (9/2011)

Page 20: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Linked Open Data Cloud (4/2014)

Page 21: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

OGC GeoSPARQL

• GeoSPARQL is a new OGC standard that provides three main components for encoding geographic information:

• (1) Definitions of vocabularies for representing features, geometries, and their relationships;

• (2) A set of domain-specific spatial functions for use in SPARQL queries;

• (3) A set of query transformation rules.

Page 22: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

GeoSPARQL Vocabulary: Basic Classes and Relations

Page 23: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Topological Relations between geo:SpatialObject

• OGC Simple Features relation family

• The RCC8 and Egenhofer relation families are also supported

[Figure: diagrams of two regions A and B illustrating each relation]

geo:sfEquals, geo:sfTouches, geo:sfOverlaps, geo:sfContains, geo:sfWithin, geo:sfDisjoint, geo:sfIntersects, geo:sfCrosses
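To give a feel for what these relations test, here is a minimal sketch restricted to axis-aligned rectangles with positive area. That restriction is an assumption made purely for brevity; real GeoSPARQL engines evaluate the relations on arbitrary geometries. The `Box` type and function names are hypothetical, and the crosses relation is omitted because it does not apply to two areas.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle with positive area (simplifying assumption)."""
    minx: float
    miny: float
    maxx: float
    maxy: float


def disjoint(a: Box, b: Box) -> bool:
    # No shared point at all, boundaries included.
    return a.maxx < b.minx or b.maxx < a.minx or a.maxy < b.miny or b.maxy < a.miny


def intersects(a: Box, b: Box) -> bool:
    return not disjoint(a, b)


def interiors_meet(a: Box, b: Box) -> bool:
    # Strict inequalities: sharing only an edge or corner does not count.
    return a.minx < b.maxx and b.minx < a.maxx and a.miny < b.maxy and b.miny < a.maxy


def touches(a: Box, b: Box) -> bool:
    # Boundaries meet but interiors do not.
    return intersects(a, b) and not interiors_meet(a, b)


def within(a: Box, b: Box) -> bool:
    return b.minx <= a.minx and a.maxx <= b.maxx and b.miny <= a.miny and a.maxy <= b.maxy


def contains(a: Box, b: Box) -> bool:
    return within(b, a)


def equals(a: Box, b: Box) -> bool:
    return a == b


def overlaps(a: Box, b: Box) -> bool:
    # Interiors meet, yet neither rectangle lies inside the other.
    return interiors_meet(a, b) and not within(a, b) and not within(b, a)


city = Box(0, 0, 10, 10)
campus = Box(2, 2, 4, 4)
neighbour = Box(10, 0, 20, 10)
partial = Box(8, 8, 12, 12)
far = Box(30, 30, 40, 40)
print(within(campus, city), touches(city, neighbour), overlaps(city, partial), disjoint(city, far))
```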

Page 24: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Components of GeoSPARQL

• Vocabulary for query patterns
  • Classes: Spatial Object, Feature, Geometry
  • Properties
    • Topological relations
    • Links between features and geometries
  • Datatypes for geometry literals: geo:wktLiteral, geo:gmlLiteral

• Query functions
  • Topological relations, distance, buffer, intersection, …

• Entailment components
  • RDFS entailment
  • RIF rules to compute topological relations

Page 25: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Some GeoSPARQL examples

Meta information:
:City rdfs:subClassOf geo:Feature .
:School rdfs:subClassOf geo:Feature .

Non-geospatial information:
:Taipei rdf:type :City .
:NTU rdf:type :School .
:NTU :isDeveloped "1928-03-16"^^xsd:date .

Geospatial information:
:Taipei geo:hasGeometry :geo_001 .
:geo_001 geo:asWKT "Polygon((…))"^^geo:wktLiteral .
:NTU geo:hasGeometry :geo_002 .
:geo_002 geo:asWKT "Polygon((…))"^^geo:wktLiteral .
:NTU geo:sfWithin :Taipei .
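A small in-memory sketch may help show how the schema statements and the spatial statements on this slide combine in a query. It stores a subset of the slide's triples as plain tuples and matches patterns by hand; the helper names `match` and `features_within` are hypothetical, and a real triple store would do this through SPARQL with RDFS entailment.

```python
# A subset of the slide's triples, kept as (subject, predicate, object) tuples.
TRIPLES = [
    (":City", "rdfs:subClassOf", "geo:Feature"),
    (":School", "rdfs:subClassOf", "geo:Feature"),
    (":Taipei", "rdf:type", ":City"),
    (":NTU", "rdf:type", ":School"),
    (":NTU", "geo:sfWithin", ":Taipei"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def features_within(triples, region):
    """Subjects stated to be geo:sfWithin `region` whose class is a subclass of geo:Feature."""
    feature_classes = {s for s, _, _ in match(triples, p="rdfs:subClassOf", o="geo:Feature")}
    found = []
    for subject, _, _ in match(triples, p="geo:sfWithin", o=region):
        types = {o for _, _, o in match(triples, s=subject, p="rdf:type")}
        if types & feature_classes:  # one step of subclass reasoning
            found.append(subject)
    return found


print(features_within(TRIPLES, ":Taipei"))
```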

Page 26: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

BBN Parliament

http://parliament.semwebcentral.org/

Page 27: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

A procedure for interlinking data

1. Specification: distinguish the concepts of place names; design URIs

2. Modeling: develop ontologies

3. Transform: transform data to RDF

4. Publish: publish the RDF/OWL

5. Utility: utilize the RDF/OWL data for services
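The transform step can be sketched as turning spreadsheet rows into N-Triples lines. This is a minimal illustration, not the pipeline actually used: the column names, the `http://example.org/place/` URI pattern, and the use of `rdfs:label` for the name are assumptions, while the `tpn` and `geo` namespaces are the ones given in this deck.

```python
import csv
import io

TPN = "http://lod.tw/ontologies/geoname.owl#"
GEO = "http://www.opengis.net/ont/geosparql#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "http://example.org/place/"  # hypothetical URI design


def literal(value, datatype=None):
    """Format an N-Triples literal, escaping backslashes, quotes, and newlines."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    suffix = f"^^<{datatype}>" if datatype else ""
    return f'"{escaped}"{suffix}'


def row_to_ntriples(row):
    """Map one spreadsheet row (assumed columns: id, name, lon, lat) to four triples."""
    place = f"<{BASE}{row['id']}>"
    geom = f"<{BASE}{row['id']}/geometry>"
    wkt = f"POINT({row['lon']} {row['lat']})"
    return [
        f"{place} <{RDF_TYPE}> <{TPN}Place> .",
        f"{place} <{RDFS_LABEL}> {literal(row['name'])} .",
        f"{place} <{GEO}hasGeometry> {geom} .",
        f"{geom} <{GEO}asWKT> {literal(wkt, GEO + 'wktLiteral')} .",
    ]


# Placeholder spreadsheet content; not real gazetteer data.
SHEET = "id,name,lon,lat\np001,Example Place,120.2,23.0\n"
triples = [t for row in csv.DictReader(io.StringIO(SHEET)) for t in row_to_ntriples(row)]
print("\n".join(triples))
```

Keeping the URI pattern in one constant reflects the specification step: the URI design is decided once, before any data is transformed.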

Page 28: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Taiwan Geographic Name Information System (http://placesearch.moi.gov.tw/)

Page 29: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

A place name ontology

[Figure: ontology diagram. The labels that survive are listed here; the connections between them are not recoverable from the text.]

Classes: tpn:Place, tpn:PlaceName, tpn:Name (NameCollection), tpn:FeatureType, tpn:Footprint, geo:Feature, geo:Geometry, geo:Point, skos:Concept, time:Interval, time:Instant, event:Event

Properties: tpn:name, tpn:altName, tpn:memberOf, tpn:featureClass, tpn:is_in, tpn:startToUse, tpn:endToUse, geo:hasGeometry, geo:asWKT, geo:inside, time:hasBeginning, time:hasEnd, event:place, event:time, owl:subClassOf

Datatype: geo:wktLiteral

Namespace prefixes:

geo="http://www.opengis.net/ont/geosparql#"
time="http://www.w3.org/2006/time#"
xsd="http://www.w3.org/2001/XMLSchema#"
tpn="http://lod.tw/ontologies/geoname.owl#"
owl="http://www.w3.org/2002/07/owl#"
event="http://purl.org/NET/c4dm/event.owl#"
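The prefixed names on this slide are shorthand for full IRIs. As a small sketch, the prefix table given on the slide can be used to expand them; the function name `expand` is hypothetical, and prefixes the slide does not declare (such as `skos`) are deliberately rejected rather than guessed.

```python
# Prefix table exactly as declared on the slide.
PREFIXES = {
    "geo": "http://www.opengis.net/ont/geosparql#",
    "time": "http://www.w3.org/2006/time#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "tpn": "http://lod.tw/ontologies/geoname.owl#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "event": "http://purl.org/NET/c4dm/event.owl#",
}


def expand(prefixed_name):
    """Expand 'prefix:local' to a full IRI; raise KeyError for undeclared prefixes."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"undeclared prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local


print(expand("tpn:Place"))
print(expand("geo:asWKT"))
```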

Page 30: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

http://geo.lod.tw

D2R server

Page 31: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

http://geo.lod.tw

D2R server

Page 32: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Link to Geonames

Page 33: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Place names used during the Dutch colonial period

Page 34: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Place names used during the Dutch colonial period

Page 35: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data
Page 36: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Disambiguate place names

• '大山腳' is both a place name and a URI

• Three places are named '大山腳'

Page 37: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

GeoSPARQL Endpoint

Page 38: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

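GeoSPARQL query (1)

PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX sf: <http://www.opengis.net/ont/sf#>
PREFIX tpn: <http://lod.tw/ontologies/geoname.owl#>

SELECT DISTINCT ?Place ?Place_wkt
WHERE {
  ?Place a tpn:Place ;
         geo:hasGeometry ?Place_geo .
  ?Place_geo geo:asWKT ?Place_wkt .

  FILTER (geof:sfWithin(?Place_wkt, "POLYGON((119.99912 23.24348,120.25398 23.24482,120.25398 23.24482,120.25130 23.24348,120.25130 23.24348,120.25666 23.00203,120.00449 23.01276,119.99912 23.24348))"^^sf:wktLiteral))
}

Finds places that lie within a given spatial extent.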
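For point geometries, the spatial filter in this query amounts to a point-in-polygon test. The sketch below parses the query's WKT polygon and applies a standard ray-casting test. It is only an approximation of what an endpoint does: it handles a single outer ring, treats coordinates as planar, and the two test points are made-up values chosen to fall inside and outside the extent.

```python
EXTENT_WKT = (
    "POLYGON((119.99912 23.24348,120.25398 23.24482,120.25398 23.24482,"
    "120.25130 23.24348,120.25130 23.24348,120.25666 23.00203,"
    "120.00449 23.01276,119.99912 23.24348))"
)


def parse_wkt_polygon(wkt):
    """Parse a single-ring WKT polygon into a list of (x, y) tuples."""
    body = wkt.strip()
    if not body.upper().startswith("POLYGON((") or not body.endswith("))"):
        raise ValueError("only single-ring POLYGON((...)) is supported in this sketch")
    ring_text = body[len("POLYGON(("):-2]
    return [tuple(float(v) for v in pair.split()) for pair in ring_text.split(",")]


def point_in_ring(x, y, ring):
    """Ray casting: count edge crossings of a ray going in the +x direction."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # the edge spans the ray's y value
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return inside


ring = parse_wkt_polygon(EXTENT_WKT)
print(point_in_ring(120.1, 23.1, ring))   # hypothetical point inside the extent
print(point_in_ring(121.0, 23.1, ring))   # hypothetical point east of the extent
```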

Page 39: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Query result (1)

Page 40: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

GeoSPARQL query (2)

PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX tpn: <http://lod.tw/ontologies/geoname.owl#>
PREFIX units: <http://www.opengis.net/def/uom/OGC/1.0/>

SELECT DISTINCT ?p_wkt ?Place_wkt ?distance
WHERE {
  ?Place a tpn:Place ;
         geo:hasGeometry ?Place_geo .
  ?Place_geo geo:asWKT ?Place_wkt .
  <http://geo.lod.tw/resource/Point/01c2db36d23bdadda4beca046ce85e47> geo:asWKT ?p_wkt .

  LET (?buff := geof:buffer(?p_wkt, 3000, units:metre)) .
  FILTER (geof:sfWithin(?Place_wkt, ?buff)) .
  LET (?distance := geof:distance(?Place_wkt, ?p_wkt, units:metre)) .
}

Finds places within a 3 km buffer of a given point and returns their coordinates and distances.
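The buffer-and-distance logic of this query can be imitated for point locations with a great-circle distance. This is a rough sketch under stated assumptions: it uses the haversine formula on a spherical Earth, which is not necessarily how the endpoint computes `geof:buffer` or `geof:distance`, and the centre and candidate coordinates are invented for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(center, places, radius_m):
    """Return (name, distance) pairs no farther than radius_m, nearest first."""
    hits = []
    for name, (lon, lat) in places.items():
        d = haversine_m(center[0], center[1], lon, lat)
        if d <= radius_m:
            hits.append((name, d))
    return sorted(hits, key=lambda item: item[1])


# Hypothetical coordinates: one candidate 0.01 degrees north, one 0.05 degrees north.
center = (120.20, 23.00)
places = {"near": (120.20, 23.01), "far": (120.20, 23.05)}
hits = within_radius(center, places, 3000)
for name, d in hits:
    print(name, round(d))
```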

Page 41: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data
Page 42: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

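GeoSPARQL query (3)

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX time: <http://www.w3.org/2006/time#>
PREFIX tpn: <http://lod.tw/ontologies/geoname.owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT ?pName ?tName ?Time_xsd
WHERE {
  ?PN a tpn:PlaceName ;
      rdfs:label ?pName ;
      tpn:startToUse ?Start_Interval .

  ?Start_Interval a time:Interval ;
                  rdfs:label ?tName ;
                  time:hasBeginning ?begin .

  ?begin time:xsdDateTime ?Time_xsd .

  FILTER (?Time_xsd < "1900-12-19T16:00:00Z"^^xsd:dateTime)
}

Finds place names that came into use before Dec. 19, 1900, together with the time at which each came into use.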
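The temporal filter is a plain comparison of time-zone-aware timestamps. The sketch below reproduces it on placeholder records (the names and start dates are invented, not query results). It also shows that the cutoff `1900-12-19T16:00:00Z` corresponds to midnight at the start of Dec. 20 at a UTC+8 offset; that the data is meant in that offset is an assumption here.

```python
from datetime import datetime, timedelta, timezone

# Cutoff taken from the query's FILTER clause.
CUTOFF = datetime(1900, 12, 19, 16, 0, tzinfo=timezone.utc)

# Placeholder records: (name label, time the name came into use).
records = [
    ("name-A", datetime(1895, 1, 1, tzinfo=timezone.utc)),
    ("name-B", datetime(1920, 1, 1, tzinfo=timezone.utc)),
]


def names_before(rows, cutoff):
    """Keep rows whose start time is strictly earlier than the cutoff."""
    return [(name, start) for name, start in rows if start < cutoff]


early = names_before(records, CUTOFF)
print([name for name, _ in early])

# The same instant expressed at a UTC+8 offset (assumed local offset).
local_cutoff = CUTOFF.astimezone(timezone(timedelta(hours=8)))
print(local_cutoff.isoformat())
```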

Page 43: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Query result (3)

Page 44: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

Concluding remarks

• Using the place name ontology, the Taiwanese place name dataset was converted from a spreadsheet to RDF triples.

• The unified place names can serve as controlled vocabularies.

• The Taiwanese place name dataset is linked not only forward to Geonames.org but also backward to historical place names.

• A front-end linked data server (D2R) was set up to demonstrate the linked place names.

• A GeoSPARQL endpoint (BBN Parliament) was set up to serve spatiotemporal SPARQL queries.

Page 45: Toward Next Generation of Gazetteer:  Utilizing GeoSPARQL For Developing Linked Geoname Data

[email protected]

twitter: @dongpo

facebook: dongpo.deng

Slides are available at http://tinyurl.com/pnc2014ddp