Transcript
Page 1: Linked Open Government Data (LOGD)

Nooshin Allahyari

1

Linked Open Government Data (LOGD): Ontology Usage Experimental Results

Second Presentation

Page 2: Linked Open Government Data (LOGD)

Outlines

• Categorizing data provider

• Dataset collection

• Dataset characteristics

▫ Namespace

▫ Ontology Usage

▫ Annotation property

• Concept Coverage

• Case-Based Analysis

• Conclusion

Page 3: Linked Open Government Data (LOGD)

Categorizing data provider

• US Government Agencies

• Agencies are divided according to the US Federal Government Reference Model

• Each agency is in charge of publishing related datasets

• The Data.gov catalog also provides topic-related categorization

Page 4: Linked Open Government Data (LOGD)

Outlines

• Categorizing data provider

• Dataset collection

• Dataset characteristics

▫ Namespace

▫ Ontology Usage

▫ Annotation property

• Concept Coverage

• Case-Based Analysis

• Conclusion

Page 5: Linked Open Government Data (LOGD)

Dataset Collection

• All 25 datasets were collected from Data.gov

• Datasets are in RDF format

• Difficulties handling very large datasets

• Using different tools as endpoints

▫ Virtuoso (commercial version) as SPARQL endpoint

Easy to install

GUI

Many visual tools

SQL, SQL tools, and connection tools

• Increasing the number of datasets for reliability
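A minimal sketch of how a query could be submitted to a local SPARQL endpoint such as the Virtuoso instance mentioned above. The endpoint URL is an assumption (http://localhost:8890/sparql is Virtuoso's usual default, not something stated on the slide), and build_sparql_get_url is a hypothetical helper name. The request shape follows the query-via-GET form of the SPARQL 1.1 Protocol.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed endpoint: Virtuoso's usual default; adjust for the actual installation.
ENDPOINT = "http://localhost:8890/sparql"


def build_sparql_get_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Return a URL that submits `query` via HTTP GET (SPARQL 1.1 Protocol)."""
    return endpoint + "?" + urlencode({"query": query})


# Count all triples, a quick check that a dataset was loaded at all.
COUNT_QUERY = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"

url = build_sparql_get_url(COUNT_QUERY)
print(url)

# The query text survives the round trip through URL encoding.
print(parse_qs(urlsplit(url).query)["query"][0])
```

The URL can then be fetched with any HTTP client, typically with an Accept header of application/sparql-results+json. The sketch stops before the network call so that it runs without a live endpoint.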

Page 6: Linked Open Government Data (LOGD)

Outlines

• Categorizing data provider

• Dataset collection

• Dataset Composition Characteristics

▫ Namespace

▫ Ontology Usage

▫ Annotation property

• Concept Coverage

• Case-Based Analysis

• Conclusion

Page 7: Linked Open Government Data (LOGD)

Namespace

• Same namespace usage for all datasets
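One way to check namespace usage is to split each predicate URI at its fragment separator and count the resulting namespaces. The sketch below is illustrative only: namespace_of is a hypothetical helper, and the slide does not say whether "same namespace" means one shared base URI or one shared naming pattern. The predicates used in the case-study queries later in the slides suggest a per-dataset dataset-NNNN.rdf# namespace under a common Data.gov base.

```python
from collections import Counter


def namespace_of(uri: str) -> str:
    """Return the namespace part of a URI: up to '#', else up to the last '/'."""
    if "#" in uri:
        return uri.split("#", 1)[0] + "#"
    return uri.rsplit("/", 1)[0] + "/"


# Predicate URIs copied from the case-study queries in this presentation.
PREDICATES = [
    "http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#city",
    "http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#st",
    "http://www.data.gov/semantic/data/alpha/1213/dataset-1213.rdf#city",
    "http://www.data.gov/semantic/data/alpha/1213/dataset-1213.rdf#state",
]

namespaces = Counter(namespace_of(p) for p in PREDICATES)
for ns, count in namespaces.items():
    print(count, ns)
```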

Page 8: Linked Open Government Data (LOGD)

Ontology Vocabulary Usage

• FEA Reference Model Ontology (RMO)

• Vocabulary related to government context

▫ General vocabulary

Country

State

City

▫ Government programs, services:

Health Program

Cultural Program

Page 9: Linked Open Government Data (LOGD)

Annotation Property

• Useful for providing additional information about datasets. All datasets have:

▫ rdfs:label

▫ rdfs:comment

▫ No language tag or metadata

Some datasets from the Italy dataset catalog in TWC LOGD contain language tags.
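As an illustration of what "language tag" means here, the sketch below checks whether a literal written in N-Triples style carries a tag such as @it. The language_tag helper and the sample labels are hypothetical and are not taken from the datasets.

```python
import re

# An N-Triples style literal with an optional language tag, e.g. "Roma"@it.
LITERAL = re.compile(
    r'^"(?P<text>(?:[^"\\]|\\.)*)"(?:@(?P<lang>[A-Za-z]+(?:-[A-Za-z0-9]+)*))?$'
)


def language_tag(literal: str):
    """Return the language tag of a literal, or None if it has none."""
    match = LITERAL.match(literal.strip())
    if match is None:
        # Typed literals ("3"^^xsd:int) and malformed input end up here.
        raise ValueError(f"not a plain or language-tagged literal: {literal!r}")
    return match.group("lang")


# Hypothetical label values, for illustration only.
labels = ['"Station"', '"Stazione"@it', '"Station"@en-US']
for label in labels:
    print(label, "->", language_tag(label))
```

Inside a SPARQL query the same check can be written with the built-in lang() function, for example FILTER(lang(?label) != "").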

Page 10: Linked Open Government Data (LOGD)

Outlines

• Categorizing data provider

• Dataset collection

• Dataset characteristics

▫ Namespace

▫ Ontology Usage

▫ Annotation property

• Concept Coverage

• Case-Based Analysis

• Conclusion

Page 11: Linked Open Government Data (LOGD)

Concept Coverage

• Same concepts in all datasets

• Metadata for the Data.gov wiki and TWC LOGD

Prefix   Concept
foaf     Homepage
rdfs     isDefinedBy
dcterms  Source
dgtwc    uses-property
dgtwc    number-of-triples
dgtwc    number-of-properties
dgtwc    number-of-entries

Page 12: Linked Open Government Data (LOGD)

Concept Coverage

• General concepts related to government

• Low coverage of concepts

• Multi-name concepts

Concept              Coverage (percentage)
State                48%
City                 32%
State-Abbreviation   16%
Region               12%
Zip                  12%
Country              8%
Country origin code  8%
Area code            8%
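The percentages can be read as dataset counts. The sketch below assumes that coverage was computed over the 25 collected datasets; the slides do not state the denominator, so treat the implied counts as an inference. The helper names coverage and implied_count are hypothetical.

```python
TOTAL_DATASETS = 25  # assumption: the 25 datasets from the "Dataset Collection" slide

# Coverage percentages as reported in the table above.
REPORTED = {
    "State": 48,
    "City": 32,
    "State-Abbreviation": 16,
    "Region": 12,
    "Zip": 12,
    "Country": 8,
    "Country origin code": 8,
    "Area code": 8,
}


def coverage(datasets_with_concept: int, total: int = TOTAL_DATASETS) -> float:
    """Percentage of datasets in which a concept appears."""
    return 100.0 * datasets_with_concept / total


def implied_count(percent: float, total: int = TOTAL_DATASETS) -> float:
    """Number of datasets implied by a reported percentage."""
    return percent * total / 100.0


for concept, percent in REPORTED.items():
    count = implied_count(percent)
    print(f"{concept}: {percent}% is {count:g} of {TOTAL_DATASETS} datasets")
```

Under that assumption every reported percentage maps to a whole number of datasets (for example, 48% is 12 of 25), which is consistent with a 25-dataset denominator.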

Page 13: Linked Open Government Data (LOGD)

Outlines

• Categorizing data provider

• Dataset collection

• Dataset characteristics

▫ Namespace

▫ Ontology Usage

▫ Annotation property

• Concept Coverage

• Case-Based Analysis

• Conclusion

Page 14: Linked Open Government Data (LOGD)

Case-Based Analysis

• Three datasets from the same agency in the same category

▫ Department of Veterans Affairs

dataset1213, dataset1288, dataset1290

• The query results for each dataset show that all three contain similar concepts:

State, City, VISN, Station

Page 15: Linked Open Government Data (LOGD)

Case-Based Analysis-1288

• The query lists every station with its VISN code in each city, together with the state in which the city is located:

SELECT DISTINCT ?city ?station ?visn ?st
WHERE {
  ?s <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#city> ?city .
  OPTIONAL { ?s <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#station> ?station }
  OPTIONAL { ?s <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#visn> ?visn }
  OPTIONAL { ?s <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#st> ?st }
}

State VISN Station City
"NJ" "3" "561" "East Orange"
"NY" "3" "620" "Montrose"
"NY" "3" "630" "New York Harbor"
"NY" "3" "632" "Northport"
"DE" "4" "460" "Wilmington"
"PA" "4" "503" "Altoona"
"PA" "4" "529" "Butler"
"WV" "4" "540" "Clarksburg"

Page 16: Linked Open Government Data (LOGD)

Case-Based Analysis-1290

• The query lists every station with its VISN code in each city, together with the state in which the city is located:

SELECT DISTINCT ?city ?station ?visn ?st
WHERE {
  ?s <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#city> ?city .
  OPTIONAL { ?s <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#station> ?station }
  OPTIONAL { ?s <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#visn> ?visn }
  OPTIONAL { ?s <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#st> ?st }
}

State VISN Station City
"ME" "1" "402" "Togus"
"VT" "1" "405" "White River Junction"
"MA" "1" "518" "Bedford"
"MA" "1" "523" "West Roxbury"
"NH" "1" "608" "Manchester"
"MA" "1" "631" "Northampton"
"RI" "1" "650" "Providence"
"CT" "1" "689" "West Haven"

Page 17: Linked Open Government Data (LOGD)

Case-Based Analysis-1213

• The query lists the VISN code for each city, together with the state in which the city is located:

SELECT DISTINCT ?visn ?city ?state
WHERE {
  ?s <http://www.data.gov/semantic/data/alpha/1213/dataset-1213.rdf#visn> ?visn .
  ?s <http://www.data.gov/semantic/data/alpha/1213/dataset-1213.rdf#city> ?city .
  ?s <http://www.data.gov/semantic/data/alpha/1213/dataset-1213.rdf#state> ?state
}

State VISN City
"CT" "1" "West Haven"
"MA" "1" "Bedford"
"MA" "1" "West Roxbury"
"MA" "1" "Northampton"
"ME" "1" "Togus"
"NH" "1" "Manchester"
"RI" "1" "Providence"
"VT" "1" "White River Junction"
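The per-dataset queries share one shape: a dataset-specific namespace plus a list of property names. The sketch below generates such queries from a dataset number. It is illustrative only: dataset_query is a hypothetical helper, and the namespace template is inferred from the URIs in the queries rather than from any Data.gov documentation.

```python
# Namespace pattern inferred from the predicate URIs in the slide queries.
BASE = "http://www.data.gov/semantic/data/alpha/{id}/dataset-{id}.rdf#"


def dataset_query(dataset_id, required, optional=()):
    """Build a SELECT DISTINCT query over one dataset's own properties."""
    ns = BASE.format(id=dataset_id)
    names = [*required, *optional]
    lines = ["SELECT DISTINCT " + " ".join(f"?{n}" for n in names), "WHERE {"]
    # Required properties must be present on the subject.
    lines += [f"  ?s <{ns}{n}> ?{n} ." for n in required]
    # Optional properties are returned when available.
    lines += [f"  OPTIONAL {{ ?s <{ns}{n}> ?{n} }}" for n in optional]
    lines.append("}")
    return "\n".join(lines)


q1288 = dataset_query(1288, ["city"], ["station", "visn", "st"])
q1213 = dataset_query(1213, ["visn", "city", "state"])
print(q1288)
print(q1213)
```

Note that the property names still have to be supplied per dataset (st in one, state in another), which is exactly the multi-name problem the case study points out.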

Page 18: Linked Open Government Data (LOGD)

Case-Based Analysis-1206

• Dataset 1206 contains similar concepts:

VISN STATE Facility-name City
"1" "CT" "VA Connecticut HCS" "West Haven"
"1" "MA" "Edith Nourse Rogers Memorial Veterans Hospital" "Bedford"
"1" "MA" "VA Boston HCSW Roxbury Brockton Jamaica Plns" "West Roxbury"
"1" "MA" "VAMC" "Northampton"
"1" "ME" "VAMC/RO" "Togus"
"1" "NH" "VAMC" "Manchester"
"1" "RI" "VAMC" "Providence"
"1" "VT" "VAM/ROC" "White River Junction"

SELECT DISTINCT ?state ?facilityname ?city ?visn
WHERE {
  ?s <http://www.data.gov/semantic/data/alpha/1206/dataset-1206.rdf#visn> ?visn .
  ?s <http://www.data.gov/semantic/data/alpha/1206/dataset-1206.rdf#state> ?state .
  ?s <http://www.data.gov/semantic/data/alpha/1206/dataset-1206.rdf#city> ?city .
  ?s <http://www.data.gov/semantic/data/alpha/1206/dataset-1206.rdf#facility_name> ?facilityname
}

Page 19: Linked Open Government Data (LOGD)

Case-Based Analysis-Comparison

• To get results across datasets, we need to explicitly define the owl:sameAs property between similar properties:

PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?state ?city
WHERE {
  GRAPH <http://localhost:8890/vad/dataset1288> {
    ?s1 <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#st> ?state .
    ?s1 <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#city> ?city .
    <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#st> owl:sameAs <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#st> .
    <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#city> owl:sameAs <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#city> .
  }
  GRAPH <http://localhost:8890/vad/dataset1290> {
    ?s2 <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#st> ?st .
    ?s2 <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#city> ?city .
    <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#st> owl:sameAs <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#st> .
    <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#city> owl:sameAs <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#city> .
  }
}
ORDER BY ?state

State City
"CT" "West Haven"
"MA" "Bedford"
"MA" "West Roxbury"
"MA" "Northampton"
"ME" "Togus"
"NH" "Manchester"
"RI" "Providence"
"VT" "White River Junction"
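The same alignment idea can be sketched outside the triple store: map each dataset-specific property to a shared name, then compare records on the shared names. This is a client-side illustration under stated assumptions, not the method used in the presentation. The ALIGNMENT table and the normalize and key helpers are hypothetical, and the sample rows are a small subset of the result tables for datasets 1290 and 1213. In OWL itself, owl:sameAs is defined for individuals, while owl:equivalentProperty is the construct intended for relating properties.

```python
NS_1290 = "http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#"
NS_1213 = "http://www.data.gov/semantic/data/alpha/1213/dataset-1213.rdf#"

# Hypothetical alignment table: dataset-specific property -> shared name.
ALIGNMENT = {
    NS_1290 + "st": "state",
    NS_1290 + "city": "city",
    NS_1213 + "state": "state",
    NS_1213 + "city": "city",
}


def normalize(record: dict) -> dict:
    """Rename aligned properties to their shared name and drop the rest."""
    return {ALIGNMENT[p]: v for p, v in record.items() if p in ALIGNMENT}


def key(record: dict) -> tuple:
    """Comparison key built only from the shared property names."""
    return (record["state"], record["city"])


# A few rows taken from the result tables for datasets 1290 and 1213.
rows_1290 = [
    {NS_1290 + "st": "ME", NS_1290 + "city": "Togus", NS_1290 + "station": "402"},
    {NS_1290 + "st": "MA", NS_1290 + "city": "Bedford", NS_1290 + "station": "518"},
]
rows_1213 = [
    {NS_1213 + "state": "MA", NS_1213 + "city": "Bedford", NS_1213 + "visn": "1"},
    {NS_1213 + "state": "CT", NS_1213 + "city": "West Haven", NS_1213 + "visn": "1"},
]

keys_1290 = {key(normalize(r)) for r in rows_1290}
keys_1213 = {key(normalize(r)) for r in rows_1213}

# Locations present in both datasets once st and state are treated as one concept.
shared = sorted(keys_1290 & keys_1213)
print(shared)
```

Without the alignment table the two datasets share no property URIs at all, so no record could be matched. This is the practical cost of using several names for one concept.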

Page 20: Linked Open Government Data (LOGD)

Outlines

• Categorizing data provider

• Dataset collection

• Dataset characteristics

▫ Namespace

▫ Ontology Usage

▫ Annotation property

• Concept Coverage

• Case-Based Analysis

• Conclusion

Page 21: Linked Open Government Data (LOGD)

Conclusion

• No government ontology has been used in the experimental datasets

• Weak vocabulary usage in US Government data

• Multiple vocabularies used for the same concept

• Multiple vocabularies used within the same government agency

• Lack of a well-defined, coherent, and consistent government ontology

Page 22: Linked Open Government Data (LOGD)

Thank you