The Evolving Semantic Web and Semantic eScience Landscape Deborah L. McGuinness Tetherless World Senior Constellation Chair Professor of Computer and Cognitive Science Rensselaer Polytechnic Institute Troy, NY, USA Joint work with the Tetherless World Constellation eScience , Provenance, and Linked Open Data Teams. Particularly Peter Fox, Jim Hendler, Patrick West, Stephan Zednik, Cynthia Chang, … tw.rpi.edu/people

201109021 mcguinness ska_meeting


DESCRIPTION

Invited talk for the Square Kilometer Array meeting in Wellington, New Zealand, in Sept 2011 on Semantic eScience and Semantically enabled Virtual Observatories, along with directions


Page 1: 201109021 mcguinness ska_meeting

The Evolving Semantic Web and Semantic eScience Landscape

Deborah L. McGuinness

Tetherless World Senior Constellation Chair

Professor of Computer and Cognitive Science

Rensselaer Polytechnic Institute

Troy, NY, USA

Joint work with the Tetherless World Constellation eScience, Provenance, and Linked Open Data Teams. Particularly Peter Fox, Jim Hendler, Patrick West, Stephan Zednik, Cynthia Chang, … tw.rpi.edu/people

Page 2: 201109021 mcguinness ska_meeting

Introduction

– Science data is exploding – sensors creating more than we can handle, Linked open data initiatives, etc.
– Virtual Observatories expanding – in breadth, depth, and semantic usage
– Introduction to a leading edge interdisciplinary virtual observatory – Virtual Solar Terrestrial Observatory
– Directions – (that may be even more important for BIG science)
  • Provenance
  • Semantic eScience Framework
– Discussion

Page 3: 201109021 mcguinness ska_meeting

Rensselaer Tetherless World Constellation (TWC)

http://tw.rpi.edu

Chaired Professors: McGuinness, Fox, Hendler

Research Prof: Luciano; Research Staff: Bao, Chang, Erickson, Shi, West, Zednik

Themes:
• Semantic Foundations
  • Knowledge Provenance / Explanation
  • Ontology Environments
  • Inference
  • Trust
  • Linked Data
• Xinformatics
  • Semantic eScience
  • Data Science
  • eHealth
  • eEnvironment
• Future Web
  • Web Science
  • Policy
  • Social

Page 4: 201109021 mcguinness ska_meeting

McGuinness NSF/NCAR May 6, 2008

Semantic e-Science Motivations

AI Goal: AI in service of supporting the next generation of science – interdisciplinary, distributed e-Science

Science Goal: Scientists should be able to access a global, distributed knowledge base of scientific data that:
• appears to be integrated
• appears to be locally available

But… data is obtained by multiple instruments, using various protocols, in differing vocabularies, using (sometimes unstated) assumptions, with inconsistent (or non-existent) meta-data. It may be inconsistent, incomplete, evolving, and distributed.

We look to semantic technologies to help.

Page 5: 201109021 mcguinness ska_meeting

Virtual Solar Terrestrial Observatory (vsto.org)

• Interdisciplinary Virtual Observatory for searching, integrating, & analyzing observational, experimental, & model databases.

• Subject matter: solar, solar-terrestrial and space physics

• Provides virtual access to specific data, model, tool and material archives containing items from a variety of space- and ground-based instruments and experiments, as well as individual and community modeling and software efforts bridging research and educational use

• 3 year NSF project; initial deployment in year 1, multiple deployments by year 2; year 3 outreach and broadening

• While aimed at one interdisciplinary area, it serves as a replicable prototype for interdisciplinary virtual observatories

• Numerous follow-ons (Semantic Provenance Capture in Data Ingest Systems, SESDI, SESF, SSIII, …)

Page 6: 201109021 mcguinness ska_meeting

With NCAR, UTEP

Page 7: 201109021 mcguinness ska_meeting

Page 8: 201109021 mcguinness ska_meeting
Page 9: 201109021 mcguinness ska_meeting

Some Learnings

• Successful demonstration of semantic technologies

• Serves as operational prototype and has been replicated in volcanology and climate response, semantic sea ice, …

• Semantic Web methodology for development

• Modularization of ontologies is critical for re-use (along with designing the ontologies for re-use)

• Provenance is critical for acceptance

• Tools, toolkits, and smart frameworks are one next step that we are taking (and we love partners in this endeavor…)

Page 10: 201109021 mcguinness ska_meeting

Semantic Web Methodology and Technology Development Process

• Establish and improve a well-defined methodology vision for Semantic Technology based application development; Leverage controlled vocabularies, etc.

[Figure: development process diagram. Stage labels: Use Case; Small Team, mixed skills; Analysis; Adopt Technology Approach; Leverage Technology Infrastructure; Rapid Prototype; Open World: Evolve, Iterate, Redesign, Redeploy; Use Tools; Science/Expert Review & Iteration; Develop model/ontology; Evaluation]

James L. Benedict, Deborah L. McGuinness, and Peter Fox. A Semantic Web-based Methodology for Building Conceptual Models of Scientific Information. In American Geophysical Union, Fall Meeting (AGU2006), San Francisco, Ca., December, 2007. Eos Trans. AGU 88(52), Fall Meet. Suppl., Abstract IN53A-0950.

Page 11: 201109021 mcguinness ska_meeting

Semantic Provenance Capture for Data Ingest Systems (SPCDIS)

Fact: Scientific data services are increasing in usage and scope, and with these increases comes growing need for access to provenance information.

Provenance Project Goal: to design a reusable, interoperable provenance infrastructure.

Science Project Goal: design and implement an extensible provenance solution that is deployed at the science data ingest/ product generation time.

Outcome: implemented provenance solution in one science setting AND operational specification for other scientific data applications.

Extends vsto.org

Page 12: 201109021 mcguinness ska_meeting

Advanced Coronal Observing System (ACOS) Provenance Use Cases

• What were the cloud cover and seeing conditions during the observation period of this image?

• What calibrations have been applied to this image?

• Why does this image look bad?

Page 13: 201109021 mcguinness ska_meeting

ACOS Data Ingest

• Typical science data processing pipelines

• Distributed

• Some metadata in silos

• Much metadata lost

• Many human-in-loop decisions, events

• No metadata infrastructure for any user

• Community is broadening

Chromospheric Helium Imaging Photometer (CHIP) Data Ingest

ACOS – Advanced Coronal Observing System

Page 14: 201109021 mcguinness ska_meeting

PML Usage in SPCDIS

• Justification
  – Explanation
  – Causality graph

• Provenance
  – Conclusion
  – Source
  – Engine
  – Rule

• Trust
  – Trust/Belief metrics

[Figure: PML structure diagram. Classes: NodeSet, Justification, Conclusion, Engine, Rule, SourceUsage, Source, DateTime. Properties: hasAntecedentList, hasSourceUsage, hasInferenceRule, hasInferenceEngine]

Page 15: 201109021 mcguinness ska_meeting

PML in Action

• This is the PML provenance encoding for a “quick look” gif file that is generated from two image datasets

Node set for the quicklook gif file

hasConclusion: a reference to the gif file itself

InferenceStep: how the gif file was derived

hasAntecedents

hasInferenceRule

hasInferenceEngine

The “antecedents” of the quicklook gif file are other node sets
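
The node set structure described on this slide can be sketched in a few lines of Python. This is only an illustration of the shape of the encoding, not the PML vocabulary or any Inference Web API: the class and field names loosely mirror the slide's labels, and the file names, rule name, and engine name are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InferenceStep:
    """How a conclusion was derived (fields loosely mirror the slide's labels)."""
    rule: str                      # stands in for hasInferenceRule
    engine: str                    # stands in for hasInferenceEngine
    antecedents: List["NodeSet"] = field(default_factory=list)


@dataclass
class NodeSet:
    """A conclusion plus, optionally, the step that produced it."""
    conclusion: str                # stands in for hasConclusion (a file reference)
    step: Optional[InferenceStep] = None


def lineage(node: NodeSet) -> List[str]:
    """Return this node set's conclusion and those of all antecedents, depth-first."""
    found = [node.conclusion]
    if node.step is not None:
        for antecedent in node.step.antecedents:
            found.extend(lineage(antecedent))
    return found


# Hypothetical file names, rule, and engine, for illustration only.
intensity = NodeSet("intensity_image.dat")
velocity = NodeSet("velocity_image.dat")
quicklook = NodeSet(
    "quicklook.gif",
    InferenceStep(rule="make-quicklook", engine="ingest-pipeline",
                  antecedents=[intensity, velocity]),
)

print(lineage(quicklook))
```

Because antecedents are themselves node sets, walking them recursively recovers everything the quicklook file depends on, which is the kind of question the ACOS use cases ask.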

Page 16: 201109021 mcguinness ska_meeting

Integrated View

• Observer log’s information added into quicklook image’s provenance

Page 17: 201109021 mcguinness ska_meeting

Knowledge Provenance in Action

[Figure: example applications. Panels: Mobile Wine Agent; GILA; Combining Proofs in TPTP; Cognitive Asst; Virtual Observatories; Intelligence Analyst Tools]

McGuinness – Inference Web

Page 18: 201109021 mcguinness ska_meeting

Discussion

• Semantic technologies can help in many ways – we have demonstrated their use in integration, discovery, access, validation, …

• Many subject area ontologies exist… and some are modular enough and vetted enough and maintained enough to depend on

• Moving from semantically-enabled systems to semantically-enabled frameworks is part of our present and future and we think it will be for others

• Provenance is critical and should be part of the design from day 1 (not an afterthought)…. And languages and tools are emerging

• Linked data can play a role – e.g., SemantAqua

• Things you might consider:
  – Use our framework / tools / tutorials such as linked data, Inference Web, Ontologies, SESF
  – Contribute your ontologies, tools, use cases to SESF
  – Collaborate with us…
  – Questions: dlm @ cs. rpi. edu

Page 19: 201109021 mcguinness ska_meeting

Tropopause

http://aerosols.larc.nasa.gov/volcano2.swf

Page 20: 201109021 mcguinness ska_meeting

Atmosphere Use Case

Determine the statistical signatures of both volcanic and solar forcings on the height of the tropopause

From paleoclimate researcher – Caspar Ammann – Climate and Global Dynamics Division of NCAR - CGD/NCAR

Layperson perspective: look for indicators of acid rain in the part of the atmosphere we experience… (look at measurements of sulfur dioxide in relation to sulfuric acid after volcanic eruptions at the boundary of the troposphere and the stratosphere)

NASA-funded effort with Fox – NCAR->RPI, Sinha – Va. Tech, Raskin – JPL

Page 21: 201109021 mcguinness ska_meeting

Use Case: A Volcano Erupts

Preferentially it’s a tropical mountain (+/- 30 degrees of the equator) with ‘acidic’ magma (more SiO2), and it erupts with great intensity so that material and large amounts of gas are injected into the stratosphere.

The SO2 gas converts to H2SO4 (sulfuric acid) + H2O (75% H2SO4 + 25% H2O). The half-life of SO2 is about 30 to 40 days.

The sulfuric acid condenses into little super-cooled liquid droplets. These are the volcanic aerosol that will linger around for a year or two.

The Brewer-Dobson circulation of the stratosphere will transport aerosol to higher latitudes. The particles generate great sunsets, most commonly first seen in fall of the respective hemisphere. The sunlight gets partially reflected, and some part gets scattered in the forward direction.

The result is that the direct solar beam is reduced, yet diffuse skylight increases. The scattering is responsible for the colorful sunsets, as more and more of the blue wavelengths are scattered away. In mid-latitudes the volcanic aerosol starts to settle, but the most efficient removal from the stratosphere is through tropopause folds in the vicinity of the storm tracks.

If particles get over the pole, which happens in spring of the respective hemisphere, then they will settle down and fall onto polar ice caps. It’s from these ice caps that we recover annual records of sulfate flux or deposit.

We get ice cores that show continuous deposition information. Nowadays we measure sulfate, or SO4(2-). Earlier measurements were indirect, putting an electric current through the ice and measuring the delay. With acids present, the electric flow would be faster.

What we are looking for are pulse-like events with a build-up over a few months (mostly in summer, when the vortex is gone), and then a decay of the peak of about 1/e in 12 months.

The distribution of these pulses was found to follow an extreme value distribution (Frechet) with a heavy tail.
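
The numbers in this use case can be turned into a small worked sketch. The half-life range (30 to 40 days) and the 1/e decay in 12 months come from the slide; the function names are hypothetical, and the Frechet shape and scale parameters are placeholders, since the slide gives no fitted values.

```python
import math


def so2_fraction_remaining(days: float, half_life_days: float) -> float:
    """Fraction of SO2 left after `days`, given a half-life in days."""
    return 0.5 ** (days / half_life_days)


def pulse_fraction_remaining(months: float, efold_months: float = 12.0) -> float:
    """Fraction of a sulfate peak left after `months`, decaying by 1/e per `efold_months`."""
    return math.exp(-months / efold_months)


def frechet_cdf(x: float, shape: float, scale: float = 1.0) -> float:
    """Frechet CDF with location 0: exp(-(x/scale)**(-shape)) for x > 0, else 0."""
    if x <= 0:
        return 0.0
    return math.exp(-((x / scale) ** (-shape)))


# SO2 left 90 days after an eruption, for both ends of the slide's half-life range.
for half_life in (30.0, 40.0):
    print(half_life, so2_fraction_remaining(90.0, half_life))

# Sulfate peak left after 12 and 24 months.
print(pulse_fraction_remaining(12.0), pulse_fraction_remaining(24.0))
```

With a 30-day half-life, 90 days is exactly three half-lives, so one eighth of the SO2 remains; the heavy tail of the Frechet distribution is what allows occasional very large pulses.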

Page 22: 201109021 mcguinness ska_meeting
Page 23: 201109021 mcguinness ska_meeting

Inference Web: Making Data Transparent and Actionable Using Semantic Technologies

• How and when does it make sense to use smart system results & how do we interact with them?

[Figure: Inference Web application areas. Panels: Knowledge Provenance in Virtual Observatories; Hypothesis Investigation / Policy Advisors; (Mobile) Intelligent Agents; Intelligence Analyst Tools; NSF Interops: SONET, SSIII – Sea Ice]

Page 24: 201109021 mcguinness ska_meeting

Core and framework semantics

Page 25: 201109021 mcguinness ska_meeting

Ontology Spectrum

[Figure: ontology spectrum, from least to most expressive: Catalog/ID; Terms/glossary; Thesauri “narrower term” relation; Informal is-a; Formal is-a; Formal instance; Frames (properties); Value Restrs.; Disjointness, Inverse, part-of…; General Logical constraints]

From 99 AAAI panel, 2000 Dagstuhl talk

Page 26: 201109021 mcguinness ska_meeting

November 9, 2006 26

Virtual Observatory (VSTO)

• General: Find data subject to certain constraints and plot appropriately

• Specific: Plot the observed/measured Neutral Temperature as recorded by the Millstone Hill Fabry-Perot interferometer while looking in the vertical direction at any time of high geomagnetic activity in a way that makes sense for the data.

Page 27: 201109021 mcguinness ska_meeting

VSTO Results

Many Benefits:
– Reduced query formation from 8 to 3 steps and reduced choices at each stage
– Allowed scientists to get data from instruments they never knew of before (e.g., photometers in example)
– Supported augmentation and validation of data
– Useful and related data provided without having to be an expert to ask for it
– Integration and use (e.g. plotting) based on inference
– Ask and answer questions not possible before

But Needed Provenance (SPCDIS, PML), reusability & modularity (SESF)

– Deborah McGuinness, Peter Fox, Luca Cinquini, Patrick West, Jose Garcia, James L. Benedict, and Don Middleton. The Virtual Solar-Terrestrial Observatory: A Deployed Semantic Web Application Case Study for Scientific Research. In the Proceedings of the Nineteenth Conference on Innovative Applications of Artificial Intelligence (IAAI-07). Vancouver, British Columbia, Canada, July 22-26, 2007.

– Peter Fox, Deborah L. McGuinness, Luca Cinquini, Patrick West, Jose Garcia, James L. Benedict, and Don Middleton. Ontology-supported Scientific Data Frameworks: The Virtual Solar-Terrestrial Observatory Experience. In Computers and Geosciences - Elsevier. Volume 35, Issue 4 (2009).
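
One way to picture the benefit of finding instruments a scientist "never knew of before" is a query over a class hierarchy rather than over instrument names. The sketch below is a minimal illustration under stated assumptions: the hierarchy fragment, the instrument records other than the Millstone Hill Fabry-Perot interferometer, and the function names are all hypothetical and are not taken from the VSTO ontology.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical fragment of an instrument class hierarchy (child -> parent).
SUBCLASS_OF: Dict[str, str] = {
    "OpticalInstrument": "Instrument",
    "Photometer": "OpticalInstrument",
    "Interferometer": "OpticalInstrument",
    "FabryPerotInterferometer": "Interferometer",
    "Radar": "Instrument",
}

# Hypothetical instrument records: (name, class).
INSTRUMENTS: List[Tuple[str, str]] = [
    ("Millstone Hill Fabry-Perot interferometer", "FabryPerotInterferometer"),
    ("Example photometer", "Photometer"),
    ("Example radar", "Radar"),
]


def ancestors(cls: str) -> Set[str]:
    """All superclasses of `cls`, found by following child -> parent links."""
    found: Set[str] = set()
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.add(cls)
    return found


def instruments_of(cls: str) -> List[str]:
    """Names of instruments whose class is `cls` or any subclass of it."""
    return [name for name, c in INSTRUMENTS if c == cls or cls in ancestors(c)]


print(instruments_of("OpticalInstrument"))
```

A request for optical instruments returns the photometer as well as the interferometer, even though the user never named either; a deployed system would delegate this to an ontology reasoner rather than a hand-written loop.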

Page 28: 201109021 mcguinness ska_meeting

VSTO Instrument

Page 29: 201109021 mcguinness ska_meeting

VSTO Infrastructure

Page 30: 201109021 mcguinness ska_meeting

Partial exposure of Instrument class hierarchy - users seem to LIKE THIS

Page 31: 201109021 mcguinness ska_meeting

Users Require Provenance!

Users demand it! If users (humans and agents) are to use, reuse, and integrate system answers, they must trust them.

Intelligence analysts: (from DTO/IARPA’s NIMD)
Andrew Cowell, Deborah McGuinness, Carrie Varley, and David A. Thurman. Knowledge-Worker Requirements for Next Generation Query Answering and Explanation Systems. Proc. of Intelligent User Interfaces for Intelligence Analysis Workshop, Intl Conf. on Intelligent User Interfaces (IUI 2006), Sydney, Australia.

Intelligent Assistant Users: (from DARPA’s PAL/CALO)
Alyssa Glass, Deborah L. McGuinness, Paulo Pinheiro da Silva, and Michael Wolverton. Trustable Task Processing Systems. In Roth-Berghofer, T., and Richter, M.M., editors, KI Journal, Special Issue on Explanation, Künstliche Intelligenz, 2008.

Virtual Observatory Users: (from NSF’s VSTO)
Deborah McGuinness, Peter Fox, Luca Cinquini, Patrick West, Jose Garcia, James L. Benedict, and Don Middleton. The Virtual Solar-Terrestrial Observatory: A Deployed Semantic Web Application Case Study for Scientific Research. Proc. of the Nineteenth Conference on Innovative Applications of Artificial Intelligence (IAAI-07). Vancouver, British Columbia, Canada.

And… as systems become more diverse, distributed, embedded, and depend on more varied data and communities, more provenance and more types are needed.

Page 35: 201109021 mcguinness ska_meeting

A PML-Enhanced Image

[Figure: CHIP Quick-Look image and CHIP PML-Enhanced Quick-Look image with provenance]

Page 37: 201109021 mcguinness ska_meeting

Provenance aware faceted search

Tetherless World Constellation

Page 38: 201109021 mcguinness ska_meeting

Technologies

• Semantic Web methodology

• Medium weight ontologies (although adapted from existing ontologies)

• Access to data

• Mapping info / services

• Reasoning (previous application was linking and exploration)

• Note – this project was operational in 8 months and is still in use years later

Page 39: 201109021 mcguinness ska_meeting

Semantically-Enabled Systems -> Semantically-Enabled Frameworks

• We could continue to build somewhat extensible and reusable systems…. But

• We wanted broader base of builders and users

• Frameworks provide many entry and exit points and re-usable (hopefully) seamless components

• Open source ontologies and software!

• We love partners in this endeavor…

Page 40: 201109021 mcguinness ska_meeting

Background

• Began knowledge environment for GeoSciences discussions – early 2000s

• Chose a particular interdisciplinary virtual observatory (VSTO) powered by semantic technologies

• Use case driven – in solar and solar-terrestrial physics with an emphasis on instrument-based measurements and real data pipelines

• First step – proof of concept semantically-enabled pilot – VSTO quite successful

• We pushed semantics into applications that were already built on advanced cyberinfrastructure

Page 41: 201109021 mcguinness ska_meeting

Background II

• Provenance demands led to Semantic Provenance Capture for Data Ingest Systems

• Test in new domains – Semantically-Enabled Scientific Data Integration – predict climate impacts following volcanic eruption

• Reuse worked: semantic integration, semantic provenance, (with modularization and tool requests)

• Goal now – configurable, re-usable framework with embedded toolkit

Page 42: 201109021 mcguinness ska_meeting

Framework overview

Page 43: 201109021 mcguinness ska_meeting

Semantic Web Methodology and Technology Development Process

James L. Benedict, Deborah L. McGuinness, and Peter Fox. A Semantic Web-based Methodology for Building Conceptual Models of Scientific Information. In American Geophysical Union, Fall Meeting (AGU2006), San Francisco, Ca., December, 2007. Eos Trans. AGU 88(52), Fall Meet. Suppl., Abstract IN53A-0950.

Page 44: 201109021 mcguinness ska_meeting

Application integration with smart, scalable search

• Rozell et al.

Page 45: 201109021 mcguinness ska_meeting

Core and framework semantics

Page 46: 201109021 mcguinness ska_meeting

Status & Discussion

• Ontology and tool re-use in process or beginning with many projects
  – VSTO re-implementation
  – BCO-DMO (biological and chemical oceanography)
  – Semantic Sea Ice (NSF Interop project)
  – Scientific Observations Network (SONET – NSF Interop)
  – National Ecological Observatory Network (NEON)
  – CSIRO Water Monitoring
  – Your Project Here!

• Modularization in process

• Tools like S2S in place and being tested

Page 47: 201109021 mcguinness ska_meeting

Commonalities

• Applications – simple linking; integration of many existing vocabularies, simple inference

• Encoding of meaning – often lightweight – ontologies

• Semantic Web methodology

• Often lightweight data encodings – triple stores

• Usually simple reasoners

• Provenance encodings

• These are all options that can be used incrementally and at varying degrees of sophistication.

• While initial applications are often on larger platforms, many can be adapted to mobile platforms
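
To make "lightweight data encodings" concrete, here is a minimal sketch of an in-memory triple store with wildcard pattern matching. It is an illustration only, not one of the TWC tools: the class, its methods, and the `ex:` identifiers are hypothetical, and a real deployment would use an RDF store with a SPARQL endpoint.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


class TripleStore:
    """A tiny in-memory store of (subject, predicate, object) triples."""

    def __init__(self) -> None:
        self._triples: List[Triple] = []

    def add(self, subject: str, predicate: str, obj: str) -> None:
        triple = (subject, predicate, obj)
        if triple not in self._triples:  # ignore exact duplicates
            self._triples.append(triple)

    def match(self, subject: Optional[str] = None,
              predicate: Optional[str] = None,
              obj: Optional[str] = None) -> List[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        return [
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        ]


# Hypothetical identifiers, for illustration only.
store = TripleStore()
store.add("ex:image42", "ex:recordedBy", "ex:chip")
store.add("ex:image42", "ex:derivedFrom", "ex:raw42")
store.add("ex:chip", "rdf:type", "ex:Photometer")

print(store.match(subject="ex:image42"))
```

Data description and provenance statements (here `ex:derivedFrom`) live in the same structure, which is one reason these encodings can be adopted incrementally.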

Page 48: 201109021 mcguinness ska_meeting

Comments

• Broader groups of people are now building linked data applications – e.g., hackathons for linked govt data, TWC/Elsevier Hackathon, Health 2.0 , etc.

• Broader groups of people are now building Virtual Observatories AND wanting to integrate more data, disciplines, etc.

• More interest in encodings of meaning to create smarter and more context-aware applications

• Growing demand for provenance for attribution, trust, transparency

• More applications are moving to mobile and becoming ubiquitous

• More data from sensors and from open data initiatives is fueling some applications

• Things you might consider:
  – Use our framework / tools / tutorials such as linked data, Inference Web, Ontologies, SESF
  – Contribute your modules to SESF
  – Collaborate with us…
  – Questions: dlm @ cs. rpi. edu

Page 49: 201109021 mcguinness ska_meeting

Extras

Page 50: 201109021 mcguinness ska_meeting

Ontology

Regulation Ontology
– Model federal and state water quality regulations for drinking water sources
– Can use to define, for example, in California: “for any measurement, a value of 0.01 mg/L is the limit for Arsenic”
– Combined with the core ontology, we can infer: “any water source that contains 0.01 mg/L of Arsenic is a polluted water source.”

Portion of Cal. Regulation Ontology.
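
The regulation-plus-core-ontology inference can be sketched as a simple lookup and comparison. The California Arsenic limit of 0.01 mg/L is from the slide; everything else is hypothetical, including the sample measurements and the function name. Treating a value equal to the limit as a violation follows the slide's wording and is an assumption about how the actual regulation is modeled.

```python
from typing import Dict, List, Tuple

# (jurisdiction, substance) -> limit in mg/L. Only the California Arsenic
# entry comes from the slide; further entries would be added per regulation.
LIMITS_MG_PER_L: Dict[Tuple[str, str], float] = {
    ("California", "Arsenic"): 0.01,
}

Measurement = Tuple[str, str, float]  # (water source, substance, value in mg/L)


def polluted_sources(measurements: List[Measurement], jurisdiction: str) -> List[str]:
    """Water sources with at least one measurement at or above the regulated limit."""
    flagged: List[str] = []
    for source, substance, value in measurements:
        limit = LIMITS_MG_PER_L.get((jurisdiction, substance))
        # Assumption: the boundary is inclusive, following the slide's wording.
        if limit is not None and value >= limit and source not in flagged:
            flagged.append(source)
    return flagged


# Hypothetical measurements, for illustration only.
samples: List[Measurement] = [
    ("Well A", "Arsenic", 0.02),
    ("Well B", "Arsenic", 0.004),
    ("Well C", "Lead", 0.5),  # no limit modeled here, so not flagged
]

print(polluted_sources(samples, "California"))
```

Keeping the limits in a separate table mirrors the slide's split between a regulation ontology (what the limits are) and a core ontology (what counts as polluted), so another state's rules can be swapped in without touching the inference.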

Page 51: 201109021 mcguinness ska_meeting

Visualization

• Map Visualization:
  1. Presents analyzed results with Google Map
  2. Presents explanation on why a water source is marked as polluted
  3. Use “Facet” type filter to select type of data

http://was.tw.rpi.edu/swqp/map.html

Page 52: 201109021 mcguinness ska_meeting

Selected Follow-up options

Limit

Violation

Page 53: 201109021 mcguinness ska_meeting

PopSciGrid in Action

http://logd.tw.rpi.edu/demo/tax-cost-policy-prevalence

Page 54: 201109021 mcguinness ska_meeting

Directions

• Use sensed personal data to provide context and integrate with aggregated data to provide actionable health advisors – diet & nutrition, exercise, etc.

• Use PopSciGrid model for other data, e.g., CLASS data about exercise and nutrition in schools

• Relate to health impacts

• Expose provenance more effectively

Page 55: 201109021 mcguinness ska_meeting

Tetherless Faceted Browsing

Page 56: 201109021 mcguinness ska_meeting

PopSciGrid Revisited

[Figure: PopSciGrid workflow. Components: CSV2RDF4LOD Direct; SemDiff; Archive; CSV2RDF4LOD Enhance; Ban coverage. Step labels: derive, integrate, archive, visualize, Publish]

Data sets, simple ontology; Provenance tools; Visualization tools

Page 57: 201109021 mcguinness ska_meeting
Page 58: 201109021 mcguinness ska_meeting

VSTO DataProduct

Page 59: 201109021 mcguinness ska_meeting

Semantic Web Methodology

McGuinness, Fox, West, Garcia, Cinquini, Benedict, Middleton http://www.vsto.org

Page 62: 201109021 mcguinness ska_meeting

CHIP Pipeline (Chromospheric Helium Image Photometer)

[Figure: CHIP data pipeline between Mauna Loa Solar Observatory (MLSO), Hawaii, and the National Center for Atmospheric Research (NCAR) Data Center, Boulder, CO. Labels:]

• Raw Data Capture: raw image data captured by CHIP (Chromospheric Helium-I Image Photometer)
• Raw Image Data
• Follow-up Processing on Raw Data (e.g., Flat Field Calibration)
• Quality Checking (Images Graded: GOOD, BAD, UGLY)
• Publishes: Intensity Images (GIF), Velocity Images (GIF)

Page 63: 201109021 mcguinness ska_meeting

Core and Framework Semantics - Multi-tiered interoperability

used by

Page 64: 201109021 mcguinness ska_meeting

TWC and the Semantic Web Layer Cake

[Figure: Semantic Web layer cake annotated with TWC contributions. Labels:]

• SPARQL to XQuery translator; RDFS materialization (Billion triple winner)
• Govt metadata search; Linked Open Govt Data
• SPARQL WG, earlier QL – OWL-QL, Classic’ QL, …
• OWL 1 & 2 WG: edited main OWL docs, quick reference, OWL profiles (OWL RL)
• Earlier languages: DAML, DAML+OIL, Classic
• RIF WG; AIR accountability tool
• DL, KIF, CL, N3Logic
• Inference Web, Proof Markup Language, W3C Provenance Working Group formal model
• Inference Web IW Trust, AIR + Trust
• Visualization APIs; S2S
• Govt Data
• Ontology repositories (Ontolingua); ontology evolution env: Chimaera; Semantic eScience Ontologies; MANY other ontologies
• Transparent Accountable Datamining Initiative (TAMI)

Page 65: 201109021 mcguinness ska_meeting

SemantAqua (part of SemantEco)

• Enable/Empower citizens & scientists to explore water pollution sites, facilities, and regulations along with provenance.

• Demonstrates semantic web technologies in environmental informatics systems.

• Map presentation of analysis

• Explanations and Provenance available

• Use “Facet” type filter to select type of data

http://was.tw.rpi.edu/swqp/map.html

Page 66: 201109021 mcguinness ska_meeting

System Architecture

access

Virtuoso

Page 67: 201109021 mcguinness ska_meeting

Ontology

• Core TWC Water ontology
  – Extends existing best practice ontologies, e.g. SWEET, OWL-Time.
  – Includes terms for relevant pollution concepts
  – Can use to conclude: “any water source that has a measurement outside of its allowable range” is a polluted water source.

Portion of the TWC Water Ontology.

Page 68: 201109021 mcguinness ska_meeting

Provenance

• Preserves provenance in the Proof Markup Language (PML).

• Data source level provenance:
  – The captured provenance data are used to support provenance-based queries.

• Reasoning level provenance:
  – When a water source has been marked as polluted, the user can access supporting provenance data for the explanations, including the URLs of the source data, intermediate data, and the converted data.

Page 69: 201109021 mcguinness ska_meeting

Some Foundations

• Growing body of Open Linked Data
• Growth and acceptance of ontologies and ontology-enabled service
• RPI Tetherless World backend tools and service
  • LOGD
  • Inference Web and Proof Markup Language
  • eScience ontologies and infrastructure

Page 70: 201109021 mcguinness ska_meeting

The Tetherless World Constellation Linked Open Government Data Portal

[Figure: TWC LOGD portal workflow. Labels: Create; Convert; Enhance; Query/Access; LOGD SPARQL Endpoint; Community Portal; Data.gov deployment. Formats shown: RDF, RSS, JSON, XML, HTML, CSV, …]
