Explanation Infrastructure Supporting Transparency and Accountability Deborah L. McGuinness...

Explanation Infrastructure Supporting

Transparency and Accountability

Deborah L. McGuinnessCo-Director and Senior Research Scientist

Knowledge Systems, AI Laboratory, Stanford University

Joint work with Li Ding, Cynthia Chang, Vasco Furtado, Paulo Pinheiro da Silva

Inference Web is joint work with Richard Fikes, Alyssa Glass, Honglei Zeng, Mayukh Bhaowal, Bill Millar, Dhyanesh Narayan, Priyendra Deshwal, …

TAMI/Portia Privacy and Accountability Workshop28-29 June 2006, Cambridge, MA USA

Deborah McGuinness, June 2006

General Motivation

• Interoperability – as more systems are use varied sources and multiple information manipulation engines, they benefit more from encodings that are shareable and interoperable

• Provenance - if users (humans and agents) are to use and integrate data from unknown, unreliable, or multiple sources, they need provenance metadata for evaluation

• Explanation/Justification – if information has been manipulated (i.e., by sound deduction or by heuristic processes), information manipulation trace information should be available

• Trust –if some sources are more trustworthy than others, representations should be available to encode, propagate, combine, and (appropriately) display trust values

Provide interoperable knowledge provenance infrastructure that supports explanations of sources, assumptions, learned information, and answers as an

enabler for trust.

TAMI Background

• Overall TAMI Ambition: Explore privacy implications of semantic web technologies and determine viability.

• Goal: Assess applicability of a usage limitation model (as opposed to or in addition to a data collection model) to data mining/profiling applications.

• Explore technical challenges of provable accountability with explicit justifications in large scale, heterogeneous information systems (i.e., the Web).

• Develop public policy models to encourage transparent, accountable data mining

Privacy and ExplanationAs more online data becomes available and as inference and interoperability increases, privacy protection will have to rely more on usage limitation rules and less on collection limitation rules

Usage Limits depend upon:• Transparency: knowledge provenance history of sources, data

manipulations, and inferences is maintained in an interoperable form. Explanation technology used to allow examination of knowledge provenance by authorized parties (who may be the general public).

• Accountability: ability to check whether the policies that govern data manipulations and inferences were in fact adhered to. Explanation technology used to examine inference usage.

WWW Toolkit

Proof Markup Language (PML)CWM

(TAMI)

JTP(DAML/NIMD)

SPARK(CALO)

UIMA(NIMD/Exp Agg)

IW Explainer/Abstractor

IWBase

IWBrowser

IWSearch

Justification

Provenance

SPARK-L

Text Analytics

IWTrust

provenanceregistration

search enginebased publishing

Expert friendlyVisualization

End-user friendly visualization

Trust computationSDS

(DAML/SNRC)OWL-S/BPEL

[Inference Web] Framework for explaining question answering tasks by abstracting, storing, exchanging, combining, annotating, filtering, segmenting, comparing, and rendering proofs and proof fragments provided by question answerers.

Inference Web Infrastructure

PML Ontology

How PML Works

Justification Trace

IWBase

NodeSet foo:ns1(hasConclusion …)

Query foo:query1(type arrest1 ?x)

Question foo:question1 (how is arrest1 classified?)

Mapping

NodeSet foo:ns2(hasConclusion …)

SourceUsage

hasAnswer

hasAntecendent

fromQuery

fromAnswer

isQueryFor

InferenceEngine

InferenceRule

hasVariableMapping

hasInferencEngine

hasRuleInferenceStep

Language hasLanguage

InferenceStep

Source

isConsequentOf

hasSourceUsage hasSource isConsequentOf

usageTime …

TAMI ArchitectureData Sources

Inference Engine(s) & Truth Maintenance System

Justification Generator (why action does/doesn’t comply with policy)

Proof Generation Services (Inference Web)

Assertions with provenance

Usage Rules (Laws/Policy)

Action

Privacy Act background

Privacy Act

Sharing Restrictions• (b) Conditions of Disclosure.—

– No agency shall disclose any record which is contained in a system of records

– by any means of communication to any person or agency …

– except . . . with the prior written consent of, the individual to whom the record pertains,

– unless disclosure of the record would be— ….

• (3) for a routine use [5 USC § 552a(b)(3)]

Routine Use• as defined in subsection (a)(7)

– “the use of such record for a purpose which is compatible with the purpose for which it was collected;” [5 USC § 552a(a)(7)]

• described under subsection (e)(4)(D) – “publish in the Federal Register upon establishment or

revision a notice of the existence and character of the system of records, which notice shall include—” [5 USC § 552a(e)(4)]

– “each routine use of the records contained in the system, including the categories of users and the purpose of such use” [5 USC § 552a(e)(4)(d)]

a ROUTINE USE

a SHARING_PERMISSION

Recipient : [1,∞] OrganizationHasDataCategory :DataCategoryHasPurpose :AuthorizedPurpose

General Categories

Structured Components

One Simple Description

Sample CodeRU1 a RoutineUse

; recipient FBI ; category DataCategory3; purpose CounterTerrorism

; dc:description "May share where TSA becomes aware of information that may be related to an individual identified in the TSDB.“

DataCategory3 a DataCategory; dc:description "Information about people who are known or reasonably suspected to be or have been engaged in conduct constituting, in preparation for, in aid of, or related to terrorism.".

Can share with FBI

Data about possible terrorists

So they can investigate whether a criminal law has been violated

Explanation• Why was data sharing xyz acceptable?

– Using RoutineUse459• We shared with the FBI• Data belonging to DataCategory3• For the purpose of

CounterterrorismCriminalLawEnforcement• Provenance: RoutineUse459 is derived from

– SORN (statement of records notice) for “Secure Flight”– Published at 70 FR 36319– Encoded by DHS Office of the Privacy Officer

Explanation• Why was data sharing xyz unacceptable?

– We checked all of the RoutineUses• We shared with the IRS• Data belonging to DataCategory3

– IRS is not authorized to receive DataCategory3• For the purpose of

CounterterrorismCriminalLawEnforcement• Provenance: all RoutineUses are derived from

– SORN for “Secure Flight”– Published at 70 FR 3620– Encoded by DHS Office of the Privacy Officer

Scenario 3 Examples(nerd level)

IWBrowser - Browse & Debug

IWBrowser - Provenance

IWBrowser - Explanation

A fragment of PML data1. <iw:NodeSet rdf:about="#ns__g463">2. <iw:hasConclusion> @prefix : <http://dig.csail.mit.edu/TAMI/lkagal/scenario3/rules#> .3. @prefix data: <http://dig.csail.mit.edu/TAMI/cph/v2/data.ttl#> .4. data:arrest-1 a :NotJustifiedArrest; :charge <http://dig.csail.mit.edu/TAMI/law/USC-18-228> . 5. </iw:hasConclusion>6. <iw:hasEnglishString>data:arrest-1 [is-a] :NotJustifiedArrest;[and has] :charge <http://dig.csail.mit.edu/TAMI/law/USC-18-228>

.</iw:hasEnglishString>7. <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/N3.owl#N3"/>8. <iw:isConsequentOf rdf:resource="#is__g463"/>9. </iw:NodeSet> 10. <iw:InferenceStep rdf:about="#is__g463">11. <iw:hasAntecedent rdf:resource="#ns__g419"/>12. <iw:hasAntecedent rdf:resource="#ns__g420"/>13. <iw:hasAntecedent rdf:resource="#ns__g425"/>14. <iw:hasAntecedent rdf:resource="#ns__g479"/>15. <iw:hasAntecedent rdf:resource="#ns__g502"/>16. <iw:hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/CWM.owl#CWM"/>17. <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/GMP.owl#GMP"/>18. <iw:hasVariableMapping rdf:parseType="Resource">19. <rdf:type rdf:resource="http://inferenceweb.stanford.edu/2004/07/iw.owl#Mapping"/>20. <iw:mapFrom>http://dig.csail.mit.edu/TAMI/lkagal/scenario3/rules#A</iw:mapFrom>21. <iw:mapTo>http://dig.csail.mit.edu/TAMI/cph/v2/data.ttl#arrest-1</iw:mapTo>22. </iw:hasVariableMapping>23. <iw:hasVariableMapping rdf:parseType="Resource">24. <rdf:type rdf:resource="http://inferenceweb.stanford.edu/2004/07/iw.owl#Mapping"/>25. <iw:mapFrom>http://dig.csail.mit.edu/TAMI/lkagal/scenario3/rules#P</iw:mapFrom>26. <iw:mapTo>http://dig.csail.mit.edu/TAMI/background#ct-criminal-law-enforcement</iw:mapTo>27. </iw:hasVariableMapping>28. <iw:hasVariableMapping rdf:parseType="Resource">29. <rdf:type rdf:resource="http://inferenceweb.stanford.edu/2004/07/iw.owl#Mapping"/>30. <iw:mapFrom>http://dig.csail.mit.edu/TAMI/lkagal/scenario3/rules#C</iw:mapFrom>31. <iw:mapTo>http://dig.csail.mit.edu/TAMI/law/USC-18-228</iw:mapTo>32. </iw:hasVariableMapping>33. </iw:InferenceStep>

Next Generation Browser

Discussion• Exploring an explainable usage limitation model

supported by semantic technologies• Status:

– Use case written up and (very recently) encoded– Early prototype in place – JUST integrated with CWM with

use case 3 … refinement in process– Exploiting domain independent explanation tools– Exploring options that leverage knowledge of N3 and CWM

supporting more useful tools/APIs/interfaces– Exploring options that leverage knowledge of law supporting

context-sensitive and appropriate presentations

Summary• Leverage points include:

– PML: justification, provenance, and trust interlingua– IW tools for interactive browsing, summarizing, searching, validation,

abstraction, trust representation, propagation, presentation… (domain independent)

– Experience explaining a wide variety of reasoners (JTP, SNARK, …), task processing engines (SPARK), learners (TAILOR), text analytics (UIMA), web services (SDS/BPEL), …

• Continuing work– Identify explanation (representation and presentation) requirements– Refining registration and presentation– Designing special purpose filters and abstractions

• Impact– Interoperable justifications supporting transparency and accountability– Has the potential to change publishing law (with markup) as well as presenting

judgments (with interactive justifications)– Audit trace

More Information• Links

– http://iw.stanford.edu/: Inference Web home – http://iw.stanford.edu/doc/project/tami/: TAMI@Stanford– http://dig.csail.mit.edu/TAMI: TAMI home

• Papers– (Inference Web) Deborah L. McGuinness and Paulo Pinheiro da Silva.

Explaining Answers from the Semantic Web: The Inference Web Approach. Journal of Web Semantics. Vol.1 No.4., pages 397-413, 2004

– (PML) Paulo Pinheiro da Silva, Deborah L. McGuinness and Richard Fikes. A Proof Markup Language for Semantic Web Services. Information Systems. Volume 31, Issues 4-5, pages 381-395, 2006

– (TAMI) Daniel J. Weitzner, Hal Abelson, Tim Berners-Lee, Chris P. Hanson, Jim Hendler, Lalana Kagal, Deborah L. McGuinness, Gerald J. Sussman, K. Krasnow Waterman. Transparent Accountable Inferencing for Privacy Risk Management. Proceedings of AAAI Spring Symposium on The Semantic Web meets eGovernment. AAAI Press, Stanford University, USA 2006. Also available as MIT CSAIL Technical Report-2006-007 and Stanford KSL Technical Report KSL-06-03.

Extras

IWBrowser - Browse & Debug

IWBrowser - Provenance

IWBrowser - Explanation

A fragment of PML data1. <NodeSet rdf:about="#ns__g208">2. <hasConclusion> @prefix : <http://dig.csail.mit.edu/TAMI/cdk/scenario3/data.n3#> .3. @prefix tr: <http://dig.csail.mit.edu/TAMI/cdk/scenario3/rules.n3#> .4. :Arrest05NY2343CRJFC a tr:JustifiedArrest . </hasConclusion>5. <hasEnglishString>:Arrest05NY2343CRJFC [is-a] tr:JustifiedArrest .</hasEnglishString>6. <hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/N3.owl#N3"/>7. <isConsequentOf rdf:resource="#is__g208"/>8. <rea:step rdf:resource="tami-scenario3-proof.n3#_g_L23C26"/>9. </NodeSet>10. <InferenceStep rdf:about="#is__g208">11. <hasAntecedent rdf:resource="#ns__g186"/>12. <hasAntecedent rdf:resource="#ns__g187"/>13. <hasAntecedent rdf:resource="#ns__g188"/>14. <hasAntecedent rdf:resource="#ns__g207"/>15. <hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/CWM.owl#CWM"/>16. <hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/GMP.owl#GMP"/>17. <hasVariableMapping rdf:parseType="Resource">18. <rdf:type rdf:resource="http://inferenceweb.stanford.edu/2004/07/iw.owl#Mapping"/>19. <mapFrom>http://people.csail.mit.edu/lkagal/tami/tami-scenario3-filter.n3#A</mapFrom>20. <mapTo>http://dig.csail.mit.edu/TAMI/cdk/scenario3/data.n3#Arrest05NY2343CRJFC</mapTo>21. </hasVariableMapping>22. <hasVariableMapping rdf:parseType="Resource">23. <rdf:type rdf:resource="http://inferenceweb.stanford.edu/2004/07/iw.owl#Mapping"/>24. <mapFrom>http://people.csail.mit.edu/lkagal/tami/tami-scenario3-filter.n3#B</mapFrom>25. <mapTo>http://dig.csail.mit.edu/TAMI/cdk/scenario3/data.n3#open-source-search-2-result-1</mapTo>26. </hasVariableMapping>27. </InferenceStep>

Next Generation Browser

PML based Explanation:

Adding trust-tab to Wikipedia

The Original Wikipedia Article

Trust Tab

Multiple Trust Tab• citation based

• revision history based

Fragments colored according trust value

Explanation about a fragment (author, trust value)

fragment

PML Tabhttp://inferenceweb.stanford.edu/2006/02/example1-iw-wiki.owl

fragment trust

author trust

<iw:NodeSet rdf:about="http://foto.stanford.edu/mediawiki-1.4.12/index.php/Natural_number"> <In mathematics, a natural number is either a positive integer … </iw:hasConclusion> <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/English.owl#English"/> <iw:isConsequentOf> <iw:InferenceStep> <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/Told.owl#Told"/> <iw:hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/CitationTrust.owl#CitationTrust"/> <iw:hasSourceUsage> <iw:SourceUsage> <iw:hasSource> <iw:Source rdf:about="http://inferenceweb.stanford.edu/wp/registry/PER/Alexandrov.owl#Alexandrov"/> </iw:hasSource> </iw:SourceUsage> </iw:hasSourceUsage> </iw:InferenceStep> </iw:isConsequentOf></iw:NodeSet>

<iw:AggregatedTrustRelation> <iw:hasTrustingParty rdf:resource="http://inferenceweb.stanford.edu/wp/registry/ORG/wikipedia.owl#wikipedia"/> <iw:hasTrustedParty rdf:resource="http://foto.stanford.edu/mediawiki-1.4.12/index.php/Natural_number"/> <iw:hasTrustValue rdf:datatype="http://www.w3.org/2001/XMLSchema#float">0.1766</iw:hasTrustValue></iw:AggregatedTrustRelation>

<iw:AggregatedTrustRelation> <iw:hasTrustingParty rdf:resource="http://inferenceweb.stanford.edu/wp/registry/ORG/wikipedia.owl#wikipedia"/> <iw:hasTrustedParty rdf:resource="http://inferenceweb.stanford.edu/wp/registry/PER/Alexandrov.owl#Alexandrov"/> <iw:hasTrustValue rdf:datatype="http://www.w3.org/2001/XMLSchema#float">0.1766</iw:hasTrustValue></iw:AggregatedTrustRelation>

PML based Explanation:

Explaining Cognitive Assistants – DARPA PAL Program’s CALO system

Explanation ProcessInitial request and answer strategy<user>: Why are you doing <subtask>?<system>: I am trying to do <high-level-task> and <subtask>

is one subgoal in the process.

Follow-up questions for mixed initiative dialogue• <user>: Why are you doing <high-level-task>?• <user>: How did you learn to do <high-level-task> in this

way? • <user>: Why haven’t you completed <subtask> yet?• <user>: Why is <subtask> a subgoal of <high-level-task>?• <user>: When will you finish <subtask>?• <user>: What sources did you use to do <subtask>?

McGuinness, D.L.; Pinheiro da Silva, P.; Glass, A.; Wolverton, M. Explaining Task Processing in Cognitive Assistants. 2006. Technical Report, KSL-06-06, Knowledge Systems Lab., Stanford.

CALO TM Explainer UI

Initial explanation,with links indicatingfollow-up queries

and alternate strategies.

Explanation Infrastructure Supporting Transparency and Accountability Deborah L. McGuinness...

Documents

Deborah McGuinness Co-Director Knowledge Systems, Artificial Intelligence Laboratory

Semantic Web Cyberinfrastructure for Virtual Observatories Deborah L. McGuinness Acting Director and Senior Research Scientist Knowledge Systems, AI Laboratory

An Environment for Merging and Testing Large Ontologies Deborah McGuinness, Richard Fikes, James Rice*, Steve Wilder Associate Director and Senior Research

Ontology Development To Support AQUA Question-Answering Richard Fikes Jessica Jenkins Bill MacCartney Rob McCool Deborah McGuinness Knowledge Systems Laboratory

The Semantic Web Deborah McGuinness Associate Director and Senior Research Scientist Knowledge Systems Laboratory Stanford University Stanford, CA USA

1 ‘Class Exercise’ III: Application Project Evaluation Deborah McGuinness and Peter Fox CSCI/ITEC-6962-01 Week 10, November 16, 2009

Semantically-Enabled Virtual Observatories: VSTO Highlights (for Observational Data) Deborah McGuinness Acting Director and Senior Research Scientist Knowledge

Ontology-enhanced Search for Primary Care Literature Deborah L. McGuinness Associate Director and Senior Research Scientist Knowledge Systems Laboratory

Selected Semantic Web Trends, Progress, and Directions Deborah McGuinness Acting Director and Senior Research Scientist Knowledge Systems, AI Laboratory

Ontologies and the Semantic Web II Deborah L. McGuinness Associate Director and Senior Research Scientist Knowledge Systems Laboratory Stanford University

1 Foundations I: Methodologies, Knowledge Representation Professor Deborah McGuinness TA － Weijing Chen Other lectures from Professor Peter Fox, Professor

1 Using Semantically-Enabled Data Frameworks for Data Integration in Virtual Observatories Peter Fox * * HAO/ESSL/NCAR Deborah McGuinness $#, Luca Cinquini

Investigations into Trust for Collaborative Information Repositories: A Wikipedia Case Study Deborah L. McGuinnessDeborah L. McGuinness, Co-Director Knowledge

Human-Aware Sensor Network Ontology (HASNetO): Semantic Support for Empirical Data Collection Paulo Pinheiro 1, Deborah McGuinness 1, Henrique Santos 1,2

Knowledge Provenance in Semantic Wikis Li Ding, Jie Bao, and Deborah McGuinness Tetherless World Constellation, Rensselaer Polytechnic Institute Troy,

@ Semantic Web Technologies: A Tutorial Li Ding University of Maryland Baltimore County Joint work with Deborah McGuinness, Tim Finin and Anupam Joshi

1 Foundations V: Infrastructure and Architecture, Middleware Deborah McGuinness and Peter Fox CSCI-6962-01 Week 9, October 27, 2008

TWC LOGD: A Portal for Linking Open Government Data Li Ding, Deborah L. McGuinness, Jim Hendler Tetherless World Constellation Rensselaer Polytechnic Institute

McGuinness Geon 5/5/2005 SOLAR-TERRESTRIAL ONTOLOGIES (for VSTO and Beyond) Peter Fox 1, Deborah McGuinness 3, Don Middleton 2, Stan Solomon 1, Jose Garcia

Mash-up of Linked Government Data from Li Ding, Jim Hendler and Deborah L. McGuinness Tetherless World Constellation, Rensselaer Polytechnic