
Explanation Infrastructure Supporting Transparency and Accountability

Deborah L. McGuinness, Co-Director and Senior Research Scientist

Knowledge Systems, AI Laboratory, Stanford University

Joint work with Li Ding, Cynthia Chang, Vasco Furtado, Paulo Pinheiro da Silva

Inference Web is joint work with Richard Fikes, Alyssa Glass, Honglei Zeng, Mayukh Bhaowal, Bill Millar, Dhyanesh Narayan, Priyendra Deshwal, …

TAMI/Portia Privacy and Accountability Workshop, 28-29 June 2006, Cambridge, MA USA

Deborah McGuinness, June 2006

General Motivation

• Interoperability – as more systems use varied sources and multiple information manipulation engines, they benefit more from encodings that are shareable and interoperable

• Provenance – if users (humans and agents) are to use and integrate data from unknown, unreliable, or multiple sources, they need provenance metadata for evaluation

• Explanation/Justification – if information has been manipulated (e.g., by sound deduction or by heuristic processes), a trace of the information manipulation should be available

• Trust – if some sources are more trustworthy than others, representations should be available to encode, propagate, combine, and (appropriately) display trust values

Provide interoperable knowledge provenance infrastructure that supports explanations of sources, assumptions, learned information, and answers as an enabler for trust.


TAMI Background

• Overall TAMI Ambition: Explore privacy implications of semantic web technologies and determine viability.

• Goal: Assess applicability of a usage limitation model (as opposed to or in addition to a data collection model) to data mining/profiling applications.

• Explore technical challenges of provable accountability with explicit justifications in large scale, heterogeneous information systems (i.e., the Web).

• Develop public policy models to encourage transparent, accountable data mining


Privacy and Explanation

As more online data becomes available and as inference and interoperability increase, privacy protection will have to rely more on usage limitation rules and less on collection limitation rules.

Usage limits depend upon:

• Transparency: the knowledge provenance history of sources, data manipulations, and inferences is maintained in an interoperable form. Explanation technology is used to allow examination of knowledge provenance by authorized parties (who may be the general public).

• Accountability: the ability to check whether the policies that govern data manipulations and inferences were in fact adhered to. Explanation technology is used to examine inference usage.


Inference Web Infrastructure

[Inference Web] Framework for explaining question answering tasks by abstracting, storing, exchanging, combining, annotating, filtering, segmenting, comparing, and rendering proofs and proof fragments provided by question answerers.

[Diagram labels]
• Proof Markup Language (PML): Justification, Provenance, Trust
• Question answerers and their languages: CWM (TAMI), N3; JTP (DAML/NIMD), KIF; SPARK (CALO), SPARK-L; UIMA (NIMD/Exp Agg), Text Analytics; SDS (DAML/SNRC), OWL-S/BPEL
• IW tools: IW Explainer/Abstractor, IWBase, IWBrowser, IWSearch, IWTrust
• Functions: provenance registration, search engine based publishing, expert friendly visualization, end-user friendly visualization, trust computation
• WWW Toolkit

PML Ontology

How PML Works

[Diagram: a PML justification trace and the IWBase registry entries it points to]
• Question foo:question1 (how is arrest1 classified?)
• Query foo:query1 (type arrest1 ?x): isQueryFor, hasAnswer
• NodeSet foo:ns1 (hasConclusion …): hasLanguage, isConsequentOf
• InferenceStep: hasRule, hasInferenceEngine, hasAntecedent, hasVariableMapping, hasSourceUsage, fromQuery, fromAnswer
• NodeSet foo:ns2 (hasConclusion …): isConsequentOf
• Mapping
• SourceUsage: hasSource, usageTime …
• IWBase entries: InferenceEngine, InferenceRule, Language, Source
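The relationships among the PML terms on this slide can be sketched as plain data structures. The following is a minimal Python sketch, not the PML ontology itself: the class and field names mirror the slide's terms (NodeSet, InferenceStep, hasAntecedent, hasRule, and so on), while the `trace` helper and the sample conclusions are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InferenceStep:
    """One way a conclusion was derived (mirrors PML InferenceStep)."""
    rule: str                                                        # hasRule
    engine: str                                                      # hasInferenceEngine
    antecedents: List["NodeSet"] = field(default_factory=list)       # hasAntecedent
    variable_mapping: Dict[str, str] = field(default_factory=dict)   # hasVariableMapping
    sources: List[str] = field(default_factory=list)                 # hasSourceUsage / hasSource


@dataclass
class NodeSet:
    """A conclusion plus the inference steps that justify it (mirrors PML NodeSet)."""
    uri: str
    conclusion: str                                                  # hasConclusion
    language: str = "N3"                                             # hasLanguage
    is_consequent_of: List[InferenceStep] = field(default_factory=list)


def trace(node: NodeSet, depth: int = 0) -> List[str]:
    """Walk a justification trace depth-first (hypothetical helper)."""
    lines = ["  " * depth + f"{node.uri}: {node.conclusion}"]
    for step in node.is_consequent_of:
        lines.append("  " * (depth + 1) + f"by {step.rule} ({step.engine})")
        for antecedent in step.antecedents:
            lines.extend(trace(antecedent, depth + 2))
    return lines


# Illustrative data only: the conclusions and the source name are invented.
ns2 = NodeSet(
    "foo:ns2",
    "arrest1 has a supporting warrant",
    is_consequent_of=[InferenceStep("Told", "CWM", sources=["example-source"])],
)
ns1 = NodeSet(
    "foo:ns1",
    "arrest1 a JustifiedArrest",
    is_consequent_of=[InferenceStep("GMP", "CWM", antecedents=[ns2])],
)

print("\n".join(trace(ns1)))
```

Each NodeSet may carry several inference steps, which is how alternative justifications for the same conclusion can sit side by side in one trace.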


TAMI Architecture

[Diagram components]
• Data Sources
• Assertions with provenance
• Usage Rules (Laws/Policy)
• Action
• Inference Engine(s) & Truth Maintenance System
• Proof Generation Services (Inference Web)
• Justification Generator (why action does/doesn’t comply with policy)

Privacy Act background

Privacy Act

Sharing Restrictions

• (b) Conditions of Disclosure:

– No agency shall disclose any record which is contained in a system of records

– by any means of communication to any person or agency …

– except . . . with the prior written consent of, the individual to whom the record pertains,

– unless disclosure of the record would be— ….

• (3) for a routine use [5 USC § 552a(b)(3)]


Routine Use

• as defined in subsection (a)(7)
  – “the use of such record for a purpose which is compatible with the purpose for which it was collected;” [5 USC § 552a(a)(7)]
• described under subsection (e)(4)(D)
  – “publish in the Federal Register upon establishment or revision a notice of the existence and character of the system of records, which notice shall include” [5 USC § 552a(e)(4)]
  – “each routine use of the records contained in the system, including the categories of users and the purpose of such use” [5 USC § 552a(e)(4)(D)]


a ROUTINE USE

a SHARING_PERMISSION
    Recipient : [1,∞] Organization
    HasDataCategory : DataCategory
    HasPurpose : AuthorizedPurpose

General Categories

Structured Components

One Simple Description


Sample Code

RU1 a RoutineUse
    ; recipient FBI
    ; category DataCategory3
    ; purpose CounterTerrorism
    ; dc:description "May share where TSA becomes aware of information that may be related to an individual identified in the TSDB." .

DataCategory3 a DataCategory
    ; dc:description "Information about people who are known or reasonably suspected to be or have been engaged in conduct constituting, in preparation for, in aid of, or related to terrorism." .

Can share with FBI

Data about possible terrorists

So they can investigate whether a criminal law has been violated
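The sample routine use above can be read as a recipient/category/purpose triple against which a proposed disclosure is checked. The sketch below illustrates that reading in Python; it is not the TAMI rule set or the CWM reasoning used in the project. The `Sharing` class and `matching_routine_use` function are hypothetical names, and exact string matching is a simplifying assumption (a real policy check would likely need subsumption between purposes and data categories).

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class RoutineUse:
    """A published routine use: who may receive which data, and why."""
    name: str
    recipient: str
    category: str
    purpose: str


@dataclass(frozen=True)
class Sharing:
    """A proposed or observed disclosure (hypothetical structure)."""
    recipient: str
    category: str
    purpose: str


def matching_routine_use(sharing: Sharing,
                         routine_uses: Iterable[RoutineUse]) -> Optional[RoutineUse]:
    """Return the first routine use that covers the sharing, or None.

    Assumption: a routine use covers a sharing only on an exact match
    of recipient, data category, and purpose.
    """
    for ru in routine_uses:
        if (ru.recipient, ru.category, ru.purpose) == (
            sharing.recipient, sharing.category, sharing.purpose
        ):
            return ru
    return None


# Values taken from the sample code on the slide.
RU1 = RoutineUse("RU1", "FBI", "DataCategory3", "CounterTerrorism")

to_fbi = Sharing("FBI", "DataCategory3", "CounterTerrorism")
to_irs = Sharing("IRS", "DataCategory3", "CounterTerrorism")

print(matching_routine_use(to_fbi, [RU1]))  # the matching RoutineUse
print(matching_routine_use(to_irs, [RU1]))  # None: no routine use covers it
```

Returning the matching routine use, rather than a bare boolean, keeps the evidence needed to justify the decision later.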


Explanation

• Why was data sharing xyz acceptable?
  – Using RoutineUse459
    • We shared with the FBI
    • Data belonging to DataCategory3
    • For the purpose of CounterterrorismCriminalLawEnforcement
    • Provenance: RoutineUse459 is derived from
      – SORN (system of records notice) for “Secure Flight”
      – Published at 70 FR 36319
      – Encoded by DHS Office of the Privacy Officer


Explanation

• Why was data sharing xyz unacceptable?
  – We checked all of the RoutineUses
    • We shared with the IRS
    • Data belonging to DataCategory3
      – IRS is not authorized to receive DataCategory3
    • For the purpose of CounterterrorismCriminalLawEnforcement
    • Provenance: all RoutineUses are derived from
      – SORN for “Secure Flight”
      – Published at 70 FR 3620
      – Encoded by DHS Office of the Privacy Officer
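The two explanation slides share one shape: a decision, the facts of the sharing, and the provenance of the rules consulted. The sketch below shows one way such text could be assembled in Python. It is an illustration under stated assumptions, not the Inference Web explainer: the `explain` function and the dictionary layout are hypothetical, the single routine use and its provenance string are copied from the "acceptable" slide, and matching is by exact string equality.

```python
from typing import Dict, List

# One routine use, with the provenance stated on the "acceptable" slide.
ROUTINE_USES: List[Dict[str, str]] = [
    {
        "id": "RoutineUse459",
        "recipient": "FBI",
        "category": "DataCategory3",
        "purpose": "CounterterrorismCriminalLawEnforcement",
        "provenance": 'SORN for "Secure Flight", published at 70 FR 36319, '
                      "encoded by DHS Office of the Privacy Officer",
    },
]


def explain(recipient: str, category: str, purpose: str,
            routine_uses: List[Dict[str, str]]) -> List[str]:
    """Build explanation lines for a sharing decision (hypothetical helper)."""
    facts = [
        f"We shared with the {recipient}",
        f"Data belonging to {category}",
        f"For the purpose of {purpose}",
    ]
    for ru in routine_uses:
        if (ru["recipient"], ru["category"], ru["purpose"]) == (recipient, category, purpose):
            return (
                [f"Acceptable: using {ru['id']}"]
                + facts
                + [f"Provenance: {ru['id']} is derived from {ru['provenance']}"]
            )
    checked = ", ".join(ru["id"] for ru in routine_uses) or "none"
    return (
        [f"Unacceptable: checked all routine uses ({checked})"]
        + facts
        + [f"{recipient} is not authorized to receive {category} for this purpose"]
    )


purpose = "CounterterrorismCriminalLawEnforcement"
print("\n".join(explain("FBI", "DataCategory3", purpose, ROUTINE_USES)))
print()
print("\n".join(explain("IRS", "DataCategory3", purpose, ROUTINE_USES)))
```

In the negative case the explanation lists every routine use that was checked, so a reader can audit that nothing applicable was overlooked.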

Scenario 3 Examples (nerd level)

IWBrowser - Browse & Debug

IWBrowser - Provenance

IWBrowser - Explanation

A fragment of PML data

<iw:NodeSet rdf:about="#ns__g463">
  <iw:hasConclusion>
    @prefix : &#60;http://dig.csail.mit.edu/TAMI/lkagal/scenario3/rules#&#62; .
    @prefix data: &#60;http://dig.csail.mit.edu/TAMI/cph/v2/data.ttl#&#62; .
    data:arrest-1 a :NotJustifiedArrest; :charge &#60;http://dig.csail.mit.edu/TAMI/law/USC-18-228&#62; .
  </iw:hasConclusion>
  <iw:hasEnglishString>data:arrest-1 [is-a] :NotJustifiedArrest;[and has] :charge &#60;http://dig.csail.mit.edu/TAMI/law/USC-18-228&#62; .</iw:hasEnglishString>
  <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/N3.owl#N3"/>
  <iw:isConsequentOf rdf:resource="#is__g463"/>
</iw:NodeSet>
<iw:InferenceStep rdf:about="#is__g463">
  <iw:hasAntecedent rdf:resource="#ns__g419"/>
  <iw:hasAntecedent rdf:resource="#ns__g420"/>
  <iw:hasAntecedent rdf:resource="#ns__g425"/>
  <iw:hasAntecedent rdf:resource="#ns__g479"/>
  <iw:hasAntecedent rdf:resource="#ns__g502"/>
  <iw:hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/CWM.owl#CWM"/>
  <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/GMP.owl#GMP"/>
  <iw:hasVariableMapping rdf:parseType="Resource">
    <rdf:type rdf:resource="http://inferenceweb.stanford.edu/2004/07/iw.owl#Mapping"/>
    <iw:mapFrom>http://dig.csail.mit.edu/TAMI/lkagal/scenario3/rules#A</iw:mapFrom>
    <iw:mapTo>http://dig.csail.mit.edu/TAMI/cph/v2/data.ttl#arrest-1</iw:mapTo>
  </iw:hasVariableMapping>
  <iw:hasVariableMapping rdf:parseType="Resource">
    <rdf:type rdf:resource="http://inferenceweb.stanford.edu/2004/07/iw.owl#Mapping"/>
    <iw:mapFrom>http://dig.csail.mit.edu/TAMI/lkagal/scenario3/rules#P</iw:mapFrom>
    <iw:mapTo>http://dig.csail.mit.edu/TAMI/background#ct-criminal-law-enforcement</iw:mapTo>
  </iw:hasVariableMapping>
  <iw:hasVariableMapping rdf:parseType="Resource">
    <rdf:type rdf:resource="http://inferenceweb.stanford.edu/2004/07/iw.owl#Mapping"/>
    <iw:mapFrom>http://dig.csail.mit.edu/TAMI/lkagal/scenario3/rules#C</iw:mapFrom>
    <iw:mapTo>http://dig.csail.mit.edu/TAMI/law/USC-18-228</iw:mapTo>
  </iw:hasVariableMapping>
</iw:InferenceStep>
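To show how a tool might read a fragment like this one, the sketch below pulls the antecedents, engine, rule, and variable mappings out of an inference step using Python's standard `xml.etree.ElementTree`. Several things here are assumptions rather than part of the original data: the fragment is shortened, it is wrapped in an `rdf:RDF` root so that it parses on its own, the `iw` prefix is bound to the namespace that appears in the fragment's `rdf:type` URI, and the `summarize_steps` helper is hypothetical. Treating RDF/XML as plain XML is itself a simplification; an RDF parser would be more robust.

```python
import xml.etree.ElementTree as ET
from typing import Dict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
IW = "http://inferenceweb.stanford.edu/2004/07/iw.owl#"  # assumed binding for iw:

# Shortened copy of the fragment, wrapped in rdf:RDF so it is well-formed.
PML = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:iw="{IW}">
  <iw:NodeSet rdf:about="#ns__g463">
    <iw:isConsequentOf rdf:resource="#is__g463"/>
  </iw:NodeSet>
  <iw:InferenceStep rdf:about="#is__g463">
    <iw:hasAntecedent rdf:resource="#ns__g419"/>
    <iw:hasAntecedent rdf:resource="#ns__g420"/>
    <iw:hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/CWM.owl#CWM"/>
    <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/GMP.owl#GMP"/>
    <iw:hasVariableMapping rdf:parseType="Resource">
      <iw:mapFrom>http://dig.csail.mit.edu/TAMI/lkagal/scenario3/rules#A</iw:mapFrom>
      <iw:mapTo>http://dig.csail.mit.edu/TAMI/cph/v2/data.ttl#arrest-1</iw:mapTo>
    </iw:hasVariableMapping>
  </iw:InferenceStep>
</rdf:RDF>"""


def q(namespace: str, local: str) -> str:
    """Build the '{namespace}local' name form that ElementTree uses."""
    return f"{{{namespace}}}{local}"


def summarize_steps(xml_text: str) -> Dict[str, dict]:
    """Collect antecedents, engine, rule, and mappings per inference step."""
    root = ET.fromstring(xml_text)
    steps: Dict[str, dict] = {}
    for step in root.iter(q(IW, "InferenceStep")):
        steps[step.get(q(RDF, "about"))] = {
            "antecedents": [a.get(q(RDF, "resource"))
                            for a in step.findall(q(IW, "hasAntecedent"))],
            "engine": step.find(q(IW, "hasInferenceEngine")).get(q(RDF, "resource")),
            "rule": step.find(q(IW, "hasRule")).get(q(RDF, "resource")),
            "mappings": {m.findtext(q(IW, "mapFrom")): m.findtext(q(IW, "mapTo"))
                         for m in step.findall(q(IW, "hasVariableMapping"))},
        }
    return steps


for step_id, info in summarize_steps(PML).items():
    print(step_id, info["rule"].rsplit("#", 1)[-1], info["antecedents"])
```

The variable mappings are what connect a generic rule variable to the concrete arrest record in the data.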

Next Generation Browser

Discussion

• Exploring an explainable usage limitation model supported by semantic technologies
• Status:
  – Use case written up and (very recently) encoded
  – Early prototype in place; just integrated with CWM with use case 3 … refinement in process
  – Exploiting domain independent explanation tools
  – Exploring options that leverage knowledge of N3 and CWM supporting more useful tools/APIs/interfaces
  – Exploring options that leverage knowledge of law supporting context-sensitive and appropriate presentations


Summary

• Leverage points include:
  – PML: justification, provenance, and trust interlingua
  – IW tools for interactive browsing, summarizing, searching, validation, abstraction, trust representation, propagation, presentation… (domain independent)
  – Experience explaining a wide variety of reasoners (JTP, SNARK, …), task processing engines (SPARK), learners (TAILOR), text analytics (UIMA), web services (SDS/BPEL), …
• Continuing work
  – Identify explanation (representation and presentation) requirements
  – Refining registration and presentation
  – Designing special purpose filters and abstractions
• Impact
  – Interoperable justifications supporting transparency and accountability
  – Has the potential to change publishing law (with markup) as well as presenting judgments (with interactive justifications)
  – Audit trace


More Information

• Links
  – http://iw.stanford.edu/: Inference Web home
  – http://iw.stanford.edu/doc/project/tami/: TAMI@Stanford
  – http://dig.csail.mit.edu/TAMI: TAMI home
• Papers
  – (Inference Web) Deborah L. McGuinness and Paulo Pinheiro da Silva. Explaining Answers from the Semantic Web: The Inference Web Approach. Journal of Web Semantics, Vol. 1, No. 4, pages 397-413, 2004.
  – (PML) Paulo Pinheiro da Silva, Deborah L. McGuinness and Richard Fikes. A Proof Markup Language for Semantic Web Services. Information Systems, Volume 31, Issues 4-5, pages 381-395, 2006.
  – (TAMI) Daniel J. Weitzner, Hal Abelson, Tim Berners-Lee, Chris P. Hanson, Jim Hendler, Lalana Kagal, Deborah L. McGuinness, Gerald J. Sussman, K. Krasnow Waterman. Transparent Accountable Inferencing for Privacy Risk Management. Proceedings of AAAI Spring Symposium on The Semantic Web meets eGovernment. AAAI Press, Stanford University, USA, 2006. Also available as MIT CSAIL Technical Report-2006-007 and Stanford KSL Technical Report KSL-06-03.


Extras

IWBrowser - Browse & Debug

IWBrowser - Provenance

IWBrowser - Explanation

A fragment of PML data

<NodeSet rdf:about="#ns__g208">
  <hasConclusion>
    @prefix : &#60;http://dig.csail.mit.edu/TAMI/cdk/scenario3/data.n3#&#62; .
    @prefix tr: &#60;http://dig.csail.mit.edu/TAMI/cdk/scenario3/rules.n3#&#62; .
    :Arrest05NY2343CRJFC a tr:JustifiedArrest .
  </hasConclusion>
  <hasEnglishString>:Arrest05NY2343CRJFC [is-a] tr:JustifiedArrest .</hasEnglishString>
  <hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/N3.owl#N3"/>
  <isConsequentOf rdf:resource="#is__g208"/>
  <rea:step rdf:resource="tami-scenario3-proof.n3#_g_L23C26"/>
</NodeSet>
<InferenceStep rdf:about="#is__g208">
  <hasAntecedent rdf:resource="#ns__g186"/>
  <hasAntecedent rdf:resource="#ns__g187"/>
  <hasAntecedent rdf:resource="#ns__g188"/>
  <hasAntecedent rdf:resource="#ns__g207"/>
  <hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/CWM.owl#CWM"/>
  <hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/GMP.owl#GMP"/>
  <hasVariableMapping rdf:parseType="Resource">
    <rdf:type rdf:resource="http://inferenceweb.stanford.edu/2004/07/iw.owl#Mapping"/>
    <mapFrom>http://people.csail.mit.edu/lkagal/tami/tami-scenario3-filter.n3#A</mapFrom>
    <mapTo>http://dig.csail.mit.edu/TAMI/cdk/scenario3/data.n3#Arrest05NY2343CRJFC</mapTo>
  </hasVariableMapping>
  <hasVariableMapping rdf:parseType="Resource">
    <rdf:type rdf:resource="http://inferenceweb.stanford.edu/2004/07/iw.owl#Mapping"/>
    <mapFrom>http://people.csail.mit.edu/lkagal/tami/tami-scenario3-filter.n3#B</mapFrom>
    <mapTo>http://dig.csail.mit.edu/TAMI/cdk/scenario3/data.n3#open-source-search-2-result-1</mapTo>
  </hasVariableMapping>
</InferenceStep>


Next Generation Browser

PML based Explanation: Adding trust-tab to Wikipedia

The Original Wikipedia Article

Trust Tab

Multiple Trust Tabs
• citation based
• revision history based

Fragments colored according to trust value

Explanation about a fragment (author, trust value)


PML Tab
http://inferenceweb.stanford.edu/2006/02/example1-iw-wiki.owl

fragment

<iw:NodeSet rdf:about="http://foto.stanford.edu/mediawiki-1.4.12/index.php/Natural_number">
  <iw:hasConclusion>In mathematics, a natural number is either a positive integer … </iw:hasConclusion>
  <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/English.owl#English"/>
  <iw:isConsequentOf>
    <iw:InferenceStep>
      <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/Told.owl#Told"/>
      <iw:hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/CitationTrust.owl#CitationTrust"/>
      <iw:hasSourceUsage>
        <iw:SourceUsage>
          <iw:hasSource>
            <iw:Source rdf:about="http://inferenceweb.stanford.edu/wp/registry/PER/Alexandrov.owl#Alexandrov"/>
          </iw:hasSource>
        </iw:SourceUsage>
      </iw:hasSourceUsage>
    </iw:InferenceStep>
  </iw:isConsequentOf>
</iw:NodeSet>

fragment trust

<iw:AggregatedTrustRelation>
  <iw:hasTrustingParty rdf:resource="http://inferenceweb.stanford.edu/wp/registry/ORG/wikipedia.owl#wikipedia"/>
  <iw:hasTrustedParty rdf:resource="http://foto.stanford.edu/mediawiki-1.4.12/index.php/Natural_number"/>
  <iw:hasTrustValue rdf:datatype="http://www.w3.org/2001/XMLSchema#float">0.1766</iw:hasTrustValue>
</iw:AggregatedTrustRelation>

author trust

<iw:AggregatedTrustRelation>
  <iw:hasTrustingParty rdf:resource="http://inferenceweb.stanford.edu/wp/registry/ORG/wikipedia.owl#wikipedia"/>
  <iw:hasTrustedParty rdf:resource="http://inferenceweb.stanford.edu/wp/registry/PER/Alexandrov.owl#Alexandrov"/>
  <iw:hasTrustValue rdf:datatype="http://www.w3.org/2001/XMLSchema#float">0.1766</iw:hasTrustValue>
</iw:AggregatedTrustRelation>
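The trust tab colors each article fragment according to its trust value and shows the author and value on request. A minimal Python sketch of that presentation step follows. The slides do not say how values map to colors, so everything about the mapping is an assumption: trust values are taken to lie in [0, 1], the three color bands and their thresholds are invented, and `trust_color` and `render_fragment` are hypothetical names.

```python
from html import escape


def trust_color(value: float) -> str:
    """Map a trust value to a display color.

    Assumptions: values lie in [0, 1]; the three equal-width bands
    are illustrative and not taken from the trust tab itself.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"trust value out of range: {value}")
    if value < 1 / 3:
        return "red"
    if value < 2 / 3:
        return "yellow"
    return "green"


def render_fragment(text: str, author: str, trust: float) -> str:
    """Wrap a fragment in a colored span whose tooltip explains it."""
    tooltip = escape(f"author: {author}, trust: {trust:.4f}", quote=True)
    return (
        f'<span style="background:{trust_color(trust)}" '
        f'title="{tooltip}">{escape(text)}</span>'
    )


# Fragment text, author, and trust value as they appear in the PML above.
markup = render_fragment(
    "In mathematics, a natural number is either a positive integer …",
    "Alexandrov",
    0.1766,
)
print(markup)
```

Carrying the author and value in the tooltip keeps the explanation next to the colored fragment, in the spirit of the "explanation about a fragment" view.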

PML based Explanation: Explaining Cognitive Assistants – DARPA PAL Program’s CALO system


Explanation Process

Initial request and answer strategy
<user>: Why are you doing <subtask>?
<system>: I am trying to do <high-level-task> and <subtask> is one subgoal in the process.

Follow-up questions for mixed initiative dialogue
• <user>: Why are you doing <high-level-task>?
• <user>: How did you learn to do <high-level-task> in this way?
• <user>: Why haven’t you completed <subtask> yet?
• <user>: Why is <subtask> a subgoal of <high-level-task>?
• <user>: When will you finish <subtask>?
• <user>: What sources did you use to do <subtask>?

McGuinness, D.L.; Pinheiro da Silva, P.; Glass, A.; Wolverton, M. Explaining Task Processing in Cognitive Assistants. 2006. Technical Report, KSL-06-06, Knowledge Systems Lab., Stanford.
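The initial answer strategy and the follow-up questions above are templates over a subtask and its high-level task. The sketch below fills those templates in Python. The template wording is taken from the slide; the function names and the example task names are hypothetical, and the sketch says nothing about how CALO selects or answers follow-ups.

```python
from typing import List

# Follow-up question templates, worded as on the slide.
FOLLOW_UPS: List[str] = [
    "Why are you doing {high_level_task}?",
    "How did you learn to do {high_level_task} in this way?",
    "Why haven't you completed {subtask} yet?",
    "Why is {subtask} a subgoal of {high_level_task}?",
    "When will you finish {subtask}?",
    "What sources did you use to do {subtask}?",
]


def initial_answer(subtask: str, high_level_task: str) -> str:
    """Answer 'Why are you doing <subtask>?' with the initial strategy."""
    return (
        f"I am trying to do {high_level_task} and {subtask} "
        "is one subgoal in the process."
    )


def follow_up_questions(subtask: str, high_level_task: str) -> List[str]:
    """Instantiate the follow-up questions offered to the user."""
    return [
        template.format(subtask=subtask, high_level_task=high_level_task)
        for template in FOLLOW_UPS
    ]


# Hypothetical task names, for illustration only.
print(initial_answer("get manager approval", "purchase a laptop"))
for question in follow_up_questions("get manager approval", "purchase a laptop"):
    print("-", question)
```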


CALO TM Explainer UI

Initial explanation, with links indicating follow-up queries and alternate strategies.
