A Study on Multimodal and Interactive Explanations for Visual Question Answering

Kamran Alipour,1 Jurgen P. Schulze,1 Yi Yao,2 Avi Ziskind,2 and Giedrius Burachas 2

1 UC San Diego, 2 SRI International

Introduction

Visual Question Answering

○ Answering a question in natural language regarding a given image

○ Attention-based models weight image features according to the question to produce an answer

What is the animal in this picture?

Zebra.

Introduction

eXplainable AI (XAI)

○ Used for years in medicine, robotics, ...

○ Mainly attention-based

○ Various formats: visual, textual, ...

What is the animal in this picture?

Zebra.

Because:

Contributions

○ Generated multiple explanation modes from attention features and annotations

○ Introduced an interactive explanation mode

○ Prediction task: A novel assessment of the efficacy of explanations

○ A user study to evaluate our XAI system

eXplainable Visual Question Answering (XVQA)

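[Diagram: the image passes through a ResNet to give image features; the question ("What is in the dog's mouth?") passes through an LSTM to give question features; both feed an attention layer that produces the answer ("Frisbee").]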
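The pipeline on this slide (ResNet image features, LSTM question features, an attention layer, an answer) can be illustrated with a minimal sketch. The slides do not specify how attention scores are computed, so the dot-product scoring below, the function names `softmax` and `attend`, and the toy vectors are all assumptions made for illustration, not the authors' model.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, question_vector):
    """Question-guided attention over image regions.

    Assumption: each region is scored by its dot product with the
    question vector. A real VQA model would use learned projections.
    """
    scores = [
        sum(f * q for f, q in zip(region, question_vector))
        for region in region_features
    ]
    weights = softmax(scores)
    dim = len(question_vector)
    # Weighted sum of region features, one value per feature dimension.
    attended = [
        sum(w * region[d] for w, region in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, attended


# Toy stand-ins for ResNet region features and an LSTM question encoding.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
question = [2.0, 0.0]
weights, attended = attend(regions, question)
```

Here the first region aligns best with the question vector, so it receives the largest weight. The weights are what a spatial attention explanation would visualize on the image.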

User Study

- Images and Questions

- Explanations

- VQA correctness and confidence

- Rank explanations

- Predict whether the VQA model answers correctly, based on the explanations.
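The prediction task in the list above can be scored as the fraction of trials in which the user's prediction matches whether the VQA model was actually correct. The sketch below assumes that scoring rule; the `Trial` record, its field names, and the four toy trials are invented for illustration and are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    group: str               # study group code, e.g. "NE" or "SP"
    predicted_correct: bool  # user's guess: will the VQA model be right?
    model_correct: bool      # whether the VQA model was actually right


def prediction_accuracy(trials):
    """Per-group share of trials where the user's prediction was right."""
    counts = {}
    for trial in trials:
        hits, total = counts.get(trial.group, (0, 0))
        hit = trial.predicted_correct == trial.model_correct
        counts[trial.group] = (hits + int(hit), total + 1)
    return {group: hits / total for group, (hits, total) in counts.items()}


# Invented toy trials, not study data.
trials = [
    Trial("NE", True, True),
    Trial("NE", True, False),
    Trial("SP", False, False),
    Trial("SP", True, True),
]
accuracy = prediction_accuracy(trials)
```

Note that a user who correctly predicts a model failure counts as a hit, which is why this measure can separate helpful explanations from merely persuasive ones.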

User Study

Group

NE Control group

SP Spatial attention

SA Active (steerable) attention

SE Semantic

OA Object attention

AL All explanations

Subjects: 90

Trials: 10513

Explanation modes

Spatial attention

Shows the parts of the image the model is focused on while preparing the answer.

QUESTION: Where is this place?

ANSWER: Airport.

Explanation modes

Active (steerable) attention

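[Diagram: the image passes through a ResNet to give image features; the question ("What covers the ground?") passes through an LSTM to give question features. The attention map drawn on the image by the user replaces the attention layer's output: it is multiplied with the image features, and the result passes through an FC layer to produce the answer.]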
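The steering step on this slide, where a user-drawn attention map is multiplied with the image features, can be sketched as follows. The slides do not say whether the map is normalized or how the result is pooled, so the sum-to-one normalization, the sum pooling, and the names `apply_user_attention` and `pool` are assumptions for illustration only.

```python
def apply_user_attention(features, user_map):
    """Weight each grid cell's feature vector by the user's attention map.

    features: H x W grid of feature vectors (lists of floats).
    user_map: H x W grid of non-negative weights drawn by the user.
    Assumption: the map is normalized to sum to 1 before multiplying.
    """
    total = sum(sum(row) for row in user_map)
    if total == 0:
        raise ValueError("user attention map must have a non-zero cell")
    return [
        [[(w / total) * v for v in cell] for cell, w in zip(f_row, m_row)]
        for f_row, m_row in zip(features, user_map)
    ]


def pool(weighted):
    """Sum the weighted feature vectors over all grid cells."""
    cells = [cell for row in weighted for cell in row]
    return [sum(cell[d] for cell in cells) for d in range(len(cells[0]))]


# Toy 2 x 2 grid with 2-dimensional features; the user marks two cells.
features = [[[1.0, 2.0], [3.0, 4.0]],
            [[5.0, 6.0], [7.0, 8.0]]]
user_map = [[1, 0],
            [0, 1]]
pooled = pool(apply_user_attention(features, user_map))
```

In this toy case the pooled vector is the average of the two marked cells. In the slide's diagram, a vector of this kind would then go to the FC layer that produces the answer.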

Explanation modes

Semantic:

Scene graph

Bounding box

Explanation modes

Semantic: Textual explanation [Ghosh et al. 2019]

QUESTION: What is this sport?

ANSWER: Soccer.

Explanation modes

Object attention [Ray et al. 2019]

QUESTION: What food is he eating?

ANSWER: Sandwich.

Results

Prediction accuracies in different study groups

Results

Prediction accuracy vs. explanation ratings when the system is accurate

Results

Prediction accuracy vs. explanation ratings when the system is inaccurate

Results

Prediction accuracy progress

Results

User confidence progress: Active attention vs. other explanations

Discussion

○ Interactive explanations improved users' confidence compared to other explanations

○ Explanations are more helpful when the AI is inaccurate

○ Exposure to multiple explanation modes can be conflicting or overwhelming and can reduce prediction accuracy

Conclusion and Future Work

○ Interactive experiment to probe explanation effectiveness

○ Explanations help accuracy prediction

○ Users' confidence improved when exposed to explanations

Thank You!

User Study Statistics

Group                               Subjects  Trials
NE  Control group                   15        4124
SP  Spatial attention               15        1826
SA  Active (steerable) attention    15        1021
SE  Semantic                        15        1261
OA  Object attention                15        1435
AL  All explanations                15        846
Total                               90        10513

User Study Structure

[Diagram: Control group: a practice block followed by control blocks of No-Exp trials. Explanation groups: a practice block followed by explanation blocks (5 x Exp trials) and control blocks (5 x No-Exp trials).]

