Julien Perez
Machine Learning and Optimization group
27th March, 2019
Machine Reading, Models and Applications
Machine Reading: Motivations
Human knowledge is (mainly) stored in natural language.
Natural language is an efficient medium for transcribing knowledge.
Language is efficient because of its contextuality, which leads to ambiguity.
Languages assume a priori knowledge of the world.
The Library of Trinity College Dublin
Definition
“A machine comprehends a passage of text if, for any question regarding that text, it can be answered correctly by a majority of native speakers.
The machine needs to provide a string which human readers would agree both:
1. Answers that question
2. Does not contain information irrelevant to that question.” (Burges, 2013)
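This “a majority of readers would accept it” criterion is usually operationalized with exact-match and token-level F1 between the predicted and gold answer strings, as in the SQuAD evaluation. A minimal sketch of those two metrics (the normalization steps are the standard ones: lowercasing, punctuation and article removal):

```python
import re
import string
from collections import Counter

def normalize(s):
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in string.punctuation)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(pred, gold):
    return float(normalize(pred) == normalize(gold))

def token_f1(pred, gold):
    p, g = normalize(pred).split(), normalize(gold).split()
    common = Counter(p) & Counter(g)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)

print(exact_match("The Eiffel Tower", "eiffel tower"))          # 1.0
print(token_f1("in the city of Paris", "Paris"))                # 0.4
```

F1 rewards partially overlapping spans, which matters because annotators rarely agree on exact answer boundaries.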
Applications
• Collection of documents as KB
• Social media mining
• Dialog understanding
• Fact checking – fake news detection
Machine Reading as Span Selection
SQuAD
• 500 passages
• 100,000 questions on Wikipedia text
• Human annotated

TriviaQA
• 95k questions
• 650k evidence documents
• Distant supervision
[1] SQuAD: 100,000+ Questions for Machine Comprehension of Text, Rajpurkar et al., 2016
[2] TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension, Joshi et al., 2017
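Span selection frames answering as picking a start and an end token inside the passage. A minimal numpy sketch (names and sizes are illustrative, not any particular system's code): two learned projections score every token as a potential start or end, and the predicted answer is the valid span maximizing the sum of the two logits.

```python
import numpy as np

rng = np.random.default_rng(0)

T, d = 8, 16                      # passage length, hidden size
H = rng.normal(size=(T, d))       # contextual token representations
w_start = rng.normal(size=d)      # start-pointer projection
w_end = rng.normal(size=d)        # end-pointer projection

start_logits = H @ w_start        # one score per token, shape (T,)
end_logits = H @ w_end

# Best span (s, e) with s <= e and a maximum length, scored additively.
best, best_score = None, -np.inf
for s in range(T):
    for e in range(s, min(s + 4, T)):   # cap span length at 4 tokens
        score = start_logits[s] + end_logits[e]
        if score > best_score:
            best, best_score = (s, e), score

print("predicted span:", best)
```

Real systems replace the double loop with a vectorized outer sum over start and end logits, but the decoding constraint (start ≤ end, bounded length) is the same.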
Machine Readers: Architectures
1) Word-level Interaction
2) Contextualization
2') Word-Token Interaction
3) Context-Question Interaction
3') Self-Attention
[3] FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension, Huang et al., 2018
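The pipeline stages above share one core operation: attending from passage tokens to question tokens. A toy numpy sketch of the context-question interaction step (dimensions are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    z = np.exp(x - x.max(axis=axis, keepdims=True))
    return z / z.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(1)
P = rng.normal(size=(6, 8))   # 6 passage token vectors, dim 8
Q = rng.normal(size=(4, 8))   # 4 question token vectors

S = P @ Q.T                   # similarity matrix, shape (6, 4)
A = softmax(S, axis=1)        # passage-to-question attention weights
Q_aware = A @ Q               # each passage token summarizes the question

print(Q_aware.shape)          # (6, 8)
```

Architectures such as FusionNet differ mainly in which representation levels feed the similarity matrix and how the question-aware vectors are fused back, not in this basic attention mechanism.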
Extractive Models: Results
… BERT and ELMo
© 2018 NAVER LABS. All rights reserved.
… but
Error-analysis
What the current models solved
• Lexical variation
• Local context-handling
What the current models do not solve
• Reasoning tasks
• Common-sense requirements
Text Understanding
Machine Translation
Dialog State Tracking
Skill set annotations over machine comprehension tasks, Sugawara et al., 2017
Common-sense & Reasoning
Common-Sense
"Sound practical judgment concerning everyday matters, or a basic ability to perceive, understand, and judge that is shared by ("common to") nearly all people. "
"the system of implications shared by the competent users of a language"
Aristotle (c. 300 BC), the first person known to have discussed "common sense"
ELMo, BERT
Language modeling for commonsense acquisition
Commonsense for Generative Multi-Hop Question Answering Tasks, Bauer et al., 2018
"Great food, one of the best, awesome presentation of food!!!": [
"cake is related to food",
"plate is related to food",
"rice is related to food",
"Something you find in the refrigerator is food",
"bread is related to food",
"soup is related to food",
"butter is a food",
"Something you find in the kitchen is food",
"Something you find on a table is food",
"chicken is a type of food",
"chicken is related to food",
"Something you find in the fridge is food",
"Something you find in the oven is food",
"Something you find at the supermarket is food",
"eat is related to food",
"best is related to good",
"best is related to better",
"dog is related to best",
"better is related to best",
"excellent is related to best",
"best is a type of attempt",
"best is a type of person",
"best is related to good",
"best is related to incomparable",
"best is related to superior",
"best is related to top",
"good is related to best",
"awesome is a synonym of awe-inspiring",
"great is related to awesome",
"anyone can be awesome",
"counterdemonstration is a type of presentation",
"debut is a type of presentation",
"exhibition is a type of presentation",
"exposure is a type of presentation",
"first reading is a type of presentation",
"lecture demonstration is a type of presentation",
"performance is a type of presentation",
"presentation is a type of ceremony",
"presentation is a type of display",
"presentation is a type of informing",
"presentation is a type of position",
"presentation is a type of proposal",
"presentation is a type of show",
"production is a type of presentation"
],
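The facts above read as (concept, relation, concept) triples attached to review tokens. A hypothetical in-memory sketch of that retrieval step (the triples and the lookup function are illustrative, not the real ConceptNet API):

```python
# Hypothetical in-memory knowledge base of (subject, relation, object)
# triples, mimicking the ConceptNet-style facts listed above.
TRIPLES = [
    ("cake", "RelatedTo", "food"),
    ("butter", "IsA", "food"),
    ("chicken", "IsA", "food"),
    ("best", "RelatedTo", "good"),
    ("awesome", "Synonym", "awe-inspiring"),
    ("performance", "IsA", "presentation"),
]

def facts_for(token):
    """Return every triple mentioning the token, as readable strings."""
    return [f"{s} {r} {o}" for (s, r, o) in TRIPLES if token in (s, o)]

review = "great food awesome presentation"
for word in review.split():
    for fact in facts_for(word):
        print(word, "->", fact)
```

In practice the retrieved facts are then ranked or filtered, since (as the list above shows) most retrieved triples are only loosely relevant to the sentence.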
Attention over Common Sense: Aspect Term Extraction

• Knowledge extraction through ConceptNet
• Contextualization, commonsense attention, and history of words
• Categorical cross-entropy with entropic regularization
• Tagging task on SemEval 2016, labels {O, B-TERM, I-TERM}

[Architecture diagram: opinion words and fact sentences each contextualized by a biGRU, combined through dot attention, followed by Transformer self-attention and the tagging head; compared against RuleBased and SimpleDeep baselines.]

Results (FS / SE):
AoCS w/ CS: 0.64 / 0.68
AoCS: 0.69 / 0.736
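The tagging loss above pairs categorical cross-entropy with an entropic regularizer. One common variant, assumed here, is a confidence penalty that subtracts the prediction entropy scaled by a coefficient β, discouraging over-confident tag distributions; a minimal sketch:

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def ce_with_entropy_penalty(logits, gold, beta=0.1):
    """Cross-entropy minus beta * entropy of the prediction.

    Subtracting the entropy lowers the loss for high-entropy (less
    confident) predictions, acting as a confidence penalty.
    """
    p = softmax(logits)
    ce = -np.log(p[gold])
    entropy = -np.sum(p * np.log(p + 1e-12))
    return ce - beta * entropy

logits = np.array([2.0, 0.5, -1.0])   # scores for {O, B-TERM, I-TERM}
print(ce_with_entropy_penalty(logits, gold=1))
```

Whether the entropy term is subtracted (confidence penalty) or added depends on the regularization goal; the slide does not specify which form is used.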
Common-sense & Reasoning
“Reasoning is a process of thinking during which the individual is aware of a problem, identifies, evaluates, and decides upon a solution.”
[3] Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks, FAIR, 2016
[4] Measuring Abstract Reasoning in Neural Networks, DeepMind, 2018
Reasoning
Multi-Document Reasoning, Welbl et al., 2017
[29] Constructing Datasets for Multi-hop Reading Comprehension Across Documents, Welbl et al., 2017
• Most reading comprehension methods limit themselves to queries that can be answered using a single sentence, paragraph, or document.
• Enabling models to combine disjoint pieces of textual evidence would extend the scope of machine comprehension.
• Goal: text understanding across multiple documents, and investigating the limits of existing methods.
• Toward set-theoretic ("ensemblist") operations: union, intersection, selection, …
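The set-theoretic combination of evidence can be pictured as set operations over answer candidates extracted independently per document. A toy sketch with hypothetical documents and candidates:

```python
# Hypothetical candidate answers extracted independently from three documents.
candidates = {
    "doc1": {"Paris", "Lyon", "Marseille"},
    "doc2": {"Paris", "Bordeaux"},
    "doc3": {"Paris", "Lyon"},
}

# Union: anything supported by at least one document.
union = set().union(*candidates.values())

# Intersection: what every document agrees on — the multi-hop answer.
intersection = set.intersection(*candidates.values())

print(sorted(union))          # all candidates
print(sorted(intersection))   # the single consistent answer
```

Real multi-hop models perform this combination softly, in representation space, rather than over discrete candidate sets; the sketch only illustrates the intended semantics.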
Review Reading: ReviewQA, a relational aspect-based opinion reading dataset
Adversarial Learning: Protocol
Adversarial Learning: Results
We analyze the obfuscation probabilities of the different words of a given {d, q, a}, i.e., the rewards of the obfuscation network for each word of a document.
Given a tuple {d, q}, where d is a clear document and q a query, and assuming the document contains k words, we generate k corrupted documents, each with one word obfuscated.
We then feed the obfuscation network with these corrupted data and report the results. A strong intensity means that a high reward is expected.
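The corruption step described above can be sketched directly: from a k-word document, build k copies, each with exactly one word replaced by a mask token (the token name `<OBF>` is an assumption):

```python
def corrupt_documents(document, mask="<OBF>"):
    """For a k-word document, return k copies with one word obfuscated each."""
    words = document.split()
    corrupted = []
    for i in range(len(words)):
        copy = words[:i] + [mask] + words[i + 1:]
        corrupted.append(" ".join(copy))
    return corrupted

docs = corrupt_documents("the cat sat on the mat")
print(len(docs))   # 6, one corrupted copy per word
print(docs[1])     # "the <OBF> sat on the mat"
```

Scoring the reader on each corrupted copy yields a per-word importance signal: the larger the drop caused by masking a word, the higher the reward the obfuscation network expects for it.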
Adversarial Learning: Obfuscator Attention
• HotpotQA questions are designed with multi-hop reasoning in mind.
• The questions are not limited by predefined knowledge bases or schemas.
• We also collect the supporting facts on which answers are based, to improve the explainability of future QA models.
Multi-Document Reasoning: HotpotQA – Yang et al., 2018
Hybrid Extractive Models: HotpotQA Baseline
• Extractive model
• Fully differentiable
• Early fusion model
• 3-way projective head
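A "3-way projective head" can be read as three linear projections over the same fused token representations: start, end, and a third output (e.g. supporting-fact or answer-type scores; which third head the baseline uses is an assumption here). A minimal numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
T, d = 10, 16
H = rng.normal(size=(T, d))          # fused token representations

# Three independent linear projections over the shared representations.
W = {name: rng.normal(size=d) for name in ("start", "end", "support")}
logits = {name: H @ w for name, w in W.items()}

# Decode a span, enforcing end >= start.
start = int(np.argmax(logits["start"]))
end = start + int(np.argmax(logits["end"][start:]))
print("span:", (start, end))
```

Because all three heads share the fused representations, the model stays fully differentiable end to end, matching the bullets above.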
Latent Reformulation Model, Grail, Perez and Gaussier, 2019
… Thanks!