41
Email Processing and Question Answering System (EPQAS) Date: 29 March 2018 Rafael E. Banchs Human Language Technology Unit

Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Email Processing

and Question

Answering System

(EPQAS)

Date: 29 March 2018

Rafael E. Banchs Human Language Technology Unit

Page 2: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

HOW IT STARTED

Page 3: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

• Agency receives a high volume of emails that need to be dealt with on a daily basis demanding significant amount of resources and long response times

• Main objectives: to use state-of-the-art natural language processing technologies to

1. Reduce the volume of incoming emails by supporting advance FAQ online support at agency’s website

2. Automatically redirect the incoming emails to the appropriate officer or group

3. Provide officers with pre-selected responses based on similar past email responses

Problem Statement and Objectives

Page 4: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Agency Website

Automated FAQ system

Query Resolved?

YES

Email classifier

NO

Use

r Se

nd

s Em

ail

Group 1 Group N …

Proposed responses

Proposed responses

… Officer 1 Officer N

USER

Frequently Asked Questions (FAQ) Service Engine

Email classification and Response Recommendation Engine

Proposed Solution

Page 5: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

WHAT WE FACED

Page 6: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Phonetics: system of sounds

Morphology: forms and words

Syntax: clauses and sentences

Semantics: conveying of meaning

Pragmatics: meaning in context

Communication

Means

Pursued Goals

Intentions

Physiological

Capabilities

Abstract Cognitive

Faculties

Levels of Linguistic Phenomena

Page 7: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Phonetics: system of sounds

Morphology: forms and words

Syntax: clauses and sentences

Semantics: conveying of meaning

Pragmatics: meaning in context

Communication

Means

Pursued Goals

Intentions

Physiological

Capabilities

Abstract Cognitive

Faculties

Levels of Linguistic Phenomena

Page 8: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

• The process of transforming a natural language

statement into a semantic representation (frame):

• Subtask 1: Intent Detection

• Subtask 2: Entity Extraction

Natural Language Understanding

Page 9: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

• The process of transforming a natural language

statement into a semantic representation (frame):

• Subtask 1: Intent Detection

• Subtask 2: Entity Extraction

Ok, I will meet you in Starbucks at 6pm

Natural Language Understanding

Page 10: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

• The process of transforming a natural language

statement into a semantic representation (frame):

• Subtask 1: Intent Detection

• Subtask 2: Entity Extraction

Ok, I will meet you in Starbucks at 6pm

Intent Detection: Confirm_meeting

Natural Language Understanding

Page 11: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

• The process of transforming a natural language

statement into a semantic representation (frame):

• Subtask 1: Intent Detection

• Subtask 2: Entity Extraction

Ok, I will meet you in Starbucks at 6pm

Intent Detection: Confirm_meeting

Entity Extraction: Action: Meet

Place: Starbucks

Time: 6:00pm

Natural Language Understanding

Page 12: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

• Transactional natural language applications

Detected Intent: Confirm_meeting

Extracted Entities: Action: Meet

Place: Starbucks

Time: 6:00pm Semantic Frame

Execute Transaction

for instance… Update_calendar

Transactional vs Informational

Page 13: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

• Transactional natural language applications

• Informational natural language applications

Detected Intent: Confirm_meeting

Extracted Entities: Action: Meet

Place: Starbucks

Time: 6:00pm Semantic Frame

Execute Transaction

for instance… Update_calendar

Detected Intent: Ask_for_location

Extracted Entities: Action: Go_to

Place: Starbucks

Location: ??? Semantic Frame

Search for Information

for instance… Find_address_in_map

Transactional vs Informational

Page 14: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

• Q&A is typically an informational application

• There are two different approaches, depending on the type of information available:

• Question search: matching intents and entities over a database of available question answer pairs (FAQs).

• Response selection: matching intents and entities over a collection of statements that might contain the answer.

Question Answering

Page 15: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

NLU in Question Answering

• Intents have to be identified among different language constructions:

Do you take credit cards?

Can I make the payment with visa?

Detected Intent: Ask_about_CC_accepted

Page 16: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

NLU in Question Answering

• Intents have to be identified among different language constructions:

• Entities have to be identified among different references:

Do you take credit cards?

Can I make the payment with visa?

Detected Intent: Ask_about_CC_accepted

Do you take credit cards?

Can I make the payment with visa?

Extracted Entities: Action: Payment

Type: Credit_Card

Accept: ???

Page 17: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Problems of Discrete Representation

Consider the following sequences of words

Do you take credit cards?

Can I make the payment with visa?

When can I make the payment for tourist visa application?

Page 18: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Problems of Discrete Representation

Consider the following sequences of words

Do you take credit cards?

Can I make the payment with visa?

When can I make the payment for tourist visa application?

SEMANTICALLY RELATED

WORD OVELAP = 0%

Page 19: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Problems of Discrete Representation

Consider the following sequences of words

Do you take credit cards?

Can I make the payment with visa?

When can I make the payment for tourist visa application?

SEMANTICALLY RELATED

WORD OVELAP = 0%

NOT SEMANTICALLY

RELATED

WORD OVELAP = 70%

Page 20: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Properties of Continuous Spaces

The Distributional Hypothesis

“a word is characterized for the company it keeps” (Firth 1957)

meaning is mainly determined by the context rather than from individual language units

• Continuous spaces represent semantic similarities by means of

the geometric concept of proximity

• Offer much “better” smoothing capabilities

• Not constrained to the Markovian assumption

Page 21: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Similarity in Continuous Space

Can I make the payment with visa?

Do you take credit cards?

When can I make the payment for tourist visa application?

Page 22: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Similarity in Continuous Space

Can I make the payment with visa?

SEMANTICALLY RELATED

(SHORT DISTANCE) Do you take credit cards?

When can I make the payment for tourist visa application?

Page 23: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Similarity in Continuous Space

Can I make the payment with visa?

SEMANTICALLY RELATED

(SHORT DISTANCE)

NOT SEMANTICALLY RELATED

(LONG DISTANCE)

Do you take credit cards?

When can I make the payment for tourist visa application?

Page 24: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Building Continuous Space Models

1.- Train a deep learning network on a supervised task

W1

W2

W3

Wn

Input S

ente

nce

L e v e l s o f A b s t r a c t i o n

C1

C2

Ck

Superv

ised T

ask

Page 25: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Building Continuous Space Models

1.- Train a deep learning network on a supervised task

2.- Use some of its internal layer representations as continue space models for language

W1

W2

W3

Wn

Input S

ente

nce

L e v e l s o f A b s t r a c t i o n

C1

C2

Ck

Superv

ised T

ask

M O D E L

Page 26: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Semantic Maps of Words

BIRD

GOAT

SKY

LIGHTNING

THUNDER

RAIN FIELD

FLOCK

SHEEP MOUNTAIN

SEA

CLOUD

WIND

FISH

RIVER

STORM

(Banchs 2012)

Page 27: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Semantic Maps of Words

Non-living things Living things

BIRD

GOAT

SKY

LIGHTNING

THUNDER

RAIN FIELD

FLOCK

SHEEP MOUNTAIN

SEA

CLOUD

WIND

FISH

RIVER

STORM

(Banchs 2012)

Page 28: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Semantic Maps of Words

Water

Land

Sky

Non-living things Living things

BIRD

GOAT

SKY

LIGHTNING

THUNDER

RAIN FIELD

FLOCK

SHEEP MOUNTAIN

SEA

CLOUD

WIND

FISH

RIVER

STORM

(Banchs 2012)

Page 29: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Regularities as Vector Offsets

Kings

(Mikolov 2013)

King

Queen

Queens

Page 30: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Regularities as Vector Offsets

Kings

(Mikolov 2013)

King

Queen

Queens gender offset singular/plural offset

Page 31: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Regularities as Vector Offsets

Kings

(Mikolov 2013)

King

Queen

Queens

Kings – King ≈ Queens – Queen

gender offset singular/plural offset

Queens ≈ Kings – King + Queen

Page 32: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Regularities across Languages

(Hermann 2014)

Days of the Week Months of the Year

GERMAN FRENCH ENGLISH

GERMAN FRENCH

Page 33: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

THE SOLUTION

Page 34: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Agency Website

Automated FAQ system

Query Resolved?

YES

Email classifier

NO

Use

r Se

nd

s Em

ail

Group 1 Group N …

Proposed responses

Proposed responses

… Officer 1 Officer N

USER

Frequently Asked Questions (FAQ) Service Engine

Email classification and Response Recommendation Engine

QU

ESTI

ON

SEA

RC

H P

RO

BLE

M

RES

PO

NSE

SEL

ECTI

ON

PR

OB

LEM

Proposed Solution Revisited

Page 35: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Overall FAQ System Architecture

User Query

Query

Type?

Keyword-based

Keyword

based Search

Query

Processing

Natural

Language

Query

Continuous

Space Model 1

Continuous

Space Model 2

In-domain

Bag-of-Words

Re-ranker

Control Engine:

.- High confidence: single response

.- Medium confidence: multiple responses

.- Low confidence: suggest to send email

.- Out-of-domain: institutional response

In/out domain

Classifier

FAQ

Database

Page 36: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Overall Email System Architecture

User Email Email

Classification

Email

Segmentation

Re-ranker

Interactive Interface:

Original user email +

List of pre-selected

responses

Sentence-level

Classification

Historical

Email DB

In-domain

Resources

Query

Processing

Entity

Extraction

Pre-selection of Responses

System Update

Page 37: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Outcome and Output Indicators

Website

FAQs

Email

Enquiry

Officer

Response

USER USER

C u s t o m e r J o u r n e y - V a l u e C h a i n

Page 38: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Outcome and Output Indicators

1. Reduction of incoming email volume (10%-20% less) • User finds more information in the website, and faster • Lower average number of emails per day sent to agency

Website

FAQs

Email

Enquiry

Officer

Response

USER USER 1

C u s t o m e r J o u r n e y - V a l u e C h a i n

Page 39: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Outcome and Output Indicators

1. Reduction of incoming email volume (10%-20% less) • User finds more information in the website, and faster • Lower average number of emails per day sent to agency

2. Reduction in human effort (20%-30% less) • Less human effort to re-route and reply to emails • Larger volume of emails processed per time unit

Website

FAQs

Email

Enquiry

Officer

Response

USER USER 1 2

C u s t o m e r J o u r n e y - V a l u e C h a i n

Page 40: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Outcome and Output Indicators

1. Reduction of incoming email volume (10%-20% less) • User finds more information in the website, and faster • Lower average number of emails per day sent to agency

2. Reduction in human effort (20%-30% less) • Less human effort to re-route and reply to emails • Larger volume of emails processed per time unit

3. Reduction on email response time (20%-30% less) • Faster internal processing of emails • Lower average response time to the user

Website

FAQs

Email

Enquiry

Officer

Response

USER USER 1 2 3 Response Time

C u s t o m e r J o u r n e y - V a l u e C h a i n

Page 41: Email Processing and Question Answering System (EPQAS) · 2018. 4. 3. · Overall Email System Architecture User Email Email Classification Email Segmentation Re-ranker Interactive

Thank you

www.a-star.edu.sg/I2R www.facebook.com/i2r.research

Rafael E. Banchs Human Language Technology Unit