
DYNAMIC TIME-AWARE ATTENTION TO SPEAKER ROLES AND CONTEXTS FOR SPOKEN LANGUAGE UNDERSTANDING

Po-Chun Chen*, Ta-Chung Chi*, Shang-Yu Su*, Yun-Nung (Vivian) Chen
*The first three authors contributed equally to this work.

https://github.com/MiuLab/Time-SLU

Summary
o Approach: an end-to-end attentional role-based contextual model that leverages various content-aware and time-aware attention mechanisms
o Experiment: significant improvement on a benchmark multi-domain dialogue dataset
o Result: temporal information is important when modeling history utterances for language understanding


The Proposed Approach: Time-Aware Attention to Roles & Contexts

Code Available: https://github.com/MiuLab/Time-SLU

➢Task Definition
o Language understanding for human-human dialogues between tourists and guides (DSTC4)

➢Motivation
o Human-human dialogues contain multiple reasoning steps
o Additional attention mechanisms, such as speaker role or temporal information, may be useful

➢Method: Time-Aware Attention to Speaker Roles and Contexts
o Modeling with a fixed time-aware attention and other learnable attention mechanisms

Experiments and Discussions


➢Setup
o Dataset: DSTC4 (35 human-human dialogues)
o Evaluation metric: F1 for multi-label classification
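For illustration only, multi-label F1 can be computed from binary intent indicator matrices as in the sketch below; the averaging scheme (micro) is an assumption, since the poster does not specify it.

```python
# Hypothetical sketch of the evaluation metric: F1 over multi-label intent predictions.
# The micro-averaging choice is an assumption, not stated on the poster.
import numpy as np
from sklearn.metrics import f1_score

# Rows are utterances, columns are intent labels (e.g. FOL_CONFIRM, QST_WHEN, ...).
y_true = np.array([[1, 0, 1, 0],
                   [0, 1, 0, 0]])
y_pred = np.array([[1, 0, 0, 0],   # model outputs thresholded at 0.5
                   [0, 1, 1, 0]])

print("F1:", f1_score(y_true, y_pred, average="micro"))
```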

➢Experimental Results
o Time-aware attention models significantly outperform the baselines
o The attention mechanisms provide improvement for LU
o Role-level attention requires content-related information to achieve better performance

➢Discussion
o Guide results are consistently better than tourist results
o A likely reason is that the guide shows similar behavior patterns (e.g. providing information and confirming questions), while the tourist has more diverse intentions

➢Result
o The model achieves state-of-the-art performance

[Figure: Attentional Contextual Model. History utterances from the Tourist (User) and Guide (Agent) are summarized per role using sentence-level attention (content and time) and role-level attention; the role history summary is combined with the encoded current utterance (w_t ... w_T) through dense layers for LU.]

➢Role-Level Attention
o Different speaker roles behave differently
o Role-level attention decides how much to attend to each speaker role's contexts

➢Time-Aware Attention
o More recent utterances contain more relevant information

➢Content-Aware Attention
o The semantic relation to the current utterance decides where and how much the model should focus given the contexts (see the sketch below)
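To make these mechanisms concrete, here is a minimal PyTorch sketch, not the released Time-SLU code: the content-aware weight is learned from the semantic relation between each history utterance and the current one, the time-aware weight is a fixed function of dialogue distance (a reciprocal decay is assumed here), and the two are multiplied when summarizing the history. Names such as SentenceAttention, content_scorer, and time_aware_weight are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the authors' implementation) of sentence-level
# attention combining a learned content-aware weight with a fixed time-aware weight.
import torch
import torch.nn as nn

def time_aware_weight(distances):
    # Fixed, non-learned weight: more recent history utterances (smaller distance
    # to the current utterance) receive larger weight. Reciprocal decay is assumed.
    return 1.0 / distances.clamp(min=1.0)

class SentenceAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.content_scorer = nn.Linear(2 * dim, 1)  # learned content-aware score

    def forward(self, history, current, distances):
        # history: (num_utts, dim) encoded history utterances of one speaker role
        # current: (dim,) encoded current utterance
        # distances: (num_utts,) turns between each history utterance and the current one
        cur = current.unsqueeze(0).expand_as(history)
        alpha_c = torch.sigmoid(self.content_scorer(torch.cat([history, cur], dim=-1))).squeeze(-1)
        alpha_t = time_aware_weight(distances.float())
        alpha = alpha_c * alpha_t                          # combine content-aware and time-aware weights
        return (alpha.unsqueeze(-1) * history).sum(dim=0)  # role history summary vector

# Role-level attention would then weight the per-role summaries (tourist vs. guide)
# in the same spirit before they are passed to the prediction layer.
```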

➢End-to-End Training for Contextual Language Understanding
o The BLSTM-encoded current utterance is concatenated with the history vector for multi-label intent prediction
o All encoders, prediction models, and attention weights (except the fixed time-aware attention) can be learned automatically in an end-to-end manner
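A minimal sketch of this end-to-end setup follows (again an assumption in PyTorch, not the authors' released code): a BLSTM encodes the current utterance, its final states are concatenated with the attended history vector, and a sigmoid output layer gives independent probabilities for the multiple intent labels. The class name ContextualLU and the dimension choices are hypothetical.

```python
# Hedged sketch of the contextual LU model: BLSTM-encoded current utterance
# concatenated with the history summary, followed by multi-label intent prediction.
import torch
import torch.nn as nn

class ContextualLU(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden_dim, history_dim, num_intents):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.blstm = nn.LSTM(emb_dim, hidden_dim, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden_dim + history_dim, num_intents)

    def forward(self, word_ids, history_vec):
        # word_ids: (batch, T) token ids of the current utterance (w_t ... w_T)
        # history_vec: (batch, history_dim) attended role/history summary
        _, (h, _) = self.blstm(self.emb(word_ids))
        utt_vec = torch.cat([h[-2], h[-1]], dim=-1)   # final forward/backward BLSTM states
        logits = self.out(torch.cat([utt_vec, history_vec], dim=-1))
        return torch.sigmoid(logits)                  # one probability per intent label

# Training would minimize binary cross-entropy over the intent labels, so every
# component except the fixed time-aware weights is learned jointly end-to-end.
```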


Attention Mechanism
Attention Type
• Content-Aware: α_C
• Time-Aware: α_T (based on the distance between u and the current utterance)
Attention Level
• Sentence: α_{u_i}
• Role: α_{r_i}

Conclusions
o Leveraging different types and levels of attention can improve language understanding
o Time-aware attention brings significant improvement for language understanding

[Figure: Example DSTC4 dialogue segment]
Guide (Agent): you were saying that you wanted to come to singapore? → FOL_CONFIRM
Tourist (User): Uh yes → FOL_CONFIRM
Guide (Agent): uh maybe can I have a little bit more details like uh when will you be coming → QST_WHEN; and like who will you be coming with → QST_WHO
Tourist (User): I'm actually planning to visit uh on august → Task: Language Understanding (User Intents) → RES_WHEN

[Figure: Bar chart of F1 scores (y-axis 60-76) for Baseline, w/o Context, and w/ Context models, broken down by attention type (Content-Aware, Time-Aware, Both) and attention level (Sentence, Role, Both).]