Dialog systems
Rolf Carlson, CTT, KTH

Multi Disciplinary
Recognition, Synthesis, Understanding, Generation, Dialog Control, Production, Perception, Linguistics, Knowledge, Human-Machine, Programming, Phonetics, Statistics, Signal Processing, Semantics, Physiology

Agenda
• Dialog Systems
• Data Collection
• Recognition Understanding
• Disfluency
• Generation, Vocabulary
• Dialog models
• Spontaneous Data
• Platforms
• Evaluation
• Error Handling
• Challenges

Classic systems
• Research systems
– Voyager (1989)
– ATIS (1992)
– SUNDIAL (1993)
– TRAINS (1996)
• Application
– Philips Train Information (1995)
• Large Efforts
– Communicator
– Verbmobil

Nordic Scene
• Stockholm, Sweden
– Waxholm
• Linköping, Sweden
– LINLIN
• Göteborg, Sweden
– TRINDI
• Aalborg, Denmark
• Helsinki, Finland
• Trondheim, Norway

Dialog systems at KTH

The HIGGINS domain

AdApt multimodal dialog system
Multimodal dialog system
Conversations about apartments for sale
Work together with an animated agent, Urban

Some research issues
• Multimodal dialog modelling
– Speech Synthesis, Animation, Turn taking
– Speech Recognition, Pointing
– Reference Handling
• Error Handling
• Adaptivity

Dialog Phenomena
• "Har du inget billigare?" (Don't you have anything cheaper?)
Implicit reference, ellipse, context
• "Berätta mer om den andra lägenheten!" (Tell me more about the other apartment!)
Meta-reference
• "Vad menar du med charmig?" (What do you mean by charming?)
Domain-question

Simulation (Wizard-of-Oz)
[Figure: User; Human operator; Wizard of Oz]

• How much does the wizard, WOZ, take care of
• The Complete System
• Parts of the system
– Recognition
– Synthesis
– Dialog Handling
– Knowledge Base
• Which demands on the WOZ
– How to handle errors
– Should you add information
– What the wizard is allowed to say
• Which support does the WOZ have

Wizard-of-Oz data collection
The Wizard's graphical interface
Early demo
Pictorial scenarios
Adapt – demonstration of "complete" system

The Waxholm Project
• tourist information
• Stockholm archipelago
• time-tables, hotels, hostels, camping and dining possibilities.
• mixed initiative dialogue
• speech recognition
• multimodal synthesis
• graphic information
• pictures, maps, charts and time-tables.

Waxholm System
[Figure: system architecture. Labels (translated from Swedish): Grammar & semantics; Dialog control; Graphics; Recognition; Acoustic and visual speech synthesis; Database search; Sound; Maps and tables; Boat timetables, port locations, hotels, restaurants, etc.; Input; Output; Context-sensitive rules and networks; Lexicon; Simulation with Wizard; Recordings; Speech; Database; Words; Word classes; Semantic information; Pronunciation; Speech]

The Waxholm interface

Waxholm Database
• About 70 subjects (9200 words)
• Phonetically transcribed
• Examples from the Waxholm system
– Five different speakers
• EJ KR GÖ LN MK

Word coverage
[Figure: coverage (0 to 1.0) as a function of the number of words (0 to 700)]

Lexicon - transcription
Word
freq.
Word
freq.
skärgården
69skulle
33SJ'Ä3R
GÅ:2N
15SKk”ULE0
26
SJ'Ä3R
GgÅ
:2N
6SKk”U
3SJ'Ä3G
gÅ:2N
5SKk”UL
2SJӀ3R
#GgÅ̀:2DdE0N
5SKkLE0
1SJ'Ä3R
GgÅ
:2DdE0N
4SKk”UE0
1SJ'Ä3R
GgÅ
:2DE0N
4SJӀ3R
#GgÅ̀:2DE0N
4SJ'Ä3G
gÅ:2DE02N
32S'Ä3R
GgÅ
:2N
3SJ'Ä3R
GgÅ
:N2
SJӀ3R
#GgÅ̀:2DdE02N
2SJӀ3R
#GÅ̀:2DE02N
2SJ”Ä3#GgÅ̀:2DdE0N
2SJ'Ä3R
Å:NG
1SJ'Ä3R
GÅ:N
1SJ'Ä3R
GÅ:2DdE02N
1SJ'Ä3R
GÅ:2DE0N
1SJ'Ä3R
GÅ:2DE02N
1SJ'Ä3G
gÅ:2DdE0N
v1
SJ'Ä3G
gÅ:2DE0N
1SJ'Ä3G
Å:2DE0N
1SJӀ3R
#GgÅ̀:RDdE0N
1

Utterance length distribution
[Figure: number of occurrences (0 to 350) versus utterance length in number of words (0 to 20)]

Extra linguistic sounds
[Figure: number of occurrences in the Waxholm database (9200 words), logarithmic axis from 1 to 1000. Categories: interrupted word, exhalation, hesitation, inhalation, smack, inserted vowel. Bar values: 32 (0.3%), 60 (0.6%), 67 (0.7%), 117 (1.3%), 152 (1.6%), 230 (2.5%)]

Three years later….

The Waxholm system
U: Where can I find hotels?
S: Information about hotels is shown in this table
U: Which hotels are in Waxholm?
S: Information about the hotels in Waxholm is shown in this table
U: Where is Waxholm?
S: Waxholm is shown on this map
U: I think I want to go to Waxholm
S: I am looking for boats to Waxholm. From where do you want to go
U: The city
S: Which day of the week do you want to go?
U: I want to go tomorrow
S: There are lots of boats from Stockholm to Waxholm on a Friday. At what time do you want to go?
U: When do the evening boats depart?
S: This is a table of the boats...
U: Is it possible to eat in Waxholm?
S: Information about the restaurants in Waxholm is shown in this table
U: Thank you
S: Thank you too

Instead of WOZ "Bootstrap" the system
• Make a simple but complete system and evaluate
• Spread the information….
• Collect data
• Upgrade the system

Philips train information
020 7575 75

WebGALAXY Display
[Figure labels: System Status; Recognition; Paraphrase; Domain; Input Status; Abort Connection; Information; MIT-LCS]

Training of Jupiter

Even generic databases are important
Swedia
SpeechDat
………

Swedish dialects
"Flyget, tåget och bilbranschen tävlar om lönsamhet och folkets gunst." (Aviation, the railways and the car industry compete for profitability and the favour of the public.)
Born in USA, ex-Yugoslavia

Speech understanding: some aspects
• Bigram, tight coupling
• Keyword spotting
• Phrase spotting
• Full grammatical and semantic analysis
• OOV, out of vocabulary

Perplexity of the language
B  perplexity for the application
H  entropy for the application
P(W)  probability of a word given its preceding context

$$H = -\sum_{\forall W} P(W)\,\log_2 P(W)$$
$$B = 2^{H}$$

Automatic recognition
[Figure: the input "paus|hej|paus" (pause, "hello", pause), with the stages dynamic "analysis", grammatical analysis, semantic analysis and dialog analysis]

Automatic understanding
[Figure: the input "paus|nej|paus" (pause, "no", pause), with the stages dynamic "analysis", grammatical analysis, semantic analysis and dialog analysis]

Representing multiple hypotheses
N-best list
Jag vill åka till Vaxholm
Jag till åka till Vaxholm
Jag vill åka vill Vaxholm
Jag till åka vill Vaxholm
Ja vill åka till Vaxholm
Ja till åka till Vaxholm
Ja vill åka vill Vaxholm
Ja till åka vill Vaxholm
Word graph
[Figure: word graph over "jag vill åka till Vaxholm" with the competing words "ja", "till" and "vill"]
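
The relation between the two representations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word graph is simplified to a sequence of slots with competing words, and the scores are invented (the slide gives none). Expanding every path through the slots yields the eight hypotheses of the N-best list.

```python
from itertools import product

# Simplified word graph: one slot per word position, each slot mapping the
# competing words to a score. The words are taken from the slide; the scores
# are invented purely for illustration.
WORD_GRAPH = [
    {"jag": 0.6, "ja": 0.4},
    {"vill": 0.7, "till": 0.3},
    {"åka": 1.0},
    {"till": 0.8, "vill": 0.2},
    {"Vaxholm": 1.0},
]


def n_best(graph, n):
    """Return the n highest-scoring word sequences as (score, sentence)."""
    hypotheses = []
    # Every path picks exactly one (word, score) pair from each slot.
    for path in product(*(slot.items() for slot in graph)):
        score = 1.0
        for _, word_score in path:
            score *= word_score
        sentence = " ".join(word for word, _ in path)
        hypotheses.append((score, sentence))
    hypotheses.sort(key=lambda h: h[0], reverse=True)
    return hypotheses[:n]


for score, sentence in n_best(WORD_GRAPH, 3):
    print(f"{score:.3f}  {sentence}")
```

The graph stores 7 word entries, while the equivalent N-best list needs 8 sentences of 5 words each, which is one reason a graph is the more compact way to pass hypotheses on to later analysis.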

Knowledge sources - Evaluation
[Figure labels: Acoustic analysis; Syntactic analysis; Semantic analysis; Dialog state; Dialog Context; Confidence; Expectation Filter; Syntactic analysis]
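
Multi-level analysis
p(boat | "TOP+SUBJ")
p(verb | "SUBJ+boat+VP")
p(TO | "VP+verb+TO_PLACE")
p(port | "VP+verb+TO_PLACE")
p(SUBJ | "TOP")
p(VP | "TOP+SUBJ")
p(TO_PLACE | "TOP+SUBJ+VP")
p(END | "SUBJ+VP+TO_PLACE")
[Figure: parse tree with the nodes TOP, SUBJ, VP and TO_PLACE, the terminals boat, verb, TO and port, and the words "båten", "går", "till" and "Vaxholm"]

$$\mathrm{SCORE} = \sum p_{\mathrm{node}} + \sum p_{\mathrm{terminal}} + \sum p_{\mathrm{word}} + f(\mathrm{length})$$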

I want to go......
Parser score | Utterance | Translation
12.26 | Jag vill åka från Stockholm till Vaxholm. | I want to go from Stockholm to Vaxholm.
11.99 | Jag vill åka till Vaxholm från Stockholm. | I want to go to Vaxholm from Stockholm.
10.01 | Jag vill åka till Vaxholm. | I want to go to Vaxholm.
9.85 | Jag skulle vilja åka till Vaxholm. | I would like to go to Vaxholm.
5.30 | Jag vill åka. | I want to go.
3.17 | När går det en båt till Vaxholm? | When does a boat go to Vaxholm?
-1.32 | När går båten till Vaxholm? | When does the boat go to Vaxholm?
-1.95 | Jag vill åka till mamma. | I want to go to my mother.

Robust Analysis
• Robust interpretation
– Using grammar to automatically detect non-expected words between and inside phrases
– Performs better than keyword-spotting for detecting erroneous content-words
– Skantze, G. & Edlund, J. (2004). Robust interpretation in the Higgins spoken dialogue system.

Disfluencies
Type: Filled pause, Repetition, Insertion, Restart, Substitution
Example:
jag hum tycker om glass
jag jag tycker om glass
kan du jag tycker om glass
jag tycker om inte om glass
vi tycker jag tycker om glass

'Disfluency rate'
• Human-Human
– Two person telephone (High)
– Two person direct
– One person
• Human-Machine
– Computer interaction (Low)
S. Oviatt

Distribution of Disfluencies
Switchboard data, Liz Shriberg, Thesis, SRI
[Figure: Disfluency Position. Proportion of disfluencies (0.00 to 0.40) versus position of current word / total words (0.00 to 1.00), for utterances of 5-8 words, 9-12 words and 13-16 words]

Disfluency examples from Adapt
rättelse (correction): det är lite för... lite för sent tidigt
finns det nån ehm ...liknande lgh … in/~ området med som är byggd på 1800talet
avbrutet (interrupted): hur se/ eh... är köket eh …utrustat
pauser (pauses): uhm... högt till tak och eh... kanske någon kakelugn... och balkong gärna iii söderläge
feluttal (mispronunciation): är den eh nyredo~nyrenoverad
förlängning (lengthening): huuurrr ser gatan ut
rättelse term (term correction): jag vill gärna ha en lägenhet med... utsikt... nej med balkong

Disfluencies in half of the Adapt corpus
22% of all utterances disfluent
6% of all words disfluent
[Figure: percentage disfluent words in turns with five to nine words, per individual user; axis ticks 4, 8, 12 %]
[Figure: percentage disfluent words [%] by utterance length (1-4 words, 5-9 words, 10-14 words, >=15 words); axis ticks 4, 8, 12]

Utterance Generation
• Predefined utterances
• Frames with slots
• Generation based on grammar and underlying semantics

System Utterances
• The output should reflect the system's vocabulary and linguistic capability
– the users adapt
• Short utterances
– The users adapt
• Good error messages
– Use words and phrases the system can handle

User answers to questions?
The answers to the question:
"What weekday do you want to go?" (Vilken veckodag vill du åka?)
• 22% Friday (fredag)
• 11% I want to go on Friday (jag vill åka på fredag)
• 11% I want to go today (jag vill åka idag)
• 7% on Friday (på fredag)
• 6% I want to go a Friday (jag vill åka en fredag)
• - are there any hotels in Vaxholm? (finns det några hotell i Vaxholm)

Pairs of alternative main verbs
höra - lyssna (hear - listen)
se på - gå på (watch - go to)
föredrar - tycker mest om (prefer - like the most)
köpa - handla (buy - shop)
testa - pröva (test - try)
vandra - ströva (hike - stroll)

Hur ofta åker du utomlands på semestern? (How often do you go abroad on holiday?)
jag åker en gång om året kanske
jag åker ganska sällan utomlands på semester
jag åker nästan alltid utomlands under min semester
jag åker ungefär 2 gånger per år utomlands på semester
jag åker utomlands nästan varje år
jag åker utomlands på semestern
varje år
jag åker utomlands ungefär en gång om året
jag är nästan aldrig utomlands
en eller två gånger om året
en gång per semester
kanske
en gång per år
ungefär en gång per år
åtminståne en gång om året
nästan aldrig

Hur ofta reser du utomlands på semestern? (How often do you travel abroad on holiday?)
jag reser en gång om året utomlands
jag reser inte ofta utomlands på semester det blir mera i arbetet
jag reser reser utomlands på semestern
vartannat år
jag reser utomlands en gång per semester
jag reser utomlands på semester ungefär en gång per år
jag brukar resa utomlands på semestern åtminståne en gång i året
en gång per år kanske
en gång vartannat år
varje år
vart tredje år ungefär
nuförtiden inte så ofta
varje år brukar jag åka utomlands

Examples of questions and answers

Results
[Figure: reuse 52%, other 24%, ellipse 18%, no reuse 4%, no answer 2%]

Lessons
• subjects adapt their lexical choices to system questions
• in less than 5% of the cases an alternative main verb is used in the answer
• adaptive language model and lexicon in the recognizer

Dialog Model

Human-machine interaction
• Initiative
– system/user
• Who is the user
– First time?
• Terminology
– joint vocabulary
• Do you accept barge in?
– Has the user understood what was said?
• Can the user teach the system?

Modalities
• Who are you talking to
– system
– Animated character
• How is the information presented
– Text, tables, pictures
– Synthetic speech
• Can you both talk and point

Spoken dialog system
• Finite-state based systems
– dialog and states explicitly specified
• Frame based systems
– dialog separated from information states
• Agent based systems
– model of intentions, goals, beliefs
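
To make the frame-based idea concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not the design of any system named in these slides: the slot names `origin`, `day` and `time` are hypothetical, and the prompts are borrowed from the Waxholm example dialog. The frame holds the information state, and the dialog step is derived from whichever slot is still empty instead of from an explicitly specified state graph.

```python
# Hypothetical slots for a boat-trip frame, each paired with the system
# question that asks for it (prompts borrowed from the Waxholm dialog).
PROMPTS = {
    "origin": "From where do you want to go?",
    "day": "Which day of the week do you want to go?",
    "time": "At what time do you want to go?",
}


def next_prompt(frame):
    """Return the question for the first unfilled slot, or None when done."""
    for slot, prompt in PROMPTS.items():
        if frame.get(slot) is None:
            return prompt
    return None


frame = {"origin": "Stockholm", "time": "evening"}
print(next_prompt(frame))  # asks for the missing day
```

Because the user may fill slots in any order, or several at once, the same frame supports mixed initiative without enumerating every dialog path.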

Dialog model
• Domain dependent model
– Rules, networks, stack
• Separate models for the dialog turns and the semantics
– For example Question/answer
• Reference Handling

VoiceXML
http://www.voicexml.org/
Nuance
Voyager

Waxholm Topics
TIME_TABLE
Task: get a time-table.
Example: När går båten? (When does the boat leave?)
SHOW_MAP
Task: get a chart or a map displayed.
Example: Var ligger Vaxholm? (Where is Vaxholm located?)
EXIST
Task: display lodging and dining possibilities.
Example: Var finns det vandrarhem? (Where are there hostels?)
OUT_OF_DOMAIN
Task: the subject is out of the domain.
Example: Kan jag boka rum. (Can I book a room?)
NO_UNDERSTANDING
Task: no understanding of user intentions.
Example: Jag heter Olle. (My name is Olle)
END_SCENARIO
Task: end a dialog.
Example: Tack. (Thank you.)

Dialogue control - state prediction
Dialog grammar specified by a number of states
Each state associated with an action: database search, system question … …
Probable state determined from semantic features
Transition probability from one state to another
Dialog control design tool with a graphic interface

Semantic Frame
Current functions: /TO-PLACE Q-VERBAL SUBJECT FROM-TIME/
Current meaning: /MOVE BOAT PORT QUANT/
History functions: /TO-PLACE Q-VERBAL SUBJECT FROM-TIME/
History meaning: /MOVE BOAT PORT QUANT/
(FROM-TIME.AFTER_TIME "04" //)
(FROM-TIME.BEFORE_TIME "06" //)
(SUBJECT "båten" /BOAT/)
(Q-VERBAL "går" /MOVE/)
(TO-PLACE "vaxholm" /PORT/)
proposed topic TIME_TABLE

Topic selection
TOPIC EXAMPLES
FEATURES | TIME TABLE | SHOW MAP | FACILITY | NO UNDERSTANDING | OUT OF DOMAIN | END
OBJECT | .062 | .312 | .073 | .091 | .067 | .091
QUEST-WHEN | .188 | .031 | .024 | .091 | .067 | .091
QUEST-WHERE | .062 | .688 | .390 | .091 | .067 | .091
FROM-PLACE | .250 | .031 | .024 | .091 | .067 | .091
AT-PLACE | .062 | .219 | .293 | .091 | .067 | .091
TIME | .312 | .031 | .024 | .091 | .067 | .091
PLACE | .091 | .200 | .500 | .091 | .067 | .091
OOD | .062 | .031 | .122 | .091 | .933 | .091
END | .062 | .031 | .024 | .091 | .067 | .909
HOTEL | .062 | .031 | .488 | .091 | .067 | .091
HOSTEL | .062 | .031 | .122 | .091 | .067 | .091
ISLAND | .333 | .556 | .062 | .091 | .067 | .091
PORT | .125 | .750 | .244 | .091 | .067 | .091
MOVE | .875 | .031 | .098 | .091 | .067 | .091

$$\arg\max_i \{\, p(t_i \mid F) \,\}$$
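
The argmax rule can be tried out with a small Python sketch that uses four rows of the table. The slide does not say how the values of several features are combined into p(t_i | F); multiplying them, as done below, is an assumption made only for this illustration, and the function name `select_topic` is hypothetical.

```python
import math

# Column order of the table on the slide.
TOPICS = ["TIME_TABLE", "SHOW_MAP", "FACILITY",
          "NO_UNDERSTANDING", "OUT_OF_DOMAIN", "END"]

# Four feature rows copied from the table; the rest are omitted for brevity.
FEATURE_TABLE = {
    "QUEST-WHERE": [.062, .688, .390, .091, .067, .091],
    "HOTEL":       [.062, .031, .488, .091, .067, .091],
    "PORT":        [.125, .750, .244, .091, .067, .091],
    "MOVE":        [.875, .031, .098, .091, .067, .091],
}


def select_topic(features):
    """Pick the topic with the highest combined score for the given features.

    Assumption: per-feature values are combined by multiplication.
    """
    scores = {
        topic: math.prod(FEATURE_TABLE[f][i] for f in features)
        for i, topic in enumerate(TOPICS)
    }
    return max(scores, key=scores.get)


print(select_topic(["MOVE", "PORT"]))
```

With the features MOVE and PORT, which also appear in the semantic frame for "båten går till Vaxholm", this sketch proposes TIME_TABLE, in line with the proposed topic shown for that frame.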

Topic prediction results
[Figure: % errors (axis 0 to 15). Series: All; "no understanding" excluded. Conditions: raw data; no extra linguistic sounds; complete parse. Values shown: 12.9, 12.7, 8.8, 8.5, 3.1, 2.9]

How may I help you?
• Callers are routed to support staff using Natural Voices technology, AT&T Consumer Services' How May I Help You? (HMIHY).
• The HMIHY system was deployed in 2001, and by the end of the year, it was handling more than 2 million calls per month.
• Allen Gorin et al.

Human-human conversations
Act | Customer Freq. | Customer Words | Agent Freq. | Agent Words
Acknowledge | 47.9 | 2.3 | 30.8 | 3.1
Request | 29.5 | 9.0 | 15.0 | 12.3
Confirm | 13.1 | 5.3 | 11.3 | 6.4
Inform | 5.9 | 7.9 | 27.8 | 12.7
Statement | 3.4 | 6.9 | 15.0 | 6.7
Statistics of turns in a movie domain (from Flammia).

Conversational "grunts"
• Grunts occur an average of once every 5 seconds in American English conversation. (Nigel Ward, 2000)
• In Switchboard database
– um was the 6th most frequent item (after I, and, the, you, and a), (Nigel Ward, 2000)
– the four items uh, uh-huh and um and um-hum accounted for 4% of the total (Picone et al. 1998).

Notation
abbreviation and function/position
back  back-channel
fill  filler, including various things that occur utterance- or turn-initially
dis  disfluency marker
is  isolate, produced when neither person has the turn, typically more self-directed than other-directed
rs  response to direct question or high-rise statement
c  confirmation, in response to a back-channel
o  other, including clause-final items, items that occur in quotations, and items whose function is obscure

User studies
• Turn-taking
• Interaction
• Positive and Negative User Feedback
• User reactions

Positive and negative feedback
[Figure: utterances with feedback [%] per individual user; axis ticks 25, 50]

System | User | Feedback
This building was constructed in 1861 | Yes yes that's right….. is there a tiled stove there too | Positive Attention
This apartment has a fireplace | Yes that's all right too…….. how high is the building | Positive Attitude
This apartment is on the first floor | Okay…. and I see it is close to the German church there | Positive Attention
I don't know anything about such things | Well okay…………. yes but I think I'm happy with that | Negative Attitude
This house was built in 1600 | Oh! | Negative Attitude

94% of the subjects used feedback at least once
65% of the feedback turns were labeled as positive
18% of all user utterances contained feedback
6% of the feedback occurred in a separate turn

Parameter settings to create different stimuli
Parameter | Affirmative setting | Negative setting
Smile | Head smiles | Head has neutral expression
Head movement | Head nods | Head leans back
Eyebrows | Eyebrows rise | Eyebrows frown
Eye closure | Eyes close a bit | Eyes open widely
F0 contour | Declarative intonation | Interrogative intonation
Delay | Immediate reply | Slow reply

The August system
• Stockholm (events and general information)
• Yellow pages
• KTH and speech technology
• August Strindberg
• Greetings and social utterances
• Comments about the system capabilities and the discourse

Shallow semantic analysis
• Input
– word sequences
– semantic features from lexicon
• Output
– Acceptable utterance? yes/no
– Predicted domain
• strindberg, stockholm, yellow pages…..
– Feature:value representation
• object:restaurant, place:mariatorget
• Trained on tagged N-best lists and lexicon

The set-up in Kulturhuset
A sample video of the system environment

The August database
[Figure: two pie charts. men 50%, women 26%, children 24%; men 55%, women 23%, children 22%. Labels: 2685 users; 10,058 utterances]
September 1998 - February 1999: 10,058 utterances (approximately 15 hours of speech) were manually checked, transcribed and analyzed

What do you say to August?
• Child
• Woman 1
• Woman 2

Utterance types in the August database
Socializing: Social, Insult, Test
Info-seeking: Domain, Meta, Facts

Socializing categories
Social
Hello August!
That's a nice moustache!
Would you like to go out with me tonight?
Insult
You are stupid!
Is your brain too small?
You have a sausage brain!
Test
What is my name?
I want to rent a refrigerator
What is the colour of your hair?

The info-seeking categories
Domain
How many books did Strindberg write?
What can you study at KTH?
Where are the restaurants on Kungsgatan?
Meta
What can I ask you?
August answer my question I know you know everything
Then I will speak at the same time as I hold down the button - what is your name, agent
Facts
What's the capital of Finland?
What is two times two?
How many people live in Madrid?

User utterance categories during the first six dialogue turns
category | children | women | men
only socializing | 34% | 20% | 20%
only info-seeking | 28% | 39% | 34%
from socializing to info-seeking | 7% | 6% | 3%
alternating | 31% | 35% | 43%
The statistics are based on the first utterances (up to six) from all users that said more than two utterances to the system

What …….. ?
• 334 utterances include "what"
– only 75 have "what" in initial position
• 99 "what is your name"
– all in final utterance position
– only 13 initiate an utterance
intro what ……...

An example of a repetitive sequence
The utterance "Vad heter kungen?" (What is the name of the king?) as original input (top) and repeated twice by the same user

Features in repetition
[Figure: percentage of all repetitions (0 to 50) for adults and children. Features: more clearly articulated; increased loudness; shifting of focus]

Some lessons for recognition
• lexical entrainment
– use both user input and system output
• adaptive to
– application
– user
– dialog
• use three recognition systems in parallel
– continuous speech (default)
– word by word (error resolution)
– continuous syllables (confidence)

The August system
What is your name?
I call myself Strindberg, but I don't really have a surname
When were you born?
Strindberg was born in 1849
What do you do for a living?
I can answer questions about Strindberg, KTH and Stockholm
How many people live in Stockholm?
Over a million people live in the Stockholm area
Do you like it here?
People who live in glass houses should not throw stones
Yes, that was a smart thing to say!
I come from the department of Speech, Music and Hearing
The Royal Institute of Technology!
The information is shown on the map
Thank you!
You are welcome!
Good bye!
Perhaps we will meet soon again!
Yes, it might be that we will!
Strindberg was married three times!

Verbal topicalization
NP+STREET: The apartment on Sankt Eriksgatan - which floor is it on
STREET: Österlånggatan 24 - does it have a balcony
COLOR: The green apartment - does it have a balcony
DEICTIC+STREET: This one - on Heleneborgsgatan - does it have a bathtub
COLOR+STREET: The yellow one on Kocksgatan 20 - does it have a balcony
[Figure: number of occurrences (axis 5 to 20) for COLOR, COLOR+STREET, DEICTIC+STREET, NP+STREET, STREET]
About 10% of all requests for information about a specific apartment contained a topicalized reference. 4% contained a verbal topical reference and 6% a preceding graphical reference

Graphical topicalization
[click] + DEICTIC: When was this one built
[click] + PRONOUN: Does it have a tiled stove
[click] + NP: Does the apartment have a bathtub
[click] + ELLIPSE: Which floor
[click] + COLOR: When was the white house built
[click] + STREET: Does Hornsgatan 59 have a tiled stove
[Figure: number of occurrences (axis 5 to 20) for [click]+COLOR, [click]+DEICTIC, [click]+ELLIPSE, [click]+NP, [click]+PRONOUN, [click]+STREET]

Timing of mouse clicks
[Figure: average time between mouse click and subsequent speech [sec] (axis 1 to 4) for [click]+COLOR, [click]+DEICTIC, [click]+ELLIPSE, [click]+NP, [click]+PRONOUN, [click]+STREET]
[Figure: time between mouse click and speech [sec] (axis -5 to 10, before/after) for individual users]

TRIPS
James Allen et al., "Towards conversational human-computer interaction," AI Magazine, 22(4), 2001
The Behavioral Agent (BA) plans system behavior based on its goals and obligations, the user's utterances and actions, and changes in the world state.
The Interpretation Manager (IM) interprets user input. It broadcasts the recognized speech acts and incrementally updates the Discourse Context.
The Generation Manager (GM) plans the specific content of utterances and display updates.

User and system model
A new method for dialogue management in an intelligent system for information retrieval - Kenji Abe, Kazushige Kurokawa, Kazunari Taketa, Sumio Ohno and Hiroya Fujisaki, ICSLP 2000

Platforms

Flat model
[Figure labels: TTS; ASR; ASV; desktop audio; animated agent; SQL database; audio device; audio coder; application, dialog engine]

Multi-layer model
[Figure labels: component APIs (ASV, TTS, audio device, animated agent, SQL, audio coder, ASR, speech detector); component interaction; services; high-level primitives; dialog components; speech technology API; application, dialog engine; resource layer; application-independent layer; application-dependent layer]

Platform support
[Figure labels: TTS; ASR; ASV; desktop audio; animated agent; SQL database; audio device; audio coder; speech-tech API]

Generic dialogue manager - SesaME
[Figure labels: ATLAS; Service platform; Common platform; Service descriptions; Speech Technology API (ATLAS); Blackboard; DB-s; Interaction Manager; Dialogue Engine; System task models; A-Agent; Decision Agent; Dialog interpreter; Dialog description activator; Service description collection; SesaME; Domain task models; User models; Open Service Environment API]

Evaluation
• Phonetic analysis
• Word understanding Synthesis/Recognition
• Domain Dependent
– Vocabulary Syntax
• System Feedback
• "Task completion"
• How long time
• How many turns
• Happy and satisfied users

Evaluation efforts
• Some projects
– NIST
– SAM
– COCOSDA
– EAGLES
– DISC

NIST-evaluations
• NIST - National Institute of Standards and Technology (USA)
– http://www.nist.gov/speech/
• Areas
– Communicator (Intelligent Conversational Interfaces, 2000-)
– Speech Recognition (English, Spanish, Mandarin)
• broadcast news (1996-1999)
• conversational telephone speech (1997-)
– Topic Detection and Tracking (1998-, English, Mandarin)
– Information Extraction - Entity Recognition (1999-)
– Spoken Document Retrieval (1997-2000)
– Speaker Recognition (1996-)

Word accuracy

$$G = \frac{N - F - B - I}{N} \cdot 100$$

G  word accuracy
N  number of words
F  number of wrong words
B  number of missing words
I  number of inserted words
Not sensitive to word similarities
Ex: i kväll - ikväll, jag - ja, Vaxholm - Vaxholms, bil - restaurang are equally wrong
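
A small Python sketch of the measure follows. The slide does not say how F, B and I are counted; the sketch assumes the usual approach of taking them from a minimum-edit-distance alignment between the reference and the recognized word sequence, and the function names are hypothetical.

```python
def error_counts(ref, hyp):
    """Return (F, B, I): wrong, missing and inserted words.

    The counts come from a minimum-edit-distance alignment of the
    reference words (ref) against the recognized words (hyp).
    """
    # cell = (total errors, F, B, I) for ref[:i] aligned with hyp[:j]
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)          # every reference word is missing
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)          # every recognized word is inserted
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            t, f, b, ins = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (t, f, b, ins)            # correct word
            else:
                diagonal = (t + 1, f + 1, b, ins)    # wrong word
            t, f, b, ins = dp[i - 1][j]
            missing = (t + 1, f, b + 1, ins)
            t, f, b, ins = dp[i][j - 1]
            inserted = (t + 1, f, b, ins + 1)
            dp[i][j] = min(diagonal, missing, inserted)
    return dp[-1][-1][1:]


def word_accuracy(ref, hyp):
    """G = (N - F - B - I) / N * 100, with N the number of reference words."""
    f, b, i = error_counts(ref, hyp)
    return (len(ref) - f - b - i) / len(ref) * 100


print(word_accuracy(["jag"], ["ja"]), word_accuracy(["bil"], ["restaurang"]))
```

Both calls give the same accuracy, which is the point made on the slide: a near miss such as "jag" recognized as "ja" is penalized exactly like "bil" recognized as "restaurang".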

Perplexity
B  perplexity
H  entropy
P(W)  probability of the word sequence in the used language
Example: Number sequences
B = 11, if all digits equally probable

$$H = -\sum_{\forall W} P(W)\,\log_2 P(W)$$
$$B = 2^{H}$$
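
The number-sequence example can be checked numerically. The sketch below is a simplification under a stated assumption: it treats P(W) as a single distribution over the next item, here 11 equally probable digit words, instead of a distribution over whole word sequences.

```python
import math


def entropy(probabilities):
    """H = -sum P(W) * log2 P(W), skipping zero-probability items."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def perplexity(probabilities):
    """B = 2 ** H."""
    return 2 ** entropy(probabilities)


uniform = [1 / 11] * 11             # 11 equally probable items
skewed = [0.5] + [0.05] * 10        # same vocabulary, one item dominates
print(perplexity(uniform), perplexity(skewed))
```

For the uniform case the perplexity equals the number of alternatives, and any skew in the distribution lowers it, which is why perplexity is read as the average branching factor the recognizer faces.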

DARPA-evaluation 1988-1999
[Figure: Accuracy (%) (axis 0 to 25) per year, 1988 to 1999, for RM commands (1000 words), ATIS spontaneous (2000 words), WSJ read news (20000 words), NAB, Broadcast News Transcription (60000 words)]

DISC
• Spoken Language Dialogue Systems and Components: Best practice in development and evaluation
• Partners and people
– Natural Interactive Systems Laboratory (NIS), Denmark
– Centre National de la Recherche Scientifique (CNRS-LIMSI), France
– Universität Stuttgart, Germany
– Kungliga Tekniska Högskolan (KTH), Sweden
– Vocalis Ltd, England
– Daimler-Chrysler AG, Germany
– ELSNET, Europe
• URL: www.disc.dk

User profile
Email retrieval
Kamm, Litman, Walker ICSLP 98
[Figure: Time, User Turns and Recognition Score (logarithmic axis 10 to 1000) for Novice without Tutorial, Novice with Tutorial and Expert]

General Models of Usability with PARADISE
Marilyn Walker, Candace Kamm and Diane Litman

Paradise User Satisfaction
• I found the system easy to understand in this conversation. (TTS Performance)
• In this conversation, I knew what I could say or do at each point of the dialogue. (User Expertise)
• The system worked the way I expected it to in this conversation. (Expected Behaviour)
• Based on my experience in this conversation using this system to get travel information, I would like to use this system regularly. (Future Use)

Evaluation metrics
• Dialog Efficiency Metrics: Total elapsed time, Time on task, System turns, User turns, Turns on task, time per turn for each system module
• Dialog Quality Metrics: Word Accuracy, Sentence Accuracy, Mean Response latency, Response latency variance
• Task Success Metrics: Perceived task completion, Exact Scenario Completion, Any Scenario Completion
• User Satisfaction: Sum of TTS performance, Task ease, User expertise, Expected behaviour, Future use.

Communicator: User Satisfaction
Task completion
Task duration

Prediction of satisfaction?
PERFORMANCE = .25 MRS + .33 COMP - .33 HELP
MRS = mean recognition score
COMP = perceived completion
HELP = number of help messages
PERFORMANCE = User satisfaction
Covers 41.3% of the variance
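
The fitted performance function is a plain weighted sum and can be written directly in Python. One assumption is made loudly here: the slide does not state the scale of the three inputs, and the sketch supposes they have already been put on a common scale (for instance normalized scores), so the numbers in the example call are invented.

```python
def performance(mrs, comp, help_count):
    """Predicted user satisfaction from the slide's regression.

    mrs: mean recognition score, comp: perceived completion,
    help_count: number of help messages. All three are assumed to be
    on a common, already normalized scale (not stated on the slide).
    """
    return 0.25 * mrs + 0.33 * comp - 0.33 * help_count


# Invented, normalized example values for two dialogs.
smooth_dialog = performance(mrs=1.0, comp=1.0, help_count=-0.5)
troubled_dialog = performance(mrs=-1.0, comp=0.0, help_count=1.5)
print(smooth_dialog, troubled_dialog)
```

The signs carry the interpretation: better recognition and perceived completion raise predicted satisfaction, while each additional help message lowers it by as much as completion raises it.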

Some Challenges
• Dialog Modeling
– statistical?
• Initiative
– conversation
• Error Handling
• Multidomain
• User modelling – Adaptivity
• Turn Taking
• Multimodal Communication

Dialog Management in MIMIC
• Initiative modeling
– distribution of system initiatives
• Goal selection
– goal that the system attempts to reach
• Strategy selection
– dialog acts depending on initiative distribution
MIMIC: An Adaptive Mixed Initiative Spoken Dialogue System for Information Queries, Jennifer Chu-Carroll, NAACL 2000

Initiative-Cue detection
• Discourse cues
– TakeOverTask
• when user gives more info than needed
– NoNewInformation
• no progress towards task completion

Adaptation of the dialog
• Evaluate the dialog continuously
– Do the system and the user have the same goal
– Who takes the initiative
• Error handling
– Analysis and repair

Error handling in Waxholm
[Figure: flowchart. Labels: Input; Robust parse; Answer OK; Output "I do not understand"; Output "This is what I understood….."; Predict topic. Agree?; Full parse; 1. loop; 2. loop; Yes; No]

The HIGGINS domain
• The primary domain of HIGGINS is city navigation for pedestrians.
• Secondarily, HIGGINS is intended to provide simple information about the immediate surroundings.
This is a 3D test environment

The HIGGINS domain
U: Jag har en stor byggnad till vänster (I have a large building on my left)
S: Vilken färg har den byggnaden? (What colour is that building?)
U: Orange
S: Beskriv något mer (Describe something more)
U: Jag har en glasbyggnad framför mig (I have a glass building in front of me)
S: Ställ dig mellan den och en träbyggnad (Position yourself between it and a wooden building)

Errors in spoken dialogue systems
• What is an error?
– A deviation from an expected output from a system, module or process
• Deviation from what?
– What is written in the requirement specification
– What a human "wizard" would do
– What maximises user satisfaction
• Users never make errors in this sense!
– Disfluencies etc. are just another behaviour the system should handle

Error handling research issues
[Figure labels: User utterance; User reaction/repair; Assume understanding; No recovery; Non-understanding; Error recovery (Non-understanding); Early error detection; Grounding; Late error detection; Error recovery (Misunderstanding); Misunderstanding]
U: jag träd har hej en stor byggnad en till vänster
S: Vilken färg har den byggnaden?
U: orange vänster
Initial experiments
•Studies on human-human conversation
•The Higgins domain (similar to Map Task)
•Using ASR in one direction to elicit error handling behaviour
[Diagram: User (listens, speaks) and Operator (reads, speaks), connected through ASR and a vocoder]
Non-understanding error recovery
•Results show that humans tend not to signal non-understanding:
O: Do you see a wooden house in front of you?
U: YES CROSSING ADDRESS NOW (I pass the wooden house now)
O: Can you see a restaurant sign?
•This leads to
  –Increased experience of task success
  –Faster recovery from non-understanding
•Skantze, G. (2003). Exploring human error handling strategies: implications for spoken dialogue systems.
Error handling research issues
[Diagram: User utterance; User reaction/repair; Assume understanding; No recovery; Non-understanding; Assume understanding; Error recovery (Non-understanding); Early error detection; Grounding; Late error detection; Error recovery (Misunderstanding); Misunderstanding]
Early error detection
•The system must understand what it doesn't understand
  –Measure of confidence in its understanding
    •Determines grounding behaviour
    •Facilitates late error detection
  –Deciding when to reject and when to accept whole utterances or parts of utterances
    •Should depend on
      –Confidence of understanding
      –Consequence of non-understanding
      –Consequence of misunderstanding
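The three factors on this slide can be combined in many ways. The sketch below shows one simple possibility, an expected-cost rule applied to each part of an utterance. The `Concept` class, the confidence scores and the cost values are all assumptions made for illustration; this is not the method used in Higgins.

```python
from dataclasses import dataclass


@dataclass
class Concept:
    """One part of an utterance with a confidence score (hypothetical)."""
    name: str
    confidence: float  # assumed to lie in [0, 1]


def decide(concept: Concept, cost_mis: float, cost_non: float) -> str:
    """Accept or reject one concept by comparing expected costs.

    Accepting risks a misunderstanding with probability (1 - confidence).
    Rejecting always costs one non-understanding (e.g. a repeat request).
    """
    expected_cost_accept = (1.0 - concept.confidence) * cost_mis
    expected_cost_reject = cost_non
    return "accept" if expected_cost_accept <= expected_cost_reject else "reject"


def filter_utterance(concepts, cost_mis: float, cost_non: float) -> dict:
    """Decide for each part of the utterance separately."""
    return {c.name: decide(c, cost_mis, cost_non) for c in concepts}
```

With a misunderstanding assumed to be four times as costly as a non-understanding, `filter_utterance([Concept("colour=orange", 0.9), Concept("direction=left", 0.3)], cost_mis=4.0, cost_non=1.0)` accepts the colour and rejects the direction, so only part of the utterance is kept.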
Current research
[Diagram: User utterance; User reaction/repair; Assume understanding; No recovery; Non-understanding; Assume understanding; Error recovery (Non-understanding); Early error detection; Grounding; Late error detection; Error recovery (Misunderstanding); Misunderstanding]
Higgins domain - demonstration
CHIL "Computers in the Human Interaction Loop"
•Integrated Project under the European Commission's Sixth Framework Programme.
•Coordinated by Universität Karlsruhe (TH) and the Fraunhofer Institute IITB.
•CHIL was launched on January 1st, 2004.
http://chil.server.de/
•DaimlerChrysler AG, Group Dialogue Systems, Germany
•ELDA, Evaluations and Language resources Distribution Agency, France
•IBM Ceska Republika, Czech Republic
•RESIT, Research and Education Society in Information Technologies, Greece
•INRIA (Institut National de Recherche en Informatique et en Automatique), Project GRAVIR, France
•IRST (Istituto Trentino di Cultura), Italy
•KTH (Kungl Tekniska Högskolan), Sweden
•CNRS, LIMSI (Centre National de la Recherche Scientifique through its Laboratoire d'Informatique pour la mécanique et les sciences de l'ingénieur), France
•TUE (Technische Universiteit Eindhoven), The Netherlands
•IPD, Universität Karlsruhe (TH) through its Institute IPD, Germany
•UPC, Universitat Politècnica de Catalunya, Spain
•Universität Karlsruhe (TH), Interactive Systems Labs, Germany
•Fraunhofer Institut für Informations- und Datenverarbeitung (IITB), Karlsruhe, Germany
•Stanford University, Clifford Nass, USA
•CMU, Carnegie Mellon University, USA
Challenge
•The objective
  –to create environments in which computers serve humans who focus on interacting with other humans as opposed to having to attend to and being preoccupied with the machines themselves.
  –Instead of computers operating in an isolated manner, and Humans in the loop of computers, we will put Computers in the Human Interaction Loop (CHIL).
•Computer Services
  –models of humans and the state of their activities and intentions. Based on the understanding of the human perceptual context, CHIL computers are enabled to provide helpful assistance implicitly, requiring a minimum of human attention or interruptions
CHIL - Services
•Memory Jog (MJ).
  –It helps the attendees by providing information related to the development of the event (meeting/lecture) and to the participants. MJ provides context- and content-aware information pull and push, both personalized and public.
•Attention Cockpit (AC).
  –AC monitors the attention and interest level of participants, supporting individuals who want more or less involvement in the discussion. It can also inform the Socially-Supportive Workspaces about the attentional state of the participants.
•Connector (Connector).
  –Context-aware connecting services ensure that two parties are connected with each other at the right place, time and by the best media, when it is most appropriate and desirable for both parties to be connected.
Sound design for speech interfaces
•Development of a system for on-call building caretakers
•What does a building-caretaker domain look like?
  –Complicated! A basement with many rooms
    •water taps
    •electrical cabinets
    •ventilation systems
Sound design for speech interfaces
[Floor plan of the basement: Corridor B1; Main tap room D3; Hot water room B4; Main fan room A1; Section air room C2; Main electrical central B2; Electrical room D4; Fuse room B3; Corridor D1; Corridor A3; Section water room C1; Water room A2; Air cooling central D2; Electricity meter central C0; Corridor C4; Ventilation room C3]
Complicated domain!
Sound design for speech interfaces
•The system is the user's "eyes and hands", e.g.:
S> You are in corridor C4. From here you can go to the following rooms: electrical room D3, section water room C3. What do you want to do?
A> go to section water room C3
S> You are in section water room C3. From here you can go to the following rooms: corridor C4, electricity meter central C2, ventilation room E3. What do you want to do?
A> look at the water taps
S> You are looking at the section water taps. The following taps are open: tap 2, tap 7. The rest of the taps are closed. What do you want to do?
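The example dialog behaves like a small state machine over a room graph. The sketch below reproduces only the two rooms and the two open taps mentioned in that dialog, with room names translated to English. The `CaretakerDialog` class, the command phrasings and the fallback replies are hypothetical, not the actual system.

```python
# Room graph and tap states taken from the example dialog only.
ROOMS = {
    "corridor C4": ["electrical room D3", "section water room C3"],
    "section water room C3": [
        "corridor C4",
        "electricity meter central C2",
        "ventilation room E3",
    ],
}
OPEN_TAPS = {"section water room C3": [2, 7]}

PROMPT = " What do you want to do?"


class CaretakerDialog:
    """Keeps track of where the user is and answers simple commands."""

    def __init__(self, start: str = "corridor C4") -> None:
        self.location = start

    def describe(self) -> str:
        exits = ROOMS.get(self.location, [])
        text = f"You are in {self.location}."
        if exits:
            text += " From here you can go to the following rooms: "
            text += ", ".join(exits) + "."
        return text + PROMPT

    def handle(self, command: str) -> str:
        command = command.strip()
        if command.startswith("go to "):
            target = command[len("go to "):]
            if target in ROOMS.get(self.location, []):
                self.location = target
                return self.describe()
            return f"You cannot go to {target} from here." + PROMPT
        if command == "look at the water taps":
            taps = OPEN_TAPS.get(self.location)
            if taps is None:
                return "There are no water taps here." + PROMPT
            listed = ", ".join(f"tap {n}" for n in taps)
            return (
                f"The following taps are open: {listed}. "
                "The rest of the taps are closed." + PROMPT
            )
        return "I do not understand." + PROMPT
```

Calling `CaretakerDialog().handle("go to section water room C3")` moves the user and returns the description of the new room, which mirrors the second system turn in the dialog.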
Sound design for speech interfaces
•How can sound be used as navigation support?
•Auditory icons
  –two types occur in two different roles
    •background sounds (context providers)
    •feedback on commands
  –[example dialog without sound]
  –[example dialog with sound]
The End