21
Evaluation of Multi-user System of Voice Interaction Using Grammars Elizabete Munzlinger, Fabricio da Silva Soares, and Carlos Henrique Quartucci Forster {bety, p2p, forster}@ita.br ITA – Instituto Tecnológico de Aeronáutica EEC-I – Engenharia Eletrônica e Computação – Informática Divisão de Ciência da Computação

Evaluation of multi user system of voice interaction using grammars(slide share)

Embed Size (px)

Citation preview

Page 1: Evaluation of multi user system of voice interaction using grammars(slide share)

Evaluation of Multi-user System of Voice Interaction

Using Grammars

Elizabete Munzlinger, Fabricio da Silva Soares,

and Carlos Henrique Quartucci Forster{bety, p2p, forster}@ita.br

ITA – Instituto Tecnológico de AeronáuticaEEC-I – Engenharia Eletrônica e Computação –

InformáticaDivisão de Ciência da Computação

Page 2: Evaluation of multi user system of voice interaction using grammars(slide share)

Agend Introduction Grammar Design Tests and Results of Accuracy Conclusion

Page 3: Evaluation of multi user system of voice interaction using grammars(slide share)

Introduction Exaustive training

Fig. 1. Train the system to recognize one’s voice through the exhaustive reading of texts

Page 4: Evaluation of multi user system of voice interaction using grammars(slide share)

Several contexts

Introduction

Fig. 2. Systems which particular application and contexts

Page 5: Evaluation of multi user system of voice interaction using grammars(slide share)

1on

Port [4], Action [true]

Introduction Domotic system

Por favor, ligue a

lâmpada!

Fig. 3. Prototype of Domotic system

Page 6: Evaluation of multi user system of voice interaction using grammars(slide share)

Domotic system

Introduction

Fig. 3. Prototype of Domotic system

Page 7: Evaluation of multi user system of voice interaction using grammars(slide share)

Grammar Design Grammar tree Main

Rule2 Rule3Rule1

Rule1

Rule4

Rule5

Rule6

Terminal symbols

Terminal symbols

Terminal symbols

Rule8

Rule7

... ... Terminal symbols

Fig. 4. The grammar tree composed by nodes

Page 8: Evaluation of multi user system of voice interaction using grammars(slide share)

Grammar Design Grammar in Java Speech Grammar

Formatgrammar br.ita.domovox;public <command> = [<introdução>] <action> [<complemento>] <object> [<complemento>] [<conclusão>];<introdução> = [<educação>] [<complemento>] [<quem>];<action> = <ação>;<complemento> = [<posse>] [<outros>] [<onde>] [<tempo>] [<educação>] [<outros>];<object> = [<indica>] [<posse>] <dispositivo>;<conclusão> = [<introdução>];<educação> = [<outros>] [<tratamento>] [<sistema>] [<tratamento>] [<complemento>];<quem> = [<sujeito>] [<desejo>];<posse> = [<outros>] [<possessivo>] [<outros>] [<sujeito>] [<outros>];<onde> = [<lugar>] | [<outros>];<tempo> = [<quando>] | [<outros>];<tratamento> = por favor | faz favor | por gentileza | por obséquio | faça a gentileza | faça o favor | fazer o favor | fazer a gentileza;<sistema> = pc | computador | notebook | máquina | sistema | domovox | sistema domovox | sistema de voz | sistema de fala | meu | cara | bicho | mano | maluco;<sujeito> = eu | tu | ele | ela | nós | vós | eles | elas | você | vocês | mim | gente;<desejo> = [<querer>] | [<desejar>] | [<precisar>] | [<necessitar>] | [<ir>] | [<poder>];<querer> = quero | queres | quer | queremos | quereis | querem | querendo;<desejar> = desejo | desejas | deseja | desejamos | desejais | desejam | desejando; <precisar> = preciso | precisas | precisa | precisamos | precisais | precisam | precisando;<necessitar> = necessito | necessitas | necessita | necessitamos | necessitais | necessitam | necessitando;<ir> = vou | vais | vai | vamos | vão;<poder> = pode | podes;<ação> = <verdadeiro> | <falso>;<verdadeiro> = (ligar | ligue | ativar | ative | ascender | ascenda) {true};<falso> = (desligar | desligue | desativar | desative | apagar | apague) {false} <indica>= [<artigo>] | [<indicação>];<artigo> = o | a | os | as;<indicação> = esse | essa | este | esta | aquele | aquela | aquilo | todos | todos os | todas as | tudo;<dispositivo> = <porta00> | <porta01> | <porta02> | <porta03> | <porta04> | <porta05>;<porta00> = (tudo | dispositivos | aparelhos) {0};<porta01> = (luz | lâmpada) {1};<porta02> = (ventilador | aparelho ventilador) {2};<porta03> = (tv | tevê | televisão | televisor | aparelho de tv | aparelho televisor) {3};<porta04> = (abajur | luminária | candelabro) {4};<porta05> = (outros) {5};<quando> = já | agora | nesse momento | nesse minuto | nesse segundo | agora mesmo;<lugar> = aqui | aí | lá | ambiente | quarto | sala | peça | lugar | casa | apartamento | ap;<possessivo> = meu | minha | meus | minhas | nosso | nossa | nossos | nossas | vosso | vossa | vossos | vossas | dele | dela | deles | delas | desse | dessa | desses | dessas | nesse | nessa | nesses | nessas;<outros> = que | da | de | do | mesmo | para | pra | momento | mandando | também | inclusive | estou | aí | ô | é | ã | hum | mas | pode;

Page 9: Evaluation of multi user system of voice interaction using grammars(slide share)

Grammar Design Computational

resources

980MB100%

0

200

400

600

800

1000

0 min 0,5 min 1,0 min 1,5 min 2,0 min 2,5 min 3,0 min

CPU

Memory

Graph. 1. Graphic of allocation and processing of the structure of the grammar

100%

50%

Page 10: Evaluation of multi user system of voice interaction using grammars(slide share)

Grammar Design Redesign of grammar

Comando

Ação ObjetoComplemento*

FalsoVerdadeiro

Porta 01

Porta 02 ...

ligar, ativar, acender...

desligar, desativar, apagar ...

1, luz, lâmpada...

Por favor, eu você, sistema, do, preciso, meu, pode, de, quarto, a, o...

2, TV, televisor, televisão...

Fig. 5. The new grammar tree

Page 11: Evaluation of multi user system of voice interaction using grammars(slide share)

423MB

5%0

200

400

600

800

1000

0 min 0,5 min 1,0 min 1,5 min 2,0 min 2,5 min 3,0 min

CPU

Memory

Grammar Design

100%

50%

Graph. 2. Graphic of allocation and processing of the structure of the grammar

Computational resources

Page 12: Evaluation of multi user system of voice interaction using grammars(slide share)

Representation of grammar

Grammar Design

Fig. 6. Grammar represented through a state machine with a recursivity rule

Page 13: Evaluation of multi user system of voice interaction using grammars(slide share)

Grammar Design Accepted commands

Table 1. Examples of simple and complex commands based in the rules of grammar

Page 14: Evaluation of multi user system of voice interaction using grammars(slide share)

Tests and Results of Accuracy Domotic system

Por favor, ligue a

lâmpada!

Fig. 7. Comparison between registered spoken words and the log system

registered

logged

Page 15: Evaluation of multi user system of voice interaction using grammars(slide share)

Tests and Results of Accuracy General rates of acceptation

98%

85,70%

24,10%

0%10%20%30%40%50%60%70%80%90%

100%

All commands (simples and complex)

Accepted withoutlog analysis

Disregardingdefinite articles

Exactly commandswith log analysis

Graph. 3. Rates of acceptation of all commands

Page 16: Evaluation of multi user system of voice interaction using grammars(slide share)

Tests and Results of Accuracy Rates of acceptation by simple and

complex commands

10,98,9

35,333,0

0%

5%

10%

15%

20%

25%

30%

35%

40%

Simple commands Complex commands

Definite articlesaccepted

Definite articesright

Graph. 4. Rates of definite articles acceptation by simple and complex commands

Page 17: Evaluation of multi user system of voice interaction using grammars(slide share)

Tests and Results of Accuracy Rates of acceptation by numbers from 1 to

32

33,20%

66,80%

0%

34%

0%

10%

20%

30%

40%

50%

60%

70%

Number from 1 to 32

Word form

Numeral form

Just word form

Just numeral form (6, 7,14, 19, 23, 24, 25, 26, 28,29, 32)

Graph. 5. Rates of acceptation by numbers from 1 to 32

Page 18: Evaluation of multi user system of voice interaction using grammars(slide share)

Tests and Results of Accuracy Rates of errors in the numbers recognition

Highest rates of error: 21, 27 and 31

Mistook words with similar sound: 21 “20 eu” 31 “30 aí eu” “30 aí vou”

“30 aí o” “30 aí os”“30 aqui os” “30 aqui eu”“30 eu” “30 em”This happened in 70% of the cases

Page 19: Evaluation of multi user system of voice interaction using grammars(slide share)

Conclusion Behavior of a voice interface system Design of grammar Experiments with users Redesign of grammar

Page 20: Evaluation of multi user system of voice interaction using grammars(slide share)

References1. Burstein, A., Stolzle, A., Brodersen, R.W.: Using

Speech Recognition in a Personal Communications System. In: Communications, 1992. ICC 92, Conference record, SUPERCOMM/ICC ’92, IEEE, Los Alamitos (1992)

2. Pfaff, G.E.: User Interface Management Systems, p. 72. Springer, New York (1985)

3. Seneff, S.: TINA: A Natural Language System for Spoken Language Applications. Comput. Linguist. 18, 61–86 (1992)

4. Sun Microsystems Ltd, Java Speech API Programmer’s Guide Version 1.0, [online at], http://java.sun.com/products/javamedia/speech/

5. Vieira, R., Lima, V.L.: Lingüística Computacional: Princípios e Aplicações. In: JAIA – ENIA, 2001, Fortaleza (2001)

Page 21: Evaluation of multi user system of voice interaction using grammars(slide share)

Evaluation of Multi-user System of Voice Interaction

Using Grammars

Elizabete Munzlinger, Fabricio da Silva Soares,

and Carlos Henrique Quartucci Forster{bety, p2p, forster}@ita.br

ITA – Instituto Tecnológico de AeronáuticaEEC-I – Engenharia Eletrônica e Computação –

InformáticaDivisão de Ciência da Computação