Sports Scores Speech Recognition System Major League Baseball Score System


Development Team Members

Dan Corkum (Director)
Jason NguyenTrieu
Dan Ragland (Producer)
Quang Vu
Andrew Wagner

Sponsor: Jim Larson, Intel Corporation

Goals & Objectives

Develop a compelling Speech Recognition Application for Retrieval of Sports Information.

Incorporate Ease of Use Techniques including: Tapered Prompts, Global Commands, Barge-In, Repair Dialogs, and others.

Develop an Architecture that is both Robust and Modular. Design for Reuse.

Core Modules

“Web Viking” – Parses Internet web pages to retrieve sports information.

Data Warehousing & Querying – Database for storage of searchable information.

Client and Server Communication – Enables communication between Server and remote Clients.

VUI (Voice User Interface) Voice Prompts and Response System – The core engine that controls the entire VUI.

Dialog Database – Contains the content for the text-to-speech prompts and response criteria.

Web Viking

The purpose of the Web Viking is to retrieve data from web sites, then parse and format it so that the database interface can understand it.

There are three data collection scripts: Schedule, Scores, and Standing/Ranking.

The data comes from two sources:
– Major League Baseball
– ESPN

Two chances to get the right data:
– First, we get data from the MLB web site and parse it. If that fails for any reason, we try to get the data from the ESPN web site.
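The two-source fallback can be sketched as follows. This is a minimal Python illustration, not the project's Perl scripts: `fetch_with_fallback`, `fetch_mlb`, `fetch_espn`, `parse_score`, and the sample page text are all hypothetical stand-ins, and the fetchers return canned data instead of contacting any web site.

```python
from typing import Callable


def fetch_with_fallback(
    sources: list[tuple[str, Callable[[], str]]],
    parse: Callable[[str], dict],
) -> tuple[str, dict]:
    """Try each (name, fetch) source in order; return the first that fetches and parses."""
    errors = []
    for name, fetch in sources:
        try:
            return name, parse(fetch())
        except Exception as exc:  # any failure moves on to the next source
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))


# Hypothetical stand-ins for the real download step.
def fetch_mlb() -> str:
    raise ConnectionError("MLB site unreachable")


def fetch_espn() -> str:
    return "Yankees 5, Red Sox 3"


def parse_score(page: str) -> dict:
    """Parse the made-up 'Team N, Team N' format used in this sketch."""
    scores = {}
    for part in page.split(", "):
        team, runs = part.rsplit(" ", 1)
        scores[team] = int(runs)
    return scores


source, scores = fetch_with_fallback(
    [("MLB", fetch_mlb), ("ESPN", fetch_espn)], parse_score
)
print(source, scores)
```

Because a parse failure is treated the same as a download failure, a changed page layout at the first source also triggers the fallback.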

Web Viking

How the data is retrieved:
– We used the library functions available from CPAN (the Comprehensive Perl Archive Network).
– The HTTP::Request module packages up the URL request.
– The HTTP::Response module handles the data coming back.

How the data is parsed:
1. Match and strip off unnecessary data.
2. Regular expression
3. Split
4. Format data and check result.
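The four parsing steps might look roughly like this. The sketch is in Python rather than the Perl the project used, and the HTML layout, the `parse_scores` function, and the field names are assumptions made up for illustration; the real MLB and ESPN pages are not reproduced here.

```python
import re

# Hypothetical page fragment; the real site markup is not known from the slides.
SAMPLE = (
    "<html><h1>Scores</h1><table>"
    "<tr><td>Yankees</td><td>5</td><td>Red Sox</td><td>3</td></tr>"
    "</table></html>"
)


def parse_scores(html: str) -> list[dict]:
    # 1. Match and strip off unnecessary data: keep only the table region.
    start = html.find("<table")
    end = html.find("</table>")
    if start == -1 or end == -1:
        raise ValueError("no score table found")
    table = html[start:end]

    # 2. Regular expression: pull out the contents of each row.
    rows = re.findall(r"<tr>(.*?)</tr>", table, flags=re.S)

    games = []
    for row in rows:
        # 3. Split the row into its cells.
        cells = [c.strip() for c in re.sub(r"</?td>", "|", row).split("|") if c.strip()]

        # 4. Format the data and check the result.
        if len(cells) != 4 or not (cells[1].isdigit() and cells[3].isdigit()):
            raise ValueError(f"unexpected row: {row!r}")
        games.append(
            {
                "team_a": cells[0],
                "score_a": int(cells[1]),
                "team_b": cells[2],
                "score_b": int(cells[3]),
            }
        )
    return games


print(parse_scores(SAMPLE))
```

Raising an error in step 4 is what would let a caller fall back to the second data source when a page no longer matches the expected layout.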

Data Warehousing & Querying

The Database was implemented using MS Access.

It stores team names, the scores associated with each team, league/division ranking information, and the schedule for each game.

The Database Handler was written in Java. Its primary purpose is to query the database and return the results to the sports score server.
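As a rough illustration of what a database handler of this kind does, here is a sketch that swaps in Python's built-in `sqlite3` module for MS Access and Java. The table layout, column names, and the functions `open_demo_db` and `score_for_team` are assumptions; the slides do not describe the actual schema.

```python
import sqlite3

COLUMNS = ("team", "team_score", "opponent", "opponent_score")


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical game result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE scores ("
        "team TEXT, team_score INTEGER, opponent TEXT, opponent_score INTEGER)"
    )
    conn.execute("INSERT INTO scores VALUES (?, ?, ?, ?)", ("Yankees", 5, "Red Sox", 3))
    conn.commit()
    return conn


def score_for_team(conn: sqlite3.Connection, team: str):
    """Return one stored result for the team as a dict, or None if there is none."""
    row = conn.execute(
        "SELECT team, team_score, opponent, opponent_score FROM scores WHERE team = ?",
        (team,),
    ).fetchone()
    return None if row is None else dict(zip(COLUMNS, row))


conn = open_demo_db()
print(score_for_team(conn, "Yankees"))
```

The query uses a bound parameter (`?`) rather than string concatenation, which matters when the team name originates from recognized speech.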

Client & Server Communication

danC stuff

Client & Server Communication

danC #2

VUI (Voice User Interface) Voice Prompts and Response System

User Interface and Underlying Logic

Design Considerations

Two Options For Design:
1. Dialog logic written directly into the program code.
2. Dialog logic entered into a data structure and presented by separate internal logic.

Advantages & Disadvantages of Hard-Coded Dialogs

Advantages:
– Fast initial implementation
– Ultimate flexibility of features

Disadvantages:
– Duplicated code
– Difficult to provide consistent global functionality
– Hard-coded grammars

Advantages & Disadvantages of Dialog Database

Advantages:
– Good design: Data separated from presentation
– Consolidation of code
– Easy to create and maintain dialogs
– Features aided by use of recursion
– Computer-generated grammars

Disadvantages:
– Much work required before any results seen
– Difficult to customize specific components

Decision: Dialog Database

Sports Score dialogs all follow the same basic pattern

Implementation could be modularized by separating the dialogs from their presentation logic

The gains made by the ease of entry and flexibility for the end-user outweighed the losses in implementation time

Some features require recursion

VUI Infrastructure Design

Features
– Tapered, User-Level Sensitive Prompts
– Tapered, User-Level Sensitive Help
– Barge-In capability
– User shortcut capability (users can answer future prompts from any prompt)
– Navigational user commands (“back”, “quit”, etc.)
– Enumerated user commands to allow the user to say a number as an alternative to the command

Dialog Components

Prompt
– Point of user interaction
– Has associated Prompt Text levels, including the text to be read, the user level for which it is to be read, and the number of visits before the next user level is used
– Has associated Commands, or phrases the user is allowed to say and the actions to take
– Has a parameter name to be used in a query

Dialog Components (cont)

Commands
– The text the user says to access the command
– The text that will be returned when this command is accessed (used in a query)
– A flag to indicate whether or not the command is to be enumerated
– The action the system is to take when the command is accessed

Dialog Components (cont)

Scripts
– Series of prompts to be called in succession

Script Steps
– Individual prompts belonging to a script
– Each contains its own grammar (reflecting shortcuts available later in the script)
– Each contains a flag indicating whether or not a query will be performed following the step
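The dialog components described on these slides could be modeled along the following lines. This is only a sketch in Python: every class and field name is an assumption chosen to mirror the slide wording, the `action` strings are placeholders, and numbering enumerated commands in list order is a guess at how enumeration works.

```python
from dataclasses import dataclass, field


@dataclass
class PromptText:
    text: str                      # text to be read
    user_level: int                # user level for which it is read
    visits_before_next_level: int  # visits before the next user level is used


@dataclass
class Command:
    phrase: str        # text the user says to access the command
    return_value: str  # text returned when accessed (used in a query)
    enumerated: bool   # may the user say a number instead?
    action: str        # action the system takes, e.g. "next" or "quit"


@dataclass
class Prompt:
    parameter_name: str
    texts: list[PromptText] = field(default_factory=list)
    commands: list[Command] = field(default_factory=list)

    def spoken_forms(self) -> dict[str, Command]:
        """Map each accepted utterance to its command, adding numbers for enumerated ones."""
        forms = {}
        number = 0
        for command in self.commands:
            forms[command.phrase.lower()] = command
            if command.enumerated:
                number += 1
                forms[str(number)] = command
        return forms


@dataclass
class ScriptStep:
    prompt: Prompt
    query_after: bool  # run a query after this step?


@dataclass
class Script:
    steps: list[ScriptStep] = field(default_factory=list)


menu = Prompt(
    parameter_name="Function",
    commands=[
        Command("score", "Score", True, "next"),
        Command("ranking", "Ranking", True, "next"),
        Command("quit", "", False, "quit"),
    ],
)
print(sorted(menu.spoken_forms()))
```

Keeping these records as plain data is what allows dialogs to be edited in a database without touching the presentation logic.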

How A Prompt Works

Shortcuts take place when a user answers multiple prompts in a row, so the first thing a prompt does is check for overflow from the last prompt. If there is overflow, jump ahead to the command processing. Otherwise, cycle through the following:
– Find the appropriate prompt text to be read to the user based on user level and number of visits.
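Choosing a tapered prompt text from the user level and the visit count might work as sketched below. The selection rule is an assumption inferred from the slide wording (start at the user's level, stay on each level for its configured number of visits, then move to the next); the function name, the `PromptText` record, and the prompt wording are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PromptText:
    text: str
    user_level: int
    visits_before_next_level: int


def select_prompt_text(texts: list[PromptText], user_level: int, visits: int) -> PromptText:
    """Pick the text for a user at `user_level` who has visited this prompt `visits` times before."""
    candidates = sorted(
        (t for t in texts if t.user_level >= user_level), key=lambda t: t.user_level
    )
    if not candidates:
        raise ValueError("no prompt text for this user level")
    remaining = visits
    for candidate in candidates[:-1]:
        if remaining < candidate.visits_before_next_level:
            return candidate
        remaining -= candidate.visits_before_next_level
    return candidates[-1]  # the tersest text is used once all others are exhausted


# Made-up prompt wording, from most verbose to most terse.
TEXTS = [
    PromptText("Welcome. Would you like scores, rankings, or schedules?", 1, 2),
    PromptText("Scores, rankings, or schedules?", 2, 3),
    PromptText("Which service?", 3, 0),
]

print(select_prompt_text(TEXTS, user_level=1, visits=2).text)
```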

How A Prompt Works (cont)

– If the user requires help, find the appropriate help text to be read
– Begin the reading of the help and prompt text to the user
– At the same time, begin listening for a user response (if the user responds while it is reading, interrupt the reading)
– When the computer finishes reading, begin timing. After five seconds with no user speaking, time-out.

How A Prompt Works (cont)

– Attempt to match what the user said to a command that is available at this prompt.

Match using the longest available command, so if the user said “New York Yankees”, match “New York Yankees,” not “New York”.

Any portion of what the user said that was not matched (if anything was matched at all) gets sent to the subsequent prompts for processing. Example: The user said “score New York” in the first prompt. If the prompt matches “score”, “New York” will get passed to any following prompts.
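Longest-command matching with overflow can be sketched as below. This is a Python illustration under two assumptions not stated on the slides: the command is expected at the start of the utterance, and comparison is case-insensitive and word-based (so the leftover text comes back lowercased). `match_command` is a hypothetical name.

```python
def match_command(utterance: str, commands: list[str]):
    """Return (matched command, leftover text), preferring the longest matching command.

    Returns (None, "") when nothing matches, so no overflow is passed on.
    """
    words = utterance.lower().split()
    best_command = None
    best_length = 0
    for command in commands:
        command_words = command.lower().split()
        matches = words[: len(command_words)] == command_words
        if matches and len(command_words) > best_length:
            best_command = command
            best_length = len(command_words)
    if best_command is None:
        return None, ""
    return best_command, " ".join(words[best_length:])


print(match_command("New York Yankees", ["New York", "New York Yankees"]))
print(match_command("score New York", ["score", "ranking", "schedule"]))
```

In the second call the leftover `"new york"` is the overflow that the next prompt would receive instead of asking the user again.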

How A Prompt Works (cont)

When a command is matched, the command’s return value is attached to the parameter name of the prompt

The action that is then performed is dictated by the command. Some possibilities are:
– Calling another script/prompt and returning all values
– Calling another script/prompt and returning only those values
– Repeating the prompt and reading help to the user
– Changing the user level
– Running a query and repeating or calling another prompt/script

How A Script Works

A script is presented simply by presenting the first script step in the script

A script step presents its associated prompt, using its own grammar (reflecting the ability of the user to shortcut to the next script step)

After the script step is executed, a query may be performed and the next script step (if any) may be performed
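A script's step-by-step flow, including shortcuts, might be sketched like this. The Python below is a simplification under stated assumptions: steps are plain tuples of (parameter name, commands, query flag), user replies come from a list instead of a speech recognizer, an unmatched reply simply re-asks the prompt, and `match_prefix`, `run_script`, and `run_query` are hypothetical names.

```python
def match_prefix(utterance: str, commands: list[str]):
    """Longest command that starts the utterance; returns (command, leftover text)."""
    text = utterance.lower().strip()
    for command in sorted(commands, key=len, reverse=True):
        candidate = command.lower()
        if text == candidate or text.startswith(candidate + " "):
            return command, text[len(candidate):].strip()
    return None, ""


def run_script(steps, replies, run_query):
    """Present each step in order, using overflow from the previous step before asking again."""
    params = {}
    results = []
    overflow = ""
    replies = iter(replies)
    for parameter_name, commands, query_after in steps:
        while True:
            heard = overflow or next(replies)  # a shortcut answer skips the prompt
            command, overflow = match_prefix(heard, commands)
            if command is not None:
                break  # otherwise repeat the prompt with the next reply
        params[parameter_name] = command
        if query_after:
            results.append(run_query(dict(params)))
    return results


STEPS = [
    ("Function", ["score", "ranking", "schedule"], False),
    ("Team", ["New York Yankees", "New York Mets", "Boston Red Sox"], True),
]

# One utterance answers both prompts, so the second prompt is never read aloud.
print(run_script(STEPS, ["score New York Yankees"], run_query=lambda p: p))
```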

Other Dialog Routines

Components are also involved in routines to build grammars acceptable to Microsoft’s SAPI interface
– The dialog structure is descended recursively, with all dependent grammars being included in each prompt’s grammar
– Global commands are also created and added to grammars
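The recursive descent over the dialog structure can be illustrated as follows. This Python sketch only collects the phrases a prompt's grammar would need; it does not produce SAPI's actual grammar format, and the dictionary layout, the function names, and the sample dialog are assumptions.

```python
def collect_phrases(dialog: dict, prompt_name: str, seen=None) -> list[str]:
    """Gather a prompt's own commands plus those of every prompt that can follow it."""
    if seen is None:
        seen = set()
    if prompt_name in seen:  # guard against dialogs that loop back on themselves
        return []
    seen.add(prompt_name)
    node = dialog[prompt_name]
    phrases = list(node["commands"])
    for dependent in node["next"]:
        for phrase in collect_phrases(dialog, dependent, seen):
            if phrase not in phrases:
                phrases.append(phrase)
    return phrases


def build_grammar(dialog: dict, prompt_name: str, global_commands: list[str]) -> list[str]:
    """Phrase list for one prompt: dependent grammars first, then global commands."""
    phrases = collect_phrases(dialog, prompt_name)
    return phrases + [g for g in global_commands if g not in phrases]


# Hypothetical two-prompt dialog: choose a function, then a team.
DIALOG = {
    "function": {"commands": ["score", "ranking", "schedule"], "next": ["team"]},
    "team": {"commands": ["Yankees", "Mets"], "next": []},
}

print(build_grammar(DIALOG, "function", ["back", "quit"]))
```

Including the team names in the first prompt's grammar is what makes a shortcut such as saying the function and the team together recognizable at that prompt.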

Queries

All query parameters are accumulated in an XML document

When a query occurs, the document is sent to the server

The server returns an XML document containing results

The results are read to the user based on administrator-defined result strings
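The XML exchange might look like the sketch below, written with Python's standard `xml.etree.ElementTree` rather than the XML4J parser the project used. The element names `query`, `param`, and `result`, and both function names, are assumptions; the slides do not show the actual document layout.

```python
import xml.etree.ElementTree as ET


def build_query(params: dict) -> str:
    """Accumulate query parameters into one XML document (returned as a string)."""
    root = ET.Element("query")
    for name, value in params.items():
        ET.SubElement(root, "param", name=name).text = value
    return ET.tostring(root, encoding="unicode")


def parse_result(xml_text: str) -> dict:
    """Turn the server's result document into a flat name-to-value mapping."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "") for child in root}


request = build_query({"Function": "Score", "Team": "Yankees"})
print(request)

# A made-up reply of the kind a server could send back.
reply = (
    "<result><TeamA>Yankees</TeamA><ScoreA>5</ScoreA>"
    "<TeamB>Red Sox</TeamB><ScoreB>3</ScoreB></result>"
)
print(parse_result(reply))
```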

Why XML?

XML is fast becoming the industry standard for data transfer over the Internet

XML’s hierarchical structure lends itself to this application

Several XML parsers already exist for various platforms (we used IBM’s XML4J)

The HTML-like nature of XML makes results easy to read, even for a human.

How Query Results Are Read

The administrator defines parameter-value pairs as criteria for which response is read

Each response consists of segments of literal text along with parameter values (which can be drawn either from the client or server)

Query Results Example

Criteria
– Function = “Score”
– Team = “Yankees”

To be read
– “The Yankees score was”
– <TeamA>
– <ScoreA>
– “To”
– <TeamB>
– <ScoreB>
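The example above can be sketched in Python as follows. The representation is an assumption: each response is a pair of criteria and segments, a segment is either literal text or the name of a parameter, and server values take precedence when a name exists on both sides. `render_response` and the sample scores are hypothetical.

```python
def render_response(responses, client_params: dict, server_results: dict):
    """Return the spoken string for the first response whose criteria all match, else None."""
    values = {**client_params, **server_results}
    for criteria, segments in responses:
        if all(values.get(name) == wanted for name, wanted in criteria.items()):
            parts = []
            for kind, content in segments:
                # "text" segments are read literally; "param" segments are looked up.
                parts.append(content if kind == "text" else str(values[content]))
            return " ".join(parts)
    return None


RESPONSES = [
    (
        {"Function": "Score", "Team": "Yankees"},
        [
            ("text", "The Yankees score was"),
            ("param", "TeamA"),
            ("param", "ScoreA"),
            ("text", "to"),
            ("param", "TeamB"),
            ("param", "ScoreB"),
        ],
    )
]

spoken = render_response(
    RESPONSES,
    client_params={"Function": "Score", "Team": "Yankees"},
    server_results={"TeamA": "Yankees", "ScoreA": "5", "TeamB": "Red Sox", "ScoreB": "3"},
)
print(spoken)
```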

The Results

The front-end is very customizable

Dialogs can be built simply and quickly

The system administrator needs no knowledge of programming concepts

The overall behavior of the system could be changed without changing each prompt

The computer speech engine is accessed in only one area of code, so it could be swapped with minimal effort

Dialog Structure

The Dialog System consists of:
– Prompts
– Responses
– Help System

All Dialogs are tapered (Prompts, Responses, & Help)

Repair Dialogs – Example: Two teams from same city (New York Mets and Yankees)

Dialog Structure Overview

[Diagram. Node labels: Main Menu; Score Info.; Ranking Info.; Scheduling Info.; Score; Rank; Schedule; Info by League; HELP (four nodes).]

Summary

We not only developed a powerful Speech Recognition Application for Retrieval of Sports Information, we also developed a reusable framework which can be easily modified for use in other applications.

We incorporated Ease of Use Techniques including: Tapered Prompts, Global Commands, Barge-In, Repair Dialogs, and others.