Telecom Innovators’ Web Seminar Series

Real-World Experience Adding Speech to IVR Solutions with MRCP

A webinar by NMS, ScanSoft and Capital One


Agenda

Introduction to speech technology
  • Dr. Rob Kassel, Senior Product Manager, ScanSoft, Inc.

MRCP and Natural Access
  • Jack Chase, Director, Product Marketing, NMS

MRCP integration on the TelBert IVR Platform using NMS and ScanSoft
  • Eric Cunningham, Enterprise Architect, Capital One


Introduction to Speech Technology

Rob Kassel
Senior Product Manager
ScanSoft


The Need For Speech Recognition

Automation less costly than live agents
  • Increases call handling capacity / reduces hold times

DTMF often is pressed into service
  • Numeric entry is easy… unless you are reading
  • Spelling entry is more difficult
  • Menus need to be enumerated, can’t be too long
  • Deep menu structure becomes tiresome
  • Assignment inconsistent between vendors (e.g., voicemail)
  • How do you enter “5 ½%” or “Albuquerque”?

With speech, questions are answered naturally
  • Caller satisfaction is higher
  • Fewer zero-outs lead to additional cost savings


Speech Recognition Process

[Diagram. Labels: Speech (input), Speech Detector, Feature Extraction, Phoneme Classifier, Acoustic Models, Search, Grammar, Grammar Compiler, System Dictionary, Pronunciation Rules, Confidence Scoring, Results (output)]


Speech Recognition Challenges

Speech can be difficult to decode, even for humans
  • Fixed, confusable vocabularies: “B-C-D-E-G-P-T-V-Z”
  • Ambiguous boundaries: “It’s hard to wreck a nice beach!”

Speaker variability: dialect, volume, gender, etc.

Noise rejection: hands-free, mobile, telematics

Out-of-vocabulary rejection & confidence measures

Processor and memory demands


Speech Recognition: State of the Art

Callers speak naturally in directed dialogs

Million-word vocabularies: stocks, names, addresses

Open-ended responses, coupled with language understanding: “How may I help you?”

High accuracy, infrequent confirmation

Transaction completion rate over 90% is typical

Automatically adapt to caller population and channel characteristics


The Need For Text-To-Speech

Professional recordings costly and time-consuming

Large output vocabularies common (e.g. city names)

Word concatenation is difficult to do well
  • Often used for numeric output
  • Can sound mechanical; irritating when frequent

Some applications defy recordings (e.g. messaging)


Text-To-Speech Process

[Diagram. Labels: Text (input), Text Normalization, Pronunciation Generation, Prosody Generation, Unit Selection, Concatenate and Smooth, Speech (output); System Dictionary, Pronunciation Rules, Voice Database]


Text-To-Speech Challenges

Text Normalization
  • Numerics: “12535” (number / zip code), “2x4”
  • Abbreviations: “OR” (or / Oregon), “Dr. Jones on Elm Dr.”
  • Acronyms: “IBM is listed on NASDAQ”
  • Evolving usage: “CUL8R”

Pronunciation Generation
  • Homographs: “minute” (60 seconds / tiny)
  • Vowel reduction: “he came to town” vs. “he came to”

Prosody Generation
  • Phrasing: “he is physically and mentally exhausted”
  • Emphasis: “Are you flying tomorrow?”
  • Emotion: upbeat vs. serious, calming vs. urgency
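
The abbreviation case above can be made concrete with a small sketch. It is not how any ScanSoft product works; `expand_dr` is a hypothetical helper built on one naive rule: "Dr." followed by a capitalized word is read as "Doctor", anything else as "Drive".

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Expands "Dr." to "Doctor" when the next word is capitalized (a title before
// a name) and to "Drive" otherwise (a street suffix). A deliberately naive
// heuristic, shown only to illustrate why abbreviations need context.
std::string expand_dr(const std::string& text)
{
    const std::string abbr = "Dr.";
    std::string out;
    std::size_t pos = 0;
    while (pos < text.size()) {
        const std::size_t hit = text.find(abbr, pos);
        if (hit == std::string::npos) {
            out += text.substr(pos);  // no more abbreviations: copy the rest
            break;
        }
        out += text.substr(pos, hit - pos);
        // Look past any spaces to the first character of the following word.
        std::size_t next = hit + abbr.size();
        while (next < text.size() && text[next] == ' ') ++next;
        const bool before_name =
            next < text.size() &&
            std::isupper(static_cast<unsigned char>(text[next])) != 0;
        out += before_name ? "Doctor" : "Drive";
        pos = hit + abbr.size();
    }
    return out;
}
```

The rule handles “Dr. Jones on Elm Dr.” but misreads a street name that is followed by a new sentence (“Elm Dr. Then stop”), which is why real text normalization needs more context than a single look-ahead.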


Text-to-Speech: State of the Art

Natural sounding output, no more “drunken Swede”

Seamlessly mix dynamic data with recorded prompts

Accurate pronunciation, including proper names

A variety of voices to choose from

Custom voices to maintain brand identity

Listen here… http://www.scansoft.com/speechworks/realspeak/teleco/


MRCP and Natural Access

Jack Chase
Director, Product Marketing
NMS


What Is MRCP v1?

Speech servers are connected by VoIP to IVR servers
Standard API for ASR and TTS
Easy to reconfigure system as needs change
Easy to implement redundancy

Control: MRCP / RTSP / TCP / IP
Speech: G.711 / RTP / UDP / IP

[Diagram. Labels: PSTN, IVR Servers, IP, Speech Servers, MRCP Server]


Natural Access and MRCP

[Diagram: Natural Access architecture. Labels: IVR Services, PSTN Trunking, Fusion (VoIP), Conferencing, Fax Services, Universal Speech Access (MRCP), Video Access, Call Control, OAM, SNMP; Service Managers, Libraries; Driver (x3), IPC; CX Boards, AG Boards, CG Boards (PCI); PacketMedia HMP (IP)]


Universal Speech Access Makes Speech Integration Easy


Current Support for Universal Speech Access

Vendor    Type  Universal Speech Access 1.0   Universal Speech Access 1.1
Nuance    ASR   MRCP Server SP5, Nuance 8.5   MRCP Server SP7, Nuance 8.5
Nuance    TTS   Vocalizer 3.0                 Vocalizer 3.0.8
Telisma   ASR   Philsoft 3.2                  teliSpeech 1.0 SP4
ScanSoft  TTS   OMS 2.0.1, Speechify 2.0      SWMS 3.1, RealSpeak 4.0
ScanSoft  ASR   OMS 2.0.1, OSR 2.0            SWMS 3.1, OSR 3.0
Loquendo  ASR   N/A                           Loquendo ASR LSS 6.0


What’s Next for MRCP?

MRCP v2

draft-ietf-speechsc-mrcpv2-06, Feb 20, 2005

Adds SIP/SDP for session setup
  • Replaces RTSP

Adds support for speaker verification

Little deployment yet

NMS will update Universal Speech Access when deployments occur


MRCP Integration on the TelBert IVR Platform using NMS and ScanSoft

Eric Cunningham
Enterprise Architect
Capital One


Agenda

Why use MRCP
Main business drivers for voice enablement
Overview of architecture
Lessons learned


Why Use MRCP

Capital One has built its own IVR system (TelBert)
  • Internally built and maintained
  • Linux-based C/C++ system
  • 5000+ ports in production
  • Handles nearly 100% of all inbound credit card calls

Business wants to have speech-enabled applications

Leading speech vendors are embracing MRCP for integration
  • Centralizes automated speech recognition (ASR) and text-to-speech (TTS) resources in the network
  • Standards-based protocol, allowing multi-vendor interoperability


Why Use MRCP (cont'd)

Benefits to Capital One

MRCP allows integration with leading vendors and avoids vendor lock-in

NMS APIs simplify the learning of MRCP and RTP protocols and integration; accelerated the adoption of MRCP into TelBert

Migration from AG 4000 to CG 6000 – clean evolution

CG 6000 provides on-board Ethernet and T1 terminations; eliminates host based processing of RTP data

Current AG 4000 code compatible with CG 6000; quick upgrade to existing platform


Overview of TelBert Architecture

Where applications run. They control which grammars are used, how results are processed, and user prompting.

Where NMS libraries are integrated. A single state-machine model handles 184 ISDN callers, voice processing commands, and the new ASR/TTS commands via Universal Speech Access.

ScanSoft has its MRCP server (SWMS) co-located on the same machine as the OSR and RealSpeak servers.

Note: This means that load balancing and failover are done by TelBert, not the MRCP server.

Private network (100MB switch) to encapsulate the RTP traffic.
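
A rough sizing check for the private RTP network can be sketched as follows. Only the 184-caller figure comes from the slide. The 20 ms packetization interval and the 40 bytes of IPv4/UDP/RTP headers per packet are assumptions, Ethernet framing is ignored, and the function name is hypothetical.

```cpp
#include <cstdint>

// Estimated one-way IP-layer bandwidth, in bits per second, for `calls`
// simultaneous G.711 RTP streams.
// Assumptions (not from the slides): one packet every 20 ms, and
// 20 (IPv4) + 8 (UDP) + 12 (RTP) = 40 header bytes per packet.
std::uint64_t g711_rtp_bits_per_second(std::uint64_t calls)
{
    const std::uint64_t sample_rate_hz   = 8000;  // G.711: 8000 samples/s
    const std::uint64_t bytes_per_sample = 1;     // 8 bits per sample
    const std::uint64_t packets_per_sec  = 50;    // 1000 ms / 20 ms
    const std::uint64_t header_bytes     = 20 + 8 + 12;

    // 160 payload bytes per packet, 200 bytes per packet with headers.
    const std::uint64_t payload_bytes =
        sample_rate_hz * bytes_per_sample / packets_per_sec;
    const std::uint64_t packet_bytes = payload_bytes + header_bytes;

    return calls * packet_bytes * 8 * packets_per_sec;
}
```

Under these assumptions one stream costs 80 kbit/s, so `g711_rtp_bits_per_second(184)` gives 14,720,000 bit/s, about 14.7 Mbit/s in each direction. That fits comfortably on the private switch if the slide's "100MB" is read as 100 Mbit/s, which is itself an assumption.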


Main Business Drivers for Voice Enablement

Improve customer experience
  • Provide both touch-tone and speech-enabled handling
  • Switch between modes

Provide additional automated customer servicing
  • Automating time-consuming call center activities
  • Frees call center representatives for more complex tasks

Basically, all of the standard reasons a business wants to start using voice recognition technologies


Lessons Learned

NMS Universal Speech Access and Fusion APIs front-end the complexity of RTSP, MRCP, and RTP protocols

You still need to read the specifications to troubleshoot problems

You need to understand the specifications in order to talk to vendors you are integrating with (ScanSoft)


Lessons Learned (cont'd)

Example: NMS code

    if ((nRtn = saiCreateSynthesizer(m_cta_context_handle, m_stRtpEndpointTts,
                                     m_ob_locate.get_server(),
                                     TELBERT_CONTEXT_TTS, &m_stTtsHd)) != SUCCESS)
    {
        ……
    }

RTSP/MRCP sniffer trace (what the MRCP server sees)

Request

    SETUP rtsp://NEWBOX36/synthesizer/ RTSP/1.0
    CSeq: 7
    Transport: RTP/AVP;unicast;destination=10.87.204.8;client_port=3000-3001
    Content-Type: application/sdp
    Content-Length: 167

    v=0
    o=139112752 0 127.0.0.1
    s=nms speech
    c=IN IP4 0.0.0.0
    t=0 0
    m=audio 3000 RTP/AVP 0 96
    a=rtpmap:0 pcmu/8000
    a=rtpmap:96 telephone-event/8000

Response

    RTSP/1.0 200 OK
    CSeq: 7
    Session: RQKCRCSPWX0000000368fgJiuWPnxz
    Transport: RTP/AVP;unicast;client_port=3000-3001
    Content-Length: 215
    Content-Type: application/sdp

    v=0
    o=- RQKCRCSPWX0000000368fgJiuWPnxz RQKCRCSPWX0000000368fgJiuWPnxz IN IP4 10.87.204.36
    s=SpeechWorks OpenSpeech Media Server version 2.0
    c=IN IP4 0.0.0.0
    t=0 0
    m=audio 3000 RTP/AVP 0
    a=rtpmap: 0 pcmu/8000
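
When reading a trace like the one above, it helps to pull out individual headers, for example the Session value that RTSP expects on later requests for the same session. The helper below is a minimal sketch for that purpose. `rtsp_header_value` is a hypothetical name, not part of the NMS API, and it assumes CRLF line endings and case-sensitive header names.

```cpp
#include <cstddef>
#include <string>

// Returns the value of header `name` from a raw RTSP message, or an empty
// string if the header is absent. Matching is case-sensitive and stops at the
// blank line that separates the headers from the SDP body. This is a
// simplification for reading traces, not a full RFC 2326 parser.
std::string rtsp_header_value(const std::string& message, const std::string& name)
{
    const std::string prefix = name + ":";
    std::size_t pos = 0;
    while (pos < message.size()) {
        std::size_t eol = message.find("\r\n", pos);
        if (eol == std::string::npos) eol = message.size();
        if (eol == pos) break;  // blank line: end of headers
        const std::string line = message.substr(pos, eol - pos);
        if (line.compare(0, prefix.size(), prefix) == 0) {
            // Skip optional whitespace between the colon and the value.
            const std::size_t start = line.find_first_not_of(" \t", prefix.size());
            return start == std::string::npos ? std::string() : line.substr(start);
        }
        pos = eol + 2;  // move past the CRLF to the next line
    }
    return std::string();
}
```

Applied to the response in the trace, `rtsp_header_value(response, "Session")` would yield `RQKCRCSPWX0000000368fgJiuWPnxz`, which can then be matched against subsequent requests in the same capture.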


Lessons Learned (cont'd)

Load Balancing

The MRCP specification allows the MRCP server to coordinate where to set up the RTP connection with the ASR/TTS server, which lets it perform load balancing

Currently ScanSoft’s MRCP server does not provide load balancing, but their engineers are looking at providing this

Until then, your IVR will have to create its own load balancing and failover logic for the ASR/TTS server farm
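
One simple shape for IVR-side load balancing and failover is round-robin selection that skips servers which refuse a connection. The sketch below is not TelBert's implementation. `SpeechServerPool` and the `try_connect` callback are hypothetical; in a real IVR the callback would wrap whatever call opens the MRCP session.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Round-robin selection over a list of speech servers, skipping any server
// for which `try_connect` returns false. All names here are hypothetical.
class SpeechServerPool {
public:
    explicit SpeechServerPool(std::vector<std::string> servers)
        : servers_(std::move(servers)), next_(0) {}

    // Tries each server at most once, starting with the one after the last
    // server tried. Returns the chosen server name, or an empty string if
    // every attempt fails (or the pool is empty).
    std::string acquire(const std::function<bool(const std::string&)>& try_connect)
    {
        for (std::size_t attempt = 0; attempt < servers_.size(); ++attempt) {
            const std::string& candidate = servers_[next_];
            next_ = (next_ + 1) % servers_.size();  // advance even on failure
            if (try_connect(candidate)) return candidate;
        }
        return std::string();
    }

private:
    std::vector<std::string> servers_;
    std::size_t next_;
};
```

Round-robin spreads sessions evenly only when servers are equally sized; a production version would likely also track per-server session counts and temporarily quarantine servers that fail repeatedly.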


Lessons Learned (cont'd)

Lots of specifications to be learned and not just by the integration team

Specification: Media Resource Control Protocol (MRCP) Specification
Location: ftp://ftp.rfc-editor.org/in-notes/internet-drafts/draft-shanmugham-mrcp-05.txt
Who needs to understand / be aware of this spec: Integration Team, Application Interface Team

Specification: Real Time Streaming Protocol (RTSP) Specification
Location: ftp://ftp.rfc-editor.org/in-notes/rfc2326.txt
Who needs to understand / be aware of this spec: Integration Team

Specification: Real-time Transport Protocol (RTP) Specification
Location: ftp://ftp.rfc-editor.org/in-notes/std/std64.txt
Who needs to understand / be aware of this spec: Integration Team

Specification: Speech Recognition Grammar Specification
Location: http://www.w3.org/TR/2004/REC-speech-grammar-20040316/
Who needs to understand / be aware of this spec: Application Interface Team, Application Developers

Specification: Natural Language Semantics Markup Language for Speech Interface Framework (nl-spec) Specification
Location: http://www.w3.org/TR/nl-spec/
Who needs to understand / be aware of this spec: Application Interface Team, Application Developers


Thank You!

Note:
  • PDF will be posted today
  • Recorded version posted in a few days


Q & A Session

Please use the text messaging feature to send your questions


For more information…

Contact
  • Dr. Rob Kassel, Senior Product Manager, ScanSoft: +1 617 428 4444; [email protected]
  • Jack Chase, Director, Product Marketing, NMS: +1 508 271 1109; [email protected]
  • Eric Cunningham, Enterprise Architect, Capital One: +1 804 855 3597; [email protected]

Upcoming Events
  • VON Europe, May 23 – 26, Stockholm, Sweden, Booth # 1040

Upcoming Webinars
  • June: Ready for Mainstream: AdvancedTCA Solutions Become Reality
  • July: “Transforming Speech Applications With NMS' new VoiceXML Server”