View
23
Download
2
Category
Preview:
Citation preview
Digital Speech ProcessingDigital Speech Processing——Lecture 18Lecture 18Lecture 18Lecture 18
TextText--toto--Speech (TTS) Speech (TTS) Synthesis SystemsSynthesis SystemsSynthesis SystemsSynthesis Systems
TextText--toto--Speech (TTS) SynthesisSpeech (TTS) SynthesisTextText toto Speech (TTS) SynthesisSpeech (TTS) Synthesis• GOAL: convert arbitrary textual messages to
intelligible and natural sounding syntheticintelligible and natural sounding synthetic speech so as to transmit information from a machine to a personmachine to a person
input text
Text Analysis
Speech Synthesizer
abstract underlying linguistic
synthetic speech
t ty y
description output
• pronunciation of text => phonemes, stress, intonation, duration
• syntactic structure of sentence (pauses, rate of speaking, emphasis)
• semantic focus, ambiguity resolution (duration, intonation)
– rules for word etymology (especially names, foreign terms)
Text Analysis ComponentsText Analysis ComponentsText Analysis ComponentsText Analysis ComponentsRaw English Input TextRaw English Input Text
Basic Text ProcessingBasic Text ProcessingDocument Structure DetectionDocument Structure DetectionText NormalizationText NormalizationLinguistic AnalysisLinguistic Analysis DictionaryDictionaryLinguistic AnalysisLinguistic Analysis
Phonetic AnalysisPhonetic Analysis
Tagged TextTagged Text
DictionaryDictionary
Phonetic AnalysisPhonetic AnalysisHomograph disambiguationHomograph disambiguationGraphemeGrapheme--toto--Phoneme ConversionPhoneme Conversion
Tagged PhonesTagged Phones
Prosodic AnalysisProsodic AnalysisPitch and Duration RulesPitch and Duration RulesStress and Pause AssignmentStress and Pause Assignment
Tagged PhonesTagged Phones
Stress and Pause AssignmentStress and Pause Assignment
Synthesis Controls (Sequence of Synthesis Controls (Sequence of Sounds, Durations, Pitch)Sounds, Durations, Pitch)
Document StructureDocument StructureDocument StructureDocument Structure•• end of sentenceend of sentence marked by ‘.?!’ is not infallibley
– “The car is 72.5 in. long”•• ee--mailmail and web pages need special processing
– Larry:
Sure. I'll try to do it before Thursday :-)y y )
Ed•• multiple languagesmultiple languages•• multiple languagesmultiple languages
– insertion of foreign words, unusual accent and diacritical marks, etc.
Text NormalizationText Normalization
•• abbreviations and acronymsabbreviations and acronyms– Dr. is pronounced either as Doctor or drive depending on context (Dr.
Smith lives on Smith Dr.)S i d i h ‘ ’ ‘S i ’ d di (I li– St. is pronounced either as ‘street’ or ‘Saint’ depending on context (I live on Bourbon St. in St. Louis)
– DC is either direct current or District of Columbia– MIT is pronounced as either ‘M I T’ or ‘Massachusetts Institute of
Technology’ but never as ‘mitt’– DEC is pronounced as either ‘deck’ or ‘Digital Equipment Company’ but
never as ‘D E C’•• numbersnumbers
– 370-1111 can be either ‘three seven oh …’ or ‘three seventy-model 1111’ for the IBM 370 computer
– 1920 is either ‘nineteen-twenty’ or ‘one thousand, nine hundred, twenty’•• dates times currency account numbers ordinals cardinalsdates times currency account numbers ordinals cardinals•• dates, times currency, account numbers, ordinals, cardinals, dates, times currency, account numbers, ordinals, cardinals,
mathmath– Feb. 15, 1983 needs to convert to ‘February fifteenth, nineteen eighty-
three’$10 50 is pronounced as ‘ten dollars and fifty cents’– $10.50 is pronounced as ‘ten dollars and fifty cents’
– part # 10-50 needs to be pronounced as ‘part number 10 dash fifty’ rather than ‘part pound sign ten to fifty’
Text NormalizationText NormalizationText NormalizationText Normalization•• proper namesproper namesp pp p
– Rudy Prpch is pronounced as ‘Rudy Perpich’– Sorin Ducan is pronounced as ‘Sorin Duchan’
•• part of speechpart of speech– read is pronounced as ‘reed’ or ‘red’
record is pronounced as ‘rec ard’ or ‘ri cord’– record is pronounced as rec-ard or ri-cord•• word decompositionword decomposition
– need to decompose complex words into base formsneed to decompose complex words into base forms (morphemes) to determine pronunciation (‘indivisibility’ needs to be decomposed into ‘in-di-visible-ity’ to determine pronunciation)visible ity to determine pronunciation)
Text NormalizationText NormalizationText NormalizationText Normalization
• proper handling of special symbolsspecial symbols• proper handling of special symbolsspecial symbolsin text– punctuation, e.g., . , : ; ’ ” - -- _ * & ( ) ^
% @ ! ~ < > ? / \ = +•• string resolutionstring resolution
– 10:20 can be pronounced as either10:20 can be pronounced as either ‘twenty after 10 (as a time)’ or ‘ten to twenty’ (as a sequence)twenty (as a sequence)
Knowledge Source ConfusionsKnowledge Source ConfusionsKnowledge Source ConfusionsKnowledge Source Confusions
• Let us pray Lettuce spray• Let us pray ---------- Lettuce spray• Pay per view ---------- Paper viewy p p• Meet her on Main St. ----------
M t M i StMeter on Main St.• It is easy to recognize speech -----It is easy to recognize speech
----- It is easy to wreck a nice b hbeach
8
Linguistic AnalysisLinguistic AnalysisLinguistic AnalysisLinguistic Analysis• part-of-speech (POS)p p ( )• word sense• phrases
h• anaphora• emphasis• style• style
a conventional parser could be used, but a co e t o a pa se cou d be used, buttypically a simple shallow analysis is done for speed (parsers are not real-time!)
Homograph DisambiguationHomograph DisambiguationHomograph DisambiguationHomograph Disambiguation• “an absent boy” versus “do you choose toan absent boy versus do you choose to
absent yourself?”• “they will abuse him” versus “they won’tthey will abuse him versus they won t
take abuse”• “an overnight bag” versus “are you stayingan overnight bag versus are you staying
overnight?”• “he is a learned man” versus “he learnedhe is a learned man versus he learned
to play piano”• “El Camino Real road” versus “real world”El Camino Real road versus real world
LetterLetter--toto--Sound (LTS) ConversionSound (LTS) Conversion
• CART “Whole Word” Dictionary Probe“Whole Word” Dictionary Probe
Input WordInput Word
(Classification and Regression Affix StrippingAffix Stripping
nono
nonogTree) analysis
• conventional “Root” Dictionary Probe“Root” Dictionary Probe
yesyesnono
nonoconventional dictionary search with
LetterLetter--toto--Sound and Sound and Stress RulesStress Rules
yesyesyesyes
search with letter-to-sound rules
Affix ReattachmentAffix Reattachmentrules
phonemes, stress, phonemes, stress, partsparts--ofof--speechspeech
This is only ONE way of doing This is only ONE way of doing LTS. FSMs are another.LTS. FSMs are another.
ProsodyProsodyProsodyProsody•• pausespauses
– to indicate phrases and to avoid running out of breath
•• pitchpitch– fundamental frequency (F0) as a function of q y ( )
time•• rate/relative durationrate/relative duration
– phoneme durations, timing, and rhythm•• loudnessloudnessloudnessloudness
– relative amplitude/volume
Full TTS SystemFull TTS Systemyy
T t L tt t S th i
Text Face SyncText/Language
Text Analysis
Letter-to-Sound Rules
Synthesis Backend Speech
S th i d l i t TTS
- Numerical expansion(dates, times, $$$, …)
- abbreviations acronyms
- text and name dictionaries
- timing and duration[rate, stress (position, context)]
- intonation/pitch
Synthesis model impacts TTS, as well as unit representation.
choice of units: - words
phonesabbreviations, acronyms
- proper name id
“Dr. Smith lives at 23 Lakeshore Dr.”
- intonation/pitch(type of phrase[list,statement, ??])
- phonetic substitution(vowel reduction, flapping rules)Pr
osod
y - phones - diphones - dyads - syllables
choice of parameters:
- sentences, phrase and breath groups- semantic and syntactic accent; stress of compounds
- loudness/amplitude(phrasal amp., local reduction)
choice of parameters:- LPC - formants- waveform templates- articulatory parameters
Parsing:
y p- intonation types; parts of speech - sinusoidal parameters
method of computation:- rules- concatenation
Word Concatenation SynthesisWord Concatenation Synthesisyy
• words in sentences are much shorter than• words in sentences are much shorter than in isolation (up to 50% shorter) (see next page)page)
• words cannot preserve sentence-level t h th i t ti ttstress, rhythm or intonation patterns
• too many words to store (1.7 million surnames), extended words using prefixes and suffixes
Word ConcatenationWord ConcatenationWord ConcatenationWord Concatenation
Proper Name StatisticsProper Name StatisticsProper Name StatisticsProper Name StatisticsNumber of NamesNumber of Names CoverageCoverageNumber of NamesNumber of Names CoverageCoverage
10 4.9%
100 16.3%
5,000 59.1%5,000 59.1%
50,000 83.2%
100,000 88.6%
200,000 93.0%, %
Statistics of Proper Name Statistics of Proper Name CCCoverageCoverage
name name pronunciation pronunciation
based onbased onbased on based on etymologyetymology
Concatenative Word SynthesisConcatenative Word SynthesisConcatenative Word SynthesisConcatenative Word Synthesis
several examples of several examples of concatenated words and fullconcatenated words and fullconcatenated words and full concatenated words and full
sentencessentences----SBRSBR
wordword--based based synthesis does synthesis does
not work!not work!
Speech Synthesis MethodsSpeech Synthesis MethodsSpeech Synthesis MethodsSpeech Synthesis Methods• 1939—the VODER (Voice Operated1939 the VODER (Voice Operated
DEmonstratoR)—Homer Dudley– based on a simple model of speech sound production– select voicing source (with foot pedal control of pitch)
or noise sourcet filt h d th t d l– ten filters shaped the source to produce vocal or noise-excited sounds—controlled by finger motions
– separate keys for stop soundsseparate keys for stop sounds– wrist bar control of signal energy
The VODERThe VODERThe VODERThe VODER
Articulatory SynthesisArticulatory Synthesis•• in theoryin theory — can create more natural and more
realistic motions of the articulators (rather than formant parameters), thereby leading to more natural sounding synthetic speech
utilize physical constraints of articulator movements– utilize physical constraints of articulator movements– use X-ray data to characterize individual speech
sounds– model how articulatory parameters move smoothly
between sounds• direct method: solve wave equation for sound pressure at q p
lips• indirect method: convert to formants or LPC parameters for
final synthesis in order to utilize existing synthesizers– use highly constrained motions of articulatory
parameters
Articulatory ModelArticulatory ModelArticulatory ModelArticulatory ModelArticulatory Parameters of Articulatory Parameters of Model:Model:
•• lip lip openingopening——WW
•• lip protrusion lengthlip protrusion length——LLlip protrusion lengthlip protrusion length LL
•• tongue body height and tongue body height and lengthlength——Y,XY,X
l ll l KK•• velar closurevelar closure——KK
•• tongue tip height and tongue tip height and lengthlength——B,RB,R
•• jaw raising (dependent jaw raising (dependent parameter)parameter)
•• velum openingvelum opening——NNp gp g
Articulatory Synthesis Using Articulatory Synthesis Using F S h i B k dF S h i B k dFormant Synthesizer BackendFormant Synthesizer Backend
Cecil CokerCecil Coker----teaching computers to teaching computers to talktalk
Articulatory Synthesis of Articulatory Synthesis of SpeechSpeech——Cecil CokerCecil Coker
talktalk
SpeechSpeech Cecil CokerCecil Coker
Articulatory Synthesis by Copying Articulatory Synthesis by Copying M d V l T t D tM d V l T t D tMeasured Vocal Tract DataMeasured Vocal Tract Data
• fully automatic closed-loop optimizationi iti li d f ti l t d b k l t– initialized from articulatory codebooks, neural nets
– Schroeter and Sondhi, 1987
One example: original: re-synthesis:
Articulatory Synthesis IssuesArticulatory Synthesis IssuesArticulatory Synthesis IssuesArticulatory Synthesis Issues
• requires highly accurate models of glottisrequires highly accurate models of glottis and vocal tract
• requires rules for dynamics of the• requires rules for dynamics of the articulators
SourceSource--Filter Synthesis ModelsFilter Synthesis ModelsSourceSource Filter Synthesis ModelsFilter Synthesis Models• cascade/serial (formant) synthesis model
Serial/Formant Synthesis ModelSerial/Formant Synthesis Model
•• flaws in the serial/formant synthesis model:flaws in the serial/formant synthesis model:flaws in the serial/formant synthesis model:flaws in the serial/formant synthesis model:–– can’t handle voiced fricativescan’t handle voiced fricatives–– no zeros for nasal soundsno zeros for nasal sounds–– no precise control for stop consonantsno precise control for stop consonants–– pitch pulse shape fixedpitch pulse shape fixed——independent of pitchindependent of pitch–– spectral compensation is inadequatespectral compensation is inadequate
OVE 1OVE 1----FantFant SPASS SPASS synthesissynthesis
“To Be …”“To Be …”--Bell LabsBell Labs
DaisyDaisy--Daisy with Daisy with
musicmusic
We Wish We Wish You …You …
JSRU JSRU SynthesisSynthesis
Parallel Synthesis ModelParallel Synthesis ModelA i l th i i d h f l l t tA serial synthesizer is a good approach for open, non-nasal vocal tracts (vowels, liquids). For obstruents and nasals, we need to control the amplitudes of each resonance, and to introduce zeros in addition to the poles.
24
1 2 21
1 21 2
cos( )( )cos( )
k k k
k k k k
r rV zr z r z
θθ − −
=
− +=
− +∏14
1 2 21 1 2 cos( )
use the approximation
k k
k k k k
A B zr z r zθ
−
− −=
−=
− +∑i
4
1 2 21 1 2
use the approximation
( )cos( )
k
k k k k
AV zr z r zθ − −
=
≈− +∑
i
parallel synthesizer provides more parallel synthesizer provides more flexibility in matching spectrum levels flexibility in matching spectrum levels
t f t f i ( i it f t f i ( i iat formant frequencies (via gain at formant frequencies (via gain controls)controls)——however, zeros are however, zeros are introduced into the spectrum.introduced into the spectrum.
Parallel SynthesisParallel SynthesisParallel SynthesisParallel Synthesis•• issues:issues:
–– need individual resonance amplitudes need individual resonance amplitudes ((A1,…,A4A1,…,A4))——if resonances are close, this is a if resonances are close, this is a messy calculationmessy calculationmessy calculationmessy calculation
–– phasing of resonances neglected (the phasing of resonances neglected (the BBkkzz--11 terms)terms)–– synthetic speech has both resonances and zeros synthetic speech has both resonances and zeros
(at frequencies between the resonances) that may (at frequencies between the resonances) that may be perceptiblebe perceptiblebetter reproduction of complex consonantsbetter reproduction of complex consonants–– better reproduction of complex consonantsbetter reproduction of complex consonants
Parallel synthesis from BYUParallel synthesis from BYUParallel synthesisParallel synthesis--HolmesHolmes
Continuing Evolution (1959Continuing Evolution (1959--1987)1987)
• Haskins 1959• Haskins, 1959 • KTH – Stockholm, 1962• Bell Labs 1973• Bell Labs, 1973 • MIT, 1976
MIT t lk 1979• MIT-talk, 1979• Speak ‘N spell, 1980
BELL L b 1985• BELL Labs, 1985• Dec talk, 1987
TextText--toto--Speech Synthesis (TTS) Speech Synthesis (TTS) EvolutionEvolution
Poor Good
Good Intelligibility;
C stomerPoor Intelligibility;
Poor Naturalness
Good Intelligibility;
Poor Naturalness
Customer Quality
Naturalness (Limited Context)
Formant Synthesis
LPC-Based Diphone/Dyad
Unit Selection Synthesis
Context)
Bell Labs; Joint Speech
Synthesis
Bell Labs; CNET;
ATR in Japan; CSTR in Scotland; Joint Speech
Research Unit; MIT (DEC-Talk);
Haskins Lab
;Bellcore; Berkeley Speech
Technology
BT in England; AT&T Labs (1998);
L&H in Belgium
1962 1967 1972 1977 1982 1987 1992 1997
Year
gy
Speech SynthesisSpeech Synthesis——the 90’sthe 90’sSpeech SynthesisSpeech Synthesis the 90 sthe 90 s• what changed?what changed?
– TTS was highly intelligible but extremely unnatural sounding
– a decade of work had not changed the naturalness substantiallycomputation and memory grew with Moore’s law– computation and memory grew with Moore s law, enabling highly complex concatenative systems to be created, implemented and perfected
– concatenative systems showed themselves capable of producing (in some cases) extremely natural sounding synthetic speechsounding synthetic speech
Concatenation TTS SystemsConcatenation TTS SystemsConcatenation TTS SystemsConcatenation TTS Systems• key idea: use segments of recorded speech for y g p
synthesis• data driven approach more segments give better
synthesis using an infinite number of segments leadssynthesis using an infinite number of segments leads to perfect synthesis
• key issues:what units to use– what units to use
– how to select units from natural speech– how to label and extract consistent units from a large database
h t i l t ti h ld b d f t ll– what signal representation should be used for spectrally smoothing units (at junctures) and for prosody modification (pitch, duration, amplitude)
Concatenation UnitsConcatenation UnitsConcatenation UnitsConcatenation Units
• choice of units:choice of units:– Words—there are an infinite number of them
Syllables there are about 10K in English– Syllables—there are about 10K in English– Phonemes—there are about 45 in English
Demi syllables there are about 2500 in– Demi-syllables—there are about 2500 in EnglishDiphones there are about 1500 2500 in– Diphones—there are about 1500-2500 in English
Choice of UnitsChoice of UnitsChoice of UnitsChoice of UnitsLength # Rules, Necessary
Unit ModificationsUnit # Units
(English)
Allophone 60-80
( g )QualityLow
ManyShort
pDiphone <402-652
Triphone <403-653
Demisyllable 2K ySyllable 11KVC*V2-syllable <11 K2yWord 100K-1.5MPhraseSentence HighFewLong
8Sentence gLong
Corpus Coverage by Unit TypeCorpus Coverage by Unit TypeCorpus Coverage by Unit TypeCorpus Coverage by Unit Type50000 2-Syllable
VC*V1.0 NOTE: depends on domain
(here SURNAMES)
40000TriphonesSyllablesDemisyllablesDiphones
.83
)
20000
30000DiphonesWords
Slope
# U
nits
oken
/uni
t)
10000
20000 .23.15.11
#(1
to
0.02
.0031 10K 20K 30K 40K 50K 2 M
Top N Surnames (rank)
Concatenation UnitsConcatenation UnitsConcatenation UnitsConcatenation Units• Words:
– no complete coverage for broad domains words have to be supplemented with smaller units
– limited ability to modify pitch, amplitude and durationlimited ability to modify pitch, amplitude and duration without losing naturalness and intelligibility
– need huge database to extract multiple versions of each word
• Sub-Words:– hard to isolate in context due to co-articulation
need allophonic variations to characterize units in all– need allophonic variations to characterize units in all contexts
– puts large burden on signal processing to smooth at unit join pointsunit join points
Block Diagram of a Block Diagram of a C i TTS SC i TTS SConcatenative TTS SystemConcatenative TTS System
Dictionary and Rules
Store of Sound Units
Text l
Assemble Speech W f
SpeechAnalysis,
Letter-to-Sound,Prosody
ssemble Units that
Match Input
Waveform Modification
and Synthesis
Alphabetic Ch t
Message Textp
Prosodyp
Targets SynthesisCharacters
Phonetic SymbolsPhonetic Symbols, Prosody Targets
Female MaleFemale Male
Procedure for Concatenative SynthesisProcedure for Concatenative Synthesis• off-line inventory preparation:
– record speech corpus and process with coding method f h iof choice
– determine location of speech units and store units in inventoryy
• on-line synthesis from text– normalize input text (expand abbreviations, etc.)p ( p )
– letter-to-sound (pronunciation dictionary and rules)
– prosody (melody/pitch&durations, stress patterns/amplitudes, …)
– select appropriate sequence of units from inventory– modify units (smooth at boundaries, match desired
prosody)prosody)– synthesize and output speech signal
Unit Selection SynthesisUnit Selection SynthesisUnit Selection SynthesisUnit Selection Synthesis• need to optimally match units at boundaries, e.g.,
fundamental frequency (pitch) and spectrumfundamental frequency (pitch), and spectrum • need to automatically and efficiently select optimal
sequence of units from database• issues in Unit Selection Synthesis
– several examples in each unit category (from 10 to 10 )– waveform modification used sparingly (leads to perceived
6
p g y ( pdistortions)
– high intelligibility must be maintained– “customer quality” attained with reasonable training set y g
(1-10 hours)– “natural quality” attained with large training set (10’s of hours)– “unit selection” algorithm must run in a fraction of real time on a
state-of-the-art processor
(On(On--line) Unit Selection: Viterbi line) Unit Selection: Viterbi SearchSearch
u-##-a a-u
a-u(1)
#-a(1)
u-#(1)
#-# #-#
( )
a-u(2)
( )
#-a(2)
( )
u-#(2)
a-u(3)
#-a(3)
u-#(3)
• transitional (concatenation) costs are based on acoustic distances
a-u(4)
( )• node (target) costs are based on linguistic identity of unit
Unit Selection MeasuresUnit Selection MeasuresUnit Selection MeasuresUnit Selection Measures• USD—Unit Segmental Distortion differencesUSD Unit Segmental Distortion differences
between desired spectral pattern of target and that of candidate unit, throughout whole unit
• UCD—Unit Concatenative Distortion spectral discontinuity across boundaries of the concatenated units
Example: source context—want—/w ah n t/target context—cart—/k ah r t/USD (ah n versus ah r)
Modern TTS Systems (Natural Modern TTS Systems (Natural V i f AT&T)V i f AT&T)Voices from AT&T)Voices from AT&T)
NV2005NV2005• Soliloquy from Hamlet—• Gettysburg Address—
NV2005NV2005
NV2005NV2005
• Bob Story—• German female—• UK British female—• Spanish female—Spanish female• Korean female—• French male• French male—
Modern COMMERCIAL SystemsModern COMMERCIAL Systems
L t• Lucent• AcuVoice • Festival• L&H RealSpeakL&H RealSpeak• SpeechWorks female
S hW k l• SpeechWorks male• Cselt (Actor) - Italian
TTS Future NeedsTTS Future NeedsTTS Future NeedsTTS Future Needs• TTS needs to know how things should be said• context-sensitive pronunciations of words• prosody prediction emphasis
– I gave the book to John (not someone else)g ( )– I gave the book to John (not the photos)– I gave the book to John (I did it, not someone else)
• unit selection process target cost captures mismatch p g pbetween predicted unit specification (phoneme name, duration, pitch, spectral properties) and actual features of a candidate recorded unit need better spectral distance measures that incorporate human perceptiondistance measures that incorporate human perception
• better signal processing compress units for small footprint devices
Business Drivers of TTSBusiness Drivers of TTSBusiness Drivers of TTSBusiness Drivers of TTS• cost reductioncost reduction
– TTS as a dialog component for customer care– TTS for delivering messages– TTS to replace expensive recorded IVR promptsTTS to replace expensive recorded IVR prompts
• new products and services– location-based services
providing information in cars (e g driving directions– providing information in cars (e.g., driving directions, traffic reports)
– unified Messaging (reading e-mail, fax)– voice Portals (enterprise, home, phone access tovoice Portals (enterprise, home, phone access to
Web-based services)– e-commerce (automatic information agents)– customized News, Stock Reports, Sports Scores, p , p– devices
Reading EmailReading EmailFrom: Marilyn Walker <walker@research.att.com>
Example: Old TTS, No Filter
Reading EmailReading EmailFrom: Marilyn Walker walker@research.att.com To: David Ross <davidross@home.com> Subject: Re: Today's Meeting Date: Tuesday, December 01, 1998 4:25 PM ---------------------------------------------------------------------------------------------------------------------------4 30 i fi f S t th ti4:30 is fine for me. See you at the meeting.
Marilyn
-----Original Message-----g gFrom: David Ross <davidross@home.com> To: Marilyn Walker <walker@research.att.com> Date: Tuesday, December 01, 1998 2:25 PM Subject: Today's Meeting
Today's meeting has been changed from 4:00 to 4:30 PM. If the time change is a problem, please send me email at davidross@home.com.
Thanks,
david ross
Reading Email (final)Reading Email (final)From: Marilyn Walker <walker@research.att.com>
Example: Enhanced TTS
Reading Email (final)Reading Email (final)
To: David Ross <davidross@home.com> Subject: Re: Today's Meeting Date: Tuesday, December 01, 1998 4:25 PM ---------------------------------------------------------------------------------------------------------------------------4:30 is fine for me. See you at the meeting.
Marilyn Walker
O i i l M-----Original Message-----From: David Ross <davidross@home.com> To: Marilyn Walker <walker@research.att.com> Date: Tuesday, December 01, 1998 2:25 PM Subject: Today's Meeting j y g
Today's meeting has been changed from 4:00 to 4:30 PM. If the time change is a problem, please send me email at davidross@home.com.
ThanksThanks,
david ross
TTS Application CategoriesTTS Application CategoriesTTS Application CategoriesTTS Application Categories• PDAs, cellphones, gaming, talking appliancesDevices
• driving directions, city and restaurant guides, location services (e.g., “Macys has a sale!”)
Automotive Connectivity
Devices
voice control of cell phones, VCRs, TVs • home information access over telephone (“Home
Voice Portals”)
Connectivity
ConsumerCommunications
• information access over the phone such as sales information, HR, internal phonebook, messaging
Enterprise Communications
Voice assisted • E-commerce, customer care (e.g., friendly automated talking web agents, FAQs, product information)
Voice-assistedE-Commerce
Call center • next-gen “HMIHY” automated operator servicesAutomation
Recommended