Voice source characteristics in speaker segregation. Patti Adank. Aim of the project: to establish whether voice source characteristics of speakers can be useful to listeners when attending to a target speaker in a multi-speaker situation.
Voice source characteristics in
speaker segregation
Patti Adank
• Some speaker-related characteristics have been
found to be helpful:
Darwin et al. (2003): F0 (pitch) and vocal tract length (VTL)
differences between concurrent speakers help listeners attend
to the target speaker
• Aim of the project:
to establish whether voice source characteristics of speakers can
be useful to listeners when attending to a target speaker in a
multi-speaker situation
• Speaker-related differences that might aid listeners:
- style of speech
- voice quality: creaky voice, roughness, breathiness
• My experiments:
- establish the possible relevance of one acoustic aspect of creaky
voice: jitter
• Speaker-related differences that aid listeners:
- F0 difference (if > 2 semitones)
- Vocal tract length (VTL) difference (if ratio > 1.08)
- Effects of F0 and VTL are superadditive (Darwin et al. 2003)
[Figure: waveform; x-axis: Time (s)]
Pitch: periodicity of the voice source
[Figure: two waveforms; x-axis: Time (s)]
Jitter: aperiodicity of the voice source
• Literature:
- McAdams (1989): natural jitter present in a speaker's voice may be
helpful for listeners
- Ellis (1993): a computational model can segregate simultaneously
presented vowels using jitter differences alone
How could jitter help listeners?
• Auditory Scene Analysis
- primitive segregation cues
bottom-up
involuntary listening
- schema-driven segregation cues (Bregman, 1990)
top-down
voluntary/effortful listening
• Pitch =
primitive segregation cue
(Scheffers, 1983; Assmann & Summerfield, 1990; etc.)
+
schema-driven segregation cue
(Darwin et al., 2003)
• Hypotheses:
0. jitter does not aid the auditory system
1. jitter is only a primitive segregation cue
2. jitter is a primitive cue AND schema-driven cue
3. jitter is only a schema-driven segregation cue
• Experiments:
1. one double-vowel experiment with pitch as the experimental
factor to replicate earlier results for pitch as a primitive cue
2. one double-vowel experiment with jitter as the experimental
factor to establish if jitter is a primitive cue
3. an experiment like that of Darwin et al. (2003), with pitch and
jitter as factors to establish if jitter is a schema-driven cue
• Experiment 1:
- Double-vowel experiment to test pitch effect
- Synthetic vowels (Klatt 1990):
AH, EE, ER, OO, OR, 200 milliseconds
- five versions of each vowel:
100 Hz, +1/4 semitone (st), +1/2 st, +1 st, +2 st
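The five F0 values follow from the equal-tempered relation f = f0 · 2^(st/12). A minimal sketch, assuming the offsets are applied upward from the 100 Hz base; the function and variable names are illustrative and do not come from the slides.

```python
BASE_F0_HZ = 100.0  # base F0 stated on the slide

# Offsets in semitones for the five versions (0 = the 100 Hz base itself)
SEMITONE_OFFSETS = [0.0, 0.25, 0.5, 1.0, 2.0]


def shift_by_semitones(f0_hz: float, semitones: float) -> float:
    """Return f0_hz raised by the given number of equal-tempered semitones."""
    return f0_hz * 2.0 ** (semitones / 12.0)


f0_versions = [shift_by_semitones(BASE_F0_HZ, st) for st in SEMITONE_OFFSETS]

for st, f0 in zip(SEMITONE_OFFSETS, f0_versions):
    print(f"+{st:g} st -> {f0:.2f} Hz")
```

Under this reading the largest step, +2 st, puts the second vowel at roughly 112.25 Hz.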
• Experiment 2:
- Double-vowel experiment to test jitter effect
- Synthetic vowels (Klatt 1990), altered version:
AH, EE, ER, OO, OR, 200 milliseconds
- five versions of each vowel:
100 Hz, +/-1%, +/-2%, +/-4%, +/-8%
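The slides do not say how the altered synthesizer imposed jitter. One common approach is to perturb each glottal period by a random amount within the stated percentage, and the sketch below illustrates only that assumption (uniform perturbation around the nominal 10 ms period of a 100 Hz source). All names are hypothetical.

```python
import random

BASE_F0_HZ = 100.0
DURATION_S = 0.2  # 200 ms vowels, as on the slide
JITTER_LEVELS_PCT = [0.0, 1.0, 2.0, 4.0, 8.0]


def jittered_periods(f0_hz, duration_s, jitter_pct, rng):
    """Return period durations (s) whose sum covers duration_s.

    ASSUMPTION: each period is drawn uniformly from
    nominal * (1 +/- jitter_pct / 100); the slides do not specify this.
    """
    nominal = 1.0 / f0_hz
    max_dev = jitter_pct / 100.0
    periods = []
    total = 0.0
    while total < duration_s:
        period = nominal * (1.0 + rng.uniform(-max_dev, max_dev))
        periods.append(period)
        total += period
    return periods


rng = random.Random(0)  # fixed seed so the sketch is repeatable
for pct in JITTER_LEVELS_PCT:
    periods = jittered_periods(BASE_F0_HZ, DURATION_S, pct, rng)
    print(f"+/-{pct:g}%: {len(periods)} periods, "
          f"min {min(periods) * 1000:.3f} ms, max {max(periods) * 1000:.3f} ms")
```

With 0% jitter every period equals the nominal 10 ms, so the source is strictly periodic; at ±8% the periods range between 9.2 ms and 10.8 ms.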
• Procedure (1 & 2):
- 7 listeners (5 British-English, 2 bilingual)
- categorization pre-test (45 stimuli)
- experiment 1 (or 2):
presentation of a double vowel (125 combinations)
select one of 15 options
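The counts in the procedure can be reproduced by simple enumeration. The slides give only the totals, so the sketch below rests on two assumptions: the 15 response options are the unordered vowel pairs (same-vowel pairs included), and the 125 stimuli are the 25 ordered vowel pairs crossed with the 5 levels of the experimental factor.

```python
from itertools import combinations_with_replacement, product

VOWELS = ["AH", "EE", "ER", "OO", "OR"]
N_LEVELS = 5  # five versions of each vowel

# ASSUMPTION: response options are unordered pairs, same-vowel pairs included
response_options = list(combinations_with_replacement(VOWELS, 2))

# ASSUMPTION: the first vowel stays at the baseline and the second vowel
# takes one of the five levels of the experimental factor
stimuli = list(product(VOWELS, VOWELS, range(N_LEVELS)))

print(len(response_options))  # 15
print(len(stimuli))           # 125
```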
Results pitch
[Figure: 95% CI of PERCENT by Pitch condition (100 Hz, +1/4, +1/2, +1, +2); N = 26 per condition]
Results jitter
[Figure: 95% CI of PERCENT by Jitter condition (0, +/-1%, +/-2%, +/-4%, +/-8%); N = 30 per condition]
• Hypotheses:
0. jitter does not aid the auditory system
1. jitter is only a primitive segregation cue
2. jitter is a primitive cue AND schema-driven cue
3. jitter is only a schema-driven segregation cue
4. jitter is a primitive segregation cue if there is also a pitch
difference.
Results jitter & pitch
[Figure: 95% CI of PERCENT by condition (baseline; F0 max; Jit max; v1 = jitmax v2 = f0m; v2 = jit & f0 max); N = 10 per condition]
Is there still hope for jitter?
• Next experiment: test if jitter is a schema-driven cue
Setup as in Darwin et al.:
2 sentences from same speaker presented simultaneously
attend to target sentence
report on target words
vary jitter and pitch of the sentences