Seo-jung ko, Industrial Engineering, Hanyang University
Engineering Psychology and Human Performance
Chapter 6. Language and Communications
Speech Perception
Contents
1. Speech Perception
2. Representation of Speech
3. Units of Speech Perception: Phonemes, Syllables, Words
4. Top-Down Processing of Speech
5. Applications of Voice Recognition Research
6. Communications: Nonverbal Communications, Video Mediated Communications, Crew Resource Management
Speech Perception
Example
In 1977, a tragic event occurred at the Tenerife airport in the Canary Islands: a
KLM Royal Dutch Airlines 747 jumbo jet, accelerating for takeoff, crashed into a
Pan American 747 taxiing on the same runway.
→ Confusion between the KLM pilot and air traffic control.
Reading & Speech
In common with reading, the perception of speech involves both bottom-up hierarchical processing and top-down contextual processing.
Reading :: features – letters – words
Speech :: phonemes – syllables – words
But, unlike in reading, the physical units of speech are not easy to separate.
The perceptual system must undertake some analog-to-digital conversion to
translate the continuous speech waveform into the discrete units of speech
perception.
Representation of Speech
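(a) The stimulus of speech is a continuous variation, or oscillation, of the air pressure.
(b) Fourier analysis: the waveform can be decomposed into sine waves of different frequencies and amplitudes.
(c) Spectral representation: the components in (b) are plotted with power (the mean amplitude of each sine wave's oscillation, or the square of that amplitude) on the Y axis and frequency on the X axis.
(d) Formants :: two separated tones. Y axis: frequency; X axis: time; width: amplitude.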
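Steps (b) and (c) can be illustrated with a small sketch. It is not taken from the slides: the sample rate, the two tone frequencies, and the amplitudes are made-up values, and a naive discrete Fourier transform stands in for whatever analysis tool one would use in practice. It only shows how a waveform built from two sine waves separates back into two peaks on a frequency axis.

```python
import cmath
import math


def dft_amplitudes(samples):
    """Return the amplitude of each frequency bin from 0 up to the Nyquist bin."""
    n = len(samples)
    amplitudes = []
    for k in range(n // 2 + 1):
        # Correlate the signal with a complex sinusoid at bin k.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        # The DC and Nyquist bins appear once; every other bin appears twice.
        scale = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        amplitudes.append(scale * abs(coeff) / n)
    return amplitudes


# Hypothetical signal: one second sampled at 400 Hz, so bin k corresponds to k Hz.
SAMPLE_RATE = 400
N = 400
signal = [
    1.0 * math.sin(2 * math.pi * 50 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 120 * t / SAMPLE_RATE)
    for t in range(N)
]

amps = dft_amplitudes(signal)        # (b) amplitude of each sine component
power = [a ** 2 for a in amps]       # (c) Y axis: squared amplitude
peaks = [k for k, a in enumerate(amps) if a > 0.1]
print(peaks)                         # -> [50, 120]
```

Plotting `power` against the bin index would give the spectral representation in (c); repeating the analysis over short successive time windows would give the time-by-frequency picture in (d).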
Units of Speech Perception
Phonemes – the basic unit of speech
• Changing a phoneme in a word will change its meaning (or change it to a nonword).
• The 38 English phonemes. Ex) [p] [b] [t] [d] [k] [g] [f] [v] [θ] …
• In actual perception, phonemes differ considerably from printed letters.
• The physical form of a phoneme is highly dependent on the context in which it appears.
Syllables – the basic unit of speech perception
• Two or more phonemes generally combine to create syllables.
• The syllabic unit is itself relatively invariant in its physical form.
• A study suggests that people are particularly dependent on the syllable unit in speech perception.
Words – the smallest cognitive or semantic unit of meaning
• Composed of morphemes. Ex) un-, -ing …
• Segmentation problem
“she uses st*and*ard oil”
In addition to the boundary gaps between the words, there are two physical pauses.
→ In purely bottom-up processing, when a continuous stream of words whose meanings are unknown is presented,
it becomes difficult to identify the boundaries between the words.
Top-Down Processing of Speech
Contrast speech perception with reading:
(1) the invariance problem.
(2) the segmentation problem.
(3) the serial and transient nature of the auditory message.
→ These make bottom-up processing difficult and force reliance on top-down processing.
Demonstrations of top-down or context-dependent processing in
speech perception are quite robust.
One experiment compared recognition of degraded word strings:
(1) random words (2) grammatically structured but meaningless word strings (3) word strings with semantic context
→ The fewer the grammatical and semantic constraints, the greater the signal strength needed to reach the same level of recognition.
Mixture of bottom-up and top-down processing.
Bottom-up processing: acoustic features and syllable-level subfeatures
Top-down processing: what a particular speech sound is, given the semantic and syntactic context;
the subjective nature of word boundaries
Applications of Voice Recognition Research
Speech perception: two major categories of applications.
1. Understanding how humans perceive speech and employ context-driven top-down processing in recognition.
2. Measuring and predicting the effects on speech comprehension of various kinds of distortion (extrinsic or intrinsic distortion).
Natural speech: the differing amplitudes of the various phonemes are distributed across a wide range of frequencies. → A spectrum can be formed.
Figure 6.12 Typical power spectra of speech
Noise & frequency: noise in the same frequency band as the speech degrades comprehension more.
Articulation index (AI): predicts the effects of background noise on speech understanding.
[Figure: signal and noise spectra]
The AI reflects hearing. It is not comprehension.
So the AI provides a measure of bottom-up stimulus quality only.
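The idea behind the AI can be sketched in a few lines. This is a simplified illustration, not the procedure from the slides or from any standard: it assumes the speech range is split into a handful of frequency bands, that each band contributes in proportion to its signal-to-noise ratio clipped to a 0 to 30 dB range, and that each band has an importance weight. The band levels and weights below are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
def articulation_index(speech_db, noise_db, weights, dynamic_range_db=30.0):
    """Simplified AI: weighted average of per-band clipped signal-to-noise ratios.

    speech_db, noise_db: levels (dB) of speech and noise in each band.
    weights: relative importance of each band (normalized internally).
    dynamic_range_db: assumed SNR range over which a band contributes (assumption).
    """
    if not (len(speech_db) == len(noise_db) == len(weights)):
        raise ValueError("all band lists must have the same length")
    total_weight = sum(weights)
    index = 0.0
    for speech, noise, weight in zip(speech_db, noise_db, weights):
        snr = speech - noise
        # A band masked by noise contributes 0; a fully clear band contributes 1.
        audible = min(max(snr, 0.0), dynamic_range_db) / dynamic_range_db
        index += (weight / total_weight) * audible
    return index


# Hypothetical four-band example (all values are illustrative only).
speech = [60.0, 65.0, 60.0, 50.0]
noise = [50.0, 70.0, 30.0, 35.0]
band_weights = [1.0, 2.0, 2.0, 1.0]

ai = articulation_index(speech, noise, band_weights)
print(round(ai, 3))
```

The result lies between 0 (speech fully masked) and 1 (speech fully audible). Nothing in the calculation refers to vocabulary, grammar, or context, which is why the AI describes bottom-up stimulus quality rather than comprehension.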
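Speech intelligibility :: measured by transmitting vocal material of a particular level of redundancy
over the speech channel in question and computing the percentage of words understood correctly.
It can express different degrees of understanding depending on the information content, the redundancy, and the listener's top-down processing.
Restricted vocabulary > unrestricted vocabulary (standardization, etc.)
Meaningful words > nonsense syllables
High-frequency words > low-frequency words
Sentences with context > sentences without context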
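The intelligibility score defined above is simply a percentage of words understood correctly. A minimal sketch follows, assuming the listener's report is scored word by word against the transmitted list; the function name and the word lists are hypothetical.

```python
def intelligibility_percent(presented, reported):
    """Percentage of presented words that the listener reported correctly.

    Words are compared position by position, ignoring letter case.
    """
    if len(presented) != len(reported):
        raise ValueError("presented and reported lists must have the same length")
    if not presented:
        raise ValueError("at least one word is required")
    correct = sum(
        1 for sent, heard in zip(presented, reported)
        if sent.lower() == heard.lower()
    )
    return 100.0 * correct / len(presented)


# Hypothetical trial: five words sent over a noisy channel, one misheard.
sent_words = ["fuel", "low", "gear", "down", "flaps"]
heard_words = ["fuel", "slow", "gear", "down", "flaps"]

score = intelligibility_percent(sent_words, heard_words)
print(score)  # -> 80.0
```

The same channel would be expected to yield different scores for different materials (restricted versus unrestricted vocabulary, words versus nonsense syllables), so a score is only interpretable together with a description of the material that was transmitted.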
Figure 6.14
The important implications
1. Either the AI or the speech-intelligibility measures by themselves are inherently ambiguous unless the redundancy of the transmitted material is carefully specified.
2. Data-driven, bottom-up processing may trade off with context-driven, top-down processing.
Limitations in signal quality can be compensated for by augmenting top-down
processing, creating the ability to “guess” the message without actually (or completely) hearing it.
Ex) Using only a standardized vocabulary; using redundant “carrier” sentences
The effect of redundant carrier sentences on comprehension:
An experiment in which voice warnings were presented to aircraft pilots in noise.
Warning formats :: “fuel low”, “your fuel is low”
→ Recognition performance: “fuel low” < “your fuel is low”
→ Carrier sentences: one-syllable words > multi-syllable words
Communications
Communications :: there is more to communications than simply understanding the words and sentences in speech. ex) gestures, pauses, and voice inflection …
Nonverbal Communications
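1. Visualizing the mouth. Seeing the speaker's mouth movements and the shape of the mouth as words are pronounced is a useful redundant cue (especially when the voice quality is poor).
2. Nonverbal cues.
Cues obtained from the speaker's expressions, such as a nod or a puzzled look, plus additional information such as gestures.
3. Disambiguation.
From the speaker's side, the listener's puzzled look or nod shows whether the message was understood, so the speaker can detect ambiguity in the message. When communication relies on hearing alone with no visual feedback, the total number of words increases and formal turn-taking in the conversation becomes more frequent.
4. Shared knowledge of action. In team performance, a great deal of information is exchanged and shared simply by watching the actions that team members perform or fail to perform.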
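Video Mediated Communications
Video might be expected to have the advantages of face-to-face communication.
However, the quality of the video and audio information may be degraded, and there can be problems in synchronizing the two channels.
Moreover, even with good quality and good synchronization, the remote video condition required more words than face-to-face communication,
and the style of communication resembled audio-only communication, with more formal turn-taking and fewer interruptions.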
Crew Resource Management
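The social climate among the people taking part in a conversation can facilitate or degrade patterns of communication.
The case of a copilot who has seen, or suspects, an error by the pilot:
1) fails to get the pilot to pay attention
2) speaks so vaguely that the error is not corrected
The case where there is a clear status difference between two operators who must exchange information. (Ex. a boss and a secretary)
Results of a simulation experiment on effective communication in tasks requiring coordination among pilots:
Crew members shared more communication, confirmed what was communicated more frequently, and used more commands or assertive statements.
Assertive statements were used in both directions of the formal chain of command (pilot -> copilot, copilot -> pilot).
- This indicates that each member communicates with an awareness of clearly defined responsibilities.
Crews that had worked together for a long time performed better.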