AI를 활용한 외국어교육 (Foreign Language Education Using AI)

김선희 (Sunhee Kim), Department of French Language Education, Seoul National University

2020-05-22

Outline

• Artificial Intelligence

• AI Applications

• AI Technologies for Language Education

• Corpus-Based Evaluation on Chinese Text Normalization

• Applications for Language Education

• Concluding remarks

Artificial Intelligence

AI everywhere!

https://medium.com/infonation-monthly/5-companies-making-the-world-a-better-place-with-ai-right-now-7a7b109f0120


Artificial Intelligence (AI)


AI and Machine Learning


canvas.northwestern.edu


https://towardsdatascience.com/role-of-data-science-in-artificial-intelligence-950efedd2579


AI Method: Machine Learning

Observable data → Modeling (learning) → Information

• Speech recognition: speech → text (안녕하세요, "Hello")
• Machine translation: one language → another language ("I am a boy" → 나는 소년이다)
• Image classification: image → category (고양이, "cat")

How? Deep neural networks

AI Applications

• Search (image: Ⓒ2017 NAVER Corp.)
• Object detection and image classification
• Recommendation
• Translation
• Speech synthesis

Smart Speaker With Screen Display

• Lenovo Smart Display
• All-new Echo Show
• Echo Spot
• Portal from Facebook
• JBL Link View

https://thedroidguy.com/2018/11/5-best-smart-speaker-with-screen-display-in-2019-1092784

https://www.mk.co.kr/news/business/view/2019/05/312029/

Smart Devices & Multiexperience

AI Technologies for Language Education

Speech Recognition

Observable data → Modeling (learning) → Information: speech → text (안녕하세요, "Hello")

Task                      Vocabulary  Word Error Rate (%)
Digits                    11          0.5
WSJ read speech           5K          3
WSJ read speech           20K         3
Broadcast news            64,000+     5
Conversational telephone  64,000+     10

Source: http://web.stanford.edu/class/cs224s/lec/
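The word error rate (WER) in the table is conventionally the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that computation; the function name and the whitespace tokenization are assumptions made for illustration, not the scoring tool behind the figures above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # dist[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dist[0][j] = j  # insert all j hypothesis words

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dist[i - 1][j] + 1
            insertion = dist[i][j - 1] + 1
            dist[i][j] = min(substitution, deletion, insertion)

    return 100.0 * dist[-1][-1] / len(ref)


# One deleted word out of four reference words.
print(word_error_rate("i am a boy", "i am boy"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.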

Speech Recognition

Machines are about 5 times worse than humans, and the gap increases with noisy speech.

Task               Vocab  ASR error (%)  Human error (%)
Continuous digits  11     0.5            0.009
WSJ 1995 clean     5K     3              0.9
WSJ 1995 w/noise   5K     9              1.1
SWBD 2004          65K    10?            3-4?

Source: http://web.stanford.edu/class/cs224s/lec/

Speech Synthesis

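Architecture

Text → Language Understanding Module (NLU models) → Prosody Prediction Module (prosody models) → Unit Selection Module (speech DB) → Voice

Candidate readings of the same written form:
• 10 대: 열대 / 십때 / 십대
• 3M: 삼미터 / 삼메가 / 쓰리엠
• 01.01: 이뤌리릴 / 한시일분 / …
• Spoken variants: 이뤌 리릴 / 이뤌 리릴 / 이뤌 이릴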

Speech Synthesis

Text Normalization

Examples
• Je loue meublé de 38 m² en excellent état au 2ème étage (refait à neuf en 2012) situé Rue Monsieur le Pince à coté du jardin du Luxembourg.
• Rares au XIXe siècle, leur présence est coutumière dès le milieu de notre siècle.

A module that handles numbers, symbols, foreign words, and other such items within a sentence.

Methodology: rule-based or statistical approaches
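As an illustration of the rule-based option, here is a minimal sketch that expands a few of the non-standard tokens in the French examples above. The expansion tables are hypothetical and cover only those tokens; a real TTS front end would generate number names and handle context, which this toy does not attempt.

```python
import re

# Hypothetical lookup tables, limited to tokens from the slide's examples.
CARDINALS = {"38": "trente-huit", "2012": "deux mille douze"}
ORDINALS = {"2ème": "deuxième", "XIXe": "dix-neuvième"}
UNITS = {"m²": "mètres carrés"}
TABLES = {"ordinal": ORDINALS, "cardinal": CARDINALS, "unit": UNITS}

# One named alternative per token class. Ordinals are tried before
# cardinals so that "2ème" is not read as the cardinal "2".
TOKEN_RE = re.compile(
    r"(?P<ordinal>\b(?:\d+ème|[IVXLCDM]+e)\b)"
    r"|(?P<cardinal>\b\d+\b)"
    r"|(?P<unit>m²)"
)


def normalize(text: str) -> str:
    """Replace known non-standard tokens with their spoken form."""

    def expand(match: re.Match) -> str:
        token = match.group(0)
        # Tokens that are not in the table are left untouched.
        return TABLES[match.lastgroup].get(token, token)

    return TOKEN_RE.sub(expand, text)


print(normalize("38 m² au 2ème étage (refait à neuf en 2012)"))
print(normalize("Rares au XIXe siècle"))
```

A table lookup keeps the sketch short; a statistical approach would instead learn the mapping from tagged examples.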

Speech Synthesis - NLU

Grapheme-to-Phoneme Conversion

Homographs (same spelling, different pronunciation)
• Les touristes affluent pour visiter le musée.
• L'Isère est un affluent du Rhône.

Liaison
• Il a été très étonné de voir ça !
• Hier, on s'est bien amusés.

Methodology: rule-based or statistical approaches
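The "affluent" pair shows why grapheme-to-phoneme conversion needs context: the verb form (ils affluent) is pronounced /afly/, while the noun (un affluent) is /aflyɑ̃/. The sketch below is a toy rule, assuming that a preceding determiner signals the noun reading; the determiner list and function name are made up for illustration, and a real system would rely on part-of-speech tagging.

```python
# Hypothetical determiner list used as a cue for the noun reading.
DETERMINERS = {"un", "le", "cet", "son", "chaque"}

VERB_PRONUNCIATION = "afly"    # ils affluent (verb, silent -ent)
NOUN_PRONUNCIATION = "aflyɑ̃"  # un affluent (noun, nasal vowel)


def pronounce_affluent(sentence: str) -> str:
    """Pick a pronunciation for 'affluent' from the word that precedes it."""
    tokens = sentence.lower().split()
    position = tokens.index("affluent")  # raises ValueError if the word is absent
    previous = tokens[position - 1] if position > 0 else ""
    return NOUN_PRONUNCIATION if previous in DETERMINERS else VERB_PRONUNCIATION


print(pronounce_affluent("Les touristes affluent pour visiter le musée."))
print(pronounce_affluent("L'Isère est un affluent du Rhône."))
```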


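Prosodic Boundary and Accent Prediction

Example
• Mon mari veut raminer Romain.

Rule-based or statistical approaches

Building training data tagged with prosodic boundary information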


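Speech Synthesis - Speech DB

Steps from recording script to voice font:
• Optimal recording script design
• Voice talent selection
• TTS DB recording
• Pronunciation transcription
• Phonemic transcription
• Prosodic transcription
• Linguistic information tagging
• Synthesis unit feature extraction
• Pre-selection
• Voice font packaging

Inputs shown in the diagram: language and domain knowledge, linguistic knowledge. Output: voice font.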

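Sample text (weather report):

현재 서귀포 날씨입니다. 기온이 7.3도로 춥지는 않지만, 카메라에 빗방울이 맺혀 있죠? 제주도와 전남 해안에는 비가 약하게 내리고 있는데요, 이 비는 밤에 그 밖에 충청과 남부지방으로 확대되겠습니다. 오늘 출근길에도 크게 춥지 않겠습니다. 날씨 정보였습니다.

(This is the current weather in Seogwipo. At 7.3 degrees it is not cold, but you can see raindrops on the camera, can't you? Light rain is falling on Jeju Island and the South Jeolla coast, and tonight it will spread to Chungcheong and the southern regions. It will not be very cold on the way to work today either. That was the weather report.)

Voices compared:
• YTN weather caster
• Google Translator
• nVoice (article read-aloud)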

http://folk.uio.no/plison/research

Dialogue system

Speaker Recognition

Speaker Verification (Speaker Detection)

• Is this speech sample from a particular speaker? ("Is that Jane?")

Speaker Identification

• Which of these speakers does this sample come from? ("Who is that?")

• Related tasks: Gender ID, Language ID ("Is this a woman or a man?")

Speaker Diarization

• Segmenting a dialogue or multiparty conversation ("Who spoke when?")
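The difference between verification and identification can be sketched as a one-to-one threshold decision versus a one-to-many best match. The example below assumes that each speaker is already represented by a fixed-length voice vector; the vectors, speaker names, and threshold are invented for illustration and do not come from any real system.

```python
import math

# Hypothetical enrolled voice vectors (made-up 3-dimensional values).
ENROLLED = {
    "Jane": [0.9, 0.1, 0.2],
    "Tom": [0.1, 0.8, 0.3],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(sample, claimed_speaker, threshold=0.8):
    """Verification: 'Is that Jane?' Compare against one enrolled speaker."""
    return cosine(sample, ENROLLED[claimed_speaker]) >= threshold


def identify(sample):
    """Identification: 'Who is that?' Pick the closest enrolled speaker."""
    return max(ENROLLED, key=lambda name: cosine(sample, ENROLLED[name]))


sample = [0.85, 0.15, 0.25]
print(verify(sample, "Jane"), verify(sample, "Tom"), identify(sample))
```

Diarization would apply the same kind of comparison to successive segments of a recording in order to decide who spoke when.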

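Multilingual Pronunciation Assessment

• Speech-recognition-based pronunciation assessment for learners of Korean

• Pronunciation assessment system for learners of Korean (NAVER + Seoul National University)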

Corpus-Based Evaluation on Chinese Text Normalization

Kim, S. (2017). Corpus-based evaluation of Chinese text normalization. In 2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA) (pp. 1-4). IEEE.

• Introduction
• Taxonomy of non-standard words
• Text normalization modules
• Corpus and test set
• Evaluation results
• Discussion and future work

INTRODUCTION

Speech synthesis: text-to-speech

Text normalization
• A crucial component of text analysis in TTS systems; normalization errors cause a major degradation in the perceived quality of the TTS system
• Converts non-standard words (NSWs) into the corresponding standard words: number expressions, abbreviations, acronyms, etc.
• Steps: detection/classification of NSWs, disambiguation, and conversion into standard words

TN approaches
• WFSTs, language modeling
• Machine learning approaches (ME, RNN), along with DNN-based TTS (WaveNet, Deep Voice, Tacotron)

Evaluation methods
• WER, TER, F1-measure

Aim of the paper
• To present a method of developing a corpus consisting of various categories of NSWs and a representative test set in Standard Mandarin and Taiwanese Mandarin
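The detection/classification step mentioned in the introduction can be sketched with pattern matching. The category names below are borrowed from the NSW tables in this talk, but the regular expressions and the sample sentence are assumptions made for illustration; they are not the rules of the paper, which were written as Thrax grammars.

```python
import re

# Hypothetical patterns for a few NSW categories. More specific
# categories come first so that, for example, "98.09%" is not split
# into a real number and a stray percent sign.
NSW_RE = re.compile(
    r"(?P<URL>https?://[^\s，。]+)"
    r"|(?P<Email>[A-Za-z0-9._+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)"
    r"|(?P<Date>\d{4}-\d{1,2}-\d{1,2})"
    r"|(?P<Time>\d{1,2}:\d{2})"
    r"|(?P<Percentage>\d+(?:\.\d+)?%)"
    r"|(?P<NumberReal>\d+\.\d+)"
    r"|(?P<Digit>\d+)"
)


def find_nsws(text: str):
    """Return (token, category) pairs in order of appearance."""
    return [(match.group(0), match.lastgroup) for match in NSW_RE.finditer(text)]


# Made-up sentence mixing several NSW types with Chinese text.
for token, category in find_nsws("会议在2017-11-01的9:30开始，准确率为98.09%，共3.5小时"):
    print(token, category)
```

Classification only labels the token; disambiguation and conversion into standard words would follow as separate steps.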

TAXONOMY OF NON-STANDARD WORDS

Taxonomy proposed
• Based on a systematic investigation of a large-scale corpus consisting of sentences from email, chat, and news
• Similar to the one presented in [2]

Examples of Basic NSWs (BNSWs)

Examples of Ambiguous NSWs (ANSWs)

TEXT NORMALIZATION MODULES

Tools: Thrax [9][10] and OpenFST [11], an approach similar to the one presented in [8]

CORPUS AND TEST SET: STANDARD MANDARIN

Corpus composition

Description          News     Blog    Email   Forum   SMS     Chat     Total
corpus size          100MB    3.45MB  7.7MB   3.11MB  1.48MB  7.84MB   123.58MB
number of sentences  440,000  42,629  67,572  75,519  33,403  216,116  875,239
sentences with NSWs  150,000  5,062   7,493   9,021   1,885   17,970   191,431
percentage           30%      20%     10%     10%     15%     15%      100%

Distribution of NSW categories of 1,000 test cases

NSW type    Date   Time  Email  URL   Phone/Fax  Percentage
Proportion  12.8%  0.8%  0.1%   0.1%  0.3%       4.0%

NSW type    Num+suffix_each  money_name  digit_suffix  en_digit  en_seq  nt_words
Proportion  0.2%             0.1%        1.2%          5.8%      2.0%    1.2%

NSW type    NumOrder  Ratio  NumberInterval  YearInterval  Num+suffix  num_units
Proportion  5.4%      0.1%   1.2%            0.1%          34.8%       0.5%

NSW type    en_words  Digit  NumberReal  symbol  ext_rules  en_seq_default
Proportion  0.5%      0.3%   13.1%       5.2%    0.6%       9.7%

CORPUS AND TEST SET: TAIWANESE MANDARIN

Corpus composition

Description          News     Blog&Forum&News  Email   SMS      Chatting  Total
corpus size          25.8MB   100MB            8.55MB  40MB     10MB      184.35MB
number of sentences  723,385  1,230,000        74,300  881,510  302,545   3,211,740
sentences with NSWs  44,478   525,326          21,681  85,386   54,653    731,524
percentage           30%      20%              10%     20%      20%       100%

Distribution of NSW categories of 1,000 test cases

NSW type    Date  Time  Email  URL   Phone/Fax  Percentage
Proportion  4.1%  1.7%  0.3%   0.1%  0.6%       1.2%

NSW type    Num+suffix_each  money_name  digit_suffix  en_digit  en_seq  nt_words
Proportion  0.1%             0.1%        1.0%          7.4%      2.4%    1.0%

NSW type    Fraction  NumOrder  Ratio  NumberInterval  Num+suffix  num_units
Proportion  0.1%      2.2%      0.1%   1.2%            23.6%       1.0%

NSW type    Digit  NumberReal  symbol  ext_rules  en_seq_default  Number_Tra
Proportion  0.6%   18.5%       6.9%    0.2%       21.4%           4.2%

EVALUATION RESULTS

Standard Mandarin
• Test set: 1,000 sentences including 1,782 NSWs, amounting to 57,387 characters
• Manual checking conducted by two language experts
• Errors: 34 NSWs in 33 sentences
• NSW token accuracy: 98.09% (P_NSW = 1 - 34/1782 = 98.09%)
• Sentence accuracy: 96.7% (P_Sent = 967/1000 = 96.7%)

Taiwanese Mandarin
• Test set: 1,000 sentences including 1,402 NSWs, amounting to 29,158 characters
• Manual checking conducted by two language experts
• Errors: 33 NSWs in 31 sentences
• NSW token accuracy: 97.64% (P_NSW = 1 - 33/1402 = 97.64%)
• Sentence accuracy: 96.9% (P_Sent = 969/1000 = 96.9%)
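The two accuracy figures follow directly from the error counts. The short sketch below recomputes them from the numbers on the slide; the function and variable names are chosen for illustration. For Taiwanese Mandarin the NSW token accuracy evaluates to about 97.646%, which the slide gives as 97.64%.

```python
def accuracy(errors: int, total: int) -> float:
    """Accuracy in percent: 100 * (1 - errors / total)."""
    return 100.0 * (1 - errors / total)


# Error counts and test-set sizes as reported on the slide.
results = {
    "Standard Mandarin": {"nsw_errors": 34, "nsws": 1782, "bad_sentences": 33, "sentences": 1000},
    "Taiwanese Mandarin": {"nsw_errors": 33, "nsws": 1402, "bad_sentences": 31, "sentences": 1000},
}

for name, counts in results.items():
    p_nsw = accuracy(counts["nsw_errors"], counts["nsws"])
    p_sent = accuracy(counts["bad_sentences"], counts["sentences"])
    print(f"{name}: P_NSW = {p_nsw:.3f}%, P_Sent = {p_sent:.1f}%")
```

Sentence accuracy is lower than token accuracy because a single wrong NSW makes the whole sentence count as an error.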

DISCUSSION AND FUTURE WORK

Summary
• This paper presents a method of developing a corpus consisting of various categories of non-standard words (NSWs) and a representative test set for evaluating the text normalization module proposed for Standard Mandarin and Taiwanese Mandarin.

To note
• The two languages, known to be the same except for their character sets, differ in their NSW categories.
• More alphabetic items and their compounds appear in Taiwanese Mandarin (33.8%) than in Standard Mandarin (20.5%).
• More numbers and their compounds are found in Standard Mandarin (81.4%) than in Taiwanese Mandarin (63.9%).
• Symbols appear in similar proportions in the two languages.

Applications for Language Education

Rosetta Stone (로제타스톤)

Siwonschool Real Training (시원스쿨 리얼트레이닝)

SpeakingMax (스피킹맥스)

Feature comparison

Feature                RosettaStone                           RealTraining                                 SpeakingMax
environment            web / app                              web / app                                    web / app
materials              multimedia (photos / speech)           multimedia (interviews of native speakers)   multimedia (interviews of native speakers)
speaking method        shadowing                              shadowing                                    Repeat: shadowing; Speech: speaking
learning unit          word - phrase - sentence               sentence                                     sentence
assessment / feedback  word: pass/fail; sentence: sound wave  visual feedback including individual scores  visual feedback (a variational sound wave)
objective              fluency                                fluency                                      fluency

NAVER Cake

• English pronunciation assessment system

• Applied to the NAVER English Dictionary

NAVER Language Dictionaries

Concluding remarks

Future: AI tutors

https://blog.frontiersin.org/2016/11/04/robotic-tutors-for-primary-school-children/

Language Skills

Discussion

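• AI technology and foreign language education

• "Virtual Tutor"

• The role of foreign language education researchers in an AI environment

• Content development

• Content design

• Content recommendation (curation)

• Content evaluation

• Database construction

• Data design and development that takes functions and content into account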

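Related Research Output

• 김선희 (2013). 일본어 음성합성을 위한 음성셋 정의 [Defining a phone set for Japanese speech synthesis]. 한국음향학회 2013 추계학술대회 논문요약집 [Abstracts of the Acoustical Society of Korea 2013 Fall Conference], p. 17.

• 김종진, 김상진, 김선희, 김형준, 와타나베 리카, 홍진표 (2013). NAVER 다국어 음성합성 시스템 소개 [Introduction to the NAVER multilingual speech synthesis system]. 한국음향학회 2013 추계학술대회 논문요약집, p. 18.

• 김상진, 김종진, 김선희, 김형준 (2013). 영자신문 낭독 TTS용 음성 코퍼스의 발성목록 설계 [Recording script design for a speech corpus for TTS that reads English-language newspapers]. 한국음향학회 2013 추계학술대회 논문요약집, p. 17.

• 김선희 (2014). 영어 TTS DB 운율 연구: 낭독체와 대화체 비교 [A study of prosody in an English TTS DB: comparing read and conversational styles]. 한국음성학회 2014 가을 학술대회 논문집 [Proceedings of the Korean Society of Speech Sciences 2014 Fall Conference], pp. 93-94.

• 홍진표, 김선희 (2014). 한국어 TTS 개발을 위한 통합 운율경계 모델링 [Integrated prosodic boundary modeling for Korean TTS development]. 한국음성학회 2014 가을 학술대회 논문집, pp. 187-188.

• Minki Lee, Jaemin Kim, Sunhee Kim (2015). Classification of prosodic boundaries based on acoustic cues for Korean TTS. ICSS 2015.

• 이민기, 김선희, 김재민 (2015). 음소 특성을 이용한 한국어 자동 음소 전사 [Automatic phonemic transcription of Korean using phoneme characteristics]. 음성 및 신호처리 학술대회 2015 [Speech and Signal Processing Conference 2015].

• 이민기, 김재민, 홍진표, 김선희 (2016). 개인화 음성합성 개발을 위한 소용량 발성목록 추출 [Extracting a small recording script for personalized speech synthesis]. 한국음성학회 2016 봄 학술대회 [Korean Society of Speech Sciences 2016 Spring Conference].

• Sunhee Kim (2016). How to select a good voice for TTS. 9th ISCA Speech Synthesis Workshop.

• Sunhee Kim (2017). Corpus-based evaluation of Chinese text normalization. In 2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA) (pp. 1-4). IEEE. (Best paper)

• 김선희 (2018). 코퍼스 기반 프랑스어 텍스트 정규화 평가 [Corpus-based evaluation of French text normalization]. 말소리와 음성과학 [Phonetics and Speech Sciences], 10(4).

• 김선희 (2018). 지식기반 프랑스어 발음열 생성 시스템 [A knowledge-based system for generating French pronunciation sequences]. 말소리와 음성과학, 10(1), 49-55.

• 김선희 (2018). 프랑스어 자동 발음평가를 위한 음운 자질 연구 [A study of phonological features for automatic French pronunciation assessment]. 프랑스어문교육, 60, 147-168.

• 김선희 (2018). 프랑스어 schwa의 음향학적 특성 [Acoustic characteristics of French schwa]. 언어학연구, 49, 83-101.

• 김선희, 정현훈 (2018). 외국어 학습용 어플리케이션의 음성인식 기술 활용 현황 [Current use of speech recognition technology in foreign language learning applications]. 한국디지털콘텐츠학회 논문지, 19(4), 621-630.

• Ryu, Hyuksu, et al. (2016). Automatic pronunciation assessment of Korean spoken by L2 learners using best feature set selection. 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA). IEEE.

• Hyejin Hong, Sunhee Kim, & Minhwa Chung (2014). A corpus-based analysis of English segments produced by Korean learners. Journal of Phonetics, 46, 52-67.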

谢谢 (Thank you)
