Applications of Corpora in English Language Learning
Yen-Liang Lin (Eric)
Corpus linguistics
“A corpus is a collection of (1) machine-readable (2) authentic texts (including transcripts of spoken data) which is (3) sampled to be (4) representative of a particular language or language variety” (McEnery, Xiao and Tono, 2006, p. 5).
A large corpus provides evidence of typical formulaic language use.
Google search engine
www.zlcool.com LOGO
Disadvantages of using Google search engine
Very time-consuming to filter out irrelevant information
Queries not flexible and specific enough
Mixed with native and nonnative speakers’ productions
The output of the data is unorganized.
Too much data to make a generalization.
Collocations
A sequence of words or terms which co-occur more often than would be expected by chance.
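The phrase "more often than would be expected by chance" can be made concrete with an association measure. The sketch below uses pointwise mutual information (PMI), one common choice among several; it is not necessarily the measure used by the tools listed later in these slides. The function name `pmi_scores`, the `min_count` threshold, and the toy text are all assumptions made for illustration.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=2):
    """Return {(w1, w2): PMI} for adjacent word pairs seen at least min_count times."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI values
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI > 0 means the pair occurs more often than chance would predict
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy text invented for illustration; a real study would use a large corpus.
text = ("she drinks strong tea every morning . he likes strong tea too . "
        "the tea was cold . the morning was cold .")
tokens = text.split()
scores = pmi_scores(tokens)

# Print the surviving pairs from highest to lowest PMI.
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

On a text this small the scores are only suggestive; with a large corpus, high-PMI pairs such as "strong tea" are candidates for collocations worth teaching.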
V + N: 提出 (to put forward) ...
raise a question
advance / make a request
file / make / submit an application
tender / submit / hand in one’s resignation
______________ a check (開支票)
______________ perfume (擦香水)
______________ the jackpot (中獎)
Collocations V + N
我早上起床時感到劇烈的頭痛。
When I woke up this morning, I _________ a serious headache.
這種情況下會出現什麼困難?
What kind of difficulties may _____________ from this situation?
青少年需要一些基本生活技能的訓練,包括洗自己的衣服。
Teenagers need some basic life-skills training, including how to _________ their own laundry.
V + Prep
I congratulate you ______ your promotion.
remind him _______ something
be accused _______ corruption
N + Prep
A check _______ $500
UK is an abbreviation _______ United Kingdom.
Collocations
Adj + N
交通壅擠 ___________ traffic
濃茶 _____________ tea
空頭支票 ____________ check
酸雨 ____________ rain
大風大雨 ____________ rain ____________ wind
Collocations
Oxford Online Collocation Dictionary
MUST <http://miscollocation.appspot.com/>
The Tango English collocation
COCA
Corpus of Contemporary American English (http://corpus.byu.edu/coca/)
COCA, created by Professor Mark Davies at Brigham Young University, provides free access to an authentic American English corpus of 450 million words.
COCA has well-designed corpus tools tailor-made for corpus research.
Just the word (JTW)
Chinese-English Parallel Corpora
The TOTALrecall bilingual concordancer was developed by Professor Jason Chang at National Tsing Hua University in Taiwan. http://candle.fl.nthu.edu.tw/totalrecall/totalrecall/totalrecall.aspx?funcID=1
Jukuu ( 句酷 ) is a multilingual translation memory system. http://www.jukuu.com/
TAUS is a multilingual translation memory system. http://www.tausdata.org/index.php/home
JuKuu
JuKuu 2
Eric Lin