Translating Data Driven Language Learning into French Tom Cobb Dép. de Linguistique Université du...

Translating Data Driven Language Learning into French

Tom Cobb, Dép. de Linguistique

Université du Québec à Montréal

Peut-on augmenter le rythme d’acquisition lexicale par la lecture ?

Une expérience de lecture en français appuyée sur une série de ressources en ligne.

Tom Cobb, Université du Québec à Montréal

Can the rate of lexical acquisition from reading be increased?

An experiment in reading French with a suite of on-line resources.

Tom Cobb, Université du Québec à Montréal

Background:

Data-Driven Language Learning On-line

Discovery learning
Learner-as-linguist
Alternatives to rules & definitions
Concordancing
  Grammar Safari
  Concordancing
  Concordancing on-line
  Concordancing on-line in French

The idea of shortcuts to L2

It has long been known that the time available for language learning (LL) through experience is inadequate in most cases

Learner’s time is short
Database is dispersed
Much time is needed to expose patterns in data

The traditional shortcut to L2: Explicit declarative knowledge

‘Rules’ in grammar
‘Definitions’ in vocabulary

Never all that successful

Linguistic computing makes another kind of shortcut possible

Data aggregation & compression
Rapid pattern exposure

‘Rules’ in grammar

Error: * This is one of the biggest car in the world

Solution: We tell students the rule: “After ‘one of the’ comes a plural noun”

Or, tell them to go check the data

10 of 396 examples in Brown Corpus…
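"Go check the data" amounts to a keyword-in-context (KWIC) search. A minimal sketch in Python, assuming a tiny made-up sample text in place of the Brown Corpus; the `kwic` function, its `width` parameter and the `SAMPLE` sentences are illustrative only and do not come from any tool named in this talk:

```python
import re

def kwic(text, phrase, width=30):
    """Return one keyword-in-context line per occurrence of `phrase`."""
    lines = []
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    for m in pattern.finditer(text):
        # Take up to `width` characters on each side of the match.
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Pad both sides so the keyword lines up in a column.
        lines.append(f"{left:>{width}}[{m.group(0)}]{right:<{width}}")
    return lines

# Hypothetical sample sentences, NOT taken from the Brown Corpus.
SAMPLE = (
    "This is one of the biggest cars in the world. "
    "She is one of the best players on the team. "
    "It was one of the reasons he left."
)

concordance = kwic(SAMPLE, "one of the")
for line in concordance:
    print(line)
```

With the keyword aligned in one column, the plural noun that follows "one of the" can be read straight down the right-hand side.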

Advantages of data based learning

Learners initiate search themselves
Patterns are large, crystal clear
Linguistic authenticity is assured
Learners have positive role to play: they are linguists (Cobb, 1999)
  Cf. negative ‘mistake maker’ role in traditional approach
Technology is used in a non-gaming context
And used well, since concordances cannot be generated by any other means

Building a second lexicon - big need for data aggregation

Contextual inference problematic
  On learner-side (inferences generally unsuccessful; Laufer, Haynes et al. studies)
  On data-side (poor contexts, vast distances between)
Dictionary information hard to use by those who need it
Direct instruction runs up against task-size problem

Can computer data-aggregation help build a second lexicon? Two ideas:

1. List-driven learning:
Corpus and concordance linked to frequency lists
Frequency based testing to find level
Make yourself a dictionary at the level where you are weak
Example: Lexical Tutor
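The list-driven idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Lexical Tutor works: the toy `CORPUS`, the `frequency_list` and `personal_dictionary` helpers, and the notion of a rank "band" are all hypothetical names introduced here.

```python
import re
from collections import Counter

def frequency_list(corpus):
    """Rank word forms in `corpus` from most to least frequent."""
    words = re.findall(r"\w+", corpus.lower())
    return Counter(words).most_common()

def personal_dictionary(ranked, known, start, end):
    """Words ranked start..end (1-based, inclusive) that the learner does not know."""
    band = [word for word, _ in ranked[start - 1:end]]
    return [word for word in band if word not in known]

# Toy corpus (hypothetical); a real list would come from a large corpus.
CORPUS = "le chat dort le chien dort le chat mange le chat"

ranked = frequency_list(CORPUS)
# Suppose testing showed the learner already knows these words.
to_learn = personal_dictionary(ranked, known={"le", "dort"}, start=1, end=3)
print(ranked[:3])
print(to_learn)
```

Each word in `to_learn` could then be fed to a concordancer to collect example contexts for the learner's own dictionary.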

Problems with list-driven learning:

1. Needed frequency information seems unavailable except in English

2. List is not everyone’s cup of tea

So, another idea: Adapt computational tools to the less structured context of extensive reading

Introducing R-READ

Reading Extended Authentic Documents with Resources

…of a kind that are increasingly capable of Internet delivery

Brief History of Computer-Assisted L2 Reading

Pre-Internet Age: Skills based, no proof of transfer, “too little to read”

Internet Age: Too much to read, reading reduced to scanning

R-READ as a middle way

that uses Internet resources to

make extensive authentic documents readable, and

target specific learning

Personal Anecdote

Me, 1980, French reading test looming…
Method: read one book, several times, aided by a ‘language consultant’
  Voltaire’s Candide
  Francophone girlfriend
Look into every word; deconstruct every structure
Repeat pronunciations
Stick-on concordances
Little notebooks
Stick-ons removed, fewer look-ups

First hurdle cleared in about a week

Equity problem:

Not everyone can find a personal language consultant

Question: Would it be possible to itemise what the consultant was doing and reproduce these services universally?

An electronic language consultant?

Go online

User lexicon

Research Base (1)

Listen & read
Draper & Moeller, 1971; Stanovich, 1986; Lightbown, 1992

Concordance: computer aided contextual inference

Huckin, Haynes & Coady, 1991; Cobb, 1999; Zahar, Cobb, & Spada, in press

Database as take-home learning outcome

Minimal time-off-task (Cobb, 1997)
Collaborative (Horst & Cobb, in prep)

Research Base (2)

Dictionary
  Can disrupt reading, cause misconception (Noblitt et al., 1990)
  Useful pair with context if it follows effort to infer (Fraser, 1990)

Click-on interface
  Even if useful, dictionary will not be used if effortful (Hulstijn et al., 1996)

Research Base (3)

R-READ as middle position between stark choices of the past on extensive reading

Alternative 1: Natural extensive reading is an adequate source of vocabulary growth in L1 (Krashen, 1989) or L2 (Nagy, 1997)

Alternative 2: Vocabulary growth will not happen if conditions are not in place; assure they are in place by pre-teaching wordlists, out of context if necessary (Nation & Waring, 1997)

Middle approach made possible through ‘NTIC’ (new information and communication technologies)

Vocabulary enhanced reading (Hulstijn, Hollander, & Greidanus, 1996)
Learners make their own way through roughly tuned texts with support of resources
In-context feature preserved

But is it useful?
What follows is a substantial test of this middle approach

Pilot Test of de Maupassant’s Boule de Suif with R-READ

How do vocabulary learning results of reading with online lexical resources compare to results of reading without these tools?

Baseline for comparison: Repeated-reading case studies of lexical acquisition by Horst (2000)

R’s reading of German novella (Horst, 2000)

R – motivated adult intermediate learner
German novella, 9500 words
300 unique targets (1:32)
45% rated unknown at pretest
20% rated known at pretest
Treatment: 3 readings
Av. 3 hrs / reading (3167 wds/hr)

J’s reading of Boule de Suif

J – motivated adult intermediate learner
Boule de Suif, 13,400 words
400 unique targets (1:33)
45% rated unknown at pretest
27% rated known at pretest
Treatment: 3 readings
Av. 4.6 hrs / reading (2913 wds/hr)

R’s German novella vs. J’s Boule de Suif

                           R                              J
Learner                    motivated adult intermediate   motivated adult intermediate
Text                       German novella                 Boule de Suif
Length                     9500 words                     13,400 words
Unique targets             300 (1:32)                     400 (1:33)
Rated unknown at pretest   45%                            45%
Rated known at pretest     20%                            27%
Treatment                  3 readings                     3 readings
Av. time per reading       3 hrs (3167 wds/hr)            4.6 hrs (2913 wds/hr)
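The reading rates and target densities quoted for R and J follow from simple division. A short Python check, using only the figures given on the slides; the function names are illustrative:

```python
def words_per_hour(text_words, hours_per_reading):
    """Average reading rate for one pass through the text, rounded to whole words."""
    return round(text_words / hours_per_reading)

def words_per_target(text_words, targets):
    """Running words of text per unique target word."""
    return text_words / targets

# Figures from the slides: text length, hours per reading, unique targets.
r_rate = words_per_hour(9500, 3)
j_rate = words_per_hour(13400, 4.6)
r_density = words_per_target(9500, 300)
j_density = words_per_target(13400, 400)

print(f"R: {r_rate} wds/hr, one target per {r_density:.1f} words")
print(f"J: {j_rate} wds/hr, one target per {j_density:.1f} words")
```

Both learners read at roughly 3000 words per hour and met a target word about once every 32 to 34 running words, so the two cases are comparable on these measures.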

Rating scale used at end of each reading

0 = I don't know what this word means
1 = I am not sure what this word means
2 = I think I know what this word means
3 = I definitely know what this word means

(Underlining added)

Non-binary measure, Horst & Meara, 1999

Results

               Pretest   Posttest 1   Posttest 2   Posttest 3
0 (unknown)    180 wds       74           49           28
1,2 (unsure)   142 wds      189          165          170
3 (known)       78 wds      137          186          202

J’s word knowledge ratings before reading and after each of three readings (resource assisted)

Summary: Unknown reduced from 180 to 28. Known increased from 78 to 202.
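The word counts in J's results can be converted to shares of the 400 targets with a few lines of Python. This is only a sketch of the arithmetic; the `COUNTS` layout and the `shares` helper are illustrative names, and the numbers are those reported for J above.

```python
# J's ratings: counts at pretest and after each of the three readings.
COUNTS = {
    "0 (unknown)": [180, 74, 49, 28],
    "1,2 (unsure)": [142, 189, 165, 170],
    "3 (known)": [78, 137, 186, 202],
}
TOTAL = 400  # unique target words

def shares(counts, total):
    """Convert each count to a percentage of `total`."""
    return {label: [100 * n / total for n in row] for label, row in counts.items()}

# Sanity check: every test occasion should account for all 400 targets.
column_sums = [sum(col) for col in zip(*COUNTS.values())]

percent = shares(COUNTS, TOTAL)
for label, row in percent.items():
    print(label, [f"{p:.1f}%" for p in row])
```

Each column sums to 400, so no target word is unaccounted for at any test occasion.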

Comparison to baseline

                   Results for R (unassisted)    Results for J (R-READ)
                   n=300 words                   n=400 words
                   Pretest    3rd posttest       Pretest    3rd posttest
0 (not known)      45%        38%                45%        7%
1 or 2 (unsure)    28%        33%                36%        43%
3 (known)          27%        29%                20%        51%

Percentage of targets in each category at outset and after three readings, unassisted and assisted


R’s results are typical of many acquisition-from-reading studies; J is 250% greater in the ‘known’ category.

Self-assessment check

J (after 3 readings) and R (after 10 readings) asked for translations of words judged known

J’s responses 94% accurate (three readings with R-READ)

R’s responses 77% accurate (10 unassisted readings)

Conclusion (1)

This is only a pilot study

Suggests significant learning increase for minor time increase

These are learning figures seen in previous research only for tiny word sets via ‘rich’ instruction (Beck, McKeown… 1982)

Conclusion (2)

Suggests viability of middle-way model of acquisition-through-reading

Suggests that low-cost language consultants can be brought into widespread use

Conclusion (3)

J. B. Carroll (1964) expressed a wish that a way could be found to mimic the effects of natural contextual learning, except more efficiently....

Maybe this long-standing educational impasse can be resolved through the principled application of computer technology. How many others might be?

Acknowledgements

This Web page incorporates the labours of many:

The novella 'Boule de Suif': Guy de Maupassant (1870)

Concordance program, true click-on hypertext: Chris Greaves, Virtual Language Centre, Polytechnic University, Hong Kong

French-English Dictionary: Neil Coffey, http://www.french-linguistics.co.uk/dictionary/

Complete Corpus of de Maupassant oeuvre: Thierry de Selva, Laboratoire d'Informatique, Université de Franche-Comté, Besançon

Read-aloud of 'Boule de Suif': Dominique Daguier, for «Le livre qui parle»

Perl scripting for User Lexicon: Mutassem Abdulahab & Monet, EZScripting

Web formatting of 'Boule de Suif': Carole Netter, Clicnet, Swarthmore College

Historical Background: Luc et Eric Dodument, Skylink, Hombourg, Belgium

Movie poster: http://perso.wanadoo.fr/lester/fifiaffiche.htm

Frequency List: Association des Bibliophiles Universels (ABU), De Maupassant, CEDRIC/CNAM, Paris
