Corpus Linguistics: session 2

Corpus Linguistics (2):

The Tools of the Trade

http://tinyurl.com/669o4zt

martin.wynne@it.ox.ac.uk

ylva.berglund@it.ox.ac.uk

Today’s session

• An introduction to some features of tools

• Demo of different (kinds of) tools

• Hands-on practice with one tool

AIM: Help you know what to look for in a tool for your work (and what options there are)

TYPES OF TOOLS

There are different kinds of tools.

Different kinds of tools

• Online / offline

• For one particular corpus / for any corpus or text

• Use straight away / need to prepare corpus

• 'Free' / licence conditions and costs

Tools may

• have different functions: concordance, wordlist, statistics, collocation, keywords…

• handle annotation: interpret tags, ignore tags, treat tags as text

• take different text formats: .txt, .xml, .html
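The three ways of handling annotation can be sketched in a few lines of Python. This is only an illustration: the `word_TAG` format, the tag labels, and the function names are assumptions made for the example, not the behaviour of any particular tool.

```python
# Hypothetical sample: one sentence annotated in an assumed word_TAG format.
tagged = "May_MD I_PRP borrow_VB a_DT car_NN"

def interpret_tags(text):
    """Split each token into a (word, tag) pair so a search can use the tag."""
    return [tuple(token.rsplit("_", 1)) for token in text.split()]

def ignore_tags(text):
    """Strip the annotation and keep only the words."""
    return [token.rsplit("_", 1)[0] for token in text.split()]

def tags_as_text(text):
    """Treat the annotation as ordinary characters: each token keeps its tag."""
    return text.split()

print(interpret_tags(tagged))
print(ignore_tags(tagged))
print(tags_as_text(tagged))
```

A search for borrow finds the word in the first two cases; in the third, the token is the string borrow_VB, so an exact-match search for borrow would miss it.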

TYPICAL FUNCTIONS

Different tools have different functions.

Concordance

• Search word + context

• Can be displayed as KWIC

• Can usually be sorted

• Used to see patterns of use
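The points above can be tried out with a minimal Python sketch of a KWIC concordance. The sample text, the function name `kwic`, and the simple tokenisation (lower-cased runs of word characters) are assumptions for illustration; real concordancers offer far more options.

```python
import re

def kwic(text, node, width=3, sort_by="right"):
    """Return a list of (left context, node word, right context) tuples."""
    tokens = re.findall(r"\w+", text.lower())
    hits = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            hits.append((left, tok, right))
    if sort_by == "right":
        # Sort on the words following the node, nearest word first.
        hits.sort(key=lambda h: h[2])
    elif sort_by == "left":
        # Sort on the words preceding the node, nearest word first.
        hits.sort(key=lambda h: h[0][::-1])
    return hits

# Hypothetical mini-corpus for the demonstration.
sample = "May I borrow a car? You can borrow books here. Banks borrow money too."

for left, node, right in kwic(sample, "borrow"):
    print(f"{' '.join(left):>20}  {node}  {' '.join(right)}")
```

Sorting on the right or left context brings similar lines next to each other, which is what makes patterns of use visible.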

KWIC Concordance

Wordlist

List all words in the corpus

• alphabetically

• by frequency

Used as starting point for further functions

• keywords

• lexical density/readability calculations
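A wordlist with both orderings can be sketched as follows. The sample text, the function name `wordlist`, and the tokenisation rule (lower-cased runs of word characters) are assumptions for the example; tools differ in how they define a word.

```python
import re
from collections import Counter

def wordlist(text):
    """Count every word form in the text (case-insensitive)."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)

# Hypothetical mini-corpus for the demonstration.
sample = "May I borrow a car? You can borrow books here. Banks borrow money too."
counts = wordlist(sample)

# Alphabetical order: sort on the word itself.
alphabetical = sorted(counts.items())

# Frequency order: most frequent first, ties broken alphabetically.
by_frequency = sorted(counts.items(), key=lambda item: (-item[1], item[0]))

for word, freq in by_frequency:
    print(f"{freq:>3}  {word}")
```

The same counts feed further calculations: the number of tokens is `sum(counts.values())` and the number of distinct word forms is `len(counts)`.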

Sampler AntConc wordlist

Collocations

Co-occurrence patterns

borrow money

borrow books

borrow a car

May I borrow

(more in Session 3)
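As a rough illustration of co-occurrence, the sketch below counts the words found within a fixed span of a node word. The sample sentences, the function name `collocates`, and the span of two words are assumptions for the example. It reports raw counts only, ignores sentence boundaries, and does not compute the association statistics that corpus tools usually provide.

```python
import re
from collections import Counter

def collocates(text, node, span=2):
    """Count words occurring within `span` tokens either side of the node."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
        counts.update(window)
    return counts

# Hypothetical mini-corpus for the demonstration.
sample = (
    "You can borrow money from a bank. "
    "Students borrow books from the library. "
    "May I borrow a car? "
    "They had to borrow money again."
)

for word, freq in collocates(sample, "borrow").most_common(5):
    print(f"{freq:>3}  {word}")
```

Even in this tiny sample, money and from occur twice near borrow, while the other words occur once.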

Collocates: adjectives immediately preceding BUSINESS

Corpus of Contemporary American English

http://www.americancorpus.org/

Visualization

Graphs

Word clouds

Distribution displays

Etc.

Example: BNCweb

borrow

Example: Voyant Tools

http://voyant-tools.org

‘borrow’

Compare your intuition to what you find in the corpus

What is borrowed and by whom?

What words do you expect to find together with borrow?

Can these words be grouped in some way, for example based on their word class, function, or meaning?

Where would you expect to find these words (e.g. before or after borrow, immediately adjacent or not)?

Who do you think uses the word borrow? In what context or type of language would you find borrow?

Are there any words that are NOT used with borrow?

AntConc

Download AntConc for free from:

http://www.antlab.sci.waseda.ac.jp/antconc_index.html

(or just search for AntConc)

Use your own texts and corpora. Find some examples at:

http://www.ota.ox.ac.uk/

Tip of the week

Register to use the BYU corpora for free.

http://corpus.byu.edu

Next week (Session 3)

Collocation

Corpus linguists claim to have identified an important principle that is responsible for the creation of much of the meaning of texts: collocation (co-occurrence). What is it, and are the claims true?

Optional reading:

* Xiao, Richard, and Tony McEnery (2006). "Collocation, Semantic Prosody, and Near Synonymy: A Cross-Linguistic Perspective." Applied Linguistics 27(1): 103-129. http://applij.oxfordjournals.org/cgi/content/full/27/1/103
