Timothy Baldwin and Dominic Widdows

Noun Countability


Background

• Countability is a syntactic property of the noun phrase in

languages such as English, Dutch, Albanian and Tagalog

• In generation, used to decide between:

a cake, cake, a piece of cake

• In analysis, helps to resolve ambiguity:

◦ I need a paper by this evening (academic/newspaper)

◦ I need some paper by this evening (material)

◦ I need the paper by this evening (ambiguous)

31 October, 2003


Noun Phrase Countability

• Semantically motivated:

◦ bounded, indivisible individuals (+b); prototypically countable: a dog, two dogs

◦ unbounded, divisible substances (−b); prototypically uncountable: gold


Countability Classes

• countable: book, button, person (one book, two books)

• uncountable: equipment, gold, wood (*one equipment, much equipment, *two equipments)

• plural only: clothes, manners, outskirts (*one clothes, clothes horse)

• bipartite: glasses, scissors, trousers (*one scissors, scissor kick, pair of scissors)


Applications

• Determination of countability of unknown nouns (e.g.

acyclovir, coagulopathy)

• Detection of countability anomalies in multiword

expressions (e.g. public relations, cat’s cradle)

• Extraction of English determinerless PPs (e.g. by bus, at sea)

• Key component of noun type hierarchy in precision

grammars (e.g. ERG, Alpino)


Learning the Countability of English Nouns from Corpus Data

Timothy Baldwin and Francis Bond, ACL 2003


Learning Countability

• Observation: the countability properties of a noun type

are reflected in its corpus token occurrences:

... Cezanne snarling like a dog and then ...

... doing an impression of a rabid dog.

... with a pack of dogs running beside them.

Amnesty International has received information ...

Recent information from former detainees ...

... researchers often uncover information ...


Case in Point

Acyclovir is a specifically anti-viral drug ...

Acyclovir has been developed and marketed by ...

Acyclovir given intravenously, ...

Coagulopathy is a well recognised complication ...

... may explain why coagulopathy after shunting is ...

... could stimulate a coagulopathy ...

... is also probably responsible for a coagulopathy ...

... a patient with a coagulopathy.


Methodology

• Identify lexical and/or constructional features associated

with each countability class

• Determine the relative corpus occurrence of the features

for each noun

• Use the noun feature vectors to classify the noun as

a member of each of the countability classes, training

from gold-standard countability data


Feature Clusters

Head noun number (1D): target noun number as head of NP (e.g. a shaggy dog = SINGULAR)

Modifier noun number (1D): target noun number as modifier in NP (e.g. dog food = SINGULAR)

Subject–verb agreement (2D): target noun number as subject vs. verb number agreement (e.g. the dog barks = 〈SINGULAR,SINGULAR〉)

Coordinate noun number (2D): target noun number vs. the number of the head nouns of conjuncts (e.g. dogs and mud = 〈PLURAL,SINGULAR〉)

N of N constructions (2D): number of N vs. type of N (e.g. the type of dog = 〈TYPE,SINGULAR〉); total of 11 N types for use in this feature cluster (e.g. COLLECTIVE, LACK, TEMPORAL).

Occurrence in PPs (2D): the presence or absence of a determiner (±DET) in singular head complement of PP (e.g. per dog = 〈per,−DET〉).

Pronoun co-occurrence (2D): what pronouns occur in the same sentence as singular and plural instances (e.g. The dog ate its dinner = 〈its,SINGULAR〉). Approximation of pronoun co-indexation.

Singular determiners (1D): singular-selecting determiners (e.g. a dog = a). Two types: countable (e.g. another, each), uncountable (e.g. much, little).

Plural determiners (1D): plural-selecting determiners (e.g. few dogs = few).

Non-bounded determiners (2D): non-bounded determiner vs. noun number (e.g. more dogs = 〈more,PLURAL〉).


Feature Values

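1D:

\[ \mathrm{corpfreq}(f_s, w) = \frac{\mathrm{freq}(f_s \mid w)}{\mathrm{freq}(*)} \tag{1} \]

\[ \mathrm{wordfreq}(f_s, w) = \frac{\mathrm{freq}(f_s \mid w)}{\mathrm{freq}(w)} \tag{2} \]

\[ \mathrm{featfreq}(f_s, w) = \frac{\mathrm{freq}(f_s \mid w)}{\sum_i \mathrm{freq}(f_i \mid w)} \tag{3} \]

2D:

\[ \mathrm{featdimfreq}(f_{s,t}, w) = \frac{\mathrm{freq}(f_{s,t} \mid w)}{\sum_i \mathrm{freq}(f_{i,t} \mid w)} \tag{4} \]

\[ \mathrm{featdimfreq}(f_{s,t}, w) = \frac{\mathrm{freq}(f_{s,t} \mid w)}{\sum_j \mathrm{freq}(f_{s,j} \mid w)} \tag{5} \]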
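
As a rough illustration of feature values (1) to (5), the following Python sketch computes them from toy counts. It assumes that freq(∗) is a total corpus count and that each feature cluster is stored as a simple count table; the function names, the `axis` argument and all numbers are made up for illustration and are not taken from the original system.

```python
from collections import Counter

def feature_values_1d(counts, noun_freq, corpus_size):
    """counts maps each feature f_s of one cluster to freq(f_s | w).

    corpus_size stands in for freq(*) (assumed here to be a corpus total).
    """
    cluster_total = sum(counts.values())
    return {
        feat: {
            "corpfreq": freq / corpus_size,    # eq. (1)
            "wordfreq": freq / noun_freq,      # eq. (2)
            "featfreq": freq / cluster_total,  # eq. (3)
        }
        for feat, freq in counts.items()
    }

def featdimfreq(counts2d, axis):
    """counts2d maps (s, t) to freq(f_{s,t} | w).

    axis=0 sums over the first index with t fixed (eq. 4);
    axis=1 sums over the second index with s fixed (eq. 5).
    """
    totals = Counter()
    for (s, t), freq in counts2d.items():
        totals[t if axis == 0 else s] += freq
    return {
        (s, t): freq / totals[t if axis == 0 else s]
        for (s, t), freq in counts2d.items()
    }

# Toy counts for "dog": head noun number (1D) and subject-verb agreement (2D).
head_number = Counter({"SINGULAR": 30, "PLURAL": 10})
values = feature_values_1d(head_number, noun_freq=50, corpus_size=1000)

agreement = {
    ("SINGULAR", "SINGULAR"): 6,
    ("PLURAL", "PLURAL"): 3,
    ("PLURAL", "SINGULAR"): 2,
}
by_first = featdimfreq(agreement, axis=0)
by_second = featdimfreq(agreement, axis=1)
```

The three 1D values differ only in their denominator, so they can be read as the same count viewed against the corpus, the noun and the feature cluster respectively.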



[Figures: 1-D case (corpfreq, wordfreq, featfreq); 2-D case (corpfreq, wordfreq, featfreq, featdimfreq, and the ∗ variants over the first and second feature dimensions)]


Feature Value Extraction

• POS tagging and templates

◦ extract features with regexp-based templates

• Full text chunking

◦ conservative inter-chunk attachment disambiguation

• Robust parsing (RASP)

• Concatenated feature values from three systems
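
To make the template idea concrete, here is a minimal Python sketch of one regexp-based template over POS-tagged text. It is not the templates used in the work described here: the `word/TAG` token format, the Penn-style tags (DT, JJ, NN) and the function name are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical template: determiner, optional adjectives, then a singular noun.
# Assumes "word/TAG" tokens with Penn-style tags; the (?!\w) lookahead keeps
# NN from matching the prefix of NNS or NNP.
SINGULAR_DET_TEMPLATE = re.compile(r"(\w+)/DT(?: \w+/JJ)* (\w+)/NN(?!\w)")

def singular_determiner_counts(tagged_text, noun):
    """Count the determiners that select singular instances of `noun`."""
    counts = Counter()
    for det, head in SINGULAR_DET_TEMPLATE.findall(tagged_text):
        if head.lower() == noun:
            counts[det.lower()] += 1
    return counts

tagged = "A/DT shaggy/JJ dog/NN barked/VBD at/IN another/DT dog/NN ./."
det_counts = singular_determiner_counts(tagged, "dog")
```

Counts of this kind would feed the singular determiner cluster; the chunker and parser routes would supply the same clusters from richer structure.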



Classifier architecture

• Training data: generated from combination of ALT-J/E

and COMLEX (5,943 common nouns in BNC)

◦ positive examples in both ALT-J/E and COMLEX

◦ negative examples in neither ALT-J/E nor COMLEX

• Test data: nouns with ≥ 10 BNC instances for all 3

methods (20,530 common nouns)

• Four binary supervised classifiers, one per countability

class (learned using TiMBL and k-NN)



Supervised Classifiers: Basic Overview

[Diagram: training data feeds a learner, which produces a classifier; each test instance from the test data is passed to the classifier, which outputs a classification (A or B)]


k-NN in TiMBL

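• Distance between feature vectors X and Y based on the “overlap metric”:

\[ \Delta(X, Y) = \sum_{i} \delta(x_i, y_i) \]

\[ \delta(x_i, y_i) = \begin{cases} \left| \dfrac{x_i - y_i}{\max_i - \min_i} \right| & \text{if numeric, else} \\ 0 & \text{if } x_i = y_i \\ 1 & \text{if } x_i \neq y_i \end{cases} \]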

• Retrieve the neighbours at the k closest distances and

classify according to the most common class amongst

them
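
The distance and voting scheme above can be sketched in a few lines of Python. This is an illustrative re-implementation under simplifying assumptions, not TiMBL itself: feature ranges are passed in by hand, ties in the vote are not handled specially, and all names and toy vectors are hypothetical.

```python
from collections import Counter

def delta(x, y, lo, hi):
    """Per-feature distance: scaled difference if numeric, else 0/1 overlap."""
    if isinstance(x, (int, float)) and isinstance(y, (int, float)):
        return abs(x - y) / (hi - lo) if hi > lo else 0.0
    return 0.0 if x == y else 1.0

def distance(X, Y, ranges):
    """Sum of per-feature distances; ranges holds (min_i, max_i) per feature."""
    return sum(delta(x, y, lo, hi) for x, y, (lo, hi) in zip(X, Y, ranges))

def knn_classify(train, query, ranges, k=1):
    """train is a list of (feature_vector, label) pairs."""
    scored = [(distance(vec, query, ranges), label) for vec, label in train]
    # Keep every neighbour lying at one of the k smallest distinct distances.
    closest = sorted({d for d, _ in scored})[:k]
    votes = Counter(label for d, label in scored if d in closest)
    return votes.most_common(1)[0][0]

# Toy vectors: two numeric feature values plus one symbolic feature.
train = [
    ((0.9, 0.1, "a"), "countable"),
    ((0.8, 0.2, "a"), "countable"),
    ((0.1, 0.9, "much"), "uncountable"),
]
ranges = [(0.0, 1.0), (0.0, 1.0), (None, None)]  # symbolic feature has no range
label = knn_classify(train, (0.7, 0.3, "a"), ranges, k=1)
```

Note that k counts distinct distances rather than individual neighbours, so several equidistant training instances can all vote at k = 1.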



Cross Validation: Input

• Take training data:



Cross Validation: Partitioning

• Split up into N equal-sized (optionally stratified) partitions P_i:

[Figure: the training data divided into ten partitions P]


Cross Validation: Fold 1

• For each i = 1...N, take P_i as the test data and {P_j : j ≠ i} as the training data

[Figure: the ten partitions P]







Cross Validation: Fold i

• And so on ...



Cross Validation: Evaluate

• Calculate classification accuracy, precision, recall, F-score, ... according to the average across the N iterations

• Effective method of minimising training bias and test

variance
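
The cross-validation procedure can be sketched as follows. This is a minimal, unstratified Python version for illustration only; the majority-class learner, the toy data and every function name are assumptions, standing in for whatever learner and evaluation metric are actually used.

```python
import random
from collections import Counter

def n_fold_splits(items, n, seed=0):
    """Yield (train, test) pairs, holding out each of n partitions in turn."""
    items = list(items)
    random.Random(seed).shuffle(items)
    folds = [items[i::n] for i in range(n)]  # near-equal partitions P_1..P_N
    for i in range(n):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]

def cross_validate(items, n, train_fn, eval_fn):
    """Average eval_fn over the n held-out partitions."""
    scores = [eval_fn(train_fn(train), test)
              for train, test in n_fold_splits(items, n)]
    return sum(scores) / len(scores)

# Hypothetical stand-in learner: always predict the majority training label.
def train_majority(train):
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda noun: majority

def accuracy(model, test):
    return sum(model(noun) == label for noun, label in test) / len(test)

# Toy data: 30 countable and 10 uncountable nouns.
data = [(f"noun{i}", "countable" if i % 4 else "uncountable") for i in range(40)]
cv_accuracy = cross_validate(data, 10, train_majority, accuracy)
```

A stratified variant would additionally keep the class proportions roughly equal across partitions, which matters most for the less-populated countability classes.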



Cross-validated Countability Results

• Good results (particularly for countable and uncountable

nouns), well above the baseline accuracy in each case

• Best results for combined method (concatenation of

three pre-processors)



Manual Evaluation over Open Data

• Classifier precision of 94.6% relative to lexicons

• Manually annotated 100 nouns from the test data:

◦ Agreement between classified and hand-annotated countabilities: 92.4%

◦ Agreement between classified and dictionary countabilities: 92.4%

• Classifiers agree with corpus as well as lexicons



Reflections

• Impressive results, but still room for improvement

(particularly for the less-populated countability classes)

• Boundary between motivated countabilities and

conversions (e.g. chicken vs. elephant vs. dog)

• Difficulties caused by MWEs (e.g. cat’s cradle)

• Sense and frequency effects (e.g. information)



Using an ontology to determine English countability

Francis Bond and Caitlin Vatikiotis-Bateson, COLING 2002



Semantic Predictability of Countability

• How far is English countability predictable from

meaning?

• Countability is to some degree deterministic given the

semantics of a word:

dog, pooch, canine, mongrel, ...

BUT suitcases vs. luggage, leaves vs. foliage, etc.



Case in Point

Coagulopathy: group of conditions of the blood clotting

(coagulation) system in which bleeding is prolonged and

excessive, a bleeding disorder

Acyclovir: antiviral drug



Word Denotation and Countability

• Knowing the referent is not enough, e.g. scales

1. Thought of as being made of two arms: (British)

a pair of scales

2. Thought of as a set of numbers: (Australian)

a set of scales

3. Thought of as discrete whole objects: (American)

one scale/two scales



Methodology

• Take an existing ontology and determine the default

countability for each synset (semantic class)

• Test how reliably defaults predict the countability of

members of each synset

• Base experimentation on the ALT-J/E semantic transfer

lexicon and ontology



Lexicon

• ALT-J/E’s semantic transfer lexicon

• 71,833 linked Japanese-English noun pairs



The Goi-Taikei Ontology

• A rich ontology and wide coverage of Japanese

• Used in many NLP applications such as MT

• 2,710 semantic classes (12-level tree structure) for

common nouns

• Constructed from translation pairs (without countability

in mind)



Top Four Levels of Ontology



Noun Countability Preferences in ALT-J/E

Noun Countability Preference   Code   Example     Default Number   Default Classifier   #        %
fully countable                CO     knife       sg               -                    47,255   65.8
strongly countable             BC     cake        sg               -                    3,110    4.3
weakly countable               BU     beer        sg               -                    3,377    4.7
uncountable                    UC     furniture   sg               piece                15,435   21.5
plural only                    PT     scissors    pl               pair                 2,107    2.9



Experiment

• Treat every combination of semantic classes as a

different semantic class.

• Most frequent NCP is assigned to all members of a

class.

◦ Ties are resolved as follows: fully countable beats strongly countable beats weakly countable beats uncountable beats plural only.

• Baseline (all fully countable = 65.8%)



Example

• Semantic Class 910:tableware

◦ crockery ⇔ toukirui (UC)

◦ dinner set ⇔ youshokki (CO)

◦ tableware ⇔ shokki (UC)

◦ Western-style tableware ⇔ youshokki (UC)

• The most common NCP is UC, so uncountable is associated with 910:tableware.

• This predicts the NCP correctly 75% of the time.
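
The default assignment and its accuracy for a single class can be reproduced with a short Python sketch. The codes and tie-break order come from the slides; the function names and the list representation of a class are assumptions made for illustration.

```python
from collections import Counter

# Tie-break order: fully countable beats strongly countable beats weakly
# countable beats uncountable beats plural only.
PRIORITY = ["CO", "BC", "BU", "UC", "PT"]

def default_ncp(codes):
    """Most frequent NCP among a class's members, ties broken by PRIORITY."""
    counts = Counter(codes)
    best = max(counts.values())
    return next(code for code in PRIORITY if counts.get(code, 0) == best)

def class_accuracy(codes):
    """Fraction of members whose NCP matches the class default."""
    default = default_ncp(codes)
    return sum(code == default for code in codes) / len(codes)

# 910:tableware: crockery, dinner set, tableware, Western-style tableware.
tableware = ["UC", "CO", "UC", "UC"]
tableware_default = default_ncp(tableware)
tableware_accuracy = class_accuracy(tableware)
```

With three UC entries out of four, the default is UC and the accuracy is 0.75, matching the 75% figure above.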



Results

Conditions                  %      Range       Baseline
Training=Test               77.9   76.8–78.6   65.8
10-fold Cross Validation    71.2   69.8–72.1   65.8

• 11.6% given default value (fully countable)



Discussion

• Semantics predicts countability around 78% of the time (training = test), which supports the hypothesis that countability is semantically motivated

• Less successful than corpus-based countability learning

• Problems of granularity/translation-orientation of

lexicon

• Problems with noise in lexicon



The Ins and Outs of Dutch Noun Countability Classification

Timothy Baldwin and Leonoor van der Beek, ALTW 2003



Crosslinguistic Predictability of Countability

• In linguistically-related languages such as English and

Dutch, countability generally patterns the same way:

◦ same basic behaviour of translation-equivalent lexical/syntactic markers of countability (e.g. one dog ⇌ een hond, some rice ⇌ een beetje rijst)

◦ translation pairs often have same countability: car ⇌ auto [countable], food ⇌ eten [uncountable], BUT thunderstorm [countable] vs. onweer [uncountable]



Out-of- vs. In-language Classification

• Given high-quality training data in a closely-related language (English: COMLEX + ALT-J/E) and medium-quality data in the target language (Dutch: Alpino lexicon):

◦ which generates the best classifier?

◦ what is the best form of crosslingual mapping?

• Focus on the task of Dutch noun countability

classification



Approaches to Monolingual Classification

• Evidence-based classification: base classification on

token evidence for each countability class

• Distribution-based classification: same as for EN-EN

classification task (Baldwin and Bond (2003))



Approaches to Crosslingual Classification

• Corpus occurrence-based classification (binary vs.

multiclass):

◦ cluster-to-cluster classification: EN and ND feature clusters pattern the same

◦ feature-to-feature classification: EN and ND features pattern the same (all features vs. partitions of feature space)



• Translation-based classification: countability is preserved under translation (e.g. car ⇌ auto [countable])

• Transliteration-based classification: countability is preserved under transliteration (e.g. paranoia ⇌ paranoia [uncountable])

• System combination: classify according to combined outputs of individual methods

◦ crosslingual + unsupervised monolingual

◦ crosslingual + monolingual



Results

• Better results for crosslingual than monolingual

classification (!)

• Classifiers produce countability results more consistent

with corpus occurrence than Alpino lexicon

• Translation and transliteration are excellent predictors

of countability

• Semantics in crosslingual countability classification?



A Preview of Results from EuroWordNet

                        Dutch Alpino        English dict+learned
                        count    uncount    count    uncount
Dutch Alpino            0.75     0.37       0.87     0.47
Dutch annotated         0.76     0.44       0.90     0.75
English annotated       0.64     0.49       0.58     0.47
English dict+learned    0.63     0.33       0.62     0.47



Final Reflections

• Demonstration of types of methods that can be used to

determine noun type countability

◦ distribution-based

◦ semantics/sense-based

◦ translation/transliteration-based

• Where next? Watch this space!



Acknowledgements

• Thanks to Francis Bond and Caitlin Vatikiotis-Bateson

for sharing their wonderful slides and graphics!
