Adding Dense, Weighted Connections to WORDNET

Jordan Boyd-Graber and Christiane Fellbaum and Daniel Osherson and Robert Schapire
Princeton University

October 9, 2005

Abstract

WORDNET, a ubiquitous tool for natural language processing, suffers from sparsity of connections between its component concepts (synsets). Through the use of human annotators, a subset of the connections between 1000 hand-chosen synsets was assigned a value of “evocation” representing how much the first concept brings to mind the second. These data, along with existing similarity measures, constitute the basis of a method for predicting evocation between previously unrated pairs.

Submission Type: Long Article

Topic Areas: Extending WORDNET

Author of Record: Jordan Boyd-Graber, [email protected]

Under consideration for other conferences (specify)? None



1 Introduction

WORDNET is a large electronic lexical database of English. Originally conceived as a full-scale model of human semantic organization, it was quickly embraced by the Natural Language Processing (NLP) community, a development that guided its subsequent growth and design. WORDNET has become the lexical database of choice for NLP; Kilgarriff (2000) notes that “not using it requires explanation and justification.” WORDNET’s popularity is largely due to its free public availability and its broad coverage.

WORDNET already has a rich structure connecting its component synonym sets (synsets) to each other. Noun synsets are interlinked by means of hyponymy, the super-subordinate or is-a relation, as exemplified by the pair [poodle]-[dog].¹

Meronymy, the part-whole or has-a relation, links noun synsets like [tire] and [car] (Miller, 1998). Verb synsets are connected by a variety of lexical entailment pointers that express manner elaborations ([walk]-[limp]), temporal relations ([compete]-[win]), and causation ([show]-[see]) (Fellbaum, 1998). The links among the synsets structure the noun and verb lexicons into hierarchies, with noun hierarchies being considerably deeper than those for verbs.

WORDNET appeals to the NLP community because these semantic relations can be exploited for word sense disambiguation (WSD), the primary barrier preventing the development of practical information retrieval, machine translation, summarization, and language generation systems. Although most word forms in English are monosemous, the most frequently occurring words are highly polysemous. Resolving the ambiguity of a polysemous word in context can be achieved by distinguishing its multiple senses in terms of their links to other words. For example, the noun [club] can be disambiguated by an automatic system that considers the superordinates of the different synsets in which this word form occurs: [association], [playing card], and [stick]. It has also been noted that directly antonymous adjectives share the same contexts (Deese, 1964); exploiting this fact can help to disambiguate highly polysemous adjectives.

¹ Throughout this article we will follow the convention of using a single word enclosed in square brackets to denote a synset. Thus, [dog] refers not just to the word dog but to the set – when rendered in its entirety – consisting of {dog, domestic dog, Canis familiaris}.
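To make the superordinate-based disambiguation of [club] concrete, the following sketch uses the NLTK interface to WORDNET; the scoring by hypernym overlap with context words is our own simplification and is not part of the paper, nor a full WSD system.

```python
from nltk.corpus import wordnet as wn

def hypernym_names(synset):
    """Lemma names of a synset's direct superordinates."""
    return {lemma for hyp in synset.hypernyms() for lemma in hyp.lemma_names()}

# Each noun sense of "club" has a different superordinate, which a
# disambiguator can match against words observed in the context.
for synset in wn.synsets("club", pos=wn.NOUN):
    print(synset.name(), "->", sorted(hypernym_names(synset)))

def pick_sense(word, context_words):
    """Choose the sense whose superordinates overlap most with the context."""
    context = {w.lower() for w in context_words}
    return max(wn.synsets(word, pos=wn.NOUN),
               key=lambda s: len(hypernym_names(s) & context))

print(pick_sense("club", ["she", "joined", "the", "chess", "association"]))
```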

1.1 Shortcomings of WORDNET

Statistical disambiguation methods, relying on cooccurrence patterns, can discriminate among word senses rather well, but not well enough for the level of text understanding that is desired (Schutze, 1998); exploiting sense-aware resources, such as WORDNET, is also insufficient (McCarthy et al., 2004). To improve such methods, large manually tagged training corpora are needed to serve as “gold standards.” Manual tagging, however, is time-consuming and expensive, so large semantically annotated corpora do not exist. Moreover, people have trouble selecting the best-matching sense from a dictionary for a given word in context. Fellbaum and Grabowski (1997) found that people agreed with a manually created gold standard on average 74% of the time, with higher disagreement rates for more polysemous words and for verbs as compared to nouns. Confidence scores mirrored the agreement rates.

In the absence of large corpora that are manually and reliably disambiguated, the internal structure of WORDNET can be exploited to help discriminate senses, which are represented in terms of relationships to other senses. But because WORDNET’s network is relatively sparse, such WSD methods achieve only limited results.

In order to move beyond these meager beginnings, one must be able to use the entire context of a word to disambiguate it; instead of looking at only neighboring nouns, one must be able to compare the relationship between any two words. Moreover, the character of these comparisons must be quantitative in nature. This paper serves as a framework for the addition of a complete, directed, and weighted relationship to WORDNET. As a motivation for this addition, we now discuss three fundamental limitations of WORDNET’s network.

No cross-part-of-speech links. WORDNET consists of four distinct semantic networks, one for each of the major parts of speech. There are no cross-part-of-speech links.² The lack of syntagmatic relations means that no connection can be made between entities (expressed by nouns) and their attributes (encoded by adjectives); similarly, events (referred to by verbs) are not linked to the entities with which they are characteristically associated. For example, the intuitive connections among such concepts as [traffic], [congested], and [stop] are not coded in WORDNET.

Too few relations. WORDNET’s potential is limited because of its small number of relations. Increasing the number of arcs connecting a given synset to other synsets not only refines the relationship between that synset and other meanings but also allows a wider range of contexts (that might contain a newly connected word) to help disambiguation.

² WORDNET does contain arcs among many words from different syntactic categories that are semantically and morphologically related, such as [operate], [operator], and [operation] (Fellbaum and Miller, 2003). However, semantically related words like operation, perform, and dangerous are not interconnected in this way, as they do not share the same stem.

Mel’cuk and Zholkovsky (1998) propose several dozen lexical and semantic relations not included in WORDNET, such as “actor” ([book]-[writer]) and “instrument” ([knife]-[cut]). But many associations among words and synsets cannot be represented by clearly labeled arcs. For example, no relation proposed so far accounts for the association between pairs like [tulip] and [Holland], [sweater] and [wool], and [axe] and [tree]. It is easy to detect a relation between the members of these pairs, but the relations cannot be formulated as easily as hyponymy or meronymy. Similarly, the association between [chopstick] and [Chinese restaurant] seems strong, but this relation requires more than the kind of simple label commonly used by ontologists.

Some users of WORDNET have tried to make up for the lack of relations by exploiting the definition and/or the illustrative sentence that accompany each synset. For example, in an effort to increase the internal connectivity of WORDNET, Mihalcea and Moldovan (2001) automatically link each content word in WORDNET’s definitions to the appropriate synset; the Princeton WORDNET team is currently performing the same task manually. But even this significant increase in arcs leaves many synsets unconnected; moreover, it duplicates some of the information already contained in WORDNET, as in the many cases where the definition contains a monosemous superordinate. Another way to link words and synsets across parts of speech is to assign them to topical domains, as in (Magnini and Cavaglia, 2000). WORDNET contains a number of such links, but the domain labels are not a well-structured set. In any case, domain labels cannot account for the association of pairs like [Holland] and [tulip]. In sum, attempts to make WORDNET more informative by increasing its connectivity have met with limited success.


No weighted arcs. A third shortcoming of WORDNET is that the links are qualitative rather than quantitative. It is intuitively clear that the semantic distance between the members of hierarchically related pairs is not always the same. Thus, the synset [run] is a subordinate of [move], and [jog] is a subordinate of [run]. But [run] and [jog] are semantically much closer than [run] and [move]. WORDNET currently does not reflect this difference and ignores the fact that words – labels attached to concepts – are not evenly distributed throughout the semantic space covered by a language. This limitation of WORDNET is compounded in NLP applications that rely on semantic distance measures where edges are counted, e.g., (Jiang and Conrath, 1997) and (Leacock and Chodorow, 1998). Recall that adjectives in WORDNET are organized into pairs of direct antonyms (e.g., long-short) and that each member of such a pair is linked to a number of semantically similar adjectives such as [lengthy] and [elongated], and [clipped] and [telescoped], respectively. The label “semantically similar,” however, hides a broad scale of semantic relatedness, as the examples indicate. Making these similarity differences explicit would greatly improve WORDNET’s content and usefulness for a variety of NLP applications.

2 An Enrichment of WORDNET

To address these shortcomings, we are working to enhance WORDNET by adding a radically different kind of information. The idea is to add quantified, oriented arcs between pairs of synsets, e.g., from {car, auto} to {road, route}, from {buy, purchase} to {shop, store}, from {red, crimson, scarlet} to {fire, flame}, and also in the opposite direction. Each of these arcs will bear a number corresponding to the strength of the relationship. We chose to use the concept of evocation – how much one concept evokes or brings to mind the other – to model the relationships between synsets.

[cat] brings [dog] to mind, just as [swimming] evokes [water], and the word [cunning] evokes [cruel]. Such association of ideas has been a prominent feature of psychological theories for a long time (Lindzey, 1936). It appears to be involved in low-level cognitive phenomena such as semantic priming in lexical decision tasks (McNamara, 1992) and high-level phenomena like diagnosing mental illness (Chapman and Chapman, 1967). Its role in the on-line disambiguation of speech and reading has been explored by (Swinney, 1979), (Tabossi, 1988), and (Rayner et al., 1983), among others.

Evocation is a meaningful variable for all pairs of synsets and seems easy for human annotators to judge. In this sense our extension of WORDNET will have no overlap with knowledge repositories like CYC (Lenat, 1995) but can be viewed as complementary.

2.1 Collecting Ratings

We hired 20 Princeton undergraduates during the 2004-2005 academic year to rate evocation in 120,000 pairs of synsets. The pairs were drawn randomly from all pairs defined over a set of 1000 “core” synsets compiled by the investigators. The core synsets were compiled as follows. The most frequent strings (nouns, verbs, and adjectives) from the BNC were selected. For each string, the WORDNET synsets containing this string were extracted. Two of the authors then went over the list of synsets and selected those senses of a given string that seemed the most salient and basic. The initial string is the “head word” member of the synset; the synonyms function merely to identify the concept expressed by the central string. To reflect the distribution of parts of speech in the lexicon, we chose 642 nouns, 207 verbs, and 151 adjectives.
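A rough sketch of how such a candidate list could be assembled appears below; the input file name, column format, and helper names are our own assumptions, and the final choice of the most salient sense for each string remains a manual step, as described above.

```python
from collections import defaultdict
from nltk.corpus import wordnet as wn

# Hypothetical input: one "lemma<TAB>pos<TAB>count" line per BNC lemma,
# sorted by descending frequency (this file is not part of the paper).
POS_MAP = {"noun": wn.NOUN, "verb": wn.VERB, "adj": wn.ADJ}
TARGETS = {"noun": 642, "verb": 207, "adj": 151}  # quotas reported in the paper

candidates = defaultdict(list)
with open("bnc_frequencies.txt") as handle:
    for line in handle:
        lemma, pos, _count = line.rstrip("\n").split("\t")
        if pos in POS_MAP and len(candidates[pos]) < TARGETS[pos]:
            synsets = wn.synsets(lemma, pos=POS_MAP[pos])
            if synsets:
                # All senses are listed; choosing the salient, basic sense
                # of each string is still done by hand.
                candidates[pos].append((lemma, [s.name() for s in synsets]))
```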

Our raters were first instructed about the evocation relation and were offered the following explications:

1. Evocation is a relation between meanings as expressed by synsets and not a relation between words; examples were provided to reinforce this point.

2. One synset evokes another to the extent that thinking about the first brings the second to mind. (Examples were given, such as [government] evoking [register] for the appropriate synsets including these terms.)


3. Evocation is not always a symmetrical relation (for example, [dollar] may evoke [green] more than the reverse).

4. The task is to estimate the extent to which one synset brings to mind another in the general undergraduate population of the United States; idiosyncratic evocations caused by the annotator’s personal history are irrelevant.

5. It is expected that many pairs of synsets will produce no evocation at all (connections between synsets must not be forced).

6. There are multiple paths to evocation, e.g.:

[rose] - [flower] (example)
[brave] - [noble] (kind)
[yell] - [talk] (manner)
[eggs] - [bacon] (co-occurrence)
[snore] - [sleep] (setting)
[wet] - [desert] (antonymy)
[work] - [lazy] (exclusivity)
[banana] - [kiwi] (likeness)

7. In no case should evocation be influenced by the sounds of words or their orthographies (thus, [rake] and [fake] do not evoke each other on the basis of sound or spelling).

8. The integers from 0 to 100 are available to express evocation; round numbers need not be used.

Raters were familiarized with a computer interface that presented pairs of synsets (each as a list with the highest frequency word first and emphasized; we will refer to this word as the “head word”). The parts of speech corresponding to each synset in a pair were also shown. Presenting entire synsets instead of single words eliminates the risk of confusion between rival senses of polysemous words. Between the two synsets appeared a scale from 0 to 100; 0 represented “no mental connection,” 25 represented “remote association,” 50 represented “moderate association,” 75 represented “strong association,” and 100 represented “brings immediately to mind.”

As final preparation, each rater was asked to annotate two sets of 500 randomly chosen pairs of synsets (distinct from the pairs to be annotated later). Both sets had been annotated in concert by two of the investigators. The first served as a training set: the response of the annotator-trainee to each pair was followed by the “gold standard” rating obtained by averaging the ratings of the investigators. The second served as a test set: no feedback was offered, and we calculated the Pearson correlation between each annotator’s ratings and our own. The median correlation obtained on the test set by the 24 annotators recruited for the project was .72; none scored lower than .64.

Unbeknownst to the annotators, some pairs were presented twice, on a random basis, always in different sessions. The average correlation between first and second presentations was .70 for those annotators who generated at least 100 test-retest pairs.
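The screening computation itself is simple; the sketch below uses SciPy, and the rating lists and the 0.64 cutoff are only an illustration (the paper reports 0.64 as the lowest observed score, not as an explicit threshold).

```python
from scipy.stats import pearsonr

gold = [0, 0, 12, 45, 80, 5, 0, 60]      # averaged investigator ratings (made up)
trainee = [0, 3, 20, 40, 75, 0, 0, 55]   # one annotator's test-set ratings (made up)

r, _p_value = pearsonr(gold, trainee)
print(round(r, 2), "acceptable" if r >= 0.64 else "needs more training")

# The same correlation applies to test-retest reliability: correlate an
# annotator's first and second ratings of the pairs that were shown twice.
```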

2.2 Analysis of Ratings

Every pair of synsets was evaluated by at least three people (additional annotations were sometimes collected to test consistency of annotator judgments), and as one might expect from randomly selecting pairs of synsets, most (67%) of the pairs were rated by every annotator as having no mental connection (see Figure 1). The ratings were usually consistent across different annotators; the average standard deviation for pairs that at least one rater labeled as non-zero was 9.25 (on a scale from 0 to 100).

Because there is an active vein of research comparing the similarity of synsets within WORDNET, we present the Spearman rank order coefficient ρ for a variety of similarity measures. We use WordNet::Similarity (Patwardhan et al., 2004) to provide WORDNET-based measures (e.g., Leacock and Chodorow (1998), and Lesk (1986) applied to WORDNET glosses). In addition, Infomap (Peters, 2005) is used to provide the cosine between LSA vectors (Landauer and Dumais, 1997) created from the British National Corpus (BNC). For every word for which the program computes a context vector, the 2000 words closest to it are stored. We only consider pairs where both words had context vectors and one was within the 2000 closest vectors to the other. Other pairs can safely be assumed to have a small value for the cosine of the angle between them.
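For readers who want to reproduce this kind of comparison, the sketch below substitutes NLTK's WORDNET interface and SciPy for WordNet::Similarity; the rated_pairs format and example values are ours, not data from the study.

```python
from nltk.corpus import wordnet as wn
from scipy.stats import spearmanr

def rank_correlations(rated_pairs):
    """Spearman rho between two WORDNET similarity measures and mean evocation.

    rated_pairs: iterable of (synset_name_1, synset_name_2, mean_evocation)
    triples for noun-noun pairs, so that both measures are defined.
    """
    path_scores, lch_scores, evocations = [], [], []
    for name1, name2, evocation in rated_pairs:
        s1, s2 = wn.synset(name1), wn.synset(name2)
        path_scores.append(s1.path_similarity(s2))
        lch_scores.append(s1.lch_similarity(s2))  # requires same part of speech
        evocations.append(evocation)
    return (spearmanr(path_scores, evocations).correlation,
            spearmanr(lch_scores, evocations).correlation)

print(rank_correlations([("car.n.01", "road.n.01", 62.0),
                         ("cat.n.01", "dog.n.01", 75.0),
                         ("tulip.n.01", "sand.n.01", 3.0)]))
```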

Figure 1: Logarithmic distribution of evocation ratings (x-axis: strength of evocation, 0-100; y-axis: log of density).

The Leacock-Chodorow (LC) and Path measures require connected networks to compute distance, so those values were only computed for noun-noun and verb-verb pairs. The correlations achieved by these various methods are displayed in the following table.

Metric   Subset (# Pairs)        ρ
Lesk     All (119668)            0.008
Path     Verbs (4801)            0.046
LC       Nouns (49461)           0.130
Path     Nouns (49461)           0.130
LSA      Closest 2000 (15285)    0.131

Although there is evidence of a slight monotonic increase of evocation with these similarity measures, the low correlations show that the relationship is not strong. Our results therefore demonstrate that evocation is an empirical measure of some aspect of semantic interaction not captured by these similarity methods.

For each of the similarity measures, a wide range of possible evocation values is observed across the entire gamut of the similarity range. One typical relationship is shown in Figure 2, which plots evocation against the cosine between LSA vectors. The only exception is that the similarity measures tend to do very well in identifying synsets with little evocation; for low values of similarity, the evocation is reliably low.

Figure 2: The relationship between LSA cosine vectors and evocation for pairs that were within the 1000 closest word vectors (x-axis: cosine between LSA vectors; y-axis: evocation).

There are several reasons why these measures fail to predict evocation. Many of the WORDNET measures are limited to only a small subset of the synset pairs that are of interest to us; the path and Leacock-Chodorow metrics, for instance, are usable only within the components of WORDNET that have well-defined is-a hierarchies.

Although the LSA metric is free of this restriction, it is really a comparison of the relatedness of strings rather than synsets. The vector corresponding to the string fly, for example, encompasses all of its meanings for multiple parts of speech; because many of the words under consideration are polysemous, LSA could therefore suggest relationships between synsets that correspond to meanings other than the intended one.

Finally, all these measures are symmetric, but the evocation ratings are not (see Figure 3). Of the 3302 pairs for which both directions were rated by annotators and at least one of the ratings was non-zero, the correlation coefficient between directions was 0.457. While there is a strong symmetric component, there are many examples where asymmetry is observed.


Figure 3: The evocation observed between synset pairs in opposite directions (x-axis: evocation; y-axis: reverse evocation).

2.3 Extending Ratings

Before these data can be used for NLP applications, it must be possible to query the evocation between arbitrary synsets. Our goal is to create a means of automatically judging the evocation between synsets while avoiding the impractical task of hand annotating the links between all 10^10 pairs of synsets. Our method attempts to leverage the disparate strengths of the measures of semantic distance discussed above in addition to measures of similarity between context vectors (culled from the BNC) for individual words.

These context vectors were created by searching the BNC for the head word of each synset and lemmatizing the results. Stop words were removed from the results, and frequencies were tabulated for words at most n words away (both right and left) from the head word found in the sentence, for n = 2, 4, 6, 8. Because the BNC is tagged, it allows us to specify the part of speech of the target words. Although the tagging does not completely alleviate the problem of multiple senses being represented by the context vectors, it does eliminate the problem when the senses have different functions.
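A minimal sketch of this windowed counting follows, assuming tokenized sentences are already available; the helper names, the use of NLTK's lemmatizer and stop-word list, and the omission of part-of-speech filtering are our simplifications.

```python
from collections import Counter
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

STOPWORDS = set(stopwords.words("english"))
LEMMATIZER = WordNetLemmatizer()

def context_vector(head_word, corpus_sentences, window):
    """Count lemmas within `window` words (left and right) of the head word."""
    counts = Counter()
    for sentence in corpus_sentences:            # each sentence: a list of tokens
        lemmas = [LEMMATIZER.lemmatize(tok.lower()) for tok in sentence]
        for i, lemma in enumerate(lemmas):
            if lemma != head_word:
                continue
            lo, hi = max(0, i - window), i + window + 1
            for neighbor in lemmas[lo:i] + lemmas[i + 1:hi]:
                if neighbor not in STOPWORDS and neighbor.isalpha():
                    counts[neighbor] += 1
    return counts

# vectors = {n: context_vector("car", sentences, n) for n in (2, 4, 6, 8)}
```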

We created a range of features from each pair of context vectors, including the relative entropy, cosine, L1 distance, L2 distance, and the number of words in the context vectors of both words (a full listing appears in the table below). Descriptive statistics for the individual context vectors were also computed. It is hoped that the latter information, in addition to relative entropy, would provide some asymmetric foundation for the prediction of evocation links.

WORDNET-based        BNC-derived
Jiang-Conrath        Relative Entropy
Path                 Mean
Lesk                 Variance
Hirst-St. Onge       L1 Distance
Leacock-Chodorow     L2 Distance
Part of Speech       Correlation
                     Contextual Overlap
                     LSA-vectors Cosine
                     Frequency
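The sketch below shows how several of the BNC-derived pairwise features could be computed from two context-count dictionaries; the smoothing constant, the exact feature definitions, and the function name are our own assumptions rather than the paper's implementation.

```python
import numpy as np

def pair_features(counts_a, counts_b, epsilon=1e-9):
    """Pairwise features for two context vectors given as {word: count} dicts."""
    vocab = sorted(set(counts_a) | set(counts_b))
    a = np.array([counts_a.get(w, 0) for w in vocab], dtype=float)
    b = np.array([counts_b.get(w, 0) for w in vocab], dtype=float)
    p = (a + epsilon) / (a + epsilon).sum()      # smoothed distributions
    q = (b + epsilon) / (b + epsilon).sum()
    return {
        "relative_entropy": float(np.sum(p * np.log(p / q))),  # KL(p || q), asymmetric
        "cosine": float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + epsilon)),
        "l1_distance": float(np.abs(p - q).sum()),
        "l2_distance": float(np.linalg.norm(p - q)),
        "contextual_overlap": int(np.sum((a > 0) & (b > 0))),  # shared context words
    }
```

Note that relative entropy, unlike the other features, is asymmetric, which is one reason it is hoped to help capture the directionality of evocation.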

These were exploited as features for the BoosTexter algorithm (Schapire and Singer, 2000), which learns how to automatically apply labels to each example in a dataset. In this case, we broke the range of evocations into five labels: {x ≥ 0, x ≥ 1, x ≥ 25, x ≥ 50, x ≥ 75}. Because there are so many ratings with a value of zero, we created a special category for those values; the other categories were chosen to correspond to the visual prompts presented to the raters during the annotation process. Another option would have been to divide up the range to have roughly equal frequencies of evocation; given the large numbers of zero annotations, however, this would lead to very low resolution for higher – and more interesting – levels of evocation.
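One plausible reading of this labeling, with a special bin for zero ratings and bins bounded by the rating prompts, is sketched below; the exact encoding fed to BoosTexter may differ.

```python
def evocation_label(rating):
    """Map a 0-100 evocation rating to one of the five threshold labels."""
    if rating == 0:        # special category for the many zero ratings
        return "x>=0"
    if rating < 25:
        return "x>=1"
    if rating < 50:
        return "x>=25"
    if rating < 75:
        return "x>=50"
    return "x>=75"

print([evocation_label(r) for r in (0, 3, 37, 62, 90)])
# ['x>=0', 'x>=1', 'x>=25', 'x>=50', 'x>=75']
```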

Given the predicted probabilities of membership in each of these ranges, we can compute an estimate of the expected evocation from this coarse probability distribution. We randomly held out 20% of the labeled data and trained the learning algorithm on the remaining data. Because it is reasonable to assume that different parts of speech will have different models of evocation, and because WORDNET and WORDNET-derived similarity measures provide different data for different parts of speech, we trained the algorithm on each of the six pairs of parts of speech as well as on the complete, undivided dataset. The mean squared errors (the square of the predicted minus the correct level of evocation) and the sizes of the training corpora for each are provided below.

Figure 4: After training, the learning algorithm predicted a distribution of evocation on the test data consistent with the log-transformed distribution of evocations displayed in Figure 1 (x-axis: evocation; y-axis: frequency).

Dataset   Mean Squared Error   Training Size
AA        89.9                 2202
VV        83.2                 3861
NV        80.7                 25483
AN        67.2                 18471
All       63.0                 95603
AV        53.8                 6022
NN        49.8                 39564

A naive algorithm would simply label all pairs as having no evocation; this would yield a 73.0 mean squared error for the complete data set, higher than our mean squared error of 63.0 on the complete dataset and 49.8 on noun-noun pairs (as above). Not only does the algorithm perform better using this metric, but these predictions also, as one would hope, have a distribution much like that observed for the original evocation data (see Figure 4). Taken together, these results confirm that we are indeed exploiting the features to create a reasonably consistent model of evocation, particularly on noun-noun pairs, which have the richest set of features.
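The sketch below illustrates both computations: converting per-bin probabilities into an expected evocation value and comparing mean squared error against the all-zero baseline. The representative value assigned to each bin and the toy numbers are our assumptions, not the paper's.

```python
import numpy as np

BIN_VALUES = np.array([0.0, 1.0, 25.0, 50.0, 75.0])  # representative level per label

def expected_evocation(label_probabilities):
    """Expected evocation under the coarse predicted distribution over bins."""
    p = np.asarray(label_probabilities, dtype=float)
    return float(BIN_VALUES @ (p / p.sum()))

def mean_squared_error(predicted, true):
    predicted, true = np.asarray(predicted), np.asarray(true)
    return float(np.mean((predicted - true) ** 2))

true_levels = np.array([0.0, 0.0, 25.0, 50.0])
predicted = [expected_evocation(p) for p in ([0.90, 0.05, 0.03, 0.01, 0.01],
                                             [0.80, 0.10, 0.06, 0.03, 0.01],
                                             [0.20, 0.20, 0.40, 0.15, 0.05],
                                             [0.10, 0.10, 0.30, 0.35, 0.15])]
baseline = np.zeros_like(true_levels)                 # "no evocation" everywhere
print(mean_squared_error(predicted, true_levels),
      mean_squared_error(baseline, true_levels))
```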

Figure 5: Although most of the data are clustered around (0, 0), there are many data points for which high levels of evocation were assigned to zero-evocation data (the spike to the left) and some data of high evocation that were assigned zero levels of evocation (the line following x²). For high levels of evocation, the predictions become less accurate. (x-axis: true evocation; y-axis: squared error.)

We hope to refine the algorithm and the feature set to improve prediction further; even for the best prediction scheme, that for noun-noun pairs, many pairs with zero evocation were assigned high levels of evocation (see Figure 5). It is heartening, however, to note that the algorithm successfully predicted many pairs with moderate levels of evocation.

3 Future Work

Although our work on developing a learning algorithm to predict evocation is still in its preliminary stages, a foundation is in place for creating a complete, directed, and weighted network of interconnections between synsets within WORDNET that represents the actual relationships observed by real language users. Once our automatic system has shown itself to be reliable, we intend to extend its application beyond the 1000 synsets selected for this study: first to 5000 synsets judged central to a basic vocabulary, and then to the rest of WORDNET.


The real test of the efficacy of any addition to WORDNET remains how well it performs in tasks that require sense disambiguation. It is hoped that an enriched network in WORDNET will be able to improve disambiguation methods that use WORDNET as a tool.

4 Acknowledgments

The authors would like to acknowledge the support of the National Science Foundation (grant Nos. 0414072 and 0530518), which funded the gathering of the initial annotations, and Princeton Computer Science for fellowship support. The authors wish to especially acknowledge our undergraduate annotators for their patience and hard work in collecting these data, and Ben Haskell, who helped compile the initial list of synsets.

References

L. Chapman and J. Chapman. 1967. Genesis of popular but erroneous psychodiagnostic observations. Journal of Abnormal Psychology, (72):193–204.

J. Deese. 1964. The associative structure of some English adjectives. Journal of Verbal Learning and Verbal Behavior, (3):347–357.

C. Fellbaum and J. Grabowski. 1997. Analysis of a hand-tagging task. In Proceedings of the ACL/SIGLEX Workshop. Association for Computational Linguistics.

C. Fellbaum and G. A. Miller. 2003. Morphosemantic links in WordNet. Traitement automatique des langues.

C. Fellbaum. 1998. WordNet: An Electronic Lexical Database, chapter A semantic network of English verbs. MIT Press, Cambridge, MA.

J. Jiang and D. Conrath. 1997. Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of the International Conference on Research in Computational Linguistics, Taiwan.

A. Kilgarriff. 2000. Review of WordNet: An Electronic Lexical Database. Language, (76):706–708.

T. Landauer and S. Dumais. 1997. A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, (104).

C. Leacock and M. Chodorow. 1998. WordNet: An Electronic Lexical Database, chapter Combining local context and WordNet similarity for word sense identification. MIT Press, Cambridge, MA.

D. Lenat. 1995. Cyc: A large-scale investment in knowledge infrastructure. Communications of the ACM, 38(11).

M. Lesk. 1986. Automatic sense disambiguation using machine-readable dictionaries. In Proceedings of SIGDOC.

G. Lindzey, editor. 1936. History of Psychology in Autobiography. Clark University Press, Worcester, MA.

B. Magnini and G. Cavaglia. 2000. Integrating subject field codes into WordNet. In Proceedings of LREC, pages 1413–1418, Athens, Greece.

D. McCarthy, R. Koeling, J. Weeds, and J. Carroll. 2004. Finding predominant senses in untagged text. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, pages 280–287, Barcelona, Spain.

T. P. McNamara. 1992. Priming and constraints it places on theories of memory and retrieval. Psychological Review, pages 650–662.

I. Mel’cuk and A. Zholkovsky. 1998. Relational Models of the Lexicon, chapter The explanatory combinatorial dictionary. Cambridge University Press, Cambridge.

R. Mihalcea and D. Moldovan. 2001. Extended WordNet: Progress report. In Proceedings of the NAACL Workshop on WordNet and Other Lexical Resources, pages 95–100, Pittsburgh, PA.

G. Miller. 1998. WordNet: An Electronic Lexical Database, chapter Nouns in WordNet. MIT Press, Cambridge, MA.

S. Patwardhan, T. Pedersen, and J. Michelizzi. 2004. WordNet::Similarity – measuring the relatedness of concepts. In Proceedings of the 19th National Conference on Artificial Intelligence, page 25.

S. Peters. 2005. Infomap NLP software: An open-source package for natural language processing. Web page, October.

K. Rayner, M. Carlson, and L. Frazier. 1983. The interaction of syntax and semantics during sentence processing: Eye movements in the analysis of semantically biased sentences. Journal of Verbal Learning and Verbal Behavior, 22(3):358–374.

R. Schapire and Y. Singer. 2000. BoosTexter: A boosting-based system for text categorization. Machine Learning, 39(2/3):135–168.

H. Schutze. 1998. Automatic word sense discrimination. Computational Linguistics, 24(1):97–123.

D. Swinney. 1979. Lexical access during sentence comprehension: (Re)consideration of context effects. Journal of Verbal Learning and Verbal Behavior, (18):645–659.

P. Tabossi. 1988. Accessing lexical ambiguity in different types of sentential context. Journal of Memory and Language, (27):324–340.
