Lexical Semantics, Distributions, Predicate-Argument Structure, and Frame-Semantic Parsing
11-711 Algorithms for NLP, 29 November 2016
(With thanks to Noah Smith and Lori Levin)
11-711 Course Context
• Previous semantics lectures discussed composing meanings of parts to produce the correct global sentence meaning
– The mailman bit my dog.
• The "atomic units" of meaning have come from the lexical entries for words
• The meanings of words have been overly simplified (as in FOL): atomic objects in a set-theoretic model
Word Sense
• Instead, a bank can hold the investments in a custodial account in the client's name.
• But as agriculture burgeons on the east bank, the river will shrink even more.
• While some banks furnish sperm only to married women, others are much less restrictive.
• The bank is near the corner of Forbes and Murray.
Four Meanings of "Bank"
• Synonyms:
– bank1 = "financial institution"
– bank2 = "sloping mound"
– bank3 = "biological repository"
– bank4 = "building where a bank1 does its business"
• The connections between these different senses vary from practically none (homonymy) to related (polysemy).
– The relationship between the senses bank4 and bank1 is called metonymy.
Antonyms
• White/black, tall/short, skinny/American, …
• But different dimensions possible:
– White/Black vs. White/Colorful
– Often culturally determined
• Partly interesting because automatic methods have trouble separating these from synonyms
– Same semantic field
Ambiguity vs. Vagueness
• Lexical ambiguity: My wife has two kids (children or goats?)
• vs. vagueness: one sense, but indefinite: horse (mare, colt, filly, stallion, …) vs. kid:
– I have two horses and George has three
– I have two kids and George has three
• Verbs too: I ran last year and George did too
• vs. reference: I, here, the dog are not considered ambiguous in the same way
How Many Senses?
• This is a hard question, due to vagueness.
• Considerations:
– Truth conditions (serve meat / serve time)
– Syntactic behavior (serve meat / serve as senator)
– Zeugma test:
• # Does United serve breakfast and Pittsburgh?
• ?? She poaches elephants and pears.
Related Phenomena
• Homophones (would/wood, two/too/to)
– Mary, merry, marry in some dialects, not others
• Homographs (bass/bass)
Ontologies
• For NLP, databases of word senses are typically organized by lexical relations such as hypernym (IS-A) into a DAG
• This has been worked on for quite a while
• Aristotle's classes (about 330 BC):
– substance (physical objects)
– quantity (e.g., numbers)
– quality (e.g., being red)
– Others: relation, place, time, position, state, action, affection
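The IS-A organization can be sketched as a small DAG; the sense and hypernym names below are invented for illustration (WordNet's actual synsets and link inventory differ):

```python
# Toy IS-A (hypernym) DAG; sense names are invented, not WordNet's.
HYPERNYMS = {
    "bank1": ["financial_institution"],
    "financial_institution": ["institution"],
    "institution": ["organization"],
    "organization": ["entity"],
    "bank2": ["slope"],
    "slope": ["geological_formation"],
    "geological_formation": ["entity"],
    "entity": [],
}

def hypernym_closure(sense):
    """All ancestors of a sense, following IS-A links upward."""
    seen, stack = set(), [sense]
    while stack:
        for parent in HYPERNYMS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

# The two "bank" senses meet only at the DAG's root:
common = hypernym_closure("bank1") & hypernym_closure("bank2")
```

In this toy DAG the financial and riverbank senses share no ancestor below the root, which is one way an ontology makes homonymy visible.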
Synsets
• (bass6, bass-voice1, basso2)
• (bass1, deep6) (adjective)
• (chump1, fool2, gull1, mark9, patsy1, fall guy1, sucker1, soft touch1, mug2)
Frame-based Knowledge Rep.
• Organize relations around concepts
• Equivalent to (or weaker than) FOPC
– (Image from futurehumanevolution.com)
Word Similarity
• Human-language words seem to have real-valued semantic distance (vs. logical objects)
• Two main approaches:
– Thesaurus-based methods
• E.g., WordNet-based
– Distributional methods
• Distributional "semantics", vector "semantics"
• More empirical, but affected by more than semantic similarity ("word relatedness")
Human-subject Word Associations
Stimulus: wall
Number of different answers: 39; total count of all answers: 98
BRICK 16 (0.16), STONE 9 (0.09), PAPER 7 (0.07), GAME 5 (0.05), BLANK 4 (0.04), BRICKS 4 (0.04), FENCE 4 (0.04), FLOWER 4 (0.04), BERLIN 3 (0.03), CEILING 3 (0.03), HIGH 3 (0.03), STREET 3 (0.03), ...
Stimulus: giraffe
Number of different answers: 26; total count of all answers: 98
NECK 33 (0.34), ANIMAL 9 (0.09), ZOO 9 (0.09), LONG 7 (0.07), TALL 7 (0.07), SPOTS 5 (0.05), LONG NECK 4 (0.04), AFRICA 3 (0.03), ELEPHANT 2 (0.02), HIPPOPOTAMUS 2 (0.02), LEGS 2 (0.02), ...
From the Edinburgh Word Association Thesaurus, http://www.eat.rl.ac.uk/
Better Approach: Weighted Links
• Use corpus stats to get probabilities of nodes
• Refinement: use the information content of the LCS (least common subsumer):
2 · log P(g.f.) / (log P(hill) + log P(coast)) = 0.59
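This ratio is Lin-style similarity: twice the information content of the least common subsumer, normalized by the information content of the two concepts themselves. A sketch, with illustrative probabilities of roughly the magnitude seen in the hill/coast example (the exact corpus counts are an assumption):

```python
import math

def lin_similarity(p_lcs, p_a, p_b):
    """sim(a, b) = 2 * log P(LCS) / (log P(a) + log P(b))."""
    return 2 * math.log(p_lcs) / (math.log(p_a) + math.log(p_b))

# Illustrative probabilities for geological-formation, hill, coast:
sim = lin_similarity(0.00176, 0.0000189, 0.0000216)
# round(sim, 2) == 0.59, matching the slide's figure
```

The base of the logarithm cancels in the ratio, so natural log is fine here.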
Distributional Word Similarity
• Determine similarity of words by their distribution in a corpus
– "You shall know a word by the company it keeps!" (Firth, 1957)
• E.g.: 100k-dimension vector, "1" if the word occurs within "2 lines":
• "Who is my neighbor?" Which functions?
Who Is My Neighbor?
• Linear window? 1-500 words wide. Or whole document. Remove stopwords?
• Use dependency-parse relations? More expensive, but maybe better relatedness.
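A minimal sketch of the linear-window option: count, for each word, the words appearing within k tokens of it. Window size and tokenization are the design choices the slide is asking about:

```python
from collections import Counter, defaultdict

def cooccurrence_vectors(tokens, window=2):
    """Sparse context-count vectors: vectors[w][c] = number of times
    context word c appears within `window` tokens of w."""
    vectors = defaultdict(Counter)
    for i, w in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[w][tokens[j]] += 1
    return vectors

toks = "the cat sat on the mat".split()
vecs = cooccurrence_vectors(toks, window=2)
```

With a dependency-parse variant, the inner loop would iterate over a word's syntactic neighbors instead of a linear window.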
Weights vs. Just Counting
• Weight the counts by the a priori chance of co-occurrence
• Pointwise Mutual Information (PMI)
• Objects of drink:
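PMI compares the observed co-occurrence probability with what independence would predict. A sketch over raw counts (the drink-objects table itself is not reproduced here; the counts below are made up):

```python
import math

def pmi(count_xy, count_x, count_y, total):
    """log2 of observed co-occurrence probability vs. chance
    co-occurrence under independence."""
    p_xy = count_xy / total
    p_x, p_y = count_x / total, count_y / total
    return math.log2(p_xy / (p_x * p_y))

# A pair co-occurring far more often than chance gets a high score:
score = pmi(count_xy=5, count_x=10, count_y=10, total=100)
```

Positive PMI means the pair co-occurs more than chance; negative PMI means less, which is why many systems clip negative values to zero (PPMI).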
Distance Between Vectors
• Compare sparse high-dimensional vectors
– Normalize for vector length
• Just use vector cosine?
• Several other functions come from the IR community
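Cosine handles the length normalization directly, since it divides the dot product by both vector norms. A sketch for sparse vectors stored as dicts:

```python
import math

def cosine(u, v):
    """Cosine similarity of two sparse count vectors (dicts)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Iterating only over the keys of one vector keeps the cost proportional to the number of nonzero entries, which matters at 100k dimensions.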
Distributionally Similar Words
• rum, vodka, cognac, brandy, whisky, liquor, detergent, cola, gin, lemonade, cocoa, chocolate, scotch, noodle, tequila, juice
• write, read, speak, present, receive, call, release, sign, offer, know, accept, decide, issue, prepare, consider, publish
• ancient, old, modern, traditional, medieval, historic, famous, original, entire, main, Indian, various, single, African, Japanese, giant
• mathematics, physics, biology, geology, sociology, psychology, anthropology, astronomy, arithmetic, geography, theology, Hebrew, economics, chemistry, scripture, biotechnology
(From an implementation of the method described in Lin. 1998. Automatic Retrieval and Clustering of Similar Words. COLING-ACL. Trained on newswire text.)
Recent Events (2013-now)
• RNNs (Recurrent Neural Networks) as another way to get feature vectors
– Hidden weights accumulate fuzzy info on words in the neighborhood
– The set of hidden weights is used as the vector!
• Composition by multiplying (etc.)
– Mikolov et al. (2013): "king – man + woman = queen" (!?)
– CCG with vectors as NP semantics, matrices as verb semantics (!?)
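The analogy trick can be illustrated with tiny hand-built vectors. The two made-up dimensions below (roughly "gender" and "royalty") are an assumption for illustration; real systems learn such vectors from corpora:

```python
# Hand-built 2-d toy vectors: (gender, royalty). Illustration only.
VEC = {
    "king":  (1.0, 1.0),
    "queen": (-1.0, 1.0),
    "man":   (1.0, 0.0),
    "woman": (-1.0, 0.0),
    "apple": (0.0, -1.0),
}

def analogy(a, b, c):
    """Return the vocabulary word nearest to vec(a) - vec(b) + vec(c),
    excluding the three query words."""
    target = tuple(VEC[a][i] - VEC[b][i] + VEC[c][i] for i in range(2))
    def dist2(w):
        return sum((VEC[w][i] - target[i]) ** 2 for i in range(2))
    candidates = [w for w in VEC if w not in (a, b, c)]
    return min(candidates, key=dist2)

# king - man + woman lands exactly on queen in this toy space
```

Excluding the query words from the candidates matters; without it, the nearest vector to king – man + woman is often king itself in real embedding spaces.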
Semantic Processing [2]
Semantic Cases / Thematic Roles
• Developed in the late 1960s and 1970s
• Postulate a limited set of abstract semantic relationships between a verb & its arguments: thematic roles or case roles
• In some sense, part of the verb's semantics
Thematic Role Example
• John broke the window with the hammer
• John: AGENT role; window: THEME role; hammer: INSTRUMENT role
• Extend LF notation to use semantic roles
Thematic Roles
• Is there a precise way to define the meaning of AGENT, THEME, etc.?
• By definition:
– "The AGENT is an instigator of the action described by the sentence."
• Testing via sentence rewrite:
– John intentionally broke the window
– *The hammer intentionally broke the window
Thematic Roles [2]
• THEME
– Describes the primary object undergoing some change or being acted upon
– For a transitive verb X, "what was Xed?"
– The gray eagle saw the mouse. "What was seen?" (A: the mouse)
Breaking, Eating, Opening
• John broke the window.
• The window broke.
• John is always breaking things.
• We ate dinner.
• We already ate.
• The pies were eaten up quickly.
• Open up!
• Someone left the door open.
• John opens the window at night.
• breaker, broken thing, breaking frequency?
• eater, eaten thing, eating speed?
• opener, opened thing, opening time?
Can We Generalize?
• Thematic roles describe general patterns of participants in generic events.
• This gives us a kind of shallow, partial semantic representation.
• First proposed by Panini, before 400 BC!
Thematic Roles

Role         Definition                             Example
Agent        Volitional causer of the event         The waiter spilled the soup.
Force        Non-volitional causer of the event     The wind blew the leaves around.
Experiencer  Experiencer of the event               Mary has a headache.
Theme        Most directly affected participant     Mary swallowed the pill.
Result       End product of an event                We constructed a new building.
Content      Proposition of a propositional event   Mary knows you hate her.
Instrument   Instrument used in the event           You shot her with a pistol.
Beneficiary  Beneficiary of the event               I made you a reservation.
Source       Origin of a transferred thing          I flew in from Pittsburgh.
Goal         Destination of a transferred thing     Go to hell!
Thematic Grid or Case Frame
• Example: break
– The child broke the vase. <agent theme>: agent=subj, theme=obj
– The child broke the vase with a hammer. <agent theme instr>: agent=subj, theme=obj, instr=PP
– The hammer broke the vase. <theme instr>: theme=obj, instr=subj
– The vase broke. <theme>: theme=subj
The thematic grid or case frame shows:
• How many arguments the verb has
• What roles the arguments have
• Where to find each argument
• For example, you can find the agent in the subject position
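A thematic grid can be sketched as a map from grammatical relations to roles. The grids below are the break examples from the slides; the relation names (subj, obj, pp_with) are hypothetical labels for illustration:

```python
# Thematic grids for "break", as grammatical-relation -> role maps.
BREAK_GRIDS = [
    {"subj": "agent", "obj": "theme", "pp_with": "instrument"},
    {"subj": "agent", "obj": "theme"},
    {"subj": "instrument", "obj": "theme"},
    {"subj": "theme"},
]

def label_roles(relations):
    """Map observed grammatical relations (rel -> phrase) to roles,
    using the first grid whose relation set matches exactly."""
    for grid in BREAK_GRIDS:
        if set(grid) == set(relations):
            return {phrase: grid[rel] for rel, phrase in relations.items()}
    return None

# "The vase broke." -> the vase is the theme
roles = label_roles({"subj": "the vase"})
```

Note the limitation this sketch makes visible: "The child broke the vase" and "The hammer broke the vase" present identical grammatical relations, so syntax alone cannot choose between the agent and instrument grids. That ambiguity is part of what makes semantic role labeling hard.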
Diathesis Alternation: a change in the number of arguments or the grammatical relations associated with each argument
• Chris gave a book to Dana. <agent theme goal>: agent=subj, theme=obj, goal=PP
• A book was given to Dana by Chris. <agent theme goal>: agent=PP, theme=subj, goal=PP
• Chris gave Dana a book. <agent theme goal>: agent=subj, theme=obj2, goal=obj
• Dana was given a book by Chris. <agent theme goal>: agent=PP, theme=obj, goal=subj
The Trouble with Thematic Roles
• They are not formally defined.
• They are overly general.
• "agent verb theme with instrument" and "instrument verb theme"...
– The cook opened the jar with the new gadget.
→ The new gadget opened the jar.
– Susan ate the sliced banana with a fork.
→ #The fork ate the sliced banana.
Two Datasets
• Proposition Bank (PropBank): verb-specific thematic roles
• FrameNet: "frame"-specific thematic roles
• These are lexicons containing case frames / thematic grids for each verb.
Proposition Bank (PropBank)
• A set of verb-sense-specific "frames" with informal English glosses describing the roles
• Conventions for labeling optional modifier roles
• The Penn Treebank is labeled with these verb-sense-specific semantic roles.
"Agree" in PropBank
• arg0: agreer
• arg1: proposition
• arg2: other entity agreeing
• The group agreed it wouldn't make an offer.
• Usually John agrees with Mary on everything.
"Fall (move downward)" in PropBank
• arg1: logical subject, patient, thing falling
• arg2: extent, amount fallen
• arg3: starting point
• arg4: ending point
• argM-LOC: medium
• Sales fell to $251.2 million from $278.8 million.
• The average junk bond fell by 4.2%.
• The meteor fell through the atmosphere, crashing into Cambridge.
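These rolesets can be sketched as plain dicts keyed by a verb-sense identifier. The glosses are the slides' informal English descriptions; the ".01" sense numbering is an illustrative assumption about the identifier format:

```python
# A sketch of PropBank-style rolesets; sense numbering is illustrative.
ROLESETS = {
    "fall.01": {
        "arg1": "logical subject, patient, thing falling",
        "arg2": "extent, amount fallen",
        "arg3": "starting point",
        "arg4": "ending point",
        "argM-LOC": "medium",
    },
    "agree.01": {
        "arg0": "agreer",
        "arg1": "proposition",
        "arg2": "other entity agreeing",
    },
}

def gloss(roleset, arg):
    """Look up the informal English gloss for a numbered argument."""
    return ROLESETS[roleset].get(arg, "unknown role")
```

The point of the verb-sense-specific design is visible here: arg1 means "thing falling" for one roleset and "proposition" for another, so the numbers only make sense relative to a roleset.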
FrameNet
• FrameNet is similar, but abstracts away from specific verbs, so that semantic frames are first-class citizens.
• For example, there is a single frame called change_position_on_a_scale.
change_position_on_a_scale
• Oil rose in price by 2%.
• It has increased to having them 1 day a month.
• Microsoft shares fell to 7 5/8.
• Colon cancer incidence fell by 50% among men.
Many words, not just verbs, share the same frame:
• Verbs: advance, climb, decline, decrease, diminish, dip, double, drop, dwindle, edge, explode, fall, fluctuate, gain, grow, increase, jump, move, mushroom, plummet, reach, rise, rocket, shift, skyrocket, slide, soar, swell, swing, triple, tumble
• Nouns: decline, decrease, escalation, explosion, fall, fluctuation, gain, growth, hike, increase, rise, shift, tumble
• Adverb: increasingly
Conversely, one word has many frames. Example: rise
• Change_position_on_a_scale: Oil ROSE in price by two percent.
• Change_posture: a Protagonist changes the overall position or posture of a body.
– Source: starting point of the change of posture.
– Charles ROSE from his armchair.
• Get_up: A Protagonist leaves the place where they have slept, their Bed, to begin or resume domestic, professional, or other activities. Getting up is distinct from Waking_up, which is concerned only with the transition from the sleeping state to a wakeful state.
– I ROSE from bed, threw on a pair of camouflage shorts and drove my little Toyota Corolla to a construction clearing a few miles away.
• Motion_directional: In this frame a Theme moves in a certain Direction which is often determined by gravity or other natural, physical forces. The Theme is not necessarily a self-mover.
– The balloon ROSE upward.
• Sidereal_appearance: An Astronomical_entity comes into view above the horizon as part of a regular, periodic process of (apparent) motion of the Astronomical_entity across the sky. In the case of the sun, the appearance begins the day.
– At the time of the new moon, the moon RISES at about the same time the sun rises, and it sets at about the same time the sun sets. Each day the sun's RISE offers us a new day.
FrameNet
• Frames are not just for verbs! (The change_position_on_a_scale lists above include verbs, nouns, and an adverb.)
FrameNet
• Includes inheritance and causation relationships among frames.
• Examples are included, but little fully-annotated corpus data.
SemLink
• It would be really useful if these different resources were interconnected in a useful way.
• The SemLink project is (was?) trying to do that
• The Unified Verb Index (UVI) connects:
– PropBank
– VerbNet
– FrameNet
– WordNet/OntoNotes
Semantic Role Labeling
• Input: sentence
• Output: for each predicate*, labeled spans identifying each of its arguments.
• Example: [agent The batter] hit [patient the ball] [time yesterday]
• Somewhere between syntactic parsing and full-fledged compositional semantics.
* Predicates are sometimes identified in the input, sometimes not.
But wait. How is this different from dependency parsing?
• Semantic role labeling
– [agent The batter] hit [patient the ball] [time yesterday]
• Dependency parsing
– [subj The batter] hit [obj the ball] [mod yesterday]
1. These are not the same task.
2. Semantic role labeling is much harder.
Subject vs. Agent
• Subject is a grammatical relation
• Agent is a semantic role
• In English, a subject has these properties:
– It comes before the verb
– If it is a pronoun, it is in nominative case (in a finite clause)
• I/he/she/we/they hit the ball.
• *Me/him/her/us/them hit the ball.
– If the verb is in present tense, it agrees with the subject
• She/he/it hits the ball.
• I/we/they hit the ball.
• *She/he/it hit the ball.
• *I/we/they hits the ball.
• I hit the ball.
• I hit the balls.
Subject vs. Agent
• In the most typical sentences (for some definition of "typical"), the agent is the subject:
– The batter hit the ball.
– Chris opened the door.
– The teacher gave books to the students.
• Sometimes the agent is not the subject:
– The ball was hit by the batter.
– The balls were hit by the batter.
• Sometimes the subject is not the agent:
– The door opened.
– The key opened the door.
– The students were given books.
– Books were given to the students.
Similarities to WSD
• Pick the correct choice from N ambiguous possibilities
• Definitions are not crisp
• Need to pick a labelling scheme and corpus
– Choices have a big effect on performance and usefulness
Semantic Role Labeling
• Input: sentence
• Output: segmentation into roles, with labels
• Example from the book:
• [arg0 The Examiner] issued [arg1 a special edition] [argM-TMP yesterday]
Semantic Role Labeling: How It Works
• First, parse.
• For each predicate word in the parse:
– For each node in the parse:
• Classify the node with respect to the predicate.
Yet Another Classification Problem!
• As before, there are many techniques (e.g., Naïve Bayes)
• Key: what features?
Features for Semantic Role Labeling
• What is the predicate?
• Phrase type of the constituent
• Head word of the constituent, and its POS
• Path in the parse tree from the constituent to the predicate
• Active or passive
• Is the phrase before or after the predicate?
• Subcategorization (≈ grammar rule) of the predicate
Feature Example
• Example sentence: [arg0 The Examiner] issued [arg1 a special edition] [argM-TMP yesterday]
• Arg0 features: issued, NP, Examiner, NNP, path, active, before, VP → VBD NP PP
Example
Figure 20.16: Parse tree for a PropBank sentence, showing the PropBank argument labels. The dotted line shows the path feature NP↑S↓VP↓VBD for ARG0, the NP-SBJ constituent The San Francisco Examiner.
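The path feature can be computed by walking up from the candidate argument to the lowest common ancestor, then down to the predicate. A sketch on a hand-built toy tree for the figure's sentence (the Node class and tree shape are illustrative, not a real parser's output):

```python
class Node:
    """A constituency-tree node that records its parent."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self

def ancestors(node):
    """The node itself plus all its ancestors, bottom-up."""
    chain = [node]
    while chain[-1].parent is not None:
        chain.append(chain[-1].parent)
    return chain

def parse_path(arg, pred):
    """Path feature: up-arrows to the lowest common ancestor,
    then down-arrows to the predicate."""
    up, down = ancestors(arg), ancestors(pred)
    lca = next(n for n in up if n in down)
    up_labels = [n.label for n in up[: up.index(lca) + 1]]
    down_labels = [n.label for n in reversed(down[: down.index(lca)])]
    return "↑".join(up_labels) + "".join("↓" + l for l in down_labels)

# Toy tree: (S (NP The Examiner) (VP (VBD issued) (NP a special edition)))
vbd = Node("VBD")
arg_np, obj_np = Node("NP"), Node("NP")
vp = Node("VP", [vbd, obj_np])
s = Node("S", [arg_np, vp])
path = parse_path(arg_np, vbd)  # "NP↑S↓VP↓VBD"
```

The same function gives "NP↑VP↓VBD" for the object NP, so the path string distinguishes subject-like from object-like positions, which is exactly what makes it a useful classification feature.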
Additional Issues
• Initial filtering of non-arguments
• Using chunking or partial parsing instead of full parsing
• Enforcing consistency (e.g., non-overlap, only one arg0)
• Phrasal verbs, support verbs / light verbs
– take a nap: the verb take is the syntactic head of the VP, but the predicate is napping, not taking
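The non-overlap and single-arg0 constraints mentioned above are global: they cannot be enforced by classifying each node independently, so systems typically check (or decode for) them afterwards. A sketch over (start, end, label) spans with the end index exclusive:

```python
def consistent(spans):
    """Check two global constraints on a set of labeled argument
    spans (start, end, label): no overlapping spans, at most one arg0."""
    labels = [label for _, _, label in spans]
    if labels.count("arg0") > 1:
        return False
    ordered = sorted(spans)  # sort by start position
    return all(a_end <= b_start
               for (_, a_end, _), (b_start, _, _) in zip(ordered, ordered[1:]))
```

A real system would use such a check inside a search over candidate labelings (e.g., keeping the highest-scoring consistent assignment) rather than simply rejecting an inconsistent output.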
Two Datasets, Two Systems
• The example from the book uses PropBank
• The locally-developed system SEMAFOR works on the SemEval problem, based on FrameNet
Shallow Approaches to Deep Problems
• For both WSD and SRL:
– Shallow approaches are much easier to develop
• As in, possible at all for unlimited vocabularies
– Not wonderful performance yet
• Sometimes claimed to help a particular system, but often doesn't seem to help
– Definitions are not crisp
• There clearly is something there, but the granularity of the distinctions is very problematic
• Deep Learning will fix everything?
SEMAFOR
• A FrameNet-based semantic role labeling system developed within Noah's research group
‣ It uses a dependency parser (the MSTParser) for preprocessing
‣ Identifies and disambiguates predicates; then identifies and disambiguates each predicate's arguments
‣ Trained on frame-annotated corpora from the SemEval 2007/2010 tasks. Domains: weapons reports, travel guides, news, Sherlock Holmes stories.
Noun Compounds
• A very flexible (productive) syntactic structure in English
‣ The noun-noun pattern is easily applied to name new concepts (Web browser) and to disambiguate known concepts (fire truck)
‣ Can also combine two NPs: incumbent protection plan, [undergraduate [[computer science] [lecture course]]]
‣ Sometimes creates ambiguity, esp. in writing where there is no phonological stress: Spanish teacher
‣ People are creative about interpreting even nonsensical compounds
• Also present in many other languages, sometimes with special morphology
‣ German is infamous for loving to merge words into compounds, e.g. Fremdsprachenkenntnisse, 'knowledge of foreign languages'
Noun Compounds
• SemEval 2007 task: Classification of Semantic Relations between Nominals
‣ 7 predefined relation types
1. Cause-Effect: flu virus
2. Instrument-User: laser printer
3. Product-Producer: honey bee
4. Origin-Entity: rye whiskey
5. Purpose-Tool: soup pot
6. Part-Whole: car wheel
7. Content-Container: apple basket
• http://nlp.cs.swarthmore.edu/semeval/tasks/task04/description.shtml
Noun Compounds
• SemEval 2010 task: Noun compound interpretation using paraphrasing verbs
‣ A dataset was compiled in which subjects were presented with a noun compound and asked to provide a verb describing the relationship
‣ nut bread elicited: contain (21); include (10); be made with (9); have (8); be made from (5); use (3); be made using (3); feature (2); be filled with (2); taste like (2); be made of (2); come from (2); consist of (2); hold (1); be composed of (1); be blended with (1); be created out of (1); encapsulate (1); diffuse (1); be created with (1); be flavored with (1)
• http://semeval2.fbk.eu/semeval2.php?location=tasks#T12