
1 LING 180 Autumn 2007

LINGUIST 180: Introduction to Computational Linguistics

Lecture 10: Grammar and Parsing (II), October 25, 2007

Dan Jurafsky

Thanks to Jim Martin for many of these slides!

2 LING 180 Autumn 2007

Grammars and Parsing

Context-Free Grammars and Constituency
Some common CFG phenomena for English
Baby parsers: Top-down and Bottom-up Parsing
Today: Real parsers: Dynamic Programming parsing

CKY

Probabilistic parsing
Optional section: the Earley algorithm

3 LING 180 Autumn 2007

Dynamic Programming

We need a method that fills a table with partial results that:

Does not do (avoidable) repeated work
Does not fall prey to left-recursion
Can find all the pieces of an exponential number of trees in polynomial time

Two popular methods: CKY and Earley

4 LING 180 Autumn 2007

The CKY (Cocke-Kasami-Younger) Algorithm

Requires the grammar be in Chomsky Normal Form (CNF)

All rules must be in the following form:
– A -> B C
– A -> w

Any grammar can be converted automatically to Chomsky Normal Form

5 LING 180 Autumn 2007

Converting to CNF

Rules that mix terminals and non-terminals:
Introduce a new dummy non-terminal that covers the terminal

– INFVP -> to VP is replaced by:

– INFVP -> TO VP
– TO -> to

Rules that have a single non-terminal on the right ("unit productions"):
Rewrite each unit production with the RHS of their expansions

Rules whose right-hand side length is > 2:
Introduce dummy non-terminals that spread the right-hand side

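A minimal sketch of the mechanical part of this conversion (my own illustrative code, not the course's): terminals inside mixed rules get dummy pre-terminals, and rules whose right-hand side is longer than two symbols are binarized with dummy non-terminals. Collapsing unit productions is omitted here.

def to_cnf(rules):
    # rules: list of (lhs, rhs_tuple); terminals are lowercase strings (an assumption of this sketch)
    new_rules = []
    for lhs, rhs in rules:
        if len(rhs) > 1:
            fixed = []
            for sym in rhs:
                if sym.islower():                    # terminal inside a mixed rule
                    dummy = sym.upper()              # e.g. INFVP -> to VP introduces TO -> to
                    new_rules.append((dummy, (sym,)))
                    fixed.append(dummy)
                else:
                    fixed.append(sym)
            rhs = tuple(fixed)
        while len(rhs) > 2:                          # spread a long right-hand side
            dummy = lhs + "_" + rhs[0]
            new_rules.append((lhs, (rhs[0], dummy)))
            lhs, rhs = dummy, rhs[1:]
        new_rules.append((lhs, rhs))
    return new_rules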
6 LING 180 Autumn 2007

Automatic Conversion to CNF

7 LING 180 Autumn 2007

Sample Grammar

8 LING 180 Autumn 2007

Back to CKY Parsing

Given rules in CNF, consider the rule A -> B C

If there is an A in the input, then there must be a B followed by a C in the input
If the A goes from i to j in the input, then there must be some k s.t. i < k < j

– I.e., the B splits from the C someplace

9 LING 180 Autumn 2007

CKY

So let's build a table so that an A spanning from i to j in the input is placed in cell [i,j] in the table
So a non-terminal spanning an entire string will sit in cell [0,n]
If we build the table bottom-up, we'll know that the parts of the A must go from i to k and from k to j

10 LING 180 Autumn 2007

CKY

Meaning that for a rule like A -> B C we should look for a B in [i,k] and a C in [k,j]
In other words, if we think there might be an A spanning i,j in the input… AND
A -> B C is a rule in the grammar, THEN
there must be a B in [i,k] and a C in [k,j] for some i < k < j
So just loop over the possible k values

11 LING 180 Autumn 2007

CKY Table

• Filling the [i,j]th cell in the CKY table

12 LING 180 Autumn 2007

CKY Algorithm

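The algorithm figure on this slide did not survive extraction, so here is a minimal recognizer sketch of the idea from the preceding slides (the function name and grammar encoding are my own): fill cell [i,j] by looping over every split point k and every rule A -> B C with a B in [i,k] and a C in [k,j].

from collections import defaultdict

def cky_recognize(words, lexical, binary, start="S"):
    # lexical: word -> set of non-terminals A with A -> word
    # binary:  list of (A, B, C) triples for rules A -> B C
    n = len(words)
    table = defaultdict(set)                  # table[(i, j)] = non-terminals spanning words[i:j]
    for j in range(1, n + 1):
        table[(j - 1, j)] |= lexical.get(words[j - 1], set())
        for i in range(j - 2, -1, -1):        # a column at a time, bottom to top
            for k in range(i + 1, j):         # loop over the possible split points k
                for A, B, C in binary:
                    if B in table[(i, k)] and C in table[(k, j)]:
                        table[(i, j)].add(A)
    return start in table[(0, n)]             # is there an S spanning the whole input?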
13 LING 180 Autumn 2007

Note

We arranged the loops to fill the table a column at a time, from left to right, bottom to top

This assures us that whenever we're filling a cell, the parts needed to fill it are already in the table (to the left and below)
Are there other ways to fill the table?

14 LING 180 Autumn 2007

0 Book 1 the 2 flight 3 through 4 Houston 5

15 LING 180 Autumn 2007

CYK Example

S  -> NP VP
VP -> V NP
NP -> NP PP
VP -> VP PP
PP -> P NP

NP -> John | Mary | Denver
V  -> called
P  -> from

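This grammar can be fed straight to the recognizer sketched earlier (the encoding below is mine; probabilities play no role yet):

lexical = {"John": {"NP"}, "Mary": {"NP"}, "Denver": {"NP"},
           "called": {"V"}, "from": {"P"}}
binary = [("S", "NP", "VP"), ("VP", "V", "NP"),
          ("NP", "NP", "PP"), ("VP", "VP", "PP"), ("PP", "P", "NP")]

print(cky_recognize("John called Mary from Denver".split(), lexical, binary))   # True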
16-31 LING 180 Autumn 2007

Example

John called Mary from Denver

(Slides 16-31 are chart diagrams that do not survive as text. They first show the two parse trees for the PP-attachment ambiguity, one with "from Denver" attached to the VP headed by "called" and one with it attached to the NP "Mary". They then step through filling the CKY chart for the sentence cell by cell, bottom-up: X marks spans over which no constituent can be built, two competing VP analyses (VP1, VP2) appear over "called Mary from Denver", and finally an S spanning the whole input appears in cell [0,5].)

32 LING 180 Autumn 2007

Back to Ambiguity

Did we solve it?

33 LING 180 Autumn 2007

Ambiguity

34 LING 180 Autumn 2007

Ambiguity

No…
Both CKY and Earley will result in multiple S structures for the [0,n] table entry
They both efficiently store the sub-parts that are shared between multiple parses
But neither can tell us which one is right
Not a parser – a recognizer

– The presence of an S state with the right attributes in the right place indicates a successful recognition
– But no parse tree… no parser
– That's how we solve (not) an exponential problem in polynomial time

35 LING 180 Autumn 2007

Converting CKY from Recognizer to Parser

With the addition of a few pointers we have a parser
Augment each new cell in the chart to point to where we came from

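A sketch of that augmentation, extending the hypothetical recognizer shown earlier: each cell entry also records the rule and split point that produced it, so a tree can be read back from cell [0,n]. (Keeping one backpointer per non-terminal per cell yields one parse; keeping all of them would store the whole parse forest.)

def cky_parse(words, lexical, binary, start="S"):
    n = len(words)
    back = [[dict() for _ in range(n + 1)] for _ in range(n + 1)]   # back[i][j][A] = how A was built
    for j in range(1, n + 1):
        for A in lexical.get(words[j - 1], set()):
            back[j - 1][j][A] = words[j - 1]
        for i in range(j - 2, -1, -1):
            for k in range(i + 1, j):
                for A, B, C in binary:
                    if B in back[i][k] and C in back[k][j] and A not in back[i][j]:
                        back[i][j][A] = (k, B, C)                   # point to where we came from

    def build(A, i, j):                                             # follow the backpointers into a tree
        entry = back[i][j][A]
        if isinstance(entry, str):
            return (A, entry)
        k, B, C = entry
        return (A, build(B, i, k), build(C, k, j))

    return build(start, 0, n) if start in back[0][n] else None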
36 LING 180 Autumn 2007

How to do parse disambiguation?

Probabilistic methods
Augment the grammar with probabilities
Then modify the parser to keep only the most probable parses
And at the end, return the most probable parse

37 LING 180 Autumn 2007

Probabilistic CFGs

The probabilistic model: assigning probabilities to parse trees

Getting the probabilities for the model
Parsing with probabilities:

Slight modification to the dynamic programming approach
Task is to find the max probability tree for an input

38 LING 180 Autumn 2007

Probability Model

Attach probabilities to grammar rules
The expansions for a given non-terminal sum to 1:
VP -> Verb          .55
VP -> Verb NP       .40
VP -> Verb NP NP    .05

Read this as P(Specific rule | LHS)

39 LING 180 Autumn 2007

PCFG

40 LING 180 Autumn 2007

PCFG

41 LING 180 Autumn 2007

Probability Model (1)

A derivation (tree) consists of the set of grammar rules that are in the tree

The probability of a tree is just the product of the probabilities of the rules in the derivation

42 LING 180 Autumn 2007

Probability model

P(T,S) = P(T) P(S|T) = P(T), since P(S|T) = 1

P(T,S) = ∏_{n ∈ T} p(r_n)

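In code, this model is just a table of conditional rule probabilities (the P(rule | LHS) numbers from the earlier slide) multiplied over the rules used in a derivation. A sketch, assuming trees are the nested tuples produced by the CKY parser sketch above:

rule_prob = {("VP", ("Verb",)): 0.55,
             ("VP", ("Verb", "NP")): 0.40,
             ("VP", ("Verb", "NP", "NP")): 0.05}    # P(rule | LHS); the expansions of VP sum to 1

def tree_prob(tree, probs):
    # probability of a tree = product of the probabilities of the rules in the derivation
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return probs.get((label, (children[0],)), 1.0)    # lexical rule, e.g. Verb -> "book"
    p = probs[(label, tuple(child[0] for child in children))]
    for child in children:
        p *= tree_prob(child, probs)
    return p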
43 LING 180 Autumn 2007

Probability Model (1.1)

The probability of a word sequence P(S) is the probability of its tree in the unambiguous case
It's the sum of the probabilities of the trees in the ambiguous case

44 LING 180 Autumn 2007

Getting the Probabilities

From an annotated database (a treebank)
So, for example, to get the probability for a particular VP rule, just count all the times the rule is used and divide by the number of VPs overall

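That relative-frequency estimate in code, over a toy list of rule occurrences (the data format is made up for illustration):

from collections import Counter

def estimate_rule_probs(observed_rules):
    # observed_rules: one (lhs, rhs) pair per rule occurrence in the treebank
    rule_counts = Counter(observed_rules)
    lhs_counts = Counter(lhs for lhs, _ in observed_rules)
    return {rule: c / lhs_counts[rule[0]] for rule, c in rule_counts.items()}

observed = [("VP", ("Verb", "NP"))] * 4 + [("VP", ("Verb",))] * 5 + [("VP", ("Verb", "NP", "NP"))]
print(estimate_rule_probs(observed)[("VP", ("Verb", "NP"))])   # 4 VP -> Verb NP out of 10 VPs = 0.4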
45 LING 180 Autumn 2007

TreeBanks

46 LING 180 Autumn 2007

Treebanks

47 LING 180 Autumn 2007

Treebanks

48 LING 180 Autumn 2007

Treebank Grammars

49 LING 180 Autumn 2007

Lots of flat rules

50 LING 180 Autumn 2007

Example sentences from those rules

Total: over 17,000 different grammar rules in the 1-million-word Treebank corpus

51 LING 180 Autumn 2007

Probabilistic Grammar Assumptions

We're assuming that there is a grammar to be used to parse with
We're assuming the existence of a large robust dictionary with parts of speech
We're assuming the ability to parse (i.e., a parser)
Given all that… we can parse probabilistically

52 LING 180 Autumn 2007

Typical Approach

Bottom-up (CKY) dynamic programming approach
Assign probabilities to constituents as they are completed and placed in the table
Use the max probability for each constituent going up

53 LING 180 Autumn 2007

What's that last bullet mean?

Say we're talking about a final part of a parse: S -> NP[0,i] VP[i,j]

The probability of the S is… P(S -> NP VP) * P(NP) * P(VP)

The green stuff is already known, since we're doing bottom-up parsing

54 LING 180 Autumn 2007

Max

I said the P(NP) is known
What if there are multiple NPs for the span of text in question (0 to i)?
Take the max (where?)

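A sketch of where the max goes in that bottom-up table (the question the next slide asks). It reuses the hypothetical encoding from the earlier CKY sketches, with each binary rule now carrying its probability:

def pcky_best(words, lexical, binary, start="S"):
    # lexical: word -> {A: P(A -> word)};  binary: list of (A, B, C, P(A -> B C))
    n = len(words)
    best = [[dict() for _ in range(n + 1)] for _ in range(n + 1)]   # best[i][j][A] = max prob of an A over [i,j]
    for j in range(1, n + 1):
        best[j - 1][j] = dict(lexical.get(words[j - 1], {}))
        for i in range(j - 2, -1, -1):
            for k in range(i + 1, j):
                for A, B, C, p in binary:
                    if B in best[i][k] and C in best[k][j]:
                        q = p * best[i][k][B] * best[k][j][C]       # P(rule) * P(B) * P(C)
                        if q > best[i][j].get(A, 0.0):              # the max goes in the cell:
                            best[i][j][A] = q                       # keep only the most probable A per span
    return best[0][n].get(start, 0.0)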
55 LING 180 Autumn 2007

CKY: Where does the max go?

56 LING 180 Autumn 2007

Problems with PCFGs

The probability model we're using is just based on the rules in the derivation…

Doesn't use the words in any real way
Doesn't take into account where in the derivation a rule is used

57 LING 180 Autumn 2007

Solution

Add lexical dependencies to the scheme…
Infiltrate the predilections of particular words into the probabilities in the derivation
I.e., condition the rule probabilities on the actual words

58 LING 180 Autumn 2007

Heads

To do that, we're going to make use of the notion of the head of a phrase

The head of an NP is its noun
The head of a VP is its verb
The head of a PP is its preposition

(It's really more complicated than that, but this will do)

59 LING 180 Autumn 2007

Example (right)

Attribute grammar

60 LING 180 Autumn 2007

Example (wrong)

61 LING 180 Autumn 2007

How?

We used to have: VP -> V NP PP with P(rule|VP)

– That's the count of this rule divided by the number of VPs in a treebank

Now we have: VP(dumped) -> V(dumped) NP(sacks) PP(in) with
P(r | VP ^ dumped is the verb ^ sacks is the head of the NP ^ in is the head of the PP)
Not likely to have significant counts in any treebank

62 LING 180 Autumn 2007

Declare Independence

When stuck, exploit independence and collect the statistics you can…
We'll focus on capturing two things:

Verb subcategorization
– Particular verbs have affinities for particular VPs

Objects' affinities for their predicates (mostly their mothers and grandmothers)

– Some objects fit better with some predicates than others

63 LING 180 Autumn 2007

Subcategorization

Condition particular VP rules on their head… so for rule r: VP -> V NP PP,
P(r|VP) becomes

P(r | VP ^ dumped)

What's the count?
How many times was this rule used with (head) dump,

divided by the number of VPs that dump appears (as head) in total

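A sketch of that estimate, assuming we have already extracted one (head verb, rule right-hand side) pair per VP node from a treebank (the extraction step is not shown, and the counts are invented):

from collections import Counter

def subcat_probs(vp_observations):
    # vp_observations: one (head_verb, rhs_tuple) pair per VP node in the treebank
    rule_counts = Counter(vp_observations)
    head_counts = Counter(head for head, _ in vp_observations)
    # P(r | VP, head) = count(head used with rule r) / count(VPs with that head)
    return {(head, rhs): c / head_counts[head] for (head, rhs), c in rule_counts.items()}

obs = [("dumped", ("V", "NP", "PP"))] * 3 + [("dumped", ("V", "NP"))] + [("slept", ("V",))] * 2
print(subcat_probs(obs)[("dumped", ("V", "NP", "PP"))])   # 3 of the 4 dumped-headed VPs = 0.75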
64 LING 180 Autumn 2007

Preferences

Subcat captures the affinity between VP heads (verbs) and the VP rules they go with
What about the affinity between VP heads and the heads of the other daughters of the VP?
Back to our examples…

65 LING 180 Autumn 2007

Example (right)

66 LING 180 Autumn 2007

Example (wrong)

67 LING 180 Autumn 2007

Preferences

The issue here is the attachment of the PP, so the affinities we care about are the ones between dumped and into vs. sacks and into
So count the places where dumped is the head of a constituent that has a PP daughter with into as its head, and normalize
Vs. the situation where sacks is a constituent with into as the head of a PP daughter

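A sketch of that comparison as relative frequencies, assuming (head, PP-head) pairs for both attachment configurations have already been pulled out of a treebank (the counts below are invented for illustration):

from collections import Counter

def pp_attach_preference(vp_attachments, np_attachments, verb, noun, prep):
    # each list holds (head, pp_head) pairs observed in the treebank
    vp_counts, np_counts = Counter(vp_attachments), Counter(np_attachments)
    p_vp = vp_counts[(verb, prep)] / sum(c for (h, _), c in vp_counts.items() if h == verb)
    p_np = np_counts[(noun, prep)] / sum(c for (h, _), c in np_counts.items() if h == noun)
    return "attach to VP" if p_vp > p_np else "attach to NP"

vp_obs = [("dumped", "into")] * 6 + [("dumped", "with")] * 2
np_obs = [("sacks", "of")] * 5 + [("sacks", "into")]
print(pp_attach_preference(vp_obs, np_obs, "dumped", "sacks", "into"))   # attach to VP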
68 LING 180 Autumn 2007

Preferences (2)

Consider the VPs:
Ate spaghetti with gusto
Ate spaghetti with marinara

The affinity of gusto for eat is much larger than its affinity for spaghetti
On the other hand, the affinity of marinara for spaghetti is much higher than its affinity for ate

69 LING 180 Autumn 2007

Preferences (2)

Note the relationship here is more distant and doesn't involve a headword, since gusto and marinara aren't the heads of the PPs

(Figure: two lexicalized trees, VP(ate) over "Ate spaghetti with gusto" with the PP(with) attached to the VP, and VP(ate) over "Ate spaghetti with marinara" with the PP(with) attached to the NP(spaghetti).)

70 LING 180 Autumn 2007

Summary

Context-Free Grammars
Parsing

Top Down, Bottom Up Metaphors
Dynamic Programming Parsers: CKY, Earley

Disambiguation
PCFG
Probabilistic Augmentations to Parsers
Treebanks

71 LING 180 Autumn 2007

Optional section: the Earley algorithm

72 LING 180 Autumn 2007

Problem (minor)

We said CKY requires the grammar to be binary (i.e., in Chomsky Normal Form)
We showed that any arbitrary CFG can be converted to Chomsky Normal Form, so that's not a huge deal
Except when you change the grammar, the trees come out wrong
All things being equal, we'd prefer to leave the grammar alone

73 LING 180 Autumn 2007

Earley Parsing

Allows arbitrary CFGs
Where CKY is bottom-up, Earley is top-down

Fills a table in a single sweep over the input words
Table is length N+1; N is the number of words
Table entries represent:

– Completed constituents and their locations
– In-progress constituents
– Predicted constituents

74 LING 180 Autumn 2007

States

The table entries are called states and are represented with dotted rules:

S -> · VP                A VP is predicted

NP -> Det · Nominal      An NP is in progress

VP -> V NP ·             A VP has been found

75 LING 180 Autumn 2007

States/Locations

It would be nice to know where these things are in the input, so…

S -> · VP [0,0]              A VP is predicted at the start of the sentence

NP -> Det · Nominal [1,2]    An NP is in progress; the Det goes from 1 to 2

VP -> V NP · [0,3]           A VP has been found starting at 0 and ending at 3

76 LING 180 Autumn 2007

Graphically

77 LING 180 Autumn 2007

Earley

As with most dynamic programming approaches, the answer is found by looking in the table in the right place
In this case, there should be an S state in the final column that spans from 0 to n+1 and is complete
If that's the case, you're done

S -> α · [0,n+1]

78 LING 180 Autumn 2007

Earley Algorithm

March through the chart left-to-right
At each step, apply one of 3 operators:

Predictor
– Create new states representing top-down expectations

Scanner
– Match word predictions (rule with word after dot) to words

Completer
– When a state is complete, see what rules were looking for that completed constituent

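A compact sketch of those three operators over dotted-rule states (my own illustrative implementation of the description above, not the course's code; part-of-speech categories are handled with a simple word-to-tag lexicon):

class State:
    def __init__(self, lhs, rhs, dot, start, end):
        self.lhs, self.rhs, self.dot, self.start, self.end = lhs, rhs, dot, start, end
    def complete(self):
        return self.dot == len(self.rhs)
    def next_cat(self):
        return None if self.complete() else self.rhs[self.dot]
    def key(self):
        return (self.lhs, self.rhs, self.dot, self.start, self.end)

def earley_recognize(words, grammar, lexicon, start="S"):
    # grammar: non-terminal -> list of right-hand-side tuples; lexicon: word -> set of POS tags
    n = len(words)
    chart = [[] for _ in range(n + 1)]
    seen = [set() for _ in range(n + 1)]

    def add(state, i):                                   # enqueue a state once per chart entry
        if state.key() not in seen[i]:
            seen[i].add(state.key())
            chart[i].append(state)

    add(State("GAMMA", (start,), 0, 0, 0), 0)            # dummy start state
    for i in range(n + 1):
        for state in chart[i]:                           # chart[i] grows while we march through it
            cat = state.next_cat()
            if cat in grammar:                           # PREDICTOR: add top-down expectations
                for rhs in grammar[cat]:
                    add(State(cat, rhs, 0, i, i), i)
            elif cat is not None:                        # SCANNER: cat is a part-of-speech category
                if i < n and cat in lexicon.get(words[i], set()):
                    add(State(state.lhs, state.rhs, state.dot + 1, state.start, i + 1), i + 1)
            else:                                        # COMPLETER: advance states waiting for state.lhs
                for older in chart[state.start]:
                    if older.next_cat() == state.lhs:
                        add(State(older.lhs, older.rhs, older.dot + 1, older.start, i), i)
    return any(s.lhs == "GAMMA" and s.complete() for s in chart[n])

For the "Book that flight" example on the later slides, a toy grammar with an imperative S -> VP rule (the grammar and lexicon here are mine, not the course's sample grammar) recognizes the sentence:

grammar = {"S": [("NP", "VP"), ("VP",)], "VP": [("Verb", "NP")],
           "NP": [("Det", "Nominal")], "Nominal": [("Noun",)]}
lexicon = {"book": {"Verb"}, "that": {"Det"}, "flight": {"Noun"}}
print(earley_recognize("book that flight".split(), grammar, lexicon))   # True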
79 LING 180 Autumn 2007

Predictor

Given a state
With a non-terminal to the right of the dot
That is not a part-of-speech category:
Create a new state for each expansion of the non-terminal
Place these new states into the same chart entry as the generating state, beginning and ending where the generating state ends
So the predictor looking at

– S -> · VP [0,0]

results in
– VP -> · Verb [0,0]
– VP -> · Verb NP [0,0]

80 LING 180 Autumn 2007

Scanner

Given a state
With a non-terminal to the right of the dot
That is a part-of-speech category:
If the next word in the input matches this part-of-speech,
Create a new state with the dot moved over the non-terminal
So the scanner looking at

– VP -> · Verb NP [0,0]
If the next word, "book", can be a verb, add the new state

– VP -> Verb · NP [0,1]
Add this state to the chart entry following the current one
Note: the Earley algorithm uses top-down input to disambiguate POS
Only a POS predicted by some state can get added to the chart

81 LING 180 Autumn 2007

Completer

Applied to a state when its dot has reached the right end of the rule
Parser has discovered a category over some span of input
Find and advance all previous states that were looking for this category

Copy state, move dot, insert in current chart entry
Given

NP -> Det Nominal · [1,3]
VP -> Verb · NP [0,1]

Add
VP -> Verb NP · [0,3]

82 LING 180 Autumn 2007

Earley: how do we know we are done?

How do we know when we are done?
Find an S state in the final column that spans from 0 to n+1 and is complete
If that's the case, you're done

S -> α · [0,n+1]

83 LING 180 Autumn 2007

Earley

So sweep through the table from 0 to n+1…
New predicted states are created by starting top-down from S
New incomplete states are created by advancing existing states as new constituents are discovered
New complete states are created in the same way

84 LING 180 Autumn 2007

Earley

More specifically…
1. Predict all the states you can upfront
2. Read a word

   1. Extend states based on matches
   2. Add new predictions
   3. Go to 2

3. Look at N+1 to see if you have a winner

85 LING 180 Autumn 2007

Example

Book that flight
We should find… an S from 0 to 3 that is a completed state…

86 LING 180 Autumn 2007

Example

87 LING 180 Autumn 2007

Example

88 LING 180 Autumn 2007

Example

89 LING 180 Autumn 2007

Details

What kind of algorithms did we just describe (both Earley and CKY)?

Not parsers – recognizers
– The presence of an S state with the right attributes in the right place indicates a successful recognition
– But no parse tree… no parser
– That's how we solve (not) an exponential problem in polynomial time

90 LING 180 Autumn 2007

Back to Ambiguity

Did we solve it?

91 LING 180 Autumn 2007

Ambiguity

92 LING 180 Autumn 2007

Ambiguity

No…
Both CKY and Earley will result in multiple S structures for the [0,n] table entry
They both efficiently store the sub-parts that are shared between multiple parses
But neither can tell us which one is right
Not a parser – a recognizer

– The presence of an S state with the right attributes in the right place indicates a successful recognition
– But no parse tree… no parser
– That's how we solve (not) an exponential problem in polynomial time

93 LING 180 Autumn 2007

Converting Earley from Recognizer to Parser

With the addition of a few pointers we have a parser
Augment the "Completer" to point to where we came from

94 LING 180 Autumn 2007

Augmenting the chart with structural information

(Figure: a chart in which states S8-S13 are annotated with backpointers to the earlier states, such as S8 and S9, from which they were built.)

95 LING 180 Autumn 2007

Retrieving Parse Trees from Chart

All the possible parses for an input are in the table
We just need to read off all the backpointers from every complete S in the last column of the table
Find all the S -> X · [0,N+1]
Follow the structural traces from the Completer
Of course this won't be polynomial time, since there could be an exponential number of trees
So we can at least represent ambiguity efficiently

Page 2: Dynamic Programming The CKY (Cocke-Kasami-Younger) Algorithmstaff.icar.cnr.it/ruffolo/progetti/projects/09... · A sig n rob al t ec u hy com p led ani h b Use the max probability

2

7LING 180 Autumn 2007

Sample Grammar

8LING 180 Autumn 2007

Back to CKY Parsing

Given rules in CNFConsider the rule A -gt BC

If there is an A in the input then there must be a Bfollowed by a C in the inputIf the A goes from i to j in the input then there mustbe some k st iltkltj

ndash Ie The B splits from the C someplace

9LING 180 Autumn 2007

CKY

So letrsquos build a table so that an A spanning from i to jin the input is placed in cell [ij] in the tableSo a non-terminal spanning an entire string will sit incell [0 n]If we build the table bottom up wersquoll know that theparts of the A must go from i to k and from k to j

10LING 180 Autumn 2007

CKY

Meaning that for a rule like A -gt B C we should lookfor a B in [ik] and a C in [kj]In other words if we think there might be an Aspanning ij in the inputhellip ANDA -gt B C is a rule in the grammar THENThere must be a B in [ik] and a C in [kj] for someiltkltjSo just loop over the possible k values

11LING 180 Autumn 2007

CKY Table

bullFilling the[ij]th cell inthe CKY table

12LING 180 Autumn 2007

CKY Algorithm

3

13LING 180 Autumn 2007

Note

We arranged the loops to fill the table a column at atime from left to right bottom to top

This assures us that whenever wersquore filling a cell theparts needed to fill it are already in the table (to theleft and below)Are there other ways to fill the table

14LING 180 Autumn 2007

0 Book 1 the 2 flight 3 through 4 Houston 5

15LING 180 Autumn 2007

CYK Example

S -gt NP VPVP -gt V NPNP -gt NP PPVP -gt VP PPPP -gt P NP

NP -gt John Mary DenverV -gt calledP -gt from

16LING 180 Autumn 2007

Example

John called Mary from Denver

S

VP PP

NP VP

V NP NPP

17LING 180 Autumn 2007

Example

John called Mary from Denver

S

PP

NP VP

NP

NP

V

18LING 180 Autumn 2007

Example

NP

John

calledNP

MaryV

fromNP

DenverP

4

19LING 180 Autumn 2007

Example

NP

John

calledNP

MaryVX

fromNP

DenverP

20LING 180 Autumn 2007

Example

NP

John

calledNP

MaryVX

fromNPVP

DenverP

21LING 180 Autumn 2007

Example

NP

John

calledNP

MaryVX

fromNPVP

DenverPX

22LING 180 Autumn 2007

Example

NPPP

John

calledNP

MaryVX

fromNPVP

DenverPX

23LING 180 Autumn 2007

Example

NPPP

John

calledNP

MaryV

fromNPVPS

DenverPX

24LING 180 Autumn 2007

Example

NPPP

John

calledNP

MaryVX

fromNPVPS

DenverPXX

5

25LING 180 Autumn 2007

Example

NPPPNP

John

calledNP

MaryVX

fromNPVPS

DenverPX

26LING 180 Autumn 2007

Example

NPPPNP

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

27LING 180 Autumn 2007

Example

NPPPNPVP

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

28LING 180 Autumn 2007

Example

NPPPNPVP

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

29LING 180 Autumn 2007

Example

NPPPNPVP1VP2

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

30LING 180 Autumn 2007

Example

NPPPNPVP1VP2

S

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

6

31LING 180 Autumn 2007

Example

NPPPNPVPS

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

32LING 180 Autumn 2007

Back to Ambiguity

Did we solve it

33LING 180 Autumn 2007

Ambiguity

34LING 180 Autumn 2007

Ambiguity

NohellipBoth CKY and Earley will result in multiple Sstructures for the [0n] table entryThey both efficiently store the sub-parts that areshared between multiple parsesBut neither can tell us which one is rightNot a parser ndash a recognizer

ndash The presence of an S state with the right attributes inthe right place indicates a successful recognition

ndash But no parse treehellip no parserndash Thatrsquos how we solve (not) an exponential problem in

polynomial time

35LING 180 Autumn 2007

Converting CKY from Recognizerto Parser

With the addition of a few pointers we have a parserAugment each new cell in chart to point to where wecame from

36LING 180 Autumn 2007

How to do parse disambiguation

Probabilistic methodsAugment the grammar with probabilitiesThen modify the parser to keep only most probableparsesAnd at the end return the most probable parse

7

37LING 180 Autumn 2007

Probabilistic CFGs

The probabilistic modelAssigning probabilities to parse trees

Getting the probabilities for the modelParsing with probabilities

Slight modification to dynamic programmingapproachTask is to find the max probability tree for an input

38LING 180 Autumn 2007

Probability Model

Attach probabilities to grammar rulesThe expansions for a given non-terminal sum to 1VP -gt Verb 55VP -gt Verb NP 40VP -gt Verb NP NP 05

Read this as P(Specific rule | LHS)

39LING 180 Autumn 2007

PCFG

40LING 180 Autumn 2007

PCFG

41LING 180 Autumn 2007

Probability Model (1)

A derivation (tree) consists of the set of grammarrules that are in the tree

The probability of a tree is just the product of theprobabilities of the rules in the derivation

42LING 180 Autumn 2007

Probability model

P(TS) = P(T)P(S|T) = P(T) since P(S|T)=1

P(TS) = p(rn )nT

8

43LING 180 Autumn 2007

Probability Model (11)

The probability of a word sequence P(S) is theprobability of its tree in the unambiguous caseItrsquos the sum of the probabilities of the trees in theambiguous case

44LING 180 Autumn 2007

Getting the Probabilities

From an annotated database (a treebank)So for example to get the probability for a particularVP rule just count all the times the rule is used anddivide by the number of VPs overall

45LING 180 Autumn 2007

TreeBanks

46LING 180 Autumn 2007

Treebanks

47LING 180 Autumn 2007

Treebanks

48LING 180 Autumn 2007

Treebank Grammars

9

49LING 180 Autumn 2007

Lots of flat rules

50LING 180 Autumn 2007

Example sentences from thoserules

Total over 17000 different grammar rules in the 1-million word Treebank corpus

51LING 180 Autumn 2007

Probabilistic GrammarAssumptions

Wersquore assuming that there is a grammar to be used to parsewithWersquore assuming the existence of a large robust dictionarywith parts of speechWersquore assuming the ability to parse (ie a parser)Given all thathellip we can parse probabilistically

52LING 180 Autumn 2007

Typical Approach

Bottom-up (CKY) dynamic programming approachAssign probabilities to constituents as they arecompleted and placed in the tableUse the max probability for each constituent going up

53LING 180 Autumn 2007

Whatrsquos that last bullet mean

Say wersquore talking about a final part of a parseS-gt0NPiVPj

The probability of the S ishellipP(S-gtNP VP)P(NP)P(VP)

The green stuff is already known Wersquore doing bottom-up parsing

54LING 180 Autumn 2007

Max

I said the P(NP) is knownWhat if there are multiple NPs for the span of text inquestion (0 to i)Take the max (where)

10

55LING 180 Autumn 2007

CKY Where does the max go

56LING 180 Autumn 2007

Problems with PCFGs

The probability model wersquore using is just based on therules in the derivationhellip

Doesnrsquot use the words in any real wayDoesnrsquot take into account where in the derivation arule is used

57LING 180 Autumn 2007

Solution

Add lexical dependencies to the schemehellipInfiltrate the predilections of particular words into theprobabilities in the derivationIe Condition the rule probabilities on the actualwords

58LING 180 Autumn 2007

Heads

To do that wersquore going to make use of the notion ofthe head of a phrase

The head of an NP is its nounThe head of a VP is its verbThe head of a PP is its preposition

(Itrsquos really more complicated than that but this will do)

59LING 180 Autumn 2007

Example (right)

Attribute grammar

60LING 180 Autumn 2007

Example (wrong)

11

61LING 180 Autumn 2007

How

We used to haveVP -gt V NP PP P(rule|VP)

ndash Thatrsquos the count of this rule divided by the number ofVPs in a treebank

Now we haveVP(dumped)-gt V(dumped) NP(sacks)PP(in)P(r|VP ^ dumped is the verb ^ sacks is the head ofthe NP ^ in is the head of the PP)Not likely to have significant counts in any treebank

62LING 180 Autumn 2007

Declare Independence

When stuck exploit independence and collect thestatistics you canhellipWersquoll focus on capturing two things

Verb subcategorizationndash Particular verbs have affinities for particular VPs

Objects affinities for their predicates (mostly theirmothers and grandmothers)

ndash Some objects fit better with some predicates thanothers

63LING 180 Autumn 2007

Subcategorization

Condition particular VP rules on their headhellip so r VP -gt V NP PP P(r|VP)Becomes

P(r | VP ^ dumped)

Whatrsquos the countHow many times was this rule used with (head) dump

divided by the number of VPs that dump appears (ashead) in total

64LING 180 Autumn 2007

Preferences

Subcat captures the affinity between VP heads(verbs) and the VP rules they go withWhat about the affinity between VP heads and theheads of the other daughters of the VPBack to our exampleshellip

65LING 180 Autumn 2007

Example (right)

66LING 180 Autumn 2007

Example (wrong)

12

67LING 180 Autumn 2007

Preferences

The issue here is the attachment of the PP So theaffinities we care about are the ones betweendumped and into vs sacks and intoSo count the places where dumped is the head of aconstituent that has a PP daughter with into as itshead and normalizeVs the situation where sacks is a constituent withinto as the head of a PP daughter

68LING 180 Autumn 2007

Preferences (2)

Consider the VPsAte spaghetti with gustoAte spaghetti with marinara

The affinity of gusto for eat is much larger than itsaffinity for spaghettiOn the other hand the affinity of marinara forspaghetti is much higher than its affinity for ate

69LING 180 Autumn 2007

Preferences (2)

Note the relationship here is more distant anddoesnrsquot involve a headword since gusto andmarinara arenrsquot the heads of the PPs

Vp (ate) Vp(ate)

Vp(ate) Pp(with)Pp(with)

Np(spag)

npvvAte spaghetti with marinaraAte spaghetti with gusto

np

70LING 180 Autumn 2007

Summary

Context-Free GrammarsParsing

Top Down Bottom Up MetaphorsDynamic Programming Parsers CKY Earley

DisambiguationPCFGProbabilistic Augmentations to ParsersTreebanks

71LING 180 Autumn 2007

Optional section the Earley alg

72LING 180 Autumn 2007

Problem (minor)

We said CKY requires the grammar to be binary (ieIn Chomsky-Normal Form)We showed that any arbitrary CFG can be convertedto Chomsky-Normal Form so thatrsquos not a huge dealExcept when you change the grammar the treescome out wrongAll things being equal wersquod prefer to leave thegrammar alone

13

73LING 180 Autumn 2007

Earley Parsing

Allows arbitrary CFGsWhere CKY is bottom-up Earley is top-down

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

ndash Completed constituents and their locationsndash In-progress constituentsndash Predicted constituents

74LING 180 Autumn 2007

States

The table-entries are called states and are representedwith dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

75LING 180 Autumn 2007

StatesLocations

It would be nice to know where these things are in the inputsohellip

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

76LING 180 Autumn 2007

Graphically

77LING 180 Autumn 2007

Earley

As with most dynamic programming approaches theanswer is found by looking in the table in the rightplaceIn this case there should be an S state in the finalcolumn that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done

S ndashgt α [0n+1]

78LING 180 Autumn 2007

Earley Algorithm

March through chart left-to-rightAt each step apply 1 of 3 operators

Predictorndash Create new states representing top-down expectations

Scannerndash Match word predictions (rule with word after dot) to

words

Completerndash When a state is complete see what rules were looking

for that completed constituent

14

79LING 180 Autumn 2007

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated statebeginning and ending where generating state endsSo predictor looking at

ndash S -gt VP [00]

results inndash VP -gt Verb [00]ndash VP -gt Verb NP [00]

80LING 180 Autumn 2007

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechCreate a new state with dot moved over the non-terminalSo scanner looking at

ndash VP -gt Verb NP [00]If the next word ldquobookrdquo can be a verb add new state

ndash VP -gt Verb NP [01]Add this state to chart entry following current oneNote Earley algorithm uses top-down input to disambiguate POSOnly POS predicted by some state can get added to chart

81LING 180 Autumn 2007

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category

copy state move dot insert in current chart entryGiven

NP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

82LING 180 Autumn 2007

Earley how do we know we aredone

How do we know when we are doneFind an S state in the final column that spans from 0to n+1 and is completeIf thatrsquos the case yoursquore done

S ndashgt α [0n+1]

83LING 180 Autumn 2007

Earley

So sweep through the table from 0 to n+1hellipNew predicted states are created by starting top-down from SNew incomplete states are created by advancingexisting states as new constituents are discoveredNew complete states are created in the same way

84LING 180 Autumn 2007

Earley

More specificallyhellip1 Predict all the states you can upfront2 Read a word

1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

15

85LING 180 Autumn 2007

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completedstatehellip

86LING 180 Autumn 2007

Example

87LING 180 Autumn 2007

Example

88LING 180 Autumn 2007

Example

89LING 180 Autumn 2007

Details

What kind of algorithms did we just describe (bothEarley and CKY)

Not parsers ndash recognizersndash The presence of an S state with the right attributes in

the right place indicates a successful recognitionndash But no parse treehellip no parserndash Thatrsquos how we solve (not) an exponential problem in

polynomial time

90LING 180 Autumn 2007

Back to Ambiguity

Did we solve it

16

91LING 180 Autumn 2007

Ambiguity

92LING 180 Autumn 2007

Ambiguity

NohellipBoth CKY and Earley will result in multiple Sstructures for the [0n] table entryThey both efficiently store the sub-parts that areshared between multiple parsesBut neither can tell us which one is rightNot a parser ndash a recognizer

ndash The presence of an S state with the right attributes inthe right place indicates a successful recognition

ndash But no parse treehellip no parserndash Thatrsquos how we solve (not) an exponential problem in

polynomial time

93LING 180 Autumn 2007

Converting Earley fromRecognizer to Parser

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we camefrom

94LING 180 Autumn 2007

Augmenting the chart withstructural information

S8S9

S10

S11

S13S12

S8

S9

S8

95LING 180 Autumn 2007

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S inthe last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be anexponential number of treesSo we can at least represent ambiguity efficiently

Page 3: Dynamic Programming The CKY (Cocke-Kasami-Younger) Algorithmstaff.icar.cnr.it/ruffolo/progetti/projects/09... · A sig n rob al t ec u hy com p led ani h b Use the max probability

3

13LING 180 Autumn 2007

Note

We arranged the loops to fill the table a column at atime from left to right bottom to top

This assures us that whenever wersquore filling a cell theparts needed to fill it are already in the table (to theleft and below)Are there other ways to fill the table

14LING 180 Autumn 2007

0 Book 1 the 2 flight 3 through 4 Houston 5

15LING 180 Autumn 2007

CYK Example

S -gt NP VPVP -gt V NPNP -gt NP PPVP -gt VP PPPP -gt P NP

NP -gt John Mary DenverV -gt calledP -gt from

16LING 180 Autumn 2007

Example

John called Mary from Denver

S

VP PP

NP VP

V NP NPP

17LING 180 Autumn 2007

Example

John called Mary from Denver

S

PP

NP VP

NP

NP

V

18LING 180 Autumn 2007

Example

NP

John

calledNP

MaryV

fromNP

DenverP

4

19LING 180 Autumn 2007

Example

NP

John

calledNP

MaryVX

fromNP

DenverP

20LING 180 Autumn 2007

Example

NP

John

calledNP

MaryVX

fromNPVP

DenverP

21LING 180 Autumn 2007

Example

NP

John

calledNP

MaryVX

fromNPVP

DenverPX

22LING 180 Autumn 2007

Example

NPPP

John

calledNP

MaryVX

fromNPVP

DenverPX

23LING 180 Autumn 2007

Example

NPPP

John

calledNP

MaryV

fromNPVPS

DenverPX

24LING 180 Autumn 2007

Example

NPPP

John

calledNP

MaryVX

fromNPVPS

DenverPXX

5

25LING 180 Autumn 2007

Example

NPPPNP

John

calledNP

MaryVX

fromNPVPS

DenverPX

26LING 180 Autumn 2007

Example

NPPPNP

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

27LING 180 Autumn 2007

Example

NPPPNPVP

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

28LING 180 Autumn 2007

Example

NPPPNPVP

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

29LING 180 Autumn 2007

Example

NPPPNPVP1VP2

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

30LING 180 Autumn 2007

Example

NPPPNPVP1VP2

S

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

6

31LING 180 Autumn 2007

Example

NPPPNPVPS

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

32LING 180 Autumn 2007

Back to Ambiguity

Did we solve it

33LING 180 Autumn 2007

Ambiguity

34LING 180 Autumn 2007

Ambiguity

NohellipBoth CKY and Earley will result in multiple Sstructures for the [0n] table entryThey both efficiently store the sub-parts that areshared between multiple parsesBut neither can tell us which one is rightNot a parser ndash a recognizer

ndash The presence of an S state with the right attributes inthe right place indicates a successful recognition

ndash But no parse treehellip no parserndash Thatrsquos how we solve (not) an exponential problem in

polynomial time

35LING 180 Autumn 2007

Converting CKY from Recognizerto Parser

With the addition of a few pointers we have a parserAugment each new cell in chart to point to where wecame from

36LING 180 Autumn 2007

How to do parse disambiguation

Probabilistic methodsAugment the grammar with probabilitiesThen modify the parser to keep only most probableparsesAnd at the end return the most probable parse

7

37LING 180 Autumn 2007

Probabilistic CFGs

The probabilistic modelAssigning probabilities to parse trees

Getting the probabilities for the modelParsing with probabilities

Slight modification to dynamic programmingapproachTask is to find the max probability tree for an input

38LING 180 Autumn 2007

Probability Model

Attach probabilities to grammar rulesThe expansions for a given non-terminal sum to 1VP -gt Verb 55VP -gt Verb NP 40VP -gt Verb NP NP 05

Read this as P(Specific rule | LHS)

39LING 180 Autumn 2007

PCFG

40LING 180 Autumn 2007

PCFG

41LING 180 Autumn 2007

Probability Model (1)

A derivation (tree) consists of the set of grammarrules that are in the tree

The probability of a tree is just the product of theprobabilities of the rules in the derivation

42LING 180 Autumn 2007

Probability model

P(TS) = P(T)P(S|T) = P(T) since P(S|T)=1

P(TS) = p(rn )nT

8

43LING 180 Autumn 2007

Probability Model (11)

The probability of a word sequence P(S) is theprobability of its tree in the unambiguous caseItrsquos the sum of the probabilities of the trees in theambiguous case

44LING 180 Autumn 2007

Getting the Probabilities

From an annotated database (a treebank)So for example to get the probability for a particularVP rule just count all the times the rule is used anddivide by the number of VPs overall

45LING 180 Autumn 2007

TreeBanks

46LING 180 Autumn 2007

Treebanks

47LING 180 Autumn 2007

Treebanks

48LING 180 Autumn 2007

Treebank Grammars

9

49LING 180 Autumn 2007

Lots of flat rules

50LING 180 Autumn 2007

Example sentences from thoserules

Total over 17000 different grammar rules in the 1-million word Treebank corpus

51LING 180 Autumn 2007

Probabilistic GrammarAssumptions

Wersquore assuming that there is a grammar to be used to parsewithWersquore assuming the existence of a large robust dictionarywith parts of speechWersquore assuming the ability to parse (ie a parser)Given all thathellip we can parse probabilistically

52LING 180 Autumn 2007

Typical Approach

Bottom-up (CKY) dynamic programming approachAssign probabilities to constituents as they arecompleted and placed in the tableUse the max probability for each constituent going up

53LING 180 Autumn 2007

Whatrsquos that last bullet mean

Say wersquore talking about a final part of a parseS-gt0NPiVPj

The probability of the S ishellipP(S-gtNP VP)P(NP)P(VP)

The green stuff is already known Wersquore doing bottom-up parsing

54LING 180 Autumn 2007

Max

I said the P(NP) is knownWhat if there are multiple NPs for the span of text inquestion (0 to i)Take the max (where)

10

55LING 180 Autumn 2007

CKY Where does the max go

56LING 180 Autumn 2007

Problems with PCFGs

The probability model wersquore using is just based on therules in the derivationhellip

Doesnrsquot use the words in any real wayDoesnrsquot take into account where in the derivation arule is used

57LING 180 Autumn 2007

Solution

Add lexical dependencies to the schemehellipInfiltrate the predilections of particular words into theprobabilities in the derivationIe Condition the rule probabilities on the actualwords

58LING 180 Autumn 2007

Heads

To do that wersquore going to make use of the notion ofthe head of a phrase

The head of an NP is its nounThe head of a VP is its verbThe head of a PP is its preposition

(Itrsquos really more complicated than that but this will do)

59LING 180 Autumn 2007

Example (right)

Attribute grammar

60LING 180 Autumn 2007

Example (wrong)

11

61LING 180 Autumn 2007

How

We used to haveVP -gt V NP PP P(rule|VP)

ndash Thatrsquos the count of this rule divided by the number ofVPs in a treebank

Now we haveVP(dumped)-gt V(dumped) NP(sacks)PP(in)P(r|VP ^ dumped is the verb ^ sacks is the head ofthe NP ^ in is the head of the PP)Not likely to have significant counts in any treebank

62LING 180 Autumn 2007

Declare Independence

When stuck exploit independence and collect thestatistics you canhellipWersquoll focus on capturing two things

Verb subcategorizationndash Particular verbs have affinities for particular VPs

Objects affinities for their predicates (mostly theirmothers and grandmothers)

ndash Some objects fit better with some predicates thanothers

63LING 180 Autumn 2007

Subcategorization

Condition particular VP rules on their headhellip so r VP -gt V NP PP P(r|VP)Becomes

P(r | VP ^ dumped)

Whatrsquos the countHow many times was this rule used with (head) dump

divided by the number of VPs that dump appears (ashead) in total

64LING 180 Autumn 2007

Preferences

Subcat captures the affinity between VP heads(verbs) and the VP rules they go withWhat about the affinity between VP heads and theheads of the other daughters of the VPBack to our exampleshellip

65LING 180 Autumn 2007

Example (right)

66LING 180 Autumn 2007

Example (wrong)

12

67LING 180 Autumn 2007

Preferences

The issue here is the attachment of the PP So theaffinities we care about are the ones betweendumped and into vs sacks and intoSo count the places where dumped is the head of aconstituent that has a PP daughter with into as itshead and normalizeVs the situation where sacks is a constituent withinto as the head of a PP daughter

68LING 180 Autumn 2007

Preferences (2)

Consider the VPsAte spaghetti with gustoAte spaghetti with marinara

The affinity of gusto for eat is much larger than itsaffinity for spaghettiOn the other hand the affinity of marinara forspaghetti is much higher than its affinity for ate

69LING 180 Autumn 2007

Preferences (2)

Note the relationship here is more distant anddoesnrsquot involve a headword since gusto andmarinara arenrsquot the heads of the PPs

Vp (ate) Vp(ate)

Vp(ate) Pp(with)Pp(with)

Np(spag)

npvvAte spaghetti with marinaraAte spaghetti with gusto

np

70LING 180 Autumn 2007

Summary

Context-Free GrammarsParsing

Top Down Bottom Up MetaphorsDynamic Programming Parsers CKY Earley

DisambiguationPCFGProbabilistic Augmentations to ParsersTreebanks

71LING 180 Autumn 2007

Optional section the Earley alg

72LING 180 Autumn 2007

Problem (minor)

We said CKY requires the grammar to be binary (ieIn Chomsky-Normal Form)We showed that any arbitrary CFG can be convertedto Chomsky-Normal Form so thatrsquos not a huge dealExcept when you change the grammar the treescome out wrongAll things being equal wersquod prefer to leave thegrammar alone

13

73LING 180 Autumn 2007

Earley Parsing

Allows arbitrary CFGsWhere CKY is bottom-up Earley is top-down

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

ndash Completed constituents and their locationsndash In-progress constituentsndash Predicted constituents

74LING 180 Autumn 2007

States

The table-entries are called states and are representedwith dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

75LING 180 Autumn 2007

StatesLocations

It would be nice to know where these things are in the inputsohellip

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

76LING 180 Autumn 2007

Graphically

77LING 180 Autumn 2007

Earley

As with most dynamic programming approaches theanswer is found by looking in the table in the rightplaceIn this case there should be an S state in the finalcolumn that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done

S ndashgt α [0n+1]

78LING 180 Autumn 2007

Earley Algorithm

March through chart left-to-rightAt each step apply 1 of 3 operators

Predictorndash Create new states representing top-down expectations

Scannerndash Match word predictions (rule with word after dot) to

words

Completerndash When a state is complete see what rules were looking

for that completed constituent

14

79LING 180 Autumn 2007

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated statebeginning and ending where generating state endsSo predictor looking at

ndash S -gt VP [00]

results inndash VP -gt Verb [00]ndash VP -gt Verb NP [00]

80LING 180 Autumn 2007

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechCreate a new state with dot moved over the non-terminalSo scanner looking at

ndash VP -gt Verb NP [00]If the next word ldquobookrdquo can be a verb add new state

ndash VP -gt Verb NP [01]Add this state to chart entry following current oneNote Earley algorithm uses top-down input to disambiguate POSOnly POS predicted by some state can get added to chart

81LING 180 Autumn 2007

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category

copy state move dot insert in current chart entryGiven

NP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

82LING 180 Autumn 2007

Earley how do we know we aredone

How do we know when we are doneFind an S state in the final column that spans from 0to n+1 and is completeIf thatrsquos the case yoursquore done

S ndashgt α [0n+1]

83LING 180 Autumn 2007

Earley

So sweep through the table from 0 to n+1hellipNew predicted states are created by starting top-down from SNew incomplete states are created by advancingexisting states as new constituents are discoveredNew complete states are created in the same way

84LING 180 Autumn 2007

Earley

More specificallyhellip1 Predict all the states you can upfront2 Read a word

1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

15

85LING 180 Autumn 2007

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completedstatehellip

86LING 180 Autumn 2007

Example

87LING 180 Autumn 2007

Example

88LING 180 Autumn 2007

Example

89LING 180 Autumn 2007

Details

What kind of algorithms did we just describe (bothEarley and CKY)

Not parsers ndash recognizersndash The presence of an S state with the right attributes in

the right place indicates a successful recognitionndash But no parse treehellip no parserndash Thatrsquos how we solve (not) an exponential problem in

polynomial time

90LING 180 Autumn 2007

Back to Ambiguity

Did we solve it

16

91LING 180 Autumn 2007

Ambiguity

92LING 180 Autumn 2007

Ambiguity

NohellipBoth CKY and Earley will result in multiple Sstructures for the [0n] table entryThey both efficiently store the sub-parts that areshared between multiple parsesBut neither can tell us which one is rightNot a parser ndash a recognizer

ndash The presence of an S state with the right attributes inthe right place indicates a successful recognition

ndash But no parse treehellip no parserndash Thatrsquos how we solve (not) an exponential problem in

polynomial time

93LING 180 Autumn 2007

Converting Earley fromRecognizer to Parser

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we camefrom

94LING 180 Autumn 2007

Augmenting the chart withstructural information

S8S9

S10

S11

S13S12

S8

S9

S8

95LING 180 Autumn 2007

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S inthe last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be anexponential number of treesSo we can at least represent ambiguity efficiently

Page 4: Dynamic Programming The CKY (Cocke-Kasami-Younger) Algorithmstaff.icar.cnr.it/ruffolo/progetti/projects/09... · A sig n rob al t ec u hy com p led ani h b Use the max probability

4

19LING 180 Autumn 2007

Example

NP

John

calledNP

MaryVX

fromNP

DenverP

20LING 180 Autumn 2007

Example

NP

John

calledNP

MaryVX

fromNPVP

DenverP

21LING 180 Autumn 2007

Example

NP

John

calledNP

MaryVX

fromNPVP

DenverPX

22LING 180 Autumn 2007

Example

NPPP

John

calledNP

MaryVX

fromNPVP

DenverPX

23LING 180 Autumn 2007

Example

NPPP

John

calledNP

MaryV

fromNPVPS

DenverPX

24LING 180 Autumn 2007

Example

NPPP

John

calledNP

MaryVX

fromNPVPS

DenverPXX

5

25LING 180 Autumn 2007

Example

NPPPNP

John

calledNP

MaryVX

fromNPVPS

DenverPX

26LING 180 Autumn 2007

Example

NPPPNP

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

27LING 180 Autumn 2007

Example

NPPPNPVP

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

28LING 180 Autumn 2007

Example

NPPPNPVP

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

29LING 180 Autumn 2007

Example

NPPPNPVP1VP2

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

30LING 180 Autumn 2007

Example

NPPPNPVP1VP2

S

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

6

31LING 180 Autumn 2007

Example

NPPPNPVPS

John

calledNP

MaryVX

fromNPVPS

DenverPXXX

32LING 180 Autumn 2007

Back to Ambiguity

Did we solve it

33LING 180 Autumn 2007

Ambiguity

34LING 180 Autumn 2007

Ambiguity

NohellipBoth CKY and Earley will result in multiple Sstructures for the [0n] table entryThey both efficiently store the sub-parts that areshared between multiple parsesBut neither can tell us which one is rightNot a parser ndash a recognizer

ndash The presence of an S state with the right attributes inthe right place indicates a successful recognition

ndash But no parse treehellip no parserndash Thatrsquos how we solve (not) an exponential problem in

polynomial time

35LING 180 Autumn 2007

Converting CKY from Recognizerto Parser

With the addition of a few pointers we have a parserAugment each new cell in chart to point to where wecame from

36LING 180 Autumn 2007

How to do parse disambiguation

Probabilistic methodsAugment the grammar with probabilitiesThen modify the parser to keep only most probableparsesAnd at the end return the most probable parse

7

37LING 180 Autumn 2007

Probabilistic CFGs

The probabilistic modelAssigning probabilities to parse trees

Getting the probabilities for the modelParsing with probabilities

Slight modification to dynamic programmingapproachTask is to find the max probability tree for an input

38LING 180 Autumn 2007

Probability Model

Attach probabilities to grammar rulesThe expansions for a given non-terminal sum to 1VP -gt Verb 55VP -gt Verb NP 40VP -gt Verb NP NP 05

Read this as P(Specific rule | LHS)

39LING 180 Autumn 2007

PCFG

40LING 180 Autumn 2007

PCFG

41LING 180 Autumn 2007

Probability Model (1)

A derivation (tree) consists of the set of grammarrules that are in the tree

The probability of a tree is just the product of theprobabilities of the rules in the derivation

42LING 180 Autumn 2007

Probability model

P(TS) = P(T)P(S|T) = P(T) since P(S|T)=1

P(TS) = p(rn )nT

8

43LING 180 Autumn 2007

Probability Model (11)

The probability of a word sequence P(S) is theprobability of its tree in the unambiguous caseItrsquos the sum of the probabilities of the trees in theambiguous case

44LING 180 Autumn 2007

Getting the Probabilities

From an annotated database (a treebank)So for example to get the probability for a particularVP rule just count all the times the rule is used anddivide by the number of VPs overall

45LING 180 Autumn 2007

TreeBanks

46LING 180 Autumn 2007

Treebanks

47LING 180 Autumn 2007

Treebanks

48LING 180 Autumn 2007

Treebank Grammars

9

49LING 180 Autumn 2007

Lots of flat rules

50LING 180 Autumn 2007

Example sentences from thoserules

Total over 17000 different grammar rules in the 1-million word Treebank corpus

51LING 180 Autumn 2007

Probabilistic GrammarAssumptions

Wersquore assuming that there is a grammar to be used to parsewithWersquore assuming the existence of a large robust dictionarywith parts of speechWersquore assuming the ability to parse (ie a parser)Given all thathellip we can parse probabilistically

52LING 180 Autumn 2007

Typical Approach

Bottom-up (CKY) dynamic programming approachAssign probabilities to constituents as they arecompleted and placed in the tableUse the max probability for each constituent going up

53LING 180 Autumn 2007

Whatrsquos that last bullet mean

Say wersquore talking about a final part of a parseS-gt0NPiVPj

The probability of the S ishellipP(S-gtNP VP)P(NP)P(VP)

The green stuff is already known Wersquore doing bottom-up parsing

54LING 180 Autumn 2007

Max

I said the P(NP) is knownWhat if there are multiple NPs for the span of text inquestion (0 to i)Take the max (where)

10

55LING 180 Autumn 2007

CKY Where does the max go

56LING 180 Autumn 2007

Problems with PCFGs

The probability model wersquore using is just based on therules in the derivationhellip

Doesnrsquot use the words in any real wayDoesnrsquot take into account where in the derivation arule is used

57LING 180 Autumn 2007

Solution

Add lexical dependencies to the schemehellipInfiltrate the predilections of particular words into theprobabilities in the derivationIe Condition the rule probabilities on the actualwords

58LING 180 Autumn 2007

Heads

To do that wersquore going to make use of the notion ofthe head of a phrase

The head of an NP is its nounThe head of a VP is its verbThe head of a PP is its preposition

(Itrsquos really more complicated than that but this will do)

59LING 180 Autumn 2007

Example (right)

Attribute grammar

60LING 180 Autumn 2007

Example (wrong)

11

61LING 180 Autumn 2007

How

We used to haveVP -gt V NP PP P(rule|VP)

ndash Thatrsquos the count of this rule divided by the number ofVPs in a treebank

Now we haveVP(dumped)-gt V(dumped) NP(sacks)PP(in)P(r|VP ^ dumped is the verb ^ sacks is the head ofthe NP ^ in is the head of the PP)Not likely to have significant counts in any treebank

62LING 180 Autumn 2007

Declare Independence

When stuck exploit independence and collect thestatistics you canhellipWersquoll focus on capturing two things

Verb subcategorizationndash Particular verbs have affinities for particular VPs

Objects affinities for their predicates (mostly theirmothers and grandmothers)

ndash Some objects fit better with some predicates thanothers

63LING 180 Autumn 2007

Subcategorization

Condition particular VP rules on their head... so for r: VP -> V NP PP, P(r|VP) becomes

P(r | VP ^ dumped)

What's the count? How many times was this rule used with (head) dump,

divided by the number of VPs that dump appears in (as head) in total
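A small worked version of that estimate, with invented counts (the numbers are made up purely to show the arithmetic):

from collections import Counter

# Hypothetical counts of VP rules broken down by the VP's head verb.
vp_rule_counts = Counter({
    ("dumped", "VP -> V NP PP"): 6,
    ("dumped", "VP -> V NP"):    9,
    ("dumped", "VP -> V PP"):    1,
    ("slept",  "VP -> V"):      12,
    ("slept",  "VP -> V PP"):    3,
})

def p_rule_given_head(rule, head):
    head_total = sum(c for (h, _), c in vp_rule_counts.items() if h == head)
    return vp_rule_counts[(head, rule)] / head_total

# P(VP -> V NP PP | VP, head = dumped) = 6 / (6 + 9 + 1) = 0.375
print(p_rule_given_head("VP -> V NP PP", "dumped"))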

64LING 180 Autumn 2007

Preferences

Subcat captures the affinity between VP heads (verbs) and the VP rules they go with. What about the affinity between VP heads and the heads of the other daughters of the VP? Back to our examples...

65LING 180 Autumn 2007

Example (right)

66LING 180 Autumn 2007

Example (wrong)


67LING 180 Autumn 2007

Preferences

The issue here is the attachment of the PP. So the affinities we care about are the ones between dumped and into vs. sacks and into. So count the places where dumped is the head of a constituent that has a PP daughter with into as its head, and normalize. Vs. the situation where sacks is a constituent with into as the head of a PP daughter.
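A sketch of using such normalized counts to choose the attachment (all counts hypothetical, just to show the comparison):

# How often a constituent headed by X has a PP daughter headed by "into",
# versus how often that head occurs at all.
pp_into_counts = {"dumped": 8, "sacks": 1}
head_totals    = {"dumped": 40, "sacks": 30}

def p_pp_into(head):
    return pp_into_counts.get(head, 0) / head_totals[head]

# Compare the two candidate attachments for "... dumped sacks into a bin"
verb_attach = p_pp_into("dumped")   # 8/40  = 0.20
noun_attach = p_pp_into("sacks")    # 1/30 ≈ 0.03
print("attach to VP(dumped)" if verb_attach > noun_attach else "attach to NP(sacks)")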

68LING 180 Autumn 2007

Preferences (2)

Consider the VPs:
Ate spaghetti with gusto
Ate spaghetti with marinara

The affinity of gusto for eat is much larger than its affinity for spaghetti. On the other hand, the affinity of marinara for spaghetti is much higher than its affinity for ate.

69LING 180 Autumn 2007

Preferences (2)

Note the relationship here is more distant and doesn't involve a headword, since gusto and marinara aren't the heads of the PPs.

[Tree sketches: in "Ate spaghetti with gusto" the Pp(with) attaches to Vp(ate); in "Ate spaghetti with marinara" the Pp(with) attaches to Np(spaghetti).]

70LING 180 Autumn 2007

Summary

Context-Free Grammars
Parsing

Top-Down, Bottom-Up Metaphors
Dynamic Programming Parsers: CKY, Earley

Disambiguation
PCFG
Probabilistic Augmentations to Parsers
Treebanks

71LING 180 Autumn 2007

Optional section: the Earley algorithm

72LING 180 Autumn 2007

Problem (minor)

We said CKY requires the grammar to be binary (i.e., in Chomsky Normal Form). We showed that any arbitrary CFG can be converted to Chomsky Normal Form, so that's not a huge deal. Except when you change the grammar, the trees come out wrong. All things being equal, we'd prefer to leave the grammar alone.
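To see why the trees come out wrong, here is a small sketch of the kind of binarization a CNF conversion performs; the @VP-Verb dummy category is just an illustrative naming convention, not the lecture's:

flat_rule = ("VP", ["Verb", "NP", "PP"])        # VP -> Verb NP PP

def binarize(lhs, rhs):
    """Split an n-ary rule into binary rules using introduced dummy categories."""
    rules = []
    while len(rhs) > 2:
        dummy = "@%s-%s" % (lhs, rhs[0])        # e.g. @VP-Verb
        rules.append((lhs, [rhs[0], dummy]))
        lhs, rhs = dummy, rhs[1:]
    rules.append((lhs, rhs))
    return rules

print(binarize(*flat_rule))
# [('VP', ['Verb', '@VP-Verb']), ('@VP-Verb', ['NP', 'PP'])]
# Any parse built with these rules contains an extra @VP-Verb node that was
# not in the original grammar, so the tree differs unless it is collapsed
# back afterwards.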


73LING 180 Autumn 2007

Earley Parsing

Allows arbitrary CFGs. Where CKY is bottom-up, Earley is top-down.

Fills a table in a single sweep over the input words. Table is length N+1; N is the number of words. Table entries represent:

– Completed constituents and their locations
– In-progress constituents
– Predicted constituents

74LING 180 Autumn 2007

States

The table entries are called states and are represented with dotted rules:

S -> . VP        A VP is predicted

NP -> Det . Nominal        An NP is in progress

VP -> V NP .        A VP has been found

75LING 180 Autumn 2007

States / Locations

It would be nice to know where these things are in the input, so...

S -> . VP [0,0]        A VP is predicted at the start of the sentence

NP -> Det . Nominal [1,2]        An NP is in progress; the Det goes from 1 to 2

VP -> V NP . [0,3]        A VP has been found starting at 0 and ending at 3
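One way to write these dotted-rule states down as a data structure (a fuller recognizer sketch follows the algorithm slides below):

from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    lhs: str          # left-hand side of the rule
    rhs: tuple        # right-hand side symbols
    dot: int          # position of the dot within rhs
    start: int        # where the state's span starts in the input
    end: int          # where it currently ends

    def complete(self):
        return self.dot == len(self.rhs)

predicted   = State("S",  ("VP",),            0, 0, 0)   # S -> . VP [0,0]
in_progress = State("NP", ("Det", "Nominal"), 1, 1, 2)   # NP -> Det . Nominal [1,2]
found       = State("VP", ("V", "NP"),        2, 0, 3)   # VP -> V NP . [0,3]
print(found.complete())   # True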

76LING 180 Autumn 2007

Graphically

77LING 180 Autumn 2007

Earley

As with most dynamic programming approaches, the answer is found by looking in the table in the right place. In this case there should be an S state in the final column that spans from 0 to n+1 and is complete. If that's the case, you're done.

S -> α . [0,n+1]

78LING 180 Autumn 2007

Earley Algorithm

March through chart left-to-right. At each step, apply 1 of 3 operators:

Predictor
– Create new states representing top-down expectations

Scanner
– Match word predictions (rule with word after dot) to words

Completer
– When a state is complete, see what rules were looking for that completed constituent


79LING 180 Autumn 2007

Predictor

Given a state
With a non-terminal to right of dot
That is not a part-of-speech category
Create a new state for each expansion of the non-terminal
Place these new states into the same chart entry as the generating state, beginning and ending where the generating state ends
So predictor looking at

– S -> . VP [0,0]

results in
– VP -> . Verb [0,0]
– VP -> . Verb NP [0,0]

80LING 180 Autumn 2007

Scanner

Given a state
With a non-terminal to right of dot
That is a part-of-speech category
If the next word in the input matches this part-of-speech
Create a new state with the dot moved over the non-terminal
So scanner looking at

– VP -> . Verb NP [0,0]
If the next word "book" can be a verb, add the new state

– VP -> Verb . NP [0,1]
Add this state to the chart entry following the current one
Note: the Earley algorithm uses top-down input to disambiguate POS
Only a POS predicted by some state can get added to the chart

81LING 180 Autumn 2007

Completer

Applied to a state when its dot has reached the right end of the rule
Parser has discovered a category over some span of input
Find and advance all previous states that were looking for this category

copy state, move dot, insert in current chart entry
Given

NP -> Det Nominal . [1,3]
VP -> Verb . NP [0,1]

Add
VP -> Verb NP . [0,3]

82LING 180 Autumn 2007

Earley: how do we know we are done?

How do we know when we are done? Find an S state in the final column that spans from 0 to n+1 and is complete. If that's the case, you're done.

S -> α . [0,n+1]

83LING 180 Autumn 2007

Earley

So sweep through the table from 0 to n+1... New predicted states are created by starting top-down from S. New incomplete states are created by advancing existing states as new constituents are discovered. New complete states are created in the same way.

84LING 180 Autumn 2007

Earley

More specifically...
1. Predict all the states you can upfront
2. Read a word
   1. Extend states based on matches
   2. Add new predictions
   3. Go to 2
3. Look at N+1 to see if you have a winner
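A compact recognizer sketch following those steps, over a toy grammar and lexicon for "book that flight" (both invented for this example, not the lecture's full grammar). A state is a tuple (LHS, RHS, dot position, start), and chart[k] is the set of states ending at position k. As the next slides stress, this recognizes but builds no trees.

GRAMMAR = {
    "S":       [("VP",)],
    "VP":      [("Verb", "NP")],
    "NP":      [("Det", "Nominal")],
    "Nominal": [("Noun",)],
}
LEXICON = {"book": {"Verb"}, "that": {"Det"}, "flight": {"Noun"}}
POS = {"Verb", "Det", "Noun"}

def earley_recognize(words):
    n = len(words)
    chart = [set() for _ in range(n + 1)]
    chart[0].add(("GAMMA", ("S",), 0, 0))          # dummy start state
    for k in range(n + 1):
        agenda = list(chart[k])
        while agenda:
            lhs, rhs, dot, start = agenda.pop()
            if dot < len(rhs) and rhs[dot] in POS:            # SCANNER
                if k < n and rhs[dot] in LEXICON.get(words[k], set()):
                    chart[k + 1].add((lhs, rhs, dot + 1, start))
            elif dot < len(rhs):                              # PREDICTOR
                for expansion in GRAMMAR[rhs[dot]]:
                    new = (rhs[dot], expansion, 0, k)
                    if new not in chart[k]:
                        chart[k].add(new)
                        agenda.append(new)
            else:                                             # COMPLETER
                for (olhs, orhs, odot, ostart) in list(chart[start]):
                    if odot < len(orhs) and orhs[odot] == lhs:
                        new = (olhs, orhs, odot + 1, ostart)
                        if new not in chart[k]:
                            chart[k].add(new)
                            agenda.append(new)
    # done if some complete S spans the whole input
    return any(l == "S" and d == len(r) and s == 0
               for (l, r, d, s) in chart[n])

print(earley_recognize("book that flight".split()))   # True
print(earley_recognize("book that".split()))          # False

The last check looks in the final chart column (position n, the (N+1)th column counting from 0) for a complete S covering the whole input, which is exactly the "winner" test from the slide above.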


85LING 180 Autumn 2007

Example

Book that flight. We should find... an S from 0 to 3 that is a completed state...

86LING 180 Autumn 2007

Example

87LING 180 Autumn 2007

Example

88LING 180 Autumn 2007

Example

89LING 180 Autumn 2007

Details

What kind of algorithms did we just describe (both Earley and CKY)?

Not parsers – recognizers
– The presence of an S state with the right attributes in the right place indicates a successful recognition
– But no parse tree... no parser
– That's how we solve (not) an exponential problem in polynomial time

90LING 180 Autumn 2007

Back to Ambiguity

Did we solve it?


91LING 180 Autumn 2007

Ambiguity

92LING 180 Autumn 2007

Ambiguity

No... Both CKY and Earley will result in multiple S structures for the [0,n] table entry. They both efficiently store the sub-parts that are shared between multiple parses. But neither can tell us which one is right. Not a parser – a recognizer.

– The presence of an S state with the right attributes in the right place indicates a successful recognition
– But no parse tree... no parser
– That's how we solve (not) an exponential problem in polynomial time

93LING 180 Autumn 2007

Converting Earley from Recognizer to Parser

With the addition of a few pointers we have a parser. Augment the "Completer" to point to where we came from.

94LING 180 Autumn 2007

Augmenting the chart with structural information

[Chart diagram: states S8–S13, with later completed states carrying backpointers to the earlier states (e.g. S8, S9) they were built from.]

95LING 180 Autumn 2007

Retrieving Parse Trees from Chart

All the possible parses for an input are in the table. We just need to read off all the backpointers from every complete S in the last column of the table. Find all the S -> X . [0,N+1]. Follow the structural traces from the Completer. Of course this won't be polynomial time, since there could be an exponential number of trees. So we can at least represent ambiguity efficiently.
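A small sketch of that last step: if completed states carry backpointers to the completed children they were built from, trees can be read off recursively. The chart states below are written out by hand for "book that flight", purely to show the traversal (not produced by the recognizer sketch above):

class CompletedState:
    def __init__(self, label, span, children=(), word=None):
        self.label, self.span = label, span
        self.children, self.word = children, word   # children = backpointers

    def tree(self):
        if self.word is not None:
            return (self.label, self.word)           # preterminal: attach the word
        return (self.label,) + tuple(c.tree() for c in self.children)

verb = CompletedState("Verb", (0, 1), word="book")
det  = CompletedState("Det", (1, 2), word="that")
noun = CompletedState("Noun", (2, 3), word="flight")
nom  = CompletedState("Nominal", (2, 3), children=(noun,))
np   = CompletedState("NP", (1, 3), children=(det, nom))
vp   = CompletedState("VP", (0, 3), children=(verb, np))
s    = CompletedState("S", (0, 3), children=(vp,))

print(s.tree())
# ('S', ('VP', ('Verb', 'book'), ('NP', ('Det', 'that'), ('Nominal', ('Noun', 'flight')))))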
