Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
11-711 Algorithms for NLP
Chart Parsing
Reading:
James Allen,
“Natural Language Understanding”
Section 3.4, pp. 53-61
Chart Parsing
General Principles:
A Bottom-Up parsing method– Construct a parse starting from the input symbols– Build constituents from sub-constituents– When all constituents on the RHS of a rule are matched, create aconstituent for the LHS of the rule
The Chart allows storing partial analyses, so that they can be shared.
Data structures used by the algorithm:– The Key: the current constituent we are attempting to “match”– An Active Arc: a grammar rule that has a partially matched RHS– The Agenda: Keeps track of newly found unprocessed constituents– The Chart: Records processed constituents (non-terminals) thatspan substrings of the input
1 11-711 Algorithms for NLP
Chart Parsing
Steps in the Process:
Input is processed left-to-right, one word at a time
1. Find all POS of word (terminal-level)
2. Initialize Agenda with all POS of the word
3. Pick a Key from the Agenda
4. Add all grammar rules that start with the Key as active arcs
5. Extend any existing active arcs with the Key
6. Add LHS constituents of newly completed rules to the Agenda
7. Add the Key to the Chart
8. If Agenda not empty - goto (3), else goto (1)
2 11-711 Algorithms for NLP
The Chart Parsing Algorithm
Extending Active Arcs with a Key:
Each Active Arc has the form: 1
A Key constituent has the form:
When processing the Key 1 2 , we search the active arclist for an arc 1 0 1 , and then create anew active arc 1 0 2
If the new active arc is a completed rule: 1 0 2 ,then we add 0 2 to the Agenda
After “using” the key to extend all relevant arcs, it is enteredinto the Chart
3 11-711 Algorithms for NLP
The Chart Parsing Algorithm
The Main Algorithm: parsing input 1
1. 0
2. If Agenda is empty and then set 1, find all POS of andadd them as constituents 1 to the Agenda
3. Pick a Key constituent 1 from the Agenda
4. For each grammar rule of form 1 ,add 1 to the list of active arcs
5. Use Key to extend all relevant active arcs
6. Add LHS of any completed active arcs into the Agenda
7. Insert the Key into the Chart
8. If Key is 1 then Accept the inputElse goto (2)
4 11-711 Algorithms for NLP
Chart Parsing - Example
The Grammar:
123456
5 11-711 Algorithms for NLP
Chart Parsing - Example
The input: “ The large can can hold the water”POS of Input Words:
the: ART
large: ADJ
can: N, AUX, V
hold: N, V
water: N, V
6 11-711 Algorithms for NLP
Chart Parsing = Example
The input: "X = large can can hold the 8
f
1 1-71 1 Algorithms for NLP
Chart Parsing - Example
The input: "X = The large can can hold the water"
I 2 3
NP + ART 0 ADJ N
NP + ART0 N
NP + ADJ o N
0 -
ART 1
Figure 3.9 The chart after seeing an ADJ in position 2
ADJ 1
11-71 1 Algorithms for NLP
P
io
-1 - 5 - M Z ?
I - - 2 - CD
h)
w
X
C
C
8 Z
Z 2 2 - CD
P
b
PC
Z
C-
-
~ >
SZ
C
WW
Z
-a
Chart Parsing - Example
The input: "X = The large can can hold the water"
1 the 2 large 3 can 4 can 5 hold 6 the 7 water
Figure 3.15 The final chart
S I (rule 1 with NPI and VP2)
S2 (rule I with NP2 and VP2)
VP3 (rule 5 with AUXI and VP2)
11 -71 1 Algorithms for NLP
NP2 (rule 4)
NPI (rule 2)
VP2 (rule 5)
N2
v 2
AUX2
N1
VI
AUX 1 ART I ADJ 1
VPI (rule 6)
v 3
N3
NP3 (rule 3)
ART2
v 4
N4
Time Complexity of Chart Parsing
Algorithm iterates for each word of input (i.e. iterations)
How many keys can be created in a single iteration ?– Each key in Agenda has the form 1 , 1– Thus keys
Each key 1 can extend active arcs of the form1 , for 1
Each key can thus extend at most active arcs
Time required for each iteration is thus 2
Time bound on entire algorithm is therefore 3
12 11-711 Algorithms for NLP
Parsing with a Chart Parser
We need to keep back-pointers to the constituents that wecombine
Each constituent added to the chart must have the form1 1 , where the are “pointers”
to the RHS sub-constituents
Pointers must thus also be recorded within the active arcs
At the end - reconstruct a parse from the “back-pointers”
13 11-711 Algorithms for NLP
Parsing with a Chart Parser
Efficient Representation of Ambiguities
a Local Ambiguity - multiple ways to derive the same substring from anon-terminal
What do local ambiguities look like with Chart Parsing?– Multiple items in the chart of the form
1 1 , with the same , and .
Local Ambiguity Packing: create a single item in the Chart for, with pointers to the various possible derivations.
can then be a sufficient “back-pointer” in the chart
Allows to efficiently represent a very large number of ambiguities (evenexponentially many)
Unpacking - producing one or more of the packed parse trees byfollowing the back-pointers.
15 11-711 Algorithms for NLP
Improved Chart Parser
Increasing the Efficiency of the Algorithm
Observation: The algorithm creates constituents that cannot “fit” into aglobal parse structure. Example: 2 3
These can be eliminated using predictions
Using Top-down predictions:– For each non-terminal , define the set of all leftmostnon-terminals derivable from
– Compute in advance for all non-terminals– When processing the key 1 we look for grammar rulesthat form an active arc 2
– Add the active arc only if is predicted:there already exists an active arc 1
such that– For 1 (arcs of the form 2 1 1 ), add theactive arc only if
16 11-711 Algorithms for NLP
Improved Chart Parser
Simple Procedure for Computing :(1) For every A , Cur-First[A] = { A }
(2) For every grammar rule A --> B... ,Cur-First[A] = Union( Cur-First[A], { B })
(3) Updated = TWhile Updated do
Updated = FFor every A do
Old-First[A] = Cur-First[A]For each B in Old-First[A] do
Cur-First[A] = Union( Cur-First[A], Cur-First[B])If Cur-First[A] != Old-First[A]
then Updated = T
(4) For every A doFirst[A] = Cur-First[A]
17 11-711 Algorithms for NLP
Chart Parsing - Example
The input: ".I. = The large can can hold the water"