23
11-711 Algorithms for NLP Chart Parsing Reading: James Allen, “Natural Language Understanding” Section 3.4, pp. 53-61

Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

11-711 Algorithms for NLP

Chart Parsing

Reading:

James Allen,

“Natural Language Understanding”

Section 3.4, pp. 53-61

Page 2: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

Chart Parsing

General Principles:

A Bottom-Up parsing method– Construct a parse starting from the input symbols– Build constituents from sub-constituents– When all constituents on the RHS of a rule are matched, create aconstituent for the LHS of the rule

The Chart allows storing partial analyses, so that they can be shared.

Data structures used by the algorithm:– The Key: the current constituent we are attempting to “match”– An Active Arc: a grammar rule that has a partially matched RHS– The Agenda: Keeps track of newly found unprocessed constituents– The Chart: Records processed constituents (non-terminals) thatspan substrings of the input

1 11-711 Algorithms for NLP

Page 3: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

Chart Parsing

Steps in the Process:

Input is processed left-to-right, one word at a time

1. Find all POS of word (terminal-level)

2. Initialize Agenda with all POS of the word

3. Pick a Key from the Agenda

4. Add all grammar rules that start with the Key as active arcs

5. Extend any existing active arcs with the Key

6. Add LHS constituents of newly completed rules to the Agenda

7. Add the Key to the Chart

8. If Agenda not empty - goto (3), else goto (1)

2 11-711 Algorithms for NLP

Page 4: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

The Chart Parsing Algorithm

Extending Active Arcs with a Key:

Each Active Arc has the form: 1

A Key constituent has the form:

When processing the Key 1 2 , we search the active arclist for an arc 1 0 1 , and then create anew active arc 1 0 2

If the new active arc is a completed rule: 1 0 2 ,then we add 0 2 to the Agenda

After “using” the key to extend all relevant arcs, it is enteredinto the Chart

3 11-711 Algorithms for NLP

Page 5: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

The Chart Parsing Algorithm

The Main Algorithm: parsing input 1

1. 0

2. If Agenda is empty and then set 1, find all POS of andadd them as constituents 1 to the Agenda

3. Pick a Key constituent 1 from the Agenda

4. For each grammar rule of form 1 ,add 1 to the list of active arcs

5. Use Key to extend all relevant active arcs

6. Add LHS of any completed active arcs into the Agenda

7. Insert the Key into the Chart

8. If Key is 1 then Accept the inputElse goto (2)

4 11-711 Algorithms for NLP

Page 6: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

Chart Parsing - Example

The Grammar:

123456

5 11-711 Algorithms for NLP

Page 7: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

Chart Parsing - Example

The input: “ The large can can hold the water”POS of Input Words:

the: ART

large: ADJ

can: N, AUX, V

hold: N, V

water: N, V

6 11-711 Algorithms for NLP

Page 8: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

Chart Parsing = Example

The input: "X = large can can hold the 8

f

1 1-71 1 Algorithms for NLP

Page 9: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

Chart Parsing - Example

The input: "X = The large can can hold the water"

I 2 3

NP + ART 0 ADJ N

NP + ART0 N

NP + ADJ o N

0 -

ART 1

Figure 3.9 The chart after seeing an ADJ in position 2

ADJ 1

11-71 1 Algorithms for NLP

Page 10: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find
Page 11: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find
Page 12: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

P

io

-1 - 5 - M Z ?

I - - 2 - CD

h)

w

X

C

C

8 Z

Z 2 2 - CD

P

b

PC

Z

C-

-

~ >

SZ

C

WW

Z

-a

Page 13: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

Chart Parsing - Example

The input: "X = The large can can hold the water"

1 the 2 large 3 can 4 can 5 hold 6 the 7 water

Figure 3.15 The final chart

S I (rule 1 with NPI and VP2)

S2 (rule I with NP2 and VP2)

VP3 (rule 5 with AUXI and VP2)

11 -71 1 Algorithms for NLP

NP2 (rule 4)

NPI (rule 2)

VP2 (rule 5)

N2

v 2

AUX2

N1

VI

AUX 1 ART I ADJ 1

VPI (rule 6)

v 3

N3

NP3 (rule 3)

ART2

v 4

N4

Page 14: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

Time Complexity of Chart Parsing

Algorithm iterates for each word of input (i.e. iterations)

How many keys can be created in a single iteration ?– Each key in Agenda has the form 1 , 1– Thus keys

Each key 1 can extend active arcs of the form1 , for 1

Each key can thus extend at most active arcs

Time required for each iteration is thus 2

Time bound on entire algorithm is therefore 3

12 11-711 Algorithms for NLP

Page 15: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

Parsing with a Chart Parser

We need to keep back-pointers to the constituents that wecombine

Each constituent added to the chart must have the form1 1 , where the are “pointers”

to the RHS sub-constituents

Pointers must thus also be recorded within the active arcs

At the end - reconstruct a parse from the “back-pointers”

13 11-711 Algorithms for NLP

Page 16: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find
Page 17: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

Parsing with a Chart Parser

Efficient Representation of Ambiguities

a Local Ambiguity - multiple ways to derive the same substring from anon-terminal

What do local ambiguities look like with Chart Parsing?– Multiple items in the chart of the form

1 1 , with the same , and .

Local Ambiguity Packing: create a single item in the Chart for, with pointers to the various possible derivations.

can then be a sufficient “back-pointer” in the chart

Allows to efficiently represent a very large number of ambiguities (evenexponentially many)

Unpacking - producing one or more of the packed parse trees byfollowing the back-pointers.

15 11-711 Algorithms for NLP

Page 18: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

Improved Chart Parser

Increasing the Efficiency of the Algorithm

Observation: The algorithm creates constituents that cannot “fit” into aglobal parse structure. Example: 2 3

These can be eliminated using predictions

Using Top-down predictions:– For each non-terminal , define the set of all leftmostnon-terminals derivable from

– Compute in advance for all non-terminals– When processing the key 1 we look for grammar rulesthat form an active arc 2

– Add the active arc only if is predicted:there already exists an active arc 1

such that– For 1 (arcs of the form 2 1 1 ), add theactive arc only if

16 11-711 Algorithms for NLP

Page 19: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

Improved Chart Parser

Simple Procedure for Computing :(1) For every A , Cur-First[A] = { A }

(2) For every grammar rule A --> B... ,Cur-First[A] = Union( Cur-First[A], { B })

(3) Updated = TWhile Updated do

Updated = FFor every A do

Old-First[A] = Cur-First[A]For each B in Old-First[A] do

Cur-First[A] = Union( Cur-First[A], Cur-First[B])If Cur-First[A] != Old-First[A]

then Updated = T

(4) For every A doFirst[A] = Cur-First[A]

17 11-711 Algorithms for NLP

Page 20: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find
Page 21: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find
Page 22: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find

Chart Parsing - Example

The input: ".I. = The large can can hold the water"

Page 23: Char t Parsing Reading - staff.um.edu.mtstaff.um.edu.mt/mros1/hlt/PDF/Chart-Parsing_lavie09.pdfThe Main Algor ithm: parsing input 1 1. 0 2. If Agenda is empty and then set 1, find