37
LING 581: Advanced Computational Linguistics Lecture Notes January 30th

LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Embed Size (px)

Citation preview

Page 1: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

LING 581: Advanced Computational Linguistics

Lecture NotesJanuary 30th

Page 2: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Relative clause constructions

• Terminology– gap (__):

• indicates where the head of the construction is interpreted

– Subject RC: the man (that|who) __ saw me– Object RC: the man (that|who) I saw __– Subject and object RCs can appear in subject and object

positions freely:• The man that saw me left the room• The man that I saw left the room• I saw the man that saw me• I again saw the man that I sawNote: the relative pronoun is the that/who/which

Page 3: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Relative clause constructions

• Terminology contd.:– Infinitival/untensed vs. tensed• John saw Mary (tensed)• John sees Mary (tensed)• John to see Mary (untensed)

– In RC constructions:• the man to see Mary• a person to see• a time to go see Mary

Note: subject is always missing…But it’s not always the RC gap

Page 4: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Relative clause constructions

• Terminology contd.:– Zero refers to a missing relative pronoun– Zero RCs:

• the man I saw (tensed)• the man to see (untensed)

– *Zero:• *the man saw me / the man who saw me• *the man was seen by me / the man who was seen by me• The horse raced past the barn fell

– must be zero:• *a person that to see• *the man that to see Mary

Page 5: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise

Subject Non-Subject

Tensed relatives

Untensed relatives

Frequency counts

that which/who/what/when/where

zero

Tensed relatives

Page 6: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review• Use tregex to search for relative clauses as defined in Parsing

Guidelines section 4.2.2:2. zero relative clauses

Page 7: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review• Use tregex to search for relative clauses as defined in Parsing

Guidelines section 4.2.2:2. zero relative clauses

Page 8: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review• Use tregex to search for relative clauses as defined in Parsing

Guidelines section 4.2.2:3. infinitival relative clauses

Page 9: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review• Use tregex to search for relative clauses as defined in Parsing

Guidelines section 4.2.2:3. infinitival relative clauses

Page 10: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review• Use tregex to search for relative clauses as defined in Parsing

Guidelines section 4.2.2:3. infinitival relative clauses

Page 11: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

• From page 17:

Page 12: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

• Use tregex to search for relative clauses as defined in Bracketing Guidelines (prsguid1.pdf) section 4.2.2:1. wh- and that- relative clauses Two subtypes:

WHNP NP-traceWHADVP ADVP-trace

Note: the format in the guide doesn’t always match exactly with WSJ trees … -NONE-

Page 13: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

• Use tregex to search for relative clauses as defined in Bracketing Guidelines (prsguid1.pdf) section 4.2.2:1. wh- and that- relative clauses

Matches Pattern11598 @NP < NP < SBAR 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i) << (@NP < (/^-NONE-$/ < /^\*T\*-([0-9]+)$/#1%i)))

1. 2.

3.

Page 14: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

• Browsing through the matches and refining the search is always a good idea …

to see what we have inadvertently picked up or have not thought of

Page 15: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

• Note: 2nd matching tree has an intervening PP:

Page 16: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

• Note: 5th matching tree has an intervening PP:

Note: intervening punctuation is also commonThe plant, which is owned by Hollingsworth & Vose Co., was under contract …

Page 17: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

11598@NP < NP < SBAR 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)

Note: the SBAR from NP-SBJ was extraposed to the VP

Note: *ICH* non-subject relative clause

Page 18: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

11598@NP < NP < SBAR 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)

This is NOT a relative clauseconstruction!

Page 19: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

11598@NP < NP < SBAR 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)

The relative clause gap here is ADVP

Infinitival/non-tensed clause

Page 20: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

11598@NP < NP < SBAR 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)

*ICH* subject relative clause

Note: the SBAR from the NP objectwas right extraposed to the VP

Page 21: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

11598@NP < NP < SBAR 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)

CoordinationSBAR SBAR CC SBAR

Page 22: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

• 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)• 10290 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)

$/#2%i)

• Excludes *ICH* cases• Excludes coordination …

Page 23: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

• 10290 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i)• 10326 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i

<< (/^(NP|ADVP)/ < (/^-NONE-$/ < /^\*T\*-([0-9]+)$/#1%i)))

Page 24: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review• 8575 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i << (NP-SBJ

< /^-NONE-$/))• 5975 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i << (NP-SBJ <

(/^-NONE-$/ < /^\*T\*-([0-9]+)$/#1%i)))

Page 25: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

Let’s look at the *ICH* subcases:

Page 26: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

159 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/))

Page 27: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review

159 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/))

This is NOT a relative clauseconstruction!

Page 28: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review159 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/))155 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : /^SBAR-([0-9]+)$/#1%i

Only 1 out of the 4 is NOT a relative clauseconstruction!

Page 29: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review159 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/))155 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : /^SBAR-([0-9]+)$/#1%i

Search string is too restrictive:SBAR-PRPSBAR-NOM

Page 30: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise Review• 116 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : (/^SBAR.*-

([0-9]+)$/#1%i < /^WH(NP|ADVP)-([0-9]+)$/)• 115 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : (/^SBAR.*-

([0-9]+)$/#1%i < /^WH(NP|ADVP)-([0-9]+)$/#2%j << /\*T\*-([0-9]+)/#1%j)

Not a trace?BUG?

Page 31: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Relevance of Treebanks

• Statistical parsers typically construct syntactic phrase structure– they’re trained on Treebank corpora like the Penn

Treebank• Note: some use dependency graphs, not trees

Page 32: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Parsers trained on the Treebank

• Don’t recover fully-annotated trees– not trained using nodes with indices or empty (-NONE-) nodes– not trained using functional tags, e.g. –SBJ

• Therefore they don’t fully parse• Example: no SBAR node in … a movie to see

Stanford parser

Page 33: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Parsers trained on the Treebank

• SBAR can be forced by the presence of an overt relative pronoun, but note there is no subject gap:

Page 34: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Parsers trained on the Treebank

• Probabilities are estimated from frequency information of each node given surrounding context (e.g. parent node, or the word that heads the node)

• Still these systems have enormous problems with prepositional phrase (PP) attachment

• Example:(borrowed from Igor Malioutov)

– A boy with a telescope kissed Mary on the lips– Mary was kissed by a boy with a telescope on the lips

• PP with a telescope should adjoin to the noun phrase (NP) a boy• PP on the lips should adjoin to the verb phrase (VP) headed by

kiss

Page 35: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Active/passive sentences

• Examples using the Stanford Parser:

Both active and passivesentences are parsed incorrectly

Page 36: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Active/passive sentences

• Examples:

X on the lips modifies MaryX on the lips modifies telescope

Page 37: LING 581: Advanced Computational Linguistics Lecture Notes January 30th

Homework Exercise• Use tregex to find out how many passive sentences there are in

the Treebank WSJ section?• The passive construction (according to the Bracketing Guidelines)

– Note: by-phrase containing logical subject (LGS) is optional