DS-to-PS conversion Fei Xia University of Washington July 29, 2011 1


DS-to-PS conversion

Fei Xia

University of Washington

July 29, 2011


Main steps in building the treebank

• DS treebank:
  – Tokenization
  – Morphological analysis, voice, etc.
  – POS tagging
  – DS

• Propbank: adding Predicate-argument info

• Automatic DS-to-PS conversion

• Some manual checking to ensure the conversion works well


Outline

• Important concepts

• Compatibility and consistency

• Handling inconsistency


Important concepts

• Linguistic phenomena

• Representation type

• Linguistic theory
  – Theoretical framework
  – Linguistic analyses

• Annotation guidelines

Linguistic phenomena

• They are what we want to represent, including
  – General concepts: e.g., which words form a phrase? What types of phrases does a language have?
  – Types of relations between words or phrases (e.g., subjecthood, temporal modification)
  – Specific constructions (e.g., small clause)
  – Finer-grained distinctions (e.g., unergative vs. unaccusative)


Representation type

• It is the type of mathematical object that is used to represent syntactic facts

• Examples: DS, PS

• Each representation type can decide what more specific representation devices to employ
  – Labels on the arcs of a tree
  – Use of empty nodes or coindexation between nodes


Linguistic theory

• It explains how linguistic phenomena are represented in the chosen representation type

• It has two components:
  – Theoretical framework: it provides the vocabulary and constraints in which linguistic theories can be formulated (e.g., GB, LFG, LTAG, HPSG)
  – Linguistic analyses


Small clause


“Exceptional case-marking” analysis


“Raising-to-object” analysis


Annotation guidelines

• Guideline designers need to choose the following
  – Linguistic phenomena to represent
  – Representation type
  – Theoretical framework
  – Linguistic analyses

  – Descriptions
  – Examples: sentences with DS or PS trees


Outline

• Important concepts

• Compatibility and consistency

• Handling inconsistency


“Exceptional case-marking” analysis


“Raising-to-object” analysis


Implicit vs. explicit information

• Certain kinds of information have to be expressed explicitly in DS but not in PS, or vice versa
  – Head in DS
  – Syntactic categories of phrases in PS

• Not providing a piece of information explicitly does not mean that the corresponding concept does not exist in DS/PS


Syntactic consistency

• We assume each phrase in a PS has a special word, the head word, which represents the properties of the phrase.

• A (DS, PS) pair is called consistent if there is a way to assign a head word to each internal node in the PS so that the resulting DS is identical to the given DS.
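
A minimal Python sketch of this definition, under assumptions that are not in the slides: the `Node` class, its `head_index` field, and the example tree are hypothetical, leaves are bare words (no POS level), and every word occurs once. Given one head assignment, the sketch reads off the DS that the PS induces; the (DS, PS) pair is consistent if some assignment yields the given DS.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    label: str                          # phrase category, or the word itself for a leaf
    children: List["Node"] = field(default_factory=list)
    head_index: Optional[int] = None    # assumption: index of the child that supplies the head word


def head_word(node: Node) -> str:
    """Follow head children down to a leaf and return its word."""
    if not node.children:
        return node.label
    return head_word(node.children[node.head_index])


def ps_to_ds(node: Node, deps: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return the induced DS as a {dependent: head} mapping."""
    if deps is None:
        deps = {}
    if node.children:
        h = head_word(node)
        for i, child in enumerate(node.children):
            if i != node.head_index:
                # the head word of every non-head child depends on the phrase's head word
                deps[head_word(child)] = h
            ps_to_ds(child, deps)
    return deps


# Hypothetical example: "Vinken will join the board"
tree = Node("S", [
    Node("NP", [Node("Vinken")], 0),
    Node("VP", [
        Node("will"),
        Node("VP", [Node("join"), Node("NP", [Node("the"), Node("board")], 1)], 0),
    ], 1),
], 1)

given_ds = {"Vinken": "join", "will": "join", "board": "join", "the": "board"}
consistent_for_this_assignment = ps_to_ds(tree) == given_ds
```

This only tests one head assignment; deciding consistency in general means asking whether any assignment works.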


Consistent pairs


Inconsistent pairs


A real example


Consistency assumption


Definition of consistency

• A DS and a PS are consistent iff there exists a flattened version of the PS that is identical to the DS.

• If the input DS and the desired PS are consistent, the PS can be created by stretching the DS and adding syntactic labels.


Checking consistency

• For each (dep, head) pair in the DS:
  – find their locations in the PS and their closest common ancestor
  – add heads to the nodes on the path between the leaf nodes and that ancestor

• The DS and the PS are consistent iff each node in the PS has exactly one head.
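
The check above can be sketched in Python as follows. This is one reading of the slide, not the author's implementation: the PS is assumed to be given as a hypothetical child-to-parent map, node names such as `NP1` are made up, and words are assumed unique. Nodes between the dependent and the common ancestor are marked with the dependent; nodes from the head up to and including the ancestor are marked with the head.

```python
from typing import Dict, List, Set, Tuple


def path_to_root(node: str, parent: Dict[str, str]) -> List[str]:
    """Return [node, parent, grandparent, ..., root]."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def check_consistency(
    parent: Dict[str, str], deps: List[Tuple[str, str]]
) -> Tuple[bool, Dict[str, Set[str]]]:
    # every internal PS node starts with no head
    heads: Dict[str, Set[str]] = {n: set() for n in set(parent.values())}
    for dep, head in deps:
        dep_path = path_to_root(dep, parent)
        head_path = path_to_root(head, parent)
        on_head_path = set(head_path)
        # closest common ancestor: first node above dep that is also above head
        anc = next(n for n in dep_path if n in on_head_path)
        for n in dep_path[1:dep_path.index(anc)]:        # strictly below the ancestor
            heads[n].add(dep)
        for n in head_path[1:head_path.index(anc) + 1]:  # up to and including the ancestor
            heads[n].add(head)
    return all(len(h) == 1 for h in heads.values()), heads


# Hypothetical PS for "Vinken will join the board"
PARENT = {
    "NP1": "S", "VP1": "S", "Vinken": "NP1",
    "will": "VP1", "VP2": "VP1",
    "join": "VP2", "NP2": "VP2", "the": "NP2", "board": "NP2",
}
DS_GOOD = [("Vinken", "join"), ("will", "join"), ("board", "join"), ("the", "board")]
DS_BAD = [("Vinken", "join"), ("will", "join"), ("board", "will"), ("the", "board")]

ok_good, heads_good = check_consistency(PARENT, DS_GOOD)
ok_bad, heads_bad = check_consistency(PARENT, DS_BAD)
```

In `DS_BAD`, attaching "board" to "will" gives the node `VP2` two competing heads, so the pair is reported as inconsistent.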


[Figure: the consistency check applied to a PS tree; nodes are annotated with head words as the pairs (Vinken, join), (board, join), (will, join), and (29, join) are processed.]

Outline

• Important concepts

• Compatibility and consistency

• Handling inconsistency


wh-movement

[Figure: head assignment on a PS tree for the pairs (who, come) and (come, think); the annotated nodes receive the head “come”.]

wh-movement

[Figure: the same tree for the pairs (who, come) and (come, think); two nodes receive the conflicting heads “come | think”.]

wh-movement

[Figure: the same tree for the pairs (who, come), (come, think), and (you, think); two nodes are marked “??”.]

Can DS and PS be inconsistent?

• DS and PS can represent different aspects of the same overall picture and still be consistent.
  – Info provided in PropBank: e.g., empty subject, unaccusative
  – Info that is in PS only: e.g., traces

• DS and PS should not choose “conflicting” analyses.
  – DS and PS are two images of the same underlying treebank, not two separate treebanks.
  – Ex: ba-construction in Chinese: verb, prep, or something else?
  – Ex: free relatives: empty nominal head

• The inconsistency cases should be rare and well-motivated.


How to handle inconsistency?

• Detect inconsistency in (DS, PS) pairs in the guidelines

• Consult guideline designers to determine whether the inconsistency can be resolved by changing analyses

• If not, introduce DScons and ensure sufficient info is in DS for automatic conversion.


Two-stage conversion

• DS to DScons: by removing “inconsistency” between DS and PS.

• DScons to PS: by applying conversion rules


Case #1: long-distance movement

[Figure: two dependency structures, labeled DSconst and DSprop.]

• Other examples: extraposition
• Easily detectable due to non-projectivity
• Create DSconst by moving up the “moved element” and leaving a trace
• Which node is the “moved element”? The one that is apart from the other nodes in the subtree.
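
Non-projectivity can be detected with a standard check: an arc is non-projective if some word lying between the dependent and its head is not a descendant of that head. The Python sketch below assumes a hypothetical encoding (a list of head positions with -1 for the root) and a made-up example sentence, "who do you think came", chosen to match the (who, come), (come, think), (you, think) pairs.

```python
from typing import List, Tuple


def is_descendant(node: int, ancestor: int, head: List[int]) -> bool:
    """True if `ancestor` is reached by following head links up from `node`."""
    while node != -1:            # -1 marks the root's (missing) head
        if node == ancestor:
            return True
        node = head[node]
    return False


def non_projective_arcs(head: List[int]) -> List[Tuple[int, int]]:
    """Return (dependent, head) position pairs whose arc is non-projective."""
    arcs = []
    for d, h in enumerate(head):
        if h == -1:
            continue
        lo, hi = min(d, h), max(d, h)
        # every word strictly between d and h must be dominated by h
        if any(not is_descendant(k, h, head) for k in range(lo + 1, hi)):
            arcs.append((d, h))
    return arcs


# Assumed example:  who  do  you  think  came
WORDS = ["who", "do", "you", "think", "came"]
HEADS = [4, 3, 3, -1, 3]         # who -> came, do/you/came -> think, think = root

crossing = non_projective_arcs(HEADS)
moved = [WORDS[d] for d, _ in crossing]
```

Here the only flagged arc is the one from "who" to "came", which fits the slide's description of the moved element as the node separated from the rest of its subtree.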


Case #2: local scrambling


Detectable by assuming canonical word order: k1 > k2

Need from PS/DS teams the canonical word order and what word order triggers movement
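
A minimal Python sketch of this detection, assuming that "k1 > k2" means the k1 dependent canonically precedes the k2 dependent of the same head. The token encoding (a list of `(head_position, label)` tuples) and the function name are hypothetical; the real trigger conditions would have to come from the PS/DS teams, as noted above.

```python
from typing import Dict, List, Tuple


def scrambled_heads(tokens: List[Tuple[int, str]]) -> List[int]:
    """Return positions of heads whose k2 dependent precedes their k1 dependent."""
    positions: Dict[int, Dict[str, int]] = {}
    for i, (head, label) in enumerate(tokens):
        if label in ("k1", "k2"):
            positions.setdefault(head, {})[label] = i
    return sorted(
        h for h, pos in positions.items()
        if "k1" in pos and "k2" in pos and pos["k2"] < pos["k1"]
    )


# Assumed verb-final clauses with the verb at position 2
canonical = [(2, "k1"), (2, "k2"), (-1, "root")]
scrambled = [(2, "k2"), (2, "k1"), (-1, "root")]

flag_canonical = scrambled_heads(canonical)
flag_scrambled = scrambled_heads(scrambled)
```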


Case #3: small clause rule


Detectable by dependency type k2s

Need confirmation from IIIT that k2s is used only for small clause

Case #4: support verb

Detectable by dependency type “pof”

Need confirmation from IIIT that “pof” is used only for support verb


Conclusion

• We define consistency between DS and PS

• DS and PS can be inconsistent but such cases should be rare and well-motivated.

• We will handle inconsistency with the two-stage approach


Conversion algorithm


Definition of conversion rule

• A conversion rule is a (DS_pattern, PS_pattern) pair.

• Ex:

• Simplest case:
  – DS_pattern corresponds to only one dependency link
  – Decomposing DS becomes trivial
  – PS_pattern is a tree fragment (e.g., wh-movement)
  – Learning rules from (PS, DS) pairs is easy
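
To make the simplest case concrete, here is a hedged Python sketch of a rule table in which each DS pattern is a single dependency link. Everything specific is an assumption for illustration: the `DSPattern` fields, the tag and relation names, and the bracketed-string notation for PS fragments do not come from the slides.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DSPattern:
    head_tag: str    # POS tag of the head word (hypothetical tag set)
    dep_tag: str     # POS tag of the dependent
    relation: str    # dependency label on the link


# Hypothetical rules: each PS pattern is a tree fragment written as a bracketed string,
# with "head" and "dep" marking where the two words' projections attach.
RULES: Dict[DSPattern, str] = {
    DSPattern("VB", "NNP", "subj"): "(S (NP dep) (VP head))",
    DSPattern("VB", "NN", "obj"): "(VP head (NP dep))",
}


def decompose(links: List[Tuple[str, str, str]]) -> List[DSPattern]:
    """With one-link patterns, decomposing a DS is just listing its links."""
    return [DSPattern(h, d, rel) for h, d, rel in links]


def lookup(links: List[Tuple[str, str, str]]) -> List[Optional[str]]:
    """Return the PS fragment for each link, or None if no rule covers it."""
    return [RULES.get(p) for p in decompose(links)]


fragments = lookup([("VB", "NNP", "subj"), ("VB", "JJ", "mod")])
```

A link with no matching rule comes back as `None`, which is one simple way to surface gaps in the rule set.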


Extracting rules


Rules extracted from the example


Input DS


Gluing PS segments together
