Overview of Feature Structure Representation

Kiyong Lee klee@korea.ac.kr
July 7, 2003, Sapporo, Japan

2003-07-07 Kiyong Lee

ISO/TC 37/SC 4/WG 1 Meeting Sapporo, Japan

Preliminary Remark

This work item on feature structures concerns itself with their representational aspect only.

The task of specifying their description or declaration mechanism shall be taken up as a separate but related new work item.

Historical Background

Feature structures were first used in theoretical linguistics, in the 1930s or earlier, to treat the distinctive features of phonological segments and to represent their oppositions.

They were used extensively in generative phonology in the 1960s, then in the development of grammar formalisms in the 1980s, and finally in other computational or theoretical work such as parsing, lexicology or formal semantics from the 1990s up to the present.

Use of feature structures

Feature structures are an essential part of many linguistic formalisms as well as an underlying mechanism for representing the information consumed or produced by and for language resource management or information technology.

•This international standard provides a format to represent, store or exchange feature structures in natural language applications, for both the annotation and the production of linguistic data.

General Characteristics of Feature Structures

Feature structure representation can:
1. Capture the notion of partial information.
2. Make finer-grained distinctions among objects being described.
3. Provide a systematic format for accommodating various types of constraints on organizing information.

1. Capturing partiality of information

A feature structure represents partial information about an object being described.

This information can be manipulated or augmented monotonically with simple operations like unification, generalization or merging, while such feature structures can explicitly be compared with each other with respect to some formally defined relation such as subsumption.
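
Unification and subsumption can be illustrated with a minimal sketch, assuming untyped feature structures encoded as nested Python dicts. The function names `unify` and `subsumes` and the sample features CAT, AGR, PER and NUM are illustrative assumptions, not part of the work item, and the sketch ignores shared (reentrant) values.

```python
from typing import Any, Optional


def unify(a: Any, b: Any) -> Optional[Any]:
    """Return the unification of a and b, or None if their information conflicts."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # start from the information in a
        for feature, b_value in b.items():
            if feature in result:
                merged = unify(result[feature], b_value)
                if merged is None:
                    return None  # conflicting values: unification fails
                result[feature] = merged
            else:
                result[feature] = b_value  # b contributes new information
        return result
    # Atoms (or an atom and a structure) unify only if they are equal.
    return a if a == b else None


def subsumes(general: Any, specific: Any) -> bool:
    """True if all the information in `general` is also present in `specific`."""
    if isinstance(general, dict) and isinstance(specific, dict):
        return all(
            feature in specific and subsumes(value, specific[feature])
            for feature, value in general.items()
        )
    return general == specific


noun = {"CAT": "noun"}
third_sg = {"AGR": {"PER": "3rd", "NUM": "sg"}}

# Monotonic augmentation: the result contains both pieces of partial information.
mary = unify(noun, third_sg)

# Conflicting information cannot be unified.
clash = unify(mary, {"AGR": {"NUM": "pl"}})
```

Here `mary` carries strictly more information than `noun`, so `subsumes(noun, mary)` holds while `subsumes(mary, noun)` does not, and `clash` is `None` because sg and pl are incompatible.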

2. Finer-grained distinctions

Feature structure representation can also make finer-grained distinctions among objects being analyzed.

Phonemes, for instance, like /p/ and /k/ are non-continuant consonants which share the property of being peripheral or non-coronal with respect to their place of articulation in the oral cavity, thus sometimes behaving similarly toward some phonological process of assimilation.
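
The phoneme example can be sketched as follows, assuming each segment is a flat Python dict of features. The feature names (`continuant`, `coronal`, `place`), the added contrast segment /t/ and the helper `shared_features` are illustrative assumptions; the helper computes the information common to several segments, which is one simple reading of generalization.

```python
# Illustrative feature bundles; the feature inventory is an assumption.
P = {"continuant": False, "coronal": False, "place": "labial"}
K = {"continuant": False, "coronal": False, "place": "velar"}
T = {"continuant": False, "coronal": True, "place": "alveolar"}


def shared_features(*segments):
    """Return the feature-value pairs that all given segments have in common."""
    first, *rest = segments
    return {
        feature: value
        for feature, value in first.items()
        if all(feature in s and s[feature] == value for s in rest)
    }


peripheral_stops = shared_features(P, K)  # what /p/ and /k/ share
all_stops = shared_features(P, K, T)      # adding /t/ leaves less in common
```

Under these assumptions /p/ and /k/ share both non-continuancy and non-coronality, whereas adding /t/ leaves only non-continuancy, which is the finer-grained distinction the slide describes.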

3. Organizing information

Feature structure representation makes it easier to organize information systematically by accommodating various constraints or inheritance mechanisms into the sorting out of information from various sources.

Formal definition of feature structure

A feature structure is thus formally defined either as a partial function from features (or attributes) to values in set-theoretic terms or as a directed acyclic graph (dag) in graph-theoretic terms.
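
The two views can be related with a minimal sketch, assuming a nested Python dict stands in for the partial function (a feature that is not a key is simply undefined). The helper `to_graph` and its node numbering are illustrative assumptions; it produces a tree-shaped graph and does not model shared values.

```python
from itertools import count


def to_graph(fs):
    """Convert a nested-dict feature structure into (root, edges, atoms).

    edges: list of (source node, feature label, target node)
    atoms: mapping from terminal node to its atomic value
    """
    ids = count()
    edges = []
    atoms = {}

    def visit(value):
        node = next(ids)
        if isinstance(value, dict):
            for feature, sub in value.items():
                edges.append((node, feature, visit(sub)))
        else:
            atoms[node] = value
        return node

    root = visit(fs)
    return root, edges, atoms


fs = {"CAT": "noun", "AGR": {"PER": "3rd", "NUM": "sg"}}

# Partial-function view: defined for CAT and AGR, undefined for TENSE.
defined = "CAT" in fs
undefined = "TENSE" not in fs

# Graph view: a root, labeled directed branches, and terminal nodes.
root, edges, atoms = to_graph(fs)
```

Each key of the dict becomes a labeled branch leaving the node for that dict, and each atomic value becomes a terminal node.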

Typed feature structure

Feature structures may be constrained by some typing.

A multiple-inheritance type hierarchy, for instance, constrains the construction of well-formed feature structures.
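
How a type hierarchy can constrain well-formedness may be sketched as below. The type names, the features each type introduces, and the helpers `supertypes`, `allowed_features` and `well_formed` are all hypothetical; the sketch only checks that a structure uses features licensed by its type or by one of its supertypes.

```python
# Hypothetical hierarchy: each type lists its immediate supertypes.
PARENTS = {
    "sign": [],
    "word": ["sign"],
    "verbal": ["sign"],
    "nominal": ["sign"],
    "verb": ["word", "verbal"],    # multiple inheritance
    "noun": ["word", "nominal"],   # multiple inheritance
}

# Hypothetical features introduced by each type.
APPROPRIATE = {
    "sign": {"PHON"},
    "word": {"LEMMA"},
    "verbal": {"TENSE"},
    "nominal": {"CASE"},
}


def supertypes(t):
    """Return t together with all types it inherits from."""
    seen = {t}
    for parent in PARENTS[t]:
        seen |= supertypes(parent)
    return seen


def allowed_features(t):
    """Features licensed for type t, inherited along every path upward."""
    return set().union(*(APPROPRIATE.get(s, set()) for s in supertypes(t)))


def well_formed(t, fs):
    """True if the structure fs only uses features licensed for type t."""
    return set(fs) <= allowed_features(t)


verb_ok = well_formed("verb", {"PHON": "runs", "TENSE": "pres"})
noun_bad = well_formed("noun", {"TENSE": "pres"})
```

In this toy hierarchy a verb inherits TENSE through `verbal`, so the first structure is accepted, while TENSE is not licensed for a noun and the second is rejected.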

Notations

Feature structures can be represented in either matrix or graph format.

Each matrix consists of pairs of a feature (or attribute) and its unique value, and is therefore called an attribute-value matrix (AVM).
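
A bracketed, single-line rendering of the matrix notation can be sketched as follows, assuming the feature structure is a nested Python dict, whose keys are unique by construction. The helper name `avm` and the sample features are illustrative assumptions, and the output is only a textual approximation of a typeset AVM.

```python
def avm(value):
    """Render a nested-dict feature structure as a bracketed AVM string."""
    if isinstance(value, dict):
        pairs = ", ".join(f"{feature} {avm(v)}" for feature, v in value.items())
        return "[" + pairs + "]"
    return str(value)  # atomic value


rendered = avm({"CAT": "noun", "AGR": {"PER": "3rd", "NUM": "sg"}})
# rendered == "[CAT noun, AGR [PER 3rd, NUM sg]]"
```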

•Each graph representing a feature structure, on the other hand, consists of

- a single root, labeled and directed branches, and terminal nodes.

- Each node, including the particular node called the root, represents a type, and each label on a branch represents a feature.

- A type can be either an atom or a complex object that is itself a feature structure.

Path

In graph notation, a feature structure can easily be seen as consisting of many, but possibly null, paths. Each path starts at the root and follows the branches to a terminal node, and thus consists of the labels on its branches.
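
Paths can be enumerated with a short sketch, assuming the feature structure is a nested Python dict in which dicts are non-terminal nodes and anything else is a terminal node. The generator name `paths` and the sample features are illustrative assumptions.

```python
def paths(fs, prefix=()):
    """Yield (path, value) for every path from the root to a terminal node.

    A path is the tuple of feature labels on the branches it follows.
    """
    for feature, value in fs.items():
        path = prefix + (feature,)
        if isinstance(value, dict):
            yield from paths(value, path)  # keep following branches
        else:
            yield path, value              # reached a terminal node


fs = {"CAT": "noun", "AGR": {"PER": "3rd", "NUM": "sg"}}
all_paths = list(paths(fs))
no_paths = list(paths({}))  # an empty structure has no paths
```

For the sample structure this yields the paths CAT, AGR PER and AGR NUM with their terminal values.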

Shared values or reentrancy

Some features may share a token-identical value. The verb runs, for example, has the agreement value of 3rdSg and so does the noun Mary.

Such shared values are called reentrant because, in graph notation, two branches that share a value merge into one and the same node.

In matrix format, reentrancy or sharing is represented by tags bearing the same index.
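
Token identity, as opposed to mere equality of content, can be sketched with Python object identity, assuming nested dicts as feature structures. The feature names and the helper `reentrancy_tags`, which numbers shared sub-structures the way AVM tags do, are illustrative assumptions.

```python
def reentrancy_tags(fs):
    """Map id() of each shared sub-structure to a numeric tag (1, 2, ...)."""
    seen, tags = set(), {}

    def visit(node):
        if not isinstance(node, dict):
            return
        if id(node) in seen:
            # Reached a second time: this node is shared (reentrant).
            tags.setdefault(id(node), len(tags) + 1)
            return
        seen.add(id(node))
        for value in node.values():
            visit(value)

    visit(fs)
    return tags


agr = {"PER": "3rd", "NUM": "sg"}
sentence = {
    "SUBJ": {"PHON": "Mary", "AGR": agr},  # both AGR features point to
    "PRED": {"PHON": "runs", "AGR": agr},  # one and the same object
}
tags = reentrancy_tags(sentence)

# A copy has the same content but is a different token, so it is not shared.
unshared = {"SUBJ": {"AGR": agr}, "PRED": {"AGR": dict(agr)}}
no_tags = reentrancy_tags(unshared)
```

Because the two AGR branches lead to the same object, information added through one branch is immediately visible through the other, which is not the case for the copied value.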

List as a value

A feature may take a list of values.

The argument structure of a predicate, for instance, may take as value a list of arguments, say Subject and one or two Objects, depending on the type of the predicate.

F: <v1, v2, v3>

Systematic representation of feature structures

In a typed feature structure, the value of each feature is a type, and typing is constrained by particular applications.

Nevertheless, values need to be characterized in some systematic way.

•One way is to build libraries of features, feature values or feature structures through some clustering and identification mechanism:

•fLib for features

•fvLib for feature values

•fsLib for feature structures

•Another way is to introduce structure into values, organizing them as a singleton, set, bag (or multiset) or list.

•Singleton: {a}

•Set: {a,a} = {a}, {a,b} = {b,a}

•Bag: m{a,a} ≠ m{a}, m{a,b} = m{b,a}

•List: <a,b> ≠ <b,a>
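
The equalities and inequalities above can be checked with a small sketch that models sets, bags and lists with Python's `set`, `collections.Counter` and `list`. Choosing `Counter` as the bag model is an assumption made for illustration.

```python
from collections import Counter

# Set: repetition and order are both irrelevant.
set_ignores_repetition = {"a", "a"} == {"a"}
set_ignores_order = {"a", "b"} == {"b", "a"}

# Bag (multiset): repetition matters, order does not.
bag_ignores_repetition = Counter(["a", "a"]) == Counter(["a"])
bag_ignores_order = Counter(["a", "b"]) == Counter(["b", "a"])

# List: order matters.
list_ignores_order = ["a", "b"] == ["b", "a"]
```

The first, second and fourth comparisons are true; the third and fifth are false, matching the slide.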

•A third way may be to introduce special values such as

-<plus, minus> for binary values

-<any> for the Boolean truth value variable, cf. wild card?

-<none> for the Boolean falsity value variable,

-<dft> for default value, or

-<uncertain> for uncertainty value, cf. <unknown>

•Finally, the rel attribute may be provided for some values.

•The setting rel = ne, for non-equality, may be introduced to exclude a certain feature-value pair while allowing other alternatives.
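
One possible reading of rel = ne can be sketched as a check on a flat feature structure. The function name `satisfies`, the value name "eq" for the equality case, and the treatment of an undefined feature as satisfying the non-equality are all assumptions made for illustration.

```python
def satisfies(fs, feature, value, rel="eq"):
    """Check a feature-value constraint under the given relation.

    rel="eq": the feature must have exactly this value.
    rel="ne": the feature may have any value except this one
              (an undefined feature also passes, by assumption).
    """
    actual = fs.get(feature)
    if rel == "eq":
        return actual == value
    if rel == "ne":
        return actual != value
    raise ValueError(f"unknown rel: {rel!r}")


singular = {"NUM": "sg"}
plural = {"NUM": "pl"}

not_plural_ok = satisfies(singular, "NUM", "pl", rel="ne")  # pl excluded, sg allowed
not_plural_bad = satisfies(plural, "NUM", "pl", rel="ne")   # the excluded pair itself
```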

Comments from P-members

See Annex 2 to ISO/TC 37/SC 4 N 053

Bulgaria: extensive use of FSR in applications

Japan: XML’s interoperability with other description tools such as RDF and OWL and accommodation of the dump format for LAF

Korea: ensuing work on description or declaration language for FSR and B-R Ryu’s detailed proof-reading

UK: the SGML/XML implementation should not be so prominent, since FSR is a data-modeling tool.

No comments from France because …

Proof Reading

Part of Annex B, namely the part on subsumption, should be added to the Table of Contents because that notion is used in Chapter 5.

Suggestions for Improvements

Any restructuring of the document?

Addition of subsections on historical background, basic and Boolean operations, definition of type and multiple inheritance hierarchy

Adding more terms to section 3

Different uses of the term “tag”??

Table of Contents

4 General Characteristics of Feature Structure Overview (move to the beginning of the document)
4.0 Historical Background (addendum)
4.1 Use of Feature Structures
4.2 Basic Concepts
4.2.1 Typed Feature Structure (addendum)
4.2.2 Multiple Inheritance Hierarchy (addendum)
4.3 Notations
4.3.1 Graph Notation
4.3.2 Matrix Notation
4.4 Shared Feature Structure or Reentrancy
4.5 Basic Operations and Relations (addendum)
4.6 List, Set, etc. as Feature Values (addendum)
4.7 Boolean Operations and Relations (addendum)

Discussion of singleton, set, bag, and list as possible forms of feature-values in section 4

Coordination of sections 4 and 5, for instance, by converting examples in section 5 into AVMs and listing them in section 4.

More illustrations or restricting them to language related ones, perhaps from MSA?

Main action to be taken by the joint working group

The following items have been identified as requiring revision in the current document:

- Preliminary formal description of feature structures
- Provision of a simplified representation (FS lite) describing the basic subset of FS representation without libraries;
- Provision of a re-entrance mechanism;
- Description of typed feature structure;
- Simplification of feature value content by replacing some elements (<symb>, <num> etc.) by references to types (à la XML schemas)
- Provision of more NLP related examples
- [Note: reference to pointer/linking group for ID/IDREF mechanisms]

Questions?

Is an XML representation the third type of notation for FS viewed as being at the same descriptive level as AVM and DAG?

List possible applications, say to lexicology, (polysemy, dialectal variations), MorSA, and Description of Sem Rep (tripartite analysis of Quantification), with relevant illustrations?

Degree of formality in definitions, perhaps in Annexes

Editor’s responsibility

Contact each expert with specific questions for revising the document

Koiti Hasida, Manfred Pinkal, and Eric de Clergerie agreed to write up some comments

KH: use of XML for representing FS
MP: EC: coordination of sections 4 and 5
specification of atomic values