Overview of Feature Structure Representation
Kiyong Lee [email protected]
July 7, 2003, Sapporo, Japan
2003-07-07 Kiyong Lee
ISO/TC 37/SC 4/WG 1 Meeting Sapporo, Japan
Preliminary Remark
This work item on feature structures concerns itself with their representational aspect only.
The task of specifying their description or declaration mechanism shall be taken up as a separate but related new work item.
Historical Background
Feature structures were first used in theoretical linguistics, in the 1930s or earlier, to treat the distinctive features of phonological segments and to represent their oppositions.
They were used extensively in generative phonology in the 1960s, then in the development of grammar formalisms in the 1980s, and finally in other computational or theoretical work such as parsing, lexicology or formal semantics from the 1990s up to the present.
Use of feature structures
Feature structures are an essential part of many linguistic formalisms as well as an underlying mechanism for representing the information consumed or produced by and for language resource management or information technology.
•This international standard provides a format to represent, store or exchange feature structures in natural language applications,
•for the purposes of both annotation and production of linguistic data.
General Characteristics of Feature Structures
Feature structure representation can:
1. Capture the notion of partial information.
2. Make finer-grained distinctions among objects being described.
3. Provide a systematic format for accommodating various types of constraints on organizing information.
1. Capturing partiality of information
A feature structure represents partial information about an object being described.
This information can be manipulated or augmented monotonically with simple operations like unification, generalization or merging, and feature structures can be compared explicitly with each other with respect to a formally defined relation such as subsumption.
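A minimal sketch of unification and subsumption, assuming feature structures are modeled as nested Python dicts with strings as atomic values. The function names and the sample features are illustrative assumptions, not part of the work item, and reentrancy is ignored here.

```python
def unify(a, b):
    """Return the unification of a and b, or None if their information conflicts."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # copy, so neither input is modified
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # conflicting values: unification fails
                result[feature] = merged
            else:
                result[feature] = value  # new information is simply added
        return result
    return a if a == b else None


def subsumes(general, specific):
    """True if every piece of information in general is also present in specific."""
    if isinstance(general, dict) and isinstance(specific, dict):
        return all(
            feature in specific and subsumes(value, specific[feature])
            for feature, value in general.items()
        )
    return general == specific


person = {"agr": {"person": "3rd"}}
number = {"agr": {"number": "sg"}}
both = unify(person, number)                     # information only grows
clash = unify(both, {"agr": {"number": "pl"}})   # sg vs. pl cannot be unified
```

Here `both` carries the information of both inputs, so `person` subsumes `both` but not the other way round; this is the sense in which augmentation is monotonic.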
2. Finer-grained distinctions
Feature structure representation can also make finer-grained distinctions among objects being analyzed.
Phonemes like /p/ and /k/, for instance, are non-continuant consonants that share the property of being peripheral, or non-coronal, with respect to their place of articulation in the oral cavity, and thus sometimes behave similarly with respect to some phonological process of assimilation.
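The /p/ and /k/ example can be sketched as two flat feature structures whose generalization keeps exactly the properties they share. The `anterior` feature and its values are an assumption borrowed from common distinctive-feature analyses, added only so that the two phonemes differ somewhere; `generalize` is a hypothetical helper for flat structures.

```python
def generalize(a, b):
    """Keep only the feature-value pairs on which a and b agree (flat structures only)."""
    return {feature: value for feature, value in a.items() if b.get(feature) == value}


# "+" / "-" stand for the binary values of each distinctive feature.
p = {"continuant": "-", "coronal": "-", "anterior": "+"}  # assumed value for anterior
k = {"continuant": "-", "coronal": "-", "anterior": "-"}  # assumed value for anterior

shared = generalize(p, k)  # the non-continuant, non-coronal class
```

A phonological process stated over `shared` applies to both phonemes, while the full structures still keep them apart.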
3. Organizing information
Feature structure representation makes it easier to organize information systematically by accommodating various constraints or inheritance mechanisms into the sorting out of information from various sources.
Formal definition of feature structure
A feature structure is thus formally defined either as a partial function from features (or attributes) to values, in set-theoretic terms, or as a directed acyclic graph (DAG), in graph-theoretic terms.
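The two views can be placed side by side in a small sketch. It assumes a nested Python dict as the partial function, where a missing key means the function is undefined for that feature, and a hypothetical helper `to_edges` that flattens the same structure into the labeled, directed edges of a graph. The node names are made up for illustration.

```python
def to_edges(fs, node="root"):
    """Flatten a nested-dict feature structure into (source, label, target) edges."""
    edges = []
    for feature, value in fs.items():
        if isinstance(value, dict):
            child = f"{node}.{feature}"  # illustrative name for the inner node
            edges.append((node, feature, child))
            edges.extend(to_edges(value, child))
        else:
            edges.append((node, feature, value))  # atomic value: terminal node
    return edges


fs = {"cat": "noun", "agr": {"number": "sg"}}

undefined = fs.get("case")  # partial function: no value for the feature "case"
edges = to_edges(fs)
```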
Typed feature structure
Feature structures may be constrained by some typing. A multiple inheritance type hierarchy, for instance, constrains the construction of well-formed feature structures.
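One way to picture such a constraint is the sketch below. The type names, the features each type introduces, and the helper functions are all hypothetical; the only point is that a type with two supertypes inherits the features of both, and a structure is rejected if it uses a feature its type does not license.

```python
# Hypothetical hierarchy: each type lists its immediate supertypes.
SUPERTYPES = {
    "sign": [],
    "verbal": ["sign"],
    "finite": ["sign"],
    "finite-verb": ["verbal", "finite"],  # multiple inheritance
}

# Hypothetical features introduced by each type.
FEATURES = {
    "sign": {"phon"},
    "verbal": {"args"},
    "finite": {"tense"},
    "finite-verb": set(),
}


def ancestors(t):
    """Return t together with all of its direct and indirect supertypes."""
    seen = {t}
    for parent in SUPERTYPES[t]:
        seen |= ancestors(parent)
    return seen


def allowed_features(t):
    """Features licensed for type t: its own plus everything it inherits."""
    return set().union(*(FEATURES[a] for a in ancestors(t)))


def well_formed(t, fs):
    """A structure of type t may only use features licensed for t."""
    return set(fs) <= allowed_features(t)


ok = well_formed("finite-verb", {"phon": "runs", "tense": "present", "args": "subject"})
bad = well_formed("verbal", {"tense": "present"})  # tense is not licensed for verbal
```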
Notations
Feature structures can be represented in either matrix or graph format.
Each matrix consists of pairs of a feature (or attribute) and its unique value, and is therefore called an attribute-value matrix (AVM).
•Each graph representing a feature structure, on the other hand, consists of
- a single root, labeled and directed branches, and terminal nodes.
- Each node, including the particular node called the root, represents a type, and each label on a branch represents a feature.
- A type can be either an atom or a complex object that is itself a feature structure.
Path
In graph notation, a feature structure can easily be seen as consisting of many but possibly null paths.
Each path starts at the root and goes through successive branches to a terminal node, and thus consists of the labels on those branches.
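The notion of path can be sketched by enumerating every root-to-terminal path of a structure, assuming nested Python dicts for complex values and strings for atoms. `paths` is a hypothetical helper, and each path is reported as the tuple of labels on its branches.

```python
def paths(fs, prefix=()):
    """Yield (path, atomic value) for every path from the root to a terminal node."""
    for feature, value in fs.items():
        if isinstance(value, dict):
            # complex value: keep walking, extending the path by this label
            yield from paths(value, prefix + (feature,))
        else:
            yield prefix + (feature,), value


fs = {"cat": "verb", "agr": {"person": "3rd", "number": "sg"}}
all_paths = list(paths(fs))
```

For this structure the paths are `cat`, `agr person` and `agr number`; an empty structure has no paths at all.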
Shared values or reentrancy
Some features may share a token-identical value. The verb runs, for example, has the agreement value 3rdSg, and so does the noun Mary.
Such shared values are called reentrant because, in graph notation, two branches that share a value merge into one and the same node.
In matrix format, reentrancy or sharing is represented by tags bearing the same index.
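Token identity can be sketched with Python object identity: two features are reentrant when they point at the very same object, not merely at equal ones. The agreement value is modeled here as a complex value so that identity is observable, and `reentrant_tags` is a hypothetical helper that hands out one index per shared node, much like the tags of the matrix format.

```python
def reentrant_tags(fs):
    """Map the id of every complex value reached by more than one branch to a tag index."""
    incoming = {}  # id(node) -> number of branches arriving at that node

    def visit(node):
        for value in node.values():
            if isinstance(value, dict):
                incoming[id(value)] = incoming.get(id(value), 0) + 1
                if incoming[id(value)] == 1:  # descend only on the first visit
                    visit(value)

    visit(fs)
    shared = [node_id for node_id, count in incoming.items() if count > 1]
    return {node_id: tag for tag, node_id in enumerate(shared, start=1)}


agr = {"person": "3rd", "number": "sg"}  # one node, reached by two branches
sentence = {
    "subject": {"phon": "Mary", "agr": agr},
    "head": {"phon": "runs", "agr": agr},
}

# Equal values, but two distinct nodes: type-identical, not token-identical.
unshared = {
    "subject": {"phon": "Mary", "agr": dict(agr)},
    "head": {"phon": "runs", "agr": dict(agr)},
}

tags = reentrant_tags(sentence)
no_tags = reentrant_tags(unshared)
```

`tags` holds a single entry with index 1 for the shared agreement node, while `no_tags` is empty.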
List as a value
A feature may take a list of values.
The argument structure of a predicate, for instance, may take as value a list of arguments, say Subject and one or two Objects, depending on the type of the predicate.
F: <v1, v2, v3>
Systematic representation of feature structures
In a typed feature structure, the value of each feature is a type, and typing is constrained by particular applications.
Nevertheless, values need to be characterized in some systematic way.
•One way is to build libraries for features, feature-values or feature structures through some clustering and identification mechanism:
•fLib
•fvLib
•fsLib
•Another is to introduce structure into values, organizing them as a singleton, set, bag (or multiset) or list.
•Singleton: {a}
•Set: {a,a} = {a}, {a,b} = {b,a}
•Bag: m{a,a} ≠ m{a}, m{a,b} = m{b,a}
•List: <a,b> ≠ <b,a>
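The identities for sets, bags and lists can be checked directly with Python's built-in collections. This is only an illustration of the three behaviors: `set` ignores both duplicates and order, `collections.Counter` (used here as a bag) counts duplicates but ignores order, and `list` respects both.

```python
from collections import Counter

# Set: duplicates and order are irrelevant.
set_ignores_duplicates = {"a", "a"} == {"a"}
set_ignores_order = {"a", "b"} == {"b", "a"}

# Bag (multiset): duplicates count, order does not.
bag_counts_duplicates = Counter(["a", "a"]) != Counter(["a"])
bag_ignores_order = Counter(["a", "b"]) == Counter(["b", "a"])

# List: order matters.
list_respects_order = ["a", "b"] != ["b", "a"]
```

All five comparisons evaluate to `True`.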
•A third way may be to introduce special values such as
-<plus, minus> for binary values
-<any> for the Boolean truth value variable, cf. wild card?
-<none> for the Boolean falsity value variable,
-<dft> for default value, or
-<uncertain> for uncertainty value.
cf. <unknown>
•Finally, the <rel> attribute may be provided for some values.
•The setting rel = ne, for non-equality, may be introduced to exclude a certain feature-value pair while allowing other alternatives.
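The effect of a non-equality relation can be sketched as a constraint check. The pair representation, the function `satisfies`, and the `"eq"` relation are assumptions made for this illustration; only `ne` comes from the slide, and nothing here reflects the actual markup of the standard.

```python
def satisfies(constraint, value):
    """Check a value against a (rel, expected) constraint; rel is "eq" or "ne"."""
    rel, expected = constraint
    if rel == "eq":   # assumed default relation: the value must match
        return value == expected
    if rel == "ne":   # non-equality: any value except the excluded one
        return value != expected
    raise ValueError(f"unknown relation: {rel}")


# Exclude the pair case = nominative while allowing every other case value.
not_nominative = ("ne", "nominative")

accusative_ok = satisfies(not_nominative, "accusative")
nominative_ok = satisfies(not_nominative, "nominative")
```

With this constraint, `accusative_ok` is `True` and `nominative_ok` is `False`.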
Comments from P-members
See Annex 2 to ISO/TC 37/SC 4 N 053
Bulgaria: extensive use of FSR in applications
Japan: XML’s interoperability with other description tools such as RDF and OWL, and accommodation of the dump format for LAF
Korea: ensuing work on description or declaration language for FSR and B-R Ryu’s detailed proof-reading
UK: the SGML/XML implementation should not be so prominent, since FSR is a data-modeling tool.
No comments from France because …
Proof Reading
Part of Annex B, namely on subsumption, should be added to the Table of Contents because that notion is used in Chapter 5.
Suggestions for Improvements
Any restructuring of the document?
Addition of subsections on historical background, basic and Boolean operations, definition of type and multiple inheritance hierarchy
Adding more terms to section 3
Different uses of the term “tag”??
Table of Contents
4 General Characteristics of Feature Structure
Overview (move to the beginning of the document)
4.0 Historical Background (addendum)
4.1 Use of Feature Structures
4.2 Basic Concepts
4.2.1 Typed Feature Structure (addendum)
4.2.2 Multiple Inheritance Hierarchy (addendum)
4.3 Notations
4.3.1 Graph Notation
4.3.2 Matrix Notation
4.4 Shared Feature Structure or Reentrancy
4.5 Basic Operations and Relations (addendum)
4.6 List, Set, etc. as Feature Values (addendum)
4.7 Boolean Operations and Relations (addendum)
Discussion of singleton, set, bag, and list as possible forms of feature-values in section 4
Coordination of sections 4 and 5, for instance, by converting the examples in section 5 into AVMs and listing them in section 4.
More illustrations or restricting them to language related ones, perhaps from MSA?
Main action to be taken by the joint working group
The following items have been identified as requiring revision in the current document:
- Preliminary formal description of feature structures;
- Provision of a simplified representation (FS lite) describing the basic subset of FS representation without libraries;
- Provision of a re-entrance mechanism;
- Description of typed feature structure;
- Simplification of feature value content by replacing some elements (<symb>, <num> etc.) by references to types (à la XML schemas);
- Provision of more NLP related examples.
[Note: reference to pointer/linking group for ID/IDREF mechanisms]
Questions?
Is an XML representation the third type of notation for FS, viewed as being at the same descriptive level as AVM and DAG?
List possible applications, say to lexicology (polysemy, dialectal variations), MorSA, and Description of Sem Rep (tripartite analysis of Quantification), with relevant illustrations?
Degree of formality in definitions, perhaps in Annexes
Editor’s responsibility
Contact each expert with specific questions for revising the document
Koiti Hasida, Manfred Pinkal, and Eric de Clergerie agreed to write up some comments
KH: use of XML for representing FS
MP:
EC: coordination of sections 4 and 5
specification of atomic values