Learning by Reading

Micah Clark & Selmer Bringsjord

Rensselaer AI & Reasoning (RAIR) Laboratory

Department of Cognitive Science

Department of Computer Science

Rensselaer Polytechnic Institute (RPI)

Troy NY 12180 USA

03.20.06

Turning to written text and diagrams in order to learn isn’t considered learning as it has been rigorously studied in computer science, cognitive science, and AI. In these fields, to learn is almost invariably to produce an underlying function f on the basis of a restricted set of input-output pairs.

Yet this form of Learning by Reading (LBR) underpins much of modern human life – e.g. the educational system, job training, IRS tax forms, product manuals.

Shallow vs. Deep Learning

Shallow Learning: Absorb the semantic content explicitly present in the surface structure and form of the medium (texts)

Deep Learning: Reflective contemplation of semantic content with respect to prior knowledge, experience, and beliefs as well as imaginative hypothetical projections

Example: Book Reports!

Shallow LBR for Slate

Reading Process

Process: Intelligence Reports → Multi-Sorted Logic

Reading Process Implementation

Process: Intelligence Reports → Multi-Sorted Logic

Reading Process – Phase 1

• ACE (Fuchs, et al.)

• WordNet was previously used as the lexicon database for CELT, an ACE-like controlled language (Pease, et al.)

• Manual transcription/authoring in controlled languages is viable at scale (Allen & Barthe)

• Techniques for automated conversion from natural English to controlled English are being developed (Mollá & Schwitter)

Attempto Controlled English

ACE is an unambiguous proper subset of full English

• Vocabulary of reserved function words and user-defined content words

• Grammar is context-free, phrase-structured, and definite clause

• Principles of Interpretation deterministically disambiguate otherwise ambiguous phrases

• Direct translation into Discourse Representation Structures


• ACE Parser (APE)

• Discourse Representation Structures (DRSs) are central to Discourse Representation Theory (DRT) (Kamp & Reyle)

• DRT is a linguistic theory for assigning meaning to discourse by sequential additive contribution

• DRS is a syntactic variant of first-order logic for the resolution of unbounded anaphora

• DRS is a structure ((referents), (conditions))

DRS Example

“John talks to Mary.”

((A, B), (John(A), Mary(B), talk(A, B)))

… “He smiles at her.”

((A, B, C, D),
 (John(A), Mary(B), talk(A, B),
  smile(C, D), C=A, D=B))
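The additive update in the example above can be sketched in a few lines of Python. This is only an illustration of the ((referents), (conditions)) structure: the `DRS` class and its `merge` method are hypothetical names, not part of ACE or APE, and the pronouns are assumed to have been resolved already (C=A, D=B).

```python
from dataclasses import dataclass, field


@dataclass
class DRS:
    """A DRS as a pair (referents, conditions); hypothetical helper class."""
    referents: list = field(default_factory=list)
    conditions: list = field(default_factory=list)

    def merge(self, other):
        """Additive update: extend this DRS with a new sentence's contribution."""
        new_refs = self.referents + [r for r in other.referents
                                     if r not in self.referents]
        return DRS(new_refs, self.conditions + other.conditions)

    def __str__(self):
        return "((%s), (%s))" % (", ".join(self.referents),
                                 ", ".join(self.conditions))


# "John talks to Mary."
s1 = DRS(["A", "B"], ["John(A)", "Mary(B)", "talk(A, B)"])
# "He smiles at her." with the anaphora already resolved to A and B
s2 = DRS(["C", "D"], ["smile(C, D)", "C=A", "D=B"])

discourse = s1.merge(s2)
print(discourse)
```

Each sentence contributes its own referents and conditions, and the discourse DRS grows monotonically as sentences are read.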

DRS Example

… “She does not smile at him.”

((A, B, C, D),
 (John(A), Mary(B), talk(A, B),
  smile(C, D), C=A, D=B,
  ¬((E, F), (smile(E, F), E=B, F=A))))
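A minimal sketch of how such a DRS could be turned into a first-order formula, assuming the textbook translation: referents become existential quantifiers, conditions are conjoined, and an embedded DRS for “does not” is read as a negated condition. The function name `drs_to_fol` and the `("not", drs)` tuple convention are assumptions made for this illustration only.

```python
def drs_to_fol(drs):
    """Translate a DRS ((referents), (conditions)) into a formula string."""
    referents, conditions = drs
    parts = []
    for cond in conditions:
        if isinstance(cond, tuple) and cond[0] == "not":
            # A negated sub-DRS becomes the negation of its own translation.
            parts.append("¬" + drs_to_fol(cond[1]))
        else:
            # Atomic conditions (predicates, equalities) carry over unchanged.
            parts.append(cond)
    body = " ∧ ".join(parts)
    prefix = "".join("∃%s " % r for r in referents)
    return "%s(%s)" % (prefix, body)


# Sub-DRS for "smile(E, F)", to be negated ("She does not smile at him.")
negated = (("E", "F"), ("smile(E, F)", "E=B", "F=A"))
discourse = (
    ("A", "B", "C", "D"),
    ("John(A)", "Mary(B)", "talk(A, B)",
     "smile(C, D)", "C=A", "D=B", ("not", negated)),
)

formula = drs_to_fol(discourse)
print(formula)
```

The negation scopes over the existential quantifiers of the embedded DRS, so E and F are not available as antecedents outside it.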


• Transformation from DRS to MSL/FOL is well understood (Blackburn & Bos)

• ACE uses an extended form of DRS

• Small, domain-neutral, encoding scheme & ontology to capture semantic content

• Straightforward translation would inject ACE’s ontology/encoding scheme into the resulting formulas

• Translation must map from ACE’s ontology to another, perhaps PSL (Process Specification Language; ISO 2001)

• Similar to CELT’s mapping of WordNet to the Suggested Upper Merged Ontology (SUMO)

Encoding Scheme Examples

• Nouns and verbs have a semantic type: person, object, time, or unspecified for nouns; event, state, or unspecified for verbs

– e.g. object(A, named_entity, person)

• Properties are encoded using property

– e.g. green(A) → property(A, green)

• Predicates are encoded using predicate

– e.g. enter(A, B) → predicate(P, event, enter, A, B)
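An inverse of this encoding, which recovers plain predicates from encoded conditions, might look roughly like the following. It is a sketch under assumptions: the function `decode` is hypothetical, it handles only the property and predicate shapes shown above, and it leaves object(...) conditions untouched because their plain form is not given here.

```python
import re

# Matches one encoded condition such as "property(A, green)".
_CONDITION = re.compile(r"^(\w+)\((.*)\)$")


def decode(condition):
    """Map an encoded condition back to a plain predicate string."""
    match = _CONDITION.match(condition.strip())
    if not match:
        return condition
    functor = match.group(1)
    args = [a.strip() for a in match.group(2).split(",")]
    if functor == "property" and len(args) == 2:
        ref, name = args
        return "%s(%s)" % (name, ref)
    if functor == "predicate" and len(args) >= 4:
        # args: event/state referent, semantic type, verb, then verb arguments
        _ref, _type, verb, *verb_args = args
        return "%s(%s)" % (verb, ", ".join(verb_args))
    # Anything else (e.g. object(...)) is passed through unchanged.
    return condition


print(decode("property(A, green)"))
print(decode("predicate(P, event, enter, A, B)"))
```

Note that decoding a predicate this way discards the event referent P and the semantic type, so any translation that needs them would have to keep the encoded form.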

Slate Reading Example

Input Text

Security searches every foreigner that boards a plane. Abdul is an Iranian. He boards DL846.

Parse Tree

DRS

Multi-Sorted Logic (Using Inverse Encoding Map)

1. ∃A (Security(A) ∧ ∀B,C ((foreigner(B) ∧ plane(C) ∧ board(B, C)) → search(A, B)))

2. ∃A∃B (Abdul(A) ∧ Iranian(A) ∧ DL846(B) ∧ board(A, B))
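To illustrate what the formulas above make available to a reasoner, here is a small hand-rolled sketch that instantiates the existentials with constants and applies the universal rule from formula 1. The background axioms (every Iranian is a foreigner, DL846 is a plane) are assumptions added for this illustration; they are not stated in the input text, and the function `close` is a hypothetical helper, not part of Slate.

```python
# Facts read off formulas 1 and 2, with constants for the existentials.
facts = {
    ("Security", "security"),
    ("Abdul", "abdul"),
    ("Iranian", "abdul"),
    ("DL846", "dl846"),
    ("board", "abdul", "dl846"),
}

# ASSUMED background knowledge, not present in the input text.
background = {"Iranian": "foreigner", "DL846": "plane"}


def close(facts, background):
    """Apply the background axioms, then the rule from formula 1."""
    derived = set(facts)
    for fact in facts:
        if len(fact) == 2 and fact[0] in background:
            derived.add((background[fact[0]], fact[1]))
    searchers = [f[1] for f in derived if f[0] == "Security"]
    for fact in list(derived):
        if fact[0] == "board":
            _, b, c = fact
            # foreigner(B) ∧ plane(C) ∧ board(B, C) → search(A, B)
            if ("foreigner", b) in derived and ("plane", c) in derived:
                for a in searchers:
                    derived.add(("search", a, b))
    return derived


conclusions = close(facts, background)
print(("search", "security", "abdul") in conclusions)
```

Without the assumed background axioms the conclusion does not follow, which is one reason the mapping from ACE's ontology to a richer one matters.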

Comparison to KANI and HITIQA

Knowledge Associates for Novel Intelligence (KANI)

High-Quality Interactive Question Answering (HITIQA)

Technical Accomplishments

• Proof-of-concept demonstration of automatic translation of a controlled English to FOL for the IA domain

• Demonstration leverages third-party technologies as previously discussed

• Effort has identified specific aspects of the approach in need of novel research

Programmatic Accomplishments

• Bringsjord, S. & Clark, M. (2006) ‘For Problems Sufficiently Hard . . . AI Needs CogSci.’ To appear in Proceedings for the American Association for Artificial Intelligence’s Spring Symposium on Cognitive Science and AI (“Between a Rock and a Hard Place: Cognitive Science Principles Meet AI-Hard Problems”).

• Clark, M. & Bringsjord, S. (2006) ‘Learning by Reading’, Invited talk for the Institute for Informatics, Logics, and Security Studies, State University of New York at Albany, Albany, NY.

• Bringsjord, S. & Clark, M. (2006) ‘Solomon: A Next Generation Q&A System’, Blue-sky proposal in response to BAA N61339-06-R-0034 (DTO AQUAINT Program phase 3).

• Clark, M. (2006) ‘Method for Detecting Infinite Ambiguity in Context-Sensitive Generative Grammars’, Research Note, Rensselaer AI & Reasoning (RAIR) Laboratory, Cognitive Science Department, Rensselaer Polytechnic Institute, Troy, NY.

Future Research

• Interpretation of ‘natural style’ proofs as DRSs

• Ontologically neutral DRSs

• Ambiguous referents and incremental resolution

• Conversational DRT

• Non-monotonic transitions in DRT

• Restatements in Conversational Discourse

Immediate Objectives

• Develop inverse mapping and translation from ACE ontology and encoding to ‘vanilla’ MSL (with Bettina)

• Develop basic translation/reformulation of natural deductive proofs (NDL?, Athena?, Slate?) into DRSs (with Sunny)

References

Allen, J. & Barthe, K. (2004), ‘Introductory Overview of Controlled Languages’, Invited talk for the Society for Technical Communication. Presentation.

Blackburn, P. & Bos, J. (Forthcoming), Working with Discourse Representation Theory: An Advanced Course in Computational Semantics. Forthcoming.

Fuchs, N. E., Hoefler, S., Kaljurand, K., Schneider, G. & Schwertel, U. (2005), Extended Discourse Representation Structures in Attempto Controlled English, Technical Report ifi-2005.08, Department of Informatics, University of Zurich, Zurich, Switzerland.

Fuchs, N. E., Kaljurand, K., Rinaldi, F. & Schneider, G. (2005), A Parser for Attempto Controlled English, Technical Report IST506779/Zurich/I2D3/D/PU, REWERSE.

Hoefler, S. (2004), The Syntax of Attempto Controlled English: An Abstract Grammar for ACE 4.0, Technical Report ifi-2004.03, Department of Informatics, University of Zurich, Zurich, Switzerland.

Fuchs, N. E., Schwertel, U. & Schwitter, R. (1999), Attempto Controlled English (ACE) Language Manual, Version 3.0, Technical Report 99.03, Department of Computer Science, University of Zurich, Zurich, Switzerland.

ISO (2001), Industrial automation system and integration — Process specification language, Committee Draft ISO/CD 18629-1, International Organization for Standardization (ISO).

Kamp, H. & Reyle, U. (1993), From Discourse to Logic: Introduction to Model-theoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory, 1 edn, Springer.

Mollá, D. & Schwitter, R. (2001), From Plain English to Controlled English, in ‘Proceedings of the 2001 Australasian Natural Language Processing Workshop’, Macquarie University, Sydney, Australia, pp. 77–83.

Pease, A. & Fellbaum, C. (2004), Language to Logic Translation with PhraseBank, in ‘Proceedings of the Second International WordNet Conference (GWC2004)’, Masaryk University Brno, Czech Republic, pp. 187–192.

Pease, A. & Murray, W. (2003), An English to Logic Translator for Ontology-based Knowledge Representation Languages, in ‘Proceedings of the 2003 IEEE International Conference on Natural Language Processing and Knowledge Engineering’, Beijing, China, pp. 777–783.
