Embodied Construction Grammar ECG (Formalizing Cognitive Linguistics) 1. Community Grammar and Core Concepts 2. Deep Grammatical Analysis 3. Computational Implementation a. Test Grammars b. Applied Projects – Question Answering 4. Map to Connectionist Models, Brain


Page 1: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Embodied Construction Grammar (ECG)

(Formalizing Cognitive Linguistics)

1. Community Grammar and Core Concepts

2. Deep Grammatical Analysis

3. Computational Implementation
   a. Test Grammars
   b. Applied Projects – Question Answering

4. Map to Connectionist Models, Brain

5. Models of Grammar Acquisition

Page 2: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Simulation specification

The analysis process produces a simulation specification that

• includes image-schematic, motor control, and conceptual structures

• provides parameters for a mental simulation

Page 3: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Summary: ECG

• Linguistic constructions are tied to a model of simulated action and perception
• Embedded in a theory of language processing
  – Constrains theory to be usable
  – Basis for models of grammar learning
• Precise, computationally usable formalism
  – Practical computational applications, like MT and NLU
  – Testing of functionality, e.g. language learning
• A shared theory and formalism for different cognitive mechanisms
  – Constructions, metaphor, mental spaces, etc.
• Reduction to Connectionist and Neural levels

Page 4: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Constrained Best Fit in Nature

inanimate
• physics: lowest energy state
• chemistry: molecular fit

animate
• biology: fitness, MEU, Neuroeconomics
• vision: threats, friends
• language: errors, NTL
• society, politics: framing, compromise

Page 5: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

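Competition-based analyzer

• An analysis is made up of:
  – A constructional tree
  – A semantic specification
  – A set of resolutions

[Figure: analysis of "Bill gave Mary the book". Constructional tree: A-GIVE-B-X with constituents subj (Ref-Exp), v (Give), obj1 (Ref-Exp), obj2 (Ref-Exp). Semantic specification: Give-Action with roles giver (@Man), recipient (@Woman), theme (@Book). Resolutions: Bill, Mary, book01.]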

Johno Bryant

Page 6: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Combined score determines best-fit

• Syntactic Fit:
  – Constituency relations
  – Combine with preferences on non-local elements
  – Conditioned on syntactic context
• Antecedent Fit:
  – Ability to find referents in the context
  – Conditioned on syntax match, feature agreement
• Semantic Fit:
  – Semantic bindings for frame roles
  – Frame roles’ fillers are scored
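The slide does not say how the three fits are combined. As a minimal sketch, assuming the combined score is a product of probability-like factors (computed as a sum of logs), the best-fit choice could look like the following. The class name, the field names, and all numeric values are hypothetical placeholders, not figures from the analyzer.

```python
import math
from dataclasses import dataclass


@dataclass
class CandidateAnalysis:
    """One competing analysis with three probability-like fit scores in (0, 1]."""
    name: str
    syntactic_fit: float
    antecedent_fit: float
    semantic_fit: float

    def log_score(self) -> float:
        # Assumption: the combined score is the product of the three fits,
        # evaluated in log space to avoid underflow on long utterances.
        return (math.log(self.syntactic_fit)
                + math.log(self.antecedent_fit)
                + math.log(self.semantic_fit))


def best_fit(analyses):
    """Return the analysis with the highest combined score."""
    return max(analyses, key=lambda a: a.log_score())


# Placeholder values for illustration only.
candidates = [
    CandidateAnalysis("analysis-1", syntactic_fit=0.10, antecedent_fit=0.9, semantic_fit=0.8),
    CandidateAnalysis("analysis-2", syntactic_fit=0.20, antecedent_fit=0.9, semantic_fit=0.1),
]
print(best_fit(candidates).name)
```

A weak syntactic fit can be outweighed by the other two factors, which is the point of scoring the analyses jointly rather than filtering on syntax first.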

Page 7: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

0 Eve 1 walked 2 into 3 the 4 house 5

Constructs
--------------
NPVP[0] (0,5)
Eve[3] (0,1)
ActiveSelfMotionPath[2] (1,5)
WalkedVerb[57] (1,2)
SpatialPP[56] (2,5)
Into[174] (2,3)
DetNoun[173] (3,5)
The[204] (3,4)
House[205] (4,5)

Schema Instances
-------------------
SelfMotionPathEvent[1]
HouseSchema[66]
WalkAction[60]
Person[4]
SPG[58]
RD[177] ~ house
RD[5] ~ Eve

Page 8: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Unification chains and their fillers

SelfMotionPathEvent[1].mover
SPG[58].trajector
WalkAction[60].walker
RD[5].resolved-ref
RD[5].category
Filler: Person4

SpatialPP[56].m
Into[174].m
SelfMotionPathEvent[1].spg
Filler: SPG58

SelfMotionPathEvent[1].landmark
House[205].m
RD[177].category
SPG[58].landmark
Filler: HouseSchema66

WalkedVerb[57].m
WalkAction[60].routine
WalkAction[60].gait
SelfMotionPathEvent[1].motion
Filler: WalkAction60
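One way to picture how such chains arise is to treat each identification between two slots as a union operation and read the chains off as equivalence classes. The sketch below uses a small union-find structure. The slot names come from the slide, but the pairwise bindings are hypothetical (the slide shows only the resulting chains), and this is not claimed to be the analyzer's actual unification code.

```python
from collections import defaultdict


class UnionFind:
    """Minimal union-find over string-named slots."""

    def __init__(self):
        self.parent = {}

    def find(self, slot):
        self.parent.setdefault(slot, slot)
        while self.parent[slot] != slot:
            # Path halving keeps later lookups short.
            self.parent[slot] = self.parent[self.parent[slot]]
            slot = self.parent[slot]
        return slot

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


# Hypothetical pairwise identifications that would yield two of the chains.
bindings = [
    ("SelfMotionPathEvent[1].mover", "SPG[58].trajector"),
    ("SPG[58].trajector", "WalkAction[60].walker"),
    ("WalkAction[60].walker", "RD[5].resolved-ref"),
    ("SpatialPP[56].m", "Into[174].m"),
    ("Into[174].m", "SelfMotionPathEvent[1].spg"),
]

uf = UnionFind()
for left, right in bindings:
    uf.union(left, right)

# Group slots by representative: each group is one unification chain.
chains = defaultdict(list)
for slot in list(uf.parent):
    chains[uf.find(slot)].append(slot)

for members in chains.values():
    print(sorted(members))
```

Every slot in a chain shares a single filler, which is why the slide lists one filler per chain.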

Page 9: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Productive Argument Omission (Mandarin)
Johno Bryant & Eva Mok

CHILDES Beijing Corpus (Tardif, 1993; Tardif, 1996)

1. ma1+ma gei3 ni3 zhei4+ge
   mother give 2PS this+CLS
   • Mother (I) give you this (a toy).

2. ni3 gei3 yi2
   2PS give auntie
   • You give auntie [the peach].

3. ao ni3 gei3 ya
   EMP 2PS give EMP
   • Oh (go on)! You give [auntie] [that].

4. gei3
   give
   • [I] give [you] [some peach].

Page 10: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Arguments are omitted with different probabilities

All args omitted: 30.6%
No args omitted: 6.1%

[Figure: bar chart of % elided (98 total utterances) for Giver, Recipient, and Theme; vertical axis from 0% to 100%.]

Page 11: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Analyzing ni3 gei3 yi2 (You give auntie)

• Syntactic Fit:
  – P(Giver omitted | ditransitive cxn) = 0.78
  – P(Recipient omitted | ditransitive cxn) = 0.42
  – P(Theme omitted | ditransitive cxn) = 0.65

Two of the competing analyses:

Analysis 1: ni3 → Giver, gei3 → Transfer, yi2 → Recipient, omitted → Theme
(1-0.78)*(1-0.42)*0.65 = 0.08

Analysis 2: ni3 → Giver, gei3 → Transfer, omitted → Recipient, yi2 → Theme
(1-0.78)*(1-0.65)*0.42 = 0.03
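The arithmetic for the two analyses can be reproduced with a few lines of Python. The sketch assumes, as the products on the slide do, that each role is expressed or omitted independently. The three probabilities are the ones given in these slides; the function name is illustrative.

```python
# Omission probabilities for the ditransitive construction, from the slides.
P_OMIT = {"Giver": 0.78, "Recipient": 0.42, "Theme": 0.65}


def syntactic_fit(omitted_roles):
    """Multiply P(omitted) for omitted roles and 1 - P(omitted) for expressed ones."""
    score = 1.0
    for role, p_omit in P_OMIT.items():
        score *= p_omit if role in omitted_roles else (1.0 - p_omit)
    return score


fit_theme_omitted = syntactic_fit({"Theme"})          # yi2 read as Recipient
fit_recipient_omitted = syntactic_fit({"Recipient"})  # yi2 read as Theme

print(f"{fit_theme_omitted:.2f}")      # 0.08
print(f"{fit_recipient_omitted:.2f}")  # 0.03
```

On syntactic fit alone, the reading in which the Theme is omitted scores higher than the reading in which the Recipient is omitted.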

Page 12: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Using frame and lexical information to restrict type of reference

Lexical Unit gei3

Giver (DNI)

Recipient (DNI)

Theme (DNI)

The Transfer Frame

Giver

Recipient

Theme

Manner

Means

Place

Purpose

Reason

Time

Page 13: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Can the omitted argument be recovered from context?

• Antecedent Fit:

Analysis 1: ni3 → Giver, gei3 → Transfer, yi2 → Recipient, omitted → Theme
Analysis 2: ni3 → Giver, gei3 → Transfer, omitted → Recipient, yi2 → Theme

Discourse & Situational Context: child, mother, peach, auntie, table

Omitted role → ? (to be resolved against the context)

Page 14: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

How good of a theme is a peach? How about an aunt?

The Transfer Frame

• Giver (usually animate)
• Recipient (usually animate)
• Theme (usually inanimate)

Analysis 1: ni3 → Giver, gei3 → Transfer, yi2 → Recipient, omitted → Theme
Analysis 2: ni3 → Giver, gei3 → Transfer, omitted → Recipient, yi2 → Theme

Semantic Fit:

ni3 → Giver, gei3 → Transfer, yi2 → Recipient, omitted → Theme
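The role preferences above can be checked mechanically against the entities in the context. The following sketch is only an illustration: the animacy values assigned to the context entities are common-sense assumptions rather than data from the slides, and it uses a yes/no match where the analyzer presumably uses graded scores.

```python
# Assumed animacy of the entities in the discourse and situational context.
IS_ANIMATE = {
    "child": True,
    "mother": True,
    "auntie": True,
    "peach": False,
    "table": False,
}

# Role preferences from the Transfer frame ("usually animate" / "usually inanimate").
PREFERS_ANIMATE = {"Giver": True, "Recipient": True, "Theme": False}


def role_matches(role, entity):
    """True if the entity has the animacy the role usually expects."""
    return IS_ANIMATE[entity] == PREFERS_ANIMATE[role]


def candidates_for(role):
    """Context entities that could plausibly fill an omitted role."""
    return [entity for entity in IS_ANIMATE if role_matches(role, entity)]


# yi2 (auntie) as Recipient, with the Theme recovered from context:
print(role_matches("Recipient", "auntie"), candidates_for("Theme"))
# yi2 (auntie) as Theme:
print(role_matches("Theme", "auntie"))
```

Under these assumptions an aunt is a good Recipient and a poor Theme, while the peach and the table are available as antecedents for an omitted Theme.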

Page 15: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

The argument omission patterns shown earlier can be covered with just ONE construction

• Each construction is annotated with probabilities of omission
• Language-specific default probability can be set

Subj → Giver: P(omitted|cxn) = 0.78
Verb → Transfer
Obj1 → Recipient: P(omitted|cxn) = 0.42
Obj2 → Theme: P(omitted|cxn) = 0.65

[Figure: bar chart of % elided (98 total utterances) for Giver, Recipient, and Theme.]
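To see how a single annotated construction covers every omission pattern, the sketch below enumerates all eight combinations of omitted and expressed roles and assigns each a probability. It assumes that roles are omitted independently and that a role without an annotation falls back to a default. The default value of 0.5 is a hypothetical placeholder, not a figure from the slides.

```python
from itertools import product

ROLES = ("Giver", "Recipient", "Theme")
P_OMIT = {"Giver": 0.78, "Recipient": 0.42, "Theme": 0.65}  # from the slide
DEFAULT_P_OMIT = 0.5  # hypothetical language-specific default


def pattern_probability(omitted, p_omit=P_OMIT, default=DEFAULT_P_OMIT):
    """Probability of one omission pattern under the independence assumption."""
    prob = 1.0
    for role in ROLES:
        p = p_omit.get(role, default)
        prob *= p if role in omitted else (1.0 - p)
    return prob


# One construction, eight patterns: no separate construction per pattern.
patterns = {}
for flags in product((False, True), repeat=len(ROLES)):
    omitted = frozenset(role for role, flag in zip(ROLES, flags) if flag)
    patterns[omitted] = pattern_probability(omitted)

for omitted, prob in sorted(patterns.items(), key=lambda item: -item[1]):
    label = ", ".join(role for role in ROLES if role in omitted) or "nothing"
    print(f"omit {label}: {prob:.3f}")
```

The eight probabilities sum to one, so the annotated construction defines a distribution over patterns instead of a list of allowed ones.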

Page 16: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Leverage process to simplify representation

• The processing model is complementary to the theory of grammar
• By using a competition-based analysis process, we can:
  – Find the best-fit analysis with respect to constituency structure, context, and semantics
  – Eliminate the need to enumerate allowable patterns of argument omission in grammar
• This is currently being applied in models of language understanding and grammar learning.

Page 17: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Modeling context for language understanding and learning

• Linguistic structure reflects experiential structure
  – Discourse participants and entities
  – Embodied schemas:
    • action, perception, emotion, attention, perspective
  – Semantic and pragmatic relations:
    • spatial, social, ontological, causal
• ‘Contextual bootstrapping’ for grammar learning

Page 18: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

The context model tracks accessible entities, events, and utterances

Discourse & Situational Context

Discourse:
  Discourse01
    participants: Eve, Mother
    objects: Hands, ...
    discourse-history: DS01
    situational-history: Wash-Action

Page 19: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Each of the items in the context model has rich internal structure

Discourse:
  Discourse01

Situational History:
  Wash-Action
    washer: Eve
    washee: Hands

Discourse History:
  DS01
    speaker: Mother
    addressee: Eve
    attentional-focus: Hands
    content: {"are they clean yet?"}
    speech-act: question

Participants:
  Eve
    category: child
    gender: female
    name: Eve
    age: 2
  Mother
    category: parent
    gender: female
    name: Eve
    age: 33

Objects:
  Hands
    category: BodyPart
    part-of: Eve
    number: plural
    accessibility: accessible
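The structures above map naturally onto small record types. The sketch below is a hypothetical encoding, not the project's actual data model: it stores a subset of the fields from the slide and adds one illustrative helper that lists accessible entities of a given grammatical number, with the current attentional focus first. Treating Eve and Mother as singular and accessible is an assumption, since the slide marks those features only for Hands.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    category: str
    number: str = "singular"   # assumption: the slide marks only Hands as plural
    accessible: bool = True    # assumption: the slide marks only Hands as accessible


@dataclass
class DiscourseSegment:
    speaker: str
    addressee: str
    attentional_focus: str
    content: str
    speech_act: str


@dataclass
class ContextModel:
    participants: dict
    objects: dict
    discourse_history: list = field(default_factory=list)
    situational_history: list = field(default_factory=list)

    def resolve(self, number):
        """Names of accessible entities with this number, attentional focus first."""
        focus = None
        if self.discourse_history:
            focus = self.discourse_history[-1].attentional_focus
        entities = {**self.participants, **self.objects}
        names = [n for n, e in entities.items() if e.accessible and e.number == number]
        # sorted() is stable, so only the focused entity moves to the front.
        return sorted(names, key=lambda n: n != focus)


context = ContextModel(
    participants={"Eve": Entity("Eve", "child"), "Mother": Entity("Mother", "parent")},
    objects={"Hands": Entity("Hands", "BodyPart", number="plural")},
    discourse_history=[
        DiscourseSegment("Mother", "Eve", "Hands", "are they clean yet?", "question")
    ],
    situational_history=[{"action": "Wash-Action", "washer": "Eve", "washee": "Hands"}],
)

print(context.resolve("plural"))  # candidate referents for a plural pronoun like "them"
```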

Page 20: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Analysis produces a semantic specification

[Figure: Analysis takes the Utterance, Linguistic Knowledge, World Knowledge, and the Discourse & Situational Context, and produces a Semantic Specification.]

Utterance: “You washed them”

Semantic Specification:
  WASH-ACTION
    washer: Eve
    washee: Hands

Page 21: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

How Can Children Be So Good At Learning Language?

• Gold’s Theorem: No superfinite class of languages is identifiable in the limit from positive data only

• Principles & Parameters: Babies are born as blank slates but acquire language quickly (with noisy input and little correction) → Language must be innate: Universal Grammar + parameter setting

But babies aren’t born as blank slates! And they do not learn language in a vacuum!

Page 22: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Key ideas for a neural theory (NT) of language acquisition
Nancy Chang and Eva Mok

• Embodied Construction Grammar
• Opulence of the Substrate
  – Prelinguistic children already have rich sensorimotor representations and sophisticated social knowledge
• Basic Scenes
  – Simple clause constructions are associated directly with scenes basic to human experience (Goldberg 1995, Slobin 1985)
• Verb Island Hypothesis
  – Children learn their earliest constructions (arguments, syntactic marking) on a verb-specific basis (Tomasello 1992)

Page 23: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Embodiment and Grammar Learning

Paradigm problem for Nature vs. Nurture

The poverty of the stimulus

The opulence of the substrate

Intricate interplay of genetic and environmental, including social, factors.

Page 24: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Two perspectives on grammar learning

Computational models

• Grammatical induction
  – language identification
  – context-free grammars, unification grammars
  – statistical NLP (parsing, etc.)
• Word learning models
  – semantic representations
    • logical forms
    • discrete representations
    • continuous representations
  – statistical models

Developmental evidence

• Prior knowledge
  – primitive concepts
  – event-based knowledge
  – social cognition
  – lexical items
• Data-driven learning
  – basic scenes
  – lexically specific patterns
  – usage-based learning

Page 25: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Key assumptions for language acquisition

• Significant prior conceptual/embodied knowledge
  – rich sensorimotor/social substrate
• Incremental learning based on experience
  – Lexically specific constructions are learned first.
• Language learning tied to language use
  – Acquisition interacts with comprehension, production; reflects communication and experience in world.
  – Statistical properties of data affect learning

Page 26: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Analysis draws on constructions and context

[Figure: three linked panels.
Form: "you" before "washed" before "them".
Meaning: Addressee (washer), Wash-Action, ContextElement (washee).
Context: Eve (washer), Wash-Action, Hands (washee); Discourse Segment with addressee and attentional-focus.]

Page 27: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Learning updates linguistic knowledge based on input utterances

[Figure: Analysis takes the Utterance, Linguistic Knowledge, World Knowledge, and the Discourse & Situational Context, and yields a Partial SemSpec; Learning uses it to update Linguistic Knowledge.]

Page 28: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Context aids understanding: Incomplete grammars yield partial SemSpec

[Figure: three linked panels.
Form: "you", "washed", "them".
Meaning: Addressee, Wash-Action (roles washer, washee), ContextElement.
Context: Eve (washer), Wash-Action, Hands (washee); Discourse Segment with addressee and attentional-focus.]

Page 29: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Context bootstraps learning: new construction maps form to meaning

[Figure: three linked panels.
Form: "you" before "washed" before "them".
Meaning: Addressee, Wash-Action, ContextElement, now linked by the roles washer and washee.
Context: Eve (washer), Wash-Action, Hands (washee); Discourse Segment with addressee and attentional-focus.]

Page 30: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Context bootstraps learning: new construction maps form to meaning

[Figure: Form ("you" before "washed" before "them") mapped to Meaning (Addressee as washer, Wash-Action, ContextElement as washee), alongside the learned construction below.]

YOU-WASHED-THEM
  constituents:
    YOU, WASHED, THEM
  form:
    YOU before WASHED
    WASHED before THEM
  meaning: WASH-ACTION
    washer: addressee
    washee: ContextElement
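As a rough illustration of the "map form to meaning" step, the sketch below builds a construction record like YOU-WASHED-THEM from an ordered list of words and the role bindings recovered from context. The `Construction` type and the `hypothesize` function are hypothetical names, and the naming and ordering rules are simplifying assumptions rather than the learner's actual procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construction:
    name: str
    constituents: tuple
    form: tuple       # (earlier, later) pairs, read as "earlier before later"
    meaning: str      # schema evoked by the construction
    bindings: tuple   # (role, filler) pairs


def hypothesize(words, schema, role_fillers):
    """Propose a lexically specific construction from one utterance in context."""
    constituents = tuple(word.upper() for word in words)
    # Assumption: only adjacent word-order constraints are recorded.
    form = tuple(zip(constituents, constituents[1:]))
    return Construction(
        name="-".join(constituents),
        constituents=constituents,
        form=form,
        meaning=schema,
        bindings=tuple(role_fillers.items()),
    )


cxn = hypothesize(
    ["you", "washed", "them"],
    "WASH-ACTION",
    {"washer": "addressee", "washee": "ContextElement"},
)
print(cxn.name)
for earlier, later in cxn.form:
    print(f"{earlier} before {later}")
```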

Page 31: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Grammar learning: suggesting new CxNs and reorganizing existing ones

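Learning operations:

• hypothesize
  – map form to meaning
  – learn contextual constraints
• reorganize
  – merge
  – join
  – split
• reinforcement

[Figure: the learning operations update Linguistic Knowledge; Analysis takes the Utterance, Linguistic Knowledge, World Knowledge, and the Discourse & Situational Context, and yields a Partial SemSpec.]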

Page 32: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Challenge: How far up to generalize

• Eat rice
• Eat apple
• Eat watermelon

• Want rice
• Want apple
• Want chair

Inanimate Object
  Manipulable Objects
    Food
      Fruit
        apple
        watermelon
      Savory
        rice
  Unmovable Objects
    Furniture
      Chair
      Sofa
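One simple baseline for "how far up" is the most specific category that covers all observed fillers, that is, the lowest common ancestor in the hierarchy. The sketch below computes it for the two verbs. The parent links are read off the slide's diagram, where placing Food under Manipulable Objects and Furniture under Unmovable Objects is an assumption about the flattened layout. This baseline is for illustration only and is not the learner's actual generalization criterion.

```python
# Child -> parent links, as read from the slide's category diagram (an assumption).
PARENT = {
    "apple": "Fruit",
    "watermelon": "Fruit",
    "rice": "Savory",
    "chair": "Furniture",
    "sofa": "Furniture",
    "Fruit": "Food",
    "Savory": "Food",
    "Food": "Manipulable Objects",
    "Furniture": "Unmovable Objects",
    "Manipulable Objects": "Inanimate Object",
    "Unmovable Objects": "Inanimate Object",
}


def ancestors(node):
    """Path from a node up to the root, starting with the node itself."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def lowest_common_ancestor(nodes):
    """Most specific category that dominates every node in the list."""
    first, *rest = nodes
    other_paths = [set(ancestors(node)) for node in rest]
    for candidate in ancestors(first):
        if all(candidate in path for path in other_paths):
            return candidate
    return None


print(lowest_common_ancestor(["rice", "apple", "watermelon"]))  # fillers seen with "eat"
print(lowest_common_ancestor(["rice", "apple", "chair"]))       # fillers seen with "want"
```

With these links the fillers of "eat" generalize to Food, while the single "want chair" example pushes "want" all the way up to Inanimate Object, which illustrates why the right level is hard to pick.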

Page 33: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Challenge: Omissible constituents

• In Mandarin, almost anything available in context can be omitted – and often is in child-directed speech.
• Intuition:
  – Same context, two expressions that differ by one constituent → a general construction with the constituent being omissible
• May require verbatim memory traces of utterances + “relevant” context

Page 34: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

When does the learning stop?

• Most likely grammar given utterances and context

• The grammar prior includes a preference for the “kind” of grammar

• In practice, take the log and minimize cost → Minimum Description Length (MDL)

$$\hat{G} = \operatorname*{argmax}_{G} P(G \mid U, Z) = \operatorname*{argmax}_{G} P(U \mid G, Z)\, P(G)$$
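The step from the Bayesian objective to a cost can be written out explicitly. Here G is the grammar, U the utterances, and Z the context. The derivation assumes, as the factorization on this slide does, that the prior over grammars does not depend on the context.

```latex
\begin{align*}
\hat{G} &= \operatorname*{argmax}_{G} P(G \mid U, Z) \\
        &= \operatorname*{argmax}_{G} P(U \mid G, Z)\, P(G) \\
        &= \operatorname*{argmin}_{G} \bigl[ -\log P(U \mid G, Z) - \log P(G) \bigr]
\end{align*}
```

In the usual MDL reading, the first term is the cost of encoding the utterances given the grammar and the second is the cost of encoding the grammar itself.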

Bayesian Learning Framework

[Figure: Schemas + Constructions feed Analysis + Resolution, which produces a SemSpec; Context Fitting follows, and the learning operations hypothesize, reorganize, and reinforcement update the grammar.]

Page 35: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Intuition for MDL

Specific grammar (3 rules):
• S -> Give me NP
• NP -> the book
• NP -> a book

Generalized grammar (4 rules):
• S -> Give me NP
• NP -> DET book
• DET -> the
• DET -> a

Suppose that the prior is inversely proportional to the size of the grammar (e.g. number of rules)

It’s not worthwhile to make this generalization

Page 36: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Intuition for MDL

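Specific grammar (9 rules):
• S -> Give me NP
• NP -> the book
• NP -> a book
• NP -> the pen
• NP -> a pen
• NP -> the pencil
• NP -> a pencil
• NP -> the marker
• NP -> a marker

Generalized grammar (8 rules):
• S -> Give me NP
• NP -> DET N
• DET -> the
• DET -> a
• N -> book
• N -> pen
• N -> pencil
• N -> marker

With more nouns the generalized grammar is the smaller one, so under the same prior the generalization becomes worthwhile.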
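The comparison on these two slides can be checked by counting rules. The sketch below uses the number of rules as the grammar size, following the suggestion that the prior is inversely proportional to size, and assumes the specific and generalized grammars cover the same sentences, so that only the prior distinguishes them. The function names are illustrative.

```python
def size(grammar):
    """Grammar size measured as the number of rules."""
    return len(grammar)


def generalization_pays_off(specific, generalized):
    """With prior proportional to 1/size, prefer the generalized grammar only if it is smaller."""
    return size(generalized) < size(specific)


# One noun ("book").
specific_one_noun = ["S -> Give me NP", "NP -> the book", "NP -> a book"]
generalized_one_noun = ["S -> Give me NP", "NP -> DET book", "DET -> the", "DET -> a"]

# Four nouns ("book", "pen", "pencil", "marker").
nouns = ["book", "pen", "pencil", "marker"]
specific_four_nouns = ["S -> Give me NP"] + [
    f"NP -> {det} {noun}" for noun in nouns for det in ("the", "a")
]
generalized_four_nouns = (
    ["S -> Give me NP", "NP -> DET N", "DET -> the", "DET -> a"]
    + [f"N -> {noun}" for noun in nouns]
)

print(size(specific_one_noun), size(generalized_one_noun))
print(generalization_pays_off(specific_one_noun, generalized_one_noun))
print(size(specific_four_nouns), size(generalized_four_nouns))
print(generalization_pays_off(specific_four_nouns, generalized_four_nouns))
```

The specific grammar grows by two rules per noun and the generalized one by a single rule, so the balance tips toward generalization as the vocabulary grows.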

Page 37: Embodied Construction Grammar ECG (Formalizing Cognitive  Linguistics )

Usage-based learning: comprehension and production

[Figure: usage-based learning loop.
Comprehension: utterance, analyze & resolve, analysis, simulation.
Production: comm. intent, generate, utterance, response.
Shared resources: constructicon, world knowledge, discourse & situational context.
Learning signals on both sides: reinforcement (usage), reinforcement (correction), hypothesize constructions & reorganize.]