(and computational linguistics) · Computational linguistics: computer science theory cognition algorithms Natural language processing: software development application practical

Computational linguistics: computer science, theory, cognition, algorithms

Natural language processing: software development, application, practical techniques

Computer methods and their usefulness (or uselessness) for human language processing (textual, spoken, gestural, etc.)

Implementation of techniques, procedures, algorithms for language computation

Enabling human-machine communication

Enhancing human-human communication

[Diagram: NLP shown alongside computer science, psychology/cognitive science, linguistics, math/statistics, philosophy, and communication]

• Tokenization
• Part-of-speech tagging
• Computational morphology
• Syntactic parsing
• Lexical relations
• Dialogue move engines

• Dialectizer
• Speech recognition (speech to text)
• Speech synthesis (text to speech)
• Diacritization, Romanization
• Corpus annotation (Syriac)
• Thought identification

• Question answering
• Summarization
• Natural language generation
• Machine translation
• Spoken language identification
• Spoken language translation

Humanities, natural and behavioral sciences, and engineering

Linguistics, computer science, psychology, and mathematics

Theory and practice, science and art

Models, foundations vs. corpora, data (top-down vs. bottom-up)

• Math: statistics, calculus, algebra, modelling
• Computational paradigms: connectionist, rule-based, cognitively plausible
• Linguistics: LFG, HPSG, GB, OT, CG, etc.
• Architectures: stacks, automata, networks, compilers

• Several approaches implemented, taught here
• Homegrown: analogical modeling (AM)
• State-of-the-art performance in various applications for various languages:
  ▪ Written language identification
  ▪ Part-of-speech tagging
  ▪ Morpheme boundary detection
  ▪ Named entity recognition
  ▪ Word sense disambiguation
  ▪ Shallow parsing
  ▪ Semantic role labeling
  ▪ Spoken language identification

[Diagram: Car ontology. Object sets: Car, Year, Price, Make, Mileage, Model, Feature, PhoneNr, Extension. Relationships are labeled "has" and "is for", with participation constraints such as 0..1, 1..*, and 0..*.]

Work on information extraction (data-rich text, web)

Recognition and extraction of low-level data elements

Ontology-based

Related applications: ontology generation, text similarity and classification, information integration, etc.

NSF-funded
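
As a rough illustration of ontology-based recognition of low-level data elements, the sketch below attaches one regular-expression recognizer to each of a few object sets named in the Car ontology (Year, Price, Mileage, PhoneNr). The patterns, the `extract` function, and the sample ad are assumptions invented for illustration; they are not the project's actual recognizers.

```python
import re

# Hypothetical recognizers: one regular expression per object set.
# These patterns are illustrative assumptions, not the original system's rules.
RECOGNIZERS = {
    "Year": re.compile(r"\b(19|20)\d{2}\b"),
    "Price": re.compile(r"\$\d{1,3}(?:,\d{3})*"),
    "Mileage": re.compile(r"\b\d{1,3}(?:,\d{3})*\s?(?:miles|mi\.?)", re.IGNORECASE),
    "PhoneNr": re.compile(r"\b\d{3}-\d{4}\b"),
}


def extract(text):
    """Return {object_set: [matched strings]} for every recognizer that fires."""
    found = {}
    for name, pattern in RECOGNIZERS.items():
        matches = [m.group(0) for m in pattern.finditer(text)]
        if matches:
            found[name] = matches
    return found


# Made-up sample ad.
ad = "1998 Honda Civic, 82,000 miles, $4,500 obo. Call 555-0142."
print(extract(ad))
```

Keeping the recognizers in a table keyed by object set means the extraction loop stays the same when the ontology gains or loses an object set.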

Results and issues

• Corpus of 1500 obituaries, 500 hand-annotated

• Preliminary evaluation on a few features: name, age, title, birth date, death date, death place, funeral time/location

• Results: around 80% precision, a little less on recall

• Lexicon coverage (especially place names)

• Occasional typos

• Deceased sometimes not named

• Factored lists: Pierre et Marie, son fils et belle-fille ("Pierre and Marie, his/her son and daughter-in-law")

• Anaphora resolution: Né à Paris et y décédé… ("Born in Paris and died there…")
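
For reference, precision and recall figures like those above compare extracted features against hand annotations. The sketch below computes both, plus F1, from sets of (document, feature, value) triples. The triples are invented purely for illustration and are not taken from the obituary corpus; how the original evaluation matched values is not stated.

```python
def precision_recall_f1(predicted, gold):
    """Compute precision, recall, and F1 for two sets of extracted items."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical (document id, feature, value) triples.
gold = {
    (1, "name", "Jean Dupont"),
    (1, "age", "84"),
    (1, "death place", "Paris"),
    (2, "name", "Marie Martin"),
    (2, "age", "91"),
}
predicted = {
    (1, "name", "Jean Dupont"),
    (1, "age", "84"),
    (2, "name", "Marie Martin"),
    (2, "age", "19"),  # wrong value: lowers precision
}

p, r, f = precision_recall_f1(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```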

[Figures: worked example built around the query "grandchildren of Mary Ely"; the remaining slide content is not recoverable]

• Number of facts extracted: 22,251
  ▪ 8,740 Person-BirthDate facts
  ▪ 3,803 Person-DeathDate facts
  ▪ 9,708 children facts, including
    ▪ 5,020 Child-has-parent-Person facts
    ▪ 2,394 Son-of-Person facts
    ▪ 2,294 Daughter-of-Person facts
• Number of implied grandchild facts inferred: 5,277
• Processing time: ~18 seconds per page
• CPU time: ~4 hours
• Precision: .52 (spot-checking 100 of the 22,251 facts)
• Recall: .33 & Precision: .40 (spot-checking 2 fact-filled family pages)
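
Implied grandchild facts can be pictured as a join of the child-parent relation with itself. The following is a minimal sketch under that assumption; the `infer_grandchildren` function and every name in the sample data are hypothetical, and the original system's inference rules may differ.

```python
from collections import defaultdict


def infer_grandchildren(child_parent_facts):
    """Derive (grandchild, grandparent) pairs from (child, parent) pairs."""
    children_of = defaultdict(set)
    for child, parent in child_parent_facts:
        children_of[parent].add(child)

    implied = set()
    for grandparent, children in children_of.items():
        for child in children:
            # .get() avoids adding keys to the dict while iterating over it.
            for grandchild in children_of.get(child, ()):
                implied.add((grandchild, grandparent))
    return implied


# Made-up (child, parent) facts.
facts = {
    ("Ben", "Ann"),
    ("Cal", "Ann"),
    ("Dee", "Ben"),
    ("Eli", "Ben"),
    ("Fay", "Cal"),
}
print(sorted(infer_grandchildren(facts)))
```

Because the inferred facts inherit any error in the extracted child-parent facts, their reliability is bounded by the precision of the underlying extraction.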

“Find a BBQ restaurant near the Umeda station, with typical prices under $40”

Language-Agnostic Ontology
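
One way to picture a language-agnostic representation of the request above is as a set of typed constraints that no longer depends on the wording or language of the query. The sketch below is only an assumption about what such a structure might look like; the `RestaurantQuery` and `Restaurant` classes, their field names, and the sample restaurants are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Restaurant:
    name: str
    cuisine: str
    nearest_station: str
    typical_price_usd: float


@dataclass(frozen=True)
class RestaurantQuery:
    """Constraints recovered from the request, independent of its wording."""
    cuisine: str
    near_station: str
    max_price_usd: float

    def matches(self, restaurant: Restaurant) -> bool:
        return (
            restaurant.cuisine == self.cuisine
            and restaurant.nearest_station == self.near_station
            and restaurant.typical_price_usd < self.max_price_usd  # "under $40"
        )


query = RestaurantQuery(cuisine="BBQ", near_station="Umeda", max_price_usd=40)

# Made-up candidates.
candidates = [
    Restaurant("Place A", "BBQ", "Umeda", 35),
    Restaurant("Place B", "BBQ", "Umeda", 55),
    Restaurant("Place C", "Sushi", "Umeda", 30),
    Restaurant("Place D", "BBQ", "Namba", 25),
]
hits = [r.name for r in candidates if query.matches(r)]
print(hits)
```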

Oral proficiency testing for language learners

• Sentences presented aurally, repeated back
• Carefully engineered for vocabulary level, grammatical complexity, length in syllables
• Score responses with forced alignment
• Correlate to standard testing methods
• English, French, Spanish, Japanese
• In use at language training facilities, universities, industry

• Too short: just a working-memory (WM) task with parroting
• Too long: impossible to repeat
• Too complex: even native speakers (NS) can't repeat
• Too simple: can't discriminate non-native speaker (NNS) levels
• Elicited imitation (EI) item design is a linguistic engineering task!
  ▪ Sentence length
  ▪ Sentence complexity
  ▪ Vocabulary levels
  ▪ Breadth of sampling of grammatical structures, constructions

681,925 annotated sentences of length 5-20 words
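
A length constraint such as 5-20 words can be applied with a simple filter. The sketch below assumes whitespace tokenization and uses made-up sentences; how the original corpus counted words is not stated, and the `within_length` helper is hypothetical.

```python
def within_length(sentence, min_words=5, max_words=20):
    """True if the whitespace-token count lies in [min_words, max_words]."""
    return min_words <= len(sentence.split()) <= max_words


# Made-up candidate sentences: too short, in range, too long.
candidates = [
    "Stop.",
    "The children walked to school this morning.",
    "She said that the report, which nobody on the committee had read "
    "before the meeting began, was already out of date by then.",
]
selected = [s for s in candidates if within_length(s)]
print(selected)
```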

• NLP in a cognitive modeling framework
• Goal-directed, incremental
• Machine learning
• Trying to model/mimic human performance in language tasks
• Several modalities
  ▪ Parsing
  ▪ Generation
  ▪ Translation
  ▪ Dialogue

Cognitive modeling

• Model human behavior: agent-based, goal-directed, representation of world, decomposable actions, learned skills, behaviors, expertise, memory
• Fatigue, emotion, attention, overload, confusion
• Plausible: processes, time course, constraints
• Robots: explore control, agency, interaction
• Language: cognition, acquisition, modeling, agency, incrementality, discourse/dialogue, process (parsing, lexical access, generation, translation, …)

• Develop NLP capability in Soar
• Parsing, generation, discourse/dialogue, translation, speech
• Fit models of human performance data
• Incremental, learning, agent-based
• WordNet, other resources for lexical info
• English, French, Japanese
• Use in HCI, modeling (reading, acquisition), task interactions, emotion, attention, ambiguity resolution, parser breakdown, etc.

[Diagram: two sets of Dialogue, Comprehension, and Generation components]

• Operationalize language processing of all kinds (mostly for DoD)
• Machine translation, sentiment analysis, dialect recognition, prevarication detection, etc.
• Beyond the current paradigms, language resources (cf. trained on newswire)
• MT and CLIR (A), HCI English+Arabic (B), ST English+Arabic (C), Arabic dialects (D)
• Activity E: language, agents, and robotics

Grounded language acquisition by robots

Deep semantics, visual+tactile input, experiential learning of objects, actions, and consequences

Acquires language via grounding, hypothesizing, automated reasoning

Human guides acquisition via situated, interactive instruction

Robot demonstrates understanding via performance

• Social band (10^5 to 10^7 s: days to months)
• Rational band (10^2 to 10^4 s: minutes to hours)
• Cognitive band (10^-1 to 10^1 s: 100 ms to 10 secs)
• Biological band (10^-4 to 10^-2 s: 100 μs to 10 ms)
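
The bands above span orders of magnitude of time in seconds (for example, the cognitive band runs from 10^-1 to 10^1 s). As a small illustration, the sketch below maps a duration to a band by rounding its base-10 logarithm to the nearest whole exponent. That rounding rule and the `band_for` helper are assumptions made here, not part of the original material.

```python
import math

# (name, lowest exponent, highest exponent), in powers of ten of seconds.
BANDS = [
    ("Biological", -4, -2),
    ("Cognitive", -1, 1),
    ("Rational", 2, 4),
    ("Social", 5, 7),
]


def band_for(seconds):
    """Return the band containing round(log10(seconds)), or None if outside all bands."""
    exponent = round(math.log10(seconds))
    for name, low, high in BANDS:
        if low <= exponent <= high:
            return name
    return None


for duration in (0.001, 0.3, 600, 7 * 24 * 3600):
    print(duration, band_for(duration))
```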

Put <object> in <location>
• Includes moving to <object>, picking it up, moving to <location>, opening <location> if necessary, depositing <object>, closing <location> if necessary
• Fails if another object is already in the location (or can extend to put second object in work area?)

Cook <object>
• Clears the location where the object will be cooked.
• Turns on location to correct temperature (background knowledge in semantic memory!)
• If need to preheat (oven), waits for it to preheat.
• Puts object in location.
• Waits.
• Tests temperature or other appropriate sensor (toothpick for cake?).
• Removes object from oven/stove and places on workspace.
• Turns off oven/stove.
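
The decomposition of Put <object> in <location> into primitive steps can be sketched as follows. This is a toy simulation under stated assumptions: the `Location` class, the `put` function, and the step log are hypothetical and stand in for whatever the actual robot architecture uses. Cook <object> could be composed from the same primitives plus waiting and sensing steps.

```python
class Location:
    """A place that holds at most one object and may have a door."""

    def __init__(self, name, has_door=False):
        self.name = name
        self.has_door = has_door
        self.is_open = not has_door  # doorless locations are always accessible
        self.contents = None


def put(obj, location, log):
    """Decompose 'Put <object> in <location>' into primitive steps appended to log."""
    if location.contents is not None:
        log.append(f"fail: {location.name} already holds {location.contents}")
        return False
    log.append(f"move to {obj}")
    log.append(f"pick up {obj}")
    log.append(f"move to {location.name}")
    opened_here = False
    if location.has_door and not location.is_open:
        location.is_open = True
        opened_here = True
        log.append(f"open {location.name}")
    location.contents = obj
    log.append(f"deposit {obj} in {location.name}")
    if opened_here:  # close only what this action opened
        location.is_open = False
        log.append(f"close {location.name}")
    return True


log = []
oven = Location("oven", has_door=True)
ok_first = put("cake", oven, log)
ok_second = put("pie", oven, log)  # fails: the oven is occupied
for step in log:
    print(step)
```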
