What do we need to do, to understand an utterance?


Cartoon-head figures from Jackendoff (1994), Patterns in the Mind

Speech Perception

– What are the phonemes?

– Where are the word boundaries?

Segment the auditory stream into words, made up of particular phonemes

Word Recognition

Recognize individual words, resolving ambiguities

• Meaning: Chris walked near the bank.

• Syntactic category (noun, verb, etc.):
– She saw her duck.
– Buffalo buffalo buffalo buffalo.

Syntactic Analysis

• Determine the structure of the sentence
– Constituents
– Hierarchical structure

• Buffalo buffalo buffalo buffalo.
• Put the ball in the box on the table.
– (ball in the box) on (the table)
– (ball) in (the box on the table)
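The two bracketings can be made concrete by counting parses under a toy grammar. The sketch below is only an illustration: the rule set, the category names (including the helper category `VNP`), and the lexicon are assumptions made for this example, not a grammar proposed in the lecture. It uses a small chart (CKY-style) counter over binary rules.

```python
from collections import defaultdict

# Hypothetical toy lexicon and binary rules, assumed for illustration only.
LEXICON = {"put": "V", "the": "Det", "ball": "N", "box": "N",
           "table": "N", "in": "P", "on": "P"}
RULES = [
    ("NP", "Det", "N"),    # the ball
    ("NP", "NP", "PP"),    # the box on the table
    ("PP", "P", "NP"),     # on the table
    ("VNP", "V", "NP"),    # helper category: verb plus its object
    ("VP", "VNP", "PP"),   # "put" requires an object and a location
]

def count_parses(words, goal="VP"):
    """Count the distinct parse trees for `words` rooted in `goal`."""
    n = len(words)
    chart = defaultdict(int)  # (start, end, category) -> number of trees
    for i, word in enumerate(words):
        chart[(i, i + 1, LEXICON[word])] += 1
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in RULES:
                    chart[(start, end, parent)] += (
                        chart[(start, mid, left)] * chart[(mid, end, right)]
                    )
    return chart[(0, n, goal)]

print(count_parses("put the ball in the box on the table".split()))  # 2
```

Under these assumed rules the sentence gets exactly two analyses, matching the two bracketings: one where the location is "on the table" and one where it is "in the box on the table".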

Semantic & Pragmatic Analysis

• What does the sentence mean?
– Who did what to whom?
– Truth conditions

• What is the speaker trying to convey?
– Can you pass the salt?

The Big Question:

Jackendoff (1994), Patterns in the Mind

How Do We Accomplish Linguistic Communication?

Map sounds to stored representations?

1. A bird was in the tree yesterday.

2. Are there any birds in the tree?

3. A bird might be in the tree.

4. That tree looks like a bird.

What do we need?

Linguistic Knowledge as Categorical Rules for structure-building

Jackendoff (1994), Patterns in the Mind

Related Questions

• How do those categorical rules map onto cognitive processes?

• Are the same cognitive processes used to produce speech as to understand it?

• Does modality (reading/listening) matter?

• Are our brains specialized for language?

If I form a hypothesis about language understanding/production, how would I test it? What would count as data? Let’s give it a shot…

History

• Philosophical interest in language processing & language acquisition goes back to ancient times
– E.g., Aristotle on relations among thought, language, & external world

• Modern experimental approach quite new (most paradigms developed within last 50 years)

• Modern theories reflect contributions from modern linguistic theory, cognitive psychology, computer science, & cognitive neuroscience
– Human language is unique in the animal kingdom

Early Psychological study of Language

• Wundt (1832-1920)
– One of psychology’s founding fathers
– Primarily used introspection for studying mental behavior, but one of the first to use reaction time (RT)
– Published on language in 1911

Wundt’s (1911) hypotheses about Language

• Sentence (defined intuitively) is the primary unit of language
– “Leave!” is one; “Days of the week” is not
– “I filled the water with bottle.”

• Production converts a thought into a sequential string of sounds

• Comprehension is simply the reverse

What’s wrong with Wundt’s Approach?

• Is introspection a reliable, replicable, objective tool for science?

• Do Wundt’s hypotheses lead to any clear predictions about behavior?

• Constructs (e.g., sentence) not carefully defined

• The stimuli/inputs for production and the response/outputs for comprehension are neither well-defined nor directly observable.
– How can one develop a hypothesis and test it experimentally?

Would a different approach be more productive?

• Need to develop hypotheses that lead to clear predictions about behavior
– Link environmental conditions to observable behavior

• Need experimental techniques that can be clearly described and replicated in different laboratories.

Behaviorism

Dominant paradigm in psychology 1927-1960. (Skinner, Pavlov)

– How often does a behavior occur and with what intensity?

– All behavior shaped by the environment using classical and operant conditioning.

– No mental representations

– Introspection devalued

Skinner’s (1957) Verbal Behavior

• Language is a difficult phenomenon for a behaviorist account.
– In 1934, at a dinner party, philosopher A. N. Whitehead challenged Skinner to “account for my behavior as I sit here saying ‘No black scorpion is falling upon this table.’”
– Skinner began the book the next morning, and spent 20+ years working on it.

• Skinner often called this book his most important work

Skinner’s (1957) Verbal Behavior

• Emphasis on production, rather than comprehension

• A sentence is a chain of associative links, “like beads on a string”
– There – is – no – black – scorpion…

• Speech is a learned response to environmental stimuli (reinforcement, punishment)
– Use of plural nouns increases if reinforced with “mmm-hmm” (Greenspoon, 1954, 1955)
– Proportion of opinion statements increases if paraphrased/agreed with (Verplanck, 1955)

Experimental Evidence

Language research during the Reign of Behaviorism

• Operant studies (e.g., Verplanck)

• Classical Conditioning experiments (e.g., Stroop)

• Practical research, much of which was funded by the Defense Department
– George Miller: understanding speech in noisy radio transmissions

George Miller’s Lab

• Interested in speech and hearing

• Trained as a behaviorist in the 1940s

• In the 1950s investigated radio-based communication
– How high does the signal-to-noise ratio need to be, for adequate transmission of the message?

• Amount of noise
• Characteristics of the message
• Characteristics of the speaker

How do we (the military) ensure adequate transmission of the message?

One strategy: Limit the vocabulary/possible messages
– Digits are easy: 0-9 have 8 different nuclear vowels (only 5 and 9 share their vowel)
– Nonsense syllables are the opposite extreme: need to hear each phoneme clearly

George’s Ground-Breaking Findings

Miller et al. (1951)

Miller & Selfridge (1950) demonstrated analogous pattern in free-recall test.

Why are words in sentences easier to perceive and easier to remember than words in lists?

What does “sentence-advantage” mean?

Miller et al. (1951) maintain that sentences effectively restrict the number of alternative words, similarly to small vocabularies.

“In 1951, I apparently still hoped to gain scientific respectability by swearing allegiance to behaviorism. Five years later, inspired by such colleagues as Noam Chomsky and Jerry Bruner [a social psychologist], I had stopped pretending to be a behaviorist.” (Miller, 2003)

Miller (1962) re-examines Miller et al. (1951)

• Words in a sentence are not as distinct as words in isolation…
– Less carefully pronounced (splice-test)
– Words run together

• Why is there no extra cost for these?

• Speech rate of 2-3 words per second leaves little time for deducing set of alternatives after each word

• “Reduction of alternatives” explanation is inadequate

Miller (1962)


Evolution of Speech and Language

August 4, 2009

How did human vocal tract evolve?

All mammals produce vocal sounds in essentially the same way…
Source – Oscillator (voicing) – Filter (formants)

YouTube - vocal tract model synthesis
YouTube - Vocal formants

In human speech, formants are the most informative parameter. They make speech intelligible. E.g., whispered speech lacks voicing and pitch, but has normal formants

Human speech requires fine, rapid motor control during articulation

http://www.youtube.com/watch?v=wR41CRbIjV4

http://www.youtube.com/watch?v=s5ypALATOLI

Role of formants in animal communication

Primates & birds perceive formants as accurately as humans

– Individual identification via vocal signature

– Provide cues to body size of “speaker”

Differences between ape & human vocal tracts

1. Human larynx lowers in throat during 1st yr of life
– Allows more tongue movement, for broad range of discriminable formant patterns
– Lowers formant frequencies, giving impression of larger size

2. Human oral cavity shorter, nasal cavity bigger

3. Humans lack laryngeal air sacs
– Little known about function

Vocal Imitation

• Except for humans, primates are poor at this.
• Apes raised like human kids
• Monkeys raised with other species
• Little evidence for learned vocal behavior

• Humans clearly learn language(s)
• Human whistling
• Human bird calls
• Human imitation of animal noises

Vocal Imitation

• Whales, seals and dolphins are somewhat better than most primates.
– Whales learn their songs

• Passerine Birds are terrific at this, even cross-species
– Songbirds learn their songs
– Mockingbirds learn other species’ songs, as well as environmental sounds (insects, car alarms, etc.)

– Parrots can mimic human speech and even specific voices

– Irene Pepperberg has trained African Grey Parrots to use human speech communicatively

Primates are poor candidates for production of spoken language

• Lack of rapid, fine motor control of vocal articulation

• Structure of the vocal tract

• Limited ability for vocal imitation

Ape Language studies (1950’s to present)

• No luck training chimpanzees to produce spoken language

• Some success with manual/visual “languages.”

• Chimpanzees, gorillas, & bonobos approximate linguistic skill of a 3-yr old human

• Ceiling on language potential?

Koko (gorilla.org)

Gorilla trained in sign language by Penny Patterson

Video Clip

Has Koko acquired a language? What evidence is necessary to answer this question?

http://www.koko.org/world/kokoflix.php?date=2007-08-01

Summing Up: How is human language special?

• Vocal tract anatomy
• Vocal imitation
• Rapid, fine motor control of articulators
• Creative recombination of phonemes, morphemes, words for expression of nearly any thought

But what about Koko? Compare Koko to Nicaraguan deaf kids

Spontaneous emergence of Nicaraguan Sign Language

• Clip from “Birth of a Language”

• How is this signed communication the same/different from Koko’s?

• What kind of tests would you need to conduct to compare them?

http://www.pbs.org/wgbh/evolution/library/07/2/l_072_04.html

Special Features of Human Speech

• Specialized vocal tract: Broad range of formants for producing many distinct sounds

• Vocal imitation/Social Learning

• Rapid, fine articulation

• Hierarchical structure

• Rule Learning

Hierarchical Structure of Lg

Human speech has hierarchical structure, which is necessary to produce utterances of arbitrary complexity. Structure is distinct from content (specific phonemes).

Syllable = (onset) + rhyme
Rhyme = nucleus + (coda)
Onset = one or more consonants
Nucleus = one or more vowels
Coda = one or more consonants
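The syllable rules above can be sketched as a small parser. This is a rough illustration under loud assumptions: ordinary letters stand in for phonemes, the five vowel letters stand in for vowels, and the function name `parse_syllable` is made up for this example.

```python
import re

# Assumption: letters approximate phonemes; a, e, i, o, u approximate vowels.
VOWELS = "aeiou"

# Syllable = (onset) + rhyme; Rhyme = nucleus + (coda)
SYLLABLE = re.compile(
    rf"^(?P<onset>[^{VOWELS}]*)"   # optional onset: zero or more consonants
    rf"(?P<nucleus>[{VOWELS}]+)"   # obligatory nucleus: one or more vowels
    rf"(?P<coda>[^{VOWELS}]*)$"    # optional coda: zero or more consonants
)

def parse_syllable(text):
    """Return the hierarchical structure of one syllable, or None if no fit."""
    match = SYLLABLE.match(text.lower())
    if match is None:
        return None
    onset, nucleus, coda = match.group("onset", "nucleus", "coda")
    return {"onset": onset, "rhyme": {"nucleus": nucleus, "coda": coda}}

print(parse_syllable("strand"))
# {'onset': 'str', 'rhyme': {'nucleus': 'a', 'coda': 'nd'}}
```

The same template accepts very different content ("strand", "a", "at"), which is the sense in which structure is distinct from the specific phonemes that fill it. A string with no vowel, or with two separate vowel groups, does not fit the one-syllable template and returns None.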

Compositionality and the Rate of Data Transmission

• Small set of phonemes can be recombined very productively (but in a constrained way) to form morphemes.
– Morpheme = one or more syllables (meaning unit)
– Signed language morphemes are also made up of “phonological” constituents (e.g., hand shape, movement, location)

• Morphemes can be combined productively (but constrained) to create words.
– (prefix) + stem + (suffix)
– Stem = (prefix) + stem + (suffix)

• Words can be combined productively to create utterances.

Hierarchical Structure of Lg

Syntactic Hierarchies & Center-embedding

• The man read Chaucer.
• The man who the woman despised read Chaucer.
• The man who the woman the children loved despised read Chaucer.
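The nesting in these sentences can be sketched with a short recursive function. This is a minimal illustration, assuming each subject is paired with its own verb phrase; the function name `center_embed` is hypothetical, and the relative pronoun "who" is left out to keep the recursion bare.

```python
def center_embed(subjects, verb_phrases):
    """Build a center-embedded clause.

    subjects[i] is the subject of verb_phrases[i]; each later clause is
    nested between the earlier subject and that subject's verb phrase.
    """
    head, rest = subjects[0], subjects[1:]
    if not rest:
        return f"{head} {verb_phrases[0]}"
    inner = center_embed(rest, verb_phrases[1:])
    return f"{head} {inner} {verb_phrases[0]}"

subjects = ["the man", "the woman", "the children"]
verb_phrases = ["read Chaucer", "despised", "loved"]

for depth in range(1, 4):
    print(center_embed(subjects[:depth], verb_phrases[:depth]))
# the man read Chaucer
# the man the woman despised read Chaucer
# the man the woman the children loved despised read Chaucer
```

Each subject has to be held until its verb arrives, and the verbs come out in the reverse order of their subjects. That last-in, first-out dependency is what a purely linear chain of word-to-word associations does not capture.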

What are the preconditions for learning hierarchical structure?

1. Fixed sequences (linear order):
– Idioms & stock phrases (once upon a time) are fixed word sequences
– Words are fixed phoneme sequences

2. Statistical Learning is probably important for:
– discovering words in speech stream
– identifying syntactic category and subcategory of words
– resolving lexical and syntactic ambiguity

– Within phrases (e.g. NP), there is a predictable ordering of categories (e.g., “the” predicts a noun in the next word or two)

Predictive constraints on sequences may allow us to learn hierarchical relationships

Coding the probabilities of sequences aids word segmentation prior to lexical knowledge

Jenny Saffran & colleagues
Marc Hauser & colleagues
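One way to picture how sequence probabilities could support segmentation is to compute the transitional probability from each syllable to the next and posit a word boundary wherever it dips. The sketch below is an illustration only: the three nonsense words, the stream order, the 0.9 threshold, and the function names are all assumptions made for this example, not the stimuli or analysis of any particular study.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """TP(x -> y) = count(x followed by y) / count(x followed by anything)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(x, y): c / first_counts[x] for (x, y), c in pair_counts.items()}

def segment(syllables, threshold=0.9):
    """Insert a word boundary wherever the transitional probability is low."""
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for x, y in zip(syllables, syllables[1:]):
        if tp[(x, y)] < threshold:
            words.append("".join(current))
            current = []
        current.append(y)
    words.append("".join(current))
    return words

# Hypothetical nonsense words, each made of three two-letter syllables.
A, B, C = "bidaku", "padoti", "golabu"
order = [A, B, C, A, C, B, A, A, B, B, C, C]
stream = [w[i:i + 2] for w in order for i in (0, 2, 4)]

print(segment(stream) == order)  # True
```

Within a word each syllable fully predicts the next (probability 1.0), while across word boundaries the next syllable varies, so the dips line up with the boundaries even though the learner has no lexicon.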

Shared neural underpinnings to syntax & sequence learning?

• Broca’s aphasics who have severe syntactic deficits also exhibit deficits in sequence learning (Christiansen et al., 2001 unpubl)

• Incongruent musical sequences elicit P600’s, just like syntactic anomalies (Patel et al., 1998)

• MEG shows that Broca’s area is involved in processing music sequences (Maess et al., 2001)

• All higher organisms must learn about sequential events. How does human sequential learning compare with that of other primates?

What limits the kinds of rules that Tamarins can learn about sequences?

“The rat the cat the dog bit chased died.”

Fitch & Hauser (2004) suggest that tamarins can master Finite State Grammars, but not Phrase Structure Grammars

Types of Grammars

All human languages allow for an infinite number of different utterances. What kinds of grammars allow this?

Finite State Grammars: A finite number of states (e.g., words, calls, syntactic categories), with rules for getting from one state to the next.

[Figure: finite-state diagram; word nodes: the, an, boring, easy, class, A+, really, satisfies, disappoints, sucks, me]

FSG’s provide rules for concatenation
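A finite state grammar over the word nodes listed above can be sketched as a transition table. The arrows of the original diagram are not recoverable here, so the state names and every transition below are assumptions chosen to give plausible sentences, including a hypothetical loop on "really".

```python
# Hypothetical transitions: state -> {word: next state}. Assumed, not recovered
# from the original diagram.
TRANSITIONS = {
    "START": {"the": "DET", "an": "DET"},
    "DET": {"boring": "ADJ", "easy": "ADJ"},
    "ADJ": {"class": "SUBJ", "A+": "SUBJ"},
    "SUBJ": {
        "really": "SUBJ",          # assumed loop: "really really ..."
        "satisfies": "VERB",
        "disappoints": "VERB",
        "sucks": "END",
    },
    "VERB": {"me": "END"},
}

def accepts(sentence):
    """Walk the table one word at a time; accept only if we land on END."""
    state = "START"
    for word in sentence.split():
        state = TRANSITIONS.get(state, {}).get(word)
        if state is None:
            return False
    return state == "END"

print(accepts("the boring class really disappoints me"))   # True
print(accepts("an easy A+ really really satisfies me"))    # True
print(accepts("the class boring sucks"))                   # False
```

Each step depends only on the current state and the next word, which is all "rules for concatenation" amounts to. With the assumed loop, five states already license an unbounded number of sentences, but nothing in the table can relate a word to another word arbitrarily far away.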

Types of Grammars

A phrase structure grammar allows for long distance dependencies.

(Last year, (Demi Moore (took (that cute dumb guy who’s about 20 years old from “That 70’s Show”) out for a while))).

NP took out NP.
NP took NP out.
NP = NP + PP
NP = NP + S

The intervening NP can have an arbitrary amount of internal complexity

Types of Grammars

Phrase structure grammars allow for center embedded constructions:

((The rat ((the cat (the dog bit _cat ))chased _rat)) died.)

The water someone I know carried spilled.

F&H 2004

A simple FSG could have two categories of “words”, A & B, with the rules that A must follow B and vice versa: (AB)^n

A: no ba la wu, etc. (hi pitch)
B: li pa mo, etc. (low pitch)

No li ba pa
Ba pa ba pa ba pa…
*No ba la mo

Tamarins & humans easily learn the simple finite state grammar. In F&H, were the learning conditions comparable across the 2 species?

F&H 2004

A simple phrase structure grammar might require equal numbers of A and B syllables: A^nB^n

A: no ba la wu, etc. (hi pitch)
B: li pa mo, etc. (low pitch)

((The rat ((the cat (the dog bit _cat ))chased _rat)) died.)
The water someone I know carried spilled.
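The contrast between the two artificial grammars can be sketched as two string checks: strict alternation of A and B syllables versus a block of A syllables followed by an equally long block of B syllables. The function names are hypothetical, and only the syllables listed on the slides are included (the "etc." items are not filled in).

```python
# Syllable classes as listed on the slides; further items are omitted.
A_SYLLABLES = {"no", "ba", "la", "wu"}   # high pitch
B_SYLLABLES = {"li", "pa", "mo"}         # low pitch

def class_string(utterance):
    """Map each syllable to its class label: 'A', 'B', or '?' if unknown."""
    labels = []
    for syllable in utterance.lower().split():
        if syllable in A_SYLLABLES:
            labels.append("A")
        elif syllable in B_SYLLABLES:
            labels.append("B")
        else:
            labels.append("?")
    return "".join(labels)

def is_alternating(utterance):
    """Finite state pattern: A B repeated n times, n >= 1."""
    labels = class_string(utterance)
    n = len(labels) // 2
    return n >= 1 and labels == "AB" * n

def is_matched_blocks(utterance):
    """Phrase structure pattern: n A's followed by n B's, n >= 1."""
    labels = class_string(utterance)
    n = len(labels) // 2
    return n >= 1 and labels == "A" * n + "B" * n

print(is_alternating("no li ba pa"))     # True
print(is_alternating("no ba la mo"))     # False
print(is_matched_blocks("no ba li pa"))  # True
```

The alternating check only ever needs to know the class of the previous syllable. The matched-blocks check has to compare the number of A syllables with the number of B syllables, and for unbounded n that count cannot be tracked with a fixed, finite set of states.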

What were the results for humans and tamarins in the PSG condition? What are the implications?

Why does human performance surpass monkey performance?

1. Humans have UG
2. Humans have general cognitive abilities superior to monkeys (Look for evidence of this in non-language context.)
– Sequences can be learned via linear associations (as in FSG) or by learning ordinal positions of items (e.g., syllable structure, word position in a sentence).
• Ordinal sequence learning in rhesus monkeys (Chen et al., 1997)
• Strategies for nesting cups

Chen et al. (1997)

• Monkeys learned 4 sets of 4 pictures each on touch-screen.

• Trained to press pics w/in each set in a particular order. (Spatial config random.)

• Then the 16 pics were reorganized into new sets & monkeys re-trained.
– In “maintained” sets, the pictures were in the same ordinal slot [A,B,C,D].

– In “changed” sets, the pictures were in a new ordinal slot.

– Maintained sets were much easier to learn, suggesting that ordinal position had been encoded.

Problem-solving strategies: do they involve hierarchical planning?

Human speech probably evolved as a result of …

• Rapid, fine motor control of articulators [frontal lobe, hypoglossal nerve]

• Ability to analyze sounds in terms of hierarchical structure [Broca’s area? UG?]

• Changes in the vocal tract & enhanced role of formants

• Increased ability to imitate auditory input [arcuate fasciculus?]
