Timbre perception
Harvard-MIT Division of Health Sciences and Technology
HST.725: Music Perception and Cognition
Prof. Peter Cariani
www.cariani.com


Timbre perception

• Timbre: tonal quality (≠ pitch, loudness, duration or location)
• Defines separate voices, musical coloration
• Multidimensional space: not completely well understood
• Two general aspects: spectrum & dynamics
• Stationary spectrum
  – Spectral center of gravity ("brightness")
  – Formant structure
  – Harmonicity
• Amplitude-frequency-phase dynamics
  – Amplitude dynamics (attack, decay)
    • amplitude modulation (roughness)
  – Frequency dynamics
    • relative timings of onsets and offsets of partials
    • frequency modulation (vibrato)
  – Phase dynamics (noise, phase coherence, chorus effect)
• Analogy with phonetic distinctions in speech
  • Vowels (stationary spectra; formant structure)
  • Consonants (dynamic contrasts: amplitude, frequency & noise)
• Temporal integration windows and timbral fusion
• Some neural correlates
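The spectral center of gravity listed above can be illustrated with a small sketch. It assumes the common definition of the centroid as the amplitude-weighted mean frequency of the partials; the function names and the 1/k-style rolloff used to build the test tones are hypothetical choices for illustration, not something taken from the lecture.

```python
def spectral_centroid(freqs_hz, amplitudes):
    """Amplitude-weighted mean frequency of a set of partials, in Hz."""
    total = sum(amplitudes)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * a for f, a in zip(freqs_hz, amplitudes)) / total


def harmonic_spectrum(f0_hz, n_partials, rolloff):
    """Hypothetical test tone: partial k has amplitude 1 / k**rolloff."""
    freqs = [f0_hz * k for k in range(1, n_partials + 1)]
    amps = [1.0 / k ** rolloff for k in range(1, n_partials + 1)]
    return freqs, amps


# Same pitch (F0 = 220 Hz), different spectral tilt.
bright = spectral_centroid(*harmonic_spectrum(220.0, 10, rolloff=1.0))
dull = spectral_centroid(*harmonic_spectrum(220.0, 10, rolloff=2.0))
print(f"bright: {bright:.0f} Hz, dull: {dull:.0f} Hz")
```

Both tones share the same fundamental, so under this sketch only the balance of energy across partials moves the centroid.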


Timbre ~ sonic texture, tone color

Please see Paul Cezanne, Apples, Peaches, Pears, and Grapes (c. 1879-80); oil on canvas, 38.5 x 46.5 cm; The Hermitage, St. Petersburg, at http://www.ibiblio.org/wm/paint/auth/cezanne/sl/


Texture
Roughness


Stationary and dynamic factors in timbre perception

• Periodicity (noise-like or tone-like)
  – Harmonicity (is this properly an aspect of timbre?)
  – Phase coherence (noise: incoherent; tones: coherent)
  – Smoothness or roughness
• Stationary spectrum
  – Formant structure
• Amplitude-frequency-phase dynamics
  – Amplitude dynamics (attack, decay)
    • amplitude modulation (roughness)
  – Frequency dynamics
    • relative timings of onsets and offsets of partials
    • frequency modulation (vibrato)
  – Phase dynamics (phase shifts, chorus effect)
• Analogy with phonetic distinctions in speech
  • Vowels (stationary spectra; formant structure)
  • Consonants (dynamic contrasts: amplitude, frequency & noise)


Stationary and dynamic factors in timbre perception

• Stationary spectrum
  – Formant structure; harmonicity
• Amplitude-frequency-phase dynamics
  – Amplitude dynamics (attack, decay)
    • amplitude modulation (roughness)
  – Frequency dynamics
    • relative timings of onsets and offsets of partials
    • frequency modulation (vibrato)
  – Phase dynamics (noise, phase coherence, chorus effect)
• Analogy with phonetic distinctions in speech
  • Vowels (stationary spectra; formant structure)
  • Consonants (dynamic contrasts: amplitude, frequency & noise)
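Amplitude modulation and frequency modulation, two of the dynamic factors listed above, can be sketched as a sine carrier whose envelope and instantaneous frequency are each varied sinusoidally. The function name and the modulation rates and depths below (a 6 Hz vibrato, a 70 Hz amplitude modulation) are illustrative assumptions chosen for the example, not values given in the lecture.

```python
import math


def modulated_tone(carrier_hz, dur_s, sr=8000,
                   am_rate_hz=0.0, am_depth=0.0,
                   fm_rate_hz=0.0, fm_dev_hz=0.0):
    """Sine carrier with optional sinusoidal amplitude and frequency modulation."""
    samples = []
    phase = 0.0
    for n in range(int(dur_s * sr)):
        t = n / sr
        # Instantaneous frequency swings around the carrier (vibrato when slow).
        inst_hz = carrier_hz + fm_dev_hz * math.sin(2 * math.pi * fm_rate_hz * t)
        phase += 2 * math.pi * inst_hz / sr
        # Envelope swings around 1 (amplitude modulation).
        env = 1.0 + am_depth * math.sin(2 * math.pi * am_rate_hz * t)
        samples.append(env * math.sin(phase))
    return samples


# Assumed example settings: slow FM for vibrato, faster AM for a rough-sounding tone.
vibrato = modulated_tone(440.0, 0.5, fm_rate_hz=6.0, fm_dev_hz=10.0)
rough = modulated_tone(440.0, 0.5, am_rate_hz=70.0, am_depth=1.0)
```

Accumulating phase sample by sample keeps the waveform continuous while the frequency changes, which is why the sketch does not simply evaluate `sin(2*pi*f(t)*t)`.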


Timbre: a multidimensional tonal quality

Tone texture, tone color; distinguishes voices, instruments

• Stationary aspects (spectrum): vowels
• Dynamic aspects (∆ spectrum, ∆ intensity, ∆ pitch, attack, decay): consonants

Photos courtesy of Pam Roth, Per-Ake Bystrom, and Miriam Lewis.

http://www.wikipedia.org/


Harmonicity, frequency dynamics

Rafael A. Irizarry's Music and Statistics Demo: spectrograms

• Harmonic instruments: violin, trumpet, guitar (more harmonic, stationary spectra)
• Non-harmonic instruments: marimba, timpani, gong (more inharmonic, time-varying spectra)

http://www.biostat.jhsph.edu/~ririzarr/Demo/demo.html
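One simple way to put a number on the harmonic versus inharmonic contrast is to measure how far each partial lies from the nearest integer multiple of a reference fundamental. The measure below is a hypothetical illustration, not a standard index from the lecture; the membrane mode ratios are the textbook values for an ideal circular membrane and are used only as an example of an inharmonic partial series.

```python
def inharmonicity(partials_hz, f0_hz):
    """Mean distance of each partial from the nearest harmonic of f0,
    expressed as a fraction of f0 (0 means perfectly harmonic)."""
    deviations = []
    for f in partials_hz:
        nearest = max(1, round(f / f0_hz))
        deviations.append(abs(f - nearest * f0_hz) / f0_hz)
    return sum(deviations) / len(deviations)


# Harmonic series: integer multiples of a 200 Hz fundamental.
harmonic = [200.0 * k for k in range(1, 6)]
# Assumed inharmonic series: ideal circular membrane mode ratios.
membrane = [200.0 * r for r in (1.0, 1.59, 2.14, 2.30, 2.65)]

print(inharmonicity(harmonic, 200.0), inharmonicity(membrane, 200.0))
```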


Some methods for studying the perceptual space

1. Try to derive the structure of the space from the dimensionality of responses
  • Similarity magnitude estimations
  • Similarity rankings
  • Multidimensional scaling

2. Systematically vary acoustic parameters known to influence timbre to find acoustic correlates of perceptual dimensions, e.g.
  • Formant structure
  • Attack and decay parameters
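The first approach can be sketched with classical (Torgerson) multidimensional scaling, which turns a matrix of pairwise dissimilarities into point coordinates. This is a minimal standard-library sketch under stated assumptions: classical scaling is used here as a simple stand-in (the studies cited in these slides may have used other MDS variants), the matrix is assumed symmetric, and the "ratings" are generated from hidden made-up coordinates rather than from listener data.

```python
import math


def classical_mds(dist, n_dims=2, n_iter=500):
    """Classical (Torgerson) scaling: coordinates whose distances approximate dist."""
    n = len(dist)
    # Double-center the squared dissimilarities: B = -1/2 * J * D^2 * J.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    def times_b(v):
        return [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]

    coords = [[0.0] * n_dims for _ in range(n)]
    for d in range(n_dims):
        # Power iteration for the leading eigenvector of the current matrix.
        v = [math.sin(i + 1.0 + d) for i in range(n)]  # arbitrary start vector
        for _ in range(n_iter):
            w = times_b(v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        eigval = sum(x * y for x, y in zip(v, times_b(v)))
        scale = math.sqrt(max(eigval, 0.0))
        for i in range(n):
            coords[i][d] = scale * v[i]
        # Deflate: remove this component so the next pass finds the next one.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# Hypothetical example: dissimilarities generated from hidden 2-D positions.
hidden = [(0.0, 0.0), (3.0, 0.0), (3.0, 1.0), (0.0, 1.0)]
dist = [[math.dist(p, q) for q in hidden] for p in hidden]
coords = classical_mds(dist)
```

The recovered configuration matches the hidden one only up to rotation, reflection, and translation, which is why interpreting the axes requires the second approach: relating them to acoustic parameters.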


Grey (1975) Timbre: Perceptual dimensions

Figure from Butler, David. The Musician's Guide to Perception and Cognition. Schirmer, 1992.

Also see: Grey, J., and J. Moorer. "Perceptual Evaluations of Synthesized Musical Instrument Tones." J. Acoustical Society of America 63 (1977): 1493-1500.


Timbre dimensions: spectrum, attack, decay


Music based on timbral contrasts

Kurt Schwitters, Ur Sonata (1932), perf. George Melly, Miniatures

Miniatures [Pipe/Cherry Red] Performer(s): Various Artists

Label: Cbc Records/Musica Viva (Can) Catalog: #1043

Audio CD (January 19, 1997) Number of Discs: 1 ASIN: B000003WYQ


Stationary spectral aspects of timbre

[Figure: waveforms (time, 0-20 ms), power spectra (frequency, 0-4 kHz), and autocorrelations (interval, 0-15 ms) for the vowels [ae] and [er], each at F0 = 100 Hz and F0 = 125 Hz. Pitch periods (1/F0) are marked for 100 Hz and 125 Hz; the formant-related structure corresponds to vowel quality (timbre).]
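The relation between autocorrelation peaks and the pitch period 1/F0 can be sketched as follows. The signals are hypothetical harmonic complexes with made-up amplitude profiles (not the vowels in the figure), and picking the largest autocorrelation value within an assumed 50-400 Hz search range is one simple estimator among many.

```python
import math


def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation r[lag] for lag = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def pitch_from_autocorrelation(x, sr, fmin_hz=50.0, fmax_hz=400.0):
    """Estimate F0 as sr divided by the lag of the largest autocorrelation value."""
    min_lag = int(sr / fmax_hz)
    max_lag = int(sr / fmin_hz)
    r = autocorrelation(x, max_lag)
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
    return sr / best_lag


def harmonic_complex(f0_hz, amps, sr=8000, dur_s=0.1):
    """Sum of harmonics k*f0 with the given amplitudes (hypothetical test signal)."""
    return [sum(a * math.sin(2 * math.pi * f0_hz * (k + 1) * n / sr)
                for k, a in enumerate(amps))
            for n in range(int(sr * dur_s))]


# Same F0, two different spectral envelopes (two different "timbres").
tone_a = harmonic_complex(100.0, [1.0, 0.8, 0.6, 0.4, 0.2])
tone_b = harmonic_complex(100.0, [0.2, 0.4, 0.6, 0.8, 1.0])
print(pitch_from_autocorrelation(tone_a, 8000),
      pitch_from_autocorrelation(tone_b, 8000))
```

In this sketch the position of the main peak follows F0, while the shape of the autocorrelation at shorter lags differs between the two tones because their spectral envelopes differ.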


Formants and the vocal tract

Heed Hid Head Had

Hod Hawed Hood Who’d


Vowel F1-F2 space

[Figure: the vowels of heed, hid, head, had, heard, hud, hod, hood, who'd, and hawed plotted on logarithmic formant axes. One axis is ticked at log[300], log[400], log[600], and log[800]; the other runs from log[750] to log[2250] in steps of 250.]
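A point in log F1-F2 space can be assigned to the nearest vowel with a few lines of code. The formant values below are assumptions for illustration: they are roughly the adult-male averages commonly quoted from Peterson and Barney (1952), not values read off the figure, and nearest-neighbor matching is a deliberately simple stand-in for vowel identification.

```python
import math

# Illustrative (F1, F2) values in Hz; assumed, not taken from the slide.
VOWEL_FORMANTS = {
    "heed": (270.0, 2290.0),
    "hid": (390.0, 1990.0),
    "head": (530.0, 1840.0),
    "had": (660.0, 1720.0),
    "heard": (490.0, 1350.0),
    "hud": (640.0, 1190.0),
    "hod": (730.0, 1090.0),
    "hawed": (570.0, 840.0),
    "hood": (440.0, 1020.0),
    "who'd": (300.0, 870.0),
}


def log_formant_distance(a, b):
    """Euclidean distance between two (F1, F2) points on log-frequency axes."""
    return math.hypot(math.log(a[0] / b[0]), math.log(a[1] / b[1]))


def nearest_vowel(f1_hz, f2_hz, table=VOWEL_FORMANTS):
    """Return the key word whose formant pair is closest in log F1-F2 space."""
    return min(table,
               key=lambda word: log_formant_distance((f1_hz, f2_hz), table[word]))


print(nearest_vowel(280.0, 2200.0))
```

Working in log frequency means a given ratio change in a formant counts the same anywhere on the axis, which matches the logarithmic axes of the plot.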


Time-domain analysis of auditory-nerve fiber firing rates. Hugh Secker-Walker & Campbell Searle, J. Acoust. Soc. Am. 88(3), 1990
Neural responses to /da/ @ 69 dB SPL from Miller and Sachs (1983)

[Figure: responses of fibers arranged from low CFs to high CFs, plotted against peristimulus time (ms); F1, F2, and F3 are marked.]

Reprinted with permission from Secker-Walker HE, Searle CL. 1990. Time-domain analysis of auditory-nerve-fiber firing rates. J. Acoust. Soc. Am. 88 (3): 1427-36. Copyright 1990, Acoustical Society of America. Used with permission.


Population-interval coding of timbre (vowel formant structure)

[Figure, vowels: signal autocorrelation for [ae] (magnitude) and population interval histogram (# intervals), both against interval (0-15 ms); the voice pitch, 1/F1, and the formant structure are marked.]

[Figure: population-wide distributions of short intervals (0-5 ms) for 4 vowels: [α], [u], [∋], [æ].]
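A population interval histogram can be sketched by pooling all-order interspike intervals (intervals between every pair of spikes, not only adjacent ones) across fibers. The spike generator below is a toy assumption: each hypothetical fiber fires on a random subset of cycles of a single period with a little jitter. It is not a model of real auditory-nerve responses, and the 10 ms and 2 ms periods simply stand in for 1/F0 at 100 Hz and 1/F1 at 500 Hz.

```python
import random


def all_order_interval_histogram(spike_times_ms, max_interval_ms=15.0, bin_ms=0.5):
    """Histogram of intervals between all spike pairs; bin k is centered on k * bin_ms."""
    counts = [0] * (int(round(max_interval_ms / bin_ms)) + 1)
    times = sorted(spike_times_ms)
    for i, t0 in enumerate(times):
        for t1 in times[i + 1:]:
            interval = t1 - t0
            if interval >= max_interval_ms:
                break  # times are sorted, so later spikes are even further away
            counts[int(interval / bin_ms + 0.5)] += 1
    return counts


def phase_locked_spikes(period_ms, dur_ms, p_fire=0.5, jitter_ms=0.1, seed=0):
    """Toy fiber: fires on a random subset of stimulus cycles, with timing jitter."""
    rng = random.Random(seed)
    spikes = []
    t = 0.0
    while t < dur_ms:
        if rng.random() < p_fire:
            spikes.append(t + rng.gauss(0.0, jitter_ms))
        t += period_ms
    return spikes


def population_histogram(fibers):
    """Sum the all-order interval histograms of several fibers."""
    hists = [all_order_interval_histogram(s) for s in fibers]
    return [sum(col) for col in zip(*hists)]


# Hypothetical fibers locked to a 10 ms period (stand-in for 1/F0 at 100 Hz).
f0_fibers = [phase_locked_spikes(10.0, 1000.0, seed=s) for s in range(20)]
f0_pop = population_histogram(f0_fibers)

# Hypothetical fibers locked to a 2 ms period (stand-in for 1/F1 at 500 Hz).
formant_fibers = [phase_locked_spikes(2.0, 1000.0, seed=100 + s) for s in range(20)]
formant_pop = population_histogram(formant_fibers)
```

In this toy setup the first group contributes intervals near the pitch period, while the second fills the short-interval region at multiples of its period, loosely mirroring the 1/F0 and n/F1 labels in the figures.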


Coding of vowel quality (timbre)

(Reprinted with permission from Secker-Walker HE, Searle CL. 1990. Time-domain analysis of auditory-nerve-fiber firing rates. J. Acoust. Soc. Am. 88 (3): 1427-36. Copyright 1990, Acoustical Society of America.)


Please see Hirahara, Cariani, Delgutte (1996)


Please see Figures 6 and 7 in Hirahara, Cariani, Delgutte (1996)


Spectrum as a function of intensity (trumpet)

Please see Figure 4-3 in Butler, David. The Musician’s Guide to Perception and Cognition. New York: Schirmer Books ; Toronto: Maxwell Macmillan Canada, New York: Maxwell Macmillan International, c1992.


Singer's formant

Graph, Fig 11.12 on page 138 of Cook, Perry, ed. Music, Cognition & Computerized Sound MIT Press 2001. Used with permission.



Amplitude dynamics (envelope, intensity contour)
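An amplitude envelope and a simple attack-time estimate can be sketched as below. The synthetic notes (linear attack, exponential decay), the 5 ms RMS frames, and the 10%-to-90% rise criterion are all assumptions chosen for illustration; other envelope followers and attack definitions are in common use.

```python
import math


def synth_note(attack_s, decay_tau_s=0.3, freq_hz=400.0, dur_s=0.5, sr=8000):
    """Sine tone with a linear attack followed by an exponential decay."""
    out = []
    for n in range(int(dur_s * sr)):
        t = n / sr
        if t < attack_s:
            env = t / attack_s
        else:
            env = math.exp(-(t - attack_s) / decay_tau_s)
        out.append(env * math.sin(2 * math.pi * freq_hz * t))
    return out


def rms_envelope(x, frame_len=40):
    """Root-mean-square level in consecutive non-overlapping frames."""
    frames = [x[i:i + frame_len]
              for i in range(0, len(x) - frame_len + 1, frame_len)]
    return [math.sqrt(sum(s * s for s in f) / frame_len) for f in frames]


def attack_time_s(x, sr=8000, frame_len=40, lo=0.1, hi=0.9):
    """Time for the RMS envelope to rise from lo to hi of its peak value."""
    env = rms_envelope(x, frame_len)
    peak = max(env)
    first_lo = next(i for i, e in enumerate(env) if e >= lo * peak)
    first_hi = next(i for i, e in enumerate(env) if e >= hi * peak)
    return (first_hi - first_lo) * frame_len / sr


# Two hypothetical notes that differ only in how quickly they build up.
fast_attack = synth_note(attack_s=0.01)
slow_attack = synth_note(attack_s=0.10)
print(attack_time_s(fast_attack), attack_time_s(slow_attack))
```

The frame length of 40 samples at 8 kHz spans exactly two periods of the 400 Hz tone, which keeps the frame-to-frame RMS from fluctuating with the carrier; that choice is specific to this example.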


Frequency dynamics of note onsets (clarinet)

Please see Figure 4-4 in Butler, David. The Musician’s Guide to Perception and Cognition. New York: Schirmer Books ; Toronto: Maxwell Macmillan Canada, New York: Maxwell Macmillan International, c1992. ISBN: 0028703413.


Time-course of harmonics

Please see Figure 3 in Deutsch, D., ed. The Psychology of Music. San Diego: Academic Press, 1999.


Speech neurogram

Please see Delgutte, B. "Auditory Neural Processing of Speech." In The Handbook of Phonetic Sciences. Edited by W. J. Hardcastle and J. Laver. London: Blackwell, 1995.


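Possible interval-based neural correlates for basic phonetic distinctions

(Table columns: characteristic; acoustic distinction; phonetic class; examples; interval correlates)

• Voice pitch (80-400 Hz)
  – Acoustic distinction: pitch contours, ∆ over time
  – Phonetic class: voice pitch, F0; prosody
  – Interval correlates: most common interval; running interval ∆
• Voice onset time
  – Phonetic class: VOT
  – Interval correlates: prominent interval between onset/offset responses
• Spectral pattern
  – Acoustic distinction: stationary low-frequency formant pattern; nasal resonances
  – Phonetic class: vowels; nasals
  – Examples: [u], [ae], [i]; [m], [n]
  – Interval correlates: intervals for periodicities 50-5000 Hz
• Spectro-temporal pattern
  – Acoustic distinction: formant transitions (fast transition; slow transition)
  – Phonetic class: consonants; semivowels; diphthongs
  – Examples: [b], [d], [g]; [w], [r], [y]; [ay], [aw], [ey]
  – Interval correlates: cross-BF intervals (?); timing of FM responses (?); slow ∆ in interval distr.; low freq modulations; interactions
• Spectral dispersion
  – Acoustic distinction: noise-excitation (frication)
  – Phonetic class: fricative consonants
  – Examples: /f/, /s/, /ʃ/, /v/, /θ/
  – Interval correlates: semi-periodic temporal struct.; phase incoherence
• Voiced-unvoiced
  – Acoustic distinction: voiced/unvoiced; whispered/voiced
  – Phonetic class: stop consonants; fricatives
  – Examples: [b]/[p]; [v]/[f]
  – Interval correlates: presence of harmonic structure in intervals; degree of interval dispersion
• Dynamic amplitude patterns
  – Acoustic distinction: amplitude time profiles; abrupt/gradual ∆ (buildup / decay)
  – Phonetic class: affricate/fricative
  – Examples: /tʃ/ vs /ʃ/; chip vs ship
  – Interval correlates: adaptation + running interval buildup patterns (autocorrelation ∆ shape)
• Rhythm
  – Acoustic distinction: metrical aspects
  – Phonetic class: word rhythm; speaking rate
  – Interval correlates: longer interval patterns (50-500 msec)
• Duration
  – Acoustic distinction: duration
  – Interval correlates: prominent interval between onset & offset responders
• Suprasegmental structure
  – Acoustic distinction: word time pattern
  – Phonetic class: whole word patterns
  – Interval correlates: longer time structures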


Two-formant vowel sweeps

[Figure: stimulus trajectories in formant space, first formant (Hz; ticks at 250, 500, 750) against second formant (Hz; ticks at 1000, 1500, 2000), with vowel labels i, ∋, æ, I. Two trajectories are shown: a high F2 formant sweep ([i] [æ] [i]) and a low F2 formant sweep ([u] [a] [u]).]

[Figures: interspike interval (ms; 0-15) against peristimulus time (ms; 0-2000) for responses to the two sweeps, with interval bands marked 1/F0, 1/F1, and n/F1. Units shown:]

• Auditory nerve fiber 35-60. CF: 1.4 kHz, Thr: 2.0, SR: 90.7 (low F2 and high F2 sweeps)
• DCN Pauser. CF: 1.3 kHz, Thr: 24.8, SR: 0.0
• PVCN Chop-S. CF: 2.1 kHz, Thr: 5.3, SR: 17.7
• AVCN Pri-N. CF: 1.5 kHz, Thr: 8.8, SR: 247.1


GUY-BUY-DIE

[Figure: two time-frequency displays of the utterance. One has axes ticked 500-3000 and 50-300; the other plots time (0-1.6) against frequency (0-5000).]


Timbre: summary

• Time-invariant properties (static)
  – Stationary spectrum
  – Relatively well-understood & characterized
• Time-varying properties (dynamic)
  – Amplitude dynamics (envelope)
  – Frequency dynamics (spectral changes, vibrato)
  – Phase shifts (chorus effect & electronic contexts)
  – Relatively poorly understood & characterized


Reading/assignment for next meeting

• Tuesday, March 2
• Consonance, dissonance, and roughness
• Reading:
  – Deutsch, Rasch & Plomp chapter re: beats, combination tones, and consonance
  – also Burns chapter on intervals & scales (look at section on consonance)