Basic Acoustics + Digital Signal Processing
September 11, 2014
Road Map!
• For today:
• Part 1: Go through a review of the basics of (analog) acoustics.
• Part 2: Converting sound from analog to digital format.
• Any questions so far?
Part 1: An Acoustic Dichotomy
• Acoustically speaking, there are two basic kinds of sounds:
1. Periodic
• = an acoustic pattern which repeats over time
• The “period” is the length of time it takes for the pattern to repeat
• Periodic speech sounds = voiced segments + trills
2. Aperiodic
• Continuous acoustic energy which does not exhibit a repeating pattern
• Aperiodic speech sounds = fricatives
The Third Wheel
• There are also acoustic transients.
• = aperiodic speech sounds which are not continuous
• i.e., they are usually very brief
• Transient speech sounds:
• stop release bursts
• clicks
• also (potentially) individual pulses in a trill
• Let’s look at the acoustic properties of each type of sound in turn…
[Figures: example words “Pin” and “Fad”]
Acoustics: Basics
• How is a periodic sound transmitted through the air?
• Consider a bilabial trill:
What does sound look like?
• Air consists of floating air molecules
• Normally, the molecules are suspended and evenly spaced apart from each other
• What happens when we push on one molecule?
What does sound look like?
• The force knocks that molecule against its neighbor
• The neighbor, in turn, gets knocked against its neighbor
• The first molecule bounces back past its initial rest position
What does sound look like?
• The initial force gets transferred on down the line
[Figure labels: rest position #1; rest position #2]
• The first two molecules swing back to meet up with each other again, in between their initial rest positions
• Think: bucket brigade
Compression Wave
• A wave of force travels down the line of molecules
• Ultimately: individual molecules vibrate back and forth, around an equilibrium point
• The transfer of force sets up what is called a compression wave.
• What gets “compressed” is the space between molecules
• Check out what happens when we blow something up!
Compression Wave
[Figure labels: area of high pressure (compression); area of low pressure (rarefaction)]
• Compression waves consist of alternating areas of high and low pressure
Pressure Level Meters
• Microphones
• Have diaphragms, which move back and forth with air pressure variations
• Pressure variations are converted into electrical voltage
• Ears
• Eardrums move back and forth with pressure variations
• Amplified by components of middle ear
• Eventually converted into neurochemical signals
• We experience fluctuations in air pressure as sound
Measuring Sound
• What if we set up a pressure level meter at one point in the wave?
[Figure: pressure level meter at one point in the wave; axis: time]
Sine Waves
• The reading on the pressure level meter will fluctuate between high and low pressure values
• In the simplest case, the variations in pressure level will look like a sine wave.
[Figure: sine wave; x-axis: time; y-axis: pressure]
Other Basic Sinewave concepts
• Sinewaves are periodic; i.e., they recur over time.
• The period is the amount of time it takes for the pattern to repeat itself.
• A cycle is one repetition of the acoustic pattern.
• The frequency is the number of times, within a given timeframe, that the pattern repeats itself.
• Frequency = 1 / period
• usually measured in cycles per second, or Hertz
• The peak amplitude is the maximum amount of vertical displacement in the wave
• = maximum (or minimum) amount of pressure
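The relationships above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `sine_wave`, the 100 Hz frequency, the 0.5 peak amplitude, and the 8000 Hz sampling rate are all illustrative assumptions.

```python
import math

def sine_wave(frequency_hz, peak_amplitude, duration_s, sample_rate_hz):
    """Return pressure values of a sinewave taken at regular time steps."""
    n_samples = int(round(duration_s * sample_rate_hz))
    return [
        peak_amplitude * math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]

# Illustrative values (assumptions, not from the slides)
frequency_hz = 100.0
period_s = 1.0 / frequency_hz      # Frequency = 1 / period, so period = 1 / frequency
wave = sine_wave(frequency_hz, peak_amplitude=0.5, duration_s=0.02, sample_rate_hz=8000)

cycles = 0.02 / period_s           # number of cycles in the 20 ms excerpt
peak = max(abs(v) for v in wave)   # largest displacement from zero in either direction

print(f"period = {period_s} s, cycles = {cycles}, peak amplitude = {peak:.3f}")
```

A 100 Hz sinewave has a period of 0.01 s, so a 20 ms excerpt holds two cycles.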
Waveforms
• A waveform plots air pressure on the y axis against time on the x axis.
Phase Shift
• Even if two sinewaves have the same period and amplitude, they may differ in phase.
• Phase essentially describes where in the sinewave cycle the wave begins.
• For a single sinewave, this doesn’t affect the way that we hear the waveform.
• Check out: sine waves vs. cosine waves!
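The sine/cosine comparison can be sketched in a few lines: a cosine wave is a sine wave whose phase is shifted by a quarter of a cycle (π/2 radians). The function name `sinewave_value` and the 50 Hz frequency are assumptions chosen for illustration.

```python
import math

def sinewave_value(t, frequency_hz, phase_rad=0.0):
    """Value of a unit-amplitude sinewave at time t (seconds) with a starting phase."""
    return math.sin(2 * math.pi * frequency_hz * t + phase_rad)

times = [n / 1000 for n in range(10)]   # ten time points, 1 ms apart

sine = [sinewave_value(t, 50) for t in times]
shifted = [sinewave_value(t, 50, phase_rad=math.pi / 2) for t in times]
cosine = [math.cos(2 * math.pi * 50 * t) for t in times]

# The phase-shifted sine and the cosine agree up to floating-point rounding
max_diff = max(abs(a - b) for a, b in zip(shifted, cosine))
print(max_diff < 1e-9)
```

The two waves have the same period and amplitude; only the starting point in the cycle differs.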
Complex Waves
• It is possible to combine more than one sinewave together into a complex wave.
• At any given time, each wave will have some amplitude value.
• A1(t1) := Amplitude value of sinewave 1 at time 1
• A2(t1) := Amplitude value of sinewave 2 at time 1
• The amplitude value of the complex wave is the sum of these values.
• Ac(t1) = A1(t1) + A2(t1)
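The point-by-point sum Ac(t1) = A1(t1) + A2(t1) can be sketched as follows. The helper names, the component frequencies and amplitudes, and the time point `t1` are illustrative assumptions.

```python
import math

def sinewave(t, frequency_hz, amplitude):
    """Amplitude value of one sinewave at time t (seconds)."""
    return amplitude * math.sin(2 * math.pi * frequency_hz * t)

def complex_wave(t, components):
    """Sum the values of all component sinewaves at time t."""
    return sum(sinewave(t, f, a) for f, a in components)

# (frequency in Hz, peak amplitude) pairs; values chosen only for illustration
components = [(100, 1.0), (1000, 0.25)]

t1 = 0.0012
a1 = sinewave(t1, 100, 1.0)       # A1(t1)
a2 = sinewave(t1, 1000, 0.25)     # A2(t1)
ac = complex_wave(t1, components) # Ac(t1)

print(a1, a2, ac)
```

Repeating the sum at every time point yields the whole complex waveform.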
Complex Wave Example
• Take waveform 1:
• high amplitude
• low frequency
• Add waveform 2:
• low amplitude
• high frequency
• The sum is this complex waveform:
[Figure: waveform 1 + waveform 2 = complex waveform]
A Real-Life Example
• 480 Hz tone
• 620 Hz tone
• the combo = ?
Spectra
• One way to represent complex waves is with waveforms:
• y-axis: air pressure
• x-axis: time
• Another way to represent a complex wave is with a power spectrum (or spectrum, for short).
• Remember, each sinewave has two parameters:
• amplitude
• frequency
• A power spectrum shows:
• amplitude on the y-axis
• frequency on the x-axis
One Way to Look At It
• Combining 100 Hz and 1000 Hz sinewaves results in the following complex waveform:
[Figure: waveform; x-axis: time; y-axis: amplitude]
The Other Way
• The same combination of 100 Hz and 1000 Hz sinewaves results in the following power spectrum:
[Figure: power spectrum; x-axis: frequency; y-axis: amplitude]
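One way to compute such a spectrum is a discrete Fourier transform (DFT). The sketch below uses a naive DFT written with the standard library only; it is an illustration rather than the method used to make the slides. The amplitudes (1.0 and 0.5), the 4000 Hz sampling rate, and the function name `amplitude_spectrum` are assumptions.

```python
import cmath
import math

def amplitude_spectrum(samples, sample_rate_hz):
    """Naive DFT: return (frequency_hz, amplitude) pairs up to half the sample rate."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(
            samples[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n)
        )
        # Scale so that a sinewave of amplitude A shows up as roughly A in its bin
        scale = 1 if k in (0, n // 2) else 2
        spectrum.append((k * sample_rate_hz / n, scale * abs(total) / n))
    return spectrum

sample_rate_hz = 4000
n_samples = 400   # 0.1 s of signal, so frequency bins are 10 Hz apart

# 100 Hz sinewave (amplitude 1.0) plus 1000 Hz sinewave (amplitude 0.5)
samples = [
    1.0 * math.sin(2 * math.pi * 100 * i / sample_rate_hz)
    + 0.5 * math.sin(2 * math.pi * 1000 * i / sample_rate_hz)
    for i in range(n_samples)
]

spectrum = amplitude_spectrum(samples, sample_rate_hz)
peaks = [(f, round(a, 3)) for f, a in spectrum if a > 0.01]
print(peaks)
```

Only two bins carry energy, one at 100 Hz and one at 1000 Hz, which is why the spectrum of this complex wave shows just two lines.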
The Third Way
• A spectrogram shows how the spectrum of a complex sound changes over time.
• Intensity (related to amplitude) is represented by shading in the z-dimension.
[Figure: spectrogram; x-axis: time; y-axis: frequency; bands at 100 Hz and 1000 Hz]
Fundamental Frequency
• One last point about periodic sounds:
• Every complex wave has a fundamental frequency (F0).
• = the frequency at which the complex wave pattern repeats itself.
• This frequency happens to be the greatest common divisor (GCD) of the frequencies of the component waves.
• Example: the GCD of 100 Hz and 1000 Hz is 100 Hz. (boring!)
• GCD of 480 and 620 Hz is 20 Hz.
• GCD of 600 and 800 Hz is 200 Hz, etc.
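The arithmetic in these examples can be reproduced with the standard library. This is a small sketch that assumes the component frequencies are whole numbers of Hz; the function name `fundamental_frequency` is made up for illustration.

```python
from functools import reduce
from math import gcd

def fundamental_frequency(component_frequencies_hz):
    """Greatest common divisor of integer component frequencies, in Hz."""
    return reduce(gcd, component_frequencies_hz)

print(fundamental_frequency([100, 1000]))  # 100
print(fundamental_frequency([480, 620]))   # 20
print(fundamental_frequency([600, 800]))   # 200
```

For 480 Hz and 620 Hz, the combined pattern repeats 20 times per second: in each repetition the 480 Hz component completes 24 cycles and the 620 Hz component completes 31.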
Aperiodic sounds
• Not all sounds are periodic
• Aperiodic sounds are noisy
• Their pressure values vary randomly over time
[Figure: “white noise”]
• Interestingly:
• White noise sounds the same, no matter how fast or slow you play it.
Fricatives
• Fricatives are aperiodic speech sounds
[Figures: [s] and [f]]
Aperiodic Spectra
• The power spectrum of white noise has component frequencies of random amplitude across the board:
Aperiodic Spectrogram
• In an aperiodic sound, the values of the component frequencies also change randomly over time.
Transients
• A transient is:
• “a sudden pressure fluctuation that is not sustained or repeated over time.”
• An ideal transient waveform:
A Transient Spectrum
• An ideal transient spectrum is perfectly flat:
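The flatness of an ideal transient's spectrum can be verified with a naive DFT of a single-sample impulse. This is a sketch using only the standard library; the 64-sample length, the impulse position, and the function name `dft_magnitudes` are arbitrary assumptions.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitude of every DFT bin, computed the slow but transparent way."""
    n = len(samples)
    return [
        abs(sum(samples[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n)))
        for k in range(n)
    ]

# An idealized transient: silence everywhere except one sample
impulse = [0.0] * 64
impulse[5] = 1.0

magnitudes = dft_magnitudes(impulse)
is_flat = all(abs(m - 1.0) < 1e-9 for m in magnitudes)
print(is_flat)
```

Every frequency bin has the same magnitude, so no frequency is emphasized over any other.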
As a matter of fact
• Note: white noise and a pure transient are idealizations
• We can create them electronically…
• But they are not found in pure form in nature.
• Transient-like natural sounds include:
• Hand clapping
• Finger snapping
• Drum beats
• Tongue clicking
Click Waveform
[Figure: click waveform; labels: initial impulse, some periodic reverberation]
Click Spectrum
• Reverberation emphasizes some frequencies more than others
Click Spectrogram
[Figure: click spectrogram; labels: initial impulse, some periodic reverberation]
Part 2: Analog and Digital
• In “reality”, sound is analog.
• variations in air pressure are continuous
• = it has an amplitude value at all points in time.
• and there are an infinite number of possible air pressure values.
• Back in the bad old days, acoustic phonetics was strictly an analog endeavor.
[Figure: analog clock]
Part 2: Analog and Digital
• In the good new days, we can represent sound digitally in a computer.
• In a computer, sounds must be discrete.
• everything = 1 or 0
[Figure: digital clock]
• Computers represent sounds as sequences of discrete pressure values at separate points in time.
• Finite number of pressure values.
• Finite number of points in time.
Analog-to-Digital Conversion
• Recording sounds onto a computer requires an analog-to-digital conversion (A-to-D)
• When computers record sound, they need to digitize analog readings in two dimensions:
X: Time (this is called sampling)
Y: Amplitude (this is called quantization)
[Figure: sampling along the time axis; quantization along the amplitude axis]
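The two steps of A-to-D conversion can be sketched together. This is a minimal illustration under stated assumptions: the input is a function of time with values between -1.0 and 1.0, the output uses signed integer levels (16-bit here), and the names `sample_and_quantize` and `tone` are hypothetical.

```python
import math

def sample_and_quantize(signal, duration_s, sample_rate_hz, bit_depth):
    """Sample a continuous function of time, then round each value to an integer level."""
    max_level = 2 ** (bit_depth - 1) - 1          # e.g. 32767 for 16-bit audio
    n_samples = int(round(duration_s * sample_rate_hz))
    digital = []
    for n in range(n_samples):
        t = n / sample_rate_hz                    # sampling: discrete points in time
        value = signal(t)                         # assumed to lie between -1.0 and 1.0
        digital.append(round(value * max_level))  # quantization: discrete amplitude values
    return digital

def tone(t):
    """A 100 Hz sinewave standing in for the analog signal."""
    return math.sin(2 * math.pi * 100 * t)

samples = sample_and_quantize(tone, duration_s=0.01, sample_rate_hz=2000, bit_depth=16)
print(samples)
```

The result is a finite list of integers: a finite number of points in time, each holding one of a finite number of pressure values.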
Sampling Example
[Figure: sampled waveform with samples marked “o”; x-axis: nominal time (0 to 100); y-axis: amplitude (-10000 to 10000)]
Thanks to Chilin Shih for making these materials available.
Sampling Rate
• Sampling rate = frequency at which samples are taken.
• What’s a good sampling rate for speech?
• Typical options include:
• 22050 Hz, 44100 Hz, 48000 Hz
• sometimes even 96000 Hz and 192000 Hz
• Higher sampling rate preserves sound quality.
• Lower sampling rate saves disk space.
• (which is no longer much of an issue)
• Young, healthy human ears are sensitive to sounds from 20 Hz to 20,000 Hz
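The disk-space trade-off can be estimated with simple arithmetic. The sketch below assumes uncompressed, 16-bit, single-channel audio; bit depth and channel count are not given in the slides, and the function name `uncompressed_size_bytes` is made up.

```python
def uncompressed_size_bytes(sample_rate_hz, duration_s, bit_depth=16, channels=1):
    """Storage needed for raw samples: rate x duration x bytes per sample x channels."""
    return int(sample_rate_hz * duration_s * (bit_depth // 8) * channels)

# One minute of audio at each of the typical sampling rates
for rate in (22050, 44100, 48000, 96000, 192000):
    megabytes = uncompressed_size_bytes(rate, duration_s=60) / 1_000_000
    print(f"{rate:>6} Hz: {megabytes:.1f} MB per minute")
```

Under these assumptions, doubling the sampling rate doubles the file size, and one minute at 44100 Hz takes about 5.3 MB.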
One Consideration
• The Nyquist Frequency
• = highest frequency component that can be captured with a given sampling rate
• = one-half the sampling rate
Problematic Example:
• 100 Hz sound
• 100 Hz sampling rate
[Figure: 100 Hz wave with samples 1, 2, 3, one per cycle]
[Photo: Harry Nyquist (1889-1976)]
Nyquist’s Implication
• An adequate sampling rate has to be…
• at least twice as high as the highest frequency component in the signal that you’d like to capture.
• 100 Hz sound
• 200 Hz sampling rate
[Figure: 100 Hz wave with samples 1 to 6, two per cycle]
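Both cases can be reproduced numerically. The sketch below samples a 100 Hz cosine at 100 Hz and at 200 Hz. A cosine is an assumption chosen so that the samples land on peaks and troughs; a sine starting at zero, sampled at exactly twice its frequency, would yield all zeros, which is one reason to sample somewhat above the minimum in practice. The function names are illustrative.

```python
import math

def nyquist_frequency(sample_rate_hz):
    """Highest frequency that a given sampling rate can capture."""
    return sample_rate_hz / 2

def sample_cosine(frequency_hz, sample_rate_hz, n_samples):
    """Sample a unit-amplitude cosine; values are rounded to hide float noise."""
    return [
        round(math.cos(2 * math.pi * frequency_hz * n / sample_rate_hz), 6)
        for n in range(n_samples)
    ]

too_slow = sample_cosine(100, 100, 3)      # one sample per cycle
just_enough = sample_cosine(100, 200, 6)   # two samples per cycle

print(too_slow)      # every sample is identical, so the wave looks like a constant
print(just_enough)   # samples alternate between peak and trough
print(nyquist_frequency(44100))
```

At a 100 Hz sampling rate the 100 Hz tone is indistinguishable from a flat line, while at 200 Hz the alternation between high and low pressure is preserved.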
Sampling Rate Demo
• Speech should be sampled at 44100 Hz or higher
• (although there is little frequency information in speech above 10,000 Hz)
• 44100 Hz
• 22050 Hz
• 11025 Hz (watch out for [s])
• 8000 Hz
• 5000 Hz