Digital Audio Signal Processing
Lecture-2
Microphone Array Processing
Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven
homes.esat.kuleuven.be/~moonen/
Digital Audio Signal Processing Version 2015-2016 Lecture-2: Microphone Array Processing p. 2
Overview
• Introduction & beamforming basics
  – Data model & definitions
• Fixed beamforming
  – Filter-and-sum beamformer design
  – Matched filtering
    • White noise gain maximization
    • Ex: Delay-and-sum beamforming
  – Superdirective beamforming
    • Directivity maximization
  – Directional microphones (delay-and-subtract)
• Adaptive Beamforming
  – LCMV beamformer
  – Generalized sidelobe canceler
p. 3
Introduction
• Directivity pattern of a microphone
  – A microphone (*) is characterized by a `directivity pattern’ which specifies the gain & phase shift that the microphone gives to a signal coming from a certain direction (i.e. `angle-of-arrival’)
  – In general the directivity pattern is a function of frequency (ω)
  – In a 3D scenario `angle-of-arrival’ is azimuth + elevation angle
  – Will consider only 2D scenarios for simplicity, with one angle-of-arrival (θ), hence directivity pattern is H(ω,θ)
  – Directivity pattern is fixed and defined by physical microphone design
(*) We do digital signal processing, so this includes front-end filtering/A-to-D/..
[Figure: |H(ω,θ)| for 1 frequency]
p. 4
Introduction
• Virtual directivity pattern
  – By weighting or filtering (=freq. dependent weighting) and then summing signals from different microphones, a (software controlled) virtual directivity pattern (=weighted sum of individual patterns) can be produced
    $$H_{\text{virtual}}(\omega,\theta)=\sum_{m=1}^{M}F_m(\omega)\,H_m(\omega,\theta)$$
  – This assumes all microphones receive the same signals (so are all in the same position). However…
[Figure: virtual directivity pattern vs. frequency and Angle (deg)]
p. 5
Introduction
• However, in a microphone array different microphones are in different positions/locations, hence also receive different signals
• Example : uniform linear array, i.e. microphones placed on a line & uniform inter-micr. distances (d) & ideal micr. characteristics (p.9)
  For a far-field source signal (plane wavefronts), each microphone receives the same signal, up to an angle-dependent delay…
  $$y_m[k]=y_1[k-\tau_m],\qquad \tau_m(\theta)=\frac{d_m\cos\theta}{c}\,f_s,\qquad d_m=(m-1)\,d$$
  fs=sampling rate, c=propagation speed (so τ_m is expressed in samples)
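The delay formula for the uniform linear array can be tried out numerically. The sketch below is only an illustration: the function name `ula_delays`, the value c = 340 m/s and the array parameters (M = 5, d = 3 cm, fs = 16 kHz, borrowed from the examples later in the lecture) are assumptions.

```python
import numpy as np

def ula_delays(M, d, theta, fs, c=340.0):
    """Delays tau_m(theta) in samples for an M-microphone uniform linear array."""
    d_m = np.arange(M) * d                 # d_m = (m-1) d, microphone 1 is the reference
    return d_m * np.cos(theta) / c * fs    # tau_m = d_m cos(theta) / c * fs

tau_endfire = ula_delays(M=5, d=0.03, theta=0.0, fs=16000.0)
tau_broadside = ula_delays(M=5, d=0.03, theta=np.pi / 2, fs=16000.0)
print(tau_endfire)      # grows linearly with the microphone index
print(tau_broadside)    # numerically zero: all microphones receive the same signal
```

For broadside incidence all delays vanish, for end-fire incidence they are largest, and in general they are fractional (non-integer) numbers of samples.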
p. 6
Introduction
• Beamforming = `spatial filtering’ based on
microphone characteristics (directivity patterns)
AND microphone array configuration (`spatial sampling’)
• Classification:
Fixed beamforming: data-independent, fixed filters Fm
e.g. delay-and-sum, filter-and-sum
Adaptive beamforming: data-dependent filters Fm
e.g. LCMV-beamformer, generalized sidelobe canceler
p. 7
Introduction
• Background/history: ideas borrowed from antenna array design and processing for radar & (later) wireless communications
• Microphone array processing considerably more difficult than antenna array processing:
  – narrowband radio signals versus broadband audio signals
  – far-field (plane wavefronts) versus near-field (spherical wavefronts)
  – pure-delay environment versus multi-path environment
• Applications: voice controlled systems (e.g. Xbox Kinect), speech communication systems, hearing aids,…
p. 8
Data model and definitions
Data model: source signal in far-field (see p.13 for near-field)
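• Microphone signals are filtered versions of source signal S(ω) at angle θ
  $$Y_m(\omega,\theta)=\underbrace{H_m(\omega,\theta)}_{\text{dir. pattern}}\cdot\underbrace{e^{-j\omega\tau_m(\theta)}}_{\text{pos.-dep. phase shift}}\cdot S(\omega)$$
• Stack all microphone signals (m=1..M) in a vector
  $$\mathbf{Y}(\omega,\theta)=\mathbf{d}(\omega,\theta)\,S(\omega),\qquad \mathbf{d}(\omega,\theta)=\begin{bmatrix}H_1(\omega,\theta)\,e^{-j\omega\tau_1(\theta)} & \dots & H_M(\omega,\theta)\,e^{-j\omega\tau_M(\theta)}\end{bmatrix}^T$$
  d is `steering vector’
• Output signal after `filter-and-sum’ is
  $$Z(\omega,\theta)=\sum_{m=1}^{M}F_m^{*}(\omega)\,Y_m(\omega,\theta)=\mathbf{F}^H(\omega)\,\mathbf{Y}(\omega,\theta)=\{\mathbf{F}^H(\omega)\,\mathbf{d}(\omega,\theta)\}\,S(\omega)$$
  H instead of T for convenience (**)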
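The data model can be written down in a few lines. This is a minimal sketch for omnidirectional microphones (H_m = 1) on a uniform linear array; the function names, c = 340 m/s and all numerical values are assumptions made for illustration.

```python
import numpy as np

def steering_vector(omega, theta, M=5, d=0.03, fs=16000.0, c=340.0):
    """d(omega, theta) for omnidirectional microphones (H_m = 1) on a uniform linear array."""
    tau = np.arange(M) * d * np.cos(theta) / c * fs   # delays in samples
    return np.exp(-1j * omega * tau)

def filter_and_sum_response(F, omega, theta):
    """F^H(omega) d(omega, theta): transfer function from S to Z."""
    return np.vdot(F, steering_vector(omega, theta, M=len(F)))  # vdot conjugates F

omega = 2 * np.pi * 1000.0 / 16000.0    # 1 kHz expressed in rad/sample
F = np.ones(5) / 5                      # plain averaging of the 5 microphone signals
H_broadside = filter_and_sum_response(F, omega, np.pi / 2)
H_endfire = filter_and_sum_response(F, omega, 0.0)
print(abs(H_broadside), abs(H_endfire))
```

With plain averaging, a broadside source is passed with unit gain while an end-fire source is attenuated, which is the simplest instance of `spatial filtering’.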
p. 9
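Data model and definitions
Data model: source signal in far-field
  $$\mathbf{Y}(\omega,\theta)=\mathbf{d}(\omega,\theta)\,S(\omega)$$
• If all microphones have the same directivity pattern $H_0(\omega,\theta)$, steering vector can be factored as…
  $$\mathbf{d}(\omega,\theta)=\underbrace{H_0(\omega,\theta)}_{\text{dir. pattern}}\cdot\underbrace{\begin{bmatrix}1 & e^{-j\omega\tau_2(\theta)} & \dots & e^{-j\omega\tau_M(\theta)}\end{bmatrix}^T}_{\text{spatial positions}}$$
  microphone-1 is used as a reference (=arbitrary)
• Will often consider arrays with ideal omni-directional microphones : $H_0(\omega,\theta)=1$
  Example : uniform linear array, see p.5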
p. 10
Data model and definitions
Definitions: (1)
• In a linear array (p.5) : θ=90° = broadside direction, θ=0° = end-fire direction
• Array directivity pattern (compare to p.3) = `transfer function’ for source at angle θ ( -π<θ<π )
  $$H(\omega,\theta)=\frac{Z(\omega,\theta)}{S(\omega)}=\mathbf{F}^H(\omega)\,\mathbf{d}(\omega,\theta)$$
• Steering direction = angle θ with maximum amplification (for 1 freq.)
  $$\theta_{\max}(\omega)=\arg\max_{\theta}\,|H(\omega,\theta)|$$
• Beamwidth (BW) = region around θ_max with (e.g.) amplification > -3dB (for 1 freq.)
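These definitions can be evaluated on an angle grid. The sketch below scans the array directivity pattern of a hypothetical 5-microphone uniform linear array with uniform weights at 4 kHz and reads off the steering direction and the -3 dB beamwidth; the array parameters, the grid resolution and c = 340 m/s are assumptions.

```python
import numpy as np

M, d, fs, c = 5, 0.03, 16000.0, 340.0
omega = 2 * np.pi * 4000.0 / fs                     # 4 kHz in rad/sample
F = np.ones(M) / M                                  # uniform weights

theta = np.linspace(0.0, np.pi, 1801)               # 0.1 degree grid
tau = np.outer(np.cos(theta), np.arange(M)) * d / c * fs
D = np.exp(-1j * omega * tau)                       # row i holds d(omega, theta_i)^T
H = D @ F.conj()                                    # F^H d for every angle

theta_max = theta[np.argmax(np.abs(H))]             # steering direction
mainlobe = theta[np.abs(H) > 1 / np.sqrt(2)]        # angles within -3 dB of the maximum
beamwidth_deg = np.degrees(mainlobe.max() - mainlobe.min())
print(np.degrees(theta_max), beamwidth_deg)
```

Taking the extreme angles of the -3 dB set as the beamwidth only works when no sidelobe or grating lobe exceeds -3 dB, which holds for this particular configuration.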
p. 11
Data model and definitions
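Data model: source signal + noise
• Microphone signals are corrupted by additive noise
  $$\mathbf{Y}(\omega,\theta)=\mathbf{d}(\omega,\theta)\,S(\omega)+\mathbf{N}(\omega),\qquad \mathbf{N}(\omega)=\begin{bmatrix}N_1(\omega) & N_2(\omega) & \dots & N_M(\omega)\end{bmatrix}^T$$
• Define noise correlation matrix as
  $$\boldsymbol{\Phi}_{\text{noise}}(\omega)=E\{\mathbf{N}(\omega)\,\mathbf{N}^H(\omega)\}$$
• Will assume noise field is homogeneous, i.e. all diagonal elements of noise correlation matrix are equal :
  $$\Phi_{ii}(\omega)=\phi_{\text{noise}}(\omega),\quad \forall i$$
• Then noise coherence matrix is
  $$\boldsymbol{\Gamma}_{\text{noise}}(\omega)=\frac{1}{\phi_{\text{noise}}(\omega)}\,\boldsymbol{\Phi}_{\text{noise}}(\omega)=\begin{bmatrix}1 & \dots & \dots\\ \vdots & \ddots & \vdots\\ \dots & \dots & 1\end{bmatrix}$$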
p. 12
Data model and definitions
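Definitions: (2)
• Array Gain = improvement in SNR for source at angle θ ( -π<θ<π )
  $$G(\omega,\theta)=\frac{\mathrm{SNR}_{\text{output}}}{\mathrm{SNR}_{\text{input}}}=\frac{|\mathbf{F}^H(\omega)\,\mathbf{d}(\omega,\theta)|^2}{\mathbf{F}^H(\omega)\,\boldsymbol{\Gamma}_{\text{noise}}(\omega)\,\mathbf{F}(\omega)}$$
  (numerator = |signal transfer function|^2, denominator = |noise transfer function|^2)
• White Noise Gain = array gain for spatially uncorrelated noise (e.g. sensor noise), $\boldsymbol{\Gamma}_{\text{noise}}^{\text{white}}=\mathbf{I}$
  $$WNG(\omega,\theta)=\frac{|\mathbf{F}^H(\omega)\,\mathbf{d}(\omega,\theta)|^2}{\mathbf{F}^H(\omega)\,\mathbf{F}(\omega)}$$
  ps: often used as a measure for robustness
• Directivity = array gain for diffuse noise (=coming from all directions)
  $$DI(\omega,\theta)=\frac{|\mathbf{F}^H(\omega)\,\mathbf{d}(\omega,\theta)|^2}{\mathbf{F}^H(\omega)\,\boldsymbol{\Gamma}_{\text{noise}}^{\text{diffuse}}(\omega)\,\mathbf{F}(\omega)}$$
  $$\Gamma_{ij}^{\text{diffuse}}(\omega)=\operatorname{sinc}\!\left(\frac{\omega f_s\,(d_j-d_i)}{c}\right)\qquad\text{(skip this formula)}$$
• DI and WNG evaluated at θ_max is often used as a performance criterion
PS: ω is rad/sample ( -π≤ω≤π ), ω fs is rad/sec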
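The three gains differ only in the noise coherence matrix that is plugged in. A minimal sketch, assuming ideal omnidirectional microphones on a uniform linear array, delay-and-sum weights steered to end-fire, c = 340 m/s, and the unnormalized convention sinc(x) = sin(x)/x for the diffuse coherence (NumPy's `np.sinc` is the normalized variant, hence the division by π); function names are illustrative.

```python
import numpy as np

def diffuse_coherence(omega, positions, fs, c=340.0):
    """Gamma_ij = sinc(omega fs (d_j - d_i) / c), with sinc(x) = sin(x)/x."""
    x = omega * fs * (positions[None, :] - positions[:, None]) / c
    return np.sinc(x / np.pi)              # np.sinc(t) = sin(pi t) / (pi t)

def array_gain(F, d, Gamma):
    """G = |F^H d|^2 / (F^H Gamma F)."""
    return np.abs(np.vdot(F, d)) ** 2 / np.real(np.vdot(F, Gamma @ F))

M, dist, fs, c = 5, 0.03, 16000.0, 340.0
positions = np.arange(M) * dist
omega = 2 * np.pi * 3000.0 / fs
psi = 0.0                                  # end-fire steering
d = np.exp(-1j * omega * positions * np.cos(psi) / c * fs)
F = d / M                                  # delay-and-sum weights

wng = array_gain(F, d, np.eye(M))
di = array_gain(F, d, diffuse_coherence(omega, positions, fs, c))
print(wng, di)
```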
p. 13
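PS: Near-field beamforming
• Far-field assumptions not valid for sources close to microphone array
  – spherical wavefronts instead of planar wavefronts
  – include attenuation of signals
  – 2 coordinates θ,r (=position q) instead of 1 coordinate θ (in 2D case)
• Different steering vector (e.g. with Hm(ω,θ)=1, m=1..M) : d(ω,θ) is replaced by
  $$\mathbf{d}(\omega,\mathbf{q})=\begin{bmatrix}a_1 e^{-j\omega\tau_1(\mathbf{q})} & a_2 e^{-j\omega\tau_2(\mathbf{q})} & \dots & a_M e^{-j\omega\tau_M(\mathbf{q})}\end{bmatrix}^T$$
  $$a_m=\left(\frac{\|\mathbf{q}-\mathbf{p}_{\text{ref}}\|}{\|\mathbf{q}-\mathbf{p}_m\|}\right)^{e},\qquad \tau_m(\mathbf{q})=\frac{\|\mathbf{q}-\mathbf{p}_m\|-\|\mathbf{q}-\mathbf{p}_{\text{ref}}\|}{c}\,f_s$$
  with q position of source, p_ref position of reference microphone, p_m position of m-th microphone
  e=1 (3D)…2 (2D)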
p. 14
PS: Multipath propagation
• In a multipath scenario, acoustic waves are reflected against walls, objects, etc..
• Every reflection may be treated as a separate source (near-field or far-field)
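• A more realistic data model is then..
  $$\mathbf{Y}(\omega,\mathbf{q})=\mathbf{d}(\omega,\mathbf{q})\,S(\omega)+\mathbf{N}(\omega),\qquad \mathbf{d}(\omega,\mathbf{q})=\begin{bmatrix}H_1(\omega,\mathbf{q}) & H_2(\omega,\mathbf{q}) & \dots & H_M(\omega,\mathbf{q})\end{bmatrix}^T$$
  with q position of source and Hm(ω,q) complete transfer function from source position to m-th microphone (incl. micr. characteristic, position, and multipath propagation)
`Beamforming’ aspect vanishes here, see also Lecture-3 (`multi-channel noise reduction’)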
p. 16
Filter-and-sum beamformer design
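• Basic: procedure based on page 10
  Array directivity pattern
  $$H(\omega,\theta)=\frac{Z(\omega,\theta)}{S(\omega)}=\mathbf{F}^H(\omega)\,\mathbf{d}(\omega,\theta)$$
  to be matched to given (desired) pattern $H_d(\omega,\theta)$ over frequency/angle range of interest, with
  $$\mathbf{F}(\omega)=\begin{bmatrix}F_1(\omega) & \dots & F_M(\omega)\end{bmatrix}^T,\qquad F_m(\omega)=\sum_{n=0}^{N-1}f_{m,n}\,e^{-jn\omega}$$
• Non-linear optimization for FIR filter design (=ignore phase response)
  $$\min_{f_{m,n},\;m=1..M,\;n=0..N-1}\int_\theta\int_\omega\big(|H(\omega,\theta)|-|H_d(\omega,\theta)|\big)^2\,d\omega\,d\theta$$
• Quadratic optimization for FIR filter design (=co-design phase response)
  $$\min_{f_{m,n},\;m=1..M,\;n=0..N-1}\int_\theta\int_\omega\big|H(\omega,\theta)-H_d(\omega,\theta)\big|^2\,d\omega\,d\theta$$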
p. 17
Filter-and-sum beamformer design
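• Quadratic optimization for FIR filter design (continued)
  $$\min_{f_{m,n},\;m=1..M,\;n=0..N-1}\int_\theta\int_\omega\big|H(\omega,\theta)-H_d(\omega,\theta)\big|^2\,d\omega\,d\theta$$
  With
  $$\mathbf{f}_m=\begin{bmatrix}f_{m,0} & f_{m,1} & \dots & f_{m,N-1}\end{bmatrix}^T,\qquad \mathbf{f}=\begin{bmatrix}\mathbf{f}_1^T & \dots & \mathbf{f}_M^T\end{bmatrix}^T$$
  $$H(\omega,\theta)=\mathbf{F}^H(\omega)\,\mathbf{d}(\omega,\theta)=\begin{bmatrix}F_1^{*}(\omega) & \dots & F_M^{*}(\omega)\end{bmatrix}\mathbf{d}(\omega,\theta)=\mathbf{f}^T\underbrace{\left(\mathbf{I}_{M\times M}\otimes\begin{bmatrix}1\\ e^{j\omega}\\ \vdots\\ e^{j(N-1)\omega}\end{bmatrix}\right)}_{MN\times M}\mathbf{d}(\omega,\theta)=\mathbf{f}^T\,\bar{\mathbf{d}}(\omega,\theta)$$
  (⊗ = Kronecker product)
  optimal solution is
  $$\mathbf{f}_{\text{optimal}}=\mathbf{Q}^{-1}\mathbf{p},\qquad \mathbf{Q}=\int_\theta\int_\omega\bar{\mathbf{d}}(\omega,\theta)\,\bar{\mathbf{d}}^H(\omega,\theta)\,d\omega\,d\theta,\qquad \mathbf{p}=\int_\theta\int_\omega\bar{\mathbf{d}}(\omega,\theta)\,H_d^{*}(\omega,\theta)\,d\omega\,d\theta$$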
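The quadratic design can be sketched by replacing the integrals with sums over a frequency/angle grid. Everything specific below is an assumption for illustration: the small array (M = 4, N = 8), the grids, the pass-region around broadside, and the linear-phase desired pattern. For real-valued f, stacking real and imaginary parts in a least-squares solve is equivalent to solving Re(Q) f = Re(p) on the grid.

```python
import numpy as np

M, N, d, fs, c = 4, 8, 0.04, 8000.0, 340.0
omegas = np.pi * np.linspace(0.2, 0.8, 13)           # design band in rad/sample
thetas = np.radians(np.arange(0, 181, 10))           # angle grid
psi, half_width = np.pi / 2, np.radians(25)          # pass-region around broadside

rows, target = [], []
for w in omegas:
    e = np.exp(1j * w * np.arange(N))                # [1, e^{jw}, ..., e^{j(N-1)w}]
    for th in thetas:
        steer = np.exp(-1j * w * np.arange(M) * d * np.cos(th) / c * fs)
        rows.append(np.kron(steer, e))               # d_bar = (I kron e) d
        in_pass = abs(th - psi) <= half_width
        # desired pattern: linear phase inside the pass-region (assumption), zero elsewhere
        target.append(np.exp(1j * w * (N - 1) / 2) if in_pass else 0.0)
A = np.array(rows)                                   # H on the grid = A @ f
h_d = np.array(target)

# real-valued f: stack real and imaginary parts of the grid equations
f, *_ = np.linalg.lstsq(np.vstack([A.real, A.imag]),
                        np.concatenate([h_d.real, h_d.imag]), rcond=None)
cost = np.sum(np.abs(A @ f - h_d) ** 2)
cost_zero = np.sum(np.abs(h_d) ** 2)                 # cost of the all-zero filter
print(cost, cost_zero)
```

The vector `f` holds the MN filter coefficients ordered as [f_1; ...; f_M], matching the stacking used in the Kronecker-product expression.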
p. 18
Filter-and-sum beamformer design
• Design example
  M=8, logarithmic array, N=50, fs=8 kHz
[Figure: designed directivity pattern vs. Frequency (Hz) and Angle (deg)]
p. 19
Matched filtering: WNG maximization
• Basic: procedure based on page 12
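• Maximize White Noise Gain (WNG) for given steering angle ψ
  $$\mathbf{F}^{\text{MF}}(\omega)=\arg\max_{\mathbf{F}(\omega)}\{WNG(\omega,\psi)\}=\arg\max_{\mathbf{F}(\omega)}\left\{\frac{|\mathbf{F}^H(\omega)\,\mathbf{d}(\omega,\psi)|^2}{\mathbf{F}^H(\omega)\,\mathbf{F}(\omega)}\right\}$$
• A priori knowledge/assumptions:
  – angle-of-arrival ψ of desired signal + corresponding steering vector
  – noise scenario = white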
p. 20
Matched filtering: WNG maximization
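• Maximization in
  $$\mathbf{F}^{\text{MF}}(\omega)=\arg\max_{\mathbf{F}(\omega)}\left\{\frac{|\mathbf{F}^H(\omega)\,\mathbf{d}(\omega,\psi)|^2}{\mathbf{F}^H(\omega)\,\mathbf{F}(\omega)}\right\}$$
  is equivalent to minimization of noise output power (under white input noise), subject to unit response for steering angle (**)
  $$\min_{\mathbf{F}(\omega)}\mathbf{F}^H(\omega)\,\mathbf{F}(\omega)\quad\text{s.t.}\quad \mathbf{F}^H(\omega)\,\mathbf{d}(\omega,\psi)=1$$
• Optimal solution (`matched filter’) is
  $$\mathbf{F}^{\text{MF}}(\omega)=\frac{1}{\|\mathbf{d}(\omega,\psi)\|^2}\,\mathbf{d}(\omega,\psi)$$
• [FIR approximation]
  $$\min_{f_{m,n},\;m=1..M,\;n=0..N-1}\int_\omega\big\|\mathbf{F}(\omega)-\mathbf{F}^{\text{MF}}(\omega)\big\|^2\,d\omega$$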
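The matched filter solution is easy to check numerically. In the sketch below the steering vector is an arbitrary random complex vector (an assumption, chosen so that the check does not depend on a particular array geometry), and `matched_filter` and `wng` are illustrative names.

```python
import numpy as np

def matched_filter(d):
    """F_MF = d / ||d||^2 : minimises F^H F subject to F^H d = 1."""
    return d / np.vdot(d, d).real

def wng(F, d):
    """White noise gain |F^H d|^2 / (F^H F)."""
    return np.abs(np.vdot(F, d)) ** 2 / np.vdot(F, F).real

rng = np.random.default_rng(0)
M = 6
d = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # arbitrary steering vector
F_mf = matched_filter(d)

# any other filter, here a random perturbation of F_mf, cannot have a larger WNG
F_other = F_mf + 0.1 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
print(np.vdot(F_mf, d), wng(F_mf, d), wng(F_other, d))
```

The unit response and the WNG of ‖d‖² (which equals M for ideal omnidirectional microphones) both follow from the Cauchy-Schwarz inequality.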
p. 21
Matched filtering example: Delay-and-sum
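• Basic: Microphone signals are delayed and then summed together
  $$z[k]=\frac{1}{M}\sum_{m=1}^{M}y_m[k+\Delta_m]\qquad\Longleftrightarrow\qquad F_m(\omega)=\frac{e^{-j\omega\Delta_m}}{M}$$
• Fractional delays implemented with truncated interpolation filters (=FIR)
• Consider array with ideal omni-directional micr’s, $H_0(\omega,\theta)=1$
  Then array can be steered to angle ψ :
  $$\Delta_m=\tau_m(\psi)$$
  Hence (for ideal omni-dir. micr.’s) this is matched filter solution
  $$\mathbf{F}(\omega)=\frac{\mathbf{d}(\omega,\psi)}{M}$$
[Figure: delay-and-sum structure; path-length difference (m-1)·d·cos ψ at microphone m, delays Δ_1 … Δ_M, sum scaled by 1/M]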
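A time-domain version with fractional delays might look as follows. This is a sketch under several assumptions: a Hann-windowed truncated sinc as interpolation filter (one possible choice of `truncated interpolation filter’), a common bulk delay of (taps-1)/2 samples to keep all filters causal, a 500 Hz test tone, and c = 340 m/s.

```python
import numpy as np

def fractional_delay_fir(delay, taps=31):
    """Truncated, Hann-windowed sinc that delays by (taps-1)/2 + delay samples."""
    x = np.arange(taps) - (taps - 1) / 2 - delay
    window = np.where(np.abs(x) <= taps / 2, 0.5 + 0.5 * np.cos(2 * np.pi * x / taps), 0.0)
    return np.sinc(x) * window

fs, c, M, d = 16000.0, 340.0, 5, 0.03
psi = np.radians(60)
delta = np.arange(M) * d * np.cos(psi) / c * fs        # Delta_m = tau_m(psi), in samples

k = np.arange(400)
def source(t):
    return np.sin(2 * np.pi * 500.0 / fs * t)          # 500 Hz test tone (assumption)
y = np.stack([source(k - dm) for dm in delta])         # y_m[k] = s[k - tau_m(psi)]

taps = 31
bulk = (taps - 1) // 2                                  # common delay that keeps the filters causal
z = np.mean([np.convolve(y[m], fractional_delay_fir(-delta[m], taps))[:len(k)]
             for m in range(M)], axis=0)               # z[k] ~ (1/M) sum_m y_m[k + Delta_m - bulk]
error = np.max(np.abs(z[taps:] - source(k[taps:] - bulk)))
print(error)
```

For a source at the steering angle the aligned microphone signals add coherently, so the output is the source signal delayed by the bulk delay, up to the interpolation error of the truncated filters.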
p. 22
Matched filtering example: Delay-and-sum
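• Array directivity pattern H(ω,θ):
  $$H(\omega,\theta)=\frac{1}{M}\,\mathbf{d}^H(\omega,\psi)\,\mathbf{d}(\omega,\theta)$$
  $|H(\omega,\theta)|\le 1$ =destructive interference
  $H(\omega,\theta_{\max}=\psi)=1$ =constructive interference
• White noise gain :
  $$WNG(\omega,\psi)=\frac{|\mathbf{F}^H(\omega)\,\mathbf{d}(\omega,\psi)|^2}{\mathbf{F}^H(\omega)\,\mathbf{F}(\omega)}=\dots=M\qquad\text{(independent of }\omega\text{)}$$
  For ideal omni-dir. micr. array, delay-and-sum beamformer provides WNG equal to M for all freqs. (in the direction of steering angle ψ).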
p. 23
Matched filtering example: Delay-and-sum
• Array directivity pattern H(ω,θ) for uniform linear array:
  $$H(\omega,\theta)=\frac{1}{M}\sum_{m=1}^{M}e^{\,j\omega(m-1)\frac{d\,(\cos\psi-\cos\theta)}{c}f_s}=\frac{e^{\,jM\gamma/2}}{M\,e^{\,j\gamma/2}}\cdot\frac{\sin(M\gamma/2)}{\sin(\gamma/2)},\qquad \gamma=\omega\,\frac{d\,(\cos\psi-\cos\theta)}{c}\,f_s$$
  H(ω,θ) has sinc-like shape and is frequency-dependent
[Figure: |H(ω,θ)| vs. Frequency (Hz) and Angle (deg); M=5 microphones, d=3 cm inter-microphone distance, ψ=60° steering angle, fs=16 kHz sampling frequency, ideal omni-dir. micr.’s]
[Figure: Spatial directivity pattern for f=5000 Hz (polar plot in dB, 0° =endfire); ψ=60°, wavelength=4cm]
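The closed-form expression can be checked against the direct sum. The sketch uses the parameters of the example (M = 5, d = 3 cm, ψ = 60°, fs = 16 kHz, f = 5000 Hz) with an assumed c = 340 m/s; the angle grid deliberately skips θ = ψ, where the closed form is a 0/0 limit.

```python
import numpy as np

M, d, fs, c = 5, 0.03, 16000.0, 340.0
psi = np.radians(60)
omega = 2 * np.pi * 5000.0 / fs
theta = np.radians(np.arange(1, 180, 7))            # grid that skips theta = psi (gamma = 0)

gamma = omega * fs * d * (np.cos(psi) - np.cos(theta)) / c
m = np.arange(M)[:, None]                           # m - 1 = 0 .. M-1
H_sum = np.mean(np.exp(1j * m * gamma), axis=0)     # (1/M) sum_m e^{j (m-1) gamma}
H_closed = (np.exp(1j * (M - 1) * gamma / 2)
            * np.sin(M * gamma / 2) / (M * np.sin(gamma / 2)))
print(np.max(np.abs(H_sum - H_closed)))
```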
p. 24
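Matched filtering example: Delay-and-sum
• For $f\ge\dfrac{c}{d\,(1+|\cos\psi|)}$ an ambiguity, called spatial aliasing, occurs.
  This is analogous to time-domain aliasing where now the spatial sampling (=d) is too large.
• Aliasing does not occur (for any ψ) if
  $$d\le\frac{c}{f_s}=\frac{c}{2\,f_{\max}}=\frac{\lambda_{\min}}{2}$$
Details...
  $|H(\omega,\theta)|=1$ iff $\gamma=p\cdot 2\pi$ for integer p, with $\gamma=\omega f_s\,d\,(\cos\psi-\cos\theta)/c=2\pi f\,d\,(\cos\psi-\cos\theta)/c$
  1) γ=0 for θ=ψ (for all ω)
  2) if ψ ≤ π/2 then γ=2π occurs for θ=π and $f=\dfrac{c}{d\,(1+\cos\psi)}$
  3) if ψ ≥ π/2 then γ=-2π occurs for θ=0 and $f=\dfrac{c}{d\,(1-\cos\psi)}$
[Figure: |H(ω,θ)| vs. frequency and Angle (deg); M=5, ψ=60°, fs=16 kHz, d=8 cm, ideal omni-dir. micr.’s; aliasing visible above f = c/(d(1+cos ψ))]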
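The aliasing condition can be evaluated for the example on this page (ψ = 60°, d = 8 cm, fs = 16 kHz). This is a sketch with c = 340 m/s assumed and illustrative function names; it also confirms that at the aliasing frequency a source at θ = π is passed with the same gain as the source at the steering angle.

```python
import numpy as np

c = 340.0                                            # propagation speed (assumption)

def aliasing_frequency(d, psi):
    """Lowest frequency at which spatial aliasing occurs for steering angle psi."""
    return c / (d * (1 + abs(np.cos(psi))))

def max_alias_free_spacing(fs):
    """Largest spacing d without aliasing for any steering angle: c / fs."""
    return c / fs

def das_response(f, theta, psi, M, d):
    """Delay-and-sum pattern H for a uniform linear array at frequency f (Hz)."""
    gamma = 2 * np.pi * f * d * (np.cos(psi) - np.cos(theta)) / c
    return np.mean(np.exp(1j * np.arange(M) * gamma))

f_alias = aliasing_frequency(d=0.08, psi=np.radians(60))
d_max = max_alias_free_spacing(fs=16000.0)
H_back = das_response(f_alias, theta=np.pi, psi=np.radians(60), M=5, d=0.08)
print(f_alias, d_max, abs(H_back))
```

With these numbers aliasing starts well below the Nyquist frequency of 8 kHz, and the spacing would have to shrink to about 2 cm to avoid it for every steering angle.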
p. 25
Matched filtering example: Delay-and-sum
• Beamwidth for a uniform linear array:
  $$BW\approx\frac{96\,(1-\nu)\,c}{\omega\,d\,M}\,\sec\psi\qquad\text{with e.g. }\nu=1/\sqrt{2}\ \text{(-3 dB)}$$
  hence large dependence on # microphones, distance (compare p.24 & 25) and frequency (e.g. BW infinitely large at DC)
• Array topologies:
  – Uniformly spaced arrays
  – Nested (logarithmic) arrays (small d for high ω, large d for small ω)
  – 2D- (planar) / 3D-arrays
[Figure: nested array with inter-microphone distances d, 2d, 4d; ideal omni-dir. micr.’s]
p. 26
Super-directive beamforming : DI maximization
• Basic: procedure based on page 12
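• Maximize Directivity (DI) for given steering angle ψ
  $$\mathbf{F}^{\text{SD}}(\omega)=\arg\max_{\mathbf{F}(\omega)}\{DI(\omega,\psi)\}=\arg\max_{\mathbf{F}(\omega)}\left\{\frac{|\mathbf{F}^H(\omega)\,\mathbf{d}(\omega,\psi)|^2}{\mathbf{F}^H(\omega)\,\boldsymbol{\Gamma}_{\text{noise}}^{\text{diffuse}}(\omega)\,\mathbf{F}(\omega)}\right\}$$
• A priori knowledge/assumptions:
  – angle-of-arrival ψ of desired signal + corresponding steering vector
  – noise scenario = diffuse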
p. 27
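Super-directive beamforming : DI maximization
• Maximization in
  $$\mathbf{F}^{\text{SD}}(\omega)=\arg\max_{\mathbf{F}(\omega)}\left\{\frac{|\mathbf{F}^H(\omega)\,\mathbf{d}(\omega,\psi)|^2}{\mathbf{F}^H(\omega)\,\boldsymbol{\Gamma}_{\text{noise}}^{\text{diffuse}}(\omega)\,\mathbf{F}(\omega)}\right\}$$
  is equivalent to minimization of noise output power (under diffuse input noise), subject to unit response for steering angle (**)
  $$\min_{\mathbf{F}(\omega)}\mathbf{F}^H(\omega)\,\boldsymbol{\Gamma}_{\text{noise}}^{\text{diffuse}}(\omega)\,\mathbf{F}(\omega)\quad\text{s.t.}\quad \mathbf{F}^H(\omega)\,\mathbf{d}(\omega,\psi)=1$$
• Optimal solution is
  $$\mathbf{F}^{\text{SD}}(\omega)=\frac{1}{\mathbf{d}^H(\omega,\psi)\,\{\boldsymbol{\Gamma}_{\text{noise}}^{\text{diffuse}}(\omega)\}^{-1}\,\mathbf{d}(\omega,\psi)}\,\{\boldsymbol{\Gamma}_{\text{noise}}^{\text{diffuse}}(\omega)\}^{-1}\,\mathbf{d}(\omega,\psi)$$
• [FIR approximation]
  $$\min_{f_{m,n},\;m=1..M,\;n=0..N-1}\int_\omega\big\|\mathbf{F}(\omega)-\mathbf{F}^{\text{SD}}(\omega)\big\|^2\,d\omega$$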
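The superdirective solution can be compared with delay-and-sum in a few lines. A minimal sketch, assuming ideal omnidirectional microphones, end-fire steering, the parameters M = 5, d = 3 cm, fs = 16 kHz at f = 3000 Hz, c = 340 m/s, and the unnormalized sinc(x) = sin(x)/x for the diffuse coherence matrix.

```python
import numpy as np

M, dist, fs, c = 5, 0.03, 16000.0, 340.0
positions = np.arange(M) * dist
omega = 2 * np.pi * 3000.0 / fs
psi = 0.0                                               # end-fire steering

d = np.exp(-1j * omega * positions * np.cos(psi) / c * fs)
x = omega * fs * (positions[None, :] - positions[:, None]) / c
Gamma = np.sinc(x / np.pi)                              # diffuse coherence, sin(x)/x

def gain(F, Gamma_noise):
    """Array gain |F^H d|^2 / (F^H Gamma F) in the steering direction."""
    return np.abs(np.vdot(F, d)) ** 2 / np.real(np.vdot(F, Gamma_noise @ F))

Gd = np.linalg.solve(Gamma, d)
F_sd = Gd / np.vdot(d, Gd)                              # Gamma^{-1} d / (d^H Gamma^{-1} d)
F_ds = d / M                                            # delay-and-sum

di_sd, di_ds = gain(F_sd, Gamma), gain(F_ds, Gamma)
wng_sd, wng_ds = gain(F_sd, np.eye(M)), gain(F_ds, np.eye(M))
print(di_sd, di_ds, wng_sd, wng_ds)
```

The superdirective filter has the larger directivity and the smaller white noise gain. At lower frequencies the solve becomes ill-conditioned, and practical designs typically regularize the coherence matrix, which is not shown here.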
p. 28
Super-directive beamforming : DI maximization
• Directivity patterns for end-fire steering (ψ=0):
  [Figure: polar directivity patterns (dB) at f=3000 Hz, Superdirective beamformer vs. Delay-and-sum beamformer; M=5, d=3 cm, fs=16 kHz, ideal omni-dir. micr.’s]
  [Figure: Directivity (linear) vs. Frequency (Hz), Superdirective vs. Delay-and-sum; annotations: DI=M²=25, DI=WNG=5]
  [Figure: White noise gain (dB) vs. Frequency (Hz), Superdirective vs. Delay-and-sum; annotation: WNG=M=5]
• Superdirective beamformer has highest DI, but very poor WNG (at low frequencies, where diffuse noise coherence matrix becomes ill-conditioned), hence problems with robustness (e.g. sensor noise) !
• Maximum directivity = M² obtained for end-fire steering and for frequency->0 (no proof)
PS: diffuse noise ≈ white noise for high frequencies (cfr. ω→π and c/fs=λmin/2≈min(dj-di) in diffuse noise coherence matrix)
p. 29
Differential microphones : Delay-and-subtract
• First-order differential microphone = directional microphone: 2 closely spaced microphones (distance d), where one microphone is delayed by τ (=hardware) and whose outputs are then subtracted from each other
• Array directivity pattern:
  $$H(\omega,\theta)=1-e^{-j\omega\left(\tau+\frac{d}{c}\cos\theta\right)}$$
  For small spacing and delay (ω·d/c << 1, ω·τ << 1):
  $$H(\omega,\theta)\approx j\omega\,(\tau+d/c)\,P(\theta),\qquad P(\theta)=\alpha_1+(1-\alpha_1)\cos\theta,\qquad \alpha_1=\frac{\tau}{\tau+d/c}$$
  – First-order high-pass frequency dependence
  – P(θ) = freq. independent (!) directional response
  – 0 ≤ α_1 ≤ 1 : P(θ) is scaled cosine, shifted up with α_1, such that θ_max = 0° (=end-fire) and P(θ_max)=1
p. 30
Differential microphones : Delay-and-subtract
• Types: dipole, cardioid, hypercardioid, supercardioid (HJ84)
  – Dipole: α_1 = 0 (τ=0), zero at 90°, DI=4.8 dB
  – Cardioid: α_1 = 0.5, zero at 180°, DI=4.8 dB
  – Supercardioid: α_1 = (√3-1)/2 ≈ 0.37, zero at 125°, DI=5.7 dB, highest front-to-back ratio
  – Hypercardioid: α_1 = 0.25, zero at 109°, highest DI=6.0 dB
[Figure: polar patterns of the four types; θ=0° =endfire, θ=90° =broadside]
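The zeros and directivity indices listed above can be reproduced from P(θ) = α_1 + (1-α_1) cos θ. The sketch assumes a spherically isotropic (3D diffuse) noise field, for which the noise power is the average of P² over the sphere, and evaluates that average with a simple midpoint rule; the function names are illustrative.

```python
import numpy as np

def first_order_pattern(theta, alpha):
    """P(theta) = alpha + (1 - alpha) cos(theta)."""
    return alpha + (1 - alpha) * np.cos(theta)

def null_angle_deg(alpha):
    """Angle at which P(theta) = 0."""
    return np.degrees(np.arccos(-alpha / (1 - alpha)))

def directivity_index_db(alpha, n=20000):
    """DI at end-fire for spherically isotropic noise (assumption: 3D diffuse field)."""
    theta = (np.arange(n) + 0.5) * np.pi / n          # midpoint rule on [0, pi]
    noise = 0.5 * np.sum(first_order_pattern(theta, alpha) ** 2 * np.sin(theta)) * np.pi / n
    return 10 * np.log10(first_order_pattern(0.0, alpha) ** 2 / noise)

types = {"dipole": 0.0, "cardioid": 0.5,
         "supercardioid": (np.sqrt(3) - 1) / 2, "hypercardioid": 0.25}
for name, alpha in types.items():
    print(name, round(null_angle_deg(alpha), 1), round(directivity_index_db(alpha), 2))
```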
p. 32
LCMV beamformer
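• Adaptive filter-and-sum structure:
  – Aim is to minimize noise output power, while maintaining a chosen response in a given look direction (and/or other linear constraints, see below).
  – This is similar to operation of a superdirective array (in diffuse noise), or delay-and-sum (in white noise), cfr. (**) p.20 & 27, but now noise field is unknown !
  – Implemented as adaptive FIR filter :
    $$\mathbf{y}_m[k]=\begin{bmatrix}y_m[k] & y_m[k-1] & \dots & y_m[k-N+1]\end{bmatrix}^T,\qquad \mathbf{y}[k]=\begin{bmatrix}\mathbf{y}_1^T[k] & \mathbf{y}_2^T[k] & \dots & \mathbf{y}_M^T[k]\end{bmatrix}^T$$
    $$\mathbf{f}_m=\begin{bmatrix}f_{m,0} & f_{m,1} & \dots & f_{m,N-1}\end{bmatrix}^T,\qquad \mathbf{f}=\begin{bmatrix}\mathbf{f}_1^T & \mathbf{f}_2^T & \dots & \mathbf{f}_M^T\end{bmatrix}^T$$
    $$z[k]=\sum_{m=1}^{M}\mathbf{f}_m^T\,\mathbf{y}_m[k]=\mathbf{f}^T\,\mathbf{y}[k]$$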
p. 33
LCMV beamformer
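LCMV = Linearly Constrained Minimum Variance
– f designed to minimize output power (variance) (read on..) of z[k] :
  $$\min_{\mathbf{f}}E\{z^2[k]\}=\min_{\mathbf{f}}\mathbf{f}^T\,\mathbf{R}_{yy}[k]\,\mathbf{f}$$
  with $\mathbf{R}_{yy}[k]$ the correlation matrix of y[k]
– To avoid desired signal cancellation, add (J) linear constraints
  $$\mathbf{C}^T\mathbf{f}=\mathbf{b},\qquad \mathbf{f}\in\mathbb{R}^{MN},\ \mathbf{C}\in\mathbb{R}^{MN\times J},\ \mathbf{b}\in\mathbb{R}^{J}$$
  Example: fix array response in look-direction ψ for sample freqs ωi
  For sufficiently large J, constrained output power minimization approximately corresponds to constrained output noise power minimization (why?)
– Solution is (obtained using Lagrange-multipliers, etc..):
  $$\mathbf{f}_{\text{opt}}=\mathbf{R}_{yy}^{-1}[k]\,\mathbf{C}\,\big(\mathbf{C}^T\mathbf{R}_{yy}^{-1}[k]\,\mathbf{C}\big)^{-1}\mathbf{b}$$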
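The closed-form LCMV solution is short enough to verify directly. The sketch below uses random data purely for illustration (dimensions, constraint matrix and right-hand side are assumptions); `lcmv` is an illustrative name.

```python
import numpy as np

def lcmv(R, C, b):
    """f_opt = R^{-1} C (C^T R^{-1} C)^{-1} b."""
    RinvC = np.linalg.solve(R, C)
    return RinvC @ np.linalg.solve(C.T @ RinvC, b)

rng = np.random.default_rng(1)
MN, J = 12, 2
Y = rng.standard_normal((2000, MN))                  # rows play the role of y[k]^T
R = Y.T @ Y / len(Y)                                 # sample estimate of R_yy
C = rng.standard_normal((MN, J))
b = np.array([1.0, 0.0])
f_opt = lcmv(R, C, b)

# another filter that also satisfies the constraints: add a component with C^T(.) = 0
P = np.eye(MN) - C @ np.linalg.solve(C.T @ C, C.T)
f_other = f_opt + 0.1 * P @ rng.standard_normal(MN)
power_opt, power_other = f_opt @ R @ f_opt, f_other @ R @ f_other
print(C.T @ f_opt, power_opt, power_other)
```

Both filters meet the constraints, and the LCMV filter has the smaller output power.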
p. 34
Generalized sidelobe canceler (GSC)
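GSC = Adaptive filter formulation of the LCMV-problem
Constrained optimisation is reformulated as a constraint pre-processing, followed by an unconstrained optimisation, leading to a simple adaptation scheme
– LCMV-problem is
  $$\min_{\mathbf{f}}\mathbf{f}^T\,\mathbf{R}_{yy}[k]\,\mathbf{f}\quad\text{s.t.}\quad \mathbf{C}^T\mathbf{f}=\mathbf{b},\qquad \mathbf{f}\in\mathbb{R}^{MN},\ \mathbf{C}\in\mathbb{R}^{MN\times J},\ \mathbf{b}\in\mathbb{R}^{J}$$
– Define `blocking matrix’ $\mathbf{C}_a\in\mathbb{R}^{MN\times(MN-J)}$, with columns spanning the null-space of $\mathbf{C}^T$, i.e. $\mathbf{C}^T\mathbf{C}_a=\mathbf{0}$
– Define ‘quiescent response vector’ $\mathbf{f}_q$ satisfying constraints, i.e. $\mathbf{C}^T\mathbf{f}_q=\mathbf{b}$
– Parametrize all f’s that satisfy constraints (verify!)
  $$\mathbf{f}=\mathbf{f}_q-\mathbf{C}_a\,\mathbf{f}_a$$
  I.e. filter f can be decomposed in a fixed part fq and a variable part Ca. fa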
p. 35
Generalized sidelobe canceler
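GSC = Adaptive filter formulation of the LCMV-problem
Constrained optimisation is reformulated as a constraint pre-processing, followed by an unconstrained optimisation, leading to a simple adaptation scheme
– LCMV-problem is
  $$\min_{\mathbf{f}}\mathbf{f}^T\,\mathbf{R}_{yy}[k]\,\mathbf{f}\quad\text{s.t.}\quad \mathbf{C}^T\mathbf{f}=\mathbf{b},\qquad \mathbf{f}\in\mathbb{R}^{MN},\ \mathbf{C}\in\mathbb{R}^{MN\times J},\ \mathbf{b}\in\mathbb{R}^{J}$$
– Unconstrained optimization of fa : (MN-J coefficients)
  $$\min_{\mathbf{f}_a}\,(\mathbf{f}_q-\mathbf{C}_a\mathbf{f}_a)^T\,\mathbf{R}_{yy}[k]\,(\mathbf{f}_q-\mathbf{C}_a\mathbf{f}_a)$$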
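The equivalence between the GSC parametrization and the LCMV solution can be checked numerically. In this sketch the blocking matrix is taken from a full SVD of C and the quiescent vector is chosen as f_q = C (CᵀC)⁻¹ b; both are common choices but assumptions here, since any C_a with CᵀC_a = 0 and any f_q with Cᵀf_q = b would do. The data are random.

```python
import numpy as np

rng = np.random.default_rng(2)
MN, J = 10, 2
A = rng.standard_normal((MN, MN))
R = A @ A.T / MN + np.eye(MN)                    # a well-conditioned stand-in for R_yy
C = rng.standard_normal((MN, J))
b = np.array([1.0, 0.5])

# blocking matrix: orthonormal basis of the null-space of C^T, from a full SVD of C
U, _, _ = np.linalg.svd(C, full_matrices=True)
Ca = U[:, J:]                                    # MN x (MN - J)
fq = C @ np.linalg.solve(C.T @ C, b)             # one possible quiescent response vector

# unconstrained minimisation over fa: (Ca^T R Ca) fa = Ca^T R fq
fa = np.linalg.solve(Ca.T @ R @ Ca, Ca.T @ R @ fq)
f_gsc = fq - Ca @ fa

# closed-form LCMV solution for comparison
RinvC = np.linalg.solve(R, C)
f_lcmv = RinvC @ np.linalg.solve(C.T @ RinvC, b)
print(np.max(np.abs(f_gsc - f_lcmv)))
```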
p. 36
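Generalized sidelobe canceler
GSC (continued)
  $$\min_{\mathbf{f}_a}\,(\mathbf{f}_q-\mathbf{C}_a\mathbf{f}_a)^T\,\mathbf{R}_{yy}[k]\,(\mathbf{f}_q-\mathbf{C}_a\mathbf{f}_a)=\min_{\mathbf{f}_a}E\Big\{\big(\underbrace{\mathbf{f}_q^T\,\mathbf{y}[k]}_{d[k]}-\mathbf{f}_a^T\,\underbrace{\mathbf{C}_a^T\,\mathbf{y}[k]}_{\tilde{\mathbf{y}}[k]}\big)^2\Big\}$$
– Hence unconstrained optimization of fa can be implemented as an adaptive filter (adaptive linear combiner), with filter inputs (=‘left-hand sides’) equal to $\tilde{\mathbf{y}}[k]$ and desired filter output (=‘right-hand side’) equal to $d[k]$
– LMS algorithm :
  $$\mathbf{f}_a[k+1]=\mathbf{f}_a[k]+\mu\,\underbrace{\mathbf{C}_a^T\,\mathbf{y}[k]}_{\tilde{\mathbf{y}}[k]}\,\big(\underbrace{\mathbf{y}^T[k]\,\mathbf{f}_q}_{d[k]}-\underbrace{\mathbf{y}^T[k]\,\mathbf{C}_a}_{\tilde{\mathbf{y}}^T[k]}\,\mathbf{f}_a[k]\big)$$
[Figure: GSC block diagram with signals y[k], d[k] and ỹ[k]]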
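The LMS recursion can be tried on a toy scenario. Everything about the scenario is an assumption made for illustration: M = 4 microphones with a single tap each (N = 1), a desired signal that is identical on all microphones (broadside, so C is a vector of ones), one interferer with per-microphone gains `a`, a difference-type blocking matrix, and adaptation restricted to the first, speech-free half of the data.

```python
import numpy as np

rng = np.random.default_rng(3)
M, K, mu = 4, 4000, 0.01
a = np.array([1.0, 0.5, -0.3, 0.8])                 # interferer gains per microphone (assumption)
s = rng.standard_normal(K)
s[:K // 2] = 0.0                                    # first half: noise only, adaptation allowed
v = rng.standard_normal(K)
Y = s[:, None] * np.ones(M) + v[:, None] * a        # row k holds y[k]^T (N = 1 tap per microphone)

fq = np.ones(M) / M                                 # fixed beamformer, C^T fq = 1 with C = ones
Ca = np.array([[1, -1, 0, 0], [1, 0, -1, 0], [1, 0, 0, -1]], dtype=float).T
fa = np.zeros(M - 1)
z = np.zeros(K)
for k in range(K):
    d_k = fq @ Y[k]                                 # speech reference d[k]
    y_tilde = Ca.T @ Y[k]                           # noise references
    z[k] = d_k - y_tilde @ fa                       # GSC output
    if k < K // 2:                                  # update only while no speech is present
        fa = fa + mu * y_tilde * z[k]               # LMS update

d_ref = Y @ fq                                      # fixed beamformer output alone
mse_fixed = np.mean((d_ref[K // 2:] - s[K // 2:]) ** 2)
mse_gsc = np.mean((z[K // 2:] - s[K // 2:]) ** 2)
print(mse_fixed, mse_gsc)
```

In this idealized setting the blocking matrix removes the desired signal completely from the noise references, so the adaptive filter cancels the interferer without touching the desired signal.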
p. 37
Generalized sidelobe canceler
GSC then consists of three parts:
• Fixed beamformer (cfr. fq), satisfying constraints but not yet minimum variance, creating `speech reference’ d[k]
• Blocking matrix (cfr. Ca), placing spatial nulls in the direction of the speech source (at the sample frequencies ωi) (cfr. C’.Ca=0), creating `noise references’ ỹ[k]
• Multi-channel adaptive filter (linear combiner): your favourite one, e.g. LMS
p. 38
Generalized sidelobe canceler
A popular & significantly cheaper GSC realization is as follows
Note that some reorganization has been done: the blocking matrix now generates (typically) M-1 (instead of MN-J) noise references, the multichannel adaptive filter performs FIR-filtering on each noise reference (instead of merely scaling in the linear combiner). Philosophy is the same, mathematics are different (details on next slide).
[Figure: GSC block diagram with microphone inputs y1 … yM and a post-processing stage (Postproc)]
p. 39
Generalized sidelobe canceler
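• Math details: (for Delta’s=0)
  $$\mathbf{y}[k]=\begin{bmatrix}\mathbf{y}_1^T[k] & \dots & \mathbf{y}_M^T[k]\end{bmatrix}^T$$
  $$\mathbf{y}_{\text{permuted}}[k]=\begin{bmatrix}\mathbf{y}_{1:M}^T[k] & \mathbf{y}_{1:M}^T[k-1] & \dots & \mathbf{y}_{1:M}^T[k-L+1]\end{bmatrix}^T,\qquad \mathbf{y}_{1:M}[k]=\begin{bmatrix}y_1[k] & y_2[k] & \dots & y_M[k]\end{bmatrix}^T$$
  $$\tilde{\mathbf{y}}[k]=\mathbf{C}_a^T\,\mathbf{y}[k]=\mathbf{C}_{a,\text{permuted}}^T\,\mathbf{y}_{\text{permuted}}[k]$$
  select `sparse’ blocking matrix such that :
  $$\mathbf{C}_{a,\text{permuted}}^T=\begin{bmatrix}\tilde{\mathbf{C}}_a^T & \mathbf{0} & \dots & \mathbf{0}\\ \mathbf{0} & \tilde{\mathbf{C}}_a^T & & \vdots\\ \vdots & & \ddots & \mathbf{0}\\ \mathbf{0} & \dots & \mathbf{0} & \tilde{\mathbf{C}}_a^T\end{bmatrix}$$
  so that
  $$\tilde{\mathbf{y}}[k]=\begin{bmatrix}\tilde{\mathbf{y}}_{1:M-1}^T[k] & \tilde{\mathbf{y}}_{1:M-1}^T[k-1] & \dots & \tilde{\mathbf{y}}_{1:M-1}^T[k-L+1]\end{bmatrix}^T\qquad\text{=input to multi-channel adaptive filter}$$
  $$\tilde{\mathbf{y}}_{1:M-1}[k]=\tilde{\mathbf{C}}_a^T\,\mathbf{y}_{1:M}[k]\qquad\text{=use }\tilde{\mathbf{C}}_a\text{ as blocking matrix now}$$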
p. 40
Generalized sidelobe canceler
• Blocking matrix Ca (cfr. scheme page 24)
  – Creating (M-1) independent noise references by placing spatial nulls in look-direction
  – different possibilities (broadside steering), with $\mathbf{C}^T=\begin{bmatrix}1 & 1 & 1 & 1\end{bmatrix}$ :
    Griffiths-Jim: $$\mathbf{C}_a^T=\begin{bmatrix}1 & -1 & 0 & 0\\ 1 & 0 & -1 & 0\\ 1 & 0 & 0 & -1\end{bmatrix}$$
    Walsh: $$\mathbf{C}_a^T=\begin{bmatrix}1 & -1 & 1 & -1\\ 1 & 1 & -1 & -1\\ 1 & -1 & -1 & 1\end{bmatrix}$$
  [Figure: blocking matrix response vs. frequency and Angle (deg), for Griffiths-Jim and Walsh]
• Problems of GSC:
  – impossible to reduce noise from look-direction
  – reverberation effects cause signal leakage in noise references, hence adaptive filter should only be updated when no speech is present to avoid signal cancellation!
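Both blocking matrices can be checked against the two requirements stated above: every noise reference must have a null in the look direction (Cᵀ·Ca = 0 for broadside steering), and the M-1 references must be independent. The sign patterns below are one standard choice for M = 4 and are assumptions insofar as the signs are not legible in the source.

```python
import numpy as np

C = np.ones((4, 1))                                  # broadside look direction, M = 4
Ca_griffiths_jim = np.array([[1, -1, 0, 0],
                             [1, 0, -1, 0],
                             [1, 0, 0, -1]], dtype=float).T
Ca_walsh = np.array([[1, -1, 1, -1],
                     [1, 1, -1, -1],
                     [1, -1, -1, 1]], dtype=float).T

for name, Ca in [("Griffiths-Jim", Ca_griffiths_jim), ("Walsh", Ca_walsh)]:
    leakage = C.T @ Ca                               # should be all zeros
    rank = np.linalg.matrix_rank(Ca)                 # should be M - 1 = 3
    print(name, leakage.ravel(), rank)
```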