2011 Beamforming regularization, scaling matrices and inverse problems for sound field extrapolation and characterization


    Audio Engineering Society

    Convention Paper
    Presented at the 131st Convention
    2011 October 20-23, New York, USA

    This paper was peer-reviewed as a complete manuscript for presentation at this Convention. Additional papers may be obtained by sending request and remittance to Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA; also see www.aes.org. All rights reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the Journal of the Audio Engineering Society.

    Beamforming regularization, scaling matrices and inverse problems for sound field extrapolation and characterization: Part I - Theory

    Philippe-Aubert Gauthier1,2, Éric Chambatte1,2, Cédric Camier1,2, Yann Pasco1,2, and Alain Berry1,2

    1Groupe d'Acoustique de l'Université de Sherbrooke, Univ. de Sherbrooke, Sherbrooke, J1K 2R1 Canada

    2Centre for Interdisciplinary Research in Music, Media and Technology, McGill Univ., Montreal, H3A 1E3 Canada

    Correspondence should be addressed to Philippe-Aubert Gauthier

    ([email protected])

    ABSTRACT

    Sound field extrapolation (SFE) is aimed at the prediction of a sound field in an extrapolation region using a microphone array in a measurement region. For sound environment reproduction purposes, sound field characterization (SFC) aims at a more generic or parametric description of a measured or extrapolated sound field using different physical or subjective metrics. In this paper, a recently introduced SFE method is presented and further developed. The method is based on an inverse problem formulation combined with a beamforming matrix in the discrete smoothing norm of the cost function. The results obtained from the SFE method are applied to SFC for subsequent sound environment reproduction. A set of classification criteria is proposed to distinguish simple types of sound fields on the basis of two simple scalar metrics. A companion paper presents the experimental verifications of the theory presented in this paper.

    1. INTRODUCTION

    For spatial sound reproduction technologies based on

    physical simulation such as Wave Field Synthesis (WFS)

    [1, 2], the underlying hypothesis is that the immersion

    of a listener in a physical reconstruction of a target

    sound field will lead to an appropriate sound percep-

    tion over a large listening area. In this area, the local-

    ization cues (interaural level difference, interaural time

    difference and spectral modifications) are naturally derived
    from the interaction of the listener's body and external
    ears with the recreated sound field. To reproduce


    Gauthier et al. Sound field extrapolation and characterization I

    or recreate a real sound field or real sound environment,

    WFS and other physical reproduction techniques require

    a complete physical description of the target sound field.

    Sound field extrapolation using microphone array technologies is appropriate for this purpose. In this paper, a

    sound field extrapolation and characterization methodol-

    ogy is presented. The experimental tests of the method

    are reported in a companion paper.

    This work is part of a larger project which involves the

    entire sound field reproduction of an airplane cabin in a

    full-scale mock-up. The objective of the reported theory

    and experiments is to get preliminary insights about the

    efficiency and validity of the sound field extrapolation

    (SFE) and sound field characterization (SFC) methods in

    a practical situation. Preliminary experiments in laboratory
    conditions ensure the validation of the method before
    the realization of actual on-site measurements, SFE
    and SFC for subsequent sound environment reproduction

    in a mock-up of an airplane cabin.

    Sound field extrapolation (SFE) finds many applications

    in various domains: acoustic imaging, source localiza-

    tion, sound field reproduction, etc. SFE relies primarily

    on the measurement of a sound field using a microphone

    array placed in a measurement region. Among the most

    common techniques, one finds: inverse problems [3, 4]

    and spatial transform methods (such as nearfield acousti-

    cal holography [5]). In this paper, we consider an inverse

    method since this method can easily deal with any microphone array configuration, regular or not. However,

    the typically-large condition number of the matrix that

    must be inverted signals that matrix-form inverse prob-

    lems are sensitive to measurement noise [6]. Therefore,

    regularization of the inverse problem is mandatory. Usu-

    ally, with conventional regularization methods, this is at

    some expense: reduced spatial resolution and supple-

    mentary regularization errors. In a recent paper, a new

    measurement-data-dependent regularization method that

    suffers less from the aforementioned issues was intro-

    duced [7].

    The novelty of the method is that it applies a beamforming regularization matrix in the discrete smoothing norm

    of the cost function used to solve the inverse problem

    in the least-mean-square sense [3]. The advantages of

    this method are to increase the solution spatial resolu-

    tion and reduce the measurement noise sensitivity. In the

    inverse problem, the beamforming regularization matrix

    simply penalizes more strongly the sources for which an

    a priori delay-and-sum beamformer gives a weaker am-

    plitude. Recently, an experimental validation of the SFE

    method was reported [8]. The validation was based on

    the direct comparison of an extrapolated sound field with the exact sound field in an extrapolation region different
    from the measurement region. It was shown that the

    proposed SFE method is effective. In this paper, new the-

    oretical developments for the interpretation of the beam-

    forming regularization matrix will be introduced on the

    basis of the transformation of the general-form inverse

    problem with the beamforming regularization matrix to

    a standard-form inverse problem [3]. A companion paper
    (Part II) discusses the results of a complete experimental
    verification of this recently developed method for
    SFE. This companion paper presents experiments in a
    hemi-anechoic room and in a reverberant chamber.

    For sound environment and sound field reproduction purposes,
    SFE results can readily be applied to the derivation
    of multichannel signals using sound field reproduction
    technologies such as Wave Field Synthesis or Ambisonics
    [8, 10]. However, in some practical applications
    such as sound environment reproduction in vehicle
    mock-ups, the entire sound environment tends to be

    made of mostly stationary signals, at least for a finite pe-

    riod of time (corresponding to cruise speed, fixed alti-

    tude, stationary road condition, etc.). This very specific

    yet simplified nature of the sound environment encountered
    in most vehicles allows for the fragmentation of the sound
    environment into sound components or sound environment
    atoms [11, 12]. For such components of the

    entire sound environment, it is sometimes more useful to

    summarize the spatial property of the sound component

    by a few simple metrics using a general sound field
    characterization (SFC) method. For example, this is the
    hypothesis behind the Directional Audio Coding (DirAC)
    [13, 14, 15, 16] approach by Pulkki and coworkers for a
    point in space for which the spatial sound properties are
    summarized as impinging directions and diffuseness as a
    function of frequency. In this paper, we develop these

    ideas further and apply them to the typical SFE results

    obtained by the proposed method for an extended spatial area. Moreover, we propose a simple classification

    method to distinguish simple and generic types of sound

    field. This classification is deduced from direct observation
    of the metrics' efficiency to distinguish these generic

    types of sound field. Supplementary methods for virtual

    acoustics and simulations from microphone array mea-

    surement are also possible and discussed in [8].

    AES 131st Convention, New York, USA, 2011 October 20-23


    Fig. 1: Illustration of the coordinate systems. Microphones
    are located in $\mathbf{x}_m$. Points that belong to the inverse
    problem equivalent source distribution are denoted
    by $\mathbf{y}$. Any field point is denoted by $\mathbf{x}$. The real sound
    sources are confined to $V_s$.

    1.1. Paper structure

    Section 2 presents the general theory behind SFE using

    the inverse problem approach and the beamforming reg-

    ularization matrix as initially introduced in [7]. In this

    paper, the beamforming regularization theory introduced

    in [7] is further developed. The SFC metrics and methods
    are presented in Sec. 3 where several of these metrics are discussed and compared on the basis of simple yet

    archetypical theoretical cases. Based on the evaluation

    of these metrics, a sound field components classification

    tree is also proposed in Sec. 3. A short discussion and a

    conclusion gather the main concluding remarks.

    2. SOUND FIELD EXTRAPOLATION

    The generic microphone array and coordinate systems

    are shown in Fig. 1. The array includes $M$ microphones.
    For a given frequency, a sound pressure field measurement
    is stored in a complex vector $\mathbf{p}(\mathbf{x}_m) \in \mathbb{C}^{M}$.
    Although the method is developed in the frequency domain,
    it is possible to derive the resulting time-domain

    quantities using inverse Fourier transform as long as the

    equivalence between circular and linear convolution is

    respected with proper zero-padding of the input data [9].
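    The zero-padding condition mentioned above can be illustrated with a minimal sketch. It assumes real-valued signals and uses NumPy; the function name `fft_linear_convolution` and the sample values are hypothetical and are not taken from the paper or from [9].

```python
import numpy as np

def fft_linear_convolution(x, h):
    """Linear convolution of two real sequences computed with FFTs."""
    n_out = len(x) + len(h) - 1          # length of the linear convolution
    # rfft zero-pads each input to n_out samples, so the circular
    # convolution implied by the product of spectra equals the linear one.
    spectrum = np.fft.rfft(x, n_out) * np.fft.rfft(h, n_out)
    return np.fft.irfft(spectrum, n_out)

x = np.array([1.0, 2.0, 3.0, 4.0])       # made-up input block
h = np.array([0.5, -0.25, 0.125])        # made-up impulse response
y = fft_linear_convolution(x, h)
```

    Without the padding to `len(x) + len(h) - 1` samples, the tail of the result would wrap around onto its beginning, which is the circular-convolution artifact the text warns against.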

    2.1. Direct problem

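    The discrete direct sound radiation problem in matrix
    form:

    $$\mathbf{p}(\mathbf{x}_m) = \mathbf{G}(\mathbf{x}_m, \mathbf{y}_l)\,\mathbf{q}(\mathbf{y}_l), \tag{1}$$

    with

    $$\mathbf{p} \in \mathbb{C}^{M}, \quad \mathbf{G} \in \mathbb{C}^{M \times L} \quad \text{and} \quad \mathbf{q} \in \mathbb{C}^{L}, \tag{2}$$

    where $\mathbf{q}$ is the source strength vector for sources located
    in $\mathbf{y}_l$, $\mathbf{G}$ is the transfer matrix that represents sound radiation
    and $\mathbf{p}$ is the resulting sound pressure vector at the
    microphone locations $\mathbf{x}_m$. In this paper, a simple model
    of the direct problem is used: $\mathbf{q}$ are amplitudes of elemental
    plane waves propagating in different directions
    $\theta$ and $\phi$ in Fig. 1. Therefore, we let $R \to \infty$. Then, in
    this more specific case: $G_{ml} = e^{i \mathbf{k}_l \cdot \mathbf{x}_m}$ with $\mathbf{k}_l$ being the
    wave vector for the $l$-th plane wave ($\mathbf{k}_l = k \mathbf{n}_l$, $k = \omega/c$,
    $\omega$ is the angular frequency [rad/s], $c$ is the sound speed [m/s],
    $\mathbf{n}_l$ is a unit vector aligned with $\mathbf{k}_l$, $\theta_l$ and $\phi_l$ are the
    propagation azimuth and elevation). Many other types of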

    sources or idealized waves could be used in the direct

    problem definition. Indeed, one may object that spheri-

    cal or cylindrical harmonic waves could be more suitable

    for the inverse problem. This is only the case when the

    microphone array does not include the origin of the coordinate
    system. Indeed, the linear combination of spherical
    harmonics or cylindrical harmonics tends to numerically
    diverge in the immediate vicinity of the coordinate

    system origin. In our case, the microphone array a priori

    covers an extended area (as opposed to compact arrays

    such as the first-order Ambisonics Sound Field microphone [17]) and typically includes the origin, hence our

    interest for plane waves in the discrete problem. In all

    cases, it is possible to convert plane waves into spherical

    harmonics in a subsequent step.
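    As an illustration of the plane-wave direct problem, the following minimal sketch assembles the transfer matrix of Eq. (1) with NumPy. It assumes the sign convention $G_{ml} = e^{i \mathbf{k}_l \cdot \mathbf{x}_m}$ and the azimuth/elevation direction cosines of Fig. 2; the function names, the unit wavelength and the array geometry (borrowed from the linear-array example of Sec. 2.6) are illustrative assumptions only.

```python
import numpy as np

def direction_vectors(azimuth, elevation):
    """Unit vectors n_l built from azimuth and elevation angles [rad]."""
    az = np.asarray(azimuth, dtype=float)
    el = np.asarray(elevation, dtype=float)
    return np.stack([np.cos(az) * np.cos(el),
                     np.sin(az) * np.cos(el),
                     np.sin(el)], axis=-1)           # shape (L, 3)

def plane_wave_matrix(mic_positions, n, k):
    """Transfer matrix with entries G[m, l] = exp(i k n_l . x_m)."""
    return np.exp(1j * k * (mic_positions @ n.T))    # shape (M, L)

# Assumed setup: 32 microphones on a line spanning 4 wavelengths and
# 256 horizontal plane waves with azimuths between 0 and pi.
wavelength = 1.0
k = 2.0 * np.pi / wavelength
x_m = np.zeros((32, 3))
x_m[:, 0] = np.linspace(-2.0, 2.0, 32) * wavelength
n_l = direction_vectors(np.linspace(0.0, np.pi, 256), np.zeros(256))
G = plane_wave_matrix(x_m, n_l, k)
```

    Every entry of `G` has unit modulus, since an ideal plane wave only changes phase between microphones.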

    2.2. Sound field extrapolation outside the microphone array

    The extrapolated sound pressure field [Pa] at any location
    $\mathbf{x}$ is then computed using a linear combination of plane
    waves

    $$p(\mathbf{x}) = \sum_{l=1}^{L} e^{i \mathbf{k}_l \cdot \mathbf{x}} q_l, \tag{3}$$

    where the complex plane wave distribution is centered
    around the coordinate system origin $\mathbf{x} = \mathbf{0}$. Indeed, one
    notes that the sound pressure at the origin is the direct linear
    combination of the plane wave complex amplitudes

    $$p(\mathbf{0}) = \sum_{l=1}^{L} q_l. \tag{4}$$


    For any field point $\mathbf{x}$ that excludes the array origin, it
    would be interesting for notational purposes to obtain an
    expression similar to Eq. (4) where a new complex plane
    wave distribution $\mathbf{q}(\mathbf{x})$ would be centered around the field
    point $\mathbf{x}$. This corresponds to a simple translation of the
    coordinate system origin. This is expressed as follows

    $$p(\mathbf{x}) = \sum_{l=1}^{L} q_l(\mathbf{x}), \tag{5}$$

    with

    $$q_l(\mathbf{x}) = e^{i \mathbf{k}_l \cdot \mathbf{x}} q_l. \tag{6}$$
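    Equations (3) to (6) can be checked numerically with a short sketch. The random plane wave amplitudes, the field point and the helper name `steering` are assumptions made for illustration, and the exponent sign follows Eq. (3) as written above.

```python
import numpy as np

def steering(points, n, k):
    """Matrix of exp(i k n_l . x) for each point x (rows) and direction n_l."""
    return np.exp(1j * k * (np.atleast_2d(points) @ n.T))

rng = np.random.default_rng(0)
k = 2.0 * np.pi                                   # unit wavelength assumed
azimuth = rng.uniform(0.0, 2.0 * np.pi, 16)
n = np.stack([np.cos(azimuth), np.sin(azimuth), np.zeros(16)], axis=1)
q = rng.standard_normal(16) + 1j * rng.standard_normal(16)

x = np.array([0.3, -0.2, 0.1])                    # arbitrary field point
p_origin = (steering(np.zeros(3), n, k) @ q)[0]   # Eq. (3) at the origin
p_x = (steering(x, n, k) @ q)[0]                  # Eq. (3) at x
q_shifted = steering(x, n, k)[0] * q              # translated amplitudes, Eq. (6)
```

    Here `p_origin` equals the plain sum of `q` (Eq. (4)) and the sum of `q_shifted` equals `p_x` (Eq. (5)), which is the translation property used in the text.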

    2.3. Inverse problem: general-form and standard-form of Tikhonov regularization

    For SFE, the goal of the inverse problem is to estimate the
    source amplitude $\mathbf{q}$ that best predicts the measured sound
    field $\mathbf{p}$, knowing the propagation operator $\mathbf{G}$. Put simply,
    as for most classical inverse problems in acoustics, we
    ask for the causes, i.e. the source amplitudes $\mathbf{q}$, that created
    the effect, i.e. the measurement data $\mathbf{p}$, for a given
    known system model $\mathbf{G}$, i.e. for an imposed geometry
    of source distribution. Note that for practical applications,
    the source geometry is imposed and specified, but
    for now, we keep an unspecified definition of the source
    distribution $\mathbf{y}_l$ to propose a general view of the method.

    A typical approach to that problem is to cast it as a minimization
    problem with Tikhonov regularization [3]:

    $$\mathbf{q} = \operatorname{argmin} \left\{ \|\mathbf{p} - \mathbf{G}\mathbf{q}\|_2^2 + \lambda^2 \Omega(\mathbf{q})^2 \right\}. \tag{7}$$

    In Eq. (7), $\|\cdot\|_2$ represents the vector 2-norm ($\|\mathbf{x}\|_2^2 = \mathbf{x}^H \mathbf{x}$,
    superscript $H$ denotes Hermitian transpose), $\lambda$ is
    the penalization parameter and $\Omega(\cdot)$ is a discrete smoothing
    norm [3]. The function $\Omega(\cdot)$ is termed a discrete
    smoothing norm because it smoothly regularizes the unknown
    solution $\mathbf{q}$. In classical Tikhonov regularization
    as reported in many papers [4, 6, 18, 19, 20], the discrete
    smoothing norm is the solution vector 2-norm [3]:
    $\Omega(\mathbf{q}) = \|\mathbf{q}\|_2$. The inverse problem solution $\mathbf{q}$ should
    approach the real sound source distribution or should, at least, be able to achieve SFE according to Eq. (3) within
    an extrapolation region for which the prediction error
    would be below a given threshold.

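    In this paper, we will assume that the discrete smoothing
    norm is of the more general form

    $$\Omega(\mathbf{q}) = \|\mathbf{L}\mathbf{q}\|_2, \tag{8}$$

    where $\mathbf{L} \in \mathbb{C}^{N \times L}$ is a rectangular or square weighting matrix.
    Then, one writes the general-form inverse problem
    as

    $$\mathbf{q} = \operatorname{argmin} \left\{ \|\mathbf{p} - \mathbf{G}\mathbf{q}\|_2^2 + \lambda^2 \|\mathbf{L}\mathbf{q}\|_2^2 \right\}, \tag{9}$$

    and the standard-form inverse problem as [3]

    $$\bar{\mathbf{q}} = \operatorname{argmin} \left\{ \|\bar{\mathbf{p}} - \bar{\mathbf{G}}\bar{\mathbf{q}}\|_2^2 + \lambda^2 \|\bar{\mathbf{q}}\|_2^2 \right\}. \tag{10}$$

    In this equation, the new standard-form matrices and vectors
    $\bar{\mathbf{G}}$ and $\bar{\mathbf{p}}$ must be computed from $\mathbf{G}$ and $\mathbf{p}$ which, depending
    on the weighting matrix $\mathbf{L}$, might not be a trivial
    task. This must often be achieved using numerical methods [3].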

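    However, when the weighting matrix is square ($\mathbf{L} \in \mathbb{C}^{L \times L}$)
    and when its inverse exists, one directly obtains
    the standard-form transformed quantities

    $$\bar{\mathbf{G}} = \mathbf{G}\mathbf{L}^{-1}, \quad \bar{\mathbf{p}} = \mathbf{p} \quad \text{and} \quad \mathbf{q} = \mathbf{L}^{-1}\bar{\mathbf{q}}. \tag{11}$$

    When $\mathbf{L} = \mathbf{I}$, as often reported in the literature, there is no difference between the general-form and standard-form
    problems. The optimal solution of the general-form
    problem Eq. (9) is [3, 4, 6]

    $$\mathbf{q} = \left( \mathbf{G}^H \mathbf{G} + \lambda^2 \mathbf{L}^H \mathbf{L} \right)^{-1} \mathbf{G}^H \mathbf{p}. \tag{12}$$

    The solution of the standard-form problem Eq. (10) is
    [3, 4, 6]

    $$\bar{\mathbf{q}} = \left( \bar{\mathbf{G}}^H \bar{\mathbf{G}} + \lambda^2 \mathbf{I} \right)^{-1} \bar{\mathbf{G}}^H \bar{\mathbf{p}}. \tag{13}$$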
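    A minimal numerical sketch of Eqs. (12) and (13) follows. It is not the authors' implementation: the function name `tikhonov_general`, the random test matrices and the value of the penalization parameter are assumptions chosen only to show how the regularized normal equations can be solved with NumPy.

```python
import numpy as np

def tikhonov_general(G, p, lam, L=None):
    """Solve Eq. (12); with L=None the identity is used, i.e. Eq. (13)."""
    n_sources = G.shape[1]
    if L is None:
        L = np.eye(n_sources)
    normal_matrix = G.conj().T @ G + lam**2 * (L.conj().T @ L)
    return np.linalg.solve(normal_matrix, G.conj().T @ p)

# Made-up test problem: 8 microphones, 20 sources, diagonal weighting.
rng = np.random.default_rng(1)
M, n_src, lam = 8, 20, 0.1
G = rng.standard_normal((M, n_src)) + 1j * rng.standard_normal((M, n_src))
p = rng.standard_normal(M) + 1j * rng.standard_normal(M)
L = np.diag(rng.uniform(0.5, 2.0, n_src))
q = tikhonov_general(G, p, lam, L)

def cost(q_trial):
    """Cost function of Eq. (9) evaluated for a trial solution."""
    return (np.linalg.norm(p - G @ q_trial)**2
            + lam**2 * np.linalg.norm(L @ q_trial)**2)
```

    Solving the linear system with `np.linalg.solve` avoids forming the explicit inverse; any perturbation of `q` increases `cost`, as expected for the minimizer of Eq. (9).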

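    For the specific case of Eq. (11) (i.e. $\mathbf{L} \in \mathbb{C}^{L \times L}$ and $\det(\mathbf{L}) \neq 0$), Eq. (13) can be directly expressed as a function of $\mathbf{L}$

    $$\mathbf{q} = \mathbf{L}^{-1}\bar{\mathbf{q}} = \mathbf{L}^{-1} \left( [\mathbf{L}^{-1}]^T \mathbf{G}^H \mathbf{G} \mathbf{L}^{-1} + \lambda^2 \mathbf{I} \right)^{-1} [\mathbf{L}^{-1}]^T \mathbf{G}^H \mathbf{p} \tag{14}$$

    with superscript $T$ denoting matrix transposition, which coincides
    with the Hermitian transpose when $\mathbf{L}$ is real-valued. Therefore,
    for the specific case of Eq. (11), it is possible to interpret
    the problem from two equivalent vantage points:
    a general-form problem with a regularization matrix $\mathbf{L}$ (Eq. (12)) or a standard-form problem with a transformation
    matrix $\mathbf{L}^{-1}$ that transforms the propagation operator
    $\mathbf{G}$ (Eq. (13)). In the even more specific case of a diagonal
    matrix $\mathbf{L}$, the regularization matrix $\mathbf{L}$ puts weights on the
    individual solution components $q_l$ and the transformation
    matrix is a diagonal matrix $\mathbf{L}^{-1}$ that scales columns
    of the propagation operator $\mathbf{G}$.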
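    The equivalence between the two vantage points can be verified numerically. The sketch below assumes a real, invertible and well-conditioned weighting matrix with made-up entries; variable names such as `q_general` and `q_standard` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
M, n_src, lam = 6, 12, 0.05
G = rng.standard_normal((M, n_src)) + 1j * rng.standard_normal((M, n_src))
p = rng.standard_normal(M) + 1j * rng.standard_normal(M)
# Real, invertible weighting matrix (made-up values, diagonally dominated).
L = rng.standard_normal((n_src, n_src)) + n_src * np.eye(n_src)

# General form, Eq. (12).
q_general = np.linalg.solve(G.conj().T @ G + lam**2 * (L.T @ L),
                            G.conj().T @ p)

# Standard form: transform with Eq. (11), solve Eq. (13), map back.
L_inv = np.linalg.inv(L)
G_bar = G @ L_inv
q_bar = np.linalg.solve(G_bar.conj().T @ G_bar + lam**2 * np.eye(n_src),
                        G_bar.conj().T @ p)
q_standard = L_inv @ q_bar
```

    Both routes give the same source amplitudes up to round-off, which is the content of Eq. (14).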


    in reference [7]. Moreover, initial experiments with the
    beamforming regularization matrix in a hemi-anechoic
    room, with a 96-microphone array and with a single omnidirectional
    source showed the practicability and validity of the method [8].

    To illustrate the validity of the method, a simple theoretical
    example is given in Figs. 2, 3 and 4. The direct problem
    source distribution involves 642 plane waves coming
    from $4\pi$ steradians. The microphone array is shown in Fig. 3: it is made of 256 microphones. The original
    sound field is created by a dipole in $\mathbf{x} = [1.8\lambda_k, 2\lambda_k, 0]^T$
    where $\lambda_k$ is the acoustic wavelength. Real parts of the original and extrapolated sound fields are shown in
    Fig. 4. The SFE was based on Eq. (20) with $\lambda = 0.01$. Clearly, SFE using the proposed method is effective.

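    In the following, we illustrate the equivalence of the
    beamforming regularization matrix and the corresponding
    scaling matrix $\mathcal{L}$ (see Eqs. (11) to (13))

    $$\mathcal{L} = \mathbf{L}^{-1} = \operatorname{diag}\left( |\mathbf{G}^H \mathbf{p}| / \|\mathbf{G}^H \mathbf{p}\|_\infty \right) \in \mathbb{R}^{L \times L}. \tag{21}$$

    Then, Eq. (14) gives

    $$\mathbf{q}_{BF} = \mathcal{L} \left( \mathcal{L}\, \mathbf{G}^H \mathbf{G}\, \mathcal{L} + \lambda^2 \mathbf{I} \right)^{-1} \mathcal{L}\, \mathbf{G}^H \mathbf{p}, \quad \mathcal{L} = \operatorname{diag}\left( |\mathbf{Q}_{BF}| / \|\mathbf{Q}_{BF}\|_\infty \right), \tag{22}$$

    since $\mathcal{L}$ is diagonal. This solution is equivalent to
    Eq. (20). Therefore, it is possible to interpret the original
    beamforming regularization matrix as a standard-form problem using a data-dependent scaled system matrix $\bar{\mathbf{G}}$

    $$\bar{\mathbf{G}} = \mathbf{G}\mathcal{L} = \mathbf{G} \operatorname{diag}\left( |\mathbf{G}^H \mathbf{p}| / \|\mathbf{G}^H \mathbf{p}\|_\infty \right), \tag{23}$$

    with $\mathbf{q}_{BF} = \mathcal{L}\bar{\mathbf{q}}_{BF}$. Matrix $\mathcal{L}$ will be called the scaling
    matrix.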
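    The data-dependent scaling can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the beamformer output is taken as $\mathbf{G}^H \mathbf{p}$, its magnitude is normalized by its largest entry (an assumed reading of the normalization in Eq. (21)), and the function names and random test data are hypothetical.

```python
import numpy as np

def beamforming_scaling(G, p):
    """Diagonal of the scaling matrix: |G^H p| divided by its largest entry."""
    q_bf = np.abs(G.conj().T @ p)        # delay-and-sum beamformer magnitude
    return q_bf / q_bf.max()

def solve_scaled_standard_form(G, p, lam):
    """Standard-form solution with the scaled system matrix of Eq. (23)."""
    scale = beamforming_scaling(G, p)
    G_bar = G * scale                    # scales column l of G by scale[l]
    n_src = G.shape[1]
    q_bar = np.linalg.solve(G_bar.conj().T @ G_bar + lam**2 * np.eye(n_src),
                            G_bar.conj().T @ p)
    return scale * q_bar                 # back to the unscaled amplitudes

rng = np.random.default_rng(3)
M, n_src, lam = 8, 20, 0.1
G = rng.standard_normal((M, n_src)) + 1j * rng.standard_normal((M, n_src))
p = rng.standard_normal(M) + 1j * rng.standard_normal(M)

q_scaled = solve_scaled_standard_form(G, p, lam)

# Same problem in general form with the regularization matrix L = scale^-1.
scale = beamforming_scaling(G, p)
L_reg = np.diag(1.0 / scale)
q_general = np.linalg.solve(G.conj().T @ G + lam**2 * (L_reg.T @ L_reg),
                            G.conj().T @ p)
```

    The scaled standard-form route never inverts the beamformer output, so it remains usable when some beamformer amplitudes are close to zero, whereas the explicit regularization matrix `L_reg` would then contain very large entries.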

    2.6. Spatial resolution: equivalence between the beamforming regularization matrix and beamforming scaling matrix

    As discussed in [7], since the beamforming regulariza-

    tion matrix involves a general-form inverse problem, one

    must rely on the generalized singular value decomposition (GSVD) of the matrix pair $\mathbf{G}$ and $\mathbf{L}$ to evaluate

    the possible spatial resolution of the problem. This is

    typically evaluated on the basis of the generalized sin-

    gular vectors. In the case of the beamforming scaling

    matrix, the problem is written in standard form and the

    spatial resolution of the problem can be evaluated on the

    basis of the singular value decomposition (SVD) of the

    Fig. 2: Spherical distribution of $L = 642$ incoming plane waves. Each propagating direction $\theta_l$ and $\phi_l$ is shown as a black dot on the sphere with the corresponding direction cosines. Axes: $\cos(\theta_l)\cos(\phi_l)$, $\sin(\theta_l)\cos(\phi_l)$ and $\sin(\phi_l)$.

    Fig. 3: Theoretical 256-microphone array geometry. Microphones
    are horizontally aligned with a uniform rectangular
    grid (shown in grey) and they are randomly positioned along $x_3$ on the basis of a two-layer geometry
    (with 0.1 wavelength as the vertical separation distance).
    Microphone acoustic centers are shown as black dots.
    The problem is dimensionless and the array spans two
    wavelengths along $x_1$ and $x_2$. Axes: $x_1$, $x_2$ and $x_3$.


    Fig. 4: (a): Real part of the original sound field created
    by a dipole sound source in $\mathbf{x} = [1.8\lambda_k, 2\lambda_k, 0]^T$ in the
    direction $\pi/2.25$ radians. Field positions are normalized
    by the acoustical wavelength $\lambda_k$. (b): Real part of the extrapolated sound field using a plane wave source distribution
    obtained from Eq. (20) with $\lambda = 0.01$. The dipole is marked as a black and white dot. The dipole main
    and orthogonal axes are highlighted by dashed black and
    white lines. The white contour line indicates the region
    of 0.001 of local quadratic SFE error and the black contour
    line indicates the region of 0.1 of local quadratic
    SFE error ($\|p(\mathbf{x}) - \hat{p}(\mathbf{x})\|_2^2$).

    scaled system matrix $\bar{\mathbf{G}}$. In this paper, the equivalence
    of the two formulations in terms of spatial resolution
    will be illustrated. Furthermore, this demonstration will
    also illustrate the better spatial resolution obtained by the beamforming regularization or scaling matrices.

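    The SVD of $\bar{\mathbf{G}}$ is given by

    $$\bar{\mathbf{G}} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^H = \sum_{i=1}^{M} \mathbf{u}_i \sigma_i \mathbf{v}_i^H, \tag{24}$$

    with unitary matrices $\mathbf{U} \in \mathbb{C}^{M \times M}$ and $\mathbf{V} \in \mathbb{C}^{L \times L}$
    ($\mathbf{U}^H \mathbf{U} = \mathbf{V}^H \mathbf{V} = \mathbf{I}$). In Eq. (24), the vectors $\mathbf{u}_i$ and $\mathbf{v}_i$ are the left
    and right singular vectors, respectively. They correspond
    to the columns of $\mathbf{U}$ and $\mathbf{V}$. Each singular vector pair
    corresponds to a singular value $\sigma_i$ stored on the main
    diagonal of $\mathbf{\Sigma} \in \mathbb{R}^{M \times L}$. It is assumed that the number
    of microphones is smaller than the number of unknown
    sources. The singular values are ordered in decreasing
    order ($\sigma_1 \geq \sigma_2 \geq \dots > 0$). On the basis of this SVD, the solution of the standard form is written

    $$\mathbf{q}_{BF} = \mathcal{L} \sum_{i=1}^{M} f_i \frac{\mathbf{u}_i^H \mathbf{p}}{\sigma_i} \mathbf{v}_i \tag{25}$$

    where the filter factors $f_i = \sigma_i^2 / (\sigma_i^2 + \lambda^2)$ represent the regularization effect.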
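    The filter-factor form of the standard-form solution can be sketched with NumPy's SVD. The function name `tikhonov_svd` and the random system matrix are assumptions; the function returns the standard-form solution, so the prefactor $\mathcal{L}$ of Eq. (25) would still have to be applied to recover $\mathbf{q}_{BF}$.

```python
import numpy as np

def tikhonov_svd(G_bar, p, lam):
    """Standard-form Tikhonov solution written with SVD filter factors."""
    # Economy-size SVD: U is (M, M), sigma has M entries, Vh is (M, L)
    # when there are fewer microphones than sources.
    U, sigma, Vh = np.linalg.svd(G_bar, full_matrices=False)
    f = sigma**2 / (sigma**2 + lam**2)        # filter factors
    coeffs = f * (U.conj().T @ p) / sigma     # f_i (u_i^H p) / sigma_i
    return Vh.conj().T @ coeffs, f            # sum_i coeffs_i v_i

rng = np.random.default_rng(4)
M, n_src, lam = 8, 20, 0.5
G_bar = rng.standard_normal((M, n_src)) + 1j * rng.standard_normal((M, n_src))
p = rng.standard_normal(M) + 1j * rng.standard_normal(M)

q_svd, f = tikhonov_svd(G_bar, p, lam)
q_direct = np.linalg.solve(G_bar.conj().T @ G_bar + lam**2 * np.eye(n_src),
                           G_bar.conj().T @ p)
```

    Filter factors close to one leave a singular component untouched, while factors close to zero damp the components whose singular values are small compared with the penalization parameter.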

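    The GSVD of $\mathbf{G}$ and $\mathbf{L}$ is given by [3, 21] with $\tilde{\mathbf{U}} \in \mathbb{C}^{M \times M}$,
    $\tilde{\mathbf{V}} \in \mathbb{C}^{L \times L}$, $\mathbf{C} \in \mathbb{R}^{M \times L}$, $\mathbf{M} \in \mathbb{R}^{L \times L}$ and $\mathbf{Z} \in \mathbb{C}^{L \times L}$.
    The columns of $\tilde{\mathbf{U}}$ and $\tilde{\mathbf{V}}$ are orthonormal ($\tilde{\mathbf{U}}^H \tilde{\mathbf{U}} = \mathbf{I}$ and $\tilde{\mathbf{V}}^H \tilde{\mathbf{V}} = \mathbf{I}$) and $\mathbf{Z}$ is nonsingular. The columns of $\tilde{\mathbf{U}}$, $\tilde{\mathbf{V}}$
    and $\mathbf{Z}$ ($\tilde{\mathbf{u}}_i$, $\tilde{\mathbf{v}}_i$ and $\mathbf{z}_i$, respectively) form a new set of singular
    vectors that are used as independent basis vectors.
    The columns of $\tilde{\mathbf{U}}$ are used as basis vectors for acoustic
    pressure $\mathbf{p}$ while the columns of $\mathbf{Z}$ are used as basis vectors
    for source distribution $\mathbf{q}$. Note that $\tilde{\mathbf{U}}$ and $\tilde{\mathbf{V}}$ are not
    equal to those found from standard SVD (namely, $\mathbf{U}$ and
    $\mathbf{V}$). Matrices $\mathbf{C}$ and $\mathbf{M}$ have their coefficients $c_i$ and $m_i$ stored in increasing order on their main diagonals. The
    generalized singular values $\gamma_i$ are given by

    $$\gamma_i = c_i / m_i. \tag{26}$$

    On the basis of this GSVD, Eq. (12) is written

    $$\mathbf{q}_{BF} = \sum_{i=1}^{M} f_i \frac{\tilde{\mathbf{u}}_i^H \mathbf{p}}{c_i} \mathbf{z}_i \tag{27}$$

    where $f_i = \gamma_i^2 / (\gamma_i^2 + \lambda^2)$ represents the regularization effect on the solution.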


    An example is now introduced to highlight the increased
    spatial resolution of the inverse problem approach with
    the beamforming regularization matrix in comparison
    with the inverse problem with the identity matrix as the weighting matrix. Moreover, this example will illustrate
    the fact that the spatial resolution obtained with
    the beamforming regularization matrix $\mathbf{L}$ written in general
    form is equal to the resolution obtained with the
    beamforming scaled system matrix $\bar{\mathbf{G}}$ written in standard
    form. To illustrate this property, we simply rely on the
    comparison of the spatial resolution of the generalized
    singular vectors $\mathbf{z}_i$ and the scaled singular vectors $\mathcal{L}\mathbf{v}_i$
    since they both form the orthogonal bases of the solutions
    of the standard form Eq. (25) and the general form
    Eq. (27). The $\mathcal{L}\mathbf{v}_i$ and $\mathbf{z}_i$ are presented in Figs. 5(a) and
    5(b), respectively. For comparison purposes, the right singular vectors
    of the matrix $\mathbf{G}$ are shown.

    The example involves a linear microphone array of $M = 32$ microphones spanning 4 acoustical wavelengths $\lambda_k$ with a plane wave distribution of $L = 256$ plane waves
    ($\theta_l = 0, \dots, \pi$ and $\phi_l = 0$) and for a plane wave incident from $\theta = \pi/2$. First note the effect of the beamforming
    regularization matrix and the corresponding scaling
    matrix by comparison with the standard system matrix
    $\mathbf{G}$: they provide a locally-increased spatial resolution in
    the vicinity of the impinging sound wave. Moreover, the
    comparison of Fig. 5(a) and Fig. 5(b) illustrates the exact
    correspondence of the vector bases used for the standard
    form Eq. (25) and the general form Eq. (27). Therefore, on the basis of this example and Eqs. (20) and (22),
    the increased spatial resolution property associated with the
    beamforming regularization matrix method (as originally
    presented on the basis of the GSVD in [7]) is equivalent
    to the increased spatial resolution property for the scaled
    system matrix.

    3. SOUND FIELD CHARACTERIZATION

    In this section, several metrics and quantifiers are pre-

    sented to characterize the measured and extrapolated

    sound fields for a given frequency on the basis of the inverse problem solutions $\mathbf{q}$, $\mathbf{q}_{BF}$ or $q(\theta, \phi)$, i.e. the plane wave distributions. In some cases, the metrics are

    computed directly from the inverse problem solution and

    in some other cases the metrics are computed from the

    SFE result, namely the sound pressure or the particle velocity.
    The presented metrics are either objective or subjective
    predictors. A distinction is also introduced between
    local and global metrics. It is known from the literature
    that some metrics are more effective at predicting the
    listener's sound localization in different frequency bands
    [24]; therefore, several metrics are presented and discussed before being exemplified using the SFE method

    presented earlier.

    3.1. Sound intensity and direction-of-arrival

    fields and averages

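    The extrapolated sound pressure field [Pa] as a function of
    $\mathbf{x}$ is given by the algebraic superposition of the $L$ harmonic
    plane waves used in the direct problem as expressed
    in Eq. (3).

    The acoustic velocity field $\mathbf{u}(\mathbf{x})$ [m/s] is computed using
    the linearized Euler equation [25]

    $$\mathbf{u}(\mathbf{x}) = \frac{\nabla p(\mathbf{x})}{i \omega \rho}, \tag{28}$$

    with $\rho$ being the air density [kg/m3], $\omega$ the angular frequency [rad/s] and $\nabla$ the gradient operator given by

    $$\nabla = \frac{\partial}{\partial x_1}\mathbf{e}_1 + \frac{\partial}{\partial x_2}\mathbf{e}_2 + \frac{\partial}{\partial x_3}\mathbf{e}_3 \tag{29}$$

    where $\mathbf{e}_i$ is a canonical vector [21] pointing in the $x_i$ direction.
    Accordingly, for the problem at hand, one finds

    $$\nabla p(\mathbf{x}) = \sum_{l=1}^{L} i \mathbf{k}_l e^{i \mathbf{k}_l \cdot \mathbf{x}} q_l, \tag{30}$$

    and

    $$\mathbf{u}(\mathbf{x}) = \sum_{l=1}^{L} \frac{\mathbf{n}_l}{\rho c} e^{i \mathbf{k}_l \cdot \mathbf{x}} q_l, \tag{31}$$

    with $\mathbf{n}_l = \mathbf{k}_l / \|\mathbf{k}_l\|_2$ being a unit vector collinear with $\mathbf{k}_l$. For a given harmonic sound field, the time averaged
    acoustic intensity $\mathbf{I}(\mathbf{x})$ [W/m2] is given by [25]

    $$\mathbf{I}(\mathbf{x}) = \frac{1}{2} \Re\left[ p(\mathbf{x})\, \mathbf{u}(\mathbf{x})^* \right], \tag{32}$$

    where $^*$ denotes complex conjugation, which gives, in our specific case

    $$\mathbf{I}(\mathbf{x}) = \frac{1}{2} \Re\left[ \left( \sum_{l=1}^{L} e^{i \mathbf{k}_l \cdot \mathbf{x}} q_l \right) \left( \sum_{l=1}^{L} \frac{\mathbf{n}_l}{\rho c} e^{i \mathbf{k}_l \cdot \mathbf{x}} q_l \right)^* \right]. \tag{33}$$

    Many metrics presented in the sequel are derived from
    the sound pressure, velocity and intensity fields.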
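    A minimal sketch of Eqs. (3), (31) and (32) for a plane wave distribution is given below. The air density, sound speed, frequency and field point are assumed values, and the function names are hypothetical.

```python
import numpy as np

RHO = 1.2      # assumed air density [kg/m^3]
C = 343.0      # assumed sound speed [m/s]

def pressure_and_velocity(x, n, q, k):
    """Pressure (Eq. (3)) and velocity (Eq. (31)) at one field point x."""
    waves = np.exp(1j * k * (n @ x)) * q              # per-wave pressure at x
    p = waves.sum()
    u = (n * waves[:, None]).sum(axis=0) / (RHO * C)  # complex 3-vector
    return p, u

def intensity(p, u):
    """Time-averaged intensity vector of Eq. (32)."""
    return 0.5 * np.real(p * np.conj(u))

# Single plane wave travelling along x1 (made-up amplitude and frequency).
k = 2.0 * np.pi * 1000.0 / C
n = np.array([[1.0, 0.0, 0.0]])
q = np.array([2.0 + 1.0j])
p, u = pressure_and_velocity(np.array([0.4, 0.1, 0.0]), n, q, k)
I = intensity(p, u)
```

    For this single plane wave the intensity points along the propagation direction with magnitude $|q|^2 / (2 \rho c)$, independently of the field point.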


    Fig. 5: Absolute value of the beamforming output $\mathbf{Q}_{BF}$ (Eq. (15)) (top) and the first six (from top) singular vectors
    (black lines) (a) and generalized singular vectors (black lines) (b) for a linear microphone array of 32 microphones
    spanning 4 acoustical wavelengths $\lambda_k$ with a plane wave distribution of 256 plane waves ($\theta_l = 0, \dots, \pi$ and $\phi_l = 0$) and for a plane wave incident from $\theta = \pi/2$. The real parts of the vectors are shown as continuous lines and the imaginary parts of the vectors as dashed lines. For comparison purposes, the right singular vectors of the matrix $\mathbf{G}$ are shown in
    grey. Panels: (a) $|\mathbf{Q}_{BF}|$ and $\mathcal{L}\mathbf{v}_1$ to $\mathcal{L}\mathbf{v}_6$; (b) $|\mathbf{Q}_{BF}|$ and $\mathbf{z}_1$ to $\mathbf{z}_6$. Horizontal axis: $\theta_l/\pi$.


    The direction of the sound intensity $\mathbf{I}(\mathbf{x})$ can also be used to predict a local indication of the DOA (direction-of-arrival).
    Indeed, the DOA is a unit vector in the opposite
    direction of the sound intensity vector. Then the DOA vector $\mathbf{n}_{DOA}(\mathbf{x})$ is given by

    $$\mathbf{n}_{DOA}(\mathbf{x}) = -\frac{\mathbf{I}(\mathbf{x})}{\|\mathbf{I}(\mathbf{x})\|_2}. \tag{34}$$

    For a set of $N$ SFE points $\mathbf{x}_n$, the average intensity vector
    is introduced for a spatially discrete extrapolation region:

    $$\langle \mathbf{I} \rangle_N = \frac{1}{N} \sum_{n=1}^{N} \mathbf{I}(\mathbf{x}_n). \tag{35}$$

    This averaging operation can also be computed for all
    the subsequent metrics, including the DOA. It will be
    systematically denoted by $\langle \cdot \rangle_N$.
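    The DOA and the spatial average can be sketched as follows, assuming the intensity vectors have already been computed at the $N$ extrapolation points. The function names and the three sample intensity vectors are made up for illustration.

```python
import numpy as np

def doa(intensity_vectors):
    """Eq. (34): unit vectors opposite to each intensity vector (one per row)."""
    I = np.atleast_2d(intensity_vectors)
    return -I / np.linalg.norm(I, axis=1, keepdims=True)

def spatial_average(values):
    """Eq. (35): arithmetic mean over the N extrapolation points (axis 0)."""
    return np.mean(values, axis=0)

# Made-up intensity vectors at N = 3 field points [W/m^2].
I_field = np.array([[2.0, 0.0, 0.0],
                    [0.0, 3.0, 0.0],
                    [1.0, 1.0, 0.0]])
n_doa = doa(I_field)
I_avg = spatial_average(I_field)
n_doa_avg = spatial_average(n_doa)
```

    Note that the average of unit DOA vectors, `n_doa_avg`, is generally not a unit vector; whether it should be renormalized is a design choice that the text leaves open.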

    Intensity and DOA fields are both objective and subjective
    metrics: they represent a directional transport of
    acoustical energy, but they are also sometimes used as
    indicators of sound localization by human hearing, especially
    the DOA. However, the sound intensity solely
    expresses the net flow of energy; like the DOA, it does not
    indicate the direction of particular simultaneous arrivals.

    3.2. Energy density field and average energy

    The local time-averaged energy density field $E(\mathbf{x})$ of a harmonic acoustic sound field is a combination of kinetic
    and potential energy density fields, $E_c(\mathbf{x})$ and $E_p(\mathbf{x})$ [J/m3], respectively [25]:

    $$E(\mathbf{x}) = E_c(\mathbf{x}) + E_p(\mathbf{x}) = \frac{\rho}{4} \left[ \|\mathbf{u}(\mathbf{x})\|_2^2 + \frac{|p(\mathbf{x})|^2}{(\rho c)^2} \right]. \tag{36}$$

    According to Eqs. (3) and (31), one obtains for the problem
    at hand

    $$E(\mathbf{x}) = \frac{\rho}{4} \left[ \left\| \sum_{l=1}^{L} \frac{\mathbf{n}_l}{\rho c} e^{i \mathbf{k}_l \cdot \mathbf{x}} q_l \right\|_2^2 + \frac{\left| \sum_{l=1}^{L} e^{i \mathbf{k}_l \cdot \mathbf{x}} q_l \right|^2}{(\rho c)^2} \right]. \tag{37}$$

    The energy density field can provide some interesting
    insights about a measured sound field. For a completely
    diffuse sound field, the local spatial average energy
    density field $\langle E(\mathbf{x}) \rangle_N$ (with $N$ neighboring points of $\mathbf{x}$) should be constant in space [25]. For a harmonic
    sound field, a local spatial average is an average over
    a volume with dimensions larger than the wavelength
    [25]. However, in practical situations, the local energy
    density field $E(\mathbf{x})$ is not spatially uniform and this introduces
    some issues. This will be discussed in Secs. 3.6 and 3.7.

    In this paper, we also introduce the normalized standard deviation of the local energy density with respect to the average energy density $\langle E \rangle_N$:

    $$\sigma_N = \frac{1}{N \langle E \rangle_N} \sum_{n=1}^{N} \left| E(\mathbf{x}_n) - \langle E \rangle_N \right| \times 100\,\%, \tag{38}$$

    in % of $\langle E \rangle_N$. A small $\sigma_N$ suggests a uniform distribution of the energy density while a large $\sigma_N$ suggests a heterogeneous distribution of the energy density.
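The normalized deviation of Eq. (38) is straightforward to compute from sampled energy densities. The sketch below follows the equation as printed, that is, with absolute deviations from the mean rather than squared ones; the function name and the sample values are hypothetical.

```python
def normalized_deviation(energies):
    """Mean absolute deviation of the energy density, in % of its mean (Eq. (38))."""
    n = len(energies)
    mean = sum(energies) / n
    return 100.0 * sum(abs(e - mean) for e in energies) / (n * mean)

# Hypothetical samples: a uniform field and a strongly uneven one.
uniform = normalized_deviation([1.0, 1.0, 1.0, 1.0])
uneven = normalized_deviation([0.0, 2.0])
```

A uniform field gives 0 %, while the two-sample uneven field above gives 100 %, which illustrates how values above 100 % can occur when the energy is concentrated in a small part of the region.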

    3.3. Directional pressure, energy density and diffusion

    Most of the previously introduced SFC metrics and quantifiers (Secs. 3.1 and 3.2) rely on the computation of SFE and local quantities before being averaged over the SFE sampled region. It is also possible to introduce classical metrics on the basis of the plane wave source distribution $\mathbf{q}$ without actual SFE.

    3.3.1. Directional pressure

    For the proposed SFE method, the output is a plane wave amplitude vector $\mathbf{q}$ which directly gives the directional pressure: $p_l(\theta_l + \pi, -\phi_l) = q_l(\theta_l, \phi_l)$. Indeed, if a localization algorithm could be designed to listen to a single direction $(\theta_l + \pi, -\phi_l)$ from the SFE results, it would only detect a sound pressure wave with $q_l$ as its complex amplitude. Therefore, the passage from $q_l$ to $p_l$ is direct. However, one should keep in mind the reversal of the propagation directions $(\theta_l, \phi_l)$ to the corresponding listening directions $(\theta_l + \pi, -\phi_l)$.

    3.3.2. Directional energy density

    Since, for a single harmonic plane wave $q_l(\theta_l, \phi_l)$, the velocity field is related to the pressure field through the characteristic impedance $\rho c$, one can directly write the directional energy density [25] on the basis of Eq. (36) and the directional pressure $p_l$:

    $$E_l = \frac{|p_l|^2}{2 \rho c^2} = \frac{|q_l|^2}{2 \rho c^2}. \tag{39}$$

    The directional energy density $E_l$, since it is based on the directional pressure $p_l$, represents the energy density that comes from the listening direction $(\theta_l + \pi, -\phi_l)$. The average directional energy is given by

    $$\langle E_l \rangle_L = \frac{1}{L} \sum_{l=1}^{L} E_l = \frac{1}{2 \rho c^2 L} \sum_{l=1}^{L} |q_l|^2. \tag{40}$$

    3.3.3. Directional diffusion

    The previous metrics lead to the definition of directional diffusion. The directional diffusion in % is defined as follows [26]:

    $$d = \left( 1 - \frac{\mu}{\mu_o} \right) \times 100\,\%, \tag{41}$$

    where $\mu$ is the average of the absolute difference between the directional energy density and the spatial average of the directional energy density, and $\mu_o$ is the value of $\mu$ for a single impinging plane wave. Therefore, $d = 100\,\%$ for a perfectly diffuse sound field and $d = 0\,\%$ in anechoic conditions. In this paper, we follow the proposition of Gover et al. [26] and use the following definition for $\mu$:

    $$\mu = \frac{1}{\langle E_l \rangle_L} \sum_{l=1}^{L} \left| E_l - \langle E_l \rangle_L \right|. \tag{42}$$

    However, for the evaluation of $\mu_o$, we rely on an average of $\mu$ (according to Eq. (42)) over all the possible plane wave directions:

    $$\mu_o = \frac{1}{L} \sum_{l'=1}^{L} \frac{1}{\langle E_l^{(l')} \rangle_L} \sum_{l=1}^{L} \left| E_l^{(l')} - \langle E_l^{(l')} \rangle_L \right|. \tag{43}$$

    That is, the inverse problem is theoretically computed $L$ times for all the possible harmonic plane wave directions (index $l'$ in the previous equation). The resulting solutions $q_l^{(l')}$ lead to the directional energy densities $E_l^{(l')}$ used in this definition of $\mu_o$. Indeed, because of the array geometry, the inverse problem solution $\mathbf{q}$ varies with the direction of the incoming sound wave, which requires the computation of a direction-averaged $\mu_o$ as shown in Eq. (43). Note that since the inverse problem solution $\mathbf{q}$ is obtained with regularization, we do not expect the directional diffusion to reach 0 % in practical situations. Indeed, the regularization introduces some spherical spreading of the solution, even for a single incoming plane wave.
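The directional diffusion of Eqs. (41) and (42) can be sketched as follows. This is a simplified illustration: instead of solving the regularized inverse problem L times as Eq. (43) requires, the reference value is taken from an idealized single plane wave (all energy in one direction), which is an assumption made here for brevity. The function names and the value of `L` are hypothetical.

```python
def mean_abs_deviation(energies):
    """mu of Eq. (42): summed absolute deviation divided by the mean."""
    mean = sum(energies) / len(energies)
    return sum(abs(e - mean) for e in energies) / mean

def directional_diffusion(energies, mu_o):
    """Directional diffusion d of Eq. (41), in percent."""
    return (1.0 - mean_abs_deviation(energies) / mu_o) * 100.0

L = 8  # hypothetical number of plane wave directions
single = [1.0] + [0.0] * (L - 1)         # idealized single plane wave
mu_o_ideal = mean_abs_deviation(single)  # stands in for Eq. (43)
d_single = directional_diffusion(single, mu_o_ideal)
d_diffuse = directional_diffusion([1.0] * L, mu_o_ideal)
```

With this idealized reference, `d_single` is 0 % and `d_diffuse` is 100 %. With a regularized solution the single-wave energy spreads over neighboring directions, so a reference computed from Eq. (43) would be smaller than `mu_o_ideal`.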

    3.4. Incident directivity factor

    Assuming that the plane wave distribution $q_l(\theta_l, \phi_l)$ uniformly covers $4\pi$ steradians, it is possible to quantify the directivity of the source distribution. Inspired by the definition of the directivity factor of sound sources, an incident directivity factor is accordingly introduced:

    $$Q = \frac{\|\tilde{\mathbf{q}}\|_\infty^2}{\|\tilde{\mathbf{q}}\|_2^2}, \tag{44}$$

    with $\tilde{\mathbf{q}} = \mathbf{q}$ or $\tilde{\mathbf{q}} = \mathbf{q}_{\mathrm{BF}}$. The corresponding incident directivity index [dB ref 1] is

    $$\mathrm{DI} = 10 \log_{10}(Q). \tag{45}$$

    This type of incident directivity factor was also introduced by Gover [26] for the analysis of transient sound fields in rooms. As for the directional diffusion, the directivity index is an averaged parameter that expresses the anisotropic character of a sound field.
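A short numerical sketch of the incident directivity factor and index follows. It assumes that Eq. (44) is the squared largest amplitude divided by the sum of squared amplitudes, a reading that reproduces the reference values quoted later in the paper (Q = 1 for a single plane wave and Q = 1/642 for 642 equal-amplitude plane waves). The function names are hypothetical.

```python
import math

def directivity_factor(q):
    """Assumed form of Eq. (44): max |q_l|^2 over the sum of |q_l|^2."""
    energies = [abs(a) ** 2 for a in q]
    return max(energies) / sum(energies)

def directivity_index(q):
    """Directivity index of Eq. (45) in dB ref 1."""
    return 10.0 * math.log10(directivity_factor(q))

single = [1.0 + 0j] + [0j] * 641   # one plane wave among L = 642 directions
diffuse = [1.0 + 0j] * 642         # 642 equal-amplitude plane waves
q_single, di_single = directivity_factor(single), directivity_index(single)
q_diffuse, di_diffuse = directivity_factor(diffuse), directivity_index(diffuse)
```

Because Q never exceeds 1 under this definition, the directivity index is zero or negative, and it decreases as the energy spreads over more directions.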

    3.5. Sound localization: Velocity and energy vectors, interaural time difference

    Both the velocity and energy vectors are derived from the audio engineering field, where researchers look for predictors of human sound localization in the presence of stereophonic sound systems.

    The velocity vector was proposed as a sound localization predictor at low frequencies, i.e. typically below 700 Hz, where the interaural phase difference is a dominating cue for localization and where head diffraction is minimal [23, 24, 27]. It was originally defined as the normalized particle velocity at the center of the reproduction region, where the listener stands. More recently, the velocity vector definition was expanded to the entire sound field, and it is now given by Daniel et al. [27]

    $$\mathbf{V}(\mathbf{x}) = \rho c \, \frac{\mathbf{u}(\mathbf{x})}{p(\mathbf{x})}. \tag{46}$$

    One notes that the velocity vector is the particle velocity vector $\mathbf{u}(\mathbf{x})$ normalized by the particle velocity amplitude $p(\mathbf{x})/\rho c$ that would be obtained for a purely propagating plane wave of sound pressure amplitude equal to the local sound pressure $p(\mathbf{x})$. Therefore, the velocity vector is a dimensionless metric. Note that, by contrast with the intensity and DOA vectors, the velocity vector is a complex quantity. The real part of the velocity vector is in the opposite direction of the DOA vector $\mathbf{n}_{\mathrm{DOA}}(\mathbf{x})$ (see Eq. (34)) and is associated with precise sound localization [24] and active sound intensity. It is also generally accepted that the imaginary part of the velocity vector is associated with image broadening or perceived phasiness [24]. Typically, the imaginary part of the velocity vector is also related to reactive sound intensity. In [27], it is stressed that the velocity vector can also be used to predict the interaural time difference (ITD). In accordance with the information conveyed by the velocity vector, we introduce an equivalent expression of the velocity vector

    $$\mathbf{V}(\mathbf{x}) = \mathbf{V}_R(\mathbf{x}) + i \mathbf{V}_I(\mathbf{x}), \tag{47}$$

    with $\mathbf{V}_R(\mathbf{x}) = \Re(\mathbf{V}(\mathbf{x}))$ and $\mathbf{V}_I(\mathbf{x}) = \Im(\mathbf{V}(\mathbf{x}))$. The magnitude, azimuth and elevation of $\mathbf{V}_R$, in the spherical coordinates shown in Fig. 1, are denoted $R_{V_R}$, $\theta_{V_R}$ and $\phi_{V_R}$, respectively. Since it is assumed that $\mathbf{V}_R$ is associated with sound localization, we derive the ITD as follows

    $$\mathrm{ITD}(\mathbf{x}, \mathbf{V}_R, \theta_H) = \frac{H}{c} \sin(\theta_{V_R} - \theta_H) \cos(\phi_{V_R}), \tag{48}$$

    where it is assumed that the listener's head is oriented towards $\theta_H$ and $\phi_H$ without any roll movement of the listener's head, i.e. the two ears are always in the same horizontal plane. In Eq. (48), the listener's ear separation is $H$ [m].
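The ITD prediction can be sketched numerically. The fragment below assumes that Eq. (48) reads ITD = (H / c) sin(theta_V - theta_H) cos(phi_V), where theta_V and phi_V are the azimuth and elevation of the real part of the velocity vector and theta_H is the head azimuth. The default ear separation of 0.18 m, the speed of sound and the function name `itd` are hypothetical choices, not values from the paper.

```python
import math

def itd(theta_v, phi_v, theta_h, ear_distance=0.18, c=343.0):
    """ITD in seconds under the assumed reading of Eq. (48):
    (H / c) * sin(theta_v - theta_h) * cos(phi_v)."""
    return ear_distance / c * math.sin(theta_v - theta_h) * math.cos(phi_v)

# Listener facing azimuth 0; hypothetical directions of Re(V).
itd_front = itd(0.0, 0.0, 0.0)            # aligned with the head
itd_side = itd(math.pi / 2.0, 0.0, 0.0)   # fully lateral, horizontal plane
itd_other = itd(-math.pi / 2.0, 0.0, 0.0) # opposite side
```

The sign change between `itd_side` and `itd_other` corresponds to the transition from one side of the head to the other, and the magnitude is bounded by H / c.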

    For the higher frequency range, head diffraction has a strong effect and this makes the interaural level difference (ILD) one of the dominating localization cues [27]. Accordingly, the following energy vector is a more relevant predictor of sound localization above 700 Hz [27],

    $$\mathbf{E} = \frac{\sum_{l=1}^{L} \mathbf{n}_l |q_l|^2}{\sum_{l=1}^{L} |q_l|^2}, \tag{49}$$

    or

    $$\mathbf{E} = R_E \left[ \cos(\theta_E) \cos(\phi_E) \mathbf{e}_1 + \sin(\theta_E) \cos(\phi_E) \mathbf{e}_2 + \sin(\phi_E) \mathbf{e}_3 \right], \tag{50}$$

    where $0 \le R_E \le 1$ and $\theta_E$, $\phi_E$ are the spherical components of the energy vector. Note that the energy vector as defined in Eq. (49) can only predict sound localization at the coordinate system's origin.
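The energy vector and its spherical components can be sketched as follows. The code treats the unit vectors entering Eq. (49) as given inputs; whether they are propagation or listening directions follows the paper's convention and is not decided here. The function names and the two-wave example are hypothetical.

```python
import math

def energy_vector(directions, amplitudes):
    """Energy vector of Eq. (49): |q_l|^2-weighted mean of the unit vectors n_l."""
    weights = [abs(q) ** 2 for q in amplitudes]
    total = sum(weights)
    return tuple(
        sum(w * n[i] for w, n in zip(weights, directions)) / total
        for i in range(3)
    )

def to_spherical(vec):
    """Radius, azimuth and elevation matching the form of Eq. (50)."""
    radius = math.sqrt(sum(c * c for c in vec))
    azimuth = math.atan2(vec[1], vec[0])
    # Clamp the ratio to guard against rounding slightly outside [-1, 1].
    ratio = max(-1.0, min(1.0, vec[2] / radius)) if radius > 0.0 else 0.0
    return radius, azimuth, math.asin(ratio)

# Two hypothetical waves of equal amplitude at +/-45 degrees of azimuth.
s = math.sqrt(0.5)
directions = [(s, s, 0.0), (s, -s, 0.0)]
r_e, theta_e, phi_e = to_spherical(energy_vector(directions, [1.0, 1.0]))
opposed = energy_vector([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)], [1.0, 1.0])
```

The two symmetric waves give a vector that points midway between them with a radius below one, while two opposed waves of equal amplitude cancel, which illustrates why a small radius indicates a lack of a dominant direction.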

    3.6. Diffuseness field and average diffuseness

    By combining the intensity field (Eq. (32)) and the energy density field (Eq. (36)), Merimaa and Pulkki [13, 14, 15, 16] introduced the definition of diffuseness for a single frequency and a single point in space. For any point $\mathbf{x}$, it is possible to write the diffuseness field as follows:

    $$\psi(\mathbf{x}) = 1 - \frac{\|\mathbf{I}(\mathbf{x})/c\|_2}{E(\mathbf{x})}. \tag{51}$$

    The diffuseness varies between zero and unity. Theoretically, in a completely diffuse field it is expected that $\psi(\mathbf{x}) = 1$ while in a purely propagative field, $\psi(\mathbf{x}) = 0$ is expected. One of the issues that arises with the diffuseness $\psi(\mathbf{x})$ is that it mostly depends on the sound intensity. Therefore, any situation that would lead to a null intensity will be detected as a diffuse sound field. This can arise for two plane waves of identical amplitudes propagating in opposite directions, which produce a standing wave. In this case, the net energy flow is zero and the intensity field is null. This situation is easily generalized to any sound field made of plane waves propagating in opposite directions with similar amplitudes. Therefore, the diffuseness, as defined above, cannot distinguish between a standing wave pattern and a diffuse sound field. This will be illustrated in Sec. 3.7.
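The limitation described above can be checked with a small numerical sketch. It assumes the diffuseness takes the form 1 - ||I|| / (c E), with the active intensity computed as I = Re(p u*) / 2 and the energy density as in Eq. (36). The plane wave phase convention, the values of `RHO` and `C`, the evaluation point and the function name are hypothetical choices made for illustration.

```python
import cmath

RHO = 1.21  # air density [kg/m^3], assumed value
C = 343.0   # speed of sound [m/s], assumed value

def diffuseness(x, directions, amplitudes, k):
    """1 - ||I|| / (c E) for a superposition of plane waves at point x."""
    p = 0j
    u = [0j, 0j, 0j]
    for n, q in zip(directions, amplitudes):
        term = cmath.exp(1j * k * sum(a * b for a, b in zip(n, x))) * q
        p += term
        u = [ui + ni / (RHO * C) * term for ui, ni in zip(u, n)]
    # Active intensity, assumed here to be I = Re(p u*) / 2.
    intensity = [0.5 * (p * ui.conjugate()).real for ui in u]
    norm_i = sum(comp * comp for comp in intensity) ** 0.5
    energy = RHO / 4.0 * (sum(abs(ui) ** 2 for ui in u)
                          + abs(p) ** 2 / (RHO * C) ** 2)
    return 1.0 - norm_i / (C * energy)

x = (0.3, 0.1, 0.0)
psi_plane = diffuseness(x, [(1.0, 0.0, 0.0)], [1.0], k=2.0)
psi_standing = diffuseness(
    x, [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)], [1.0, 1.0], k=2.0
)
```

The single plane wave yields a diffuseness of zero, whereas the two opposed waves yield one even though the field is a fully deterministic standing wave.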

    Another limitation of this definition of the diffuseness field is that it may not be appropriate for a diffuse sound field since the local energy density $E(\mathbf{x})$ tends to vary with position for a harmonic diffuse sound field (see Sec. 3.2). This strong variation of the local energy density, in marked contrast with the spatially uniform local average energy density of the theoretical harmonic diffuse sound field [25], makes it difficult to use the local diffuseness $\psi(\mathbf{x})$ as a quantifier of the overall anisotropy or diffusion of the sound field over the SFE area.

    Again, for the discrete set of points $\mathbf{x}_n$, we can also introduce a discrete average

    $$\langle \psi(\mathbf{x}) \rangle_N = \frac{1}{N} \sum_{n=1}^{N} \psi(\mathbf{x}_n). \tag{52}$$

    In subjective terms, the diffuseness is often related to the

    listener envelopment or the sensation of surrounding and

    enveloping sound.

    3.7. Theoretical test cases

    In order to evaluate the capability of the previous characterization metrics (computed on the basis of SFE) to distinguish or characterize several types of sound fields, several test cases are reported: a single source in free field, multiple sources in free field, a standing wave and a diffuse sound field. As the test cases are presented, the relevance of these metrics is discussed for the characterization of sound environments in vehicles or other situations. Finally, the quantifiers most able to distinguish between these archetypical test cases will be identified. As will be shown, two simple scalar metrics are the most appropriate for a classification of such sound field components. It is also made clear that other field metrics are useful to visualize and understand the behavior of the sound field in a large area.

    3.7.1. Single source in free field

    For this first test case, the SFE results reported in Fig. 4 are used. This corresponds to a single dipole of strength 0.1 [5] in free field. The intensity field computed at $N = 625$ locations in the horizontal plane is shown in Fig. 6(a). Clearly, the intensity field is stronger in the vicinity of the exact dipole position. Also note that the average intensity $\langle \mathbf{I}(\mathbf{x}) \rangle_N$ (computed from the $N$ SFE points, see Eq. (35)) is correctly oriented. The DOA field is shown in Fig. 6(b). This result clearly highlights the effectiveness of the SFE DOA vector as a predictor of perceived sound localization in a free-field situation over the entire SFE region. The average DOA orientation, although slightly different from the average intensity orientation, is also correctly aligned. The slight orientation difference between the average intensity and the average DOA arises because all $N$ points contribute equally to the average DOA, whereas the contribution of each of the $N$ points to the average intensity is proportional to the local intensity magnitude. Also, DOA more efficiently predicts the perceived direction of the incoming sound. Note that proper intensity and DOA results are not expected outside the effective SFE region shown by the contour lines in Figs. 6(a) and 6(b).

    The corresponding energy density $E(\mathbf{x})$ and diffuseness $\psi(\mathbf{x})$ fields are shown in Figs. 7(a) and 7(b). The energy density is confined to the vicinity of the true dipole position. Moreover, one notes that the local diffuseness is zero nearly everywhere in the SFE region except along the dipole null-axis. This is expected since the null pressure observed on the dipole null-axis makes the intensity null in this region. Therefore, the diffuseness approaches unity. This result explains the observed DOA in that region. Indeed, a closer look at the DOA (Fig. 6(b)) along the dipole null-axis shows that the DOA strongly varies with the position in that area. This variation is an artifact: since the intensity approaches zero in that region, the DOA is not defined ($\mathbf{n}_{\mathrm{DOA}} \to 0/0$). Interestingly, the DOA clearly goes through strong or erroneous variations when it crosses the effective SFE region illustrated by the contour lines in Fig. 6(b). The local variation of the DOA also suggests a potential local sensation of diffuseness, which also corresponds to actual perception on the null-axis of a dipole in free field. This will be confirmed by the velocity vectors.

    [Figure 6, panels (a) and (b); axes x1/k and x2/k. Panel (a): $\mathbf{I}(\mathbf{x})$ [W/m$^2$], Max($|\mathbf{I}(\mathbf{x})|$) = 0.087649 W/m$^2$, $\langle \mathbf{I}(\mathbf{x}) \rangle_N$ = 0.0067485. Panel (b): $\mathbf{n}_{\mathrm{DOA}}(\mathbf{x})$ and $\langle \mathbf{n}_{\mathrm{DOA}} \rangle_N$.]

    Fig. 6: (a): Intensity field $\mathbf{I}(\mathbf{x})$ (Eq. (32)) [W/m$^2$], average intensity vector $\langle \mathbf{I}(\mathbf{x}) \rangle_N$ [W/m$^2$], (b): direction-of-arrival $\mathbf{n}_{\mathrm{DOA}}(\mathbf{x})$ (Eq. (34)) and average DOA $\langle \mathbf{n}_{\mathrm{DOA}} \rangle_N$ ($N = 625$) for a single dipole in free field for the SFE shown in Fig. 4. The microphone array is shown in light grey. The average vectors (computed for the SFE points) are centered at the origin and shown as a large arrow. Local SFE errors shown as contour lines (see Fig. 4 for more details).

    The directional pressure $p_l$ and the energy vector $\mathbf{E}$ obtained from SFE are shown in Fig. 8(a) as a spherical plot in linear scale (radius) and logarithmic scale (color) [dB ref 1]. Clearly, the directional pressure, which is supposed to predict perceived sound direction above 700 Hz, is precise and well aligned with the energy vector, which points towards the dipole (from the coordinate system's origin). The directional energy and the energy vector are shown in Fig. 8(b). As expected from the directional energy definition (Eq. (39)), the directional energy is much more precise than the directional pressure.

    The velocity vector $\mathbf{V}(\mathbf{x})$, the real part of which is a predictor of sound source localization below 700 Hz, is reported in Fig. 9. As expected from the definition of the velocity vector (see Eq. (46)), the orientation of the real part of the velocity vector matches the orientation of the DOA vector (see Eq. (34)). In addition, the imaginary part of the velocity vector highlights the regions where the sound is perceived as diffuse or not localized. Clearly, for the reported test case, the imaginary part of the velocity vector is non-negligible on the dipole null-axis, the region where the diffuseness $\psi(\mathbf{x})$ approaches one (see Fig. 7(b)). Therefore, as shown by this example, the velocity vector is an interesting metric since it combines the information carried by the DOA vector $\mathbf{n}_{\mathrm{DOA}}(\mathbf{x})$ and the diffuseness $\psi(\mathbf{x})$. The ITD predicted from the real part of the velocity vector is shown in Fig. 10, where one clearly notes the transition from negative to positive ITD when the listener passes from one side to the other side of the sound source while the head azimuth is fixed to the angle of the energy vector. Therefore, SFE seems to correctly predict the velocity vector and the ITD in the SFE region where the local SFE errors are low.

    The scalar metrics related to this test case are reported in Tab. 1. We recall that these scalar metrics are directly derived from the plane wave amplitudes obtained from the inverse problem solution: these scalar metrics are representative of the sound field as a whole. Interestingly, even if they are intuitively understood as origin-centered, they are in fact the same for any SFE point $\mathbf{x}$. Indeed, all the scalar metrics, except the directional diffusion $p_l$, are based on absolute and squared values of the plane wave amplitudes $\mathbf{q}$. Therefore, the phase shift of the solution for a translation of the origin (Eq. (6)) does not affect the computed scalar metrics. This is an important property of these scalar metrics.

    Fig. 7: (a): Energy density $E(\mathbf{x})$ (Eq. (36)) [J/m$^3$] $\times 10^4$, average energy $\langle E(\mathbf{x}) \rangle_N$ [J/m$^3$], (b): diffuseness $\psi(\mathbf{x})$ (Eq. (51)) and average diffuseness $\langle \psi(\mathbf{x}) \rangle_N$ ($N = 14400$) for a single dipole in free field for the extrapolated sound field shown in Fig. 4.

    [Figure 8, panels (a) and (b); axes x1, x2 and x3. Panel (a): directional pressure $p_l(\theta_l, \phi_l)$, color: dB ref 1. Panel (b): directional energy density $E_l(\theta_l, \phi_l)$, color: dB ref 1.]

    Fig. 8: Spherical plots of the (a): directional pressure $p_l(\theta_l, \phi_l)$ (linear (radius) and dB ref 1 (color) scales), (b): the directional energy density $E_l$ (linear (radius) and dB ref 1 (color) scales) and the energy vector $\mathbf{E}$ (shown as a large arrow) for a single dipole in free field for the extrapolated sound field shown in Fig. 4.

    [Figure 9, two panels; axes x1/k and x2/k. Top: Re[$\mathbf{V}(\mathbf{x})$], Max($|\mathbf{V}(\mathbf{x})|$) = 36.289. Bottom: Im[$\mathbf{V}(\mathbf{x})$], Max($|\mathbf{V}(\mathbf{x})|$) = 36.289.]

    Fig. 9: Real (top) and imaginary (bottom) parts of the velocity vector $\mathbf{V}(\mathbf{x})$ (Eq. (46)) for a single dipole in free field for the extrapolated sound field shown in Fig. 4. Local SFE errors shown as contour lines (see Fig. 4 for more details).

    Metrics                                       Values
    Deviation of $E(\mathbf{x})$ ($\sigma_N$)     110.3190 %
    Directional diffusion ($d$)                   13.6152 %
    Directivity factor ($Q$)                      0.2920
    Directivity index (DI)                        $-5.3462$ dB ref 1
    Energy vector azimuth ($\theta_E$)            1.0985 rad
    Energy vector elevation ($\phi_E$)            0.0014 rad
    Energy vector radius ($R_E$)                  0.9174

    Table 1: Scalar metrics for the single dipole in free field (see Fig. 4) with $N = 14400$.

    Fig. 10: Predicted ITD [ms] (Eq. (48)) for a single dipole in free field for the extrapolated sound field shown in Fig. 4. The listener head orientation is fixed to $\theta_H = \theta_E$ over the entire SFE region. The head orientation is shown as a large black arrow. Local SFE errors shown as contour lines (see Fig. 4 for more details).

    For this reported test case, the values of the directivity factor and directivity index suggest a moderately directive sound field. This is not really the case; therefore, these two quantifiers might not be the most appropriate or should, at least, be modified for the case of a single source in free field. For comparison purposes, with $L = 642$ plane waves in the source distribution of the direct problem, a single plane wave would give $Q = 1/1 = 1$ and $\mathrm{DI} = 0$ dB ref 1 while a totally diffuse sound field with $L = 642$ equal-amplitude plane waves would give $Q = 1/642 = 0.0016$ and $\mathrm{DI} = -28.0754$ dB ref 1. However, as will be shown in the following, more immersive sound field situations, the $Q$ and DI values reported in Tab. 1 are in the highest observable range.

    Comparison of these $Q$ and DI directivity metrics with the directional diffusion $d$ suggests that the latter is a more effective directivity metric. Indeed, $d$ is much closer to its lowest value (0 %) than $Q$ (or DI) is to its highest value. Moreover, a high standard deviation of the energy density ($\sigma_N$) suggests a heterogeneous distribution of the energy density through space, which supports the idea that the sound field is anything but diffuse.

    From the scalar metrics, one also notes that the energy vector magnitude $R_E = 0.9174$ is relatively high. This is a direct consequence of the definition of the energy vector $\mathbf{E}$ (Eq. (49)), which implies that the energy vector magnitude is high if and only if the directional pressure shows a strong spherical polarity. Indeed, in the case of a spherically symmetrical $q_l$, the vector sum in Eq. (49) is null. This will be further discussed for the upcoming test cases.

    3.7.2. Two sources in free field

    This test case corresponds to a free-field 2-channel stereophonic sound reproduction situation. As shown in Fig. 11, two in-phase monopole sources are located at x1/k=1 and x2/k=2. The monopole amplitude [25] of each source is 0.5.

    For this test case, the SFE results are presented in Fig. 11. The intensity field computed at $N = 625$ points in the horizontal plane is shown in Fig. 12(a). Clearly, the intensity field is stronger in the vicinity of the monopole positions. Also note that the average intensity is correctly oriented in terms of stereophonic sound perception. The DOA field is shown in Fig. 12(b). This result clearly highlights the effectiveness of the SFE-based DOA vector as a predictor of perceived sound localization in a stereophonic free-field situation (two coherent sound sources). This prediction is valid over the entire effective SFE region. Indeed, for the extended central sweet spot (x1/k=0, x2 2), the reported test case exactly corresponds to stereophonic listening with a centered phantom image created by level-difference stereophony. The average DOA is also correctly oriented in that sweet spot. For an off-axis listening position, the phantom image predicted by the DOA and SFE deviates towards the closest sound source.

    The energy density field $E(\mathbf{x})$ and the diffuseness field $\psi(\mathbf{x})$ for the same extrapolated sound field are shown in Figs. 13(a) and 13(b). Again, the energy density is well localized in the vicinity of the exact source positions. One notes that the local diffuseness is zero nearly everywhere in the SFE region except in the region where a wrong or imprecise phantom image position is expected from two-channel stereophonic systems (for a given frequency). Source localization cues from the local diffuseness $\psi(\mathbf{x})$ agree with the predicted perceived sound direction from the DOA in these regions (see Fig. 12(b)). Indeed, a closer look at the DOA (Fig. 12(b)) in this region reveals a DOA that strongly varies with position.

    The directional pressure, directional energy and energy vectors are presented in Figs. 14(a) and 14(b). One notes that the energy vector is a good predictor of sound localization for a listener at the center of the array while the directional pressure and energy density reveal both the presence of the two real sources and the presence of the perceived central sound image. The complex velocity vector field is shown in Fig. 16. Again, we note that the real part of the velocity vector predicts both the sound localization created by the stereophonic image and the diffuse curved regions. The ITD predicted from the real part of the velocity vector is shown in Fig. 15, where one clearly notes that the ITD is zero in the central region. Moreover, one can observe the expected passage from negative to positive ITD on the left and on the right sides of the central position. Again, SFE seems to correctly predict the velocity vector and the ITD for the reported test case.

    The scalar metrics for this test case are presented in Tab. 2. The directivity factor $Q$ and the corresponding directivity index DI are lower than for the case of a single source in free field. These values suggest a less directive sound field, which is the case. The directional diffusion $d$ also gives a higher value. The deviation of the energy density is very similar to the value obtained for the single dipole test case. Therefore, it seems that this deviation $\sigma_N$ might be a good indicator of free-field situations with localized sound sources. Again, the energy vector azimuth and elevation angles agree with the expected sound perception.

    Fig. 11: (a): Real part of the original sound field created by two monopole sound sources. (b): Real part of the extrapolated sound field using a plane wave source distribution obtained from Eq. (20) with $\lambda = 0.001$. The monopoles are marked as black and white dots. The white contour line indicates the region of 0.001 of local quadratic SFE error and the black contour line indicates the region of 0.1 of local quadratic SFE error.

    [Figure 12, panels (a) and (b); axes x1/k and x2/k. Panel (a): $\mathbf{I}(\mathbf{x})$ [W/m$^2$], Max($|\mathbf{I}(\mathbf{x})|$) = 0.15636 W/m$^2$, $\langle \mathbf{I}(\mathbf{x}) \rangle_N$ = 0.011408. Panel (b): $\mathbf{n}_{\mathrm{DOA}}(\mathbf{x})$ and $\langle \mathbf{n}_{\mathrm{DOA}} \rangle_N$.]

    Fig. 12: (a): Intensity field $\mathbf{I}(\mathbf{x})$ (Eq. (32)) [W/m$^2$], average intensity vector $\langle \mathbf{I}(\mathbf{x}) \rangle_N$ [W/m$^2$], (b): direction-of-arrival $\mathbf{n}_{\mathrm{DOA}}(\mathbf{x})$ (Eq. (34)) and average DOA vector $\langle \mathbf{n}_{\mathrm{DOA}}(\mathbf{x}) \rangle_N$ ($N = 625$) for two monopoles in free field for the SFE shown in Fig. 11. The microphone array is shown in light grey. The average vectors (computed for the SFE points) are centered at the origin and shown as a large arrow. Local SFE errors shown as contour lines (see Fig. 11 for more details).

    Fig. 13: (a): Energy density $E(\mathbf{x})$ [J/m$^3$] $\times 10^4$ (Eq. (36)), average energy $\langle E(\mathbf{x}) \rangle_N$ [J/m$^3$], (b): diffuseness $\psi(\mathbf{x})$ (Eq. (51)) and average diffuseness $\langle \psi(\mathbf{x}) \rangle_N$ ($N = 14400$) for two monopoles in free field for the extrapolated sound field shown in Fig. 11.

    [Figure 14, panels (a) and (b); axes x1, x2 and x3. Panel (a): directional pressure $p_l(\theta_l, \phi_l)$, color: dB ref 1. Panel (b): directional energy density $E_l(\theta_l, \phi_l)$, color: dB ref 1.]

    Fig. 14: Spherical plots of the (a): directional pressure $p_l$ (linear (radius) and dB ref 1 (color) scales), (b): directional energy density $E_l$ (linear (radius) and dB ref 1 (color) scales) and energy vector $\mathbf{E}$ for the two monopole sources in free field (see Fig. 11). The energy vector is shown as a large arrow (aligned with $x_2$ and fused with the main lobe).

    Fig. 15: Predicted ITD [ms] (Eq. (48)) for the two monopole sources in free field (see Fig. 11). The listener head orientation is fixed to $\theta_H = \theta_E$ over the entire SFE region. The head orientation is shown as a large black arrow.

    Metrics                                       Values
    Deviation of $E(\mathbf{x})$ ($\sigma_N$)     108.2139 %
    Directional diffusion ($d$)                   22.9553 %
    Directivity factor ($Q$)                      0.1997
    Directivity index (DI)                        $-6.9959$ dB ref 1
    Energy vector azimuth ($\theta_E$)            1.5570 rad
    Energy vector elevation ($\phi_E$)            0.0261 rad
    Energy vector radius ($R_E$)                  0.7387

    Table 2: Scalar metrics for the two monopole sources in free field (see Fig. 11) with $N = 14400$.

    3.7.3. Standing wave in rectangular coordi-nates

    In this case, the sound field is a standing wave in rectan-

    gular coordinates created by eight propagating waves in

    three-dimensional space. Low frequency standing wave

    AES 131st Convention, New York, USA, 2011 October 2023

    Page 19 of 32

  • 8/11/2019 2011 Beamforming regularization, scaling matrices and inverse problems for sound field extrapolation and charact

    20/32

    Gauthier et al. Sound field extrapolation and characterization I

    2 1 0 1 2

    2

    1

    0

    1

    2

    Re[V(x)], Max(|V(x)|) = 35.0474

    x1/

    k

    x2

    /k

    2 1 0 1 2

    2

    1

    0

    1

    2

    Im[V(x)], Max(|V(x)|) = 35.0474

    x1/

    k

    x2

    /k

    Fig. 16: Velocity vector V(x) (Eq. (46)) for the twomonopole sources in free field (see Fig. 11). The

    monopoles are marked as black and white dots. The mi-

    crophone array is shown in light grey. Local SFE errors

    shown as contour lines (see Fig. 11 for more details).

    patterns can be found in small closed spaces such as vehicle cabins. Although not often investigated by the spatial audio community, the identification, characterization and subsequent reproduction of standing wave patterns represent a specific challenge encountered in sound environment reproduction of closed spaces. The reported theoretical case corresponds to an oblique mode of a rigid-walled rectangular cavity. For this test case, the sound field is given by

    $$p(\mathbf{x}) = \cos(k_{x_1} x_1)\,\cos(k_{x_2} x_2)\,\cos(k_{x_3} x_3), \qquad (53)$$

    with $k_{x_1} = 2\pi\cos(\theta_s)\cos(\phi_s)$, $k_{x_2} = 2\pi\sin(\theta_s)\cos(\phi_s)$ and $k_{x_3} = 2\pi\sin(\phi_s)$. For the reported case, the standing wave angles were set to $\theta_s = \pi/7$ and $\phi_s = \pi/6$. The comparison of the original sound field and the SFE results with a regularization parameter of 0.0001 is shown in Fig. 17: SFE is effective over a large region. This is perhaps because the plane wave model used in the direct problem definition is more appropriate for that type of sound field.
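The statement that this standing wave is created by eight propagating waves follows from expanding each cosine into two complex exponentials. The sketch below checks this numerically. It assumes a unit wavelength (wavenumber 2π), an azimuth of π/7 and an elevation of π/6; the names `theta_s`, `phi_s`, `standing_wave` and `eight_plane_waves` are illustrative and not taken from the paper.

```python
import itertools
import numpy as np

# Assumed angles (theta_s: azimuth, phi_s: elevation) and unit wavelength (k = 2*pi).
theta_s, phi_s = np.pi / 7, np.pi / 6
k = 2 * np.pi * np.array([
    np.cos(theta_s) * np.cos(phi_s),
    np.sin(theta_s) * np.cos(phi_s),
    np.sin(phi_s),
])

def standing_wave(x):
    """Product of three cosines, as in Eq. (53); x has shape (..., 3)."""
    return np.prod(np.cos(k * x), axis=-1)

def eight_plane_waves(x):
    """Same field written as eight plane waves of amplitude 1/8."""
    total = np.zeros(x.shape[:-1], dtype=complex)
    for signs in itertools.product((1.0, -1.0), repeat=3):
        total += np.exp(1j * (x @ (np.array(signs) * k))) / 8.0
    return total

rng = np.random.default_rng(0)
points = rng.uniform(-2.0, 2.0, size=(100, 3))
max_diff = np.max(np.abs(standing_wave(points) - eight_plane_waves(points)))
print(max_diff)
```

The eight wave vectors are the sign flips of (kx1, kx2, kx3), so every plane wave has an opposite partner of equal amplitude.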

    As expected, the corresponding intensity field (not shown here) is numerically null over the entire region. The energy density field E(x) and the diffuseness field for the extrapolated sound field are shown in Figs. 18(a) and 18(b). Clearly, the energy distribution over the entire SFE region corresponds to the modal pattern. However, the local diffuseness is nearly one everywhere. On the basis of the definition of the diffuseness (see Eq. (51)), this was to be expected. Indeed, according to that definition, this metric attributes full diffuseness to a standing wave field since its net intensity is null over the entire SFE domain.
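The null intensity of a standing wave can be illustrated with a short sketch. It assumes an intensity-based diffuseness of the form 1 - |I|/(cE), which may differ in detail from Eq. (51), together with assumed air properties and an arbitrary frequency. All names (`diffuseness`, `RHO`, `C`) are hypothetical.

```python
import numpy as np

RHO, C = 1.2, 343.0  # assumed air density [kg/m^3] and sound speed [m/s]

def diffuseness(p, grad_p, omega):
    """Assumed form 1 - |I| / (c E), from complex pressure and its gradient."""
    v = -grad_p / (1j * omega * RHO)           # Euler's equation, exp(j*omega*t) convention
    intensity = 0.5 * np.real(p * np.conj(v))  # active intensity vector
    energy = np.abs(p) ** 2 / (4 * RHO * C ** 2) + RHO * np.sum(np.abs(v) ** 2) / 4
    return 1.0 - np.linalg.norm(intensity) / (C * energy)

omega = 2 * np.pi * 100.0                      # arbitrary frequency for the sketch
theta_s, phi_s = np.pi / 7, np.pi / 6
k = (omega / C) * np.array([np.cos(theta_s) * np.cos(phi_s),
                            np.sin(theta_s) * np.cos(phi_s),
                            np.sin(phi_s)])
x = np.array([0.3, -0.2, 0.1])                 # arbitrary evaluation point

# Standing wave (product of cosines): real pressure and real gradient.
c1, c2, c3 = np.cos(k * x)
s1, s2, s3 = np.sin(k * x)
p_sw = c1 * c2 * c3
grad_sw = -k * np.array([s1 * c2 * c3, c1 * s2 * c3, c1 * c2 * s3])
psi_standing = diffuseness(p_sw, grad_sw, omega)

# Single plane wave exp(-j k.x): its gradient is -j k p.
p_pw = np.exp(-1j * k @ x)
grad_pw = -1j * k * p_pw
psi_plane = diffuseness(p_pw, grad_pw, omega)
print(psi_standing, psi_plane)
```

Because the pressure of the standing wave is real, the particle velocity is purely imaginary and the active intensity vanishes, which drives this form of diffuseness to one, whereas a single plane wave gives zero.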

    The directional pressure and energy density are shown in Figs. 19(a) and 19(b). In the second of these two figures, the identification of the eight propagating waves that create the three-dimensional standing wave is clear.

    For this standing wave test case, the scalar metrics are reported in Tab. 3. The directivity factor Q (Eq. (44)) is 0.2530 and the corresponding directivity index DI (Eq. (45)) is 5.9682 dB ref 1. These values suggest a moderately directive incident sound field. This is not really the case; therefore, these two quantifiers might not be the most appropriate or should, at least, be modified to detect standing wave patterns. Some indication of the standing-wave nature of the sound field is provided by the fact that the directional diffusion d is very low: 12.3839 %, which is the lowest of all the observed


    Fig. 17: (a): Real part of the original sound field defined as a standing wave in rectangular coordinates. The dimensionless problem is normalized by the acoustical wavelength. (b): Real part of the extrapolated sound field using a plane wave source distribution obtained from Eq. (20) with a regularization parameter of 0.0001. The white contour line indicates the region of 0.001 of local quadratic SFE error. The nodal lines are shown as dashed black lines.

    Fig. 18: (a): Energy density E(x) (Eq. (36)) [J/m3] 106, average energy ⟨E(x)⟩N [J/m3], (b): diffuseness field (Eq. (51)) and average diffuseness (N = 14400) for a standing wave and for the extrapolated sound field shown in Fig. 17. The nodal lines are shown as dashed lines.


    [Figure 19, two panels over axes x1, x2, x3: (a) directional pressure pl, color: dB ref 1; (b) directional energy density El, color: dB ref 1.]

    Fig. 19: (a): Directional pressure pl (linear (radius) and dB ref 1 (color) scale) and (b): directional energy density El (linear (radius) and dB ref 1 (color) scale) for a standing wave for the extrapolated sound field shown in Fig. 17.

    directional diffusion for the four reported test cases. Moreover, the energy vector radius RE is very small: 0.0379 in comparison with RE close to 1 for the single dipole and two monopole test cases. Notably, this test case illustrates a very interesting property of the energy vector magnitude. Indeed, as soon as two plane waves of opposite direction share a similar amplitude |ql|, they tend to cancel each other in the computation of the energy vector. Most interesting is the fact that this happens for plane wave distributions such as the one shown in Fig. 19(a) for an oblique standing wave but also for any other stationary waves such as cylindrical or spherical harmonics. Moreover, this same cancellation also arises for a diffuse sound field where sound energy travels in all directions. Therefore, the energy vector magnitude seems to be a good predictor of directive (few sources in free space) or non-directive (standing waves or partly diffuse sound fields) sound fields. This will be further discussed in the case of the diffuse sound field.

    Metrics                          Values
    Deviation of E(x) (N)            63.1460 %
    Directional diffusion (d)        12.3839 %
    Directivity factor (Q)           0.2530
    Directivity index (DI)           5.9682 dB ref 1
    Energy vector azimuth (E)        1.5727 rad
    Energy vector elevation (E)      0.0208 rad
    Energy vector radius (RE)        0.0379

    Table 3: Scalar metrics for the stationary wave (see Fig. 17) with N = 14400.
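The cancellation of opposite plane waves in the energy vector can be sketched as follows. The energy vector is assumed here to take the common form of a sum of unit direction vectors weighted by |ql|^2 and normalized by the total weight; the paper's own definition is not reproduced in this section, and the function name `energy_vector_radius` is illustrative.

```python
import itertools
import numpy as np

def energy_vector_radius(amplitudes, directions):
    """Magnitude of the assumed energy vector sum(|q_l|^2 u_l) / sum(|q_l|^2)."""
    weights = np.abs(np.asarray(amplitudes)) ** 2
    directions = np.asarray(directions, dtype=float)
    return np.linalg.norm(weights @ directions) / np.sum(weights)

theta_s, phi_s = np.pi / 7, np.pi / 6
u = np.array([np.cos(theta_s) * np.cos(phi_s),
              np.sin(theta_s) * np.cos(phi_s),
              np.sin(phi_s)])

# Eight plane waves of an oblique standing wave: all sign flips of u, equal amplitude.
standing_dirs = [np.array(s) * u for s in itertools.product((1.0, -1.0), repeat=3)]
r_standing = energy_vector_radius(np.full(8, 1 / 8), standing_dirs)

# A single plane wave.
r_single = energy_vector_radius([1.0], [u])
print(r_standing, r_single)
```

In this idealized case the radius is zero for the standing wave and one for the single plane wave; the small nonzero value reported in Tab. 3 is consistent with an imperfectly recovered plane wave distribution.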

    3.7.4. Diffuse sound field

    For this test case, a harmonic diffuse sound field is created using a limited set of 642 plane waves coming from random directions covering a solid angle of 4π steradians. Both the amplitude and phase of the plane waves were random. For more details about the definition and properties of harmonic diffuse sound fields, the reader is referred to [25]. The original diffuse sound field and the corresponding SFE result are shown in Fig. 20 for a regularization parameter of 0.0001. Again, the SFE method performs very well over a large effective area even for the specific case of a diffuse sound field. The sound field characterization metrics, namely sound intensity, DOA, energy density, diffuseness, directional pressure and directional energy density, are shown in Figs. 21(a) to 23(b).
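A harmonic diffuse field of this kind can be sketched as a sum of plane waves with random directions, amplitudes and phases. The distributions used below (directions uniform over the sphere, amplitudes uniform in [0, 1), phases uniform in [0, 2π)) and the evaluation grid are assumptions, since the text only states that these quantities were random; `random_diffuse_field` is a hypothetical name.

```python
import numpy as np

def random_diffuse_field(points, n_waves=642, seed=0):
    """Sum of plane waves with random directions, amplitudes and phases (unit wavelength)."""
    rng = np.random.default_rng(seed)
    # Directions uniformly distributed over the sphere (normalized Gaussian vectors).
    directions = rng.normal(size=(n_waves, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Assumed distributions: uniform amplitude in [0, 1) and uniform phase in [0, 2*pi).
    amplitudes = rng.uniform(0.0, 1.0, n_waves) * np.exp(1j * rng.uniform(0.0, 2 * np.pi, n_waves))
    k = 2 * np.pi  # wavenumber for a unit wavelength
    return np.exp(-1j * k * points @ directions.T) @ amplitudes

# Evaluate the field on a 25 x 25 grid in the x1-x2 plane (x3 = 0).
grid = np.linspace(-2.0, 2.0, 25)
x1, x2 = np.meshgrid(grid, grid)
points = np.column_stack([x1.ravel(), x2.ravel(), np.zeros(x1.size)])
p = random_diffuse_field(points)
print(p.shape)
```

With a finite number of plane waves, a single realization is only approximately diffuse, which is in line with the principal directions observed later in the directional energy density.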

    Both the sound intensity and the DOA fields shown in

    Figs. 21(a) and 21(b) suggest a diffuse situation. Indeed,

    the sound intensity average is very low and the DOA and

    sound intensity spatial variations are large. This diffuse

    character will be supported by the corresponding scalar

    metrics.

    The acoustical energy density E(x) and the diffuseness field are shown in Figs. 22(a) and 22(b), respectively.


    Fig. 20: (a): Real part of the original diffuse sound field. The dimensionless problem is normalized by the acoustical wavelength. (b): Real part of the extrapolated sound field using a plane wave source distribution obtained from Eq. (20) with a regularization parameter of 0.0001. The white contour line indicates the region of 0.001 of local quadratic SFE error and the black contour line indicates the region of 0.1 of local quadratic SFE error.

    [Figure 21, two panels with axes x1 and x2 normalized by the acoustical wavelength: (a) I(x) [W/m2], Max(|I(x)|) = 0.00085613 W/m2, ⟨I(x)⟩N = 2.7693e-005; (b) nDOA(x) and ⟨nDOA⟩N.]

    Fig. 21: (a): Intensity field I(x) [W/m2] (Eq. (32)), average intensity vector ⟨I(x)⟩N [W/m2], (b): DOA nDOA(x) (Eq. (34)) and average DOA vector ⟨nDOA(x)⟩N (N = 625) for a diffuse sound field (SFE shown in Fig. 20). The microphone array is shown in light grey. The average vectors are centered at the origin and shown as large arrows. Local SFE errors are shown as contour lines (see Fig. 20 for more details).


    Fig. 22: (a): Energy density E(x) [J/m3] 106 (Eq. (36)), average energy ⟨E(x)⟩N [J/m3], (b): diffuseness field (Eq. (51)) and average diffuseness (N = 14400) for a diffuse sound field (SFE shown in Fig. 20).

    Metrics                          Values
    Deviation of E(x) (N)            43.7181 %
    Directional diffusion (d)        53.9748 %
    Directivity factor (Q)           0.1548
    Directivity index (DI)           8.1022 dB ref 1
    Energy vector azimuth (E)        2.4254 rad
    Energy vector elevation (E)      0.5296 rad
    Energy vector radius (RE)        0.0788

    Table 4: Scalar metrics for the diffuse sound field (see Fig. 20) with N = 14400.

    The energy density distribution is not homogeneous and the diffuseness goes through strong spatial variations from zero to unity. By itself, the diffuseness average of 0.43666 suggests a moderately diffuse sound field, which is not the case. Since the diffuseness is approximately one everywhere for the stationary wave (see Fig. 18(b)) but much less than unity for the true diffuse sound field, the diffuseness might not be the most appropriate and illustrative quantifier to distinguish a harmonic standing wave from a harmonic diffuse sound field.

    The directional pressure and energy density are shown in Figs. 23(a) and 23(b). Since the energy vector is very small, it is not shown in these two figures. By comparison with the previously reported test cases, these directional quantifiers show a distribution that covers more uniformly the solid angle of 4π steradians. However, one can observe that the directional energy shows some sort of principal directions, something that would not be the case for a true description of a diffuse sound field. This heterogeneity explains the fact that the scalar metrics reported in subsequent paragraphs do not reach the expected ideal theoretical extreme values. In fact, due to the finite size and geometry of the array, it might not be possible to obtain an entirely filled plane wave distribution for a true diffuse sound field.

    The scalar metrics for the diffuse sound field test case are particularly informative. They are shown in Tab. 4. Attention will be directed to the two most relevant scalar metrics. First, one notes that the directional diffusion d is higher than for all the other test cases. Second, the energy vector radius RE is, as expected, very low, hence suggesting a poorly perceived sound source position. As will be shown in the next section, these two scalar metrics can be used to derive a classification tree that might be able to distinguish between the archetypical situations of a precise sound source, a standing wave pattern and a diffuse sound field.

    [Figure 23, two panels over axes x1, x2, x3: (a) directional pressure pl, color: dB ref 1; (b) directional energy density El, color: dB ref 1.]

    Fig. 23: (a): Directional pressure pl (linear (radius) and dB ref 1 (color) scale) and (b): directional energy density El (linear (radius) and dB ref 1 (color) scale) for a diffuse sound field (SFE shown in Fig. 20).

    3.8. Transition between test cases

    The previous sections and subsections highlighted the effectiveness of the field and scalar metrics to predict various characteristics of the sound field computed by SFE. Although the physical metrics such as the sound intensity field, DOA field, diffuseness field and expected ITD are very useful, they provide a very detailed description of the sound field. Sometimes, it is more useful to classify or characterize the sound field in broader categories, atoms or terms so that the most appropriate sound reproduction technique can be selected for that sound environment component. In this section, we present the transition between the archetypical test cases reported earlier: it will be shown that a simple classification could be based on two scalar metrics, namely the energy vector magnitude RE and the directional diffusion d.

    Four transitions were computed for 20 interpolation points: from the single dipole to the two monopoles (hereafter named #1 to #2), from the single dipole to the standing wave (#1 to #3), from the standing wave to the diffuse sound field (#3 to #4) and from the single dipole to the diffuse sound field (#1 to #4). For the interpolation, the original test cases were first scaled to ensure a similar vector 2-norm of the measured pressure p(xm) at the microphone array. Note that transitions between the two monopoles in free field and the standing wave or diffuse sound field are not reported. Indeed, it was sufficient to keep only a single free-field case, namely the single dipole, for the transition between free field and standing wave or diffuse sound field. Otherwise, the transition graphics would have been too dense. Next, for the interpolation, a linear amplitude fade is applied between two limiting test cases and the inverse problem is solved for each of the interpolation points. The energy vector and the directional diffusion are then directly computed as above. Note that for the interpolations the regularization parameter in Eq. 22 is chosen as follows: 0.01 for #1 to #2, and 0.0001 for #1 to #3, #1 to #4 and #3 to #4.
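The scaling and fade steps of this interpolation can be sketched as below. The sketch assumes that the fade is applied to the complex pressure vectors at the microphones and that both end points are included in the 20 interpolation points; the array size of 96 microphones and the random pressures are placeholders, and `transition_pressures` is a hypothetical name. The inverse problem would then be solved for each returned vector.

```python
import numpy as np

def transition_pressures(p_start, p_end, n_points=20):
    """Linear amplitude fade between two measured pressure vectors of equal 2-norm."""
    p_start = np.asarray(p_start, dtype=complex)
    p_end = np.asarray(p_end, dtype=complex)
    # Scale the second case so that both vectors share the same 2-norm.
    p_end = p_end * np.linalg.norm(p_start) / np.linalg.norm(p_end)
    fades = np.linspace(0.0, 1.0, n_points)
    return [(1.0 - a) * p_start + a * p_end for a in fades]

rng = np.random.default_rng(1)
p_a = rng.normal(size=96) + 1j * rng.normal(size=96)          # placeholder array pressures
p_b = 5.0 * (rng.normal(size=96) + 1j * rng.normal(size=96))  # deliberately louder case
steps = transition_pressures(p_a, p_b)
print(len(steps))
```

Equalizing the 2-norms first keeps the fade percentage meaningful: without it, the louder of the two limiting cases would dominate most of the transition.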

    The results, in terms of energy vector magnitude and directional diffusion, are shown in Fig. 24.

    Let us first examine the transition #1 to #2, which is from the single dipole to the two monopole cases. As one could expect, they both occupy a similar region of the d-RE plane and, most notably, they share a very high RE, which indicates a very directive sound field.

    Next, consider the transition #1 to #3, which is from the single dipole to the standing wave. The major difference between the two limiting points of this transition is along RE: the standing wave involves, as explained earlier, a very low RE. When the transition fade is at 50 %, RE ≈ 0.6. The curvature (along axis d) of this transition curve is easily explained by the fact that the stand-


    simple assume that a point that tends toward this limiting corner will approach a single propagating wave in free field. The other limiting case occurs at RE = 0, d/100 = 1 (shown as a thick + in Fig. 24). This case is even more difficult to reach both physically and as a solution of the inverse problem. Indeed, this is only possible if the plane wave amplitudes are all exactly equal in magnitude. With the imperfections of the array and of the SFE algorithm, such an ideal case cannot be reached. This explains why d/100 seems to be limited to about 0.6 for the reported cases. The last possible limiting case is at RE = 0, d/100 = 0 (shown as a thick marker in Fig. 24). This could only occur if two plane waves of exactly the same amplitude were exactly facing each other. That would be the indication of an axial mode. The other corner of the classification plot, RE = 1, d/100 = 1, corresponds to an impossible case of both spherically-polarized and spherically-uniform plane wave amplitudes. Therefore, one should not expect any points in the top right triangular part of the plot.

    Before actually proceeding to the experimental validation of the proposed SFE and SFC methods, the suggested classification criteria must be discussed.

    First, one should be aware that this is a preliminary proposal and that it could be refined. Indeed, it is easy to object that the methodology behind the definition of this classification is some sort of "manual" multi-dimensional analysis. We note that a systematic multi-dimensional analysis could be performed. However, the simple classification proposed in this paper has the great advantage of involving two simple metrics rather than a linear combination of metrics. Consequently, the classification is easy to understand and interpret.

    Second, it could be objected that the proposed transition values of RE and d that circumscribe the directive, non-directive, stationary and diffuse regions are derived for specific test cases, for a specific microphone array and for dimensionless simulations. Further verifications for various cases should be performed. In fact, we suggest that these transition values should be verified for other microphone arrays. In all cases, the experimental results will show that the proposed classification and transition values are convenient for the reported experiments.

    3.10. Sound-field type scores

    In real applications, a sharp classification tree might not always be the most appropriate approach to quantify the measured sound field. Therefore, we propose the introduction of sound-field type scores. The free-field score is given by

    $$S_{ff} = R_E. \qquad (54)$$

    The modal or standing-wave score is given by

    $$S_m = (1 - R_E)(1 - d/60)^2, \qquad (55)$$

    and the diffuse-field score is given by

    $$S_d = (1 - R_E)(d/60)^2. \qquad (56)$$

    As one notes, the division of the directional diffusion by 60 is inspired by the observations that stem from the transition plot shown in Fig. 24. Further refinements of these scores could be made. The advantage of the scores over a sharp classification scheme is that they can deal with intermediate cases. Moreover, for broadband noise or signals, it would be possible to plot the scores as a function of frequency. This is the topic of current verifications.
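As an illustration, the three scores can be evaluated for the (RE, d) pairs listed in Tabs. 2 to 4. The sketch assumes the forms Sff = RE, Sm = (1 - RE)(1 - d/60)^2 and Sd = (1 - RE)(d/60)^2, with d expressed in percent; the function name `sound_field_scores` and the dictionary labels are illustrative.

```python
def sound_field_scores(r_e, d):
    """Sound-field type scores; r_e in [0, 1], d in percent."""
    s_ff = r_e                                  # free-field score
    s_m = (1.0 - r_e) * (1.0 - d / 60.0) ** 2   # modal (standing-wave) score
    s_d = (1.0 - r_e) * (d / 60.0) ** 2         # diffuse-field score
    return {"free field": s_ff, "modal": s_m, "diffuse": s_d}

# (RE, d) pairs taken from the scalar metric tables of the three test cases.
cases = {
    "two monopoles": (0.7387, 22.9553),
    "standing wave": (0.0379, 12.3839),
    "diffuse field": (0.0788, 53.9748),
}
best = {}
for name, (r_e, d) in cases.items():
    scores = sound_field_scores(r_e, d)
    best[name] = max(scores, key=scores.get)
    print(name, {key: round(value, 3) for key, value in scores.items()}, "->", best[name])
```

With these forms, each of the three tabulated cases receives its highest score in the expected category, which is the behavior the scores are meant to provide for the limiting cases.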

    To illustrate the capability of these sound-field scores to deal with the reported test cases, Fig. 26 shows the scores of the four transitions already reported in Fig. 24. For Fig. 26(a), the transition from the single dipole to the two monopoles systematically gives Sff as the highest score. For the transition from the single dipole to the diffuse sound field, the scores shown in Fig. 26(b) are able to distinguish the free-field and diffuse-field situations. For the two other transitions reported in Figs. 26(c) and (d), the scores are also good detectors for the standing waves and diffuse sound field.

    4. CONCLUSION

    The aim of this paper was twofold: 1) develop and describe a method of spatial sound field extrapolation based on microphone array measurements of arbitrary geometry and 2) develop and define a sound field characterization method and a sound field classification based on known objective and subjective metrics.

    To achieve SFE, a recently developed method was presented and further analyzed. This method is based on the combination of classical least-squares inverse problems in matrix form with a beamforming regularization matrix used as a discrete smoothing norm in the regularization. In this paper, we extended the analysis of this regularization method, which was compared to the application of a beamforming scaling matrix in the inverse problem


    [Figure 26, four rows of bar charts of the scores Sff, Sm and Sd (bars labeled ff, m, d; vertical axis from 0 to 1): (a) Transition from #1 to #2 (0%, 50%, 100%); (b) Transition from #1 to #4 (0%, 50%, 100%); (c) Transition from #3 to #4 (0%, 50%, 100%); (d) Transition from #1 to #3 (0%, 50%, 100%).]

    Fig. 26: Sound type scores (Sff, Sm, Sd) for the four transitions shown in Fig. 24. The scores are only shown for 0 %, 50 % and 100 % (from left to right) of the transitions; these points correspond to the markers in Fig. 24. The highest scores are highlighted as black bars.

    with classical Tikhonov regularization. On the basis of the generalized singular value decomposition of the transfer matrix and beamforming regularization matrix pair and the singular value decomposition of the transfer matrix, it was shown that the beamforming regularization matrix approach is equivalent to an inverse problem where the original transfer matrix is scaled by a beamforming scaling matrix. This is a new result that clarifies the original proposal on the use of a beamforming regularization matrix [7].
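The general idea behind this equivalence can be illustrated with the textbook standard-form transformation of a regularized least-squares problem. The sketch below is not the paper's derivation: it assumes an invertible regularization matrix L, a penalty of the form lam^2 ||L q||^2, and uses a random diagonal L as a placeholder for a beamforming-derived matrix. The matrix sizes and the names `solve_general` and `solve_scaled` are hypothetical.

```python
import numpy as np

def solve_general(G, p, L, lam):
    """Minimize ||G q - p||^2 + lam^2 ||L q||^2 (regularization matrix L)."""
    A = G.conj().T @ G + lam ** 2 * (L.conj().T @ L)
    return np.linalg.solve(A, G.conj().T @ p)

def solve_scaled(G, p, L, lam):
    """Same problem in standard form: G scaled by inv(L), identity regularization."""
    L_inv = np.linalg.inv(L)
    G_scaled = G @ L_inv
    n = G.shape[1]
    A = G_scaled.conj().T @ G_scaled + lam ** 2 * np.eye(n)
    q_tilde = np.linalg.solve(A, G_scaled.conj().T @ p)
    return L_inv @ q_tilde  # map back to the original unknowns

rng = np.random.default_rng(2)
M, N = 12, 30  # hypothetical numbers of microphones and plane waves
G = rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N))
p = rng.normal(size=M) + 1j * rng.normal(size=M)
L = np.diag(rng.uniform(0.5, 2.0, N))  # placeholder for a beamforming-derived matrix
q_general = solve_general(G, p, L, 0.1)
q_scaled = solve_scaled(G, p, L, 0.1)
print(np.max(np.abs(q_general - q_scaled)))
```

Both routes give the same source distribution, so a non-identity regularization matrix can be read as a rescaling of the transfer matrix followed by classical Tikhonov regularization.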

    Once SFE was achieved, the inverse problem solution was investigated to evaluate different sound field metrics, namely: energy density, sound intensity, direction of arrival, diffuseness, velocity vector, energy vector, directional energy, interaural time difference, incident directivity factor, incident directivity index and directional diffusion. Using theoretical simulations, they were compared in terms of their capability to simply characterize archetypical sound field types: a small number of sources in free-field situations, standing wave patterns