
Perceptual Approximation of Room Impulse Responses

with Artificial Reverberation Algorithms

Audio Engineering Project Thesis

Lukas Knöbl

Assessor: MSc. Georgios Marentakis

Graz, 31.05.2015

Abstract

The goal of this project is to design and implement a reverberation algorithm with various perceptual controls. The literature in the field is reviewed and synthesized in order to arrive at a realistic and efficient reverberation algorithm. Furthermore, the algorithm is tested with respect to its ability to simulate given impulse responses and its suitability for providing perceptual controls.

Summary (translated from the German Kurzbeschreibung)

The aim of this project is the development and implementation of a room simulation algorithm whose perceptual properties can be altered by means of various controls. A literature review is carried out, which serves as the basis for designing a realistic and resource-efficient reverberation algorithm. Subsequently, the algorithm is tested with regard to its ability to simulate impulse responses of real rooms.

Lukas Knoebl, Simulation of Room Impulse Response


Table of Contents

1 Introduction
2 Artificial Reverberation Algorithms
2.1 Properties of room acoustics
2.2 EDC and EDR
2.3 Time & Frequency Density
2.3.1 Frequency Density
2.3.2 Time Density
2.4 Echo Density Profile
2.5 Convolution Reverb
2.6 Algorithmic Reverberation
2.6.1 The Recursive Comb Filter
2.6.2 The Allpass Filter
2.7 Classic Filter Network Structures
2.7.1 Parallel Comb Filters
2.7.2 Combination of Comb- and Allpass Filters
2.7.3 Nested Series Allpass Network
3 Feedback Delay Networks
3.1 General FDN Structure
3.2 Obtaining frequency-dependent reverberation times
3.2.1 Implementation using 1st order filters
3.2.2 Control of decay characteristics in FDNs
3.3 Parameterization of FDNs
4 The Final Algorithm
4.1 Impulse Response Analysis
4.1.1 Segmentation of Early Reflections and Late Reverberation
4.1.2 Measuring Reverberation Time and Total Energy
4.2 Improving the Quality of Late Reverberation
4.2.1 Delay Lines and Delay Lengths
4.2.2 Modulation
4.2.3 Output Vectors and Feedback Matrix
4.3 Controlling the Reverberation Time
4.4 Spectral Correction Filter
4.5 Synthesizing the Early Reflections
4.6 Summary of the Algorithm
5 Evaluation and Results
5.1.1 Simulation of a Concert Hall
5.1.2 Simulation of a Cavern
5.1.3 Simulation of a Large Studio
5.1.4 Simulation of a Small Studio
5.1.5 Simulation of an Algorithmic Reverb
5.2 General Considerations
6 Perceptual Controls
7 Conclusion and Outlook
Bibliography
Picture Credits
Appendix



1 Introduction

Artificial reverberators are used in almost every audio production to add space, depth and life

to dry recordings. In the very beginning analogue devices - such as plates, springs or even

echo chambers - were used for this purpose, but as technology developed further digital

reverberators have become the new studio standard [JC91].

There exist two main approaches to create artificial reverberation: One technique is based on

the convolution of a room’s impulse response with the input signal, whereas the other one is

based on delay networks. While the former approach produces very accurate and authentic
results, its computational cost grows with the length of the impulse
response [CNW14]. Algorithmic reverberators, in contrast, usually provide a larger set of
perceptual controls for adjusting parameters such as diffusion or reverberation time, but they

have to be designed carefully in order to produce a natural sounding decay without

undesirable artefacts.

The reverberation tail of a real room's impulse response consists of a large number of echoes
whose density increases over time, and it is perceived as exponentially decaying white noise. Various

algorithms have been developed to recreate these characteristics in real time as closely as

possible while using only a reasonable amount of processing power [Jot92].

One can separate such impulse responses into two parts. The very beginning of the
reverberation tail usually consists of discrete reflections, which are called early reflections. In
most cases algorithms attempt to model these reflections by using a set of delay lines. The
early reflections contain important information about the characteristics of a room, such as its size,
geometry and architecture [Zöl08].

Fig. 1.1: Echogram showing three parts of the reverberation process: direct sound, early

reflections and late reverberation.



However, in the late part of the impulse response the reflections become much denser, and thus their exact prediction (e.g. via the image-source model) with delay lines becomes impractical [Schro62]. Usually, a more statistical approach is chosen to model the late part of the reverberation tail [Ruo12]; in fact, most algorithms use a combination of feedback loops and delay taps. The recursive structures guarantee that the input signal is repeated and attenuated in each loop. By combining many of these structures into networks, it is possible to create an overall output that sounds like exponentially decaying white noise. But even with a high number of delay elements it is difficult to avoid discrete frequency peaks in the artificial reverberator's reverb tail, which are caused by the fixed lengths of the delay lines. The resulting sound can be described as 'metallic', because certain frequencies resonate more than others, which sounds very unnatural [Fre00].

In chapter two, some properties of room acoustics, which are strongly associated with

artificial reverberation, are reviewed and basic feedback structures like comb- and allpass

filters are explained. In addition, some different approaches and algorithms which have been

used in the past are compared. Chapter three focuses on a very specific
and common type of algorithm, the Feedback Delay Network (FDN).

Chapter four will describe the final algorithm in detail. All basic blocks will be discussed step

by step, starting with the impulse response analysis and automatic parametrisation procedure.

Next, some methods will be suggested in order to increase the subjective quality of the

reverberation tail generated by the FDN. Finally, it will be explained how to adjust the

frequency dependent reverberation time and initial energy to the statistics of the original

impulse response.

In chapter five we will investigate the algorithm’s ability to model various impulse responses

of different lengths and sources. Finally, chapter six will describe the perceptual controls

which were added to the reverberator in order to provide the option to modify certain

parameters like reverberation time or diffusion.



2 Artificial Reverberation Algorithms

2.1 Properties of room acoustics

Any sound source placed in an enclosure will produce a dense pattern of reflections at the listening position, which is perceived as reverberation. These echoes occur because the sound is partly reflected and diffracted by the walls and objects within the room, depending on their shape and the material of their surfaces. A concrete wall will reflect almost all frequencies equally, whereas other materials (like wood or fabric) will absorb more of the higher frequencies. A reflection is therefore a delayed and attenuated copy of the direct sound [Zöl11]. A variety of room acoustical measures exist that describe the acoustical quality of a room, one of them being Sabine's reverberation time [JC91]:

$$T = 0.161 \left[\frac{\mathrm{s}}{\mathrm{m}}\right] \cdot \frac{V}{A_{ges}}$$ (Eq. 2.1)

where V is the volume of the room in m³ and $A_{ges}$ is the total frequency-dependent absorption of the room.
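As a small illustration of Eq. 2.1, the following Python sketch evaluates Sabine's formula. The function name and the example room (1000 m³ with an absorption of 100 m²) are hypothetical and not taken from the thesis.

```python
def sabine_reverberation_time(volume_m3, absorption_m2):
    """Eq. 2.1: T = 0.161 s/m * V / A_ges, returned in seconds."""
    if absorption_m2 <= 0.0:
        raise ValueError("absorption must be positive")
    return 0.161 * volume_m3 / absorption_m2


# Hypothetical room: 1000 m^3 with a total absorption of 100 m^2.
t_example = sabine_reverberation_time(1000.0, 100.0)  # about 1.61 s
```

Doubling the absorption halves the reverberation time, which is why heavily damped rooms sound 'dry'.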

The reverberation time is the time required for the energy in a sound signal to decay by a certain level. The most commonly used measure is the RT60, which corresponds to a decay of 60 dB [GW09].

The transfer function from a static sound source to a listening position is described by the

impulse response, which is a function over time and describes the system’s (in this case the

room's) reaction for all frequencies. It can be divided into three parts: the direct sound, the
early reflections and the late reverberation [Fre00].

The initial direct sound is the unreflected sound following the shortest acoustical path from
the source to the listening position. It is followed by a discrete set of early reflections, which,
though normally not perceived separately, are important for the perception of both

the size of the room and the size of the sound source [Zöl11]. Finally, the density of the late

reverberation is so high that it closely resembles Gaussian Noise [MS07]. This latter portion

gives information about the room size as well as the distance of the sound source [Zöl11].



Fig. 2.1: Room impulse response illustrating the three different periods of reverberation.

2.2 EDC and EDR

To obtain the reverberation time, Schroeder introduced the Energy Decay Curve (EDC),

which is computed by the time reversed integration of the squared impulse response h(t). It

describes the remaining energy in the impulse response after any time t [Jot92]:

$$EDC(t) = \int_{t}^{+\infty} h^2(\tau)\, d\tau$$ (Eq. 2.2)

Since the energy decreases over time, the reverberation time can be derived from the slope of

this decay.
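The backward integration of Eq. 2.2 and the slope-based estimate of the reverberation time can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the fit range of -5 dB to -35 dB, the function names and the synthetic test response (an exponential decay with a known RT60 of 0.5 s at a sampling rate of 1 kHz) are choices made here for illustration, not the procedure used in the thesis.

```python
import math


def energy_decay_curve_db(h):
    """Discrete form of Eq. 2.2 (backward integration), in dB re. total energy."""
    edc = [0.0] * len(h)
    running = 0.0
    for n in range(len(h) - 1, -1, -1):  # accumulate from the end towards t
        running += h[n] * h[n]
        edc[n] = running
    total = edc[0]
    return [10.0 * math.log10(e / total) if e > 0.0 else float("-inf") for e in edc]


def estimate_t60(edc_db, fs, upper_db=-5.0, lower_db=-35.0):
    """Least-squares line through the EDC between upper_db and lower_db,
    extrapolated to a decay of 60 dB (the fit range is an assumption)."""
    points = [(n / fs, v) for n, v in enumerate(edc_db) if lower_db <= v <= upper_db]
    mean_t = sum(t for t, _ in points) / len(points)
    mean_v = sum(v for _, v in points) / len(points)
    slope = (sum((t - mean_t) * (v - mean_v) for t, v in points)
             / sum((t - mean_t) ** 2 for t, _ in points))  # dB per second
    return -60.0 / slope


# Synthetic impulse response: amplitude falls by 60 dB within true_t60 seconds.
fs = 1000
true_t60 = 0.5
decay = math.log(1000.0) / (true_t60 * fs)
h = [math.exp(-decay * n) * (1.0 if n % 2 == 0 else -1.0) for n in range(3 * fs // 2)]
t60_estimate = estimate_t60(energy_decay_curve_db(h), fs)
```

For a measured impulse response the fit range would have to stay above the noise floor, because the backward integral flattens out once noise dominates.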

Jot further extended this method by computing the integral for several frequency bands of the

impulse response. The resulting plot is called the Energy Decay Relief (EDR), which can be

used to visualize the reverberation time as a function of frequency and time in a three-

dimensional graph [Jot92].



As Fig. 2.2 shows, the reverberation time is longer at low frequencies because usually more of

the higher frequencies are absorbed by the walls and objects within the room. This is an

important characteristic and has to be considered during the development of artificial reverb

algorithms.

Fig. 2.2: Energy Decay Relief of a concert hall.

2.3 Time & Frequency Density

2.3.1 Frequency Density

The resonant frequencies of a rectangular room can be determined by

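$$|f_r| = \frac{c}{2} \sqrt{\left(\frac{n_x}{l_x}\right)^2 + \left(\frac{n_y}{l_y}\right)^2 + \left(\frac{n_z}{l_z}\right)^2}$$ (Eq. 2.3)

where $l_x$, $l_y$, $l_z$ are the room dimensions and $n_x, n_y, n_z \in \mathbb{N}$ [GW09].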
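Eq. 2.3 is easy to evaluate numerically. The sketch below assumes a speed of sound of 343 m/s and a hypothetical room of 5 m x 4 m x 3 m; neither value comes from the thesis.

```python
import math


def room_mode_frequency(nx, ny, nz, lx, ly, lz, c=343.0):
    """Eq. 2.3 for a rectangular room; dimensions in m, c in m/s, result in Hz."""
    return (c / 2.0) * math.sqrt((nx / lx) ** 2 + (ny / ly) ** 2 + (nz / lz) ** 2)


# Hypothetical 5 m x 4 m x 3 m room: the lowest axial mode along each dimension.
axial_modes = [
    room_mode_frequency(1, 0, 0, 5.0, 4.0, 3.0),
    room_mode_frequency(0, 1, 0, 5.0, 4.0, 3.0),
    room_mode_frequency(0, 0, 1, 5.0, 4.0, 3.0),
]
```

The longest dimension produces the lowest resonance, so larger rooms have modes that start at lower frequencies.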

The modal density is defined as the number of modes per Hertz, and is dependent on

frequency f, volume of the room V and the speed of sound [Kut91]:


$$\frac{dN}{df} \sim \frac{4\pi V}{c^3} f^2$$ (Eq. 2.4)

The modal density thus increases with the square of the frequency. Above a

certain frequency, the modal density is so high that the human ear cannot perceive individual

frequency peaks anymore [Smi10]. For average rooms, this limiting frequency can be given

as:

$$f_g = 2000 \sqrt{\frac{T}{V}}$$ (Eq. 2.5)

where T is the reverberation time in seconds and V is the volume of the room [JC91].

Above this critical frequency, the mean spacing between the frequency peaks can be

approximated by [Fre00]:

$$\Delta f_{max} = \frac{4}{T} \; [\mathrm{Hz}]$$ (Eq. 2.6)

The theoretical modal density must be distinguished from the perceivable frequency density.
The frequency response of a room fluctuates by about 10 dB on average, but psycho-acoustic

experiments have shown that such irregularities cannot be detected by the human ear when

the density of modes is high enough [Schro62]. Also, some modes with lower amplitudes

cannot be perceived and therefore the frequency density is always lower than the modal

density [Zöl11].

As explained in chapter 2.6.1, increasing the delay time of a comb filter increases the modal density and, with it, the frequency density.

For a large concert hall with V = 30000 m³ and a reverberation time of 3.5 seconds, the critical frequency is 21.6 Hz. At this frequency the modal density is 4.35 modes per Hz, and $\Delta f_{max}$ is 1.14 Hz. The frequency density will therefore be 0.88 modes per Hz ($1/\Delta f_{max}$) [Fre00].
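The concert hall figures above can be retraced with Eq. 2.4 to Eq. 2.6. The following Python sketch reads Eq. 2.4 as an equality and assumes c = 343 m/s; with that assumption the modal density comes out at roughly 4.36 modes per Hz, close to the quoted 4.35. The function names are illustrative.

```python
import math


def critical_frequency(t60, volume):
    """Eq. 2.5: limiting frequency in Hz (t60 in s, volume in m^3)."""
    return 2000.0 * math.sqrt(t60 / volume)


def modal_density(frequency, volume, c=343.0):
    """Eq. 2.4, read here as an equality: modes per Hz at the given frequency."""
    return 4.0 * math.pi * volume * frequency ** 2 / c ** 3


def mean_peak_spacing(t60):
    """Eq. 2.6: mean spacing of the frequency peaks in Hz."""
    return 4.0 / t60


f_g = critical_frequency(3.5, 30000.0)        # about 21.6 Hz
density_at_f_g = modal_density(f_g, 30000.0)  # about 4.36 modes per Hz for c = 343 m/s
delta_f_max = mean_peak_spacing(3.5)          # about 1.14 Hz
frequency_density = 1.0 / delta_f_max         # 0.875 modes per Hz
```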



2.3.2 Time Density

The echo density in a room is defined as the number of echoes arriving per second at a certain time t
[Schro62]:

$$\frac{dN_t}{dt} \sim \frac{4\pi c^3}{V} t^2$$ (Eq. 2.7)

According to Griesinger, an echo density of 10000 echoes per second for transient input

signals should be sufficient for a natural, colourless reverberation [JC91].
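To get a feeling for these numbers, Eq. 2.7 can be solved for the time at which a room reaches a given echo density. The sketch below treats the proportionality as an equality and assumes c = 343 m/s; the concert hall volume of 30000 m³ is reused from the previous section purely as an example.

```python
import math


def echo_density(t, volume, c=343.0):
    """Eq. 2.7, read as an equality: echoes per second at time t (in s)."""
    return 4.0 * math.pi * c ** 3 * t ** 2 / volume


def time_to_reach_density(target, volume, c=343.0):
    """Eq. 2.7 solved for t: time in s until the target echo density is reached."""
    return math.sqrt(target * volume / (4.0 * math.pi * c ** 3))


# Time until a 30000 m^3 hall reaches 10000 echoes per second (roughly 0.77 s).
t_dense = time_to_reach_density(10000.0, 30000.0)
```

Because the time grows with the square root of the volume, a room with a quarter of the volume reaches the same density in half the time.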

As with frequency peaks, the amplitudes of successive echoes can differ considerably, and

masking may occur. This suggests introducing the term ‘time density’, which refers to the

perceived number of echoes. Again, the time density is generally lower than the echo density.

Also, the time tc after which individual echoes are not distinguishable anymore is dependent

on the length ∆t of the exciting impulse:

$$t_c = 5 \cdot 10^{-5} \sqrt{\frac{V}{\Delta t}}$$ (Eq. 2.8)

Further properties of good-sounding large rooms are the absence of flutter echoes and an equal

decay rate for all modes within the same frequency region [Schro62].

2.4 Echo Density Profile

As discussed previously, the echo density increases over time until the number of reflections

is so high that the impulse response is very similar to exponentially decaying Gaussian noise.

The rate of this increase is dependent on the size and shape of the room and whether it is

empty or cluttered with reflective objects.

There are two reasons why counting reflections over a period of time (most commonly one second) is not always the best way to express echo density. First, it is not clear how a reflection should be defined (i.e. when an impulse response tap counts as a reflection and when it does not). Second, the sampling frequency limits the number of echoes that can be detected [AH06].



Abel and Huang [AH06] introduced a measure called the 'Echo Density Profile', which relies on the property that the impulse response taps take on a Gaussian distribution after a sufficient amount of time. For this purpose, a sliding window over the impulse response is used. The echo density measure counts the taps lying outside the standard deviation for the window and normalizes the result by the count expected for Gaussian noise. As a result, the value of the echo density, which lies between zero and one (due to the normalization), will be low if there are only a few pronounced reflections, because these reflections enlarge the standard deviation and few taps exceed it. Conversely, the value will be close to one for extremely dense reflection patterns.

Accordingly, the echo density profile η(t) is defined as the fraction of impulse response taps

which lie outside the window standard deviation.

$$\eta(t) = \frac{1}{\operatorname{erfc}(1/\sqrt{2})} \sum_{\tau=t-\delta}^{t+\delta} w(\tau)\, \mathbf{1}\{|h(\tau)| > \sigma\}$$ (Eq. 2.10)

where h(t) is the (zero-mean) impulse response, $2\delta + 1$ is the window length (in samples), $\operatorname{erfc}(1/\sqrt{2}) \doteq 0.3173$ is the expected fraction of samples lying outside one standard deviation from the mean for a Gaussian distribution, $\mathbf{1}\{\cdot\}$ returns one when its argument is true and zero otherwise, w(t) is a positive weighting function (to reduce the effect of the taps at the window edges) and $\sigma$ is the window standard deviation

$$\sigma = \left[ \sum_{\tau=t-\delta}^{t+\delta} w(\tau)\, h^2(\tau) \right]^{1/2}$$ (Eq. 2.11)

where $w(\tau)$ is normalized to have unit sum, $\sum_{\tau} w(\tau) = 1$.
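Eq. 2.10 and Eq. 2.11 translate almost directly into code. The following Python sketch is a straightforward (and deliberately unoptimized) reading of the two equations. The Hann weights are built so that the edge taps stay strictly positive, which is an implementation choice made here, and the profile is only evaluated where the full window fits into the impulse response.

```python
import math
import random


def hann_weights(delta):
    """Positive Hann weights over 2*delta + 1 taps, normalized to unit sum."""
    length = 2 * delta + 1
    w = [0.5 - 0.5 * math.cos(2.0 * math.pi * (k + 1) / (length + 1))
         for k in range(length)]
    total = sum(w)
    return [v / total for v in w]


def echo_density_profile(h, delta):
    """Eq. 2.10 evaluated for t = delta ... len(h) - delta - 1."""
    w = hann_weights(delta)
    expected = math.erfc(1.0 / math.sqrt(2.0))  # about 0.3173
    profile = []
    for t in range(delta, len(h) - delta):
        frame = h[t - delta:t + delta + 1]
        sigma = math.sqrt(sum(wk * x * x for wk, x in zip(w, frame)))  # Eq. 2.11
        outside = sum(wk for wk, x in zip(w, frame) if abs(x) > sigma)
        profile.append(outside / expected)
    return profile


# Two extreme test signals: Gaussian noise and a single isolated reflection.
random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(3000)]
noise_profile = echo_density_profile(noise, delta=100)

sparse = [0.0] * 1001
sparse[500] = 1.0
sparse_profile = echo_density_profile(sparse, delta=100)
```

For the noise signal the profile fluctuates around one, whereas the window centred on the isolated reflection yields a value close to zero, matching the behaviour described above.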

For a typical room impulse response or an impulse response taken from a high-quality

artificial reverberator the echo density profile starts near zero and increases to around one,

which indicates the start of the late field. The rate of increase and the time at which the value

of one is reached will depend on the properties of the room or algorithm.



Abel and Huang suggest setting the length of the sliding window to about 20-30 ms. A shorter

window will be more responsive to short-term echo density changes, but can also cause jumps

in the profile, which should be avoided. Individual echoes with large amplitudes can be

responsible for sudden changes in the echo density profile and thus a weighting function (e.g.

Hanning window) should be used in order to smooth out the curve (taps at the window edges

will be de-emphasized).

Fig. 2.5.1: Top: Echo density profiles (Hanning weighting, window length 20 ms) for

the impulse responses of a lobby (red, green) and a hallway (blue, yellow). Bottom:

Corresponding impulse responses.

As shown in Fig. 2.5.1, the echo density profile and the time at which the late field is reached
are very similar for impulse responses measured in the same room. This suggests that the

echo density profile is not noticeably affected by the microphone placement or source

position. Generally, in big rooms it takes more time for the reflections to build up whereas in

smaller rooms the initial echo density is higher. The slope of the curve is dependent on the



number of reflective surfaces as well as their distances to the sound source and to the listening

position [AH06].

Fig. 2.5.2: Echo density profiles (Hanning weighting, window length 26 ms) for

the impulse responses of a concert hall located in Pori, Finland (top), and for the

impulse response taken from a Lexicon Vocal Hall algorithm (bottom).

2.5 Convolution Reverb

A very common approach to reverberate a dry signal would be to simply convolve a room

impulse response with the input signal. For long impulse responses the convolution is usually

performed in the frequency domain, as its complexity is reduced to a simple multiplication.

This can be done by computing the Fourier transform of the impulse response and the block

by block Fourier transform of the input signal. The two spectra are then multiplied point by
point and the result is transformed back to the time domain [Zöl11].
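The block-wise procedure can be sketched as an overlap-add convolution. The Python example below is a minimal illustration that uses a small recursive radix-2 FFT written out by hand; the function names and the toy signals are assumptions made here, and a practical implementation would rely on an optimized FFT library and on partitioning long impulse responses.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_convolve_blocks(x, h, block_size):
    """Overlap-add convolution: multiply block spectra by the spectrum of h."""
    n_fft = 1
    while n_fft < block_size + len(h) - 1:
        n_fft *= 2
    h_spec = fft([complex(v) for v in h] + [0j] * (n_fft - len(h)))
    y = [0.0] * (len(x) + len(h) - 1)
    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        x_spec = fft([complex(v) for v in block] + [0j] * (n_fft - len(block)))
        product = [a * b for a, b in zip(x_spec, h_spec)]  # point-by-point product
        segment = fft(product, inverse=True)
        for i in range(min(n_fft, len(y) - start)):
            y[start + i] += segment[i].real / n_fft  # normalize the inverse FFT
    return y


dry = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy input signal
ir = [1.0, 0.5, 0.25]            # toy impulse response
wet = fft_convolve_blocks(dry, ir, block_size=2)
```

The result equals the direct time-domain convolution up to rounding errors; the tails of successive blocks overlap and are summed.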

Though convolution is a very accurate method to simulate the characteristics of a room, the

quality of the resulting reverberation naturally depends on the quality of the impulse response.



Another drawback is that with convolution there are no parameters to manipulate the decay

characteristics like reverberation time and frequency properties.

2.6 Algorithmic Reverberation

A variety of techniques, like ray-tracing or the image-source model, can be used to create an

impulse response by modelling how sound propagates in a room with certain geometry

[Ruo12]. However, a common approach is to simulate the room acoustics from a perceptual

point of view, since human hearing is not very sensitive to details when it comes to
evaluating reverb tails. Most reverberation algorithms use feedback structures to create echoes
of the input signal, and some of them are described in more detail in the following sections

[Zöl11].

2.6.1 The Recursive Comb Filter

The recursive comb filter was introduced by Manfred Schroeder at Bell Laboratories in 1961 as a computationally inexpensive module for creating artificial reverberation [Zöl11].

It consists of a delay line inserted into a feedback loop with gain g (less than one for stability) in order to produce multiple echoes over time [Schro62].

Fig. 2.3.0: Structure of a comb filter

The impulse response of such a structure corresponds to an exponentially decaying repeated

echo.
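A minimal Python sketch of such a comb filter is given below. It assumes the difference equation y(n) = x(n - m) + g·y(n - m), i.e. the output is taken after the delay line; the thesis only shows the structure as a figure, so this exact form is an assumption.

```python
def comb_filter(x, delay, gain):
    """Recursive comb filter, assumed form y(n) = x(n - m) + g * y(n - m)."""
    if delay < 1 or abs(gain) >= 1.0:
        raise ValueError("need delay >= 1 and |gain| < 1 for a stable filter")
    y = [0.0] * len(x)
    for n in range(delay, len(x)):
        y[n] = x[n - delay] + gain * y[n - delay]
    return y


impulse = [1.0] + [0.0] * 9
comb_ir = comb_filter(impulse, delay=3, gain=0.5)
# comb_ir == [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.25]
```

Every pass through the loop multiplies the echo by g, which gives the exponentially decaying pulse train described above.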



Fig 2.3.1: Impulse response of a comb filter with 𝜏=10ms and g=0.5

Fig 2.3.2: Energy decay relief of a comb filter with 𝜏=10ms and g=0.5

The amplitude spectrum of this impulse train is given as

$$|H(\omega)| = \frac{1}{\left(1 + g^2 - 2g\cos(\omega\tau)\right)^{1/2}}$$ (Eq. 2.12)

where the ratio of the response maxima to minima is

$$\frac{H_{max}}{H_{min}} = \frac{1 + g}{1 - g}$$ (Eq. 2.13)

Fig 2.3.3: Magnitude response and phase of a comb filter with

τ=0.2ms and g=0.5, Fs=44100 Hz

With the above gain setting of g=0.5 (which corresponds to an attenuation of 6 dB for every trip around the feedback loop), the ratio of the maximum to the minimum of the magnitude response is about 10 dB, which results in a very unnatural, 'ringing' sound [Schro62].

As a result, the resonant frequencies we perceive as 'metallic' are spaced by $1/\tau$ Hz on the frequency axis [Schro62].
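The figures quoted for g = 0.5 can be checked with a few lines of Python. The sketch below evaluates Eq. 2.12 at a peak and at a dip and expresses Eq. 2.13 in dB; the function names are illustrative.

```python
import math


def comb_magnitude(omega, tau, gain):
    """Eq. 2.12: magnitude response at angular frequency omega (rad/s)."""
    return 1.0 / math.sqrt(1.0 + gain ** 2 - 2.0 * gain * math.cos(omega * tau))


def max_to_min_ratio_db(gain):
    """Eq. 2.13 expressed in dB."""
    return 20.0 * math.log10((1.0 + gain) / (1.0 - gain))


tau = 0.0002  # 0.2 ms delay
g = 0.5
peak = comb_magnitude(2.0 * math.pi / tau, tau, g)  # omega * tau = 2*pi, maximum
dip = comb_magnitude(math.pi / tau, tau, g)         # omega * tau = pi, minimum
ratio_db = max_to_min_ratio_db(g)                   # about 9.5 dB
loop_attenuation_db = -20.0 * math.log10(g)         # about 6 dB per round trip
peak_spacing_hz = 1.0 / tau                         # 5000 Hz between resonances
```

The ratio depends only on the gain: raising g towards one makes the peaks sharper, while the delay time only sets their spacing.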

Fig 2.3.4: Pole-Zero map of a comb filter with 𝜏=0.2ms and g=0.5, Fs=44100



There are m (m is the delay length in samples) poles in the pole-zero diagram and each pole is

responsible for one frequency peak in the spectrum, which further means that there are m/2

frequency peaks below half of the sampling frequency. Therefore, reducing the delay time will on the one hand increase the number of echoes per second, but on the other hand reduce the modal density, as there are fewer peaks in the spectrum [Fre00]. In real rooms, the density of modes above a certain frequency becomes so high that they interfere and cannot be distinguished by the human ear [SL61]. A combination of high gain settings and short delay times exposes the unnatural resonances of the recursive comb filter even more, because the maximum-to-minimum ratio of the magnitude response increases with the gain [Fre00].

2.6.2 The Allpass Filter

To avoid the problem of unnatural resonances in the frequency domain, Schroeder introduced

the delay-based allpass filter. This structure adds a feed-forward path in which the input signal
is multiplied by the negative of the feedback gain. In contrast to the comb filter, its
frequency response is flat, while it still produces a dense impulse response [Zöl11].

Fig. 2.3.5: Flowgraph of an allpass filter

The output of this structure is given by

$$y(n) = -g \cdot x(n) + x(n - m) + g \cdot y(n - m)$$ (Eq. 2.14)

The resulting impulse response shares the property of exponential energy decay and is similar to the comb filter's impulse response except for the negative peak at the beginning [Schro62, Fre00].
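Eq. 2.14 can be implemented sample by sample as in the following Python sketch. The short delay of two samples is chosen only to keep the example readable; the function name is illustrative.

```python
def allpass_filter(x, delay, gain):
    """Eq. 2.14: y(n) = -g*x(n) + x(n - m) + g*y(n - m)."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        delayed_x = x[n - delay] if n >= delay else 0.0
        delayed_y = y[n - delay] if n >= delay else 0.0
        y[n] = -gain * x[n] + delayed_x + gain * delayed_y
    return y


impulse = [1.0] + [0.0] * 199
allpass_ir = allpass_filter(impulse, delay=2, gain=0.5)
# First taps: -0.5, 0.0, 0.75, 0.0, 0.375 (negative peak, then a decaying train)
total_energy = sum(v * v for v in allpass_ir)  # close to 1.0 for an allpass
```

The total energy of the impulse response stays at one, which reflects the flat magnitude response: the filter redistributes the input energy in time without emphasising any frequency.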



Fig. 2.3.6: Impulse response of an allpass filter with 𝜏=10ms and g=0.5

Fig. 2.3.7: EDR of an allpass filter with 𝜏=10ms and g=0.5

Due to its flat frequency response, the allpass filter does not colour the sound from a perceptual point of view as long as the delay time is much shorter than the integration time of the ear (i.e. the duration of the interval in which stimuli appear to be summated [Tou98]), which is about 50 ms. Otherwise, the time-domain effects become audible and coloration can be perceived [Fre00].

Also, in the case of a stationary input, the coloration created by the comb filter is suppressed

by the allpass filter due to its flat frequency response. However, for short transient signals, the
low echo density still causes a fluttering sound, and the timbre of the comb filter is still present

[JC91].

Fig 2.3.9: Magnitude response and phase of an allpass filter with 𝜏=10ms and g=0.5

Some zeros at the conjugate reciprocal locations to the poles are now added to the pole-zero

map, as the z-transform of the allpass filter is given by [Fre00]:

$$H(z) = \frac{z^{-m} - g}{1 - g z^{-m}}$$ (Eq. 2.15)

The relation between loop gain, delay and reverberation time of a single allpass filter module

can be given as [Schro62]:

$$T_{60} = \left(\frac{3}{\log_{10}|1/g|}\right) \cdot \tau$$ (Eq. 2.16)


For a reverberation time of T = 2 s and a gain setting of 0.708, the delay time must be 100 ms. This produces only ten echoes per second, which is far too few for a natural reverberation. For a single component, whether allpass or comb filter, the echo density is nowhere near that of a real room; a solution to this problem is to connect several of these modules within larger structures [Schro62].
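The numbers in this example follow directly from Eq. 2.16, as the short Python sketch below shows. The function names are illustrative.

```python
import math


def t60_from_loop(gain, tau):
    """Eq. 2.16: reverberation time in s for loop gain g and delay tau (in s)."""
    return 3.0 * tau / math.log10(abs(1.0 / gain))


def delay_for_t60(t60, gain):
    """Eq. 2.16 solved for the delay time tau (in s)."""
    return t60 * math.log10(abs(1.0 / gain)) / 3.0


tau_needed = delay_for_t60(2.0, 0.708)  # about 0.1 s
echoes_per_second = 1.0 / tau_needed    # about 10
```

The factor 3 arises because a decay of 60 dB corresponds to an amplitude factor of 10^-3, and each round trip of duration τ attenuates the signal by the factor g.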

2.7 Classic Filter Network Structures

This section will discuss some common filter networks, consisting of various connections of

comb and allpass filters.

2.7.1 Parallel Comb Filters

If several parallel comb filters are used, as shown in Fig. 2.6.1, it is not possible to achieve a

flat frequency response. However, as long as a sufficient number of different frequency peaks are added together, the impulse response will become closer to that of real rooms. Jot and Chaigne have shown that all comb filters have the same decay rate if the magnitudes of their poles are made equal. This makes the individual resonances of the comb filters less noticeable [JC91].

Fig. 2.6.1: Four parallel comb filters

When connecting P comb filters in parallel, the system transfer function can be given as:


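$$C(z) = \sum_{p=0}^{P-1} \frac{g_p}{z^{m_p} - g_p} = \sum_{p=0}^{P-1} \sum_{k_p=0}^{m_p-1} \left[ \frac{1}{m_p} \cdot \frac{z_{k_p}}{z - z_{k_p}} \right]$$ (Eq. 2.17)

where $m_p$ are the delay lengths in samples, $g_p$ are the gains and $z_{k_p} = \gamma \cdot e^{j\omega_{k_p}}$, where $\gamma = g_p^{1/m_p}$ and $\omega_{k_p} = 2\pi k_p / m_p$.

For the above system, the following condition for equal magnitude of the poles must be fulfilled for any p:

$$\gamma = g_p^{1/m_p}$$ (Eq. 2.18)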
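In practice Eq. 2.18 is used the other way round: a common pole magnitude γ is chosen and the gain of every comb filter follows from its delay length. The Python sketch below derives γ from a desired reverberation time by requiring a decay of 60 dB within T60·Fs samples, which is consistent with Eq. 2.16. The function name, the delay lengths and the target of one second are illustrative assumptions.

```python
def equal_decay_gains(delays, fs, t60):
    """Return (gamma, gains) so that every comb filter decays 60 dB in t60 seconds."""
    gamma = 10.0 ** (-3.0 / (t60 * fs))   # common pole magnitude
    gains = [gamma ** m for m in delays]  # Eq. 2.18 solved for g_p
    return gamma, gains


delays = [1327, 1553, 1801, 1979]  # illustrative delay lengths in samples
gamma, gains = equal_decay_gains(delays, fs=44100, t60=1.0)
# gains are approximately 0.812, 0.784, 0.754 and 0.733
```

Longer delay lines receive smaller gains, because their signal passes through the loop less often per second and must lose more energy on each pass.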

In this context the frequency density $D_f$ and time density $D_t$ (as discussed in chapter 2.3) for a parallel comb filter with P delays and corresponding delay lengths $\tau_p$ can be approximated as follows:

$$D_f = \sum_{p=0}^{P-1} \tau_p \approx P \cdot \tau$$ (Eq. 2.19)

$$D_t = \sum_{p=0}^{P-1} \frac{1}{\tau_p} \approx \frac{P}{\tau}$$ (Eq. 2.20)

Given that about 10000 echoes per second and a frequency density of 0.15 for a reverberation time of one second are considered necessary to obtain a natural and smooth reverberation for transient input signals, a total of 40 comb filters with an average delay length of about 12 ms is needed. Even more comb filters will be required when approximating large rooms, because

the average separation of frequency peaks is inversely proportional to the reverberation time

[JC91].

2.7.2 Combination of Comb- and Allpass Filters

In search for a more efficient algorithm, Schroeder proposed a network of four parallel comb

filters with incommensurate delay lengths in series with two allpass filters, as shown in Fig.



2.6.2. The comb filters are responsible for a sufficient frequency density, whereas the allpass

filters should increase the echo density [Schro62].

Fig. 2.6.2: Delay Network consisting of four parallel comb filters followed by two allpass

filters in series.

This structure has been implemented in Matlab with the following settings (as suggested by Schroeder [Schro62]), and plots are provided below.

Sampling Frequency: Fs = 44100 Hz
Comb Filter Delays: τ1 = 1327, τ2 = 1553, τ3 = 1801, τ4 = 1979 [Samples]
Comb Filter Gains: g1 = 0.81, g2 = 0.78, g3 = 0.754, g4 = 0.733
Allpass Filter Delays: τ5 = 221, τ6 = 75 [Samples]
Allpass Filter Gains: g5 = g6 = 0.7
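For readers without Matlab, the structure can be sketched in Python as shown below. This is not the implementation used for the plots in this thesis; it is a minimal sketch that uses the settings listed above, assumes a comb filter of the form y(n) = x(n - m) + g·y(n - m) and simply sums the comb outputs without any additional scaling.

```python
def comb_filter(x, delay, gain):
    """Recursive comb filter, assumed form y(n) = x(n - m) + g * y(n - m)."""
    y = [0.0] * len(x)
    for n in range(delay, len(x)):
        y[n] = x[n - delay] + gain * y[n - delay]
    return y


def allpass_filter(x, delay, gain):
    """Allpass filter of Eq. 2.14."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        if n >= delay:
            y[n] = -gain * x[n] + x[n - delay] + gain * y[n - delay]
        else:
            y[n] = -gain * x[n]
    return y


COMB_SETTINGS = [(1327, 0.81), (1553, 0.78), (1801, 0.754), (1979, 0.733)]
ALLPASS_SETTINGS = [(221, 0.7), (75, 0.7)]


def schroeder_reverb(x):
    """Four parallel comb filters followed by two allpass filters in series."""
    comb_sum = [0.0] * len(x)
    for delay, gain in COMB_SETTINGS:
        for n, value in enumerate(comb_filter(x, delay, gain)):
            comb_sum[n] += value
    y = comb_sum
    for delay, gain in ALLPASS_SETTINGS:
        y = allpass_filter(y, delay, gain)
    return y


impulse = [1.0] + [0.0] * 4409  # 100 ms at 44.1 kHz
reverb_ir = schroeder_reverb(impulse)
```

In this sketch the first non-zero output sample appears after 1327 samples, the shortest comb delay, with a value of 0.49, because the direct paths of both allpass filters each contribute a factor of -0.7.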

Fig. 2.6.4: EDR of Schroeder's combined comb and allpass filter reverberator

Fig. 2.6.5: Echo Density Profile of Schroeder's combined comb and allpass filter reverberator

While this network provides a reasonable frequency density, the echo density is much lower

than in real rooms, as shown in Fig. 2.6.5. The Echo Density Profile does not reach the level of a Gaussian distribution, and consequently this structure is not able to provide flutter-free reverberation for transient test signals such as clicks.

2.7.3 Nested Series Allpass Network

In his early papers Schroeder [SL61] proposes a second reverberator which consists of five

allpass filters in series, nested within an outer allpass filter, as shown in Fig. 2.6.6. He

suggests choosing the delay length of each of the inner modules to be about one third of the
preceding delay length. Again, these ratios should be made incommensurate to avoid echo

cancellation or superposition. The feedback gains gn are most commonly made equal to

around 0.7.

The delay line of the outer filter can be used to introduce a time gap (predelay) between the

direct sound and the onset of reverberation. For large concert halls, this value is often set to

around 30 ms, depending on the position of the listener. The absolute value of the outer feedback gain should be made less than one to guarantee stability. The ratio of direct sound to reverberant sound is given by $g^2/(1-g^2)$ [SL61].



Fig. 2.6.6: In this structure a series of five allpass filters (figured as the ‘All-pass reverberator

gain, 1’ block) is nested within an outer allpass.

Due to its nested structure this reverberator provides a surprisingly high echo density. As it

exclusively consists of allpass filters, each producing a flat frequency response, the resulting

overall frequency response of the whole structure is also flat.

Fig. 2.6.7: EDR of Schroeder's nested allpass cascade algorithm. The network consists of

several allpass units. Consequently the resulting frequency response is flat.



When analysing the Echo Density Profile (as shown in Fig. 2.6.8) of this structure, a

fundamental property can be observed: the profile increases over time, as is the case in real

acoustic spaces. Accordingly, putting allpass filters into feedback loops can be used as an

efficient method to increase the echo density of artificial reverberation algorithms.

Fig. 2.6.8: Echo Density Profile of Schroeder's nested allpass cascade algorithm. The profile
increases at a slow rate but finally reaches the level of a Gaussian distribution at about 600 ms.



3 Feedback Delay Networks

This chapter will focus on a very common reverberator structure, the Feedback Delay

Network (FDN). This algorithm was first introduced by Stautner and Puckette [SP82] and is

based on delay lines which are connected by means of a feedback matrix [Zöl11]. This structure is intended to combine the high echo density provided by series allpass filters with the ability to simulate the frequency response of real rooms, as can be achieved with parallel comb filters [Fre00].

3.1 General FDN Structure

The FDN can be described as a vector generalization of the recursive comb filter. The order N
of the system is defined by the number of delay lines, each being $\tau_i = m_i T_s$ seconds long,
where $m_i$ is the delay length in samples and $T_s = 1/F_s$ is the sampling interval. The feedback
gain of the unit comb filter is replaced by an N×N feedback matrix A with elements $a_{i,j}$. A
fourth order FDN is shown in Fig. 3.1.1 [Zöl11].

Fig. 3.1.1: Structure of a fourth order FDN

The FDN is completely described by the following properties:


$$y(n) = \sum_{i=1}^{N} c_i\, s_i(n) + d\, x(n)$$ (Eq. 3.1)

$$s_i(n + m_i) = \sum_{j=1}^{N} a_{i,j}\, s_j(n) + b_i\, x(n)$$ (Eq. 3.2)
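The two difference equations can be turned into a sample loop almost literally. The following is a minimal sketch, not the implementation used in this thesis: it assumes one circular buffer per delay line, a Householder feedback matrix of the form (2/N) e eᵀ − I, and very short illustrative delay lengths. The function names are hypothetical.

```python
def householder_feedback(values):
    """Apply A = (2/N) e e^T - I to a list without forming the matrix."""
    mean2 = 2.0 * sum(values) / len(values)
    return [mean2 - v for v in values]


def fdn_impulse_response(delays, num_samples, b=None, c=None, d=0.0):
    """Impulse response of a lossless FDN following Eq. 3.1 and Eq. 3.2."""
    n_lines = len(delays)
    b = b or [1.0] * n_lines
    c = c or [1.0] * n_lines
    # One circular buffer per delay line; buffers[i][ptr[i]] holds s_i(n).
    buffers = [[0.0] * m for m in delays]
    ptr = [0] * n_lines
    out = []
    for n in range(num_samples):
        x = 1.0 if n == 0 else 0.0                          # unit impulse input
        s = [buffers[i][ptr[i]] for i in range(n_lines)]    # delay outputs s_i(n)
        out.append(sum(ci * si for ci, si in zip(c, s)) + d * x)   # Eq. 3.1
        fed_back = householder_feedback(s)
        for i in range(n_lines):
            # Eq. 3.2: the value written now is read again m_i samples later.
            buffers[i][ptr[i]] = fed_back[i] + b[i] * x
            ptr[i] = (ptr[i] + 1) % delays[i]
    return out


# Illustrative delay lengths only (assumption); real FDNs use much longer lines.
ir = fdn_impulse_response([7, 11, 13, 17], 200)
```

With these settings the first non-zero output sample appears after the shortest delay, and the echoes then multiply through the feedback matrix.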

The output signal y(n) is a linear combination of the input signal x(n) and the individual
outputs of the delay lines $s_i(n)$ [JC91]. The delay lengths $m_i$ are generally large integers
on the order of hundreds or thousands [Zöl11]. Jot and Chaigne have investigated the
possibilities of FDNs very thoroughly. Using the z-transform, the above equations can be
written as:

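$$y(z) = \mathbf{c}^T \mathbf{s}(z) + d\, x(z)$$ (Eq. 3.3)

$$\mathbf{s}(z) = \mathbf{D}(z)\,[\mathbf{A}\,\mathbf{s}(z) + \mathbf{b}\, x(z)]$$ (Eq. 3.4)

Column vectors b and c can be used for multiple input-output systems, A is called the
feedback matrix and the delay matrix D(z) is defined as:

$$\mathbf{D}(z) = \begin{bmatrix} z^{-m_1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & z^{-m_N} \end{bmatrix}$$ (Eq. 3.5)

Now it is possible to find the system's transfer function H(z) [JC91]:

$$H(z) = \frac{y(z)}{x(z)} = \mathbf{c}^T [\mathbf{D}(z^{-1}) - \mathbf{A}]^{-1} \mathbf{b} + d$$ (Eq. 3.6)

The poles of the system can be found by solving the characteristic equation of the system:

$$\det[\mathbf{A} - \mathbf{D}(z^{-1})] = 0$$ (Eq. 3.7)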
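The step from the state equation (Eq. 3.4) to the transfer function (Eq. 3.6) can be made explicit. The short derivation below is a sketch; it assumes that D(z) is invertible and uses the fact that the inverse of the diagonal delay matrix is D(z⁻¹).

```latex
\begin{aligned}
\mathbf{s}(z) &= \mathbf{D}(z)\,[\mathbf{A}\,\mathbf{s}(z) + \mathbf{b}\,x(z)] \\
\mathbf{D}(z)^{-1}\,\mathbf{s}(z) - \mathbf{A}\,\mathbf{s}(z) &= \mathbf{b}\,x(z),
  \qquad \mathbf{D}(z)^{-1} = \mathbf{D}(z^{-1}) \\
\mathbf{s}(z) &= [\mathbf{D}(z^{-1}) - \mathbf{A}]^{-1}\,\mathbf{b}\,x(z) \\
H(z) &= \frac{y(z)}{x(z)} = \mathbf{c}^T [\mathbf{D}(z^{-1}) - \mathbf{A}]^{-1}\,\mathbf{b} + d
\end{aligned}
```

The poles are the values of z for which the bracketed matrix becomes singular, which is exactly the characteristic equation (Eq. 3.7).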

It is not trivial to solve this equation, but it has been shown that the stability of this system can
be ensured if the feedback matrix A is unitary, i.e. A*A = I, where A* is the Hermitian
transpose of A. Moreover, this choice leads to a lossless FDN prototype, because all poles of a
unitary feedback loop are located on the unit circle. Consequently, the system has only
non-decaying eigenmodes [JC91 and Jot97].

The matrix A should have no null coefficients to provide a faster increase of the echo density

and ideally all coefficients should have the same magnitude in order to provide a minimum

crest-factor (ratio of largest coefficient over RMS average of all coefficients). The latter will

speed up the convergence to a Gaussian amplitude distribution [Jot97].

As suggested by Jot [Jot97] several classes of unitary matrices can be used for this purpose:

Householder matrices of the type $A = (2/N)\, e e^T - I$, where $e = [1 \ldots 1]^T$ and I is the
identity matrix. In this way the complexity of an implementation is reduced to 2N
numerical operations for an N×N matrix. However, Householder matrices have a high
crest factor for large N.

Hadamard matrices can be implemented with butterfly networks and require $N \log_2 N$
additions and N multiplications.

Circulant matrices can also be implemented very efficiently by using two FFTs and N
complex products.

Generally, with the matrices described above about 8 to 16 delay units should be sufficient to

provide a natural reverberation with an adequate time and frequency density [Jot97].
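The unitarity and crest-factor statements above can be checked numerically. The following sketch is illustrative only: it builds the Householder matrix from the formula above and a normalised Hadamard matrix using the Sylvester construction (an assumption, since the construction is not specified in the text), then evaluates AᵀA = I and the crest factor as defined above. All function names are hypothetical.

```python
import math


def householder(n):
    """A = (2/N) e e^T - I as a list of rows."""
    return [[2.0 / n - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]


def hadamard(n):
    """Normalised Hadamard matrix (Sylvester construction); n must be a power of two."""
    h = [[1.0]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[v * scale for v in row] for row in h]


def is_unitary(a, tol=1e-12):
    """Check A^T A = I for a real square matrix."""
    n = len(a)
    return all(
        abs(sum(a[k][i] * a[k][j] for k in range(n)) - (1.0 if i == j else 0.0)) < tol
        for i in range(n)
        for j in range(n)
    )


def crest_factor(a):
    """Largest coefficient magnitude divided by the RMS of all coefficients."""
    flat = [abs(v) for row in a for v in row]
    rms = math.sqrt(sum(v * v for v in flat) / len(flat))
    return max(flat) / rms
```

For N = 16 this gives a crest factor of 3.5 for the Householder matrix and 1 for the Hadamard matrix, which illustrates why the Householder class becomes less attractive for large N.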

3.2 Obtaining frequency dependent reverberation times

The choice of a unitary feedback matrix leads to a lossless prototype FDN, which creates an

infinite, non-decaying impulse response. However, in real rooms low frequencies will

naturally decay slower than high frequencies due to the absorption properties of the walls. In

order to obtain a frequency dependent reverberation time each delay line can be connected in

series with a corresponding absorbent filter h(z).

Jot [JC91] has investigated this method on the basis of Schroeder’s parallel comb filter, which

is equivalent to the well-known case of a diagonal feedback matrix.

To avoid any unpleasant resonating frequencies it is important that every comb filter decays at

the same relative rate. More precisely, all system poles corresponding to neighbouring



eigenmodes must have the same magnitude. This condition is called the continuity of the pole

locus [JC91].

Generally, when the feedback gain of a comb filter is replaced by an absorbent filter then both

the decay time as well as the magnitude of the frequency response is modified. As long as the

continuity of the pole locus is fulfilled this magnitude can be kept independent of the decay

characteristics by connecting a tone correction filter t(z) in series with the comb filter [JC91].

3.2.1 Implementation using 1st order filters

For each delay line there will be one absorbent filter with the same structure but different

coefficients. The absorbent filters can be implemented as first order IIR filters with the

following transfer function hp(z):

$$h_p(z) = k_p\, \delta k_p(z) \quad \text{where} \quad \delta k_p(z) = \frac{1 - b_p}{1 - b_p z^{-1}}$$ (Eq. 3.8)

If $0 \le b_p < 1$, this provides a low-pass filter, and the gains $k_p$ can be computed from the
desired reverberation time $T_r$ at zero frequency:

$$K_p = 20 \log_{10}(k_p) = \frac{-60\, \tau_p}{T_r(0)}$$ (Eq. 3.9)

The coefficients $b_p$ can be determined as follows:

$$b_p = \frac{K_p \ln(10)}{80} \left[ 1 - \frac{1}{\alpha^2} \right] \quad \text{where} \quad \alpha = \frac{T_r(\pi)}{T_r(0)}$$ (Eq. 3.10)

The tone corrector t(z) is responsible for compensating the frequency response as the

absorbent filters will modify the system poles as described above.

$$t(z) = \frac{1 - b\, z^{-1}}{1 - b} \quad \text{with} \quad b \approx \frac{1 - \alpha}{1 + \alpha}$$ (Eq. 3.11)

These formulas are valid for small values of $b_p$ and not too long delay times. They can be used
to achieve a desired reverberation time at zero frequency and at the Nyquist frequency [JC91].
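The design formulas of this section can be evaluated for a single delay line as follows. This is a hedged sketch rather than the thesis code. The function names are hypothetical, the example values (587 samples, 44100 Hz, 3.5 s and 1 s) are chosen for illustration, and the denominator 80 in the expression for b_p is an assumption of this sketch that is marked in the code.

```python
import math


def absorbent_coefficients(delay_samples, fs, tr_dc, tr_nyquist):
    """Gain k_p and pole b_p of h_p(z) = k_p (1 - b_p) / (1 - b_p z^-1)."""
    tau = delay_samples / fs                    # delay length in seconds
    big_k = -60.0 * tau / tr_dc                 # Eq. 3.9: gain in dB at zero frequency
    k = 10.0 ** (big_k / 20.0)
    alpha = tr_nyquist / tr_dc
    # Eq. 3.10; the constant 80 is an assumption of this sketch.
    b = big_k * math.log(10.0) / 80.0 * (1.0 - 1.0 / alpha ** 2)
    return k, b


def tone_corrector_coefficient(tr_dc, tr_nyquist):
    """Coefficient b of the tone corrector t(z) (Eq. 3.11)."""
    alpha = tr_nyquist / tr_dc
    return (1.0 - alpha) / (1.0 + alpha)


k, b = absorbent_coefficients(587, 44100, 3.5, 1.0)
```

At zero frequency the filter gain equals k_p, so a signal that circulates through the delay line for T_r(0) seconds is attenuated by 60 dB, as required by Eq. 3.9.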


More generally, the tone corrector can be implemented as a filter with a magnitude response
that is equal to:

$$|t(z)| = \sqrt{\frac{1}{T_r(\omega)}}$$ (Eq. 3.12)

where $T_r(\omega)$ is the frequency dependent reverberation time in seconds [JC91].

3.2.2 Control of decay characteristics in FDNs

Following the indications of Jot [JC91], every delay line within the FDN can be cascaded with
a gain $k_p$:

$$k_p = a^{m_p}$$ (Eq. 3.13)

This condition ensures that the reverberation time can be modified without violating the

principle of equal magnitudes of the poles. Since all poles are contracted by the same factor a,

this is equivalent to replacing D(z) with D(z/a).
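The contraction factor a can be related to a desired, frequency independent reverberation time. The following worked step is a sketch that combines the gain rule above with the 60 dB definition of Eq. 3.9; the numerical values (T_r = 3.5 s, F_s = 44100 Hz) are chosen here for illustration only.

```latex
\begin{aligned}
20 \log_{10}\!\left(a^{m_p}\right) &= \frac{-60\, m_p T_s}{T_r}
\;\;\Longrightarrow\;\;
a = 10^{-3\, T_s / T_r} = 10^{-3 / (F_s T_r)} \\
T_r = 3.5\ \text{s},\; F_s = 44100\ \text{Hz}: \quad
a &= 10^{-3/154350} \approx 0.999955
\end{aligned}
```

Because the exponent of a is the delay length, longer delay lines receive proportionally more attenuation per pass, so all lines decay at the same rate per unit of time.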

Additionally, the attenuation coefficients can be replaced by low-pass filters using the method
described in 3.2.1. By inserting the absorbent filters at the output of each delay unit in the
general network shown in Fig. 3.1.1, it is possible to obtain a frequency dependent
reverberation time while still satisfying condition 3.13. Finally, the tone corrector t(z) can be
added to the output of the reverberator.

Fig. 3.2.1: General structure of a FDN of order N=3, including absorbent filters hN(z) and

the tone corrector t(z).



3.3 Parameterization of FDNs

As discussed in chapter 3.2 FDNs can provide natural reverberation at low computational

costs. The overall quality of the resulting impulse response depends on the number of delay

lines, their lengths and the choice of coefficients of the unitary feedback matrix. The

frequency dependent reverberation time can be controlled with absorbent filters connected in

series with the output of the delay lines, as shown in Fig. 3.2.1.

The intention of this chapter is to find out about the capabilities of an order N=8 FDN in

terms of its maximum echo density, subjective sound quality and overall variability. Finally it

should be investigated if this structure can be used as the basic tool in order to simulate the

late reverberation of any given impulse responses of real enclosures.

For the eight delay lines, at a sampling rate of 44100 Hz the following lengths in samples
have been chosen:

$m_1 = 587$, $m_2 = 661$, $m_3 = 743$, $m_4 = 827$, $m_5 = 883$, $m_6 = 967$, $m_7 = 1049$, $m_8 = 1151$

The summation of these values leads to a total delay length of 6868 samples, which
corresponds to about 156 ms at a sampling frequency of 44100 Hz. It is important to choose
prime numbers for the delay lengths in order to avoid echo cancellation or overlapping.
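The properties of the chosen delay set can be verified with a few lines of code. This is only a small check, with a hypothetical helper `is_prime` based on trial division; it confirms that all eight lengths are prime and recomputes the total delay.

```python
def is_prime(n):
    """Trial-division primality test, sufficient for small delay lengths."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


FS = 44100  # sampling rate in Hz
delay_lengths = [587, 661, 743, 827, 883, 967, 1049, 1151]

total_samples = sum(delay_lengths)
total_ms = 1000.0 * total_samples / FS
all_prime = all(is_prime(m) for m in delay_lengths)
```

Distinct primes are automatically mutually prime, which is the property that matters for avoiding coinciding echoes.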

The choice of the feedback matrix affects the echo density and is also responsible for the

computational efficiency. In general, null-coefficients should be avoided and, as discussed in

the previous chapters, the matrix has to be unitary for a lossless prototype reverberator. In this

case the feedback matrix was taken from the class of Householder matrices, as proposed by

Frenette [Fre00]:

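$$\mathbf{A} = \mathbf{J} - \frac{2}{N}\, \mathbf{e}\, \mathbf{e}^T$$ (Eq. 3.14)

where J is an N×N circular permutation matrix and e is an N×1 column vector of ones.
The circular N×N permutation matrix J was implemented as follows:

$$\mathbf{J} = \begin{bmatrix} 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \ddots & \vdots & \vdots \\ \vdots & \ddots & \ddots & 0 & 0 \\ 0 & \cdots & 0 & 1 & 0 \end{bmatrix}$$ (Eq. 3.15)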

In order to simplify the subjective assessment of the reverberation effect, a longer decay time of
3.5 seconds was chosen for zero frequency, whereas the decay time at the Nyquist
frequency (in this case 22050 Hz) was set to 1 second to mimic the high frequency absorption
of the walls.

Finally, the coefficients of the input vector b have all been set to one to provide maximum echo
density, and the stereo output matrix c has been set to:

$$\mathbf{c} = \begin{bmatrix} 1 & 1 \\ -1 & 1 \\ 1 & -1 \\ -1 & -1 \\ \vdots & \vdots \end{bmatrix}$$ (Eq. 3.16)

The first column corresponds to the left output channel. Jot noticed that with the above
feedback matrix periodic clicks can occur in the impulse response. To avoid such clicks,
the signs of every other coefficient can be inverted. The second column was chosen to be as
different as possible from the first column so that the two outputs are perceived as being
uncorrelated [Fre00].

For figures 3.3.1-3.3.2 the two outputs of the reverberator were summed to a single mono
output. As expected, the desired frequency dependent reverberation time could be obtained
very accurately with a smooth decay curve across the whole frequency range, as shown in
figure 3.3.1. In addition, the echo density profile was computed, and it can be observed that
a Gaussian distribution was reached after about 200 ms. The initial slope of the profile seems to
be somewhat shallower compared to the profile illustrated in Fig. 2.5.2.



Fig. 3.3.1: Energy Decay Relief of the order N=8 Feedback Delay Network

Fig. 3.3.2: Echo Density Profile for the order N=8 Feedback Delay Network

The subjective quality of the reverberator was tested with a mono, 16-bit wav signal of a snare

drum hit. Although the overall decay is very smooth, a very subtle periodic ringing could be

observed. This may be an indicator for an insufficient frequency density or inappropriate

delay times.



4 The Final Algorithm

This chapter will provide information about the development of an algorithm which can

approximate a given room impulse response by automatic parameterization of the used

reverberation system. In general, the reverberation process can be divided into two separate

sections, one for synthesizing the early reflections and one for the simulation of the

reverberation tail.

Fig. 4.6.1: Complete structure of the algorithm. Blocks within the red dashed line are

performed offline.

The individual blocks of the diagram above are discussed in the following sections.
First of all, in chapter 4.1, we will discuss how we can use the Energy Decay Relief and the
Echo Density Profile to get the necessary information about reverberation time, spectral
properties and temporal progress of the impulse response we would like to model. We will
further establish a criterion according to which we can clearly separate the early part from the
more diffuse late part.

Section 4.2 will describe the changes which have been applied to the Feedback Delay

Network to improve the subjective quality of the reverb tail for various types of input signals.

In the following parts (4.3 and 4.4) we will look at how to achieve a frequency dependent
reverberation time, corresponding to the data obtained from the methods used in 4.1.
Furthermore, it will be discussed how a spectral correction filter can be used to adjust the
frequency response of the FDN output, so that it is closer to the response of the modelled
room.

Finally, chapter 4.5 describes an approach to simulate the early reflections by using a simple
tapped delay line.

4.1 Impulse Response Analysis

The sonic quality of reverberation is for the most part influenced by the length and

distribution of the early reflections, the frequency dependent decay time and consequently its

overall frequency response [CNW14]. Therefore it is necessary to investigate the room

impulse response regarding these parameters in order to use the knowledge gained from this

process to be able to adjust the controls of the algorithm properly.

4.1.1 Segmentation of Early Reflections and Late Reverberation

The first step in the structure of the algorithm is to load the impulse response specified by the

user. The audio file should be of good quality in terms of noise floor, sample rate and

resolution.

Next, the algorithm needs to decide at which time the sparse
early reflections change over to the very diffuse late part of the reverberation, since both of
those segments are modelled by separate sections of the reverberator.

In order to do this, the Echo Density Profile described in section 2.5 is calculated. A
good condition for the choice of the transition point would be the time where the Echo
Density Profile first reaches a value of one, since the late reverberation tends towards a
normal distribution. However, tests have shown that this condition is quite prone to errors, so
it was extended in the way suggested by Rebecca Stewart and Damian Murphy [MS07].
Now the transition point is defined at the maximum of the Echo Density Profile within a time
window of 60 ms, right after a Gaussian distribution was reached for the first time. Additionally,
the transition point has to lie between 60 ms and 150 ms.
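The selection rule for the transition point can be sketched as follows. This is an illustrative reading of the rule, not the thesis code: the profile is assumed to be given as a list of values with matching time stamps in milliseconds, the 60 ms to 150 ms requirement is enforced by clamping, and the upper limit is returned if the profile never reaches one. The last two choices and the function name are assumptions.

```python
def transition_time(times_ms, profile, window_ms=60.0, lo_ms=60.0, hi_ms=150.0):
    """Pick the early/late transition time from an echo density profile."""
    # First instant at which the profile reaches the Gaussian value of one.
    first = next((i for i, v in enumerate(profile) if v >= 1.0), None)
    if first is None:
        return hi_ms  # assumption: fall back to the upper limit
    t_first = times_ms[first]
    # Maximum of the profile inside the window that starts at that instant.
    candidates = [i for i, t in enumerate(times_ms) if t_first <= t <= t_first + window_ms]
    best = max(candidates, key=lambda i: profile[i])
    # Assumption: the allowed range is enforced by clamping the result.
    return min(max(times_ms[best], lo_ms), hi_ms)
```

Searching for the maximum inside a window makes the result less sensitive to a single early sample that happens to touch the value of one.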

Figure 4.1.1: Echo Density Profile and selected truncation point for a large storage room

(left) and a concert hall (right).

4.1.2 Measuring Reverberation Time and Total Energy

As described in 2.1, the frequency dependent reverberation time can easily be measured by
using the normalized Energy Decay Relief. For each frequency band the reverberation time is
the time at which the EDR has dropped to -60 dB. Moreover, the EDR provides information
about the total energy of the impulse response at any desired time t. As the timbre of the
impulse response is strongly connected to the initial energy at t=0, this value can be used to
compare the spectral properties of the original and the synthesized impulse response
[JCW97] (see chapter 4.3.2).
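Reading the reverberation time from one band of the normalized EDR amounts to finding the first crossing of the -60 dB level. The sketch below assumes that the band is available as a list of dB values that starts at 0 dB, with matching time stamps in seconds; the linear interpolation between frames and the function name are assumptions made for illustration.

```python
def reverberation_time(times_s, edr_db, threshold_db=-60.0):
    """First time at which a normalised decay curve reaches the threshold.

    Uses linear interpolation between the two neighbouring frames and
    returns None if the curve never drops to the threshold.
    """
    for i in range(1, len(edr_db)):
        if edr_db[i] <= threshold_db:
            t0, t1 = times_s[i - 1], times_s[i]
            d0, d1 = edr_db[i - 1], edr_db[i]
            if d1 == d0:
                return t1
            return t0 + (threshold_db - d0) * (t1 - t0) / (d1 - d0)
    return None
```

Applying the function to every frequency band of the EDR yields the frequency dependent reverberation time curve. If the measurement noise floor prevents a band from reaching -60 dB, the function returns None and the value has to be estimated differently.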



Figure 4.1.2: Reverberation time (top) and the total energy (bottom) for a concert hall. Both

values are calculated using the Energy Decay Relief.

4.2 Improving the Quality of Late Reverberation

A Feedback Delay Network of order N=8 was described earlier in chapter 3. While this

network provided good control over reverberation time for very low and very high

frequencies, the decay suffered from a subtle but still unnatural sounding periodic ringing.

The following sections describe the changes that have been made to the FDN to further
improve the subjective quality of the reverberation tail.

4.2.1 Delay Lines and Delay Lengths

The first step of improvement was to change the order of the FDN from N=8 to N=12, which
means that four additional delay lines are added to the network. This leads to an increased
total delay length and, in consequence, to a higher modal density. Furthermore, the echo
density builds up faster, depending on the length of the individual delays, provided that they
are chosen mutually prime.

Though in theory any set of delay lengths would have fulfilled the objective criteria of this
thesis, their choice is extremely important for the naturalness and smoothness of the
reverberation tail. Tuning the delay lengths of the FDN was consequently one of the most
time consuming parts of the whole development of the proposed algorithm. There does not
seem to exist any systematic approach or rule for improving them,
except that they should be chosen mutually prime, thus avoiding echo cancellation and
superposition. Changing only one of the delay lengths from a decent sounding set to a
neighbouring prime number turned out to be an impractical optimization method. Finally, a
Matlab program was written in order to create random sets of prime numbers within
different intervals. In this way thousands of different delay length combinations were tested
by simply listening to the different impulse responses of the FDN. Each of them corresponded
to an individual combination, and while the evaluation still took a lot of time, the various
impulse responses were at least created automatically.
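The generation of random prime sets can be sketched in a few lines. The original program was written in Matlab and is not reproduced here; the following Python version is only an illustration of the idea, with hypothetical function names, an interval of 3000 to 5000 samples and a fixed random seed chosen for the example.

```python
import random


def primes_in_range(lo, hi):
    """All primes p with lo <= p <= hi (sieve of Eratosthenes)."""
    sieve = [True] * (hi + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(hi ** 0.5) + 1):
        if sieve[i]:
            for multiple in range(i * i, hi + 1, i):
                sieve[multiple] = False
    return [p for p in range(max(lo, 2), hi + 1) if sieve[p]]


def random_delay_set(n_lines, lo, hi, rng=random):
    """Draw n_lines distinct primes from [lo, hi], sorted in ascending order."""
    return sorted(rng.sample(primes_in_range(lo, hi), n_lines))


# Example: one candidate set of 16 delay lengths (seed chosen arbitrarily).
candidate = random_delay_set(16, 3000, 5000, random.Random(1))
```

Each candidate set would then be loaded into the FDN, and the resulting impulse response rendered for listening.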

Generally, sets containing prime numbers on an interval with a ratio of about 1:1.6 turned out
to be the most effective. Combinations of primes on a rather low interval from around 600 to
960 (at a sampling frequency of 44100 Hz) produced a fast echo density build-up, but tended
to sound more metallic. Additionally, they often revealed a periodic ringing, which sometimes
can even be perceived as a very disturbing sound of distinct pitch.

The most convincing results could be achieved by raising the interval from which the prime
numbers are taken to the range from 3000 to around 5000. This corresponds to individual delay
lengths of about 68 ms to 113 ms at a sampling frequency of 44100 Hz. Periodicities within the

reverberation tail could still be heard, but they are generally slower, occur in lower registers

and, as a result, are less obvious. In some cases they were very hard to perceive, even with

good headphones and high-end D/A converters. However, the drawback of the increased

delay length is again a slower slope of the echo density profile. Normal distribution was only

reached after 300-400ms, which does not match the properties of a real room’s impulse

response, where it is usually reached within the first 150ms. In order to solve this problem,

two further improvements have been applied to the FDN: Firstly, the number of delay lines

has again been increased to 16. We will later see that this provides a good compromise

between quality and computational cost. Moreover, a diffusion section has been added to the

input of the network, consisting of four short allpass filters in series, as suggested by Dattorro.

The purpose of these filters is to de-correlate the signal quickly and to reduce peakedness by

randomizing the signal phase [Dat97].



Figure 4.2.1: Echo Density Profile of the FDN without input diffusion (left) and with input

diffusion (right). The series allpass filters at the input of the network rapidly increase the echo

density, while it takes much longer to reach normal distribution when no diffusion is applied

to the input signal.

4.2.2 Modulation

Another method to improve the quality of late reverberation is to add modulation to delay

lengths, the feedback matrix or the output vector coefficients of the FDN [Fre00]. It can be

used to continuously modify one or several of these parameters over time in order to avoid

repetitive patterns in the reverberation tail. When employed sparingly, it will add a slight

amount of motion and blurriness to the decay without introducing any unnatural pitch-

shifting, as it intentionally happens with time-varying effects like chorus or flanger. Although

there does not really exist a physical equivalent to modulation in real rooms, many well-

known commercial reverberators use it in their algorithms. As a result of constantly changing

the delay lengths, the resonant frequencies of the system will also be changed, which helps to

achieve a flatter frequency response [Fre00]. In theory, within an enclosure this effect could

only be produced by moving the walls of a room back and forth, or – to a certain degree – if

the air temperature and consequently the speed of sound are altered continuously.

Generally, the output of a modulated delay line is given by [Zöl11]:

$$y(n) = x(n - D - m(n))$$ (Eq. 4.1)

where D corresponds to a fixed integer delay, and m(n) is the excursion of the modulation on
each side of this value.

$$m(n) = \mathit{width} \cdot \mathit{MOD}(n)$$ (Eq. 4.2)

MOD is a constantly changing modulation signal, typically a sinusoid or any continuous
signal with values between minus one and one. Width corresponds to the modulation depth in
samples. Thus, the maximum possible delay of the input signal is x(n-D-width) and the
smallest possible delay is x(n-D+width). More generally, non-integer delay values become
necessary, and they can be expressed as a whole number plus a fractional part (on the interval
0 < frac < 1). Interpolation is used to calculate the output sample y(n), which
lies in between two consecutive samples. In this way it is possible to avoid signal
discontinuities when the delay times are modulated continuously, as shown in figure 4.2.2
[Zöl11 and Dat97,2].
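The time-varying delay and its split into a whole number and a fractional part can be sketched as follows. The sinusoidal modulator, the parameter names and the example values are assumptions made for illustration; only the relations of Eq. 4.1 and Eq. 4.2 are taken from the text.

```python
import math


def modulated_delay(n, base_delay, width, rate_hz, fs, phase=0.0):
    """Total delay D + m(n) in samples, with m(n) = width * sin(...)."""
    return base_delay + width * math.sin(2.0 * math.pi * rate_hz * n / fs + phase)


def split_delay(delay):
    """Split a non-negative delay into an integer part and a fraction in [0, 1)."""
    whole = math.floor(delay)
    return whole, delay - whole
```

The integer part selects the read position in the delay buffer, while the fractional part is passed on to the interpolator.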

Figure 4.2.2: A fractional delay line with interpolation.

There exist several interpolation algorithms which are suitable for audio signals and delay line
interpolation. The most commonly used and straightforward method is linear
interpolation [Smi10]. This technique works best for lowpass signals or in combination with
oversampling. However, it introduces a high frequency loss, which can be a problem when
used for the delay lines inside a reverberator. This means that the linear interpolator acts like a
high frequency damping filter, thus affecting the frequency dependent reverberation time.
Fortunately, this does not occur with first-order allpass interpolation, which has a flat
frequency response and, like linear interpolation, requires only one multiply and two adds
per sample [Smi10].

Figure 4.2.3: First-order Allpass Interpolation.

The difference equation for first order allpass interpolation is given by [Smi10]:

$$x(n - \mathit{frac}) = y(n) = \eta\, [x(n) - y(n-1)] + x(n-1)$$ (Eq. 4.3)

where $\eta \approx \dfrac{1 - \mathit{frac}}{1 + \mathit{frac}}$

Consequently, this type of interpolation was used for all of the 16 delay lines. Finally, a
modulation depth of around 12 samples and a modulation rate of 1.5 to 2 Hz turned out to be
a good compromise. In this way the quality of the reverberation tail could be improved by
successfully suppressing undesired resonances, while no pitch-shifting artefacts were
noticeable. A simple sinusoid was used as the modulation signal, with different phase shifts
applied in order to change the individual delay lengths in more arbitrary directions, as
suggested by Frenette [Fre00].
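The first-order allpass interpolator described above can be written as a short recursion. The following is a minimal sketch for a fixed fractional delay, with a hypothetical function name; a modulated delay line would update the coefficient from sample to sample, which is not shown here.

```python
def allpass_interpolate(x, frac):
    """First-order allpass fractional delay following the difference equation above."""
    eta = (1.0 - frac) / (1.0 + frac)
    y = []
    x_prev = 0.0  # x(n - 1)
    y_prev = 0.0  # y(n - 1)
    for sample in x:
        out = eta * (sample - y_prev) + x_prev
        y.append(out)
        x_prev, y_prev = sample, out
    return y
```

For frac = 1 the coefficient becomes zero and the filter reduces to a one-sample delay, while for frac = 0 it passes the input unchanged. For values in between, the energy of the impulse response still sums to one, which reflects the flat magnitude response.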

4.2.3 Output Vectors and Feedback Matrix

It is worth mentioning that the choice of the output vector c does not only affect the spatial
impression (for detailed information, please refer to chapter 3.3), but also influences the
timbre and resonances of the whole system. A combination of delay lengths which sounds
good with all coefficients of c set to one is not necessarily the best choice when c is set
differently. The same applies to the choice of the feedback matrix A. For this reason the
output vector c and the feedback matrix A of the proposed algorithm correspond to extended
versions of the respective matrices described in 3.3.

Again, the feedback matrix which was used for the FDN is given by:

$$\mathbf{A} = \mathbf{I} - \frac{2}{N}\, \mathbf{e}\, \mathbf{e}^T$$ (Eq. 4.4)

One advantage of this class of Householder feedback matrices is that the N×N matrix-times-vector
operation can be implemented by adding up the values of the input vector, multiplying
the sum by 2/N and subtracting the result from each element of the input vector [Smi10]. In the
case where I is replaced by the N×N circular permutation matrix J (see Equation 3.15), the
values of the resulting vector have to be circularly shifted by one.
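The efficient matrix-times-vector operation can be sketched as follows. The function name and the `circular` flag are hypothetical; with the flag set, the identity matrix is replaced by the circular permutation matrix of Eq. 3.15, which moves every element down by one position and wraps the last element to the top.

```python
def householder_times_vector(v, circular=False):
    """Compute (P - (2/N) e e^T) v in O(N), with P = I or P = J (Eq. 3.15)."""
    n = len(v)
    offset = 2.0 * sum(v) / n  # (2/N) times the sum of all elements
    # J v = [v_N, v_1, ..., v_(N-1)]; the identity leaves v unchanged.
    permuted = v[-1:] + v[:-1] if circular else list(v)
    return [p - offset for p in permuted]
```

Since the same offset is subtracted from every element, it makes no difference whether the shift is applied before or after the subtraction. In both variants the energy of the vector is preserved, as expected for a unitary matrix.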

4.3 Controlling the Reverberation Time

The reverberation time of the FDN can be controlled by inserting absorbent filters with

transfer function hi(z) into all of the N feedback channels. Each absorbent filter introduces a

frequency dependent absorption and is chosen such that the logarithm of its magnitude

response is proportional to the delay length mi, and thus inversely proportional to the

reverberation time Tr(𝜔), derived from the EDR as described in 4.1.2. [Jot92 and CNW14]

By neglecting the absorptive filter's phase delay, the desired magnitude response of each filter
can be calculated as:

$$|h_i(e^{j\omega})| = 10^{\frac{-60\, m_i\, T}{20\, T_r(\omega)}}$$ (Eq. 4.5)

where $0 \le \omega = 2\pi f T \le \pi$, f is the frequency in Hz, T is the sampling period in seconds
and $T_r(\omega)$ is the desired frequency-dependent reverberation time in seconds [Jot92].
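Equation 4.5 can be evaluated directly for each delay line and each frequency bin of the measured reverberation time. The following sketch uses hypothetical function names and expects the reverberation time as a list with one value per frequency bin; the result is the target magnitude that is subsequently handed to the filter design routine.

```python
def absorbent_magnitude(delay_samples, fs, tr_seconds):
    """Desired linear gain of one absorbent filter at one frequency (Eq. 4.5)."""
    return 10.0 ** (-60.0 * delay_samples / (20.0 * fs * tr_seconds))


def absorbent_response(delay_samples, fs, tr_per_bin):
    """Desired magnitude for every entry of a reverberation-time curve."""
    return [absorbent_magnitude(delay_samples, fs, tr) for tr in tr_per_bin]
```

A signal that passes through the delay line and its filter for T_r seconds is attenuated by exactly 60 dB, which is the defining property of the formula. Longer delay lines and shorter reverberation times both lead to smaller gains.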

Figure 4.3.1: Desired magnitude response (concert hall) according to equation 4.5 for

absorbent filters related to different delay lengths.


The absorptive filters are implemented as direct-form-I biquad filters and their coefficients
are calculated with the Matlab function yulewalk. This function performs a least-squares fit
and tries to find the numerator and denominator coefficients of an IIR filter in such a
way that the filter's magnitude response matches the specified desired magnitude response
(equation 4.5). On the one hand, yulewalk provides very accurate approximations for high
frequencies (>10 kHz), but on the other hand it produces quite considerable errors for the
lower octaves. Unfortunately, it does not provide any possibility to apply a higher weight to
the lower octaves, so that those frequencies decay too quickly, as shown in figure
4.3.2.

Figure 4.3.2.: The red line shows that the magnitude approximation is slightly too low for

frequencies up to 4 kHz and slightly too high for frequencies above around 10 kHz (top).

Even very small errors for low frequencies can result in drastic errors of around one second

for the reverberation time (bottom).



The magnitude error could be optimized by manipulating the measured reverberation time for

the calculation of the absorbent filters. This was done by adding a certain percentage of the

reverberation time for each frequency, depending on the average decay time of the IR. In this

way, it was possible to apply an ‘artificial’ frequency dependent weight to the Matlab function

yulewalk.

Figure 4.3.3: By adding an offset to the reverberation time, the magnitude error for lower

frequencies could be improved. Due to the usage of second order filters, a bigger error for the

very high frequencies was introduced (top). Interestingly, this is not reflected in the resulting

reverberation time of the synthesized impulse response, which is very close to the original for

frequencies above 500 Hz.

As shown in Fig. 4.3.3, very good results could be achieved for mid and high frequencies.
Due to the limited filter order of N=2, the filter is not capable of approximating all of the
ripples in the bass frequencies successfully. One way to further improve this approach would
be to increase the filter order, which goes along with a significant increase in the computation
time of the FDN. However, subjective listening tests have shown that second
order filters are a good tradeoff between modeling accuracy and computational efficiency, as
differences within the lowest octaves are generally harder to perceive.

4.4 Spectral Correction Filter

The sound of a room and its related EDR is not only characterized by its frequency dependent

reverberation time Tr(f), but also by its initial power spectrum P(f). In addition to the FDN’s

tone-correction filter described in chapter 3.2.2, a spectral correction filter is introduced to

match the initial spectrum of the FDN with that of the original impulse response. The filter is

applied to the output of the FDN and - in reference to [CNW14] - its magnitude is given as:

$$S_c(f) = \sqrt{\frac{EDR_{IR}(t = \mathit{trunc}, f)}{EDR_{FDN}(t = \mathit{trunc}, f)}}$$ (Eq. 4.6)

The EDRs are both evaluated at the transition time t=trunc, when the phase of late

reverberation and thus normal distribution of the echo density has been reached (see chapter

4.1.1). The coefficients of the order N=12 linear-phase spectral correction filter are again

calculated with the Matlab function yulewalk.
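The target magnitude of the spectral correction filter can be computed per frequency bin before the filter design step. The sketch below assumes that both EDR slices are given as linear energy values (not in dB) on the same frequency grid; the small floor that protects against division by zero and the function name are assumptions.

```python
import math


def spectral_correction(edr_ir, edr_fdn, floor=1e-12):
    """Per-bin magnitude S_c(f) = sqrt(EDR_IR / EDR_FDN) at the transition time."""
    return [
        math.sqrt(max(ir, 0.0) / max(fdn, floor))
        for ir, fdn in zip(edr_ir, edr_fdn)
    ]
```

The square root converts the energy ratio into an amplitude ratio, so that filtering the FDN output with this magnitude brings its initial energy in line with that of the reference.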

Figure 4.4.1: The blue line shows the magnitude response of the spectral correction filter

applied to the output of the FDN.



4.5 Synthesizing the Early Reflections

The early reflections typically arrive within less than 150 ms and can be separated from the late
part of the reverberation by truncating the impulse response at the transition time computed in
chapter 4.1.1. It is very important to model these first reflections very accurately, since they
preserve the naturalness and spatial impression of the room response [CNW14]. For this
reason, each channel is analyzed separately.

When it comes to the perceptual approximation of room impulse responses, in many existing
approaches the first part of the original impulse response is convolved with the input signal
[CNW14]. By using this method the correlation between the separate channels is preserved,
thus giving a very natural impression. However, this algorithm concentrates on an alternative
approach, where only the most prominent early reflections are synthesized. The goal is to
further reduce the computational cost and still produce a result that sounds very similar to the
original ERs, if not indistinguishable from them in the best case.

The synthesis of the early reflections is realized in the form of a tapped delay line, with delay
times and corresponding gains extracted from the first part of the original impulse response.
In order to reduce the number of reflections that need to be generated and thus to reduce the
computation time, only the most prominent reflections are synthesized. This is done by
neglecting all reflections with an amplitude below 0.08 (for a normalized impulse response).
Furthermore, a 0.5 ms sliding window is applied to the truncated impulse response.
If a window still contains more than a single reflection, only the strongest echo is
retained. Consequently, the number of early reflections is reduced from a maximum of about
6000 to a maximum of about 270 echoes which have to be generated by the tapped delay line.
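The reduction of the early reflections and their playback through a tapped delay line can be sketched as follows. This is an illustrative simplification and not the thesis code: the sliding window is approximated by consecutive, non-overlapping windows, and the function names are hypothetical. The threshold of 0.08 and the window length of 0.5 ms are taken from the text.

```python
def select_early_reflections(ir, fs, threshold=0.08, window_ms=0.5):
    """Return (delay_in_samples, gain) taps for the most prominent reflections."""
    peak = max(abs(v) for v in ir)
    if peak == 0.0:
        return []
    normalised = [v / peak for v in ir]
    window = max(1, int(round(window_ms * 1e-3 * fs)))
    taps = []
    # Assumption: consecutive windows stand in for the sliding window.
    for start in range(0, len(normalised), window):
        block = normalised[start:start + window]
        offset = max(range(len(block)), key=lambda i: abs(block[i]))
        if abs(block[offset]) >= threshold:
            taps.append((start + offset, block[offset]))  # keep the strongest echo
    return taps


def tapped_delay_line(x, taps):
    """Sum of delayed and scaled copies of x, one copy per tap."""
    y = [0.0] * (len(x) + max((d for d, _ in taps), default=0))
    for delay, gain in taps:
        for n, sample in enumerate(x):
            y[n + delay] += gain * sample
    return y
```

Feeding a unit impulse into `tapped_delay_line` reproduces the retained reflections at their original positions, which allows a direct comparison with the truncated reference.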

Finally, a spectral correction filter is applied to the output of the tapped delay line. The

coefficients are calculated in the same way as described in chapter 4.4, except that the EDRs

are evaluated at time t=0.



Fig. 4.5.1: Comparison between early reflections of a concert hall impulse response (top) and

their synthesized version (bottom).

Subjective listening tests on headphones and speakers have shown that the approach described
above provides very accurate results for a large variety of impulse responses. In some cases
the synthesized version sounds slightly brighter, while still retaining the character and spatial
impression of its reference.

4.6 Summary of the Algorithm

As discussed above, the structure of the algorithm can be separated into three different parts.
Firstly, the analysis of the original impulse response is performed offline. The transition time
t=trunc between the early reflections and the reverb tail is determined by evaluating the
Echo Density Profile. The Energy Decay Relief is used to estimate the frequency
dependent reverberation time and the total energy for the time instants t=trunc and t=0.
These parameters are necessary to calculate the coefficients for the absorbent and spectral
correction filters. Additionally, the Early Reflections Analysis block provides the delay times
and gains for the most prominent early reflections.


The second part is responsible for creating the early part of the reverberation. This is
achieved by synthesizing the first reflections with a tapped delay line and applying a spectral
correction filter to its output. Finally, the RMS level is adjusted so that it is equal to the level
of the original early reflections.

The third part consists of an input diffusion section in order to increase the echo density.

Next, the Feedback Delay Network will provide the late reverberation. Again a spectral

correction filter is applied to the output of the FDN and thus, RMS matching is performed to

avoid a noticeable transition between the different parts of the reverberation.

Both the synthesis of the early reflections and the late reverberation can be implemented in real time. The final output of the algorithm is the sum of these two components, whose levels can be adjusted individually.
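The RMS matching and the final summation can be sketched in a few lines of Python. This is an illustrative sketch rather than the Matlab implementation; the helper names and the use of gains in dB (with –inf dB meaning mute) are assumptions made for the example.

```python
import math


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def match_rms(synth, reference):
    """Scale `synth` so that its RMS level equals that of `reference`."""
    gain = rms(reference) / rms(synth)
    return [gain * v for v in synth]


def db_to_linear(level_db):
    """Convert a level in dB to a linear gain; -inf dB maps to 0 (mute)."""
    return 0.0 if level_db == float("-inf") else 10.0 ** (level_db / 20.0)


def mix(early, late, er_gain_db=0.0, tail_gain_db=0.0):
    """Sum early reflections and late reverberation with individual gains."""
    g_er, g_tail = db_to_linear(er_gain_db), db_to_linear(tail_gain_db)
    length = max(len(early), len(late))
    early = early + [0.0] * (length - len(early))
    late = late + [0.0] * (length - len(late))
    return [g_er * e + g_tail * l for e, l in zip(early, late)]
```

In such a scheme `match_rms` would be applied once to the early part and once to the tail, each against the corresponding segment of the reference impulse response, before the two are mixed.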

5 Evaluation and Results

This chapter gives an overview of the results achieved when the proposed algorithm is used to simulate an arbitrary room impulse response. For this purpose, a variety of reference impulse responses were taken from commercial reverberators (both convolution and algorithmic). To investigate the flexibility of the proposed approach, a large number of very different types of room responses have been modelled. The following sections present the results for one example from each reverberation category: Concert Hall, Cavern, Large Studio, Small Studio and Algorithmic.

5.1.1 Simulation of a Concert Hall

The reference concert hall impulse response was taken from a commercial convolution reverb. According to the manual, it was captured in true stereo, has a reverberation time of 3.5 seconds and is well suited to a wide range of input signals, such as orchestral music, vocals or piano. Figures 5.1.1 to 5.1.3 compare the original and the synthesized version with regard to the EDR, impulse response, reverberation time and energy.
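As a reminder of how a reverberation time can be read from a decay curve, the following Python sketch applies Schroeder backward integration to a broadband impulse response and fits a line to the decay between -5 dB and -35 dB. This is a simplified, broadband stand-in for the frequency-dependent EDR analysis used in this work; the fit range and the function names are assumptions made for illustration.

```python
import math


def energy_decay_curve_db(ir):
    """Schroeder backward integration, normalized to 0 dB at t=0."""
    total = 0.0
    edc = [0.0] * len(ir)
    for n in range(len(ir) - 1, -1, -1):
        total += ir[n] * ir[n]
        edc[n] = total
    return [10.0 * math.log10(e / edc[0]) if e > 0.0 else float("-inf")
            for e in edc]


def reverberation_time(ir, fs, start_db=-5.0, stop_db=-35.0):
    """Estimate the 60 dB decay time from a line fit to the decay curve."""
    edc_db = energy_decay_curve_db(ir)
    points = [(n / fs, level) for n, level in enumerate(edc_db)
              if stop_db <= level <= start_db]
    mean_t = sum(t for t, _ in points) / len(points)
    mean_l = sum(level for _, level in points) / len(points)
    slope = (sum((t - mean_t) * (level - mean_l) for t, level in points)
             / sum((t - mean_t) ** 2 for t, _ in points))  # dB per second
    return -60.0 / slope
```

Applied per frequency band instead of to the broadband signal, the same procedure would yield a curve comparable to Tr(f).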



Fig. 5.1.1.a: Normalized Energy Decay Relief of the original impulse response (Concert Hall)

Fig. 5.1.1.b: Normalized Energy Decay Relief of the synthesized impulse response (Concert Hall)

Fig. 5.1.2.a: Original impulse response left and right channel (Concert Hall)

Fig. 5.1.2.b: Synthesized impulse response left and right channel (Concert Hall)

Fig. 5.1.3: Comparison of reverberation time Tr(f), total energy EDR(t=0,f) and initial energy EDR(t=0,f)/Tr(f) (Concert Hall)



As shown above, the algorithm models the concert hall accurately. The reverberation time, energy and impulse response look very similar, except for a slightly overestimated reverberation time between about 150 and 500 Hz. This is due to the limitation of the second order absorbent filters, which cannot recreate the fluctuation of the decay rates in this frequency range. The most obvious difference, however, is that the synthesized version generally sounds a lot smoother than its reference. This is also reflected in the Energy Decay Reliefs in Fig. 5.1.1: two very distinct resonances at around 250 Hz and 1 kHz can be detected in the decay of the original impulse response, whereas the decay of the algorithmic version is a lot straighter. While the synthesized reverberation retains the overall character of its reference, it does not contain such obvious resonances and generally sounds a bit softer. This can be explained by the fact that the Feedback Delay Network was intentionally designed to provide reverberation that is as neutral and flutter-free as possible.

5.1.2 Simulation of a Cavern

The reference cavern impulse response was taken from the same commercial convolution reverb as the concert hall. According to its attributes, it has a reverberation time of 6.2 seconds. The cavern sounds a lot brighter than the concert hall, and there is a lot of movement within the decay.


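Fig. 5.2.1.a: Normalized Energy Decay Relief of the original impulse response (Cavern)

Fig. 5.2.1.b: Normalized Energy Decay Relief of the synthesized impulse response (Cavern)

Fig. 5.2.2.a: Original impulse response left and right channel (Cavern)

Fig. 5.2.2.b: Synthesized impulse response left and right channel (Cavern)

Fig. 5.2.3: Comparison of reverberation time Tr(f), total energy EDR(t=0,f) and initial energy EDR(t=0,f)/Tr(f) (Cavern)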

As shown in Fig. 5.2.3, the algorithm also provides convincing results for longer impulse responses. Subtle resonances can be perceived in both versions. In the reference they seem slightly less distinct and are located at higher frequencies, which makes the synthesized impulse response sound somewhat darker.

5.1.3 Simulation of a Large Studio

The modelled large studio in question is an orchestral scoring stage originally located in the

United States. The impulse response was again taken from a commercial convolution-based

reverberator and has a reverberation time of 1.8 seconds.




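Fig. 5.3.1.a: Normalized Energy Decay Relief of the original impulse response (Large Studio)

Fig. 5.3.1.b: Normalized Energy Decay Relief of the synthesized impulse response (Large Studio)

Fig. 5.3.2.a: Original impulse response left and right channel (Large Studio)

Fig. 5.3.2.b: Synthesized impulse response left and right channel (Large Studio)

Fig. 5.3.3: Comparison of reverberation time Tr(f), total energy EDR(t=0,f) and initial energy EDR(t=0,f)/Tr(f) (Large Studio)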



Again, the algorithm simulates the large studio quite well. The original impulse response shows slightly more energy at very low frequencies; otherwise the reverberation sounds very similar. As with the concert hall described in 5.1.1, the scoring stage features a noticeable resonance which builds up over time. This resonance does not exist in the decay of the emulation, while the overall character of the room is preserved very accurately.

5.1.4 Simulation of a Small Studio

At about 0.7 seconds, the reference small studio has the shortest reverberation time of all test cases. Rooms of this kind are often chosen as recording rooms for audio productions that require dry source material, so that the desired amount of reverberation can be added during the mixing phase. Typically, this applies to drum, guitar or vocal recordings.


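Fig. 5.4.1.a: Normalized Energy Decay Relief of the original impulse response (Small Studio)

Fig. 5.4.1.b: Normalized Energy Decay Relief of the synthesized impulse response (Small Studio)

Fig. 5.4.2.a: Original impulse response left and right channel (Small Studio)

Fig. 5.4.2.b: Synthesized impulse response left and right channel (Small Studio)

Fig. 5.4.3: Comparison of reverberation time Tr(f), total energy EDR(t=0,f) and initial energy EDR(t=0,f)/Tr(f) (Small Studio)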

As shown above, the reverberation time is notably too low for frequencies up to 250 Hz. This case is another good example of the limitations of the second order absorbent filters. Surprisingly, the audible differences are not too drastic, because the most significant deviations occur at low frequencies. Owing to the overall short reverberation time, the remaining absolute error is still small enough to provide an adequate emulation.

5.1.5 Simulation of an Algorithmic Reverb

Finally, the proposed approach was tested for its capability to recreate the reverberation generated by algorithmic hall devices. For this purpose, a medium hall impulse response with a decay time of about 2.1 seconds was taken from a widely known hardware unit. This preset is characterized by a fairly slowly increasing echo density and a smooth reverberation tail.


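Fig. 5.5.1.a: Normalized Energy Decay Relief of the original impulse response (Algorithmic)

Fig. 5.5.1.b: Normalized Energy Decay Relief of the synthesized impulse response (Algorithmic)

Fig. 5.5.2.a: Original impulse response left and right channel (Algorithmic)

Fig. 5.5.2.b: Synthesized impulse response left and right channel (Algorithmic)

Fig. 5.5.3: Comparison of reverberation time Tr(f), total energy EDR(t=0,f) and initial energy EDR(t=0,f)/Tr(f) (Algorithmic)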

On close listening to both impulse responses, it is noticeable that the echo density increases faster in the emulated version. However, this can easily be fixed by reducing the allpass coefficients of the input diffusion section described in chapter 4.2.1. Due to the allpasses, this is also the first case where the proposed algorithm provides a slightly brighter reverberation than the reference. Moreover, the original response is characterized by a coloured, distinct decay which is quite different from the sound of the real rooms discussed earlier. This varies from algorithm to algorithm and depends on the underlying structure of the individual reverberator. Ultimately this means that the special characteristics of the reference algorithm cannot be modelled properly with the proposed approach unless both algorithms are based on the same basic structure (in this case a Feedback Delay Network). However, the overall sound is again modelled quite accurately, despite the slight overestimation of the reverberation time from around 1.5 kHz to 8 kHz. Interestingly, the early reflections are emulated very convincingly. This is probably because artificial reverberators generally generate fewer reflections than occur naturally in real rooms.

5.2 General Considerations

Besides the case studies mentioned above, the proposed algorithm was also used to emulate a large number of different room impulse responses. The emulations generally tend to sound subtly darker than their references, especially if the reference impulse response was recorded in a real room. In some cases the stereo image is also slightly wider. However, the most obvious property of the algorithm is that it does not reproduce conspicuous resonances, as the Feedback Delay Network is designed to provide reverberation that sounds as smooth as possible. Nonetheless, the characteristic timbre of the original room is retained and in general the reverberation times are modelled precisely. Differences occur mainly at low frequencies, especially in combination with very short reverberation times. In such cases the accuracy can be improved further by increasing the order of the absorbent filters (at the cost of computation time). A close look at the diagrams showing the initial energy of the impulse responses reveals that the Feedback Delay Network reverberator produces slightly too much energy below 70 Hz. This can be compensated by applying a first order high pass filter to the output signal. Additionally, the echo density build-up can be modified by adjusting the allpass coefficients of the input diffusion section.
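A first order high pass of the kind mentioned above can be sketched in Python as a discretized RC filter. This is one common textbook form and not necessarily the filter used in the Matlab implementation; the default cut-off of 70 Hz is taken from the observation above and is an assumption about a sensible setting.

```python
import math


def first_order_highpass(signal, fs, cutoff_hz=70.0):
    """Discretized RC high-pass: y[n] = a * (y[n-1] + x[n] - x[n-1])."""
    a = 1.0 / (1.0 + 2.0 * math.pi * cutoff_hz / fs)
    out = []
    prev_x = prev_y = 0.0
    for x in signal:
        prev_y = a * (prev_y + x - prev_x)
        prev_x = x
        out.append(prev_y)
    return out
```

A constant input decays to zero at the output, while content well above the cut-off frequency passes almost unchanged.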

The algorithm has also been tested with more practical audio signals such as drums, vocals, orchestral music or piano. Furthermore, it is common practice to mix the output of the reverberator with the direct signal, as is the case in a real enclosure. In such cases it is even more difficult to detect whether the signal was reverberated by convolving it with the original impulse response or by the proposed algorithm. However, the development of a dedicated listening test is beyond the scope of this thesis.



6 Perceptual Controls

A variety of perceptual controls have been implemented to modify certain parameters of the reverberation. They are described briefly in the following section.

Generally, a so-called ‘preset’ is created for each impulse response the user chooses to emulate. The parameters of each preset (such as filter coefficients, reverberation time etc.) are derived offline from the analysis of the impulse response as described in chapter 4.

General Controls:

Reverberation time: This value describes the desired maximum reverberation time in

seconds. The default value is equal to the maximum reverberation time of the

corresponding impulse response. The desired reverberation time is achieved by

multiplying the numerator coefficients of the absorbent filters with a gain factor.

Predelay: This is the time gap between the direct signal and the onset of the

reverberation in milliseconds. For concert halls, the predelay is usually set to around

25 ms, depending on the desired position of the listener [Ber04].

Mono/Stereo: Defines whether the output of the reverberator is mono or stereo.

Dry Level: Level of the direct signal in dB.

Wet Level: Level of reverberation in dB.
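The gain factor used by the reverberation time control is not spelled out here. One plausible choice, sketched below in Python, follows from the usual FDN relation in which a delay line of m samples is attenuated by 10^(-3m/(fs·Tr)) per pass [JC91]. The factor is then the ratio of the attenuations for the new and the old maximum reverberation time. This is an assumption made for illustration, not necessarily the rule used in the Matlab implementation, and the function names are hypothetical.

```python
def absorbent_gain(delay_samples, fs, reverb_time):
    """Attenuation per pass through a delay line for a 60 dB decay time."""
    return 10.0 ** (-3.0 * delay_samples / (fs * reverb_time))


def numerator_gain_factor(delay_samples, fs, old_time, new_time):
    """Factor applied to the numerator coefficients of one absorbent filter
    so that the maximum reverberation time changes from old_time to new_time."""
    return (absorbent_gain(delay_samples, fs, new_time)
            / absorbent_gain(delay_samples, fs, old_time))
```

For example, with the shortest delay line of the appendix (3011 samples at 44100 Hz), reducing the maximum reverberation time from 3.5 s to 2.0 s gives a factor slightly below one. A frequency-independent factor of this kind shifts the decay rate by the same amount in dB per second at all frequencies, so it is exact only at the frequency of the longest decay.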

Controls for Early Reflections:

ER Gain: Sets the volume of the early reflections from –inf to 0 dB. –inf will mute the early reflections.

ER Slope: Adjusts the attack of the early reflections from 0% to 100%. An extreme value of 0% multiplies the gains of the early reflections by an increasing ramp, and a value of 100% multiplies them by a decreasing ramp. A value of 50% (default) leaves the gains as they are.

ER Spread: Time-stretches or time-compresses the early reflections. A value of 0% compresses them to half their length and a value of 100% stretches them to twice their length. A value of 50% leaves them unaltered.
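The ER Slope and ER Spread controls can be sketched in Python as operations on the list of (delay, gain) taps. Only the three anchor points (0%, 50%, 100%) are given above. The behaviour in between is an assumption here: a linear crossfade towards the ramp for ER Slope and an exponential mapping for ER Spread. The function names are hypothetical.

```python
def spread_factor(spread_percent):
    """Map 0..100 % to a time-scaling factor of 0.5..2 (1 at 50 %)."""
    return 2.0 ** ((spread_percent - 50.0) / 50.0)


def apply_er_controls(taps, slope_percent=50.0, spread_percent=50.0):
    """Return new (delay, gain) taps after ER Slope and ER Spread."""
    last = max(delay for delay, _ in taps) or 1
    amount = (slope_percent - 50.0) / 50.0   # -1 .. +1
    factor = spread_factor(spread_percent)
    result = []
    for delay, gain in taps:
        position = delay / last              # 0 at time zero, 1 at last tap
        # Below 50 % an increasing ramp is blended in, above 50 % a decreasing one.
        ramp = position if amount < 0 else 1.0 - position
        weight = (1.0 - abs(amount)) + abs(amount) * ramp
        result.append((round(delay * factor), gain * weight))
    return result
```

With both controls at 50% the taps are returned unchanged, which matches the default behaviour described above.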



Controls for Late Reverberation:

Diffusion 1: Controls the coefficients of the first two allpasses of the input diffusion

section. A value close to one means maximum diffusion, a value of zero corresponds

to no diffusion [Dat97].

Diffusion 2: Controls the coefficients of the last two allpasses of the input diffusion

section. A value close to one means maximum diffusion, a value of zero corresponds

to no diffusion [Dat97].

Depth: Controls the modulation depth from 0% to 100%. A value of zero turns the modulation off, and a value of 100% sets the maximum modulation depth of 30 samples. The default is 50%.

Tail Gain: Sets the volume of the late reverberation from –inf to 0 dB. –inf will mute the reverb tail.
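The effect of the diffusion coefficients can be illustrated with a single allpass section in Python. The sketch uses the transfer function H(z) = (g + z^-M) / (1 + g z^-M); the sign convention and the delay length M are assumptions and may differ from the structure in [Dat97] and from the Matlab implementation.

```python
def allpass_diffuser(signal, delay, coefficient):
    """Allpass section with H(z) = (g + z^-M) / (1 + g z^-M)."""
    buffer = [0.0] * delay          # circular buffer holding v[n-M]
    out = []
    for n, x in enumerate(signal):
        delayed = buffer[n % delay]
        v = x - coefficient * delayed
        out.append(coefficient * v + delayed)
        buffer[n % delay] = v
    return out
```

With a coefficient of zero the section reduces to a pure delay, which corresponds to no diffusion. Larger coefficients spread an impulse into a longer, denser train of echoes while leaving the magnitude spectrum flat.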

Output Filters:

A simple filtering section has been added to the output of the reverberator. It provides

controls for the cut-off frequency for a first order highpass filter as well as for a first order

lowpass filter.



7 Conclusion and Outlook

This thesis proposed an algorithm designed to emulate the reverberation of common enclosures such as rooms, concert halls or cathedrals. While this effect is typically achieved by convolving a signal with a measured room impulse response, this thesis describes an approach in which the early reflections are recreated by a tapped delay line and the reverberation tail is synthesized with a Feedback Delay Network. Common impulse response analysis methods, such as the Energy Decay Relief and the Echo Density Profile, as well as the most basic building blocks of reverberators, the comb and allpass filters, have been reviewed. The final algorithm consists of an offline analysis block and an artificial reverberator designed to work in real time. The former extracts information about the reference impulse response and then automatically tunes the necessary parameters of the reverberator in order to emulate the reverberation effect. Second order absorbent filters proved to be adequate to approximate the frequency-dependent reverberation time. Additionally, a spectral correction was applied to the output of the Feedback Delay Network to match the initial spectrum to that of the original impulse response.

The duration of the early reflections was determined by calculating and evaluating the Echo Density Profile, as the start of the late field is usually indicated by the maximum value right after the Echo Density Profile has approached a value of one for the first time. This initial part of the impulse response can be used to extract the delay times and gains of the early reflections. After a threshold is applied to eliminate the weaker echoes, only the most prominent reflections are retained. In this way the computation time can be decreased while still retaining the characteristic spatial impression of the first part of the reverberation.

A large amount of time was spent on tuning the delay times of the FDN. The quality of the reverberation tail was further improved by introducing modulation and increasing the order of the network to 16. The decay generally sounds smooth and shows almost no obvious, undesirable resonances. A few subtle periodicities could be detected, but only when using Dirac impulses as input signals and listening through high-end DA converters and headphones.

Finally, the algorithm was tested with a variety of different impulse responses, including small rooms, large rooms, halls, caverns and even impulse responses taken from algorithmic reverb devices. In most cases the emulations were very accurate, with a tendency to suppress the most prominent resonances of the original reverberation while still retaining its overall sonic character. However, a dedicated listening test still has to be conducted to verify the subjective impressions.

Further research can be done in the area of finding an optimal solution to determine the filter

coefficients of the absorbent filters. While the proposed adaptive approach provides adequate

results for frequencies above around 500 Hz, the algorithm sometimes underestimates the

reverberation time for lower frequencies. This issue can probably be fixed by connecting two

of those second order filters in series, where the latter corrects the remaining magnitude error

for the low frequencies. This will however significantly increase the computation time, as

every delay line requires this filtering process.

Though the advantage of using Feedback Delay Networks is that they are generally well studied, in theory the approach is not limited to this kind of structure. Other systems, such as single loop reverberators [as proposed in Dat97], can alternatively be used to create the late reverberation. As such systems use only a single feedback path, fewer absorbent filters are needed to control the reverberation time, and consequently more accurate filters can be used. Ultimately the goal is to find a structure that is computationally as efficient as possible while still providing good control and high quality reverberation.

At this time the proposed algorithm is implemented in Matlab. A real-time implementation in the form of an audio plugin is left for future work.



Bibliography

[JC91] J. M. Jot and A. Chaigne, “Digital Delay Networks for Designing Artificial Reverberators,” 90th AES Convention, 1991.

[Zöl11] U. Zölzer, Digital Audio Effects – Second Edition. Wiley, 2011.

[GW09] G. Graber and W. Weselak, “Raumakustik Skriptum,” Institut für Breitbandkommunikation, TU Graz, Graz, A, 2009.

[MS07] R. Stewart and D. Murphy, “A Hybrid Artificial Reverberation Algorithm,” Audio Eng. Society Convention Paper 7021, 2007.

[Jot92] J. M. Jot, “An Analysis/Synthesis Approach To Real Time Artificial Reverberation,”

IEEE Int. Conf. Acoustics, vol. 2, 1992.

[Schroe62] M.R. Schroeder, “Natural Sounding Artificial Reverberation,” J. Audio Eng.

Society, vol. 10, no. 3, 1962.

[Fre00] J. Frenette, “Reducing Artificial Reverberation Requirements Using Time Variant

Feedback Delay Networks,” Thesis, University Of Miami, Miami, USA, 2000. URL:

http://www.music.miami.edu/programs/mue/research/jfrenette/index.html (09.06.2015).

[SL61] M. R. Schroeder and B. F. Logan, “Colorless Artificial Reverberation,” J. Audio Eng.

Society, vol. 9, no. 3, 1961.

[Smi10] J. O. Smith, Physical Audio Signal Processing for Virtual Musical Instruments And

Audio Effects. W3K Publishing. 2010.

[Kut91] H. Kuttruff, Room Acoustics. Elsevier Science Publishing Company (New York),

1991.

[AH06] J. S. Abel and P. Huang, “A Simple, Robust Measure of Reverberation Echo

Density,” Proceedings of the 121st AES Convention, San Francisco, CA, USA, October 2006.



[SP82] J. Stautner and M. Puckette, “Designing Multi-Channel Reverberators,” Computer Music Journal, vol. 6, no. 1, pp. 52-65, Spring 1982.

[Jot97] J. M. Jot, “Efficient Models for Reverberation and Distance Rendering in Computer

Music and Virtual Reality,” Computer Music Conf., Thessaloniki, GRE, 1997.

[Jot96] J. M. Jot, “Synthesizing Three-Dimensional Sound Scenes in Audio or Multimedia Production and Interactive Human-Computer Interfaces,” 5th International Conference: Interface to Real & Virtual Worlds, Montpellier, France, May 1996.

[JCW97] J. M. Jot, L. Cerveau and O. Warusfel, “Analysis and Synthesis of Room Reverberation Based on a Statistical Time-Frequency Model,” Proceedings of the 103rd AES Convention, New York, USA, 1997.

[Dat97] J. Dattorro, “Effect Design Part 1: Reverberator and Other Filters,” J. Audio Eng.

Society, vol. 45, 1997.

[Dat97,2] J. Dattorro, “Effect Design Part 2: Delay-Line Modulation and Chorus,” J. Audio Eng. Soc., vol. 45, 1997.

[CNW14] T. Carpentier, M. Noisternig and O. Warusfel, “Hybrid Reverberation Processor with Perceptual Control,” Proc. of the 17th Int. Conference on Dig. Audio Effects, Erlangen, GER, 2014.

[Ber04] L. Beranek, Concert Halls and Opera Houses-Music, Acoustics and Architecture – 2nd

edition. Springer. 2004.

[Zöl08] U. Zölzer, Digital Audio Signal Processing. Wiley, 2008.

[Ruo12] M. Ruohonen, “Measurement-Based Automatic Parameterization of a Virtual

Acoustic Room Model,” Thesis, School of Electrical Engineering, Aalto University, Espoo,

FI, 2012.

[Tou98] J. Tougaard, “Detection of short pure-tone stimuli in the noctuid ear: what are

temporal integration and integration time all about?,” J. Comp Physiol A, 1998.



Picture Credits

Fig. 1.1: J. M. Jot: Synthesizing Three-Dimensional Sound Scenes in Audio or Multimedia Production and Interactive Human-Computer Interfaces (1996), URL: http://articles.ircam.fr/textes/Jot96a/ (accessed 06.06.2015)

Fig. 2.1: U. Zölzer, Digital Audio Effects – Second Edition. Wiley, 2011.

Fig. 2.3.0: M.R. Schroeder, “Natural Sounding Artificial Reverberation,” Bell Telephone

Laboratories, 1962.

Fig. 2.3.5: U. Zölzer, Digital Audio Effects – Second Edition. Wiley, 2011.

Fig. 2.5.1: J. S. Abel and P. Huang, “A Simple, Robust Measure of Reverberation Echo Density,” Proceedings of the 121st AES Convention, San Francisco, CA, USA, October 2006.

Fig. 2.6.1: J. M. Jot and A. Chaigne, “Digital Delay Networks for Designing Artificial Reverberators,” 90th AES Convention, 1991.



Laboratories, 1962.


Laboratories, 1962.

Fig. 3.1.1: U. Zölzer, Digital Audio Effects – Second Edition. Wiley, 2011.

Fig. 3.2.1: J. M. Jot and A. Chaigne, “Digital Delay Networks for Designing Artificial Reverberators,” Proc. of the 90th AES Convention, 1991.


Figure 4.2.2: U. Zölzer, Digital Audio Effects – Second Edition. Wiley, 2011.

http://articles.ircam.fr/textes/Jot96a/



APPENDIX

Delay lengths of the FDN in samples in the final implementation (at 44100 Hz), in sequential order:

3011 3083 3251 3307 3433 3461 3727 3797 4057 4153 4229 4451 4517 4999 5081 5209

Feedback Matrix (N=16) of the final implementation, construction code for Matlab:

N = 16;
A = eye(N);
idx = [N, 1:N-1];
A = A(idx, :);           % cyclically shift the rows of the identity matrix
F = A - (2/N)*ones(N);   % F… Feedback Matrix of FDN
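The construction above yields a permutation matrix minus a constant matrix, which is orthogonal and therefore lossless in the feedback loop. The following Python sketch rebuilds the same matrix and checks this property numerically. It is a verification aid written for illustration and not part of the Matlab implementation; the function names are hypothetical.

```python
def feedback_matrix(n):
    """Python counterpart of the Matlab construction: F = P - (2/n) * ones."""
    return [[(1.0 if col == (row - 1) % n else 0.0) - 2.0 / n
             for col in range(n)] for row in range(n)]


def max_unitarity_error(matrix):
    """Largest absolute entry of F * F^T - I."""
    n = len(matrix)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(matrix[i][k] * matrix[j][k] for k in range(n))
            worst = max(worst, abs(dot - (1.0 if i == j else 0.0)))
    return worst
```

For N = 16 the error returned by `max_unitarity_error(feedback_matrix(16))` is at the level of floating point rounding, so the decay of the network is controlled by the absorbent filters alone.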
