A Comparison of Techniques for Detecting Clicks on Recordings Taken From Vinyl James Nugent Bachelor of Science in Computer Science with Honours The University of Bath May 2008


A Comparison of Techniques for Detecting Clicks on

Recordings Taken From Vinyl

James Nugent

Bachelor of Science in Computer Science with Honours

The University of Bath

May 2008


This dissertation may be made available for consultation within the University Library and may be photocopied or lent to other libraries for the purposes of consultation.

Signed:


A Comparison of Techniques for Detecting Clicks

on Recordings Taken From Vinyl

Submitted by: James Nugent

COPYRIGHT

Attention is drawn to the fact that copyright of this dissertation rests with its author. The Intellectual Property Rights of the products produced as part of the project belong to the University of Bath (see http://www.bath.ac.uk/ordinances/#intelprop). This copy of the dissertation has been supplied on condition that anyone who consults it is understood to recognise that its copyright rests with its author and that no quotation from the dissertation and no information derived from it may be published without the prior written consent of the author.

Declaration

This dissertation is submitted to the University of Bath in accordance with the requirements of the degree of Bachelor of Science in the Department of Computer Science. No portion of the work in this dissertation has been submitted in support of an application for any other degree or qualification of this or any other university or institution of learning. Except where specifically acknowledged, it is the work of the author.

Signed:


Abstract

The problem of removing noise from recordings has been considered both for commercial and academic purposes, and is of interest to many groups of people. This project investigates two algorithms for detecting clicks in audio, and a method for repairing the affected samples. We also examine a brief history of audio restoration, and conduct a survey of some relevant literature. We present the results of a comparison between these two detection algorithms, and go on to discuss some possibilities for further development.


Contents

1 Introduction
1.1 Current State of the Art
1.2 Changes to the Original Plan
1.3 Document Overview

2 Literature Review
2.1 History of Audio Recording
2.2 Types of Noise
2.3 Basic Digital Signal Processing Concepts
2.3.1 Sampling
2.3.2 Frequency Domain
2.4 History of Audio Restoration
2.4.1 Beginnings of Forensic Audio
2.4.2 Analogue and Tape-Based Restoration
2.4.3 Digital Systems
2.5 Algorithms
2.5.1 Debuzzing
2.5.2 Denoising
2.5.3 Berger, Coifman and Goldberg
2.5.4 Declicking

3 Design
3.1 Overall Architecture
3.2 Detecting clicks in the Time Domain
3.3 Detecting clicks in the Frequency Domain
3.4 Fixing Clicks using LSAR

4 Implementation
4.1 Introduction
4.2 Code Organisation
4.3 FFTW Library
4.4 libsndfile Library
4.4.1 Basic Types
4.4.2 Opening a file
4.4.3 Closing a file
4.4.4 Reading samples from a file
4.4.5 Writing samples to a file
4.5 Guide to Using the Program

5 Testing and Results
5.1 Source Material
5.2 Results

6 Conclusions

A Raw Result Outputs
A.1 Time Domain Algorithm
A.1.1 Artificial Vinyl Noise (Track 4)
A.1.2 “Brothers In Arms” (Clean) (Track 5)
A.1.3 “Minute Waltz” (With artificial noise) (Track 10)
A.1.4 “Jungle Dream” (Track 11)
A.2 Frequency Domain Algorithm
A.2.1 “Brothers In Arms” (Clean) (Track 5)
A.2.2 “Minute Waltz” (With artificial noise) (Track 10)
A.2.3 “Jungle Dream” (Track 11)

B Project Proposal

C Code
C.1 File: SoundFileHandling.c
C.2 File: hpf-demonstration.c
C.3 File: ClickList.c
C.4 File: vclean.c


List of Figures

2.1 The RIAA Equalisation Curve

2.2 Surface of a 78rpm record as viewed through an electron microscope, with dust and scratches clearly visible. Courtesy of Scientific Imaging Group, CUED.

2.3 A click on a stereo recording taken from vinyl, as displayed when zoomed in using Adobe Audition.

2.4 A waveform exhibiting the results of clicks and crackle. This section represents a second of audio, and is displayed in Adobe Audition.

2.5 Time Domain to Frequency Domain using the DFT

2.6 A spectrogram, showing time on the horizontal axis, frequency on the vertical axis, and using colour to show signal intensity.

2.7 A spectrogram of a portion of “Equation” (Aphex Twin).

2.8 Debuzzing with a comb filter and the CEDAR algorithm. Frequency plots.

2.9 Methods of replacing samples containing clicks.

3.1 Basic click detection and removal process

5.1 A spectrogram showing a click with frequency scale

5.2 Samples of vinyl noise made up from run-in groove noise of several records (Track 4). Waveform and Spectrogram.

5.3 “Brothers In Arms” (Dire Straits, 1985) (Track 5). Waveform and Spectrogram.

5.4 “Concerto in C minor” (St. Petersburg Radio & TV Symphony Orchestra, 1999) (Track 6). Waveform and Spectrogram.

5.5 “Waltz in D flat major” (Chet Atkins, 1957) (Track 7). Waveform and Spectrogram.

5.6 “Jungle Dream” (Los Indios Tabajaras, 1963) (Track 11). Waveform and Spectrogram.

5.7 “One Note Samba” (Stan Getz and Charlie Byrd, 1963) (Track 12). Waveform and Spectrogram.

5.8 “Honour, Riches, Marriage Blessing” (Woolfenden, 1978) (Track 13). Waveform and Spectrogram.


List of CD Tracks

1. Reproduction of “Au Clair De Lune” (Unknown Performer, 9 April 1860). The earliest known recording.

2. Excerpt from “One Note Samba” (Stan Getz and Charlie Byrd, 1962, Verve UK). Taken from a 1963 pressing. Mono source recorded using a stereo record player. Includes needle drop and groove run-in noise.

3. Excerpt from “One Note Samba” as in track 2, having been run through a high pass filter.

4. Vinyl noise made up from run-in groove noise of several records (including track 11).

5. Excerpt from “Brothers In Arms” (Dire Straits, 1985, Vertigo Records). Stereo. Taken from CD.

6. Excerpt from “Concerto in C minor” (Vivaldi, Recorded by the St. Petersburg Radio & TV Symphony Orchestra, 1999, Madacy Records). Stereo. Taken from CD.

7. Excerpt from “Waltz in D flat major” (Chopin, Performed by Chet Atkins, 1957). Stereo. Originally recorded to tape and subsequently remastered in 2007.

8. As track 5, with added vinyl noise.

9. As track 6, with added vinyl noise.

10. As track 7, with added vinyl noise.

11. Excerpt from “Jungle Dream” (Los Indios Tabajaras, 1963, RCA Victor, RCA-1365) (B-Side of “Maria Elena”). Mono source recorded using a stereo record player. Includes needle drop and groove run-in noise.

12. Excerpt from “One Note Samba” (Stan Getz and Charlie Byrd, 1962, Verve UK). Taken from a 1963 pressing. Mono source recorded using a stereo record player. Includes needle drop and groove run-in noise.

13. Excerpt from “Honour, Riches, Marriage Blessing” (Woolfenden, Performed by the Royal Shakespeare Company Ensemble and Cast, 1978, Ariel Records). Stereo.


Acknowledgements

Numerous people have aided my preparation of this project. I am particularly grateful to Prof. John Fitch for supervising the project, and providing advice on many aspects of the work. I am also grateful to those who put substantial amounts of effort into producing high quality open source libraries which have been used in the course of this project.

I would also like to thank my father Jim Nugent for providing access to a large library of vinyl records, and high quality digitisation equipment, and Chris Bailey for providing obscure facts regarding the use of LaTeX.


Chapter 1

Introduction

The aim of this project is to produce a program that allows comparisons to be made between two different types of algorithm, and some variants thereof, for detecting clicks in audio files, focusing particularly on recordings taken from vinyl sources.

Many hours of historical audio recordings reside on analogue media throughout the world. In many cases, a vinyl disc is the only available source of a given recording. Unfortunately, “mint” vinyl is rare, so a well-used disc is often all that survives. Imperfections in and scratches on the vinyl surface manifest themselves as noise during playback, as does dust in the grooves. An example of a typical piece of audio taken from an old, badly damaged vinyl disc can be heard on track 2 of the accompanying audio CD.

Until 1950, it was common to record and master to a wax disc in one pass, from which a vinyl negative was formed (Martin and Hornsby, 1995). A consequence of this practice was that the quality of the master from which vinyl discs were made depended heavily on both the quality of the recording process and the wax media. The practice became less common with the advent of high quality tape recorders, a result of developments made by Ampex which were inspired by Jack Mullin's discovery of Nazi Germany's Magnetophon following the Second World War (Budman, 2006). Whilst tape recorders arguably improved quality, they brought with them their own problems (notably hiss).

The advent of very high quality digital audio media such as Compact Disc has raised listeners' expectations of the levels of quality which are attainable. Consequently, restoration of recordings suffering from defects such as these is of interest to many people, whether for the purposes of improving the perceived quality of personal recordings, re-mastering recordings for release on compact disc, or so-called “forensic audio”.


1.1 Current State of the Art

As in many fields, the current state of the art of audio restoration is represented by expensive commercial hardware and software packages. CEDAR Audio, for example, produce several hardware/software or pure software packages such as “Retouch” which are widely regarded as the state of the art, and are often used in re-mastering recordings before issue on CD.

Other packages such as Diamond Cut’s “Audio Restoration Tools” and Dartech’s “DartPro”, whilst less well regarded than packages such as “Retouch”, are well known. The commercial field also offers lower-end packages such as Steinberg’s “Clean”, which is designed to make it easier for individuals to improve the perceived quality of their personal record collections before transfer to compact disc.

In the world of Free Software, one of the pre-eminent projects is “Gnome Wave Cleaner”, which falls between the two previously mentioned groups of packages in terms of complexity, and is capable of producing excellent results on certain types of damaged audio. The Free wave editor “Audacity” also includes noise reduction tools.

Several other audio editors, such as “Sound Forge” and “Adobe Audition”¹, also include noise reduction tools in their feature lists.

It is unrealistic to expect to reach the level of high-end commercial systems during the course of this project. With this in mind, our target is to produce a program which compares with some of the lower-end packages and provides a basis for further work.

1.2 Changes to the Original Plan

The project proposal detailed a number of requirements for the project. Several of these changed as a result of investigation, or of the realisation that they were infeasible or would not produce good results. The project proposal is reproduced in its original form in appendix B for completeness, although much of the material is present in other areas of this document.

The original intention was to produce software which worked as a plugin to the Audacity wave editor. However, it was decided that this was more complicated than reading and writing wave files directly using a good library (see chapter 4 for more details), and that it led to such increased compile times as to negate the benefits.

The detection strategy we originally intended to implement was meant to allow easy detection of clicks resulting from scratches on the surface of a record, utilising the knowledge that scratches are often at a high angle to the grooves of the disc. The resulting clicks would thus form a slow, regular beat, since the next click would be located approximately one revolution away. However, further investigation showed that clicks caused by scratches on records (as opposed to dust in the grooves or other defects) tend to be quite distinctive, and much louder than the source signal. Consequently, traditional algorithms are able to detect them easily and accurately.

¹ Formerly Cool Edit/Cool Edit Pro

Such large clicks are not normally where most effort is focused during a restoration, and the proposed algorithm was not designed to detect clicks with other causes. Therefore, we decided not to implement it, and instead to focus on more traditional methods, comparing them on a variety of source material.
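To make the "slow, regular beat" of scratch clicks concrete, the expected spacing between clicks one revolution apart can be estimated from the rotation speed alone. The sketch below is purely illustrative and is not part of the project's code: the function name and the 44.1kHz sampling rate are assumptions.

```python
def samples_per_revolution(rpm, sample_rate=44100):
    """Approximate number of samples between clicks one revolution apart.

    rpm is the turntable speed; sample_rate (Hz) is an assumed CD-style rate.
    """
    seconds_per_revolution = 60.0 / rpm
    return seconds_per_revolution * sample_rate


if __name__ == "__main__":
    # Common disc speeds; 100/3 is 33 1/3 rpm.
    for label, rpm in (("33 1/3", 100.0 / 3.0), ("45", 45.0), ("78", 78.0)):
        print(label, "rpm:", round(samples_per_revolution(rpm)), "samples")
```

At 33⅓rpm one revolution takes 1.8 seconds, so under these assumptions clicks from a single scratch would recur roughly every 79,380 samples.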

1.3 Document Overview

In this document, we will first review some of the available resources in chapter 2. This will encompass academic work on audio restoration, as well as some basic digital signal processing concepts and a brief history of audio recordings. We will then go on to examine the design and some implementation details of our restoration program in chapters 3 and 4. Chapter 5 will review the testing methods and material used, and present the results, and finally we will present our conclusions in chapter 6.

Accompanying the document is an audio CD containing several excerpts of audio used to illustrate various points throughout. A list of tracks on the disc and their origins is included in the preface. We refer the reader to the tracks on this disc at several points throughout this document.


Chapter 2

Literature Review

As described in the introduction, our aim is to compare some of the algorithms which have emerged for detecting clicks on recordings. In order to better understand both the uses of and the principles behind audio restoration, and the existing tools available, a literature survey was carried out.

It was found when performing initial research to scope the project, and confirmed during the literature search, that the “state of the art” of audio restoration is in the commercial space, and that there is little information besides marketing materials available in the public domain about the products generally considered to produce the best results (such as the suite of tools available from CEDAR (CEDAR Audio Ltd, 2008)). Consequently, the scope of the literature search was broadened to include Digital Signal Processing more generally, and brief histories of both recording and computerised audio restoration.

We will not examine in depth in this chapter any algorithms we intend to implement, instead leaving that until chapter 3. We do, however, briefly cover some interesting alternative algorithms.

2.1 History of Audio Recording

Much of the information in this section is taken from two sources, (Martin and Hornsby, 1995) and (Borwick, 1994). Other sources have been noted where appropriate.

The first recordings of audio were made in the mid-nineteenth century using a device which could record a waveform on paper. The first known surviving recording made in this fashion is of the French folk song “Au Clair de la Lune”, and dates from 9th April 1860. At the time of recording, there was no method of reproducing the audio, although recent developments have permitted this¹. For the sake of historical interest, a reproduction can be found on track 1 of the accompanying audio CD.

¹ The reproduced version of this recording caused the much-publicised “corpsing” of BBC Radio 4 newsreader Charlotte Green when the successful reproduction was announced on March 28th 2008.

The first known recordings reproducible at the time of recording were made by Thomas Edison with his Phonograph machine in 1877. The original machine used a rotating foil drum as the recording medium, which was developed by Bell into a commercial product using wax cylinders. However, no method of duplicating cylinders existed, meaning that each copy of a recording an artist wished to sell required a separate performance.

In 1888, the flat disc record made from polished zinc and the Gramophone machine were unveiled by Emile Berliner at the Franklin Institute. By 1900 this had been developed to use discs made of shellac running at 78rpm, and cutting negatives allowed duplication for the first time.

Until 1920, all recordings were acoustic, relying on sound pressure focused via a horn moving a cutting tip. During the early 1920s, the first electronic recordings were introduced, leading to large increases in bandwidth, allowing frequencies of 5kHz to be captured for the first time. From here, improvements to equipment and processes appeared rapidly. The first recording on magnetic tape was demonstrated in 1935 by AEG in Germany, although the advent of World War II meant that little further progress on this was seen in the United Kingdom or United States until 1945.

Following the war, the discovery of Nazi Germany’s Magnetophon by Jack Mullin led to the founding of Ampex, the company responsible for developing the high quality tape recording machines which replaced direct-to-disc recording in the United States by 1950, and slightly later in the United Kingdom.

Figure 2.1: The RIAA Equalisation Curve

The first stereo recordings were also made in 1935, although it was not until 1954 that the first commercially available stereo tape recording machines were manufactured (again by Ampex). The development of multitrack tape recorders and recording methods was largely pioneered by guitarist and jazz musician Les Paul, whose recordings with wife Mary Ford are of remarkable quality given their production methods. A detailed history of this is available on DVD (Paulson and Paul, 2007).


Figure 2.2: Surface of a 78rpm record as viewed through an electron microscope, with dust and scratches clearly visible. Courtesy of Scientific Imaging Group, CUED.

Whilst tape took over in the recording studio, at least until the introduction of digital recorders in the early 1980s, domestic technology remained based around discs. The late 1940s saw the introduction of vinyl records in the form of the “Long Player”, which ran at 33⅓rpm, and the 45rpm disc. In order to improve the perceived quality of sound from these discs, equalisation was applied to recordings in order to reduce bass frequencies which could cause the needle to bounce. Reproduction equipment reversed this equalisation by boosting the bass frequencies. A de facto standard for the equalisation curve was introduced by the Recording Industry Association of America in 1954, and is shown in figure 2.1. Most records produced from this time onwards used this standard, replacing the proprietary equalisations each record company applied to their own pressings.

The final major development in consumer audio reproduction was Compact Disc in 1982, which dramatically increased perceived quality over vinyl disc, and also provided random access and increased playing time without the need to reverse a disc.

It is perhaps somewhat ironic that despite the continual improvement in available quality over the last 150 years, much of the music now heard comes from compressed audio file formats such as MPEG Layer 3. It is, however, possible to obtain excellent results from such formats.

2.2 Types of Noise

Broadly speaking, recording noise can be defined as any unwanted modification between the sound source being recorded and the reproduction of the recording. Generally, background noise present at the point of recording is not considered noise in this sense, although it is often useful to be able to reduce it. Defects can be categorised into local and global problems. Local problems affect a small number of samples, whereas global degradations affect all samples for some considerable period of time (Godsill, Rayner and Cappe, 1997).

Figure 2.3: A click on a stereo recording taken from vinyl, as displayed when zoomed in using Adobe Audition.

Broadly speaking, and discounting noise arising from digitisation, the types of noise found on recordings taken from vinyl sources can be categorised as follows:

• Clicks - A click is an aberration in the waveform, generally only lasting for a few samples, but manifesting itself as a “popping” noise. They are most often caused by scratches and dust on the surface of a vinyl disc (see, for example, figure 2.2). An example of the waveform resulting from this can be seen in Figure 2.3.

• Crackle - Crackle is often characterised as a “fat frying” type noise. It is generally caused by randomly distributed pock-marks on the surface of a vinyl disc, creating impulsive disturbances in the audio. An example of a waveform resulting from this can be seen in Figure 2.4.

• Buzz - Buzz is very similar to crackle in that it manifests itself because of impulsive disturbance in the waveform. However, unlike crackle, where the disturbances are randomly distributed, the impulses causing a buzz are regularly spaced. This is a common problem caused by poor shielding or incorrect grounding in electrical equipment, in a form known as fifty-cycle hum.²

• Hiss - Generally manifesting itself as a constant sound, the word “hiss” is onomatopoeic for the sound one can expect to hear from this type of noise. Hiss often originates from tape rather than vinyl sources, and as such is not considered further.

² Sixty-cycle hum in the United States!

Figure 2.4: A waveform exhibiting the results of clicks and crackle. This section represents a second of audio, and is displayed in Adobe Audition.
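The distinction drawn above between crackle (randomly distributed impulses) and buzz (regularly spaced impulses) can be illustrated by generating impulse positions for each. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the figures of 44,100 samples per second and a 50Hz hum (one impulse every 882 samples) are chosen only for illustration.

```python
import random


def buzz_positions(n_samples, period):
    """Sample indices of regularly spaced impulses, as in mains-induced buzz."""
    return list(range(0, n_samples, period))


def crackle_positions(n_samples, count, seed=0):
    """Sample indices of randomly distributed impulses, as in crackle.

    A fixed seed keeps the illustration repeatable.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_samples), count))


def gaps(positions):
    """Distances between consecutive impulse positions."""
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


if __name__ == "__main__":
    one_second = 44100                       # assumed sampling rate
    buzz = buzz_positions(one_second, 882)   # 44100 / 50 Hz = 882 samples
    crackle = crackle_positions(one_second, len(buzz))
    print("distinct buzz gaps:", len(set(gaps(buzz))))
    print("distinct crackle gaps:", len(set(gaps(crackle))))
```

The buzz impulses share a single gap length, whereas the crackle gaps vary, which is what makes buzz amenable to techniques that exploit periodicity.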

The order in which these types of noise are processed can have a substantial effect on the result, as some processes may remove signal which is useful to other processes. A suitable order for performing various noise removal functions is documented by CEDAR. Their suggestion is that the restoration be carried out in the following order:

• Declick - should be performed first in order to remove large clicks which otherwise make it difficult for the decrackling process to identify smaller clicks constituting surface noise.

• Decrackle - should be performed after declicking, but before dehissing in order to remove small crackles which cause problems with dehissing algorithms.

• Debuzz - should be performed after declicking but before dehissing for similar reasons, although this process is not necessary in every restoration.

• Dehiss - should be the last process applied since it often removes information useful to carrying out the other processes.
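The ordering above can be sketched as a small planning function that arranges whichever stages a restoration needs into the suggested sequence. The names below are hypothetical and no restoration is actually performed. The text only requires decrackling and debuzzing to fall between declicking and dehissing, so their relative order here simply follows the list and is an assumption.

```python
# Stage names in the order suggested in the text (decrackle-before-debuzz is assumed).
RECOMMENDED_ORDER = ("declick", "decrackle", "debuzz", "dehiss")


def plan_restoration(requested):
    """Return the requested stages arranged in the recommended order."""
    requested = set(requested)
    unknown = requested - set(RECOMMENDED_ORDER)
    if unknown:
        raise ValueError("unknown stages: " + ", ".join(sorted(unknown)))
    return [stage for stage in RECOMMENDED_ORDER if stage in requested]


if __name__ == "__main__":
    # Debuzzing is not needed in every restoration, so it may be omitted.
    print(plan_restoration({"dehiss", "declick", "decrackle"}))
```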


2.3 Basic Digital Signal Processing Concepts

In this section, we will examine some of the basic concepts involved in digital signal processing, including sampling and the errors it introduces, and the frequency domain. It acts as a summary, with most of the information taken from the University of Bath CM30142 “Music and Digital Signal Processing” course, and “The Scientist and Engineer’s Guide to Digital Signal Processing” (Smith, 1997).

2.3.1 Sampling

Since we intend to work in the digital domain, we must consider sampling theory, and the effects of sampling upon audio signals. One way of looking at sound is as a continuously variable pressure upon the eardrum. However, we must represent the sound inside a digital computer, and sampling provides a way of achieving this.

A simple method is known as Pulse Code Modulation (PCM). We measure the audio at fixed time intervals and record each value; the number of samples taken per second is known as the sampling rate. Since we are converting a continuous function into a discrete function, we inevitably introduce error. The two types of error introduced by PCM sampling are known as Quantisation error and Sampling error.
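As a minimal sketch of PCM, the following samples a continuous function at a fixed rate and rounds each value to a signed integer. The function names are hypothetical, and the signal is assumed to lie in the range -1 to 1; a real converter would also filter the input before sampling.

```python
import math


def pcm_sample(signal, sample_rate, duration, bits=16):
    """Sample signal(t) at fixed intervals and quantise to signed integers.

    signal is assumed to return values in [-1.0, 1.0]; t is in seconds.
    """
    full_scale = 2 ** (bits - 1) - 1          # 32767 for 16-bit samples
    n_samples = int(sample_rate * duration)
    return [round(full_scale * signal(n / sample_rate)) for n in range(n_samples)]


def sine(frequency):
    """Return a continuous sine wave of the given frequency as a function of time."""
    return lambda t: math.sin(2.0 * math.pi * frequency * t)


if __name__ == "__main__":
    # One second of a 440 Hz tone at an assumed CD-style rate.
    samples = pcm_sample(sine(440.0), 44100, 1.0)
    print(len(samples), "samples, first few:", samples[:5])
```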

Quantisation Error

We represent the value of each sample as an integer in binary. If we use B bits to represent each sample, there are 2^B distinct values that each sample could take. The original function can, however, take values in between these integers, and we must quantise to the nearest integer value. Fortunately, it is simple to reduce the resulting error by increasing B.

The signal to noise ratio resulting from quantisation error when dealing with 16-bit samples, the standard CD quality, is around 98dB for a full-scale sine wave (roughly 6dB per bit), which is acceptable to most listeners.
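As a rough check on figures of this kind, the usual textbook estimate for an ideal B-bit quantiser driven by a full-scale sine wave is 6.02B + 1.76dB, which gives about 98dB for B = 16. Other conventions, such as comparing the full-scale range with the peak rounding error, give slightly different numbers. The sketch below compares that estimate with a direct measurement; the function names and the 997Hz test tone are assumptions made for illustration.

```python
import math


def theoretical_snr_db(bits):
    """Textbook SNR of an ideal quantiser for a full-scale sine: ~6.02*bits + 1.76 dB."""
    return 20.0 * math.log10(2 ** bits) + 10.0 * math.log10(1.5)


def measured_snr_db(bits, frequency=997.0, sample_rate=44100, n_samples=44100):
    """Quantise a full-scale sine to integers and measure signal power over error power."""
    full_scale = 2 ** (bits - 1) - 1
    signal_power = 0.0
    noise_power = 0.0
    for n in range(n_samples):
        exact = full_scale * math.sin(2.0 * math.pi * frequency * n / sample_rate)
        error = round(exact) - exact          # rounding error, at most half a step
        signal_power += exact * exact
        noise_power += error * error
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    print("theory  :", round(theoretical_snr_db(16), 2), "dB")
    print("measured:", round(measured_snr_db(16), 2), "dB")
```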

Sampling Error

Sampling error results from only taking samples at equally spaced time intervals. If the function being sampled contains frequencies above half the sample rate, those components appear as an alias at a lower frequency. The highest frequency which can be sampled at a sample rate of S Hz without aliasing is S/2 Hz. This frequency is known as the Nyquist Frequency.
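The frequency at which an alias appears can be computed by folding the input frequency back into the range from zero to the Nyquist Frequency. The following is a small sketch with hypothetical function names, assuming a pure tone and an ideal sampler with no anti-aliasing filter.

```python
def nyquist_frequency(sample_rate):
    """Highest frequency representable without aliasing at this sample rate."""
    return sample_rate / 2.0


def alias_frequency(frequency, sample_rate):
    """Apparent frequency of a pure tone after sampling at sample_rate."""
    folded = frequency % sample_rate
    # Frequencies in the upper half of each band reflect back towards zero.
    return min(folded, sample_rate - folded)


if __name__ == "__main__":
    rate = 44100  # assumed CD-style sampling rate
    for tone in (1000, 22050, 30000, 44200):
        print(tone, "Hz appears as", alias_frequency(tone, rate), "Hz")
```

For example, a 30kHz tone sampled at 44.1kHz lies above the 22.05kHz Nyquist Frequency and would be heard at 14.1kHz.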

2.3.2 Frequency Domain

The sampling method we discussed earlier represents the signal amplitude as a function of time, so we say it is in the time domain. It is often useful to process signals working in the frequency domain instead. In the frequency domain representation, the signal is represented as the amplitude of various constituent sinusoidal frequencies.

It should be noted that the frequency domain representation of a signal contains exactly the same information as a time domain representation. In order to convert a signal into the frequency domain, we use the Discrete Fourier Transform. In other words, we decompose the signal. In order to convert a signal from the frequency domain to the time domain, we use the Inverse Discrete Fourier Transform, or synthesise the signal. The decomposition is formalised in the analysis equations:

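$$\mathrm{Re}X_k = \sum_{i=0}^{N-1} x_i \cos\left(\frac{2\pi k i}{N}\right) \tag{2.1}$$

$$\mathrm{Im}X_k = -\sum_{i=0}^{N-1} x_i \sin\left(\frac{2\pi k i}{N}\right) \tag{2.2}$$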

The standard notation within digital signal processing is to use lower case letters to refer to signals in the time domain and upper case letters to refer to signals in the frequency domain.

[Figure 2.5 reproduces Figure 8-3, “DFT terminology”, from Smith (1997). The Forward DFT takes the time domain signal x[ ] (N samples, running from 0 to N−1) to the frequency domain signals Re X[ ] (cosine wave amplitudes) and Im X[ ] (sine wave amplitudes), each N/2+1 samples long, running from 0 to N/2, and collectively referred to as X[ ]. The Inverse DFT transforms from the frequency domain back to the time domain.]

Figure 2.5: Time Domain to Frequency Domain using the DFT

If we take a time domain signal x, containing N samples, the frequency domain representation is X, which consists of two parts - real and imaginary. Each of these parts is of size N/2 + 1. We denote them as ReX and ImX respectively. The values in the real and imaginary parts of the frequency domain represent the amplitudes of the constituent cosine and sine waves respectively. This is illustrated in figure 2.5.

The output of the DFT is a set of numbers representing amplitudes. However, these amplitudes must be associated with the correct functions in order to be useful. The functions with which the amplitudes are associated are known as the basis functions of the DFT, and are most commonly cosine and sine, with unity amplitude. The basis functions must be orthogonal. When the frequency domain is assigned to the basis functions, the result is scaled cosine and sine waves which can be added to form a time domain signal. It is possible to decompose a signal into other orthogonal basis functions. Much of the time this is not very useful, although it does form the basis of a noise reduction technique presented in section 2.5.3.

The basis functions are discrete, and N points in length (such that when they are added, the signal length matches that of the original time domain representation). They are defined by the following equations:

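$$c_k[i] = \cos\left(\frac{2\pi k i}{N}\right), \quad i = 0 \text{ to } N-1 \tag{2.3}$$

$$s_k[i] = \sin\left(\frac{2\pi k i}{N}\right), \quad i = 0 \text{ to } N-1 \tag{2.4}$$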

where $c_k$ is the cosine wave to be assigned to the amplitude at position k of the real part of X, and $s_k$ is the sine wave to be assigned to the amplitude at position k of the imaginary part of X.

For an N point Discrete Fourier Transform, k will take values between 0 and N/2.

Although we have referred to the output of the DFT as being the amplitudes of the signal, in reality, the amplitudes needed to resynthesise the original signal (which we will refer to as ImX′ and ReX′) are different from the frequency domain of the signal (ImX and ReX). The reasons for this can be found in (Smith, 1997, page 156). In order to acquire the proper amplitudes, we must normalise the frequency domain. That is to say:

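$$\mathrm{Re}X'_k = \frac{\mathrm{Re}X_k}{N/2} \tag{2.5}$$

$$\mathrm{Im}X'_k = -\frac{\mathrm{Im}X_k}{N/2} \tag{2.6}$$

except for two cases:

$$\mathrm{Re}X'_0 = \frac{\mathrm{Re}X_0}{N} \tag{2.7}$$

$$\mathrm{Re}X'_{N/2} = \frac{\mathrm{Re}X_{N/2}}{N} \tag{2.8}$$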

The signal can then be resynthesised using the equation:

$$x_i = \sum_{k=0}^{N/2} \mathrm{Re}X'_k \cos\left(\frac{2\pi k i}{N}\right) + \sum_{k=0}^{N/2} \mathrm{Im}X'_k \sin\left(\frac{2\pi k i}{N}\right) \tag{2.9}$$


2.4 History of Audio Restoration

Much of this history has been adapted from (Godsill and Rayner, 1998). However, it has been augmented by other sources which are noted where appropriate.

Since the earliest days of recordings, engineers have tried to reduce the amount of noise present on their recordings. However, between the introduction of electronic recording methods and digital media, the recording medium itself was often the limiting factor in terms of noise floor. Large volumes of recordings still exist on poor quality, or in many cases decaying media. Whilst transfer into the digital domain is simple, and prevents further decay, noise removal is a more difficult proposition. In this section we shall investigate some of the major technical developments in terms of hardware and software.

2.4.1 Beginnings of Forensic Audio

The origins of taking audio and processing it to improve clarity can be traced to World War 2, during which it was important to be able to identify and understand voices transmitted by radio (CEDAR Audio Ltd, 2008). This requirement led to the development of the spectrogram - a plot of the energy across various frequency ranges of a signal as it changes over time. It is created by calculating the frequency spectrum over windowed frames of a signal. An example can be seen in figure 2.6, and another, more interesting example in figure 2.7.

When the technique was first developed, audio equipment was analogue. Consequently, the technique used to produce a spectrogram worked with a continuous signal and used bandpass filters. Using this method, the frequency range of the signal is divided up into equal sections using filtering, and the magnitude of each filter’s output is recorded as a function of time. The plot could then be used to determine which filters would be necessary to remove extraneous noise from a single recording, as well as for comparing two different signals for frequency content over time (Godsill et al., 1997).

2.4.2 Analogue and Tape-Based Restoration

For as long as magnetic tape has existed, it has been possible to remove individual clicks and pops from a recording by simply splicing the tape at the point the click occurs, and re-joining it with the section containing the click missing. It should be noted, however, that this method is also notorious for creating aberrations in the audio waveform which manifest themselves as clicks, thus possibly substituting two clicks for one! Analogue filters were often used to remove background hiss - generally the bandwidth of older media such as wax cylinders is so low that cutting out high frequencies is of little consequence to the signal itself.

Other methods for reducing clicks in the analogue era involved electronic circuits which detected clicks using a high pass filter, and then low pass filtered the signal at appropriate points to make an attempt at removing them (Carrey and Buckner, 1976). This method tends to interfere with the low frequency content of the signal, however.

Figure 2.6: A spectrogram, showing time on the horizontal axis, frequency on the vertical axis, and using colour to show signal intensity.

Figure 2.7: A spectrogram of a portion of “Equation” (Aphex Twin).

2.4.3 Digital Systems

Since microprocessors became fast enough to handle the signal processing operations required for audio restoration, research groups at various American universities, as well as at Cambridge, have worked on digital restoration techniques (Godsill and Rayner, 1998, pages 5-6). Many of those groups then went on to found commercial organisations, the most prominent example of which is CEDAR.

Some of the earliest work was to enhance recordings of Italian tenor Caruso (Caruso, 2007), who was one of the pioneers of recorded sound. Whilst it was found during this work that it was possible to extract the vocals to a very high standard, the accompanying orchestra was indistinguishable from noise. This produced the somewhat interesting effect of Caruso singing a cappella, and can largely be explained by the recording process; both the vocalist and the orchestra were captured using a single horn (Stockham, Cannon and Ingebretsen, 1975).

When dealing with material recorded using that method, Stockham et al. (1975) note the following:

Contrary to the popular concept concerning old recordings, whether they be acoustic or electric, the problem of surface noise or scratches is not the most important. While this form of degradation is immediately obvious when playing an old recording, it is generally not the major difficulty that listeners complain about, at least where collector-quality copies are concerned. For acoustic recordings the major problem seems to be the resonant or reverberant characteristic given to the musical instruments or vocal sound by the primitive recording horns which were used to focus the sound energy onto the original wax discs.

While it is well known that these acoustic mechanisms were incapable of transcribing frequencies much below 200Hz or above 4kHz, these frequency limitations alone do not account for the degree of the degradation produced.

Stockham et al. (1975) go on to suggest that this can be verified by taking a high quality digital recording and subjecting it to high-cut and low-cut filters which impart the same frequency response as the original recordings in question. It is also noted that the distortions produced are not of fixed character between recordings made.

However, for recordings made using later analogue equipment, frequency response is not generally a problem. This can be observed by listening to many records produced since the late 1950s.

The earliest PC-based systems were built as a result of the British Library National Sound Archive funding a research project at Cambridge University in 1983, with early prototypes working in 1987. The first commercial systems were released by CEDAR in 1990 as a plugin card and software for conventional personal computers, and standalone hardware units soon followed. Since the products are commercial, and expensive, little information about the algorithms used is available. Indeed, most of the products run only on the company’s own hardware³.

Many different packages, including Adobe Audition⁴ (Adobe Systems Inc., 2008), Sound Forge (Sony Corporation, 2008) and Soundtrack Pro (Apple Inc., 2008) (and by extension Logic) include tools aimed at reducing noise. Again, these are commercial packages. In the open source space, Audacity (Audacity Development Team, 2008) features tools for removing constant background noise (the technique for which we shall examine shortly). For more comprehensive noise removal, open source package Gnome Wave Cleaner (Welty, 2008) is often used.

2.5 Algorithms

We will now examine some of the non-proprietary algorithms for noise removal. The algorithms we intend to implement are, however, covered separately in chapter 3.

2.5.1 Debuzzing

As stated earlier, buzz is normally caused by regular impulsive disturbance in the signal, which can be produced by a variety of things. Buzzes can be treated either in the time or frequency domains.

We will first examine frequency domain removal techniques, the simplest of which is a high pass filter. A high pass filter can be used to remove harmonically simple hums, such as those often caused by mains interference. The fundamental mains frequency is 50Hz across most of Europe, and 60Hz in much of the rest of the world; eliminating it with a simple high pass filter would therefore remove everything at or below that frequency. This has the unfortunate result of removing some of the desired sound, making it inappropriate for many signals. A high pass filter is, in any case, not suited to removing buzzes with complex harmonics.

A better technique is to use a comb filter. This is one of the most common techniques used in restoration tools, although still not ideal since the removal of the signal component of so many frequencies can cause noticeable degradation in some types of source material.
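One simple form of comb filter, given here as an illustrative sketch rather than as the design used by any particular restoration tool, subtracts from each sample the sample one hum period earlier. The nulls of this filter fall at 0Hz and at every multiple of the hum fundamental. The function name is hypothetical, and the delay is assumed to be a whole number of samples (for example 882 samples for 50Hz hum at a 44.1kHz sampling rate).

```c
#include <stddef.h>

/* Feedforward comb filter: y[i] = x[i] - x[i - delay].
 * Samples before the start of the buffer are treated as zero.
 * x and y must not overlap. */
void comb_filter(const double *x, double *y, size_t n, size_t delay)
{
    for (size_t i = 0; i < n; i++) {
        double delayed = (i >= delay) ? x[i - delay] : 0.0;
        y[i] = x[i] - delayed;
    }
}
```

Any component which repeats exactly every delay samples is cancelled once the first period has passed. Wanted signal at the same frequencies is cancelled too, which is the degradation noted above.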

Undoubtedly the best available tool for removing buzz is from CEDAR. We include as a matter of interest the spectral patterns of a piece of audio suffering from complex harmonic buzz, and the resulting spectral plot after removal with a comb filter and using the CEDAR proprietary algorithm. This can be seen in figure 2.8.

³ With the exception of CEDAR for SADiE, a rather bizarre system which offered no controls to the user.

⁴ Formerly Cool Edit Pro.


(a) Original audio with harmonic buzz

(b) After processing with a comb filter

(c) After processing with the CEDAR debuzzing algorithm

Figure 2.8: Debuzzing with a comb filter and the CEDAR algorithm. Frequency plots.


2.5.2 Denoising

Although a rather general term, we intend here to talk about removal of broadband noise, and various methods for achieving this. Broadband noise is most intrusive at high frequencies, and categorisable as a global degradation.

The simplest technique, as one may imagine given the nature of broadband noise, is a low pass filter, removing any of the signal above the cut off frequency of the filter. This can be used to great effect on recordings with a narrow bandwidth (wax cylinders through to 78rpm records being good examples). However, even on these recordings the removal of the high frequency content is noticeable and undesirable. Dynamic filters with a cut off which varies depending on the signal can also be used.

A more sophisticated technique is to use an expander or a multi-band expander, which provides progressive gain reduction once the signal falls below some threshold. Whilst an expander working across the whole signal is slightly better than using a simple cut-off noise gate, the effect is similar, and produces results which do not sound especially pleasant. Multi-band expanders are an improvement, however.

All of the techniques mentioned so far could be implemented either digitally or in analogue hardware. A technique which is really only applicable to digital signal processing is spectral subtraction. The signal can be split into thousands of bands, so it is possible to remove very precise frequencies. A passage of signal with no music in it can be used to measure a noise “fingerprint”, which is then subtracted throughout. This method is not perfect, however, because the noise fingerprint is just a snapshot of random noise which is only accurate at the place where it is taken. Results are often very good, however.
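The subtraction step itself is simple once a frame of audio has been converted into per-band magnitudes. The following sketch assumes that the magnitudes of the frame and of the noise fingerprint have already been computed (for example from a DFT); the function name is hypothetical, and practical implementations usually add smoothing which is not shown here.

```c
#include <stddef.h>

/* Subtract a noise magnitude estimate from each frequency band of one frame.
 * mag:   magnitudes of the current frame, modified in place.
 * noise: the noise "fingerprint" measured from a passage with no music.
 * bands: number of frequency bands in both arrays. */
void spectral_subtract(double *mag, const double *noise, size_t bands)
{
    for (size_t k = 0; k < bands; k++) {
        double m = mag[k] - noise[k];
        mag[k] = (m > 0.0) ? m : 0.0;   /* a magnitude cannot be negative */
    }
}
```

Because the fingerprint is only a snapshot, some bands of a later frame will contain more noise than was measured and some less, which is why the result is not perfect.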

2.5.3 Berger, Coifman and Goldberg

Berger, Coifman and Goldberg (1994) proposed a denoising algorithm in their 1994 paper, “A Method of Denoising and Reconstructing Audio Signals”. The algorithm works with a library of orthonormal bases into which the signal can be decomposed. Each basis is assigned a cost measuring the efficiency of the representation of a signal in that basis.

The basis with the lowest cost is chosen, and its coefficients are ordered by magnitude. A number of terms to keep as a coherent part of the signal is then chosen. The residual terms are considered to be noise, although it is often useful to run this algorithm iteratively, taking the part determined to be noise by one iteration as the input for the next iteration.

Having extracted the coherent parts of the input of each iteration, they can be added together to synthesise the coherent part of the signal as a whole. The paper suggests that it may be possible to improve the results of the algorithm by applying it to frequency bands rather than the signal as a whole.

Although the algorithm is reported as working well in passages containing music, it is noted as failing in passages which consist primarily of noise, instead understandably detecting much of the noise as coherent signal.

2.5.4 Declicking

Perhaps the most important area to examine for our purposes is that of declicking. Several methods exist for repairing clicks; however, the significant challenge to address is that of detection (which we examine in chapter 3).

There are a number of techniques applicable to both analogue and digital domain work. The simplest is to use an attenuator, which mutes the area affected by a click with a high speed fade out and fade in. Although the impact of the click is reduced, the “repair” is easily perceptible. Another method is to use two sources of the same signal, for example, playing a mono record with a stereo cartridge. The signals are monitored, and switched at appropriate points to give the cleanest output. This method is often good for removing large clicks, although it is obviously only applicable to mono signals in this form. It is, however, the most sophisticated analogue click removal method. Again, we include for illustration some plots from CEDAR’s marketing materials, which can be seen in figure 2.9.

Moving into the digital domain provides far more scope. The first technique is the “Sample and Hold” approach, which is similar to the attenuator system mentioned previously. Rather than muting the area surrounding a click, however, the last good sample is held for the duration of the click. This is the method used for error correction in most compact disc players, and often gives rise to irksome distortions.
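A sample and hold repair can be sketched in a few lines of C. The function name and the convention of describing a click by inclusive start and end indices are assumptions made for illustration.

```c
#include <stddef.h>

/* Replace samples s[start..end] (inclusive) with the last good sample
 * before the click. If the click begins at the very first sample there is
 * no good sample to hold, so silence is used instead. */
void sample_and_hold(double *s, size_t start, size_t end)
{
    double held = (start > 0) ? s[start - 1] : 0.0;
    for (size_t i = start; i <= end; i++)
        s[i] = held;
}
```

The repaired region is flat, so the waveform jumps abruptly back to the true signal at the end of the click, which is one source of the distortion mentioned above.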

An alternative is to perform linear interpolation. This takes into account the last good sample before the click, and the first good sample following the click. The click is replaced by samples representing a straight line between these. A reduction in bandwidth over the repaired region is still evident, although results are much better than the sample and hold approach.
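Linear interpolation across a click can be sketched as follows. As before, the function name and the inclusive start and end indices are illustrative assumptions; the click is assumed to have at least one good sample on either side.

```c
#include <stddef.h>

/* Replace samples s[start..end] (inclusive) with points on the straight
 * line joining s[start - 1] and s[end + 1]. Requires start >= 1 and
 * end + 1 to be a valid index. */
void linear_interpolate(double *s, size_t start, size_t end)
{
    double before = s[start - 1];
    double after = s[end + 1];
    size_t steps = end - start + 2;   /* intervals between the two good samples */
    for (size_t i = start; i <= end; i++)
        s[i] = before + (after - before) * (double)(i - start + 1) / (double)steps;
}
```

Unlike sample and hold, the repaired region meets the good samples on both sides without a jump, although all detail within the region is still lost.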

A better approach is to use a more sophisticated prediction technique based around a model of the samples leading up to the click. The method implemented is discussed in chapter 3. The general name for such methods is high order interpolation.

Having examined some of the history, background theory, academic work and the availability of existing products, we will now go on to describe the algorithms we intend to implement.


(a) Original Signal

(b) The “Sample and Hold” method

(c) Linear interpolation

(d) Higher order interpolation

Figure 2.9: Methods of replacing samples containing clicks.


Chapter 3

Design

3.1 Overall Architecture

We have decided to implement our program as a standalone piece of C code rather than a plugin for an existing audio editor or system. The primary reason for this is that the libsndfile library for reading from and writing to audio files is significantly easier to learn than the plugin interfaces for any of the major audio editors. Whilst it could have been implemented as either a VST plugin or an Apple Audio Unit, neither of these provides adequate documentation for the offline processing features which would be needed.

We have chosen to produce only a command line interface, both because of the limited number of parameters which must be passed to the program and the availability of much better audio editors than we could hope to write! For the intended purposes of the program this will not present us with a problem.

The basic process is shown in Figure 3.1. The reading and writing stages are straightforward, and not discussed further, although some of the details which may be useful when examining the source code are described in chapter 4.

3.2 Detecting clicks in the Time Domain

Our first algorithm for detecting the locations of clicks works entirely in the time domain, although it works on the assumption that clicks contain more high-frequency content than most wanted signal. For this reason, the signal is run through a high pass filter prior to processing. The algorithm takes a parameter S which determines the sensitivity. This is discussed in more detail shortly.

Read audio from wave file

Run appropriate click detection algorithm

Run click repair algorithm

Output list of detected clicks

Write repaired audio to wave file

Figure 3.1: Basic click detection and removal process

The algorithm works on blocks of audio, of size l. The high pass filter used is fairly standard. Its equation is shown in equation 3.1. A sample of audio which has been run through the high pass filter can be heard on track 3 - listening to this is useful in determining that passing the audio through it is of use! The code used to generate this output can be found in appendix C.2.

$$h_i = \sqrt{\frac{1}{N-2} \sum_{j=i-N}^{i+N} \left(s_{j-1} - 2s_j + s_{j+1}\right)} \tag{3.1}$$

The value taken for N can vary - a typical value is 8. We then compute the mean of the output of the high pass filter, $\bar{h}$. We also compute $d_j = \frac{s_{j+1} - s_{j-1}}{2}$ for $j = (0 \ldots l)$, which is the first derivative of the change in high frequency content, and the mean and standard deviation of this, $\bar{d}$ and $d_\sigma$.

In order to perform actual click detection in the block, we move backwards through it, testing whether or not the following is true for each sample in h:

$$d_j > \frac{3d_\sigma}{S} + \bar{d} \quad \text{and} \quad h_j > \frac{3\bar{h}}{S} \tag{3.2}$$

If this is true, the algorithm is likely to have found the end of a click (in terms of forward playing time), and must now locate the start of it. We continue moving backwards, testing each sample for the following condition:

$$h_j < \frac{3d_\sigma}{S} \tag{3.3}$$

If this condition is met, the algorithm is likely to have found the start of the click.
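The backward scan described above might be sketched as follows, given arrays h and d already computed for the block together with their statistics. The Click structure and the function name are hypothetical and are not the data structure used in the project.

```c
#include <stddef.h>

/* Hypothetical record of one detected click (inclusive sample indices). */
typedef struct {
    size_t start;
    size_t end;
} Click;

/* Scan a block backwards, applying conditions 3.2 and 3.3.
 * h, d: high pass measure and derivative for each of the len samples.
 * Returns the number of clicks written to out (at most max_clicks). */
size_t detect_clicks(const double *h, const double *d, size_t len,
                     double h_mean, double d_mean, double d_sigma,
                     double S, Click *out, size_t max_clicks)
{
    size_t count = 0;
    size_t j = len;
    while (j > 0 && count < max_clicks) {
        j--;
        /* Condition 3.2: probable end of a click. */
        if (d[j] > 3.0 * d_sigma / S + d_mean && h[j] > 3.0 * h_mean / S) {
            size_t end = j;
            /* Keep moving backwards until condition 3.3 holds. */
            while (j > 0 && !(h[j] < 3.0 * d_sigma / S))
                j--;
            out[count].start = j;
            out[count].end = end;
            count++;
        }
    }
    return count;
}
```

Increasing S lowers every threshold, so more samples are reported as clicks.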


The reason we move backwards can be seen when considering, for example, the crash of a cymbal. Working forwards through the samples, the initial crash has a lot more high frequency content than the preceding sample. However, the sound tails off, resulting in a gradual change over time, something which is generally not observed in clicks. Thus moving backwards through the samples is less likely to lead to false positives being detected than moving forwards.

Signals formed from random processes usually have a bell shaped probability distribution function (Smith, 1997, page 26). This means that approximately 99.7% of the sample values are expected to lie within 3 standard deviations of the mean, and thus those falling outside are likely to be unwanted disturbances. The threshold of 3 standard deviations is then divided by the sensitivity S to allow finer control over whether or not a sample is detected as a click.

3.3 Detecting clicks in the Frequency Domain

The frequency domain algorithm we have chosen to examine again works on blocks of audio, this time windowed using a Blackman window, and then converted into the frequency domain using the discrete Fourier transform.

The algorithm proceeds first by calculating the power spectrum of the signal.

This is calculated by applying the equation:

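$$p_j = \left(\mathrm{Re}S_j\right)^2 + \left(\mathrm{Im}S_j\right)^2 \tag{3.4}$$

in all but two cases. These cases are:

$$p_0 = \left(\mathrm{Re}S_0\right)^2 \tag{3.5}$$

and

$$p_{N/2} = \left(\mathrm{Re}S_{N/2}\right)^2 \tag{3.6}$$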
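The power spectrum calculation can be sketched as follows. The function name is hypothetical, N is assumed to be even, and the real and imaginary parts are assumed to be held in separate arrays of N/2 + 1 values each.

```c
#include <stddef.h>

/* Power spectrum of an n point real DFT.
 * re, im: real and imaginary parts (n/2 + 1 values each).
 * p:      receives n/2 + 1 power values. n is assumed to be even. */
void power_spectrum(const double *re, const double *im, size_t n, double *p)
{
    for (size_t j = 0; j <= n / 2; j++)
        p[j] = re[j] * re[j] + im[j] * im[j];
    /* The first and last bins use the real part only. */
    p[0] = re[0] * re[0];
    p[n / 2] = re[n / 2] * re[n / 2];
}
```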

We then assign a value between -127.0 and +127.0 depending on the power.

Following this, the mean of the sample range is calculated before detection starts. We proceed forwards through the frequency domain coefficients, determining whether or not the change in power in higher frequencies is substantial, by comparing to a threshold value. If it is, we determine that we have found the start of a click. We then search for the signal to return to the mean to find the end of the click.

This method is good at detecting noises which cause sympathetic problems - for example a piece of dirt in the groove of a record may cause the needle to bounce. Because the frequency content of this is likely to be different from the underlying signal, we are able to detect much of the damaged area.


3.4 Fixing Clicks using LSAR

Having identified the locations of clicks in the waveform using one of the algorithms detailed previously, we can use a linear prediction method in order to estimate what the sample values should be during the damaged sections. A linear prediction model allows the estimation of future values of a discrete time signal as a function of previous samples.

The auto-regressive model represents a given value of a signal as a weighted sum of a number of previous values and an error term, as follows:

$$s_n = \sum_{i=1}^{P} s_{n-i} a_i + e_n \tag{3.7}$$

where:

$s_n$ represents the signal value at sample n.

$e_n$ represents the error term at sample n.

$a_i$ are the weights (the coefficients of the model).

P is the order of the model, that is, how many previous samples to use (recall from the earlier discussion of click repair the concept of high order interpolation).

This can be rearranged to read:

$$e_n = s_n - \sum_{i=1}^{P} s_{n-i} a_i \tag{3.8}$$
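The error term of the auto-regressive model can be computed directly from the weighted sum of equation 3.7, as in this sketch. The function name is hypothetical, and the coefficient array is assumed to be zero-based, so that a[0] holds the weight applied to the most recent previous sample.

```c
#include <stddef.h>

/* Prediction error at sample n for an order P auto-regressive model:
 * the actual value minus the weighted sum of the P previous samples.
 * a[i - 1] is the weight for the sample i places back. Requires n >= P. */
double ar_error(const double *s, size_t n, const double *a, size_t P)
{
    double prediction = 0.0;
    for (size_t i = 1; i <= P; i++)
        prediction += s[n - i] * a[i - 1];
    return s[n] - prediction;
}
```

When the model describes the signal well the error is small, so an unusually large error at some sample suggests that the sample does not belong to the underlying signal.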

This is known to be a good representation of many stationary linear processes. Most audio signals are generally non-stationary; however, for modelling purposes it is often assumed that they are, at least in the short term. Because of this, the model must be implemented in blocks of an appropriate length - generally between 500 and 2000 samples when working with a signal sampled at 44.1kHz (Godsill et al., 1997).

Choosing an appropriate value for P, the order of the model, is important - for complex waveforms of mixed musical signals a value of 100 or greater may be required to give sufficient accuracy, whilst around 25 can be enough for simpler waveforms made up of single instruments. For most practical modelling scenarios, P is fixed high enough to represent the most complex signal likely to be encountered.

Having decided to use an autoregressive model, it is necessary to calculate the coefficients $a_i$. We use a straightforward least-squares method and modify s such that samples which were detected as constituting a click have their value set to 0.

The model can then be rewritten in vector notation as:

$$e = As \tag{3.9}$$

We then construct matrix A such that As produces the vector e as in equation 3.8. The matrix is constructed according to the following rules:

1. The size of A is (N − P) × N, where N is the number of samples in s.

2. The (j − P)th row is constructed such that:

$$e_j = s_j - \sum_{i=1}^{P} s_{j-i} a_i \tag{3.10}$$

Having constructed this matrix, we take the indices corresponding to missing samples in s, and create a new matrix consisting of the columns of A with these indices. Let this new matrix be known as $A_u$.

We can then obtain a solution by minimising the sum of squares $E = e^T e$ with respect to the unknown samples $s_u$ (Veldhuis, 1992). The solution is:

$$s_u = -\left(A_u^T A_u\right)^{-1} A_u^T A s \tag{3.11}$$

which is the minimum-variance unbiased estimator for the missing samples.
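The following is a minimal sketch of the interpolation step, not the project's implementation (which was not completed). It assumes the model coefficients are already known and are stored zero-based, sets the missing samples to 0 as described above, and solves the normal equations by Gaussian elimination rather than forming the inverse explicitly. All names are hypothetical.

```c
#include <math.h>
#include <stddef.h>
#include <stdlib.h>

/* Entry (r, c) of A: row r corresponds to sample j = r + P. */
static double a_entry(const double *a, size_t P, size_t r, size_t c)
{
    size_t j = r + P;
    if (c == j)
        return 1.0;
    if (c < j && j - c <= P)
        return -a[j - c - 1];
    return 0.0;
}

/* Replace the m samples of s (length n > P) listed in missing with their
 * least squares AR estimates. Returns 0 on success, -1 on failure. */
int lsar_interpolate(double *s, size_t n, const double *a, size_t P,
                     const size_t *missing, size_t m)
{
    if (m == 0)
        return 0;
    if (n <= P)
        return -1;
    size_t rows = n - P, w = m + 1;
    double *M = calloc(m * w, sizeof *M);    /* [Au^T Au | -Au^T A s] */
    double *As = calloc(rows, sizeof *As);
    if (M == NULL || As == NULL) { free(M); free(As); return -1; }

    for (size_t u = 0; u < m; u++)
        s[missing[u]] = 0.0;                 /* unknown samples set to 0 */
    for (size_t r = 0; r < rows; r++)
        for (size_t c = r; c <= r + P; c++)
            As[r] += a_entry(a, P, r, c) * s[c];

    for (size_t u = 0; u < m; u++)
        for (size_t r = 0; r < rows; r++) {
            double au = a_entry(a, P, r, missing[u]);
            if (au == 0.0)
                continue;
            for (size_t v = 0; v < m; v++)
                M[u * w + v] += au * a_entry(a, P, r, missing[v]);
            M[u * w + m] -= au * As[r];
        }

    int status = 0;
    for (size_t col = 0; col < m; col++) {   /* elimination with pivoting */
        size_t piv = col;
        for (size_t r = col + 1; r < m; r++)
            if (fabs(M[r * w + col]) > fabs(M[piv * w + col]))
                piv = r;
        if (fabs(M[piv * w + col]) < 1e-12) { status = -1; break; }
        for (size_t c = 0; c < w; c++) {
            double t = M[col * w + c];
            M[col * w + c] = M[piv * w + c];
            M[piv * w + c] = t;
        }
        for (size_t r = col + 1; r < m; r++) {
            double f = M[r * w + col] / M[col * w + col];
            for (size_t c = col; c < w; c++)
                M[r * w + c] -= f * M[col * w + c];
        }
    }
    if (status == 0)
        for (size_t u = m; u-- > 0; ) {      /* back substitution */
            double x = M[u * w + m];
            for (size_t v = u + 1; v < m; v++)
                x -= M[u * w + v] * s[missing[v]];
            s[missing[u]] = x / M[u * w + u];
        }
    free(M);
    free(As);
    return status;
}
```

As a check, if s follows the model exactly (for instance each sample being half of the previous one, with a single coefficient of 0.5), the missing samples are recovered exactly, because the true values give zero error in every row.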

This method has been shown to work well where unknown data appears in contiguous areas, with each area separated by at least P good samples (Janssen, Veldhuis and Vries, 1986). It works less well with longer missing passages, as should be expected with an interpolation of the result of a random process. However, the results can be improved using higher order models, producing good results in many cases.

Two improvements are suggested by Godsill et al. (1997). The first is to modify the model to include information about the pitch period. The second deals with the energy of the signal. The least squares auto-regressive model aims to minimise the energy of the predicted sections, which may result in the repaired signal having considerably lower energy than the surrounding sections. A method aimed at addressing this problem is presented by Rayner and Godsill (1991).


Chapter 4

Implementation

4.1 Introduction

This chapter aims to document exactly what was produced, and to provide a guide to some of the libraries used.

Unfortunately, time constraints and programming errors meant that not as much was implemented as was originally intended. Two click detection algorithms have been implemented, along with the supporting code needed to run them. However, there was no fully working implementation of the least squares autoregressive repair algorithm described in chapter 3.

The code for the implementation is given in Appendix C. The raw output results can be seen in Appendix A.

The algorithms implemented did show promise, and exhibited different characteristics to one another. This is discussed further in the next chapter.

4.2 Code Organisation

Code is split across three files. The purpose of each of these is detailed below:

• SoundFileHandling.c - this file contains the majority of code relating to opening files and filling audio buffers using the libsndfile library.

• ClickList.c - this file contains the data structure and related functions which store the output of the click detection algorithms. It was intended that this structure would serve as the input to the LSAR algorithm.

• vclean.c - this file contains the majority of the code, including the point of entry.
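As an illustration of the kind of structure ClickList.c is described as holding, the following is a minimal sketch of a growable list of detected clicks. It is not the project's code: the type and function names (`Click`, `ClickList`, `click_list_init`, `click_list_append`, `click_list_free`) are hypothetical, and the fields are assumed from the columns reported in Appendix A (click start, click end and threshold, with the length being end minus start).

```c
#include <stdlib.h>

/* One detected click. Field names are assumptions based on Appendix A. */
typedef struct {
    long   start;      /* sample index at which the click starts */
    long   end;        /* sample index at which the click ends */
    double threshold;  /* detection value reported for the click */
} Click;

typedef struct {
    Click *items;
    size_t count;
    size_t capacity;
} ClickList;

void click_list_init(ClickList *list)
{
    list->items = NULL;
    list->count = 0;
    list->capacity = 0;
}

/* Appends a click, doubling the storage when full. Returns 0 on success,
   -1 if memory could not be allocated (the list is left unchanged). */
int click_list_append(ClickList *list, long start, long end, double threshold)
{
    if (list->count == list->capacity) {
        size_t new_capacity = list->capacity ? list->capacity * 2 : 16;
        Click *grown = realloc(list->items, new_capacity * sizeof *grown);
        if (!grown) return -1;
        list->items = grown;
        list->capacity = new_capacity;
    }
    list->items[list->count].start = start;
    list->items[list->count].end = end;
    list->items[list->count].threshold = threshold;
    list->count++;
    return 0;
}

void click_list_free(ClickList *list)
{
    free(list->items);
    click_list_init(list);
}
```

A contiguous array is used here because a repair stage would most likely walk the clicks in order. A linked list would serve equally well if clicks needed to be merged or removed frequently.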


4.3 FFTW Library

Throughout the implementation it was necessary to compute discrete Fourier transforms (DFT). We chose the “Fastest Fourier Transform in the West” (FFTW) library (FFTW Home Page, 2008) for the relative simplicity of its programming interface and its noted good performance on a wide variety of machines. A detailed paper on the design and implementation of version 3 of the library has been published (The Design and Implementation of FFTW3, 2005); however, we summarise the key points about using the library in this section for the convenience of the reader.

The FFTW library uses a planner to adapt its algorithms in order to maximise performance on given hardware. The steps taken when performing a transform are as follows:

1. Allocate memory for input and output arrays
The library provides an allocation function, fftw_malloc, which ensures that memory is allocated in a way conducive to allowing optimising technologies such as SSE (x86) and Altivec (PowerPC) to work properly.

For the real-to-real DFT used here, both the input array and the output array are arrays of type double. Note that the size of the output array relative to that of the input array must be taken into account: the halfcomplex (FFTW_R2HC) transform planned in Listing 4.1 writes as many values as it reads, whereas FFTW's real-to-complex interface produces n/2 + 1 complex values for n real inputs.

2. Prepare the transform plan
As mentioned, the library uses a planner to adapt its algorithms based on a number of parameters, including the size and memory locations of the input and output arrays, the types of transform to be performed, and whether or not the transform should be in place.

The planner can run in two modes: measured or estimated. Measured mode is best used when initialisation time is unimportant but executing the transform as quickly as possible is critical. Measured mode also destroys any data which already exists in the input arrays at the time the planner is executed. Estimated mode takes less time to initialise, however it is not guaranteed that the planner will produce the optimal plan for the given transform. Throughout the implementation, we use estimated mode.

The code used to prepare the transform plan is as follows:

Listing 4.1: Code used to prepare the transform plan

    fftw_plan leftChannelPlan;
    /* Plan a real-to-halfcomplex transform of FFT_SIZE samples,
       reading from timeInput and writing to fftOutput */
    leftChannelPlan = fftw_plan_r2r_1d(FFT_SIZE, timeInput, fftOutput,
                                       FFTW_R2HC, FFTW_ESTIMATE);


3. Execute transform
The transform is performed according to the plan using the following function:

Listing 4.2: Code used for executing transform

    void fftw_execute(const fftw_plan plan_to_execute);
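Once the plan has been executed, the output array must be interpreted correctly. For an n-point FFTW_R2HC transform, FFTW stores the result in its halfcomplex layout: the real parts r_0, r_1, ..., r_{n/2} come first, followed by the imaginary parts in reverse order, ending with i_1. The helper below is a small sketch of how one frequency bin might be read from such an array. The name `halfcomplex_power` is hypothetical, and the function deliberately uses no FFTW calls, so it only illustrates the indexing.

```c
/* Returns the power |X_k|^2 of frequency bin k (0 <= k <= n/2) from the
   output hc of an n-point FFTW_R2HC transform. */
double halfcomplex_power(const double *hc, int n, int k)
{
    double re = hc[k];
    /* Bin 0 and, for even n, bin n/2 are purely real, so no imaginary
       part is stored for them. */
    double im = (k == 0 || 2 * k == n) ? 0.0 : hc[n - k];
    return re * re + im * im;
}
```

In the context of Listing 4.1, such a helper would be called as halfcomplex_power(fftOutput, FFT_SIZE, k) after fftw_execute has returned.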

4.4 libsndfile Library

In order to read from and write to a variety of sound file formats, we use the libsndfile library. This is an open source LGPL-licensed library which will read and write most formats (MP3 notwithstanding, owing to patent issues). A good description of the programming interface is available at the author’s website; however, we summarise the key points about using the library in this section for the convenience of the reader.

It should be noted that although many of the file formats supported by libsndfile support one or more channels, it was decided that, in order to reduce the amount of code required, our implementation would deal only with stereo files. The result of this is that it will most likely fail when used with single channel files, and produce bizarre results when used with files with more than two channels.

4.4.1 Basic Types

A frame is one sample of audio data across all channels.

When using a file, it is opened for reading, writing or both. Information about a sound file is stored in an SF_INFO structure, which is defined as follows:

Listing 4.3: The SF_INFO data structure

    typedef struct {
        sf_count_t frames;
        int samplerate;
        int channels;
        int format;
        int sections;
        int seekable;
    } SF_INFO;

If a file is being opened for reading, this structure is filled in based on the file opened. If a file is being opened for writing, it must be filled in before opening the file.


Upon opening a file, a pointer of type SNDFILE * is returned. This is an opaque structure which must be passed into every function which operates upon the file.

During compilation of the library, an appropriate type is chosen and defined as sf_count_t. In most cases, however, this can be treated as an integer.

4.4.2 Opening a file

In order to open a file, the following function must be used:

Listing 4.4: Code used for opening an audio file

    SNDFILE *sf_open(const char *path, int mode, SF_INFO *sfinfo);

The value of the mode parameter determines whether the file is opened for reading, writing or both, and should be set to:

• SFM_READ - Read only mode

• SFM_WRITE - Write only mode

• SFM_RDWR - Read and Write mode

If the file is being opened in write only mode, an SF_INFO structure pre-filled with appropriate values must be supplied. If the file is being opened in read only mode or read and write mode, it is recommended that the format field of this structure be set to 0 before the call; the library will then supply the values.

4.4.3 Closing a file

Each call to sf_open must be matched with a call to sf_close in order to free the resources used. The function signature is as follows:

Listing 4.5: Code used for closing an audio file

    int sf_close(SNDFILE *sndfile);

This returns 0 if the close is successful or an error value otherwise.

4.4.4 Reading samples from a file

There are two methods of reading samples from a file: by item and by frame. There are also four different data types between which the library will convert. We always read by frame, and in most cases read samples as the type double.


The four frame reading function signatures are as follows:

Listing 4.6: Functions used for reading from audio files

    sf_count_t sf_readf_short(SNDFILE *sndfile, short *ptr, sf_count_t frames);
    sf_count_t sf_readf_int(SNDFILE *sndfile, int *ptr, sf_count_t frames);
    sf_count_t sf_readf_float(SNDFILE *sndfile, float *ptr, sf_count_t frames);
    sf_count_t sf_readf_double(SNDFILE *sndfile, double *ptr, sf_count_t frames);

In each case, the parameter ptr is a pointer to an array which is appropriately sized to hold the number of frames requested by the frames parameter. Each function returns the number of frames successfully read, which should in general be equal to the number requested. If more frames are requested than are available, no error occurs, but the number actually read is returned.
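Because a frame holds one sample for every channel, a buffer filled by one of the frame reading functions contains interleaved data, and must have room for frames × channels items. For the stereo files handled here, the samples alternate left, right, left, right. The sketch below shows how such a buffer might be split into per-channel arrays before analysis. The function name `deinterleave_stereo` is hypothetical, and the function contains no libsndfile calls, so that it stands alone.

```c
#include <stddef.h>

/* Splits an interleaved stereo buffer (L0 R0 L1 R1 ...) holding `frames`
   frames into separate left and right arrays of `frames` samples each. */
void deinterleave_stereo(const double *interleaved, size_t frames,
                         double *left, double *right)
{
    for (size_t i = 0; i < frames; i++) {
        left[i]  = interleaved[2 * i];
        right[i] = interleaved[2 * i + 1];
    }
}
```

The frames argument should be the value returned by the read function rather than the number requested, since fewer frames may have been read near the end of the file.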

4.4.5 Writing samples to a file

Again, there are two methods of writing samples to a file: by item or by frame. There are four functions which can be used depending on the data type used within the program. These are:

Listing 4.7: Functions used for writing to audio files

    sf_count_t sf_writef_short(SNDFILE *sndfile, short *ptr, sf_count_t frames);
    sf_count_t sf_writef_int(SNDFILE *sndfile, int *ptr, sf_count_t frames);
    sf_count_t sf_writef_float(SNDFILE *sndfile, float *ptr, sf_count_t frames);
    sf_count_t sf_writef_double(SNDFILE *sndfile, double *ptr, sf_count_t frames);

Each function returns the number of frames written to the file.

4.5 Guide to Using the Program

The vclean program takes two command line parameters: -i to specify the input file, and -t if use of the time domain algorithm is desired. The input file should have two channels; results are undefined when this is not the case.


Chapter 5

Testing and Results

5.1 Source Material

In order to test our detection algorithms, we have selected a number of excerpts from audio signals which suffer from a variety of defects. It was decided that in order to properly test the performance of different click detection algorithms, the following factors should be taken into account:

1. How well the algorithm performs on a variety of sources containing no defects. Thisis a measure of detection of “false positives”.

2. How well the algorithm performs on sources with artificially added vinyl noise in known locations.

3. How well the algorithm performs on “real world” recordings taken from a vinyl source.

We were fortunate in having access to a large library of vinyl records, many of which were pressed during the 1960s and 1970s. From these, excerpts were chosen in order to provide a good match with the criteria set out above.

In order to satisfy the criteria listed above, it was necessary to generate a passage of known vinyl noise. This was assembled using Audacity from the run-in groove noise of several vinyl records. A small section of the waveform and spectrogram can be seen in figure 5.2, and the resulting audio can be heard by playing track 4 of the included audio CD.

In figure 5.2, the two largest transients can be clearly seen on the spectrogram, and heard as clicks. This diagram gives a good idea of a typical click containing approximately equal energy across the frequency spectrum. Another example of this can be seen in figure 5.1, along with the frequency scale. The frequency range of the audio is 0Hz-22.05kHz, since it was sampled at 44.1kHz.
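The upper limit of 22.05kHz is half the sampling rate. The same relationship gives the frequency represented by each bin of an n-point DFT of the audio: bin k corresponds to k × (sample rate) / n, so bin n/2 sits at the top of the range. A one-line sketch of this arithmetic follows; the function name `bin_frequency_hz` is hypothetical.

```c
/* Frequency in Hz represented by bin k of an n-point DFT of a signal
   sampled at sample_rate_hz. Meaningful for 0 <= k <= n/2. */
double bin_frequency_hz(int k, int n, double sample_rate_hz)
{
    return (double)k * sample_rate_hz / (double)n;
}
```

For example, with an assumed transform size of 1024 samples at 44.1kHz, bin 512 corresponds to 22.05kHz.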

Although it is simple to identify the clicks in both figure 5.1 and 5.2, things are not so straightforward in figure 5.3. This shows an excerpt from a recording taken from a CD (“Brothers In Arms” by Dire Straits) which does not contain any clicks; however, there are several points on the spectrogram which seem to match the profile of the clicks examined in the previous two figures. These are actually the result of “rimshots” on a snare drum or, in some cases, a hi-hat.

This clearly illustrates the problem of false positive click detection: visually and audibly, hi-hats in particular are very similar to clicks, and thus relying solely on an automated detection and removal technique can remove these wanted pieces of audio. This is perhaps an extreme example, however, since the song has very little high frequency content apart from the click-like peaks in question.

We have also chosen to examine two further pieces of music taken from digital sources. The first is a recording of Vivaldi’s “Cello Concerto in C minor”, written for a string group. This can be heard by playing track 6. The second is a recording of a solo electric guitar playing Chopin’s “Waltz in D flat major” (more commonly known as the “Minute Waltz”). These three pieces are used to test the detection algorithms against the first criterion.

In order to test against the second criterion, we take the vinyl noise previously mentioned (track 4) and mix it with each of the previously mentioned pieces in order to simulate material coming from vinyl. This is by no means representative of music actually taken from vinyl, since none of the distortion present on most vinyl recordings occurs. However, it does prove useful in determining how likely an algorithm is to detect clicks in the presence of music.
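The text does not say how the mixing was carried out, and it may well have been done in an audio editor. Purely as an illustration, a sample-wise mix could be sketched as below. The function name `mix_with_noise` and the gain parameter are hypothetical, and samples are assumed to be doubles normalised to the range -1.0 to 1.0.

```c
#include <stddef.h>

/* Adds noise (scaled by noise_gain) to music in place, clipping the result
   to the assumed normalised range [-1.0, 1.0]. Both buffers must hold at
   least `samples` values. */
void mix_with_noise(double *music, const double *noise, size_t samples,
                    double noise_gain)
{
    for (size_t i = 0; i < samples; i++) {
        double mixed = music[i] + noise_gain * noise[i];
        if (mixed > 1.0)  mixed = 1.0;
        if (mixed < -1.0) mixed = -1.0;
        music[i] = mixed;
    }
}
```

Because the positions of the clicks in the noise track are known, the output of a detector on the mixed signal can be compared directly with its output on the noise alone.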

In order to test against the third criterion, we decided to make use of three pieces. The first is a recording of “Jungle Dream” as performed in 1963 by Los Indios Tabajaras, and taken from a vinyl record pressed in that year. The vinyl disc was badly damaged, having been played many times over its forty-five year life, and stored for long periods of time in sub-optimal conditions. Although the recording is mono, it was transferred to the digital domain using a stereo record player. This resulted in a different noise pattern on each channel. The first thirty seconds of the recording, including the needle drop, can be heard by playing track 11 of the included audio CD.

A short section of one channel of the waveform and associated spectrogram is shown in figure 5.6. The second is a recording of “One Note Samba”, of a similar age and condition, but a different style. This is more challenging, partly as it is more badly damaged, and partly because it contains drums. The final recording is vocal accompanied by orchestra, and is an excerpt from “Honour, Riches, Marriage Blessing”, taken from an adaptation of Shakespeare’s “The Tempest”.


Figure 5.1: A spectrogram showing a click with frequency scale


Figure 5.2: Samples of vinyl noise made up from run-in groove noise of several records (Track 4). Waveform and Spectrogram.


Figure 5.3: “Brothers In Arms” (Dire Straits, 1985) (Track 5). Waveform and Spectrogram.


Figure 5.4: “Concerto in C minor” (St. Petersburg Radio & TV Symphony Orchestra, 1999) (Track 6). Waveform and Spectrogram.


Figure 5.5: “Waltz in D flat major” (Chet Atkins, 1957) (Track 7). Waveform and Spectrogram.


Figure 5.6: “Jungle Dream” (Los Indios Tabajaras, 1963) (Track 11). Waveform and Spectrogram.


Figure 5.7: “One Note Samba” (Stan Getz and Charlie Byrd, 1963) (Track 12). Waveform and Spectrogram.


Figure 5.8: “Honour, Riches, Marriage Blessing” (Woolfenden, 1978) (Track 13). Waveform and Spectrogram.


5.2 Results

The raw readouts can be seen in appendix A. The outputs of the click detection algorithms are not in themselves especially tangible without subsequently repairing what is detected. See chapter 6 (Conclusions) for more about this.

Several things can be noted about the results:

1. In all cases, the time domain detection algorithm detected more clicks than the frequency domain algorithm.

2. This did, however, result in more false positives being detected. The evidence of this can be seen in the output for “Brothers In Arms”, a recording which was taken from CD. Although this is a particularly tricky piece for a click detector, there are no actual clicks present which are not part of the wanted signal. Thus, every detection was a false positive.

3. The frequency domain based detection algorithm detected clicks affecting more samples.

4. Very few of the clicks detected by the two algorithms coincided.

5. The detection rate against the artificial vinyl noise was much lower when music was also present.
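The fourth observation could be checked mechanically by counting how many clicks reported by one detector overlap a click reported by the other. The sketch below shows one way this might be done. It is not the method used to arrive at the observation above; the names `ClickSpan` and `count_coinciding` are hypothetical, and click ends are assumed to be inclusive sample indices.

```c
#include <stddef.h>

/* Start and end sample indices of one detected click (end assumed inclusive). */
typedef struct {
    long start;
    long end;
} ClickSpan;

/* Returns how many clicks in a[] overlap at least one click in b[].
   A simple O(na * nb) scan; the lists need not be sorted. */
size_t count_coinciding(const ClickSpan *a, size_t na,
                        const ClickSpan *b, size_t nb)
{
    size_t hits = 0;
    for (size_t i = 0; i < na; i++) {
        for (size_t j = 0; j < nb; j++) {
            if (a[i].start <= b[j].end && b[j].start <= a[i].end) {
                hits++;
                break;  /* count each click in a[] at most once */
            }
        }
    }
    return hits;
}
```

Dividing the result by the number of clicks in each list gives a rough measure of agreement between the two detectors on the same source material.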


Chapter 6

Conclusions

From a personal perspective, it was disappointing not to be able to implement the least squares autoregressive method we described in chapter 3. As a result of this, no “repaired” audio is included on the accompanying CD, and the results of the click detection algorithms are somewhat intangible.

However, it can be seen from the results that the two methods are suited to different types of source material. The time domain algorithm detects more clicks in most cases, and ones which are smaller. However, this is at the expense of more false positives. The frequency domain algorithm tends not to detect as many, but hits fewer false positives.

In retrospect, it may have proved more useful to implement the LSAR method in place of one of the detection algorithms, or to implement a simpler linear interpolation algorithm, either of which would have produced audible results and confirmed the correct operation of the detectors. In its current state, the project makes little contribution to the field. The question of whether or not the project is personally seen as a success is simple to answer: it is not, for the reasons mentioned.

The results do, however, raise the question of whether it would be a good idea to construct an algorithm which adapts its detection technique depending on the type of source material, given the obviously differing characteristics of the two. Another interesting piece of further work would be to implement some repair function in order to confirm in a more tangible manner that the detection algorithms produced work as expected.


Bibliography

Adobe Systems Inc. (2008), ‘Adobe audition’. http://www.adobe.com/products/audition/ [Online; accessed 6-December-2007].

Apple Inc. (2008), ‘Soundtrack pro’. http://www.apple.com/finalcutstudio/soundtrackpro/ [Online; accessed 6-December-2007].

Audacity Development Team (2008), ‘Audacity: Free audio editor and recorder’, Website.

Berger, J., Coifman, R. R. and Goldberg, M. J. (1994), ‘A method of denoising and reconstructing audio signals’, Proceedings of the ICMC pp. 344–347.

Borwick, J., ed. (1994), Sound Recording Practice, 4 (paperback) edn, Oxford University Press.

Budman, S. (2006), ‘Uncovering the story of jack mullin’, Santa Clara Magazine.

Carrey and Buckner (1976), ‘A system for reducing impulsive noise on gramophone reproduction equipment’, The Radio Electronic Engineer.

Caruso, D. (2007), Enrico Caruso - His Life and Death, Unknown.

CEDAR Audio Ltd (2008), ‘Cedar audio ltd: Audio restoration systems’, Website.

FFTW Home Page (2008). URL: http://www.fftw.org/

Foundation, F. S. (2008), ‘Gsl - gnu scientific library’, Website. URL: http://www.gnu.org/software/gsl/

Godsill, Rayner and Cappe (1997), ‘Digital audio restoration’.

Godsill, S. J. and Rayner, P. J. W. (1998), Digital Audio Restoration.

Janssen, A. J. E. M., Veldhuis, R. and Vries, L. B. (1986), ‘Adaptive interpolation of discrete-time signals that can be modeled as ar processes’, IEEE Trans. Acoustics, Speech and Signal Processing 34(2), 317–330.


Martin, G. and Hornsby, J. (1995), All You Need Is Ears: The inside personal story of the genius who created The Beatles, 2 edn, Saint Martin’s Press Inc.

Paulson, J. and Paul, L. (2007), ‘Chasing sound’, DVD.

Rayner, P. J. W. and Godsill, S. J. (1991), ‘The detection and correction of artefacts in archived gramophone recordings’, Proceedings of the IEEE Workshop on Audio and Acoustics.

Smith, S. W. (1997), The Scientist and Engineer’s Guide to Digital Signal Processing, online edn, California Technical Publishing.

Sony Corporation (2008), ‘Sound forge’. http://www.sonycreativesoftware.com/products/soundforgefamily.asp [Online; accessed 6-December-2007].

Stockham, T. G., Cannon, T. M. and Ingebretsen, R. B. (1975), ‘Blind deconvolution through digital signal processing’, Proceedings of the IEEE 63(4), 678–692. URL: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1451730

The Design and Implementation of FFTW3 (2005), Vol. 93, IEEE.

Veldhuis, R. (1992), Restoration of Lost Samples in Digital Signals, Prentice-Hall.

Welty, J. (2008), ‘Gnome wave cleaner’. http://gwc.sourceforge.net/ [Online; accessed 6-December-2007].


Appendix A

Raw Result Outputs

This section contains reformatted output from each of the click detection algorithms on selected pieces of the source material.

A.1 Time Domain Algorithm

A.1.1 Artificial Vinyl Noise (Track 4)

Click Start  Click End  Click Length  Threshold

798  805  7  0.352891
4302  4327  25  7.04037
4302  4326  24  7.109
4452  4457  5  0.161237
4614  4623  9  0.868368
4692  4694  2  0.0414453
4452  4460  8  0.771016
4613  4625  12  1.48241
4691  4698  7  0.650277
7200  7206  6  0.639336
7203  7210  7  1.95056
7870  7882  12  1.39709
7869  7884  15  2.77659
10974  10991  17  3.33969
10974  10991  17  3.18383
12926  12934  8  0.96862
12926  12935  9  1.05747
16017  16035  18  4.29768
16329  16343  14  2.78188


16016  16037  21  5.19523
16328  16345  17  3.65357
20444  20458  14  2.3523
20444  20458  14  1.9891
22054  22078  24  6.6942
22054  22077  23  6.14766
22704  22722  18  3.8084
22703  22723  20  5.03746
26870  26891  21  5.13329
26870  26891  21  5.34794
28229  28235  6  0.265312
31634  31640  6  0.445087
36341  36365  24  5.8445
36341  36364  23  5.61348
37393  37412  19  3.15358
37393  37412  19  2.99362
41189  41207  18  4.13603
41200  41207  7  3.27879
50522  50542  20  4.64165
50522  50542  20  4.70056
55709  55720  11  1.2407
55710  55719  9  0.917305
63668  63690  22  5.99459
63667  63691  24  6.42109
76575  76577  2  0.0865625
76575  76581  6  0.492305
91426  91444  18  3.89098
91426  91444  18  3.55094
92673  92721  48  5.45156
92673  92722  49  5.50577
93097  93100  3  0.159453
93095  93106  11  1.6023
108748  108759  11  1.38891
108748  108760  12  1.53066
121563  121591  28  8.48493
121564  121590  26  7.17205
122890  122894  4  0.120508
124901  124904  3  0.108047
130020  130026  6  0.5625
134727  134751  24  5.79676
134727  134750  23  5.64721
135779  135798  19  3.14844
135779  135798  19  3.01188


139575  139594  19  4.62583
139575  139593  18  4.13282
148908  148928  20  4.6415
148908  148928  20  4.607
154095  154106  11  1.28863
154096  154105  9  0.910848
162054  162076  22  5.93008
162053  162077  24  6.42877
174961  174964  3  0.0622266
174961  174967  6  0.533125
189812  189830  18  3.85922
189812  189830  18  3.45988
191059  191107  48  5.42408
191059  191107  48  5.49306
191484  191486  2  0.0619531
191481  191492  11  1.54301
207134  207145  11  1.33926
207134  207146  12  1.53043
219950  219977  27  8.58573
219950  219976  26  7.16618
221276  221279  3  0.0735938
223287  223290  3  0.0471094
228406  228413  7  0.685089
233113  233137  24  5.85095
233112  233136  24  5.48291
234165  234184  19  3.15758
234165  234184  19  3.07485
237961  237980  19  4.53185
237961  237979  18  3.97572
247294  247314  20  4.59215
247294  247314  20  4.55578
252481  252492  11  1.30484
252482  252491  9  0.794046
260440  260462  22  5.94386
260439  260463  24  6.37975
273347  273350  3  0.0437891
273347  273353  6  0.456602
288198  288216  18  3.90852
288198  288216  18  3.36375
289445  289493  48  5.50327
289445  289493  48  5.48593
289869  289872  3  0.0610937
289867  289878  11  1.59773


305520  305531  11  1.33113
305520  305532  12  1.58826
318336  318363  27  8.58086
318336  318362  26  7.1319
321673  321676  3  0.0667188
331499  331523  24  5.8466
331499  331522  23  5.63486
332551  332570  19  3.05741
332551  332570  19  3.05084
336347  336366  19  4.54594
336347  336365  18  4.04727
345680  345700  20  4.4759
345680  345700  20  4.57812
350867  350878  11  1.34297
350868  350877  9  0.784642
358826  358848  22  5.92836
358825  358849  24  6.38355
371732  371736  4  0.115273
371733  371739  6  0.525625
386584  386602  18  3.92879
386584  386602  18  3.41375
387831  387879  48  5.4386
387831  387879  48  5.39805
388253  388264  11  1.59508
403906  403917  11  1.2216
403906  403918  12  1.50721
416722  416749  27  8.42212
416722  416748  26  7.02253
418050  418050  0  0.00777344
420060  420060  0  0.0111328
429885  429909  24  5.75703
429885  429909  24  5.40133
430937  430956  19  2.96774
430937  430956  19  3.17669
432383  432389  6  0.484883
434733  434752  19  4.47581
434733  434751  18  3.91266
444066  444086  20  4.55043
444066  444086  20  4.7132
449253  449264  11  1.39273
449254  449262  8  0.73862
457212  457234  22  5.97075
457211  457235  24  6.40383


470119  470122  3  0.0954297
470119  470125  6  0.445781
484970  484988  18  3.93574
484970  484988  18  3.40668
486217  486265  48  5.39655
486217  486265  48  5.44667
486642  486644  2  0.0494141
486639  486650  11  1.6432
502292  502303  11  1.15719
502292  502304  12  1.51132
515108  515135  27  8.30636
515108  515134  26  6.9469
516434  516438  4  0.119766
518445  518447  2  0.0438802
528271  528295  24  5.6668
528271  528295  24  5.41215
529323  529342  19  3.02608
529323  529342  19  3.30983
530767  530774  7  0.44625
533119  533138  19  4.56632
533119  533137  18  3.97965
542452  542472  20  4.49025
542452  542472  20  4.71713
547639  547650  11  1.33953
547640  547648  8  0.585273
555597  555620  23  5.99805
555606  555621  15  7.24613
568504  568509  5  0.177357
568505  568511  6  0.44207
583356  583374  18  3.96582
583356  583374  18  3.41629
584603  584651  48  5.48564
584603  584651  48  5.4642
585028  585030  2  0.068125
585025  585036  11  1.53105
600678  600689  11  1.13012
600678  600690  12  1.47036
613494  613521  27  8.30645
613494  613520  26  6.87536
614818  614825  7  0.257305
626657  626681  24  5.62309
626657  626681  24  5.52028
627709  627728  19  3.04229


627709  627728  19  3.31471
629154  629160  6  0.277031
631505  631524  19  4.47086
631505  631523  18  3.9157
640838  640858  20  4.52886
640838  640858  20  4.76768
646025  646036  11  1.32652
646026  646034  8  0.593438
653983  654006  23  6.21562
654000  654007  7  4.44465
666890  666895  5  0.194688
666889  666896  7  0.47207
681742  681760  18  3.95605
681742  681760  18  3.43914
682989  683037  48  5.4725
682989  683037  48  5.49265
683413  683416  3  0.07125
683411  683422  11  1.43691
699064  699075  11  1.22008
699064  699076  12  1.50011
711880  711907  27  8.30612
711880  711906  26  6.81967
713204  713211  7  0.40293
725043  725067  24  5.60488
725043  725067  24  5.48676
726095  726114  19  3.1579
726095  726114  19  3.36591
727540  727545  5  0.17444
729891  729910  19  4.44705
729891  729909  18  3.93965
739224  739244  20  4.52284
739224  739244  20  4.78929
744411  744422  11  1.23883
744412  744420  8  0.697891
752369  752392  23  6.34469
752369  752392  23  6.3459
765276  765281  5  0.181042
765275  765282  7  0.381758
780128  780146  18  3.97633
780128  780146  18  3.41988
781375  781423  48  5.46401
781375  781424  49  5.55085
781798  781802  4  0.171094


781797  781808  11  1.38422
797450  797461  11  1.21031
797450  797462  12  1.47751
810266  810293  27  8.36439
810266  810292  26  6.88003
818724  818724  0  0.0254297
823429  823453  24  5.59445
823429  823453  24  5.58203
824481  824500  19  3.08622
824481  824500  19  3.23364
825926  825931  5  0.223919
828277  828296  19  4.35782
828277  828295  18  3.9484
837610  837630  20  4.45319
837610  837630  20  4.84565
842797  842808  11  1.13461
842800  842807  7  0.982813
850755  850778  23  6.1807
850755  850778  23  6.37652
863663  863665  2  0.0783203
863662  863668  6  0.28596
878514  878532  18  3.98727
878514  878532  18  3.38992
879761  879809  48  5.46633
879761  879810  49  5.72809
880184  880189  5  0.288333
880183  880194  11  1.44516
895836  895847  11  1.26113
895836  895848  12  1.48085
908652  908679  27  8.40258
908652  908678  26  6.97069
911989  911995  6  0.633906
911984  912004  20  4.70403
912004  912015  11  12.9339
912053  912078  25  3.62021
912198  912229  31  10.6617

A.1.2 “Brothers In Arms” (Clean) (Track 5)

Click Start  Click End  Click Length  Threshold

19995  19996  1  0.277292
39595  39596  1  0.554584


59195 59196 1 0.83187678795 78796 1 1.1091778791 78792 1 1.1573798395 98396 1 1.4346698391 98392 1 1.46225

117995 117996 1 1.73954117991 117992 1 1.77982137595 137596 1 2.05711137591 137592 1 2.09936157195 157196 1 2.37666157191 157192 1 2.43069176795 176796 1 2.70798196395 196396 1 2.98527196391 196392 1 3.00993215995 215996 1 3.28722235595 235596 1 3.56452255195 255196 1 3.84181255191 255192 1 3.86539274795 274796 1 4.14268274791 274792 1 4.18259294395 294396 1 4.45988294391 294392 1 4.49091313995 313996 1 4.7682313991 313992 1 4.79855333595 333596 1 5.07584333591 333592 1 5.10265353195 353196 1 5.37995372795 372796 1 5.65724372791 372792 1 5.67405392395 392396 1 5.95134392391 392392 1 5.97045411995 411996 1 6.24774431595 431596 1 6.52503451195 451196 1 6.80233451191 451192 1 6.81888470795 470796 1 7.09618490395 490396 1 7.37347509995 509996 1 7.65076529595 529596 1 7.92805549195 549196 1 8.20534536893 536894 1 8.21194536786 536787 1 8.20859536461 536462 1 8.24796

Continued on Next Page. . .

Page 63: A Comparison of Techniques for Detecting Clicks on ...mdv/courses/CM30082/projects... · A Comparison of Techniques for Detecting Clicks on ... A Comparison of Techniques for Detecting

APPENDIX A. RAW RESULT OUTPUTS 53

536438 536439 1 8.2515536425 536426 1 8.27541536401 536402 1 8.31029536359 536360 1 8.33569536342 536343 1 8.33218536241 536242 1 8.35091536046 536047 1 8.35665536004 536005 1 8.34946535784 535785 1 8.3366535765 535766 1 8.34196535753 535754 1 8.33889535750 535751 1 8.33625535697 535698 1 8.33373535659 535660 1 8.35813535655 535656 1 8.36901535568 535569 1 8.38411568795 568796 1 8.6614552448 552449 1 8.6997588395 588396 1 8.977607995 607996 1 9.25429605703 605704 1 9.26429605435 605436 1 9.28187605369 605370 1 9.33345605321 605322 1 9.35127605067 605068 1 9.4008604957 604958 1 9.46256604912 604913 1 9.47647604843 604844 1 9.52844604711 604712 1 9.55834604490 604491 1 9.5507604434 604435 1 9.55644604392 604393 1 9.54716604389 604390 1 9.55512604386 604387 1 9.62168627595 627596 1 9.89898647195 647196 1 10.1763647191 647192 1 10.2155639148 639149 1 10.2273638923 638924 1 10.2382666795 666796 1 10.5155666791 666792 1 10.5728686395 686396 1 10.8501673647 673648 1 10.8864

Continued on Next Page. . .

Page 64: A Comparison of Techniques for Detecting Clicks on ...mdv/courses/CM30082/projects... · A Comparison of Techniques for Detecting Clicks on ... A Comparison of Techniques for Detecting

APPENDIX A. RAW RESULT OUTPUTS 54

673477 673478 1 10.9013673405 673406 1 10.9547673207 673208 1 11.0135673190 673191 1 11.0287673108 673109 1 11.1132673014 673015 1 11.1379673004 673005 1 11.1949705995 705996 1 11.4722705991 705992 1 11.5023725595 725596 1 11.7796745195 745196 1 12.0569743203 743204 1 12.0638742814 742815 1 12.0801742363 742364 1 12.0692742048 742049 1 12.0822741897 741898 1 12.103741879 741880 1 12.1081741867 741868 1 12.1146741858 741859 1 12.1277741854 741855 1 12.1628741702 741703 1 12.153741673 741674 1 12.1312741566 741567 1 12.1489741492 741493 1 12.1739741150 741151 1 12.1623741119 741120 1 12.1767741112 741113 1 12.1788741109 741110 1 12.1908741106 741107 1 12.2603764795 764796 1 12.5375764791 764792 1 12.5873784395 784396 1 12.8645803995 803996 1 13.1418823595 823596 1 13.4191809331 809332 1 13.4266808957 808958 1 13.4086808954 808955 1 13.4272843195 843196 1 13.7045862795 862796 1 13.9818882395 882396 1 14.2591877199 877200 1 14.2671876907 876908 1 14.2979876806 876807 1 14.2943

Continued on Next Page. . .

Page 65: A Comparison of Techniques for Detecting Clicks on ...mdv/courses/CM30082/projects... · A Comparison of Techniques for Detecting Clicks on ... A Comparison of Techniques for Detecting

APPENDIX A. RAW RESULT OUTPUTS 55

876793 876794 1 14.2958876759 876760 1 14.2675901995 901996 1 14.5448901991 901992 1 14.5948921595 921596 1 14.872921591 921592 1 14.9142910869 910870 1 14.9198941195 941196 1 15.1971960795 960796 1 15.4744946674 946675 1 15.5259946601 946602 1 15.5208946598 946599 1 15.5675946574 946575 1 15.5483946567 946568 1 15.58946368 946369 1 15.5536946365 946366 1 15.5564946359 946360 1 15.5598946352 946353 1 15.5727946303 946304 1 15.6079946286 946287 1 15.6262946255 946256 1 15.6449946247 946248 1 15.645946053 946054 1 15.6613946032 946033 1 15.681945975 945976 1 15.7029945937 945938 1 15.7092945896 945897 1 15.7091945885 945886 1 15.7084945844 945845 1 15.7533945787 945788 1 15.7615945731 945732 1 15.7523945727 945728 1 15.719945724 945725 1 15.7574945612 945613 1 15.792945608 945609 1 15.7936945605 945606 1 15.8018945602 945603 1 15.8588945577 945578 1 15.8766945570 945571 1 15.887945567 945568 1 15.9032945564 945565 1 15.9775980395 980396 1 16.2548980391 980392 1 16.3079

Continued on Next Page. . .

Page 66: A Comparison of Techniques for Detecting Clicks on ...mdv/courses/CM30082/projects... · A Comparison of Techniques for Detecting Clicks on ... A Comparison of Techniques for Detecting

APPENDIX A. RAW RESULT OUTPUTS 56

999995 999996 1 16.5852

A.1.3 “Minute Waltz” (With artificial noise) (Track 10)

Click Start  Click End  Click Length  Threshold

15750  15750  0  0.0444922
32662  32667  5  0.267539
32663  32669  6  0.449727
53742  53752  10  0.792982
53742  53752  10  0.726341
72693  72698  5  0.132487
101055  101067  12  1.54714
101055  101066  11  1.30539
243140  243155  15  1.31734
243139  243155  16  1.27033
297827  297838  11  1.29945
297828  297837  9  1.07206
439913  439928  15  1.21828
439913  439928  15  1.16216
636683  636689  6  0.857266
636691  636697  6  0.483605
636683  636689  6  0.842969
636691  636697  6  0.469308
691372  691380  8  0.730326
691372  691379  7  0.614581
833457  833471  14  1.08859
833458  833470  12  0.920651
859787  859790  3  0.119453
888144  888151  7  0.339492
888144  888150  6  0.212595
1030227  1030242  15  1.02255
1030228  1030241  13  0.876351
1056557  1056562  5  0.39569
1056557  1056563  6  0.416049
1084914  1084921  7  0.183086
1084916  1084920  4  0.109453
1253329  1253335  6  0.445924
1253329  1253335  6  0.338268
1281688  1281697  9  1.19012
1281688  1281697  9  0.900078

A.1.4 “Jungle Dream” (Track 11)


Click Start  Click End  Click Length  Threshold

25725  25742  17  3.37949
31523  31559  36  3.93637
31915  31933  18  2.35626
31956  31991  35  1.60055
32661  32665  4  0.0897266
32699  32705  6  0.272762
33048  33057  9  0.637737
33281  33287  6  0.233376
33229  33304  75  4.98988
33338  33377  39  4.6261
33401  33444  43  4.29829
33460  33484  24  3.75628
34153  34161  8  0.612917
34153  34160  7  0.545206
36758  36761  3  0.198906
36765  36771  6  0.762274
36759  36774  15  1.98906
44449  44478  29  2.6775
44697  44727  30  2.2767
44456  44475  19  1.88236
44696  44711  15  1.8665
44926  44942  16  1.91564
46742  46760  18  3.93563
46742  46761  19  4.60782
47834  47869  35  7.78612
47834  47869  35  7.96768
48775  48792  17  2.7615
48778  48787  9  1.05688
52480  52491  11  1.4268
52482  52488  6  0.581953
55611  55647  36  6.25695
55612  55647  35  6.6524
58817  58829  12  0.752574
58816  58833  17  1.45203
61390  61407  17  3.05432
61391  61406  15  2.65336
62158  62164  6  0.781914
62156  62165  9  1.18556
68835  68849  14  2.5875
68834  68849  15  2.59667
72403  72428  25  4.75463
72458  72464  6  0.567578
72408  72429  21  6.37297
72456  72467  11  1.40324
84332  84332  0  0.0678516
86351  86369  18  3.12855
86352  86367  15  2.1247
86968  86976  8  0.623268
86968  86977  9  1.01669
87121  87122  1  0.0356641
91610  91618  8  0.891081
93483  93497  14  1.48336
93483  93498  15  2.36647
99696  99699  3  0.18875
109780  109785  5  0.309284
110401  110413  12  1.93337
110404  110414  10  2.50842
123154  123163  9  1.05547
123154  123163  9  1.0843
131479  131497  18  2.69
131480  131498  18  2.6628
150431  150446  15  1.91257
150431  150446  15  1.73605
152532  152548  16  1.9248
152532  152548  16  2.14163
160120  160132  12  1.97963
160120  160131  11  1.68074
178793  178806  13  1.95096
178800  178806  6  1.62359
205083  205098  15  1.92998
205083  205098  15  1.76381
209267  209273  6  0.816602
230890  230892  2  0.0474609
230889  230893  4  0.214766
263109  263122  13  1.50551
263109  263122  13  1.68329
295238  295244  6  0.744258
295237  295246  9  1.11888
320878  320896  18  3.25764
320879  320893  14  1.86074
352883  352889  6  0.541445
352882  352890  8  0.808724
383141  383159  18  2.67551
383140  383160  20  2.89438
412122  412129  7  0.681907
412123  412129  6  0.693047
438883  438898  15  2.66444
438883  438898  15  2.54334
532912  532914  2  0.0811328
540557  540560  3  0.182266
540564  540564  0  0.0182031
563984  563990  6  0.583281
563982  563989  7  0.270234
571228  571231  3  0.122383
571233  571236  3  0.0989453
598339  598344  5  0.231576
644323  644330  7  0.544389
645797  645810  13  2.06285
645796  645811  15  2.30527
646962  646976  14  1.9177
646962  646975  13  1.82224
705685  705688  3  0.0923047
705685  705688  3  0.130391
832690  832700  10  1.12629
832690  832699  9  1.16961
882040  882049  9  0.699775
882041  882048  7  0.702958
906052  906058  6  0.63293
906051  906059  8  0.777813
914483  914489  6  0.258359
946960  946972  12  1.30477
946959  946972  13  1.50027
990601  990608  7  0.414297
990602  990611  9  1.17347
1023354  1023367  13  1.58673
1023354  1023367  13  1.54013
1058314  1058330  16  1.97872
1058314  1058330  16  2.03735
1085066  1085079  13  1.72586
1085065  1085079  14  1.75582
1140872  1140881  9  1.13336
1140872  1140881  9  1.16059
1143647  1143648  1  0.0253516
1143644  1143650  6  0.251897
1169277  1169284  7  0.450788
1169276  1169285  9  0.910999
1228028  1228043  15  1.17325
1228029  1228044  15  1.4351
1258334  1258340  6  0.557088
1258332  1258339  7  0.390352
1286777  1286797  20  4.51023
1286777  1286797  20  4.16597
1293596  1293613  17  3.53698
1293604  1293614  10  4.2051
1323650  1323662  12  1.85634
1323650  1323661  11  1.37879
1330224  1330240  16  1.62887
1330224  1330240  16  1.76063
1345531  1345546  15  2.96839
1345530  1345547  17  3.27664
1386760  1386767  7  0.474531
1386760  1386767  7  0.626484
1404291  1404309  18  2.29988
1404291  1404309  18  2.29375
1443284  1443290  6  0.657461
1443283  1443291  8  0.814167
1449628  1449630  2  0.0795313
1449626  1449632  6  0.272372
1463061  1463069  8  0.586914
1463060  1463072  12  1.18497
1476342  1476348  6  0.423455
1476341  1476347  6  0.16322
1521825  1521840  15  1.94553
1521824  1521842  18  2.25922
1535770  1535785  15  2.79168
1535771  1535786  15  2.71652
1580595  1580613  18  2.02754
1580595  1580614  19  2.18648

A.2 Frequency Domain Algorithm

A.2.1 “Brothers In Arms” (Clean) (Track 5)

Click Start  Click End  Click Length  Threshold

535540  535552  12  0.526985
535557  535559  2  0.18875
535569  535590  21  1.83198
552319  552349  30  1.72108
552353  552359  6  0.357891
604697  604706  9  0.689438
673056  673063  7  0.558345
741121  741130  9  0.548348
741162  741171  9  0.449018
808963  808981  18  1.81566
809003  809037  34  1.56626
809054  809056  2  0.243789
809063  809073  10  0.890664
809119  809127  8  0.461237
876768  876784  16  1.38017
876803  876809  6  0.534727
876917  876919  2  0.0737891
876922  876922  0  0.0581641
876929  876931  2  0.115456
945847  945856  9  0.494324
945860  945860  0  0.21668
945942  945948  6  0.298277
945953  945954  1  0.161992
945960  945966  6  0.378138

A.2.2 “Minute Waltz” (With artificial noise) (Track 10)

Click Start  Click End  Click Length  Threshold

15750  15750  0  0.0444922
32662  32667  5  0.267539
32663  32669  6  0.449727
53742  53752  10  0.792982
53742  53752  10  0.726341
72693  72698  5  0.132487
101055  101067  12  1.54714
101055  101066  11  1.30539
243140  243155  15  1.31734
243139  243155  16  1.27033
297827  297838  11  1.29945
297828  297837  9  1.07206
439913  439928  15  1.21828
439913  439928  15  1.16216
636683  636689  6  0.857266
636691  636697  6  0.483605
636683  636689  6  0.842969
636691  636697  6  0.469308
691372  691380  8  0.730326
691372  691379  7  0.614581
833457  833471  14  1.08859
833458  833470  12  0.920651
859787  859790  3  0.119453
888144  888151  7  0.339492
888144  888150  6  0.212595
1030227  1030242  15  1.02255
1030228  1030241  13  0.876351
1056557  1056562  5  0.39569
1056557  1056563  6  0.416049
1084914  1084921  7  0.183086
1084916  1084920  4  0.109453
1253329  1253335  6  0.445924
1253329  1253335  6  0.338268
1281688  1281697  9  1.19012
1281688  1281697  9  0.900078

A.2.3 “Jungle Dream” (Track 11)

Click Start  Click End  Click Length  Threshold

25725  25742  17  3.37949
31523  31559  36  3.93637
31915  31933  18  2.35626
31956  31991  35  1.60055
32661  32665  4  0.0897266
32699  32705  6  0.272762
33048  33057  9  0.637737
33281  33287  6  0.233376
33229  33304  75  4.98988
33338  33377  39  4.6261
33401  33444  43  4.29829
33460  33484  24  3.75628
34153  34161  8  0.612917
34153  34160  7  0.545206
36758  36761  3  0.198906
36765  36771  6  0.762274
36759  36774  15  1.98906
44449  44478  29  2.6775
44697  44727  30  2.2767
44456  44475  19  1.88236
44696  44711  15  1.8665
44926  44942  16  1.91564
46742  46760  18  3.93563
46742  46761  19  4.60782
47834  47869  35  7.78612
47834  47869  35  7.96768
48775  48792  17  2.7615
48778  48787  9  1.05688
52480  52491  11  1.4268
52482  52488  6  0.581953
55611  55647  36  6.25695
55612  55647  35  6.6524
58817  58829  12  0.752574
58816  58833  17  1.45203
61390  61407  17  3.05432
61391  61406  15  2.65336
62158  62164  6  0.781914
62156  62165  9  1.18556
68835  68849  14  2.5875
68834  68849  15  2.59667
72403  72428  25  4.75463
72458  72464  6  0.567578
72408  72429  21  6.37297
72456  72467  11  1.40324
84332  84332  0  0.0678516
86351  86369  18  3.12855
86352  86367  15  2.1247
86968  86976  8  0.623268
86968  86977  9  1.01669
87121  87122  1  0.0356641
91610  91618  8  0.891081
93483  93497  14  1.48336
93483  93498  15  2.36647
99696  99699  3  0.18875
109780  109785  5  0.309284
110401  110413  12  1.93337
110404  110414  10  2.50842
123154  123163  9  1.05547
123154  123163  9  1.0843
131479  131497  18  2.69
131480  131498  18  2.6628
150431  150446  15  1.91257
150431  150446  15  1.73605
152532  152548  16  1.9248
152532  152548  16  2.14163
160120  160132  12  1.97963
160120  160131  11  1.68074
178793  178806  13  1.95096
178800  178806  6  1.62359
205083  205098  15  1.92998
205083  205098  15  1.76381
209267  209273  6  0.816602
230890  230892  2  0.0474609
230889  230893  4  0.214766
263109  263122  13  1.50551
263109  263122  13  1.68329
295238  295244  6  0.744258
295237  295246  9  1.11888
320878  320896  18  3.25764
320879  320893  14  1.86074
352883  352889  6  0.541445
352882  352890  8  0.808724
383141  383159  18  2.67551
383140  383160  20  2.89438
412122  412129  7  0.681907
412123  412129  6  0.693047
438883  438898  15  2.66444
438883  438898  15  2.54334
532912  532914  2  0.0811328
540557  540560  3  0.182266
540564  540564  0  0.0182031
563984  563990  6  0.583281
563982  563989  7  0.270234
571228  571231  3  0.122383
571233  571236  3  0.0989453
598339  598344  5  0.231576
644323  644330  7  0.544389
645797  645810  13  2.06285
645796  645811  15  2.30527
646962  646976  14  1.9177
646962  646975  13  1.82224
705685  705688  3  0.0923047
705685  705688  3  0.130391
832690  832700  10  1.12629
832690  832699  9  1.16961
882040  882049  9  0.699775
882041  882048  7  0.702958
906052  906058  6  0.63293
906051  906059  8  0.777813
914483  914489  6  0.258359
946960  946972  12  1.30477
946959  946972  13  1.50027
990601  990608  7  0.414297
990602  990611  9  1.17347
1023354  1023367  13  1.58673
1023354  1023367  13  1.54013
1058314  1058330  16  1.97872
1058314  1058330  16  2.03735
1085066  1085079  13  1.72586
1085065  1085079  14  1.75582
1140872  1140881  9  1.13336
1140872  1140881  9  1.16059
1143647  1143648  1  0.0253516
1143644  1143650  6  0.251897
1169277  1169284  7  0.450788
1169276  1169285  9  0.910999
1228028  1228043  15  1.17325
1228029  1228044  15  1.4351
1258334  1258340  6  0.557088
1258332  1258339  7  0.390352
1286777  1286797  20  4.51023
1286777  1286797  20  4.16597
1293596  1293613  17  3.53698
1293604  1293614  10  4.2051
1323650  1323662  12  1.85634
1323650  1323661  11  1.37879
1330224  1330240  16  1.62887
1330224  1330240  16  1.76063
1345531  1345546  15  2.96839
1345530  1345547  17  3.27664
1386760  1386767  7  0.474531
1386760  1386767  7  0.626484
1404291  1404309  18  2.29988
1404291  1404309  18  2.29375
1443284  1443290  6  0.657461
1443283  1443291  8  0.814167
1449628  1449630  2  0.0795313
1449626  1449632  6  0.272372
1463061  1463069  8  0.586914
1463060  1463072  12  1.18497
1476342  1476348  6  0.423455
1476341  1476347  6  0.16322
1521825  1521840  15  1.94553
1521824  1521842  18  2.25922
1535770  1535785  15  2.79168
1535771  1535786  15  2.71652
1580595  1580613  18  2.02754
1580595  1580614  19  2.18648


Appendix B

Project Proposal

For completeness, the project proposal which originated this work is included here.


Automatic Removal of Scratches from Audio Sourced from Vinyl

Project Proposal

James Nugent


Contents

1 Problem Description
    1.1 Types of Noise
    1.2 Current State of the Art
    1.3 Proposed Strategy

2 Requirements Specification
    2.1 Aim
    2.2 Requirements

3 Project Plan

4 Resources
    4.1 Hardware
    4.2 Software
    4.3 Source Material
    4.4 Human Resources

1 Problem Description

Many hours of historical recordings reside on analogue media throughout the world. In many cases, a vinyl disc is the only available source of a given recording. Unfortunately, “mint” vinyl is rare, leaving well-used discs as the master source. Imperfections in and scratches on the vinyl surface manifest themselves as noise during playback, as does dust in the grooves.

Restoration of recordings suffering from defects such as these is of interest to many people - whether for the purposes of improving the perceived quality of personal recordings, or re-mastering recordings for release on compact disc. The aim of this project is to produce a program capable of going some way towards minimizing the noise relative to the wanted signal.

1.1 Types of Noise

Broadly speaking, the types of noise found on recordings taken from vinyl sources can be categorized as follows:

• Clicks - A click is an aberration in the waveform, generally lasting for only a few samples, but manifesting itself as a “popping” noise. Clicks are most often caused by scratches and dust on the surface of a vinyl disc.

• Crackle - Crackle is often characterized as a “frying fat” type of noise. It is generally caused by randomly distributed pock-marks on the surface of a vinyl disc, creating impulsive disturbances in the audio.

• Buzz - Buzz is very similar to crackle in that it manifests itself because of impulsive disturbances in the waveform. However, unlike crackle, where the disturbances are randomly distributed, the impulses causing a buzz are regularly spaced. This is a common problem caused by poor shielding or incorrect grounding in electrical equipment, in a form known as fifty-cycle hum.¹

• Hiss - Generally manifesting itself as a constant sound, the word “hiss” is onomatopoeic for the sound one can expect to hear from this type of noise. Hiss often originates from tape rather than vinyl sources, and as such is not considered further.

Each of these types of noise requires a different strategy for removal. These will be detailed further in the literature survey and dissertation. In the first instance, however, it is intended that this project will cover only the “click” category of noise.

1.2 Current State of the Art

As in many fields, the current state of the art is represented by expensive commercial hardware and software packages. CEDAR Audio, for example, produce several hardware/software or pure software packages such as “Retouch” which are widely regarded as the state of the art, and are often used in re-mastering recordings before issue.

Other packages such as Diamond Cut’s “Audio Restoration Tools” and Dartech’s “DartPro”, whilst less well regarded than packages such as “Retouch”, are well known. The commercial field also offers lower-end packages such as Steinberg’s “Clean”, which is designed to make it simple for an individual to improve the quality of their personal record collection before transfer to compact disc.

Several audio editors, such as “Sound Forge” and “Adobe Audition”², also include noise reduction tools in their feature lists.

In the world of Free Software, one of the pre-eminent projects is “Gnome Wave Cleaner”, which falls between the two previously mentioned packages in terms of complexity, and is capable of producing excellent results on certain types of damaged audio. The free wave editor “Audacity” also includes noise reduction tools.

It is unrealistic to expect to reach the level of high-end commercial systems during the course of this project. However, adopting the strategy detailed later in this proposal may produce a piece of software capable of producing results comparable to those of “Gnome Wave Cleaner” - at least when dealing with clicks.

1.3 Proposed Strategy

The primary strategy to be investigated was proposed by John ffitch (who is supervising this project), and draws on the knowledge that clicks are often created by scratches on the surface of a vinyl record, and that such scratches are often at a high angle to the tracks of the record. Consequently, it would be expected that clicks will form a fairly regular, albeit slow, “beat”, since it is known that clicks resulting from such scratches will be located approximately one revolution of the record apart (approximately every 4/3 seconds for a 45 rpm record, or approximately every 1.8 seconds when the source is a 33⅓ rpm record).

¹ (Sixty-cycle hum in the United States!)
² Formerly Cool Edit/Cool Edit Pro
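The revolution period behind this strategy is simple arithmetic: one turn takes 60/rpm seconds, so a 45 rpm disc gives 4/3 s and a 33⅓ rpm disc gives 1.8 s. A minimal sketch of that calculation follows; the function names are hypothetical, and the sample rate is a parameter because the capture rate is an assumption here (44100 Hz is the CD rate).

```c
/* Seconds taken by one full revolution of a record turning at `rpm`. */
double revolution_period_seconds(double rpm)
{
    return 60.0 / rpm;
}

/* Expected gap, in samples, between successive clicks caused by a single
 * scratch, rounded to the nearest whole sample. `sample_rate_hz` is an
 * assumed capture rate such as 44100. */
long click_spacing_samples(double rpm, double sample_rate_hz)
{
    return (long)(revolution_period_seconds(rpm) * sample_rate_hz + 0.5);
}
```

For example, click_spacing_samples(45.0, 44100.0) gives 58800 samples, so a detector following this strategy would look for impulses recurring at roughly that interval, with some tolerance for variation in turntable speed.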

More traditional click detection algorithms may need implementing in addition to this to allow the processing of clicks originating because of other circumstances.

The exact method for removing a detected click is yet to be determined. It is possible that the chosen solution may be as simple as linear interpolation between “good” samples adjacent to a given click, although it is intended that a signal modelling technique similar to those purportedly used in high-end commercial packages such as CEDAR will also be investigated for feasibility.
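As an illustration of the simplest option mentioned above, the following sketch replaces a detected click with a straight line drawn between the nearest “good” samples on either side. It is an assumption-laden example rather than the method finally chosen: the function name, the inclusive start/end convention and the error handling are all hypothetical.

```c
#include <stddef.h>

/* Replace samples[start..end] (inclusive) with a straight line between the
 * neighbouring "good" samples, samples[start - 1] and samples[end + 1].
 * Returns 0 on success, or -1 if the click touches either edge of the
 * buffer, in which case there is no good neighbour to interpolate from. */
int interpolate_click(double *samples, size_t length, size_t start, size_t end)
{
    if (samples == NULL || start == 0 || end < start || end + 1 >= length)
        return -1;

    double before = samples[start - 1];
    double after = samples[end + 1];
    size_t steps = end - start + 2;   /* steps from `before` to `after` */

    for (size_t i = start; i <= end; i++) {
        double t = (double)(i - start + 1) / (double)steps;
        samples[i] = before + t * (after - before);
    }
    return 0;
}
```

Linear interpolation is only plausible for clicks a few samples long; over longer gaps it flattens the waveform, which is one reason a signal modelling approach is also worth investigating.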

2 Requirements Specification

2.1 Aim

The aim of the project is to produce a program which will investigate whether or not the algorithm described above is suitable for detecting clicks in audio originating from vinyl sources. The detection will be compared to a program using a traditional click detection technique, namely “Gnome Wave Cleaner”.

2.2 Requirements

This section details an initial set of requirements for the software product of this project. It is almost certain that these requirements will be further developed during the course of further research and implementation.

1. The software should operate inside the Audacity audio editor. This could be achieved by several means:

• Directly adding code to Audacity

• A VST³ plugin

• A Nyquist⁴ plugin

It is currently intended to write the software as a VST plugin; however, before committing to this, the Nyquist language will be investigated further.

2. The software should aim to detect as many clicks as possible using the method detailed in the Proposed Strategy section of this document. The software should be capable of producing output detailing where it believes it has found clicks.

• The software should use traditional click detection algorithms to attempt to discover further clicks in the audio.

3. The software should use an appropriate strategy, as determined by investigation into the feasibility of signal modelling techniques, to remove detected clicks.

³ A Steinberg proprietary format. An SDK is available from Steinberg subject to a license.
⁴ A programming language designed from the outset for audio synthesis and analysis. Based on Lisp.

It is intended that the results produced by the program will be compared with the output of Gnome Wave Cleaner on the same source material.

3 Project Plan

4 Resources

4.1 Hardware

It will be necessary to have access to a computer with a high quality sound card and speakers. It is intended to primarily use my own Apple Macintosh computers for this, along with the Sony amplifier and Marantz speakers from my hi-fi system as main monitors, and a pair of Beyer DT100 headphones as second monitors and for working during evenings so as to cause minimum disturbance to others!

4.2 Software

Much of the software to be used is Free, namely:

• the Audacity audio editor, for study of source code and to act as a plugin host

• Gnome Wave Cleaner, both for study of the source code and for producing results for comparison with the program written in the course of the project.

• GCC for compiling C/C++ code

The Steinberg VST software development kit is not Free, and is subject to a commercial license. It is, however, available for no monetary cost. I also have access to a second VST host in the form of Steinberg Cubase.

4.3 Source Material

A variety of different styles of source material will be required. I have access to a large library of damaged vinyl recordings (several of which have since been restored using the traditional techniques previously mentioned) taken using a high quality Linn turntable and Apogee analogue-to-digital converters. The quality of the recordings is varied, as is the range of styles. I intend, however, to focus my testing on source material which I find pleasurable to listen to, primarily jazz and operatic works.

4.4 Human Resources

All work will be undertaken by myself as per the regulations for a dissertation project. However, the time and direction of Prof. John ffitch throughout the project will be invaluable and appreciated.

Appendix C

Code

C.1 File: SoundFileHandling.c

#include <stdio.h>
#include <stdlib.h>
#include <sndfile.h>

/* Print every field of an SF_INFO structure (debugging aid). */
void printSoundInfoStructure(SF_INFO info) {
    printf("-- SF_INFO --\n");
    /* sf_count_t is a 64-bit type, so cast it for a portable format specifier */
    printf("Frames: %lld\n", (long long)info.frames);
    printf("Sample Rate: %d\n", info.samplerate);
    printf("Channels: %d\n", info.channels);
    printf("Format: %x\n", (unsigned int)info.format);
    printf("Sections: %d\n", info.sections);
    printf("Seekable?: %s\n", info.seekable == 1 ? "Yes" : "No");
    printf("-- End of SF_INFO --\n");
}

/* Interleave two channel buffers (L, R, L, R, ...) and write them to filePath.
   Returns the number of samples (items, not frames) written, or 0 on failure. */
sf_count_t writeInterleavedToFile(const char *filePath, double *leftBuffer, double *rightBuffer, SF_INFO fileInfo) {

    sf_count_t bufferLength = fileInfo.frames * 2;
    double *interleavedBuffer = (double *)calloc((size_t)bufferLength, sizeof(double));
    sf_count_t currentRead, currentWrite;

    if (interleavedBuffer == NULL) {
        return 0;
    }

    for (currentRead = 0, currentWrite = 0; currentWrite < bufferLength; currentRead++, currentWrite += 2) {
        interleavedBuffer[currentWrite] = leftBuffer[currentRead];
        interleavedBuffer[currentWrite + 1] = rightBuffer[currentRead];
    }

    SNDFILE *soundFile = sf_open(filePath, SFM_WRITE, &fileInfo);
    if (soundFile == NULL) {
        free(interleavedBuffer);
        return 0;
    }

#ifdef DEBUG_OUTPUT
    printSoundInfoStructure(fileInfo);
#endif

    /* This uses the item-count write function (sf_write_double) rather than the
       frame-based one, so bufferLength is a count of samples, not frames. */
    sf_count_t writeCount = sf_write_double(soundFile, interleavedBuffer, bufferLength);

#ifdef DEBUG_OUTPUT
    printf("Written %lld samples to file\n", (long long)writeCount);
#endif

    sf_close(soundFile);

    free(interleavedBuffer);

    return writeCount;
}

/* Read a whole stereo file into two newly allocated channel buffers, which the
   caller must free. On failure both buffers are NULL and frames is 0. */
SF_INFO readAndCloseEntireSoundFile(const char *filePath, double **leftBuffer, double **rightBuffer) {
    SF_INFO soundFileInfo = {0};

    /* As recommended by the libsndfile API */
    soundFileInfo.format = 0;

    /* Open file for reading */
    SNDFILE *soundFile = sf_open(filePath, SFM_READ, &soundFileInfo);
    if (soundFile == NULL) {
        *leftBuffer = NULL;
        *rightBuffer = NULL;
        return soundFileInfo;
    }

#ifdef DEBUG_OUTPUT
    printSoundInfoStructure(soundFileInfo);
#endif

    sf_count_t bufferLength = soundFileInfo.frames * soundFileInfo.channels;
    double *interleavedBuffer = (double *)calloc((size_t)bufferLength, sizeof(double));
    sf_count_t readCount = sf_readf_double(soundFile, interleavedBuffer, soundFileInfo.frames);

    sf_close(soundFile);

#ifdef DEBUG_OUTPUT
    printf("Read %lld Frames\n", (long long)readCount);
#endif

    *leftBuffer = (double *)calloc((size_t)soundFileInfo.frames, sizeof(double));
    *rightBuffer = (double *)calloc((size_t)soundFileInfo.frames, sizeof(double));

    sf_count_t currentWrite, currentRead;
    currentWrite = 0;

    /* De-interleave; the step of 2 assumes a two-channel (stereo) file */
    for (currentRead = 0; currentRead < bufferLength; currentWrite++, currentRead += 2) {
        (*leftBuffer)[currentWrite] = interleavedBuffer[currentRead];
        (*rightBuffer)[currentWrite] = interleavedBuffer[currentRead + 1];
    }

    free(interleavedBuffer);

    return soundFileInfo;
}

C.2 File: hpf-demonstration.c

This code was used to generate track 3 on the accompanying CD, with track 2 as input. Note that it is not a complete implementation, and is only intended to be illustrative of the high pass filter used. In particular, the first and last fifty samples of the output are not filtered since the code to deal with this makes the remainder far less clear!

#define DEBUG_OUTPUT
#define MIN(X,Y) ((X) < (Y) ? (X) : (Y))

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdbool.h>

#include <sndfile.h>
#include "SoundFileHandling.c"

SF_INFO sourceSoundInfo;
double *leftChannel = NULL;
double *rightChannel = NULL;

int main() {
    const char *sourceFileName = "/Users/James/highpassin.wav";
    const char *destFileName = "/Users/James/highpassout.wav";

    SF_INFO destSoundInfo;
    destSoundInfo.format = SF_FORMAT_WAV | SF_FORMAT_PCM_16;
    destSoundInfo.samplerate = 44100;
    destSoundInfo.channels = 2;

    sourceSoundInfo = readAndCloseEntireSoundFile(sourceFileName, &leftChannel, &rightChannel);
    destSoundInfo.frames = sourceSoundInfo.frames;

    /* Processing starts here */
    long i = 0;

    for (i = 0; i < sourceSoundInfo.frames; i++) {
        /* Despite their names, these flags mark the roughly fifty samples at each end */
        bool inFirstEight = false, inLastEight = false;
        if (i < 50) {
            inFirstEight = true;
        }
        if ((i + 50) > sourceSoundInfo.frames) {
            inLastEight = true;
        }

        //Deal with these later
        if (inFirstEight) continue;
        if (inLastEight) continue;

        double tempSumL = 0;
        double tempSumR = 0;
        int j = 0;
        /* Second difference of the signal. Each pass overwrites the previous value,
           so only the final one (j = i + 7) reaches the output. */
        for (j = i - 8; j < i + 8; j++) {
            tempSumL = leftChannel[j-1] - 2*leftChannel[j] + leftChannel[j+1];
            tempSumR = rightChannel[j-1] - 2*rightChannel[j] + rightChannel[j+1];
        }

        /* Take the magnitude first, as vclean.c does: sqrt() of a negative value is NaN */
        leftChannel[i] = sqrt(fabs(tempSumL) / 6);
        rightChannel[i] = sqrt(fabs(tempSumR) / 6);
    }

    /* Processing ends here, writes out to file and closes */

    sf_count_t writeCount = writeInterleavedToFile(destFileName, leftChannel, rightChannel, destSoundInfo);

    return 0;
}
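The listing above leaves the samples at each end unfiltered and overwrites the channel buffers in place, so later outputs are computed partly from already-filtered values. One possible way to avoid both issues, offered only as a sketch and not as the method used in the project, is to write the second difference to a separate buffer and clamp out-of-range reads to the nearest valid sample. The names clampIndex and secondDifference are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: clamps an index into the valid range [0, n - 1]. */
static long clampIndex(long index, long n) {
    if (index < 0) return 0;
    if (index >= n) return n - 1;
    return index;
}

/* Writes the second difference in[i-1] - 2*in[i] + in[i+1] of `in` to `out`.
   Reads past either end reuse the nearest valid sample, and writing to a
   separate buffer means no output is computed from already-filtered values. */
void secondDifference(const double *in, double *out, long n) {
    assert(in != NULL && out != NULL && n > 0);

    for (long i = 0; i < n; i++) {
        double previous = in[clampIndex(i - 1, n)];
        double next = in[clampIndex(i + 1, n)];
        out[i] = previous - 2.0 * in[i] + next;
    }
}
```

Clamping is only one choice of edge policy; zero-padding or mirroring the signal would fit the same structure by changing how the neighbouring samples are fetched.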

C.3 File: ClickList.c

#include <stdio.h>
#include <stdlib.h>

/* Linked list structure for tracking the locations of clicks in the waveform.
   Note that no provision is made for which channel the click is in; it is assumed
   that all clicks in this linked list are on the same channel, and two should be
   initialized if we want to deal with stereo. */
typedef struct click {
    long clickStartSample;
    long clickEndSample;
    double threshold;
    struct click *nextClick;
} Click;

/* This shouldn't be called directly really */
Click *createClickDataList(long startSample, long endSample, double threshold) {
    Click *returnList = (Click *)calloc(1, sizeof(struct click));

    returnList->clickStartSample = startSample;
    returnList->clickEndSample = endSample;
    returnList->threshold = threshold;
    returnList->nextClick = NULL;

    return returnList;
}

/* For now this always returns the head of the list. This may change at some point
   however */
Click *appendClickToClickDataList(Click *clickList, long startSample, long endSample, double threshold) {
    if (clickList == NULL) {
        return createClickDataList(startSample, endSample, threshold);
    }

    /* Assign data to the new node */
    Click *newNode = (Click *)calloc(1, sizeof(struct click));
    newNode->clickStartSample = startSample;
    newNode->clickEndSample = endSample;
    newNode->threshold = threshold;
    newNode->nextClick = NULL;

    /* Attach the new node to the end of the list */
    Click *currentNode = clickList;
    while (currentNode->nextClick != NULL) {
        currentNode = currentNode->nextClick;
    }
    currentNode->nextClick = newNode;

    return clickList;
}

double sampleNumberToTime(long sampleNumber) {
    /* TODO: Convert this to be in minutes and seconds */
    //printf("%d", sampleNumber / 44100);
    /* Divide by a double so that fractions of a second are not truncated */
    return sampleNumber / 44100.0;
}

void printClickList(Click *clickList) {
    printf("Start Sample\tEnd Sample\tLength\t\tThreshold\n");
    printf("------------\t----------\t------\t\t---------\n");

    Click *currentClick = clickList;
    while (currentClick != NULL) {
        printf("%ld\t\t%ld\t\t%ld\t\t%lg \\\\\n", currentClick->clickStartSample,
               currentClick->clickEndSample,
               currentClick->clickEndSample - currentClick->clickStartSample,
               currentClick->threshold);
        currentClick = currentClick->nextClick;
    }
    printf("------------\t----------\t------\t\t---------\n");
}
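appendClickToClickDataList walks the whole list on every call, and nothing in ClickList.c releases the nodes. The sketch below shows one way both points could be addressed, by keeping a tail pointer alongside the head and adding a function that frees the list. ClickQueue, clickQueueAppend and clickQueueFree are hypothetical names introduced for illustration; the Click structure is repeated so that the example compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct click {
    long clickStartSample;
    long clickEndSample;
    double threshold;
    struct click *nextClick;
} Click;

/* Hypothetical wrapper: remembers the tail so that appending is constant time. */
typedef struct {
    Click *head;
    Click *tail;
    long count;
} ClickQueue;

/* Appends a click; returns 1 on success or 0 if allocation fails. */
int clickQueueAppend(ClickQueue *queue, long startSample, long endSample, double threshold) {
    assert(queue != NULL);

    Click *node = calloc(1, sizeof *node);
    if (node == NULL) return 0;

    node->clickStartSample = startSample;
    node->clickEndSample = endSample;
    node->threshold = threshold;
    node->nextClick = NULL;

    if (queue->tail == NULL) {
        queue->head = node;            /* first node in an empty queue */
    } else {
        queue->tail->nextClick = node; /* link after the current tail */
    }
    queue->tail = node;
    queue->count++;
    return 1;
}

/* Frees every node, resets the queue, and returns the number of nodes freed. */
long clickQueueFree(ClickQueue *queue) {
    assert(queue != NULL);

    long freed = 0;
    Click *node = queue->head;
    while (node != NULL) {
        Click *next = node->nextClick;
        free(node);
        node = next;
        freed++;
    }
    queue->head = NULL;
    queue->tail = NULL;
    queue->count = 0;
    return freed;
}
```

Because the nodes are ordinary Click structures, queue.head can still be passed to printClickList unchanged.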

C.4 File: vclean.c

#define FFT_METHOD 1
#define HPF_METHOD 2

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdbool.h>

#include <sndfile.h>
#include <fftw3.h>
#include <getopt.h>
#include "SoundFileHandling.c"
#include "ClickList.c"

SF_INFO sourceSoundInfo;
double *leftChannel = NULL;
double *rightChannel = NULL;
Click *leftChannelClickList = NULL;
Click *rightChannelClicklist = NULL;

double calculateSampleRangeMean(double samples[], int numberOfSamples) {
    double sum = 0.0;
    int i;

    for (i = 0; i < numberOfSamples; i++) {
        /* Index with i; samples[numberOfSamples] would read one past the end */
        sum += samples[i];
    }

    return sum / numberOfSamples;
}

void fillBlackmanWindow(double *array, int n) {
    int i;
    for (i = 0; i < n; i++) {
        double p = ((double)(i))/(double)(n-1);
        array[i] = 0.42 - 0.5*cos(2.0*M_PI*p) + 0.08*cos(4.0*M_PI*p);
    }
}

void hpfDeclick(double sensitivity, bool list) {
    const int BLOCK_SIZE = 20000;
    const int BLOCK_OVERLAP = 400;
    const int FILTER_N = 8;

    int clickCount = 0;

    long currentBlockStartPosition;
    long currentWritePosition;

    double currentDataBlock[BLOCK_SIZE];

    //Move over the whole signal, a block at a time

    for (currentBlockStartPosition = 0; currentBlockStartPosition < (sourceSoundInfo.frames - BLOCK_SIZE); currentBlockStartPosition += (BLOCK_SIZE - BLOCK_OVERLAP)) {

        //First get the actual data for this block into our buffer
        for (currentWritePosition = 0; currentWritePosition < BLOCK_SIZE; currentWritePosition++) {
            currentDataBlock[currentWritePosition] = leftChannel[currentBlockStartPosition + currentWritePosition];
        }

        //Now apply the high pass filter. Samples too close to either end of the block
        //are left unfiltered, as in hpf-demonstration.c, so every read stays in bounds.
        long i;
        for (i = FILTER_N + 1; i < BLOCK_SIZE - FILTER_N; i++) {
            double tempSum = 0;
            int j = 0;
            //Each pass overwrites tempSum, so only the final one (j = i + FILTER_N - 1) is kept
            for (j = i - FILTER_N; j < i + FILTER_N; j++) {
                tempSum = currentDataBlock[j-1] - 2*currentDataBlock[j] + currentDataBlock[j+1];
            }

            if (tempSum < 0) tempSum = -tempSum;

            currentDataBlock[i] = sqrt(tempSum / 6);
        }

        //Calculate the mean of the signal in the block
        double signalMean;
        double tempSum = 0;
        for (i = 0; i < BLOCK_SIZE; i++) {
            tempSum += currentDataBlock[i];
        }
        signalMean = tempSum / BLOCK_SIZE;

        //Calculate the first derivative, mean and standard deviation of these
        double dy[BLOCK_SIZE];
        double dyMean;
        double dyStdDev;
        //The central difference only exists for 1 <= i <= BLOCK_SIZE - 2, so the two
        //end entries are set to zero rather than left uninitialised
        dy[0] = 0;
        dy[BLOCK_SIZE - 1] = 0;
        for (i = BLOCK_SIZE - 2; i >= 1; i--) {
            dy[i] = (currentDataBlock[i+1] - currentDataBlock[i-1]) / 2;
        }

        tempSum = 0;

        for (i = 1; i < BLOCK_SIZE - 1; i++) {
            tempSum += dy[i];
        }
        dyMean = tempSum / (BLOCK_SIZE - 2);

        double tempSumOfSquaresOfDeviations = 0;
        for (i = 1; i < BLOCK_SIZE - 1; i++) {
            double deviation = dy[i] - dyMean;
            tempSumOfSquaresOfDeviations += (deviation * deviation);
        }
        dyStdDev = sqrt(tempSumOfSquaresOfDeviations / (BLOCK_SIZE - 2));

        //Now do the actual detection
        double dyThreshold = 2 * dyStdDev / sensitivity + dyMean;
        double sThreshold = 2 * signalMean / sensitivity;

        bool inClick = false;
        long clickEnd;
        for (i = BLOCK_SIZE - 1; i > 1; i--) {

            if (!inClick) {
                if (dy[i] > dyThreshold && currentDataBlock[i] > sThreshold) {
                    inClick = true;
                    clickEnd = currentBlockStartPosition + i;
                }
            } else {
                if (currentDataBlock[i] < sThreshold) {
                    inClick = false;
                    long clickStart = currentBlockStartPosition + i;

                    /* Must start at zero because it is accumulated below */
                    double localDyAverage = 0;
                    int k;

                    for (k = clickStart - currentBlockStartPosition; k < clickEnd - currentBlockStartPosition; k++) {
                        localDyAverage += dy[k];
                    }
                    localDyAverage = localDyAverage / (clickEnd - clickStart);

                    leftChannelClickList = appendClickToClickDataList(leftChannelClickList, clickStart, clickEnd, localDyAverage);
                    clickCount++;
                }
            }
        }
    }
    printf("Time Domain Algorithm: detected %d clicks at sensitivity %lg\n", clickCount, sensitivity);
}

void fftDeclick(double sensitivity, bool list) {
    const int FFT_WINDOW = 1024;
    const int MAX_FFT = 128;
    const int FFT_SIZE = 64;
    const int WINDOW_STEP = 400;
    const int WINDOW_SIZE = 800;
    const int TOTAL_SAMPLES = sourceSoundInfo.frames;
    int clickCount = 0;

    fftw_plan leftChannelPlan;

    /* signed char: the stored levels range from -127 to 127 */
    signed char level[2 * FFT_WINDOW + 1][MAX_FFT];

    double windowCoefficients[2 * MAX_FFT];
    double powerSpectrum[2 * MAX_FFT];

    double fftOutput[2 * MAX_FFT];
    //Why is this +1 again?
    double timeInput[2 * FFT_WINDOW + 1];

    fillBlackmanWindow(windowCoefficients, FFT_SIZE);

    leftChannelPlan = fftw_plan_r2r_1d(FFT_SIZE, timeInput, fftOutput, FFTW_R2HC, FFTW_ESTIMATE);

    long windowStart;
    bool finished = false;

    for (windowStart = 0; !finished && windowStart < TOTAL_SAMPLES; windowStart += WINDOW_STEP) {
        int firstSampleInWindow;
        int lastSampleInWindow;

        /* This may result in some samples being processed twice towards
           the end of the file. Oh well. */
        if (windowStart + WINDOW_SIZE > TOTAL_SAMPLES) {
            finished = true;
            windowStart = TOTAL_SAMPLES - WINDOW_SIZE;
        }

        int i;
        for (i = 0; i < WINDOW_SIZE; i += 2) {
            if ((i + windowStart - FFT_SIZE / 2) > 0) {
                firstSampleInWindow = i + windowStart - FFT_SIZE / 2;
            } else {
                firstSampleInWindow = 0;
            }
            lastSampleInWindow = firstSampleInWindow + FFT_SIZE - 1;

            if (lastSampleInWindow > TOTAL_SAMPLES - 1) {
                lastSampleInWindow = TOTAL_SAMPLES - 1;
            }

            /* Copy data across from the sound file. The bound is inclusive so that all
               FFT_SIZE samples are copied; any shortfall at the end of the file is zeroed. */
            int readPosition, writePosition;
            for (readPosition = firstSampleInWindow, writePosition = 0; readPosition <= lastSampleInWindow; writePosition++, readPosition++) {
                timeInput[writePosition] = leftChannel[readPosition];
            }
            for (; writePosition < FFT_SIZE; writePosition++) {
                timeInput[writePosition] = 0.0;
            }

            /* Window it */
            int j;
            for (j = 0; j < FFT_SIZE; j++) {
                timeInput[j] *= windowCoefficients[j];
            }

            double minP = 1.e30;
            double maxP = -1.e30;

            fftw_execute(leftChannelPlan);

            /* DC Offset */
            powerSpectrum[0] = fftOutput[0] * fftOutput[0];

            for (j = 1; j < (FFT_SIZE + 1) / 2; j++) {
                /* Real part for element j is in fftOutput[j], imaginary part is in
                   fftOutput[FFT_SIZE - j] */
                powerSpectrum[j] = fftOutput[j] * fftOutput[j] + fftOutput[FFT_SIZE - j] * fftOutput[FFT_SIZE - j];
            }

            /* Nyquist freq. */
            if (FFT_SIZE % 2 == 0) {
                powerSpectrum[FFT_SIZE / 2] = fftOutput[FFT_SIZE / 2] * fftOutput[FFT_SIZE / 2];
            }

            for (j = 1; j <= FFT_SIZE / 2; j++) {
                double p = 10.0 * log10(powerSpectrum[j]);

                /* Cap at +- 127 */
                if (p < -127.0) p = -127.0;
                if (p > 127.0) p = 127.0;

                /* Track the maximum and minimum occurring */

                if (p > maxP) maxP = p;
                if (p < minP) minP = p;

                level[i][j - 1] = (signed char) p;
            }
        }

        /* -------- */
        double meanLevel[FFT_SIZE];
        double offset[FFT_WINDOW * 2 + 1];
        double hgt_sum = 0;
        double mean;
        long clickStart = 0;

        int k;
        for (k = 0; k < FFT_SIZE / 2; k++) {
            meanLevel[k] = 0.0;
            for (i = 0; i < WINDOW_SIZE; i += 2) {
                meanLevel[k] += level[i][k];
            }
            meanLevel[k] /= (double) WINDOW_SIZE;
        }

        for (i = 0; i < WINDOW_SIZE; i += 2) {
            offset[i] = 0.0;
            for (k = 0; k < FFT_SIZE / 2; k++) {
                offset[i] += level[i][k] - meanLevel[k];
            }

            offset[i] /= (double) FFT_SIZE / 2.0;
        }

        /* Interpolate the odd positions. The last one has no right-hand neighbour,
           so it copies the value to its left rather than reading an unset element. */
        for (i = 1; i < WINDOW_SIZE; i += 2) {
            if (i + 1 < WINDOW_SIZE) {
                offset[i] = (offset[i - 1] + offset[i + 1]) / 2.0;
            } else {
                offset[i] = offset[i - 1];
            }
        }

        mean = calculateSampleRangeMean(offset, WINDOW_SIZE);

        bool inClick = false;

        /* ------ */
        for (i = 0; i < WINDOW_SIZE; i++) {
            double z = (offset[i] - mean);

            if (z < 0.0)
                z = 0.0;

            if (z > 1.e-30) {
                if (!inClick) {
                    inClick = true;
                    clickStart = i + windowStart;
                    hgt_sum = z;
                } else {
                    hgt_sum += z;
                }
            } else {
                if (inClick) {
                    long clickEnd = i - 1 + windowStart;
                    double clickWidth = (clickEnd - clickStart + 1);

                    double meanHgt = hgt_sum / clickWidth;

                    if (clickWidth > 8) {
                        clickStart += clickWidth / 4.0 + 0.5;
                        clickWidth = (clickEnd - clickStart + 1);
                    }

                    //if (meanHgt > sensitivity) {
                        leftChannelClickList = appendClickToClickDataList(leftChannelClickList, clickStart, clickEnd, meanHgt);
                        clickCount++;
                    //}
                }
                inClick = false;
            }

        }

    }

    /* Release the FFTW plan now that every window has been processed */
    fftw_destroy_plan(leftChannelPlan);

    printf("FFT Algorithm: detected %d clicks at sensitivity %lg\n", clickCount, sensitivity);
}

int main(int argc, char **argv) {

    /* getopt returns an int; storing it in a char can break the comparison with -1 */
    int optChar;
    char *sourceFileName = NULL;
    int detectionMethod = FFT_METHOD;

    /* Get options from the command line */
    while ((optChar = getopt(argc, argv, "i:t::?")) != -1) {
        switch (optChar) {
            case 'i':
                sourceFileName = optarg;
                break;
            case 't':
                detectionMethod = HPF_METHOD;
                break;
            case '?':
                printf("-i specifies the input file. -t uses the time domain algorithm.\n");
                break;
            default:
                break;
        }
    }

    if (sourceFileName == NULL) {
        printf("No input file was specified!\n");
        exit(1);
    }

    sourceSoundInfo = readAndCloseEntireSoundFile(sourceFileName, &leftChannel, &rightChannel);

    switch (detectionMethod) {
        case FFT_METHOD:
            fftDeclick(0.1, false);
            break;
        case HPF_METHOD:
            hpfDeclick(0.4, false);
            break;
    }

    printClickList(leftChannelClickList);

    /* main is declared int, so return an explicit status */
    return 0;
}
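fftDeclick finishes each window by scanning offset[] for contiguous runs of samples that rise above the window mean and recording each run as a click. The sketch below isolates that scanning step as a standalone function, under the assumption that the input is any per-sample measure and a fixed threshold. SampleRun and findRunsAboveThreshold are hypothetical names, and the function differs from the listing in one respect: a run that is still open at the end of the buffer is recorded rather than dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one run of samples above the threshold (inclusive indices). */
typedef struct {
    long start;
    long end;
} SampleRun;

/* Scans values[0..n-1] and stores up to maxRuns runs where values[i] > threshold.
   Returns the number of runs stored. A run still open at the end of the buffer
   is closed at the final sample. */
long findRunsAboveThreshold(const double *values, long n, double threshold,
                            SampleRun *runs, long maxRuns) {
    assert(values != NULL && runs != NULL);

    long runCount = 0;
    long runStart = -1; /* -1 means "not currently inside a run" */

    for (long i = 0; i < n && runCount < maxRuns; i++) {
        if (values[i] > threshold) {
            if (runStart < 0) runStart = i;   /* a new run begins here */
        } else if (runStart >= 0) {
            runs[runCount].start = runStart;  /* the run ended on the previous sample */
            runs[runCount].end = i - 1;
            runCount++;
            runStart = -1;
        }
    }

    if (runStart >= 0 && runCount < maxRuns) {
        runs[runCount].start = runStart;
        runs[runCount].end = n - 1;
        runCount++;
    }

    return runCount;
}
```

Adding windowStart to each stored index would turn the window-relative positions into positions in the whole recording, as the listing does when it computes clickStart and clickEnd.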