Evaluation of methods for the rapid extraction of the Auditory Brainstem Response
from underlying Electroencephalogram
Anjula C. De Silva
Bachelor of Science (Electrical Engineering)
Thesis as the requirement of
Doctor of Philosophy
Sensory Neuroscience Laboratory
Swinburne University of Technology
2011
Coordinating Supervisor: Dr. Mark A. Schier
Associate Supervisor: A/Prof. David J.T. Liley
Authorship
I hereby declare that this submission is my own work. The content in this thesis has not
been previously submitted for a degree or diploma in Swinburne University of Technology
or in any other higher educational institute. To the best of my knowledge and belief,
the thesis contains no material previously published or written by another person except
where due references are made.
Signature: ............
Date: .................
Journal articles
Material for the following articles was extracted from this thesis.
• DE SILVA, A. C., SINCLAIR, N. C. & LILEY, D. T. J. 2012. Limitations in the
rapid extraction of evoked potentials using parametric modelling. IEEE Transactions
on Biomedical Engineering, accepted on January 21, 2012.
• DE SILVA, A. C. & SCHIER, M. A. 2011. Evaluation of wavelet techniques in rapid
extraction of ABR variations from underlying EEG. Physiological Measurement, 32,
1747-1761.
Conference proceedings
The following conference papers have been prepared in support of this thesis.
• DE SILVA, A. C. & SCHIER, M. A. 2009. A Feasibility Study of Commercially Available
Audio Transducers in ABR Studies. 13th International Conference on Biomedical
Engineering, Singapore.
• DE SILVA, A. C. & SCHIER, M. A. 2010. Effectiveness of wavelet filtering in rapid
extraction of ABR from underlying EEG. Biosignal 2010, Berlin.
Abstract
The single trial or rapid extraction of evoked potentials (EPs) has previously been applied
to middle and late latency evoked potentials with the aim of accurately tracking a variety
of central nervous system processes. Because the evoked ‘far fields’ are expected to be
largely independent of the overlying ‘near field’ EEG noise, it can be argued that single
trial extraction techniques are better suited to study rapid extraction of the auditory
brainstem response (ABR) compared with the other EPs with cortical origin. However,
methods have not been systematically studied to extract variations in the early ABR
largely due to the inherent low signal to noise ratio in single trials. Therefore, this thesis
aims to systematically analyse the denoising and time-scale variation tracking of the ABR
using autoregression with an exogenous input (ARX) and wavelet methods.
Rapid extraction of the ABR could reduce clinical test times, serve as a non-invasive
tool for long-term patient monitoring systems with enhanced patient comfort, and support
real-time sensory identification applications in brain-computer interfacing. The literature
revealed that time-series modelling using ARX and wavelet denoising techniques have the
potential to extract the ABR. These findings are further strengthened by the existence
of commercial devices using ARX modelling for monitoring depth of anaesthesia and the
encouraging results reported with wavelets in EP studies.
The dissertation initially presents the analysis conducted to adopt ARX modelling to
extract simulated ABRs. This includes a systematic evaluation of the ARX model and
its modified algorithm, the robust evoked potential estimator (REPE), for their feasibility
and limitations when used in the presence of known variations of ABR latency and signal
to noise ratio. Results revealed superior performance with ARX modelling in extracted
morphology (with a mean correlation coefficient of 0.84 (SD = 0.02)) and latency tracking
(with a mean square error of 0.18 (SD = 0.02)) compared to the robust evoked potential
estimator with a mean correlation coefficient of 0.63 (SD = 0.06) and a mean square error
of 0.35 (SD = 0.06). Verification of these simulated results with actual ABRs showed that
ARX modelling is capable of extracting time-scale varying features of a signal only
at relatively high SNRs of > −20 dB.
In a separate study, wavelet denoising methods were analysed as a rapid extraction
system by initially applying them to simulated ABRs followed by application to ABRs
recorded from human participants. The previously reported latency-intensity curve of the
ABR wave V was used as the reference to determine the variation tracking capability of
these wavelet methods. The application of the wavelet methods to the recorded ABRs
required validation of threshold functions and time-windows as an integral part of this
research. To arrive at more accurate results, the wavelet study was extended to observe
the effect of shift-variant discrete wavelet transform and the shift-invariant stationary
wavelet transform with the tested wavelet methods.
The cyclic shift tree denoising (CSTD) wavelet method with the discrete
wavelet transform was found to be the most effective, producing significantly lower MSEs
than the other methods (p < 0.01), with an optimum mean square error of 0.18
(SD = 0.01). This required an ensemble of only 32 epochs to extract a fully featured
ABR with the latency variations associated with the latency-intensity curve. However, use
of the computationally redundant stationary wavelet transform yielded significantly better
results (p < 0.01) than the discrete wavelet transform, with an MSE of 0.11
(SD = 0.01). An ensemble of 32 epochs is a significant improvement over
conventional moving time averaging, which uses approximately 1024 epochs to extract the
ABR.
The systematic analysis of rapid extraction of the ABR concluded that the CSTD wavelet
method produced the optimum result, requiring an ensemble of only 32 epochs to produce
an ABR with characteristic features and their time-scale variations, and outperforming the ARX
modelling methods. Future developments of this work could include recording the ABR
in an ambulatory mode to document and understand the normal population, and such
developments could also find subsequent clinical applications.
Acknowledgement
Over the four years since I started this research, many people have supported, encouraged
and given me valuable advice.
Foremost, I acknowledge my supervisors Dr. Mark Schier and A/Prof. David Liley who
kept me focused and guided me towards the light through the intricacies of the research
path. Apart from the research, the support given by way of understanding the life in a
foreign country is much appreciated.
I acknowledge the support given by Nicolas Sinclair from Brain Science Institute (now
with BionicVision Australia) with the collaborative work carried out in evoked response
modelling.
Also I acknowledge the financial support given by the Sensory Neuroscience Laboratory
and Ian Black of the Swinburne TAFE for providing me an employment opportunity to
financially support my living which helped me to concentrate on work related to this thesis
with a peaceful mind.
I acknowledge the valuable advice given by Prof. Peter Cadusch of the Faculty of
Engineering and Industrial Sciences and Martin Dubaj from the Sensory Neuroscience
Laboratory regarding wavelets and Prof. Andrew Wood and David Simpson from the
Faculty of Life and Social Sciences regarding hardware setup for data collection. Also I
acknowledge Chris Anthony from the Faculty of Life and Social Sciences and Jim Barbour
of Media and Communications Group for helping me in the laboratory and expertise given
in the area of acoustics.
I would like to extend special thanks to Dr. Dario Toncich who is the initiator of this
research through which I gained invaluable exposure.
My gratitude is extended to all the friends who were with me every step of the way
sharing hard times, embracing good times and encouraging me to reach this level, especially
by filling the gaps of home touch.
My heartfelt appreciation goes to my mother, father and the sister who gave me con-
stant and unconditional support throughout. You are the invisible force behind the journey
of life. Also my special thanks is extended to my uncles Prof. Nihal Kodikara and Prof.
Saman Gunathilake for valuable advice.
Finally I acknowledge the patience and understanding of my dear wife Pabarasi, to
whom I have to prove a lot from the outcome of this thesis.
Sincerely,
Anjula De Silva.
Abbreviations
AABR Automated Auditory Brainstem Evoked Response
AAI A-Line ARX index
ABR Auditory Brainstem Response
AEP Auditory Evoked Potentials
ALR Auditory Late Response
AMLR Auditory Middle Latency Response
AR Autoregressive
ARX Autoregressive model with an exogenous input
ASSR Auditory Steady State Response
BCI Brain Computer Interface
CI Confidence Interval
CNS Central Nervous System
CSTD Cyclic Shift Tree Denoising
CTMC Constant Threshold with Matching Coefficients
dof degree of freedom
DWT Discrete Wavelet Transform
ECG Electrocardiogram
ECochG Electrocochleogram
EEG Electroencephalogram
EMG Electromyogram
EOG Electrooculogram
EP Evoked Potential
ERP Event Related Potential
FIR Finite Impulse Response
FPE Final Prediction Error
Fsp F statistics at a single point
IIR Infinite Impulse Response
L-I Latency-Intensity
MA Moving Average
MLAEP Middle Latency Auditory Evoked Potential
MSE Mean Square Error
MTA Moving Time Average
nHL Normal Hearing Level
OAE Otoacoustic Emission
PSWC Periodic Sharp Wave Complexes
REPE Robust Evoked Potential Estimator
SAET Stimulus Artifact End Time
SEP Somatosensory Evoked Potentials
SNR Signal to Noise Ratio
SWT Stationary Wavelet Transform
TWMC Temporal Windowing with Matching Coefficients
UNHS Universal Neonatal Hearing Screening
VEP Visual Evoked Potential
WT Wavelet Transform
Z Integers
Contents
1 Introduction
1.1 Evoked potentials
1.2 Rapid extraction of EPs
1.2.1 Parametric modelling
1.2.2 Wavelet denoising
1.3 Thesis objectives
1.4 Perceived contributions of the research
1.5 Organization of the thesis
2 A review of the ABR and its extraction
2.1 Review of evoked potentials and the auditory brainstem response
2.1.1 Evoked potentials
2.1.2 Auditory evoked potentials
2.1.3 Origin of the auditory brainstem response
2.1.4 Factors influencing the ABR
2.1.5 EPs in brain computer interfacing
2.1.6 ABRs for rapid extraction
2.2 Review of ARX modelling based extraction methods
2.2.1 Moving time averaging to parametric modelling
2.2.2 The ARX(p, q, d) model
2.2.3 Applications of ARX modelling
2.2.4 Robust evoked potential estimator (REPE)
2.2.5 Simulation studies and drawbacks
2.2.6 Scope of the current study
2.3 Review of wavelet based extraction methods
2.3.1 Wavelets in the extraction of ABRs and in general EPs
2.3.2 Concept of wavelets
2.3.3 Basis wavelets
2.3.4 DWT with Biorthogonal wavelets
2.3.5 Shift variance of DWT
2.3.6 Stationary wavelet transform
2.4 Summation of the ABR extraction methodologies
2.4.1 Hypotheses
2.5 ABR data
2.5.1 Types of ABR data
2.5.2 Simulated ABR data
2.5.3 Real ABR data
3 Recording and constructing synthetic ABR data
3.1 Recording of ABR data
3.1.1 Equipment and parameters
3.1.2 Participant details
3.1.3 MTA and statistically significant SNR
3.1.4 Data organisation
3.1.5 The template
3.2 Latency-intensity and amplitude-intensity curves
3.2.1 Compatibility of the L-I curve model
3.3 Synthetic ABR model
3.3.1 Construction of the ABR model
3.3.2 Construction of synthetic datasets
3.3.3 Adding noise to simulated datasets
4 ARX modelling in rapid extraction of the ABR
4.1 Introduction to the simulation study
4.2 Methods
4.2.1 Simulation study domain and extrapolation
4.2.2 Simulated reference ABR and datasets
4.2.3 Acquisition of real ABR data
4.2.4 Predetermined models
4.3 Results
4.3.1 The efficacy of identifying the predefined models
4.3.2 Estimation of model orders
4.3.3 Comparison of model performance
4.3.4 Estimated single sweep of an ABR
4.3.5 Tracking variations of a single sweep
4.3.6 Confirmation of simulated results with actual ABRs
4.4 Discussion
4.5 Conclusion
5 Wavelets in rapid extraction of the ABR
5.1 Wavelet extracting methods
5.1.1 Synthetic and real ABR template
5.1.2 Wavelet decomposition levels
5.1.3 Constant thresholds with matching coefficients (CTMC)
5.1.4 Time windowing with matching coefficients (TWMC)
5.1.5 Cyclic shift tree denoising (CSTD)
5.1.6 Use of SWT algorithm in CTMC, TWMC and CSTD
5.2 Choice of the basis wavelet
5.3 Simulation study on wavelet methods
5.3.1 Denoising
5.3.2 Latency tracking
5.4 Evaluation of wavelet methods on real ABR Data
5.4.1 Denoising ability of wavelet methods
5.4.2 Latency tracking ability of wavelet methods
5.5 Results
5.5.1 Determination of a common basis wavelet for analysis
5.5.2 Determination of CSTD level threshold function
5.5.3 Noise reduction of wavelet methods with DWT
5.5.4 Fsp threshold in quantifying the effectiveness of wavelet filtered ABRs
5.5.5 Comparison of noise reduction between DWT and SWT
5.5.6 Latency tracking results of wavelet methods with DWT
5.5.7 Latency tracking results of wavelet methods with SWT
5.6 Discussion
5.6.1 Evaluation of de-noising capacity of wavelet methods using DWT
5.6.2 Performance comparison of DWT and SWT decomposition algorithms
5.6.3 Evaluation of latency tracking with DWT and SWT
5.7 Conclusion
6 Overall conclusions and further work
6.1 The approach towards the extraction of ABR
6.2 Rapid extraction with ARX and REPE
6.3 Rapid extraction with wavelets
6.4 Limitations of the current study and future work
A Appendix A
B Appendix B
C Appendix C
D Appendix D
E Appendix E
F Appendix F
List of Figures
2.1 The AEPs include early, middle and late potentials
2.2 A fully featured ABR recorded from a participant
2.3 Auditory pathway from inner ear to the primary auditory cortex
2.4 Presumed generators of the ABR waves I-V
2.5 Latency-intensity curves of wave I, wave III and wave V
2.6 The process of the ARX model
2.7 The process of the REPE
2.8 SNR improvement of ARXE (same as ARX) and REPE
2.9 Mallat’s cascaded filter multiresolution analysis
2.10 Decomposition and the synthesis tree of the SWT
2.11 Types of auditory stimulus and their frequency spectrums
2.12 Possible electrode montages for ABR recordings
3.1 ABR recording with a stimulus artifact
3.2 Typical recording setup on a participant
3.3 Fsp plot for a worst-case scenario
3.4 Synthetic and the Real ABR templates
3.5 Latency and amplitude intensity curves derived from recorded data
3.6 The theoretical and the derived L-I curve of wave V
3.7 Synthetic and the Real ABR templates
3.8 Comparison spectra of the ABR model and the real ABR
3.9 Types of datasets used in the simulation study
3.10 Spectra of associated noise compared to Gaussian white noise
4.1 Characteristics of the transfer function of the ARX model
4.2 Characteristics of the transfer function of the REPE
4.3 Estimated Pole (x) and Zero (o) plots of the ARX model
4.4 Estimated Pole (x) and Zero (o) plots of the REPE
4.5 Results of the fixed model order determination of the ARX model
4.6 Results of the fixed model order determination of the REPE
4.7 The SNR improvement of the estimated ABR
4.8 Detection of wave V with empirical and theoretical model orders
4.9 Single sweep estimated with ARX model and REPE
4.10 Wave V latency (1 ms) tracking using ARX(6,7,0)
4.11 Wave V latency (2 ms) tracking using ARX(6,7,0)
4.12 Comparison of the latency tracking of the ARX estimation and the MTA
4.13 Wave V latency (1 ms) tracking using REPE(6,7,8,0)
4.14 Wave V latency (2 ms) tracking using REPE(6,7,8,0)
4.15 Comparison of the latency tracking of the REPE estimation and the MTA
4.16 Histograms of model order combinations for real ABR
4.17 L-I curves derived with a single epoch, MTA of 32, 128 and 256
4.18 Unstable model estimates
5.1 Synthetic and the Real ABR templates
5.2 Flowchart of the CTMC algorithm
5.3 Flowchart of the TWMC algorithm
5.4 Temporal windows defined for the TWMC
5.5 The flowchart of the CSTD algorithm
5.6 Averaging sequence of the CSTD algorithm
5.7 Defined temporal windows for TWMC with SWT algorithm
5.8 The constructed array with SWT coefficients to suit CSTD
5.9 Improvement in the SNR with wavelet filtering methods
5.10 Latency tracking results with simulated ABR datasets
5.11 Effect of Biorthogonal basis wavelets on denoising methods
5.12 Effect of level threshold functions in CSTD
5.13 Denoising effect of Wavelet methods
5.14 Surface plots of CTMC filtered ABRs
5.15 Surface plots of TWMC filtered ABRs
5.16 Surface plots of CSTD filtered ABRs
5.17 Mean correlation coefficients between the template and CTMC filtered ABRs
5.18 Denoised ABRs at a block size of 32
5.19 Effect of dof of F statistics on the threshold criteria
5.20 Comparison of Denoising of SWT and DWT
5.21 The plot of the effect of denoising of SWT and DWT on Random ABRs
5.22 Latency tracking with wavelet methods using DWT
5.23 L-I curves derived from estimated models with DWT
5.24 Latency tracking with wavelet methods using SWT
5.25 L-I curves derived from estimated models with SWT
5.26 The difference of the MSE of the L-I curves derived using DWT and SWT
List of Tables
2.1 A brief summary of the types of EPs, their generators and features
2.2 Normative Latencies and Amplitudes for ABR wave features
2.3 Specifications of key algorithms for rapid extraction
2.4 Settings for a typical ABR recording
3.1 Finalised parameters for the data collection for the main study
4.1 Unstable estimated epochs percentage (%)
5.1 Frequency Content of wavelet subspaces
5.2 Coefficients of SWT and DWT
5.3 ANOVA results comparing MSEs produced by CSTD, TWMC and CTMC
5.4 Tukey post-hoc comparison of CSTD against TWMC and CTMC
5.5 Results of paired t-test between the MSEs of DWT and SWT denoised ABRs
5.6 Coefficients of the estimated models of the L-I curves derived with DWT
5.7 Results of the t-test for DWT derived curves
5.8 Coefficients of the estimated models of the L-I curves derived with SWT
5.9 Results of the t-test for SWT derived curves
Chapter 1
Introduction
Bioelectric signals are key tools used by physicians for diagnosis, research, therapy and
prognosis of the health of patients and recently for brain-computer interfacing (BCI). These
signals are obtained through electrodes which sense the variations in electrical potentials
generated by physiological systems. Bioelectric signals originate as a result of groups of
neural or muscular cells producing an electric field which propagates through tissues in
the body (Adelman & Smith 1999). A few of these include electroencephalogram (EEG),
electrooculogram (EOG), electrocardiogram (ECG) and electromyogram (EMG).
Evoked potentials (EPs) are a sub-group of EEG that directly measure the electrical
response of the cortex to sensory stimuli or to affective and cognitive processes. In general,
EPs are relatively small in amplitude (less than 30 µV) compared to the ongoing
EEG (20-50 µV); early components such as the auditory brainstem response
(ABR) are smaller still, in the range of one tenth of a microvolt. Therefore, biomedical signal
processing techniques are extensively used to extract these EPs and to enhance their features for
accurate diagnosis and prognosis. Signal processing for the extraction of EPs is currently a major
area of research, which looks into rapid and accurate variation tracking for intraoperative
neurophysiological monitoring and patient comfort related applications.
1.1 Evoked potentials
Evoked potentials are bioelectric signals recorded from the scalp in response to a variety
of controlled internal and external stimuli. The time-locked stimuli activate a series of
neuronal populations along the path from the receptor to the brain. It should be noted
that event related potentials (ERPs) are another sub-group of EEG, which result directly
from a thought or perception; examples include the P300, N400 and P600. ERPs are often
confused with EPs. The ABR, for example, is an EP: a measured brain response to sound that
is not directly the result of a thought or perception.
Both EPs and ERPs produced by the activation of neuronal populations can be
recorded via scalp electrodes. In general, an EP lasts for a few hundred milliseconds,
with its various features categorised into early, middle and late components. In this
thesis, we investigate the auditory brainstem response (ABR), which is one of the early
components of auditory EPs, arising within 0-10 ms and reflecting compound action potentials
along the auditory pathway from the distal vestibulocochlear nerve (VIII cranial nerve) to the
inferior colliculus in the brainstem (Hall 2007).
Features of EPs that are of clinical and physiological relevance include amplitude and
latency variations of specific, well defined peaks. These provide important information
regarding cortical activity and therefore are measures of the functional state of the cortex.
The latency and amplitude of EPs have been observed to be affected by
neurophysiological disorders, subject factors, stimulus and acquisition factors, and drug
and muscular artefacts.
Neurological disorders have an effect on a range of EPs, especially on the ABR; examples
include multiple sclerosis, Parkinson’s disease, tumors and strokes (Chiappa & Ropper 1982,
Kodama, Ieda, Hirayama, Koike, Ito & Sobue 1999, Misra & Kakita 1999). Also, the ABR is
used as an important tool in intraoperative monitoring of the acoustic nerve during acoustic
neuroma and brainstem tumor resections (Lee, Song, Kim, Lee & Kang 2009, Morawski,
Niemczyk, Sokolowski & Telischi 2010, Matthies 2008) along with ECochG (Gouveris
& Mann 2009) and direct cochlear nerve action potential (CNAP) (Aihara, Murakami,
Watanabe, Takahashi, Inagaki, Tanikawa & Yamada 2009, Yamakami, Yoshinori, Saeki,
Wada & Oka 2009). Further, variations of the peak amplitude of middle latency auditory
evoked potentials (MLAEPs) have been recorded in response to anaesthetic agents such as
propofol and to the analgesics alfentanil, fentanyl and morphine (Davies, Mantzaridis, Kenny &
Fisher 1996, Schwender, Rimkus, Haessler, Klasing, Poppel & Peter 1993, Thornton 1991).
The correlation between the amplitude of the MLAEP and depth of anaesthesia was
positively identified from the results of an artificial neural network fed with selected
components extracted by applying the discrete wavelet transform to the MLAEP (Nayak &
Roy 1998). Further, the MLAEP has been used commercially as an automated monitoring
tool by means of the A-Line ARX index (Jensen, Nygaard & Henneberg 1998). The emerging field
of brain computer interfacing (BCI) is heavily dependent upon the visual and auditory P300
response, which is related to perception, and the auditory steady state response (ASSR)
(Furdea, Halder, Krusienski, Bross, Nijboer, Birbaumer & Kübler 2009, Lee, Hsieh, Wu,
Shyu & Wu 2008, Lopez, Pomares, Pelayo, Urquiza & Perez 2009, Nijboer, Furdea, Gunst,
Mellinger, McFarland, Birbaumer & Kübler 2008, Pham, Hinterberger, Neumann, Kübler,
Hofmayer, Grether, Wilhelm, Vatine & Birbaumer 2005).
While changes in the morphology of EP peaks are slow (hours, days) for neurological
disorders, changes due to surgical procedures, drug administration and stimulus
parameters can, in contrast, be very rapid (seconds, minutes) (Jensen, Lindholm &
Henneberg 1996). The detection rate of ERPs generated for BCI applications similarly
needs to be close to real-time for the external system to be able to react meaningfully to
the brain state. However, the delay between the emission of the EP/ERP
and its interpretation is a common issue in BCI systems (Lopez et al. 2009) and in
intraoperative monitoring (Yamakami et al. 2009) when the conventional moving time
average method is used to detect EP/ERP changes, which highlights the need for a rapid
extraction method.
1.2 Rapid extraction of EPs
The far field recordings of EPs are challenging to extract because they are distorted by the
comparatively larger amplitudes of the spontaneous EEG. Typically, early and middle components
of an EP have a signal to noise ratio (SNR) in the range of -20 to -30 dB (an amplitude ratio
of 0.1:1 to 0.03:1). Conventionally, a moving time average of a large ensemble of time-locked
responses is used to extract the deterministic EP and to suppress the random spontaneous
EEG, which is assumed to be uncorrelated with the EP. This assumption holds for early components
but less so for middle and late components, due to the wide range of neuronal populations involved
in their generation. In conventional ABR extraction, a moving time average of approximately 1000
sweeps is considered sufficient to suppress the noise and arrive at the ABR, i.e. an SNR improvement
of ≈ 30 dB (Rushaidin, Salleh, Swee, Najeb & Arooj 2009, Strauss, Delb, Plinkert &
Schmidt 2004, Shangkai & Loew 1986, Wilson 2004, Stuart, Yang & Botea 1996).
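The figures quoted above can be checked with a short numerical sketch. The Python below is illustrative only and is not taken from the thesis: the single-peak waveform, the number of samples, the Gaussian white noise model and all function names are assumptions. It converts an SNR in dB to an amplitude ratio, and estimates the SNR gain obtained by averaging N time-locked sweeps whose noise is uncorrelated with the response, which in theory is 10 log10(N) dB (about 30 dB for N = 1024).

```python
import numpy as np

def db_to_amplitude_ratio(snr_db):
    """Convert an SNR in dB to a signal:noise amplitude ratio."""
    return 10.0 ** (snr_db / 20.0)

def power_snr_db(signal, noise):
    """SNR in dB computed from mean-square signal and noise power."""
    return 10.0 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

def averaging_gain_db(n_sweeps, single_sweep_snr_db=-25.0, n_samples=256, seed=0):
    """Estimate the SNR gain (dB) of averaging n_sweeps time-locked sweeps.

    Assumptions (hypothetical, for illustration): the response is a single
    Gaussian-shaped peak in a 0-10 ms window and the noise is white Gaussian.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 10e-3, n_samples)
    ep = np.exp(-((t - 5.6e-3) / 0.4e-3) ** 2)  # stand-in for a deterministic EP
    ep_rms = np.sqrt(np.mean(ep ** 2))
    # Scale the noise so that every single sweep has the requested SNR.
    noise_rms = ep_rms / db_to_amplitude_ratio(single_sweep_snr_db)
    noise = rng.normal(0.0, noise_rms, size=(n_sweeps, n_samples))
    average = (ep + noise).mean(axis=0)  # time-locked ensemble average
    residual = average - ep              # noise left after averaging
    return power_snr_db(ep, residual) - power_snr_db(ep, noise)

if __name__ == "__main__":
    print("amplitude ratio at -20 dB:", db_to_amplitude_ratio(-20.0))
    print("amplitude ratio at -30 dB:", db_to_amplitude_ratio(-30.0))
    print("theoretical gain for 1024 sweeps (dB):", 10.0 * np.log10(1024))
    print("simulated gain for 1024 sweeps (dB):", averaging_gain_db(1024))
```

The simulated gain only approaches the theoretical value when the noise really is uncorrelated with the response and the response is identical in every sweep, which are the assumptions this section goes on to question.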
The use of a conventional moving time average in these applications however results in
poor time resolution and therefore cannot be used to detect the fast variations in latency
and amplitude of EPs. In practice, an initial ABR study (as the first step) investigating
the response of the auditory system to different stimulus intensities takes approximately 30
minutes using conventional moving time averaging (i.e. both ears are tested using only one
stimulus frequency, five stimulus intensities and two repeat traces per intensity) (Moller,
Jho, Yokota & Jannetta 1995). A more detailed ABR study investigating how the auditory
system responds to different stimulus frequencies and intensities takes approximately 60
minutes using current technology (i.e. both ears are tested using four stimulus frequencies,
three stimulus intensities and two repeat traces per intensity) (Wilson, Mills, Bradley,
Petoe, Smith & Dzulkarnain 2011, Vannier & Naït-Ali 2004).
In addition, the underlying assumptions that the EP is time invariant and that the
background spontaneous EEG is independent of the EP make the conventional moving time
average unsuitable for extracting a series of EPs with time varying features. The weakness
of these assumptions can be explained further as follows (Sun & Chen 2008):
• Owing to the active and adaptive ability of the cerebrum and the nervous system, the evoked
response is not necessarily a deterministic process but is stochastic in nature.
• Multiple excitations during traditional ABR tests will cause the nervous system to
react repeatedly and fatigue chronically, which will affect the waveform of the final
induced responses to some extent.
• The relation between the signal and the noise cannot be described by a simple additive
model, i.e. the signal and the noise may not be wholly independent. It has been found that
sound stimulation may have a phase control effect on the self-induced brainstem
potential (Hanrahan 1990).
Further drawbacks of the current technology for the extraction of ABRs are reported in
international health schemes. According to the Universal Neonatal Hearing Screening
program (UNHS), Automated Auditory Brainstem Evoked Response (AABR) technology is
the preferred screening tool due to its low false positive rate of 4% (Tann, Wilson,
Bradley & Wanless 2009). However, screening is a multi-stage process in which an
Otoacoustic Emission (OAE) test is performed before referral for AABR. This multi-stage
process is prone to inaccuracies, unnecessary time delays and the additional cost of equipping
test centres with OAE devices, which account for approximately 60% of installed devices
globally (Moller et al. 1995).
In eliminating these drawbacks, two significant limitations that impede AABR as the
sole clinical neonatal hearing screening device are:
• Lengthy ABR acquisition times limit the assessment to a single near-threshold
stimulus intensity (typically 35 dB nHL), although a more thorough
and accurate ABR test could be performed if results both above and below the
hearing threshold were included.
• The acquisition of the ABR is subject to high levels of noise interference from
both external noise sources and the neonate being tested. Therefore, data acquisition
times for the near-threshold ABR waveforms required for UNHS are typically
around 5 minutes and, in less favourable acquisition conditions, extend to 20 minutes
(Corona-Strauss, Delb, Schick & Strauss 2010a), after which testing is typically
aborted and deferred to another time, which causes parental anxiety.
An automated and objective audiological method that is able to quantify the hearing
threshold within a short measurement time (less than 2 or 3 minutes) without
sedation or general anaesthesia could drastically reduce the number of audiological
examinations that need to be performed under anaesthesia (Strauss et al. 2004).
The identification of these constraints led to further analysis and improved rapid ex-
traction methods. Recent literature has reported considerable work on the reduction of the
number of epochs required to obtain EPs in general and ABRs in particular. Early noise
reduction methods such as matched filters (Delgado & Ozdamar 1994), use of templates
(Vannier, Adam, Karasinski, Ohresser & Motsch 2001) and Wiener filtering (Doyle 1975)
make the assumption that the signal is stationary.
However, ABR signals are transient (non-stationary) in nature and present with vari-
able peak morphology, both within a single ABR and between different ABRs (De Weerd
1981). In other words, the frequency content of bioelectric waveforms such as ABRs varies
over their time course and is localised in time; as such, these waveforms are non-stationary,
i.e. transient (Samar, Bopardikar, Rao & Swartz 1999).
Woody (1967) introduced an iterative method for EP/ERP latency estimation based
on common averages. He determined the time instant of the best correlation between a
template (EP/ERP average) and single trials by shifting the latter in time. A similar
method has been adopted by Vannier et al. (2001) and Delgado & Ozdamar (1994) by
shifting a template for each peak of the ABR. While these methods correct for possible latency
variability of EPs/ERPs, their performance is highly dependent on the choice of templates.
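The template correlation approach described above can be sketched as follows. This is a minimal, noise-free illustration rather than a reproduction of Woody's (1967) implementation: it assumes an unnormalised cross-correlation, a Gaussian-shaped stand-in for an EP peak and a fixed number of iterations, and all function names are hypothetical.

```python
import math

def bump(n, centre, width=3.0):
    """Gaussian-shaped stand-in for a single EP peak (illustrative only)."""
    return [math.exp(-((i - centre) ** 2) / (2 * width ** 2)) for i in range(n)]

def best_lag(template, trial, max_lag):
    """Lag (in samples) that maximises the cross-correlation with the template.
    A positive lag means the trial's response is later than the template's."""
    n = len(template)
    best, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = sum(template[i] * trial[i + lag]
                    for i in range(n) if 0 <= i + lag < n)
        if score > best_score:
            best, best_score = lag, score
    return best

def shift(trial, lag):
    """Advance a trial by `lag` samples (zero padded) to undo its latency shift."""
    n = len(trial)
    return [trial[i + lag] if 0 <= i + lag < n else 0.0 for i in range(n)]

def woody_average(trials, max_lag, n_iter=3):
    """Iteratively re-align the trials to the running average (assumed variant)."""
    n = len(trials[0])
    template = [sum(t[i] for t in trials) / len(trials) for i in range(n)]
    lags = [0] * len(trials)
    for _ in range(n_iter):
        lags = [best_lag(template, t, max_lag) for t in trials]
        aligned = [shift(t, lag) for t, lag in zip(trials, lags)]
        template = [sum(a[i] for a in aligned) / len(aligned) for i in range(n)]
    return template, lags

# Noise-free demonstration: three copies of one peak with latency jitter.
trials = [bump(100, c) for c in (47, 50, 53)]
template, lags = woody_average(trials, max_lag=10)
print(lags)  # [-3, 0, 3]
```

The lags are recovered exactly here only because the trials are noise free; with realistic EEG noise the initial template is itself degraded, which is the template dependence noted above.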
To avoid this dependence on templates, the Wiener filter (Doyle 1975) uses spectral estimation
to reduce uncorrelated noise. This technique, however, is less accurate for EPs/ERPs,
because the time course of transient signals is lost in the Fourier domain (Effern, Lehnertz,
Fernández, Grunwald, David & Elger 2000). The disadvantages of Fourier analysis arise
from the decomposition of the signal of interest into linear combinations of sine and
cosine waves, which are highly localized in frequency but not localized in time
(Raz, Dickerson & Turetsky 1999, Quian Quiroga 2000).
1.2.1 Parametric modelling
One common approach to the rapid extraction of EPs that avoids the above-mentioned drawbacks
is to parametrically model the EP using an autoregressive model with an exogenous input
(ARX). Here a single sweep is modelled with a reference signal (exogenous input)
and white noise. ARX modelling has been widely adopted by researchers to rapidly ex-
tract MLAEP, visual evoked potentials (VEP) and somatosensory evoked potentials (SEP)
(Cerutti, Baselli, Liberati & Pavesi 1987, Jensen et al. 1996, Rossi, Bianchi, Merzagora,
Gaggiani, Cerutti & Bracchi 2007). This method of rapid extraction has been used to
quantify changes in MLAEP during anaesthesia (Jensen et al. 1996, Mainardi, Kupila,
Nieminen, Korhonen, Bianchi, Pattini, Takala, Karhu & Cerutti 2000, Urhonen, Jensen
& Lund 2000), changes in auditory N100, as a means of monitoring sedation in cardiac
surgery patients (Mainardi et al. 2000) and changes in SEPs to investigate a combined
spinal cord intraoperative neuromonitoring technique. The ARX method has also been
extended to make the single sweep estimation process resistant to noise using the robust
evoked potential estimator (REPE) (Lange & Inbar 1996).
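For orientation, the generic textbook form of an ARX model is given below. The symbols are assumptions made for illustration and may differ from the notation used later in the thesis: y(t) is the recorded single sweep, u(t) the reference signal, e(t) white noise, n and m the model orders, and d a pure delay.

```latex
y(t) = -\sum_{i=1}^{n} a_i \, y(t-i) + \sum_{j=0}^{m} b_j \, u(t-d-j) + e(t)

A(z) = 1 + \sum_{i=1}^{n} a_i z^{-i}, \qquad
B(z) = z^{-d} \sum_{j=0}^{m} b_j z^{-j}

Y(z) = \frac{B(z)}{A(z)} \, U(z) + \frac{1}{A(z)} \, E(z)
```

Under this form, the single sweep EP estimate is the reference signal filtered by B(z)/A(z), while the background EEG is represented by white noise filtered by 1/A(z).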
To date, ARX and REPE methods of rapid extraction have been evaluated on the basis
of their ability to detect assumed variations in actual EP data. Since the actual EP is
unknown in real data, simulated data with a known, deterministic EP are required so that
ARX methods for rapid extraction can be meaningfully evaluated. A set
of synthetic data with predefined, but physiologically plausible, variations in latency and
amplitude therefore provides a better basis for comparing model estimates than real
EPs, which carry large variances due to physiological and recording conditions. While a
range of attempts have been made to conduct simulation studies (Cerutti et al. 1987, Lange
& Inbar 1996, Rossi et al. 2007), these suffer from:
• Use of grand average of real EP data as the reference signal.
– “An actual average over 99 sweeps of a visual evoked potential is taken as the
reference signal u” (Cerutti et al. 1987)
– “An averaged 200-trial ensemble of a finger tapping experiment was used as the
signal s(n)” (Lange & Inbar 1996)
Such experimental ERPs are not reproducible, for reasons such as participant
condition, experimental setup and noise in the recording environment.
Therefore a comparison or a performance evaluation against previous work
is impossible.
• Ambiguity of the selection criteria for model parameters with qualitative statements,
such as:
– “The transfer function B(z)/A(z) is designed in such a way as to perform only
a temporal delay on the reference signal u” (Cerutti et al. 1987).
– “Then, single-trial realizations with . . . varying gains were synthesized from
the simulated signal and ongoing activity at different SNRs, . . . ” (Lange &
Inbar 1996)
Such qualitative statements make the validation of the model performance with
previous work impossible.
• The range of latency variations tested in the simulated ERP is limited, e.g.
– “. . . single-trial realizations with a constant latency shift of three sample points
. . . ” (Lange & Inbar 1996)
The maximum latency shift of a typical ABR recording is 80 sample points (explained
further in chapter 4); such studies therefore do not encompass the expected range of
physiological variation of the ABR.
Therefore, one of the aims of this thesis is to systematically address these shortcomings
and provide a solid evaluation criterion for denoising signals with parametric modelling.
1.2.2 Wavelet denoising
Recently, wavelet domain filtering has been widely studied in conjunction with EPs for its
ability to analyse time-variant signals. The wavelet transform (WT) treats the signal
as non-stationary and provides the additional advantage of a time-frequency
representation with resolution control in both the time and frequency domains. Analysis and
synthesis with the WT can draw on a range of wavelet base functions, which is an advantage
over conventional transformations that use only a cosine base function. Using wavelet base
functions that closely match the morphology of the EP improves the spectral representation
of its features, which leads to better feature localization (Samar et al. 1999). Wavelets are
also localized in both the time and frequency domains: their ‘compact support’ and band
limited spectrum suit the features of EPs, as opposed to the infinite length in time and
single frequency of the cosine base signal.
The main analysis technique, the discrete wavelet transform (DWT), decomposes the EP
into a number of temporal scales based on its frequency distribution. Various thresholding
methods are then applied to retain only the coefficients relevant to the EP, such as fixed
thresholding (Maglione, Pincilotti, Acevedo, Bonell & Gentiletti 2003, McCullagh, Wang,
Zheng, Lightbody & McAllister 2007, Wilson, Winter, Kerr & Aghdasi 1998, Zhang, McAllister,
Scotney, McClean & Houston 2006), soft thresholding (Causevic, Morley, Wickerhauser
& Jacquin 2005, Donoho 1995) and thresholding based on the temporal distribution
of coefficients (Quian Quiroga 2005). In addition to the DWT, there are other wavelet
transformation methods, such as the continuous wavelet transform, the analogue
transformation that inspired the DWT; the stationary wavelet transform, which omits
decimation and thereby overcomes the shift variance of the DWT (Nason & Silverman 1995);
wavelet packet decomposition, which achieves a fully decomposed tree (Coifman &
Wickerhauser 1992); and the dual tree complex wavelet transform, which achieves
approximate shift invariance (Kingsbury 2001).
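The decompose, threshold and reconstruct sequence described above can be sketched as follows. This is a simplified illustration, not one of the methods evaluated in this thesis: it assumes the Haar wavelet and a single global threshold for compactness, requires the signal length to be divisible by 2 raised to the number of levels, and uses hypothetical function names.

```python
import math

def haar_dwt(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the signal from one level of coefficients."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x

def hard_threshold(coeffs, thr):
    """Keep coefficients larger than thr in magnitude and zero the rest."""
    return [v if abs(v) > thr else 0.0 for v in coeffs]

def soft_threshold(coeffs, thr):
    """Shrink every coefficient towards zero by thr."""
    return [math.copysign(max(abs(v) - thr, 0.0), v) for v in coeffs]

def denoise(x, levels, thr, threshold=soft_threshold):
    """Decompose, threshold the detail coefficients at each level, reconstruct."""
    approx = list(x)
    details = []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(threshold(d, thr))
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx

# Step-like signal with small fluctuations; the threshold removes the fluctuations.
noisy = [4.0, 4.2, 3.9, 4.1, 8.0, 8.1, 7.9, 8.2]
print([round(v, 2) for v in denoise(noisy, levels=2, thr=0.5)])
# [4.05, 4.05, 4.05, 4.05, 8.05, 8.05, 8.05, 8.05]
```

Hard thresholding preserves the amplitude of the retained coefficients, whereas soft thresholding shrinks them, which is one reason the choice of threshold function affects the estimated peak amplitudes.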
The wavelet studies reported in the literature relate predominantly to the middle
and late components of EPs and to a lesser extent to the ABR. Moreover, the ABR studies
concentrate on denoising and on the classification of the ABR according to
the presence of wave V (Corona-Strauss, Delb, Schick & Strauss 2010b, Zhang et al. 2006).
In contrast, the research conducted for this thesis, in addition to denoising, concentrates
on extracting a fully featured ABR from a minimum ensemble of epochs and on accurately
tracking time-scale variations of ABR features. The evaluation consists
of estimating systematic variations of ABR peaks using the WT denoising
approaches listed below, which are based on Zhang et al. (2006), Quian Quiroga (2005) and
Causevic et al. (2005), first with simulated data and then with real ABRs recorded from a
group of human participants:
• Constant thresholds with matching coefficients (CTMC)
• Temporal windowing with matching coefficients (TWMC)
• Cyclic shift tree denoising (CSTD)
1.3 Thesis objectives
The principal objective of this thesis is to evaluate the two identified methods of ARX
modelling and wavelets and their variations, for the potential rapid extraction of the ABR.
To our knowledge a systematic study has not been performed with these methods on the
ABR. Such a detailed study could be used as an analysis tool for a better choice of EPs
which could use parametric modelling and wavelet denoising to calculate a fine grained
measure of the brain state, thereby providing a better understanding of brain structures
relevant to the generation of EPs.
In particular, short lasting alterations which may provide relevant information about
cognitive functions are probably smoothed or even masked by the averaging process.
Therefore, investigators are interested in single trial analysis, which allows the extraction of
reliable signal characteristics from single EP/ERP sequences (Effern, Lehnertz, Fernández,
Grunwald, David & Elger 2000, Wada 1986).
To suit this new application domain of ABRs, the following improvements were made
to the tested algorithms.
• ARX and REPE models
– Model order selection criterion
– Derivation of new model orders relevant to the ABR
– Definition of a reference signal
• Wavelets
– Selection of optimum basis wavelet
– Definition of threshold functions and temporal windows for wavelet subbands to
optimise the characteristics of the filtered ABR
Specific details are given in chapters 4 and 5.
With these improvements, the thesis aims:
• To conduct a well defined, reproducible simulation study to determine the robustness
of ARX and REPE rapid extraction methods (in terms of noise removal) to
variations in the SNR of the simulated EPs, and to evaluate the ability of these
rapid extraction methods to accurately track time-scale variations in the latency of
simulated EP components;
• To analyse and optimise the wavelet denoising methods CTMC, TWMC and CSTD
using a common set of real ABR data for the purpose of rapid extraction. The
evaluation covers the denoising and time-scale variation tracking ability of these
methods;
• To explicitly identify the limitations and implications of using ARX modelling and
specific wavelet denoising methods.
1.4 Perceived contributions of the research
This research provides original contributions for rapid extraction of ABR applications
using ARX modelling and specific wavelet denoising algorithms. The major contributions
of the research can be summarised as follows:
• Determination of limitations of temporal variation tracking ability of ARX mod-
elling using physiologically plausible variations related to ABRs. The contradictory
outcome generated with REPE is discussed in this thesis and several shortcomings
in the original study are pointed out.
• Optimization of CTMC, TWMC and CSTD wavelet denoising methods to suit ex-
traction of ABRs. This included the determination of a compatible mother wavelet,
threshold functions and temporal windows to suit the new domain of application.
• Identification of the suitability and limitations of CTMC, TWMC and CSTD wavelet
denoising methods as rapid extraction methods for ABRs, by reproducing
known time-scale variations (in response to the intensity of the stimulation).
1.5 Organization of the thesis
This thesis elaborates the research carried out in the chapters that follow. In summary,
the content of these chapters is as follows:
Chapter 2 - ABR and its extraction is a study of the literature on the
ABR, EPs in general and their extraction methods. This chapter initially reviews the
physiological origin of the ABR and its clinical importance. The applications related in
general to EPs and the ABR are then discussed. Finally a comprehensive literature review
of ARX modelling and wavelet methods is presented in conjunction with the extraction of
EPs and specifically of the ABR.
Chapter 3 - Recording and constructing synthetic ABR data presents the
stimulation and acquisition parameters used for ABR data recording and discusses the
suitability of these data for evaluating denoising methods. A mathematical model for
the ABR is also introduced in this chapter, from which multiple datasets are constructed
in order to evaluate denoising methods representing ideal conditions in later chapters.
Chapter 4 - Effectiveness of ARX modelling in rapid extraction of the
ABR analyses ARX modelling and its extension REPE systematically using synthetic
ABRs. The findings of this chapter refine the ambiguity of previous research findings and
establish clear boundaries for the use of ARX methods in rapid extraction applications.
The simulated results are confirmed with the use of real ABR recordings.
Chapter 5 - Effectiveness of wavelet techniques in the rapid extraction of
the ABR investigates the ability to rapidly track ABR variations using CTMC, TWMC
and CSTD wavelet denoising methods. This analysis is carried out in two steps: one
determines the denoising capacity of the wavelet methods and the other assesses their ability
to track temporal variations. Datasets similar to those of Chapter 3 were used, including
simulated and real ABR recordings, to allow direct comparison and to establish thorough
conclusions.
Chapter 6 - Overall conclusions and future work highlights the overall effec-
tiveness of ARX modelling and wavelet methods in the rapid extraction of ABRs. Several
issues that have not been addressed and possible solutions that could guide future work
are also discussed here.
Chapter 2
A review of the ABR and its
extraction
This chapter presents information collected from the relevant literature to provide the basis
for the rapid extraction of ABRs, and includes:
• The origin and the importance of ABR and EPs
• Recording of ABR data
• Review of ARX modelling for rapid extraction of EPs
• Review of wavelets as a method of rapid extraction of EPs
The importance of a fast extraction method for the ABR arises from the proximity of
its generators to critical physiological structures of the brain. These important applications
provide the basis for formulating a methodology to evaluate the rapid extraction algorithms
discussed throughout the thesis, using systematic variations of ABR features. Recording
an accurate ABR depends on effective stimulation, acquisition and signal processing of the
EEG. Therefore, the setup used for stimulation and acquisition of the ABR is
discussed first. Then, the two identified approaches for processing the ABR, ARX modelling
and wavelets, are discussed in relation to rapid extraction. Specific drawbacks of the existing
implementations, and modifications with novel features that can be added to enhance the
rapid extraction process, are also discussed.
2.1 Review of evoked potentials and the auditory
brainstem response
2.1.1 Evoked potentials
As information is processed within the neural networks of the human brain, the electrical
activity arising from millions of participating neurons is summed to form field potentials
that can be recorded through the intact scalp. The brain’s field potentials include the
rhythmic voltage oscillation of the ongoing electroencephalogram (EEG) and the short
evoked potentials (EPs) that arise in association with specific sensory, motor and cognitive
events. While the spontaneous EEG rhythms are sensitive monitors of general states of
arousal, consciousness and the sleep-waking cycle, the EPs are believed to represent more
discrete patterns of neural activity that reflect specific perceptual and cognitive processes.
EPs are categorised into auditory, visual, somatosensory and motor evoked potentials
depending on the modality of stimulation eliciting the respective evoked potential. A brief
summary of the most notable features of these EPs is given in table 2.1. The reader
should note that these EPs have further features that may be relevant depending on the
application, and is directed to Chiappa et al. (1997) for further information. Visual
evoked potentials (VEPs) are caused by sensory stimulation of a subject’s visual field with
flashing lights or checkerboards on a video screen that flicker between black and white. The
commonly used feature of a VEP is P100 with amplitude in the range of 10-12 µV (Misra
& Kakita 1999). Somatosensory evoked potentials (SEPs) are generated mainly by the
large diameter sensory fibres in the peripheral and central portion of the nervous system,
typically in response to an electrical stimulus. SEPs are mainly relevant to the monitoring
and diagnosis of lesions in relatively long sensory pathways from peripheral nerve to spinal
cord and cerebral cortex. The clinically significant features, N9, P13 and N20, are in
the range of 3-5 µV in amplitude (Chiappa 1990). Motor evoked potentials (MEPs)
are recorded from the muscles following stimulation of the motor cortex or spinal cord,
through magnetic or electrical stimulation. While magnetic stimulation can penetrate
tissues regardless of electrical resistance, electrical stimulation achieves a better depth of
penetration, allowing direct spinal cord stimulation, but with a trade-off of local discomfort
to the patient. In contrast to other EPs, MEPs possess larger amplitudes, in the range
of millivolts, and therefore do not require special signal processing methods for extraction.

EP | Generators | Important features | Amplitude range
AEP | Auditory stimulation of the auditory pathway from the in-ear to the auditory cortex through the brainstem | Wave V, Pa, P300 | 0.1-5 µV
VEP | Stimulation of the visual field (rods and cone cells in the retina) | P100 | 10-12 µV
SEP | Electrical stimulation of sensory fibres in the peripheral and central portion of the nervous system | N9, P13, N20 | 3-5 µV
MEP | Electrical or magnetic stimulation of motor cortex and the spinal cord | D-waves, I-waves | hundreds of mV

Table 2.1: A brief summary of the types of EPs, their generators and features. Note: for further information regarding other features related to these EPs, refer to Chiappa (1997).
By comparison, the early component of the auditory evoked potential, the auditory brainstem
response, which is the signal of interest for the current thesis, has a comparatively
smaller amplitude (of the order of one tenth of a microvolt) and therefore requires special
extraction methods. The following section presents a description of these auditory
evoked potentials.
2.1.2 Auditory evoked potentials
Auditory evoked potentials (AEPs) are scalp recordable electrical potentials generated by
the central nervous system in response to an auditory stimulus. This evoked electrical
response lasts for approximately 600 ms and consists of a number of well defined features
that can be divided into early, middle and late components. Figure 2.1 illustrates a window
of approximately 300 ms from the stimulus onset. Early components typically occur within
10 ms of stimulus onset, are generated in the distal portion of the cochlear
nerve through to the brainstem, and are called the auditory brainstem response (ABR)
(Roeser, Valente & Hosford-Dunn 2000). Middle latency components (middle latency
auditory evoked potentials, MLAEPs) are those occurring within 10 to 50 ms and are
generally believed to be generated by the serial activation of the brainstem, thalamus and
cortex (Hall 2007). In contrast, the late components (auditory late response, ALR) arise
between 50 and 300 ms post stimulus and are thought to originate exclusively from the
activation of the cerebral cortex, in particular the auditory cortex in the temporal lobe
(Adelman & Smith 1999).

Figure 2.1: The log time scale distribution of the AEP, including early, middle and late potentials. Extracted and modified from Adelman & Smith (1999). The polarity of the features is inverted here due to the electrode placement on the scalp during recording.
Typically, AEPs cannot be distinguished from the ongoing EEG activity by the naked
eye. Given that most AEP components have an amplitude of the order of 0.2-0.5 µV
and the background EEG is of the order of 10-100 µV RMS, the signal to
noise ratio is approximately -30 dB (Aurlien, Gjerde, Aarseth, Elden, Karlsen, Skeidsvoll
& Gilhus 2004). Therefore, in order to reliably extract the amplitude and latency of the
various AEP components, special signal processing methods are mandatory.
2.1.3 Origin of the auditory brainstem response
Auditory brainstem responses (ABRs) are the early latency evoked potentials within 10
ms post stimulus generated by the serial activation of the auditory pathways beginning
at the distal portion of the eighth cranial nerve and terminating at the medial geniculate
nucleus of the thalamus. Since the first thorough study by Jewett & Williston (1971), the
human ABR has been correlated with a range of physiological functions, and has therefore
been used as an important tool for diagnosing and monitoring purposes.
A fully featured ABR consists of seven distinct peaks labelled with Roman numerals
I to VII, out of which waves I, III and V are of clinical significance. Such a fully featured
ABR is illustrated in figure 2.2, which is the moving time average of 1024 epochs extracted
from a healthy 24 year old female participant, stimulated at 55 dB nHL. Parameters of
the ABR that are of clinical and physiological relevance are the amplitude and latency
variations of these peaks. The amplitude of a peak is measured relative to the preceding
or the following trough, which reflects the activity level of a specific neurogenerator. The
absolute latency of a peak (commonly referred to simply as the latency) is measured as the
time interval between the onset of the stimulus and the peak. The latency is thought to largely reflect
the actual conduction time along the neural pathway. It is often useful to define inter-
peak latencies, which are relative time intervals, measured between two different waves,
typically I-III, I-V and III-V. Such latency differences represent the axonal conduction
time along neuron pathways and/or synaptic delays between the respective populations of
neurons responsible for the generation of a particular evoked component (Ponton, Moore
& Eggermont 1996). Normative values for such absolute and inter-wave latencies and peak
amplitudes for the main features of the ABR are shown in table 2.2. These values were
obtained from 786 healthy human participants at a stimulus intensity level of 80 dB nHL.
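As a small worked example of the definitions above, the following sketch derives the inter-peak latencies from the mean absolute latencies listed in table 2.2. The dictionary and function names are hypothetical; the calculation also assumes that the difference of the mean absolute latencies is an adequate stand-in for the mean inter-peak latency, which holds when the same participants contribute to both waves.

```python
# Mean absolute latencies (ms) of waves I, III and V, copied from table 2.2.
ABS_LATENCY_MS = {"I": 1.65, "III": 3.80, "V": 5.64}

def inter_peak_latency(first, second, latencies=ABS_LATENCY_MS):
    """Relative time interval (ms) between two ABR waves."""
    return latencies[second] - latencies[first]

for first, second in (("I", "III"), ("III", "V"), ("I", "V")):
    print(f"{first}-{second}: {inter_peak_latency(first, second):.2f} ms")
# I-III: 2.15 ms
# III-V: 1.84 ms
# I-V: 3.99 ms
```

The computed intervals agree with the inter-wave values reported in table 2.2.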
The morphology of the ABR is the overall shape of the waveform and is usually described
with reference to a standard template. Even when the wave component latencies
and amplitudes are within the standard range, the morphology must also resemble the
template for the waveform to be considered a proper ABR.
Knowledge of the brain structures that generate the features of the ABR is important
for interpreting abnormalities of the ABR and thereby diagnosing pathological disorders.
Figure 2.3 illustrates the auditory pathways along which the ABR is generated. The
sound wave carrying the auditory stimulus transmits through the external and middle ear
to the fluid compartment of the inner ear containing the cochlea. Vibration of the basilar
membrane of the cochlea, due to sound induced movement of fluid in the cochlea, results
in the stimulation of hair cells in the organ of Corti. The activity of these hair cells induces
activity in the cochlear branch of the eighth cranial nerve. The stimulation amplitude of
the hair cells is directly proportional to the intensity of the auditory stimulus. Once a
neural response to a sound is generated in the inner ear, the signal is transferred to a series
of nuclei in the brainstem. Output from these nuclei is sent to a relay in the thalamus, the
medial geniculate nucleus. Finally, the medial geniculate nucleus projects to the primary
auditory cortex located in the temporal lobe.

Wave component | Latency mean (ms) | Latency SD (ms) | Amplitude range (µV)
Wave I | 1.65 | 0.14 | 0.40
Wave II | 2.67 | 0.13 |
Wave III | 3.80 | 0.18 |
Wave IV | similar to wave V | |
Wave V | 5.64 | 0.23 | 0.50-0.75
Inter-wave I-III | 2.15 | 0.14 |
Inter-wave III-V | 1.84 | 0.14 |
Inter-wave I-V | 3.99 | 0.20 |

Table 2.2: Normative latencies and amplitudes for ABR wave features. Number of participants included: 786. Adopted from Joseph et al. (1987).

Figure 2.2: A fully featured ABR recorded from a 24-year-old healthy female participant. The derivation of this ABR included a moving time average of 1024 epochs.
The exact brain structures contributing to the human ABR are the subject of much
debate. Despite the confident claim by Jewett & Williston (1971) that wave I
originates in the cochlear nerve (eighth cranial nerve), the generators of the remaining
waves II to VII have repeatedly been presented in oversimplified anatomical diagrams that
give inaccurate schematics (Hall 2007). Mostly, these schematics were inferred from small
animal studies (rat, cat, guinea pig), in which brainstem structures are significantly smaller
than the corresponding structures in humans (Moore 1987, Moller et al. 1995).
Hall (2007) points out two factors that hinder the understanding of the exact
anatomical generators of the ABR peaks: 1) technical limitations related to placing
the electrode to achieve a true reference, and 2) the complications associated with far-field
recordings, which do not allow the origin to be pinpointed. However, it can be concluded from
the evidence at hand that multiple anatomic sites may contribute to a single ABR wave
and, conversely, that a single anatomic site may generate multiple ABR waves.

Figure 2.3: Auditory pathway from the inner ear to the primary auditory cortex. Adopted from Kiernan (2007).
The intracranial recordings of Moller (Moller 1987, Moller et al. 1995) provide evidence
that wave II is generated by the eighth cranial nerve, and clinical evidence
suggests that it originates from the eighth cranial nerve at the root entry zone, where it
enters the brainstem, and thus from the proximal portion of the eighth cranial nerve
(Hall III, Mackey-Hargadine & Kim 1985).
Wave III was traditionally believed to originate from the contralateral superior olivary
complex based on the lesion studies in small animals (Buchwald & Huang 1975). However,
a contradictory conclusion was derived by Achor and Starr (1980) stating the origin to be
the ipsilateral superior olivary complex. In contrast, human studies have found the origin
of wave III to be the cochlear nucleus (Moller 1987, Moller et al. 1995), although Scherg
and Von Cramon (1985) were unable to pinpoint the location, concluding only that it
lay beyond the eighth cranial nerve and the trapezoid body.
Wave IV is less observed in clinical practice as it is not consistently recorded and often
appears as the leading shoulder on wave V. Determination of the precise generators of wave
IV is complicated by the likelihood of multiple crossings of the midline for auditory fibres
beyond the cochlear nucleus. As Moller et al. (1995) suggest, generation of wave IV is
mainly associated with the third order neurons located in the superior olivary complex but
evidence of contribution from second and third order neurons is also reported by Scherg
et al. (1985). Moore (1987) also suggests that the contribution of the lateral lemniscus to
wave IV in human ABR is probably minor.
Wave V is the most frequently analysed ABR feature due to its prominent large ampli-
tude, which is affected by neurophysiological disorders. It is therefore critical to identify
its anatomic origin. Traditionally the origin of wave V was considered to be the inferior
colliculus (Buchwald & Huang 1975) but depth electrode and spatio-temporal dipole model
findings in humans have suggested that wave V is generated at the termination of lateral
lemniscus fibres as they enter the inferior colliculus (Moller et al. 1995). The resulting
dendritic potentials within the inferior colliculus are thought to be responsible for the
large, broad negative voltage trough following wave V. These conclusions are supported
by anatomical findings to the effect that pathways to the inferior colliculus have varying
lengths and varying numbers of synapses, which would result in a large but relatively
broad ABR wave because of the less synchronized activation of the nucleus. Second-order
neuron activity may also contribute in some way to wave V (Hall 2007).
While the less significant waves VI and VII have been suggested to originate in the thalamic
region (Stockard & Rossiter 1977), some studies have narrowed the site of origin down to the
continuous firing of neurons in the inferior colliculus (Moller et al. 1995).
As is evident, the origins of the ABR wave features remain uncertain and require further
investigation with improved methods, which could benefit from the conclusions of this
thesis. Nevertheless, an illustration of the presumed anatomic correlates of the major peaks
of the ABR is shown in Figure 2.4, which is taken from Hall (2007).
2.1.4 Factors influencing the ABR
The features of the ABR in terms of latency and amplitude are affected by various
pathologic and non-pathologic factors. Evaluation of methods of rapid identification of
these variations is the main objective of the thesis. This section presents significant
factors which cause variations in ABR features.
Figure 2.4: Presumed generators of the ABR waves I-V. Note that one anatomic structure may give rise to more than one ABR wave and conversely more than one anatomic structure may contribute to a single ABR wave. Adopted from (Hall 2007).
Pathologic factors
Hearing impairment - The ABR is widely used in the hearing screening of neonates and
uncooperative adult patients, where behavioural feedback is difficult to obtain
(Hall 2007). In these cases, the presence of wave V at low sound intensities is used to
assess hearing ability. The sound intensity of the stimuli is varied from 70 dB to 30 dB
to detect the hearing threshold (Intracoustics 2011, Incorporated 2011, Otometrics 2011).
Multiple sclerosis - Multiple sclerosis is a chronic, often disabling disease which
randomly attacks the central nervous system. ABR abnormalities caused by this disease
include prolonged inter-peak latencies (I-III, III-V and I-V), decreased amplitude of
wave V, poor morphology and occasional total absence of waves I and V (Antonelli, Bonfioli,
Cappiello, Peretti, Zanetti & Capra 1988, Papathanasiou, Pantzaris, Myrianthopoulou,
Kkolou & Papacostas 2010, Soustiel, Hafner, Chistyakov, Barzilai & Feinsod 1995).
Parkinson’s disease - Parkinson’s disease is a consequence of the depletion of dopamine
in the CNS due to damage to the substantia nigra pars compacta. Symptomatically it is
characterised by bradykinesia, rigidity and tremor. Changes in the ABR have been reported
in Parkinson’s disease. Some research suggests an abnormality in wave III, with
prolongation of the latency and reduction in the amplitude (Yousefi 2004), whereas a
separate study observed significantly increased wave V latencies and I-V inter-peak
latencies (Yılmaz, Karalı, Tokmak, Güçlü, Koçer & Öztürk 2009). Although a number of these
results are conflicting, ABR changes may prove to be of relevance for the early
sub-clinical diagnosis of Parkinson’s disease. As an example of such conflict, a study
conducted to find a diagnostic tool to differentiate Multiple System Atrophy from
Parkinson’s disease suggests that Parkinson’s disease has no effect on the ABR features
(Kodama et al. 1999).
Alzheimer’s disease and dementia - There are reports of pathologic involvement
of the inferior colliculus, medial geniculate body and both primary and secondary au-
ditory cortex in Alzheimer’s disease and dementia (O’Mahony, Rowan, Feely, Walsh &
Coakley 1994). An analysis of ABR data of demented patients showed increased wave
I-V inter-peak latency values (Harkins 1981, O’Mahony et al. 1994). It is of interest that
abnormalities of the late components of the AEP have also been reported in Alzheimer’s
disease (Egerházi, Glaub, Balla, Berecz & Degrell 2008, Graf, Marterer & Sluga 1992).
Acoustic neuroma - This is the most common cerebellopontine angle tumor ac-
counting for 80% of the lesions in this area (Misra & Kakita 1999). Acoustic neuromas
are almost invariably associated with an increase in inter-peak latencies I-III and I-V and
absence of peaks beyond wave I (Parker, Chiappa & Brooks 1980).
Coma and Brain Death - A considerable amount of literature exists regarding the
role of ABR as a diagnostic tool for coma and brain death as spontaneous EEG and
CT scan are inadequate in the assessment of the physiological integrity of the brainstem.
Absence of waves I, II, III and V was associated with death and vegetative states in these
disorders (Goldie, Chiappa, Young & Brooks 1981).
Stroke - ABR has been used for evaluation and prognosis of acute brainstem stroke.
The main finding is an increased IV/V wave peak ratio in patients with stroke, while less
significant abnormalities include prolonged inter-peak latency I-III and ABRs with only
wave I or no waves (Ferbert, Buchner, Bruckmann, Zeumer & Hacke 1988).
Diabetes mellitus and hypothyroidism - Degenerative diseases such as diabetes
mellitus and hypothyroidism were found to have an effect on the ABR with prolongation
of absolute and inter-peak latencies of main waves I, III and V (Fedele, Martini & Cardone
1984, Khedr, Toony, Tarkhan & Abdella 2000).
The use of rapid (single/limited trial) extraction methods in the detection and diagnosis
of such pathological abnormalities would have the following advantages:
(i) Reduction of clinical test times
(ii) Enhancement of patient comfort at stimulus delivery by reducing the number of
stimuli, especially for long term monitoring systems such as in a potential wearable
device
(iii) Detection of short term variability of the ABR such as in intraoperative monitoring
applications
Stimulus and subject factors
In addition to aberrations in evoked auditory activity
caused by pathological states, a range of stimulus and subject related factors are well
known to systematically change one or more early, middle or late components. Stimulus
dependent parameters include the frequency, duration, intensity and polarity of the stimu-
lus, whereas subject dependent factors mainly include age, gender and body temperature.
The latency of the ABR waves changes considerably up to one year of age (Kaga &
Tanaka 1980). The latency reduces by 1 to 1.5 ms as the child reaches one year and then
stabilises. Between the ages of 25 and at least 55, the latency is prolonged by
approximately 0.2 ms, but it remains constant beyond that age (Hecox & Galambos 1974).
The effect of gender on the ABR is observed as shorter latencies and larger amplitudes in
females than in males. Such gender based latency differences range from 0.1 to 0.2 ms
(Rosenhall, Björkman, Pedersen & Kall 1985, Chu 1985). Since age and gender are constant
for a given participant in a diagnostic scenario, they offer no advantage for the use of a
rapid extraction system. In contrast, tracking the effects of core body temperature on the
ABR latencies (Markand, Lee, Warren, Stoelting, King, Brown & Mahomed 1987) could benefit
greatly from a rapid extraction system. Inter-wave latencies have been found to be
prolonged by 0.2 ms per °C in cases of hypothermia, and reduced by 0.15 ms per °C in cases
of hyperthermia (Kohshi & Konda 1990).
The properties of the auditory stimulus greatly affect ABR component latencies and
amplitudes. In general, these changes vary systematically with changes in the frequency
and intensity of the auditory stimulus. As the stimulus intensity is increased, the ab-
solute latency of ABR peaks reduces while their amplitude increases (Collet, Delorme,
Chanal, Dubreuil, Morgon & Salle 1987, Babkoff, Pratt & Kempinski 1984, Pratt &
Sohmer 1977). This reduction in peak latency at higher stimulus intensities is caused
by the rapid approach of summed postsynaptic excitation potentials to the neuronal firing
threshold (Hall 2007). In general, the latency of the ABR components is found to smoothly
decrease with an increasing stimulus intensity. This pattern is shown in figure 2.5 with typ-
ical latency-intensity curves for wave I, II, III, V and VI (Delgado & Ozdamar 1994). Due
to the infrequent appearance of wave IV, variation of its latency with stimulus intensity
has not been systematically determined.
The most common clinical convention is to present the stimulus intensity in decibels
(dB) relative to the normal behavioural hearing threshold level for a stimulus (Hall 2007).
This is usually denoted as ‘dB nHL’. The hearing threshold level for the stimulus is deter-
mined by stimulating a group of normal subjects and taking the average of the intensity
at which the click is just audible. This intensity is defined as 0 dB nHL and used as the
reference level to indicate subsequent intensity levels.
The amplitude variations of ABR features caused by changes in sound intensity are
characteristically more variable than changes in the latency (Jewett & Williston 1971,
Lasky, Rupert & Waller 1987). Therefore, clinicians use the consistent latency variations
for diagnosing conductive and sensory hearing loss in patients (Steinhoff, Böhnke &
Janssen 1988, Suter & Brewer 1983). For similar reasons, the verification methods of this
thesis are also based on latency-intensity curves produced by delivering controlled
stimulus intensities to participants.
Figure 2.5: Latency-intensity curves of waves I, II, III, V and VI. Dotted lines represent peaks labelled by the human experts and solid lines represent an automated system. Adopted from (Delgado & Ozdamar 1994).
Effect of anaesthetic agents on the ABR
The effects of anaesthetic agents are prominent on the ABR and especially on the MLAEP.
Such effects are essential in neuromonitoring during surgery. It has been reported that
administration of propofol induces an increase in absolute and inter-peak latencies of the
ABR of 0.15 to 0.5 ms, depending on the dosage, up to 8 µg/ml (Chassard, Joubaud, Colson,
Guiraud, Dubreuil & Banssillon 1989). Halothane and isoflurane cause a statistically
significant prolongation of latency of 0.2 ms in ABR peaks (Cohen & Britt 1982), whereas
enflurane causes a linear prolongation of inter-peak I-III and I-V latencies with the
dosage (Thornton, Heneghan, James & Jones 1984). An average variation of 0.23 ms in
inter-peak latencies was observed due to the effect of sevoflurane (Kitahara, Fukatsu &
Koizumi 1995). In contrast, methohexital sodium induced a prolongation of 0.4 ms in the
wave V latency. A similar effect was observed in wave V absolute and inter-peak latencies
with prolongations of 6.16 to 6.87 ms with thiopental (Drummond, Todd & Hoi Sang 1985).
However, the amplitude of the MLAEP is usually used to measure the depth of anaesthesia,
even though anaesthetic agents also affect the latency of MLAEP components (Garcia-Larrera,
Fischer & Artru 1993). The consistent and prominent amplitude variation caused by
anaesthetic agents has led to the production of commercial products for monitoring depth
of anaesthesia.
Typical detection times for depth of anaesthesia with the MLAEP are of the order of a
few seconds. It is therefore reasonable to assume that measuring depth of anaesthesia
using the ABR extracted by a conventional moving time average is not feasible, as almost
one minute of recording is required to extract a single ABR. This duration is for a typical
extraction scenario in an operating room, intensive care unit or neonatal intensive
care unit with electromagnetic interference up to 220 µT (Sokolov, Kurtz, Steinman, Long
& Sokolova 2005), where on average 1024 stimuli are presented at a rate of 21.1 Hz.
However, the exact number of stimuli required could be estimated by statistical methods
such as Fsp, based on the signal to noise ratio (Elberling & Don 1984). With rapid
extraction methods, short-term variations in the ABR could be revealed, and these may
potentially be used to monitor depth of anaesthesia.
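The recording time quoted in the preceding paragraph follows directly from the stimulus count and the stimulation rate. The short sketch below reproduces that arithmetic; the function name is illustrative only and is not taken from any ABR software, and the calculation ignores artifact rejection, which would lengthen a real recording.

```python
def acquisition_time_s(n_stimuli: int, rate_hz: float) -> float:
    """Seconds needed to present n_stimuli at a fixed repetition rate."""
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    return n_stimuli / rate_hz


# Figures from the text: 1024 stimuli presented at 21.1 Hz.
conventional_s = acquisition_time_s(1024, 21.1)
print(f"1024 sweeps at 21.1 Hz take {conventional_s:.1f} s")
```

The result is a little under 50 seconds, consistent with the "almost one minute" stated for a single conventionally averaged ABR.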
2.1.5 EPs in brain computer interfacing
A brain computer interface (BCI) is a means of communication between humans or animals
and computers that uses brain signals rather than muscular activity such as talking,
writing or facial expression. EPs are the basis for BCI applications due to their event
driven nature. At present, even though ABRs are not directly involved in BCI applications,
the fast extraction algorithms evaluated in this thesis could lead to an innovative
paradigm shift and improve the performance of existing applications. The P300 signal
commonly used in BCI is a non-stationary signal similar to the ABR, and is therefore
compatible with algorithms developed for the ABR. Further, algorithms developed for
ABRs are expected to be more effective when adapted to the P300, as its characteristic
SNR is high compared to that of the ABR.
The late event related potential P300 is elicited when users attend to a random series
of stimulus events that contains an infrequently presented set of items, which forms an
oddball paradigm. Patients suffering from communication disabilities caused by amyotrophic
lateral sclerosis, severe cerebral palsy, head trauma, multiple sclerosis and muscular
dystrophies are helped by a speller driven by the visual P300 to improve their
communication ability (Lee et al. 2008). In addition, patients having difficulties in
voluntary control of gaze have been assisted by auditory BCIs driven by the auditory P300
(Furdea et al. 2009), ASSR (Lopez et al. 2009), self-regulation of slow cortical potentials
(Pham et al. 2005) and sensorimotor rhythm (Nijboer et al. 2008). Also, the PC based gaming
industry is increasingly using BCI peripherals produced by companies such as Emotiv (EPOC)
and OCZ Technology (NIA game controller).
Such applications suggest that real time EP extraction algorithms are essential in
BCI, where the reaction time between the emission of the EP and the interpretation of the
action is critical. BCI applications help subjects with severe motor impairment, who have
no other way of communicating, by capturing their intent directly from the brain.
However, a poor information transmission rate between the emission of the EP and its
interpretation is a common issue in BCI systems that use conventional MTA to detect EP
changes (Lopez et al. 2009).
2.1.6 ABRs for rapid extraction
From the foregoing discussion, it is evident that the early components of the EPs provide
important information regarding the functional state and integrity of a variety of sensory
pathways. The conventional method for extracting EPs is to calculate a time locked
average of the EP to multiple sequential presentations of an identical stimulus. For the
ABR, considering the associated noise, such a time average typically requires of the order
of one thousand sweeps or more, spread over approximately 60 seconds, in order to
sufficiently resolve its components for diagnostic and quantitative analysis purposes. The
following issues arise from the time span associated with such an extraction system.
(i) Inability to observe short-time variations with the delayed output of the ABR.
(ii) Patient discomfort due to prolonged auditory stimulation associated with a long
term monitoring system (potentially a wearable device).
(iii) Generation of muscle artifacts at infant hearing screening with prolonged test times.
Similar effects could be observed with adult patients due to fatigue when testing is
conducted with a series of sound intensities for a long period of time.
Close to real-time applications of EPs, such as depth of anaesthesia monitoring and BCI
related applications, exist because they use a variety of rapid extraction methods. In
contrast, ABRs are excluded from such ‘close-to real time’ applications due to the
substantial time required for extraction using present moving time averaging. The
resultant low time resolution leads to loss of short-time information, which limits the
understanding of the internal brain structures that cause variations in ABR peaks. With
improved extraction methods, there is potential to identify the exact origin of the ABR
peaks, and such methods could thereby be used as a non-invasive technique to localize
lesions along the auditory pathway.
Rapid extraction methods are therefore essential for ABRs and for EPs in general.
This thesis evaluates two such methods, ARX modelling and wavelet filtering, both
identified as having significant potential for removing noise and tracking time scale
variations associated with ABR features.
2.2 Review of ARX modelling based extraction methods
2.2.1 Moving time averaging to parametric modelling
Based on the relation to the simple additive signal + noise model and for the ease of
calculations, moving time average (MTA) is typically used in clinical applications related
to EPs. MTA is calculated by averaging a large number of time locked responses to
suppress random spontaneous EEG assumed to be uncorrelated to the deterministic ABR
and thereby retaining the ABR. Such a synchronous summation of responses improves
the signal component while reducing noise. This is true for early EP components but
less so for late and middle components due to the fact that the evoked component almost
certainly depends on the seemingly random prestimulus activity (Makeig, Westerfield,
Jung, Enghoff, Townsend, Courchesne & Sejnowski 2002). A single sweep of a recorded
ABR $y_i(k)$ is typically described with the conventional additive model:
\[ y_i(k) = s(k) + n_i(k) \tag{2.1} \]
where $s(k)$ is the deterministic ABR component, $n_i(k)$ is the Gaussian white noise
component, i.e. the spontaneous EEG at the $i$th sweep, and $k$ is the discrete time point.
$s(k)$ is extracted by taking a MTA of a large number of sweeps, on the assumption that the
random nature of the underlying EEG progressively decreases the noise component as the
number of averaged sweeps increases. This improves the SNR of the ABR by $10 \log_{10}(N)$ dB,
where $N$ is the number of sweeps averaged in the MTA. In conventional ABR extraction, a MTA
of approximately 1000 sweeps is considered enough to suppress the noise and arrive at the
ABR, i.e. a SNR improvement of 30 dB (Don & Elberling 1996).
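The 10 log10(N) dB relation can be checked numerically under the assumptions of the additive model (an identical response in every sweep and white noise that is uncorrelated between sweeps). The sketch below is a simulation with a made-up sine-shaped response and unit-variance Gaussian noise, not recorded ABR data; the function name and all parameter values are illustrative.

```python
import math
import random


def mta_noise_reduction_db(n_sweeps, n_samples=256, noise_sd=1.0, seed=0):
    """Simulate y_i(k) = s(k) + n_i(k) and return the noise reduction (dB) after averaging."""
    rng = random.Random(seed)
    # Hypothetical deterministic response s(k): one sine cycle, identical in every sweep.
    s = [0.1 * math.sin(2 * math.pi * k / n_samples) for k in range(n_samples)]
    total = [0.0] * n_samples
    single_sweep_noise_power = 0.0
    for _ in range(n_sweeps):
        for k in range(n_samples):
            noise = rng.gauss(0.0, noise_sd)
            total[k] += s[k] + noise
            single_sweep_noise_power += noise * noise
    single_sweep_noise_power /= n_sweeps * n_samples
    # Residual noise in the average is what remains after subtracting the known s(k).
    residual_power = sum(
        (total[k] / n_sweeps - s[k]) ** 2 for k in range(n_samples)
    ) / n_samples
    return 10 * math.log10(single_sweep_noise_power / residual_power)


gain_db = mta_noise_reduction_db(1000)
print(f"simulated: {gain_db:.1f} dB, predicted: {10 * math.log10(1000):.1f} dB")
```

Because averaging leaves the deterministic component unchanged, the noise reduction equals the SNR improvement. The simulated value fluctuates around the prediction by a fraction of a decibel, since the residual noise power is itself estimated from a finite number of samples.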
The use of a conventional MTA in these applications however results in poor time
resolution, and it thus cannot be used to detect the rapid variations in latency and
amplitude of EPs which are expected to accompany cognition. Also, the underlying assumption
of a time invariant ABR makes conventional MTA unsuitable for extracting a series of ABRs
with time varying features (Ozdamar & Kalayci 1999).
The identification of these drawbacks has led to concerted efforts to develop improved
extraction methods such as weighted averaging techniques (John, Dimitrijevic
& Picton 2001), Wiener filtering (Doyle 1975, Boston 1983), adaptive filtering (Vaz &
Thakor 1989), median averaging (Ozdamar & Kalayci 1999), independent component
analysis (Scott, Anthony, Tzyy-ping & Terrence 1996, Hu, Zhang, Hung, Luk, Iannetti
& Hu 2011), principal component analysis (Lins, Picton, Berg & Scherg 1993), Spatial
weighted averaging (Ivannikov, Karkkainen, Ristaniemi & Lyytinen 2010), parametric
modelling and denoising with wavelet filter banks.
Weighted averaging is a modification of the conventional MTA that involves the same number
of epochs, and thus does not lead to rapid extraction. Wiener filtering,
$\hat{s}(k) = g(k) * [s(k) + n(k)]$, assumes the spectral properties of the signal $s(k)$ and
noise $n(k)$ and then attempts to estimate the signal with a minimum error relative to the
actual signal by adjusting the Wiener filter $g(k)$. The major disadvantage of Wiener
filtering is the underlying assumption that the signal to be analysed is stationary.
Adaptive filtering does take the non-stationary nature of an EP into account, but it tends
to be vulnerable to highly coloured noise. Independent component analysis and principal
component analysis are means of removing artifacts generated from different sources such
as eye blinks and movements, muscle noise, cardiac signals and line noise. Independent
component analysis outperforms principal component analysis through its ability to
separate EEG and its artifacts within the same analysis (Jung, Humphries, Lee, Makeig,
McKeown, Iragui & Sejnowski 1998). Independent component analysis also preserves and
recovers more brain activity than principal component analysis in decomposing EEG data
(Jung et al. 1998). Spatial weighted averaging is an improved version of independent
component analysis in which direct separation of the EP and noise subspaces is achieved in
the raw data, and it thus leads to rapid extraction and possible single sweep analysis
(Ivannikov et al. 2010). In contrast, separation of EPs from raw data with independent
component analysis seems less effective at very low SNRs, due to its tendency to
concentrate only on large components.
A more widely used approach to the rapid extraction of EPs is to parametrically model
EPs using an autoregressive model with an exogenous input (ARX). Here a single sweep
of an EP is modelled with a reference signal (exogenous input) and white noise, which
is compatible with the conventional additive model. ARX modelling has been widely
adopted by researchers to rapidly extract MLAEP, VEP and SEP. Cerutti et al. (1987)
introduced the ARX model to extract single sweeps by fitting it to a reference signal
defined by a MTA of an ensemble of sweeps. The validity of this method can further
be said to underpin its use in the calculation of the A-Line™ Index (Jensen et al. 1998)
in the depth of anaesthesia monitoring device ‘AEP Monitor/2’ produced by Danmeter
ApS (DanmeterAps 2010). Given the successful application of the ARX based methods
to the rapid extraction of middle and late latency EP components, it logically follows to
systematically investigate the application of this method to the rapid extraction of ABRs.
The following section describes the general principles of the ARX modelling method.
2.2.2 The ARX(p, q, d) model
The adaptation of the black-box autoregressive model with an exogenous input (ARX) to a
signal generation mechanism by Cerutti et al. (1987) allowed representation of
a single sweep as the sum of an autoregressive pseudo-random term and the output of a
proper filter with a deterministic input. According to the convention, it is assumed that
a single EP sweep consists of the true signal $ep$ plus additive noise $n$, i.e.
\[ y_i(k) = ep_i(k) + n(k) \iff Y_i(z) = EP_i(z) + N(z) \tag{2.2} \]
where $k$ is the sample index within the $i$th sweep. The right-hand side shows the
z-transform of the corresponding time domain additive model.
Now consider the general form of an ARX model for a single sweep in equation (2.3):
\[ y_i(k) = -\sum_{j=1}^{p} a_j\, y_i(k-j) + \sum_{l=d}^{q+d-1} b_l\, u(k-l) + e_i(k) \tag{2.3} \]
Here, the single sweep $y_i(k)$ is modelled with the reference input $u(k)$ and Gaussian white
noise $e_i(k)$. $u(k)$ is the exogenous input to the ARX model which, in theory, represents the
true nature of the EP. In practice, in the absence of the true response, $u(k)$ is assigned
a MTA of an appropriate number of sweeps that closely represents it. $p$ and $q$ are the
orders of the autoregressive (AR) and moving average (MA) parts respectively, and the $a_j$
and $b_l$ are the model coefficients. The number of poles of the estimated filter is therefore
$p$, whereas the number of zeros is $(q-1)$. The delay $d$ is the temporal lag between the input
and the output of the model. This time domain model can be re-written in the z-domain
as
\[ Y(z) = \frac{B(z)}{A(z)} U(z) + \frac{1}{A(z)} E(z) \tag{2.4} \]
where $Y(z)$, $U(z)$ and $E(z)$ are the z-transforms of $y_i(k)$, $u(k)$ and $e_i(k)$ respectively,
$A(z) = 1 + \sum_{j=1}^{p} a_j z^{-j}$ and $B(z) = \sum_{l=d}^{q+d-1} b_l z^{-l}$. By comparing
equations (2.2) and (2.4), a noise free single sweep EP estimate is derived as
$EP_i(z) = \frac{B(z)}{A(z)} U(z)$.
Therefore, by filtering the template signal $U(z)$ with the filter defined by the ARX
model, a noise reduced single sweep EP is, in principle, derived. The fact that the noise
and the evoked response both depend on the same autoregressive component $A(z)$ does however
imply a partial loss of generality, as the estimated evoked response and noise are not
independent of each other. Despite this restriction, the ARX model is typically preferred
due to its computational simplicity. Removing this restriction would result in the additive
noise model being described by the following ARMAX process
\[ Y(z) = \frac{B(z)}{A(z)} U(z) + \frac{1}{C(z)} E(z) \;\Rightarrow\; Y(z) D(z) = U(z) F(z) + A(z) E(z) \]
where $D(z) = C(z) A(z)$ and $F(z) = B(z) C(z)$. Thus a noise free single sweep estimate
would be given by $U(z) F(z) / D(z)$. However, difficulties in the robust estimation of
ARMAX models have resulted in frequent use of the ARX model in single sweep estimation
(Makeig et al. 2002).
If the two series $y_i(k)$ and $u(k)$ are known for each sweep $i$, it is possible to calculate
the $a_j$ and $b_l$ for each $p$, $q$ and $d$ using a batch least-squares method which minimises
the prediction error (Broersen 2006),
\[ J_i = \frac{1}{N} \sum_{k=1}^{N} e_i^2(k) \]
where $N$ is the number of samples in a sweep and $e_i(k) = y_i(k) - \hat{y}_i(k)$, with $y_i(k)$
the measured sweep and $\hat{y}_i(k)$ its model estimate.
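The batch least-squares step can be sketched as an ordinary linear regression on lagged samples of the sweep and the reference. The code below is a minimal illustration in standard-library Python, assuming the ARX form of equation (2.3); `fit_arx` and `solve_linear` are hypothetical helper names, the data are synthetic, and the loss is averaged only over the samples for which all lags are available rather than over all N samples.

```python
import random


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_arx(y, u, p, q, d):
    """Least-squares estimate of a_1..a_p and b_d..b_{q+d-1}; returns (a, b, J)."""
    start = max(p, q + d - 1)  # first sample for which every lag exists
    rows, targets = [], []
    for k in range(start, len(y)):
        row = [-y[k - j] for j in range(1, p + 1)]   # AR regressors
        row += [u[k - l] for l in range(d, q + d)]   # exogenous regressors
        rows.append(row)
        targets.append(y[k])
    m = p + q
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    Atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(m)]
    theta = solve_linear(AtA, Atb)  # normal equations
    residuals = [t - sum(c * x for c, x in zip(theta, r)) for r, t in zip(rows, targets)]
    J = sum(e * e for e in residuals) / len(residuals)
    return theta[:p], theta[p:], J


# Synthetic check with known coefficients: a_1 = -0.5, b_1 = 1.0, b_2 = 0.5, d = 1.
rng = random.Random(1)
u = [rng.gauss(0, 1) for _ in range(500)]
y = [0.0] * 500
for k in range(2, 500):
    y[k] = 0.5 * y[k - 1] + 1.0 * u[k - 1] + 0.5 * u[k - 2] + rng.gauss(0, 0.01)
a, b, J = fit_arx(y, u, p=1, q=2, d=1)
print(a, b, J)
```

Solving the normal equations directly keeps the example dependency-free; a numerically more robust implementation would typically use a QR-based least-squares routine from a linear algebra library.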
For a given single sweep, the optimum model orders $p$ and $q$ are determined by the minimum
value of the final prediction error (FPE) (Ljung 1987),
\[ FPE_i = \frac{N + s}{N - s} J_i \tag{2.5} \]
where $s = p + q$. The FPE measures the model estimation error with the model orders as a
penalty factor, in order to determine the optimum estimation error with a minimum number
of model orders.
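Order selection by minimum FPE amounts to evaluating equation (2.5) for each candidate pair of orders and keeping the smallest value. The sketch below illustrates this; the function name is hypothetical and the loss values in `candidate_losses` are invented purely to show the penalty at work, not taken from any fitted model.

```python
def final_prediction_error(J, n_samples, p, q):
    """FPE = (N + s) / (N - s) * J with s = p + q, as in equation (2.5)."""
    s = p + q
    if n_samples <= s:
        raise ValueError("need more samples than model parameters")
    return (n_samples + s) / (n_samples - s) * J


# Made-up prediction errors J for three candidate (p, q) pairs, N = 100 samples.
candidate_losses = {(1, 1): 0.50, (2, 2): 0.30, (5, 5): 0.29}
fpe = {
    orders: final_prediction_error(J, 100, *orders)
    for orders, J in candidate_losses.items()
}
best_orders = min(fpe, key=fpe.get)
print(best_orders, round(fpe[best_orders], 3))
```

In this illustration the (5, 5) model has the lowest raw error but the larger penalty term makes the (2, 2) model the minimum-FPE choice.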
According to the schematic diagram of the ARX model implementation shown in figure 2.6,
$s(k)$ is the signal of interest that needs to be extracted from the background noise $n(k)$.
The deterministic ABR $s(k)$ is derived by filtering the reference $u(k)$, i.e.
$S(z) = \frac{B(z)}{A(z)} U(z)$. The transfer function $\frac{B(z)}{A(z)}$ merely represents a
mechanism to incorporate deterministic single sweep EP variations into the reference signal
rather than a physiologically meaningful process. The ongoing EEG $n(k)$ is derived by
filtering Gaussian white noise $e(k)$ with an all pole filter (AR model), i.e.
$N(z) = \frac{1}{A(z)} E(z)$. Finally, the single sweep $y(k)$ corresponds to the additive
model in equation (2.4). A template $u(k)$ is then derived from a MTA of an ensemble of
$y(k)$ and used as the exogenous input to the derived ARX model, $A(z)$ and $B(z)$. The
output of the model is a single sweep $y(k)$, and the model orders are determined using the
minimum FPE. The estimated ABR $\hat{s}(k)$ is then derived as:
\[ \hat{S}(z) = \frac{B(z)}{A(z)} U(z) \tag{2.6} \]
Figure 2.6: The process of the ARX model
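In the time domain, equation (2.6) corresponds to running the reference u(k) through the difference equation defined by the estimated coefficients. The sketch below shows this filtering step, assuming zero initial conditions and the indexing of equation (2.3); `arx_filter` is an illustrative name and the coefficients in the example are arbitrary, chosen only so the impulse response is easy to verify by hand.

```python
def arx_filter(u, a, b, d):
    """Filter u(k) with B(z)/A(z): s(k) = -sum_j a_j s(k-j) + sum_l b_l u(k-l).

    a holds a_1..a_p, b holds b_d..b_{q+d-1}, d is the input delay.
    Samples before k = 0 are taken as zero.
    """
    s_hat = []
    for k in range(len(u)):
        acc = 0.0
        for j, a_j in enumerate(a, start=1):
            if k - j >= 0:
                acc -= a_j * s_hat[k - j]
        for i, b_l in enumerate(b):
            l = d + i
            if k - l >= 0:
                acc += b_l * u[k - l]
        s_hat.append(acc)
    return s_hat


# Impulse response of a first-order example: a_1 = -0.5, b_1 = 1.0, d = 1.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
response = arx_filter(impulse, a=[-0.5], b=[1.0], d=1)
print(response)  # [0.0, 1.0, 0.5, 0.25, 0.125]
```

The one-sample delay and the geometric decay of the response reflect the delay d and the single pole at 0.5 respectively.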
2.2.3 Applications of ARX modelling
Cerutti et al. (1987) used the ARX model to extract single sweep estimates of VEPs
with a MTA of 99 sweeps as the reference input u(k) to the model. Model orders were
determined by minimizing the Final Prediction Error (FPE) function (Akaike 1970) and
were set to p = 6, q = 7. The FPE is a measure of model goodness of fit and represents
a compromise between model complexity (i.e. the number of parameters, p + q − 1) and
model prediction error. These model orders were the minimum required to accurately
represent the relationship between the single sweep, the exogenous reference signal and the
Gaussian white noise. The estimated single sweeps fell into four categories: 70% had a
morphology close to that of the reference input, providing a statistically consistent
value of the mean pattern of all responses; 12% did not represent a VEP due to their small
amplitudes; 5% resembled the VEP only during the first half of the sweep; and 13% had a
completely different morphology to that of the reference. The negative results in this
study were attributed only to an assumed absence of the VEP due to loss of attention in the
subject, which leads to poor focusing of the stimulus on the retina (Cerutti et al. 1987).
However, the lack of the actual EP when using empirical data poses a major disadvantage in
determining whether corrupt results are due to the noise present in the data or to
shortcomings of the ARX methodology.
The lack of standard procedure to assess the effectiveness of the ARX method may
have led to misinterpreted results. A simulation study with reproducible EPs at each single
sweep could avoid such uncertainties and provide an unbiased opinion on the aberrations
of the estimated single sweep.
Based on the development of single sweep extraction, ARX modelling was initially used
to rapidly quantify the effects of anaesthetics on the MLAEP in order to quantify the depth
of anaesthesia (Jensen et al. 1996). It is well known that anaesthetics systematically and
reversibly affect the amplitude and latencies of the major middle latency components Na,
Pa, Nb and Pb (Church & Gritzke 1987, Church & Gritzke 1988). Jensen et al. compared
the behaviour of the Na-Pa amplitude extracted using ARX modelling with that from
conventional MTA during combined propofol and alfentanil anaesthesia. The ARX model
parameters were fixed as follows: a MTA of 256 epochs of MLAEP was assigned to the reference
input u(k) to suppress the noise by 24 dB, a single epoch was assigned to yi(k) (hence
single sweep estimation) and the model orders were fixed to p = q = 5 on the basis of the
FPE. The ARX model output indicated the onset of unconsciousness of the order of 1.5 minutes
earlier than conventional MTA. The change in the amplitudes was also larger in the ARX
extracted results, providing a clearer distinction between the unconscious and the awake
state.
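As a quick check of the 24 dB figure quoted for the 256-epoch reference, and assuming the $10 \log_{10}(N)$ relation of section 2.2.1 applies (noise uncorrelated between epochs), the arithmetic is:

```latex
\[ 10 \log_{10}(256) = 10 \times 2.408 \approx 24.1 \ \text{dB} \]
```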
This application of the ARX model led to the development of the A-Line ARX index
(AAI) to quantify the depth of anaesthesia (Jensen et al. 1998), which was eventually used
commercially in the AEP Monitor/2 (DanmeterAps 2010). The parameters of the ARX model
used to calculate the AAI are slightly different. A MTA of 15 epochs is used for the output
of the model yi(k) instead of a single sweep, whilst the other parameters remain the
same as those mentioned earlier. The modified yi(k) improves the efficiency of the
estimation process by reducing the noise associated with it. A separate study comparing
conventional MTA with ARX in the AAI states that the A-Line index detects the transition
from awake to unconscious in 6 seconds, whereas a MTA of 256 sweeps detects it after
28.4 seconds (Litvan, Jensen, Galan, Lund, Rodriguez, Henneberg, Caminal & Villar Landeira
2002). Similar promising results were achieved with patients who had anaesthesia induced
with thiopentone and subsequently maintained with isoflurane and alfentanil during
tracheal intubation (Urhonen et al. 2000). The AAI was also tested for the effects of
sevoflurane, showing a rapid response compared to MTA and encouraging results
(Alpiger, Helbo-Hansen & Jensen 2002).
Further evidence for the successful use of ARX modelling for rapid extraction is
reported in the findings on monitoring sedation in cardiac surgery patients (Mainardi
et al. 2000). Amplitude differences between P50 and N100 features of ALRs were moni-
tored in patients undergoing anaesthesia with propofol-alfentanil or midazolam-fentanyl.
Here, single sweeps were extracted using a reference signal u(k) with a MTA of 40 sweeps.
The low number of sweeps for the reference signal is due to the high SNR of ALRs com-
pared to MLAEPs.
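The reference sizes quoted in this section can be compared through the usual averaging gain. As a worked example, assuming the EP is identical in every sweep and the noise is uncorrelated between sweeps (an idealisation that real EEG only approximates), averaging N sweeps improves the SNR by

\[
\Delta\mathrm{SNR}(N) = 10\log_{10} N \ \mathrm{dB}, \qquad
\Delta\mathrm{SNR}(256) \approx 24.1\ \mathrm{dB}, \quad
\Delta\mathrm{SNR}(40) \approx 16.0\ \mathrm{dB}, \quad
\Delta\mathrm{SNR}(15) \approx 11.8\ \mathrm{dB}.
\]

The first value agrees with the 24 dB noise suppression quoted for the 256 epoch reference.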
Apart from monitoring anaesthetic effects, ARX modelling is used to extract SEPs
and H-reflex responses induced by stimulating the spinal cord (Rossi et al. 2007). Such
applications are essential for the intraoperative neuromonitoring recommended during surgery
where the functional integrity of the spinal cord is at risk. In this application, the way the
reference signal is calculated was modified: it is an average of 50 trials
with exponential weights, which minimise the effect of past trials. The model orders
were set to p = 2, q = 4. Results suggest an improved temporal resolution of 1 second
and therefore this method is deemed suitable for an online rapid extraction system. A DSP
based hardware system incorporating this algorithm has also been built successfully
(Bracchi, Perale, Rossi, Gaggiani & Bianchi 2003).
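An exponentially weighted reference of the kind described above can be sketched as follows. The forgetting factor, its value and the function name are hypothetical; the exact weighting used by Rossi et al. (2007) is not given in this review.

```python
def exp_weighted_reference(trials, forgetting=0.9):
    """Weighted average of equal-length trials; the last trial in the list is the newest.

    The newest trial gets weight 1 and a trial that is `age` steps older gets
    forgetting**age, so past trials contribute progressively less.
    The forgetting factor is an assumption of this sketch.
    """
    n = len(trials)
    weights = [forgetting ** (n - 1 - i) for i in range(n)]
    total = sum(weights)
    length = len(trials[0])
    return [
        sum(w * trial[k] for w, trial in zip(weights, trials)) / total
        for k in range(length)
    ]
```

Setting `forgetting=1.0` reduces this to the conventional average of the trials in the list.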
ARX modelling was also used to analyse the shape and the time course of periodic
sharp wave complexes (PSWCs) of flash VEPs, which assisted in clinically diagnosing
Creutzfeldt-Jakob disease (Visani, Agazzi, Scaioli, Giaccone, Binelli, Canafoglia, Panzica,
Tagliavini, Bugiani, Avanzini & Franceschetti 2005). The chosen model orders for the
analysis were p = q = 4. However, this study did not report the number of sweeps averaged
for the reference input (one could assume it to be a single sweep). The extracted single sweep
results contributed to clarifying the much-debated problem of the occurrence of giant
flash VEPs in Creutzfeldt-Jakob disease and their relationships with the spontaneous
periodic EEG patterns.
In summary, the review identifies major contributions in which ARX modelling has
been successfully used to extract EPs. Depending on the noise present in the
physiological signal of interest, some have estimated a single sweep while others have estimated the
resultant of an ensemble of sweeps. However, this literature survey found no study in which
ARX modelling has been used to extract the ABR. Therefore
the use of ARX on the ABR may be considered a worthwhile endeavour, with many potential
applications.
With regard to anaesthesia monitoring, the tracking of time scale variations of
physiological signals has not been studied, because the emphasis has been on
applications that predominantly depend upon amplitude variations. A study
of the latency tracking ability of ARX modelling will open a new paradigm of
applications where time scale varying features are of importance.
In addition, the validity of the extracted signals has so far been assessed using recorded
EP data. Due to the high variability of real EPs, it is worthwhile to carry out an initial
simulation study with synthetic data to verify the model performance.
2.2.4 Robust evoked potential estimator - REPE(p, q, r, d)
Because of the much lower SNR of the ABR compared to the other EPs, of the order of
-30 to -20 dB compared to about 0 dB for the MLAEP, it is reasonable to assume that
the rapid extraction of the ABR using ARX methodologies will require modification. Lange
& Inbar (1996) have proposed a method claimed to resist such noise conditions by
pre-whitening the template u(k) before applying the ARX model calculations. The resulting
implementation, referred to as the Robust Evoked Potential Estimator (REPE), is based
on the fact that a successful estimation of a system is achieved by exciting the input of the
model with a wide band of frequencies (Box & Jenkins 1976). Otherwise, when the
input is driven by a narrow-band signal, it might not excite the system to its full modality,
resulting in only a partial identification of the examined signal. Therefore in REPE, the
exogenous input to the ARX model is pre-whitened via an AR process to broaden the
frequency distribution. This can be expressed as
\[ u(k) = -\sum_{j=1}^{r} c_j u(k-j) + \xi(k) \]
where the $c_j$ are model coefficients and $r$ is the order of the AR process. As shown in figure 2.7,
the process after z-transform can be formulated as:
\[ A(z)Y(z) = B(z)C(z)U(z) + E(z) \tag{2.7} \]
where $C(z) = \sum_{j=1}^{r} c_j z^{-j}$. The AR model order $r$ and the model coefficients $c_j$ are calculated
using the FPE and a batch least-squares method respectively. Here, the exogenous input to
the conventional ARX model is $W(z) = C(z)U(z)$. The estimated EP $s(k)$ is then derived
as:
\[ S(z) = \frac{B(z)C(z)}{A(z)}\,U(z) \tag{2.8} \]
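The batch least-squares estimation of the AR coefficients c_j can be sketched as below. This is only an illustration of the AR model u(k) = -sum_j c_j u(k-j) + xi(k) given above. The function names and the use of NumPy are assumptions, and the sketch returns the innovation xi(k) as the whitened sequence rather than reproducing the full REPE of Lange & Inbar (1996).

```python
import numpy as np

def fit_ar(u, r):
    """Batch least-squares estimate of c_1..c_r in u(k) = -sum_j c_j u(k-j) + xi(k)."""
    u = np.asarray(u, dtype=float)
    # Column j-1 holds u(k-j) for k = r .. N-1.
    X = np.column_stack([u[r - j:len(u) - j] for j in range(1, r + 1)])
    target = u[r:]
    neg_c, *_ = np.linalg.lstsq(X, target, rcond=None)  # target = X @ (-c) + xi
    return -neg_c

def ar_residual(u, c):
    """Innovation xi(k) = u(k) + sum_j c_j u(k-j), i.e. the whitened input."""
    u = np.asarray(u, dtype=float)
    r = len(c)
    xi = u[r:].copy()
    for j in range(1, r + 1):
        xi += c[j - 1] * u[r - j:len(u) - j]
    return xi
```

In practice the order r would be chosen by minimising the FPE over candidate orders; that search is omitted here.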
In verifying the REPE, Lange & Inbar (1996) used EPs obtained from a finger tapping
experiment with a MTA of 200 sweeps as the reference signal. Initially, the pre-whitening
model order was determined to be r = 8, followed by the calculation of the other model
orders to be p = 8, q = 6 (i.e. REPE(8,6,8,0)) by minimising the prediction error as per
the FPE (2.5). A comparison with the conventional ARX(8,7,0) model suggested in general
that single sweeps estimated by REPE(8,6,8,0) maintain a close resemblance to the average
response in both shape and amplitude, whereas ARX estimations often display high
amplitudes which ‘may be’ noise related and sometimes fail to describe the general wave shape
(Lange & Inbar 1996).
Figure 2.7: The process of the REPE
2.2.5 Simulation studies and drawbacks
To date, these ARX methods of rapid extraction have been evaluated by their ability to
detect assumed variations in recorded EP data. Since the deterministic EP is unknown in
the actual data, simulated data are essential in order to provide a deterministic EP so that
ARX modelling methods for rapid extraction can be meaningfully evaluated. According to
information gathered through this literature review, none of the research has conducted a
complete study with simulated data. Therefore, a set of synthetic EP data with predefined
but physiologically plausible variations of those EP features could provide a less biased
basis to investigate the model estimations compared to those studies which use actual
single sweep EPs, whose characteristics are highly variable.
Few attempts have been made to verify the effectiveness of tracking varying features
using semi-simulated data, where an actual EP template is subjected to predefined
modifications. The lack of a precise definition of the EP template has made reproduction of
the results impossible and comparisons inaccurate. Cerutti et al. (1987)
attempted to verify the amplitude variation tracking ability of the ARX model
by inducing variations in the single sweeps by changing the gain in the signal generating
model. However, the gain values were not quantified, making it impossible to
determine the limitations of extracting amplitude variations with the ARX model.
Also, Rossi et al. (2007) have attempted a simulation study to extract the
amplitude variations of SEPs, which are used in a spinal cord intraoperative
neuromonitoring technique, using conventional ARX modelling. Here, the evaluation is limited
to the tracking of amplitude variations, with a concluding remark stating that ‘the early
detection of changes in the EP could be better achieved by ARX than MTA’, which is the
expected result. The significant shortcomings of this simulation study are the narrow range
of amplitude variation tested (limited to the specific application domain), the absence of
latency variations and the fixed initial SNR (typically -5 dB for SEPs, which is higher than
the -30 dB of ABRs, assuming the following amplitudes: SEP = 15 µV, ABR = 1 µV, EEG = 30 µV).
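The quoted SNR figures can be checked from the assumed amplitudes. As a worked example, treating the SNR as an amplitude ratio between the EP and the background EEG (an assumption of this check):

\[
\mathrm{SNR}_{\mathrm{SEP}} = 20\log_{10}\frac{15}{30} \approx -6.0\ \mathrm{dB}, \qquad
\mathrm{SNR}_{\mathrm{ABR}} = 20\log_{10}\frac{1}{30} \approx -29.5\ \mathrm{dB}.
\]

These values are in line with the approximate figures of -5 dB and -30 dB quoted above, and show a gap of roughly 24 dB between the two signals.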
With regard to REPE, Lange & Inbar (1996) attempted to verify the performance
in two stages: 1) the denoising ability and 2) the variation tracking ability. They quantified
the denoising ability by calculating the improvement in the SNR, $\mathrm{SNR}_{improvement}$, defined
as:
\[ \mathrm{SNR}_{improvement} = 10\log_{10}\left(\frac{\mathrm{SNR}_{final}}{\mathrm{SNR}_{initial}}\right) \tag{2.9} \]
where $\mathrm{SNR}_{final} = \frac{E[s^2(k)]}{E[(s(k)-\hat{s}(k))^2]}$ and $\mathrm{SNR}_{initial} = \frac{E[s^2(k)]}{E[n^2(k)]}$,
with $\hat{s}(k)$ the estimate of the EP $s(k)$ and $n(k)$ the noise. A comparison of the ARX estimator
and REPE methods for single sweep extraction suggested that the $\mathrm{SNR}_{improvement}$ achieved by
the REPE is substantially higher than that of the ARX for low SNRs, reaching a 20 dB advantage
at an initial SNR of -35 dB (shown in figure 2.8). The REPE exhibits a linearly increasing
$\mathrm{SNR}_{improvement}$ with respect to higher initial SNR and saturates below an initial SNR of -15 dB,
while the ARX presents an almost constant improvement in SNR of approximately 6 dB
irrespective of the initial SNR.
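The measure in (2.9) can be computed directly when the clean EP is known, as in a simulation. The sketch below replaces the expectations with sample means. The function names are hypothetical.

```python
import math

def mean_square(x):
    """Sample estimate of E[x^2(k)]."""
    return sum(v * v for v in x) / len(x)

def snr_improvement_db(s, n, s_hat):
    """SNR improvement of (2.9): s is the clean EP, n the noise, s_hat the estimate."""
    snr_initial = mean_square(s) / mean_square(n)
    error = [a - b for a, b in zip(s, s_hat)]
    snr_final = mean_square(s) / mean_square(error)
    return 10.0 * math.log10(snr_final / snr_initial)
```

Because the clean EP s(k) is required, this measure is only available for synthetic or semi-simulated data.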
However, even though there is a significant improvement in the SNR around an initial
SNR of -30 dB, Lange & Inbar have not assessed the morphology of the extracted EP for
the presence of its features. Given the importance of estimating variations in the latency
of various AEP components, it is essential to systematically investigate the effects of the
REPE method on morphological characteristics at such low initial SNRs.
Figure 2.8: SNR improvement of ARXE (same as ARX) and REPE. The difference in the curves suggests that there is a considerable improvement in SNR when REPE is used (Lange & Inbar 1996).
2.2.6 Scope of the current study
The single trial extraction methods of ARX and REPE have been applied with much
apparent success to a range of evoked responses (Cerutti et al. 1987, Lange & Inbar 1996,
Rossi et al. 2007) (specific details were discussed in section 2.2.3). However, attempts
to systematically verify the validity of these methods using simulated data suffer from
the following drawbacks:
(i) Use of grand average of real EP data as the reference/template signal. Such EPs
are not reproducible thus a comparison or a performance evaluation with previous
work is impossible.
(ii) Ambiguity of the selection criteria for model parameters makes the validation of
model performance with previous work impossible.
(iii) The range of latency variations tested in the simulated EP is limited and does not
encompass the expected range of physiological variation.
Because of the promising results produced in extracting single sweeps of AEP
components, it is important to systematically evaluate the performance using synthetic data in
order to clearly define the scope for the use of ARX and REPE methods. On this basis,
this thesis presents an original, well defined, reproducible simulation study to determine
the effectiveness of ARX and REPE rapid extraction methods under variations in the SNR of a
simulated ABR and the ability of these methods to accurately track time-scale variations
(latency) of simulated ABR components. This study follows from a similar simulation
study in a different paradigm, in which three algorithms (correlation based, adaptive
least mean square based and p-norm based) were compared with respect to their ability to
track EP latency variations (Kong & Qiu 2001).
Chapters 4 and 5 present systematic simulation studies performed on a reproducible,
mathematically modelled ABR with typical EEG noise added, to determine the
limitations of single sweep extraction. A range of physiologically plausible latency
variations was induced in this model to assess the range over which accurate tracking can be
achieved. This study will lead to a better choice of applications which could use
parametric modelling to calculate a fine-grained measure of the brain state, thereby providing a
better understanding of the relevant brain structures which generate EPs. The methodology
and the results of this study are described in sections 4.2 and sec:ARXRes respectively. To
confirm these results, real ABRs recorded from human participants were then subjected to the
ARX and REPE methods, and the results are discussed in section 4.3.6 with respect to the
practical implementation of these methods.
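To make the idea of a reproducible synthetic sweep concrete, the following is a purely illustrative sketch and not the ABR model used in this thesis. The Gaussian wave shapes, latencies, amplitudes and widths are hypothetical values, and white Gaussian noise stands in for EEG noise.

```python
import math
import random

def mean_square(x):
    return sum(v * v for v in x) / len(x)

def synthetic_abr(t_ms, shift_ms=0.0):
    """Sum of three Gaussian bumps standing in for waves I, III and V (hypothetical shapes)."""
    peaks = [(1.5, 0.3), (3.5, 0.4), (5.5, 1.0)]  # (latency in ms, amplitude), illustrative
    width = 0.3                                    # standard deviation in ms, illustrative
    return [
        sum(a * math.exp(-((t - (lat + shift_ms)) ** 2) / (2 * width ** 2))
            for lat, a in peaks)
        for t in t_ms
    ]

def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise scaled so that the sweep has exactly the requested SNR."""
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    scale = math.sqrt(mean_square(signal) / (mean_square(noise) * 10 ** (snr_db / 10)))
    noise = [scale * v for v in noise]
    return [s + v for s, v in zip(signal, noise)], noise

t = [i * 0.05 for i in range(200)]               # 0 to 9.95 ms in 0.05 ms steps
clean = synthetic_abr(t)
delayed = synthetic_abr(t, shift_ms=0.5)         # known latency shift to be tracked
sweep, noise = add_noise_at_snr(clean, -30.0, random.Random(0))
```

Because the deterministic component and its latency shift are known exactly, any estimate produced from `sweep` can be scored against `clean` or `delayed`.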
2.3 Review of wavelet based extraction methods
Wavelet analysis has recently been used with much success on EPs and ERPs, including the
ABR. In general, EPs are recorded from multiple electrode arrays with variable frequency
content over time and across spatial location on the scalp, and are therefore characterised as
non-stationary in both time and space. Also the features and variations of these EPs
which interest neuroscientists and clinicians are transient (localized in time), prominent
over certain scalp regions (localized in space), and restricted to certain ranges of temporal
and spatial frequencies (localized in scale).
One of the major advantages of using wavelets on such signals is that they permit the
accurate decomposition of EPs into a set of sub-waveforms which can isolate all frequencies,
from the largest to the smallest pattern of variation in time and space, which is
characteristic of non-stationary EPs. Consequently, wavelet analysis provides flexible control over
the resolution, which enables components of the EP and their variations to be localized in
time, frequency and space. This control over resolution translates directly into increased
power in statistical waveform analyses and improved digital processing techniques to
detect and analyse anatomical structures and pathological effects which contribute to such
variations.
In contrast to parametric modelling (ARX and REPE), which is based on the characteristics
of the noise (ongoing EEG), wavelet analysis depends upon the morphological characteristics
of the signal of interest. The choice of basis wavelet is critical in a wavelet analysis to
produce accurate estimates (Wilson & Aghdasi 1999, Wilson et al. 1998). Therefore this
thesis presents a comparable analysis of the ABR through wavelets as a new paradigm
and compares this with parametric modelling results to arrive at a conclusion.
2.3.1 Wavelets in the extraction of ABRs and in general EPs
Considering the advantages, researchers have implemented several wavelet techniques to
extract EPs, predominantly for auditory middle and late components and in some cases
for the early ABR.
Due to relatively high SNR of middle and late EPs, single sweep extraction was widely
examined for rapid extraction using wavelets (Demiralp, Yordanova, Kolev, Ademoglu,
Devrim & Samar 1999, Quian Quiroga 2005). Quian Quiroga (2005) reported a wavelet
denoising method; Time Windowing with Matching Coefficients (TWMC) using discrete
wavelet transform (DWT) to extract single sweeps of MLAEPs and VEPs. These EPs
are of the same order of SNR, but high compared to a typical ABR. TWMC resulted
in superior denoising compared to Wiener filtering when applied to a simulated signal
having a SNR of 0 dB. However, Woody averaging performed to enhance the EPs prior
to applications of TWMC (Woody 1967) is prone to errors with large amplitudes of noise
associated with the ABR. Such preprocessing could lead to false peak identification, thus
rendering it unsuitable to apply on ABRs. Therefore, it is worthwhile to investigate the
applicability of TWMC to unmodified ABRs for rapid extraction purposes.
Demiralp et al. (1999) reported an analysis of the alpha (8-16 Hz), theta (4-8 Hz) and delta
(0.5-4 Hz) bands of single sweeps of middle EPs and P300 by decomposing with the DWT.
This study investigates the temporal locations of EP components in relation to specific
cognitive processes. However, the wavelets are used for the separation of frequency bands,
with decomposed wavelet sub-spaces which reflect conventional band-pass filtering.
Because the ABR bandwidth overlaps with the noise spectrum (as a result of fast ABR
components), such simple methods cannot be applied to extract the ABR. Similar studies
involving the DWT are reported by Acir et al. (2006), Bradley & Wilson (2005) and McCullagh et al. (2007),
but these do not aim at rapid extraction. Instead, these studies use an ABR
derived from a grand average of more than 1000 epochs and then band-limit it using the DWT
to analyse the ABR.
Further use of wavelets in EP studies is reported by Effern et al. (2000), who introduced
a method for single trial analysis of P300 that combines non-linear state-space time series
analysis with the wavelet transform. However, the results indicate that optimum filter
characteristics were not achieved, producing misleading interpretations in the absence of the necessary
arrangement of the single trial ensemble, and the method was thus not used for further analysis.
In contrast, Causevic et al. (2005), Maglione et al. (2003) and Zhang et al. (2006) have
made attempts to reduce the number of epochs involved in the analysis using different
thresholding criteria.
A constant threshold with matching coefficients (CTMC) with a template was
implemented by Zhang et al. (2006) to detect the presence of an ABR using a Bayesian
network. This is an improved version of the basic thresholding method (Donoho 1995),
suitable for signals with low SNRs. Here, the presence of an ABR is based on the detection
of wave V only (the only wave present at low stimulus intensities). This study reports
the extraction of wave V using a MTA of 64 and 128 epochs, leading to rapid extraction
with an accuracy of 78.8% and 84.2% respectively. In contrast, the study presented in this
thesis aims to extract a fully featured ABR including the three main waves I, III and V.
Also, time scale variation tracking, which is critical in the context of this thesis, has not
been investigated in previous studies. The need for such an investigation is highlighted by
the possibility that the use of a template in CTMC limits the range of time scale variation
tracking capability.
However, based on the encouraging results reported for the rapid nature of the
extraction, this method was considered for further exploration and analysis as a possible
algorithm to rapidly extract an ABR.
In a separate study, Strauss et al. (2004) reported a signal classifier using vectors
generated by wavelets as the input and achieved 100% sensitivity and 90% specificity in
classifying the presence of the wave V at a stimulus intensity of 30 dB. This study presents
significant motivation for the use of wavelets in rapid extraction of the ABR. However, the
study described in this thesis pursues a morphologically correct ABR showing wave I, III
and V for potential identification of aberrations due to neurological disorders in addition
to hearing screening. In this regard, the following differences were identified in the series
of work reported by Strauss et al. compared to the aim of this thesis.
• In Strauss et al. (2004), even though the feature extraction is of a single sweep,
the novelty detection scheme which identifies the presence or absence of the ABR
features uses an average of 10 feature vectors to reduce the accidental changes
in the measurement setup.
• The method reported in Corona-Strauss, Delb, Bloching and Strauss (2007) and
Corona-Strauss, Delb, Hecker and Strauss (2007) does not generalise across
patient conditions (i.e. it is patient specific) as a result of inter-individual
variations of the synchronization stabilities. This impedes the application
of such a method in clinical situations as it is not possible to establish an absolute
threshold. In contrast, the denoising method recommended in this thesis should
be independent of the patient condition, thus making it better suited for adaptation
to all patient conditions.
• In addition, a systematic simulation study or a time scale feature tracking study was
not performed in any of Strauss et al.'s work. This contrasts with the investigations
conducted in this thesis with the use of reliable latency-intensity curves.
Another promising fast estimation method based on thresholding, called cyclic shift tree
denoising (CSTD), was introduced by Causevic et al. (2005). This method consists of a
circular averaging technique with an additional thresholding criterion. Circular averaging
creates additional averages (averaging being the most reliable and conventional method), leading
to better noise suppression. The non-linearity of the averaging process makes it an improved
version of conventional moving time averaging. MLAEPs and ABRs were used to assess
the performance of the CSTD and yielded promising results. However, the rapid nature of
the extraction is not highlighted in this study, which presents denoised waveforms of
a moving time average of 512 ABR epochs and 256 MLAEP epochs. Even though Causevic
et al. (2005) have mentioned the improvement in SNR in general, specific features of filtered
ABRs involving smaller numbers of epochs have not been analysed. Another shortcoming
of this study is the absence of an evaluation of the method under time scale variations of
the ABR, which is a common drawback of the previously mentioned studies.
According to the literature available at the time of this research, the exact nature of
a fully featured ABR extraction including critical time-scale variations has not been analysed
(even though several promising rapid extraction algorithms have been formulated). Table 2.3
summarises the specifications of the following three most probable algorithms for rapid
extraction:
• Constant thresholds with matching coefficients (CTMC)
• Temporal windowing with matching coefficients (TWMC)
• Cyclic shift tree denoising (CSTD)
In summarising the research literature, CTMC was used to test the ABR but concen-
trated only on peak V. TWMC was used to investigate P300 but not ABR. CSTD was
used to test MLAEP and ABR but was not used to quantify the morphology of the ABR.
All these techniques have contributed to rapid extraction in their own right. However,
this thesis investigates the performance of denoising of these three wavelet methods using
a common set of ABR data.
In addition, the thesis presents an investigation of these methods under time varying
conditions, close to what would be seen in clinical practice using latency-intensity curves.
Features of the algorithm         | Zhang (2006) CTMC | Quian Quiroga (2005) TWMC | Causevic (2005) CSTD
Wavelet algorithm                 | DWT               | DWT                       | DWT
Basis wavelet                     | Biorthogonal 5.5  | Biorthogonal 4.4          | Biorthogonal 4.4
Use of a template                 | Yes               | Yes                       | No
Analysed EP                       | ABR               | ALR                       | MLAEP and ABR
Minimum number of epochs involved | 64                | Single sweep              | MLAEP - 256, ABR - 512
EP features analysed              | Wave V            | P100, N200, P300          | ALMR - Na, Nb, Pa, Pb; ABR - overall morphology
Table 2.3: Specifications of key algorithms for rapid extraction.
It has been reported that shift variance plays a key role in filtering time-shifted data with
the DWT (Bradley & Wilson 2004, Kingsbury 2001). Therefore, CTMC, TWMC and CSTD
were implemented with the DWT and also reformulated with the stationary wavelet transform (SWT)
to identify such temporal distortions, as described in section 5.1.6.
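The shift variance of the decimated transform can be illustrated with a small sketch. It uses the Haar filter at a single level for brevity (an assumption of this example, since the methods reviewed here use Biorthogonal wavelets) and a circular undecimated filter as a stand-in for one level of a stationary transform.

```python
import math

S = 1 / math.sqrt(2)

def haar_detail_decimated(x):
    """Level-1 Haar detail coefficients of the DWT (one coefficient per sample pair)."""
    return [(x[2 * i] - x[2 * i + 1]) * S for i in range(len(x) // 2)]

def haar_detail_undecimated(x):
    """The same filter applied at every sample (circularly), without decimation."""
    n = len(x)
    return [(x[i] - x[(i + 1) % n]) * S for i in range(n)]

def shift(x, k):
    """Circularly delay a list by k samples."""
    k %= len(x)
    return x[-k:] + x[:-k] if k else list(x)

x = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
x_delayed = shift(x, 1)

d, d_delayed = haar_detail_decimated(x), haar_detail_decimated(x_delayed)
u, u_delayed = haar_detail_undecimated(x), haar_detail_undecimated(x_delayed)
```

Here `d_delayed` is not a shifted copy of `d` (the single non-zero coefficient changes sign), whereas `u_delayed` equals `shift(u, 1)`, so a one-sample delay of the input only delays the undecimated coefficients.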
To achieve comparable results for these wavelet-denoising methods, a common basis
wavelet is essential. As shown in table 2.3, the order of the Biorthogonal wavelets used for
each method is different as a result of adopting recommendations of other similar studies.
Therefore this thesis presents a study to choose the optimum, common Biorthogonal
wavelet for all three denoising methods: CTMC, TWMC and CSTD. The methodology
for this study is described in section 5.2.
The following sections describe the theoretical background of wavelets in detail and
the way in which they contribute as a powerful tool to analyse the ABR.
2.3.2 Concept of wavelets
A wavelet is a time domain function which inherits specific properties of the energy content,
frequency content and the length in time. The wavelet function defined over the real axis
(−∞,∞) must satisfy two basic properties: 1) zero mean and 2) unity energy. These can be
mathematically expressed as in (2.10) and (2.11).
\[ \int_{-\infty}^{\infty} \psi(t)\,dt = 0 \tag{2.10} \]
\[ \int_{-\infty}^{\infty} \psi^2(t)\,dt = 1 \tag{2.11} \]
The notion of ‘small wave’ or ‘wavelet’ is related to the limited interval associated in
time scale (Haar 1910). If we assume that there exists a δ, very close to zero, there must be
an interval [−T, T] of finite length within which the energy of the wavelet satisfies
\[ \int_{-T}^{T} \psi^2(t)\,dt > 1 - \delta \tag{2.12} \]
Because the deviation of ψ(t) from zero outside the interval [−T, T] will be insignificant, the
activity of the wavelet is limited to the interval [−T, T] and negligible outside it. If
the total waveform energy is concentrated strictly within this time window (i.e. zero
amplitude outside the window) or if the majority of the waveform energy is within the
window (i.e. low amplitudes outside the window), those wavelets are described as having “compact
support”. In contrast, even though the cosine base signal in the Fourier transform fulfils the
criterion for the mean in (2.10), its energy diverges to infinity and cannot
be normalised to unity, and it is therefore not considered a wavelet. Also, distinct from
the Fourier base function, the band limited frequency content in the wavelet sub-bands
enhances the dynamic nature when used to analyse transient signals.
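The two admissibility properties (2.10) and (2.11), and the contrast with the cosine, can be checked numerically. The sketch below uses the Haar wavelet on [0, 1) as the example and a simple midpoint rule; both choices, and the function names, are assumptions of this illustration.

```python
import math

def haar(t):
    """Haar wavelet on [0, 1): +1 on the first half, -1 on the second, 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0

def integrate(f, lo, hi, n=10000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    dt = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * dt) for i in range(n)) * dt

mean = integrate(haar, -2.0, 3.0)                       # property (2.10)
energy = integrate(lambda t: haar(t) ** 2, -2.0, 3.0)   # property (2.11)

# Energy of a cosine over [-T, T] keeps growing with T instead of converging.
cosine_energy = [
    integrate(lambda t: math.cos(2 * math.pi * t) ** 2, -T, T) for T in (1, 10, 100)
]
```

The Haar wavelet gives a mean of zero and an energy of one, while the cosine energy grows in proportion to T, which is why it cannot be normalised to unity.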
Scaling and translating are the two main principles of wavelet analysis. A wavelet can
be scaled in time by stretching or shrinking the wavelet and can be translated by shifting
to different time positions without changing its basic wavelet shape. An orthonormal,
compactly supported wavelet basis is formed by the scales j and translations k of a single
real valued function $\psi_{(j,k)}(t) = \frac{1}{j}\,\psi\!\left(\frac{t-k}{j}\right)$ defined as the wavelet and its scaling function
$\phi_{(j,k)}(t)$. The result will be a set of coefficients for each scale and translation representing
the extent to which the wavelet has been matched with the signal of interest. This implies
that larger coefficients are related to the actual signal and noise is mostly represented by
smaller coefficients (the choice of a morphologically similar wavelet basis function for
such analysis is critical). Therefore the systematic removal of less significant coefficients
leads to refining the signal of interest. These coefficients can then be used to reconstruct
the denoised signal of interest by an inverse calculation.
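The sequence of decomposition, removal of small coefficients and inverse calculation can be sketched with a single-level Haar transform. The Haar wavelet, the hard threshold and the threshold value are assumptions chosen to keep the example short. They are not the thresholding rules of the methods reviewed in this chapter.

```python
import math

S = 1 / math.sqrt(2)

def haar_dwt(x):
    """One decomposition level: approximation and detail coefficients of an even-length list."""
    pairs = range(len(x) // 2)
    a = [(x[2 * i] + x[2 * i + 1]) * S for i in pairs]
    d = [(x[2 * i] - x[2 * i + 1]) * S for i in pairs]
    return a, d

def haar_idwt(a, d):
    """Inverse of haar_dwt."""
    x = []
    for ai, di in zip(a, d):
        x.extend([(ai + di) * S, (ai - di) * S])
    return x

def hard_threshold(coeffs, thr):
    """Zero every coefficient whose magnitude is below thr."""
    return [c if abs(c) >= thr else 0.0 for c in coeffs]

x = [4.0, 4.1, 1.0, 5.0]
a, d = haar_dwt(x)
denoised = haar_idwt(a, hard_threshold(d, 0.5))
```

The small detail coefficient produced by the pair (4.0, 4.1) is removed, so that pair is reconstructed as its mean, while the large step between 1.0 and 5.0 is preserved.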
2.3.3 Basis wavelets
A basis wavelet (also known as a mother wavelet) is defined as a basic waveform shape.
Haar, Daubechies, Symlet, Morlet and Biorthogonal are a few examples of different
waveform shapes. The different orders of each basis wavelet form a wavelet family in which the
higher orders make the wavelet smoother. Each wavelet family possesses unique properties
that make it more appropriate for a certain range of applications, including extraction
of the ABR. Some of these essential properties are explained below.
• Symmetry - The phase response of a filter, and therefore a wavelet, is defined by
its symmetry properties. If a basis wavelet possesses either even or odd symmetry,
then the corresponding phase response is linear (Parameswariah & Cox 2006). A
linear-phase filter preserves the phase relationships between the filtered frequency
components, in contrast to an asymmetric filter with a non-linear phase response, which
produces phase distortions (Oppenheim & Schafer 1999). With the exception of the orthogonal
Haar wavelet (a B-spline of degree zero), only Biorthogonal basis wavelets can be designed to have
linear phase responses.
One of the main foci of the context of this thesis is to study the ability of wavelet
methods to track time-scale variations of specific ABR features. Therefore the preser-
vation of the locations of these features is critical, thus signifying the necessity of
symmetric filters.
In addition, the symmetry of the basis wavelet reduces the number of multiplications
in the convolution integral by half. This is achieved by adding signal samples prior
to multiplication by the filter coefficient (Bradley & Wilson 2004).
• Smoothness - There are several factors which affect the smoothness of the basis
wavelet. Smoothness defines the differentiability of the wavelet. Since EPs such
as the ABR are smooth continuous signals, the smoothness of the basis wavelet
improves the morphology of the reconstructed waveform (Burrus, Gopinath & Guo
1998).
The order (the number of vanishing moments) directly affects the smoothness or the
regularity of the basis wavelet. In theory, higher orders allow more complex signals
to be accurately represented by the scaling functions of the wavelet. As the order
represents the accuracy (Strang & Nguyen 1997), wavelets with a higher order, in
general, approximate signals with fewer non-zero coefficients, thus providing a sparse
representation suitable for data compression and fast calculations (Mallat 1998).
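The saving in multiplications offered by a symmetric filter, mentioned under the symmetry property, can be sketched as follows. The filter coefficients are arbitrary illustrative values rather than those of a Biorthogonal wavelet, and the function names are hypothetical.

```python
def fir_direct(x, h, n):
    """y[n] = sum_k h[k] x[n-k]; one multiplication per filter tap (requires n >= len(h) - 1)."""
    return sum(h[k] * x[n - k] for k in range(len(h)))

def fir_folded(x, h, n):
    """Same output for a symmetric h (h[k] == h[L-1-k]) with about half the multiplications."""
    L = len(h)
    y = 0.0
    for k in range(L // 2):
        # Add the two samples that share a coefficient, then multiply once.
        y += h[k] * (x[n - k] + x[n - (L - 1 - k)])
    if L % 2:  # odd length: the centre tap has no partner
        y += h[L // 2] * x[n - L // 2]
    return y

x = [1.0, 2.0, 0.5, -1.0, 3.0, 4.0]
h = [1.0, 2.0, 3.0, 2.0, 1.0]  # symmetric, illustrative coefficients
```

For this five-tap filter the folded form uses three multiplications per output sample instead of five.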
The choice of a basis wavelet for filtering varies depending on the EPs. Some choices in the
reported literature appear vague, with no justification for the choice of the basis wavelet.
For example, Huang & Nayak (1999) used the 20th order Daubechies wavelet to
measure the depth of anaesthesia with MLAEP and Kochs et al. (2001) used the 3rd order
Daubechies wavelet. Similarly, Effern et al. (Effern, Lehnertz, Fernández, Grunwald, David
& Elger 2000, Effern, Lehnertz, Schreiber, Grunwald, David & Elger 2000) introduced
and examined an ALR analysis method using a single sweep with wavelets but did not
reveal the basis wavelet used in the study. However, some of the literature has
justified the choice of basis wavelet on the basis of various properties of the chosen family of
wavelets. Wilson et al. (1998) investigated the use of three basis wavelets (Biorthogonal
3.5, Daubechies 5 and Symlet 4) for their ability to analyse ABRs and concluded that
a so-called best basis wavelet is less distinct. Therefore, they suggest choosing the best
basis wavelet for each wavelet decomposition sub-band. Hoppe et al. (2001), in their study
of automatic sequential recognition of ALRs, identified that Mallat’s wavelet produced the
best results compared to Daubechies and Biorthogonal.
Wilson et al. (1998), on the other hand, have provided reasons for their choice of the 5th
order Daubechies wavelet for analysing ABRs, including: exact reconstruction, similarity
in morphology, arbitrary regularity and asymmetry (suitable for the irregularly shaped
ABR waveform), compact support and orthogonality. However, according to the suitability
criteria of symmetry and smoothness mentioned above, several arguments of Wilson et
al. (1998) do not hold. Exact reconstruction can only be achieved if all the coefficients
are involved in the reconstruction process (inverse wavelet transform). In contrast,
denoising applications always nullify irrelevant coefficients and therefore prohibit exact
reconstruction. Also, the choice of an asymmetric basis wavelet in Wilson et al. (1998), which
follows from orthogonality (all orthogonal wavelets except the Haar wavelet are asymmetric),
leads to non-linear phase distortions during the filtering process.
However, the literature review presented here revealed that wavelets in the Biorthogonal
family are used frequently and with justification. Causevic et al. (2005) and Quian Quiroga
(2005) used Biorthogonal 4.4, while Bradley & Wilson (2004) and Zhang et al. (2006)
used Biorthogonal 5.5. Basar et al. (2001) also used Biorthogonal wavelets but did
not mention the order. In addition to EP related applications, Biorthogonal wavelets
contribute to the compression of 2D data by retaining relevant coefficients related to the
energy distribution, as used in JPEG2000 image compression (ISO/IEC 2004) and in
the wavelet/scalar quantization (WSQ) image coding standard of the FBI fingerprint
information storage system (Bradley & Brislawn 1994). The symmetry of the filter is critical
when choosing the family of wavelets, as it gives a stable linear-phase FIR filter which does not
distort the filtered waveform (Parameswariah & Cox 2006) and reduces the shift
variance effect (Singh & Tiwari 2006). In addition, further reasons stated for the choice
of Biorthogonal basis wavelets are: the order of the wavelet giving a better match for the ABR,
the compact support reducing the computational complexity, and the visual similarity to
individual ABR peaks.
2.3.4 DWT with Biorthogonal wavelets
Wavelet analysis and the efficient computer implementation of the DWT are extensively
described in Mallat (1998), and specific applications related to the ABR are explained in
Samar et al. (1999) and Raz et al. (1999). The wavelet transform in its pure form calculates
the wavelet coefficients at all scales and translations, which is both computationally
redundant and time consuming. Therefore a subset of scales and translations of
dyadic nature (based on powers of two) is used for more efficient analysis. Mallat (1989)
introduced a filter bank consisting of high-pass and low-pass filters to calculate the wavelet
coefficients of a discrete time series.
The algorithm of the DWT of a signal x(t) of length N consists of performing several
elementary decomposition steps. Starting with the signal x(t), the first step produces
two vectors of coefficients: approximation coefficients a1 and detail coefficients d1. These
vectors are obtained by a convolution of the signal x(t) with the low-pass filter h for
the approximation and with the high-pass filter g for the detail, followed in both cases
by a dyadic decimation. They are approximately N/2 in length. The convolution of the
signal x(t) by a filter h is defined by $[x * h]_n = \sum_k x_{n-k} h_k$. The dyadic
decimation of a signal is defined by $y_n = x_{2n}$.
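The single decomposition step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Haar filter pair is used because it is short (it is not one of the Biorthogonal wavelets discussed in this chapter), the signal values are made up, and the function names are hypothetical.

```python
import math

def convolve(x, h):
    """Linear convolution [x * h]_n = sum_k x_{n-k} h_k (full length)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for k, hk in enumerate(h):
            if 0 <= n - k < len(x):
                y[n] += x[n - k] * hk
    return y

def decimate(x):
    """Dyadic decimation y_n = x_{2n}: keep every second sample."""
    return x[::2]

def dwt_step(x, h, g):
    """One decomposition step: returns (approximation, detail)."""
    return decimate(convolve(x, h)), decimate(convolve(x, g))

# Haar filters, chosen only because they are short (an assumption of this sketch).
S = 1 / math.sqrt(2)
h = [S, S]     # low-pass filter
g = [S, -S]    # high-pass filter

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]   # made-up signal, N = 8
a1, d1 = dwt_step(x, h, g)
print(len(a1), len(d1))   # 5 5, i.e. approximately N/2 each
```

The extra coefficient beyond N/2 comes from the full-length convolution; practical implementations handle this boundary with a signal extension mode.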
Repeating this operation leads to higher order decomposition levels: the approximation
coefficients are broken up into two, replacing a1 by a2 and d2. The algorithm follows the same
pattern of decomposition up to level l, as shown in the decomposition half of figure 2.9.
Systematic nullification of these coefficients leads to efficient removal of noise. Three
such methods are implemented in this study to extract features of the ABR and are explained
in section 5.1.
When Biorthogonal basis wavelets are used, there are two separate sets of filters for
decomposition and synthesis. The Biorthogonal complements of the decomposition filters used
for synthesis are denoted ĥ (low-pass filter) and ĝ (high-pass filter).
The inverse DWT reconstructs the original signal by up-sampling the
coefficients by a factor of 2 and then convolving them with the synthesis low-pass filter ĥ
and high-pass filter ĝ. The perfect reconstruction of the original signal is shown in the
synthesis section of figure 2.9.
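As a minimal sketch of the analysis and synthesis steps, the following Python fragment decomposes a made-up signal by one level, reconstructs it by up-sampling and filtering, and then repeats the reconstruction with the detail coefficients nullified. It assumes the Haar pair for brevity; for this orthogonal wavelet the synthesis filters are simply time-reversed analysis filters, whereas a Biorthogonal wavelet would use a different synthesis pair. All names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)
H, G = [S, S], [S, -S]            # analysis low-pass / high-pass (Haar)
H_SYN, G_SYN = [S, S], [-S, S]    # matching synthesis filters

def convolve(x, f):
    """Full linear convolution of x with filter f."""
    y = [0.0] * (len(x) + len(f) - 1)
    for n, xn in enumerate(x):
        for k, fk in enumerate(f):
            y[n + k] += xn * fk
    return y

def analyse(x):
    """Filter, then keep every second sample (dyadic decimation)."""
    return convolve(x, H)[::2], convolve(x, G)[::2]

def upsample(c):
    """Insert a zero after every coefficient (up-sampling by 2)."""
    u = [0.0] * (2 * len(c))
    u[::2] = c
    return u

def synthesise(a, d, n):
    """Up-sample, filter with the synthesis pair and add the two branches."""
    ra = convolve(upsample(a), H_SYN)
    rd = convolve(upsample(d), G_SYN)
    # The causal filters delay the output by one sample; skip it.
    return [ra[i + 1] + rd[i + 1] for i in range(n)]

x = [1.0, 3.0, -2.0, 5.0, 0.0, 4.0, 4.0, -1.0]          # made-up signal
a, d = analyse(x)
x_exact = synthesise(a, d, len(x))                       # all coefficients kept
x_smooth = synthesise(a, [0.0] * len(d), len(x))         # details nullified
```

With all coefficients retained, `x_exact` equals `x` up to rounding; once the detail coefficients are nullified the reconstruction is no longer exact, which is the point made earlier about denoising and exact reconstruction.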
2.3.5 Shift variance of DWT
It is well known that the DWT suffers from shift variance in the time domain, i.e. even in the
case of periodic extension of a signal x(t), the DWT of a translated version of x(t) is not,
in general, the translated version of the DWT of x(t).
In other words, time shifts in the signal are not properly represented by the decomposed
approximation and detail coefficients of the DWT. Therefore, signals reconstructed from
the thresholded coefficients of the DWT are prone to latency distortions (Bradley & Wilson
2004, Coifman & Donoho 1995, Kingsbury 2000). This is a result of down-sampling
the decomposed coefficients by a factor of two at each decomposition level in the DWT.
That is, sub-sampling the wavelet transform sub-bands (which only nominally have half the
bandwidth of the original signal) by a factor of two violates the Nyquist criterion, and
frequency components above (or below) the cut-off frequency of the filter are
aliased into the wrong sub-band.

Figure 2.9: Mallat’s cascaded filter multiresolution analysis. x(t) is decomposed by the analysis filter coefficients h and g. The resultant approximation and detail coefficients are then inverse filtered by the synthesis filter coefficients ĥ and ĝ to arrive at the reconstructed x(t).
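The shift variance described above can be seen with a very small example. The sketch below assumes a one-level Haar DWT applied to non-overlapping sample pairs and a made-up eight-sample signal; the function names are hypothetical.

```python
import math

def haar_dwt(x):
    """One-level Haar DWT on non-overlapping pairs of samples."""
    s = 1 / math.sqrt(2)
    pairs = range(len(x) // 2)
    a = [s * (x[2 * m] + x[2 * m + 1]) for m in pairs]
    d = [s * (x[2 * m] - x[2 * m + 1]) for m in pairs]
    return a, d

def circular_shift(x, k):
    """Shift x to the right by k samples with periodic extension."""
    k %= len(x)
    return x[-k:] + x[:-k] if k else list(x)

x = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]     # made-up two-sample pulse
_, d_original = haar_dwt(x)
_, d_shifted = haar_dwt(circular_shift(x, 1))

energy_original = sum(c * c for c in d_original)   # pulse aligned with a pair
energy_shifted = sum(c * c for c in d_shifted)     # pulse straddles two pairs
```

In this example the detail coefficients of the original pulse are all zero, while a shift of a single sample moves energy into the detail band. The coefficients of the shifted signal are therefore not a shifted copy of the original coefficients, so a threshold applied to them would act differently on the two versions.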
Several decomposition methods have been suggested to suppress shift variance, including the
continuous wavelet transform, the dual tree complex wavelet transform (Kingsbury 2001), the
stationary wavelet transform (SWT) (Misiti, Misiti & Oppenheim 2006) and the over complete
discrete wavelet transform (Bradley & Wilson 2004). The continuous wavelet transform is
highly redundant, whereas the dual tree complex wavelet transform needs special basis
wavelets; neither was therefore suitable for implementing the denoising methods
evaluated in this thesis. The over complete discrete wavelet transform is a combination of
the conventional DWT and the SWT, in which the critically sampled DWT is applied to the first M
levels of an L level wavelet decomposition and the fully sampled SWT is then applied to
the remaining (L − M) levels (Bradley & Wilson 2004). Although computationally efficient,
this still produces a slight shift variance.
In this thesis, to evaluate denoising methods in an unbiased manner, the perfectly shift
invariant SWT was used in addition to the conventional DWT. In this way the decomposition
itself does not distort the latencies of the filtered ABRs, which is expected to lead to a
more accurate result.
2.3.6 Stationary wavelet transform
The stationary wavelet transform (SWT) follows a decomposition tree similar to that of the DWT.
The only difference is the elimination of dyadic down-sampling at each decomposition
level, which preserves all of the information of the signal at every level and thus avoids
aliasing.
The algorithm used for the calculation of the SWT possesses similar features to that of
the DWT. Level 1 decomposition of a given signal can be obtained by convolving it with
appropriate filters, as in the case of the DWT but without decimating. In this case the
approximation and detail coefficients at level 1, a1 and d1 both have a length N , which is
the length of the original signal.
In general, the approximation coefficients at level l are convolved with up-sampled
versions of the two usual filters to produce the approximation coefficients a_{l+1} and detail
coefficients d_{l+1} at level l + 1. The algorithm can be visualised as per the schematic
diagram presented in figure 2.10.
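A minimal sketch of this undecimated scheme is given below. It assumes periodic (circular) extension of the signal and the Haar filter pair for brevity; neither assumption is taken from the thesis, and the function names are hypothetical. At each level the filters are up-sampled by inserting zeros between their taps and no coefficients are discarded.

```python
import math

def upsample_filter(f, level):
    """Insert 2**level - 1 zeros between filter taps (level 0 = unchanged)."""
    step = 2 ** level
    out = [0.0] * ((len(f) - 1) * step + 1)
    out[::step] = f
    return out

def circ_conv(x, f):
    """Circular convolution, so every output has the length of x."""
    n = len(x)
    return [sum(fk * x[(i - k) % n] for k, fk in enumerate(f)) for i in range(n)]

def swt(x, h, g, levels):
    """Undecimated decomposition: returns (a_L, [d_1, ..., d_L])."""
    a, details = list(x), []
    for level in range(levels):
        h_l, g_l = upsample_filter(h, level), upsample_filter(g, level)
        details.append(circ_conv(a, g_l))
        a = circ_conv(a, h_l)
    return a, details

def circular_shift(x, k):
    k %= len(x)
    return x[-k:] + x[:-k] if k else list(x)

S = 1 / math.sqrt(2)
h, g = [S, S], [S, -S]                              # Haar pair (assumption)
x = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]        # made-up signal, N = 8

a2, (d1, d2) = swt(x, h, g, levels=2)
_, (d1_shifted, d2_shifted) = swt(circular_shift(x, 1), h, g, levels=2)
```

Every coefficient vector keeps the length N of the input, and the coefficients of the shifted signal are exactly the shifted coefficients of the original signal, which is the shift invariance that the DWT lacks.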
2.4 Summation of the ABR extraction methodologies
According to the information revealed in this chapter, the ABR is associated with key
pathologic conditions in the auditory central nervous system as well as non-pathologic
conditions such as drug administration and stimulation parameters. While hearing screen-
ing related applications of the ABR are well established, correlation with other pathologic
and non-pathologic conditions is a grey area thus imposing a barrier to use in practice.
The main drawback of the conventional ABR extraction method is its prolonged duration,
which makes it impossible to observe the short-term variations that could provide
critical information about internal brain structures. Therefore the need arises for a rapid
system which can accurately estimate the features of the ABR. Such systems could also
potentially be used in intraoperative monitoring systems and in long term patient
monitoring systems to record continuous readings with enhanced comfort to the patient.

Figure 2.10: Decomposition and synthesis tree of the SWT with a Biorthogonal basis wavelet. Dyadic down-sampling is eliminated here. To implement this, the filter coefficients are up-sampled (h and g) for the decomposition and down-sampled (ĥ and ĝ) for the synthesis of x(t).
The signal processing algorithm is the key feature of such a system. Therefore this
thesis mainly aims to:
• Investigate the denoising capacity of ARX and REPE with a well defined, reproducible
simulation study under variable SNRs, evaluate their ability to track
time scale variations in the latency of simulated EP components, and then apply
these methods to real ABRs.
• Compare this performance, in a simulation study, with that of the CTMC, TWMC and CSTD
wavelet methods, and then optimise these wavelet methods on real ABRs to evaluate
their performance in noise removal and time-scale variation tracking.
• Determine limitations and implications of using ARX modelling and specific wavelet
denoising methods in extracting the ABR.
2.4.1 Hypotheses
The large coefficients that are derived from the wavelet transformation (with a closely
compatible basis wavelet) of a noisy ABR are in general related to the underlying signal, and
the comparatively smaller coefficients are related to the noise. Imposing a threshold on these
coefficients retains the larger coefficients and discards the smaller coefficients. Therefore
the general hypothesis of wavelet denoising is that the coefficients that are discarded by
thresholding are related to spontaneous EEG noise, whereas the retained coefficients are
relevant to the ABR. The specific hypotheses related to the denoising methods are as follows.
• It is hypothesised that the use of a template in CTMC algorithm improves the
conventional thresholding by retaining wavelet coefficients of the noisy signal related
to the temporal locations of the template.
• In TWMC, it is hypothesised that specific ABR features occur at predetermined
time windows along the time frame of the response.
• In both CTMC and TWMC it is hypothesised that a fixed template accommodates
detection of time scale variations of the ABR.
• It is hypothesised that the template independent CSTD method will allow tracking of
latency variations without constraints.
It is hypothesised that shift invariant SWT yields better denoising results (in terms
of mean square error) and latency approximations compared to the use of DWT as the
decomposition algorithm.
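The general thresholding hypothesis stated at the start of this section can be illustrated with a generic hard threshold. This sketch is not an implementation of CTMC, TWMC or CSTD; the coefficient values and the threshold are made up for illustration.

```python
def hard_threshold(coefficients, threshold):
    """Keep coefficients whose magnitude reaches the threshold, zero the rest."""
    return [c if abs(c) >= threshold else 0.0 for c in coefficients]

# Hypothetical wavelet coefficients of a noisy response: a few large values
# (assumed to carry the signal) among many small ones (assumed to be noise).
coefficients = [0.1, -2.0, 0.3, 1.5, -0.2, 0.05]
retained = hard_threshold(coefficients, threshold=1.0)
print(retained)   # [0.0, -2.0, 0.0, 1.5, 0.0, 0.0]
```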
2.5 ABR data
2.5.1 Types of ABR data
In response to the drawbacks identified in sections 2.2 and 2.3, the extraction methods
evaluated in this thesis were first applied to simulated data; those results were then
justified, and their practical implications studied, using ABR data recorded from human
participants.
2.5.2 Simulated ABR data
As identified in section 2.2.5, a systematic evaluation has not been performed on either ARX
or wavelet methods. One of the major drawbacks of existing simulation studies in ARX
modelling is the derivation of the template from an ensemble average of real EP data (Lange
& Inbar 1996, Cerutti, Chiarenza, Liberati, Mascellani & Pavesi 1988). Simulation of
noise has been the major focus in wavelet studies, where some studies use a cosine wave for
performance comparison (Causevic et al. 2005). The contrast between the characteristics of
the non-stationary ABR and those of a stationary cosine wave warrants establishing a systematic
simulation basis for performance comparison. The ABR model introduced in this thesis
is intended to include the following characteristics.
• Similarity in morphology, including the clinically important waves I, III and V.
• Systematic variations in the amplitude and latency of individual waves.
• Spectral characteristics comparable to those of a real ABR.
An ABR model including these characteristics is presented in section 3.3 and both ARX
modelling and wavelet methods will be subjected to this model and its variations for an
initial feasibility study before subjecting them to real ABRs.
2.5.3 Real ABR data
Recording ABRs is non-trivial because their small amplitude makes them highly
susceptible to background noise, including ongoing EEG, noise of the recording setup and
ambient electromagnetic noise. The short time span (10 ms) of the ABR activity includes
high frequency components which could overlap the bandwidth of such noise. To avoid
contamination, a general set of parameters has been established for ABR data recording and
is listed in table 2.4 (Hall 1992, Van Campen, Sammeth, Hall 3rd & Peek 1992). These
parameters are important because they:
• optimise the ability to detect the ABR by suppressing substantial noise interference
• provide a standard recording environment.
Parameter                   Settings
Stimulus parameters
  Type                      Click
  Pulse width               0.1 ms
  Polarity                  a square pulse with a negative polarity
  Frequency                 > 20 Hz
  Intensity                 Variable in dB nHL
  No. of epochs             Variable to obtain an ABR with adequate SNR
  Mode                      Monaural
  Masking                   Only if the ABR is abnormal
Acquisition parameters
  Electrode montage
    Non-inverting           Cz or Fz
    Inverting               A or M (ipsilateral)
    Ground                  Fpz
  Filtering
    High-pass               30 Hz
    Low-pass                3 kHz
  Amplification             100000
  Sampling rate             40 kHz
  Analysis time             15 ms
  Pre-stimulus interval     10% of the analysis time

Table 2.4: Settings for a typical ABR recording (Hall 1992, Van Campen et al. 1992).
Also, myogenic artefacts can induce voltage fluctuations in the order of 100 µV which
result in saturation of the bio amplifier. Epochs containing such artefacts are removed with
artefact rejection methods before further processing with wavelets or any other extraction
method.
The most common stimulus types used in ABR recordings are the ‘click’ and the tone
burst (refer figure 2.11a). A ‘click’ stimulus is preferred in ABR recordings over tone
bursts due to the inherent broad frequency spectrum as shown in figure 2.11b. The high
frequencies affect the basal end of the cochlea and the low frequencies affect the apical
end of the cochlea. Thus ‘click’ stimuli are able to produce all features in the ABR and
help clinicians to assess the functionality of the auditory pathway of a patient (Misra &
Kakita 1999).
The stimulus frequency is typically set above 20 Hz to shorten the recording time, and is
usually set to an odd number with a fractional part so that the 50 Hz line frequency
components are not synchronised with the stimulus and therefore cancel during the moving
time averaging process. The number of epochs recorded per trial varies according to the noise
condition of the recording setup and the signal processing method.

Figure 2.11: Types of auditory stimulus and their frequency spectra. (a) Waveforms of a click stimulus and a tone burst. (b) Spectrum of the click stimulus and the tone burst. It is evident that the click stimulus has a broader frequency spectrum than the tone burst.
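The effect of the stimulus rate on line frequency interference can be checked numerically. The sketch below assumes an idealised, perfectly periodic stimulus and a pure 50 Hz sinusoid, and compares a hypothetical rate of 25 Hz (an exact sub-multiple of 50 Hz) with a rate of 21.1 Hz; the function name and the number of epochs are illustrative.

```python
import cmath
import math

def residual_line_amplitude(stim_rate_hz, n_epochs, line_hz=50.0):
    """Fraction of a line-frequency sinusoid that survives averaging.

    Epoch n starts at t = n / stim_rate_hz, so the interference enters each
    epoch with phase 2*pi*line_hz*n/stim_rate_hz. Averaging the epochs
    averages these phasors.
    """
    total = sum(cmath.exp(2j * math.pi * line_hz * n / stim_rate_hz)
                for n in range(n_epochs))
    return abs(total) / n_epochs

locked = residual_line_amplitude(25.0, 1024)     # 50 Hz is a multiple of 25 Hz
unlocked = residual_line_amplitude(21.1, 1024)   # fractional, unsynchronised rate
```

With the synchronised rate the interference has the same phase in every epoch and survives the average almost unchanged, whereas with the fractional rate the phases spread around the circle and the residual falls to well under 1% of the original amplitude after 1024 epochs.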
The number of sweeps required to arrive at an ABR depends upon the electrical and
physiological noise contributing to the final SNR of the MTA. Given perfect conditions,
viz. a quiet environment and a calm, normal hearing participant subjected to high stimulus
intensities, as few as 100-200 sweeps are sufficient, whereas 2000 or more sweeps
are required for a restless participant (e.g. an infant) with hearing impairment at low sound
intensities (Hall 2007). In clinical practice the number of sweeps is mostly determined by
calculating the Fsp value, and the averaging is terminated when Fsp reaches 3.1 (Elberling
& Don 1984). That is, when Fsp ≥ 3.1 the probability that the averaged waveform is only
noise is 1% (false positive). However, given imperfect recording conditions, the number of
sweeps in the MTA contributing to the ABR often reaches the order of thousands (Strauss et al.
2004, Wilson 2004, Wilson et al. 1998, Bradley & Wilson 2004, Shangkai & Loew 1986, Stuart
et al. 1996).
In a typical diagnostic ABR assessment, both ears are tested (one after the other) so
that interaural comparisons can be made. The non-test (contralateral) ear is masked
when the stimulus intensity in the test (ipsilateral) ear is high enough to cross
over via bone conduction and stimulate the cochlea of the non-test ear. The masking noise
is typically white noise, which masks the entire cochlea in the non-test ear.
Two standard electrode montages, based on the international 10-20 system, are used
for data acquisition purposes (refer figure 2.12). The position for the ground electrode is
at Fpz for both montages. The non-inverting electrode is placed either at Cz or at Fz
and the inverting electrode is placed either at the ipsilateral earlobe or mastoid. These
positions are chosen for their close proximity to the ABR generators: the eighth nerve is
close to the earlobe and mastoid, and the nucleus of the lateral lemniscus or inferior
colliculus is close to Cz and Fz.
Electrode montages for ABR recording include various possibilities; vertical (Cz-Nape
of neck), ipsilateral (Cz-Mi), contralateral (Cz-Mc), and horizontal (Mc-Mi) with the
ground electrode at Fz or Cz (these positions are based on the international 10-20 system).
The implications of using these montages are as follows.
• Vertical: prominent wave V, low waves I and III (Stuart et al. 1996)
• Horizontal: absence of wave I (Stuart et al. 1996)
• Contralateral: prominent wave II (Kato, Kimura, Shiraishi, Eura, Morizono & Soda
1995)
• Ipsilateral: prominent waves I, III and V (Stuart et al. 1996)
It is widely acknowledged that the ipsilateral electrode montage is entirely adequate
for most ABR applications in adults (Hall 2007) and in infants (Stuart et al. 1996). But in
cases where there is insufficient room to place the active electrode on Fz and the ground
electrode on Fpz, the ground electrode is often placed on the nape/shoulder.
Inter-electrode impedance is a critical parameter in recording an ABR to reduce the
noise and improve the quality of the ABR. Therefore the electrode site at the scalp should
be thoroughly prepared to achieve an impedance of less than 5 kΩ (Chiappa, Gladstone
& Young 1979).
Figure 2.12: Possible electrode montages for ABR recordings. F: Frontal, P: Parietal, C: Central, z: zero (midline).
Other acquisition parameters (according to the table 2.4) can be set at the amplifier
or in the recording software. The standard time window for an ABR is 15 ms with 10%
of it as the pre-stimulus interval (Hall 2007). The window of 10 ms after the onset of the
stimulus includes the important wave V approximately at the 6 ms time point and the
comparatively less significant wave VII approximately at 9 ms.
Electromagnetic interference in the recording environment should be minimised. Elec-
tronic and electrical recording equipment should be kept at a distance from the participant
(Moller 1987). To avoid extraneous auditory stimulation other than the click stimulus, the
recording should be carried out in an anechoic chamber (at minimum in a quiet room).
It is also advised to rest the head of the participant on a support (or to have the
participant lie in a supine position (Wilson et al. 1998)) and to have the participant close
their eyes while recording to avoid myogenic artefacts (Sokolov et al. 2005).
The stimulus artifact. The major source of external noise at the point of stimulation is
the audio transducer, which produces a ‘stimulus artefact’. The stimulus artefact
is generated when the electromagnetic field produced by the audio transducer interacts
with the electrodes placed on the scalp (Coats, Jenkins & Monroe 1984, Elberling &
Salomon 1973, Sokolov et al. 2005). The electrodes and the instrumentation, which are finely
tuned to pick up small ABRs, could easily record this noise, which can obscure the early
components of the ABR. The stimulus artefact is a common problem, and the usual methods
used to minimise its effect are (Hall 2007):
(i) Use of µ-metal shielded headphones to absorb the magnetic field.
(ii) Use of audio couplers to create a distance between the headphone and the scalp
electrodes.
(iii) Use of insert earphones (Etymotic ER-3A) to minimise the effect of the magnetic
field on the electrode by confining to the ear canal.
Insert earphones are associated with additional costs ($498 (Inc 2011)) and the use of
audio couplers and µ-metal shielded headphones increases the complexity of the recording
setup (Cooper & Parker 1981), introducing a delay to the ABR while the sound travels from
the headphone through the audio coupler to the tympanic membrane. An alternative is
to make use of cost effective and less complicated audio transducers that produce artifacts
with little or no impact on early components of the ABR, especially wave I.
The equipment, stimulation and acquisition parameters, and participants involved in data
collection are presented in section 3.1. These data were collected so that the effect of the
above mentioned drawbacks was minimised within the resources and time available
to conduct this research.
A separate study, peripheral to the main aim of the thesis, comparing
results from various types of audio transducers is included in Appendix C. It formulates
a methodology to compare results from transducers with different orientations and
strengths of magnetic fields, to determine their suitability for use in an ABR study as
substitutes for the more expensive and complicated transducer setups.
Chapter 3
Recording and constructing
synthetic ABR data
To obtain reliable results from an analysis of the extraction methods, the fidelity and
compatibility of the ABR data are critical. The identified parametric modelling and
wavelet methods were first analysed with simulated data and then applied to real recorded
ABR data to arrive at conclusions.
Recording of ABRs in practice is non-trivial. It needs careful preparation of the record-
ing setup and the participant. Such methods are discussed in detail in this chapter. Exact
recording parameters of ABR data used in evaluating parametric modelling and wavelets
are also presented.
The use of simulated data avoids the uncertainty of generating the ABR in practical
recordings and eliminates the influence of physiological and non-physiological artifacts. In
addition, simulated data generated with a defined mathematical model enable comparisons
with future research. This chapter describes the mathematical ABR model and the variations
of it used to generate datasets for evaluating ABR extraction methods.
3.1 Recording of ABR data
ABRs should be recorded with great care, as their small magnitudes
could easily be contaminated by potentially larger noise sources within the
subject and in the recording environment. Such sources include ongoing EEG, myogenic
artifacts, stimulus artifacts, noise of the recording setup and ambient electromagnetic
noise. Of these, high frequency noise in particular could contaminate the ABR because of its
short time span (≈10 ms). Therefore recording parameters and subject/participant factors were
optimised for maximum suppression of such noise sources. With the help of the published
parameters listed in table 2.4 and information gathered from the New Handbook of Auditory
Evoked Responses (Hall 2007), the following setup was utilised.
3.1.1 Equipment and parameters
The specific stimulus and acquisition parameters used to record ABR data for the
entirety of this thesis are listed in table 3.1. The auditory stimulus was a negative polarity
square pulse with a pulse width of 0.1 ms at a frequency of 21.1 Hz, delivered via a
Telephonic™ TDH-49 headphone. Three electrodes were utilised, located at the International
10-20 sites of Cz, Fpz and A1. Disposable, self-adhesive electrodes (3M™) were
used at Fpz and A1 (where clear access to the scalp was available). At Cz, a domed disk
electrode was used, injected with viscous conducting electrode paste to provide attachment
and electrical conductivity. All the electrodes had silver/silver chloride surfaces to achieve
comparable surface impedances. The scalp was prepared so that the inter-electrode
impedances were below 5 kΩ.
The recording setup consisted of a Dual Bio Amp/Stimulator, PowerLab amplifiers
and Chart-5 software produced by ADInstruments (Sydney, Australia). The Dual Bio
Amp/Stimulator was used as the main amplifier, which is accurate to ±1% of the gain of
100k in an amplification range of ±5 µV. The specified RMS noise is 1.3 µV within the
bandwidth of 0.1 Hz to 5 kHz. In the recording environment this value was measured
to be 1.6 µV from the amplifier noise recorded with the leads short-circuited (refer
Figure 3.10). Such a noise profile is characteristic of any amplifier affected by thermal and
Johnson noise. The PowerLab amplifier was used as the ADC with a resolution of 16
bits. Any unrelated equipment was turned off (to avoid unwanted electrical or magnetic
fields) and the subject was kept at a distance from the recording equipment to reduce any
interfering field.

Parameter                   Setting
Stimulus parameters
  Transducer                TDH-49p
  Type                      Click (square wave)
  Pulse width               0.1 ms
  Polarity                  Negative
  Frequency                 21.1 Hz
  Intensity                 10-75 dB nHL
  Repetitions               1024
  Mode                      Monaural
  Masking                   None
Acquisition parameters
  Electrode montage
    Non-inverting           Cz
    Inverting               A1
    Ground                  Fpz
  Electrode material        AgCl
  Inter-electrode impedance <5 kΩ
  Filtering
    High-pass               100 Hz
    Low-pass                3 kHz
  Amplification             100000
  Sampling rate             40 kHz
  Analysis time             10 ms
  Pre-stimulus interval     1 ms

Table 3.1: Finalised parameters for the data collection for the main study.
The data were sampled at a frequency of 40 kHz and an artifact rejection process was
performed within the window of interest, so that epochs with amplitudes exceeding a threshold
of 25 µV were excluded from further analysis. The retained epochs were then band-pass
filtered between 100-3000 Hz with a 3rd order Butterworth filter using a zero-phase
shifting method (Oppenheim & Schafer 1999). A low cut-off frequency of 100 Hz, as
opposed to 30 Hz (seen in table 2.4), was chosen in order to minimize the effect of noise
from ongoing EEG and myogenic artifacts (Corona-Strauss et al. 2010b, Rushaidin et al.
2009, Petoe, Bradley & Wilson 2010). The zero-phase shifting filter was specifically used
here to preserve the latencies of the ABR waves. The ABR is filtered in both the
forward and backward directions so that the phase shift created by filtering in one
direction is cancelled by filtering in the other. This operation doubles the filter order,
leading to additional computation but with the added advantage of retaining phase
characteristics. Participants were stimulated
with sound intensities ranging from 10-75 dB nHL at intervals of 5 dB. This range of
stimulation enabled construction of the L-I curve of the ABR waves to evaluate the
latency tracking performance of the extraction methods.
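The latency-preserving property of forward and backward filtering can be demonstrated with a much simpler filter than the one used here. The sketch below assumes a first-order (one-pole) low-pass filter instead of the 3rd order Butterworth band-pass filter, a made-up triangular pulse and hypothetical function names; it only illustrates the principle.

```python
def one_pole_lowpass(x, alpha):
    """Causal first-order low-pass: y[n] = alpha*x[n] + (1 - alpha)*y[n-1]."""
    y, previous = [], 0.0
    for value in x:
        previous = alpha * value + (1 - alpha) * previous
        y.append(previous)
    return y

def zero_phase(x, alpha):
    """Filter forwards, then filter the time-reversed result and reverse again."""
    forward = one_pole_lowpass(x, alpha)
    backward = one_pole_lowpass(forward[::-1], alpha)
    return backward[::-1]

def peak_index(x):
    return max(range(len(x)), key=lambda i: x[i])

# Made-up symmetric pulse with its peak at sample 50.
pulse = [0.0] * 101
for offset, value in zip(range(-4, 5), [1, 2, 3, 4, 5, 4, 3, 2, 1]):
    pulse[50 + offset] = float(value)

single_pass_peak = peak_index(one_pole_lowpass(pulse, 0.3))   # delayed peak
zero_phase_peak = peak_index(zero_phase(pulse, 0.3))          # peak stays put
```

A single pass delays the peak by one sample in this example, while the forward-backward version leaves it at sample 50, at the cost of applying the filter twice.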
Custom written scripts were used for offline analysis in MATLAB™, produced by
MathWorks (MATLAB 2008). These scripts are attached in Appendix D.
Stimulus artifact
As a result of the magnetic field generated in the audio transducer at the time of
stimulus delivery, an artifact is present when TDH-49 headphones are used. The observed
stimulus artifact at the onset of the auditory stimulation (t = 0 ms) is shown in figure 3.1.
The critical observation is the duration of the stimulus artifact, which appears to
terminate well before wave I, suggesting minimal or no effect of the stimulus artifact
on ABR wave features. Therefore, to exclude the stimulus artifact, the ABR
time window was truncated to 1-10 ms. This time window avoided false positives in the
artifact rejection process explained in section 3.1.1. The reader is referred to the more
detailed study of the stimulus artifact in Appendix C if further information is required.
In summary, it states that the average stimulus artifact end time is 0.54 ms and the average
latency of wave I is 1.62 ms (SD=0.12).

Figure 3.1: An ABR with a stimulus artifact at t=0 ms (De Silva & Schier 2009). It is evident that there is no effect on wave I due to the stimulus artifact.
3.1.2 Participant details
ABRs were recorded from 8 normal hearing participants (4 males and 4 females) in an
age range from 24 to 34 years (mean=26.7, SD=2.6). Participants were rested between
the stimulation at each intensity and were asked whether they were feeling comfortable, to
avoid variation in the ABR due to auditory fatigue. The Swinburne University Human
Experimentation Ethics Committee approved the data collection, and each participant gave
written informed consent in accordance with these requirements. The official ethics
approval details are attached in Appendix A. A visualization of the recording setup is shown
in figure 3.2. The photograph is included with the full consent of the participant.
3.1.3 MTA and statistically significant SNR
The ensemble of epochs required to arrive at an ABR depends upon the underlying noise
contributed by electrical and physiological activity. Given perfect conditions, viz. a quiet
environment, a calm and normal-hearing participant and high stimulus intensity, only a
small number of sweeps (as few as 100-200) is sufficient. This contrasts with the imperfect
conditions met in practice, such as a restless participant (e.g. an infant) with a hearing
impairment, or low stimulus intensities, which require of the order of thousands of sweeps
(Strauss et al. 2004, Shangkai & Loew 1986, Wilson 2004, Bradley & Wilson 2004, Stuart et al.
1996, Hall 2007). In clinical practice the size of such ensembles is determined by calculating
the single point F ratio (Fsp). The MTA process is terminated when the value of Fsp
reaches 3.1 (Elberling & Don 1984). That is, when Fsp ≥ 3.1 the probability
that the averaged waveform consists only of noise is 1% (false positive).

Figure 3.2: Recording setup with electrodes placed on the scalp and TDH-49 headphones worn by the participant. The photograph is included with the full consent of the participant.
Definition of Fsp
The Fsp is defined as the ratio between the variance of the averaged ABR and the average
variance of a single point across the ensemble of ABRs. The mathematical interpretation
is as follows:

$$F_{sp} = \frac{VAR(ABR_i)}{VAR(ABR_k)} \qquad (3.1)$$

Here, the variance of the averaged ABR is defined as

$$VAR(ABR_i) = \frac{\sum_{k=1}^{400} ABR_i^2(k)}{400},$$

where k indexes the sample points in the ABR derived by averaging i epochs ($ABR_i$).
The average variance of the single point is defined as

$$VAR(ABR_k) = \frac{\sum_{i=1}^{N} ABR_i^2(240)}{N},$$

such that the variance of the single point $ABR_k$ is determined at the single point
k = 240 (which corresponds to wave V).
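An Fsp-style statistic can be sketched as follows. This is a rough illustration rather than the calculation used in the thesis scripts: it assumes mean-removed (population) variances, and it assumes that the single-point variance across epochs is divided by the number of epochs to estimate the noise remaining in the average, as in the usual formulation attributed to Elberling & Don (1984). The simulated sinusoidal "response", the noise level and all names are hypothetical.

```python
import math
import random
import statistics

def fsp(epochs, single_point):
    """Variance of the averaged response over the estimated residual noise variance."""
    n_epochs = len(epochs)
    n_samples = len(epochs[0])
    average = [sum(e[k] for e in epochs) / n_epochs for k in range(n_samples)]
    var_average = statistics.pvariance(average)
    # Variance of one fixed sample across epochs, scaled to the averaged trace.
    var_point = statistics.pvariance([e[single_point] for e in epochs])
    return var_average / (var_point / n_epochs)

def simulate_epochs(n_epochs, n_samples, amplitude, seed=0):
    """Hypothetical data: a fixed sinusoid plus unit-variance Gaussian noise."""
    rng = random.Random(seed)
    return [[amplitude * math.sin(2 * math.pi * k / 50) + rng.gauss(0.0, 1.0)
             for k in range(n_samples)] for _ in range(n_epochs)]

# 400 samples per epoch, single point at the 240th sample (index 239).
fsp_response = fsp(simulate_epochs(128, 400, amplitude=1.0), single_point=239)
fsp_noise = fsp(simulate_epochs(128, 400, amplitude=0.0), single_point=239)
```

Under these assumptions the value for noise-only epochs stays close to 1, while the value for epochs containing a repeatable response grows with the number of epochs and comfortably exceeds the 3.1 threshold.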
To avoid outliers in the Fsp values caused by highly variable physiological noise (Ozdamar
& Delgado 1996), the linearity of Fsp against the number of averaged epochs
was used to approximate a linear trend with a robust linear regression method using
iteratively re-weighted least-squares (also known as the Welsch approximation) (Holland
& Welsch 1977). In other words, a statistically significant ABR was deemed to have been
achieved when the Welsch approximated curve reached the threshold of Fsp = 3.1.
Application of the Fsp method to the artifact rejected ensemble of epochs from all the
participants involved in this study revealed that, on average, 968 epochs (SD = 36) were
required. Given the block size limitations of CSTD, which allows only sizes of 2^x where
x ∈ Z+, the ensemble was set to the closest power of 2, namely 1024 epochs. Further
justification for the validity of this choice in all conditions pertaining to this study is
provided below. An example of an ABR where the Fsp value is not sufficient for a
statistically significant ABR is shown in figure 3.3. An extrapolation of the Welsch
approximation suggests that the threshold is reached at the i = 1238th epoch. However, a
large variation of the averaged ABR cannot be expected between ensembles of 1024 and
1238, as the correlation coefficient of progressive averages compared to the MTA of 1024 in
figure 3.3 suggests saturation after a MTA of approximately 500. This justifies the use of
an ABR at a MTA of 1024 epochs for the experimental setup of this thesis, and is closely
related to clinically acceptable standards. Therefore, all the grand averaged ABRs in this
thesis are derived from a MTA of 1024.
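The saturation argument based on the correlation coefficient of progressive averages can be illustrated with a short sketch. It assumes the epochs are held in an array of shape (epochs, samples) and uses the Pearson correlation between each running average and the final average; the function name is hypothetical.

```python
import numpy as np

def progressive_average_correlation(epochs):
    """Correlation of each running average with the final average."""
    epochs = np.asarray(epochs, dtype=float)
    counts = np.arange(1, epochs.shape[0] + 1)[:, None]
    # Row i holds the average of the first i + 1 epochs.
    running = np.cumsum(epochs, axis=0) / counts
    final = running[-1]
    return np.array([np.corrcoef(avg, final)[0, 1] for avg in running])
```

A curve that flattens close to 1 well before the last epoch indicates that further averaging changes the waveform only marginally.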
3.1.4 Data organisation
A fully featured grand averaged ABR template was generated using a MTA of 1024 epochs
recorded at 55 dB nHL. Out of 1024 epochs collected at each sound intensity level, ‘block
sizes’ of 256, 128, 64, 32, 16 and 8 epochs were tested for rapid extraction. A total of 769
ABRs were extracted for each block size at a single sound intensity level with a sliding
window of length corresponding to the block size. These blocks were processed with ARX
and wavelet methods to arrive at a rapid extraction system.
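The block extraction above can be sketched with a sliding window over the epoch ensemble. The step of one epoch is an assumption: for a block size of 256 it yields 1024 − 256 + 1 = 769 block averages, matching the count quoted above, but the step used for the other block sizes is not stated here.

```python
import numpy as np

def sliding_block_averages(epochs, block_size, step=1):
    """Average every window of block_size consecutive epochs."""
    epochs = np.asarray(epochs, dtype=float)
    n_epochs, n_samples = epochs.shape
    # Cumulative sum with a leading zero row, so that a window sum is the
    # difference of two rows.
    csum = np.cumsum(np.vstack([np.zeros((1, n_samples)), epochs]), axis=0)
    starts = np.arange(0, n_epochs - block_size + 1, step)
    return (csum[starts + block_size] - csum[starts]) / block_size
```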
3.1.5 The template
A fully featured reference ABR template is required to:
1. Provide a morphologically accurate template for ARX, REPE, CTMC and TWMC
methods.
2. Compare filtered/estimated ABRs from ARX and wavelet methods.
Figure 3.3: A worst-case scenario of the Fsp curve derived from the MTA of a series of ABR epochs (plotted curves: Fsp of the ABR, Welsch approximation, correlation coefficient and Fsp threshold). The Welsch approximated robust linear trend suggests 1238 epochs are required to reach a statistically significant SNR. However, the correlation coefficient suggests minimal variations occur between a MTA of 1024 and 1238. Therefore, all the grand averaged ABRs in this thesis are derived from a MTA of 1024.
A moving time average of 1024 sweeps of ABR recorded at 55 dB nHL (shown in figure 3.4)
was used as the template throughout this thesis because the important features, waves I,
III and V, are easily identified in this ABR. In addition, the less significant waves II and
VI can also be identified. The absence of wave IV, which normally appears before wave V,
could be due to the associated noise or a characteristic of the participant (Kjaer 1980). A
sound intensity level of 55 dB nHL is used here because:
1. It is greater than threshold, which eliminates non-responses due to threshold variation.
2. It is in the mid range of the L-I curve, which provides a balance when evaluating
tracking latency.
3. Stimulating at this sound intensity level was more comfortable for the participants
compared to stimulating at higher levels.
Figure 3.4: The reference ABR template used in wavelet denoising methods. Generated with a MTA of 1024 sweeps at 55 dB nHL showing main wave features I, III and V and additional features wave II and VI.
3.2 Latency-intensity and amplitude-intensity curves
Inducing systematic variations of ABR features is critical for the outcome of this thesis.
Such variations can be artificially induced by varying the stimulus intensity, thereby
producing latency-intensity and amplitude-intensity curves (Vannier et al. 2001).
Variations produced in these controlled environments are ideal for the validation of
algorithms due to the predictable nature of the outcome.
The curves plotted in figure 3.5 represent the average of 8 participants, with each ABR
at a given intensity derived by a MTA of 1024 epochs. The peaks of waves I, III and V
were then manually determined by an independent observer with around 25 years of
experience in evaluating EEG and EP signals. The method was to visually inspect the
waveforms for evidence of peaks at the approximate latency, from which amplitude and
latency were calculated.
The curves clearly indicate the characteristic reduction in latency and increase in
amplitude with the increase in stimulus intensity. The critical observation, however, is
the consistency and the reduced variability of the latency-intensity (L-I) curve compared
to the amplitude-intensity curve, which supports the use of the L-I curve to verify the
variation tracking ability of the algorithms developed in Chapters 4 and 5.
Figure 3.5: Latency and amplitude intensity curves derived from recorded data. (a) Latency intensity curves (b) Amplitude intensity curves. These curves were derived using ABRs generated from 8 participants with the parameters given in table 3.1. Error bars represent standard error among participants.
3.2.1 Compatibility of the L-I curve model
The validity of the L-I curves was tested by comparing the L-I curve of the most prominent
wave, wave V, with the theoretical model in (3.2) reported by Picton et al. (1981), where L
is the wave V latency in ms and I is the stimulus intensity in dB. The deviation of this
curve is usually about 0.2 ms at 70 dB and 0.3 ms at 30 dB.
$$\log_{10}(L) = -0.0025I + 0.924 \tag{3.2}$$
Figure 3.6 illustrates the comparative plots of experimental and theoretical wave V L-I
curves. It is apparent that the derived curve closely follows the theoretical model, and
is well within one standard deviation confidence limits. This therefore validates the
experimental data recorded in this study. This L-I curve of wave V was then considered
an acceptable benchmark for later comparisons of the ARX and wavelet estimated L-I
curves.
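For reference, the theoretical model (3.2) can be evaluated directly. The short sketch below computes the predicted wave V latency over the 10-75 dB range in 5 dB steps; the function name is hypothetical.

```python
import numpy as np

def wave_v_latency_ms(intensity_db):
    """Wave V latency predicted by log10(L) = -0.0025 * I + 0.924."""
    return 10.0 ** (-0.0025 * np.asarray(intensity_db, dtype=float) + 0.924)

intensities = np.arange(10, 80, 5)          # 10, 15, ..., 75 dB
latencies = wave_v_latency_ms(intensities)  # decreases with intensity
```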
Figure 3.6: The theoretical and the derived L-I curve of wave V. The curve derived by the grand average of experimental data follows the theoretical curve and lies well within the theoretical range.
3.3 Synthetic ABR model
The absence of a mathematical model for the ABR and systematic simulation studies
(refer sections 2.2.5, 2.5.2) in the literature gave an impetus to construct an ABR model
and conduct simulation studies. As opposed to using recorded ABR, the presence of the
exact deterministic ABR in simulated data improves the reliability of derived results. The
study conducted for this thesis was intended to achieve the following:
• A well-defined simulation study that is reproducible.
• A comparison of the performance of ARX model based and wavelet based extraction
methods.
In general, such a model of the ABR will enable researchers to validate and compare novel
extraction methods based on a universal benchmark.
3.3.1 Construction of the ABR model
The introduced simulated ABR model has three prominent features with similar
morphological characteristics to the ABR waves I, III and V, having approximately similar
latencies and amplitudes. These features of the ABR are the most dominant and are
clinically significant compared to other waves (II, IV, VI and VII). The synthetic ABR
u(k) is expressed in (3.3).
$$u(k) = a_I\,\mathrm{sinc}[0.13\pi(4k - 8 + l_I)] + a_{III}\,\mathrm{sinc}[0.13\pi(4k - 16 + l_{III})] + a_V\,\mathrm{sinc}[0.13\pi(4k - 24 + l_V)] \tag{3.3}$$
Here, 0 ≤ k ≤ 10 ms with 400 data points to represent a typical ABR recorded at a
sampling frequency of 40 kHz. The three sinc functions represent ABR waves I, III and
V. The terms $l_I$, $l_{III}$, $l_V$ define the latencies of waves I, III and V respectively,
which are 2, 4 and 6 ms when $l_I = l_{III} = l_V = 0$. The terms $a_I$, $a_{III}$, $a_V$
define the amplitudes of waves I, III and V respectively and are set to $a_I = 0.25$,
$a_{III} = 0.5$, $a_V = 1$ to mimic morphological characteristics. The synthetic ABR model
with an unperturbed latency ($l_I = l_{III} = l_V = 0$) is shown in figure 3.7a and the
comparable recorded real ABR (at 55 dB nHL) is shown in figure 3.7b.
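A minimal sketch of (3.3) in Python follows. It assumes the unnormalised sinc, sin(x)/x, and a time axis of 400 samples at 40 kHz expressed in ms; both the function name and the choice of sinc convention are assumptions, since the convention is not stated in the text above.

```python
import numpy as np

def synthetic_abr(k_ms, latencies=(0.0, 0.0, 0.0), amplitudes=(0.25, 0.5, 1.0)):
    """Sum of three sinc pulses representing waves I, III and V, as in (3.3)."""
    k_ms = np.asarray(k_ms, dtype=float)
    u = np.zeros_like(k_ms)
    for amplitude, centre, l in zip(amplitudes, (8.0, 16.0, 24.0), latencies):
        x = 0.13 * np.pi * (4.0 * k_ms - centre + l)
        # np.sinc(v) is sin(pi v) / (pi v), so divide by pi to get sin(x) / x.
        u += amplitude * np.sinc(x / np.pi)
    return u

k_ms = np.arange(400) / 40.0   # 400 samples at 40 kHz, in ms
u = synthetic_abr(k_ms)
```

Because the latency terms enter the argument as 4k − c + l, a positive l moves the corresponding pulse earlier by l/4 ms.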
The ABR model formulated in (3.3) possesses the following favourable characteristics:
• Morphological similarities with regards to wave I, III and V.
• Ability to derive systematic variations of amplitude and latency of individual waves.
• Comparable spectral characteristics to that of a real ABR (as detailed in figure 3.8).
Both the ARX modelling and wavelet methods will be subjected to datasets derived from
this model for an initial feasibility study before applying them to real ABRs.
Figure 3.7: Synthetic and real ABR templates. (a) Ideal synthetic reference signal u(t) with no latency variation (l = 0) (b) Actual ABR recorded at 55 dB nHL. These possess comparable features in terms of latency and amplitude for ABR waves I, III and V.
3.3.2 Construction of synthetic datasets
To assess the full functionality of the ARX and wavelet methods, appropriate datasets were
created using the ABR defined in (3.3). Two types of datasets were created from u(k) to
specifically analyse: 1) Denoising performance and 2) Variation tracking performance.
(i) With no latency variations to evaluate denoising capacity. This dataset was
created with l = 0 using (3.3) to assess the improvement in SNR for ARX and
wavelet methods. The simulated dataset is shown in figure 3.9a as a surface plot. It
consists of 60 s of recording assuming a stimulus frequency of 20 Hz.
Figure 3.8: Comparison of the spectra of the ABR model and the real ABR template derived from a MTA of 1024 epochs. The characteristic frequencies at 100, 500 and 900 Hz are evident in the ABR model spectrum, suggesting spectrum compatibility.
(ii) With periodic (modulated) latency variations to evaluate latency tracking. 12
datasets, each representing 60 s of recording, were created with combinations of aL = 1,
1.5, 2 ms and fL = 0.025, 0.05, 0.1, 1 Hz. Three of these datasets are visualised in
figures 3.9b, 3.9c and 3.9d with [aL, fL] = [1, 1], [1.5, 0.05] and [2, 0.025] respectively.
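The two dataset types can be sketched as follows. Several details are assumptions made for illustration only: the latency is modulated sinusoidally from sweep to sweep, the same shift is applied to waves I, III and V, and a delay of d ms is mapped to l = −4d in (3.3). With 60 s of recording at 20 Hz, each dataset holds 1200 sweeps.

```python
import numpy as np

def synthetic_abr(k_ms, latency_shift_ms=0.0, amplitudes=(0.25, 0.5, 1.0)):
    """Equation (3.3) with one common delay applied to all three waves."""
    # In (3.3) the latency term enters as 4k - c + l, so a delay of
    # latency_shift_ms corresponds to l = -4 * latency_shift_ms.
    l = -4.0 * latency_shift_ms
    u = np.zeros_like(k_ms, dtype=float)
    for amplitude, centre in zip(amplitudes, (8.0, 16.0, 24.0)):
        x = 0.13 * np.pi * (4.0 * k_ms - centre + l)
        u += amplitude * np.sinc(x / np.pi)  # unnormalised sinc (assumed)
    return u

def latency_modulated_dataset(a_l_ms, f_l_hz, duration_s=60.0, stimulus_rate_hz=20.0):
    """Sweeps whose latency follows a_l * sin(2 pi f_l t) (assumed modulation)."""
    k_ms = np.arange(400) / 40.0
    n_sweeps = int(round(duration_s * stimulus_rate_hz))
    sweep_times_s = np.arange(n_sweeps) / stimulus_rate_hz
    shifts_ms = a_l_ms * np.sin(2.0 * np.pi * f_l_hz * sweep_times_s)
    sweeps = np.array([synthetic_abr(k_ms, shift) for shift in shifts_ms])
    return sweeps, shifts_ms
```

Calling `latency_modulated_dataset(0.0, 1.0)` gives the constant latency dataset of type (i); the 12 datasets of type (ii) follow from the listed combinations of aL and fL.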
Figure 3.9: Types of datasets used in the simulation study. (a) Dataset with constant latency to assess the denoising capability (b) Dataset with a maximum latency variation of 1 ms at a frequency of 1 Hz (c) Dataset with a maximum latency variation of 1.5 ms at a frequency of 0.05 Hz (d) Dataset with a maximum latency variation of 2 ms at a frequency of 0.025 Hz.
3.3.3 Adding noise to simulated datasets
In the conventional scalp recorded EP model (refer (2.2)), noise n(k) is assumed to
represent Gaussian white noise within the bandwidth of interest (100-3000 Hz). Therefore,
repeated averaging of a large number of sweeps removes such random noise and retains
the deterministic ABR. However, it is important to characterise the actual noise in the
ABR recording and add noise of similar characteristics to the synthetic datasets.
In reality, the spectrum of n(k) is dominated by ongoing EEG. While the EEG spectrum is
skewed in conditions other than awake (Hauri, Orr & Company 1982), within the
framework of this thesis and in clinical studies pertaining to the ABR, EEG is
predominantly recorded while the participant is awake. Therefore it is reasonable to
assume that the spectrum of ongoing EEG is less contaminated with large amplitude, low
frequency components, thereby reducing the skewness.
The EEG spectral power while awake lies below 100 Hz (gamma 25-100 Hz) and is therefore
extraneous to the ABR bandwidth (100-3000 Hz). The equipment noise (mainly from
the recording equipment used for the data collection for the thesis) was measured to be
white, as shown in figure 3.10, confirming the conventional assumption. Also, the explicit
band-pass filtering between 100-3000 Hz of the scalp recorded EEG+ABR used in this thesis
eliminates the contamination from low frequency artefacts from delta to gamma or any
myogenic or ocular artefacts (Hall 2007, Cerutti et al. 1988). As is evident, the EEG
spectrum is smooth and contains no identifiable spectral peaks. Any deviation from this
could be due to the filter rather than characteristics of the signal. Simulated data in
the study assumes a flat spectrum, revealing the upper bound effects of ARX modelling.
Deviations from such a flat spectrum would yield worse results.
Figure 3.10: Spectra of EEG, EEG+ABR and equipment noise compared to Gaussian white noise suggest that equipment noise could be approximated by Gaussian white noise.
Based on these arguments, Gaussian white noise of the necessary power was added to the
synthetic ABR datasets for the simulation studies involving ARX and wavelet methods.
Adding coloured noise could be conducted as a separate study to suit patient conditions
other than awake or with other artefacts, such as myogenic and ocular potentials.
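Adding Gaussian white noise at a prescribed SNR can be sketched as below. The SNR is taken as 10 log10 of the ratio of mean signal power to mean noise power, which is an assumed definition; the function name is hypothetical.

```python
import numpy as np

def add_white_noise(clean, snr_db, rng=None):
    """Return (noisy, noise) with Gaussian white noise at the requested SNR."""
    rng = np.random.default_rng() if rng is None else rng
    clean = np.asarray(clean, dtype=float)
    signal_power = np.mean(clean ** 2)
    # SNR (dB) = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=clean.shape)
    return clean + noise, noise
```

The realised SNR of a finite record fluctuates around the target, more noticeably for short records such as a single 400 sample sweep.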
Chapter 4
Effectiveness of ARX modelling in
rapid extraction of the ABR
This chapter presents a performance evaluation of two parametric modelling methods in
relation to rapid extraction of ABRs: the autoregressive model with an exogenous input
(ARX) and its extension, the robust evoked potential estimator (REPE). According to the
review in section 2.2, ARX modelling is a frequently used single sweep extraction method
for middle and late EP components. As an extension, this chapter looks into the feasibility
of using ARX modelling to extract the early ABR. Initially, a simulation study was
performed for better evaluation of the ARX methodology applied to the new signal domain.
Recorded real ABRs were then used to verify the result of the simulation study.
It was found that ARX and REPE methods of rapid extraction were not suitable to
denoise the ABR due to its low SNR compared to that of middle and late EPs.
Performance of variation tracking revealed the limited scope of ARX based extraction
methods in ABRs.
4.1 Introduction to the simulation study
Analysing the performance of the ARX and REPE methods in extracting single sweeps
of simulated EPs is of critical importance before these methods are used in a new domain
of physiological signals. Also, the following shortcomings identified in previous studies
prompted systematic adaptation of ARX algorithms (these points are discussed in detail
in section 2.2.5).
• Reproducibility of the reference signal (grand average)
• Ambiguity of the selection criteria of model parameters
• Limited variations induced in the EP
The ARX model in the z-domain is represented as S(z) = [B(z)/A(z)]U(z) + [1/A(z)]E(z),
where S(z) is the single/limited averaged sweep, U(z) is the reference/template signal,
E(z) is ongoing EEG noise assumed to be white, and A(z) and B(z) are the transformed
AR and MA filter coefficients. After generating the filter coefficients with a batch
least squares method, the estimated single/limited averaged sweep is derived with
Ŝ(z) = [B(z)/A(z)]U(z). Based on the fact that excitation of the model using a signal
with a wide bandwidth improves its estimation, REPE pre-whitens the input to the basic
ARX model. REPE is defined in the z-domain as S(z) = [B(z)C(z)/A(z)]U(z) + [1/A(z)]E(z),
where C(z) represents the transformed coefficients of the pre-whitening filter. The REPE
estimated single/limited averaged sweep is derived with Ŝ(z) = [B(z)C(z)/A(z)]U(z). The
reader is referred to sections 2.2.2 and 2.2.4 for further information on the derivation
of the ARX and REPE methods.
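The batch least squares step for the ARX model can be sketched as follows. The difference equation convention used here, s(k) = −Σ a_i s(k−i) + Σ b_j u(k−d−j) with j = 0..q, is an assumption about how the orders p and q are counted, and the function names are hypothetical. The filter is written out explicitly so that only NumPy is needed.

```python
import numpy as np

def arx_fit(s, u, p, q, d=0):
    """Least squares estimate of A(z) (order p) and B(z) (order q) from s and u."""
    s = np.asarray(s, dtype=float)
    u = np.asarray(u, dtype=float)
    start = max(p, q + d)
    rows = []
    for k in range(start, len(s)):
        past_outputs = [-s[k - i] for i in range(1, p + 1)]
        inputs = [u[k - d - j] for j in range(q + 1)]
        rows.append(past_outputs + inputs)
    theta = np.linalg.lstsq(np.array(rows), s[start:], rcond=None)[0]
    a = np.concatenate([[1.0], theta[:p]])
    b = theta[p:]
    return a, b

def iir_filter(b, a, x):
    """Direct form filter: a[0] y(k) = sum_j b[j] x(k-j) - sum_i a[i] y(k-i)."""
    y = np.zeros(len(x))
    for k in range(len(x)):
        acc = sum(b[j] * x[k - j] for j in range(len(b)) if k - j >= 0)
        acc -= sum(a[i] * y[k - i] for i in range(1, len(a)) if k - i >= 0)
        y[k] = acc / a[0]
    return y

def arx_estimate(s, u, p, q, d=0):
    """Estimated sweep: the template u filtered by the identified B(z)/A(z)."""
    a, b = arx_fit(s, u, p, q, d)
    b_delayed = np.concatenate([np.zeros(d), b])
    return iir_filter(b_delayed, a, u)
```

Here `s` plays the role of the noisy single or limited averaged sweep and `u` the template; the noise term [1/A(z)]E(z) is what the least squares residual absorbs.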
4.2 Methods
4.2.1 Simulation study domain and extrapolation
The absence of a well-defined simulation study in the literature gave an impetus to conduct
such a study. In summary, ARX modelling based extraction methods were used to extract a
synthesised ABR embedded in noise. The systematic simulation study described in this
thesis aims to achieve the following:
• A well defined simulation study which is reproducible
• A comparison of the performance of the conventional ARX model and its extension
REPE, adjusting a few parameters of their original studies to prevent ambiguity
• An investigation of the noise reduction ability of the two modelling methods
• An investigation of the time-scale variation tracking ability of the two modelling
methods
The study presented in this chapter is limited to extracting a single sweep si(k) using
a template si(k) derived from a MTA of 100 sweeps of simulated ABRs. A range of
SNRs has been used to evaluate the resistance of these methods to noise. In this way,
it is possible to determine the feasibility of single sweep extraction of an ABR. However,
modifying this method to extract an averaged ensemble of sweeps instead of a single sweep
is trivial, and could be performed as a further study if required.
With constant epoch ensembles for the template/reference u(k) and output y(k) with
predefined SNRs, this study was expected to determine the range of amplitude (aL) and
frequency (fL) of latency variations that can be tracked using ARX and REPE methods.
The use of known variations in simulated data makes the adaptation of these methods to
real ABR applications straightforward. The reader is referred to section 3.3 for detailed
information on these datasets.
4.2.2 Simulated reference ABR and datasets
The mathematically modeled reference ABR used in this study (refer (3.3)) consisted of
three clinically significant features with similar morphological characteristics to the real
ABR waves I, III and V having approximately the same latencies and amplitudes (shown
in figure 3.7a).
The datasets derived from this ABR aim at: 1) performance evaluation of denoising and
2) performance evaluation of time-scale (latency) variation tracking. In summary, one
dataset was created with a constant latency (l = 0) to evaluate the improvement in SNR
for ARX and REPE methods, and another 12 datasets were created with systematic
latency shift combinations of aL = 1, 1.5, 2 ms and fL = 0.025, 0.05, 0.1, 1 Hz to evaluate
the latency tracking of ARX and REPE methods. They consist of 60 s of recordings
assuming a stimulus frequency of 20 Hz.
The reader is referred to section 3.3.2 for a detailed description of these datasets.
4.2.3 Acquisition of real ABR data
To expand the analysis and confirm the outcome of the simulation study, a similar
methodology was applied to real ABR data from a participant. The stimulus and acquisition
parameters in table 3.1 were used on a healthy female participant of age 24 for ABR
data acquisition. In order to construct the L-I curve, the participant was stimulated with
intensities ranging from 10-75 dB nHL at intervals of 5 dB. The reader is referred to
section 3.1 for a comprehensive description of these stimulus and acquisition parameters.
The model output y(k) was tested with a single sweep and with MTAs of block sizes 32, 128
and 256. To reduce the complexity of the analysis, the exogenous input to the model u(k)
was fixed to a fully featured ABR template generated using a MTA of 1024 epochs recorded
at 55 dB nHL (refer to section 3.1.5 for more details).
4.2.4 Predetermined models
To produce a realistic set of simulated EEG data y(k) (which includes the ABRs and
the noise associated with ongoing EEG), predetermined filter models with physiologically
plausible responses were constructed. These models are described in the following two
sections.
ARX(p, q, d) model
Arbitrary model orders were sufficient for the simulation study provided that the ABR
u(k) retains its features after the filtering process. Since an ARX model has not been
constructed for the ABR before, similar applications related to middle and late evoked
potentials prompted us to use the model parameters derived in Cerutti et al. (1987).
Accordingly, the predefined model orders chosen for this simulation study were p = 6 and
q = 7, i.e. ARX(6,7,0), while the delay d was set to zero because the input and output
data were applied at the same time. The coefficients of this predefined ARX model were
set to form a low pass filter which preserved the morphology of the reference signal u(k),
ensuring the additive noise is meaningfully incorporated into the model for the purposes
of extraction. The transfer function satisfying these conditions is expressed in (4.1).
Figures 4.1a, 4.1b and 4.1c show the pole-zero plot, the magnitude response and the
impulse response respectively, depicting the stability of the transfer function. The ABR
s(k) derived by filtering the synthetic reference signal u(k) with the transfer function
in (4.1) is shown in figure 4.1d, illustrating that the major peaks present in s(k) are
similar to those in u(k). The power of s(k) was maintained at the same level as u(k) to
normalise any gains associated with the predefined transfer function. The amount of
filtered noise n(k) was maintained such that the initial SNR between s(k) and n(k) is
-10 dB (see footnote 1).
$$\frac{B(z)}{A(z)} = \frac{z^{-1} - 3.3z^{-2} + 4.4z^{-3} - 2.2z^{-4} + 2.7z^{-5} - 2.5z^{-7} + z^{-8}}{1 - 2.9z^{-1} + 3.5z^{-2} - 2.4z^{-3} + z^{-4} - 0.3z^{-5} + 0.1z^{-6}} \tag{4.1}$$
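Pole and zero locations of a transfer function written in powers of z^-1, such as (4.1), can be inspected with a short sketch like the one below. The helper names are hypothetical. The coefficients are those printed in (4.1), which are given to one decimal place, so the computed pole radii need not reproduce figure 4.1a exactly.

```python
import numpy as np

def poles_and_zeros(b, a):
    """Roots of B(z) and A(z) given as coefficients of z^0, z^-1, z^-2, ..."""
    # Leading zeros in b are pure delays and contribute no finite zeros.
    zeros = np.roots(np.trim_zeros(np.asarray(b, dtype=float), "f"))
    poles = np.roots(np.asarray(a, dtype=float))
    return poles, zeros

def is_stable(poles):
    """True when every pole lies strictly inside the unit circle."""
    return bool(np.all(np.abs(poles) < 1.0))

# Coefficients as printed in (4.1); no z^-6 term appears in B(z) there.
b_41 = [0.0, 1.0, -3.3, 4.4, -2.2, 2.7, 0.0, -2.5, 1.0]
a_41 = [1.0, -2.9, 3.5, -2.4, 1.0, -0.3, 0.1]
poles_41, zeros_41 = poles_and_zeros(b_41, a_41)
largest_pole_radius = np.max(np.abs(poles_41))
```

Comparing `largest_pole_radius` with 1 gives a quick numerical check on the stability shown graphically in the pole-zero plot.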
REPE(p, q, r, d)
To improve the resistance to noise of the ARX model, the improved REPE is used to
estimate the ABR. Model orders were set to p = 6, q = 7, r = 8 and d = 0, i.e.
REPE(6,7,8,0), considering the previous ARX(6,7,0) model and the work of Lange & Inbar
(1996). The choice of these model orders is justified for the same reason as in the ARX
model: even though arbitrary model orders are sufficient for the simulation study, the
use of orders derived from a physiological signal improves the validity. A pre-whitened
template was derived by adding noise to the template with an appropriate inverse filter
C(z) before subjecting it to the ARX process. The underlying reason for the pre-whitening
process is to improve the excitation of the model, which is then able to generate a more
accurate set of coefficients. The pre-whitening was performed using an autoregressive
model of order AR(8) (Lange & Inbar 1996) with the transfer function expressed in (2.8),
the output of which was then used as the exogenous input to the ARX model of orders
ARX(6,7,0). The transfer function B(z)/A(z) in REPE was set according to (4.2) to obtain
comparable results for both REPE and ARX.
$$C(z) = 1 - 0.5z^{-1} - 0.4z^{-2} - 0.2z^{-3} - 0.1z^{-4} - 0.2z^{-5} + 0.1z^{-6} + 0.2z^{-7} + 0.2z^{-8} \tag{4.2}$$
Figure 4.1: Characteristics of the transfer function of the ARX model. (a) Pole-zero plot of the transfer function (b) Magnitude plot (c) Impulse response of the transfer function (d) Filtered s(k) and reference u(k) showing preserved features in s(k).
1 This section of the thesis concentrates on the degree of accuracy of the ARX model in identifying predefined model parameters; the expected outcome is therefore a set of poles and zeros in the vicinity of the predefined ones. Use of a low SNR, e.g. -30 dB, would produce dispersed poles and zeros from which the model performance could not be judged. Based on this fact, use of -30 dB will not yield useful information, whereas -10 dB of noise can differentiate the performance. The accuracy of the models at -30 dB is indirectly measured in figure 4.7.
Figures 4.2a, 4.2b and 4.2c show the pole-zero plot, the magnitude response and the
impulse response respectively, depicting the properties of the AR(8) model. The effect of
the pre-whitening process is evident in figure 4.2d, which shows power spectral density
estimates of the template u(k) and its pre-whitened version w(k). The power of w(k) was
maintained at the same level as u(k) to normalise any gains associated with the
predefined transfer function. The flat band for w(k) results in an even distribution of
frequency components, providing a better excitation of the ARX model (Lange & Inbar 1996).
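The pre-whitening idea can be sketched as fitting an AR model to a signal and applying the corresponding inverse (FIR) filter C(z), followed by the power normalisation described above. This is an illustrative least squares fit rather than the procedure of (2.8), which is not reproduced here; the function names are hypothetical.

```python
import numpy as np

def fit_ar(x, order):
    """Least squares AR fit: x(k) = -sum_i c_i x(k-i) + e(k); returns [1, c_1, ...]."""
    x = np.asarray(x, dtype=float)
    # Column i holds -x(k-i) for k = order .. len(x) - 1.
    past = np.column_stack([-x[order - i:len(x) - i] for i in range(1, order + 1)])
    coeffs = np.linalg.lstsq(past, x[order:], rcond=None)[0]
    return np.concatenate([[1.0], coeffs])

def prewhiten(x, order=8):
    """Apply the inverse filter C(z) of the fitted AR model and match the power."""
    x = np.asarray(x, dtype=float)
    c = fit_ar(x, order)
    w = np.convolve(x, c)[:len(x)]            # FIR filtering with C(z)
    w *= np.sqrt(np.mean(x ** 2) / np.mean(w ** 2))
    return w, c
```

The flatter spectrum of `w` compared with `x` is what provides the broader excitation of the ARX model.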
Figure 4.2: Characteristics of the transfer function of the REPE. (a) Pole-zero plot (b) Magnitude plot (c) Impulse response of the transfer function 1/C(z) (d) Effect of pre-whitening the template u(k), resulting in a flat spectrum for w(k).
4.3 Results
4.3.1 The efficacy of identifying the predefined models
Quantifying the performance of ARX and REPE methods in extracting single sweeps of
simulated EPs with controlled noise addition is of critical importance before these methods
are applied in real world applications where access to noise free EPs is not possible. On
this basis, we compared the poles and zeros of the estimated model with the predefined
counterparts. With the assumption of similarity between the predefined and the estimated
models, predefined model orders were set to ARX(6,7,0) and REPE(6,7,8,0). The dataset
with a constant latency was then used for this test as illustrated in figure 3.9a.
ARX model
Figure 4.3 shows the plots of poles and zeros of the predefined and estimated models.
These models were obtained from a dataset with an initial SNR of -10 dB subject to
(2.4). Coefficients of the model were derived using a Batch Least Squares algorithm which
minimises the quadratic error function between the estimated and empirical ABR (Cerutti
et al. 1987). For each sweep in the dataset (1200 sweeps), an ARX model was created
and the resulting poles and zeros are shown as clouds. The plots suggest that the poles
of the estimated model (figure 4.3a) have approached the predefined poles, but the
locations of the estimated zeros (figure 4.3b) have a noticeable offset.
REPE model
The estimated model parameters for the pre-whitening process shown in figure 4.4a
suggest an accurate estimation of the predefined AR(8) process. The estimated ARX
process in REPE produces similar characteristics to that of the pure ARX model.
Figure 4.4b suggests an accurate estimation of poles; in contrast, figure 4.4c suggests
that zeros have been estimated with a systematic offset. While poles have converged to
certain positions (compared to the positioning of the zeros of the ARX model), these do
not necessarily correspond to the predefined values.
1 This section concentrates on the degree of accuracy of the ARX model in identifying predefined model parameters. The expected outcome is a set of estimated poles and zeros in the vicinity of the relevant predefined poles and zeros. Use of a low SNR, e.g. -30 dB, would produce dispersed poles and zeros from which the model performance could not be assessed. Given this fact, use of such low SNRs will not yield useful information, whereas an SNR of -10 dB enables the model performance to be assessed.
Figure 4.3: Estimated Pole (x) and Zero (o) plots of the ARX model. (a) Pole plot of predefined and estimated models (b) Zero plot of predefined and estimated models. Estimated poles have converged to predefined values but not zeros.
4.3.2 Estimation of model orders
A range of model orders from 1 to 10 for p, q and r, with a zero delay for d, was tested
on each sweep of the test dataset (with a constant latency; refer to figure 3.9a) using the
final prediction error (FPE) (2.5) to find the optimum model order. The consistent
asymptotic behaviour of FPE at each model led us to set a criterion to automatically
extract the optimum model order as the first local minimum. Even though there are local
minima at higher model orders, the difference in FPE is negligible compared to lower
model orders. This is further clarified in the following sections, separately for ARX and
REPE models.
ARX(p, q, d) model
The FPE values for all model order combinations applied on a typical single sweep are
shown in figure 4.5a. The general observation is a sharp drop in FPE at low AR(p) orders
and asymptotic behaviour at higher orders, which is a typical scenario seen in system
identification based on ARX modelling. In contrast, FPE values of MA(q) orders have a
small variation. A closer observation of the extracted FPE curve of AR(p) at MA(4) in
figure 4.5b confirms the prior observation, but the zoomed in version of the same curve
indicates a local minimum at AR(4). A similar local minimum can be observed at MA(4) in
the extracted FPE curve of MA(q) at AR(4) in figure 4.5c. This trend could be seen in
other model order combinations and sweeps, and was thus used to automatically extract the
optimum model order.
Figure 4.4: Estimated Pole (x) and Zero (o) plots of the REPE. (a) Pole plot of the predefined and estimated AR(8) pre-whitening model. (b) Pole plot of the predefined and estimated ARX(6,7,0) model in the REPE. (c) Zero plot of the predefined and estimated ARX(6,7,0) model in the REPE.
Based on the above observation, the following algorithm was developed to detect the
optimum order:
For a given single sweep and MA(q) order, the optimum AR(p) order was detected by
locating the first local minimum in the AR(p) curve. The mode of the 10 optimum AR(p)
orders obtained for a single sweep (one for each MA(q) order) was then considered to be
the optimum AR order for that sweep. The same procedure was followed to detect the
optimum MA(q) order for that sweep.
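The order selection rule above can be sketched as follows. Since (2.5) is not reproduced here, Akaike's standard form FPE = V(1 + n/N)/(1 − n/N) is assumed, where V is the mean squared residual, n the number of estimated parameters and N the number of samples. The function names and the layout of the FPE grid (AR order along rows, MA order along columns) are also assumptions.

```python
import numpy as np
from collections import Counter

def fpe(residuals, n_params):
    """Akaike's final prediction error (assumed form of (2.5))."""
    n = len(residuals)
    loss = np.mean(np.square(residuals))
    return loss * (1.0 + n_params / n) / (1.0 - n_params / n)

def first_local_minimum(curve):
    """Index of the first point that is followed by an increase."""
    for i in range(len(curve) - 1):
        if curve[i] < curve[i + 1]:
            return i
    return len(curve) - 1  # curve never increases: take the last point

def optimum_orders(fpe_grid, orders):
    """Mode of the first local minima over each MA(q) column and AR(p) row."""
    fpe_grid = np.asarray(fpe_grid, dtype=float)
    # fpe_grid[i, j] is the FPE for AR order orders[i] and MA order orders[j].
    p_votes = [orders[first_local_minimum(fpe_grid[:, j])] for j in range(fpe_grid.shape[1])]
    q_votes = [orders[first_local_minimum(fpe_grid[i, :])] for i in range(fpe_grid.shape[0])]
    p_opt = Counter(p_votes).most_common(1)[0][0]
    q_opt = Counter(q_votes).most_common(1)[0][0]
    return p_opt, q_opt
```

Applying `optimum_orders` to the FPE grid of every sweep and building a histogram of the returned pairs mirrors the procedure used to fix the empirical model orders.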
These criteria were applied to all the sweeps in the dataset and the resulting histogram
of optimum model pairs is shown in figure 4.5d. This suggests the most frequently
identified model pair is AR(4) and MA(4). Therefore, for subsequent evaluation of ARX
rapid extraction, the empirical model was fixed to ARX(4,4,0).
A similar FPE criterion was then used to determine the optimum empirical model
orders of the REPE method. These details are reported in the following section.
REPE(p, q, r, d)
The FPE values for the pre-whitening AR(r) model is shown in figure 4.6a. Even though
it seems to be asymptotic at higher model orders, a zoomed in version of the same curve
indicates a minimum at AR(8) and is consistent with the optimum model order detection
criteria mentioned earlier. Repeated measures concluded a model order of AR(8) was
optimum for the pre-whitening model.
For a single sweep, the FPE results for the estimation of ARX(p, q, d) in the REPE are
shown in figure 4.6b, suggesting behaviour similar to that of the previous ARX model. A
closer look at the individual FPE curve across all AR(p) orders at MA(3), shown in
figure 4.6c, suggests that the FPE is asymptotic after AR(4). However, a zoomed-in version
of the same curve shows that a local minimum exists at AR(4), which is consistent with the
optimum model order detection criteria. A closer look at the individual MA(q) curve
at AR(4) in figure 4.6d suggests a gradual reduction in FPE with several local minima.
Therefore the optimum MA(q) order was detected by locating the first local minimum
in the MA(q) curve, in accordance with the optimum model order detection criteria. The
resultant histogram of the optimum model pairs, obtained by applying this criterion to all
sweeps in the dataset and shown in figure 4.6e, suggests optimum model orders of AR(4)
and MA(3). Therefore, for the subsequent evaluation of REPE rapid extraction, the empirical
model was fixed to REPE(4,3,8,0).
Figure 4.5: Results of the fixed model order determination of the ARX model. (a) Surface plot of FPE values for model order combinations of p and q between 1 and 10 applied on a typical single sweep. (b) Extracted FPE curves from (a) at MA(4) showing saturation at MA(4); the zoomed-in version shows the first local minimum at AR(4). (c) Extracted FPE curves from (a) at AR(4) showing the first local minimum at MA(4). (d) Histogram of the optimum model order pairs derived from all the sweeps in the dataset, suggesting the optimum estimated model to be ARX(4,4,0).
4.3.3 Comparison of model performance
Performance of the two models is evaluated using two methods. One of these is the
previously reported improvement in SNR (dB) = $10\log_{10}\left(E[n^2(k)]/E[(s(k)-\hat{s}(k))^2]\right)$, which
indicates the ratio of the initial noise n(k) to the final residual noise between the estimated
signal $\hat{s}(k)$ and the original signal s(k) (Lange & Inbar 1996). The other method compares
the latency and amplitude of the estimated wave V with those of the simulated wave V,
which are of clinical importance.
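As a worked illustration of the SNR improvement measure quoted above, the following Python sketch evaluates 10 log10 of the initial noise power over the residual error power, with expectations replaced by sample means. The function name `snr_improvement_db` and its argument names are hypothetical, and the short test vectors are made up for the example; this is not the code used in this thesis.

```python
import numpy as np


def snr_improvement_db(noise, signal, estimate):
    """SNR improvement in dB: 10*log10(E[n^2] / E[(s - s_hat)^2]).

    noise    : initial noise n(k) added to the signal
    signal   : original (noise-free) signal s(k)
    estimate : estimated signal s_hat(k)
    """
    noise = np.asarray(noise, dtype=float)
    signal = np.asarray(signal, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    initial_noise_power = np.mean(noise ** 2)
    residual_power = np.mean((signal - estimate) ** 2)
    return float(10.0 * np.log10(initial_noise_power / residual_power))


# Made-up example: the estimate retains one tenth of the noise amplitude.
n = np.array([1.0, -1.0, 1.0, -1.0])
s = np.array([0.0, 1.0, 0.0, -1.0])
s_hat = s + 0.1 * n
improvement = snr_improvement_db(n, s, s_hat)
```

A residual that is ten times smaller in amplitude than the initial noise corresponds to an improvement of about 20 dB.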
Improvement in the SNR
The performance of the two algorithms was analysed within a range of 10 to -30 dB initial
SNR in accordance with (2.9). Figure 4.7 shows the SNR improvement of the estimated ABR
using the theoretical ARX(6,7,0) and REPE(6,7,8,0) models and the empirical ARX(4,4,0)
and REPE(4,3,8,0) models across 100 responses. The REPE clearly performs better at low
initial SNRs, whereas the ARX shows a fairly constant but lower improvement
throughout. These SNR improvement values are comparable with the results reported in
Lange et al. (1996) for MLAEPs (see figure 2.8). Comparing the empirical and theoretical
models in figure 4.7, the empirical models perform better, producing higher SNR
improvement values. However, this measure considers the overall signal rather than a
specific feature of it, so it cannot be concluded that the result has clinical significance.
Therefore the latency and the amplitude of wave V were compared to evaluate the
performance of the empirical and theoretical models.
Latency and amplitude of wave V
Figure 4.8a compares the latency detection of the two models with theoretical and empirical
model orders. In general, ARX estimated sweeps produce a close match to the actual value,
with an overall MSE of 0.002 compared to an MSE of 0.013 for the REPE. However, the REPE
deviates considerably from 0 to -15 dB initial SNR, with an MSE of 0.006, which is not seen
in the ARX estimated latency curve, with an MSE of 0.001. Among them, the theoretical
model orders ARX(6,7,0) and REPE(6,7,8,0) give close latency values, with an MSE of 0.002,
compared to the empirical model orders ARX(4,4,0) and REPE(4,3,8,0), with an MSE of 0.004.
The amplitude variations shown in figure 4.8b indicate similar behaviour, with more
accurate amplitudes in ARX (MSE of 0.288) than in REPE (MSE of 0.697), even though there
is a 0.012 improvement in MSE for REPE at low initial SNRs. In general, therefore, the
variations of amplitude suggest that the theoretical model orders perform better than the
empirical ones.
Figure 4.6: Results of the fixed model order determination of the REPE. (a) FPE curve for AR(r) model orders; the zoomed-in version shows the first local minimum at AR(8). (b) Surface plot of FPE values for model order combinations of p and q between 1 and 10 applied on a typical single sweep. (c) FPE curve for all AR(p) orders at MA(3); the zoomed-in version shows a local minimum at AR(4). (d) FPE curve for all MA(q) orders at AR(4), indicating the local minimum at MA(3). (e) Histogram of the optimum model order pairs derived from all the sweeps in the dataset, suggesting the optimum estimated model to be REPE(4,3,8,0).
Figure 4.7: The SNR improvement of the estimated ABR using theoretical ARX(6,7,0), REPE(6,7,8,0) and empirical ARX(4,4,0), REPE(4,3,8,0) models. Error bars represent standard deviation across 100 responses.
On the other hand, when estimating the model orders of an unknown signal, the similarity
of the results generated from theoretical and empirical model orders (as shown in
figure 4.8) is an advantage, given that the only plausible method of estimating model orders
in this case is by analysing FPE values.
Within the context of this thesis and considering the clinical importance of extracting
EPs, the theoretical model orders of ARX(6,7,0) and REPE(6,7,8,0) were used for further
analysis.
Figure 4.8: Detection of wave V with empirical and theoretical model orders. (a) Latency and (b) amplitude of wave V. Standard deviation has an expected increase at low SNRs. It is evident that the curves derived from theoretical model orders have a closer match to the actual in all the plots. However, the difference of the empirical model orders is not significant.
4.3.4 Estimated single sweep of an ABR
Figures 4.9a and 4.9b show the estimated single sweep $\hat{s}(k)$ using theoretical model orders
in (2.6) for the ARX and (2.8) for the REPE respectively. A simple visual comparison
suggests better morphology (i.e. similar to u) in the ARX estimation than in the REPE,
confirming the results shown in figure 4.8. It also shows that the improvement in SNR at
-10 dB initial SNR shown in figure 4.7 does not reflect the actual amplitudes and latencies
of ABR features.
Figure 4.9: Single sweep estimated with ARX model and REPE. (a) A single sweep estimated using an ARX(6,7,0) model at initial SNR of -10 dB. (b) A single sweep estimated using an REPE(6,7,8,0) model at initial SNR of -10 dB. u(k) - derived reference ABR, S - deterministic ABR, $\hat{s}(k)$ - estimated ABR. Having a close u(k) to S with the ARX model provides the best result compared to REPE.
4.3.5 Tracking variations of a single sweep
It is critical to accurately track variations of the EP (in terms of latency and amplitude)
when monitoring or diagnosing the physiological status of a patient. Accurate variation
tracking becomes even more important with the inclusion of a template in ARX-based
modelling methods. To examine possible limitations, latency variations of single
sweeps were estimated using the ARX(6,7,0) model and the REPE(6,7,8,0) and compared to
those of the MTA. Wave V, the most prominent wave, was tracked in the datasets.
Single sweep variations estimated using ARX modelling
The latency tracking capability of the ARX estimated ABR is shown for a minimum variation
of 1 ms in figure 4.10 and a maximum variation of 2 ms in figure 4.11 at different SNRs from
0 dB to -20 dB. The maximum variation was limited to 2 ms considering the variation of
the L-I curve. The extracted wave V latency of the conventional MTA is also plotted
for comparison. Figure 4.10 suggests that latency tracking is achievable at 0 dB
and -5 dB. At these SNRs the MTA latency shows a clear phase difference at fL = 0.05 Hz
and 0.1 Hz and a flat line at 1 Hz, whereas the ARX estimated latency closely follows the
actual latency variation of the ABR. However, at lower SNRs of -10 dB and -20 dB, the
latency of the estimated ABR is not consistent, suggesting that ARX estimations are
vulnerable at low SNRs. In contrast, figure 4.11, with a latency variation of 2 ms,
generally shows poor tracking. A closer look at the 0 dB plot suggests that latency tracking
is reasonable up to fL = 0.05 Hz but not at higher frequencies or at SNRs below 0 dB.
This visual observation was quantified by calculating the MSE of $\hat{s}(k)$ and u(k)
compared to s(k). With an additional peak-to-peak latency variation of 1.5 ms, the MSE
values at different SNRs are presented in figure 4.12. Figures 4.12a and 4.12b suggest
that, for peak-to-peak latency variations of up to 1.5 ms, the ARX estimation produces
lower MSEs than the MTA at all frequencies, indicating superior latency tracking
performance. However, for a peak-to-peak variation of 2 ms, the ARX estimation and the
MTA perform similarly, with a high MSE, suggesting that tracking such large variations is
not possible. Figures 4.12c and 4.12d, at lower SNRs, indicate a deterioration in the
latency tracking performance for peak-to-peak variations of 1 ms and 1.5 ms. ARX estimated
ABRs perform equally to the MTA at an SNR of -10 dB and worse at -20 dB. Therefore the
latency variations that can be tracked by the ARX model are limited to initial SNRs of
-5 dB or higher, with a maximum latency variation of 2 ms peak-to-peak at 1 Hz.
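The MSE comparison used above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `latency_mse`, the sinusoidal latency profile, the 6 ms baseline latency and the 0.1 s spacing between sweeps are hypothetical values chosen for the example, not parameters taken from this study.

```python
import numpy as np


def latency_mse(tracked_ms, actual_ms):
    """Mean squared error (ms^2) between a tracked and an actual wave V latency series."""
    tracked = np.asarray(tracked_ms, dtype=float)
    actual = np.asarray(actual_ms, dtype=float)
    if tracked.shape != actual.shape:
        raise ValueError("latency series must have the same shape")
    return float(np.mean((tracked - actual) ** 2))


# Hypothetical example: a 1 ms peak-to-peak sinusoidal latency variation at 0.05 Hz.
t = np.arange(400) * 0.1                 # sweep times in seconds (assumed spacing)
baseline_ms = 6.0                        # assumed mean wave V latency
actual = baseline_ms + 0.5 * np.sin(2 * np.pi * 0.05 * t)

flat_track = np.full_like(actual, baseline_ms)   # a tracker that misses the variation
perfect_track = actual.copy()                    # a tracker that follows it exactly

mse_flat = latency_mse(flat_track, actual)
mse_perfect = latency_mse(perfect_track, actual)
```

A tracker that only reports the mean latency gives a non-zero MSE, while one that follows the variation exactly gives zero; lower values therefore indicate better tracking.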
Figure 4.10: Wave V latency tracking using ARX(6,7,0). Latency variation of 1 ms at each SNR. The frequencies of latency variation are 0.025 Hz, 0.05 Hz, 0.1 Hz and 1 Hz from top to bottom at each SNR.
Figure 4.11: Wave V latency tracking using ARX(6,7,0). Latency variation of 2 ms at each SNR. The frequencies of latency variation are 0.025 Hz, 0.05 Hz, 0.1 Hz and 1 Hz from top to bottom at each SNR.
Figure 4.12: MSE values comparing the latency tracking of the ARX estimation and the MTA at latency variations of 1 ms, 1.5 ms and 2 ms. (a), (b), (c) and (d) represent initial SNRs of 0 dB, -5 dB, -10 dB and -20 dB.
Single sweep variations estimated using REPE
Following a methodology similar to that of the ARX estimates, figure 4.13 and figure 4.14
illustrate the latency tracking capability of REPE(6,7,8,0) at peak-to-peak latency
variations of 1 ms and 2 ms. As with ARX latency tracking, the REPE performs poorly for
large variations in latency, as can be seen in figure 4.14. Close observation of the plots in
figure 4.13 suggests that the REPE(6,7,8,0) estimated latency follows the latencies derived
by the MTA, producing poor tracking performance compared to ARX(6,7,0).
The MSE plots in figure 4.15 confirm the poor latency tracking of the REPE estimations
compared to the MTA and also compared to the ARX estimations in figure 4.12. Even at 0 dB
(figure 4.15a) the estimated latency shows a similar or higher MSE compared to the MTA
derived latency.
Figure 4.13: Wave V latency tracking using REPE(6,7,8,0). Latency variation of 1 ms at each SNR. The frequencies of variation are 0.025 Hz, 0.05 Hz, 0.1 Hz and 1 Hz from top to bottom at each SNR.
Figure 4.14: Wave V latency tracking using REPE(6,7,8,0). Latency variation of 2 ms at each SNR. The frequencies of variation are 0.025 Hz, 0.05 Hz, 0.1 Hz and 1 Hz from top to bottom at each SNR.
Figure 4.15: MSE values comparing the latency tracking of the REPE estimation and the MTA at latency variations of 1 ms, 1.5 ms and 2 ms. (a), (b), (c) and (d) represent initial SNRs of 0 dB, -5 dB, -10 dB and -20 dB.
4.3.6 Confirmation of simulated results with actual ABRs
Model order determination
Since empirical results derived from REPE were not promising, real ABR data were
applied only to the ARX model.
Complying with the norm when estimating EPs with an ARX model (Jensen et al.
1998, Litvan et al. 2002), a unique fixed model order was determined with respect to
the recorded ABR to generate the L-I curve of wave V. Estimating with a fixed model
order, as opposed to estimating model orders for individual epochs, is acceptable owing to
the minimal variation of orders among epochs, and it saves the estimation time associated
with calculating those model orders.
Histograms of the optimum model order pairs for each model output block size are
shown in figure 4.16. They suggest that the optimum model order combination is
ARX(3,2,0), with the only exception at a block size of 128 epochs, where it is ARX(3,3,0).
The model order of ARX(3,2,0) was used to estimate ABRs to derive the L-I curve for
the following reasons:
i a difference of one model order has minimal effect at the output
ii the accuracy of model orders is critical at smaller block sizes for a rapid extraction
system
Estimation of L-I curves
The plots in figure 4.17 present the wave V latency of 100 random ABR epochs at each
sound intensity from 10 to 75 dB nHL. Randomly selected estimated individual ABRs used
to construct these L-I curves are shown in Appendix E. The block size of the output y(k)
clearly has a major effect on the ARX model estimate. As expected, the estimated L-I curve
converges as the number of epochs included in the average increases. ARX estimated curves
approach the grand average derived curves at block sizes of 128 and 256. Another
observation is the high variability at low sound intensity levels even at these block sizes.
This is an effect of the small wave V amplitudes at low sound intensities, which result in
lower SNRs in epochs compared to high sound intensities. The ARX estimated L-I curve
closely follows the grand average curve from a sound intensity of 40 dB nHL at a block size
of 128, and this improves to 30 dB nHL at a block size of 256 (with a small deviation at
35 dB nHL). These results indicate that ARX is not suitable for extracting ABRs at low
stimulus intensity levels.
ARX model stability
Another critical aspect of ARX modelling is the stability of the model generated
for each epoch. The location of the poles of the estimated model relative to the unit circle
in the z-plane determines whether the ABR estimate is realistic. Instability of the
automatically generated model results in distorted estimations of the ABR. While generating
ARX models for the ABRs used to derive the L-I curves, the models which contained poles
outside the unit circle were counted and discarded. Two examples of such unstable models
are shown in figure 4.18, each with a pole outside the unit circle ([1.011,0] and [1.004,0])
in the z-plane and correspondingly misshapen estimates. The percentages of these unstable
models (while generating 100 stable models) are given in table 4.1. This suggests that
unstable models are unavoidable and are expected to amount, on average, to 25% more than
the stable models generated. This prolongs the analysis time, which is an additional
disadvantage for a rapid extraction system.
Figure 4.16: Histograms of the model order combinations obtained when using a single epoch and MTAs of 32, 128 and 256 epochs as the output to the ARX model, with an exogenous input of a grand averaged real ABR. (a) y(k) = 1 epoch. (b) y(k) = MTA of 32 epochs. (c) y(k) = MTA of 128 epochs. (d) y(k) = MTA of 256 epochs.
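The stability test described above (all poles inside the unit circle of the z-plane) can be sketched with NumPy. This is an illustrative check, not the code used in this thesis. The function names `model_poles` and `is_stable`, and the convention that the AR polynomial is supplied as [1, a1, ..., ap] for A(z) = 1 + a1 z^-1 + ... + ap z^-p, are assumptions made for the example.

```python
import numpy as np


def model_poles(ar_coeffs):
    """Poles of 1/A(z), where A(z) = 1 + a1*z^-1 + ... + ap*z^-p.

    ar_coeffs is [1, a1, ..., ap]. Multiplying A(z) by z^p gives an ordinary
    polynomial in z whose roots are the poles of the model.
    """
    return np.roots(np.asarray(ar_coeffs, dtype=float))


def is_stable(ar_coeffs):
    """True if every pole lies strictly inside the unit circle."""
    return bool(np.all(np.abs(model_poles(ar_coeffs)) < 1.0))


stable_example = is_stable([1.0, -0.5])       # single pole at z = 0.5
unstable_example = is_stable([1.0, -1.011])   # single pole at z = 1.011, as in figure 4.18
```

In a rapid extraction loop, a model failing this check would be discarded and re-estimated, which is the source of the additional analysis time noted above.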
Block size   Sound intensity (dB nHL)
(epochs)      10  15  20  25  30  35  40  45  50  55  60  65  70  75  Average
1             21  15  21  17  11  18  19  16  19  11  12  22  19  18  17%
32            16  16  39  12   1  26  38   3  54   8  27  14  17  19  21%
128           70  40   1  50   1  41   1   1  35   1   7   9  40  24  23%
256           65  26  12  26   1  72   6   1   1   1  12  49  80  11  26%
Table 4.1: Unstable estimated epochs percentage (%) at each sound intensity level associated with different block sizes of the output to the ARX model
Figure 4.17: L-I curves derived with a single epoch, MTA of 32, 128 and 256 as the output to the ARX model. 100 random estimated epochs at each sound intensity were picked to plot these curves. A high variance is observed at small block sizes, while improved L-I curves that are closer to the benchmark are observed at an MTA of 256 epochs (only above 30 dB nHL). MSEs: single epoch 0.18, 32 epochs 0.14, 128 epochs 0.05, 256 epochs 0.02.
Figure 4.18: Two unstable model estimates derived by the ARX model, with a pole outside the unit circle at [1.011,0] and [1.004,0] (top and bottom plot respectively), resulting in morphologically uncharacteristic ABRs. Plotted traces: s(k) and u(k).
4.4 Discussion
A systematic evaluation of two parametric modelling methods (ARX, REPE) for the rapid
extraction of EPs, based on the filtering of a canonical template, was presented here to
determine both the ability to remove noise from an ABR single sweep and the ability to track
latency variations of the respective components. This systematic study included a
simulation followed by confirmation of those results with physiological ABR recordings.
The ability of the ARX and REPE methods to remove noise was comparable to that of previous
studies, producing similar improvements in the SNR. In contrast, meaningful tracking of
simulated changes in the latency of dominant ABR components was possible only with ARX
modelling and not with the REPE. The application of these parametric models to real ABR
data revealed that, while ARX modelling is effective at high SNRs, producing an L-I curve
close to that of the standard (MSEs corresponding to block sizes: 1-0.18, 32-0.14,
128-0.05, 256-0.02), it is compromised at low initial SNRs and with single sweeps (or small
block sizes), and is therefore unable to contribute to a rapid extraction system.
By producing similar improvements in SNR, section 4.3.3 confirms that this simulation
study is comparable with the previous studies by Cerutti (1988) and Lange &
Inbar (1996). Further investigations were carried out to evaluate the performance using
simulated data with deterministic EPs. The evaluation of latency variation tracking using
wave V of the ABR, introduced in this study, proved to be of clinical significance.
The latency tracking results provide new insight into the performance and limitations
of using ARX and REPE as rapid feature extraction methods. The scope of this study
is limited to a template formed by an MTA of 100 sweeps and the estimation of a single sweep.
The unique study conducted here with known and physiologically plausible latency
variations of the ABR (e.g. 1 ms, 1.5 ms, 2 ms) revealed that results which previous research
attributed to ‘uncertainties due to physiological phenomena’ are in fact caused by
inaccuracies of the methodology itself. As an example, this study concludes that even though
the REPE is superior to ARX in improving the SNR, it underperformed in tracking latency
variations, producing results similar to the conventional MTA.
The underperformance of the REPE could be due to the formulation of the noise n(k) as
defined by Lange & Inbar (1996), which uses a finite impulse response MA process
rather than an infinite impulse response AR process and thus contradicts (2.4) in
section 2.2.4. Further, the simulations reported in Lange & Inbar (1996) low-pass filtered
the test dataset before applying the REPE, in which case the effective transfer function of
the system is changed from the original.
Further, the study of the REPE in this thesis revealed that normalization of the amplitude
after the pre-whitening stage is essential, as the signal power is attenuated, producing a
signal which is unable to excite the ARX model for optimum performance. Another
shortcoming of the original article (Lange & Inbar 1996), considering the application to the
ABR, is the amount of latency variation induced on the template. Lange & Inbar (1996)
varied the latency by 3 time points. In contrast, the current study induces a minimum
variation of 40 time points, which corresponds to a physiological latency variation of 1 ms
of the ABR at a sampling frequency of 40 kHz. The original study by Cerutti (1988), which
introduced ARX modelling for variation tracking, induced latency variations
through the model transfer function, but neither a specification nor any analysis of them
was provided.
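The relation between latency shifts in milliseconds and time points at the 40 kHz sampling frequency can be checked with a few lines of Python. The helper name `ms_to_samples` is hypothetical and is introduced only for this illustration.

```python
def ms_to_samples(shift_ms, fs_hz):
    """Number of time points corresponding to a latency shift at a given sampling frequency."""
    return int(round(shift_ms * 1e-3 * fs_hz))


FS_HZ = 40_000  # sampling frequency of the ABR recordings stated in the text

# Peak-to-peak latency variations considered in this chapter.
shifts_in_samples = {ms: ms_to_samples(ms, FS_HZ) for ms in (1.0, 1.5, 2.0)}
```

At 40 kHz one time point spans 0.025 ms, so the 1 ms, 1.5 ms and 2 ms variations correspond to 40, 60 and 80 time points respectively.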
The results of the simulation study further suggest that extraction of latency variations
with the ARX model is superior (producing low MSEs), with a capacity to extract latency
variations of up to 1 ms at 1 Hz at an initial SNR of -5 dB and up to 2 ms
at 0.05 Hz at an initial SNR of 0 dB. A recorded actual ABR (and also most EPs)
possesses SNRs lower than -5 dB. Therefore, for ARX to be used effectively, an MTA
of more than 100 sweeps for the exogenous input u(k) and of more than 1 for the output y(k)
should be considered. However, the higher MSEs (viz. poor performance) of the REPE in
variation tracking suggest that it is not applicable to extracting ABR variations.
To confirm these simulated results, ARX modelling was then applied to real ABR
data. According to the L-I curves derived (as a measurement tool to evaluate time-scale
variation tracking), the optimum result was generated with an MTA of 256 as the output.
The derived L-I curve coincided with the benchmark L-I curve within the range of 30-75
dB nHL, with an MSE of 0.02. The smaller block sizes of 128 and 32 and the single epoch
produced worse L-I curves, with higher MSEs of 0.05, 0.14 and 0.18 respectively and high
variability. The inability to derive the L-I curve at low intensities is due to the low
amplitudes of the ABR, which result in lower SNRs compared to the larger amplitudes found
at the higher sound intensities of 30-75 dB nHL. This implies that, to generate a complete
L-I curve, an ensemble of more than 256 epochs should be included in the MTA to generate
the reference to the ARX model.
4.5 Conclusion
Within the framework of the simulation study, we can conclude that the parametric
method of autoregressive modelling with an exogenous input (ARX) is capable
of extracting time-scale varying features of a signal within the range of real physiological
signals. Even though the robust evoked potential estimator (REPE) is superior in
SNR improvement, it is unable to track time-scale variations of the signal.
The results generated with the L-I curve argue against the use of ARX modelling in rapid
extraction of the ABR. The conclusions derived from the simulation study are strengthened
by these results with recorded ABRs. Even with a fully featured template generated with an
MTA of 1024 (grand average), more than 256 epochs are required to derive a complete L-I
curve. Therefore, compared to the conventional MTA, ARX modelling does not provide
an improvement for rapid extraction. Rather, ARX modelling requires additional
processing time because unstable models are estimated during the filtering process.
Therefore we conclude that rapid extraction of the ABR using ARX modelling methods is not
feasible, as it is highly susceptible to the magnitude of noise associated with the ABR.
Evidently, even though the use of templates enables noise to be removed from
a morphologically similar EP, it limits the tracking of substantial offsets
(>2 ms/80 time points) from the template. It is therefore worthwhile to evaluate the
performance of feature extraction methods which are not based on parametric modelling.
Wavelet-based methods have shown promising results in removing noise from
non-stationary EPs. Robust decomposition methods combined with efficient wavelet
coefficient selection algorithms were emerging at the time of this research. The following
chapter describes a detailed study and the resulting conclusions for a rapid extraction
system of the ABR based on wavelet theories.
Chapter 5
Effectiveness of wavelet techniques
in the rapid extraction of the ABR
The extraction of the ABR with wavelets mainly investigates three denoising methods:
constant thresholds with matching coefficients (CTMC), temporal windowing with matching
coefficients (TWMC) and cyclic shift tree denoising (CSTD). This chapter presents
the methodology of modifying these methods and evaluating them as rapid extraction
methods of the ABR. In addition, two wavelet decomposition algorithms, DWT and SWT,
were used to investigate the accuracy of the results produced. Using an approach similar
to that in Chapter 4, a simulation study was followed by the application of these methods
to real ABR data. Both approaches included evaluation of the denoising capacity and the
ability to track time-scale variations. The specific aims of this chapter are:
• To optimise the CTMC, TWMC and CSTD algorithms for the ABR signal domain.
• To conduct a simulation study comparable to that of the ARX modelling methods.
• To determine the minimum number of epochs required to extract a fully featured
ABR.
• To investigate the effect of the template on time-scale variation tracking ability using
L-I curves.
• To implement wavelet denoising methods with SWT, as a novel approach, and analyse
the effect of shift-invariance.
The journal article published in association with this chapter (De Silva & Schier 2011) is
attached in Appendix B.
Wavelets in rapid extraction of the ABR 117
5.1 Wavelet extracting methods
The prerequisites for defining the wavelet denoising methods used in this chapter are
explained in the following two sections, 5.1.1 and 5.1.2, viz. the synthetic and real ABR
templates and the necessary wavelet decomposition sub-bands.
5.1.1 Synthetic and real ABR template
The essential fully featured reference ABR templates were adopted from sections 3.3.1
and 3.1.5. These templates were used in CTMC and TWMC and to determine the common
optimum basis wavelet. The real template was derived with an MTA of 1024 sweeps of ABR
recorded at 55 dB nHL. The synthetic template was based on (3.3), featuring the important
ABR waves I, III and V. Both the synthetic and real templates are shown in figure 5.1
(these are identical to the template shown in section 3.3.1).
Figure 5.1: The reference ABR template used in wavelet denoising methods, generated with MTA of 1024 sweeps at 55 dB nHL, showing main features wave I, wave III and wave V and additional features wave II and wave VI. (a) Synthetic reference signal with no latency variation (l=0). (b) Actual ABR recorded at 55 dB nHL. These are identical to the templates shown in 3.7.
5.1.2 Wavelet decomposition levels
The number of wavelet decomposition levels was chosen based on the frequency content
of the ABR. The spectrum of the significant features of the ABR is dominated by the
frequencies 200, 500 and 900 Hz (Boston 1981, Delgado & Ozdamar 1994). Therefore a 6-level
wavelet decomposition tree was used, placing these three dominant frequencies in
separate levels to allow better noise reduction. As shown in table 5.1, the ABR recording
sampled at 40 kHz is divided into dyadic scales, with the frequency range of each level
being half that of the previous level. Levels A6, D6 and D5 contain the dominant frequencies
of the ABR. D4 (1.25-2.5 kHz) was also included in the analysis because substantial signal
power of the ABR lies within 100-3000 Hz, as per Hall J. W. (2007). Further evidence
related to wavelets is presented in figure 5.4b, with substantial amplitudes in the
reconstructed waveform in the D4 subband.
DWT level               D1       D2      D3       D4          D5         D6         A6
Frequency content (Hz)  20k-10k  10k-5k  5k-2.5k  2.5k-1.25k  1.25k-625  625-312.5  312.5-0
Table 5.1: Frequency content of wavelet subspaces. 6-level DWT decomposition levels at a sampling frequency of 40 kHz.
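The band edges in table 5.1 follow from halving the Nyquist frequency at each level. The following is a minimal Python sketch, assuming ideal half-band splitting (real wavelet filters overlap between neighbouring bands). The helper names `dyadic_bands` and `band_of` are hypothetical and are not part of the thesis.

```python
def dyadic_bands(fs_hz, levels):
    """Return {subband: (low_hz, high_hz)} for an idealised `levels`-level DWT."""
    bands = {}
    high = fs_hz / 2.0                    # Nyquist frequency
    for level in range(1, levels + 1):
        low = high / 2.0                  # each detail band halves the range
        bands[f"D{level}"] = (low, high)
        high = low
    bands[f"A{levels}"] = (0.0, high)     # remaining approximation band
    return bands


def band_of(freq_hz, bands):
    """Name of the subband whose idealised range contains freq_hz."""
    for name, (low, high) in bands.items():
        if low <= freq_hz < high:
            return name
    return None


bands = dyadic_bands(40_000, 6)
# Dominant ABR frequencies quoted in the text: 200, 500 and 900 Hz
dominant = [band_of(f, bands) for f in (200, 500, 900)]
print(dominant)
```

Under these assumptions the three dominant frequencies fall in A6, D6 and D5 respectively, in agreement with the text.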
5.1.3 Constant thresholds with matching coefficients (CTMC)
The CTMC is based on an idealised template of the signal to be extracted. It is assumed
that an increase in noise reduction is achieved by matching coefficients of the noisy signal
with the template.
Applying a threshold alone cannot yield a clean ABR because the amplitude of the
ABR is small compared to the background spontaneous EEG. Therefore, in addition to the
fundamental thresholding of wavelet coefficients, a matching process is implemented
between the thresholded coefficients of the noisy ABR and those of the template to make
the denoising robust.
According to the flowchart shown in figure 5.2, first the block with the reduced number
of averages was decomposed using DWT into six levels. Then the thresholds were applied
to each level as follows:
• Level A6 - all the coefficients were retained
• Levels D6 to D4 - 20% of the most prominent coefficients were retained
• Levels D3 to D1 - all the coefficients were nullified
[Figure 5.2 flowchart. Template branch: generate the template; decompose using DWT; apply the threshold on coefficients. Noisy ABR branch: calculate the average of the block; decompose using DWT; apply the threshold on coefficients; retain the matching coefficients compared with threshold coefficients of the template; reconstruct the denoised block using IDWT.]
Figure 5.2: Flowchart of the CTMC algorithm. The flow above the dotted line represents the use of the template (defining temporal windows based on the template) and the flow below the dotted line represents applying those temporal windows to a noisy ABR.
Since the coefficients at level A6 provide the base of the ABR, all of them were
retained. A threshold was applied to retain the highest 20% of coefficients in levels
D6 to D4. The threshold level of 20% was chosen as a compromise between
i a lower threshold, which reduces high frequency noise at lower scales but neglects
important coefficients at higher scales, and
ii a higher threshold, which includes important low frequency components at higher
scales at the expense of adding noise at lower scales.
All coefficients were nullified at levels D3 to D1 to remove high frequency noise generated
by spontaneous EEG between 2.5 and 20 kHz.
The fully featured reference template defined in section 5.1.1 was used as the template
for matching the thresholded coefficients in this method.
The coefficient matching process started by applying the threshold scheme to the
DWT-decomposed template. The temporal locations of the retained template coefficients
were then matched with those of the thresholded coefficients of the noisy ABR. The
coefficients common to both were used to reconstruct the refined ABR using the inverse DWT.
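The thresholding and matching steps for one detail level can be sketched as follows. This is only an illustration under stated assumptions: "most prominent" is taken to mean largest absolute value, the DWT and inverse DWT steps are omitted, and the function names `top_fraction_mask` and `ctmc_level` are hypothetical.

```python
def top_fraction_mask(coeffs, fraction):
    """Mark the `fraction` of coefficients with the largest absolute value."""
    keep = int(round(fraction * len(coeffs)))
    # Indices ordered by decreasing magnitude (stable, so ties keep index order)
    order = sorted(range(len(coeffs)), key=lambda i: -abs(coeffs[i]))
    kept = set(order[:keep])
    return [i in kept for i in range(len(coeffs))]


def ctmc_level(noisy, template, fraction=0.2):
    """Keep a noisy coefficient only where it survives the threshold AND the
    template coefficient at the same temporal location also survives."""
    noisy_mask = top_fraction_mask(noisy, fraction)
    template_mask = top_fraction_mask(template, fraction)
    return [c if (n and t) else 0.0
            for c, n, t in zip(noisy, noisy_mask, template_mask)]


# Toy coefficients for a single level (not real ABR data)
template = [0.0, 0.0, 5.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
noisy = [3.0, 0.1, 4.0, 0.2, 0.3, -0.1, 0.2, 0.1, 0.0, 0.1]
matched = ctmc_level(noisy, template)
print(matched)
```

In this toy case the large noisy coefficient at index 0 passes the 20% threshold but is discarded because the template has no retained coefficient at that location. Only index 2, which is retained in both, survives.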
5.1.4 Time windowing with matching coefficients (TWMC)
Using the time domain representation of the wavelet transform, the TWMC method identifies
and suppresses noise components distributed along wavelet subspaces and extracts the ABR.
TWMC is based on the assumption that specific ABR features occur in predetermined time
windows within the response. Such time windows are assumed to reduce distortions from
latency variations caused by the stimulus or pathological conditions of the participants. At
the same time, these time windows should prevent uncorrelated wavelet coefficients from
being involved in the reconstruction process, thereby removing noise. To incorporate
these features, the time windows should be heuristically determined so that they coincide
with the time locations of the ABR template features. As a result, TWMC is template
dependent, and the effect of using the template is presented in section 5.6.3.
According to the flowchart in figure 5.3, the standard template (section 5.1.1) was
decomposed into 6 levels using DWT. Under the assumption that the template is noise free,
the ABR is generated only by wavelet coefficients with large magnitudes. Figure 5.4a
illustrates the template and its decomposed wavelet coefficients on the same time scale, with
prominent coefficients aligned with features of the ABR template. Since the frequency
ranges of decomposition levels D1, D2 and D3 lie outside the ABR spectrum, no windows
were defined for them. In all the other decomposition levels, time windows were defined
such that prominent coefficients related to waves I, III and V were included. The process
of defining these windows was experimental (thus heuristic), allowing sufficient width to
accommodate any latency variations when applying them to recorded ABRs. The effect of
these temporal windows at each decomposition level is illustrated in figure 5.4b. The
algebraic sum of these individual signals gives the refined template, as seen in the
uppermost plots of figure 5.4. Here, the reconstructed template contains all the important
features and was expected to preserve similar features in noisy ABRs.
[Figure 5.3 flowchart. Template branch: generate the template; decompose using DWT; define windows on each decomposition level; verify the reconstructed template using IDWT. Noisy ABR branch: calculate the average of the block; decompose using DWT; apply windows defined on the template; reconstruct the denoised block using IDWT.]
Figure 5.3: The flowchart of the TWMC algorithm. The flow above the dotted line represents defining temporal windows based on the template and the flow below the dotted line represents applying those temporal windows to a noisy ABR.
According to the TWMC algorithm in figure 5.3, the next step after verifying the temporal
windows with the template is to apply them to a noisy ABR derived from MTA of a reduced
number of epochs. Similar to the template, this noisy ABR is decomposed into 6 levels, and
the predefined temporal windows are then applied to arrive at the noise reduced ABR.
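Applying predefined windows amounts to zeroing every coefficient outside the windows of its level. The sketch below assumes windows are given as index ranges into each level's coefficient array. The window positions, the toy coefficients and the names `apply_windows` and `twmc_mask` are hypothetical; the actual windows in the thesis were set heuristically from figure 5.4.

```python
def apply_windows(coeffs, windows):
    """Zero every coefficient whose index lies outside all (start, stop) windows.
    `stop` is exclusive, as in Python slicing."""
    kept = [0.0] * len(coeffs)
    for start, stop in windows:
        for i in range(start, min(stop, len(coeffs))):
            kept[i] = coeffs[i]
    return kept


def twmc_mask(decomposition, windows_per_level):
    """Apply per-level windows; a level without any window is nullified,
    which is how D1 to D3 are treated in the text."""
    return {level: apply_windows(coeffs, windows_per_level.get(level, []))
            for level, coeffs in decomposition.items()}


# Toy decomposition with two levels (not real ABR coefficients)
decomposition = {"D4": [1.0, 2.0, 3.0, 4.0], "D1": [9.0, 9.0]}
windows = {"D4": [(1, 3)]}            # hypothetical window on D4 only
windowed = twmc_mask(decomposition, windows)
print(windowed)
```

The windowed coefficients would then be passed to the inverse transform to reconstruct the denoised block.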
5.1.5 Cyclic shift tree denoising (CSTD)
CSTD uses linear averaging of epochs and thresholding of wavelet coefficients in a
systematic iteration to achieve a greater reduction in noise.
Building on MTA, the basic and most reliable method of extracting ABRs, CSTD
hypothesises that increasing the number of averages within the same number of epochs
will yield improved noise reduction, thereby suppressing random ongoing EEG noise. In
the implementation, CSTD utilises MTAs in a cyclic manner to create additional averages
on a block of epochs. CSTD also hypothesises that the application of additional
thresholds improves the SNR. Therefore two types of thresholds are applied to the
wavelet coefficients that are derived from cyclic averaging.
[Figure 5.4 plots: (a) and (b)]
Figure 5.4: Temporal windows defined for the TWMC. (a) Temporal windows defined according to the significant coefficients correlated to ABR features are indicated in grey shades. (b) Reconstructed signal at each decomposition level using only the windowed coefficients. The direct summation of these D1-D6 and A6 reconstructed signals gives the final refined template.
[Figure 5.5 flowchart boxes: create the array of epochs (N) in the block; decompose each epoch using DWT; average as per CSTD; apply scale thresholds; apply CSTD level thresholds; iterate ln(N) times; calculate the linear average of the last CSTD level; reconstruct the denoised block using IDWT.]
Figure 5.5: The flowchart of the CSTD algorithm. This is an iterative process which, unlike CTMC and TWMC, does not depend on a template.
The unique feature of the CSTD algorithm compared to CTMC and TWMC is that it
does not depend on a template. This has the potential benefit of extracting the wide range
of temporal variations in the ABR that result from stimulation and pathological conditions
of the participant. The effectiveness of the use of a template will be assessed in sections 5.5.7
and 5.5.7 in detail.
Figure 5.5 shows the iterative process of the CSTD algorithm, in contrast to CTMC
and TWMC which are based on a template. First a block of N epochs is created, where
$N = 2^i$, i = 2, 3, 4, 5, 6, 7. These individual epochs are then decomposed into six levels
using DWT. The coefficients derived from the decomposition are subjected to a
thresholding process (described in section 5.1.5). The thresholded coefficients are then
subjected to dyadic averaging with cyclic shifts to create the next level of the CSTD algorithm
(described in section 5.1.5). These coefficients are then processed with the second level
of thresholding, called CSTD ‘level thresholding’ (described in section 5.1.5). There are
L = ln(N) iterations, i.e. ‘CSTD levels’, given N epochs included
in a block. After CSTD has reached the last level, a single set of wavelet coefficients
representing only one epoch is calculated, and the denoised ABR is then reconstructed
using inverse DWT.
Cyclic shift dyadic averaging
At each CSTD level l (1 ≤ l ≤ L), two adjacent epochs at level l − 1 (dyads) are
averaged and denoised to create the new CSTD level. The dyadic averages consider not
only adjacent epochs, but also dyadic averages after a cyclic shift by one epoch at that
level. This process is illustrated in figure 5.6 for the N = 8 scenario. As an example, the
cyclic shift nature of this algorithm is evident at CSTD level l = 2 in the last dyadic
average, E81. At the last CSTD level, l = 4, each epoch is identical to the linear average
of the epochs at the initial level, with each initial epoch included only once. Therefore
the last N epochs are all the linear average of the initial N epochs, but derived through
different paths of the tree structure. However, the application of thresholds at each
wavelet and CSTD level makes the CSTD algorithm a nonlinear process. As a result, the last
N epochs derived with cyclic shift dyadic averages after applying the thresholds vary from
each other. The following sections 5.1.5 and 5.1.5 describe the application of thresholds.
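The averaging sequence of figure 5.6 can be reproduced with a short sketch. It assumes the grouping read off that figure: each group of g items yields two groups of g/2 items, one from adjacent dyads and one from dyads after a cyclic shift by one. Thresholding is deliberately left out, so this shows only the linear part of the tree. The names `next_cstd_level` and `cstd_tree` are hypothetical.

```python
def next_cstd_level(items, group_size, combine):
    """Build the next CSTD level from consecutive groups of `group_size` items."""
    out = []
    for start in range(0, len(items), group_size):
        group = items[start:start + group_size]
        # Dyads of adjacent items: (0,1), (2,3), ...
        adjacent = [combine(group[i], group[i + 1])
                    for i in range(0, group_size, 2)]
        # Dyads after a cyclic shift by one: (1,2), (3,4), ..., (last, 0)
        shifted = [combine(group[i], group[(i + 1) % group_size])
                   for i in range(1, group_size, 2)]
        out.extend(adjacent + shifted)
    return out


def cstd_tree(items, combine):
    """Return every level, starting with the initial block (length a power of 2)."""
    levels = [list(items)]
    group_size = len(items)
    while group_size > 1:
        levels.append(next_cstd_level(levels[-1], group_size, combine))
        group_size //= 2
    return levels


# Track which epochs enter each average by concatenating their labels
labels = cstd_tree([str(k) for k in range(1, 9)], lambda a, b: a + b)
# Numeric check without thresholds: plain pairwise averaging
values = cstd_tree([float(k) for k in range(1, 9)], lambda a, b: (a + b) / 2.0)
for level in labels:
    print(level)
```

For N = 8 the sketch performs three halvings and ends at the fourth level. Without thresholds every final entry equals the plain average of the eight inputs, which illustrates why it is the thresholding that makes the final epochs differ from each other.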
Wavelet level thresholds
[Figure 5.6 diagram.
L=1 (the block of initial N = 8 epochs): E1 E2 E3 E4 E5 E6 E7 E8
L=2 (groups of N/2): E12 E34 E56 E78 | E23 E45 E67 E81
L=3 (groups of N/4): E1234 E5678 | E3456 E7812 | E2345 E6781 | E4567 E8123
L=4 (groups of N/8): E12345678 | E56781234 | E34567812 | E78123456 | E23456781 | E67812345 | E45678123 | E81234567
Each of L=2, L=3 and L=4 is reached by cyclic shift dyadic averaging and applying the CSTD level threshold (δ1, δ2, δ3). The linear average of the final 8 epochs is the refined signal.]
Figure 5.6: Averaging sequence of the CSTD algorithm. Cyclic shift dyadic averaging and application of the CSTD level threshold algorithm for a case of N = 8 epochs.
Distinct from the constant threshold used in CTMC, CSTD uses a threshold function
$\delta_w$ which depends upon the wavelet decomposition level w. A decreasing function
$\delta_{w+1} = 2^{-w/2}\,\delta_w$ from D1 to D6 and A6 (where w = 1, 2, 3, 4, 5, 6) was
chosen for this study with the initial value $\delta_1 = 1$ (Causevic et al. 2005, Donoho
1995). Such a decreasing function applies a high threshold at the initial wavelet
decomposition levels, removing most of the coefficients generated by high frequency noisy
data, and retains relevant coefficients at the lower frequency wavelet decomposition levels.
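To make the decay of the wavelet level threshold concrete, the recursion can be evaluated directly. This is a minimal sketch, assuming seven thresholds (D1 to D6 followed by A6) and the stated initial value of 1. How the threshold value is scaled against the coefficients is not specified in this section and is not modelled here. The function name `wavelet_level_thresholds` is hypothetical.

```python
def wavelet_level_thresholds(delta_1=1.0, n_subbands=7):
    """Evaluate delta_{w+1} = 2**(-w/2) * delta_w starting from delta_1."""
    thresholds = [delta_1]
    for w in range(1, n_subbands):
        thresholds.append(2.0 ** (-w / 2.0) * thresholds[-1])
    return thresholds


subbands = ["D1", "D2", "D3", "D4", "D5", "D6", "A6"]
thresholds = wavelet_level_thresholds()
for name, delta in zip(subbands, thresholds):
    print(f"{name}: {delta:.6f}")
```

Unrolling the recursion gives the closed form $\delta_w = 2^{-w(w-1)/4}\,\delta_1$, so the threshold falls to 0.125 at D4 and 0.03125 at D5, and is already below 0.001 at A6.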
CSTD level thresholds
The CSTD level threshold is unique to CSTD and is an additional thresholding process
compared with conventional wavelet denoising. This threshold is applied to the
epochs in all the CSTD levels. A function $\delta_l$ and an initial value for the CSTD level
threshold had to be determined specifically for this study. The deviation
from the original study (Causevic et al. 2005) is due to the difference in the basis wavelet.
A set of increasing, constant and decreasing functions as shown in (5.1), with a range of
initial values, was tested using the recorded data to determine the CSTD level threshold
function $\delta_l$; the results are presented in section 5.5.2.
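\[
\begin{aligned}
\delta_{l+1} &= 0.05\,l + \delta_l \\
\delta_{l+1} &= \delta_l \\
\delta_{l+1} &= \frac{1}{2^{l/2}}\,\delta_l \\
\delta_{l+1} &= \frac{1}{\exp(l)}\,\delta_l \\
\delta_{l+1} &= \frac{1}{l}\,\delta_l \\
\delta_{l+1} &= \frac{1}{l^{2}}\,\delta_l
\end{aligned}
\tag{5.1}
\]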
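The candidate recursions in (5.1) can be compared by iterating them from a common initial value. The sketch below is an assumption-laden illustration: the dictionary keys, the function name `level_thresholds` and the choice of three thresholds (matching δ1 to δ3 in the N = 8 example of figure 5.6) are hypothetical, and the initial value 0.8 is simply one of the values examined in the text.

```python
import math

# One update rule per line of (5.1): (level l, current delta) -> next delta
CANDIDATES = {
    "0.05*l + delta": lambda l, d: 0.05 * l + d,
    "delta": lambda l, d: d,
    "delta / 2**(l/2)": lambda l, d: d / 2.0 ** (l / 2.0),
    "delta / exp(l)": lambda l, d: d / math.exp(l),
    "delta / l": lambda l, d: d / l,
    "delta / l**2": lambda l, d: d / l ** 2,
}


def level_thresholds(update, delta_1, n_levels):
    """Iterate one candidate recursion, returning delta_1 .. delta_n."""
    deltas = [delta_1]
    for l in range(1, n_levels):
        deltas.append(update(l, deltas[-1]))
    return deltas


table = {name: level_thresholds(rule, 0.8, 3)
         for name, rule in CANDIDATES.items()}
for name, deltas in table.items():
    print(name, [round(d, 4) for d in deltas])
```

Note that the 1/l and 1/l² rules leave the threshold unchanged at the first step, because l = 1 there.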
5.1.6 Use of SWT algorithm in CTMC, TWMC and CSTD
To eliminate the drawback of shift-variance in DWT, which causes time scale distortions
of the reconstructed signal, we tested the above mentioned denoising methods with the
SWT. As suggested in sections 2.3.5 and 2.3.6, the absence of subsampling makes SWT
shift-invariant, however with a trade-off of increased computational complexity.
The major difference between the DWT and the SWT algorithms is the dyadic decimation
at each decomposition level. As shown in table 5.2, a 6-level SWT decomposition has the
same number of coefficients as the analysed signal at every decomposition level, whereas
in DWT the number of coefficients is reduced in dyadic scales.
Decomposition levels  D1   D2   D3   D4    D5    D6    A6
SWT                   N    N    N    N     N     N     N
DWT                   N/2  N/4  N/8  N/16  N/32  N/64  N/64
Table 5.2: Coefficients of SWT and DWT. Number of wavelet coefficients at each decomposition level of DWT and SWT for a signal of length N.
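The counts in table 5.2 can be tabulated for a concrete signal length. This sketch uses the idealised counts from the table. Practical DWT implementations often return slightly more coefficients per level because of boundary extension, which is not modelled here. The signal length of 1024 samples and the name `coefficient_counts` are hypothetical choices for illustration.

```python
def coefficient_counts(n_samples, levels, decimated):
    """Idealised number of coefficients per subband.
    decimated=True  -> DWT (halved at every level)
    decimated=False -> SWT (no subsampling)"""
    names = [f"D{k}" for k in range(1, levels + 1)] + [f"A{levels}"]
    counts = {}
    for name in names:
        depth = int(name[1:])             # decomposition depth of the subband
        counts[name] = n_samples // 2 ** depth if decimated else n_samples
    return counts


dwt_counts = coefficient_counts(1024, 6, decimated=True)
swt_counts = coefficient_counts(1024, 6, decimated=False)
print(dwt_counts, sum(dwt_counts.values()))
print(swt_counts, sum(swt_counts.values()))
```

For this length the DWT keeps 1024 coefficients in total while the SWT keeps seven times as many, which illustrates the computational trade-off of shift-invariance mentioned above.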
Given a constant number of decomposition levels, the only difference between DWT
and SWT is the number of wavelet coefficients at each decomposition level. Taking this
point into consideration, the adaptation of CTMC, TWMC and CSTD to the SWT
algorithm is described as follows.
CTMC applies a 20% threshold at each detail decomposition level of both the template
and the noisy ABR in order to retain the common coefficients based on their temporal
locations prior to reconstructing the denoised ABR. This method can be directly
implemented using SWT derived wavelet coefficients instead of DWT, with the only difference
being that each decomposition level has N coefficients. The 20% threshold can
be applied to these coefficients in the same way as described in section 5.1.2.
In modifying TWMC to suit the SWT algorithm, a new set of windows was determined
to suit the new temporal locations of the coefficients. Figure 5.7 illustrates the windows
defined to be compatible with the coefficients derived from the SWT decomposition
algorithm. These windows are different from the DWT windows defined in figure 5.4. The
newly defined windows were then imposed on noisy ABRs to filter noise in a manner
similar to that shown in figure 5.3.
The CSTD, in contrast, has two independent processes: application of thresholds and
circular averaging. Application of thresholds is similar to that of CTMC, and thus the
adaptation from DWT to SWT is similar. However, it was necessary to rearrange the wavelet
coefficients derived from SWT to suit the circular averaging, due to the resultant lengths
of the arrays from MATLAB™ functions. The coefficients resulting from SWT at each
decomposition level were arranged in an array as shown in figure 5.8 for the convenience
of circular averaging. This array represents only one decomposed epoch (Ei) in figure 5.6.
Such arrays were then used with the CSTD algorithm according to the flowchart in figure 5.5.
5.2 Choice of the basis wavelet
The fully featured template shown in figure 3.4 with added Gaussian white noise was used
to determine the suitable basis wavelet. Synthesised noise was used here to achieve a
consistent noise profile, thereby avoiding any spurious effects that might occur in recorded
ongoing EEG. The SNR of the tested ABR was kept at -15 dB which is equivalent to a
theoretical MTA of 32 sweeps having an initial SNR of -30 dB.
Biorthogonal basis wavelets of orders 3.3, 3.5, 3.7, 3.9, 4.4, 5.5 and 6.8 were tested with
denoising methods CTMC, TWMC and CSTD using their default parameters related to
the ABR application. The results are presented in section 5.5.1 in the form of MSEs.
[Figure 5.7 plots]
Figure 5.7: Defined temporal windows for TWMC with SWT algorithm. Blue - coefficients of the original template. Red - windowed coefficients.
5.3 Simulation study on wavelet methods
Similar to the one shown in Chapter 4, a simulation study was conducted with chosen
wavelet denoising methods to assess the noise removal and latency tracking ability of
CTMC, TWMC and CSTD. Such a study enables a direct comparison to parametric
modelling methods and an unbiased estimation of the performance before applying them
to real ABRs. The simulation study is twofold; 1) Evaluation of denoising. 2) Evaluation
of latency tracking of ABR wave V.
5.3.1 Denoising
The reference signal for the simulation study is the synthetic ABR model defined in (3.3)
and illustrated in figure 5.1a, characterised with similar morphology. The dataset of
1200 ABRs (60 seconds of recording at a stimulus frequency of 20 Hz) with
no latency variations, as shown in figure 3.9a, was filtered with each wavelet method at
different SNRs. The noise addition was based on an SNR of -30 dB for a single sweep (also
known as the initial SNR). Accordingly, to represent block sizes of 8, 16, 32, 64, 128 and
256, Gaussian white noise was added to give SNRs of -22 dB, -18 dB, -15 dB, -12 dB,
-9 dB and -6 dB respectively, based on the theoretical SNR $= 10\log_{10}(\sqrt{N})$.
[Figure 5.8 diagram]
Figure 5.8: The constructed array with SWT coefficients to suit circular averaging of CSTD. Each decomposition level consists of N coefficients.
Parameters of the wavelet methods for the simulation study and results
CTMC used a threshold which retained the highest 20% of coefficients of wavelet
subbands A6, D6, D5 and D4. For TWMC, windows were defined for each wavelet subband
to suit the synthetic template, similar to figure 5.4, such that the minimum number of
coefficients was used to reconstruct a morphologically comparable template. According to
the original study (Causevic et al. 2005), the CSTD level threshold function was set to
$\delta_{l+1} = \frac{1}{\exp(l)}\,\delta_l$ with an initial value of $\delta_1 = 0.8$, and the
wavelet threshold function was set to $\delta_{w+1} = 2^{-w/2}\,\delta_w$ with an initial
value of $\delta_1 = 1$. New functions that would explicitly suit the ABR were not
investigated here. However, these were investigated in depth with the application to real ABRs.
The effect of wavelet filtering of ABRs, quantified in terms of improvement in SNR, is shown
in figure 5.9, which is directly comparable with the performance of the parametric modelling
in figure 4.7 (improvement in SNR is calculated according to (2.9)). In general, it suggests
that the wavelet methods are superior to conventional MTA, most prominently at low SNRs
(even though the improvement reduces as the initial SNR reduces). In comparison to the
parametric modelling, the wavelet methods show superior improvement at initial SNRs
above -25 dB and are about on par below that. These results therefore justify the decision
to use wavelet methods and the use of real ABR data for a comprehensive analysis.
[Figure 5.9 plot]
Figure 5.9: Improvement in the SNR with wavelet filtering methods suggests superior performance to that of parametric modelling.
5.3.2 Latency tracking
As pointed out in Chapter 2, tracking time-scale variations is critical in identifying
pathological conditions of a patient. Therefore, we used periodic (modulated) latency
variations with several datasets, each representing 60 s of a recording, as shown in figure 3.9.
The amplitudes of the latency variation were aL = 1, 2 ms and the frequencies of the
latency variation were fL = 0.025, 0.05, 0.1, 1 Hz, creating a total of 8 datasets. These
combinations were then used to assess the latency variation tracking capability of the
three wavelet denoising methods. It should be noted that these datasets are identical to
those used in the ARX simulation study in section 4.3.5.
To clarify the limitations of these methods and to assess the property of time-scale
(latency) variation tracking, the synthetic ABR with aL = 0 was used as the template in
CTMC and TWMC to extract variations. Similar to the simulation study in Chapter 4,
variations of wave V were extracted from the MTA and wavelet filtered ABR datasets for
comparison.
Results shown in figure 5.10 suggest that CSTD produced a comparatively low MSE,
implying superior latency tracking. However, contrary to the expectation that wavelet
filtered ABRs produce low MSEs, CTMC and TWMC indicate comparatively high MSE. The
underlying reason could be that the template used in these methods limits the extraction
of features with induced latency variations. This difference is more prominent when
aL = 2 ms than when aL = 1 ms. A one-way ANOVA at aL = 2 ms for each fL suggests a
significant difference between the mean of CSTD and those of the other methods (e.g. at
a block size of 32; F(2, 2997) = 134.64, p < 0.01). However, such a difference was not
recorded at aL = 1 ms (e.g. at a block size of 32; F(2, 2997) = 1.736, p = 0.1764).
This study provides the basis to conduct a similar analysis with real ABRs. Due to
the difficulty of obtaining sinusoidal variations in the latency, we used the L-I curve of
wave V by controlling the intensity of auditory stimuli.
5.4 Evaluation of wavelet methods on real ABR Data
Motivated by the promising results of the simulation study, the CTMC, TWMC and CSTD
wavelet filtering methods were applied to recorded ABR data to confirm and finalise
an effective method for the rapid extraction of a fully featured ABR from a minimum
block size. During the performance evaluation of these wavelet methods, the following
critical factors were considered:
i Denoising and the ability to produce a fully featured ABR.
ii The ability to track latency variations induced in ABR features.
Initial analysis was performed with DWT as the decomposition algorithm (similar to the
original studies of the three wavelet denoising methods). The choice of basis wavelet and
threshold functions was determined according to the DWT implementation. The performance
of denoising and latency tracking was then compared with the results generated from the
SWT decomposition algorithm. The comparison led to a conclusion regarding the
possible use of these methods in a system for rapid extraction of ABRs.
[Figure 5.10 plots]
Figure 5.10: Latency tracking results with the simulated ABR dataset. The difference between methods is prominent when aL = 2 ms; no such difference was recorded at aL = 1 ms.
5.4.1 Denoising ability of wavelet methods
The performance evaluation of denoising included quantitative measurements, such as MSE
and correlation coefficient, and qualitative peak detection by an expert. Because MSE and
the correlation coefficient compare two signals as a whole rather than their
individual features, we carried out a visual inspection to arrive at a thorough conclusion
on the minimum block size that gives optimum ABR features.
For evaluation purposes it was important to have a fully featured reference template
when calculating MSE. Therefore the fully featured template derived at 55 dB nHL,
mentioned in section 5.1.1, was used for this purpose. MSE was calculated for each of the
three wavelet methods at all block sizes. To establish a baseline comparison for the wavelet
methods, conventional band-pass filtering of 100-3000 Hz was also performed.
Initially, the analysis was carried out using conventional DWT, and its performance was
assessed in terms of the minimum block size which leads to rapid extraction. These results
were then compared with SWT derived results to arrive at a final conclusion on the performance.
5.4.2 Latency tracking ability of wavelet methods
As discussed in section 2.1.4, various pathological conditions affect the latency, so the
usefulness of these wavelet methods to patients in practice depends
upon the range of latency variations that can be accurately tracked. For the purpose of
this study, latency variations were induced by controlled variation of the stimulus intensity.
The latency tracking ability of the wavelet methods was then evaluated by comparing the L-I
curve of wave V derived using the grand average at each sound intensity level with that
derived from the wavelet filtered ABRs. The block size used to generate the L-I curve was
determined as per the denoising performance presented in section 5.5.3. Both DWT and SWT
decomposition algorithms were used with the three denoising methods to arrive at a
conclusion as to the optimum latency tracking method.
5.5 Results
5.5.1 Determination of a common basis wavelet for analysis
The MSE values shown in figure 5.11 result from using different orders of the Biorthogonal
basis wavelet in CTMC, TWMC and CSTD. In general, these MSE values have a common
trend of reduction as the order increases, attributed to the more accurate representation
of the ABR waveform morphology by the detail and approximation functions. A close
observation suggests that there is an optimum MSE value for the CSTD and TWMC methods at
Biorthogonal 5.5, whereas CTMC shows almost equal MSE values with Biorthogonal 5.5
and 6.8. Since low filter orders reduce computational complexity, processing time and
memory requirements, leading to rapid extraction, the Biorthogonal 5.5 basis wavelet was
considered to be the optimum choice, and will be used throughout this chapter. A similar
conclusion is supported by the work of Bradley & Wilson (2004).
[Figure 5.11 plot]
Figure 5.11: Effect of Biorthogonal basis wavelets on denoising methods. The performance of denoising methods CTMC, TWMC and CSTD in terms of MSE when different orders of the Biorthogonal basis wavelet are used (error bars represent SD). This suggests that Biorthogonal 5.5 is optimum for all three denoising methods.
5.5.2 Determination of CSTD level threshold function
The CSTD level threshold function had to be determined specifically for this study due to
the use of a different basis wavelet to that of the original study (Causevic et al. 2005). The
test dataset used for this evaluation is a block of 32 ABRs recorded at 55 dB nHL, with the
template (refer to figure 5.1) used for comparison, i.e. to calculate MSE. The six threshold
functions in (5.1) were tested with a range of initial values. The resultant MSE values are
plotted in figure 5.12.
The results confirmed our assumptions. The natural behaviour of the CSTD produced
fewer noise components at low CSTD levels, and the constant and increasing functions
produced high MSEs by removing wavelet coefficients relevant to the ABR at lower
levels. In contrast, each decreasing function had an optimum initial value within the
range of initial values tested. Out of the three decreasing functions tested,
$\delta_{l+1} = \frac{1}{\exp(l)}\,\delta_l$ produces the lowest MSE, at an initial value of
$\delta_1 = 0.8$. Therefore these settings are used for the CSTD level threshold function
from here onwards.
Note that the sudden termination of curves is due to thresholds above 100%
being applied to the wavelet coefficients, which left no coefficients from which to
reconstruct an ABR.
5.5.3 Noise reduction of wavelet methods with DWT
The noise reduction of the wavelet methods was evaluated using MSE and a visual
comparison, followed by a statistical significance test of correlation coefficients, to arrive
at the lower bound of the block size (the number of epochs required in the MTA) that produces
a fully featured ABR. The data used for this analysis were recorded at a sound intensity
of 55 dB nHL for all 8 participants as per section 3.1. At each block size, 769 ABRs were
extracted by sweeping through the total of 1024 epochs from each participant. Each
extracted ABR was filtered with CTMC, TWMC and CSTD using DWT and, for comparison
purposes, with conventional Butterworth band-pass filtering with cut-off frequencies at
100 Hz and 3000 Hz. The MSE was calculated for each filtered ABR
with reference to the grand average at 55 dB nHL of the respective participant.
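The block extraction and MSE calculation described above can be sketched as follows. The sketch assumes that "sweeping" means sliding the block start by one epoch at a time, which is consistent with 1024 − 256 + 1 = 769 blocks at the largest block size; the thesis text does not state the step explicitly. The function names and toy epochs are hypothetical, and the wavelet filtering step between averaging and MSE is omitted.

```python
def block_averages(epochs, block_size, n_blocks):
    """MTA of `block_size` consecutive epochs for each of `n_blocks` start
    positions, sliding by one epoch (an assumption, see lead-in)."""
    n_samples = len(epochs[0])
    averages = []
    for start in range(n_blocks):
        block = epochs[start:start + block_size]
        averages.append([sum(e[k] for e in block) / block_size
                         for k in range(n_samples)])
    return averages


def mse(signal, reference):
    """Mean squared error between a signal and a reference of equal length."""
    return sum((s - r) ** 2 for s, r in zip(signal, reference)) / len(reference)


# Number of start positions available at the largest block size in the text
n_blocks_at_256 = 1024 - 256 + 1

# Toy data: 8 constant "epochs" of 2 samples each (not real ABR recordings)
toy_epochs = [[float(i)] * 2 for i in range(8)]
toy_average = block_averages(toy_epochs, block_size=4, n_blocks=5)
grand_average = [3.5, 3.5]
toy_mse = [mse(a, grand_average) for a in toy_average]
print(n_blocks_at_256, toy_mse)
```

In the analysis itself each block average would first be denoised (CTMC, TWMC, CSTD or band-pass filtering) before its MSE against the participant's grand average is taken.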
The average of MSE values across all the participants at each block size for the different
filtering methods is plotted in figure 5.13. As expected, these values show an exponential
reduction in MSE as the block size increases. It is also evident that all wavelet methods
perform better than conventional band-pass filtering (i.e. MTA) at
all block sizes, with a greater effect at small block sizes, suggesting that wavelets are
most effective at low SNRs. Among the wavelet filtering methods, CSTD in general produces
a superior performance according to the MSE compared to CTMC
and TWMC at all block sizes. The observation is statistically justified by the one-way
ANOVA (p < 0.01) results shown in table 5.3 and the specific Tukey post-hoc comparison
results between CSTD-TWMC and CSTD-CTMC in table 5.4. Therefore the early indication is
that CSTD is a potential method to rapidly extract the ABR.
[Figure 5.12 plot. Legend: $\delta_{l+1} = \frac{1}{\exp(l)}\delta_l$; $\delta_{l+1} = \frac{1}{2^{l/2}}\delta_l$; $\delta_{l+1} = \frac{1}{l}\delta_l$; $\delta_{l+1} = \frac{1}{l^{2}}\delta_l$; $\delta_{l+1} = \delta_l$; $\delta_{l+1} = 0.05\,l + \delta_l$]
Figure 5.12: Effect of level threshold functions in CSTD, showing the MSE of the filtered signal according to the level threshold function and the initial value used. The function 1/exp(l) at an initial value of 0.8 has the minimum MSE.
However, in determining the smallest block size for rapid extraction, while MSE values
indicate the noise reduction aspect, they do not represent the quality of the actual filtered
signal. Also, the MSE value does not indicate the extent to which the filtered ABR in a
particular block size is close to the grand averaged template. Therefore a visual comparison
was carried out to check which block size produces a detectable ABR.
Figures 5.14, 5.15 and 5.16 show the surface plots of the same ABR dataset filtered
Wavelets in rapid extraction of the ABR 137
Figure 5.13: Denoising effect of Wavelet methods. Average MSE of 8 participants for each wavelet methodand band-pass filtering for all block sizes at a sound intensity level of 55 dB nHL. Wavelet methods arebetter than conventional band-pass filtering and CSTD produces the lowest MSE among wavelet methodsat any given block size.
Table 5.3: One-way ANOVA results comparing MSEs produced by ABRs filtered from CSTD, TWMC and CTMC suggest that there is a significant difference between the group MSE means. The specific differences are identified with a Tukey post-hoc comparisons study from which the results are tabulated in table 5.4

Block size   df          F       p
256          (2,18453)   222.9   <0.01
128          (2,18453)   175.8   <0.01
64           (2,18453)   138.4   <0.01
32           (2,18453)   181.8   <0.01
16           (2,18453)   245.2   <0.01
8            (2,18453)   84.7    <0.01
Table 5.4: Tukey post-hoc comparison of CSTD against TWMC and CTMC suggests there exists a significant difference in mean MSE at all block sizes (as the CI does not include zero), confirming the superior performance of CSTD

Block size   Comparison against   Mean difference   CI
256          TWMC                 -0.0050           -0.0058 : -0.0043
             CTMC                 -0.0042           -0.0049 : -0.0034
128          TWMC                 -0.0086           -0.0103 : -0.0069
             CTMC                 -0.0103           -0.0120 : -0.0086
64           TWMC                 -0.0200           -0.0160 : -0.0120
             CTMC                 -0.0261           -0.0221 : -0.0181
32           TWMC                 -0.0431           -0.0341 : -0.0250
             CTMC                 -0.0678           -0.0588 : -0.0498
16           TWMC                 -0.0905           -0.1128 : -0.0683
             CTMC                 -0.1687           -0.1910 : -0.1465
8            TWMC                 -0.1768           -0.2243 : -0.1292
             CTMC                 -0.1905           -0.2380 : -0.1429
with the three wavelet methods. The six plots in each figure represent block sizes from 256 to
8, with 769 epochs along the y-axis. Wave V, at a latency of 6 ms, is clearly visible as
a vertical red strip at a block size of 256 in all the wavelet methods. This strip gradually
becomes overshadowed by noise towards smaller block sizes. Due to this prominence, wave
V was considered a good indicator of the effect of denoising. It could be
observed that below a block size of 32, wave V cannot be distinguished from noise in any of
the wavelet methods.
To confirm this visual observation, a statistical analysis was carried out using correlation
coefficients between the template and the ABRs denoised with CTMC, including
data from all the participants. CTMC was chosen as the worst case scenario
according to the MSE values. A one-way ANOVA suggests that there is a significant
difference of means between block sizes, F(5, 4608) = 287.77, p < 0.01, suggesting
the presence of a block size that is not contaminated with noise. The comparison of these
mean correlation coefficients is illustrated in figure 5.17. Tukey post-hoc comparisons of
the block sizes revealed that the difference between the correlation coefficients at a
block size of 8 (M = 0.355, 95% CI [0.334, 0.377]) and 16 (M = 0.393, 95% CI [0.371, 0.415])
was not significant (p = 0.51), suggesting similar interference from noise. However,
confirming the visual observation, a block size of 32 (M = 0.453, 95% CI [0.432, 0.474])
revealed a significant difference compared to the noise corrupted block size of 16, with
p < 0.01.
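The correlation coefficient used in this comparison can be sketched as follows. The example is illustrative only: the waveforms are synthetic, the noise level is arbitrary, and `pearson_r` and `noisy_average` are hypothetical helpers rather than the code used for the analysis.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)
n_samples = 128
# Hypothetical template: three cycles of a sine wave, not a real ABR.
template = [math.sin(2 * math.pi * 3 * i / n_samples) for i in range(n_samples)]


def noisy_average(block_size):
    """Average of `block_size` synthetic sweeps (template plus white noise)."""
    acc = [0.0] * n_samples
    for _ in range(block_size):
        for i, s in enumerate(template):
            acc[i] += s + rng.gauss(0.0, 3.0)
    return [a / block_size for a in acc]


r_by_block = {size: pearson_r(template, noisy_average(size))
              for size in (8, 32, 256)}
for size, r in r_by_block.items():
    print(f"block size {size:3d}: r = {r:.3f}")
```

In this toy setting the coefficient rises towards 1 as the block size grows, which mirrors the trend in the mean correlation coefficients described above without reproducing the reported values.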
From all these results we concluded that ABR extraction could be performed at a
rapid rate using an ensemble of only 32 epochs, compared to the conventional 1024 epochs.
The argument is further strengthened by examining a randomly selected block of
size 32 filtered with the three wavelet methods, shown in figure 5.18, which illustrates their
ability to extract ABR features of the template. CTMC extracted waves III and V but failed to
extract waves I and II. A spurious peak, which was not in the template, can also be seen
after wave V. TWMC shows only wave V with a distorted combination of waves II and
III, while wave I appears to be absent. It is possible that this effect is caused by the
predetermined windows in TWMC. In contrast, CSTD is able to extract waves I, II, III
and V. An important observation here is that wave IV, which is difficult to observe, is also
visible immediately before wave V (similar to the ideal ABR shown in figure 2.2). This
example demonstrates one of the important aspects of rapid extraction: a reduced
number of averages can provide more information and variations than a grand averaged ABR.
One could argue that false peaks to the right of wave V are noise, but a clinician would
know that these are outside the range of the peaks of interest, therefore the chances of being
misled are minimal. In contrast, the band-pass filtered ABR contains a large amount
of noise from spontaneous EEG. A trained eye could identify waves III and V, but with
minimal accuracy of their latencies.
5.5.4 Fsp threshold in quantifying the effectiveness of wavelet filtered
ABRs
It is arguable that statistical methods such as Fsp could be used to quantify the noise
associated with wavelet filtered ABRs. However, such a claim should be systematically
investigated as there could be a contradiction of the underlying assumptions. The main
assumption in the Fsp calculation is that the noise associated with an MTA of a signal is
F distributed with degrees of freedom fixed at v1 = 5 and v2 = 250. It is apparent that
MTA is not the only filter imposed in wavelet filtering (or in ARX modelling). Therefore,
the degrees of freedom of the F distribution could be different. The potential effect on
[Figure 5.14: surface plots, panels (a) 256, (b) 128, (c) 64, (d) 32, (e) 16 and (f) 8; x-axis: Time (ms), y-axis: Epochs]
Figure 5.14: Normalised surface plots of 769 filtered ABRs with CTMC of a participant at block sizes from 256 down to 8. The highlighted vertical strip around 6 ms shows the wave V. It gradually becomes obscured by noise when block size is reduced. The smallest block size where the wave V was readily identifiable was 32.
[Figure 5.15: surface plots, panels (a) 256, (b) 128, (c) 64, (d) 32, (e) 16 and (f) 8; x-axis: Time (ms), y-axis: Epochs]
Figure 5.15: Normalised surface plots of 769 filtered ABRs with TWMC of a participant at block sizes from 256 down to 8. The highlighted vertical strip around 6 ms shows the wave V. It gradually becomes obscured by noise when block size is reduced. The smallest block size where the wave V was readily identifiable was 32.
[Figure 5.16: surface plots, panels (a) 256, (b) 128, (c) 64, (d) 32, (e) 16 and (f) 8; x-axis: Time (ms), y-axis: Epochs]
Figure 5.16: Normalised surface plots of 769 filtered ABRs with CSTD of a participant at block sizes from 256 down to 8. The highlighted vertical strip around 6 ms shows the wave V. It gradually becomes obscured by noise when block size is reduced. The smallest block size where the wave V was readily identifiable was 32.
Figure 5.17: Mean correlation coefficients between the template and CTMC filtered ABRs at different block sizes suggesting a significant difference between the block size of 32 and noise corrupted block size of 16. Error bars represent 95% confidence interval
the threshold in such a difference is explained below.
In the original study, it was estimated that the MTA of 250 epochs of EEG noise follows
F(v1, v2) statistics with v1 = 5 and v2 = 250, where v1 and v2 are the degrees of freedom of
the numerator and denominator respectively (Elberling & Don 1984). For the
current study, assuming v1 = 5, the F(5, v2) values derived for v2 = 1 to 1024 are shown in
figure 5.19. It is reasonable to state that the deviation from the established threshold of
F(5, 250) = 3.1 is minimal at v2 = 1024, considering the natural variation of Fsp.
However, a slight variation in v1 could have an impact on the standard threshold,
as is evident in figure 5.19 with F(10, 250) = 2.392. As per Elberling & Don (1984), v1
depends on the filter imposed on white noise, e.g. for white noise v1 = number of data
points in the epoch (160), whereas for pink noise v1 = 15.
Considering the nonlinearity of the wavelet filtering imposed on the ABR, it is not correct
to use v1 = 5, and hence the threshold of Fsp = 3.1. Determination of a new Fsp threshold for
wavelet filtering is outside the scope of this thesis. Therefore Fsp values were used
neither for objective quantification of wavelet filtered ABRs nor for ARX
model derived ABRs in this thesis.
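The dependence of the threshold on the degrees of freedom can be checked numerically. The following is a minimal sketch, assuming the threshold is the upper 1% point of the F distribution (an assumption chosen because it is consistent with the values quoted above) and using simple numerical integration instead of a statistics library. The function names are hypothetical.

```python
import math


def f_pdf(x, v1, v2):
    """Probability density of the F distribution with (v1, v2) degrees of freedom."""
    if x <= 0.0:
        return 0.0  # correct for v1 > 2, which holds for the cases below
    log_pdf = (math.lgamma((v1 + v2) / 2) - math.lgamma(v1 / 2) - math.lgamma(v2 / 2)
               + (v1 / 2) * math.log(v1 / v2)
               + (v1 / 2 - 1) * math.log(x)
               - ((v1 + v2) / 2) * math.log1p(v1 * x / v2))
    return math.exp(log_pdf)


def f_cdf(x, v1, v2, n=2000):
    """CDF by composite Simpson integration of the density on [0, x] (n even)."""
    if x <= 0.0:
        return 0.0
    h = x / n
    total = f_pdf(0.0, v1, v2) + f_pdf(x, v1, v2)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f_pdf(k * h, v1, v2)
    return total * h / 3


def f_critical(v1, v2, alpha=0.01, lo=0.0, hi=50.0):
    """Value exceeded with probability alpha, found by bisection on the CDF."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if f_cdf(mid, v1, v2) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for v1, v2 in ((5, 250), (5, 1024), (10, 250)):
    print(f"F({v1}, {v2}) threshold at alpha = 0.01: {f_critical(v1, v2):.3f}")
```

Under the stated assumption the sketch should return values close to the 3.1 and 2.392 quoted above for F(5, 250) and F(10, 250), and it shows that changing v1 moves the threshold far more than extending v2 from 250 to 1024.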
Figure 5.18: Denoised ABRs at a block size of 32. Comparison of a typical ABR at a block size of 32 processed with wavelet methods and band pass filtered. Underlying template features are mostly present in the ABR at CSTD
Figure 5.19: Effect of the degrees of freedom of F(v1, v2) statistics on the threshold criteria. The threshold curve where v1 = 5 differs markedly from the threshold curve where v1 = 10. Here, v2 = 1 to 1024. v1 depends upon the filter imposed, therefore direct application of Fsp = 3.1 is not suitable.
5.5.5 Comparison of noise reduction between DWT and SWT
To compare the denoising of the DWT and SWT decomposition algorithms, the same real ABR
dataset and identical settings were used as in section 5.5.3. The results are shown in
figure 5.20, with 5.20a and 5.20b showing the results of SWT and DWT respectively.
It could be observed that the trend of MSEs produced by both SWT and DWT remains
the same across denoising methods: CSTD produces the minimum MSE, followed
by TWMC and CTMC. A closer visual inspection suggests an improvement in mean MSE
of SWT compared to DWT in TWMC and CSTD, as opposed to minimal improvement
in CTMC. A paired t-test was performed to statistically justify the results across the 8
participants (df = 7 and at 5% significance). The summary of the t-test tabulated in
table 5.5 confirms the visual inspection. For CTMC it suggests similar mean values of MSEs,
the only exception being a block size of 8. On the contrary, with TWMC and
CSTD, mean MSEs are significantly different, the only exception being CSTD at a block
size of 256.
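The paired t-test described above reduces to a short calculation on the per-participant differences. The sketch below uses made-up MSE values for 8 hypothetical participants, so its output does not correspond to any entry in table 5.5; `paired_t` is an illustrative helper.

```python
import math


def paired_t(a, b):
    """t statistic and degrees of freedom for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)  # sample variance
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1


# Hypothetical per-participant mean MSEs (8 participants), NOT data from this study.
mse_dwt = [0.21, 0.25, 0.19, 0.30, 0.27, 0.22, 0.24, 0.28]
mse_swt = [0.17, 0.22, 0.18, 0.24, 0.23, 0.20, 0.19, 0.25]

t_stat, df = paired_t(mse_dwt, mse_swt)
T_CRIT_5PCT_DF7 = 2.365  # two-tailed critical value of Student's t for df = 7
print(f"t({df}) = {t_stat:.3f}, significant at 5%: {abs(t_stat) > T_CRIT_5PCT_DF7}")
```

A positive t statistic here means the DWT errors exceed the SWT errors on average; whether the difference is significant depends on comparing |t| with the critical value for df = 7.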
The reason for the improvement could be the advantage of shift invariance in
SWT. Even at a constant sound intensity level, there could be temporal variations in the
ABR due to imperfections in the recording setup and the physiological condition of the
participant. As SWT is shift invariant, such time jitter does not distort the decomposition,
so a closer approximation to the template could be achieved than with the DWT.
Figure 5.20: Comparison of denoising of SWT and DWT. (a) MSE using SWT (b) MSE using DWT. SWT produces less error than DWT
Table 5.5: Results of paired t-test (df = 7 and at 5% significance) between the MSEs of DWT and SWT denoised ABRs.

             CTMC                            TWMC                            CSTD
Block size   t       p       CI              t       p       CI              t       p       CI
256          0.381   0.714   -0.001  0.001   4.426   0.003   0.005  0.015    1.609   0.152   -0.001  0.006
128          1.08    0.316   -0.001  0.004   4.335   0.003   0.008  0.027    5.183   0.001   0.007   0.018
64           1.831   0.11    -0.001  0.01    4.326   0.003   0.015  0.05     5.977   0.001   0.019   0.045
32           2.134   0.07    -0.001  0.02    4.234   0.004   0.028  0.099    5.936   0.001   0.046   0.107
16           1.504   0.176   -0.009  0.042   4.518   0.003   0.065  0.208    5.697   0.001   0.103   0.249
8            4.344   0.003   0.035   0.119   3.921   0.006   0.131  0.527    4.615   0.002   0.206   0.638
The statistical results are further confirmed by individual ABRs extracted from the SWT
and DWT processes. Figure 5.21 presents such randomly selected ABRs (corresponding
to each other) at a block size of 32, filtered by the three wavelet methods (CTMC, TWMC
and CSTD) together with the conventional band-pass filter (MTA). A block size of 32 was
selected for comparison with the results from section 5.5.3. It is clearly visible in figure
5.21a and 5.21c that CTMC and CSTD filtered ABRs with SWT produce closer amplitudes to the
template for wave V than with DWT. In general, SWT filtered ABRs tended to produce
amplitudes closer to those of the template than DWT. However, the lack of waves
I, II and III in CTMC filtered ABRs resulted in high MSEs, and therefore it was not
ideal for feature extraction. In figure 5.21b and 5.21d, a comparison of SWT and DWT
reiterates the reason for the low MSEs with SWT. In this randomly selected epoch, the
large noise component at the start of the ABR was largely removed by TWMC and
CSTD with SWT but remained as residue with DWT. CTMC does not suppress that particular
noise component, thus revealing one of its disadvantages: in the presence of noise components
of a similar magnitude, CTMC incorrectly detects these noise components as a related
ABR wave component. In conclusion, out of TWMC and CSTD implemented with both
SWT and DWT decomposition algorithms, CSTD with SWT yields arguably the best
estimates of the ABR.
5.5.6 Latency tracking results of wavelet methods with DWT
The ability of the wavelet methods to track latency variations is evaluated using the L-I
curve of wave V. The grand averaged (MTA of 1024) L-I curve of the recorded
ABRs is well within the standard model (3.2), as per section 3.2.1. Therefore this
[Figure 5.21: panels (a) SWT-106, (b) DWT-106, (c) SWT-724 and (d) DWT-724, each showing CTMC, TWMC, CSTD and band-pass filtered traces]
Figure 5.21: The plot of the effect of denoising of SWT and DWT on random ABRs. At a block size of 32 filtered with different filter methods using SWT and DWT.
curve was considered as the benchmark against which the L-I curves derived from wavelet
filtered ABRs were compared.
All the data from 8 participants were then independently filtered with the three wavelet
methods using DWT, and the wave V latencies were extracted using settings similar to
those of the grand average (refer to section 3.2). The resultant curves and the overall average
across all the participants are shown in figure 5.22. It should be noted that the template
ABR used for CTMC and TWMC was derived from the grand average at 55 dB nHL for
each participant and was kept constant while denoising ABRs at all sound intensities for
that participant. This allows us to determine the range of latency variations
that could be tracked using a constant template with CTMC and TWMC. In addition,
such a constant template represents a practical situation where the
actual ABR of a patient is a priori unknown.
The overall results in figure 5.22 indicate that the L-I curve derived from CSTD follows
the reference curve better than the CTMC and TWMC derived curves. This difference is
prominent at low sound intensities. The behaviour of the L-I curves derived
by CTMC and TWMC shows that their variation tracking is limited compared
to that of CSTD. The individual L-I curves follow a similar trend except in the case of
Female(31) and Female(27), where all the wavelet methods follow the reference curve at
all sound intensities. The limitations of latency tracking based on these results will be
discussed in more detail in section 5.6.3.
To confirm this visual observation, a statistical analysis was performed by approximating
similar polynomial models (as in (5.1)) for the derived L-I curves. The coefficients
calculated for each wavelet method and the grand average (1024), derived from 800 epochs
across 8 participants including 14 sound intensity levels, are tabulated in table 5.6 in the
form of log10(L) = a1I + a2. The comparative plot of these estimated models is shown
in figure 5.23. A one sample t-test was carried out to statistically quantify the significance
of these estimations. It was hypothesised that a1 and a2 of the filtered curves are similar to
those of the theoretical curve.
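Estimating a1 and a2 for a model of this form is an ordinary least-squares fit of log10(L) against I. The sketch below is an illustration only: the intensity grid is hypothetical, the latencies are generated from the theoretical coefficients listed in table 5.6 rather than measured, and `fit_line` is not the routine used in this study.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = a1 * x + a2; returns (a1, a2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a1 = sxy / sxx
    a2 = mean_y - a1 * mean_x
    return a1, a2


# Hypothetical intensity grid (dB nHL); the levels used in the study may differ.
intensities = list(range(10, 80, 5))  # 14 levels: 10, 15, ..., 75

# Synthetic wave V latencies (ms) generated from the theoretical coefficients
# a1 = -0.0025, a2 = 0.924, so the fit should simply recover them.
latencies = [10 ** (-0.0025 * i + 0.924) for i in intensities]

a1, a2 = fit_line(intensities, [math.log10(l) for l in latencies])
print(f"a1 = {a1:.4f}, a2 = {a2:.4f}")
```

With measured latencies in place of the synthetic ones, the fitted a1 and a2 would be the quantities compared against the theoretical values by the t-test.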
According to the t-test results in table 5.7, the grand averaged L-I curve does not have
a significant difference in the mean compared to the theoretical curve, with the p values of
both a1 and a2 greater than 0.01. Therefore in general, the data collected for this study
are valid and further processing of these data should also produce
valid results. Among the three curves derived with wavelet methods, only CSTD produced
a curve (t(799) = -1.4018, p = 0.1615, 95% CI for the mean (-0.0026, -0.0025)) with no
significant difference from the slope (a1) of the theoretical curve. The difference in the
intercept (a2) could be due to systematic differences in the experimental setup which
affect the overall recording voltage. The curves derived from both CTMC and TWMC
indicate significant differences in the slope (a1), and these methods are thus less effective
than CSTD.
Figure 5.22: Latency tracking with wavelet methods using DWT. Derived L-I curves from the three wavelet methods with DWT and the grand average for individual participants and the overall effect. Overall L-I curves suggest CSTD is close to the benchmark curve. To identify inter-subject variability, L-I curves of individual participants are plotted with 4 male and 4 female participants with their ages shown in brackets.
Figure 5.23: L-I curves derived from estimated models according to log10(L) = a1I + a2 with DWT. These curves represent each filtering method using ABR data from all participants. The estimated model coefficients are shown in table 5.6.
Method        a1        a2
Theoretical   -0.0025   0.924
1024          -0.0026   0.9306
CTMC          -0.002    0.9044
TWMC          -0.0019   0.8978
CSTD          -0.0026   0.9203

Table 5.6: Coefficients of the estimated models of the L-I curves according to log10(L) = a1I + a2 derived using DWT.
         a1                                      a2
Method   t         p        CI                   t          p        CI
1024     -0.8677   0.4143   -0.0028 : -0.0023    0.7451     0.4805   0.9097 : 0.9514
CTMC     15.4439   < 0.01   -0.0021 : -0.0020    -11.5464   < 0.01   0.9011 : 0.9077
TWMC     22.031    < 0.01   -0.0019 : -0.0018    -12.9731   < 0.01   0.8939 : 0.9018
CSTD     -1.4018   0.1615   -0.0026 : -0.0025    -3.203     0.0014   0.9180 : 0.9226

Table 5.7: Results of the t-test to determine the significant difference between the derived curves and the theoretical using DWT as the decomposition algorithm. Null hypothesis: equal means to the theoretical curve (P < 0.01), df = 799
5.5.7 Latency tracking results of wavelet methods with SWT
The same datasets were used as in the previous section 5.5.6 and were processed inde-
pendently with the three wavelet denoising methods using SWT as the decomposition
algorithm instead of DWT. The resultant average curve across all participants and indi-
vidual curves are shown in figure 5.24. It should be noted that the ABR template used
for CTMC and TWMC was derived from the grand average of data at 55 dB nHL for
each participant and was kept constant while denoising ABRs at all sound intensities for
each participant. These results closely resemble the curves derived from DWT. The effect
of using the template is visible in CTMC and TWMC with deviated curves at low sound
intensity levels.
A statistical analysis similar to that of the previous section for DWT was carried out to
confirm the visual observation when SWT was used as the decomposition algorithm. The
coefficients calculated for each wavelet method and the grand average (1024), derived from
800 epochs across 8 participants including 14 sound intensity levels, are tabulated in
table 5.8 in the form of log10(L) = a1I + a2. The plot of these models in figure 5.25
with the theoretical curve suggests that the grand averaged and the CSTD derived curves
are a visually closer approximation to the theoretical curve than the CTMC and
TWMC derived curves. A one sample t-test was carried out to statistically quantify the
significance of these estimations. It was hypothesised that a1 and a2 of the filtered curves are
similar to those of the theoretical curve.
According to the t-test results in table 5.9, among the three curves derived with wavelet
methods, only CSTD produced a curve (t(799) = -0.9062, p = 0.3950, 95% CI for the mean
(-0.0030, -0.0023)) with no significant difference from the slope (a1) of the theoretical
curve. The curves derived from both CTMC and TWMC indicate significant differences in
the slope (a1), thus these methods are less effective in latency tracking compared to CSTD.
In addition, the SWT derived p values are higher than those derived with DWT, suggesting the
positive effect of shift-invariance in SWT. Supporting this argument, the intercept (a2)
of the CSTD derived L-I curve shows no significant difference compared to the theoretical
curve.
Figure 5.24: Latency tracking with wavelet methods using SWT. Derived L-I curves from the three wavelet methods with SWT and the grand average for individual participants and the overall effect. Overall L-I curves suggest CSTD is close to the benchmark curve. To identify inter-subject variability, L-I curves of individual participants are plotted with 4 male and 4 female participants with their ages shown in brackets. These results closely resemble the L-I curves derived using DWT.
Figure 5.25: L-I curves derived from estimated models according to log10(L) = a1I + a2 with SWT. These curves represent each filtering method using ABR data from all participants. The estimated model coefficients are shown in table 5.8
Method          a1        a2
Theoretical     -0.0025   0.924
Grand average   -0.0026   0.9306
CTMC            -0.0023   0.906
TWMC            -0.0019   0.8813
CSTD            -0.0026   0.9265

Table 5.8: Coefficients of the estimated models of the L-I curves according to log10(L) = a1I + a2 derived using SWT.
                a1                                      a2
Method          t         p        CI                   t         p        CI
Grand average   -0.8677   0.4143   -0.0028 : -0.0023    0.7451    0.4805   0.9097 : 0.9514
CTMC            2.3677    0.0498   -0.0025 : -0.0021    -5.6873   < 0.01   0.8985 : 0.9135
TWMC            6.9367    < 0.01   -0.0021 : -0.0016    -16.822   < 0.01   0.8753 : 0.8873
CSTD            -0.9062   0.395    -0.0030 : -0.0023    0.5682    0.5877   0.9161 : 0.9368

Table 5.9: Results of the t-test to determine the significant difference of the derived curves compared to the theoretical. Null hypothesis is equal mean to the theoretical curve (P < 0.01), df = 799. SWT is used as the decomposition algorithm.
5.6 Discussion
5.6.1 Evaluation of de-noising capacity of wavelet methods using DWT
The wavelet methods evaluated in this study are based on the common hypothesis that the
wavelet coefficients nullified by thresholding are related to spontaneous EEG noise,
while the retained coefficients are generated by the ABR. The extent to which each of these
wavelet methods satisfied this hypothesis appeared to differ in the results obtained
in sections 5.5.3 and 5.5.5. The purpose of this discussion is to critically determine the
advantages and disadvantages of these methods and arrive at an efficient and reliable
wavelet method for rapid extraction of the ABR. Therefore the discussion begins with
a summary of the performance of each wavelet method: CTMC, TWMC and CSTD.
In CTMC, the use of a constant threshold of 20% at all the wavelet decomposition levels
is improved by matching the thresholded coefficients with those of the template. Here,
similar thresholds are applied to high frequency noise and low frequency ABR coefficients.
Therefore high frequency components that are relevant to ABR features were removed,
producing ABRs with distorted morphology (absence of waves I and III), as evident
in figures 5.18 and 5.21. The inability to extract such features leads to high MSEs, as
evident in figure 5.13. On the other hand, the ability to extract only wave V without the
other wave components could be beneficial in applications such as hearing screening,
where the presence of wave V indicates that the sound has been heard by the patient.
The use of temporal windows in TWMC appears to produce a lower MSE than CTMC
according to figure 5.13. The implementation of temporal windows targets exactly
the relevant coefficients of the ABR and extracts them. The effect of nullifying
irrelevant coefficients towards the end of the ABR epoch (7 ms to 10 ms) is clearly visible
in figure 5.15. However, some spurious effects could be observed in the TWMC filtered ABR
in figure 5.18 in the range of 1 ms to 5 ms. This could be due to the close proximity of
waves I and III, which leads to the temporal windows being defined too close to each
other in that range. Originally TWMC was implemented to extract features of VEPs and
MLAEPs, whose features are located well apart from each other on the time-scale
(Quian Quiroga 2005, Quian Quiroga 2000). It should be noted, however, that distortion of
waves I and III is not always the case for this method, as shown by the evidence in Appendix F.
In contrast, the performance of the CSTD algorithm is superior to that of CTMC
and TWMC in terms of MSE and the extraction of features from the ABR. This
enhancement is due to the systematic nullification of coefficients and additional averages
to remove uncorrelated ongoing EEG. Compared to CTMC, this algorithm uses a variable
threshold on each decomposition level, so that more coefficients are nullified at the higher
levels, which include high frequency noise, and more coefficients are retained at the lower
levels, which are related to low frequency ABR components. In addition, the CSTD level
threshold removes more noise components, resulting in a closer match to the template. The
most impressive outcome of this method is the presence of small wave components such as
waves I and III, and potentially even waves II and IV, with an ensemble of just 32 epochs.
In summary, when considering the rapid denoising aspect of the CTMC, TWMC and
CSTD wavelet algorithms, CSTD performs best with the DWT decomposition algorithm.
Another unique feature of CSTD is its independence from a template. The importance of this
feature, related to tracking variations of the ABR latency, is explained in section 5.6.3.
5.6.2 Performance comparison of DWT and SWT decomposition
algorithms
It is well known that the shift-variance of the DWT, as mentioned in section 5.1.6,
distorts signals with a temporal shift, and SWT is one of the alternatives to prevent
such aberrations. The earlier studies published on CTMC (Zhang et al. 2006), TWMC
(Quian Quiroga 2000, Quian Quiroga 2005) and CSTD (Causevic et al. 2005) use DWT
wavelet decomposition algorithms. This study extends these methods by replacing the
DWT with the SWT decomposition algorithm to avoid shift-variance. Despite the
computational complexity and redundancy of SWT, the hypothesis was that the
SWT would yield better results compared to DWT.
In agreement with the hypothesis, the MSE of the SWT filtered signals was lower than
that of the DWT filtered signals, as evidenced in figure 5.20 and supported by two
typical ABRs shown in figure 5.21. This illustrates that distortions are minimised
when SWT is used in comparison with DWT, and CSTD still maintains a low MSE with
the SWT implementation.
All this evidence suggests that the combination of CSTD with SWT is the best
method of denoising the ABR with minimum distortion in the amplitude and latency of its
features (quantified by MSE). However, it is important to consider the processing time of
the SWT, given that these methods are intended for a rapid extraction system.
Doubling of coefficients at each decomposition level creates a time delay. On the other hand,
given the improved accuracy of the result, a high speed processor could easily be used in
such devices to reduce the processing time with reasonable cost effectiveness.
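The shift-variance of a decimated transform and the redundancy of an undecimated one can both be seen in a toy example. The sketch below assumes a single decomposition level, the Haar wavelet and circular boundary handling, all chosen for brevity; it is not the wavelet, depth or implementation used in this study, and the function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt_detail(x):
    """Level-1 Haar detail coefficients with decimation (len(x) / 2 values)."""
    return [(x[2 * k] - x[2 * k + 1]) / SQRT2 for k in range(len(x) // 2)]


def haar_swt_detail(x):
    """Level-1 undecimated Haar detail coefficients (len(x) values, circular)."""
    n = len(x)
    return [(x[i] - x[(i + 1) % n]) / SQRT2 for i in range(n)]


def shift(x, k=1):
    """Circularly delay a sequence by k samples."""
    return x[-k:] + x[:-k]


def energy(c):
    """Sum of squared coefficients."""
    return sum(v * v for v in c)


signal = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]  # toy pulse, not an ABR
delayed = shift(signal)

dwt_a, dwt_b = haar_dwt_detail(signal), haar_dwt_detail(delayed)
swt_a, swt_b = haar_swt_detail(signal), haar_swt_detail(delayed)

print("DWT detail energy:", energy(dwt_a), "->", energy(dwt_b))
print("SWT detail energy:", energy(swt_a), "->", energy(swt_b))
print("SWT coefficients simply delayed:", swt_b == shift(swt_a))
print("coefficients per level:", len(dwt_a), "(DWT) vs", len(swt_a), "(SWT)")
```

A one-sample delay changes the DWT detail coefficients of this pulse completely, whereas the undecimated coefficients are only delayed by the same amount. The price is twice as many coefficients at this level, which is the redundancy referred to above.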
5.6.3 Evaluation of latency tracking of wavelet methods with DWT
and SWT
The ability to track time scale variations is a key feature that should be integrated in
an algorithm used to extract ABRs. This feature becomes critical when
monitoring a patient whose condition varies, for example during intraoperative monitoring,
during drug administration, or in cases where a patient
may be suffering from a neurological disorder. This thesis reports an analysis of the ability
of three wavelet algorithms to track such temporal variations (in terms of latency). The
following discussion assesses the optimum wavelet denoising method and
decomposition algorithm for tracking latency variations, and determines the limitations
of using a template.
The overall performance of tracking latency variations using the DWT decomposition
algorithm in figure 5.22 indicates that CSTD performs better than CTMC and TWMC.
This observation is statistically confirmed by results presented in table 5.7. Additionally,
a closer observation suggests contrasting behaviour at high sound intensity levels and low
sound intensities.
At higher intensity levels, all wavelet methods follow the L-I curve of the template. For
CTMC and TWMC, this is due to the effect of the fixed reference template at 55 dB nHL.
In contrast, at low sound intensity levels, the overall L-I curve derived by CTMC suggests
that deviation from the benchmark curve starts at a sound intensity level of
25 dB nHL. This is likely to be a result of using a fixed reference template. Therefore
the variation that can be tracked with CTMC is limited to ±1 ms with respect to the
latency of the reference template. The individual ABRs observed in figure 5.18 indicate
that CTMC neglects the smaller peaks, waves I and II, which may be critical in clinical
examinations. Therefore CTMC has two disadvantages in the context of this
study:
i Inability to track latency variation greater than ±1 ms (in the case of constructing
the L-I curve in audiometry).
ii Inability to produce a fully featured ABR.
However, modifying CTMC to use a continuously updating template would be a worthwhile
endeavour for a future study.
TWMC produces an overall L-I curve close to the benchmark curve at sound intensities
greater than 35 dB nHL, with a mean difference of 0.2 ms (SD=0.03), but below 35 dB nHL it
shows a larger deviation from the benchmark, with a mean difference of 1.56 ms (SD=0.04),
than CTMC, with 1.03 ms (SD=0.02). Since this method uses temporal windows defined at
55 dB nHL, latency variations could not be tracked below 35 dB nHL. According to
figure 5.22, the L-I curve from TWMC suggests that the maximum variation it can track is
±0.5 ms with respect to the latency of the reference template. This is consistent with
Quian Quiroga’s statement that this method is resistant to latency jitter (Quian Quiroga
2005). In addition, the current study has quantified the maximum latency jitter that can be
detected as ±0.5 ms relative to the reference template.
In contrast, the overall L-I curve of CSTD follows the benchmark L-I curve along
the full range of sound intensities with a mean difference of 0.94 ms (SD=0.11). This
behaviour could be explained by CSTD's independence from a reference template and its
superior denoising capability.
Inter-participant variability is briefly addressed, as ABRs are known to differ with age
and gender; for example, ABR wave V latency is shorter in females and younger adults than
in males and older adults (Wilson & Aghdasi 1999).
Figure 5.22 illustrates the individual L-I curves for 4 male and 4 female participants. A
mean wave V latency prolongation of 0.3 ms (SD=0.07) can be seen in the benchmark curve
of males compared to females. This difference is closely preserved by the L-I curve derived
by CSTD with a difference of 0.12 ms (SD=0.04). In contrast, the L-I curves for CTMC and
TWMC cannot be compared in this way because they do not represent the variation accurately
at low sound intensities. The variation due to age appears negligible in these plots owing
to the narrow age range of the participants, 24 to 34 years (mean=26.7, SD=2.6). However,
visual inspection suggests that CSTD follows the benchmark curve more closely at all ages.
According to the statistical analysis of DWT and SWT as the decomposition algorithm for
latency tracking, it can be concluded that even with SWT as the decomposition algorithm,
CSTD (p = 0.395) performs better than the other wavelet filtering methods (p < 0.01), thus
revealing the limitation of using a template when tracking time-scale variations.
A similar result is achieved by using the SWT algorithm in figure 5.24, with CSTD
performing better than CTMC and TWMC. The template has a similar effect here: the L-I
curves derived from CTMC and TWMC diverge towards low sound intensity levels.
Quantification of the MSEs of the overall L-I curves in figure 5.26 reflects the similarity
in the performance of the DWT and SWT algorithms. A large variation is present at low sound
intensities. However, this variation converges to very low MSE values from 30 dB nHL
onwards. This result does not entirely support the hypothesis that the shift-invariance of
SWT yields a better approximation of latency than the shift-variant DWT.
However, it should be noted that the L-I curves constructed here are derived from
discrete datasets recorded at each intensity level. The drawback of shift-variance in DWT
is less apparent for this type of data because of the minimal variation in ABRs produced
at constant stimulus intensity. In contrast, a dataset recorded with continuously varying
sound intensity levels, as in most practical situations, will have considerable time-scale
variations in ABR latency. Even though it would be interesting to see the performance of
the shift-variant DWT and the shift-invariant SWT on such a dataset, this analysis was not
performed owing to the limitations of time and data for this thesis.
Figure 5.26: The difference in the MSE of the L-I curves derived using DWT and SWT for each denoising method, with reference to the L-I curve derived from the MTA, i.e. (MSE_DWT - MSE_SWT). A difference > 0 indicates better performance of SWT and a difference < 0 indicates better performance of DWT.
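The sign convention of the MSE difference in figure 5.26 can be illustrated with a minimal sketch. The latency values below are invented purely for illustration and are not data from this study; the function names are likewise hypothetical.

```python
def mse(curve, benchmark):
    """Mean squared error between an L-I curve and the benchmark curve (ms^2)."""
    if len(curve) != len(benchmark):
        raise ValueError("curves must be sampled at the same intensity levels")
    return sum((c - b) ** 2 for c, b in zip(curve, benchmark)) / len(curve)


def mse_difference(dwt_curve, swt_curve, benchmark):
    """MSE_DWT - MSE_SWT: positive favours SWT, negative favours DWT."""
    return mse(dwt_curve, benchmark) - mse(swt_curve, benchmark)


# Hypothetical wave V latencies in ms at four intensity levels (not thesis data).
benchmark = [8.0, 7.2, 6.6, 6.1]   # stands in for the MTA-derived curve
dwt_curve = [8.4, 7.3, 6.6, 6.1]
swt_curve = [8.2, 7.2, 6.6, 6.1]

diff = mse_difference(dwt_curve, swt_curve, benchmark)
print("SWT closer to benchmark" if diff > 0 else "DWT closer or equal")
```

In this made-up case the difference is positive because the DWT curve lies further from the benchmark at the lowest intensity level.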
5.7 Conclusion
Considering the frequent use of wavelets in denoising applications, three different wavelet
denoising methods were analysed for rapid extraction of ABRs with a minimum number
of epochs. The CTMC and TWMC methods were based on a template while the CSTD
method was independent of a template. Two wavelet decomposition algorithms were also
considered in this analysis to assess the distortion produced by shift-variance in DWT
compared to shift-invariant SWT.
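The shift-variance referred to above can be demonstrated with a small sketch. It assumes a single-level Haar transform and a toy rectangular pulse, neither of which is taken from this study: delaying the pulse by one sample changes the energy of the decimated (DWT-style) detail coefficients, whereas the undecimated (SWT-style) coefficients are simply delayed by the same amount.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_detail_decimated(x):
    """Level-1 Haar detail coefficients with downsampling by 2 (DWT-style)."""
    return [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]


def haar_detail_undecimated(x):
    """Level-1 Haar detail coefficients at every sample, circular (SWT-style)."""
    n = len(x)
    return [(x[i] - x[(i + 1) % n]) / SQRT2 for i in range(n)]


def rotate(x, k):
    """Circularly delay a list by k samples."""
    k %= len(x)
    return x[-k:] + x[:-k] if k else list(x)


def energy(coeffs):
    """Sum of squared coefficients."""
    return sum(c * c for c in coeffs)


pulse = [0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0]  # toy signal, not an ABR
delayed = rotate(pulse, 1)

# Decimated transform: detail energy depends on where the pulse falls.
print(energy(haar_detail_decimated(pulse)), energy(haar_detail_decimated(delayed)))

# Undecimated transform: coefficients of the delayed pulse are a delayed copy.
print(haar_detail_undecimated(delayed) == rotate(haar_detail_undecimated(pulse), 1))
```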
The use of constant templates in CTMC and TWMC supports the hypothesis of tracking
time-scale variations (section 2.4.1). CTMC detects latency variations of ±1 ms with
reference to the template and TWMC allows latency variations of ±0.5 ms. Supporting the
hypothesis that the shift-invariance of the SWT decomposition algorithm produces better
denoised ABRs, MSE values with SWT were lower than those with DWT. In contrast, results
contradicting this hypothesis were observed in tracking latency changes of the ABR. It would
nevertheless be worthwhile to examine the performance of CSTD using the SWT decomposition
algorithm with an ABR dataset having continuously varying latencies.
The denoising results suggested that the CSTD denoising method with the SWT decomposition
algorithm produced a fully featured ABR. Only 32 epochs were required for the ABR
extraction, which is a considerable improvement in the rapid extraction of ABRs compared
to the conventional MTA of 1024 epochs. Latency tracking results suggested that the
template-independent CSTD is superior to the template-dependent methods. According to
the results of this study, CSTD with the DWT decomposition algorithm is suitable for an
ABR rapid extraction system. With only 32 epochs, CSTD with DWT decomposition was
able to arrive at a fully featured ABR.
The published journal article (De Silva & Schier 2011), which includes these conclusions,
is attached in Appendix B.
Chapter 6
Overall conclusions and further work
A summary of the independent studies carried out towards the common objective of eval-
uating a rapid method to extract ABRs is presented in this chapter. In the exploration of
a rapid extraction algorithm of the ABR, ARX modelling and wavelet denoising revealed
contrasting results with different levels of susceptibility to noise and time-scale variations.
It was determined for parametric model-based extraction algorithms that the conven-
tional ARX modelling outperformed REPE for MSE. However, superior performance was
observed in the CSTD wavelet denoising algorithm, which produced a fully featured ABR.
This chapter presents the conclusions drawn, application domain, limitations and pos-
sible future expansions, based upon the investigated methods.
6.1 The approach towards the extraction of ABR
Lengthy acquisition times required to extract the ABR impose significant restrictions
on its use in diagnostic and monitoring applications. Typically it takes on the order of
minutes to acquire a sufficiently noise-free ABR using conventional MTA methods, so that
the amplitude and latency of its major components can be identified. Further, lengthy
acquisition times increase the likelihood that a range of externally generated artifacts,
particularly patient movements, will compromise the fidelity of the acquisition.
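The dependence of acquisition time on the number of epochs can be made explicit with a standard relation. It is stated here under the usual idealised assumptions, which are not verified in this section: the response is identical in every epoch, and the background noise is zero-mean, of equal variance and uncorrelated between epochs. Averaging N epochs then leaves the response unchanged and divides the noise variance by N:

```latex
\[
\mathrm{SNR}_N \;=\; \mathrm{SNR}_1 + 10\log_{10} N \quad [\mathrm{dB}]
\]
\[
N = 1024:\; 10\log_{10} 1024 \approx 30.10\ \mathrm{dB},
\qquad
N = 32:\; 10\log_{10} 32 \approx 15.05\ \mathrm{dB}
\]
```

With a single-sweep SNR of approximately -30 dB, as quoted for a physiological ABR in section 6.2, averaging 1024 epochs would under these assumptions bring the SNR to roughly 0 dB, whereas averaging 32 epochs alone would leave it near -15 dB.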
In order to overcome such limitations, studies were designed to investigate the efficacy
of algorithms that rapidly acquire the ABR using a minimal number of epochs. Based
on an extensive review of the literature concerning the rapid extraction of evoked
responses, ARX modelling and wavelet-based denoising methods were considered the
most promising for rapid ABR extraction.
6.2 Rapid extraction with ARX and REPE
Because ARX modelling has not previously been used to extract short latency ABRs,
this study systematically establishes the suitability of the ARX approach, along with its
variant REPE, for single/limited sweep extraction in a high noise environment. The use of
real ABR data for the evaluation carries the often-unacknowledged drawback of assuming
that an actual noise-free EP/ABR is available for each epoch recorded. To address
this limitation and uncertainty, the ability to extract the ABR was tested with a
well-defined and reproducible simulation study involving a synthetic ABR model with additive
noise. On this basis, the following specific features were evaluated for the ARX and REPE
methods:
• The accurate identification of the actual model parameters with ARX and REPE
algorithms.
• Quantification of noise reduction/denoising achieved by the ARX and REPE algo-
rithms.
• Quantification of the range of wave V latency variations tracked with ARX and
REPE algorithms.
• The application and confirmation of these findings with real ABRs.
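The first of these features, recovery of known model parameters, can be illustrated with a minimal sketch. It assumes a first-order ARX structure, y(k) = -a·y(k-1) + b·u(k-1) + e(k), fitted by ordinary least squares; the model order, the random input and the parameter values are illustrative assumptions and do not reproduce the models or the synthetic ABR used in this study.

```python
import random


def simulate_arx1(a, b, u, noise_sd=0.0, seed=0):
    """Simulate y(k) = -a*y(k-1) + b*u(k-1) + e(k) with white Gaussian e(k)."""
    rng = random.Random(seed)
    y = [0.0] * len(u)
    for k in range(1, len(u)):
        y[k] = -a * y[k - 1] + b * u[k - 1] + rng.gauss(0.0, noise_sd)
    return y


def fit_arx1(u, y):
    """Least-squares estimate of (a, b) from the regressors [-y(k-1), u(k-1)]."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for k in range(1, len(y)):
        p1, p2 = -y[k - 1], u[k - 1]
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        r1 += p1 * y[k]
        r2 += p2 * y[k]
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 * s12
    if det == 0.0:
        raise ValueError("regressors are collinear; input is not exciting enough")
    a_hat = (r1 * s22 - r2 * s12) / det
    b_hat = (s11 * r2 - s12 * r1) / det
    return a_hat, b_hat


# Hypothetical input and parameters (a = -0.8 places the pole at 0.8).
rng = random.Random(1)
u = [rng.uniform(-1.0, 1.0) for _ in range(500)]

y_clean = simulate_arx1(a=-0.8, b=0.5, u=u)
a_clean, b_clean = fit_arx1(u, y_clean)

y_noisy = simulate_arx1(a=-0.8, b=0.5, u=u, noise_sd=0.1)
a_noisy, b_noisy = fit_arx1(u, y_noisy)
print(a_clean, b_clean, a_noisy, b_noisy)
```

Without noise the estimates coincide with the true parameters up to rounding, while added noise scatters them, which is the kind of comparison against predefined values that the first bullet describes.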
The systematic evaluation of these features of the ARX and REPE methods revealed the
following:
• Estimation of predefined model parameters performed for both ARX and the REPE
methods revealed that the poles approached the predetermined values, but with an
offset for the zeros. This suggests that the estimated ABR does not possess the exact
spectral characteristics of the original. The scattered zeros could be a result of
the noise imposed on the ABR.
• REPE produced a superior SNR improvement of 23 dB at -30 dB initial SNR compared
to ARX. In contrast, inspection of individual ABR epochs suggested that ARX
produced a closer match to the reference ABR, with a mean correlation coefficient
of 0.84 (SD = 0.02) compared to 0.63 (SD = 0.06) for REPE, suggesting that the
introduction of pre-whitening in REPE has a detrimental effect on the estimated ABR.
• ARX modelling performed better than REPE in tracking latency variations. This was
evident in ARX being able to estimate latency offsets of 2 ms at a frequency of 1 Hz
with an initial SNR of -5 dB. In contrast, REPE was unable to estimate latency
variations (even at an SNR of 0 dB).
• Since the SNR of a physiological ABR is approximately -30 dB, the ARX model
required an ensemble of epochs rather than a single sweep to provide an output of
the model. Application of the ARX modelling to real ABRs confirmed that even
with an ideal reference input template u(k), the output of the model y(k) would
require an ensemble average of more than 256 sweeps. According to these results,
rapid extraction cannot be achieved with ARX modelling. Therefore ARX modelling
is unsuitable for the extraction of signals with low SNR such as ABRs.
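The two figures of merit quoted in the list above, SNR improvement and correlation with a reference ABR, can be computed as in the following sketch. The definitions are common textbook ones and are assumed here; the exact definitions used in this study are not restated in this chapter. The sinusoidal reference and alternating noise are toy signals chosen so that the result is easy to check.

```python
import math


def snr_db(signal, noise):
    """SNR in dB as the ratio of mean signal power to mean noise power."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)


def snr_improvement_db(reference, noisy, estimate):
    """Output SNR minus input SNR, treating (x - reference) as the noise."""
    noise_in = [x - r for x, r in zip(noisy, reference)]
    noise_out = [x - r for x, r in zip(estimate, reference)]
    return snr_db(reference, noise_out) - snr_db(reference, noise_in)


def correlation(x, y):
    """Pearson correlation coefficient between two equal-length waveforms."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy data (not an ABR): the estimate keeps one tenth of the noise amplitude.
reference = [math.sin(2.0 * math.pi * k / 32.0) for k in range(64)]
noise = [1.0 if k % 2 == 0 else -1.0 for k in range(64)]
noisy = [r + n for r, n in zip(reference, noise)]
estimate = [r + 0.1 * n for r, n in zip(reference, noise)]

improvement = snr_improvement_db(reference, noisy, estimate)
print(improvement, correlation(reference, noisy), correlation(reference, estimate))
```

Because the two measures respond to different kinds of error, a method can rank higher on SNR improvement and lower on correlation, as reported above for REPE and ARX.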
The contributions of this study to the field are largely to clarify the applicability of
the ARX model in extracting ABR features and their time-scale variations, and to clarify
shortcomings of previous, related studies. In previous research, aberrations in signals
estimated with the ARX model were suspected to arise from inconsistency in the generation
of the EP (Cerutti et al. 1987, Rossi et al. 2007). The simulation study in this thesis
(with synthetic ABRs and known presence of the EP), however, highlighted the SNR and the
latency variability as contributors to the aberration of the estimated EP. Boundaries were
thereby determined for the successful estimation of single sweep ABRs with ARX modelling.
It was confirmed both in previous research and in the current study that REPE produced
approximately 10 dB SNR improvement compared to conventional ARX modelling. However, the
latency variation tracking ability of REPE was poor (following the MTA estimate) and hence
it was deemed unsuitable for ABR extraction.
6.3 Rapid extraction with wavelets
There also exists the possibility of rapidly extracting ABRs using a different paradigm.
Considering the success of wavelet-based denoising methods in EP applications, three
wavelet denoising methods, CTMC, TWMC and CSTD, were adapted to the ABR. CTMC
and TWMC are based on a template of the ABR whereas CSTD is executed without such
a template. Denoising in CTMC applies only thresholds, whereas CSTD also employs
a cyclic averaging technique. In contrast, TWMC uses a denoising technique based on
temporal windowing of wavelet coefficients. The following were examined in order to
determine the suitability of these methods for use in a rapid extraction method for the
ABR.
• Identification of optimum threshold functions and time windows compatible with
the ABR.
• Evaluation of the most effective denoising method at low SNR, i.e. with a reduced
number of epochs.
• Identification of the most effective time-scale variation tracking wavelet denoising
method.
• Analysis of the effect of the DWT and SWT decomposition algorithms on the three chosen
denoising methods.
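The combination of thresholding with cyclic averaging mentioned above can be sketched generically as translation-invariant denoising in the sense of Coifman & Donoho (1995), listed in the bibliography. The sketch assumes a single-level Haar transform, soft thresholding and an arbitrary threshold; it is not the CSTD implementation evaluated in this study, and all names and values are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)


def soft_threshold(c, t):
    """Shrink a coefficient towards zero by t (soft thresholding)."""
    return math.copysign(max(abs(c) - t, 0.0), c)


def haar_denoise(x, t):
    """One-level Haar DWT, soft-threshold the details, inverse transform.

    The input length must be even.
    """
    out = []
    for i in range(0, len(x), 2):
        approx = (x[i] + x[i + 1]) / SQRT2
        detail = soft_threshold((x[i] - x[i + 1]) / SQRT2, t)
        out.append((approx + detail) / SQRT2)
        out.append((approx - detail) / SQRT2)
    return out


def rotate(x, k):
    """Circularly delay a list by k samples (negative k advances it)."""
    k %= len(x)
    return x[-k:] + x[:-k] if k else list(x)


def cycle_spin_denoise(x, t, shifts):
    """Average the denoised outputs over circular shifts 0..shifts-1."""
    n = len(x)
    acc = [0.0] * n
    for s in range(shifts):
        # Shift, denoise, then undo the shift before accumulating.
        restored = rotate(haar_denoise(rotate(x, s), t), -s)
        for i in range(n):
            acc[i] += restored[i] / shifts
    return acc


# Hypothetical noisy samples; threshold chosen arbitrarily for illustration.
smoothed = cycle_spin_denoise([0.0, 0.1, -0.1, 1.0, 0.9, 1.1, 0.0, 0.1], t=0.2, shifts=2)
print(smoothed)
```

Averaging over shifts removes the dependence of the result on how the signal happens to align with the decimation grid, which is the motivation usually given for this family of methods.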
The following conclusions were drawn from the results generated by subjecting real ABRs
to the specific wavelet methods.
• The performance of CSTD in denoising ABRs is superior to that of CTMC and
TWMC, with a significant difference in MSEs (p < 0.01). The two optimum threshold
functions and the cyclic averaging contribute to this effective noise reduction.
• Performance of noise reduction in CSTD compared to CTMC and TWMC is con-
sistent for both DWT and SWT decomposition algorithms. However, an improved
noise reduction resulted when using SWT as the decomposition algorithm, thus re-
vealing the advantage of the shift-invariance (p < 0.01).
• An ensemble of 32 epochs is sufficient to extract a fully featured ABR denoised with
CSTD, as indicated by a significant difference in correlation coefficients (p < 0.01).
• Latency variations are closely tracked by the CSTD with similar slopes (a1 in the L-I
curve model) (p = 0.1615). Independence of a template (reference signal) appears
to assist the CSTD to track latency variations without any limitations.
• With a constant template, CTMC is able to track latency variations of ±1 ms with
reference to the template, while TWMC is able to track latency variations of ±0.5 ms.
These limitations impose a barrier to their use in clinical practice.
• As a method of extracting ABRs from the underlying EEG with a reduced number of
sweeps, the cyclic shift tree denoising (CSTD) algorithm is the optimum among the
wavelet algorithms compared. Therefore, the systematic evaluation
confirms that the CSTD wavelet method can be used for rapid extraction of ABRs.
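The comparison of slopes (a1) mentioned in the list above can be illustrated with a short sketch. A straight-line model, latency = a0 + a1 × intensity, fitted by ordinary least squares is assumed here purely for illustration; the L-I curve model actually used in this study may differ, and the latency values are invented.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = a0 + a1*x; returns (a0, a1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    return a0, a1


# Hypothetical intensities (dB nHL) and wave V latencies (ms), not thesis data.
intensities = [15.0, 25.0, 35.0, 45.0, 55.0]
benchmark = [9.0 - 0.04 * i for i in intensities]
estimated = [b + 0.1 for b in benchmark]  # constant offset, same slope

a0_bench, a1_bench = fit_line(intensities, benchmark)
a0_est, a1_est = fit_line(intensities, estimated)
print(a1_bench, a1_est)
```

In this made-up case the estimated curve is offset from the benchmark by a constant, so the intercepts differ while the slopes agree, which is the sense in which an extracted curve can follow the benchmark across intensities.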
6.4 Limitations of the current study and future work
The promising results generated during this study provide scope for further refinements
and developments. Considering the clinical importance of the amplitude variations of the
ABR (especially in hearing screening), it would be worthwhile to evaluate these variations
using both ARX and wavelet methods in addition to the latency examined here.
A specific limitation of this study lies in the discrete datasets recorded at each sound
intensity level. The L-I curves constructed with this data have discontinuities along the
curve. It would be worthwhile in future to examine the performance of CSTD using the SWT
decomposition algorithm with an ABR dataset derived from a continuous variation of sound
intensity levels (which leads to a continuous variation in ABR latency). This would give
a clearer indication of the relative performance of the DWT and SWT decomposition
algorithms. In addition, for the special case of identifying only wave V, modifying the
CTMC method to use a continuously updating template would be a worthwhile investigation.
As conventional statistical threshold methods (Fsp) do not fulfil the underlying
assumptions, estimating the distribution of the residual noise could be beneficial,
allowing a new threshold compatible with non-MTA methods to be determined.
Finally, the evaluation of the CSTD method with optimised parameters, using a large
pool of ABR recordings, would lead to a fully functional device for the rapid extraction
of ABRs. It would also be important to include participants with pathological disorders
which are known to affect ABR features.
Bibliography
Achor, L. & Starr, A. (1980), ‘Auditory brain stem responses in the cat. ii. effects of
lesions’, Electroencephalography and Clinical Neurophysiology 48(2), 174–190.
Acir, N., Ozdamar, O. & Guzelis, C. (2006), ‘Automatic classification of auditory brain-
stem responses using svm-based feature selection algorithm for threshold detection’,
Engineering Applications of Artificial Intelligence 19(2), 209–218.
Adelman, G. & Smith, B. (1999), Encyclopedia of neuroscience, Elsevier.
Aihara, N., Murakami, S., Watanabe, N., Takahashi, M., Inagaki, A., Tanikawa, M. & Ya-
mada, K. (2009), ‘Cochlear nerve action potential monitoring with the microdissector
in vestibular schwannoma surgery’, Skull Base 19(5), 325–332.
Akaike, H. (1970), ‘Statistical predictor identification’, Annals of the Institute of Statistical
Mathematics 22(1), 203–217.
Alpiger, S., Helbo-Hansen, H. S. & Jensen, E. W. (2002), ‘Effect of sevoflurane on the
mid-latency auditory evoked potentials measured by a new fast extracting monitor’,
Acta Anaesthesiologica Scandinavica 46(3), 252–256.
Antonelli, A. R., Bonfioli, F., Cappiello, J., Peretti, G., Zanetti, D. & Capra, R. (1988),
‘Auditory evoked potentials test battery related to magnetic resonance imaging for
multiple sclerosis patients’, Scandinavian Audiology 17(SUPPL. 30), 191–196.
Aurlien, H., Gjerde, I. O., Aarseth, J. H., Elden, G., Karlsen, B., Skeidsvoll, H. & Gilhus,
N. E. (2004), ‘Eeg background activity described by a large computerized database’,
Clinical Neurophysiology 115(3), 665–673.
Babkoff, H., Pratt, H. & Kempinski, D. (1984), ‘Auditory brainstem evoked potential
latency-intensity functions: A corrective algorithm’, Hearing Research 16(3), 243–
249.
Basar, E. (2001), ‘Event-related oscillations are “real brain responses” – wavelet analysis
and new strategies’, International Journal of Psychophysiology 39(2-3), 91–127.
Boston, J. R. (1981), ‘Spectra of auditory brainstem responses and spontaneous eeg’, IEEE
Transactions on Biomedical Engineering 28(4), 334–341.
Boston, J. R. (1983), ‘Effects of digital filtering on the waveform and peak parameters of
the auditory brainstem response’, Journal of clinical engineering 8(1), 79–84.
Box, G. & Jenkins, G. (1976), Time Series Analysis: Forecasting and Control, CA: Holden-
Day, San Francisco.
Bracchi, F., Perale, G., Rossi, L., Gaggiani, A. & Bianchi, A. M. (2003), A pc-based
system for h-reflex and single sweep sep coupled monitoring of spinal cord function in
vertebral column surgery, in R. S. Leder, ed., ‘Engineering in Medicine and Biology
Society, 2003’, Vol. 4 of A New Beginning for Human Health: Proceedings of the 25th
Annual International Conference of the IEEE Engineering in Medicine and Biology
Society, Cancun, pp. 3169–3172.
Bradley, A. P. & Wilson, W. J. (2004), ‘On wavelet analysis of auditory evoked potentials’,
Clinical Neurophysiology 115(5), 50.
Bradley, A. P. & Wilson, W. J. (2005), ‘Automated analysis of the auditory brainstem
response using derivative estimation wavelets’, Audiology and Neuro-Otology 10(1), 6–
21.
Bradley, J. N. & Brislawn, C. M. (1994), Wavelet/scalar quantization compression stan-
dard for digital fingerprint images, in ‘Proceedings - IEEE International Symposium
on Circuits and Systems’, Vol. 3 of Proceedings of the 1994 IEEE International Sym-
posium on Circuits and Systems. Part 3 (of 6), IEEE, London, England, pp. 205–208.
Broersen, P. (2006), Automatic Autocorrelation and Spectral Analysis, Springer, Germany.
Buchwald, J. & Huang, C. (1975), ‘Far-field acoustic response: origins in the cat’, Science
189(4200), 382.
Burrus, C., Gopinath, R. & Guo, H. (1998), Introduction to wavelets and wavelet trans-
forms: a primer, Prentice Hall.
Causevic, E., Morley, R. E., Wickerhauser, M. V. & Jacquin, A. E. (2005), ‘Fast
wavelet estimation of weak biosignals’, IEEE Transactions on Biomedical Engineering
52(6), 1021–1032.
Cerutti, S., Baselli, G., Liberati, D. & Pavesi, G. (1987), ‘Single sweep analysis of visual
evoked potentials through a model of parametric identification’, Biological Cybernetics
56(2-3), 111–120.
Cerutti, S., Chiarenza, G., Liberati, D., Mascellani, P. & Pavesi, G. (1988), ‘A parametric
method of identification of single-trial event-related potentials in the brain’, IEEE
Transactions on Biomedical Engineering 35(9), 701–711.
Chassard, D., Joubaud, A., Colson, A., Guiraud, M., Dubreuil, C. & Banssillon, V. (1989),
‘Auditory evoked potentials during propofol anaesthesia in man’, British Journal of
Anaesthesia 62(5), 522–526.
Chiappa, K. (1997), Evoked potentials in clinical medicine, Lippincott-Raven.
Chiappa, K., Gladstone, K. & Young, R. (1979), ‘Brain stem auditory evoked responses:
studies of waveform variations in 50 normal human subjects’, Archives of Neurology
36(2), 81.
Chiappa, K. H. (1990), Short latency somatosensory evoked potentials in clinical medicine,
Raven Press, New York.
Chiappa, K. H. & Ropper, A. H. (1982), ‘Evoked potentials in clinical medicine. (first of
two parts)’, New England Journal of Medicine 306(19), 1140–1150.
Chu, N.-S. (1985), ‘Age-related latency changes in the brain-stem auditory evoked poten-
tials’, Electroencephalography and Clinical Neurophysiology/Evoked Potentials Section
62(6), 431–436.
Church, M. W. & Gritzke, R. (1987), ‘Effects of ketamine anesthesia on the rat brain-stem
auditory evoked potential as a function of dose and stimulus intensity’, Electroen-
cephalography and Clinical Neurophysiology 67(6), 570–583.
Church, M. W. & Gritzke, R. (1988), ‘Dose-dependent effects of atropine sulfate on
the brainstem and cortical auditory evoked potentials in the rat’, Brain Research
456(2), 224–234.
Coats, A., Jenkins, H. & Monroe, B. (1984), ‘Auditory evoked potentials-the cochlear
summating potential in detection of endolymphatic hydrops’, Otology & Neurotology
5(6), 443.
Cohen, M. S. & Britt, R. H. (1982), ‘Effects of sodium pentobarbital, ketamine, halothane,
and chloralose on brainstem auditory evoked responses’, Anesthesia and Analgesia
61(4), 338–343.
Coifman, R. & Donoho, D. (1995), ‘Translation invariant de-noising’, Lecture Notes in
Statistics 103, 125–150.
Coifman, R. R. & Wickerhauser, M. V. (1992), ‘Entropy-based algorithms for best basis
selection’, IEEE Transactions on Information Theory 38(2 pt II), 713–718.
Collet, L., Delorme, C., Chanal, J. M., Dubreuil, C., Morgon, A. & Salle, B. (1987),
‘Effect of stimulus intensity variation of brain-stem auditory evoked potentials: com-
parison between neonates and adults’, Electroencephalography and Clinical Neuro-
physiology/Evoked Potentials Section 68(3), 231–233.
Cooper, W. A. J. & Parker, D. J. (1981), ‘Stimulus artefact reduction systems for the
tdh-49 headphone in the recording of auditory evoked potentials’, Ear and Hearing
2(6), 283–293.
Corona-Strauss, F. I., Delb, W., Bloching, M. & Strauss, D. J. (2007), Ultra-fast
quantification of hearing loss by neural synchronization stabilities of auditory evoked
brainstem activity, in ‘Engineering in Medicine and Biology Society, 2007. EMBS 2007.
29th Annual International Conference of the IEEE’, pp. 2476–2479.
Corona-Strauss, F. I., Delb, W., Schick, B. & Strauss, D. J. (2010a), A kernel-based nov-
elty detection scheme for the ultra-fast detection of chirp evoked auditory brainstem
responses, in ‘Engineering in Medicine and Biology Society (EMBC), 2010 Annual
International Conference of the IEEE’, pp. 6833–6836.
Corona-Strauss, F. I., Delb, W., Schick, B. & Strauss, D. J. (2010b), A kernel-based nov-
elty detection scheme for the ultra-fast detection of chirp evoked auditory brainstem
responses, in ‘Engineering in Medicine and Biology Society (EMBC), 2010 Annual
International Conference of the IEEE’, pp. 6833–6836.
DanmeterAps (2010), ‘www.danmeter.dk’.
Davies, F. W., Mantzaridis, H., Kenny, G. N. C. & Fisher, A. C. (1996), ‘Middle la-
tency auditory evoked potentials during repeated transitions from consciousness to
unconsciousness’, Anaesthesia 51(2), 107–113.
De Silva, A. C. & Schier, M. A. (2011), ‘Evaluation of wavelet techniques in rapid extrac-
tion of abr variations from underlying eeg’, Physiological Measurement 32(11), 1747–
1761.
De Weerd, J. (1981), ‘A posteriori time-varying filtering of averaged evoked potentials. i.
introduction and conceptual basis’, Biological Cybernetics 41(3), 211.
Delgado, R. E. & Ozdamar, O. (1994), ‘Automated auditory brainstem response interpre-
tation’, IEEE Engineering in Medicine and Biology Magazine 13(2), 227–237.
Demiralp, T., Yordanova, J., Kolev, V., Ademoglu, A., Devrim, M. & Samar, V. J. (1999),
‘Time-frequency analysis of single-sweep event-related potentials by means of fast
wavelet transform’, Brain and Language 66(1), 129–145.
Don, M. & Elberling, C. (1996), ‘Use of quantitative measures of auditory brain-stem
response peak amplitude and residual background noise in the decision to stop aver-
aging’, The Journal of the Acoustical Society of America 99(1), 491–499.
Donoho, D. L. (1995), ‘De-noising by soft-thresholding’, IEEE Transactions on Informa-
tion Theory 41(3), 613–627.
Doyle, D. J. (1975), ‘Some comments on the use of wiener filtering for the estimation of
evoked potentials’, Electroencephalography and Clinical Neurophysiology 38(5), 533–
534.
Drummond, J. C., Todd, M. M. & Hoi Sang, U. (1985), ‘The effect of high dose sodium
thiopental on brain stem auditory and median nerve somatosensory evoked responses
in humans’, Anesthesiology 63(3), 249–254.
Effern, A., Lehnertz, K., Fernández, G., Grunwald, T., David, P. & Elger, C. E. (2000),
‘Single trial analysis of event related potentials: Non-linear de-noising with wavelets’,
Clinical Neurophysiology 111(12), 2255–2263.
Effern, A., Lehnertz, K., Schreiber, T., Grunwald, T., David, P. & Elger, C. E. (2000),
‘Nonlinear denoising of transient signals with application to event-related potentials’,
Physica D: Nonlinear Phenomena 140(3-4), 257–266.
Egerházi, A., Glaub, T., Balla, P., Berecz, R. & Degrell, I. (2008), ‘P300 in mild cognitive
impairment and in dementia’, P300 enyhe kognitív zavarban és demenciában 23(5), 349–357.
Elberling, C. & Don, M. (1984), ‘Quality estimation of averaged auditory brainstem
responses’, Scandinavian Audiology 13(3), 187–197.
Elberling, C. & Salomon, G. (1973), ‘Cochlear microphonics recorded from the ear canal
in man’, Acta Oto-Laryngologica 75(2-6), 489–495.
Fedele, D., Martini, A. & Cardone, C. (1984), ‘Impaired auditory brainstem-evoked re-
sponses in insulin-dependent diabetic subjects’, Diabetes 33(11), 1085–1089.
Ferbert, A., Buchner, H., Bruckmann, H., Zeumer, H. & Hacke, W. (1988), ‘Evoked
potentials in basilar artery thrombosis: Correlation with clinical and angiographic
findings’, Electroencephalography and Clinical Neurophysiology 69(2), 136–147.
Furdea, A., Halder, S., Krusienski, D. J., Bross, D., Nijboer, F., Birbaumer, N. & Kübler,
A. (2009), ‘An auditory oddball (p300) spelling system for brain-computer interfaces’,
Psychophysiology 46(3), 617–625.
Garcia-Larrera, L., Fischer, C. & Artru, F. (1993), ‘Effect of anaesthetic agents on sen-
sory evoked potentials’, EFFET DES ANESTHESIQUES SUR LES POTENTIELS
EVOQUES SENSORIELS 23(2-3), 141–162.
Goldie, W. D., Chiappa, K. H., Young, R. R. & Brooks, E. B. (1981), ‘Brainstem au-
ditory and short-latency somatosensory evoked responses in brain death’, Neurology
31(3), 248–256.
Gouveris, H. & Mann, W. (2009), ‘Association between surgical steps and intraopera-
tive auditory brainstem response and electrocochleography waveforms during hear-
ing preservation vestibular schwannoma surgery’, European Archives of Oto-Rhino-
Laryngology 266(2), 225–229.
Graf, M., Marterer, A. & Sluga, E. (1992), ‘Evoked potentials in neurophysiologic assess-
ment of dementias’, American Journal of EEG Technology 32(3), 204–212.
Haar, A. (1910), ‘Zur theorie der orthogonalen funktionensysteme - erste mitteilung’,
Mathematische Annalen 69(3), 331–371.
Hall III, J., Mackey-Hargadine, J. & Kim, E. (1985), ‘Auditory brain-stem response in
determination of brain death’, Archives of Otolaryngology- Head and Neck Surgery
111(9), 613.
Hall, J. W. (1992), Handbook of Auditory Evoked Responses, Allyn and Bacon, Mas-
sachusetts.
Hall, J. W. (2007), New Handbook of Auditory Evoked Responses, Allyn and Bacon, Mas-
sachusetts.
Hanrahan, H. E. (1990), Extraction of features in auditory brainstem response (abr) sig-
nals, in ‘Communications and Signal Processing, 1990. COMSIG 90. Proceedings.,
IEEE 1990 South African Symposium on’, pp. 61–66.
Harkins, S. W. (1981), ‘Effects of presenile dementia of the alzheimer’s type on brainstem
transmission time’, International Journal of Neuroscience 15(3), 165–170.
Hauri, P., Orr, W. & Company, U. (1982), The sleep disorders, Upjohn Kalamazoo, Michi-
gan.
Hecox, K. & Galambos, R. (1974), ‘Brain stem auditory evoked responses in human infants
and adults’, Archives of Otolaryngology- Head and Neck Surgery 99(1), 30.
Holland, P. W. & Welsch, R. E. (1977), ‘Robust regression using iteratively reweighted
least-squares’, Communications in Statistics - Theory and Methods 6(9), 813–827.
Hoppe, U., Weiss, S., Stewart, R. W. & Eysholdt, U. (2001), ‘An automatic sequential
recognition method for cortical auditory evoked potentials’, Biomedical Engineering,
IEEE Transactions on 48(2), 154–164.
Hu, L., Zhang, Z. G., Hung, Y. S., Luk, K. D. K., Iannetti, G. D. & Hu, Y. (2011),
‘Single-trial detection of somatosensory evoked potentials by probabilistic independent
component analysis and wavelet filtering’, Clinical Neurophysiology 122(7), 1429–1439.
Huang, J. W. & Nayak, A. (1999), ‘Depth of anesthesia estimation and control’, IEEE
Transactions on Biomedical Engineering 46(1), 71–81.
Inc, E. R. (2011), ‘Etymotic er-3a insert earphones for audiometry’.
Incorporated, N. M. (2011), ‘algo5’.
Intracoustics (2011), ‘Ep15 diagnostic abr’.
Ivannikov, A., Karkkainen, T., Ristaniemi, T. & Lyytinen, H. (2010), Spatial weighted
averaging for erp denoising in eeg data, in ‘Communications, Control and Signal
Processing (ISCCSP), 2010 4th International Symposium on’, pp. 1–6.
Jensen, E. W., Lindholm, P. & Henneberg, S. (1996), ‘Autoregressive modelling with
exogenous input of auditory evoked potentials to produce an on-line depth anaesthesia
index’, Methods Inf Med 35, 256–260.
Jensen, E. W., Nygaard, M. & Henneberg, S. W. (1998), ‘On-line analysis of middle
latency auditory evoked potentials (mlaep) for monitoring depth of anaesthesia in
laboratory rats’, Medical Engineering and Physics 20(10), 722–728.
Jewett, D. L. & Williston, J. S. (1971), ‘Auditory-evoked far fields averaged from the scalp
of humans’, Brain 94(4), 681–696.
John, M. S., Dimitrijevic, A. & Picton, T. W. (2001), ‘Weighted averaging of steady-state
responses’, Clinical Neurophysiology 112(3), 555–562.
Joseph, J. M., West, C. A., Thorton, A. R. & S., H. B. (1987), ‘Improved decision criteria
for evaluation of clinical abr’s’.
Jung, T.-P., Humphries, C., Lee, T.-W., Makeig, S., McKeown, M. J., Iragui, V. & Se-
jnowski, T. J. (1998), Removing electroencephalographic artifacts: Comparison be-
tween ica and pca, in M. Niranjan, E. Wilson, T. Constantinides & S. Y. Kung, eds,
‘Neural Networks for Signal Processing VIII, 1998’, Proceedings of the 1998 8th IEEE
Workshop on Neural Networks for Signal Processing VIII, IEEE, Cambridge, Engl,
pp. 63–72.
Kaga, K. & Tanaka, Y. (1980), ‘Auditory brainstem response and behavioral audiometry:
Developmental correlates’, Arch Otolaryngol 106(9), 564–566.
Kato, T., Kimura, K., Shiraishi, K., Eura, Y., Morizono, T. & Soda, T. (1995), ‘Topogra-
phy of binaural interaction in the auditory brainstem response’, Auris, nasus, larynx
22(3), 145.
Khedr, E. M., Toony, L. F. E., Tarkhan, M. N. & Abdella, G. (2000), ‘Peripheral and
central nervous system alterations in hypothyroidism: Electrophysiological findings’,
Neuropsychobiology 41(2), 88–94.
Kiernan, J. (2007), ‘Anatomy, lecture notes’.
Kingsbury, N. (2000), A dual-tree complex wavelet transform with improved orthogonality
and symmetry properties, in ‘IEEE International Conference on Image Processing’,
Vol. 2 of International Conference on Image Processing (ICIP 2000), Vancouver, BC,
pp. 375–378.
Kingsbury, N. (2001), ‘Complex wavelets for shift invariant analysis and filtering of signals’,
Applied and Computational Harmonic Analysis 10(3), 234–253.
Kitahara, Y., Fukatsu, O. & Koizumi, Y. (1995), ‘Effect of sevoflurane and nitrous oxide
anesthesia on auditory brainstem responses in children’, Masui 44(6), 805–9.
Kjaer, M. (1980), ‘Localizing brain stem lesions with brain stem auditory evoked poten-
tials’, Acta Neurologica Scandinavica 61(5), 265–274.
Kochs, E., Stockmanns, G., Thornton, C., Nahm, W. & Kalkman, C. J. (2001), ‘Wavelet
analysis of middle latency auditory evoked responses: Calculation of an index for
detection of awareness during propofol administration’, Anesthesiology 95(5), 1141–
1150.
Kodama, Y., Ieda, T., Hirayama, M., Koike, Y., Ito, H. & Sobue, G. (1999), ‘Auditory
brainstem responses in patients with autonomic failure with parkinson’s disease and
multiple system atrophy’, Journal of the Autonomic Nervous System 77(2-3), 184–
189.
Kohshi, K. & Konda, N. (1990), ‘Human auditory brain stem response during induced
hyperthermia’, Journal of Applied Physiology 69(4), 1419–1422.
Kong, X. & Qiu, T. (2001), ‘Latency change estimation for evoked potentials: A comparison
of algorithms’, Medical and Biological Engineering and Computing 39(2), 208–224.
Lange, D. H. & Inbar, G. F. (1996), ‘A robust parametric estimator for single-trial
movement related brain potentials’, IEEE Transactions on Biomedical Engineering
43(4), 341–347.
Lasky, R. E., Rupert, A. & Waller, M. (1987), ‘Reproducibility of auditory brain-stem
evoked responses as a function of the stimulus, scorer and subject’, Electroencephalogr
Clin Neurophysiol 68(1), 45–57.
Lee, P. L., Hsieh, J. C., Wu, C. H., Shyu, K. K. & Wu, Y. T. (2008), ‘Brain computer in-
terface using flash onset and offset visual evoked potentials’, Clinical Neurophysiology
119(3), 605–616.
Lee, S. H., Song, D. G., Kim, S., Lee, J. H. & Kang, D. G. (2009), ‘Results of auditory
brainstem response monitoring of microvascular decompression: A prospective study
of 22 patients with hemifacial spasm’, The Laryngoscope 119(10), 1887–1892.
Lins, O. G., Picton, T. W., Berg, P. & Scherg, M. (1993), ‘Ocular artifacts in recording
eegs and event-related potentials ii: Source dipoles and source components’, Brain
Topography 6(1), 65–78.
Litvan, H., Jensen, E. W., Galan, J., Lund, J., Rodriguez, B. E., Henneberg, S. W.,
Caminal, P. & Villar Landeira, J. M. (2002), ‘Comparison of conventional averaged
and rapid averaged, autoregressive-based extracted auditory evoked potentials for
monitoring the hypnotic level during propofol induction’, Anesthesiology 97(2), 351–
358.
Ljung, L. (1987), System identification : theory for the user, Prentice-Hall information
and system sciences series, Englewood Cliffs, N.J. : Prentice-Hall.
Lopez, M. A., Pomares, H., Pelayo, F., Urquiza, J. & Perez, J. (2009), ‘Evidences of
cognitive effects over auditory steady-state responses by means of artificial neural
networks and its use in brain-computer interfaces’, Neurocomput. 72(16-18), 3617–
3623.
Maglione, J. L., Pincilotti, M., Acevedo, R. C., Bonell, C. E. & Gentiletti, G. G. (2003),
Estimation of the auditory brainstem response’s wave v by means of wavelet trans-
form, in R. S. Leder, ed., ‘Annual International Conference of the IEEE Engineering
in Medicine and Biology - Proceedings’, Vol. 3 of A New Beginning for Human Health:
Proceedings of the 25th Annual International Conference of the IEEE Engineering in
Medicine and Biology Society, Cancun, pp. 2631–2634.
Mainardi, L. T., Kupila, J., Nieminen, K., Korhonen, I., Bianchi, A. M., Pattini, L.,
Takala, J., Karhu, J. & Cerutti, S. (2000), ‘Single sweep analysis of event related au-
ditory potentials for the monitoring of sedation in cardiac surgery patients’, Computer
Methods and Programs in Biomedicine 63(3), 219–227.
Makeig, S., Westerfield, M., Jung, T.-P., Enghoff, S., Townsend, J., Courchesne, E. &
Sejnowski, T. J. (2002), ‘Dynamic brain sources of visual evoked responses’, Science
295(5555), 690–694.
Mallat, S. G. (1989), ‘Theory for multiresolution signal decomposition: the wavelet
representation’, IEEE Transactions on Pattern Analysis and Machine Intelligence
11(7), 674–693.
Mallat, S. G. (1998), A wavelet tour of signal processing, 1st edn, Academic Press,
San Diego.
Markand, O. N., Lee, B. I., Warren, C., Stoelting, R. K., King, R. D., Brown, J. W. &
Mahomed, Y. (1987), ‘Effects of hypothermia on brainstem auditory evoked potentials
in humans’, Annals of Neurology 22(4), 507–513.
MATLAB (2008), ‘Matlab help r2008a’.
Matthies, C. (2008), Monitoring during surgery around the acoustic and vestibular nerves,
in R. N. Marc, ed., ‘Handbook of Clinical Neurophysiology’, Vol. 8, Elsevier,
pp. 566–589.
McCullagh, P., Wang, H., Zheng, H., Lightbody, G. & McAllister, G. (2007), ‘A compar-
ison of supervised classification methods for auditory brainstem response determina-
tion’, Medinfo. MEDINFO 12(Pt 2), 1289–1293.
Misiti, M., Misiti, Y. & Oppenheim, G. (2006), Wavelets and their Applications, ISTE.
Misra, U. K. & Kakita, J. (1999), Clinical Neurophysiology, 1st edn, B.I. Churchill Liv-
ingstone, New Delhi.
Moller, A. (1987), ‘Auditory evoked potentials to continuous amplitude-modulated sounds:
can they be described by linear models?’, Electroencephalography and Clinical Neu-
rophysiology 66(1), 56–65.
Moller, A., Jho, H., Yokota, M. & Jannetta, P. (1995), ‘Contribution from crossed and
uncrossed brainstem structures to the brainstem auditory evoked potentials: a study
in humans’, The Laryngoscope 105(6), 596–605.
Moore, J. K. (1987), ‘The human auditory brain stem as a generator of auditory evoked
potentials’, Hearing Research 29(1), 33–43.
Morawski, K., Niemczyk, K., Sokolowski, J. & Telischi, F. (2010), ‘Intraoperative mon-
itoring of hearing during ossiculoplasty’, Otolaryngology – Head and Neck Surgery
143(2 suppl), P238.
Nason, G. & Silverman, B. (1995), ‘The stationary wavelet transform and some statistical
applications’, Lecture Notes in Statistics, Springer-Verlag, New York, pp. 281–281.
Nayak, A. & Roy, R. (1998), ‘Anesthesia control using midlatency auditory evoked
potentials’, Biomedical Engineering, IEEE Transactions on 45(4), 409–421.
Nijboer, F., Furdea, A., Gunst, I., Mellinger, J., McFarland, D. J., Birbaumer, N. & Kübler,
A. (2008), ‘An auditory brain-computer interface (bci)’, Journal of Neuroscience
Methods 167(1), 43–50.
O’Mahony, D., Rowan, M., Feely, J., Walsh, J. B. & Coakley, D. (1994), ‘Primary auditory
pathway and reticular activating system dysfunction in alzheimer’s disease’, Neurology
44(11), 2089–.
Oppenheim, A. V. & Schafer, R. W. (1999), Discrete-time signal processing, 2nd edn,
Prentice-Hall, New Jersey.
Otometrics (2011), ‘Madsen accuscreen - newborn hearing screener’.
Ozdamar, O. & Delgado, R. (1996), ‘Measurement of signal and noise characteristics in
ongoing auditory brainstem response averaging’, Annals of Biomedical Engineering
24(6), 702–715.
Ozdamar, O. & Kalayci, T. (1999), ‘Median averaging of auditory brain stem responses’,
Ear and Hearing 20(3), 253–264.
Papathanasiou, E. S., Pantzaris, M., Myrianthopoulou, P., Kkolou, E. & Papacostas,
S. S. (2010), ‘Brainstem lesions may be important in the development of epilepsy in
multiple sclerosis patients: An evoked potential study’, Clinical Neurophysiology .
Parameswariah, C. & Cox, M. (2006), ‘The ways of wavelet filters’, IEEE Potentials
25(1), 11–15.
Parker, S., Chiappa, K. & Brooks, E. (1980), ‘Brainstem auditory evoked response in
patients with acoustic neuromas and cerebellopontine angle meningiomas’, Neurology
1(30).
Petoe, M. A., Bradley, A. P. & Wilson, W. J. (2010), ‘On chirp stimuli and neural
synchrony in the suprathreshold auditory brainstem response’, Journal of the Acoustical
Society of America 128(1), 235–246.
Pham, M., Hinterberger, T., Neumann, N., Kübler, A., Hofmayer, N., Grether, A.,
Wilhelm, B., Vatine, J. J. & Birbaumer, N. (2005), ‘An auditory brain-computer interface
based on the self-regulation of slow cortical potentials’, Neurorehabilitation and
Neural Repair 19(3), 206–218.
Picton, T. W., Stapells, D. R. & Campbell, K. B. (1981), ‘Auditory evoked potentials
from the human cochlea and brainstem’, The Journal of otolaryngology. Supplement
9, 1–41.
Ponton, C. W., Moore, J. K. & Eggermont, J. J. (1996), ‘Auditory brain stem response
generation by parallel pathways: differential maturation of axonal conduction time
and synaptic transmission’, Ear Hear 17(5), 402–10.
Pratt, H. & Sohmer, H. (1977), ‘Correlations between psychophysical magnitude estimates
and simultaneously obtained auditory nerve, brain stem and cortical responses to click
stimuli in man’, Electroencephalography and Clinical Neurophysiology 43(6), 802–812.
Quian Quiroga, R. (2000), ‘Obtaining single stimulus evoked potentials with wavelet de-
noising’, Physica D: Nonlinear Phenomena 145(3-4), 278–292.
Quian Quiroga, R. (2005), ‘Single-trial event-related potentials with wavelet denoising:
method and applications’, International Congress Series 1278(0), 429–432.
Raz, J., Dickerson, L. & Turetsky, B. (1999), ‘A wavelet packet model of evoked potentials’,
Brain and Language 66(1), 61–88.
Roeser, R., Valente, M. & Hosford-Dunn, H. (2000), Audiology: Diagnosis, Thieme.
Rosenhall, U., Björkman, G., Pedersen, K. & Kall, A. (1985), ‘Brain-stem auditory evoked
potentials in different age groups’, Electroencephalography and Clinical
Neurophysiology/Evoked Potentials Section 62(6), 426–430.
Rossi, L., Bianchi, A. M., Merzagora, A., Gaggiani, A., Cerutti, S. & Bracchi, F. (2007),
‘Single trial somatosensory evoked potential extraction with arx filtering for a com-
bined spinal cord intraoperative neuromonitoring technique’, BioMedical Engineering
Online 6.
Rushaidin, M. M., Salleh, S. H., Swee, T. T., Najeb, J. M. & Arooj, A. (2009), ‘Wave v de-
tection using instantaneous energy of auditory brainstem response signal’, American
Journal of Applied Sciences 6(9), 1669–1674.
Samar, V. J., Bopardikar, A., Rao, R. & Swartz, K. (1999), ‘Wavelet analysis of neuro-
electric waveforms: A conceptual tutorial’, Brain and Language 66(1), 7–60.
Scherg, M. & Von Cramon, D. (1985), ‘Two bilateral sources of the late aep as identified
by a spatio-temporal dipole model’, Electroencephalography and Clinical Neurophys-
iology/Evoked Potentials Section 62(1), 32–44.
Schwender, D., Rimkus, T., Haessler, R., Klasing, S., Poppel, E. & Peter, K. (1993), ‘Ef-
fects of increasing doses of alfentanil, fentanyl and morphine on mid-latency auditory
evoked potentials’, British Journal of Anaesthesia 71(5), 622–628.
Scott, M., Anthony, J. B., Tzyy-ping, J. & Terrence, J. S. (1996), ‘Independent component
analysis of electroencephalographic data’.
Shangkai, G. & Loew, M. H. (1986), ‘An autoregressive model of the baep signal for
hearing-threshold testing’, Biomedical Engineering, IEEE Transactions on BME-33(6),
560–565.
Singh, B. N. & Tiwari, A. K. (2006), ‘Optimal selection of wavelet basis function applied
to ecg signal denoising’, Digital Signal Processing: A Review Journal 16(3), 275–287.
Sokolov, Y., Kurtz, I., Steinman, A., Long, G. & Sokolova, O. (2005), ‘Integrity technology:
Enabling practical abr’.
Soustiel, J. F., Hafner, H., Chistyakov, A. V., Barzilai, A. & Feinsod, M. (1995), ‘Trigem-
inal and auditory evoked responses in minor head injuries and post-concussion syn-
drome’, Brain Injury 9(8), 805–813.
Steinhoff, H. J., Böhnke, F. & Janssen, T. (1988), ‘Click abr intensity-latency characteristics
in diagnosing conductive and cochlear hearing losses’, European Archives of
Oto-Rhino-Laryngology 245(5), 259–265.
Stockard, J. & Rossiter, V. (1977), ‘Clinical and pathologic correlates of brain stem audi-
tory response abnormalities’, Neurology 27(4), 316.
Strang, G. & Nguyen, T. (1997), Wavelets and filter banks, rev. edn,
Wellesley-Cambridge Press.
Strauss, D. J., Delb, W., Plinkert, P. K. & Schmidt, H. (2004), Fast detection of wave v in
abrs using a smart single sweep analysis system, in ‘Annual International Conference
of the IEEE Engineering in Medicine and Biology - Proceedings’, Vol. 26 I of Con-
ference Proceedings - 26th Annual International Conference of the IEEE Engineering
in Medicine and Biology Society, EMBC 2004, pp. 458–461.
Stuart, A., Yang, E. Y. & Botea, M. (1996), ‘Neonatal auditory brainstem responses
recorded from four electrode montages’, Journal of Communication Disorders
29(2), 125–139.
Sun, Y. & Chen, Z. X. (2008), Fast extraction method of auditory brainstem response
based on wavelet transformation, in ‘Proceedings of the 2007 International Confer-
ence on Wavelet Analysis and Pattern Recognition, ICWAPR ’07’, Vol. 4 of 2007
International Conference on Wavelet Analysis and Pattern Recognition, ICWAPR
’07, Beijing, pp. 1862–1864.
Suter, C. M. & Brewer, C. C. (1983), ‘Auditory brain stem response wave v latency-
intensity function and three audiologic measures of cochlear function’, Ear and Hear-
ing 4(4), 212–219.
Tann, J., Wilson, W., Bradley, A. & Wanless, G. (2009), ‘Progress towards universal
neonatal hearing screening: A world review’, Australian and New Zealand Journal of
Audiology 31(1), 3–14.
Thornton, C. (1991), ‘Evoked potentials in anaesthesia’, European Journal of Anaesthesi-
ology 8(2), 89–107.
Thornton, C., Heneghan, C. P. H., James, M. F. M. & Jones, J. G. (1984), ‘Effects of
halothane or enflurane with controlled ventilation on auditory evoked potentials’,
British Journal of Anaesthesia 56(4), 315–323.
Urhonen, E., Jensen, E. W. & Lund, J. (2000), ‘Changes in rapidly extracted auditory
evoked potentials during tracheal intubation’, Acta Anaesthesiologica Scandinavica
44(6), 743–748.
Van Campen, L. E., Sammeth, C. A., Hall 3rd, J. W. & Peek, B. F. (1992), ‘Comparison
of etymotic insert and tdh supra-aural earphones in auditory brainstem response
measurement’, Journal of the American Academy of Audiology 3(5), 315–323.
Vannier, E., Adam, O., Karasinski, P., Ohresser, M. & Motsch, J. F. (2001), ‘Computer-
assisted abr interpretation using the automatic construction of the latency-intensity
curve’, Audiology 40(4), 191–201.
Vannier, E. & Naït-Ali, A. (2004), ‘Baeps averaging analysis using autoregressive
modelling’, Journal of Clinical Monitoring and Computing 18(3), 147–155.
Vaz, C. A. & Thakor, N. V. (1989), ‘Adaptive fourier estimation of time-varying evoked
potentials’, IEEE Transactions on Biomedical Engineering 36(4), 448–455.
Visani, E., Agazzi, P., Scaioli, V., Giaccone, G., Binelli, S., Canafoglia, L., Panzica,
F., Tagliavini, F., Bugiani, O., Avanzini, G. & Franceschetti, S. (2005), ‘Fveps in
creutzfeldt-jakob disease: Waveforms and interaction with the periodic eeg pattern
assessed by single sweep analysis’, Clinical Neurophysiology 116(4), 895–904.
Wada, J. (1986), Handbook of electroencephalography and clinical neurophysiology, Else-
vier.
Wilson, W. J. (2004), ‘The relationship between the auditory brain-stem response and its
reconstructed waveforms following discrete wavelet transformation’, Clinical Neuro-
physiology 115(5), 1129–1139.
Wilson, W. J. & Aghdasi, F. (1999), ‘Discrete wavelet analysis of the auditory brainstem
response: Effects of subject age, gender and test ear’, IEEE AFRICON Conference
1, 291–296.
Wilson, W. J., Mills, P. C., Bradley, A. P., Petoe, M. A., Smith, A. W. B. & Dzulkarnain,
A. A. (2011), ‘Fast assessment of canine hearing using high click-rate baer’, Veterinary
Journal 187(1), 136–138.
Wilson, W. J., Winter, M., Kerr, G. & Aghdasi, F. (1998), ‘Signal processing of the au-
ditory brainstem response: Investigation into the use of discrete wavelet analysis’,
Proceedings of the South African Symposium on Communications and Signal Pro-
cessing, COMSIG pp. 17–22.
Woody, C. D. (1967), ‘Characterization of an adaptive filter for the analysis of variable
latency neuroelectric signals’, Medical & Biological Engineering 5(6), 539–554.
Yamakami, I., Yoshinori, H., Saeki, N., Wada, M. & Oka, N. (2009), ‘Hearing preservation
and intraoperative auditory brainstem response and cochlear nerve compound action
potential monitoring in the removal of small acoustic neurinoma via the retrosigmoid
approach’, Journal of Neurology, Neurosurgery & Psychiatry 80(2), 218–227.
Yılmaz, S., Karalı, E., Tokmak, A., Güçlü, E., Koçer, A. & Öztürk, Ö. (2009), ‘Auditory
evaluation in parkinsonian patients’, European Archives of Oto-Rhino-Laryngology
266(5), 669–671.
Yousefi, S. (2004), An Investigation of the Auditory Brainstem Response Characteristics
of People with Parkinson’s Disease, PhD thesis, Faculty of Engineering and Industrial
Sciences.
Zhang, R., McAllister, G., Scotney, B., McClean, S. & Houston, G. (2006), ‘Combining
wavelet analysis and bayesian networks for the classification of auditory brainstem
response’, IEEE Transactions on Information Technology in Biomedicine 10(3), 458–
467.
Appendix A
Ethics approval
Appendix B
Journal Publications
PLEASE NOTE
Appendix B is unable to be reproduced online.
Please consult print copy held in the Swinburne Library or click on the links below.
De Silva, AC, Schier, MA (2011) Evaluation of wavelet techniques in
rapid extraction of ABR variations from underlying EEG. Physiological Measurement 32 (11) 1747-1761
DOI: 10.1088/0967-3334/32/11/s03
De Silva, AC, Sinclair, NC, Liley, DT (2012) Limitations in the rapid extraction of evoked potentials using parametric modelling.
IEEE Transactions in Biomedical Engineering 59(5) 1462-1471 DOI: 10.1109/TBME.2012.2188527
Appendix C
Stimulation and acquisition of the
ABR
i Introduction
The study of the ABR recording setup is peripheral to the main aim of this thesis and is
therefore presented as an appendix. Aberrations were observed during the recordings, which
warranted an exploration of the parameters affecting the quality of the ABR recording.
This study aims to achieve the following:
• Explore parameters affecting the stimulus artifact
– Type of transducer used for auditory stimulation
– Electrode montage used to record scalp voltages
• Assess the quality of ABRs produced by a range of audio transducers
• Minimise the effect of the stimulus artifact
In summary, the results suggest that, although all the transducers generated a stimulus
artifact of considerably large amplitude, its duration did not affect the early
components of the ABR. Temporal delays of ABR components were observed with some
transducers, which would therefore require calibration before their use in diagnostic tests.
ii Methodology
ii.i Pilot study
The initial recording was conducted with one participant stimulated at an intensity
of 60 dB nHL at 21.1 Hz. Electrodes were placed at Cz, Fpz and the mastoid. The
remaining parameters were set according to table 1. The resultant ABR, obtained with an MTA
of 1024 epochs (deemed sufficient with an Fsp = 3.51), is shown in figure 1. The
prominent observation is a stimulus artifact at the onset of the auditory stimulation
(t = 0 ms), generated by the magnetic field induced by the audio transducer. The critical
observation is the duration of the stimulus artifact, which extends towards 1 ms and
threatens to distort wave I. However, the general morphology of the ABR is visible, with
the maximum peak at 5.8 ms corresponding to wave V.
As a result, we formulated a method to investigate the suitability of a variety of audio
transducers for stimulating the ABR. The experimental setup and the results obtained
from this study are reported in the following sections.
Figure 1: ABR recording with a stimulus artifact at t = 0 (De Silva & Schier 2009)
Parameter                 Settings

Stimulus parameters
  Type                    Click
  Pulse width             0.1 ms
  Polarity                A square pulse with a negative polarity
  Frequency               > 20 Hz
  Intensity               Variable in dB nHL
  No. of epochs           Variable to obtain an ABR with adequate SNR
  Mode                    Monaural
  Masking                 Only if the ABR is abnormal

Acquisition parameters
  Electrode montage
    Non-inverting         Cz or Fz
    Inverting             A or M (ipsilateral)
    Ground                Fpz
  Filtering
    High-pass             30 Hz
    Low-pass              3 kHz
  Amplification           100000
  Sampling rate           40 kHz
  Analysis time           15 ms
  Pre-stimulus interval   10% of the analysis time

Table 1: Settings for a typical ABR recording (Hall 2007, Van Campen et al. 1992).
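The acquisition settings in table 1 fix the size of each recorded epoch. The short sketch below works through that arithmetic. The function name `epoch_layout` is illustrative only, and it assumes that the pre-stimulus interval is taken from within the 15 ms analysis window, which table 1 does not state explicitly.

```python
# Settings copied from table 1.
SAMPLING_RATE_HZ = 40_000       # sampling rate: 40 kHz
ANALYSIS_TIME_MS = 15.0         # analysis time: 15 ms
PRE_STIMULUS_FRACTION = 0.10    # pre-stimulus interval: 10% of the analysis time


def epoch_layout(fs_hz, analysis_ms, pre_fraction):
    """Return the number of samples in one epoch and its pre/post-stimulus split.

    Assumption: the pre-stimulus interval lies inside the analysis window.
    """
    total = round(fs_hz * analysis_ms / 1000.0)
    pre = round(total * pre_fraction)
    return {
        "total_samples": total,
        "pre_stimulus_samples": pre,
        "post_stimulus_samples": total - pre,
    }


layout = epoch_layout(SAMPLING_RATE_HZ, ANALYSIS_TIME_MS, PRE_STIMULUS_FRACTION)
print(layout)
```

Under these assumptions an epoch holds 600 samples, of which 60 (1.5 ms) precede the stimulus.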
ii.ii Selection of audio transducers
Eight audio transducers based on moving-coil technology were tested and are summarised
in table 2. These transducers included supra-aural and circum-aural headphones (with
different padding at the ear piece) and outer-ear and in-ear earphones with different
positioning at the ear. The TDH-49 and TDH-39 headphones were considered the ‘gold
standard’ for comparison purposes (Hall 2007). The Qantas™, Panasonic™ and Nokia™ outer-ear
earphones were selected for their common availability. The 1.2M earphone provided
in-ear positioning, which could stimulate the tympanic membrane with less distortion owing
to its close proximity, and is analogous to the clinically used Etymotic ER-3A insert earphones
(Hall 2007, Wilson 2004, Wilson & Aghdasi 1999). Superior sound quality prompted the
Altronics™ and Philips™ circum-aural headphones to be tested for stimulus delivery. From
here onwards, the audio transducers are referred to by their letter labels, ‘A’ to ‘H’, in table 2.

Label  Model           Type                     Description
A      TDH-49p         Supra-aural Headphones   Telephonics (Audiometer)
B      TDH-39          Supra-aural Headphones   Telephonics Voyager 522
C      Qantas          Outer-ear Earphones      Comfort kit (Entertainment)
D      Panasonic       Outer-ear Earphones      RQ-E27V (Entertainment)
E      Nokia           Outer-ear Earphones      5200 (Communication)
F      1.2M (in-ear)   In-ear Earphones         Capdase (Entertainment)
G      Altronics       Circum-aural Headset     C 9073 (Professional pilot)
H      Philips         Circum-aural Headphone   SB347 (Entertainment)

Table 2: Description of the audio transducers used in this study.
ii.iii Selection of electrode montages
A potential implication of using non-standard audio transducers is the possibility of
inducing a stimulus artifact through the interaction of the scalp electrodes with the stray
magnetic field generated by the transducer. To study such an effect, out of the standard
electrode montages listed in table 1, the inverting electrode positions at the earlobe and the
mastoid were assessed because of their close proximity to the audio transducer. The
non-inverting and ground electrodes were kept at Cz and Fpz respectively. Cz was chosen
over Fz for the non-inverting electrode to obtain a larger wave V (Kavanagh &
Clark 1989). The two electrode montages evaluated are summarised in table 3.
                Earlobe montage   Mastoid montage
Non-inverting   Cz                Cz
Inverting       Earlobe           Mastoid
Ground          Fpz               Fpz

Table 3: Two electrode montages tested in this study. The position of the inverting electrode differed
between the two montages, from the earlobe to the mastoid, to assess the interaction with the EM field
of the transducer.
ii.iv Measurement approaches
As evident in figure 1, it is difficult to determine the end time point of the stimulus artifact
because it overlaps with the ongoing EEG. Therefore, a phantom head with a realistic shape and
electrical properties similar to those of an average human head was used to examine the
stimulus artifact without the interference of any physiological potential. The same audio
transducers were then tested on human participants to analyse the combined
effect with ABRs. Both electrode montages in table 3 were tested on the phantom head
and on human participants.

Figure 2: Recording setup with electrode placement and headphones: (a) human participant,
(b) phantom. TDH-49p headphones are worn here, with the electrodes connected in the earlobe
montage. The picture was taken with the permission of the participant.
Data were collected from four healthy human participants (2 male and 2 female) with
an age range of 24 to 26 years. The Swinburne University Human Experimentation Ethics
Committee approved this study, and each participant gave written informed consent in
accordance with these requirements. The official ethics clearance details are attached
in Appendix A. All the transducers listed in table 2 were used on each participant. The
recording setup is shown in figure 2a. The polystyrene phantom head, with dimensions
similar to those of an average human head, is shown in figure 2b; it was used to study the
stimulus artifact in isolation.
Inter-electrode impedance is the critical electrical parameter affecting voltage
measurements from the scalp and therefore had to be simulated on the phantom. A series
combination of resistors and capacitors was used on the phantom scalp between the
ground-inverting and ground-non-inverting electrode pairs (Wood, Hamblin & Croft 2003).
The inter-electrode impedances used on the phantom scalp were based on the average
impedances of the human participants. The values obtained from the 32 independent readings
of human participants and the comparable values used for the phantom are given in table 4,
indicating the validity of the phantom model. All the impedances were measured at 100 Hz
using a GW Digital LCR meter produced by GW Instek™ (New Taipei City, Taiwan).

              Resistance (kΩ)   Capacitance (µF)   No. of measurements
Human scalp   3.0 ± 1.4         2.3 ± 0.9          32
Phantom       3.1               3                  16

Table 4: Average inter-electrode impedance measurements between ground-inverting and
ground-non-inverting electrodes at 100 Hz. Variation values indicate standard deviation.
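As a rough check on how closely the phantom matches the human scalp, the resistance and capacitance values can be combined into a single impedance at the 100 Hz measurement frequency. The sketch below assumes an ideal series RC model, Z = R - j/(2πfC), in line with the series combination described above. The function name `series_rc_impedance` is illustrative and not part of the thesis scripts.

```python
import math


def series_rc_impedance(resistance_ohm, capacitance_farad, frequency_hz):
    """Complex impedance of an ideal resistor and capacitor in series."""
    reactance = 1.0 / (2.0 * math.pi * frequency_hz * capacitance_farad)
    return complex(resistance_ohm, -reactance)


# Mean values from table 4, evaluated at the 100 Hz measurement frequency.
z_human = series_rc_impedance(3.0e3, 2.3e-6, 100.0)
z_phantom = series_rc_impedance(3.1e3, 3.0e-6, 100.0)

for label, z in (("human scalp", z_human), ("phantom", z_phantom)):
    phase_deg = math.degrees(math.atan2(z.imag, z.real))
    print(f"{label}: |Z| = {abs(z):.0f} ohm, phase = {phase_deg:.1f} deg")
```

With these values both magnitudes come out at roughly 3.1 kΩ and are dominated by the resistive part.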
ii.v Equipment and parameters
A conducting gel-injected disk electrode was used at Cz, and 3M™ disposable electrodes
were used at the earlobe, mastoid and Fpz locations. All the electrodes had silver chloride
surfaces to achieve comparable surface impedances. The recording setup consisted of a
PowerLab™ amplifier with a gain of 100k and Chart-5 software produced by ADInstruments™
(Sydney, Australia) as the interface to collect data. The auditory stimulus was a negative
polarity square pulse with a width of 0.1 ms at a frequency of 21.1 Hz and was delivered via a
Telephonics™ TDH-49 headphone. The data were sampled at a frequency of 40 kHz and
band-pass filtered between 100 and 3000 Hz with a 3rd order Butterworth filter using a
zero-phase shifting method (Oppenheim & Schafer 1999). The low cut-off frequency
of 100 Hz, as opposed to the 30 Hz in table 1, was chosen to minimize the effect of noise from
ongoing EEG and myogenic artifacts (Corona-Strauss, Delb, Schick & Strauss 2010, Rushaidin,
Salleh, Swee, Najeb & Arooj 2009, Petoe, Bradley & Wilson 2010). The zero-phase shifting
filter was used specifically to preserve the latencies of the ABR waves. The signal
is filtered in both the forward and backward directions so that the phase shift introduced
by the first pass is cancelled by the second. This operation doubles the effective filter
order and requires additional computation, but has the advantage of retaining the phase
characteristics. A sound intensity of 60 dB nHL was maintained at each audio transducer
output. Custom-written scripts were used for offline analysis in MATLAB™ (MATLAB 2008),
produced by MathWorks (Natick, Massachusetts, USA).
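The latency-preserving property of forward-backward filtering can be illustrated with a small sketch. To keep it self-contained, a first-order low-pass filter stands in for the 3rd order Butterworth band-pass used in the study, and the smoothing constant, pulse shape and function names are illustrative assumptions. The principle is the same: the second, time-reversed pass cancels the delay introduced by the first.

```python
import math


def one_pole_lowpass(samples, alpha):
    """Causal first-order IIR low-pass: y[n] = alpha*x[n] + (1 - alpha)*y[n-1]."""
    output, previous = [], 0.0
    for sample in samples:
        previous = alpha * sample + (1.0 - alpha) * previous
        output.append(previous)
    return output


def zero_phase(samples, alpha):
    """Filter forwards, then filter the time-reversed result and reverse it back."""
    forward = one_pole_lowpass(samples, alpha)
    backward = one_pole_lowpass(forward[::-1], alpha)
    return backward[::-1]


def peak_index(samples):
    """Index of the largest sample."""
    return max(range(len(samples)), key=samples.__getitem__)


# Synthetic symmetric pulse with its peak at sample 100 (a stand-in for an ABR wave).
pulse = [math.exp(-0.5 * ((i - 100) / 8.0) ** 2) for i in range(201)]

single_pass = one_pole_lowpass(pulse, 0.2)
double_pass = zero_phase(pulse, 0.2)

print("original peak index:      ", peak_index(pulse))
print("single-pass peak index:   ", peak_index(single_pass))
print("forward-backward peak idx:", peak_index(double_pass))
```

The single causal pass shifts the peak to a later sample, whereas the forward-backward result keeps it at the original position. This is the behaviour relied upon to preserve ABR wave latencies.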
ii.vi Analysis criteria
To determine the best setup for recording ABR data, the following
criteria were analysed for each audio transducer.
(i) Determine the minimum impact of the stimulus artifact among earlobe and mastoid
electrode montages using phantom recordings
(ii) Analyse the separation between the stimulus artifact and the wave I
(iii) Analyse the ABR latencies produced in comparison to normative data
ii.vii Stimulus artifact end time (SAET)
As identified, the amplitude of the stimulus artifact matters less for the early components
of the ABR than the width of the artifact. Therefore, a new measurement, defined as the
stimulus artifact end time (SAET), was used to analyse the duration of the stimulus artifact.
SAET is the time from the onset of the stimulus to the point at which the decaying stimulus
artifact comes within 1 standard deviation of the pre-stimulus baseline. A visualisation of
SAET is shown in figure 3.

Figure 3: Measurement of stimulus artifact end time (SAET). Standard deviation is calculated for the
pre-stimulus interval from -1 to 0 ms. Here the decaying stimulus artifact reaches within the standard
deviation just after 0.5 ms. Therefore according to the criterion SAET = 0.65 ms.
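The SAET criterion can be sketched in a few lines. This is a minimal interpretation rather than the analysis script used in the study: it assumes the artifact is the largest post-stimulus deflection and takes the first sample after that peak lying within 1 standard deviation of the pre-stimulus mean. The function name and the synthetic test signal are illustrative. A strongly oscillating artifact may need a stricter rule, since it can cross the baseline before it has decayed.

```python
import math
import statistics


def stimulus_artifact_end_time(times_ms, signal, baseline=(-1.0, 0.0)):
    """Return the SAET in ms, or None if the artifact never settles.

    Assumption: the artifact is the largest post-stimulus deflection, and it ends
    at the first later sample within 1 SD of the pre-stimulus baseline mean.
    """
    pre = [v for t, v in zip(times_ms, signal) if baseline[0] <= t < baseline[1]]
    baseline_mean = statistics.mean(pre)
    baseline_sd = statistics.stdev(pre)

    post = [(t, abs(v - baseline_mean)) for t, v in zip(times_ms, signal) if t >= 0.0]
    artifact_peak = max(range(len(post)), key=lambda i: post[i][1])

    for t, deviation in post[artifact_peak:]:
        if deviation <= baseline_sd:
            return t
    return None


# Synthetic recording sampled at 40 kHz from -1 ms to 2 ms: a small ripple as
# background noise plus an exponentially decaying artifact starting at t = 0.
times = [(i - 40) / 40.0 for i in range(121)]
ripple = [0.05 * math.sin(2.0 * math.pi * i / 7.0) for i in range(121)]
artifact = [10.0 * math.exp(-t / 0.1) if t >= 0.0 else 0.0 for t in times]
recording = [r + a for r, a in zip(ripple, artifact)]

saet = stimulus_artifact_end_time(times, recording)
print("SAET (ms):", saet)
```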
ii.viii Normative latency data
The introduction of new transducers should be assessed by the quality of the ABRs they
produce. Two measures were considered here: the presence of waves I, II, III and V, and
their latencies. The latencies of the ABR features were benchmarked against normative data
to strengthen the conclusion. Table 5 shows published values for ABR wave latencies obtained
under similar experimental conditions. Stimulus intensity is the critical stimulus parameter
affecting ABR latency. These data were extracted from ABRs generated with sound intensities
in the range of 75-85 dB nHL; however, as evident in figure 3.5(a), latency variations at high
intensities are negligible, so the latencies in table 5 are comparable with each other within
the relevant variance. These were used as normative data for comparison with the range of
audio transducers and montages.
iii Results
iii.i Effect of the stimulus artifact on electrode montages with the
phantom
To observe the effect of the magnetic field on the inverting electrode placed at the earlobe
and mastoid, recordings were conducted on the phantom with all the transducers. The
stimulus artifact at the earlobe is shown in figure 4 and that at the mastoid in figure 5.
The labels on the vertical axis refer to the audio transducers listed in table 2. The SAETs
calculated from these recordings are tabulated in table 6 for both electrode montages.
iii.ii Results for the separation between the stimulus artifact and wave
I
Figure 6 and figure 7 show the ABR recordings of a single participant using the two
electrode montages. Each figure shows 8 ABRs which were obtained using the audio
transducers listed in table 2. Physiologically important waves I, II, III and V are shown
on top of each ABR with vertical markers from left to right. The latency of wave I was
calculated with these data to determine the distortion by the stimulus artifact.
Audio        Wave I        Wave II       Wave III      Wave V        Source
transducer   x̄     σ       x̄     σ       x̄     σ       x̄     σ
TDH-49       1.62  0.12    N/A           3.76  0.14    5.74  0.2     (Van Campen et al. 1992)
TDH-39       1.61  0.13    N/A           3.78  0.17    5.76  0.21    (Van Campen et al. 1992)
TDH-39       1.54  0.08    2.67  0.13    3.73  0.1     5.52  0.15    (Antonelli, Bellotto & Grandori 1987)
TDH-39       1.87  0.18    2.88  0.2     3.83  0.2     5.82  0.25    (Rowe III 1978)
ER-3A        1.54  0.1     N/A           3.7   0.15    5.6   0.19    (Schwartz, Pratt Jr & Schwartz 1989)
Table 5: Published data for ABR waves in literature. These will be considered as normative data and the gold standard for this study.
Figure 4: Phantom recordings with inverting electrode at earlobe
Figure 5: Phantom recordings with inverting electrode at mastoid
Transducer label   SAET
                   Earlobe (ms)   Mastoid (ms)
A                  0.5            0.58
B                  0.6            0.5
C                  0.65           0.68
D                  0.65           0.68
E                  0.65           0.48
F                  0.63           0.63
G                  0.65           0.65
H                  0.65           0.65
x̄                  0.62           0.61
σ                  0.05           0.08
Table 6: SAETs for all the transducers, with the inverting electrode at the earlobe and mastoid. Refer to table 2 for transducer descriptions.
Figure 8 shows the separation time between the wave I and the SAET for all the
participants. Here, SAETs were the values derived from phantom data in table 6. It is
evident that there exists an effect of the audio transducer on the separation time and a
variation between electrode montages.
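The separation time plotted in figure 8 is the wave I latency minus the SAET. A minimal sketch of that subtraction is given here, using the phantom SAETs from table 6. The wave I latency of 1.62 ms is only a stand-in taken from the normative data in table 5 (TDH-49); it is not a measurement from this study, and the variable names are illustrative.

```matlab
% Separation between SAET and wave I (ms); SAETs from table 6, transducers A-H
saet_earlobe = [0.5 0.6 0.65 0.65 0.65 0.63 0.65 0.65];
saet_mastoid = [0.58 0.5 0.68 0.68 0.48 0.63 0.65 0.65];
wave1 = 1.62;                         % placeholder wave I latency (ms), normative value from table 5
sep_earlobe = wave1 - saet_earlobe;   % separation, earlobe montage
sep_mastoid = wave1 - saet_mastoid;   % separation, mastoid montage
labels = 'ABCDEFGH';
for i = 1:numel(labels)
    fprintf('%c: earlobe %.2f ms, mastoid %.2f ms\n', labels(i), sep_earlobe(i), sep_mastoid(i));
end
```

With this placeholder latency the separation is roughly 1 ms for every transducer; in the study itself the wave I latency of each participant was used.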
Figure 6: ABRs produced with inverting electrode at earlobe. These ABRs contain average of 1024 epochs for each transducer A-H. Vertical markers on ABR show wave I, II, III and V from left to right.
Figure 7: ABRs produced with inverting electrode at mastoid. These ABRs contain average of 1024 epochs for each transducer A-H. Vertical markers on ABR show wave I, II, III and V from left to right.
Figure 8: The average separation between the SAET and the ABR wave I. Across all participants for each transducer. SAETs are taken from phantom measurements.
iii.iii Effect of transducer type on the latency
Figures 9a and 9b show the latencies of waves I, II, III and V for the earlobe and mastoid
electrode montages respectively. These values are averages across all participants. The
standard deviation among participants is shown as horizontal error bars. Vertical grey bars
represent the probable range for each wave latency according to the published normative
data in table 5. This allows a direct comparison of the results obtained from this study
with the published standard data.
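The comparison against the normative range can be sketched as below for wave V, using the values in table 5. How the grey bars were constructed is not stated here, so taking the range as the average published mean plus or minus the average published σ is an assumption, and the measured latency is a hypothetical value.

```matlab
% Wave V latencies from table 5 (ms): mean and sigma reported by each source
mu_V = [5.74 5.76 5.52 5.82 5.6];
sd_V = [0.2 0.21 0.15 0.25 0.19];
lo = mean(mu_V) - mean(sd_V);     % assumed lower edge of the normative range
hi = mean(mu_V) + mean(sd_V);     % assumed upper edge of the normative range
measured = 5.7;                   % hypothetical mean wave V latency of one transducer (ms)
in_range = measured >= lo && measured <= hi;
fprintf('Wave V range %.2f-%.2f ms, measured %.2f ms, within range: %d\n', lo, hi, measured, in_range);
```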
(a) Average latency of wave I, II, III and V from all participants with inverting electrode at earlobe.
(b) Average latency of wave I, II, III and V from all participants with inverting electrode at mastoid.
Figure 9: Latency of ABR waves with inverting electrode at earlobe and mastoid. The horizontal axis represents transducers from A-H on the vertical axis. Standard deviation is in horizontal error bars. Vertical grey bars represent the average range of published data for ABR waves.
iv Discussion
iv.i Stimulus artifact and electrode montage
A visual inspection of figures 4 and 5 suggests that the end of the stimulus artifact is
closely aligned within each electrode montage. A paired two-tailed t-test of the SAETs in
table 6 confirms that the difference in SAET between the earlobe and the mastoid
electrode montages is not significant (t(7) = 0.57, p = 0.58, 95%). Therefore, we can
conclude that there is no effect on the length of the stimulus artifact from either the
transducer or the electrode montage.
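The paired t-test can be reproduced from the values in table 6. The sketch below uses base MATLAB only: the two-tailed p-value is obtained from the regularised incomplete beta function rather than a statistics toolbox call, and the variable names are illustrative.

```matlab
% Paired two-tailed t-test on the SAETs of table 6 (ms), transducers A-H
earlobe = [0.5 0.6 0.65 0.65 0.65 0.63 0.65 0.65];
mastoid = [0.58 0.5 0.68 0.68 0.48 0.63 0.65 0.65];
dif  = earlobe - mastoid;                  % paired differences
n    = numel(dif);
df   = n - 1;                              % degrees of freedom
tval = mean(dif) / (std(dif)/sqrt(n));     % t statistic
pval = betainc(df/(df + tval^2), df/2, 0.5);   % two-tailed p-value of Student's t
fprintf('t(%d) = %.2f, p = %.2f\n', df, tval, pval);
```

The result should land close to the reported t(7) = 0.57 and p = 0.58.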
It is also evident from these figures that the amplitudes of the stimulus artifacts vary.
However, this measure has no adverse effect on ABR features.
A comparison of stimulus artifact amplitudes with those of the gold standard transducers A
and B suggests that transducer F in the earlobe montage (figure 4) and transducers C, D, E
and F in the mastoid montage (figure 5) have comparable values. Therefore they have the
potential to be used in ABR recordings. The differing amplitudes of the same transducer in
the two electrode montages confirm the assumption that the electrode montage has a
considerable effect on the stimulus artifact.
As substitutes for a standard audio transducer for stimulation, all the transducers
tested here provided promising results for the SAET. Therefore they were all tested and
analysed further on human participants for:
1. separation between the stimulus artifact and the wave I
2. quality of the ABR produced
in order to arrive at a comprehensive conclusion in sections (iv.ii) and (iv.iii).
iv.ii Analysis of the separation of SAET and wave I
A visual observation of figures 6 and 7 indicates a clear separation in time between the
stimulus artifact and wave I for all the transducers in both electrode montages. With regard
to amplitudes, a similar visual comparison reveals that the introduced transducers D, F and
G with reference to the earlobe, and C, G and H with reference to the mastoid, have
comparatively larger amplitudes than the gold standard transducers A and B.
Precise values for the average separation among all participants, shown in figure 8,
suggest that all the introduced transducers produce an average separation time of 1 ms,
which is comparable with the 1.06 ms of separation produced by the gold standard
transducers A and B.
An ANOVA was performed on the separation values presented in figure 8 to investigate
the effect of the magnetic field on the electrode montages for each transducer. The results
suggest that there is no significant difference in separation time between the two electrode
montages for all the transducers tested (F(1, 63) = 0.11, p = 0.74, 95%). Therefore, despite
these minor variations, the separation between the stimulus artifact and wave I produced by
the introduced transducers (C, D, E, F, G and H) is comparable with that of the gold
standard transducers.
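The mechanics of a one-way ANOVA between two montages can be sketched in base MATLAB as below. The per-participant separation values behind figure 8 are not tabulated here, so the SAETs of table 6 are used purely as stand-in data; the resulting F and p therefore do not correspond to the values reported above.

```matlab
% One-way ANOVA between two montages, base MATLAB only.
% Stand-in data: SAETs from table 6 (ms); the analysis above used the separation values of figure 8.
g1 = [0.5 0.6 0.65 0.65 0.65 0.63 0.65 0.65];     % earlobe
g2 = [0.58 0.5 0.68 0.68 0.48 0.63 0.65 0.65];    % mastoid
all_x = [g1 g2];
gm  = mean(all_x);                                              % grand mean
ssb = numel(g1)*(mean(g1)-gm)^2 + numel(g2)*(mean(g2)-gm)^2;    % between-group sum of squares
ssw = sum((g1-mean(g1)).^2) + sum((g2-mean(g2)).^2);            % within-group sum of squares
df1 = 1;                           % number of groups - 1
df2 = numel(all_x) - 2;            % number of observations - number of groups
F   = (ssb/df1) / (ssw/df2);       % F statistic
p   = betainc(df2/(df2 + df1*F), df2/2, df1/2);   % upper tail of the F distribution
fprintf('F(%d,%d) = %.2f, p = %.2f\n', df1, df2, F, p);
```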
As observed, the stimulus artifact is unavoidable when the transducer is located close
to the head. Therefore, it is important to exclude this artifact at the signal processing stage
of the ABR, because some processing methods might pick up the artifact as an important
feature and neglect actual ABR waves.
iv.iii Quality of the ABRs
It can be observed that waves I, II, III and V are present in all ABRs evoked by all the
transducers in figures 6 and 7. This is an encouraging result when assessing the quality of
transducers for the purpose of generating ABRs.
The absolute latencies of ABR waves are constant for a given participant at a specific
sound intensity level (Misra & Kakita 1999). Therefore the reliability of the ABR produced
by each audio transducer could be determined by comparing latencies of wave I, II, III
and V with published normative data (Table 5).
According to figure 9, the mean latencies of all waves produced by the gold standard
transducers A and B fall well within the normative ranges, and their variation overlaps
substantially with those ranges. Since no significant difference in wave V latency is
expected above 60 dB nHL, it is confirmed that the experimental conditions are comparable
with the conditions of the previous research.
Referring to figure 9, a noticeable lag for transducer G and a lead for transducer
F exist for all ABR features. This is a result of the proximity of the transducer to the
eardrum. The bulky Altronics headphone (G) has its transducer positioned away from
the outer ear, creating a lag, and the in-ear earphone (F) has it inside the ear canal close to
the eardrum, creating a lead. However, there is not enough variation to warrant a latency
correction for these two transducers in either electrode montage.
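The lag and lead described above are consistent with the acoustic travel time between the transducer and the eardrum. As a rough illustration, the sketch below assumes a speed of sound of about 343 m/s in air at room temperature and uses purely hypothetical path lengths; the actual transducer-to-eardrum distances are not taken from this study.

```matlab
% Acoustic travel time for a few hypothetical transducer-to-eardrum distances
c = 343;                               % assumed speed of sound in air (m/s)
path_mm  = [5 25 75];                  % hypothetical path lengths (mm)
delay_ms = (path_mm/1000) / c * 1000;  % travel time in ms
for i = 1:numel(path_mm)
    fprintf('%3d mm -> %.3f ms\n', path_mm(i), delay_ms(i));
end
```

Under these assumptions, a path difference of a few centimetres corresponds to a latency shift of a fraction of a millisecond.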
The reader should note that other measurements could improve the comparison study,
such as the spectral characteristics of the audio output, which would have a considerable
effect on the stimulation of hair cells. Calibrating the output of all earphones to a common
reference would also have greatly aided the interpretation of the variations in figures 6, 7
and 9. Including these in a future study would increase its coherence.
The ABR features produced by all the transducers in figure 9 suggest that both earlobe
and mastoid electrode montages produce consistent results for all the introduced audio
transducers.
v Summary
According to the analysis criteria stated in section (ii.vi), the results of this study are
summarised as shown below.
(i) The type of the transducer and the electrode montage did not affect the SAET.
Even though the amplitudes of the stimulus artifact from the introduced transducers
were larger than those of the gold standard, they had no impact on the ABR features.
(ii) The time separation between the SAET and wave I for the introduced transducers
did not differ significantly from that of the gold standard transducers A and B.
(iii) ABR features were produced by all the transducers and their mean latencies were
within the normative ranges. A small latency shift of approximately 0.2 ms was
observed in the ABR depending on the proximity of the transducer relative to the
tympanic membrane.
Considering the above statements, the conclusion is that it is feasible to use most
commercially available audio transducers in ABR studies. The latency shifts introduced
into the ABR by the transducers could be corrected to achieve a more accurate result.
Truncating the ABR with a time window of 1-10 ms will suppress the inevitable stimulus
artifact and prevent any false peaks.
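The 1-10 ms truncation can be sketched as an index calculation. The 40 kHz sampling rate follows the scripts in Appendix D; the epoch start of -11 ms is an assumption inferred from the data file names used there, and the epoch itself is random placeholder data.

```matlab
% Truncating an epoch to the 1-10 ms window
fs = 40;                           % samples per ms (40 kHz, as in Appendix D)
t0 = -11;                          % assumed epoch start (ms); epoch assumed to span -11 to 10 ms
epoch = randn(21*fs, 1);           % placeholder epoch, 840 samples
idx = round((1 - t0)*fs) + 1 : round((10 - t0)*fs);   % samples covering 1-10 ms
abr = epoch(idx);                  % truncated ABR, stimulus artifact excluded
t   = (idx - 1)/fs + t0;           % time axis of the retained samples (ms)
fprintf('kept samples %d-%d (%d samples)\n', idx(1), idx(end), numel(idx));
```

Under these assumptions the retained indices coincide with the 481:840 range used in the SWT scripts of Appendix D.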
The comparable morphology of the ABRs and their variations to those of previously
published reports gave an impetus to evaluate noise reduction methods for rapid extraction
with confidence. As the literature suggested, the commercially used (at the time of the
research) MLAEP extraction method based on ARX is comprehensively evaluated in the next
chapter for its adaptation to the early components, i.e. the ABR.
Appendix D
MATLAB scripts
Implementation of CTMC wavelet algorithm with DWT
% Processing data of a single sound intensity level with CTMC using DWT
% The template is derived with the grand average at 55 dB nHL
clear all
WT = 'bior5.5'; % mother wavelet
level = 6; % decomposition levels
threshold = 0.8; % retain (1-threshold)
bkl = [256 128 64 32 16 8]; % analysed block lengths
%% Template calculation using 55 dB nHL
load('sw_epochs_55_-11-10.mat');
[b,a] = butter(3,[100/20000,3000/20000]); % generate band pass filter coefficients
epochs = filtfilt(b,a,epochs); % zero phase filtering
tmpl = mean(epochs(401:840,1:1024),2); % calculating the grand average
tmpl((1:80),1) = 0; % remove stimulus artifact; tapering - zeros till 1ms
%% Loading dataset to be denoised
epochs = [];
load('sw_epochs_75_-11-10.mat')
epochs = filtfilt(b,a,epochs);
[C(:,1),Ltmpl] = wavedec(tmpl,level,WT); % wavelet decomposition of the template
for n = 1:length(bkl) %loop for different block lengths
    for k = 1:1024-256+1 %loop for continuous sweeping through the dataset
        %% Signal calculations
        sig(:,k) = mean(epochs(401:840,256-bkl(n)+k:256+k-1),2); % generate the noisy ABR with reduced number epochs
        sig((1:80),k) = 0;
        [C(:,2),Lsig] = wavedec(sig(:,k),level,WT); % wavelet decomposition of the noisy ABR
        %normalizing the simple average to plot
        signorm(:,k) = sig(:,k) - min(sig(:,k));
        signorm(:,k) = signorm(:,k)/max(signorm(:,k)); %normalizing the block
        signorm((1:80),k) = 0; %tapering - zeros till 1ms
        %% Extraction of thresholded coefficients of the template and the noisy ABR
        Ct(:,1) = zeros(length(C),1); % initialise coefficients of the thresholded template
        Ct(:,2) = zeros(length(C),1); % initialise coefficients of the thresholded signal
        for j = 1:2 %thresholding of template and signal coefficients
            % A6 - retaining all the coefficients
            A6 = C(1:Lsig(1,1),j);
            for i = 1:Lsig(1,1)
                Ct(i,j) = A6(i,1);
            end
            %D6
            D6 = C(Lsig(1,1)+1:sum(Lsig(1:2)),j);
            top20 = find(abs(D6)>max(abs(D6))*threshold); % applying the threshold
            for i = 1:length(top20)
                Ct(Lsig(1,1)+top20(i,1),j) = C(Lsig(1,1)+top20(i,1),j);
            end
            %D5
            D5 = C(sum(Lsig(1:2))+1:sum(Lsig(1:3)),j);
            top20 = find(abs(D5)>max(abs(D5))*threshold); % applying the threshold
            for i = 1:length(top20)
                Ct(sum(Lsig(1:2))+top20(i,1),j) = C(sum(Lsig(1:2))+top20(i,1),j);
            end
            % D4
            D4 = C(sum(Lsig(1:3))+1:sum(Lsig(1:4)),j);
            top20 = find(abs(D4)>max(abs(D4))*threshold); % applying the threshold
            for i = 1:length(top20)
                Ct(sum(Lsig(1:3))+top20(i,1),j) = C(sum(Lsig(1:3))+top20(i,1),j);
            end
            % D3
            D3 = C(sum(Lsig(1:4))+1:sum(Lsig(1:5)),j);
            top20 = find(abs(D3)>max(abs(D3))*threshold);
            for i = 1:length(top20)
                Ct(sum(Lsig(1:4))+top20(i,1),j) = 0; % set all coefficients to zero; out of the bandwidth of the ABR
            end
            % D2
            D2 = C(sum(Lsig(1:5))+1:sum(Lsig(1:6)),j);
            top20 = find(abs(D2)>max(abs(D2))*threshold);
            for i = 1:length(top20)
                Ct(sum(Lsig(1:5))+top20(i,1),j) = 0; % set all coefficients to zero; out of the bandwidth of the ABR
            end
            % D1
            D1 = C(sum(Lsig(1:6))+1:sum(Lsig(1:7)),j);
            top20 = find(abs(D1)>max(abs(D1))*threshold);
            for i = 1:length(top20)
                Ct(sum(Lsig(1:6))+top20(i,1),j) = 0; % set all coefficients to zero; out of the bandwidth of the ABR
            end
        end
        %% Matching thresholded template coefficients and thresholded noisy ABR coefficients for further refining
        match = find(abs(Ct(:,1))>0); % find significant coefficients of the template
        Ct(:,3) = zeros(length(C),1); % initialise coefficients of the matched signal with the thresholded template
        for i = 1:length(match)
            Ct(match(i,1),3) = Ct(match(i,1),2);
        end
        %% Reconstructing and plotting the denoised ABR
        recon(:,k) = waverec(Ct(:,3),Lsig,WT); % reconstruction of the denoised ABR
        %normalizing the filtered average to plot
        reconnorm(:,k) = recon(:,k) - min(recon(:,k));
        reconnorm(:,k) = reconnorm(:,k)/max(reconnorm(:,k)); %normalising
        reconnorm((1:80),k) = 0; %tapering - zeros till 1ms
        %% MSE calculation (amount of noise compared to the template)
        % tmpl is indexed over the same 1-10 ms samples (81:440) so that the vector lengths match
        mse_at(k,n) = mean((tmpl(81:440)-sig(81:440,k)).^2); % between conventional average and template
        mse_wt(k,n) = mean((tmpl(81:440)-recon(81:440,k)).^2); % between wavelet filtering and template
    end
    %% Plots
    figure
    contourf((41:400)/40,256:k+256-1,flipdim(reconnorm(81:440,:)',1),'LineStyle','none')
    title(['CTMC - Filtered average of ',num2str(bkl(n)),' blocks - ',WT])
    xlabel('Time (ms)'),ylabel('Epochs')
end
Implementation of TWMC wavelet algorithm with DWT
% Processing data of a single sound intensity level with TWMC
% The template is derived with the grand average at 55 dB nHL
clear all
WT = 'bior5.5'; % mother wavelet
level = 6; % decomposition levels
bkl = [256 128 64 32 16 8]; % analysed block lengths
%% Template calculation using 55 dB nHL
load('sw_epochs_55_-11-10.mat');
[b,a] = butter(3,[100/20000,3000/20000]); % generate band pass filter coefficients
epochs = filtfilt(b,a,epochs); % zero phase filtering
tmpl = mean(epochs(401:840,1:1024),2); % calculating the grand average
tmpl((1:80),1) = 0; % remove stimulus artifact; tapering - zeros till 1ms
%% Loading signal to be denoised
epochs = [];
load(['sw_epochs_75_-11-10.mat']);
epochs = filtfilt(b,a,epochs);
for j = 1:length(bkl) %loop for different block lengths
    for k = 1:1024-256+1 %loop for continuous blocks
        %% Signal calculations
        sig(:,k) = mean(epochs(401:840,256-bkl(j)+k:256+k-1),2); % generate the noisy ABR with reduced number epochs
        sig((1:80),k) = 0;
        %normalizing the simple average to plot (1-10ms)
        signorm(:,k) = sig(:,k) - min(sig(:,k)); %normalizing the block
        signorm(:,k) = signorm(:,k)/max(signorm(:,k)); %normalizing the block
        [Csig,Lsig] = wavedec(sig(:,k),level,WT); % wavelet decomposition of the noisy ABR
        for i = 1:level % reconstruction of the signal at each decomposition level
            sigcoeff(:,i) = wrcoef('d',Csig,Lsig,WT,i);
            if i ==level
                sigcoeff(:,i+1) = wrcoef('a',Csig,Lsig,WT,i);
            end
        end
        coeff = [];
        coeff_A6 = Csig(1:Lsig(1,1),1);
        coeff_D6 = Csig(Lsig(1,1)+1:sum(Lsig(1:2)),1);
        coeff_D5 = Csig(sum(Lsig(1:2))+1:sum(Lsig(1:3)),1);
        coeff_D4 = Csig(sum(Lsig(1:3))+1:sum(Lsig(1:4)),1);
        coeff_D3 = Csig(sum(Lsig(1:4))+1:sum(Lsig(1:5)),1);
        coeff_D2 = Csig(sum(Lsig(1:5))+1:sum(Lsig(1:6)),1);
        coeff_D1 = Csig(sum(Lsig(1:6))+1:sum(Lsig(1:7)),1);
        %% Choosing A6 coefficients
        Csig_filt = Csig; % creating a new variable to assign filtered coefficients
        filt_coeff_A6 = zeros(Lsig(1),1); % filtered A6
        A6 = [7 9 10 11]; % retaining coefficients
        for i = 1:length(A6)
            filt_coeff_A6(A6(i),1) = coeff_A6(A6(i),1);
        end
        Csig_filt(1:Lsig(1,1),1) = filt_coeff_A6;
        %% Choosing D6 coefficients
        filt_coeff_D6 = zeros(Lsig(2),1); %filtered D6
        D6 = [7 8 9 10]; %retaining coefficients
        for i = 1:length(D6)
            filt_coeff_D6(D6(i),1) = coeff_D6(D6(i),1);
        end
        Csig_filt(Lsig(1,1)+1:sum(Lsig(1:2)),1) = filt_coeff_D6;
        %% Choosing D5 coefficients
        filt_coeff_D5 = zeros(Lsig(3),1); %filtered D5
        D5 = [9 12 13 14]; %retaining coefficients
        for i = 1:length(D5)
            filt_coeff_D5(D5(i),1) = coeff_D5(D5(i),1);
        end
        Csig_filt(sum(Lsig(1:2))+1:sum(Lsig(1:3)),1) = filt_coeff_D5;
        %% Choosing D4 coefficients
        filt_coeff_D4 = zeros(Lsig(4),1); %filtered D4
        D4 = [9 10 11 13 20]; %retaining coefficients
        for i = 1:length(D4)
            filt_coeff_D4(D4(i),1) = coeff_D4(D4(i),1);
        end
        Csig_filt(sum(Lsig(1:3))+1:sum(Lsig(1:4)),1) = filt_coeff_D4;
        %% Choosing D3 coefficients
        filt_coeff_D3 = zeros(Lsig(5),1); %filtered D3 no coefficients are retained
        Csig_filt(sum(Lsig(1:4))+1:sum(Lsig(1:5)),1) = filt_coeff_D3;
        %% Choosing D2 coefficients
        filt_coeff_D2 = zeros(Lsig(6),1); %filtered D2 no coefficients are retained
        Csig_filt(sum(Lsig(1:5))+1:sum(Lsig(1:6)),1) = filt_coeff_D2;
        %% Choosing D1 coefficients
        filt_coeff_D1 = zeros(Lsig(7),1); %filtered D1 no coefficients are retained
        Csig_filt(sum(Lsig(1:6))+1:sum(Lsig(1:7)),1) = filt_coeff_D1;
        %% Reconstruction of the filtered decomposition levels
        for i = 1:level
            sigcoeff_recon(:,i) = wrcoef('d',Csig_filt,Lsig,WT,i);
            if i ==level
                sigcoeff_recon(:,i+1) = wrcoef('a',Csig_filt,Lsig,WT,i);
            end
        end
        %% Reconstruction of the filtered template
        recon(:,k) = sum(sigcoeff_recon,2);
        reconnorm(:,k) = recon(:,k) - min(recon(:,k)); % normalizing each block between 0 and 1
        reconnorm(:,k) = reconnorm(:,k)/max(reconnorm(:,k)) ; % normalizing each block between 0 and 1
        %% MSE calculation (amount of noise compared to the template)
        % tmpl is indexed over the same 1-10 ms samples (81:440) so that the vector lengths match
        mse_at(k,j) = mean((tmpl(81:440,1)-sig(81:440,k)).^2); % between conventional average and template
        % tmpl is indexed over the same 1-10 ms samples (81:440) so that the vector lengths match
        mse_wt(k,j) = mean((tmpl(81:440,1)-recon(81:440,k)).^2); % between wavelet filtering and template
    end
    %% Plots
    figure
    contourf((41:400)/40,256:k+256-1,flipdim(reconnorm(81:440,:)',1),'LineStyle','none')
    title(['TWMC - Filtered average of ',num2str(bkl(j)),' blocks - ',WT]) % block length loop index is j in this script
    xlabel('Time (ms)'),ylabel('Epochs')
end
Implementation of CSTD wavelet algorithm with DWT
% Processing data of a single sound intensity level with CSTD
clear all
WT = 'bior5.5'; % mother wavelet
level = 6; % decomposition levels
bkl = [256 128 64 32 16 8]; % analysed block lengths
t = (41:400)/40; % 1ms - 10ms
d = [1 1 1 0.2500 0.1768 0.1250]; % scale thresholds dk = d1/(2)^(k/2), d1=1, d1 is applied to initial set of frames
h = [0.7358 0.2707 0.0996 0.0366 0.0135 0.0050 0.0018 0.0007]; % CSTD level thresholds dl = d1/exp(l); d1=0.8
%% Reference ABR
epochs=[];
load 'sw_epochs_55_1-10.mat'
[b,a] = butter(3,[100/20000,3000/20000]); % generate band pass filter coefficients
epochs = filtfilt(b,a,epochs); % zero phase filtering
tmpl = mean(epochs(:,:),2); % calculating the grand average
%% One block of each average
% the decomposition level is passed as 'level' (the variable defined above); 'lvl' exists only inside the functions
for i = 1:6
    f = epochs(:,1:2^(i+1));
    sig = f;
    switch i
        case 1 % calling the function to denoise a block of 8 epochs
            for j = 1:1024-256+1
                f = epochs(:,256-2^(i+1)+j:256+j-1);
                sig = f;
                [onerecon,frecon8(:,j),mse_at(j,i),mse_wt(j,i),max_ncr_at(j,i),max_ncr_wt(j,i)] = CSTD_8_fun_2(f,WT,level,d,h,tmpl,sig);
            end
            mse_wt_avg(1,i) = mean(mse_wt(1:1024/2^(i+1),i));
        case 2 % calling the function to denoise a block of 16 epochs
            for j = 1:1024-256+1
                f = epochs(:,256-2^(i+1)+j:256+j-1);
                sig = f;
                [frecon16(:,j),mse_at(j,i),mse_wt(j,i),max_ncr_at(j,i),max_ncr_wt(j,i)] = CSTD_16_fun_2(f,WT,level,d,h,tmpl,sig);
            end
            mse_wt_avg(1,i) = mean(mse_wt(1:1024/2^(i+1),i));
        case 3 % calling the function to denoise a block of 32 epochs
            for j = 1:1024-256+1
                f = epochs(:,256-2^(i+1)+j:256+j-1);
                sig = f;
                [frecon32(:,j),mse_at(j,i),mse_wt(j,i),max_ncr_at(j,i),max_ncr_wt(j,i)] = CSTD_32_fun_2(f,WT,level,d,h,tmpl,sig); % 'level' is the variable defined in the calling script
            end
            mse_wt_avg(1,i) = mean(mse_wt(1:1024/2^(i+1),i));
        case 4 % calling the function to denoise a block of 64 epochs
            for j = 1:1024-256+1
                f = epochs(:,256-2^(i+1)+j:256+j-1);
                sig = f;
                [frecon64(:,j),mse_at(j,i),mse_wt(j,i),max_ncr_at(j,i),max_ncr_wt(j,i)] = CSTD_64_fun_2(f,WT,level,d,h,tmpl,sig);
            end
            mse_wt_avg(1,i) = mean(mse_wt(1:1024/2^(i+1),i));
        case 5 % calling the function to denoise a block of 128 epochs
            for j = 1:1024-256+1
                f = epochs(:,256-2^(i+1)+j:256+j-1);
                sig = f;
                [frecon128(:,j),mse_at(j,i),mse_wt(j,i),max_ncr_at(j,i),max_ncr_wt(j,i)] = CSTD_128_fun_2(f,WT,level,d,h,tmpl,sig);
            end
            mse_wt_avg(1,i) = mean(mse_wt(1:1024/2^(i+1),i));
        case 6 % calling the function to denoise a block of 256 epochs
            for j = 1:1024-256+1
                f = epochs(:,256-2^(i+1)+j:256+j-1);
                sig = f;
                [frecon256(:,j),mse_at(j,i),mse_wt(j,i),max_ncr_at(j,i),max_ncr_wt(j,i)] = CSTD_256_fun_2(f,WT,level,d,h,tmpl,sig);
            end
            mse_wt_avg(1,i) = mean(mse_wt(1:1024/2^(i+1),i));
    end
end
function [frecon,mse_at,mse_wt,max_ncr_at,max_ncr_wt] = CSTD_32_fun_2 (f,WT,lvl,d,h,tmpl,sig)
% Implementation of CSTD for a block size of 32 epochs
%% Wavelet transform
for i = 1:size(f,2)
    [C(:,i),L] = wavedec(f(:,i),lvl,WT);
end
f = C;
%% CSTD level 1
for i = 1:size(f,2)/2
    f1(:,i) = mean(f(:,2*i-1:2*i),2);
end
for i = size(f,2)/2+1:size(f,2)
    if 2*i-(size(f,2)-1)>size(f,2)
        f1(:,i) = (f(:,2*i-size(f,2))+f(:,1))/2;
    else
        f1(:,i) = mean(f(:,2*i-size(f,2):2*i-(size(f,2)-1)),2);
    end
end
%% Scale Thresholding
ff1 = f1;
f1 = zeros(size(f1));
for k = 1:size(f,2) %loop for number of epochs/frames
    %approximation coefficients
    th = [];
    th = find(abs(ff1(1:L(1,1),k))>max(abs(ff1(1:L(1,1),k)))*d(1,lvl));
    for i = 1:length(th)
        f1(th(i,1),k) = ff1(th(i,1),k);
    end
    for n = 1:lvl % loop for number of decomposition levels
        %detail coefficients
        th = [];
        th = find(abs(ff1(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))>max(abs(ff1(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))*d(1,lvl-n+1)));
        for i = 1:length(th)
            f1(th(i,1)+sum(L(1:n,1)),k) = ff1(th(i,1)+sum(L(1:n,1)),k);
        end
    end
end
%% Level Thresholding
ff1 = f1;
f1 = zeros(size(f1));
for k = 1:size(f,2) %loop for number of epochs/frames
    %approximation coefficients
    th = [];
    th = find(abs(ff1(1:L(1,1),k))>max(abs(ff1(1:L(1,1),k)))*h(1,1));
    for i = 1:length(th)
        f1(th(i,1),k) = ff1(th(i,1),k);
    end
    for n = 1:lvl % loop for number of decomposition levels
        %detail coefficients
        th = [];
        th = find(abs(ff1(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))>max(abs(ff1(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))*h(1,1)));
        for i = 1:length(th)
            f1(th(i,1)+sum(L(1:n,1)),k) = ff1(th(i,1)+sum(L(1:n,1)),k);
        end
    end
end
%% CSTD level 2
for i = 1:size(f,2)/4
    f2(:,i) = mean(f1(:,2*i-1:2*i),2);
end
for i = size(f,2)/4+1:size(f,2)/2
    if 2*i-15>size(f,2)/2
        f2(:,i) = (f1(:,2*i-16)+f1(:,1))/2;
    else
        f2(:,i) = mean(f1(:,2*i-16:2*i-15),2);
    end
end
for i = size(f,2)/2+1:size(f,2)/4*3
    f2(:,i) = mean(f1(:,2*i-17:2*i-16),2);
end
for i = size(f,2)/4*3+1:size(f,2)
    if 2*i-31>size(f,2)
        f2(:,i) = (f1(:,2*i-32)+f1(:,size(f,2)/2+1))/2;
    else
        f2(:,i) = mean(f1(:,2*i-32:2*i-31),2);
    end
end
%% Scale Thresholding
ff2 = f2;
f2 = zeros(size(f2));
for k = 1:size(f,2) %loop for number of epochs/frames
    %approximation coefficients
    th = [];
    th = find(abs(ff2(1:L(1,1),k))>max(abs(ff2(1:L(1,1),k)))*d(1,lvl));
    for i = 1:length(th)
        f2(th(i,1),k) = ff2(th(i,1),k);
    end
    for n = 1:lvl % loop for number of decomposition levels
        %detail coefficients
        th = [];
        th = find(abs(ff2(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))>max(abs(ff2(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))*d(1,lvl-n+1)));
        for i = 1:length(th)
            f2(th(i,1)+sum(L(1:n,1)),k) = ff2(th(i,1)+sum(L(1:n,1)),k);
        end
    end
end
%% Level Thresholding
ff2 = f2;
f2 = zeros(size(f2));
for k = 1:size(f,2) %loop for number of epochs/frames
    %approximation coefficients
    th = [];
    th = find(abs(ff2(1:L(1,1),k))>max(abs(ff2(1:L(1,1),k)))*h(1,2));
    for i = 1:length(th)
        f2(th(i,1),k) = ff2(th(i,1),k);
    end
    for n = 1:lvl % loop for number of decomposition levels
        %detail coefficients
        th = [];
        th = find(abs(ff2(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))>max(abs(ff2(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))*h(1,2)));
        for i = 1:length(th)
            f2(th(i,1)+sum(L(1:n,1)),k) = ff2(th(i,1)+sum(L(1:n,1)),k);
        end
    end
end
%% CSTD level 3
for i = 1:size(f,2)/8
    f3(:,i) = mean(f2(:,2*i-1:2*i),2);
end
for i = size(f,2)/8+1:size(f,2)/8*2
    if 2*i-7>size(f,2)/8*2
        f3(:,i) = (f2(:,2*i-8)+f2(:,1))/2;
    else
        f3(:,i) = mean(f2(:,2*i-8:2*i-7),2);
    end
end
for k = 1:3
    for i = size(f,2)/8*2*k+1:size(f,2)/8*(2*k+1)
        f3(:,i) = mean(f2(:,2*i-(8*k+1):2*i-8*k),2);
    end
    for i = size(f,2)/8*(2*k+1)+1:size(f,2)/8*(2*k+2)
        if 2*i-(8*(k+1)-1)>size(f,2)/8*(2*k+2)
            f3(:,i) = (f2(:,2*i-8*(k+1))+f2(:,size(f,2)/8*2*k+1))/2;
        else
            f3(:,i) = mean(f2(:,2*i-8*(k+1):2*i-(8*(k+1)-1)),2);
        end
    end
end
%% Scale Thresholding
ff3 = f3;
f3 = zeros(size(f3));
for k = 1:size(f,2) %loop for number of epochs/frames
    %approximation coefficients
    th = [];
    th = find(abs(ff3(1:L(1,1),k))>max(abs(ff3(1:L(1,1),k)))*d(1,lvl));
    for i = 1:length(th)
        f3(th(i,1),k) = ff3(th(i,1),k);
    end
    for n = 1:lvl % loop for number of decomposition levels
        %detail coefficients
        th = [];
        th = find(abs(ff3(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))>max(abs(ff3(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))*d(1,lvl-n+1)));
        for i = 1:length(th)
            f3(th(i,1)+sum(L(1:n,1)),k) = ff3(th(i,1)+sum(L(1:n,1)),k);
        end
    end
end
%% Level Thresholding
ff3 = f3;
f3 = zeros(size(f3));
for k = 1:size(f,2) %loop for number of epochs/frames
    %approximation coefficients
    th = [];
    th = find(abs(ff3(1:L(1,1),k))>max(abs(ff3(1:L(1,1),k)))*h(1,3));
    for i = 1:length(th)
        f3(th(i,1),k) = ff3(th(i,1),k);
    end
    for n = 1:lvl % loop for number of decomposition levels
        %detail coefficients
        th = [];
        th = find(abs(ff3(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))>max(abs(ff3(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))*h(1,3)));
        for i = 1:length(th)
            f3(th(i,1)+sum(L(1:n,1)),k) = ff3(th(i,1)+sum(L(1:n,1)),k);
        end
    end
end
%% CSTD level 4
for i = 1:size(f,2)/16
    f4(:,i) = mean(f3(:,2*i-1:2*i),2);
end
for i = size(f,2)/16+1:size(f,2)/16*2
    if 2*i-3>size(f,2)/16*2
        f4(:,i) = (f3(:,2*i-4)+f3(:,1))/2;
    else
        f4(:,i) = mean(f3(:,2*i-4:2*i-3),2);
    end
end
for k = 1:7
    for i = size(f,2)/16*2*k+1:size(f,2)/16*(2*k+1)
        f4(:,i) = mean(f3(:,2*i-(4*k+1):2*i-4*k),2);
    end
    for i = size(f,2)/16*(2*k+1)+1:size(f,2)/16*(2*k+2)
        if 2*i-(4*(k+1)-1)>size(f,2)/16*(2*k+2)
            f4(:,i) = (f3(:,2*i-4*(k+1))+f3(:,size(f,2)/16*2*k+1))/2;
        else
            f4(:,i) = mean(f3(:,2*i-4*(k+1):2*i-(4*(k+1)-1)),2);
        end
    end
end
%% Scale Thresholding
ff4 = f4;
f4 = zeros(size(f4));
for k = 1:size(f,2) %loop for number of epochs/frames
    %approximation coefficients
    th = [];
    th = find(abs(ff4(1:L(1,1),k))>max(abs(ff4(1:L(1,1),k)))*d(1,lvl));
    for i = 1:length(th)
        f4(th(i,1),k) = ff4(th(i,1),k);
    end
    for n = 1:lvl % loop for number of decomposition levels
        %detail coefficients
        th = [];
        th = find(abs(ff4(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))>max(abs(ff4(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))*d(1,lvl-n+1)));
        for i = 1:length(th)
            f4(th(i,1)+sum(L(1:n,1)),k) = ff4(th(i,1)+sum(L(1:n,1)),k);
        end
    end
end
%% Level Thresholding
ff4 = f4;
f4 = zeros(size(f4));
for k = 1:size(f,2) %loop for number of epochs/frames
    %approximation coefficients
    th = [];
    th = find(abs(ff4(1:L(1,1),k))>max(abs(ff4(1:L(1,1),k)))*h(1,4));
    for i = 1:length(th)
        f4(th(i,1),k) = ff4(th(i,1),k);
    end
    for n = 1:lvl % loop for number of decomposition levels
        %detail coefficients
        th = [];
        th = find(abs(ff4(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))>max(abs(ff4(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))*h(1,4)));
        for i = 1:length(th)
            f4(th(i,1)+sum(L(1:n,1)),k) = ff4(th(i,1)+sum(L(1:n,1)),k);
        end
    end
end
%% CSTD level 5
for i = 1:size(f,2)
    if mod(i,2)==1 %check odd i
        f5(:,i) =mean(f4(:,i:i+1),2);
    else
        f5(:,i) =mean(f4(:,i-1:i),2);
    end
end
%% Scale Thresholding
ff5 = f5;
f5 = zeros(size(f5));
for k = 1:size(f,2) %loop for number of epochs/frames
    %approximation coefficients
    th = [];
    th = find(abs(ff5(1:L(1,1),k))>max(abs(ff5(1:L(1,1),k)))*d(1,lvl));
    for i = 1:length(th)
        f5(th(i,1),k) = ff5(th(i,1),k);
    end
    for n = 1:lvl % loop for number of decomposition levels
        %detail coefficients
        th = [];
        th = find(abs(ff5(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))>max(abs(ff5(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))*d(1,lvl-n+1)));
        for i = 1:length(th)
            f5(th(i,1)+sum(L(1:n,1)),k) = ff5(th(i,1)+sum(L(1:n,1)),k);
        end
    end
end
%% Level Thresholding
ff5 = f5;
f5 = zeros(size(f5));
for k = 1:size(f,2) %loop for number of epochs/frames
    %approximation coefficients
    th = [];
    th = find(abs(ff5(1:L(1,1),k))>max(abs(ff5(1:L(1,1),k)))*h(1,5));
    for i = 1:length(th)
        f5(th(i,1),k) = ff5(th(i,1),k);
    end
    for n = 1:lvl % loop for number of decomposition levels
        %detail coefficients
        th = [];
        th = find(abs(ff5(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))>max(abs(ff5(sum(L(1:n,1))+1:sum(L(1:n+1,1)),k))*h(1,5)));
        for i = 1:length(th)
            f5(th(i,1)+sum(L(1:n,1)),k) = ff5(th(i,1)+sum(L(1:n,1)),k);
        end
    end
end
%% Final Average
favg = mean(f5,2);
%% Inverse wavelet transform
frecon = waverec(favg,L,WT);
%% MSE calculation (amount of noise compared to the template)
mse_at = mean((tmpl-mean(sig,2)).^2); % between conventional average and template
mse_wt = mean((tmpl-frecon).^2); % between wavelet filtering and template
Implementation of CTMC wavelet algorithm with SWT
% Processing data of a single sound intensity level with CTMC using SWT
% The template is derived with the grand average at 55 dB nHL
%% Template calculation using 55 dB
load('sw_epochs_55_-11-10.mat');
tmpl_nonext = mean(epochs(481:840,1:1024),2); % signal from 1ms to 10ms
tmpl = wextend('1','sym',tmpl_nonext,76); % extended template for swt
% Decomposing the template with SWT
[swat,swdt] = swt(tmpl,level,WT);
% Application of thresholds to the template as per CTMC
%% Noisy ABR filtering
for n = 1:length(bkl) %loop for different block lengths
    for k = 1:1024-256+1 %loop for continuous blocks
        % Noisy ABR calculations
        sig_nonext(:,k) = mean(epochs(481:840,256-bkl(n)+k:256+k-1),2);
        sig(:,k) = wextend('1','sym',sig_nonext(:,k),76); %extended noisy ABR for swt
        % Decomposing with SWT
        [swas,swds] = swt(sig(:,k),level,WT);
        % Application of thresholds to the noisy ABR as per CTMC
        % Reconstructing the denoised ABR
        recon(k,:) = iswt(swas_th,swds_mat,WT);
    end
end
Implementation of TWMC wavelet algorithm with SWT
% Processing data of a single sound intensity level with TWMC using SWT
% The template is derived with the grand average at 55 dB nHL
%% Template calculation using 55 dB
load('sw_epochs_55_-11-10.mat');
tmpl_nonext = mean(epochs(481:840,1:1024),2); % signal from 1ms to 10ms
tmpl = wextend('1','sym',tmpl_nonext,76); % extended template for swt
%% Loading signal to be denoised
epochs = [];
load('sw_epochs_75_-11-10.mat');
epochs = filtfilt(b,a,epochs);
for n = 1:length(bkl) % loop for different block lengths
    for k = 1:1024-256+1 % loop for continuous blocks
        sig_nonext(:,k) = mean(epochs(481:840,256-bkl(n)+k:256+k-1),2); % mean for the block
        sig(:,k) = wextend('1','sym',sig_nonext(:,k),76); % extended noisy ABR for swt
        % Decomposing with SWT
        [swas,swds] = swt(sig(:,k),level,WT);
        % Application of thresholds to the noisy ABR as per TWMC
        recon(:,k) = iswt(swas_win,swds_win,WT);
    end
ü ý ý þ ÿ ä ä ä
end è é ê ë ì é ì í î ï î æ ð í ð ñ ò ó ú õ ï ö ì ë ì î ï ë ÷ ð ø æ î ù é õ æ î ù û ó ! " ! " ! # $ % & ! ' ( ) & * & $ ' * # $ + , $ " $ % & $ # ! % " $ ' & ! * % - . function [frecon,mse_at,mse_wt] = CSTD_32(f,WT,lvl,d,h,tmpl,sig) % Implementation of CSTD for a block size of 32 epochs with SWT %% Wavelet transform for i = 1:size(f,2) [swa,swd] = swt(f(:,i),lvl,WT); C(1:512,i) = swa(6,:); for k = 1:6 C(512*k+1:512*(k+1),i) = swd(6-k+1,:); end L = [512,512,512,512,512,512,512,512]'; end f = C; % Application of scale and level thresholds to the noisy ABR as per CSTD % Inverse wavelet transform frecon = iswt(swa_rec,swd_rec,WT); / ì í ì ø ï î æ ð í ð ñ î ù ì 0 1 í î ù ì î æ 2 3 4 5 6 ï î ï 0 ì î ñ ð ø 3 5 7 é ð 6 ì ë æ í ÷% Generating datasets with the synthetic ABR with its amplitude and latency variation. clear all;clc x = linspace(-4*pi,4*pi,400); t = linspace(0,2*pi,1200); % number of repetition blocks y = 1.3*sin(20*t)/4; % main shape sin(t) yn = ones(1,length(y)); % length function max_rep = 1; % maximum repetitions m = 1; for i = 1:length(y) n = 1; sig0 = sin(2*(x+(12-(y(1,i)*12-1))*pi/5))./(2*(x+(12-(y(1,i)*12-1))*pi/5)); % sinc function 1 sig1 = 0.25*sig0; sig = sin(2*(x+(4-(y(1,i)*12-1))*pi/5))./(2*(x+(4-(y(1,i)*12-1))*pi/5)); % sinc function 2 sig2 = 0.5*sig; sig = sin(2*(x+(-4-(y(1,i)*12-1))*pi/5))./(2*(x+(-4-(y(1,i)*12-1))*pi/5)); % sinc function 3 sig3 = sig; sig = sig1+sig2+sig3; sig = sig/max(sig); % normalising sig = sig-mean(sig); % base shifting to mean zero n = n+1; for j = 1:max_rep*yn(1,i) syn (m,:) = sig;
        m = m+1;
    end
end
syn = syn';

%% Plot
figure,contourf(syn,'LineStyle','none')

Model order determination for the ARX model

% Determination of model orders using FPE with a dataset of constant latency
clear all;clc
% Arbitrary poles and zeros (using unfiltered real data fitted to an ARMA model)
a = [1.0000 -2.8664 3.4718 -2.3611 0.9599 -0.2721 0.0686]; % AR(6)
b = [0 1.0000 -3.3171 4.4271 -2.1791 -1.1732 2.7485 -2.5161 0.9903]; % MA(7)
N = 10; % range of model orders for AR part
M = 10+1; % range of model orders for MA part
noise_seed = 4;
load('u.mat') % load the synthetic ABR
l = 100; % number of sweeps
u = repmat(synthetic',1,l);
e = wgn(400,l,-2,1,noise_seed);
s = filter(b,a,u);
pu = mean(u.^2);
ps = mean(s.^2);
s=s./repmat(sqrt(ps),400,1).*repmat(sqrt(pu),400,1); % normalize the power of s(k) to that of u(k)
n = filter(1,a,e);
pe = mean(e.^2);
pn = mean(n.^2);
n=n./repmat(sqrt(pn),400,1).*repmat(sqrt(pe),400,1); % normalize the power of n(k) to that of e(k)
y = s + n;
y = y - repmat(mean(y),400,1); % making mean zero
for i = 1:100
    snr(1,i) = 10*log10(mean((s(:,i).^2)/mean(n(:,i).^2)));
end
snr_initial = round(mean(snr,2))

%% ARX model calculation
u_hat = mean(y,2); % derived template
u_hat = u_hat - mean(u_hat); % making mean zero
for k = 1:100 % loop through each sweep
    fpe=[];dat=[];m=[];df=[];ind=[];
    for i = 1:N % loop through the AR orders
        dat = iddata(y(:,k),u_hat,1/40000);
        for j = 2:M % loop through the MA orders
            m = arx(dat,[i j 0]); % na is the number of poles, nb is the number of zeros plus 1, nk is the number of samples before the input affects the system output
            fpe(i,j-1) = m.es.FPE;
        end
    end
    %% Find the optimum order at the first local minimum for AR(p)
    df=[];
    for i = 1:N
        df = diff(fpe,1,1);cnt1=0;
        for j = 1:N-2
            if df(j,i)<0 && df(j+1,i)>0
                cnt1=1;
                ind(i,1)=j+1;
                break
            end
        end
        if cnt1 == 0 % no local minimum found (gradual decrease): take the order whose FPE is closest to 1.05 times the minimum FPE
            df1 = fpe(:,i) - repmat(min(fpe(:,i))*1.05,10,1); % 5%
            ind(i,1) = find(abs(df1) == min(abs(df1)));
        end
    end
    %% Find the optimum order at the first local minimum for MA(q)
    df=[];
    for i = 1:N
        df = diff(fpe,1,2);cnt1=0;
        for j = 1:N-2
            if df(i,j)<0 && df(i,j+1)>0
                cnt1=1;
                ind(i,2)=j+1;
                break
            end
        end
        if cnt1 == 0 % no local minimum found (gradual decrease): take the order whose FPE is closest to 1.05 times the minimum FPE
            df1 = fpe(i,:) - repmat(min(fpe(i,:))*1.05,1,10); % 5%
            ind(i,2) = find(abs(df1) == min(abs(df1)));
        end
    end
    orders(k,:) = mode(ind);
end

%% Histogram of ARMA(p,q) orders
H=zeros(N,M-1);
for nn=1:N
    for mm=1:M-1
        for r=1:100
            if ((orders(r,1)==nn) && (orders(r,2)==mm))
                H(nn,mm)=H(nn,mm)+1;
            end
        end
    end
end
figure
surf(H)
xlabel('MA(q)'),ylabel('AR(p)'),zlabel('Frequency')

ABR estimation and latency tracking with the ARX model

% Latency tracking with ARX model with predetermined model orders (6,7,0)
clear all;clc
% Arbitrary poles and zeros (using unfiltered real data fitted to an ARMA model)
a = [1.0000 -2.8664 3.4718 -2.3611 0.9599 -0.2721 0.0686]; % AR(6)
b = [0 1.0000 -3.3171 4.4271 -2.1791 -1.1732 2.7485 -2.5161 0.9903]; % MA(7)
N = 6; % fixed model order for the AR part
M = 7+1; % fixed model order for the MA part
noise_seed = 4;
file = 'ampl_2ms_freq_1.0_1min.mat'; % load the dataset with the latency variations
load(file)
l = 1200; % number of sweeps
e = wgn(400,l,-7,1,noise_seed);
s = filter(b,a,u);
pu = mean(u.^2);
ps = mean(s.^2);
s=s./repmat(sqrt(ps),400,1).*repmat(sqrt(pu),400,1); % normalize the power of s(k) to that of u(k)
n = filter(1,a,e);
pe = mean(e.^2);
pn = mean(n.^2);
n=n./repmat(sqrt(pn),400,1).*repmat(sqrt(pe),400,1); % normalize the power of n(k) to that of e(k)
y = s + n;
y = y - repmat(mean(y),400,1); % making mean zero

% Initial SNR
for i = 1:l
    snr(1,i) = 10*log10(mean((s(:,i).^2)/mean(n(:,i).^2)));
end
snr_initial = round(mean(snr,2))

%% ARX model calculation
tmpll = 100;
for i = 1:l-tmpll+1
    u_hat(:,i+tmpll-1) = mean(y(:,i:i+tmpll-1),2); % derived template
    u_hat(:,i+tmpll-1) = u_hat(:,i+tmpll-1) - mean(u_hat(:,i+tmpll-1)); % making mean zero
    dat = iddata(y(:,i+tmpll-1),u_hat(:,i+tmpll-1),1/40000); % creating the data object for the model calculation
    m = arx(dat,[N M 0]); % determination of the model
    s_hat(:,i+tmpll-1) = filter(m.b,m.a,u_hat(:,i+tmpll-1)); % derivation of the estimated ABR
end

%% Peak detection
Lamp=182:260; % wave V latency
for i = 1:l
    [rc_s(i,1),rc_s(i,2)] = find(s(:,i)==max(s(Lamp,i)));
end
np=0;rc_s_hat=zeros(l,1);
for i = 1:l-tmpll+1
    epo= i+tmpll-1;
    df = diff(s_hat(Lamp,epo));
    cnt = 0;rrr=0;
    for j = 1:length(df)-1
        if df(j)>0 && df(j+1)<0
            cnt = cnt+1;
            [rrr(cnt),ccc] = find(s_hat(:,epo)==s_hat(j+1,epo));
        end
    end
    if cnt ~= 0
        [rc_s_hat_temp(1,1),rc_s_hat_temp(1,2)] = find(s_hat(:,epo) == max(s_hat(rrr+Lamp(1)-1,epo)));
        [rc_s_hat_max(1,1),rc_s_hat_max(1,2)] = find(s_hat(:,epo)==max(s_hat(Lamp,epo)));
        if rc_s_hat_temp(1,1)>=rc_s_hat_max(1,1)
            rc_s_hat(epo,1) = rc_s_hat_temp(1,1);
        else
            np=np+1;
            no_peak_s_hat(1,np)=epo;
        end
    else
        np=np+1;
        no_peak_s_hat(1,np)=epo;
    end
end
np=0;rc_u_hat=zeros(l,1);
for i = 1:l-tmpll+1
    epo= i+tmpll-1;
    df = diff(u_hat(Lamp,epo));
    cnt = 0;rrr=0;
    for j = 1:length(df)-1
        if df(j)>0 && df(j+1)<0
            cnt = cnt+1;
            [rrr(cnt),ccc] = find(u_hat(:,epo)==u_hat(j+1,epo));
        end
    end
    if cnt ~= 0
        [rc_u_hat_temp(1,1),rc_u_hat_temp(1,2)] = find(u_hat(:,epo)==max(u_hat(rrr+Lamp(1)-1,epo)));
        [rc_u_hat_max(1,1),rc_u_hat_max(1,2)] = find(u_hat(:,epo)==max(u_hat(Lamp,epo)));
        if rc_u_hat_temp(1,1)>=rc_u_hat_max(1,1)
            rc_u_hat(epo,1) = rc_u_hat_temp(1,1);
        else
            np=np+1;
            no_peak_u_hat(1,np)=epo;
        end
    else
        np=np+1;
        no_peak_u_hat(1,np)=epo;
    end
end

% MSE calculation (after ignoring very high MSEs in s_hat)
mse_s_shat = mean((rc_s(find(rc_s_hat),1)/40-rc_s_hat(find(rc_s_hat),1)/40).^2)
mse_s_uhat = mean((rc_s(find(rc_u_hat),1)/40-rc_u_hat(find(rc_u_hat),1)/40).^2)
l-100+1-length(no_peak_s_hat)
length(no_peak_u_hat)

Implementation of the REPE

% Predefined model parameters
a = [1.0000 -2.8664 3.4718 -2.3611 0.9599 -0.2721 0.0686]; % AR(6)
b = [0 1.0000 -3.3171 4.4271 -2.1791 -1.1732 2.7485 -2.5161 0.9903]; % MA(7)
c = [1.0000 -0.5473 -0.3750 -0.2088 -0.0579 -0.1965 0.1101 0.1557 0.1582]; % AR(8) for pre-whitening
% Data generation (only the pre-whitening coding is presented here)
w = filter(c,1,u); % whitening the template
pu = mean(u.^2);
pw = mean(w.^2);
w=w./repmat(sqrt(pw),400,1).*repmat(sqrt(pu),400,1); % normalize the power of the whitened template w(k) to that of u(k)
w = w - repmat(mean(w),400,1); % making mean zero
s = filter(b,a,w);
ps = mean(s.^2);
s=s./repmat(sqrt(ps),400,1).*repmat(sqrt(pu),400,1); % normalize the power of s(k) to that of u(k)

%% REPE implementation
tmpll = 100;
for i = 1:l-tmpll+1
    u_hat(:,i+tmpll-1) = mean(y(:,i:i+tmpll-1),2); % derived template
    u_hat(:,i+tmpll-1) = u_hat(:,i+tmpll-1) - u_hat(1,i+tmpll-1); % shift so that the starting sample is zero
    datw = iddata(u_hat(:,i+tmpll-1),[],1/40000); % creating the data object for the model calculation
    mw = ar(datw,8);
    w_hat(:,i+tmpll-1) = filter(mw.a,1,u_hat(:,i+tmpll-1)); % pre-whitening the template
    pu_hat = mean(u_hat(:,i+tmpll-1).^2);
    pw_hat = mean(w_hat(:,i+tmpll-1).^2);
    w_hat(:,i+tmpll-1)=w_hat(:,i+tmpll-1)/sqrt(pw_hat)*sqrt(pu_hat); % normalize the power of the whitened template to that of the derived template
    w_hat(:,i+tmpll-1) = w_hat(:,i+tmpll-1) - mean(w_hat(:,i+tmpll-1)); % making mean zero
    dat = iddata(y(:,i+tmpll-1),w_hat(:,i+tmpll-1),1/40000);
    m = arx(dat,[N M 0]); % na is the number of poles, nb is the number of zeros plus 1, nk is the number of samples before the input affects the system output
    s_hat(:,i+tmpll-1) = filter(m.b,m.a,w_hat(:,i+tmpll-1));
end
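The listings above repeat the same step after every call to filter: the filtered signal is rescaled, column by column, so that its mean power matches that of a reference signal. The following is a minimal sketch of that step in isolation, using base MATLAB only. The helper names match_power and snr_db, the toy sinusoids and the filter coefficients are illustrative assumptions and are not taken from the thesis code.

```matlab
% Minimal sketch (assumed helpers, not part of the thesis code).
% match_power rescales each column of x so that its mean power equals
% that of the matching column of ref.
match_power = @(x, ref) x .* repmat(sqrt(mean(ref.^2) ./ mean(x.^2)), size(x,1), 1);

% Toy data standing in for the template u and a filtered version of it.
t = (0:399)' / 40000;                          % 400 samples at 40 kHz
u = [sin(2*pi*500*t), 0.5*sin(2*pi*800*t)];    % two illustrative sweeps
s = filter([0 1 -0.5], [1 -0.3], u);           % arbitrary stable filter, illustration only
s = match_power(s, u);

% After rescaling, the column powers agree up to rounding error.
power_gap = max(abs(mean(s.^2) - mean(u.^2)));

% Per-sweep SNR in dB for a signal/noise pair (ratio of mean powers).
snr_db = @(sig, noise) 10*log10(mean(sig.^2) ./ mean(noise.^2));
```

With a helper of this kind, each normalisation line in the listings would reduce to a single call such as s = match_power(s, u), which makes it easier to see which signal serves as the power reference at each stage.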
Appendix E
ARX estimated ABRs
Random epochs estimated at each sound intensity from 10–75 dB nHL with an output block size of 256 to the ARX model. In each plot, the estimated ABR is compared with the grand averaged template and the corresponding ABR before the estimation. The general observation is that the estimated ABRs show the relevant features at higher intensities but not at low sound intensities.
Random epochs estimated at each sound intensity from 30–75 dB nHL with an output block size of 32 to the ARX model. In each plot, the estimated ABR is compared with the grand averaged template and the corresponding ABR before the estimation. The general observation is that the estimated ABRs have a wave V latency offset even at higher intensities, with no ABR present at low sound intensities.
Random epochs estimated at each sound intensity from 30–75 dB nHL with an output single epoch to the ARX model. In each plot, the estimated ABR is compared with the grand averaged template and the corresponding ABR before the estimation. Large amplitudes could be observed in the estimated ABR, resembling the single epoch before the estimation but not the template.
Appendix F
Wavelet estimated ABRs
CTMC with DWT
TWMC with DWT
CSTD with DWT
CTMC with SWT
TWMC with SWT
CSTD with SWT