Detecting Emotional response to music using near-infrared spectroscopy of the prefrontal cortex
by
Saba Moghimi
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Institute of Biomaterials and Biomedical Engineering
University of Toronto
© Copyright 2013 by Saba Moghimi
Abstract
Detecting Emotional response to music using near-infrared spectroscopy of the
prefrontal cortex
Saba Moghimi
Doctor of Philosophy
Graduate Department of Institute of Biomaterials and Biomedical Engineering
University of Toronto
2013
Many individuals with severe motor disabilities may not be able to use conventional
means of emotion expression (e.g. vocalization, facial expression) to make their emo-
tions known to others. Lack of a means for expressing emotions may adversely affect
the quality of life of these individuals and their families. The main objective of this
thesis was to implement a non-invasive means of identifying emotional arousal (neutral
vs. intense) and valence (positive vs. negative) by directly using brain activity. In
this light, near infrared spectroscopy (NIRS), which optically measures oxygenated and
deoxygenated hemoglobin concentrations ([HbO2] and [Hb], respectively), was used to
monitor prefrontal cortex hemodynamics in 10 individuals as they listened to music ex-
cerpts. Participants provided subjective ratings of arousal and valence. Prefrontal cortex
[HbO2] and [Hb] were characterized with respect to valence and arousal, and significant
emotion-related hemodynamic modulations were identified. These modulations were not
significantly related to the characteristics of the music excerpts used for inducing emotions.
These early investigations provided evidence for the use of prefrontal cortex NIRS in
identifying emotions. Next, using features extracted from [HbO2]
and [Hb] in the prefrontal cortex, an average accuracy of 71% was achieved in identifying
arousal and valence. Novel hemodynamic features extracted using dynamic modeling and
template-matching were introduced for identifying arousal and valence. Ultimately, the
ability of autonomic nervous system (ANS) signals including heart rate, electrodermal
activity and skin temperature to improve the identification results, achieved when using
PFC [HbO2] and [Hb] exclusively, was investigated. For the majority of the participants,
prefrontal cortex NIRS-based identification achieved higher classification accuracies than
combined ANS and NIRS features. The results indicated that NIRS recordings of the
prefrontal cortex during presentation of music with emotional content can be automatically
decoded in terms of both valence and arousal, encouraging future investigation of
NIRS-based emotion detection in individuals with severe disabilities.
Dedication
To Hope and Trinity for inspiring me to pursue this work.
Acknowledgements
I would like to thank my supervisor Dr. Tom Chau for his kind help and all his support
throughout my work. I will be forever indebted to him for giving me the chance to be
part of his dynamic research team. His mentorship has helped me develop skills that I
will carry for the rest of my life. My special thanks to my co-supervisor Dr. Anne-Marie
Guerguerian for sharing her knowledge and supporting me throughout the challenges I
faced. Her unwavering care and concern for the patients has always been a source of
inspiration to me. I would like to thank my committee members Dr. Maureen Dennis
and Dr. Milos Popovic for sharing their insight, and guiding me with their suggestions.
I would like to express my gratitude to Dr. Azadeh Kushki and Dr. Sarah Power for
their kind help throughout my research. I am also grateful to Ka Lun Tam and Pierre
Duez for their technical support. I would like to express my gratitude to Dr. Negar
Memarian and Dr. Stefanie Blain-Moraes for helping me in developing my research
skills.
I would like to thank the participants who took the time to help me with this study,
without whom this work would have not been possible. I acknowledge the financial
support of the National Science and Engineering Research Council CREATE CARE
program, and Holland Bloorview Kids Rehabilitation Hospital graduate scholarship. I
would like to thank donors of the K.M. Peterborough Hunter graduate studentship for
their financial support.
Finally, I would like to express my gratitude to my family whose love and support
has always embraced me although they are miles and miles away. I would like to thank
my father for all his contributions. His interest in my work and our discussions truly
motivated me in my research. I thank my mother and my aunt Ferreshteh who reminded
me to be strong and determined throughout my work. Special thanks to my sister who
helped me in so many ways from encouraging me in my work to sharing her technical
insight. Finally, my special thanks to Amin Abdossalami for reminding me to never give
up.
Contents
1 Introduction 1
1.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Current clinical evidence for EEG-based BCIs, a literature appraisal . . . 3
1.3.1 BCI Development Using Electroencephalography . . . . . . . . . . 5
1.3.2 Applications User Interface . . . . . . . . . . . . . . . . . . . . . 7
1.3.3 Controlling brain computer interfaces . . . . . . . . . . . . . . . . 7
1.3.4 Evaluation Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.5 Future Directions in BCI research . . . . . . . . . . . . . . . . . . 13
1.3.6 Towards affective brain computer interfaces . . . . . . . . . . . . 16
1.4 Neural correlates of emotion . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.4.1 The role of prefrontal cortex in default, salient and executive control networks . . . 20
1.5 Near-infrared spectroscopy of the brain . . . . . . . . . . . . . . . . . . . 21
1.6 Emotion induction via music . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.7 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.8 Roadmap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2 Experimental Protocol 29
2.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.3 Participants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4 Stimuli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.5 Signal acquisition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.6 Pre-processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.7 Study design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3 Characterizing PFC Hemodynamic changes due to valence and arousal 35
3.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.2 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.4 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.4.1 Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.4.2 Wavelet-based peak detection . . . . . . . . . . . . . . . . . . . . 40
3.4.3 Statistical analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4 The Effect of Music Characteristics 47
4.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.3.1 Music characteristic extraction . . . . . . . . . . . . . . . . . . . . 49
4.3.2 Music database . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.3.3 Statistical analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.5.1 Subject specific patterns . . . . . . . . . . . . . . . . . . . . . . . 53
4.5.2 Temporal dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5 Automatic Detection of Emotional Response to Music 55
5.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.2 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.3 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.4 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.4.1 Stimuli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.4.2 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.4.3 Feature extraction . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.4.4 Classification procedures . . . . . . . . . . . . . . . . . . . . . . . 62
5.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.6.1 Classification Accuracy . . . . . . . . . . . . . . . . . . . . . . . . 66
5.6.2 Diversity in the music database . . . . . . . . . . . . . . . . . . . 69
5.6.3 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
6 Combining autonomic and central nervous system activity 71
6.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.3.1 Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.3.2 NIRS data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
6.3.3 ANS data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.3.4 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.3.5 Feature extraction . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6.3.6 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.3.7 Mixture of experts . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6.4.1 Dynamic model-based features . . . . . . . . . . . . . . . . . . . . 84
6.4.2 Classification results . . . . . . . . . . . . . . . . . . . . . . . . . 84
6.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
7 Concluding remarks 89
7.1 Summary of contributions . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.1.1 A literature appraisal of the existing evidence for the use of BCI
for individuals with disabilities [143] . . . . . . . . . . . . . . . . 89
7.1.2 PFC [Hb] and [HbO2] patterns characterization using wavelet analysis with respect to emotional arousal and valence [142] . . . 90
7.1.3 Identified emotional arousal and valence in response to dynamic
emotion induction using PFC NIRS [144] . . . . . . . . . . . . . . 90
7.1.4 Introduced features based on dynamic modeling for emotion identification . . . 91
7.1.5 Multi-modal emotion identification using a mixture of classifier experts . . . 91
7.2 Recommendation for future studies . . . . . . . . . . . . . . . . . . . . . 92
7.2.1 Assessing PFC hemodynamics for emotion identification in the pediatric population and individuals with severe disabilities . . . 92
7.2.2 Potential clinical implications . . . . . . . . . . . . . . . . . . . . 93
7.2.3 Dynamic emotional rating paradigms . . . . . . . . . . . . . . . . 94
7.2.4 Emotional sensitivity measures . . . . . . . . . . . . . . . . . . . 94
7.2.5 Individual specific analysis . . . . . . . . . . . . . . . . . . . . . . 94
7.2.6 Inclusion of larger sample sizes . . . . . . . . . . . . . . . . . . . 95
Appendix A: Open Challenges Regarding Control Mechanisms 96
Appendix B: Music Database 100
Appendix C: Music characteristic extraction using MIRTOOLBOX 103
Appendix D: Region specific analysis of [HbO2] and [Hb] with respect to
music characteristics 104
Appendix E: Contributions from Systemic Blood Flow 105
Appendix F: Cognitive Processing Activity in the Prefrontal Cortex 107
Appendix G: Research Ethics 108
Bibliography 114
List of Tables
1.1 Summary of BCI studies on individuals with disabilities (1999-2005) . . . 8
1.1 Summary of BCI studies on individuals with disabilities (2006-2009) . . . 9
1.2 BCI Control Mechanisms. . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3 A summary of existing theories of emotions. See [50] for more details. . . 23
4.1 P-values for the main effect of arousal and valence rating in modeling
mode, dissonance and maximum sound pressure level. . . . . . . . . . . . 51
4.2 P-values for the main effect of music characteristics (i.e. dissonance, mode,
and maximum sound pressure level) in modeling the peaks of [HbO2] and
[Hb] averaged across the nine recording sites. . . . . . . . . . . . . . . . . 52
5.1 Summary of features used in the analysis . . . . . . . . . . . . . . . . . . 62
5.2 Classification accuracy in % for each participant when classifying HA vs.
BN. Feature-types corresponding to the best average accuracy are also
presented for each participant (M = stimulus period mean; ∆M = stimulus
period mean - preceding noise period mean; LSR = lateral slope ratio;
∆LM = Lateral mean difference; S = slope, CV = coefficient of variation 65
5.3 Classification accuracy in % for each participant when classifying PV vs.
NV. Feature-types corresponding to the best average accuracy are also
presented for each participant (M = stimulus period mean; ∆M = stimulus
period mean - preceding noise period mean; LSR = lateral slope ratio;
∆LM = Lateral mean difference; S = slope, CV = coefficient of variation 66
6.1 Features resulting from arx dynamic modeling. (very low frequency band
(VLF) = 0-0.025 Hz, low frequency band (LF) = 0-0.075 Hz and high
frequency band (HF) = 0.075-0.1 Hz) . . . . . . . . . . . . . . . . . . . . 79
6.2 Features used for training classifier experts . . . . . . . . . . . . . . . 83
6.3 Classification accuracy in % determined using ANS features for solving
the HA vs BN and PV vs. NV classification problem . . . . . . . . . . . 85
6.4 Classification accuracy in % determined using the mixture of experts for
solving the HA vs. BN and PV vs. NV classification problem . . . . . . . 86
6.5 Classification accuracy in % for each participant when classifying HA vs.
BN. Using dynamic-based features (i.e. AR, arx (arx (a) input:EDA and
arx (b) input:[HbO2]/[Hb])) and template-based features. . . . . . . . . . 86
6.6 Classification accuracy in % for each participant when classifying PV vs.
NV. Using dynamic-based features (i.e. AR, arx (arx (a) input:EDA and
arx (b) input:[HbO2]/[Hb])) and template-based features. . . . . . . . . . 87
1 The list of music pieces included in the common music database . . . . . 101
2 The list of self-selected music pieces . . . . . . . . . . . . . . . . . . . . . 102
3 The significance of the main effect of a. Mode, b. Dissonance, and c.
Maximum sound pressure level for each recording site shown in Figure
2.1. (α = 0.05) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
List of Figures
1.1 General BCI Components . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Various structures within the survival network involved in the emotional
response, and the resulting outputs. [123] . . . . . . . . . . . . . . . . . . 19
1.3 General overview of NIRS recording system . . . . . . . . . . . . . . . . 22
1.4 Thesis roadmap. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1 The layout of light sources (circles) and detectors (X’s). The vertical line
denotes anatomical midline. The annotated shaded areas correspond to
recording locations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.2 Trial sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.3 The Self Assessment Manikin Rating System is shown. The top and the
bottom row depict valence (positive to negative) and arousal (intense to
neutral) ratings, respectively. The participant could select one of the nine
levels of arousal/valence by marking the corresponding circles shown. For
example, in the sample rating provided, a very intense positive emotion is
represented . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.1 The layout of light sources (circles) and detectors (X’s). The vertical line
denotes anatomical midline. The annotated shaded areas correspond to
recording locations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.2 Trial sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3 Mexican hat wavelet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.4 Box-plot of valence and arousal ratings for each participant . . . . . . . . 43
3.5 Slopes of regression lines between participant arousal ratings and (a) the
maximum wavelet coefficient (MWC), and (b) the corresponding scale.
Only slopes significantly different from zero are shown (p < 0.005). . . . . 44
3.6 Slopes of regression lines between participant valence ratings and (a) the
maximum wavelet (MWC), and (b) the corresponding scale. Only slopes
significantly different from zero are shown (p < 0.005). . . . . . . . . . . 44
3.7 Plotted in black are the (a) [HbO2] (top panel) and (b) [Hb] (bottom
panel) recordings across nine interrogation sites for a music sample inducing
intense negative emotions from one of the participants during 45 seconds
of aural stimulus. In grey are the corresponding waveforms of wavelet
coefficients at the scale where the maximum wavelet coefficient occurs.
These waveforms have been scaled by their standard deviation to facilitate
visual comparison. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.1 In grey: the normalized sound pressure level of self-selected song A for par-
ticipant 3. In black: normalized [HbO2] averaged across the nine recording
locations shown for each of the four repetitions of song A. The [HbO2] var-
ied in different repetitions of the same song. . . . . . . . . . . . . . . . . 52
5.1 Plots (a) and (c) exemplify normalized HbO2 concentration signals at dif-
ferent recording locations while plots (b) and (d) are the corresponding
normalized Hb concentration signals. The dark lines represent normalized
signals corresponding to highly valenced, high arousal stimuli while the
lighter grey line depicts normalized concentrations during Brown noise
presentation to the same participant. The same Brown noise sample is
illustrated for both positively and negatively valenced examples. . . . . . 64
5.2 Location of features resulting in the best overall accuracy. Each rectangle
is located over a recording site. The size of the rectangle is proportional
to the number of features selected from the corresponding location. The
vertical line denotes the anatomical midline (HA = high arousal; BN =
Brown noise; PV = positive valence; NV=negative valence). . . . . . . . 67
5.3 Adjusted classification accuracy results (averaged across
participants) versus the number of trials included for classification against
brown noise trials, after sorting all trials based on ratings of arousal in
descending order. (e.g. accuracies reported for the top 12 are the result of
classifying the 12 highest rated arousal trials against all trials with brown
noise. The confidence intervals are shown as error bars for each number
of trials included.) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.4 Adjusted classification accuracy results (averaged across
participants) versus the number of trials included for classification, after
sorting all trials based on ratings of positive and negative valence in de-
scending order. (e.g. accuracies reported for the top 12 are the result of
classifying the 12 most positively rated trials against the 12 most nega-
tively rated trials. The confidence intervals are shown as error bars for
each number of trials included.) . . . . . . . . . . . . . . . . . . . . . . . 68
6.1 Trial sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
6.2 The layout of light sources (circles) and detectors (X’s). The vertical line
denotes anatomical midline. The annotated shaded areas correspond to
recording locations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.3 A. Custom-made template, B. Sample normalized [HbO2] recorded in a
trial with chills. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.4 Feature segmentation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.5 A simplified diagram depicting fusion of classifier decisions. . . . . . . . . 82
6.6 Sample trial with chills (participant 2): EDA recording and estimation,
using the average [HbO2] concentrations as the input to the arx model.
The fit achieved by the model for the depicted estimation is 52.9%. . . . 84
6.7 Sample scaled frequency response estimated for (A) chilling and (B) neu-
tral trials for participant 4. The magnitude of the frequency response was
normalized by dividing the results by the total power of the signal over
the entire frequency range. . . . . . . . . . . . . . . . . . . . . . . . . . . 85
1 Ethics approval notice . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
2 Participant consent form . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
List of abbreviations

ANS: autonomic nervous system
AR: autoregressive
arx: autoregressive model with exogenous input
BCI: brain computer interface
BN: brown noise
BVP: blood volume pulse
CNS: central nervous system
EDA: electrodermal activity
EEG: electroencephalography
HA: high arousal
[Hb]: deoxygenated hemoglobin concentration
[HbO2]: oxygenated hemoglobin concentration
MIRTOOLBOX: music information retrieval toolbox
MRI: magnetic resonance imaging
MWC: maximum wavelet coefficient
NIRS: near infrared spectroscopy
NV: negative valence
PET: positron emission tomography
PFC: prefrontal cortex
PV: positive valence
Chapter 1
Introduction
1.1 Preamble
Sections of this chapter are drawn from the following published review paper: Moghimi S,
Kushki A, Guerguerian AM, Chau T, A Review of EEG-Based Brain-Computer Interfaces
as Access Pathways for Individuals with Severe Disabilities. To appear in Assistive
technology: the official journal of RESNA 2012.
1.2 Motivation
Many individuals with severe motor disabilities may not be able to use conventional
means of communication such as speech or facial gestures to express their intentions.
Lack of communication may adversely impact the quality of life of these individuals as
well as that of their families. In particular, manifestations of emotion, such as facial
expressions and body language, are an integral part of human interactions. Emotional
communication enables caretakers to address the needs of infants [225]. Severe motor
impairments may result in an absence of physical displays of emotion, and leave caretak-
ers with no means of interpreting emotional reactions. Realizing alternative pathways
through which individuals with severe motor impairments may express their affective
response may ultimately improve their quality of life and quality of care while reducing
caregiver stress [70].
Alternative access pathways can be used to translate functional intent into electri-
cal signals for environmental or computer control [217]. Examples of these alternative
pathways include mechanical switches and vision-based systems that generate binary
control signals from limb movements [236], eye gaze [7, 37], mouth opening [137] or
tongue protrusion [128]. These solutions, however, are not appropriate for individuals
who are cognitively capable but have little or no voluntary and repeatable muscle control.
The etiology ranges from acute conditions such as brain-stem stroke, infectious basilar
arteritis, acute inflammatory demyelinating polyneuropathy [162] or brainstem tumor
[86] to chronic causes including amyotrophic lateral sclerosis, severe spastic quadriplegic
cerebral palsy, severe nemaline myopathy and multiple sclerosis [97]. For example, in-
dividuals affected by neuro-degenerative conditions such as amyotrophic lateral sclerosis
(ALS) or multiple sclerosis (MS) may experience locked-in syndrome (LIS) in the late
stages of the disease. Individuals with LIS have little or no voluntary muscle control
while retaining cognitive awareness. These individuals are aware of their surroundings,
however, they may not be able to communicate their intent via speech or facial expres-
sion. Children with congenital disabilities resulting in severe motor impairments may
also experience communication difficulties. To enable communication without relying
on motor capacity, physiologically-based communication systems have been investigated.
In particular, communication alternatives have been developed by directly using brain
activity. Technologies known as brain computer interfaces (BCI) can generate a control
command enabling users to operate communication interfaces [237, 115]. This thesis ex-
plores affective BCI systems [158] capable of detecting emotional response. These systems
constitute an emerging field of BCI research.
[Figure 1.1: General BCI Components. A brain sensing module feeds a decoder, which issues control commands to a user interface supporting three application classes: communication (restoring communication, e.g. alternative and augmentative communication (AAC), spellers, and computer-mediated communication such as internet access control); environment control (interacting with and influencing the surrounding environment, e.g. TV, bed position, lights control); and device control (controlling mechanical devices to restore mobility or dexterity, e.g. neuroprostheses, wheelchair control).]
1.3 Current clinical evidence for EEG-based BCIs, a
literature appraisal
While many different BCI system paradigms have been proposed (e.g. [133, 52]), at its
most fundamental a BCI system comprises an activity sensing module, a brain activity
decoder and an output module, as depicted in Figure 1.1. The activity sensor
measures brain activity while the decoder detects specific evoked or spontaneous brain
activity patterns and translates them into control commands. The output module takes
these control commands to drive applications such as an on-screen scanning keyboard for
communication.
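The sensor-decoder-output structure described above can be sketched in code. This is a hypothetical, minimal illustration of the three components and their data flow, not the implementation of any system cited in this chapter; all names (`BCIPipeline`, `sense`, `decode`, `act`) are invented for the example.

```python
from typing import Callable, List


class BCIPipeline:
    """Minimal sketch of the three BCI components: sensor, decoder, output."""

    def __init__(self,
                 sense: Callable[[], List[float]],      # brain sensing module
                 decode: Callable[[List[float]], str],  # brain activity decoder
                 act: Callable[[str], None]):           # output module
        self.sense = sense
        self.decode = decode
        self.act = act

    def step(self) -> str:
        signal = self.sense()          # measure brain activity
        command = self.decode(signal)  # translate the pattern into a command
        self.act(command)              # drive an application, e.g. a keyboard
        return command


# Toy usage: a single-channel threshold "decoder" driving a selection action.
pipeline = BCIPipeline(
    sense=lambda: [0.8],
    decode=lambda s: "select" if s[0] > 0.5 else "idle",
    act=lambda cmd: None,
)
```

In a real system the decoder would be a trained classifier operating on EEG or NIRS features rather than a fixed threshold.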
One of the key aspects of BCI development is choosing a brain sensing module suit-
able for long-term bedside monitoring. With their high spatial resolution, electrode
implants ([98, 103, 104]) have facilitated accurate cursor control in humans ([77]) and
high throughput ([194]) and multi-joint prosthesis control ([227]) in primates, but do
require invasive surgery ([69]). Sacrificing some spatial resolution for non-invasiveness,
economy and portability, EEG is the most widely used modality in BCI applications
to date. Therefore, Electroencephalography(EEG) which monitors electrical potentials
4
from the skull surface has dominated BCI research due to its non-invasiveness, low cost,
portability and convenient set-up requirements.
Another modality explored for BCI development is magnetic resonance imaging (MRI)
[234]. MRI is capable of detecting hemodynamic changes by monitoring the blood oxygenation
level dependent (BOLD) signal with high spatial resolution, and can detect signals from
deeper brain areas. In fact, Weiskopf et al. have been able to differentiate brain activity
corresponding to motor imagery, visual imagery and spatial navigation using MRI [233].
Despite these findings, and the spatial resolution available using MRI, the current MRI
technologies are bulky, expensive, and require radio frequency and magnetic shielding
which impedes their use as a portable bedside monitoring system.
Another emerging cerebral hemodynamic monitoring technology for BCI develop-
ment is near infrared spectroscopy (NIRS). NIRS monitors the level of oxygenated and
deoxygenated hemoglobin concentrations ([HbO2] and [Hb], respectively) in the cerebral
cortex using optical imagery. Near-infrared light shone through the adult skull is
detected 2.5-3 cm from the source [228, 90]. The detected light intensity can be used
to identify [HbO2] and [Hb] in the underlying tissue due to the differences in absorp-
tion characteristics of these two chromophores. NIRS systems are relatively inexpensive,
portable, and suitable for long-term bedside monitoring. Recent studies have illustrated
the ability of NIRS to detect task-related changes in brain activity. These findings have
indicated that active music imagery (mental singing) can be differentiated from the rest
state and mental math with accuracies significantly above chance [177, 45, 65]. In
addition to user convenience, NIRS is relatively immune to the electrogenic artifacts
caused by eye movements and muscle contractions that are frequently encountered in the prefrontal
area. Therefore, in this thesis, NIRS was selected for detecting hemodynamic changes in
the prefrontal cortex associated with emotional responses. However, due to the extent
of BCI systems developed using EEG, a literature appraisal of clinically investigated
EEG-based BCI systems was conducted to set the stage for understanding the potential
of NIRS.
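The two-chromophore principle described above, namely that [HbO2] and [Hb] can be recovered from light intensity measured at two wavelengths because their absorption spectra differ, can be sketched with the standard modified Beer-Lambert law. The extinction coefficients, source-detector separation, and differential pathlength factor below are illustrative values only, not the calibration used in this thesis.

```python
import numpy as np

# Modified Beer-Lambert law (standard form):
#   delta_OD(w) = (eps_HbO2(w) * d[HbO2] + eps_Hb(w) * d[Hb]) * L * DPF
# Optical density changes at two wavelengths w give a 2x2 linear system
# that is solved for the two chromophore concentration changes.

eps = np.array([[1.49, 3.84],   # 760 nm: [eps_HbO2, eps_Hb] (illustrative)
                [2.53, 1.80]])  # 850 nm: [eps_HbO2, eps_Hb] (illustrative)
L = 3.0    # source-detector separation in cm (2.5-3 cm per the text)
DPF = 6.0  # differential pathlength factor (illustrative)


def concentration_changes(delta_od):
    """Solve the 2x2 system for (d[HbO2], d[Hb]) from optical density changes."""
    return np.linalg.solve(eps * L * DPF, delta_od)


# Example: optical density changes measured at the two wavelengths
d_hbo2, d_hb = concentration_changes(np.array([0.01, 0.02]))
```

The same linear inversion extends to more wavelengths via least squares; commercial NIRS systems apply vendor-specific coefficients and pathlength corrections.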
1.3.1 BCI Development Using Electroencephalography
To determine the extent of existing evidence for EEG-based BCI use by individuals with
disabilities and to identify research gaps, a literature review was conducted. The spe-
cific focus of this search was BCI systems for communication and environmental control.
Studies related to other BCI applications such as brain-controlled prostheses were ex-
cluded from the review. PubMed, ISI Web of Science, and OVID (MEDLINE, CINAHL,
EBM Reviews, EMBASE, and Ovid Healthstar) databases were searched using keyword
combinations containing brain computer interface and one of disability, disabilities, dis-
abled or ALS. Only English-language journal articles that directly evaluated EEG-based
BCI technology with participants with physical disabilities were included. This search
was further narrowed to journal articles published between January 1999 and December
2010.
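The keyword strategy above pairs the primary phrase with each disability-related term. A small sketch of that combinatorial structure follows; the exact query syntax accepted by PubMed, ISI Web of Science, and OVID differs between databases, so the strings here are only illustrative.

```python
from itertools import product

# Every combination of the primary phrase with one secondary term,
# mirroring the search strategy described in the text.
primary_terms = ['"brain computer interface"']
secondary_terms = ["disability", "disabilities", "disabled", "ALS"]

queries = [f"{p} AND {s}" for p, s in product(primary_terms, secondary_terms)]
# One query per secondary term, four in total.
```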
The search identified 380 articles. The reference and citation lists of the retrieved
articles were further examined. The articles were screened, based on title and abstract, to
only include studies involving individuals with disabilities. This screening exercise yielded
119 articles. A subsequent screen for EEG studies focused on restoring communication in
the target population reduced the sample to 39 articles, as listed in Table 1.1.
In the following sections, we appraise these articles with respect to the participants’
characteristics, and to the articles’ control mechanisms, clinical findings, and evaluation
criteria.
The level of evidence for clinical interventions is typically rated according to study
design criteria [192]. None of the studies examined were controlled experiments per se.
Fourteen (14) articles compared BCI use between able-bodied participants and individuals
with disabilities. However, if we regard BCI as a clinical intervention, these studies do
not follow conventional experimental designs, as there were no separate control and
experimental groups. Thus, according to the conventional rating criteria [192], the entire
collection of selected articles would be rated as level V, i.e., case series without controls. By and
large, the majority of studies involved a small number of participants; only 8 of the 39
studies had more than 6 participants. The selected studies do not propose an interven-
tion for a certain population, but rather report efforts of restoring communication in
the few individuals participating in the respective investigations. Therefore, one may
argue that the focus of these BCI assistive technology studies has been on individual-
centered solutions [199]. In this light, introducing clinical rating guidelines revolving
around person-centered constructs such as person-environment interaction [88] may be
more appropriate for these studies.
The selected studies included six single-subject reports, 25 studies with fewer than six
participants, and eight studies with 6-35 participants with disabilities. A total of 14
studies involved both able-bodied participants and individuals with disabilities.
Of the 39 studies involving participants with disabilities, 29 studies reported BCI
evaluation results for individuals with locked-in syndrome (LIS) [211]. In LIS, consciousness
is preserved but severe motor and communication impairments are present (quadriplegia
and anarthria). Depending on the level of residual motor control, LIS is classified into
three categories [8]: (1) incomplete LIS, where remnant voluntary motion is preserved in
addition to those retained in classical LIS, (2) classical LIS, which refers to cases with
total immobility except for blinking and vertical eye movement, and (3) total LIS where
no voluntary motor control is preserved. Among the studies involving participants with
LIS, only two studies recruited participants with total LIS. The majority of these stud-
ies (64%) considered participants with LIS resulting from amyotrophic lateral sclerosis
(ALS), an adult-onset progressive neurodegenerative disease that affects both upper and
lower motor neurons [147]. In its late stages, ALS can lead to the locked-in state.
Other conditions reported in the reviewed articles included different levels of spinal
cord injury (SCI) (8 studies), cerebral palsy (4 studies), cerebral paresis (1 study),
muscular dystrophy (4 studies), stroke (3 studies), chronic Guillain-Barré syndrome (2
studies), multiple sclerosis (1 study), spinal muscular atrophy (2 studies), post-polio (1
study), and primary lateral sclerosis (1 study). All of the selected studies considered
adult participants.
1.3.2 Applications and User Interfaces
The selected articles used EEG-based BCI systems for two main applications, namely,
augmentative and alternative communication (34 studies) and environmental control (5
studies). Augmentative and alternative communication tools enable or facilitate com-
munication with other individuals and include spellers and Internet navigation tools.
Environmental control tools enable the user to modify environmental conditions and in-
clude body-position control, control of electronic appliances, and navigation in real and
virtual environments.
One spelling application uses a binary tree arrangement of the alphabet to
efficiently locate a desired letter [166]. At each level of the tree, the user is presented with
two segments of the alphabet and can eventually choose the desired letter by traversing
the alphabet tree. Another common spelling interface is the scanning keyboard, where
different columns and rows of an array of letters are sequentially intensified [230]. A
third widely used interface involves on-screen object navigation. The user-guided cursor
points to the desired options or letters for communication purposes.
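To make the binary-tree selection concrete, the following sketch simulates the user's repeated half-alphabet choices. This is a hypothetical illustration, not code from any reviewed system; the function `select_letter` and the perfectly accurate simulated user are assumptions:

```python
# Hypothetical sketch of a binary-tree speller: the user repeatedly
# chooses the half of the alphabet that contains the target letter.
import string

def select_letter(target, alphabet=string.ascii_uppercase):
    """Simulate binary-tree selection; return the number of binary
    choices needed to isolate `target` from the alphabet."""
    segment = list(alphabet)
    choices = 0
    while len(segment) > 1:
        mid = len(segment) // 2
        left, right = segment[:mid], segment[mid:]
        # A real BCI would decode this choice from brain activity;
        # here we simulate a perfectly accurate user.
        segment = left if target in left else right
        choices += 1
    return choices

# For a 26-letter alphabet, any letter is reached in at most
# ceil(log2(26)) = 5 binary selections.
print(select_letter("S"))
```

This efficiency (logarithmic in alphabet size) is what makes tree-based layouts attractive for slow, binary control signals such as SCP shifts.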
Table 1.1 summarizes EEG-based BCI studies involving individuals with disabilities
in the past decade.
1.3.3 Controlling brain computer interfaces
EEG-based BCIs rely on modulation of brain activity for application control. The mechanism
used to modulate brain activity may rely on reactions evoked by externally presented
stimuli or on activity generated spontaneously by a trained user. The most commonly used control
Table 1.1: Summary of BCI studies on individuals with disabilities (1999-2005)

Authors | Participants (condition) | Control mechanism | Application
Birbaumer et al. (1999) [13] | 2 (ALS) | SCP | Spelling
Kubler et al. (1999) [114] | 3 (ALS), 13 able-bodied | SCP | Spelling
Birbaumer et al. (2000) [14] | 5 (ALS) | SCP | Spelling
Donchin, Spencer, Wijesinghe (2000) [38] | 3 (complete paraplegia), 1 (incomplete paraplegia), 10 able-bodied | P300 | Spelling
Kubler et al. (2001) [115] | 2 (ALS) | SCP | Spelling
Kaiser et al. (2002) [93] | 1 (ALS) | SCP | Environmental control
Hinterberger et al. (2003) [76] | 1 (ALS) | SCP | Spelling
Sellers et al. (2003) [202] | 3 (ALS) | P300 | Spelling
Muller et al. (2003) [148] | 1 (infantile CP) | SMR | Cursor control
Neumann and Kubler (2003) [154] | 11 (not specified) | SCP |
Krausz et al. (2003) [107] | 4 (SCI, partial paralysis) | SMR | Cursor (ball) control
Neumann, Birbaumer (2003) [153] | 5 (ALS) | SCP | Spelling
Neumann et al. (2003) [154] | 5 (ALS) | SCP | Spelling
Neuper et al. (2003) [155] | 1 (CP) | SCP | Spelling
Bayliss et al. (2003) [10] | 1 (ALS), 9 able-bodied | P300 | 3-choice switch
Kubler et al. (2004) [116] | 10 (ALS), 10 able-bodied | SCP | Cursor control
Wolpaw, McFarland (2004) [238] | 2 (SCI), 2 able-bodied | SMR | Cursor control
Sellers et al. [204] | 15 (ALS), 1 (brain stem stroke) | P300 | Spelling
Kubler et al. (2005) [117] | 4 (ALS) | SMR | Cursor control
Piccione et al. (2005) [173] | 1 (ALS), 1 (LIS post vertebrobasilar thrombosis), 1 (SCI), 1 (Guillain-Barré syndrome), 1 (MS), 7 able-bodied | P300 | 4-choice switch

Note: ALS: amyotrophic lateral sclerosis, CP: cerebral palsy, DMD: Duchenne muscular dystrophy, LIS: locked-in syndrome, Lv: level, MD: muscular dystrophy, MS: multiple sclerosis, SCI: spinal cord injury, SCP: slow cortical potentials, SMA: spinal muscular atrophy, SMR: sensorimotor rhythms.
Table 1.1 (continued): Summary of BCI studies on individuals with disabilities (2006-2010)

Authors | Participants (condition) | Control mechanism | Application
Karim et al. (2006) [94] | 1 (ALS) | SCP | Internet surfing
Neuper et al. (2006) [26] | 1 (CP), 1 (MDD), 1 (ALS), 1 (SCI, lv. C4), 1 (SCI, lv. C5) | SMR | Spelling
Sellers, Donchin (2006) [203] | 3 (ALS), 3 able-bodied | P300 | Spelling
Vaughan et al. (2006) [226] | 1 (ALS) | SMR, SCP, P300 | Spelling, cursor control
Wang et al. (2006) [230] | 11 (SCI, lv. C4-C7), 16 able-bodied | SSVEP | Environmental control
Kauhanen et al. (2007) [95] | 5 (SCI, lv. C4-C5), 1 (Guillain-Barré syndrome) | SMR | Cursor (circle) control
Leeb et al. (2007) [127] | 1 (SCI, complete lesion below C4, incomplete lesion below C5) | SMR | Virtual environment navigation
Bai et al. (2008) [6] | 1 (brain stroke), 1 (ALS), 9 able-bodied | SMR | Binary switch
Cincotti et al. (2008) [27] | 14 (SMA, DMD), 14 able-bodied | SMR | Environmental control
Hoffmann et al. (2008) [78] | 1 (CP), 1 (SMA), 1 (ALS), 1 (brain and spinal cord injury), 1 (post-anoxic encephalopathy), 4 able-bodied | P300 | 6-choice switch
Kubler, Birbaumer (2008) [112] | 29 (ALS), 6 (Guillain-Barré syndrome), 1 (muscular dystrophy), 1 (cerebral paresis), 1 (diffuse brain damage post hypoxia), 2 (brain stroke) | SCP | Cursor control, spelling
McFarland et al. (2008) [135] | 1 (SCI, lv. C4), 1 (SCI, lv. T7) | SMR | Cursor control
Nijboer et al. (2008) [159] | 8 (ALS) | P300 | Spelling
Kubler et al. (2009) [113] | 4 (ALS) | P300 | Spelling (auditory)
Babiloni et al. (2009) [5] | 6 (DMD) | SMR | Environmental control
Conradi et al. (2009) [29] | 7 (SCI) | SMR | Cursor control
Felton et al. (2009) [48] | 2 (ALS), 1 (MD post-polio), 3 (SMA), 2 (post-polio), 1 (CP), 2 (SCI), 1 (LIS), 8 able-bodied | SMR | Cursor control
Bai et al. (2010) | 3 (PLS), 3 (ALS) | SMR | Cursor control
Mugler et al. (2010) | 3 (ALS), 10 able-bodied | P300 | Internet browsing

Note: ALS: amyotrophic lateral sclerosis, CP: cerebral palsy, DMD: Duchenne muscular dystrophy, LIS: locked-in syndrome, Lv: level, MD: muscular dystrophy, MS: multiple sclerosis, PLS: primary lateral sclerosis, SCI: spinal cord injury, SCP: slow cortical potentials, SMA: spinal muscular atrophy, SMR: sensorimotor rhythms.
Table 1.2: BCI Control Mechanisms.

Spontaneous: Slow Cortical Potentials (SCP); Sensorimotor Rhythms (SMR)
Evoked: P300; Steady State Visually Evoked Potentials
Mental Task: Language tasks; Mental Arithmetic
mechanisms are shown in Table 1.2. The remainder of this section discusses each of
these mechanisms in detail. The review of the selected articles identified several
challenges surrounding BCI control mechanisms, which are listed in Appendix A.
Slow cortical potentials
The most frequently deployed control mechanism among the selected studies is the slow
cortical potential (SCP), a spontaneously generated signal. SCPs are slowly varying
trends that are time-locked to specific external or internal events [12]. The duration of
these potentials generally ranges from 300 milliseconds to several seconds [13].
Voluntary control using behavioral manipulations can cause positive or negative SCP
shifts. Negative deviations of the SCPs are known to be associated with arousal as well
as response preparation [220, 56]. Positive deflections in SCPs are related to response
inhibition and relaxation [42]. Voluntary control of SCP can be achieved by providing
visual or auditory biofeedback to participants [43, 12].
The thought translation device (TTD) is an example of an SCP-based BCI [11], employing
voluntarily generated SCPs to control a computer. The TTD requires a training phase
during which the user receives visual or audio-visual feedback reflecting the presence of
positive or negative deflections [115]. In particular, the reviewed articles reported that
successful use of the SCP-based BCI (achieving accuracies higher than 70%) required
several training sessions. Among the reviewed articles, SCP-based BCIs were used to
operate spelling interfaces, navigate the Internet, and control environmental devices.
Sensorimotor rhythms
Like SCPs, sensorimotor rhythms (SMRs) are spontaneously occurring EEG
activities in the somatosensory cortex in the absence of movement [157]. These rhythms
are attenuated by movement or somatosensory stimulation. Control of SMRs can be
achieved through biofeedback-based training as the user performs motor imagery [119,
170]. While SMR-based BCIs were successfully used by individuals with disabilities
[117, 170, 239], voluntary modulation of SMRs for BCI control required many training
sessions.
SMR-based BCIs have been used for cursor control, where bilateral motor imagery of
hands, legs, and tongue was used to control the direction of cursor movement. SMRs
have also been used for selecting various targets. For example, McFarland et al. [135]
used a linear combination of SMRs to enable selection once the cursor reached the target
choice. Such cursor control can also be used for spelling.
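The mu-band attenuation that SMR-based BCIs exploit can be illustrated with a simple band-power feature. The sketch below is illustrative only; the function name `mu_band_power`, the periodogram estimator, and the synthetic signals are assumptions, not the reviewed systems' actual pipelines:

```python
# Illustrative sketch: estimate mu-band (8-12 Hz) power from one EEG
# channel; motor imagery is expected to attenuate this band over
# sensorimotor areas.
import numpy as np

def mu_band_power(eeg, fs, band=(8.0, 12.0)):
    """Mean power spectral density in `band` via a simple periodogram."""
    eeg = np.asarray(eeg) - np.mean(eeg)
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(eeg)) ** 2 / (fs * len(eeg))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return psd[mask].mean()

# Synthetic example: a 10 Hz rhythm at rest vs. attenuated during imagery.
fs = 250
t = np.arange(0, 2, 1 / fs)
rest = np.sin(2 * np.pi * 10 * t)           # strong mu rhythm
imagery = 0.3 * np.sin(2 * np.pi * 10 * t)  # attenuated mu rhythm
print(mu_band_power(rest, fs) > mu_band_power(imagery, fs))  # True
```

An SMR-based BCI then maps such band-power features (after user training) onto cursor velocity or target selection.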
P300 evoked potentials
In contrast to the spontaneous control mechanisms, the P300 is an evoked response.
The P300 wave has a latency of approximately 300 ms and is a positive-going component of the event-
related potential that results from exposure to an occasional stimulus [181]. This response
is generated by a network comprising the prefrontal cortex, anterior insula, cingulate
gyrus, temporoparietal cortex, medial temporal cortex, and the hippocampal formation
[183] and can be maximally recorded from the midline centroparietal regions [174].
An example of a P300-based BCI is the P300 speller [47] that intensifies columns and
rows of an alphabet matrix presented visually to the user. A P300 response is elicited
when the user is presented with intensification of the row and column containing the
desired letter. Thus, the presence of the P300 can be used to detect the user’s choice.
The P300 speller was shown to achieve higher than 70% accuracy in 5 of 6 participants
with ALS in [159]. Moreover, Sellers et al. reported that with visual and auditory P300-
inducing stimuli, 2 of 3 participants with ALS achieved a selection accuracy comparable
to that of able-bodied individuals using a similar system [202].
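The row/column selection logic described above can be sketched as epoch averaging followed by picking the stimulus whose average response shows the largest late positivity. This is a simplified, hypothetical illustration on synthetic data; `detect_target` and the latency window are assumptions, not the P300 speller's published pipeline:

```python
# Hypothetical sketch of P300 target detection: average the epochs
# following each row/column flash and pick the stimulus whose average
# shows the largest post-stimulus positivity.
import numpy as np

def detect_target(epochs_by_stimulus, fs, window=(0.25, 0.45)):
    """epochs_by_stimulus: dict mapping stimulus id -> array of shape
    (n_repetitions, n_samples). Returns the stimulus id with the
    largest mean amplitude in the assumed P300 latency window."""
    lo, hi = int(window[0] * fs), int(window[1] * fs)
    scores = {
        stim: np.mean(epochs, axis=0)[lo:hi].mean()
        for stim, epochs in epochs_by_stimulus.items()
    }
    return max(scores, key=scores.get)

# Synthetic demo: stimulus "row3" carries a simulated P300-like bump.
fs, n = 250, 200  # 0.8 s epochs
rng = np.random.default_rng(0)
epochs = {f"row{i}": rng.normal(0, 1, (15, n)) for i in range(6)}
t = np.arange(n) / fs
epochs["row3"] += 2.0 * np.exp(-((t - 0.3) ** 2) / 0.002)
print(detect_target(epochs, fs))  # expected: row3
```

Averaging across repetitions is what makes the weak single-trial P300 detectable above the background EEG, at the cost of selection speed.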
Steady state visually evoked responses
When presented with repetitive visual stimuli, EEG recordings from the parieto-occipital
sites demonstrate peaks at frequencies matching that of the stimuli and its harmonics
[73]. This response is known as the steady state visually evoked potential (SSVEP). The
physiological mechanism underlying the generation of SSVEPs remains largely unknown,
although the amplitude of SSVEPs is reportedly related to increases in synaptic activity
[165]. SSVEP peaks are suggested to intensify with selective attention to the stimulus
[145, 3].
In Wang et al. (2006), 11 volunteers with SCI attempted to operate an environmental
control system using an SSVEP-based BCI [230]. Of the 11 participants, 10 were able to reach
an information transfer rate of 21 bits/minute. In this study, an
array of buttons, each flickering at a different frequency, was presented to the user. The
user chose the desired option by attending to the appropriate button.
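A minimal SSVEP classifier in the spirit of such systems compares spectral power at the candidate flicker frequencies and their harmonics. The sketch below is illustrative on synthetic data; the power-comparison rule and function name are assumptions, not the method of Wang et al.:

```python
# Illustrative sketch: classify which flickering button the user
# attends to by comparing spectral power at the candidate stimulation
# frequencies (fundamental plus harmonics).
import numpy as np

def classify_ssvep(eeg, fs, candidate_freqs, harmonics=2):
    """Return the candidate frequency whose fundamental and harmonics
    carry the most power in the EEG spectrum."""
    eeg = np.asarray(eeg) - np.mean(eeg)
    spectrum = np.abs(np.fft.rfft(eeg)) ** 2
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)

    def score(f0):
        # Sum power at the spectral bins nearest each harmonic of f0.
        return sum(spectrum[np.argmin(np.abs(freqs - h * f0))]
                   for h in range(1, harmonics + 1))

    return max(candidate_freqs, key=score)

# Synthetic demo: the user attends to the 15 Hz button.
fs = 250
t = np.arange(0, 4, 1 / fs)
rng = np.random.default_rng(1)
eeg = np.sin(2 * np.pi * 15 * t) + 0.5 * rng.normal(size=t.size)
print(classify_ssvep(eeg, fs, [10.0, 12.0, 15.0, 17.0]))  # expected: 15.0
```

Because the stimulus frequency is known in advance, SSVEP decoding requires little or no user training, which partly explains the high transfer rates reported.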
Mental task
Several other mental tasks, such as language and arithmetic tasks, have also been shown to
induce distinctive EEG patterns in able-bodied individuals [140, 185]. Despite the
cognitive load imposed by these BCIs, they may have merit as BCI control mechanisms
for the target population. To the best of our knowledge, however, BCIs based on language and
arithmetic mental tasks have not been tested with the target population.
1.3.4 Evaluation Criteria
The performance of BCI systems has generally been measured by speed and accuracy,
both of which are important for communication. Since the reviewed studies focused on different
applications (e.g., spelling, cursor control), various measures of speed and accuracy were
used to report system performance, as listed in Table 1.1. Examples of accuracy measures
include classification accuracy and the r2 value, which reflects the level of correlation between
user intent and the signal features [237]. The number of characters typed per minute
has also served as a measure of speed. Information transfer rate, also known as bit rate,
has also been commonly used as a combined measure of accuracy and speed [237]. This
measure reflects the amount of "correct" information transferred per unit time.
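The bit-rate measure can be made concrete using the commonly cited Wolpaw formulation, which combines the number of possible targets N and the selection accuracy P. The helper below is a sketch; the function name and the 4-choice example are assumptions for illustration:

```python
# Wolpaw-style information transfer rate: for N targets and selection
# accuracy P, the bits conveyed per selection are
#   B = log2(N) + P*log2(P) + (1 - P)*log2((1 - P)/(N - 1)).
# Multiplying by selections per minute yields bits/minute.
import math

def bits_per_selection(n_targets, accuracy):
    if accuracy <= 0 or n_targets < 2:
        return 0.0
    b = math.log2(n_targets)
    if accuracy < 1.0:
        b += accuracy * math.log2(accuracy)
        b += (1 - accuracy) * math.log2((1 - accuracy) / (n_targets - 1))
    return max(b, 0.0)  # clamp: below-chance accuracy conveys no information

# Example: a 4-choice interface at 90% accuracy, 10 selections/minute.
b = bits_per_selection(4, 0.9)
print(round(b * 10, 2), "bits/minute")  # prints "13.73 bits/minute"
```

The formula makes explicit why accuracy and speed trade off: a faster interface with lower accuracy can yield a lower bit rate than a slower but more reliable one.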
1.3.5 Future Directions in BCI research
A closer look at the reviewed studies provides a means of identifying emerging challenges
in BCI development and ways to overcome them. In this light, the current section
summarizes future directions identified in BCI research.
Involve pediatric populations
The reviewed articles largely focused on individuals with adult-onset disabilities. It is
unclear whether the findings of these studies translate to individuals with congenital
disabilities, who often have never experienced any means of communication. For example,
to the best of our knowledge, BCIs relying on motor imagery have never been tested
with individuals who have never experienced voluntary control of their movements.
Prolonged deprivation of communication in childhood can lead to learned helplessness and
impede the development of contingency awareness [216]. Despite this compelling clinical
reason for investigating BCI use in the early stages of life, none of the reviewed studies
have investigated the effectiveness of EEG-based BCIs in the pediatric population.
Consider personal contextual factors in determining BCI speed requirements
Communication speed (e.g. words typed per minute) has traditionally been an important
factor in assessing BCI performance. The emphasis on maximizing speed may stem from
studies with able-bodied individuals or those with traumatic disabilities who may expect
BCI systems to replicate the high throughput of pathways such as speech. Nonetheless,
joint BCI studies involving both able-bodied individuals and those with severe disabilities
have pinpointed delays in reaction time [6] and slower item selection rates [38] among the
participants with disabilities. Thus, the speed expectations of patients are likely very different
from those of their able-bodied counterparts. Indeed, proficient users of single-switch
scanning systems typically only achieve 8-24 words per minute [61]. Further, children
with developmental disabilities and communication difficulties are known to exhibit only
a handful of intentional communication acts per minute (e.g., words, gestures, and
vocalizations) [18]. Therefore, we recommend that, as an indicator of BCI performance, speed
ought to be contextualized in terms of the individual's time scale for communication,
taking into account the time required to process received information and the time needed
to muster the resources to respond. The level of cognitive awareness of the BCI user has
a significant effect on the choice of control mechanism and may affect the speed of oper-
ating the BCI. In particular, spontaneous control mechanisms are appropriate for users
who can voluntarily modulate EEG patterns. However, due to the lack of alternative
means of communication, the cognitive awareness of the participant cannot always be
assessed using standard assessment tools that rely on motor responses [87]. Therefore, a
comprehensive evaluation of BCI performance ought to include an appropriate cognitive
assessment.
Train and evaluate in ecologically salient environments
BCI evaluation would not be complete without considering the environmental context
in which the BCI operates. Because a BCI system is often the only means of communication for
an individual with severe disabilities, BCI solutions must allow long-term use in home
environments. Despite this, only a handful of articles have evaluated BCI performance in
home environments [76] or a simulated home-like environment [27]. A notable example
is the evaluation of the BCI 2000 system modified for use in home environments [226]. In
evaluating BCI accuracy, contextual factors may also include communication partners.
In this regard, it is important to view the BCI as a tool for facilitating meaningful
communication and not necessarily as a tool for producing exact selections. For example,
when using a BCI system to control a scanning keyboard, meaningful communication can
occur in spite of spelling errors. This suggests that to obtain an environmentally relevant
evaluation of a BCI, a measure of the conversational partner’s receptive communication
may be important. While BCI training and evaluation may be performed in the user’s
home environment, trained personnel must often be present to ensure proper set-up and
operation of the equipment. This can limit BCI users to the geographical vicinity of
research facilities. To overcome these geographical restrictions, both researchers and
patients may benefit from tele-monitoring systems that enable remote supervision of
training [148].
Introduce user-aware BCIs
None of the reviewed studies incorporated user state (fatigue, attention, emotional
status) during BCI operation when determining performance. For example, it
is not clear whether the performance degradation observed during long periods of BCI
use results from exacerbated fatigue or from failure of the detection algorithms
used. Detecting changes in user status, such as the level of fatigue and attention, may
improve BCI performance assessments. In addition, awareness of user-status may allow
the BCI to more intimately accommodate the user's moment-by-moment needs. For
example, once user fatigue is detected, the system can suggest a rest period. Specific EEG
patterns have been shown to reflect different states such as fatigue and attention. Extended
periods of performing tasks such as mental arithmetic or driving result in an increase in
frontal theta rhythms [224]. Hamadicharef et al. (2009) were able to differentiate attention
(reading/arithmetic) versus non-attention (rest) states with accuracies up to 89.4%
[67]. Petrantonakis and Hadjileontiadis (2010) showed that the six basic emotions (hap-
piness, surprise, anger, fear, disgust, and sadness) could be differentiated with 83.33%
accuracy using EEG activity [168]. Based on these findings, in future studies, the EEG
signals monitored by BCI systems may also be used to estimate user state, leading to a more
user-accommodating implementation. EEG signals may also reflect the dynamics of the
interaction between the user and BCI systems. For example, error-related potentials,
which are manifested after an error occurs [24, 80], may be used as a post-hoc correction
mechanism. Once an error-related potential appears, an auto-correction strategy may be
invoked or user verification may be solicited. Using EEG patterns associated with the
user-system interaction such as error-related potentials may lead to more usable BCIs
[25].
Develop more effective training protocols
None of the reviewed studies focused on the development of engaging training
paradigms. Training is an essential part of realizing SMR- and SCP-based BCI systems.
Improving the training interface may directly affect training success. Studies involving
able-bodied participants have previously explored alternative training paradigms; the
interested reader is referred to Neuper and Pfurtscheller (2010) [155]. For example,
immersive training protocols (using virtual environments) have been suggested for realizing
an informative yet engaging training environment [127]. Using more engaging training
paradigms such as those involving learning reinforcements may increase user motivation,
improve training effectiveness and reduce requisite training times. Such training regimens
would be particularly useful for motivating the pediatric user with disabilities.
1.3.6 Towards affective brain computer interfaces
Despite the merits offered by existing BCI systems, many nonverbal children and youth
are not candidates for existing BCI technologies due to developmental delays, limited
expressive communication, and unknown levels of receptive communication. Indeed,
the aforementioned challenges preclude the training of specific mental activities. How-
ever, these individuals are still candidates for affective BCIs (A-BCI) which enable the
automatic recognition of affective states using brain activity [158]. A-BCIs may provide
a means of detecting spontaneous and natural reactions to emotion-evoking stimuli.
A-BCI development is a step towards addressing the gaps in existing BCI research
introduced in Section 1.3.5. Emotions are an intuitive and natural means of responding to stimuli.
Therefore, A-BCI may provide an opportunity to realize communication pathways for
the pediatric population. A-BCIs can bring user-state awareness to existing BCI
systems. Emotional awareness may help create more user-accommodating systems and
develop more effective training paradigms. Unlike existing active BCI systems, which
generate voluntary and direct commands for communication (e.g. the P300 speller), A-BCIs
may offer passive but intuitive control. Passive BCI systems detect implicit information
regarding the user state (e.g. emotions) and intentions, and enable situational interpre-
tations [242]. Ultimately, an affective BCI may enable the decoding of emotional state
in the absence of overt emotional expression.
Computer-based detection of emotional responses may enhance implicit communica-
tion about the user in human computer interaction systems [31]. Affective computing has
long been touted for its potential for more realistic and user-accommodating interactions
[171]. An emotionally-aware system stands to benefit non-verbal individuals with severe
disabilities by estimating their emotional state in the absence of more explicit means
of interaction (e.g. speech and gestures). In turn, knowledge of the patient's affective
state may help to mitigate caregiver stress and facilitate timely treatment decisions [70].
1.4 Neural correlates of emotion
Emotional response has been shown to engage different pathways in the central and
autonomic nervous systems. Autonomic nervous system (ANS) sensors, such as
those that measure cardiovascular, respiratory, and electrodermal activity, can unveil emotional
responses [108, 129]. For a review of studies using ANS activity sensors for identifying
emotions, the reader is referred to [108].
Based on theories suggesting a close relationship between emotional response and
survival, key neural structures in the brain have been identified in different animal studies.
Figure 1.2 summarizes the many neural structures involved in orchestrating an emotional
response within what is known as the survival network [123]. As shown in Figure 1.2,
emotional response can engage many substrates in the mammalian brain. The human
brain is no exception to this rule. Neuro-imaging techniques such as positron emission
tomography (PET) [221] and magnetic resonance imaging (MRI) [21] have provided an
opportunity for in vivo characterization of emotional perception in the human brain
[15, 16, 44, 206, 209].
Various brain circuits, including parts of the limbic system and the amygdala, have been found
to be responsible for the perception of emotional stimuli [164, 208, 126]. Among these
areas, the frontal cortex plays an important role in regulating emotional response to sen-
sory input [34, 33, 187, 141]. Previous studies have confirmed the role of the frontal area
in emotional response. For example, the severity of depressive symptomatology in
patients with stroke lesions was reported to be significantly correlated with the proximity
of the lesion to the frontal pole [186]. Moreover, left and right frontal activations were
also found in response to watching video clips inducing positive and negative emotional
responses, respectively [235]. Activations in the orbito-frontal and ventral prefrontal cor-
tex in response to highly pleasurable self-selected music excerpts have also been reported
[15]. Tanida et al. showed that inducing mental stress could lead to bilateral increases
or decreases of oxygenated hemoglobin ([HbO2]) and deoxygenated hemoglobin ([Hb]),
[Figure 1.2 depicts the survival network schematically: sensory input reaches the amygdala via the thalamus, cortex, and hippocampus, and output pathways include the orbitofrontal cortex (choice behavior, memory of emotional events), hippocampus (memory consolidation of emotional events, spatial learning), dorsal and ventral striatum (instrumental approach or avoidance behavior), lateral hypothalamus (tachycardia, skin conductance response, paleness, pupil dilation, blood pressure elevation), dorsal motor nucleus of the vagus/nucleus ambiguus (ulcers, urination, defecation, bradycardia), and paraventricular nucleus (corticosteroid release, the "stress response").]

Figure 1.2: Various structures within the survival network involved in the emotional response, and the resulting outputs. [123]
respectively [219]. Matsuo et al. reported PFC [HbO2] increases in a group of individuals
with post-traumatic stress disorder, as well as in a healthy control group, in response
to trauma-related videos [134].
1.4.1 The role of the prefrontal cortex in the default, salience, and executive control networks
One of the remarkable features of the brain is its ability to attend to salient events in the
environment. The ability of the brain to regulate various processes and divert attention
to the more salient ones has been attributed to intrinsic and distinct functional networks
[214]. These networks are composed of strongly coupled sets of information processing
nodes distributed in the brain. Functional connectivity studies have confirmed the
existence of at least three canonical networks: (i) the central executive network, (ii) the default
network, and (iii) the salience network [214]. The salience and central executive networks
exhibit increased activity during cognitively demanding tasks [63]. The default network,
on the other hand, shows higher levels of activity during the resting state [63]. By regulating
the activation and deactivation of these networks, the brain can maintain various ongoing
processes during rest and respond to salient events when required. These salient
events could involve cognition, homeostasis, or emotions [201]. Therefore, emotional
responses may result in activity changes within these intrinsic brain networks. The salience,
central executive, and default networks are shown to encompass different areas within the
prefrontal cortex. The dorsolateral prefrontal cortex is shown to be part of the central
executive network [214, 201]. The ventromedial prefrontal cortex serves as one of the
nodes in the default network [214, 63]. Finally, the salience network encompasses the
ventrolateral prefrontal cortex [214, 201]. In addition, various areas in the prefrontal
cortex are shown to act as information hubs by integrating diverse information sources
within different brain networks. In a functional magnetic resonance imaging study,
Buckner et al. [20] identified prominent hubs in the medial/lateral prefrontal cortex.
Based on the existing evidence, recordings from the prefrontal cortex may tap into
three major networks in the brain (salience, central executive and default networks). In
addition, recordings from the medial/lateral prefrontal cortex may enable monitoring of
the activity of intrinsic cortical hubs [20].
Unlike deeper brain areas such as the amygdala and the limbic system, prefrontal cortex
(PFC) hemodynamics can be conveniently monitored using non-invasive and portable
brain monitoring modalities such as NIRS. The accessibility of the PFC to brain sensing
modalities, and particularly to NIRS, provides a great opportunity for realizing a bedside emotion
identification system. Therefore, in this thesis, PFC hemodynamics were used for
identifying emotional response.
1.5 Near-infrared spectroscopy of the brain
Among various brain monitoring modalities, hemodynamic measurements are not prone
to electrogenic artifacts such as bio-potentials associated with eye movement or frontalis/temporalis
muscle contraction. These artifacts primarily occur in the forehead area and may reduce the
signal-to-noise ratio when recording EEG from the prefrontal and frontal regions. Therefore,
hemodynamic measurement in cortical areas involved in emotion processing is
a meaningful pursuit in developing affective brain computer interfaces (A-BCIs).
Various brain sensing modalities have been developed for cerebral hemodynamic
monitoring, such as magnetic resonance imaging (MRI) and positron emission tomography
(PET). However, neither of these technologies is currently suitable for long-term bedside
monitoring for emotion identification purposes. Current MRI systems are bulky,
expensive, and require radio-frequency and magnetic shielding, which impedes their use as
portable bedside monitoring systems. PET systems require the administration of radioactive
tracers and are therefore not suitable for long-term and repeated monitoring.
Near-infrared spectroscopy (NIRS), which is also a hemodynamic-based brain sensing
[Figure 1.3 depicts light emission and detection over the scalp, together with a sample [HbO2]/[Hb] recording.]

Figure 1.3: General overview of NIRS recording system
modality, offers many advantages, such as low cost and portability, making it suitable
for long-term bedside use. NIRS optically monitors oxygenated and deoxygenated
hemoglobin concentrations ([HbO2] and [Hb], respectively) in the cerebral
cortex. Near-infrared light penetrates the adult skull and can be superficially detected
2.5-3 cm away from the source [228, 90] (Figure 1.3). The detected light intensity can
be used to identify [HbO2] and [Hb] in the underlying tissue due to the differences in the
absorption characteristics of these two chromophores. Deeper brain areas in the emotional
network, such as the amygdala, cannot be monitored using NIRS. However, PFC activity can
be conveniently monitored using superficial light emitters and detectors.
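The conversion from detected light intensities to concentration changes is commonly done via the modified Beer-Lambert law, in which attenuation changes at two wavelengths are inverted through the chromophores' extinction coefficients. The sketch below uses made-up coefficients and a nominal differential pathlength factor (DPF) purely for illustration; real systems use measured extinction spectra:

```python
# Sketch of the modified Beer-Lambert law. The extinction coefficients
# below are illustrative placeholders, not measured values: attenuation
# changes at two wavelengths are inverted to concentration changes of
# HbO2 and Hb.
import numpy as np

def delta_concentrations(delta_od, ext, distance_cm, dpf):
    """delta_od: attenuation change at each wavelength, shape (2,);
    ext: 2x2 extinction matrix, rows = wavelengths, cols = (HbO2, Hb).
    Returns (delta[HbO2], delta[Hb])."""
    path = distance_cm * dpf  # effective optical path length
    return np.linalg.solve(np.asarray(ext) * path, np.asarray(delta_od))

# Hypothetical example at two wavelengths with made-up coefficients.
ext = [[1.0, 2.0],   # wavelength 1: Hb absorbs more strongly
       [2.5, 1.0]]   # wavelength 2: HbO2 absorbs more strongly
d_od = [0.02, 0.03]
d_hbo2, d_hb = delta_concentrations(d_od, ext, distance_cm=3.0, dpf=6.0)
print(d_hbo2 > 0, d_hb > 0)  # prints "True True"
```

The 2x2 inversion is solvable precisely because HbO2 and Hb absorb differently at the two wavelengths, which is the physical basis of dual-wavelength NIRS.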
1.6 Emotion induction via music
In a recent review of physiological markers of emotion [108], Kreibig illustrated the di-
versity of emotion induction paradigms and their usage frequencies. In the reviewed
sample, film clips were most frequently used, but other emotion induction methods such
as imagery, personalized recall and musical excerpts were also reported [108]. Among
these techniques, music, which is often presented over longer durations, has the capacity
to induce a response changing with time. Emotions experienced during the initial pre-
sentation of a piece of music may be different from those surfacing as the music unfolds.
However, music-based emotion induction is subject to debate among researchers, and the
Table 1.3: A summary of existing theories of emotions. See [50] for more details.

1954, Arnold & Gasson: Felt tendency towards an object, accompanied by specific bodily changes.
1986, Lutz & White: A means of negotiating social relations.
1991, Lazarus: Organized psychophysiological reactions with respect to ongoing relationships with the environment.
1991, Ekman: Characteristics common among emotions: "rapid onset, short duration, unbidden occurrence, automatic appraisal, and coherence among responses."
2008, Juslin: Emotions are typically described as relatively brief, though intense, affective reactions to potentially important events or changes in the external or internal environment that involve several subcomponents: cognitive appraisal (e.g., you appraise the situation as dangerous), subjective feeling (e.g., you feel afraid), physiological arousal (e.g., your heart starts to beat faster), expression (e.g., you scream), action tendency (e.g., you run away), and regulation (e.g., you try to calm yourself).
use of music as an emotion induction method is less prevalent than other stimuli [108].
Those opposing the use of music for inducing emotions have argued that music lacks the
immediacy of real-life situations (e.g. a life-threatening event).
The lack of consensus among researchers regarding the use of music for emotion
induction may stem from disagreement about the very definition of emotions. Theories
of emotion have emphasized different attributes of an emotional response in defining
emotions. Emotions have been defined with respect to bodily changes, social relations,
homeostasis within the surrounding environment, and automatic appraisal. Table 1.3
summarizes a number of different theories of emotion, and highlights the diversity among
them [50].
To overcome the lack of consensus about the role of music in inducing emotions,
Juslin et al. explored the ability of music to induce emotions based on various mechanisms
leading to emotional response [92]. Despite competing arguments for and against musical
emotion induction, music has been used as an emotional auditory stimulus in many
studies [106, 109, 81, 213, 60]. In addition, music has been used in many studies involving
emotional processing in the brain [15, 16].
There is considerable diversity in the choice of music excerpts used in studies of emotion.
These studies can be categorized into two main streams: (i) studies using music in un-
altered form (with no computer adjustments), and (ii) studies using music with modifications
to specific music characteristics, such as dissonance or chords, to influence emotional ex-
perience. For example, by modifying the degree of permanent dissonance (which affects
the pleasantness of stimuli) Blood et al. studied neural emotional processing with music
in a positron emission tomography study [16]. Steinbeis et al. [215] produced harmonic
sequences that ended on an irregular chord function, and were able to identify electro-
dermal activity modulations when the musical expectancy was violated. Other studies
use unaltered music belonging either to a collection of pre-selected music excerpts [167]
or a number of music pieces self-selected by the individuals [15].
With respect to neutral auditory stimulus, various strategies have been proposed. For
example, in some studies a neutral auditory stimulus was presented with environment
sounds (e.g. sounds from ocean waves or songbirds) [4]. Random static noise has also been
applied as a neutral stimulus [51, 212]. Other studies have used computer adjustments
to neutralize the emotional content of music [16].
Using music for emotion induction offers some specific advantages, particularly in
studies of emotion involving the pediatric population. The emotional content of music has
been shown to be discernible by children as young as 6 years of age [32]. In addition, emotions
in music are known to be perceived across cultures [55]. As a dynamic, cross-cultural
emotion induction method, music therefore has many merits. To achieve the goals of the
current thesis, a music emotion induction paradigm was implemented.
1.7 Objectives
The objective of this thesis was to implement and test a means of identifying emotional
arousal and valence in response to music using Near Infrared Spectroscopy (NIRS) of
the prefrontal cortex (PFC). To achieve this goal, multiple investigations were necessary
to resolve technical and physiological challenges in the context of neural correlates of
emotion. In this light, the specific objectives of this thesis were:
A. To identify correlates of emotion by characterizing the signals recorded via PFC
NIRS with respect to emotional arousal and valence.
B. To investigate whether the detected activity patterns in objective A were due to
emotional response or mere music perception.
C. To identify features from the NIRS signals which are correlated to emotional
response and investigate the ability to differentiate emotional arousal and valence based
on these features.
D. To compare detection accuracies achieved using PFC NIRS signals to those at-
tained with autonomic nervous system signals such as heart rate, skin temperature, and
electrodermal activity, which have previously been used for emotion identification in the
literature.
E. To design and test a multi-modal emotion (arousal and valence) identification
system using ANS and PFC NIRS monitors.
1.8 Roadmap
The roadmap of this thesis is organized according to the objectives listed above. Chap-
ters 3 to 6 are arranged as journal articles, each focused on one or more of the objectives listed
in section 1.7. The thesis structure is summarized in Figure 1.4. Chapter 2 provides
details regarding the methods and data collection procedures used. Chapters 3 to 6
may duplicate information regarding the procedures summarized in chapter 2. Likewise,
the introduction (or background) section of some chapters may also replicate informa-
tion presented in chapter 1. Where duplication occurs, the chapter preamble highlights
the sections that the reader can skip. Following the five main chapters, the thesis
concludes with a summary of contributions and recommendations for future studies.
In Chapter 2, the study protocol is described in detail. These descriptions explain
the experimental paradigm, data collection procedures, and measurements used. In ad-
dition, the relevant data preprocessing algorithms are introduced in detail.
In Chapter 3, PFC NIRS signals, namely [HbO2] and [Hb] are characterized using
wavelet peak detection. The wavelet peak detection algorithm allows characterization in
time and frequency domains. These wavelet characteristics are examined with respect to
subjective ratings of arousal and valence. This chapter is in line with Objective A.
In Chapter 4, the main effect of three music characteristics (mode, dissonance and
maximum sound pressure level), which are known to be effective in inducing emotions,
on PFC hemodynamics is investigated. PFC is likely to be involved in a brain network
specialized for perceiving emotions, and therefore, the activities observed may be due to
music perception and not the emotional content of the music. This chapter focuses on
objective B, and investigates whether PFC hemodynamics are directly affected by the
identified music characteristics.
In Chapter 5, a group of time-domain features are extracted from PFC [HbO2] and
[Hb] measurements. These features are then used for training two separate classifiers for
arousal and valence differentiation. In this validation study, a PFC NIRS-based arousal
and valence identification system is tested, and therefore objective C is addressed.
Autonomic nervous system (ANS) activity has long been used for identifying emo-
tions. Therefore, in the pursuit of a physiologically-based emotion identification system,
it is important to compare the current detection rates achieved using PFC with those
realized using ANS activity.
Figure 1.4: Thesis roadmap. Chapter 2: study protocol. Chapter 3: characterizing PFC hemodynamic changes with respect to valence and arousal. Chapter 4: investigating the effect of music characteristics. Chapter 5: automatic detection of emotional response using PFC hemodynamic features. Chapter 6: identifying emotional valence and arousal by combining autonomic and central nervous system activity. Chapter 7: concluding remarks.
In Chapter 6, ANS activity, collected in the form of heart rate, electrodermal activity
and skin temperature features, is used for solving the same classification problems as
those formulated in Chapter 5. In addition, a dynamic model-based feature extraction is
implemented to improve classification results by including frequency domain features. Ul-
timately, a mixture of classifier experts, each trained using PFC NIRS or ANS features, is used
for solving the classification problem (i.e. high arousal versus low arousal and positive
versus negative valence). Overall, this chapter investigates the ability of a multi-modal
emotion identification system to improve upon accuracies achievable with classifiers con-
sidering exclusively PFC NIRS features. In this manner, Chapter 6 addresses objectives
C, D and E.
Chapter 2
Experimental Protocol
2.1 Preamble
This chapter summarizes the experimental details including the procedures, methods,
data acquisition and data preprocessing.
2.2 Introduction
To test the hypotheses put forth in this thesis, a database of physiological
responses, and corresponding ratings of emotional valence (positive versus negative) and
arousal (intense versus neutral), was created using a collection of music excerpts. This
chapter details the data collection procedures and introduces the preprocessing techniques
applied to the light intensities collected via the NIRS device to obtain the [HbO2] and
[Hb] signals used in the rest of this thesis.
2.3 Participants
Ten able-bodied volunteers (five females, five males, age: 25 ± 2.7 years) were re-
cruited for this study. The participants reported to have normal hearing, and normal or
corrected-to-normal vision. The recruitment criteria excluded individuals with reported
cardiovascular diseases, metabolic disorders, history of brain injury, respiratory condi-
tions, drug- or alcohol-related conditions, and psychiatric conditions. Participants were instructed
to refrain from caffeine and alcohol consumption 5 hours prior to the study. Volunteers
had an average of 5.5 years of past music training. The duration of musical training is
reported due to previous research documenting the influence of music training on the
physiological responses to music [207]. Ethics approval was obtained from the Bloorview
Research Institute research ethics board (see Appendix E) and all participants provided
informed written consent.
2.4 Stimuli
The stimuli were composed of 78 music excerpts. All music segments were 45 s in du-
ration. The excerpts included lyrical and non-lyrical pieces. The lyrics were in different
languages (English, French, Italian and Spanish) to reduce potential effects of brain ac-
tivation due to mental singing. Within the excerpts presented to each participant, 72
standard music pieces were chosen by two researchers from different genres of music
(classical, rock, jazz and pop). Specifically, candidate pieces were assessed in terms of
their valence characteristics as suggested by the tone, rhythm and lyrics (where applica-
ble). Note that the researcher assessments were used solely to ensure an approximately
uniform representation of music between valences (positive versus negative). The actual
data analysis described in section 2.5 relied solely on participant ratings of valence and
arousal. For the participant-selected pieces, participants chose a priori three pieces of mu-
sic that personally induced intense positive emotions (joy or excitement) and three that
induced intense negative emotions (sadness). The control acoustic stimulus was Brown
noise (BN). User feedback in our pilot studies indicated that this type of noise was sub-
jectively more pleasant than white noise at the same sound pressure level [229]. For more
information regarding other alternatives for the neutral auditory stimulus, the reader is
referred to section 1.6. A list of the standard database presented to all participants is
presented in Appendix B.
2.5 Signal acquisition
An Imagent Functional Brain Imaging System from ISS Inc. (Champaign, IL) was used
for NIRS measurements. A custom made rubber polymer (3M 9900 series) headgear
held three light detectors and ten light sources in place over the forehead, as depicted
in Figure 2.1. At each source location (circles in Figure 2.1), two light sources, one at 830 nm
and the other at 690 nm, were co-located. This layout had been previously used for
prefrontal cortex monitoring in Power et al. (2010) and provided readings at the nine
shaded locations in Figure 2.1 [179]. With data from two wavelengths, this configuration
yielded 18 different channels of light intensity readings. The midpoint of the headgear
was aligned to the anatomical midline (as estimated by the position of the participant's nose),
while the lower edge of the headgear sat just above the eyebrows. Light sources were
modulated at 110 MHz and the detector amplifiers were modulated at 110.005 MHz, which
led to a cross-correlation frequency of 5 kHz. The data were sampled at 31.25 Hz. During
a complete cycle of all ten sources, each source illuminated the surface for 1.6 ms during
which eight acquisitions were made. A fast Fourier transform was applied to the average
of the eight waveforms to obtain an estimate of ac and dc intensities as well as the phase
delay [179]. The dc light intensities were used to determine HbO2 and Hb concentrations
(i.e. [HbO2] and [Hb] ).
2.6 Pre-processing
Low-frequency artifacts such as respiration, heart rate and the Mayer wave were filtered
using a type II third order Chebychev low pass filter with a cut-off frequency of 0.1
Figure 2.1: The layout of light sources (circles) and detectors (X's). The vertical line denotes the anatomical midline. The annotated shaded areas correspond to recording locations.
Hz (normalized stop-band edge frequency of 0.032 and stop-band ripple of 50 dB down
from the peak pass-band value) [178]. The 830 nm and 690 nm light intensities at each
of the nine recording sites were used to calculate HbO2 and Hb concentrations via the
modified Beer-Lambert law [30, 41], which resulted in 18 channels of concentration data.
To reduce the effects of initial device calibration, the concentration time series were
normalized within each experimental block against the mean in the same block.
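The pre-processing pipeline above can be sketched in Python as follows. The filter parameters match the description in this section, but the extinction coefficients, differential pathlength factor, and source-detector separation below are illustrative placeholders (the thesis relies on [30, 41] for the actual modified Beer-Lambert computation), so this is a sketch of the approach rather than the exact implementation.

```python
import numpy as np
from scipy import signal

FS = 31.25  # sampling rate (Hz), section 2.5

# Type II Chebyshev low-pass as described: 3rd order, 50 dB stop-band
# ripple, normalized stop-band edge frequency of 0.032.
b, a = signal.cheby2(3, 50, 0.032)

def lowpass(x):
    """Zero-phase low-pass filtering of one concentration channel."""
    return signal.filtfilt(b, a, x)

# Placeholder extinction coefficients [eps_HbO2, eps_Hb] at each wavelength.
E = np.array([[2.32, 1.79],    # 830 nm
              [0.95, 4.93]])   # 690 nm
DPF, SEP = 6.0, 3.0            # assumed pathlength factor, separation (cm)

def mbll(i830, i690):
    """Modified Beer-Lambert law: two light intensities -> ([HbO2], [Hb]).

    Optical density change is taken relative to each channel's mean,
    then a 2x2 linear system is solved per sample."""
    od = np.vstack([-np.log10(i / i.mean()) for i in (i830, i690)])
    conc = np.linalg.solve(E * DPF * SEP, od)
    return conc[0], conc[1]   # [HbO2], [Hb] time series

def normalize_block(c):
    """Normalize a concentration series against its block mean (section 2.6)."""
    return c - c.mean()
```

With the light-intensity channels paired by recording site, calling `mbll` once per site yields the 18 concentration channels described above.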
2.7 Study design
Each participant completed four sessions conducted on separate days. In each session,
the participant completed three blocks with optional breaks between blocks. Each block
consisted of 12 consecutive trials: four trials with positively valenced songs (one of which
was a participant-selected song), four trials with negatively valenced songs (one of which
was a participant-selected song) and four BN trials. Within a block, the music and
BN trials were pseudo-randomized, such that two BN trials never occurred consecutively
while positively and negatively valenced songs appeared in no apparent order. The same
Figure 2.2: Trial sequence
pseudo-random sequence of trials was employed for all participants. Figure 2.2 depicts
a trial sequence. In each trial, the participant listened to 10 s of BN, followed by a 45
s auditory stimulus (music or BN), and finally 5 s of BN. The sound level was faded in
and out at the beginning and end of the trial, respectively, to reduce the risk of eliciting
a startle. At the end of each trial, the participant rated the intensity and valence of their
emotional experience using a nine-level self-assessment Manikin [146] shown in Figure 2.3.
The beginning and end of each trial was marked by an audible tone. The participants
were instructed to close their eyes when they heard the initial tone, and to open their
eyes upon hearing the second tone.
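A minimal sketch of how one such constrained block ordering could be generated (rejection sampling over shuffles; the trial labels are illustrative, and the thesis fixed a single sequence used for all participants):

```python
import random

def make_block(rng=random):
    """Shuffle 12 trials -- four positive, four negative, four Brown
    noise (BN) -- until no two BN trials are adjacent (section 2.7)."""
    trials = ["pos"] * 4 + ["neg"] * 4 + ["BN"] * 4
    while True:
        rng.shuffle(trials)
        if all(not (x == y == "BN") for x, y in zip(trials, trials[1:])):
            return list(trials)

block = make_block(random.Random(1))  # 12 labels, BN never twice in a row
```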
Figure 2.3: The Self-Assessment Manikin rating system is shown. The top and the bottom rows depict valence (positive to negative) and arousal (intense to neutral) ratings, respectively. The participant could select one of the nine levels of arousal/valence by marking the corresponding circles shown. For example, in the sample rating provided, a very intense positive emotion is represented.
Chapter 3
Characterizing PFC Hemodynamic
changes due to valence and arousal
3.1 Preamble
This chapter investigates the overall hemodynamic patterns accompanying emotional
response. Identifying patterns associated with emotions in prefrontal cortex using near
infrared spectroscopy is an important step towards emotion identification. In this study,
NIRS recordings were used to characterize the PFC hemodynamic response to emotional
arousal and valence. In particular, a wavelet-based peak detection technique was used to
characterize chromophore concentration patterns.
This chapter is entirely reproduced from the following journal article: Moghimi S,
Kushki A, Guerguerian AM, Chau T. Characterizing emotional response to music in
the prefrontal cortex using near infrared spectroscopy. Neuroscience Letters, 2012.
Elsevier.
Readers can skip section 3.4.1 as it reiterates the procedures described in Chapter 2.
3.2 Abstract
Known to be involved in emotional processing, the human prefrontal cortex (PFC) can be
non-invasively monitored using near-infrared spectroscopy (NIRS). As such, PFC NIRS
can serve as a means for studying emotional processing by the PFC. Identifying pat-
terns associated with emotions in PFC using NIRS may provide a means of bedside
emotion identification for nonverbal children and youth with severe physical disabilities.
In this study, NIRS was used to characterize the PFC hemodynamic response to emo-
tional arousal and valence in a music-based emotion induction paradigm in 9 individuals
without disabilities or known health conditions. In particular, a novel technique based
on wavelet-based peak detection was used to characterize chromophore concentration
patterns. The maximum wavelet coefficients extracted from oxygenated hemoglobin con-
centration waveforms from all nine recording locations on the PFC were significantly
associated with emotional valence and arousal. Specifically, high arousal and negative
emotions were associated with larger maximum wavelet coefficients.
3.3 Introduction
Selected groups of nonverbal individuals with severe disabilities and little or no voluntary
muscle control have benefited from communication alternatives based on brain activity
known as brain computer interfaces (BCI) [237, 112]. However, due to developmental
delays, limited expressive communication and unknown levels of receptive communica-
tion, many nonverbal children and youth are usually not candidates for existing BCI
technologies. Indeed the aforementioned challenges preclude the training of specific men-
tal activities. However, these individuals are still candidates for affective BCIs (A-BCI)
which enable the automatic recognition of affective states using brain activity [158]. A-
BCIs may provide a means of detecting spontaneous and natural reactions to specific
stimuli. To this purpose, affective responses evoked by visual stimuli have been previ-
37
ously decoded in both facial thermographic [156] and cerebral hemodynamic pathways
[218].
Emotional responses engage many different areas of the brain including parts of the
limbic system, and prefrontal cortex (PFC) [164, 208, 126, 34]. Neuro-imaging techniques
such as positron emission tomography (PET) [221] and magnetic resonance imaging
(MRI) [21] have provided an opportunity for characterizing emotional perception in the
brain [39, 209, 44, 206, 16, 15]. However, the bulky set-ups required by PET and MRI
systems, and potential patient discomfort [150] preclude their use in studies of emotional
responses in real-life settings, and particularly in developing A-BCIs.
Among the different modalities available for monitoring brain activity, near infrared
spectroscopy (NIRS) is noninvasive, and particularly well-suited for monitoring PFC
activity, which is among the regions involved in emotional processing [34], in life-like
settings. NIRS monitors hemodynamic activity in the brain by measuring changes in
oxygenated and deoxygenated hemoglobin concentrations (i.e. [HbO2] and [Hb]) in
regional cerebral blood flow [90, 228]. NIRS is not prone to electrogenic
artifacts (e.g. electrooculogram), present in the forehead area. NIRS provides lower spa-
tial resolution compared to PET and MRI neuroimaging systems, but it is non-invasive,
relatively inexpensive, and portable. As such, NIRS may be particularly more amenable
for A-BCI development involving children with severe disabilities.
Exposure to emotionally-laden stimuli is known to produce measurable changes in
chromophore concentrations (i.e., [HbO2] and [Hb]) [74, 241, 218]. Examining hemody-
namic changes in the prefrontal cortex using NIRS, Hoshi et al. showed that exposure to
both pleasant and extremely unpleasant pictures led to increases and decreases in [Hb],
respectively [83]. Similarly, a recent study showed that highly-positive and negative
emotions associated with music could be differentiated with more than 70% accuracy
[144] based on prefrontal cortex NIRS measurements.
The current study used PFC NIRS to investigate characteristics of the hemodynamic
response, specifically [HbO2] and [Hb], to emotionally-laden music. Music has repeatedly
been used for emotion induction in various studies [91, 134]. Characterizing emotional
response to music in able-bodied adults is a step towards future investigations of emotion
in non-verbal individuals with severe disabilities.
In the current study, the relationship between chromophore concentrations, [HbO2]
and [Hb], and subjective ratings of emotional arousal and valence, was investigated us-
ing wavelet analysis. The wavelet transform is a tool for signal analysis in time and
frequency. Broadly speaking, the wavelet transform evaluates the similarity of the time
series to a given pattern, known as the mother wavelet. In particular, when applied to
a time-series, the wavelet transform produces a set of coefficients across a set of time
points and scale values where the wavelet coefficient at time t and scale a represents the
similarity of the data at t to the mother wavelet scaled by a factor of a. The wavelet
transform maps the signal onto a set of bases (wavelet family) consisting of scaled and
translated versions of a mother wavelet function.
In the present study, wavelet analysis provided a means of extracting oxygenation
patterns relevant to emotional valence and arousal, and was used to investigate the
shape of the peak [HbO2] and [Hb] (i.e. differentiate abrupt peaks from gradual peaks).
In particular, the maximum wavelet coefficient revealed the scale at which the signal
most closely resembles the prototypical hemodynamic response (e.g., increase followed
by decrease in oxygenation).
3.4 Methods
3.4.1 Procedures
Ten adults without disabilities or known health conditions (9 right-handed) were recruited
for this study. Only the 9 right-handed participants (5 female, age: 25 ± 2.7 years) were
included in the analysis to mitigate any response variations due to differences in hemi-
Figure 3.1: The layout of light sources (circles) and detectors (X's). The vertical line denotes the anatomical midline. The annotated shaded areas correspond to recording locations.
spheric dominance. The Bloorview Research Institute research ethics board approved of
the study, and informed written consent was provided by all participants. Participants
donned a custom polyethylene headgear, which covered their foreheads and accommo-
dated the placement of multiple emitters and detectors. An Imagent Functional Brain
Imaging System from ISS Inc. (Champaign, IL) was used for NIRS measurements across
nine different regions on the forehead (Figure 3.1). Each source housed two diodes that
emitted light at 830 nm and 690 nm. The light was detected by three detectors. The data
were sampled at 31.25 Hz.
During each trial, participants listened to either a music excerpt or a noise recording
that represented a neutral auditory stimulus (Figure 3.2). The music excerpts comprised
a database of 78 music pieces selected by the researchers together with 6 self-selected
excerpts for each participant. The study was divided into 4 separate sessions encompass-
ing 36 trials each (12 noise trials, 24 musical excerpts). After each trial, participants
were prompted to rate their emotions in terms of arousal and valence using a nine-level
self-assessment manikin [146]. The valence ratings were mapped from 1 (most positive) to
9 (most negative), and arousal ratings ranged from 1 (least intense) to 9 (most intense).
Figure 3.2: Trial sequence
3.4.2 Wavelet-based peak detection
In this phase of the study, the relationship between chromophore concentrations, [HbO2]
and [Hb], and subjective ratings of emotional arousal and valence, was investigated using
wavelet analysis. The wavelet transform is a tool for signal analysis in time and frequency.
Broadly speaking, the wavelet transform evaluates the similarity of the time series to a
given pattern, known as the mother wavelet. In particular, when applied to a time series,
the wavelet transform produces a set of coefficients (shown in (3.1)) across a set of time
translations and scale values, where the wavelet coefficient at time displacement u ∈ ℜ
and scale s ∈ ℜ+ represents the similarity of the data at u to the mother wavelet ψ(t)
scaled by a factor of s:

ψ_{u,s}(t) = (1/√s) ψ((t − u)/s)   (3.1)

The continuous wavelet coefficient corresponding to scale s and translation u can be
determined using (3.2):

Wf(u, s) = ∫_{−∞}^{+∞} f(t) (1/√s) ψ*((t − u)/s) dt   (3.2)

In this manner, the original signal f(t) is projected onto a two-dimensional space of
u and s. Therefore, Wf(u, s) allows the study of signal characteristics at time u and scale s.
The scale s changes inversely with frequency. Therefore, abrupt peaks (i.e. accompanied
by faster changes in the vicinity of the peak) correspond to larger coefficients at lower
scales, whereas gradual peaks (i.e. accompanied by slower changes in the proximity of
the peak) correspond to larger coefficients at higher scales. In this way, wavelet coefficients
identify both peaks and the rate of change near these peaks. The interested reader is
referred to [131] for more details regarding wavelet analysis.

Figure 3.3: Mexican hat wavelet
Wavelet analysis is often used for detecting patterns of interest in data, such as
stereotyped neuroelectric waveforms (e.g. event related potentials) [35, 206],
localized spikes in biological data [152], and unknown transients in the signal [54]. Visual
inspection of [HbO2] and [Hb] in high-arousal and positive/negative rated trials led to
the selection of the Mexican hat function as the mother wavelet (Figure 3.3).
The wavelet transform was computed for scales s ∈ {70, 71, ..., 400}, which map to
pseudo-frequencies in the range of 0.019 to 0.111 Hz [182] (the required range for capturing
the chromophore concentration changes, given that the filtered concentrations have useful
frequency content below 0.1 Hz). The transform was applied to time values
u ∈ {1, ..., k, ..., N}, where N was the number of samples included for analysis [152]. The first
5 s of the [HbO2] and [Hb] series during the music intervals were discarded in order to
ignore any residual activity from the period preceding music onset. Therefore, a total of
40 s of data, corresponding to N = 1250 samples, were included when determining wavelet
were extracted for subsequent analysis: the maximum wavelet coefficient over all time
(i.e. across all translations) and scales, and the scale at which the maximum occurred.
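As a sketch of this feature extraction, the following hand-rolled Mexican-hat (Ricker) continuous wavelet transform computes the maximum coefficient and the scale at which it occurs for one concentration trace. The implementation details (truncated wavelet support, coarser scale grid) are illustrative assumptions, not the thesis code; the pseudo-frequency check assumes the Mexican hat centre frequency of about 0.25.

```python
import numpy as np

FS = 31.25  # sampling rate (Hz), as in section 2.5

def ricker(points, a):
    """Mexican hat (Ricker) wavelet of scale `a` on a `points`-sample grid."""
    t = np.arange(points) - (points - 1) / 2.0
    x = t / a
    return (2 / (np.sqrt(3 * a) * np.pi ** 0.25)) * (1 - x ** 2) * np.exp(-x ** 2 / 2)

def cwt_max_feature(sig, scales):
    """Return (maximum wavelet coefficient, scale at which it occurs).

    Correlates the signal with the scaled wavelet at every translation;
    the Ricker wavelet is symmetric, so convolution equals correlation."""
    best_c, best_s = -np.inf, None
    for a in scales:
        w = ricker(min(10 * int(a), len(sig)), a)   # truncated support
        c = np.convolve(sig, w, mode="same").max()
        if c > best_c:
            best_c, best_s = c, int(a)
    return best_c, best_s

# Scale-to-pseudo-frequency mapping: f = fc * FS / scale, with fc ~ 0.25 for
# the Mexican hat; scales 70..400 then span roughly 0.0195 to 0.1116 Hz.
scales = np.arange(70, 401, 10)   # coarser grid than the full 70..400, for speed

# Demo on a synthetic 40 s trace (N = 1250 samples) containing a gradual peak.
t = np.arange(1250) / FS
trace = np.exp(-((t - 20.0) ** 2) / (2 * 3.0 ** 2))
mwc, s = cwt_max_feature(trace, scales)
```

A sharper peak in the trace drives the maximum toward smaller scales, which is exactly the property used here to separate abrupt from gradual concentration changes.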
3.4.3 Statistical analysis
To test whether or not the wavelet features (maximum wavelet coefficients for [HbO2]
and [Hb] and corresponding scales) were related to the subjective ratings of arousal and
valence, we used a mixed effects repeated measures linear regression analysis. Separate
regressions were conducted for valence (most positive (1) to most negative (9)) and
arousal (neutral (1) to most intense (9)) ratings, and for [HbO2] and [Hb] chromophores.
We report the regression coefficient (slope) and the associated p-value as an indicator
of the correlation. The analysis was repeated for each of the nine recording sites, again
considering arousal and valence ratings separately. To account for multiple comparisons
(9 recording sites), we set a Bonferroni-adjusted significance level of α = 0.05/9 ≈ 0.005
[79].
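The per-site analysis described above could be sketched with statsmodels as follows. The long-format column names (participant, site, mwc, arousal) are hypothetical, and a random intercept per participant stands in for the repeated-measures structure; this is an illustration of the analysis shape, not the thesis code.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

SITES = ["R1", "R2", "R3", "R4", "O", "L1", "L2", "L3", "L4"]
ALPHA = 0.05 / len(SITES)   # Bonferroni adjustment over the 9 sites

def site_slopes(df):
    """Mixed-effects regression of the MWC feature on arousal rating,
    fitted separately at each recording site, with a random intercept
    per participant to model the repeated measures."""
    results = {}
    for site in SITES:
        d = df[df["site"] == site]
        fit = smf.mixedlm("mwc ~ arousal", d, groups=d["participant"]).fit()
        slope, p = fit.params["arousal"], fit.pvalues["arousal"]
        results[site] = (slope, p, bool(p < ALPHA))
    return results
```

Separate calls with valence ratings in place of arousal (and [Hb] features in place of [HbO2]) reproduce the four regression families reported in the results.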
3.5 Results
Box-plots in Figures 3.4a and 3.4b depict the distributions of participant valence and
arousal ratings, respectively. Across participants, the median valence rating was neutral
(i.e. 5), but the median arousal rating was situated more toward the lower end of the
scale (i.e. 3).
Figure 3.7 illustrates [HbO2] and [Hb] recordings (in black) from the nine PFC inter-
rogation sites for a representative trial rated at the highest arousal and most negative
valence. Along with each recording is shown the corresponding temporal waveform of
wavelet coefficients (in grey) at the scale containing the maximum wavelet coefficient.
Figures 3.5 and 3.6 report the slopes of the regression lines between the maximum wavelet
coefficient and the subjective arousal and valence ratings, respectively. The results in
Figure 3.4: Box-plots of valence and arousal ratings for each participant.
Figures 3.5 and 3.6 indicated that the maximum wavelet coefficient and the corresponding
scale exhibit significant regional associations with emotional ratings. In particular, the
maximum wavelet coefficients extracted from [HbO2] were significantly related to ratings
of arousal in all nine recording regions while coefficients from [Hb] were correlated to
inferior left (L2, L3 and L4) and right (R2, R3 and R4) locations. In both cases, the
regression slopes were positive, indicating that high arousal resulted in larger [HbO2]
and [Hb] peaks in the respective regions. The scale of the maximum wavelet coefficient
provided a measure of the sharpness of the concentration peaks. Specifically, smaller
scales (higher signal frequency) correspond to more abrupt concentration changes whereas
larger scales (lower frequencies) are indicative of more gradual changes. As such, our
results suggest that higher ratings of arousal were associated with more gradual peaks in
[HbO2] in regions R2, R3, and L3. Collectively, these findings reveal that more intense
emotions were accompanied by larger and less abrupt changes in the concentration of
oxy- and deoxy-hemoglobin.
Negative emotions were correlated with larger values of the maximum wavelet coeffi-
cient across all nine recording sites in [HbO2], and inferolateral left (L3) and right (R3)
Interrogation site:       R1       R2       R3       R4       O        L1       L2       L3       L4

a. MWC
   [HbO2]  slope          0.9531   1.4831   1.7885   1.3225   0.9643   1.0918   1.2179   1.5341   1.4581
           p-value        <.0001   <.0001   <.0001   <.0001   <.0001   <.0001   <.0001   <.0001   <.0001
   [Hb]    slope                   0.2748   0.5811   0.4388                     0.3289   0.4989   0.5136
           p-value                 0.0007   <.0001   <.0001                     0.0001   <.0001   <.0001

b. Scale of MWC
   [HbO2]  slope                   5.5083   4.3377                                       4.7002
           p-value                 <.0001   0.0002                                       0.0002
   [Hb]                   (no significant slopes)
Figure 3.5: Slopes of regression lines between participant arousal ratings and (a) the maximum wavelet coefficient (MWC), and (b) the corresponding scale. Only slopes significantly different from zero are shown (p < 0.005).
Interrogation site:       R1       R2       R3       R4       O        L1       L2       L3       L4

a. MWC
   [HbO2]  slope          0.8048   1.9382   2.1436   1.6458   1.1121   1.2404   1.6016   2.0886   1.6445
           p-value        0.0043   <.0001   <.0001   <.0001   <.0001   <.0001   <.0001   <.0001   <.0001
   [Hb]    slope                            0.5016                                       0.5143
           p-value                          <.0001                                       <.0001

b. Scale of MWC
   [HbO2]  slope                            4.3357
           p-value                          0.0047
   [Hb]    slope                                                        -4.0693  4.8146
           p-value                                                      0.0032   0.0019
Figure 3.6: Slopes of regression lines between participant valence ratings and (a) the maximum wavelet coefficient (MWC), and (b) the corresponding scale. Only slopes significantly different from zero are shown (p < 0.005).
Figure 3.7: Plotted in black are the (a) [HbO2] (top panel) and (b) [Hb] (bottom panel) recordings across nine interrogation sites for a music sample inducing intense negative emotions from one of the participants during 45 seconds of aural stimulus. In grey are the corresponding waveforms of wavelet coefficients at the scale where the maximum wavelet coefficient occurs. These waveforms have been scaled by their standard deviation to facilitate visual comparison.
locations in [Hb]. More negative ratings also corresponded to more gradual peaks in [Hb]
at L2, and [HbO2] at R3. Therefore, negative emotions tended to elicit larger and less
sudden regional chromophore concentration peaks. More negative ratings, on the other
hand, resulted in sharper concentration peaks in a more midline, superior location (L1).
3.6 Discussion
In this study, arousal ratings were found to be associated with changes in chromophore
concentrations. Intense emotional experience has been reported to result in heightened
hemodynamic changes. Tanida et al. showed that mental stress induction could result
in bilateral increases in [HbO2] and decreases in [Hb] [219]. Matsuo et al.
have reported PFC [HbO2] increases in a group of individuals with post-traumatic stress
disorder as well as a healthy control group in response to trauma-related videos [134].
Previous findings have reported lateral activation in the PFC due to positive or neg-
ative emotional stimuli [235]. For example, Altenmuller et al. [4] reported an increase in
the left temporal activation due to exposure to positive auditory stimuli, and a bilateral
increase in response to negative auditory stimuli using electroencephalography (EEG).
However, in the current study, significant regression slopes were observed bilaterally.
Therefore no evidence of lateral activation patterns was obtained with respect to ratings
of valence.
The significance of the regression slopes resulting from models involving maximum
wavelet coefficients of [HbO2] indicated that the Mexican hat mother wavelet was a suit-
able template for identifying patterns relevant to emotional arousal and valence in [HbO2]
across all nine recording sites. Unlike static emotion induction paradigms (e.g. pictures)
where short exposure times can result in emotional experience [74, 218, 241], dynamic
emotion induction paradigms (e.g. music) can involve emotional unfolding at any
time during the course of exposure to stimuli. For example, the emotions experienced
during the introduction to a musical piece may be different from those experienced during
the main body. This scenario resembles real life emotional experience where emotions
can be manifested at any point in time. The results of the current study encourage
future studies of the temporal dynamics of emotion [106] using wavelet analysis for the
localization of emotional responses in time.
Chapter 4
The Effect of Music Characteristics
4.1 Preamble
There is compelling evidence of a network in the brain specialized for perceiving music.
For example, previous studies of focal lesions in the brain have indicated selective loss
of the ability to perceive specific music characteristics [243, 231]. This network may
include the prefrontal cortex (PFC)[16, 99]. Therefore, the PFC may play a dual role of
perceiving music characteristics and formulating emotional responses. To identify which
of these two mechanisms (music perception vs. emotional response) were involved in
the activation patterns observed in the PFC, the effect of music characteristics, namely
mode, dissonance and sound pressure level on the PFC [HbO2] and [Hb] was investigated
in this chapter.
4.2 Introduction
Every musical piece can be characterized by specific structural and performance fea-
tures. Performance characteristics, such as energy, timbre and pitch, involve the manner
in which the performer executes a musical piece. These features are quite variable due to
differences in performer skills and state. Structural features, on the other hand, involve
acoustic and foundational characteristics of music and are more consistent across per-
formers. These structural features, which include dissonance and mode, are shown to play
an important part in conveying the emotional content of a musical piece [91]. Therefore,
these characteristics may have played a part in inducing certain emotional experiences,
and these emotional responses may have resulted in prefrontal hemodynamic changes
detected in Chapter 3. However, this reasoning may be challenged by an alternative
view point regarding the perception of musical characteristics in the brain.
Previous research has identified particular brain networks specialized for perceiving
musical characteristics. Primarily, lesions in the temporal lobe and auditory cortex were
shown to affect perception of pitch and tonal melodies. Zatorre showed that lesions in
the right temporal lobe could adversely affect the ability to discriminate tonal melodies
[243]. In a study involving a control group and thirty-six patients with focal excisions,
Warrier et al. identified the right anterior auditory cortical areas as being responsible for
pitch judgements [231]. Such findings provide compelling evidence for the existence of
a brain network specialized for music perception. This network may include the PFC,
implying that the prefrontal area itself may be involved in perceiving music. Khalfa et al., who used
major and minor mode for emotion induction in an MRI study, reported left orbito- and
mid-dorsolateral frontal activations in response to the minor mode [99]. Using auditory
stimuli designed to only vary in harmonic dissonance and unpleasantness, Blood et al.
found that the subjective ratings of dissonance correlated negatively with orbitofrontal
and ventromedial prefrontal cortex activation [16].
Due to the potential role of the PFC in perceiving music, it was necessary to in-
vestigate whether the activity patterns identified were purely due to the perception of
music characteristics or a result of the emotional content of the music. In this phase, the
influence of music characteristics on the observed activity patterns detected in the PFC
[HbO2] and [Hb] was investigated. Musical characteristics, such as dissonance and sound
pressure level, were compared to hemodynamic changes. In addition, emotional ratings
of arousal and valence were compared to average musical characteristics extracted from
the corresponding trials.
4.3 Methods
4.3.1 Music characteristic extraction
The music characteristics investigated in this chapter included mode (major or minor
tonality), sound pressure level (volume), and dissonance (a characteristic of harmony).
Dissonance has been noted as a mechanism by which modern music is capable of inducing
emotions. Children as young as 4 months were shown to react differently when exposed
to consonant versus dissonant music pieces [244]. The ability of dissonance to induce
emotions has been attributed to an innate response to danger because many alarming
sounds in nature such as cries of birds are dissonant auditory cues. Therefore, dissonance,
resulting from modifications to harmonic structures, can convey the salience of the au-
ditory stimulus and result in emotional response. Previous studies of emotion have used
dissonant and consonant music excerpts for inducing pleasant and unpleasant emotions
[16]. Similarly, intensity or volume has been shown to play a role in inducing emotions.
Studies of music for marketing purposes, and psychological assessments have confirmed
that the music volume can play a role in emotion induction [232, 19]. Finally, music
mode is shown to affect emotion induction; the major mode is commonly associated
with positive valence while minor mode conveys negative emotional content. In previous
studies of emotion involving brain activity, music mode has been used to convey positive
and negative emotions [99]. Unlike dissonance which involves the harmonic structure of
music, mode is related to the melodic characteristics of music. Interested readers are
referred to [91] for more information regarding emotional content of music.
For each music excerpt used for this study, mode and dissonance features were de-
termined using the music information retrieval toolbox (MIRTOOLBOX) developed in
University of Jyväskylä, Finland, and implemented in MATLAB (MathWorks) [124]. MIRtoolbox
allows time-domain extraction of music characteristics by breaking the music
piece into time epochs. These epochs were chosen to be 1.5 ms. Average characteristics
were extracted from the entire course of the trial. For more information regarding music
characteristic extraction see Appendix C.
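MIRtoolbox is a MATLAB package; as an illustration of the epoch-based extraction described above, the following sketch computes per-epoch RMS levels in Python. The epoch length, dB reference, and test signal here are assumptions for illustration, not MIRtoolbox settings.

```python
import numpy as np

def epoch_levels(samples, fs, epoch_s=0.05):
    """RMS level of each fixed-length epoch, in dB relative to full scale.
    The 50 ms epoch length is an illustrative choice, not MIRtoolbox's."""
    n = int(round(epoch_s * fs))
    n_epochs = len(samples) // n
    levels = np.empty(n_epochs)
    for i in range(n_epochs):
        frame = samples[i * n:(i + 1) * n]
        rms = np.sqrt(np.mean(frame ** 2))
        levels[i] = 20.0 * np.log10(max(rms, 1e-12))  # floor avoids log(0)
    return levels

# constant-amplitude test tone: every epoch should sit at the same level
fs = 8000
t = np.arange(0, 2.0, 1.0 / fs)
tone = 0.5 * np.sin(2.0 * np.pi * 440.0 * t)
levels = epoch_levels(tone, fs)
max_level = levels.max()     # trial "maximum sound pressure level" feature
mean_level = levels.mean()   # trial-average level
```

The maximum and mean over the epoch series correspond to the trial-level summaries (maximum sound pressure level, average characteristics) used in the analyses of this chapter.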
4.3.2 Music database
As described in chapter 2, the music collection used during the data acquisition phase
was composed of two subsets: 72 music pieces identically played for all participants, and
six self-selected songs specific to each participant. The self-selected music excerpts were
played once per session. Therefore, each participant was exposed to four repetitions of
the same song. During these repetitions, the music characteristics remained the same.
Therefore, comparing [HbO2] and [Hb] recorded during separate repetitions may provide
an opportunity to detect music characteristic-dependent activity patterns in the PFC
hemodynamics.
The remaining 72 music pieces and the respective arousal and valence ratings were
used to detect whether emotional ratings have been influenced by music characteristics.
In addition, the hemodynamic response to the common music excerpts was compared to
music characteristics to identify if music characteristics had a significant effect on the
hemodynamic patterns observed.
4.3.3 Statistical analysis
To test whether the music characteristics were related to subjective ratings of arousal and
valence, a mixed effects repeated measures linear regression analysis was fit to the music
characteristics extracted from the common music excerpts used (i.e. 72 music pieces).
Separate mixed effect regression analyses were conducted for valence and arousal ratings.
The p-values associated with the regression slopes were recorded as indicators of the
Table 4.1: P-values for the main effect of arousal and valence rating in modeling mode,dissonance and maximum sound pressure level.
                                Arousal    Valence
Mode                            0.6128     0.0056
Dissonance                      0.0082     <0.0001
Maximum sound pressure level    0.0280     0.0006
significance of the detected relationship (p < 0.05).
To determine the extent to which music volume, dissonance and mode have affected
[HbO2] and [Hb] averaged across the nine recording regions, a mixed effect model was fit
to the peak values of average [HbO2] and [Hb] with the main effect of each music char-
acteristic separately (i.e. mode, maximum sound pressure level and average dissonance).
For region specific analysis of [HbO2] and [Hb] with respect to mode, maximum sound
pressure level and dissonance, please see appendix D.
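The regression step can be sketched as follows. This is a deliberate simplification: the thesis fit mixed-effects repeated-measures models, whereas this sketch computes a single-level ordinary least-squares slope and its t-statistic, ignoring the per-participant random effect; the variable names and synthetic data are illustrative only.

```python
import numpy as np

def ols_slope_t(x, y):
    """OLS slope of y on x and its t-statistic. A simplified stand-in for
    the mixed-effects models in the text: the per-participant random
    effect is omitted, so inference would differ from the thesis."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    slope = (xc @ yc) / (xc @ xc)
    resid = yc - slope * xc
    se = np.sqrt((resid @ resid) / (len(x) - 2) / (xc @ xc))
    return slope, slope / se

# synthetic example: valence ratings trending down with dissonance
rng = np.random.default_rng(0)
dissonance = rng.uniform(0.0, 1.0, 60)
valence = 5.0 - 3.0 * dissonance + rng.normal(0.0, 0.5, 60)
slope, t_stat = ols_slope_t(dissonance, valence)
```

A slope significantly different from zero (large |t|) is the criterion mirrored by the p < 0.05 threshold used in the text.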
4.4 Results
Table 4.1 summarizes the significance of the slope of the regression line for the mixed
effect model fit to mode, maximum sound pressure and dissonance to the main effect
of arousal and valence ratings in two separate models. The ratings of valence were
found to be significantly (p < 0.05) related to mode, maximum sound pressure level and
dissonance in each trial. The ratings of arousal were significantly related to dissonance
and maximum sound pressure level.
Music characteristics did not significantly (p < 0.05) influence the peaks of [HbO2]
and [Hb] averaged across the nine recording sites. Table 4.2 summarizes the p-values
corresponding to the main effects of the music characteristics, namely, mode, maximum
sound pressure and dissonance in modeling peaks of average [HbO2] and [Hb] across the
nine recording locations.
Table 4.2: P-values for the main effect of music characteristics (i.e. dissonance, mode,and maximum sound pressure level) in modeling the peaks of [HbO2] and [Hb] averagedacross the nine recording sites.
          Mode     Dissonance    Maximum sound pressure level
[HbO2]    0.205    0.098         0.059
[Hb]      0.769    0.052         0.250
Figure 4.1: In grey: the normalized sound pressure level of self-selected song A for participant 3. In black: normalized [HbO2] averaged across the nine recording locations, shown for each of the four repetitions of song A. The [HbO2] varied across repetitions of the same song.
Figure 4.1 depicts the normalized [HbO2] averaged across all nine interrogation sites
during 4 repetitions of a self-selected song. Visual inspection suggests that the average
[HbO2] collected during separate repetitions of the same song showed temporal differ-
ences. For example, in Figure 4.1, the peak [HbO2] appeared at different time points.
4.5 Discussion
The slope of the regression line fit to the music characteristics reached significance
(p < 0.05) with the main effect of arousal and valence rating (with the exception of
music mode modeled using the arousal rating). The arousal and valence ratings represent
emotional experience; therefore, these results confirmed the significant effect of music
characteristics in inducing emotions (Table 4.1). The music mode was found to signif-
icantly influence valence (p < 0.05) while the effect of mode on arousal did not reach
significance. This finding echoes those of previous studies involving emotional ratings
and music characteristics. Husain et al. showed that modifying the mode of a piece
by Mozart can affect the mood without influencing arousal [85]. Previous studies have
acknowledged the effect of mode on the perceived valence among listeners [75].
As shown in Table 4.2, mode, dissonance and maximum sound pressure level did
not significantly influence the peak PFC hemodynamics across participants (The slope
of the regression line did not reach significance). The results shown in Table 4.1, on
the other hand, confirmed that music characteristics significantly influenced subjective
ratings (with the exception of mode which did not significantly influence arousal ratings).
In addition, as shown in figure 4.1, repeated exposure to a music excerpt with identical
music characteristics resulted in different PFC hemodynamic patterns. Based on these
findings, mode, dissonance and maximum sound pressure level are unlikely to have directly
influenced PFC hemodynamics, but they significantly (p < 0.05) influenced subjective
ratings of arousal and valence and, therefore, emotional experience.
4.5.1 Subject specific patterns
As described in previous chapters, emotions can be individual specific since different
participants may manifest different levels of emotional sensitivity. The variability in
emotional ratings in this study (see Figure 3.4) highlights the subject-specific nature of
emotional experience. In addition, in the same participant, emotional response may vary
between sessions due to mood differences [191]. These differences in emotional experience
may be responsible for the amount of variability between repeated exposure to the same
music excerpts as observed in Figure 4.1.
4.5.2 Temporal dynamics
Music characteristics are dynamic phenomena, and so are PFC hemodynamics and emo-
tions. Therefore, instantaneous comparisons between these three elements seem necessary
for understanding how they interact. However, accessing instantaneous emotional ratings
by interrupting the user may result in distractions impeding natural emotional response.
Therefore, in the current thesis, emotional ratings were collected at the end of each mu-
sic excerpt. Future studies involving music characteristics and the brain should consider
implementing experimental paradigms to realize dynamic emotional ratings.
4.6 Conclusion
In this chapter, the effect of music characteristics, namely mode, dissonance, and maximum
sound pressure level, on subjective ratings of arousal/valence and maximum [HbO2]
and [Hb] in the PFC was investigated. The PFC [HbO2] and [Hb] averaged across the
nine recording locations were not significantly influenced by the music characteristics
under investigation. However, the results indicated that dissonance and maximum sound
pressure level have significantly influenced subjective ratings of arousal and valence. In
addition, the ratings of valence were found to be significantly influenced by music mode.
Overall, these findings supported the conjecture that music characteristics can affect
emotional experience. The evidence fails to support the hypothesis that the observed PFC
hemodynamic patterns were due to music perception. Therefore, the patterns in the PFC
were more likely to have resulted from the underlying emotions than the perception of
music.
Chapter 5
Automatic Detection of Emotional
Response to Music
5.1 Preamble
In this chapter, the feasibility of automatic detection of emotional response to aural
stimuli using near-infrared spectroscopy of the prefrontal cortex is examined. It details
the machine learning algorithms used to train participant-specific classifiers for
differentiating various levels of arousal and valence.
This chapter is entirely reproduced from the following journal article: Moghimi S,
Kushki A, Guerguerian AM, Chau T, Automatic detection of a prefrontal cortical response
to emotionally rated music using multi-channel near-infrared spectroscopy. Journal
of Neural Engineering. 2012; 9(2): 026022.
Readers may skip sections 5.4.1 and 5.4.2, which reiterate the procedures described
in Chapter 2.
5.2 Abstract
Emotional responses can be induced by external sensory stimuli. For severely disabled
nonverbal individuals who have no means of communication, the decoding of emotion
may offer insight into an individual's state of mind and his/her response to events taking
place in the surrounding environment. Near-infrared spectroscopy (NIRS) provides an
opportunity for bed-side monitoring of emotions via measurement of hemodynamic activ-
ity in the prefrontal cortex, a brain region known to be involved in emotion processing. In
this paper, prefrontal cortex activity of ten able-bodied participants was monitored using
NIRS as they listened to 78 music excerpts with different emotional content and a control
acoustic stimulus consisting of Brown noise. The participants rated their emotional
state after listening to each excerpt along the dimensions of valence (positive versus nega-
tive) and arousal (intense versus neutral). These ratings were used to label the NIRS trial
data. Using a linear discriminant analysis-based classifier and a two-dimensional time-
domain feature set, trials with positive and negative emotions were discriminated with
an average accuracy of 71.94% ± 8.19%. Trials with audible Brown noise representing a
neutral response were differentiated from high arousal trials with an average accuracy of
71.93% ± 9.09% using a two-dimensional feature set. In nine out of the ten participants,
response to the neutral Brown noise was differentiated from high arousal trials with ac-
curacies exceeding chance level, and positive versus negative emotional differentiation
accuracies exceeded the chance level in seven out of the ten participants. These results
illustrate that NIRS recordings of the prefrontal cortex during presentation of music with
emotional content can be automatically decoded in terms of both valence and arousal
encouraging future investigation of NIRS-based emotion detection in individuals with
severe disabilities.
5.3 Introduction
Emotions have been characterized as patterns of experience, perception, action and com-
munication that can be animated in response to physical and social encounters [96].
Some theories suggest that emotions can be manifested as a result of human inter-
actions with the surrounding environment [53, 125, 23], which result in physiological
changes [160] such as the modulation of central and peripheral nervous system activity
[15, 74, 9, 28, 210, 111]. These changes may facilitate the identification of emotional
state in non-verbal individuals with severe disabilities who may have no other means
of expression. Of particular appeal is the detection of affective responses through brain
activity monitoring, as there is no requirement for voluntary motor control. Indeed,
computer-based detection of emotional responses may enhance implicit communication
about the user in human-computer interaction systems [31]. Affective computing has
long been touted for its potential for more realistic and user-accommodating interactions
[171]. An emotionally aware system stands to benefit non-verbal individuals with severe
disabilities by estimating their emotional state in the absence of more direct means of
interaction (e.g. speech and gestures). In turn, knowledge of the patient's affective state
may help to mitigate care-giver stress and facilitate treatment decisions in a timely fash-
ion [70]. Various brain circuits including parts of the limbic system and amygdala are
responsible for perception of emotional stimuli [164, 208, 126]. In addition, the frontal
region of the human brain is involved in regulating emotional response to sensory input
[187, 33, 34]. For example, severity of the depressive symptomatology in patients fol-
lowing stroke lesions was reported to be significantly correlated with proximity of the
lesion to the frontal pole [186]. Moreover, left and right frontal activations were also
found in response to watching video clips inducing positive and negative emotional re-
sponses, respectively [235]. Activations in the orbito-frontal and ventral prefrontal cortex
in response to highly pleasurable self-selected music excerpts have also been reported [15].
Among various brain measurement modalities such as electroencephalography [157],
positron emission tomography [221], magnetoencephalography [68] and magnetic reso-
nance imaging (Bushong 1988), near infrared spectroscopy (NIRS) is particularly well
suited to long-term bedside monitoring of prefrontal cortex activity. NIRS involves the
optical measurement of changes in oxygenated (HbO2) and deoxygenated hemoglobin
(Hb) concentrations in regional cerebral blood flow [90, 228]. Being an optical modality,
NIRS measurements are not susceptible to electrogenic artifacts such as electrooculo-
grams and electromyograms.
NIRS has been used previously to detect emotional responses in the prefrontal cortex.
Recent findings with emotionally laden visual stimuli have confirmed the presence of
prefrontal cortex activations detectable by NIRS [74, 241, 83]. Likewise, in the context
of automatic emotion detection, Tai and Chau (2009) were able to differentiate between
prefrontal responses to affective pictures and baseline activity on a single-trial basis
with an average of 75% accuracy [218]. However, the perception of visual stimuli may
require gaze fixation and the control of the eye muscles responsible for keeping the eyes
open. Therefore, individuals with severe disabilities who possess little or no voluntary
muscle control, possibly concomitant with vision impairment, may not be able to observe
visual stimuli. However, evidence suggests that aural stimuli, the perception of which
requires no voluntary muscle control, can also elicit a pre-frontal response [15, 16, 17].
Previous findings indicate that when used as a BCI control task, active music imagery
(mental singing) can be differentiated from the rest state and mental math with accuracies
significantly above chance [177, 45, 65]. However, NIRS-based automatic detection of
passive prefrontal responses to affective aural stimuli remains unexplored to date. In
this study, we examined the feasibility of automatically detecting emotional responses to
aural stimuli by near-infrared spectroscopic interrogation of the prefrontal cortex. Music
in particular is recognized for its ability to induce an emotional response in a wide array
of individuals [138]. The emotional content of music is known to be perceived across
cultures [55] and distinguished by children as young as 6 years of age [32]. In fact, music
has been frequently used as an emotional auditory stimulus [106, 109, 81, 213, 60]. In
this paper, music excerpts were thus used for inducing affective brain activity.
5.4 Methods
Ten able-bodied volunteers (five females, five males, age: 25 ± 2.7 years) were recruited for
this study. The participants reported to have normal hearing, and normal or corrected-to-
normal vision. The recruitment criteria excluded individuals with reported cardiovascular
diseases, metabolic disorders, history of brain injury, respiratory conditions, drug- and
alcohol-related conditions, and psychiatric conditions. Participants were instructed to refrain from
caffeine and alcohol consumption 5 h prior to the study. Volunteers had an average of 5.5
years of past music training. Ethics approval was obtained from the Bloorview Research
Institute research ethics board (see Appendix E) and all participants provided informed
written consent.
5.4.1 Stimuli
The stimuli were composed of 78 researcher-selected and 6 participant-selected musical
pieces. All music segments were 45 s in duration. The excerpts included lyrical and
nonlyrical pieces. The lyrics were in different languages (English, French, Italian and
Spanish) to reduce potential effects of brain activation due to mental singing. The 78
standard music pieces were chosen by two researchers from different genres of music
(classical, rock, jazz and pop). Specifically, candidate pieces were assessed in terms of
their valence characteristics as suggested by the tone, rhythm and lyrics (where applica-
ble). Note that the researcher assessments were used solely to ensure an approximately
uniform representation of music between valences (positive versus negative). The actual
data analysis described in section 2.5 relied solely on participant ratings of valence and
arousal. For the participant-selected pieces, participants chose a priori three pieces of
music that personally induced intense positive emotions (joy or excitement) and three
that induced intense negative emotions (sadness). The control acoustic stimulus was
Brown noise (BN). User feedback in our pilot studies indicated that this type of noise
was subjectively more pleasant than white noise at the same sound pressure level (Voss
and Clarke 1978).
Each participant attended four sessions, which occurred on separate days, no more
than four weeks apart. In each session, participants completed three blocks with optional
breaks between blocks. Each block consisted of 12 consecutive trials: four trials with
positively valenced songs (one of which was a participant-selected song), four trials with
negatively valenced songs (one of which was a participant-selected song) and four BN trials.
Within a block, the music and BN trials were pseudo-randomized, such that two BN trials
never occurred consecutively while positively and negatively valenced songs appeared in
no apparent order. The same pseudo-random sequence of trials was employed for all
participants. Figure 2 depicts a trial sequence. In each trial, the participant listened to
10 s of BN, followed by a 45 s auditory stimulus (music or BN), and finally 5 s of BN.
The sound level was faded in and out at the beginning and end of the trial, respectively,
to reduce the risk of eliciting a startle. At the end of each trial, the participant rated
the intensity and valence of their emotional experience using a nine-level Self-Assessment
Manikin (Morris 1995). The beginning and end of each trial was marked by an audible
tone. The participants were instructed to close their eyes when they heard the initial
tone, and to open their eyes upon hearing the second tone.
In this phase, hemodynamic activity was represented by features extracted from
[HbO2] and [Hb] concentrations. A subset of these features was selected and used for
training a linear discriminant analysis based classifier. The classifier was then tested
using a second subset set aside for testing. The training and testing feature subsets
were both labeled based on arousal and valence ratings provided by the participants.
Classifiers were trained separately for differentiating arousal and valence levels in each
participant.
5.4.2 Preprocessing
Low-frequency artifacts such as respiration, heart rate and the Mayer wave were filtered
using a third-order type II Chebyshev low-pass filter with a cut-off frequency of 0.1
Hz (normalized stop-band edge frequency of 0.032 and stop-band ripple of 50 dB down
from the peak pass-band value) [178]. The 830 nm and 630 nm light intensities at each
of the nine recording sites were used to calculate HbO2 and Hb concentrations via the
modified Beer-Lambert law [30, 41], which resulted in 18 channels of concentration data.
To reduce the effects of initial device calibration, the concentration time series were
normalized within each experimental block against the mean in the same block.
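The filter design described above can be reproduced approximately with SciPy. The sampling rate is not stated in this excerpt; 6.25 Hz is inferred here from the quoted cut-off (0.1 Hz) and normalized stop-band edge (0.032), and should be treated as an assumption.

```python
import numpy as np
from scipy.signal import cheby2, filtfilt

# 0.1 Hz cut-off with a normalized stop-band edge of 0.032 implies a
# sampling rate near 6.25 Hz (0.1 / 0.032 = fs / 2). This rate is an
# inference, not a figure stated in the text.
FS = 6.25

# third-order type II Chebyshev low-pass, 50 dB stop-band attenuation
b, a = cheby2(3, 50, 0.1, btype="low", fs=FS)

def preprocess(series):
    """Zero-phase low-pass filtering followed by normalization against the
    block mean, mirroring the two preprocessing steps described above."""
    filtered = filtfilt(b, a, series)
    return filtered - filtered.mean()

# demo: slow hemodynamic trend plus ~1 Hz cardiac-band interference
t = np.arange(0.0, 60.0, 1.0 / FS)
raw = 0.5 * np.sin(2 * np.pi * 0.02 * t) + 0.2 * np.sin(2 * np.pi * 1.0 * t)
clean = preprocess(raw)
```

Zero-phase filtering (`filtfilt`) is used here so that the concentration peaks are not shifted in time; the text does not specify whether causal or zero-phase filtering was applied.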
5.4.3 Feature extraction
Two genres of features were considered: laterality features and single-channel features.
All features were extracted from [HbO2] and [Hb] concentrations. Table 5.1 summarizes
the features used. Single-channel features were calculated at each of the 9 interrogation
locations and consisted of the mean, slope and coefficient of variation of the concentration
signals during the 45s aural stimuli period, as well as the change in the average concen-
tration from the preceding baseline period to the task period. The slope was determined
by fitting a line using linear regression to all data points in the 45s trial window and
calculating the corresponding slope. The entire 45s window was used for determining the
slope because the concentration changes could occur at any point during the presenta-
tion of the aural stimulus. Coefficient of variation was determined by finding the ratio
of the variance to the mean over the course of the trial. The amplitude-based features
reflected the level of chromophore concentration which captured regional brain activity.
The slope of the concentration waveform represented response latency (i.e. faster vs.
slower changes). Such features have previously characterized task-based activation in
Table 5.1: Summary of features used in the analysis

Feature type                    Features
Laterality features             Lateral slope ratio (LSR) = right concentration slope / left concentration slope
                                Lateral absolute mean difference (ΔLM) = |left concentration mean - right concentration mean|
Single channel-based features   Stimuli period mean (M)
                                Stimuli period slope (S)
                                Coefficient of variation (CV)
                                Mean difference between signal and noise (ΔM) = stimuli period mean - preceding noise period mean
the prefrontal cortex [178, 180, 218, 151]. In total there were 4 features/location × 9
locations × 2 chromophore concentrations = 72 single-channel features.
The two laterality features quantified differences in activity between the left and the
right sides, and thus were calculated for each of the four pairs of interrogation locations
symmetrical about the midline (i.e., 1L-1R, 2L-2R, 3L-3R and 4L-4R in Fig 2.1). Later-
ality features included the ratio of the concentration signal slopes, and the difference in
the average signal values, between corresponding left and right channels. The inclusion
of these features was motivated by physiological findings that confirm lateralized activa-
tions in response to emotional stimuli [235, 33, 4]. In total, there were 2 features/channel
pair × 4 channel pairs × 2 chromophore concentrations = 16 laterality features.
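The feature computations described above can be sketched as follows. This is a minimal NumPy illustration, not the thesis code: the function names are hypothetical, the 31.25 Hz sampling-rate default is taken from the recording description, and the coefficient of variation uses the variance-to-mean form stated in the text.

```python
import numpy as np

def single_channel_features(conc, baseline, fs=31.25):
    """Four single-channel features from one chromophore time series:
    stimulus-period mean (M), slope (S), coefficient of variation (CV),
    and mean change from the preceding baseline period (dM)."""
    t = np.arange(len(conc)) / fs
    m = conc.mean()
    s = np.polyfit(t, conc, 1)[0]      # line fit over the full stimulus window
    cv = conc.var() / m                # variance-to-mean ratio, as defined in the text
    dm = m - baseline.mean()
    return m, s, cv, dm

def laterality_features(left, right, fs=31.25):
    """Laterality features for one symmetric channel pair:
    lateral slope ratio (LSR) and absolute lateral mean difference (dLM)."""
    t = np.arange(len(left)) / fs
    lsr = np.polyfit(t, right, 1)[0] / np.polyfit(t, left, 1)[0]
    dlm = abs(left.mean() - right.mean())
    return lsr, dlm
```

Applied per chromophore, this yields the 4 × 9 × 2 = 72 single-channel and 2 × 4 × 2 = 16 laterality features counted above.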
5.4.4 Classification procedures
For each trial, 65 seconds of data were extracted, including the 45 second stimulus period
and the preceding (10 s) and subsequent (5 s) Brown noise periods. The trials with Brown
noise (BN) were set aside, and the rest of the data were partitioned according to arousal
and valence ratings. For the analysis of arousal, the 48 highest rated trials (out of 96
trials with music) over all four sessions were selected. For the valence component, the
24 highest positively-rated and 24 highest negatively-rated trials across all four sessions
(out of 96 trials with music) were selected. The high arousal (HA), positive valence (PV),
negative valence (NV), and Brown noise (BN) trials were labeled accordingly. Note that
arousal and valence labeling were performed independently [156, 190].
A classifier based on linear discriminant analysis [40] was used to solve two different
two-class classification problems (HA vs. BN and PV vs. NV). Comparing the two
valence categories (i.e. PV and NV) individually with Brown noise was not feasible due to
the difference in sample sizes (nPV = nNV = 24, nBN = 48). The classification accuracy
was estimated using the average of 50 independent iterations of 10-fold cross-validation.
Due to differences in prefrontal activation across participants, feature selection
was performed to select the subset of the feature set that best separated the two classes for
each participant. To measure separability, we used the Fisher score [40], defined
as the ratio of the squared difference between the class means of a feature to the
sum of its class variances on the training
data. The Fisher score for each feature was calculated and the two features with
the highest scores were selected for classification. Classification accuracy is reported as
correct classification rate.
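The feature-selection step can be illustrated with a small sketch. The helper names are hypothetical and the squared-difference form of the Fisher score is one standard definition; in the cross-validation described above, scores would be computed on training folds only.

```python
import numpy as np

def fisher_scores(X, y):
    """Fisher score of each feature column for a two-class problem:
    squared difference of the class means over the sum of class variances."""
    c0, c1 = X[y == 0], X[y == 1]
    return (c0.mean(axis=0) - c1.mean(axis=0)) ** 2 / (c0.var(axis=0) + c1.var(axis=0))

def select_top_k(X, y, k=2):
    """Indices of the k features with the highest Fisher score."""
    return np.argsort(fisher_scores(X, y))[::-1][:k]
```

The selected columns would then be passed to a linear discriminant classifier inside each cross-validation fold.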
5.5 Results
Fig. 5.1 depicts normalized sample concentration recordings from all recording locations
for participant 3. Fig. 5.1(a) and 5.1(b) are recordings during a music excerpt rated
as highly arousing and strongly positive, whereas Fig. 5.1(c) and 5.1(d) are normal-
ized sample recordings from one of the most arousing but most negatively rated trials.
Recordings during a sample Brown noise trial are provided for comparison in both cases.
Some patterns are immediately evident. Both HbO2 plots (Fig. 5.1(a) and 5.1(c)) show a general increase in concentration (hyper-oxygenation).

(a) HbO2 concentration for positively valenced stimulus
(b) Hb concentration for positively valenced stimulus
(c) HbO2 concentration for negatively valenced stimulus
(d) Hb concentration for negatively valenced stimulus

Figure 5.1: Plots (a) and (c) exemplify normalized HbO2 concentration signals at different recording locations, while plots (b) and (d) are the corresponding normalized Hb concentration signals. The dark lines represent normalized signals corresponding to highly valenced, high arousal stimuli, while the lighter grey line depicts normalized concentrations during Brown noise presentation to the same participant. The same Brown noise sample is illustrated for both the positively and negatively valenced examples.

The hyper-oxygenation occurs at different points in time during exposure to the various auditory stimuli. In both the positively and negatively rated trials depicted in Fig. 5.1, a decrease in Hb concentration following hyper-oxygenation is observed, consistent with previous findings from functional NIRS studies [161, 136]. The valenced responses are visibly distinct from the sample Brown noise response (light grey traces).
The average classification accuracies for the valence (PV versus NV) and arousal (HA
versus BN) classification problems are reported in Tables 5.3 and 5.2, respectively, for
each participant. The best accuracy averaged over all participants was obtained with
2-dimensional feature sets for both HA versus BN (71.93%), and PV versus NV (71.94%)
classification problems. Tables 5.3 and 5.2 also summarize the different features selected by
the feature selection algorithm for each classification problem and each participant. As
can be seen, the optimal feature set differed across participants.
The spatial distributions of the features leading to the best accuracies are marked in Fig.
5.2 for the HA versus BN and PV versus NV classification problems. In these figures,
the size of each rectangular area is directly proportional to the frequency at which
the feature in question was selected at a specific recording site across all participants.
The vertical line represents the anatomical midline. The values are based on the feature
set dimensionality resulting in the highest average classification accuracy.

Table 5.2: Classification accuracy in % for each participant when classifying HA vs. BN. Feature types corresponding to the best average accuracy are also presented for each participant (M = stimulus period mean; ∆M = stimulus period mean − preceding noise period mean; LSR = lateral slope ratio; ∆LM = lateral mean difference; S = slope; CV = coefficient of variation).

Participant  Gender  HA vs. BN % (2 features)  Features chosen
1            M       90.21 ± 1.72              ∆M
2            F       76.91 ± 1.04              ∆M
3            F       78.67 ± 3.31              ∆M
4            F       67.57 ± 2.01              M, S
5            F       69.04 ± 1.91              ∆M, CV
6            M       58.12 ± 2.55              S
7            M       61.71 ± 2.43              S, ∆M
8            F       71.16 ± 1.08              S
9            M       70.17 ± 3.93              ∆M
10           M       75.72 ± 1.28              ∆M
Average              71.93 ± 9.09
Fig. 5.3 illustrates how the adjusted classification accuracy (i.e. the average of classification sensitivity and specificity), averaged across all participants, changes as trials
with lower arousal ratings are compared to Brown noise. Similarly, Fig. 5.4 depicts how
the classification results change when different ranges of positively- and negatively-rated
trials are compared. Comparisons ranged from the highest-rated negative trials (top 12) versus the highest-rated positive trials (top 12) to all positively-rated trials classified against all
negatively-rated trials.
Table 5.3: Classification accuracy in % for each participant when classifying PV vs. NV. Feature types corresponding to the best average accuracy are also presented for each participant (M = stimulus period mean; ∆M = stimulus period mean − preceding noise period mean; LSR = lateral slope ratio; ∆LM = lateral mean difference; S = slope; CV = coefficient of variation).

Participant  Gender  PV vs. NV % (2 features)  Features chosen
1            M       75.20 ± 4.22              ∆M, M
2            F       77.73 ± 2.09              LSR, S
3            F       63.28 ± 4.30              LSR, M
4            F       67.76 ± 2.83              LSR, ∆M
5            F       77.57 ± 4.10              ∆M
6            M       63.04 ± 3.67              ∆M, M
7            M       62.00 ± 3.46              S, CV
8            F       86.91 ± 2.87              ∆M
9            M       76.99 ± 5.11              ∆M, M
10           M       68.96 ± 6.55              S, M
Average              71.94 ± 8.19
5.6 Discussion
5.6.1 Classification Accuracy
The objective of this phase was to detect the brain response to emotionally-laden music
by monitoring the prefrontal hemodynamics manifested as changes in the HbO2 and
Hb concentrations. Visual inspection of the concentration waveforms in Fig. 5.1 supports the choice of discriminatory features (e.g., mean and slope). Emotional arousal
in response to music was classified against the Brown noise response with an average
accuracy of 71.93% while emotional valence (i.e. positive or negative) was differentiated
with 71.94% accuracy. These findings indicated that the emotional content of music in-
duces differential patterns of activity in the prefrontal cortex, detectable algorithmically
by NIRS.
As reported in Tables 5.2 and 5.3, classification accuracies varied across participants,
corroborating previous findings of individual differences in emotional reactivity [189, 22].
As can be seen in Tables 5.2 and 5.3, accuracies above chance level were achieved for
(a) HbO2, HA versus BN  (b) Hb, HA versus BN
(c) HbO2, PV versus NV  (d) Hb, PV versus NV

Figure 5.2: Location of features resulting in the best overall accuracy. Each rectangle is located over a recording site. The size of the rectangle is proportional to the number of features selected from the corresponding location. The vertical line denotes the anatomical midline (HA = high arousal; BN = Brown noise; PV = positive valence; NV = negative valence).
9 out of 10 participants in the HA versus BN classification problem (α = 0.05), while
in the PV versus NV scenario, accuracies for 7 out of 10 participants exceeded chance
(α = 0.05)1.
One concern when investigating emotional experience using PFC activity is
the possibility of activation due to the emotion induction task requirements as opposed
to the emotions induced [74]. However, Fig. 5.3 illustrates how the classification accuracy
degrades as trials with increasingly lower arousal ratings are compared against Brown
noise. Therefore, the difference in task requirements (e.g. attentional demands)
between music presentation and Brown noise presentation is unlikely to account
for the classification accuracy. Similarly, in Fig. 5.4, the classification accuracies degrade as
trials with increasingly lower positive and negative valence ratings are classified against
1 Note that for a two-class problem, the 95% confidence intervals (α = 0.05) for 48 and 24 trials per class are 50 ± 9.80 and 50 ± 13.59, respectively [149].
Figure 5.3: Adjusted classification accuracy results (averaged across participants) versus the number of trials included for classification against Brown noise trials, after sorting all trials based on ratings of arousal in descending order (e.g. accuracies reported for the top 12 are the result of classifying the 12 highest-rated arousal trials against all trials with Brown noise). The confidence intervals are shown as error bars for each number of trials included.
Figure 5.4: Adjusted classification accuracy results (averaged across participants) versus the number of trials included for classification, after sorting all trials based on ratings of positive and negative valence in descending order (e.g. accuracies reported for the top 12 are the result of classifying the 12 most positively rated trials against the 12 most negatively rated trials). The confidence intervals are shown as error bars for each number of trials included.
each other. This decrease in the classification accuracies is expected, due to potential
similarities between trials rated at the lower positive and lower negative ends of valence
(approaching a neutral state).
According to Fig. 5.2, which depicts the recording sites corresponding to features selected across all participants, the spatial distribution of the features resulting in the best
overall accuracy was bilateral. This finding is consistent with the bilateral physiological
substrates responsible for the perception of valence in the prefrontal cortex [33].
Nonetheless, in three out of ten participants, unilateral activation was most discriminatory, as laterality features were among those selected for solving the valence classification
problem (see Table 5.3).
5.6.2 Diversity in the music database
Previous studies have reported regional brain activity modulation due to specific char-
acteristics of music such as rhythm, timbre, and major/minor chords [118, 193, 163]. In
these studies, the investigators varied selected music characteristics while carefully con-
trolling for others. Other studies, focusing on emotion induction, have used diverse music
databases (e.g., self-selected music pieces) to ensure successful elicitation of emotional
reactions [15, 200]. In the current study, the second approach was used.
The variability of arousal and valence ratings for a given piece of music across partic-
ipants (i.e., the same music excerpt rated differently among participants) suggests that
the observed brain activity was indeed attributable to emotional experiences. Moreover,
the variability in ratings among participants implies that the classification algorithm was
not likely biased towards specific musical characteristics.
5.6.3 Challenges
Due to the limited number of samples, only two dimensions of emotion (valence and
arousal) were considered. Although these measures are informative, they fail to capture
more specific emotional labels. For example, fear and sadness can both be rated as
negatively valenced and high in arousal. In order to differentiate more specifically among
emotional labels, other dimensions of emotion such as occurrence (eruptive vs. gradually
arising) and dominance (complete control vs. no control over the situation) need to be
considered [240].
Special care was devoted to standardize headgear placement across all four sessions,
which in turn, should have minimized instrumentation inconsistencies. However, differ-
ences in the shape of the skull may have led to variabilities in the brain regions monitored
in different participants. Therefore, the present results preclude conclusions about the
specific brain regions that were activated.
The human response to emotional stimuli may be affected by emotional sensitivity.
In fact, [169] showed that individuals with high trait emotional intelligence respond
faster and show more sensitivity in an emotion induction paradigm. Including a measure
of emotional sensitivity in addition to the self-reported ratings might have helped to
explain the inter-subject variability in classification accuracies.
Previous studies of emotion have indicated gender differences as an important factor in
emotional response [132, 241]. However, the limited number of participants did not allow
further investigation of gender-related differences in the emotional response. Future
studies with larger sample sizes need to be devised to investigate the effects of gender on
the emotion-induced prefrontal hemodynamic response.
Chapter 6
Combining autonomic and central
nervous system activity
6.1 Preamble
In this chapter, autonomic nervous system activity signals, namely electrodermal ac-
tivity (EDA), blood volume pulse (BVP), and skin temperature are used to solve the
classification problems introduced in chapter 5 (i.e. high arousal vs. brown noise and
most positive vs. most negative). In addition, new features using dynamic modeling and
template matching are introduced for emotion identification.
The goal here is to compare the results achieved using ANS features with those
obtained using exclusively PFC hemodynamic features (see chapter 5), and to combine
classifiers trained using features derived from ANS and PFC hemodynamics to improve
upon accuracies obtained in chapter 5. Readers can skip sections 6.3.1, 6.3.2 and 6.3.6
since they reiterate the procedures described in chapter 2 and the classification steps
introduced in section 5.4.4.
6.2 Introduction
Emotional response may engage various pathways in the central and autonomic nervous
systems. In fact, some theories of the neural basis of emotion have argued for
an intricate pattern of interaction between the central and autonomic nervous
systems during emotional response [222]. Autonomic nervous system (ANS) activity has
long been used in the field of physiologically-based emotion identification. Physiological
emotion detection may provide a means of affective communication for adults and youth
with severe disabilities who may not be able to use conventional means of emotional ex-
pression such as speech or facial gestures due to severe motor impairments. In particular,
identifying affective state may help to mitigate care-giver stress and facilitate treatment
decisions in a timely fashion [70].
Cardiovascular, respiratory, and particularly electrodermal activity (EDA) sensors
can detect ANS activity modulations during emotional response [108, 129]. Many studies
have used multiple indicators of ANS activity for identifying emotions [244, 66, 100]. For
example, using EDA, temperature, BVP and ECG monitors, Kim et al. achieved 78% accuracy
in differentiating anger, sadness and stress [102].
Emerging neural indicators of emotion are based on activity of the central nervous
system (CNS), particularly brain areas which are found to be involved in emotional pro-
cessing. Highly pleasurable music excerpts were shown to result in activation patterns
in the amygdala, as well as the frontal and ventral prefrontal cortex [15], using mag-
netic resonance imaging (MRI). Another hemodynamic monitoring technology applied for
emotion identification is near-infrared spectroscopy (NIRS) which measures oxygenated
and deoxygenated hemoglobin concentrations ([HbO2] and [Hb], respectively) in cerebral
blood flow [90, 228]. NIRS, a portable and relatively inexpensive optical imaging
technology, is not suitable for monitoring deeper brain regions such as the amygdala,
but is capable of monitoring the PFC, which is part of the emotional perception circuitry
in the brain. In fact, NIRS studies of the PFC have identified correlates of emotion in
regional hemodynamic activity [218, 144]. Hoshi et al. showed that emotional response
to pleasant and unpleasant pictures resulted in regional increase and decrease of PFC
[HbO2], respectively [83].
Based on the existing physiological evidence, both autonomic and central nervous
system activity may show modulation during emotional activity. Therefore, realizing a
multi-modal emotion identification system which uses both CNS and ANS activity is a
meaningful pursuit. Recent studies have explored concomitant use of signals from both
ANS and CNS pathways for detecting emotions. For example, Kuncheva et al. [120] showed
that an ensemble of classifiers, each trained using electrocardiogram, electroencephalogram, EDA or pulse signals, could achieve accuracies up to 73% in differentiating
positive from negative emotional states. In this light, the current chapter focuses on using
features from ANS activity to distinguish the most intensely rated music excerpts from neutral
Brown noise, and the most positively rated excerpts from the most negatively rated ones.
Furthermore, a mixture of experts was used for combining classifier decisions. These
classifiers were trained using ANS-based features or NIRS-based features separately.
6.3 Methods
6.3.1 Procedures
Ten able-bodied individuals (5 female, age: 25 ± 2.7 years) with no reported cardiovascular
diseases, metabolic disorders, history of brain injury, respiratory conditions, or drug-,
alcohol-related or psychiatric conditions were recruited for this study. Ethics approval
was obtained from the ethics board at Holland Bloorview Kids Rehabilitation Hospital.
The experiments were conducted over four separate sessions, and encompassed a total of
144 trials, 48 of which included brown noise. Pilot studies indicated that this type of noise
was subjectively more pleasant than white noise at the same sound pressure level. In each
session, participants completed three blocks. Each block consisted of 12 consecutive trials:
Figure 6.1: Trial sequence
four trials with positively valenced songs (including one participant-selected song), four
trials with negatively valenced songs (including one participant-selected song), and four
Brown noise trials. The music excerpts were randomly selected from a database composed
of six music pieces self-selected by the participant and a common music database selected
by researchers. The common music database included music pieces from different genres
of music (classical, rock, jazz, and pop), with and without lyrics. The trials within each
block were pseudo-randomized, such that Brown noise trials never occurred consecutively,
and positively and negatively valenced songs appeared in no apparent order. The same
pseudo-random sequence of trials was used for all participants. Figure 6.1 illustrates a
typical trial sequence.
6.3.2 NIRS data
A multi-channel NIRS monitoring system (Imagent Functional Brain Imaging System
from ISS Inc., Champaign, IL) was used to record hemodynamic response across nine
different regions in the PFC. In this system, five optode pairs and three detectors were
placed on the forehead as shown in Figure 6.2. In each optode pair, one source emitted
light at 830 nm and the other at 690 nm. The signals were recorded at a 31.25 Hz sampling
rate.
Figure 6.2: The layout of light sources (circles) and detectors (X's). The vertical line denotes the anatomical midline. The annotated shaded areas correspond to recording locations.
6.3.3 ANS data
Blood volume pulse (BVP), electrodermal activity (EDA) and temperature were recorded
using a ProComp Infiniti multimodality encoder (Thought Technology, Montreal, QC,
Canada) at a 256 Hz sampling rate. EDA was recorded using two 10-mm-diameter Ag-AgCl
surface electrodes attached to the index and middle finger phalanges of the non-dominant
hand. Skin temperature was recorded using a thermal sensor secured on the fifth finger.
The blood volume pulse was obtained using a photoplethysmograph sensor attached to
the thumb. The recorded BVP signal was used to determine heart rate by finding the
interbeat interval.
6.3.4 Analysis
The BVP signals were band-pass filtered (0.2-0.33 Hz) using a Daubechies-based continuous wavelet transform [2] to facilitate peak detection. The inverse of the peak-to-peak
distances in time was used as an indicator of heart rate, and the peak values were used
to determine pulse volume amplitude (PVA) [101].
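The heart-rate and PVA computation can be sketched as follows. This minimal NumPy version assumes the band-pass filtering has already been applied, and uses a simple neighbour-comparison peak picker as a stand-in for whatever detector was actually used; the function name is hypothetical.

```python
import numpy as np

def heart_rate_and_pva(bvp, fs=256.0):
    """Heart rate (bpm) from interbeat intervals, and pulse volume
    amplitude (peak values), from an already-filtered BVP trace."""
    # local maxima: strictly greater than both neighbours
    peaks = np.where((bvp[1:-1] > bvp[:-2]) & (bvp[1:-1] > bvp[2:]))[0] + 1
    ibi = np.diff(peaks) / fs        # interbeat intervals (s)
    hr = 60.0 / ibi                  # instantaneous heart rate (bpm)
    pva = bvp[peaks]                 # peak amplitude as the PVA indicator
    return hr, pva
```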
6.3.5 Feature extraction
PFC features
PFC hemodynamic-based features were extracted from [HbO2] and [Hb] at each of the
9 recording locations. These features included the mean, slope (determined using linear
regression) and coefficient of variation (ratio of variance to the mean), all estimated
over the music presentation period, and the change in the mean between the preceding
noise and music presentation period. In addition to single-channel features, the ratio of
the concentration signal slopes and the difference in the average signals were determined
between left and right channels (i.e., 1L-1R, 2L-2R, 3L-3R and 4L-4R in Fig 2). Laterality
features were introduced into the feature set, based on previous reports of lateralized
response to emotional stimuli [235, 33, 4]. For more information regarding these features,
the reader is referred to section 5.4.3.
Based on the findings presented in chapter 3, new features were derived by introducing a custom-made template. First, by repeated visual inspection of [HbO2] and [Hb]
patterns in the highest arousal-rated trials, a template was designed. Figures 6.3A and
6.3B depict the designed template and a sample recording from participant 3 during which
chills were reported, respectively. As shown in Figure 6.3, the template was designed to
capture the sudden increase and the ensuing plateau in the concentration waveform.
This custom-made template was treated akin to a mother wavelet: it was translated
and scaled across each trial, and the maximum coefficient across translation and scale
was used as a feature for classification. Hence, the template was empirically determined.
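The template-matching feature can be sketched as a brute-force search over translations and scales. This is only an illustration: the actual template shape and scale grid in the thesis were empirically chosen, and the function name and normalization here are assumptions.

```python
import numpy as np

def template_feature(signal, template, scales=(0.5, 1.0, 2.0)):
    """Maximum matching coefficient between a concentration trace and
    scaled/translated copies of a template (wavelet-style matching)."""
    best = 0.0
    for s in scales:
        n = max(2, int(round(len(template) * s)))
        if n > len(signal):
            continue
        # stretch/compress the template by linear interpolation
        stretched = np.interp(np.linspace(0, len(template) - 1, n),
                              np.arange(len(template)), template)
        stretched -= stretched.mean()
        for start in range(len(signal) - n + 1):
            seg = signal[start:start + n]
            best = max(best, float(np.dot(seg - seg.mean(), stretched)))
    return best
```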
ANS features
The features representing autonomic nervous system signals included the mean, the range
and the difference in the mean values during the aural stimulus period and the preceding
Figure 6.3: A. Custom-made template. B. Sample normalized [HbO2] recorded in a trial with chills.
noise period (see the trial sequence in Figure 2.2) for temperature recordings and EDA.
The number and magnitude of electrodermal responses (EDRs) were also added to the
feature set. EDRs were detected by differentiating the EDA recordings and convolving
the resulting waveform with a Bartlett window, and finding the two consecutive zero-
crossings (positive to negative and negative to positive) [102]. The maximum value
between these two zero-crossings was recorded as the magnitude of the EDR [102]. The
average heart rate and PVA signals within the 45 sec period of exposure to music were
also included as features to represent cardiovascular response. In addition, the ratio of
high frequency (heart rates above 70 bpm) to low frequency (heart rates below 70 bpm)
energy content was also included in the feature set. These ANS features were selected
based on previous findings in studies of emotion involving ANS activity [108, 184].
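The EDR detection described above can be sketched as follows. This NumPy version is an assumption-laden stand-in: the window length, the exact zero-crossing convention, and the choice of taking the maximum of the EDA trace between crossings are illustrative rather than the thesis implementation.

```python
import numpy as np

def detect_edrs(eda, fs=256.0, win_s=1.0):
    """Detect electrodermal responses: differentiate the EDA trace, smooth
    with a Bartlett window, delimit each response by consecutive zero
    crossings of the smoothed derivative, and take the maximum EDA value
    between them as the EDR magnitude."""
    d = np.diff(eda)
    w = np.bartlett(max(3, int(win_s * fs)))
    sm = np.convolve(d, w / w.sum(), mode="same")
    pos = sm > 0
    onsets = np.where(~pos[:-1] & pos[1:])[0] + 1   # crossing up (- to +)
    ends = np.where(pos[:-1] & ~pos[1:])[0] + 1     # crossing down (+ to -)
    mags = []
    for on in onsets:
        later = ends[ends > on]
        if later.size:
            mags.append(float(eda[on:later[0] + 1].max()))
    return len(mags), mags
```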
Dynamic model-based features
System identification has previously been used for modeling interactions among various
physiological signals [197, 175]. For example, Saul et al. used system identification to
understand the relationship between respiratory signals and heart rate in their charac-
terization of autonomic regulation of heart rate [198]. The differences observed between
models fit to neutral and chilling trials, which will be reported in section 6.4.1, suggested
that dynamic model-based features may be useful in differentiating emotions.
To capture the relationship between EDA and [HbO2], an autoregressive model with
exogenous input (arx) was applied [130]. This arx model describes a system based on
immediate past input (x) and output (y) values, as shown in (6.1).

y(t) = b1 x(t−1) + ... + b_nb x(t−nb) − a1 y(t−1) − ... − a_na y(t−na) + ε    (6.1)
In (6.1), nb and na are model orders, and ai and bi are model coefficients. In the
arx model estimated, the [HbO2]/[Hb] was used as the input (x) to the system and the
EDA was set as the output (y) and vice versa. Model order was selected according to
the Akaike Information Criterion (AIC )[130].
The EDA signals (originally collected at 256 Hz) were down-sampled by a factor of 7
(resulting in a sampling rate of 4.57 Hz), and [HbO2] signals (originally collected at 31.25
Hz) were re-sampled at 4.57 Hz to match the EDA sampling rate. To match the bandwidth
of the EDA signals to that of the NIRS recordings, both types of signals were low-pass
filtered using the same third-order Chebyshev type II filter (i.e. 0.1 Hz cut-off). The
concentration time series and EDA signals within each trial were normalized to have zero
mean and were scaled down by the maximum absolute signal value during the trial. This
normalization resulted in signal magnitudes ranging from -1 to 1.
An autoregressive (AR) model was used to represent EDA, [HbO2], and [Hb] dynam-
ics. Unlike arx, an AR model describes the signal with respect to the immediate past of
the same signal, as shown in (6.2).

y(t) = a1 y(t−1) + ... + a_na y(t−na) + ε    (6.2)
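Both model forms reduce to linear least squares. The thesis presumably used a system-identification toolbox, so the following NumPy sketch of fitting (6.1) is only illustrative; the function name is hypothetical, and an AR model as in (6.2) would be fit analogously from past output values alone.

```python
import numpy as np

def fit_arx(y, x, na, nb):
    """Least-squares estimate of the arx coefficients in (6.1):
    y(t) = b1 x(t-1) + ... + b_nb x(t-nb) - a1 y(t-1) - ... - a_na y(t-na) + e.
    Returns (a, b)."""
    p = max(na, nb)
    rows = [np.concatenate([x[t - nb:t][::-1],     # x(t-1), ..., x(t-nb)
                            -y[t - na:t][::-1]])   # -y(t-1), ..., -y(t-na)
            for t in range(p, len(y))]
    theta, *_ = np.linalg.lstsq(np.asarray(rows), y[p:], rcond=None)
    return theta[nb:], theta[:nb]                  # (a, b)
```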
To illustrate the potential merits of dynamic-based feature extraction in emotion
identification, arx models ([HbO2] averaged across the nine recording regions was used as
input) were fit to trials with the highest arousal rating (i.e. chills) and those with brown
noise rated as neutral (i.e. neutral). These models were compared (chills vs. neutral)
Table 6.1: Features resulting from arx dynamic modeling. (Very low frequency band (VLF) = 0-0.025 Hz, low frequency band (LF) = 0-0.075 Hz, and high frequency band (HF) = 0.075-0.1 Hz.)

Feature type                 Features
Model-based features         Model coefficients
Frequency response features  Energy_total, Energy_VLF, Energy_LF, Energy_HF
in the frequency domain to explore the usefulness of dynamic-based feature extraction
in differentiating arousal. Chills were selected for comparison as they are well-defined
emotional events. The model orders identified using AIC were recorded for chills and
neutral trials, and the AIC order selected for the majority of trials was considered as the
generalized model order (GMO).
To identify features, each trial (n=144) was first modeled using arx under two condi-
tions: a) with [HbO2]/[Hb] as the input and, b) with EDA as the input (GMO for chills
was used for modeling in both (a) and (b)). The model coefficients (i.e. ai, bi) were
included as direct model-based features. Other features were based on the frequency
response of the estimated dynamic model. The energy of the frequency response within
three frequency bands, namely, the very low frequency band (VLF = 0 - 0.025 Hz), the
low frequency band (LF=0 - 0.075 Hz) and the high frequency band (HF = 0.075 - 0.1
HZ) was used for extracting frequency response-based features. Ninety percent of the
spectral peaks of the models’ transfer functions occurred within these frequency bands.
Table 6.1 summarizes these features. Using a similar procedure, the GMO was used for
identifying AR model coefficients which were also included as features.
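The band energies of an estimated model's frequency response can be computed by evaluating H(z) = B(z)/A(z) on a frequency grid. The band edges below follow Table 6.1; the grid size and helper name are arbitrary choices for this sketch.

```python
import numpy as np

def band_energies(a, b, fs=4.57, n=512):
    """Energy of |H(f)|^2 = |B/A|^2 in the VLF, LF and HF bands, where
    A(z) = 1 + a1 z^-1 + ... and B(z) = b1 z^-1 + ... follow (6.1)."""
    f = np.linspace(0.0, fs / 2.0, n)
    w = 2.0 * np.pi * f / fs                        # digital frequency (rad/sample)
    kb = np.arange(1, len(b) + 1)
    ka = np.arange(1, len(a) + 1)
    B = (np.asarray(b)[:, None] * np.exp(-1j * np.outer(kb, w))).sum(axis=0)
    A = 1.0 + (np.asarray(a)[:, None] * np.exp(-1j * np.outer(ka, w))).sum(axis=0)
    h2 = np.abs(B / A) ** 2
    bands = {"VLF": (0.0, 0.025), "LF": (0.0, 0.075), "HF": (0.075, 0.1)}
    return {k: float(h2[(f >= lo) & (f <= hi)].sum()) for k, (lo, hi) in bands.items()}
```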
6.3.6 Classification
In order to compare the classification results with these newly proposed features against
that attained only for NIRS signals, procedures identical to chapter 5 were used for
labeling the data and classification. The trials with Brown noise (BN) were separated,
and the rest of the data were partitioned according to arousal and valence ratings. For
the analysis of arousal, the 48 highest rated trials over all four sessions were selected. For
the valence component, the 24 highest positively-rated and 24 highest negatively-rated
trials across all four sessions were selected. The high arousal (HA), positive valence (PV),
negative valence (NV), and Brown noise (BN) trials were labeled accordingly. Arousal
and valence labeling were performed independently.
A classifier based on linear discriminant analysis [40] was used to solve two different
two-class problems (HA vs. BN and PV vs. NV). The classification accuracy was esti-
mated using the average of 50 independent iterations of 10-fold cross-validation. Feature
selection was performed to identify the feature subset that best separated the two classes for each
participant. To measure separability, the Fisher score [40] was used, defined as
the ratio of the squared difference between the class means of a feature to the sum
of its class variances. The Fisher score for
each feature was calculated and the two features with the highest scores were selected
for classification. Classification accuracy was recorded as the correct classification rate.
6.3.7 Mixture of experts
Six linear discriminant analysis-based classifiers [40] were separately trained, each using
exclusively features from one of the feature types shown in Table 6.2, namely time-domain
PFC features, ANS features, template-based features, arx features (input: EDA), arx
features (input: [HbO2]/[Hb]), and AR features. These classifier experts were used to
decide the labels of trials set aside for testing.
The features were randomly segregated into training A and testing sets (shown in Figure
6.4) using a 10-fold cross-validation algorithm. Testing data were set aside for the final
Figure 6.4: Feature segmentation.
testing of the classifier ensemble.
To determine the class label (i.e. wj, j = 1, 2) for the sample x in the testing set, the
classifier decisions were combined using the classifier confidence and the support for x
belonging to each class using a classifier combination algorithm introduced in [121] (see
Figure 6.5). Classifier support was determined using the discriminant function for class
wj (i.e. gj(x)) and transformed into a logistic link function g′j(x) which resulted in a
value ranging from 0 to 1. Larger support values indicated a higher likelihood of class wj. For a two-class problem, the discriminant and logistic link functions are determined as shown in (6.3) and (6.4), respectively.
p(wj|x) = exp(gj(x)) / (exp(g1(x)) + exp(g2(x))),   j = 1, 2.    (6.3)

g′1(x) = 1 / (1 + exp(−g1(x))),   g′2(x) = 1 − g′1(x).    (6.4)
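A minimal sketch of (6.3) and (6.4), assuming scalar discriminant values g1(x) and g2(x) for the two classes (an illustration, not the thesis code):

```python
import math

def posterior(g1, g2):
    """Posterior of eq. (6.3): softmax over the two discriminant values."""
    e1, e2 = math.exp(g1), math.exp(g2)
    return e1 / (e1 + e2), e2 / (e1 + e2)

def logistic_support(g1):
    """Supports of eq. (6.4): logistic link of the class-1 discriminant,
    with the class-2 support as its complement."""
    s1 = 1.0 / (1.0 + math.exp(-g1))
    return s1, 1.0 - s1
```

Since the two-class softmax reduces to a sigmoid of the discriminant difference, `logistic_support(g1 - g2)` reproduces the first component of `posterior(g1, g2)`.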
The classifier support values were arranged in the form of a decision profile (DP(x)).
The DP(x) was composed of dc,j(x) elements which represented the support that classifier
Dc (c = 1, 2, ..., 6) had for class wj, given the vector x from the testing set.
Classifier competence (Gc), which represented the classifier ability to identify class
labels, was determined based on the training A data.

Figure 6.5: A simplified diagram depicting fusion of classifier decisions. Six classifier experts, each preceded by feature selection on one feature type (time-domain NIRS, ANS, template-based, arx with EDA input, arx with [Hb]/[HbO2] input, and AR), feed a fusion stage that produces the output class.

The training A data was partitioned into validation and training B sets in 20 iterations of a 10-fold cross-validation. The
correct classification rates in each iteration and fold were averaged to estimate classifier
competence, Gc. This step resulted in 6 values, one for each feature set, namely, time
domain PFC, ANS features, template-based features, arx features (input: EDA), arx
features (input: [HbO2]/[Hb]), and AR features. To determine Gc (c = 1, 2, ..., 6), in each iteration and fold, one feature was selected using Fisher scores applied to the primary
training set (see Section 6.3.6).
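The competence estimate Gc amounts to averaging correct-classification rates over repeated k-fold partitions; a numpy-only sketch, where `classify` stands in for any train-and-predict routine (a hypothetical interface, not the thesis code):

```python
import numpy as np

def competence(classify, X, y, n_iter=20, k=10, seed=0):
    """Estimate classifier competence G_c as the mean correct-classification
    rate over n_iter repetitions of k-fold cross-validation.
    `classify(X_tr, y_tr, X_te)` must return predicted labels for X_te."""
    rng = np.random.default_rng(seed)
    rates = []
    for _ in range(n_iter):
        idx = rng.permutation(len(y))           # fresh random partition
        for fold in np.array_split(idx, k):
            tr = np.setdiff1d(idx, fold)        # all samples not in the fold
            pred = classify(X[tr], y[tr], X[fold])
            rates.append(np.mean(pred == y[fold]))
    return float(np.mean(rates))
```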
The parameter λ was estimated by finding the real root greater than −1 of the
polynomial shown in (6.5).
1 + λ = ∏_{i=1}^{6} (1 + λGi),   λ ≠ 0.    (6.5)
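A sketch of solving (6.5) numerically: expand the product into polynomial coefficients and keep the real root greater than −1, excluding the trivial root at zero (the function name is illustrative):

```python
import numpy as np

def fuzzy_measure_lambda(G, tol=1e-9):
    """Real root lambda > -1, lambda != 0 of 1 + lambda = prod(1 + lambda*G_i)."""
    # Build prod(1 + G_i * x) as polynomial coefficients (lowest order first).
    poly = np.array([1.0])
    for g in G:
        poly = np.polynomial.polynomial.polymul(poly, np.array([1.0, g]))
    # Move (1 + x) to the left-hand side: prod(...) - 1 - x = 0.
    poly[0] -= 1.0
    poly[1] -= 1.0
    roots = np.polynomial.polynomial.polyroots(poly)
    real = roots[np.abs(roots.imag) < tol].real
    candidates = [r for r in real if r > -1.0 and abs(r) > tol]
    return min(candidates, key=abs) if candidates else 0.0
```

For competence values summing to less than one the root is positive; for a sum greater than one it lies in (−1, 0).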
For a given x, the DP vector corresponding to x for each class was sorted from the high-
est support to the lowest (i.e. [d1,j(x), d2,j(x), ..., d6,j(x)] → [dc1,j(x), dc2,j(x), ..., dc6,j(x)]).
Table 6.2: Features used for training classifier experts

Time domain PFC features: multi-channel time domain features; laterality features.
ANS features: EDA features; skin temperature features; BVP features.
Wavelet-based features: maximum wavelet coefficients across time and scale.
Dynamic-based features: arx features (input: EDA, output: [HbO2]/[Hb]); arx features (input: [HbO2]/[Hb], output: EDA); AR features (EDA, [HbO2], and [Hb]).
The classifier competence values were sorted accordingly (i.e. Gc1(x), Gc2(x), ..., Gc6(x)). The measure Q(k) was calculated recursively as shown in (6.6).
Q(k) = Gck + Q(k − 1) + λ Gck Q(k − 1),   Q(1) = Gc1,   k = 2, ..., 6.    (6.6)
The final degree of support for class wm was determined using a Sugeno integral, shown
in (6.7), and the class with the highest µm(x) value was noted as the ensemble decision
for sample x from the testing set [121].
µm(x) = max_{k=1,...,6} min{dck,m(x), Q(k)},   m = 1, 2.    (6.7)
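Sorting the decision profile, applying the recursion in (6.6), and taking the max of minima in (6.7) can be combined into one routine; a minimal sketch assuming supports in [0, 1] and λ precomputed from (6.5):

```python
import numpy as np

def sugeno_fuse(DP, G, lam):
    """Ensemble decision via eqs. (6.6)-(6.7).
    DP:  (n_classifiers, n_classes) decision profile of supports in [0, 1].
    G:   per-classifier competence values.
    lam: lambda obtained from eq. (6.5)."""
    DP, G = np.asarray(DP, float), np.asarray(G, float)
    mu = []
    for m in range(DP.shape[1]):
        order = np.argsort(DP[:, m])[::-1]   # supports sorted, highest first
        d, g = DP[order, m], G[order]        # competences follow the same order
        Q = g[0]                             # Q(1) = G_{c_1}
        best = min(d[0], Q)
        for k in range(1, len(g)):           # recursion of eq. (6.6)
            Q = g[k] + Q + lam * g[k] * Q
            best = max(best, min(d[k], Q))   # running max-min of eq. (6.7)
        mu.append(best)
    return int(np.argmax(mu)), mu
```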
6.4 Results
ANS data from 61 of the 1440 trials across all participants were lost due to technical issues, and these trials were therefore excluded from the analysis. [HbO2] and [Hb]
signals corresponding to trials for which ANS signals were lost were also excluded from
the analysis.
Figure 6.6: Sample trial with chills (participant 2): EDA recording and estimation, using the average [HbO2] concentrations as the input to the arx model. The fit achieved by the model for the depicted estimation is 52.9%.
6.4.1 Dynamic model-based features
The estimated arx model achieved a fit value exceeding 50% in 70% of trials in the chills
category and 77% of trials in the neutral cases. These results exemplify the ability of the
arx model to capture the interaction dynamics between EDA and [Hb]/[HbO2]. Figure
6.6 depicts a sample ([HbO2]) and the corresponding EDA recording and estimation for
a trial with chills for which the fit value was 52.9%. Figure 6.7 shows the normalized
frequency responses (magnitude and phase) for models with chills and neutral trials for
participant 4. The trials with chills manifested two distinct peaks. The two peaks,
shown in Figure 6.7.A, were observed in all other participants. However, the location of these peaks varied across participants. Neutral-rated trials manifested a low-pass filter property with a single peak similar to that shown in Figure 6.7.B.
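The fit values quoted above are consistent with the NRMSE-based percent fit reported by standard system-identification tools; a numpy-only least-squares sketch of an ARX estimate and this fit measure (the model orders here are illustrative, not the thesis settings):

```python
import numpy as np

def fit_arx(u, y, na=2, nb=2):
    """Least-squares ARX sketch: y[t] = sum_i a_i*y[t-i] + sum_j b_j*u[t-j].
    Returns the one-step-ahead prediction and the percent fit
    100*(1 - ||y - y_hat|| / ||y - mean(y)||)."""
    u, y = np.asarray(u, float), np.asarray(y, float)
    n = max(na, nb)
    # Regressor rows: [y[t-1..t-na], u[t-1..t-nb]]
    rows = [np.r_[y[t - na:t][::-1], u[t - nb:t][::-1]] for t in range(n, len(y))]
    Phi, target = np.array(rows), y[n:]
    theta, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    y_hat = Phi @ theta
    fit = 100.0 * (1.0 - np.linalg.norm(target - y_hat)
                   / np.linalg.norm(target - target.mean()))
    return y_hat, fit
```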
6.4.2 Classification results
Table 6.3 summarizes the ANS-based classification results for HA vs. BN and PV vs.
NV. Clearly, the ANS-based results varied across participants.
The mixture of experts classification rate is presented in Table 6.4. Tables 6.5 and 6.6
Figure 6.7: Sample scaled frequency response estimated for (A) chilling and (B) neutral trials for participant 4. The magnitude of the frequency response was normalized by dividing the results by the total power of the signal over the entire frequency range.
Table 6.3: Classification accuracy in % determined using ANS features for solving the HA vs. BN and PV vs. NV classification problems

Participant | HA vs. BN (%) | PV vs. NV (%)
1  | 59.8 ± 1.5 | 64.5 ± 1.7
2  | 53.0 ± 1.7 | 59.8 ± 2.5
3  | 77.3 ± 1.1 | 59.8 ± 2.8
4  | 51.5 ± 1.4 | 62.7 ± 2.5
5  | 55.6 ± 1.5 | 55.5 ± 2.0
6  | 58.7 ± 1.6 | 55.9 ± 2.0
7  | 55.4 ± 1.2 | 46.4 ± 2.2
8  | 58.1 ± 1.5 | 54.2 ± 2.3
9  | 54.3 ± 1.3 | 53.2 ± 2.3
10 | 69.8 ± 1.4 | 60.1 ± 2.4
summarize the dynamic model-based results for the two classification problems, namely
PV vs. NV and HA vs. BN.
6.5 Discussion
Many physiologically-based emotion identification efforts have included electromyogram
sensors to capture muscle activity due to emotions (e.g. muscle contractions resulting
from facial expression) [244, 66, 100]. For example, Picard et al. achieved an accuracy
of 81% in differentiating eight emotional states (neutral, anger, joy, grief, hate, romantic
love, reverence, platonic love) using features from facial electromyography, BVP, EDA, and respiration [172].
Table 6.4: Classification accuracy in % determined using the mixture of experts for solving the HA vs. BN and PV vs. NV classification problems

Participant | HA vs. BN (%) | PV vs. NV (%)
1  | 83.8 ± 0.8 | 85.1 ± 0.9
2  | 75.9 ± 1.0 | 58.4 ± 2.6
3  | 91.9 ± 0.5 | 57.5 ± 2.2
4  | 58.6 ± 1.7 | 49.2 ± 2.3
5  | 60.9 ± 1.4 | 64.5 ± 1.7
6  | 59.7 ± 1.5 | 55.7 ± 1.7
7  | 51.8 ± 1.4 | 42.8 ± 2.3
8  | 71.5 ± 1.2 | 71.4 ± 1.8
9  | 60.7 ± 1.6 | 58.8 ± 2.2
10 | 69.55 ± 1.2 | 58.9 ± 2.4
Table 6.5: Classification accuracy in % for each participant when classifying HA vs. BN, using dynamic-based features (i.e. AR, arx (a) input: EDA, and arx (b) input: [HbO2]/[Hb]) and template-based features

Participant | AR (%) | arx (a) (%) | arx (b) (%) | Template-based (%)
1  | 56.0 ± 1.6 | 58.1 ± 1.6 | 44.4 ± 1.2 | 56.5 ± 1.6
2  | 47.7 ± 1.0 | 65.6 ± 1.4 | 48.0 ± 1.6 | 64.5 ± 1.6
3  | 81.7 ± 1.1 | 65.1 ± 1.1 | 61.1 ± 1.4 | 85.6 ± 0.8
4  | 63.9 ± 1.1 | 45.2 ± 1.2 | 49.1 ± 1.3 | 54.2 ± 1.1
5  | 56.2 ± 2.1 | 56.5 ± 1.6 | 57.1 ± 1.3 | 59.0 ± 1.4
6  | 63.8 ± 1.1 | 55.8 ± 1.4 | 59.0 ± 1.0 | 42.0 ± 1.1
7  | 46.0 ± 1.7 | 47.7 ± 1.6 | 51.6 ± 2.0 | 48.2 ± 1.4
8  | 43.1 ± 1.3 | 54.7 ± 1.9 | 45.1 ± 1.3 | 61.6 ± 1.5
9  | 66.9 ± 1.5 | 62.4 ± 1.5 | 44.7 ± 1.4 | 61.4 ± 1.1
10 | 59.8 ± 1.2 | 55.2 ± 1.7 | 54.9 ± 1.7 | 63.8 ± 1.4
Other studies have exclusively focused on ANS activity sensors. In a study involving
electrocardiogram (ECG), Agrafioti et al. [1] achieved accuracies up to 89% in differen-
tiating valence, and reported between-subject variability in classification results. Using
EDA, temperature, BVP and ECG monitors, Kim et al. achieved 78% accuracy in differentiating anger, sadness and stress [102], and also indicated differences in correct classification rates among participants. These findings confirm differences in correct identification rates
across individuals, which was also observed in the current investigation.
Autonomic nervous system activity patterns may vary across individuals. For example, the EDA response magnitude due to sympathetic arousal may be suppressed in some individuals [223]. This phenomenon may explain the variability in ANS-based emotion identification results in Table 6.3.

Table 6.6: Classification accuracy in % for each participant when classifying PV vs. NV, using dynamic-based features (i.e. AR, arx (a) input: EDA, and arx (b) input: [HbO2]/[Hb]) and template-based features

Participant | AR (%) | arx (a) (%) | arx (b) (%) | Template-based (%)
1  | 46.3 ± 2.4 | 52.3 ± 1.6 | 59.9 ± 1.7 | 59.0 ± 2.5
2  | 46.4 ± 1.9 | 49.4 ± 1.8 | 54.1 ± 1.8 | 59.3 ± 1.8
3  | 51.9 ± 1.9 | 47.3 ± 2.4 | 69.4 ± 1.6 | 56.2 ± 2.1
4  | 44.1 ± 2.2 | 54.0 ± 2.2 | 39.4 ± 1.4 | 46.7 ± 2.4
5  | 64.3 ± 2.6 | 52.1 ± 1.8 | 53.2 ± 2.2 | 57.8 ± 2.1
6  | 47.3 ± 2.2 | 55.2 ± 2.1 | 51.7 ± 2.3 | 45.6 ± 1.5
7  | 46.5 ± 2.4 | 46.4 ± 2.5 | 47.0 ± 1.6 | 37.9 ± 1.6
8  | 53.3 ± 1.9 | 59.7 ± 2.1 | 55.4 ± 2.1 | 62.0 ± 2.3
9  | 48.0 ± 2.2 | 49.7 ± 2.6 | 40.4 ± 2.4 | 61.1 ± 2.5
10 | 43.2 ± 1.9 | 59.2 ± 1.6 | 57.8 ± 2.3 | 47.8 ± 1.9
Various features such as ANS-based, dynamic model-based or template-based features
may not be equally useful for identifying emotions. For example, for a particular partici-
pant, dynamic model features may result in low accuracies in HA vs. BN differentiation
while ANS features lead to higher identification accuracies (see results for participant
10), but for another individual, the opposite may be true (see results for participant
3). The multi-modal mixture of experts, used in this study, automatically accounted
for this variability by estimating the classifier competence which ultimately assessed the
usefulness of the feature set. Combining classifier decisions maintained or improved the
HA vs. BN classification results in only three participants (i.e. participants 3,6 and 8)
when compared to results obtained exclusively using NIRS features (see Table 5.2). The
PV vs. NV correct classification results were generally (with the exception of participant
1) lower (comparing Tables 5.3 and 6.4). Previous studies have indicated differences between
arousal and valence detection accuracies. For example, using skin conductivity, blood
volume pressure, respiration and an electromyogram, Healey et al. found that valence
differentiation was less accurate than arousal differentiation [72].
Compared to the results obtained in Chapter 5 (i.e. Tables 5.2 and 5.3), the accuracies obtained using PFC hemodynamic-based features alone were generally higher than those of the combination of classifiers based on PFC hemodynamic and ANS features. The average results from the classifier combination shown in Table 6.4 were 66% for HA vs. BN and 60% for PV vs. NV, which are lower than the results achieved using NIRS features exclusively
(see chapter 5). However, including autonomic nervous system activity features may help
improve emotion identification in a subset of individuals. Future studies involving larger
sample sizes, which are more likely to represent various ANS response phenotypes, may
help identify whether the multi-modal approach exceeds the performance achieved using
PFC NIRS features alone.
6.6 Conclusion
In this chapter, a multi-modal ensemble of classifiers was used to differentiate highest
arousal rated trials from brown noise (HA vs. BN), and most positive rated trials from
most negative rated trials (PV vs. NV). Each classifier in the ensemble was trained
by exclusively using features from ANS or PFC hemodynamics. Novel dynamic-based
features were introduced and demonstrated potential in arousal differentiation.
The classification results varied across participants. In particular, the classifier en-
semble was capable of maintaining or improving upon the results achieved using only
PFC hemodynamics in 3 participants for the HA vs. BN problem. However, the valence
differentiation rate was lower than those achieved with PFC hemodynamics alone.
Chapter 7
Concluding remarks
7.1 Summary of contributions
This thesis made several contributions to the field of rehabilitation engineering, specifi-
cally, in the area of affective brain computer interfaces. In summary, the results of this
thesis illustrated the feasibility of emotion identification using prefrontal cortex (PFC)
near infrared spectroscopy (NIRS) in response to a dynamic emotion induction method
(i.e. music). The specific contributions are listed in this chapter.
7.1.1 A literature appraisal of the existing evidence for the use
of BCI for individuals with disabilities [143]
The existing evidence for the use of brain computer interfaces (BCIs) involving indi-
viduals with disabilities was critically appraised. This literature review resulted in the
identification of current challenges surrounding BCI use for individuals with severe dis-
abilities. In addition, important recommendations for future studies were made, including consideration of user state and involvement of the pediatric population. These recommendations may benefit future BCI research efforts in realizing more user-accommodating systems suitable for the target population (i.e. individuals with severe disabilities).
7.1.2 PFC [Hb] and [HbO2] patterns characterization using wavelet
analysis with respect to emotional arousal and valence
[142]
Regional PFC [Hb] and [HbO2] activity was characterized using wavelet peak detection.
This algorithm allowed identification of hemodynamic characteristics with respect to
the arousal and valence dimensions of emotions. In addition to hemodynamic response
magnitude, the wavelet peak detection method allowed investigation of the speed of
hemodynamic response (i.e. using the scale at maximum wavelet coefficient). Intense
negative emotional ratings were found to be generally related to heightened changes
in [HbO2]. These findings warranted further investigation of PFC NIRS for emotion
identification, particularly when using dynamic emotion induction methods such as music.
7.1.3 Identified emotional arousal and valence in response to
dynamic emotion induction using PFC NIRS [144]
Using time domain and laterality features extracted from the PFC NIRS, the highest
arousal rated trials were differentiated from trials with brown noise (HA vs. BN) with
an average accuracy of 71%. Similarly, in differentiating the most positively rated trials
from most negatively rated trials (PV vs. NV), an average accuracy of 71% was achieved.
The 10-fold cross-validation used for classifier training and testing simulated single-trial
identification of arousal and valence and provided further evidence for the use of PFC
NIRS as a means of emotion identification.
7.1.4 Introduced features based on dynamic modeling for emo-
tion identification
Using dynamic modeling, additional features were introduced for solving the HA vs.
BN and PV vs. NV classification problems. Dynamic modeling was used for capturing
PFC NIRS and EDA signal dynamics. In addition, the interaction dynamics between
[Hb]/[HbO2] and EDA were captured using an arx model. Unlike previous emotion iden-
tification efforts which exclusively used autonomic or central nervous system signals for
identifying emotions, the arx model captured the interaction between PFC hemodynam-
ics and EDA. Despite variability across participants, features extracted from arx models achieved accuracies of up to 81% in differentiating arousal.
7.1.5 Multi-modal emotion identification using a mixture of
classifier experts
A multi-modal mixture of experts exclusively trained using ANS and PFC hemody-
namic features was implemented for emotion identification. The classifier combination
automatically accounted for the variability across participants by estimating classifier
competence and taking classifier confidence into account. The mixture of experts was
capable of improving HA vs. BN identification in three participants.
7.2 Recommendation for future studies
7.2.1 Assessing PFC hemodynamics for emotion identification
in the pediatric population and individuals with severe
disabilities
The results of the current thesis have established grounds for future emotion identifi-
cation efforts involving individuals with severe disabilities. In particular, the pediatric
population with severe motor disabilities who may not be able to use other BCI systems
due to developmental delays, limited expressive communication and unknown levels of
receptive communication may benefit from NIRS-based emotion identification systems.
Emotional response may be an intuitive and more natural means of communication for
children with severe disabilities.
Future studies involving typically developing children and children with severe dis-
abilities should consider frontal cortex development in the target age group [36]. The
current results were achieved based on data from adults for whom the prefrontal cortex
is fully developed. The next step would be to test the proposed system with typically
developing children to investigate the feasibility of arousal and valence identification in
different prefrontal cortex developmental stages. This step will inform studies involving
children with disabilities for whom the prefrontal cortex is unlikely to be affected by the
clinical condition. Ultimately, the system may be tested with children with conditions
which may affect the prefrontal cortex.
Specific changes to current study design may be necessary for testing the system for
the pediatric population. For example, music excerpts used may need to be adjusted for
children (e.g. by using more simplified musical structures). Previous studies involving
music-induced emotions in children may be useful in identifying excerpts suited for inducing emotions in children (e.g. [81, 82]). Including self-selected excerpts may not
be feasible due to difficulty in identifying personal preference in children with regards
to music. More simplified emotion rating paradigms should be considered to facilitate
emotional ratings by children (e.g. using facial gestures as rating items [81]).
7.2.2 Potential clinical implications
The system proposed in this thesis may serve as a passive brain computer interface
for detecting emotional state in nonverbal individuals with severe motor disabilities.
Knowledge of the emotional state may facilitate clinical decisions. For example, by
assessing the emotional response to various interventions, the care-givers and clinicians
may devise improved care strategies.
Physiologically-based emotion identification has been effective in situational interpretations within clinical settings. In a study involving ten children with disabilities, autonomic nervous system activity was monitored during interaction with therapeutic clowns
compared to television exposure in the complex continuing care at Holland Bloorview
Kids Rehabilitation Hospital [105]. The results indicated a significant difference between
therapeutic clown intervention and exposure to television [105]. Similarly, the results
of the current thesis may lead to augmented awareness regarding the patient state by
providing a means for ongoing bed-side monitoring of emotional state using prefrontal
cortex activity.
Magnetic resonance imaging technology offers improved spatial resolution compared
to NIRS and allows monitoring of deeper brain regions not accessible by NIRS. However,
the use of magnetic resonance imaging technology may trigger anxiety and discomfort
to the extent that sedation may be required [150]. Unlike magnetic resonance imaging,
NIRS is suitable for long-term bedside monitoring, particularly in children and infants.
Therefore, future studies of emotion using NIRS of the prefrontal cortex may shed light
on emotional understanding in children of various age groups.
7.2.3 Dynamic emotional rating paradigms
Emotions may appear as transient phenomena during the emotion induction period.
For example, emotions during initial presentation of a musical piece may be different
from those manifested as the music unfolds. Therefore, the next step for studies in-
volving dynamic emotion induction (e.g. music or videos) and the brain is to consider
implementing experimental paradigms that support dynamic emotional rating. Dynamic
emotional ratings will enable the study of the temporal dynamics of emotion using PFC
hemodynamics. Ultimately, investigating the temporal dynamics of emotions with re-
spect to PFC hemodynamics will facilitate emotion decoding in real-life settings where
emotions can be manifested at any point in time.
7.2.4 Emotional sensitivity measures
Due to differences in emotional sensitivity among individuals, future studies should con-
sider emotional intelligence assessments prior to each recording session. Petrides et al.
[169] have shown that individuals with high trait emotional intelligence respond faster
and show more sensitivity in an emotion induction paradigm. These individual differences
may explain the variability in the emotion identification success rate across participants.
In this way, including a measure of emotional sensitivity in addition to the self-reported
ratings may be useful for future investigations involving physiologically-based emotion
identification.
7.2.5 Individual specific analysis
The subject-specific feature selection algorithm has been used for demonstrating the
feasibility of the proposed affective BCI. Previous emotion identification efforts have used
similar approaches due to individual differences in the physiological response to emotions
[172]. However, for this system to be used for individuals with severe motor disabilities,
the most informative features need to be identified. Due to the large variability in the
physiological response to emotions, including large participant cohorts is necessary to
capture various response phenotypes before global features can be identified and used
in studies involving individuals with severe disabilities. Given the limited sample size,
introducing global features was not feasible in the current study. Future studies with
larger sample size may help identify features that can robustly identify emotions across
individuals. Another approach may be to implement adaptive feature selection where the
feature set can be optimized based on individual response phenotypes.
7.2.6 Inclusion of larger sample sizes
The results in this thesis were reported for a sample of 10 able-bodied adults. Given the extent of variability observed in identification accuracy across individuals, this sample size may limit the generalizability of the results to larger populations. To account for
the individual differences, individual feature selection and classification algorithms were
used. In addition, a mixed model was used for statistical analysis to account for the
limited sample size. However, future investigations should consider larger sample sizes
to account for different physiological phenotypes that may exist among individuals.
Appendix A: Open Challenges Regarding Control Mechanisms
Studies involving individuals with disabilities have demonstrated various EEG con-
trol mechanisms. Each control mechanism has challenges and merits with respect to
habituation, required training period, response rate, fatigue, and cognitive awareness.
Exploring subject-specific control, performance predictors, alternative control mecha-
nisms, and self-paced BCI designs can help ameliorate current BCI technologies.
• Habituation and Response Rate
P300-based BCI may be affected by habituation. In particular, there are reports
of P300 peak magnitude and latency changes with repeated exposure to stimuli
[122]. Alternatively, SMR and SCP-based BCIs are not reported to be affected
by habituation. Based on the bit rates achieved in the reviewed articles, SSVEP-
based systems provide the fastest information transfer rate among the four control
mechanisms.
• Training and Fatigue
BCI systems based on evoked responses (P300 and SSVEP) require very little train-
ing for the participants as these responses are naturally occurring. In contrast, it
generally takes several training sessions for a user to learn to modulate spontaneous
EEG patterns. Despite the benefits of SSVEP-based systems with respect to train-
ing and transfer rate, the low-frequency flickering stimuli used by these systems are fatiguing to the eyes and may induce photo-epileptic seizures in the photo-sensitive
population [49].
• Subject-specific EEG-based Control
Studies with able-bodied individuals have indicated that the ability to generate
various EEG patterns is user dependent. For example, of the 81 participants eval-
uating a P300-based speller nearly 3% did not produce any correct characters [64].
Similarly, only 19% of 99 participants using a SMR-based BCI achieved accuracies
above 80% [64]. These results suggest that some users may not be able to generate
EEG patterns to control a particular type of BCI. This issue, however, has not
been investigated in individuals with disabilities.
• Lack of Predictive Indicators of Performance
Due to the large amount of financial and time-related resources often required to
conduct studies involving the target population, it would be beneficial to develop
predictive indicators of success with given control mechanisms. One such predictive
measure is initial performance with a control mechanism. Using an SCP-based BCI,
Neumann et al. found that initial performance was related to performance in later
attempts in five patients with ALS [154]. In a later study, Kübler et al. [116] found
that initial performance was moderately correlated with the performance in the
advanced training sessions. Both studies were conducted with SCP-based BCIs.
• Limited Scope of Mental Tasks
Studies of BCIs based on spontaneous responses have focused exclusively on SMR
and SCP for use in individuals with disabilities. Several other mental tasks such
as language and arithmetic have also been shown to induce distinctive EEG pat-
terns in able-bodied individuals (Millán et al., 2002; Roberts & Penny, 2000). Despite
the cognitive load imposed by these BCIs, they may have merits as BCI control
mechanisms for the target population. To the best of our knowledge, BCIs based
on language and arithmetic mental tasks have not been tested by the target population.
• System-paced Versus Self-paced
The majority of the reviewed BCIs require the user to generate EEG patterns
when cued by the system. This limits the user’s ability to control initiation and
duration of the mental task, a restriction that may hinder system practicality as
an independently controlled communication device. One way to overcome this
limitation is to develop self-paced (asynchronous) BCIs with a no control state
[139]. This can be accomplished through machine learning techniques that allow
detection of specific EEG patterns at any point in time [139, 140]. Leeb et al. (2007)
developed such a system for controlling a wheelchair in the virtual environment and
reported successful operation by an individual with SCI [127].
• Performance Evaluation
The reviewed articles mainly focused on traditional measures of BCI performance,
namely, speed and accuracy. These measures, however, must be appropriately mod-
ified when used to evaluate system performance with individuals with disabilities
[115]. Specifically, performance evaluation should consider the context in which
the system operates. According to the International Classification of Function-
ing, Disability and Health (ICF) (World Health Organization, 2001), this context
includes personal factors such as the nature of the disability, as well as environmen-
tal factors (physical, social, and attitudinal issues). Personal factors relating to the
nature of the disability are important in evaluating BCI suitability. In particular,
severity of the disability may affect BCI performance. For example, while BCIs
have been successfully used by individuals with incomplete locked-in syndrome, the authors of [112] reported that basic communication could not be restored in any of the participants with complete locked-in syndrome. Further study of different locked-in
syndrome sub-types can help identify the population which can most benefit from
BCI use. Another important personal factor is the possible improvement or decline
in function. Specifically, the extent of available communication function is a critical
personal factor. In this light, BCI speed is only a limited measure of performance
gains over other communication means available to the user. For example, while a
BCI may be much slower than speech or muscle activated switches, it may provide
a functional means of communication in the absence of extant muscle control.
• Neuroethics and responsible dissemination to media
With the ubiquity of BCI research, neuroethical concerns are materializing [188],
particularly around the breach of user privacy [46]. Further, many potential BCI
users face communication difficulties due to severe disabilities (e.g. conditions re-
sulting in LIS). Consequently, there are many challenges in reliably obtaining and
interpreting the user’s informed consent for participation in BCI research. In-
terested readers are referred to Haselager et al. (2009) [71]. Researchers should
exercise special care when communicating with caregivers and potential BCI users.
Because the reality of BCI research is often not well-portrayed by the media, users
and care-givers may formulate expectations beyond what is feasible. To manage
expectations, researchers must avoid "over-hyping the significance of their findings" [57, 71]. In a recent study, Nijboer, Clausen, Allison and Haselager (2011) [158] published results of a survey in which more than 80% of 144 BCI researchers acknowledged the importance of active participation of researchers in separating factual and
fictional statements published in the media. In addition, "85.8% of the participants recommended ethical guidelines specific to BCI research and use within five years".
Until such guidelines exist, researchers can prevent user and care-giver frustration
and disappointment by realistically presenting the expected outcomes, as well as
risks and complications surrounding BCI technology.
Appendix B: Music Database
Music excerpts used for emotion induction were selected from a variety of different genres of music. Previous studies of emotion using music were consulted in creating the music database. In addition, motion picture soundtracks were included due to their ability to induce emotions. Table 1 lists the music excerpts included in the common database.
Each participant selected 6 music excerpts prior to data collection. These songs,
listed in Table 2, were chosen by each participant for inducing intense positive or negative emotions. Some participants selected identical music excerpts independently of each other.
In addition, although participants had no prior knowledge of the common music database, a number of its pieces also appeared among the self-selected songs.
Table 1: The list of music pieces included in the common music database
Title – Composer/Artist

Caribbean blue – Enya
Arajuez – Andrea Bocelli
Sirens – Police, "Natural born killers" motion picture
First youth – Ennio Morricone, "Cinema paradiso" motion picture
Bachehaye alp – Mohsen Alizadeh, "Dans les Alp" motion picture
Can't take my eyes off of you – The Everly Brothers
Sur le fil – Yann Tiersen, "Le Fabuleux Destin d'Amelie Poulain" motion picture
Can can – Jacques Offenbach
Goodbye Lenin – Yann Tiersen, "Goodbye Lenin" motion picture
Kinderszenen – Robert Schumann
Agnus dei – Samuel Barber
Just the way you are – Bruno Mars
Nocturne No. 20 in C sharp minor – Frederic Chopin
Halo – Beyonce
Adagio, G minor – Tomaso Albinoni
La noyee – Yann Tiersen, "Le Fabuleux Destin d'Amelie Poulain" motion picture
The man who sold the world – Nirvana
Cello Suite No. 1, Prelude – Johann Sebastian Bach
Concerto No. 3 in F major, Op. 8, Allegro – Antonio Vivaldi
All that I am living for – Evanescence
Bella Ciao – Yves Montand
The mission – Ennio Morricone, "The mission" soundtrack
Les millionnaire du dimanche – Enrico Macias
One day – Matisyahu
Hasta Siempre Comandante – Buena Vista Social Club
The Lion Sleeps Tonight – The Tokens, "The Lion King" motion picture
La vieille barque – Mireille Mathieu
Fireworks – Katy Perry
Nothing else matters – Metallica
Alp – Mohsen Alizadeh, "Dans les Alp" motion picture
Waka Waka (This Time for Africa) – Shakira
Les roi du monde – Philippe d'Avilla, Damien Sargue and Gregori Baquet, "Romeo et Juliette" musical
Lullaby – Javier Navarrete, "Pan's Labyrinth" motion picture
Unforgiven III – Metallica
The Winner Takes It All – Abba
Con Te Partiro – Andrea Bocelli
c'est peut etre des ange – Gerard Lenorman
Malena – Ennio Morricone, "Malena" motion picture
If I had a hammer – Peter, Paul and Mary
je t'aime – Lara Fabien
Haven't met you yet – Michael Buble
Yesterday – The Beatles
To the beat of my heart – Hilary Duff
Hit the road Jack! – Ray Charles
Histoire D'un Amour – Dalida
When our wings are cut can we still fly? – Gustavo Santaolalla, "21 grams" motion picture
Habanera – Georges Bizet, "Carmen" opera
One day I'll fly away – Nicole Kidman, "Moulin Rouge" motion picture
Concerto No. 1 in E major, Op. 8, Allegro – Antonio Vivaldi
Sari gelin – Composer unknown (Armenian, Azerbaijani, Persian, and Turkish folk song)
Empire State of Mind – Jay Z, Alicia Keys
Wonderful life – Black
Scarborough fair – Simon and Garfunkel
Por una cabeza – Carlos Gardel, featured in "The Scent of a Woman" motion picture
Je suis malade – Alice Dona and Serge Lama
Cloud song – Riverdance
Send me an angel – The Scorpions
Cinderella – Steven Curtis Chapman
Don't dwell – Tracy Chapman
I will survive – Gloria Gaynor
I wanna hold your hand – The Beatles
Moon river – Audrey Hepburn, "Breakfast at Tiffany's" motion picture
You are loved – Josh Groban
Verone – "Notre Dame de Paris" musical
Don't cry for me Argentina – Sinead O'Connor cover, "Don't Cry for me Argentina" motion picture
The voice – Celtic Women
Slow me down – Emmy Rossum
Zombie – The Cranberries
Apres une reve (Op. 7, No. 1) – Gabriel Faure
Dani California – Red Hot Chili Peppers
Caruso – Lucio Dalla
Waltz, Swan Lake ballet – Pyotr Ilyich Tchaikovsky
Table 2: The list of self-selected music pieces
Title | Composer/Artist
Iris | The Goo Goo Dolls
Tears in heaven | Eric Clapton
Requiem, "Dies Irae" | Wolfgang Amadeus Mozart
Untitled | Sigur Ros
Ain't no mountain high enough | Marvin Gaye and Tammi Terrell
Theme from Schindler's List | John Williams, "Schindler's List" motion picture
Julien | Placebo
How to save a life | The Fray
Nocturne No. 20 in C sharp minor | Frederic Chopin
Virtual insanity | Jamiroquai
Little Town | Paige O'Hara, Richard White and chorus, "Beauty and the Beast" motion picture
A world of our own | The Seekers
Veronica Sawyer smokes | Crash Love
Tall trees | Matt Mays and El Torpedo
Cello Suite No. 1, Prelude | Johann Sebastian Bach
Hallelujah | Jeff Buckley
News bar | Charlie Clouser
Un petit peu d'air | Felipecha
Grand valse brillante | Frederic Chopin
That's how you know | Amy Adams, "Enchanted" motion picture
He wasn't man enough for me | Toni Braxton
Man! I feel like a woman | Shania Twain
Mad world | Gary Jules
Be our guest | Chorus, "Beauty and the Beast" motion picture
Through and through and through | Joel Plaskett
So close | Jon McLaughlin, "Enchanted" motion picture
Human nature | Michael Jackson
Way over yonder in the minor key | Billy Bragg and Wilco
Everyday will be like a holiday | William Bell (RZA remix)
Give me Jesus | Fernando Ortega
Don't forget about me | Chris Kirby
I like it | Julio Iglesias
Love you | Free Design
Au parc | Chiara Mastroianni
Candle in the wind | Elton John
Red sun | Neil Young
Blower's daughter | Damien Rice
Bookends | Simon and Garfunkel
All I do is win | DJ Khaled
Hit the road Jack! | Ray Charles
Appendix C: Music characteristic extraction using MIRTOOLBOX
Music characteristics used in this thesis were sound pressure level, mode, and dissonance; the extraction of each feature is described in detail in this appendix. The interested reader is referred to [89] for more details regarding music characteristic extraction.
• Sound pressure
The sound pressure level (Equation 1) is a logarithmic measure of sound pressure (ρrms), expressed in decibels (dB) above a standard reference level (ρref = 2 × 10^-5 Pa):

L(dB) = 20 log10(ρrms / ρref)    (1)
The sound pressure level waveform indicates the volume changes throughout the
music excerpt, and is extracted from the music waveform.
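As an illustration of Equation 1, the sound pressure level can be computed frame by frame from a sampled waveform. The sketch below is not the MIRTOOLBOX implementation; the frame length, hop size, and the assumption of a pressure-calibrated (pascal-valued) signal are illustrative choices.

```python
import numpy as np

P_REF = 2e-5  # standard reference pressure, 2 x 10^-5 Pa

def sound_pressure_level(waveform, frame_len=1024, hop=512):
    """Frame-wise sound pressure level in dB, following Equation 1.

    `waveform` is assumed to be a 1-D array calibrated in pascals;
    for uncalibrated audio the output is only a relative level.
    """
    levels = []
    for start in range(0, len(waveform) - frame_len + 1, hop):
        frame = waveform[start:start + frame_len]
        p_rms = np.sqrt(np.mean(frame ** 2))        # RMS pressure of the frame
        levels.append(20.0 * np.log10(p_rms / P_REF))
    return np.array(levels)

# A 1 kHz tone with an RMS pressure of 1 Pa corresponds to roughly 94 dB SPL.
t = np.arange(0, 1.0, 1.0 / 8000.0)
tone = np.sqrt(2.0) * np.sin(2 * np.pi * 1000 * t)  # peak sqrt(2) Pa -> RMS 1 Pa
spl = sound_pressure_level(tone)
```

The sequence of frame levels returned here is the kind of volume-change curve described above.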
• Dissonance
The MIRTOOLBOX [124] estimates dissonance using a method proposed by Plomp and Levelt (1965) [176], which determines sensory dissonance by identifying pairs of sinusoidal components that appear close in frequency; the frequency ratio of each pair of sinusoids was used to quantify its dissonance. The total dissonance was determined by computing the peaks of the spectrum and averaging the dissonance over all possible pairs of peaks [205].
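The pairwise procedure can be sketched as follows. This is a hedged illustration using Sethares' common parameterization of the Plomp and Levelt dissonance curve; the constants are assumptions and may differ from the exact values used in the MIRTOOLBOX.

```python
import numpy as np

def pair_dissonance(f1, f2, a1=1.0, a2=1.0):
    """Sensory dissonance of two partials (Sethares' fit of the
    Plomp-Levelt curve; the constants are illustrative assumptions)."""
    f_low, f_high = min(f1, f2), max(f1, f2)
    s = 0.24 / (0.021 * f_low + 19.0)   # scales the curve to the critical band
    x = s * (f_high - f_low)
    return a1 * a2 * (np.exp(-3.5 * x) - np.exp(-5.75 * x))

def total_dissonance(peak_freqs, peak_amps):
    """Average pairwise dissonance over all spectral peak pairs."""
    n = len(peak_freqs)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    return float(np.mean([pair_dissonance(peak_freqs[i], peak_freqs[j],
                                          peak_amps[i], peak_amps[j])
                          for i, j in pairs]))

# A minor second (440 vs 466.2 Hz) scores as far more dissonant than an octave.
d_minor2 = total_dissonance([440.0, 466.2], [1.0, 1.0])
d_octave = total_dissonance([440.0, 880.0], [1.0, 1.0])
```

In practice the peak frequencies and amplitudes would come from a spectral peak picker applied to each analysis frame of the music excerpt.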
• Mode
The MIRTOOLBOX [124] determines the mode (i.e. major/minor) using the key strength value, which represents the probability of each candidate key. This probability is determined by cross-correlating the chromagram with profiles representing each possible tonality [62, 110].
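A minimal sketch of this key-strength computation, assuming the standard Krumhansl-Kessler probe-tone profiles (the MIRTOOLBOX uses related key profiles, so the exact values here are an assumption):

```python
import numpy as np

# Krumhansl-Kessler probe-tone profiles for major and minor keys.
MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                  2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                  2.54, 4.75, 3.98, 2.69, 3.34, 3.17])

def estimate_mode(chroma):
    """Return 'major' or 'minor' for a 12-bin chromagram.

    Key strength = correlation between the chromagram and each of the
    24 rotated key profiles; the mode comes from the strongest key.
    """
    best_major = max(np.corrcoef(chroma, np.roll(MAJOR, k))[0, 1] for k in range(12))
    best_minor = max(np.corrcoef(chroma, np.roll(MINOR, k))[0, 1] for k in range(12))
    return "major" if best_major >= best_minor else "minor"

# A chromagram emphasizing C, E and G (a C major triad) registers as major.
chroma = np.zeros(12)
chroma[[0, 4, 7]] = 1.0
print(estimate_mode(chroma))  # prints: major
```

Rolling the profile by k aligns its tonic with pitch class k, so the 12 rotations of each profile cover all 24 candidate keys.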
Table 3: The significance of the main effect of a. Mode, b. Dissonance, and c. Maximum sound pressure level for each recording site shown in Figure 2.1. (α = 0.05)

Characteristic                 Chromophore   R1   R2   R3        R4   O    L1   L2   L3        L4
a. Mode                        [HbO2]        X    X    X         X    X    X    X    X         X
                               [Hb]          X    X    X         X    X    X    X    X         X
b. Dissonance                  [HbO2]        X    X    X         X    X    X    X    p=0.034   X
                               [Hb]          X    X    X         X    X    X    X    X         X
c. Max sound pressure level    [HbO2]        X    X    p=0.015   X    X    X    X    X         X
                               [Hb]          X    X    X         X    X    X    X    X         X
Appendix D: Region specific analysis of [HbO2] and [Hb] with respect to
music characteristics
In chapter 6, to identify the effect of music characteristics on PFC [HbO2] and [Hb],
these signals were averaged across the nine recording sites (see Figure 2.1). However,
hemodynamic changes may vary across the PFC, and identifying the effect of music
characteristics in each recording location is a meaningful pursuit.
Although considering regional activity patterns was appealing, the limited number of samples available for this analysis (72 samples per participant) motivated the use of the average [HbO2] and [Hb] signals (including each recording site would lead to 18 comparisons per music characteristic). The average [HbO2] and [Hb] signals reliably captured the general pattern of hemodynamic changes. Nevertheless, a separate analysis involving the maximum [HbO2] and [Hb] recordings at each recording site was conducted, and the significance of the effect of each music characteristic is reported in Table 3. The
entries marked with 'X' in Table 3 indicate that the effect of the corresponding music characteristic did not reach significance. The effect of mode did not reach significance (α = 0.05) at any of the recording locations. The effect of dissonance was significant for maximum [HbO2] at location L3, and the effect of maximum sound pressure level was significant at location R3 (see Figure 2.1); these locations correspond to inferolateral PFC regions. However, after applying a Bonferroni adjustment for the nine comparisons per chromophore, which results in α = 0.005, the effect of music characteristics does not reach significance at any of the locations.
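The Bonferroni reasoning above can be checked numerically; the two p-values below are the uncorrected values reported in Table 3 for dissonance (L3) and maximum sound pressure level (R3).

```python
# Nine recording sites are tested per chromophore, so the Bonferroni
# per-test threshold is alpha divided by nine.
alpha = 0.05
n_sites = 9
threshold = alpha / n_sites  # ~0.0056, quoted as 0.005 in the text

# Uncorrected p-values that reached nominal significance in Table 3.
p_values = {"dissonance at L3": 0.034, "max sound pressure level at R3": 0.015}

for name, p in p_values.items():
    verdict = "significant" if p < threshold else "not significant"
    print(f"{name}: p = {p} -> {verdict} after Bonferroni")
```

Neither value falls below the corrected threshold, consistent with the conclusion above.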
Appendix E: Contributions from Systemic Blood Flow
In NIRS hemodynamic monitoring, the near-infrared light must travel through the scalp, skull, and cerebrospinal fluid before it reaches the brain. This passage through the scalp may introduce systemic blood flow components into the detected signals [84].
Assessing contributions from the systemic blood flow (e.g. skin blood flow) in the recorded
signals is an important pursuit for researchers in the field of functional NIRS. Certain recording practices have been shown to reduce the influence of unwanted systemic blood flow; for example, increasing the distance between the light source and detector has been shown to reduce systemic blood flow contributions [58, 59]. The source-detector distance
selected for this study (i.e. r=3cm) was within the recommended range for detecting
cerebral hemodynamic changes. Joint studies using magnetic resonance imaging and
NIRS have confirmed reliable hemodynamic detection at 3.3 cm [195]. The wavelengths
used for detection (i.e. 690 nm and 830 nm) were also shown to reduce the contribution
of the systemic blood flow to the detected signals [196]. In addition to the strategies
used in the current thesis, an important practice for future research in this field would be
inclusion of systemic blood flow monitors such as laser Doppler flowmetry. Using these
sensors, the systemic blood flow contributions can be directly measured and compared
to the NIRS recordings. For example, Hoshi et al. [83] used laser Doppler flowmetry
sensors and demonstrated that there were no task-related changes in the systemic blood
flow, while the NIRS recordings showed significant modulations. Another study involving
60 second exposure to visual stimulus by Villringer et al. [228], also demonstrated that
significant NIRS signal changes were not accompanied by task-related modulations in
systemic blood flow detected using laser Doppler flowmetry. Lack of additional laser
Doppler flowmetry sensors in the current thesis is a limitation that needs to be addressed
in future studies. Although empirical observations such as region-specific changes with
respect to emotional rating suggest a more dominant contribution from the cerebral blood
flow compared to skin blood flow, future studies of emotion using NIRS should consider
including additional skin blood flow sensors (e.g. laser Doppler flowmetry sensors or NIRS
sensors placed less than 0.5 cm apart) to assess systemic contribution to the results.
Appendix F: Cognitive Processing Activity in the Prefrontal Cortex
The prefrontal cortex is engaged during various emotional and cognitive processes.
The reader is referred to Section 1.4.1 for more details regarding the role of the PFC from a network perspective. Because the prefrontal cortex is recruited during such a wide range of processes, activities other than emotional response (e.g. cognitive processing) may have modulated brain activity in this study; unrelated cognitive functions, such as incidental thoughts, could therefore be misrepresented as emotional responses. To reduce the probability of detecting cognitive processes unrelated to emotion, the analysis was conducted with respect to subjective ratings and multiple trials were used (144 trials per individual). Unless unrelated cognitive activities (e.g. distractions or incidental thoughts) are consistently repeated across trials, increasing the number of trials helps to mask them. In addition, assuming that subjective ratings
are a correct representation of one’s emotions, unrelated cognitive tasks occurring during
the trial may be reflected in the ratings (i.e. trials during which distractions occurred
may be rated lower). In addition to cognitive processes unrelated to emotions, there
may be those accompanying emotions. Distinguishing between cognitive and emotional
response may be challenging. For example, one of the mechanisms through which music
can induce emotion is by evoking episodic memories which involves memory retrieval
[92]. Hence, emotional responses can be accompanied by cognitive appraisal. The current study design cannot distinguish cognitive appraisal that accompanies or gives rise to emotional response from the emotional response itself. However, even if such cognitive appraisal was detected during the study, the findings would not be undermined, because detecting the cognitive appraisal that accompanies emotions would still lead to identifying emotions.
Appendix G: Research Ethics
Figure 1: Ethics approval notice
Assessing auditory stimuli presentation modalities in the affective modulation-based human
computer interface
November XX, 2010
Dear Participant,
My name is Saba Moghimi. I am a PhD. student at the University of Toronto. My supervisor,
Professor Tom Chau, and I work in a research team at Bloorview Kids Rehab. We are
investigating a technology that can potentially be used as a communication device for people
who cannot move or speak. Before agreeing to take part in this study, I would like to tell you
how you will be involved.
What is the study about?
Access technologies help people who cannot move or speak to communicate with other people.
Switches and eye trackers are examples of these technologies. Unfortunately, people who cannot
make movements cannot use these switches. To help these people, researchers are investigating
communication devices that are controlled by brain activity.
In this study, we will try to use brain activity and some other body signals to detect emotional
reactions in response to auditory stimulus (music). This study will not help you. This study will
help us design devices that help people with disabilities who cannot speak or express what they
like.
How will I be involved in this study?
To volunteer for this study you must be able to communicate in English. You must also be at
least 18 years old and have normal or corrected-to-normal vision and hearing. Please do not
volunteer for this study if you know you have any of the following conditions: 1) degenerative
disease; 2) cardiovascular disease; 3) metabolic disorders; 4) trauma-induced brain injury; 5)
respiratory conditions; 6) drug- and alcohol-related conditions; and 7) psychiatric conditions.
We will ask you to come in for four sessions over a 3-5 month period. Each session will be about
two hours.
You will be asked not to drink any caffeinated beverages or alcohol an hour before the recording
sessions. We will send you a reminder before each session.
We will put some sensors on your forehead. You can see the sensors in Figure 1. These sensors
can record your brain activity. Do not worry; we will not be able to read your thoughts. We will
also put sensors on your finger to measure your skin temperature, the amount of sweat in your
skin, and your pulse. You can see these sensors in Figure 1.B. We will also ask you to wear a belt
around your chest to record how you breathe. These sensors will not hurt you. You can let the
researcher know if you are uncomfortable and we will remove the sensors. We can stop the
recording or let you take a break if you are tired.
Before the experiment, we will ask you to name a number of music pieces you like. You will
hear the music you told us about and some other music pieces and sounds from the environment.
We will ask you to rate how you felt listening to the music after it plays.
Will anyone know what I say?
Your brain signals and physiological signals will be recorded in a private room. Only you and
the researcher will be present. You can feel free to ask the researcher any questions about the
experiment. All your concerns will be kept confidential. We will not be able to read your mind or
your thoughts with these signals.
All the information that we collect from you will be confidential. All the forms that may have
your information and the data collected from you will be saved on a secure server or in a locked
cabinet. We will not use your name when publishing the results of this study. We will keep your
name and the data collected from you for seven years, and will destroy all the information at the
end of this time. We will not release any information that might identify you without asking for
your consent.
Do I have to do this?
If you decide not to take part in this study, that is okay. If you decide to take part, but change
your mind at any time, that is also okay. You may drop out of the study at any time. Doing this
will not affect your status at Bloorview Kids Rehab or at the University of Toronto.
What are the risks and benefits?
You may get tired during the experiment. We have planned breaks during the session, but you
can ask for additional breaks during the experiment if you wish. You may also get bored or feel
sleepy. Please let us know when you are tired. We will let you take a break.
You will not directly benefit from this study. However, we think that this study will benefit
people who have no means of communication. After the study, we will send you a thank you
letter, and you will also receive a small token of appreciation for your participation.
What if I have questions?
Please ask me to explain anything you don’t understand before signing the consent form. My
phone number is 416-425-6220 x3603. If you leave a message, I will return your call within 48
hours. I can also be reached by email at [email protected].
Thank you for thinking about helping us with this project.
Yours sincerely,
Saba Moghimi
Ph.D Candidate
Bloorview Kids Rehab
Phone: 416-425-6220 x3270
E-mail: [email protected]
Supervisor:
Professor Tom Chau
Bloorview Kids Rehab
E-mail: [email protected]
CONSENT FORM
Holland Bloorview Kids Rehabilitation Hospital
Re: Detecting mental selection on the basis of prefrontal cortical and autonomic nervous system activity
Please complete the form below and return it to the investigator.
The investigator explained this study to me. I read the information letter dated __________________
and understand what this study is about. I understand that I may drop out of the study at any time. I
agree to participate in this study.
______________________________ _________________________ _________
Participant’s Name (please print) Signature Date
______________________________ ___________________________ _________
Researcher’s Name Signature Date
Figure 2: Participant consent form
Bibliography
[1] F. Agrafioti, D. Hatzinakos, and A.K. Anderson. ECG pattern analysis for emotion detection. Affective Computing, IEEE Transactions on, 3(1):102–115, 2012.
[2] C. Ahlstrom, A. Johansson, F. Uhlin, T. Lanne, and P. Ask. Noninvasive in-
vestigation of blood pressure changes using the pulse wave transit time: a novel
approach in the monitoring of hemodialysis patients. Journal of Artificial Organs,
8(3):192–197, 2005.
[3] B.Z. Allison, D.J. McFarland, G. Schalk, S.D. Zheng, M.M. Jackson, and J.R. Wol-
paw. Towards an independent brain-computer interface using steady state visual
evoked potentials. Clinical neurophysiology, 119(2):399–408, 2008.
[4] E. Altenmuller, K. Schurmann, V.K. Lim, and D. Parlitz. Hits to the left, flops
to the right: different emotions during listening to music are reflected in cortical
lateralisation patterns. Neuropsychologia, 40(13):2242–2256, 2002.
[5] F. Babiloni, F. Cincotti, M. Marciani, S. Salinari, L. Astolfi, F. Aloise,
F. De Vico Fallani, and D. Mattia. On the use of brain–computer interfaces outside
scientific laboratories: Toward an application in domotic environments. Interna-
tional review of neurobiology, 86:133–146, 2009.
[6] O. Bai, P. Lin, S. Vorbach, M.K. Floeter, N. Hattori, and M. Hallett. Sensorimotor
beta rhythm-based brain–computer interface. Journal of neural engineering, 5:24–
35, 2008.
[7] R. Bates and HO Istance. Why are eye mice unpopular? A detailed comparison of
head and eye controlled assistive technology pointing devices. Universal Access in
the Information Society, 2(3):280–290, 2003.
[8] G. Bauer, F. Gerstenbrand, and E. Rumpl. Varieties of the locked-in syndrome.
Journal of Neurology, 221(2):77–91, 1979.
[9] T. Baumgartner, M. Esslen, and L. Jancke. From emotion perception to emotion
experience: Emotions evoked by pictures and classical music. International Journal
of Psychophysiology, 60(1):34–43, 2006.
[10] J.D. Bayliss, S.A. Inverso, and A. Tentler. Changing the P300 brain computer
interface. CyberPsychology & Behavior, 7(6):694–704, 2004.
[11] N. Birbaumer. Slow cortical potentials: Plasticity, operant control, and behavioral
effects. The Neuroscientist, 5(2):74, 1999.
[12] N. Birbaumer, T. Elbert, AG Canavan, and B. Rockstroh. Slow potentials of the
cerebral cortex and behavior. Physiological Reviews, 70(1):1, 1990.
[13] N. Birbaumer, N. Ghanayim, T. Hinterberger, I. Iversen, B. Kotchoubey, A. Kubler,
J. Perelmouter, E. Taub, and H. Flor. A spelling device for the paralysed. Nature,
398(6725):297–298, 1999.
[14] N. Birbaumer, A. Kubler, N. Ghanayim, T. Hinterberger, J. Perelmouter, J. Kaiser,
I. Iversen, B. Kotchoubey, N. Neumann, and H. Flor. The thought translation device (TTD) for completely paralyzed patients. IEEE Transactions on Rehabilitation
Engineering, 8(2):190–193, 2000.
[15] A.J. Blood and R.J. Zatorre. Intensely pleasurable responses to music correlate
with activity in brain regions implicated in reward and emotion. Proceedings of the
National Academy of Sciences of the United States of America, 98(20):11818, 2001.
[16] A.J. Blood, R.J. Zatorre, P. Bermudez, and A.C. Evans. Emotional responses to
pleasant and unpleasant music correlate with activity in paralimbic brain regions.
Nature neuroscience, 2:382–387, 1999.
[17] M. Boso, P. Politi, F. Barale, and E. Emanuele. Neurophysiology and neurobiology of the musical experience. Functional Neurology, 21(4):187–191, 2006.
[18] N.C. Brady, J. Marquis, K. Fleming, and L. McLean. Prelinguistic predictors of
language growth in children with developmental disabilities. Journal of Speech,
Language and Hearing Research, 47(3):663, 2004.
[19] G.C. Bruner. Music, mood, and marketing. The Journal of Marketing, pages
94–104, 1990.
[20] R.L. Buckner, J. Sepulcre, T. Talukdar, F.M. Krienen, H. Liu, T. Hedden, J.R.
Andrews-Hanna, R.A. Sperling, and K.A. Johnson. Cortical hubs revealed by
intrinsic functional connectivity: mapping, assessment of stability, and relation to
alzheimer’s disease. The Journal of Neuroscience, 29(6):1860–1873, 2009.
[21] S.C. Bushong. Magnetic resonance imaging. St. Louis, MO (USA); CV Mosby Co.,
1988.
[22] A.H. Buss and R. Plomin. A temperament theory of personality development. Wiley-
Interscience, 1975.
[23] J.J. Campos, R.G. Campos, and K.C. Barrett. Emergent themes in the study
of emotional development and emotion regulation. Developmental Psychology,
25(3):394, 1989.
[24] C.S. Carter, T.S. Braver, D.M. Barch, M.M. Botvinick, D. Noll, and J.D. Cohen.
Anterior cingulate cortex, error detection, and the online monitoring of perfor-
mance. Science, 280(5364):747–749, 1998.
[25] R. Chavarriaga and J. del R Millan. Learning from eeg error-related potentials in
noninvasive brain-computer interfaces. Neural Systems and Rehabilitation Engi-
neering, IEEE Transactions on, 18(4):381–388, 2010.
[26] C. Neuper, G.R. Muller-Putz, R. Scherer, and G. Pfurtscheller. Motor
imagery and EEG-based control of spelling devices and neuroprostheses. Event-
related dynamics of brain oscillations, page 393, 2006.
[27] F. Cincotti, D. Mattia, F. Aloise, S. Bufalari, G. Schalk, G. Oriolo, A. Cherubini,
M.G. Marciani, and F. Babiloni. Non-invasive brain–computer interface system:
Towards its application as assistive technology. Brain research bulletin, 75(6):796–
803, 2008.
[28] C. Collet, E. Vernet-Maury, G. Delhomme, and A. Dittmar. Autonomic nervous
system response patterns specificity to basic emotions. Journal of the autonomic
nervous system, 62(1-2):45–57, 1997.
[29] J. Conradi, B. Blankertz, M. Tangermann, V. Kunzmann, and G. Curio. Brain-
computer interfacing in tetraplegic patients with high spinal cord injury. Int J
Bioelectromagnetism Volume, 11(2):65–68, 2009.
[30] M. Cope. The application of near infrared spectroscopy to non invasive monitoring
of cerebral oxygenation in the newborn infant. Department of Medical Physics and
Bioengineering, University College London, pages 214–9, 1991.
[31] R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis, G. Votsis, S. Kollias, W. Fellenz, and
J.G. Taylor. Emotion recognition in human-computer interaction. Signal Processing
Magazine, IEEE, 18(1):32–80, 2001.
[32] S. Dalla Bella, I. Peretz, L. Rousseau, and N. Gosselin. A developmental study of
the affective value of tempo and mode in music. Cognition, 80(3):B1–B10, 2001.
[33] R.J. Davidson. Emotion and affective style: Hemispheric substrates. Psychological
Science, 3(1):39, 1992.
[34] R.J. Davidson. What does the prefrontal cortex "do" in affect? Perspectives on frontal EEG asymmetry research. Biological Psychology, 67:219–233, 2004.
[35] T. Demiralp et al. Event-related oscillations are real brain responses: wavelet analysis and new strategies. International Journal of Psychophysiology, 39(2-3):91–127, 2001.
[36] M. Dennis. Prefrontal cortex: Typical and atypical development. The frontal lobes:
Development, function and pathology, pages 128–162, 2006.
[37] P.A. Di Mattia, F.X. Curran, and J. Gips. An eye control teaching device for
students without language expressive capacity: EagleEyes. Edwin Mellen Pr, 2001.
[38] E. Donchin, K.M. Spencer, and R. Wijesinghe. The mental prosthesis: assessing the speed of a P300-based brain-computer interface. IEEE Transactions on Rehabilitation Engineering, 8(2):174–179, 2000.
[39] W.C. Drevets, J.L. Price, J.R. Simpson, R.D. Todd, T. Reich, M. Vannier, and
M.E. Raichle. Subgenual prefrontal cortex abnormalities in mood disorders. Nature,
386(6627):824–827, 1997.
[40] R.O. Duda, P.E. Hart, and D.G. Stork. Pattern classification, volume 2. Citeseer,
2001.
[41] A. Duncan, J.H. Meek, M. Clemence, CE Elwell, L. Tyszczuk, M. Cope, and
D. Delpy. Optical pathlength measurements on adult head, calf and forearm and
the head of the newborn infant using phase resolved optical spectroscopy. Physics
in Medicine and Biology, 40:295, 1995.
[42] T. Elbert, N. Birbaumer, W. Lutzenberger, and B. Rockstroh. Biofeedback of slow
cortical potentials: self-regulation of central-autonomic patterns. Biofeedback and
self-regulation, pages 321–342, 1979.
[43] T. Elbert, B. Rockstroh, W. Lutzenberger, and N. Birbaumer. Biofeedback of
slow cortical potentials. I. Electroencephalography and Clinical Neurophysiology,
48(3):293–301, 1980.
[44] A. Etkin, T.D. Wager, et al. Functional neuroimaging of anxiety: a meta-analysis of
emotional processing in ptsd, social anxiety disorder, and specific phobia. American
Journal of Psychiatry, 164(10):1476–1488, 2007.
[45] T.H. Falk, M. Guirgis, S. Power, and T. Chau. Taking nirs-bcis outside the lab:
Towards achieving robustness against environment noise. Neural Systems and Re-
habilitation Engineering, IEEE Transactions on, 19(2):136–146, 2011.
[46] M.J. Farah. Neuroethics: the practical and the philosophical. Neuroethics Publi-
cations, page 8, 2005.
[47] L.A. Farwell and E. Donchin. Talking off the top of your head: toward a men-
tal prosthesis utilizing event-related brain potentials. Electroencephalography and
clinical Neurophysiology, 70(6):510–523, 1988.
[48] E.A. Felton, J.A. Wilson, J.C. Williams, and P.C. Garell. Electrocorticographically
controlled brain–computer interfaces using motor and sensory imagery in patients
with temporary subdural electrode implants. Journal of Neurosurgery: Pediatrics,
106(3), 2007.
[49] R.S. Fisher, G. Harding, G. Erba, G.L. Barkley, and A. Wilkins. Photic-and
pattern-induced seizures: a review for the Epilepsy Foundation of America Working
Group. Epilepsia, 46(9):1426–1441, 2005.
[50] S.T. Fiske, D.T. Gilbert, and G. Lindzey. Handbook of social psychology. 1, 2010.
[51] E.O. Flores-Gutierrez, J.L. Dıaz, F.A. Barrios, R. Favila-Humara, M.A. Guevara,
Y. del Rıo-Portilla, and M. Corsi-Cabrera. Metabolic and electric brain patterns
during pleasant and unpleasant emotions induced by music masterpieces. Interna-
tional Journal of Psychophysiology, 65(1):69–84, 2007.
[52] GM Friehs, VA Zerris, CL Ojakangas, MR Fellows, and JP Donoghue. Brain-
machine and brain-computer interfaces. Stroke, 35(11, Supplement 1):2702–2705,
2004.
[53] N.H. Frijda and B. Mesquita. The social roles and functions of emotions. Emotion
and culture, pages 51–87, 1994.
[54] M. Frisch and H. Messer. The use of the wavelet transform in the detection of an
unknown transient signal. Information Theory, IEEE Transactions on, 38(2):892–
897, 1992.
[55] T. Fritz, S. Jentschke, N. Gosselin, D. Sammler, I. Peretz, R. Turner, A.D.
Friederici, and S. Koelsch. Universal recognition of three basic emotions in music.
Current Biology, 19(7):573–576, 2009.
[56] A.W.K. Gaillard. Slow brain potentials preceding task performance. Biological
Psychology, 21(4):282–283, 1985.
[57] J.M. Garreu and S.J. Bird. Ethical issues in communicating science. Science and
engineering ethics, 6(4):435–442, 2000.
[58] TJ Germon, PD Evans, NJ Barnett, P Wall, AR Manara, and RJ Nelson. Cerebral
near infrared spectroscopy: emitter-detector separation must be increased. British
journal of anaesthesia, 82(6):831–837, 1999.
[59] TJ Germon, PD Evans, AR Manara, NJ Barnett, P Wall, and RJ Nelson. Sensitiv-
ity of near infrared spectroscopy to cerebral and extra-cerebral oxygenation changes
is determined by emitter-detector separation. Journal of clinical monitoring and
computing, 14(5):353–360, 1998.
[60] A. Gerrards-Hesse, K. Spies, and F.W. Hesse. Experimental inductions of emotional
states and their effectiveness: A review. British Journal of Psychology, 85(1):55–78,
1994.
[61] S. Glennen and D.C. DeCoste. The handbook of augmentative and alternative
communication. 1997.
[62] E. Gomez. Tonal description of polyphonic audio for music content processing.
INFORMS Journal on Computing, 18(3):294–304, 2006.
[63] M.D. Greicius, B. Krasnow, A.L. Reiss, and V. Menon. Functional connectivity in
the resting brain: a network analysis of the default mode hypothesis. Proceedings
of the National Academy of Sciences, 100(1):253–258, 2003.
[64] C. Guger, G. Edlinger, W. Harkam, I. Niedermayer, and G. Pfurtscheller. How
many people are able to operate an EEG-based brain-computer interface (BCI)?
IEEE Transactions on Neural Systems and Rehabilitation Engineering, 11(2):145,
2003.
[65] M. Guirgis, T. Falk, S. Power, S. Blain, and T. Chau. Harnessing physiological
responses to improve nirs-based brain-computer interface performance. In Proc.
ISSNIP Biosignals and Biorobotics Conference 2010, pages 59–62, 2010.
[66] A. Haag, S. Goronzy, P. Schaich, and J. Williams. Emotion recognition using
bio-sensors: First steps towards an automatic system. Affective Dialogue Systems,
pages 36–48, 2004.
[67] B. Hamadicharef, H. Zhang, C. Guan, C. Wang, K.S. Phua, K.P. Tee, and K.K.
Ang. Learning eeg-based spectral-spatial patterns for attention level measurement.
pages 1465–1468, 2009.
[68] M. Hamalainen, R. Hari, R.J. Ilmoniemi, J. Knuutila, and O.V. Lounasmaa. Magnetoencephalography: theory, instrumentation, and applications to noninvasive studies of the working human brain. Reviews of Modern Physics, 65(2):413, 1993.
[69] HM Hamer, HH Morris, EJ Mascha, MT Karafa, WE Bingaman, MD Bej,
RC Burgess, DS Dinner, NR Foldvary, JF Hahn, et al. Complications of inva-
sive video-EEG monitoring with subdural grid electrodes. Neurology, 58(1):97,
2002.
[70] M.B. Happ. Interpretation of nonvocal behavior and the meaning of voicelessness
in critical care. Social Science & Medicine, 50(9):1247–1255, 2000.
[71] P. Haselager, R. Vlek, J. Hill, and F. Nijboer. A note on ethical aspects of bci.
Neural Networks, 22(9):1352–1357, 2009.
[72] J. Healey and R. Picard. Digital processing of affective signals. 6:3749–3752, 1998.
[73] C.S. Herrmann. Human EEG responses to 1–100 Hz flicker: resonance phenomena
in visual cortex and their potential correlation to cognitive phenomena. Experi-
mental Brain Research, 137(3):346–353, 2001.
[74] MJ Herrmann, A.C. Ehlis, and AJ Fallgatter. Prefrontal activation through task
requirements of emotional induction measured with NIRS. Biological psychology,
64(3):255–263, 2003.
[75] K. Hevner. The affective character of the major and minor modes in music. The
American Journal of Psychology, pages 103–118, 1935.
[76] T. Hinterberger, A. Kubler, J. Kaiser, N. Neumann, and N. Birbaumer. A brain
computer interface (BCI) for the locked in: comparison of different EEG classifica-
tions for the thought translation device. Clinical Neurophysiology, 114(3):416–425,
2003.
[77] LR Hochberg, MD Serruya, GM Friehs, JA Mukand, M Saleh, AH Caplan, A Bran-
ner, D Chen, RD Penn, and JP Donoghue. Neuronal ensemble control of prosthetic
devices by a human with tetraplegia. Nature, 442(7099):164–171, 2006.
[78] U. Hoffmann, J.M. Vesin, T. Ebrahimi, and K. Diserens. An efficient P300-based
brain-computer interface for disabled subjects. Journal of Neuroscience methods,
167(1):115–125, 2008.
[79] S. Holm. A simple sequentially rejective multiple test procedure. Scandinavian
journal of statistics, pages 65–70, 1979.
[80] C.B. Holroyd and M.G.H. Coles. The neural basis of human error processing:
reinforcement learning, dopamine, and the error-related negativity. Psychological
review, 109(4):679, 2002.
[81] T. Hopyan, S. Laughlin, and M. Dennis. Emotions and their cognitive control in
children with cerebellar tumors. Journal of the International Neuropsychological
Society, 1(-1):1–12, 2006.
[82] T. Hopyan, S. Laughlin, and M. Dennis. Emotions and their cognitive control in children with cerebellar tumors. Journal of the International Neuropsychological Society, 16(6):1027, 2010.
[83] Y. Hoshi, J. Huang, S. Kohri, Y. Iguchi, M. Naya, T. Okamoto, and S. Ono. Recog-
nition of human emotions from cerebral blood flow changes in the frontal region:
A study with event-related near-infrared spectroscopy. Journal of Neuroimaging,
21(2):e94–e101, 2011.
[84] Yoko Hoshi. Functional near-infrared spectroscopy: Potential and limitations in
neuroimaging studies. International Review of Neurobiology, 66:237–266, 2005.
[85] G. Husain, W.F. Thompson, and E.G. Schellenberg. Effects of musical tempo and
mode on arousal, mood, and spatial abilities. Music Perception, 20(2):151–171,
2002.
[86] S. Inci and T. Ozgen. Locked-in syndrome due to metastatic pontomedullary tumor: case report. Neurologia Medico-Chirurgica, 43(10):497–500, 2003.
[87] IH Iversen, N. Ghanayim, A. Kubler, N. Neumann, N. Birbaumer, and J. Kaiser. A
brain computer interface tool to assess cognitive functions in completely paralyzed
patients with amyotrophic lateral sclerosis. Clinical neurophysiology, 119(10):2214–
2223, 2008.
[88] R.I. Jahiel and M.J. Scherer. Initial steps towards a theory and praxis of person-
environment interaction in disability. Disability & Rehabilitation, 32(17):1467–
1474, 2010.
[89] J.H. Jensen. Feature extraction for music information retrieval. 2010.
[90] F.F. Jobsis. Noninvasive, infrared monitoring of cerebral and myocardial oxygen
sufficiency and circulatory parameters. Science, 198(4323):1264, 1977.
[91] P.N. Juslin. From mimesis to catharsis: expression, perception, and induction of
emotion in music. Musical communication, pages 85–115, 2005.
[92] P.N. Juslin and D. Vastfjall. Emotional responses to music: The need to consider
underlying mechanisms. Behavioral and Brain Sciences, 31(5):559–575, 2008.
[93] J. Kaiser, A. Kubler, T. Hinterberger, N. Neumann, and N. Birbaumer. A non-
invasive communication device for the paralyzed. Minimally Invasive Neurosurgery,
45(1):19–23, 2002.
[94] A.A. Karim, T. Hinterberger, J. Richter, J. Mellinger, N. Neumann, H. Flor,
A. Kubler, and N. Birbaumer. Neural Internet: web surfing with brain potentials
for the completely paralyzed. Neurorehabilitation and Neural Repair, 20(4):508,
2006.
[95] L. Kauhanen, P. Jylanki, J. Lehtonen, P. Rantanen, H. Alaranta, and M. Sams.
EEG-based brain-computer interface for tetraplegics. Computational Intelligence
and Neuroscience, 2007:1, 2007.
[96] D. Keltner and J.J. Gross. Functional accounts of emotions. Cognition and Emo-
tion, 13(5):467–480, 1999.
[97] I.K. Keme-Ebi and A.A. Asindi. Locked-in syndrome in a Nigerian male with multiple sclerosis: a case report and literature review. Pan African Medical Journal, 1(4):10pp, 2008.
[98] P.R. Kennedy, R.A.E. Bakay, M.M. Moore, K. Adams, and J. Goldwaithe. Direct
control of a computer from the human central nervous system. IEEE Transactions
on Rehabilitation Engineering, 8(2):198–202, 2000.
[99] S. Khalfa, D. Schon, J.L. Anton, and C. Liegeois-Chauvel. Brain regions involved in
the recognition of happiness and sadness in music. Neuroreport, 16(18):1981–1984,
2005.
[100] J. Kim and E. Andre. Emotion recognition based on physiological changes in
music listening. Pattern Analysis and Machine Intelligence, IEEE Transactions
on, 30(12):2067–2083, 2008.
[101] J.M. Kim, K. Arakawa, K.T. Benson, and D.K. Fox. Pulse oximetry and circu-
latory kinetics associated with pulse volume amplitude measured by photoelectric
plethysmography. Anesthesia & Analgesia, 65(12):1333–1339, 1986.
[102] K.H. Kim, SW Bang, and SR Kim. Emotion recognition system using short-term
monitoring of physiological signals. Medical and biological engineering and comput-
ing, 42(3):419–427, 2004.
[103] S.P. Kim, J.D. Simeral, L.R. Hochberg, J.P. Donoghue, and M.J. Black. Neural
control of cursor velocity in humans with tetraplegia. Journal of neural engineering,
5:455–476, 2008.
[104] S.P. Kim, JD Simeral, LR Hochberg, JP Donoghue, GM Friehs, and MJ Black.
Multi-state decoding of point-and-click control signals from motor cortical activity
in a human with tetraplegia. In Neural Engineering, 2007. CNE’07. 3rd Interna-
tional IEEE/EMBS Conference on, pages 486–489, 2007.
[105] S. Kingsnorth, S. Blain, and P. McKeever. Physiological and emotional responses
of disabled children to therapeutic clowns: A pilot study. Evidence-Based Comple-
mentary and Alternative Medicine, 2011, 2011.
[106] S. Koelsch. Investigating emotion with music. Annals of the New York Academy
of Sciences, 1060(1):412–418, 2005.
[107] G. Krausz, R. Scherer, G. Korisek, and G. Pfurtscheller. Critical Decision-Speed
and Information Transfer in the Graz Brain–Computer Interface. Applied psy-
chophysiology and biofeedback, 28(3):233–240, 2003.
[108] S.D. Kreibig. Autonomic nervous system activity in emotion: A review. Biological
psychology, 84(3):394–421, 2010.
[109] G. Kreutz, U. Ott, D. Teichmann, P. Osawa, and D. Vaitl. Using music to induce
emotions: Influences of musical preference and absorption. Psychology of music,
36(1):101, 2008.
[110] C.L. Krumhansl. Cognitive foundations of musical pitch. Oxford Psychology Series, No. 17. Oxford University Press, 1990.
[111] C.L. Krumhansl. An exploratory study of musical emotions and psychophysiology.
Canadian Journal of Experimental Psychology/Revue canadienne de psychologie
experimentale, 51(4):336, 1997.
[112] A. Kubler and N. Birbaumer. Brain computer interfaces and communication in
paralysis: Extinction of goal directed thinking in completely paralysed patients?
Clinical neurophysiology, 119(11):2658–2666, 2008.
[113] A. Kubler, A. Furdea, S. Halder, E.M. Hammer, F. Nijboer, and B. Kotchoubey.
A Brain–Computer Interface Controlled Auditory Event-Related Potential (P300)
Spelling System for Locked-In Patients. Annals of the New York Academy of Sci-
ences, 1157(Disorders of Consciousness):90–100, 2009.
[114] A. Kubler, B. Kotchoubey, T. Hinterberger, N. Ghanayim, J. Perelmouter,
M. Schauer, C. Fritsch, E. Taub, and N. Birbaumer. The thought translation
device: a neurophysiological approach to communication in total motor paralysis.
Experimental Brain Research, 124(2):223–232, 1999.
[115] A. Kubler, N. Neumann, J. Kaiser, B. Kotchoubey, T. Hinterberger, and NP Bir-
baumer. Brain-computer communication: self-regulation of slow cortical poten-
tials for verbal communication. Archives of physical medicine and rehabilitation,
82(11):1533, 2001.
[116] A. Kubler, N. Neumann, B. Wilhelm, T. Hinterberger, and N. Birbaumer.
Predictability of brain-computer communication. Journal of Psychophysiology,
18(2):121–129, 2004.
[117] A. Kubler, F. Nijboer, J. Mellinger, TM Vaughan, H. Pawelzik, G. Schalk, DJ Mc-
Farland, N. Birbaumer, and JR Wolpaw. Patients with ALS can use sensorimotor
rhythms to operate a brain-computer interface. Neurology, 64(10):1775, 2005.
[118] H. Kuck, M. Grossbach, M. Bangert, and E. Altenmuller. Brain processing of meter
and rhythm in music. Annals of the New York Academy of Sciences, 999(1):244–
253, 2003.
[119] W.N. Kuhlman. EEG feedback training: enhancement of somatosensory cortical
activity. Electroencephalography and clinical neurophysiology, 45(2):290–294, 1978.
[120] L. Kuncheva, T. Christy, I. Pierce, and S. Mansoor. Multi-modal biometric emotion
recognition using classifier ensembles. Modern Approaches in Applied Intelligence,
pages 317–326, 2011.
[121] L.I. Kuncheva. Combining Pattern Classifiers: Methods and Algorithms. Wiley-Interscience, 2004.
[122] W.J. Lammers and P. Badia. Habituation of P300 to target stimuli. Physiology &
behavior, 45(3):595–601, 1989.
[123] P.J. Lang and M.M. Bradley. Emotion and the motivational brain. Biological
Psychology, 84(3):437–450, 2010.
[124] O. Lartillot, P. Toiviainen, and T. Eerola. A MATLAB toolbox for music information retrieval. Data Analysis, Machine Learning and Applications, pages 261–268, 2008.
[125] R.S. Lazarus. Emotion and adaptation. Oxford University Press, 1991.
[126] J.E. LeDoux. Emotion circuits in the brain. The Science of Mental Health: Fear
and anxiety, page 259, 2001.
[127] R. Leeb, D. Friedman, G.R. Muller-Putz, R. Scherer, M. Slater, and
G. Pfurtscheller. Self-Paced (Asynchronous) BCI Control of a Wheelchair in Virtual
Environments: A Case Study with a Tetraplegic. Computational Intelligence and
Neuroscience, 2007:79642, 2007.
[128] B Leung and T Chau. A multiple camera tongue switch for a child with severe spas-
tic quadriplegic cerebral palsy. Disability & Rehabilitation: Assistive Technology,
5(1):58–68, 2010.
[129] R.W. Levenson. Autonomic nervous system differences among emotions. Psycho-
logical science, 3(1):23–27, 1992.
[130] L. Ljung. System Identification: Theory for the User. Prentice Hall, 1999.
[131] S.G. Mallat. A wavelet tour of signal processing. Academic Press, San Diego, CA, 1999.
[132] K. Marumo, R. Takizawa, Y. Kawakubo, T. Onitsuka, and K. Kasai. Gender
difference in right lateral prefrontal hemodynamic response while viewing fearful
faces: A multi-channel near-infrared spectroscopy study. Neuroscience research,
63(2):89–94, 2009.
[133] SG Mason and GE Birch. A general framework for brain-computer interface design.
IEEE Transactions on Neural Systems and Rehabilitation Engineering, 11(1):70–
85, 2003.
[134] K. Matsuo, T. Kato, K. Taneichi, A. Matsumoto, T. Ohtani, T. Hamamoto, H. Ya-
masue, Y. Sakano, T. Sasaki, M. Sadamatsu, et al. Activation of the prefrontal
cortex to trauma-related stimuli measured by near-infrared spectroscopy in post-
traumatic stress disorder due to terrorism. Psychophysiology, 40(4):492–500, 2003.
[135] D.J. McFarland, D.J. Krusienski, W.A. Sarnacki, and J.R. Wolpaw. Emulation of
computer mouse control with a noninvasive brain-computer interface. Journal of
neural engineering, 5(2):101, 2008.
[136] J.H. Meek, C.E. Elwell, M.J. Khan, J. Romaya, J.S. Wyatt, D.T. Delpy, and S. Zeki. Regional changes in cerebral haemodynamics as a result of a visual stimulus measured by near infrared spectroscopy. Proceedings of the Royal Society of London. Series B: Biological Sciences, 261(1362):351, 1995.
[137] N. Memarian, A.N. Venetsanopoulos, and T. Chau. Infrared thermography as an
access pathway for individuals with severe motor impairments. Journal of Neuro-
Engineering and Rehabilitation, 6(1):11, 2009.
[138] L.B. Meyer. Emotion and meaning in music. University of Chicago Press, 1956.
[139] J.R. Millan. Adaptive brain interfaces. Communications of the ACM, 46(3):74–80,
2003.
[140] J.R. Millan, J. Mourino, M. Franze, F. Cincotti, M. Varsta, J. Heikkonen, and
F. Babiloni. A local neural classifier for the recognition of EEG patterns associated
to mental tasks. IEEE Transactions on Neural Networks, 13(3), 2002.
[141] M.T. Mitterschiffthaler, C.H.Y. Fu, J.A. Dalton, C.M. Andrew, and S.C.R. Williams. A functional MRI study of happy and sad affective states induced by classical music. Human Brain Mapping, 28(11):1150–1162, 2007.
[142] S. Moghimi, A. Kushki, A.M. Guerguerian, and T. Chau. Characterizing emo-
tional response to music in the prefrontal cortex using near infrared spectroscopy.
Neuroscience Letters, 2012.
[143] S. Moghimi, A. Kushki, A.M. Guerguerian, and T. Chau. A review of EEG-based brain-computer interfaces as access pathways for individuals with severe disabilities. Assistive Technology: The Official Journal of RESNA, to appear (2012).
[144] S. Moghimi, A. Kushki, S. Power, A.M. Guerguerian, and T. Chau. Automatic
detection of a prefrontal cortical response to emotionally rated music using multi-
channel near-infrared spectroscopy. Journal of Neural Engineering, 9(2):026022,
2012.
[145] ST Morgan, JC Hansen, and SA Hillyard. Selective attention to stimulus location
modulates the steady-state visual evoked potential. Proceedings of the National
Academy of Sciences of the United States of America, 93(10):4770, 1996.
[146] J.D. Morris. SAM: the Self-Assessment Manikin. An efficient cross-cultural mea-
surement of emotional response. Journal of Advertising Research, 35(6), 1995.
[147] DW Mulder, LT Kurland, KP Offord, and CM Beard. Familial adult motor neuron
disease: amyotrophic lateral sclerosis. Neurology, 36(4):511, 1986.
[148] G.R. Muller, C. Neuper, and G. Pfurtscheller. Implementation of a telemonitoring
system for the control of an EEG-based brain-computer interface. IEEE Transac-
tions on Neural Systems and Rehabilitation Engineering, 11(1):54–59, 2003.
[149] G.R. Muller-Putz, R. Scherer, C. Brunner, R. Leeb, and G. Pfurtscheller. Better than random? A closer look on BCI results. International Journal of Bioelectromagnetism, 10(1):52–55, 2008.
[150] K.J. Murphy and J.A. Brunberg. Adult claustrophobia, anxiety and sedation in MRI. Magnetic Resonance Imaging, 15(1):51–54, 1997.
[151] M. Naito, Y. Michioka, K. Ozawa, Y. Ito, M. Kiguchi, and T. Kanazawa. A communication means for totally locked-in ALS patients based on changes in cerebral blood volume measured with near-infrared light. IEICE Transactions on Information and Systems, 90(7):1028–1037, 2007.
[152] Z. Nenadic and J.W. Burdick. Spike detection using the continuous wavelet trans-
form. Biomedical Engineering, IEEE Transactions on, 52(1):74–87, 2005.
[153] N. Neumann and N. Birbaumer. Predictors of successful self control during
brain-computer communication. Journal of Neurology, Neurosurgery & Psychiatry,
74(8):1117, 2003.
[154] N. Neumann and A. Kubler. Training locked-in patients: A challenge for the use
of brain-computer interfaces. IEEE Transactions on Neural Systems and Rehabili-
tation Engineering, 11(2):169–172, 2003.
[155] C. Neuper, GR Muller, A. Kubler, N. Birbaumer, and G. Pfurtscheller. Clinical
application of an EEG-based brain-computer interface: a case study in a patient
with severe motor impairment. Clinical Neurophysiology, 114(3):399–409, 2003.
[156] B.R. Nhan and T. Chau. Classifying affective states using thermal infrared imaging
of the human face. Biomedical Engineering, IEEE Transactions on, 57(4):979–987,
2010.
[157] E. Niedermeyer and F.H.L. Da Silva. Electroencephalography: basic principles,
clinical applications, and related fields. Lippincott Williams & Wilkins, 2005.
[158] F. Nijboer, SP Carmien, E. Leon, FO Morin, RA Koene, and U. Hoffmann. Affec-
tive brain-computer interfaces: Psychophysiological markers of emotion in healthy
persons and in persons with amyotrophic lateral sclerosis. In Affective Comput-
ing and Intelligent Interaction and Workshops, 2009. ACII 2009. 3rd International
Conference on, pages 1–11. IEEE, 2009.
[159] F. Nijboer, EW Sellers, J. Mellinger, MA Jordan, T. Matuz, A. Furdea, S. Halder,
U. Mochty, DJ Krusienski, TM Vaughan, et al. A P300-based brain-computer
interface for people with amyotrophic lateral sclerosis. Clinical neurophysiology,
119(8):1909–1916, 2008.
[160] K. Oatley, D. Keltner, and J.M. Jenkins. Understanding emotions. Wiley-
Blackwell, 2006.
[161] H. Obrig, C. Hirth, JG Junge-Hulsing, C. Doge, T. Wolf, U. Dirnagl, and A. Vill-
ringer. Cerebral oxygenation changes in response to motor stimulation. Journal of
Applied Physiology, 81(3):1174, 1996.
[162] F. Ortiz-Corredor, J.J. Silvestre-Avendano, and A. Izquierdo-Bello. Locked-in state mimicking cerebral death in a child with Guillain-Barré syndrome. Revista de Neurologia, 44(10):636–638, 2007.
[163] K.J. Pallesen, E. Brattico, C. Bailey, A. Korvenoja, J. Koivisto, A. Gjedde, and
S. Carlson. Emotion processing of major, minor, and dissonant chords. Annals of
the New York Academy of Sciences, 1060(1):450–453, 2005.
[164] J. Panksepp and G. Bernatzky. Emotional sounds and the brain: the neuro-affective
foundations of musical appreciation. Behavioural Processes, 60(2):133–155, 2002.
[165] M.A. Pastor, J. Artieda, J. Arbizu, M. Valencia, and J.C. Masdeu. Human cerebral
activation during steady-state visual-evoked responses. Journal of Neuroscience,
23(37):11621, 2003.
[166] J. Perelmouter and N. Birbaumer. A binary spelling interface with random errors.
IEEE Transactions on Rehabilitation Engineering, 8(2):227–232, 2000.
[167] I. Peretz, L. Gagnon, and B. Bouchard. Music and emotion: perceptual deter-
minants, immediacy, and isolation after brain damage. Cognition, 68(2):111–141,
1998.
[168] P.C. Petrantonakis and L.J. Hadjileontiadis. Emotion recognition from EEG using higher order crossings. Information Technology in Biomedicine, IEEE Transactions on, 14(2):186–197, 2010.
[169] KV Petrides and A. Furnham. Trait emotional intelligence: Behavioural validation
in two studies of emotion recognition and reactivity to mood induction. European
Journal of Personality, 17(1):39–57, 2003.
[170] G. Pfurtscheller, C. Neuper, C. Guger, W. Harkam, H. Ramoser, A. Schlogl, B. Obermaier, M. Pregenzer, et al. Current trends in Graz brain-computer interface (BCI) research. IEEE Transactions on Rehabilitation Engineering, 8(2):216–219, 2000.
[171] R.W. Picard. Affective computing. The MIT press, 2000.
[172] R.W. Picard, E. Vyzas, and J. Healey. Toward machine emotional intelligence:
Analysis of affective physiological state. Pattern Analysis and Machine Intelligence,
IEEE Transactions on, 23(10):1175–1191, 2001.
[173] F. Piccione, F. Giorgi, P. Tonin, K. Priftis, S. Giove, S. Silvoni, G. Palmas, and
F. Beverina. P300-based brain computer interface: reliability and performance in
healthy and paralysed participants. Clinical neurophysiology, 117(3):531–537, 2006.
[174] T.W. Picton. The P300 wave of the human event-related potential. Journal of
clinical neurophysiology, 9(4):456, 1992.
[175] GD Pinna and R. Maestri. Reliability of transfer function estimates in cardio-
vascular variability analysis. Medical and Biological Engineering and Computing,
39(3):338–347, 2001.
[176] R. Plomp and W.J.M. Levelt. Tonal consonance and critical bandwidth. The
journal of the Acoustical Society of America, 38(4):548–560, 1965.
[177] S. Power, T. Falk, and T. Chau. Classification of prefrontal activity due to mental arithmetic and music imagery using hidden Markov models and frequency domain near-infrared spectroscopy. Journal of Neural Engineering, 7(2):026002, 2010.
[178] S. Power, A. Kushki, and T. Chau. Toward a 3-state system-paced NIRS-BCI:
automatic discrimination of mental arithmetic, music imagery from the no-control
state. under review at Journal of Neural Engineering, 2011.
[179] S.D. Power, T.H. Falk, and T. Chau. Classification of prefrontal activity due to mental arithmetic and music imagery using hidden Markov models and frequency domain near-infrared spectroscopy. Journal of Neural Engineering, 7:026002, 2010.
[180] S.D. Power, A. Kushki, and T. Chau. Towards a system-paced near-infrared spec-
troscopy brain–computer interface: differentiating prefrontal activity due to mental
arithmetic and mental singing from the no-control state. Journal of Neural Engi-
neering, 8:066004, 2011.
[181] W.S. Pritchard. Psychophysiology of P300. Psychological Bulletin, 89(3):506–540,
1981.
[182] V. Rajagopalan and A. Ray. Symbolic time series analysis via wavelet-based par-
titioning. Signal Processing, 86(11):3309–3320, 2006.
[183] C. Ranganath and G. Rainer. Neural mechanisms for detecting and remembering
novel events. Nature Reviews Neuroscience, 4(3):193–202, 2003.
[184] P. Rani, C. Liu, N. Sarkar, and E. Vanman. An empirical study of machine learning
techniques for affect recognition in human–robot interaction. Pattern Analysis &
Applications, 9(1):58–69, 2006.
[185] S.J. Roberts and W.D. Penny. Real-time brain-computer interfacing: A preliminary
study using Bayesian learning. Medical and Biological Engineering and computing,
38(1):56–61, 2000.
[186] R.G. Robinson, K.L. Kubos, L.Y.N.B. Starr, K. Rao, and T.R. Price. Mood disor-
ders in stroke patients: importance of location of lesion. Brain, 107(1):81, 1984.
[187] E.T. Rolls. On The Brain and Emotion. Behavioral and Brain Sciences, 23(2):219–228, 2000.
[188] A. Roskies. Neuroethics for the new millenium. Neuron, 35(1):21, 2002.
[189] M.K. Rothbart and D. Derryberry. Development of individual differences in tem-
perament. Advances in developmental psychology, 1:37–86, 1981.
[190] J.A. Russell. A circumplex model of affect. Journal of personality and social
psychology, 39(6):1161, 1980.
[191] C.L. Rusting. Personality, mood, and cognitive processing of emotional informa-
tion: three conceptual frameworks. Psychological bulletin, 124(2):165, 1998.
[192] D.L. Sackett. Rules of evidence and clinical recommendations on the use of an-
tithrombotic agents. Chest, 95(2 Supplement):2S, 1989.
[193] S. Samson. Neuropsychological studies of musical timbre. Annals of the New York
Academy of Sciences, 999(1):144–151, 2003.
[194] G Santhanam, SI Ryu, BM Yu, A Afshar, and KV Shenoy. A high-performance
brain-computer interface. Nature, 442(7099):195–198, 2006.
[195] Ichiro Sase, Hideo Eda, Akitoshi Seiyama, Hiroki C Tanabe, Akira Takatsuki, and
Toshio Yanagida. Multi-channel optical mapping: Investigation of depth informa-
tion. In Proc SPIE, volume 4250, pages 29–36, 2001.
[196] Hiroki Sato, Masashi Kiguchi, Fumio Kawaguchi, Atsushi Maki, et al. Practical-
ity of wavelength selection to improve signal-to-noise ratio in near-infrared spec-
troscopy. Neuroimage, 21(4):1554–1562, 2004.
[197] J.P. Saul, RD Berger, P. Albrecht, SP Stein, M.H. Chen, and R.J. Cohen. Transfer
function analysis of the circulation: unique insights into cardiovascular regulation.
American Journal of Physiology-Heart and Circulatory Physiology, 261(4):H1231–
H1245, 1991.
[198] J.P. Saul, R.D. Berger, M.H. Chen, and R.J. Cohen. Transfer function analysis of autonomic regulation. II. Respiratory sinus arrhythmia. American Journal of Physiology-Heart and Circulatory Physiology, 256(1):H153–H161, 1989.
[199] M.J. Scherer. The change in emphasis from people to person: introduction to
the special issue on Assistive Technology. Disability & Rehabilitation, 24(1-3):1–4,
2002.
[200] L.A. Schmidt and L.J. Trainor. Frontal brain electrical activity (EEG) distinguishes valence and intensity of musical emotions. Cognition & Emotion, 15(4):487–500, 2001.
[201] W.W. Seeley, V. Menon, A.F. Schatzberg, J. Keller, G.H. Glover, H. Kenna, A.L.
Reiss, and M.D. Greicius. Dissociable intrinsic connectivity networks for salience
processing and executive control. The Journal of neuroscience, 27(9):2349–2356,
2007.
[202] E. Sellers, G. Schalk, and E. Donchin. The P300 as a typing tool: tests of brain-computer interface with an ALS patient. Psychophysiology, 40:77, 2003.
[203] E.W. Sellers and E. Donchin. A P300-based brain-computer interface: initial tests
by ALS patients. Clinical Neurophysiology, 117(3):538–548, 2006.
[204] E.W. Sellers, A. Kubler, and E. Donchin. Brain–computer interface research at
the University of South Florida cognitive psychophysiology laboratory: the P300
speller. Biomed. Eng, 51(4):647–656, 2004.
[205] W.A. Sethares. Tuning, timbre, spectrum, scale. 2004.
[206] Y.I. Sheline. 3D MRI studies of neuroanatomic changes in unipolar major depression: the role of stress and medical comorbidity. Biological Psychiatry, 48(8):791–800, 2000.
[207] D.V. Sherman and D. Ely. Biochemical and galvanic skin responses to music stimuli by college students in biology and music. Perceptual and Motor Skills, 74(3c):1079–1090, 1992.
[208] A. Siegel and H. Edinger. Neural control of aggression and rage behavior. Handbook
of the Hypothalamus, 3(Part B), 1981.
[209] J.R. Simpson, W.C. Drevets, A.Z. Snyder, D.A. Gusnard, and M.E. Raichle.
Emotion-induced changes in human medial prefrontal cortex: Ii. during antici-
patory anxiety. Proceedings of the National Academy of Sciences, 98(2):688–693,
2001.
[210] R. Sinha, W.R. Lovallo, and O.A. Parsons. Cardiovascular differentiation of emo-
tions. Psychosomatic Medicine, 54(4):422, 1992.
[211] E. Smith and M. Delargy. Locked-in syndrome. British Medical Journal,
330(7488):406, 2005.
[212] E.M. Sokhadze. Effects of music on the recovery of autonomic and electrocortical
activity after stress induced by aversive visual stimuli. Applied psychophysiology
and biofeedback, 32(1):31–50, 2007.
[213] M.P. Spackman, M. Fujiki, B. Brinton, D. Nelson, and J. Allen. The ability of chil-
dren with language impairment to recognize emotion conveyed by facial expression
and music. Communication Disorders Quarterly, 26(3):131, 2005.
[214] D. Sridharan, D.J. Levitin, and V. Menon. A critical role for the right fronto-
insular cortex in switching between central-executive and default-mode networks.
Proceedings of the National Academy of Sciences, 105(34):12569–12574, 2008.
[215] N. Steinbeis, S. Koelsch, and J.A. Sloboda. Emotional processing of harmonic expectancy violations. Annals of the New York Academy of Sciences, 1060(1):457–461, 2005.
[216] M.W. Sullivan et al. Contingency, means-end skills, and the use of technology in infant intervention. Infants & Young Children, 5(4):58, 1993.
[217] K. Tai, S. Blain, and T. Chau. A review of emerging access technologies for indi-
viduals with severe motor impairments. Assistive technology: the official journal
of RESNA, 20(4):204, 2008.
[218] K. Tai and T. Chau. Single-trial classification of NIRS signals during emotional in-
duction tasks: towards a corporeal machine interface. Journal of NeuroEngineering
and Rehabilitation, 6(1):39, 2009.
[219] M. Tanida, M. Katsuyama, and K. Sakatani. Relation between mental stress-
induced prefrontal cortex activity and skin conditions: A near-infrared spectroscopy
study. Brain research, 1184:210–216, 2007.
[220] J.J. Tecce. Contingent negative variation (CNV) and psychological processes in
man. Psychological Bulletin, 77(2):73–108, 1972.
[221] M.M. Ter-Pogossian, M.E. Raichle, and B.E. Sobel. Positron-emission tomography. Scientific American, 243(4), 1980.
[222] J.F. Thayer and R.D. Lane. A model of neurovisceral integration in emotion reg-
ulation and dysregulation. Journal of affective disorders, 61(3):201–216, 2000.
[223] M. Toyokura. Waveform and habituation of sympathetic skin response. Electroen-
cephalography and Clinical Neurophysiology/Electromyography and Motor Control,
109(2):178–183, 1998.
[224] L. Trejo, K. Knuth, R. Prado, R. Rosipal, K. Kubitz, R. Kochavi, B. Matthews, and Y. Zhang. EEG-based estimation of mental fatigue: Convergent evidence for a three-state model. Foundations of Augmented Cognition, pages 201–211, 2007.
[225] E.Z. Tronick. Emotions and emotional communication in infants. American psy-
chologist, 44(2):112, 1989.
[226] T.M. Vaughan, D.J. McFarland, G. Schalk, W.A. Sarnacki, D.J. Krusienski, E.W. Sellers, and J.R. Wolpaw. The Wadsworth BCI research and development program: at home with BCI. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(2):229–233, 2006.
[227] M Velliste, S Perel, MC Spalding, AS Whitford, and AB Schwartz. Cortical control
of a prosthetic arm for self-feeding. Nature, 453(7198):1098–1101, 2008.
[228] A. Villringer, J. Planck, C. Hock, L. Schleinkofer, and U. Dirnagl. Near infrared
spectroscopy (NIRS): a new tool to study hemodynamic changes during activation
of brain function in human adults. Neuroscience Letters, 154(1-2):101–104, 1993.
[229] R.F. Voss and J. Clarke. "1/f noise" in music: Music from 1/f noise. J. Acoust. Soc. Am., 63(1):258, 1978.
[230] Y. Wang, R. Wang, X. Gao, B. Hong, and S. Gao. A practical VEP-based brain-
computer interface. IEEE Transactions on Neural Systems and Rehabilitation En-
gineering, 14(2):234–240, 2006.
[231] C.M. Warrier and R.J. Zatorre. Right temporal cortex is critical for utilization of
melodic contextual cues in a pitch constancy task. Brain, 127(7):1616–1625, 2004.
[232] L. Wedin. A multidimensional study of perceptual-emotional qualities in music.
Scandinavian Journal of Psychology, 13(1):241–257, 1972.
[233] N. Weiskopf, K. Mathiak, S.W. Bock, F. Scharnowski, R. Veit, W. Grodd, R. Goebel, and N. Birbaumer. Principles of a brain-computer interface (BCI) based on real-time functional magnetic resonance imaging (fMRI). Biomedical Engineering, IEEE Transactions on, 51(6):966–970, 2004.
[234] N. Weiskopf, F. Scharnowski, R. Veit, R. Goebel, N. Birbaumer, and K. Mathiak.
Self-regulation of local brain activity using real-time functional magnetic resonance
imaging (fMRI). Journal of Physiology-Paris, 98(4-6):357–373, 2004.
[235] R.E. Wheeler, R.J. Davidson, and A.J. Tomarken. Frontal brain asymmetry and
emotional reactivity: A biological substrate of affective style. Psychophysiology,
30(1):82–89, 1993.
[236] A. Wilson. Augmentative communication in practice: An introduction (2nd ed.).
University of Edinburgh, Edinburgh, Scotland, 1998.
[237] J.R. Wolpaw, N. Birbaumer, D.J. McFarland, G. Pfurtscheller, and T.M. Vaughan.
Brain-computer interfaces for communication and control. Clinical neurophysiology,
113(6):767–791, 2002.
[238] J.R. Wolpaw and D.J. McFarland. Control of a two-dimensional movement signal
by a noninvasive brain-computer interface in humans. Proceedings of the National
Academy of Sciences of the United States of America, 101(51):17849, 2004.
[239] J.R. Wolpaw, D.J. McFarland, T.M. Vaughan, and G. Schalk. The Wadsworth
Center brain-computer interface (BCI) research and development program. IEEE
Transactions on Neural Systems and Rehabilitation Engineering, 11(2):204–207,
2003.
[240] W.M. Wundt and C.H. Judd. Outlines of psychology. W. Engelmann, 1907.
[241] H. Yang, Z. Zhou, Y. Liu, Z. Ruan, H. Gong, Q. Luo, and Z. Lu. Gender difference
in hemodynamic responses of prefrontal area to emotional stress by near-infrared
spectroscopy. Behavioural brain research, 178(1):172–176, 2007.
[242] T.O. Zander and C. Kothe. Towards passive brain–computer interfaces: apply-
ing brain–computer interface technology to human–machine systems in general.
Journal of Neural Engineering, 8(2):025005, 2011.
[243] R.J. Zatorre. Discrimination and recognition of tonal melodies after unilateral
cerebral excisions. Neuropsychologia, 23(1):31–41, 1985.
[244] M.R. Zentner and J. Kagan. Infants’ perception of consonance and dissonance in
music. Infant Behavior and Development, 21(3):483–492, 1998.