Detecting Emotional response to music using near-infrared spectroscopy of the prefrontal cortex
by
Saba Moghimi
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Institute of Biomaterials and Biomedical Engineering
University of Toronto
© Copyright 2013 by Saba Moghimi
Abstract
Detecting Emotional response to music using near-infrared spectroscopy of the
prefrontal cortex
Saba Moghimi
Doctor of Philosophy
Graduate Department of Institute of Biomaterials and Biomedical Engineering
University of Toronto
2013
Many individuals with severe motor disabilities may not be able to use conventional
means of emotion expression (e.g. vocalization, facial expression) to make their emo-
tions known to others. Lack of a means for expressing emotions may adversely affect
the quality of life of these individuals and their families. The main objective of this
thesis was to implement a non-invasive means of identifying emotional arousal (neutral
vs. intense) and valence (positive vs. negative) by directly using brain activity. In
this light, near infrared spectroscopy (NIRS), which optically measures oxygenated and
deoxygenated hemoglobin concentrations ([HbO2] and [Hb], respectively), was used to
monitor prefrontal cortex hemodynamics in 10 individuals as they listened to music ex-
cerpts. Participants provided subjective ratings of arousal and valence. Prefrontal cortex
[HbO2] and [Hb] were characterized with respect to valence and arousal, and significant
emotion-related hemodynamic modulations were identified. These modulations were not
significantly related to the characteristics of the music excerpts used for inducing emotions.
These early investigations provided evidence for the use of prefrontal cortex NIRS in
identifying emotions. Next, using features extracted from [HbO2]
and [Hb] in the prefrontal cortex, an average accuracy of 71% was achieved in identifying
arousal and valence. Novel hemodynamic features extracted using dynamic modeling and
template-matching were introduced for identifying arousal and valence. Ultimately, the
ability of autonomic nervous system (ANS) signals including heart rate, electrodermal
activity and skin temperature to improve the identification results, achieved when using
PFC [HbO2] and [Hb] exclusively, was investigated. For the majority of the participants,
prefrontal cortex NIRS-based identification achieved higher classification accuracies than
combined ANS and NIRS features. The results indicated that NIRS recordings of the
prefrontal cortex during presentation of music with emotional content can be automatically
decoded in terms of both valence and arousal, encouraging future investigation of
NIRS-based emotion detection in individuals with severe disabilities.
Dedication
To Hope and Trinity for inspiring me to pursue this work.
Acknowledgements
I would like to thank my supervisor Dr. Tom Chau for his kind help and all his support
throughout my work. I will be forever indebted to him for giving me the chance to be
part of his dynamic research team. His mentorship has helped me develop skills that I
will carry for the rest of my life. My special thanks to my co-supervisor Dr. Anne-Marie
Guerguerian for sharing her knowledge and supporting me throughout the challenges I
faced. Her unwavering care and concern for the patients has always been a source of
inspiration to me. I would like to thank my committee members Dr. Maureen Dennis
and Dr. Milos Popovic for sharing their insight, and guiding me with their suggestions.
I would like to express my gratitude to Dr. Azadeh Kushki and Dr. Sarah Power for
their kind help throughout my research. I am also grateful to Ka Lun Tam and Pierre
Duez for their technical support. I would like to express my gratitude to Dr. Negar
Memarian and Dr. Stefanie Blain-Moraes for helping me in developing my research
skills.
I would like to thank the participants who took the time to help me with this study,
without whom this work would have not been possible. I acknowledge the financial
support of the National Science and Engineering Research Council CREATE CARE
program, and Holland Bloorview Kids Rehabilitation Hospital graduate scholarship. I
would like to thank donors of the K.M. Peterborough Hunter graduate studentship for
their financial support.
Finally, I would like to express my gratitude to my family whose love and support
has always embraced me although they are miles and miles away. I would like to thank
my father for all his contributions. His interest in my work and our discussions truly
motivated me in my research. I thank my mother and my aunt Ferreshteh who reminded
me to be strong and determined throughout my work. Special thanks to my sister who
helped me in so many ways from encouraging me in my work to sharing her technical
insight. Finally, my special thanks to Amin Abdossalami for reminding me to never give
up.
Contents
1 Introduction 1
1.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Current clinical evidence for EEG-based BCIs, a literature appraisal . . . 3
1.3.1 BCI Development Using Electroencephalography . . . . . . . . . . 5
1.3.2 Applications User Interface . . . . . . . . . . . . . . . . . . . . . 7
1.3.3 Controlling brain computer interfaces . . . . . . . . . . . . . . . . 7
1.3.4 Evaluation Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.5 Future Directions in BCI research . . . . . . . . . . . . . . . . . . 13
1.3.6 Towards affective brain computer interfaces . . . . . . . . . . . . 16
1.4 Neural correlates of emotion . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.4.1 The role of prefrontal cortex in default, salient and executive control networks . . . 20
1.5 Near-infrared spectroscopy of the brain . . . . . . . . . . . . . . . . . . . 21
1.6 Emotion induction via music . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.7 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.8 Roadmap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2 Experimental Protocol 29
2.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.3 Participants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4 Stimuli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.5 Signal acquisition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.6 Pre-processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.7 Study design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3 Characterizing PFC Hemodynamic changes due to valence and arousal 35
3.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.2 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.4 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.4.1 Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.4.2 Wavelet-based peak detection . . . . . . . . . . . . . . . . . . . . 40
3.4.3 Statistical analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4 The Effect of Music Characteristics 47
4.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.3.1 Music characteristic extraction . . . . . . . . . . . . . . . . . . . . 49
4.3.2 Music database . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.3.3 Statistical analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.5.1 Subject specific patterns . . . . . . . . . . . . . . . . . . . . . . . 53
4.5.2 Temporal dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5 Automatic Detection of Emotional Response to Music 55
5.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.2 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.3 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.4 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.4.1 Stimuli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.4.2 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.4.3 Feature extraction . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.4.4 Classification procedures . . . . . . . . . . . . . . . . . . . . . . . 62
5.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.6.1 Classification Accuracy . . . . . . . . . . . . . . . . . . . . . . . . 66
5.6.2 Diversity in the music database . . . . . . . . . . . . . . . . . . . 69
5.6.3 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
6 Combining autonomic and central nervous system activity 71
6.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.3.1 Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.3.2 NIRS data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
6.3.3 ANS data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.3.4 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.3.5 Feature extraction . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6.3.6 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.3.7 Mixture of experts . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6.4.1 Dynamic model-based features . . . . . . . . . . . . . . . . . . . . 84
6.4.2 Classification results . . . . . . . . . . . . . . . . . . . . . . . . . 84
6.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
7 Concluding remarks 89
7.1 Summary of contributions . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.1.1 A literature appraisal of the existing evidence for the use of BCI
for individuals with disabilities [143] . . . . . . . . . . . . . . . . 89
7.1.2 PFC [Hb] and [HbO2] patterns characterization using wavelet analysis with respect to emotional arousal and valence [142] . . . 90
7.1.3 Identified emotional arousal and valence in response to dynamic
emotion induction using PFC NIRS [144] . . . . . . . . . . . . . . 90
7.1.4 Introduced features based on dynamic modeling for emotion identification . . . 91
7.1.5 Multi-modal emotion identification using a mixture of classifier experts . . . 91
7.2 Recommendation for future studies . . . . . . . . . . . . . . . . . . . . . 92
7.2.1 Assessing PFC hemodynamics for emotion identification in the pediatric population and individuals with severe disabilities . . . 92
7.2.2 Potential clinical implications . . . . . . . . . . . . . . . . . . . . 93
7.2.3 Dynamic emotional rating paradigms . . . . . . . . . . . . . . . . 94
7.2.4 Emotional sensitivity measures . . . . . . . . . . . . . . . . . . . 94
7.2.5 Individual specific analysis . . . . . . . . . . . . . . . . . . . . . . 94
7.2.6 Inclusion of larger sample sizes . . . . . . . . . . . . . . . . . . . 95
Appendix A: Open Challenges Regarding Control Mechanisms 96
Appendix B: Music Database 100
Appendix C: Music characteristic extraction using MIRTOOLBOX 103
Appendix D: Region specific analysis of [HbO2] and [Hb] with respect to
music characteristics 104
Appendix E: Contributions from Systemic Blood Flow 105
Appendix F: Cognitive Processing Activity in the Prefrontal Cortex 107
Appendix G: Research Ethics 108
Bibliography 114
List of Tables
1.1 Summary of BCI studies on individuals with disabilities (1999-2005) . . . 8
1.1 Summary of BCI studies on individuals with disabilities (2006-2009) . . . 9
1.2 BCI Control Mechanisms. . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3 A summary of existing theories of emotions. See [50] for more details. . . 23
4.1 P-values for the main effect of arousal and valence rating in modeling
mode, dissonance and maximum sound pressure level. . . . . . . . . . . . 51
4.2 P-values for the main effect of music characteristics (i.e. dissonance, mode,
and maximum sound pressure level) in modeling the peaks of [HbO2] and
[Hb] averaged across the nine recording sites. . . . . . . . . . . . . . . . . 52
5.1 Summary of features used in the analysis . . . . . . . . . . . . . . . . . . 62
5.2 Classification accuracy in % for each participant when classifying HA vs.
BN. Feature-types corresponding to the best average accuracy are also
presented for each participant (M = stimulus period mean; ∆M = stimulus
period mean - preceding noise period mean; LSR = lateral slope ratio;
∆LM = Lateral mean difference; S = slope, CV = coefficient of variation 65
5.3 Classification accuracy in % for each participant when classifying PV vs.
NV. Feature-types corresponding to the best average accuracy are also
presented for each participant (M = stimulus period mean; ∆M = stimulus
period mean - preceding noise period mean; LSR = lateral slope ratio;
∆LM = Lateral mean difference; S = slope, CV = coefficient of variation 66
6.1 Features resulting from arx dynamic modeling. (very low frequency band
(VLF) = 0-0.025 Hz, low frequency band (LF) = 0-0.075 Hz and high
frequency band (HF) = 0.075-0.1 Hz) . . . . . . . . . . . . . . . . . . . . 79
6.2 Features used for training classifier experts . . . . . . . . . . . . . . . 83
6.3 Classification accuracy in % determined using ANS features for solving
the HA vs BN and PV vs. NV classification problem . . . . . . . . . . . 85
6.4 Classification accuracy in % determined using the mixture of experts for
solving the HA vs. BN and PV vs. NV classification problem . . . . . . . 86
6.5 Classification accuracy in % for each participant when classifying HA vs.
BN. Using dynamic-based features (i.e. AR, arx (arx (a) input:EDA and
arx (b) input:[HbO2]/[Hb])) and template-based features. . . . . . . . . . 86
6.6 Classification accuracy in % for each participant when classifying PV vs.
NV. Using dynamic-based features (i.e. AR, arx (arx (a) input:EDA and
arx (b) input:[HbO2]/[Hb])) and template-based features. . . . . . . . . . 87
1 The list of music pieces included in the common music database . . . . . 101
2 The list of self-selected music pieces . . . . . . . . . . . . . . . . . . . . . 102
3 The significance of the main effect of a. Mode, b. Dissonance, and c.
Maximum sound pressure level for each recording site shown in Figure
2.1. (α = 0.05) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
List of Figures
1.1 General BCI Components . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Various structures within the survival network involved in the emotional
response, and the resulting outputs. [123] . . . . . . . . . . . . . . . . . . 19
1.3 General overview of NIRS recording system . . . . . . . . . . . . . . . . 22
1.4 Thesis roadmap. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1 The layout of light sources (circles) and detectors (X’s). The vertical line
denotes anatomical midline. The annotated shaded areas correspond to
recording locations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.2 Trial sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.3 The Self Assessment Manikin Rating System is shown. The top and the
bottom row depict valence (positive to negative) and arousal (intense to
neutral) ratings, respectively. The participant could select one of the nine
levels of arousal/valence by marking the corresponding circles shown. For
example, in the sample rating provided, a very intense positive emotion is
represented . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.1 The layout of light sources (circles) and detectors (X’s). The vertical line
denotes anatomical midline. The annotated shaded areas correspond to
recording locations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.2 Trial sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3 Mexican hat wavelet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.4 Box-plot of valence and arousal ratings for each participant . . . . . . . . 43
3.5 Slopes of regression lines between participant arousal ratings and (a) the
maximum wavelet coefficient (MWC), and (b) the corresponding scale.
Only slopes significantly different from zero are shown (p < 0.005). . . . . 44
3.6 Slopes of regression lines between participant valence ratings and (a) the
maximum wavelet (MWC), and (b) the corresponding scale. Only slopes
significantly different from zero are shown (p < 0.005). . . . . . . . . . . 44
3.7 Plotted in black are the (a) [HbO2] (top panel) and (b) [Hb] (bottom
panel) recordings across nine interrogation sites for a music sample inducing
intense negative emotions from one of the participants during 45 seconds
of aural stimulus. In grey are the corresponding waveforms of wavelet
coefficients at the scale where the maximum wavelet coefficient occurs.
These waveforms have been scaled by their standard deviation to facilitate
visual comparison. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.1 In grey: the normalized sound pressure level of self-selected song A for par-
ticipant 3. In black: normalized [HbO2] averaged across the nine recording
locations shown for each of the four repetitions of song A. The [HbO2] var-
ied in different repetitions of the same song. . . . . . . . . . . . . . . . . 52
5.1 Plots (a) and (c) exemplify normalized HbO2 concentration signals at dif-
ferent recording locations while plots (b) and (d) are the corresponding
normalized Hb concentration signals. The dark lines represent normalized
signals corresponding to highly valenced, high arousal stimuli while the
lighter grey line depicts normalized concentrations during Brown noise
presentation to the same participant. The same Brown noise sample is
illustrated for both positively and negatively valenced examples. . . . . . 64
5.2 Location of features resulting in the best overall accuracy. Each rectangle
is located over a recording site. The size of the rectangle is proportional
to the number of features selected from the corresponding location. The
vertical line denotes the anatomical midline (HA = high arousal; BN =
Brown noise; PV = positive valence; NV=negative valence). . . . . . . . 67
5.3 Adjusted classification accuracy results (averaged across
participants) versus the number of trials included for classification against
brown noise trials, after sorting all trials based on ratings of arousal in
descending order. (e.g. accuracies reported for the top 12 are the result of
classifying the 12 highest rated arousal trials against all trials with brown
noise. The confidence intervals are shown as error bars for each number
of trials included.) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.4 Adjusted classification accuracy results (averaged across
participants) versus the number of trials included for classification, after
sorting all trials based on ratings of positive and negative valence in de-
scending order. (e.g. accuracies reported for the top 12 are the result of
classifying the 12 most positively rated trials against the 12 most nega-
tively rated trials. The confidence intervals are shown as error bars for
each number of trials included.) . . . . . . . . . . . . . . . . . . . . . . . 68
6.1 Trial sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
6.2 The layout of light sources (circles) and detectors (X’s). The vertical line
denotes anatomical midline. The annotated shaded areas correspond to
recording locations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.3 A. Custom-made template, B. Sample normalized [HbO2] recorded in a
trial with chills. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.4 Feature segmentation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.5 A simplified diagram depicting fusion of classifier decisions. . . . . . . . . 82
6.6 Sample trial with chills (participant 2): EDA recording and estimation,
using the average [HbO2] concentrations as the input to the arx model.
The fit achieved by the model for the depicted estimation is 52.9%. . . . 84
6.7 Sample scaled frequency response estimated for (A) chilling and (B) neu-
tral trials for participant 4. The magnitude of the frequency response was
normalized by dividing the results by the total power of the signal over
the entire frequency range. . . . . . . . . . . . . . . . . . . . . . . . . . . 85
1 Ethics approval notice . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
2 Participant consent form . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
List of abbreviations

ANS: autonomic nervous system
AR: autoregressive
arx: autoregressive model with exogenous input
BCI: brain computer interface
BN: brown noise
BVP: blood volume pulse
CNS: central nervous system
EDA: electrodermal activity
EEG: electroencephalography
HA: high arousal
[Hb]: deoxygenated hemoglobin concentration
[HbO2]: oxygenated hemoglobin concentration
MIRTOOLBOX: music information retrieval toolbox
MRI: magnetic resonance imaging
MWC: maximum wavelet coefficient
NIRS: near infrared spectroscopy
NV: negative valence
PET: positron emission tomography
PFC: prefrontal cortex
PV: positive valence
Chapter 1
Introduction
1.1 Preamble
Sections of this chapter are drawn from the following published review paper: Moghimi S,
Kushki A, Guerguerian AM, Chau T, A Review of EEG-Based Brain-Computer Interfaces
as Access Pathways for Individuals with Severe Disabilities. To appear in Assistive
technology: the official journal of RESNA 2012.
1.2 Motivation
Many individuals with severe motor disabilities may not be able to use conventional
means of communication such as speech or facial gestures to express their intentions.
Lack of communication may adversely impact the quality of life of these individuals as
well as that of their families. In particular, manifestations of emotion, such as facial
expressions and body language, are an integral part of human interactions. Emotional
communication enables caretakers to address the needs of infants [225]. Severe motor
impairments may result in an absence of physical displays of emotion, and leave caretak-
ers with no means of interpreting emotional reactions. Realizing alternative pathways
through which individuals with severe motor impairments may express their affective
response may ultimately improve their quality of life and quality of care while reducing
caregiver stress [70].
Alternative access pathways can be used to translate functional intent into electri-
cal signals for environmental or computer control [217]. Examples of these alternative
pathways include mechanical switches and vision-based systems that generate binary
control signals from limb movements [236], eye gaze [7, 37], mouth opening [137] or
tongue protrusion [128]. These solutions, however, are not appropriate for individuals
who are cognitively capable but have little or no voluntary and repeatable muscle control.
The etiology ranges from acute conditions such as brain-stem stroke, infectious basilar
arteritis, acute inflammatory demyelinating polyneuropathy [162] or brainstem tumor
[86] to chronic causes including amyotrophic lateral sclerosis, severe spastic quadriplegic
cerebral palsy, severe nemaline myopathy and multiple sclerosis [97]. For example, in-
dividuals affected by neuro-degenerative conditions such as amyotrophic lateral sclerosis
(ALS) or multiple sclerosis (MS) may experience locked-in syndrome (LIS) in the late
stages of the disease. Individuals with LIS have little or no voluntary muscle control
while retaining cognitive awareness. These individuals are aware of their surroundings,
however, they may not be able to communicate their intent via speech or facial expres-
sion. Children with congenital disabilities resulting in severe motor impairments may
also experience communication difficulties. To enable communication without relying
on motor capacity, physiologically-based communication systems have been investigated.
In particular, communication alternatives have been developed by directly using brain
activity. Technologies known as brain computer interfaces (BCI) can generate a control
command enabling users to operate communication interfaces [237, 115]. This thesis ex-
plores affective BCI systems [158] capable of detecting emotional response. These systems
constitute an emerging field of BCI research.
[Figure 1.1: General BCI Components. A brain sensing module feeds a decoder, which issues control commands to a user interface supporting three application classes: communication (restoring communication, e.g. alternative and augmentative communication (AAC), spellers, and computer-mediated communication such as internet access control); environment control (interacting with and influencing the surrounding environment, e.g. TV, bed position, lights control); and device control (controlling mechanical devices to restore mobility or dexterity, e.g. neuroprostheses, wheelchair control).]
1.3 Current clinical evidence for EEG-based BCIs, a
literature appraisal
While many different BCI system paradigms have been proposed (e.g. [133, 52]), at its
most fundamental a BCI system comprises an activity sensing module, a brain activity
decoder and an output module, as depicted in Figure 1.1. The activity sensor
measures brain activity while the decoder detects specific evoked or spontaneous brain
activity patterns and translates them into control commands. The output module takes
these control commands to drive applications such as an on-screen scanning keyboard for
communication.
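The sensor-decoder-output structure described above can be sketched in code. This is a hypothetical, minimal illustration of the three components and their data flow, not the implementation of any system cited in this chapter; all names (`BCIPipeline`, `sense`, `decode`, `act`) are invented for the example.

```python
from typing import Callable, List


class BCIPipeline:
    """Minimal sketch of the three BCI components: sensor, decoder, output."""

    def __init__(self,
                 sense: Callable[[], List[float]],      # brain sensing module
                 decode: Callable[[List[float]], str],  # brain activity decoder
                 act: Callable[[str], None]):           # output module
        self.sense = sense
        self.decode = decode
        self.act = act

    def step(self) -> str:
        signal = self.sense()          # measure brain activity
        command = self.decode(signal)  # translate the pattern into a command
        self.act(command)              # drive an application, e.g. a keyboard
        return command


# Toy usage: a single-channel threshold "decoder" driving a selection action.
pipeline = BCIPipeline(
    sense=lambda: [0.8],
    decode=lambda s: "select" if s[0] > 0.5 else "idle",
    act=lambda cmd: None,
)
```

In a real system the decoder would be a trained classifier operating on EEG or NIRS features rather than a fixed threshold.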
One of the key aspects of BCI development is choosing a brain sensing module suit-
able for long-term bedside monitoring. With their high spatial resolution, electrode
implants ([98, 103, 104]) have facilitated accurate cursor control in humans ([77]) and
high throughput ([194]) and multi-joint prosthesis control ([227]) in primates, but do
require invasive surgery ([69]). Sacrificing some spatial resolution for non-invasiveness,
economy and portability, EEG is the most widely used modality in BCI applications
to date. Therefore, Electroencephalography(EEG) which monitors electrical potentials
4
from the skull surface has dominated BCI research due to its non-invasiveness, low cost,
portability and convenient set-up requirements.
Another modality explored for BCI development is magnetic resonance imaging (MRI)
[234]. MRI is capable of detecting hemodynamic changes by monitoring the blood oxygenation
level dependent (BOLD) signal with high spatial resolution, and can detect signals from
deeper brain areas. In fact, Weiskopf et al. have been able to differentiate brain activity
corresponding to motor imagery, visual imagery and spatial navigation using MRI [233].
Despite these findings, and the spatial resolution available using MRI, the current MRI
technologies are bulky, expensive, and require radio frequency and magnetic shielding
which impedes their use as a portable bedside monitoring system.
Another emerging cerebral hemodynamic monitoring technology for BCI develop-
ment is near infrared spectroscopy (NIRS). NIRS monitors the level of oxygenated and
deoxygenated hemoglobin concentrations ([HbO2] and [Hb], respectively) in the cerebral
cortex using optical imagery. Near-infrared light shone through the adult skull is
detected 2.5-3 cm from the source [228, 90]. The detected light intensity can be used
to identify [HbO2] and [Hb] in the underlying tissue due to the differences in absorp-
tion characteristics of these two chromophores. NIRS systems are relatively inexpensive,
portable, and suitable for long-term bedside monitoring. Recent studies have illustrated
the ability of NIRS to detect task-related changes in brain activity. These findings have
indicated that active music imagery (mental singing) can be differentiated from the rest
state and mental math with accuracies significantly above chance [177, 45, 65]. In
addition to user convenience, NIRS is relatively immune to the electrogenic artifacts
caused by eye movements and muscle contractions that are frequently encountered in the prefrontal
area. Therefore, in this thesis, NIRS was selected for detecting hemodynamic changes in
the prefrontal cortex associated with emotional responses. However, due to the extent
of BCI systems developed using EEG, a literature appraisal of clinically investigated
EEG-based BCI systems was conducted to set the stage for understanding the potential
of NIRS.
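The two-chromophore principle described above, namely that [HbO2] and [Hb] can be recovered from light intensity measured at two wavelengths because their absorption spectra differ, can be sketched with the standard modified Beer-Lambert law. The extinction coefficients, source-detector separation, and differential pathlength factor below are illustrative values only, not the calibration used in this thesis.

```python
import numpy as np

# Modified Beer-Lambert law (standard form):
#   delta_OD(w) = (eps_HbO2(w) * d[HbO2] + eps_Hb(w) * d[Hb]) * L * DPF
# Optical density changes at two wavelengths w give a 2x2 linear system
# that is solved for the two chromophore concentration changes.

eps = np.array([[1.49, 3.84],   # 760 nm: [eps_HbO2, eps_Hb] (illustrative)
                [2.53, 1.80]])  # 850 nm: [eps_HbO2, eps_Hb] (illustrative)
L = 3.0    # source-detector separation in cm (2.5-3 cm per the text)
DPF = 6.0  # differential pathlength factor (illustrative)


def concentration_changes(delta_od):
    """Solve the 2x2 system for (d[HbO2], d[Hb]) from optical density changes."""
    return np.linalg.solve(eps * L * DPF, delta_od)


# Example: optical density changes measured at the two wavelengths
d_hbo2, d_hb = concentration_changes(np.array([0.01, 0.02]))
```

The same linear inversion extends to more wavelengths via least squares; commercial NIRS systems apply vendor-specific coefficients and pathlength corrections.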
1.3.1 BCI Development Using Electroencephalography
To determine the extent of existing evidence for EEG-based BCI use by individuals with
disabilities and to identify research gaps, a literature review was conducted. The spe-
cific focus of this search was BCI systems for communication and environmental control.
Studies related to other BCI applications such as brain-controlled prostheses were ex-
cluded from the review. PubMed, ISI Web of Science, and OVID (MEDLINE, CINAHL,
EBM Reviews, EMBASE, and Ovid Healthstar) databases were searched using keyword
combinations containing brain computer interface and one of disability, disabilities, dis-
abled or ALS. Only English-language journal articles that directly evaluated EEG-based
BCI technology with participants with physical disabilities were included. This search
was further narrowed to journal articles published between January 1999 and December
2010.
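The keyword strategy above pairs the primary phrase with each disability-related term. A small sketch of that combinatorial structure follows; the exact query syntax accepted by PubMed, ISI Web of Science, and OVID differs between databases, so the strings here are only illustrative.

```python
from itertools import product

# Every combination of the primary phrase with one secondary term,
# mirroring the search strategy described in the text.
primary_terms = ['"brain computer interface"']
secondary_terms = ["disability", "disabilities", "disabled", "ALS"]

queries = [f"{p} AND {s}" for p, s in product(primary_terms, secondary_terms)]
# One query per secondary term, four in total.
```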
The search identified 380 articles. The reference and citation lists of the retrieved
articles were further examined. The articles were screened, based on title and abstract, to
only include studies involving individuals with disabilities. This screening exercise yielded
119 articles. A subsequent screen for EEG studies focused on restoring communication in
the target population reduced the sample to 39 articles, as listed in Table 1.1.
In the following sections, we appraise these articles with respect to the participants’
characteristics, and to the articles’ control mechanisms, clinical findings, and evaluation
criteria.
The level of evidence for clinical interventions is typically rated according to study
design criteria [192]. None of the studies examined were controlled experiments per se.
Fourteen (14) articles compared BCI use between able-bodied participants and individuals
with disabilities. However, if we regard BCI as a clinical intervention, these studies do
not follow conventional experimental designs, as there were no separate control and
experimental groups. Thus, according to the conventional rating criteria [192], the entire
collection of selected articles would be rated as level V, i.e., case series without controls. By and
large, the majority of studies involved a small number of participants; only 8 of the 39
studies had more than 6 participants. The selected studies do not propose an interven-
tion for a certain population, but rather report efforts of restoring communication in
the few individuals participating in the respective investigations. Therefore, one may
argue that the focus of these BCI assistive technology studies has been on individual-
centered solutions [199]. In this light, introducing clinical rating guidelines revolving
around person-centered constructs such as person-environment interaction [88] may be
more appropriate for these studies.
The selected studies included six single-subject reports, 25 studies with fewer than six
participants, and eight studies with 6-35 participants with disabilities. A total of 14
studies involved both able-bodied participants and individuals with disabilities.
Of the 39 studies involving participants with disabilities, 29 studies reported BCI
evaluation results for individuals with locked-in syndrome (LIS) [211]. In LIS, consciousness
is preserved but severe motor and communication impairments are present (quadriplegia
and anarthria). Depending on the level of residual motor control, LIS is classified into
three categories [8]: (1) incomplete LIS, where remnant voluntary motion is preserved in
addition to those retained in classical LIS, (2) classical LIS, which refers to cases with
total immobility except for blinking and vertical eye movement, and (3) total LIS where
no voluntary motor control is preserved. Among the studies involving participants with
LIS, only two studies recruited participants with total LIS. The majority of these stud-
ies (64%) considered participants with LIS resulting from amyotrophic lateral sclerosis
(ALS), an adult-onset progressive neurodegenerative disease that affects both upper and
lower motor neurons [147]. In its late stages, ALS can lead to the locked-in state.
Other conditions reported in the reviewed articles included different levels of spinal
cord injury (SCI) (8 studies), cerebral palsy (4 studies), cerebral paresis (1 study),
muscular dystrophy (4 studies), stroke (3 studies), chronic Guillain-Barré syndrome (2
studies), multiple sclerosis (1 study), spinal muscular atrophy (2 studies), post-polio (1
study), and primary lateral sclerosis (1 study). All of the selected studies considered
adult participants.
1.3.2 Applications and User Interfaces
The selected articles used EEG-based BCI systems for two main applications, namely,
augmentative and alternative communication (34 studies) and environmental control (5
studies). Augmentative and alternative communication tools enable or facilitate com-
munication with other individuals and include spellers and Internet navigation tools.
Environmental control tools enable the user to modify environmental conditions and in-
clude body-position control, control of electronic appliances, and navigation in real and
virtual environments.
One spelling application uses a binary tree arrangement of the alphabet to
efficiently locate a desired letter [166]. At each level of the tree, the user is presented with
two segments of the alphabet and can eventually choose the desired letter by traversing
the alphabet tree. Another common spelling interface is the scanning keyboard, where
different columns and rows of an array of letters are sequentially intensified [230]. A
third widely used interface involves on-screen object navigation. The user-guided cursor
points to the desired options or letters for communication purposes.
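To make the binary-tree selection concrete, the following sketch simulates the user's repeated half-alphabet choices. This is a hypothetical illustration, not code from any reviewed system; the function `select_letter` and the perfectly accurate simulated user are assumptions:

```python
# Hypothetical sketch of a binary-tree speller: the user repeatedly
# chooses the half of the alphabet that contains the target letter.
import string

def select_letter(target, alphabet=string.ascii_uppercase):
    """Simulate binary-tree selection; return the number of binary
    choices needed to isolate `target` from the alphabet."""
    segment = list(alphabet)
    choices = 0
    while len(segment) > 1:
        mid = len(segment) // 2
        left, right = segment[:mid], segment[mid:]
        # A real BCI would decode this choice from brain activity;
        # here we simulate a perfectly accurate user.
        segment = left if target in left else right
        choices += 1
    return choices

# For a 26-letter alphabet, any letter is reached in at most
# ceil(log2(26)) = 5 binary selections.
print(select_letter("S"))
```

This efficiency (logarithmic in alphabet size) is what makes tree-based layouts attractive for slow, binary control signals such as SCP shifts.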
Table 1.1 summarizes EEG-based BCI studies involving individuals with disabilities
in the past decade.
1.3.3 Controlling brain computer interfaces
EEG-based BCIs rely on modulation of brain activity for application control. The mechanism
used to modulate brain activity may rely on reactions evoked by externally presented
stimuli or on activity generated spontaneously by a trained user. The most commonly used control
Table 1.1: Summary of BCI studies on individuals with disabilities (1999-2005)

Authors | Participants (condition) | Control mechanism | Application
Birbaumer et al. (1999) [13] | 2 (ALS) | SCP | Spelling
Kubler et al. (1999) [114] | 3 (ALS), 13 able-bodied | SCP | Spelling
Birbaumer et al. (2000) [14] | 5 (ALS) | SCP | Spelling
Donchin, Spencer, Wijesinghe (2000) [38] | 3 (complete paraplegia), 1 (incomplete paraplegia), 10 able-bodied | P300 | Spelling
Kubler et al. (2001) [115] | 2 (ALS) | SCP | Spelling
Kaiser et al. (2002) [93] | 1 (ALS) | SCP | Environmental control
Hinterberger et al. (2003) [76] | 1 (ALS) | SCP | Spelling
Sellers et al. (2003) [202] | 3 (ALS) | P300 | Spelling
Muller et al. (2003) [148] | 1 (infantile CP) | SMR | Cursor control
Neumann and Kubler (2003) [154] | 11 (not specified) | SCP |
Krausz et al. (2003) [107] | 4 (SCI, partial paralysis) | SMR | Cursor (ball) control
Neumann, Birbaumer (2003) [153] | 5 (ALS) | SCP | Spelling
Neumann et al. (2003) [154] | 5 (ALS) | SCP | Spelling
Neuper et al. (2003) [155] | 1 (CP) | SCP | Spelling
Bayliss et al. (2003) [10] | 1 (ALS), 9 able-bodied | P300 | 3-choice switch
Kubler et al. (2004) [116] | 10 (ALS), 10 able-bodied | SCP | Cursor control
Wolpaw, McFarland (2004) [238] | 2 (SCI), 2 able-bodied | SMR | Cursor control
Sellers et al. [204] | 15 (ALS), 1 (brain stem stroke) | P300 | Spelling
Kubler et al. (2005) [117] | 4 (ALS) | SMR | Cursor control
Piccione et al. (2005) [173] | 1 (ALS), 1 (LIS post vertebrobasilar thrombosis), 1 (SCI), 1 (Guillain-Barré syndrome), 1 (MS), 7 able-bodied | P300 | 4-choice switch

Note: ALS: amyotrophic lateral sclerosis, CP: cerebral palsy, DMD: Duchenne muscular dystrophy, LIS: locked-in syndrome, Lv: level, MD: muscular dystrophy, MS: multiple sclerosis, SCI: spinal cord injury, SCP: slow cortical potentials, SMA: spinal muscular atrophy, SMR: sensorimotor rhythms.
Table 1.1 (continued): Summary of BCI studies on individuals with disabilities (2006-2010)

Authors | Participants (condition) | Control mechanism | Application
Karim et al. (2006) [94] | 1 (ALS) | SCP | Internet surfing
Neuper et al. (2006) [26] | 1 (CP), 1 (MDD), 1 (ALS), 1 (SCI, lv. C4), 1 (SCI, lv. C5) | SMR | Spelling
Sellers, Donchin (2006) [203] | 3 (ALS), 3 able-bodied | P300 | Spelling
Vaughan et al. (2006) [226] | 1 (ALS) | SMR, SCP, P300 | Spelling, cursor control
Wang et al. (2006) [230] | 11 (SCI, lv. C4-C7), 16 able-bodied | SSVEP | Environmental control
Kauhanen et al. (2007) [95] | 5 (SCI, lv. C4-C5), 1 (Guillain-Barré syndrome) | SMR | Cursor (circle) control
Leeb et al. (2007) [127] | 1 (SCI, complete lesion below C4, incomplete lesion below C5) | SMR | Virtual environment navigation
Bai et al. (2008) [6] | 1 (brain stroke), 1 (ALS), 9 able-bodied | SMR | Binary switch
Cincotti et al. (2008) [27] | 14 (SMA, DMD), 14 able-bodied | SMR | Environmental control
Hoffmann et al. (2008) [78] | 1 (CP), 1 (SMA), 1 (ALS), 1 (brain and spinal cord injury), 1 (post-anoxic encephalopathy), 4 able-bodied | P300 | 6-choice switch
Kubler, Birbaumer (2008) [112] | 29 (ALS), 6 (Guillain-Barré syndrome), 1 (muscular dystrophy), 1 (cerebral paresis), 1 (diffuse brain damage post hypoxia), 2 (brain stroke) | SCP | Cursor control, spelling
McFarland et al. (2008) [135] | 1 (SCI, lv. C4), 1 (SCI, lv. T7) | SMR | Cursor control
Nijboer et al. (2008) [159] | 8 (ALS) | P300 | Spelling
Kubler et al. (2009) [113] | 4 (ALS) | P300 | Spelling (auditory)
Babiloni et al. (2009) [5] | 6 (DMD) | SMR | Environmental control
Conradi et al. (2009) [29] | 7 (SCI) | SMR | Cursor control
Felton et al. (2009) [48] | 2 (ALS), 1 (MD post-polio), 3 (SMA), 2 (post-polio), 1 (CP), 2 (SCI), 1 (LIS), 8 able-bodied | SMR | Cursor control
Bai et al. (2010) | 3 (PLS), 3 (ALS) | SMR | Cursor control
Mugler et al. (2010) | 3 (ALS), 10 able-bodied | P300 | Internet browsing

Note: ALS: amyotrophic lateral sclerosis, CP: cerebral palsy, DMD: Duchenne muscular dystrophy, LIS: locked-in syndrome, Lv: level, MD: muscular dystrophy, MS: multiple sclerosis, PLS: primary lateral sclerosis, SCI: spinal cord injury, SCP: slow cortical potentials, SMA: spinal muscular atrophy, SMR: sensorimotor rhythms.
Table 1.2: BCI Control Mechanisms.

Spontaneous: Slow Cortical Potentials (SCP); Sensorimotor Rhythms (SMR)
Evoked: P300; Steady State Visually Evoked Potentials
Mental Task: Language tasks; Mental Arithmetic
mechanisms are shown in Table 1.2. The remainder of this section discusses each of
these mechanisms in detail. The review of the selected articles identified several
challenges surrounding BCI control mechanisms, which are listed in Appendix A.
Slow cortical potentials
The most frequently deployed control mechanism among the selected studies is the slow
cortical potential (SCP), a spontaneously generated signal. SCPs are slowly varying
trends that are time-locked to specific external or internal events [12]. The duration of
these potentials generally ranges from 300 milliseconds to several seconds [13].
Voluntary control using behavioral manipulations can cause positive or negative SCP
shifts. Negative deviations of the SCPs are known to be associated with arousal as well
as response preparation [220, 56]. Positive deflections in SCPs are related to response
inhibition and relaxation [42]. Voluntary control of SCP can be achieved by providing
visual or auditory biofeedback to participants [43, 12].
The thought translation device (TTD) is an example of an SCP-based BCI [11], employing
voluntarily generated SCPs to control a computer. The TTD requires a training phase
during which the user receives visual or audio-visual feedback reflecting the presence of
positive or negative deflections [115]. In particular, the reviewed articles reported that
successful use of the SCP-based BCI (achieving accuracies higher than 70%) required
several training sessions. Among the reviewed articles, SCP-based BCIs were used to
operate spelling interfaces, navigate the Internet, and control environmental devices.
Sensorimotor rhythms
Like SCPs, sensorimotor rhythms (SMRs) are spontaneously occurring EEG
activities in the somatosensory cortex in the absence of movement [157]. These rhythms
are attenuated by movement or somatosensory stimulation. Control of SMRs can be
achieved through biofeedback-based training as the user performs motor imagery [119,
170]. While SMR-based BCIs were successfully used by individuals with disabilities
[117, 170, 239], voluntary modulation of SMRs for BCI control required many training
sessions.
SMR-based BCIs have been used for cursor control, where bilateral motor imagery of
hands, legs, and tongue was used to control the direction of cursor movement. SMRs
have also been used for selecting various targets. For example, McFarland et al. [135]
used a linear combination of SMRs to enable selection once the cursor reached the target
choice. Such cursor control can also be used for spelling.
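The mu-band attenuation that SMR-based BCIs exploit can be illustrated with a simple band-power feature. The sketch below is illustrative only; the function name `mu_band_power`, the periodogram estimator, and the synthetic signals are assumptions, not the reviewed systems' actual pipelines:

```python
# Illustrative sketch: estimate mu-band (8-12 Hz) power from one EEG
# channel; motor imagery is expected to attenuate this band over
# sensorimotor areas.
import numpy as np

def mu_band_power(eeg, fs, band=(8.0, 12.0)):
    """Mean power spectral density in `band` via a simple periodogram."""
    eeg = np.asarray(eeg) - np.mean(eeg)
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(eeg)) ** 2 / (fs * len(eeg))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return psd[mask].mean()

# Synthetic example: a 10 Hz rhythm at rest vs. attenuated during imagery.
fs = 250
t = np.arange(0, 2, 1 / fs)
rest = np.sin(2 * np.pi * 10 * t)           # strong mu rhythm
imagery = 0.3 * np.sin(2 * np.pi * 10 * t)  # attenuated mu rhythm
print(mu_band_power(rest, fs) > mu_band_power(imagery, fs))  # True
```

An SMR-based BCI then maps such band-power features (after user training) onto cursor velocity or target selection.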
P300 evoked potentials
In contrast to the spontaneous control mechanisms, the P300 is an evoked response.
The P300 wave has a latency of approximately 300 ms and is a positive-going component of the event-
related potential that results from exposure to an occasional stimulus [181]. This response
is generated by a network comprising the prefrontal cortex, anterior insula, cingulate
gyrus, temporoparietal cortex, medial temporal cortex, and the hippocampal formation
[183] and can be maximally recorded from the midline centroparietal regions [174].
An example of a P300-based BCI is the P300 speller [47] that intensifies columns and
rows of an alphabet matrix presented visually to the user. A P300 response is elicited
when the user is presented with intensification of the row and column containing the
desired letter. Thus, the presence of the P300 can be used to detect the user’s choice.
The P300 speller was shown to achieve higher than 70% accuracy in 5 of 6 participants
with ALS in [159]. Moreover, Sellers et al. reported that with visual and auditory P300-
inducing stimuli, 2 of 3 participants with ALS achieved a selection accuracy comparable
to that of able-bodied individuals using a similar system [202].
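The row/column selection logic described above can be sketched as epoch averaging followed by picking the stimulus whose average response shows the largest late positivity. This is a simplified, hypothetical illustration on synthetic data; `detect_target` and the latency window are assumptions, not the P300 speller's published pipeline:

```python
# Hypothetical sketch of P300 target detection: average the epochs
# following each row/column flash and pick the stimulus whose average
# shows the largest post-stimulus positivity.
import numpy as np

def detect_target(epochs_by_stimulus, fs, window=(0.25, 0.45)):
    """epochs_by_stimulus: dict mapping stimulus id -> array of shape
    (n_repetitions, n_samples). Returns the stimulus id with the
    largest mean amplitude in the assumed P300 latency window."""
    lo, hi = int(window[0] * fs), int(window[1] * fs)
    scores = {
        stim: np.mean(epochs, axis=0)[lo:hi].mean()
        for stim, epochs in epochs_by_stimulus.items()
    }
    return max(scores, key=scores.get)

# Synthetic demo: stimulus "row3" carries a simulated P300-like bump.
fs, n = 250, 200  # 0.8 s epochs
rng = np.random.default_rng(0)
epochs = {f"row{i}": rng.normal(0, 1, (15, n)) for i in range(6)}
t = np.arange(n) / fs
epochs["row3"] += 2.0 * np.exp(-((t - 0.3) ** 2) / 0.002)
print(detect_target(epochs, fs))  # expected: row3
```

Averaging across repetitions is what makes the weak single-trial P300 detectable above the background EEG, at the cost of selection speed.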
Steady state visually evoked responses
When presented with repetitive visual stimuli, EEG recordings from the parieto-occipital
sites demonstrate peaks at frequencies matching that of the stimuli and its harmonics
[73]. This response is known as the steady state visually evoked potential (SSVEP). The
physiological mechanism underlying the generation of SSVEPs remains largely unknown,
although the amplitude of SSVEPs is reportedly related to increases in synaptic activity
[165]. SSVEP peaks are suggested to intensify with selective attention to the stimulus
[145, 3].
In Wang et al. (2006), 11 volunteers with SCI attempted to operate an environmental
control system using an SSVEP-based BCI [230]. Of the 11 participants, 10 were able to reach
an information transfer rate of 21 bits/minute. In this study, an
array of buttons, each flickering at a different frequency, was presented to the user. The
user chose the desired option by attending to the appropriate button.
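A minimal SSVEP classifier in the spirit of such systems compares spectral power at the candidate flicker frequencies and their harmonics. The sketch below is illustrative on synthetic data; the power-comparison rule and function name are assumptions, not the method of Wang et al.:

```python
# Illustrative sketch: classify which flickering button the user
# attends to by comparing spectral power at the candidate stimulation
# frequencies (fundamental plus harmonics).
import numpy as np

def classify_ssvep(eeg, fs, candidate_freqs, harmonics=2):
    """Return the candidate frequency whose fundamental and harmonics
    carry the most power in the EEG spectrum."""
    eeg = np.asarray(eeg) - np.mean(eeg)
    spectrum = np.abs(np.fft.rfft(eeg)) ** 2
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)

    def score(f0):
        # Sum power at the spectral bins nearest each harmonic of f0.
        return sum(spectrum[np.argmin(np.abs(freqs - h * f0))]
                   for h in range(1, harmonics + 1))

    return max(candidate_freqs, key=score)

# Synthetic demo: the user attends to the 15 Hz button.
fs = 250
t = np.arange(0, 4, 1 / fs)
rng = np.random.default_rng(1)
eeg = np.sin(2 * np.pi * 15 * t) + 0.5 * rng.normal(size=t.size)
print(classify_ssvep(eeg, fs, [10.0, 12.0, 15.0, 17.0]))  # expected: 15.0
```

Because the stimulus frequency is known in advance, SSVEP decoding requires little or no user training, which partly explains the high transfer rates reported.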
Mental task
Several other mental tasks, such as language and arithmetic tasks, have also been shown to
induce distinctive EEG patterns in able-bodied individuals [140, 185]. Despite the
cognitive load imposed by these BCIs, they may have merit as BCI control mechanisms
for the target population. To the best of our knowledge, however, BCIs based on language and
arithmetic mental tasks have not been tested with the target population.
1.3.4 Evaluation Criteria
The performance of BCI systems has generally been measured by speed and accuracy,
both of which are important for communication. Since the reviewed studies focused on different
applications (e.g., spelling, cursor control), various measures of speed and accuracy were
used to report system performance, as listed in Table 1.1. Examples of accuracy measures
include classification accuracy and the r2 value, which reflects the level of correlation between
user intent and the signal features [237]. The number of characters typed per minute
has also served as a measure of speed. Information transfer rate, also known as bit rate,
has also been commonly used as a combined measure of accuracy and speed [237]. This
measure reflects the amount of "correct" information transferred per unit time.
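The bit-rate measure can be made concrete using the commonly cited Wolpaw formulation, which combines the number of possible targets N and the selection accuracy P. The helper below is a sketch; the function name and the 4-choice example are assumptions for illustration:

```python
# Wolpaw-style information transfer rate: for N targets and selection
# accuracy P, the bits conveyed per selection are
#   B = log2(N) + P*log2(P) + (1 - P)*log2((1 - P)/(N - 1)).
# Multiplying by selections per minute yields bits/minute.
import math

def bits_per_selection(n_targets, accuracy):
    if accuracy <= 0 or n_targets < 2:
        return 0.0
    b = math.log2(n_targets)
    if accuracy < 1.0:
        b += accuracy * math.log2(accuracy)
        b += (1 - accuracy) * math.log2((1 - accuracy) / (n_targets - 1))
    return max(b, 0.0)  # clamp: below-chance accuracy conveys no information

# Example: a 4-choice interface at 90% accuracy, 10 selections/minute.
b = bits_per_selection(4, 0.9)
print(round(b * 10, 2), "bits/minute")  # prints "13.73 bits/minute"
```

The formula makes explicit why accuracy and speed trade off: a faster interface with lower accuracy can yield a lower bit rate than a slower but more reliable one.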
1.3.5 Future Directions in BCI research
A closer look at the reviewed studies provides a means of identifying emerging challenges
in BCI development and ways to overcome them. In this light, the current section
summarizes future directions identified in BCI research.
Involve pediatric populations
The reviewed articles largely focused on individuals with adult-onset disabilities. It is
unclear whether the findings of these studies translate to individuals with congenital
disabilities, who often have never experienced any means of communication. For example,
to the best of our knowledge, BCIs relying on motor imagery have never been tested
with individuals who have never experienced voluntary control of their movements.
Prolonged deprivation of communication in childhood can lead to learned helplessness and
impede the development of contingency awareness [216]. Despite this compelling clinical
reason for investigating BCI use in the early stages of life, none of the reviewed studies
have investigated the effectiveness of EEG-based BCIs in the pediatric population.
Consider personal contextual factors in determining BCI speed requirements
Communication speed (e.g. words typed per minute) has traditionally been an important
factor in assessing BCI performance. The emphasis on maximizing speed may stem from
studies with able-bodied individuals or those with traumatic disabilities who may expect
BCI systems to replicate the high throughput of pathways such as speech. Nonetheless,
joint BCI studies involving both able-bodied individuals and those with severe disabilities
have pinpointed delays in reaction time [6] and slower item selection rates [38] among the
participants with disabilities. Thus, the speed expectations of patients are likely very different
from those of their able-bodied counterparts. Indeed, proficient users of single-switch
scanning systems typically only achieve 8-24 words per minute [61]. Further, children
with developmental disabilities and communication difficulties are known to exhibit only
a handful of intentional communication acts per minute (e.g., words, gestures, and
vocalizations) [18]. Therefore, we recommend that, as an indicator of BCI performance, speed
ought to be contextualized in terms of the individual's time scale for communication,
taking into account the time required to process received information and the time needed
to muster the resources to respond. The level of cognitive awareness of the BCI user has
a significant effect on the choice of control mechanism and may affect the speed of oper-
ating the BCI. In particular, spontaneous control mechanisms are appropriate for users
who can voluntarily modulate EEG patterns. However, due to the lack of alternative
means of communication, the cognitive awareness of the participant cannot always be
assessed using standard assessment tools that rely on motor responses [87]. Therefore, a
comprehensive evaluation of BCI performance ought to include an appropriate cognitive
assessment.
Train and evaluate in ecologically salient environments
BCI evaluation would not be complete without considering the environmental context
in which the BCI operates. Because a BCI system is often the only means of communication for
an individual with severe disabilities, BCI solutions must allow long-term use in home
environments. Despite this, only a handful of articles have evaluated BCI performance in
home environments [76] or a simulated home-like environment [27]. A notable example
is the evaluation of the BCI 2000 system modified for use in home environments [226]. In
evaluating BCI accuracy, contextual factors may also include communication partners.
In this regard, it is important to view the BCI as a tool for facilitating meaningful
communication and not necessarily as a tool for producing exact selections. For example,
when using a BCI system to control a scanning keyboard, meaningful communication can
occur in spite of spelling errors. This suggests that to obtain an environmentally relevant
evaluation of a BCI, a measure of the conversational partner’s receptive communication
may be important. While BCI training and evaluation may be performed in the user’s
home environment, trained personnel must often be present to ensure proper set-up and
operation of the equipment. This can limit BCI users to the geographical vicinity of
research facilities. To overcome these geographical restrictions, both researchers and
patients may benefit from tele-monitoring systems that enable remote supervision of
training [148].
Introduce user-aware BCIs
None of the reviewed studies incorporated user state (fatigue, attention, emotional
status) during BCI operation when determining performance. For example, it
is not clear whether the performance degradation observed during long periods of BCI
use results from exacerbated fatigue or from failure of the detection algorithms
used. Detecting changes in user status, such as the level of fatigue and attention, may
improve BCI performance assessments. In addition, awareness of user-status may allow
the BCI to more intimately accommodate the user's moment-by-moment needs. For
example, once user fatigue is detected, the system can suggest a rest period. Specific EEG
patterns have been shown to reflect different states such as fatigue and attention. Extended
periods of performing tasks such as mental arithmetic or driving result in an increase in
frontal theta rhythms [224]. Hamadicharef et al. (2009) were able to differentiate attention
(reading/arithmetic) versus non-attention (rest) states with accuracies up to 89.4%
[67]. Petrantonakis and Hadjileontiadis (2010) showed that the six basic emotions (hap-
piness, surprise, anger, fear, disgust, and sadness) could be differentiated with 83.33%
accuracy using EEG activity [168]. Based on these findings, in future studies, the EEG
signals monitored by BCI systems may also be used to estimate user state, leading to a more
user-accommodating implementation. EEG signals may also reflect the dynamics of the
interaction between the user and BCI systems. For example, error-related potentials,
which are manifested after an error occurs [24, 80], may be used as a post-hoc correction
mechanism. Once an error-related potential appears, an auto-correction strategy may be
invoked or user verification may be solicited. Using EEG patterns associated with the
user-system interaction such as error-related potentials may lead to more usable BCIs
[25].
Develop more effective training protocols
None of the reviewed studies focused on the development of engaging training
paradigms. Training is an essential part of realizing SMR- and SCP-based BCI systems.
Improving the training interface may directly affect training success. Studies involving
able-bodied participants have previously explored alternative training paradigms; the
interested reader is referred to Neuper and Pfurtscheller (2010) [155]. For example,
immersive training protocols (using virtual environments) have been suggested for realizing
an informative yet engaging training environment [127]. Using more engaging training
paradigms such as those involving learning reinforcements may increase user motivation,
improve training effectiveness and reduce requisite training times. Such training regimens
would be particularly useful for motivating the pediatric user with disabilities.
1.3.6 Towards affective brain computer interfaces
Despite the merits offered by existing BCI systems, many nonverbal children and youth
are not candidates for existing BCI technologies due to developmental delays, limited
expressive communication, and unknown levels of receptive communication. Indeed,
the aforementioned challenges preclude the training of specific mental activities. How-
ever, these individuals are still candidates for affective BCIs (A-BCI) which enable the
automatic recognition of affective states using brain activity [158]. A-BCIs may provide
a means of detecting spontaneous and natural reactions to emotion-evoking stimuli.
A-BCI development is a step towards addressing the gaps in existing BCI research
introduced in Section 1.3.5. Emotions are an intuitive and natural means of responding to stimuli.
Therefore, A-BCI may provide an opportunity to realize communication pathways for
the pediatric population. A-BCIs can bring user-state awareness to existing BCI
systems. Emotional awareness may help create more user-accommodating systems and
develop more effective training paradigms. Unlike existing active BCI systems, which
generate voluntary and direct commands for communication (e.g. the P300 speller), A-BCIs
may offer passive but intuitive control. Passive BCI systems detect implicit information
regarding the user state (e.g. emotions) and intentions, and enable situational interpre-
tations [242]. Ultimately, an affective BCI may enable the decoding of emotional state
in the absence of overt emotional expression.
Computer-based detection of emotional responses may enhance implicit communica-
tion about the user in human computer interaction systems [31]. Affective computing has
long been touted for its potential for more realistic and user-accommodating interactions
[171]. An emotionally-aware system stands to benefit non-verbal individuals with severe
disabilities by estimating their emotional state in the absence of more explicit means
of interaction (e.g. speech and gestures). In turn, knowledge of the patient's affective
state may help to mitigate caregiver stress and facilitate timely treatment decisions [70].
1.4 Neural correlates of emotion
Emotional response has been shown to engage different pathways in the central and
autonomic nervous systems. Autonomic nervous system (ANS) sensors, such as
those that measure cardiovascular, respiratory, and electrodermal activity, can unveil emotional
responses [108, 129]. For a review of studies using ANS activity sensors for identifying
emotions, the reader is referred to [108].
Based on theories suggesting a close relationship between emotional response and
survival, key neural structures in the brain have been identified in different animal studies.
Figure 1.2 summarizes the many neural structures involved in orchestrating an emotional
response within what is known as the survival network [123]. As shown in Figure 1.2,
emotional response can engage many substrates in the mammalian brain. The human
brain is no exception to this rule. Neuro-imaging techniques such as positron emission
tomography (PET) [221] and magnetic resonance imaging (MRI) [21] have provided an
opportunity for in vivo characterization of emotional perception in the human brain
[15, 16, 44, 206, 209].
Various brain circuits, including parts of the limbic system and the amygdala, have been found
to be responsible for the perception of emotional stimuli [164, 208, 126]. Among these
areas, the frontal cortex plays an important role in regulating emotional response to sen-
sory input [34, 33, 187, 141]. Previous studies have confirmed the role of the frontal area
in emotional response. For example, the severity of depressive symptomatology in
patients with stroke lesions was reported to be significantly correlated with the proximity
of the lesion to the frontal pole [186]. Moreover, left and right frontal activations were
also found in response to watching video clips inducing positive and negative emotional
responses, respectively [235]. Activations in the orbito-frontal and ventral prefrontal cor-
tex in response to highly pleasurable self-selected music excerpts have also been reported
[15]. Tanida et al. showed that inducing mental stress could lead to bilateral increases
or decreases of oxygenated hemoglobin ([HbO2]) and deoxygenated hemoglobin ([Hb]),
[Figure 1.2 depicts the survival network schematically: sensory input reaches the amygdala via the thalamus, cortex, and hippocampus, and output pathways include the orbitofrontal cortex (choice behavior, memory of emotional events), hippocampus (memory consolidation of emotional events, spatial learning), dorsal and ventral striatum (instrumental approach or avoidance behavior), lateral hypothalamus (tachycardia, skin conductance response, paleness, pupil dilation, blood pressure elevation), dorsal motor nucleus of the vagus/nucleus ambiguus (ulcers, urination, defecation, bradycardia), and paraventricular nucleus (corticosteroid release, the "stress response").]

Figure 1.2: Various structures within the survival network involved in the emotional response, and the resulting outputs. [123]
respectively [219]. Matsuo et al. reported PFC [HbO2] increases in a group of individuals
with post-traumatic stress disorder, as well as in a healthy control group, in response
to trauma-related videos [134].
1.4.1 The role of the prefrontal cortex in the default, salience, and executive control networks
One of the remarkable features of the brain is its ability to attend to salient events in the
environment. The ability of the brain to regulate various processes and divert attention
to the more salient ones has been attributed to intrinsic and distinct functional networks
[214]. These networks are composed of strongly coupled sets of information processing
nodes distributed in the brain. Functional connectivity studies have confirmed the
existence of at least three canonical networks: (i) the central executive network, (ii) the default
network, and (iii) the salience network [214]. The salience and central executive networks
exhibit increased activity during cognitively demanding tasks [63]. The default network,
on the other hand, shows higher levels of activity during the resting state [63]. By regulating
the activation and deactivation of these networks, the brain can maintain various ongoing
processes during rest and respond to salient events when required. These salient
events could involve cognition, homeostasis, or emotions [201]. Therefore, emotional
responses may result in activity changes within these intrinsic brain networks. The salience,
central executive, and default networks are shown to encompass different areas within the
prefrontal cortex. The dorsolateral prefrontal cortex is shown to be part of the central
executive network [214, 201]. The ventromedial prefrontal cortex serves as one of the
nodes in the default network [214, 63]. Finally, the salience network encompasses the
ventrolateral prefrontal cortex [214, 201]. In addition, various areas in the prefrontal
cortex are shown to act as information hubs by integrating diverse information sources
within different brain networks. In a functional magnetic resonance imaging study,
Buckner et al. [20] identified prominent hubs in the medial/lateral prefrontal cortex.
Based on the existing evidence, recordings from the prefrontal cortex may tap into
three major networks in the brain (salience, central executive and default networks). In
addition, recordings from the medial/lateral prefrontal cortex may enable monitoring of
the activity of intrinsic cortical hubs [20].
Unlike deeper brain areas such as the amygdala and the limbic system, prefrontal cortex
(PFC) hemodynamics can be conveniently monitored using non-invasive and portable
brain monitoring modalities such as NIRS. The accessibility of the PFC to brain sensing
modalities, and particularly to NIRS, provides a great opportunity for realizing a bedside emotion
identification system. Therefore, in this thesis, PFC hemodynamics were used for
identifying emotional response.
1.5 Near-infrared spectroscopy of the brain
Among various brain monitoring modalities, hemodynamic measurements are not prone
to electrogenic artifacts such as bio-potentials associated with eye movement or frontalis/temporalis
muscle contraction. These artifacts primarily occur in the forehead area and may reduce the
signal-to-noise ratio when recording EEG from the prefrontal and frontal regions. Therefore,
hemodynamic measurement in cortical areas involved in emotion processing is
a meaningful pursuit in developing affective brain computer interfaces (A-BCIs).
Various brain sensing modalities have been developed for cerebral hemodynamic
monitoring, such as magnetic resonance imaging (MRI) and positron emission tomography
(PET). However, neither of these technologies is currently suitable for long-term bedside
monitoring for emotion identification purposes. Current MRI systems are bulky,
expensive, and require radio-frequency and magnetic shielding, which impedes their use as
portable bedside monitoring systems. PET systems require the administration of radioactive
tracers and are therefore not suitable for long-term and repeated monitoring.
Near-infrared spectroscopy (NIRS), which is also a hemodynamic-based brain sensing
[Figure 1.3 depicts light emission and detection over the scalp, together with a sample [HbO2]/[Hb] recording.]

Figure 1.3: General overview of NIRS recording system
modality, offers many advantages, such as low cost and portability, making it suitable
for long-term bedside use. NIRS optically monitors oxygenated and deoxygenated
hemoglobin concentrations ([HbO2] and [Hb], respectively) in the cerebral
cortex. Near-infrared light penetrates the adult skull and can be superficially detected
2.5-3 cm away from the source [228, 90] (Figure 1.3). The detected light intensity can
be used to identify [HbO2] and [Hb] in the underlying tissue due to the differences in the
absorption characteristics of these two chromophores. Deeper brain areas in the emotional
network, such as the amygdala, cannot be monitored using NIRS. However, PFC activity can
be conveniently monitored using superficial light emitters and detectors.
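The conversion from detected light intensities to concentration changes is commonly done via the modified Beer-Lambert law, in which attenuation changes at two wavelengths are inverted through the chromophores' extinction coefficients. The sketch below uses made-up coefficients and a nominal differential pathlength factor (DPF) purely for illustration; real systems use measured extinction spectra:

```python
# Sketch of the modified Beer-Lambert law. The extinction coefficients
# below are illustrative placeholders, not measured values: attenuation
# changes at two wavelengths are inverted to concentration changes of
# HbO2 and Hb.
import numpy as np

def delta_concentrations(delta_od, ext, distance_cm, dpf):
    """delta_od: attenuation change at each wavelength, shape (2,);
    ext: 2x2 extinction matrix, rows = wavelengths, cols = (HbO2, Hb).
    Returns (delta[HbO2], delta[Hb])."""
    path = distance_cm * dpf  # effective optical path length
    return np.linalg.solve(np.asarray(ext) * path, np.asarray(delta_od))

# Hypothetical example at two wavelengths with made-up coefficients.
ext = [[1.0, 2.0],   # wavelength 1: Hb absorbs more strongly
       [2.5, 1.0]]   # wavelength 2: HbO2 absorbs more strongly
d_od = [0.02, 0.03]
d_hbo2, d_hb = delta_concentrations(d_od, ext, distance_cm=3.0, dpf=6.0)
print(d_hbo2 > 0, d_hb > 0)  # prints "True True"
```

The 2x2 inversion is solvable precisely because HbO2 and Hb absorb differently at the two wavelengths, which is the physical basis of dual-wavelength NIRS.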
1.6 Emotion induction via music
In a recent review of physiological markers of emotion [108], Kreibig illustrated the di-
versity of emotion induction paradigms and their usage frequencies. In the reviewed
sample, film clips were most frequently used, but other emotion induction methods such
as imagery, personalized recall and musical excerpts were also reported [108]. Among
these techniques, music, which is often presented over longer durations, has the capacity
to induce a response changing with time. Emotions experienced during the initial pre-
sentation of a piece of music may be different from those surfacing as the music unfolds.
However, music-based emotion induction is subject to debate among researchers, and the
Table 1.3: A summary of existing theories of emotions. See [50] for more details.

1954, Arnold & Gasson: Felt tendency towards an object, accompanied by specific bodily changes.
1986, Lutz & White: A means of negotiating social relations.
1991, Lazarus: Organized psychophysiological reactions with respect to ongoing relationships with the environment.
1991, Ekman: Characteristics common among emotions: "rapid onset, short duration, unbidden occurrence, automatic appraisal, and coherence among responses."
2008, Juslin: Emotions are typically described as relatively brief, though intense, affective reactions to potentially important events or changes in the external or internal environment that involve several subcomponents: cognitive appraisal (e.g., you appraise the situation as dangerous), subjective feeling (e.g., you feel afraid), physiological arousal (e.g., your heart starts to beat faster), expression (e.g., you scream), action tendency (e.g., you run away), and regulation (e.g., you try to calm yourself).
use of music as an emotion induction method is less prevalent than other stimuli [108].
Those opposing the use of music for inducing emotions have argued that music lacks the
immediacy of real-life situations (e.g. a life-threatening event).
The lack of consensus among researchers regarding the use of music for emotion
induction may stem from disagreement about the very definition of emotions. Theories
of emotion have emphasized different attributes of an emotional response in defining
emotions. Emotions have been defined with respect to bodily changes, social relations,
homeostasis within the surrounding environment, and automatic appraisal. Table 1.3
summarizes a number of different theories of emotion, and highlights the diversity among
them [50].
To overcome the lack of consensus about the role of music in inducing emotions,
Juslin et al. explored the ability of music to induce emotions based on various mechanisms
leading to emotional response [92]. Despite competing arguments for and against musical
emotion induction, music has been used as an emotional auditory stimulus in many
studies [106, 109, 81, 213, 60]. In addition, music has been used in many studies involving
emotional processing in the brain [15, 16].
There is considerable diversity in the choice of music excerpts used in studies of emotion.
These studies can be categorized into two main streams: (i) studies using music in un-
altered form (with no computer adjustments), and (ii) studies using music with modifications
to specific music characteristics, such as dissonance or chords, to influence emotional ex-
perience. For example, by modifying the degree of permanent dissonance (which affects
the pleasantness of stimuli) Blood et al. studied neural emotional processing with music
in a positron emission tomography study [16]. Steinbeis et al. [215] produced harmonic
sequences that ended on an irregular chord function, and were able to identify electro-
dermal activity modulations when the musical expectancy was violated. Other studies
use unaltered music belonging either to a collection of pre-selected music excerpts [167]
or a number of music pieces self-selected by the individuals [15].
With respect to neutral auditory stimulus, various strategies have been proposed. For
example, in some studies a neutral auditory stimulus was presented with environment
sounds (e.g. sounds from ocean waves or songbirds) [4]. Random static noise has also been
applied as a neutral stimulus [51, 212]. Other studies have used computer adjustments
to neutralize the emotional content of music [16].
Using music for emotion induction offers some specific advantages, particularly in
studies of emotion involving the pediatric population. The emotional content of music has
been shown to be discernible by children as young as 6 years of age [32]. In addition, emotions
in music are known to be perceived across cultures [55]. As a dynamic, cross-cultural
emotion induction method, music therefore has many merits. To achieve the goals of the
current thesis, a music emotion induction paradigm was implemented.
1.7 Objectives
The objective of this thesis was to implement and test a means of identifying emotional
arousal and valence in response to music using Near Infrared Spectroscopy (NIRS) of
the prefrontal cortex (PFC). To achieve this goal, multiple investigations were necessary
to resolve technical and physiological challenges in the context of neural correlates of
emotion. In this light, the specific objectives of this thesis were:
A. To identify correlates of emotion by characterizing the signals recorded via PFC
NIRS with respect to emotional arousal and valence.
B. To investigate whether the detected activity patterns in objective A were due to
emotional response or mere music perception.
C. To identify features from the NIRS signals which are correlated to emotional
response and investigate the ability to differentiate emotional arousal and valence based
on these features.
D. To compare detection accuracies achieved using PFC NIRS signals to those at-
tained with autonomic nervous system signals such as heart rate, skin temperature, and
electrodermal activity, which have previously been used for emotion identification in the
literature.
E. To design and test a multi-modal emotion (arousal and valence) identification
system using ANS and PFC NIRS monitors.
1.8 Roadmap
The roadmap of this thesis is organized according to the objectives listed above. Chap-
ters 3 to 6 are arranged as journal articles, each focused on one or more of the objectives listed
in section 1.7. The thesis structure is summarized in Figure 1.4. Chapter 2 provides
details regarding the methods and data collection procedures used. Chapters 3 to 6
may duplicate information regarding the procedures summarized in chapter 2. Likewise,
the introduction (or background) section of some chapters may also replicate informa-
tion presented in chapter 1. Where duplication occurs, the chapter preamble highlights
the sections that the reader can skip. Following the five main chapters, the thesis
concludes with a summary of contributions and recommendations for future studies.
In Chapter 2, the study protocol is described in detail. These descriptions explain
the experimental paradigm, data collection procedures, and measurements used. In ad-
dition, the relevant data preprocessing algorithms are introduced in detail.
In Chapter 3, PFC NIRS signals, namely [HbO2] and [Hb] are characterized using
wavelet peak detection. The wavelet peak detection algorithm allows characterization in
time and frequency domains. These wavelet characteristics are examined with respect to
subjective ratings of arousal and valence. This chapter is in line with Objective A.
In Chapter 4, the main effect of three music characteristics (mode, dissonance and
maximum sound pressure level), which are known to be effective in inducing emotions,
on PFC hemodynamics is investigated. PFC is likely to be involved in a brain network
specialized for perceiving emotions, and therefore, the activities observed may be due to
music perception and not the emotional content of the music. This chapter focuses on
objective B, and investigates whether PFC hemodynamics are directly affected by the
identified music characteristics.
In Chapter 5, a group of time-domain features are extracted from PFC [HbO2] and
[Hb] measurements. These features are then used for training two separate classifiers for
arousal and valence differentiation. In this validation study, a PFC NIRS-based arousal
and valence identification system is tested, and therefore objective C is addressed.
Autonomic nervous system (ANS) activity has long been used for identifying emo-
tions. Therefore, in the pursuit of a physiologically-based emotion identification system,
it is important to compare the current detection rates achieved using PFC with those
realized using ANS activity.
Figure 1.4: Thesis roadmap. Chapter 2: study protocol. Chapter 3: characterizing PFC hemodynamic changes with respect to valence and arousal. Chapter 4: investigating the effect of music characteristics. Chapter 5: automatic detection of emotional response using PFC hemodynamic features. Chapter 6: identifying emotional valence and arousal by combining autonomic and central nervous system activity. Chapter 7: concluding remarks.
In Chapter 6, ANS activity, collected in the form of heart rate, electrodermal activity
and skin temperature features, is used for solving the same classification problems as
those formulated in Chapter 5. In addition, a dynamic model-based feature extraction is
implemented to improve classification results by including frequency domain features. Ul-
timately, a mixture of classifier experts, each trained using PFC NIRS or ANS features, is used
for solving the classification problem (i.e. high arousal versus low arousal and positive
versus negative valence). Overall, this chapter investigates the ability of a multi-modal
emotion identification system to improve upon accuracies achievable with classifiers con-
sidering exclusively PFC NIRS features. In this manner, Chapter 6 addresses objectives
C, D and E.
Chapter 2
Experimental Protocol
2.1 Preamble
This chapter summarizes the experimental details including the procedures, methods,
data acquisition and data preprocessing.
2.2 Introduction
To test the hypotheses put forth in this thesis, a database of physiological
responses, and corresponding ratings of emotional valence (positive versus negative) and
arousal (intense versus neutral), was created using a collection of music excerpts. This
chapter details the data collection procedures and introduces the preprocessing techniques
applied to the light intensities collected via the NIRS device to obtain the [HbO2] and
[Hb] signals used in the rest of this thesis.
2.3 Participants
Ten able-bodied volunteers (five females, five males, age: 25 ± 2.7 years) were re-
cruited for this study. The participants reported to have normal hearing, and normal or
corrected-to-normal vision. The recruitment criteria excluded individuals with reported
cardiovascular diseases, metabolic disorders, history of brain injury, respiratory condi-
tions, drug- or alcohol-related conditions, and psychiatric conditions. Participants were instructed
to refrain from caffeine and alcohol consumption 5 hours prior to the study. Volunteers
had an average of 5.5 years of past music training. The duration of musical training is
reported due to previous research documenting the influence of music training on the
physiological responses to music [207]. Ethics approval was obtained from the Bloorview
Research Institute research ethics board (see Appendix E) and all participants provided
informed written consent.
2.4 Stimuli
The stimuli were composed of 78 music excerpts. All music segments were 45 s in du-
ration. The excerpts included lyrical and non-lyrical pieces. The lyrics were in different
languages (English, French, Italian and Spanish) to reduce potential effects of brain ac-
tivation due to mental singing. Within the excerpts presented to each participant, 72
standard music pieces were chosen by two researchers from different genres of music
(classical, rock, jazz and pop). Specifically, candidate pieces were assessed in terms of
their valence characteristics as suggested by the tone, rhythm and lyrics (where applica-
ble). Note that the researcher assessments were used solely to ensure an approximately
uniform representation of music between valences (positive versus negative). The actual
data analysis described in section 2.5 relied solely on participant ratings of valence and
arousal. For the participant-selected pieces, participants chose a priori three pieces of mu-
sic that personally induced intense positive emotions (joy or excitement) and three that
induced intense negative emotions (sadness). The control acoustic stimulus was Brown
noise (BN). User feedback in our pilot studies indicated that this type of noise was sub-
jectively more pleasant than white noise at the same sound pressure level [229]. For more
information regarding other alternatives for the neutral auditory stimulus, the reader is
referred to section 1.6. A list of the standard database presented to all participants is
presented in Appendix B.
2.5 Signal acquisition
An Imagent Functional Brain Imaging System from ISS Inc. (Champaign, IL) was used
for NIRS measurements. A custom made rubber polymer (3M 9900 series) headgear
held three light detectors and ten light sources in place over the forehead, as depicted
in Figure 2.1. At each source location (circles in Figure 2.1), two light sources, one at 830 nm
and the other at 690 nm, were co-located. This layout had been previously used for
prefrontal cortex monitoring in Power et al. (2010) and provided readings at the nine
shaded locations in Figure 2.1 [179]. With data from two wavelengths, this configuration
yielded 18 different channels of light intensity readings. The midpoint of the headgear
was aligned to the anatomical midline (as estimated by the position of the participant's nose),
while the lower edge of the headgear sat just above the eyebrows. Light sources were
modulated at 110 MHz and the detector amplifiers were modulated at 110.005 MHz, which
led to a cross-correlation frequency of 5 kHz. The data were sampled at 31.25 Hz. During
a complete cycle of all ten sources, each source illuminated the surface for 1.6 ms during
which eight acquisitions were made. A fast Fourier transform was applied to the average
of the eight waveforms to obtain an estimate of ac and dc intensities as well as the phase
delay [179]. The dc light intensities were used to determine HbO2 and Hb concentrations
(i.e. [HbO2] and [Hb] ).
2.6 Pre-processing
Low-frequency artifacts such as respiration, heart rate and the Mayer wave were filtered
using a type II third order Chebychev low pass filter with a cut-off frequency of 0.1
Figure 2.1: The layout of light sources (circles) and detectors (X's). The vertical line denotes the anatomical midline. The annotated shaded areas correspond to recording locations.
Hz (normalized stop-band edge frequency of 0.032 and stop-band ripple of 50 dB down
from the peak pass-band value) [178]. The 830 nm and 690 nm light intensities at each
of the nine recording sites were used to calculate HbO2 and Hb concentrations via the
modified Beer-Lambert law [30, 41], which resulted in 18 channels of concentration data.
To reduce the effects of initial device calibration, the concentration time series were
normalized within each experimental block against the mean in the same block.
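The pre-processing pipeline above can be sketched in Python as follows. The filter parameters match the description in this section, but the extinction coefficients, differential pathlength factor, and source-detector separation below are illustrative placeholders (the thesis relies on [30, 41] for the actual modified Beer-Lambert computation), so this is a sketch of the approach rather than the exact implementation.

```python
import numpy as np
from scipy import signal

FS = 31.25  # sampling rate (Hz), section 2.5

# Type II Chebyshev low-pass as described: 3rd order, 50 dB stop-band
# ripple, normalized stop-band edge frequency of 0.032.
b, a = signal.cheby2(3, 50, 0.032)

def lowpass(x):
    """Zero-phase low-pass filtering of one concentration channel."""
    return signal.filtfilt(b, a, x)

# Placeholder extinction coefficients [eps_HbO2, eps_Hb] at each wavelength.
E = np.array([[2.32, 1.79],    # 830 nm
              [0.95, 4.93]])   # 690 nm
DPF, SEP = 6.0, 3.0            # assumed pathlength factor, separation (cm)

def mbll(i830, i690):
    """Modified Beer-Lambert law: two light intensities -> ([HbO2], [Hb]).

    Optical density change is taken relative to each channel's mean,
    then a 2x2 linear system is solved per sample."""
    od = np.vstack([-np.log10(i / i.mean()) for i in (i830, i690)])
    conc = np.linalg.solve(E * DPF * SEP, od)
    return conc[0], conc[1]   # [HbO2], [Hb] time series

def normalize_block(c):
    """Normalize a concentration series against its block mean (section 2.6)."""
    return c - c.mean()
```

With the light-intensity channels paired by recording site, calling `mbll` once per site yields the 18 concentration channels described above.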
2.7 Study design
Each participant completed four sessions conducted on separate days. In each session,
the participant completed three blocks with optional breaks between blocks. Each block
consisted of 12 consecutive trials: four trials with positively valenced songs (one of which
was a participant-selected song), four trials with negatively valenced songs (one of which
was a participant-selected song) and four BN trials. Within a block, the music and
BN trials were pseudo-randomized, such that two BN trials never occurred consecutively
while positively and negatively valenced songs appeared in no apparent order. The same
Figure 2.2: Trial sequence
pseudo-random sequence of trials was employed for all participants. Figure 2.2 depicts
a trial sequence. In each trial, the participant listened to 10 s of BN, followed by a 45
s auditory stimulus (music or BN), and finally 5 s of BN. The sound level was faded in
and out at the beginning and end of the trial, respectively, to reduce the risk of eliciting
a startle. At the end of each trial, the participant rated the intensity and valence of their
emotional experience using a nine-level self-assessment Manikin [146] shown in Figure 2.3.
The beginning and end of each trial was marked by an audible tone. The participants
were instructed to close their eyes when they heard the initial tone, and to open their
eyes upon hearing the second tone.
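A minimal sketch of how one such constrained block ordering could be generated (rejection sampling over shuffles; the trial labels are illustrative, and the thesis fixed a single sequence used for all participants):

```python
import random

def make_block(rng=random):
    """Shuffle 12 trials -- four positive, four negative, four Brown
    noise (BN) -- until no two BN trials are adjacent (section 2.7)."""
    trials = ["pos"] * 4 + ["neg"] * 4 + ["BN"] * 4
    while True:
        rng.shuffle(trials)
        if all(not (x == y == "BN") for x, y in zip(trials, trials[1:])):
            return list(trials)

block = make_block(random.Random(1))  # 12 labels, BN never twice in a row
```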
Figure 2.3: The Self-Assessment Manikin rating system is shown. The top and the bottom rows depict valence (positive to negative) and arousal (intense to neutral) ratings, respectively. The participant could select one of the nine levels of arousal/valence by marking the corresponding circles shown. For example, in the sample rating provided, a very intense positive emotion is represented.
Chapter 3
Characterizing PFC Hemodynamic
changes due to valence and arousal
3.1 Preamble
This chapter investigates the overall hemodynamic patterns accompanying emotional
response. Identifying patterns associated with emotions in prefrontal cortex using near
infrared spectroscopy is an important step towards emotion identification. In this study,
NIRS recordings were used to characterize the PFC hemodynamic response to emotional
arousal and valence. In particular, a wavelet-based peak detection technique was used to
characterize chromophore concentration patterns.
This chapter is entirely reproduced from the following journal article: Moghimi S,
Kushki A, Guerguerian AM, Chau T. Characterizing emotional response to music in
the prefrontal cortex using near infrared spectroscopy. Neuroscience Letters, 2012.
Elsevier.
Readers can skip section 3.4.1 as it reiterates the procedures described in Chapter 2.
3.2 Abstract
Known to be involved in emotional processing, the human prefrontal cortex (PFC) can be
non-invasively monitored using near-infrared spectroscopy (NIRS). As such, PFC NIRS
can serve as a means for studying emotional processing by the PFC. Identifying pat-
terns associated with emotions in PFC using NIRS may provide a means of bedside
emotion identification for nonverbal children and youth with severe physical disabilities.
In this study, NIRS was used to characterize the PFC hemodynamic response to emo-
tional arousal and valence in a music-based emotion induction paradigm in 9 individuals
without disabilities or known health conditions. In particular, a novel technique based
on wavelet-based peak detection was used to characterize chromophore concentration
patterns. The maximum wavelet coefficients extracted from oxygenated hemoglobin con-
centration waveforms from all nine recording locations on the PFC were significantly
associated with emotional valence and arousal. Specifically, high arousal and negative
emotions were associated with larger maximum wavelet coefficients.
3.3 Introduction
Selected groups of nonverbal individuals with severe disabilities and little or no voluntary
muscle control have benefited from communication alternatives based on brain activity
known as brain computer interfaces (BCI) [237, 112]. However, due to developmental
delays, limited expressive communication and unknown levels of receptive communica-
tion, many nonverbal children and youth are usually not candidates for existing BCI
technologies. Indeed the aforementioned challenges preclude the training of specific men-
tal activities. However, these individuals are still candidates for affective BCIs (A-BCI)
which enable the automatic recognition of affective states using brain activity [158]. A-
BCIs may provide a means of detecting spontaneous and natural reactions to specific
stimuli. To this purpose, affective responses evoked by visual stimuli have been previ-
37
ously decoded in both facial thermographic [156] and cerebral hemodynamic pathways
[218].
Emotional responses engage many different areas of the brain including parts of the
limbic system, and prefrontal cortex (PFC) [164, 208, 126, 34]. Neuro-imaging techniques
such as positron emission tomography (PET) [221] and magnetic resonance imaging
(MRI) [21] have provided an opportunity for characterizing emotional perception in the
brain [39, 209, 44, 206, 16, 15]. However, the bulky set-ups required by PET and MRI
systems, and potential patient discomfort [150] preclude their use in studies of emotional
responses in real-life settings, and particularly in developing A-BCIs.
Among the different modalities available for monitoring brain activity, near infrared
spectroscopy (NIRS) is noninvasive, and particularly well-suited for monitoring PFC
activity, which is among the regions involved in emotional processing [34], in life-like
settings. NIRS monitors hemodynamic activity in the brain by measuring changes in
oxygenated and deoxygenated hemoglobin concentrations (i.e. [HbO2] and [Hb]) in
regional cerebral blood flow [90, 228]. NIRS is not prone to electrogenic
artifacts (e.g. electrooculogram), present in the forehead area. NIRS provides lower spa-
tial resolution compared to PET and MRI neuroimaging systems, but it is non-invasive,
relatively inexpensive, and portable. As such, NIRS may be particularly more amenable
for A-BCI development involving children with severe disabilities.
Exposure to emotionally-laden stimuli is known to produce measurable changes in
chromophore concentrations (i.e., [HbO2] and [Hb]) [74, 241, 218]. Examining hemody-
namic changes in the prefrontal cortex using NIRS, Hoshi et al. showed that exposure to
both pleasant and extremely unpleasant pictures led to increases and decreases in [Hb],
respectively [83]. Similarly, a recent study showed that highly-positive and negative
emotions associated with music could be differentiated with more than 70% accuracy
[144] based on prefrontal cortex NIRS measurements.
The current study used PFC NIRS to investigate characteristics of the hemodynamic
response, specifically [HbO2] and [Hb], to emotionally-laden music. Music has repeatedly
been used for emotion induction in various studies [91, 134]. Characterizing emotional
response to music in able-bodied adults is a step towards future investigations of emotion
in non-verbal individuals with severe disabilities.
In the current study, the relationship between chromophore concentrations, [HbO2]
and [Hb], and subjective ratings of emotional arousal and valence, was investigated us-
ing wavelet analysis. The wavelet transform is a tool for signal analysis in time and
frequency. Broadly speaking, the wavelet transform evaluates the similarity of the time
series to a given pattern, known as the mother wavelet. In particular, when applied to
a time-series, the wavelet transform produces a set of coefficients across a set of time
points and scale values where the wavelet coefficient at time t and scale a represents the
similarity of the data at t to the mother wavelet scaled by a factor of a. The wavelet
transform maps the signal onto a set of bases (wavelet family) consisting of scaled and
translated versions of a mother wavelet function.
In the present study, wavelet analysis provided a means of extracting oxygenation
patterns relevant to emotional valence and arousal, and was used to investigate the
shape of the peak [HbO2] and [Hb] (i.e. differentiate abrupt peaks from gradual peaks).
In particular, the maximum wavelet coefficient revealed the scale at which the signal
most closely resembles the prototypical hemodynamic response (e.g., increase followed
by decrease in oxygenation).
3.4 Methods
3.4.1 Procedures
Ten adults without disabilities or known health conditions (9 right-handed) were recruited
for this study. Only the 9 right-handed participants (5 female, age: 25 ± 2.7 years) were
included in the analysis to mitigate any response variations due to differences in hemi-
Figure 3.1: The layout of light sources (circles) and detectors (X's). The vertical line denotes the anatomical midline. The annotated shaded areas correspond to recording locations.
spheric dominance. The Bloorview Research Institute research ethics board approved of
the study, and informed written consent was provided by all participants. Participants
donned a custom polyethylene headgear, which covered their foreheads and accommo-
dated the placement of multiple emitters and detectors. An Imagent Functional Brain
Imaging System from ISS Inc. (Champaign, IL) was used for NIRS measurements across
nine different regions on the forehead (Figure 3.1). Each source housed two diodes that
emitted light at 830 nm and 690 nm. The light was detected by three detectors. The data
were sampled at 31.25 Hz.
During each trial, participants listened to either a music excerpt or a noise recording
that represented a neutral auditory stimulus (Figure 3.2). The music excerpts comprised
a database of 78 music pieces selected by the researchers together with 6 self-selected
excerpts for each participant. The study was divided into 4 separate sessions encompass-
ing 36 trials each (12 noise trials, 24 musical excerpts). After each trial, participants
were prompted to rate their emotions in terms of arousal and valence using a nine-level
self-assessment manikin [146]. The valence ratings were mapped from 1 (most positive) to
9 (most negative), and arousal ratings ranged from 1 (least intense) to 9 (most intense).
Figure 3.2: Trial sequence
3.4.2 Wavelet-based peak detection
In this phase of the study, the relationship between chromophore concentrations, [HbO2]
and [Hb], and subjective ratings of emotional arousal and valence, was investigated using
wavelet analysis. The wavelet transform is a tool for signal analysis in time and frequency.
Broadly speaking, the wavelet transform evaluates the similarity of the time series to a
given pattern, known as the mother wavelet. In particular, when applied to a time series,
the wavelet transform produces a set of coefficients (shown in (3.1)) across a set of time
translations and scale values, where the wavelet coefficient at time displacement u ∈ ℜ
and scale s ∈ ℜ+ represents the similarity of the data at u to the mother wavelet ψ(t)
scaled by a factor of s:

ψ_{u,s}(t) = (1/√s) ψ((t − u)/s)   (3.1)

The continuous wavelet coefficient corresponding to scale s and translation u can be
determined using (3.2):

Wf(u, s) = ∫_{−∞}^{+∞} f(t) (1/√s) ψ*((t − u)/s) dt   (3.2)

In this manner, the original signal f(t) is projected onto a two-dimensional space of
u and s. Therefore, Wf(u, s) allows the study of signal characteristics at time u and scale s.
The scale s changes inversely with frequency. Therefore, abrupt peaks (i.e. accompanied
by faster changes in the vicinity of the peak) correspond to larger coefficients at lower
scales, whereas gradual peaks (i.e. accompanied by slower changes in the proximity of
the peak) correspond to larger coefficients at higher scales. In this way, wavelet coefficients
identify both peaks and the rate of change near these peaks. The interested reader is
referred to [131] for more details regarding wavelet analysis.

Figure 3.3: Mexican hat wavelet
Wavelet analysis is often used for detecting patterns of interest in data, such as
stereotyped neuroelectric waveforms (e.g. event related potentials) [35, 206],
localized spikes in biological data [152], and unknown transients in the signal [54]. Visual
inspection of [HbO2] and [Hb] in high-arousal and positive/negative rated trials led to
the selection of the Mexican hat function as the mother wavelet (Figure 3.3).
The wavelet transform was computed for scales s ∈ {70, 71, ..., 400}, which map to
pseudo-frequencies in the range of 0.019 to 0.111 Hz [182] (the required range for capturing
the chromophore concentration changes, given that the filtered concentrations have useful
frequency content below 0.1 Hz). The transform was applied to time values
u ∈ {1, ..., k, ..., N}, where N was the number of samples included for analysis [152]. The first
5 s of the [HbO2] and [Hb] series during the music intervals were discarded in order to
ignore any residual activity from the period preceding music onset. Therefore, a total of
40 s of data, corresponding to N = 1250 samples, were included when determining wavelet
were extracted for subsequent analysis: the maximum wavelet coefficient over all time
(i.e. across all translations) and scales, and the scale at which the maximum occurred.
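As a sketch of this feature extraction, the following hand-rolled Mexican-hat (Ricker) continuous wavelet transform computes the maximum coefficient and the scale at which it occurs for one concentration trace. The implementation details (truncated wavelet support, coarser scale grid) are illustrative assumptions, not the thesis code; the pseudo-frequency check assumes the Mexican hat centre frequency of about 0.25.

```python
import numpy as np

FS = 31.25  # sampling rate (Hz), as in section 2.5

def ricker(points, a):
    """Mexican hat (Ricker) wavelet of scale `a` on a `points`-sample grid."""
    t = np.arange(points) - (points - 1) / 2.0
    x = t / a
    return (2 / (np.sqrt(3 * a) * np.pi ** 0.25)) * (1 - x ** 2) * np.exp(-x ** 2 / 2)

def cwt_max_feature(sig, scales):
    """Return (maximum wavelet coefficient, scale at which it occurs).

    Correlates the signal with the scaled wavelet at every translation;
    the Ricker wavelet is symmetric, so convolution equals correlation."""
    best_c, best_s = -np.inf, None
    for a in scales:
        w = ricker(min(10 * int(a), len(sig)), a)   # truncated support
        c = np.convolve(sig, w, mode="same").max()
        if c > best_c:
            best_c, best_s = c, int(a)
    return best_c, best_s

# Scale-to-pseudo-frequency mapping: f = fc * FS / scale, with fc ~ 0.25 for
# the Mexican hat; scales 70..400 then span roughly 0.0195 to 0.1116 Hz.
scales = np.arange(70, 401, 10)   # coarser grid than the full 70..400, for speed

# Demo on a synthetic 40 s trace (N = 1250 samples) containing a gradual peak.
t = np.arange(1250) / FS
trace = np.exp(-((t - 20.0) ** 2) / (2 * 3.0 ** 2))
mwc, s = cwt_max_feature(trace, scales)
```

A sharper peak in the trace drives the maximum toward smaller scales, which is exactly the property used here to separate abrupt from gradual concentration changes.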
3.4.3 Statistical analysis
To test whether or not the wavelet features (maximum wavelet coefficients for [HbO2]
and [Hb] and corresponding scales) were related to the subjective ratings of arousal and
valence, we used a mixed effects repeated measures linear regression analysis. Separate
regressions were conducted for valence (most positive (1) to most negative (9)) and
arousal (neutral (1) to most intense (9)) ratings, and for [HbO2] and [Hb] chromophores.
We report the regression coefficient (slope) and the associated p-value as an indicator
of the correlation. The analysis was repeated for each of the nine recording sites, again
considering arousal and valence ratings separately. To account for multiple comparisons
(9 recording sites), we set a Bonferroni-adjusted significance level of α = 0.05/9 ≈ 0.005
[79].
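The per-site analysis described above could be sketched with statsmodels as follows. The long-format column names (participant, site, mwc, arousal) are hypothetical, and a random intercept per participant stands in for the repeated-measures structure; this is an illustration of the analysis shape, not the thesis code.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

SITES = ["R1", "R2", "R3", "R4", "O", "L1", "L2", "L3", "L4"]
ALPHA = 0.05 / len(SITES)   # Bonferroni adjustment over the 9 sites

def site_slopes(df):
    """Mixed-effects regression of the MWC feature on arousal rating,
    fitted separately at each recording site, with a random intercept
    per participant to model the repeated measures."""
    results = {}
    for site in SITES:
        d = df[df["site"] == site]
        fit = smf.mixedlm("mwc ~ arousal", d, groups=d["participant"]).fit()
        slope, p = fit.params["arousal"], fit.pvalues["arousal"]
        results[site] = (slope, p, bool(p < ALPHA))
    return results
```

Separate calls with valence ratings in place of arousal (and [Hb] features in place of [HbO2]) reproduce the four regression families reported in the results.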
3.5 Results
Box-plots in Figures 3.4a and 3.4b depict the distributions of participant valence and
arousal ratings, respectively. Across participants, the median valence rating was neutral
(i.e. 5), but the median arousal rating was situated more toward the lower end of the
scale (i.e. 3).
Figure 3.7 illustrates [HbO2] and [Hb] recordings (in black) from the nine PFC inter-
rogation sites for a representative trial rated at the highest arousal and most negative
valence. Along with each recording is shown the corresponding temporal waveform of
wavelet coefficients (in grey) at the scale containing the maximum wavelet coefficient.
Figures 3.5 and 3.6 report the slopes of the regression lines between the maximum wavelet
coefficient and the subjective arousal and valence ratings, respectively. The results in
Figure 3.4: Box-plots of valence and arousal ratings for each participant.
Figures 3.5 and 3.6 indicated that the maximum wavelet coefficient and the corresponding
scale exhibit significant regional associations with emotional ratings. In particular, the
maximum wavelet coefficients extracted from [HbO2] were significantly related to ratings
of arousal in all nine recording regions while coefficients from [Hb] were correlated to
inferior left (L2, L3 and L4) and right (R2, R3 and R4) locations. In both cases, the
regression slopes were positive, indicating that high arousal resulted in larger [HbO2]
and [Hb] peaks in the respective regions. The scale of the maximum wavelet coefficient
provided a measure of the sharpness of the concentration peaks. Specifically, smaller
scales (higher signal frequency) correspond to more abrupt concentration changes whereas
larger scales (lower frequencies) are indicative of more gradual changes. As such, our
results suggest that higher ratings of arousal were associated with more gradual peaks in
[HbO2] in regions R2, R3, and L3. Collectively, these findings reveal that more intense
emotions were accompanied by larger and less abrupt changes in the concentration of
oxy- and deoxy-hemoglobin.
Negative emotions were correlated with larger values of the maximum wavelet coeffi-
cient across all nine recording sites in [HbO2], and inferolateral left (L3) and right (R3)
Interrogation site:       R1       R2       R3       R4       O        L1       L2       L3       L4

a. MWC
   [HbO2]  slope          0.9531   1.4831   1.7885   1.3225   0.9643   1.0918   1.2179   1.5341   1.4581
           p-value        <.0001   <.0001   <.0001   <.0001   <.0001   <.0001   <.0001   <.0001   <.0001
   [Hb]    slope                   0.2748   0.5811   0.4388                     0.3289   0.4989   0.5136
           p-value                 0.0007   <.0001   <.0001                     0.0001   <.0001   <.0001

b. Scale of MWC
   [HbO2]  slope                   5.5083   4.3377                                       4.7002
           p-value                 <.0001   0.0002                                       0.0002
   [Hb]                   (no significant slopes)
Figure 3.5: Slopes of regression lines between participant arousal ratings and (a) the maximum wavelet coefficient (MWC), and (b) the corresponding scale. Only slopes significantly different from zero are shown (p < 0.005).
Interrogation site:       R1       R2       R3       R4       O        L1       L2       L3       L4

a. MWC
   [HbO2]  slope          0.8048   1.9382   2.1436   1.6458   1.1121   1.2404   1.6016   2.0886   1.6445
           p-value        0.0043   <.0001   <.0001   <.0001   <.0001   <.0001   <.0001   <.0001   <.0001
   [Hb]    slope                            0.5016                                       0.5143
           p-value                          <.0001                                       <.0001

b. Scale of MWC
   [HbO2]  slope                            4.3357
           p-value                          0.0047
   [Hb]    slope                                                        -4.0693  4.8146
           p-value                                                      0.0032   0.0019
Figure 3.6: Slopes of regression lines between participant valence ratings and (a) the maximum wavelet coefficient (MWC), and (b) the corresponding scale. Only slopes significantly different from zero are shown (p < 0.005).
Figure 3.7: Plotted in black are the (a) [HbO2] (top panel) and (b) [Hb] (bottom panel) recordings across nine interrogation sites for a music sample inducing intense negative emotions from one of the participants during 45 seconds of aural stimulus. In grey are the corresponding waveforms of wavelet coefficients at the scale where the maximum wavelet coefficient occurs. These waveforms have been scaled by their standard deviation to facilitate visual comparison.
locations in [Hb]. More negative ratings also corresponded to more gradual peaks in [Hb]
at L2, and [HbO2] at R3. Therefore, negative emotions tended to elicit larger and less
sudden regional chromophore concentration peaks. More negative ratings, on the other
hand, resulted in sharper concentration peaks in a more midline, superior location (L1).
3.6 Discussion
In this study, arousal ratings were found to be associated with changes in chromophore
concentrations. Intense emotional experience has been reported to result in heightened
hemodynamic changes. Tanida et al. showed that mental stress induction could result
in bilateral increases in [HbO2] and decreases in [Hb] [219]. Matsuo et al.
have reported PFC [HbO2] increases in a group of individuals with post-traumatic stress
disorder as well as a healthy control group in response to trauma-related videos [134].
Previous findings have reported lateral activation in the PFC due to positive or neg-
ative emotional stimuli [235]. For example, Altenmuller et al. [4] reported an increase in
the left temporal activation due to exposure to positive auditory stimuli, and a bilateral
increase in response to negative auditory stimuli using electroencephalography (EEG).
However, in the current study, significant regression slopes were observed bilaterally.
Therefore no evidence of lateral activation patterns was obtained with respect to ratings
of valence.
The significance of the regression slopes resulting from models involving maximum
wavelet coefficients of [HbO2] indicated that the Mexican hat mother wavelet was a suit-
able template for identifying patterns relevant to emotional arousal and valence in [HbO2]
across all nine recording sites. Unlike static emotion induction paradigms (e.g. pictures)
where short exposure times can result in emotional experience [74, 218, 241], dynamic
emotion induction paradigms (e.g. music) can involve emotional unfolding at any
time during the course of exposure to stimuli. For example, the emotions experienced
during the introduction to a musical piece may be different from those experienced during
the main body. This scenario resembles real life emotional experience where emotions
can be manifested at any point in time. The results of the current study encourage
future studies of the temporal dynamics of emotion [106] using wavelet analysis for the
localization of emotional responses in time.
Chapter 4
The Effect of Music Characteristics
4.1 Preamble
There is compelling evidence of a network in the brain specialized for perceiving music.
For example, previous studies of focal lesions in the brain have indicated selective loss
of the ability to perceive specific music characteristics [243, 231]. This network may
include the prefrontal cortex (PFC)[16, 99]. Therefore, the PFC may play a dual role of
perceiving music characteristics and formulating emotional responses. To identify which
of these two mechanisms (music perception vs. emotional response) were involved in
the activation patterns observed in the PFC, the effect of music characteristics, namely
mode, dissonance and sound pressure level on the PFC [HbO2] and [Hb] was investigated
in this chapter.
4.2 Introduction
Every musical piece can be characterized by specific structural and performance fea-
tures. Performance characteristics, such as energy, timbre and pitch, involve the manner
in which the performer executes a musical piece. These features are quite variable due to
differences in performer skills and state. Structural features, on the other hand, involve
acoustic and foundational characteristics of music and are more consistent across per-
formers. These structural features, which include dissonance and mode, are shown to play
an important part in conveying the emotional content of a musical piece [91]. Therefore,
these characteristics may have played a part in inducing certain emotional experiences,
and these emotional responses may have resulted in prefrontal hemodynamic changes
detected in Chapter 3. However, this reasoning may be challenged by an alternative
view point regarding the perception of musical characteristics in the brain.
Previous research has identified particular brain networks specialized for perceiving
musical characteristics. Primarily, lesions in the temporal lobe and auditory cortex were
shown to affect perception of pitch and tonal melodies. Zatorre showed that lesions in
the right temporal lobe could adversely affect the ability to discriminate tonal melodies
[243]. In a study involving a control group and thirty-six patients with focal excisions,
Warrier et al. identified the right anterior auditory cortical areas as being responsible for
pitch judgements [231]. Such findings provide compelling evidence for the existence of
a brain network specialized for music perception. This network may include the PFC,
implying that the prefrontal area itself may be involved in perceiving music. Khalfa et al., who used
major and minor mode for emotion induction in an MRI study, reported left orbito- and
mid-dorsolateral frontal activations in response to the minor mode [99]. Using auditory
stimuli designed to only vary in harmonic dissonance and unpleasantness, Blood et al.
found that the subjective ratings of dissonance correlated negatively with orbitofrontal
and ventromedial prefrontal cortex activation [16].
Due to the potential role of the PFC in perceiving music, it was necessary to in-
vestigate whether the activity patterns identified were purely due to the perception of
music characteristics or a result of the emotional content of the music. In this phase, the
influence of music characteristics on the observed activity patterns detected in the PFC
[HbO2] and [Hb] was investigated. Musical characteristics, such as dissonance and sound
pressure level, were compared to hemodynamic changes. In addition, emotional ratings
of arousal and valence were compared to average musical characteristics extracted from
the corresponding trials.
4.3 Methods
4.3.1 Music characteristic extraction
The music characteristics investigated in this chapter included mode (major or minor
tonality), sound pressure level (volume), and dissonance (a characteristic of harmony).
Dissonance has been noted as a mechanism by which modern music is capable of inducing
emotions. Children as young as 4 months were shown to react differently when exposed
to consonant versus dissonant music pieces [244]. The ability of dissonance to induce
emotions has been attributed to an innate response to danger because many alarming
sounds in nature such as cries of birds are dissonant auditory cues. Therefore, dissonance,
resulting from modifications to harmonic structures, can convey the salience of the au-
ditory stimulus and result in emotional response. Previous studies of emotion have used
dissonant and consonant music excerpts for inducing pleasant and unpleasant emotions
[16]. Similarly, intensity or volume has been shown to play a role in inducing emotions.
Studies of music for marketing purposes, and psychological assessments have confirmed
that the music volume can play a role in emotion induction [232, 19]. Finally, music
mode is shown to affect emotion induction; the major mode is commonly associated
with positive valence while minor mode conveys negative emotional content. In previous
studies of emotion involving brain activity, music mode has been used to convey positive
and negative emotions [99]. Unlike dissonance which involves the harmonic structure of
music, mode is related to the melodic characteristics of music. Interested readers are
referred to [91] for more information regarding emotional content of music.
For each music excerpt used for this study, mode and dissonance features were de-
termined using the music information retrieval toolbox (MIRTOOLBOX) developed in
University of Jyväskylä, Finland, and implemented in MATLAB (MathWorks) [124]. MIRtoolbox
allows time-domain extraction of music characteristics by breaking the music
piece into time epochs. These epochs were chosen to be 1.5 ms. Average characteristics
were extracted from the entire course of the trial. For more information regarding music
characteristic extraction see Appendix C.
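MIRtoolbox is a MATLAB package; as an illustration of the epoch-based extraction described above, the following sketch computes per-epoch RMS levels in Python. The epoch length, dB reference, and test signal here are assumptions for illustration, not MIRtoolbox settings.

```python
import numpy as np

def epoch_levels(samples, fs, epoch_s=0.05):
    """RMS level of each fixed-length epoch, in dB relative to full scale.
    The 50 ms epoch length is an illustrative choice, not MIRtoolbox's."""
    n = int(round(epoch_s * fs))
    n_epochs = len(samples) // n
    levels = np.empty(n_epochs)
    for i in range(n_epochs):
        frame = samples[i * n:(i + 1) * n]
        rms = np.sqrt(np.mean(frame ** 2))
        levels[i] = 20.0 * np.log10(max(rms, 1e-12))  # floor avoids log(0)
    return levels

# constant-amplitude test tone: every epoch should sit at the same level
fs = 8000
t = np.arange(0, 2.0, 1.0 / fs)
tone = 0.5 * np.sin(2.0 * np.pi * 440.0 * t)
levels = epoch_levels(tone, fs)
max_level = levels.max()     # trial "maximum sound pressure level" feature
mean_level = levels.mean()   # trial-average level
```

The maximum and mean over the epoch series correspond to the trial-level summaries (maximum sound pressure level, average characteristics) used in the analyses of this chapter.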
4.3.2 Music database
As described in chapter 2, the music collection used during the data acquisition phase
was composed of two subsets: 72 music pieces identically played for all participants, and
six self-selected songs specific to each participant. The self-selected music excerpts were
played once per session. Therefore, each participant was exposed to four repetitions of
the same song. During these repetitions, the music characteristics remained the same.
Therefore, comparing [HbO2] and [Hb] recorded during separate repetitions may provide
an opportunity to detect music characteristic-dependent activity patterns in the PFC
hemodynamics.
The remaining 72 music pieces and the respective arousal and valence ratings were
used to detect whether emotional ratings have been influenced by music characteristics.
In addition, the hemodynamic response to the common music excerpts was compared to
music characteristics to identify if music characteristics had a significant effect on the
hemodynamic patterns observed.
4.3.3 Statistical analysis
To test whether the music characteristics were related to subjective ratings of arousal and
valence, a mixed effects repeated measures linear regression analysis was fit to the music
characteristics extracted from the common music excerpts used (i.e. 72 music pieces).
Separate mixed effect regression analyses were conducted for valence and arousal ratings.
The p-values associated with the regression slopes were recorded as indicators of the
Table 4.1: P-values for the main effect of arousal and valence rating in modeling mode,dissonance and maximum sound pressure level.
                                Arousal    Valence
Mode                            0.6128     0.0056
Dissonance                      0.0082     <0.0001
Maximum sound pressure level    0.0280     0.0006
significance of the detected relationship (p < 0.05).
To determine the extent to which music volume, dissonance and mode have affected
[HbO2] and [Hb] averaged across the nine recording regions, a mixed effect model was fit
to the peak values of average [HbO2] and [Hb] with the main effect of each music char-
acteristic separately (i.e. mode, maximum sound pressure level and average dissonance).
For region specific analysis of [HbO2] and [Hb] with respect to mode, maximum sound
pressure level and dissonance, please see appendix D.
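The regression step can be sketched as follows. This is a deliberate simplification: the thesis fit mixed-effects repeated-measures models, whereas this sketch computes a single-level ordinary least-squares slope and its t-statistic, ignoring the per-participant random effect; the variable names and synthetic data are illustrative only.

```python
import numpy as np

def ols_slope_t(x, y):
    """OLS slope of y on x and its t-statistic. A simplified stand-in for
    the mixed-effects models in the text: the per-participant random
    effect is omitted, so inference would differ from the thesis."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    slope = (xc @ yc) / (xc @ xc)
    resid = yc - slope * xc
    se = np.sqrt((resid @ resid) / (len(x) - 2) / (xc @ xc))
    return slope, slope / se

# synthetic example: valence ratings trending down with dissonance
rng = np.random.default_rng(0)
dissonance = rng.uniform(0.0, 1.0, 60)
valence = 5.0 - 3.0 * dissonance + rng.normal(0.0, 0.5, 60)
slope, t_stat = ols_slope_t(dissonance, valence)
```

A slope significantly different from zero (large |t|) is the criterion mirrored by the p < 0.05 threshold used in the text.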
4.4 Results
Table 4.1 summarizes the significance of the slope of the regression line for the mixed
effect model fit to mode, maximum sound pressure and dissonance to the main effect
of arousal and valence ratings in two separate models. The ratings of valence were
found to be significantly (p < 0.05) related to mode, maximum sound pressure level and
dissonance in each trial. The ratings of arousal were significantly related to dissonance
and maximum sound pressure level.
Music characteristics did not significantly (p < 0.05) influence the peaks of [HbO2]
and [Hb] averaged across the nine recording sites. Table 4.2 summarizes the p-values
corresponding to the main effects of the music characteristics, namely, mode, maximum
sound pressure and dissonance in modeling peaks of average [HbO2] and [Hb] across the
nine recording locations.
Table 4.2: P-values for the main effect of music characteristics (i.e. dissonance, mode,and maximum sound pressure level) in modeling the peaks of [HbO2] and [Hb] averagedacross the nine recording sites.
          Mode     Dissonance    Maximum sound pressure level
[HbO2]    0.205    0.098         0.059
[Hb]      0.769    0.052         0.250
Figure 4.1: In grey: the normalized sound pressure level of self-selected song A for participant 3. In black: normalized [HbO2] averaged across the nine recording locations, shown for each of the four repetitions of song A. The [HbO2] varied across repetitions of the same song.
Figure 4.1 depicts the normalized [HbO2] averaged across all nine interrogation sites
during 4 repetitions of a self-selected song. Visual inspection suggests that the average
[HbO2] collected during separate repetitions of the same song showed temporal differ-
ences. For example, in Figure 4.1, the peak [HbO2] appeared at different time points.
4.5 Discussion
The slope of the regression line fit to the music characteristics reached significance
(p < 0.05) with the main effect of arousal and valence rating (with the exception of
music mode modeled using the arousal rating). The arousal and valence ratings represent
emotional experience; therefore, these results confirmed the significant effect of music
characteristics in inducing emotions (Table 4.1). The music mode was found to signif-
icantly influence valence (p < 0.05) while the effect of mode on arousal did not reach
significance. This finding echoes those of previous studies involving emotional ratings
and music characteristics. Husain et al. showed that modifying the mode of a piece
by Mozart can affect the mood without influencing arousal [85]. Previous studies have
acknowledged the effect of mode on the perceived valence among listeners [75].
As shown in Table 4.2, mode, dissonance and maximum sound pressure level did
not significantly influence the peak PFC hemodynamics across participants (The slope
of the regression line did not reach significance). The results shown in Table 4.1, on
the other hand, confirmed that music characteristics significantly influenced subjective
ratings (with the exception of mode which did not significantly influence arousal ratings).
In addition, as shown in figure 4.1, repeated exposure to a music excerpt with identical
music characteristics resulted in different PFC hemodynamic patterns. Based on these
findings, mode, dissonance and maximum sound pressure level are unlikely to have directly
influenced PFC hemodynamics, but they significantly (p < 0.05) influenced subjective
ratings of arousal and valence and, therefore, emotional experience.
4.5.1 Subject specific patterns
As described in previous chapters, emotions can be individual specific since different
participants may manifest different levels of emotional sensitivity. The variability in
emotional ratings in this study (see Figure 3.4) highlights the subject-specific nature of
emotional experience. In addition, in the same participant, emotional response may vary
between sessions due to mood differences [191]. These differences in emotional experience
may be responsible for the amount of variability between repeated exposure to the same
music excerpts as observed in Figure 4.1.
4.5.2 Temporal dynamics
Music characteristics are dynamic phenomena, and so are PFC hemodynamics and emo-
tions. Therefore, instantaneous comparisons between these three elements seem necessary
for understanding how they interact. However, accessing instantaneous emotional ratings
by interrupting the user may result in distractions impeding natural emotional response.
Therefore, in the current thesis, emotional ratings were collected at the end of each mu-
sic excerpt. Future studies involving music characteristics and the brain should consider
implementing experimental paradigms to realize dynamic emotional ratings.
4.6 Conclusion
In this chapter, the effect of music characteristics, namely mode, dissonance, and maximum
sound pressure level, on subjective ratings of arousal/valence and maximum [HbO2]
and [Hb] in the PFC was investigated. The PFC [HbO2] and [Hb] averaged across the
nine recording locations were not significantly influenced by the music characteristics
under investigation. However, the results indicated that dissonance and maximum sound
pressure level have significantly influenced subjective ratings of arousal and valence. In
addition, the ratings of valence were found to be significantly influenced by music mode.
Overall, these findings supported the conjecture that music characteristics can affect
emotional experience. The evidence fails to support the hypothesis that the observed PFC
hemodynamic patterns were due to music perception. Therefore, the patterns in the PFC
were more likely to have resulted from the underlying emotions than the perception of
music.
Chapter 5
Automatic Detection of Emotional
Response to Music
5.1 Preamble
In this chapter, the feasibility of automatic detection of emotional response to aural
stimuli using near-infrared spectroscopy of the prefrontal cortex is examined. It details
the machine learning algorithms used to train participant-specific classifiers for
differentiating various levels of arousal and valence.
This chapter is entirely reproduced from the following journal article: Moghimi S,
Kushki A, Guerguerian AM, Chau T, Automatic detection of a prefrontal cortical response
to emotionally rated music using multi-channel near-infrared spectroscopy. Journal
of Neural Engineering. 2012; 9(2): 026022.
Readers may skip sections 5.4.1 and 5.4.2, which reiterate the procedures described
in Chapter 2.
5.2 Abstract
Emotional responses can be induced by external sensory stimuli. For severely disabled
nonverbal individuals who have no means of communication, the decoding of emotion
may offer insight into an individual's state of mind and his/her response to events taking
place in the surrounding environment. Near-infrared spectroscopy (NIRS) provides an
opportunity for bed-side monitoring of emotions via measurement of hemodynamic activ-
ity in the prefrontal cortex, a brain region known to be involved in emotion processing. In
this paper, prefrontal cortex activity of ten able-bodied participants was monitored using
NIRS as they listened to 78 music excerpts with different emotional content and a control
acoustic stimulus consisting of Brown noise. The participants rated their emotional
state after listening to each excerpt along the dimensions of valence (positive versus nega-
tive) and arousal (intense versus neutral). These ratings were used to label the NIRS trial
data. Using a linear discriminant analysis-based classifier and a two-dimensional time-
domain feature set, trials with positive and negative emotions were discriminated with
an average accuracy of 71.94% ± 8.19%. Trials with audible Brown noise representing a
neutral response were differentiated from high arousal trials with an average accuracy of
71.93% ± 9.09% using a two-dimensional feature set. In nine out of the ten participants,
response to the neutral Brown noise was differentiated from high arousal trials with ac-
curacies exceeding chance level, and positive versus negative emotional differentiation
accuracies exceeded the chance level in seven out of the ten participants. These results
illustrate that NIRS recordings of the prefrontal cortex during presentation of music with
emotional content can be automatically decoded in terms of both valence and arousal
encouraging future investigation of NIRS-based emotion detection in individuals with
severe disabilities.
5.3 Introduction
Emotions have been characterized as patterns of experience, perception, action and com-
munication that can be animated in response to physical and social encounters [96].
Some theories suggest that emotions can be manifested as a result of human inter-
actions with the surrounding environment [53, 125, 23], which result in physiological
changes [160] such as the modulation of central and peripheral nervous system activity
[15, 74, 9, 28, 210, 111]. These changes may facilitate the identification of emotional
state in non-verbal individuals with severe disabilities who may have no other means
of expression. Of particular appeal is the detection of affective responses through brain
activity monitoring, as there is no requirement for voluntary motor control. Indeed,
computer-based detection of emotional responses may enhance implicit communication
about the user in human-computer interaction systems [31]. Affective computing has
long been touted for its potential for more realistic and user-accommodating interactions
[171]. An emotionally aware system stands to benefit non-verbal individuals with severe
disabilities by estimating their emotional state in the absence of more direct means of
interaction (e.g. speech and gestures). In turn, knowledge of the patient's affective state
may help to mitigate care-giver stress and facilitate treatment decisions in a timely fash-
ion [70]. Various brain circuits including parts of the limbic system and amygdala are
responsible for perception of emotional stimuli [164, 208, 126]. In addition, the frontal
region of the human brain is involved in regulating emotional response to sensory input
[187, 33, 34]. For example, severity of the depressive symptomatology in patients fol-
lowing stroke lesions was reported to be significantly correlated with proximity of the
lesion to the frontal pole [186]. Moreover, left and right frontal activations were also
found in response to watching video clips inducing positive and negative emotional re-
sponses, respectively [235]. Activations in the orbito-frontal and ventral prefrontal cortex
in response to highly pleasurable self-selected music excerpts have also been reported [15].
Among various brain measurement modalities such as electroencephalography [157],
positron emission tomography [221], magnetoencephalography [68] and magnetic reso-
nance imaging (Bushong 1988), near infrared spectroscopy (NIRS) is particularly well
suited to long-term bedside monitoring of prefrontal cortex activity. NIRS involves the
optical measurement of changes in oxygenated (HbO2) and deoxygenated hemoglobin
(Hb) concentrations in regional cerebral blood flow [90, 228]. Being an optical modality,
NIRS measurements are not susceptible to electrogenic artifacts such as electrooculo-
grams and electromyograms.
NIRS has been used previously to detect emotional responses in the prefrontal cortex.
Recent findings with emotionally laden visual stimuli have confirmed the presence of
prefrontal cortex activations detectable by NIRS [74, 241, 83]. Likewise, in the context
of automatic emotion detection, Tai and Chau (2009) were able to differentiate between
prefrontal responses to affective pictures and baseline activity on a single-trial basis
with an average of 75% accuracy [218]. However, the perception of visual stimuli may
require gaze fixation and the control of the eye muscles responsible for keeping the eyes
open. Therefore, individuals with severe disabilities who possess little or no voluntary
muscle control, possibly concomitant with vision impairment, may not be able to observe
visual stimuli. However, evidence suggests that aural stimuli, the perception of which
requires no voluntary muscle control, can also elicit a pre-frontal response [15, 16, 17].
Previous findings indicate that when used as a BCI control task, active music imagery
(mental singing) can be differentiated from the rest state and mental math with accuracies
significantly above chance [177, 45, 65]. However, NIRS-based automatic detection of
passive prefrontal responses to affective aural stimuli remains unexplored to date. In
this study, we examined the feasibility of automatically detecting emotional responses to
aural stimuli by near-infrared spectroscopic interrogation of the prefrontal cortex. Music
in particular is recognized for its ability to induce an emotional response in a wide array
of individuals [138]. The emotional content of music is known to be perceived across
cultures [55] and distinguished by children as young as 6 years of age [32]. In fact, music
has been frequently used as an emotional auditory stimulus [106, 109, 81, 213, 60]. In
this paper, music excerpts were thus used for inducing affective brain activity.
5.4 Methods
Ten able-bodied volunteers (five females, five males, age: 25 ± 2.7 years) were recruited for
this study. The participants reported to have normal hearing, and normal or corrected-to-
normal vision. The recruitment criteria excluded individuals with reported cardiovascular
diseases, metabolic disorders, history of brain injury, respiratory conditions, drug- and
alcohol-related conditions, and psychiatric conditions. Participants were instructed to refrain from
caffeine and alcohol consumption 5 h prior to the study. Volunteers had an average of 5.5
years of past music training. Ethics approval was obtained from the Bloorview Research
Institute research ethics board (see Appendix E) and all participants provided informed
written consent.
5.4.1 Stimuli
The stimuli were composed of 78 researcher-selected and 6 participant-selected musical
pieces. All music segments were 45 s in duration. The excerpts included lyrical and
nonlyrical pieces. The lyrics were in different languages (English, French, Italian and
Spanish) to reduce potential effects of brain activation due to mental singing. The 78
standard music pieces were chosen by two researchers from different genres of music
(classical, rock, jazz and pop). Specifically, candidate pieces were assessed in terms of
their valence characteristics as suggested by the tone, rhythm and lyrics (where applica-
ble). Note that the researcher assessments were used solely to ensure an approximately
uniform representation of music between valences (positive versus negative). The actual
data analysis described in section 2.5 relied solely on participant ratings of valence and
arousal. For the participant-selected pieces, participants chose a priori three pieces of
music that personally induced intense positive emotions (joy or excitement) and three
that induced intense negative emotions (sadness). The control acoustic stimulus was
Brown noise (BN). User feedback in our pilot studies indicated that this type of noise
was subjectively more pleasant than white noise at the same sound pressure level (Voss
and Clarke 1978).
Each participant attended four sessions, which occurred on separate days, no more
than four weeks apart. In each session, participants completed three blocks with optional
breaks between blocks. Each block consisted of 12 consecutive trials: four trials with
positively valenced songs (one of which was a participant-selected song), four trials with
negatively valenced songs (one of which was a participant-selected song) and four BN trials.
Within a block, the music and BN trials were pseudo-randomized, such that two BN trials
never occurred consecutively while positively and negatively valenced songs appeared in
no apparent order. The same pseudo-random sequence of trials was employed for all
participants. Figure 2 depicts a trial sequence. In each trial, the participant listened to
10 s of BN, followed by a 45 s auditory stimulus (music or BN), and finally 5 s of BN.
The sound level was faded in and out at the beginning and end of the trial, respectively,
to reduce the risk of eliciting a startle. At the end of each trial, the participant rated
the intensity and valence of their emotional experience using a nine-level Self-Assessment
Manikin (Morris 1995). The beginning and end of each trial was marked by an audible
tone. The participants were instructed to close their eyes when they heard the initial
tone, and to open their eyes upon hearing the second tone.
In this phase, hemodynamic activity was represented by features extracted from
[HbO2] and [Hb] concentrations. A subset of these features was selected and used for
training a linear discriminant analysis based classifier. The classifier was then tested
using a second subset set aside for testing. The training and testing feature subsets
were both labeled based on arousal and valence ratings provided by the participants.
Classifiers were trained separately for differentiating arousal and valence levels in each
participant.
5.4.2 Preprocessing
Low-frequency artifacts such as respiration, heart rate and the Mayer wave were filtered
using a third-order type II Chebyshev low-pass filter with a cut-off frequency of 0.1
Hz (normalized stop-band edge frequency of 0.032 and stop-band ripple of 50 dB down
from the peak pass-band value) [178]. The 830 nm and 630 nm light intensities at each
of the nine recording sites were used to calculate HbO2 and Hb concentrations via the
modified Beer-Lambert law [30, 41], which resulted in 18 channels of concentration data.
To reduce the effects of initial device calibration, the concentration time series were
normalized within each experimental block against the mean in the same block.
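The filter design described above can be reproduced approximately with SciPy. The sampling rate is not stated in this excerpt; 6.25 Hz is inferred here from the quoted cut-off (0.1 Hz) and normalized stop-band edge (0.032), and should be treated as an assumption.

```python
import numpy as np
from scipy.signal import cheby2, filtfilt

# 0.1 Hz cut-off with a normalized stop-band edge of 0.032 implies a
# sampling rate near 6.25 Hz (0.1 / 0.032 = fs / 2). This rate is an
# inference, not a figure stated in the text.
FS = 6.25

# third-order type II Chebyshev low-pass, 50 dB stop-band attenuation
b, a = cheby2(3, 50, 0.1, btype="low", fs=FS)

def preprocess(series):
    """Zero-phase low-pass filtering followed by normalization against the
    block mean, mirroring the two preprocessing steps described above."""
    filtered = filtfilt(b, a, series)
    return filtered - filtered.mean()

# demo: slow hemodynamic trend plus ~1 Hz cardiac-band interference
t = np.arange(0.0, 60.0, 1.0 / FS)
raw = 0.5 * np.sin(2 * np.pi * 0.02 * t) + 0.2 * np.sin(2 * np.pi * 1.0 * t)
clean = preprocess(raw)
```

Zero-phase filtering (`filtfilt`) is used here so that the concentration peaks are not shifted in time; the text does not specify whether causal or zero-phase filtering was applied.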
5.4.3 Feature extraction
Two genres of features were considered: laterality features and single-channel features.
All features were extracted from [HbO2] and [Hb] concentrations. Table 5.1 summarizes
the features used. Single-channel features were calculated at each of the 9 interrogation
locations and consisted of the mean, slope and coefficient of variation of the concentration
signals during the 45s aural stimuli period, as well as the change in the average concen-
tration from the preceding baseline period to the task period. The slope was determined
by fitting a line using linear regression to all data points in the 45s trial window and
calculating the corresponding slope. The entire 45s window was used for determining the
slope because the concentration changes could occur at any point during the presenta-
tion of the aural stimulus. Coefficient of variation was determined by finding the ratio
of the variance to the mean over the course of the trial. The amplitude-based features
reflected the level of chromophore concentration which captured regional brain activity.
The slope of the concentration waveform represented response latency (i.e. faster vs.
slower changes). Such features have previously characterized task-based activation in
Table 5.1: Summary of features used in the analysis

Feature type                    Features
Laterality features             Lateral slope ratio (LSR) = right concentration slope / left concentration slope
                                Lateral absolute mean difference (ΔLM) = |left concentration mean - right concentration mean|
Single channel-based features   Stimuli period mean (M)
                                Stimuli period slope (S)
                                Coefficient of variation (CV)
                                Mean difference between signal and noise (ΔM) = stimuli period mean - preceding noise period mean
the prefrontal cortex [178, 180, 218, 151]. In total there were 4 features/location × 9
locations × 2 chromophore concentrations = 72 single-channel features.
The two laterality features quantified differences in activity between the left and the
right sides, and thus were calculated for each of the four pairs of interrogation locations
symmetrical about the midline (i.e., 1L-1R, 2L-2R, 3L-3R and 4L-4R in Fig 2.1). Later-
ality features included the ratio of the concentration signal slopes, and the difference in
the average signal values, between corresponding left and right channels. The inclusion
of these features was motivated by physiological findings that confirm lateralized activa-
tions in response to emotional stimuli [235, 33, 4]. In total, there were 2 features/channel
pair × 4 channel pairs × 2 chromophore concentrations = 16 laterality features.
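The feature computations described above can be sketched as follows. This is a minimal NumPy illustration, not the thesis code: the function names are hypothetical, the 31.25 Hz sampling-rate default is taken from the recording description, and the coefficient of variation uses the variance-to-mean form stated in the text.

```python
import numpy as np

def single_channel_features(conc, baseline, fs=31.25):
    """Four single-channel features from one chromophore time series:
    stimulus-period mean (M), slope (S), coefficient of variation (CV),
    and mean change from the preceding baseline period (dM)."""
    t = np.arange(len(conc)) / fs
    m = conc.mean()
    s = np.polyfit(t, conc, 1)[0]      # line fit over the full stimulus window
    cv = conc.var() / m                # variance-to-mean ratio, as defined in the text
    dm = m - baseline.mean()
    return m, s, cv, dm

def laterality_features(left, right, fs=31.25):
    """Laterality features for one symmetric channel pair:
    lateral slope ratio (LSR) and absolute lateral mean difference (dLM)."""
    t = np.arange(len(left)) / fs
    lsr = np.polyfit(t, right, 1)[0] / np.polyfit(t, left, 1)[0]
    dlm = abs(left.mean() - right.mean())
    return lsr, dlm
```

Applied per chromophore, this yields the 4 × 9 × 2 = 72 single-channel and 2 × 4 × 2 = 16 laterality features counted above.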
5.4.4 Classification procedures
For each trial, 65 seconds of data were extracted, including the 45 second stimulus period
and the preceding (10 s) and subsequent (5 s) Brown noise periods. The trials with Brown
noise (BN) were set aside, and the rest of the data were partitioned according to arousal
and valence ratings. For the analysis of arousal, the 48 highest rated trials (out of 96
trials with music) over all four sessions were selected. For the valence component, the
24 highest positively-rated and 24 highest negatively-rated trials across all four sessions
(out of 96 trials with music) were selected. The high arousal (HA), positive valence (PV),
negative valence (NV), and Brown noise (BN) trials were labeled accordingly. Note that
arousal and valence labeling were performed independently [156, 190].
A classifier based on linear discriminant analysis [40] was used to solve two different
two-class classification problems (HA vs. BN and PV vs. NV). Comparing the two
valence categories (i.e. PV and NV) individually with Brown noise was not feasible due to
the difference in sample sizes (nPV = nNV = 24, nBN = 48). The classification accuracy
was estimated using the average of 50 independent iterations of 10-fold cross-validation.
Due to differences in prefrontal activation across participants, feature selection
was performed to select the subset of the feature set that best separated the two classes for
each participant. To measure separability, we used the Fisher score [40], defined
as the ratio of the squared difference between the class means of a feature to the
sum of its class variances on the training
data. The Fisher score for each feature was calculated and the two features with
the highest scores were selected for classification. Classification accuracy is reported as
correct classification rate.
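The feature-selection step can be illustrated with a small sketch. The helper names are hypothetical and the squared-difference form of the Fisher score is one standard definition; in the cross-validation described above, scores would be computed on training folds only.

```python
import numpy as np

def fisher_scores(X, y):
    """Fisher score of each feature column for a two-class problem:
    squared difference of the class means over the sum of class variances."""
    c0, c1 = X[y == 0], X[y == 1]
    return (c0.mean(axis=0) - c1.mean(axis=0)) ** 2 / (c0.var(axis=0) + c1.var(axis=0))

def select_top_k(X, y, k=2):
    """Indices of the k features with the highest Fisher score."""
    return np.argsort(fisher_scores(X, y))[::-1][:k]
```

The selected columns would then be passed to a linear discriminant classifier inside each cross-validation fold.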
5.5 Results
Fig. 5.1 depicts normalized sample concentration recordings from all recording locations
for participant 3. Fig. 5.1(a) and 5.1(b) are recordings during a music excerpt rated
as highly arousing and strongly positive, whereas Fig. 5.1(c) and 5.1(d) are normal-
ized sample recordings from one of the most arousing but most negatively rated trials.
Recordings during a sample Brown noise trial are provided for comparison in both cases.
Some patterns are immediately evident. Both HbO2 plots (Fig. 5.1(a) and 5.1(c)) show a general increase in concentration (hyper-oxygenation).

(a) HbO2 concentration for positively valenced stimulus
(b) Hb concentration for positively valenced stimulus
(c) HbO2 concentration for negatively valenced stimulus
(d) Hb concentration for negatively valenced stimulus

Figure 5.1: Plots (a) and (c) exemplify normalized HbO2 concentration signals at different recording locations, while plots (b) and (d) are the corresponding normalized Hb concentration signals. The dark lines represent normalized signals corresponding to highly valenced, high arousal stimuli, while the lighter grey line depicts normalized concentrations during Brown noise presentation to the same participant. The same Brown noise sample is illustrated for both the positively and negatively valenced examples.

The hyper-oxygenation occurs at different points in time during exposure to the various auditory stimuli. In both the positively and negatively rated trials depicted in Fig. 5.1, a decrease in Hb concentration following hyper-oxygenation is observed, consistent with previous findings from functional NIRS studies [161, 136]. The valenced responses are visibly distinct from the sample Brown noise response (light grey traces).
The average classification accuracies for the valence (PV versus NV) and arousal (HA
versus BN) classification problems are reported in Tables 5.3 and 5.2, respectively, for
each participant. The best accuracy averaged over all participants was obtained with
2-dimensional feature sets for both HA versus BN (71.93%), and PV versus NV (71.94%)
classification problems. Tables 5.3 and 5.2 also summarize the different features selected by
the feature selection algorithm for each classification problem and each participant. As
can be seen, the optimal feature set differed across participants.
The spatial distributions of the features leading to the best accuracies are marked in Fig.
5.2 for the HA versus BN and PV versus NV classification problems. In these figures,
the size of each rectangular area is directly proportional to the frequency at which
the feature in question was selected at a specific recording site across all participants.
The vertical line represents the anatomical midline. The values are based on the feature
set dimensionality resulting in the highest average classification accuracy.

Table 5.2: Classification accuracy in % for each participant when classifying HA vs. BN. Feature types corresponding to the best average accuracy are also presented for each participant (M = stimulus period mean; ∆M = stimulus period mean − preceding noise period mean; LSR = lateral slope ratio; ∆LM = lateral mean difference; S = slope; CV = coefficient of variation).

Participant  Gender  HA vs. BN % (2 features)  Features chosen
1            M       90.21 ± 1.72              ∆M
2            F       76.91 ± 1.04              ∆M
3            F       78.67 ± 3.31              ∆M
4            F       67.57 ± 2.01              M, S
5            F       69.04 ± 1.91              ∆M, CV
6            M       58.12 ± 2.55              S
7            M       61.71 ± 2.43              S, ∆M
8            F       71.16 ± 1.08              S
9            M       70.17 ± 3.93              ∆M
10           M       75.72 ± 1.28              ∆M
Average              71.93 ± 9.09
Fig. 5.3 illustrates how the adjusted classification accuracy (i.e. the average of classification sensitivity and specificity), averaged across all participants, changes as trials
with lower arousal ratings are compared to Brown noise. Similarly, Fig. 5.4 depicts how
the classification results change when different ranges of positively- and negatively-rated
trials are compared. Comparisons ranged from the highest-rated negative trials (top 12) versus the highest-rated positive trials (top 12) to all positively-rated trials classified against all
negatively-rated trials.
Table 5.3: Classification accuracy in % for each participant when classifying PV vs. NV. Feature types corresponding to the best average accuracy are also presented for each participant (M = stimulus period mean; ∆M = stimulus period mean − preceding noise period mean; LSR = lateral slope ratio; ∆LM = lateral mean difference; S = slope; CV = coefficient of variation).

Participant  Gender  PV vs. NV % (2 features)  Features chosen
1            M       75.20 ± 4.22              ∆M, M
2            F       77.73 ± 2.09              LSR, S
3            F       63.28 ± 4.30              LSR, M
4            F       67.76 ± 2.83              LSR, ∆M
5            F       77.57 ± 4.10              ∆M
6            M       63.04 ± 3.67              ∆M, M
7            M       62.00 ± 3.46              S, CV
8            F       86.91 ± 2.87              ∆M
9            M       76.99 ± 5.11              ∆M, M
10           M       68.96 ± 6.55              S, M
Average              71.94 ± 8.19
5.6 Discussion
5.6.1 Classification Accuracy
The objective of this phase was to detect the brain response to emotionally-laden music
by monitoring the prefrontal hemodynamics manifested as changes in the HbO2 and
Hb concentrations. Visual inspection of the concentration waveforms in Fig. 5.1 supports the choice of discriminatory features (e.g., mean and slope). Emotional arousal
in response to music was classified against the Brown noise response with an average
accuracy of 71.93% while emotional valence (i.e. positive or negative) was differentiated
with 71.94% accuracy. These findings indicated that the emotional content of music in-
duces differential patterns of activity in the prefrontal cortex, detectable algorithmically
by NIRS.
As reported in Tables 5.2 and 5.3, classification accuracies varied across participants,
corroborating previous findings of individual differences in emotional reactivity [189, 22].
As can be seen in Tables 5.2 and 5.3, accuracies above chance level were achieved for
(a) HbO2, HA versus BN  (b) Hb, HA versus BN
(c) HbO2, PV versus NV  (d) Hb, PV versus NV

Figure 5.2: Location of features resulting in the best overall accuracy. Each rectangle is located over a recording site. The size of the rectangle is proportional to the number of features selected from the corresponding location. The vertical line denotes the anatomical midline (HA = high arousal; BN = Brown noise; PV = positive valence; NV = negative valence).
9 out of 10 participants in the HA versus BN classification problem (α = 0.05), while
in the PV versus NV scenario, accuracies for 7 out of 10 participants exceeded chance
(α = 0.05)1.
One concern when investigating emotional experience using PFC activity is
the possibility of activation due to the emotion induction task requirements as opposed
to the emotions induced [74]. However, Fig. 5.3 illustrates how the classification accuracy
degrades as trials with increasingly lower arousal ratings are compared against Brown
noise. Therefore, the difference in task requirements (e.g. attentional demands)
between music presentation and Brown noise presentation is unlikely to account
for the classification accuracy. Similarly, in Fig. 5.4, the classification accuracies degrade as
trials with increasingly lower positive and negative valence ratings are classified against
1 Note that for a two-class problem, the 95% confidence intervals (α = 0.05) for 48 and 24 trials per class are 50 ± 9.80 and 50 ± 13.59, respectively [149].
Figure 5.3: Adjusted classification accuracy results (averaged across participants) versus the number of trials included for classification against Brown noise trials, after sorting all trials based on ratings of arousal in descending order (e.g. accuracies reported for the top 12 are the result of classifying the 12 highest-rated arousal trials against all trials with Brown noise). The confidence intervals are shown as error bars for each number of trials included.
Figure 5.4: Adjusted classification accuracy results (averaged across participants) versus the number of trials included for classification, after sorting all trials based on ratings of positive and negative valence in descending order (e.g. accuracies reported for the top 12 are the result of classifying the 12 most positively rated trials against the 12 most negatively rated trials). The confidence intervals are shown as error bars for each number of trials included.
each other. This decrease in the classification accuracies is expected, due to potential
similarities between trials rated at the lower positive and lower negative ends of valence
(approaching a neutral state).
According to Fig. 5.2, which depicts the recording sites corresponding to features selected across all participants, the spatial distribution of the features resulting in the best
overall accuracy was bilateral. This finding is consistent with the bilateral physiological
substrates responsible for the perception of valence in the prefrontal cortex [33].
Nonetheless, in three out of ten participants, unilateral activation was most discriminatory, as laterality features were among those selected for solving the valence classification
problem (see Table 5.3).
5.6.2 Diversity in the music database
Previous studies have reported regional brain activity modulation due to specific char-
acteristics of music such as rhythm, timbre, and major/minor chords [118, 193, 163]. In
these studies, the investigators varied selected music characteristics while carefully con-
trolling for others. Other studies, focusing on emotion induction, have used diverse music
databases (e.g., self-selected music pieces) to ensure successful elicitation of emotional
reactions [15, 200]. In the current study, the second approach was used.
The variability of arousal and valence ratings for a given piece of music across partic-
ipants (i.e., the same music excerpt rated differently among participants) suggests that
the observed brain activity was indeed attributable to emotional experiences. Moreover,
the variability in ratings among participants implies that the classification algorithm was
not likely biased towards specific musical characteristics.
5.6.3 Challenges
Due to the limited number of samples, only two dimensions of emotion (valence and
arousal) were considered. Although these measures are informative, they fail to capture
more specific emotional labels. For example, fear and sadness can both be rated as
negatively valenced and high in arousal. In order to differentiate more specifically among
emotional labels, other dimensions of emotion such as occurrence (eruptive vs. gradually
arising) and dominance (complete control vs. no control over the situation) need to be
considered [240].
Special care was devoted to standardize headgear placement across all four sessions,
which in turn, should have minimized instrumentation inconsistencies. However, differ-
ences in the shape of the skull may have led to variabilities in the brain regions monitored
in different participants. Therefore, the present results preclude conclusions about the
specific brain regions that were activated.
The human response to emotional stimuli may be affected by emotional sensitivity.
In fact, [169] showed that individuals with high trait emotional intelligence respond
faster and show more sensitivity in an emotion induction paradigm. Including a measure
of emotional sensitivity in addition to the self-reported ratings might have helped to
explain the inter-subject variability in classification accuracies.
Previous studies of emotion have indicated gender differences as an important factor in
emotional response [132, 241]. However, the limited number of participants did not allow
further investigation of gender-related differences in the emotional response. Future
studies with larger sample sizes need to be devised to investigate the effects of gender on
the emotion-induced prefrontal hemodynamic response.
Chapter 6
Combining autonomic and central
nervous system activity
6.1 Preamble
In this chapter, autonomic nervous system activity signals, namely electrodermal ac-
tivity (EDA), blood volume pulse (BVP), and skin temperature are used to solve the
classification problems introduced in chapter 5 (i.e. high arousal vs. brown noise and
most positive vs. most negative). In addition, new features using dynamic modeling and
template matching are introduced for emotion identification.
The goal here is to compare the results achieved using ANS features with those
obtained using exclusively PFC hemodynamic features (see chapter 5), and to combine
classifiers trained using features derived from ANS and PFC hemodynamics to improve
upon accuracies obtained in chapter 5. Readers can skip sections 6.3.1, 6.3.2 and 6.3.6
since they reiterate the procedures described in chapter 2 and the classification steps
introduced in section 5.4.4.
6.2 Introduction
Emotional response may engage various pathways in the central and autonomic nervous
systems. In fact, some theories of the neural basis of emotion have argued for
an intricate pattern of interaction between the central and autonomic nervous
systems during emotional response [222]. Autonomic nervous system (ANS) activity has
long been used in the field of physiologically-based emotion identification. Physiological
emotion detection may provide a means of affective communication for adults and youth
with severe disabilities who may not be able to use conventional means of emotional ex-
pression such as speech or facial gestures due to severe motor impairments. In particular,
identifying affective state may help to mitigate care-giver stress and facilitate treatment
decisions in a timely fashion [70].
Cardiovascular, respiratory, and particularly electrodermal activity (EDA) sensors
can detect ANS activity modulations during emotional response [108, 129]. Many studies
have used multiple indicators of ANS activity for identifying emotions [244, 66, 100]. For
example, using EDA, temperature, BVP and ECG monitors, Kim et al. achieved 78% accuracy
in differentiating anger, sadness and stress [102].
Emerging neural indicators of emotion are based on activity of the central nervous
system (CNS), particularly brain areas which are found to be involved in emotional pro-
cessing. Highly pleasurable music excerpts were shown to result in activation patterns
in the amygdala, as well as the frontal and ventral prefrontal cortex [15], using mag-
netic resonance imaging (MRI). Another hemodynamic monitoring technology applied for
emotion identification is near-infrared spectroscopy (NIRS) which measures oxygenated
and deoxygenated hemoglobin concentrations ([HbO2] and [Hb], respectively) in cerebral
blood flow [90, 228]. NIRS, a portable and relatively inexpensive optical imaging
technology, is not suitable for monitoring deeper brain regions such as the amygdala,
but is capable of monitoring the PFC, which is part of the emotional perception circuitry
in the brain. In fact, NIRS studies of the PFC have identified correlates of emotion in
regional hemodynamic activity [218, 144]. Hoshi et al. showed that emotional response
to pleasant and unpleasant pictures resulted in regional increase and decrease of PFC
[HbO2], respectively [83].
Based on the existing physiological evidence, both autonomic and central nervous
system activity may show modulation during emotional activity. Therefore, realizing a
multi-modal emotion identification system which uses both CNS and ANS activity is a
meaningful pursuit. Recent studies have explored concomitant use of signals from both
ANS and CNS pathways for detecting emotions. For example, Kuncheva et al. [120] showed
that an ensemble of classifiers, each trained using electrocardiogram, electroencephalogram, EDA or pulse signals, could achieve accuracies up to 73% in differentiating
positive from negative emotional states. In this light, the current chapter focuses on using
features from ANS activity to distinguish the most intensely rated music excerpts from neutral
Brown noise, and the most positively rated excerpts from the most negatively rated ones.
Furthermore, a mixture of experts was used for combining classifier decisions. These
classifiers were trained using ANS-based features or NIRS-based features separately.
6.3 Methods
6.3.1 Procedures
Ten able-bodied individuals (5 female, age: 25 ± 2.7 years) with no reported cardiovascular
diseases, metabolic disorders, history of brain injury, respiratory conditions, or drug-,
alcohol-related or psychiatric conditions were recruited for this study. Ethics approval
was obtained from the ethics board at Holland Bloorview Kids Rehabilitation Hospital.
The experiments were conducted over four separate sessions, and encompassed a total of
144 trials, 48 of which included brown noise. Pilot studies indicated that this type of noise
was subjectively more pleasant than white noise at the same sound pressure level. In each
session, participants completed three blocks. Each block consisted of 12 consecutive trials:
Figure 6.1: Trial sequence
four trials with positively valenced songs (including one participant-selected song), four
trials with negatively valenced songs (including one participant-selected song), and four
Brown noise trials. The music excerpts were randomly selected from a database composed
of six music pieces self-selected by the participant and a common music database selected
by researchers. The common music database included music pieces from different genres
of music (classical, rock, jazz, and pop), with and without lyrics. The trials within each
block were pseudo-randomized, such that Brown noise trials never occurred consecutively,
and positively and negatively valenced songs appeared in no apparent order. The same
pseudo-random sequence of trials was used for all participants. Figure 6.1 illustrates a
typical trial sequence.
6.3.2 NIRS data
A multi-channel NIRS monitoring system (Imagent Functional Brain Imaging System
from ISS Inc., Champaign, IL) was used to record hemodynamic response across nine
different regions in the PFC. In this system, five optode pairs and three detectors were
placed on the forehead as shown in Figure 6.2. In each optode pair, one source emitted
light at 830 nm and the other at 690 nm. The signals were recorded at a 31.25 Hz sampling
rate.
Figure 6.2: The layout of light sources (circles) and detectors (X's). The vertical line denotes the anatomical midline. The annotated shaded areas correspond to recording locations.
6.3.3 ANS data
Blood volume pulse (BVP), electrodermal activity (EDA) and temperature were recorded
using a ProComp Infiniti multimodality encoder (Thought Technology, Montreal, QC,
Canada) at a 256 Hz sampling rate. EDA was recorded using two 10-mm-diameter Ag-AgCl
surface electrodes attached to the index and middle finger phalanges of the non-dominant
hand. Skin temperature was recorded using a thermal sensor secured on the fifth finger.
The blood volume pulse was obtained using a photoplethysmograph sensor attached to
the thumb. The recorded BVP signal was used to determine heart rate by finding the
interbeat interval.
6.3.4 Analysis
The BVP signals were band-pass filtered (0.2-0.33 Hz) using a Daubechies-based continuous wavelet transform [2] to facilitate peak detection. The inverse of the peak-to-peak
distances in time was used as an indicator of heart rate, and the peak values were used
to determine pulse volume amplitude (PVA) [101].
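The heart-rate and PVA computation can be sketched as follows. This minimal NumPy version assumes the band-pass filtering has already been applied, and uses a simple neighbour-comparison peak picker as a stand-in for whatever detector was actually used; the function name is hypothetical.

```python
import numpy as np

def heart_rate_and_pva(bvp, fs=256.0):
    """Heart rate (bpm) from interbeat intervals, and pulse volume
    amplitude (peak values), from an already-filtered BVP trace."""
    # local maxima: strictly greater than both neighbours
    peaks = np.where((bvp[1:-1] > bvp[:-2]) & (bvp[1:-1] > bvp[2:]))[0] + 1
    ibi = np.diff(peaks) / fs        # interbeat intervals (s)
    hr = 60.0 / ibi                  # instantaneous heart rate (bpm)
    pva = bvp[peaks]                 # peak amplitude as the PVA indicator
    return hr, pva
```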
6.3.5 Feature extraction
PFC features
PFC hemodynamic-based features were extracted from [HbO2] and [Hb] at each of the
9 recording locations. These features included the mean, slope (determined using linear
regression) and coefficient of variation (ratio of variance to the mean), all estimated
over the music presentation period, and the change in the mean between the preceding
noise and music presentation period. In addition to single-channel features, the ratio of
the concentration signal slopes and the difference in the average signals were determined
between left and right channels (i.e., 1L-1R, 2L-2R, 3L-3R and 4L-4R in Fig 2). Laterality
features were introduced into the feature set, based on previous reports of lateralized
response to emotional stimuli [235, 33, 4]. For more information regarding these features,
the reader is referred to section 5.4.3.
Based on the findings presented in chapter 3, new features were derived by introducing a custom-made template. First, by repeated visual inspection of [HbO2] and [Hb]
patterns in the highest arousal-rated trials, a template was designed. Figures 6.3A and
6.3B depict the designed template and a sample recording from participant 3 during which
chills were reported, respectively. As shown in Figure 6.3, the template was designed to
capture the sudden increase and the ensuing plateau in the concentration waveform.
This custom-made template was treated akin to a mother wavelet: it was translated
and scaled across each trial, and the maximum coefficient across translation and scale
was used as a feature for classification. Hence, the template was empirically determined.
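The template-matching feature can be sketched as a brute-force search over translations and scales. This is only an illustration: the actual template shape and scale grid in the thesis were empirically chosen, and the function name and normalization here are assumptions.

```python
import numpy as np

def template_feature(signal, template, scales=(0.5, 1.0, 2.0)):
    """Maximum matching coefficient between a concentration trace and
    scaled/translated copies of a template (wavelet-style matching)."""
    best = 0.0
    for s in scales:
        n = max(2, int(round(len(template) * s)))
        if n > len(signal):
            continue
        # stretch/compress the template by linear interpolation
        stretched = np.interp(np.linspace(0, len(template) - 1, n),
                              np.arange(len(template)), template)
        stretched -= stretched.mean()
        for start in range(len(signal) - n + 1):
            seg = signal[start:start + n]
            best = max(best, float(np.dot(seg - seg.mean(), stretched)))
    return best
```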
ANS features
The features representing autonomic nervous system signals included the mean, the range
and the difference in the mean values during the aural stimulus period and the preceding
Figure 6.3: A. Custom-made template. B. Sample normalized [HbO2] recorded in a trial with chills.
noise period (see the trial sequence in Figure 2.2) for temperature recordings and EDA.
The number and magnitude of electrodermal responses (EDRs) were also added to the
feature set. EDRs were detected by differentiating the EDA recordings and convolving
the resulting waveform with a Bartlett window, and finding the two consecutive zero-
crossings (positive to negative and negative to positive) [102]. The maximum value
between these two zero-crossings was recorded as the magnitude of the EDR [102]. The
average heart rate and PVA signals within the 45 sec period of exposure to music were
also included as features to represent cardiovascular response. In addition, the ratio of
high frequency (heart rates above 70 bpm) to low frequency (heart rates below 70 bpm)
energy content was also included in the feature set. These ANS features were selected
based on previous findings in studies of emotion involving ANS activity [108, 184].
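The EDR detection described above can be sketched as follows. This NumPy version is an assumption-laden stand-in: the window length, the exact zero-crossing convention, and the choice of taking the maximum of the EDA trace between crossings are illustrative rather than the thesis implementation.

```python
import numpy as np

def detect_edrs(eda, fs=256.0, win_s=1.0):
    """Detect electrodermal responses: differentiate the EDA trace, smooth
    with a Bartlett window, delimit each response by consecutive zero
    crossings of the smoothed derivative, and take the maximum EDA value
    between them as the EDR magnitude."""
    d = np.diff(eda)
    w = np.bartlett(max(3, int(win_s * fs)))
    sm = np.convolve(d, w / w.sum(), mode="same")
    pos = sm > 0
    onsets = np.where(~pos[:-1] & pos[1:])[0] + 1   # crossing up (- to +)
    ends = np.where(pos[:-1] & ~pos[1:])[0] + 1     # crossing down (+ to -)
    mags = []
    for on in onsets:
        later = ends[ends > on]
        if later.size:
            mags.append(float(eda[on:later[0] + 1].max()))
    return len(mags), mags
```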
Dynamic model-based features
System identification has previously been used for modeling interactions among various
physiological signals [197, 175]. For example, Saul et al. used system identification to
understand the relationship between respiratory signals and heart rate in their charac-
terization of autonomic regulation of heart rate [198]. The differences observed between
models fit to neutral and chilling trials, which will be reported in section 6.4.1, suggested
that dynamic model-based features may be useful in differentiating emotions.
To capture the relationship between EDA and [HbO2], an autoregressive model with
exogenous input (arx) was applied [130]. This arx model describes a system based on
immediate past input (x) and output (y) values, as shown in (6.1).

y(t) = b1 x(t−1) + ... + b_nb x(t−nb) − a1 y(t−1) − ... − a_na y(t−na) + ε    (6.1)
In (6.1), nb and na are model orders, and ai and bi are model coefficients. In the
arx model estimated, the [HbO2]/[Hb] was used as the input (x) to the system and the
EDA was set as the output (y) and vice versa. Model order was selected according to
the Akaike Information Criterion (AIC )[130].
The EDA signals (originally collected at 256 Hz) were down-sampled by a factor of 7
(resulting in a sampling rate of 4.57 Hz), and [HbO2] signals (originally collected at 31.25
Hz) were re-sampled at 4.57 Hz to match the EDA sampling rate. To match the bandwidth
of the EDA signals to that of the NIRS recordings, both types of signals were low-pass
filtered using the same third-order Chebyshev type II filter (i.e. 0.1 Hz cut-off). The
concentration time series and EDA signals within each trial were normalized to have zero
mean and were scaled down by the maximum absolute signal value during the trial. This
normalization resulted in signal magnitudes ranging from -1 to 1.
An autoregressive (AR) model was used to represent EDA, [HbO2], and [Hb] dynam-
ics. Unlike arx, an AR model describes the signal with respect to the immediate past of
the same signal, as shown in (6.2).

y(t) = a1 y(t−1) + ... + a_na y(t−na) + ε    (6.2)
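Both model forms reduce to linear least squares. The thesis presumably used a system-identification toolbox, so the following NumPy sketch of fitting (6.1) is only illustrative; the function name is hypothetical, and an AR model as in (6.2) would be fit analogously from past output values alone.

```python
import numpy as np

def fit_arx(y, x, na, nb):
    """Least-squares estimate of the arx coefficients in (6.1):
    y(t) = b1 x(t-1) + ... + b_nb x(t-nb) - a1 y(t-1) - ... - a_na y(t-na) + e.
    Returns (a, b)."""
    p = max(na, nb)
    rows = [np.concatenate([x[t - nb:t][::-1],     # x(t-1), ..., x(t-nb)
                            -y[t - na:t][::-1]])   # -y(t-1), ..., -y(t-na)
            for t in range(p, len(y))]
    theta, *_ = np.linalg.lstsq(np.asarray(rows), y[p:], rcond=None)
    return theta[nb:], theta[:nb]                  # (a, b)
```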
To illustrate the potential merits of dynamic-based feature extraction in emotion
identification, arx models ([HbO2] averaged across the nine recording regions was used as
input) were fit to trials with the highest arousal rating (i.e. chills) and those with brown
noise rated as neutral (i.e. neutral). These models were compared (chills vs. neutral)
Table 6.1: Features resulting from arx dynamic modeling. (Very low frequency band (VLF) = 0-0.025 Hz, low frequency band (LF) = 0-0.075 Hz, and high frequency band (HF) = 0.075-0.1 Hz.)

Feature type                 Features
Model-based features         Model coefficients
Frequency response features  Energy_total, Energy_VLF, Energy_LF, Energy_HF
in the frequency domain to explore the usefulness of dynamic-based feature extraction
in differentiating arousal. Chills were selected for comparison as they are well-defined
emotional events. The model orders identified using AIC were recorded for chills and
neutral trials, and the AIC order selected for the majority of trials was considered as the
generalized model order (GMO).
To identify features, each trial (n=144) was first modeled using arx under two condi-
tions: a) with [HbO2]/[Hb] as the input and, b) with EDA as the input (GMO for chills
was used for modeling in both (a) and (b)). The model coefficients (i.e. ai, bi) were
included as direct model-based features. Other features were based on the frequency
response of the estimated dynamic model. The energy of the frequency response within
three frequency bands, namely, the very low frequency band (VLF = 0 - 0.025 Hz), the
low frequency band (LF=0 - 0.075 Hz) and the high frequency band (HF = 0.075 - 0.1
HZ) was used for extracting frequency response-based features. Ninety percent of the
spectral peaks of the models’ transfer functions occurred within these frequency bands.
Table 6.1 summarizes these features. Using a similar procedure, the GMO was used for
identifying AR model coefficients which were also included as features.
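The band energies of an estimated model's frequency response can be computed by evaluating H(z) = B(z)/A(z) on a frequency grid. The band edges below follow Table 6.1; the grid size and helper name are arbitrary choices for this sketch.

```python
import numpy as np

def band_energies(a, b, fs=4.57, n=512):
    """Energy of |H(f)|^2 = |B/A|^2 in the VLF, LF and HF bands, where
    A(z) = 1 + a1 z^-1 + ... and B(z) = b1 z^-1 + ... follow (6.1)."""
    f = np.linspace(0.0, fs / 2.0, n)
    w = 2.0 * np.pi * f / fs                        # digital frequency (rad/sample)
    kb = np.arange(1, len(b) + 1)
    ka = np.arange(1, len(a) + 1)
    B = (np.asarray(b)[:, None] * np.exp(-1j * np.outer(kb, w))).sum(axis=0)
    A = 1.0 + (np.asarray(a)[:, None] * np.exp(-1j * np.outer(ka, w))).sum(axis=0)
    h2 = np.abs(B / A) ** 2
    bands = {"VLF": (0.0, 0.025), "LF": (0.0, 0.075), "HF": (0.075, 0.1)}
    return {k: float(h2[(f >= lo) & (f <= hi)].sum()) for k, (lo, hi) in bands.items()}
```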
6.3.6 Classification
In order to compare the classification results with these newly proposed features against
that attained only for NIRS signals, procedures identical to chapter 5 were used for
labeling the data and classification. The trials with Brown noise (BN) were separated,
and the rest of the data were partitioned according to arousal and valence ratings. For
the analysis of arousal, the 48 highest rated trials over all four sessions were selected. For
the valence component, the 24 highest positively-rated and 24 highest negatively-rated
trials across all four sessions were selected. The high arousal (HA), positive valence (PV),
negative valence (NV), and Brown noise (BN) trials were labeled accordingly. Arousal
and valence labeling were performed independently.
A classifier based on linear discriminant analysis [40] was used to solve two different
two-class problems (HA vs. BN and PV vs. NV). The classification accuracy was esti-
mated using the average of 50 independent iterations of 10-fold cross-validation. Feature
selection was performed to identify the feature subset that best separated the two classes for each
participant. To measure separability, the Fisher score [40] was used, defined as
the ratio of the squared difference between the class means of a feature to the sum
of its class variances. The Fisher score for
each feature was calculated and the two features with the highest scores were selected
for classification. Classification accuracy was recorded as the correct classification rate.
6.3.7 Mixture of experts
Six linear discriminant analysis-based classifiers [40] were separately trained, each using
exclusively features from one of the feature types shown in Table 6.2, namely time-domain
PFC features, ANS features, template-based features, arx features (input: EDA), arx
features (input: [HbO2]/[Hb]), and AR features. These classifier experts were used to
decide the labels of trials set aside for testing.
The features were randomly segregated into training A and testing sets (shown in Figure
6.4) using a 10-fold cross-validation algorithm. Testing data were set aside for the final
Figure 6.4: Feature segmentation.
testing of the classifier ensemble.
To determine the class label (i.e. wj, j = 1, 2) for the sample x in the testing set, the
classifier decisions were combined using the classifier confidence and the support for x
belonging to each class using a classifier combination algorithm introduced in [121] (see
Figure 6.5). Classifier support was determined using the discriminant function for class
wj (i.e. gj(x)) and transformed into a logistic link function g′j(x) which resulted in a
value ranging from 0 to 1. Larger support values indicated a higher likelihood of class wj. For a two-class problem, the discriminant and logistic link functions are determined as shown in (6.3) and (6.4), respectively.
p(wj|x) = exp(gj(x)) / (exp(g1(x)) + exp(g2(x))),   j = 1, 2.    (6.3)

g′1(x) = 1 / (1 + exp(−g1(x))),   g′2(x) = 1 − g′1(x).    (6.4)
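A minimal sketch of (6.3) and (6.4), assuming scalar discriminant values g1(x) and g2(x) for the two classes (an illustration, not the thesis code):

```python
import math

def posterior(g1, g2):
    """Posterior of eq. (6.3): softmax over the two discriminant values."""
    e1, e2 = math.exp(g1), math.exp(g2)
    return e1 / (e1 + e2), e2 / (e1 + e2)

def logistic_support(g1):
    """Supports of eq. (6.4): logistic link of the class-1 discriminant,
    with the class-2 support as its complement."""
    s1 = 1.0 / (1.0 + math.exp(-g1))
    return s1, 1.0 - s1
```

Since the two-class softmax reduces to a sigmoid of the discriminant difference, `logistic_support(g1 - g2)` reproduces the first component of `posterior(g1, g2)`.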
The classifier support values were arranged in the form of a decision profile (DP(x)).
The DP(x) was composed of dc,j(x) elements which represented the support that classifier
Dc (c = 1, 2, ..., 6) had for class wj, given the vector x from the testing set.
Classifier competence (Gc), which represented the classifier ability to identify class
labels, was determined based on the training A data.

Figure 6.5: A simplified diagram depicting fusion of classifier decisions. Six classifier experts, each preceded by feature selection on one feature type (time-domain NIRS, ANS, template-based, arx with EDA input, arx with [Hb]/[HbO2] input, and AR), feed a fusion stage that produces the output class.

The training A data was partitioned into validation and training B sets in 20 iterations of a 10-fold cross-validation. The
correct classification rates in each iteration and fold were averaged to estimate classifier
competence, Gc. This step resulted in 6 values, one for each feature set, namely, time
domain PFC, ANS features, template-based features, arx features (input: EDA), arx
features (input: [HbO2]/[Hb]), and AR features. To determine Gc (c = 1, 2, ..., 6), in each iteration and fold, one feature was selected using Fisher scores applied to the primary
training set (see Section 6.3.6).
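The competence estimate Gc amounts to averaging correct-classification rates over repeated k-fold partitions; a numpy-only sketch, where `classify` stands in for any train-and-predict routine (a hypothetical interface, not the thesis code):

```python
import numpy as np

def competence(classify, X, y, n_iter=20, k=10, seed=0):
    """Estimate classifier competence G_c as the mean correct-classification
    rate over n_iter repetitions of k-fold cross-validation.
    `classify(X_tr, y_tr, X_te)` must return predicted labels for X_te."""
    rng = np.random.default_rng(seed)
    rates = []
    for _ in range(n_iter):
        idx = rng.permutation(len(y))           # fresh random partition
        for fold in np.array_split(idx, k):
            tr = np.setdiff1d(idx, fold)        # all samples not in the fold
            pred = classify(X[tr], y[tr], X[fold])
            rates.append(np.mean(pred == y[fold]))
    return float(np.mean(rates))
```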
The parameter λ was estimated by finding the real root greater than −1 of the
polynomial shown in (6.5).
1 + λ = ∏_{i=1}^{6} (1 + λGi),   λ ≠ 0.    (6.5)
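A sketch of solving (6.5) numerically: expand the product into polynomial coefficients and keep the real root greater than −1, excluding the trivial root at zero (the function name is illustrative):

```python
import numpy as np

def fuzzy_measure_lambda(G, tol=1e-9):
    """Real root lambda > -1, lambda != 0 of 1 + lambda = prod(1 + lambda*G_i)."""
    # Build prod(1 + G_i * x) as polynomial coefficients (lowest order first).
    poly = np.array([1.0])
    for g in G:
        poly = np.polynomial.polynomial.polymul(poly, np.array([1.0, g]))
    # Move (1 + x) to the left-hand side: prod(...) - 1 - x = 0.
    poly[0] -= 1.0
    poly[1] -= 1.0
    roots = np.polynomial.polynomial.polyroots(poly)
    real = roots[np.abs(roots.imag) < tol].real
    candidates = [r for r in real if r > -1.0 and abs(r) > tol]
    return min(candidates, key=abs) if candidates else 0.0
```

For competence values summing to less than one the root is positive; for a sum greater than one it lies in (−1, 0).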
For a given x, the DP vector corresponding to x for each class was sorted from the high-
est support to the lowest (i.e. [d1,j(x), d2,j(x), ..., d6,j(x)] → [dc1,j(x), dc2,j(x), ..., dc6,j(x)]).
Table 6.2: Features used for training classifier experts

Time domain PFC features: multi-channel time domain features; laterality features.
ANS features: EDA features; skin temperature features; BVP features.
Wavelet-based features: maximum wavelet coefficients across time and scale.
Dynamic-based features: arx features (input: EDA, output: [HbO2]/[Hb]); arx features (input: [HbO2]/[Hb], output: EDA); AR features (EDA, [HbO2], and [Hb]).
The classifier competence values were sorted accordingly (i.e. Gc1(x), Gc2(x), ..., Gc6(x)). The measure Q(k) was calculated recursively as shown in (6.6).
Q(k) = Gck + Q(k − 1) + λ Gck Q(k − 1),   Q(1) = Gc1,   k = 2, ..., 6.    (6.6)
The final degree of support for class wm was determined using a Sugeno integral, shown
in (6.7), and the class with the highest µm(x) value was noted as the ensemble decision
for sample x from the testing set [121].
µm(x) = max_{k=1,...,6} min{dck,m(x), Q(k)},   m = 1, 2.    (6.7)
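Sorting the decision profile, applying the recursion in (6.6), and taking the max of minima in (6.7) can be combined into one routine; a minimal sketch assuming supports in [0, 1] and λ precomputed from (6.5):

```python
import numpy as np

def sugeno_fuse(DP, G, lam):
    """Ensemble decision via eqs. (6.6)-(6.7).
    DP:  (n_classifiers, n_classes) decision profile of supports in [0, 1].
    G:   per-classifier competence values.
    lam: lambda obtained from eq. (6.5)."""
    DP, G = np.asarray(DP, float), np.asarray(G, float)
    mu = []
    for m in range(DP.shape[1]):
        order = np.argsort(DP[:, m])[::-1]   # supports sorted, highest first
        d, g = DP[order, m], G[order]        # competences follow the same order
        Q = g[0]                             # Q(1) = G_{c_1}
        best = min(d[0], Q)
        for k in range(1, len(g)):           # recursion of eq. (6.6)
            Q = g[k] + Q + lam * g[k] * Q
            best = max(best, min(d[k], Q))   # running max-min of eq. (6.7)
        mu.append(best)
    return int(np.argmax(mu)), mu
```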
6.4 Results
ANS data from 61 of the 1440 trials across all participants were lost due to technical issues, and these trials were therefore excluded from the analysis. [HbO2] and [Hb]
signals corresponding to trials for which ANS signals were lost were also excluded from
the analysis.
Figure 6.6: Sample trial with chills (participant 2): EDA recording and estimation, using the average [HbO2] concentrations as the input to the arx model. The fit achieved by the model for the depicted estimation is 52.9%.
6.4.1 Dynamic model-based features
The estimated arx model achieved a fit value exceeding 50% in 70% of trials in the chills
category and 77% of trials in the neutral cases. These results exemplify the ability of the
arx model to capture the interaction dynamics between EDA and [Hb]/[HbO2]. Figure
6.6 depicts a sample ([HbO2]) and the corresponding EDA recording and estimation for
a trial with chills for which the fit value was 52.9%. Figure 6.7 shows the normalized
frequency responses (magnitude and phase) for models with chills and neutral trials for
participant 4. The trials with chills manifested two distinct peaks. The two peaks,
shown in Figure 6.7.A, were observed in all other participants. However, the location of these peaks varied across participants. Neutral-rated trials manifested a low-pass filter property with a single peak similar to that shown in Figure 6.7.B.
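The fit values quoted above are consistent with the NRMSE-based percent fit reported by standard system-identification tools; a numpy-only least-squares sketch of an ARX estimate and this fit measure (the model orders here are illustrative, not the thesis settings):

```python
import numpy as np

def fit_arx(u, y, na=2, nb=2):
    """Least-squares ARX sketch: y[t] = sum_i a_i*y[t-i] + sum_j b_j*u[t-j].
    Returns the one-step-ahead prediction and the percent fit
    100*(1 - ||y - y_hat|| / ||y - mean(y)||)."""
    u, y = np.asarray(u, float), np.asarray(y, float)
    n = max(na, nb)
    # Regressor rows: [y[t-1..t-na], u[t-1..t-nb]]
    rows = [np.r_[y[t - na:t][::-1], u[t - nb:t][::-1]] for t in range(n, len(y))]
    Phi, target = np.array(rows), y[n:]
    theta, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    y_hat = Phi @ theta
    fit = 100.0 * (1.0 - np.linalg.norm(target - y_hat)
                   / np.linalg.norm(target - target.mean()))
    return y_hat, fit
```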
6.4.2 Classification results
Table 6.3 summarizes the ANS-based classification results for HA vs. BN and PV vs.
NV. Clearly, the ANS-based results varied across participants.
The mixture of experts classification rate is presented in Table 6.4. Tables 6.5 and 6.6
Figure 6.7: Sample scaled frequency response estimated for (A) chilling and (B) neutral trials for participant 4. The magnitude of the frequency response was normalized by dividing the results by the total power of the signal over the entire frequency range.
Table 6.3: Classification accuracy in % determined using ANS features for solving the HA vs. BN and PV vs. NV classification problems

Participant | HA vs. BN (%) | PV vs. NV (%)
1  | 59.8 ± 1.5 | 64.5 ± 1.7
2  | 53.0 ± 1.7 | 59.8 ± 2.5
3  | 77.3 ± 1.1 | 59.8 ± 2.8
4  | 51.5 ± 1.4 | 62.7 ± 2.5
5  | 55.6 ± 1.5 | 55.5 ± 2.0
6  | 58.7 ± 1.6 | 55.9 ± 2.0
7  | 55.4 ± 1.2 | 46.4 ± 2.2
8  | 58.1 ± 1.5 | 54.2 ± 2.3
9  | 54.3 ± 1.3 | 53.2 ± 2.3
10 | 69.8 ± 1.4 | 60.1 ± 2.4
summarize the dynamic model-based results for the two classification problems, namely
PV vs. NV and HA vs. BN.
6.5 Discussion
Many physiologically-based emotion identification efforts have included electromyogram
sensors to capture muscle activity due to emotions (e.g. muscle contractions resulting
from facial expression) [244, 66, 100]. For example, Picard et al. achieved an accuracy
of 81% in differentiating eight emotional states (neutral, anger, joy, grief, hate, romantic
love, reverence, platonic love) using features from facial electromyography, BVP, EDA, and respiration [172].
Table 6.4: Classification accuracy in % determined using the mixture of experts for solving the HA vs. BN and PV vs. NV classification problems

Participant | HA vs. BN (%) | PV vs. NV (%)
1  | 83.8 ± 0.8 | 85.1 ± 0.9
2  | 75.9 ± 1.0 | 58.4 ± 2.6
3  | 91.9 ± 0.5 | 57.5 ± 2.2
4  | 58.6 ± 1.7 | 49.2 ± 2.3
5  | 60.9 ± 1.4 | 64.5 ± 1.7
6  | 59.7 ± 1.5 | 55.7 ± 1.7
7  | 51.8 ± 1.4 | 42.8 ± 2.3
8  | 71.5 ± 1.2 | 71.4 ± 1.8
9  | 60.7 ± 1.6 | 58.8 ± 2.2
10 | 69.55 ± 1.2 | 58.9 ± 2.4
Table 6.5: Classification accuracy in % for each participant when classifying HA vs. BN, using dynamic-based features (i.e. AR, arx (a) input: EDA, and arx (b) input: [HbO2]/[Hb]) and template-based features

Participant | AR (%) | arx (a) (%) | arx (b) (%) | Template-based (%)
1  | 56.0 ± 1.6 | 58.1 ± 1.6 | 44.4 ± 1.2 | 56.5 ± 1.6
2  | 47.7 ± 1.0 | 65.6 ± 1.4 | 48.0 ± 1.6 | 64.5 ± 1.6
3  | 81.7 ± 1.1 | 65.1 ± 1.1 | 61.1 ± 1.4 | 85.6 ± 0.8
4  | 63.9 ± 1.1 | 45.2 ± 1.2 | 49.1 ± 1.3 | 54.2 ± 1.1
5  | 56.2 ± 2.1 | 56.5 ± 1.6 | 57.1 ± 1.3 | 59.0 ± 1.4
6  | 63.8 ± 1.1 | 55.8 ± 1.4 | 59.0 ± 1.0 | 42.0 ± 1.1
7  | 46.0 ± 1.7 | 47.7 ± 1.6 | 51.6 ± 2.0 | 48.2 ± 1.4
8  | 43.1 ± 1.3 | 54.7 ± 1.9 | 45.1 ± 1.3 | 61.6 ± 1.5
9  | 66.9 ± 1.5 | 62.4 ± 1.5 | 44.7 ± 1.4 | 61.4 ± 1.1
10 | 59.8 ± 1.2 | 55.2 ± 1.7 | 54.9 ± 1.7 | 63.8 ± 1.4
Other studies have exclusively focused on ANS activity sensors. In a study involving
electrocardiogram (ECG), Agrafioti et al. [1] achieved accuracies up to 89% in differen-
tiating valence, and reported between-subject variability in classification results. Using
EDA, temperature, BVP and ECG monitors, Kim et al. achieved 78% accuracy in differentiating anger, sadness and stress [102], and also indicated differences in correct classification rates among participants. These findings confirm differences in correct identification rates
across individuals, which was also observed in the current investigation.
Autonomic nervous system activity patterns may vary across individuals. For example, the EDA response magnitude due to sympathetic arousal may be suppressed in some individuals [223]. This phenomenon may explain the variability in ANS-based emotion identification results in Table 6.3.

Table 6.6: Classification accuracy in % for each participant when classifying PV vs. NV, using dynamic-based features (i.e. AR, arx (a) input: EDA, and arx (b) input: [HbO2]/[Hb]) and template-based features

Participant | AR (%) | arx (a) (%) | arx (b) (%) | Template-based (%)
1  | 46.3 ± 2.4 | 52.3 ± 1.6 | 59.9 ± 1.7 | 59.0 ± 2.5
2  | 46.4 ± 1.9 | 49.4 ± 1.8 | 54.1 ± 1.8 | 59.3 ± 1.8
3  | 51.9 ± 1.9 | 47.3 ± 2.4 | 69.4 ± 1.6 | 56.2 ± 2.1
4  | 44.1 ± 2.2 | 54.0 ± 2.2 | 39.4 ± 1.4 | 46.7 ± 2.4
5  | 64.3 ± 2.6 | 52.1 ± 1.8 | 53.2 ± 2.2 | 57.8 ± 2.1
6  | 47.3 ± 2.2 | 55.2 ± 2.1 | 51.7 ± 2.3 | 45.6 ± 1.5
7  | 46.5 ± 2.4 | 46.4 ± 2.5 | 47.0 ± 1.6 | 37.9 ± 1.6
8  | 53.3 ± 1.9 | 59.7 ± 2.1 | 55.4 ± 2.1 | 62.0 ± 2.3
9  | 48.0 ± 2.2 | 49.7 ± 2.6 | 40.4 ± 2.4 | 61.1 ± 2.5
10 | 43.2 ± 1.9 | 59.2 ± 1.6 | 57.8 ± 2.3 | 47.8 ± 1.9
Various features such as ANS-based, dynamic model-based or template-based features
may not be equally useful for identifying emotions. For example, for a particular partici-
pant, dynamic model features may result in low accuracies in HA vs. BN differentiation
while ANS features lead to higher identification accuracies (see results for participant
10), but for another individual, the opposite may be true (see results for participant
3). The multi-modal mixture of experts, used in this study, automatically accounted
for this variability by estimating the classifier competence which ultimately assessed the
usefulness of the feature set. Combining classifier decisions maintained or improved the
HA vs. BN classification results in only three participants (i.e. participants 3,6 and 8)
when compared to results obtained exclusively using NIRS features (see Table 5.2). The
PV vs. NV correct classification results were generally (with the exception of participant
1) lower (comparing Tables 5.3 and 6.4). Previous studies have indicated differences between
arousal and valence detection accuracies. For example, using skin conductivity, blood
volume pressure, respiration and an electromyogram, Healey et al. found that valence
differentiation was less accurate than arousal differentiation [72].
Compared to the results obtained in Chapter 5 (i.e. Tables 5.2 and 5.3), the accuracies obtained using PFC hemodynamic-based features alone were generally higher than those of the combination of classifiers based on PFC hemodynamic and ANS features. The average results from the classifier combination shown in Table 6.4 were 66% for HA vs. BN and 60% for PV vs. NV, which are lower than the results achieved using NIRS features exclusively
(see chapter 5). However, including autonomic nervous system activity features may help
improve emotion identification in a subset of individuals. Future studies involving larger
sample sizes, which are more likely to represent various ANS response phenotypes, may
help identify whether the multi-modal approach exceeds the performance achieved using
PFC NIRS features alone.
6.6 Conclusion
In this chapter, a multi-modal ensemble of classifiers was used to differentiate highest
arousal rated trials from brown noise (HA vs. BN), and most positive rated trials from
most negative rated trials (PV vs. NV). Each classifier in the ensemble was trained
by exclusively using features from ANS or PFC hemodynamics. Novel dynamic-based
features were introduced and demonstrated potential in arousal differentiation.
The classification results varied across participants. In particular, the classifier en-
semble was capable of maintaining or improving upon the results achieved using only
PFC hemodynamics in 3 participants for the HA vs. BN problem. However, the valence
differentiation rate was lower than those achieved with PFC hemodynamics alone.
Chapter 7
Concluding remarks
7.1 Summary of contributions
This thesis made several contributions to the field of rehabilitation engineering, specifi-
cally, in the area of affective brain computer interfaces. In summary, the results of this
thesis illustrated the feasibility of emotion identification using prefrontal cortex (PFC)
near infrared spectroscopy (NIRS) in response to a dynamic emotion induction method
(i.e. music). The specific contributions are listed in this chapter.
7.1.1 A literature appraisal of the existing evidence for the use
of BCI for individuals with disabilities [143]
The existing evidence for the use of brain computer interfaces (BCIs) involving indi-
viduals with disabilities was critically appraised. This literature review resulted in the
identification of current challenges surrounding BCI use for individuals with severe dis-
abilities. In addition, important recommendations for future studies were made, including consideration of user state and involvement of the pediatric population. These recommendations may benefit future BCI research efforts in realizing more user-accommodating systems suitable for the target population (i.e. individuals with severe disabilities).
7.1.2 PFC [Hb] and [HbO2] patterns characterization using wavelet
analysis with respect to emotional arousal and valence
[142]
Regional PFC [Hb] and [HbO2] activity was characterized using wavelet peak detection.
This algorithm allowed identification of hemodynamic characteristics with respect to
the arousal and valence dimensions of emotions. In addition to hemodynamic response
magnitude, the wavelet peak detection method allowed investigation of the speed of
hemodynamic response (i.e. using the scale at maximum wavelet coefficient). Intense
negative emotional ratings were found to be generally related to heightened changes
in [HbO2]. These findings warranted further investigation of PFC NIRS for emotion
identification, particularly when using dynamic emotion induction methods such as music.
7.1.3 Identified emotional arousal and valence in response to
dynamic emotion induction using PFC NIRS [144]
Using time domain and laterality features extracted from the PFC NIRS, the highest
arousal rated trials were differentiated from trials with brown noise (HA vs. BN) with
an average accuracy of 71%. Similarly, in differentiating the most positively rated trials
from most negatively rated trials (PV vs. NV), an average accuracy of 71% was achieved.
The 10-fold cross-validation used for classifier training and testing simulated single-trial
identification of arousal and valence and provided further evidence for the use of PFC
NIRS as a means of emotion identification.
7.1.4 Introduced features based on dynamic modeling for emo-
tion identification
Using dynamic modeling, additional features were introduced for solving the HA vs.
BN and PV vs. NV classification problems. Dynamic modeling was used for capturing
PFC NIRS and EDA signal dynamics. In addition, the interaction dynamics between
[Hb]/[HbO2] and EDA were captured using an arx model. Unlike previous emotion iden-
tification efforts which exclusively used autonomic or central nervous system signals for
identifying emotions, the arx model captured the interaction between PFC hemodynam-
ics and EDA. Despite variability across participants, features extracted from arx models achieved accuracies of up to 81% in differentiating arousal.
7.1.5 Multi-modal emotion identification using a mixture of
classifier experts
A multi-modal mixture of experts exclusively trained using ANS and PFC hemody-
namic features was implemented for emotion identification. The classifier combination
automatically accounted for the variability across participants by estimating classifier
competence and taking classifier confidence into account. The mixture of experts was
capable of improving HA vs. BN identification in three participants.
7.2 Recommendation for future studies
7.2.1 Assessing PFC hemodynamics for emotion identification
in the pediatric population and individuals with severe
disabilities
The results of the current thesis have established grounds for future emotion identifi-
cation efforts involving individuals with severe disabilities. In particular, the pediatric
population with severe motor disabilities who may not be able to use other BCI systems
due to developmental delays, limited expressive communication and unknown levels of
receptive communication may benefit from NIRS-based emotion identification systems.
Emotional response may be an intuitive and more natural means of communication for
children with severe disabilities.
Future studies involving typically developing children and children with severe dis-
abilities should consider frontal cortex development in the target age group [36]. The
current results were achieved based on data from adults for whom the prefrontal cortex
is fully developed. The next step would be to test the proposed system with typically
developing children to investigate the feasibility of arousal and valence identification in
different prefrontal cortex developmental stages. This step will inform studies involving
children with disabilities for whom the prefrontal cortex is unlikely to be affected by the
clinical condition. Ultimately, the system may be tested with children with conditions
which may affect the prefrontal cortex.
Specific changes to current study design may be necessary for testing the system for
the pediatric population. For example, music excerpts used may need to be adjusted for
children (e.g. by using more simplified musical structures). Previous studies involving
music-induced emotions in children may be useful in identifying excerpts suited for inducing emotions in children (e.g. [81, 82]). Including self-selected excerpts may not
be feasible due to difficulty in identifying personal preference in children with regards
to music. More simplified emotion rating paradigms should be considered to facilitate
emotional ratings by children (e.g. using facial gestures as rating items [81]).
7.2.2 Potential clinical implications
The system proposed in this thesis may serve as a passive brain computer interface
for detecting emotional state in nonverbal individuals with severe motor disabilities.
Knowledge of the emotional state may facilitate clinical decisions. For example, by
assessing the emotional response to various interventions, the care-givers and clinicians
may devise improved care strategies.
Physiologically-based emotion identification has been effective in situational interpretations within clinical settings. In a study involving ten children with disabilities, autonomic nervous system activity was monitored during interaction with therapeutic clowns
compared to television exposure in the complex continuing care at Holland Bloorview
Kids Rehabilitation Hospital [105]. The results indicated a significant difference between
therapeutic clown intervention and exposure to television [105]. Similarly, the results
of the current thesis may lead to augmented awareness regarding the patient state by
providing a means for ongoing bed-side monitoring of emotional state using prefrontal
cortex activity.
Magnetic resonance imaging technology offers improved spatial resolution compared
to NIRS and allows monitoring of deeper brain regions not accessible by NIRS. However,
the use of magnetic resonance imaging technology may trigger anxiety and discomfort
to the extent that sedation may be required [150]. Unlike magnetic resonance imaging,
NIRS is suitable for long-term bedside monitoring, particularly in children and infants.
Therefore, future studies of emotion using NIRS of the prefrontal cortex may shed light
on emotional understanding in children of various age groups.
7.2.3 Dynamic emotional rating paradigms
Emotions may appear as transient phenomena during the emotion induction period.
For example, emotions during initial presentation of a musical piece may be different
from those manifested as the music unfolds. Therefore, the next step for studies in-
volving dynamic emotion induction (e.g. music or videos) and the brain is to consider
implementing experimental paradigms that support dynamic emotional rating. Dynamic
emotional ratings will enable the study of the temporal dynamics of emotion using PFC
hemodynamics. Ultimately, investigating the temporal dynamics of emotions with re-
spect to PFC hemodynamics will facilitate emotion decoding in real-life settings where
emotions can be manifested at any point in time.
7.2.4 Emotional sensitivity measures
Due to differences in emotional sensitivity among individuals, future studies should con-
sider emotional intelligence assessments prior to each recording session. Petrides et al.
[169] have shown that individuals with high trait emotional intelligence respond faster
and show more sensitivity in an emotion induction paradigm. These individual differences
may explain the variability in the emotion identification success rate across participants.
In this way, including a measure of emotional sensitivity in addition to the self-reported
ratings may be useful for future investigations involving physiologically-based emotion
identification.
7.2.5 Individual specific analysis
The subject-specific feature selection algorithm has been used for demonstrating the
feasibility of the proposed affective BCI. Previous emotion identification efforts have used
similar approaches due to individual differences in the physiological response to emotions
[172]. However, for this system to be used for individuals with severe motor disabilities,
the most informative features need to be identified. Due to the large variability in the
physiological response to emotions, including large participant cohorts is necessary to
capture various response phenotypes before global features can be identified and used
in studies involving individuals with severe disabilities. Given the limited sample size,
introducing global features was not feasible in the current study. Future studies with
larger sample size may help identify features that can robustly identify emotions across
individuals. Another approach may be to implement adaptive feature selection where the
feature set can be optimized based on individual response phenotypes.
7.2.6 Inclusion of larger sample sizes
The results in this thesis were reported for a sample of 10 able-bodied adults. Given the extent of variability observed in identification accuracy across individuals, this sample size may limit the generalizability of the results to larger populations. To account for
the individual differences, individual feature selection and classification algorithms were
used. In addition, a mixed model was used for statistical analysis to account for the
limited sample size. However, future investigations should consider larger sample sizes
to account for different physiological phenotypes that may exist among individuals.
Appendix A: Open Challenges Regarding Control Mechanisms
Studies involving individuals with disabilities have demonstrated various EEG con-
trol mechanisms. Each control mechanism has challenges and merits with respect to
habituation, required training period, response rate, fatigue, and cognitive awareness.
Exploring subject-specific control, performance predictors, alternative control mecha-
nisms, and self-paced BCI designs can help ameliorate current BCI technologies.
• Habituation and Response Rate
P300-based BCI may be affected by habituation. In particular, there are reports
of P300 peak magnitude and latency changes with repeated exposure to stimuli
[122]. Alternatively, SMR and SCP-based BCIs are not reported to be affected
by habituation. Based on the bit rates achieved in the reviewed articles, SSVEP-
based systems provide the fastest information transfer rate among the four control
mechanisms.
• Training and Fatigue
BCI systems based on evoked responses (P300 and SSVEP) require very little train-
ing for the participants as these responses are naturally occurring. In contrast, it
generally takes several training sessions for a user to learn to modulate spontaneous
EEG patterns. Despite the benefits of SSVEP-based systems with respect to train-
ing and transfer rate, the low-frequency flickering stimuli used by these systems are fatiguing to the eyes and may induce photo-epileptic seizures in the photo-sensitive
population [49].
• Subject-specific EEG-based Control
Studies with able-bodied individuals have indicated that the ability to generate
various EEG patterns is user dependent. For example, of the 81 participants eval-
uating a P300-based speller nearly 3% did not produce any correct characters [64].
Similarly, only 19% of 99 participants using a SMR-based BCI achieved accuracies
above 80% [64]. These results suggest that some users may not be able to generate
EEG patterns to control a particular type of BCI. This issue, however, has not
been investigated in individuals with disabilities.
• Lack of Predictive Indicators of Performance
Due to the large amount of financial and time-related resources often required to
conduct studies involving the target population, it would be beneficial to develop
predictive indicators of success with given control mechanisms. One such predictive
measure is initial performance with a control mechanism. Using an SCP-based BCI,
Neumann et al. found that initial performance was related to performance in later
attempts in five patients with ALS [154]. In a later study, Kübler et al. [116] found
that initial performance was moderately correlated with the performance in the
advanced training sessions. Both studies were conducted with SCP-based BCIs.
• Limited Scope of Mental Tasks
Studies of BCIs based on spontaneous responses have focused exclusively on SMR
and SCP for use in individuals with disabilities. Several other mental tasks such
as language and arithmetic have also been shown to induce distinctive EEG pat-
terns in able-bodied individuals (Millán et al., 2002; Roberts & Penny, 2000). Despite
the cognitive load imposed by these BCIs, they may have merits as BCI control
mechanisms for the target population. To the best of our knowledge, BCIs based
on language and arithmetic mental tasks have not been tested by the target population.
• System-paced Versus Self-paced
The majority of the reviewed BCIs require the user to generate EEG patterns
when cued by the system. This limits the user’s ability to control initiation and
duration of the mental task, a restriction that may hinder system practicality as
an independently controlled communication device. One way to overcome this
limitation is to develop self-paced (asynchronous) BCIs with a no control state
[139]. This can be accomplished through machine learning techniques that allow
detection of specific EEG patterns at any point in time [139, 140]. Leeb et al. (2007)
developed such a system for controlling a wheelchair in the virtual environment and
reported successful operation by an individual with SCI [127].
• Performance Evaluation
The reviewed articles mainly focused on traditional measures of BCI performance,
namely, speed and accuracy. These measures, however, must be appropriately mod-
ified when used to evaluate system performance with individuals with disabilities
[115]. Specifically, performance evaluation should consider the context in which
the system operates. According to the International Classification of Function-
ing, Disability and Health (ICF) (World Health Organization, 2001), this context
includes personal factors such as the nature of the disability, as well as environmen-
tal factors (physical, social, and attitudinal issues). Personal factors relating to the
nature of the disability are important in evaluating BCI suitability. In particular,
severity of the disability may affect BCI performance. For example, while BCIs
have been successfully used by individuals with incomplete locked-in syndrome, the authors of [112] reported that basic communication could not be restored in any of the participants with complete locked-in syndrome. Further study of different locked-in
syndrome sub-types can help identify the population which can most benefit from
BCI use. Another important personal factor is the possible improvement or decline
in function. Specifically, the extent of available communication function is a critical
personal factor. In this light, BCI speed is only a limited measure of performance
gains over other communication means available to the user. For example, while a
BCI may be much slower than speech or muscle activated switches, it may provide
a functional means of communication in the absence of extant muscle control.
• Neuroethics and responsible dissemination to media
With the ubiquity of BCI research, neuroethical concerns are materializing [188],
particularly around the breach of user privacy [46]. Further, many potential BCI
users face communication difficulties due to severe disabilities (e.g. conditions re-
sulting in LIS). Consequently, there are many challenges in reliably obtaining and
interpreting the user’s informed consent for participation in BCI research. In-
terested readers are referred to Haselager et al. (2009) [71]. Researchers should
exercise special care when communicating with caregivers and potential BCI users.
Because the reality of BCI research is often not well-portrayed by the media, users
and care-givers may formulate expectations beyond what is feasible. To manage
expectations, researchers must avoid "over-hyping the significance of their findings" [57, 71]. In a recent study, Nijboer, Clausen, Allison and Haselager (2011) [158] published results of a survey in which more than 80% of 144 BCI researchers acknowledged the importance of active participation of researchers in separating factual and
fictional statements published in the media. In addition, "85.8% of the participants recommended ethical guidelines specific to BCI research and use within five years".
Until such guidelines exist, researchers can prevent user and care-giver frustration
and disappointment by realistically presenting the expected outcomes, as well as
risks and complications surrounding BCI technology.
Appendix B: Music Database
Music excerpts used for emotion induction were selected from a variety of different genres of music. Previous studies of emotion using music were consulted in creating the music database. In addition, motion picture soundtracks were included due to their ability to induce emotions. Table 1 lists the music excerpts included in the common database.
Each participant selected 6 music excerpts prior to data collection. These songs,
listed in Table 2, were chosen by each participant for inducing intense positive or negative emotions. Some participants selected identical music excerpts independently of each other.
In addition, although participants had no prior knowledge of the common music database, a number of its pieces also appeared among the self-selected songs.
Table 1: The list of music pieces included in the common music database
Title – Composer/Artist

Caribbean blue – Enya
Arajuez – Andrea Bocelli
Sirens – Police, "Natural born killers" motion picture
First youth – Ennio Morricone, "Cinema paradiso" motion picture
Bachehaye alp – Mohsen Alizadeh, "Dans les Alp" motion picture
Can't take my eyes off of you – The Everly Brothers
Sur le fil – Yann Tiersen, "Le Fabuleux Destin d'Amelie Poulain" motion picture
Can can – Jacques Offenbach
Goodbye Lenin – Yann Tiersen, "Goodbye Lenin" motion picture
Kinderszenen – Robert Schumann
Agnus dei – Samuel Barber
Just the way you are – Bruno Mars
Nocturne No. 20 in C sharp minor – Frederic Chopin
Halo – Beyonce
Adagio, G minor – Tomaso Albinoni
La noyee – Yann Tiersen, "Le Fabuleux Destin d'Amelie Poulain" motion picture
The man who sold the world – Nirvana
Cello Suite No. 1, Prelude – Johann Sebastian Bach
Concerto No. 3 in F major, Op. 8, Allegro – Antonio Vivaldi
All that I am living for – Evanescence
Bella Ciao – Yves Montand
The mission – Ennio Morricone, "The mission" soundtrack
Les millionnaire du dimanche – Enrico Macias
One day – Matisyahu
Hasta Siempre Comandante – Buena Vista Social Club
The Lion Sleeps Tonight – The Tokens, "The Lion King" motion picture
La vieille barque – Mireille Mathieu
Fireworks – Katy Perry
Nothing else matters – Metallica
Alp – Mohsen Alizadeh, "Dans les Alp" motion picture
Waka Waka (This Time for Africa) – Shakira
Les roi du monde – Philippe d'Avilla, Damien Sargue and Gregori Baquet, "Romeo et Juliette" musical
Lullaby – Javier Navarrete, "Pan's Labyrinth" motion picture
Unforgiven III – Metallica
The Winner Takes It All – Abba
Con Te Partiro – Andrea Bocelli
c'est peut etre des ange – Gerard Lenorman
Malena – Ennio Morricone, "Malena" motion picture
If I had a hammer – Peter, Paul and Mary
je t'aime – Lara Fabien
Haven't met you yet – Michael Buble
Yesterday – The Beatles
To the beat of my heart – Hilary Duff
Hit the road Jack! – Ray Charles
Histoire D'un Amour – Dalida
When our wings are cut can we still fly? – Gustavo Santaolalla, "21 grams" motion picture
Habanera – Georges Bizet, "Carmen" opera
One day I'll fly away – Nicole Kidman, "Moulin Rouge" motion picture
Concerto No. 1 in E major, Op. 8, Allegro – Antonio Vivaldi
Sari gelin – Composer unknown (Armenian, Azerbaijani, Persian, and Turkish folk song)
Empire State of Mind – Jay Z, Alicia Keys
Wonderful life – Black
Scarborough fair – Simon and Garfunkel
Por una cabeza – Carlos Gardel, featured in "The Scent of a Woman" motion picture
Je suis malade – Alice Dona and Serge Lama
Cloud song – Riverdance
Send me an angel – The Scorpions
Cinderella – Steven Curtis Chapman
Don't dwell – Tracy Chapman
I will survive – Gloria Gaynor
I wanna hold your hand – The Beatles
Moon river – Audrey Hepburn, "Breakfast at Tiffany's" motion picture
You are loved – Josh Groban
Verone – "Notre Dame de Paris" musical
Don't cry for me Argentina – Sinead O'Connor cover, "Don't Cry for me Argentina" motion picture
The voice – Celtic Women
Slow me down – Emmy Rossum
Zombie – The Cranberries
Apres une reve (Op. 7, No. 1) – Gabriel Faure
Dani California – Red Hot Chili Peppers
Caruso – Lucio Dalla
Waltz, Swan Lake ballet – Pyotr Ilyich Tchaikovsky
Table 2: The list of self-selected music pieces
Title | Composer/Artist
Iris | The Goo Goo Dolls
Tears in heaven | Eric Clapton
Requiem, "Dies Irae" | Wolfgang Amadeus Mozart
Untitled | Sigur Ros
Ain't no mountain high enough | Marvin Gaye and Tammi Terrell
Theme from Schindler's List | John Williams, "Schindler's List" motion picture
Julien | Placebo
How to save a life | The Fray
Nocturne No. 20 in C sharp minor | Frederic Chopin
Virtual insanity | Jamiroquai
Little Town | Paige O'Hara, Richard White and chorus, "Beauty and the Beast" motion picture
A world of our own | The Seekers
Veronica Sawyer smokes | Crash Love
Tall trees | Matt Mays and El Torpedo
Cello Suite No. 1, Prelude | Johann Sebastian Bach
Hallelujah | Jeff Buckley
News bar | Charlie Clouser
Un petit peu d'air | Felipecha
Grand valse brillante | Frederic Chopin
That's how you know | Amy Adams, "Enchanted" motion picture
He wasn't man enough for me | Toni Braxton
Man! I feel like a woman | Shania Twain
Mad world | Gary Jules
Be our guest | Chorus, "Beauty and the Beast" motion picture
Through and through and through | Joel Plaskett
So close | Jon McLaughlin, "Enchanted" motion picture
Human nature | Michael Jackson
Way over yonder in the minor key | Billy Bragg and Wilco
Everyday will be like a holiday | William Bell (RZA remix)
Give me Jesus | Fernando Ortega
Don't forget about me | Chris Kirby
I like it | Julio Iglesias
Love you | Free Design
Au parc | Chiara Mastroianni
Candle in the wind | Elton John
Red sun | Neil Young
Blower's daughter | Damien Rice
Bookends | Simon and Garfunkel
All I do is win | DJ Khaled
Hit the road Jack! | Ray Charles
Appendix C: Music characteristic extraction using MIRTOOLBOX
Music characteristics used in this thesis were sound pressure level, mode, and dissonance; the extraction of each feature is described in detail in this appendix. The interested reader is referred to [89] for more details regarding music characteristic extraction.
• Sound pressure
The sound pressure level (Equation 1) is a logarithmic measure of sound pressure (ρrms), expressed in decibels (dB) above a standard reference level (ρref = 2 × 10^-5 Pa):

L(dB) = 20 log10(ρrms / ρref)    (1)
The sound pressure level waveform indicates the volume changes throughout the
music excerpt, and is extracted from the music waveform.
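As an illustration of Equation 1, the sound pressure level can be computed frame by frame from a sampled waveform. The sketch below is not the MIRTOOLBOX implementation; the frame length, hop size, and the assumption of a pressure-calibrated (pascal-valued) signal are illustrative choices.

```python
import numpy as np

P_REF = 2e-5  # standard reference pressure, 2 x 10^-5 Pa

def sound_pressure_level(waveform, frame_len=1024, hop=512):
    """Frame-wise sound pressure level in dB, following Equation 1.

    `waveform` is assumed to be a 1-D array calibrated in pascals;
    for uncalibrated audio the output is only a relative level.
    """
    levels = []
    for start in range(0, len(waveform) - frame_len + 1, hop):
        frame = waveform[start:start + frame_len]
        p_rms = np.sqrt(np.mean(frame ** 2))        # RMS pressure of the frame
        levels.append(20.0 * np.log10(p_rms / P_REF))
    return np.array(levels)

# A 1 kHz tone with an RMS pressure of 1 Pa corresponds to roughly 94 dB SPL.
t = np.arange(0, 1.0, 1.0 / 8000.0)
tone = np.sqrt(2.0) * np.sin(2 * np.pi * 1000 * t)  # peak sqrt(2) Pa -> RMS 1 Pa
spl = sound_pressure_level(tone)
```

The sequence of frame levels returned here is the kind of volume-change curve described above.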
• Dissonance
The MIRTOOLBOX [124] estimates dissonance using a method proposed by Plomp and Levelt (1965) [176], which determines sensory dissonance by identifying pairs of sinusoidal components that appear close in frequency; the frequency ratio of each pair of sinusoids was used to quantify its dissonance. The total dissonance was determined by computing the peaks of the spectrum and averaging the dissonance over all possible pairs of peaks [205].
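The pairwise procedure can be sketched as follows. This is a hedged illustration using Sethares' common parameterization of the Plomp and Levelt dissonance curve; the constants are assumptions and may differ from the exact values used in the MIRTOOLBOX.

```python
import numpy as np

def pair_dissonance(f1, f2, a1=1.0, a2=1.0):
    """Sensory dissonance of two partials (Sethares' fit of the
    Plomp-Levelt curve; the constants are illustrative assumptions)."""
    f_low, f_high = min(f1, f2), max(f1, f2)
    s = 0.24 / (0.021 * f_low + 19.0)   # scales the curve to the critical band
    x = s * (f_high - f_low)
    return a1 * a2 * (np.exp(-3.5 * x) - np.exp(-5.75 * x))

def total_dissonance(peak_freqs, peak_amps):
    """Average pairwise dissonance over all spectral peak pairs."""
    n = len(peak_freqs)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    return float(np.mean([pair_dissonance(peak_freqs[i], peak_freqs[j],
                                          peak_amps[i], peak_amps[j])
                          for i, j in pairs]))

# A minor second (440 vs 466.2 Hz) scores as far more dissonant than an octave.
d_minor2 = total_dissonance([440.0, 466.2], [1.0, 1.0])
d_octave = total_dissonance([440.0, 880.0], [1.0, 1.0])
```

In practice the peak frequencies and amplitudes would come from a spectral peak picker applied to each analysis frame of the music excerpt.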
• Mode
The MIRTOOLBOX [124] determines the mode (i.e. major/minor) using the key strength value, which represents the probability of each candidate key. This probability is determined by cross-correlating the chromagram with profiles representing each possible tonality [62, 110].
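A minimal sketch of this key-strength computation, assuming the standard Krumhansl-Kessler probe-tone profiles (the MIRTOOLBOX uses related key profiles, so the exact values here are an assumption):

```python
import numpy as np

# Krumhansl-Kessler probe-tone profiles for major and minor keys.
MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                  2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                  2.54, 4.75, 3.98, 2.69, 3.34, 3.17])

def estimate_mode(chroma):
    """Return 'major' or 'minor' for a 12-bin chromagram.

    Key strength = correlation between the chromagram and each of the
    24 rotated key profiles; the mode comes from the strongest key.
    """
    best_major = max(np.corrcoef(chroma, np.roll(MAJOR, k))[0, 1] for k in range(12))
    best_minor = max(np.corrcoef(chroma, np.roll(MINOR, k))[0, 1] for k in range(12))
    return "major" if best_major >= best_minor else "minor"

# A chromagram emphasizing C, E and G (a C major triad) registers as major.
chroma = np.zeros(12)
chroma[[0, 4, 7]] = 1.0
print(estimate_mode(chroma))  # prints: major
```

Rolling the profile by k aligns its tonic with pitch class k, so the 12 rotations of each profile cover all 24 candidate keys.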
Table 3: The significance of the main effect of a. Mode, b. Dissonance, and c. Maximum sound pressure level for each recording site shown in Figure 2.1. (α = 0.05)

Characteristic                 Chromophore   R1   R2   R3        R4   O    L1   L2   L3        L4
a. Mode                        [HbO2]        X    X    X         X    X    X    X    X         X
                               [Hb]          X    X    X         X    X    X    X    X         X
b. Dissonance                  [HbO2]        X    X    X         X    X    X    X    p=0.034   X
                               [Hb]          X    X    X         X    X    X    X    X         X
c. Max sound pressure level    [HbO2]        X    X    p=0.015   X    X    X    X    X         X
                               [Hb]          X    X    X         X    X    X    X    X         X
Appendix D: Region specific analysis of [HbO2] and [Hb] with respect to
music characteristics
In chapter 6, to identify the effect of music characteristics on PFC [HbO2] and [Hb],
these signals were averaged across the nine recording sites (see Figure 2.1). However,
hemodynamic changes may vary across the PFC, and identifying the effect of music
characteristics in each recording location is a meaningful pursuit.
Although considering regional activity patterns was appealing, the limited number of samples available for this analysis (72 samples per participant) motivated the use of the average [HbO2] and [Hb] signals (including each recording site would lead to 18 comparisons per music characteristic). The average [HbO2] and [Hb] signals reliably captured the general pattern of hemodynamic changes. Nevertheless, a separate analysis involving the maximum [HbO2] and [Hb] recordings at each recording site was conducted, and the significance of the effect of each music characteristic is reported in Table 3. The
entries marked with 'X' in Table 3 indicate that the effect of the corresponding music characteristic did not reach significance. The effect of mode did not reach significance (α = 0.05) at any of the recording locations. The effect of dissonance was significant for maximum [HbO2] at location L3, and the effect of maximum sound pressure level was significant at location R3 (see Figure 2.1); these locations correspond to inferolateral PFC regions. However, after applying a Bonferroni adjustment for the nine comparisons per chromophore, which results in α = 0.005, the effect of music characteristics does not reach significance at any of the locations.
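The Bonferroni reasoning above can be checked numerically; the two p-values below are the uncorrected values reported in Table 3 for dissonance (L3) and maximum sound pressure level (R3).

```python
# Nine recording sites are tested per chromophore, so the Bonferroni
# per-test threshold is alpha divided by nine.
alpha = 0.05
n_sites = 9
threshold = alpha / n_sites  # ~0.0056, quoted as 0.005 in the text

# Uncorrected p-values that reached nominal significance in Table 3.
p_values = {"dissonance at L3": 0.034, "max sound pressure level at R3": 0.015}

for name, p in p_values.items():
    verdict = "significant" if p < threshold else "not significant"
    print(f"{name}: p = {p} -> {verdict} after Bonferroni")
```

Neither value falls below the corrected threshold, consistent with the conclusion above.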
Appendix E: Contributions from Systemic Blood Flow
In NIRS hemodynamic monitoring, the near-infrared light must travel through the scalp, skull, and cerebrospinal fluid before it reaches the brain. This passage through the scalp may introduce systemic blood flow components into the detected signals [84].
Assessing contributions from the systemic blood flow (e.g. skin blood flow) in the recorded
signals is an important pursuit for researchers in the field of functional NIRS. Certain recording practices have been shown to reduce the influence of unwanted systemic blood flow; for example, increasing the distance between the light source and detector has been shown to reduce systemic blood flow contributions [58, 59]. The source-detector distance
selected for this study (i.e. r=3cm) was within the recommended range for detecting
cerebral hemodynamic changes. Joint studies using magnetic resonance imaging and
NIRS have confirmed reliable hemodynamic detection at 3.3 cm [195]. The wavelengths
used for detection (i.e. 690 nm and 830 nm) were also shown to reduce the contribution
of the systemic blood flow to the detected signals [196]. In addition to the strategies
used in the current thesis, an important practice for future research in this field would be
inclusion of systemic blood flow monitors such as laser Doppler flowmetry. Using these
sensors, the systemic blood flow contributions can be directly measured and compared
to the NIRS recordings. For example, Hoshi et al. [83] used laser Doppler flowmetry
sensors and demonstrated that there were no task-related changes in the systemic blood
flow, while the NIRS recordings showed significant modulations. Another study involving
60 second exposure to visual stimulus by Villringer et al. [228], also demonstrated that
significant NIRS signal changes were not accompanied by task-related modulations in
systemic blood flow detected using laser Doppler flowmetry. Lack of additional laser
Doppler flowmetry sensors in the current thesis is a limitation that needs to be addressed
in future studies. Although empirical observations such as region-specific changes with
respect to emotional rating suggest a more dominant contribution from the cerebral blood
flow compared to skin blood flow, future studies of emotion using NIRS should consider
including additional skin blood flow sensors (e.g. laser Doppler flowmetry sensors or NIRS
sensors placed less than 0.5 cm apart) to assess systemic contribution to the results.
Appendix F: Cognitive Processing Activity in the Prefrontal Cortex
The prefrontal cortex is engaged during various emotional and cognitive processes.
The reader is referred to Section 1.4.1 for more details regarding the role of the PFC from a network perspective. Because the prefrontal cortex is recruited during such a wide range of processes, activities other than emotional response (e.g. cognitive processing) may have modulated brain activity in this study; unrelated cognitive functions, such as incidental thoughts, could therefore be misrepresented as emotional responses. To reduce the probability of detecting cognitive processes unrelated to emotion, the analysis was conducted with respect to subjective ratings and multiple trials were used (144 trials per individual). Unless unrelated cognitive activities (e.g. distractions or incidental thoughts) are consistently repeated across trials, increasing the number of trials helps to mask them. In addition, assuming that subjective ratings
are a correct representation of one’s emotions, unrelated cognitive tasks occurring during
the trial may be reflected in the ratings (i.e. trials during which distractions occurred
may be rated lower). In addition to cognitive processes unrelated to emotions, there
may be those accompanying emotions. Distinguishing between cognitive and emotional
response may be challenging. For example, one of the mechanisms through which music
can induce emotion is by evoking episodic memories which involves memory retrieval
[92]. Hence, emotional responses can be accompanied by cognitive appraisal. The current study design cannot distinguish cognitive appraisal that accompanies or gives rise to emotional response from the emotional response itself. However, even if such cognitive appraisal was detected during the study, the findings would not be undermined, because detecting the cognitive appraisal that accompanies emotions would still lead to identifying emotions.
Appendix G: Research Ethics
Figure 1: Ethics approval notice
Assessing auditory stimuli presentation modalities in the affective modulation-based human
computer interface
November XX, 2010
Dear Participant,
My name is Saba Moghimi. I am a PhD. student at the University of Toronto. My supervisor,
Professor Tom Chau, and I work in a research team at Bloorview Kids Rehab. We are
investigating a technology that can potentially be used as a communication device for people
who cannot move or speak. Before agreeing to take part in this study, I would like to tell you
how you will be involved.
What is the study about?
Access technologies help people who cannot move or speak to communicate with other people.
Switches and eye trackers are examples of these technologies. Unfortunately, people who cannot
make movements cannot use these switches. To help these people, researchers are investigating
communication devices that are controlled by brain activity.
In this study, we will try to use brain activity and some other body signals to detect emotional
reactions in response to auditory stimulus (music). This study will not help you. This study will
help us design devices that help people with disabilities who cannot speak or express what they
like.
How will I be involved in this study?
To volunteer for this study you must be able to communicate in English. You must also be at
least 18 years old and have normal or corrected-to-normal vision and hearing. Please do not
volunteer for this study if you know you have any of the following conditions: 1) degenerative
disease; 2) cardiovascular disease; 3) metabolic disorders; 4) trauma-induced brain injury; 5)
respiratory conditions; 6) drug- and alcohol-related conditions; and 7) psychiatric conditions.
We will ask you to come in for four sessions over a 3-5 month period. Each session will be about
two hours.
You will be asked not to drink any caffeinated beverages or alcohol an hour before the recording
sessions. We will send you a reminder before each session.
We will put some sensors on your forehead. You can see the sensors in Figure 1. These sensors
can record your brain activity. Do not worry; we will not be able to read your thoughts. We will
also put sensors on your finger to measure your skin temperature, the amount of sweat in your
skin, and your pulse. You can see these sensors in Figure 1.B. We will also ask you to wear a belt
around your chest to record how you breathe. These sensors will not hurt you. You can let the
researcher know if you are uncomfortable and we will remove the sensors. We can stop the
recording or let you take a break if you are tired.
Before the experiment, we will ask you to name a number of music pieces you like. You will
hear the music you told us about and some other music pieces and sounds from the environment.
We will ask you to rate how you felt listening to the music after it plays.
Will anyone know what I say?
Your brain signals and physiological signals will be recorded in a private room. Only you and
the researcher will be present. You can feel free to ask the researcher any questions about the
experiment. All your concerns will be kept confidential. We will not be able to read your mind or
your thoughts with these signals.
All the information that we collect from you will be confidential. All the forms that may have
your information and the data collected from you will be saved on a secure server or in a locked
cabinet. We will not use your name when publishing the results of this study. We will keep your
name and the data collected from you for seven years, and will destroy all the information at the
end of this time. We will not release any information that might identify you without asking for
your consent.
Do I have to do this?
If you decide not to take part in this study, that is okay. If you decide to take part, but change
your mind at any time, that is also okay. You may drop out of the study at any time. Doing this
will not affect your status at Bloorview Kids Rehab or at the University of Toronto.
What are the risks and benefits?
You may get tired during the experiment. We have planned breaks during the session, but you
can ask for additional breaks during the experiment if you wish. You may also get bored or feel
sleepy. Please let us know when you are tired. We will let you take a break.
You will not directly benefit from this study. However, we think that this study will benefit
people who have no means of communication. After the study, we will send you a thank you
letter, and you will also receive a small token of appreciation for your participation.
What if I have questions?
Please ask me to explain anything you don’t understand before signing the consent form. My
phone number is 416-425-6220 x3603. If you leave a message, I will return your call within 48
hours. I can also be reached by email at [email protected].
Thank you for thinking about helping us with this project.
Yours sincerely,
Saba Moghimi
Ph.D Candidate
Bloorview Kids Rehab
Phone: 416-425-6220 x3270
E-mail: [email protected]
Supervisor:
Professor Tom Chau
Bloorview Kids Rehab
E-mail: [email protected]
CONSENT FORM
Holland Bloorview Kids Rehabilitation Hospital
Re: Detecting mental selection on the basis of prefrontal cortical and autonomic nervous system activity
Please complete the form below and return it to the investigator.
The investigator explained this study to me. I read the information letter dated __________________
and understand what this study is about. I understand that I may drop out of the study at any time. I
agree to participate in this study.
______________________________ _________________________ _________
Participant’s Name (please print) Signature Date
______________________________ ___________________________ _________
Researcher’s Name Signature Date
Figure 2: Participant consent form
Bibliography
[1] F. Agrafioti, D. Hatzinakos, and A.K. Anderson. ECG pattern analysis for emotion detection. Affective Computing, IEEE Transactions on, 3(1):102–115, 2012.
[2] C. Ahlstrom, A. Johansson, F. Uhlin, T. Lanne, and P. Ask. Noninvasive in-
vestigation of blood pressure changes using the pulse wave transit time: a novel
approach in the monitoring of hemodialysis patients. Journal of Artificial Organs,
8(3):192–197, 2005.
[3] B.Z. Allison, D.J. McFarland, G. Schalk, S.D. Zheng, M.M. Jackson, and J.R. Wol-
paw. Towards an independent brain-computer interface using steady state visual
evoked potentials. Clinical neurophysiology, 119(2):399–408, 2008.
[4] E. Altenmuller, K. Schurmann, V.K. Lim, and D. Parlitz. Hits to the left, flops
to the right: different emotions during listening to music are reflected in cortical
lateralisation patterns. Neuropsychologia, 40(13):2242–2256, 2002.
[5] F. Babiloni, F. Cincotti, M. Marciani, S. Salinari, L. Astolfi, F. Aloise,
F. De Vico Fallani, and D. Mattia. On the use of brain–computer interfaces outside
scientific laboratories: Toward an application in domotic environments. Interna-
tional review of neurobiology, 86:133–146, 2009.
[6] O. Bai, P. Lin, S. Vorbach, M.K. Floeter, N. Hattori, and M. Hallett. Sensorimotor
beta rhythm-based brain–computer interface. Journal of neural engineering, 5:24–
35, 2008.
[7] R. Bates and HO Istance. Why are eye mice unpopular? A detailed comparison of
head and eye controlled assistive technology pointing devices. Universal Access in
the Information Society, 2(3):280–290, 2003.
[8] G. Bauer, F. Gerstenbrand, and E. Rumpl. Varieties of the locked-in syndrome.
Journal of Neurology, 221(2):77–91, 1979.
[9] T. Baumgartner, M. Esslen, and L. Jancke. From emotion perception to emotion
experience: Emotions evoked by pictures and classical music. International Journal
of Psychophysiology, 60(1):34–43, 2006.
[10] J.D. Bayliss, S.A. Inverso, and A. Tentler. Changing the P300 brain computer
interface. CyberPsychology & Behavior, 7(6):694–704, 2004.
[11] N. Birbaumer. Slow cortical potentials: Plasticity, operant control, and behavioral
effects. The Neuroscientist, 5(2):74, 1999.
[12] N. Birbaumer, T. Elbert, AG Canavan, and B. Rockstroh. Slow potentials of the
cerebral cortex and behavior. Physiological Reviews, 70(1):1, 1990.
[13] N. Birbaumer, N. Ghanayim, T. Hinterberger, I. Iversen, B. Kotchoubey, A. Kubler,
J. Perelmouter, E. Taub, and H. Flor. A spelling device for the paralysed. Nature,
398(6725):297–298, 1999.
[14] N. Birbaumer, A. Kubler, N. Ghanayim, T. Hinterberger, J. Perelmouter, J. Kaiser,
I. Iversen, B. Kotchoubey, N. Neumann, and H. Flor. The thought translation device (TTD) for completely paralyzed patients. IEEE Transactions on Rehabilitation
Engineering, 8(2):190–193, 2000.
[15] A.J. Blood and R.J. Zatorre. Intensely pleasurable responses to music correlate
with activity in brain regions implicated in reward and emotion. Proceedings of the
National Academy of Sciences of the United States of America, 98(20):11818, 2001.
[16] A.J. Blood, R.J. Zatorre, P. Bermudez, and A.C. Evans. Emotional responses to
pleasant and unpleasant music correlate with activity in paralimbic brain regions.
Nature neuroscience, 2:382–387, 1999.
[17] M. Boso, P. Politi, F. Barale, and E. Emanuele. Neurophysiology and neurobiology of the musical experience. Functional Neurology, 21(4):187–191, 2006.
[18] N.C. Brady, J. Marquis, K. Fleming, and L. McLean. Prelinguistic predictors of
language growth in children with developmental disabilities. Journal of Speech,
Language and Hearing Research, 47(3):663, 2004.
[19] G.C. Bruner. Music, mood, and marketing. The Journal of Marketing, pages
94–104, 1990.
[20] R.L. Buckner, J. Sepulcre, T. Talukdar, F.M. Krienen, H. Liu, T. Hedden, J.R.
Andrews-Hanna, R.A. Sperling, and K.A. Johnson. Cortical hubs revealed by
intrinsic functional connectivity: mapping, assessment of stability, and relation to
alzheimer’s disease. The Journal of Neuroscience, 29(6):1860–1873, 2009.
[21] S.C. Bushong. Magnetic resonance imaging. St. Louis, MO (USA); CV Mosby Co.,
1988.
[22] A.H. Buss and R. Plomin. A temperament theory of personality development. Wiley-
Interscience, 1975.
[23] J.J. Campos, R.G. Campos, and K.C. Barrett. Emergent themes in the study
of emotional development and emotion regulation. Developmental Psychology,
25(3):394, 1989.
[24] C.S. Carter, T.S. Braver, D.M. Barch, M.M. Botvinick, D. Noll, and J.D. Cohen.
Anterior cingulate cortex, error detection, and the online monitoring of perfor-
mance. Science, 280(5364):747–749, 1998.
[25] R. Chavarriaga and J. del R Millan. Learning from eeg error-related potentials in
noninvasive brain-computer interfaces. Neural Systems and Rehabilitation Engi-
neering, IEEE Transactions on, 18(4):381–388, 2010.
[26] C. Neuper, G.R. Muller-Putz, R. Scherer, and G. Pfurtscheller. Motor
imagery and EEG-based control of spelling devices and neuroprostheses. Event-
related dynamics of brain oscillations, page 393, 2006.
[27] F. Cincotti, D. Mattia, F. Aloise, S. Bufalari, G. Schalk, G. Oriolo, A. Cherubini,
M.G. Marciani, and F. Babiloni. Non-invasive brain–computer interface system:
Towards its application as assistive technology. Brain research bulletin, 75(6):796–
803, 2008.
[28] C. Collet, E. Vernet-Maury, G. Delhomme, and A. Dittmar. Autonomic nervous
system response patterns specificity to basic emotions. Journal of the autonomic
nervous system, 62(1-2):45–57, 1997.
[29] J. Conradi, B. Blankertz, M. Tangermann, V. Kunzmann, and G. Curio. Brain-
computer interfacing in tetraplegic patients with high spinal cord injury. Int J
Bioelectromagnetism Volume, 11(2):65–68, 2009.
[30] M. Cope. The application of near infrared spectroscopy to non invasive monitoring
of cerebral oxygenation in the newborn infant. Department of Medical Physics and
Bioengineering, University College London, pages 214–9, 1991.
[31] R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis, G. Votsis, S. Kollias, W. Fellenz, and
J.G. Taylor. Emotion recognition in human-computer interaction. Signal Processing
Magazine, IEEE, 18(1):32–80, 2001.
[32] S. Dalla Bella, I. Peretz, L. Rousseau, and N. Gosselin. A developmental study of
the affective value of tempo and mode in music. Cognition, 80(3):B1–B10, 2001.
[33] R.J. Davidson. Emotion and affective style: Hemispheric substrates. Psychological
Science, 3(1):39, 1992.
[34] R.J. Davidson. What does the prefrontal cortex "do" in affect? Perspectives on frontal EEG asymmetry research. Biological Psychology, 67:219–233, 2004.
[35] T. Demiralp et al. Event-related oscillations are real brain responses: wavelet analysis and new strategies. International Journal of Psychophysiology, 39(2-3):91–127, 2001.
[36] M. Dennis. Prefrontal cortex: Typical and atypical development. The frontal lobes:
Development, function and pathology, pages 128–162, 2006.
[37] P.A. Di Mattia, F.X. Curran, and J. Gips. An eye control teaching device for
students without language expressive capacity: EagleEyes. Edwin Mellen Pr, 2001.
[38] E. Donchin, K.M. Spencer, and R. Wijesinghe. The mental prosthesis: assessing the speed of a P300-based brain-computer interface. IEEE Transactions on Rehabilitation Engineering, 8(2):174–179, 2000.
[39] W.C. Drevets, J.L. Price, J.R. Simpson, R.D. Todd, T. Reich, M. Vannier, and
M.E. Raichle. Subgenual prefrontal cortex abnormalities in mood disorders. Nature,
386(6627):824–827, 1997.
[40] R.O. Duda, P.E. Hart, and D.G. Stork. Pattern classification, volume 2. Citeseer,
2001.
[41] A. Duncan, J.H. Meek, M. Clemence, CE Elwell, L. Tyszczuk, M. Cope, and
D. Delpy. Optical pathlength measurements on adult head, calf and forearm and
the head of the newborn infant using phase resolved optical spectroscopy. Physics
in Medicine and Biology, 40:295, 1995.
[42] T. Elbert, N. Birbaumer, W. Lutzenberger, and B. Rockstroh. Biofeedback of slow
cortical potentials: self-regulation of central-autonomic patterns. Biofeedback and
self-regulation, pages 321–342, 1979.
[43] T. Elbert, B. Rockstroh, W. Lutzenberger, and N. Birbaumer. Biofeedback of
slow cortical potentials. I. Electroencephalography and Clinical Neurophysiology,
48(3):293–301, 1980.
[44] A. Etkin, T.D. Wager, et al. Functional neuroimaging of anxiety: a meta-analysis of
emotional processing in ptsd, social anxiety disorder, and specific phobia. American
Journal of Psychiatry, 164(10):1476–1488, 2007.
[45] T.H. Falk, M. Guirgis, S. Power, and T. Chau. Taking nirs-bcis outside the lab:
Towards achieving robustness against environment noise. Neural Systems and Re-
habilitation Engineering, IEEE Transactions on, 19(2):136–146, 2011.
[46] M.J. Farah. Neuroethics: the practical and the philosophical. Neuroethics Publi-
cations, page 8, 2005.
[47] L.A. Farwell and E. Donchin. Talking off the top of your head: toward a men-
tal prosthesis utilizing event-related brain potentials. Electroencephalography and
clinical Neurophysiology, 70(6):510–523, 1988.
[48] E.A. Felton, J.A. Wilson, J.C. Williams, and P.C. Garell. Electrocorticographically
controlled brain–computer interfaces using motor and sensory imagery in patients
with temporary subdural electrode implants. Journal of Neurosurgery: Pediatrics,
106(3), 2007.
[49] R.S. Fisher, G. Harding, G. Erba, G.L. Barkley, and A. Wilkins. Photic-and
pattern-induced seizures: a review for the Epilepsy Foundation of America Working
Group. Epilepsia, 46(9):1426–1441, 2005.
[50] S.T. Fiske, D.T. Gilbert, and G. Lindzey. Handbook of social psychology. 1, 2010.
[51] E.O. Flores-Gutierrez, J.L. Dıaz, F.A. Barrios, R. Favila-Humara, M.A. Guevara,
Y. del Rıo-Portilla, and M. Corsi-Cabrera. Metabolic and electric brain patterns
during pleasant and unpleasant emotions induced by music masterpieces. Interna-
tional Journal of Psychophysiology, 65(1):69–84, 2007.
[52] GM Friehs, VA Zerris, CL Ojakangas, MR Fellows, and JP Donoghue. Brain-
machine and brain-computer interfaces. Stroke, 35(11, Supplement 1):2702–2705,
2004.
[53] N.H. Frijda and B. Mesquita. The social roles and functions of emotions. Emotion
and culture, pages 51–87, 1994.
[54] M. Frisch and H. Messer. The use of the wavelet transform in the detection of an
unknown transient signal. Information Theory, IEEE Transactions on, 38(2):892–
897, 1992.
[55] T. Fritz, S. Jentschke, N. Gosselin, D. Sammler, I. Peretz, R. Turner, A.D.
Friederici, and S. Koelsch. Universal recognition of three basic emotions in music.
Current Biology, 19(7):573–576, 2009.
[56] A.W.K. Gaillard. Slow brain potentials preceding task performance. Biological
Psychology, 21(4):282–283, 1985.
[57] J.M. Garreu and S.J. Bird. Ethical issues in communicating science. Science and
engineering ethics, 6(4):435–442, 2000.
[58] TJ Germon, PD Evans, NJ Barnett, P Wall, AR Manara, and RJ Nelson. Cerebral
near infrared spectroscopy: emitter-detector separation must be increased. British
journal of anaesthesia, 82(6):831–837, 1999.
[59] TJ Germon, PD Evans, AR Manara, NJ Barnett, P Wall, and RJ Nelson. Sensitiv-
ity of near infrared spectroscopy to cerebral and extra-cerebral oxygenation changes
is determined by emitter-detector separation. Journal of clinical monitoring and
computing, 14(5):353–360, 1998.
[60] A. Gerrards-Hesse, K. Spies, and F.W. Hesse. Experimental inductions of emotional
states and their effectiveness: A review. British Journal of Psychology, 85(1):55–78,
1994.
[61] S. Glennen and D.C. DeCoste. The handbook of augmentative and alternative
communication. 1997.
[62] E. Gomez. Tonal description of polyphonic audio for music content processing.
INFORMS Journal on Computing, 18(3):294–304, 2006.
[63] M.D. Greicius, B. Krasnow, A.L. Reiss, and V. Menon. Functional connectivity in
the resting brain: a network analysis of the default mode hypothesis. Proceedings
of the National Academy of Sciences, 100(1):253–258, 2003.
[64] C. Guger, G. Edlinger, W. Harkam, I. Niedermayer, and G. Pfurtscheller. How
many people are able to operate an EEG-based brain-computer interface (BCI)?
IEEE Transactions on Neural Systems and Rehabilitation Engineering, 11(2):145,
2003.
[65] M. Guirgis, T. Falk, S. Power, S. Blain, and T. Chau. Harnessing physiological
responses to improve nirs-based brain-computer interface performance. In Proc.
ISSNIP Biosignals and Biorobotics Conference 2010, pages 59–62, 2010.
[66] A. Haag, S. Goronzy, P. Schaich, and J. Williams. Emotion recognition using
bio-sensors: First steps towards an automatic system. Affective Dialogue Systems,
pages 36–48, 2004.
[67] B. Hamadicharef, H. Zhang, C. Guan, C. Wang, K.S. Phua, K.P. Tee, and K.K.
Ang. Learning eeg-based spectral-spatial patterns for attention level measurement.
pages 1465–1468, 2009.
[68] M. Hamalainen, R. Hari, R.J. Ilmoniemi, J. Knuutila, and O.V. Lounasmaa. Magnetoencephalography: theory, instrumentation, and applications to noninvasive studies of the working human brain. Reviews of Modern Physics, 65(2):413, 1993.
[69] HM Hamer, HH Morris, EJ Mascha, MT Karafa, WE Bingaman, MD Bej,
RC Burgess, DS Dinner, NR Foldvary, JF Hahn, et al. Complications of inva-
sive video-EEG monitoring with subdural grid electrodes. Neurology, 58(1):97,
2002.
[70] M.B. Happ. Interpretation of nonvocal behavior and the meaning of voicelessness
in critical care. Social Science & Medicine, 50(9):1247–1255, 2000.
[71] P. Haselager, R. Vlek, J. Hill, and F. Nijboer. A note on ethical aspects of bci.
Neural Networks, 22(9):1352–1357, 2009.
[72] J. Healey and R. Picard. Digital processing of affective signals. 6:3749–3752, 1998.
[73] C.S. Herrmann. Human EEG responses to 1–100 Hz flicker: resonance phenomena
in visual cortex and their potential correlation to cognitive phenomena. Experi-
mental Brain Research, 137(3):346–353, 2001.
[74] MJ Herrmann, A.C. Ehlis, and AJ Fallgatter. Prefrontal activation through task
requirements of emotional induction measured with NIRS. Biological psychology,
64(3):255–263, 2003.
[75] K. Hevner. The affective character of the major and minor modes in music. The
American Journal of Psychology, pages 103–118, 1935.
[76] T. Hinterberger, A. Kubler, J. Kaiser, N. Neumann, and N. Birbaumer. A brain
computer interface (BCI) for the locked in: comparison of different EEG classifica-
tions for the thought translation device. Clinical Neurophysiology, 114(3):416–425,
2003.
[77] LR Hochberg, MD Serruya, GM Friehs, JA Mukand, M Saleh, AH Caplan, A Bran-
ner, D Chen, RD Penn, and JP Donoghue. Neuronal ensemble control of prosthetic
devices by a human with tetraplegia. Nature, 442(7099):164–171, 2006.
[78] U. Hoffmann, J.M. Vesin, T. Ebrahimi, and K. Diserens. An efficient P300-based
brain-computer interface for disabled subjects. Journal of Neuroscience methods,
167(1):115–125, 2008.
[79] S. Holm. A simple sequentially rejective multiple test procedure. Scandinavian
journal of statistics, pages 65–70, 1979.
[80] C.B. Holroyd and M.G.H. Coles. The neural basis of human error processing:
reinforcement learning, dopamine, and the error-related negativity. Psychological
review, 109(4):679, 2002.
[81] T. Hopyan, S. Laughlin, and M. Dennis. Emotions and their cognitive control in
children with cerebellar tumors. Journal of the International Neuropsychological
Society, 1(-1):1–12, 2006.
[82] T. Hopyan, S. Laughlin, and M. Dennis. Emotions and their cognitive control in children with cerebellar tumors. Journal of the International Neuropsychological Society, 16(6):1027, 2010.
[83] Y. Hoshi, J. Huang, S. Kohri, Y. Iguchi, M. Naya, T. Okamoto, and S. Ono. Recog-
nition of human emotions from cerebral blood flow changes in the frontal region:
A study with event-related near-infrared spectroscopy. Journal of Neuroimaging,
21(2):e94–e101, 2011.
[84] Yoko Hoshi. Functional near-infrared spectroscopy: Potential and limitations in
neuroimaging studies. International Review of Neurobiology, 66:237–266, 2005.
[85] G. Husain, W.F. Thompson, and E.G. Schellenberg. Effects of musical tempo and
mode on arousal, mood, and spatial abilities. Music Perception, 20(2):151–171,
2002.
[86] S. Inci and T. Ozgen. Locked-in syndrome due to metastatic pontomedullary tumor: case report. Neurologia Medico-Chirurgica, 43(10):497–500, 2003.
[87] IH Iversen, N. Ghanayim, A. Kubler, N. Neumann, N. Birbaumer, and J. Kaiser. A
brain computer interface tool to assess cognitive functions in completely paralyzed
patients with amyotrophic lateral sclerosis. Clinical neurophysiology, 119(10):2214–
2223, 2008.
[88] R.I. Jahiel and M.J. Scherer. Initial steps towards a theory and praxis of person-
environment interaction in disability. Disability & Rehabilitation, 32(17):1467–
1474, 2010.
[89] J.H. Jensen. Feature extraction for music information retrieval. 2010.
[90] F.F. Jobsis. Noninvasive, infrared monitoring of cerebral and myocardial oxygen
sufficiency and circulatory parameters. Science, 198(4323):1264, 1977.
[91] P.N. Juslin. From mimesis to catharsis: expression, perception, and induction of
emotion in music. Musical communication, pages 85–115, 2005.
[92] P.N. Juslin and D. Vastfjall. Emotional responses to music: The need to consider
underlying mechanisms. Behavioral and Brain Sciences, 31(5):559–575, 2008.
[93] J. Kaiser, A. Kubler, T. Hinterberger, N. Neumann, and N. Birbaumer. A non-
invasive communication device for the paralyzed. Minimally Invasive Neurosurgery,
45(1):19–23, 2002.
[94] A.A. Karim, T. Hinterberger, J. Richter, J. Mellinger, N. Neumann, H. Flor,
A. Kubler, and N. Birbaumer. Neural Internet: web surfing with brain potentials
for the completely paralyzed. Neurorehabilitation and Neural Repair, 20(4):508,
2006.
[95] L. Kauhanen, P. Jylanki, J. Lehtonen, P. Rantanen, H. Alaranta, and M. Sams.
EEG-based brain-computer interface for tetraplegics. Computational Intelligence
and Neuroscience, 2007:1, 2007.
[96] D. Keltner and J.J. Gross. Functional accounts of emotions. Cognition and Emo-
tion, 13(5):467–480, 1999.
[97] I.K. Keme-Ebi and A.A. Asindi. Locked-in syndrome in a Nigerian male with multiple sclerosis: a case report and literature review. Pan African Medical Journal, 1(4):10pp, 2008.
[98] P.R. Kennedy, R.A.E. Bakay, M.M. Moore, K. Adams, and J. Goldwaithe. Direct
control of a computer from the human central nervous system. IEEE Transactions
on Rehabilitation Engineering, 8(2):198–202, 2000.
[99] S. Khalfa, D. Schon, J.L. Anton, and C. Liegeois-Chauvel. Brain regions involved in
the recognition of happiness and sadness in music. Neuroreport, 16(18):1981–1984,
2005.
[100] J. Kim and E. Andre. Emotion recognition based on physiological changes in
music listening. Pattern Analysis and Machine Intelligence, IEEE Transactions
on, 30(12):2067–2083, 2008.
[101] J.M. Kim, K. Arakawa, K.T. Benson, and D.K. Fox. Pulse oximetry and circu-
latory kinetics associated with pulse volume amplitude measured by photoelectric
plethysmography. Anesthesia & Analgesia, 65(12):1333–1339, 1986.
[102] K.H. Kim, SW Bang, and SR Kim. Emotion recognition system using short-term
monitoring of physiological signals. Medical and biological engineering and comput-
ing, 42(3):419–427, 2004.
[103] S.P. Kim, J.D. Simeral, L.R. Hochberg, J.P. Donoghue, and M.J. Black. Neural
control of cursor velocity in humans with tetraplegia. Journal of neural engineering,
5:455–476, 2008.
[104] S.P. Kim, JD Simeral, LR Hochberg, JP Donoghue, GM Friehs, and MJ Black.
Multi-state decoding of point-and-click control signals from motor cortical activity
in a human with tetraplegia. In Neural Engineering, 2007. CNE’07. 3rd Interna-
tional IEEE/EMBS Conference on, pages 486–489, 2007.
[105] S. Kingsnorth, S. Blain, and P. McKeever. Physiological and emotional responses
of disabled children to therapeutic clowns: A pilot study. Evidence-Based Comple-
mentary and Alternative Medicine, 2011, 2011.
[106] S. Koelsch. Investigating emotion with music. Annals of the New York Academy
of Sciences, 1060(1):412–418, 2005.
[107] G. Krausz, R. Scherer, G. Korisek, and G. Pfurtscheller. Critical Decision-Speed
and Information Transfer in the Graz Brain–Computer Interface. Applied psy-
chophysiology and biofeedback, 28(3):233–240, 2003.
[108] S.D. Kreibig. Autonomic nervous system activity in emotion: A review. Biological
psychology, 84(3):394–421, 2010.
[109] G. Kreutz, U. Ott, D. Teichmann, P. Osawa, and D. Vaitl. Using music to induce
emotions: Influences of musical preference and absorption. Psychology of music,
36(1):101, 2008.
[110] C.L. Krumhansl. Cognitive foundations of musical pitch. Oxford Psychology Series, No. 17. Oxford University Press, 1990.
[111] C.L. Krumhansl. An exploratory study of musical emotions and psychophysiology.
Canadian Journal of Experimental Psychology/Revue canadienne de psychologie
experimentale, 51(4):336, 1997.
[112] A. Kubler and N. Birbaumer. Brain computer interfaces and communication in
paralysis: Extinction of goal directed thinking in completely paralysed patients?
Clinical neurophysiology, 119(11):2658–2666, 2008.
[113] A. Kubler, A. Furdea, S. Halder, E.M. Hammer, F. Nijboer, and B. Kotchoubey.
A Brain–Computer Interface Controlled Auditory Event-Related Potential (P300)
Spelling System for Locked-In Patients. Annals of the New York Academy of Sci-
ences, 1157(Disorders of Consciousness):90–100, 2009.
[114] A. Kubler, B. Kotchoubey, T. Hinterberger, N. Ghanayim, J. Perelmouter,
M. Schauer, C. Fritsch, E. Taub, and N. Birbaumer. The thought translation
device: a neurophysiological approach to communication in total motor paralysis.
Experimental Brain Research, 124(2):223–232, 1999.
[115] A. Kubler, N. Neumann, J. Kaiser, B. Kotchoubey, T. Hinterberger, and NP Bir-
baumer. Brain-computer communication: self-regulation of slow cortical poten-
tials for verbal communication. Archives of physical medicine and rehabilitation,
82(11):1533, 2001.
[116] A. Kubler, N. Neumann, B. Wilhelm, T. Hinterberger, and N. Birbaumer.
Predictability of brain-computer communication. Journal of Psychophysiology,
18(2):121–129, 2004.
[117] A. Kubler, F. Nijboer, J. Mellinger, TM Vaughan, H. Pawelzik, G. Schalk, DJ Mc-
Farland, N. Birbaumer, and JR Wolpaw. Patients with ALS can use sensorimotor
rhythms to operate a brain-computer interface. Neurology, 64(10):1775, 2005.
[118] H. Kuck, M. Grossbach, M. Bangert, and E. Altenmuller. Brain processing of meter
and rhythm in music. Annals of the New York Academy of Sciences, 999(1):244–
253, 2003.
[119] W.N. Kuhlman. EEG feedback training: enhancement of somatosensory cortical
activity. Electroencephalography and clinical neurophysiology, 45(2):290–294, 1978.
[120] L. Kuncheva, T. Christy, I. Pierce, and S. Mansoor. Multi-modal biometric emotion
recognition using classifier ensembles. Modern Approaches in Applied Intelligence,
pages 317–326, 2011.
[121] L.I. Kuncheva. Combining Pattern Classifiers: Methods and Algorithms. Wiley-Interscience, 2004.
[122] W.J. Lammers and P. Badia. Habituation of P300 to target stimuli. Physiology &
behavior, 45(3):595–601, 1989.
[123] P.J. Lang and M.M. Bradley. Emotion and the motivational brain. Biological
Psychology, 84(3):437–450, 2010.
[124] O. Lartillot, P. Toiviainen, and T. Eerola. A MATLAB toolbox for music information retrieval. Data Analysis, Machine Learning and Applications, pages 261–268, 2008.
[125] R.S. Lazarus. Emotion and adaptation. Oxford University Press, 1991.
[126] J.E. LeDoux. Emotion circuits in the brain. The Science of Mental Health: Fear
and anxiety, page 259, 2001.
[127] R. Leeb, D. Friedman, G.R. Muller-Putz, R. Scherer, M. Slater, and
G. Pfurtscheller. Self-Paced (Asynchronous) BCI Control of a Wheelchair in Virtual
Environments: A Case Study with a Tetraplegic. Computational Intelligence and
Neuroscience, 2007:79642, 2007.
[128] B Leung and T Chau. A multiple camera tongue switch for a child with severe spas-
tic quadriplegic cerebral palsy. Disability & Rehabilitation: Assistive Technology,
5(1):58–68, 2010.
[129] R.W. Levenson. Autonomic nervous system differences among emotions. Psycho-
logical science, 3(1):23–27, 1992.
[130] L. Ljung. System Identification: Theory for the User. Prentice Hall, 1999.
[131] S.G. Mallat. A wavelet tour of signal processing. Academic Press, San Diego, CA, 1999.
[132] K. Marumo, R. Takizawa, Y. Kawakubo, T. Onitsuka, and K. Kasai. Gender
difference in right lateral prefrontal hemodynamic response while viewing fearful
faces: A multi-channel near-infrared spectroscopy study. Neuroscience research,
63(2):89–94, 2009.
[133] SG Mason and GE Birch. A general framework for brain-computer interface design.
IEEE Transactions on Neural Systems and Rehabilitation Engineering, 11(1):70–
85, 2003.
[134] K. Matsuo, T. Kato, K. Taneichi, A. Matsumoto, T. Ohtani, T. Hamamoto, H. Ya-
masue, Y. Sakano, T. Sasaki, M. Sadamatsu, et al. Activation of the prefrontal
cortex to trauma-related stimuli measured by near-infrared spectroscopy in post-
traumatic stress disorder due to terrorism. Psychophysiology, 40(4):492–500, 2003.
[135] D.J. McFarland, D.J. Krusienski, W.A. Sarnacki, and J.R. Wolpaw. Emulation of
computer mouse control with a noninvasive brain-computer interface. Journal of
neural engineering, 5(2):101, 2008.
[136] J.H. Meek, C.E. Elwell, M.J. Khan, J. Romaya, J.S. Wyatt, D.T. Delpy, and S. Zeki. Regional changes in cerebral haemodynamics as a result of a visual stimulus measured by near infrared spectroscopy. Proceedings of the Royal Society of London. Series B: Biological Sciences, 261(1362):351, 1995.
[137] N. Memarian, A.N. Venetsanopoulos, and T. Chau. Infrared thermography as an
access pathway for individuals with severe motor impairments. Journal of Neuro-
Engineering and Rehabilitation, 6(1):11, 2009.
[138] L.B. Meyer. Emotion and meaning in music. University of Chicago Press, 1956.
[139] J.R. Millan. Adaptive brain interfaces. Communications of the ACM, 46(3):74–80,
2003.
[140] J.R. Millan, J. Mourino, M. Franze, F. Cincotti, M. Varsta, J. Heikkonen, and
F. Babiloni. A local neural classifier for the recognition of EEG patterns associated
to mental tasks. IEEE Transactions on Neural Networks, 13(3), 2002.
[141] M.T. Mitterschiffthaler, C.H.Y. Fu, J.A. Dalton, C.M. Andrew, and S.C.R. Williams. A functional MRI study of happy and sad affective states induced by classical music. Human Brain Mapping, 28(11):1150–1162, 2007.
[142] S. Moghimi, A. Kushki, A.M. Guerguerian, and T. Chau. Characterizing emo-
tional response to music in the prefrontal cortex using near infrared spectroscopy.
Neuroscience Letters, 2012.
[143] S. Moghimi, A. Kushki, A.M. Guerguerian, and T. Chau. A review of EEG-based brain-computer interfaces as access pathways for individuals with severe disabilities. Assistive Technology: The Official Journal of RESNA, to appear (2012).
[144] S. Moghimi, A. Kushki, S. Power, A.M. Guerguerian, and T. Chau. Automatic
detection of a prefrontal cortical response to emotionally rated music using multi-
channel near-infrared spectroscopy. Journal of Neural Engineering, 9(2):026022,
2012.
[145] ST Morgan, JC Hansen, and SA Hillyard. Selective attention to stimulus location
modulates the steady-state visual evoked potential. Proceedings of the National
Academy of Sciences of the United States of America, 93(10):4770, 1996.
[146] J.D. Morris. SAM: the Self-Assessment Manikin. An efficient cross-cultural mea-
surement of emotional response. Journal of Advertising Research, 35(6), 1995.
[147] DW Mulder, LT Kurland, KP Offord, and CM Beard. Familial adult motor neuron
disease: amyotrophic lateral sclerosis. Neurology, 36(4):511, 1986.
[148] G.R. Muller, C. Neuper, and G. Pfurtscheller. Implementation of a telemonitoring
system for the control of an EEG-based brain-computer interface. IEEE Transac-
tions on Neural Systems and Rehabilitation Engineering, 11(1):54–59, 2003.
[149] G.R. Muller-Putz, R. Scherer, C. Brunner, R. Leeb, and G. Pfurtscheller. Better than random? A closer look on BCI results. International Journal of Bioelectromagnetism, 10(1):52–55, 2008.
[150] K.J. Murphy and J.A. Brunberg. Adult claustrophobia, anxiety and sedation in MRI. Magnetic Resonance Imaging, 15(1):51–54, 1997.
[151] M. Naito, Y. Michioka, K. Ozawa, Y. Ito, M. Kiguchi, and T. Kanazawa. A communication means for totally locked-in ALS patients based on changes in cerebral blood volume measured with near-infrared light. IEICE Transactions on Information and Systems, 90(7):1028–1037, 2007.
[152] Z. Nenadic and J.W. Burdick. Spike detection using the continuous wavelet trans-
form. Biomedical Engineering, IEEE Transactions on, 52(1):74–87, 2005.
[153] N. Neumann and N. Birbaumer. Predictors of successful self control during
brain-computer communication. Journal of Neurology, Neurosurgery & Psychiatry,
74(8):1117, 2003.
[154] N. Neumann and A. Kubler. Training locked-in patients: A challenge for the use
of brain-computer interfaces. IEEE Transactions on Neural Systems and Rehabili-
tation Engineering, 11(2):169–172, 2003.
[155] C. Neuper, GR Muller, A. Kubler, N. Birbaumer, and G. Pfurtscheller. Clinical
application of an EEG-based brain-computer interface: a case study in a patient
with severe motor impairment. Clinical Neurophysiology, 114(3):399–409, 2003.
[156] B.R. Nhan and T. Chau. Classifying affective states using thermal infrared imaging
of the human face. Biomedical Engineering, IEEE Transactions on, 57(4):979–987,
2010.
[157] E. Niedermeyer and F.H.L. Da Silva. Electroencephalography: basic principles,
clinical applications, and related fields. Lippincott Williams & Wilkins, 2005.
[158] F. Nijboer, SP Carmien, E. Leon, FO Morin, RA Koene, and U. Hoffmann. Affec-
tive brain-computer interfaces: Psychophysiological markers of emotion in healthy
persons and in persons with amyotrophic lateral sclerosis. In Affective Comput-
ing and Intelligent Interaction and Workshops, 2009. ACII 2009. 3rd International
Conference on, pages 1–11. IEEE, 2009.
[159] F. Nijboer, EW Sellers, J. Mellinger, MA Jordan, T. Matuz, A. Furdea, S. Halder,
U. Mochty, DJ Krusienski, TM Vaughan, et al. A P300-based brain-computer
interface for people with amyotrophic lateral sclerosis. Clinical neurophysiology,
119(8):1909–1916, 2008.
[160] K. Oatley, D. Keltner, and J.M. Jenkins. Understanding emotions. Wiley-
Blackwell, 2006.
[161] H. Obrig, C. Hirth, JG Junge-Hulsing, C. Doge, T. Wolf, U. Dirnagl, and A. Vill-
ringer. Cerebral oxygenation changes in response to motor stimulation. Journal of
Applied Physiology, 81(3):1174, 1996.
[162] F. Ortiz-Corredor, J.J. Silvestre-Avendano, and A. Izquierdo-Bello. Locked-in state mimicking cerebral death in a child with Guillain-Barré syndrome. Revista de Neurologia, 44(10):636–638, 2007.
[163] K.J. Pallesen, E. Brattico, C. Bailey, A. Korvenoja, J. Koivisto, A. Gjedde, and
S. Carlson. Emotion processing of major, minor, and dissonant chords. Annals of
the New York Academy of Sciences, 1060(1):450–453, 2005.
[164] J. Panksepp and G. Bernatzky. Emotional sounds and the brain: the neuro-affective
foundations of musical appreciation. Behavioural Processes, 60(2):133–155, 2002.
[165] M.A. Pastor, J. Artieda, J. Arbizu, M. Valencia, and J.C. Masdeu. Human cerebral
activation during steady-state visual-evoked responses. Journal of Neuroscience,
23(37):11621, 2003.
[166] J. Perelmouter and N. Birbaumer. A binary spelling interface with random errors.
IEEE Transactions on Rehabilitation Engineering, 8(2):227–232, 2000.
[167] I. Peretz, L. Gagnon, and B. Bouchard. Music and emotion: perceptual deter-
minants, immediacy, and isolation after brain damage. Cognition, 68(2):111–141,
1998.
[168] P.C. Petrantonakis and L.J. Hadjileontiadis. Emotion recognition from EEG using higher order crossings. Information Technology in Biomedicine, IEEE Transactions on, 14(2):186–197, 2010.
[169] KV Petrides and A. Furnham. Trait emotional intelligence: Behavioural validation
in two studies of emotion recognition and reactivity to mood induction. European
Journal of Personality, 17(1):39–57, 2003.
[170] G. Pfurtscheller, C. Neuper, C. Guger, W. Harkam, H. Ramoser, A. Schlogl, B. Obermaier, M. Pregenzer, et al. Current trends in Graz brain-computer interface (BCI) research. IEEE Transactions on Rehabilitation Engineering, 8(2):216–219, 2000.
[171] R.W. Picard. Affective computing. The MIT press, 2000.
[172] R.W. Picard, E. Vyzas, and J. Healey. Toward machine emotional intelligence:
Analysis of affective physiological state. Pattern Analysis and Machine Intelligence,
IEEE Transactions on, 23(10):1175–1191, 2001.
[173] F. Piccione, F. Giorgi, P. Tonin, K. Priftis, S. Giove, S. Silvoni, G. Palmas, and
F. Beverina. P300-based brain computer interface: reliability and performance in
healthy and paralysed participants. Clinical neurophysiology, 117(3):531–537, 2006.
[174] T.W. Picton. The P300 wave of the human event-related potential. Journal of
clinical neurophysiology, 9(4):456, 1992.
[175] GD Pinna and R. Maestri. Reliability of transfer function estimates in cardio-
vascular variability analysis. Medical and Biological Engineering and Computing,
39(3):338–347, 2001.
[176] R. Plomp and W.J.M. Levelt. Tonal consonance and critical bandwidth. The
journal of the Acoustical Society of America, 38(4):548–560, 1965.
[177] S. Power, T. Falk, and T. Chau. Classification of prefrontal activity due to mental arithmetic and music imagery using hidden Markov models and frequency domain near-infrared spectroscopy. Journal of Neural Engineering, 7(2):026002, 2010.
[178] S. Power, A. Kushki, and T. Chau. Toward a 3-state system-paced NIRS-BCI:
automatic discrimination of mental arithmetic, music imagery from the no-control
state. under review at Journal of Neural Engineering, 2011.
[179] S.D. Power, T.H. Falk, and T. Chau. Classification of prefrontal activity due to mental arithmetic and music imagery using hidden Markov models and frequency domain near-infrared spectroscopy. Journal of Neural Engineering, 7:026002, 2010.
[180] S.D. Power, A. Kushki, and T. Chau. Towards a system-paced near-infrared spec-
troscopy brain–computer interface: differentiating prefrontal activity due to mental
arithmetic and mental singing from the no-control state. Journal of Neural Engi-
neering, 8:066004, 2011.
[181] W.S. Pritchard. Psychophysiology of P300. Psychological Bulletin, 89(3):506–540,
1981.
[182] V. Rajagopalan and A. Ray. Symbolic time series analysis via wavelet-based par-
titioning. Signal Processing, 86(11):3309–3320, 2006.
[183] C. Ranganath and G. Rainer. Neural mechanisms for detecting and remembering
novel events. Nature Reviews Neuroscience, 4(3):193–202, 2003.
[184] P. Rani, C. Liu, N. Sarkar, and E. Vanman. An empirical study of machine learning
techniques for affect recognition in human–robot interaction. Pattern Analysis &
Applications, 9(1):58–69, 2006.
[185] S.J. Roberts and W.D. Penny. Real-time brain-computer interfacing: A preliminary
study using Bayesian learning. Medical and Biological Engineering and computing,
38(1):56–61, 2000.
[186] R.G. Robinson, K.L. Kubos, L.Y.N.B. Starr, K. Rao, and T.R. Price. Mood disor-
ders in stroke patients: importance of location of lesion. Brain, 107(1):81, 1984.
[187] E.T. Rolls. On The Brain and Emotion. Behavioral and Brain Sciences, 23(2):219–228, 2000.
[188] A. Roskies. Neuroethics for the new millenium. Neuron, 35(1):21, 2002.
[189] M.K. Rothbart and D. Derryberry. Development of individual differences in tem-
perament. Advances in developmental psychology, 1:37–86, 1981.
[190] J.A. Russell. A circumplex model of affect. Journal of personality and social
psychology, 39(6):1161, 1980.
[191] C.L. Rusting. Personality, mood, and cognitive processing of emotional informa-
tion: three conceptual frameworks. Psychological bulletin, 124(2):165, 1998.
[192] D.L. Sackett. Rules of evidence and clinical recommendations on the use of an-
tithrombotic agents. Chest, 95(2 Supplement):2S, 1989.
[193] S. Samson. Neuropsychological studies of musical timbre. Annals of the New York
Academy of Sciences, 999(1):144–151, 2003.
[194] G Santhanam, SI Ryu, BM Yu, A Afshar, and KV Shenoy. A high-performance
brain-computer interface. Nature, 442(7099):195–198, 2006.
[195] Ichiro Sase, Hideo Eda, Akitoshi Seiyama, Hiroki C Tanabe, Akira Takatsuki, and
Toshio Yanagida. Multi-channel optical mapping: Investigation of depth informa-
tion. In Proc SPIE, volume 4250, pages 29–36, 2001.
[196] Hiroki Sato, Masashi Kiguchi, Fumio Kawaguchi, Atsushi Maki, et al. Practical-
ity of wavelength selection to improve signal-to-noise ratio in near-infrared spec-
troscopy. Neuroimage, 21(4):1554–1562, 2004.
[197] J.P. Saul, RD Berger, P. Albrecht, SP Stein, M.H. Chen, and R.J. Cohen. Transfer
function analysis of the circulation: unique insights into cardiovascular regulation.
American Journal of Physiology-Heart and Circulatory Physiology, 261(4):H1231–
H1245, 1991.
[198] J.P. Saul, R.D. Berger, M.H. Chen, and R.J. Cohen. Transfer function analysis of autonomic regulation. II. Respiratory sinus arrhythmia. American Journal of Physiology-Heart and Circulatory Physiology, 256(1):H153–H161, 1989.
[199] M.J. Scherer. The change in emphasis from people to person: introduction to
the special issue on Assistive Technology. Disability & Rehabilitation, 24(1-3):1–4,
2002.
[200] L.A. Schmidt and L.J. Trainor. Frontal brain electrical activity (EEG) distinguishes valence and intensity of musical emotions. Cognition & Emotion, 15(4):487–500, 2001.
[201] W.W. Seeley, V. Menon, A.F. Schatzberg, J. Keller, G.H. Glover, H. Kenna, A.L.
Reiss, and M.D. Greicius. Dissociable intrinsic connectivity networks for salience
processing and executive control. The Journal of neuroscience, 27(9):2349–2356,
2007.
[202] E. Sellers, G. Schalk, and E. Donchin. The P300 as a typing tool: tests of brain-computer interface with an ALS patient. Psychophysiology, 40:77, 2003.
[203] E.W. Sellers and E. Donchin. A P300-based brain-computer interface: initial tests
by ALS patients. Clinical Neurophysiology, 117(3):538–548, 2006.
[204] E.W. Sellers, A. Kubler, and E. Donchin. Brain–computer interface research at
the University of South Florida cognitive psychophysiology laboratory: the P300
speller. Biomed. Eng, 51(4):647–656, 2004.
[205] W.A. Sethares. Tuning, timbre, spectrum, scale. 2004.
[206] Y.I. Sheline. 3D MRI studies of neuroanatomic changes in unipolar major depression: the role of stress and medical comorbidity. Biological Psychiatry, 48(8):791–800, 2000.
[207] D.V. Sherman and D. Ely. Biochemical and galvanic skin responses to music stimuli by college students in biology and music. Perceptual and Motor Skills, 74(3c):1079–1090, 1992.
[208] A. Siegel and H. Edinger. Neural control of aggression and rage behavior. Handbook
of the Hypothalamus, 3(Part B), 1981.
[209] J.R. Simpson, W.C. Drevets, A.Z. Snyder, D.A. Gusnard, and M.E. Raichle.
Emotion-induced changes in human medial prefrontal cortex: Ii. during antici-
patory anxiety. Proceedings of the National Academy of Sciences, 98(2):688–693,
2001.
[210] R. Sinha, W.R. Lovallo, and O.A. Parsons. Cardiovascular differentiation of emo-
tions. Psychosomatic Medicine, 54(4):422, 1992.
[211] E. Smith and M. Delargy. Locked-in syndrome. British Medical Journal,
330(7488):406, 2005.
[212] E.M. Sokhadze. Effects of music on the recovery of autonomic and electrocortical
activity after stress induced by aversive visual stimuli. Applied psychophysiology
and biofeedback, 32(1):31–50, 2007.
[213] M.P. Spackman, M. Fujiki, B. Brinton, D. Nelson, and J. Allen. The ability of chil-
dren with language impairment to recognize emotion conveyed by facial expression
and music. Communication Disorders Quarterly, 26(3):131, 2005.
[214] D. Sridharan, D.J. Levitin, and V. Menon. A critical role for the right fronto-
insular cortex in switching between central-executive and default-mode networks.
Proceedings of the National Academy of Sciences, 105(34):12569–12574, 2008.
[215] N. Steinbeis, S. Koelsch, and J.A. Sloboda. Emotional processing of harmonic expectancy violations. Annals of the New York Academy of Sciences, 1060(1):457–461, 2005.
[216] M.W. Sullivan et al. Contingency, means-end skills, and the use of technology in infant intervention. Infants & Young Children, 5(4):58, 1993.
[217] K. Tai, S. Blain, and T. Chau. A review of emerging access technologies for indi-
viduals with severe motor impairments. Assistive technology: the official journal
of RESNA, 20(4):204, 2008.
[218] K. Tai and T. Chau. Single-trial classification of NIRS signals during emotional in-
duction tasks: towards a corporeal machine interface. Journal of NeuroEngineering
and Rehabilitation, 6(1):39, 2009.
[219] M. Tanida, M. Katsuyama, and K. Sakatani. Relation between mental stress-
induced prefrontal cortex activity and skin conditions: A near-infrared spectroscopy
study. Brain research, 1184:210–216, 2007.
[220] J.J. Tecce. Contingent negative variation (CNV) and psychological processes in
man. Psychological Bulletin, 77(2):73–108, 1972.
[221] M.M. Ter-Pogossian, M.E. Raichle, and B.E. Sobel. Positron-emission tomography. Scientific American, 243(4), 1980.
[222] J.F. Thayer and R.D. Lane. A model of neurovisceral integration in emotion reg-
ulation and dysregulation. Journal of affective disorders, 61(3):201–216, 2000.
[223] M. Toyokura. Waveform and habituation of sympathetic skin response. Electroen-
cephalography and Clinical Neurophysiology/Electromyography and Motor Control,
109(2):178–183, 1998.
[224] L. Trejo, K. Knuth, R. Prado, R. Rosipal, K. Kubitz, R. Kochavi, B. Matthews, and Y. Zhang. EEG-based estimation of mental fatigue: Convergent evidence for a three-state model. Foundations of Augmented Cognition, pages 201–211, 2007.
[225] E.Z. Tronick. Emotions and emotional communication in infants. American psy-
chologist, 44(2):112, 1989.
[226] T.M. Vaughan, D.J. McFarland, G. Schalk, W.A. Sarnacki, D.J. Krusienski, E.W. Sellers, and J.R. Wolpaw. The Wadsworth BCI research and development program: at home with BCI. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(2):229–233, 2006.
[227] M Velliste, S Perel, MC Spalding, AS Whitford, and AB Schwartz. Cortical control
of a prosthetic arm for self-feeding. Nature, 453(7198):1098–1101, 2008.
[228] A. Villringer, J. Planck, C. Hock, L. Schleinkofer, and U. Dirnagl. Near infrared
spectroscopy (NIRS): a new tool to study hemodynamic changes during activation
of brain function in human adults. Neuroscience Letters, 154(1-2):101–104, 1993.
[229] R.F. Voss and J. Clarke. "1/f noise" in music: Music from 1/f noise. J. Acoust. Soc. Am., 63(1):258, 1978.
[230] Y. Wang, R. Wang, X. Gao, B. Hong, and S. Gao. A practical VEP-based brain-
computer interface. IEEE Transactions on Neural Systems and Rehabilitation En-
gineering, 14(2):234–240, 2006.
[231] C.M. Warrier and R.J. Zatorre. Right temporal cortex is critical for utilization of
melodic contextual cues in a pitch constancy task. Brain, 127(7):1616–1625, 2004.
[232] L. Wedin. A multidimensional study of perceptual-emotional qualities in music.
Scandinavian Journal of Psychology, 13(1):241–257, 1972.
[233] N. Weiskopf, K. Mathiak, S.W. Bock, F. Scharnowski, R. Veit, W. Grodd, R. Goebel, and N. Birbaumer. Principles of a brain-computer interface (BCI) based on real-time functional magnetic resonance imaging (fMRI). Biomedical Engineering, IEEE Transactions on, 51(6):966–970, 2004.
[234] N. Weiskopf, F. Scharnowski, R. Veit, R. Goebel, N. Birbaumer, and K. Mathiak.
Self-regulation of local brain activity using real-time functional magnetic resonance
imaging (fMRI). Journal of Physiology-Paris, 98(4-6):357–373, 2004.
[235] R.E. Wheeler, R.J. Davidson, and A.J. Tomarken. Frontal brain asymmetry and
emotional reactivity: A biological substrate of affective style. Psychophysiology,
30(1):82–89, 1993.
[236] A. Wilson. Augmentative communication in practice: An introduction (2nd ed.).
University of Edinburgh, Edinburgh, Scotland, 1998.
[237] J.R. Wolpaw, N. Birbaumer, D.J. McFarland, G. Pfurtscheller, and T.M. Vaughan.
Brain-computer interfaces for communication and control. Clinical neurophysiology,
113(6):767–791, 2002.
[238] J.R. Wolpaw and D.J. McFarland. Control of a two-dimensional movement signal
by a noninvasive brain-computer interface in humans. Proceedings of the National
Academy of Sciences of the United States of America, 101(51):17849, 2004.
[239] J.R. Wolpaw, D.J. McFarland, T.M. Vaughan, and G. Schalk. The Wadsworth
Center brain-computer interface (BCI) research and development program. IEEE
Transactions on Neural Systems and Rehabilitation Engineering, 11(2):204–207,
2003.
[240] W.M. Wundt and C.H. Judd. Outlines of psychology. W. Engelmann, 1907.
[241] H. Yang, Z. Zhou, Y. Liu, Z. Ruan, H. Gong, Q. Luo, and Z. Lu. Gender difference
in hemodynamic responses of prefrontal area to emotional stress by near-infrared
spectroscopy. Behavioural brain research, 178(1):172–176, 2007.
[242] T.O. Zander and C. Kothe. Towards passive brain–computer interfaces: apply-
ing brain–computer interface technology to human–machine systems in general.
Journal of Neural Engineering, 8(2):025005, 2011.
[243] R.J. Zatorre. Discrimination and recognition of tonal melodies after unilateral
cerebral excisions. Neuropsychologia, 23(1):31–41, 1985.
[244] M.R. Zentner and J. Kagan. Infants’ perception of consonance and dissonance in
music. Infant Behavior and Development, 21(3):483–492, 1998.