
Title: Classification of Music-Induced Emotions using Psycho-physiological Data
Author(s): Cabredo, Rafael Angsico
Text Version: ETD
URL: http://hdl.handle.net/11094/51495


Classification of Music-Induced Emotions using Psycho-physiological Data

January 2013

Rafael Angsico CABREDO



Dissertation

Classification of Music-Induced Emotions using Psycho-physiological Data

Submitted to

Graduate School of Information Science and Technology
Osaka University

January 2013

Rafael Angsico CABREDO


Research Output

Journal Publications

Rafael Cabredo, Roberto Legaspi, Paul Salvador Inventado, Masayuki Numao. Discovering Emotion Inducing Music Features using EEG Signals, Journal of Advanced Computational Intelligence and Intelligent Informatics. (Accepted for publication: January 7, 2013)

International Conference Papers

1. Rafael Cabredo, Roberto Legaspi, Paul Salvador Inventado, Masayuki Numao. An Emotion Model for Music Using Brain waves, In Proc. 13th International Society for Music Information Retrieval Conference (ISMIR), Portugal, pp. 265-270, 2012.

2. Rafael Cabredo, Roberto Legaspi, Paul Salvador Inventado, Masayuki Numao. EEG-based Music Emotion Recognition using Regression Analysis, In Proc. 3rd International Workshop on Empathic Computing. (to appear)

3. Rafael Cabredo, Roberto Legaspi, Paul Salvador Inventado, Masayuki Numao. Discovering Emotion Features in Symbolic Music, In Proc. 26th Annual Conference of the Japanese Society for Artificial Intelligence in CD-ROM, Yamaguchi, Japan, 2012.

4. Rafael Cabredo, Roberto Legaspi, Paul Salvador Inventado, Masayuki Numao. Finding Motifs in Psychophysiological Responses and Chord Sequences, Proc. in Information and Communications Technology: Theory and Practice of Computation, Springer, pp. 78-89, 2012.

5. Rafael Cabredo, Roberto Legaspi, Masayuki Numao. Identifying Emotion Segments in Music by Discovering Motifs in Physiological Data, In Proc. 12th International Society for Music Information Retrieval Conference (ISMIR), Miami, USA, pp. 753-758, 2011.


International Conference Papers (co-authored)

1. Paul Salvador Inventado, Roberto Legaspi, Rafael Cabredo, Masayuki Numao. Sidekick Retrospect: A Self-regulation Tool for Unsupervised Learning Environments, Proc. in Information and Communications Technology: Theory and Practice of Computation, Springer. (to appear)

2. Yu Yamano, Rafael Cabredo, Paul Salvador Inventado, Roberto Legaspi, Koichi Moriyama, Kenichi Fukui, Satoshi Kurihara, Masayuki Numao. Estimating Emotions on Music Based on Brainwave Analyses, Proc. 3rd International Workshop on Empathic Computing, Springer. (to appear)

3. Paul Salvador Inventado, Roberto Legaspi, Rafael Cabredo, Masayuki Numao. Modeling Affect and Intentions in Unsupervised Learning Environments, Proc. 3rd International Workshop on Empathic Computing, Springer. (to appear)

4. Anh Mai, Roberto Legaspi, Paul Salvador Inventado, Rafael Cabredo, Masayuki Numao. Users Sitting Postures to Infer User's Learning and Non-learning States, Proc. 3rd International Workshop on Empathic Computing, Springer. (to appear)

5. Jocelynn Cu, Rafael Cabredo, Roberto Legaspi, Merlin Teodosia Suarez. On Modelling Emotional Responses to Rhythm Features, Lecture Notes in Computer Science: Proc. PRICAI 2012 Trends in Artificial Intelligence, pp. 857-860, Springer, 2012.

6. Anh Mai, Roberto Legaspi, Paul Salvador Inventado, Rafael Cabredo, Masayuki Numao. A Model for Sitting Postures in Relation to Learning and Non-learning Behaviors, In Proc. 26th Annual Conference of the Japanese Society for Artificial Intelligence in CD-ROM, Yamaguchi, Japan, 2012.

7. Hal Gino Avisado, John Vincent Cocjin, Joshua Gaverza, Rafael Cabredo, Jocelynn Cu. Analysis of Music Timbre Features for the Construction of a User-specific Affect Model, Proc. in Information and Communications Technology: Theory and Practice of Computation, Springer, pp. 28-35, 2012.

8. Masayuki Numao, Rafael Cabredo, Danaipat Sodkomkham, Kazuya Maruo, Roberto Legaspi, Kenichi Fukui, Koichi Moriyama, Satoshi Kurihara. Extracting Time Series Motifs for Emotion and Behavior Modeling, Proc. 15th SANKEN International Symposium, Osaka, Japan, pp. 36-37, 2012.

9. Jocelynn Cu, Rafael Cabredo, Paul Salvador Inventado, Rhia Trogo, Merlin Teodosia Suarez, Roberto Legaspi. The TALA Empathic Space: Integrating Affect and Activity Recognition into a Smart Space, Proc. 3rd International Conference on Human-Centric Computing, Philippines, pp. 1-6, 2010.


Other Research Output

1. Rafael Cabredo, Roberto Legaspi, Paul Salvador Inventado, Masayuki Numao. Identifying Emotions in Music through Psychophysiological Sensors, Academic Research Symposium of De La Salle University and Osaka University, Manila, Philippines, Sep 26, 2012. (presentation)

2. Rafael Cabredo, Roberto Legaspi, Masayuki Numao. Comparing Effectiveness of Different Physiological Sensors for Music Segmentation, In Proc. 14th SANKEN International Symposium and the 9th SANKEN Nanotechnology Symposium, Otso, Japan, p. 77, Jan 25-26, 2011. (poster)

3. Hal Gino Avisado, John Vincent Cocjin, Joshua Gaverza, Rafael Cabredo, Jocelynn Cu. Analysis of Timbre Features for Construction of a User-specific Affect Model for Classifying Music, In Proc. 15th Joint Academic Research Symposium of De La Salle University and Osaka University, Manila, Philippines, Sep 29-30, 2010.

4. Juan Lorenzo Hagad, Roberto Legaspi, Rafael Cabredo, Merlin Teodosia Suarez, Masayuki Numao. Automatic Detection of Posture Congruence in Dyadic Interactions to Predict Rapport, In Proc. 15th Joint Academic Research Symposium of De La Salle University and Osaka University, Manila, Philippines, Sep 29-30, 2010.

5. Hal Gino Avisado, John Vincent Cocjin, Joshua Gaverza, Rafael Cabredo. Developing a User-specific Affect Model for Classifying Music Based on Timbral Content – An Analysis, In Proc. 10th Philippine Computing Science Congress, Davao, Philippines, Mar 5-6, 2010.

6. Rafael Cabredo. Modeling User Preferences from Emotion Elicited from Music for Use in an Ambient Intelligent Empathic Space, 1st Osaka University and De La Salle University Workshop on Empathic Computing (WEC 2009), Manila, Philippines, Sep 23, 2009.


Abstract

Listening to music can induce different emotions. Existing research on music and emotion has identified several music properties that cause these emotional reactions; however, an automated system that uses these findings has not been developed. This research aims to build an emotion model that can be used for automated music recommendation. This involves building a knowledge base of mappings between affective states (e.g., stressed and relaxed) and music features (e.g., rhythm, chord progression, harmony, instrumentation, etc.). Psycho-physiological responses are recorded while participants listen to music in order to measure their emotional response. These signals are analysed and mapped to various musical features of the songs listened to. Data mining techniques are used to analyse the raw physiological sensor data and convert it into continuous emotion labels used for annotating the music features extracted from MIDI files. Two separate supervised classification techniques use the labelled features to build the emotion models. The classifiers have a relative absolute error of 1.7%–37.2% when predicting the emotion labels using different parameters.


Acknowledgement

This dissertation would not have been possible without the guidance and support of so many people. I would like to express my deepest appreciation and thanks to my supervisor, Professor Masayuki Numao, whose mentorship and support allowed me to study in Japan and progress in my academic career. I am equally thankful to Dr. Roberto Legaspi, who spent time discussing critical points in my research and assisted me in my life in Japan. Your words of wisdom will not be forgotten.

I would also like to thank the members of the review committee, Professors Jun Tanida, Hiroshi Morita and Satoshi Kurihara, who reviewed this dissertation and provided valuable comments to achieve its final form.

I also thank other members of the Numao laboratory (Architecture for Intelligence), Professors Koichi Moriyama and Ken-ichi Fukui, who also provided comments and suggestions during laboratory meetings. Many thanks as well to the various students in the laboratory for the friendship and fond memories during our time together. A special thanks to the laboratory secretaries, Ms. Chiharu Wada and Ms. Misuzu Yuuki, for helping with various paperwork for conference trips and other administrative matters.

This research would not have been possible without the aid of different funding programs: Management Expenses Grants for National Universities Corporations through the Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan; the Global Centers of Excellence Program of MEXT; and KAKENHI 23300059.

Finally, my deepest gratitude to my family, colleagues and friends here in Japan and in the Philippines for the constant support and encouragement. Thank you for believing and for interceding on my behalf.

And to God, almighty Father, thank you for all the blessings I have received and for being with me during the lonely nights in the laboratory. All these I do for Your greater glory.


Contents

List of Figures
List of Tables

1 Introduction
  1.1 Background to the Research
  1.2 Research Objectives
  1.3 Significance of the Research
  1.4 Methodology

2 Theoretical Framework
  2.1 Emotion Theory
    2.1.1 What are Emotions?
    2.1.2 Emotions and Music
  2.2 Psycho-physiology of Emotions
    2.2.1 Emotion Recognition using Psycho-physiological Responses
    2.2.2 Physiological Metrics of Emotion

3 Detection of Emotion-Inducing Music Segments and Psycho-physiological Response
  3.1 Introduction
  3.2 Time Series Motif
  3.3 Research Framework
    3.3.1 Music Collection
    3.3.2 Mueen-Keogh Algorithm
    3.3.3 Identifying Frequent Chord Sequences
    3.3.4 Motif Analysis
  3.4 Data Collection
  3.5 Results
    3.5.1 Chord Progression Motifs
    3.5.2 Physiological Time Series Motifs
    3.5.3 Improving the Data Collection Methodology
  3.6 Summary

4 Emotion Classification using High-level Music Features and Emotion Annotations from EEG Data
  4.1 Introduction
  4.2 Research Framework
    4.2.1 High-level Music Features
    4.2.2 Emotion Annotations
    4.2.3 Machine Learning Task
  4.3 Data Collection Methodology
    4.3.1 Experimental set-up
    4.3.2 EEG Data and Emotion annotation
    4.3.3 Extracting Music Features
  4.4 Emotion Model
    4.4.1 Linear Regression
    4.4.2 C4.5
    4.4.3 Testing and Evaluation
  4.5 Results and Analysis
    4.5.1 Consistency of EEG Readings
    4.5.2 Influence of window length
    4.5.3 Important features used in C4.5
    4.5.4 Accuracy of Emotion labels
  4.6 Summary

5 Conclusion
  5.1 Summary
  5.2 Recommendations for future work

A MIDI Files for EEG Experiments
B Consent Form for Data Collection

Glossary
Acronyms


List of Figures

2.1 Model of emotional communication in music
3.1 Similar subsequences in a blood volume pulse time series
3.2 Research Framework
3.3 Representation of chord sequences as a time series
3.4 Setup for data collection
3.5 Comparison of motif distances between BVP and RR
4.1 Modified research framework
4.2 EEG electrode positions and electrodes used for ESAM
4.3 Sample EEG signals from subject B for a segment of the song Stand By Me that is measured with different stress values
4.4 Illustration of labelling music features with labels
4.5 Illustration of datasets for the subjects
4.6 Relative absolute error results of stress and relaxation models using all instances for subject B
4.7 Relative absolute error results of stress and relaxation models using C4.5
4.8 Relative absolute error results of stress and relaxation models using linear regression
4.9 Visual comparison of relaxation readings while subject B listens to song 49
4.10 Mean of emotion values for different window lengths
4.11 Standard deviation of emotion values for different window lengths


List of Tables

3.1 Mueen-Keogh Motif Discovery
3.2 Basic space of the tonic chord in the Key of C Major
3.3 Summary of music included for motif discovery
3.4 Mapping of chords of Yesterday to TPS Chord Distance
3.5 Sample of identified chord progressions
4.1 Distribution of features used for the instances
4.2 Performance measures for numeric prediction
4.3 Stress model results using C4.5 on all instances for subject B
4.4 Stress model results using linear regression on all instances for subject B
4.5 Relax model results using C4.5 on all instances for subject B
4.6 Relax model results using linear regression on all instances for subject B
4.7 Stress model results using linear regression for subject A
4.8 Stress model results using C4.5 for subject A
4.9 Relax model results using linear regression for subject A
4.10 Relax model results using C4.5 for subject A
4.11 Stress model averaged results using linear regression for subject B
4.12 Stress model averaged results using C4.5 for subject B
4.13 Relax model averaged results using linear regression for subject B
4.14 Relax model averaged results using C4.5 for subject B
4.15 Similarity between session data using Stress annotations
4.16 Similarity between session data using Relaxation annotations
4.17 Correlation between average similarity and manual annotations
4.18 Class sizes for Stress and Relaxation data after discretization
4.19 Distribution of features in decision trees (n = 60)
4.20 Feature distribution for the first 5 levels of C4.5 decision trees (n = 60)
4.21 Comparison of manual annotations and discretized ESAM annotations
4.22 Correlation of manual annotations
4.23 Correlation of annotations using ESAM


Chapter 1

Introduction

1.1 Background to the Research

Listening to music brings out different kinds of emotions. These reactions can be involuntary, differ from person to person, and are caused primarily by the musical content. Detecting emotion in music has been a subject of interest for researchers in various fields. Researchers have performed experiments to substantiate the hypothesis that music inherently carries emotional meaning [1, 2, 3].

More recent research focuses on identifying music features that are associated with affecting emotion or mood [4, 5, 6, 7]. For instance, it is generally accepted that in Western music a strong association between mode and valence exists: major and minor modes are associated with happiness and sadness, respectively [1, 8]. Livingstone et al. [9] also investigate music features and discuss how changing these features can affect the emotions the music elicits.

With a good understanding of how different music features affect emotions, it is possible to automatically classify and predict what kind of emotions a person will experience when presented with a song with similar features. A survey of music emotion research by Kim et al. [10] reports that the typical approach for classifying music by emotion is to build a ground truth of emotion labels through subjective tests. Afterwards, a machine learning technique is used to train a classifier to recognize emotions in the music using high-level or low-level music features and the ground truth emotion labels. High-level music features refer to information grounded on music theory that is obtained from precise pitch and timing information of individual notes in the music. Low-level music features are obtained from audio files (i.e. WAV or MP3 files) and describe acoustic features in the music, like the energy or the frequency content of a sound.

A common problem encountered by previous work is the limitation of the emotion annotation used for building the ground truth. It takes a lot of time and resources to annotate music with labels describing emotion, regardless of whether the emotions are induced or expressed by the music [11, 12]. Lin et al. [13] review various work on music emotion classification and utilize the vast amount of online social tags to improve emotion classification. However, a personalized emotion model for labelling music would still be desirable: music that is relaxing for some people may be stressful for others.

Songs are also usually annotated with only the most prominent emotion (i.e., one emotion label per song). Multilabel classification can be used to obtain richer emotion annotations, as done in [14]. These annotations, however, are still discrete emotion labels and can be ambiguous. In emotion theory, arousal and valence can be used to model emotions [15]. Arousal describes the physical activation of an emotion while valence describes its pleasantness. The words Happy, Joy, Exuberant, and Gleeful are all regarded as having positive arousal and valence, but it is not clear how similar these emotions are to each other. In addition, the degree of happiness varies from person to person.

In order to address these problems, some studies used continuous-valued emotion labels. One method for constructing such emotion annotations is to use physiological sensors to measure and possibly describe the emotion experienced while listening to music. One such work, by Kim and Andre [16], used psycho-physiological data to recognize emotion induced by music listening using a feature-based multiclass classification system. Peripheral physiological sensors and an electroencephalogram (EEG) were also used in work on developing a constructive adaptive user interface (CAUI), which can arrange [17, 18] and compose [19] music based on one's impressions of the music.

In this research, physiological sensors are used to measure how emotion changes throughout the music and to learn how high-level music features cause these changes. By developing a method for collecting and analysing continuously recorded physiological data, continuous-valued emotion annotations can be obtained for full-length music. This contributes to research in music emotion recognition, where most work deals with one-time, discrete-value labels (i.e., a single value between 1–5, or descriptive tags) for short music segments of 10–30 seconds. In addition, the work focuses on individual emotion reactions to music as opposed to building a generalized emotion model.

1.2 Research Objectives

This research intends to answer the question:

How can psycho-physiological data be used for modelling music-induced emotions?

The objective of this research is to improve current techniques for music emotion recognition by using psycho-physiological reactions to emotion-inducing music. Specifically, the following sub-problems must be answered:

1. How can a continuous-valued ground truth of emotion labels be properly collected?

Most existing research uses discrete emotion labels for music emotion recognition. The methodology for collecting continuous-valued ground truth labels has yet to be explored and refined to obtain a ground truth of acceptable quality. The guiding principles for drafting the data collection methodology are prepared after preliminary experiments. These principles should be supported by observations from previous research and have a theoretical foundation.

2. How should high-level music features be extracted and used for modelling emotions?

High-level music features obtained from symbolic music contain more meaningful information for modelling emotions than low-level music features [20]. Prior to performing the modelling task, the method for extracting high-level music features must be explored. Previous research only used music clips that are, on average, 30 seconds long. Obtaining music features from different parts of a song requires additional processing and research. It is hypothesized that each music feature has a different effect on certain emotions. Classification efficiency can be improved by reducing the features that need to be extracted to recognize certain emotions.

3. How can existing machine learning techniques be used for developing an emotion model?

Various supervised learning techniques exist that can be used to learn how music features affect a listener's emotion. Since the ground truth to be collected has continuous values, a regression approach would be appropriate. However, other methods should also be explored.

1.3 Significance of the Research

Music has become a ubiquitous form of entertainment. People listen to music in various situations: while travelling, doing sports, studying, or relaxing. As personal music collections grow and evolve, it becomes necessary to find an automated way to filter music for certain activities or to select music to fit a certain mood. This research proposes a method for analysing music features to discover patterns and develop a model useful for music recommendation.

Existing music recommendation systems rely on meta-data (e.g., artist name, album information, music genre, popularity, etc.) or descriptive tags for recommending music to users. Using meta-data can be very limiting when recommending music by emotion. An immediate solution would be to use tags (e.g., happy, noisy, relaxing music) to easily identify music. However, as discussed in [13], these tags can be ambiguous and incomplete.

In this research, a novel approach for identifying important musical features that can lead to automatic emotion annotation of music is proposed. The psycho-physiological responses of a subject listening to music are recorded and analysed to determine which parts of the music cause a certain emotional response. This emotion annotation is continuous-valued and exists for the entire piece of music, not just a music segment. This provides an opportunity to investigate how the emotions of the subject change over time. Finally, the ground truth of emotion annotations can be used as a dataset by other researchers in music emotion recognition and music recommendation.

1.4 Methodology

The activities undertaken for this research are described below:

• Review of Related Literature. Existing research on music emotion recognition, emotion modelling and emotion recognition using physiological sensors is studied, analysed and compared. Through this step, a concrete data collection methodology is established from previous research in the area.

• Materials and Data Collection. The physiological sensors to be used are selected during this phase. Using the appropriate data collection methodology, subjects are selected to participate in experiments for recording physiological responses while they listen to several emotion-inducing music pieces.

• Modelling and Testing. Different experiments on extracting features from the physiological data and from the music files are done. Machine learning techniques are used to build emotion models using these features.

• Evaluation. Each emotion model is evaluated using the ground truth. The performance of the models is measured to determine how different machine learning techniques fare.

• Documentation. This is essential for recording how the different stages of the research were performed. The final document contains sections discussing the theoretical basis of the research, details of the experiments done, discussion of results, conclusions and recommendations for future work. In addition, paper submissions to conferences and journals are prepared.


Chapter 2

Theoretical Framework

This chapter presents the background research that provides the foundation for this work. Key concepts in the areas of Music Psychology, Emotion Psychology, Psychophysiology, and Computer and Music Affective Computing are included.

2.1 Emotion Theory

2.1.1 What are Emotions?

One of the first tasks is to define emotions. The same question was raised by William James in his article [21]. James defines emotions as an internal physiological reaction combined with a reaction towards a stimulus. Carl Lange reinterpreted James' theory as the perception of physiological changes [22]. The James-Lange theory states that a person's physiological arousal instigates the experience of a specific emotion.

This theory was criticized by Walter B. Cannon and Philip Bard, who performed experiments on animals whose internal organs were separated from the nervous system. They observed that the animals could still display emotional expression and concluded that emotion is not only the perception of physiological changes, but that stimuli had to be evaluated independently of the physiological reactions of the body [23].

These fundamental ideas have led to more research on mood and emotions. A recent overview of emotion research, its definitions, and how emotion can be measured can be found in the review by Scherer [24].

A broadly accepted view on emotions is based on Plutchik's work [25]. His model of emotion is based on the idea that emotions have three components: 1. subjective (cognitive), 2. physiological, and 3. behavioral (facial expressions or gestures). All components can be measured during emotional perception of stimuli, such as images or sounds. Although emotions have been studied since the 19th century, there has been little research in either psychology or musicology with regard to emotion and music. Notable work includes that of Lazarus [26], Gabrielsson & Lindstrom [27], and Juslin & Zentner [28].

2.1.2 Emotions and Music

Most researchers agree that music can induce emotions and alter emotional states. Scherer et al. [29] presented a model that illustrates how emotions are communicated through music, as shown in Fig. 2.1.

The composer of the music has an idea of what emotion is to be expressed. This is interpreted by the musician through his or her performance. A listener of this music will encode the musician's interpretation as an internal representation of emotion, which is verbalized or measured using biometric sensors. Gabrielsson and Juslin [30] used this model in their work and found that participants are able to decode the interpreter's intended expression. The accuracy of the results depended on the type of emotion being expressed.

Figure 2.1: Model of emotional communication in music from [29]


It is unclear whether emotions are only perceived or really felt. Kivy distinguishes between the cognitivist position and the emotivist position in [31]. The cognitivist position asserts that emotions are simply recognized from music. The emotivist position, on the other hand, asserts that emotions are induced by music.

Sloboda and Juslin [32] also distinguished between extrinsic and intrinsic emotions. Intrinsic emotions are elicited directly by structural parameters of the music (e.g. syncopations or appoggiaturas) [33, 34]. Extrinsic emotions relate more to surface features of the music (e.g. loudness and tempo).

The emotion experienced from music is also influenced by personal factors (e.g., cultural and educational background), musical preferences and mood [4, 35, 36]. A review of the literature concerning the relationship between musical features and perceived emotions can be found in [4].

2.2 Psycho-physiology of Emotions

The field of psycho-physiology of emotion studies the relationship between the psychological level (i.e., the conscious and subjective emotional experience of individuals) and the physiological level (i.e., the measured signals associated with emotional activity). Both components are multi-faceted and complex [37]. This complexity implies that a complete and accurate measurement of affective states must register responses from different physiologic domains, since there is no single physiologic parameter that is unambiguously linked to a specific emotional state [38].

In [39], it was shown that the autonomic nervous system (ANS) is viewed by many researchers as a major component of the emotion response. The ANS is the part of the peripheral nervous system that maintains homoeostasis in the functioning of the many organs in the body. Peripherally measured activities of the ANS (e.g., heart rate, respiration) allow indirect measurement of cerebral emotional processing [40].


2.2.1 Emotion Recognition using Psycho-physiological Responses

Various psycho-physiological parameters (e.g., heart rate variability, galvanic skin response, breathing rhythm, etc.) can provide important information on the emotional and cognitive state of a person. Studies that investigated the relationship between physiology and emotions can be found in [41, 42, 43, 44]. Recent studies that used electroencephalograms (EEG) can be found in [45, 46]. With regard to music and sounds, several studies have analysed psycho-physiological peripheral parameters as correlates of emotions [47, 48, 41, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58]. The most prominent parameters studied were heart rate (HR) and its variability, skin conductivity (SC), and electromyograms (EMG) of facial muscles. Some studies also measured skin temperature and breathing rate.

In the work of Krumhansl [52], SC, cardiac, vascular and respiratory functions were recorded and correlated with scales of sadness, fear, happiness and tension. These physiological readings were taken while the participants were listening to musical excerpts. The largest changes in HR, blood pressure and SC were observed for sad music. Happy music produced the largest changes in the respiration measures. These results were interpreted as supporting evidence that emotions are elicited by the music and not only perceived.

2.2.2 Physiological Metrics of Emotion

There are several peripheral physiological signals that can be used to measure emotional responses. Some of the commonly used metrics are briefly described below.

Heart rate (HR)

Heart rate is a primary ANS measure of anxiety or stress activation. It is also a strong indicator of mental stress. It is dually controlled by the sympathetic and parasympathetic branches of the ANS, which may act independently [59, 60].

Blood volume pulse (BVP)

BVP is measured using photoplethysmography. BVP sensors bounce infra-red light against the skin surface and measure the amount of reflected light. BVP has been used to measure levels of anger, stress, sadness and relaxation.

Respiration rate (RR)

The research in [61, 62] suggests that respiratory parameters can be mapped into the affective space of Lang [63]. Respiration rate, in particular, is measured as the number of breaths taken within a set amount of time.

Skin conductivity (SC)

This measures the skin's ability to conduct electricity. SC can be used to find the emotional modulation of the ANS [51] since it is linearly correlated with arousal. It can also represent changes in the sympathetic nervous system and reflect emotional responses and cognitive activity.

Electroencephalogram (EEG)

An EEG measures the brain's electric field at the scalp, which is spatially diffused by the insulating skull material. EEG is sensitive to the activity of a large number of neurons, providing a spatio-temporal record of brain activity that is indicative of the cognitive state.

The principal spectral components of EEG are divided into the following signal bands: delta (0–4 Hz), theta (4–8 Hz), alpha (8–12 Hz), beta (above 12 Hz) and gamma (above 40 Hz). Many studies have related these signal bands to alertness, cognitive functions, and the overall capacity of the brain to operate within its usual limits.

Work on using EEG to recognize emotions finds that different mental states produce distinct patterns of electrical activity [64, 65]. The right hemisphere is responsible for negative emotions (i.e. stress, disgust, sadness) while the left hemisphere is responsible for positive emotions (i.e. happiness, gratitude, amusement).
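As an illustration only (not part of the original methodology), a minimal Python sketch of how band power is commonly estimated from a single EEG channel is shown below; the band edges follow the definitions above, while the upper cut-offs for beta and gamma and the 256 Hz sampling rate are assumed example values.

import numpy as np

# Band edges in Hz, following the definitions above; upper limits for beta and
# gamma are illustrative assumptions.
BANDS = {"delta": (0, 4), "theta": (4, 8), "alpha": (8, 12),
         "beta": (12, 40), "gamma": (40, 100)}

def band_powers(eeg, fs=256.0):
    """Average spectral power of a single-channel EEG signal in each band."""
    spectrum = np.abs(np.fft.rfft(eeg)) ** 2
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)
    return {name: spectrum[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in BANDS.items()}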


Chapter 3

Detection of Emotion-Inducing Music Segments and Psycho-physiological Response

3.1 Introduction

Recognizing emotions using physiological sensors requires careful planning and execution to obtain accurate results. As a preliminary step towards emotion modelling, this experiment concentrates on learning more about the dynamics between physiological data and music features.

In this stage of the research, an algorithm used in data mining is adapted to discover repeating patterns in the physiological data. These patterns are used to identify segments in the music that can be considered segments of interest, since these segments elicit similar physiological responses from the subject. In addition, the guiding principles for performing data collection using physiological sensors are defined using this experiment.

The next section discusses the data representation used for the experiment: time series motifs. Section 3.3 discusses the research framework and describes how the physiological data is used. Next, a discussion of the data collection and experiments is provided, followed by the experimental results.


3.2 Time Series Motif

A time series motif is a pair of subsequences of a longer time series which are very similar to each other [66]. Discovering this repeated structure in the data suggests that there is an underlying reason for its presence. Since its formalization in 2002, researchers have used motifs in various domains, such as medicine [67], entertainment [68], and biology [69].

In this experiment, physiological data, particularly respiration rate (RR) and blood volume pulse (BVP), and chord sequences are used as time series data. Figure 3.1 shows an example of a motif discovered in the BVP recording of a subject listening to music.

Formally, given a time series T = t_1, ..., t_m, an ordered set of m real-valued variables, a subsequence C of T is a sampling of length n < m of contiguous positions from T, that is, C = t_p, ..., t_{p+n-1} for 1 ≤ p ≤ m − n + 1. For all a, b, i, j, the subsequence pair {T_i, T_j} is the motif iff dist(T_i, T_j) ≤ dist(T_a, T_b), i ≠ j and a ≠ b. This definition excludes trivial matches of a subsequence with itself by not allowing i = j. The distance between two subsequences is measured using the Euclidean distance, defined as:

dist(X, Y) \equiv \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}    (3.1)

Figure 3.1: A blood volume pulse time series (above) indicating two near-identical subsequences. A "zoom-in" (below) reveals how similar the subsequences are to each other.
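To make the definition concrete, the following minimal brute-force Python sketch (not part of the original dissertation; names are illustrative) finds the closest pair of length-n subsequences under the Euclidean distance of Eq. (3.1). It uses a stricter non-overlap condition to avoid near-self matches; the Mueen-Keogh algorithm described next reaches the same exact answer far more efficiently.

import numpy as np

def euclidean(x, y):
    """Eq. (3.1): Euclidean distance between two equal-length subsequences."""
    return np.sqrt(np.sum((x - y) ** 2))

def brute_force_motif(T, n):
    """Return positions (i, j) of the closest pair of length-n subsequences of T.

    Overlapping pairs are skipped to exclude trivial matches. This is O(m^2)
    pairs for a series of length m; MK prunes most of these comparisons.
    """
    m = len(T)
    best_dist, best_pair = np.inf, (None, None)
    for i in range(m - n + 1):
        for j in range(i + n, m - n + 1):      # skip overlapping (trivial) matches
            d = euclidean(T[i:i + n], T[j:j + n])
            if d < best_dist:
                best_dist, best_pair = d, (i, j)
    return best_pair, best_dist

# Example: find a 50-sample motif in a toy "physiological" series.
rng = np.random.default_rng(0)
series = np.sin(np.linspace(0, 20, 1000)) + 0.1 * rng.standard_normal(1000)
(i, j), d = brute_force_motif(series, n=50)
print(i, j, round(d, 3))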


The algorithm used for discovering motifs in the music data is the Mueen-Keogh (MK) algorithm [70], which is further discussed in the next section.

3.3 Research Framework

The research framework for this experiment is shown in Fig. 3.2. The first task is collecting psycho-physiological responses from a subject while he listens to music. After noise filtering and data transformation, the data is used by the motif discovery module, which discovers patterns in the time series data.

A music feature extraction module is used to determine various information from the music (e.g., beat occurrences, tempo, chords used, etc.). The music key and chord information is used to identify frequent chord sequences in the music. Motifs are then matched with the chord sequences and other music information that occur at the same time as the motif subsequence pairs. Afterwards, these are stored in a library.

All the information recorded can then be used by a music recommendation system that is envisioned to generate a playlist of songs that have similar music features. Intuitively, it is assumed that the subject will enjoy listening to music with similar features during the same listening session. The focus of this work, however, is on building a library of music features that can be used for identifying emotions induced by music.

Figure 3.2: Research Framework


3.3.1 Music Collection

The songs used for testing the algorithms were obtained from the isophonics dataset¹. The collection includes 301 songs from various artists as well as annotations for song key, chords, beat and metric position, and music structure segmentation (i.e. intro, verse, chorus, etc.). These annotations were done manually by music experts and students [71]. Songs for the experiments were selected based on three constraints. First, the song should not have any key or tempo changes. Second, the song should have complete chord and beat annotations. Last, the song should be in a major key. Using these criteria, 83 songs were selected, which include 77 songs from The Beatles, four Queen songs, and two Carole King songs.

Since the isophonics dataset already includes the chord, beat, key and segment annotations for the different songs, only a simple text parser was needed to read the different annotation files.

¹ http://www.isophonics.net/datasets

3.3.2 Mueen-Keogh Algorithm

The algorithm of Mueen and Keogh [70] was used and modified for finding motifs in a time series. Unlike other motif discovery algorithms that approximate the computation for real-valued time series, the Mueen-Keogh algorithm is an exact motif discovery algorithm. Furthermore, the algorithm utilizes two optimization techniques that reduce execution time by three orders of magnitude: early abandoning of the Euclidean distance computation, and pruning of the search space of motif candidates.

The objective of the algorithm is to find the closest-pair subsequence of length n in the time series T. This is determined by using a best-so-far distance for the motif pair, which is initialized to infinity. A random object, in this case a subsequence, is used as a reference point and all other objects are ordered by their distances from the reference point. The distances of all subsequences are computed and stored in a table called Dist. This table is used for sorting all objects. This ordering step provides a useful heuristic for the motif search: it is observed that if two objects are close in the original space, they are also close in the linear ordering. All objects that are worse than the best-so-far distance are no longer considered when computing the actual distance of two motif candidates. This pruning step is reflected in lines 20 and 21 of the algorithm in Table 3.1.

After the objects have been arranged, the linear ordering is traversed and the true distances between adjacent pairs are measured. The best-so-far distance is updated whenever a smaller distance is found. All data points that were not pruned during the initial ordering step are stored in I (line 13). During the actual distance computation, a variable offset is introduced. It is an integer between 1 and n − 1 used to refer to the jth item and the (j + offset)th item in the ordered list I, which form a candidate pair for testing. The algorithm starts with an initial offset of 1 and searches pairs that are offset apart in the I ordering. Once all the pairs have been examined, the offset is increased and another round of searching is done. This continues until all possible pairs have been exhausted.

The effectiveness of the algorithm is affected by the reference point that is initially chosen. For a large dataset, a poorly chosen reference point would still leave a large search space. To remedy this, multiple reference points are chosen. From the experiments of Mueen et al., choosing any value from five to sixty for R gives two orders of magnitude of speedup.
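The early-abandoning optimization mentioned above can be illustrated with a short sketch (a simplified illustration, not the authors' implementation): the running sum of squared differences is compared against the best distance found so far, and the computation stops as soon as the pair can no longer improve on it.

import numpy as np

def early_abandon_distance(x, y, best_so_far):
    """Euclidean distance between x and y, abandoned once it exceeds best_so_far."""
    limit = best_so_far ** 2          # compare squared sums to avoid repeated sqrt
    acc = 0.0
    for xi, yi in zip(x, y):
        acc += (xi - yi) ** 2
        if acc > limit:               # this pair can no longer beat the best motif
            return np.inf
    return np.sqrt(acc)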

3.3.3 Identifying Frequent Chord Sequences

The frequent chord sequences are also identified using the motif discovery algorithm. Chords in the song are first represented as a time series similar to the one shown in Fig. 3.3.

Table 3.1: Mueen-Keogh Motif Discovery [70]

Algorithm [L1, L2] = MK-Motif(T, R, n)
in: T: a time series, R: the number of reference objects, n: the subsequence length
out: L1, L2: the locations of a motif

 1: best-so-far = INF
 2: for i = 1 → R do
 3:     ref_i = a randomly chosen subsequence T_r from T
 4:     for j = 1 → n do
 5:         Dist_{i,j} = d(ref_i, T_j)
 6:         if Dist_{i,j} < best-so-far then
 7:             best-so-far = Dist_{i,j}, L1 = r, L2 = j
 8:         end if
 9:         S_i = standardDeviation(Dist_i)
10:     end for
11: end for
12: find an ordering Z of the indices to the reference objects in ref such that S_{Z(i)} ≥ S_{Z(i+1)}
13: find an ordering I of the indices to the subsequences in T such that Dist_{Z(1),I(j)} ≤ Dist_{Z(1),I(j+1)}
14: offset = 0, abandon = false
15: while abandon = false do
16:     offset = offset + 1, abandon = true
17:     for j = 1 → n do
18:         reject = false
19:         for i = 1 → R do
20:             lowerBound = |Dist_{Z(i),I(j)} − Dist_{Z(i),I(j+offset)}|
21:             if lowerBound > best-so-far then
22:                 reject = true, break
23:             else if i = 1 then
24:                 abandon = false
25:             end if
26:         end for
27:         if reject = false then
28:             if d(D_{I(j)}, D_{I(j+offset)}) < best-so-far then
29:                 best-so-far = d(D_{I(j)}, D_{I(j+offset)})
30:                 L1 = I(j), L2 = I(j + offset)
31:             end if
32:         end if
33:     end for
34: end while

The chords are converted into a numerical representation using Lerdahl's Tonal Pitch Space (TPS) [72]. The TPS is a model of tonality that fits human intuitions. Using the TPS model, the distance between chords can be computed given the key. The basis of TPS is the basic space shown in Table 3.2. It consists of five hierarchical levels for the pitch class subsets, ordered from stable to unstable. The most stable level (a) is the root level, containing only the root of the chord. Level (b) adds the fifth of the chord. The triadic level (c) contains all pitch classes of the chord. Next is the diatonic level (d), consisting of all pitch classes of the diatonic scale of the current key. The last and least stable level is the chromatic level (e), containing all pitch classes. The basic space is designed to be hierarchical, i.e. if a pitch class is present at a higher level, it is also present at lower levels.

Table 3.2: The basic space of the tonic chord in the key of C Major (C = 0, C♯ = 1, ..., B = 11), from Lerdahl [72].

(a) octave (root) level:    0                              (0)
(b) fifths level:           0              7               (0)
(c) triadic (chord) level:  0       4      7               (0)
(d) diatonic level:         0   2   4   5  7   9   11      (0)
(e) chromatic level:        0 1 2 3 4 5 6 7 8 9 10 11      (0)
                            C   D   E   F  G   A   B        C

In order to calculate the distance between two chords, the basic space is set to match the key of the piece (level d). Then, the levels (a-c) can be adapted to match the chords to be compared. The distance between two chords is calculated using the chord distance rule of TPS [72], defined as follows:

Chord distance rule: d(x, y) = j + k, where d(x, y) is the distance between chord x and chord y. j is the minimal number of applications of the Circle-of-fifths rule in one direction needed to shift x into y. k is the number of non-common pitch classes in the levels (a-d) within the basic space of y compared to those in the basic space of x. A pitch class is non-common if it is present in x or y but not in both chords. This definition causes the distance function to be non-symmetrical, i.e. d(Cm, G) ≠ d(G, Cm).

Figure 3.3: Representation of the chord sequences as a time series for Yesterday. Sequences are divided into the first two parts of the song: (a) introduction, and (b) first verse.


Circle-of-fifths rule: move the levels (a-c) four steps to the right or four steps to the left (modulo 7) on level (d).

Using the chord distance rule, all chords in the song are compared with the harmonic center or tonic of the music. For example, given a chord sequence D-G and the song being in the key of D major, the corresponding distance values would be d(D, D) and d(D, G). A list of chord distances is then constructed using the same sampling frequency as the physiological data.
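As an illustration of this step, the sketch below (not from the dissertation) expands a list of timed chord labels into a numeric time series sampled at an assumed common rate; the chord-to-distance lookup is hard-coded with values in the spirit of Table 3.4 rather than computed from the TPS rules.

# Hypothetical per-song lookup from chord label to TPS distance from the tonic,
# in the spirit of Table 3.4 (values here are illustrative, not computed).
CHORD_DISTANCE = {"F": 0, "F:7": 2, "Bb": 5, "C": 5, "C:7": 6,
                  "D:min": 7, "G:min": 8, "G": 9, "A:7": 10}

def chords_to_series(chord_events, duration_s, fs=32.0):
    """Expand (onset_seconds, chord_label) events into a sampled time series.

    fs is an assumed sampling frequency matching the physiological recording,
    so both signals share a common time axis for motif matching.
    """
    n_samples = int(duration_s * fs)
    series = [0.0] * n_samples
    for (onset, label), nxt in zip(chord_events, chord_events[1:] + [(duration_s, None)]):
        start, end = int(onset * fs), int(nxt[0] * fs)
        series[start:end] = [float(CHORD_DISTANCE[label])] * (end - start)
    return series

# Example: D:min for the first 2 s, then G until 3.5 s, then F to the end.
print(chords_to_series([(0.0, "D:min"), (2.0, "G"), (3.5, "F")], duration_s=5.0)[::16])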

After representing the chord sequences as a time series, the motif discovery algorithm can be used to find the closest-pair subsequences. The goal then is to find chord progressions of various lengths, particularly chord progressions having two or more chords in the sequence. To discover these chord progressions, the motif discovery algorithm is run iteratively using a different chord motif length l. Initially, l is set to a value that would capture one second of chord progression. This value is then increased in 0.5-second increments until a chord progression that is eight seconds long is obtained. All chord progressions discovered are stored in a list CP = {cp_1, cp_2, ..., cp_k}, where cp_i is a chord progression and length pair ⟨C_i, l⟩.
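A minimal sketch of this iterative search is given below (illustrative only); it takes any motif finder with the same interface as the brute-force sketch in Section 3.2 and assumes a 32 Hz sampling rate.

import numpy as np

def discover_chord_progressions(chord_series, find_motif, fs=32.0,
                                min_s=1.0, max_s=8.0, step_s=0.5):
    """Run motif discovery for window lengths from 1 s up to 8 s in 0.5 s steps.

    find_motif(T, n) is any motif finder returning ((i, j), distance), e.g. the
    brute-force sketch above or an MK implementation.
    """
    T = np.asarray(chord_series, dtype=float)
    progressions = []  # entries are ((start_i, start_j), length_in_samples)
    length_s = min_s
    while length_s <= max_s + 1e-9:
        n = int(round(length_s * fs))
        progressions.append((find_motif(T, n)[0], n))
        length_s += step_s
    return progressions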

3.3.4 Motif Analysis

The motifs discovered from the physiological responses are mapped to the frequent chord progressions that were discovered. At this point, other music features could also be included in annotating the motif. However, for this experiment only the chord progressions, and indirectly the key of the songs, are included.

The chronological order of subsequences is used to map the motifs to the chord progressions. If a chord progression is found to occur within the time the motif was observed, then this chord progression is included in the list. Formally, a chord progression C_i with length l is a chord progression of motif T_j with length n iff T_j ≤ C_{i+l} ≤ T_{j+n} or T_j ≤ C_i ≤ T_{j+n}.
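The condition above is an overlap test between two index intervals; a minimal sketch (positions expressed as sample indices, function and parameter names hypothetical):

def progression_matches_motif(c_start, c_len, t_start, t_len):
    """True if the chord progression starting at c_start (length c_len) starts
    or ends inside the motif interval [t_start, t_start + t_len]."""
    return (t_start <= c_start + c_len <= t_start + t_len) or \
           (t_start <= c_start <= t_start + t_len)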


3.4 Data Collection

For this experiment, data is collected from one subject (a 22-year male graduate stu-

dent). The subject listened to the songs via audio-technica closed headphones (ATH-

T400) connected to a computer in a controlled experiment room. The physiological

data was recorded using three sensors of the BioGraph Infinity System2, namely, sen-

sors for blood volume pulse (BVP), respiration rate (RR) and skin conductance (SC).

The sensors are attached to the subject as shown in the experiment setup in Fig. 3.4.

Several sessions were needed for the subject to listen to all the songs without making

him feel stressed. Each session took approximately 20 minutes, which allowed the subject

to listen to seven to nine songs per session. One week was needed to complete the data

collection. Sessions were held at the same time of the day throughout the week (i.e.,

one session per day).

Before each session ended, the subject also self-reported the mood he had while

listening to each song. A scale of one to five was used to describe how happy and how

exciting the song made him feel.

Figure 3.4: Setup for data collection: BVP sensor is worn on the left middle finger, skin conductance sensor is worn on the left index and ring finger, respiration sensor is worn on the chest, and music is heard via closed headphones

2 About BioGraph Infinity System. Thought Technology Ltd. 12 December 2012. http://www.thoughttechnology.com


Table 3.3: Summary of music included for motif discovery

Key    Andante    Moderato    Allegro    Total
C          1           1           3        5
D          1           1           7        9
E          3           3           8       14
F          2           1           2        5
F♯         0           0           1        1
G          5           2           3       10
A♭         1           0           0        1
A          5           4           5       14
B♭         1           0           1        2
B          1           1           1        3
Total     20          13          31       64

Andante: 76–108 bpm    Moderato: 108–120 bpm    Allegro: 120–168 bpm

Although 83 songs were used for the data collection, only data from 64 songs are

included in the analysis for this experiment. Only songs that made the subject happy (i.e., songs rated three and above) and have a tempo between 76 and 168 beats per minute

(bpm) are included. The tempo and key information of the music data set is shown in

Table 3.3.

Prior to motif discovery, offset and amplitude scaling transformations are applied

to the physiological data using (3.2) and (3.3), respectively [73, 74]. These operations

improve the accuracy of Euclidean distance which cannot handle situations where two

sequences being compared are alike, but one has been “stretched” or “compressed” in

the Y-axis.

$$Q_{offset} = Q - \frac{\sum_{i=1}^{n} q_i}{n}, \qquad (3.2)$$

where $Q$ is a time series of length $n$ and $Q_{offset}$ is the time series after offset transformation.

$$Q_{scaled} = \frac{Q_{offset}}{\sigma}, \qquad (3.3)$$

where $\sigma$ is the standard deviation of the data and $Q_{scaled}$ is the time series after amplitude scaling transformation.


Before applying the offset and amplitude scaling operations, the sequences need to be normalized to the range [0, 1] using (3.4).

$$Q = \frac{Q - \min(Q)}{\max(Q) - \min(Q)}, \qquad (3.4)$$

where $Q$ is a time series, and $\min(Q)$ and $\max(Q)$ are the minimum and maximum values in the time series, respectively.
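A minimal sketch of this preprocessing chain, applying (3.4), (3.2), and (3.3) in the order described, is shown below.

```python
import numpy as np

# Sketch of the preprocessing applied to each physiological sequence:
# range normalization, offset removal, and amplitude scaling.

def preprocess(q):
    q = np.asarray(q, dtype=float)
    q = (q - q.min()) / (q.max() - q.min())   # (3.4) normalize to [0, 1]
    q = q - q.mean()                          # (3.2) offset transformation
    return q / q.std()                        # (3.3) amplitude scaling
```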

3.5 Results

In this section, the results of using the motif discovery algorithm on the two time series,

namely, physiological data and chord sequences are presented.

3.5.1 Chord Progression Motifs

A crucial step in using the motif discovery algorithm on the chord progressions is the

conversion from symbolic chord notation to the TPS Chord distance notation. This representation can adequately convert the data into a time series. However, it is observed that the representation simplifies the chords and maps two or more harmonically distinct chords to the same numerical value. As an example, Table 3.4 shows how the chords in the song “Yesterday” were converted to numerical values. Harmonically distinct chords such as G, G:7, and A all map to the same value, while the chords C and C:7 are converted to different numerical values even though they sound very similar.


Table 3.4: Mapping of chords of Yesterday to TPS Chord Distance

Numeric    Original Chords
0          F, F/5, F/7, F:maj(*3)
2          F:7
5          Bb, Bb/5, Bb:maj, Bb:maj7, B/7, C
6          C:7
7          A:min/b3, D:min, D:min7, D:min/b7
8          G:min, A:sus4/5, E:min7(4)
9          G, G:7, G/3, A
10         A:7


Table 3.5: Sample of identified chord progressions

Length    Chord Progression    Key    Chords
2         I-IV                 D      D-G
2         I-V                  A      A-E
2         iii-I                G      Bm-G
2         V-I                  E      B-E
3         I-ii-V               G      G-Am-D
3         I-V-I                G      G-D-G
3         I-IV-I               E      E-A-E
3         I-IV-V               D      D-G-A
3         I-V-IV               E      E-B-A
3         IV-V-I               A      D-E-A
4         I-IV-V-I             D      D-G-A-D
4         I-IV-ii-vi/5         E      E-A-F♯m-C♯m/5
4         I-ii-I/3-IV          G      G-Am-G/3-C
4         ii-V-I-ii            G      Am-D-G-Am
5         I-vi-IV-V-I          E      E-C♯m-A-B-E
5         IV-ii-vi/5-IV-I      E      A-F♯m-C♯m/5-A-E

Although the conversion of chords to numerical form approximates the chord sequences, results show that it is still adequate for identifying frequent chord progressions

in a song. Table 3.5 shows some of the chord progressions that were discovered. As seen

in this list, some of the chord progressions found are considered common in pop music,

such as the three-chord progressions I-IV-V, I-V-IV, and I-ii-V. The 5-chord progression

I-vi-IV-V-I could be considered as an extended version of another common sequence

known as the 50s progression (I-vi-IV-V or I-vi-ii-V).

One important advantage of using the TPS model is that it converts chord sequences to

a key-invariant representation. This allows comparison of chord sequences to chord se-

quences that belong to another song with a different key. This would help in discovering

frequent chord sequences that are found in the entire music collection.

3.5.2 Physiological Time Series Motifs

Prior to mapping the chord progressions to the physiological data, the quality of the motifs that were obtained from the BVP and RR time series is evaluated. The motif

discovery algorithm was used on both datasets using a motif length of one to eight

seconds. After each run, the average distances of the motifs discovered are calculated.

It was observed that as the motif length increased, the motif distances also increased.


Figure 3.5: Comparison of motif distances between BVP and RR

However, it was also noticed that the motifs obtained from the RR time series had a

slower rate of change. Figure 3.5 shows a graph of the rate of change. With this new

insight, it was decided to use the RR time series to test the algorithm for mapping

physiological motifs with chord progressions.

Using the algorithm described for the motif analysis function, every physiological

motif was annotated with at least one chord progression. However, to get this result,

the motifs generated using different motif lengths had to be used. On average, a motif length that captures 4 seconds of data yielded the greatest number of co-occurrences with chord progressions.

It was also observed that some motif pairs were mapped to the same chord progres-

sion. For example, both subsequences representing the physiological motif of the song

I’ll Follow the Sun mapped to the chord progression C:7-D:min-F:min-C (I7-ii-iv-I). This

might indicate that this chord progression has a strong effect on the subject. However, this kind of result occurs only 5% of the time in the data that was collected.

3.5.3 Improving the Data Collection Methodology

Using the data collection methodology, a psycho-physiological data set was built. How-

ever, additional steps are needed to improve the data quality. First, the subject needs

to be interviewed after he or she listens to each song. This will help confirm if the motifs

identified correlate to significant emotions experienced by the subject.

Second, the quality of the physiological signals needs to be confirmed by repeating


the data collection several times. This will ensure that a physiological motif that is

discovered will be present in all the data collection trials.

Finally, additional subjects are needed to confirm if the methodology will produce

consistent results.

3.6 Summary

In this experiment, a motif discovery algorithm was used to identify song segments that

contain chord progressions that elicit an emotional response from a person. The motif

discovery algorithm was used for detecting patterns in both the physiological data as

well as in the chord sequences. Further analysis showed that each frequently occurring

chord progression co-occurs with a motif from the physiological data. This confirms

the hypothesis that music features, particularly chords, can elicit a similar physiological response. However, it is still unclear how to describe the emotional response since the emotion annotations provided by the subject convey only a general impression of the

songs (i.e., one emotion label per song). A fine-grained emotion annotation is needed to

enhance the usability of the results.

With the help of this experiment, several problems were identified in the data col-

lection methodology. These problems are addressed in the main experiment discussed

in the next chapter.


Chapter 4

Emotion Classification using

High-level Music Features and

Emotion Annotations from EEG

Data

4.1 Introduction

In this chapter, the main experiment for the research is discussed. This experiment

improves on the previous experiment in three aspects.

1. Refined data collection methodology. During the previous experiment some

problems regarding data collection were identified. These are addressed in this

experiment and are discussed in Section 4.3.

2. Increased amount of high-level music features. In addition to chord and

harmony information, additional high-level music features are used for describing

music segments. However, a different source of music files was needed in order to

extract the music features. This is discussed further in Section 4.2.1.

3. Use of continuous-valued emotion labels. One novel aspect of this research is the ability to collect psycho-physiological responses for an entire song. It is thus

crucial to be able to distinguish and use a fine-grained emotion label for annotating

different parts of the music. This prompted the use of an emotion spectrum


analysis method by Musha [75] which uses an electroencephalogram (EEG) to

continuously record the emotional state of a subject and obtain continuous-valued

emotion labels.

The changes mentioned above affect the manner in which the data is analysed and

used for constructing emotion models. Section 4.2 discusses the detailed changes made

in the research framework. Details of the emotion model and results of testing and

evaluation are found in Sections 4.4 and 4.5, respectively.

4.2 Research Framework

The framework shown in Fig. 4.1 is similar to the framework presented in Section 3.3.

The main components are similar but the data and the techniques used for processing

the data are different. Emotion models are built using supervised learning techniques that learn from labelled high-level features extracted from MIDI files. The

labels used are emotion annotations automatically obtained from EEG readings of the

person listening to the songs.

Figure 4.1: Modified research framework


4.2.1 High-level Music Features

Music is generally digitally represented either as audio data (e.g., wav, aiff, or MP3)

or symbolic data (e.g., MIDI, MusicXML, or Humdrum). Audio files encode analog

waves into digital samples. Symbolic data, on the other hand, store musical events,

such as note onsets, note durations, pitches, changes in tempo, loudness, and so forth.

Thus, symbolic data are regarded as high-level music representation and audio data as

low-level representation.

With regard to music feature extraction, audio data and symbolic data have their

respective strengths and weaknesses. Symbolic data in the form of MIDI files are chosen

for this experiment primarily due to two reasons. First, in this experiment, features that

are related to high-level music information are desired for the machine learning task.

Extracting such music information from audio files is still an open problem. Existing

tools can only extract, with low accuracy, very basic information. In contrast, this task

is easy to do using MIDI files since the basic features are readily available in the file.

Second, MIDI files do not include the lyrics of the songs, which helps eliminate the additional emotional content that lyrics would contribute.

Without a doubt there are also good reasons for using audio data. Most important

is that audio data are what people actually listen to in general. Using original music with the correct timbre quality, as opposed to synthetic-sounding MIDI music, would provide a more authentic emotional reaction from the listener. However, it was

decided to focus on the music features for this work.

4.2.2 Emotion Annotations

An EEG is used to measure the psycho-physiological response of a subject listening

to music. These continuous EEG readings are converted to continuous-valued emotion

annotations using emotion spectrum analysis method (ESAM) [75]. ESAM provides the

magnitude or intensity of specific emotions. Using this method, it is easy to determine

how the emotional state of a subject changes as he or she listens to the song. The details

of the data collection methodology and the process of converting the EEG readings into


emotion annotations is further discussed in the next section.

4.2.3 Machine Learning Task

Supervised machine learning techniques are used to learn which high-level music features

are best for classifying music by emotion. The emotion models are evaluated using

the ground truth of emotion labels obtained during data collection.

4.3 Data Collection Methodology

Creating the emotion models requires examples of music features that can elicit specific

emotions from a listener. The music that subjects will listen to as well as the method for

recording physiological readings must be guided by the following design considerations:

1. Musical taste is highly personal. Personal factors have an effect on affect as iden-

tified by other researchers. These include, but are not limited to, personality

[76], familiarity with the music [77, 78], age [79], gender [80], and musical prefer-

ence [78, 81]. This implies that personally selected music would induce stronger

emotions than music selected by the researcher.

2. Music features contribute to changes in emotion. Musicologists have identified

that the mode (major or minor), harmony, rhythm, tempo, instrumentation, and

lyrics contribute to a person’s affect. Choosing which music features to be included

in the research will also limit the kind of emotions induced by the music.

3. Physiology is responsive to many physiological and physical influences aside from

affect. Therefore, physiological signals should be standardized per measurement

session. Subjects must be conscious of the fact that sensors are sensitive and

movements cause artefacts or noise in the data. Finally, multiple measurements

must be done to ensure consistent data.

These considerations are used in designing the experiments. The following subsections

describe how these are implemented in the research.


4.3.1 Experimental set-up

There are two participants who selected and annotated songs. Subject A, a 22-year-old female, and subject B, a 22-year-old male, were in good mental and physical health

when the experiments were conducted. The subjects were paid for their participation

in the research and signed a waiver giving permission to use their data for the research.

The experiment was conducted in a well-lit and controlled experiment room. During

data collection, only one subject and the examiner were in the experiment room. Music

was played using a computer attached to two external speakers (Yamaha MSP3). The

music collection is a set of MIDI files comprising 121 Japanese and Western songs: 33 Folk, 20 Jazz, 44 Pop, and 24 Rock. The complete list of songs used

for the experiments can be found in Appendix A.

The data collection is divided into two phases. The first phase requires the subjects

to listen to all songs and manually identify the emotions that were experienced for the

song. The subjects were instructed to listen to the entire song and were given full control

on which parts of the song they wanted to listen to. After listening to each song, the subjects gave a general impression of how joyful, sad, relaxing, and stressful each song

was using a five-point Likert scale. Aside from the emotions felt, the subjects were also

asked to rate whether they were familiar with the song or not using the same scale. The

manual annotation was done in one session for approximately one and a half hours.

The second phase involves using an EEG to measure changes in electrical activity

along the subjects’ scalps while the subjects listened to selected songs. Using the manual

annotations, the 10 most relaxing songs and 10 most stressful songs with varying levels

of familiarity to the subject were selected. Since collection of the emotion annotations

takes a lot of time and effort from the subject, it was decided to concentrate resources on

recording data for certain types of emotional states, specifically, relaxing and stressful

music.

The EEG device is a helmet with electrodes that can be placed on 22 scalp positions

according to the International 10–20 Standard. Figure 4.2 shows the location of the

different electrodes. Using the EEG, electric potential differences were recorded with a


(a) EEG electrode positions (b) Electrodes used for ESAM

Figure 4.2: The EEG has 22 electrodes used to record electrical changes on the scalp. Each node is identified by a letter to indicate lobe position: F-Frontal lobe, T-Temporal lobe, C-Central lobe, P-Parietal lobe, O-Occipital lobe. ’Z’ refers to an electrode placed on the mid-line.

reference electrode on the right earlobe.

The subjects were advised to close their eyes and remain still while data was being

collected. Listening sessions had to be limited to a maximum of 30 minutes or up to the

moment that the subjects began to feel uncomfortable wearing the EEG helmet. It was

important to ensure that the subjects were comfortable and eliminate external factors

that may contribute to changes in emotion. Prior to collecting the data, the volume of

the music was adjusted to a comfortable level for the subject. The same volume setting

was used for playing all the songs for all sessions. On average, EEG readings for 7 songs

were recorded per listening session.

Before each song was played, 10 seconds of white noise were introduced to help the

subjects focus on the task at hand without stimulating a strong emotional response.

Collection of the EEG signals was done per song. Recording of the signals commenced

after the white noise was heard and was stopped when the music ended.

After listening to each song, a short interview was conducted to determine if the subjects

particularly liked or disliked specific parts of the song. The interview also helped confirm

the initial manual annotations of the subjects.

After one set of EEG readings for the 20 songs was recorded, the experiment was

repeated two more times to ensure that the measurement obtained was accurate or at

least consistent. The entire recording for one subject lasted for 3 weeks (i.e., one set


of 20 songs per week). A total of 70 EEG readings was recorded for the research. The

first subject, unfortunately, did not complete the 2nd and 3rd recording sessions due to

health reasons.

4.3.2 EEG Data and Emotion annotation

Continuous emotion annotations were obtained using EMonSys. This software1 uses the

emotion spectrum analysis method (ESAM) [75] to convert EEG readings to continuous-

valued emotion annotations. Using data from 10 scalp positions (see Fig. 4.2b) at Fp1,

Fp2, F3, F4, T3, T4, P3, P4, O1, O2, electric potentials were separated into their

θ (5–8 Hz), α (8–13 Hz) and β (13–20 Hz) frequency components by means of fast

Fourier transforms (FFT). The values of the cross-correlation coefficients for the three

components on 45 channel pairs were evaluated every 0.64 seconds. For example, the

cross-correlation coefficient c(α; jk) between potentials collected with electrodes j and

k for the α band is given as:

$$c(\alpha; jk) = \frac{\sum_{\alpha} X_j(f_n) X_k^{*}(f_n)}{\sqrt{\sum_{\alpha} |X_j(f_n)|^2}\sqrt{\sum_{\alpha} |X_k(f_n)|^2}} \qquad (4.1)$$

The 135 variables obtained from the operation described form the input vector Y.

Using an emotion matrix C, this 135-dimensional vector is linearly transformed into a

4-D emotion vector E = (e1, e2, e3, e4), where ei corresponds to the 4 emotional states,

namely: stress, joy, sadness, and relaxation. Formally, the emotion vector is obtained

by

C · Y + d = E (4.2)

where d is a constant vector. The emotion matrix used in the research is included in the

software. It was prepared by the developers using a purification process described in

[75]. EEG data for each emotion state was obtained from participants divided into two

groups of almost equal size. The matrix was prepared using EEG data from group A

and tested on group B. During the test stage, EEG data that did not give correct results

were removed. After the purification step, a new matrix constructed from filtered data of group B was tested on group A. Two or three repetitions were necessary to remove incorrect EEG data samples.
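The computation described by (4.1) and (4.2) can be sketched as follows. This is only an illustrative reconstruction, not the EMonSys implementation: the array shapes, the use of the real part of the correlation, and the variable names are assumptions.

```python
import numpy as np

# Illustrative sketch of one 0.64-second ESAM analysis step. `eeg` is a
# (10, n_samples) array for the ten electrodes used by ESAM, `fs` the sampling
# rate; C (4 x 135) and d (4,) stand in for the emotion matrix and constant
# vector supplied with the software.

BANDS = {"theta": (5, 8), "alpha": (8, 13), "beta": (13, 20)}

def emotion_vector(eeg, fs, C, d):
    spectra = np.fft.rfft(eeg, axis=1)
    freqs = np.fft.rfftfreq(eeg.shape[1], d=1.0 / fs)
    features = []
    for lo, hi in BANDS.values():
        band = (freqs >= lo) & (freqs < hi)
        X = spectra[:, band]
        for j in range(X.shape[0]):
            for k in range(j + 1, X.shape[0]):            # 45 electrode pairs
                num = np.sum(X[j] * np.conj(X[k]))
                den = np.sqrt(np.sum(np.abs(X[j]) ** 2)) * \
                      np.sqrt(np.sum(np.abs(X[k]) ** 2))
                features.append((num / den).real)          # real part, by assumption
    Y = np.array(features)                                 # 3 bands x 45 pairs = 135
    return C @ Y + d                                       # (4.2): E = C.Y + d
```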

1 Software developed by Brain Functions Laboratory, Inc.


(a) Medium-level stress (b) Low-level stress

Figure 4.3: Sample EEG signals from subject B for a segment of the song Stand By Me that is measured with different stress values


The emotion vector is used to provide a continuous annotation to the music every

0.64 seconds. For example, if one feels joy, the emotion vector would have a value of

E = (0, e2, 0, 0). The value of each component describes the magnitude of that emotion.

Figures 4.3a and 4.3b show segments of the EEG data of subject B that are measured

having medium and low levels of stress, respectively. The EEG data was recorded while

the subject was listening to the song Stand By Me.

4.3.3 Extracting Music Features

Each song having length m is split into several segments. Each segment, hereafter referred

to as a window w, has a length n, where one unit of length corresponds to one sample

of emotion annotation.

MIDI information for each window is extracted using a module adapted from jSym-

bolic [82] to extract 109 high-level music features. These features can be loosely grouped

into the following categories:

• Instrumentation. These features describe the kinds of instruments that are


present and how much importance is given to certain instruments over others. It

considers the importance of both pitched and non-pitched instruments and the interac-

tion between these.

• Texture. Several voices may be present in the MIDI file. These features determine how many there are and their relative importance.

• Dynamics. These are features that describe how loud or soft notes are played.

Some features also describe the types of variations in dynamics that occur.

• Rhythm. These features consider intervals between attacks of different notes and

durations of each one. Features that determine meter and rhythmic patterns are

also included.

• Pitch Statistics. Different statistics about the pitches found in the music are

described by these features. For example, the occurrence rates of different notes in

terms of both pitches and pitch classes, the tonal range of the music, the frequency

of notes that have a certain pitch range, etc.

• Melody. These features describe the melodic intervals that are used in the music

and how often these melodic variations occur. Some features also describe the

melodic contours.

• Chords. This set of features identifies the vertical intervals that are present in the

music. Some features describe the harmonic movement that occurs and how fast

these movements are.

The feature set includes one-dimensional and multi-dimensional features. For ex-

ample, Amount of Arpeggiation is a one-dimensional Melody feature, Beat Histogram is

a 161-dimensional Rhythm feature, etc. A complete list with an elaborate description

of each feature can be found in [20]. All features available in jSymbolic were used to

build a 1023-dimensional feature vector. The category distribution of the feature vector

is shown in Table 4.1. The Others category refers to the features Duration and Music

Position. Duration is a feature from jSymbolic, which describes the length of the song


Table 4.1: Distribution of features used for the instances

Category           Amount    Percentage
Dynamics                4          0.4%
Instrumentation       493         48.2%
Melody                145         14.2%
Pitch                 174         17.0%
Rhythm                191         18.7%
Texture                14          1.4%
Others                  2          0.2%

Figure 4.4: Illustration of labelling music features with emotion labels

in seconds. Music Position refers to the position of the window relative to the duration of the song. Although it was known that not all of the features would be used, this approach allows feature selection techniques to be applied to determine which features are the most important for classification.

After extracting the features for one window, the window is advanced through the data using a step size s until the end of the song is reached. Each window was labelled using

the average emotion value within the length of the window. Formally, the label for wi

is the emotion vector

$$E_i = \frac{1}{n}\sum_{j=i}^{i+n} E_j = \frac{1}{n}\sum_{j=i}^{i+n}\left(e_1^{j}, e_2^{j}, e_3^{j}, e_4^{j}\right), \qquad (4.3)$$

where 1 ≤ i ≤ m− n.

The process described is illustrated in Fig. 4.4. The figure shows how a window wi

is labelled using the emotion vector Ei.
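A minimal sketch of the windowing and labelling procedure is given below; the song object and the feature extraction function stand in for the jSymbolic-based module and are assumptions made for illustration.

```python
import numpy as np

# Sketch of the windowing and labelling step. `emotions` is an (m, 4) array of
# ESAM emotion vectors sampled every 0.64 s; `extract_features` is a placeholder
# for feature extraction over a window of the MIDI file.

def build_instances(song, emotions, n, s, extract_features):
    emotions = np.asarray(emotions, dtype=float)
    m = len(emotions)
    instances = []
    for i in range(0, m - n, s):                 # slide the window with step size s
        x = extract_features(song, start=i, end=i + n)
        label = emotions[i:i + n].mean(axis=0)   # equation (4.3): average emotion
        instances.append((x, label))
    return instances
```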


(a) Subject A dataset

(b) Subject B dataset

Figure 4.5: Illustration of datasets for the subjects

4.4 Emotion Model

Weka [83], an open-source machine learning platform, was used for building the emotion models using linear regression and C4.5. The training examples for the classifiers were obtained using the windowing technique with emotion labels for stress (e1) and relaxation

(e4). Different training datasets were constructed for each subject. Each dataset is constructed using a different window length parameter value (i.e., 1 ≤ n ≤ 210) and

emotion label. The dataset for subject A consists of all information obtained from 20

songs for one trial. The dataset for subject B consists of information for the three

recording trials of the 20 songs. Figure 4.5 illustrates how the datasets are organized.

The size of each trial subset varies depending on the parameters used for the sliding

window and the total length of the songs used for the subject. The size ranges from a

minimum of 2132 instances and a maximum of 6746 instances using the smallest and

largest sizes for window length (1 ≤ n ≤ 210).

During preliminary experiments it was observed that the decrease of training data

due to larger step sizes had too much of a negative influence on performance. As such,

all features were extracted using the smallest step size of s = 1 for all experiments.

Prior to training, all features that do not change at all or vary too frequently (i.e.


vary 99% of the time) are removed. Afterwards, normalization is performed to have

all feature values within [0, 1].

4.4.1 Linear Regression

The linear regression used for building the emotion models uses the Akaike criterion for model selection and the M5 method [84] to select features. The M5 method steps through the features and removes the feature with the smallest standardized coefficient until no

improvement is observed in the estimate of the error given by the Akaike information

criterion.

4.4.2 C4.5

C4.5 [85] is a learning technique that builds a decision tree from the set of training data

using the concept of information entropy. Since this technique requires nominal class

values, the emotion labels are first discretized into five bins. Initial work used larger bin sizes, but these gave poorer performance.
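A sketch of the discretization step is shown below. The exact binning scheme is not stated in the text, so equal-width bins over the label range are assumed here.

```python
import numpy as np

# Sketch of discretizing continuous emotion labels into five nominal classes
# using equal-width bins (an assumption about the binning scheme).

def discretize_labels(labels, n_bins=5):
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(labels.min(), labels.max(), n_bins + 1)
    # digitize against the inner edges gives bins 0..n_bins-1; shift to 1..n_bins
    return np.clip(np.digitize(labels, edges[1:-1]) + 1, 1, n_bins)
```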

4.4.3 Testing and Evaluation

In order to assess the models generated by the two methods, 10-fold cross-validation

was used. Models were built from instances generated with varying values of the window length.
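The evaluation loop can be sketched as follows. scikit-learn's LinearRegression and DecisionTreeClassifier are used here only as stand-ins for Weka's linear regression (with M5 feature selection) and C4.5; the sketch illustrates the 10-fold procedure, not the exact learners used in this work.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier

# Illustrative 10-fold cross-validation over one windowed dataset.

def cross_validate(X, y_numeric, y_nominal, n_splits=10):
    X, y_numeric, y_nominal = map(np.asarray, (X, y_numeric, y_nominal))
    results = {"regression_mae": [], "tree_accuracy": []}
    for train, test in KFold(n_splits=n_splits, shuffle=True).split(X):
        reg = LinearRegression().fit(X[train], y_numeric[train])
        results["regression_mae"].append(
            np.abs(reg.predict(X[test]) - y_numeric[test]).mean())
        tree = DecisionTreeClassifier().fit(X[train], y_nominal[train])
        results["tree_accuracy"].append(tree.score(X[test], y_nominal[test]))
    return {k: float(np.mean(v)) for k, v in results.items()}
```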

Results from the 10-fold cross validation are compared using different performance

measures, which are summarized in Table 4.2. The predicted values on the test instances

are p1, p2, ..., pn; the actual values are a1, a2, ...an.

In addition to these, the correlation coefficient is also computed for models built using

linear regression. The correlation coefficient r is defined as,

$$r = \frac{S_{PA}}{\sqrt{S_P S_A}}, \qquad (4.4)$$

where $S_{PA} = \frac{\sum_i (p_i - \bar{p})(a_i - \bar{a})}{n-1}$, $S_P = \frac{\sum_i (p_i - \bar{p})^2}{n-1}$, and $S_A = \frac{\sum_i (a_i - \bar{a})^2}{n-1}$.

For models built using C4.5, the Kappa statistic was also used as a performance

measure. The Kappa statistic k describes the chance-corrected measure of agreement


Table 4.2: Performance measures for numeric prediction [86]

Mean absolute error            $\frac{|p_1 - a_1| + \ldots + |p_n - a_n|}{n}$

Root mean-squared error        $\sqrt{\frac{(p_1 - a_1)^2 + \ldots + (p_n - a_n)^2}{n}}$

Relative absolute error        $\frac{|p_1 - a_1| + \ldots + |p_n - a_n|}{|a_1 - \bar{a}| + \ldots + |a_n - \bar{a}|}$

Relative squared error         $\frac{(p_1 - a_1)^2 + \ldots + (p_n - a_n)^2}{(a_1 - \bar{a})^2 + \ldots + (a_n - \bar{a})^2}$, where $\bar{a} = \frac{1}{n}\sum_i a_i$

Root relative squared error    $\sqrt{\frac{(p_1 - a_1)^2 + \ldots + (p_n - a_n)^2}{(a_1 - \bar{a})^2 + \ldots + (a_n - \bar{a})^2}}$

between the classifications and the true classes and is computed as:

$$k = \frac{totalAccuracy - randomAccuracy}{1 - randomAccuracy} \qquad (4.5)$$

Total accuracy is simply the sum of true positives and true negatives, divided by the total number of items, that is:

$$totalAccuracy = \frac{TP + TN}{TP + TN + FP + FN},$$

where TP is the count of true positives, TN is the count of true negatives, FP is

the count of false positives and FN is the count of false negatives.

Random accuracy is defined as the sum of the products of reference likelihood and

result likelihood for each class. Formally,

$$randomAccuracy = \frac{(TN + FP)(TN + FN) + (FN + TP)(FP + TP)}{Total \cdot Total},$$

where $Total = TP + TN + FP + FN$.
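The measures above can be computed directly from the predictions and the confusion counts, as in the following sketch.

```python
import numpy as np

# Sketch of the performance measures of Table 4.2 and equations (4.4)-(4.5),
# computed from predicted values p, actual values a, and the counts TP, TN, FP, FN.

def regression_measures(p, a):
    p, a = np.asarray(p, float), np.asarray(a, float)
    a_mean = a.mean()
    mae = np.abs(p - a).mean()
    rmse = np.sqrt(((p - a) ** 2).mean())
    rae = np.abs(p - a).sum() / np.abs(a - a_mean).sum()
    rrse = np.sqrt(((p - a) ** 2).sum() / ((a - a_mean) ** 2).sum())
    r = (((p - p.mean()) * (a - a_mean)).sum() /
         np.sqrt(((p - p.mean()) ** 2).sum() * ((a - a_mean) ** 2).sum()))
    return {"MAE": mae, "RMSE": rmse, "RAE": rae, "RRSE": rrse, "r": r}

def kappa(tp, tn, fp, fn):
    total = tp + tn + fp + fn
    total_acc = (tp + tn) / total
    random_acc = ((tn + fp) * (tn + fn) + (fn + tp) * (fp + tp)) / (total * total)
    return (total_acc - random_acc) / (1 - random_acc)
```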

4.5 Results and Analysis

The classification results using all the instances from the three recording trials of subject

B are shown in Tables 4.3–4.6. Tables 4.3 and 4.4 show the results for the stress model

while the other two tables show the results for the relaxation (or relax) model. Figure

4.6 shows a graph of the relative absolute error for the two models using both machine

learning techniques.

From the results of both models it is clear that using all instances from all recording sessions yields poor results, regardless of the classification technique that was used.


Table 4.3: Stress model results using C4.5 on all instances for subject B

Window length                    1       30       60       90      120      150      180      210
No. of instances             20244    18504    16704    14904    13104    11331     9636     8118
Kappa statistic              0.000    0.302    0.316    0.297    0.318    0.327    0.341    0.342
Mean absolute error          0.062    0.213    0.227    0.232    0.227    0.225    0.225    0.223
Root mean squared error      0.176    0.332    0.340    0.343    0.339    0.337    0.337    0.335
Relative absolute error      99.9%    82.4%    81.7%    81.9%    78.7%    78.1%    77.2%    77.1%
Root relative squared error 100.0%    92.3%    91.2%    91.1%    89.3%    88.8%    88.3%    88.1%

Table 4.4: Stress model results using linear regression on all instances for subject B

Window length                    1       30       60       90      120      150      180      210
No. of instances             20244    18504    16704    14904    13104    11331     9636     8118
Correlation coefficient      0.278    0.551    0.574    0.580    0.590    0.600    0.612    0.618
Mean absolute error          0.058    0.115    0.133    0.138    0.147    0.149    0.154    0.158
Root mean squared error      0.079    0.141    0.160    0.165    0.175    0.176    0.182    0.186
Relative absolute error      95.0%    84.8%    85.1%    86.0%    86.4%    86.8%    86.9%    87.4%
Root relative squared error  96.2%    83.5%    82.0%    81.5%    80.8%    80.0%    79.2%    78.7%

Table 4.5: Relax model results using C4.5 on all instances for subject B

Window length                    1       30       60       90      120      150      180      210
No. of instances             20244    18504    16704    14904    13104    11331     9636     8118
Kappa statistic              0.000    0.188    0.214    0.219    0.272    0.286    0.270    0.296
Mean absolute error          0.024    0.156    0.179    0.182    0.191    0.197    0.194    0.225
Root mean squared error      0.109    0.282    0.301    0.303    0.311    0.315    0.313    0.337
Relative absolute error      99.7%    90.8%    88.7%    87.2%    86.8%    86.4%    85.4%    79.6%
Root relative squared error 100.0%    96.1%    94.7%    93.9%    93.7%    93.4%    92.9%    89.6%

Table 4.6: Relax model results using linear regression on all instances for subject B

Window length                    1       30       60       90      120      150      180      210
No. of instances             20244    18504    16704    14904    13104    11331     9636     8118
Correlation coefficient      0.174    0.442    0.453    0.459    0.464    0.465    0.474    0.481
Mean absolute error          0.040    0.090    0.097    0.099    0.105    0.108    0.106    0.156
Root mean squared error      0.062    0.121    0.127    0.128    0.134    0.137    0.132    0.190
Relative absolute error      98.0%    89.6%    89.0%    88.6%    88.3%    88.4%    87.9%    88.1%
Root relative squared error  98.7%    89.8%    89.3%    89.0%    88.7%    88.7%    88.2%    87.8%


Figure 4.6: Relative absolute error results of stress and relaxation models using all instances for subject B

Nonetheless, performance continuously improves as the window length is increased ex-

cept for the stress model built using linear regression. Models built using linear regression performed better than models constructed using C4.5.

Instead of using all data from the three sessions, separate models are created for each

recording trial dataset. The results for these models can be seen in Tables 4.7–4.14. The

first four tables summarize the results for the sessions of subject A. The last four tables

summarize the results for all recording sessions of subject B. The values presented are

the average values for the performance measures of the 3 sets of models. Figures 4.7 and 4.8 show the graphs of the relative absolute error.

Using only data instances from one recording session, the performance of the models improves dramatically. Similar to the previous results, performance for all models, on average, improves as the window length is increased. When n = 30 (19.2 seconds of

music data), a significant improvement in performance is already obtained. Improve-

ment of the models slows down when a value n ≥ 60 is used. There is also a slight

decrease in performance when n ≥ 210 (2.24 minutes of music data). This decrease is

unavoidable because instances of songs that are shorter than this window length are no

longer included in the classification.

In general, emotion models constructed using C4.5 perform better than models con-


Figure 4.7: Relative absolute error results of stress and relaxation models using C4.5

Figure 4.8: Relative absolute error results of stress and relaxation models using linear regression

structed using linear regression, which is the opposite of the previous classification task

that used all session instances. C4.5 models using instances with n = 30 can already obtain good results. Linear regression models need to use instances with n ≥ 90 to

obtain the same results.

Even though subject A was only able to complete one session, the performance of

the models is similar to that of the other subject.


Table 4.7: Stress model results using linear regression for subject A

Window length                   30       60       90      120      150      180      210
Correlation coefficient      0.474    0.701    0.928    0.940    0.953    0.996    0.967
Mean absolute error          0.025    0.017    0.008    0.008    0.007    0.004    0.005
Root mean squared error      0.333    0.209    0.086    0.082    0.077    0.023    0.068
Relative absolute error      16.4%    12.7%     4.6%     4.2%     3.2%     2.0%     2.3%
Root relative squared error 177.7%   108.8%    39.9%    36.1%    31.9%     9.2%    26.3%

Table 4.8: Stress model results using C4.5 for subject A

Window length                   30       60       90      120      150      180      210
Kappa statistic              0.948    0.970    0.984    0.989    0.985    0.986    0.986
Mean absolute error          0.013    0.008    0.005    0.004    0.005    0.004    0.005
Root mean squared error      0.106    0.084    0.063    0.054    0.062    0.063    0.064
Relative absolute error       5.8%     3.3%     1.8%     1.3%     1.7%     1.5%     1.6%
Root relative squared error  31.4%    23.7%    17.6%    14.9%    17.0%    17.0%    16.8%

Table 4.9: Relax model results using linear regression for subject A

Window length                   30       60       90      120      150      180      210
Correlation coefficient      0.885    0.861    0.837    0.784    0.932    0.964    0.981
Mean absolute error          0.026    0.019    0.012    0.011    0.008    0.007    0.006
Root mean squared error      0.076    0.100    0.124    0.158    0.086    0.066    0.052
Relative absolute error      24.2%    15.7%     8.6%     7.7%     5.2%     4.0%     3.0%
Root relative squared error  50.6%    58.0%    65.4%    78.4%    38.9%    27.3%    19.6%

Table 4.10: Relax model results using C4.5 for subject A

Window length                   30       60       90      120      150      180      210
Kappa statistic              0.918    0.956    0.940    0.948    0.977    0.995    0.983
Mean absolute error          0.011    0.006    0.008    0.008    0.004    0.001    0.003
Root mean squared error      0.098    0.074    0.086    0.082    0.055    0.054    0.053
Relative absolute error       8.7%     4.8%     6.3%     5.7%     2.4%     0.5%     1.8%
Root relative squared error  39.2%    28.7%    33.8%    31.5%    20.3%     9.7%    18.5%

Table 4.11: Stress model averaged results using linear regression for subject B

Window length                   30       60       90      120      150      180      210
Correlation coefficient      0.937    0.980    0.992    0.996    0.998    0.998    0.998
Mean absolute error          0.048    0.029    0.019    0.014    0.011    0.009    0.008
Root mean squared error      0.063    0.038    0.025    0.019    0.016    0.016    0.017
Relative absolute error      32.6%    17.7%    11.0%     7.6%     5.7%     4.6%     4.0%
Root relative squared error  35.0%    19.5%    12.2%     8.5%     6.8%     6.5%     6.6%

Table 4.12: Stress model averaged results using C4.5 for subject B

Window length                   30       60       90      120      150      180      210
Kappa statistic              0.866    0.924    0.944    0.961    0.965    0.975    0.974
Mean absolute error          0.039    0.023    0.017    0.013    0.011    0.008    0.008
Root mean squared error      0.186    0.143    0.123    0.106    0.099    0.083    0.082
Relative absolute error      14.5%     8.2%     6.0%     4.4%     3.8%     2.7%     2.7%
Root relative squared error  50.6%    38.0%    32.4%    27.7%    26.0%    21.7%    21.0%


Table 4.13: Relax model averaged results using linear regression for subject B

Window length                   30       60       90      120      150      180      210
Correlation coefficient      0.920    0.976    0.990    0.995    0.996    0.998    0.998
Mean absolute error          0.045    0.029    0.020    0.015    0.012    0.009    0.009
Root mean squared error      0.062    0.039    0.027    0.020    0.019    0.014    0.015
Relative absolute error      37.2%    20.2%    13.0%     9.6%     7.7%     5.7%     4.9%
Root relative squared error  38.9%    21.3%    13.6%    10.2%     9.1%     6.7%     6.7%

Table 4.14: Relax model averaged results using C4.5 for subject B

Window length                   30       60       90      120      150      180      210
Kappa statistic              0.869    0.927    0.940    0.953    0.962    0.962    0.968
Mean absolute error          0.032    0.020    0.018    0.013    0.011    0.011    0.010
Root mean squared error      0.168    0.134    0.124    0.108    0.099    0.097    0.091
Relative absolute error      14.1%     8.0%     6.6%     5.1%     4.2%     4.1%     3.5%
Root relative squared error  49.9%    37.6%    33.9%    29.9%    27.3%    26.9%    25.0%

4.5.1 Consistency of EEG Readings

One possible reason for the poor performance of the model that uses all instances could be the inconsistent emotion annotations recorded for the same song. Addi-

tional analysis was performed to compare the three emotion annotations of the songs

selected by subject B.

The three ESAM emotion annotations were compared with each other using dynamic

time warping (DTW) [87], an established technique originally used in speech recogni-

tion. A visual comparison of randomly selected song annotations revealed that some

annotations had a similar shape but did not occur at the same time. Figure 4.9 shows

the relaxation annotations for song 49 (The Christmas Song) taken from the three dif-

ferent recording trials. First, it can be seen that the emotion annotation for the same

song is not exactly the same. However, there are noticeable similarities at certain parts

of the song. The three highlighted areas in the figure show the presence of high peaks

occurring in similar areas of the song, but these peaks do not occur at exactly the same

time. By using DTW, the optimal alignment between two emotion annotation sequences

can be computed to cope with the time delays observed in the data.

DTW is used to compare the emotion annotations for each pair of trials. The

results of the comparisons are shown in Tables 4.15 and 4.16.
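A minimal DTW sketch is given below. The text does not specify how the DTW cost was converted into the similarity values reported in Tables 4.15 and 4.16, so the normalization in the last line is only one plausible choice and should be read as an assumption.

```python
import numpy as np

# Classic dynamic-programming DTW between two emotion annotation sequences.

def dtw_distance(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def similarity(x, y):
    # one possible mapping of DTW cost to a [0, 1] similarity score (assumed)
    return 1.0 / (1.0 + dtw_distance(x, y) / max(len(x), len(y)))
```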


Figure 4.9: Visual comparison of relaxation readings while subject B listens to song 49

Results show that the similarity between the sessions is above average. The average similarity measure for the Stress annotations is 0.80 and the average similarity for the Relaxation annotations is 0.824. This indicates that only a few songs were consistently annotated using the EEG readings. These findings could explain the poor classification

results when the instances from all trial sessions are used. This also raises a concern

on how feasible it is to actually use physiological sensors to obtain consistent results for

music emotion. Every time a subject listens to a song, the emotions induced will be

slightly different.

The average similarity values for each song were also compared against the manual

emotion annotations. The results of correlating these values are summarized in Table

4.17. The relaxation annotations were found to have a 0.65 correlation coefficient when

compared with the familiarity annotations. This suggests that the relaxation annota-

tions are more consistent when the subject is more familiar with the song.

4.5.2 Influence of window length

The classification results show that model accuracy is highly dependent on the param-

eters of the windowing technique. Increasing the window length allows more music


Table 4.15: Similarity between session data using Stress annotations

Song    Ses. 1 & 2    Ses. 1 & 3    Ses. 2 & 3    Average    Std dev
001        0.88          0.89          0.83         0.865      0.022
013        0.85          0.81          0.81         0.822      0.018
014        0.82          0.68          0.71         0.737      0.060
015        0.79          0.75          0.80         0.778      0.019
017        0.73          0.81          0.74         0.762      0.035
024        0.76          0.75          0.75         0.752      0.003
028        0.77          0.69          0.74         0.734      0.031
039        0.91          0.77          0.79         0.823      0.062
045        0.83          0.82          0.85         0.833      0.011
049        0.83          0.72          0.78         0.776      0.047
051        0.83          0.84          0.78         0.820      0.027
056        0.83          0.81          0.82         0.820      0.008
072        0.73          0.86          0.76         0.783      0.055
073        0.78          0.79          0.74         0.770      0.022
076        0.81          0.76          0.84         0.804      0.034
084        0.86          0.80          0.84         0.833      0.021
092        0.81          0.75          0.77         0.778      0.025
100        0.85          0.82          0.79         0.820      0.026
105        0.76          0.76          0.81         0.779      0.025
111        0.86          0.83          0.83         0.841      0.013

Average    0.814         0.786         0.790
Std dev    0.047         0.052         0.040
Max        0.91          0.89          0.85
Min        0.73          0.69          0.71


Table 4.16: Similarity between session data using Relaxation annotations

Song    Ses. 1 & 2    Ses. 1 & 3    Ses. 2 & 3    Average    Std dev
001        0.89          0.83          0.88         0.866      0.026
013        0.78          0.72          0.88         0.795      0.065
014        0.75          0.78          0.93         0.820      0.076
015        0.81          0.95          0.79         0.849      0.070
017        0.89          0.90          0.91         0.898      0.008
024        0.85          0.83          0.85         0.844      0.010
028        0.67          0.62          0.77         0.687      0.060
039        0.79          0.81          0.67         0.756      0.062
045        0.83          0.79          0.78         0.798      0.021
049        0.84          0.83          0.81         0.825      0.012
051        0.82          0.82          0.91         0.848      0.041
056        0.83          0.80          0.74         0.790      0.038
072        0.88          0.88          0.92         0.893      0.021
073        0.85          0.80          0.85         0.835      0.023
076        0.87          0.84          0.88         0.860      0.015
084        0.72          0.74          0.88         0.780      0.070
092        0.83          0.86          0.91         0.867      0.032
100        0.77          0.80          0.86         0.811      0.040
105        0.85          0.85          0.92         0.874      0.030
111        0.81          0.78          0.76         0.782      0.018

Average    0.816         0.812         0.844
Std dev    0.055         0.066         0.070
Max        0.89          0.95          0.93
Min        0.67          0.62          0.67

Table 4.17: Correlation between average similarity and manual annotations

               Average Similarity (Stress)    Average Similarity (Relaxation)
Relaxation               -0.05                              0.26
Stress                    0.02                             -0.17
Familiarity               -0.1                              0.65


Figure 4.10: Mean of emotion values for different window lengths

information to be included in the instances making each more distinguishable from in-

stances of other classes. All the data of subject B are examined more closely to have a

better understanding of the window length's influence on model performance.

Increasing the window length also affects the emotion annotations. ESAM was con-

figured to produce emotion vectors having positive values. Since most of the emotion

values are near zero, the average emotion values for the windows are also low. Fig-

ure 4.10 shows the steady increase of the values used for emotion labels as the window

length is increased. The standard deviation follows a similar trend, as shown in Fig. 4.11. Using larger window lengths also diversifies the emotion labels, which, in turn, contributes to better accuracy.

The low average values also affected the discretization of the emotion labels for

C4.5. It resulted in a majority class. Table 4.18 shows that class 1 is usually the majority class for the data set. With a small window length, more instances are labelled with an emotion value close to 0. However, as the window length is increased, the class distribution steadily balances out. For example, at n = 30, 43% of the stress model's data is

labelled as class 1, but when n = 90, it is only 31.5%. This is similarly observed for the

relaxation data but to a lesser degree.


Figure 4.11: Standard deviation of emotion values for different window lengths

Table 4.18: Class sizes for Stress (S) and Relaxation (R) data after discretization

             n = 30              n = 60              n = 90              n = 120
Class No.    S        R          S        R          S        R          S        R
1            43.3%    72.6%      34.3%    65.0%      31.5%    63.2%      29.1%    60.0%
2            37.5%    21.7%      37.9%    25.5%      38.8%    26.9%      39.0%    28.3%
3            15.4%     5.1%      18.8%     8.5%      18.6%     8.8%      17.3%    10.0%
4             3.4%     0.8%       7.4%     0.6%       8.9%     0.5%      10.7%     1.2%
5             0.4%     0.2%       1.6%     0.4%       2.3%     0.6%       3.8%     0.5%


4.5.3 Important features used in C4.5

C4.5 builds a decision tree by finding the features in the data that most effectively split the data into subsets enriched in one class or the other. A useful side effect is the identification of the music features that are most beneficial for classifying emotions.

Table 4.19 summarizes the features included in the trees generated by the algorithm

using n = 60. The items are ordered according to the number of features present in the

decision trees. A large portion of the features included are rhythmic features, averaging 34.5% of the feature set. This is followed by pitch features. The features used for

modelling stress and relaxation are similar since these two emotional states can be

considered as opposites.

A closer inspection of the decision tree reveals that each emotion can be classified

faster using a different ordering of music features. Table 4.20 shows the distribution of

features found in the first 5 levels of the different decision trees. The stress model of

subject A mostly uses rhythmic features and some melody, instrumentation and dynamic

features. During the interview, when asked which parts of the songs were stressful, the subject explained that songs with electric guitar, and rock songs in general, were very stressful. The rock songs used in the dataset have fast tempos, which may be a factor in the construction of the decision tree.

For subject B, aside from rhythmic features, pitch and instrumentation features are

also deemed important. During the interview, the subject mentioned that the high-pitched sound of the MIDI files irritated him.

Table 4.19: Distribution of features in decision trees (n = 60)

                   Subject A                     Subject B
Category           Stress     Relaxation        Stress     Relaxation    Average
Rhythm             39.6%      32.6%             32.6%      33.3%         34.5%
Pitch              26.1%      28.3%             28.3%      30.8%         28.4%
Melody             13.5%      15.5%             15.5%      13.2%         14.4%
Instrumentation    13.5%      12.8%             12.8%      11.3%         12.6%
Texture             2.7%       5.3%              5.3%       5.7%          4.8%
Others              2.7%       4.3%              4.3%       3.8%          3.8%
Dynamics            1.8%       1.1%              1.1%       1.9%          1.5%


Table 4.20: Feature distribution for the first 5 levels of C4.5 decision trees (n = 60)

                   Subject A                     Subject B
Category           Stress     Relaxation        Stress     Relaxation
Rhythm             23.4%      13.5%             47.1%      33.3%
Pitch               0.0%      10.8%             11.8%      28.6%
Melody              4.3%       2.7%             11.8%       9.5%
Instrumentation     4.3%       2.7%             23.5%       4.8%
Texture             0.0%       0.0%              0.0%       9.5%
Dynamics            2.1%       2.7%              0.0%       9.5%
Others              0.0%       2.7%              5.9%       4.8%

For relaxing music, both subjects mentioned that there were specific parts of the songs that made them more relaxing. These include introductory parts, transitions between choruses and verses, piano and harp instrumentals, and climactic parts of the song (i.e., the last verse-chorus or bridge). Examining the decision tree for relaxation, the Melodic Interval Histogram, Basic Pitch Histogram, and Music Position features are used for the first three levels; these features support the statements of the subjects.

4.5.4 Accuracy of Emotion labels

The manual emotion labels were also compared to the emotion values from ESAM.

The average emotion value for each song was calculated and transformed into a 5-point

scale. Table 4.21 shows how the manual annotations differ from the discretized continuous annotations.

Only 22.5% of the emotion labels from ESAM were the same as the manual annotations, 45% of the labels derived from EEG differed slightly from the manual annotations, and 5% were completely opposite to what was reported by the subjects. It is difficult

to attribute the discrepancy to a single source of error. One possible cause could be the methodology

for manual annotations. While the subjects were doing the manual annotations, it was

observed that usually, they would only listen to the first 30 seconds of the song and in

some cases skip to the middle of the song. It is possible that the manual annotation

incompletely represents the emotion of the entire song.

It is also possible that the subjects experienced a different kind of emotion uncon-

sciously while listening to the music and these were recorded using EEG. For example


some songs that were reported to be stressful turned out not to be stressful at all. The emo-

tion annotations were compared with the manual annotations to see if there was any

dependency between the values.

Table 4.22 shows that the subjects treated the emotion Stress as the bipolar opposite of Relaxation, as indicated by the high negative correlation value. Using ESAM, a

similar situation can be observed but there is only a moderate negative correlation

between the two, as shown in Table 4.23. For the other emotions, Joy has a positive correlation

with Relaxation and a negative correlation with Stress. This is consistently reported for

both manual annotations and annotations using ESAM.

Finally, the amount of discrepancy between manual and automated annotations was

compared against the subject’s familiarity with the song. The discrepancy values for

joyful and relaxing songs have a high correlation with familiarity: 0.61 for Joy and

0.60 for Relaxation. This implies that measurements of ESAM for Joy and Relaxation

become more accurate when the subject is not familiar with the songs. It is possible

that unfamiliar songs will help induce stronger emotions as compared to familiar music.

This may be an important factor when using psycho-physiological sensors in measuring

emotion.

Table 4.21: Comparison of manual annotations and discretized ESAM annotations

Difference    Stress      Relaxation    Total
0             6 (30%)     3 (15%)       9 (22.5%)
±1            10 (50%)    8 (40%)       18 (45.0%)
±2            3 (15%)     2 (10%)       5 (12.5%)
±3            0 (0%)      6 (30%)       6 (15.0%)
±4            1 (5%)      1 (5%)        2 (5.0%)

Table 4.22: Correlation of manual annotations

               Joy      Sadness    Relaxation    Stress
Sadness       -0.56
Relaxation     0.59      0.07
Stress        -0.62     -0.06      -0.98
Familiarity    0.72     -0.25       0.56         -0.63


Table 4.23: Correlation of annotations using ESAM

               Joy      Sadness    Relaxation    Stress
Sadness       -0.12
Relaxation     0.46     -0.23
Stress        -0.45      0.31      -0.42
Familiarity   -0.06      0.30      -0.23          0.57

4.6 Summary

This experiment focused on building an emotion model for relaxing and stressful music.

The model was built by extracting high-level music features from MIDI files using a

windowing technique. The features were labelled using emotion values generated using

EEG data and ESAM. The emotion annotations of ESAM were also compared against

manual emotion annotations by the two subjects. With the help of interviews conducted

with the subjects, it was established that EEG and ESAM can be used for obtaining

continuous-valued emotion annotations for music, especially when the subject experiences a strong intensity of that emotion. Familiarity of the subject with the song can affect genuine emotions as well as the consistency of emotions over an extended period of time.

Linear regression and C4.5 were used to build the different emotion models. Using 10-fold cross-validation to evaluate the models, high performance was obtained with large window lengths encompassing between 38.4 seconds (n = 60) and 57.6 seconds (n = 90) of music.
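As a rough, non-authoritative sketch of this evaluation protocol, the snippet below runs 10-fold cross-validation in scikit-learn, with a generic decision tree standing in for C4.5 and randomly generated arrays standing in for the extracted music features and their emotion labels. It illustrates the procedure only; it is not the actual experimental pipeline.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Random stand-ins for the windowed high-level music features and their
# emotion labels (continuous labels for regression, discretized labels for
# classification). These are placeholders, not the experimental data.
X = rng.normal(size=(200, 20))
y_continuous = rng.normal(size=200)
y_discrete = rng.integers(0, 2, size=200)

# 10-fold cross-validation, mirroring the evaluation protocol described above.
reg_scores = cross_val_score(LinearRegression(), X, y_continuous,
                             cv=10, scoring="r2")
tree_scores = cross_val_score(DecisionTreeClassifier(random_state=0),
                              X, y_discrete, cv=10, scoring="accuracy")

print(f"linear regression, mean R^2 across folds: {reg_scores.mean():.3f}")
print(f"decision tree, mean accuracy across folds: {tree_scores.mean():.3f}")
```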


Chapter 5

Conclusion

5.1 Summary

This research investigated the use of psycho-physiological data for modelling emotions induced by music. This was achieved by performing two experiments that used different physiological sensors to capture psycho-physiological data from subjects listening to emotion-inducing music. The first experiment used a motif discovery algorithm to identify frequently occurring patterns in blood volume pulse and respiration rate data. These patterns were mapped to the frequently occurring chord progressions in the music dataset. Analysis of the mappings revealed that common chord progressions in popular music are associated with consistent physiological reactions. Unfortunately, these reactions could not be annotated with emotion labels using existing techniques.
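To make the notion of a motif concrete, recall that a time series motif (see the Glossary) is a pair of equal-length subsequences of a signal that are very similar to each other. The brute-force sketch below finds such a pair in a synthetic signal; it merely illustrates the definition and is far less efficient than exact methods such as the MK (Mueen-Keogh) algorithm listed in the Acronyms.

```python
import numpy as np

def brute_force_motif(series, m):
    """Return (distance, i, j) for the two most similar non-overlapping
    subsequences of length m, using z-normalized Euclidean distance."""
    def znorm(x):
        return (x - x.mean()) / (x.std() + 1e-12)

    windows = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    best = (np.inf, None, None)
    for i in range(len(windows)):
        for j in range(i + m, len(windows)):  # enforce non-overlap
            d = np.linalg.norm(windows[i] - windows[j])
            if d < best[0]:
                best = (d, i, j)
    return best

# Synthetic stand-in for a physiological signal such as blood volume pulse.
rng = np.random.default_rng(1)
signal = np.sin(np.linspace(0, 40, 800)) + 0.3 * rng.normal(size=800)

dist, i, j = brute_force_motif(signal, m=50)
print(f"motif occurrences start at indices {i} and {j} (distance {dist:.3f})")
```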

The second experiment addressed the issues identified in the first experiment. The data collection methodology was made more stringent and included more participants. With the help of these modifications, a dataset containing physiological readings, music features, and emotion annotations was produced. Additional high-level music features were extracted from symbolic music using a windowing technique, and continuous-valued emotion labels were used to annotate all songs. Through these two experiments, it was shown that physiological sensors coupled with high-level music features can be used to identify emotions in music with some degree of accuracy. However, this research is limited by its reliance on MIDI files and on equipment that requires training and several operating constraints to be used accurately.


It was also shown that the accuracy of the classifiers was greatly influenced by the length of music used for feature extraction. Ideally, as much information from the music as possible should be used for classification. However, as experienced in this research, collecting data for entire songs requires considerable time and resources.

Finally, through analysis of the electroencephalogram-derived emotion annotations, it was shown that the subject's familiarity with the music stimuli affects the consistency and intensity of the emotion that is experienced. This affects the manner in which future music emotion research should be carried out: familiarity can determine how much weight should be placed on emotion annotations, and annotations given for familiar songs can be used with a higher degree of confidence than annotations given for unfamiliar songs.

5.2 Recommendations for future work

Improvements to the emotion models can be achieved by modifying different aspects of this research. Changing the manner in which the emotion labels were derived for annotating the different music segments, or using a different classification algorithm, could improve model accuracy.

The dataset that has been prepared contains all segments of the music. It is therefore possible to investigate how specific music segments contribute to the emotion experienced by a listener, for example by comparing how emotion changes when the listener hears the verse or the chorus of a song.

Preliminary experiments performed during the first experiment hinted that tempo might play a role in identifying the proper window length. Further research is needed to determine how much influence this music feature has on the window length values. The hypothesis is that slower songs require longer window lengths to capture the same amount of information as fast songs.
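As a simple numerical illustration of this hypothesis, the snippet below computes how long a window must be, in seconds, to cover a fixed number of beats at different tempi. The assumption that the informative content of a window scales with the number of beats it contains is made here purely for illustration.

```python
# Window duration (in seconds) needed to span a fixed number of beats at
# different tempi. The target of 60 beats is an arbitrary illustrative value.
target_beats = 60

for tempo_bpm in (60, 90, 120, 180):
    window_seconds = target_beats * 60.0 / tempo_bpm
    print(f"{tempo_bpm:>3} bpm: {window_seconds:5.1f} s to cover {target_beats} beats")
```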

Finally, as mentioned in the earlier section, the current research uses MIDI files as the source of high-level music information. Parallel research on extracting the same music features from audio files would help validate the results discussed in this research.




Appendix A

MIDI Files for EEG Experiments


Number | Song Title | Language | Time | Genre | Avg. Tempo | No. Unique Chords | Key
1 | ALOHA OE | English | 236.0 | Folk | 71 | 5 | G
2 | Forca | English | 221.7 | Folk | 123 | 3 | A♭
3 | Greensleeves | English | 165.2 | Folk | 120 | 23 | Dm
4 | Santa Lucia | English | 168.5 | Folk | 67 | 7 | D
5 | jinsei ni namida ari | Japanese | 166.4 | Folk | 111 | 5 | Cm
6 | aoge ba ttoshi | Japanese | 256.7 | Folk | 91 | 9 | D
7 | anta no kad | Japanese | 286.0 | Folk | 134 | 4 | E
8 | kagome kagome | Japanese | 54.7 | Folk | 72 | 2 | Am
9 | koko wa minatoch | Japanese | 235.0 | Folk | 117 | 7 | B♭m
10 | sasori za no onna | Japanese | 191.6 | Folk | 136 | 10 | Em
11 | shabon dama | Japanese | 36.5 | Folk | 98 | 3 | E♭
12 | jon kara onna bushi | Japanese | 278.8 | Folk | 128 | 8 | Dm
13 | hagure kokiriko | Japanese | 311.9 | Folk | 70 | 11 | Fm
14 | yo sa koi fushi | Japanese | 92.7 | Folk | 83 | 1 | G♯m
15 | hawai no kekkon no uta | Japanese | 211.8 | Folk | 88 | 12 | B♭
16 | pardo | Japanese | 292.7 | Folk | 149 | 11 | C
17 | ichi ken | Japanese | 280.7 | Folk | 64 | 10 | Am
18 | shiranui shu | Japanese | 227.0 | Folk | 69 | 6 | B♭m
19 | kydai sen | Japanese | 213.2 | Folk | 71 | 4 | Cm
20 | j nin no indian | Japanese | 42.8 | Folk | 132 | 2 | F
21 | aish sabaku | Japanese | 252.2 | Folk | 62 | 14 | E♭m
22 | amagi koe | Japanese | 277.5 | Folk | 75 | 13 | Bm
23 | shwa karesusuki | Japanese | 271.6 | Folk | 82 | 5 | C♯m
24 | ki no ii ahiru | Japanese | 114.0 | Folk | 105 | 5 | F
25 | naniwabushi da yo jinsei wa | Japanese | 235.5 | Folk | 100 | 8 | D
26 | hakone hachi sato no hanjir | Japanese | 288.3 | Folk | 62 | 5 | D♭
27 | kjnotsuki | Japanese | 209.8 | Folk | 81 | 10 | Cm
28 | akatonbo | Japanese | 124.3 | Folk | 63 | 7 | E♭
29 | ett tsuba me | Japanese | 236.5 | Folk | 70 | 10 | B♭m
30 | trya nse | Japanese | 83.8 | Folk | 74 | 9 | Cm
31 | ae te... yokohama | Japanese | 264.3 | Folk | 77 | 10 | E♭m
32 | tokai no komori ka | Japanese | 286.6 | Folk | 78 | 11 | Bm
33 | takasugi shinsaku | Japanese | 275.5 | Folk | 51 | 7 | D♭
34 | Avalon | English | 110.3 | Jazz | 135 | 16 | E♭
35 | Creepin' In | English | 183.0 | Jazz | 94 | 6 | B♭
36 | Don't Know Why | English | 185.2 | Jazz | 80 | 12 | B♭
37 | Don't You Worry 'Bout A Thing | English | 312.0 | Jazz | 108 | 28 | E♭m
38 | For Sentimental Reasons | English | 178.7 | Jazz | 67 | 26 | D♭
39 | In The Midnight Hour | English | 149.4 | Jazz | 111 | 6 | E
40 | Kissing A Fool | English | 274.7 | Jazz | 77 | 16 | D
41 | My Way | English | 269.0 | Jazz | 69 | 12 | D
42 | Route 66 | English | 221.9 | Jazz | 183 | 20 | G
43 | South Of The Border | English | 170.4 | Jazz | 132 | 21 | C
44 | Spinning Wheel | English | 240.2 | Jazz | 137 | 21 | G
45 | Still Crazy For You | English | 277.1 | Jazz | 64 | 24 | E♭
46 | Straighten Up And Fly Right | English | 157.3 | Jazz | 152 | 24 | A♭
47 | Sunrise | English | 199.9 | Jazz | 75 | 8 | E♭
48 | Sweet Lorraine | English | 276.8 | Jazz | 74 | 46 | G
49 | The Christmas Song | English | 197.0 | Jazz | 64 | 25 | D♭
50 | Those Sweet Words | English | 204.5 | Jazz | 101 | 12 | A
51 | White Christmas | English | 184.7 | Jazz | 88 | 20 | A
52 | Young At Heart | English | 193.5 | Jazz | 78 | 34 | G
53 | maripsa | Japanese | 194.4 | Jazz | 202 | 21 | A♭m
54 | ABC | English | 180.9 | Pop | 93 | 6 | A♭
55 | Another Day In Paradise | English | 328.0 | Pop | 96 | 10 | Fm
56 | Change The World | English | 236.4 | Pop | 93 | 15 | E
57 | Christian | English | 311.0 | Pop | 95 | 8 | A
58 | Daydream Believer | English | 178.3 | Pop | 128 | 14 | G
59 | Eight Days A Week | English | 164.6 | Pop | 138 | 5 | D
60 | Emotions | English | 249.3 | Pop | 116 | 13 | Am
61 | Everything She Wants | English | 300.0 | Pop | 115 | 11 | F♯
62 | Eyes to me | English | 268.7 | Pop | 114 | 12 | D
63 | GET BACK | English | 194.6 | Pop | 122 | 4 | A
64 | Green Green Grass Of Home | English | 181.9 | Pop | 84 | 8 | A♭
65 | Honesty | English | 228.1 | Pop | 62 | 25 | B♭
66 | I Feel The Earth Move | English | 182.7 | Pop | 117 | 7 | Cm
67 | I'll Make Love To You | English | 240.4 | Pop | 71 | 11 | D
68 | It's Now Or Never | English | 194.2 | Pop | 103 | 7 | E
69 | It's Over | English | 282.7 | Pop | 64 | 24 | D♭
70 | OB | English | 189.4 | Pop | 113 | 6 | B♭
71 | SHAKE | English | 250.6 | Pop | 128 | 26 | E♭
72 | Stand By Me | English | 178.0 | Pop | 120 | 4 | A
73 | Superstition | English | 249.3 | Pop | 104 | 6 | E♭m
74 | Thriller | English | 352.8 | Pop | 62 | 17 | C♯m
75 | Waterloo | English | 164.1 | Pop | 147 | 5 | D
76 | Yesterday | English | 126.9 | Pop | 87 | 14 | F
77 | You Can't Hurry Love | English | 171.1 | Pop | 95 | 5 | B♭
78 | i. ke. na. i rju majikku | Japanese | 179.6 | Pop | 135 | 9 | E
79 | soshite kbe | Japanese | 172.2 | Pop | 94 | 14 | Em
80 | soyokaze no ywaku | Japanese | 212.0 | Pop | 117 | 17 | C
81 | dango 3 kydai | Japanese | 124.9 | Pop | 137 | 16 | Cm
82 | donna toki mo | Japanese | 295.1 | Pop | 123 | 11 | F♯
83 | aka tsuki no shi | Japanese | 271.5 | Pop | 81 | 25 | A
84 | ajia no junshin | Japanese | 253.8 | Pop | 115 | 13 | E
85 | gogo no pardo | Japanese | 339.5 | Pop | 126 | 15 | Em
86 | aika (erej) | Japanese | 323.9 | Pop | 72 | 18 | Em
87 | natsu shoku | Japanese | 199.6 | Pop | 119 | 14 | E♭
88 | gake no ue no ponyo | Japanese | 163.1 | Pop | 120 | 16 | F
89 | ai ga umare ta hi | Japanese | 320.1 | Pop | 79 | 16 | Em
90 | toki no nagare ni mi o makase | Japanese | 248.3 | Pop | 63 | 13 | F
91 | oborozukiyo inori | Japanese | 348.6 | Pop | 78 | 12 | Am
92 | mirai yos zu ni | Japanese | 412.5 | Pop | 71 | 17 | G♭
93 | aki no kehai | Japanese | 229.4 | Pop | 121 | 14 | G
94 | warabegami yamatoguchi | Japanese | 282.8 | Pop | 74 | 14 | D♭
95 | kekkon shiyo u yo | Japanese | 164.5 | Pop | 96 | 6 | E♭
96 | sake to namida to otoko to onna | Japanese | 248.4 | Pop | 57 | 7 | A
97 | ao no rekuiemu | Japanese | 264.1 | Pop | 72 | 19 | E
98 | (I Can't Get No) Satisfaction | English | 218.5 | Rock | 110 | 4 | E
99 | BURN | English | 311.0 | Rock | 114 | 14 | Am
100 | Cross Roads | English | 245.6 | Rock | 132 | 3 | A
101 | Don't Tell Me | English | 204.3 | Rock | 72 | 10 | E
102 | EASY COME EASY GO ! | English | 282.4 | Rock | 114 | 4 | A
103 | HAVE A NICE DAY | English | 229.9 | Rock | 130 | 6 | C♯m
104 | I am a Father | English | 269.1 | Rock | 153 | 18 | G
105 | I Can't Tell You Why | English | 290.4 | Rock | 85 | 6 | Bm
106 | I Was Born To Love You | English | 284.8 | Rock | 139 | 16 | A♭
107 | Love Comes Quickly | English | 261.1 | Rock | 109 | 9 | Bm
108 | Maybe Blue | English | 256.4 | Rock | 120 | 17 | Bm
109 | Rock and Roll | English | 220.7 | Rock | 168 | 4 | A
110 | sailing day | English | 241.6 | Rock | 190 | 14 | E♭
111 | STAY TUNED | English | 215.2 | Rock | 146 | 17 | E
112 | Warning Sign | English | 331.0 | Rock | 70 | 8 | B♭
113 | youthful days | English | 318.6 | Rock | 148 | 13 | E
114 | anata no yasashi sa o ore wa nani ni tatoeyo u | Japanese | 218.4 | Rock | 100 | 8 | E
115 | natsuyasumi | Japanese | 280.7 | Rock | 132 | 13 | F♯m
116 | ee nen | Japanese | 211.4 | Rock | 173 | 8 | D
117 | girigiri chop | Japanese | 236.2 | Rock | 235 | 8 | Am
118 | shsgmu | Japanese | 269.1 | Rock | 135 | 11 | E
119 | rpu & rpu | Japanese | 225.0 | Rock | 129 | 7 | E♭
120 | rokkunrru | Japanese | 250.7 | Rock | 121 | 8 | A
121 | rokkunrru doraggu | Japanese | 155.7 | Rock | 157 | 6 | Gm


Appendix B

Consent Form for Data Collection


Glossary

chord progression is another term for chord sequence or harmonic progression.

chord sequence is a series of musical chords played one after the other; it aims to establish (or contradict) a tonality founded on a key, root, or tonic chord and is based upon a succession of root relationships.

chromatic refers to structures derived from the chromatic scale, which consists of all semitones.

chromatic scale is a musical scale with twelve pitches, each a semitone above or below another.

circle of fifths is a visual representation of the relationships among the 12 tones of the chromatic scale, their corresponding key signatures, and the associated major and minor keys; it geometrically represents the relationships among the 12 pitch classes of the chromatic scale in pitch class space.

diatonic refers to musical elements derived from the modes and transpositions of the “white note scale” C-D-E-F-G-A-B.

emotion is the generic term for a subjective, conscious experience that is characterized by psycho-physiological expressions, biological reactions, and mental states.

ground truth is a term in machine learning that refers to the accuracy of the training set’s classification for supervised learning techniques.

motif is a short term for time series motif.

octave is the interval between one musical pitch and another with half or double its frequency.

perfect fifth is a musical interval encompassing five staff positions or seven semitones; for example, the interval from C to G is a perfect fifth.

physiology is a branch of biology that deals with the functions and activities of life or of living matter and of the physical and chemical phenomena involved.

pitch is a perceptual property in music that allows the ordering of sounds on a frequency-related scale.

pitch class is a set of all pitches that are a whole number of octaves apart, e.g., the pitch class C consists of the Cs in all octaves.

rhythm is the movement in the music marked by the regulated succession of strong and weak melodic and harmonic beats.

supervised learning is the machine learning task of inferring a function from labelled training data.

time series is a sequence of data points, measured typically at successive time instants spaced at uniform time intervals.

time series motif is a pair of subsequences having the same length that are very similar to each other.

tonic is the first note of a diatonic scale; also known as the keynote or harmonic center.

triadic refers to a chord of three tones (a triad), especially one built on a given root tone plus a major or minor third and a perfect fifth.

white noise is a random signal with a flat power spectral density; the signal contains equal power within a fixed bandwidth at any center frequency.


Acronyms

ANS Autonomic nervous system.

BVP Blood volume pulse.

CAUI Constructive adaptive user interface.

DTW Dynamic time warping.

EEG Electroencephalogram.

EMG Electromyogram.

ESAM Emotion spectrum analysis method.

FFT Fast Fourier transform.

HR Heart rate.

MK Mueen-Keogh algorithm.

RR Respiration rate.

SC Skin conductance.

TPS Tonal pitch space.
