45
National University of Tainan 國國國國國國國國國國國國國 國國國國國國國 1 A new approach in the BCI research based on fractal dimension as featu re and Adaboost as classifier JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17. Reza Boostani and Mohammad Hassan Moradi 2005/10/13 M09356002 國國國

JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

Embed Size (px)

DESCRIPTION

A new approach in the BCI research based on fractal dimension as feature and Adaboost as classifier. JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17. Reza Boostani and Mohammad Hassan Moradi 2005/ 10 / 13. M09356002 劉明機. Scheme. BCI 的研究內容? 實驗如何設計? 所用的特徵值與分類器為何? - PowerPoint PPT Presentation

Citation preview

Page 1: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 1

A new approach in the BCI research based on fractal dimension as feature and Adaboost as classifi

er

JOURNAL OF NEURAL ENGINEERING2004 Dec;1(4):212-7. Epub 2004 Nov 17.

Reza Boostani and Mohammad Hassan Moradi

2005/10/13

M09356002 劉明機

Page 2: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 2

Scheme

• BCI 的研究內容?• 實驗如何設計?• 所用的特徵值與分類器為何?• 不同的特徵值與不同的分類器互相組合比較的結果如何?

Page 3: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 3

Abstract• 如何提高心智活動的分類正確率仍然是 BCI領域的熱門研究之一。

• 本論文使用基於碎形維度 fractal dimension當特徵值, Adaboost 當分類器的新方法來增加分類的正確率,並對五各受試者進行測試。

• 本論文同時使用 band power, Hjorth parameters 當特徵值與 LDA 當分類器做比較。

Page 4: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 4

Introduction

• brain computer interface (BCI)– 無法自由行動的併發症 (locked-in-syndrome)– 嚴重脊髓受傷 (severe spinal cord injury)– move his limbs by functional electrical stimulation

controlled with thoughts

• non-invasive EEG as input

• optimize pre-processing, feature extraction and feature classification methods.

Page 5: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 5

Introduction• the Graz-BCI team have employed different features such as b

and power , adaptive autoregressive coefficients and some classifiers including LDA, FIRMLP , LVQ and HMM to improve the classification rate between different motor imagery tasks.They have also used DSLVQand G.A. for feature selection

• Deriche and Al-Ani selected the best feature combination among the variance,AR coefficients, wavelet coefficients, and fractal dimension by a modified mutual information method and classified them by aMLP classifier

• Cincotti et al compared the performance of three classifiers (ANN, Mahalonobis distance and HMM) to classify the band power feature for six subjects

Page 6: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 6

Introduction

• The aim of this paper is to improve the classification rate of a cuebased BCI by a new approach based on fractal dimension and Adaboost classifier

• As a comparison, band power and Hjorth parameters along with LDA are assessed on the same data set (five subjects).

Page 7: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 7

Subjects and data acquisition• Five healthy subjects (L1, o3, k3, f8 and o8)• sat in a relaxing chair with armrests about 1.5 m in fr

ont of the computer screen • Three bipolar EEG channels were recorded from 6 A

g/AgCl electrodes placed 2.5 cm anterior and 2.5 cm posterior to the standardized positions C3, Cz and C4 (international 10–20 system).

• filtered 0.5-70Hz • sample rate 128 Hz

Page 8: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 8

Subjects and data acquisition• Each trial lasted 8 s and started with the presentation of a blan

k screen • A short acoustical warning tone was presented at second 2 and

a fixation cross appeared in the middle of the screen• From second 3 to second 7 an arrow(cue), representing the me

ntal task to perform, was prompted• An arrow pointing either to the left or to the right indicated the

imagination of a left hand or right hand movement • The order of appearance of the arrows was randomized • at second 7 the screen content was erased• The trial finished with the presentation of a randomly selected

inter-trial period (up to 2 s) beginning at second 8

Page 9: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 9

Subjects and data acquisition

• Three sessions were recorded for each subject on three different days.

• Each session consisted of three runs with 40 trials each.

Training paradigm

Page 10: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 10

Feature extraction and classification

• Feature extraction– The goal of feature extraction is to find a suitable r

epresentative of the data that simplify the subsequent classification or detection of brain patterns

– Band power -> frequency bands;– Hjorth parameters ->morphological characteristics– fractal dimension -> entropy

Page 11: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 11

Feature extraction and classification

• Band power (BP)– The EEG contains different specific frequency co

mponents,for example, alpha and beta bands which are particularly important to classify the different brain states, especially for discriminating motor imagery tasks.

• 10–12 Hz,16–24 Hz• 1s time window• 250 ms overlap

Page 12: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 12

Feature extraction and classification

• Hjorth parameters– describe the signal characteristics in terms of

• activity (variance (VAR) of signal)

• mobility (a measure of mean frequency)

• complexity (a measure of the deviation from sine shape)

• 500 ms time window• without overlapping• an exponential window has been applied to the featur

es for smoothing

Page 13: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 13

Feature extraction and classification

• Hjorth parameters2

1( ( ) )

Activity(y) = N

VAR(y')Mobility =

VAR(y)

Mobility(y')Complexity =

Mobility(y)

N

iy i

y : the signaly' : the derivative of the signalN : the number of samples in the windowμ : the mean of the signal in the window

Page 14: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 14

Feature extraction and classification

• Fractal dimension (FD)– Fractal dimension can be interpreted simply as the degree o

f meandering (or roughness or irregularity) of a signal.

– methods for calculating fractal dimension• Katz, Higuchi and Peterson

• Katz method is more robust,implemented in this research

• 1s time window• 250 ms overlap

Page 15: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 15

Feature extraction and classification• Fractal dimension (FD)• 自我重覆性或自我相似性( self-similarity )

– 無論將其形狀放大或縮小至何種程度,它仍然會表現出與原來相似之形狀

– 物體整體外觀與組成部分之形狀所具有的相似性,稱之為碎形。一般而言可分為完全自相似碎形與統計自相似碎形,前者如三角 Koch 曲線,它具有一層層縮小而相似的特性,事實上在自然界找不到像這類完全自我相似的例子,其只能作為數學模型。但後者卻普遍地存在於自然界中,就統計意義上而言,其整體與組成部分是相似的,故稱為統計自相似碎形,如河川、山脈、島嶼及海岸線等 。

• 具有非整數的維度,稱為「碎維維度」 ( fractal dimension ) • 碎維維度的數值越大,其彎曲度越大,故碎維維度是量測

一個幾何結構粗糙程度或複雜程度的指標。

Page 16: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 16

Feature extraction and classification

• Fractal dimension (FD)– Katz fractal dimension

10

10

Log LFD =

Log d

L : the total length of the curved : the diameter estimated as the distance between the first point of the sequence and the point of the sequence that provides the farthest distance

Page 17: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 17

Feature extraction and classification

• Linear discriminant analysis (LDA)– Fisher's linear discriminant is maximizing the betw

een-group variance to the within-group variance ratio,which in this case is measured by the ratio of the determinants of the preceding two matrices.

Page 18: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 18

Feature extraction and classification

• Adaboost– The principle of theAdaboost is that a committee

machine can adaptively adjust to the errors of its components, the so-called weak learners.

• The classification rate of each weak learner should exceed 50%.

• a neural network with one hidden layer is selected as the weak learner.

Page 19: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 19

Feature extraction and classification

• Adaboost• 委員會機器 (Committee Machine) 在類神經網路領域中的應用,主要是用來結合數個單一類神經網路,透過某些投票方式,以及整合的方法,可以將數個單一類神經網路結合,建立一委員會機器;如同社會上各種會議結構,都有一定的議程和投票方式,整合各方委員的意見來決策某些事項。

Page 20: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 20

Feature extraction and classification

• Adaboost

A committee of neural networks generated

using Adaboost

Page 21: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 21

Feature extraction and classification

• Adaboost

Page 22: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 22

Page 23: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 23

Feature extraction and classification

• Adaboost• First, the first neural network trains with the equal e

rror weight for all the samples

D1(i):the error weight for the samples in the first iterationN:the number of input samples.

1( ) 1/ 1,...,D i N i N    

Page 24: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 24

Feature extraction and classification

• Adaboost• For the next iteration, the error weight of the samples is c

hanged regarding their error in the previous iteration by the following relation:

Page 25: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 25

Feature extraction and classification

• Adaboost 1 1

n-1 i n-1

( )11

1

n-1 n-1 n-1F (X ) y

n-1n-1 n-1

n-1

(i)( ) *

= D (i) 0 0.5

= 0 11-

n n i iB if F X dnn otherwise

n

DD i

Z

              

     

       

Fn-1(x): the output of the (n-1)th weak learnerεn-1 :the error of the weak learner in the (n-1)th stepdi :the label of the ith input sampleβn-1 :measures the importance of the hypothesis of Fn-1(x) and it ecreases with error.

Page 26: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 26

Feature extraction and classification

• Adaboost• the misclassified samples in each stage are given a hi

gh value of Dn(i) (error weight) for the next stage.• This iterative procedure repeats till the time that n rea

ches T (the maximum considered value for the number of weak learners).

Page 27: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 27

Feature extraction and classification

• Adaboost• After the training phase, total output of the Adaboost

is calculated as follows:

nn=1...T,F (x)=y n

1(x)=argmax log

Page 28: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 28

Feature extraction and classification

• Adaboost• For this classifier, all the feature values must be norm

alized in the interval [-1, 1].

Page 29: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 29

Evaluation of classification performance

• The training set is evaluated ten times, by tenfold cross validation

• The best classifiers from the evaluation phase have been selected and applied to the test data

• 360 trials have been recorded for each subject• for each subject

– 240 trials for cross validation– 120 trials for the testing phase

• then artefact trials were removed

Page 30: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 30

Evaluation of classification performance

• 交互效度檢定 (cross-validation) 。基本原理是再抽取另一樣本對該模式進行擬合度檢定。

• In bioinformatics, if you have a set of 100 data can be used as a training set to tune your program. You may use 99 of them to train your program and predict the last one. To show this is not a chance, you need to pick another 99 to train your program and predict the last one again. If you do the same thing again and again, each data set will be predicted based on the program trained by the rest 99 data set. The whole process is called "cross validation".

Page 31: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 31

Results

• No combination of the features has been considered in this paper

• BP with LDA yielded the best performance for four subjects (L1, o3, o8, f8)

• k3 for which Hjorth parameters along with both classifiers showed a significant result.

Page 32: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 32

Results

Band powerHjorth

parameters

Fractal

Dimension

LDA Adaboost LDA Adaboost LDA Adaboost

20% 35% 20% 25% 30% 22%

4.75 5.25 4 4 4.25 4.25

Band powerHjorth

parameters

Fractal

Dimension

LDAAdabo

ostLDA

Adaboost

LDAAdabo

ost

15.4% 22% 25% 30% 18% 20%

4.75 4.75 4 4 4.5 4.25

Band powerHjorth

parameters

Fractal

Dimension

LDAAdaboost

LDAAdaboost

LDAAdaboost

13.6%

16% 20% 26%16.5%

18%

4.75 4 4 4.5 4.75 4.5

L1

o1

o8

cross validation data

Page 33: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 33

Results

Band powerHjorth

parameters

Fractal

Dimension

LDAAdaboost

LDAAdaboost

LDAAdaboost

15% 16% 9.6% 10%19.7%

21%

5.25 5 4.5 4 3.5 5.5

Band powerHjorth

parameters

Fractal

Dimension

LDAAdaboost

LDAAdaboost

LDAAdaboost

16.5%

16%21.5%

20% 23%26.5%

5.5 4 4.5 3.5 4.5 5

cross validation data

k3 f8

Page 34: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 34

Results• In the test phase,• FD and Adaboost showed the best result (16.5

% error) for subject L1• o8 and k3 with LDA and BP. • FD along with LDA has shown good results in

cases k3 and L1. • But for the three other subjects (o3, o8 and f8)

the test results confirm the cross validation phase.

Page 35: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 35

ResultsBand power

Hjorth

parameters

Fractal

Dimension

LDAAdaboost

LDAAdaboost

LDAAdaboost

28% 32% 26% 35% 25%16.5%

4.75 4.25 4 4 4.75 4.5

Band powerHjorth

parameters

Fractal

Dimension

LDAAdaboost

LDAAdaboost

LDAAdaboost

9.5% 25% 23% 28%19.1%

22%

5.75 4.5 5 4 5 6.5

Band powerHjorth

parameters

Fractal

Dimension

LDAAdaboost

LDAAdaboost

LDAAdaboost

18% 20% 24% 30% 24% 20%

6.25 4 4 4.5 4.75 4.5

L1 o1

o8

test data

Page 36: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 36

Results

Band powerHjorth

parameters

Fractal

Dimension

LDAAdaboost

LDAAdaboost

LDAAdaboost

10.2%

20.4%

23.5%

17%10.2%

14.3%

4.5 5.5 5.5 5.5 5 5

Band powerHjorth

parameters

Fractal

Dimension

LDAAdaboost

LDAAdaboost

LDAAdaboost

16.4%

20.71%

23.5%

22.86%

18.5%

25%

6.5 4.5 4.5 4 4.25 4.25

k3 f8

Test data

Page 37: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 37

Results• To have significant results, the F test and the T test were perfo

rmed on the test results• Test and training data were randomly chosen from the pure dat

a for 20 times• All results were significant, which means that the P value was

lower than 0.05 for all evaluated results.• the test curves for FD and BP along with two classifiers (LDA

and Adaboost) for the whole paradigm are depicted in figures 3–12 for our subjects

• Hjorth results are eliminated from the graphs, because these results are not the best in any case

Page 38: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 38

ResultsBP , L1BP , o3

BP , o8

FD , L1 FD , o3

Page 39: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 39

Results

FD , k3

BP , K3 BP , f8FD , o8

FD , f8

Page 40: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 40

Discussion• Cross validation results in four cases indicate that the

best combination for classifying the imagery tasks is BP with LDA.But in the test phase, results did not completely confirm the cross validation results.

• FD and Adaboost showed a significant result for subject L1,k3 and o8.

• FD with LDA in the cases k3 and L1.• FD with both classifiers can be an acceptable alternati

ve with the combination of BP and LDA.

Page 41: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 41

Discussion

• In many articles BP and LDA are presented as a gold-standard technique for BCI applications,This paper showed that the selection of a feature and a classifier is extremely dependent on the case.

• In the above-mentioned articles results were shown for a maximum of two or three cases and no appropriate comparison was made.

Page 42: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 42

Discussion• There is a trade-off between the minimal error rate and its late

ncy,• latency of minimal error rate after 2.5 s of the cue stimulus is a

cceptable.• 9.5% error for case o3 (happening in 6.25 s) cannot be accepta

ble,because, in the real application, a subject might lose concentration if he does not see the feedback on the screen.

• The evaluation was also performed for different window lengths in the 250 ms interval, FD shows supremacy to BP and Hjorth parameters.

• As the time window increases, the classification rate improves but it causes a delay in reporting the changes in the EEG.

Page 43: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 43

Discussion• Of the two classifiers• LDA shows a very reliable and robust behaviour.• Adaboost is more time consuming for the training• a chaotic behaviour in the case L1, therefore, FD can

present it much better than two other features, and the non-linear classifier (Adaboost) could find a more flexible margin through the classes.

Page 44: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 44

Discussion• the biological properties of every human are u

nique,therefore, gold-standard combination does not make sense for all the cases.

• for each individual, we have to find the best combination of feature and classifier or on some occasions,a combination of the features by evolutionary algorithms or a tree combination of classifiers which can lead to the best result.

Page 45: JOURNAL OF NEURAL ENGINEERING 2004 Dec;1(4):212-7. Epub 2004 Nov 17

National University of Tainan

國立台南大學資訊教育研究所 人工智慧實驗室 45

END

Thank you