A Parallel and Distributed Genetic-Based Learning

A Parallel and Distributed Genetic-Based Learning

Classifier System with Application in Human

Electroencephalographic Signal Classification

Bradley Skinner

Doctor of Philosophy (Engineering)

Year of Submission: 2007

Centre for Health Technologies

Mechatronics and Intelligent Systems Group

Faculty of Engineering

University of Technology, Sydney

Australia

_«T* UNIVERSITY OFTECHNOLOGY SYDNEY V____________________

Contents

Introduction.............................................................................................................................23

1.1 Problem Statement.....................................................................................................23

1.2 Aim...........................................................................................................................24

1.3 Overview of the Study.............................................................................................. 25

1.4 Research Methods.....................................................................................................28

Review of Literature............................................................................................................... 30

2.1 Introduction............................................................................................................. 30

2.2 Genetic Algorithms................................................................................................. 31

2.2.1 Why use Genetic Algorithms?............................................................................ 32

2.2.2 Basic Definitions and Terminology.................................................................... 33

2.2.3 Selection Operators............................................................................................. 36

2.2.4 Recombination Operators.................................................................................... 38

2.2.5 Mutation Operators............................................................................................. 39

2.2.6 Static and Adaptive Operators............................................................................. 40

2.2.7 Static Crossover and Mutation Operators........................................................... 40

2.2.8 Dynamic Crossover and Mutation Operators...................................................... 41

2.2.9 Replacement Mechanisms................................................................................... 43

2.2.10 Coding Schemes.............................................................................................. 44

2.2.11 The Schema Theorem..................................................................................... 44

2.2.12 Building Block Hypothesis............................................................................. 47

2.2.13 The No Free Lunch Theorem........................................................................... 49

2.3 Parallel Genetic Algorithms.................................................................................... 50

2.3.1 Classification of Parallel Genetic Algorithms......................................................50

2.4 Parallel Computing Architectures............................................................................58

2.5 Inter-process Communication Models..................................................................... 60

2.5.1 Socket Programming............................................................................................60

2.5.2 Parallel Virtual Machine - PVM......................................................................... 61

2.5.3 Message Passing Interface - MPI.........................................................................61

2.5.4 Java Remote Method Invocation - JRMI............................................................ 62

2.5.5 Common Object Request Broker Architecture - CORBA................................... 62

2.5.6 Globus................................................................................................................. 62

2.6 Learning Classifier Systems.....................................................................................63

2.6.1 Reinforcement Learning...................................................................................... 64

2.6.2 Reinforcement Learning Architectures............................................................... 65

2.6.3 Tabular Q-Learning............................................................................................. 67

2.6.4 A Brief History of Learning Classifier Systems................................................. 68

2.6.5 Holland’s Learning Classifier System................................................................. 69

2.6.6 Zeroth Level Classifier System - ZCS................................................................ 72

2.6.7 Overview of the XCS Classifier System............................................................. 76

2.6.8 Detailed Description of the XCS Classifier System............................................ 78

2.6.9 XCS and Tabular Q-Learning............................................................................. 93

2.6.10 Common Classifier Encodings....................................................................... 94

2.7 Parallel Learning Classifier Systems....................................................................... 97

2.8 Conclusion............................................................................................................... 98

Classification Learning of the Human Electroencephalogram........................................ 101

3.1 Introduction............................................................................................................ 101

3.2 The Human Electroencephalogram........................................................................102

3.2.1 The Source of EEG Signals............................................................................... 102

3.2.2 Rhythms of the Human Brain.............................................................................108

3.2.3 Recording Electrodes.........................................................................................109

3.2.4 The EEG System................................................................................................115

3.2.5 EEG Artefacts....................................................................................................117

3.3 Correlation of Human EEG with Mental Tasks.....................................................120

3.4 Signal Processing and Representation of Human EEG..........................................123

3.4.1 Frequency-Based Representation.......................................................................123

3.4.2 Parametric Methods...........................................................................................124

3.4.3 Other Signal Processing and Representations....................................................125

3.5 Data Mining Human EEG......................................................................................126

3.5.1 Classification of EGG using Noil-Evolutionary Algorithms............................. 127

3.5.2 Learning Classifier Systems for Data Mining in Medical Databases............... 130

3.6 Conclusion.............................................................................................................. 133

Optimal Migration Strategies for a PGA with Adaptive Mutation................................. 136

CONTENTS 3

CONTENTS 4

Chapter Outline................................................................................................................... 136

4.1 Aim......................................................................................................................... 137

4.2 PGA Convergence Velocity and Solution Quality................................................. 137

4.2.1 Refinement.........................................................................................................138

4.2.2 Operational Hypotheses.....................................................................................139

4.3 Method.................................................................................................................... 139

4.3.1 Multi-Deme Parallel Genetic Algorithm............................................................ 140

4.3.2 Numerical Objective Functions..........................................................................148

4.3.3 Experiment Parameters......................................................................................151

4.3.4 Experimental Apparatus.....................................................................................152

4.4 Results.................................................................................................................... 153

4.4.1 Description.........................................................................................................153

4.4.2 Presentation........................................................................................................ 154

4.4.3 Analysis.............................................................................................................. 161

4.4.4 Interpretation......................................................................................................163

4.5 Conclusion.............................................................................................................. 169

Sequential Quadratic Programming Methods in PGAs....................................................171

Chapter Outline................................................................................................................... 171

5.1 Aim......................................................................................................................... 172

5.2 Performance Hypothesis of the Hybrid Algorithm................................................. 173

5.2.1 Refinement......................................................................................................... 173

5.3 Method.................................................................................................................... 175

5.3.1 The Hybrid Search Algorithm............................................................................175

5.3.2 Exploration Stage - Parallel Genetic Algorithm................................................ 177

5.3.3 Exploitation Stage - The RFSQP Algorithm..................................................... 182

5.3.4 Numerical Objective Functions..........................................................................186

5.3.5 Experiment Parameters......................................................................................189


5.4 Results.................................................................................................................... 191

5.4.1 Description......................................................................................................... 191

5.4.2 Presentation........................................................................................................ 192

5.4.3 Analysis.............................................................................................................. 196

5.4.4 Interpretation...................................................................................................... 198

5.5 Conclusion..............................................................................................................200

CONTENTS 5

Genetic-Based Machine Learning Classifier for EEG Signal Classification.................. 202

Chapter Outline...................................................................................................................202

6.1 Aim.........................................................................................................................203

6.2 EEG Signal Representation and Competitive Hypotheses.................................... 203

6.2.1 Refinement..........................................................................................................204

6.2.2 Operational Hypotheses.................................................................................... 206

6.3 Method....................................................................................................................206

6.3.1 Experimental Apparatus - EEG Data Acquisition............................................. 207

6.3.2 Participant Preparation.......................................................................................213

6.3.3 Mental Task Experiments................................................................................. 214

6.3.4 Data Acquisition and Pre-processing................................................................ 215

6.3.5 EEG Signal Representations............................................................................. 222

6.3.6 Classification Environment............................................................................... 231

6.4 Results....................................................................................................................236

6.4.1 Description........................................................................................................ 236

6.4.2 Presentation........................................................................................................237

6.4.3 Analysis..............................................................................................................242

6.4.4 Interpretation......................................................................................................245

6.5 Conclusion..............................................................................................................249

Distributed Classifier Migration in XCS for Classification of EEG Signals........... . 251

Chapter Outline...................................................................................................................251

7.1 Aim.........................................................................................................................252

7.2 Parallel Evolutionary Methods in Leaning Classifier Systems............................. 252

7.2.1 Refinement.................................................................. 253

7.2.2 Operational Hypotheses.....................................................................................254

7.3 Method....................................................................................................................255

7.3.1 Classification Environment - pXCS................................................................... 255

7.3.2 Parametric Representation of EEG Data............................................................ 261

7.3.3 Feature Vector Encoding................................................................................... 262


7.3.5 Experimental Parameters.................................................................................. 263

7.4 Results....................................................................................................................264

7.4.1 Description.........................................................................................................264

7.4.2 Presentation........................................................................................................266

7.4.3 Analysis..............................................................................................................273

CONTENTS 6

7.4.4 Interpretation..................................................................................................... 282

7.5 Conclusion.................................................................................................................. 285

Summary and Conclusions................................................................................................... 287

8.1 Summary of Results.............................................................................................. 287

8.1.1 Findings on XCS performance for the Classification of EEG.......................... 287

8.1.2 Findings on Distributed Classifier Migration in pXCS for the Classification of

EEG ............................................................................................................................289

8.1.3 Findings on Optimal Migration Strategies in a PGA with Adaptive Mutation. 291

8.1.4 Findings on the Incorporation of the RFSQP method in PGAs........................ 292

8.2 Conclusions........................................................................................................... 294

8.2.1 Aim One............................................................................................................ 294

8.2.2 Aim Two........................................................................................................... 295

8.2.3 Aim Three......................................................................................................... 295

8.2.4 Aim Four........................................................................................................... 296

8.2.5 Aim Five........................................................................................................... 296

8.3 Direction of Future Work...................................................................................... 297

Bibliography...........................................................................................................................299

A Schema Theorem.............................................................................................................. 309

B Test Functions....................................................................................................................313

B.I Introduction........................................................................................................... 313

Sphere Function (fSph)..................................................................................................... 313

Rastrigin Function (fRas)................................................................................................. 314

Ackley Path Function (fAck)............................................................................................ 315

Schwefel Function (fSch)................................................................................................. 316

Michalewicz Function (fMic)........................................................................................... 317

Langerman Function (f^n)..............................................................................................318

Deceptive n-bit Trap Function (fTrap).............................................................................. 319

Royal Roads Function (fRR).............................................................................................320

C Taylors Series - RFSQP Algorithm................................................................................ 321

D The XCS Classifier........................................................................................................... 323

CONTENTS 7

E Genetic-Based Machine Learning Classifier for EEG Classification........................... 329

F The pXCS Classifier......................................................................................................... 337

G The Electroencephalographic System............................................................................ 340

G. 1 BioSemi Active Two Mk2 System Specification................................................... 340

G.2 BioSemi EEG Electrode Montage for 32+2 Channels........................................... 341

G.3 BioSemi BDF and EDF File Format...................................................................... 342

H Thesis Software Collection.............................................................................................. 343

List of Figures1: Taxonomy of Search Methods in the broader field of Computational Intelligence....................... 31

2: (a) One-Point Crossover Operation using gene position 3. (b) The Two-Point Crossover Operation

using gene positions 2 and 6................................................................................................................38

3: Uniform Crossover Operator..........................................................................................................39

4: The Mutation Operator selecting genes in offspring chromosomes............................................... 40

5: 4-bit Deceptive Trap Function........................................................................................................48

6: Schematic diagram of the Multiple-Population Coarse-Grained PGA consisting of six isolated

subpopulations......................................................................................................................................53

7: Schematic diagram of the Global Single-Population Master-Slave PGA.......................................56

8: Schematic diagram of the Single-Population Fine-Grained PGA arranged into neighbourhoods ..57

9: (a) Multi-deme coarse-grained PGA at both levels with different connection topologies, (b) Multi-

deme coarse-grained PGA at upper level and fine-grained PGA at lower level.................................57

10: The basic Reinforcement Learning Problem.................................................................................65

11: Reinforcement Learning Architectures - Policy-Only and Reinforcement-Comparison..............66

12: Reinforcement Learning Architectures - Adaptive-Heuristic-Critic and Q-Learning................ 66

13: Schematic illustration of Holland’s 1986 Learning Classifier System..........................................70

14: Schematic illustration of ZCS Learning Classifier System........................................................... 72

15: Schematic illustration of XCS Learning Classifier System, 1998.................................................79

16: Classifier accuracy k as a function of classifier prediction error e............................................... 89

1 7: The Cerebrum, showing the 4 lobes (Frontal, Parietal, Temporal, and Occipital)..................... 103

18: Structure of a typical neuron........................................................................................................ 104

19: The Purkinje cell and the Pyramidal cell located in the cerebral cortex..................................... 104

20: Postsynaptic and Action Potential of neuronal membrane..........................................................105

21: The generation of EEG by the cerebral cortex............................................................................107

22: Different types of normal EEG waves in the time domain (a) Delta, (b) Theta, (c) Alpha (d) Beta

activity (e) blocking of the alpha wave by eye opening and (f) 1 second markers........................... 109

23: Equivalent Circuit Model for an electrode.................................................................................. 1 12

24: Basic circuit model for two scalp electrodes with input into a differential amplifier.................112

25: Electrode names and placement in the International 10-20 System........................................... 1 14

26: Modified expanded 10-20 system................................................................................................115

27: Simplified Block Diagram of a modern 8-Channel EEG Recording System............................. 115

LIST OF FIGURES 9

28: A modern Sigma-Delta ADC with Fully programmable Low-Pass filter.................................. 117

29: Eye movement artefacts in time domain. (1) Blink, (2) eye opening (EO), (3) eye closing (EC),

rhythmical slow eye movement (EM), (5) saccadic eye movement (EM)........................................118

30: Common artefacts generated by extracerebral activity................................................................ 118

31: Non-biological artefacts............................................................................................................... 119

32: Mutation Probability PM(0...........................................................................................................14233: The Multi-Deme Master-Slave Topologies of the PGA using 4- and 8- subpopulations............142

34: Unordered and blocking communication provided synchronous migration between loosely-

coupled and coarse-grained sub-populations of the Master-Slave PGA...........................................144

35: Decomposition of the Level 1 (PGA) Structure Chart for the PGA Software System............... 146

36: Decomposition of Level 1.1 (Initialise), 1.3 (OptionalGAFuncs), and 1.4 (Cycle_Error) Structure

Charts for the PGA System................................................................................................................ 146

37: Decomposition of Level 1.2 (Generation) Structure Chart for the PGA Software System.........147

38: Decomposition of Level 1.5 (Migration) Structure Chart for the PGA Software System.......... 147

39: Decomposition of Level 1.6 (Write Re suits) Structure Chart for the PGA Software System...... 147

40: An example illustration of the Sphere Function for n=3..............................................................148

41: An example illustration of Rastrigin's Function for n=3 (top) and n=2 (bottom)........................149

42: An example illustration of the Ackley Path Function for n=3 (top) and n=2 (bottom)............... 150

43: Mutation Probability PM(t) for a = 5, 10, 20, 40, 80, 160............................................................152

44: Sphere Function with 4-subpopulations and variable migration period...................................... 155

45: Sphere Function with 8-subpopulations and variable migration period...................................... 155

46: Rastrigin’s Function with 4-subpopulations and variable migration period................................156

47: Rastrigin’s Function with 8-subpopulations and variable migration period................................156

48: Ackley’s Path Function with 4-subpopulations and variable migration period...........................157

49: Ackley’s Path Function with 8-subpopulations and variable migration period........................... 157

50: Adaptive Mutation vs Static Mutation probability for the Sphere Function (mF= 5, cl# = 8).......158

51: Adaptive Mutation vs Static Mutation probability for the Rastrigin Function (mF= 5, cl# = 8).... 159

52: Adaptive vs Static Mutation probability for the Ackley Path Function (mF=5, cl# = 8)..............160

53: The hybrid search algorithm architecture.....................................................................................17654: The Multi-Deme Master-Slave Topology of the PGA used in the exploration stage of the hybrid

algorithm.............................................................................................................................................17855: Unordered and blocking communication provided synchronous migration between loosely-

coupled and coarse-grained sub-populations of a 4-deme Master-Slave PA....................................179

56: Hybrid Algorithm Search - Exploration and Exploitation...........................................................184

57: Decomposition of the Level 1 Structure Chart for the Hybrid Algorithm Software System.......185

58: Decomposition of the Level 1.7 Structure Chart for the RFSQP algorithm................................ 185

LIST OF FIGURES 10

59: An example illustration of Schwefel’s Function for n=3 (top) and n=2 (bottom)....................... 187

60: An example illustration of the Michalewicz Function for n=3 (top) and n=2 (bottom).............. 188

61: An example illustration of the Langerman Function for n=3 (top) and n=2 (bottom)................. 189

62: Results for PGA and Hybrid Search Algorithm applied to the Sphere Function......................... 192

63: Results for PGA and Hybrid Search Algorithm applied to Rastrigin’s Function........................ 193

64: Results for PGA and Hybrid Search Algorithm applied to the Ackley Path Function............... 193

65: Results for PGA and Hybrid Search Algorithm applied to the Schwefel Function.................... 194

66: Results for PGA and Hybrid Search Algorithm applied to the Michalewicz Function.............. 194

67: Results for PGA and Hybrid Search Algorithm applied to the Langerman Function.................195

68: Experimental configuration of the BioSemi Active-Two EEG Acquisition subsystem..............207

69: Rear view of the BioSemi headcap used for all experiements in this study................................ 208

70: Circuit diagram and transfer function of the active electrode......................................................208

71: (a) BioSemi “pin-type” active electrode and (b) Standard set of 32 electrodes.......................... 209

72: BiosSemi ActiveTwo AD-Box and battery box power supply....................................................210

73: Front and Rear view of the ActiveTwo Optical-to-USB2 converter........................................... 211

74: The BioSemi ActiView Real-Time Interface.............................................................................. 212

75: Participant Preparation.................................................................................................................213

76: A fully prepared participant ready to commence mental task experiments................................. 214

77: An example of 32-channel referenced EEG data from t=9 to t=21 seconds for baseline and

Mental Counting task from a single participant................................................................................ 217

78: An example of 6-channel EEG data from t-9 to r=21 seconds for baseline and Mental Counting

Task from a single participant.......................................................................................................... 218

79: A 10-second recording of the Mental Counting Task being windowed to 8-seconds by removal of

the first and last 1-second of time-domain EEG data from all 6-channels........................................ 219

80: Partitioning EEG windows into segments of 4.0- and 2.0-seconds in length.............................. 220

81: Partitioning EEG windows into segments of 1.0- and 0.5-seconds in length.............................. 221

82: Generalised Autoregressive Filter Model of order p................................................................... 223

83: Example of visual AR order determination................................................................................. 224

84: AR order determination using the Akaike Information Criterion (AIC) Technique................... 225

85: Tree solution for the 9 asymmetry ratios of the Delta (8) EEG frequency band......................... 226

86: A single 108-bit feature vector based on the parametric representation......................................227

87: A single feature vector based on the frequency-band representation.......................................... 228

88: Summary of the steps required produce training and testing sets................................................229

89: A single pre-processed and windowed time-domain segment of EEG data is transformed into the

frequency-domain using FFT........................................................................................................... 230

90: Default settings for classifiers implemented in Weka v3.5.5......................................................233

LIST OF FIGURES 11

91: Weka v3.5.5 GUI; (a) The training data and (b) Classification result using supplied test set....234

92: Topologies employed by pXCS - (a) Uni-directional Ring, (b) Bi-directional ring and (c) Fully-

Connected topologies........................................................................................................................256

93: pXCS Migration System performs classifier selection and replacement upon the sub-population

[P], producing classifier sets [I] and [E]............................................................................................260

94: Average of the best classification accuracy with varying migration frequencies between pXCS

and XCS.............................................................................................................................................267

95: Average of the best classification accuracy with varying migration rates between pXCS and

XCS....................................................................................................................................................267

96: Average of the best classification accuracy between a fully connected pXCS topology and XCS

for each participant and associated task experiment..........................................................................268

97: Average of the best classification accuracy between a bi-directional ring pXCS topology and

XCS for each participant and associated task experiment................................................................ 268

98: Average of the best classification accuracy between a uni-directional ring pXCS topology and

XCS for each participant and associated task experiment................................................................ 269

99: Histogram of improvements and degradation in average classification accuracy between pXCS

and XCS for varying migration frequencies......................................................................................270

100: Histogram of improvements and degradation in average classification accuracy between pXCS

and XCS for varying migration rates................................................................................................. 270

101: Reduction in Classifier Population between pXCS and XCS during training with varying

migration frequencies.........................................................................................................................271

102: Reduction in Classifier Population between pXCS and XCS during training with varying

migration rates...................................... 271

103: Improvement in Learning Rate during training between pXCS and XCS with varying migration

frequencies......................................................................................................................................... 272

104: Improvement in Learning Rate during training between pXCS and XCS with varying migration

rates....................................................................................................................................................272

105:C!assification Accuracy of pXCS and XCS for a single experimental instance........................273

106: Two-sample pooled t-test results comparing the classification accuracy of pXCS vs XCS..... 274

107: Two-sample pooled t-test curve comparing the classification accuracy of pXCS vs XCS..... 276

108: Two-sample pooled t-test curve comparing the classification accuracy of pXCS vs XCS..... 276

109: Example of reduction in classifier population of pXCS and XCS............................................ 279

1 10: Example of learning rate as a fraction of correct classifications of pXCS and XCS................ 281

111: Two-sample pooled t-test curve comparing the classification accuracy of pXCS and XCS for

the results presented in Figure 110.................................................................................................... 281

I I2.B: Sphere Function (n=3), 113.B: Sphere Function Contour Plot (n=2)....................................313

LIST OF FIGURES 12

114. B: Rastrigin Function (n=3)....................................................................................................... 314

115. B: Rastrigin Function Contour Plot (n=2)...............................................................................314

116. B: Ackley Path Function (n=3)..................................................................................................315

117. B: Ackley Path Function Contour Plot (n=2)............................................................................315

118. B: Schwefel Function (n=3)...................................................................................................... 316

119. B: Schwefel Function Contour Plot (n=2).................................................................................316

120. B: Michalewicz Function (n=3)...............................................................................................317

121. B: Michalewicz Function Contour Plot (n=2).........................................................................317

122. B: Langerman Function (n=3)................................................................................................ 318

123. B: Langerman Function Contour Plot (n=2)........................................................................... 318

List of Tables1: Mapping between terms in Natural Evolution and the Genetic Algorithms Framework....34

2: Population of Micro-classifiers in XCS............................................................................... 81

3: Population of Macro-classifiers in XCS.............................................................................. 81

4: Datasets used for the Comparison of Learning Schemes in...............................................132

5: Average Prediction Accuracy and Standard Deviations of six Machine Learning Schemes

using Stratified Ten-Fold Cross-Validation............................................................................132

6: Parallel Genetic Algorithm Parameter Specification........................................................ 151

7: Cluster Computing Hardware and Software Environment - PGA.................................... 152

8: Convergence Velocity and Solution Quality - Sphere Function....................................... 155

9: Convergence Velocity and Solution Quality - Rastrigin Function................................... 156

10: Convergence Velocity and Solution Quality - Ackley Path Function............................157

11: Effect of Adaptive Mutation Operator (a) on Convergence Velocity and Solution Quality

- Sphere Function....................................................................................................................158

12: Effect of Adaptive Mutation Operator (CC) on Convergence Velocity and Solution Quality

- Rastrigin Function.................................................................................................................159

13: Effect of Adaptive Mutation Operator (a) on Convergence Velocity and Solution Quality

- Ackley Path Function...........................................................................................................160

14: Results For Convergence Velocity against Migration Frequency.................................... 164

15: Results For Solution Quality Against Migration Frequency............................................ 164

16: Results for Convergence Velocity and Solution Quality Against Deme Size................. 166

17: Results for Static and Adaptive Mutation Probabilities....................................................168

18: Parallel Genetic Algorithm Parameter Specification....................................................... 189

19: RFSQP Algorithm Parameter Specification......................................................................190

20: Cluster Computing Hardware and Software Environment - Hybrid Algorithm.............. 191

21: Computation Times for the hybrid search and parallel genetic algorithms..................... 195

22: Results For Convergence Velocity, Solution Quality and Computation Time................ 199

23: Cluster Computing Hardware and Software Environment - XCS.................... ...235

24: Classification Accuracy for Burg Parametric Method (order p=6)..................................238

25: Classification Accuracy for Cov Parametric Method (order p=6)...................................238

26: Classification Accuracy for MCov Parametric Method (order p=6)................................238

27: Average Classification Results for the Different Parametric Methods (order:p=6)........ 239

LIST OF TABLES 14

28: Classification Accuracy for Frequency-Band method..................................................... 240

29: Classifier Scheme Comparison Using Parametric Methods (order p=6).........................241

30: Classifier Scheme Comparison Using Frequency-Band Representation.........................241

31: Results For Average Classification Accuracy of Parametric Methods............................242

32: Results For Average Classification Accuracy of EEG Frequency-Bands.......................243

33: Results For Average Classification Accuracy between The Parametric and Frequency-

Band Representations............................................................................................................. 246

34: Results For Average Classification Accuracy using Parametric Representation.............248

35: Results For Average Classification Accuracy using Frequency-Band Representation....248

36: Cluster Computing Hardware and Software Environment - pXCS.................................263

37: Tasks, Topological and Participatory Experiment Matrix for pXCS...............................266

Certificate of Authorship / Originality

I certify that the work in this thesis has not previously been submitted for a degree nor has it

been submitted as part of the requirements for a degree except as acknowledged within the

text.

I also certify that the thesis has been written by me. Any assistance that I have received in my

research work and the preparation of the thesis itself has been acknowledged. In addition, I

certify that all information sources and literature used are indicated in the thesis.

Signature of Candidate

Abstract

Genetic-based Learning Classifier Systems have been proposed as a competent technology for

the classification of medical data sets. What is not known about this class of system is two

fold. Firstly, how does a Learning Classifier System (LCS) perform when applied to the

single-step classification of multiple-channel, noisy, artefact-inclusive human EEG signals

acquired from many participants? Secondly and more importantly, is how the learning

classifier system performs when incorporated with migration strategies, inspired by multi-

deme, coarse-grained Parallel Genetic Algorithms (PGA) to provide parallel and distributed

classifier migration? This research investigates these open questions and concludes, subject to

the considerations herein, that these technological approaches can provide competitive

classification performance for such applications.

We performed a preliminary examination and implementation of a parallel genetic

algorithm and hybrid local search PGA using experimental methods. The parallelisation and

incorporation of classical local search methods into a genetic algorithm are well known

methods for increasing performance and we examine this. Furthermore, inspired by the

significant improvements in convergence velocity and solution quality provided by the multi-

deme, coarse-grained Parallel Genetic Algorithm, we incorporate the method into a learning

classifier system with the aim of providing parallel and distributed classifier migration. As a

result, a unique learning classifier system (pXCS) is proposed that improves classification

accuracy, achieves increased learning rates and significantly reduces the classifier population

during learning. It is compared to the extended learning Classifier System (XCS) and several

state of the art non-evolutionary classifiers in the single-step classification of noisy, artefact-

inclusive human EEG signals, derived from mental task experiments conducted using ten

human participants.

We also conclude that establishing an appropriate migration strategy is an important cause

of pXCS learning and classification performance. However, an inappropriate migration rate,

frequency or selectiomreplacement scheme can reduce performance and we document the

factors associated with this. Furthermore, we conclude that both EEG segment size and

representation both have a significant influence on classification performance. In effect,

determining an appropriate representation of the raw EEG signal is tantamount to the

classification method itself.

ABSTRACT 17

This research allows us to further explore and incorporate pXCS evolved classifiers derived

from multi-channel human EEG signals as an interface in the control of a device such as a

powered wheelchair or brain-computer interface (BCI) applications.

Statement of Contribution

Contributions to Knowledge

We have proposed and tested three research hypotheses (H3, H4, H5) related to the Genetic-

Based Machine Learning (GBML) Classifier System for the single-step classification of

multi-channel, artefact-inclusive human electroencephalographic (EEG) signals, derived from

mental task experiments. In addition we have tested two research hypotheses (HI and H2)

related to the convergence property and solution quality of a Parallel Genetic Algorithm and

Hybrid Algorithm:

• Hypothesis HI which claims ‘The convergence velocity and solution quality

achieved by a parallel genetic algorithm, with adaptive mutation is superior to that

achieved by the serial genetic algorithm on a common set of optimisation problems.”

• Hypothesis H2 which claims “The convergence velocity, solution quality and

computation time achieved by a hybrid search algorithm is superior to that achieved

by the parallel genetic algorithm on a common set of optimisation problems.”

• Hypothesis H3 which claims “A parametric representation provides superior

performance compared to the frequency-band representation for the single-step

classification of multiple-channel, artefact-inclusive human EEG data using

evolutionary and non-evolutionary classifiers.”

• Hypothesis H4 which claims “The extended learning Classifier System (XCS)

provides superior performance when compared to the current state of the art in non

evolutionary classifiers for the single-step classification of multiple-channel, artefact-

inclusive human EEG data” •

• Hypothesis H5 which claims “The incorporation of a parallel and distributed

classifier migration policy into the extended learning Classifier System (XCS)

STATEMENT OF CONTRIBUTION 19

increases classification performance for the single-step classification of multiple-

channel, artefact-inclusive human EEG data”

The implications of these findings provide, in some measure, new knowledge about the

classification of human EEG signals using parallel and distributed Genetic-Based Machine

Learning (GBML) Classifier System. In the course of examining these hypotheses, the

following contributions were also made:

• We have determined, documented and implemented a simple mechanism for the

parallel and distributed migration of classifiers in the discovery component of a

genetic-based learning classifier system, which we called pXCS. This parallel and

distributed algorithm extends Wilson’s XCS learning classifier system by introducing

a migratory pressure, with improved learning rate, improved classification accuracy

and smaller classifier population size.

• The degree-of-connectivity implicit in the fully-connected, bi-directional and uni

directional ring topologies have a significant influence upon the learning and

classification performance of pXCS. This finding extends the number of topologies

found in the literature.

• The rate and frequency of distributed classifier migrations in pXCS does have a

significant impact on the learning and classification performance. This finding is at

variance with the literature which portrays classifications performance for idealised

problems.

• A selection and replacement scheme based upon ranking of classifier fitness,

numerosity and random in pXCS can have significant impact on the learning and

classification performance. This finding extends the number of such schemes found in

the literature for idealised problems. Furthermore, we have shown that fitness bias

selection can inhibit classification performance. •

• We have provided extensive benchmarking for the classification performance of XCS

and pXCS in an observable and repeatable manner in the single-step classification of

multiple-channel, artefact-inclusive human EEG signals. This approach allowed for

the comparison of evolutionary and current state-of-the-art non-evolutionary classifier

systems, in terms of average classification accuracy and its variance.

STATEMENT OF CONTRIBUTION 20

• In terms of learning and classification performance of XCS and pXCS, in conjunction

with typical real-time constraints of similar applications we have determined:

(a) an optimal EEG segment size of 2.0-seconds,

(b) an optimal autoregressive model order of p=6 for the Burg, Covariance, and

Modified Covariance parametric methods.

• We have achieved significant improvements in classification accuracy by increasing

the ability of XCS to form generalisations in the single-step classification of multiple-

channel, artefact-inclusive human EEG signals. More specifically, establishing a

generalisation or ‘don’t care’ probability of P#=0.86 tended to provide significant

improvements in classification accuracy compared to P#=0.76.

• We have discovered that an XCS classifier population of vV=8000 did not provide

significant improvement or degradation in classification accuracy when compared

with /V=2000 and /V=4000 for parametric representations and P^O.86.

• We have developed, documented and implemented a simple hybrid search algorithm

employing an iterative global search heuristic and cascaded architecture. Our method

loosely couples our parallel genetic algorithm with a non-evolutionary search

algorithm, which based on the Reduced Feasible Sequential Quadratic Algorithm.

The hybrid algorithm achieved significant improvements in convergence velocity,

solution quality and computation time for a suite of idealised objective functions. •

• We have developed, documented and implemented a multi-deme parallel genetic

algorithm (PGA) employing an adaptive mutation function. In general, adaptive

mutation can increase convergence velocity and solution quality of a multi-deme

PGA for a suite of idealised objective functions. This finding supports the current

literature.

Related Publications

• Performance Study of a Multi-Deme Parallel Genetic Algorithm with Adaptive

Mutation, B.T. Skinner, H.T. Nguyen, D.K. Liu, International Conference on

Autonomous Robots and Agents (ICARA), pp. 88-94, New Zealand, December 2004.

• Hybrid Optimisation Method Using PGA and SQP Algorithm, B.T. Skinner, H.T.

Nguyen, D.K. Liu, IEEE Symposium on Foundations of Computational Intelligence

(FOCI), pp. 73-80, Honolulu, Hawaii, USA, April 2007.

• Classification of EEG Signals Using a Genetic-Based Machine Learning Classifier,

B.T. Skinner, H.T. Nguyen, D.K. Liu, 29th Annual International Conference of the

IEEE Engineering in Medicine and Biology Society (EMBC), pp. 3120-3123, Lyon,

France, 2007.

• Distributed Classifier Migration in XCS for the Classification of

Electroencephalographic Signals, B.T. Skinner, H.T. Nguyen, D.K. Liu, IEEE

Congress on Evolutionary Computation (CEC), pp. 2829-2836, Singapore, 2007.

Other Publications during PhD Candidature

• Distributed Simultaneous Task Allocation and Motion Coordination of Autonomous

Vehicles Using a Parallel Computing Cluster, A.K. Kulantunga, B.T. Skinner, D.K.

Liu, H.T. Nguyen, in Robotic Welding, Intelligence and Automation, Lecture Notes

in Control and Information Sciences (Vol. 362), Springer-Verlag, 2007, pp. 409-420.

AcknowledgmentsFirstly I would like to thank my supervisor, Professor Hung Nguyen for providing me with

the opportunity to pursue postgraduate research at the University of Technology, Sydney.

Without Hung’s support, guidance and honesty I would not have been able to become a

capable research engineer. His patience and kindness continue to surprise me. I am also

grateful to Dr Dikai Liu for his guidance, encouragement and lengthy discussions on

evolutionary computation, which proved invaluable. I am indebted to him for his friendship

and company in distant places.

I would also like to thank Dr Yvonne Tran and Professor Ashley Craig for providing access

to the EEG laboratory during participant experimentation. I am thankful for their helpful

advice, ongoing support and review of my work. I am grateful to Dr Matt Gaston and Mr

Peter Brady for establishing, maintaining and supporting the UTS Engineering Linux

Computing Cluster Environment. Without their valuable help and interest my research study

would be have been more difficult. I am also thankful for the early discussions about XCS

with by Dr Alwyn Barry.

I thank my colleagues: Lesley King and Phil Taylor for the scholarly discussions about

classification learning and many other topics, which frequently took place over a beer; Daniel

Craig for his kind cooperation during the EEG participant experiments and our lengthy chats,

which I miss; Dr Shane Magrath for providing advice on the structure and presentation of this

thesis; Dr Son Nguyen for his friendly discussions; and the many people who thoughtfully

participated in the EEG recording sessions.

Perhaps the greatest credit goes to my family who have supported me throughout my life

and without which this work would not have been possible. I would like to dedicate this

research to my wonderful Melissa for her constant empathy and unwavering support

particularly when times were tough - she is a matchless partner in all adventures.

To the people who helped

Documents

A Parallel and Distributed Genetic-Based Learning